For instance, our current research topics include the following:
- data collection, aggregation, and computing across a wide range of scales (spatial and temporal) and latencies (from high-latency to near real-time);
- vertical applications in the biomedical and industrial domains, e.g., high-throughput automated processing pipelines that cope with the latest generation of data-intensive experimental devices, and integrated computational and data management systems that cope with the complexity of large-scale computable data provenance graphs;
- scalable process mining technologies to extract and analyze the underlying processes from large collections of the event data they generate.
Whenever possible, we distribute the software resulting from our research activities as liberally licensed open-source software. Examples include the following:
- Pydoop: a very efficient Python MapReduce and HDFS API for Hadoop;
- Seal: a Hadoop-based suite of tools for processing NGS (next-generation sequencing) data.