pyEHR
Python modules for managing openEHR projects
Challenge
In longitudinal biomedical studies, multiple heterogeneous data types may become necessary over the lifetime of a project. Ideally, the common approach of ad-hoc database tables should be replaced by computable formalisms for the meta-description of structured clinical/phenotypic records. Systems built on such formalisms should readily support operations such as aggregations over sophisticated profile descriptions across all available records, as well as in-depth navigation of all data related to a specific study participant.
Overview
pyEHR is a toolkit for building scalable clinical/phenotypic data management systems for biomedical research applications. The toolkit uses openEHR formalisms to decouple clinical data descriptions from implementation details, and database technologies (both SQL and NoSQL) to provide scalable storage back-ends. pyEHR is specifically designed to efficiently manage queries on large collections of very heterogeneous structured records with recurring substructures. The system is built for the large data volumes and the evolving, heterogeneous data structures encountered in biomedical research.
Innovative features
- pyEHR uses openEHR - a computable formalism for the meta-description of structured clinical records - as a systematic approach to handling data heterogeneity while keeping a clear separation between the well-defined semantic description of data structures and their concrete storage implementation;
- scalability is achieved through a multi-tier architecture with interfaces for multiple data storage systems;
- queries can be expressed at the semantic level;
- the supported database management systems (DBMSs) are two NoSQL databases/search engines, MongoDB and Elasticsearch; support for other DBMSs, either NoSQL or relational, can be easily added.
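
The separation between semantic description and storage back-end listed above can be illustrated with a minimal sketch. This is not the pyEHR API: the `StorageDriver` interface, the `InMemoryDriver` class, the `query_by_archetype` function, and the record layout (`patient_id`, `data`, `archetype`, `value`) are all hypothetical names chosen for illustration. The sketch assumes each record is a nested structure whose nodes are tagged with an openEHR archetype identifier, so a query names an archetype instead of a table or collection.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator, List, Tuple


class StorageDriver(ABC):
    """Hypothetical back-end interface (assumption, not the pyEHR API)."""

    @abstractmethod
    def add_record(self, record: Dict[str, Any]) -> None:
        ...

    @abstractmethod
    def all_records(self) -> Iterator[Dict[str, Any]]:
        ...


class InMemoryDriver(StorageDriver):
    """Toy back-end; a MongoDB or Elasticsearch driver would implement the same interface."""

    def __init__(self) -> None:
        self._records: List[Dict[str, Any]] = []

    def add_record(self, record: Dict[str, Any]) -> None:
        self._records.append(record)

    def all_records(self) -> Iterator[Dict[str, Any]]:
        return iter(self._records)


def find_nodes(node: Any, archetype_id: str) -> Iterator[Dict[str, Any]]:
    """Yield every nested dict whose 'archetype' field equals archetype_id."""
    if isinstance(node, dict):
        if node.get("archetype") == archetype_id:
            yield node
        for value in node.values():
            yield from find_nodes(value, archetype_id)
    elif isinstance(node, list):
        for item in node:
            yield from find_nodes(item, archetype_id)


def query_by_archetype(driver: StorageDriver, archetype_id: str) -> List[Tuple[str, Any]]:
    """Semantic-level query: return (patient_id, value) for each matching node."""
    results = []
    for record in driver.all_records():
        for node in find_nodes(record["data"], archetype_id):
            results.append((record["patient_id"], node.get("value")))
    return results


# Illustrative data only; the archetype identifiers are used as plain tags.
BLOOD_PRESSURE = "openEHR-EHR-OBSERVATION.blood_pressure.v1"
BODY_WEIGHT = "openEHR-EHR-OBSERVATION.body_weight.v1"

driver = InMemoryDriver()
driver.add_record({
    "patient_id": "P1",
    "data": {
        "archetype": "openEHR-EHR-COMPOSITION.encounter.v1",
        "content": [
            {"archetype": BLOOD_PRESSURE, "value": {"systolic": 120, "diastolic": 80}},
            {"archetype": BODY_WEIGHT, "value": {"kg": 71.5}},
        ],
    },
})
driver.add_record({
    "patient_id": "P2",
    "data": {"archetype": BLOOD_PRESSURE, "value": {"systolic": 135, "diastolic": 85}},
})

hits = query_by_archetype(driver, BLOOD_PRESSURE)
print(hits)
```

Because `query_by_archetype` depends only on the abstract interface, swapping the toy back-end for a real one would not change the query code. The same query also finds the blood pressure node whether it is nested inside an encounter or stored at the top level, which is the sense in which the query is expressed at the semantic level.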
Potential users
Researchers and developers.
Impact sectors
Health - ICT
Other resources
- https://github.com/crs4/pyEHR
- Delussu, G.; Lianas, L.; Frexia, F.; Zanetti, G. (2016) A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data. PLoS ONE 11(12): e0168004. https://doi.org/10.1371/journal.pone.0168004
- Lianas, L.; Frexia, F.; Delussu, G.; Anedda, P.; Zanetti, G. (2014) pyEHR: A scalable clinical data management toolkit for biomedical research projects. In: Healthcom, pp. 370–374.