Work package 1: Data capture and infrastructure
Lead: Karolinska Institutet
The work package aims to capture relevant real-world data (RWD), harmonise it, combine it with genetic data, and ensure the safe, secure, and transparent running and sustainability of the main data infrastructure. Specific objectives are to:
- Capture complementary pharmacological treatment outcome data (response, adverse effects, trajectories) from cross-border electronic health records (eHRs), questionnaires, health registries and genotyped biobanks, applying novel analytic tools including machine learning (ML) and artificial intelligence (AI).
- Define pharmacological treatment outcomes and multimorbidities using standardised data, both within individual data sources (eHRs, registry data) and between different types of data source (RWD, RCT).
- Run and sustain the secure Tryggve/ELIXIR infrastructure for cross-border analysis of data from different sources, and develop and exploit the REAL-WD Platform in compliance with the General Data Protection Regulation (GDPR).
Description of work
WP1 provides the infrastructure and tools to capture and collate RWD from multiple sources and to integrate it with research data and randomised controlled trial (RCT) data. The WP will customise existing tools, receive data from each country or co-ordinate the use of data that cannot be shared, and support the other scientific WPs (2-5) with infrastructure and portable tools for data analysis so that they can achieve their specific aims.
WP1 will develop a cross-European network, expanding the current Nordic system (Tryggve), to secure sustainability of the developed data tools and infrastructure, while maintaining ethical, secure and GDPR-compliant data transfer and sharing protocols. The REAL-WD data platform will ensure data access and sharing by researchers and other stakeholders across Europe.
The WP will capture and curate RWD to define treatment outcomes and multimorbidities in people with mental disorders from cross-border sources, including registry data, eHRs, clinical records, and data collected through questionnaires and digital tools. We will define the disorders, comorbidities and outcomes of already genotyped cases (cross-referenced, nation-wide biobanks) based on information from one or more of: i) clinical studies on mental disorders; ii) registry information; iii) eHRs; or iv) health surveys/online questionnaires (inference).
Registry information enables life-course observation through International Classification of Diseases (ICD) diagnoses recorded in inpatient and outpatient clinics, or through inference from medication use recorded in prescription registries (type, dose, duration). The latter two will allow us to maximise the number of cases. We will combine these with national datasets (e.g. CLOZUK) for rarer disorders and population cohorts for common disorders (e.g. depression in UK Biobank). We estimate that we will have mental disorder diagnostic phenotypes for approximately 260,000 individuals from a total sample of approximately 1.9 million. We will perform country-specific data collection according to the available resources and take advantage of established data integration and analytical protocols, which allow harmonisation and meta-analysis across countries using container technology within the secure and GDPR-compliant ICT infrastructure. We will improve and extend our data capture tools and perform discovery meta-analysis across countries.
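As an illustration of how cross-country meta-analysis can work when individual-level data stay in their country of origin, the sketch below pools per-country summary statistics with inverse-variance fixed-effect weighting. It is a minimal example under stated assumptions, not the project's analysis pipeline: the `SiteEstimate` structure, the function name and the site labels are hypothetical, and the numbers are made-up values used only to show the arithmetic.

```python
import math
from dataclasses import dataclass


@dataclass
class SiteEstimate:
    """Summary statistic exported by one country (hypothetical structure)."""
    site: str
    beta: float  # effect estimate, e.g. a log odds ratio
    se: float    # standard error of beta


def fixed_effect_meta(estimates):
    """Pool site-level estimates with inverse-variance weights.

    Returns (pooled_beta, pooled_se, z). Only summary statistics are
    needed, so individual-level records never leave the site.
    """
    if not estimates:
        raise ValueError("need at least one site estimate")
    if any(e.se <= 0 for e in estimates):
        raise ValueError("standard errors must be positive")

    # Each site is weighted by the inverse of its sampling variance.
    weights = [1.0 / (e.se ** 2) for e in estimates]
    total_weight = sum(weights)

    pooled_beta = sum(w * e.beta for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Illustrative values only, not project data.
example = [
    SiteEstimate(site="country_A", beta=0.2, se=0.1),
    SiteEstimate(site="country_B", beta=0.4, se=0.1),
]
pooled_beta, pooled_se, z = fixed_effect_meta(example)
```

With equal standard errors the pooled estimate is the simple mean of the two site estimates. A random-effects model may be more appropriate where heterogeneity between countries is expected; the fixed-effect form is shown here only because it is the simplest case.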
Key WP1 tasks
- Identification of RWD relevant for treatment outcome and multimorbidities in severe mental disorders.
- Collection of RWD from eHRs.
- Curation and integration of identified RWD and RCTs.
- Maintenance and expansion of a secure cross-border data infrastructure for meta-analysis.
- Exploration of possibilities for further exploitation of the REAL-WD platform for researchers, health providers and other stakeholders, including commercial users across Europe.