This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Mental Health, is properly cited. The complete bibliographic information, a link to the original publication on https://mental.jmir.org/, as well as this copyright and license information must be included.
Empirically driven personalized diagnostic applications and treatment stratification are widely perceived as a major hallmark in psychiatry. However, data-based personalized decision making requires standardized data acquisition and data access, which are currently absent in psychiatric clinical routine.
Here, we describe the informatics infrastructure implemented at the Department of Psychiatry of Münster University Hospital, which allows standardized acquisition, transfer, storage, and export of clinical data for future real-time predictive modelling in psychiatric routine.
We designed and implemented a technical architecture comprising an extension of the electronic health record (EHR) via scalable standardized data collection, data transfer between EHRs and research databases (allowing EHR and research data to be pooled in a unified database), and technical solutions for the visual presentation of collected data and analysis results in the EHR. The Single-source Metadata ARchitecture Transformation (SMA:T) was used as the software architecture. SMA:T is an extension of the EHR system and uses module-driven engineering to generate standardized applications and interfaces. The operational data model was used as the standard. Standardized data were entered on iPads via the Mobile Patient Survey (MoPat) and the web application Mopat@home, and the standardized transmission, processing, display, and export of data were realized via SMA:T.
The technical feasibility of the informatics infrastructure was demonstrated in the course of this study. We created 19 standardized documentation forms with 241 items. For 317 patients, 6451 instances were automatically transferred to the EHR system without errors. Moreover, 96,323 instances were automatically transferred from the EHR system to the research database for further analyses.
In this study, we present the successful implementation of the informatics infrastructure enabling standardized data acquisition and data access for future real-time predictive modelling in clinical routine in psychiatry. The technical solution presented here might guide similar initiatives at other sites and thus help to pave the way toward future application of predictive models in psychiatric clinical routine.
Psychiatric disorders represent one of the leading causes of disability worldwide. In the challenge to provide advanced treatment and prevention strategies for psychiatric disorders, previous research has focused on better understanding of the neurobiological basis of affective disorders [
Importantly, large-scale studies reporting the successful application of multivariate models trained on data from electronic health records (EHRs), including features such as diagnosis and procedures, laboratory parameters, and medications for the prediction of suicide risk or weight gain following antidepressant treatment have demonstrated the capacity and generalizability of predictive models trained on real-world data [
This study aims to present the design and implementation of the technical requirements to address the aforementioned challenges with the ultimate goal of providing the basis for a successful future translation of predictive models to clinical application in psychiatric disorders. The implementation of the outlined technical solution will ultimately allow the evaluation of the potential of predictive models for the clinical management of psychiatric disorders under real-world conditions. In detail, we present the design and implementation of the informatics infrastructure, including technical solutions for (1) extension of the EHR via standardized electronic collection of patient-reported outcomes, (2) data transfer between EHRs and research databases, (3) pooling of EHR and research data in a unified database, and (4) visual presentation of the analysis results in the EHRs.
The main objective of this study was the design and successful implementation of the informatics infrastructure required to train and validate predictive models in day-to-day clinical application in psychiatry as part of the SEED 11/19 study [
Implementation of standardized documentation forms in EHRs.
The set-up of an interface for direct data transfer between clinical documentation systems and a database for predictive analysis.
The set-up of a unified database that allows pooling of clinical data with further research data for predictive analysis.
Visual presentation of relevant data entities and results of predictive analysis in EHRs at the point of care.
The Münster University Hospital in Germany is a tertiary care hospital with 1457 beds and 11,197 staff who treated 607,414 patients (inbound and outbound) in 2019 [
The EHR system ORBIS by Dedalus Healthcare is used at Münster University Hospital in more than 40 clinics and is the market leader in Germany, Austria, and Switzerland with over 1300 installations [
To address the study aims, the following requirements were identified through focus groups including physicians and researchers at Münster University Hospital in Germany.
Extension of the EHR via standardized data collection: At first sight, the widely established use of electronic documentation systems in clinical routine might support the notion of a fast translation of predictive models. However, until now, the majority of clinical data are still acquired and stored in an unstructured way that cannot be directly used for predictive modeling. Extension of EHR data via standardized forms of data collection in routine care is therefore required to provide a sufficient database for the development of predictive models. Importantly, the technical solution should be flexible and allow the content of the collected EHR data to be updated. Content-wise, in an initial step, standardized extension of EHR data should include assessment of symptomatology in order to allow both patient stratification at baseline and outcome measurement following intervention. Furthermore, standardized assessment of known risk factors, including life events and sociodemographic data, appears meaningful.
Data transfer: Routine EHR data storage systems are usually strictly separated from research databases for safety reasons and hence are not directly accessible for predictive analyses. Training and validation of predictive models based on EHR data require the set-up of interfaces and a database into which EHR data can be transferred and subsequently stored in a standardized way. In line with our study aim, the technical solutions should be scalable and allow data transfer in real time. EHR data transferred and stored in the database must be accessible to researchers in order to allow the development of predictive models.
Combination of EHRs and research data: Because routine EHR data storage systems are strictly separated from research databases, pooling of EHR and research data is not possible within state-of-the-art EHR databases. Pooling EHR data with research data in a unified database would allow predictive models trained on EHR data to be enriched with already existing research data and, furthermore, EHR data to be validated against research data. To this end, a unified scalable research database is needed that integrates each patient’s EHR data with research data acquired via experimental studies.
Presentation of standardized data within the EHR: Once collected, clinically useful standardized data as well as results of any analysis must be transferred back to the main EHR system in real time and presented to the clinician at the point of care.
An informatics infrastructure enabling real-time clinical predictive modeling based on the single-source architecture was derived from the named requirements. Custom metadata must be supported. The Clinical Data Interchange Standards Consortium (CDISC) Operational Data Model (ODM) (version 1.3.2) was used as a flexible standard for exchange and archiving of metadata within the framework of clinical studies [
The technical feasibility was demonstrated by the implementation of an infrastructure that enables clinical predictive modeling in real time. Java version 1.8.0_181 [
The Single-source Metadata ARchitecture Transformation (SMA:T) was used as the software architecture [
Unified Modeling Language sequence diagram of the data collection workflow. In process steps 1-3, the patient completes the forms and sends data to the communication server. In process steps 4-8, the communication server sends data to the electronic health record system and creates a blank documentation form. This form is populated with imported data. In process steps 9-13, SMA:T creates the documentation form with metadata and imported data. EHR: electronic health record; HL7: Health Level 7; MoPat: Mobile Patient Survey; ODM: operational data model; SMA:T: Single-source Metadata ARchitecture Transformation.
Unified Modeling Language sequence diagram of the data extraction workflow. In process steps 1-8, a study query is created with SMA:T and a generic operational data model file is saved in the database of the electronic health record system. In process steps 9-18, a generic Mirth Channel is created based on the study query. In process steps 19-20, data points are automatically extracted from the electronic health record system and transferred to the study database using operational data model standard format. EDC: electronic data capture; EHR: electronic health record; HDD: Hard Disc Drive; HL7: Health Level 7; LOC: Lines Of Code; MoPat: Mobile Patient Survey; ODM: operational data model; SMA:T: Single-source Metadata ARchitecture Transformation.
SEED software architecture of the Münster University Hospital. EHR: electronic health record; MoPat: Mobile Patient Survey; SMA:T: Single-source Metadata ARchitecture Transformation; *supports custom applications.
The implementation of the architecture is divided into 4 areas: data collection, data transfer, data storage, and data visualization. Agile methods were used for Project Life Cycle and Development Cycle [
SMA:T provides 2 options for data collection, namely, the EHR system in clinical routine and dedicated web applications. Data input via web applications can be designed freely. In this study, EHR data generated as part of clinical routine documentation comprised, among others, laboratory data, medication, information on diagnosis, time of admission, and length of stay and are presented in detail in
Research documentation used in the Department of Psychiatry.
Name of the documentation form | Items |
SEED ClinicalData Admission Date & Time | 4 |
SEED ClinicalData Classification | 2 |
SEED ClinicalData Diagnosis-Related Groups/Diagnosis | 3 |
SEED ClinicalData Electroconvulsive Therapy | 11 |
SEED ClinicalData Laboratory Assessments | 7 |
SEED ClinicalData Medication | 5 |
SEED ClinicalData Patient | 4 |
SEED ClinicalData Vital Signs | 3 |
One item of Beck Depression Inventory presented in the MoPat app (clinic for psychiatry and psychotherapy at Münster University Hospital).
SMA:T provides 2 types of data transfer in the present scenario, that is, data transfer into the EHR system and transfer into the electronic data capture system. MoPat sends data to the EHR system via the communication server of the University Hospital. Data are saved in the ClinicalData structure of the ODM format. The ODM document is embedded in an HL7 message. Each HL7 message creates a form in the EHR system. The header of the HL7 message determines which form is automatically created. Data transfer to the electronic data capture system takes place via SMA:T interfaces. Both retrospective and prospective data exports in real time are supported. When a study query is activated via the EHR frontend, metadata and corresponding structured query language statements are read by the SMA:T extension of the communication server. SMA:T uses its code library and channel framework to generate unique Mirth channels. These send a database query to the EHR system and transfer the output directly to the electronic data capture system. Both metadata (clinical documentation form) and clinical patient data are provided by SMA:T in the ODM format. Data records are combined into an ODM document. In this study, SMA:T converts the resulting XML-based ODM document into JavaScript Object Notation format [
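The transfer path described above can be illustrated with a minimal Python sketch. It is not the SMA:T or MoPat implementation. It only shows, under stated assumptions, how a form instance might be written to the ODM ClinicalData structure, carried inside an HL7 version 2 style message, and flattened to JavaScript Object Notation. The function names, the OIDs, the message type, and the choice of header field that names the target form are hypothetical; the text does not specify them.

```python
import base64
import json
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"  # namespace of CDISC ODM 1.3.x


def _tag(name):
    """Qualify an element name with the ODM namespace."""
    return f"{{{ODM_NS}}}{name}"


def build_odm(subject_key, form_oid, items):
    """Build a simplified ODM ClinicalData document for one form instance.

    All OIDs other than the arguments are placeholders chosen for this sketch.
    """
    ET.register_namespace("", ODM_NS)
    odm = ET.Element(_tag("ODM"), {"ODMVersion": "1.3.2", "FileType": "Snapshot",
                                   "FileOID": "demo-file"})
    clinical = ET.SubElement(odm, _tag("ClinicalData"),
                             {"StudyOID": "S.DEMO", "MetaDataVersionOID": "MDV.1"})
    subject = ET.SubElement(clinical, _tag("SubjectData"), {"SubjectKey": subject_key})
    event = ET.SubElement(subject, _tag("StudyEventData"), {"StudyEventOID": "SE.1"})
    form = ET.SubElement(event, _tag("FormData"), {"FormOID": form_oid})
    group = ET.SubElement(form, _tag("ItemGroupData"), {"ItemGroupOID": "IG.1"})
    for item_oid, value in items.items():
        ET.SubElement(group, _tag("ItemData"), {"ItemOID": item_oid, "Value": str(value)})
    return ET.tostring(odm, encoding="unicode")


def wrap_in_hl7(odm_xml, form_code):
    """Embed the ODM document in a pipe-delimited HL7 v2 style message.

    Assumption: the form code is placed in MSH-3 only as a placeholder for
    "the header determines which form is created". The payload is Base64
    encoded so that XML characters cannot collide with HL7 delimiters.
    """
    payload = base64.b64encode(odm_xml.encode("utf-8")).decode("ascii")
    msh = f"MSH|^~\\&|{form_code}|PSY|EHR|HOSPITAL|20200101120000||ORU^R01|MSG0001|P|2.5"
    obx = f"OBX|1|ED|ODM||^application^xml^Base64^{payload}"
    return "\r".join([msh, obx])


def odm_to_json(odm_xml):
    """Flatten ODM ClinicalData into a JSON list of form records."""
    ns = {"odm": ODM_NS}
    root = ET.fromstring(odm_xml)
    records = []
    for subject in root.iterfind(".//odm:SubjectData", ns):
        for form in subject.iterfind(".//odm:FormData", ns):
            items = {i.get("ItemOID"): i.get("Value")
                     for i in form.iterfind(".//odm:ItemData", ns)}
            records.append({"subject": subject.get("SubjectKey"),
                            "form": form.get("FormOID"),
                            "items": items})
    return json.dumps(records, sort_keys=True)


if __name__ == "__main__":
    xml_doc = build_odm("P001", "F.BDI", {"I.BDI.1": 2, "I.BDI.2": 0})
    print(wrap_in_hl7(xml_doc, "F.BDI").replace("\r", "\n"))
    print(odm_to_json(xml_doc))
```

Keeping the ODM document intact inside the message means the receiving side can parse it with the same ODM tooling that is used for export, which is one plausible reason for a single-source design of this kind.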
Data storage addresses metadata and clinical data. Metadata of clinical documentation forms are stored centrally in the SMA:T database. The SMA:T database model is part of the EHR database model. Metadata and clinical data are available in the ODM format. MoPat also supports ODM format; therefore, the same data model can be used for both systems. Clinical data are clearly identified by unique object identifiers and the associated object identifier on the documentation form.
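As a hedged illustration of how object identifiers tie clinical data to form metadata, the following Python sketch resolves each ODM ItemData entry against its ItemDef by OID. The XML fragment is a simplified, hand-written example rather than an export from SMA:T, and the OIDs, item names, and the function `resolve_items` are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Simplified ODM fragment: metadata (ItemDef) and clinical data (ItemData)
# share the same OIDs. All identifiers and names here are illustrative.
ODM_XML = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3" ODMVersion="1.3.2">
  <Study OID="S.DEMO">
    <MetaDataVersion OID="MDV.1" Name="demo">
      <ItemDef OID="I.1" Name="Sadness" DataType="integer"/>
      <ItemDef OID="I.2" Name="Pessimism" DataType="integer"/>
    </MetaDataVersion>
  </Study>
  <ClinicalData StudyOID="S.DEMO" MetaDataVersionOID="MDV.1">
    <SubjectData SubjectKey="P001">
      <StudyEventData StudyEventOID="SE.1">
        <FormData FormOID="F.BDI">
          <ItemGroupData ItemGroupOID="IG.1">
            <ItemData ItemOID="I.1" Value="2"/>
            <ItemData ItemOID="I.2" Value="0"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>"""


def resolve_items(odm_xml):
    """Return (item name, typed value) pairs by joining ItemData to ItemDef on OID."""
    root = ET.fromstring(odm_xml)
    item_defs = {d.get("OID"): d for d in root.iterfind(".//odm:ItemDef", NS)}
    rows = []
    for item in root.iterfind(".//odm:ItemData", NS):
        definition = item_defs[item.get("ItemOID")]  # KeyError signals an unknown OID
        raw = item.get("Value")
        value = int(raw) if definition.get("DataType") == "integer" else raw
        rows.append((definition.get("Name"), value))
    return rows


if __name__ == "__main__":
    for name, value in resolve_items(ODM_XML):
        print(name, value)
```

Because both systems use the same data model, a lookup of this kind would work unchanged on data that originate from either source, provided the OIDs are kept stable.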
Usability principles were applied to visualize data [
As part of the study, 11 standardized documentation forms with 202 items were created for the clinic for psychiatry and psychotherapy (
Routine documentation used in the Department of Psychiatry.
Name of the documentation form (n=11) | Items (n=202) |
Beck Depression Inventory | 23 |
Big Five Inventory (BFI-2-S) | 35 |
Big Five Inventory (BFI-2-XS) | 20 |
Childhood Trauma Questionnaire | 34 |
Family Mental History | 14 |
Hamilton Depression Scale | 25 |
Narcissistic Admiration and Rivalry Questionnaire | 9 |
Symptom Checklist-90 Somatization Scale | 14 |
Sociodemographic questionnaire | 5 |
Questions on individual disease course | 18 |
Questions on somatic comorbidities | 5 |
Number of instances created for each documentation form: the counts of patients, patient cases, and users are shown.
Name of the documentation form | Cases | Patients | Instances | Users |
Beck Depression Inventory | 380 | 307 | 1266 | 50 |
Big Five Inventory (BFI-2-S) | 358 | 303 | 559 | 25 |
Big Five Inventory (BFI-2-XS) | 258 | 217 | 692 | 19 |
Childhood Trauma Questionnaire | 313 | 303 | 354 | 33 |
Family Mental History | 315 | 305 | 343 | 31 |
Hamilton Depression Scale | 350 | 296 | 516 | 42 |
Narcissistic Admiration and Rivalry Questionnaire | 357 | 302 | 558 | 20 |
Symptom Checklist-90 Somatization Scale | 360 | 303 | 564 | 18 |
Sociodemographic questionnaire | 315 | 305 | 344 | 26 |
Questions on individual disease course | 315 | 305 | 342 | 26 |
Questions on somatic comorbidities | 313 | 303 | 328 | 10 |
Data quality of patient-based documentation regarding score calculation.
Name of the documentation form | Instances | Scores | Missing data (a) |
Beck Depression Inventory | 1266 | 1238 | 28 |
Big Five Inventory (BFI-2-S) | 559 | 540 | 19 |
Big Five Inventory (BFI-2-XS) | 692 | 656 | 36 |
Childhood Trauma Questionnaire | 354 | 320 | 34 |
Hamilton Depression Scale | 516 | 502 | 14 |
Narcissistic Admiration and Rivalry Questionnaire | 558 | 550 | 8 |
Symptom Checklist-90 Somatization Scale | 564 | 554 | 10 |
(a) Missing data frequency is determined by missing data entries.
Data on the completeness of the documentation forms. (a)
Name of the documentation form | Items | Completed items | Uncompleted items |
Beck Depression Inventory | 29,118 | 29,015 | 103 |
Big Five Inventory (BFI-2-S) | 19,565 | 19,519 | 46 |
Big Five Inventory (BFI-2-XS) | 13,840 | 13,739 | 101 |
Childhood Trauma Questionnaire | 12,036 | 11,985 | 51 |
Family Mental History | 4802 | 4354 | 448 |
Hamilton Depression Scale | 12,384 | 12,076 | 308 |
Narcissistic Admiration and Rivalry Questionnaire | 5031 | 5012 | 19 |
Symptom Checklist-90 Somatization Scale | 7896 | 7879 | 17 |
Sociodemographic questionnaire | 1720 | 1715 | 5 |
Questions on individual disease course | 6156 | 5453 | 703 |
Questions on somatic comorbidities | 1640 | 1095 | 545 |
(a) In this context, completeness means that the documentation form contains values in all data points.
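The completeness figures can be turned into rates with a few lines of Python. This is an illustrative calculation on the numbers reported in the table, not part of the study's analysis code, and the names `FORM_ITEMS`, `completeness_pct`, and `summarize` are chosen for this sketch.

```python
# (total items, completed items) per documentation form, copied from the table
FORM_ITEMS = {
    "Beck Depression Inventory": (29118, 29015),
    "Big Five Inventory (BFI-2-S)": (19565, 19519),
    "Big Five Inventory (BFI-2-XS)": (13840, 13739),
    "Childhood Trauma Questionnaire": (12036, 11985),
    "Family Mental History": (4802, 4354),
    "Hamilton Depression Scale": (12384, 12076),
    "Narcissistic Admiration and Rivalry Questionnaire": (5031, 5012),
    "Symptom Checklist-90 Somatization Scale": (7896, 7879),
    "Sociodemographic questionnaire": (1720, 1715),
    "Questions on individual disease course": (6156, 5453),
    "Questions on somatic comorbidities": (1640, 1095),
}


def completeness_pct(total, completed):
    """Share of item fields that contain a value, in percent."""
    return 100.0 * completed / total


def summarize(forms):
    """Return per-form rates, the number of uncompleted items, and the overall rate."""
    per_form = {name: completeness_pct(t, c) for name, (t, c) in forms.items()}
    total = sum(t for t, _ in forms.values())
    completed = sum(c for _, c in forms.values())
    return per_form, total - completed, completeness_pct(total, completed)


if __name__ == "__main__":
    per_form, uncompleted, overall = summarize(FORM_ITEMS)
    # List forms from least to most complete
    for name, pct in sorted(per_form.items(), key=lambda kv: kv[1]):
        print(f"{name}: {pct:.1f}%")
    print(f"Overall: {overall:.1f}% ({uncompleted} uncompleted items)")
```

On these figures, overall item completeness is about 97.9%, and the form on somatic comorbidities has the lowest rate at roughly 67%.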
Number of retrospectively transferred research documentation forms (electronic health record to electronic data capture). (a)
Name of the documentation form | Instances in electronic health records |
SEED ClinicalData Admission Date & Time | 245 |
SEED ClinicalData Classification | 8260 |
SEED ClinicalData Diagnosis-Related Groups/Diagnosis | 1163 |
SEED ClinicalData Electroconvulsive Therapy | 452 |
SEED ClinicalData Laboratory Assessments | 22,886 |
SEED ClinicalData Medication | 14,244 |
SEED ClinicalData Patient | 245 |
SEED ClinicalData Vital Signs | 48,828 |
(a) Electronic health record data were extracted with generic study queries in the Single-source Metadata ARchitecture Transformation system.
Number of instances created with Mopat@home for each documentation form.
Name of the documentation form | Instances |
Beck Depression Inventory | 65 |
Big Five Inventory (BFI-2-S) | 64 |
Childhood Trauma Questionnaire | 65 |
Family Mental History | 65 |
Narcissistic Admiration and Rivalry Questionnaire | 64 |
Symptom Checklist-90 Somatization Scale | 66 |
Sociodemographic questionnaire | 66 |
Questions on individual disease course | 65 |
Questions on somatic comorbidities | 65 |
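As a cross-check, the instance counts in the tables can be summed and compared with the totals stated in the abstract (6451 instances transferred to the EHR system and 96,323 instances transferred to the research database). The short Python sketch below does this. Reading the 6451 total as the sum of the per-form instance table and the Mopat@home table is an interpretation that the numbers support, not a statement made explicitly in the text, and the variable names are chosen for this sketch.

```python
# Instances per documentation form (table with cases, patients, instances, users)
FORM_INSTANCES = [1266, 559, 692, 354, 343, 516, 558, 564, 344, 342, 328]

# Instances created with Mopat@home
MOPAT_AT_HOME_INSTANCES = [65, 64, 65, 65, 64, 66, 66, 65, 65]

# Instances transferred retrospectively from the EHR to electronic data capture
EHR_TO_EDC_INSTANCES = [245, 8260, 1163, 452, 22886, 14244, 245, 48828]

patient_reported_total = sum(FORM_INSTANCES) + sum(MOPAT_AT_HOME_INSTANCES)
ehr_export_total = sum(EHR_TO_EDC_INSTANCES)

if __name__ == "__main__":
    print("Transferred to the EHR system:", patient_reported_total)
    print("Transferred to the research database:", ehr_export_total)
```

Both sums agree with the reported totals, which suggests that the tables and the summary figures are internally consistent.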
The aim of this study was the design and implementation of an informatics infrastructure enabling standardized data acquisition at the point of care and subsequent accessibility of clinical data for analytic purposes, which is required for future application of predictive models in day-to-day clinical routine in psychiatry. In this study, we have shown the overall technical feasibility of the implemented solution. Standardized documentation forms were implemented to extend EHR data domains and to improve data quality in the EHR system. An automated transfer of data into the EHR system and the research database was implemented, thus enabling the pooling of EHR data with already existing research data from ongoing cohort studies. This system was accepted by clinical staff from the Department of Psychiatry of Münster University Hospital in Germany. Widespread use of documentation forms could be demonstrated. Standardized electronic data collection in the EHR at the point of care was successfully implemented. The latter solution can similarly be applied for the presentation of results from predictive models.
The major strengths of this study are standardized acquisition, transfer, storage, and export of data in real time with a generic informatics infrastructure. This system fulfills the prerequisites for future predictive modelling in clinical routine in psychiatry [
Through our study, we extend a previous line of research on predictive modeling based on EHR data. While previous studies have demonstrated empirical evidence for the predictive validity of EHR data in psychiatric use cases [
The informatics infrastructure for standardized data acquisition, transfer, storage, and export in real time for future predictive modelling outlined in this study is an important step in the complex process toward the implementation of machine learning and clinical decision support solutions in routine care. Our study shows that this approach is technically feasible. Owing to the standardization, this concept is also scalable for other medical areas. Data warehouse applications of a heterogeneous hospital landscape can be implemented with this software architecture. In addition to local artificial intelligence applications, multi-site implementations of the architecture could also transfer pseudonymized data points into a global predictive model. The implementation of national and international predictive models in medicine would be possible.
Artificial intelligence systems rely on high-quality data. In the future, artificial intelligence applications might send real-time evaluations directly back into the EHR system. Clinical staff could access and respond to calculated predictions. Selected data will be provided in a modular dashboard. Medical device regulation needs to be taken into account for implementation of such systems. Direct data transfer back from the clinic would be possible. Real-time adjustments of the prediction models would thus be possible. Standardization of clinical routine documentation via SMA:T can provide high-quality structured data points. It is planned to augment this database with further research data from existing cohort studies, for example, covering neuroimaging and genetic data. Specific prediction models can be trained in this way with the same architecture. Generic model pipelines can be set up. Model clusters can be set up to answer complex medical questions. SMA:T thus forms a solid technical infrastructure for the implementation of artificial intelligence solutions in medicine. Schema extensions of the ODM standard can be implemented to optimize communication between systems. Observational and interventional studies are warranted to evaluate the predictive validity of machine learning models in psychiatric routine. For multi-center studies, SMA:T needs to be reimplemented in the respective EHR environments to process CDISC ODM files. A software blueprint is available [
The presented informatics infrastructure enabling standardized data acquisition, transfer, storage, and export in real time for future predictive modelling in clinical routine in psychiatry is technically feasible. The outlined architecture provides, first and foremost, a technical basis for the application and validation of clinical decision support systems and artificial intelligence applications in clinical studies.
Beck Depression Inventory documentation form created with Single-source Metadata ARchitecture Transformation.
Symptom Checklist-90 Somatization Scale documentation form created with Single-source Metadata ARchitecture Transformation.
CDISC: Clinical Data Interchange Standards Consortium
EHR: electronic health record
FHIR: Fast Healthcare Interoperability Resources
HL7: Health Level 7
MoPat: Mobile Patient Survey
ODM: operational data model
SMA:T: Single-source Metadata ARchitecture Transformation
We are deeply indebted to all the participants of this study. This study was supported by a grant from BMBF (HiGHmed 01ZZ1802V). Funding was provided by the Interdisciplinary Center for Clinical Research (IZKF) of the medical faculty of Münster (Grant SEED 11/19 to NO) as well as the “Innovative Medizinische Forschung” (IMF) of the medical faculty of Münster (Grants OP121710 to NO).
RB and NO drafted the manuscript. RB developed the software architecture and conducted the statistical analyses. NO was responsible for the formulation of the overarching research goal and aims, the study design, conception, development of the methodology, management and coordination responsibility for the research activity planning and execution, data acquisition and design, creation of data models, and data analysis. NO and MD acquired financial support for the conduct of this study. MS supported programming and the implementation of computer code. All authors contributed to the interpretation of data. All authors critically reviewed and substantially revised the final manuscript.
None declared.