A convergence data model for medical information related to acute myocardial infarction
© The Author(s) 2016
Received: 17 January 2016
Accepted: 29 June 2016
Published: 5 October 2016
Medical and health data are now widely regarded as highly valuable big data. However, current medical information systems have mainly aimed at digitizing, storing, and transferring clinical materials. Many experts therefore assert the need for intelligent management and utilization of medical information. In this paper, in order to manage medical data that are stored separately in various materials, we define a convergence data model by analyzing the characteristics and semantic relations of medical data. In particular, since the clinical materials and medical data essential for diagnosis and treatment depend on the kind of disease, the proposed data model is applied to a target disease, AMI (acute myocardial infarction). Our model can contribute to the effective management and provision of medical data.
Recently, the need to discover meaning in medical big data and to utilize it, rather than merely storing clinical materials electronically or processing each datum in isolation, has significantly increased. In particular, as some works have tried to use medical/health data and personal life-logs for personalization services, standardization of those data has been conducted. International institutions have recognized the value of the EPR (electronic patient record), EHR (electronic health record), PHR (personal health record), health data, and UHR (universal health record). To describe those data, they have presented standards such as CCR (continuity of care record), CDA (clinical document architecture), and CCD (continuity of care document). These standards provide specifications for the exchange, sharing, and integration of electronic medical and health information; they support the management of general information and provide XML (extensible markup language)-based schemas. However, they do not take disease-specificity into account and cannot describe associative relations among data. That is, with these standards it is hard to describe the details of, and the semantic relations between, heterogeneous data related to a specific disease. Furthermore, some works have actively tried to process, mine, and manage big data in the medical and health areas, and some studies have aimed at diagnosing patients’ conditions or predicting prognoses.
However, medical data and services have particular characteristics compared with other big data areas. The purposes of general big data mining are to discover hidden meanings, to support decision-making, to recommend actions, and to predict the future by processing and analyzing data that would otherwise be treated as useless. In the medical field, by contrast, many experts are concerned about the risk of analyses or decisions made by systems rather than by human experts, and they distrust such results. Therefore, rather than diagnosing or deciding on treatments, efficiently providing and managing medical data is more feasible and more helpful for medical experts. In other words, technologies that automate or replace the roles of physicians are less practical; what is needed is a method to manage clinical data so that experts can promptly search for and refer to related data. It can be helpful to provide clinical data through various views based on their semantics, importance, and relations. Therefore, in this paper, we propose a semantic convergence method based on analyzing the characteristics and associations of heterogeneous medical data that are currently distributed across different systems.
Medical data generated in the medical field are vast in volume, and their types, formats, and attributes are heterogeneous, depending on the medical department and the disease. Medical data include common parts such as a patient’s basic profile and vital information, but clinical materials (documents, examination results, etc.) differ in modality from disease to disease. One type of material can contain data of various types, formats, properties, and significance. Besides, the kinds and formats of a single datum depend on the institution and on the medical experts involved (inspectors, physicians, and creators of materials). In this paper, we select a target disease in order to consider disease-specific characteristics. Acute myocardial infarction (AMI) is an urgent disease for which physicians must rapidly assess a patient’s condition and determine procedures from the patient’s information within the golden time. This target disease is therefore well suited to showing the necessity and effects of our method for converging heterogeneous medical data.
Therefore, in this paper, we propose a method for the semantic convergence and modeling of medical information. In particular, in order to consider disease-specificity, we focus on AMI, since it is a critical disease that requires the efficient provision of the essential parts of vast information for a quick decision within the golden time. The proposed method can extract semantic data that are distributed across various medical materials and unify them into one record.
Related works for describing and analysing medical information
- Works: an XML-based model to describe medical information; a model for clinical documents; a mapping between CCR and CDA. Limitation: they cannot consider disease-specific data and their semantic relations.
Medical data modeling
- Works: data models for one type of medical image, for automatic annotation or interpretation. Limitation: they cannot consider semantic relations between data inside various clinical materials.
- Works: methods to automatically analyze and process CT or CAG; methods to predict patients’ condition or recommend treatment by monitoring vital data. Limitation: the decision or prediction has high risk.
Standards for medical data
As personal health information, including medical data, has become increasingly important, institutions such as ASTM (American Society for Testing and Materials) and HL7 (Health Level 7) have developed various standards for health records. The most representative standard, CCR, is an XML-based standard for electronically describing patients’ health information. It consists of three core components: the CCR Header, the CCR Body, and the CCR Footer. The CCR Header defines basic information (unique identifier, creation date, etc.) for documents or records, and the CCR Footer describes additional information. The core part, the CCR Body, contains significant patient-specific information such as medical problems (date, condition, etc.), family history (blood type, genetic relatives, etc.), social factors (life pattern, environmental factors, etc.), allergies, medication, and so on.
CDA is a document markup standard for the structure and semantics of clinical documents. It provides a kind of template for clinical documents and comprises two parts. The CDA Header contains basic elements of a CDA document, such as its type and provider. The CDA Body specifies all the sections of the health record, such as diseases, medical procedures, plan of care, and so on.
The strength of these standards for medical and health information is that they cover a wide range of general health information and represent it in XML-based forms. However, they cannot describe detailed data for a specific disease, and their specifications lack flexibility. Moreover, they do not consider semantic relations between data elements. In other words, with these standards it is difficult to represent the specific data related to a certain disease such as AMI, or the semantic associations between heterogeneous data.
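To make the header/body/footer layout described above more concrete, the following minimal Python sketch builds a CCR-like XML document. The element names (`Record`, `Header`, `Body`, `Footer`, `Problem`, and so on) and the sample values are illustrative assumptions; they are not taken from the ASTM CCR schema. The sketch only suggests how such a generic layout groups information by section, without any place for disease-specific relations between elements.

```python
import xml.etree.ElementTree as ET


def build_ccr_like_record(record_id, created, problems, medications):
    """Build a three-part (header/body/footer) XML record.

    Tag names are hypothetical and only mirror the structure described
    in the text; they are not the element names of the real CCR schema.
    """
    root = ET.Element("Record")

    # Header: basic information about the record itself.
    header = ET.SubElement(root, "Header")
    ET.SubElement(header, "Id").text = record_id
    ET.SubElement(header, "Created").text = created

    # Body: patient-specific information, grouped by section.
    body = ET.SubElement(root, "Body")
    problems_el = ET.SubElement(body, "Problems")
    for date, condition in problems:
        problem = ET.SubElement(problems_el, "Problem", date=date)
        problem.text = condition
    meds_el = ET.SubElement(body, "Medications")
    for name in medications:
        ET.SubElement(meds_el, "Medication").text = name

    # Footer: additional information.
    footer = ET.SubElement(root, "Footer")
    ET.SubElement(footer, "Comment").text = "illustrative example"
    return root


# Made-up sample values, used only to exercise the function.
record = build_ccr_like_record(
    "rec-001",
    "2016-01-17",
    problems=[("2015-11-02", "chest pain")],
    medications=["aspirin"],
)
print(ET.tostring(record, encoding="unicode"))
```

In this layout each section is an independent container, so a link such as "this examination result supports that diagnosis" would have to be added outside the structure.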
Analysis of medical data
Many studies have been conducted to build medical data models and to semantically interpret or analyze them. In particular, some have presented data models for describing the semantic data embedded in videos and images, which are among the most important medical materials [13–15]. They proposed ontology schemas by extracting characteristics of medical videos or images and tried to develop systems for annotation or for automatic extraction of semantics. However, since they focused on only one type of clinical material, they did not consider the heterogeneous data in EMR or PACS. In recent years, some works have started to construct domain knowledge bases from the semantic relations of data stored in medical information systems such as EMR. However, those works focused on automatically finding relations between symptoms and disorders. Therefore, studies on methods for efficiently providing medical data by converging heterogeneous data are still insufficient.
In studies of cardiac diseases, the target domain of this paper, some works have aimed at automatically processing or analyzing CT (computed tomography) or angiography videos [17, 18]. Some systems have also been developed to inform medical experts of patients’ conditions, to recommend required treatments, or to predict prognoses [19, 20]. However, these works considered a single kind of medical data and applied statistical analyses or rule-based mining to fragmentary data on patients’ current conditions. Such decision or diagnosis systems have little practicality in real medical institutions.
Data analysis for AMI
Acute myocardial infarction (AMI), commonly known as a heart attack, occurs when the coronary arteries become blocked or narrowed due to a buildup of plaque. This can cause a significant decrease or a complete stop of the blood flow to the heart. In the case of AMI, a correct diagnosis and treatment should be completed within 90 min at most, and the total time taken directly affects the prognosis and the death rate. Therefore, systems should be able to provide and manage the essential data required in an emergency so that medical experts can make a quick decision. However, in current hospital information systems (HISs), the scope and kinds of data that experts can check depend on cooperation between institutions and on physicians’ capability, since the medical data for one patient are stored and managed separately by institution and by kind of data. Such management can have fatal results in an AMI emergency.
In this paper, to converge medical information, we collected important clinical materials related to AMI, extracted semantic data from the individual materials, and defined one data record based on their associative relations.
First, in order to make our data model realistic and practical, we collected materials that are actually used at four general tertiary hospitals in Korea. We then selected eight types of commonly used materials. Our samples contain four types of images/videos, two types of reports, and two types of examination result reports.
Categorization of medical data
As mentioned in the introduction, modeling the data contained in clinical materials is necessary for the intelligent utilization of medical information. Before defining a data model, it is necessary to analyze the types and characteristics of the data that are generated and used at medical institutions. Therefore, in this section, we define the scope, kinds, and properties of the clinical data that should be included in our convergence model by classifying them according to their degree of structure.
Structure-based categorization of medical data
Structured data (S): data which reside in a fixed field with specifications or forms
- Data which are stored in a certain field in an EMR system
- Metadata of clinical materials such as reports, images, or videos
Semi-structured data (SS): data that do not have fixed formats, but have some kind of schema or organizational properties
- Additional explanations that experts manually write for images or videos. They are text descriptions, but are written with certain patterns or forms
- Comments of medical experts
- Texts that medical experts write in medical reports. Generally they are treated as unstructured data, but medical texts have regular patterns or rules
Unstructured data (U): data which have no identifiable format or structure
- Images or videos generated from medical examinations
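The three categories above can be encoded directly, which is useful when each extracted data item has to be tagged before convergence. The following is a minimal Python sketch; the two boolean flags used to decide the category are a simplifying assumption made for illustration and are not a classification procedure given in the paper.

```python
from enum import Enum


class Structure(Enum):
    """Degree of structure, following the three categories in the text."""
    STRUCTURED = "S"
    SEMI_STRUCTURED = "SS"
    UNSTRUCTURED = "U"


def categorize(has_fixed_field: bool, has_regular_pattern: bool) -> Structure:
    """Assign a category from two yes/no observations about a data item.

    The two flags are a hypothetical simplification for this sketch.
    """
    if has_fixed_field:
        return Structure.STRUCTURED
    if has_regular_pattern:
        return Structure.SEMI_STRUCTURED
    return Structure.UNSTRUCTURED


# Example items paraphrased from the categorization above.
examples = {
    "EMR field value": categorize(True, False),
    "expert comment in a report": categorize(False, True),
    "examination image": categorize(False, False),
}
for name, category in examples.items():
    print(f"{name}: {category.value}")
```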
Analysis of AMI-related data
Table 3 AMI-related clinical materials
- CAG (coronary angiography): a video resulting from coronary angiography or examinations, which includes metadata and values measured by examinations
- An image of EKG examination results, which includes the image itself, measurement values, and metadata
- Coronary angiography report: a report describing information about medical procedures or treatment procedures. It describes AMI-related information (patient history, current states, etc.) as text with certain writing patterns
- PTCA and Stent Deployment report: a cardiac diagram including major diagnosis and treatment information. It is generated by handwriting or system input, depending on the medical institution
- Cardiology lab sheet: a report about blood tests related to AMI
- A report for echocardiography. It includes numerical values for each examination item and text comments of experts (physicians)
- System input: materials generated by entering data in fixed fields in the electronic systems of hospitals. Most of them contain structured data.
- Document: various reports which include both structured data and semi-structured texts.
- Image/video: images or videos resulting from medical examinations. The images or videos themselves are unstructured data, but they also include structured data such as measurement values and their metadata.
- Chart: analogue or digital charts in which medical experts write or draw data by hand or by using systems. These charts can contain all types of data.
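The relation between these four material types and the structure categories (S, SS, U) can be written as a small lookup table. The Python sketch below is an illustration only; the dictionary and the helper name `materials_containing` are hypothetical, and "system input" is mapped to structured data alone because the text says most such materials contain structured data.

```python
# Which structure categories each material type can contain,
# following the descriptions of the four material types.
MATERIAL_CONTENTS = {
    "system input": {"S"},          # mostly structured data
    "document": {"S", "SS"},        # structured data and semi-structured text
    "image/video": {"S", "U"},      # unstructured media plus values/metadata
    "chart": {"S", "SS", "U"},      # can contain all types of data
}


def materials_containing(category):
    """Return the material types that can contain the given category."""
    return sorted(
        name for name, categories in MATERIAL_CONTENTS.items()
        if category in categories
    )


print(materials_containing("SS"))
print(materials_containing("U"))
```

A table like this makes explicit that one material type can carry several kinds of data, which is why data have to be extracted per element and not per material.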
Among these materials, we focus on the three types highlighted in gray in Table 3 and analyze each material in order to extract and unify their data based on semantic relations. These materials are related to CAG (coronary angiography), which is an essential medical examination for the diagnosis and treatment of AMI.
Analysis of data in CAG report (items shown include: attitude to chest pain; chest pain start/first/recent; left coronary artery; right coronary artery)
Data convergence modeling for AMI
Table 5 A convergence model of three types of CAG-related data for AMI (recoverable element: chest pain start/first/recent)
- Patient: a patient’s basic information, such as personal information and coronary anatomical information.
- Physical history: a patient’s states or habits which have a strong influence on AMI.
- Vital history: a patient’s basic medical states related to AMI.
- Medical history: information about past diagnoses or treatments related to AMI. That is, this part includes the locations and states of lesions, the disease names of past diagnoses, and the results of treatments.
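The four groups above suggest a simple container for one unified record. The Python sketch below shows one possible way to merge elements extracted from several materials into such a record. The class name `ConvergenceRecord`, the function `unify`, the tuple layout, the assignment of elements to groups, and all sample values are assumptions made for illustration; the paper does not specify an implementation.

```python
from dataclasses import dataclass, field

GROUPS = ("patient", "physical_history", "vital_history", "medical_history")


@dataclass
class ConvergenceRecord:
    """One record per patient, with one dict per group from the text."""
    patient: dict = field(default_factory=dict)
    physical_history: dict = field(default_factory=dict)
    vital_history: dict = field(default_factory=dict)
    medical_history: dict = field(default_factory=dict)


def unify(extracted):
    """Merge (group, element, value, source) tuples into one record.

    Values for the same element coming from different materials are kept
    side by side together with the material they were extracted from.
    """
    record = ConvergenceRecord()
    for group, element, value, source in extracted:
        if group not in GROUPS:
            raise ValueError(f"unknown group: {group}")
        getattr(record, group).setdefault(element, []).append((value, source))
    return record


# Hypothetical extraction results from two CAG-related materials.
extracted = [
    ("medical_history", "lesion location", "left coronary artery",
     "coronary angiography report"),
    ("medical_history", "lesion location", "left coronary artery",
     "PTCA and stent deployment report"),
    ("vital_history", "chest pain start", "placeholder value",
     "coronary angiography report"),
]
record = unify(extracted)
print(record.medical_history["lesion location"])
```

Keeping the source material next to each value preserves the link between the unified record and the original documents, images, or videos.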
Properties for elements in a record
- Creator: the medical experts (physicians, inspectors, etc.) or institutions who created and modified each element.
- Date: the date on which creators created and changed the value of each element.
- Importance: the level of significance of each element. This property can have grades depending on whether the element is essential for the diagnosis and treatment, or on the degree to which it influenced AMI.
- Reference-metadata: metadata for ‘reference’ elements of a ‘Medical History’ group. This property represents metadata of related materials (documents, images/videos, etc.), such as equipment models and angles of examinations.
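The four properties above can be attached to every element of a record. The following Python sketch shows one possible representation and how the Importance property could be used to order elements for display. The class and function names, the integer importance scale (larger means more important), and the sample values are hypothetical; the paper only states that importance "can have grades".

```python
import datetime
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """A record element carrying the four properties from the text."""
    name: str
    value: str
    creator: str                       # expert or institution
    date: datetime.date                # creation or last change
    importance: int                    # assumed scale: larger = more important
    reference_metadata: Optional[dict] = None  # only for 'reference' elements


def by_importance(elements):
    """Order elements by importance, then by most recent date."""
    return sorted(
        elements,
        key=lambda e: (-e.importance, -e.date.toordinal()),
    )


# Made-up sample elements, used only to exercise the ordering.
elements = [
    Element("chest pain start", "placeholder", "physician A",
            datetime.date(2016, 1, 5), importance=2),
    Element("reference", "CAG video", "inspector B",
            datetime.date(2016, 1, 5), importance=1,
            reference_metadata={"equipment_model": "hypothetical-model",
                                "angle": "hypothetical-angle"}),
    Element("lesion location", "left coronary artery", "physician A",
            datetime.date(2016, 1, 6), importance=2),
]
for element in by_importance(elements):
    print(element.name, element.importance, element.date)
```

Ordering by importance first is one way to surface the elements that matter most for a quick decision; other views, for example by creator or by date, could be derived from the same properties.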
The proposed data model shown in Table 5 can serve as a foundation for efficiently managing and promptly providing the information that is crucial for decisions on diagnosis and treatment, particularly in the case of urgent diseases.
Data generated in the medical field are vast in volume and heterogeneous in terms of types, formats, and characteristics. However, current HISs merely store these data in different documents and images/videos. It is therefore difficult to provide relevant data efficiently within the golden time, when experts need to view the essential data for decisions about urgent diseases such as AMI. In order to manage these data usefully, a new data model should be defined based on an analysis of the characteristics and semantic relations of the data included in heterogeneous materials. In this paper, we proposed a convergence model that specifies the data essential for the diagnosis of AMI by analyzing three materials related to CAG. In contrast with current HISs, the proposed model can unify the semantic data contained in various materials into a single record. The convergence record will enable medical experts to search important data easily and intuitively for quick decisions about diagnosis and treatment.
- Improvement of the convergence record: in order to enhance the coverage and completeness of our data model, we will extend and refine its elements by analyzing all of the essential materials.
- Verification: it is necessary to examine, with medical experts, the feasibility or applicability of the data model to HISs.
- Implementation and experiment: we will implement a medical data provision and management system that applies the model, and will qualitatively and quantitatively compare its efficiency with that of current HISs.
ML, YSP, MHK and JWL are responsible for the concept of the paper, the results presented and the writing. All authors read and approved the final manuscript.
This work was supported by Institute for Information & communications Technology Promotion (IITP) Grant funded by the Korea government (MSIP) (No.B0101-15-247, Development of Open ICT Healing Platform using Personal Health Data).
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
- MacKinnon W, Wasserman M (2009) Implementing electronic medical record systems. IT Prof 11(6):50–53. doi:10.1109/MITP.2009.125
- Jeun Y-J (2013) EMR system and patient medical information protection. Korean J Health Serv Manage 7(3):213–224. doi:10.12811/kshsm.2013.7.3.213
- Zhao C, Zhang L (2013) Research of information presentation for electronic medical record based on ontology. In: Proceeding of the 2013 international conference on information management, innovation management and industrial engineering, pp 489–492. doi:10.1109/iciii.2013.6703628
- Na H-S, Yun S-Y, Park S-C (2012) Design and implementation of mapping system for effective health information data exchange in multi-platform environment. J Korean Inst Inf Technol 10(12):143–150
- Perera S, Henson C, Thirunarayan K, Sheth A, Nair S (2014) Semantics driven approach for knowledge acquisition from EMRs. IEEE J Biomed Health Inform 18(2):515–524. doi:10.1109/JBHI.2013.2282125
- Greenspan H, Pinhas AT (2007) Medical image categorization and retrieval for PACS using the GMM-KL framework. IEEE Trans Inf Technol Biomed 11(2):190–202. doi:10.1109/TITB.2006.874191
- Tao Y, Peng Z, Krishnan A, Zhou XS (2011) Robust learning-based parsing and annotation of medical radiographs. IEEE Trans Med Imaging 30(2):338–350. doi:10.1109/TMI.2010.2077740
- Valente F, Viana-Ferreira C, Costa C, Oliveira JL (2012) A RESTful image gateway for multiple medical image repositories. IEEE Trans Inf Technol Biomed 16(3):356–364. doi:10.1109/TITB.2011.2176497
- Alvarez LR, Vargas Solis RC (2013) DICOM RIS/PACS telemedicine network implementation using free open source software. IEEE Lat Am Trans 11(1):168–171. doi:10.1109/TLA.2013.6502797
- ASTM International (2012) ASTM E2369-12: standard specification for continuity of care record (CCR). doi:10.1520/E2369-12
- Health Level 7 International. HL7 implementation guide for CDA release 2. http://www.hl7.org/implement/standards/product_brief.cfm?product_id=7. Accessed 16 Jan 2016
- Health Level 7 International (2007) HL7 implementation guide: CDA release 2: continuity of care document (CCD)
- Mhiri S, Despres S, Zagrouba E (2008) Ontologies for the semantic-based medical image indexing: an overview. In: Proceeding of the international conference on information and knowledge engineering, pp 311–317
- Iakovidis DK, Schober D, Boeker M, Schulz S (2009) An ontology of image representations for medical image mining. In: Proceeding of the international conference on information technology and applications in biomedicine, pp 1–4. doi:10.1109/itab.2009.5394373
- Rubin DL (2012) Finding the meaning in images: annotation and image markup. Philos Psychiatry Psychol 18(4):311–318. doi:10.1353/ppp.2011.0045
- Kim S-H, Lee J, Kim J-H, Lee H-W, Jung W-G, Lee G-K (2011) Insertion path extraction of catheter for coronary angiography. J Korea Ins Inf Commun Eng 15(4):951–956. doi:10.6109/jkiice.2011.15.4.951
- Rivest-Henault D, Sundar H, Cheriet M (2012) Nonrigid 2D/3D registration of coronary artery models with live fluoroscopy for guidance of cardiac interventions. IEEE Trans Med Imaging 31(8):1557–1572. doi:10.1109/TMI.2012.2195009
- Park SH, Park HA, Ryu KS, Kim H, Ryu KH (2012) A long-term mortality prediction model for patient with ST-segment elevation myocardial infarction using decision tree. In: Proceeding of the Korea Computer Congress 1(C), pp 139–141
- Martinez-Romero M, Vazquez-Naya JM, Pereira J, Pereira M, Pazos A, Banos G (2013) The iOSC3 system: using ontologies and SWRL rules for intelligent supervision and care of patients with acute cardiac disorders. Comput Math Method M. doi:10.1155/2013/650671
- Lee JH, Jeong MH, Rhee JA, Choi JS, Park IH, Chai LS, Jang SY, Cho JY, Jeong HC, Lee KH, Park K-H, Sim DS, Kim KH, Hong YJ, Park HW, Kim JH, Ahn Y, Cho JG, Park JC (2014) Factors influencing delay in symptom-to-door time in patients with acute ST-segment elevation myocardial infarction. Korean J Med 87(4):429–438. doi:10.3904/kjm.2014.87.4.429