Learning of Disease Specific Knowledge Graph from Unstructured Medical Health Records & Radiology Reports
Cancer is currently the second leading cause of death worldwide, and the number of deaths it causes rises every year as its incidence grows. Radiology reports and electronic health records (EHRs) contain a vast amount of clinical data. Case studies are important because they offer a wealth of medical information on diseases, treatments, and related issues. However, this information is frequently available only as unstructured notes, which makes it challenging to work with. In addition, the data are large in volume, produced rapidly, and stored in non-standard formats. Converting health information into standards-compliant, comparable, and consistent data is therefore essential.
To address these challenges, this work proposes a knowledge extraction pipeline that builds schema-based knowledge graphs (KGs) from EHRs and clinical reports. Knowledge is extracted using Named Entity Recognition (NER) from the radiology reports and EHRs of 33,431 cancer patients, and the resulting knowledge graph, built in Neo4j according to the proposed schema, contains 368,436 entities and 754,061 relationships across 15 semantic categories. The proposed method is intended as an initial step towards using KGs for a uniform representation of medical knowledge, so that the course of a disease can be analysed from what is learned about it in EHRs.
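As a rough illustration of how NER output might be turned into schema-constrained graph data, the following minimal Python sketch may help. It is not the pipeline described here: the three schema triples, the node labels, the `Entity` class, and the helper names `build_graph` and `to_cypher` are all hypothetical, and linking entities that co-occur in one report is a simplifying assumption. The generated Cypher uses `MERGE` so that repeated mentions would not create duplicate nodes in Neo4j.

```python
from dataclasses import dataclass

# Hypothetical schema: allowed (source label, relationship, target label) triples.
# These are illustrative only, not the 15 categories of the proposed schema.
SCHEMA = {
    ("Patient", "HAS_DIAGNOSIS", "Disease"),
    ("Patient", "UNDERWENT", "Procedure"),
    ("Disease", "LOCATED_IN", "Anatomy"),
}


@dataclass(frozen=True)
class Entity:
    label: str  # semantic category assigned by NER
    name: str   # normalised surface text


def build_graph(patient_id, mentions):
    """Build nodes and schema-valid edges from one report's NER mentions.

    mentions: iterable of (label, text) pairs, assumed to come from an NER step.
    """
    patient = Entity("Patient", patient_id)
    # Normalise text so repeated mentions collapse into a single node.
    entities = {Entity(label, text.strip().lower()) for label, text in mentions}
    nodes = {patient} | entities
    edges = set()
    for src in nodes:
        for dst in nodes:
            if src == dst:
                continue
            for s_label, rel, t_label in SCHEMA:
                # Keep only relationships the schema permits.
                if src.label == s_label and dst.label == t_label:
                    edges.add((src, rel, dst))
    return nodes, edges


def to_cypher(edge):
    """Return a parameterised Cypher MERGE statement and its parameters."""
    src, rel, dst = edge
    # Labels and relationship types cannot be Cypher parameters, so they are
    # interpolated here; they come only from the trusted SCHEMA above.
    query = (
        f"MERGE (a:{src.label} {{name: $src}}) "
        f"MERGE (b:{dst.label} {{name: $dst}}) "
        f"MERGE (a)-[:{rel}]->(b)"
    )
    return query, {"src": src.name, "dst": dst.name}
```

With mentions such as `[("Disease", "Adenocarcinoma"), ("Anatomy", "left lung")]` for one patient, `build_graph` would yield three nodes and two edges, and each edge could be passed through `to_cypher` and executed by a Neo4j driver session.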