The ever-increasing availability of Electronic Health Records (EHRs) is the key enabling factor of precision medicine, which aims to provide therapies and diagnoses based not only on medical literature, but also on clinical experience and individual patient information (e.g. genomics, lifestyle, health history). The unstructured nature of EHRs poses several challenges to their effective analysis, and heterogeneous graphs are the most suitable solution for handling the heterogeneity of the information they contain. However, while EHRs are an extremely valuable data source, information from current medical literature has yet to be considered in clinical decision support systems. In this work, we build a heterogeneous graph from Italian EHRs provided by the Hospital of Naples Federico II, and we define a methodological workflow that allows us to predict the presence of a link between patients and diagnosed diseases. We empirically demonstrate that linking concepts to biomedical ontologies (e.g. UMLS, DBpedia), which allow us to extract entities and relationships from medical literature, is significantly beneficial to our link-prediction workflow in terms of Area Under the ROC Curve (AUC) and Mean Reciprocal Rank (MRR).
Improving graph embeddings via entity linking: A case study on Italian clinical notes / D'Auria, D.; Moscato, V.; Postiglione, M.; Romito, G.; Sperli, G. - In: INTELLIGENT SYSTEMS WITH APPLICATIONS. - ISSN 2667-3053. - 17:(2023). [10.1016/j.iswa.2022.200161]
Improving graph embeddings via entity linking: A case study on Italian clinical notes
Moscato V.;Postiglione M.;Sperli G.
2023