Morphosemantic strategies for the automatic enrichment of Italian lexical databases in the medical domain / Amato, Flora; Mazzeo, Antonino; D'Elia, Annibale; Maisto, Alessandro; Pelosi, Serena. - In: INTERNATIONAL JOURNAL OF GRID AND UTILITY COMPUTING. - ISSN 1741-847X. - 8:4(2017), pp. 312-320. [10.1504/IJGUC.2017.088262]
Morphosemantic strategies for the automatic enrichment of Italian lexical databases in the medical domain
Amato, Flora; Mazzeo, Antonino; D'Elia, Annibale; Maisto, Alessandro; Pelosi, Serena
2017
Abstract
Because clinical documents convey important information and the healthcare system produces a large quantity of raw text, extracting and managing meaningful data from real text occurrences has become a key challenge in NLP research. In this paper we approach a corpus of 5000 medical diagnoses with linguistic and computational tools that can access the semantic dimension of the words and sentences it contains. Our morphosemantic method is grounded on a list of neoclassical formative elements pertaining to the medical domain, which has been used for the automatic creation and population of medical lexical resources. The outcomes of this work are automatically built electronic dictionaries and thesauri, and an annotated corpus for NLP in the medical domain. Copyright © 2017 Inderscience Enterprises Ltd.
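As a rough illustration of what a morphosemantic analysis based on neoclassical formative elements can look like, the sketch below splits an Italian medical term into an initial and a final element and returns an English gloss for each. It is not the authors' implementation: the two small element tables, the glosses, and the function name `analyse` are assumptions made for this example, and the actual list of formative elements used in the paper is not reproduced here.

```python
# Hypothetical, hand-picked formative elements (assumption: NOT the paper's list).
# Initial elements typically name a body part; final elements name a condition.
INITIAL_ELEMENTS = {
    "cardio": "heart",
    "gastro": "stomach",
    "gastr": "stomach",
    "epato": "liver",
    "epat": "liver",
}
FINAL_ELEMENTS = {
    "patia": "disease",
    "ite": "inflammation",
    "megalia": "enlargement",
}


def analyse(word):
    """Split a word into one initial and one final formative element.

    Returns a (initial gloss, final gloss) tuple, or None when no split
    of the word matches both tables.
    """
    word = word.lower()
    # Try every split point and keep the first one where both halves are known.
    for i in range(1, len(word)):
        head, tail = word[:i], word[i:]
        if head in INITIAL_ELEMENTS and tail in FINAL_ELEMENTS:
            return INITIAL_ELEMENTS[head], FINAL_ELEMENTS[tail]
    return None


if __name__ == "__main__":
    for term in ["gastrite", "cardiopatia", "epatomegalia", "paziente"]:
        print(term, analyse(term))
```

Under these assumptions, a term whose parts are both listed receives a compositional gloss that could seed a dictionary entry, while a word such as "paziente" that does not decompose is left unanalysed.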