Biomedical Spanish Language Models for entity recognition and linking at BioASQ DisTEMIST / Moscato, V.; Postiglione, M.; Sperlì, G. - 3180:(2022), pp. 315-324.

Biomedical Spanish Language Models for entity recognition and linking at BioASQ DisTEMIST

Moscato V.; Postiglione M.; Sperlì G.
2022

Abstract

Named Entity Recognition (NER) and Entity Linking (EL) systems usually require a large annotated dataset for training in order to produce high-quality results, but annotation is time-consuming and expensive, especially when it requires domain experts, as in the medical field. However, recent developments in Natural Language Processing (NLP) make it easy to use transformer language models that have been pre-trained on huge quantities of data (often from specialized domains), and thus to obtain high performance without excessive effort. In this work, we outline our approach to the NER and EL tasks on Spanish clinical notes for the DisTEMIST track of the BioASQ 2022 challenge. Our results show that the proposed methodology, based on biomedical pre-trained language models, performed best on the NER task, with an F1 score approximately 3% higher than that of the second-best solution.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/915668