Scaling LLM-Based Knowledge Graph Generation: A Case Study of Italian Geopolitical News / Russo, D.; Orlando, G. M.; Romano, A.; Riccio, G.; La Gatta, V.; Postiglione, M.; Moscato, V. - (2024), pp. 3494-3497. (2024 IEEE International Conference on Big Data, BigData 2024, USA, 2024) [10.1109/BigData62323.2024.10825937].
Scaling LLM-Based Knowledge Graph Generation: A Case Study of Italian Geopolitical News
Russo D.; Orlando G. M.; La Gatta V.; Postiglione M.; Moscato V.
2024
Abstract
Geopolitical news provides vast amounts of information essential for understanding international relations and political events. However, organizing this information into a coherent, structured format poses challenges due to the complexity and dynamic nature of the domain. This paper introduces a scalable system leveraging Large Language Models to build continuously updated Knowledge Graphs from Italian geopolitical news. The system features a modular architecture, including a Collector Node for scalable article extraction, a Redis-based reliable queue to manage large-scale data ingestion, and a Named Entity Recognition/Relation Extraction Engine to standardize entity-relation triples. The framework addresses key challenges, such as continuous updating and hallucination mitigation, ensuring the reliability of the graph. Our evaluations demonstrate significant improvements in scalability, uniformity of extracted triples, and graph accuracy, making this architecture particularly suitable for real-time geopolitical analysis.


