Phoné: An Initiative to Develop a Dataset for the Automatic Recognition of Spoken Italian / Coro, Gianpaolo; Cutugno, Francesco; Schettino, Loredana; Tanda, Emilia; Vietti, Alessandro; Vitale, Vincenzo Norman. - In: Oral Archives Journal. - ISSN 3035-4781. - 1:(2025), pp. 89-107. [10.36253/oar-3340]
Phoné: An Initiative to Develop a Dataset for the Automatic Recognition of Spoken Italian
Francesco Cutugno (co-first author; Writing – Original Draft Preparation)
Loredana Schettino (member of the Collaboration Group)
Emilia Tanda (Writing – Original Draft Preparation)
Vincenzo Norman Vitale (Conceptualization)
2025
Abstract
Large Language Models (LLMs) have revolutionised natural language processing and its applications. However, high-performance LLMs require copious data and computing resources to develop and are rarely made public. The same holds for Large Acoustic Models (LAMs), which process spoken language. The Phoné initiative seeks to build an open Italian speech dataset to advance Automatic Speech Recognition (ASR) systems and support public research. Spearheaded by institutions in Naples, Pisa, and Bolzano, the project gathers diverse Italian audio sources and applies advanced ASR architectures, including supervised and self-supervised models. This paper details Phoné’s dataset creation, ASR model evaluation, and ethical considerations, aiming to democratise access to Italian-language resources and foster innovation in ASR technologies.

| File | Access | Type | Licence | Size | Format |
|---|---|---|---|---|---|
| W00174__89-107_06-Coro.pdf | Open access | Post-print | Public domain | 497.18 kB | Adobe PDF |


