Phoné: An Initiative to Develop a Dataset for the Automatic Recognition of Spoken Italian

Large Language Models (LLMs) have revolutionised natural language processing and its applications. However, high-performance LLMs require copious data and computing resources to develop and are rarely public. The same holds for Large Acoustic Models (LAMs), which process spoken language. The Phoné initiative seeks to build an open Italian speech dataset to advance Automatic Speech Recognition (ASR) systems and support public research. Spearheaded by institutions in Naples, Pisa, and Bolzano, the project gathers diverse Italian audio sources and applies advanced ASR architectures, including supervised and self-supervised models. This paper details Phoné’s dataset creation, ASR model evaluation, and ethical considerations, aiming to democratise access to Italian-language resources and foster innovation in ASR technologies.

Phoné: An Initiative to Develop a Dataset for the Automatic Recognition of Spoken Italian / Coro, Gianpaolo; Cutugno, Francesco; Schettino, Loredana; Tanda, Emilia; Vietti, Alessandro; Vitale, Vincenzo Norman. - In: ORAL ARCHIVES JOURNAL. - ISSN 3035-4781. - 1:(2025), pp. 89-107. [10.36253/oar-3340]

Francesco Cutugno (co-first author; Writing – Original Draft Preparation); Loredana Schettino (Member of the Collaboration Group); Emilia Tanda (Writing – Original Draft Preparation); Alessandro Vietti; Vincenzo Norman Vitale (Conceptualization)

2025

Year: 2025
Journal: ORAL ARCHIVES JOURNAL
Appears in types: 1.1 Journal article

Files in this item:

File	Size	Format
W00174__89-107_06-Coro.pdf (open access; type: post-print document; licence: public domain)	497.18 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/999714
