
Translating language innovation: assessing the performance of AI systems / Calabrese, Carmen; Cigliano, Chiara; Donadio, Paolo. - (2024), pp. 32-33. (Paper presented at the conference How can AI translate?, held in Naples on 22-23 April 2024).

Translating language innovation: assessing the performance of AI systems

Carmen Calabrese; Chiara Cigliano; Paolo Donadio
2024

Abstract

Recent studies have highlighted the differences between machine translation systems, aiming to draw the boundary between text processing based on a neural architecture (NMT, Neural Machine Translation) and text processing based on artificial intelligence (AI) (Jiang et al. 2023; Gaspari et al. 2015). NMT systems such as DeepL or Google Translate are trained on parallel corpora in different language combinations that consist almost entirely of written sources. This training shows clear limitations when compared with AI systems (ChatGPT, for example), which are configured as learning tools based on monolingual corpora in multiple languages and can also perform language-related tasks other than translation, such as mathematical reasoning. However, comparison with human translation (HT) reveals the difficulty automated systems have in adapting to pragmatic and contextual variation (Jiang et al. 2023). Few studies have so far compared, from a qualitative point of view, the translation performance of automatic systems dealing with language innovation and change. Gen Z jargon illustrates the cultural impact of language change and the speed with which speakers transform language, particularly through new generations and ongoing processes of lexical resemantization, word formation, or the outright creation of new words in new contexts. In Italy, the influence of the English of social media on younger generations and the co-presence of local dialect substrata make the process of language change even more complex and difficult to predict. The present work therefore aims to test NMT and AI-based translation systems on what can be called "language frontiers," i.e., the most recent and innovative slang expressions in the language of Gen Z (people born between 1995 and 2010).
No linguistic corpora of Italian youth slang varieties exist that are comparable to the English-language corpus being built within the iGen project (Katz et al. 2021). This absence limits the investigation to a qualitative test on the translation from Italian into English of about 100 sentences extracted from social media (X, TikTok, Instagram, YouTube) and representative of different forms of online interaction. On this corpus, the work aims first and foremost a) to provide a first description of the linguistic features of the Italian-to-English translation outputs produced by different machine translation systems (ChatGPT vs. DeepL) dealing with language change; b) to describe the translation differences between NMT and AI systems; c) to evaluate the nature of the gap between AI-generated translations and human translation.


Use this identifier to cite or link to this document: https://hdl.handle.net/11588/971784