The paper aims to show the advantages of formulating lexical structures with variable elements (LSVE) in terms of symbolic objects (OS). The main consequence of the proposal is that the large amount of information usually present in a text is condensed into matrices of complex data. Statistical tools can then be used to analyse the underlying properties of lexical structures, taking into account the distributions of their different components. Describing lexical structures with variable elements as symbolic data strengthens text mining by tightly linking the knowledge extraction and statistical analysis steps. Such lexical structures are studied by applying a factorial analysis to complex data. An application deals with a very large corpus extracted from the Italian newspaper “La Repubblica” during the 1990s. In this study, we analyse the relations among the different components of some lexical structures with variable elements and the temporal evolution of some distinctive semantic traits related to geographical, historical, and political contexts.
Outils de Text Mining pour l'analyse de structures lexicales à éléments variables / Bolasco, Sergio; Verde, Rosanna; Balbi, Simona. - (2002), pp. 197-208. (Paper presented at the JADT 2002 conference, held in Saint-Malo, France).
Outils de Text Mining pour l'analyse de structures lexicales à éléments variables
BALBI, SIMONA
2002