Robust Incremental Trees for Missing Data Imputation and Data Fusion / Aria, Massimo; D'Ambrosio, Antonio; Siciliano, Roberta. - Print. - 1:(2007), pp. 287-290. (Paper presented at the Meeting of the Classification and Data Analysis Group 2007, held in Macerata, 12-14 September).
Robust Incremental Trees for Missing Data Imputation and Data Fusion
ARIA, MASSIMO;D'AMBROSIO, ANTONIO;SICILIANO, ROBERTA
2007
Abstract
Data fusion techniques aim to build a complete data file from different sources that do not contain the same units. The fusion generally relies on the variables common to all files. In this paper, we propose an innovative methodology for data fusion based on an incremental imputation algorithm that uses tree-based methods. In addition, we consider robust tree validation by boosting procedures. As benchmarks, we consider a classical technique, multiple regression, and an implicit method based on principal component analysis. An extensive simulation study shows that the proposed method is more accurate than the benchmark methods.