Robust Incremental Trees for Missing Data Imputation and Data Fusion

Aria, Massimo; D'Ambrosio, Antonio; Siciliano, Roberta

Data fusion techniques typically aim to create a complete data file from different sources that do not contain the same units. This is generally achieved by exploiting the variables common to all files. In this paper, we propose an innovative methodology for Data Fusion based on an incremental imputation algorithm that uses tree-based methods. In addition, we consider robust tree validation by boosting procedures. As benchmarks, we consider a classical technique, multiple regression, as well as an implicit method based on principal component analysis. An extensive simulation study shows that the proposed method is more accurate than the benchmark methods.

Robust Incremental Trees for Missing Data Imputation and Data Fusion / Aria, M., D'Ambrosio, A., Siciliano, R. - Print. - 1:(2007), pp. 287-290. (Meeting of Classification and Data Analysis Group 2007, Macerata, 12-14 September).