Posterior Prediction Modelling of Optimal Trees COMPSTAT 2008

Siciliano, Roberta; Aria, Massimo; D'Ambrosio, Antonio

doi:10.1007/978-3-7908-2084-3_27

The framework of this paper is classification and regression trees, also known as tree-based methods, binary segmentation, tree partitioning, or decision trees. Trees can be fruitfully used either to explore and understand the dependence relationship between the response variable and a set of predictors, or to assign the response class or value to new objects for which only the predictor measurements are known. Since the introduction of the two-stage splitting procedure in 1992, the research unit in Naples has made several contributions in this field; one of the main issues has been combining tree partitioning with statistical models. This paper provides a new idea of knowledge extraction using trees and models. It deals with the trade-off between the interpretability of the tree structure (i.e., exploratory trees) and the accuracy of the decision tree model (i.e., decision tree-based rules). Prospective and retrospective views of using models and trees are discussed. In particular, we introduce a tree-based methodology that grows an optimal tree structure, with posterior prediction modelling used as the decision rule for new objects. The general methodology is presented and a special case is described in detail. Finally, an application to a real-world data set is shown.

Posterior Prediction Modelling of Optimal Trees COMPSTAT 2008 / Siciliano, Roberta; Aria, Massimo; D'Ambrosio, Antonio. - PRINT. - (2008), pp. 323-334. [10.1007/978-3-7908-2084-3_27]