This work introduces a supervised tree-based method that handles preference rankings as the response variable. Preference rankings usually depend on the characteristics of both the individuals judging a set of objects and the objects being judged. This theme has been handled in the literature with log-linear representations of the generalized Bradley-Terry model. Tree-based models for multivariate response variables also exist in the literature. The problem with preference rankings is that an ordering can be considered a single multidimensional "entity" rather than a multivariate distribution. For this reason, the techniques known in the literature for defining splits on multivariate response variables cannot yield impurity measures that are feasible in this case. Our tree-based model uses a splitting criterion based on the minimization of a suitable distance for rankings. In preference ranking theory, the meaning of ties is often debated: those who believe that ties are a positive statement of agreement, and not just declarations of indifference, should accept a set of four axioms formulated by Kemeny, and a distance measure for preference rankings should satisfy these axioms. As the impurity measure we chose the sum of the Kemeny distances within nodes, whereas the rule assigning a ranking to each node is the consensus ranking computed by maximizing Emond and Mason's tau_X rank correlation coefficient. The characterization of the impurity measure led us to call our method Distance Based Multivariate Trees for Rankings. From the confirmatory point of view, we take into account both cross-validation and bootstrap procedures to select the final classifier.
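As an illustration of the impurity measure described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes rankings are coded as rank vectors (lower value means more preferred, tied objects share a value) and reads "sum of the Kemeny distances within nodes" as the sum over all pairs of rankings in a node; the abstract does not settle this, and summing distances to the node's consensus ranking is another possible reading. The function names `kemeny_distance`, `node_impurity` and `split_impurity` are hypothetical.

```python
from itertools import combinations


def _sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def kemeny_distance(a, b):
    """Kemeny distance between two rank vectors, ties allowed.

    Each ranking induces a score of +1, -1 or 0 for every pair of
    objects (first preferred, second preferred, tied). The distance is
    the sum over unordered pairs of the absolute score difference,
    which equals half the sum over ordered pairs.
    """
    if len(a) != len(b):
        raise ValueError("rankings must cover the same objects")
    total = 0
    for i, j in combinations(range(len(a)), 2):
        score_a = _sign(a[j] - a[i])  # +1 if object i is preferred to j in a
        score_b = _sign(b[j] - b[i])
        total += abs(score_a - score_b)
    return total


def node_impurity(rankings):
    """Assumed reading: sum of Kemeny distances over all pairs in a node."""
    return sum(kemeny_distance(r, s) for r, s in combinations(rankings, 2))


def split_impurity(left, right):
    """Total impurity of a candidate split; a splitting rule would minimize it."""
    return node_impurity(left) + node_impurity(right)


if __name__ == "__main__":
    node = [[1, 2, 3], [1, 2, 3], [3, 2, 1]]
    print(kemeny_distance([1, 2, 3], [3, 2, 1]))  # 6: all three pairs reversed
    print(kemeny_distance([1, 2, 3], [1, 2, 2]))  # 1: one pair becomes a tie
    print(node_impurity(node))                    # 12
    print(split_impurity(node[:2], node[2:]))     # 0: both child nodes are pure
```

In this toy node, separating the two identical rankings from the reversed one brings the total impurity from 12 to 0, which is the kind of reduction a distance-based splitting criterion would look for.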
Decision trees for preference rankings / D'Ambrosio, Antonio; Heiser, W. J. - PRINT. - (2009), pp. 133-136. (Paper presented at the CLADAG 2009 conference, held in Catania, 9-11 September).
Decision trees for preference rankings
D'AMBROSIO, ANTONIO;
2009