Robustness and stability analysis of factor PD-clustering on large social data sets / Tortora, C.; Marino, Marina. - (2014), pp. 273-281. [10.1007/978-3-319-06692-9_29]

Robustness and stability analysis of factor PD-clustering on large social data sets

Tortora, C.; Marino, Marina
2014

Abstract

Factor clustering methods have been proposed to cluster large datasets, where "large" refers to the number of variables. One of them is Factor Probabilistic Distance Clustering (FPDC). The method has two main steps: a Tucker 3 decomposition of the distance matrix, followed by Probabilistic Distance (PD) clustering on the Tucker 3 factors. The aim of this paper is to apply FPDC to large behavioral and social datasets in order to obtain homogeneous and well-separated clusters of individuals. The goal is to evaluate the stability and robustness of the method on real large datasets. Stability refers to the invariance of the results across iterations of the method. Robustness refers to the sensitivity of the method to errors in the data, such as outliers. Both characteristics are evaluated using bootstrap resampling.
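
The bootstrap evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it omits the Tucker 3 step entirely and runs a plain PD clustering (membership probabilities inversely proportional to the distance from each center) directly on the data. The function names, the uniform random initialization, and the label-agreement score used as a stability measure are all assumptions made for illustration.

```python
import itertools
import numpy as np


def pd_clustering(X, k, n_iter=100, tol=1e-6, seed=0):
    """Plain PD clustering sketch: p_k(x) * d_k(x) is constant over clusters k."""
    rng = np.random.default_rng(seed)
    # Assumption: start from random points inside the bounding box of the data.
    centers = rng.uniform(X.min(axis=0), X.max(axis=0), size=(k, X.shape[1]))
    for _ in range(n_iter):
        # Euclidean distances from every point to every center, shape (n, k).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # guard against division by zero
        # Probabilities proportional to 1/d_k, so that p_k * d_k is constant.
        inv = 1.0 / d
        p = inv / inv.sum(axis=1, keepdims=True)
        # Center update: weighted mean of the points with weights p_k^2 / d_k.
        u = p**2 / d
        new_centers = (u.T @ X) / u.sum(axis=0)[:, None]
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers


def assign(X, centers):
    """Label each point with its most probable (that is, nearest) center."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)


def agreement(a, b, k):
    """Share of equally labelled points, maximized over label permutations."""
    return max(
        float(np.mean(np.array(perm)[b] == a))
        for perm in itertools.permutations(range(k))
    )


def bootstrap_stability(X, k, n_boot=50, seed=0):
    """Mean agreement between the reference partition and bootstrap partitions."""
    rng = np.random.default_rng(seed)
    reference = assign(X, pd_clustering(X, k, seed=seed))
    scores = []
    for b in range(n_boot):
        # Resample rows with replacement, refit, then label the original data.
        idx = rng.integers(0, len(X), size=len(X))
        centers_b = pd_clustering(X[idx], k, seed=seed + b + 1)
        scores.append(agreement(reference, assign(X, centers_b), k))
    return float(np.mean(scores))
```

A score close to 1 would suggest that the partition changes little under resampling. A robustness check in the same spirit could add artificial outliers to the data before resampling and compare the scores.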
ISBN: 978-3-319-06692-9


Use this identifier to cite or link to this document: https://hdl.handle.net/11588/571601