INFLUENCE OF OUTLIERS ON CLUSTER CORRESPONDENCE ANALYSIS / van de Velden, Michel; Iodice D'Enza, Alfonso; Schut, Lisa. - (2019), pp. 454-457.
INFLUENCE OF OUTLIERS ON CLUSTER CORRESPONDENCE ANALYSIS
Alfonso Iodice D’Enza
2019
Abstract
This paper focuses on determining the influence of outliers on a joint dimension reduction and clustering method for categorical data, namely Cluster Correspondence Analysis (CCA). The solutions of joint methods such as CCA consist of both a cluster membership vector and a set of low-dimensional scores for observations and attributes. We evaluate the impact of outliers on the identification of the cluster structure. As a benchmark, we use the tandem approach, which is a sequential application of multiple correspondence analysis followed by K-means clustering. The appraisal is based on synthetic data and outliers generated using an evolutionary algorithm that provides data with a user-defined cluster structure.
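The tandem benchmark mentioned in the abstract (multiple correspondence analysis followed by K-means on the resulting scores) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and it does not implement CCA itself. It assumes NumPy, integer-coded categorical data, MCA computed as correspondence analysis of the indicator matrix, and a plain Lloyd-type K-means with random restarts. All function names (`indicator_matrix`, `mca_scores`, `kmeans`, `tandem`) and parameter defaults are hypothetical.

```python
import numpy as np

def indicator_matrix(data):
    """One-hot encode an (n, p) array of integer-coded categorical variables."""
    blocks = []
    for j in range(data.shape[1]):
        cats = np.unique(data[:, j])
        blocks.append((data[:, j][:, None] == cats[None, :]).astype(float))
    return np.hstack(blocks)

def mca_scores(data, n_dims=2):
    """Row principal coordinates from correspondence analysis of the indicator matrix."""
    Z = indicator_matrix(data)
    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardized residuals
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    return (U[:, :n_dims] * s[:n_dims]) / np.sqrt(r)[:, None]

def kmeans(X, k, n_iter=100, n_init=20, seed=0):
    """Lloyd-type K-means with random restarts; returns the lowest-cost labels."""
    rng = np.random.default_rng(seed)
    best_labels, best_cost = None, np.inf
    for _ in range(n_init):
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d.argmin(axis=1)
            # Keep the old center if a cluster becomes empty.
            new_centers = np.array([
                X[labels == g].mean(axis=0) if np.any(labels == g) else centers[g]
                for g in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        cost = d.min(axis=1).sum()
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels

def tandem(data, k, n_dims=2, seed=0):
    """Tandem approach: reduce dimension with MCA first, then cluster the scores."""
    return kmeans(mca_scores(data, n_dims), k, seed=seed)

# Toy usage: two groups of observations with distinct category profiles.
toy = np.array([[0, 0, 0]] * 5 + [[1, 1, 1]] * 5)
print(tandem(toy, k=2))
```

Because the two steps are run sequentially, the low-dimensional scores are computed without reference to the cluster structure. A joint method such as CCA instead estimates the cluster memberships and the scores together.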