Model-free probability distance clustering of time series / Frasso, G.; D'Ambrosio, Antonio; Siciliano, Roberta. - (2013). (Paper presented at the 6th International Conference of the ERCIM WG on Computational and Methodological Statistics (ERCIM 2013), held at the University of London, 14-16 December 2013).
Model-free probability distance clustering of time series
D'Ambrosio, Antonio; Siciliano, Roberta
2013
Abstract
A new model-free clustering approach for time series is introduced. It combines a probability distance clustering method with a penalized smoothing procedure for time series. Probability distance clustering allocates series to groups probabilistically and rests on the principle that probability and distance are inversely related. Penalized smoothing, in turn, efficiently separates the signal and noise components of a generic time series. Using the smoothed version of the raw data as input makes the clustering algorithm more flexible, because no parametric model needs to be specified to summarize the observed dynamics. The approach generalizes easily to multivariate time series. We evaluate the performance of the proposed clustering procedure under different distance measures and compare it with existing procedures. We discuss the applicability of the method to general classes of time series, emphasizing its fuzzy nature.
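The two ingredients named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the smoother, the distance, or the update rules, so everything below is an assumption made for illustration: a Whittaker-type smoother with a second-order difference penalty and `lam=10`, Euclidean distance between smoothed series, memberships taken proportional to the inverse distance, a weighted-mean center update with weights `p**2 / d`, and a farthest-point initialization. All function names (`penalized_smooth`, `memberships`, `initial_centers`, `pd_cluster`) are hypothetical and not taken from the authors' work.

```python
import numpy as np


def penalized_smooth(y, lam=10.0, order=2):
    """Whittaker-type smoother: argmin_z ||y - z||^2 + lam * ||D z||^2,
    where D takes differences of the given order (an assumed penalty)."""
    n = len(y)
    D = np.diff(np.eye(n), n=order, axis=0)  # shape (n - order, n)
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)


def memberships(X, centers, eps=1e-12):
    """Membership probabilities inversely proportional to Euclidean distance."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + eps
    inv = 1.0 / d
    return inv / inv.sum(axis=1, keepdims=True), d


def initial_centers(X, k):
    """Farthest-point seeds, pulled slightly toward the overall mean so that
    no center starts exactly on a series (illustrative choice)."""
    mean = X.mean(axis=0)
    seeds = [int(np.argmax(np.linalg.norm(X - mean, axis=1)))]
    while len(seeds) < k:
        d = np.linalg.norm(X[:, None, :] - X[seeds][None, :, :], axis=2)
        seeds.append(int(np.argmax(d.min(axis=1))))
    return 0.9 * X[seeds] + 0.1 * mean


def pd_cluster(X, k, n_iter=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = initial_centers(X, k)
    for _ in range(n_iter):
        p, d = memberships(X, centers)
        w = p**2 / d  # assumed weights for the center update
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
    p, _ = memberships(X, centers)
    return p, centers


# Synthetic example: two groups of noisy series differing by a level shift.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
signal = np.sin(2 * np.pi * t)
raw = np.vstack([signal + shift + rng.normal(0.0, 0.3, t.size)
                 for shift in [0.0] * 5 + [3.0] * 5])
smoothed = np.vstack([penalized_smooth(y) for y in raw])
p, centers = pd_cluster(smoothed, k=2)
labels = p.argmax(axis=1)
```

Each row of `p` holds the membership probabilities of one series across the groups, which is where the fuzzy character of the allocation shows up; `labels` is only a hard summary of it. Swapping the Euclidean norm in `memberships` for another dissimilarity is the natural place to experiment with different distance measures.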