A Partitioning Based Heuristic for a Variant of the Simple Pattern Minimality Problem / Boccia, Maurizio; Masone, Adriano; Sforza, Antonio; Sterle, Claudio. - 217:(2017), pp. 93-102. [10.1007/978-3-319-67308-0_10]
A Partitioning Based Heuristic for a Variant of the Simple Pattern Minimality Problem
Boccia, Maurizio; Masone, Adriano; Sforza, Antonio; Sterle, Claudio
2017
Abstract
Logical Analysis of Data deals with the classification of huge data sets by Boolean formulas and their synthetic representation by ternary strings, referred to as patterns. In this context, the simple pattern minimality problem (SPMP) arises. It consists in determining the minimum number of patterns "explaining" an initial data set of binary strings. This problem is equivalent to the minimum disjunctive normal form problem and, hence, it has been widely tackled by set covering based heuristic approaches. In this work, we describe and tackle a particular variant of the SPMP coming from an application arising in the car industry production field. The main difference with respect to the SPMP tackled in the literature is that the determined patterns must form a partition, and not merely a cover, of the initial binary string data set. The problem is solved by an effective and fast heuristic, tested on several large size instances coming from a real application.
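The distinction between a cover and a partition can be illustrated with a minimal Python sketch. It is not the heuristic from the paper. It assumes that the third symbol of a ternary pattern is written "-" and stands for a free bit, and that a set of patterns "explains" a data set when the binary strings it represents are exactly the strings of the data set. The function names `expand`, `is_cover` and `is_partition` are hypothetical.

```python
from itertools import product

DONT_CARE = "-"  # assumed symbol for the third (free) value of a ternary string


def expand(pattern):
    """Return the set of binary strings represented by a ternary pattern."""
    choices = ["01" if ch == DONT_CARE else ch for ch in pattern]
    return {"".join(bits) for bits in product(*choices)}


def is_cover(patterns, data):
    """True if the patterns represent exactly the strings of the data set."""
    covered = set().union(*(expand(p) for p in patterns))
    return covered == set(data)


def is_partition(patterns, data):
    """True if the patterns form a cover and no string is represented twice."""
    total = sum(len(expand(p)) for p in patterns)
    return is_cover(patterns, data) and total == len(set(data))


data = ["00", "01", "10"]
overlapping = ["0-", "-0"]  # both patterns represent "00"
disjoint = ["0-", "10"]     # every string is represented exactly once

print(is_cover(overlapping, data), is_partition(overlapping, data))
print(is_cover(disjoint, data), is_partition(disjoint, data))
```

In this toy instance both pattern sets explain the data, but only the second one satisfies the partition requirement of the variant described above, since the first represents the string "00" twice.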