Apicella, A.; Isgro, F.; Prevete, R. Hidden classification layers: Enhancing linear separability between classes in neural networks layers. Pattern Recognition Letters, ISSN 0167-8655, 177 (2024), pp. 69-74. DOI: 10.1016/j.patrec.2023.11.016
Hidden classification layers: Enhancing linear separability between classes in neural networks layers
Apicella A.; Isgro F.; Prevete R.
2024
Abstract
Many Deep Learning approaches are based on variations of standard multi-layer feed-forward neural networks, also referred to as deep networks. The basic idea is that each hidden layer accomplishes a data transformation which is expected to make the data representation "somewhat more linearly separable" than the previous one, so that the final data representation is as linearly separable as possible. However, determining the optimal network parameters for these transformations is a crucial challenge. In this study, we propose a Deep Neural Network architecture (Hidden Classification Layer, HCL) which induces an error function involving the output values of all the network layers. The proposed architecture leads toward solutions where the data representations in the hidden layers exhibit a higher degree of linear separability between classes compared to conventional methods. While similar approaches have been discussed in prior literature, this paper presents a new architecture with a novel error function and conducts an extensive experimental analysis. Furthermore, the architecture can be easily integrated into existing frameworks by simply adding densely connected layers and making a straightforward adjustment to the loss function to account for the output of the added layers. The experiments focus on image classification tasks using four well-established datasets, employing as baselines three widely recognized architectures in the literature. The findings reveal that the proposed approach consistently enhances accuracy on the test sets across all considered cases.
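As a rough illustration of the integration described in the abstract, the sketch below attaches a densely connected classification head to each hidden layer and sums the cross-entropy of every head into a single loss. It assumes PyTorch; the class names, layer sizes, and equal weighting of the per-layer terms are illustrative assumptions, not the authors' reference implementation.

```python
# Hypothetical sketch: auxiliary dense classification heads ("hidden
# classification layers") attached to each hidden layer, with a loss that
# accounts for the output of every added head. Names and hyperparameters
# are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HCLNet(nn.Module):
    def __init__(self, in_dim, hidden_dims, n_classes):
        super().__init__()
        self.hidden = nn.ModuleList()
        self.heads = nn.ModuleList()  # one dense classifier per hidden layer
        prev = in_dim
        for h in hidden_dims:
            self.hidden.append(nn.Linear(prev, h))
            self.heads.append(nn.Linear(h, n_classes))
            prev = h
        self.out = nn.Linear(prev, n_classes)  # standard output layer

    def forward(self, x):
        logits = []
        for layer, head in zip(self.hidden, self.heads):
            x = F.relu(layer(x))
            logits.append(head(x))   # classification output of this hidden layer
        logits.append(self.out(x))   # final-layer logits
        return logits


def hcl_loss(all_logits, targets):
    # Average the classification error over every layer's head, so each hidden
    # representation is pushed toward linear separability between classes.
    return sum(F.cross_entropy(l, targets) for l in all_logits) / len(all_logits)


# Example training step (hypothetical shapes):
# model = HCLNet(in_dim=784, hidden_dims=[256, 128], n_classes=10)
# loss = hcl_loss(model(x_batch), y_batch)
# loss.backward()
```

At inference time one would typically use only the last head's logits for prediction; the auxiliary heads serve to shape the hidden representations during training.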
| File | Size | Format | Access | License |
|---|---|---|---|---|
| 01_apicellaEtAl_patternRecognitionLetter_2024.pdf | 1.36 MB | Adobe PDF | Open access | Not specified |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.