Hidden classification layers: Enhancing linear separability between classes in neural networks layers / Apicella, A.; Isgro, F.; Prevete, R. - In: PATTERN RECOGNITION LETTERS. - ISSN 0167-8655. - 177 (2024), pp. 69-74. [10.1016/j.patrec.2023.11.016]

Hidden classification layers: Enhancing linear separability between classes in neural networks layers

Apicella A.; Isgro F.; Prevete R.
2024

Abstract

Many Deep Learning approaches are based on variations of standard multi-layer feed-forward neural networks, also referred to as deep networks. The basic idea is that each hidden layer applies a data transformation expected to make the representation “somewhat more linearly separable” than the previous one, so that the final representation is as linearly separable as possible. However, determining the network parameters that achieve these transformations is a crucial challenge. In this study, we propose a Deep Neural Network architecture (Hidden Classification Layer, HCL) which induces an error function involving the output values of all the network layers. The proposed architecture leads toward solutions where the data representations in the hidden layers exhibit a higher degree of linear separability between classes than conventional methods. While similar approaches have been discussed in prior literature, this paper presents a new architecture with a novel error function and conducts an extensive experimental analysis. Furthermore, the architecture can be easily integrated into existing frameworks: it suffices to add densely connected layers and make a straightforward adjustment to the loss function to account for the outputs of the added layers. The experiments focus on image classification tasks on four well-established datasets, using as baselines three widely recognized architectures from the literature. The findings show that the proposed approach consistently improves accuracy on the test sets in all considered cases.
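The integration described in the abstract is easy to illustrate. Below is a minimal sketch of the idea, assuming a PyTorch-style implementation: a densely connected classification head is attached to each hidden layer, and the training loss combines the cross-entropy of the final output with a cross-entropy term for every hidden head. The names (HCLNet, hcl_loss), the layer sizes, and the equal weighting of the per-layer terms are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn as nn

class HCLNet(nn.Module):
    """Feed-forward network with an auxiliary (hidden) classification
    head on each hidden layer, sketching the HCL idea from the abstract.
    Layer sizes and the use of ReLU are illustrative assumptions."""

    def __init__(self, in_dim=784, hidden_dims=(256, 128), n_classes=10):
        super().__init__()
        self.hidden = nn.ModuleList()
        self.aux_heads = nn.ModuleList()  # one dense classifier per hidden layer
        prev = in_dim
        for h in hidden_dims:
            self.hidden.append(nn.Sequential(nn.Linear(prev, h), nn.ReLU()))
            self.aux_heads.append(nn.Linear(h, n_classes))
            prev = h
        self.out = nn.Linear(prev, n_classes)  # standard output layer

    def forward(self, x):
        aux_logits = []
        for layer, head in zip(self.hidden, self.aux_heads):
            x = layer(x)
            aux_logits.append(head(x))  # classification read-out of this hidden layer
        return self.out(x), aux_logits

def hcl_loss(final_logits, aux_logits, targets, aux_weight=1.0):
    """Cross-entropy on the final output plus a weighted cross-entropy term
    for every hidden classification head. The equal weighting is an
    assumption, not necessarily the paper's exact error function."""
    ce = nn.functional.cross_entropy
    loss = ce(final_logits, targets)
    for logits in aux_logits:
        loss = loss + aux_weight * ce(logits, targets)
    return loss

# Example usage on random data:
model = HCLNet()
x = torch.randn(32, 784)
y = torch.randint(0, 10, (32,))
logits, aux = model(x)
hcl_loss(logits, aux, y).backward()
```

In deeply supervised setups of this kind, the auxiliary heads are typically used only during training and can be discarded at inference, where only the standard output layer produces predictions.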
Files in this record:
01_apicellaEtAl_patternRecognitionLetter_2024.pdf (open access)
License: not specified
Size: 1.36 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/952627
Citations
  • PubMed Central: ND
  • Scopus: 0
  • Web of Science: 0