Classification of metal site coordination number and geometry through artificial neural networks / Sgueglia, Gianmattia; Vrettas, Michail; Chino, Marco; De Simone, Alfonso; Lombardi, Angela. - (2021), p. 48. (Contribution presented at the Merck Young Chemists’ Symposium 2022, held in Rimini, 21-23 November).
Classification of metal site coordination number and geometry through artificial neural networks
Gianmattia Sgueglia; Michail Vrettas; Marco Chino; Alfonso De Simone; Angela Lombardi
2021
Abstract
Coordination number and geometry are critical factors affecting the chemical properties of metal complexes, both abiogenic and biological. Automated determination of these features from atomic coordinates is desirable for applications such as structure quality control, data mining and machine learning; however, classical approaches to this task tend to be computationally intensive. Here, we introduce a deep learning-based approach that classifies coordination number and geometry simultaneously through artificial neural networks (ANNs) [1]. Crystal structures of metal complexes with coordination numbers between 2 and 6, spanning seven of the most common coordination geometries, were retrieved from the CSD and MetalPDB databases [2,3]. Geometric features, including distances and angles formed by the 6 heteroatoms closest to the metal, were then computed from the crystallographic coordinates and used as input to train separate models for classifying abiogenic and biomolecular sites (Figure 1). The balanced accuracy of the 5-fold cross-validated models is above 98% for the CSD-trained model and above 90% for the MetalPDB-trained model, while classification takes less than a millisecond per metal site once the input features have been generated. Overall, these results demonstrate the validity of this approach for automated metal site classification, in particular for high-throughput applications that require the extraction of molecular properties from a large number of structures.

[1] A. Krogh, Nat. Biotechnol. 26 (2008) 195–197
[2] C. R. Groom, I. J. Bruno, M. P. Lightfoot and S. C. Ward, Acta Cryst. B72 (2016) 171–179
[3] V. Putignano, A. Rosato, L. Banci and C. Andreini, Nucleic Acids Res. 46(D1) (2018) D459–D464
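
To make the input features and the reported metric concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the features are the metal-ligand distances and the ligand-metal-ligand angles of the 6 nearest atoms (the abstract says only that the features "include" distances and angles), and that balanced accuracy means the mean of the per-class recalls. The function names `site_features` and `balanced_accuracy` are hypothetical.

```python
import math
from itertools import combinations


def site_features(metal, atoms, k=6):
    """Distances and angles for the k atoms closest to the metal.

    Assumption: `metal` is an (x, y, z) tuple and `atoms` is a list of
    (x, y, z) tuples for candidate heteroatoms, none located at the metal.
    Returns (distances in ascending order, angles in degrees, ascending).
    """
    # Vectors from the metal to every candidate atom, nearest first.
    vecs = [tuple(a[i] - metal[i] for i in range(3)) for a in atoms]
    vecs.sort(key=lambda v: math.hypot(*v))
    vecs = vecs[:k]
    dists = [math.hypot(*v) for v in vecs]

    # One ligand-metal-ligand angle per pair of selected atoms.
    angles = []
    for (u, du), (v, dv) in combinations(list(zip(vecs, dists)), 2):
        cos = sum(a * b for a, b in zip(u, v)) / (du * dv)
        cos = max(-1.0, min(1.0, cos))  # guard against rounding error
        angles.append(math.degrees(math.acos(cos)))
    return dists, sorted(angles)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls (assumed definition of the metric)."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Idealised octahedral site: six donors at 2 Å along the coordinate axes.
octahedron = [(2, 0, 0), (-2, 0, 0), (0, 2, 0), (0, -2, 0), (0, 0, 2), (0, 0, -2)]
distances, angles = site_features((0.0, 0.0, 0.0), octahedron)
```

With 6 atoms this yields 6 distances and 15 angles, a fixed-length vector that a classifier could take as input. For the idealised octahedron, twelve of the angles are 90° and three are 180°; real sites deviate from these values, which is what a trained model would have to tolerate.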
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.