Development of the Rubin Observatory Legacy Survey of Space and Time (LSST) includes a series of Data Challenges (DCs) arranged by various LSST Scientific Collaborations that are taking place during the project's preoperational phase. The AGN Science Collaboration Data Challenge (AGNSC-DC) is a partial prototype of the expected LSST data on active galactic nuclei (AGNs), aimed at validating machine learning approaches for AGN selection and characterization in large surveys like LSST. The AGNSC-DC took place in 2021, focusing on accuracy, robustness, and scalability. The training and the blinded data sets were constructed to mimic the future LSST release catalogs using the data from the Sloan Digital Sky Survey Stripe 82 region and the XMM-Newton Large Scale Structure Survey region. Data features were divided into astrometry, photometry, color, morphology, redshift, and class label with the addition of variability features and images. We present the results of four submitted solutions to DCs using both classical and machine learning methods. We systematically test the performance of supervised models (support vector machine, random forest, extreme gradient boosting, artificial neural network, convolutional neural network) and unsupervised ones (deep embedding clustering) when applied to the problem of classifying/clustering sources as stars, galaxies, or AGNs. We obtained classification accuracy of 97.5% for supervised models and clustering accuracy of 96.0% for unsupervised ones and 95.0% with a classic approach for a blinded data set. We find that variability features significantly improve the accuracy of the trained models, and correlation analysis among different bands enables a fast and inexpensive first-order selection of quasar candidates.

The LSST AGN Data Challenge: Selection Methods / Savić, Đorđe V.; Jankov, Isidora; Yu, Weixiang; Petrecca, Vincenzo; Temple, Matthew J.; Ni, Qingling; Shirley, Raphael; Kovačević, Andjelka B.; Nikolić, Mladen; Ilić, Dragana; Popović, Luka Č.; Paolillo, Maurizio; Panda, Swayamtrupta; Ćiprijanović, Aleksandra; Richards, Gordon T.. - In: THE ASTROPHYSICAL JOURNAL. - ISSN 0004-637X. - 953:2(2023), p. 138. [10.3847/1538-4357/ace31a]

The LSST AGN Data Challenge: Selection Methods

Petrecca, Vincenzo (Formal Analysis); Paolillo, Maurizio (Formal Analysis)
2023


Use this identifier to cite or link to this document: https://hdl.handle.net/11588/935986