Modern scientific data mainly consist of huge data sets gathered by a very large number of techniques and stored in much diversified and often incompatible data repositories. More in general, in the escience environment, it is considered as a critical and urgent requirement to integrate services across distributed, heterogeneous, dynamic ‘‘virtual organizations’’ formed by different resources within a single enterprise. In the last decade, Astronomy has become an immensely data-rich field due to the evolution of detectors (plates to digital to mosaics), telescopes and space instruments. The Virtual Observatory approach consists of the federation under common standards of all astronomical archives available worldwide, as well as data analysis, data mining and data exploration applications. The main drive behind such an effort is that once the infrastructure is complete, it will allow a new type of multiwavelength, multi-epoch science, which can only be barely imagined. Data mining, or knowledge discovery in databases, while being the main methodology to extract the scientific information contained in such Massive Data Sets (MDS), poses crucial problems since it has to orchestrate complex problems posed by transparent access to different computing environments, scalability of algorithms, reusability of resources, etc. In the present paper we summarize the present status of the MDS in the Virtual Observatory and what is currently done and planned to bring advanced data mining methodologies in the case of the DAME (DAta Mining and Exploration) project.
Mining Knowledge in Astrophysical Massive Data Sets / Brescia, Massimo; Longo, Giuseppe; Pasian, Fabio. - In: NUCLEAR INSTRUMENTS & METHODS IN PHYSICS RESEARCH. SECTION A, ACCELERATORS, SPECTROMETERS, DETECTORS AND ASSOCIATED EQUIPMENT. - ISSN 0168-9002. - 623:2(2010), pp. 845-849. [10.1016/j.nima.2010.02.002]
Mining Knowledge in Astrophysical Massive Data Sets
Massimo Brescia
;Giuseppe Longo;
2010
Abstract
Modern scientific data mainly consist of huge data sets gathered by a very large number of techniques and stored in much diversified and often incompatible data repositories. More in general, in the escience environment, it is considered as a critical and urgent requirement to integrate services across distributed, heterogeneous, dynamic ‘‘virtual organizations’’ formed by different resources within a single enterprise. In the last decade, Astronomy has become an immensely data-rich field due to the evolution of detectors (plates to digital to mosaics), telescopes and space instruments. The Virtual Observatory approach consists of the federation under common standards of all astronomical archives available worldwide, as well as data analysis, data mining and data exploration applications. The main drive behind such an effort is that once the infrastructure is complete, it will allow a new type of multiwavelength, multi-epoch science, which can only be barely imagined. Data mining, or knowledge discovery in databases, while being the main methodology to extract the scientific information contained in such Massive Data Sets (MDS), poses crucial problems since it has to orchestrate complex problems posed by transparent access to different computing environments, scalability of algorithms, reusability of resources, etc. In the present paper we summarize the present status of the MDS in the Virtual Observatory and what is currently done and planned to bring advanced data mining methodologies in the case of the DAME (DAta Mining and Exploration) project.File | Dimensione | Formato | |
---|---|---|---|
46-Brescia-1-s2.0-S0168900210001804-main.pdf
solo utenti autorizzati
Tipologia:
Versione Editoriale (PDF)
Licenza:
Copyright dell'editore
Dimensione
445.18 kB
Formato
Adobe PDF
|
445.18 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.