Network data integration is a demanding problem and remains an open challenge, especially for large datasets. When several data sets of heterogeneous types are collected on a phenomenon of interest, analysing each data set individually gives only a partial view of that phenomenon. Integrating all the data, in contrast, widens and deepens the results and gives a more global view of the entire system. We developed a novel statistical method, the INet algorithm, for data integration based on weighted multilayer networks. Under the assumption that the structures underlying the different layers share some similarity that should emerge in the integrated network, we generate a “consensus network” through an iterative procedure based on structure comparison, capable of extracting important information about the phenomenon under study. The procedure tries to preserve in the integrated network the higher-order structures, i.e. neighbourhoods, that the original networks have in common. Once the consensus network is obtained, we compare it with the starting networks to extract “specific networks”, one for each layer, containing the information peculiar to a single data type that is not present in all the others. We tested our method on simulated networks to assess the performance of the algorithm, and we analysed virus and vaccine gene co-expression networks to better understand infectious diseases.
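The abstract does not give the update rule used by INet, so the following Python sketch is only an illustration of the general idea (an iterative, neighbourhood-based consensus followed by per-layer specific networks) and is not the authors' algorithm. Everything in it is an assumption made for illustration: the function names `node_jaccard`, `consensus_sketch` and `specific_sketch`, the use of Jaccard similarity between neighbour sets as the structure comparison, the edge threshold `thr`, and the stopping tolerance `tol`.

```python
import numpy as np

def node_jaccard(a, b, thr):
    """Jaccard similarity between each node's neighbour sets in graphs a and b."""
    na, nb = a > thr, b > thr
    inter = (na & nb).sum(axis=1)
    union = (na | nb).sum(axis=1)
    # Nodes with no neighbours in either graph count as fully agreeing.
    return np.where(union > 0, inter / np.maximum(union, 1), 1.0)

def consensus_sketch(layers, thr=0.5, tol=1e-6, max_iter=100):
    """Illustrative consensus (NOT the INet update rule, which is not given here).

    Each layer's contribution to an edge is weighted by how well the
    neighbourhoods of the edge's endpoints in that layer agree with the
    current consensus; the weighted mean is iterated until it stabilises.
    """
    layers = np.asarray(layers, dtype=float)  # shape: (n_layers, n, n)
    plain_mean = layers.mean(axis=0)
    cons = plain_mean
    for _ in range(max_iter):
        weights = []
        for layer in layers:
            s = node_jaccard(layer, cons, thr)
            # Edge-level agreement: average of the two endpoint similarities.
            weights.append((s[:, None] + s[None, :]) / 2)
        weights = np.array(weights)
        total = weights.sum(axis=0)
        weighted = (weights * layers).sum(axis=0) / np.maximum(total, 1e-12)
        # Fall back to the plain mean where no layer agrees with the consensus.
        new = np.where(total > 0, weighted, plain_mean)
        converged = np.abs(new - cons).max() < tol
        cons = new
        if converged:
            break
    return cons

def specific_sketch(layers, cons, thr=0.5):
    """Per-layer boolean masks of edges present in a layer but not in the consensus."""
    cons_edges = cons > thr
    return [(np.asarray(layer) > thr) & ~cons_edges for layer in layers]
```

Calling `consensus_sketch` on a list of symmetric weighted adjacency matrices of equal size returns a weighted consensus matrix; passing that matrix to `specific_sketch` returns one boolean mask per layer. An edge that has the same weight in every layer keeps that weight in the consensus, which is the sense in which shared structure is preserved in this simplified version.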
INet for Network Integration / Policastro, Valeria; Magnani, Matteo; Angelini, Claudia; Carissimo, Annamaria. - (2023). (Paper presented at the ARS'23 Ninth International Workshop on Social Network Analysis, held in Ischia, 2-3 May 2023).
INet for Network Integration
Valeria Policastro; Annamaria Carissimo
2023