A large part of human pathology consists of complex diseases, such as heart disease, obesity, cancer, diabetes, and many common psychiatric and neurological conditions. The common feature of all these conditions is a multifactorial etiology that involves both genetic and environmental factors. The common disease-common variant (CDCV) hypothesis posits that common, interacting alleles underlie most common diseases, in association with environmental factors. Furthermore, according to the thrifty genotype hypothesis, such alleles have been subjected to selective pressure, mainly those involved in metabolic diseases such as type 2 diabetes mellitus (T2DM) and obesity. Although the concept of gene-environment interaction is central to ecogenetics and has long been recognized by geneticists (Haldane 1946), there are relatively few detailed descriptions of gene-environment interactions in the biomedical literature. This gap may be explained by the difficulty of collecting environmental information of sufficient quality and by the great difficulty of analyzing it. Indeed, when the number of factors to analyze is large, the curse of dimensionality and the multiple testing problem become overwhelming. The present thesis tested the hypothesis that knowledge-driven approaches may improve the ability to identify genes involved in complex disease. Three approaches are presented, each leading to the identification of a factor or of an interaction between factors. The study of a complex disease consists of three steps: (1) selection of candidate genes, (2) collection of genetic and non-genetic information, and (3) statistical analysis of the data. It is shown that each of these steps may be improved by taking the biological background into account. The first study examined the possibility of exploiting evolutionary information to identify genes involved in type 2 diabetes. This approach was based on the thrifty genotype hypothesis. A gene, ACO1, was identified and was successfully associated with the disease.
In the second study, we analyzed the case of a gene, PPARγ, that has been inconsistently associated with obesity. We hypothesized that the inconsistency of the association may be due to the gene's relationship with the environment. We then jointly analyzed the genotype of the gene and comprehensive nutritional information about a cohort and demonstrated an interaction. The genotype of PPARγ modulated the response to diet: Ala carriers gained more weight than ProPro individuals when they had the same caloric intake. In the third study, we implemented a software tool to create simulated populations based on gene-environment interactions. The system used genetic information to simulate realistic populations. We used these simulated populations to gather information on the statistical methods most frequently used to study case-control samples. Afterward, we built an ensemble of these methods and applied it to a real sample. We showed that the ensemble performed better than any single method when the sample size was small.

THREE METHODS TO INCREASE THE LIKELY TO IDENTIFY GENE INVOLVED IN COMPLEX DISEASE / Cocozza, Sergio. - (2008).

THREE METHODS TO INCREASE THE LIKELY TO IDENTIFY GENE INVOLVED IN COMPLEX DISEASE

COCOZZA, SERGIO
2008


Use this identifier to cite or link to this document: https://hdl.handle.net/11588/362139