Background: Analysis of non-coding sequences in several bacterial genomes has led to the identification of families of repeated sequences able to fold into secondary structures. These sequences have often been claimed to be transcribed and to fulfill a functional role. A previous systematic analysis of a representative set of 40 bacterial genomes produced a large collection of sequences potentially able to fold as stem-loop structures (SLS). Computational analysis of these sequences was carried out by searching for families of repetitive nucleic acid elements sharing a common secondary structure.

Results: The initial clustering procedure identified clusters of similar sequences in 29 genomes, corresponding to about 1% of the whole population. Sequences selected in this way have a substantially higher propensity to fold into a stable secondary structure than the initial set. Removal of redundancies and regrouping of the selected sequences resulted in a final set of 92 families, defined by HMM analysis. Twenty-five of them include all well-known SLS-containing repeats and others reported in the literature but not analyzed in detail. The remaining 67 families have not been previously described. Two thirds of the families share a common predicted secondary structure and are located within intergenic regions.

Conclusion: Systematic analysis of 40 bacterial genomes revealed a large number of repeated sequence families, including known and novel ones. Their predicted structure and genomic location suggest that, even in compact bacterial genomes, a relatively large fraction of the genome consists of non-protein-coding sequences, possibly functioning at the RNA level.

Systematic identification of stem-loop containing sequence families in bacterial genomes / Cozzuto, Luca; Petrillo, Mauro; Silvestro, G.; DI NOCERA, Pierpaolo; Paolella, Giovanni. - In: BMC GENOMICS. - ISSN 1471-2164. - 9:(2008), pp. 1-17. [10.1186/1471-2164-9-20]

Systematic identification of stem-loop containing sequence families in bacterial genomes

Cozzuto, Luca; Petrillo, Mauro; Di Nocera, Pierpaolo; Paolella, Giovanni
2008


Use this identifier to cite or link to this document: https://hdl.handle.net/11588/346941