TY - JOUR
T1 - Non-random retention of protein-coding overlapping genes in Metazoa
AU - Soldà, Giulia
AU - Suyama, Mikita
AU - Pelucchi, Paride
AU - Boi, Silvia
AU - Guffanti, Alessandro
AU - Rizzi, Ermanno
AU - Bork, Peer
AU - Tenchini, Maria Luisa
AU - Ciccarelli, Francesca D.
N1 - Funding Information:
We wish to thank Davide Rambaldi (IEO, Milan) for his help in retrieving the data needed for the simulation of the random distribution. We also thank Raoul Bonnal and Michele Iacono of ITB-CNR for contributing to the generation, sequencing and analysis of the 454 cDNA library sequences. This work was supported by the Start Up grant of AIRC to FDC and by "Borsa di studio per il perfezionamento all'estero" of the University of Milan to GS.
PY - 2008/4/16
Y1 - 2008/4/16
N2 - Background: Although the overlap of transcriptional units occurs frequently in eukaryotic genomes, its evolutionary and biological significance remains largely unclear. Here we report a comparative analysis of overlaps between genes coding for well-annotated proteins in five metazoan genomes (human, mouse, zebrafish, fruit fly and worm). Results: For all analyzed species the observed number of overlapping genes is always lower than expected assuming functional neutrality, suggesting that gene overlap is negatively selected. The comparison to the random distribution also shows that retained overlaps do not exhibit random features: antiparallel overlaps are significantly enriched, while overlaps lying on the same strand and those involving coding sequences are highly underrepresented. We confirm that overlap is mostly species-specific and provide evidence that it frequently originates through the acquisition of terminal, non-coding exons. Finally, we show that overlapping genes tend to be significantly co-expressed in a breast cancer cDNA library obtained by 454 deep sequencing, and that different overlap types display different patterns of reciprocal expression. Conclusion: Our data suggest that overlap between protein-coding genes is selected against in Metazoa. However, when retained it may be used as a species-specific mechanism for the reciprocal regulation of neighboring genes. The tendency of overlaps to involve non-coding regions of the genes leads to the speculation that the advantages achieved by an overlapping arrangement may be optimized by evolving regulatory non-coding transcripts.
UR - http://www.scopus.com/inward/record.url?scp=42549155622&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=42549155622&partnerID=8YFLogxK
U2 - 10.1186/1471-2164-9-174
DO - 10.1186/1471-2164-9-174
M3 - Article
C2 - 18416813
AN - SCOPUS:42549155622
SN - 1471-2164
VL - 9
JO - BMC Genomics
JF - BMC Genomics
M1 - 174
ER -