The authors of the article have combined genomic maps of copy number variants (CNVs) with data from thousands of genomes from different human populations to explore the functional implications of introns.
The methods developed in this study show a direct relationship between mutations in introns and variability in human populations.
One of the greatest challenges of genomics is to reveal what role the “dark side” of the human genome plays: those regions for which it has not yet been possible to find specific functions. The role that introns play within that immense part of the genome is especially mysterious. Introns, which represent almost half the size of the human genome, are constitutive parts of genes that alternate with the regions that code for proteins, called exons.
PLOS Genetics publishes the article "Intronic CNVs and gene expression variation in human populations", led by Alfonso Valencia, ICREA professor and director of the Life Sciences department of the Barcelona Supercomputing Center (BSC), and Daniel Rico of the Institute of Cellular Medicine (Newcastle University, United Kingdom). The work analyzes how introns are affected by copy number variants (CNVs). CNVs are genomic variants that result in the presence (even in multiple copies) or absence of regions of the genome in different individuals. The difficulty of detecting and interpreting this type of variation had made its analysis impossible until now. The research team has developed a methodology to understand how CNVs affect introns, specifically when they represent DNA losses in some individuals.
The results show that introns tend to be lost less frequently than other non-coding regions of the dark side of the genome. This suggests that there is selective pressure against losing them during evolution, which can be interpreted as a consequence of their functional importance. Supporting this hypothesis, the work reveals that losses of intron fragments tend to spare those parts of the introns that contain known regulatory signals, whose loss would be more likely to affect the organism. Analyzing these regulatory signals required studying how they are organized in the three-dimensional structure of the cell nucleus.
"The data was there, but no one had paid attention to it: as introns are not usually given importance, nobody had noticed that more than 6,000 genes have introns with variable sizes in different people," comments Maria Rigau, BSC researcher and first author of the paper. She adds that "the size of the genes matters, since we see a significant number of genes in which having shorter or longer introns affects the amount of RNA that is produced, which could be associated with changes in transcriptional regulation and could be related to different diseases."
David Juan, from the Institute of Evolutionary Biology (IBE, UPF-CSIC), explains that these discoveries were possible because genomic data have been made publicly available and can be re-analyzed. "It's funny, because many researchers throw introns in the bin during their analyses. We have 'dug' into that bin and found a treasure that nobody had previously exploited. For this reason, we would like to thank the work of hundreds of people, both those who have generated the data, in particular the international 1000 Genomes Consortium, and those who maintain the high-performance computing (HPC) resources that make these studies possible."
Citation: Rigau M, Juan D, Valencia A, Rico D (2019) Intronic CNVs and gene expression variation in human populations. PLoS Genet 15(1):e1007902. https://doi.org/10.1371/journal.pgen.1007902