An international team of scientists has scanned the genomes of 2,504 people from around the world to create the world’s largest catalog of human genetic variation (HGV). The extensive database will help them understand why some people are susceptible to certain diseases.
No two people are identical, yet human beings share 99.9% of their DNA. That tiny 0.1% difference accounts for all the individual variation among us. The new catalog, which was compiled by the 1000 Genomes Project Consortium with support from the U.S. National Institutes of Health, catalogs those differences in people’s genomes around the world. More often than not, these variations are harmless. Some are even beneficial. But others contribute to a host of genetic diseases and conditions, including cognitive impairments and predispositions to cancer, obesity, diabetes, and heart disease.
HGV describes genetic differences both within and among populations. The most common type of genetic variation, according to the NIH, is the single nucleotide polymorphism (SNP).
Each SNP represents a difference in a single DNA building block, called a nucleotide. For example, a SNP may replace the nucleotide cytosine (C) with the nucleotide thymine (T) in a certain stretch of DNA. SNPs occur normally throughout a person’s DNA. They occur once in every 300 nucleotides on average, which means there are roughly 10 million SNPs in the human genome. Most commonly, these variations are found in the DNA between genes. They can act as biological markers, helping scientists locate genes that are associated with disease. When SNPs occur within a gene or in a regulatory region near a gene, they may play a more direct role in disease by affecting the gene’s function.
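The 10 million figure is simple division, and a minimal sketch makes the arithmetic explicit. It assumes a genome length of roughly 3 billion nucleotides, a commonly cited round number that does not appear in the NIH description; the function and constant names are illustrative only.

```python
# Back-of-the-envelope estimate of how many SNPs a human genome contains.
# ASSUMPTION: the genome is about 3 billion nucleotides long (a round figure,
# not stated in the article). The 1-in-300 spacing is the article's average.
GENOME_LENGTH_NT = 3_000_000_000
SNP_SPACING_NT = 300


def estimated_snp_count(genome_length=GENOME_LENGTH_NT, spacing=SNP_SPACING_NT):
    """Return the approximate number of SNPs, given one SNP per `spacing` nucleotides."""
    return genome_length // spacing


if __name__ == "__main__":
    print(f"Estimated SNPs: {estimated_snp_count():,}")
```

Because both inputs are rounded averages, the result is best read as an order of magnitude rather than a count.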
A sampling of the population groups of the 1000 Genomes Project (Credit: bioinf.jku.at)
A statement from the NIH explains more about the latest findings:
...investigators identified about 88 million sites in the human genome that vary among people, establishing a database available to researchers as a standard reference for how the genomic make-up of people varies in populations and around the world. The catalog more than doubles the number of known variant sites in the human genome, and can now be used in a wide range of studies of human biology and medicine, providing the basis for a new understanding of how inherited differences in DNA can contribute to disease risk and drug response.
Of the more than 88 million variable sites identified, about 12 million had common variants that were likely shared by many of the populations. The study showed that the greatest genomic diversity is in African populations, consistent with evidence that humans originated in Africa and that migrations from Africa established other populations around the world.
The researchers found more than 99% of the variants in the human genome that occur at a frequency of at least 1% in the populations studied. Of the 88 million variations, about 25% are common and occur in many or all populations, while about 75% occur in only 1% of people or less.
“The 1000 Genomes Project data are a resource for any study in which scientists are looking for genomic contributions to disease, including the study of both common and rare variants,” noted Lisa Brooks, program director in the NHGRI Genomic Variation Program, in the NIH release.
Follow the author at @dvorsky. Top image by Loozrboy/Flickr/CC BY 2.0