Abstract
We describe the Phase II HapMap, which characterizes over 3.1 million human single nucleotide polymorphisms (SNPs) genotyped in 270 individuals from four geographically diverse populations and includes 25–35% of common SNP variation in the populations surveyed. The map is estimated to capture untyped common variation with an average maximum r² of between 0.9 and 0.96 depending on population. We demonstrate that the current generation of commercial genome-wide genotyping products captures common Phase II SNPs with an average maximum r² of up to 0.8 in African and up to 0.95 in non-African populations, and that potential gains in power in association studies can be obtained through imputation. These data also reveal novel aspects of the structure of linkage disequilibrium. We show that 10–30% of pairs of individuals within a population share at least one region of extended genetic identity arising from recent ancestry and that up to 1% of all common variants are untaggable, primarily because they lie within recombination hotspots. We show that recombination rates vary systematically around genes and between genes of different function. Finally, we demonstrate increased differentiation at non-synonymous, compared to synonymous, SNPs, resulting from systematic differences in the strength or efficacy of natural selection between populations.
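To make the "average maximum r²" statistic quoted above concrete, the following is a minimal sketch and not the Consortium's own pipeline. It assumes phased haplotypes coded as 0/1 alleles and pairwise tagging only; the function names (`r_squared`, `mean_max_r2`) and the toy haplotypes are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List


def r_squared(hap_a: List[int], hap_b: List[int]) -> float:
    """Squared allelic correlation between two biallelic SNPs.

    Each list holds one 0/1 allele per phased haplotype (assumed input format).
    """
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at SNP B
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n  # 1-1 haplotype frequency
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                            # monomorphic SNP: r² undefined, treated as 0 here
    d = p_ab - p_a * p_b                      # linkage disequilibrium coefficient D
    return d * d / denom


def mean_max_r2(targets: Dict[str, List[int]], tags: Dict[str, List[int]]) -> float:
    """For each target SNP take the best r² to any tag SNP, then average."""
    best = [max(r_squared(t, g) for g in tags.values()) for t in targets.values()]
    return sum(best) / len(best)


# Toy data: eight haplotypes, two typed (tag) SNPs, two untyped (target) SNPs.
tags = {
    "tag1": [0, 0, 1, 1, 0, 1, 0, 1],
    "tag2": [0, 1, 0, 1, 0, 1, 0, 1],
}
targets = {
    "snp_x": [0, 0, 1, 1, 0, 1, 0, 1],   # identical to tag1, so perfectly tagged
    "snp_y": [0, 0, 1, 1, 0, 1, 1, 0],   # only partially correlated with either tag
}

print(mean_max_r2(targets, tags))
```

In this toy case one target is captured perfectly and the other only weakly, so the average sits well below the 0.9 to 0.96 range reported for the Phase II map.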
Change history
18 January 2008
Co-author Todd A. Johnson's name was inadvertently omitted from the list of RIKEN authors in the HTML version of the paper only. This was corrected on 18 January 2008.
Acknowledgements
We thank many people who contributed to this project: all members of the genotyping laboratory and the sample, primer, bioinformatics, data quality and IT groups at Perlegen Sciences for technical and infrastructural support; J. Beck, C. Beiswanger, D. Coppock, A. Leach, J. Mintzer and L. Toji for transforming the Yoruba, Japanese and Han Chinese samples, distributing the DNA and cell lines, storing the samples for use in future research, and producing the community newsletters and reports; J. Greenberg and R. Anderson for providing funding and support for cell line transformation and storage in the NIGMS Human Genetic Cell Repository at the Coriell Institute; T. Dibling, T. Ishikura, S. Kanazawa, S. Mizusawa and S. Saito for help with genotyping; C. Hind and A. Moghadam for technical support in genotyping and all members of the subcloning and sequencing teams at the Wellcome Trust Sanger Institute; X. Ke for help with data analysis; Oxford E-Science Centre for provision of high-performance computing resources; H. Chen, W. Chen, L. Deng, Y. Dong, C. Fu, L. Gao, H. Geng, J. Geng, M. He, H. Li, H. Li, S. Li, X. Li, B. Liu, Z. Liu, F. Lu, F. Lu, G. Lu, C. Luo, X. Wang, Z. Wang, C. Ye and X. Yu for help with genotyping and sample collection; X. Feng, Y. Li, J. Ren and X. Zhou for help with sample collection; J. Fan, W. Gu, W. Guan, S. Hu, H. Jiang, R. Lei, Y. Lin, Z. Niu, B. Wang, L. Yang, W. Yang, Y. Wang, Z. Wang, S. Xu, W. Yan, H. Yang, W. Yuan, C. Zhang, J. Zhang, K. Zhang and G. Zhao for help with genotyping; P. Fong, C. Lai, C. Lau, T. Leung, L. Luk and W. Tong for help with genotyping; C. Pang for help with genotyping; K. Ding, B. Qiang, J. Zhang, X. Zhang and K. Zhou for help with genotyping; Q. Fu, S. Ghose, X. Lu, D. Nelson, A. Perez, S. Poole, R. Vega and H. Yonath for help with genotyping; C. Bruckner, T. Brundage, S. Chow, O. Iartchouk, M. Jain, M. Moorhead and K. Tran for help with genotyping; N. Addleman, J. Atilano, T. Chan, C. Chu, C. Ha, T. Nguyen, M. 
Minton and A. Phong for help with genotyping, and D. Lind for help with quality control and experimental design; R. Donaldson and S. Duan for help with genotyping, and J. Rice and N. Saccone for help with experimental design; J. Wigginton for help with implementing and testing QA/QC software; A. Clark, B. Keats, R. Myers, D. Nickerson and A. Williamson for providing advice to NIH; C. Juenger, C. Bennet, C. Bird, J. Melone, P. Nailer, M. Weiss, J. Witonsky and E. DeHaut-Combs for help with project management; M. Gray for organizing phone calls and meetings; D. Leja for help with figures; the Yoruba people of Ibadan, Nigeria, the people of Tokyo, Japan, and the community at Beijing Normal University, who participated in public consultations and community engagements; the people in these communities who donated their blood samples; and the people in the Utah CEPH community who allowed the samples they donated earlier to be used for the Project. This work was supported by the Japanese Ministry of Education, Culture, Sports, Science and Technology, the Wellcome Trust, Nuffield Trust, Wolfson Foundation, UK EPSRC, Genome Canada, Génome Québec, the Chinese Academy of Sciences, the Ministry of Science and Technology of the People’s Republic of China, the National Natural Science Foundation of China, the Hong Kong Innovation and Technology Commission, the University Grants Committee of Hong Kong, the SNP Consortium, the US National Institutes of Health (FIC, NCI, NCRR, NEI, NHGRI, NIA, NIAAA, NIAID, NIAMS, NIBIB, NIDA, NIDCD, NIDCR, NIDDK, NIEHS, NIGMS, NIMH, NINDS, NLM, OD), the W.M. Keck Foundation, and the Delores Dore Eccles Foundation. All SNPs genotyped within the HapMap Project are available from dbSNP (http://www.ncbi.nlm.nih.gov/SNP); all genotype information is available from dbSNP and the HapMap website (http://www.hapmap.org).
Ethics declarations
Competing interests
Some authors declare employment and personal financial interests. These authors declare employment financial interests: authors who are current employees of genotyping companies or were employees of genotyping companies (Affymetrix, Illumina, ParAllele, Perlegen) during the project. These authors declare personal financial interests (defined as serving on the advisory board of a genotyping company, owning stock in a genotyping company, or receiving royalties from a patent licensed to a genotyping company): A.B., A.C., A.S., D.R.C., M.S.C., J.B.F., L.M.G., L.R.C., P.H., P.Y.K., S.S.M. and T.D.W.
Additional information
Lists of participants and affiliations appear at the end of the paper. (Participants are arranged by institution and then alphabetically within institutions except for Principal Investigators and Project Leaders, as indicated.)
Supplementary information
The file contains Supplementary Notes, Supplementary Tables 1-9, Supplementary Figures 1-7 with Legends and additional references. (PDF 4470 kb)
About this article
Cite this article
The International HapMap Consortium. A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861 (2007). https://doi.org/10.1038/nature06258
DOI: https://doi.org/10.1038/nature06258