Effect of Genome-Wide Genotyping and Reference Panels on Rare Variants Imputation
Author affiliations: Department of Medicine, Human Genetics, McGill University, Montreal, Quebec H3T 1E2, Canada; Department of Epidemiology and Biostatistics, Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec H3T 1E2, Canada; Research Center of Montreal Heart Institute, Montreal, Quebec H1T 1C8, Canada; Department of Oncology, McGill University, Montreal, Quebec H3T 1E2, Canada; Twin Research and Genetic Epidemiology, King's College London, London SE1 7EH, United Kingdom
Publication: Journal of Genetics and Genomics (遗传学报, English edition)
Year, volume, issue: 2012, Vol. 39, No. 10
Pages: 545-550
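Subject classification: 0710 [Science - Biology]; 07 [Science]; 08 [Engineering]; 09 [Agriculture]; 071007 [Science - Genetics]; 0901 [Agriculture - Crop Science]; 0836 [Engineering - Bioengineering]; 090102 [Agriculture - Crop Genetics and Breeding]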
Funding: supported by the Canadian Institutes of Health Research (CIHR)
Keywords: Genotype imputation; Genome-wide association study; 1000 Genomes Project; HapMap; Rare variant; Common disease
Abstract: Common variants explain little of the variance of most common diseases, prompting large-scale sequencing studies to understand the contribution of rare variants to these diseases. Imputation of rare variants from genome-wide genotyping arrays offers a cost-efficient strategy to achieve the sample sizes required for adequate statistical power. To estimate the performance of imputation of rare variants, we imputed 153 individuals, each of whom was genotyped on 3 different genotyping arrays (317k, 610k and 1 million single nucleotide polymorphisms (SNPs)), against two different reference panels, HapMap2 and the 1000 Genomes pilot March 2010 release (1KGpilot), using IMPUTE version 2. We found that more than 94% and 84% of all SNPs yielded acceptable accuracy (info > 0.4) in HapMap2-based and 1KGpilot-based imputation, respectively. For rare variants (minor allele frequency (MAF) < 5%), the proportion of well-imputed SNPs increased as the MAF increased from 0.3% to 5% across all 3 genome-wide association study (GWAS) datasets. The proportion of well-imputed SNPs was 69%, 60% and 49% for SNPs with a MAF from 0.3% to 5% for the 1M, 610k and 317k arrays, respectively. None of the very rare variants (MAF < 0.3%) were well imputed. We conclude that the imputation accuracy of rare variants increases with higher density of genome-wide genotyping arrays when the size of the reference panel is small. Variants with lower MAF are more difficult to impute. These findings have important implications for the design and replication of large-scale sequencing studies.
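As an illustration of the summary statistic reported in the abstract, the following is a minimal sketch, not the authors' pipeline, of how the proportion of well-imputed SNPs per MAF class could be tabulated. It assumes each SNP is available as a (MAF, info) pair. The names `Snp`, `maf_class` and `well_imputed_proportions` are hypothetical, the handling of values exactly on a bin boundary is an assumption, and the four SNPs in the usage lines are made-up toy values, not data from the study. Only the info > 0.4 threshold and the 0.3% and 5% MAF cut-offs come from the abstract.

```python
from typing import Dict, Iterable, NamedTuple


class Snp(NamedTuple):
    maf: float   # minor allele frequency as a fraction (0.05 means 5%)
    info: float  # imputation quality score for the SNP


INFO_THRESHOLD = 0.4  # "acceptable accuracy" cut-off stated in the abstract


def maf_class(maf: float) -> str:
    """Assign a SNP to a MAF class (boundary handling is an assumption)."""
    if maf < 0.003:
        return "very_rare"  # MAF < 0.3%
    if maf < 0.05:
        return "rare"       # 0.3% <= MAF < 5%
    return "common"         # MAF >= 5%


def well_imputed_proportions(snps: Iterable[Snp]) -> Dict[str, float]:
    """Return, per MAF class, the fraction of SNPs with info > INFO_THRESHOLD."""
    totals: Dict[str, int] = {}
    good: Dict[str, int] = {}
    for snp in snps:
        cls = maf_class(snp.maf)
        totals[cls] = totals.get(cls, 0) + 1
        if snp.info > INFO_THRESHOLD:
            good[cls] = good.get(cls, 0) + 1
    return {cls: good.get(cls, 0) / n for cls, n in totals.items()}


# Toy input (invented values, for illustration only).
toy = [Snp(0.001, 0.2), Snp(0.01, 0.5), Snp(0.02, 0.3), Snp(0.2, 0.9)]
proportions = well_imputed_proportions(toy)
```

Running the same tabulation separately for each array density (317k, 610k, 1M) and each reference panel would give figures comparable in form to the percentages quoted in the abstract.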