Exploration of haplotype research consortium imputation for genome-wide association studies in 20,032 Generation Scotland participants

Reka Nagy, Thibaud S Boutin, Jonathan Marten, Jennifer E Huffman, Shona M Kerr, Archie Campbell, Louise Evenden, Jude Gibson, Carmen Amador, David M Howard, Pau Navarro, Andrew Morris, Ian J Deary, Lynne J Hocking, Sandosh Padmanabhan, Blair H Smith, Peter Joshi, James F Wilson, Nicholas D Hastie, Alan F Wright, Andrew M McIntosh, David J Porteous, Chris S Haley, Veronique Vitart, Caroline Hayward (Corresponding Author)

Research output: Contribution to journal › Article › peer-review

BACKGROUND: The Generation Scotland: Scottish Family Health Study (GS:SFHS) is a family-based population cohort with DNA, biological samples, socio-demographic, psychological and clinical data from approximately 24,000 adult volunteers across Scotland. Although data collection was cross-sectional, GS:SFHS became a prospective cohort because of the ability to link to routine Electronic Health Record (EHR) data. Over 20,000 participants were selected for genotyping using a large genome-wide array.

METHODS: GS:SFHS was analysed using genome-wide association studies (GWAS) to test the effects of a large spectrum of variants, imputed using the Haplotype Research Consortium (HRC) dataset, on medically relevant traits measured directly or obtained from EHRs. The HRC dataset is the largest available haplotype reference panel for imputation of variants in populations of European ancestry and allows investigation of variants with low minor allele frequencies within the entire GS:SFHS genotyped cohort.

RESULTS: Genome-wide associations were run on 20,032 individuals using both genotyped and HRC imputed data. We present results for a range of well-studied quantitative traits obtained from clinic visits and for serum urate measures obtained from data linkage to EHRs collected by the Scottish National Health Service. Results replicated known associations and additionally revealed novel findings, mainly with rare variants, validating the use of the HRC imputation panel. For example, we identified two new associations with fasting glucose at variants near to Y_RNA and WDR4 and four new associations with heart rate at SNPs within CSMD1 and ASPH, upstream of HTR1F and between PROKR2 and GPCPD1. All were driven by rare variants (minor allele frequencies in the range of 0.08-1%). Proof of principle for use of EHRs was verification of the highly significant association of urate levels with the well-established urate transporter SLC2A9.

CONCLUSIONS: GS:SFHS provides genetic data on over 20,000 participants alongside a range of phenotypes as well as linkage to National Health Service laboratory and clinical records. We have shown that the combination of deeper genotype imputation and extended phenotype availability makes GS:SFHS an attractive resource to carry out association studies to gain insight into the genetic architecture of complex traits.

Original language: English
Article number: 23
Pages (from-to): 1-14
Number of pages: 14
Journal: Genome Medicine
Publication status: Published - 7 Mar 2017

Bibliographical note

We are grateful to all the families who took part in the Generation Scotland: Scottish Family Health Study, the general practitioners and Scottish School of Primary Care for their help in recruiting them, and the whole Generation Scotland team, which includes academic researchers, IT staff, laboratory technicians, statisticians and research managers. We thank staff at the University of Dundee Health Informatics Centre for their expert assistance with EHR data linkage. IJD is supported by The University of Edinburgh Centre for Cognitive Ageing and Cognitive Epidemiology, part of the cross council Lifelong Health and Wellbeing Initiative (MR/K026992/1); funding from the BBSRC and MRC is gratefully acknowledged. Data on glycaemic traits have been contributed by MAGIC investigators and have been downloaded from www.magicinvestigators.org.

Genotyping of the GS:SFHS samples was carried out by the Edinburgh Clinical Research Facility, University of Edinburgh and was funded by the Medical Research Council UK and the Wellcome Trust (Wellcome Trust Strategic Award ‘STratifying Resilience and Depression Longitudinally’ (STRADL), Reference 104036/Z/14/Z). GS:SFHS received core support from the Scottish Executive Health Department, Chief Scientist Office, grant number CZD/16/6. The MRC provides core funding to the QTL in Health and Disease research program at the MRC HGU, IGMM, University of Edinburgh.

Availability of data and materials

The datasets supporting the conclusions of this article are included within the article (and its Additional files).


  • Genome-wide association studies (GWAS)
  • Electronic health records
  • Imputation
  • Quantitative trait
  • Genetics
  • Urate
  • Heart rate
  • Glucose
  • Haplotype Research Consortium (HRC)

