Genotyping Analysis for SEED Samples

Daniele Fallin, PhD, Principal Investigator, Johns Hopkins University. (Funded 2014-2015). There is considerable interest in identifying risk factors for ASD and understanding its molecular etiology; areas of active investigation include genome-wide genetic association, gene-environment interaction, environmental exposure, and epigenetic studies. Given the high heritability of ASD, all of these areas are likely to benefit from incorporating genetic data into their analyses. The Study to Explore Early Development (SEED) is the only US-based autism study that has collected biospecimens, comprehensive clinical evaluations, and prenatal environmental exposure data from thousands of children. It therefore offers a unique opportunity to examine genetic, prenatal environmental, and epigenetic risk factors for ASD in the same individuals. All analyses that investigate or incorporate genetic risk for ASD will benefit from a large number of genotyped samples. The purpose of this project is to increase the total number of SEED samples with genome-wide genotyping data from 1,539 to 1,915 (811 ASD cases, 921 controls, and 183 with non-ASD developmental delay) by generating new genotype data for 376 existing SEED samples. This goal will be achieved through three aims designed to provide the highest-quality genotype data and enable valid integration with existing SEED genotype data. In Aim 1, we will measure genotypes at over 4.5 million loci using the HumanOmni5 plus Exome BeadChip (Omni5). In Aim 2, we will apply an existing pipeline, already used with existing SEED genotype datasets, for quality control and imputation, yielding genotype data at ~30 million SNPs per sample. In Aim 3, we will develop a pipeline to identify poorly imputed SNPs and perform analyses to assess potential batch-related issues that need to be addressed when combining newly generated and existing SEED datasets.
This proposal builds upon existing infrastructure and resources provided by the CDC, NIEHS, and Autism Speaks. By providing more power for studies that incorporate genetic data in their analyses, the additional SEED genotype data obtained in this study will strengthen existing and future efforts, both within and outside the SEED network, to understand the molecular basis of ASD.