Bayesian Approaches to Functional Integration of Genomic Data

Event Details

Date/Time:
- Thursday February 15, 2018
  10:55 pm
Location: Room 1005, Roger A. and Helen B. Krone Engineered Biosystems Building (EBB), 950 Atlantic Dr NW, Atlanta, GA 30332
Fee(s):
N/A

Contact

If you have questions about logistics or would like to set up an appointment with the speaker, please contact the School of Biological Sciences' administrative office at bio-admin@lists.gatech.edu.

Summaries

Summary Sentence: A Biological Sciences Seminar by Jingjing Yang, Ph.D.

Dr. Jingjing Yang
Department of Human Genetics
Emory University

Abstract:
Although genome-wide association studies (GWAS) have identified thousands of SNP-trait associations (more than 55,000 reported in the GWAS Catalog), the biological mechanisms underlying these associations are largely unknown. Here, we propose a Bayesian variable selection model that integrates variant functional annotations to help understand and prioritize causal variants and mechanisms. Our method improves upon previous approaches by accounting for multiple categories of functional annotations, for genotype correlation due to linkage disequilibrium (LD) and, importantly, by quantifying the proportion of causal variants and the relative effect sizes of variants with different functional annotations. To apply our model to very large GWAS and sequencing data sets, we present a novel scalable Bayesian computation method based on a block-wise expectation-maximization Markov chain Monte Carlo (EM-MCMC) algorithm. Our algorithm dramatically improves both computational speed and posterior sampling convergence by taking advantage of the block-like LD structure of the human genome. In simulations, we show that our method increases power and identifies more true signals than competing methods. In real data, we show that previous greedy approaches and MCMC implementations lead to apparently sub-optimal sets of likely causal variants because they fail to fully explore the set of possible causal variants. We applied our method to a genome-wide association study of age-related macular degeneration with ~33,000 individuals and >12 million genotyped and imputed variants. Our results show that non-synonymous markers are about 20 times more likely to be causal than other markers, and that the effect size of associated non-synonymous variants is about 3 times larger than that of other variants. Importantly, our method can help prioritize likely functional candidates for follow-up while disentangling the effects of genotype, linkage disequilibrium and functional annotation.
Further, we implemented this method using only summary-level data from standard GWAS, which saves up to 85% of CPU time while producing the same results as individual-level data. In conclusion, our method has the potential to shed light on the biological mechanisms underlying SNP associations and can help prioritize SNPs for downstream analysis.

Host: Greg Gibson

Additional Information

In Campus Calendar

Yes

Groups

School of Biological Sciences

Invited Audience

Faculty/Staff, Public, Graduate students, Undergraduate students

Categories

Seminar/Lecture/Colloquium

Keywords

School of Biological Science Seminar, Greg Gibson, Jingjing Yang, The Center for Integrative Genomics and Predictive Health in Atlanta

Status

Created By: Jasmine Martin
Workflow Status: Published
Created On: Jan 31, 2018 - 1:04pm
Last Updated: Jan 31, 2018 - 1:10pm

Georgia Tech
