PhD Defense by Hector F. Espitia-Navarro

Event Details

Date/Time:
- Monday December 9, 2019 - Tuesday December 10, 2019
  11:00 am - 12:59 pm
Location: IBB 1128 (Suddath Seminar Room)

Summaries

Summary Sentence: Efficient Alignment-free Software Applications for Next Generation Sequencing-based Molecular Epidemiology

Full Summary: No summary paragraph submitted.

In partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Bioinformatics

in the School of Biological Sciences

Hector F. Espitia-Navarro

Defends his thesis:

Efficient Alignment-free Software Applications for Next Generation Sequencing-based Molecular Epidemiology

Monday, December 9th, 2019

11:00 AM Eastern Time

IBB Suddath Room 1128

Thesis Advisor:

Dr. King Jordan

School of Biological Sciences

Georgia Institute of Technology

Committee Members:

Dr. Srinivas Aluru

School of Computational Science and Engineering

Georgia Institute of Technology

Dr. Jung Choi

School of Biological Sciences

Georgia Institute of Technology

Dr. Lavanya Rishishwar

School of Biological Sciences

Georgia Institute of Technology

Dr. Leonard Mayer

School of Medicine

Emory University

Abstract

Public health agencies increasingly couple next-generation sequencing (NGS) characterization of microbial genomes with bioinformatics analysis methods for molecular epidemiology. The overhead associated with these bioinformatics methods, in terms of both the required human expertise and computational resources, is a critical bottleneck that limits the potential impact of microbial genomics on public health. This is particularly true for local public health agency laboratories, which are typically staffed with microbiologists who may not have substantial bioinformatics expertise or ready access to high-performance computational resources. There is a pressing need for bioinformatics solutions for genome-enabled molecular epidemiology that are easy to use, computationally efficient, fast, and, most importantly, highly accurate. This thesis research focuses on the development of an alignment-free algorithm for NGS data analysis and its implementation in turn-key software applications specifically tailored for genome-enabled molecular epidemiology and environmental microbial genomics. I explored a computational strategy based on k-mer frequencies to distinguish between sequences of interest in NGS read samples. By combining this strategy with an efficient data structure called the enhanced suffix array (ESA), I developed a base algorithm, STing, for the rapid analysis of unprocessed NGS reads. I further adapted and implemented this algorithm in a suite of software applications for sequence typing, gene detection, and gene-based taxonomic read classification. Benchmarking and validation analyses showed that STing is an ultrafast and accurate solution for genome-enabled molecular epidemiology that performs better than existing bioinformatics methods for sequence typing and gene detection.
To help overcome the limited bioinformatics infrastructure and expertise in public health laboratories, I developed WebSTing, a web platform that uses the STing algorithm to provide easy access to accurate, rapid, automated alignment-free characterization of whole-genome sequencing (WGS) samples of bacterial isolates. Finally, to demonstrate the utility of STing in problems beyond simple sequence typing and gene detection, I applied the alignment-free algorithm to two different areas: (1) public health, with the virulence gene profiling of Shiga toxin-producing Escherichia coli (STEC) isolates, and (2) environmental microbial genomics, with the nifH gene-based taxonomic classification of amplicon sequencing reads. I showed that STing performs better than the gold-standard method for STEC isolate characterization, and that it correctly classifies amplicon sequencing reads from simulated communities of nitrogen-fixing organisms.
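
The general idea of alignment-free, k-mer-based typing mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the STing implementation: it swaps the enhanced suffix array for a plain dictionary index, ignores reverse complements and sequencing errors, and uses made-up toy sequences. The function names (kmers, build_index, count_hits, call_allele) and the choice of k are assumptions made only for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(alleles, k):
    """Map each k-mer to the set of allele names that contain it.

    A dictionary stands in here for the enhanced suffix array
    named in the abstract (an illustrative simplification).
    """
    index = {}
    for name, seq in alleles.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def count_hits(reads, index, k):
    """Count, per allele, how many read k-mers are found in it."""
    hits = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            for name in index.get(kmer, ()):
                hits[name] += 1
    return hits


def call_allele(reads, alleles, k=5):
    """Return the allele with the most k-mer hits, or None if no k-mer matches."""
    hits = count_hits(reads, build_index(alleles, k), k)
    return hits.most_common(1)[0][0] if hits else None


# Toy, made-up data: two hypothetical alleles differing at a single base.
alleles = {
    "allele_1": "ACGTACGTTAGC",
    "allele_2": "ACGTACCTTAGC",
}
reads = ["GTACGTTAG", "ACGTACGTT"]
print(call_allele(reads, alleles))
```

In this sketch no read is ever aligned to a reference: reads are broken into k-mers, each k-mer is looked up in the index, and the allele that accumulates the most hits is reported.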

Additional Information

In Campus Calendar

Groups

Graduate Studies

Invited Audience

Faculty/Staff, Public, Graduate students, Undergraduate students

Categories

Other/Miscellaneous

Keywords

Phd proposal

Status

Created By: Tatianna Richardson
Workflow Status: Published
Created On: Dec 3, 2019 - 11:32am
Last Updated: Dec 3, 2019 - 11:32am
