Title: Adaptation of Hybrid Deep Neural Network-hidden Markov Model Speech Recognition System using a Sub-space Approach
Committee:
Dr. David Anderson, ECE, Chair, Advisor
Dr. Mark Clements, ECE
Dr. Mark Davenport, ECE
Dr. Omer Inan, ECE
Dr. Fang Liu, CoC
Dr. Wayne Daley, GTRI
Abstract:
The objective of this study is to enhance the performance of an automatic speech recognition (ASR) system by adapting it to a particular speaker or group of speakers. In ASR, training and testing data often do not follow the same statistics; they are often mismatched, which leads to a gap in performance. The difference between training and testing statistics can be minimized by speaker adaptation techniques, which require adaptation data from a target speaker to optimize system performance. In the past, ASR systems were based on Gaussian mixture model-hidden Markov models (GMM-HMM). A resurgence of neural networks has resulted in the popularity of hybrid deep neural network-hidden Markov models (DNN-HMM) for speech recognition. The adaptation techniques developed for GMM-HMM systems cannot be directly applied to DNN-HMM systems because GMMs are generative models and DNNs are discriminative models. Moreover, DNN-HMM systems contain large numbers of parameters and require a large amount of data from the target speaker for adaptation. In many cases, only a limited amount of adaptation data is available for the target speaker. This thesis proposes multiple methods for adapting a speech recognition system using a limited amount of data (a few words). The first method uses multiple words for accent classification in order to identify variability in speaking style. Next, adaptive phoneme classification is proposed based on the target speaker's similarity to speakers in the training data. Finally, we present adaptation of the ASR system by augmenting the speech features with speaker-specific information learned using sparse coding.