Ph.D. Dissertation Defense - Zhong Meng

Event Details

Date/Time:
- Friday June 22, 2018 - Saturday June 23, 2018
  10:00 am - 11:59 am
Location: Room 5126, Centergy
Fee(s):
N/A

Contact

No contact information submitted.

Summaries

Summary Sentence: Discriminative and Adaptive Training for Robust Speech Recognition and Understanding

Full Summary: No summary paragraph submitted.

Title: Discriminative and Adaptive Training for Robust Speech Recognition and Understanding

Committee:

Dr. Biing-Hwang Juang, ECE, Chair, Advisor

Dr. Chin-Hui Lee, ECE

Dr. Elliott Moore, ECE

Dr. James McClellan, ECE

Dr. Yao Xie, ISyE

Abstract:

Robust automatic speech recognition (ASR) and understanding (ASU) under noisy conditions remains a challenging problem even with the advances of deep learning. To achieve robust ASU, two discriminative training objectives are proposed for keyword spotting and topic classification: (1) To accurately recognize the semantically important keywords, non-uniform error cost minimum classification error training of DNN and BLSTM acoustic models is proposed to minimize the recognition errors of only the keywords. (2) To compensate for the mismatched objectives of speech recognition and understanding, minimum semantic error cost training of the BLSTM acoustic model is proposed to generate semantically accurate lattices for topic classification.

Further, to extend the ASU system to a wider range of conditions, four adaptive training approaches are proposed to improve the robustness of ASR: (1) To suppress the effect of inter-speaker variability on the speaker-independent DNN acoustic model, speaker-invariant training is proposed to learn a deep representation in the DNN that is both senone-discriminative and speaker-invariant through adversarial multi-task training. (2) To achieve condition-robust unsupervised adaptation with parallel data, adversarial teacher-student learning is proposed to suppress multiple factors of condition variability during knowledge transfer from a well-trained source-domain LSTM acoustic model to the target domain. (3) To further improve adversarial learning for unsupervised adaptation with non-parallel data, domain separation networks are used to enhance the domain-invariance of the senone-discriminative deep representation by explicitly modeling the private component that is unique to each domain. (4) To achieve robust far-field ASR, an LSTM adaptive beamforming network is proposed to estimate the beamforming filter coefficients in real time, to cope with non-stationary environmental noise and the dynamic nature of source and microphone positions.

Additional Information

In Campus Calendar

Groups

ECE Ph.D. Dissertation Defenses

Invited Audience

Public

Categories

Other/Miscellaneous

Keywords

Phd Defense, graduate students

Status

Created By: Daniela Staiculescu
Workflow Status: Published
Created On: Jun 7, 2018 - 4:18pm
Last Updated: Jun 7, 2018 - 4:18pm

Georgia Tech
