Center for Signal and Information Processing Seminar


Event Details
  • Date/Time:
    • Friday, October 29, 2021
      3:00 pm - 4:00 pm
  • Location: Virtual
  • Fee(s):
    N/A
Contact

Ghassan AlRegib

alregib@gatech.edu

Raquel Plaskett

raquel.plaskett@ece.gatech.edu

Huck Yang

huckiyang@gatech.edu

Center for Signal and Information Processing 

School of Electrical and Computer Engineering

Summaries

Summary Sentence: Jinyu Li will deliver the October 29 CSIP Seminar, which is entitled "Advances in end-to-end automatic speech recognition."

Full Summary: Jinyu Li will deliver the October 29 CSIP Seminar, which is entitled "Advances in end-to-end automatic speech recognition."

Date: Friday, Oct. 29, 2021

Time: 3:00 pm to 4:00 pm

BlueJeans Link: https://bluejeans.com/4658604304

Speaker: Jinyu Li

Affiliation: Microsoft Corporation, Redmond, Washington

Title: Advances in end-to-end automatic speech recognition

Bio: Jinyu Li received a Ph.D. degree from the Georgia Institute of Technology, Atlanta, in 2008. He is currently a Partner Applied Scientist and Technical Lead at Microsoft Corporation, Redmond, Washington, USA, where he leads a team that designs and improves speech modeling algorithms and technologies to ensure industry state-of-the-art speech recognition accuracy for Microsoft. His major research interests cover several topics in speech recognition, including end-to-end modeling, deep learning, and noise robustness. He is the lead author of the book "Robust Automatic Speech Recognition -- A Bridge to Practical Applications" (Academic Press, October 2015). He has been a member of the IEEE Speech and Language Processing Technical Committee since 2017 and served as an associate editor of the IEEE/ACM Transactions on Audio, Speech, and Language Processing from 2015 to 2020.

Abstract: Recently, the speech community has seen a significant shift from deep neural network-based hybrid modeling to end-to-end (E2E) modeling for automatic speech recognition (ASR). While E2E models achieve state-of-the-art ASR accuracy on most benchmarks, hybrid models are still used in a large proportion of commercial ASR systems today. Many practical factors affect the decision to deploy a model in production, and traditional hybrid models, having been optimized for production over decades, usually handle these factors well. Without excellent solutions to all of these factors, it is hard for E2E models to be widely commercialized. In this talk, I will give an overview of recent advances in E2E models, focusing on technologies that address these challenges from an industry perspective.
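For readers unfamiliar with E2E ASR, the sketch below illustrates the basic idea of a single end-to-end model mapping audio directly to text. It is not taken from the talk; it is a minimal example assuming PyTorch and torchaudio, using the pretrained CTC-based WAV2VEC2_ASR_BASE_960H bundle and a hypothetical local file named speech.wav. Production systems of the kind discussed in the talk involve far more (streaming, customization, latency constraints, etc.).

```python
import torch
import torchaudio

# Load a pretrained end-to-end (CTC-based) ASR model shipped with torchaudio.
bundle = torchaudio.pipelines.WAV2VEC2_ASR_BASE_960H
model = bundle.get_model()

# Read audio and resample to the rate the model expects (16 kHz for this bundle).
# "speech.wav" is a placeholder path for illustration.
waveform, sample_rate = torchaudio.load("speech.wav")
waveform = torchaudio.functional.resample(waveform, sample_rate, bundle.sample_rate)

# One forward pass produces per-frame label scores; no separate acoustic,
# pronunciation, or language model as in a hybrid system.
with torch.inference_mode():
    emissions, _ = model(waveform)

# Greedy CTC decoding: take the best label per frame, collapse repeats, drop blanks.
labels = bundle.get_labels()  # "-" is the CTC blank, "|" is the word separator
indices = torch.unique_consecutive(torch.argmax(emissions[0], dim=-1))
transcript = "".join(labels[i] for i in indices if labels[i] != "-").replace("|", " ")
print(transcript)
```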

Additional Information

In Campus Calendar
No
Groups

School of Electrical and Computer Engineering

Invited Audience
Faculty/Staff, Public, Undergraduate students
Categories
No categories were selected.
Keywords
No keywords were submitted.
Status
  • Created By: Jackie Nemeth
  • Workflow Status: Published
  • Created On: Oct 20, 2021 - 11:49am
  • Last Updated: Oct 20, 2021 - 11:49am