Title: Reducing Human Labor Cost in Deep Learning for Natural Language Processing
Date: Thursday, April 15, 2021
Time: 12:00 PM (ET) / 9:00 AM (PT)
Location: (BlueJeans meeting link) https://bluejeans.com/206131916
Haoming Jiang
Machine Learning Ph.D. Student
School of Industrial and Systems Engineering
Georgia Institute of Technology
Committee
Dr. Tuo Zhao (Advisor), School of Industrial and Systems Engineering, Georgia Tech
Dr. Weizhu Chen, Microsoft Dynamics 365 AI, Microsoft
Dr. Yao Xie, School of Industrial and Systems Engineering, Georgia Tech
Dr. Diyi Yang, School of Interactive Computing, Georgia Tech
Dr. Chao Zhang, School of Computational Science and Engineering, Georgia Tech
Abstract
Deep learning has fundamentally changed the landscape of natural language processing (NLP). However, training deep learning models requires huge amounts of manually labeled data, which is prohibitively expensive to obtain in some real-world applications. In addition, accurately evaluating models requires human interaction, which is not affordable for large-scale experiments. This dissertation focuses on reducing such human labor costs in deep learning for NLP. Specifically, we develop novel frameworks for training deep learning models with limited or noisy annotation and for estimating human evaluation scores:
Training with Limited Supervision. Many state-of-the-art models are first pre-trained on a large text corpus and then fine-tuned on downstream tasks. However, because downstream tasks offer limited data and pre-trained models are extremely complex, aggressive fine-tuning often causes the fine-tuned model to overfit the downstream training data and fail to generalize to unseen data. To address this issue, we propose a new learning framework for robust and efficient fine-tuning of pre-trained models via regularized optimization. Our experiments show that the proposed framework achieves new state-of-the-art performance on many NLP tasks.
Training with Weak Supervision. When manually labeled data is not available, we can leverage domain expert knowledge to generate weakly labeled data. Weak supervision, although it does not require large amounts of manual annotation, yields highly incomplete and noisy labels derived from external knowledge bases. To address this challenge, we propose a two-stage self-training framework, which leverages the power of pre-trained language models to improve the prediction performance of NLP models. Thorough experiments on benchmark datasets demonstrate the superiority of the proposed framework.
Dialogue Evaluation without Human Interaction. In addition to model training, we also address the problem of reliable, human-free automatic evaluation of dialog systems. An ideal environment for evaluating dialog systems, also known as the Turing Test, requires human interaction, which is usually not affordable for large-scale experiments. To bridge this gap, we propose a new framework named ENIGMA for estimating human evaluation scores based on recent advances in off-policy evaluation in reinforcement learning. Our experiments show that ENIGMA's estimates correlate strongly with human evaluation scores.