Title: Towards Transparent and Grounded Systems for Visual Question Answering
----------------
Date: Thursday, December 13th, 2018
Time: 11:00am to 12:30pm (ET)
Location: CCB 312A
Yash Goyal
Ph.D. Student in Computer Science
School of Interactive Computing
Georgia Institute of Technology
https://www.cc.gatech.edu/~ygoyal3/
Committee:
----------------
Dr. Dhruv Batra (Advisor; School of Interactive Computing, Georgia Institute of Technology)
Dr. Devi Parikh (School of Interactive Computing, Georgia Institute of Technology)
Dr. Mark Riedl (School of Interactive Computing, Georgia Institute of Technology)
Dr. Trevor Darrell (University of California, Berkeley)
Abstract:
----------------
My research goal is to build transparent and grounded AI systems. Grounding is essential for building reliable and generalizable systems that are not driven by dataset biases. Transparency can help system designers identify these systems' failure modes, and it can provide guidance for teaching humans.
In my thesis, I study these two dimensions -- visual grounding and transparency -- in the context of Visual Question Answering (VQA), where the task for an AI system is to answer natural language questions about images. Specifically, I will present my work on:
1) tackling the language priors present in the popular VQA datasets for abstract scenes and real images to elevate the role of image understanding in VQA, and
2) studying transparency of VQA systems by:
a) building an interpretable VQA model,
b) proposing a new counter-example explanation modality, and
c) using saliency-based visualization techniques to gain insight into what evidence in the input uninterpretable VQA models base their decisions on (a generic sketch of this idea follows the list).
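As a rough illustration of the saliency idea in (2c), the following minimal sketch computes an input-gradient saliency map for an off-the-shelf image classifier in PyTorch. The model choice, preprocessing, and gradient recipe here are illustrative assumptions, not the specific VQA models or visualization techniques studied in the thesis.

import torch
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image

# An off-the-shelf classifier standing in for the visual pathway of a
# (possibly uninterpretable) deep model; this is an assumption for the sketch.
model = models.resnet18(pretrained=True).eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def saliency_map(image_path):
    # Load and preprocess the image, tracking gradients on the pixels.
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    x.requires_grad_(True)
    scores = model(x)            # class logits, shape (1, 1000)
    scores.max().backward()      # gradient of the top class score w.r.t. pixels
    # Per-pixel importance: largest gradient magnitude across RGB channels.
    return x.grad.abs().max(dim=1).values.squeeze(0)   # shape (224, 224)

High values in the returned map mark pixels whose perturbation would most change the model's top score, i.e., the input evidence the model's decision is most sensitive to.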
In the works above, the AI systems are inferior to humans, and explanations serve to identify the systems' errors and improve them. In my proposed work, I will focus on the setting where AI systems are superior to humans and study whether explanations from these systems can be used to teach humans. More specifically, in the context of fine-grained bird recognition, I will study whether deep models can point humans to the right regions of a bird and help them perform better at this hard task.