PhD Proposal by Yash Goyal


Event Details
  • Date/Time:
    • Thursday, December 13, 2018
      11:00 am - 12:30 pm
  • Location: CCB 312A

Title: Towards Transparent and Grounded Systems for Visual Question Answering

----------------

 

Date: Thursday, December 13th, 2018

Time: 11:00am to 12:30pm (ET)

Location: CCB 312A

 

Yash Goyal

Ph.D. Student in Computer Science 

School of Interactive Computing

Georgia Institute of Technology

https://www.cc.gatech.edu/~ygoyal3/

 

 

Committee:

----------------

Dr. Dhruv Batra (Advisor; School of Interactive Computing, Georgia Institute of Technology)

Dr. Devi Parikh (School of Interactive Computing, Georgia Institute of Technology)

Dr. Mark Riedl (School of Interactive Computing, Georgia Institute of Technology)

Dr. Trevor Darrell (University of California, Berkeley)

 

Abstract:

----------------

My research goal is to build transparent and grounded AI systems. Grounding is essential for building reliable and generalizable systems that are not driven by dataset biases. Transparency can help system designers identify a system's failure modes and can also provide guidance to teach humans.

 

In my thesis, I study these two dimensions, visual grounding and transparency, in the context of Visual Question Answering (VQA), where the task for an AI system is to answer natural language questions about images. Specifically, I will present my work on:

1) tackling the language priors present in the popular VQA datasets for abstract scenes and real images to elevate the role of image understanding in VQA, and

2) studying transparency of VQA systems by:

    a) building an interpretable VQA model,

    b) proposing a new counter-example explanation modality, and

    c) using saliency-based visualization techniques to gain insight into what evidence in the input uninterpretable VQA models base their decisions on (a minimal illustrative sketch follows this list).
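
To give a rough sense of the saliency idea in 2(c): such techniques attribute an answer's score back to the input pixels, for example via the gradient of that score with respect to the image. The minimal PyTorch sketch below is illustrative only and is not taken from this talk; vqa_model, its inputs, and answer_idx are hypothetical placeholders.

import torch

def saliency_map(vqa_model, image, question, answer_idx):
    # Gradient-based saliency: |d(answer score) / d(pixel)| as a crude evidence map.
    image = image.clone().requires_grad_(True)   # track gradients w.r.t. the input pixels
    logits = vqa_model(image, question)          # forward pass: scores over candidate answers
    logits[0, answer_idx].backward()             # backpropagate the chosen answer's score
    # Max absolute gradient across color channels gives one heat value per pixel.
    return image.grad.abs().max(dim=1).values

Overlaying this map on the image gives a coarse picture of where the model is "looking" when it produces a given answer.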

 

In the above works, AI systems perform worse than humans, and explanations are used to identify their errors and improve them. In my proposed work, I will focus on the setting where AI systems are superior to humans and will study whether explanations from these systems can be used to teach humans. More specifically, in the context of a fine-grained bird recognition task, I will study whether deep models can guide humans to look at the right regions of the birds and help them perform better at this hard task.

Additional Information

In Campus Calendar
No
Groups

Graduate Studies

Invited Audience
Faculty/Staff, Public, Graduate students, Undergraduate students
Categories
Other/Miscellaneous
Keywords
PhD proposal
Status
  • Created By: Tatianna Richardson
  • Workflow Status: Published
  • Created On: Dec 12, 2018 - 12:08pm
  • Last Updated: Dec 12, 2018 - 12:08pm