PhD Proposal by Shanmukha Ramakrishna Vedantam

Event Details

Date/Time:
- Wednesday November 29, 2017
  12:30 pm - 2:30 pm
Location: CCB 247
Fee(s):
N/A

Summaries

Summary Sentence: Connecting Vision and Language for Interpretation, Grounding, and Imagination

Title: Connecting Vision and Language for Interpretation, Grounding, and Imagination

Date: Wednesday, November 29, 2017
Time: 12:30 PM - 2:30 PM (EST)
Location: CCB 247

Shanmukha Ramakrishna Vedantam
Ph.D. Student
School of Interactive Computing
College of Computing
Georgia Institute of Technology

Committee:
Dr. Devi Parikh (Advisor, School of Interactive Computing, Georgia Institute of Technology)
Dr. Dhruv Batra (School of Interactive Computing, Georgia Institute of Technology)
Dr. Jacob Eisenstein (School of Interactive Computing, Georgia Institute of Technology)
Dr. Kevin P. Murphy (Research Scientist, Google Research)
Dr. C. Lawrence Zitnick (Research Manager, Facebook AI Research)

Abstract:

Understanding how to model computer vision and natural language jointly is a long-standing challenge in artificial intelligence. In this thesis, I will study how modeling vision and language in meaningful ways can derive more human-like inferences from machine learning models. Specifically, I will consider three related problems: interpretation, grounding, and imagination.

In interpretation, the goal will be to get machine learning models to understand an image and describe its contents using natural language in a contextually relevant manner. In grounding, I will study how to connect natural language to referents in the physical world, and show how this can help learn common sense. Finally, in proposed work, I will study how to ‘imagine’ visual concepts completely and accurately across the full range and (potentially unseen) compositions of their visual attributes. I will study these problems from computational as well as algorithmic perspectives and suggest exciting directions for future work.

Additional Information

In Campus Calendar

Groups

Graduate Studies

Invited Audience

Faculty/Staff, Public, Graduate students, Undergraduate students

Categories

Other/Miscellaneous

Keywords

PhD proposal

Status

Created By: Tatianna Richardson
Workflow Status: Published
Created On: Nov 22, 2017 - 9:10am
Last Updated: Nov 22, 2017 - 9:10am
