Title: Structured Visual Understanding, Generation and Reasoning
Jianwei Yang
Ph.D. Candidate in Computer Science
School of Interactive Computing
Georgia Institute of Technology
https://www.cc.gatech.edu/~jyang375/
Date: Thursday, January 2nd, 2020
Time: 4:00-6:00 PM (EST)
Location: Coda C1003 Adair
BlueJeans: https://bluejeans.com/998872971
Committee:
Dr. Devi Parikh (Advisor), School of Interactive Computing, Georgia Institute of Technology
Dr. Dhruv Batra, School of Interactive Computing, Georgia Institute of Technology
Dr. David Crandall, School of Informatics, Computing and Engineering, Indiana University
Dr. Stefan Lee, School of Electrical Engineering and Computer Science, Oregon State University
Dr. Judy Hoffman, School of Interactive Computing, Georgia Institute of Technology
Abstract:
The world around us is highly structured. In the real world, multiple objects usually coexist in a scene and interact with each other in predictable ways (e.g., mug on table, keyboard below computer monitor), and a single object usually consists of multiple components in a structured configuration (e.g., a person has different body parts). These structures manifest themselves in the visual data that captures the world around us, and thus can potentially provide a strong inductive bias for various vision tasks. In this talk, I will discuss how to integrate such structural priors into different tasks, including visual understanding, generation and reasoning.
Across these different levels of tasks, we demonstrate that modeling the structures in visual data and the associated text can not only improve model performance but also increase model transparency. Finally, I will briefly discuss the challenges in this domain and possible extensions of recent work.