Title: Leveraging Mid-Level Representations For Complex Activity Recognition
Unaiza Ahsan
Computer Science Ph.D. Student
School of Interactive Computing
College of Computing
Georgia Institute of Technology
Date: Tuesday, Nov 27, 2018
Time: 10:00 AM to 12:00 PM (EST)
Location: College of Computing Building (CCB) 345
Committee:
---------------
Dr. Irfan Essa (Advisor), School of Interactive Computing, Georgia Institute of Technology
Dr. James Hays, School of Interactive Computing, Georgia Institute of Technology
Dr. Devi Parikh, School of Interactive Computing, Georgia Institute of Technology
Dr. Munmun De Choudhury, School of Interactive Computing, Georgia Institute of Technology
Dr. Zsolt Kira, School of Interactive Computing, Georgia Institute of Technology
Dr. Chen Sun, Google
Summary:
---------------
Dynamic scene understanding requires learning representations of the components of a scene, including objects, environments, actions, and events. Complex activity recognition from images and videos typically requires annotating large datasets with action labels, which is tedious and expensive. Thus, there is a need for a mid-level, or intermediate, feature representation that does not require millions of labels, yet generalizes to semantic-level recognition of activities in visual data. This thesis makes three contributions in this regard.
First, we propose an event concept-based intermediate representation that learns concepts from Web data and uses this representation to identify events even with a single labeled example. To demonstrate the strength of the proposed approach, we contribute two diverse social event datasets to the community. We then present a use case of event concepts as a mid-level representation that generalizes to sentiment recognition in diverse social event images.
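The one-labeled-example setting above amounts to nearest-exemplar matching in concept space. The following is an illustrative sketch only: the concept names, scores, and cosine-similarity matching rule are assumptions, not the thesis's actual detectors or classifier.

```python
import numpy as np

# Hypothetical concept-score vectors: each image is described by the
# responses of a bank of Web-learned event-concept detectors.
# Concept names and scores below are purely illustrative.
concepts = ["cake", "balloons", "crowd", "jersey", "podium"]

# One labeled exemplar per event class (the single labeled example).
exemplars = {
    "birthday": np.array([0.9, 0.8, 0.3, 0.0, 0.1]),
    "marathon": np.array([0.0, 0.1, 0.9, 0.8, 0.6]),
}

def cosine(a, b):
    # Cosine similarity between two concept-score vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def classify(query):
    # Predicted event = class of the most similar exemplar.
    return max(exemplars, key=lambda ev: cosine(query, exemplars[ev]))

query = np.array([0.8, 0.7, 0.4, 0.1, 0.0])  # unseen test image
print(classify(query))  # birthday
```

Because the concept detectors are trained from the Web rather than from event labels, only the handful of exemplar vectors needs manual annotation.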
Second, we propose to train Generative Adversarial Networks (GANs) on video frames (which requires no labels), use the trained discriminator from the GAN as an intermediate representation, and fine-tune it on a smaller labeled video activity dataset to recognize actions in videos. This unsupervised pre-training step avoids manual feature engineering, video frame encoding, and searching for the best video frame sampling technique.
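The transfer step can be sketched as freezing the discriminator's hidden layers as a feature extractor and training only a small classifier head on the labeled data. This is a toy numpy sketch, not the thesis's architecture: a random ReLU projection stands in for the trained discriminator, and the labels are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a GAN discriminator trained on unlabeled frames: after
# adversarial training, its hidden layers serve as a frozen mid-level
# feature extractor. Here a fixed random projection plays that role.
W_disc = rng.normal(size=(64, 16))           # frozen "discriminator" weights

def discriminator_features(frames):
    # frames: (n, 64) flattened frames -> (n, 16) mid-level features
    return np.maximum(frames @ W_disc, 0.0)  # ReLU activations

# Fine-tune only a logistic-regression head on a small labeled set.
frames = rng.normal(size=(200, 64))
labels = (frames[:, 0] > 0).astype(float)    # toy binary "action" labels

feats = discriminator_features(frames)
w, b, lr = np.zeros(16), 0.0, 0.1
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))   # sigmoid predictions
    grad = p - labels                            # cross-entropy gradient
    w -= lr * feats.T @ grad / len(labels)
    b -= lr * grad.mean()

preds = 1.0 / (1.0 + np.exp(-(feats @ w + b))) > 0.5
acc = (preds == labels.astype(bool)).mean()
```

Only `w` and `b` are updated; `W_disc` stays fixed, mirroring how the pre-trained discriminator supplies features that the small labeled dataset alone could not learn from scratch.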
Our third contribution is a self-supervised learning approach for videos that exploits both spatial and temporal coherence to learn feature representations from video data without any supervision. We demonstrate the transfer learning capability of this model on smaller labeled datasets. We present a comprehensive experimental analysis of the self-supervised model to provide insights into the unsupervised pre-training paradigm and how it can help with activity recognition on target datasets the model has never seen during training.
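One common way temporal coherence yields free supervision is to sample frame triplets from unlabeled video and label each by whether it appears in temporal order; a network trained on this pretext task learns motion-aware features. The sketch below only generates such pseudo-labels; the sampling scheme and the pretext task itself are illustrative assumptions, not necessarily the one used in the thesis.

```python
import random

random.seed(0)

def make_order_examples(num_frames, n_samples):
    # From an unlabeled clip of `num_frames` frames, build pseudo-labeled
    # triplets: label 1 if frame indices are in temporal order, 0 if shuffled.
    examples = []
    for _ in range(n_samples):
        i, j, k = sorted(random.sample(range(num_frames), 3))
        if random.random() < 0.5:
            examples.append(((i, j, k), 1))   # in order -> positive
        else:
            examples.append(((j, i, k), 0))   # shuffled -> negative
    return examples

for triplet, label in make_order_examples(num_frames=100, n_samples=4):
    print(triplet, label)
```

No human annotation enters this loop: the labels come from the video's own temporal structure, which is what lets the learned features transfer to labeled activity-recognition datasets downstream.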