Title: Computational Methods for Measurement of Visual Attention from Videos towards Large-scale Behavioral Analysis
Eunji Chong
School of Computer Science
College of Computing
Georgia Institute of Technology
Date: Thursday, January 9th, 2020
Time: 3:30 - 5:30 PM (EST)
Location: TSRB 222
Committee:
Dr. James M. Rehg (Advisor), School of Computer Science, Georgia Institute of Technology
Dr. Agata Rozga, School of Computer Science, Georgia Institute of Technology
Dr. Gregory D. Abowd, School of Computer Science, Georgia Institute of Technology
Dr. Irfan Essa, School of Computer Science, Georgia Institute of Technology
Dr. Yaser Sheikh, Robotics Institute, Carnegie Mellon University
Abstract:
Visual attention is a critically important aspect of human social behavior, visual navigation, and interaction with the 3D environment, and where and what people pay attention to reveals a great deal about their social, cognitive, and affective states. While monitor-based and wearable eye trackers are widely available, they cannot support the large-scale collection of naturalistic gaze data in contexts such as face-to-face social interactions or object manipulation in 3D environments. Wearable eye trackers in particular are burdensome to participants and raise issues of calibration, compliance, cost, and battery life.
This thesis investigates ways to measure real-world human visual attention from plain videos using computer vision, and the use of these measurements to identify meaningful social behaviors. Specifically, three methods are investigated. First, I present a method for detecting looks to the camera in first-person video and its application to eye contact detection. Experimental results show that this method achieves the first human expert-level detection of eye contact. Second, I develop a method for tracking heads in 3D space to measure attentional shifts. Lastly, I propose spatiotemporal deep neural networks for detecting time-varying attention targets in video and present their application to the detection of shared attention and joint attention. The final method achieves state-of-the-art results on multiple attention-measurement benchmark datasets, as well as the first empirical result on clinically relevant gaze shift classification.
The presented approaches have the benefit of linking gaze estimation to the broader tasks of action recognition and dynamic visual scene understanding, and they bear potential as useful tools for understanding attention in various contexts such as human social interaction, skill assessment, and human-robot interaction.