Title: Opportunistic Use of Video Data for Wearable-Based Human Activity Recognition
Hyeokhyen Kwon
Ph.D. student in Computer Science
School of Interactive Computing
College of Computing
Georgia Institute of Technology
Date: Monday, October 19th, 2020
Time: 10:00 AM to 12:00 PM (EDT)
Location: https://bluejeans.com/3537894193
Committee:
Dr. Gregory D. Abowd (Advisor) – School of Interactive Computing, Georgia Institute of Technology
Dr. Thomas Ploetz (Co-Advisor) – School of Interactive Computing, Georgia Institute of Technology
Dr. Thad Starner – School of Interactive Computing, Georgia Institute of Technology
Dr. Irfan Essa – School of Interactive Computing, Georgia Institute of Technology
Dr. Nicholas D. Lane – Department of Computer Science and Technology, University of Cambridge
Abstract:
Wearable Inertial Measurement Unit (IMU)-based human activity recognition is at the core of continuous monitoring for human well-being, as it can detect precursors of health risks in everyday life. Conventionally, wearable sensor data is collected from recruited users, which makes user engagement expensive and annotation time-consuming. Due to this lack of large-scale labeled datasets, wearable-based human activity recognition models have yet to see significant improvements in recognition performance. To tackle these scale limitations, this dissertation proposes a novel approach that harvests existing video data from virtually unlimited repositories such as YouTube. I introduce an automated processing pipeline that integrates existing computer vision and signal processing techniques to convert human activity videos into virtual IMU data streams. I show how the virtually generated IMU data improves the performance of various models on known human activity recognition datasets. I also propose approaches that improve the quality of the generated virtual IMU data and reduce the domain gap between virtual and real IMU data. To further improve recognition accuracy, I discuss a novel model training approach that handles noisy human activity annotations in video datasets. This dissertation shows the promise of using video as a novel source for human activity recognition with wearables, representing a paradigm shift in how robust human activity recognition systems can be built.
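
For readers unfamiliar with the notion of virtual IMU data, the minimal Python sketch below illustrates one way such a signal can be derived, assuming 3D joint trajectories have already been extracted from video with an off-the-shelf pose estimator. The function name, joint choice, frame rate, and gravity convention here are illustrative assumptions, not the actual pipeline presented in the dissertation.

import numpy as np

def virtual_accelerometer(positions, fps):
    """Estimate linear acceleration (m/s^2) at a tracked body joint.

    positions: (T, 3) array of 3D joint positions in meters, one per video frame.
    fps: video frame rate in Hz.
    Returns a (T-2, 3) array of virtual accelerometer samples.
    """
    dt = 1.0 / fps
    # Second-order central finite difference approximates acceleration:
    # a[t] ~ (p[t+1] - 2*p[t] + p[t-1]) / dt^2
    accel = (positions[2:] - 2 * positions[1:-1] + positions[:-2]) / dt ** 2
    # Add gravity along the world z-axis so the stream resembles what a real
    # accelerometer reports (an assumed convention).
    accel[:, 2] += 9.81
    return accel

# Example: a synthetic 10-second wrist trajectory from a 30 fps clip.
rng = np.random.default_rng(0)
wrist = np.cumsum(rng.normal(scale=0.01, size=(300, 3)), axis=0)
print(virtual_accelerometer(wrist, fps=30.0).shape)  # (298, 3)

In practice, pose estimates from video are noisy, so a real pipeline would also need filtering, calibration to sensor placement, and handling of camera motion, which is part of what makes closing the gap between virtual and real IMU data a research question in its own right.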