Title: Multi-object Tracking from the Classics to the Modern
Committee:
Dr. Rehg, Advisor
Dr. Clements, Co-Advisor
Dr. Vela, Chair
Dr. Hays
Dr. Schiele
Abstract:
The objective of this research is to design a computer vision algorithm that tracks multiple objects of interest in monocular video. Visual object tracking has been researched extensively over the past several decades. Many computer vision applications, such as robotics, autonomous driving, and video surveillance, require the capability to track multiple objects in videos. Despite the problem's importance and long history, the accuracy of recent multi-object trackers still falls short of human performance. In this work, I provide several approaches to multi-object tracking that allow us to efficiently extract accurate 2D or 3D motion trajectories of objects from monocular videos. I will also discuss future research directions in this domain by presenting challenging scenarios where modern trackers still struggle. In the first part of the work, I present approaches to 2D object tracking: (1) an online appearance learning method that is well suited to the classical Multiple Hypothesis Tracking (MHT) framework and (2) data-driven appearance learning methods that use a Bilinear LSTM, a novel deep model based on insights drawn from recursive least squares. In the second part of the work, I propose an approach to 3D object tracking that allows us to track multiple objects in real-world coordinates from monocular videos.