Title: Multi-object Tracking from the Classics to the Modern
Committee:
Dr. James Rehg, IC, Chair, Advisor
Dr. Mark Clements, ECE, Co-Advisor
Dr. Patricio Vela, ECE
Dr. James Hays, IC
Dr. Bernt Schiele, Max-Planck-Institut für Informatik
Dr. Fuxin Li, Oregon State University
Abstract: Visual object tracking is a computer vision problem that has been researched extensively over the past several decades. Many computer vision applications, such as robotics, autonomous driving, and video surveillance, require the capability to track multiple objects in videos. The most popular approach to tracking multiple objects follows the tracking-by-detection paradigm, in which the tracking problem is divided into object detection and data association. In data association, track proposals are often generated by extending the object tracks from the previous frame with new detections in the current frame. The association algorithm then uses a track scorer or classifier to evaluate the track proposals in order to estimate the correspondence between object detections and object tracks. In this dissertation, I present novel track scorers and track classifiers that make predictions based on long-term object motion and appearance cues. First, I present an online learning algorithm that can efficiently train a track scorer based on a long-term appearance model for the classical Multiple Hypothesis Tracking (MHT) framework. I show that the classical MHT framework is effective even in modern tracking settings in which strong object detectors and strong appearance models are available. Second, I present a novel Bilinear LSTM model as a deep, long-term appearance model that serves as the basis for an end-to-end learned track classifier. I incorporate this track classifier into the classical MHT framework in order to demonstrate its effectiveness in object tracking. Third, I present a novel multi-track pooling module that enables the Bilinear LSTM-based track classifier to consider all the objects in the scene simultaneously in order to better handle appearance ambiguities between different objects. I use this track classifier in a simple, greedy data association algorithm and achieve real-time, state-of-the-art tracking performance.
I evaluate the proposed methods in this dissertation on public multi-object tracking datasets that capture challenging object tracking scenarios in urban areas.
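For readers unfamiliar with the tracking-by-detection paradigm, the greedy data association step mentioned in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names (`iou`, `greedy_associate`), the `min_score` threshold, and the use of bounding-box overlap as the track scorer are all assumptions made for illustration. In the dissertation the scorer is a learned, long-term appearance and motion model, for which the simple overlap score below is only a stand-in.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes (placeholder track scorer)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_associate(
    tracks: Dict[int, Box],
    detections: List[Box],
    scorer: Callable[[Box, Box], float] = iou,
    min_score: float = 0.3,  # hypothetical threshold, chosen for illustration
) -> Dict[int, int]:
    """Match each track to at most one detection, best score first.

    Returns a mapping from track id to detection index. Tracks and
    detections left unmatched are simply absent from the result.
    """
    # Every (track, detection) pair is a track proposal to be scored.
    proposals = [
        (scorer(box, det), track_id, det_idx)
        for track_id, box in tracks.items()
        for det_idx, det in enumerate(detections)
    ]
    proposals.sort(key=lambda p: p[0], reverse=True)

    matches: Dict[int, int] = {}
    used_detections = set()
    for score, track_id, det_idx in proposals:
        if score < min_score:
            break  # all remaining proposals score even lower
        if track_id in matches or det_idx in used_detections:
            continue  # this track or detection is already assigned
        matches[track_id] = det_idx
        used_detections.add(det_idx)
    return matches


tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
matches = greedy_associate(tracks, detections)
```

In this toy example, track 1 is matched to the second detection and track 2 to the first, while the third detection stays unmatched and would typically start a new track. Because the scorer is passed in as a parameter, a learned track classifier could be substituted for `iou` without changing the association loop.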