Title: Learning Representations Toward the Understanding of Out-of-Distribution for Neural Networks
Committee:
Dr. Ghassan Al-Regib, ECE, Chair, Advisor
Dr. Justin Romberg, ECE
Dr. Mark Davenport, ECE
Dr. Avinash Ravichandran, AWS AI
Dr. Eva Dyer, BME
Abstract: Data-driven representations achieve powerful generalization performance in diverse information processing tasks. However, this generalization is often limited to test data drawn from the same distribution as the training data (in-distribution (ID)). In addition, neural networks often make overconfident and incorrect predictions on data outside the training distribution, called out-of-distribution (OOD) data. In this dissertation, we develop representations that characterize OOD data for neural networks and utilize this characterization to generalize efficiently to OOD data. We categorize data-driven representations based on the information flow in neural networks and develop novel gradient-based representations. In particular, we utilize backpropagated gradients to represent what the neural network has not learned from the data. The capability of gradient-based representations to characterize OOD data is comprehensively analyzed in comparison with standard activation-based representations. We also develop a regularization technique for the gradient-based representations to better characterize OOD data. Using this gradient constraint, we develop an anomaly detection algorithm named GradCon, which achieves state-of-the-art performance in OOD detection. We also propose activation-based representations learned with auxiliary information to generalize efficiently to OOD data. We use an unsupervised learning framework to learn aligned representations of visual and attribute data. These aligned representations are utilized to calibrate overconfident predictions toward ID classes. The generalization performance of the aligned representations is validated in the application of generalized zero-shot learning (GZSL). Our GZSL method, GatingAE, achieves state-of-the-art performance in generalizing to OOD data without using labeled OOD data. It also achieves balanced performance on both ID and OOD data by mitigating the prediction bias present in the network. Finally, GatingAE requires significantly fewer model parameters than other state-of-the-art methods.
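The core intuition behind the gradient-based representations described above can be sketched in a toy setting: if a model has learned to reconstruct ID data, the backpropagated gradient of the reconstruction loss is small for ID inputs and large for inputs the model has not learned. The tied-weight linear autoencoder, loss, and score below are illustrative assumptions for this sketch, not the dissertation's actual architecture or the GradCon algorithm.

```python
import numpy as np

def recon_loss_grad(W, x):
    """Gradient of 0.5 * ||W W^T x - x||^2 with respect to W for a
    tied-weight linear autoencoder (illustrative toy model)."""
    z = W.T @ x          # encode
    x_hat = W @ z        # decode
    r = x_hat - x        # reconstruction error
    # Product rule over the tied weights: dL/dW = r z^T + x (W^T r)^T
    return np.outer(r, z) + np.outer(x, r @ W)

def gradient_score(W, x):
    """OOD score: magnitude of the backpropagated gradient. A large
    gradient suggests the model has not learned to represent x."""
    return np.linalg.norm(recon_loss_grad(W, x))

# W spans the first axis, mimicking a model trained on that subspace.
W = np.array([[1.0], [0.0]])
x_id = np.array([3.0, 0.0])   # perfectly reconstructed: zero gradient
x_ood = np.array([1.0, 2.0])  # poorly reconstructed: large gradient
```

In this toy example the ID input yields a zero gradient while the OOD input does not, mirroring the idea that gradients capture what the network has not learned.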