Title: On Using Inductive Biases for Designing Deep Learning Architectures
Date: Wednesday, December 9th, 2020
Time: 13:00 to 15:00 (EST)
BlueJeans: https://bluejeans.com/225189060
Harsh Shrivastava
Machine Learning PhD Student
School of Computational Science and Engineering
Georgia Institute of Technology
I will go over two novel and generic approaches for designing deep learning architectures that incorporate our domain knowledge about the problem under consideration. 'Cooperative Neural Networks' take their inductive biases from the underlying probabilistic graphical models, while the problem-dependent 'Unrolled Algorithms' are designed using the structure obtained by unrolling an optimization algorithm on an objective function of interest as a template. We found that the neural network architectures obtained from our approaches typically end up with very few learnable parameters and provide considerable improvements in run time compared to other deep learning methods. We have applied our techniques to NLP-related tasks and to problems in finance, healthcare, and computational biology.
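To make the 'Unrolled Algorithms' idea concrete, here is a minimal sketch of the general recipe: a few iterations of gradient descent on a simple least-squares objective are unrolled into a network whose only learnable parameters are the per-iteration step sizes. The objective, initialization, and iteration count are illustrative assumptions, not the specific architectures presented in the talk.

```python
# A generic sketch of the "unrolled algorithm" idea: a fixed number of
# gradient-descent steps on a known objective (here 0.5 * ||Ax - y||^2)
# become the layers of a small network with learnable step sizes.
import torch
import torch.nn as nn


class UnrolledGradientDescent(nn.Module):
    def __init__(self, num_iterations: int = 5):
        super().__init__()
        # One learnable step size per unrolled iteration.
        self.step_sizes = nn.Parameter(torch.full((num_iterations,), 0.1))

    def forward(self, A: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # Start from zero and apply a fixed number of gradient steps;
        # each step plays the role of one "layer".
        x = torch.zeros(A.shape[1])
        for step in self.step_sizes:
            grad = A.T @ (A @ x - y)   # gradient of 0.5 * ||Ax - y||^2
            x = x - step * grad
        return x


# Usage: the unrolled model has very few parameters (one per iteration)
# and can be trained end-to-end with any standard loss.
model = UnrolledGradientDescent(num_iterations=5)
A, y = torch.randn(8, 3), torch.randn(8)
x_hat = model(A, y)
```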
There are three components of my thesis:
Firstly, I will go through the Cooperative Neural Network (CoNN-sLDA) approach that we developed for the document classification task. We use the popular Latent Dirichlet Allocation graphical model as the inductive bias for the CoNN-sLDA model, and we demonstrate a 23% reduction in error on the challenging MultiSent data set compared to the previous state of the art.
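As a hypothetical illustration of how a graphical model's inference structure can serve as an inductive bias for document classification, the sketch below refines word-level and document-level embeddings cooperatively over a few rounds, loosely mirroring iterative inference in an LDA-like model, before a classifier reads the document embedding. The layer shapes, update rules, and round count are assumptions made for illustration, not the published CoNN-sLDA equations.

```python
# Hypothetical sketch: embeddings of words and of the whole document are
# updated cooperatively for a fixed number of rounds, then the document
# embedding is classified. Sizes and update forms are illustrative only.
import torch
import torch.nn as nn


class CooperativeDocumentClassifier(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=64, num_classes=2, rounds=3):
        super().__init__()
        self.rounds = rounds
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        self.word_update = nn.Linear(2 * embed_dim, embed_dim)
        self.doc_update = nn.Linear(embed_dim, embed_dim)
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, word_ids: torch.Tensor) -> torch.Tensor:
        # word_ids: (num_words,) token indices for one document.
        words = self.word_embed(word_ids)      # (num_words, embed_dim)
        doc = words.mean(dim=0)                # initial document summary
        for _ in range(self.rounds):
            # Words look at the current document summary...
            ctx = doc.expand_as(words)
            words = torch.tanh(self.word_update(torch.cat([words, ctx], dim=-1)))
            # ...and the document summary is re-estimated from the words.
            doc = torch.tanh(self.doc_update(words.mean(dim=0)))
        return self.classifier(doc)


model = CooperativeDocumentClassifier()
logits = model(torch.randint(0, 1000, (30,)))
```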
Secondly, I will explain the idea of using 'Unrolled Algorithms' for the sparse graph recovery task. We propose a deep learning architecture, GLAD, which uses an Alternating Minimization algorithm as its inductive bias and learns the model parameters via supervised learning. We show that GLAD learns a very compact and effective model for recovering sparse graphs from data.
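The following is a minimal sketch, under simplifying assumptions, of unrolled alternating minimization for sparse precision-matrix (graph) recovery: each unrolled step takes a gradient step on the Gaussian log-likelihood fit and then applies soft-thresholding, with a learnable step size and threshold per iteration. The scalar per-iteration parameters are a simplification; the actual GLAD architecture is richer.

```python
# Simplified sketch of unrolled alternating minimization for recovering a
# sparse precision matrix (graph) from an empirical covariance matrix.
import torch
import torch.nn as nn


def soft_threshold(x: torch.Tensor, tau: torch.Tensor) -> torch.Tensor:
    # Elementwise shrinkage that promotes sparsity in the recovered graph.
    return torch.sign(x) * torch.clamp(torch.abs(x) - tau, min=0.0)


class UnrolledSparseGraphRecovery(nn.Module):
    def __init__(self, num_steps: int = 10):
        super().__init__()
        # One learnable step size and threshold per unrolled iteration.
        self.step_sizes = nn.Parameter(torch.full((num_steps,), 0.1))
        self.thresholds = nn.Parameter(torch.full((num_steps,), 0.05))

    def forward(self, cov: torch.Tensor) -> torch.Tensor:
        d = cov.shape[0]
        theta = torch.eye(d)  # initial precision-matrix estimate
        for step, tau in zip(self.step_sizes, self.thresholds):
            # Data-fit step (gradient of the Gaussian negative log-likelihood
            # is cov - theta^{-1}), followed by sparsity-inducing shrinkage.
            grad = cov - torch.linalg.inv(theta)
            theta = soft_threshold(theta - step * grad, tau)
        return theta


model = UnrolledSparseGraphRecovery()
x = torch.randn(200, 5)
emp_cov = (x.T @ x) / x.shape[0]
theta_hat = model(emp_cov)
```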
Finally, I will walk through our approach to solving problems related to single-cell RNA sequencing data. Specifically, we design a novel gene regulatory network reconstruction framework called 'GRNUlar'. Our method leverages the expressive ability of neural networks in a multi-task learning framework combined with our 'Unrolled Algorithms' technique. To the best of our knowledge, our work is the first to successfully use expression data simulators for supervised learning of gene regulatory networks from single-cell RNA-seq data.
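To illustrate the simulator-driven supervised learning idea, here is a toy sketch: a stand-in linear-Gaussian simulator produces (expression matrix, ground-truth regulatory graph) pairs, and a simple per-edge model is trained with a binary cross-entropy loss to recover the graph. Both the toy simulator and the placeholder edge predictor are assumptions made for illustration; they are not an actual single-cell expression simulator or the unrolled GRNUlar architecture described above.

```python
# Toy sketch of supervised GRN inference driven by simulated data: the
# simulator provides ground-truth graphs, so the model can be trained with
# an ordinary supervised loss instead of unsupervised structure learning.
import torch
import torch.nn as nn


def simulate_toy_grn(num_genes=10, num_cells=200):
    # Toy stand-in for a single-cell expression simulator: sample a sparse
    # regulatory graph, then generate expression as noisy linear propagation.
    adjacency = (torch.rand(num_genes, num_genes) < 0.1).float()
    adjacency.fill_diagonal_(0.0)
    base = torch.randn(num_cells, num_genes)
    expression = base + base @ adjacency.T + 0.1 * torch.randn(num_cells, num_genes)
    return expression, adjacency


class EdgePredictor(nn.Module):
    # Placeholder model: scores each candidate edge from the empirical
    # gene-gene covariance of the expression matrix.
    def __init__(self, num_genes=10):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(num_genes, num_genes))
        self.bias = nn.Parameter(torch.zeros(num_genes, num_genes))

    def forward(self, expression: torch.Tensor) -> torch.Tensor:
        centered = expression - expression.mean(dim=0)
        cov = centered.T @ centered / expression.shape[0]
        return self.scale * cov + self.bias  # edge logits


model = EdgePredictor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

for _ in range(100):  # supervised training on freshly simulated problems
    expression, true_graph = simulate_toy_grn()
    loss = loss_fn(model(expression), true_graph)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```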