Le Song
Postdoctoral Fellow at the School of Computer Science, Carnegie Mellon University
Title:
Modeling Rich Structured Data via Kernel Distribution Embeddings
Abstract:
Real-world applications often produce large volumes of highly uncertain and complex data. Much of this data has rich microscopic structure, where each variable can take values on manifolds (e.g., camera rotations), combinatorial objects (e.g., texts, graphs of drug compounds) or high-dimensional continuous domains (e.g., images and videos). Furthermore, these problems may possess additional macroscopic structure, where large collections of observed and hidden variables are connected by networks of conditional independence relations (e.g., in predicting depth from still images, and forecasting in time-series).
Most previous learning algorithms for problems with such rich structures rely heavily on linear relations and parametric models, where data are typically assumed to be multivariate Gaussian or discrete with a relatively small number of values. Conclusions inferred under these restrictive assumptions can be misleading if the underlying data-generating processes contain nonlinear, non-discrete, or non-Gaussian components.
How can we find a suitable representation for nonlinear and non-Gaussian relationships in a data-driven fashion? How can we exploit conditional independence structures between variables in richly structured settings? How can we design efficient algorithms to solve challenging nonparametric problems involving large amounts of data?
In this talk, I will introduce a nonparametric representation for distributions, called kernel embeddings, that is capable of addressing problems with both microscopic and macroscopic structures. The key idea of the method is to map distributions to their expected features (potentially infinite-dimensional) and, given evidence, to update these new representations solely in the feature space. Compared to existing nonparametric representations, which are largely restricted to vectorial data and usually lead to intractable algorithms, kernel distribution embeddings very often lead to simpler, faster and more accurate algorithms in a diverse range of problems, such as organizing photo albums, understanding social networks, retrieving documents across languages, predicting depth from still images and forecasting sensor time-series.
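As a rough illustration of mapping a distribution to its expected features, the sketch below computes an empirical kernel mean embedding from samples and compares two embeddings by their squared distance in feature space. It is not taken from the talk: the Gaussian RBF kernel, the bandwidth, the synthetic data, and all function names are assumptions chosen for illustration only.

```python
import math
import random

def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)).
    The kernel choice is an assumption; any positive definite kernel would do."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def embedding_at(sample, point, bandwidth=1.0):
    """Evaluate the empirical mean embedding (1/n) * sum_i k(x_i, .) at `point`."""
    return sum(rbf_kernel(x, point, bandwidth) for x in sample) / len(sample)

def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y): the inner product of the
    two empirical embeddings in feature space."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def squared_mmd(xs, ys, bandwidth=1.0):
    """Squared feature-space distance between two empirical embeddings
    (the biased estimate, which includes the diagonal terms)."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))

def gaussian_sample(n, mean, rng):
    """Draw n two-dimensional points with independent unit-variance coordinates."""
    return [(rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)) for _ in range(n)]

# Synthetic data (hypothetical): two samples from the same distribution,
# and one sample from a shifted distribution.
rng = random.Random(0)
p1 = gaussian_sample(100, 0.0, rng)
p2 = gaussian_sample(100, 0.0, rng)
q = gaussian_sample(100, 2.0, rng)

mmd_same = squared_mmd(p1, p2)
mmd_diff = squared_mmd(p1, q)
print("same distribution:   ", mmd_same)
print("shifted distribution:", mmd_diff)
```

Samples drawn from the same distribution should give embeddings that lie close together, while the shifted sample should give a clearly larger distance. The embedding itself is never written out as an explicit feature vector; everything is computed through kernel evaluations, which is what allows the features to be infinite-dimensional.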
Bio:
Le Song is a postdoctoral fellow at the School of Computer Science, Carnegie Mellon University, working with a number of professors, including Eric Xing, Carlos Guestrin, Geoff Gordon and Jeff Schneider. Prior to that, he obtained his Ph.D. degree in computer science from the University of Sydney and National ICT Australia in 2008 under the supervision of Alex Smola. He has conducted research in statistical machine learning and data mining, with a primary focus on kernel methods, probabilistic graphical models, and network analysis. He is also interested in large-scale machine learning problems, and in machine learning applications in computational biology, texts, images, and network analysis.
To receive future announcements, please sign up to the cse-seminar email list: https://mailman.cc.gatech.edu/mailman/listinfo/cse-seminar