ARC Colloquium: Ravi Kannan (MSR)

*********************************
There is now a CONTENT FREEZE for Mercury while we switch to a new platform. It began on Friday, March 10 at 6pm and will end on Wednesday, March 15 at noon. No new content can be created during this time, but all material in the system as of the beginning of the freeze will be migrated to the new platform, including users and groups. Functionally the new site is identical to the old one. webteam@gatech.edu
*********************************

Event Details
  • Date/Time:
    • Monday March 25, 2019 - Tuesday March 26, 2019
      11:00 am - 11:59 am
  • Location: Klaus 1116 East
  • Phone:
  • URL:
  • Email:
  • Fee(s):
    N/A
  • Extras:
Contact
No contact information submitted.
Summaries

Summary Sentence: A General Algorithm for Unsupervised Learning problems - Klaus 1116 East at 11 am

Full Summary: No summary paragraph submitted.

Algorithms & Randomness Center (ARC)

Ravi Kannan (MSR)

Monday, March 25, 2019

Klaus 1116E - 11:00 am

 

Title:  A General Algorithm for Unsupervised Learning problems

Abstract:  The following simply-stated geometric problem includes as special cases the core problems of a  number  of  areas  in  Unsupervised  Learning,  including,  Topic  Modeling,  Non-negative  Matrix Factorization, Clustering, Stochastic Block Models and Overalapping Communities Detection:

There is an unknown polytope K  ∈ Rd with k vertices.  We are given n data points, each aperturbation of some point in K.  The problem is to find K, i.e., its vertices (approximately).  [The perturbations are large; indeed, many data points lie outside K.]

Our main result is an algorithm which solves this general problem under two natural assumptions.  Our assumptions are technically different,  but similar in spirit to existing models for the special cases.  We assume separation between the vertices of K  and the existence of “pure” data points whose unperturbed versions are close to the vertices of K.  Notably we do not assume any stochastic  model  of  data.   Our  algorithm  has  better  complexity  than  known  algorithms  for  the special cases when the input matrix A is sparse and k is relatively small compared to n, d.

The algorithm is simply stated, but the proof of correctness is involved.  It draws on tools in Numerical Analysis, especially perturbation of singular spaces of matrices.  Here is a description of our algorithm:  It has k stages; in each stage, it picks a certain random vector u, finds the m largest u · x over data points x and outputs the average of these data points as an approximation to a new vertex of K.

Joint Work with C. Bhattacharyya

 

----------------------------------

Speaker's Webpage

Videos of recent talks are available at: https://smartech.gatech.edu/handle/1853/46836

Click here to subscribe to the seminar email list: arc-colloq@cc.gatech.edu

Additional Information

In Campus Calendar
No
Groups

ARC

Invited Audience
Faculty/Staff, Postdoc, Public, Graduate students
Categories
Conference/Symposium
Keywords
No keywords were submitted.
Status
  • Created By: Francella Tonge
  • Workflow Status: Published
  • Created On: Mar 5, 2019 - 9:33am
  • Last Updated: Mar 19, 2019 - 8:35am