First Topic Model Paper Wins GEM of PODS

*********************************
There is now a CONTENT FREEZE for Mercury while we switch to a new platform. It began on Friday, March 10 at 6pm and will end on Wednesday, March 15 at noon. No new content can be created during this time, but all material in the system as of the beginning of the freeze will be migrated to the new platform, including users and groups. Functionally the new site is identical to the old one. webteam@gatech.edu
*********************************

Atlanta, GA | Posted: October 7, 2019

Contact

Tess Malone, Communications Officer

tess.malone@cc.gatech.edu

Sidebar Content

No sidebar content submitted.

Summaries

Summary Sentence:

The team won a test-of-time award for their impactful work on information retrieval.

Full Summary:

No summary paragraph submitted.

Media

Santosh Vempala compressed
(image/jpeg)

Unsupervised learning relies on finding relevant information in large databases. This is possible thanks in part to groundbreaking research by School of Computer Science Professor Santosh Vempala and his collaborators 20 years ago.

The team won a test-of-time award for their impactful work on information retrieval. The annual conference Principles of Database Systems (PODS) gave their paper, Latent Semantic Indexing: A Probabilistic Analysis, its Gems of PODS honor at the this year’s conference in May.

The 1998 paper analyzed a popular spectral algorithm and introduced the very first topic model, now a standard in unsupervised learning.

The researchers discovered that if a database or corpus is viewed as a matrix, a computer algorithm can perform singular-value decomposition, a matrix reduction technique that pulls out the most significant directions to explain the data. This step not only involves minimal distortion of data but it actually yields better retrieval results than the full original matrix.

Their topic model was able to identify the original underlying topics. The model and guarantees have been significantly enhanced in the decades since the paper was published.

“This was one of the first provable techniques for automatically extracting information from data,” Vempala said.

Their work has influenced prominent computing fields such as spectral methods, data mining, machine learning, and deep neural networks.

Vempala wrote the paper when he was a summer intern at IBM with Prabhakar Raghavan (now VP of Engineering at Google), together with Columbia University Professor Christos Papadimitriou (then at Berkeley), and Meiji University Professor Hisao Tamaki. Papadimitriou gave the award talk.

Additional Information

Groups

School of Computer Science, College of Computing

Categories

No categories were selected.

Related Core Research Areas

No core research areas were selected.

Newsroom Topics

No newsroom topics were selected.

Keywords

No keywords were submitted.

Status

Created By: Tess Malone
Workflow Status: Published
Created On: Oct 7, 2019 - 2:19pm
Last Updated: Oct 11, 2019 - 11:10am

Georgia Tech

First Topic Model Paper Wins GEM of PODS

Additional Information