STATISTICS SEMINAR SERIES :: Computational Foundations for Statistics/Machine Learning: Enabling Massive Science

Event Details

Date/Time:
- Thursday September 29, 2005
  11:00 am - 11:59 pm
Location: Executive Room #228
Fee(s):
N/A

Contact

Barbara Christopher
Industrial and Systems Engineering
404.385.3102

Summaries

Full Summary:

The data sciences (statistics and, more recently, machine learning) have always underpinned the natural sciences. "Massive datasets" offer potentially unprecedented capabilities in a growing number of fields, but most of this potential remains untapped because the most powerful statistical and learning methods are computationally intractable. The computational problems underlying many of these methods are related to some of the hardest problems in applied mathematics, yet they have unique properties that make classical solution classes inappropriate. I will describe the beginnings of a unified framework for a large class of such problems, which I call generalized N-body problems. The resulting algorithms, which I call multi-tree methods, appear to be the fastest practical algorithms to date for several foundational problems. I will describe four examples (all-nearest-neighbors, kernel density estimation, distribution-free Bayes classification, and spatial correlation functions) and touch on two more recent projects: kernel matrix-vector multiplication and high-dimensional integration. I will conclude with examples where these algorithms are enabling previously intractable data analyses at the heart of major modern scientific questions in cosmology and fundamental physics, which have been featured in Science and Nature.

Additional Information

Groups

School of Industrial and Systems Engineering (ISYE)

Categories

Seminar/Lecture/Colloquium

Status

Created By: Barbara Christopher
Workflow Status: Published
Created On: Oct 8, 2010 - 7:37am
Last Updated: Oct 7, 2016 - 9:52pm

Georgia Tech