Georgia Tech Researchers Awarded Best Paper at SIAM International Conference on Data Mining

Contact

Joshua Preston

jpreston@cc.gatech.edu

678-231-0787

 

Georgia Institute of Technology researchers Dongryeol Lee, Alexander G. Gray and Richard Vuduc, from the College of Computing, were awarded Best Paper at the SIAM International Conference on Data Mining April 26 for their paper “A Distributed Kernel Summation Framework for General-Dimension Machine Learning.” 

Kernel summations are a common computational bottleneck in many data analysis methods. The paper proposes a hybrid MPI/OpenMP kernel summation framework for scaling many popular data analysis methods. Advantages of the approach include a platform-independent C++ code base built on standard protocols such as MPI and OpenMP; a template code structure that works with any multidimensional binary tree and any approximation scheme suitable for high-dimensional problems; and extensibility to a large class of problems that require fast evaluation of kernel sums.
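
As a rough illustration of the operation involved, a kernel summation computes, for each query point, the sum of a kernel function over every reference point. The sketch below is a brute-force, single-machine version in C++ using a Gaussian kernel. The function names, the choice of kernel and the bandwidth parameter h are assumptions made for illustration and are not taken from the paper, whose framework relies on tree-based approximation and MPI/OpenMP parallelism rather than this direct double loop.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// A point in d-dimensional space, stored as a list of coordinates.
using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double squared_distance(const Point& a, const Point& b) {
    double sum = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) {
        const double diff = a[k] - b[k];
        sum += diff * diff;
    }
    return sum;
}

// Gaussian kernel with bandwidth h (an illustrative choice of kernel).
double gaussian_kernel(const Point& q, const Point& r, double h) {
    return std::exp(-squared_distance(q, r) / (2.0 * h * h));
}

// Brute-force kernel summation: for each query q_i, add up K(q_i, r_j)
// over all reference points r_j. The cost grows as
// (number of queries) x (number of references).
std::vector<double> kernel_sums(const std::vector<Point>& queries,
                                const std::vector<Point>& references,
                                double h) {
    std::vector<double> sums(queries.size(), 0.0);
    for (std::size_t i = 0; i < queries.size(); ++i) {
        for (const Point& r : references) {
            sums[i] += gaussian_kernel(queries[i], r, h);
        }
    }
    return sums;
}
```

Because every query is compared against every reference point, the cost of this direct approach grows with the product of the two set sizes, which is why the computation becomes a bottleneck on large data sets.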

“Researchers have previously parallelized kernel summations in the context of simulations,” says Dongryeol Lee, a Ph.D. candidate in Computer Science. “But this paper is the first serious effort in parallelizing kernel summations in the context of data mining with potentially high-profile scientific applications.”

In data mining, kernel summations appear in the popular class of kernel methods, which can model complex, nonlinear structures in data. The richer expressiveness of these methods comes at a cost: they require many data points and hence more computational power to process the collected data, according to Lee. In some cases the collected data must be stored on multiple machines.

Lee says this work is the first from the data mining community to combine algorithmic techniques from high-performance computing, computer science, computational physics, computational geometry, and approximation theory in a general framework.

Kernel summations drive algorithms in application areas such as finance, astronomy, and medical science. 

Lee notes some examples: “Fraudulent financial transactions can be detected more quickly using fast kernel summations. Astronomy uses the algorithms to predict redshift of many galaxies and stars, which can shed light onto the ultimate fate of the universe. Medicine uses fast kernel summation algorithms in automated early detection of cancer that can save human lives.”

Additional Information

Groups

High Performance Computing (HPC)

Keywords
data analysis, data analytics, data mining
Status
  • Created By: Joshua Preston
  • Workflow Status: Published
  • Created On: May 10, 2012 - 11:21am
  • Last Updated: Oct 7, 2016 - 11:12pm