Ph.D. Proposal by Kisung Lee

Event Details

Date/Time: Monday, October 20, 2014, 9:30 am - 11:30 am
Location: KACB 3402
Fee(s):
N/A

Summaries

Summary Sentence: Scalable Big Data Systems: Architectures and Optimizations

Title: Scalable Big Data Systems: Architectures and Optimizations

Kisung Lee
School of Computer Science
College of Computing
Georgia Institute of Technology

Date: Monday, October 20, 2014
Time: 9:30 AM - 11:30 AM EDT
Location: KACB 3402

Committee:
Dr. Ling Liu (Advisor, School of Computer Science, Georgia Institute of Technology)
Dr. Ed Omiecinski (School of Computer Science, Georgia Institute of Technology)
Dr. Calton Pu (School of Computer Science, Georgia Institute of Technology)
Dr. Karsten Schwan (School of Computer Science, Georgia Institute of Technology)
Dr. Lakshmish Ramaswamy (Department of Computer Science, University of Georgia)

Abstract:
Big data analytics has become not just a buzzword but also a strategic Information Technology direction for many enterprises and government organizations. This dissertation research is dedicated to novel architectural designs and optimization techniques for building big data systems that offer elastic scalability. We have made three novel contributions toward addressing the technical challenges of big data processing, centered on both graph datasets and mobile/spatial datasets. First, we develop a suite of graph partitioning algorithms that run much faster than existing data distribution methods and inherently scale with the growth of big data. The main idea of our approach is to partition a big graph while preserving the core computation data structure as much as possible, in order to maximize intra-server computation and minimize inter-server communication. Second, we have developed a distributed framework for iterative graph computations that maximizes access locality and minimizes distributed messaging cost. Our initial experimental evaluation shows that our approach can significantly outperform Apache Hama on big graphs with a large number of edges. Third, we have developed optimization techniques for scaling mobile data processing along three orthogonal dimensions: (i) scalable processing of a large number of spatial alarms for mobile users traveling on road networks, (ii) scalable location tagging techniques for improving the quality of Twitter data analytics and prediction accuracy, and (iii) a lightweight spatial indexing technique for enhancing the search performance of big spatial data. In this dissertation proposal exam, I will briefly highlight these technical contributions and focus on presenting our semantic hashing-based graph partitioning techniques, including system architecture, optimizations and experimental evaluation.
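
As a rough illustration of the partitioning idea in the abstract (keeping a vertex together with its edges so that more work stays on one server), the following minimal sketch hashes each source vertex to a server and counts the edges that still cross servers. It is not the partitioning algorithm proposed in this dissertation; the function names, the CRC32 hash, and the edge-list input format are all assumptions made for this example.

```python
from collections import defaultdict
import zlib


def stable_hash(key: str) -> int:
    """Deterministic hash (CRC32), an assumption chosen so results repeat across runs."""
    return zlib.crc32(key.encode("utf-8"))


def partition_by_vertex_block(edges, num_servers):
    """Place each source vertex and all of its outgoing edges on one server.

    `edges` is a list of (source, destination) string pairs (hypothetical format).
    Returns a dict mapping server index -> list of edges stored on that server.
    """
    partitions = defaultdict(list)
    for src, dst in edges:
        partitions[stable_hash(src) % num_servers].append((src, dst))
    return dict(partitions)


def count_cut_edges(edges, num_servers):
    """Count edges whose two endpoints are assigned to different servers."""
    return sum(
        1
        for src, dst in edges
        if stable_hash(src) % num_servers != stable_hash(dst) % num_servers
    )


if __name__ == "__main__":
    toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a"), ("d", "a")]
    print(partition_by_vertex_block(toy_edges, 3))
    print("cut edges:", count_cut_edges(toy_edges, 3))
```

The cut-edge count is a simple proxy for inter-server communication: a partitioner that lowers it while keeping server loads balanced leaves more of the computation local to each server.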

Additional Information

In Campus Calendar

Groups

Graduate Studies

Invited Audience

Public

Categories

Other/Miscellaneous

Keywords

coc, graduate students, Phd proposal

Status

Created By: Danielle Ramirez
Workflow Status: Published
Created On: Oct 7, 2014 - 6:25am
Last Updated: Oct 7, 2016 - 10:09pm
