Title: Memory-Efficient Distributed Parallel Frameworks using Compressed Buffer Trees
Hrishikesh Amur
School of Computer Science
College of Computing
Georgia Institute of Technology
Date: Friday August 31st, 2012
Time: 3:00PM - 5:00PM (EST) - UPDATED
Location: KACB 3402
Committee:
Abstract:
Memory is a valuable commodity in datacenters. DRAM is expensive to purchase and consumes a significant amount of power. With the number of cores per socket growing faster than the memory capacity per socket, memory is increasingly scarce. Given the rise of data-intensive computing, this focus on memory gains increased relevance. Data-intensive computing systems are primarily designed to operate on large amounts of data from storage. However, to overcome the high latencies associated with disk access, applications commonly keep performance-sensitive data in memory. Scarcity of memory can therefore significantly degrade the performance of distributed applications.
In this thesis, we introduce techniques for memory efficiency without compromising performance. We introduce a novel data structure called the Compressed Buffer Tree (CBT), which stores data in a memory-efficient form and allows computation to be executed on the data with high throughput. The CBT achieves memory efficiency through the efficient application of data compression and the offloading of state to disk. We demonstrate the utility of the CBT through implementations of high-performance, memory-efficient runtimes for the following programming models, listed in order of increasing complexity:
– a synchronous, message-passing model (Pregel)
– an asynchronous, shared-memory model (GraphLab)