SCS Talk: Ganesh Ananthanarayanan, University of California at Berkeley

Event Details
Contact

Kishore Ramachandran @ (404) 385-5136 rama@cc.gatech.edu 

Summaries

Summary Sentence: Big Data Analytics with All-or-Nothing Parallel Jobs

Full Summary: SCS Talk: Ganesh Ananthanarayanan, University of California at Berkeley
Title: Big Data Analytics with All-or-Nothing Parallel Jobs
Bio: Ganesh Ananthanarayanan is a PhD candidate at the University of California at Berkeley, working with Prof. Ion Stoica in the AMP Lab. His research interests are in systems and networking, with a focus on cloud computing and large-scale data analytics systems. Prior to joining Berkeley, he worked for two years at Microsoft Research’s Bangalore office.

Title: Big Data Analytics with All-or-Nothing Parallel Jobs

Abstract

Extensive data analysis has become the enabler for diagnostics and decision making in many modern systems. These analyses have both competitive and social benefits. To cope with a deluge of data that is growing faster than Moore’s law, computation frameworks have resorted to massive parallelization of analytics jobs into many fine-grained tasks. These frameworks promise efficient and fault-tolerant execution of these tasks. However, meeting this promise in clusters spanning hundreds of thousands of machines is challenging and marks a key departure from earlier work on parallel computing.

A simple but key aspect of parallel jobs is the all-or-nothing property: unless all tasks of a job receive the same improvement, the job does not complete any faster. This talk will demonstrate how the all-or-nothing property impacts replacement algorithms in distributed caches for parallel jobs. Our coordinated caching system, PACMan, makes global caching decisions and employs a provably optimal cache replacement algorithm. A highlight of our evaluation using workloads from Facebook and Bing datacenters is that PACMan’s replacement algorithm outperforms even Belady’s MIN (which uses an oracle) in speeding up jobs. Along the way, I will also describe how we broke the myth of disk-locality’s importance in datacenter computing and present solutions for mitigating straggler tasks.
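
The all-or-nothing property can be illustrated with a toy model. The sketch below is not PACMan's replacement algorithm; it only assumes a single-wave job whose tasks all run in parallel, so the job finishes when its slowest task finishes. The task durations, block names, and the helper `job_completion_time` are hypothetical and chosen purely for illustration.

```python
# Toy model of the all-or-nothing property (illustrative assumptions only).
DISK_TIME = 10.0  # hypothetical task duration when its input is read from disk
MEM_TIME = 1.0    # hypothetical task duration when its input is cached in memory


def job_completion_time(input_blocks, cached):
    """Return the completion time of a single-wave parallel job.

    Each task reads one input block; the job ends when its slowest task ends.
    """
    return max(MEM_TIME if block in cached else DISK_TIME
               for block in input_blocks)


job_a = ["a1", "a2"]
job_b = ["b1", "b2", "b3", "b4"]

# Two caches of the same size, each serving exactly two cache hits.
cache_split = {"a1", "b1"}   # one block from each job
cache_whole = {"a1", "a2"}   # every block of job A

split_times = (job_completion_time(job_a, cache_split),
               job_completion_time(job_b, cache_split))
whole_times = (job_completion_time(job_a, cache_whole),
               job_completion_time(job_b, cache_whole))

print("split cache:", split_times)  # neither job speeds up
print("whole cache:", whole_times)  # job A speeds up, job B unchanged
```

Under these assumptions both caches achieve the same hit count, yet only the cache holding all of job A's inputs shortens any job. This is why a replacement policy tuned purely for hit ratio can be a poor fit for parallel jobs.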

 

Bio

Ganesh Ananthanarayanan is a PhD candidate at the University of California at Berkeley, working with Prof. Ion Stoica in the AMP Lab. His research interests are in systems and networking, with a focus on cloud computing and large-scale data analytics systems. Prior to joining Berkeley, he worked for two years at Microsoft Research’s Bangalore office.

Additional Information

In Campus Calendar
No
Groups

College of Computing

Status
  • Created By: Antonette Benford
  • Workflow Status: Published
  • Created On: Feb 25, 2013 - 10:48am
  • Last Updated: Oct 7, 2016 - 10:02pm