Title: System Design Principles for Heterogeneous Resource Management and Scheduling in Accelerator-based Systems
Dipanjan Sengupta
School of Computer Science
College of Computing
Georgia Institute of Technology
Date: Tuesday, December 15th, 2015
Time: 11 AM to 1 PM EST
Location: KACB 3402
Committee:
------------
Dr. Karsten Schwan (Advisor, School of Computer Science, Georgia Tech)
Dr. Matthew Wolf (Committee Chair, School of Computer Science, Georgia Tech)
Dr. Ada Gavrilovska (School of Computer Science, Georgia Tech)
Dr. Ling Liu (School of Computer Science, Georgia Tech)
Dr. Sudhakar Yalamanchili (School of Electrical and Computer Engineering, Georgia Tech)
Dr. Richard Vuduc (School of Computational Science and Engineering, Georgia Tech)
Abstract:
-----------
Accelerator-based systems are rapidly becoming platforms of choice for both high-end cloud services and irregular applications such as real-world graph analytics, owing to their high scalability and low dollar-to-FLOPS ratios. Yet GPUs are not first-class schedulable entities, which causes substantial underutilization of hardware resources, including their computational and data movement engines. Software solutions that support efficient resource management principles are therefore required to address this scheduling crisis in GPUs. Further, two important characteristics of real-world graphs, such as those in social networks, are that they are large and that they evolve constantly over time. Their size poses a challenge because GPU-resident memory is too limited to store them. And because these large-scale graphs evolve at a high rate, it is undesirable and computationally infeasible to repeatedly run static graph analytics on a sequence of versions, or snapshots, of the evolving graph. Novel incremental solutions are therefore required to process, in near real time on GPUs, large-scale evolving graphs whose memory footprint exceeds the device's internal memory capacity.
First, the thesis proposes Strings, a GPU scheduling infrastructure that achieves high system throughput and fairness among applications from multiple tenants on manycore GPU servers. It does so by treating GPUs as first-class schedulable entities and by decomposing the scheduling problem into a novel combination of load balancing and per-device resource sharing.
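The decomposition into load balancing and per-device resource sharing can be illustrated with a small sketch. This is not the Strings implementation: the function names `place_jobs` and `share_device`, the least-loaded placement rule, and the round-robin time slicing are all assumptions chosen only to show how the two levels might be separated.

```python
from collections import deque


def place_jobs(jobs, num_devices):
    """Level 1 (load balancing, hypothetical policy): assign each
    (name, work) job to the device with the least total assigned work."""
    queues = [[] for _ in range(num_devices)]
    load = [0] * num_devices
    for name, work in jobs:
        # Pick the least-loaded device; ties go to the lowest index.
        device = min(range(num_devices), key=lambda i: load[i])
        queues[device].append((name, work))
        load[device] += work
    return queues


def share_device(queue, quantum):
    """Level 2 (per-device sharing, hypothetical policy): round-robin
    time slices of length `quantum`; returns job names in completion order."""
    pending = deque(queue)
    finished = []
    while pending:
        name, remaining = pending.popleft()
        remaining -= quantum
        if remaining > 0:
            pending.append((name, remaining))  # not done, back of the line
        else:
            finished.append(name)
    return finished


if __name__ == "__main__":
    jobs = [("a", 5), ("b", 2), ("c", 3), ("d", 1)]
    for device, queue in enumerate(place_jobs(jobs, num_devices=2)):
        print(device, share_device(queue, quantum=2))
```

Keeping the two levels as separate functions means either policy could be swapped out independently, which is the point of the decomposition in this sketch.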
Second, for graph applications whose memory footprint exceeds device memory, the thesis proposes GraphReduce, a highly efficient and scalable GPU-based framework. GraphReduce adopts a combination of edge- and vertex-centric implementations of the Gather-Apply-Scatter programming model and operates on multiple asynchronous GPU streams to fully exploit the high degree of parallelism in GPUs while supporting efficient graph data movement between the host and device.
Finally, the thesis proposes a novel programming model that allows a large set of incremental graph processing algorithms to be implemented seamlessly across multiple GPU cores. It also characterizes various graph algorithms and how related graph properties affect the complexity of incremental graph processing, in order to make runtime decisions between an incremental and a static run over a particular update batch and so achieve the best performance.
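For readers unfamiliar with the Gather-Apply-Scatter model named above, the following is a minimal CPU-only sketch of its three phases, using single-source shortest paths as the example. It is not GraphReduce code: the function name `gas_sssp` and the edge-list representation are assumptions, non-negative edge weights are assumed, and the GPU streams and host-device data movement described in the abstract are omitted entirely.

```python
from math import inf


def gas_sssp(num_vertices, edges, source):
    """Single-source shortest paths in Gather-Apply-Scatter style.

    `edges` is a list of (src, dst, weight) tuples with non-negative weights.
    """
    dist = [inf] * num_vertices
    dist[source] = 0
    active = {source}
    while active:
        # Gather (edge-centric): every edge leaving an active vertex
        # proposes a candidate distance for its destination; keep the minimum.
        gathered = {}
        for src, dst, weight in edges:
            if src in active:
                candidate = dist[src] + weight
                if candidate < gathered.get(dst, inf):
                    gathered[dst] = candidate
        # Apply (vertex-centric): each vertex accepts an improved distance.
        changed = set()
        for vertex, candidate in gathered.items():
            if candidate < dist[vertex]:
                dist[vertex] = candidate
                changed.add(vertex)
        # Scatter: only vertices whose value changed stay active next round.
        active = changed
    return dist


if __name__ == "__main__":
    edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 1)]
    print(gas_sssp(4, edges, source=0))
```

The gather loop iterates over edges while the apply loop iterates over vertices, which loosely mirrors the mix of edge- and vertex-centric phases mentioned in the abstract.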