Title: Enhancing manageability of execution and data for GPGPU computing
Anshuman Goswami
School of Computer Science
College of Computing
Georgia Institute of Technology
Date: Wednesday, Nov 30, 2016
Time: 10 am to 12 noon EST
Location: KACB 1202
Committee:
---------------
Dr. Karsten Schwan (Advisor, School of Computer Science, Georgia Tech)
Dr. Matthew Wolf (Advisor and Committee Chair, Oak Ridge National Laboratory)
Dr. Ling Liu (School of Computer Science, Georgia Tech)
Dr. Sudhakar Yalamanchili (School of Electrical and Computer Engineering, Georgia Tech)
Dr. Richard Vuduc (School of Computational Science and Engineering, Georgia Tech)
Dr. Hyesoon Kim (School of Computer Science, Georgia Tech)
Abstract:
-----------
GPGPUs are useful for many types of compute-intensive workloads, from scientific simulations to cloud-focused applications like machine learning and graph analytics. However, unlike CPUs, they do not allow for software-controlled sharing of resources. This leads to underutilization, unfair use and reduced programmability. This thesis looks at three different areas: 1) in situ analysis in scientific workflows, 2) multi-tenancy in cloud computing environments, and 3) network sharing between evolving distributed GPU frameworks. The thesis presents four distinct software-scheduling-based constructs to handle problems in each of these spaces.
First, the thesis will present Landrush, an idle cycle scavenging approach for GPUs to improve time to answer in scientific workflows by running data analysis in situ with controlled interference due to co-location.
Second, the thesis will present GPUShare, which enables sharing of GPUs between long-running cloud workloads, helping to reduce the cost of usage by sharing resources fairly while ensuring that standalone execution remains unaffected.
Third, the thesis will present Symphony, a software-supervised GPU scheduler that trades off the low overhead of hardware dispatching against the runtime responsiveness of software scheduling to improve time to answer for scientific workflows that do not afford idle cycles.
Finally, the thesis will present GpuCoflow, a novel approach to network sharing between evolving distributed GPU computing frameworks that considers an application's computing and data transfer characteristics to ensure increased overall throughput compared to traditional network scheduling approaches that are geared towards providing high bisection bandwidth.