Ph.D. Proposal Oral Exam - Jin Wang

Event Details

Date/Time:
- Wednesday February 24, 2016 - Thursday February 25, 2016
  12:00 pm - 11:59 am
Location: Room 2100, Klaus
Fee(s):
N/A

Summaries

Summary Sentence: ECE Proposal Oral Exam

Full Summary: No summary paragraph submitted.

Title: Acceleration and Optimization of Dynamic Parallelism for Irregular Applications on GPUs

Committee:

Dr. Yalamanchili, Advisor

Dr. Kim, Chair

Dr. Vuduc

Abstract:

The objective of the proposed research is an optimized extension to the GPU execution model that efficiently supports irregular, data-intensive applications by exploiting dynamic parallelism through a lightweight thread block launching mechanism and associated optimizations in scheduling strategy and power efficiency. Emerging irregular applications contain dynamically formed pockets of structured parallelism that can use the device-side nested kernel launch capabilities recently introduced on GPUs. However, low utilization of GPU resources and the high cost of device kernel launches still make it difficult to harness dynamic parallelism on GPUs. The preliminary research therefore proposes an extension to the GPU execution model -- Dynamic Thread Block Launch (DTBL) -- which provides the capability of spawning lightweight thread blocks from GPU threads on demand and coalescing them with existing, executing native kernels. The finer granularity of a thread block provides effective and efficient control of smaller-scale, dynamically occurring pockets of structured parallelism during the computation. Evaluations of DTBL show an average 1.21x speedup over the baseline implementations. DTBL is further optimized with a thread block scheduling strategy that exploits spatial and temporal reference locality between parent kernels and dynamically launched child kernels. The locality-aware thread block scheduler achieves a further 27% increase in overall performance. The proposed research will further explore the energy and power dissipation consequences of the DTBL model. This work will build on a characterization of GPU utilization from a power dissipation perspective to develop techniques that improve power efficiency.
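
The utilization problem mentioned in the abstract can be illustrated with a toy cost model. This is only a sketch introduced here for illustration; it is not the proposal's method, simulator, or data. It assumes a graph-style workload where each vertex has a different number of neighbors, a warp of 32 threads that runs in lockstep (so a warp stays busy until its slowest thread finishes), and a hypothetical degree threshold above which a vertex is handed to a separately launched group of threads, one thread per neighbor. The function names and the example degree list are made up, and launch overhead, which the abstract identifies as a major cost, is deliberately ignored.

```python
from math import ceil

WARP_SIZE = 32  # threads that execute in lockstep on NVIDIA GPUs


def flat_lane_slots(degrees, warp_size=WARP_SIZE):
    """Lane slots used when each thread loops over one vertex's neighbors.

    A warp keeps all of its lanes occupied until its slowest thread is
    done, so it costs warp_size lanes for max(degree) steps.
    """
    slots = 0
    for start in range(0, len(degrees), warp_size):
        warp = degrees[start:start + warp_size]
        slots += max(warp) * warp_size
    return slots


def nested_lane_slots(degrees, threshold, warp_size=WARP_SIZE):
    """Lane slots used when heavy vertices are expanded by a nested launch.

    Vertices with degree >= threshold (a hypothetical tuning knob) are
    processed by a child launch with one thread per neighbor; the parent
    thread does no neighbor work for them. Launch overhead is ignored.
    """
    parent = [0 if d >= threshold else d for d in degrees]
    slots = flat_lane_slots(parent, warp_size)
    for d in degrees:
        if d >= threshold:
            # The child launch is rounded up to whole warps.
            slots += ceil(d / warp_size) * warp_size
    return slots


def utilization(degrees, slots):
    """Fraction of occupied lane slots that did useful neighbor work."""
    return sum(degrees) / slots


if __name__ == "__main__":
    # Hypothetical warp: 31 light vertices and one heavy vertex.
    degrees = [2] * 31 + [640]
    flat = flat_lane_slots(degrees)
    nested = nested_lane_slots(degrees, threshold=64)
    print(flat, utilization(degrees, flat))
    print(nested, utilization(degrees, nested))
```

For this made-up input the flat version occupies 20480 lane slots to do 702 units of useful work, while the nested version occupies 704. A real GPU would give back part of that gain through launch cost, which is the overhead DTBL is described as reducing by launching at thread block rather than kernel granularity.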

Additional Information

In Campus Calendar

Groups

ECE Ph.D. Proposal Oral Exams

Invited Audience

Public

Categories

Other/Miscellaneous

Keywords

graduate students, Phd proposal

Status

Created By: Daniela Staiculescu
Workflow Status: Published
Created On: Feb 18, 2016 - 12:00pm
Last Updated: Oct 7, 2016 - 10:16pm

Georgia Tech
