PhD Defense by Seonmyeong Bak

Event Details
  • Date/Time:
    • Tuesday November 3, 2020
      2:30 pm - 4:30 pm
  • Location: Bluejeans
  • Phone:
  • URL: Bluejeans
  • Email:
  • Fee(s):
    N/A
  • Extras:
Contact
No contact information submitted.
Summaries

Summary Sentence: Runtime Approaches to Improve the Efficiency of Hybrid and Irregular Applications

Full Summary: No summary paragraph submitted.

Title: Runtime Approaches to Improve the Efficiency of Hybrid and Irregular Applications

 

Seonmyeong Bak

Ph.D. Candidate

School of Computer Science

Georgia Institute of Technology

 

Date: Tuesday, November 3rd, 2020
Time: 2:00 pm to 4:00 pm (EST)
Location: *No Physical Location*

BlueJeans:  https://bluejeans.com/sbak3

 

Committee:
Dr. Vivek Sarkar (advisor), School of Computer Science, Georgia Institute of Technology
Dr. Ümit V. Çatalyürek, School of Computational Science and Engineering, Georgia Institute of Technology
Dr. Ada Gavrilovska, School of Computer Science, Georgia Institute of Technology
Dr. Tushar Krishna, School of Electrical and Computer Engineering, Georgia Institute of Technology
Dr. Alexey Tumanov, School of Computer Science, Georgia Institute of Technology

Abstract:
On-node parallelism has increased significantly in high-performance computing systems. Regular parallel applications can exploit this parallelism easily, because straightforward approaches usually suffice to map their computation patterns and data layouts onto the available on-node parallelism. Irregular parallel applications, however, require considerable effort to run efficiently on modern processors with massive amounts of intra-node parallelism. Parallel programming models and runtime approaches have been proposed to help programmers write those applications quickly, but it is still not easy to write efficient irregular parallel applications. Two key challenges in mapping irregular applications onto on-node parallelism are load balance and computation-communication overlap. In this thesis defense, we address these challenges through new runtime approaches and new APIs that enable users to provide minimal information for application-aware scheduling.

First, we introduce new algorithms to improve the scheduling of irregular task graphs containing a mix of communication and computation tasks with data parallelism and blocking operations. We combine gang-scheduling with work-stealing for data-parallel tasks with frequent inter/intra-node communication in the task graphs, so as to reduce interference and expensive context-switching operations. We also propose improved victim selection policies for work-stealing to improve the load balance and overlap of ready tasks that have child tasks.
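
As an illustration only, the following Python sketch simulates work-stealing with a pluggable victim selection policy. The longest-queue policy, the names `pick_victim` and `run`, and the round-robin simulation are assumptions made for this example; they are not the policies or the runtime proposed in the thesis.

```python
from collections import deque


def pick_victim(queues, thief):
    """Hypothetical policy: steal from the worker with the longest queue.

    Returns the victim's index, or None if no other worker has tasks.
    """
    candidates = [i for i in range(len(queues)) if i != thief and queues[i]]
    if not candidates:
        return None
    return max(candidates, key=lambda i: len(queues[i]))


def run(queues):
    """Simulate workers taking turns; return the tasks each worker executed."""
    done = [[] for _ in queues]
    while any(queues):
        for worker, queue in enumerate(queues):
            if queue:
                # The owner takes the newest task from the tail of its own deque.
                done[worker].append(queue.pop())
            else:
                victim = pick_victim(queues, worker)
                if victim is not None:
                    # A thief takes the oldest task from the head of the victim's deque.
                    done[worker].append(queues[victim].popleft())
    return done


# All six tasks start on worker 0; workers 1 and 2 must steal to get work.
executed = run([deque(range(6)), deque(), deque()])
```

Swapping in a different `pick_victim` function changes the policy without touching the scheduling loop, which is the kind of design point the paragraph above refers to.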

Next, we propose an efficient integrated runtime system to handle load balancing of irregular applications written in hybrid parallel programming models. We introduce a unified runtime system that integrates distributed and shared-memory programming, as exemplified by the combination of Charm++ and OpenMP. In this approach, all processing resources (cores) can be used flexibly across both the distributed and shared-memory levels, thereby enabling more efficient load balancing at the intra-node level and reduced waiting times for global synchronization at the inter-node level.

Finally, we propose a set of APIs that enable users to specify functions used to decompose a target loop into subspaces and to create chunks within each subspace for application-specific load balancing. Our runtime leverages the information provided in the APIs to create user-defined chunks and store balanced groups of chunks in a shared data structure indexed by static loop constructs. In this way, the stored information from one invocation of a loop can be reused in following invocations for an improved initial load balance.

Additional Information

In Campus Calendar
No
Groups

Graduate Studies

Invited Audience
Faculty/Staff, Public, Graduate students, Undergraduate students
Categories
Other/Miscellaneous
Keywords
PhD Defense
Status
  • Created By: Tatianna Richardson
  • Workflow Status: Published
  • Created On: Oct 26, 2020 - 2:38pm
  • Last Updated: Oct 26, 2020 - 2:38pm