PhD Defense by Hyoukjun Kwon


Event Details
  • Date/Time:
    • Thursday July 16, 2020 - Friday July 17, 2020
      3:00 pm - 4:59 pm
  • Location: REMOTE: BLUE JEANS
  • Phone:
  • URL: BlueJeans Link
  • Email:
  • Fee(s):
    N/A
  • Extras:
Contact
No contact information submitted.
Summaries

Summary Sentence: Data- and Communication-centric Approaches to Model and Design Flexible Deep Neural Network Accelerators

Full Summary: No summary paragraph submitted.

Title: Data- and Communication-centric Approaches to Model and Design Flexible Deep Neural Network Accelerators

Hyoukjun Kwon
PhD Candidate 
School of Computer Science
Georgia Institute of Technology 
http://hyoukjunkwon.com/

Date: Thursday, July 16th, 2020

Time: 3-5 pm
Location: https://bluejeans.com/219451978 (remote)

Committee:
Dr. Tushar Krishna (advisor), School of Electrical and Computer Engineering, Georgia Institute of Technology
Dr. Vivek Sarkar, School of Computer Science, Georgia Institute of Technology
Dr. Hyesoon Kim, School of Computer Science, Georgia Institute of Technology
Dr. Alexey Tumanov, School of Computer Science, Georgia Institute of Technology
Dr. Michael Pellauer, Architecture Research Group, NVIDIA

Abstract:

Deep neural network (DNN) acceleration has emerged as an enabler of many applications, such as image classification, face recognition, and natural language processing, in which high operational performance (i.e., accuracy or quality of outputs) was previously challenging to achieve. Since recent DNNs involve billions of multiply-and-accumulate (MAC) operations with millions of parameters, DNN accelerators, which are specialized hardware for DNN computation, have emerged. However, designing dedicated hardware for each DNN model incurs high development costs while DNN models and algorithms evolve rapidly. In addition, specializing a DNN accelerator for one DNN model, with limited support for compiler mappings, often leads to inefficiency on other DNN models. Therefore, this thesis explores flexible DNN accelerator designs that support diverse compiler mappings (i.e., dataflow + tile sizes for each data dimension) so that they can adapt to new DNN models without hardware redesign.
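
To give a sense of the scale the abstract refers to, the MAC count of a single convolution layer can be estimated from its shape. The sketch below is illustrative only: the function names `conv_macs` and `conv_params` and the layer dimensions are assumptions chosen for this example, not taken from the thesis, and it assumes a dense convolution with batch size 1 and no bias.

```python
def conv_macs(out_channels, in_channels, kernel_h, kernel_w, out_h, out_w):
    """Multiply-and-accumulate count for a dense 2D convolution (batch size 1)."""
    # Each output element needs one MAC per weight in its filter.
    macs_per_output = in_channels * kernel_h * kernel_w
    num_outputs = out_channels * out_h * out_w
    return macs_per_output * num_outputs


def conv_params(out_channels, in_channels, kernel_h, kernel_w):
    """Weight count for the same layer (bias terms ignored)."""
    return out_channels * in_channels * kernel_h * kernel_w


# Hypothetical layer shape: 64 -> 64 channels, 3x3 filters, 224x224 output.
macs = conv_macs(64, 64, 3, 3, 224, 224)
params = conv_params(64, 64, 3, 3)
print(f"{macs:,} MACs, {params:,} weights")
```

Under these assumptions, one layer of this shape already needs roughly 1.85 billion MACs while holding only about 37 thousand weights, which is why the compute and data movement of whole networks are costly on general-purpose hardware.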

This thesis first focuses on modeling the costs and benefits of mapping choices, quantifying them with the underlying hardware taken into account. We codify the cost model, implement it as MAESTRO, and perform case studies showing that no single mapping is ideal for all layers. For flexible DNN accelerator designs, this thesis addresses the challenge from two perspectives: reconfigurability and heterogeneity.

For the reconfigurability approach, this thesis focuses on data movement, since the cost of data movement dominates in DNN accelerators and, because the target application is predefined, rearranging the data movement is effectively equivalent to programming a DNN accelerator. We propose a lightweight network-on-chip (NoC) architecture, Microswitch NoC, specialized for DNN accelerator traffic while providing sufficient flexibility for any dataflow. We also present a reconfigurable DNN accelerator design, MAERI, that employs reconfigurable data distribution and reduction NoCs, which support all the communication patterns in DNN accelerators and perform reduction inside NoC switches (i.e., in an in-network processing style). MAERI allows computations to be mapped onto compute units without underutilizing processing elements (PEs), even for irregular DNN computations resulting from diverse layers and various optimizations (e.g., cross-layer mapping, sparsity, etc.).

For the heterogeneity approach, this thesis explores heterogeneous DNN accelerators (HDAs), which contain multiple sub-accelerators that have different amounts of hardware resources and run different dataflows. For the HDA-based approach, this thesis proposes a comprehensive HDA optimization framework, Herald, that automatically explores two optimization opportunities: mapping each DNN layer to the sub-accelerator with the lowest energy-delay product (EDP) at run time, and partitioning hardware resources properly at design time. Finally, we formally define mapping flexibility so that we can quantify the degree of flexibility of flexible accelerators, which enables comprehensive comparison across flexible DNN accelerators.
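
The idea of sending a layer to the sub-accelerator with the lowest EDP can be sketched as follows. This is a minimal illustration, not Herald's actual algorithm: the `Estimate` class, the `pick_lowest_edp` function, the sub-accelerator names, and all energy and delay numbers are hypothetical, and the sketch ignores scheduling and resource contention across layers.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Estimate:
    """Hypothetical cost estimate for one layer on one sub-accelerator."""
    energy_j: float  # estimated energy in joules
    delay_s: float   # estimated latency in seconds

    @property
    def edp(self) -> float:
        # Energy-delay product: lower is better.
        return self.energy_j * self.delay_s


def pick_lowest_edp(estimates: Mapping[str, Estimate]) -> str:
    """Return the name of the sub-accelerator with the smallest EDP."""
    return min(estimates, key=lambda name: estimates[name].edp)


# Made-up estimates for a single layer on two sub-accelerators.
layer_estimates = {
    "sub_acc_a": Estimate(energy_j=2.0e-3, delay_s=1.0e-3),
    "sub_acc_b": Estimate(energy_j=1.0e-3, delay_s=3.0e-3),
}
chosen = pick_lowest_edp(layer_estimates)
print(chosen)
```

With these made-up numbers, `sub_acc_a` uses more energy but finishes sooner, so its EDP is lower and it is chosen. A different layer could favor the other sub-accelerator, which is the motivation for combining sub-accelerators that run different dataflows.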

Additional Information

In Campus Calendar
No
Groups

Graduate Studies

Invited Audience
Public, Graduate students, Undergraduate students
Categories
Other/Miscellaneous
Keywords
PhD Defense
Status
  • Created By: Tatianna Richardson
  • Workflow Status: Published
  • Created On: Jul 7, 2020 - 11:35am
  • Last Updated: Jul 7, 2020 - 11:35am