Title: Energy Efficient On-chip Deep Neural Network (DNN) Inference and Training with Emerging Non-volatile Memory Technologies
Committee:
Dr. Shimeng Yu, ECE, Chair, Advisor
Dr. Callie Hao, ECE
Dr. Yingyan Lin, CS
Dr. Tushar Krishna, ECE
Dr. Saibal Mukhopadhyay, ECE
Abstract: Emerging non-volatile memory (eNVM) technologies provide new opportunities for designing DNN accelerators with high energy efficiency. In this thesis, DNN accelerator designs using the eNVM-based compute-in-memory (CIM) paradigm and high-density on-chip buffers are proposed. For DNN inference, a CIM accelerator with a reconfigurable interconnect is presented; it optimizes the communication pattern by using an application-specific interconnect topology. To support the multi-head self-attention (MHSA) mechanism in transformers, a heterogeneous computing platform combining CIM with a digital sparse engine handles the various types of matrix-matrix multiplications involved, and a CIM-based approximate computing scheme is proposed to exploit the run-time sparsity in attention score computation. For DNN training, to overcome the high write energy of eNVM, a hybrid weight cell design combining eNVM with a capacitor is proposed for weight updates. To store the large volume of intermediate data generated during training, a dual-mode buffer design based on ferroelectric materials is proposed; it optimizes both the dynamic read/write energy and the standby power by operating in volatile and non-volatile modes.
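
For context on the "various types of matrix-matrix multiplications" in MHSA that motivate the heterogeneous CIM-plus-digital design, the sketch below is an illustrative NumPy model, not the accelerator's implementation; all layer sizes and variable names are assumptions. It separates (1) the static-weight projections, whose weights are fixed after training and can be programmed once into CIM arrays, from (2) the dynamic Q-times-K-transpose and attention-times-V products, whose operands are produced at run time and whose sparsity a digital engine can exploit.

    # Minimal sketch (assumed names/sizes) of the two kinds of matmuls in MHSA.
    import numpy as np

    def mhsa(x, w_q, w_k, w_v, w_o, num_heads):
        """x: (seq_len, d_model); w_q, w_k, w_v, w_o: (d_model, d_model)."""
        seq_len, d_model = x.shape
        d_head = d_model // num_heads

        # (1) Static-weight matmuls: weights fixed after training,
        #     suitable for weight-stationary CIM arrays.
        q = x @ w_q
        k = x @ w_k
        v = x @ w_v

        out_heads = []
        for h in range(num_heads):
            sl = slice(h * d_head, (h + 1) * d_head)
            # (2) Dynamic matmuls: both operands change every inference,
            #     handled here as a digital engine would, with softmax in between.
            scores = q[:, sl] @ k[:, sl].T / np.sqrt(d_head)   # attention scores
            attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
            attn /= attn.sum(axis=-1, keepdims=True)           # softmax
            out_heads.append(attn @ v[:, sl])                  # attention @ V
        out = np.concatenate(out_heads, axis=-1)

        # (1) again: output projection with a static weight matrix.
        return out @ w_o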