Title: Scalable Energy-Efficient Microarchitectures with Computational Error Tolerance
Bobin Deng
Ph.D. Student
School of Computer Science
Georgia Institute of Technology
Date: Thursday, April 9th, 2020
Time: 9:00 am to 11:00 am (EDT)
Location: *No Physical Location*
BlueJeans: https://gatech.bluejeans.com/9026676698
Committee:
Dr. Thomas Conte (advisor), School of Computer Science, Georgia Institute of Technology
Dr. Hyesoon Kim, School of Computer Science, Georgia Institute of Technology
Dr. Alexandros Daglis, School of Computer Science, Georgia Institute of Technology
Dr. Jeanine Cook, Sandia National Laboratories
Abstract:
Because of high leakage current and threshold-voltage constraints, Dennard scaling has reached its limit in conventional semiconductor technology. Reducing energy at the transistor level by simply lowering the supply voltage has proven infeasible for these devices (e.g., MOSFETs). Some recently proposed millivolt switch techniques aim to mitigate these issues by maintaining a high on/off ratio of drain currents at a much lower supply voltage. However, Vdd reduction is limited by the high probability of intermittent errors in millivolt switches. Energy-efficient microarchitectures that are computationally error-tolerant are therefore urgently needed.
This thesis systematically leverages the error detection and correction properties of the Redundant Residue Number System (RRNS) by varying the number of non-redundant (n) and redundant (r) components (residues) within a two-dimensional (n, r)-RRNS design plane. Efficiently handling resilience in this (n, r)-RRNS plane significantly improves reliability, allowing further Vdd reduction to save energy.
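To illustrate the general idea behind RRNS error handling, the following is a minimal software sketch and not the microarchitecture proposed in the thesis. The moduli (3, 5, 7 as non-redundant; 11, 13 as redundant) and the function names are assumptions chosen only for illustration. A value is legitimate if it is smaller than the product of the n non-redundant moduli; a single corrupted residue is located by dropping each residue in turn and checking whether the remaining ones reconstruct a legitimate value.

```python
from math import prod  # requires Python 3.8+


def crt(residues, moduli):
    """Reconstruct an integer from its residues (Chinese Remainder Theorem).

    Assumes the moduli are pairwise coprime.
    """
    total = prod(moduli)
    x = 0
    for r_i, m_i in zip(residues, moduli):
        partial = total // m_i
        # pow(a, -1, m) is the modular inverse of a modulo m (Python 3.8+).
        x += r_i * partial * pow(partial, -1, m_i)
    return x % total


def encode(x, moduli):
    """Represent x by its residue with respect to every modulus."""
    return [x % m for m in moduli]


def decode(residues, moduli, n):
    """Return (value, index_of_corrected_residue or None).

    The first n moduli are non-redundant; the rest are redundant.
    Only single-residue errors are handled in this sketch (r = 2).
    """
    legit_range = prod(moduli[:n])
    value = crt(residues, moduli)
    if value < legit_range:
        return value, None  # no error detected
    for i in range(len(moduli)):
        # Drop residue i and try to reconstruct from the others.
        candidate = crt(residues[:i] + residues[i + 1:],
                        moduli[:i] + moduli[i + 1:])
        if candidate < legit_range:
            return candidate, i  # residue i was the faulty one
    raise ValueError("uncorrectable error detected")


# Hypothetical (n, r) = (3, 2) configuration.
MODULI = [3, 5, 7, 11, 13]
N = 3

word = encode(52, MODULI)
word[1] = (word[1] + 2) % MODULI[1]  # inject a single-residue error
print(decode(word, MODULI, N))
```

In this sketch, an uncorrectable error surfaces as an exception, which is the point at which a checkpoint-and-restart mechanism would take over. With only two redundant moduli, errors in more than one residue may not be handled correctly.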
To this end, first, I will discuss the necessary implementation details of a single-error-correction RRNS core. Second, I will propose a scalable RRNS microarchitecture that simultaneously supports both error correction and checkpointing with restart capabilities upon detecting uncorrectable errors. Third, I will design novel RRNS-based adaptive checkpointing and restart mechanisms that automatically guarantee reliability while minimizing the energy-delay product (EDP). Finally, I will explore the RRNS design space systematically to find the optimal (n, r) configuration point.