Title: Taming Latency In Data Center Applications
Mohan Kumar Kumar
School of Computer Science
College of Computing
Georgia Institute of Technology
Date: Friday, November 2nd, 2018
Time: 1:00pm EDT
Location: KACB 1123
Committee:
---------------
Dr. Taesoo Kim (Advisor, School of Computer Science, Georgia Tech)
Dr. Ada Gavrilovska (School of Computer Science, Georgia Tech)
Dr. Umakishore Ramachandran (School of Computer Science, Georgia Tech)
Dr. Tushar Krishna (School of Electrical and Computer Engineering, Georgia Tech)
Dr. Keon Jang (Software Engineer, Google)
Abstract:
------------
A new breed of low-latency I/O devices, such as emerging remote memory access devices
and high-speed Ethernet NICs, is becoming ubiquitous in today's data centers. For
example, large data center operators such as Amazon, Facebook, Google, and Microsoft are
already migrating their networks to 100G. With these faster I/O devices, however, the
overhead incurred by system software, such as memory management and the protocol stack,
becomes dominant.
To address these system software overheads, this thesis analyzes system services such as
memory management and protocol stacks, and makes the following contributions:

First, the thesis proposes a lazy, asynchronous mechanism to address the system software overhead incurred by a synchronous TLB shootdown. The key idea of the lazy shootdown mechanism, called LATR, is to use lazy memory reclamation and lazy page table unmapping to perform an asynchronous TLB shootdown. By handling TLB shootdowns in a lazy fashion, LATR eliminates the performance overhead of the IPI mechanism as well as the time spent waiting for acknowledgments from remote cores.

Second, the thesis proposes an extensible protocol stack to address the software overhead incurred in protocol stacks such as TCP and UDP. Xps allows an application to specify its latency-sensitive operations and executes them inside the kernel and user-space protocol stacks, providing higher throughput and lower tail latency by avoiding the socket interface. For all other operations, Xps retains the popular, well-understood socket interface. In addition, the Xps abstraction is flexible enough to embody latency-sensitive operations even in an off-the-shelf SmartNIC.

Third, the thesis analyzes the overhead incurred on the leader node for consensus algorithms such as Multi-Paxos/Viewstamped Replication (VR). It then classifies the parts of the VR algorithm to be executed on the SmartNIC and on the host processor. With such a classification, the consensus and recovery overhead on the leader node is eliminated, which in turn reduces the latency of the consensus algorithm.
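The lazy-shootdown idea behind LATR can be illustrated with a minimal sketch. This is a hypothetical simulation, not code from the actual implementation: all names (Core, lazy_list, local_flush, lazy_shootdown) are illustrative. Instead of broadcasting an IPI and blocking until every remote core acknowledges, the unmapping core records the stale entry in each core's lazy list and returns immediately; each core drops its own stale entries at its next scheduled tick.

```python
# Hypothetical sketch of a lazy, asynchronous TLB shootdown
# (illustrative names only; not the LATR source code).

class Core:
    def __init__(self, cid):
        self.cid = cid
        self.tlb = set()      # virtual pages cached in this core's TLB
        self.lazy_list = []   # stale entries waiting for the next tick

    def local_flush(self):
        # Runs at this core's next scheduler tick: drop stale entries
        # locally, with no cross-core IPI and no busy-waiting.
        for page in self.lazy_list:
            self.tlb.discard(page)
        flushed, self.lazy_list = self.lazy_list, []
        return flushed

def lazy_shootdown(cores, page):
    # Mark the page stale on every core and return immediately,
    # instead of blocking for per-core acknowledgments.
    for core in cores:
        core.lazy_list.append(page)

cores = [Core(i) for i in range(4)]
for c in cores:
    c.tlb.add(0x1000)

lazy_shootdown(cores, 0x1000)                 # unmap returns without waiting
assert all(0x1000 in c.tlb for c in cores)    # entries linger until...
for c in cores:
    c.local_flush()                           # ...each core's next tick
assert all(0x1000 not in c.tlb for c in cores)
```

The trade-off the sketch makes visible: the unmapping core pays no synchronous IPI-and-wait cost, but the freed page may only be reclaimed once every core has performed its local flush, which is why LATR pairs the asynchronous shootdown with lazy memory reclamation.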