*********************************
There is now a CONTENT FREEZE for Mercury while we switch to a new platform. It began on Friday, March 10 at 6pm and will end on Wednesday, March 15 at noon. No new content can be created during this time, but all material in the system as of the beginning of the freeze will be migrated to the new platform, including users and groups. Functionally the new site is identical to the old one. webteam@gatech.edu
*********************************
Title: Scaling Synchronization Primitives
Sanidhya Kashyap
Ph.D. Candidate
School of Computer Science
College of Computing
Georgia Institute of Technology
Date: Thursday, June 11th
Time: 1:00 PM-3:00 PM (EST)
Location: https://bluejeans.com/209362103 (remote)
Committee:
Dr. Taesoo Kim (Advisor), School of Computer Science, Georgia Institute of Technology
Dr. Changwoo Min (Co-advisor), Virginia Tech
Dr. Ada Gavrilovska, School of Computer Science, Georgia Institute of Technology
Dr. Irina Calciu, VMware Research
Dr. Joy Arulraj, School of Computer Science, Georgia Institute of Technology
Abstract:
Over the past decade, multicore machines have become the norm. A single machine is
capable of having thousands of hardware threads or cores. Even cloud providers offer such
large multicore machines for data processing engines and databases. Thus, a fundamental
question arises is how efficient are existing synchronization primitives—timestamping and
locking—that developers use for designing concurrent, scalable, and performant applications.
Hence, this dissertation focuses on understanding the scalability aspect of these primitives,
and presents new algorithms and approaches, that either leverage the hardware or the
application domain knowledge, to scale up to hundreds of cores.
First, the thesis presents Ordo, a scalable ordering or timestamping primitive, that forms
the basis of designing scalable timestamp-based concurrency control mechanisms. Ordo
relies on invariant hardware clocks and provides a notion of a globally synchronized clock
within a machine. We use the Ordo primitive to redesign a synchronization mechanism and
concurrency control mechanisms in databases and software transactional memory.
Later, this thesis focuses on the scalability aspect of locks in both virtualized and
non-virtualized scenarios. We identify that synchronization primitives suffer from various
preemption problems that happen because of the double scheduling problem. We then
leverage the hypervisor’s scheduler to address this problem by bridging the semantic gap in
the form of scheduling information between the hypervisor and VMs.
Finally, we focus on the design of lock algorithms in general. We find that locks in
practice have discrepancies from locks in design. For example, popular spinlocks suffer
from excessive cache-line bouncing in multicore (NUMA) systems, while scalable,
NUMA-aware locks exhibit sub-par single-thread performance. We classify several
dominating factors that impact the performance of lock algorithms. We then propose
a new technique, shuffling, that can dynamically accommodate all these factors, without
slowing down the critical path of the lock. The key idea of shuffling is to re-order the queue
of threads waiting to acquire the lock with some pre-established policy. Using shuffling, we
propose a family of locking algorithms, called ShflLocks that respect all factors, efficiently
utilize waiters, and achieve the best performance.