Title: Adding Persistence to Main Memory Programming
Pradeep Fernando
School of Computer Science
College of Computing
Georgia Institute of Technology
https://www.cc.gatech.edu/grads/p/pfernand
Date: Thursday, October 3rd, 2019
Time: 10:00 AM - Noon (EDT)
Location: KACB 3100
Committee:
Dr. Ada Gavrilovska (Advisor, School of Computer Science, Georgia Tech)
Dr. Umakishore Ramachandran (School of Computer Science, Georgia Tech)
Dr. Joy Arulraj (School of Computer Science, Georgia Tech)
Dr. Tushar Krishna (School of Electrical Engineering, Georgia Tech)
Dr. Amitabha Roy (Software Engineer, Google)
Abstract:
Unlocking the true potential of new non-volatile memories (NVMs) requires eliminating traditional persistent I/O abstractions altogether by introducing persistence semantics directly into main memory programming. Such a programming model elevates failure atomicity to a first-class application property, alongside in-memory data layout, concurrency control, and fault tolerance, and therefore requires redesigning programming abstractions for both program correctness and maximum performance gains. To address these challenges, this thesis proposes a set of system software designs that integrate persistence with main memory programming, and makes the following contributions.
First, this thesis proposes an NVM-aware I/O runtime, NVStream, that supports fast, durable streaming I/O. NVStream uses a memory-friendly I/O API that plugs into an application's existing I/O data movement points to accelerate persistent data writes. NVStream carefully designs its persistent data storage layout and crash-consistency semantics to match both application and NVM characteristics. Specifically, it uses a log-structured NVM storage engine with append-only, failure-atomic semantics to support the streaming I/O produced during HPC simulations. Furthermore, the thesis identifies NVM bandwidth bottlenecks during parallel HPC I/O writes and proposes a novel data movement design, PHX. PHX uses alternative network data movement paths available in data centers to ease the bandwidth pressure on the NVM memory interconnects while maintaining the correctness of the persistent data.
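As an illustration of what append-only, failure-atomic semantics can mean in a log-structured store, the following is a minimal sketch and not NVStream's actual design or API. The class name `AppendOnlyLog`, the record layout (length and CRC32 header followed by the payload), and the use of an in-memory `bytearray` in place of an NVM region are all assumptions made for this example. A real NVM implementation would also need cache-line flushes and memory fences, which are not modeled here.

```python
import struct
import zlib

# Hypothetical record header: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")


class AppendOnlyLog:
    """Toy append-only log; a bytearray stands in for an NVM region."""

    def __init__(self, buf=None):
        self.buf = bytearray() if buf is None else bytearray(buf)

    def append(self, payload: bytes) -> None:
        # A record counts as committed only if its header and payload
        # are both fully present and the checksum matches.
        self.buf += HEADER.pack(len(payload), zlib.crc32(payload)) + payload

    def recover(self):
        """Return all complete records and discard any torn tail."""
        records = []
        offset = 0
        while offset + HEADER.size <= len(self.buf):
            length, crc = HEADER.unpack_from(self.buf, offset)
            start = offset + HEADER.size
            end = start + length
            if end > len(self.buf):
                break  # payload was cut short by the crash
            payload = bytes(self.buf[start:end])
            if zlib.crc32(payload) != crc:
                break  # payload is corrupt or only partially written
            records.append(payload)
            offset = end
        del self.buf[offset:]  # drop the torn record so appends can resume
        return records


log = AppendOnlyLog()
log.append(b"step-1")
log.append(b"step-2")

# Simulate a crash that lost the last three bytes of the second record.
torn = AppendOnlyLog(log.buf[:-3])
print(torn.recover())
```

In this sketch, recovery sees each record either in full or not at all, which is the failure-atomicity property the paragraph above refers to; the partially written record is simply dropped.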
Next, the thesis explores the challenges and opportunities of using NVM for true main memory persistent programming, in which a single data domain holds both runtime and persistent application state. Such a programming model requires maintaining ACID properties during every update to an application's persistent structures. ACID-qualified persistent programming for multi-threaded applications is hard, as the programmer has to reason about both crash-consistency and synchronization semantics (together, crash-sync) for program correctness. The thesis introduces NVMTSX, which extends the popular hardware transactional memory (HTM) primitive with durability semantics and offers a hardware-accelerated crash-sync primitive that supports both low-overhead synchronization and correct crash consistency.
Finally, application state stored in node-local persistent memory is still vulnerable to catastrophic node failures. The thesis proposes a replicated persistent memory runtime, Blizzard, that supports fault-tolerant, concurrent, and persistent data-structure programming. Blizzard carefully integrates userspace networking with byte-addressable NVM to provide a fast persistent memory replication runtime. Further, the design includes a replication-aware crash-sync protocol that enables consistent and concurrent updates on persistent data structures.