*********************************
There is now a CONTENT FREEZE for Mercury while we switch to a new platform. It began on Friday, March 10 at 6pm and will end on Wednesday, March 15 at noon. No new content can be created during this time, but all material in the system as of the beginning of the freeze will be migrated to the new platform, including users and groups. Functionally the new site is identical to the old one. webteam@gatech.edu
*********************************
Title: Distributed Services and Elastic Container Memory Abstractions for Big Data Analytic Clouds
Date: Friday, April 22th, 2022
Time: 08:00am – 10:00am (EDT)
Location: https://bluejeans.com/535243043/0051?src=calendarLink
Juhyun Bae
PhD Candidate
School of Computer Science
College of Computing
Georgia Institute of Technology
Committee
———————
Dr. Ling Liu (Advisor, School of Computer Science, Georgia Institute of Technology)
Dr. Calton Pu (School of Computer Science, Georgia Institute of Technology)
Dr. Joy Arulraj (School of Computer Science, Georgia Institute of Technology)
Dr. Said Mohamed Tawfik Issa (School of Computer Science, Georgia Institute of Technology)
Dr. Jay Lofstead (Sandia National Laboratories)
Abstract
———————
Big data powered machine learning has become an integral part of many products and services today. Solely relying on elastic scaling up to deal with resource limits in Cloud and at Edge may not be sufficient, especially when the workload pattern and volume are diverse and hard to predict, let alone the cost of scaling up and the presence of unutilized
resources in cross-container and cross-cluster settings. Although container is an essential server technology for multi cloud, hybrid cloud and edge cloud computing, the performance of containers on a host node is affected by the memory usage of other co-hosted containers
due to the shared common pool of host memory. Transient load surge can cause serious performance degradation when the container runtime is out of allocated memory or the host memory runs out. This dissertation research addresses these problems by developing distributed services with elastic container memory abstraction, enabling container runtime
to utilize free memory on the same host and in remote nodes of a cluster with three unique contributions.
We first design and develop the elastic host memory abstraction to allow cross-container memory sharing based on dynamic and transient memory demands of container runtime, enabling containers to elastically expand or shrink their available memory on demand. This new cross container memory abstraction enables a container to manage the unexpectedly large working set memory gracefully rather than relying on restarting the container to increase the memory capacity. This new memory abstraction can be incorporated to container runtime execution transparently without any modification to the host operating system (OS) and the application.
Next, we design and develop the elastic host-remote memory abstraction to allow cross container and cross node memory sharing. This enables containers to flexibly expand its memory demand to remote memory in the cluster in response to the unexpected demand on the working set memory when the host memory is insufficient to accommodate the demand. By efficiently leveraging available remote free memory in the cluster, it incorporates the remote network memory as a part of the runtime memory hierarchy of containers transparently without any impact on existing OS and applications running in the containers on both host and remote nodes. By enabling containers to leverage remote network memory, which is two orders of magnitude faster than disk I/O, our elastic host-remote memory abstraction can provide graceful performance adaptation to the demand of memory intensive applications, alleviating the restarting of containers for expanding its memory allocation or the migration of container execution when its host memory is insufficient to accommodate the additional memory demand.
We implement our elastic remote network memory abstraction on top of RDMA to utilize its one-sided read / write operations. Several RDMA optimizations are introduced to provide efficient communication fabrics for enabling containers to leverage our cross-node memory abstraction and achieve maximum throughput. For example, we provide efficient merging and chaining of requests to reduce the cost of I/O. We develop new event polling mechanisms to adapt to diverse workload patterns. To demonstrate the effectiveness of our elastic host-remote container memory abstraction, we implement a transparent network memory storage for container execution (CNetMem) with two novel features. We introduce
group based remote memory abstraction to enable our elastic container memory abstraction to be scalable to large cluster size by managing a controlled pool of concurrent connections. Our grouping strategy uses pre-connection and mapping warm up to avoid connecting and
mapping remote memory via the performance critical path. We then introduce a rank-based node selection algorithm to find the most suitable remote node for mapping remote memory blocks by considering memory increasing and decreasing trend on the remote node as well as the amount of free memory available, which can help avoiding or minimizing remote eviction. We also utilize a new hybrid batching technique tailored especially for replication or disk-backup to increase fault tolerance with minimal performance impact.
Our next research is focused on providing scalable distributed data services across a large population of multi-cloud data centers and hybrid-cloud with edge clients (e.g., end devices). To support distributed services with no inherent bottleneck and high reachability to the overlay network, we design and develop a multi overlay network, with the application overlay offering dynamic and hierarchical grouping to support two types of distributed application services: scalable distributed network memory sharing system and federated learning system, and the management overlay providing efficient and transparent connection management for decentralized management of hierarchical peer to peer computation with high dependability and fault tolerance. By presenting in-depth analysis on architectures and scalable system design of our G-RING overlay network, we attempt to answer a number of most frequently asked and yet challenging questions regarding designing of distributed services and computing systems with massive participants: How do we manage the massive participants in a large-scale network? How do we provide a scalable architecture/protocol resilient and flexible to high churning participants?
Next, we implement and evaluate two distributed services on RING : Distributed Network Memory Sharing(DNMS) system and Federated Learning system to show the benefit and effectiveness of G-RING. We analyze the limitation of existing approaches for both services and discuss the benefit of implementation of these services on our G-RING system.