PhD Defense by Adit Ranadive

Event Details
  • Date/Time:
    • Thursday November 12, 2015 - Friday November 13, 2015
      4:00 pm - 5:59 pm
  • Location: KACB 2100
  • Fee(s):
    N/A
Summaries

Summary Sentence: Virtualized Resource Management in High Performance Fabric Clusters

Title: Virtualized Resource Management in High Performance Fabric Clusters

Adit Ranadive
School of Computer Science
College of Computing
Georgia Institute of Technology

Date: November 12, 2015 (Thursday)
Time: 12:00 PM - 2:00 PM (EST)
Location: KACB 2100

Committee:
---------------


Dr. Karsten Schwan (Advisor, School of Computer Science, Georgia Tech)
Dr. Ada Gavrilovska (Advisor, School of Computer Science, Georgia Tech)
Dr. Sudhakar Yalamanchili (School of Electrical and Computer Engineering, Georgia Tech)
Dr. Ellen Zegura (School of Computer Science, Georgia Tech)
Dr. Ling Liu (School of Computer Science, Georgia Tech)
Dr. Douglas M. Blough (School of Electrical and Computer Engineering, Georgia Tech)

Abstract:
------------


Providing performance and isolation guarantees for applications running in virtualized datacenter environments requires continuous management of the underlying physical resources. For communication- and I/O-intensive applications running on such platforms, the management methods must adequately deal with the shared use of the high-performance fabrics these applications require. In particular, new classes of latency-sensitive and data-intensive workloads running in virtualized environments rely on emerging fabrics like 40+ Gbps Ethernet and InfiniBand/RoCE with support for RDMA, VMM-bypass, and hardware-level virtualization (SR-IOV). However, the benefits of these technological advances are offset by several management constraints: (i) the hypervisor cannot monitor the VMs’ usage of these fabrics, which limits the platform’s ability to provide isolation and performance guarantees; (ii) the hypervisor cannot provide fine-grained I/O provisioning or make management decisions for VMs, which reduces the degree of consolidation the platforms can support; and (iii) without such support, it is harder to integrate these fabrics into emerging cloud computing platforms and datacenter fabric management solutions. These constraints are particularly challenging for workloads that span multiple VMs and use physical resources distributed across multiple server nodes and the interconnection fabric.

This thesis addresses the problem of realizing a flexible, dynamic resource management system for virtualized platforms with high performance fabrics. We make the following key contributions:
(i) A lightweight monitoring tool, IBMon, integrated with the hypervisor to monitor VMs’ use of RDMA-enabled virtualized interconnects, using memory introspection techniques.
(ii) The design and construction of a resource management system that leverages IBMon to provide performance guarantees to latency-sensitive applications. This system is built on microeconomic principles of supply and demand and can be deployed on a per-node (Resource Exchange) or a multi-node (Distributed Resource Exchange) basis. Fine-grained resource allocations can be enforced through several mechanisms, including CPU capping and fabric-level congestion control.
(iii) Sphinx, a fabric management solution that leverages Resource Exchange to orchestrate the network and provide latency proportionality for consolidated workloads, based on user- or application-specified policies.
(iv) Implementation and experimental evaluation using InfiniBand clusters virtualized with the Xen or KVM hypervisor, managed via the OpenFloodlight SDN controller, and using representative data-intensive and latency-sensitive benchmarks.

Additional Information

In Campus Calendar
No
Groups

Graduate Studies

Invited Audience
Public
Categories
Other/Miscellaneous
Keywords
PhD Defense
Status
  • Created By: Tatianna Richardson
  • Workflow Status: Published
  • Created On: Oct 28, 2015 - 10:55am
  • Last Updated: Oct 7, 2016 - 10:14pm