Title: Robust Learning Frameworks and Algorithms for Scalable Data Systems
Date: Friday, September 30th, 2022
Time: 10:00am-12:00pm (ET)
Location: https://gatech.zoom.us/j/4596258486?pwd=RTdjNWtpREszVkU3eW43WjMydUJEdz09
Zoom Meeting ID: 459 625 8486
Zoom Passcode: 705273
Ka Ho Chow
Ph.D. Student
School of Computer Science
Georgia Institute of Technology
Committee
=========
Dr. Ling Liu (Advisor, School of Computer Science, Georgia Institute of Technology)
Dr. Calton Pu (School of Computer Science, Georgia Institute of Technology)
Dr. Shamkant Navathe (School of Computer Science, Georgia Institute of Technology)
Dr. Lakshmish Ramaswamy (Department of Computer Science, University of Georgia)
Abstract
=======
The data explosion and advances in machine learning have transformed modern cognitive computing systems. Although deep learning has flourished in business, science, and engineering applications and services, it is known to be vulnerable to data corruption and adversarial manipulation. This dissertation research is dedicated to exploring, designing, and advancing robust learning algorithms and scalable frameworks for next-generation data-intensive systems.
The first contribution is to develop risk assessment frameworks for in-depth investigation of security threats in deep learning-driven visual recognition systems, covering vulnerabilities in both the model inference phase and the distributed model training phase. We identify potential risks unique to object detection systems arising from their multi-task learning nature and introduce TOG, a suite of optimization algorithms that generate deceptive queries, also known as adversarial examples, to fool well-trained object detection models. TOG is a pioneering framework that supports risk assessments on both one-stage and two-stage detection algorithms. It can target different loss functions in object recognition to deceive the victim model into misbehaving randomly or purposefully with domain knowledge-driven semantics. Similarly, we take a holistic approach to understanding the data poisoning vulnerability that typically arises in distributed model training. We introduce the first family of attacks, named perception poisoning, which effectively misleads the learning process of the global object detection model in federated learning by selectively poisoning various combinations of objectness, bounding boxes, and class labels. Our innovations offer practitioners comprehensive frameworks for risk management and help researchers identify root causes and gain insights for designing mitigation strategies.
The second contribution is to develop risk mitigation frameworks for building reliable systems with robustness guarantees against adversarial manipulation. Deceptive queries at the model inference phase can be detrimental to the integrity of numerous existing intelligent systems, and they can be transferred across different models to launch black-box attacks. To counter such a severe threat, we present the first-of-its-kind diversity-driven model fusion framework for robust object detection. It employs a team of models carefully constructed by our optimization algorithms and focal diversity methodology to conduct robust fusion through a three-stage technique. Extensive experiments validate its effectiveness in mitigating TOG and other state-of-the-art attacks and demonstrate enhanced detection accuracy in the benign scenario. For the perception poisoning threat during the distributed training phase, only a small population of clients is present, and malicious clients can contribute gradients inconsistently to obfuscate their identity. Such adaptivity makes it harder to identify malicious clients with minimal false alarms. To overcome these challenges, we introduce a new poisoning-resilient federated learning framework, STDLens, which combines a spatial-temporal forensic methodology with robust statistics to perform timely identification and removal of malicious clients. Our extensive experiments confirm that, even under various adaptive attacks, the STDLens-protected system has no observable performance degradation.
Beyond security threats due to deceptive queries and data poisoning, the third contribution of this dissertation research is to develop machine learning-enhanced algorithms that strengthen the reliability and scalability of microservice applications in hybrid clouds. Cyberattacks such as ransomware have been on the rise, and rapid recovery from such attacks with minimal data loss is crucial for business continuity. We introduce an algorithm, DeepRest, to estimate the resources expected to be needed to serve the application traffic received from end users. It enables the verification of resource usage by comparing the expected consumption with the actual measurement from the microservice application, without any assumption about workload periodicity. Any statistically unjustifiable resource usage can be identified as a potential threat. Our extensive studies confirm the effective detection of representative ransomware and crypto-jacking attacks. As a solution with dual purposes, DeepRest is also the first to support resource estimation for future, unseen application traffic (e.g., ten times more users purchasing products due to a special sale). While it enables precise scaling in advance, the expected resource usage can exceed the capacity of the current private computing infrastructure. We therefore propose an application-aware hybrid cloud migration planner that spans the microservice application across private and public clouds. The application can then enjoy virtually unlimited resources while remaining cost-effective and performance-optimized, with the least disruption to its regular operation during the migration process.
In this proposal exam, I will focus on the first two contributions and my ongoing research plan.