Thesis Title: Spatio-Temporal Change-point Detection and Constrained Bayesian Optimization
Advisors: Dr. Seong-Hee Kim and Dr. Yao Xie
Committee members:
Dr. Kamran Paynabar
Dr. Jianjun Shi
Dr. Mustafa M. Aral (Department of Civil Engineering, Bartin University)
Date and Time: Friday, March 15, 2019, 2:00 PM
Location: Groseclose 226A
Abstract:
This thesis makes contributions to two research topics: spatio-temporal change-point detection and constrained Bayesian optimization. Spatio-temporal change-point detection is concerned with detecting statistical anomalies based on multiple data streams collected at different locations. The first two chapters of the thesis address two challenges in spatio-temporal change-point detection: (i) how to deal with high-dimensional data, and (ii) how to capture spatial and temporal correlations. Bayesian optimization is a prevalent approach for optimization problems defined by expensive-to-evaluate black-box functions. In the third chapter, we develop a practical algorithm for optimization problems with a black-box objective function and black-box constraints.
In Chapter 1, we study dimension reduction via spatial scanning. Most control charts that use scan statistics for spatio-temporal change-point detection operate on full observation vectors. Most dimension reduction techniques for high-dimensional data are applied as a post-processing step rather than at the data acquisition stage, and thus still require the full sample covariance matrix. In high-dimensional applications, (i) the sample covariance matrix tends to be ill-conditioned because the number of samples is limited; (ii) inverting such a sample covariance matrix causes numerical issues; and (iii) aggregating information from all variables may lead to high communication costs in sensor networks. We consider a set of reduced-dimension (RD) control charts that perform dimension reduction during data acquisition by spatial scanning, avoiding these computational difficulties and potentially high communication costs. We characterize the performance difference between the RD and full-observation approaches, in terms of average run lengths, under several common spatial correlation models. Our results show that the RD approach incurs little performance loss under the correlation models considered in this chapter while retaining all the implementation benefits. Our theoretical analysis is verified by extensive numerical studies, including a water quality monitoring application.
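The ill-conditioning described in point (i) can be illustrated with a short numerical sketch. This is not code from the thesis: the function names, the dimensions, and the use of a contiguous block of variables as a stand-in for a spatial scanning window are all assumptions made for illustration. It shows only that a sample covariance matrix estimated from fewer samples than variables is rank-deficient, while the covariance of a small window of variables is not.

```python
import numpy as np


def sample_covariance(n_samples, dim, seed=0):
    """Sample covariance of i.i.d. standard normal data (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, dim))
    # rowvar=False: each column is a variable, each row an observation.
    return np.cov(x, rowvar=False)


def window_covariance(cov, start, width):
    """Sub-covariance of a contiguous block of variables.

    The contiguous block is an assumed stand-in for a local scanning window.
    """
    return cov[start:start + width, start:start + width]


n_samples, dim, width = 20, 50, 5
full_cov = sample_covariance(n_samples, dim)
local_cov = window_covariance(full_cov, start=0, width=width)

# With fewer samples than variables, the rank is at most n_samples - 1,
# so the full matrix cannot be inverted reliably.
full_rank = np.linalg.matrix_rank(full_cov)
full_cond = np.linalg.cond(full_cov)
local_rank = np.linalg.matrix_rank(local_cov)
local_cond = np.linalg.cond(local_cov)

print("full rank:", full_rank, "of", dim)
print("window rank:", local_rank, "of", width)
```

In this sketch the window uses only 5 variables against 20 samples, so its covariance is full rank and can be inverted, whereas the 50-variable matrix cannot.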
In Chapter 2, we propose an efficient score statistic, called the S3T statistic, to detect the emergence of a spatially and temporally correlated signal from either fixed-sample or sequential data. The signal may cause a mean shift and/or a change in the covariance structure. The score statistic captures both the spatial and temporal structures of the change and is therefore particularly powerful in detecting weak signals, while remaining computationally efficient. Our main theoretical contribution is a set of analytical approximations of the false alarm rate of the detection procedures. Numerical experiments on simulated and real data, as well as a real case study of water quality monitoring, demonstrate the good performance of our procedure.
In Chapter 3, we study the problem of optimal sensor network design, which is formulated as a joint problem of constrained black-box function optimization and spatio-temporal change-point detection. We propose a practical algorithm called Confidence-Set based Constrained Bayesian Optimization (CSCBO), which provides a flexible framework for handling noisy black-box function constraints and is easy to implement. We also extend the algorithm to tackle a challenge that arises specifically in the sensor network design problem: we use the Wasserstein similarity metric to deal with high-dimensional binary decision variables. Finally, the S3T statistic proposed in Chapter 2 is combined with CSCBO to identify optimal sensor network designs that are robust to sensor measurement errors.