Thesis Title: Statistics, Computation & Applications
Advisor: Dr. Xiaoming Huo, School of Industrial and Systems Engineering
Committee members:
Dr. Valerie Thomas, School of Industrial and Systems Engineering
Dr. Jianjun Shi, School of Industrial and Systems Engineering
Dr. Yajun Mei, School of Industrial and Systems Engineering
Dr. Yao Xie, School of Industrial and Systems Engineering
Dr. Wenjing Liao, School of Mathematics
Date and Time: 9am-11am, Wednesday, March 25th, 2020
Meeting URL:
https://bluejeans.com/6928821939?src=join_info
Meeting ID
692 882 193 9
Want to dial in from a phone?
Dial one of the following numbers:
+1.408.740.7256 (US (San Jose))
+1.408.317.9253 (US (Primary))
(see all numbers - https://www.bluejeans.com/premium-numbers)
Enter the meeting ID and passcode followed by #
Connecting from a room system?
Dial: bjn.vc or 199.48.152.152 and enter your meeting ID & passcode
Abstract:
When statistics meets real applications, the computational aspect of statistical methods becomes critical. In this dissertation, I aim to improve the computational efficiency of several statistical methods, so that they become both computationally and statistically optimal. Inspired by the recent development of distance-based methods in statistics, I first propose a novel distance-based canonical analysis method. Second, I study an efficient algorithm for calculating distance-based statistics. Third, I develop a new semidefinite programming algorithm for power flow analysis problems; it appears to be more robust than existing methods.
More details follow. In the first part of this dissertation, we introduce a novel dimension reduction method called distance-based independence screening for canonical analysis (DISCA), which can be used to reduce the dimensions of two random vectors of arbitrary dimension. The essence of DISCA is to use distance correlation, the distance-based independence measure proposed by Székely and Rizzo in 2007, to eliminate the “redundant” dimensions until no further elimination is feasible. Numerically, DISCA solves a non-convex optimization problem. Algorithms and theoretical justifications are provided, and comparisons with existing methods demonstrate its accuracy, universality, and effectiveness. An R package, DISCA, is available on GitHub.
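For readers unfamiliar with distance correlation, the following is a minimal Python sketch of the standard sample statistic built from double-centered pairwise distance matrices. It is an illustration only and is not taken from the DISCA package; the function names `_centered_distances` and `distance_correlation` are hypothetical. Note that it forms full pairwise distance matrices, so its cost grows quadratically with the sample size.

```python
import numpy as np


def _centered_distances(z):
    """Return the double-centered Euclidean distance matrix of the rows of z."""
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]  # treat a 1-D input as a column of scalar observations
    diff = z[:, None, :] - z[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Subtract row means and column means, then add back the grand mean.
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation between paired samples x and y (rows are observations)."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()   # squared sample distance covariance
    dvar_x = (a * a).mean()  # squared sample distance variance of x
    dvar_y = (b * b).mean()  # squared sample distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))
```

For an exact linear relation between two univariate samples, such as y = 2x + 1, this function returns 1 up to rounding; for independent samples it tends to be close to 0 as the sample size grows.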
Noting that the distance correlation used in DISCA becomes computationally expensive as the dimension of the space increases, in the second part of this dissertation we accelerate the calculation of distance-based statistics by projecting multidimensional variables onto pre-specified projection directions. This improves the computational complexity from O(m^2) to O(nm log m), where n is the number of projection directions and m is the sample size. Computational savings are achieved when n << m / log m. The optimal pre-specified projection directions can be obtained by minimizing the worst-case difference between the true distance and the approximated distance. We provide solutions and greedy algorithms for different scenarios, and confirm the advantage of our technique over the pure Monte Carlo approach, in which the directions are randomly selected rather than pre-calculated.
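The projection idea can be illustrated with the pure Monte Carlo baseline mentioned above. The sketch below is not the optimized, pre-calculated directions of the dissertation. It assumes randomly drawn unit directions and relies on a standard identity: for a direction u uniform on the unit sphere in R^p, the expected absolute projection of a vector v is proportional to its Euclidean norm. The names `random_directions` and `projected_norm` are hypothetical.

```python
import math

import numpy as np


def random_directions(k, p, rng):
    """Draw k directions uniformly on the unit sphere in R^p."""
    u = rng.standard_normal((k, p))
    return u / np.linalg.norm(u, axis=1, keepdims=True)


def projected_norm(v, directions):
    """Approximate the Euclidean norm of v by averaging absolute 1-D projections."""
    p = directions.shape[1]
    # Scaling constant: 1 / E|u_1| for u uniform on the unit sphere in R^p.
    c_p = math.sqrt(math.pi) * math.gamma((p + 1) / 2) / math.gamma(p / 2)
    return float(c_p * np.abs(directions @ v).mean())


rng = np.random.default_rng(0)
v = np.array([3.0, 4.0, 12.0])  # true Euclidean norm is 13
approx = projected_norm(v, random_directions(20000, 3, rng))
```

Because each projection is univariate, pairwise distances between multivariate observations can be replaced by averages of univariate distances, which is what allows fast univariate algorithms to be applied. With random directions the approximation error shrinks only at the Monte Carlo rate, which is the motivation for choosing the directions in advance.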
In the third part of this dissertation, we turn our focus to applications of statistical computational algorithms in the power systems area. A new semidefinite programming algorithm is proposed to solve the power flow and power system state estimation problems. Both kinds of problems are non-convex, and convex relaxation is the typical approach to handling non-convexity in the power systems area; however, the objective functions must be carefully designed in order to preserve equivalence before and after relaxation. We first reformulate the two types of complex-valued problems as non-convex problems with real-valued objective functions. We then show that an alternating semidefinite programming algorithm can be applied and that it is not sensitive to the starting point, without sacrificing accuracy. Convergence analysis is provided, and numerical studies on representative power systems datasets demonstrate the accuracy of our proposed algorithm and its applicability to various scenarios with different given measurements.