Statistics Seminar - Ery Arias-Castro


Event Details
  • Date/Time:
    • Friday October 7, 2016 - Saturday October 8, 2016
      11:00 am - 11:59 am
  • Location: ISyE Main 341
  • Fee(s): N/A
Contact

Xiaoming Huo

Summaries

Summary Sentence: Statistics Seminar - Ery Arias-Castro

Full Summary: No summary paragraph submitted.

TITLE: Distribution-Free Detection of Structured Anomalies: Permutation and Rank-Based Scans

ABSTRACT:

The scan statistic is by far the most popular method for anomaly detection, with applications in syndromic surveillance, signal and image processing, and target detection based on sensor networks, among others.  The use of the scan statistic in such settings yields a hypothesis testing procedure, where the null hypothesis corresponds to the absence of anomalous behavior.  If the null distribution is known, then calibration of a scan-based test is relatively easy, as it can be done by Monte Carlo simulation.  When the null distribution is unknown, calibration is less straightforward.
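
As a rough illustration of the known-null case described above, the following sketch calibrates a scan test by Monte Carlo simulation. It is not taken from the paper: the one-dimensional interval scan, the standard normal null, and the names scan_statistic, monte_carlo_threshold, and max_len are all assumptions made for this example.

```python
import numpy as np

def scan_statistic(x, max_len):
    """Largest standardized sum over contiguous intervals of length <= max_len.

    Assumes the null observations have mean 0 and variance 1, so that a sum
    over k points is standardized by dividing by sqrt(k).
    """
    x = np.asarray(x, dtype=float)
    csum = np.concatenate(([0.0], np.cumsum(x)))
    best = -np.inf
    for k in range(1, min(max_len, len(x)) + 1):
        interval_sums = csum[k:] - csum[:-k]  # sums of all length-k intervals
        best = max(best, interval_sums.max() / np.sqrt(k))
    return float(best)

def monte_carlo_threshold(n, max_len, alpha=0.05, n_sim=200, seed=0):
    """Estimate the level-alpha critical value by simulating from the null.

    The null is assumed (hypothetically) to be i.i.d. standard normal.
    """
    rng = np.random.default_rng(seed)
    null_stats = [scan_statistic(rng.standard_normal(n), max_len)
                  for _ in range(n_sim)]
    return float(np.quantile(null_stats, 1 - alpha))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal(200)
    x[50:70] += 2.0  # plant an anomalous interval
    threshold = monte_carlo_threshold(n=200, max_len=20)
    print("reject null:", scan_statistic(x, max_len=20) > threshold)
```

The test rejects when the observed scan statistic exceeds the simulated quantile. This only works because the null distribution can be sampled directly, which is exactly what is unavailable in the unknown-null setting.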

We investigate two procedures.  The first is a calibration by permutation; the second is a rank-based scan test, which is distribution-free and less sensitive to outliers.  Furthermore, the rank scan test requires only a one-time calibration for a given data size, making it computationally much more appealing.  In both cases, we quantify the performance loss with respect to an oracle scan test that knows the null distribution.  We show that using either of these calibration procedures results in only a very small loss of power in the context of a natural exponential family.  This includes the classical normal location model, popular in signal processing, and the Poisson model, popular in syndromic surveillance.  We perform numerical experiments on simulated data, further supporting our theory, and also on a real dataset from genomics.
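
The two calibration ideas above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the interval scan, the centering and scaling choices, and the names scan, permutation_pvalue, rank_transform, and rank_scan_threshold are hypothetical, and ties in the data are assumed absent.

```python
import numpy as np

def scan(x, max_len):
    """Max standardized sum over contiguous intervals of length <= max_len."""
    csum = np.concatenate(([0.0], np.cumsum(x)))
    return float(max(
        (csum[k:] - csum[:-k]).max() / np.sqrt(k)
        for k in range(1, min(max_len, len(x)) + 1)
    ))

def permutation_pvalue(x, max_len, n_perm=200, seed=0):
    """Calibration by permutation: compare the observed scan with scans of
    randomly shuffled copies of the same (centered) data."""
    rng = np.random.default_rng(seed)
    z = np.asarray(x, dtype=float)
    z = z - z.mean()
    observed = scan(z, max_len)
    exceed = sum(scan(rng.permutation(z), max_len) >= observed
                 for _ in range(n_perm))
    return (1 + exceed) / (1 + n_perm)

def rank_transform(x):
    """Replace observations by centered, scaled ranks (assumes no ties)."""
    n = len(x)
    ranks = np.argsort(np.argsort(x)) + 1  # ranks 1..n
    return (ranks - (n + 1) / 2) / np.sqrt((n * n - 1) / 12)

def rank_scan_threshold(n, max_len, alpha=0.05, n_sim=200, seed=0):
    """One-time calibration: under the null the ranks form a uniformly random
    permutation of 1..n, so the threshold depends only on n, not on the data."""
    rng = np.random.default_rng(seed)
    base = rank_transform(np.arange(n))
    sims = [scan(rng.permutation(base), max_len) for _ in range(n_sim)]
    return float(np.quantile(sims, 1 - alpha))

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    x = rng.standard_exponential(200)  # non-normal null, unknown to the test
    x[50:70] += 3.0                    # planted anomaly
    print("permutation p-value:", permutation_pvalue(x, max_len=20))
    threshold = rank_scan_threshold(n=200, max_len=20)
    print("rank scan rejects:", scan(rank_transform(x), 20) > threshold)
```

The permutation test must reshuffle every new dataset, whereas the rank scan threshold can be computed once for a given n and reused, which reflects the computational advantage mentioned above.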

Joint work with Rui M. Castro(1), Ervin Tánczos(1), and Meng Wang(2)

(1) Technische Universiteit Eindhoven

(2) Stanford University

The paper is available online at

http://arxiv.org/abs/1508.03002

Additional Information

In Campus Calendar
No
Groups

School of Industrial and Systems Engineering (ISYE)

Invited Audience
Faculty/Staff, Public, Undergraduate students, Graduate students
Categories
No categories were selected.
Keywords
No keywords were submitted.
Status
  • Created By: Anita Race
  • Workflow Status: Published
  • Created On: Sep 28, 2016 - 1:51pm
  • Last Updated: Apr 13, 2017 - 5:14pm