Thesis Title: Sequential Interval Estimation for Bernoulli Trials
Advisors: Dr. David Goldsman and Dr. Yajun Mei
Committee members:
Dr. Brani Vidakovic
Dr. Christos Alexopoulos
Dr. George Moustakides (Rutgers University)
Date and Time: Friday, July 20, 2018, 10:00 AM
Location: ISyE Groseclose 304
Abstract:
Interval estimation of a binomial proportion is one of the most basic problems in statistics, with many important real-world applications. Classical applications include estimating the prevalence of a rare disease and assessing accuracy in remote sensing. In these applications, the sample size is fixed beforehand, and a confidence interval for the proportion is obtained. However, in many modern applications, sampling is costly and time-consuming, e.g., estimating the customer click-through probability in online marketing campaigns, or estimating the probability that a stochastic system satisfies a specific property, as in statistical model checking. Because these applications tend to require extensive time and cost, it is advantageous to reduce the sample size while still assuring a satisfactory quality (coverage) level for the corresponding interval estimate. Sequential interval estimation pursues this goal by allowing the sample size to be random and, in particular, by formulating a stopping time controlled by the observations themselves. The literature on the sequential setup of the problem is limited compared to that on its fixed-sample-size counterpart, and optimality has not been established. The work in this thesis aims to extend the body of knowledge on sequential interval estimation for Bernoulli trials, addressing both theoretical and practical concerns.
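For orientation, the fixed-sample-size baseline mentioned above can be sketched with the textbook normal-approximation (Wald) calculation. This is not taken from the thesis; the function name `fixed_sample_size` and the parameter `p_guess` are illustrative assumptions. It returns the smallest n for which the Wald half-width at the guessed proportion does not exceed the target half-width.

```python
import math
from statistics import NormalDist


def fixed_sample_size(half_width, alpha=0.05, p_guess=0.5):
    """Sample size from the normal approximation: n >= z^2 * p(1-p) / h^2.

    p_guess = 0.5 is the worst case, since p(1-p) is largest there.
    Illustrative baseline only, not the procedure studied in the thesis.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided normal quantile
    n = (z ** 2) * p_guess * (1.0 - p_guess) / half_width ** 2
    return math.ceil(n)


if __name__ == "__main__":
    print(fixed_sample_size(0.05))               # worst case, p = 0.5
    print(fixed_sample_size(0.05, p_guess=0.1))  # smaller if p is near 0.1
```

The gap between the worst-case value and the value for a proportion far from 0.5 is the kind of saving a sequential rule tries to capture without knowing the proportion in advance.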
In Chapter 2, we propose an optimal sequential methodology for obtaining fixed-width confidence intervals for a binomial proportion when prior knowledge of the proportion is available. We assume that there exists a prior distribution for the binomial proportion, and our goal is to minimize the average number of samples while guaranteeing a minimum coverage probability. We show that our stopping time is always bounded from above and below: a sufficient amount of information must be accumulated before the stopping rule is applied, and the procedure always terminates. Finally, we compare our method with the optimum fixed-sample-size procedure as well as with existing alternative sequential schemes.
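To make the idea of a data-driven stopping time concrete, here is a minimal sketch of a simple Bayesian stopping rule. It is not the optimal rule of Chapter 2, which the abstract does not specify. The assumptions are all illustrative: a uniform Beta(1, 1) prior, an interval centered at the posterior mean, stopping as soon as the posterior probability of the interval reaches 1 - alpha, and a hard cap `n_max` imposed by hand rather than derived. The names `beta_cdf_int` and `sequential_fixed_width` are hypothetical.

```python
import math


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for positive integers a, b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a),
    with each term computed in log space to avoid overflow.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    n = a + b - 1
    log_x, log_1mx = math.log(x), math.log1p(-x)
    total = 0.0
    for j in range(a, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(j + 1)
                    - math.lgamma(n - j + 1)
                    + j * log_x + (n - j) * log_1mx)
        total += math.exp(log_term)
    return min(total, 1.0)


def sequential_fixed_width(observations, half_width, alpha=0.05, n_max=1000):
    """Stop once the posterior mass of [center - h, center + h] is >= 1 - alpha.

    Assumes a uniform prior, so after n trials with s successes the
    posterior is Beta(1 + s, 1 + n - s). Returns (n, lower, upper).
    """
    successes = 0
    for n, y in enumerate(observations, start=1):
        successes += int(y)
        a, b = 1 + successes, 1 + n - successes
        center = a / (a + b)  # posterior mean
        lower, upper = center - half_width, center + half_width
        coverage = beta_cdf_int(upper, a, b) - beta_cdf_int(lower, a, b)
        if coverage >= 1.0 - alpha or n >= n_max:  # n_max is an ad hoc cap
            return n, max(lower, 0.0), min(upper, 1.0)
    raise ValueError("observations exhausted before the rule stopped")


if __name__ == "__main__":
    import random
    rng = random.Random(0)
    data = (rng.random() < 0.05 for _ in range(1000))  # true p = 0.05
    print(sequential_fixed_width(data, half_width=0.1))
```

When the observed proportion is far from 0.5, a rule of this kind typically stops well before the worst-case fixed sample size, which is the effect the sequential formulation is designed to exploit.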
In Chapter 3, we propose a two-stage sequential method for obtaining tandem-width confidence intervals for a binomial proportion when no prior knowledge of the proportion is given. By tandem-width, we mean that the half-width of the confidence interval of the proportion is not fixed beforehand; it is instead required to satisfy two different upper bounds depending on the underlying value of the binomial proportion. To tackle this problem, we propose a simple but useful sequential method for obtaining fixed-width confidence intervals for the binomial proportion based on the minimax estimator of the binomial proportion.
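The abstract does not spell out how the minimax estimator enters the stopping rule, so the sketch below covers only the estimator itself. Under squared-error loss, the classical minimax estimator of a binomial proportion from s successes in n trials is (s + sqrt(n)/2) / (n + sqrt(n)), and its mean squared error equals 1 / (4 (1 + sqrt(n))^2) for every value of the proportion. The function names are illustrative, and the check is an exact enumeration rather than a simulation.

```python
import math


def minimax_estimate(successes, n):
    """Classical minimax estimator of p under squared-error loss."""
    root_n = math.sqrt(n)
    return (successes + root_n / 2.0) / (n + root_n)


def exact_mse(n, p):
    """Exact mean squared error of the minimax estimator at proportion p,
    computed by summing over the Binomial(n, p) probability mass function."""
    mse = 0.0
    for s in range(n + 1):
        pmf = math.comb(n, s) * p ** s * (1.0 - p) ** (n - s)
        mse += pmf * (minimax_estimate(s, n) - p) ** 2
    return mse


if __name__ == "__main__":
    n = 25
    constant_risk = 1.0 / (4.0 * (1.0 + math.sqrt(n)) ** 2)
    for p in (0.05, 0.3, 0.5):
        # The exact MSE should match the constant risk at every p.
        print(p, exact_mse(n, p), constant_risk)
```

Because the risk does not depend on the unknown proportion, the precision of this estimator can be controlled through n alone, which is one plausible reason it suits a setting with no prior knowledge.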
In Chapter 4, we extend the ideas developed for Bernoulli distributions in Chapter 2 to interval estimation for arbitrary distributions, using an alternative optimality formulation. Here, we propose a conditional-cost formulation to circumvent certain analytical and computational difficulties. Specifically, we assume that an i.i.d. random process is observed sequentially, with its common probability density function having a random parameter that must be estimated. We follow a semi-Bayesian approach in which we assign a cost to the pair (estimator, true parameter), and our goal is to minimize the average sample size while guaranteeing an average cost below some prescribed level. For a variety of examples, we compare our method with the optimum fixed-sample-size procedure and with other existing sequential schemes.