ARC Colloquium: Semih Cayci (Ohio State University)


Event Details
  • Date/Time: Monday, February 3, 2020, 11:00 am - 11:59 am
  • Location: Groseclose 402
  • Fee(s): N/A

Algorithms & Randomness Center (ARC)

Semih Cayci (Ohio State University)

Monday, February 3, 2020

Groseclose 402 - 11:00 am

 

Title:  Budget-Constrained Learning and Optimization with Bandit Feedback

Abstract:  In numerous decision processes such as algorithm portfolio design, adaptive routing and dynamic pricing, each action incurs a random cost and yields a random and potentially correlated reward, where the process continues until the total cost exceeds a resource constraint. Motivated by these applications, we investigate the budget-constrained bandit problem in which the decision-maker aims to maximize the expected total reward subject to a budget constraint on the total cost. For this problem, we show that logarithmic regret is achievable as long as moments of order p > 2 exist. In particular, we propose robust learning algorithms that incorporate linear estimators to extract and exploit the correlation between the cost and reward of an arm for optimal sample complexity. We prove that these algorithms achieve tight regret bounds, which are optimal up to a constant factor in the Gaussian case.

In the second part, we focus on the time-constrained bandit problem, and allow the decision-maker to interrupt an ongoing task and forgo its immediate reward for a potentially faster alternative. We show that this controlled interrupt mechanism improves the total reward linearly over time for a broad class of completion time distributions. In order to learn the optimal action and interrupt strategy, we propose learning algorithms that exploit the information structure of the problem to achieve order-optimal regret. We show that these learning algorithms provide efficient solutions for boosting the time-efficiency of stochastic local search methods in various fundamental applications such as the k-SAT problem.
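To make the problem setup concrete: each pull of an arm draws a random (cost, reward) pair, and play stops once cumulative cost exceeds the budget. The sketch below is purely illustrative and is not the speaker's algorithm; it uses a simple UCB-style reward-to-cost index, and the arm distributions and the `budget_ucb` helper are hypothetical.

```python
import math
import random

def budget_ucb(arms, budget, seed=0):
    """Illustrative budget-constrained bandit loop: pull arms until the
    cumulative cost exceeds `budget`, picking the arm with the highest
    optimistic reward-to-cost index (a UCB-style heuristic)."""
    rng = random.Random(seed)
    K = len(arms)
    pulls = [0] * K
    cost_sum = [0.0] * K
    reward_sum = [0.0] * K
    spent, earned, t = 0.0, 0.0, 0
    while spent <= budget:
        t += 1
        if t <= K:
            a = t - 1                      # initialization: pull each arm once
        else:
            # empirical reward rate plus an exploration bonus
            a = max(range(K), key=lambda i: reward_sum[i] / cost_sum[i]
                    + math.sqrt(2.0 * math.log(t) / pulls[i]))
        cost, reward = arms[a](rng)        # bandit feedback: random (cost, reward)
        pulls[a] += 1
        cost_sum[a] += cost
        reward_sum[a] += reward
        spent += cost
        earned += reward
    return earned, spent, pulls

# Two hypothetical arms, each a sampler returning a (cost, reward) pair.
arms = [
    lambda r: (r.uniform(0.5, 1.5), r.uniform(0.2, 0.4)),   # cheap, low yield
    lambda r: (r.uniform(1.0, 2.0), r.uniform(0.8, 1.2)),   # costly, high yield
]
earned, spent, pulls = budget_ucb(arms, budget=50.0)
```

Note the stopping rule mirrors the abstract's setup: the process continues until the total cost exceeds the constraint, so the final pull may overshoot the budget by at most one cost sample.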

----------------------------------

Speaker's Webpage

Videos of recent talks are available at: https://smartech.gatech.edu/handle/1853/46836

To subscribe to the seminar email list: arc-colloq@Klauscc.gatech.edu

Additional Information

In Campus Calendar
No
Groups

ARC

Invited Audience
Postdoc, Graduate students, Undergraduate students
Categories
Seminar/Lecture/Colloquium
Keywords
No keywords were submitted.
Status
  • Created By: Francella Tonge
  • Workflow Status: Published
  • Created On: Jan 27, 2020 - 9:51am
  • Last Updated: Jan 27, 2020 - 9:53am