Splitting Randomized Stationary Policies in Markov Decision Processes

Event Details
  • Date/Time:
    • Thursday April 2, 2009 - Friday April 3, 2009
      11:00 am - 11:59 am
  • Location: Executive classroom
  • Fee(s):
    $0.00
Contact
Anita Race
H. Milton Stewart School of Industrial and Systems Engineering

TITLE: Splitting Randomized Stationary Policies in Markov Decision Processes

SPEAKER: Eugene A. Feinberg

ABSTRACT:

This talk discusses whether an occupation measure for a given randomized stationary policy in a Markov Decision Process (MDP) can be represented as a convex combination of occupation measures for simpler policies. In particular, the question is whether an occupation measure for a randomized stationary policy can be represented as a convex integral combination of occupation measures for deterministic policies. If this is possible, we say that the policy can be split.
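
To fix ideas, here is one standard formalization (the notation is ours, not from the abstract). For the discounted criterion with factor $\beta \in (0,1)$ and initial distribution $\mu$, the occupation measure of a policy $\pi$ is

  $x_\pi(s,a) = \sum_{t=0}^{\infty} \beta^t \, \mathbb{P}_\mu^\pi\{ s_t = s,\ a_t = a \},$

and splitting $\pi$ means finding a probability measure $m$ on the set $\Phi$ of deterministic stationary policies such that

  $x_\pi = \int_\Phi x_\varphi \, m(d\varphi).$

When $m$ is supported on finitely many policies, this is an ordinary convex combination $x_\pi = \sum_i \alpha_i x_{\varphi_i}$ with $\alpha_i \ge 0$ and $\sum_i \alpha_i = 1$.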

We present general results on such representations and focus our attention on two particular situations: (i) the given subset of states is a singleton (splitting at a state), and (ii) the given subset is finite and the policy uses a finite number of actions on this set (finite splitting). For splitting at a state, we present new formulas and provide necessary and sufficient conditions under which a policy can be split. For finite splitting, we describe the structure of the split and provide an algorithm for its computation. Important structural properties of the finite splitting are that every two consecutive policies in the split differ at only one state and that the number of summands in the convex combination is relatively small.
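
A minimal numerical sketch of splitting at a state (our illustration in Python, on assumed toy data; this is not the speaker's algorithm). In a two-state discounted MDP, a policy pi that randomizes only at one state has an occupation measure lying on the segment between the occupation measures of the two deterministic policies phi1 and phi2 that differ from pi only at that state; note that the mixing weight alpha is in general not the randomization probability itself.

  import numpy as np

  # Toy 2-state, 2-action discounted MDP; all numbers are illustrative.
  beta = 0.9                     # discount factor
  mu = np.array([1.0, 0.0])      # initial state distribution
  P = {0: np.array([[0.8, 0.2],  # P[a][s, s'] = transition probability
                    [0.3, 0.7]]),
       1: np.array([[0.1, 0.9],
                    [0.6, 0.4]])}

  def occupation_measure(pi):
      """x(s, a) = sum_t beta^t * Prob(s_t = s, a_t = a) under pi[s, a]."""
      P_pi = sum(pi[:, a][:, None] * P[a] for a in (0, 1))  # transitions under pi
      y = np.linalg.solve(np.eye(2) - beta * P_pi.T, mu)    # state occupation
      return y[:, None] * pi                                # spread y over actions

  # pi randomizes only at state 0; phi1 and phi2 are deterministic and
  # differ from pi only at that state (splitting at a state).
  pi   = np.array([[0.4, 0.6], [1.0, 0.0]])
  phi1 = np.array([[1.0, 0.0], [1.0, 0.0]])
  phi2 = np.array([[0.0, 1.0], [1.0, 0.0]])
  x, x1, x2 = map(occupation_measure, (pi, phi1, phi2))

  # Solve x = alpha * x1 + (1 - alpha) * x2 for the mixing weight alpha.
  d = (x1 - x2).ravel()
  alpha = (x - x2).ravel() @ d / (d @ d)
  print(alpha)                                          # about 0.24, not 0.4
  print(np.allclose(x, alpha * x1 + (1 - alpha) * x2))  # True: pi splits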

We apply finite splitting to constrained discounted MDPs with countable state spaces and prove that an optimal mixed stationary policy can be selected in such a way that every two consecutive stationary policies in the split differ at only one state. Although this talk discusses MDPs with infinite state and action sets, the results on finite splitting are new even for discounted MDPs with finite state and action sets. We shall also discuss splitting of state-action frequencies in unichain MDPs with average rewards per unit time.
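
As background on why mixed policies arise here (a standard linear-programming view, summarized by us and not part of the abstract): for finite state and action sets, a constrained discounted MDP can be written as

  $\max_x \sum_{s,a} r(s,a)\, x(s,a)$ subject to $\sum_{s,a} c_k(s,a)\, x(s,a) \le C_k$, $k = 1, \dots, K$, and $x \in X$,

where $X$ is the polytope of occupation measures, whose extreme points are exactly the occupation measures of deterministic stationary policies. An optimal $x^*$ typically lies on a face of $X$ cut out by the $K$ side constraints rather than at an extreme point, so it is a convex combination of a small number of deterministic policies; finite splitting makes such a combination explicit and, per the result above, orders it so that consecutive policies differ at a single state.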

This talk is based in part on a joint paper with Eric V. Denardo and Uriel G. Rothblum.

Additional Information

In Campus Calendar
No
Groups

School of Industrial and Systems Engineering (ISYE)

Invited Audience
No audiences were selected.
Categories
Seminar/Lecture/Colloquium
Keywords
MDPs
Status
  • Created By: Anita Race
  • Workflow Status: Published
  • Created On: Oct 12, 2009 - 4:36pm
  • Last Updated: Oct 7, 2016 - 9:47pm