Average, sensitive and Blackwell-optimal policies in denumerable Markov decision chains with unbounded rewards
In this paper we consider a (discrete-time) Markov decision chain with a denumerable state space and compact action sets, and we assume that, for all states, the rewards and transition probabilities depend continuously on the actions. The first objective of this paper is to develop an analysis of average optimality without assuming a special Markov chain structure. To this end, we present a set of conditions guaranteeing average optimality that are automatically fulfilled in the finite state and action model. The second objective is to study average and discount optimality simultaneously, as Veinott (1969) did for the finite state and action model. We investigate the concepts of n-discount and Blackwell optimality in the denumerable state space, using a Laurent series expansion of the discounted rewards. Under the same conditions as for average optimality, we establish solutions to the n-discount optimality equations for every n.
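The Laurent series expansion mentioned above is, in the standard form going back to Veinott (1969), an expansion of the discounted value in powers of the interest rate. A sketch in generic notation (the paper's own symbols may differ): for a stationary policy f and discount factor \(\alpha \in (0,1)\), writing \(\rho = (1-\alpha)/\alpha\) for the interest rate,

\[
v_\alpha^f \;=\; \sum_{n=-1}^{\infty} \rho^{\,n}\, y_n^f ,
\]

where \(y_{-1}^f\) is the average (gain) of policy f and \(y_0^f\) its bias. Comparing two policies lexicographically through the coefficients \(y_{-1}, y_0, \dots, y_n\) yields the n-discount optimality criteria, and requiring optimality of all coefficients gives Blackwell optimality.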
Keywords: denumerable Markov decision chains, Laurent series expansion, average optimality, multi-chain model, optimality equation, sensitive optimality criteria, unbounded immediate rewards
Dekker, R., & Hordijk, A. (1988). Average, sensitive and Blackwell-optimal policies in denumerable Markov decision chains with unbounded rewards. Mathematics of Operations Research, 13(3), 395–420. Retrieved from http://hdl.handle.net/1765/2251