Denumerable Markov decision chains: sensitive optimality criteria
In this paper we investigate denumerable state semi-Markov decision chains with small interest rates. We consider average and Blackwell optimality and allow for multiple closed sets and unbounded immediate rewards. Our analysis uses the existence of a Laurent series expansion for the total discounted rewards and the continuity of its terms. The assumptions are expressed in terms of a weighted supremum norm. Our method is based on an algebraic treatment of Laurent series; it constructs an appropriate linear space with a lexicographic ordering. Using two operators and a positiveness property we establish the existence of bounded solutions to optimality equations. The theory is illustrated with an example of a K-dimensional queueing system. This paper is strongly based on the work of Denardo and of Dekker and Hordijk. This research has been partially sponsored by the Netherlands Organization for Scientific Research (NWO).
|Keywords||Markov Decision Chains, operations research|
|Persistent URL||dx.doi.org/10.1007/BF02055581, hdl.handle.net/1765/2247|
Hordijk, A., & Dekker, R. (1991). Denumerable Markov decision chains: sensitive optimality criteria. Operations Research Proceedings, 28(1), 185–211. doi:10.1007/BF02055581