In this paper we investigate denumerable state semi-Markov decision chains with small interest rates. We consider average and Blackwell optimality and allow for multiple closed sets and unbounded immediate rewards. Our analysis uses the existence of a Laurent series expansion for the total discounted rewards and the continuity of its terms. The assumptions are expressed in terms of a weighted supremum norm. Our method is based on an algebraic treatment of Laurent series; it constructs an appropriate linear space with a lexicographic ordering. Using two operators and a positiveness property we establish the existence of bounded solutions to optimality equations. The theory is illustrated with an example of a K-dimensional queueing system. This paper is strongly based on the work of Denardo [11] and Dekker and Hordijk [7]. This research was partially sponsored by the Netherlands Organization for Scientific Research (NWO).

Markov decision chains, operations research
ERIM Article Series (EAS)
Operations Research Proceedings
Erasmus Research Institute of Management

Hordijk, A., & Dekker, R. (1991). Denumerable Markov decision chains: sensitive optimality criteria. Operations Research Proceedings, 28(1), 185–211. doi:10.1007/BF02055581