In this paper we investigate denumerable state semi-Markov decision chains with small interest rates. We consider average and Blackwell optimality and allow for multiple closed sets and unbounded immediate rewards. Our analysis uses the existence of a Laurent series expansion for the total discounted rewards and the continuity of its terms. The assumptions are expressed in terms of a weighted supremum norm. Our method is based on an algebraic treatment of Laurent series; it constructs an appropriate linear space with a lexicographic ordering. Using two operators and a positiveness property we establish the existence of bounded solutions to optimality equations. The theory is illustrated with an example of a K-dimensional queueing system. This paper is strongly based on the work of Denardo [11] and Dekker and Hordijk [7]. This research was partially sponsored by the Netherlands Organization for Scientific Research (NWO).
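
For orientation, the objects named in the abstract can be sketched in their most familiar form. The formulas below are for the finite-state, discrete-time case under a fixed stationary policy, which is an assumption made here for illustration; the paper's denumerable semi-Markov setting is more general, and its exact expansion is not reproduced. The symbols $P$ (transition matrix), $r$ (reward vector), $P^{*}$ (limiting matrix), $D$ (deviation matrix), $\rho$ (interest rate) and $\mu$ (positive weight function) are notation chosen for this sketch, not taken from the paper.

```latex
% Discounted value with interest rate \rho > 0, i.e. discount factor 1/(1+\rho):
\[
  v_\rho \;=\; \sum_{t=0}^{\infty} (1+\rho)^{-t} P^{t} r .
\]
% Laurent series expansion, valid for all sufficiently small \rho > 0:
\[
  v_\rho \;=\; (1+\rho) \sum_{n=-1}^{\infty} \rho^{n} y_n ,
  \qquad
  y_{-1} = P^{*} r , \quad
  y_{n} = (-1)^{n} D^{\,n+1} r \quad (n \ge 0).
\]
% Weighted supremum norm with respect to a positive function \mu on the state space:
\[
  \lVert v \rVert_{\mu} \;=\; \sup_{i} \frac{\lvert v(i) \rvert}{\mu(i)} .
\]
```

In this simplified setting, $y_{-1}$ is the average reward (gain) and $y_0$ the bias, so average optimality concerns the first term only, while Blackwell optimality amounts to maximizing the whole sequence $(y_{-1}, y_0, y_1, \dots)$ lexicographically. The weighted norm is what lets rewards that are unbounded in the ordinary supremum norm still be treated as bounded relative to $\mu$.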

Markov Decision Chains, operations research
dx.doi.org/10.1007/BF02055581, hdl.handle.net/1765/2247
ERIM Article Series (EAS)
Operations Research Proceedings
Erasmus Research Institute of Management

Hordijk, A, & Dekker, R. (1991). Denumerable Markov decision chains: sensitive optimality criteria. Operations Research Proceedings, 28(1), 185–211. doi:10.1007/BF02055581