Skip to main content
Log in

Denumerable semi-Markov decision chains with small interest rates

  • Research Contributions
  • Published:
Annals of Operations Research Aims and scope Submit manuscript

Abstract

In this paper we investigate denumerable state semi-Markov decision chains with small interest rates. We consider average and Blackwell optimality and allow for multiple closed sets and unbounded immediate rewards. Our analysis uses the existence of a Laurent series expansion for the total discounted rewards and the continuity of its terms. The assumptions are expressed in terms of a weighted supremum norm. Our method is based on an algebraic treatment of Laurent series; it constructs an appropriate linear space with a lexicographic ordering. Using two operators and a positiveness property we establish the existence of bounded solutions to optimality equations. The theory is illustrated with an example of aK-dimensional queueing system. This paper is strongly based on the work of Denardo [11] and Dekker and Hordijk [7].

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. J.A. Bather, Optimal decision procedures for finite Markov chains, part I: Examples, Adv. Appl. Prob. 5 (1973) 328–339; part II: Communicating systems, Adv. Appl. Prob. 5 (1973) 521–540; part III: General convex systems. Adv. Appl. Prob. 5 (1973) 541–553.

    Google Scholar 

  2. D. Blackwell, Discrete dynamic programming, Ann. Math. Statist. 33 (1962) 719–726.

    Google Scholar 

  3. D. Blackwell, Discounted dynamic programming, Ann. Math. Statist. 36 (1965) 226–235.

    Google Scholar 

  4. K.L. Chung,Markov Chains with Stationary Transition Probabilities (Springer, Berlin, 1960).

    Google Scholar 

  5. E. Cinlar,Introduction to Stochastic Processes (Prentice-Hall, Englewood Cliffs, NJ, 1975).

    Google Scholar 

  6. J.S. De Cani, A dynamic programming algorithm for embedded Markov chains when the planning horizon is infinitely, Mgmt. Sci. 10 (1964) 716–733.

    Google Scholar 

  7. R. Dekker and A. Hordijk, Average, sensitive and Blackwell optimal policies in denumerable Markov decision chains with unbounded rewards, Math. Oper. Res. 13 (1988) 395–421.

    Google Scholar 

  8. R. Dekker and A. Hordijk, Recurrence conditions for average and Blackwell optimality in denumerable state Markov decision chains, Technical report, Dept. of Math, and Comp. Sci, Univ. of Leiden (1989) to appear in Math. Oper. Res.

  9. R. Dekker, Denumerable Markov decision chains: optimal policies for small interest rates, Ph.D. thesis, Univ. of Leiden (1985).

  10. E.V. Denardo and B.L. Miller, An optimality condition for discrete dynamic programming with no discounting, Ann. Math. Statist. 39 (1968) 1220–1227.

    Google Scholar 

  11. E.V. Denardo, Markov renewal programming with small interest rates, Ann. Math. Statist. 42 (1971) 477–496.

    Google Scholar 

  12. H. Deppe, On the existence of average optimal policies in semiregenerative decision models, Math. Oper. Res. 9 (1984) 558–575.

    Google Scholar 

  13. C. Derman, Denumerable state Markovian decision processes — average cost criterion, Ann. Math. Statist. 42 (1966) 1545–1553.

    Google Scholar 

  14. A. Federgruen, A. Hordijk and H.C. Tijms, Denumerable state semi-Markov decision processes with unbounded costs, average cost criterion, Stoch. Proc. Appl. 9 (1979) 223–235.

    Google Scholar 

  15. A. Federgruen, P.J. Schweitzer and H.C. Tijms, Denumerable undiscounted semi-Markov decision processes with unbounded rewards, Math. Oper. Res. 8 (1983) 298–313.

    Google Scholar 

  16. A. Federgruen and H.C. Tijms, The optimality equation in average cost denumerable state semi-Markov decision problems, recurrency conditions and algorithms, J. Appl. Prob. 15 (1978) 356–373.

    Google Scholar 

  17. J.M. Harrison, Discrete dynamic programming with unbounded rewards, Ann. Math. Statist. 43 (1972) 636–644.

    Google Scholar 

  18. A. Hordijk,Dynamic Programming and Markov Potential Theory, Mathematical Centre Tract no. 51, Amsterdam (1974).

  19. A. Hordijk and K. Sladky, Sensitive optimality criteria in countable state dynamic programming, Math. Oper. Res. 2 (1977) 1–14.

    Article  Google Scholar 

  20. R.A. Howard,Semi-Markovian Decision Processes, Proc. Int. Statist. Inst., Ottawa, Canada (1963).

  21. W.S. Jewell, Markov-renewal programming I: formulation, finite return models; Markov-renewal programming II: infinite return models, example, Oper. Res. 11 (1963) 938–971.

    Article  Google Scholar 

  22. G.P. Klimov, Time-sharing service systems I, Th. Prob. Appl. 19 (1974) 532–551.

    Google Scholar 

  23. J.B. Lasserre, Conditions for existence of average and Blackwell optimal stationary policies in denumerable Markov decision processes, J. Math. Anal. Appl. 136 (1988) 479–490.

    Google Scholar 

  24. E. Mann, Optimality equations and sensitive optimality in bounded Markov decision processes, Optimization 16 (1985) 767–781.

    Google Scholar 

  25. B.L. Miller, Finite state continuous time Markov decision processes with infinite planning horizon, J. Math. Anal. Appl. 22 (1968) 552–569.

    Google Scholar 

  26. B.L. Miller and A.F. Veinott Jr., Discrete dynamic programming with a small interest rate, Ann. Math. Statist. 40 (1969) 366–370.

    Google Scholar 

  27. S.M. Ross, Non-discounted denumerable Markovian decision models, Ann. Math. Statist. 39 (1968) 412–423.

    Google Scholar 

  28. S.M. Ross,Applied Probability Models with Optimization Applications (Holden-Day, San Fransisco, 1970).

    Google Scholar 

  29. H.L. Royden,Real Analysis (MacMillan, New-York, 2nd ed. 1968).

    Google Scholar 

  30. M. SchÄl, Conditions for optimality in dynamic programming and for the limit ofn-stage optimal policies to be optimal, Z. Wahr. verw. Gebiete 32 (1975) 179–196.

    Google Scholar 

  31. M. SchÄl, On the second optimality equation for semi-Markov decision models, Technical report. Inst. Angew. Math., Univ. Bonn (1989) submitted for publication.

  32. P.J. Schweitzer, Perturbation theory and Markovian decision processes, Ph.D. dissertation, Mass. Inst. of Techn. (1965).

  33. L.J. Sennott, Average cost optimal stationary policies in infinite state Markov decision processes with unbounded costs, Oper. Res. 37 (1989) 626–633.

    Google Scholar 

  34. L.J. Sennott, Average cost semi-Markov decision processes and the control of queueing systems. Prob. Eng. Inform. Sci. 2 (1989) 247–272.

    Article  Google Scholar 

  35. F.M. Spieksma, The existence of sensitive optimal policies in two multi-dimensional queueing models, this volume, pp. 273–296.

  36. Sh. Stidham, Jr. and R.R. Weber, Monotonic and insensitive optimal policies for control of queues with undiscounted costs, Oper. Res. 87 (1989) 611–625.

    Google Scholar 

  37. J.A.E.E. van Nunen,Contracting Markov Decision Processes, Mathematical Centre Tract no. 71, Amsterdam (1976).

  38. J.A.E.E. van Nunen and J. Wessels, Markov decision processes with unbounded rewards, in:Markov Decision Theory, eds. H.C. Tijms and J. Wessels, Mathematical Centre Tract no. 93, Amsterdam (1976).

  39. R.R. Weber and Sh. Stidham, Jr., Optimal control of service rates in networks of queues, Adv. Appl. Prob. 19 (1987) 202–218.

    Google Scholar 

  40. A.F. Veinott Jr., On discrete dynamic programming with sensitive discount optimality criteria, Ann. Math. Stat. 40 (1969) 1635–1660.

    Google Scholar 

  41. W.H.M. Zijm, The optimality equations in multichain denumerable state Markov decision processes with the average cost criterion: the bounded cost case, Stat. Decisions 3 (1985) 143–165.

    Google Scholar 

  42. W.H.M. Zijm, The optimality equations in multichain denumerable state Markov decision processes with the average cost criterion: the unbounded cost case, C.Q.M. Note 22, Centre for Quant. Methods, Philips B.V. Eindhoven (1984).

Download references

Author information

Authors and Affiliations

Authors

Additional information

This research has partially been sponsored by the Netherlands Organization for Scientific Research (NWO).

Rights and permissions

Reprints and permissions

About this article

Cite this article

Dekker, R., Hordijk, A. Denumerable semi-Markov decision chains with small interest rates. Ann Oper Res 28, 185–211 (1991). https://doi.org/10.1007/BF02055581

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1007/BF02055581

Keywords

Navigation