Denumerable semi-Markov decision chains with small interest rates

Dekker, Rommert; Hordijk, Arie

doi:10.1007/BF02055581

Denumerable semi-Markov decision chains with small interest rates

Research Contributions
Published: December 1991

Volume 28, pages 185–211, (1991)
Cite this article

Annals of Operations Research Aims and scope Submit manuscript

Rommert Dekker¹^nAff2 &
Arie Hordijk¹

65 Accesses
5 Citations
Explore all metrics

Abstract

In this paper we investigate denumerable state semi-Markov decision chains with small interest rates. We consider average and Blackwell optimality and allow for multiple closed sets and unbounded immediate rewards. Our analysis uses the existence of a Laurent series expansion for the total discounted rewards and the continuity of its terms. The assumptions are expressed in terms of a weighted supremum norm. Our method is based on an algebraic treatment of Laurent series; it constructs an appropriate linear space with a lexicographic ordering. Using two operators and a positiveness property we establish the existence of bounded solutions to optimality equations. The theory is illustrated with an example of aK-dimensional queueing system. This paper is strongly based on the work of Denardo [11] and Dekker and Hordijk [7].

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Constrained Optimality for First Passage Criteria in Semi-Markov Decision Processes

Average criteria in denumerable semi-Markov decision chains under risk-aversion

Article 21 August 2023

Sample-Path Optimality in Average Markov Decision Chains Under a Double Lyapunov Function Condition

References

J.A. Bather, Optimal decision procedures for finite Markov chains, part I: Examples, Adv. Appl. Prob. 5 (1973) 328–339; part II: Communicating systems, Adv. Appl. Prob. 5 (1973) 521–540; part III: General convex systems. Adv. Appl. Prob. 5 (1973) 541–553.
Google Scholar
D. Blackwell, Discrete dynamic programming, Ann. Math. Statist. 33 (1962) 719–726.
Google Scholar
D. Blackwell, Discounted dynamic programming, Ann. Math. Statist. 36 (1965) 226–235.
Google Scholar
K.L. Chung,Markov Chains with Stationary Transition Probabilities (Springer, Berlin, 1960).
Google Scholar
E. Cinlar,Introduction to Stochastic Processes (Prentice-Hall, Englewood Cliffs, NJ, 1975).
Google Scholar
J.S. De Cani, A dynamic programming algorithm for embedded Markov chains when the planning horizon is infinitely, Mgmt. Sci. 10 (1964) 716–733.
Google Scholar
R. Dekker and A. Hordijk, Average, sensitive and Blackwell optimal policies in denumerable Markov decision chains with unbounded rewards, Math. Oper. Res. 13 (1988) 395–421.
Google Scholar
R. Dekker and A. Hordijk, Recurrence conditions for average and Blackwell optimality in denumerable state Markov decision chains, Technical report, Dept. of Math, and Comp. Sci, Univ. of Leiden (1989) to appear in Math. Oper. Res.
R. Dekker, Denumerable Markov decision chains: optimal policies for small interest rates, Ph.D. thesis, Univ. of Leiden (1985).
E.V. Denardo and B.L. Miller, An optimality condition for discrete dynamic programming with no discounting, Ann. Math. Statist. 39 (1968) 1220–1227.
Google Scholar
E.V. Denardo, Markov renewal programming with small interest rates, Ann. Math. Statist. 42 (1971) 477–496.
Google Scholar
H. Deppe, On the existence of average optimal policies in semiregenerative decision models, Math. Oper. Res. 9 (1984) 558–575.
Google Scholar
C. Derman, Denumerable state Markovian decision processes — average cost criterion, Ann. Math. Statist. 42 (1966) 1545–1553.
Google Scholar
A. Federgruen, A. Hordijk and H.C. Tijms, Denumerable state semi-Markov decision processes with unbounded costs, average cost criterion, Stoch. Proc. Appl. 9 (1979) 223–235.
Google Scholar
A. Federgruen, P.J. Schweitzer and H.C. Tijms, Denumerable undiscounted semi-Markov decision processes with unbounded rewards, Math. Oper. Res. 8 (1983) 298–313.
Google Scholar
A. Federgruen and H.C. Tijms, The optimality equation in average cost denumerable state semi-Markov decision problems, recurrency conditions and algorithms, J. Appl. Prob. 15 (1978) 356–373.
Google Scholar
J.M. Harrison, Discrete dynamic programming with unbounded rewards, Ann. Math. Statist. 43 (1972) 636–644.
Google Scholar
A. Hordijk,Dynamic Programming and Markov Potential Theory, Mathematical Centre Tract no. 51, Amsterdam (1974).
A. Hordijk and K. Sladky, Sensitive optimality criteria in countable state dynamic programming, Math. Oper. Res. 2 (1977) 1–14.
Article Google Scholar
R.A. Howard,Semi-Markovian Decision Processes, Proc. Int. Statist. Inst., Ottawa, Canada (1963).
W.S. Jewell, Markov-renewal programming I: formulation, finite return models; Markov-renewal programming II: infinite return models, example, Oper. Res. 11 (1963) 938–971.
Article Google Scholar
G.P. Klimov, Time-sharing service systems I, Th. Prob. Appl. 19 (1974) 532–551.
Google Scholar
J.B. Lasserre, Conditions for existence of average and Blackwell optimal stationary policies in denumerable Markov decision processes, J. Math. Anal. Appl. 136 (1988) 479–490.
Google Scholar
E. Mann, Optimality equations and sensitive optimality in bounded Markov decision processes, Optimization 16 (1985) 767–781.
Google Scholar
B.L. Miller, Finite state continuous time Markov decision processes with infinite planning horizon, J. Math. Anal. Appl. 22 (1968) 552–569.
Google Scholar
B.L. Miller and A.F. Veinott Jr., Discrete dynamic programming with a small interest rate, Ann. Math. Statist. 40 (1969) 366–370.
Google Scholar
S.M. Ross, Non-discounted denumerable Markovian decision models, Ann. Math. Statist. 39 (1968) 412–423.
Google Scholar
S.M. Ross,Applied Probability Models with Optimization Applications (Holden-Day, San Fransisco, 1970).
Google Scholar
H.L. Royden,Real Analysis (MacMillan, New-York, 2nd ed. 1968).
Google Scholar
M. SchÄl, Conditions for optimality in dynamic programming and for the limit ofn-stage optimal policies to be optimal, Z. Wahr. verw. Gebiete 32 (1975) 179–196.
Google Scholar
M. SchÄl, On the second optimality equation for semi-Markov decision models, Technical report. Inst. Angew. Math., Univ. Bonn (1989) submitted for publication.
P.J. Schweitzer, Perturbation theory and Markovian decision processes, Ph.D. dissertation, Mass. Inst. of Techn. (1965).
L.J. Sennott, Average cost optimal stationary policies in infinite state Markov decision processes with unbounded costs, Oper. Res. 37 (1989) 626–633.
Google Scholar
L.J. Sennott, Average cost semi-Markov decision processes and the control of queueing systems. Prob. Eng. Inform. Sci. 2 (1989) 247–272.
Article Google Scholar
F.M. Spieksma, The existence of sensitive optimal policies in two multi-dimensional queueing models, this volume, pp. 273–296.
Sh. Stidham, Jr. and R.R. Weber, Monotonic and insensitive optimal policies for control of queues with undiscounted costs, Oper. Res. 87 (1989) 611–625.
Google Scholar
J.A.E.E. van Nunen,Contracting Markov Decision Processes, Mathematical Centre Tract no. 71, Amsterdam (1976).
J.A.E.E. van Nunen and J. Wessels, Markov decision processes with unbounded rewards, in:Markov Decision Theory, eds. H.C. Tijms and J. Wessels, Mathematical Centre Tract no. 93, Amsterdam (1976).
R.R. Weber and Sh. Stidham, Jr., Optimal control of service rates in networks of queues, Adv. Appl. Prob. 19 (1987) 202–218.
Google Scholar
A.F. Veinott Jr., On discrete dynamic programming with sensitive discount optimality criteria, Ann. Math. Stat. 40 (1969) 1635–1660.
Google Scholar
W.H.M. Zijm, The optimality equations in multichain denumerable state Markov decision processes with the average cost criterion: the bounded cost case, Stat. Decisions 3 (1985) 143–165.
Google Scholar
W.H.M. Zijm, The optimality equations in multichain denumerable state Markov decision processes with the average cost criterion: the unbounded cost case, C.Q.M. Note 22, Centre for Quant. Methods, Philips B.V. Eindhoven (1984).

Download references

Author information

Rommert Dekker
Present address: Dept. MFTS, Shell Internationale Petroleum Mij B.V., The Hague, The Netherlands

Authors and Affiliations

Department of Mathematics and Computer Science, University of Leiden, P.O. Box 9512, 2300 RA, Leiden, The Netherlands
Rommert Dekker & Arie Hordijk

Authors

Rommert Dekker
View author publications
You can also search for this author in PubMed Google Scholar
Arie Hordijk
View author publications
You can also search for this author in PubMed Google Scholar

Additional information

This research has partially been sponsored by the Netherlands Organization for Scientific Research (NWO).

Rights and permissions

Reprints and permissions

About this article

Cite this article

Dekker, R., Hordijk, A. Denumerable semi-Markov decision chains with small interest rates. Ann Oper Res 28, 185–211 (1991). https://doi.org/10.1007/BF02055581

Download citation

Issue Date: December 1991
DOI: https://doi.org/10.1007/BF02055581

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Denumerable semi-Markov decision chains with small interest rates

Abstract

Access this article

Similar content being viewed by others

Constrained Optimality for First Passage Criteria in Semi-Markov Decision Processes

Average criteria in denumerable semi-Markov decision chains under risk-aversion

Sample-Path Optimality in Average Markov Decision Chains Under a Double Lyapunov Function Condition

References

Author information

Authors and Affiliations

Additional information

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Denumerable semi-Markov decision chains with small interest rates

Abstract

Access this article

Similar content being viewed by others

Constrained Optimality for First Passage Criteria in Semi-Markov Decision Processes

Average criteria in denumerable semi-Markov decision chains under risk-aversion

Sample-Path Optimality in Average Markov Decision Chains Under a Double Lyapunov Function Condition

References

Author information

Authors and Affiliations

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation