An optimality principle for Markovian decision processes

Paul J. Schweitzer; Bezalel Gavish

doi:10.1016/0022-247X(76)90243-2

J. Math. Anal. Appl.

Paper

01 Jan 1976

Abstract

The following optimality principle is established for finite undiscounted or discounted Markov decision processes: If a policy is (gain, bias, or discounted) optimal in one state, it is also optimal for all states reachable from this state using this policy. The optimality principle is used constructively to demonstrate the existence of a policy that is optimal in every state, and then to derive the coupled functional equations satisfied by the optimal return vectors. This reverses the usual sequence, where one first establishes (via policy iteration or linear programming) the solvability of the coupled functional equations, and then shows that the solution is indeed the optimal return vector and that the maximizing policy for the functional equations is optimal for every state. © 1976.
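The discounted case of the principle can be checked numerically on a toy problem. The sketch below is illustrative only and is not taken from the paper: the three-state, two-action process, the discount factor, and all names (`P`, `R`, `GAMMA`, `principle_holds`, and so on) are assumptions made for the example. It enumerates every stationary deterministic policy, and whenever a policy attains the optimal discounted return in some state it verifies that the same holds in every state reachable from that state under the policy. A brute-force check of this kind is a sanity test under stated assumptions, not a proof.

```python
from itertools import product

GAMMA = 0.9  # discount factor (assumed for this example)

# Hypothetical MDP: P[s][a] lists (next_state, probability) pairs,
# R[s][a] is the immediate reward for taking action a in state s.
P = {
    0: {0: [(0, 1.0)], 1: [(1, 0.5), (2, 0.5)]},
    1: {0: [(1, 1.0)], 1: [(2, 1.0)]},
    2: {0: [(2, 1.0)], 1: [(0, 1.0)]},
}
R = {
    0: {0: 1.0, 1: 0.0},
    1: {0: 2.0, 1: 0.0},
    2: {0: 0.5, 1: 3.0},
}
STATES = sorted(P)


def q_value(v, s, a):
    """One-step lookahead: reward plus discounted expected next value."""
    return R[s][a] + GAMMA * sum(p * v[t] for t, p in P[s][a])


def optimal_values(sweeps=2000):
    """Approximate the optimal return vector by value iteration."""
    v = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        v = {s: max(q_value(v, s, a) for a in P[s]) for s in STATES}
    return v


def policy_values(policy, sweeps=2000):
    """Approximate the return vector of a fixed stationary policy."""
    v = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        v = {s: q_value(v, s, policy[s]) for s in STATES}
    return v


def reachable(policy, start):
    """States reachable from `start` when following `policy`."""
    seen, stack = {start}, [start]
    while stack:
        s = stack.pop()
        for t, p in P[s][policy[s]]:
            if p > 0 and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def principle_holds(tol=1e-8):
    """Check the principle for every deterministic stationary policy."""
    v_star = optimal_values()
    for actions in product(*(sorted(P[s]) for s in STATES)):
        policy = dict(zip(STATES, actions))
        v_pi = policy_values(policy)
        for s in STATES:
            if abs(v_pi[s] - v_star[s]) > tol:
                continue  # policy is not optimal in s; nothing to check
            if any(abs(v_pi[t] - v_star[t]) > tol
                   for t in reachable(policy, s)):
                return False
    return True


if __name__ == "__main__":
    print(principle_holds())
```

In this toy process the policy that always takes action 0 is optimal in state 1 but not in state 0. That is consistent with the principle, because under that policy state 1 is absorbing and state 0 is never reached from it.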
