Reinforcement Learning 4: Dynamic Programming
4 Dynamic Programming
4.1 Policy Evaluation
Iterative policy evaluation
4.2 Policy Improvement
4.3 Policy Iteration
Example 4.2: Jack’s Car Rental
4.4 Value Iteration
Example 4.3: Gambler’s Problem