Dynamic Programming Example - Optimal Control
My question is about step T-1. It is optimal to go from states 2 and 4 to state 5, because you get the $1000 reward. You should also remain in state 3 if you're there, because you can only lose money by going to another state. Remaining in 5 is also trivial. However, I don't understand why you should stay in state 1. Isn't that the same case as states 2 and 4? If you go from 1 to 5 you would end up with $965, compared to $0 if you stay.
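To make the comparison concrete, here is a minimal sketch of the step T-1 decision for state 1 as I read it. Only the $1000 reward, the $965 net, and the $0 for staying come from the problem as I described it; the $35 move cost is my inference (1000 - 965), and the names `REWARD_AT_T`, `MOVE_COST_1_TO_5` and `compare_options_from_state_1` are made up for illustration.

```python
# Values at the final step T for the two states involved.
# State 5 pays the $1000 reward; ending in state 1 pays nothing.
REWARD_AT_T = {1: 0, 5: 1000}

# ASSUMPTION: cost of moving from state 1 to state 5,
# inferred from the $965 figure (1000 - 965), not read from the diagram.
MOVE_COST_1_TO_5 = 35


def compare_options_from_state_1():
    """Return the better option at step T-1 from state 1, plus both payoffs."""
    payoffs = {
        "stay in 1": REWARD_AT_T[1],                    # no cost, no reward
        "go to 5": REWARD_AT_T[5] - MOVE_COST_1_TO_5,   # reward minus move cost
    }
    best = max(payoffs, key=payoffs.get)
    return best, payoffs


if __name__ == "__main__":
    best, payoffs = compare_options_from_state_1()
    print(payoffs)
    print("better option under these assumptions:", best)
```

Under these assumptions the comparison favours going to 5, which is why the stated answer for state 1 puzzles me. If the move from 1 to 5 is not actually allowed, or costs more than I assumed, the result would change.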
Penny for your thoughts!