AIspace

What is the value in the top-left state after performing another step of value iteration?

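We take the maximum over the four possible actions. Each term below has the form probability × (reward + 0.9 × value of the resulting state), where 0.9 is the discount factor: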
- left: 0.8(0+0.9(-10)) + 0.1(0+0.9(-10)) + 0.1(-100+0.9(0)) = -18.1
- up: 0.8(0+0.9(-10)) + 0.1(0+0.9(-10)) + 0.1(-100+0.9(0)) = -18.1
- down: 0.1(0+0.9(-10)) + 0.1(-100+0.9(0)) + 0.8(-100+0.9(0)) = -90.9
- right: 0.1(0+0.9(-10)) + 0.1(-100+0.9(0)) + 0.8(-100+0.9(0)) = -90.9
Therefore the new value of the top-left state is -18.1, which is achieved by both left and up.
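
The arithmetic above can be double-checked with a short script. This is only a sketch: the (probability, reward, next-state value) triples are copied term by term from the four lines above, and the names `outcomes`, `q_value`, and `GAMMA` are made up for illustration rather than taken from AIspace.

```python
GAMMA = 0.9  # discount factor, read off the 0.9 multiplier in each term (assumption)

# Each action maps to a list of (probability, reward, value of next state),
# copied term by term from the worked answer.
outcomes = {
    "left":  [(0.8, 0, -10), (0.1, 0, -10), (0.1, -100, 0)],
    "up":    [(0.8, 0, -10), (0.1, 0, -10), (0.1, -100, 0)],
    "down":  [(0.1, 0, -10), (0.1, -100, 0), (0.8, -100, 0)],
    "right": [(0.1, 0, -10), (0.1, -100, 0), (0.8, -100, 0)],
}


def q_value(terms, gamma=GAMMA):
    """Expected return of one action: sum of p * (r + gamma * v)."""
    return sum(p * (r + gamma * v) for p, r, v in terms)


# One value-iteration backup: evaluate every action, then take the max.
q = {action: q_value(terms) for action, terms in outcomes.items()}
new_value = max(q.values())

for action, value in q.items():
    print(f"{action}: {value:.1f}")
print(f"new value: {new_value:.1f}")
```

Running it prints -18.1 for left and up, -90.9 for down and right, and a new value of -18.1, matching the hand calculation.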