The arrow of time casts its ugly shadow on MDPs
June 17, 2008
The good thing about MDPs is that for just about any quantity you can come up with, you can write down its dynamic programming equations and solve them. I had to realize that there are exceptions. The reason: you cannot reverse the flow of information (which, funnily enough, flows backwards in time in MDPs). What I found quite interesting is that the arrow of time hits you only when you want decision-making and randomness at the same time. If you give up either one, everything works out fine.
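To make the "backwards in time" remark concrete, here is a minimal sketch of finite-horizon backward induction for the ordinary expected-return case. The function name `backward_induction`, the array layout `P[a][s][s2]` / `R[a][s]`, and the two-state toy MDP are all assumptions made up for illustration. Note how the values for the last time step are computed first and then propagated towards the start.

```python
def backward_induction(P, R, horizon):
    """Finite-horizon dynamic programming (hypothetical helper).

    P[a][s][s2] is the probability of moving from s to s2 under action a,
    R[a][s] is the expected immediate reward for taking a in s.
    Returns the values at time 0 and the greedy policy for every time step.
    """
    n_actions, n_states = len(R), len(R[0])
    V = [0.0] * n_states   # values after the final step: nothing left to earn
    policy = []
    for _ in range(horizon):
        # Q[s][a]: reward now plus expected value of what comes *after*.
        Q = [[R[a][s] + sum(P[a][s][s2] * V[s2] for s2 in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        greedy = [max(range(n_actions), key=lambda a: Q[s][a])
                  for s in range(n_states)]
        V = [Q[s][greedy[s]] for s in range(n_states)]
        policy.insert(0, greedy)   # later steps are solved first, so prepend
    return V, policy


# Toy MDP (assumed): action 0 stays put and pays 1 in state 1,
# action 1 jumps to a uniformly random state and pays nothing.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.5, 0.5], [0.5, 0.5]]]
R = [[0.0, 1.0],
     [0.0, 0.0]]

V, policy = backward_induction(P, R, horizon=3)
print(V)        # values at time 0
print(policy)   # policy[t][s] is the greedy action at time t in state s
```

The loop runs from the end of the horizon to the beginning: each step only needs the values of the step after it, which is exactly the direction of information flow that cannot be reversed.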