Conference time
June 29, 2008
As you probably all know, EWRL starts tomorrow, followed immediately by ICML (then UAI and COLT). I will be at the first two, so there probably won’t be many new posts in the next one-and-a-half weeks. After that, however, I guess my head will be full of things to blog about. So, stay tuned :-)
Also, the Second Annual RL-competition is held on the workshop day of ICML, where my Tetris code is trying to do its best. Unfortunately, I had much less time to tend to it than I expected, but the results are not bad so far. I intend to write about that, too!
Wandering aimlessly in continuous space
June 23, 2008
If reinforcement learning techniques were pet animals, epsilon-greedy exploration would certainly be the cockroach. It is undemanding, it can live just about everywhere, it is stupid beyond words, and its presence on your property is quite embarrassing. Fortunately, there are lots of prettier and more intelligent species in the pet shop of exploration methods. Unfortunately, most of these die when entering a continuous state space (or, almost equivalently, when function approximation appears).
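For readers who have not met the cockroach in person, here is a minimal sketch of epsilon-greedy action selection. The function name, the list-of-values representation and the uniform tie-breaking are my own assumptions for illustration, not something taken from a particular library.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of estimated action values.

    q_values and epsilon are hypothetical names chosen for this sketch.
    """
    # With probability epsilon, explore: pick any action uniformly at random.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Otherwise exploit: pick a best-looking action, breaking ties at random.
    best = max(q_values)
    best_actions = [a for a, q in enumerate(q_values) if q == best]
    return rng.choice(best_actions)
```

Note that nothing here depends on the structure of the state space, which is one way to see why this method survives just about everywhere.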
Rewards: keep out of reach of paradoxes
June 19, 2008
The reward hypothesis states
“That all of what we mean by goals and purposes can be well thought of as maximization of the expected value of the cumulative sum of a received scalar signal (reward).”
For RL researchers, this is really good news: it means that whenever they come up with a reward-maximizing algorithm, it can attack (in principle) any goal-oriented task. Although the hypothesis is a bit vague, one can heartily agree with it. One must be careful, though, to keep it out of reach of paradoxes.
Although in its original form Newcomb’s paradox involves omniscient deities, philosophers arguing over free will and other dubious figures, it seems that it can be freed of these, leaving something that is still paradox-ish, but also RL-ish: a dangerous mix.
Exploration in Puerto Rico
June 17, 2008
Puerto Rico has been the #1 game on BoardGameGeek for ages. While it’s not my personal favorite, I can fully understand that: the game has a beautiful design (I mean the game mechanics, not the artwork). One thing in the design that stands out, being simple but brilliant, is the mechanism for diversifying the gameplay. And it is something very familiar from reinforcement learning.
The arrow of time casts its ugly shadow on MDPs
June 17, 2008
The good thing about MDPs is that you can come up with just about any quantity, write down its dynamic programming equations, and solve them. I had to realize that there are exceptions. The reason is that you cannot reverse the flow of information (which, funnily enough, flows backwards in time in MDPs). And what I found quite interesting: the arrow of time hits you only when you want decision-making and randomness at the same time. If you give up either one, everything goes just fine.
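To make the "backwards in time" remark concrete, here is a rough sketch of ordinary finite-horizon backward induction for expected total reward. It is only the standard case, not one of the exceptions mentioned above, and the data layout (`P[s][a]` as a list of `(probability, next_state)` pairs, `R[s][a]` as the immediate reward) is an assumption made for this example.

```python
def backward_induction(P, R, horizon):
    """Finite-horizon dynamic programming for expected total reward.

    P[s][a] is a list of (probability, next_state) pairs and R[s][a] is the
    immediate reward; both layouts are assumptions made for this sketch.
    Returns the values at the first step and one greedy policy per time step.
    """
    n_states = len(P)
    V = [0.0] * n_states  # nothing left to collect after the final step
    policy = []
    # Start at the last decision and walk towards the first one:
    # information about the future flows backwards in time.
    for _ in range(horizon):
        new_V, step_policy = [], []
        for s in range(n_states):
            q = [R[s][a] + sum(p * V[s2] for p, s2 in P[s][a])
                 for a in range(len(P[s]))]
            best_a = max(range(len(q)), key=q.__getitem__)
            new_V.append(q[best_a])
            step_policy.append(best_a)
        V = new_V
        policy.insert(0, step_policy)  # earlier time steps go in front
    return V, policy
```

Each pass uses only the values of the step that comes after it in time, so the computation cannot be run in the forward direction without already knowing the future values.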
Introducing category “doofus theory”
June 16, 2008
Sometimes (usually when I should be working hard on something several days before a deadline) I like to marvel at “profound connections” and “great ideas” (which, when brought to sunlight, turn out to be obvious, useless, long-known or faulty with roughly the same chance). They are ideal sources of procrastination: thinking about them seems like hard work, but is still delightfully useless. For these doofus theories, getting their own blog post may be well beyond their merits. Still, they get it (they even get their own category!), in the hope that you will enjoy reading them as much as I did. If you are in a procrastinating mood, feel free to contribute!
Blogs to check
June 10, 2008
It might be an occupational hazard, but I feel the urge to begin with a Literature Review. So, here goes my highly subjective list. For some time, I have been following several excellent research-related blogs. I also noticed some strange tendencies: there are excellent blogs on machine learning in general (half of them calling themselves “Yet another machine learning blog”), there are excellent specialized blogs on evolutionary methods, on data mining, on game AI and all kinds of game-related topics, but the reinforcement learning niche seemed strangely underpopulated until lately. We are lucky, however: the recently started RL blog has some of the biggest names in RL: Satinder Singh, Michael Littman, Peter Stone and Rich Sutton.
Welcome, dear reader
June 8, 2008
… to this new blog on reinforcement learning, games and all things related (in a broad sense of relatedness). I have (I guess, most of us have) lots of ideas that are somehow related to research, but I know too little about the subject, have no time to work on them, or the ideas are too vague or too funny to be included in any serious research project. I’ve decided to put some of these thoughts on display. Partly because it’s amusing, and partly because writing about them helps me a lot in thinking about them.
But hopefully, the blog will be something more: a place for discussion and brainstorming. For this, I need you, dear reader! I would love to hear your opinions and comments about the topics discussed here! Also, if you have questions, ideas, critiques (positive or negative), please let me know! Check back from time to time, or subscribe to the RSS feed to get the new posts and/or comments.
And… gimme evaluative feedback :-)