EWRL is over now. The next part is late not only because of the internet access difficulties, but also because of extreme exhaustion: talks from 9am to 6pm, then spontaneous social events with various people, capped off by a great banquet (held in a place that served as a brothel, a theater and a porn cinema at various stages of its existence before being transformed into a luxurious restaurant). And we’ve got the coolest conference T-shirt ever.

As for the talks, I gave up on picking favorites; browse the abstracts yourselves :-)

Just a few words about the two keynote talks. Dimitri Bertsekas spoke about the least-squares approach to solving the Bellman equations. These can be converted into huge systems of linear equations, too huge to be solved directly. To tame the beast, we can sample both rows and columns and, even better, use different sampling techniques for the rows and for the columns. We saw examples of this for LSPE and LSTD.
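To make the sampling idea a bit more concrete, here is a minimal sketch of plain LSTD(0) for evaluating a fixed policy. It is not taken from the talk: the function names (`lstd`, `sample_chain`), the small `ridge` term, and the toy three-state chain are all my own assumptions for illustration. The linear system is never formed exactly; it is accumulated from sampled transitions, where picking the state `s` loosely plays the role of row sampling and drawing the successor `s_next` that of column sampling.

```python
import numpy as np


def lstd(transitions, phi, gamma, ridge=1e-8):
    """Estimate w such that V(s) ~= phi(s) @ w, from sampled transitions.

    transitions: iterable of (s, r, s_next) tuples gathered under a fixed policy.
    phi: function mapping a state to a 1-D NumPy feature vector.
    ridge: tiny regulariser (an assumption of this sketch) that keeps the
           system solvable when some feature is never observed.
    """
    transitions = list(transitions)
    k = len(phi(transitions[0][0]))
    A = np.zeros((k, k))
    b = np.zeros(k)
    for s, r, s_next in transitions:
        f, f_next = phi(s), phi(s_next)
        # Sample-based estimate of Phi^T D (Phi - gamma * P * Phi)
        A += np.outer(f, f - gamma * f_next)
        # Sample-based estimate of Phi^T D r
        b += r * f
    return np.linalg.solve(A + ridge * np.eye(k), b)


def sample_chain(P, rewards, n, rng):
    """Draw n transitions: s uniformly at random, s_next from row P[s]."""
    n_states = len(rewards)
    samples = []
    for _ in range(n):
        s = int(rng.integers(n_states))
        s_next = int(rng.choice(n_states, p=P[s]))
        samples.append((s, float(rewards[s]), s_next))
    return samples


if __name__ == "__main__":
    # Hypothetical toy chain, chosen only for illustration.
    P = np.array([[0.5, 0.5, 0.0],
                  [0.0, 0.5, 0.5],
                  [0.5, 0.0, 0.5]])
    rewards = np.array([0.0, 0.0, 1.0])
    gamma = 0.8
    rng = np.random.default_rng(0)

    # One-hot features make the estimate comparable to the exact values.
    one_hot = lambda s: np.eye(3)[s]
    w = lstd(sample_chain(P, rewards, 20000, rng), one_hot, gamma)
    exact = np.linalg.solve(np.eye(3) - gamma * P, rewards)
    print("LSTD estimate:", w)
    print("exact values: ", exact)
```

With one-hot features the sketch reduces to solving the Bellman equations of the empirical chain, so the estimate should get close to the exact values as the number of samples grows. With fewer features than states, the same code gives the projected (approximate) solution instead.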

Jan Peters talked about robotics and policy search. He argued that if you want to teach your robots primitive motor skills (like balancing), you cannot hand-code the controller (it needs to be robust), you cannot really build a proper model, and you cannot afford uncontrolled exploration (robots are expensive things). What remains is policy search, which works well for imitation tasks. After this introduction, he talked about reward-weighted regression and the natural policy gradient, then showed several nifty videos of successful applications, like getting a ball on a string to hop into a bucket.

And now ICML is at full throttle.
