|
- Sutton's DYNA algorithm, which integrates planning, reaction and reinforcement learning, was implemented at ISR on the Robosoft ROBUTER mobile platform by Alex Weiser, a former Technische Universität München undergraduate student who carried out his final project at ISR/IST, supported by an ERASMUS grant. The platform uses odometry for self-localization inside a well-structured world of obstacles and empty cells. Starting with no prior knowledge of the world, it tries to reach a goal destination by trial and error. A reward is received if and only if the goal is reached.
After reaching the goal for the first time, the robot learns a path from start to goal while it keeps building a limited world model, using both real experiences and simulated experiences generated from the world model.
- An extension to symmetric reinforcement learning (the robot learns on both
the start-goal and goal-start paths) and cooperative reinforcement learning
(two robots share the world map and communicate policy information about it)
was developed by Sjoerd Van der Zwaan and José Moreira as their project for the
MSc Mobile Robotics course. They use external global vision to localize the
LEGO robots (picture on the left) and external processing to tell them where to
move next.
- A 4-minute MPEG video, available here, shows the robots moving through the maze.
|
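
The learning loop described in the first item (real trial-and-error steps, a world model built from them, and extra updates replayed from that model) can be illustrated with a minimal tabular Dyna-Q sketch. This is not the code that ran on the ROBUTER: the 5x5 grid layout, the hyperparameters, and the names `step`, `dyna_q` and `greedy_path` are all assumptions made for illustration. As in the text, the reward is 1 only when the goal is reached.

```python
import random

# Hypothetical 5x5 world: 'S' start, 'G' goal, '#' obstacle, '.' empty cell.
GRID = [
    "S..#.",
    ".#...",
    ".#.#.",
    "...#.",
    ".#..G",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def find(ch):
    """Return the (row, col) of the first cell containing ch."""
    for r, row in enumerate(GRID):
        c = row.find(ch)
        if c >= 0:
            return (r, c)
    raise ValueError(ch)


START, GOAL = find("S"), find("G")


def step(state, action):
    """Real world: stay put if blocked; reward 1 if and only if the goal is reached."""
    r, c = state[0] + ACTIONS[action][0], state[1] + ACTIONS[action][1]
    if not (0 <= r < len(GRID) and 0 <= c < len(GRID[0])) or GRID[r][c] == "#":
        r, c = state
    nxt = (r, c)
    return nxt, (1.0 if nxt == GOAL else 0.0)


def greedy(q, state, rng):
    """Pick a highest-valued action, breaking ties at random."""
    values = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
    best = max(values)
    return rng.choice([a for a, v in enumerate(values) if v == best])


def dyna_q(episodes=50, planning_steps=20, alpha=0.5, gamma=0.95,
           epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {}      # action values, keyed by (state, action)
    model = {}  # learned world model: (state, action) -> (reward, next_state)

    def update(s, a, r, s2):
        future = 0.0 if s2 == GOAL else max(
            q.get((s2, b), 0.0) for b in range(len(ACTIONS)))
        old = q.get((s, a), 0.0)
        q[(s, a)] = old + alpha * (r + gamma * future - old)

    for _ in range(episodes):
        s = START
        while s != GOAL:
            explore = rng.random() < epsilon
            a = rng.randrange(len(ACTIONS)) if explore else greedy(q, s, rng)
            s2, r = step(s, a)
            update(s, a, r, s2)       # learn from the real experience
            model[(s, a)] = (r, s2)   # extend the world model
            for _ in range(planning_steps):
                # Planning: replay a remembered transition from the model.
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                update(ps, pa, pr, ps2)
            s = s2
    return q


def greedy_path(q, max_steps=50):
    """Follow the learned greedy policy from START; stop at GOAL or max_steps."""
    rng = random.Random(0)
    s, path = START, [START]
    while s != GOAL and len(path) <= max_steps:
        s, _ = step(s, greedy(q, s, rng))
        path.append(s)
    return path
```

Calling `greedy_path(dyna_q())` returns the list of cells visited from start to goal under the learned policy. Setting `planning_steps=0` reduces the sketch to plain one-step Q-learning, which is one way to see what the model-based replay contributes.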