Public defence in Computer Science, M.Sc. (Tech) Rinu Boney
Reinforcement learning (RL) is a general framework for learning intelligent behaviors to solve problems in any domain. Hand-coded solutions, the standard approach to solving problems, will inevitably fail when they encounter unfamiliar situations. On the other hand, an effective RL algorithm could automatically learn intelligent solutions with little or no human intervention, and also potentially learn to deal with unfamiliar situations. Remarkable progress in RL research in recent years has shown that RL algorithms can be good enough to solve challenging problems. RL algorithms have, for example, learned to beat world-class human players in challenging games such as Go, poker, StarCraft II, and Dota 2, to drive cars from camera inputs in simple scenarios, and to perform dexterous robotic manipulation tasks. This recent progress has been fueled by deep RL, which combines RL with deep learning for function approximation. This enables the RL agent to learn and represent complex behaviors that can be conditioned on high-dimensional observations such as images. However, this comes at the cost of decreased training stability and high sample complexity, limiting the practical impact of deep RL algorithms.
The thesis presents advances toward improving the sample efficiency and benchmarking of deep RL algorithms on real-world problems. The thesis first presents sample-efficient deep RL algorithms for three different problem settings: multi-agent discrete control, continuous control, and continuous control from image observations. It is shown that planning with known or learned dynamics models leads to improved sample efficiency in discrete and continuous control problems. In addition, the thesis presents two low-cost robot learning benchmarks, one of them based on a low-cost quadruped robot that we developed, to ground the research of RL algorithms on real-world problems.
Opponent: Doctor Yuval Tassa, Google/DeepMind Technologies, United Kingdom
Custos: Professor Juho Kannala, Aalto University School of Science, Department of Computer Science
Contact details of the doctoral student: [email protected]
The public defence will be organised on campus (Maarintie 8, lecture hall AS1)
The thesis is publicly displayed 10 days before the defence in the publication archive Aaltodoc of Aalto University.