Public defence in Computer Science, M.Sc.(Tech.) Kalle Kujanpää

Hierarchies, Search, and Generative Models in Sequential Decision-Making


Public defence from the Aalto University School of Science, Department of Computer Science.
Title of the thesis: Hierarchies, Search, and Generative Models in Sequential Decision-Making

Thesis defender: Kalle Kujanpää 
Opponent: Professor Frans Oliehoek, Delft University of Technology, Netherlands
Custos: Professor Pekka Marttinen, Aalto University School of Science

How can we make robots, self-driving cars, and intelligent agents perform well in practice? Intelligent agents trained with reinforcement learning (trial-and-error) or imitation learning (learning from demonstration) can reach or surpass expert-level performance in controlled settings like board and computer games. Yet translating this success to real-world tasks remains difficult.

This thesis addresses key obstacles in sequential decision-making with reinforcement and imitation learning, including long-term planning and modeling complex behaviors. To tackle these, it develops methods combining simple ideas: breaking objectives into smaller subtasks (hierarchies) and planning ahead with search. It also models human-like behavior by incorporating styles into generative modeling for decision-making.

The aim is to advance sequential decision-making algorithms, making them less dependent on trial-and-error learning and more effective at complex tasks through planning. To this end, the thesis shows the value of these ideas across several domains. In autonomous driving, using learned driving styles produces more realistic traffic simulations. In robotics, flexibly reusing experience makes planning more efficient. The thesis also applies reinforcement learning as a stress-testing tool to probe system weaknesses: for instance, a red-teaming agent is trained to gain higher-level access rights in a computer system.

Across these settings, we find an interesting trade-off: even though continuous models are precise in principle, discrete alternatives often lead to simpler planning and better performance.

Overall, the results suggest that combining hierarchy, planning ahead with search, and human-style modeling can significantly improve the performance of decision-making agents. These contributions are relevant to teams building autonomous driving simulators, security engineers, and robotics developers seeking to improve long-term planning in AI systems.

The thesis is available for public display at Aaltodoc 7 days prior to the defence.

Doctoral theses of the School of Science

Doctoral theses of the School of Science are available in the open access repository maintained by Aalto, Aaltodoc.
