Public defence, Computer Science, MSc Nicola Dainese

Language models for sample-efficient, generalizable, and interpretable RL agents via world models and black-box optimization.

Public defence from the Aalto University School of Science, Department of Computer Science.
Chessboard and its shadow, Copyright: Carlo Dainese.

Title of the thesis: World modeling and black-box optimization with language models

Thesis defender: Nicola Dainese
Opponent: Associate Professor Florian T. Pokorny, KTH Royal Institute of Technology, Sweden
Custos: Associate Professor Pekka Marttinen, Aalto University School of Science

World modeling and black-box optimization with language models

Recent reinforcement learning (RL) advances have produced performant agents, yet current methods remain data-inefficient and generalize poorly because they learn from scratch without prior knowledge. The goal of this PhD dissertation is therefore to explore how language can improve RL sample efficiency and generalization by incorporating prior knowledge into world models. It further investigates large language models (LLMs) for black-box optimization (BBO), with applications to prompt optimization and symbolic regression. These two lines of work converge in programmatic world models, where program synthesis is applied to world modeling, yielding agents that rapidly adapt to language-described tasks with interpretable internal models.

Contributions include a stochastic world model that leverages language descriptions and improves over the state of the art; LLM-based BBO methods that reveal unexpected robustness of open-source LLMs to docstring changes and achieve state-of-the-art symbolic regression with simpler formulas; and two works on programmatic world models: an approach that uses LLM-based code generation guided by Monte Carlo Tree Search for faster, interpretable planning, and a visual planning benchmark that reveals complementary strengths of symbolic and direct VLM-based planning. Together, these contributions advance autonomous agents that reason, plan, and optimize through language, combining the sample efficiency and interpretability of model-based methods with the flexibility of LLMs.

Keywords: reinforcement learning, world models, large language models, vision-language models, black-box optimization

The thesis is available for public display 7 days prior to the defence on Aalto University's public display page.

Contact information: nicola.dainese@aalto.fi 

Doctoral theses of the School of Science

Doctoral theses of the School of Science at Aaltodoc (external link)

Doctoral theses of the School of Science are available in the open access repository maintained by Aalto, Aaltodoc.
