Public defence in Computer Science, M.Sc. (Tech) Topi Paananen
Title of the doctoral thesis: Computational methods for Bayesian model assessment
A statistical model is a set of assumptions about how a data sample has been generated. A Bayesian model is a statistical model where probabilities are used to represent uncertainty within the model. When new data is collected, the probability representation of the model can be updated through a process called Bayesian inference. For many models, inference cannot be done exactly and is instead carried out using computational methods.
Assessing the behaviour and performance of models is essential in statistical modelling. Model assessment is not only useful for comparing or improving models, but also for uncovering the assumptions and limitations underlying the models. Assessment of Bayesian models often requires a series of computationally costly calculations. Thus, it is important to find algorithms that are both performant and computationally feasible.
The thesis studies several computational tools for evaluating Bayesian models. The first part of the thesis studies and develops model assessment methods using importance sampling algorithms. These techniques are mainly applied to two model assessment tasks, namely leave-one-out cross-validation and sensitivity analysis of priors and likelihoods. The thesis concludes that in certain situations both leave-one-out cross-validation and sensitivity analysis become faster and more accurate when using the studied importance sampling methods.
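As a rough illustration of the idea behind importance sampling leave-one-out cross-validation, the following minimal Python sketch reuses draws from the full posterior to approximate each leave-one-out predictive density, instead of refitting the model once per observation. It is not the method developed in the thesis: it uses plain, unsmoothed importance weights, and the toy model (normal observations with known standard deviation and a normal prior on the mean), the function names, and all numeric settings are assumptions chosen so that the approximation can be compared against an exact answer.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    """Log density of Normal(mean, sd) evaluated at x."""
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mean) / sd) ** 2


def posterior_params(y, sigma, prior_mean, prior_sd):
    """Conjugate posterior (mean, sd) of mu for y_i ~ Normal(mu, sigma)."""
    precision = 1.0 / prior_sd**2 + len(y) / sigma**2
    mean = (prior_mean / prior_sd**2 + sum(y) / sigma**2) / precision
    return mean, math.sqrt(1.0 / precision)


def is_loo_elpd(y, draws, sigma):
    """Importance sampling estimate of sum_i log p(y_i | y_-i).

    Uses the identity p(y_i | y_-i) = 1 / E_post[1 / p(y_i | mu)],
    where the expectation is over the full-data posterior.
    """
    total = 0.0
    for yi in y:
        # Log importance ratios: log w_s = -log p(y_i | mu_s).
        log_w = [-normal_logpdf(yi, mu, sigma) for mu in draws]
        # Log of the mean weight, computed stably with log-sum-exp.
        m = max(log_w)
        log_mean_w = m + math.log(sum(math.exp(lw - m) for lw in log_w) / len(log_w))
        total -= log_mean_w
    return total


def exact_loo_elpd(y, sigma, prior_mean, prior_sd):
    """Exact leave-one-out value, obtained by refitting without each y_i."""
    total = 0.0
    for i, yi in enumerate(y):
        rest = y[:i] + y[i + 1:]
        m, s = posterior_params(rest, sigma, prior_mean, prior_sd)
        # Posterior predictive of the held-out point is normal.
        total += normal_logpdf(yi, m, math.sqrt(s**2 + sigma**2))
    return total


# Hypothetical toy data and settings, for illustration only.
rng = random.Random(0)
y = [rng.gauss(1.0, 1.0) for _ in range(20)]
post_mean, post_sd = posterior_params(y, 1.0, 0.0, 10.0)
draws = [rng.gauss(post_mean, post_sd) for _ in range(4000)]

approx = is_loo_elpd(y, draws, 1.0)
exact = exact_loo_elpd(y, 1.0, 0.0, 10.0)
print(f"IS-LOO estimate: {approx:.3f}, exact LOO: {exact:.3f}")
```

In this toy setting the two numbers should agree closely. With more complex models or influential observations, plain importance weights can become highly variable, which is the kind of situation where more refined importance sampling methods are of interest.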
The second part of the thesis studies the assessment of variable importance in supervised learning. The thesis presents new methods for incorporating the predictive uncertainty of the model into the evaluation of variable importance. Existing variable importance assessment methods are generalized from single point predictions to probability distributions of predictions. The thesis shows that this can lead to more accurate identification of the important variables.
By making model assessment easier and more effective, the methods studied in the thesis increase trust in statistical models and their ability to represent real-world phenomena. Algorithms developed in the thesis are implemented in open-source statistical software packages that are used by scientists and other practitioners of Bayesian statistics worldwide.
Opponent: Reader Víctor Elvira, University of Edinburgh, Scotland
Custos: Professor Aki Vehtari, Aalto University School of Science, Department of Computer Science
Contact details of the doctoral student: [email protected]
The public defence will be organised on campus (lecture hall A208d Jeti).
The thesis is publicly displayed 10 days before the defence in the publication archive Aaltodoc of Aalto University.