Public defence in Computer Science, M.Sc. (Tech) Zheyang Shen
Title of the doctoral thesis: Strengthening nonparametric Bayesian methods with structured kernels
Opponent: Professor Chris Oates, Newcastle University / Alan Turing Institute, England
Custos: Professor Samuel Kaski, Aalto University School of Science, Department of Computer Science
The public defence will be organised on campus.
The thesis is publicly displayed 10 days before the defence in Aaltodoc, the publication archive of Aalto University.
Public defence announcement:
Recent decades have seen groundbreaking advances in machine learning, whose impact is visible across many facets of society. This thesis summarizes several contributions at the frontier of statistical machine learning, united by the theme of kernel methods. Bayesian nonparametric methods form a wide class of flexible models with quantified uncertainty, but their merits come with drawbacks in three areas: (1) pattern extrapolation, (2) computational cost and (3) efficient and accurate inference. This thesis tackles these three bottlenecks for selected nonparametric Bayesian models by developing novel methodologies.
Gaussian processes (GPs) are distributions over functions and form a canonical building block of Bayesian nonparametrics. GPs typically interpolate data points flexibly, but extrapolation has proven difficult. The first section of the thesis proposes novel kernels for GP models, with a specific focus on their ability to extrapolate. We show in theory and in practice that our kernels broaden the inductive biases of typical GP models.
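The difference between interpolation and extrapolation can be illustrated with a minimal sketch. It assumes a zero-mean GP with a standard squared-exponential kernel, which is not one of the kernels proposed in the thesis; the function names, data and hyperparameters are chosen purely for illustration.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def gp_posterior_mean(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean of a zero-mean GP with Gaussian observation noise."""
    k_train = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_cross = rbf_kernel(x_test, x_train)
    return k_cross @ np.linalg.solve(k_train, y_train)

# Illustrative data (an assumption): a periodic signal observed on [0, 6].
x_train = np.linspace(0.0, 6.0, 30)
y_train = np.sin(2.0 * x_train)

x_inside = np.array([2.4])    # within the observed range
x_outside = np.array([16.0])  # far outside the observed range

mean_inside = gp_posterior_mean(x_train, y_train, x_inside)
mean_outside = gp_posterior_mean(x_train, y_train, x_outside)

print("inside :", mean_inside[0], "true:", np.sin(2.0 * 2.4))
print("outside:", mean_outside[0], "true:", np.sin(2.0 * 16.0))
```

Inside the observed range the posterior mean follows the signal closely. Far outside it, the kernel assigns almost no correlation with the data, so the prediction falls back to the prior mean of zero and the periodic pattern is lost.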
Another bottleneck of GP models is their computationally expensive inference. The second section of the thesis focuses on scalable approximations of GPs that condition on a smaller set of pseudo-inputs, known as sparse GPs. While sparse GPs offer considerable speedup, we identify a limiting factor in their approximations and propose a fully Bayesian treatment of pseudo-inputs that significantly enhances the expressivity of sparse GP models.
The third section of the thesis tackles the general problem of characterizing unnormalized distributions. Approximate random samples can be drawn by simulating a Markov chain, the basis of a popular family of samplers known as Markov chain Monte Carlo (MCMC). Alternatively, a kernel-based system of interacting particles can be transported via deterministic dynamics to represent the target distribution. We explore the connection between the two families of samplers and propose new particle samplers that emulate more efficient MCMC samplers.
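As a rough illustration of a deterministic, kernel-based particle sampler, the sketch below uses Stein variational gradient descent, one well-known method of this kind. The announcement does not name the samplers studied in the thesis, so the choice of method, the Gaussian target, the bandwidth and the step size are all assumptions made for this example.

```python
import numpy as np

def score(x, mean=2.0):
    """Gradient of the log density of N(mean, 1); no normalizing constant needed."""
    return -(x - mean)

def svgd_step(particles, step_size=0.1, bandwidth=1.0):
    """One deterministic update of all particles (1-D, squared-exponential kernel)."""
    diff = particles[:, None] - particles[None, :]   # diff[j, i] = x_j - x_i
    k = np.exp(-0.5 * diff**2 / bandwidth**2)        # k[j, i] = k(x_j, x_i)
    grad_k = -diff / bandwidth**2 * k                # derivative of k(x_j, x_i) w.r.t. x_j
    # Attraction towards high density plus repulsion between nearby particles,
    # averaged over all particles j for each particle i.
    phi = (k * score(particles)[:, None] + grad_k).mean(axis=0)
    return particles + step_size * phi

# Start far from the target, with no randomness anywhere in the procedure.
particles = np.linspace(-4.0, -2.0, 50)
for _ in range(2000):
    particles = svgd_step(particles)

print("particle mean:", particles.mean())
print("particle std :", particles.std())
```

Only the gradient of the log density enters the update, so the normalizing constant of the target is never required. The kernel gradient term keeps the particles apart, which lets the set as a whole spread over the target instead of collapsing onto its mode.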
While this thesis does not revolve around a single topic, it showcases the efficacy of kernels in augmenting a variety of nonparametric models, thus shedding light on better practice for the Bayesian modeling community.
Contact details of the doctoral student: [email protected], 0449191362