Public defence in Mathematics and Statistics, M.Sc. (Tech) Sami Helander
With recent advances in the precision of measurement technology and in storage capacity, massive, high-dimensional data sets have become commonplace across nearly all fields of science. Functional data, that is, data arising from measuring a generating process of continuous nature over its continuum, has emerged as a prominent type of such big data due to the richness of its structural features. One does not have to search far to find examples of functional data sets: the growth curves of children, measurements of meteorological variables such as temperature or precipitation, and hourly electricity consumption over a day are all processes within the realm of functional data.
Detailed analysis of shape features is often the key to revealing important modes of variation in functional data. For instance, recognizing structural deviations from the typical growth pattern of school-aged children can be one of the earliest markers of potential underlying problems in health and well-being. Accurately predicting hourly electricity consumption is crucial for an electricity company to match production to demand. Distinguishing between the silhouettes of a child and a dog can be crucial in computer vision applications for self-driving cars. In short, the sensitivity of the methodology to variations in shape has become an important topic in the literature. However, precisely defining typicality or atypicality in shape has proven to be a difficult problem. In how fine a detail should variations in local features be considered? What precisely makes a curve ‘too curvy’ in comparison to a set of other observations? Clearly, it is time to set the classical, location-based considerations aside and shift our focus towards the intricacies of shape and structure.
In the dissertation, we develop methods for assessing the shape typicality and similarity of observations and study their properties in theory and in practice. Furthermore, we study practical implementations of the methods in prominent, common applications such as supervised learning and outlier detection, and evaluate their performance against popular modern competitors. In particular, we demonstrate the favourable properties of the proposed methods and show that, in many commonly encountered settings, they match or even outperform many of the leading competitors.
Opponent is Professor Thomas Verdebout, Université Libre de Bruxelles, Belgium
Custos is Professor Pauliina Ilmonen, Aalto University School of Science, Department of Mathematics and Systems Analysis
Contact details of the doctoral student: [email protected], +358 50 5186136
The public defence will be organised on campus (Otakaari 1, lecture hall H304).
The doctoral thesis is publicly displayed 10 days before the defence in the publication archive Aaltodoc of Aalto University.