Public defence in Mathematics and Statistics, M.Sc. (Tech) Paavo Raittinen

The title of the doctoral thesis is "On statistical analysis and machine learning in prostate cancer research"

The age of machine learning is now. Machine learning applications are ubiquitous on digital platforms such as Google, the Netflix streaming service, or the Wolt food delivery service. However, machine learning is not yet considered a state-of-the-art method of statistical analysis, e.g., in the field of biomedicine. In the scientific biomedical literature, most of the methods used for statistical analysis are classic frequentist methods.

In my doctoral thesis “On statistical analysis and machine learning in prostate cancer research”, the application of one machine learning method (random forests) in prostate cancer research was studied and compared with a traditional method (the Wilcoxon rank sum test). Due to the multidisciplinary nature of the study, the purpose was twofold: on the one hand, the feasibility of machine learning methods was studied; on the other hand, the results were analysed in the context of prostate cancer and an attempt was made to explain the biological mechanisms. The data came from the ESTO1 randomized clinical trial (RCT), which was expanded with a spectrum of lipid and steroid compound measurements, i.e., the lipidome and steroidome. The main purpose of the ESTO1 study was to investigate the impact of statin use at the cellular level, compared with placebo. To our knowledge, such vast lipidome and steroidome data have not been studied in similar study settings, let alone analysed with machine learning.

With the selected machine learning method, random forests, we reached the same results and arrived at the same conclusions as with classic methods. Moreover, the results of a random forest model make it possible to analyse the hierarchy between the features (variables), which is a clear benefit over classic methods. From the biological perspective, statin use influences the serum steroidome and lipidome in general. Furthermore, statin use seems to have a down-shifting impact on the lipid and steroid milieu in the prostatic tissue. Both biological results are novel.

The results of the thesis provide a positive example of the feasibility of machine learning in analysing results from a classic RCT study design in the field of prostate cancer research. The conclusion is that prostate cancer research can already benefit from machine learning, and from future developments in that field.

The opponent is Professor Tommi Sottinen, University of Vaasa, Finland

The custos is Professor Pauliina Ilmonen, Aalto University School of Science, Department of Mathematics and Systems Analysis

Contact details of the doctoral student: [email protected]

The public defence will be organised on campus.

The doctoral thesis is publicly displayed 10 days before the defence in the publication archive Aaltodoc of Aalto University.

Electronic thesis
