Department of Computer Science: MSc Thesis Presentation
Date: Wednesday 23 June 2021
Zoom: https://aalto.zoom.us/j/61801880094 (passcode: 861245)
Using sequences of clinical codes for treatment effect estimation
Presenter: Nguyen Luong
Supervisor: Pekka Marttinen
Abstract: The estimation of causal effects from observational data has recently attracted rapidly growing interest in the research community. This area of machine learning has expanded with the increased availability of sizable data resources, for instance Electronic Health Records (EHR) and social media posts, which sidestep known limitations of randomized controlled trials (RCTs) such as time and budget requirements. Nevertheless, the majority of the proposed methods operate in a static setting, which does not apply to clinical environments where patients' states are ever-changing. Therefore, the present thesis aims to develop a novel approach that uses sequences of clinical codes as covariates to estimate treatment effects, evaluated on synthetic medical data. Two neural network (NN) models, one based on Long Short-Term Memory (LSTM) and one on a Transformer encoder, are employed to generate a representation from raw clinical codes that is suitable for treatment effect inference. On a synthetic dataset, the NN models were found to be on par with the Lasso model when treatment assignment is balanced and moderately better in the imbalanced case. Future work includes benchmarking the models on more complex synthetic data as well as on real data. Additionally, hidden confounders in time-series data must be handled carefully before treatment effects can be evaluated, so adjusting for their effects is another potential research direction.
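For readers unfamiliar with the setting, the following is a minimal illustrative sketch, not the thesis's method. It assumes a toy synthetic dataset in which each patient has a short sequence of integer clinical codes, and those codes influence both treatment assignment and outcome. In place of the Lasso, LSTM, and Transformer models named in the abstract, it uses plain least-squares regression adjustment on bag-of-codes counts. All function names, parameters, and the data-generating process are hypothetical.

```python
import numpy as np


def simulate(n_patients=2000, n_codes=20, seq_len=15, true_effect=2.0, seed=0):
    """Generate toy data (hypothetical process, not the thesis's simulator)."""
    rng = np.random.default_rng(seed)
    # Each patient is a sequence of integer clinical codes.
    sequences = rng.integers(0, n_codes, size=(n_patients, seq_len))
    # Bag-of-codes representation: how often each code occurs per patient.
    counts = np.zeros((n_patients, n_codes))
    for code in range(n_codes):
        counts[:, code] = (sequences == code).sum(axis=1)
    centered = counts - counts.mean(axis=0)
    # Confounding: the same code counts drive treatment and outcome.
    w_treat = rng.normal(0.0, 0.5, n_codes)
    w_out = rng.normal(0.0, 1.0, n_codes)
    propensity = 1.0 / (1.0 + np.exp(-(centered @ w_treat)))
    treatment = rng.binomial(1, propensity)
    noise = rng.normal(0.0, 1.0, n_patients)
    outcome = centered @ w_out + true_effect * treatment + noise
    return counts, treatment, outcome


def naive_difference(treatment, outcome):
    """Unadjusted difference in mean outcome; ignores confounding."""
    return outcome[treatment == 1].mean() - outcome[treatment == 0].mean()


def estimate_ate(counts, treatment, outcome):
    """Regression adjustment: regress outcome on treatment and code counts.

    The counts in each row sum to the sequence length, so they already
    span a constant term and no separate intercept column is added.
    """
    design = np.column_stack([treatment, counts])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef[0]  # coefficient on the treatment indicator


if __name__ == "__main__":
    counts, treatment, outcome = simulate()
    print("naive difference:", naive_difference(treatment, outcome))
    print("adjusted estimate:", estimate_ate(counts, treatment, outcome))
```

The bag-of-codes counts discard the order of the codes; a sequence model such as those described in the abstract would instead learn a representation from the ordered sequence.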