Events

Department of Computer Science: MSc Thesis Presentations

Shiva Ayoubzadeh will present their MSc thesis on Thursday 22 May at 13:30 in C111, CS building

Adverse drug reactions prediction using deep learning models

Author: Shiva Ayoubzadeh
Supervisor: Juho Rousu
Advisor: Dr. Ziaurrehman Tanoli, FIMM

Abstract: Adverse drug reactions (ADRs) are side effects that occur under normal drug usage and remain a leading cause of death. Since bringing a new drug to market is a time-consuming and expensive process, machine learning approaches can play a crucial role in the early detection of ADRs. This study presents a machine-learning-based approach to predicting a collection of ADRs using four models, each evaluated in two experiments: a Deep Neural Network (DNN), a Convolutional Neural Network (CNN), a self-attention-based transformer model, and a random forest classifier. The problem is framed as binary classification: in the first experiment, all ADRs of a drug are predicted simultaneously, while in the second, each ADR-compound pair is treated as an independent binary classification instance. A major challenge in this task is data imbalance and sparsity, which were addressed differently in the two experiments. The first experiment used chemical (Morgan fingerprints) and biological (gene expression) drug features as training features, both separately and in combination; the second used ChemBERTa-derived SMILES embeddings.

In the first experiment, training and test performance were closely aligned, with moderate F1 scores. In the second experiment, training performance improved but generalization weakened. The transformer consistently ranked among the top performers across all datasets; the DNN achieved comparable results on Morgan fingerprints and gene expression but performed poorly on ChemBERTa embeddings. Random forest performed best on gene expression but failed on Morgan fingerprints, while the CNN struggled across all datasets. Notably, even the models that performed well achieved strong F1 scores for only a subset of ADRs, which underscores the critical role of data quality, class balance, and model robustness in building effective unified ADR prediction models based on domain-driven features.
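
As a rough illustration of the two problem framings and the per-ADR F1 evaluation mentioned in the abstract, the sketch below uses a tiny made-up drug-by-ADR label matrix. The arrays `Y` and `Y_pred` and the helper `per_adr_f1` are hypothetical and are not taken from the thesis; the thesis's actual data handling and metrics code may differ.

```python
import numpy as np

# Hypothetical toy labels: rows are drugs, columns are ADRs (1 = ADR reported).
# These values are invented for illustration and are not from the thesis.
Y = np.array([
    [1, 0, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
])

# Framing 1 (multi-label): one instance per drug, all ADRs predicted at once.
multilabel_targets = Y  # shape (n_drugs, n_adrs)

# Framing 2 (pairwise): one binary instance per (drug, ADR) pair.
# Each row of `pairs` is (drug index, ADR index, label).
drug_idx, adr_idx = np.indices(Y.shape)
pairs = np.column_stack([drug_idx.ravel(), adr_idx.ravel(), Y.ravel()])


def per_adr_f1(y_true, y_pred):
    """Return one F1 score per ADR column (0.0 when F1 is undefined)."""
    tp = np.sum((y_true == 1) & (y_pred == 1), axis=0)
    fp = np.sum((y_true == 0) & (y_pred == 1), axis=0)
    fn = np.sum((y_true == 1) & (y_pred == 0), axis=0)
    denom = 2 * tp + fp + fn
    # Guard the division so columns with denom == 0 do not raise warnings.
    return np.where(denom > 0, 2 * tp / np.maximum(denom, 1), 0.0)


# Hypothetical model output in the multi-label framing.
Y_pred = np.array([
    [1, 0, 0],
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
])

f1 = per_adr_f1(Y, Y_pred)   # one score per ADR
positive_rate = Y.mean()     # fraction of positive (drug, ADR) labels
print(f1, positive_rate)
```

In this toy case most labels are negative and only the first ADR is predicted well, a miniature version of the imbalance and the uneven per-ADR performance the abstract describes.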
