Department of Information and Communications Engineering

Speech recognition

Our goal is to improve speech recognition methodology using new algorithms developed at Aalto University. Speech recognition offers challenging benchmark tasks for efficient algorithms that can process, and learn to represent, large quantities of data. In addition to improving the acoustic models of phonemes, we aim to develop new statistical language models that learn from data for difficult large-vocabulary continuous speech recognition tasks.

Research Overview

We currently specialize in the following research areas in speech recognition:

Sub-word units and deep learning in language modeling
Speaker adaptation and pronunciation rating in acoustic modeling
Unlimited vocabulary continuous speech recognition
Speech recognition and language modeling methods for under-resourced languages
Methods for describing and translating audiovisual content
Speaker and language recognition and diarization

We are part of the Finnish Center for Artificial Intelligence (FCAI, https://fcai.fi/).

Teaching

We teach the following courses:

ELEC-E5500 Speech Processing

ELEC-E5510 Speech Recognition

ELEC-E5550 Statistical Natural Language Processing

ELEC-E5521 Speech and Language Processing Methods

We are part of the major in Machine Learning, Data Science and Artificial Intelligence (Macadamia) in the Master's Programme in Computer, Communication and Information Sciences.

Group members

Software & Demonstrations

Software produced as part of our research is available on our GitHub.

Demonstration videos of our research work can be watched on our YouTube channel.

Autoencoder Based Optimized SSL Representations: Complexity Minimization and Improved Dysarthric ASR

Paban Sapkota, Hemant Kumar Kathania, Mikko Kurimo, Shrikanth Narayanan, Sudarsana Reddy Kadiri. 2026. 2026 National Conference on Communications (NCC).

A study on the layer-wise transferability of self-supervised learning features for children’s speech processing tasks

Abhijit Sinha, Hemant Kumar Kathania, Mikko Kurimo. 2026. Speech Communication.

Self-Supervised App-Based Speech Training for Children With Speech Sound Disorder—A Single-Case Experimental Design Study

Sofia Strömbergsson, Ella Edlund, Magdalena Pettersson, Nhan Phan, Mikko Kurimo. 2026. International Journal of Language & Communication Disorders.

Research portal

Updated: 19.4.2024
Published: 13.6.2018

Group members

Mikko Kurimo

Mittul Singh

Anssi Ilmari Moisio

Yaroslav Getman

Nhan Phan

Mehedi Hasan Bijoy

Zirui Li

Elina Anneli Nurminen

Latest publications

A transformer-based spelling error correction framework for Bangla and resource scarce Indic languages

Multi-Teacher Language-Aware Knowledge Distillation for Multilingual Speech Emotion Recognition

Is your model big enough? Training and interpreting large-scale monolingual speech foundation models

Non-Native Children's Automatic Speech Assessment Challenge (NOCASA)

Towards large-scale speech foundation models for a low-resource minority language

Developing a digital tool for L2 speaking assessment in low-resourced languages

Proceedings of the Workshop on Automatic Assessment of Atypical Speech