Department of Information and Communications Engineering

Speech communication technology

Speech communication technology aims at describing, explaining and reproducing communication by speech.

The team focuses on fundamental research questions in speech communication. Our research has always been interdisciplinary: joint work has been conducted across disciplinary boundaries, especially with physicians, brain researchers, phoneticians and mathematicians. Some of the topics are application-oriented and have been investigated jointly with the ICT industry.

The research topics are varied, but all of them address speech in one form or another. The main topics of our research (both past and current) are:

analysis and parameterization of speech production
artificial bandwidth extension of speech
brain functions in speech perception
occupational voice care
robust feature extraction in speech and speaker recognition
spectral modelling of speech
speech-based biomarking of human health
speech intelligibility improvement
statistical parametric speech synthesis

The team has acquired funding from the Academy of Finland, the EU, Nokia, Huawei, Business Finland and Aalto University.

Examples of our recent articles:

Prathamesh Parasharam Patil, Mittapalle Kiran Reddy, Paavo Alku: Classification of phonation types in singing and speaking voice using self-supervised learning models. Speech Communication, Vol. 178, Article 103355, 2026.
Saska Tirronen, Farhad Javanmardi, Hilla Pohjalainen, Sudarsana Reddy Kadiri, Kiran Reddy Mittapalle, Pyry Helkkula, Kasimir Kaitue, Mikko Minkkinen, Heli Tolppanen, Tuomo Nieminen, Paavo Alku: Towards robust heart failure detection in digital telephony environments by utilizing transformer-based codec inversion. Speech Communication, Vol. 173, Article 103279, 2025.
Manila Kodali, Luna Ansari, Sudarsana Reddy Kadiri, Shrikanth Narayanan, Paavo Alku: Automatic classification of vocal intensity categories from amplitude-normalized speech signals by comparing acoustic features and classifier models. Speech Communication, Vol. 174, Article 103288, 2025.
Mittapalle Kiran Reddy, Paavo Alku: Automatic detection of parkinsonian speech using wavelet scattering features. JASA Express Letters, Vol. 5, Issue 5, Article 055202, 2025.
Manila Kodali, Sudarsana Reddy Kadiri, Shrikanth Narayanan, Paavo Alku: The machine learning-based prediction of the sound pressure level from pathological and healthy speech signals. Journal of the Acoustical Society of America, Vol. 157, No. 3, pp. 1726-1741, 2025.
Sudarsana Reddy Kadiri, Kevin Huang, Christina Hagedorn, Dani Byrd, Paavo Alku, Shrikanth Narayanan: Formant tracking by combining deep neural network and linear prediction. IEEE Open Journal of Signal Processing, Vol. 6, pp. 222-230, 2025.
Farhad Javanmardi, Sudarsana Reddy Kadiri, Paavo Alku: Pre-trained models for detection and severity level classification of dysarthria from speech. Speech Communication, Vol. 158, Article 103047, 2024.
Mittapalle Kiran Reddy, Paavo Alku: Tunable Q wavelet transform -based features in the classification of phonation types in the singing and speaking voice. Journal of Voice, 2024 (In press).
Farhad Javanmardi, Sudarsana Reddy Kadiri, Paavo Alku: Exploring the impact of fine-tuning the Wav2vec2 model in database-independent detection of dysarthric speech. IEEE Journal of Biomedical and Health Informatics, Vol. 28, Issue 8, pp. 4951-4962, 2024.
Mittapalle Kiran Reddy, Paavo Alku: Classification of phonation types in singing voice using wavelet scattering network-based features. JASA Express Letters, Vol. 4, Issue 6, Article 065201, 2024.
Anne-Maria Laukkanen, Sudarsana Reddy Kadiri, Shrikanth Narayanan, Paavo Alku: Can a machine distinguish high and low amount of social creak in speech? Journal of Voice, 2024 (In press).
Madhu Keerthana Yagnavajjula, Kiran Reddy Mittapalle, Paavo Alku, Sreenivasa Rao, Pabitra Mitra: Automatic classification of neurological voice disorders using wavelet scattering features. Speech Communication, Vol. 157, Article 103040, 2024.
Paavo Alku, Manila Kodali, Laura Laaksonen, Sudarsana Reddy Kadiri: AVID: A speech database for machine learning studies on vocal intensity. Speech Communication, Vol. 157, Article 103039, 2024.
Mittapalle Kiran Reddy, Yagnavajjula Madhu Keerthana, Paavo Alku: Classification of functional dysphonia using the tunable Q wavelet transform. Speech Communication, Vol. 155, Article 102989, 2023.
Sudarsana Reddy Kadiri, Farhad Javanmardi, Paavo Alku: Investigation of self-supervised pre-trained models for classification of voice quality from speech and neck surface accelerometer signals. Computer Speech and Language, Vol. 83, Article 101550, 2023.
Manila Kodali, Sudarsana Reddy Kadiri, Paavo Alku: Automatic classification of the severity level of Parkinson’s disease: A comparison of speaking tasks, features, and classifiers. Computer Speech and Language, Vol. 83, Article 101548, 2023.
Paavo Alku, Sudarsana Reddy Kadiri, Dhananjaya Gowda: Refining a deep learning-based formant tracker using linear prediction methods. Computer Speech and Language, Vol. 81, Article 101515, 2023.
Saska Tirronen, Sudarsana Reddy Kadiri, Paavo Alku: Hierarchical multi-class classification of voice disorders using self-supervised models and glottal features. IEEE Open Journal of Signal Processing, Vol. 4, pp. 80-88, 2023.
Mittapalle Kiran Reddy, Paavo Alku: Exemplar-based sparse representations for detection of Parkinson’s disease from speech. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 31, pp. 1386-1396, 2023.
Sudarsana Reddy Kadiri, Paavo Alku, Bayya Yegnanarayana: Analysis of instantaneous frequency components of speech signals for epoch extraction. Computer Speech and Language, Vol. 78, Article 101443, 2023.
Yuanyuan Liu, Mittapalle Kiran Reddy, Nelly Penttilä, Tiina Ihalainen, Paavo Alku, Okko Räsänen: Automatic assessment of Parkinson’s disease using speech representations of phonation and articulation. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 31, pp. 242-255, 2023.

The research team is led by Professor Paavo Alku.

Updated: 29.1.2026
Published: 13.6.2018