Department of Information and Communications Engineering

Speech communication technology

Speech communication technology aims at describing, explaining and reproducing communication by speech.

The team focuses on fundamental research questions in speech communication. Our research has always been interdisciplinary: joint work has been conducted across disciplinary boundaries, especially with physicians, brain researchers, phoneticians and mathematicians. Some of the topics are application-oriented and have been investigated jointly with the ICT industry.

The research topics are varied, but all of them address speech in one form or another. The main topics of our research (both past and current) are:

analysis and parameterization of speech production
artificial bandwidth extension of speech
brain functions in speech perception
occupational voice care
robust feature extraction in speech and speaker recognition
spectral modelling of speech
speech-based biomarking of human health
speech intelligibility improvement
statistical parametric speech synthesis

The team has acquired funding from the Academy of Finland, the EU, Nokia, Huawei, Tekes and Aalto University.

Examples of our recent articles:

Farhad Javanmardi, Sudarsana Reddy Kadiri, Paavo Alku: Pre-trained models for detection and severity level classification of dysarthria from speech. Speech Communication, Vol. 158, Article 103047, 2024.
Madhu Keerthana Yagnavajjula, Kiran Reddy Mittapalle, Paavo Alku, Sreenivasa Rao, Pabitra Mitra: Automatic classification of neurological voice disorders using wavelet scattering features. Speech Communication, Vol. 157, Article 103040, 2024.
Paavo Alku, Manila Kodali, Laura Laaksonen, Sudarsana Reddy Kadiri: AVID: A speech database for machine learning studies on vocal intensity. Speech Communication, Vol. 157, Article 103039, 2024.
Mittapalle Kiran Reddy, Yagnavajjula Madhu Keerthana, Paavo Alku: Classification of functional dysphonia using the tunable Q wavelet transform. Speech Communication, Vol. 155, Article 102989, 2023.
Sudarsana Reddy Kadiri, Farhad Javanmardi, Paavo Alku: Investigation of self-supervised pre-trained models for classification of voice quality from speech and neck surface accelerometer signals. Computer Speech and Language, Vol. 83, Article 101550, 2023.
Manila Kodali, Sudarsana Reddy Kadiri, Paavo Alku: Automatic classification of the severity level of Parkinson’s disease: A comparison of speaking tasks, features, and classifiers. Computer Speech and Language, Vol. 83, Article 101548, 2023.
Paavo Alku, Sudarsana Reddy Kadiri, Dhananjaya Gowda: Refining a deep learning-based formant tracker using linear prediction methods. Computer Speech and Language, Vol. 81, Article 101515, 2023.
Saska Tirronen, Sudarsana Reddy Kadiri, Paavo Alku: Hierarchical multi-class classification of voice disorders using self-supervised models and glottal features. IEEE Open Journal of Signal Processing, Vol. 4, pp. 80-88, 2023.
Mittapalle Kiran Reddy, Paavo Alku: Exemplar-based sparse representations for detection of Parkinson’s disease from speech. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 31, pp. 1386-1396, 2023.
Sudarsana Reddy Kadiri, Paavo Alku, Bayya Yegnanarayana: Analysis of instantaneous frequency components of speech signals for epoch extraction. Computer Speech and Language, Vol. 78, Article 101443, 2023.
Yuanyuan Liu, Mittapalle Kiran Reddy, Nelly Penttilä, Tiina Ihalainen, Paavo Alku, Okko Räsänen: Automatic assessment of Parkinson’s disease using speech representations of phonation and articulation. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 31, pp. 242-255, 2023.
Mittapalle Kiran Reddy, Yagnavajjula Madhu Keerthana, Paavo Alku: End-to-end pathological speech detection using wavelet scattering network. IEEE Signal Processing Letters, Vol. 29, pp. 1863-1867, 2022.
Mittapalle Kiran Reddy, Hilla Pohjalainen, Pyry Helkkula, Kasimir Kaitue, Mikko Minkkinen, Heli Tolppanen, Tuomo Nieminen, Paavo Alku: Glottal flow characteristics in vowels produced by speakers with heart failure. Speech Communication, Vol. 137, pp. 35-43, 2022.
Sudarsana Reddy Kadiri, Paavo Alku, Bayya Yegnanarayana: Extraction and utilization of excitation information of speech: A review. Proceedings of the IEEE, Vol. 109, Issue 12, pp. 1920-1941, 2021.
M. Kiran Reddy, Pyry Helkkula, Y. Madhu Keerthana, Kasimir Kaitue, Mikko Minkkinen, Heli Tolppanen, Tuomo Nieminen, Paavo Alku: The automatic detection of heart failure using speech signals. Computer Speech and Language, Vol. 69, Article 101205, 2021.
Sudarsana Reddy Kadiri, Paavo Alku: Glottal features for classification of phonation type from speech and neck surface accelerometer signals. Computer Speech and Language, Vol. 70, Article 101232, 2021.
Nonavinakere Prabhakera Narendra, Björn Schuller, Paavo Alku: The detection of Parkinson’s disease from speech using voice source information. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 29, pp. 1925-1936, 2021.
Sudarsana Reddy Kadiri, Paavo Alku: Analysis and detection of pathological voice using glottal source features. IEEE Journal of Selected Topics in Signal Processing, Vol. 14, Issue 2, pp. 367-379, 2020.

The research team is led by Professor Paavo Alku.

Published: 13.6.2018
Updated: 25.3.2024