Department of Signal Processing and Acoustics

Speech communication technology

Speech communication technology aims at describing, explaining and reproducing communication by speech.

The team focuses on fundamental research questions in speech communication. Our research has always been interdisciplinary: joint work has been conducted across disciplinary boundaries, especially with physicians, brain researchers, phoneticians and mathematicians. Some of the topics are application-oriented and have been investigated jointly with the ICT industry.

The research topics vary, but all of them address speech in one form or another. The main topics of our research (both past and current) are:

  • analysis and parameterization of speech production
  • artificial bandwidth extension of speech
  • brain functions in speech perception
  • occupational voice care
  • robust feature extraction in speech and speaker recognition
  • spectral modelling of speech
  • speech-based biomarking of human health
  • speech intelligibility improvement
  • statistical parametric speech synthesis

The team has acquired funding from the Academy of Finland, the EU, Nokia, Huawei, Tekes and Aalto University.

Examples of our recent articles:

  • Manila Kodali, Sudarsana Reddy Kadiri, Paavo Alku: Automatic classification of the severity level of Parkinson’s disease: A comparison of speaking tasks, features, and classifiers. Computer Speech and Language, 2023 (In press).
  • Saska Tirronen, Sudarsana Reddy Kadiri, Paavo Alku: Hierarchical multi-class classification of voice disorders using self-supervised models and glottal features. IEEE Open Journal of Signal Processing, Vol. 4, pp. 80-88, 2023.
  • Mittapalle Kiran Reddy, Paavo Alku: Exemplar-based sparse representations for detection of Parkinson’s disease from speech. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 31, pp. 1386-1396, 2023.
  • Paavo Alku, Sudarsana Reddy Kadiri, Dhananjaya Gowda: Refining a deep learning-based formant tracker using linear prediction methods. Computer Speech and Language, Vol. 81, Article 101515, 2023.
  • Sudarsana Reddy Kadiri, Paavo Alku, Bayya Yegnanarayana: Analysis of instantaneous frequency components of speech signals for epoch extraction. Computer Speech and Language, Vol. 78, Article 101443, 2023.
  • Yuanyuan Liu, Mittapalle Kiran Reddy, Nelly Penttilä, Tiina Ihalainen, Paavo Alku, Okko Räsänen: Automatic assessment of Parkinson’s disease using speech representations of phonation and articulation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 31, pp. 242-255, 2023.
  • Mittapalle Kiran Reddy, Yagnavajjula Madhu Keerthana, Paavo Alku: End-to-end pathological speech detection using wavelet scattering network. IEEE Signal Processing Letters, Vol. 29, pp. 1863-1867, 2022.
  • Mittapalle Kiran Reddy, Hilla Pohjalainen, Pyry Helkkula, Kasimir Kaitue, Mikko Minkkinen, Heli Tolppanen, Tuomo Nieminen, Paavo Alku: Glottal flow characteristics in vowels produced by speakers with heart failure. Speech Communication, Vol. 137, pp. 35-43, 2022.
  • Sudarsana Reddy Kadiri, Paavo Alku, Bayya Yegnanarayana: Extraction and utilization of excitation information of speech: A review. Proceedings of the IEEE, Vol. 109, Issue 12, pp. 1920-1941, 2021.
  • M. Kiran Reddy, Pyry Helkkula, Y. Madhu Keerthana, Kasimir Kaitue, Mikko Minkkinen, Heli Tolppanen, Tuomo Nieminen, Paavo Alku: The automatic detection of heart failure using speech signals. Computer Speech and Language, Vol. 69, Article 101205, 2021.
  • Sudarsana Reddy Kadiri, Paavo Alku: Glottal features for classification of phonation type from speech and neck surface accelerometer signals. Computer Speech and Language, Vol. 70, Article 101232, 2021.
  • Nonavinakere Prabhakera Narendra, Björn Schuller, Paavo Alku: The detection of Parkinson’s disease from speech using voice source information. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 29, pp. 1925-1936, 2021.
  • Tiina Murtola, Paavo Alku: Indicators of anterior-posterior phase difference in glottal opening measured from natural production of vowels. The Journal of the Acoustical Society of America Express Letters, Vol. 148, No. 2, pp. 141-146, 2020.
  • Nonavinakere Prabhakera Narendra, Paavo Alku: Automatic intelligibility assessment of dysarthric speech using glottal parameters. Speech Communication, Vol. 123, pp. 1-9, 2020.
  • Dhananjaya Gowda, Sudarsana Reddy Kadiri, Brad Story, Paavo Alku: Time-varying quasi-closed-phase analysis for accurate formant tracking in speech signals. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 28, pp. 1901-1914, 2020.
  • Sudarsana Reddy Kadiri, Paavo Alku: Analysis and detection of pathological voice using glottal source features. IEEE Journal of Selected Topics in Signal Processing, Vol. 14, Issue 2, pp. 367-379, 2020.

The research team is led by Professor Paavo Alku.
