Department of Signal Processing and Acoustics

Speech communication technology

Speech communication technology aims at describing, explaining and reproducing communication by speech.

The team focuses on fundamental research questions in speech communication. Our research has always been interdisciplinary: joint work has been conducted across disciplinary boundaries, especially with physicians, brain researchers, phoneticians and mathematicians. Some of the topics are application-oriented and have been investigated jointly with the ICT industry.

The research topics are varied, but all of them address speech in one form or another. The main topics of our research, both past and current, are:

  • analysis and parameterization of speech production
  • artificial bandwidth extension of speech
  • brain functions in speech perception
  • occupational voice care
  • robust feature extraction in speech and speaker recognition
  • spectral modelling of speech
  • speech-based biomarking of human health
  • speech intelligibility improvement
  • statistical parametric speech synthesis

The team has acquired funding from the Academy of Finland, the EU, Nokia, Tekes and Aalto University.

Examples of our recent articles:

  • Tiina Murtola, Paavo Alku: Indicators of anterior–posterior phase difference in glottal opening measured from natural production of vowels. The Journal of the Acoustical Society of America Express Letters, Vol. 148, No. 2, pp. 141-146, 2020.
  • Nonavinakere Prabhakera Narendra, Paavo Alku: Automatic intelligibility assessment of dysarthric speech using glottal parameters. Speech Communication, Vol. 123, pp. 1-9, 2020.
  • Dhananjaya Gowda, Sudarsana Reddy Kadiri, Brad Story, Paavo Alku: Time-varying quasi-closed-phase analysis for accurate formant tracking in speech signals. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 28, pp. 1901-1914, 2020.
  • Nonavinakere Prabhakera Narendra, Paavo Alku: Glottal source information for pathological voice detection. IEEE Access, Vol. 8, Issue 1, pp. 67745-67755, 2020.
  • Sudarsana Reddy Kadiri, Paavo Alku: Analysis and detection of pathological voice using glottal source features. IEEE Journal of Selected Topics in Signal Processing, Vol. 14, Issue 2, pp. 367-379, 2020.
  • Sudarsana Reddy Kadiri, Paavo Alku, Bayya Yegnanarayana: Analysis and classification of phonation types in speech and singing voice. Speech Communication, Vol. 118, pp. 33-47, 2020.
  • Paavo Alku, Tiina Murtola, Jarmo Malinen, Ahmed Geneid, Erkki Vilkman: Skewing of the glottal flow with respect to the glottal area measured in natural production of vowels. Journal of the Acoustical Society of America, Vol. 146, No. 4, pp. 2501-2509, 2019.
  • Bajibabu Bollepalli, Lauri Juvela, Manu Airaksinen, Cassia Valentini-Botinhao, Paavo Alku: Normal-to-Lombard adaptation of speech synthesis using long short-term memory recurrent neural networks. Speech Communication, Vol. 110, pp. 64-75, 2019.
  • Tiina Murtola, Jarmo Malinen, Ahmed Geneid, Paavo Alku: Analysis of phonation onsets in vowel production, using information from glottal area and flow estimate. Speech Communication, Vol. 109, pp. 55-65, 2019.
  • Paavo Alku, Tiina Murtola, Jarmo Malinen, Juha Kuortti, Brad Story, Manu Airaksinen, Mika Salmi, Erkki Vilkman, Ahmed Geneid: OPENGLOT - An open environment for the evaluation of glottal inverse filtering. Speech Communication, Vol. 107, pp. 38-47, 2019.
  • Lauri Juvela, Bajibabu Bollepalli, Vassilis Tsiaras, Paavo Alku: GlotNet - A raw waveform model for the glottal excitation in statistical parametric speech synthesis. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 27, No. 6, pp. 1019-1030, 2019.

The research team is led by Professor Paavo Alku.
