Defence of doctoral thesis in the field of Speech and Language Technology, M.Sc. Sneha Das
M.Sc. Sneha Das will defend the thesis "Robust and Efficient Methods for Distributed Speech Processing – Perspectives on Coding, Enhancement and Privacy" on 26 November 2021 at 12:00 at Aalto University School of Electrical Engineering, Department of Signal Processing and Acoustics.
Opponent: Prof. Mads Græsbøll Christensen, Aalborg University, Denmark
Custos: Prof. Tom Bäckström, Aalto University School of Electrical Engineering, Department of Signal Processing and Acoustics
The public defence will be held remotely. Follow the defence: https://aalto.zoom.us/j/61255513284
Zoom Quick Guide: https://www.aalto.fi/en/services/zoom-quick-guide
Thesis available for public display at: https://aaltodoc.aalto.fi/doc_public/eonly/riiputus/
Doctoral theses in the School of Electrical Engineering: https://aaltodoc.aalto.fi/handle/123456789/53
Computers and technology are deeply embedded in our lives today, and people spend a considerable part of their day communicating with technology. Conventional modes of human-technology interaction have predominantly been device-centric, requiring users to be in the vicinity of the device. This can become cumbersome as the number of personal devices owned by an individual increases. Using speech interfaces for human-computer interaction can make communication with technology more efficient. However, to successfully apply speech technology in multi-device scenarios, we need to address challenges in speech enhancement, coding and user privacy in that setting. The objectives of this thesis are to develop methods that advance conventional speech coding for multiple microphones and to understand the state of privacy in speech user interfaces.
To obtain simple and robust speech coding systems, we propose using postfilters to improve speech quality at the receiving end of a speech codec. We develop methods that employ envelope and harmonic models of speech and show the effectiveness of these models in both single-microphone and multi-microphone settings. Furthermore, we address privacy in speech interfaces by investigating how to give smart speech interfaces an intuitive understanding of user privacy preferences. The study reveals that individuals have an intuitive understanding of privacy in speech communication that depends on the acoustic scenario, among other factors. These insights can be exploited in a speech interface by conditioning the privacy preferences on the sensed acoustic environment.
Contact information of doctoral candidate: