Events

Defence of dissertation in the field of Language Technology, M.Sc. (Tech.) Peter Smit

Public defences

The title of the thesis is “Modern subword-based models for automatic speech recognition”.

When

17.6.2019 12:00 – 23:59 (UTC +3)

Where

Health Technology House

Otakaari 3, 02150 Espoo F239 a

Event language(s)

English.

This thesis implements subwords in a theoretically sound way for modern speech recognition systems based on finite-state transducers. The implementation is powerful enough that even single characters can be used as units, which allows truly unlimited-vocabulary decoding and makes it possible to use new character-based language models that could not be used before.

Whereas older-style n-gram language models can handle large word vocabularies, newer neural network language models face practical limits on vocabulary size. It is shown that subword models reduce the need for workarounds and heuristics, and reduce the number of parameters needed to train models of similar power.

Through reasoning and experimentation, it is shown that these new subword-based models outperform word-based models in almost all scenarios. In particular, the subword-based models are strong for languages with very large vocabularies and for languages with limited text resources.
In practice, these inventions and results will make speech recognition more accurate and easier to build, even for languages that have only limited resources available.

Opponent: Professor Thomas Hain, University of Sheffield, UK

Supervisor: Professor Mikko Kurimo, Aalto University School of Electrical Engineering, Department of Signal Processing and Acoustics

Thesis webpage

Contact information: Peter Smit, Department of Signal Processing and Acoustics, peter.smit@aalto.fi, +31622782851

Updated: 9.12.2022
Published: 10.5.2019