Aalto researchers win an international speech recognition contest
The speech recognition research group headed by Professor Mikko Kurimo has won the international Multi-Genre Broadcast (MGB) Challenge, in which the task was to create a speech recogniser for Egyptian Arabic based on samples collected from YouTube.
‘The vocabulary of the spoken language of Egypt deviates significantly from that of standard Arabic, and no extensive Egyptian speech material is available. The research group had no prior experience of speech recognition of languages related to Arabic, and none of us could understand any Arabic, but in spite of this, Aalto's system learned to recognise both standard Arabic and Egyptian speech significantly better than any other competitor,’ Professor Mikko Kurimo says.
Aalto’s system used a number of state-of-the-art methods for deep neural network modelling and adaptation, speech recognition and text segmentation. In particular, the tools for sub-word segmentation and language modelling, all of which have been developed in Aalto's research group in recent years, had a marked impact on the performance of the system. These tools can effectively model the numerous word forms of morphologically rich languages such as Finnish and Estonian, and the way those forms appear in speech.
The research group has developed speech technologies for under-resourced languages.
‘Aalto probably won because our system was the only one capable of efficiently using units shorter than words in its language models, and the system was not limited to a pre-selected vocabulary.’
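The idea of building words from shorter units can be illustrated with a small Python sketch. It is not the research group's software: the unit inventory, the word list and the greedy longest-match rule are all hypothetical choices made for illustration. The sketch only shows why a recogniser that works with sub-word units is not tied to a fixed word list.

```python
def segment(word, units):
    """Greedily split `word` into the longest known sub-word units.

    Any character not covered by a known unit is emitted on its own,
    so every input word can be represented.
    """
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate first; fall back to a single character.
        for j in range(len(word), i, -1):
            if word[i:j] in units or j == i + 1:
                pieces.append(word[i:j])
                i = j
                break
    return pieces


# Hypothetical toy inventories (illustrative only, not from the Aalto system).
UNITS = {"talo", "auto", "ssa", "lla", "ni", "kin"}
WORD_VOCAB = {"talo", "talossa", "auto"}

word = "autossanikin"  # Finnish: "in my car, too"
in_vocab = word in WORD_VOCAB   # a word-level vocabulary misses this form
pieces = segment(word, UNITS)   # the sub-word inventory still covers it
print(in_vocab, pieces)
```

In this toy setting the inflected form is missing from the word list but can still be assembled from known units, which is the property the quote above credits for the result.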
Coming in second after Aalto were research groups from Tsinghua University in China and from Johns Hopkins University and MIT in the United States. The Aalto system is described in a paper that will be presented at the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, which will be held in December in Okinawa, Japan. The manuscript is already available through the links below.
Doctoral Candidate Peter Smit
Professor Mikko Kurimo