The Lahjoita puhetta campaign, which aims to improve Finnish-language speech recognition, received the Grand One media prize in the Best Mobile Service category. Grand One is Finland's biggest digital media competition, and the winners were announced at a virtual gala on 29 April 2021. The campaign also received an honourable mention in the Best Use of Data category.
The purpose of the Lahjoita puhetta campaign is to collect as many different types of spoken Finnish as possible. The collected speech is used to develop speech recognition and artificial intelligence that better understand spoken Finnish. The partners in this public service project are Yle, the University of Helsinki, the Finnish State Development Company Vake Oy (today the Finnish Climate Fund), and Aalto University. Speech can be donated using a web browser on a computer or a mobile app. Learn more about the campaign here (in Finnish).
Professor Mikko Kurimo, who leads the Aalto University speech recognition research group, says that he has discussed the importance of the topic at different events for decades. Kurimo has an important role in implementing the campaign. For example, he has worked on what kinds of data should be collected, how much is needed, and who it should be collected from. Nearly 4,000 hours of spoken Finnish have been collected. Next, Kurimo's research group will develop automatic methods for checking, correcting, and annotating the data, that is, describing and classifying the material.
Other languages lag behind English
So why does Finnish-language speech recognition need to be developed? Speech recognition is increasingly used in many important applications, such as voice control, voice search, dictation, speech transcription, subtitling, interpretation, and information retrieval. According to Kurimo, Finns themselves are the ones who suffer the most from problems with Finnish speech recognition, and it would be unfortunate if using these services continued to require fluency in English.
'Most of the world's more than 6,000 languages are in the same situation. Thanks to this data, which is unique for its coverage and its openness, Finland now has the possibility to be a pioneer in the development of IT applications for small languages,' Kurimo says.
Research into speech and language is also important in itself, because it reveals important aspects of human communication and behaviour. Studying large amounts of material also requires the development of automatic tools.
The speech collected for the Lahjoita puhetta campaign will keep the researchers busy, but the research group is working on other interesting projects as well. The group recently obtained a large amount of speech material from recordings of sessions of the Finnish Parliament between 2008 and 2020. The group also studies computer-assisted measurement and practice of spoken language skills, subtitling of television programmes and films, and speech recognition for challenging user groups such as children and language learners.
Do you wish to take part in the speech campaign and improve Finnish-language speech recognition? The campaign is still in progress, and you can donate speech at lahjoitapuhetta.fi. Remember that the campaign collects all kinds of spoken Finnish, so you don't need to be a native speaker to participate.