Speech-to-Text

Speech-to-Text is an AI-powered automated transcription service based on Whisper, running on local servers in Aalto’s data centre. It supports multiple file formats and languages. The service is for the Aalto community.

Access

You can access the service here.

Features

Speech2Text transcribes audio from a wide variety of audio and video files. The system consists of a web server and a transcription service. The web server provides an API and endpoints that connect to the transcription service, which uses WhisperX for transcription. Job submission is managed with Celery, which communicates with the rest of the system through HTTPS requests secured by an API key. The files of transcribed jobs (video and audio) are stored on the network file system and can be downloaded from there.
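As a rough illustration of what "HTTPS requests secured by an API key" can look like, the sketch below builds (but does not send) such a request with Python's standard library. The URL, the use of a bearer-style Authorization header, and the payload fields are all assumptions made for the example; this page does not document the service's internal API.

```python
import json
import urllib.request

# Hypothetical values: the real endpoint, header scheme, and payload fields
# are not documented on this page.
API_URL = "https://speech2text.example.invalid/api/jobs"
API_KEY = "replace-with-real-key"


def build_job_request(job_id: str, status: str) -> urllib.request.Request:
    """Build (but do not send) an HTTPS POST that carries an API key."""
    payload = json.dumps({"job_id": job_id, "status": status}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        method="POST",
        headers={
            # Assumed scheme: the key travels in the Authorization header.
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )


request = build_job_request("job-123", "queued")
print(request.get_method(), request.full_url)
```

A receiving server would reject any request whose key does not match the one it expects, so only components that hold the key can submit or update jobs.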

The system accepts files in sound-based formats, such as mp3, m4a, and wav. The current output file types are text-based: json, srt, aalto (an srt file generated in three languages: English, Swedish, and Finnish), and text_only.
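To show what an srt output contains, here is a minimal sketch that turns timed segments into SubRip (srt) text. It assumes each segment is a dictionary with start and end times in seconds and a text field, as in Whisper-style output; the exact structure of this service's json output is not described here, so treat the field names as assumptions.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the HH:MM:SS,mmm form used by srt files."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render segments (assumed keys: 'start', 'end', 'text') as srt text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


example = [
    {"start": 0.0, "end": 2.5, "text": " Welcome to the lecture."},
    {"start": 2.5, "end": 5.0, "text": " Today we talk about speech."},
]
print(segments_to_srt(example))
```

Each block consists of a running number, a time range, and the spoken text, which is why srt files can be loaded directly as subtitles in most video players.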

The service offers a wide range of languages: English, Finnish, Swedish, Arabic, Armenian, Bulgarian, Catalan, Chinese, Czech, Danish, Dutch, Estonian, French, Galician, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kazakh, Korean, Latvian, Lithuanian, Malay, Marathi, Nepali, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Thai, Turkish, Ukrainian, Urdu, and Vietnamese.

Coverage

The service is available within the Aalto network. To use it from outside the Aalto network, first establish a remote connection to the Aalto network.

Use cases

Speech2Text transcribes speech in video or audio files to text. It is suitable for everyday academic activities such as learning, teaching, and research, and is ideal for transcribing lecture recordings, meetings, interviews, and other personal recordings. Because data stays inside Aalto, it provides a safe transcription service for Aalto personnel (students, teachers, researchers, etc.). Users are allowed to transcribe data up to the confidential level of Aalto’s data classification. Personal data can be processed, but special categories of personal data, such as ethnicity, biometrics, and religion, cannot.

Aalto’s data classification

Target groups

The main target groups for Speech2Text are people affiliated with Aalto, such as students, teachers, researchers, and employees. The service is free of charge.

Restrictions and risks

Using Speech2Text for commercial purposes is prohibited. Access is limited to the Aalto network. Data is encrypted according to the Kubernetes cluster specification. Data is uploaded from the user's machine to the service and stored on the cluster file system for processing. The user then downloads the results, and once they have been downloaded, all data is removed. Note that you are not allowed to process secret data or special categories of personal data.

Aalto’s data classification

How to start using the service

To use the service, log in with your Aalto account. On the main page, drop your audio file into the drop box and configure your job description (name, language, number of speakers, and output format). After the job is submitted, it appears in the job list in the Current job section, where it stays while it is being processed. When the output is ready, you can download it to your local storage.
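Before submitting, it can help to check a job description against what this page lists. The sketch below is a hypothetical local pre-check, not part of the service: the class and field names are invented for illustration, and the accepted input formats are limited to the three named under Features, although the service may accept others.

```python
from dataclasses import dataclass
from pathlib import Path

# Assumption: only the formats explicitly named on this page are listed here.
ACCEPTED_INPUT_SUFFIXES = {".mp3", ".m4a", ".wav"}
OUTPUT_FORMATS = {"json", "srt", "aalto", "text_only"}


@dataclass
class JobDescription:
    """Hypothetical container for the fields filled in on the main page."""
    name: str
    language: str
    num_speakers: int
    output_format: str


def validation_errors(audio_path: str, job: JobDescription) -> list[str]:
    """Return a list of problems; an empty list means the job looks valid."""
    errors = []
    if Path(audio_path).suffix.lower() not in ACCEPTED_INPUT_SUFFIXES:
        errors.append(f"unsupported input file type: {audio_path}")
    if not job.name.strip():
        errors.append("job name must not be empty")
    if job.num_speakers < 1:
        errors.append("number of speakers must be at least 1")
    if job.output_format not in OUTPUT_FORMATS:
        errors.append(f"unknown output format: {job.output_format}")
    return errors


job = JobDescription("Lecture 3", "Finnish", 2, "srt")
print(validation_errors("lecture3.mp3", job))
```

Collecting all problems in a list, rather than stopping at the first one, lets a user fix every field in a single pass.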

Support

To use Speech2Text, you must be connected to the Aalto network, either on campus or through a remote connection to the Aalto network. If you have any questions, contact the IT Service Desk at servicedesk@aalto.fi and describe your use case.

More information

IT Service Desk contact information and service hours

Contact IT End User Support for help or information on Aalto University IT. You can visit the service desk during opening hours or ask for help by email, telephone or chat.

This service is provided by:

IT Services

For further support, please contact us.