Eero Hyvönen introduces masses of data to humanists – and everyone else

Data from the ‘sampos’ developed by Hyvönen and his colleagues are open to everyone, and they facilitate the work of historians, for example
Screenshot from BiographySampo
In BiographySampo, one can search for and compare not only individuals but also groups of people. In the screenshot, the user has selected generals and admirals (on the left) and priests (on the right) from the period of the Grand Duchy of Finland, which existed between 1809 and 1917.

Looking for the latest book tips or information on a relative who fell in the Winter War? Out to find the life story of a famous Finn? Or are you perhaps a historian interested in how medieval texts have moved around the world?

Eero Hyvönen, Professor at Aalto University and Director of the Helsinki Centre for Digital Humanities (HELDIG), and his Semantic Computing Research Group (SeCo) have made it significantly easier to search for such information. Hyvönen is known as the person behind the sampos.

Using the sampos, anyone can search for information on a range of subjects, from Finnish fiction and soldiers who fell during World War II to the lives and close circles of famous Finns.

Sampos combine data picked from different databases. They can also be used to visualise and analyse the data. 'At some point, we realised that the sampo makes for a rather good brand. I believe we have twelve different sampos at the moment, with more on the way,' Hyvönen states.

BookSampo and WarSampo have gained the most popularity

Sampos facilitate the work of humanists, as they provide access to data-analytical tools without requiring in-depth IT skills. According to Hyvönen, sampos have led several humanists to take an interest in technology.

Hyvönen highlights that all sampos are open: the data and the applications based on this data can be used by anyone and in any manner of their choosing. Several sampos have been of interest to researchers, as revealed by the number of users.

Portrait of Eero Hyvönen

BookSampo, which combines all the works of fiction from Finland’s public libraries, was used by a total of two million people last year. It is currently maintained by public libraries, which update the database whenever they acquire new works of fiction.

The second most popular sampo is WarSampo, which has received international awards and attracted over 630,000 users since 2015. It combines data related to the Winter War and Continuation War from different war history sources. 'We know of all the approximately 95,000 Finns who died in World War II and thousands of other known soldiers, based on materials from the National Archives. The linked data also includes thousands of military units, tens of thousands of war diaries, 160,000 authentic photographs from the Finnish Defence Forces, historical maps and so much more,' Hyvönen says.

The user can, for example, look for information on a relative who died as a soldier in World War II. This appears to be the most common reason people turn to WarSampo. 'The system automatically reconstructs the soldier’s path through the war, or war story.'

BiographySampo reveals surprising connections between famous Finns

Perhaps the most versatile of all the sampos is BiographySampo. It contains over 13,000 biographies of famous Finns from the Finnish Literature Society, complemented by sixteen other sources, such as BookSampo and WarSampo. BiographySampo has seen over 29,000 visitors.

When developing BiographySampo, researchers created a gigantic semantic network from biographical texts with the help of artificial intelligence. The network includes 120 million connections between different pieces of data. The service can be used to study the biographical events of different people on maps and over time, along with their movements, networks and connections. BiographySampo readily displays, for example, a renowned person’s relatives, the parts of the world in which they have been influential, and how.

In the portal, you can also conduct language analysis and explore how strongly certain words are associated with different people. 'It reveals, for example, that biographies about female Members of Parliament frequently use the words "child" or "family," whereas family matters are rarely brought up in connection with corresponding male Members of Parliament.'

Some connections between different people may even seem puzzling. 'Looking, for example, at the egocentric network of Tapio Rautavaara, you can see that he has a direct connection to academic and poet Aale Tynni, which seems a little peculiar. However, BiographySampo reveals that they both won gold medals at the London Olympics: Aale Tynni for lyric works, which was an Olympic event at the time.'

The latest sampo reveals how medieval texts may have circulated around the world over the centuries

The newest addition to the Sampo series is Mapping Manuscript Migrations (MMM). Published in Washington DC at the end of January 2020, it has proven useful to historians in particular.

MMM combines over 200,000 handwritten documents from the Middle Ages and the Renaissance period, along with 900,000 related events. The documents have been compiled from three massive sources: the famous Bodleian Library at the University of Oxford, the Schoenberg Institute in the US and the French research institute IRHT.

'We gathered information from these different organisations to make the manuscripts easier to research. These are internationally circulating manuscripts and the same ones are mentioned in different databases. In this project, data from the different databases were combined in order to provide a global view.'

The service can be used to find out, for instance, when a certain manuscript was made and by whom. It also includes over 2,000 medieval copies of works by the ancient Greek philosopher Aristotle. Since many of the texts are copies, their contents may deviate from the original.

'For instance, the adventures of Marco Polo come in many different versions. A new copyist or publisher looking to do business may have added a few funny anecdotes,' Hyvönen says and laughs.

One idea behind the new sampo was to show in the portal’s map view how the documents have circulated globally. User-friendly data analysis tools have been integrated into this sampo as well, and they can be used without any additional training. 'If the researcher is not satisfied with our visualisations and wants to use, say, another map programme, they can select an interesting cluster of data and download it as a spreadsheet.'

A little over a month after MMM was published, it had attracted approximately 1,500 users. Considering that it is aimed at researchers of medieval manuscripts in particular, the number is quite high.

Do you have a historical background, considering your devotion to these subjects? 'I do find history interesting, of course, and I’ve always admired the multidisciplinary work of Renaissance people, but I am a graduate of Helsinki University of Technology,' Hyvönen says.

He considers the subject well suited to semantic research. 'Even though we’re not professional historians, we are able to understand these things at a general level, which makes this an understandable research theme. A more in-depth understanding of the projects comes from the humanist researchers who are involved. Collaborating with the Helsinki Centre for Digital Humanities at the University of Helsinki’s Faculty of Arts is an important part of our work.'

English translation: Annika Rautakoura

A student sitting at a laptop
Photo: Unto Rautio / Aalto University

The sampos that are currently being developed:

  • AcademySampo, which holds detailed information on the 28,000 people who received academic education in Finland in 1640–1899
  • FindSampo, which is being developed from archaeological finds data from the Finnish Heritage Agency and the National Museum of Finland
  • LawSampo, which is being created by researchers in collaboration with the Finnish Ministry of Justice and Edita Publishing and publishes central Finnish legislation and legal cases as an intelligent semantic portal
  • ParliamentSampo, which is based on materials from the parliament and is being developed for research on political culture within the DIGIHUM programme of the Academy of Finland
  • HistorySampo, which addresses Finnish history and utilises, for example, the timeline data of Suomen humanistiverkko Agricola ('Finnish Network of Humanists – Agricola')

Eero Hyvönen reveals that an open infrastructure linking the different sampos is in the making. 'You might call it SampoSampo.'