Decoding the chemistry of space with machine learning

Astronomers can detect complex chemical fingerprints in stardust – but many of them remain unidentified. The SpaceML project combines machine learning and computational chemistry to simulate how molecules form and evolve in space, helping researchers decode these signals.
Left: Academy Research Fellow Rina Ibragimova. Right: Graphene in space (artist's concept). Photo: NASA.

Organic molecules made of carbon, hydrogen, and oxygen are widespread in the universe. Yet important questions remain about where these compounds come from and how they form. 

The SpaceML project models how organic molecules form and break down around stars, with the aim of explaining spectroscopic signals – signatures produced by light-matter interactions that astronomers can detect but often struggle to interpret. 

“Machine learning allows us to simulate processes that were completely out of scope five years ago – harsh astrophysical conditions, constant radiation, rare reaction events. In practice, we now have tools to simulate matter in space. This way we can study systems that were previously impossible to approach computationally,” says Academy Research Fellow Rina Ibragimova.

Ibragimova is an expert in computational physics and the Principal Investigator of the SpaceML project. She is also a member of the Data-driven Atomistic Simulation (DAS) group led by professor Miguel Caro at Aalto. 

The idea to use machine learning in astrochemistry emerged from an unexpected interdisciplinary meeting. A few years ago, during a visit to the University of La Laguna in Spain, one of Ibragimova’s lectures was attended by an astronomer. 

“Before that meeting, I had not thought about astrochemistry at all. But we talked and found there was a lot of common ground and that we could work together. That’s how the collaboration emerged. We started planning potential projects, and later I wrote my own proposal to the Academy of Finland, and SpaceML was formed,” Ibragimova says.

Carbon-rich AGB star with molecule formation stages at 10^14 cm, 10^15 cm, and 10^16 cm radii, denoted in white on a starry background.
A conceptual depiction of the molecules in the circumstellar medium (CRM) – the gas, dust, and plasma surrounding a star. Credit: Martínez, L., Santoro, G., Merino, P. et al. / Nature Astronomy.*

The challenge of identifying molecules in space 

When organic molecules drift around stars, they interact with light in characteristic ways. Using infrared spectroscopy, researchers break that light into a spectrum – a pattern of peaks and features. Each molecular structure leaves its own ‘fingerprint’. 

For simple molecules, the signal can be straightforward to link to a specific bond or structure. Often, however, astronomers observe complex mixtures. Their spectral signatures overlap, producing features that may correspond to several possible structures – or to none that can be identified with confidence. As a result, many signals in astronomical data remain unexplained. 
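
The overlap problem can be illustrated with a minimal sketch. It assumes idealized Lorentzian band shapes, and the band positions and widths are hypothetical values chosen for illustration, not SpaceML data. Two absorption bands that lie far apart show up as two separate peaks, while two bands closer together than their width blend into a single feature.

```python
def lorentzian(x, center, hwhm):
    """Idealized band shape with unit height; hwhm is the half width at half maximum."""
    return hwhm**2 / ((x - center) ** 2 + hwhm**2)


def count_maxima(values):
    """Count strict local maxima in a sampled curve."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )


def mixture_maxima(centers, hwhm=10.0, lo=1500.0, hi=1700.0, step=0.5):
    """Sum one band per center on a wavenumber grid (cm^-1) and count the peaks."""
    n_steps = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n_steps + 1)]
    total = [sum(lorentzian(x, c, hwhm) for c in centers) for x in grid]
    return count_maxima(total)


# Hypothetical band positions in cm^-1, for illustration only.
well_separated = mixture_maxima([1580.0, 1640.0])  # bands 60 cm^-1 apart
overlapping = mixture_maxima([1600.0, 1608.0])     # bands 8 cm^-1 apart

print(well_separated, overlapping)  # prints "2 1"
```

In the second case the summed curve has only one maximum, so the number of underlying bands cannot be read off the spectrum by eye. Real mixtures contain many more bands of unequal strength, which makes the problem harder still.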

In materials science, researchers solve this problem in a more controlled way. 

“We synthesize a specific compound in the lab, measure its infrared spectrum, and create a direct link between structure and signal. In astronomy, the challenge is much greater. Researchers often do not know what kinds of molecules are present to begin with. The number of possible molecular structures and combinations is vast, making it difficult even to decide what to test experimentally,” Ibragimova explains. 

Laboratories can recreate certain space-like conditions by placing a starting material – a precursor – in a chamber, exposing it to hydrogen atoms and ultraviolet radiation, and measuring how its spectrum evolves. But selecting the right precursor is a major challenge. 

“This is where computational simulations become essential. By screening large numbers of precursor structures with advanced modeling and machine learning, we can identify the ones most likely to explain specific infrared features observed in space,” Ibragimova says. 
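
As a rough illustration of what such screening can look like, the sketch below ranks candidate structures by how closely their simulated spectra resemble an observed one. It is not the SpaceML workflow: the candidate names, band positions, Lorentzian band shape, and cosine-similarity score are all assumptions made for the example.

```python
import math


def lorentzian_spectrum(peaks, grid, hwhm=7.5):
    """Build a spectrum on `grid` from (center_cm1, intensity) pairs."""
    return [
        sum(inten * hwhm**2 / ((x - c) ** 2 + hwhm**2) for c, inten in peaks)
        for x in grid
    ]


def cosine_similarity(a, b):
    """Similarity between two sampled spectra; 1.0 means identical shape."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(observed, candidates, grid):
    """Return (name, score) pairs sorted from best to worst match."""
    scores = {
        name: cosine_similarity(observed, lorentzian_spectrum(peaks, grid))
        for name, peaks in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Wavenumber grid in cm^-1; all band positions below are hypothetical.
grid = [float(x) for x in range(600, 3401, 5)]
observed = lorentzian_spectrum([(3050, 1.0), (1610, 0.8), (890, 0.6)], grid)
candidates = {
    "candidate_A": [(3050, 1.0), (1600, 0.7), (885, 0.5)],
    "candidate_B": [(2920, 1.0), (1450, 0.6)],
    "candidate_C": [(1720, 1.0), (1100, 0.4)],
}

ranking = rank_candidates(observed, candidates, grid)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Here candidate_A ranks first because its bands sit close to the observed ones. In practice the candidate spectra would come from a predictive model rather than a hand-written peak list, and the top-ranked structures would be passed on for laboratory testing.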

Building reusable methods 

Beyond its astrochemical focus, the SpaceML project aims to develop computational methods that can be applied widely in chemistry and materials science. Ultimately, the goal is to make the simulations accessible – not something that takes an entire PhD to run, but something more efficient and widely usable. 

A second goal is to fill theoretical gaps and stitch the existing pieces together into a coherent methodology. Although many machine-learning approaches already exist, Ibragimova sees a need for a more integrated workflow. 

“Ideally, you would have an idea, run the calculation quickly, identify the structure and its spectrum, and then bring that information to collaborators and say, ‘We think these structures are relevant under these conditions – you could try to test them,’” she says. 

Such a framework could guide experimentalists more efficiently and help make telescope observations more targeted. 

“For example, with the James Webb Space Telescope, we could better predict what to look for and estimate the confidence of identifying certain structures.” 

The project builds on tools developed within the Data-driven Atomistic Simulation (DAS) group while creating new ones. One model predicts infrared spectra using machine learning. Another, developed in collaboration with professor Andrea Sand at Aalto, simulates the effects of ionizing UV radiation on materials. 

“It’s amazing that the same tools can simulate catalytic materials and materials millions of light-years away. What’s fascinating is the difference in time scales. Stellar evolution happens over thousands or millions of years, while atomistic simulations might cover only hundreds of picoseconds. And yet with careful considerations and thorough verification, we can capture meaningful patterns,” Ibragimova says.

* The source for the image depicting the circumstellar medium can be found here. Rina Ibragimova is not a co-author of the article from which the image is derived, but José A. Martín-Gago and Gonzalo Santoro are collaborators in her Academy Research Fellowship (ARF) project.

Further information:

Data-driven Atomistic Simulation (DAS)

Research group led by Miguel Caro

Department of Chemistry and Materials Science

A paradigm shift: machine learning is transforming research at the atomic scale

Assistant professor Miguel Caro and his research group use and develop machine learning tools to accelerate discoveries from simulation to experiment
