Research data management
Properly managed research data creates a competitive edge and is an important part of a high-quality research process. Optimal use and reuse of research data is a strategic goal of Aalto University. The goal is to make the data compatible as well as easy to find, reach and understand.
To be a world-class university, we must have a worldwide audience. At the same time, we must not lose opportunities due to internal disorganisation or being unaware of what data is available to us. Thus, it is worthwhile to take some time to ensure that data is managed properly. The importance of data management is being recognised by funding agencies, who increasingly require data management plans before funding research projects.
Which parts of the data are published, when they are published, and how the data is curated are strategic decisions of the principal investigator (PI), who is usually a professor. The PI takes into consideration agreements, commercial interests, policies and the law. The choice of data repository is also a strategic decision of the PI, as is the decision whether to publish software. University services offer guidance for these decisions, and support in selecting a repository. Please read more in the Aalto University Research Data Management Policy (pdf).
By managing research data well you can:
- Find and understand data when needed.
- Avoid unnecessary duplication of research work.
- Validate your results if required by publishers, funding agencies or anyone reading your publications.
- Increase the impact of your research.
- Ensure your research is visible by publishing datasets.
- Get credit when others cite your work, provided you have published your data.
- Comply with funder mandates.
- Comply with publishers' requirements. Many leading journals also require the underlying datasets to be published or made accessible as part of the essential evidence base of a scholarly article.
A short guide to research data management (pdf) can be downloaded here.
At the very least, you should:
- Consider if your data should be open either partially or completely.
- Ensure your data is backed up during and after a research project.
- Ensure you and others can access your data. The best way to do this is to make it open and publicly archive it in a well-known repository, which will also guide you with metadata.
Achieving visibility with data
In recent years, more and more journals have begun to offer practical models to foster data visibility, sharing and open access to data. This increases transparency and trust in research, as well as understanding of claims and results. Typically, the journals that have adopted data policies recommend depositing the data related to the article in open repositories. Journals can also support openness by offering the possibility to share the article-related data as supplementary material. See more details here.
Image source: openscience.fi/research-process-and-data
Data life cycle
Research questions guide the search for existing data in data repositories, and help in collecting new data. To browse existing data, you can use metadata catalogues to find data in discipline-specific and general repositories, such as Zenodo and FSD (Finnish Social Science Data Archive). Other sources include citations of datasets in relevant articles and the data from previous research. Interoperability of data ensures that you can combine existing data with the new data you collect in your project into research data.
It is sometimes possible to store research data during the research in the intended long-term repository with restricted access. You can also use the services offered by IT Services that enable file sharing and the proper handling of backup and security issues. Proper data curation includes the description of the data, making it understandable and usable for yourself and other researchers. Proper versioning keeps track of changes and updates to data.
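As an illustration of how changes to data can be tracked, the following is a minimal Python sketch, not a service offered by Aalto University. It assumes the dataset is a folder of files and records a SHA-256 checksum for each file, so that two snapshots of the folder can be compared. The function names (file_sha256, build_manifest, changed_files) are hypothetical and chosen only for this example.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): file_sha256(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two manifests and list added, removed and modified files."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "modified": sorted(k for k in set(old) & set(new) if old[k] != new[k]),
    }
```

A manifest built this way could be saved next to the data (for example as JSON) each time a version is released, and compared with the next one to document what changed. Dedicated version control or repository versioning features may serve the same purpose better, depending on the size and type of the data.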
Giving open access to research publications often means that the research data underlying the publication is to be published as well. Validated repositories give persistent identifiers to the data. To authorize the reuse of your research data, you need to license the data, considering the rights to the original existing data and newly collected data.