Introduction to Research Data Management (RDM) and Open Science
Research data is a valuable resource that often takes a lot of time and money to create, so it is worth taking the time to manage it properly. Funding agencies recognise the importance of data management and increasingly require data management plans before funding research projects.
Which parts of the data are published, when, and how the data is curated are strategic decisions of the principal investigator (PI), who is usually a professor. The PI takes into consideration agreements, commercial interests, policies and the law. The choice of data repository is also a strategic decision of the PI, as is the decision whether to publish software. University services offer guidance for these decisions and support in selecting a repository. Please read more in the Aalto University Research Data Management Policy and a short guide to research data management below.
By managing research data well you can:
- Find and understand data when needed.
- Avoid unnecessary duplication of research work.
- Validate your results if required by publishers, funding agencies or anyone reading your publications.
- Increase the impact of your research.
- Ensure your research is visible by publishing datasets.
- Get credit when others cite your work, if you have published your data.
- Comply with funders' mandates.
- Comply with publishers' requirements. Many leading journals require the underlying datasets to be published or made accessible as part of the essential evidence base of a scholarly article.
At the very least, you should:
- Consider if your data should be open either partially or completely.
- Ensure your data is backed up during and after a research project.
- Ensure you and others can access your data. The best way to do this is to make it open and archive it publicly in a well-known repository, which will also guide you with metadata.
Achieving visibility with data
In recent years, more and more journals have started to offer practical models that foster data visibility, sharing and open access to data. This increases transparency and trust in research, and improves the understanding of claims and results. Journals that have adopted data policies typically recommend depositing the data related to the article in open repositories. Journals can also support openness by offering the possibility to share the article-related data as supplementary material. See more details here.
Data life cycle
Research questions guide the search for existing data in data repositories and help in collecting new data. To browse existing data, you can use metadata catalogues to find data in discipline-specific and general repositories, such as Zenodo and FSD (Finnish Social Science Data Archive). Other sources include citations of datasets in relevant articles and the data from previous research. Interoperability of data makes it possible to combine existing data with the new data you collect in your project into research data.
During the research, it is sometimes possible to store the data with restricted access in the intended long-term repository. You can also use the services offered by IT Services that enable file sharing and the proper handling of backup and security issues. Proper data curation includes describing the data so that it is understandable and usable for yourself and other researchers. Proper versioning keeps track of changes and updates to data.
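As an illustration of describing and versioning data, the following is a minimal sketch, not a prescribed Aalto workflow. It writes a small metadata file next to a data file, recording a description, a version label and a SHA-256 checksum so that later changes to the file can be detected. The function names, the field names and the `.meta.json` suffix are assumptions made for this example.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_dataset(path: Path, description: str, version: str) -> dict:
    """Build a minimal metadata record (hypothetical field names) for one file."""
    return {
        "file": path.name,
        "description": description,
        "version": version,
        "size_bytes": path.stat().st_size,
        "sha256": sha256_of(path),
    }


def write_sidecar(path: Path, record: dict) -> Path:
    """Store the record next to the data file, e.g. data.csv.meta.json."""
    sidecar = path.with_name(path.name + ".meta.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Recomputing the checksum later and comparing it with the stored value shows whether the file has changed since the record was written. A repository's own metadata schema should take precedence over these ad hoc fields.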
Giving open access to research publications often means that the research data underlying the publication is to be published as well. Validated repositories give persistent identifiers to the data. To authorize the reuse of your research data, you need to license the data, considering the rights to the original existing data and newly collected data.
Learn more on research data management
Take a look at these detailed guides to learn how to manage your research data from planning your research to publishing results.
Citing and publishing data
Follow these instructions to make your data reusable and to benefit from open data.
Publishing the underlying research data increases citations to your journal articles and other publications. Data citations are also citations.
At the moment, two types of journals are at the forefront of promoting the opening and sharing of data: journals that require data availability as a precondition for publishing, and scientific data journals that publish descriptions of research datasets.
Data publishing repositories used in Aalto University
In many disciplines, researchers develop software as part of their research work. Software can either be a primary academic output to be widely used, or a byproduct of getting other work done.
Referencing data is as simple as citing publications, and citations to data can be counted and used in research metrics in the same way as the citations to articles.
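To show how little is needed for a data reference, here is a small sketch that assembles a citation string from basic dataset metadata. The element order (creators, year, title, optional version, publisher, identifier) is one common pattern and is assumed here; the journal or repository you use may prescribe a different style. The class, function and field names are hypothetical, and the DOI in the usage note is a placeholder, not a real dataset.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetRecord:
    """Minimal dataset metadata (hypothetical field names)."""
    creators: List[str]
    year: int
    title: str
    publisher: str      # typically the repository holding the data
    identifier: str     # persistent identifier, e.g. a DOI URL
    version: str = ""   # optional version label


def format_data_citation(record: DatasetRecord) -> str:
    """Join the metadata elements into a single citation string."""
    authors = "; ".join(record.creators)
    parts = [f"{authors} ({record.year})", record.title]
    if record.version:
        parts.append(f"Version {record.version}")
    parts.append(record.publisher)
    return ". ".join(parts) + ". " + record.identifier
```

For example, a record with creators `["Doe, J.", "Roe, R."]`, year 2020, title "Example dataset", publisher "Zenodo", version "1.0" and the placeholder identifier `https://doi.org/10.1234/example` is formatted as `Doe, J.; Roe, R. (2020). Example dataset. Version 1.0. Zenodo. https://doi.org/10.1234/example`.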