How to cite a data set?

Referencing data is as simple as citing publications, and citations to data can be counted and used in research metrics in the same way as citations to articles. Data citation is essential for promoting access to, sharing of and reuse of your data.
Laine, H (ed.) 2018 Tracing Data - Data Citation Roadmap for Finland. Helsinki, Finland: Finnish Committee for Research Data. URN:
Data reference model: H. Laine (ed.) Tracing Data - Data Citation Roadmap for Finland. Helsinki, Finland: Finnish Committee for Research Data (2018). URN:

Citing is the conventional way to acknowledge someone’s contribution to traditional research outputs, like publications. Equally, the achievements in data collection and production should be recognised. Don’t hesitate to cite your own data either!

The precondition for data citation is a proper record of the data. The data does not have to be publicly open, but information about it has to be available to generate a citation. Repositories, metadata archives, and publishers’ services hold metadata records that offer reliable information for generating the citation.

Citing data: recommendation

Generating a data reference is easy. Use the same citation style for data sets as for the literature cited in your publication.

The main elements are:

  • Author(s) – the author can be the Principal Investigator or creator; other roles can be named, too
  • Title – the name of a data set
  • Date – the year a data set was added to the repository
  • Publisher – the unique identification of the repository/archive hosting the data (e.g. “Finnish Social Science Data Archive”, or by their domain “”).
  • Version number – used in cases where there is more than one version available
  • Access information – a persistent digital identifier such as a DOI or URN is recommended
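
As a rough illustration of how the elements listed above can be combined into one reference, here is a minimal Python sketch. The class name DataCitation, its field names, and the output layout are assumptions made for this example only, not a format prescribed by any repository or citation style. The sample values, including the identifier, are placeholders and do not refer to a real data set.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCitation:
    """Hypothetical container for the main elements of a data citation."""
    authors: List[str]                 # Author(s)
    title: str                         # Title of the data set
    year: int                          # Year the data set was added to the repository
    publisher: str                     # Repository or archive hosting the data
    version: Optional[str] = None      # Only needed when several versions exist
    identifier: Optional[str] = None   # Persistent identifier such as a DOI or URN

    def format(self) -> str:
        # Illustrative layout only; in practice, follow the citation
        # style used for the literature in your publication.
        title = self.title
        if self.version:
            title += f" (Version {self.version})"
        parts = [
            f"{', '.join(self.authors)} ({self.year}).",
            f"{title}.",
            f"{self.publisher}.",
        ]
        if self.identifier:
            parts.append(self.identifier)
        return " ".join(parts)


# Placeholder values, not a real data set or identifier.
citation = DataCitation(
    authors=["A. Author", "B. Author"],
    title="Example data set",
    year=2019,
    publisher="Example Repository",
    version="1.0",
    identifier="doi:10.0000/placeholder",
)
print(citation.format())
# A. Author, B. Author (2019). Example data set (Version 1.0). Example Repository. doi:10.0000/placeholder
```

The version and identifier are optional in this sketch, so a data set with a single version and no persistent identifier is still formatted, just with fewer parts.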

Check what the data repository suggests:

  • When the cited data is held in a particular repository, the repository might suggest a proper way to cite it or even give a ready-made citation for the retrieved data. For example, the Zenodo repository suggests the following citation style:

Richard Darst, Enrico Glerean, Dan Häggman, Clemens Icheln, Mika Jalava, Mikhail Kuklin, … Ella Bingham. (2019). Illustration of Data Agents network of Aalto University: Data Agents: How to put research data management into practice? (Version 1.0). Zenodo.

  • There might also be special recommendations, as in the case of Dryad. Dryad recommends citing both the original article and the Dryad data set.

Extra resources

Links to research data management instructions

Follow these links to navigate through research data management instructions.

Publishing and reusing open data

Overview and instructions to services for sharing and publishing research data

Research Data Management (RDM) and Open Science

Properly managed research data creates competitive edge and is an important part of a high-quality research process. Here you will find links to support, services and instructions for research data management.

This service is provided by:

Research and Innovation Services
