Data storage, collaboration and backup
There are a number of solutions available for storing your data. As the research proceeds, storage needs may change, and the data will most likely need to be stored in several systems or with different access rights during its lifecycle. To get the most out of your data during its lifecycle, keep it managed, documented, secured and backed up.
Put the data in a repository that secures the backup of the data and where you can choose the level of openness (not opened / opened to your colleagues partially or completely / opened to everyone partially or completely):
- Finnish Social Science Data Archive (FSD)
- ACRIS - Aalto Current Research Information System for research outputs, researchers, projects, datasets, etc.
- Also consider opening the data directly if possible. See here for options.
For day-to-day collaboration, you can consider the cloud services described in the Aalto intranet (login required):
- Google Drive
- Microsoft Office 365 OneDrive for Business
- For a list of the security aspects of various cloud services, please see here
You might also want to check the IT Services for Research page (login required) to make sure you are aware of all resources at Aalto.
Should you have special requirements concerning storage (e.g. more performance, capacity, collaboration, mobility), please contact esupport [at] aalto [dot] fi (Aalto IT support) to find a suitable solution to your specific needs.
What research data should be preserved and shared?
At a minimum, researchers must ensure that the data needed to validate results in scientific publications are preserved and made available, at least to other researchers on request. Everything needed to replicate a study should be preserved, as should anything potentially useful to others. For more information, see “How to select and appraise research data”: www.dcc.ac.uk/resources/how-guides/appraise-select-research-data
The datasets must have the associated metadata: the dataset’s creator, title, year of publication, repository, identifier etc. The Finnish Social Science Data Archive staff can help add the metadata to materials that are stored and opened in the repository, for example interviews. FSD's data descriptions are available online as DDI 2.0 XML files. See more http://www.fsd.uta.fi/en/data/background/ddi-records.html
The datasets should be FAIR (findable, accessible, interoperable and reusable). See the FAIR Guiding Principles for scientific data management and stewardship: http://www.nature.com/articles/sdata201618
The datasets must have a persistent identifier. The repository will assign a persistent ID to the dataset: this is important for discovering and citing the data.
Documentation should be preserved: code books, lab journals and the like. These are important for understanding the data and combining them with other data sources.
Software, hardware, tools, syntax queries and machine configurations should be preserved where relevant: what is needed depends on the domain, and these are important for using the data. (Alternatively, preserve information about the software etc.)
Source: Sarah Jones and Marjan Grootveld: How to write a Data Management Plan https://eudat.eu/events/webinar/joint-eudat-openaire-webinar-%E2%80%9Chow-to-write-a-data-management-plan%E2%80%9D licensed with a CC-BY 4.0 license https://creativecommons.org/licenses/by/4.0/
Data can be archived in a repository, and at the beginning of and during the project the access right can be defined as closed access. This can later be changed to restricted access, embargoed access or open access, according to the goals of the project.
Embargoed access can be used for datasets. With embargoed access, the researchers who collected the data first use it as the underlying data for their publications. Only after publication do they publish the citable datasets, using a license that requires attribution, for example CC BY 4.0 https://creativecommons.org/licenses/by/4.0/ . The license requires that authors and publications are cited according to the Attribution term of the license.
You will need to provide metadata that complies with an international metadata standard. Researchers should follow a metadata standard used in their own field, or a generic standard such as Dublin Core or DataCite. For more information, see the Research Data Alliance (RDA) metadata standards directory: http://rd-alliance.github.io/metadata-directory/standards/. To make providing the metadata easier, answer the following questions during the research work. This information can additionally be listed in a README file.
- Who are the creators and what are their affiliations
- Where the data is located and is there a persistent identifier
- What is the license chosen to allow reuse
- How, when and by whom the data has been collected/created
- How the data has been prepared for analysis
- What kind of data manipulations have taken place
- How and what methods have been used to analyse the data
- What instruments and devices have been used
- Which scientific publications are based on this data
- What is the software used to process and analyse the data
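The questions above can be turned into a simple README template. The following is a minimal sketch in Python, not a prescribed format: the class name `DatasetReadme`, its field names, the section headings and all example values are hypothetical placeholders chosen for illustration, and they do not follow the Dublin Core or DataCite schemas.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetReadme:
    """Answers to the documentation questions (field names are illustrative)."""
    title: str
    creators: list[str]        # one "Name (Affiliation)" string per creator
    location: str              # where the data is stored
    identifier: str            # persistent identifier, once assigned
    license: str               # license chosen to allow reuse
    collection: str            # how, when and by whom the data was collected
    preparation: str           # how the data was prepared for analysis
    manipulations: str         # data manipulations that have taken place
    analysis_methods: str      # methods used to analyse the data
    instruments: list[str] = field(default_factory=list)
    publications: list[str] = field(default_factory=list)
    software: list[str] = field(default_factory=list)


def _bullets(items: list[str]) -> str:
    """Render a list as bullet lines, or a marker when nothing is recorded."""
    if not items:
        return "- (not recorded yet)"
    return "\n".join(f"- {item}" for item in items)


def render_readme(r: DatasetReadme) -> str:
    """Build plain-text README content, one section per question."""
    sections = [
        ("Creators and affiliations", _bullets(r.creators)),
        ("Location and persistent identifier", f"{r.location}\n{r.identifier}"),
        ("License", r.license),
        ("Data collection (how, when, by whom)", r.collection),
        ("Preparation for analysis", r.preparation),
        ("Data manipulations", r.manipulations),
        ("Analysis methods", r.analysis_methods),
        ("Instruments and devices", _bullets(r.instruments)),
        ("Publications based on this data", _bullets(r.publications)),
        ("Software used", _bullets(r.software)),
    ]
    lines = [r.title, "=" * len(r.title), ""]
    for heading, body in sections:
        lines += [heading, "-" * len(heading), body, ""]
    return "\n".join(lines)


# Placeholder values only; replace them with the project's real answers.
example = DatasetReadme(
    title="Example interview dataset",
    creators=["Example Researcher (Example University)"],
    location="Repository chosen by the project",
    identifier="(persistent identifier assigned by the repository)",
    license="CC BY 4.0",
    collection="Semi-structured interviews, collected by the project team",
    preparation="Transcribed and anonymised",
    manipulations="Direct identifiers removed",
    analysis_methods="Thematic analysis",
    software=["(name and version of the analysis software)"],
)
print(render_readme(example))
```

Filling in the template while the research is ongoing, rather than at the end, means the answers are already collected when the repository asks for formal metadata.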
Confidential data and information security
Information security is about keeping your information safe and accessible. Information should be safe: neither changed nor destroyed accidentally. Information should be accessible: available to you, but out of reach of unauthorized users.
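One common way to notice accidental changes is to record a checksum for every file and compare it again later. The following is a minimal sketch using only the Python standard library. It is one possible approach, not an Aalto procedure: the function names and the manifest layout (one "checksum, two spaces, relative path" line per file) are assumptions made for this illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(data_dir: Path, manifest: Path) -> None:
    """Record a checksum line for every file under data_dir."""
    lines = []
    for p in sorted(data_dir.rglob("*")):
        # Skip directories and the manifest itself if it lives in data_dir.
        if p.is_file() and p.resolve() != manifest.resolve():
            rel = p.relative_to(data_dir).as_posix()
            lines.append(f"{sha256_of(p)}  {rel}")
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")


def verify_manifest(data_dir: Path, manifest: Path) -> list[str]:
    """Return relative paths that are missing or whose content has changed."""
    changed = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, rel = line.split("  ", 1)
        target = data_dir / rel
        if not target.is_file() or sha256_of(target) != expected:
            changed.append(rel)
    return changed
```

Calling `write_manifest` once after data collection and `verify_manifest` after each transfer or restore from backup gives a list of files to investigate. An empty list means every recorded file still matches. Files added after the manifest was written are not reported.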
If you obtain confidential data, make sure to follow the nondisclosure agreement (NDA).
Plan the security aspects and the handling of personal data at the beginning of your research. More information on personal data as part of research data is available in the research ethics section. If you start a project that collects confidential data, please contact researchdata [at] aalto [dot] fi.
Here are some things to consider:
- First, classify the information. How sensitive is it? Are there restrictions on how to handle the data and what services to use? Check the guidelines for classification of information in Aalto Intranet (login required).
- Do you need to encrypt sensitive data for transfer and collaboration? For details, look at the encryption guides for Aalto (login required):