Responsible use of Artificial Intelligence in the research process

This page gives Aalto University's guidance on what is considered ethically appropriate use of artificial intelligence (AI) tools when conducting research. Our emphasis here is not on the outcomes of research as such, but on the research process itself, along the lines of research integrity and responsible conduct of research: reproducibility; bearing responsibility for the correctness of the results presented; respecting the authorship of others; and data protection. How should these concepts be interpreted now that generative AI tools can produce content indistinguishable from human-generated content?

Artificial intelligence (AI) based technologies, especially general-purpose AI tools built upon large language models, such as ChatGPT, and text-to-image tools such as DALL-E, have quickly gained popularity, and Aalto University also provides help with their use. They can generate text and artwork that are hard to distinguish from human-made ones. While we encourage their use, an obvious question arises: if they are used in research work, are there any issues with research integrity and responsible conduct of research? Can we find guidance on appropriate use of AI tools when conducting research?

Literally speaking, the range of "AI tools" may vary from a spell-checker to the full functionality of GPT-type tools, but here we refer to generative AI: technologies based on large language models which are able to generate human-imitating text, code or artwork.

To narrow down the presentation: the discussion on “ethical AI”, that is, whether the outcomes of an AI system are ethical and responsible, is important, and there is a lot of research on how and whether it can be ensured that the outcome of an AI system is fair, unbiased and non-discriminatory. Nevertheless, for clarity and conciseness, our approach here is narrower. We are concerned with the research process itself, along the lines of research integrity and responsible conduct of research as advised by the Finnish National Board on Research Integrity TENK: reproducibility; bearing responsibility for the correctness of the results presented; respecting the authorship of others; and data protection. How should these concepts be interpreted now?

Our guidance

The topic keeps evolving as new AI tools become available, but for the time being, Aalto University's guidance is the following:

AI cannot be given authorship, as it is considered a tool, and authorship always involves responsibility that AI cannot bear. You are responsible for the correctness of the results you present, and you need to respect the authorship of other researchers by citing their work where appropriate.
The use of AI should be transparent: openly describe how AI is used in the research process so that others can reproduce your results.
GDPR is to be followed, meaning that research data containing personal data, or any personal data whose disclosure would infringe the rights of data subjects, should not be fed into an online AI tool.
To protect your own unpublished work, do not upload it into an online system, especially if it is of a sensitive nature.
Be aware also that the output of the AI system can be sensitive, and should be treated with caution and according to GDPR.
In artistic work, artificial intelligence is again only a tool and should not be given authorship. When creating artistic output, contributor roles must be explained transparently and specifically. Applicable copyright legislation is to be followed when you publish something based on the output of an AI tool.

Nothing new here, actually! The rules of research integrity and responsible conduct of research still apply.

Reproducibility: If you used an AI system, you should disclose the technical details and parameters so that someone else can verify your results on their own. Online generative AI services (such as various versions of OpenAI ChatGPT) are often updated and sometimes discontinued, making it impossible to reproduce exactly the same results.
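As an illustration of what such a disclosure could look like in practice, the following minimal Python sketch stores the details of one AI-assisted step in a small JSON record that can be kept with the research data. It is only a sketch under stated assumptions: the record fields, the function names make_record and to_json, and the tool and model names in the usage example are all hypothetical and are not prescribed by Aalto University or by any AI service.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class AIUsageRecord:
    """One disclosed use of a generative AI tool (field names are illustrative)."""
    tool: str            # name of the service or program used
    model_version: str   # version string as reported by the tool, if available
    accessed_on: str     # ISO date; online services change over time
    purpose: str         # what the tool was used for in the research process
    prompt: str          # the input given to the tool
    parameters: dict = field(default_factory=dict)  # e.g. temperature, seed
    output_sha256: str = ""  # fingerprint of the output that was actually used


def make_record(tool, model_version, purpose, prompt, output,
                parameters=None, accessed_on=None):
    """Build a record; the output is hashed so it can be matched later."""
    digest = hashlib.sha256(output.encode("utf-8")).hexdigest()
    return AIUsageRecord(
        tool=tool,
        model_version=model_version,
        accessed_on=accessed_on or date.today().isoformat(),
        purpose=purpose,
        prompt=prompt,
        parameters=dict(parameters or {}),
        output_sha256=digest,
    )


def to_json(record):
    """Serialise the record so it can be archived next to the research data."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical tool and model names, for illustration only.
    rec = make_record(
        tool="ExampleChat",
        model_version="example-model-2023-06",
        purpose="first draft of a literature summary",
        prompt="Summarise the main approaches to topic X.",
        output="(text returned by the tool)",
        parameters={"temperature": 0.2},
    )
    print(to_json(rec))
```

Such a record does not make the output of an online service reproducible, since the service itself may change, but it documents what was done and lets others compare their own results against yours.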

Accountability: You are responsible for the correctness of the results you present. The current tools sometimes produce nonsense. Reference to an AI system cannot be used as a disclaimer or an excuse for incorrect results.

You need to respect the authorship of others by citing their work where appropriate. ChatGPT’s output may be based on someone else’s results: can you trace it back to the original references? See more on AI and copyright on this page.

GDPR is to be followed; nothing new here either. Do not feed in confidential information or personal data unless you can trust the way the system handles it. At the moment (6/2023), online OpenAI services such as ChatGPT are not GDPR compliant. Only public data can be uploaded to them. There are local implementations where one can feed personal data to a large language model that runs locally while ensuring full data protection; see more at https://www.aalto.fi/en/services/productised-ai-and-openai-services-in-aalto. If you get personal data as an output, then you are responsible for handling it in an appropriate way and for informing the data subjects.

Make sure that you protect your unpublished work. By feeding text or data from unpublished manuscripts or grant applications into online AI systems, you might be giving permission to reuse your data. Your unpublished work might then be used for training future versions of language models, or passed to third parties (e.g. ChatGPT plugins) without any transparent mechanism, making it impossible for you to trace where your unpublished text or data have gone and how they will be reused.

If you use code written by ChatGPT, you cannot check the licenses of the code snippets that ChatGPT is drawing on. Apart from the problem with licenses, it is a good idea to publish your code in an open source repository so that anyone can check its correctness.

Copyright and contributor issues especially in art

Again, artificial intelligence is only a tool in artistic work, and should not be given authorship. When creating artistic output, contributor roles must be explained transparently and specifically.

Use of copyright-protected artwork to train an AI system is allowed in both the US and the EU when following the exception rules of the applicable copyright legislation. Obviously, AI-based image generators have been trained on material which was copyright protected. If you use an AI-based image generator, you might receive output which contains copyrighted material. Applicable copyright legislation is to be followed when you publish something based on such output.

Whether you have any copyright on your output is another question, and e.g. the US Copyright Office has stated that copyright only protects creations made by humans. That is, copyright will only protect aspects of the work that are original works by the authoring human.

See more on AI and copyright on this page.

The discussion is ongoing

The topic is constantly evolving, and new powerful AI tools enter the market at an unprecedented rate. We need to keep updating the guidance, and you can participate in the discussion. Please do not hesitate to contact the persons listed at the end of the page.

Responsible conduct of research with generative AI technologies is only one aspect of the broader ongoing discussion on the ethics of AI. For further reading, the Recommendation on the Ethics of Artificial Intelligence by UNESCO and the OECD AI Principles offer comprehensive overviews of the topic. The European Union and European Parliament are in the process of preparing regulation on the use of artificial intelligence, and they also provide an introduction to what is meant by general-purpose artificial intelligence.

There are several open questions regarding intellectual property; see for example this news item from the European Commission. Aalto University's guidance on AI and copyright is here.