2 July 2026
Artificial intelligence and large-scale digital data practices are transforming research across all disciplines. Whether you are training a machine learning model, using a generative AI tool to assist with writing or analysis, scraping web data for a corpus, or analysing datasets derived from digital platforms, these activities raise a distinct and growing set of ethical and legal responsibilities.
Generative AI has great potential for accelerating scientific discovery, leading to new research breakthroughs and significant productivity gains, and improving the effectiveness and pace of research and verification processes. At the same time, the technology entails real risks of abuse. Some risks stem from the tool’s technical limitations, and others arise from the intentional or unintentional use of the tool in ways that erode sound research practices. In many respects, these tools can harm research integrity and raise questions about the ability of current models to combat deceptive scientific practices and misinformation.
To address this, the European Research Area Forum developed guidelines on the use of generative AI in research for funding bodies, research organisations, and researchers, both in the public and private research ecosystems.
The result is the Living Guidelines on the Responsible Use of Generative AI in Research (European Commission, 2026), which provide a shared framework applicable across Europe. While non-binding, they should be considered as a supporting tool for researchers, research organisations and research funding bodies, including those applying to the European Framework Programme for Research and Innovation.
AI systems can exhibit a range of limitations that directly affect research quality and integrity:
Beyond these technical limitations, there are also risks linked to the proprietary nature of many tools, including lack of openness, fees, and the potential use of input data by the platform provider.
The use of AI in research is not a neutral act: it engages directly with the core principles of research integrity. The key principles framing the Living Guidelines are grounded in the European Code of Conduct for Research Integrity and the guidelines on Trustworthy AI. They cover:
Reliability in ensuring the quality of research, including verifying and reproducing AI-generated information;
Honesty in transparently disclosing the use of generative AI;
Respect for colleagues, participants, society, and the environment, including proper management of privacy, confidentiality, and intellectual property;
Accountability for all outputs produced, underpinned by human agency and oversight.
Concealing the use of AI in the creation of content or in drafting publications is considered an unacceptable practice under the ALLEA European Code of Conduct for Research Integrity. AI systems are not authors or co-authors: authorship implies agency and responsibility, which rest solely with human researchers.
Transparency and disclosure
Any substantial use of AI must be clearly described in your research outputs. This applies across the full research process: if AI has been used to carry out a literature review, analyse data, generate or refine text, develop hypotheses, or produce images, this must be disclosed in the methods section (or equivalent). You should specify the tool name, version, and date of use, and explain how it shaped your results.
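One way to keep the disclosure items listed above (tool name, version, date of use, and effect on results) together is a small structured record that can be rendered into a methods-section sentence. The following is a minimal sketch, not a prescribed format: the class name `AIUseRecord`, its field names, and the tool name `ExampleLLM` are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AIUseRecord:
    """One disclosed use of an AI tool (field names are illustrative)."""

    tool: str
    version: str
    used_on: date
    purpose: str
    effect_on_results: str

    def to_statement(self) -> str:
        """Render the record as a sentence for a methods section."""
        return (
            f"{self.tool} (version {self.version}) was used on "
            f"{self.used_on.isoformat()} to {self.purpose}. "
            f"{self.effect_on_results}"
        )


record = AIUseRecord(
    tool="ExampleLLM",  # hypothetical tool name
    version="1.0",
    used_on=date(2026, 1, 15),
    purpose="screen abstracts for a literature review",
    effect_on_results="All inclusion decisions were checked manually by two authors.",
)
statement = record.to_statement()
print(statement)
```

Keeping one record per use, created at the time the tool is used, makes it easier to report the version and date accurately when the manuscript is written months later.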
Human agency and oversight
AI tools must support rather than replace human judgement. Researchers should maintain a critical relationship with AI outputs, verifying claims, checking citations, and evaluating results independently. Acknowledging the stochastic nature of generative AI (the tendency to produce different outputs from the same input), researchers should strive for reproducibility and robustness, and openly discuss any limitations arising from the tools used.
When AI systems interact with study participants or the public, those individuals must be clearly informed that they are engaging with an AI and must receive comprehensible information about its capabilities and limitations.
Privacy and intellectual property
Special caution is required when inputting data into external AI platforms. The output produced by generative AI can itself contain personal data. When this becomes apparent, researchers are responsible for handling that output appropriately and in line with EU data protection rules.
Researchers should not upload personal data to external AI systems without explicit consent and a documented lawful basis under the GDPR, and should check the data governance policies of any tool before use.
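As a small illustration of screening text before it leaves your environment, the sketch below replaces email-like strings with a placeholder. It is deliberately crude and rests on stated assumptions: the function name `redact_emails` and the pattern are illustrative, the pattern will miss unusual address formats, and removing email addresses does not by itself make a text anonymous, since names, locations, and other details can still identify a person.

```python
import re

# Deliberately simple pattern; it will miss unusual address formats.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str, placeholder: str = "[EMAIL]") -> tuple[str, int]:
    """Replace email-like strings; return the new text and the number replaced."""
    return EMAIL_RE.subn(placeholder, text)


cleaned, n_redacted = redact_emails(
    "Contact jane.doe@example.org or j.smith@example.com for access."
)
print(cleaned)     # Contact [EMAIL] or [EMAIL] for access.
print(n_redacted)  # 2
```

A count of redactions greater than zero is a useful signal that the input deserves a manual check before any upload is considered.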
AI-generated text, code, or images may incorporate or closely resemble existing protected works. All AI-generated content must be critically reviewed before publication for factual errors, invented citations, bias, and inadvertent reproduction of third-party material.
Environmental responsibility
Large generative models carry significant computational costs. Researchers should evaluate whether the tool chosen is appropriately matched to the task, and consider the environmental impact of AI use as part of responsible research practice.
Digital data collection (including web scraping, API access, use of social media datasets, or mining of existing databases) engages multiple overlapping legal and ethical obligations. The regulatory frameworks differ substantially depending on the type of data involved.
Personal data (names, email addresses, location data, IP addresses, or any information that could directly or indirectly identify an individual) falls under the General Data Protection Regulation (GDPR). If your dataset includes personal data, you must establish a lawful basis for processing, apply the principles of data minimisation and purpose limitation, implement appropriate security measures, and consider conducting a Data Protection Impact Assessment (DPIA) when the processing is likely to result in high risks to individuals.
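Data minimisation can be made concrete at the point of ingestion. The sketch below keeps only the fields an analysis needs and replaces a direct identifier with a keyed hash. The field names, the `NEEDED_FIELDS` set, and the `pseudonymise` function are hypothetical. Note that pseudonymised data is still personal data under the GDPR, so this reduces risk but does not remove the obligations described above.

```python
import hashlib
import hmac

# Fields the (hypothetical) analysis actually needs; everything else is dropped.
NEEDED_FIELDS = {"age_band", "response"}


def pseudonymise(record: dict, secret_key: bytes) -> dict:
    """Keep only needed fields and replace the email with a keyed hash token."""
    token = hmac.new(
        secret_key, record["email"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    reduced = {k: v for k, v in record.items() if k in NEEDED_FIELDS}
    # A truncated token is enough to link records from the same participant.
    reduced["participant_id"] = token[:16]
    return reduced


raw = {
    "email": "jane.doe@example.org",
    "ip": "203.0.113.7",
    "age_band": "30-39",
    "response": "agree",
}
key = b"store-this-key-separately"  # placeholder; keep the real key out of the dataset
out = pseudonymise(raw, key)
print(sorted(out))
```

A keyed hash (HMAC) is used here instead of a plain hash because, without the key, a third party cannot recompute tokens from a list of known email addresses. The key should be stored separately from the dataset and under stricter access control.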
Non-personal data may still be protected by other legal instruments. The Database Directive (96/9/EC) protects databases whose selection or arrangement is original (through copyright) or whose creation involved substantial investment (through the sui generis database right). Extracting or reusing a substantial part of such a database without authorisation may infringe these rights.
Web scraping and terms of service
Before scraping any website, review its terms of service carefully: many platforms explicitly restrict automated access or commercial reuse of their content. Breaching these terms may expose researchers to civil liability. EU copyright law does include a specific exception for text and data mining (TDM) for scientific research purposes under Directive 2019/790, but this exception has conditions and does not override the GDPR or database rights.
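Alongside reading the terms of service, a scraper can check a site's robots.txt file, which is a machine-readable statement of which paths automated clients are asked to avoid. The sketch below uses Python's standard `urllib.robotparser` on an inline example file so that it runs without network access. The robots.txt content, the user agent name `research-bot`, and the URLs are made up for illustration. A robots.txt check is an additional courtesy and signal only: it does not replace reviewing the terms of service, and it says nothing about the GDPR or database rights.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content (invented for this sketch).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask whether a given user agent may fetch a given URL.
allowed_public = parser.can_fetch("research-bot", "https://example.org/articles/1")
allowed_private = parser.can_fetch("research-bot", "https://example.org/private/data")

print(allowed_public)   # True
print(allowed_private)  # False
```

For a live site, `parser.set_url(...)` followed by `parser.read()` would fetch the real file in place of the inline string.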
Intellectual property and third-party data
Before incorporating datasets, images, or text produced by others into your research, including as inputs to AI tools, verify whether the material is protected by copyright or database rights. If you are publishing datasets derived from third-party sources, ensure that your reuse is covered by an appropriate licence or falls within a recognised legal exception.
The EU Artificial Intelligence Act entered into force on 1 August 2024, with provisions being phased in over two to three years. It applies a risk-based framework to AI systems across four tiers: prohibited applications, high-risk systems (subject to strict obligations including conformity assessments and human oversight), limited-risk systems (subject to transparency obligations), and minimal-risk systems. Researchers developing or deploying AI should assess where their systems fall within this framework, maintain detailed technical documentation throughout the project, and apply the principles of transparency, human oversight, and non-discrimination from the outset.
A useful self-assessment tool is the Assessment List for Trustworthy Artificial Intelligence (ALTAI), developed by the EU High-Level Expert Group on AI.
Before and during your project:
When publishing:
Key references and further reading:
Living Guidelines on the Responsible Use of Generative AI in Research
ALLEA European Code of Conduct for Research Integrity
Guidelines for AI use in EU project deliverables (based on EC recommendations)
EU Grants: How to complete your ethics self-assessment
Assessment List for Trustworthy Artificial Intelligence (ALTAI)