PINDARE

Artificial Intelligence and Research

Why does it matter?

Generative AI has great potential for accelerating scientific discovery, leading to new research breakthroughs and significant productivity gains, and improving the effectiveness and pace of research and verification processes. At the same time, the technology entails real risks of abuse. Some risks stem from the technical limitations of the tools, and others arise from their intentional or unintentional use in ways that erode sound research practices. In many respects, these tools can harm research integrity and raise questions about the ability of current models to combat deceptive scientific practices and misinformation.

To address this, the European Research Area Forum developed guidelines on the use of generative AI in research for funding bodies, research organisations, and researchers, both in the public and private research ecosystems.

The result is the Living Guidelines on the Responsible Use of Generative AI in Research (European Commission, 2026), which provide a shared framework applicable across Europe. While non-binding, they should be considered as a supporting tool for researchers, research organisations and research funding bodies, including those applying to the European Framework Programme for Research and Innovation.

Main issues

AI systems can exhibit a range of limitations that directly affect research quality and integrity:

Training data bias: biases present in training data can produce skewed or inaccurate outputs, reflecting and amplifying systemic inequalities in the source material.
Sycophantic behaviour: models may align their responses with the perceived beliefs or preferences of the user, producing misleadingly agreeable outputs rather than accurate ones.
Invented citations: generative AI models may produce plausible-sounding but entirely fictitious references, which can seriously mislead anyone relying on those sources.
Opacity: AI models operate as “black boxes”, making it difficult to understand how specific responses are generated. This makes independent verification essential, particularly in automated data analysis.
Hallucinations: models regularly produce confident but factually incorrect statements, requiring systematic critical review of all outputs before use.

Beyond these technical limitations, there are also risks linked to the proprietary nature of many tools, including lack of openness, fees, and the potential use of input data by the platform provider.

AI and research integrity

The use of AI in research is not a neutral act: it engages directly with the core principles of research integrity. The key principles framing the Living Guidelines are grounded in the European Code of Conduct for Research Integrity and the guidelines on Trustworthy AI. They cover:

Reliability in ensuring the quality of research, including verifying and reproducing AI-generated information;
Honesty in transparently disclosing the use of generative AI;
Respect for colleagues, participants, society, and the environment, including proper management of privacy, confidentiality, and intellectual property;
Accountability for all outputs produced, underpinned by human agency and oversight.

Concealing the use of AI in the creation of content or in drafting publications is considered an unacceptable practice under the ALLEA European Code of Conduct for Research Integrity. AI systems are not authors or co-authors: authorship implies agency and responsibility, which rest solely with human researchers.

What is recommended

Transparency and disclosure

Any substantial use of AI must be clearly described in your research outputs. This applies across the full research process: if AI has been used to carry out a literature review, analyse data, generate or refine text, develop hypotheses, or produce images, this must be disclosed in the methods section (or equivalent). You should specify the tool name, version, and date of use, and explain how it shaped your results.
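One way to keep these details at hand is to log each use of an AI tool as a structured record while the project is running. The sketch below is only an illustration: the class name AIUseRecord, its field names, and the tool name "ExampleLLM" are assumptions made for this example and are not prescribed by the Living Guidelines or by any journal.

```python
from dataclasses import dataclass, asdict
from datetime import date
import json


@dataclass
class AIUseRecord:
    """One entry in a hypothetical AI-use log (field names are illustrative)."""
    tool_name: str
    tool_version: str
    date_of_use: date
    research_stage: str      # e.g. "literature review", "data analysis"
    effect_on_results: str   # how the tool shaped the output


def disclosure_statement(records):
    """Turn logged records into sentences that can be adapted for a methods section."""
    return "\n".join(
        f"{r.tool_name} (version {r.tool_version}) was used on "
        f"{r.date_of_use.isoformat()} for {r.research_stage}: {r.effect_on_results}"
        for r in records
    )


def to_json(records):
    """Serialise the log so it can be archived with the project files."""
    return json.dumps(
        [{**asdict(r), "date_of_use": r.date_of_use.isoformat()} for r in records],
        indent=2,
    )


records = [
    AIUseRecord(
        tool_name="ExampleLLM",  # placeholder, not a real product
        tool_version="1.0",
        date_of_use=date(2025, 1, 15),
        research_stage="literature review",
        effect_on_results="suggested search terms; all sources were checked manually.",
    )
]
print(disclosure_statement(records))
```

The generated sentences are a starting point only. The final wording should follow the norms of your discipline and any requirements set by your journal or funder.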

Human agency and oversight

AI tools must support rather than replace human judgement. Researchers should maintain a critical relationship with AI outputs, verifying claims, checking citations, and evaluating results independently. Acknowledging the stochastic nature of generative AI (the tendency to produce different outputs from the same input), researchers should strive for reproducibility and robustness, and openly discuss any limitations arising from the tools used.

When AI systems interact with study participants or the public, those individuals must be clearly informed that they are engaging with an AI and must receive comprehensible information about its capabilities and limitations.

Privacy and intellectual property

Special caution is required when inputting data into external AI platforms. The output produced by generative AI can itself contain personal data. If this becomes apparent, researchers are responsible for handling that data appropriately and in line with EU data protection rules.

Researchers should not upload personal data to external AI systems without explicit consent and a documented lawful basis under the GDPR, and should check the data governance policies of any tool before use.

AI-generated text, code, or images may incorporate or closely resemble existing protected works. All AI-generated content must be critically reviewed before publication for factual errors, invented citations, bias, and inadvertent reproduction of third-party material.

Environmental responsibility

Large generative models carry significant computational costs. Researchers should evaluate whether the tool chosen is appropriately matched to the task, and consider the environmental impact of AI use as part of responsible research practice.

Digital Data Collection and Research

Key ethical and legal considerations

Digital data collection (including web scraping, API access, use of social media datasets, or mining of existing databases) engages multiple overlapping legal and ethical obligations. The regulatory frameworks differ substantially depending on the type of data involved.

Personal data (names, email addresses, location data, IP addresses, or any information that could directly or indirectly identify an individual) falls under the General Data Protection Regulation (GDPR). If your dataset includes personal data, you must establish a lawful basis for processing, apply the principles of data minimisation and purpose limitation, implement appropriate security measures, and consider conducting a Data Protection Impact Assessment (DPIA) when the processing is likely to result in high risks to individuals.
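As a rough illustration of data minimisation in practice, the sketch below keeps only the fields a study needs and replaces a direct identifier with a keyed hash. The field names, the FIELDS_NEEDED set, and the key are hypothetical. Note that pseudonymised data of this kind remains personal data under the GDPR, so this step complements a lawful basis and appropriate security measures and does not replace them.

```python
import hashlib
import hmac

# Hypothetical: the only fields this study's research question needs.
FIELDS_NEEDED = {"age_group", "country", "response"}


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def minimise(record: dict, secret_key: bytes) -> dict:
    """Keep only the needed fields and swap the email for a pseudonym."""
    reduced = {k: v for k, v in record.items() if k in FIELDS_NEEDED}
    reduced["participant_id"] = pseudonymise(record["email"], secret_key)
    return reduced


raw = {
    "email": "participant@example.org",
    "ip_address": "203.0.113.7",
    "age_group": "25-34",
    "country": "BE",
    "response": "agree",
}
key = b"store-this-key-separately"  # placeholder; keep real keys out of the code
clean = minimise(raw, key)
print(sorted(clean))  # ['age_group', 'country', 'participant_id', 'response']
```

A keyed hash is used here instead of a plain hash so that someone without the key cannot recompute pseudonyms from a list of known email addresses. The key should be stored separately from the dataset.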

Non-personal data may still be protected by other legal instruments. The Database Directive (96/9/EC) protects databases that reflect either originality or substantial investment. Using a substantial portion of such a database without authorisation may infringe on these rights.

Web scraping and terms of service

Before scraping any website, review its terms of service carefully: many platforms explicitly restrict automated access or commercial reuse of their content. Breaching these terms may expose researchers to civil liability. EU copyright law does include a specific exception for text and data mining (TDM) for scientific research purposes under Directive 2019/790, but this exception has conditions and does not override the GDPR or database rights.
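Alongside the terms of service, many websites publish a robots.txt file that signals which paths automated clients may access. The sketch below uses Python's standard urllib.robotparser module to check a URL against such rules. The robots.txt content, the user agent name "research-bot", and the URLs are made up for this example. A robots.txt check is a technical courtesy only: it is an assumption of this example that the site publishes one, and it does not replace reading the terms of service or assessing the GDPR, copyright, and database rights.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content; a real file would be fetched from the site.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_path_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_path_allowed(EXAMPLE_ROBOTS_TXT, "research-bot", "https://example.org/articles/1"))
print(is_path_allowed(EXAMPLE_ROBOTS_TXT, "research-bot", "https://example.org/private/data"))
```

With these example rules, the first URL is allowed and the second is not.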

Intellectual property and third-party data

Before incorporating datasets, images, or text produced by others into your research, including as inputs to AI tools, verify whether the material is protected by copyright or database rights. If you are publishing datasets derived from third-party sources, ensure that your reuse is covered by an appropriate licence or falls within a recognised legal exception.

The EU AI Act

The EU Artificial Intelligence Act entered into force on 1 August 2024, with provisions being phased in over two to three years. It applies a risk-based framework to AI systems across four tiers: prohibited applications, high-risk systems (subject to strict obligations including conformity assessments and human oversight), limited-risk systems (subject to transparency obligations), and minimal-risk systems. Researchers developing or deploying AI should assess where their systems fall within this framework, maintain detailed technical documentation throughout the project, and apply the principles of transparency, human oversight, and non-discrimination from the outset.

A useful self-assessment tool is the Assessment List for Trustworthy Artificial Intelligence (ALTAI), developed by the EU High-Level Expert Group on AI.

What you should do

Before and during your project:

If applying for EU-funded research, conduct an ethics self-assessment addressing the AI and digital data dimensions of your project. Consult the document EU Grants: How to complete your ethics self-assessment (pages 39–45 cover AI).
Contact your local ethics committee if your research involves the development of AI systems, large-scale digital data collection, or the use of AI in interaction with human participants.
Apply the principles of data minimisation, purpose limitation, and security to all personal data, and consult your institution’s Data Protection Officer where needed.
Review website terms of service before any scraping activity and confirm that your collection falls within applicable legal exceptions.
Use the ALTAI checklist to evaluate the trustworthiness of any AI system you develop or deploy.
Document your use of AI tools throughout the project, including which tools were used, how, and at which stages.

When publishing:

Disclose all substantial uses of AI in methods sections, following the norms of your discipline and any requirements set by your target journal or funder.
Critically review all AI-generated content before publication: check for factual errors, invented citations, bias, and any inadvertent reproduction of third-party material.
Do not list AI systems as authors or co-authors.

Key references and further reading:

Living Guidelines on the Responsible Use of Generative AI in Research

ALLEA European Code of Conduct for Research Integrity

Guidelines for AI use in EU project deliverables (based on EC recommendations)

EU Grants: How to complete your ethics self-assessment

Assessment List for Trustworthy Artificial Intelligence (ALTAI)

EU Artificial Intelligence Act

Directive 2019/790

General Data Protection Regulation (GDPR)
