Data Quality

Ensuring data quality throughout the research lifecycle

Data quality is essential to producing credible, reliable, and reusable research. Quality control is not a one-time task: it must be embedded in every research stage, from data collection and entry to transcription and validation. Assigning clear roles and responsibilities and developing appropriate procedures before data collection begins are key components of a robust approach to data quality management.

Drawing on guidance from the UK Data Archive, the OECD, and international standards, the following sections describe best practices for quality assurance and control.

1. Quality control during data collection

High-quality data begins at the point of collection. Apply standardized protocols to reduce measurement bias and variation.

Best practices include:

  • Calibrating instruments to ensure accurate measurement of variables (e.g., temperature sensors, laboratory tools)

  • Collecting multiple samples or measurements to increase reliability

  • Cross-checking records with experts or reference data

  • Using standardized observation methods and clear, structured data collection forms

  • Employing computer-assisted data collection tools (e.g., CAPI, CATI), which help:

    • Standardize questionnaires

    • Verify internal consistency

    • Route respondents to the right questions based on their earlier responses

    • Prevent invalid or incomplete entries

Source: UK Data Archive – Quality during data collection
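
As an illustration of what computer-assisted tools do behind the scenes, the sketch below shows routing, internal consistency checks, and rejection of invalid entries in a few lines of Python. It is a minimal sketch, not how any particular CAPI or CATI product works: the question names (age, employed, years_in_job), the accepted ranges, and the helper functions are all hypothetical.

```python
# Hypothetical three-question survey; names, ranges, and rules are illustrative.

def next_question(current, answers):
    """Routing: pick the next question from the answers given so far."""
    if current == "age":
        return "employed"
    if current == "employed" and answers.get("employed") == "yes":
        return "years_in_job"  # only asked of employed respondents
    return None  # end of questionnaire


def check_answer(question, value, answers):
    """Return a list of problems; an empty list means the answer is accepted."""
    if value in (None, ""):
        return [f"{question}: an answer is required"]
    problems = []
    if question == "age" and not (isinstance(value, int) and 16 <= value <= 110):
        problems.append("age: expected a whole number between 16 and 110")
    if question == "employed" and value not in ("yes", "no"):
        problems.append("employed: expected 'yes' or 'no'")
    if question == "years_in_job":
        if not (isinstance(value, int) and value >= 0):
            problems.append("years_in_job: expected a non-negative whole number")
        elif value > answers["age"]:
            # Internal consistency: compare against an earlier answer.
            problems.append("years_in_job: cannot exceed the respondent's age")
    return problems


def run_interview(responses):
    """Replay scripted responses through the routing and the checks."""
    answers, problems, question = {}, [], "age"
    while question is not None:
        found = check_answer(question, responses.get(question), answers)
        if found:
            problems.extend(found)
            break  # a real tool would re-prompt instead of stopping
        answers[question] = responses[question]
        question = next_question(question, answers)
    return answers, problems
```

Because the checks run at the moment of entry, an inconsistent answer never reaches the dataset, and questions that do not apply are never asked.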

2. Quality control during digitization, entry, and coding

Whether you’re converting paper records to digital, entering survey responses, or coding variables, systematic input processes are essential to prevent errors.

Recommended measures:

  • Validation rules or input masks in entry software (e.g., restrict date formats, numerical ranges)

  • Structured data entry interfaces (e.g., forms with dropdowns, required fields)

  • Controlled vocabularies and standard code lists to minimize inconsistencies

  • Detailed variable and record naming to avoid ambiguity

  • Well-designed database schemas that reflect the logic of the dataset

Tip: Tools like REDCap, Qualtrics, and OpenClinica offer built-in validation and structured input features.

See also: Data Entry Quality Guide – UKDA
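
The measures above (required fields, date formats, numerical ranges, and code lists) can be sketched as a small record validator. This is a minimal example under stated assumptions: the field names, the site code list, and the temperature range are hypothetical, and values are assumed to arrive as text typed into an entry form.

```python
import re
from datetime import datetime

# Hypothetical code list and field names, for illustration only.
SITE_CODES = {"BRU", "LIE", "NAM"}
REQUIRED_FIELDS = ("participant_id", "visit_date", "site", "temperature_c")
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def validate_record(record):
    """Return a list of error messages; an empty list means the record passes."""
    errors = []

    # Required fields: reject missing or blank values.
    for name in REQUIRED_FIELDS:
        if record.get(name, "").strip() == "":
            errors.append(f"{name}: required field is empty")

    # Input mask: dates must be written YYYY-MM-DD and be real calendar dates.
    visit_date = record.get("visit_date", "").strip()
    if visit_date:
        try:
            if not ISO_DATE.fullmatch(visit_date):
                raise ValueError("wrong layout")
            datetime.strptime(visit_date, "%Y-%m-%d")
        except ValueError:
            errors.append("visit_date: expected a valid date as YYYY-MM-DD")

    # Controlled vocabulary: only codes from the agreed list are accepted.
    site = record.get("site", "").strip()
    if site and site not in SITE_CODES:
        errors.append(f"site: '{site}' is not in the code list")

    # Numerical range: assumed plausible range for a body temperature in Celsius.
    temperature = record.get("temperature_c", "").strip()
    if temperature:
        try:
            value = float(temperature)
        except ValueError:
            errors.append("temperature_c: expected a number")
        else:
            if not 30.0 <= value <= 45.0:
                errors.append("temperature_c: outside the range 30.0 to 45.0")

    return errors
```

Returning every problem at once, rather than stopping at the first, lets the person entering the data correct a record in a single pass.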

3. Transcription quality in qualitative research

Transcribing qualitative data (e.g., interviews or focus groups) converts rich verbal data into a format that can be analyzed, shared, and reused. Transcription is both a technical and interpretive task and must be carried out with methodological rigour.

Key considerations:

  • Security when outsourcing: Encrypt files, define protocols, and sign non-disclosure agreements

  • Consistent transcription style: Provide clear formatting guidelines for spelling, pauses, non-verbal cues, etc.

  • Compatibility: Use templates that work with your analysis software (e.g., NVivo, MAXQDA, ATLAS.ti)

  • Anonymization: Remove or mark personal/sensitive information for redaction
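
Marking personal information for redaction can be partly automated as a first pass. The sketch below assumes you keep a pseudonym key for known names and want e-mail addresses flagged; the names, tags, and the simple e-mail pattern are hypothetical, and a script like this supports manual review rather than replacing it.

```python
import re

# Hypothetical pseudonym key; in practice, store it separately and securely.
PSEUDONYMS = {"Marie Dupont": "P01", "Dr Lambert": "PERSON_A"}

# Deliberately simple pattern; it will not catch every possible address.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def mark_for_redaction(text, pseudonyms=PSEUDONYMS):
    """Replace known names with pseudonym tags and flag e-mail addresses."""
    # Longest names first, so a short name never splits a longer one.
    for name in sorted(pseudonyms, key=len, reverse=True):
        pattern = rf"\b{re.escape(name)}\b"
        text = re.sub(pattern, f"[{pseudonyms[name]}]", text, flags=re.IGNORECASE)
    return EMAIL.sub("[REDACTED: email]", text)
```

Square-bracket tags keep the replacements visible, so a reviewer can see where the text was changed and check each one against the recording.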

Transcripts should:

  • Be clearly labeled (e.g., Interview_2025-06-15_P01)

  • Use a consistent layout (e.g., paragraph style, line breaks between speakers)

  • Include speaker tags and page numbers

  • Begin with a header or cover page indicating metadata: interviewee pseudonym, interviewer, date, location

Source: UKDA – Transcription Guidance
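
Labels and headers are easier to keep consistent when they are generated rather than typed by hand. The sketch below builds and checks labels in the Interview_2025-06-15_P01 pattern shown above and produces a plain-text header. The function names and the exact header wording are assumptions for illustration, not a prescribed template.

```python
import re
from datetime import date

# Label pattern from the example above: Interview_<YYYY-MM-DD>_P<two digits>.
LABEL = re.compile(r"Interview_(\d{4}-\d{2}-\d{2})_(P\d{2})")


def transcript_label(interview_date, participant_number):
    """Build a label such as Interview_2025-06-15_P01."""
    return f"Interview_{interview_date.isoformat()}_P{participant_number:02d}"


def parse_label(label):
    """Return (date, participant code); raise ValueError for a malformed label."""
    match = LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"label does not follow the naming pattern: {label!r}")
    return date.fromisoformat(match.group(1)), match.group(2)


def transcript_header(label, pseudonym, interviewer, location):
    """Build the metadata header placed at the top of a transcript."""
    interview_date, _ = parse_label(label)
    return "\n".join([
        f"Transcript: {label}",
        f"Interviewee (pseudonym): {pseudonym}",
        f"Interviewer: {interviewer}",
        f"Date: {interview_date.isoformat()}",
        f"Location: {location}",
    ])
```

Running parse_label over every file name in a project folder is a quick way to find transcripts that drifted from the agreed convention.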

4. Data checking, validation, and cleaning

After data entry, systematic verification and cleaning are required to detect and correct errors.

Common practices include:

  • Checking for out-of-range values or coding inconsistencies

  • Verifying completeness (e.g., no missing mandatory fields)

  • Random spot checks: Compare samples of digital entries against the original source

  • Double data entry (for sensitive or high-stakes data)

  • Statistical validation: Use summary statistics (e.g., means, frequencies, outlier detection) to identify anomalies

  • Peer review of data entries or codebooks by another researcher
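
Two of the practices above, double data entry and random spot checks, lend themselves to a short script. The sketch below assumes each keying pass produces a list of records (dictionaries) that share an identifier field called id; that field name, the function names, and the fixed random seed are illustrative assumptions.

```python
import random


def compare_entries(first, second, key="id"):
    """Compare two independent keying passes of the same source records.

    Returns tuples of (record id, field, value in first, value in second).
    """
    first_by_id = {row[key]: row for row in first}
    second_by_id = {row[key]: row for row in second}
    discrepancies = []
    for record_id in sorted(set(first_by_id) | set(second_by_id)):
        a, b = first_by_id.get(record_id), second_by_id.get(record_id)
        if a is None or b is None:
            # The record was keyed in only one of the two passes.
            discrepancies.append(
                (record_id, "<whole record>", a is not None, b is not None)
            )
            continue
        for field_name in sorted(set(a) | set(b)):
            if a.get(field_name) != b.get(field_name):
                discrepancies.append(
                    (record_id, field_name, a.get(field_name), b.get(field_name))
                )
    return discrepancies


def spot_check_sample(record_ids, n, seed=0):
    """Draw a reproducible random sample of records to check against the source."""
    ids = list(record_ids)
    return sorted(random.Random(seed).sample(ids, min(n, len(ids))))
```

Each discrepancy should be resolved by going back to the original source rather than by choosing one of the two keyed values. Recording the seed makes it possible to show later which records were checked.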

Tools such as R, Stata, and SPSS let you script data cleaning and outlier detection, so that the same checks can be rerun and documented.
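
To show what a scripted statistical check can look like, the sketch below uses only the Python standard library; the choice of Python, the interquartile-range rule with a multiplier of 1.5, and the function names are assumptions made for illustration, and the same logic can be written in any of the tools named above.

```python
from collections import Counter
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Flag values further than k interquartile ranges outside the quartiles.

    Needs at least two values; statistics.quantiles raises an error otherwise.
    """
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def unexpected_codes(codes, allowed):
    """Count the occurrences of codes that are not in the agreed code list."""
    counts = Counter(codes)
    return {code: n for code, n in counts.items() if code not in allowed}


# Example: one temperature was keyed without its decimal point.
temperatures = [36.5, 36.7, 36.8, 36.9, 37.0, 37.1, 37.2, 368.0]
print(iqr_outliers(temperatures))
print(unexpected_codes(["yes", "no", "yes", "Yes"], {"yes", "no"}))
```

A flagged value is a prompt to check the original source, not proof of an error: genuine extreme observations should be kept and documented.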

5. Data quality depends on context

Not all data types or fields require the same procedures. Data quality standards should be tailored to:

  • The nature of your dataset (quantitative, qualitative, observational, experimental)

  • Your disciplinary standards (e.g., clinical, ethnographic, ecological)

  • Compliance with legal, ethical, or regulatory requirements

Check for existing guidelines or SOPs at your institution, lab, or discipline. For example:

  • Social Sciences: CESSDA Data Management Expert Guide

  • Environmental Monitoring: EPA QA/QC Handbook


