The General Data Protection Regulation (GDPR)

The General Data Protection Regulation (GDPR), in German Datenschutz-Grundverordnung (DSGVO), is the central legal framework for the protection of personal data in the European Union. It entered into force in 2016, has applied since 25 May 2018, and covers anyone who processes personal data of individuals in the EU - including researchers and universities, regardless of where they are based (Jarmul 2023).

As a researcher, you almost certainly process personal data, and the GDPR applies to you. Understanding its core concepts helps you navigate ethics board requirements, data management plans, and - most importantly for this tutorial - why and when you need to anonymize data.

This chapter gives you a brief, research-focused overview. We are not going to cover everything the GDPR has to say, but we will focus on the parts that matter most for your work.

Key Concepts for Researchers

The GDPR defines several concepts that come up regularly in research. Let’s go through the most important ones.

Note: Personal Data

Any information relating to an identified or identifiable natural person. This includes obvious identifiers like names and email addresses, but also indirect identifiers like age, postal code, or job title - if they can be combined to single out an individual. See the next chapter for a detailed explanation.

In research: Most survey and experimental data contain personal data, even if you never ask for a participant’s name. A combination of demographic variables (age, gender, occupation, location) can be enough.
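To see how a combination of demographic variables can single someone out, here is a minimal sketch in Python. The toy records, the field names, and the helper `count_unique_combinations` are hypothetical and only serve as illustration; a real assessment would use your own variables and a more careful risk analysis.

```python
from collections import Counter

# Hypothetical toy survey data: no names, only demographic variables.
participants = [
    {"age": 34, "gender": "f", "occupation": "nurse", "postal_code": "80331"},
    {"age": 34, "gender": "f", "occupation": "nurse", "postal_code": "80331"},
    {"age": 52, "gender": "m", "occupation": "teacher", "postal_code": "80331"},
    {"age": 29, "gender": "f", "occupation": "pilot", "postal_code": "80539"},
]

# Assumption: these are the indirect identifiers we want to check.
QUASI_IDENTIFIERS = ("age", "gender", "occupation", "postal_code")


def count_unique_combinations(rows, keys):
    """Count rows whose combination of `keys` is shared by no other row."""
    combos = Counter(tuple(row[k] for k in keys) for row in rows)
    return sum(1 for count in combos.values() if count == 1)


unique_rows = count_unique_combinations(participants, QUASI_IDENTIFIERS)
print(f"{unique_rows} of {len(participants)} participants are unique "
      f"on {QUASI_IDENTIFIERS}")
```

Every participant counted here could, in principle, be singled out by someone who knows those few attributes about them, even though the dataset contains no names.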

Note: Processing

Any operation performed on personal data, whether automated or manual. This includes collecting, recording, organizing, storing, analyzing, sharing, and deleting data.

In research: Essentially everything you do with your data - from the moment a participant fills in a survey to the moment you publish or archive the dataset - counts as processing.

Note: Controller

The person or organization that determines the purposes and means of processing personal data.

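In research: This is typically the research institution, with the principal investigator (PI) acting on its behalf; researchers who work independently of an institution can be controllers themselves. If you design a study and decide what data to collect and why, you are exercising the controller’s role - and you carry much of the practical responsibility for data protection.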

Note: Processor

A person or organization that processes personal data on behalf of the controller.

In research: If you use a third-party survey platform (like Qualtrics or SoSci Survey), a cloud storage provider, or a transcription service, these act as processors. You remain responsible for ensuring they handle data in compliance with the GDPR, which normally requires a data processing agreement with the provider (Art. 28).

What the GDPR Means for Your Research

There are a few GDPR principles and provisions that are particularly relevant to researchers:

Lawful basis for processing. You need a legal basis to process personal data. In research, this is usually either informed consent (Art. 6(1)(a)), a task carried out in the public interest (Art. 6(1)(e)), or legitimate interests (Art. 6(1)(f)). Consent must be freely given, specific, informed, and unambiguous (Art. 4(11), Art. 7). Independently of the legal basis, the GDPR sets detailed requirements for the information participants must receive (Art. 13/14) - including the purpose of processing, who will have access, how long data will be stored, and what rights participants have. We discuss informed consent further in the chapter on mechanisms of data protection.

Purpose limitation. Data may only be collected for specified, explicit purposes. In research, this means you should be clear about what your data will be used for - and if you want to re-use data for a new purpose, you may need to check whether this is covered by the original consent.

Data minimization. You should only collect data that is necessary for your research purpose. Collecting “nice-to-have” demographics without a clear reason is not just bad practice - it may also be a GDPR issue. We discuss this more in the chapter on mechanisms of data protection.

Storage limitation. Personal data should not be kept longer than necessary (Art. 5(1)(e)). For research, there are exceptions - data may be stored longer for archiving purposes in the public interest, scientific research, or statistical purposes (Art. 89(1)) - but this requires appropriate safeguards, such as pseudonymization or, where the research purpose allows it, anonymization.

The research exemption (Art. 89). The GDPR acknowledges the importance of scientific research and provides some flexibility. For example, further processing of personal data for research purposes is not considered incompatible with the original purpose of collection (Art. 5(1)(b)). However, this exemption requires “appropriate safeguards” - the GDPR names data minimization and pseudonymization, and it expects data to be anonymized wherever the research purpose can still be achieved that way.

The GDPR allows EU member states to adopt more specific rules in certain areas. Germany has done this through its Federal Data Protection Act (BDSG), which adds provisions for research (§ 27 BDSG). For example, it allows processing of special categories of personal data (like health data) for scientific research without explicit consent, provided that appropriate safeguards are in place and the research interest substantially outweighs the interests of the data subject.

Bavaria, as a German state, has its own Bavarian Data Protection Act (BayDSG), which applies to public institutions - including universities like LMU. In practice, this means that researchers at Bavarian universities work under the GDPR plus German data protection law at the federal (BDSG) and state (BayDSG) level; which national provisions apply in a given case depends on the institution and the kind of processing. The good news: the core principles (purpose limitation, data minimization, storage limitation) are consistent across all three. If in doubt, your institution’s data protection officer can advise you on any Bavaria-specific requirements.

Important: When Does the GDPR Stop Applying?

The GDPR applies to personal data. If data is truly anonymized - meaning individuals can no longer be identified, directly or indirectly - the GDPR no longer applies to that data (Recital 26). This is one of the main reasons why anonymization is so valuable for open data: it allows you to share data freely without the legal constraints of the GDPR.

But be careful: pseudonymized data (e.g., replacing names with codes while keeping a key file) is still personal data under the GDPR, because anyone with access to the key can re-identify the participants. Recital 26 states this explicitly. We cover the distinction between anonymization and pseudonymization in a later chapter.
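The "codes plus key file" idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a recommended production procedure: the function `pseudonymize`, the field names, and the code format (`P` followed by random hex digits) are all hypothetical.

```python
import secrets


def pseudonymize(records, id_field="name"):
    """Replace `id_field` with a random code.

    Returns the pseudonymized records and the key table that maps each
    real identifier to its code. The key table must be stored separately
    and securely; as long as it exists, the data remain personal data.
    """
    key_table = {}  # real identifier -> participant code
    output = []
    for record in records:
        identifier = record[id_field]
        if identifier not in key_table:
            code = "P" + secrets.token_hex(4)
            # Extremely unlikely, but make sure no code is issued twice.
            while code in key_table.values():
                code = "P" + secrets.token_hex(4)
            key_table[identifier] = code
        new_record = {k: v for k, v in record.items() if k != id_field}
        new_record["participant_code"] = key_table[identifier]
        output.append(new_record)
    return output, key_table


# Hypothetical example data.
records = [
    {"name": "Alice Example", "score": 12},
    {"name": "Bob Example", "score": 17},
    {"name": "Alice Example", "score": 15},
]
pseudonymized, key_table = pseudonymize(records)
print(pseudonymized)
```

Note that destroying the key table does not by itself make the data anonymous: if the remaining variables still allow individuals to be singled out, the dataset is still personal data.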

Participant Rights

The GDPR grants individuals (including your research participants) several rights regarding their data. The most relevant ones for research are:

  • Right to be informed (Art. 13/14): Participants must be told how their data is used - this is typically covered in your participant information sheet and consent form.
  • Right of access (Art. 15): Participants can request to see what data you hold about them.
  • Right to erasure (Art. 17): Participants can request that their data be deleted - though an exception exists where deletion would make the research impossible or seriously impair it (Art. 17(3)(d)).
  • Right to withdraw consent (Art. 7(3)): Participants can withdraw their consent at any time, and you must be able to honor that request (which is easier if your data is well-organized and pseudonymized, so you can find and remove specific records). Withdrawal does not make the processing that took place before it unlawful.
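Honoring a withdrawal or erasure request in a pseudonymized dataset can look roughly like the following Python sketch. It assumes a key table that maps names to participant codes; the function `withdraw_participant`, the field `participant_code`, and the example data are hypothetical.

```python
def withdraw_participant(name, records, key_table):
    """Remove all records belonging to `name`.

    Looks up the participant's code in `key_table`, deletes that entry
    from the key table (in place), and returns the remaining records
    together with the number of records removed.
    """
    code = key_table.pop(name, None)
    if code is None:
        # Unknown participant: nothing to remove.
        return records, 0
    remaining = [r for r in records if r["participant_code"] != code]
    return remaining, len(records) - len(remaining)


# Hypothetical pseudonymized data and key table.
records = [
    {"participant_code": "P01", "score": 12},
    {"participant_code": "P02", "score": 17},
    {"participant_code": "P01", "score": 15},
]
key_table = {"Alice Example": "P01", "Bob Example": "P02"}

remaining, removed = withdraw_participant("Alice Example", records, key_table)
print(f"Removed {removed} record(s); {len(remaining)} remain.")
```

In a real project you would also need to remove the participant's data from backups, derived files, and any copies held by processors.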

Practical Tip

If you plan to anonymize and share your data, it is good practice to inform participants about this in your consent form. Once data is truly anonymized, an individual participant’s records can no longer be identified, so they cannot be selectively deleted - removing someone’s data after anonymization is no longer possible. Being transparent about this upfront is both ethically and legally important.

Learning Objective

  • After completing this part of the tutorial, you will have an overview of your legal obligations under the GDPR when collecting and processing personal data.

Exercises

(none)

References

Jarmul, Katharine. 2023. Practical Data Privacy. 1st ed. O’Reilly Media.