Welcome to the Data Anonymization Course!
This website is a work in progress. Please come back later in 2026 :)
Tutorial Overview
This self-paced tutorial on anonymization of quantitative research data is intended to take about three hours to complete.
The tutorial is split into the following sections:
- FOUNDATIONS OF DATA PROTECTION covers the ethical and legal basics of data protection, mechanisms of data protection in research, and key terms.
- DATA ANONYMIZATION PROCESS walks you through the process of anonymizing your research data based on example data.
- BALANCING DATA PROTECTION AND OPENNESS presents methods for aligning your data protection and open science interests.
- ANONYMIZATION WORKFLOW closes the tutorial by summarizing the workflow you have learned.
What You’ll Learn
By the end of this tutorial, you will be able to:
- Understand key concepts in the world of privacy (e.g., anonymization, k-anonymity)
- Classify data into categories relevant to data protection (e.g., personal data, sensitive data)
- Apply anonymization techniques using R in a coherent workflow
- Make informed decisions when balancing the risks and utility of the anonymized data
What You Will NOT Learn
This tutorial covers only the anonymization of quantitative data.
Here are a few helpful links for other data types:
- Anonymizing neuroimaging metadata (recommendations)
- Anonymizing qualitative textual data (video tutorial for QualiAnon)
- Anonymizing sensitive qualitative data (lecture and tutorial)
Prerequisites
- You need basic R skills (e.g., loading data and packages). Experience with data wrangling with tidyverse is beneficial.
- Necessary software: R and RStudio