Balancing Utility and Privacy
- contradictory ideals that nonetheless enable each other
summarize ideas by Jansen, Borgert, and Elson (2025)
Measuring Privacy
Calculate the privacy level (k-anonymity) of the anonymized data and compare it to the value measured before anonymization
Red teaming data anonymization (Jansen et al. 2026)
Include an exercise on calculating the privacy level with k-anonymity, as in 2_1_Planning Privacy
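A minimal sketch of what such an exercise could look like, assuming k is taken as the size of the smallest group of records that share the same quasi-identifier values. The toy records, the column names (`age`, `zip`, `score`) and the helper `k_anonymity` are hypothetical and only serve as illustration:

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same combination of quasi-identifier values."""
    if not records:
        raise ValueError("records must not be empty")
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values())


# Hypothetical toy data before anonymization: every record is unique.
raw = [
    {"age": 34, "zip": "80331", "score": 12},
    {"age": 35, "zip": "80333", "score": 15},
    {"age": 41, "zip": "80335", "score": 9},
    {"age": 47, "zip": "80337", "score": 11},
]

# The same records after generalising age and masking the zip code.
anonymized = [
    {"age": "30-39", "zip": "803**", "score": 12},
    {"age": "30-39", "zip": "803**", "score": 15},
    {"age": "40-49", "zip": "803**", "score": 9},
    {"age": "40-49", "zip": "803**", "score": 11},
]

k_before = k_anonymity(raw, ["age", "zip"])
k_after = k_anonymity(anonymized, ["age", "zip"])
print(f"k before: {k_before}, k after: {k_after}")
```

In this toy case k rises from 1 to 2, because each generalised age band now contains two records.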
Measuring Utility
Provide an overview of utility measures and possibly try one out
- utility indices (Carvalho et al. 2023)
- predictive performance measures for machine learning (potentially as call-out box)
- information loss measures:
  - distance/distribution comparisons
  - penalties for transformations through generalisation and suppression
  - statistical differences
- assessing utility for the current use case
  - perform statistical analysis before and after anonymization
  - see if results are comparable
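The before/after comparison could be sketched as follows. This is only an illustration under stated assumptions: the `income` values are invented, top-coding at 5000 stands in for whatever anonymization was applied, and `compare_summary` is a hypothetical helper that assumes a non-zero mean in the original data:

```python
from statistics import mean, stdev


def compare_summary(original, anonymized):
    """Return mean and standard deviation before and after anonymization,
    plus the relative change of the mean."""
    mean_before = mean(original)
    mean_after = mean(anonymized)
    return {
        "mean_before": mean_before,
        "mean_after": mean_after,
        "sd_before": stdev(original),
        "sd_after": stdev(anonymized),
        # Assumes mean_before != 0.
        "relative_mean_change": (mean_after - mean_before) / mean_before,
    }


# Hypothetical values; top-coding caps every value at 5000.
income = [2100, 2500, 3200, 4000, 9800]
top_coded = [min(value, 5000) for value in income]

result = compare_summary(income, top_coded)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Here the single outlier drives a drop in the mean of roughly 22 percent, a change that would likely matter for analyses relying on the upper tail. Whether such a difference counts as comparable depends on the intended use case.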
Include an exercise on calculating one information loss measure
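One candidate for this exercise is a simple generalisation and suppression penalty. The sketch below is one possible variant, not a fixed standard: each generalised numeric value is penalised by its interval width divided by the attribute's full range, and a suppressed value counts as full loss. The helper `generalization_penalty`, the age intervals and the domain bounds 20 to 60 are all assumptions made for illustration:

```python
def generalization_penalty(intervals, domain_min, domain_max):
    """Average information loss in [0, 1] for one numeric attribute.

    Each entry is a (low, high) interval, or None for a suppressed value.
    An unchanged value v is written as (v, v) and incurs no penalty.
    """
    domain_width = domain_max - domain_min
    if domain_width <= 0:
        raise ValueError("domain_max must be greater than domain_min")
    if not intervals:
        raise ValueError("intervals must not be empty")
    penalties = [
        1.0 if interval is None else (interval[1] - interval[0]) / domain_width
        for interval in intervals
    ]
    return sum(penalties) / len(penalties)


# Hypothetical ages generalised to 10-year bands; one value suppressed.
ages = [(30, 40), (30, 40), (40, 50), None]
loss = generalization_penalty(ages, domain_min=20, domain_max=60)
print(f"average information loss: {loss:.4f}")
```

A value of 0 means the attribute was left untouched and 1 means it was fully suppressed, so lower values indicate higher retained utility for that attribute.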
Striking the Right Balance
iterative process: rework the anonymization after measuring both privacy and utility
explain the norms for acceptable values of privacy and utility indicators
People to contact at LMU (data stewards? open science team?)
Learning Objective
- After completing this part of the tutorial, you will be able to make informed decisions when balancing the privacy risks and the utility of anonymized data.
Exercises
apply measurement of k-anonymity
apply measurement of utility
Resources, Links, Examples
- examples from the UK: https://ukdataservice.ac.uk/deposit-data/sharing-experiences/
References
Carvalho, Tânia, Nuno Moniz, Pedro Faria, and Luís Antunes. 2023. “Survey on Privacy-Preserving Techniques for Microdata Publication.” ACM Computing Surveys 55 (14s): 1–42. https://doi.org/10.1145/3588765.
Jansen, Luisa, Nele Borgert, and Malte Elson. 2025. “On the Tension Between Open Data and Data Protection in Research.” April 7, 2025. https://doi.org/10.31234/osf.io/5jt3s_v2.
Jansen, Luisa, Tim Ulmann, Robine Jordi, and Malte Elson. 2026. “Putting Privacy to the Test: Introducing Red Teaming for Research Data Anonymization.” arXiv. https://doi.org/10.48550/arXiv.2601.19575.