Data anonymization#
Data anonymization is important because it protects the privacy and confidentiality of individuals whose data is being used.
Data de-identification vs. anonymization#
The degree of data anonymization matters because it determines which sharing regulations apply. There are at least two levels of data anonymization [White et al., 2022]:
Fully anonymized data: All personal identifiers are removed, and a separate identification code is assigned. Any link between the anonymized dataset and the original data is permanently deleted, so records can no longer be traced back to individuals.
De-identified data: Personal information is removed from the dataset, and individuals are assigned a unique identification number. However, a key is retained that allows the de-identified data to be linked back to the original personal data if needed.
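The difference between the two levels comes down to what happens to the key. The following is a minimal sketch, not a recommended tool: the `pseudonymize` function, the `participant_id` field, and the sample records are all hypothetical, and it assumes one record per individual with a single direct identifier (`name`). Real datasets usually need more than dropping one field.

```python
import secrets


def pseudonymize(records, id_field="name"):
    """Replace the identifier field with a random code.

    Returns the coded records and the key (code -> original identifier).
    Hypothetical helper for illustration only.
    """
    key = {}
    coded = []
    for record in records:
        code = secrets.token_hex(4)
        while code in key:  # regenerate on the rare chance of a collision
            code = secrets.token_hex(4)
        key[code] = record[id_field]
        # Keep every field except the direct identifier.
        remaining = {k: v for k, v in record.items() if k != id_field}
        coded.append({"participant_id": code, **remaining})
    return coded, key


# Made-up sample data.
records = [
    {"name": "Alice Example", "age": 34, "score": 0.82},
    {"name": "Bob Example", "age": 41, "score": 0.57},
]

# De-identified: the key is retained (it should be stored separately
# and securely), so records can be linked back if needed.
deidentified, key = pseudonymize(records)

# Fully anonymized: the same coding step, but the key is discarded,
# so the link to the original data no longer exists.
anonymized, discarded_key = pseudonymize(records)
del discarded_key
```

In the de-identified case, `key[deidentified[0]["participant_id"]]` still returns the original name. In the fully anonymized case no such lookup is possible once the key is deleted.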
Recommended tools for data anonymization#
(coming soon!)