Statistical Disclosure Control for Microdata: Methods and Applications in R
Preface -- Overview of the Book -- Acknowledgements -- Contents -- Acronyms
1 Software -- 1.1 Prerequisites -- 1.1.1 Installation and Updates -- 1.1.2 Install sdcMicro and Its Browser-Based Point-and-Click App -- 1.1.3 Updating the SDC Tools -- 1.1.4 Help -- 1.1.5 The R Workspace and the Working Directory -- 1.1.6 Data Types -- 1.1.7 Generic Functions, Methods and Classes -- 1.2 Brief Overview on SDC Software Tools -- 1.3 Differences Between SDC Tools -- 1.4 Working with sdcMicro -- 1.4.1 General Information About sdcMicro -- 1.4.2 S4 Class Structure of the sdcMicro Package -- 1.4.3 Utility Functions -- 1.4.4 Reporting Facilities -- 1.5 The Point-and-Click App sdcApp -- 1.6 The simPop Package -- References
2 Basic Concepts -- 2.1 Types of Variables -- 2.1.1 Non-confidential Variables -- 2.1.2 Identifying Variables -- 2.1.3 Sensitive Variables -- 2.1.4 Linked Variables -- 2.1.5 Sampling Weights -- 2.1.6 Hierarchies, Clusters and Strata -- 2.1.7 Categorical Versus Continuous Variables -- 2.2 Types of Disclosure -- 2.2.1 Identity Disclosure -- 2.2.2 Attribute Disclosure -- 2.2.3 Inferential Disclosure -- 2.3 Disclosure Risk Versus Information Loss and Data Utility -- 2.4 Release Types -- 2.4.1 Public Use Files (PUF) -- 2.4.2 Scientific Use Files (SUF) -- 2.4.3 Controlled Research Data Center -- 2.4.4 Remote Execution -- 2.4.5 Remote Access -- References
3 Disclosure Risk -- 3.1 Introduction -- 3.2 Frequency Counts -- 3.2.1 The Number of Cells of Equal Size -- 3.2.2 Frequency Counts with Missing Values -- 3.2.3 Sample Frequencies in sdcMicro -- 3.3 Principles of k-anonymity and l-diversity -- 3.3.1 Simplified Estimation of Population Frequency Counts -- 3.4 Special Uniques Detection Algorithm (SUDA) -- 3.4.1 Minimal Sample Uniqueness -- 3.4.2 SUDA Scores -- 3.4.3 SUDA DIS Scores -- 3.4.4 SUDA in sdcMicro -- 3.5 The Individual Risk Approach