Risk Assessments and Measurements of Privacy Leaks within Google's Ads Data Hub

Abstract

In Google’s Ads Data Hub (ADH), advertisers can analyze ad campaign data by combining their own internally collected data with Google’s event-level ad data. ADH applies its own privacy measures, which filter SQL queries and their output so that advertisers obtain only aggregate results, thereby protecting end-user privacy. Even with these protections in place, targeted and untargeted privacy leaks can occur. The goal of this project was to develop methods and algorithms that measure the risk of privacy leaks in Google’s ADH while preserving the utility of ADH for advertisers. To that end, we modified the Special Unique Detection Algorithm (SUDA), which is commonly used in Statistical Disclosure Control. Our new score, PIRATE (the Probabilistic Identification Risk and Attacker Threat Estimate), estimates the likelihood that a record can be uniquely identified in a dynamic dataset. We applied PIRATE to simulated ADH data and successfully calculated a risk score for each row.
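
For readers unfamiliar with SUDA, the idea it builds on can be sketched briefly. SUDA looks for minimal sample uniques (MSUs): sets of attributes on which a record is unique in the dataset, such that no proper subset of those attributes is already unique. The Python sketch below illustrates only that classical notion; it is not the PIRATE score and not the modified algorithm developed in this project. The function name, the column names, and the four example rows are all hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def minimal_sample_uniques(rows, max_size=None):
    """Map each row index to its minimal sample uniques (MSUs).

    rows is a list of dicts sharing the same keys. An MSU is a tuple of
    column names on which the row is unique, with no proper subset that
    is also unique. This is a brute-force illustration, not an
    optimized SUDA implementation.
    """
    if not rows:
        return {}
    columns = sorted(rows[0])
    if max_size is None:
        max_size = len(columns)
    msus = {i: [] for i in range(len(rows))}

    # Visit attribute subsets by increasing size so that every smaller
    # MSU is known before a larger candidate is tested for minimality.
    for size in range(1, max_size + 1):
        for subset in combinations(columns, size):
            counts = {}
            for row in rows:
                key = tuple(row[c] for c in subset)
                counts[key] = counts.get(key, 0) + 1
            for i, row in enumerate(rows):
                key = tuple(row[c] for c in subset)
                if counts[key] != 1:
                    continue  # not unique on this subset
                if any(set(m) < set(subset) for m in msus[i]):
                    continue  # a smaller MSU already covers it
                msus[i].append(subset)
    return msus


# Hypothetical, hand-made rows (not ADH data).
example_rows = [
    {"age": "18-24", "region": "US", "device": "mobile"},
    {"age": "18-24", "region": "US", "device": "desktop"},
    {"age": "25-34", "region": "US", "device": "mobile"},
    {"age": "25-34", "region": "CA", "device": "mobile"},
]
example_msus = minimal_sample_uniques(example_rows)
```

In this toy table, the second row is unique on `device` alone and the fourth on `region` alone, while the first and third rows each need a pair of attributes. Rows with smaller MSUs are, informally, easier to single out, which is the intuition a per-row risk score builds on.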

Report available on request.