NIST is crowdsourcing differential privacy techniques for safety datasets
The National Institute of Standards and Technology (NIST) is launching the Differential Privacy Temporal Map Challenge. It’s a set of contests, with cash prizes attached, intended to crowdsource new ways of handling personally identifiable information (PII) in public safety datasets.
The problem is that although rich, detailed data is valuable to researchers and for building AI models (in this case, in the areas of emergency planning and epidemiology), using it raises serious and potentially dangerous data privacy and rights issues. Even if datasets are kept under a proverbial lock and key, malicious actors can infer sensitive information about people from just a few data points.
The solution is to de-identify the data such that it remains useful without compromising individuals’ privacy. NIST already has a clear standard for what that means. In part, it says, “De-identification removes identifying information from a dataset so that individual data cannot be linked with specific individuals.”
Specifically, the challenge focuses on temporal map data, which combines temporal and spatial information. The call for the NIST contest states, “Public safety agencies collect extensive data containing time, geographic, and potentially personally identifiable information.” For example, a record of a 911 call could reveal a person’s name, age, gender, address, symptoms or situation, and more. The NIST announcement notes that “Temporal map data is of particular interest to the public safety community.”
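To make the idea concrete, here is a minimal sketch of one textbook way to release temporal map counts with differential privacy: count incidents per (month, neighborhood) cell, then add Laplace noise to every cell. This is not a challenge entry or anything NIST prescribes. The records, cell names, the `epsilon` value, and the assumption that each person contributes at most one record are all hypothetical, chosen only for illustration.

```python
import random
from collections import Counter


def temporal_map_counts(records):
    """Count incidents per (month, neighborhood) cell."""
    return Counter((month, area) for month, area in records)


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential draws with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(counts, cells, epsilon, rng):
    """Add Laplace noise to every cell, including empty ones.

    Assumption: each person contributes at most one record, so adding
    or removing a person changes one cell by 1 (sensitivity 1) and the
    noise scale is 1 / epsilon.
    """
    scale = 1.0 / epsilon
    return {cell: counts.get(cell, 0) + laplace_noise(scale, rng) for cell in cells}


# Hypothetical incident records: (month, neighborhood).
records = [
    ("2020-01", "North"),
    ("2020-01", "North"),
    ("2020-01", "South"),
    ("2020-02", "North"),
]
months = ["2020-01", "2020-02"]
areas = ["North", "South"]
# The full grid is fixed in advance so that empty cells are noised too.
cells = [(m, a) for m in months for a in areas]

rng = random.Random(0)  # seeded only to make the demo repeatable
counts = temporal_map_counts(records)
noisy = privatize(counts, cells, epsilon=1.0, rng=rng)

for cell in cells:
    print(cell, counts.get(cell, 0), round(noisy[cell], 2))
```

A smaller `epsilon` means more noise and stronger privacy, at the cost of less accurate counts. Real public safety data breaks the one-record-per-person assumption, and a production system would also need a noise sampler hardened against floating-point attacks, so this should be read as a toy illustration only.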
The Differential Privacy Temporal Map Challenge builds on two previous NIST differential privacy challenges: one centered on synthetic data and one aimed at developing the technique more generally.
NIST is offering a total of $276,000 in prize money across three categories. The Better Meter Stick category will award a total of $29,000 to entries that measure the quality of differentially private algorithms. A total of $147,000 is available for those who come up with the best balance of data utility and privacy preservation. And the part of the contest that rewards the usability of source code for open source endeavors has a $100,000 pot.
The challenge is accepting submissions now through January 5, 2021. Non-federal agency partners include DrivenData, HeroX, and Knexus Research. Winners will be announced February 4, 2021.