Saturday, April 27, 2024

Introducing BigQuery differential privacy and partnership with Tumult Labs

We are excited to announce the public preview of BigQuery differential privacy, an SQL building block that analysts and data scientists can use to anonymize their data. In the future, we’ll integrate differential privacy with BigQuery data clean rooms to help organizations anonymize and share sensitive data, all while preserving privacy. 

This launch adds differential privacy to GoogleSQL for BigQuery, building on the open-source differential privacy library that is used by Ads Data Hub and the COVID-19 Community Mobility Reports. Google’s research in differentially private SQL was published in a 2019 paper and was recognized with the Future of Privacy Forum’s 2021 Award for Research Data Stewardship.

We’re also excited to announce our partnership with Tumult Labs, a leader in differential privacy for companies and government agencies. Tumult Labs offers technology and professional services to help Google Cloud customers with their differential privacy implementations. Learn more about how Tumult Labs can help you below.

What is differential privacy?

Differential privacy is an anonymization technique that limits the personal information that an output reveals. It is commonly used to support statistical inference and data sharing while preventing anyone from learning information about an individual entity in the dataset.
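For readers who want the formal statement behind this description, the standard definition from the differential privacy literature can be written as follows. The symbols M (the anonymizing mechanism), D and D' (two datasets that differ in one entity's data), and S (a set of possible outputs) are notation introduced here for illustration. The ε and δ in the formula are the quantities usually exposed as the epsilon and delta privacy parameters.

```latex
% (\varepsilon, \delta)-differential privacy: for every pair of datasets D, D'
% that differ in the data of a single entity, and every set of outputs S,
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta
```

Smaller values of ε and δ mean that the output distribution changes less when one entity's data is added or removed, which gives stronger privacy at the cost of noisier results.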

Advertising, financial services, healthcare, and education companies use differential privacy to perform analysis without exposing individual records. Differential privacy is also used by public sector organizations like the U.S. Census and by companies that comply with the General Data Protection Regulation (GDPR), the Health Insurance Portability and Accountability Act (HIPAA), the Family Educational Rights and Privacy Act (FERPA), and the California Consumer Privacy Act (CCPA).

What can I do with BigQuery differential privacy?

With BigQuery differential privacy, you can:

Anonymize results with individual-record privacy

Anonymize results without copying or moving your data, including data from AWS and Azure with BigQuery Omni

Anonymize results that are sent to Dataform pipelines so that they can be consumed by other applications

Anonymize results that are sent to Apache Spark stored procedures

Use additional differential privacy features by calling external frameworks and platforms like PipelineDP.io and Tumult Analytics 

[Coming soon] Use differential privacy with authorized views and authorized routines

[Coming soon] Share anonymized data with BigQuery Data Clean Rooms 

BigQuery differential privacy also works with your existing security controls so you can:

Anonymize results while using row- and column-level security, dynamic data masking, and column-level encryption

Prevent sensitive data from being queried without proper permission using Data profiles for BigQuery data

How do I get started?

Differential privacy is now part of GoogleSQL for BigQuery and is available in all editions and the on-demand pricing model.

You can apply differential privacy to the following aggregate functions to anonymize the results:

SUM

COUNT

AVG

PERCENTILE_CONT

Here is a sample differential privacy query on a BigQuery public dataset that computes the 50th and 90th percentiles of the number of unique Medicare beneficiaries (bene_unique_cnt) by provider type. The query uses the physician identifier (npi) as the privacy unit column, so the anonymized percentile results protect the privacy of individual physicians.

Note: The parameters in DIFFERENTIAL_PRIVACY OPTIONS in this sample query are not recommendations. You can learn more about how privacy parameters work in the differential privacy clause and can work with your privacy officer or with a Google partner to determine the optimal privacy parameters for your dataset and organization.

```sql
SELECT
WITH
  DIFFERENTIAL_PRIVACY
    OPTIONS (
      epsilon = 1,
      delta = 1e-7,
      privacy_unit_column = npi)
  provider_type,
  PERCENTILE_CONT(
    bene_unique_cnt, 0.5, contribution_bounds_per_row => (0, 10000))
    percentile_50th,
  PERCENTILE_CONT(
    bene_unique_cnt, 0.9, contribution_bounds_per_row => (0, 10000))
    percentile_90th
FROM `bigquery-public-data.cms_medicare.physicians_and_other_supplier_2015`
WHERE provider_type IS NOT NULL
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10;

-- Query results may differ slightly with each run due to noise being applied
/*--------------------------------------+-----------------+-----------------*
 | provider_type                        | percentile_50th | percentile_90th |
 +--------------------------------------+-----------------+-----------------+
 | Peripheral Vascular Disease          | 132.95          | 3134.24         |
 | Ambulance Service Supplier           | 101.81          | 697.79          |
 | Multispecialty Clinic/Group Practice | 75.03           | 2316.40         |
 | Addiction Medicine                   | 68.38           | 3811.18         |
 | Public Health Welfare Agency         | 67.27           | 597.46          |
 | Neuropsychiatry                      | 63.85           | 375.88          |
 | Emergency Medicine                   | 62.86           | 272.00          |
 | Centralized Flu                      | 52.97           | 216.98          |
 | Clinical Laboratory                  | 52.04           | 744.01          |
 | Ophthalmology                        | 49.93           | 282.12          |
 *--------------------------------------+-----------------+-----------------*/
```
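The other aggregate functions listed above follow the same pattern as the sample query. The sketch below is an illustration rather than part of the original announcement: it reuses the same public table and columns (npi, provider_type, bene_unique_cnt) and applies COUNT, SUM, and AVG with the contribution_bounds_per_group argument described in the GoogleSQL documentation for differentially private aggregate functions. The output column names and every parameter value are assumptions chosen for readability, not recommendations.

```sql
-- Illustrative sketch: epsilon, delta, and all bounds are assumed values.
SELECT
WITH
  DIFFERENTIAL_PRIVACY
    OPTIONS (
      epsilon = 1,
      delta = 1e-7,
      privacy_unit_column = npi)
  provider_type,
  -- Noisy number of rows; each physician contributes at most 100 rows per group
  COUNT(*, contribution_bounds_per_group => (0, 100)) AS noisy_row_count,
  -- Noisy total; each physician's summed contribution per group is clamped to [0, 10000]
  SUM(bene_unique_cnt, contribution_bounds_per_group => (0, 10000))
    AS noisy_total_beneficiaries,
  -- Noisy mean, using the same per-group clamping bounds
  AVG(bene_unique_cnt, contribution_bounds_per_group => (0, 10000))
    AS noisy_avg_beneficiaries
FROM `bigquery-public-data.cms_medicare.physicians_and_other_supplier_2015`
WHERE provider_type IS NOT NULL
GROUP BY provider_type;
```

As with the sample query, results differ slightly between runs because noise is applied, and groups with very few contributing physicians may be omitted from the output.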

How can Tumult Labs help me?

Some uses of differential privacy require features like privacy accounting or variants like zero-concentrated differential privacy. Through our partnership with Tumult Labs, you can ensure that your use of BigQuery differential privacy:

Aligns with compliance and regulatory requirements

Is certified to provide end-to-end privacy guarantees

Balances data sharing with privacy risk

Learn more about how Tumult Labs can help you with BigQuery differential privacy here.

Where can I learn more?

Learn more about BigQuery differential privacy at:

Use differential privacy

Differentially private aggregate functions

The differential privacy clause

Let us know where you need help with BigQuery differential privacy.

