How to secure a BigQuery data warehouse that stores confidential data
This document is intended for data engineers and security administrators who deploy and secure data warehouses using BigQuery. It’s part of a security blueprint that’s made up of the following:
- A GitHub repository that contains a set of Terraform configurations and scripts. The Terraform configuration sets up an environment in Google Cloud that supports a data warehouse that stores confidential data.
- A guide to the architecture, design, and security controls that the blueprint implements (this document).
This document discusses the following:
- The architecture and Google Cloud services that you can use to help secure a data warehouse in a production environment.
- Best practices for data governance when creating, deploying, and operating a data warehouse in Google Cloud, including data de-identification, differential handling of confidential data, and column-level access controls.
This document assumes that you have already configured a foundational set of security controls as described in the Google Cloud security foundations. It helps you to layer additional controls onto your existing security controls to help protect confidential data in a data warehouse.
Architecture
To create a confidential data warehouse, you need to categorize data as confidential and non-confidential, and then store the data in separate perimeters. The following image shows how ingested data is categorized, de-identified, and stored. It also shows how you can re-identify confidential data on demand for analysis.
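The categorize, de-identify, and re-identify flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CONFIDENTIAL_FIELDS` set, the `TokenVault` class, and the sample record are hypothetical, and the in-memory vault stands in for whatever managed de-identification service and separate perimeter the real deployment uses.

```python
import hashlib
import hmac
import secrets

# Hypothetical field classification; a real deployment would derive this
# from a data catalog or an inspection service, not a hard-coded set.
CONFIDENTIAL_FIELDS = {"name", "card_number"}


class TokenVault:
    """Toy stand-in for a de-identification service that supports re-identification."""

    def __init__(self, key: bytes):
        self._key = key
        # token -> original value; conceptually this lives in the confidential perimeter
        self._originals = {}

    def deidentify(self, value: str) -> str:
        # A keyed hash yields the same token for the same input, so joins still work.
        token = hmac.new(self._key, value.encode(), hashlib.sha256).hexdigest()[:16]
        self._originals[token] = value
        return token

    def reidentify(self, token: str) -> str:
        # Only callers with access to the vault can recover the original value.
        return self._originals[token]


def deidentify_record(record: dict, vault: TokenVault) -> dict:
    """Replace confidential fields with tokens; leave other fields unchanged."""
    return {
        field: vault.deidentify(str(value)) if field in CONFIDENTIAL_FIELDS else value
        for field, value in record.items()
    }


vault = TokenVault(secrets.token_bytes(32))
row = {"name": "Ada", "card_number": "4111111111111111", "country": "UK"}
safe_row = deidentify_record(row, vault)
```

Here `safe_row` is what would land in the non-confidential store, while the vault (and its key) stays behind the confidential boundary. Calling `vault.reidentify(safe_row["name"])` models re-identification on demand for an authorized analyst.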
Organization structure
You group your organization’s resources so that you can manage them and separate your testing environments from your production environment. Resource Manager lets you logically group resources by project, folder, and organization.
The following diagram shows a resource hierarchy with folders that represent different environments: bootstrap, common, production, non-production (or staging), and development. You deploy most of the blueprint's projects into the production folder, and the data governance project into the common folder, which is used for governance.
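As a rough mental model, the hierarchy can be written down as a mapping from folders to the projects they contain. The folder names below come from the description above; every project name is a hypothetical placeholder chosen for illustration, not a name the blueprint defines.

```python
# Folder names follow the text; project names are hypothetical placeholders.
HIERARCHY = {
    "bootstrap": [],
    "common": ["data-governance"],
    "production": ["data-ingestion", "non-confidential-data", "confidential-data"],
    "non-production": [],
    "development": [],
}


def folder_of(project: str) -> str:
    """Return the folder that contains the given project."""
    for folder, projects in HIERARCHY.items():
        if project in projects:
            return folder
    raise KeyError(f"unknown project: {project}")
```

For example, `folder_of("data-governance")` returns `"common"`, which mirrors the placement of the governance project outside the production folder.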
References:
- https://github.com/GoogleCloudPlatform/terraform-google-secured-data-warehouse
- https://cloud.google.com/architecture/confidential-data-warehouse-blueprint?hl=en