Once upon a time, all Google Kubernetes Engine (GKE) clusters used public IP addressing for communication between nodes and the control plane. Subsequently, we heard your security concerns and introduced private clusters enabled by VPC peering.
To consolidate the connectivity types, starting in March 2022, we began using Google Cloud’s Private Service Connect (PSC) for new public clusters’ communication between the GKE cluster control plane and nodes, which has profound implications for how you can configure your GKE environment. Today, we’re presenting a new consistent PSC-based framework for GKE control plane connectivity from cluster nodes. Additionally, we’re excited to announce a new feature set which includes cluster isolation at the control plane and node pool levels to enable more scalable, secure — and cheaper! — GKE clusters.
All new public clusters on GKE version 1.23 or later that were created on or after March 15, 2022 use Google Cloud’s PSC infrastructure to communicate between the GKE cluster control plane and nodes. PSC provides a consistent framework that helps connect different networks through a service networking approach, and allows service producers and consumers to communicate using private IP addresses internal to a VPC.
The biggest benefit of this change is that it sets the stage for using PSC-enabled features in GKE clusters.
The new set of cluster isolation capabilities we’re presenting here is part of the evolution to a more scalable and secure GKE cluster posture. Previously, private GKE clusters were enabled with VPC peering, introducing specific network architectures. With this feature set, you now have the ability to:
Update the GKE cluster control plane to only allow access to a private endpoint
Create or update a GKE cluster node pool with public or private nodes
Enable or disable GKE cluster control plane access from Google-owned IPs
In addition, the new PSC infrastructure can provide cost savings. Traditionally, control plane communication is treated as normal egress and is charged for public clusters as a normal public IP charge. This is also true if you’re running kubectl for provisioning or other operational reasons. With PSC infrastructure, we have eliminated the cost of communication between the control plane and your cluster nodes, resulting in one less network egress charge to worry about.
Now, let’s take a look at how this feature set enables these new capabilities.
Allow access to the control plane only via a private endpoint
Private cluster users have long had the ability to create the control plane with both public and private endpoints. We now extend the same flexibility to public GKE clusters based on PSC. With this, if you want private-only access to your GKE control plane but want all your node pools to be public, you can do so.
This model provides a tighter security posture for the control plane, while leaving you to choose what kind of cluster node you need, based on your deployment.
To enable access only to a private endpoint on the control plane, use the following gcloud command:
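A minimal sketch of that update, assuming a placeholder cluster name `my-cluster` (not from the original post). It assembles the `gcloud container clusters update` command with the `--enable-private-endpoint` flag and prints it for review rather than running it right away:

```shell
# Placeholder cluster name (an assumption); substitute your own.
CLUSTER_NAME="my-cluster"

# Assemble the command so it can be reviewed before it runs.
CMD="gcloud container clusters update ${CLUSTER_NAME} --enable-private-endpoint"
echo "${CMD}"

# Once the printed command looks right, execute it with: eval "${CMD}"
```

Depending on your gcloud configuration, you may also need to pass the cluster's location (for example with `--region` or `--zone`).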
Allow toggling and mixed-mode clusters with public and private node pools
All cloud providers with managed Kubernetes offerings offer both public and private clusters. Whether a cluster is public or private is enforced at the cluster level, and cannot be changed once it is created. Now you have the ability to toggle a node pool to have private or public IP addressing.
You may also want a mix of private and public node pools. For example, you may be running a mix of workloads in your cluster in which some require internet access and some don’t. Instead of setting up NAT rules, you can deploy a workload on a node pool with public IP addressing to ensure that only such node pool deployments are publicly accessible.
To enable private-only IP addressing on existing node pools, use the following gcloud command:
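A minimal sketch for an existing node pool, assuming placeholder names `my-pool` and `my-cluster` and a gcloud version that supports `--enable-private-nodes` on node pool updates:

```shell
# Placeholder names (assumptions); substitute your own.
POOL_NAME="my-pool"
CLUSTER_NAME="my-cluster"

# Assemble the command so it can be reviewed before it runs.
CMD="gcloud container node-pools update ${POOL_NAME} --cluster ${CLUSTER_NAME} --enable-private-nodes"
echo "${CMD}"

# Once the printed command looks right, execute it with: eval "${CMD}"
```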
To enable private-only IP addressing at node pool creation time, use the following gcloud command:
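A minimal sketch for a new node pool, assuming placeholder names `my-private-pool` and `my-cluster` and a gcloud version that supports `--enable-private-nodes` on node pool creation:

```shell
# Placeholder names (assumptions); substitute your own.
POOL_NAME="my-private-pool"
CLUSTER_NAME="my-cluster"

# Assemble the command so it can be reviewed before it runs.
CMD="gcloud container node-pools create ${POOL_NAME} --cluster ${CLUSTER_NAME} --enable-private-nodes"
echo "${CMD}"

# Once the printed command looks right, execute it with: eval "${CMD}"
```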
Configure access from Google Cloud
In some scenarios, users have found that workloads outside of their GKE cluster (for example, applications running in Cloud Run or any Google Cloud VMs with Google Cloud public IPs) were allowed to reach the cluster control plane. To mitigate potential security concerns, we have introduced a feature that allows you to toggle access to your cluster control plane from such sources.
To remove access from Google Cloud public IPs to the control plane, use the following gcloud command:
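A minimal sketch of that update, assuming a placeholder cluster name `my-cluster`. It uses the `--no-enable-google-cloud-access` flag and prints the assembled command for review rather than running it right away:

```shell
# Placeholder cluster name (an assumption); substitute your own.
CLUSTER_NAME="my-cluster"

# Assemble the command so it can be reviewed before it runs.
CMD="gcloud container clusters update ${CLUSTER_NAME} --no-enable-google-cloud-access"
echo "${CMD}"

# Once the printed command looks right, execute it with: eval "${CMD}"
```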
Similarly, you can use this flag at cluster creation time.
Choose your private endpoint address
Many customers like to map IPs to a stack for easier troubleshooting and to track usage: for example, IP block x for infrastructure, IP block y for services, IP block z for the GKE control plane, and so on. By default, the private IP address for the control plane in PSC-based GKE clusters comes from the node subnet. However, some customers treat node subnets as infrastructure and apply security policies against them. To differentiate between infrastructure and the GKE control plane, you can now create a new custom subnet and assign it to your cluster control plane.
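As a rough sketch of assigning a custom subnet at cluster creation time: the example below assumes a subnet named `gke-control-plane-subnet` already exists in the cluster's VPC and region, and that your gcloud version supports the `--private-endpoint-subnetwork` flag on `gcloud container clusters create`. Both names are placeholders, not values from this post.

```shell
# Placeholder names (assumptions); substitute your own.
CLUSTER_NAME="my-cluster"
CONTROL_PLANE_SUBNET="gke-control-plane-subnet"

# Assemble the command so it can be reviewed before it runs.
CMD="gcloud container clusters create ${CLUSTER_NAME} --private-endpoint-subnetwork ${CONTROL_PLANE_SUBNET}"
echo "${CMD}"

# Once the printed command looks right, execute it with: eval "${CMD}"
```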
What can you do with this new GKE architecture?
With this new set of features, you can basically remove all public IP communication for your GKE clusters! This, in essence, means you can make your GKE clusters completely private.
You currently need to create the cluster as public to ensure that it uses PSC, but you can then update your cluster using gcloud with the --enable-private-endpoint flag, or the UI, to configure access via only a private endpoint on the control plane or create new private node pools.
Alternatively, you can control access at cluster creation time with the --master-authorized-networks and --no-enable-google-cloud-access flags to prevent access from public addressing to the control plane.
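A minimal sketch of the creation-time variant, assuming a placeholder cluster name and the documentation-only range `203.0.113.0/24` as the single authorized network. In gcloud, `--master-authorized-networks` is paired with `--enable-master-authorized-networks`, so both appear here:

```shell
# Placeholder values (assumptions); substitute your own.
CLUSTER_NAME="my-cluster"
AUTHORIZED_CIDR="203.0.113.0/24"

# Assemble the command so it can be reviewed before it runs.
CMD="gcloud container clusters create ${CLUSTER_NAME} \
--enable-master-authorized-networks \
--master-authorized-networks ${AUTHORIZED_CIDR} \
--no-enable-google-cloud-access"
echo "${CMD}"

# Once the printed command looks right, execute it with: eval "${CMD}"
```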
Furthermore, you can use the REST API or Terraform Providers to actually build a new PSC-based GKE cluster with the default (thus first) node pools to have private nodes. This can be done by setting the enablePrivateNodes field to true (instead of leveraging the public GKE cluster defaults and then updating afterwards, as currently required with gcloud and UI operations).
Lastly, the aforementioned features extend not only to Standard GKE clusters, but also to GKE Autopilot clusters.
When evaluating if you’re ready to move to these PSC-based GKE cluster types to take advantage of private cluster isolation, keep in mind that the control plane’s private endpoint has the following limitations:
Private addresses in URLs for new or existing webhooks that you configure are not supported. To work around this incompatibility, set up the webhook with a private address by URL, create a headless service without a selector, and create a corresponding endpoint for the required destination.
The control plane private endpoint is not currently accessible from on-premises systems.
The control plane private endpoint is not currently globally accessible: Client VMs from different regions than the cluster region cannot connect to the control plane’s private endpoint.
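For the webhook limitation above, the headless service and its endpoint might look like the following sketch. The name `webhook-backend`, the `default` namespace, port 443, and the address `10.0.0.10` are all hypothetical placeholders. How the webhook configuration then refers to this service is not covered here. The script only writes the manifest to a file so it can be reviewed first:

```shell
# Hypothetical file name; the manifest values below are placeholders too.
MANIFEST="webhook-backend.yaml"

cat > "${MANIFEST}" <<'EOF'
# Headless service (clusterIP: None) with no selector.
apiVersion: v1
kind: Service
metadata:
  name: webhook-backend
  namespace: default
spec:
  clusterIP: None
  ports:
  - port: 443
    targetPort: 443
---
# Endpoints object with the same name, pointing at the private destination.
apiVersion: v1
kind: Endpoints
metadata:
  name: webhook-backend
  namespace: default
subsets:
- addresses:
  - ip: 10.0.0.10
  ports:
  - port: 443
EOF

echo "Wrote ${MANIFEST}"

# After reviewing the file, apply it with: kubectl apply -f "${MANIFEST}"
```

Because the service has no selector, Kubernetes does not manage its endpoints automatically, so the Endpoints object must share the service's name and namespace.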
All public clusters on version 1.25 and later that are not yet PSC-based are currently being migrated to the new PSC infrastructure; therefore, your clusters might already be using PSC to communicate with the control plane.
To learn more about GKE clusters with PSC-based control plane communication, check out these references:
GKE Concept page for public clusters with PSC
How-to: Change Cluster Isolation page
How-to: GKE node pool creation page with isolation feature flag
How-to: Schedule Pods on GKE Autopilot private nodes
gcloud reference to create a cluster with a custom private subnet
Terraform Providers Google: release v4.45.0 page
Google Cloud Private Service Connect page
Here are the more specific features in the latest Terraform Provider, handy to integrate into your automation pipeline:
Terraform Providers Google: release v4.45.0