New control plane connectivity and isolation options for your GKE clusters

By mullaned2002

December 21, 2022

1160

Once upon a time, all Google Kubernetes Engine (GKE) clusters used public IP addressing for communication between nodes and the control plane. Subsequently, we heard your security concerns and introduced private clusters enabled by VPC peering.

To consolidate the connectivity types, starting in March 2022, we began using Google Cloud’s Private Service Connect (PSC) for new public clusters’ communication between the GKE cluster control plane and nodes, which has profound implications for how you can configure your GKE environment. Today, we’re presenting a new consistent PSC-based framework for GKE control plane connectivity from cluster nodes. Additionally, we’re excited to announce a new feature set which includes cluster isolation at the control plane and node pool levels to enable more scalable, secure — and cheaper! — GKE clusters.

New architecture

Starting with GKE version 1.23 and later, all new public clusters created on or after March 15th, 2022 began using Google Cloud’s PSC infrastructure to communicate between the GKE cluster control plane and nodes. PSC provides a consistent framework that helps connect different networks through a service networking approach, and allows service producers and consumers to communicate using private IP addresses internal to a VPC.

The biggest benefit of this change is to set the stage for using PSC-enabled features for GKE clusters.

Figure 1: Simplified diagram of PSC-based architecture for GKE clusters

The new set of cluster isolation capabilities we’re presenting here is part of the evolution to a more scalable and secure GKE cluster posture. Previously, private GKE clusters were enabled with VPC peering, introducing specific network architectures. With this feature set, you now have the ability to:

Update the GKE cluster control plane to only allow access to a private endpoint

Create or update a GKE cluster node pool with public or private nodes

Enable or disable GKE cluster control plane access from Google-owned IPs.

In addition, the new PSC infrastructure can provide cost savings. Traditionally, control plane communication is treated as normal egress and is charged for public clusters as a normal public IP charge. This is also true if you’re running kubectl for provisioning or other operational reasons. With PSC infrastructure, we have eliminated the cost of communication between the control plane and your cluster nodes, resulting in one less network egress charge to worry about.

Now, let’s take a look at how this feature set enables these new capabilities.

Allow access to the control plane only via a private endpoint

Private cluster users have long had the ability to create the control plane with both public and private endpoints. We now extend the same flexibility to public GKE clusters based on PSC. With this, if you want private-only access to your GKE control plane but want all your node pools to be public, you can do so.

This model provides a tighter security posture for the control plane, while leaving you to choose what kind of cluster node you need, based on your deployment.

To enable access only to a private endpoint on the control plane, use the following gcloud command:

code_block[StructValue([(u’code’, u’gcloud container clusters update CLUSTER_NAME \rn –enable-private-endpoint’), (u’language’, u”), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3e4cfbce17d0>)])]

Allow toggling and mixed-mode clusters with public and private node pools

All cloud providers with managed Kubernetes offerings offer both public and private clusters. Whether a cluster is public or private is enforced at the cluster level, and cannot be changed once it is created. Now you have the ability to toggle a node pool to have private or public IP addressing.

You may also want a mix of private and public node pools. For example, you may be running a mix of workloads in your cluster in which some require internet access and some don’t. Instead of setting up NAT rules, you can deploy a workload on a node pool with public IP addressing to ensure that only such node pool deployments are publicly accessible.

To enable private-only IP addressing on existing node pools, use the following gcloud command:

code_block[StructValue([(u’code’, u’gcloud container node-pools update POOL_NAME \rn –cluster CLUSTER_NAME \rn –enable-private-nodes’), (u’language’, u”), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3e4cfbcfd850>)])]

To enable private-only IP addressing at node pool creation time, use the following gcloud command:

code_block[StructValue([(u’code’, u’gcloud container node-pools create POOL_NAME \rn –cluster CLUSTER_NAME \rn –enable-private-nodes’), (u’language’, u”), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3e4cfbca3350>)])]

Configure access from Google Cloud

In some scenarios, users have identified workloads outside of their GKE cluster, for example, applications running in Cloud Run or any GCP VMs sourced with Google Cloud public IPs were allowed to reach the cluster control plane. To mitigate potential security concerns, we have introduced a feature that allows you to toggle access to your cluster control plane from such sources.

To remove access from Google Cloud public IPs to the control plane, use the following gcloud command:

code_block[StructValue([(u’code’, u’gcloud container clusters update CLUSTER_NAME \rn –no-enable-google-cloud-access’), (u’language’, u”), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3e4cfbca3850>)])]

Similarly, you can use this flag at cluster creation time.

Choose your private endpoint address

Many customers like to map IPs to a stack for easier troubleshooting and to track usage. For example — IP block x for Infrastructure, IP block y for Services, IP block z for the GKE control plane, etc. By default, the private IP address for the control plane in PSC-based GKE clusters comes from the node subnet. However, some customers treat node subnets as infrastructure and apply security policies against it. To differentiate between infrastructure and the GKE control plane, you can now create a new custom subnet and assign it to your cluster control plane.

code_block[StructValue([(u’code’, u’gcloud container clusters create CLUSTER_NAME \rn –private-endpoint-subnetwork=SUBNET_NAME’), (u’language’, u”), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3e4cfbca3f90>)])]

What can you do with this new GKE architecture?

With this new set of features, you can basically remove all public IP communication for your GKE clusters! This, in essence, means you can make your GKE clusters completely private.

You currently need to create the cluster as public to ensure that it uses PSC, but you can then update your cluster using gcloud with the –enable-private-endpoint flag, or the UI, to configure access via only a private endpoint on the control plane or create new private node pools.

Alternatively, you can control access at cluster creation time with the –master-authorized-networks and –no-enable-google-cloud-access flags to prevent access from public addressing to the control plane.

Furthermore, you can use the REST API or Terraform Providers to actually build a new PSC-based GKE cluster with the default (thus first) node pools to have private nodes. This can be done by setting the enablePrivateNodes field to true (instead of leveraging the public GKE cluster defaults and then updating afterwards, as currently required with gcloud and UI operations).

Lastly, the aforementioned features extend not only to Standard GKE clusters, but also to GKE Autopilot clusters.

When evaluating if you’re ready to move these PSC-based GKE cluster types to take advantage of private cluster isolation, keep in mind that the control plane’s private endpoint has the following limitations:

Private addresses in URLs for new or existing webhooks that you configure are not supported. To mitigate this incompatibility and assign an internal IP address to the URL for webhooks, set up a webhook to a private address by URL, create a headless service without a selector and a corresponding endpoint for the required destination.

The control plane private endpoint is not currently accessible from on-premises systems.

The control plane private endpoint is not currently globally accessible: Client VMs from different regions than the cluster region cannot connect to the control plane’s private endpoint.

All public clusters on version 1.25 and later that are not yet PSC-based are currently being migrated to the new PSC infrastructure; therefore, your clusters might already be using PSC to communicate with the control plane.

To learn more about GKE clusters with PSC-based control plane communication, check out these references:

GKE Concept page for public clusters with PSC

How-to: Change Cluster Isolation page

How-to: GKE node pool creation page with isolation feature flag

How-to: Schedule Pods on GKE Autopilot private nodes

gcloud reference to create a cluster with a custom private subnet

Terraform Providers Google: release v4.45.0 page

Google Cloud Private Services Connect page.

Here are the more specific features in the latest Terraform Provider, handy to integrate into your automation pipeline:

Terraform Providers Google: release v4.45.0

gcp_public_cidrs_access_enabled

enable_private_endpoint