Understand the change in Cloud Monitoring service discovery and how to adapt

By mullaned2002

May 24, 2024


If you’ve opened the SLOs Overview in the Google Cloud console recently, you may have seen a notice about a change to service discovery.

This notice announces a recent change in how services are defined in Cloud Monitoring. Before the change, Cloud Monitoring automatically discovered services provisioned in App Engine, Cloud Run, or Google Kubernetes Engine (GKE), and populated the Services Overview dashboard with them.

Now, every service in the Services Overview dashboard has to be created explicitly. To simplify this task, when you define a new service in the console UI, you are presented with a list of candidates built from the auto-discovered services. The full list includes managed services from App Engine, Cloud Run, and Istio, as well as GKE workloads and services.

Besides using the UI, you can add managed services to Cloud Monitoring using the services.create API or using the Terraform google_monitoring_service resource.

For example, if you have a GKE cluster named cluster-001 provisioned in the us-central1 region that has a service frontend in the default namespace, the following command in Cloud Shell defines this service for Cloud Monitoring:

```shell
# Create the service with services.create; the URL is quoted and kept on one line
# so the shell does not split it or expand the "?".
curl -X POST \
  "https://monitoring.googleapis.com/v3/projects/${GOOGLE_CLOUD_PROJECT}/services?service_id=frontend" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d '
{
  "gkeService": {
    "clusterName": "cluster-001",
    "location": "us-central1",
    "namespaceName": "default",
    "serviceName": "frontend"
  }
}'
```

When using the Terraform resource, convert the keys of the service_labels argument from the camel case used in the API documentation to snake case. For example, the GKE service frontend defined earlier with curl looks like this in Terraform:

```hcl
resource "google_monitoring_service" "frontend" {
  service_id = "frontend"

  basic_service {
    service_type = "GKE_SERVICE"

    # Keys are the snake case form of the API's camel case field names.
    service_labels = {
      location       = "us-central1"
      cluster_name   = "cluster-001"
      namespace_name = "default"
      service_name   = "frontend"
    }
  }
}
```
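The camel case to snake case conversion described above is mechanical, so it can be sketched as a small shell helper. This is only an illustration of the naming rule: to_snake_case is a hypothetical function name, and it assumes simple keys such as clusterName with no consecutive capitals.

```shell
# Convert a camel case key (as shown in the API documentation) to snake case
# (as expected in the Terraform service_labels map).
# Hypothetical helper; assumes keys like "clusterName" without acronyms.
to_snake_case() {
  printf '%s\n' "$1" \
    | sed -E 's/([a-z0-9])([A-Z])/\1_\2/g' \
    | tr '[:upper:]' '[:lower:]'
}

# Example: print the Terraform key for each API field of the GKE service.
for key in location clusterName namespaceName serviceName; do
  to_snake_case "$key"
done
```

Keys that are already lowercase, such as location, pass through unchanged.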

When your definition of a service does not map one-to-one to a managed service, you can add it to Cloud Monitoring as a custom service. Use the same services.create API request, with a custom body:

```shell
# Create a custom service; the empty "custom" object marks it as user-defined.
curl -X POST \
  "https://monitoring.googleapis.com/v3/projects/${GOOGLE_CLOUD_PROJECT}/services?service_id=custom_svc" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d '
{
  "displayName": "custom service",
  "custom": {}
}'
```

Alternatively, use the dedicated Terraform resource, google_monitoring_custom_service:

```hcl
resource "google_monitoring_custom_service" "custom_svc" {
  service_id   = "custom_svc"
  display_name = "custom service"
}
```
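Whichever way a service is defined, reading it back is a quick check that Cloud Monitoring registered it. The sketch below assumes the services.get endpoint (a GET on the service's resource name) and the same gcloud credentials used for the requests above. service_url and get_service are hypothetical helper names, not part of any Google tooling.

```shell
# Build the Cloud Monitoring API URL of one service.
# $1 = project ID, $2 = service ID (hypothetical helper).
service_url() {
  printf 'https://monitoring.googleapis.com/v3/projects/%s/services/%s' "$1" "$2"
}

# Fetch the service definition as JSON.
# Needs network access and an authenticated gcloud CLI.
get_service() {
  curl -sS \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "$(service_url "$1" "$2")"
}

# Usage (not run here because it calls the live API):
#   get_service "${GOOGLE_CLOUD_PROJECT}" frontend
#   get_service "${GOOGLE_CLOUD_PROJECT}" custom_svc
```

A service that was created successfully should come back as a JSON object. An error response suggests the service ID or project is wrong.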

Unlike custom services, auto-detected services come with two predefined SLIs, one for availability and one for latency. These SLIs use metrics that are captured automatically for managed services, such as request processing time or HTTP request status. For custom services, you have to define these SLIs explicitly as request-based or window-based SLIs.

For more information about tracking your service's SLOs and error budgets, check out creating SLOs and SLO-based alerts. And see this blog to learn about the predefined SLIs that are used in availability and latency SLOs.
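As a rough illustration of an explicit SLI for a custom service, the sketch below builds the request body for a request-based SLO using the API's goodTotalRatio structure. The helper names json_escape and slo_body, the display name, the 30-day rolling period, and the metric types in the usage comment are all assumptions made for this example. Substitute filters that match metrics your service actually writes.

```shell
# JSON-escape backslashes and double quotes in a monitoring filter string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Print a request body for serviceLevelObjectives.create with a request-based SLI.
# $1 = goal as a fraction (e.g. 0.99)
# $2 = filter selecting good requests, $3 = filter selecting all requests
slo_body() {
  cat <<EOF
{
  "displayName": "availability SLO",
  "goal": $1,
  "rollingPeriod": "2592000s",
  "serviceLevelIndicator": {
    "requestBased": {
      "goodTotalRatio": {
        "goodServiceFilter": "$(json_escape "$2")",
        "totalServiceFilter": "$(json_escape "$3")"
      }
    }
  }
}
EOF
}

# Usage (not run here; the metric types are hypothetical):
#   slo_body 0.99 \
#     'metric.type="custom.googleapis.com/http/good_count" resource.type="global"' \
#     'metric.type="custom.googleapis.com/http/total_count" resource.type="global"' \
#   | curl -X POST \
#       "https://monitoring.googleapis.com/v3/projects/${GOOGLE_CLOUD_PROJECT}/services/custom_svc/serviceLevelObjectives" \
#       -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#       -H "Content-Type: application/json; charset=utf-8" \
#       -d @-
```

The ratio of the two filtered time series is the SLI, and the goal of 0.99 means that 99% of requests in the rolling period have to count as good.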

Source: Cloud Blog
