Use Process Metrics for troubleshooting and resource attribution

By mullaned2002

August 18, 2021

When your application or service has an issue, deep visibility into both the infrastructure and the software that power it is critical. Most monitoring services provide insights at the Virtual Machine (VM) level, but few go further. To get a full picture of the state of your application or service, you need to know what processes are running on your infrastructure. The new Ops Agent provides that visibility into the processes running on your VMs out of the box, and it is available by default in Cloud Monitoring. Today we will cover how to access process metrics and why you should start monitoring them.

Better visibility with process metrics

Process metrics include CPU, memory, I/O, number of threads, and more, for every process and service running on your VMs. When the Ops Agent or the Cloud Monitoring agent is installed, these metrics are captured at 60-second intervals and sent to Cloud Monitoring so you can visualize, analyze, track, and alert on them. A single VM may run tens or hundreds of processes, and your fleet of VMs may run tens of thousands.
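To get a feel for the volume these numbers imply, here is a rough back-of-the-envelope sketch. The fleet size, processes per VM, and number of metrics per process are illustrative assumptions, not figures from Cloud Monitoring; only the 60-second sampling interval comes from the text above.

```python
# One sample per process metric every 60 seconds.
SAMPLE_INTERVAL_S = 60
SAMPLES_PER_DAY = 24 * 60 * 60 // SAMPLE_INTERVAL_S  # 1440


def estimate_daily_points(vm_count, processes_per_vm, metrics_per_process):
    """Return (time series count, data points per day) for a fleet."""
    series = vm_count * processes_per_vm * metrics_per_process
    return series, series * SAMPLES_PER_DAY


# Hypothetical fleet: 200 VMs, 100 processes each, 5 metrics per process.
series, points = estimate_daily_points(
    vm_count=200, processes_per_vm=100, metrics_per_process=5
)
print(f"{series:,} time series, {points:,} points per day")
```

Even a modest fleet produces a large number of distinct time series, which is why cardinality matters for process metrics.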

As a developer, you may only care about seeing inside a single VM to troubleshoot and identify memory leaks or the source of performance issues.
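As an illustration of what spotting a memory leak can look like once you have the samples, the sketch below fits a straight line to a process's resident memory over time and flags steady growth. The sample values, the function names, and the 50 MiB per hour threshold are all hypothetical; in practice you would feed in the values charted for a process and pick a threshold that suits your workload.

```python
from statistics import mean

SAMPLE_INTERVAL_S = 60  # process metrics arrive once per minute
MIB = 1024 * 1024


def rss_growth_per_hour(rss_bytes):
    """Least-squares slope of resident memory, in bytes per hour."""
    if len(rss_bytes) < 2:
        raise ValueError("need at least two samples")
    times = [i * SAMPLE_INTERVAL_S for i in range(len(rss_bytes))]
    t_mean = mean(times)
    y_mean = mean(rss_bytes)
    cov = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, rss_bytes))
    var = sum((t - t_mean) ** 2 for t in times)
    return cov / var * 3600  # bytes per second -> bytes per hour


def looks_like_leak(rss_bytes, threshold_bytes_per_hour=50 * MIB):
    """Flag a process whose memory grows faster than the threshold."""
    return rss_growth_per_hour(rss_bytes) > threshold_bytes_per_hour


# Hypothetical 30-minute windows: one process gains 1 MiB per minute,
# the other stays flat.
leaky = [200 * MIB + i * MIB for i in range(30)]
steady = [200 * MIB] * 30

print(looks_like_leak(leaky), looks_like_leak(steady))
```

A linear fit is a deliberately simple heuristic; a process that legitimately warms up a cache would also trip it, so treat a flag as a prompt to look at the chart rather than a verdict.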

As an operator or IT Admin, you may be interested in aggregate resource consumption, building baseline views of compute, storage, and networking usage across your VM fleet. Then, when consumption deviates from those baselines, you will know it is time to investigate your systems.
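One simple way to express "deviates from the baseline" is a mean plus a tolerance band. The following is a minimal sketch with made-up CPU percentages and hypothetical function names; the three-standard-deviation band is an assumption, and Cloud Monitoring's own alerting policies may suit you better than hand-rolled checks.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarize normal behavior as (mean, sample standard deviation)."""
    return mean(history), stdev(history)


def deviations(samples, baseline, k=3.0):
    """Return (index, value) pairs more than k standard deviations from the mean."""
    center, spread = baseline
    return [(i, v) for i, v in enumerate(samples) if abs(v - center) > k * spread]


# Hypothetical fleet-wide CPU utilization (%) during a normal period.
history = [40.0, 42.0, 41.0, 39.0, 43.0, 40.0, 41.0, 42.0]
baseline = build_baseline(history)

# Hypothetical recent samples; the third one is well outside the band.
recent = [41.0, 44.0, 78.0, 40.0]
flagged = deviations(recent, baseline)
print(flagged)
```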

Built for scale and ease of use

Cloud Monitoring is built on the same advanced backend that powers metrics across Google. That proven scalability means it can ingest process metrics despite their extremely high cardinality. Additionally, our agents do not require any config file changes to turn on process metric monitoring.

Lastly, our goal is to provide you the observability and telemetry data where, and when, you need it. So, like the rest of the operations suite, we deliver process metrics in the context of your infrastructure, directly in the VM admin console.

Navigating to a single VM’s in-context process monitoring in GCE

The navigation is simple. Once you have the Ops Agent or the Cloud Monitoring agent installed in your VMs:

Go to the Compute Engine console page and click on VM Instances

Select the VM that you want to investigate

In the navigation menu at the top, click Observability

Click on Metrics

Lastly, click on Processes

In the window on the right you will see a chart and a table with all of the processes in your VM. You can also filter by time frame and sort by name or value. Other than having the agent installed, you do not need to do anything for processes to be detected and displayed.
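If you later want to reach the same data outside the console, process metrics can be referenced by a Monitoring filter string. The helper below is a sketch only: the function name and the instance ID are hypothetical, and the metric names `rss_usage` and `cpu_time` under the `agent.googleapis.com/processes/` prefix are assumptions that you should check against the agent metrics list in the Cloud Monitoring documentation.

```python
def process_metric_filter(metric, instance_id=None):
    """Build a Monitoring filter for one process metric on Compute Engine VMs.

    `metric` is the short name after the (assumed) prefix
    agent.googleapis.com/processes/, for example "rss_usage".
    """
    metric_type = f"agent.googleapis.com/processes/{metric}"
    clauses = [
        f'metric.type = "{metric_type}"',
        'resource.type = "gce_instance"',
    ]
    if instance_id is not None:
        # Restrict the filter to a single VM.
        clauses.append(f'resource.labels.instance_id = "{instance_id}"')
    return " AND ".join(clauses)


# Hypothetical instance ID, for illustration only.
single_vm = process_metric_filter("rss_usage", instance_id="1234567890")
fleet_wide = process_metric_filter("cpu_time")
print(single_vm)
print(fleet_wide)
```

Omitting the instance ID gives a fleet-wide filter, which mirrors the two views described in this post.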

Fleet-wide metrics monitoring

Cloud Monitoring gives you a look across your fleet of VMs so you can identify the aggregated usage of resources by processes. This level of broad, yet granular, insight can drive your decisions around which software to run or how many VMs you need to optimally power your apps and services. For example, if admins determine that certain processes are slowing down a large number of VMs, a cost-savings analysis may show that many less powerful VMs can be replaced by fewer, more capable ones.

To get this fleet-wide view:

Navigate to Cloud Monitoring

Click Dashboards in the left menu

In the All Dashboards list, click on VM Instances

Towards the top of the window, click on Processes

This provides many charts detailing the processes running across your fleet of VMs.

The new Cloud Monitoring VM Fleet-wide Process view in the VM Instance Dashboard
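To make the idea of aggregated usage by process concrete, the sketch below sums per-VM samples by process name and ranks the heaviest consumers. The record layout, VM names, and numbers are all hypothetical; this shows the kind of rollup the fleet-wide charts present and is not how Cloud Monitoring computes them.

```python
from collections import defaultdict


def aggregate_by_process(samples):
    """Sum CPU seconds and resident memory per process name across all VMs."""
    totals = defaultdict(lambda: {"cpu_seconds": 0.0, "rss_bytes": 0, "vms": set()})
    for s in samples:
        entry = totals[s["process"]]
        entry["cpu_seconds"] += s["cpu_seconds"]
        entry["rss_bytes"] += s["rss_bytes"]
        entry["vms"].add(s["vm"])
    return dict(totals)


def top_consumers(totals, key, n=3):
    """Return the n process names with the highest total for `key`."""
    return sorted(totals, key=lambda name: totals[name][key], reverse=True)[:n]


# Hypothetical samples from three VMs.
samples = [
    {"vm": "vm-a", "process": "nginx", "cpu_seconds": 120.0, "rss_bytes": 50_000_000},
    {"vm": "vm-a", "process": "postgres", "cpu_seconds": 300.0, "rss_bytes": 900_000_000},
    {"vm": "vm-b", "process": "nginx", "cpu_seconds": 80.0, "rss_bytes": 45_000_000},
    {"vm": "vm-b", "process": "backup", "cpu_seconds": 20.0, "rss_bytes": 10_000_000},
    {"vm": "vm-c", "process": "postgres", "cpu_seconds": 250.0, "rss_bytes": 800_000_000},
]

totals = aggregate_by_process(samples)
print(top_consumers(totals, "cpu_seconds", n=2))
```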

Get started today

To start identifying and monitoring your process metrics, install the Ops Agent, or use an existing installation of the legacy Cloud Monitoring agent. Once the agent is running, process metrics data is automatically ingested into Cloud Monitoring and shown in the VM admin console.

If you have any questions, or to join the conversation with other developers, operators, DevOps, and SREs, visit the Cloud Operations page in the Google Cloud Community.
