
How to monitor IBM MQ instance on Google Cloud with Prometheus metrics

It is often challenging to monitor third-party applications on Google Cloud when the Ops Agent does not directly support them for collecting telemetry data (as of April 2023, here is the list of supported third-party applications). IBM MQ, IBM's messaging and queuing middleware, is one of the products the Ops Agent does not support, which leaves Google Cloud customers with not only a hefty cost but also the overhead of maintaining and securing their telemetry data in other operations suites such as Dynatrace or the IBM monitoring service. In this article, I'm going to help you create your own monitoring flow that not only extracts metrics from your IBM MQ instance but also exports them into native Cloud Monitoring for charting, dashboarding, and alerting.

Prerequisites:

The flow we describe here has two major steps:

MQ Metrics Exporter:

The MQ metrics exporter can be understood as an abstraction layer between the MQ server and Prometheus that transforms MQ metrics into Prometheus metrics for easy consumption. The exporter exposes an endpoint from which the Prometheus metrics can be scraped. To implement this exporter, we need the components below:

IBM MQ Client: The MQ client is part of the IBM MQ product that can be installed on its own, on a separate machine from the base product and server. It is available free of cost. You must install this client on the host where you plan to run the MQ exporter. The exporter uses the MQ client to interact with one or more IBM MQ servers, connecting to their queue managers to fetch useful metrics.

pymqi Python library: The exporter we plan to build is written in Python and leverages built-in methods from the pymqi library to execute IBM MQ PCF commands, which it uses to fetch metrics from MQ servers. Make sure you have Python 3.6.8 and pip3 installed on the host, along with the dependencies below (a quick connectivity check follows the install commands):

sudo dnf install gcc-c++ python3-devel
pip3 install --upgrade pip
python3 -m pip install pymqi
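
With the dependencies installed, it's worth sanity-checking that the MQ client and pymqi can reach your queue manager before writing any exporter code. Here is a minimal sketch; the queue manager name, channel, and host(port) values are placeholders you must replace with your own:

import pymqi

# Hypothetical connection values; replace with your own.
queue_manager = "QM1"             # queue manager name
channel = "DEV.APP.SVRCONN"       # server-connection channel
conn_info = "10.0.0.5(1414)"      # "host(port)" string expected by pymqi

# Connect through the locally installed MQ client, then disconnect.
qmgr = pymqi.connect(queue_manager, channel, conn_info)
print("Connected to", queue_manager)
qmgr.disconnect()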

Prometheus client: The Prometheus client is required to represent MQ metrics as Prometheus metrics; it exposes an HTTP endpoint that makes these metrics available to data collectors. Since we're writing our exporter in Python, we'll use the Python client for Prometheus. Ensure the dependency below is installed (a minimal sketch of the exporter pattern follows the install command):

python3 -m pip install prometheus-client
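
To see the exporter pattern in isolation before the MQ-specific code, here is a minimal sketch using only prometheus_client; the port and gauge name are illustrative, not part of the final exporter:

import time
from prometheus_client import Gauge, start_http_server

# Illustrative gauge; the real exporter registers MQ-specific metrics.
demo_gauge = Gauge("demo_value", "An illustrative gauge")

# Serve metrics at http://localhost:5000/metrics for collectors to scrape.
start_http_server(5000)
while True:
    demo_gauge.set(42)   # the real exporter sets values from MQ PCF replies
    time.sleep(10)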

MQ Metrics Data Collector:

The data collector collects metrics from the MQ exporter and stores them in Cloud Monitoring for querying and dashboarding. Since the data can be queried globally using PromQL, you can keep using any existing Grafana dashboards, PromQL-based alerts, and workflows. The data collector we use in this solution is the Ops Agent, as it is the easiest way to scrape an MQ exporter running on Compute Engine. However, you can also containerize the exporter and run it in Google Kubernetes Engine. In a Kubernetes environment, you can leverage Google Cloud Managed Service for Prometheus with self-managed or managed data collection to store metrics in Cloud Monitoring. Please see:

Google Cloud Managed Service for Prometheus with managed collection

Google Cloud Managed Service for Prometheus with self-deployed collection

Google Cloud Managed Service for Prometheus with OpenTelemetry collector

Solution:

MQ monitoring flow

The Python client you run in the IBM MQ client environment issues PCF commands; each request is redirected to a queue manager, where it is processed and from where a reply is sent back to the client. The link between the Python client and the MQ client is established dynamically at run time. The Ops Agent has a Prometheus receiver that keeps polling the exporter's HTTP server (the /metrics path) to collect data and forward it to Cloud Monitoring.
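
In pymqi, that round trip looks like the sketch below, again with hypothetical connection values (the full exporter later in this article wraps the same calls in a class):

import pymqi

# Hypothetical connection values; replace with your own.
qmgr = pymqi.connect("QM1", "DEV.APP.SVRCONN", "10.0.0.5(1414)")

# The PCF command is sent to the queue manager's command server, which
# processes it and returns the reply to the client as a list of dicts.
pcf = pymqi.PCFExecute(qmgr)
for attrs in pcf.MQCMD_INQUIRE_Q_MGR_STATUS():
    print("connection count:", attrs[pymqi.CMQCFC.MQIACF_CONNECTION_COUNT])

qmgr.disconnect()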

Code Example:

Now that we've addressed all our dependencies, it's time to look into the implementation of the MQ exporter with a few examples of IBM MQ metrics. The sample code below lets you capture three important metrics:

Current connection count

Queue depth

Queue manager health state

You can extend this code to capture as many metrics as required.

"""MQ Metrics exporter"""

import os
import time

import google.auth
import pymqi
from prometheus_client import start_http_server, Gauge, Enum


class MQMetrics:
    def __init__(self, interval_seconds=10):
        self.interval_seconds = interval_seconds
        # Prometheus metrics to collect:
        # 1. current_connection_count: number of currently active
        #    connections on the queue manager.
        # 2. queue_depth: number of messages per queue.
        # 3. qmgr_health: health state of the queue manager
        #    (0 = unhealthy, 1 = healthy).
        self.current_connection_count = Gauge(
            "current_connection_count",
            "Current connection count in queue manager",
            ["qmgr"],
        )
        self.queue_depth = Gauge("queue_depth", "Queue Depth",
                                 ["qmgr", "queue"])
        self.health = Enum(
            "qmgr_health", "Health", ["qmgr"], states=["healthy", "unhealthy"]
        )

    def collect_metrics_in_loop(
        self, queue_manager, channel, conn_info, user, password
    ):
        """Scrape metrics from the MQ server on a fixed interval."""
        while True:
            self.fetch_metrics(queue_manager, channel, conn_info,
                               user, password)
            time.sleep(self.interval_seconds)

    def fetch_metrics(self, queue_manager, channel, conn_info,
                      user, password):
        """Fetch metrics from the MQ server and update Prometheus."""
        qmgr = pymqi.connect(queue_manager, channel, conn_info,
                             user, password)
        pcf = pymqi.PCFExecute(qmgr)

        # Verify that Google Cloud credentials are available.
        try:
            _, project_id = google.auth.default()
        except google.auth.exceptions.DefaultCredentialsError:
            raise ValueError(
                "Couldn't find Google Cloud credentials, set the "
                "project ID with 'gcloud config set project'"
            )

        # Metric 1: Queue manager current connection count
        response = pcf.MQCMD_INQUIRE_Q_MGR_STATUS()

        for data in response:
            self.current_connection_count.labels(qmgr=queue_manager).set(
                data[pymqi.CMQCFC.MQIACF_CONNECTION_COUNT]
            )

        # Metric 2: Current depth of every local queue
        prefix = "*"
        queue_type = pymqi.CMQC.MQQT_LOCAL

        args = {pymqi.CMQC.MQCA_Q_NAME: prefix,
                pymqi.CMQC.MQIA_Q_TYPE: queue_type}

        response = pcf.MQCMD_INQUIRE_Q(args)

        for q_info in response:
            queue_name = q_info[pymqi.CMQC.MQCA_Q_NAME].decode("utf-8")
            queue_depth = q_info[pymqi.CMQC.MQIA_CURRENT_Q_DEPTH]
            self.queue_depth.labels(qmgr=queue_manager, queue=queue_name).set(
                queue_depth
            )

        # Metric 3: Check health state of queue manager
        self.collect_health_state(pcf, queue_manager)

        qmgr.disconnect()

    def collect_health_state(self, pcf, queue_manager):
        """Ping the queue manager to check whether it is responsive."""
        try:
            pcf.MQCMD_PING_Q_MGR()
        except pymqi.MQMIError:
            self.health.labels(qmgr=queue_manager).state("unhealthy")
        else:
            self.health.labels(qmgr=queue_manager).state("healthy")


def main():
    interval_seconds = 10
    # Port used by the exporter to expose Prometheus metrics
    exporter_port = int(os.getenv("MQ_EXPORTER_PORT", "5000"))

    # Provide values for the below variables
    queue_manager = os.getenv("QUEUE_MANAGER", "[queue manager name]")
    channel = os.getenv("CHANNEL", "[channel name]")
    host = os.getenv("HOST", "[MQ server IP]")
    port = os.getenv("PORT", "[Port no.]")
    user = os.getenv("USER", "[user to connect queue manager]")
    password = os.getenv("PASSWORD", "[password for the user]")

    conn_info = "%s(%s)" % (host, port)

    mq_metrics_exporter = MQMetrics(interval_seconds=interval_seconds)
    start_http_server(exporter_port)
    mq_metrics_exporter.collect_metrics_in_loop(
        queue_manager, channel, conn_info, user, password
    )


if __name__ == "__main__":
    main()
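
With the exporter running, you can quickly confirm that it is serving metrics before wiring up a collector. Here is a small smoke test, assuming the exporter is on localhost port 5000:

import urllib.request

# Fetch the /metrics endpoint that the Ops Agent will later scrape.
with urllib.request.urlopen("http://localhost:5000/metrics") as resp:
    text = resp.read().decode("utf-8")

# Print only our MQ metric samples, skipping Prometheus comment lines.
for line in text.splitlines():
    if line.startswith(("current_connection_count", "queue_depth",
                        "qmgr_health")):
        print(line)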

Metric Collection:

We need to collect the exported metrics with a data collector. Follow the steps below to set up the Ops Agent on your host:

Install the Ops Agent. You must install version 2.25.0 or higher.

Edit the Ops Agent configuration file, /etc/google-cloud-ops-agent/config.yaml, and add a Prometheus receiver and pipeline:

metrics:
  receivers:
    prometheus:
      type: prometheus
      config:
        scrape_configs:
          - job_name: 'mq_metrics_exporter'
            scrape_interval: 20s
            static_configs:
              - targets: ['localhost:5000']
  service:
    pipelines:
      prometheus_pipeline:
        receivers:
          - prometheus

Note: Change the target host to point the collector at the machine where the MQ exporter is running.

Restart the Ops Agent:

sudo service google-cloud-ops-agent restart

Viewing Prometheus metrics with Cloud Monitoring

After running the MQ exporter and the Ops Agent as described above, you can visualize the metric data with any of the following options available in Cloud Monitoring:

PromQL

Monitoring Query Language (MQL)

Metrics Explorer

You can also view your metrics in other interfaces, like Prometheus UI and Grafana.

Prometheus UI

Grafana

To verify that our three MQ metrics are being ingested, we'll use PromQL:

In the Google Cloud Console, go to Monitoring.

In the Monitoring navigation pane, click Metrics Explorer.

Select the PromQL tab.

Execute the following queries one by one to see data and chart:

current_connection_count
queue_depth
qmgr_health
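
You can also filter by label. For example, to chart the depth of a single queue (the queue manager and queue names here are hypothetical):

queue_depth{qmgr="QM1", queue="DEV.QUEUE.1"}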

Metric 1: Current connection count

Metric 2: Queue depth

Metric 3: Queue manager health state

Summing up:

Now that we've seen that the MQ exporter and Ops Agent components are independent in nature, they can be deployed in a distributed manner or bundled on the same machine. However, we recommend deploying the MQ monitor (or exporter) on a different machine than the MQ server to ensure scalability.

The sample code above covers only a few metrics; however, you can extend the solution to execute more of the PCF commands available in the pymqi library to gather the additional metrics below (a sketch of one such extension follows this list):

Shared connections

Active channels

Channel status

Total messages

Bytes sent/received 

Buffer sent/received

Queue depth percentage

Queue I/O 

Uncommitted messages

Enqueued messages

Dequeued messages
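
As an illustration of one item from this list, here is a sketch of a channel status collector built on the same PCFExecute pattern as the exporter above. It registers one extra gauge whose value is the raw numeric channel status code; the constant names follow pymqi's PCF examples, so treat this as an assumption and verify it against your pymqi version:

import pymqi
from prometheus_client import Gauge

# Value is the numeric MQCHS_* status code, e.g. pymqi.CMQCFC.MQCHS_RUNNING.
channel_status = Gauge("channel_status", "Channel status code",
                       ["qmgr", "channel"])

def collect_channel_status(pcf, queue_manager):
    """Inquire the status of all channels and record each as a gauge."""
    args = {pymqi.CMQCFC.MQCACH_CHANNEL_NAME: b"*"}
    try:
        response = pcf.MQCMD_INQUIRE_CHANNEL_STATUS(args)
    except pymqi.MQMIError:
        return  # no active channel instances to report
    for info in response:
        name = info[pymqi.CMQCFC.MQCACH_CHANNEL_NAME].strip().decode("utf-8")
        status = info[pymqi.CMQCFC.MQIACH_CHANNEL_STATUS]
        channel_status.labels(qmgr=queue_manager, channel=name).set(status)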

What's next:

Cloud Monitoring provides visibility into the performance, uptime, and overall health of cloud-powered applications. Cloud Logging, meanwhile, is a fully managed service that performs at scale and can ingest application and platform log data, as well as custom log data, from GKE environments, VMs, and other services inside and outside of Google Cloud.

Get started today with the interactive articles below:

Monitor network and CPU utilization for a VM

How to use logging with a Compute Engine VM

Running and Monitoring Integrations

References:

Check out the references below to learn more:

Managed Prometheus

Ops Agent For Prometheus

IBM MQ Client Installation

IBM MQ Client Documentation: https://www.ibm.com/support/pages/mqc91-ibm-mq-clients

