A definitive guide to Database Observability with Cloud SQL: Part 1

Introduction

Database performance is essential for any business because it affects the efficiency and effectiveness of day-to-day operations. A slow database can cause delays in processing transactions, which can have a negative impact on customer satisfaction and profitability. 

Cloud SQL for PostgreSQL offers database observability through Cloud SQL Insights and SQLcommenter, which help customers diagnose, detect, and prevent database performance issues using a developer-first approach.

Cloud SQL for PostgreSQL supports additional database logs that can include metadata about database connections and disconnections, as well as query execution. All of these logs can be configured using database flags. Cloud SQL also supports widely used PostgreSQL Extensions, such as pg_stat_statements. By enabling both Cloud SQL Insights and native options, customers can get the best of both worlds.

This blog post explains how customers who have migrated to Cloud SQL for PostgreSQL can still use familiar PostgreSQL tools such as pgBadger and pg_stat_statements for database observability, in addition to Cloud SQL Insights. Unlike Cloud SQL Insights, pgBadger requires extra steps to activate all necessary logging; if logging is not enabled, only a partial report is generated. pgBadger also requires some operational setup, such as a server or GCE instance on which to download the logs and create the report.

This article describes how to configure logging and generate HTML reports using pgBadger. It also emphasizes the value of using pg_stat_statements to capture nested calls within procedural calls.

Configuring database flags

Cloud SQL for PostgreSQL provides database flags that control the information captured in database logs. First, we will configure flags to log slow queries and every connection and disconnection. We will enable the following database flags (log_connections, log_disconnections, log_min_duration_statement, log_temp_files, and log_lock_waits), with values chosen as examples rather than recommendations:

Activating the above-mentioned parameters or setting aggressive logging values can put a strain on the database, so we should assess the performance impact before changing them in production. We can refer to the complete list of flags that are available for configuration on Cloud SQL for PostgreSQL.

Sample command line for updating database flags.

gcloud sql instances patch <<instanceName>> --database-flags=log_connections=on,log_disconnections=on,log_min_duration_statement=1000,log_temp_files=512,log_lock_waits=on

We are going to activate a new setting to record all nested calls as part of a top-level statement issued by clients. This will help us identify problematic nested calls, i.e. statements invoked within procedural code, in the pg_stat_statements view.

pg_stat_statements.track = all
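On Cloud SQL, this parameter is applied as a database flag rather than edited in postgresql.conf. The command below is a minimal sketch, assuming pg_stat_statements.track is a supported flag for your instance's PostgreSQL version; note that patching with --database-flags replaces any previously set flags, so the earlier flags are repeated here:

gcloud sql instances patch <<instanceName>> --database-flags=log_connections=on,log_disconnections=on,log_min_duration_statement=1000,log_temp_files=512,log_lock_waits=on,pg_stat_statements.track=all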

We can verify our set parameter from the database by querying the pg_settings view.
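For example, a query along these lines (using the standard name and setting columns of pg_settings) confirms the values we just set:

select name, setting
from pg_settings
where name in ('log_connections', 'log_disconnections', 'log_min_duration_statement',
               'log_temp_files', 'log_lock_waits', 'pg_stat_statements.track');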

Setting up pgbench Database

We will use pgbench, a tool for benchmarking PostgreSQL databases, to simulate a performance baseline run on our Cloud SQL database. The loads generated during the mock benchmark run will be used to generate reports from database logs using pgBadger.

We will initialize a newly created database, "pgbench", using the pgbench command-line tool that ships with the PostgreSQL client libraries.

deepakmahto$ PGPASSWORD=******** pgbench -i -s 2 --partitions=365 -h localhost -p 5432 -U postgres pgbench
dropping old tables...
NOTICE: table "pgbench_accounts" does not exist, skipping
NOTICE: table "pgbench_branches" does not exist, skipping
NOTICE: table "pgbench_history" does not exist, skipping
NOTICE: table "pgbench_tellers" does not exist, skipping
creating tables...
creating 365 partitions...
generating data (client-side)...
200000 of 200000 tuples (100%) done (elapsed 3.36 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 111.31 s (drop tables 0.26 s, create tables 94.93 s, client-side generate 12.63 s, vacuum 1.60 s, primary keys 1.89 s).

Generate pgBadger reports

pgBadger is a tool that processes database log files to generate HTML reports, which include information about connections, sessions, vacuum activity, temporary files, and top queries. We have already configured the relevant database flags to log information about each database connection and about queries taking longer than one second (log_min_duration_statement=1000).

We will configure Cloud SQL for PostgreSQL logs to be stored in a Cloud Storage bucket, and then download them to Cloud Shell to process them using pgBadger to generate HTML reports.

As a first step, we will install and set up pgBadger on Cloud Shell.

Setting up pgBadger on Cloud Shell or a Compute Engine instance

git clone https://github.com/darold/pgbadger.git

cd pgbadger/

perl Makefile.PL
make && sudo make install

deepakmahto@pgbadger:~/pgbadger$ pgbadger -V
pgBadger version 11.8

Configure a Cloud SQL log routing sink to Cloud Storage

Cloud SQL uses Cloud Logging to store all instance logs, and Cloud Logging can be used to view and query them. Using the Logs Router, which is part of Cloud Logging, we will create a log routing sink that sends all required Cloud SQL instance logs to a Google Cloud Storage (GCS) bucket as its destination.

We provided the required sink information and set up a new Cloud Storage bucket as the destination; logs are routed to the bucket as an hourly batch process. We also added an inclusion filter so that the sink captures only logs from the Cloud SQL instance with the specified database ID label; additional filters can be used to restrict the sink to specific instances.

The inclusion filter for our sample instance is:

resource.type="cloudsql_database"
resource.labels.database_id="deepakmahto:pgbadgersample"
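As an alternative to the console, the sink can also be created from the command line. This is a sketch only; the sink name is hypothetical, and the bucket matches the one we download from later in this post:

gcloud logging sinks create cloudsql-pgbadger-sink \
    storage.googleapis.com/cloudsqlpostgresqllogs \
    --log-filter='resource.type="cloudsql_database" AND resource.labels.database_id="deepakmahto:pgbadgersample"'

After the sink is created, the writer identity it reports must be granted permission to write objects to the destination bucket.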

Cloud SQL execution logs will then be available in the bucket in hourly batches for pgBadger to analyze and turn into reports. For real-time insight into load and SQL details, we can always use Cloud SQL Insights.

Run a mock benchmark test using pgbench

We have already initialized the pgbench sample database and will now simulate 50 client connections across 5 threads as a benchmark test to generate load on our Cloud SQL database.

deepakmahto$ PGPASSWORD=******** pgbench -c 50 -j 5 -t 10 -h localhost -p 5432 -U postgres pgbench

pgbench (14.2)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 2
partition method: range
partitions: 365
query mode: simple
number of clients: 50
number of threads: 5
number of transactions per client: 10
number of transactions actually processed: 500/500
latency average = 13516.238 ms
initial connection time = 18347.037 ms
tps = 3.699254 (without initial connection time)

Run pgBadger to generate HTML reports

We will download the Cloud SQL for PostgreSQL logs from Cloud Storage to our Cloud Shell environment and run pgBadger to generate an HTML report for the mock benchmark run. The process of downloading logs from Cloud Storage and executing pgBadger can also be automated to generate reports at a variety of intervals, from hourly to weekly.

The Compute Engine service account must be granted the necessary roles in order to download logs from Cloud Storage.
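As a sketch, read access can be granted to the Compute Engine default service account with a command along these lines (the project number placeholder is an assumption; the bucket name matches the one used below):

gsutil iam ch \
    serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com:objectViewer \
    gs://cloudsqlpostgresqllogs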

deepakmahto@pgbadger:~$ gsutil cp gs://cloudsqlpostgresqllogs/cloudsql.googleapis.com/postgres.log/2022/05/31/*.json .

Once Cloud SQL logs are downloaded, you can use the pgBadger tool to generate HTML reports from the JSON-based logs.

deepakmahto@pgbadger:~$ pgbadger -f jsonlog -o outputReport.html -O . 16\:00\:00_16\:59\:59_S0.json 17\:00\:00_17\:59\:59_S0.json

[========================>]
Parsed 171657125 bytes of 171657125 (100.00%), queries: 6496, events: 35176
LOG: Ok, generating html report...

An initial look at the generated HTML report shows information on connections, temp file usage, SQL execution statistics, and more.

For example, we can check the slowest queries and the most resource-intensive queries under the Top sections of the report.
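As mentioned earlier, the download-and-report cycle can also be automated. The following is a rough sketch of a daily script, assuming the same bucket layout as above and a GNU date utility; the file names and scheduling details are placeholders:

#!/bin/bash
# Hypothetical daily report job: pull today's Cloud SQL JSON logs from the
# sink bucket and build a single pgBadger report from them.
BUCKET="gs://cloudsqlpostgresqllogs"
DAY_PATH=$(date -u +%Y/%m/%d)
WORKDIR=$(mktemp -d)

gsutil -m cp "${BUCKET}/cloudsql.googleapis.com/postgres.log/${DAY_PATH}/*.json" "${WORKDIR}/"
pgbadger -f jsonlog -o "pgbadger_$(date -u +%Y%m%d).html" -O . "${WORKDIR}"/*.json
rm -rf "${WORKDIR}"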

pg_stat_statements Extension

The pg_stat_statements extension provides a convenient way to view cumulative query execution statistics, which are exposed as a database view. It is preloaded through shared_preload_libraries and can be enabled in a database with the CREATE EXTENSION command.

pgbench=> show shared_preload_libraries;
                shared_preload_libraries
--------------------------------------------------------
 google_insights,pg_stat_statements,password_validation
(1 row)

pgbench=> create extension pg_stat_statements;
CREATE EXTENSION

pgbench=> \dx pg_stat_statements
List of installed extensions
        Name        | Version | Schema | Description
--------------------+---------+--------+-------------------------------------------------------------------------
 pg_stat_statements | 1.9     | public | track planning and execution statistics of all SQL statements executed

The official documentation for PostgreSQL covers the pg_stat_statements extension in great detail. In this blog, we will discuss how to track nested calls as part of procedural calls using the flag pg_stat_statements.track.

During a performance investigation, we might encounter a procedural call that is labelled as problematic or that consumes considerable execution time. It can then be difficult to determine which statements nested within the procedural code are causing or contributing to the overall performance problem.

When pg_stat_statements.track is set to all, nested queries executed within a procedural statement can be captured as part of the pg_stat_statements view itself.

Let's create a faulty procedural function and invoke it from SQL in a loop.

CREATE OR REPLACE FUNCTION public.sampleproceduralblock(id integer) RETURNS boolean
    LANGUAGE plpgsql
    AS $$
DECLARE
    cnt bigint;
BEGIN
    select count(1) into cnt from pgbench_accounts where bid = id;
    if cnt > 0 then
        return true;
    else
        return false;
    end if;
END $$;

The function checks for the existence of records using a count aggregation on an unindexed filter column, bid. Called multiple times, it adds noticeable overhead to overall performance.

To get a clean result, we will reset pg_stat_statements to capture all SQL as fresh runs.

pgbench=> select pg_stat_statements_reset();
 pg_stat_statements_reset
---------------------------

(1 row)

Let's execute the faulty procedural block repeatedly and query the pg_stat_statements view to get the execution statistics.

pgbench=> select public.sampleproceduralblock(col1) from generate_series(1,50) col1;

pgbench=> select query, total_exec_time, toplevel from pg_stat_statements where query like '%sampleproceduralblock%';
                                    query                                    | total_exec_time | toplevel
-----------------------------------------------------------------------------+-----------------+----------
 select public.sampleproceduralblock(col1) from generate_series($1,$2) col1 | 3045.590353     | t
(1 row)

The procedural call is recorded as a top-level call and takes a total of about 3 seconds. To check whether any nested calls are included within the top-level procedure call, we can use a query similar to the following:

pgbench=> select query, total_exec_time, toplevel from pg_stat_statements where toplevel = false;
                        query                          |  total_exec_time   | toplevel
--------------------------------------------------------+--------------------+----------
 select count(1) from pgbench_accounts where bid = id  | 2966.6783230000005 | f
(1 row)

This highlights the nested SQL call within the faulty procedural code as the problematic statement. Now that we know which nested call is problematic, we can provide recommendations and fix the overall procedural execution.
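As an illustration of one possible fix (a sketch, not part of the original benchmark), we could index the filter column, or rewrite the function so the nested query stops at the first matching row instead of counting them all:

-- Option 1: index the unindexed filter column (creates indexes on every partition).
CREATE INDEX pgbench_accounts_bid_idx ON pgbench_accounts (bid);

-- Option 2: replace the count aggregation with an existence check.
CREATE OR REPLACE FUNCTION public.sampleproceduralblock(id integer) RETURNS boolean
    LANGUAGE plpgsql
    AS $$
BEGIN
    RETURN EXISTS (SELECT 1 FROM pgbench_accounts WHERE bid = id);
END $$;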

The pg_stat_statements.track setting should be changed only for testing and for finding problematic nested calls during the quality assurance phase; changing it in a high-workload environment may add overhead.

What’s next?

PostgreSQL-native observability options, such as pgBadger and the pg_stat_statements extension, enable database developers and administrators to continue leveraging tools they are familiar with when using Cloud SQL for PostgreSQL. 

Unlike Cloud SQL Insights, pgBadger requires all necessary logging to be enabled in order to generate a complete report; otherwise, only a partial report is generated. Additionally, a server or GCE instance is required to download the logs and generate the report with pgBadger.

Cloud SQL for PostgreSQL gives customers the best of both worlds: native PostgreSQL features and Cloud SQL Insights. Cloud SQL Insights provides intuitive monitoring and root-cause analysis for performance issues. We will cover Cloud SQL Insights in the next part of this Database Observability series. In the meantime, you can learn more about this topic in our documentation.
