Boost your development on Cloud Spanner change streams with a new tail tool

By mullaned2002

March 22, 2023


Cloud Spanner change streams is a flexible, scalable mechanism to collect and stream out change data from Spanner databases. It allows you to integrate with downstream systems such as BigQuery and Pub/Sub to process the incremental data changes happening in Spanner databases for analytics, event triggering, and many other use cases.

Today, we’re excited to announce the release of a new open source tool called spanner-change-streams-tail in Cloud Spanner Ecosystem, which allows you to “tail -f” the change stream on your local machine without integrating with Dataflow pipelines. With this tool, you can understand how change streams work and what the captured data looks like.

In this blog, we’ll walk you through how to use this tool to view Spanner change streams. Let’s get started!

Set up a change stream

In our example, we are going to use the following table and change stream.

    CREATE TABLE Players (
      PlayerId INT64 NOT NULL,
      PlayerName STRING(36) NOT NULL
    ) PRIMARY KEY (PlayerId);

    CREATE CHANGE STREAM PlayersStream FOR Players;

Refer to the documentation on creating a Spanner database to set up a database and then create a table with this DDL.

Run spanner-change-streams-tail

To install spanner-change-streams-tail, run the following Go command. If you have not installed Go, follow the install page to install the Go toolchain.

    $ go install github.com/cloudspannerecosystem/spanner-change-streams-tail@latest

Next, run the following gcloud command for authentication and authorization for spanner-change-streams-tail.

    $ gcloud auth application-default login

Now spanner-change-streams-tail is ready for use. Run the following command to start reading the change stream.

    $ spanner-change-streams-tail -p test-project -i test-instance -d test-db -s PlayersStream

    Reading the stream...

Let’s open a new terminal and insert rows into the table.

    $ gcloud spanner databases execute-sql test-db \
      --instance=test-instance \
      --sql="INSERT INTO Players (PlayerId, PlayerName) VALUES (1, 'foo'), (2, 'bar')"

The data captured by the change stream appears immediately in your spanner-change-streams-tail terminal.

    2023-01-13 05:13:09.771183 +0000 UTC | INSERT | Players | [{"keys":{"PlayerId":"1"},"new_values":{"PlayerName":"foo"},"old_values":{}},{"keys":{"PlayerId":"2"},"new_values":{"PlayerName":"bar"},"old_values":{}}]

Now that you’re viewing your change stream in real time, you may want to run some more DML statements on the table to see what the change stream collects. Here are examples of the change data produced by UPDATE and DELETE statements.

    2023-01-13 14:28:35.111743 +0000 UTC | UPDATE | Players | [{"keys":{"PlayerId":"1"},"new_values":{"PlayerName":"baz"},"old_values":{"PlayerName":"foo"}}]
    2023-01-13 14:29:02.561631 +0000 UTC | DELETE | Players | [{"keys":{"PlayerId":"2"},"new_values":{},"old_values":{"PlayerName":"bar"}}]

Customize the output

By default, spanner-change-streams-tail writes its output in text format. To get JSON instead, specify the "-f json" option. You can then reshape the output with a JSON-aware tool such as jq.

    $ spanner-change-streams-tail -p test-project -i test-instance -d test-db -s PlayersStream -f json | jq '{time:.commit_timestamp, type:.mod_type, table:.table_name}'

    Reading the stream...
    {
      "time": "2023-01-13T05:20:53.310335Z",
      "type": "INSERT",
      "table": "Players"
    }
    {
      "time": "2023-01-13T05:21:10.114332Z",
      "type": "UPDATE",
      "table": "Players"
    }

For other options, please refer to the README.

Read change streams from your Go code

If you want to view change streams in a format other than text or JSON, you can even write your own Go code to read the change stream for your purpose. spanner-change-streams-tail provides a Go package that helps you write a program to read your change streams in Go.

Let’s see a simple example outputting data change commit timestamps, mod types, and affected table names to the console.

    package main

    import (
        "context"
        "fmt"
        "log"

        "github.com/cloudspannerecosystem/spanner-change-streams-tail/changestreams"
    )

    func main() {
        ctx := context.Background()
        reader, err := changestreams.NewReader(ctx, "myproject", "myinstance", "mydb", "mystream")
        if err != nil {
            log.Fatalf("failed to create a reader: %v", err)
        }
        defer reader.Close()

        if err := reader.Read(ctx, func(result *changestreams.ReadResult) error {
            for _, cr := range result.ChangeRecords {
                for _, dcr := range cr.DataChangeRecords {
                    fmt.Printf("[%s] %s %s\n", dcr.CommitTimestamp, dcr.ModType, dcr.TableName)
                }
            }
            return nil
        }); err != nil {
            log.Fatalf("failed to read: %v", err)
        }
    }

View change stream partitions

One of the key concepts of Spanner change streams is change stream partitions, which contain the change stream records. Change stream partitions are dynamically split and merged alongside database data splits.

spanner-change-streams-tail can display the lineage of change stream partitions in Graphviz DOT format. If you’re debugging an application that reads change streams, this helps you understand how change stream partitions have been split and merged during a specified time period.

See the --visualize-partitions option to learn more about how to visualize change stream partitions.

Next steps

The tail tool is designed for quick testing and code prototyping during development. For production use cases that require higher scale and reliability, you can use our Dataflow or Kafka connectors. You can also use the Spanner API to read change streams directly. Please check out this blog post to learn more about how to get started with Dataflow connectors to integrate with change streams.

We hope you’ll enjoy the simplicity and the ease of trying out Spanner change streams with spanner-change-streams-tail.
