Saturday, November 2, 2024
No menu items!
HomeCloud ComputingBoost your development on Cloud Spanner change streams with a new tail...

Boost your development on Cloud Spanner change streams with a new tail tool

Cloud Spanner change streams is a flexible, scalable mechanism to collect and stream out change data from Spanner databases. It allows you to integrate with downstream systems such as BigQuery and Pub/Sub to process the incremental data changes happening in Spanner databases for analytics, event triggering, and many other use cases.

Today, we’re excited to announce the release of a new open source tool called spanner-change-streams-tail in Cloud Spanner Ecosystem, which allows you to “tail -f” the change stream on your local machine without integrating with Dataflow pipelines. With this tool, you can understand how change streams work and what the captured data looks like.

In this blog, we’ll walk you through how to use this tool to view Spanner change streams. Let’s get started!

Set up a change stream

In our example, we are going to use the following table and change stream.

code_block[StructValue([(u’code’, u’CREATE TABLE Players (rn PlayerId INT64 NOT NULL,rn PlayerName STRING(36) NOT NULLrn) Primary Key (PlayerId);rnrnCREATE CHANGE STREAM PlayersStream FOR Players;’), (u’language’, u”), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3ef4fc4c7810>)])]

Refer to the documentation on creating a Spanner database to set up a database and then create a table with this DDL.

Run spanner-change-streams-tail

To install spanner-change-streams-tail, run the following Go command. If you have not installed Go, follow the install page to install the Go toolchain.

code_block[StructValue([(u’code’, u’$ go install github.com/cloudspannerecosystem/spanner-change-streams-tail@latest’), (u’language’, u”), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3ef4d54c9750>)])]

Next, run the following gcloud command for authentication and authorization for spanner-change-streams-tail.

code_block[StructValue([(u’code’, u’$ gcloud auth application-default login’), (u’language’, u”), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3ef4fc3c7a50>)])]

Now spanner-change-streams-tail is ready for use. Run the following command to start reading the change stream.

code_block[StructValue([(u’code’, u’$ spanner-change-streams-tail -p test-project -i test-instance -d test-db -s PlayersStreamrnrnReading the stream…’), (u’language’, u”), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3ef4fc3c7990>)])]

Let’s open a new terminal and insert rows to the table.

code_block[StructValue([(u’code’, u’$ gcloud spanner databases execute-sql test-db \rn –instance=test-instance \rn –sql=”INSERT INTO Players (PlayerId, PlayerName) VALUES (1, ‘foo’), (2, ‘bar’)”‘), (u’language’, u”), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3ef4f5cd9050>)])]

You will immediately see the change stream captured data in your spanner-change-streams-tail terminal.

code_block[StructValue([(u’code’, u’2023-01-13 05:13:09.771183 +0000 UTC | INSERT | Players | [{“keys”:{“PlayerId”:”1″},”new_values”:{“PlayerName”:”foo”},”old_values”:{}},{“keys”:{“PlayerId”:”2″},”new_values”:{“PlayerName”:”bar”},”old_values”:{}}]’), (u’language’, u”), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3ef4f5cd9110>)])]

Now that you’re viewing your change stream in real time, you may want to run some more DML statements on the table so that you can see the collected data in the change stream. Here are some examples for the change data produced by UPDATE and DELETE statements.

code_block[StructValue([(u’code’, u’2023-01-13 14:28:35.111743 +0000 UTC | UPDATE | Players | [{“keys”:{“PlayerId”:”1″},”new_values”:{“PlayerName”:”baz”},”old_values”:{“PlayerName”:”foo”}}]rn2023-01-13 14:29:02.561631 +0000 UTC | DELETE | Players | [{“keys”:{“PlayerId”:”2″},”new_values”:{},”old_values”:{“PlayerName”:”bar”}}]’), (u’language’, u”), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3ef4f5cd9e90>)])]

Customize the output

By default, spanner-change-streams-tail will output the log in a text format. However, you can change it to output in JSON format by specifying the “-f json” option. With this option, you can customize the output with a JSON-compatible tool like jq.

code_block[StructValue([(u’code’, u’$ spanner-change-streams-tail -p test-project -i test-instance -d test-db -s PlayersStream -f json | jq ‘{time:.commit_timestamp, type:.mod_type, table:.table_name}’rnrnReading the stream…rn{rn “time”: “2023-01-13T05:20:53.310335Z”,rn “type”: “INSERT”,rn “table”: “Players”rn}rn{rn “time”: “2023-01-13T05:21:10.114332Z”,rn “type”: “UPDATE”,rn “table”: “Players”rn}’), (u’language’, u”), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3ef4f5cd9b90>)])]

For other options, please refer to the README.

Read change streams from your Go code

If you want to view change streams in a format other than text or JSON, you can even write your own Go code to read the change stream for your purpose. spanner-change-streams-tail provides a Go package that helps you write a program to read your change streams in Go.

Let’s see a simple example outputting data change commit timestamps, mod types, and affected table names to the console.

code_block[StructValue([(u’code’, u’package mainrnrnimport (rnt”context”rnt”fmt”rnt”log”rnrn”github.com/cloudspannerecosystem/spanner-change-streams-tail/changestreams”rn)rnrnfunc main() {rntctx := context.Background()rntreader, err := changestreams.NewReader(ctx, “myproject”, “myinstance”, “mydb”, “mystream”)rntif err != nil {rnttlog.Fatalf(“failed to create a reader: %v”, err)rnt}rntdefer reader.Close()rnrntif err := reader.Read(ctx, func(result *changestreams.ReadResult) error {rnttfor _, cr := range result.ChangeRecords {rntttfor _, dcr := range cr.DataChangeRecords {rnttttfmt.Printf(“[%s] %s %s\n”, dcr.CommitTimestamp, dcr.ModType, dcr.TableName)rnttt}rntt}rnttreturn nilrnt}); err != nil {rnttlog.Fatalf(“failed to read: %v”, err)rnt}rn}’), (u’language’, u”), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3ef4f5cd9d90>)])]

View change stream partitions

One of the key concepts of Spanner change streams is change stream partitions, which contain the change stream records. Change stream partitions are dynamically split and merged alongside database data splits.

spanner-change-streams-tail can display the lineage of change stream partitions in Graphviz DOT format. If you’re debugging your application reading change streams, this helps 

you understand how change stream partitions have been split and merged during a specified time period.

Please check –visualize-partitions option to know more about how to visualize change stream partitions.

Next steps

The tail tool is designed for quick testing and code prototyping during the development. For production use cases that require higher scale and reliability, you can use our Dataflow or Kafka connectors. You can also use Spanner API to read change streams directly. Please check out this blog post to learn more about how to get started with Dataflow connectors to integrate with change streams.

We hope you’ll enjoy the simplicity and the ease of trying out Spanner change streams with spanner-change-streams-tail.

Cloud BlogRead More

RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Most Popular

Recent Comments