Saturday, May 4, 2024

Session leak detection solutions: Debug non-responsive applications on Cloud Spanner

Cloud Spanner is a fully managed, mission-critical, relational database service that offers transactional consistency at global scale with automatic, synchronous replication for high availability. Customers can connect their applications to Spanner using open source client libraries that Google builds and supports. If you use any of the Spanner client libraries and a transaction or query in your application is blocked, or your application has been waiting indefinitely for a transaction to complete, you might be facing a session leak. Read on for a review of sessions and session pools, and for how to debug and resolve a session leak.

What is a session?

A session represents a communication channel with the Cloud Spanner database service. Some database technologies refer to these channels as connections. A session is used to perform transactions that read, write, or modify data in a Cloud Spanner database. Each session is associated with a single database, while a single database can have many sessions. In Spanner, a session can execute only one transaction at a time. Each standalone read, write, or query counts as a transaction toward that limit.

What is a session pool?

Creating a session is expensive. To avoid paying that performance cost each time a database operation is made, the client libraries are configured with a session pool: a pool of available sessions that are ready to use. The pool maintains a cache that stores existing sessions, returns the appropriate type of session when requested, and handles cleanup of unused sessions.

Sessions are intended to be long-lived, so after a session is used for a database operation, it is returned to the cache for reuse. You can configure the session pool to control the minimum and maximum number of sessions, how many idle sessions to keep, and so on.
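To make the reuse idea concrete, here is a minimal sketch of a pool that hands out a cached session when one is idle and creates a new one only when the cache is empty. It is a toy model for illustration, not the client library's implementation: `ToySessionPool`, `checkOut`, `checkIn`, `createdCount`, and the `maxSessions` parameter are all hypothetical names, and sessions are represented by plain integers.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a session cache. Hypothetical; not the Spanner client library's code. */
final class ToySessionPool {
  private final Deque<Integer> idle = new ArrayDeque<>(); // cached, ready-to-use session ids
  private final int maxSessions;                          // hypothetical upper bound
  private int created = 0;                                // sessions created so far

  ToySessionPool(int maxSessions) {
    this.maxSessions = maxSessions;
  }

  /** Returns an idle session if one is cached, otherwise creates one (the expensive path). */
  synchronized int checkOut() {
    if (!idle.isEmpty()) {
      return idle.pop();
    }
    if (created >= maxSessions) {
      throw new IllegalStateException("pool exhausted");
    }
    return ++created;
  }

  /** Puts the session back in the cache so the next operation can reuse it. */
  synchronized void checkIn(int sessionId) {
    idle.push(sessionId);
  }

  synchronized int createdCount() {
    return created;
  }

  public static void main(String[] args) {
    ToySessionPool pool = new ToySessionPool(4);
    for (int i = 0; i < 10; i++) {
      int session = pool.checkOut(); // first iteration creates, later ones reuse
      pool.checkIn(session);
    }
    System.out.println("sessions created: " + pool.createdCount()); // sessions created: 1
  }
}
```

Ten sequential operations share a single session because each one returns its session before the next one starts. A real pool would block instead of throwing when it is exhausted; the exception here only keeps the sketch short.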

What is a session leak?

A Spanner client object in the client library has a limit on the maximum number of sessions. For example, the default maximum is 100 in the Golang client library and 400 in the Java client library. You can configure these values when you create the client-side database object by passing in SessionPoolOptions.

When all the sessions are checked out of the session pool, every new transaction has to wait until a session is returned to the pool. If a session is never returned to the pool (a session leak), transactions have to wait indefinitely and your application is blocked.
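The blocking behavior can be reproduced with a small model in which each semaphore permit stands for one session. This is only a sketch under stated assumptions: `LeakDemo` and `canGetSession` are hypothetical names, and the caller waits for a bounded time so that the demo terminates, whereas the scenario described above waits indefinitely.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model: each permit stands for one session in the pool. */
final class LeakDemo {
  /**
   * Checks out {@code leaked} sessions without ever returning them, then reports
   * whether one more caller can obtain a session within {@code waitMillis}.
   * Assumes leaked <= maxSessions.
   */
  static boolean canGetSession(int maxSessions, int leaked, long waitMillis) {
    Semaphore pool = new Semaphore(maxSessions);
    pool.acquireUninterruptibly(leaked); // checked out and never returned: the leak
    try {
      boolean acquired = pool.tryAcquire(waitMillis, TimeUnit.MILLISECONDS);
      if (acquired) {
        pool.release(); // a well-behaved caller returns its session
      }
      return acquired;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(canGetSession(3, 2, 50)); // true: one session is still free
    System.out.println(canGetSession(3, 3, 50)); // false: every session has leaked
  }
}
```

With an unbounded wait in place of the timeout, the second call would never return, which is what a blocked application looks like from the outside.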

How are session leaks caused?

The most common cause of a session leak is a transaction that the application started but never committed or rolled back. The fix is to call commit on the transaction at the end of your transaction code.

Spanner has two types of transactions: read-only and read-write. A read-write transaction must be committed even when it only performs reads.

As shown in the example below, it is important to commit the transaction when it ends. Not calling commit or rollback leaks the session.

The sample below is for Java. In Java, a try-with-resources block releases the session when the block completes. If you do not use try-with-resources, you must explicitly call the close() method on all resources, such as a ResultSet; otherwise the session leaks.

```java
DatabaseClient client =
    spanner.getDatabaseClient(DatabaseId.of("my-project", "my-instance", "my-database"));
// try-with-resources closes the ResultSet when the block exits,
// which releases the session.
try (ResultSet resultSet =
    client.singleUse().executeQuery(Statement.of("select col1, col2 from my_table"))) {
  while (resultSet.next()) {
    // use the results.
  }
}
```
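The difference between the two styles can be checked without a database. The sketch below uses a hypothetical `FakeResultSet` that increments a counter when it is created and decrements it in `close()`; the counter stands in for the number of sessions checked out of a pool. None of these names come from the Spanner client library.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class TryWithResourcesDemo {
  /** Hypothetical stand-in for a resource, such as a ResultSet, that holds a session until closed. */
  static final class FakeResultSet implements AutoCloseable {
    private final AtomicInteger checkedOut;

    FakeResultSet(AtomicInteger checkedOut) {
      this.checkedOut = checkedOut;
      checkedOut.incrementAndGet(); // the session leaves the pool
    }

    @Override
    public void close() {
      checkedOut.decrementAndGet(); // the session goes back to the pool
    }
  }

  /** try-with-resources closes the resource even when the body throws. */
  static int afterTryWithResources() {
    AtomicInteger checkedOut = new AtomicInteger();
    try (FakeResultSet rs = new FakeResultSet(checkedOut)) {
      throw new IllegalStateException("processing failed");
    } catch (IllegalStateException e) {
      // close() has already run by the time this catch block executes
    }
    return checkedOut.get();
  }

  /** Without try-with-resources and without close(), the session stays checked out. */
  static int afterForgettingClose() {
    AtomicInteger checkedOut = new AtomicInteger();
    FakeResultSet rs = new FakeResultSet(checkedOut); // never closed
    return checkedOut.get();
  }

  public static void main(String[] args) {
    System.out.println(afterTryWithResources()); // 0
    System.out.println(afterForgettingClose());  // 1
  }
}
```

The first method ends with nothing checked out even though processing failed, while the second leaves one session outstanding for as long as the program runs.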

How can I debug and resolve a session leak?

Logging

Enabled by default, the logging option emits warning logs when more than 95% of your session pool is exhausted. This can mean one of two things: either you need to increase the maximum number of sessions in your session pool (because the number of queries run using the client-side database object is greater than your session pool can serve), or you may have a session leak.

To help you debug which transactions may be causing the session leak, the logs also contain stack traces of transactions that have been running longer than expected. For Java and Golang, where the logs end up depends on how the log exporter is configured.
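As a rough illustration of the 95% threshold, the check can be written with integer arithmetic. This is a sketch only: `PoolUsageCheck` and `shouldWarn` are hypothetical names, and the exact comparison the client libraries perform is an assumption here.

```java
final class PoolUsageCheck {
  /** Returns true when strictly more than 95% of the pool's sessions are checked out. */
  static boolean shouldWarn(int checkedOut, int maxSessions) {
    // Integer arithmetic avoids floating-point rounding at the boundary.
    return checkedOut * 100L > 95L * maxSessions;
  }

  public static void main(String[] args) {
    System.out.println(shouldWarn(380, 400)); // false: exactly 95%
    System.out.println(shouldWarn(381, 400)); // true: just over 95%
  }
}
```

With a maximum of 400 sessions, the warning in this model starts at the 381st checked-out session.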

Automatically clean inactive transactions

When the option to automatically clean inactive transactions is enabled, the client library automatically spots problematic transactions that have been running for extremely long periods of time and might be causing a leak, and closes them. In Java and Golang, the session is removed from the pool and replaced by a new session.

In all languages, to dig deeper into which transactions are being closed, you can check the logs for the stack traces of the transactions that might be causing these leaks and debug them further.
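The cleanup idea can be sketched as a periodic sweep over the sessions that are currently checked out. This is a simplified model, not the libraries' algorithm: `InactiveSessionSweeper`, its methods, and the threshold passed to `sweep` are hypothetical, and the sketch only removes sessions without creating replacements.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of removing sessions that have been checked out for too long. */
final class InactiveSessionSweeper {
  /** Maps a session name to the time it was checked out of the pool. */
  private final Map<String, Instant> checkedOut = new LinkedHashMap<>();

  void checkOut(String session, Instant when) {
    checkedOut.put(session, when);
  }

  void checkIn(String session) {
    checkedOut.remove(session);
  }

  /** Removes and returns the sessions held longer than {@code threshold} as of {@code now}. */
  List<String> sweep(Instant now, Duration threshold) {
    List<String> removed = new ArrayList<>();
    Iterator<Map.Entry<String, Instant>> it = checkedOut.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Instant> entry = it.next();
      if (Duration.between(entry.getValue(), now).compareTo(threshold) > 0) {
        removed.add(entry.getKey()); // a real library would also log the stack trace here
        it.remove();
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    InactiveSessionSweeper sweeper = new InactiveSessionSweeper();
    Instant start = Instant.parse("2024-01-01T00:00:00Z");
    sweeper.checkOut("s1", start);                   // held for 3700 seconds at sweep time
    sweeper.checkOut("s2", start.plusSeconds(3500)); // held for 200 seconds at sweep time
    // The 60-minute threshold is an arbitrary value chosen for this example.
    System.out.println(sweeper.sweep(start.plusSeconds(3700), Duration.ofMinutes(60))); // [s1]
  }
}
```

Only the session that exceeded the threshold is removed; the recently checked-out session is left alone, which is why a short-lived transaction is not affected by this kind of cleanup.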

Code samples

The samples below show how the features explained above can be enabled or disabled.

Java

```java
// Pass setWarnAndCloseIfInactiveTransactions to warn as well as remove
// inactive transactions.
// Alternatively, use setWarnIfInactiveTransactions if you want to
// simply generate warning logs and not remove any transactions.
final SessionPoolOptions sessionPoolOptions =
    SessionPoolOptions.newBuilder().setWarnAndCloseIfInactiveTransactions().build();

final Spanner spanner =
    SpannerOptions.newBuilder()
        .setProjectId(TEST_PROJECT)
        .setChannelConfigurator(ManagedChannelBuilder::usePlaintext)
        .setHost("http://" + endpoint)
        .setCredentials(NoCredentials.getInstance())
        .setSessionPoolOption(sessionPoolOptions)
        .build()
        .getService();
final DatabaseClient client = spanner.getDatabaseClient(databaseId);
```

Golang

```go
// Pass SessionPoolConfig with InactiveTransactionRemovalOptions
// for closing or logging inactive transactions.
client, err := spanner.NewClientWithConfig(
	ctx, database, spanner.ClientConfig{SessionPoolConfig: spanner.SessionPoolConfig{
		InactiveTransactionRemovalOptions: spanner.InactiveTransactionRemovalOptions{
			ActionOnInactiveTransaction: spanner.WarnAndClose,
		},
	}},
)
if err != nil {
	return err
}
defer client.Close()
```

Log messages

Java

```text
// Log message to warn that a long-running transaction is present
Detected long-running session <session-info>. To automatically remove long-running sessions, set SessionOption ActionOnInactiveTransaction to WARN_AND_CLOSE by invoking setWarnAndCloseIfInactiveTransactions() method.
<Stack Trace and information on session>

// Log message for when the transaction is recycled
Removing long-running session <Stack Trace and information on session>

// Error message when a transaction whose session was recycled is used
N/A. Java does not throw an error and allows the session to keep being used because, rather than recycling the session, it removes the session from the pool.
```

Golang

```text
// Log message to warn that a long-running transaction is present
session <session-info> checked out of pool at <session-checkout-time> is long running due to possible session leak for goroutine
<Stack Trace of transaction>

// Log message for when the transaction is recycled
session <session-info> checked out of pool at <session-checkout-time> is long running and will be removed due to possible session leak for goroutine
<Stack Trace of transaction>

// Error message when a transaction whose session was recycled is used
N/A. Golang does not throw an error and allows the session to keep being used because, rather than recycling the session, it removes the session from the pool.
```

By following these best practices and using the logging features, you can improve the reliability of your application. These features help you find and debug issues that may cause a session leak and leave your application stuck and unresponsive. For more information on sessions, see the Spanner documentation.

