Today we are announcing the Preview of BigQuery Remote Functions. Remote Functions are user-defined functions (UDFs) that let you extend BigQuery SQL with your own custom code, written and hosted in Cloud Functions, Google Cloud’s scalable pay-as-you-go functions as a service. A remote UDF accepts columns from BigQuery as input, performs actions on that input using a Cloud Function, and returns the result of those actions as a value in the query result. With Remote Functions, you can now write custom SQL functions in Node.js, Python, Go, Java, .NET, Ruby, or PHP. This means you can personalize BigQuery for your company and leverage the same management and permission models, all without having to manage a server.
In what type of situations could you use remote functions?
Security and Compliance: Use data encryption and tokenization services from the Google Cloud security ecosystem for external encryption and de-identification. We’ve already started working with key partners like Protegrity and Voltage on using these external functions as a mechanism to merge BigQuery into their security platforms, which will help our mutual customers address strict compliance controls.
Real-Time APIs: Enrich BigQuery data using external APIs to obtain the latest stock price data, weather updates, or geocoding information.
Code Migration: Migrate legacy UDFs or other procedural functions written in Node.js, Python, Go, Java, .NET, Ruby, or PHP.
Data Science: Encapsulate complex business logic and score BigQuery datasets by calling models hosted in Vertex AI or other machine learning platforms.
Let’s go through the steps to use a BigQuery remote UDF.
Set up the BigQuery Connection:
1. Create a BigQuery Connection
a. You may need to enable the BigQuery Connection API
Deploy a Cloud Function with your code:
1. Deploy your Cloud Function
a. You may need to enable the Cloud Functions API
b. You may need to enable the Cloud Build API
2. Grant the BigQuery Connection service account access to the Cloud Function
a. One way to find the service account is with the bq CLI show command
Use the BigQuery remote UDF in SQL:
1. Write a SQL statement as you would when calling any other UDF
2. Get your results!
How remote functions can help you with common data tasks
Let’s take a look at some examples of how using BigQuery with remote UDFs can help accelerate development and enhance data processing and analysis.
Encryption and Decryption
As an example, let’s create a simple custom encryption and decryption Cloud Function in Python.
The encryption function receives the data and returns an encrypted, base64-encoded string.
In the same Cloud Function, the decryption function receives an encrypted, base64-encoded string and returns the decrypted string. A data engineer would be able to enable this functionality in BigQuery.
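As a rough illustration of that pair of functions, here is a minimal sketch. It uses a repeating-key XOR purely as a reversible stand-in so the example runs with the standard library alone; it is not real encryption and is not the code from the original post. The names `DEMO_KEY`, `encrypt`, and `decrypt` are assumptions made for this example.

```python
import base64
from itertools import cycle

# Hypothetical demo key. A real deployment would use a vetted cipher and
# load its key from a key management or secret storage service.
DEMO_KEY = b"not-a-real-key"


def _xor(data: bytes, key: bytes) -> bytes:
    """XOR every byte with a repeating key. Reversible, but NOT secure."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def encrypt(plaintext: str, key: bytes = DEMO_KEY) -> str:
    """Transform the text and return it as a base64-encoded string."""
    scrambled = _xor(plaintext.encode("utf-8"), key)
    return base64.b64encode(scrambled).decode("ascii")


def decrypt(token: str, key: bytes = DEMO_KEY) -> str:
    """Reverse encrypt(): base64-decode, then undo the XOR."""
    scrambled = base64.b64decode(token)
    return _xor(scrambled, key).decode("utf-8")
```

Calling `decrypt(encrypt("555-0100"))` returns the original string, which is the round-trip property the two SQL functions rely on.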
The Cloud Function receives the data and determines which function you want to invoke. The data arrives as the JSON body of an HTTP POST request. The additional userDefinedContext fields allow you to send additional pieces of data to the Cloud Function.
The result is sent back to BigQuery as a JSON response in a specific format, which BigQuery then parses.
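The request and response handling described above might look like the following sketch. It assumes the request object exposes `get_json()`, as the Flask request passed to a Python Cloud Function does, and it relies on the remote function contract of a `calls` array in the request and a `replies` array (or an `errorMessage`) in the response. The `mode` context key, the `OPERATIONS` table, and the base64 stand-in operations are assumptions made so the example runs on its own.

```python
import base64
import json

# Stand-in operations so the sketch is runnable by itself; a real function
# would plug in its actual encryption and decryption logic here.
OPERATIONS = {
    "encryption": lambda s: base64.b64encode(s.encode("utf-8")).decode("ascii"),
    "decryption": lambda s: base64.b64decode(s).decode("utf-8"),
}


def remote_udf(request):
    """HTTP entry point. `request` must expose get_json(), like a Flask request."""
    payload = request.get_json()
    # "mode" is a hypothetical key chosen when the SQL function is defined.
    mode = (payload.get("userDefinedContext") or {}).get("mode")
    operation = OPERATIONS.get(mode)
    if operation is None:
        return json.dumps({"errorMessage": f"unknown mode: {mode!r}"}), 400
    try:
        # Each element of "calls" is the argument list for one input row.
        # SQL NULL arrives as None and is passed through unchanged.
        replies = [
            None if args[0] is None else operation(args[0])
            for args in payload["calls"]
        ]
    except Exception as exc:
        return json.dumps({"errorMessage": str(exc)}), 400
    # One reply per call, in the same order as the calls.
    return json.dumps({"replies": replies}), 200
```

Dispatching on the context value is what lets a single deployed endpoint serve both the encrypting and the decrypting SQL function.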
This Python code is deployed to Cloud Functions, where it waits to be invoked.
Let’s add the user-defined function to BigQuery so we can invoke it from a SQL statement. The user_defined_context option is sent to Cloud Functions as additional context in the request payload, so you can map multiple remote functions to one endpoint.
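To make the mapping concrete, the sketch below renders the kind of CREATE FUNCTION statement involved as a Python string, once per mode. Every project, dataset, connection, and endpoint name is a placeholder invented for this example, and the `mode` context key is likewise an assumption; the helper only builds text and does not contact BigQuery.

```python
def remote_function_ddl(function_name: str, mode: str, connection: str, endpoint: str) -> str:
    """Render a CREATE FUNCTION statement for a remote UDF taking one STRING."""
    return (
        f"CREATE OR REPLACE FUNCTION `{function_name}`(x STRING) RETURNS STRING\n"
        f"REMOTE WITH CONNECTION `{connection}`\n"
        "OPTIONS (\n"
        f"  endpoint = '{endpoint}',\n"
        f"  user_defined_context = [(\"mode\", \"{mode}\")]\n"
        ")"
    )


# Placeholder identifiers; substitute your own project, dataset, connection
# (project.location.connection_id) and Cloud Function trigger URL.
CONNECTION = "my-project.us.my-connection"
ENDPOINT = "https://us-central1-my-project.cloudfunctions.net/remote_udf"

ENCRYPT_DDL = remote_function_ddl(
    "my-project.my_dataset.encrypt", "encryption", CONNECTION, ENDPOINT
)
DECRYPT_DDL = remote_function_ddl(
    "my-project.my_dataset.decrypt", "decryption", CONNECTION, ENDPOINT
)
```

Both statements point at the same endpoint and differ only in the context value, so the Cloud Function can tell which SQL function triggered the call.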
Once we’ve created our functions, users with the right IAM permissions can use them in SQL on BigQuery.
If you’re new to Cloud Functions, be aware that the first invocation after a period of inactivity can incur a short delay known as a “cold start”.
The neat thing is you can call APIs as well, which is how our partners at Protegrity and Voltage enable their platforms to perform encryption and decryption of BigQuery data.
Calling APIs to enrich your data
Users such as data analysts can easily use the user-defined functions that have been created, without needing other tools or moving the data out of BigQuery.
You can enrich your dataset with many more APIs, for example, the Google Cloud Natural Language API to analyze sentiment on your text without having to use another tool.
Once the Cloud Function is deployed and the remote UDF definition is created on BigQuery, you are able to invoke the NLP API and return the data from it for use in your queries.
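A Cloud Function for this kind of enrichment could be shaped roughly as follows. To keep the sketch runnable without credentials or client libraries, the scoring step is injected as a callable; in a real deployment that callable would wrap a request to the Natural Language API. The names `make_sentiment_handler`, `score_text`, and `toy_score` are assumptions made for this example, and `toy_score` is only a stand-in, not a sentiment model.

```python
import json
from typing import Callable


def make_sentiment_handler(score_text: Callable[[str], float]):
    """Build an HTTP handler that scores one text value per input row."""

    def handler(request):
        payload = request.get_json()
        replies = []
        for args in payload["calls"]:
            text = args[0]
            # Pass SQL NULL through instead of calling the API for it.
            replies.append(None if text is None else score_text(text))
        return json.dumps({"replies": replies}), 200

    return handler


def toy_score(text: str) -> float:
    """Stand-in scorer: +1 per 'good', -1 per 'bad'."""
    words = text.lower().split()
    return float(words.count("good") - words.count("bad"))


# In production, replace toy_score with a function that calls the real API.
sentiment_udf = make_sentiment_handler(toy_score)
```

Injecting the scorer also makes the handler easy to test locally before it is deployed.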
Custom Vertex AI endpoint
Data scientists can integrate Vertex AI endpoints and other APIs for custom models, all from the SQL console.
Remember that remote UDFs are meant for scalar execution: each call returns a single value for each input row.
You are able to deploy a model to a Vertex AI endpoint, which is another API, and then call that endpoint from Cloud Functions.
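One possible shape for such a function is sketched below. Because BigQuery sends rows in batches, the handler forwards the whole batch to the model in a single request and then returns one prediction per row. The `predict` callable is an assumption standing in for whatever client call reaches your Vertex AI endpoint; `make_prediction_handler` is likewise a name invented for this example.

```python
import json
from typing import Callable, List, Sequence


def make_prediction_handler(predict: Callable[[List[list]], Sequence]):
    """Build an HTTP handler that scores each input row with a hosted model."""

    def handler(request):
        calls = request.get_json()["calls"]
        # Each row's argument list becomes one model instance (feature vector).
        instances = [list(args) for args in calls]
        # One batched request to the model for the whole set of rows.
        predictions = list(predict(instances))
        if len(predictions) != len(calls):
            message = "prediction count does not match row count"
            return json.dumps({"errorMessage": message}), 500
        # Scalar semantics: exactly one value per input row, in order.
        return json.dumps({"replies": predictions}), 200

    return handler
```

The length check guards the scalar contract: if the model returns more or fewer predictions than rows, the function reports an error rather than misaligning results.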
Try it out today