Wednesday, December 4, 2024

Harnessing the power of PaLM in BigQuery

IDC estimates that by 2025, there will be 175 zettabytes of data in the world, and 80% of that data will be unstructured. However, 90% of unstructured data is never analyzed. That’s because it can be cumbersome, expensive and risky to extract and transform unstructured data, requiring multiple tools. As such, it’s rarely used in organizations’ data pipelines. 

Google Cloud’s recent innovations in generative AI, including foundation models for text and vision, open up new avenues for data teams to harness this untapped unstructured data. Object tables, a new table type in BigQuery, provide a structured record interface for unstructured data stored in Cloud Storage, unlocking additional possibilities.

Today, we are taking it one step further by integrating BigQuery with Vertex AI foundation models, so you can analyze unstructured data from right inside BigQuery. This brings generative AI directly to where your data resides, with numerous benefits:

Eliminates the need to build and manage data pipelines between BigQuery and generative AI model APIs

Streamlines governance and helps reduce the risk of data loss by avoiding data movement 

Reduces the need to write and manage custom Python code to call AI models

Enables you to analyze data at petabyte-scale without compromising on performance

Can lower your total cost of ownership with a simplified architecture 

All this is made possible by the BigQuery ML inference engine, which offers machine learning capabilities right inside BigQuery and recently became generally available. For each of the last two years, BigQuery ML has seen over 250% YoY query growth. This year, customers have run over 300 million prediction and training queries in BigQuery ML.

Starting with the first supported foundation model, text analysis via PaLM 2 (text-bison), you can now write just a few lines of SQL in BigQuery ML to analyze unstructured data for advanced text processing tasks such as summarization or sentiment analysis, retrieve the results in a structured format, and use them with other data for further analysis.

How does it work?

Under the hood, BigQuery ML’s inference engine uses the ML.GENERATE_TEXT function to call Vertex AI text-bison models from the Model Garden. Using this feature takes two simple steps:

1. Register the model as a remote model

    CREATE MODEL my_project.my_company.llm_model
    REMOTE WITH CONNECTION my_project.us.remote_connection_name
    OPTIONS (remote_service_type = 'CLOUD_AI_LARGE_LANGUAGE_MODEL_V1')

2. Run inference. Here’s an example where users can do data enrichment by obtaining the country name for a given city name. Note that “city” is a column in the “example_table”.

    SELECT * FROM
    ML.GENERATE_TEXT (
      MODEL `my_company.llm_model`,
      (SELECT CONCAT ("Give the country name for city: ", city) AS prompt
       FROM example_table),
      STRUCT ( 0.2 AS temperature,
               1024 AS max_output_tokens,
               0.8 AS top_p,
               40 AS top_k))
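By default, the query in step 2 returns the model response as JSON. The variant below is a minimal sketch of how the same enrichment could return the generated text as a plain string next to the input column. It assumes the remote model from step 1, that example_table resolves in your default dataset, and that the flatten_json_output option and the ml_generate_text_llm_result output column behave as described in the ML.GENERATE_TEXT documentation.

```sql
-- Sketch only: same enrichment as step 2, with the answer as a STRING column.
-- Assumes `my_company.llm_model` is the remote model registered in step 1
-- and that example_table has a STRING column named city.
SELECT
  city,
  ml_generate_text_llm_result AS country  -- generated text, flattened from the JSON response
FROM
  ML.GENERATE_TEXT(
    MODEL `my_company.llm_model`,
    (
      SELECT
        city,  -- input columns are passed through to the output
        CONCAT('Give the country name for city: ', city) AS prompt
      FROM example_table
    ),
    STRUCT(
      0.2 AS temperature,
      1024 AS max_output_tokens,
      0.8 AS top_p,
      40 AS top_k,
      TRUE AS flatten_json_output));
```

Selecting city inside the prompt subquery keeps each answer paired with the row that produced it, which makes the result easy to join back to other tables.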

How customers are leveraging PaLM in BigQuery

Early users of the BigQuery and Vertex AI foundation model integration have expressed tremendous interest in applying it to use cases across industries. For instance, ML.GENERATE_TEXT can simplify advanced data processing tasks:

Content generation: Analyze customer feedback and generate personalized email content right inside BigQuery without the need for complex tools

Summarization: Summarize text stored in BigQuery columns such as online reviews or chat transcripts

Data enhancement: Obtain a country name for a given city name

Rephrasing: Correct spelling and grammar in textual content such as voice-to-text transcriptions

Feature extraction: Extract key information or words from large bodies of text such as online reviews and call transcripts

Sentiment analysis: Understand human sentiment about specific subjects in a text
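As one illustration of the tasks listed above, a sentiment analysis query might look like the following sketch. The table my_company.reviews and its columns review_id and review_text are hypothetical names introduced for this example, and the prompt wording is only one possible choice.

```sql
-- Sketch only: one-word sentiment label per review.
-- my_company.reviews, review_id and review_text are hypothetical names.
SELECT
  review_id,
  ml_generate_text_llm_result AS sentiment
FROM
  ML.GENERATE_TEXT(
    MODEL `my_company.llm_model`,
    (
      SELECT
        review_id,
        CONCAT(
          'Classify the sentiment of this review as positive, negative, or neutral. ',
          'Answer with one word. Review: ',
          review_text) AS prompt
      FROM my_company.reviews
    ),
    STRUCT(
      0.0 AS temperature,        -- low temperature for more repeatable labels
      8 AS max_output_tokens,    -- a one-word answer needs very few tokens
      TRUE AS flatten_json_output));
```

The other tasks follow the same pattern: only the prompt text and the generation parameters change, for example a longer max_output_tokens for summarization.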

Faraday, a leading customer prediction platform, previously had to build data pipelines and join multiple datasets. Now, not only can they simplify sentiment analysis, but they can also take customer sentiment, join it with additional first-party customer data, and feed it back into the LLMs to generate hyper-personalized content, all within BigQuery. Watch this demo video to learn more.
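The pattern described here (score sentiment, join it with first-party data, then prompt the model again) can be sketched in two statements. This is not Faraday's actual implementation. The tables my_company.feedback, my_company.customers and my_company.feedback_sentiment, along with all their columns, are hypothetical names chosen for illustration.

```sql
-- Sketch only: all table and column names below are hypothetical.

-- Statement 1: label each piece of feedback and store the result,
-- so the model is not called again when the labels are reused.
CREATE OR REPLACE TABLE my_company.feedback_sentiment AS
SELECT
  customer_id,
  TRIM(ml_generate_text_llm_result) AS sentiment
FROM
  ML.GENERATE_TEXT(
    MODEL `my_company.llm_model`,
    (
      SELECT
        customer_id,
        CONCAT(
          'Classify the sentiment of this feedback as positive, negative, or neutral. ',
          'Answer with one word. Feedback: ',
          feedback_text) AS prompt
      FROM my_company.feedback
    ),
    STRUCT(0.0 AS temperature, 8 AS max_output_tokens, TRUE AS flatten_json_output));

-- Statement 2: join the labels with first-party customer data and
-- ask the model for a personalized email draft per customer.
SELECT
  customer_id,
  ml_generate_text_llm_result AS email_draft
FROM
  ML.GENERATE_TEXT(
    MODEL `my_company.llm_model`,
    (
      SELECT
        s.customer_id,
        CONCAT(
          'Write a short, friendly email to ', c.first_name,
          ', whose recent feedback was ', s.sentiment,
          ' and whose favorite product category is ', c.favorite_category, '.') AS prompt
      FROM my_company.feedback_sentiment AS s
      JOIN my_company.customers AS c
        ON s.customer_id = c.customer_id
    ),
    STRUCT(0.7 AS temperature, 512 AS max_output_tokens, TRUE AS flatten_json_output));
```

Materializing the sentiment labels in a table is a design choice made for this sketch: it keeps the intermediate result queryable with ordinary SQL and keeps the two model calls independent.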

“Faraday’s clients already get the benefit of predictions made from structured data. Now that Google has integrated BigQuery and Vertex AI foundation models, we can scalably predict business outcomes using unstructured data too.” – Seamus Abshere, CTO, Faraday

Getting started

To learn more, visit the documentation page, or try out this tutorial to extract keywords from text.

Source: Cloud Blog
