
Announcing general availability of Ray on Vertex AI

Developers and engineers face several major challenges when scaling AI/ML workloads. One challenge is getting access to the AI infrastructure they need: AI/ML workloads require significant computational resources, such as CPUs and GPUs, and developers must have enough of them available to run their workloads. Another challenge is handling the diverse patterns and programming interfaces required to scale AI/ML workloads effectively. Developers may need to adapt their code to run efficiently on the specific infrastructure they have available, which can be a time-consuming and complex task.

To address these challenges, Ray provides a comprehensive and easy-to-use Python distributed framework. With Ray, you configure a scalable cluster of computational resources and utilize a collection of domain-specific libraries to efficiently distribute common AI/ML tasks like training, serving, and tuning. 
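
If you are new to Ray, the core pattern fits in a few lines: decorate ordinary Python functions with @ray.remote and let Ray schedule them across whatever CPUs and GPUs the cluster exposes. The sketch below is a minimal, illustrative example of that pattern; the function and workload are placeholders, not part of the Vertex AI integration itself.

import ray

# Start (or connect to) a Ray runtime; on a laptop this uses local cores,
# on a cluster it uses every node that has joined.
ray.init()

@ray.remote
def score_batch(batch):
    # Placeholder for any CPU/GPU-bound work: preprocessing,
    # feature extraction, model inference, and so on.
    return sum(batch) / len(batch)

# Fan the work out across the cluster and gather the results.
batches = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
futures = [score_batch.remote(b) for b in batches]
print(ray.get(futures))  # -> [2.0, 5.0, 8.0]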

Today, we are thrilled to announce that the integration of Ray, a powerful distributed Python framework, with Google Cloud’s Vertex AI is generally available. This integration empowers AI developers to effortlessly scale their AI workloads on Vertex AI’s versatile infrastructure, unlocking the full potential of machine learning, data processing, and distributed computing.

Why Ray on Vertex AI?

Accelerated and Scalable AI Development: Ray’s distributed computing framework provides a unified experience for both generative AI and predictive AI, which seamlessly integrates with Vertex AI’s infrastructure services. Scale your Python-based machine learning, deep learning, reinforcement learning, data processing, and scientific computing workloads from a single machine to a massive cluster, so you can tackle even the most demanding AI challenges without the complexity of managing the underlying infrastructure.

Unified Development Experience: By integrating Ray’s ergonomic API with the Vertex AI SDK for Python, AI developers can now seamlessly transition from interactive prototyping in their local development environment or in Vertex AI Colab Enterprise to production deployment on Vertex AI’s managed infrastructure with minimal code changes.

Enterprise-Grade Security: Vertex AI’s robust security features, including VPC Service Controls, Private Service Connect, and Customer-Managed Encryption Keys (CMEK), can help safeguard your sensitive data and models while leveraging the power of Ray’s distributed computing capabilities. Vertex AI’s comprehensive security framework can help ensure that your Ray applications comply with strict enterprise security requirements.

Get started with Ray and Vertex AI

Let’s assume that you want to tune a small language model (SLM) such as Llama or Gemma. To fine-tune Gemma with Ray on Vertex AI, you first need a Ray cluster, which you can create in just a few minutes using either the Google Cloud console or the Vertex AI SDK for Python. You can monitor the cluster either through the integration with Google Cloud Logging or using the Ray Dashboard.

Figure 1. Create a cluster using Ray on Vertex AI

Currently, Ray on Vertex AI supports Ray 2.9.3. Moreover, you can define a custom image, providing more flexibility in terms of the dependencies included in your Ray cluster. 
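
If you go the SDK route, the following sketch shows roughly what cluster creation looks like with the vertex_ray module that ships with the google-cloud-aiplatform[ray] package. The project, region, machine shapes, accelerator settings, and cluster name are illustrative placeholders, and the exact signature may evolve, so verify it against the cluster documentation linked at the end of this post.

from google.cloud import aiplatform
import vertex_ray
from vertex_ray import Resources

# Initialize the SDK against your project and region (placeholders).
aiplatform.init(project="your-project-id", location="us-central1")

# One head node plus a small GPU worker pool; adjust shapes to your workload.
head_node_type = Resources(machine_type="n1-standard-16", node_count=1)
worker_node_types = [
    Resources(
        machine_type="n1-standard-16",
        node_count=2,
        accelerator_type="NVIDIA_TESLA_T4",
        accelerator_count=1,
    )
]

# Returns the full resource name of the new Ray cluster on Vertex AI.
cluster_resource_name = vertex_ray.create_ray_cluster(
    head_node_type=head_node_type,
    worker_node_types=worker_node_types,
    cluster_name="gemma-tuning-cluster",
)
print(cluster_resource_name)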

After your Ray cluster is running, using Ray on Vertex AI to develop AI/ML applications is straightforward, and the process can vary based on your development environment. You can establish a connection to the Ray cluster and run your application interactively with the Vertex AI SDK for Python, either within Colab Enterprise or in any IDE you prefer. Alternatively, you can create a Python script and submit it to the Ray cluster on Vertex AI programmatically using the Ray Jobs API, as you can see below.

Figure 2. Tuning Gemma on Ray on Vertex AI
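
Concretely, the programmatic path uses Ray’s standard Jobs API pointed at the cluster you created above. The sketch below is illustrative only: the vertex_ray:// address scheme, the cluster resource name, and the tune_gemma.py entrypoint are placeholders, and the tuning script itself is what Figure 2 summarizes.

import vertex_ray  # assumption: registers the vertex_ray:// scheme from google-cloud-aiplatform[ray]
from ray.job_submission import JobSubmissionClient

# Placeholder: the resource name returned when the cluster was created.
cluster_resource_name = "projects/your-project/locations/us-central1/persistentResources/your-cluster"

# Point the standard Ray Jobs client at the Ray cluster on Vertex AI.
client = JobSubmissionClient(f"vertex_ray://{cluster_resource_name}")

job_id = client.submit_job(
    # Hypothetical tuning script; any Python entrypoint works here.
    entrypoint="python tune_gemma.py",
    runtime_env={
        "working_dir": "./gemma-tuning",
        "pip": ["torch", "transformers", "datasets"],
    },
)
print(f"Submitted job: {job_id}")
print(client.get_job_status(job_id))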

Using Ray on Vertex AI for developing AI/ML applications offers various benefits. In this scenario, you can leverage Vertex AI TensorBoard to validate your tuning jobs. Vertex AI TensorBoard provides a managed TensorBoard service that enables you to track, visualize, and compare your tuning jobs, and to collaborate effectively with your team. You can also use Cloud Storage to conveniently store model checkpoints, metrics, and more. This allows you to quickly consume the model in downstream AI/ML tasks, including generating batch predictions using Ray Data, as you can see in the following code.

# Libraries
import datasets
import ray

# Load the tuning dataset and convert it into a Ray Dataset.
input_data = datasets.load_dataset(dataset_id)
ray_input_data = ray.data.from_huggingface(input_data)

# Run distributed batch inference with the tuned model.
predictions_data = ray_input_data.map_batches(
    Summarizer,
    concurrency=config["num_gpus"],
    num_gpus=1,
    batch_size=config["batch_size"])

# Store resulting predictions
predictions_data.write_json("your-bucket-uri/preds.json", try_create_dir=True)

How H-E-B and eDreams scale AI with Ray on Vertex AI

For any large operation, and especially in grocery retail, accurate demand forecasting is directly tied to the profitability of the business. It’s hard enough to get the forecast right for one item; now imagine forecasting millions of items across hundreds of stores. Scaling the forecasting model is not an easy task. H-E-B, one of the largest grocery chains in the US, uses Ray on Vertex AI to achieve speed, reliability, and cost savings.

“Ray has enabled us to achieve transformative efficiencies that have been critical to our business. We especially appreciate Ray’s easy to use API and enterprise capabilities,” said Philippe Dagher, Principal Data Scientist at H-E-B. “We are excited about the increased accessibility to Vertex AI’s infrastructure that the integration of Ray on Vertex presents, so much that we have chosen it as our production ML platform.”

eDreams ODIGEO, the world’s leading travel subscription platform and one of the largest e-commerce businesses in Europe, offers best-quality products across regular flights, low-cost airlines, hotels, dynamic packages, car rental, and travel insurance to make travel easier, more accessible, and better value for consumers across the globe. The company processes 100 million daily user searches, combining travel options from nearly 700 global airlines and 2.1 million hotels, enabled by 1.8 billion daily machine learning predictions.

The eDreams ODIGEO Data Science team is currently using Ray on Vertex AI to train its ranking models, enabling the best travel experiences at the best price with minimum effort.

José Luis González, eDreams ODIGEO Data Science Director, said, “We are creating the best ranking models, personalized to the preferences of our 5.4 million Prime customers at scale, with the largest base of accommodation and flight options. With Ray on Vertex AI taking care of the infrastructure for distributed hyper-parameter tuning, we are focusing on building the best experience to drive better value for our customers.”

What’s next 

Are you trying to scale AI/ML applications but struggling to do so? Start by creating a Ray cluster on Vertex AI in the Google Cloud console – new customers get $300 in free credits on signup.

Ray on Vertex AI will empower you to build innovative and scalable applications. With Ray on Vertex AI, it’s never been easier to scale both your generative AI and predictive workloads on Vertex AI and unlock new possibilities for your organization!

If you want to know more about Ray on Vertex AI, join the vibrant Ray community and Vertex AI Google Cloud community to share your experiences, ask questions, and collaborate on new projects. Also check out the following resources:

Documentation

Create a Ray cluster on Vertex AI 
Develop an application on the Ray cluster on Vertex AI
Deploy a model on Vertex AI and get predictions

Github samples

Ray on Vertex AI for Cluster management
Get started with PyTorch on Ray on Vertex AI
Tuning and serving Gemma on Ray on Vertex AI

Community blog posts

Ray on Vertex AI: Let’s get it started
Is it Pop or Rock? Classify songs with Hugging Face and Ray on Vertex AI

