Introducing Prediction Private Endpoints for fast and secure serving on Vertex AI

By mullaned2002

August 25, 2021


One of the biggest challenges when serving machine learning models is delivering predictions in near real time. Whether you’re a retailer generating recommendations for users shopping on your site, or a food service company estimating delivery time, being able to serve results with low latency is crucial. That’s why we’re excited to announce Private Endpoints on Vertex AI, a new feature in Vertex Predictions. Through VPC Peering, you can set up a private connection to your endpoint so that your data never traverses the public internet, which increases security and lowers latency for online predictions.

Configuring VPC Network Peering

Before you make use of a Private Endpoint, you’ll first need to create connections between your VPC (Virtual Private Cloud) network and Vertex AI. A VPC network is a global resource that consists of regional virtual subnetworks, known as subnets, in data centers, all connected by a global network. You can think of a VPC network the same way you’d think of a physical network, except that it’s virtualized within GCP. If you’re new to cloud networking and would like to learn more, check out this introductory video on VPCs.

With VPC Network Peering, you can connect internal IP addresses across two VPC networks, regardless of whether they belong to the same project or the same organization. As a result, all traffic stays within Google’s network.
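One practical consequence of connecting internal IP addresses across two networks is that the subnet ranges on each side must not overlap. Here is a minimal sketch of that check using Python’s standard ipaddress module. The CIDR ranges and the function name find_overlaps are made up for illustration and do not come from any real project.

```python
import ipaddress


def find_overlaps(my_subnets, candidate_range):
    """Return the subnets in my_subnets that overlap candidate_range."""
    candidate = ipaddress.ip_network(candidate_range)
    return [
        cidr
        for cidr in my_subnets
        if ipaddress.ip_network(cidr).overlaps(candidate)
    ]


# Hypothetical ranges: two existing subnets in your VPC network, checked
# against two candidate ranges you might set aside for the peered service.
existing = ["10.128.0.0/20", "10.132.0.0/20"]
print(find_overlaps(existing, "10.10.0.0/16"))
print(find_overlaps(existing, "10.128.0.0/16"))
```

An empty result means the candidate range is free to use. Any subnet that is listed would collide with it.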

Deploying Models with Vertex Predictions

Vertex Predictions is a serverless way to serve machine learning models. You can host your model in the cloud and make predictions through a REST API. If your use case requires online predictions, you’ll need to deploy your model to an endpoint. Deploying a model to an endpoint associates physical resources with the model so it can serve predictions with low latency.
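To make the REST API part concrete, here is a small sketch of how the JSON body of an online prediction request could be assembled. It assumes the usual Vertex AI request shape with an "instances" list and an optional "parameters" object. The helper name build_predict_body and the sample feature values are hypothetical, and the shape of each instance depends entirely on your model.

```python
import json


def build_predict_body(instances, parameters=None):
    """Serialize instances (and optional parameters) into a JSON request body."""
    instances = list(instances)
    if not instances:
        raise ValueError("at least one instance is required")
    body = {"instances": instances}
    if parameters is not None:
        body["parameters"] = parameters
    return json.dumps(body)


# Hypothetical input: two rows of numeric features for a tabular model.
payload = build_predict_body([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(payload)
```

The resulting string is what you would send as the body of the HTTP POST to your endpoint’s prediction URL.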

When deploying a model to an endpoint, you can specify details such as the machine type and autoscaling parameters. Additionally, you now have the option to create a Private Endpoint. Because your data never traverses the public internet, Private Endpoints offer security benefits and also reduce the time between your system receiving a request and serving the prediction. The overhead introduced by Private Endpoints is minimal, so performance is nearly identical to DIY serving on GKE or GCE. There is also no payload size limit for models deployed on a Private Endpoint.
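The deployment settings mentioned here (machine type and autoscaling bounds) can be pictured as a small request document. The sketch below builds one as a plain Python dict. The field names follow the Vertex AI REST reference for deploying a model as best I understand it, so check them against the current documentation before relying on them. The helper name build_deploy_request, the model resource name, and the machine type are placeholders.

```python
def build_deploy_request(model_resource_name, machine_type,
                         min_replicas=1, max_replicas=1):
    """Build a deploy-model request body with a machine type and replica bounds."""
    if min_replicas < 1:
        raise ValueError("min_replicas must be at least 1")
    if max_replicas < min_replicas:
        raise ValueError("max_replicas must be >= min_replicas")
    return {
        "deployedModel": {
            "model": model_resource_name,
            "dedicatedResources": {
                "machineSpec": {"machineType": machine_type},
                # Autoscaling keeps the replica count between these bounds.
                "minReplicaCount": min_replicas,
                "maxReplicaCount": max_replicas,
            },
        }
    }


# Placeholder identifiers: substitute your own project, region, and model.
request = build_deploy_request(
    "projects/123456789012/locations/us-central1/models/987654321",
    "n1-standard-4",
    min_replicas=1,
    max_replicas=3,
)
print(request)
```

Validating the replica bounds before sending the request catches the most common configuration mistake early.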

Creating a Private Endpoint on Vertex AI is simple.

In the Models section of the Cloud console, select the model resource you want to deploy.

Next, select DEPLOY TO ENDPOINT.

In the window on the right-hand side of the console, navigate to the Access section and select Private. You’ll need to add the full name of the VPC network with which your deployment should be peered.
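The full network name in the last step is easy to get wrong. The sketch below assumes the format projects/PROJECT_NUMBER/global/networks/NETWORK_NAME, which uses the numeric project number rather than the project ID. That matches the Vertex AI documentation as far as I know, but verify it for your setup. The helper name full_network_name and the sample values are hypothetical.

```python
import re

# Compute Engine style resource names: lowercase letters, digits, and hyphens.
_NETWORK_NAME = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")


def full_network_name(project_number, network_name):
    """Return the full VPC network name for a project number and network."""
    project_number = str(project_number)
    if not project_number.isdigit():
        raise ValueError("expected the numeric project number, not the project ID")
    if not _NETWORK_NAME.fullmatch(network_name):
        raise ValueError(f"invalid network name: {network_name!r}")
    return f"projects/{project_number}/global/networks/{network_name}"


# Hypothetical project number and network name.
print(full_network_name(123456789012, "my-vpc"))
```

Passing a project ID such as my-project raises an error here, which mirrors the mistake most likely to cause a failed deployment.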

Note that many other managed services on GCP support VPC peering, such as Vertex Training, Cloud SQL, and Firestore. Endpoints is the latest to join that list.

What’s Next?

Now you know the basics of VPC Peering and how to use Private Endpoints on Vertex AI. If you want to learn more about configuring VPCs, check out this overview guide. And if you’re interested in learning more about how to use Vertex AI to support your ML workflow, check out this introductory video. Now it’s time for you to deploy your own ML model to a Private Endpoint for super speedy predictions!
