Serving PyTorch models with prebuilt containers on Vertex AI

By mullaned2002

February 17, 2023


Machine learning (ML) practitioners using PyTorch tell us that it can be challenging to advance their ML project beyond experimentation. That’s why over the last year, we’ve prioritized development work that makes it easier for PyTorch users to deploy models in the cloud using Vertex AI. Vertex AI is a fully managed machine learning platform with tools, workflows, and infrastructure designed to help ML practitioners accelerate and scale ML in production with the benefit of open-source tools.

We are excited to announce that Vertex AI now offers support for pre-built PyTorch serving containers, which makes it easier to bring your PyTorch models into production. You don’t have to build a custom container to serve your PyTorch model. With pre-built containers, we’ve streamlined the ML lifecycle for PyTorch users. This post describes how to deploy your own PyTorch models on Vertex AI. For more details, you can also have a look at the documentation.

Deploy a PyTorch model in three steps

Step 1 – Package your PyTorch model

The first step is to package your trained PyTorch model, including any default or custom handlers, into an archive file using Torch model archiver. The handlers help with the following:

Pre-processing input data into the expected format

Customizing how the model is invoked

Post-processing output from the model
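As a rough illustration of these three stages, the sketch below mimics the shape of a handler in plain Python. It does not import TorchServe: a real custom handler would typically subclass BaseHandler from ts.torch_handler.base_handler and work with a loaded PyTorch model. The class name SketchHandler, the stand-in model function, and the request format (a list of dicts with a "data" key) are assumptions made for illustration only.

```python
from typing import Any, Callable, Dict, List


class SketchHandler:
    """Plain-Python stand-in that mirrors the three handler stages."""

    def __init__(self, model: Callable[[List[float]], List[float]]):
        # In TorchServe the model is loaded during initialization;
        # here any callable serves as a hypothetical model.
        self.model = model

    def preprocess(self, requests: List[Dict[str, Any]]) -> List[float]:
        # Pre-processing: convert each request payload into the
        # numeric format the model expects.
        return [float(request["data"]) for request in requests]

    def inference(self, batch: List[float]) -> List[float]:
        # Customizing how the model is invoked.
        return self.model(batch)

    def postprocess(self, outputs: List[float]) -> List[Dict[str, float]]:
        # Post-processing: shape the raw output into one response per request.
        return [{"score": round(value, 3)} for value in outputs]

    def handle(self, requests: List[Dict[str, Any]]) -> List[Dict[str, float]]:
        # Run the three stages in order for one batch of requests.
        return self.postprocess(self.inference(self.preprocess(requests)))


def double(batch: List[float]) -> List[float]:
    # Hypothetical model: doubles every input value.
    return [2.0 * value for value in batch]


handler = SketchHandler(double)
result = handler.handle([{"data": "1.5"}, {"data": 2}])
print(result)
```

Keeping the stages in separate methods means you can override only the one you need, for example replacing preprocess to decode images while reusing the default inference step.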

After defining your handlers, you create the model archive file using the Torch model archiver. The pre-built PyTorch image requires the archived model file to be named model.mar, so you need to set the model name as model.
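The archiving step can be sketched as a small Python helper that assembles the torch-model-archiver command line. The file names (artifacts/model.pt, handler.py, index_to_name.json) and the helper name build_archiver_command are hypothetical; the flags are the ones the archiver CLI documents, but check them against the version you have installed.

```python
import shlex
from typing import List, Optional


def build_archiver_command(
    serialized_file: str,
    handler: str,
    export_path: str,
    version: str = "1.0",
    extra_files: Optional[List[str]] = None,
) -> List[str]:
    """Build the argument list for the torch-model-archiver CLI."""
    command = [
        "torch-model-archiver",
        "--force",  # overwrite an existing archive in export_path
        "--model-name", "model",  # produces model.mar, as the pre-built image requires
        "--version", version,
        "--serialized-file", serialized_file,
        "--handler", handler,
        "--export-path", export_path,
    ]
    if extra_files:
        # The CLI takes extra files as one comma-separated value.
        command += ["--extra-files", ",".join(extra_files)]
    return command


# Hypothetical paths; replace them with your own artifacts.
command = build_archiver_command(
    serialized_file="artifacts/model.pt",
    handler="handler.py",
    export_path="artifacts",
    extra_files=["index_to_name.json"],
)
print(shlex.join(command))
```

With the archiver installed (pip install torch-model-archiver), you could run the result with subprocess.run(command, check=True) and then copy the resulting model.mar to a Cloud Storage location.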

Step 2 – Upload the model to Vertex AI with the pre-built PyTorch serving container image

After you package the PyTorch model, you upload it to the Vertex AI Model Registry, where you can track and manage all of your models and quickly deploy them to Vertex AI endpoints. You can use the Vertex AI SDK and the pre-built PyTorch serving image to upload the PyTorch model. The Vertex AI SDK provides an optimized experience for interacting with the Vertex AI APIs. Your code will look something like this:

```python
from google.cloud import aiplatform as vertexai

# initialize the Vertex AI SDK
vertexai.init(project=PROJECT_ID, staging_bucket=BUCKET_NAME)

# upload the PyTorch model
model = vertexai.Model.upload(
    display_name=model_display_name,
    description=model_description,
    serving_container_image_uri=serving_container_image_uri,
    artifact_uri=ARCHIVED_MODEL_GCS_URI,
)

model.wait()
```
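The upload call relies on several names that the post leaves undefined. The sketch below shows one way they might be filled in. Every value is a hypothetical placeholder, and the image URI in particular is an assumption about the naming scheme of the pre-built PyTorch images, so confirm it against the documentation's list of available containers. The helper also assumes that artifact_uri should point at the Cloud Storage directory that holds model.mar rather than at the file itself.

```python
from typing import Dict

# Assumed image URI; verify the current name and version in the documentation.
SERVING_CONTAINER_IMAGE_URI = (
    "us-docker.pkg.dev/vertex-ai/prediction/pytorch-cpu.1-11:latest"
)


def build_upload_kwargs(
    archive_dir_uri: str,
    display_name: str,
    description: str = "",
) -> Dict[str, str]:
    """Collect keyword arguments for Model.upload after basic checks."""
    if not archive_dir_uri.startswith("gs://"):
        raise ValueError("artifact location must be a gs:// URI")
    if archive_dir_uri.endswith(".mar"):
        # Assumption: pass the directory containing model.mar, not the file.
        raise ValueError("pass the directory that contains model.mar")
    return {
        "display_name": display_name,
        "description": description,
        "serving_container_image_uri": SERVING_CONTAINER_IMAGE_URI,
        "artifact_uri": archive_dir_uri,
    }


# Hypothetical bucket and model names.
upload_kwargs = build_upload_kwargs(
    archive_dir_uri="gs://my-bucket/pytorch-model/",
    display_name="pytorch-text-classifier",
    description="Example PyTorch model served with a pre-built container",
)
print(upload_kwargs)
```

After initializing the SDK, the resulting dictionary could be passed along as vertexai.Model.upload(**upload_kwargs).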

Step 3 – Create a Vertex AI endpoint and deploy the PyTorch model

The third, and last, step is to create a Vertex AI endpoint and deploy the PyTorch model to the endpoint. For this, you can also use the Vertex AI SDK or you can deploy it through the Google Cloud Console. First, you need to create an endpoint.

```python
endpoint_display_name = f"pytorch-endpoint-{TIMESTAMP}"
endpoint = vertexai.Endpoint.create(display_name=endpoint_display_name)
```

Next, deploy the model into the endpoint so it can serve online predictions with low latency.

```python
# Deploy your PyTorch model as an endpoint
endpoint = model.deploy(
    endpoint=endpoint,
    deployed_model_display_name=deployed_model_display_name,
    machine_type=machine_type,
    traffic_percentage=traffic_percentage,
    sync=sync,
)
```

Once your model is deployed, you can integrate it with your business application(s). You can test your endpoint with the Vertex AI SDK by calling endpoint.predict(instances=test_instance), from Cloud Shell, or in the Google Cloud Console.
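What test_instance should contain depends on what your handler expects. As one possible shape, the sketch below base64-encodes a raw payload and wraps it in a "data" field. Both the key names and the helper name build_test_instance are assumptions for illustration, not a format the post specifies.

```python
import base64
from typing import Dict, List


def build_test_instance(payload: bytes) -> List[Dict[str, Dict[str, str]]]:
    """Wrap a raw payload as a single base64-encoded prediction instance."""
    encoded = base64.b64encode(payload).decode("utf-8")
    # Assumed request shape; adapt it to what your handler's
    # pre-processing step actually reads.
    return [{"data": {"b64": encoded}}]


test_instance = build_test_instance(b"a sample input for the model")
print(test_instance)
```

With a deployed endpoint, you could then call prediction = endpoint.predict(instances=test_instance) and read the results from prediction.predictions.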

What’s next?

To learn more about PyTorch on Vertex AI, take a look at the documentation, which explains Vertex AI’s PyTorch integrations and provides resources that show you how to use PyTorch on Vertex AI. You’ll see how easy it is to train, deploy, and orchestrate models in production using PyTorch and Vertex AI. You can also have a look at the notebook that shows how to deploy and host a generative vision model on Vertex AI or try this notebook that deploys a text classification model.
