Deploying your Generative AI model in only four steps with Vertex AI and PyTorch

By mullaned2002

May 15, 2023


Foundation models are trained on extensive unlabeled data and used for downstream generative AI tasks, such as text, image, and music generation. They are increasingly popular as businesses explore their potential to create new products and services. Image generation is one such use case. Diffusion models are generative models that have gained popularity in recent years because of the high-quality images they can generate. Stable Diffusion is a latent text-to-image diffusion model developed by researchers at CompVis, Stability AI, and LAION.

Deploying large models, like Stable Diffusion, can be challenging and time-consuming. In this blog, we will show how you can streamline the deployment of a PyTorch Stable Diffusion model by leveraging Vertex AI. PyTorch is the framework used by Stability AI for Stable Diffusion v1.5. Vertex AI is a fully managed machine learning platform with tools and infrastructure designed to help ML practitioners accelerate and scale ML in production with the benefit of open-source frameworks like PyTorch. In four steps you can deploy a PyTorch Stable Diffusion model (v1.5).

Deploying a PyTorch Stable Diffusion model as a Vertex AI Endpoint

Deploying your Stable Diffusion model on a Vertex AI Endpoint can be done in four steps:

Create a custom TorchServe handler.

Upload model artifacts to Google Cloud Storage (GCS).

Create a Vertex AI model with the model artifacts and a prebuilt PyTorch container image.

Deploy the Vertex AI model onto an endpoint.

Let’s have a look at each step in more detail. You can follow and implement the steps using the Notebook example.

Step 1 – Create a custom TorchServe handler

TorchServe is an easy and flexible tool for serving PyTorch models. The model deployed to Vertex AI uses TorchServe to handle requests and return responses from the model. You must create a custom TorchServe handler to include in the model artifacts uploaded to Vertex AI. Include the handler file in the directory with the other model artifacts, like this: model_artifacts/handler.py.
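As a rough illustration of what such a handler file might contain, here is a minimal sketch. It is not the handler from the notebook. It assumes the model files are stored in the Hugging Face diffusers layout inside the TorchServe model directory, that each request item carries the prompt under a hypothetical "prompt" key (or as raw "data"/"body"), and that images are returned as base64-encoded PNG strings. The class name StableDiffusionHandler is also illustrative.

```python
import base64
import io


class StableDiffusionHandler:
    """Sketch of a class-level TorchServe entry point.

    TorchServe calls initialize() once per worker and handle() per batch.
    """

    def __init__(self):
        self.initialized = False
        self.pipe = None

    def initialize(self, context):
        # Heavy imports are deferred so the module can be imported
        # on a machine without torch or diffusers installed.
        import torch
        from diffusers import StableDiffusionPipeline

        # Assumption: the model directory holds a diffusers-format pipeline.
        model_dir = context.system_properties.get("model_dir")
        device = "cuda" if torch.cuda.is_available() else "cpu"
        self.pipe = StableDiffusionPipeline.from_pretrained(model_dir).to(device)
        self.initialized = True

    def preprocess(self, requests):
        # Turn each request item into a plain-text prompt.
        # The "prompt" key is a hypothetical convention, not a Vertex AI requirement.
        prompts = []
        for req in requests:
            payload = req.get("prompt") or req.get("data") or req.get("body")
            if isinstance(payload, (bytes, bytearray)):
                payload = payload.decode("utf-8")
            prompts.append(str(payload))
        return prompts

    def inference(self, prompts):
        # The pipeline returns one PIL image per prompt.
        return self.pipe(prompts).images

    def postprocess(self, images):
        # Encode each image as a base64 PNG string so it fits in a JSON response.
        encoded = []
        for image in images:
            buffer = io.BytesIO()
            image.save(buffer, format="PNG")
            encoded.append(base64.b64encode(buffer.getvalue()).decode("utf-8"))
        return encoded

    def handle(self, data, context):
        return self.postprocess(self.inference(self.preprocess(data)))
```

Keeping preprocessing and postprocessing as separate methods makes them easy to test locally without loading the model.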

After creating the handler file, you must package the handler as a model archive (MAR) file using the Torch model archiver. The output file must be named model.mar.

```
!torch-model-archiver \
  -f \
  --model-name <your_model_name> \
  --version 1.0 \
  --handler model_artifacts/handler.py \
  --export-path model_artifacts
```
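Before uploading, it can help to sanity-check the archive locally. The sketch below is an optional check and not part of the original walkthrough. It assumes the archiver's default format, which to our understanding is a zip file containing a MAR-INF/MANIFEST.json entry; the helper name check_mar is hypothetical.

```python
import zipfile
from pathlib import Path


def check_mar(path):
    """Return a list of problems found with the archive (empty means it looks fine)."""
    mar = Path(path)
    problems = []
    # The text above requires the output file to be named model.mar.
    if mar.name != "model.mar":
        problems.append(f"expected file name model.mar, got {mar.name}")
    if not mar.is_file():
        problems.append(f"{mar} does not exist")
        return problems
    # Assumption: the default archive format is a zip file.
    if not zipfile.is_zipfile(mar):
        problems.append(f"{mar} is not a zip archive")
        return problems
    with zipfile.ZipFile(mar) as archive:
        # Assumption: the archiver writes its manifest at this path.
        if "MAR-INF/MANIFEST.json" not in archive.namelist():
            problems.append("MAR-INF/MANIFEST.json is missing")
    return problems


if __name__ == "__main__":
    for problem in check_mar("model_artifacts/model.mar"):
        print("Problem:", problem)
```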

Step 2 – Upload the model artifacts to Google Cloud Storage

The next step is uploading the model artifacts, such as the model file and the handler, to GCS. The advantage of storing your artifacts on GCS is that you can track them in a central bucket.

```
BUCKET_NAME = "your-bucket-name-unique"  # @param {type:"string"}
BUCKET_URI = f"gs://{BUCKET_NAME}/"

# Will copy the artifacts into the bucket
!gsutil cp -r model_artifacts $BUCKET_URI
```

Step 3 – Create the Vertex AI model

Once you’ve uploaded the model artifacts into a GCS bucket, you can upload your PyTorch model to Vertex AI Model Registry. From the Vertex AI Model Registry, you have an overview of your models so you can better organize, track, and train new versions. For this you can use the Vertex AI SDK and our pre-built PyTorch container.

```python
from google.cloud import aiplatform

# Prebuilt PyTorch serving container (GPU, PyTorch 1.12)
PYTORCH_PREDICTION_IMAGE_URI = (
    "us-docker.pkg.dev/vertex-ai/prediction/pytorch-gpu.1-12:latest"
)
MODEL_DISPLAY_NAME = "stable_diffusion_1_5-unique"
MODEL_DESCRIPTION = "stable_diffusion_1_5 container"

# BUCKET_NAME and BUCKET_URI are defined in step 2
aiplatform.init(project="your_project", location="us-central1", staging_bucket=BUCKET_NAME)

# Register the model artifacts in Vertex AI Model Registry
model = aiplatform.Model.upload(
    display_name=MODEL_DISPLAY_NAME,
    description=MODEL_DESCRIPTION,
    serving_container_image_uri=PYTORCH_PREDICTION_IMAGE_URI,
    artifact_uri=BUCKET_URI,
)
```

Step 4 – Deploy the model to an endpoint

Once the model has been uploaded to Vertex AI Model Registry, you can deploy it to a Vertex AI Endpoint. For this you can use the Console or the Vertex AI SDK. In this example you will deploy the model on an n1-standard-8 machine with one NVIDIA Tesla P100 GPU. You can specify a different machine type and accelerator to suit your needs.

```python
from google.cloud import aiplatform

ENDPOINT_DISPLAY_NAME = "my-stable-diffusion-endpoint"
endpoint = aiplatform.Endpoint.create(display_name=ENDPOINT_DISPLAY_NAME)

# `model` and MODEL_DISPLAY_NAME come from step 3
model.deploy(
    endpoint=endpoint,
    deployed_model_display_name=MODEL_DISPLAY_NAME,
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_P100",
    accelerator_count=1,
    traffic_percentage=100,
    deploy_request_timeout=1200,
    sync=True,
)
```

If you follow the notebook you can also get online predictions using the Vertex AI SDK.
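For orientation, an online prediction call could look roughly like the sketch below. It is not copied from the notebook. It assumes the handler reads the prompt from a hypothetical "prompt" key and returns each image as a base64-encoded string; the helper names build_instances and save_first_image are illustrative. The endpoint argument is the object returned by aiplatform.Endpoint.create in step 4.

```python
import base64


def build_instances(prompt):
    # Assumption: the custom handler expects a "prompt" key per instance.
    return [{"prompt": prompt}]


def save_first_image(endpoint, prompt, out_path):
    """Send one prompt to the endpoint and write the first returned image to disk."""
    response = endpoint.predict(instances=build_instances(prompt))
    # Assumption: each prediction is a base64-encoded image string.
    image_bytes = base64.b64decode(response.predictions[0])
    with open(out_path, "wb") as f:
        f.write(image_bytes)
    return len(image_bytes)
```

With a deployed endpoint, a call such as save_first_image(endpoint, "A dog wearing a baseball jersey", "img.png") would write the generated image to img.png, provided the handler follows the same request and response conventions.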

What’s next?

To learn more about PyTorch on Vertex AI, take a look at the documentation, which explains Vertex AI’s PyTorch integrations and provides resources that show you how to use PyTorch on Vertex AI. You’ll see how easy it is to train, deploy, and orchestrate models in production using PyTorch and Vertex AI. You can also have a look at the notebook that deploys a text classification model.
