Gemma is now available on Google Cloud

By mullaned2002

February 21, 2024

116

Today, Google announced Gemma, a family of lightweight, state-of-the art open models built from the same research and technology that we used to create the Gemini models. We’re pleased to share that Google Cloud customers can get started today customizing and building with Gemma models in Vertex AI and running them on Google Kubernetes Engine (GKE). The release of Gemma and our updated platform capabilities are the next phase of our commitment to making AI more open and accessible to developers on Google Cloud.

Gemma is now available on Google Cloud

Gemma models share technical and infrastructure components with our capable Gemini models. This enables Gemma models to achieve best-in-class performance for their sizes compared to other open models. We are releasing weights in two sizes: Gemma 2B and Gemma 7B. Each size is released with pre-trained and instruction-tuned variants to enable both research and development.

Gemma supports tools that Google Cloud developers love and use today, including Colab and Kaggle notebooks, as well as frameworks like JAX, PyTorch, Keras 3.0, and Hugging Face Transformers. Gemma models can run on a laptop, workstation, or on Google Cloud. With these new open models, developers can now build with and customize in Vertex AI and run it on GKE. To maximize industry-leading performance, we have collaborated with NVIDIA to optimize Gemma for NVIDIA GPUs.

Unlocking the power of Gemma in Vertex AI

Gemma joins over 130 models in Vertex AI Model Garden, including our recently announced expanded access to Gemini: Gemini 1.0 Pro, 1.0 Ultra, and 1.5 Pro models.

By using Gemma models on Vertex AI, developers can take advantage of an end-to-end ML platform that makes tuning, managing, and monitoring models simple and intuitive. With Vertex AI, builders can reduce operational overhead and focus on creating bespoke versions of Gemma that are optimized for their use case. For example, using Gemma models on Vertex AI, developers can:

Build generative AI apps for lightweight tasks such as text generation, summarization, and Q&AEnable research and development using lightweight-but-customized models for exploration and experimentationSupport real-time generative AI use cases that require low latency, such as streaming text

Vertex AI makes it easy for developers to turn their own tuned models into scalable endpoints that can power AI applications of all sizes.

Scale from prototype to production with Gemma on GKE

GKE provides tools to build custom apps, from prototyping simple projects to rolling them out at enterprise scale. Today, developers can also deploy Gemma directly on GKE to create their own gen AI apps for building prototypes or testing model capabilities:

Deploy custom, fine-tuned models in portable containers alongside applications using familiar toolchainsCustomize model serving and infrastructure configurations without the need to provision or maintain nodesIntegrate AI Infrastructure fast with the ability to scale to meet the most demanding training and inference scenarios

GKE delivers efficient resource management, consistent ops environments, and autoscaling. In addition, it helps enhance these environments with easy orchestration of Google Cloud AI accelerators, including GPUs and TPUs, for faster training and inference when building generative AI models.

Get started with Gemma on Google Cloud today

You can start working with Gemma models today on Google Cloud in Vertex AI and GKE. For more information about Gemma, access quickstart guides on ai.google.dev/gemma.

Cloud BlogRead More

Previous articleKaren Buffo, CMO of MixMode, on the Rise of AI in Safeguarding Digital Assets

Next articleCount Missing Values in SAS (with Examples)

Gemma is now available on Google Cloud

Gemma is now available on Google Cloud

Unlocking the power of Gemma in Vertex AI

Scale from prototype to production with Gemma on GKE

Get started with Gemma on Google Cloud today

Leverage enterprise data with Denodo and Vertex AI for generative AI applications

TypeScript takes aim at truthy and nullish bugs

Hex-LLM: High-efficiency large language model serving on TPUs in Vertex AI Model Garden

LEAVE A REPLY Cancel reply

Most Popular

Schneider Electric automates Salesforce account hierarchy management with generative artificial intelligence (AI) using Amazon Aurora and Amazon Bedrock

Leverage enterprise data with Denodo and Vertex AI for generative AI applications

TypeScript takes aim at truthy and nullish bugs

Make relevant movie recommendations using Amazon Neptune, Amazon Neptune Machine Learning, and Amazon OpenSearch Service

Recent Comments

EDITOR PICKS

Exploring the Click Element Variable in Google Tag Manager

How to track events with Google Tag Manager and Google Analytics

Data Layer Variable in GTM: What, Why, and Where?

POPULAR POSTS

Why xHE-AAC is being embraced at Meta

Log-based Metrics in Action

How to work with dates in conditions that do not accept date data types

POPULAR CATEGORY