Thursday, November 30, 2023

Google Cloud and NVIDIA bring next-generation AI infrastructure and software for large scale models and generative AI applications to enterprises

Generative AI is quickly becoming a strategic imperative for businesses and organizations across all industries. Yet many organizations face a barrier to adopting generative AI because they don’t have access to the latest models or the AI infrastructure to support large workloads. These barriers prevent organizations from innovating into the next era of AI. 

We’re excited to continue the Google Cloud and NVIDIA collaboration to help companies accelerate generative AI and other modern AI workloads in a cost-effective, scalable, and sustainable way. We’re bringing together best-in-class GPUs from NVIDIA for large model inference and training with the latest AI models and managed tools for generative AI from Google Cloud. We believe that for customers to innovate with AI, they also need the best supporting technology. That’s why NVIDIA and Google Cloud are also coming together to offer leading capabilities for data analytics and integration of the best open-source tools.

In this blog, we highlight ways Google Cloud and NVIDIA are teaming up to help the most innovative AI companies succeed.

Accelerate and scale Generative AI in production 

Google Cloud and NVIDIA are partnering to provide leading capabilities across the AI stack that will help customers take advantage of one of the most influential technologies of our generation: generative AI. In March, Google Cloud launched Generative AI support in Vertex AI, which makes it possible for developers to access, tune, and deploy foundation models. For companies to effectively scale generative AI in production, they need high-efficiency, performant GPUs to support these large AI workloads. 

At GTC, NVIDIA announced that Google Cloud is the first cloud provider to offer the NVIDIA L4 Tensor Core GPU, which is purpose-built for large inference AI workloads like generative AI. The L4 GPU will be integrated with Vertex AI and delivers cutting-edge performance-per-dollar for AI inference workloads that run on GPUs in the cloud. Compared to previous-generation instances, the new G2 VM instance powered by NVIDIA L4 delivers up to 4x more performance. As a universal GPU offering, G2 VM instances also accelerate other workloads, offering significant performance improvements on HPC, graphics, and video transcoding. Currently in private preview, G2 VMs are both powerful and flexible, and scale easily from one up to eight GPUs.

We are also excited to work with NVIDIA to bring our customers the highest performance GPU offering for generative AI training workloads. With optimized support on Vertex AI for both A100 and L4 GPUs, users can both train and deploy generative AI models with the highest performance available on GPUs today. 

We’re excited to offer NVIDIA AI Enterprise software on Google Cloud Marketplace. NVIDIA AI Enterprise is a suite of software that accelerates the data science pipeline and streamlines development and deployment of production AI. With over 50 frameworks, pretrained models and development tools, NVIDIA AI Enterprise is designed to accelerate enterprises to the leading edge of AI, while also simplifying AI to make it accessible to every enterprise.

The latest release supports NVIDIA L4 and H100 Tensor Core GPUs, as well as prior GPU generations including A100 and more.   

Access to a wide variety of open source tools

We’ve worked with NVIDIA to make a wide range of GPUs accessible across Vertex AI’s Workbench, Training, Serving, and Pipeline services to support a variety of open-source models and frameworks. Whether an organization wants to accelerate their Spark, Dask and XGBoost pipelines or leverage PyTorch, TensorFlow, Keras or Ray frameworks for larger deep learning workloads, we have a range of GPUs throughout the Vertex AI Platform that can meet both performance and budget needs. These offerings allow users to take advantage of OSS frameworks and models in a managed and scalable way to accelerate the ML development and deployment lifecycle.

Improve efficiency of data preparation and model training 

Different workloads require different cluster configurations, owing to different goals, data sets, complexities, and timeframes. So, having a one-size-fits-all Spark cluster always at the ready is just not cost-effective or appropriate. Google has partnered with NVIDIA to make GPU-accelerated Spark available to Dataproc customers using the RAPIDS suite of open-source software libraries and APIs for executing data science pipelines entirely on GPUs, so that customers can tailor their Spark clusters to AI/ML workloads.

NVIDIA has been working with the Spark open-source community to implement GPU acceleration in the latest Spark version (3.x). This new version of Spark will let Dataproc customers accelerate various Spark-based AI/ML and ETL workloads without any code changes. Running on GPUs provides latency and cost improvements during data preparation and model training. Data science teams can tackle larger data sets, iterate faster, and tune models to maximize prediction accuracy and business value.
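To make "without any code changes" concrete: GPU acceleration in Spark 3.x is typically switched on through job configuration rather than by editing the job itself. The sketch below is a minimal illustration and not Dataproc's own tooling. The helper names (`rapids_spark_conf`, `to_submit_args`) and the default values are assumptions made for this example; the property keys are the standard Spark 3.x GPU scheduling settings and the RAPIDS Accelerator plugin settings.

```python
from typing import Dict, List


def rapids_spark_conf(gpus_per_executor: int = 1, tasks_per_gpu: int = 4) -> Dict[str, str]:
    """Build Spark properties that enable the RAPIDS Accelerator (hypothetical helper).

    The values are illustrative defaults, not tuned recommendations.
    """
    if gpus_per_executor < 1 or tasks_per_gpu < 1:
        raise ValueError("gpus_per_executor and tasks_per_gpu must be >= 1")
    return {
        # Load the RAPIDS SQL plugin so supported operators run on the GPU.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        "spark.rapids.sql.enabled": "true",
        # Spark 3.x GPU-aware scheduling: GPUs per executor, and the share
        # of one GPU that each task claims.
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        "spark.task.resource.gpu.amount": str(1 / tasks_per_gpu),
    }


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Flatten properties into repeated `--conf key=value` arguments."""
    args: List[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(rapids_spark_conf())))
```

The existing Spark job stays untouched; only the properties passed at submission time change, which is the sense in which no code changes are needed.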

Reduce carbon footprint of intensive AI workloads 

Google and NVIDIA are focused on helping users reduce the carbon footprint of their digital workloads. In addition to operating the cleanest cloud infrastructure in the industry, Google partners with NVIDIA to offer GPUs that can help increase the energy efficiency of computationally intensive workloads like AI. Accelerated computing not only delivers the best performance but is also the most energy-efficient form of compute, and it is essential to realizing AI’s full potential. For example, on the Green500 list of the world’s most efficient supercomputers, GPU-accelerated systems are 10x more energy-efficient than CPU-only systems. And when you carefully choose the Google Cloud region and the right GPU for training large models, Google researchers found you can reduce the carbon emissions of AI/ML training by as much as 1,000x.

Since data center location is such an important factor in reducing carbon emissions of the workload, Google Cloud users are presented with low-carbon icons in the resource creation workflow to help them choose the most carbon-free location to place NVIDIA GPUs on Google Cloud. 
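The same location choice can also be made programmatically when provisioning is scripted. The sketch below is a rough illustration under stated assumptions: the function name, the region labels, and the carbon-free energy percentages are all hypothetical placeholders, not real Google Cloud figures or APIs. It simply picks, from the regions that offer the needed GPU, the one with the highest carbon-free energy share.

```python
from typing import Mapping, Sequence


def pick_lowest_carbon_region(
    cfe_percent: Mapping[str, float],
    gpu_regions: Sequence[str],
) -> str:
    """Return the GPU-capable region with the highest carbon-free energy share.

    cfe_percent maps region name -> carbon-free energy percentage (0-100).
    gpu_regions lists the regions where the required GPU is available.
    """
    candidates = [r for r in gpu_regions if r in cfe_percent]
    if not candidates:
        raise ValueError("no GPU region has carbon data")
    return max(candidates, key=lambda r: cfe_percent[r])


if __name__ == "__main__":
    # Placeholder values for illustration only; not real measurements.
    cfe = {"region-a": 90.0, "region-b": 35.0, "region-c": 97.0}
    # Suppose the GPU is offered only in region-a and region-b.
    print(pick_lowest_carbon_region(cfe, ["region-a", "region-b"]))
```

Restricting the search to regions that actually offer the GPU matters: the greenest region overall is not a valid choice if the hardware is unavailable there.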

What customers are saying about Generative AI

A handful of early customers have been testing G2 and seeing great results in real-world applications. Here is what some of them have to say about the benefits that G2 with NVIDIA L4 GPUs brings:


AppLovin enables developers and marketers to grow with market-leading technologies. Businesses rely on AppLovin to solve their mission-critical functions with a powerful, full-stack solution including user acquisition, retention, monetization and measurement.

“AppLovin serves billions of AI-powered recommendations per day, so scalability and value are essential to our business,” said Omer Hasan, Vice President, Operations at AppLovin. “With Google Cloud’s G2 we’re seeing that NVIDIA L4 GPUs offer a significant increase in the scalability of our business, giving us the power to grow faster than ever before.”


WOMBO aims to unleash everyone’s creativity through the magic of AI, transforming the way content is created, consumed, and distributed. 

“WOMBO relies upon the latest AI technology for people to create immersive digital artwork from users’ prompts, letting them create high-quality, realistic art in any style with just an idea,” said Ben-Zion Benkhin, Co-Founder and CEO of WOMBO. “Google Cloud’s G2 instances powered by NVIDIA’s L4 GPUs will enable us to offer a better, more efficient image-generation experience for users seeking to create and share unique artwork.”


Descript’s AI-powered features and intuitive interface fuel YouTube and TikTok channels, top podcasts, and businesses using video for marketing, sales, and internal training and collaboration. Descript aims to make video a staple of every communicator’s toolkit, alongside docs and slides. 

“G2 with L4’s AI Video capabilities allow us to deploy new features augmented by natural-language processing and generative AI to create studio-quality media with excellent performance and energy efficiency,” said Kundan Kumar, Head of Artificial Intelligence at Descript.


Workspot believes that the software-as-a-service (SaaS) model is the most secure, accessible and cost-effective way to deliver an enterprise desktop and should be central to accelerating the digital transformation of the modern enterprise.

“The Workspot team looks forward to continuing to evolve our partnership with Google Cloud and NVIDIA. Our customers have been seeing incredible performance leveraging NVIDIA’s T4 GPUs. The new G2 instances with L4 GPUs through Workspot’s remote Cloud PC workstations provide 2x and higher frame rates at 1280×711 and higher resolutions,” said Jimmy Chang, Chief Product Officer at Workspot.

We’re excited to continue growing our strategic partnership with NVIDIA, and look forward to ongoing collaboration to bring generative AI services and accelerated cloud computing to customers. Learn more at



