Introducing three new NVIDIA GPU-based Amazon EC2 instances

By mullaned2002

November 27, 2023

The Amazon Elastic Compute Cloud (Amazon EC2) accelerated computing portfolio offers the broadest choice of accelerators to power your artificial intelligence (AI), machine learning (ML), graphics, and high performance computing (HPC) workloads. We are excited to announce the expansion of this portfolio with three new instances featuring the latest NVIDIA GPUs: Amazon EC2 P5e instances powered by NVIDIA H200 GPUs, Amazon EC2 G6 instances featuring NVIDIA L4 GPUs, and Amazon EC2 G6e instances powered by NVIDIA L40S GPUs. All three instances will be available in 2024, and we look forward to seeing what you can do with them.

AWS and NVIDIA have collaborated for over 13 years and have pioneered large-scale, highly performant, and cost-effective GPU-based solutions for developers and enterprises across the spectrum. We have combined NVIDIA’s powerful GPUs with differentiated AWS technologies such as the AWS Nitro System, 3,200 Gbps of Elastic Fabric Adapter (EFA) v2 networking, hundreds of GB/s of data throughput with Amazon FSx for Lustre, and exascale computing with Amazon EC2 UltraClusters to deliver the most performant infrastructure for AI/ML, graphics, and HPC. Coupled with other managed services such as Amazon Bedrock, Amazon SageMaker, and Amazon Elastic Kubernetes Service (Amazon EKS), these instances provide developers with the industry’s best platform for building and deploying generative AI, HPC, and graphics applications.

High-performance and cost-effective GPU-based instances for AI, HPC, and graphics workloads

To power the development, training, and inference of the largest large language models (LLMs), EC2 P5e instances will feature NVIDIA’s latest H200 GPUs. Each H200 offers 141 GB of HBM3e GPU memory, which is 1.7 times larger and 1.4 times faster than the memory on H100 GPUs. This boost in GPU memory, along with up to 3,200 Gbps of EFA networking enabled by the AWS Nitro System, will enable you to continue to build, train, and deploy your cutting-edge models on AWS.
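As a rough check on the 1.7x and 1.4x figures, the short sketch below recomputes both ratios. The H200 capacity of 141 GB comes from this post; the H100 capacity (80 GB) and the two bandwidth values (3.35 TB/s and 4.8 TB/s) are assumptions taken from NVIDIA's published specifications for the SXM parts and are not stated in this announcement.

```python
# Sanity check of the H200 vs. H100 memory comparison.
# Only H200_MEMORY_GB is stated in the post; the other three values are
# assumptions based on NVIDIA's published SXM specifications.
H100_MEMORY_GB = 80            # assumed: H100 SXM HBM3 capacity
H200_MEMORY_GB = 141           # from the post: H200 HBM3e capacity
H100_BANDWIDTH_TB_PER_S = 3.35  # assumed: H100 SXM memory bandwidth
H200_BANDWIDTH_TB_PER_S = 4.8   # assumed: H200 memory bandwidth


def ratio(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


memory_ratio = ratio(H200_MEMORY_GB, H100_MEMORY_GB)
bandwidth_ratio = ratio(H200_BANDWIDTH_TB_PER_S, H100_BANDWIDTH_TB_PER_S)

print(f"Memory capacity ratio:  {memory_ratio:.2f}x")
print(f"Memory bandwidth ratio: {bandwidth_ratio:.2f}x")
```

Under these assumed values, both ratios land in the ranges the post quotes (between 1.7 and 1.8 for capacity, and between 1.4 and 1.5 for bandwidth).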

EC2 G6e instances, featuring NVIDIA L40S GPUs, are built to provide developers with a broadly available option for training and inference of publicly available LLMs, and to support the increasing adoption of small language models (SLMs). They are also optimal for digital twin applications that use NVIDIA Omniverse for describing and simulating across 3D tools and applications, and for creating virtual worlds and advanced workflows for industrial digitalization.

EC2 G6 instances, featuring NVIDIA L4 GPUs, will deliver a lower-cost, energy-efficient solution for deploying ML models for natural language processing, language translation, video and image analysis, speech recognition, and personalization. They also suit graphics workloads such as creating and rendering real-time, cinematic-quality graphics and game streaming.

About the Author

Chetan Kapoor is the Director of Product Management for the Amazon EC2 Accelerated Computing Portfolio.
