IT prediction: The era of workload-optimized, ultra-reliable infrastructure is upon us

By mullaned2002

December 13, 2022

298

Editor’s note: This post is part of an ongoing series on IT predictions from Google Cloud experts. Check out the full list of our predictions on how IT will change in the coming years.

Prediction: By 2025, over half of cloud infrastructure decisions will be automated by AI and ML

Google’s infrastructure is designed with scale-out capabilities to support billions of people, powering services like Search, YouTube, and Gmail every day. To do that, we’ve had to pioneer global-scale computing and storage systems and shorten network latency and distance limitations with new innovations. Along the way, we’ve come to see cloud infrastructure as more than a simple commodity — it’s a source of inspiration and new capabilities.

But even as the demand on the industry’s cloud infrastructure continues to increase, there are simultaneously plateaus in the efficiency available from the underlying hardware. In the past, we saw annual performance gains of 30-40%, levels that often enabled a single infrastructure configuration to meet the needs of the vast majority of workloads. As these improvements have slowed and new workloads such as AI/ML and analytics have emerged, we have seen a corresponding explosion in the variety and capability of infrastructure. While empowering, the burden of picking the right combination of infrastructure components for a given workload still falls on an organization’s cloud architects.

But by 2025, we predict that the burden and complexity of infrastructure decision making will disappear through the power of AI and ML automation, which will automatically combine purpose-built infrastructure, prescriptive architectures, and an ecosystem to deliver a workload-optimized, ultra-reliable infrastructure. The focus for cloud architects will therefore be on enabling business logic and innovation, rather than how that logic maps to underlying infrastructure.

Already, we are making investments to turn this vision into reality, building custom silicon like the Infrastructure Processing Unit (IPU) for our new C3 VMs or a liquid-cooled board for the new tensor processing unit. The latter, the TPU v4 platform, is likely the world’s fastest, largest, and most efficient machine learning supercomputer. It can train large-scale workloads up to 80% faster and 50% cheaper than alternatives. Put another way, TPU v4 will nearly double the performance of critical ML and AI services at half the cost, unlocking new possibilities for what organizations can achieve when leveraging large-scale learning and inference for business services.

The TPUv4

These same IPUs and TPUs represent the foundation that will make it possible to automate cloud infrastructure decisions. They’ll be able to support the telemetry data and ML-based analytics for proactive infrastructure recommendations that will increase the performance and reliability of workloads.

Instead of determining hardware specifications and building the right infrastructure, you’ll only need to specify a workload. AI and ML will take over the burden and recommend, configure, and identify the best options based on your budgetary, performance, and scaling requirements. What is most exciting for us is how this will enable a much more rapid pace of service innovation, which is the primary end goal of great cloud infrastructure.

Cloud BlogRead More

Previous articleHow a steel distributor reinvents its data science & ML workflows with Vertex AI

Next articleHarness the power of data and AI in your life science supply chain

IT prediction: The era of workload-optimized, ultra-reliable infrastructure is upon us

Prediction: By 2025, over half of cloud infrastructure decisions will be automated by AI and ML

Leverage enterprise data with Denodo and Vertex AI for generative AI applications

TypeScript takes aim at truthy and nullish bugs

Hex-LLM: High-efficiency large language model serving on TPUs in Vertex AI Model Garden

LEAVE A REPLY Cancel reply

Most Popular

Schneider Electric automates Salesforce account hierarchy management with generative artificial intelligence (AI) using Amazon Aurora and Amazon Bedrock

Leverage enterprise data with Denodo and Vertex AI for generative AI applications

TypeScript takes aim at truthy and nullish bugs

Make relevant movie recommendations using Amazon Neptune, Amazon Neptune Machine Learning, and Amazon OpenSearch Service

Recent Comments

EDITOR PICKS

Exploring the Click Element Variable in Google Tag Manager

How to track events with Google Tag Manager and Google Analytics

Data Layer Variable in GTM: What, Why, and Where?

POPULAR POSTS

Cloud Load Balancing enhancements improve security and distributed application support

Real-time analysis of customer sentiment using AWS

The overwhelmed person’s guide to Google Cloud: week of May 30

POPULAR CATEGORY