Friday, January 27, 2023

Announcing open innovations for a new era of systems design

We’re at a pivotal moment in systems design. Demand for computing is growing at an insatiable rate. At the same time, the slowing of Moore’s law means that improvements in CPU performance, power consumption, and memory and storage cost efficiency have all plateaued. These headwinds are further exacerbated by new challenges in reliability and security.

At Google, we’ve responded to these challenges and opportunities with system design innovations across the stack: from new custom-silicon accelerators (e.g., TPU, VCU, and IPU) and new hardware and data center infrastructure all the way to new distributed systems and cloud solutions. But this is only the beginning. There are many more opportunities for advancement, including closely coupled accelerators for core data center functions to minimize the so-called “data center tax.” As server and data center infrastructure diverges from decades-old traditional designs to become more modular, heterogeneous, disaggregated, and software-defined, distributed systems are also entering a new epoch, one defined by optimizations for the “killer microsecond” and novel programming models optimized for low latency and accelerators.

At Google, we believe that these new opportunities and challenges are best addressed together, across the industry. Today, at the Open Compute Project (OCP) Global Summit, we are demonstrating our support of open hardware ecosystems, presenting more than 40 talks, and announcing several key contributions:

Server design: We will share Google’s vision for a “multi-brained” server of the future, transforming traditional server designs into more modular, disaggregated distributed systems spanning host computing, accelerators, memory expansion trays, infrastructure processing units, and more. We are sharing the work we are doing with all our OCP partners on the varied innovations needed to make this a reality: modular hardware with DC-MHS, standardized management with OpenBMC and Redfish, a standardized root of trust, and standardized interfaces including CXL, NVMe, and beyond.

Trusted computing: The root of trust is an essential part of future systems. Google has a tradition of making contributions for transparent and best-in-class security, including our OpenTitan discrete security solutions on consumer devices. We are looking ahead to future innovations in confidential computing and varied use cases that require chip-level attestation at the level of a package or System on a Chip (SoC). Together with other industry leaders, AMD, Microsoft, and NVIDIA, we are contributing Caliptra, a reusable IP block for root of trust measurement, to OCP. In the coming months we will roll out initial code for the community to harden together.

Reliable computing: To address the challenges of reliability at scale, we’ve formed a new server-component resilience workstream at OCP, along with AMD, ARM, Intel, Meta, Microsoft, and NVIDIA. Through this workstream, we’ll develop consistent metrics about silent data errors and corruptions for the broader industry to track. We’ll also contribute test execution frameworks and suites, and provide access to test environments with faulty devices. This will enable the broader community, across industry and academia, to take a systems approach to addressing silicon faults and silent data errors.

Sustainability: Finally, we’re announcing our support for a new initiative within OCP to promote environmental sustainability as a key tenet across the ecosystem. Google has been a leader in environmental sustainability for many years. We have been carbon neutral since 2007, powered by 100% renewable energy since 2017, and have an ambitious goal to achieve net-zero emissions across all of our operations and value chain by 2030. In turn, as the cleanest cloud in the industry, we have helped customers track and reduce their carbon footprint and achieve significant energy savings. We’re excited to share these best practices with OCP and work with the broader community to standardize sustainability measurement and optimization in this important area.
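
To make the idea of root of trust measurement from the trusted computing item more concrete, here is a minimal sketch of a generic hash-chained measurement, in the style of a measured boot. It is an illustration under stated assumptions, not Caliptra’s design or code: the choice of SHA-384, the all-zero starting value, the function names, and the boot stages are all hypothetical.

```python
import hashlib
from typing import Sequence

# Assumption for this sketch: SHA-384 as the measurement hash.
DIGEST_SIZE = hashlib.sha384().digest_size


def extend(register: bytes, component: bytes) -> bytes:
    """Fold one component into the running measurement.

    new_register = H(old_register || H(component))
    """
    component_digest = hashlib.sha384(component).digest()
    return hashlib.sha384(register + component_digest).digest()


def measure_boot_chain(components: Sequence[bytes]) -> bytes:
    """Measure each stage in order, starting from an all-zero register."""
    register = bytes(DIGEST_SIZE)
    for blob in components:
        register = extend(register, blob)
    return register


# Hypothetical boot stages; real systems would measure firmware images.
boot_chain = [b"bootloader v1", b"firmware v1", b"os-kernel v1"]
expected = measure_boot_chain(boot_chain)

# Changing any stage, or the order of stages, changes the final value.
tampered = measure_boot_chain(
    [b"bootloader v1", b"firmware v1 (modified)", b"os-kernel v1"]
)
reordered = measure_boot_chain(
    [b"firmware v1", b"bootloader v1", b"os-kernel v1"]
)

print("matches expected:", measure_boot_chain(boot_chain) == expected)
print("tampered differs:", tampered != expected)
print("reordered differs:", reordered != expected)
```

Because each step hashes the previous value together with the new component, the final value depends on every stage and on their order. A verifier that knows the expected value can therefore detect a modified or reordered stage, which is the property attestation builds on.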

As the industry body focused on system integration (e.g., compute, memory, storage, management, power and cooling), the OCP Foundation is uniquely positioned to facilitate the industry-wide codesign we need. Google is active in OCP, serving in leadership roles, incubating new initiatives, and supporting numerous contributions.

These announcements are the latest example of our history of fostering open and standards-based ecosystems. Open ecosystems enable a diverse product marketplace, with agility in time-to-market and the opportunity to be strategic about innovation. Google’s open source leadership is multidimensional: driving industry standardization and adoption, making strong and varied community contributions to grow the ecosystem, and providing broad policy and organizational leadership and sharing of best practices.

The four initiatives we are announcing today, in combination with the Google-led talks at the OCP Summit, provide a small glimpse into the exciting new era of systems ahead. We look forward to working with the broader OCP community and other industry organizations to build a vibrant open hardware ecosystem to support even more innovation in this space. Please join us in this exciting journey.


