Tuesday, September 17, 2024

A developer’s guide to understanding carbon

I’m guessing you aren’t thinking about carbon when you’re writing code, scheduling jobs, or building applications. 

(I wasn’t either)

You’re probably thinking about security, scaling, cost, latency… But it’s important to include carbon in that list.

In this post, I’ll break down the basics of how to be a carbon-aware developer.

We’ll look at:

Where this carbon pollution comes from in the first place

How to measure your IT carbon emissions on Google Cloud

Strategies for lowering your carbon footprint on Google Cloud

So what is this footprint anyway?

If you’re wondering why your software applications have a carbon footprint in the first place, the TL;DR is that running any kind of compute workload requires energy in the form of electricity. And this energy often comes from sources that release CO2.

As you might know, energy is produced by power plants. These power plants could be sourcing energy from carbon emitting fossil fuels like coal or gas, or non-carbon emitting sources like wind or solar. This energy then flows through a network of transmission lines and substations (aka the electric grid) to power our homes, offices, restaurants, hospitals, and even datacenters.

The mix of carbon-emitting and non-carbon-emitting sources of energy is different for each regional grid. For example, France relies heavily on nuclear, Sweden on hydro, and Texas on natural gas. And even within a particular regional grid, the carbon intensity can change depending on the time of day, because some sources like solar are only available during the daytime.

If you want to see what regional grid you’re connected to and what kind of energy sources are powering it, you can check out our partner Electricity Maps. They provide data quantifying how carbon intensive electricity is on an hourly basis across 50+ countries.

For example, here’s the electricity breakdown for Texas at the time I’m writing this on an early winter morning. You can see that wind and gas make up most of the energy on this regional grid.

Because we know the carbon intensity of a particular regional grid, and we know which grid each Google Cloud datacenter is connected to, we can estimate the carbon emissions associated with a workload run in that location.
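
To make that estimate concrete, here is a minimal Python sketch of the underlying arithmetic: energy used multiplied by the carbon intensity of the grid. The function name and every number are hypothetical and only for illustration; the actual methodology behind Google Cloud's estimates is more detailed than this.

```python
def estimate_emissions_kg(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """Estimate emissions in kg CO2e from energy use and grid carbon intensity."""
    if energy_kwh < 0 or intensity_g_per_kwh < 0:
        raise ValueError("energy and intensity must be non-negative")
    # kWh * (gCO2e / kWh) gives grams; divide by 1000 to get kilograms.
    return energy_kwh * intensity_g_per_kwh / 1000.0


# Hypothetical inputs: the same job on two grids with different intensities.
job_energy_kwh = 120.0
cleaner_grid_kg = estimate_emissions_kg(job_energy_kwh, 50.0)
dirtier_grid_kg = estimate_emissions_kg(job_energy_kwh, 450.0)
print(cleaner_grid_kg, dirtier_grid_kg)
```

The same workload produces very different estimates depending only on which grid it runs on, which is the idea the rest of this post builds on.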

Luckily, if you’re a Google Cloud user, you don’t have to do the work of calculating all of this yourself.

How do I figure out my carbon footprint?

If you want to quantify the greenhouse gas (GHG) emissions from your IT operations on Google Cloud, you can use the Carbon Footprint dashboard in the Cloud Console.

This tool estimates greenhouse gas emissions in metric tons of equivalent carbon dioxide for usage of Google Cloud services associated with a particular billing account. The overview page shows you a monthly carbon footprint estimate across project, region, and product.

At the top of the dashboard, under Yearly carbon footprint, you’ll see the total GHG emissions associated with your Google Cloud usage under Location-based total.

This number is further broken down into the three emissions categories defined by the Greenhouse Gas Protocol Corporate Standard: Scope 1, Scope 2, and Scope 3. These categories help organizations measure their carbon impact.

What exactly contributes to each category will differ across organizations depending on the type of business they run. When we talk about the Carbon Footprint tool specifically, you’re seeing Google’s scope 1, 2, and 3 emissions from datacenter operations. And as a customer, your usage of Google Cloud would fall under your scope 3 emissions.

To get a better understanding of how these scopes are defined, let’s take a look at what each one includes and how they are estimated in the context of a Google Cloud datacenter.

Scope 1 includes the direct emissions from sources controlled by an organization.

In a typical office building, scope 1 could include the gas range in a kitchen, or fuel used to run shuttles. In the context of a Google datacenter, scope 1 would include an onsite backup generator.  

Scope 2 captures the indirect emissions from the purchase of electricity, heating, and cooling.

When we talk about carbon and computing, scope 2 is the most relevant because it includes the emissions related to the electricity used by Google Cloud datacenters as a result of being connected to the grid.

In the Carbon Footprint tool, the scope 2 value is estimated from the GHG emissions produced from electricity provided by the local grid where your compute workload was executed. 

Scope 3 captures the indirect emissions from assets not controlled by your organization.

This includes categories like items bought from suppliers, waste disposal, and business travel. In the case of Google Cloud, this could include the emissions associated with producing the GPUs that we have in our data centers. 

In the Carbon Footprint tool, scopes 1 and 3 are calculated by taking the total Google Cloud scope 1 and 3 emissions and apportioning them based on usage for your specific billing account.
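
As a rough illustration of what usage-based apportioning means, the Python sketch below splits a shared total in proportion to each account's usage. The account names, the usage unit, and the strictly proportional rule are all assumptions for the example; the source does not spell out the exact formula Google uses.

```python
from typing import Dict


def apportion(total_kg: float, usage_by_account: Dict[str, float]) -> Dict[str, float]:
    """Split a shared emissions total across accounts in proportion to usage."""
    total_usage = sum(usage_by_account.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    return {
        account: total_kg * usage / total_usage
        for account, usage in usage_by_account.items()
    }


# Hypothetical numbers: 1000 kg CO2e shared by two billing accounts.
shares = apportion(1000.0, {"account-a": 30.0, "account-b": 70.0})
print(shares)
```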

What can I learn about my carbon footprint?

I wanted to investigate the emissions specifically from my Google Cloud projects. Because I share a billing account with my fellow Cloud Developer Advocates, and I wanted to look at more than just emissions from the past month, I needed something more granular than the Carbon Footprint overview page.

I exported the data via the BigQuery Data Transfer Service, which creates a monthly partitioned table called carbon_footprint in the BigQuery dataset of your choice. To get historical data, I ran a data backfill to 2021.

Each row in this table represents the emissions for a particular Google Cloud service in a specific project over the time frame of one month. You’ll notice that there isn’t just one single column for “emissions” because there are a few different ways to calculate and report this number.

The carbon_footprint_kgCO2e field has three nested columns, one for each scope.

The scope2 field can be further broken down into market_based and location_based.

location_based is most likely the number you’ll want to look at. This refers to the actual amount of CO2 equivalent being emitted from electricity consumed in the usage of a particular Google Cloud service.

On the other hand, the market_based value is calculated as location_based minus apportioned renewable energy purchases for the workload, and can be useful for reporting purposes in specific cases. But as a developer, you should focus on location_based, which is the physical amount of CO2 equivalent generated by running your workload. Also note that the Carbon Footprint tool does not support market_based calculations yet, so that value is always null. This field will be populated once the data is released.
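
If you pull exported rows into Python, the nesting described above looks roughly like the dictionary in this sketch. The row values are made up, and the helper names are hypothetical; only the field names come from the table described in this post. Note how the null market_based value needs to be handled explicitly.

```python
from typing import Any, Dict, Optional

# Hypothetical row shaped like the nested fields described above (made-up values).
row: Dict[str, Any] = {
    "usage_month": "2023-01-01",
    "service": {"description": "BigQuery"},
    "carbon_footprint_kgCO2e": {
        "scope1": 0.25,
        "scope2": {"location_based": 3.5, "market_based": None},
        "scope3": 1.25,
    },
}


def scope2_market_based(r: Dict[str, Any]) -> Optional[float]:
    """Return the market-based scope 2 value, or None while it is unpopulated."""
    return r["carbon_footprint_kgCO2e"]["scope2"].get("market_based")


def all_scopes_location_based_kg(r: Dict[str, Any]) -> float:
    """Add scope 1, location-based scope 2, and scope 3, treating nulls as 0."""
    footprint = r["carbon_footprint_kgCO2e"]
    return (
        (footprint["scope1"] or 0.0)
        + (footprint["scope2"]["location_based"] or 0.0)
        + (footprint["scope3"] or 0.0)
    )


print(scope2_market_based(row), all_scopes_location_based_kg(row))
```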

If you want to know the total emissions from electricity generation for a particular service, say BigQuery, you can run:

    SELECT DISTINCT SUM(carbon_footprint_kgCO2e.scope2.location_based)
    FROM `{PROJECT_ID}.{DATASET}.carbon_footprint`
    WHERE service.description = "BigQuery"

Which returns an estimate of the scope 2 greenhouse gas emissions, in kilograms of CO2 equivalent (as the kgCO2e in the field name indicates), produced by electricity use across all of your BigQuery activity.

If you want to know the total emissions for all 3 scopes, and not just the scope 2 electricity usage, you can look at the carbon_footprint_total_kgCO2e field.

The carbon_footprint_total_kgCO2e field includes three nested columns. Again the most relevant one here is location_based, which will tell you the actual amount of carbon emissions released into the atmosphere directly related to your workload. after_offsets and market_based are useful for specific reporting purposes, but as a developer these numbers are unlikely to be relevant to your objectives.

So if you want to see the breakdown of carbon emissions (all 3 scopes) for a particular project across month, region, and service, you could run:

    SELECT DISTINCT usage_month, service.description, location.location, carbon_footprint_total_kgCO2e.location_based
    FROM `{PROJECT_ID}.{DATASET}.carbon_footprint`
    WHERE project.number = "{PROJECT_NUMBER}"
    ORDER BY usage_month, service.description

Which returns one row for each month, service, and location combination, along with its location-based total.

And if you want to know the total amount of emissions (all 3 scopes) for a specific project, you could run:

    SELECT DISTINCT SUM(carbon_footprint_total_kgCO2e.location_based)
    FROM `{PROJECT_ID}.{DATASET}.carbon_footprint`
    WHERE project.number = "{PROJECT_NUMBER}"
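
Once the rows are out of BigQuery, you may want to track the trend over time rather than a single total. The Python sketch below groups rows by usage_month and converts kilograms to metric tons. It assumes the rows have already been fetched as dictionaries (the fetching step is left out on purpose), and the sample values and function names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def monthly_totals_kg(rows: Iterable[Dict[str, Any]]) -> Dict[str, float]:
    """Sum location-based totals (all scopes) per usage_month, skipping nulls."""
    totals: Dict[str, float] = defaultdict(float)
    for r in rows:
        value = r["carbon_footprint_total_kgCO2e"]["location_based"]
        if value is not None:
            totals[r["usage_month"]] += value
    # ISO-formatted dates sort chronologically as plain strings.
    return dict(sorted(totals.items()))


def kg_to_metric_tons(kg: float) -> float:
    """Convert kilograms to metric tons (1 t = 1000 kg)."""
    return kg / 1000.0


# Made-up rows for illustration only.
sample_rows = [
    {"usage_month": "2023-02-01", "carbon_footprint_total_kgCO2e": {"location_based": 4.0}},
    {"usage_month": "2023-01-01", "carbon_footprint_total_kgCO2e": {"location_based": 2.5}},
    {"usage_month": "2023-01-01", "carbon_footprint_total_kgCO2e": {"location_based": 1.5}},
    {"usage_month": "2023-02-01", "carbon_footprint_total_kgCO2e": {"location_based": None}},
]
print(monthly_totals_kg(sample_rows))
```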

As a developer, why should I care?

The idea that we as developers can have any impact on these emissions does seem a little daunting at first. I mean, at the end of the day we’re just launching jobs on machines connected to an electric grid somewhere out there…

But as I mentioned earlier, different regional grids have different makeups of energy sources. This means that some Google Cloud regions are connected to electric grids that have a higher amount of carbon free energy sources operating on them. 

As a result, the region you choose to run your compute jobs in can have a big impact. Different regions can have a significantly different carbon emissions profile, even regions that are geographically close together.

If you look at the Google Cloud docs, you can see indicators for regions that have a carbon free energy (CFE) score of at least 75%. CFE is a metric that represents the average percentage of time your application will be running on carbon-free energy.
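
One simplified way to read that definition is as an average over hourly values. The Python sketch below takes the carbon-free share of electricity for each hour and averages it, then checks the result against the 75% indicator threshold mentioned above. The hourly inputs are invented, and the published CFE methodology may weight hours differently, so treat this as an illustration of the idea rather than the official calculation.

```python
from typing import Sequence

LOW_CO2_CFE_THRESHOLD = 75.0  # percent, the indicator threshold mentioned above


def cfe_percent(hourly_carbon_free_fraction: Sequence[float]) -> float:
    """Average hourly carbon-free fractions (0.0 to 1.0) into a percentage."""
    if not hourly_carbon_free_fraction:
        raise ValueError("need at least one hourly value")
    if any(f < 0.0 or f > 1.0 for f in hourly_carbon_free_fraction):
        raise ValueError("fractions must be between 0.0 and 1.0")
    return 100.0 * sum(hourly_carbon_free_fraction) / len(hourly_carbon_free_fraction)


def meets_cfe_threshold(cfe: float) -> bool:
    """True when a CFE percentage reaches the 75% threshold."""
    return cfe >= LOW_CO2_CFE_THRESHOLD


# Hypothetical four-hour sample: fully carbon-free half the time, 50% otherwise.
score = cfe_percent([1.0, 1.0, 0.5, 0.5])
print(score, meets_cfe_threshold(score))
```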

In many of Google Cloud’s products, when choosing a region, there’s also a friendly Low CO2 icon in the Cloud Console.

When we think about reducing carbon emissions, the first thing that comes to mind is usually adding more carbon free energy sources to the grid. However, it’s not just the supply that matters. As developers, we can be flexible in the way we manage demand. 

This means that for batch workloads that aren’t super time sensitive, we can choose to run them at times of the day and in locations where there is more carbon free energy available. A good place to start is by trying to select the closest Low CO2 region whenever possible.
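
For the time-of-day half of that idea, here is a minimal Python sketch that picks the start hour with the lowest average carbon intensity for a job of a given length. The forecast numbers are invented, and where you would get a real hourly forecast is left open; the function name is hypothetical too.

```python
from typing import Sequence, Tuple


def best_start_hour(forecast_g_per_kwh: Sequence[float], duration_hours: int) -> Tuple[int, float]:
    """Return (start_index, average_intensity) of the cleanest contiguous window."""
    n = len(forecast_g_per_kwh)
    if duration_hours < 1 or duration_hours > n:
        raise ValueError("duration must be between 1 and the forecast length")
    best_start, best_avg = 0, float("inf")
    for start in range(n - duration_hours + 1):
        window = forecast_g_per_kwh[start:start + duration_hours]
        avg = sum(window) / duration_hours
        # Strict comparison keeps the earliest window when averages tie.
        if avg < best_avg:
            best_start, best_avg = start, avg
    return best_start, best_avg


# Hypothetical hourly forecast (gCO2e/kWh) for the next eight hours.
forecast = [400, 380, 300, 200, 150, 180, 320, 410]
print(best_start_hour(forecast, 2))
```

A scheduler could delay a flexible batch job until the returned hour instead of starting it immediately.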

Now of course, selecting a region depends on many other factors beyond carbon (latency, price, data locality, etc.). The Region Picker tool can help here: you weigh each factor from “Not Important” to “Important” and, if applicable, select where your user traffic comes from. Region Picker then helps you choose the lowest CO2 region within the parameters required for your specific workload.
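
To show how weighing factors against each other can change the answer, here is a small Python sketch of a weighted score. This is not how Region Picker is implemented; the region names, the 0 to 1 scores (higher is better), and the weights are all hypothetical.

```python
from typing import Dict


def pick_region(regions: Dict[str, Dict[str, float]], weights: Dict[str, float]) -> str:
    """Return the region with the highest weighted sum of its factor scores."""
    if not regions:
        raise ValueError("no candidate regions")

    def score(name: str) -> float:
        return sum(
            weights.get(factor, 0.0) * value
            for factor, value in regions[name].items()
        )

    # sorted() makes ties resolve to the alphabetically first region name.
    return max(sorted(regions), key=score)


# Hypothetical normalized scores: 1.0 is best, 0.0 is worst.
candidates = {
    "region-a": {"carbon": 0.9, "price": 0.4, "latency": 0.3},
    "region-b": {"carbon": 0.3, "price": 0.9, "latency": 0.9},
}
carbon_first = {"carbon": 1.0, "price": 0.5, "latency": 0.5}
carbon_ignored = {"carbon": 0.0, "price": 1.0, "latency": 1.0}
print(pick_region(candidates, carbon_first), pick_region(candidates, carbon_ignored))
```

With carbon weighted most heavily the cleaner region wins; once carbon is ignored, price and latency decide instead.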

What’s next?

Congratulations, developer. You’ve taken the first step in understanding the world of carbon. Now it’s time for you to get hands-on, jump into the Carbon Footprint dashboard, and take a look at your own data. Then, you can start moving workloads to Low CO2 regions when possible, and see if your monthly carbon footprint starts to go down.

Once you get a handle on how to measure and report your carbon emissions, there are additional architectures, design patterns, and techniques that you can use to reduce the carbon footprint of your software applications. But, more on all of that next time.
