Monday, July 15, 2024
No menu items!
HomeCloud ComputingModel accidents and potholes using Waze and NOAA data in BigQuery

Model accidents and potholes using Waze and NOAA data in BigQuery

Transportation agencies face the challenge of managing heavy amounts of data from diverse and high-velocity sources. It can be difficult to gather insights across the disparate sources, let alone be fast enough to inform operational decisions. Traditionally, agencies rely on yearly reporting to understand the state of their operations, but this limits their ability to act quickly on emerging problems. At Google, we see a future where every government leader can draw insights from their big data on-demand. BigQuery, our cloud data warehouse, can address data volume, consolidation, and freshness without a high degree of technical specialization.

Kentucky Transportation Cabinet transformed safety operations with data

For example, Kentucky Transportation Cabinet (KYTC) outgrew an on-premise Hadoop cluster as it attempted to combine HERE®, Waze, weather, and road segment geospatial data. By consolidating the analytics into BigQuery, reporting could be done both at petabyte-scale and in near-real time with streaming data.

Once they consolidated data into BigQuery, new use cases started to emerge for KYTC. During the snowstorms of 2021, the operations team generated materialized views to efficiently monitor signs of congestion on throughways. Every 10 minutes became a decision point on whether crews should be proactively dispatched, as the team could infer the flow of traffic based on the data instead of relying on humans or 3rd party sources, which may be unable to provide the data clearly or with specificity. Another use case is providing transparency to the public by presenting multiple layers of roadway events on a public Maps application.

Many government organizations are members of the Waze of Cities program, which offers direct access to five years of their jurisdiction’s jams, alerts, and irregularities data already hosted on BigQuery. Google’s onboarding guide provides excellent starter queries to understand what’s happening on the roads. To take it a step further, agencies can make use of machine learning (ML) capabilities and public datasets in BigQuery to model incidents.

Modeling road conditions and safety using Waze and weather data

As a proof of concept, let’s correlate precipitation with accidents using BigQuery in Google Cloud. We can join the National Oceanic and Atmospheric Administration’s (NOAA) daily Global Historical Climatology Network (GHCN) precipitation numbers with our Alerts table from Waze with the following query. Note: Be sure to have “write access” to a Google Cloud project and BigQuery dataset.

code_block[StructValue([(u’code’, u’CREATE OR REPLACE TABLErn`<your dataset>.prcp_accidents_table` AS (rnWITHrn local_stations AS(rn SELECTrn id AS station_id,rn ST_BUFFER(ST_GEOGPOINT(longitude, latitude),rn 10000) AS geo_clustersrn FROMrn `bigquery-public-data.ghcn_d.ghcnd_stations`rn WHERErn state = <state abbreviation> ),rn accidents AS (rn SELECTrn city,street,roadType,ts,georn FROMrn `waze-public-dataset.partner_<your organization>.view_alerts_clustered`rn WHERErn ts >= TIMESTAMP(‘2021-12-01’) AND ts < TIMESTAMP(‘2022-01-01’)rn AND subtype = “ACCIDENT_MINOR” )rnSELECTrn precip.id AS station_id,rn geo_clusters,rn precip.value AS prcp,rn date,rn (rn SELECTrn COUNT(*)rn FROMrn accidentsrn WHERErn precip.date = DATE(accidents.ts)rn AND ST_INTERSECTS(accidents.geo, geo_clusters)) AS associated_accidents_countrnFROMrn `bigquery-public-data.ghcn_d.ghcnd_2021` preciprnJOINrn local_stationsrnONrn local_stations.station_id = precip.idrnWHERErn precip.element = “PRCP” AND date >= ‘2021-12-01’ AND date < ‘2022-01-01′ )’), (u’language’, u’lang-sql’), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3ea0334f5d50>)])]

Once built, this query provides a training dataset for creating a machine learning (ML)model. Feel free to replace the subtype feature from ACCIDENT_MINOR to HAZARD_ON_ROAD_POT_HOLE or HAZARD_WEATHER_FLOOD if you wish to create further predictions on those. 

You can then create an ML model using the query below:

code_block[StructValue([(u’code’, u’CREATE OR REPLACE MODEL `<your dataset>.predict_accidents_regression`rnOPTIONS(model_type=’linear_reg’) ASrnSELECTrnassociated_accidents_count as label,rnprcp,rnIFrn (EXTRACT(DAYOFWEEKrn FROMrn date) = 1rn OR EXTRACT(DAYOFWEEKrn FROMrn date) = 7,rn “weekend”,rn “weekday”) AS isweekdayrnFROMrn`<your dataset>.prcp_accidents_table`’), (u’language’, u’lang-sql’), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3ea033403710>)])]

The model’s features are the precipitation amount (tenths of mm) and whether it’s a weekday, and the output is the predicted number of accidents. After the query completes, we’ll be able to use this model for prediction by “querying” the model like a Structured Query Language (SQL) table.

code_block[StructValue([(u’code’, u”SELECTrn*rnFROMrnML.EXPLAIN_PREDICT(MODEL `<your dataset>.predict_accidents_regression`,rn (rn SELECTrn 10000 as prcp,rn ‘weekday’ as isweekday),rn STRUCT(2 as top_k_features))rnHere are some example results:”), (u’language’, u’lang-sql’), (u’caption’, <wagtail.wagtailcore.rich_text.RichText object at 0x3ea03343d810>)])]

The predicted number of accidents is around 14. We also asked the model to explain how each feature was attributed to the result. This can be useful for agencies to understand what factors they need to influence to reach safety outcomes in their respective jurisdictions.

Next steps

Start working on building your own models using these queries as a first step. Access to the Waze tables in BigQuery is controlled in the Waze Partner Hub. If you would like any additional information or help with getting set up, please reach out to your Google Public Sector Account Team or contact here.

Cloud BlogRead More

RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Most Popular

Recent Comments