Organizations typically store business and customer data in databases like Amazon Relational Database Service (Amazon RDS) and Amazon Redshift, and often want to enrich this data by integrating with external services. One such enrichment is to add spatial attributes such as location coordinates for an address. With the introduction of Amazon Location Service, you now have access to geospatial functionalities such as map visualization, geocoding, and reverse geocoding using data providers such as Esri and HERE.
Moreover, with the ability of Amazon Redshift and RDS databases to call AWS Lambda functions through user-defined functions (UDFs), you can now integrate these databases with geospatial functionality provided by Amazon Location Service. For more information about user-defined functions in Amazon Aurora, see Invoking an AWS Lambda function from an Aurora PostgreSQL DB cluster.
In this post, we take customer addresses stored in an Amazon Aurora PostgreSQL-Compatible Edition database and call Amazon Location Service geocoding APIs to find the coordinates of the addresses and persist them in the database. If you are using Amazon Redshift, see Access Amazon Location Service from Amazon Redshift.
Prerequisites
Before you get started, make sure you have the following prerequisites:
An understanding of basic concepts such as AWS Identity and Access Management (IAM) roles and policies, Lambda functions, and Aurora PostgreSQL.
An Aurora PostgreSQL cluster with a table that contains customer addresses and the following details: street number, street name, street type, municipality name, state or province code, postal or zip code, and country code. You also need a column to save the coordinates for the address.
A SQL developer tool of your choice to connect to the Aurora PostgreSQL cluster.
An Amazon QuickSight account with access to Aurora.
Solution overview
Our solution consists of the following components:
A Python Lambda function to call the search_place_index_for_text function. This function takes a text input and returns the coordinate as longitude and latitude values for each search result.
An IAM role to allow Lambda to call the SearchPlaceIndexForText operation for Amazon Location Service
A UDF in the Aurora PostgreSQL database to invoke the Lambda function.
An IAM role to allow Aurora PostgreSQL to invoke the Lambda function.
SQL statements to update and select the coordinate data for records in the Aurora PostgreSQL database by calling the UDF.
A QuickSight dataset that uses the SQL statement to access the coordinate data.
A QuickSight analysis that displays the address location in a geospatial chart.
The following diagram illustrates our solution architecture:
To implement the solution, we complete the following steps:
Set up a place index resource for Amazon Location Service.
Create a common address geocoding function.
Invoke the Amazon Location Service API from Aurora PostgreSQL.
Create a Lambda function.
Set up Aurora PostgreSQL.
Run SQL statements to invoke the Lambda function.
Visualize the address locations in QuickSight.
Pay close attention to names and parameters used in this post; they must match and be consistent across all solution components.
The full code is available in GitHub. The code also includes an AWS CloudFormation template.
Set up a place index resource for Amazon Location Service
Amazon Location Service uses a place index resource for providing geocoding and reverse geocoding functionality. Let’s start by creating a new place index resource. Your account may have a default place index but we don’t use it for this post because it isn’t configured for storage.
On the Amazon Location Service console, use the navigation pane to launch the wizard to create a new place index.
For Name, enter placeindex.aurora.
For Data provider, select either of the data providers for your place index.
For Data storage options, select Yes, results will be stored because we’re saving the geocoding results in a database field.
Leave all other values as their default and choose Create place index.
We use the place index name in the Lambda code as a parameter to the search_place_index_for_text API call.
Create a common address geocoding Lambda function
Next, let’s create a function to call the Amazon Location Service API. We reuse this function code for both Aurora and Amazon Redshift because the underlying service call is the same.
Error handling has been skipped in this code snippet for brevity. The full code is available in GitHub.
We use country_code to limit the search_place_index_for_text API to a particular country using the ISO 3166-1 alpha-3 country code. Amazon Location Service throws an exception if an input is provided but doesn’t contain a valid value.
Create a new file geocode.py with the following code snippet:
Invoke the Amazon Location Service API from Amazon Aurora PostgreSQL
The Aurora PostgreSQL database uses PostgreSQL extensions aws_lambda and aws_common. For instructions on installing these extensions, see Invoking an AWS Lambda function from an Aurora PostgreSQL DB cluster.
Aurora PostgreSQL Lambda invocation provides a JSON document representing the parameters of the UDF. Aurora PostgreSQL sends one database record in one function call.
The interface expects the response as a JSON document.
In case of an exception, the function can return an error message.
Create a Lambda function
We now create a Lambda function GeocodeAddress-Aurora using the Python runtime.
Error handling has been skipped in this code snippet for brevity. The full code is available in GitHub.
Create file geocode.py as described in the previous section.
It’s important to set the concurrency for the Lambda function to allow multiple requests from Aurora PostgreSQL to be run concurrently.
Replace the default generated code in lambda_function.py with the following code:
This Lambda function requires permission to call the search_place_index_for_text API to geocode addresses using the place index placeindex.aurora that we created earlier.
Update the IAM role for the Lambda function to add the following inline policy GeocodeAddress-Aurora-policy:
Set up Aurora PostgreSQL
Now we create a new UDF in Aurora PostgreSQL and configure it to use an IAM role that gives the permissions to invoke the specific Lambda function. For more information, see Invoking an AWS Lambda function from an Aurora PostgreSQL DB cluster.
Create an IAM role Aurora-Lambda-role and add the following inline policy Aurora-Lambda-policy to this role to allow Aurora PostgreSQL to invoke the GeocodeAddress-Aurora function that we created:
We need to add this role to the Aurora PostgreSQL cluster.
On the Aurora PostgreSQL console, choose the cluster to use.
Under Manage IAM Roles, add the Aurora-Lambda-role for feature Lambda to the cluster.
Wait for the cluster to be modified and be in Available status.
Now we create an Aurora PostgreSQL UDF to invoke a Lambda function.
Use your preferred SQL editor to connect to your Aurora PostgreSQL cluster and create a new UDF f_geocode_address in Aurora PostgreSQL using the following code:
This UDF calls the function using the permissions provided in the Aurora-Lambda-role that we created.
Run SQL statements to invoke the Lambda function
We’re now ready to run SQL statements that can take address records from the customer_address table in the Aurora PostgreSQL database and geocode them using Amazon Location Service.
If you don’t have a customer_address table, you can create it using the script included in the full code in GitHub.
You may choose to simply select the geocoding results as part of the SQL statement, as in the following code:
Geocoding results can be persisted in a database field in Aurora PostgreSQL. We can then parse the saved JSON results to extract the geographic coordinates.
Now let’s extract the geographic coordinates from the results field.
Visualize the address location using QuickSight
Let’s configure a dataset in QuickSight and create an analysis for this dataset.
Create a new Aurora data source Aurora-Geocode and configure it with endpoint details.
Create a new dataset for the data source using the preceding SELECT statement as a custom SQL.
Now we’re ready to configure our analysis.
On the QuickSight console, create a new QuickSight analysis using the address location dataset.
Choose the Point on map visual type.
Choose the longitude and latitude fields from the fields list and drag them to the Geospatial field wells.
You should now see points on the map representing the customer address locations.
Conclusion
Congratulations! We have successfully integrated Amazon Location Service with Amazon Aurora and geocoded address data in an Aurora PostgreSQL table without leaving our SQL environment. We enriched the address data by adding location coordinates. We also successfully visualized the address locations on a map in QuickSight. You can now explore extending the solution to other functionalities of Amazon Location Service such as reverse geocoding or even integrating with Lambda functions for any other custom functionality.
About the author
Parag Srivastava is a Solutions Architect at Amazon Web Services (AWS), helping enterprise customers with successful cloud adoption and migration. During his professional career, he has been extensively involved in complex digital transformation projects. He is also passionate about building innovative solutions around geospatial aspects of addresses.
Read MoreAWS Database Blog