Saturday, February 24, 2024

Analyze healthcare FHIR data with Amazon Neptune

In this post, we focus on data analysis as part of a modern data strategy. We cover how to generate insights from healthcare FHIR (Fast Healthcare Interoperability Resources) data with Amazon Neptune, a fast, reliable, fully managed graph database service. Using a graph database for this use case allows you to model and navigate complex, connected FHIR resources and answer questions based on the relationships in the data.

Moving towards a modern data strategy gives you a comprehensive plan to manage, access, analyze, and act on healthcare data and data in general. Data in the healthcare industry is exchanged based on interoperability standards like FHIR as defined by Health Level Seven International (HL7). FHIR defines resources to model and exchange clinical or administrative data. Resources can be linked, for example, to create a comprehensive patient view. Because resources can also be deeply nested, analysis of the data becomes cumbersome when advancing beyond simple queries.

In this post, we focus on reducing the complexity of analyzing healthcare FHIR data while keeping the data in a standards-compliant encoding by using the Resource Description Framework (RDF). Queries become easier to write, understand, and share because the data model remains standardized. Moreover, Neptune removes the need to design and create a complex data schema at ingestion time. New types of FHIR resources can be added to the graph and are instantly available for querying because no schema definition is required.

Because RDF is machine-interpretable, you can also add an inference engine to derive additional insights from the data.

Using a graph database to query FHIR data

We see customers mainly using FHIR for data exchange and storing the data in relational databases. However, some FHIR resources, such as QuestionnaireResponse, which can be used in medical trials, are intended for analysis and not only for storage. Analyzing this data in a relational database involves complex queries that join the numerous tables required to break up the highly nested FHIR structure, which negatively impacts performance.

Neptune is purpose-built to work with highly connected data. It enables you to query data in its natural form by focusing on the underlying relationships. Neptune doesn’t require schema definitions and allows for fast queries even over large graphs.

Storing FHIR data in Neptune enables you to run complex queries such as identifying patients with similar characteristics. You can link multiple FHIR resources and query them based on their relationships to each other, for example, “Which patients who received a specific treatment show similar symptoms post-surgery?” The following image shows how the relationships between FHIR resources could be represented in a graph to answer this question.

The FHIR standard defines an RDF representation of its resources that is written in Terse RDF Triple Language (Turtle), a textual syntax for writing down an RDF graph. Neptune supports RDF, which can be queried with SPARQL Protocol and RDF Query Language (SPARQL), a standard query language for RDF graph databases.

Illustrative queries

In this section, we describe three queries on a Neptune graph containing FHIR QuestionnaireResponse resources. Use the walkthrough that follows this section to build the graph and try out the queries yourself.

The graph is based on a dataset containing 500 Turtle files of sample FHIR QuestionnaireResponse resources following the RDF specification of FHIR.

The responses include unique identifiers of the covered patient (subject) and the practitioner (author, source). The following main question sections are covered:

Drinking and smoking behavior
Cardiopulmonary system
Gastrointestinal tract
Employment history
Contact with hazardous materials

The questions are the same in all response resources. Only the answer values vary. The data is fictional and not pulled from any real person’s information. The following snippet shows the content of one sample Turtle file:

@prefix fhir: <http://hl7.org/fhir/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://hl7.org/fhir/QuestionnaireResponse/54a3a35d-a2aa-4eb8-82cc-d27624bb61d7> a fhir:QuestionnaireResponse ;
  fhir:nodeRole fhir:treeRoot ;
  fhir:QuestionnaireResponse.author [
    fhir:link <http://hl7.org/fhir/Practitioner/8b7cb028-4d9b-4bc3-bf35-c8d408170d3a> ;
    fhir:Reference.reference [
      fhir:value "Practitioner/8b7cb028-4d9b-4bc3-bf35-c8d408170d3a"
    ]
  ] ;
  fhir:QuestionnaireResponse.authored [
    fhir:value "2022-10-28T10:28:53"^^xsd:dateTime
  ] ;
  fhir:QuestionnaireResponse.item [
    fhir:index "0"^^xsd:integer ;
    fhir:QuestionnaireResponse.item.item [
      fhir:index "0"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueInteger [
          fhir:value "4"^^xsd:integer
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "1.1"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "How many liters of beer do you consume in a week?"
      ]
    ],
    [
      fhir:index "1"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueInteger [
          fhir:value "2"^^xsd:integer
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "1.2"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "How many liters of wine do you consume in a month?"
      ]
    ],
    [
      fhir:index "2"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueInteger [
          fhir:value "0"^^xsd:integer
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "1.3"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "How many years have you been smoking?"
      ]
    ],
    [
      fhir:index "3"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueInteger [
          fhir:value "0"^^xsd:integer
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "1.4"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "How many cigarettes do you currently smoke per day?"
      ]
    ],
    [
      fhir:index "4"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueInteger [
          fhir:value "0"^^xsd:integer
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "1.5"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "How many cigars do you currently smoke per week?"
      ]
    ] ;
    fhir:QuestionnaireResponse.item.linkId [
      fhir:value "1"
    ] ;
    fhir:QuestionnaireResponse.item.text [
      fhir:value "Drinking and smoking behavior"
    ]
  ],
  [
    fhir:index "2"^^xsd:integer ;
    fhir:QuestionnaireResponse.item.item [
      fhir:index "2"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "3.3"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Chronic Diarrhea?"
      ]
    ],
    [
      fhir:index "1"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "3.2"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Constipation?"
      ]
    ],
    [
      fhir:index "4"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "3.5"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Ulcers?"
      ]
    ],
    [
      fhir:index "3"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "3.4"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Hemorrhoids?"
      ]
    ],
    [
      fhir:index "0"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "3.1"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Celiac Disease?"
      ]
    ] ;
    fhir:QuestionnaireResponse.item.linkId [
      fhir:value "3"
    ] ;
    fhir:QuestionnaireResponse.item.text [
      fhir:value "Gastrointestinal tract – Do you currently suffer or ever suffered from:"
    ]
  ],
  [
    fhir:index "1"^^xsd:integer ;
    fhir:QuestionnaireResponse.item.item [
      fhir:index "3"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "2.4"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Heart attack?"
      ]
    ],
    [
      fhir:index "0"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "2.1"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Congestive heart failure?"
      ]
    ],
    [
      fhir:index "4"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "2.5"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Vascular heart disease?"
      ]
    ],
    [
      fhir:index "5"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "2.6"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Hypertension?"
      ]
    ],
    [
      fhir:index "6"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "2.7"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Emphysema?"
      ]
    ],
    [
      fhir:index "7"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "true"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "2.8"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Asthma?"
      ]
    ],
    [
      fhir:index "2"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "2.3"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "(Chronic) bronchitis?"
      ]
    ],
    [
      fhir:index "1"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "true"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "2.2"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Angina?"
      ]
    ] ;
    fhir:QuestionnaireResponse.item.linkId [
      fhir:value "2"
    ] ;
    fhir:QuestionnaireResponse.item.text [
      fhir:value "Cardio-pulmonary system – Do you currently suffer or ever suffered from:"
    ]
  ],
  [
    fhir:index "4"^^xsd:integer ;
    fhir:QuestionnaireResponse.item.item [
      fhir:index "11"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.12"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Nickel"
      ]
    ],
    [
      fhir:index "0"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.1"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Industrial alcohols"
      ]
    ],
    [
      fhir:index "6"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.7"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Welding Fumes"
      ]
    ],
    [
      fhir:index "12"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "true"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.13"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Pesticides"
      ]
    ],
    [
      fhir:index "1"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.2"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Potassium cyanide"
      ]
    ],
    [
      fhir:index "14"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.15"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Radiation"
      ]
    ],
    [
      fhir:index "4"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "true"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.5"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Anaesthetic gas"
      ]
    ],
    [
      fhir:index "13"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.14"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Talc"
      ]
    ],
    [
      fhir:index "8"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.9"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Chloroform"
      ]
    ],
    [
      fhir:index "2"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "true"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.3"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Selenophenol"
      ]
    ],
    [
      fhir:index "9"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.10"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Fiberglass"
      ]
    ],
    [
      fhir:index "7"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.8"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Cadmium"
      ]
    ],
    [
      fhir:index "10"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.11"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Mercury"
      ]
    ],
    [
      fhir:index "5"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "true"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.6"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Gasoline"
      ]
    ],
    [
      fhir:index "15"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.16"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "X-ray"
      ]
    ],
    [
      fhir:index "3"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "true"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "5.4"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Wood dust"
      ]
    ] ;
    fhir:QuestionnaireResponse.item.linkId [
      fhir:value "5"
    ] ;
    fhir:QuestionnaireResponse.item.text [
      fhir:value "Did you come in contact with the following in your workspace over the past 15 years?"
    ]
  ],
  [
    fhir:index "3"^^xsd:integer ;
    fhir:QuestionnaireResponse.item.item [
      fhir:index "3"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueInteger [
          fhir:value "0"^^xsd:integer
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "4.4"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Employment duration in hazardous workplace in years"
      ]
    ],
    [
      fhir:index "0"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueBoolean [
          fhir:value "false"^^xsd:boolean
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "4.1"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Have you ever worked in a hazardous workplace?"
      ]
    ],
    [
      fhir:index "2"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueString [
          fhir:value "None"
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "4.3"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Hazards in Workplace"
      ]
    ],
    [
      fhir:index "1"^^xsd:integer ;
      fhir:QuestionnaireResponse.item.item.answer [
        fhir:index "0"^^xsd:integer ;
        fhir:QuestionnaireResponse.item.item.answer.valueString [
          fhir:value "Metal & Electronics"
        ]
      ] ;
      fhir:QuestionnaireResponse.item.item.linkId [
        fhir:value "4.2"
      ] ;
      fhir:QuestionnaireResponse.item.item.text [
        fhir:value "Employer industry"
      ]
    ] ;
    fhir:QuestionnaireResponse.item.linkId [
      fhir:value "4"
    ] ;
    fhir:QuestionnaireResponse.item.text [
      fhir:value "Employment history"
    ]
  ] ;
  fhir:QuestionnaireResponse.source [
    fhir:link <http://hl7.org/fhir/Practitioner/8b7cb028-4d9b-4bc3-bf35-c8d408170d3a> ;
    fhir:Reference.reference [
      fhir:value "Practitioner/8b7cb028-4d9b-4bc3-bf35-c8d408170d3a"
    ]
  ] ;
  fhir:QuestionnaireResponse.status [
    fhir:value "completed"
  ] ;
  fhir:QuestionnaireResponse.subject [
    fhir:link <http://hl7.org/fhir/Patient/4aa4d316-800f-47a1-8e72-61811b591752> ;
    fhir:Reference.display [
      fhir:value "39534f16-20ee-48f0-ad47-9c5ba7abf1e3"
    ] ;
    fhir:Reference.reference [
      fhir:value "Patient/4aa4d316-800f-47a1-8e72-61811b591752"
    ]
  ] ;
  fhir:Resource.id [
    fhir:value "54a3a35d-a2aa-4eb8-82cc-d27624bb61d7"
  ] .

<http://hl7.org/fhir/QuestionnaireResponse/54a3a35d-a2aa-4eb8-82cc-d27624bb61d7.ttl> a owl:Ontology ;
owl:imports fhir:fhir.ttl .

<http://hl7.org/fhir/Patient/4aa4d316-800f-47a1-8e72-61811b591752> a fhir:Patient .

<http://hl7.org/fhir/Practitioner/8b7cb028-4d9b-4bc3-bf35-c8d408170d3a> a fhir:Practitioner .

The illustrative queries focus on a subset of nodes in the graph representing the questionnaire responses. The following image visualizes the graph structure of the nodes involved in the queries.
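The sample queries reach answers by chaining predicates with "/" (SPARQL sequence property paths), which walks from a questionnaire response through its nested item nodes. The following is a minimal in-memory sketch of that traversal in Python, not how Neptune evaluates paths. The node labels such as "_:item4" and "qr:response" are hypothetical stand-ins for the blank nodes and the resource IRI in the sample file.

```python
# A handful of triples mirroring one answer in the sample file.
# Node labels ("qr:response", "_:item4", ...) are hypothetical placeholders.
triples = [
    ("qr:response", "qr:item", "_:item4"),
    ("_:item4", "qr:item.item", "_:item4_2"),
    ("_:item4_2", "qr:item.item.linkId", "_:linkId"),
    ("_:linkId", "fhir:value", "4.2"),
    ("_:item4_2", "qr:item.item.answer", "_:answer"),
    ("_:answer", "qr:item.item.answer.valueString", "_:valueString"),
    ("_:valueString", "fhir:value", "Metal & Electronics"),
]


def follow_path(triples, start, path):
    """Follow a sequence of predicates from a start node and return the reached nodes."""
    nodes = {start}
    for predicate in path:
        # Keep every object reachable from the current nodes via this predicate.
        nodes = {o for s, p, o in triples if s in nodes and p == predicate}
    return nodes


industry_path = [
    "qr:item",
    "qr:item.item",
    "qr:item.item.answer",
    "qr:item.item.answer.valueString",
    "fhir:value",
]
print(follow_path(triples, "qr:response", industry_path))
```

Each step narrows the node set to the objects of matching triples, which is why a path over a predicate that does not exist yields an empty result rather than an error.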

Sample query 1: Identify patients that work in the same industry

The following SPARQL query matches questionnaire responses with the same answer to question 4.2 “Employer industry” and returns the patients that correspond to these. As a result, you can quickly identify the clusters of patients that work or worked in the same industry.

PREFIX fhir: <http://hl7.org/fhir/>
PREFIX qr: <http://hl7.org/fhir/QuestionnaireResponse.>

CONSTRUCT {
  ?questionnaireResponse fhir:value ?patient ;
                         fhir:value ?industryAnswer .

  ?questionnaireResponse a fhir:QuestionnaireResponse .
  ?patient a fhir:Patient .
}
WHERE {
  ?questionnaireResponse qr:subject/fhir:Reference.reference/fhir:value ?patient ;
                         qr:item/qr:item.item ?item4_2 .
  ?item4_2 qr:item.item.answer/qr:item.item.answer.valueString/fhir:value ?industryAnswer ;
           qr:item.item.linkId/fhir:value "4.2" .
}

The visualization makes it easy to identify industries that were named very often or less frequently. Pharma & Health represents an industry that was named by many patients. The following image shows the graph visualization of the query result.
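Outside the Neptune workbench, a query like this one could also be sent to the cluster's SPARQL HTTP endpoint. The following Python sketch only builds the request and leaves sending it commented out. It rests on several assumptions: the endpoint host name is a placeholder, IAM database authentication is not enabled (otherwise requests must be signed), and the query is a SELECT variant of the CONSTRUCT query so that the result is a plain table.

```python
import urllib.parse
import urllib.request

# Placeholder endpoint; replace with your own cluster endpoint (assumption).
NEPTUNE_SPARQL_ENDPOINT = "https://your-neptune-endpoint:8182/sparql"

# SELECT variant of sample query 1 (an adaptation, not the original query).
QUERY = """
PREFIX fhir: <http://hl7.org/fhir/>
PREFIX qr: <http://hl7.org/fhir/QuestionnaireResponse.>

SELECT ?patient ?industryAnswer
WHERE {
  ?questionnaireResponse qr:subject/fhir:Reference.reference/fhir:value ?patient ;
                         qr:item/qr:item.item ?item4_2 .
  ?item4_2 qr:item.item.answer/qr:item.item.answer.valueString/fhir:value ?industryAnswer ;
           qr:item.item.linkId/fhir:value "4.2" .
}
"""


def build_sparql_request(endpoint, query):
    """Build (but do not send) an HTTP POST request carrying a SPARQL query."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


request = build_sparql_request(NEPTUNE_SPARQL_ENDPOINT, QUERY)

# Sending the request requires network access to the cluster's VPC:
# import json
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)
```

Because a Neptune cluster is only reachable from inside its VPC, a script like this would typically run on the notebook instance or another host in the same network.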

Sample query 2: Identify industries with common hazards

The following query matches the answers to question 4.2 “Employer industry” and 4.3 “Hazards in Workplace.” Answers stating no hazards are filtered out. This gives an overview of hazards that are more common in some industries than in others.

PREFIX fhir: <http://hl7.org/fhir/>
PREFIX qr: <http://hl7.org/fhir/QuestionnaireResponse.>

CONSTRUCT {
  ?parentItem4 fhir:value ?industryAnswer ;
               fhir:value ?hazardAnswer .

  ?parentItem4 a qr:item.item .
  ?industryAnswer a fhir:value .
}
WHERE {
  ?industryAnswer ^fhir:value/^qr:item.item.answer.valueString/^qr:item.item.answer ?item4_2 .
  ?item4_2 qr:item.item.linkId/fhir:value "4.2" ;
           ^qr:item.item ?parentItem4 .
  ?parentItem4 qr:item.item ?item4_3 .
  ?item4_3 qr:item.item.linkId/fhir:value "4.3" ;
           qr:item.item.answer/qr:item.item.answer.valueString/fhir:value ?hazardAnswer .

  FILTER('None' != ?hazardAnswer)
}

Given the density of nodes, you can identify two general clusters of industries related to more and less threatening hazards:

The first cluster contains industries related to Safety hazards, Biological hazards, Chemical hazards, and Physical hazards. The Construction industry, for example, is closely related to Safety hazards.
The second cluster contains industries related to Ergonomic hazards and Workload hazards. The Service & Crafts industry, for example, is linked to Ergonomic hazards.

Some questionnaire responses link industries with hazards from the other cluster, but most patients answered on hazards within one cluster. You can use this information to dive deeper into these cases and understand where the difference comes from.

The following image shows the graph visualization of the query result.
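If you prefer numbers over a visual impression of node density, the industry and hazard pairs can also be counted. The following Python sketch assumes the pairs have already been extracted from a query result. The sample pairs are made up for illustration and are not taken from the dataset.

```python
from collections import Counter

# Hypothetical (industry, hazard) pairs; illustrative values only.
answer_pairs = [
    ("Construction", "Safety hazards"),
    ("Construction", "Safety hazards"),
    ("Construction", "None"),
    ("Service & Crafts", "Ergonomic hazards"),
    ("Pharma & Health", "Biological hazards"),
]


def count_hazards_by_industry(pairs):
    """Count how often each (industry, hazard) pair occurs, skipping 'None' answers."""
    # Mirrors the FILTER in the query: answers stating no hazards are dropped.
    return Counter((industry, hazard) for industry, hazard in pairs if hazard != "None")


counts = count_hazards_by_industry(answer_pairs)
for (industry, hazard), occurrences in counts.most_common():
    print(f"{industry} / {hazard}: {occurrences}")
```

Pairs with high counts correspond to the densely connected areas in the visualization.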

Sample query 3: Get questionnaires with similar answers for a question group compared to a single questionnaire

The following query compares answers of patients in the question section 1 “Drinking and smoking behavior,” which contains five questions:

How many liters of beer do you consume in a week?
How many liters of wine do you consume in a month?
How many years have you been smoking?
How many cigarettes do you currently smoke per day?
How many cigars do you currently smoke per week?

The result of the query is a list of questionnaire responses that match the answers of a particular questionnaire response (QuestionnaireResponse/92d290e2-26a6-4474-9085-71f3b146dfd5) in at least four out of five answers. The questionnaire response against which the responses are matched is also included in the result list in this example.

PREFIX fhir: <http://hl7.org/fhir/>
PREFIX qr: <http://hl7.org/fhir/QuestionnaireResponse.>

SELECT ?similarQR (COUNT(?sameAnswerValue) AS ?sameAnswerCount)
WHERE {
  <http://hl7.org/fhir/QuestionnaireResponse/92d290e2-26a6-4474-9085-71f3b146dfd5> qr:item ?parentItem1_a .
  ?parentItem1_a qr:item.linkId/fhir:value "1" ;
                 qr:item.item ?subItem_a .
  ?subItem_a qr:item.item.answer/qr:item.item.answer.valueInteger/fhir:value ?sameAnswerValue ;
             qr:item.item.text/fhir:value ?question .

  ?similarQR qr:item ?parentItem1_b .
  ?parentItem1_b qr:item.linkId/fhir:value "1" ;
                 qr:item.item ?subItem_b .
  ?subItem_b qr:item.item.answer/qr:item.item.answer.valueInteger/fhir:value ?sameAnswerValue ;
             qr:item.item.text/fhir:value ?question .
}
GROUP BY ?similarQR
HAVING (?sameAnswerCount > 3)
ORDER BY DESC(?sameAnswerCount)

The following image shows the table visualization of the query result.
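The matching rule behind this query (same question, same answer value, counted per questionnaire response, kept when the count exceeds three) can be sketched in plain Python. This is an illustration of the logic under stated assumptions, not a replacement for the query: the response identifiers and answer values are invented, and questions are keyed by linkId here, whereas the query joins on the question text.

```python
def count_matching_answers(reference, candidate):
    """Count the questions that both responses answered with the same value."""
    return sum(
        1 for question, answer in reference.items() if candidate.get(question) == answer
    )


def find_similar_responses(reference_id, responses, min_matches=4):
    """Return (response id, match count) pairs with at least min_matches equal answers."""
    reference = responses[reference_id]
    scored = [
        (response_id, count_matching_answers(reference, answers))
        for response_id, answers in responses.items()
    ]
    similar = [(response_id, n) for response_id, n in scored if n >= min_matches]
    # Highest match count first, like ORDER BY DESC(?sameAnswerCount).
    return sorted(similar, key=lambda pair: pair[1], reverse=True)


# Hypothetical answers to questions 1.1 to 1.5 for three responses.
responses = {
    "qr-a": {"1.1": 4, "1.2": 2, "1.3": 0, "1.4": 0, "1.5": 0},
    "qr-b": {"1.1": 4, "1.2": 1, "1.3": 0, "1.4": 0, "1.5": 0},
    "qr-c": {"1.1": 0, "1.2": 0, "1.3": 10, "1.4": 20, "1.5": 0},
}

print(find_similar_responses("qr-a", responses))
```

As in the query, the reference response matches itself in all five answers and therefore appears in its own result list.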

Solution overview

In the walkthrough, you use Amazon Simple Storage Service (Amazon S3) to store the sample dataset of FHIR QuestionnaireResponse resources. You load the data into a graph with the Neptune bulk loader. Then you use the Neptune workbench, which builds on Jupyter notebooks, to query your graph using SPARQL and visualize the query results. The architecture is visualized in the following image.

We complete the following steps to implement the solution:

Create a Neptune database.
Create an S3 bucket and upload the sample dataset of 500 FHIR QuestionnaireResponse resources in RDF format.
Grant Neptune access to Amazon S3.
Start the Neptune workbench.
Load the RDF data into Neptune via the bulk loader and run queries in the sample Jupyter notebook.

You can find the sample dataset and Jupyter notebook in the GitHub repository. You will be responsible for costs incurred by service use beyond the AWS Free Tier based on the AWS Pricing.

Prerequisites

For this walkthrough, you should have the following prerequisites:

An AWS account
Permission to create an S3 bucket, a Neptune cluster, and a SageMaker Notebook instance
Access to the AWS Identity and Access Management (IAM) console and permission to manage IAM roles and policies

Create a Neptune database

To create your Neptune database, complete the following steps:

Create a database on the Neptune console using the Development and Testing template.
Ensure that the Create notebook option is enabled.
Specify a custom name for the notebook and the IAM role.
Keep all other options at the default values.

The Databases section should look like the following screenshot.

Create an S3 bucket and upload your dataset

To create an S3 bucket for the questionnaire responses, complete the following steps:

On the Amazon S3 console, choose Buckets in the navigation pane.
Choose Create bucket.
For Bucket name, enter a globally unique name.
Choose the Region in which you set up your Neptune cluster.
Accept the default settings and choose Create bucket.
In the bucket list, select the bucket you just created.
Unzip the dataset zip file from GitHub, because the Neptune bulk loader cannot unpack a compressed file that contains multiple bulk load files.
Upload all 500 individual Turtle files from the unzipped sample dataset by doing one of the following:
Drag and drop the files into the S3 bucket listing.
Choose Upload, and follow the prompts to choose and upload the files.
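The unzip step can also be scripted before you upload the files. The following Python sketch extracts an archive and lists the Turtle files it contains, so you can check that all 500 files are present. The archive and folder names in the usage comment are hypothetical placeholders.

```python
import zipfile
from pathlib import Path


def extract_turtle_files(archive_path, target_dir):
    """Extract a zip archive and return the sorted paths of all .ttl files in it."""
    target = Path(target_dir)
    with zipfile.ZipFile(archive_path) as archive:
        # Only use extractall on archives from a source you trust.
        archive.extractall(target)
    return sorted(target.rglob("*.ttl"))


# Usage with hypothetical file names:
# turtle_files = extract_turtle_files("dataset.zip", "dataset")
# print(len(turtle_files))
```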

The Objects section should look like the following screenshot.

Grant access to Amazon S3 for Neptune

To grant Neptune access to Amazon S3, complete the following steps:

Create an IAM role to access Amazon S3.
Add the IAM role to your Neptune cluster.
Note the ARN of the IAM role you created to use in a later step.
Create a VPC Gateway endpoint for Amazon S3.
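The IAM role from the first step needs a trust policy that lets Neptune assume it, plus read access to the bucket. The following Python sketch prints both documents as JSON. Neptune uses the rds.amazonaws.com service principal for this purpose. The bucket name is a placeholder, and the scoped read policy is an assumption made for illustration; attaching the AWS managed policy AmazonS3ReadOnlyAccess is a simpler alternative.

```python
import json

# Trust policy that allows Neptune (through the RDS service principal) to assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "rds.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def s3_read_policy(bucket_name):
    """Build a read-only policy scoped to one bucket (illustrative assumption)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


# "your-fhir-bucket" is a hypothetical bucket name.
print(json.dumps(trust_policy, indent=2))
print(json.dumps(s3_read_policy("your-fhir-bucket"), indent=2))
```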

Start a Neptune workbench and load your data

To start a Neptune workbench for your queries, complete the following steps:

On the Neptune console, in the Databases list, verify that the status of your recently created Neptune database cluster shows as Available. If it doesn’t, wait until the status changes.
Choose Notebooks in the navigation pane.
Open the notebook that you created earlier.
Upload the sample Jupyter notebook file and open it.

You should now see a directory structure as in the following screenshot.

Follow the instructions in the notebook to load your data into the graph database and run queries against it. To run cells in the notebook, select them and do one of the following:
Choose the Run icon in the toolbar.
Press Shift + Enter.

Follow the rest of the instructions in the notebook to run the sample queries on your database.
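For orientation, the bulk load that the notebook triggers corresponds to an HTTP POST against the cluster's loader endpoint. The following Python sketch builds such a request without sending it. The endpoint, bucket name, role ARN, and Region are placeholders, and the sample notebook may start the load differently, so treat this as an outline of the request rather than the notebook's implementation.

```python
import json
import urllib.request


def build_loader_request(endpoint, bucket_name, role_arn, region):
    """Build (but do not send) a bulk load request for Turtle files in an S3 bucket."""
    payload = {
        "source": f"s3://{bucket_name}/",  # load every file in the bucket
        "format": "turtle",                # RDF serialization of the sample dataset
        "iamRoleArn": role_arn,            # role that grants Neptune access to Amazon S3
        "region": region,                  # Region of the bucket and the cluster
        "failOnError": "FALSE",            # keep loading if a single file fails
    }
    return urllib.request.Request(
        f"{endpoint}/loader",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# All argument values below are hypothetical placeholders.
loader_request = build_loader_request(
    "https://your-neptune-endpoint:8182",
    "your-fhir-bucket",
    "arn:aws:iam::123456789012:role/YourNeptuneLoadRole",
    "us-east-1",
)

# Sending the request requires network access to the cluster's VPC:
# with urllib.request.urlopen(loader_request) as response:
#     print(json.load(response))
```

The role ARN passed here is the one noted when granting Neptune access to Amazon S3.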

Clean up

To avoid incurring future charges, delete the resources you created:

On the Neptune console, on the Databases page, select the cluster you created.
On the Actions menu, choose Delete.
On the Amazon S3 console, on the Buckets page, select the bucket you created and choose Empty.
Follow the instructions and choose Empty.
Select the bucket and choose Delete.
Follow the instructions and choose Delete bucket.
On the Amazon VPC console, on the Endpoints page, select the endpoint you created.
On the Actions menu, choose Delete VPC endpoints.
Follow the instructions and choose Delete.
On the IAM console, on the Roles page, select the role you created and choose Delete.
Follow the instructions and choose Delete.

Conclusion

In this post, you learned how to load FHIR data into Neptune and use SPARQL to run different queries on the data. With these queries, you identified clusters and items with similar properties. The visualizations in the Neptune workbench make these results easier to analyze. Storing FHIR data in a graph database helps you generate such insights. Because Neptune allows you to store the data in its natural form, you don’t need to create schemas or join many tables.

In the walkthrough, you only worked with FHIR QuestionnaireResponse resources. However, you’re not limited to analyses of this resource type. Take your analyses to the next level! Upload data of different FHIR resource types to a Neptune graph and build your own SPARQL queries to generate even more valuable insights.

Check out the open-source Jupyter notebooks or the Getting Started with Amazon Neptune learning path to learn more on graph technology and Neptune.

Go a step beyond human-written queries and learn how to find hidden correlations in your graph by applying Neptune machine learning, which uses graph neural networks (GNNs) to make predictions based on the relationships in your data.

Also refer to Analyzing healthcare FHIR data with Amazon Redshift PartiQL for a complementary approach to analyzing FHIR data.

About the author

Alena Schmickl is a Startup Solutions Architect (SA) for healthcare and life sciences at AWS. Her passion is using technology as an enabler for advancing healthcare.

Read More: AWS Database Blog
