Understanding business trends, customer behavior, sales revenue, increase in demand, and buyer propensity all start with data. Exploring, analyzing, interpreting, and finding trends in data is essential for businesses to achieve successful outcomes.
Business analysts play a pivotal role in facilitating data-driven business decisions through activities such as the visualization of business metrics and the prediction of future events. Quick iteration and faster time-to-value can be achieved by providing these analysts with a visual business intelligence (BI) tool for simple analysis, supported by technologies like machine learning (ML).
Amazon QuickSight is a fully managed, cloud-native BI service that makes it easy to connect to your data, create interactive dashboards and reports, and share these with tens of thousands of users, either within QuickSight or embedded in your application or website. Amazon SageMaker Canvas is a visual interface that enables business analysts to generate accurate ML predictions on their own, without requiring any ML experience or having to write a single line of code.
In this post, we show how you can publish predictive dashboards in QuickSight using ML-based predictions from Canvas, without explicitly downloading the predictions and importing them into QuickSight. This solution lets you send predictions directly from Canvas to QuickSight, so you can make faster, ML-informed decisions and achieve effective business outcomes.
Solution overview
In the following sections, we discuss steps that will help administrators configure the right permissions to seamlessly redirect users from Canvas to QuickSight. Then we detail how to build a model and run predictions, and demonstrate the business analyst experience.
Prerequisites
The following prerequisites are needed to implement this solution:
An AWS account with permissions to create AWS Identity and Access Management (IAM) policies and roles.
Access to Amazon SageMaker, an instance of Amazon SageMaker Studio, and a user for Studio. For more information about prerequisites, see Getting started with using Amazon SageMaker Canvas.
A QuickSight subscription. For this post, we only use QuickSight features included in the Standard subscription.
Access to the QuickSight dashboard to author and analyze the inferred data.
Make sure to use the same QuickSight Region as Canvas. You can change the Region by navigating from the profile icon on the QuickSight console.
Administrator setup
In this section, we detail the steps to set up IAM resources, prepare the data, train the model with the training dataset, and run inference on the validation dataset. We then send the predictions to QuickSight for further analysis.
Create a new IAM policy for QuickSight access
To create an IAM policy, complete the following steps:
On the IAM console, choose Policies in the navigation pane.
Choose Create policy.
On the JSON tab, enter the following permissions policy into the editor:
For details about the IAM policy language, see IAM JSON policy reference.
Choose Next: Tags.
You can add metadata to the policy by attaching tags as key-value pairs, then choose Next: Review.
For more information about using tags in IAM, see Tagging IAM resources.
On the Review policy page, enter a name (for example, canvas-quicksight-access-policy) and an optional description of the policy.
Review the Summary section to see the permissions that are granted by your policy.
Choose Create policy to save your work.
After you create the policy, attach it to your execution role to grant your users the permissions they need to send batch predictions to users in QuickSight.
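The text of the permissions policy is not reproduced in this post. As a rough sketch only, the following Python snippet builds a policy document of the general shape such a policy could take. The action names and resource types are an assumption based on our recollection of the Canvas documentation for sending predictions to QuickSight, and the account ID is a placeholder. Verify both against the current documentation before pasting the output into the JSON tab.

```python
import json

# ASSUMPTION: these QuickSight actions are recalled from the Canvas
# documentation; they are not listed anywhere in this post.
ASSUMED_ACTIONS = [
    "quicksight:CreateDataSet",
    "quicksight:ListNamespaces",
    "quicksight:CreateDataSource",
    "quicksight:PassDataSet",
    "quicksight:PassDataSource",
]

# ASSUMPTION: resource types the policy is scoped to.
ASSUMED_RESOURCE_TYPES = ["datasource", "user", "namespace", "dataset"]


def build_quicksight_policy(account_id: str) -> dict:
    """Return an IAM policy document scoped to QuickSight resources in one account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(ASSUMED_ACTIONS),
                "Resource": [
                    f"arn:aws:quicksight:*:{account_id}:{resource_type}/*"
                    for resource_type in ASSUMED_RESOURCE_TYPES
                ],
            }
        ],
    }


# 123456789012 is a placeholder; substitute your own AWS account ID.
print(json.dumps(build_quicksight_policy("123456789012"), indent=4))
```

Generating the document from a function keeps the account ID in one place, which can help avoid copy-and-paste mistakes across the resource ARNs.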
Attach the policy to your Studio execution role
To attach the policy to your Studio execution role, complete the following steps:
On the SageMaker console, choose Domains in the navigation pane.
Choose your domain.
Choose Domain settings.
Copy the role name under Execution role.
On the IAM console, choose Roles in the navigation pane.
In the search bar, enter the execution role you copied, then choose the role.
On the page for the user’s role, navigate to the Permissions policies section.
On the Add permissions menu, choose Attach policies.
Search for the previously created policy (canvas-quicksight-access-policy), select it, and choose Add permissions.
Now you have an IAM policy attached to your execution role that grants your users the necessary permissions to send batch predictions to users in QuickSight.
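If you prefer to script the attachment rather than use the console, the steps above can be sketched as follows. This is a minimal example, assuming the value copied under Execution role is a full role ARN and that you pass in an IAM client created with boto3 (`boto3.client("iam")`). The helper names are hypothetical; `attach_role_policy` is the boto3 IAM client call, which expects the role name rather than the ARN.

```python
def role_name_from_arn(role_arn: str) -> str:
    """Extract the role name (last path segment) from an IAM role ARN."""
    parts = role_arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "iam" or not parts[5].startswith("role/"):
        raise ValueError(f"not an IAM role ARN: {role_arn!r}")
    # The resource part may include a path, e.g. "role/service-role/MyRole".
    return parts[5].rsplit("/", 1)[1]


def attach_policy_to_role(iam_client, role_arn: str, policy_arn: str) -> str:
    """Attach a managed policy to the role and return the role name used."""
    role_name = role_name_from_arn(role_arn)
    iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return role_name
```

Passing the client in as an argument keeps the helper easy to try against a stub before running it with real credentials.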
Download the datasets
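Download the two datasets used in this post: lending_club_loan_data_train.csv, which we use to train the model, and lending_club_loan_data_test.csv, which we use to make the predictions.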
Build a model and run predictions
In this section, we cover how we can build a model and run predictions on the loan dataset. Then we send the data to the QuickSight dashboard to get business insights.
Launch Canvas
To launch Canvas, complete the following steps:
On the SageMaker console, choose Domains in the navigation pane.
Choose your domain.
On the Launch menu, choose Canvas.
Upload training and validation datasets
Complete the following steps to upload your datasets to Canvas:
On the Canvas home page, choose Datasets.
Choose Import data, then upload lending_club_loan_data_train.csv and lending_club_loan_data_test.csv.
Choose Save & Close, then choose Import data.
Now let’s create a new model.
Choose My models in the navigation pane.
Choose New model.
Enter a name for your model (Loan_Prediction) and choose Create.
If this is the first time creating a Canvas model, you will be welcomed by an informative pop-up about how to build your first model in four simple steps. You can read this through, then come back to this guide.
In the model view, on the Select tab, select the lending_club_loan_data_train dataset.
This dataset has 18 columns and 32,000 rows.
Choose Select dataset.
On the Build tab, choose the target column, in our case loan_status.
Canvas will automatically detect that this is a 3+ category prediction problem (also known as multi-class classification).
If another model type is detected, change it manually by choosing Change type.
Choose Quick build, and select Start quick build from the pop-up.
You can also choose Standard build, which goes through the complete AutoML cycle, generating multiple models before recommending the best model.
Now your model is being built. Quick build usually takes 2–15 minutes.
After the model is built, you can find the model status on the Analyze tab.
Make predictions with the model
After we build and train the model, we can use it to generate predictions.
Choose Predict on the Analyze tab, or choose the Predict tab.
Run a single prediction by choosing Single prediction and providing entries.
You will see the loan_status prediction on the right side of the page. You can copy the prediction by choosing Copy, or download it by choosing Download prediction. This is ideal for generating what-if scenarios and testing how different columns impact the predictions of our model.
To run batch predictions, choose Batch prediction.
This is best when you’d like to make predictions for an entire dataset. Use a dataset whose columns match those of the dataset you used to build the model.
For each prediction or set of predictions, Canvas returns the predicted values and the probability of the predicted value being correct.
Let’s make predictions from the trained model using the validation dataset.
Choose Select the dataset.
Select lending_club_loan_data_test and choose Generate predictions.
When your predictions are ready, you can find them in the Dataset section. You can preview the prediction, download it to a local machine, delete it, or send it to QuickSight.
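Because batch predictions work best on a dataset that matches the input dataset, it can be worth comparing the column headers of the two CSV files before uploading them. The following is a minimal local sketch using only the Python standard library. The function names and the sample headers are hypothetical, and the check is not part of Canvas itself.

```python
import csv


def read_header(lines):
    """Return the column names from the first row of an iterable of CSV lines."""
    return next(csv.reader(lines))


def missing_feature_columns(train_header, test_header, target):
    """List training columns (excluding the target) that the test data lacks."""
    test_columns = set(test_header)
    return [
        column
        for column in train_header
        if column != target and column not in test_columns
    ]


# Hypothetical headers for illustration; the real files have 18 columns.
train_header = read_header(["loan_amount,term,loan_status\n"])
test_header = read_header(["loan_amount,loan_status\n"])
print(missing_feature_columns(train_header, test_header, "loan_status"))
```

To run the check on real files, open each one with `open(path, newline="")` and pass the file object to `read_header`. An empty list means every feature column used for training is present in the prediction dataset.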
Send predictions to QuickSight
You can now share predictions from these ML models as QuickSight datasets that will serve as a new source for enterprise-wide dashboards. You can analyze trends, risks, and business opportunities. Through this capability, ML becomes more accessible to business teams so they can accelerate data-driven decision-making. Sharing data with QuickSight users grants them owner permissions on the dataset. Multiple inferred datasets can be sent at once to QuickSight.
Note that you can only send predictions to users in the default namespace of the QuickSight account, and the user must have the Author or Admin role in QuickSight. Predictions sent to QuickSight are available in the same Region as Canvas.
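The role requirement above can be checked ahead of time. The sketch below filters a list of user records down to those who can receive predictions. It assumes each record is a dictionary with `UserName` and `Role` keys, the shape we understand the QuickSight ListUsers API to return, and that the list was already retrieved for the default namespace. The function name and sample users are hypothetical.

```python
# Roles allowed to receive Canvas predictions, per the note above.
# ASSUMPTION: role values are upper-case strings such as "ADMIN" or "AUTHOR".
ALLOWED_ROLES = {"ADMIN", "AUTHOR"}


def eligible_recipients(users):
    """Return the sorted user names whose role is Author or Admin."""
    return sorted(
        user["UserName"] for user in users if user.get("Role") in ALLOWED_ROLES
    )


# Hypothetical users from the default namespace.
sample_users = [
    {"UserName": "analyst-a", "Role": "AUTHOR"},
    {"UserName": "viewer-b", "Role": "READER"},
    {"UserName": "admin-c", "Role": "ADMIN"},
]
print(eligible_recipients(sample_users))
```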
Select the inferred batch dataset and choose Send to Amazon QuickSight.
Enter one or multiple QuickSight user names to share the dataset with and press Enter.
Choose Send to share data.
After you send your batch predictions, the QuickSight field for the datasets you sent shows as Sent.
In the confirmation box, you can choose Open Amazon QuickSight to open your QuickSight application.
If you’re done using Canvas, log out of the Canvas application.
You can send batch predictions to QuickSight for numeric, categorical prediction, and time series forecasting models. You can also send predictions generated with the bring your own model (BYOM) method. Single-label image prediction and multi-category text prediction models are excluded.
The QuickSight users that you’ve sent datasets to can open their QuickSight console and view the Canvas datasets that have been shared with them. Then they can create predictive dashboards with the data. For more information, see Getting started with Amazon QuickSight data analysis.
By default, all the users to whom you send predictions have owner permissions for the dataset in QuickSight. Owners are able to create analyses, refresh, edit, delete, and reshare datasets. The changes that owners make to a dataset change the dataset for all users with access. To change the permissions, go to the dataset in QuickSight and manage its permissions. For more information, see Viewing and editing the permissions users that a dataset is shared with.
Business analyst experience
With QuickSight, you can visualize your data to better understand it. We start by getting some high-level information.
On the QuickSight console, choose Datasets in the navigation pane.
Create an analysis on the batch prediction dataset shared from Canvas by choosing Create analysis on the drop-down options menu (three vertical dots).
On the analysis page, choose the sheet name and rename it to Loan Data Analysis.
Let’s create a visual to show the count by loan status.
For Visual types, choose Donut chart.
Use the loan_status field for Group/Color.
We can see that 99% are fully paid, 1% are current, and 0% are charged off.
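The donut chart shows each status as a share of the total record count, rounded to a whole percentage. The sketch below reproduces that calculation on made-up labels to illustrate why a category can display as 0% while still containing records. The counts are hypothetical and chosen only to mirror the proportions above.

```python
from collections import Counter


def share_by_category(labels):
    """Return each category's share of the total count, as a percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: 100.0 * count / total for label, count in counts.items()}


# Hypothetical predictions: 1,000 loans with a small charged-off group.
labels = ["Fully Paid"] * 990 + ["Current"] * 8 + ["Charged Off"] * 2
for label, percent in share_by_category(labels).items():
    print(f"{label}: {percent:.1f}% (displayed as {round(percent)}%)")
```

A small but non-empty group rounds down to 0%, so a 0% slice does not by itself mean that no loans fall into that category.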
Now we add a second visual to show the amount of loans by status.
On the top-left corner, choose the plus sign and choose Add visual.
For Visual types, choose Waterfall chart.
Use the loan_status field for Category.
Use the loan_amount field for Value.
We can see that the total loan amount is around $88 million, with around $221,000 charged off.
Let’s try to detect some risk drivers for defaulting on loans.
Choose the plus sign and choose Add visual.
For Visual types, choose Horizontal bar chart.
Use the loan_status field for Y axis.
Use the loan_amount field for Value.
Modify the Value field aggregation from Sum to Average.
We can see that on average, the loan amount was around $3,500 lower for the fully paid loans compared to the current loans, and around $3,500 lower for the fully paid loans compared to the charged off loans. There seems to be a correlation between the loan amount and the credit risk.
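The Sum and Average aggregations used in these visuals answer different questions: the sum grows with the number of loans in each status, whereas the average allows statuses of very different sizes to be compared. A minimal sketch of both aggregations follows, using hypothetical rows of status and loan amount rather than the actual prediction data.

```python
from collections import defaultdict


def aggregate_by_status(rows):
    """Compute the sum and average loan amount for each loan status."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for status, amount in rows:
        totals[status] += amount
        counts[status] += 1
    return {
        status: {"sum": totals[status], "average": totals[status] / counts[status]}
        for status in totals
    }


# Hypothetical (loan_status, loan_amount) pairs for illustration only.
rows = [
    ("Fully Paid", 10000.0),
    ("Fully Paid", 12000.0),
    ("Charged Off", 15000.0),
]
for status, stats in aggregate_by_status(rows).items():
    print(status, stats)
```

In this made-up sample, the fully paid group has the larger sum simply because it has more loans, while the charged-off group has the higher average amount.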
To duplicate the visual, choose the options menu (three dots), choose Duplicate visual to, and choose This sheet.
Choose the duplicated visual to modify its configuration.
For Visual types, choose Horizontal bar chart.
Use the loan_status field for Y axis.
Use the loan_amount field for Value.
Modify the Value field aggregation from Sum to Average.
You can create additional visuals to check for additional risk drivers. For example:
Loan term
Open credit lines
Revolving line utilization rate
Total credit lines
After you add the visuals, publish the dashboard using the Share option on the analyses page and share the dashboard with the business stakeholders.
Clean up
To avoid incurring future charges, delete or shut down the resources you created while following this post. Refer to Logging out of Amazon SageMaker Canvas for more details.
Conclusion
In this post, we trained an ML model using Canvas without writing a single line of code thanks to its user-friendly interfaces and clear visualizations. We then generated single and batch predictions for this model in Canvas. To assess the trends, risks, and business opportunities across the enterprise, we sent the predictions of this ML model to QuickSight. As business analysts, we created various visualizations to assess the trends in QuickSight.
This capability is available in all Regions where Canvas is now supported. You can learn more on the Canvas product page and documentation.
About the Authors
Ajjay Govindaram is a Senior Solutions Architect at AWS. He works with strategic customers who are using AI/ML to solve complex business problems. His experience lies in providing technical direction as well as design assistance for modest to large-scale AI/ML application deployments. His knowledge ranges from application architecture to big data, analytics, and machine learning. He enjoys listening to music while resting, experiencing the outdoors, and spending time with his loved ones.
Varun Mehta is a Solutions Architect at AWS. He is passionate about helping customers build enterprise-scale well-architected solutions on the AWS Cloud. He works with strategic customers who are using AI/ML to solve complex business problems.
Shyam Srinivasan is a Principal Product Manager on the AWS AI/ML team, leading product management for Amazon SageMaker Canvas. Shyam cares about making the world a better place through technology and is passionate about how AI and ML can be a catalyst in this journey.