Vertex AI Example-based Explanations improve ML via explainability

By mullaned2002

August 24, 2022

434

Artificial intelligence (AI) can automatically learn patterns that humans can’t detect, making it a powerful tool for getting more value out of data. A high-performing model starts with high-quality data, but in many cases, datasets have issues such as incorrect labels or unclear examples that contribute to poor model performance. Data quality is a constant challenge for enterprises—even some datasets used as machine learning (ML) benchmarks suffer from label errors. ML models are thus often notoriously difficult to debug and troubleshoot. Without special tools, it’s difficult to connect model failures to root causes and even harder to know the next step to resolve the problem.

Today, we’re thrilled to announce the public preview of Vertex AI Example-based Explanations, a novel feature that provides actionable explanations to mitigate data challenges such as mislabeled examples. With Vertex AI Example-based Explanations, data scientists can quickly identify misclassified data, improve datasets, and more efficiently involve stakeholders in the decisions and progress. This new feature takes the guessing games out of model refinement, enabling you to identify problems faster and speed up time to value.

How Examples-based Explanations create better models

Vertex AI Example-based Explanations can be used in numerous ways, from supporting users in building better models to closing the loop with stakeholders. Below, we describe some notable capabilities of the feature:

Figure 1. Use case overview Example-based Explanations

To illustrate the use case of misclassification analysis, we trained an image classification model on a subset of the STL-10 dataset, using only images of birds and planes. We noted some images of birds being misclassified as planes. For one such image, we used Example-based Explanations to retrieve other images in the training data that appeared most similar to this misclassified bird image in the latent space. Examining those, we identified that both the misclassified bird image and the similar images were dark silhouettes.

To take a closer look, we expanded the similar example search to show us the 20 nearest neighbors. From this, we identified that 15 examples were images of planes, and only five were images of birds. This signaled a lack of images of birds with dark silhouettes in the training data, as only one of the training data bird images was a dark-silhouetted one. The immediate actionable insight was to improve the model by gathering more data with images of silhouetted birds.

Figure 2. Use Example-based Explanations for misclassification analysis

Beyond misclassification analysis, Example-based Explanations can enable active learning, so that data can be selectively labeled when its Example-based Explanations come from confounding classes. For instance, if out of 10 total explanations for an image, five are from class “bird” and five are from class “plane,” the image can be a candidate for human annotation, further enriching the data.

Example-based Explanations are not limited to images. They can generate embeddings for multiple types of data: image, text, tabular.

Let’s look at an illustration of how to use Example-based Explanations with tabular data. Suppose we have a trained model that predicts the duration of a bike ride. When examining the model’s projected duration for a bike ride, Example-based Explanations can help us identify issues with the underlying data points.

Looking at row #5 in the below image, the duration seems too long when compared with the distance covered. This bike ride is also very similar to the query ride, which is expected since Example-based Explanations are supposed to find similar examples. Given the distance, time of day, temperature, etc. are all very similar between the query ride and the ride in row #5, the duration label seems suspicious.The immediate next step is to examine this data point more closely and either remove it from the dataset or try to understand if there might be some missing features (say, whether the biker took a snack break) contributing to the difference in durations.

Figure 3. Use Example-based Explanations for tabular data

Getting started with Examples-based Explanations in Vertex AI

It takes only three steps to set up Example-based Explanations.

First, upload your model and dataset. The service will represent the entire dataset in a latent space (called embeddings). As a concrete example, let’s examine words in a latent space. The below visualizations show such word embeddings, where the position in the vector space encodes meaningful semantics of each word, such as the relation between verbs or between a country and its capital.Next, deploy your index and model, after which the Example-based API will be ready to query. Then, you can query for similar data points and only need to repeat steps 1 and 2 when you retrain the model or change your dataset.

Figure 4. Embeddings can capture meaningful semantic information

Under the hood, the Example-based Explanations API builds on cutting-edge technology developed by Google research organizations, described in this blog post and used at scale across a wide range of Google applications, such as Search, YouTube and Play Store. This technology, ScaNN, enables querying for similar examples significantly faster and with better recall, compared to other vector similarity search techniques.

Learn how to use Example-based Explanations by following the instructions available in this configuration documentation. To learn more about Vertex AI, visit our product page or explore this summary of tutorials and resources.

Cloud BlogRead More

Previous articleBuilding a sustainable agricultural supply chain on Google Cloud

Next articleAWS Deep Learning Challenge sees innovative and impactful use of Amazon EC2 DL1 instances

Vertex AI Example-based Explanations improve ML via explainability

How Examples-based Explanations create better models

Getting started with Examples-based Explanations in Vertex AI

Leverage enterprise data with Denodo and Vertex AI for generative AI applications

TypeScript takes aim at truthy and nullish bugs

Hex-LLM: High-efficiency large language model serving on TPUs in Vertex AI Model Garden

LEAVE A REPLY Cancel reply

Most Popular

Schneider Electric automates Salesforce account hierarchy management with generative artificial intelligence (AI) using Amazon Aurora and Amazon Bedrock

Leverage enterprise data with Denodo and Vertex AI for generative AI applications

TypeScript takes aim at truthy and nullish bugs

Make relevant movie recommendations using Amazon Neptune, Amazon Neptune Machine Learning, and Amazon OpenSearch Service

Recent Comments

EDITOR PICKS

Exploring the Click Element Variable in Google Tag Manager

How to track events with Google Tag Manager and Google Analytics

Data Layer Variable in GTM: What, Why, and Where?

POPULAR POSTS

DevOps Awards winner Priceline on leveraging a loosely coupled architecture

Choosing the right GPU for AI, machine learning, and more

Harnessing the power of enterprise data with generative AI: Insights from Amazon Kendra, LangChain, and large language models

POPULAR CATEGORY

Vertex AI Example-based Explanations improve ML via explainability

How Examples-based Explanations create better models

Getting started with Examples-based Explanations in Vertex AI

Vertex Matching Engine: Blazing fast and massively scalable nearest neighbor search

LEAVE A REPLY Cancel reply

Most Popular

Recent Comments

EDITOR PICKS

POPULAR POSTS

POPULAR CATEGORY