More AI Developers Focused on Engineering the Bias Out of AI

By mullaned2002

September 9, 2021

By John P. Desmond, AI Trends Editor

With AI systems today determining whether someone can get a job or a loan, it’s in the interest of the company running the AI system to make sure the underlying dataset is not so biased that it leads to errors in its conclusions.

Cases of biased data leading to biased results have been documented, such as in the research of Joy Buolamwini and Timnit Gebru, authors of a 2018 study that showed facial-recognition algorithms were very good at identifying white males but recognized Black females only two-thirds of the time. If law enforcement uses such a system to identify suspects, the errors can lead to serious problems.

The stage is set for a serious effort to reduce bias in the datasets on which AI systems rely. “It’s an opportunity,” stated Alexandra Ebert, chief trust officer at Mostly AI, a startup focused on synthetic data based in Vienna, quoted in a recent account in IEEE Spectrum. Businesses, data scientists, and engineers are beginning to focus on how to remove bias from AI datasets and algorithms, for the betterment of society.

Training datasets may underrepresent minority groups and may reflect historical inequities, such as lower salaries for women, or racial bias, such as Asian-Americans being labeled as foreigners. Models that learn from biased training data will exhibit the same biases. Collecting high-quality data that is balanced and inclusive can be expensive.

That’s where suppliers of synthetic data such as Mostly AI see an opportunity. They can, for example, create a person who may never have existed but who fits the pattern of existing data on attributes such as race, income, and educational background. The new individual would “behave like a female with higher income would behave, so that all the data points from the person match up and make sense,” Ebert stated. The synthetic data may sacrifice some accuracy, but it is still statistically highly representative.

Another synthetic data startup is Synthesized, based in London, whose founders were machine learning researchers at the University of Cambridge. The company is focused on serving data scientists. Mostly AI and several other firms are working toward the launch of an IEEE standards group on synthetic data, Ebert stated.

Toolkits, Frameworks Emerging to Help Reduce Bias in Datasets

Developers are creating tools to help reduce bias in AI. These include Aequitas, which measures bias in uploaded datasets, and Themis-ml, which puts datasets through bias-mitigation algorithms.
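To make "measuring bias in a dataset" concrete, here is a minimal sketch in plain Python of two commonly used group-fairness measures: the disparate impact ratio and the statistical parity difference. It is an illustration only and does not reproduce the API of Aequitas, Themis-ml, or any other tool named here. The loan records, the group names "A" and "B", and all function names are hypothetical.

```python
from collections import defaultdict

def positive_rates(records, group_key, label_key, favorable=1):
    """Return the share of favorable outcomes for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in records:
        group = row[group_key]
        totals[group] += 1
        if row[label_key] == favorable:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}

def disparate_impact(rates, unprivileged, privileged):
    """Ratio of favorable-outcome rates; 1.0 means parity.

    Assumes the privileged group has at least one favorable outcome.
    """
    return rates[unprivileged] / rates[privileged]

def statistical_parity_difference(rates, unprivileged, privileged):
    """Difference of favorable-outcome rates; 0.0 means parity."""
    return rates[unprivileged] - rates[privileged]

# Hypothetical toy data: loan decisions (1 = approved) for two groups.
loans = [
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 0},
    {"group": "B", "approved": 1},
    {"group": "B", "approved": 0},
    {"group": "B", "approved": 0},
    {"group": "B", "approved": 0},
]

rates = positive_rates(loans, "group", "approved")
di = disparate_impact(rates, unprivileged="B", privileged="A")
spd = statistical_parity_difference(rates, unprivileged="B", privileged="A")

print("approval rates:", rates)
print("disparate impact ratio:", round(di, 3))
print("statistical parity difference:", spd)
```

In this toy data, group B is approved far less often than group A, so the ratio falls well below 1.0 and the difference is negative. Real audits would look at many more records and at several metrics, since no single number captures fairness.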

A team at IBM has assembled a comprehensive open-source toolkit called AI Fairness 360, which helps detect and reduce unwanted bias in datasets and machine-learning models. It assembles 14 different bias-mitigation algorithms developed by computer scientists over the past decade, and is designed to be intuitive to use. “The idea is to have a common interface to make these tools available to working professionals,” Kush Varshney, a research manager at IBM Research AI in Yorktown Heights, New York, and leader of the project, told IEEE Spectrum.

The tools implement different techniques to adjust the data. Reweighing, for example, gives higher weight to input/output pairs that give the underprivileged group a more positive outcome. Other techniques adjust the machine learning algorithm itself, for example by optimizing for whichever group has less data, to nudge the model toward fairer outcomes across groups.
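The reweighing idea can be sketched in a few lines of plain Python. The version below follows the commonly described scheme in which each (group, label) pair is weighted by the count expected if group and label were independent, divided by the observed count. It is an illustration under that assumption, not the AI Fairness 360 implementation, and the data and function names are hypothetical.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Weight for each (group, label) pair: expected count / observed count.

    The expected count assumes group membership and label are independent.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    pair_counts = Counter(zip(groups, labels))
    weights = {}
    for (group, label), observed in pair_counts.items():
        expected = group_counts[group] * label_counts[label] / n
        weights[(group, label)] = expected / observed
    return weights

def weighted_positive_rate(groups, labels, weights, group, favorable=1):
    """Share of favorable outcomes in one group after applying the weights."""
    total = 0.0
    positive = 0.0
    for g, y in zip(groups, labels):
        if g != group:
            continue
        w = weights[(g, y)]
        total += w
        if y == favorable:
            positive += w
    return positive / total

# Hypothetical toy data: group "B" rarely receives the favorable label 1.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

weights = reweighing_weights(groups, labels)
rate_a = weighted_positive_rate(groups, labels, weights, "A")
rate_b = weighted_positive_rate(groups, labels, weights, "B")

print("weights:", weights)
print("weighted favorable rate, group A:", rate_a)
print("weighted favorable rate, group B:", rate_b)
```

Here the single favorable example from group B receives a weight above 1, while the more common favorable examples from group A are weighted below 1, so the weighted favorable rates of the two groups come out equal. A learner that accepts per-sample weights could then be trained on the reweighted data.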

At the root of fairness in AI is the dataset. “We can’t say a priori that this algorithm will work best for your fairness problem or dataset,” stated Varshney. “You have to figure out which algorithm is best for your data.” He has seen developers learn to use the bias-reducing toolkit. “There’s some nuance to it, but once you make up your mind to mitigate bias, yes you can do it,” he stated.

Checking on Whether Developer Worldviews Are Influencing Datasets

AI engineering managers need to be aware of whether their AI engineers are passing their own biases onto the systems they develop. “The success of any AI application is intrinsically tied to its training data,” stated Shomron Jacob, engineering manager for application machine learning and platform at Iterate.ai, in a recent account in VentureBeat. Iterate.ai is a startup based in San Jose building an AI platform that in part helps startups participate in large enterprises.

“If engineers allow their own worldviews and assumptions to influence datasets—perhaps supplying data that is limited to only certain demographics or focal points—applications dependent on AI problem-solving will be similarly biased, inaccurate, and, well, not all that useful,” Jacob stated. “I expect bias scrutiny is only going to increase as AI continues its rapid transition from a relatively nascent technology into an utterly ubiquitous one. But human bias must be overridden to truly achieve that reality.”

AI development organizations need to employ effective frameworks, toolkits, processes, and policies for recognizing and mitigating AI bias. Available open-source tools can help find blind spots in data.

AI frameworks are designed to protect organizations from the risks of AI bias by introducing checks and balances. Benchmarks for trusted, bias-free practices can be automated and ingrained into products using these frameworks, Jacob advised.

He suggested these example AI frameworks:

The Aletheia Framework from Rolls-Royce provides a 32-step process for designing accurate and carefully managed AI applications;

Deloitte’s AI framework highlights six essential dimensions for implementing AI safeguards and ethical practices;

And a framework from Naveen Joshi details cornerstone practices for developing trustworthy AI. It focuses on the need for explainability, machine learning integrity, conscious development, reproducibility, and smart regulations.

And Jacob suggested these example AI toolkits, including the AI Fairness 360 previously mentioned:

IBM Watson OpenScale provides real-time bias detection and mitigation and enables detailed explainability to help make AI predictions trusted and transparent;

Google’s What-If Tool offers visualization of machine learning model behavior, making it easier to test trained models against machine learning fairness metrics to root out bias.

One Team Practices Community-Based System Dynamics

One AI engineer values an approach that combines many stakeholders in the initial definition of an AI project. The team needs to take into account the social implications of its implementation, suggests Damian Scalerandi, VP of operations at BairesDev, author of a recent account in Forbes. The San Francisco-based BairesDev offers AI software development services to its clients.

AI development is likely to have its blind spots. “And our best chance to find them and patch them is to collaborate with the people closest to the societal context itself—sociologists, behavioral scientists and humanities specialists,” Scalerandi stated.

Some engineers refer to this approach as community-based system dynamics (CBSD), a term introduced in 2013 in a book by that name by author Peter S. Hovmand.

“Together, we can form a shared hypothesis of how a certain algorithm could work and how we can best guarantee win-win scenarios,” Scalerandi stated. “In the end, this is all about supporting technological innovations that are fair, safe, and beneficial to everyone.”

Read the source articles and information in IEEE Spectrum, in VentureBeat and in Forbes.
