Improving Safety of AI and Online Communities with PaLM 2

By mullaned2002

September 7, 2023


To empower developers to identify sensitive content in a rapidly changing media environment, we are excited to announce Text Moderation powered by PaLM 2, available through the Cloud Natural Language API. Built in collaboration with Jigsaw and Google Research, Text Moderation helps organizations scan for sensitive or harmful content. Here are some examples of how the Text Moderation service can be used:

Brand safety: Protect against user-generated and publisher content that is not considered “brand safe” for the advertiser

User protection: Scan for potentially offensive or harmful content

Generative AI risk mitigation: Help safeguard against the generation of inappropriate content in outputs from generative models

Promote brand safety

Brand safety is a set of procedures that aim to protect the reputation and trustworthiness of a brand in the digital age. One of the biggest risks to brand safety is the content that ads are associated with: if an ad appears on a website whose content does not conform with the sponsoring brand’s values, it can reflect poorly on the brand and the organization. It is therefore important for companies to identify and remove content that isn’t aligned with their brand guidelines.

Text Moderation can be used by our customers to identify content that they determine is offensive or harmful, sensitive in context, or otherwise inappropriate for their brand. Once an organization has identified this content, teams can take steps to remove it from advertising campaigns or prevent it from being associated with the brand in the future, helping ensure that advertising campaigns are effective and that the brand is associated with positive and trustworthy content.

Protect users from harmful content

Digital media platforms, gaming publishers, and online marketplaces all have a vested interest in mitigating the risks of user-generated content. They want to provide a safe and welcoming environment for their users while also maintaining an open and free exchange of ideas. Text Moderation can help them achieve this goal, using artificial neural networks to detect and remove harmful content, such as harassment or abuse. These efforts can help reduce harm, improve customer experience, and increase customer retention.

Mitigate risks of generative models

Over the last year, progress in AI has enabled software to more reliably generate text, images, and video, leading to new products and services that use machine learning, including text generators, to create content. However, with any AI content generation, there is a risk of producing offensive material, even inadvertently.

To address this risk, we have trained and evaluated the Text Moderation service on real prompts and responses from large generative models. Text Moderation is versatile and covers a broad range of content types, making it a powerful tool for protecting users from harmful content.

Getting started with Text Moderation using the Natural Language API

Text Moderation uses Google’s latest PaLM 2 foundation model to identify a wide range of harmful content, including hate speech, bullying, and sexual harassment. The API is easy to integrate with existing systems, can be called from almost any programming language, and returns confidence scores across 16 different “safety attributes.”
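As a rough illustration of working with those confidence scores, the Python sketch below builds a request body and picks out the attributes that score above a cutoff. It is a minimal sketch under stated assumptions, not official sample code: the endpoint URL, the JSON field names (`document`, `moderationCategories`, `name`, `confidence`), and the attribute names in the sample are our understanding of the REST interface and should be checked against the current API reference. The helper names, the 0.5 threshold, and the sample scores are invented for illustration.

```python
import json

# Assumed REST endpoint for the Text Moderation method; verify it against the
# current Natural Language API reference before relying on it.
MODERATE_TEXT_URL = "https://language.googleapis.com/v1/documents:moderateText"


def build_moderation_request(text: str) -> str:
    """Return a JSON request body for a plain-text document (assumed schema)."""
    return json.dumps({"document": {"type": "PLAIN_TEXT", "content": text}})


def flag_categories(response: dict, threshold: float = 0.5) -> list:
    """Return names of safety attributes whose confidence meets the threshold.

    Results are ordered from highest to lowest confidence. The threshold is a
    hypothetical default; each application should pick its own cutoff.
    """
    categories = response.get("moderationCategories", [])
    flagged = [c for c in categories if c.get("confidence", 0.0) >= threshold]
    flagged.sort(key=lambda c: c["confidence"], reverse=True)
    return [c["name"] for c in flagged]


# Illustrative response only: these scores are made up, not real API output,
# and only a few of the 16 attributes are shown.
SAMPLE_RESPONSE = {
    "moderationCategories": [
        {"name": "Insult", "confidence": 0.67},
        {"name": "Toxic", "confidence": 0.82},
        {"name": "Profanity", "confidence": 0.12},
        {"name": "Violent", "confidence": 0.03},
    ]
}

if __name__ == "__main__":
    print(build_moderation_request("Example user comment"))
    print(flag_categories(SAMPLE_RESPONSE, threshold=0.5))
```

In practice the request body would be sent to the endpoint with an authenticated HTTP client or one of the Cloud client libraries, and the threshold would likely be tuned per attribute, since a cutoff that suits a brand-safety review may be too strict or too loose for moderating user comments.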

Visit the Natural Language AI website to give it a try, and refer to the “Text Moderation” page for details. You may also try out the Text Moderation codelab.
