Apply profanity masking in Amazon Translate

By mullaned2002

February 11, 2022

Amazon Translate is a neural machine translation service that delivers fast, high-quality, affordable, and customizable language translation. This post shows how you can mask profane words and phrases with a grawlix string (“?$#@$”).

Amazon Translate typically chooses clean words for your translation output. In some situations, however, you want to prevent words that are commonly considered profane from appearing in the translated output. For example, when you’re translating video captions or subtitle content, or enabling in-game chat, you may want the translated content to be age appropriate and free of profanity. The profanity masking setting lets you mask profane words and phrases in these cases. You can apply profanity masking to both real-time translation and asynchronous batch processing in Amazon Translate. With profanity masking enabled, the five-character sequence ?$#@$ replaces each profane word or phrase, regardless of its length. Amazon Translate detects each profane word or phrase literally, not contextually.
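Because every masked word or phrase becomes the same five-character sequence, downstream code can spot masking with a plain substring search. The following is a minimal sketch of that idea; the helper name count_masked_terms and the sample string are illustrative assumptions, not part of the Amazon Translate API or real service output.

```python
# The mask string that Amazon Translate substitutes for each profane word or phrase.
GRAWLIX = "?$#@$"


def count_masked_terms(translated_text: str) -> int:
    """Return how many words or phrases were masked in the translated text.

    Hypothetical helper: it only counts occurrences of the mask string,
    so it cannot tell how long the original word or phrase was.
    """
    return translated_text.count(GRAWLIX)


# Illustrative string only, not real Amazon Translate output.
sample = "Don't be a ?$#@$, you ?$#@$"
print(count_masked_terms(sample))  # 2
```

Note that the count reflects masked words or phrases, not characters, because the mask has a fixed length.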

Solution overview

To mask profane words and phrases in your translation output, enable the profanity setting when you run translations with Amazon Translate, through either real-time or asynchronous batch processing requests. On the Amazon Translate console, the option is under Additional settings. The following sections demonstrate profanity masking for real-time translation requests via the Amazon Translate console, the AWS Command Line Interface (AWS CLI), and the Amazon Translate SDK (Python Boto3).
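The sections that follow cover real-time requests only. For the asynchronous batch path, the same setting can be passed when starting a batch job. The following is a rough sketch using the Boto3 start_text_translation_job call; the job name, S3 URIs, and IAM role ARN are placeholders chosen for illustration, and the two helper functions are hypothetical wrappers rather than part of the SDK.

```python
def build_masked_batch_job_request(job_name, input_s3_uri, output_s3_uri,
                                   role_arn, source_lang="fr", target_lang="en"):
    """Assemble keyword arguments for start_text_translation_job
    with profanity masking enabled."""
    return {
        "JobName": job_name,
        "InputDataConfig": {"S3Uri": input_s3_uri, "ContentType": "text/plain"},
        "OutputDataConfig": {"S3Uri": output_s3_uri},
        "DataAccessRoleArn": role_arn,
        "SourceLanguageCode": source_lang,
        "TargetLanguageCodes": [target_lang],
        # Same setting as the real-time call: mask profane words and phrases.
        "Settings": {"Profanity": "MASK"},
    }


def start_masked_batch_job(request):
    """Submit the job; requires AWS credentials and the boto3 package."""
    import boto3  # imported lazily so the builder above works without the SDK

    translate = boto3.client("translate")
    response = translate.start_text_translation_job(**request)
    return response["JobId"]


# Placeholder values for illustration only; substitute your own resources.
request = build_masked_batch_job_request(
    job_name="masked-captions-job",
    input_s3_uri="s3://example-input-bucket/captions/",
    output_s3_uri="s3://example-output-bucket/translated/",
    role_arn="arn:aws:iam::123456789012:role/ExampleTranslateRole",
)
print(request["Settings"])
```

Calling start_masked_batch_job(request) would submit the job; it is left out of the example run because it needs real S3 locations and an IAM role that Amazon Translate can assume.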

Amazon Translate console

To demonstrate handling profanity with real-time translation, we use the following sample text in French to be translated into English:

Ne sois pas une garce

Complete the following steps on the Amazon Translate console:

Choose French (fr) as the Source language.
Choose English (en) as the Target language.
Enter the preceding example text in the Source language text area.

The translated text appears under Target language. It contains a word that is considered profane in English.

Expand Additional settings and enable Profanity.

The word is now replaced with the grawlix string ?$#@$.

AWS CLI

Calling the translate-text AWS CLI command with --settings Profanity=MASK masks profane words and phrases in your translated text.

The following AWS CLI commands are formatted for Unix, Linux, and macOS. For Windows, replace the backslash (\) Unix continuation character at the end of each line with a caret (^).

aws translate translate-text \
    --text "<<INPUT TEXT>>" \
    --source-language-code fr \
    --target-language-code en \
    --settings Profanity=MASK

You get a response like the following snippet:

{
    "TranslatedText": "<output text with ?$#@$>",
    "SourceLanguageCode": "fr",
    "TargetLanguageCode": "en",
    "AppliedSettings": {
        "Profanity": "MASK"
    }
}
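If you capture the CLI output in a script, you can confirm that masking was actually applied by inspecting the AppliedSettings field of the response. The following is a minimal sketch that parses the sample response shown above; the helper name masking_was_applied is an assumption made for illustration.

```python
import json


def masking_was_applied(response: dict) -> bool:
    """Return True if the response reports Profanity=MASK in AppliedSettings."""
    return response.get("AppliedSettings", {}).get("Profanity") == "MASK"


# The sample response from above, with its placeholder text left as-is.
raw_output = """
{
    "TranslatedText": "<output text with ?$#@$>",
    "SourceLanguageCode": "fr",
    "TargetLanguageCode": "en",
    "AppliedSettings": {
        "Profanity": "MASK"
    }
}
"""

response = json.loads(raw_output)
print(masking_was_applied(response))  # True
```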

Amazon Translate SDK (Python Boto3)

The following Python 3 code uses the real-time translation call with the profanity setting:

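import boto3

# Create an Amazon Translate client using your default AWS credentials and Region
translate = boto3.client('translate')

SOURCE_TEXT = "<Sample Input Text>"

OUTPUT_LANG_CODE = 'en'

# 'auto' asks Amazon Translate to detect the source language;
# Settings={'Profanity': 'MASK'} enables profanity masking
result = translate.translate_text(
    Text=SOURCE_TEXT,
    SourceLanguageCode='auto',
    TargetLanguageCode=OUTPUT_LANG_CODE,
    Settings={'Profanity': 'MASK'}
)

print("Translated Text:{}".format(result['TranslatedText']))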

Conclusion

You can use the profanity masking setting to mask words and phrases that are considered profane to keep your translated text clean and meet your business requirements. To learn more about all the ways you can customize your translations, refer to Customizing Your Translations using Amazon Translate.

About the Authors

Siva Rajamani is a Boston-based Enterprise Solutions Architect at AWS. He enjoys working closely with customers and supporting their digital transformation and AWS adoption journey. His core areas of focus are serverless, application integration, and security. Outside of work, he enjoys outdoor activities and watching documentaries.

Sudhanshu Malhotra is a Boston-based Enterprise Solutions Architect for AWS. He’s a technology enthusiast who enjoys helping customers find innovative solutions to complex business challenges. His core areas of focus are DevOps, machine learning, and security. When he’s not working with customers on their journey to the cloud, he enjoys reading, hiking, and exploring new cuisines.

Watson G. Srivathsan is the Sr. Product Manager for Amazon Translate, AWS’s natural language processing service. On weekends you will find him exploring the outdoors in the Pacific Northwest.
