Build Voice AI into your apps with our top 3 Speech API codelabs

By mullaned2002

April 20, 2022

With voice-controlled touchpoints increasingly the norm in human-computer interaction, our Speech-to-Text (STT) API is a great option for developers looking to build voice into their applications. The API processes over 1 billion minutes of speech each month, enough to transcribe all Presidential inauguration speeches in U.S. history over 1 million times. Our customers use STT for everything from auto-generating captions, to generating insights that improve sales calls, to powering robots that help with childhood development.

With Speech-to-Text, you can accurately convert speech into text with several adaptations including:

Model Customization – customize for domain-specific terms
Speech Adaptation – provide context to influence results and formatting
Diarization – separate speakers on different channels or automatically detect when speakers change
Profanity Filtering – configure your request to detect profane words and edit them out of the transcript
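As a rough sketch of how some of these options map onto a request, the Python snippet below builds a recognition config as a plain dict, the same style used for the config later in this post. The field names (speech_contexts, diarization_config, profanity_filter) follow the v1 RecognitionConfig message as we understand it; the helper name build_config and the sample phrase are made up for illustration, and custom models are not covered.

```python
def build_config(language_code="en-US", phrases=None, speakers=None,
                 filter_profanity=False):
    """Build a recognition config dict (hypothetical helper, not part of the API)."""
    config = {"language_code": language_code}
    if phrases:
        # Speech adaptation: phrase hints that bias recognition toward these terms.
        config["speech_contexts"] = [{"phrases": list(phrases)}]
    if speakers:
        # Diarization: label words by speaker, given a (min, max) speaker count.
        min_count, max_count = speakers
        config["diarization_config"] = {
            "enable_speaker_diarization": True,
            "min_speaker_count": min_count,
            "max_speaker_count": max_count,
        }
    if filter_profanity:
        # Profanity filtering: ask the service to mask profane words.
        config["profanity_filter"] = True
    return config


# Example: phrase hints, exactly two speakers, profanity masked.
config = build_config(
    phrases=["Brooklyn Bridge"], speakers=(2, 2), filter_profanity=True
)
```

Options that are not requested are simply left out of the dict, so a call with no arguments yields the same minimal config as dict(language_code="en-US").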

Whether you’re using our pre-trained APIs for the first time or you’re a seasoned AI veteran, our codelabs are great resources for practicing and getting even more comfortable with our pre-trained models. In addition to helping you brush up on your skills, the codelabs provide step-by-step instructions for setting up your GCP project and getting a $300 credit if you need it. They’ll also walk you through everything else you need to get your sample up and running, such as authenticating, installing the client libraries, and using tooling like the Cloud Shell Editor.

That’s why we’ve decided to round up some of our top Speech codelabs, to help you get the most out of our Speech-to-Text API, and our Text-to-Speech API as well:

1. Using the Speech-to-Text API with Python lab and C# lab

Speech-to-Text is easy to get started with; in the code snippet below you can see all you need is the client library, an audio file and a few lines of code to get a transcript created:

```python
from google.cloud import speech_v1 as speech


def speech_to_text(config, audio):
    # Send the audio to the API and print what was recognized.
    client = speech.SpeechClient()
    response = client.recognize(config=config, audio=audio)
    print_sentences(response)


def print_sentences(response):
    # Each result holds ranked alternatives; the first is the most likely.
    for result in response.results:
        best_alternative = result.alternatives[0]
        transcript = best_alternative.transcript
        confidence = best_alternative.confidence
        print("-" * 80)
        print(f"Transcript: {transcript}")
        print(f"Confidence: {confidence:.0%}")


config = dict(language_code="en-US")
audio = dict(uri="gs://cloud-samples-data/speech/brooklyn_bridge.flac")

# Run the transcription (requires an authenticated Google Cloud project).
speech_to_text(config, audio)
```

This lab will also show you how to transcribe in multiple languages. Speech-to-Text supports 137 locales for over 70 languages!
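Switching languages comes down to changing the language_code in the config. A minimal sketch, assuming you already know which locales your audio uses; the helper name configs_for_locales and the three locale codes are only examples:

```python
def configs_for_locales(locales):
    """Return one recognition config dict per locale (hypothetical helper)."""
    return {locale: dict(language_code=locale) for locale in locales}


# One config per locale; each would be paired with audio in that language.
configs = configs_for_locales(["en-US", "fr-FR", "ja-JP"])
```

Each entry can then be passed as the config argument alongside an audio file recorded in the matching language.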

On-prem? No problem: Speech-to-Text is also available on-prem to meet your infrastructure, data residency and compliance requirements.

2. Using the Text-to-Speech API with Python lab and C# lab

On the flip side, if the reverse of STT is what you need for your integration, we have labs to help you get started with Text-to-Speech (TTS) in both Python and C#. With TTS, you can convert text into natural speech using groundbreaking synthesis AI from Google.

TTS lets you train custom voices, in addition to using the 220+ voices across 40+ languages and variants that are available out of the box. You can further customize the audio output by including Speech Synthesis Markup Language (SSML) in your TTS request, which lets you specify pauses and control how acronyms, dates, times, abbreviations, or text that should be censored are spoken.
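To make the SSML idea concrete, here is a small Python sketch that assembles an SSML string with a pause, a spelled-out acronym, and a censored word. The break and say-as elements are standard SSML; the helper name build_ssml and the sample text are invented for illustration, and the snippet only builds the input, it does not call the API.

```python
from xml.sax.saxutils import escape


def build_ssml(sentence, acronym, censored, pause_ms=500):
    """Assemble a small SSML document (hypothetical helper for illustration)."""
    parts = [
        "<speak>",
        escape(sentence),  # escape &, <, > so the markup stays well-formed
        f'<break time="{pause_ms}ms"/>',  # pause before continuing
        # Read the acronym letter by letter.
        f'<say-as interpret-as="characters">{escape(acronym)}</say-as>',
        # Mark text that should be censored in the audio.
        f'<say-as interpret-as="expletive">{escape(censored)}</say-as>',
        "</speak>",
    ]
    return "".join(parts)


ssml = build_ssml("Welcome to the demo.", "SSML", "darn")

# The synthesis input carries SSML in its ssml field instead of plain text.
synthesis_input = dict(ssml=ssml)
```

Escaping the user-supplied text matters because a stray ampersand or angle bracket would otherwise produce invalid markup.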

3. Using the Google Docs API Machine Learning (Speech-to-Text) lab

If you’re looking for an interesting sample to see how to use our APIs to solve business problems, check out this lab on how to create a transcript of your business meetings using Google Docs.

You’ll learn how to set up both APIs and send an audio file through the STT API, which then writes to a Google Doc using Java—so you’ll never forget what happened in a meeting again!
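The lab itself uses Java; purely as an illustration of the writing step, the Python sketch below builds the kind of request body the Docs API batchUpdate method accepts for inserting text. The helper name insert_transcript_request and the sample transcript are placeholders, and sending the request (authentication, document ID) is left out.

```python
def insert_transcript_request(transcript, index=1):
    """Build a Docs batchUpdate body that inserts text (hypothetical helper)."""
    return {
        "requests": [
            {
                "insertText": {
                    # Index 1 is the start of the document body.
                    "location": {"index": index},
                    "text": transcript,
                }
            }
        ]
    }


# Placeholder transcript; in the lab this text would come from the STT response.
body = insert_transcript_request("Meeting transcript goes here.\n")
```

Keeping the body construction separate from the API call makes it easy to inspect what will be written before anything touches the document.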

Try these labs out and use your $300 worth of Cloud credit to get started on the Cloud Speech API today. To learn more about Google Cloud’s Speech API, click here.
