Curso de Azure Cognitive Services

Luis Antonio Ruvalcaba Sanchez

Convert speech to text

How to implement speech-to-text service?

The speech-to-text service transcribes spoken audio into text. To use it, you need a subscription to Azure's Speech cognitive service. This lesson sets it up from scratch in Visual Studio, using C# together with the standard .NET support for file access and asynchronous methods.

What do you need to get started?

  1. Subscription to the Speech cognitive service: This is the service that performs the speech-to-text conversion.
  2. Subscription key and location: The application needs both values to authenticate against the service and reach the right region.

How to configure Visual Studio?

  1. Create a console application: Select the console option to create a new application. This simplifies the initial setup process.
  2. Select framework: Choose the .NET version to target. The using var declarations in this lesson require C# 8 or later, which is the default from .NET Core 3.0 onward.
  3. Add NuGet package:
    • Right-click on the project and select "Manage NuGet packages".
    • Search for Microsoft.CognitiveServices.Speech and add the package.

How to configure the speech-to-text service?

Get the subscription and set up the region

Go to the Azure portal:

  1. Select your Speech cognitive service resource.
  2. Confirm the region in which it is configured, e.g. WestUS.
  3. Copy the first key from the Keys and Endpoint section.

Configure the subscription key

In Visual Studio, create a SpeechConfig in your application from the subscription key and region:

var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "WestUS");
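
Hardcoding the key in source makes it easy to leak. As a minimal sketch, assuming the key and region are stored in environment variables named SPEECH_KEY and SPEECH_REGION (names chosen here for illustration, not taken from the lesson), the configuration could be built like this. The SpeechSettings class and its methods are hypothetical helpers.

```csharp
using System;
using Microsoft.CognitiveServices.Speech;

static class SpeechSettings
{
    // Reads a required environment variable and fails early if it is missing.
    public static string ReadRequired(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException($"Environment variable '{name}' is not set.");
        }
        return value.Trim();
    }

    // Builds the SpeechConfig from the environment instead of string literals.
    // SPEECH_KEY and SPEECH_REGION are assumed names for this sketch.
    public static SpeechConfig CreateConfig()
    {
        var key = ReadRequired("SPEECH_KEY");
        var region = ReadRequired("SPEECH_REGION");
        return SpeechConfig.FromSubscription(key, region);
    }
}
```

With this in place, Main would call var speechConfig = SpeechSettings.CreateConfig(); instead of passing the key as a literal.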

How to capture audio from the microphone?

  1. Create an asynchronous method: Define a method to capture audio, for example async static Task FromMic(SpeechConfig speechConfig).
  2. Configure the audio source: Use the default microphone to capture audio and create the recognizer:
using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
  3. Run speech recognition:
    • Prompt the user to speak into the microphone.
    • Use the recognizer to transcribe the captured speech into text.
Console.WriteLine("Speak into the microphone.");
var result = await recognizer.RecognizeOnceAsync();
Console.WriteLine("You said: " + result.Text);
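
The microphone steps print result.Text directly, which is empty when nothing was recognized or the request failed. The sketch below checks result.Reason before printing. The class name MicrophoneRecognition and the method names FromMicChecked and Describe are hypothetical; the SDK types and members used are the ones shipped in Microsoft.CognitiveServices.Speech.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

static class MicrophoneRecognition
{
    // Same flow as FromMic, but reports why recognition produced no text.
    public static async Task FromMicChecked(SpeechConfig speechConfig)
    {
        using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

        Console.WriteLine("Speak into the microphone.");
        var result = await recognizer.RecognizeOnceAsync();
        Console.WriteLine(Describe(result));
    }

    // Maps the outcome of a single recognition to a readable message.
    static string Describe(SpeechRecognitionResult result)
    {
        switch (result.Reason)
        {
            case ResultReason.RecognizedSpeech:
                return "You said: " + result.Text;
            case ResultReason.NoMatch:
                return "No speech could be recognized.";
            case ResultReason.Canceled:
                // Typical causes are an invalid key, a wrong region, or no network.
                var details = CancellationDetails.FromResult(result);
                return $"Recognition canceled: {details.Reason}. {details.ErrorDetails}";
            default:
                return "Unexpected result: " + result.Reason;
        }
    }
}
```

A canceled result is usually the first sign that the subscription key or region is wrong, so it is worth surfacing during setup.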

How to process an audio file?

  1. Create another asynchronous method: Define async static Task FromFile(SpeechConfig speechConfig).
  2. File configuration: Change the audio source from the microphone to an audio file:
using var audioConfig = AudioConfig.FromWavFileInput("test.wav");
  3. Execute recognition: Use the same recognition process, but create the recognizer with the file-based audio configuration:
using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
var result = await recognizer.RecognizeOnceAsync();
Console.WriteLine("The result is: " + result.Text);
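
RecognizeOnceAsync returns after a single utterance, so a WAV file with several sentences is only partly transcribed. A possible alternative for longer files, not covered in the lesson text, is the SDK's continuous recognition mode. In this sketch the class FileRecognition, the method FromLongFile, and the wavPath parameter are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

static class FileRecognition
{
    // Transcribes a whole WAV file, printing each recognized phrase as it arrives.
    public static async Task FromLongFile(SpeechConfig speechConfig, string wavPath)
    {
        using var audioConfig = AudioConfig.FromWavFileInput(wavPath);
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

        // Completed when the session ends or recognition is canceled.
        var finished = new TaskCompletionSource<bool>();

        recognizer.Recognized += (sender, e) =>
        {
            if (e.Result.Reason == ResultReason.RecognizedSpeech)
            {
                Console.WriteLine(e.Result.Text);
            }
        };
        recognizer.Canceled += (sender, e) =>
        {
            // Reaching the end of the file is also reported as a cancellation.
            if (e.Reason == CancellationReason.Error)
            {
                Console.WriteLine("Error: " + e.ErrorDetails);
            }
            finished.TrySetResult(true);
        };
        recognizer.SessionStopped += (sender, e) => finished.TrySetResult(true);

        await recognizer.StartContinuousRecognitionAsync();
        await finished.Task;
        await recognizer.StopContinuousRecognitionAsync();
    }
}
```

For short clips like test.wav the single-shot call is simpler; the event-based version pays off once the audio contains pauses between sentences.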

Complete execution

Declare the Main method of the console program as static async Task so that it can await asynchronous calls, then invoke FromMic or FromFile depending on whether you want to work from a microphone or an audio file:

static async Task Main()
{
    var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "WestUS");
    await FromMic(speechConfig);
    // or: await FromFile(speechConfig);
    Console.ReadLine();
}
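
Instead of commenting one call in and out, the input source could be chosen at run time. The following sketch assumes a convention invented for this example: if a command-line argument is present it is treated as the path to a WAV file, otherwise the default microphone is used. TryGetWavPath is a hypothetical helper.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class Program
{
    // Returns true and the path when a first argument was supplied.
    public static bool TryGetWavPath(string[] args, out string wavPath)
    {
        wavPath = args.Length > 0 ? args[0] : "";
        return wavPath.Length > 0;
    }

    static async Task Main(string[] args)
    {
        var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "WestUS");

        // Pick the audio source: a WAV file if a path was given, else the microphone.
        using var audioConfig = TryGetWavPath(args, out var wavPath)
            ? AudioConfig.FromWavFileInput(wavPath)
            : AudioConfig.FromDefaultMicrophoneInput();
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

        var result = await recognizer.RecognizeOnceAsync();
        Console.WriteLine("The result is: " + result.Text);
    }
}
```

Running the program with no arguments listens on the microphone; running it with test.wav as the argument transcribes that file.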

And there you have it! With these instructions, you will be able to implement and test the speech-to-text service using both microphone inputs and audio files. By exploring the course repository, you will also find more advanced examples. Become a speech processing expert by taking advantage of these tools.

Contributions 5

Example of the project built in this class.

A real-world service that uses this technology is a bank's phone menu that asks you to say your card number aloud instead of typing it on the phone keypad. This is very common in the USA and, of course, it recognizes Spanish speakers.

Sample code

using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class Program
{
    async static Task Main(string[] args)
    {
        // Replace the placeholder with your own key; never publish a real key.
        var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "westus");
        Console.WriteLine("Hello world");
        //await FromMic(speechConfig);
        await FromFile(speechConfig);
        Console.ReadLine();
    }

    async static Task FromMic(SpeechConfig speechConfig)
    {
        using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
        // "es-MX" sets the recognition language to Mexican Spanish.
        using var recognizer = new SpeechRecognizer(speechConfig, "es-MX", audioConfig);

        Console.WriteLine("Speak into the microphone");
        var result = await recognizer.RecognizeOnceAsync();
        Console.WriteLine("You said the following: " + result.Text);
    }

    async static Task FromFile(SpeechConfig speechConfig)
    {
        using var audioConfig = AudioConfig.FromWavFileInput("test.wav");
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

        var result = await recognizer.RecognizeOnceAsync();
        Console.WriteLine("The result is: " + result.Text);
    }
}

This type of service can be used to communicate with voice assistants similar to Amazon Echo.

The class to use is SpeechRecognizer.