#AI – Real-time audio translation using #CognitiveServices

Hi!

I still have some work to do after the Azure Global Bootcamp. After I showed the Audio Bot live, one of the classic questions in Canada came up: what about French? Is it supported?

Well, Cognitive Services offers several services that are useful for building multicultural apps, especially when we are working with text or audio. Regardless of which Cognitive Services operation we use, the process for translating audio is usually the following:

  • Convert audio into text
  • Convert the text from a language A to a language B
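
The two steps above can be sketched as a small pipeline. This is only an illustration, not code from any Cognitive Services SDK: `translate_audio`, `fake_speech_to_text` and `fake_translate_text` are hypothetical names, and the two fake functions are stand-ins for whatever speech recognition and text translation services you actually call.

```python
from typing import Callable

# Hypothetical signatures for the two steps; real services will differ.
SpeechToText = Callable[[bytes, str], str]
TranslateText = Callable[[str, str, str], str]


def translate_audio(
    audio: bytes,
    speech_to_text: SpeechToText,
    translate_text: TranslateText,
    source_lang: str,
    target_lang: str,
) -> str:
    """Run the two-step process: audio -> text, then text A -> text B."""
    text = speech_to_text(audio, source_lang)                 # step 1
    return translate_text(text, source_lang, target_lang)     # step 2


# Stand-ins so the sketch runs without any network call (assumptions).
def fake_speech_to_text(audio: bytes, lang: str) -> str:
    return "hello"


def fake_translate_text(text: str, source: str, target: str) -> str:
    dictionary = {("en", "fr", "hello"): "bonjour"}
    return dictionary.get((source, target, text), text)


result = translate_audio(
    b"\x00\x01", fake_speech_to_text, fake_translate_text, "en", "fr"
)
```

Keeping the two steps behind plain callables makes it easy to swap the first one, for example local device recognition versus a custom speech model, without touching the translation step.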

For the first step, you can use the device's local services to convert the audio into text or, if you work in a specific business domain, Custom Speech Service is the one to use.

Another interesting option is the Translator Speech API. This service takes an audio stream as input and performs the translation with a single call to an HTTP endpoint. The implementation of the service is worth a look, since it works over WebSockets, sending chunks of data from an audio file.
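
To give an idea of what "sending chunks of data" looks like, here is a minimal sketch of the chunking part only. It does not call the Translator Speech API: `iter_chunks` and `stream_audio` are hypothetical helpers, `send` stands in for the send method of whatever WebSocket client you use, and the default chunk size of 3200 bytes (100 ms of 16 kHz, 16-bit mono PCM) is an assumption rather than a documented requirement.

```python
from typing import Callable, Iterator


def iter_chunks(audio: bytes, chunk_size: int = 3200) -> Iterator[bytes]:
    """Yield consecutive slices of the audio buffer, at most chunk_size bytes each."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


def stream_audio(
    audio: bytes,
    send: Callable[[bytes], None],
    chunk_size: int = 3200,
) -> int:
    """Push every chunk through `send` and return how many chunks were sent."""
    sent = 0
    for chunk in iter_chunks(audio, chunk_size):
        send(chunk)  # in a real client this would be a binary WebSocket message
        sent += 1
    return sent


# Usage with a list standing in for the WebSocket connection.
received = []
count = stream_audio(bytes(7000), received.append)
```

In a real client the audio would come from a file or the microphone, and the translated text would arrive as messages on the same connection while the chunks are still being sent.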

The best thing, as always, is to go to the code samples in the Microsoft Translator repository and see how they are implemented. In the WPF sample, we can see that it defines options such as source and target language, subtitle text and more.


When the service starts, the code connects to the endpoint and begins sending the audio recorded from the input device.


Almost in real time, we can see the application translate between two languages.


In addition to the WPF sample, the repos include samples for iOS, Android, UWP and more.

Happy Coding!

Greetings @ Toronto

El Bruno
