Yesterday I wrote a post on how to create and publish an Acoustic Model in Custom Speech Service to perform a speech-to-text (STT) process. The next step is to add some C# code in an app to use this service. For this sample I will use a sample WAV file with a single sentence. When I try this file in the CRIS test console I get the following result:
So, it’s working. Let’s create a Console App and add the NuGet package for our platform target.
Important: by default the platform configuration is set to “Any CPU”. We need to change this to x86 or x64 so we can use the Speech NuGet package without any issues.
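A quick way to confirm that the platform change took effect is to print the bitness of the running process. This is only a small sketch: the class name `PlatformCheck` is hypothetical and not part of the sample app, while `Environment.Is64BitProcess` and `IntPtr.Size` are standard .NET members.

```csharp
using System;

public static class PlatformCheck
{
    // Returns "64-bit" when the current process is 64-bit, "32-bit" otherwise.
    public static string Describe()
    {
        return Environment.Is64BitProcess ? "64-bit" : "32-bit";
    }

    public static void Main()
    {
        // IntPtr.Size is 8 in a 64-bit process and 4 in a 32-bit one.
        Console.WriteLine($"Process: {Describe()}, pointer size: {IntPtr.Size} bytes");
    }
}
```

If the output does not match the platform you selected, the project is probably still building with the default configuration.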
So, big surprise: this package is the same for CRIS and for Bing Speech recognition (thanks to Victor for this tip!). There is a sample WPF implementation in the GitHub repo which uses the Bing keys and architecture, but I’ll continue with my CRIS sample.
Let’s work in the sample Console App. There are 3 main sections here:
- Initialize the STT client
- Process the wav file
- Get and process CRIS result
The next pieces of code are part of the sample app.
- In the Main section I create and init the STT client using the information from my previous post
- To process the wav file, we open the file and send it to CRIS in small chunks
- Then we need to subscribe to the client events
- In these events we show some of the information received by the client in the Console App
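The "send the file using small chunks" step can be sketched without the Speech package. What follows is a minimal sketch and not the code from the sample app: `WavChunkSender`, `SendInChunks` and the 1024-byte chunk size are hypothetical names and values chosen for illustration, and the callback is a stand-in for whatever method the STT client exposes to receive audio.

```csharp
using System;
using System.IO;

public static class WavChunkSender
{
    // Reads the stream in fixed-size chunks and hands each one to sendChunk.
    // sendChunk receives the buffer and the number of valid bytes in it.
    // Returns the total number of bytes handed over.
    public static long SendInChunks(Stream audio, Action<byte[], int> sendChunk, int chunkSize = 1024)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var buffer = new byte[chunkSize];
        long total = 0;
        int bytesRead;

        // Stream.Read returns 0 once the end of the stream is reached.
        while ((bytesRead = audio.Read(buffer, 0, buffer.Length)) > 0)
        {
            sendChunk(buffer, bytesRead);
            total += bytesRead;
        }
        return total;
    }

    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Usage: pass the path of a WAV file as the first argument.");
            return;
        }

        using (var file = new FileStream(args[0], FileMode.Open, FileAccess.Read))
        {
            // Stand-in callback: a real app would forward each chunk to the STT client here.
            long sent = SendInChunks(file, (chunk, count) => Console.WriteLine($"Chunk of {count} bytes"));
            Console.WriteLine($"Done, {sent} bytes sent.");
        }
    }
}
```

Passing the send operation as a callback keeps the file handling separate from the speech client, so the loop can be tried with any stream before wiring it to CRIS. Note that the same buffer is reused for every chunk, so the callback should consume the bytes before returning.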
We get the following result from the running app.
We can download the source code from GitHub (link)
Greetings @ Toronto (-5!)
- El Bruno, Tutorial to create and publish a complete model in Custom Speech Service (#CRIS)
- GitHub, Cognitive Speech STT Windows
- Azure, Use a custom speech-to-text endpoint