#AI – #Translatotron is not a dorky name, it’s maybe the best translator ever #GoogleResearch

Hi !

A couple of days ago, Google presented Translatotron. The name may not be the best, but the idea is amazing:

Google researchers trained a neural network to map audio “voiceprints” from one language to another. Translatotron converts audio input directly to audio output, without any intermediary steps, and the translated audio retains the voice and tone of the original speaker.

Model architecture of Translatotron.

As usual, the best way to understand this is to see Translatotron in action. Let’s take a look at the following audio samples.

  • Input (Spanish)
  • Reference translation (English)
  • Baseline cascade translation
  • Translatotron translation (canonical voice)
  • Translatotron translation (original speaker’s voice)

There is a full set of sample audios here: https://google-research.github.io/lingvo-lab/translatotron/#fisher_1

This is an amazing technology, and also a great starting point for scenarios where it’s important to keep the original speaker’s vocal characteristics. And let me be honest, it’s also scary if you think about fake-voice scenarios.

Happy coding!

Greetings @ Toronto

El Bruno

Source: Introducing Translatotron: An End-to-End Speech-to-Speech Translation Model


#MLNET – How to use the AutoML API in a Console App

Hi !

In my last posts I tested AutoML using Model Builder inside Visual Studio and also via the CLI commands. There is also an API to use AutoML in a .NET app, and the usage is very simple.

It all starts, of course, by adding the [Microsoft.ML.AutoML] NuGet package.

I read the documentation in [How to use the ML.NET automated machine learning API], and I created the following sample using the same data as in my previous posts.
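The sample code itself didn’t survive into this post, so here is a minimal sketch of what the AutoML regression API usage looks like, assuming a hypothetical data class whose columns match my Age Range CSV (the class and property names are illustrative, not the exact ones from my sample):

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.AutoML;
using Microsoft.ML.Data;

class Program
{
    // Hypothetical data class matching the Age Range CSV columns
    public class AgeRangeData
    {
        [LoadColumn(0)] public float Age { get; set; }
        [LoadColumn(1)] public float Gender { get; set; }
        [LoadColumn(2)] public float Label { get; set; }
    }

    static void Main()
    {
        var mlContext = new MLContext();

        // Load the same data set used in the previous posts
        IDataView data = mlContext.Data.LoadFromTextFile<AgeRangeData>(
            "AgeRangeData03_AgeGenderLabelEncodedMoreData.csv",
            hasHeader: true, separatorChar: ',');

        // Run an AutoML regression experiment for up to 60 seconds
        var experiment = mlContext.Auto()
            .CreateRegressionExperiment(maxExperimentTimeInSeconds: 60);
        var result = experiment.Execute(data);

        // Show the winning trainer and its validation metric
        Console.WriteLine($"Best trainer: {result.BestRun.TrainerName}");
        Console.WriteLine($"RSquared: {result.BestRun.ValidationMetrics.RSquared}");
    }
}
```

Every individual run can also be inspected through `result.RunDetails`, which is how the per-test results and the top-ranked models can be listed.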

The final output displays the results for each one of the test runs and showcases the top 3 ranked models. Once again, the LightGBM trainer is the best trainer to choose.

There is a full set of samples in the ML.NET samples repository. I’ve reused some classes from the Common folder.

The complete source code is available at https://github.com/elbruno/Blog/tree/master/20190516%20MLNET%20AutoML%20API

Happy Coding!

Greetings @ Toronto

El Bruno


#MLNET – Are you a Command line user? MLNet CLI is great for some AutoML train tasks!

Hi !

Yesterday I wrote about how easy it is to use Model Builder to create Machine Learning models directly from data inside Visual Studio.

If you prefer to work with command line interfaces, ML.NET AutoML also has a CLI, and with a couple of commands you can get some amazing results.

So, for this test I followed the tutorial [Auto generate a binary classifier using the CLI] and made some changes to the original command:

> mlnet auto-train --task binary-classification --dataset "yelp_labelled.txt" --label-column-index 1 --has-header false --max-exploration-time 10

I’m using the same data set I used yesterday, and my command is:

> mlnet auto-train --task regression --dataset "AgeRangeData03_AgeGenderLabelEncodedMoreData.csv" --label-column-index 2 --has-header true --max-exploration-time 60

The output is also interesting: it suggests using a FastTree regression trainer.

Yesterday’s test using the IDE suggested a LightGBM regression trainer.

So, I decided to run the CLI one more time with some more processing time. This time the result is also a FastTree regression trainer.

If you don’t need Visual Studio, this option is amazing for quick tests, and you can also use the generated projects!

Happy Coding!

Greetings @ Toronto

El Bruno


#AI – Exporting #CustomVision projects to #docker for #RaspberryPi, the extra 2 steps

Hi !

I wrote several posts on how to create an image analysis solution using CustomVision.ai, and how to export and use the project in a Raspberry Pi with Docker.

In my posts, I created a custom Dockerfile for the RPI, using the Linux one from CustomVision.ai as a base.

There is a new feature in Custom Vision, which allows us to directly export the docker image for Raspberry Pi.

This is amazing, and because I’m going to use it at Chicago CodeCamp, I decided to test it. And, of course, I got a couple of ugly errors when I tried to build my image on the device.

Once I read and edited the Dockerfile contents, I realized that I needed to disable the CROSS-BUILD option.
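As a hypothetical sketch of the change (the exact base image and build steps depend on the exported project): Dockerfiles built on resin/balena base images wrap the build steps in cross-build directives so the ARM image can be built on an x86 host under emulation; when building natively on the Raspberry Pi itself, those directives fail and can simply be commented out:

```dockerfile
# Sketch only – comment out the cross-build wrappers when building
# natively on the Raspberry Pi:

# RUN [ "cross-build-start" ]

# ... the existing build steps in the exported Dockerfile stay unchanged ...

# RUN [ "cross-build-end" ]
```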

And that’s it, now I’m waiting for the image to finish and I’ll be ready to test it!

Happy coding!

Greetings @ Toronto

El Bruno

My Posts

  1. Object recognition with Custom Vision and ONNX in Windows applications using WinML
  2. Object recognition with Custom Vision and ONNX in Windows applications using WinML
  3. Object recognition with Custom Vision and ONNX in Windows applications using Windows ML, drawing frames
  4. Object recognition with Custom Vision and ONNX in Windows applications using Windows ML, calculate FPS
  5. Can’t install Docker on Windows 10 Home, need Pro or Enterprise
  6. Running a Custom Vision project in a local Docker Container
  7. Analyzing images in a Console App using a Custom Vision project in a Docker Container
  8. Analyzing images using PostMan from a Custom Vision project hosted in a Docker Container
  9. Building the CustomVision.ai project in Docker in a RaspberryPi
  10. Container dies immediately upon successful start in a RaspberryPi. Of course, it’s all about TensorFlow dependencies
  11. About ports, IPs and more to access a container hosted in a Raspberry Pi
  12. Average response times using a CustomVision.ai docker container in a RaspberryPi and a PC


#MSBUILD – #HoloLens 2 Apollo moon landing demo

Hi !

Today at Build, just before Satya Nadella’s keynote, there was a failed demo attempt using HoloLens 2 and Unreal Engine.


“Well, it seems doing a live demo is actually harder than landing on the moon,” said A Man on the Moon author Andrew Chaikin.

Lucky for us, there is a video from a rehearsal of the HoloLens 2 demo. In this scenario the holograms are streamed directly to the headsets from remote PCs, and it’s all powered by Epic’s Unreal Engine.

The high-quality images streamed to the headsets are amazing, and this is a perfect example of how to leverage the power of remote computers to process and generate high-quality graphics, and render those graphics on the HoloLens 2 device.

Let’s take a look at the video.

Greetings @ Toronto

El Bruno

#Event – See you @ChicagoCodeCamp on May 11, 2019 for some Deep Learning and Custom Vision experiences

Hi !

I’m very lucky to be at the next Chicago CodeCamp with another session around Custom Vision:

How a PoC at home can scale to Enterprise Level using Custom Vision APIs

It all started with a DIY project to use Computer Vision for security cameras at home. A custom Machine Learning model is the core component used to analyze pictures to detect people, animals and more in a home environment. The AI processing is performed at the edge, on dedicated hardware, and the collected information is stored in the cloud. The same idea can be applied to several CCTV scenarios, like parking lots, train stations, malls and more. However, moving this to enterprise scale brings a set of challenges, which will be described and explained in this session.

More Information: https://www.chicagocodecamp.com/ and remember that we will also be talking about Deep Learning.

Greetings @ Toronto

El Bruno

#Windows10 – Windows #VisionSkills sample UWP App

Hi!

Yesterday the Windows team announced the preview version of Windows Vision Skills. So today I was browsing the samples on GitHub, and I’ve created a simplified version of the skeleton tracker using a live feed from a webcam.

Here are some notes about my GitHub sample

  • The UWP app must target Windows 10 version 1809
  • I added the NuGet packages [Microsoft.AI.Skills.Vision.SkeletalDetectorPreview] and [Microsoft.Toolkit.Uwp.UI.Controls]
  • The MainView uses the CameraPreview control from the [Microsoft.Toolkit.Uwp.UI.Controls] toolkit.
  • Each frame is processed, and I use a SkeletalDetectorBinding to detect skeletons / bodies
  • The core detection is performed here
        private async Task RunSkillAsync(VideoFrame frame, bool isStream)
        {
            m_evalPerfStopwatch.Restart();

            // Update input image and run the skill against it
            await m_skeletalDetectorBinding.SetInputImageAsync(frame);
            await m_skeletalDetectorSkill.EvaluateAsync(m_skeletalDetectorBinding);

            m_evalPerfStopwatch.Stop();
            m_skeletalDetectionRunTime = m_evalPerfStopwatch.ElapsedMilliseconds;

            await Dispatcher.RunAsync(Windows.UI.Core.CoreDispatcherPriority.Normal, () =>
            {
                m_bodyRenderer.Update(m_skeletalDetectorBinding.Bodies, !isStream);
                m_bodyRenderer.IsVisible = true;
                UISkillOutputDetails.Text = $"Found {m_skeletalDetectorBinding.Bodies.Count} bodies (took {m_skeletalDetectionRunTime} ms)";
            });
        }
  • There is also a BodyRenderer.cs class used to draw the skeletons on top of the CameraPreview Image control. It draws lines on an empty canvas.

You can download the sample code from here https://github.com/elbruno/Blog/tree/master/20190501%20VisionSkills%20Skeleton%20Sample

Greetings @ Burlington

El Bruno


#Windows10 – Windows Vision Skills (Preview), an amazing set of AI APIs to run in the edge!

Hi!

Today’s announcement is a big one if you are interested in moving AI capabilities to the edge. The Windows team made public the preview of the Windows Vision Skills framework:

Windows Vision Skills framework is meant to standardize the way AI and CV is put to use within a WinRT application running on the edge. It aims to abstract away the complexity of AI and CV techniques by simply defining the concept of skills which are modular pieces of code that process input and produce output. The implementation that contains the complex details is encapsulated by an extensible WinRT API that inherits the base class present in this namespace, which leverages built-in Windows primitives which in-turn eases interop with built-in acceleration frameworks or external 3rd party ones.

The official blog explains the basic features of the framework and describes a set of scenarios like Object Detector, Skeletal Detector, and Emotion Recognizer.

There are UWP apps among the repo samples, and it only took a minute to set everything up and get the app running. In the following image, it smoothly detects a person and a chair.

The next image is the sample for the Skeletal Detector (as an old Kinect dev, this really makes me happy!)

This is a big announcement, because all of these APIs are native, and that means we can easily use them in our Windows 10 apps.

Greetings @ Toronto

El Bruno



#AI – AI for Earth, AI tools in the hands of those working to solve global environmental challenges

Hi !

When I was in Ohio @CodeMash, I was lucky enough to meet Jennifer Marsman, Principal Engineer & speaker on the AI for Earth team at Microsoft (@jennifermarsman). She hosted an amazing session where she shared details about some projects on AI for Earth.

AI for Earth puts Microsoft cloud and AI tools in the hands of those working to solve global environmental challenges

See references

The work that the AI for Earth teams are doing is amazing, and I was really impressed by the “Mexican whale story”. The team uses image analysis to identify individual animals in ordinary people’s photos and videos, and using metadata like the date and location of a photo or video, they can generate animal migration paths. And yes, the photos come from public social media spaces like Facebook, Instagram and YouTube.

I kept this information as a draft for a while; now that I have some more details, it makes sense to share it. The project name is Wild Me:

Wild Me is using computer vision and deep learning algorithms to power Wildbook, a platform that can identify individual animals within a species.  They also augment their data with an intelligent agent that can mine social media. 

And as usual, a video is the best way to explain this:

Besides Wild Me, there are other amazing projects like SilviaTerra or FarmBeats. You can find the complete list of projects and challenges here (link).

Happy Coding!

Greetings @ Burlington

El Bruno

References