#Event – Global AI Nights on Sept 5th!

Hi!

During August, I’ll be participating in and supporting a couple of hackathons and work events (check my next events section!).

And I’m happy to share that on September 5th I’ll be part of the Global AI Night in Toronto.

The Global AI Night is a free evening event organized by 88 communities all over the world that are passionate about Artificial Intelligence on Microsoft Azure. During this AI Night you will get inspired by sessions and get your hands dirty in the workshops. By the end of the night you will be able to infuse AI into your applications.

Registration and information for Toronto are available here: https://global.ainights.com/bootcamp/8d354913-4243-4a9e-8c1d-a594dc7dbe69

As usual, the best way to explain this is with a video:

Happy coding!

Greetings @ Rogers Cup

El Bruno

#Python – Detecting #Hololens in real time in a webcam feed using #ImageAI and #OpenCV (thanks to @OlafenwaMoses)

Hi!

Let’s start with a very quick intro:

During the past months, I’ve been playing around with several image analysis tools. And ImageAI (see references) is one that deserves a full series of posts. Please take a look at the product and the source code on GitHub, and also please thank the person behind it: Moses Olafenwa (@OlafenwaMoses).

And now, my 2 cents. I’ve started to test ImageAI to create my own image detection models. Most of the time this is a hard path to follow; however, ImageAI showed me an interesting option.

… with the latest release of ImageAI v2.1.0, support for training your custom YOLOv3 models to detect literally any kind and number of objects is now fully supported, …

Wow! That means I can pick up my own image dataset, train it on top of YOLOv3, and use the result as a trained model. Again, this is amazing.

So, I started to read the article [Train Object Detection AI with 6 lines of code, see references], where Olafenwa explains how to do this using a dataset of almost 500 annotated images of HoloLens and Oculus Rift devices.
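The training recipe from the article boils down to a few lines. What follows is a sketch only: I’m assuming the ImageAI v2.1.x training API, and the folder layout (`hololens/`) and hyperparameters are illustrative, so adjust them to your own dataset.

```python
def train_hololens_detector(data_dir="hololens",
                            pretrained="pretrained-yolov3.h5"):
    """Train a custom YOLOv3 detector with ImageAI (v2.1.x API assumed)."""
    # Imported lazily so the sketch reads without ImageAI installed.
    from imageai.Detection.Custom import DetectionModelTrainer

    trainer = DetectionModelTrainer()
    trainer.setModelTypeAsYOLOv3()
    # data_dir must contain train/ and validation/ folders, each with
    # images/ and annotations/ (Pascal VOC XML) subfolders.
    trainer.setDataDirectory(data_directory=data_dir)
    trainer.setTrainConfig(object_names_array=["hololens", "oculus"],
                           batch_size=4,
                           num_experiments=100,
                           train_from_pretrained_model=pretrained)
    trainer.trainModel()
```

Training starts from pretrained YOLOv3 weights (downloaded separately), which is what makes a ~500-image dataset enough.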

The code is very simple and easy to read. There are also examples on how to analyze a single file, or a video, or even a camera feed. The output for the analysis can be also in a new file, in a processed video or even a full log file with the detected information.

I started to read the code samples and I realized I was missing a scenario:

Display the real-time feed from a webcam, analyze each webcam frame and, if a device is found, draw a frame on the live feed to display it.

I use OpenCV to access my camera, and it took me some time to figure out how to convert the OpenCV camera frame to the format ImageAI needs. In the end, thanks to the GitHub code, I managed to create this (very slow but working) demo.

As usual in this scenario, now it’s time to improve the performance and start testing some tweaks to get a decent app up and running.

And of course, the code:
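The demo loop looks roughly like this. Treat it as a sketch: I’m assuming the ImageAI v2.1.x `CustomObjectDetection` API, and the model and config file names (`hololens-ex-60--loss-2.76.h5`, `detection_config.json`) are placeholders for your own trained model.

```python
import numpy as np


def bgr_to_rgb(frame: np.ndarray) -> np.ndarray:
    """OpenCV delivers frames as BGR; reverse the channel axis to get RGB."""
    return frame[:, :, ::-1]


def run_webcam_demo(model_path="hololens-ex-60--loss-2.76.h5",
                    config_path="detection_config.json"):
    """Detect HoloLens devices in the live webcam feed (sketch)."""
    # Imported lazily: both are heavy, optional dependencies for this sketch.
    import cv2
    from imageai.Detection.Custom import CustomObjectDetection

    detector = CustomObjectDetection()
    detector.setModelTypeAsYOLOv3()
    detector.setModelPath(model_path)
    detector.setJsonPath(config_path)
    detector.loadModel()

    camera = cv2.VideoCapture(0)  # default webcam
    while True:
        ok, frame = camera.read()
        if not ok:
            break
        # With input_type="array" ImageAI takes the raw OpenCV frame;
        # depending on your ImageAI version you may need bgr_to_rgb(frame).
        annotated, detections = detector.detectObjectsFromImage(
            input_image=frame,
            input_type="array",
            output_type="array",
            minimum_percentage_probability=40)
        for d in detections:
            print(d["name"], d["percentage_probability"], d["box_points"])
        cv2.imshow("Hololens detector", annotated)
        if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
            break
    camera.release()
    cv2.destroyAllWindows()
```

The `bgr_to_rgb()` helper is there because OpenCV frames are BGR, which is exactly the conversion detail that took me some time to figure out.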

Happy coding!

Greetings @ Toronto

El Bruno

Resources

#AI – Introduction to #deeplearning vs. #machinelearning by @frlazzeri. The best 10 min read for today

Hi!

Explaining the differences and the relationship between Machine Learning and Deep Learning is something I face at every event or chat about Machine Learning.

And I used to have my 5-bullet explanation for this. However, now, thanks to Francesca Lazzeri (@frlazzeri), I can advise people to read this amazing article.

Introduction to deep learning vs. machine learning

So, you know, if you have 10 minutes, this will really help you understand the relationships between AI, ML and DL!

Happy Coding!

Greetings @ Toronto

El Bruno

#Office – Live subtitles in Microsoft Teams, oh yeah! Another great #AI live sample.

Hi !

I usually use the live subtitles demo feature in PowerPoint to showcase how amazing the current state of AI is, and how we can use it in our daily lives. And now, after Microsoft’s official announcement, we can also use the live subtitles feature in Microsoft Teams.

As you can expect, it’s very easy to use: just enable the Live Subtitles feature and Microsoft Teams will automatically start to

  • Listen to every audio conversation
  • Convert the audio to text
  • Present the text live as subtitles in the Microsoft Teams window
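Teams’ internals are not public, of course, but the same listen → transcribe → display pipeline can be sketched with the Azure Speech SDK. This is a sketch only, with placeholder key and region, and it is not how Teams implements the feature.

```python
def live_subtitles(key: str, region: str):
    """Print live subtitles from the default microphone.

    A sketch of the listen -> speech-to-text -> display pipeline using
    the Azure Speech SDK; key and region are placeholders.
    """
    # Imported lazily so the sketch reads without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)

    # "recognizing" fires with partial results -- that is what makes
    # the subtitles feel live; "recognized" delivers the final sentence.
    recognizer.recognizing.connect(
        lambda evt: print("\r" + evt.result.text, end=""))
    recognizer.recognized.connect(
        lambda evt: print("\r" + evt.result.text))

    recognizer.start_continuous_recognition()
    input("Press Enter to stop...\n")
    recognizer.stop_continuous_recognition()
```

The same continuous-recognition pattern works for any microphone-to-text scenario, not just subtitles.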

In the official announcement there is a nice animation showing this.

We may also expect some extra features, like language translation and more. That would be so cool!

Happy coding!

Greetings @ Toronto

El Bruno

#Office – Acronyms pane in Word, another amazing example of #AI embedded in our day to day tools – Powered by Microsoft Graph!

Hi!

Today’s post is, one more time, related to some amazing Artificial Intelligence features embedded in Microsoft Office. And this one is very helpful if you work in an organization with tons of acronyms. I’m sure you have your own set of acronyms at different levels: team, group and organization.

When you are new to these acronyms, it’s very hard to get up to date with all of them. That’s why the Acronyms feature in Word is so useful: it may help us and save us a lot of time!

The Acronyms pane is in the [References] tab in the Ribbon, or you can just search for it.

Search for Acronyms Pane in Word

Once you enable the pane, it will analyze the text of your Word document and also the definitions most used in your organization to get a sense of “what can be an acronym“. It leverages the Microsoft Graph to surface definitions of terms that have been previously defined across emails and documents.

The results are amazing:

Word Acronyms page results

Another amazing example of AI in our day to day use.

Happy coding!

Greetings @ Burlington

El Bruno

#AI – MineRL, play #Minecraft to benefit science!

Hi!

I’ve written a couple of times about Project Malmo and Minecraft, so if you like Minecraft and Artificial Intelligence, MineRL will make your day. Let’s start with some basics:

MineRL is a large-scale dataset on Minecraft of seven different tasks, which highlight a variety of research challenges including open-world multi-agent interactions, long-term planning, vision, control, navigation, and explicit and implicit subtask hierarchies.

There are two main ways to get involved with MineRL: entering the AI (DL) competition, or playing Minecraft (to create more source data to train and test models!).

On the playing side, MineRL wants to solve Minecraft using state-of-the-art Machine Learning. To do so, MineRL is creating one of the largest datasets of recorded human player data. The dataset includes a set of tasks that highlight many of the hardest problems in modern-day Reinforcement Learning: sparse rewards and hierarchical policies.
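If you want to poke at the environments from code, a minimal random-agent loop looks roughly like this. This is a sketch assuming the `minerl` package and the gym 0.x API; `MineRLTreechop-v0` is one of the published task names.

```python
def sample_treechop_episode(max_steps=100):
    """Run a few random steps in a MineRL task (sketch, minerl API assumed)."""
    # Imported lazily: minerl needs a Minecraft install to actually run.
    import gym
    import minerl  # noqa: F401 -- registers the MineRL gym environments

    env = gym.make("MineRLTreechop-v0")
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = env.action_space.sample()  # random agent, just to explore
        obs, reward, done, _ = env.step(action)
        total_reward += reward
        if done:
            break
    env.close()
    return total_reward
```

A random agent earns almost nothing here, which is exactly the sparse-reward problem the human dataset is meant to help with.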

There is plenty of information and details on the main website, and as soon as I finish some of my current work and personal projects, I’ll definitely spend more time here!

More information: http://minerl.io/about/

Happy coding!

Greetings @ Toronto

El Bruno

#AI – #Translatotron is not a dorky name, it’s maybe the best translator ever #GoogleResearch

Hi!

A couple of days ago, Google presented Translatotron. The name is not the best, but the idea is amazing:

Google researchers trained a neural network to map audio “voiceprints” from one language to another. After the tool translates an original audio, Translatotron retains the voice and tone of the original speaker. It converts audio input directly to audio output without any intermediary steps.

Model architecture of Translatotron.

As usual, the best way to understand this is to see Translatotron in action. Let’s take a look at the following audio samples:

  • Input (Spanish)
  • Reference translation (English)
  • Baseline cascade translation
  • Translatotron translation (canonical voice)
  • Translatotron translation (original speaker’s voice)

There is a full set of sample audios here: https://google-research.github.io/lingvo-lab/translatotron/#fisher_1

This is an amazing technology, and also a great starting point for scenarios where it’s important to keep the original speaker’s vocal characteristics. And let me be honest, it’s also scary if you think of fake-voice scenarios.

Happy coding!

Greetings @ Toronto

El Bruno

Source: Introducing Translatotron: An End-to-End Speech-to-Speech Translation Model

#Event – Resources for the sessions about #DeepLearning and #CustomVision at the @ChicagoCodeCamp

Hi!

Another post-event post, this time with a big thanks to the team behind one of the most amazing events I’ve been to this year: Chicago CodeCamp.

I had the chance to meet a lot of amazing people, to learn a lot during the sessions and also to visit the great city of Chicago.

As usual, now it’s time to share slides, code and more.

Deep Learning for Everyone? Challenge Accepted!

Let’s start with the Deep Learning resources:

Demos Source Code: https://github.com/elbruno/events/tree/master/2019%2005%2011%20Chicago%20CodeCamp%20Deep%20Learning

Session: How a PoC at home can scale to Enterprise Level using Custom Vision APIs

And here are the [How a PoC at home can scale to Enterprise Level using Custom Vision APIs] resources:

Demos Source Code: https://github.com/elbruno/events/tree/master/2019%2005%2011%20Chicago%20CodeCamp%20CustomVision

And finally, some Machine Learning.Net, Deep Learning and Custom Vision resources:

My posts on Custom Vision and ONNX

  1. Object recognition with Custom Vision and ONNX in Windows applications using WinML
  2. Object recognition with Custom Vision and ONNX in Windows applications using WinML
  3. Object recognition with Custom Vision and ONNX in Windows applications using Windows ML, drawing frames
  4. Object recognition with Custom Vision and ONNX in Windows applications using Windows ML, calculate FPS
  5. Can’t install Docker on Windows 10 Home, need Pro or Enterprise
  6. Running a Custom Vision project in a local Docker Container
  7. Analyzing images in a Console App using a Custom Vision project in a Docker Container
  8. Analyzing images using PostMan from a Custom Vision project hosted in a Docker Container
  9. Building the CustomVision.ai project in Docker in a RaspberryPi
  10. Container dies immediately upon successful start in a RaspberryPi. Of course, it’s all about TensorFlow dependencies
  11. About ports, IPs and more to access a container hosted in a Raspberry Pi
  12. Average response times using a CustomVision.ai docker container in a RaspberryPi and a PC

Windows 10 and YOLOV2 for Object Detection Series

See you at the next one in Chicago for some Deep Learning fun!

Happy coding!

Greetings @ Toronto

El Bruno

#Event – See you @ChicagoCodeCamp on May 11, 2019 for some Deep Learning and Custom Vision experiences

Hi!

I’m very lucky to be at the next Chicago CodeCamp with another session around Custom Vision:

How a PoC at home can scale to Enterprise Level using Custom Vision APIs

It all started with a DIY project to use Computer Vision for security cameras at home. A custom Machine Learning model is the core component used to analyze pictures to detect people, animals and more in a house environment. The AI processing is performed at the edge, in dedicated hardware and the collected information is stored in the cloud. The same idea can be applied to several CCTV scenarios, like parking lots, train stations, malls and more. However, moving this into enterprise scale brings a set of challenges, which are going to be described and explained in this session.

More information: https://www.chicagocodecamp.com/. And remember that we will also be talking about Deep Learning.

Greetings @ Toronto

El Bruno

#Windows10 – Windows Vision Skills (Preview), an amazing set of AI APIs to run in the edge!

Hi!

Today’s announcement is a big one if you are interested in moving AI capabilities to the edge. The Windows team made public the preview of the Windows Vision Skills framework:

Windows Vision Skills framework is meant to standardize the way AI and CV is put to use within a WinRT application running on the edge. It aims to abstract away the complexity of AI and CV techniques by simply defining the concept of skills which are modular pieces of code that process input and produce output. The implementation that contains the complex details is encapsulated by an extensible WinRT API that inherits the base class present in this namespace, which leverages built-in Windows primitives which in-turn eases interop with built-in acceleration frameworks or external 3rd party ones.

The official blog explains the basic features of the framework and describes a set of scenarios like Object Detector, Skeletal Detector, and Emotion Recognizer.

There are UWP apps in the repo samples, and it only took a minute to set everything up and get the app running. In the following image, it smoothly detects a person and a chair.

The next image is the sample for the Skeletal Detector (as an old Kinect dev, this really makes me happy!).

This is a big announcement, because all of these APIs are native, and that means we can easily use them in our Windows applications.

Greetings @ Toronto

El Bruno
