#Coding4Fun – How to control your #drone with 20 lines of code! (11/N)

Buy Me A Coffee

Hi!

Today code objective is very simple:

The drone is flying very happy, but if the camera detects a banana, the drone must land !

Let’s take a look at the program working:

drone flying and when detect a banana lands

And a couple of notes regarding the app

  • Still use Haar Cascades for object detection. I found an article with a Xml file to detect bananas, so I’m working with this one (see references).
  • Using Haar Cascades is not the best technique for object detection. During the testing process, I found a lot of false positives. Mostly with small portions of the frame who were detected as bananas. One solution, was to limit the size of the detected objects using OpenCV (I’ll write more about this in the future)
  • As you can see in the animation, when the drone is a few meters away, the video feed becomes messy. And because the object detection is performed locally, it takes some time to detect the banana.
  • I also implemented some code to take off the drone when the user press the key ‘T’, and land the drone when the user press the key ‘L’
  • The code is starting to become a mess, so a refactoring is needed

Here is the code

In next posts, I’ll analyze more in details how this works, and a couple of improvements that I can implement.

Happy coding!

Greetings

El Bruno

References

My Posts

#Windows10 – Windows Vision Skills (Preview), an amazing set of AI APIs to run in the edge!

Hi!

Today’s announcement is a big one if you are interested on move AI capabilities to the Edge. The Windows team make public the preview of Windows Vision Skills framework:

Windows Vision Skills framework is meant to standardize the way AI and CV is put to use within a WinRT application running on the edge. It aims to abstract away the complexity of AI and CV techniques by simply defining the concept of skills which are modular pieces of code that process input and produce output. The implementation that contains the complex details is encapsulated by an extensible WinRT API that inherits the base class present in this namespace, which leverages built-in Windows primitives which in-turn eases interop with built-in acceleration frameworks or external 3rd party ones.

The official blog explain the basic features of the framework and describes a set of scenarios like Object Detector, Skeletal Detector, and Emotion Recognizer.

We have UWP Apps in the repo samples, and it only took 1 min to setup everything to get the App up and running. In the following image, it smoothly detects a person and a chair.

The next image is the sample for Skeletal detector (as a old Kinect dev, this really makes me happy!)

This is an big announcement, because all of this APIs are native , and that means we can easily use them in

Greetings @ Toronto

El Bruno

References