Windows 10 and YOLOV2 for Object Detection Series
- Introduction to YoloV2 for object detection
- Create a basic Windows 10 App and use YoloV2 with the camera for object detection
- Transform YoloV2 output analysis to C# classes and display them in frames
- Resize YoloV2 output to support multiple formats and process and display frames per second
Today I'll start by showing the final UWP App running, because most of this post is code and more code. The expected output of a Windows 10 object recognition App with YoloV2 looks similar to the following image.
Picking up where my previous post left off, we had the result of processing our webcam feed with YoloV2: an array of 21125 float numbers. This number is not arbitrary. Rereading the YOLOV2 documentation, we see that YOLO divides the image into a 13 x 13 cell grid.
Each of these cells is responsible for predicting 5 bounding boxes. A bounding box describes the rectangle that contains an object. And that is where the number comes from:
13 * 13 * 125 = 21125
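The 125 values per cell can be broken down further. As a small sketch (in Python for brevity, since the arithmetic is the same in C#), assuming the Tiny YOLO V2 model was trained on the 20 Pascal VOC classes and that each bounding box carries 4 coordinates plus 1 confidence score:

```python
GRID_SIZE = 13        # the image is divided into 13 x 13 cells
BOXES_PER_CELL = 5    # each cell predicts 5 bounding boxes
BOX_FIELDS = 5        # x, y, width, height, confidence
CLASS_COUNT = 20      # assumption: the 20 Pascal VOC classes

values_per_box = BOX_FIELDS + CLASS_COUNT            # 25
values_per_cell = BOXES_PER_CELL * values_per_box    # 125
output_size = GRID_SIZE * GRID_SIZE * values_per_cell

print(values_per_cell, output_size)  # 125 21125
```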
There are many posts that describe how YOLO works internally; I have left some in the references for anyone interested in the details.
With that in place, the next step was to translate that grid into C# objects to work with. Since the internet is a vast source of knowledge, instead of translating some of the Python classes that already exist, I found that Rene Schulte has, among his GitHub repositories, a fork of another repo that contains the following classes:
- This class is a parser that converts the grid into a collection of frames, with the size and location coordinates of the objects detected in the image.
- This class represents the frame of a detected object.
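To give an idea of what such a parser has to do, here is a minimal sketch in Python. It is not the code from the repository mentioned above. It assumes the flat output is laid out channel-first (125 channels of 13 x 13 values), uses the anchor sizes commonly published for Tiny YOLO V2 on Pascal VOC, and picks an arbitrary confidence threshold of 0.3. The names `BoundingBox` and `parse_output` are hypothetical.

```python
import math
from dataclasses import dataclass

GRID = 13
CELL_PIXELS = 32          # 416 / 13
BOXES_PER_CELL = 5
CLASS_COUNT = 20          # assumption: Pascal VOC classes
VALUES_PER_BOX = 5 + CLASS_COUNT
# Assumption: anchor (width, height) pairs, in grid-cell units.
ANCHORS = [(1.08, 1.19), (3.42, 4.41), (6.63, 11.38), (9.42, 5.11), (16.62, 10.52)]


@dataclass
class BoundingBox:        # hypothetical name
    x: float              # left edge, in pixels of the 416 x 416 image
    y: float              # top edge
    width: float
    height: float
    class_index: int
    confidence: float


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def parse_output(output, threshold=0.3):
    """Turn the flat list of 21125 floats into a list of BoundingBox."""
    assert len(output) == GRID * GRID * BOXES_PER_CELL * VALUES_PER_BOX

    def at(channel, cy, cx):
        # Assumed channel-first layout: [channel][row][column].
        return output[channel * GRID * GRID + cy * GRID + cx]

    boxes = []
    for cy in range(GRID):
        for cx in range(GRID):
            for b, (anchor_w, anchor_h) in enumerate(ANCHORS):
                first = b * VALUES_PER_BOX
                objectness = sigmoid(at(first + 4, cy, cx))
                scores = softmax([at(first + 5 + c, cy, cx) for c in range(CLASS_COUNT)])
                best = max(range(CLASS_COUNT), key=lambda c: scores[c])
                confidence = objectness * scores[best]
                if confidence < threshold:
                    continue
                # The centre is an offset inside the cell; the size scales an anchor.
                center_x = (cx + sigmoid(at(first, cy, cx))) * CELL_PIXELS
                center_y = (cy + sigmoid(at(first + 1, cy, cx))) * CELL_PIXELS
                width = math.exp(at(first + 2, cy, cx)) * anchor_w * CELL_PIXELS
                height = math.exp(at(first + 3, cy, cx)) * anchor_h * CELL_PIXELS
                boxes.append(BoundingBox(center_x - width / 2, center_y - height / 2,
                                         width, height, best, confidence))
    return boxes
```

A real parser would normally also remove overlapping boxes that describe the same object (non-maximum suppression), which this sketch leaves out.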
To show these frames, we add a Canvas over the control that shows the camera feed.
The following code completes the example, with the following considerations:
- There are a number of private variables for working with the model, the collection of frames, and the visual styles used to paint them.
- The frames for people are painted in green, and other objects in yellow.
The only detail left to mention is that YoloV2 is designed to work with 416 x 416 images. In this case, you have to resize the webcam control and the Canvas to that size so that the frames are displayed in the correct position.
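To see why the sizes must match: frame coordinates come out in the 416 x 416 space of the model, so a Canvas of any other size would need every frame multiplied by a scale factor. A minimal sketch in Python, assuming the camera image was simply stretched to 416 x 416 (no letterboxing); `scale_box` is a hypothetical helper:

```python
MODEL_SIZE = 416  # YoloV2 input width and height, in pixels


def scale_box(x, y, width, height, target_width, target_height):
    """Map a frame from 416 x 416 model space to a control of another size."""
    scale_x = target_width / MODEL_SIZE
    scale_y = target_height / MODEL_SIZE
    return (x * scale_x, y * scale_y, width * scale_x, height * scale_y)


# A frame at (104, 208) sized 52 x 26, drawn on an 832 x 624 Canvas:
print(scale_box(104, 208, 52, 26, 832, 624))  # (208.0, 312.0, 104.0, 39.0)
```

With the Canvas at exactly 416 x 416 both scale factors are 1, which is why no conversion is needed in this post.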
In the next post I will share the final example and also add some rescaling work to support resolutions other than 416 x 416.
Greetings @ Toronto
- YOLO: Real-time object detection
- YOLO9000: Better, Faster, Stronger by Joseph Redmon and Ali Farhadi (2016)
- ONNX Tools
- Azure AI Gallery, Tiny YOLO V2
- El Bruno, Windows Community Toolkit V 3.0 makes life incredibly easy if you need working with the camera in a UWP App
- Visual Studio Marketplace, Visual Studio Tools for AI
- Real-time object detection with YOLO
- Rene Schulte GitHub
- Sevans4067 WinML-TinyYolo