#Event – Resources used during the session “Hack a drone, hack the camera and use AI” at the Global AI Tour, Lahore, Pakistan, 2020

Hi !

I had a great time early in the day with my Microsoft Student Partners from Lahore, Pakistan, for the Global AI Tour. As usual, time for slides and code:

Slides

Code

https://github.com/elbruno/events/tree/main/2020%2011%2030%20Global%20AI%20Tour%20Pakistan

Resources

Happy coding!

Greetings

El Bruno

#dotnet – Age and Gender estimation from the 🎦 camera feed using #OpenCV and #net5

Hi !

Faces are detected, so the next step is to use some prebuilt models to perform additional actions: estimating the age of a face, and also the gender. In order to do this, I downloaded a couple of models from here.

Disclaimer: these models are just sample models; do not use them in production. They do not cover all the necessary scenarios for a real implementation.

And the final WinForms app is kind of cute!

Below you can find the complete Form1 source code, but first let’s take a look at the sample analyzing a magazine photo.

opencv net 5 detecting multiple faces

So let’s analyze the code. For this sample, we load 3 models to work with age, faces and gender.

// # detect faces, age and gender using models from https://github.com/spmallick/learnopencv/tree/08e61fe80b8c0244cc4029ac11e44cd0fbb008c3/AgeGender
const string faceProto = "models/deploy.prototxt";
const string faceModel = "models/res10_300x300_ssd_iter_140000_fp16.caffemodel";
const string ageProto = @"models/age_deploy.prototxt";
const string ageModel = @"models/age_net.caffemodel";
const string genderProto = @"models/gender_deploy.prototxt";
const string genderModel = @"models/gender_net.caffemodel";
_ageNet = CvDnn.ReadNetFromCaffe(ageProto, ageModel);
_genderNet = CvDnn.ReadNetFromCaffe(genderProto, genderModel);
_faceNet = CvDnn.ReadNetFromCaffe(faceProto, faceModel);

Once the models are loaded, in the loop that analyzes the camera frames we perform face detection, and then age and gender estimation.

while (true)
{
    if (!_run) continue;
    var startTime = DateTime.Now;

    _capture.Read(_image);
    if (_image.Empty()) return;
    var imageRes = new Mat();
    Cv2.Resize(_image, imageRes, new Size(320, 240));
    var newImage = imageRes.Clone();

    if (_doFaceDetection) DetectFaces(newImage, imageRes);

    if (_fps) CalculateFps(startTime, newImage);

    var bmpWebCam = BitmapConverter.ToBitmap(imageRes);
    var bmpEffect = BitmapConverter.ToBitmap(newImage);

    pictureBoxWebCam.Image = bmpWebCam;
    pictureBoxEffect.Image = bmpEffect;
}

For each detected face, we perform the age and gender estimation. In order to do this, we crop the detected face (plus a padding), and perform the estimation on the cropped image.

private void AnalyzeAgeAndGender(int x1, int y1, int x2, int y2, Mat imageRes, Mat newImage)
{
    // get face frame
    var x = x1 - Padding;
    var y = y1 - Padding;
    var w = (x2 - x1) + Padding * 3;
    var h = (y2 - y1) + Padding * 3;
    Rect roiNew = new Rect(x, y, w, h);
    var face = imageRes[roi: roiNew];

    var meanValues = new Scalar(78.4263377603, 87.7689143744, 114.895847746);
    var blobGender = CvDnn.BlobFromImage(face, 1.0, new Size(227, 227), mean: meanValues,
        swapRB: false);
    _genderNet.SetInput(blobGender);
    var genderPreds = _genderNet.Forward();

    GetMaxClass(genderPreds, out int classId, out double classProbGender);
    var gender = _genderList[classId];

    _ageNet.SetInput(blobGender);
    var agePreds = _ageNet.Forward();
    GetMaxClass(agePreds, out int classIdAge, out double classProbAge);
    var age = _ageList[classIdAge];

    var label = $"{gender},{age}";
    Cv2.PutText(newImage, label, new Point(x1 - 10, y2 + 20), HersheyFonts.HersheyComplexSmall, 1, Scalar.Yellow, 1);
}

private void GetMaxClass(Mat probBlob, out int classId, out double classProb)
{
    // flatten the prediction blob to a single row and find the max probability
    using var probMat = probBlob.Reshape(1, 1);
    Cv2.MinMaxLoc(probMat, out _, out classProb, out _, out var classNumber);
    classId = classNumber.X;
    Debug.WriteLine($"X: {classNumber.X} - Y: {classNumber.Y} ");
}

It’s also important to mention the GetMaxClass() function, which retrieves the class with the highest probability from the prediction results.

And the complete source code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Windows.Forms;
using OpenCvSharp;
using OpenCvSharp.Dnn;
using OpenCvSharp.Extensions;
using Point = OpenCvSharp.Point;
using Size = OpenCvSharp.Size;

namespace Demo10_WinFormAgeAndGender
{
    public partial class Form1 : Form
    {
        private bool _run = true;
        private bool _doFaceDetection = true;
        private bool _doAgeGender = false;
        private VideoCapture _capture;
        private Mat _image;
        private Thread _cameraThread;
        private bool _fps = false;
        private Net _faceNet;
        private Net _ageNet;
        private Net _genderNet;
        private const int LineThickness = 2;
        private const int Padding = 10;
        private readonly List<string> _genderList = new List<string> { "Male", "Female" };
        private readonly List<string> _ageList = new List<string> { "(0-2)", "(4-6)", "(8-12)", "(15-20)", "(25-32)", "(38-43)", "(48-53)", "(60-100)" };

        public Form1()
        {
            InitializeComponent();
            Load += Form1_Load;
            Closed += Form1_Closed;
        }

        private void Form1_Closed(object sender, EventArgs e)
        {
            _cameraThread.Interrupt();
            _capture.Release();
        }

        private void btnStart_Click(object sender, EventArgs e)
        {
            _run = true;
        }

        private void btnStop_Click(object sender, EventArgs e)
        {
            _run = false;
        }

        private void btnFDDNN_Click(object sender, EventArgs e)
        {
            _doFaceDetection = !_doFaceDetection;
        }

        private void buttonFPS_Click(object sender, EventArgs e)
        {
            _fps = !_fps;
        }

        private void btnAgeGender_Click(object sender, EventArgs e)
        {
            _doAgeGender = !_doAgeGender;
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            // # detect faces, age and gender using models from https://github.com/spmallick/learnopencv/tree/08e61fe80b8c0244cc4029ac11e44cd0fbb008c3/AgeGender
            const string faceProto = "models/deploy.prototxt";
            const string faceModel = "models/res10_300x300_ssd_iter_140000_fp16.caffemodel";
            const string ageProto = @"models/age_deploy.prototxt";
            const string ageModel = @"models/age_net.caffemodel";
            const string genderProto = @"models/gender_deploy.prototxt";
            const string genderModel = @"models/gender_net.caffemodel";
            _ageNet = CvDnn.ReadNetFromCaffe(ageProto, ageModel);
            _genderNet = CvDnn.ReadNetFromCaffe(genderProto, genderModel);
            _faceNet = CvDnn.ReadNetFromCaffe(faceProto, faceModel);
            _capture = new VideoCapture(0);
            _image = new Mat();
            _cameraThread = new Thread(new ThreadStart(CaptureCameraCallback));
            _cameraThread.Start();
        }

        private void CaptureCameraCallback()
        {
            while (true)
            {
                if (!_run) continue;
                var startTime = DateTime.Now;
                _capture.Read(_image);
                if (_image.Empty()) return;
                var imageRes = new Mat();
                Cv2.Resize(_image, imageRes, new Size(320, 240));
                var newImage = imageRes.Clone();
                if (_doFaceDetection) DetectFaces(newImage, imageRes);
                if (_fps) CalculateFps(startTime, newImage);
                var bmpWebCam = BitmapConverter.ToBitmap(imageRes);
                var bmpEffect = BitmapConverter.ToBitmap(newImage);
                pictureBoxWebCam.Image = bmpWebCam;
                pictureBoxEffect.Image = bmpEffect;
            }
        }

        private static void CalculateFps(DateTime startTime, Mat imageRes)
        {
            var diff = DateTime.Now - startTime;
            var fpsInfo = $"FPS: NaN";
            if (diff.Milliseconds > 0)
            {
                var fpsVal = 1.0 / diff.Milliseconds * 1000;
                fpsInfo = $"FPS: {fpsVal:00}";
            }
            Cv2.PutText(imageRes, fpsInfo, new Point(10, 20), HersheyFonts.HersheyComplexSmall, 1, Scalar.White);
        }

        private void DetectFaces(Mat newImage, Mat imageRes)
        {
            // DNN
            int frameHeight = newImage.Rows;
            int frameWidth = newImage.Cols;
            using var blob = CvDnn.BlobFromImage(newImage, 1.0, new Size(300, 300), new Scalar(104, 117, 123), false, false);
            _faceNet.SetInput(blob, "data");
            using var detection = _faceNet.Forward("detection_out");
            using var detectionMat = new Mat(detection.Size(2), detection.Size(3), MatType.CV_32F, detection.Ptr(0));
            for (int i = 0; i < detectionMat.Rows; i++)
            {
                float confidence = detectionMat.At<float>(i, 2);
                if (confidence > 0.7)
                {
                    int x1 = (int)(detectionMat.At<float>(i, 3) * frameWidth);
                    int y1 = (int)(detectionMat.At<float>(i, 4) * frameHeight);
                    int x2 = (int)(detectionMat.At<float>(i, 5) * frameWidth);
                    int y2 = (int)(detectionMat.At<float>(i, 6) * frameHeight);
                    Cv2.Rectangle(newImage, new Point(x1, y1), new Point(x2, y2), Scalar.Green, LineThickness);
                    if (_doAgeGender)
                        AnalyzeAgeAndGender(x1, y1, x2, y2, imageRes, newImage);
                }
            }
        }

        private void AnalyzeAgeAndGender(int x1, int y1, int x2, int y2, Mat imageRes, Mat newImage)
        {
            // get face frame
            var x = x1 - Padding;
            var y = y1 - Padding;
            var w = (x2 - x1) + Padding * 3;
            var h = (y2 - y1) + Padding * 3;
            Rect roiNew = new Rect(x, y, w, h);
            var face = imageRes[roi: roiNew];
            var meanValues = new Scalar(78.4263377603, 87.7689143744, 114.895847746);
            var blobGender = CvDnn.BlobFromImage(face, 1.0, new Size(227, 227), mean: meanValues,
                swapRB: false);
            _genderNet.SetInput(blobGender);
            var genderPreds = _genderNet.Forward();
            GetMaxClass(genderPreds, out int classId, out double classProbGender);
            var gender = _genderList[classId];
            _ageNet.SetInput(blobGender);
            var agePreds = _ageNet.Forward();
            GetMaxClass(agePreds, out int classIdAge, out double classProbAge);
            var age = _ageList[classIdAge];
            var label = $"{gender},{age}";
            Cv2.PutText(newImage, label, new Point(x1 - 10, y2 + 20), HersheyFonts.HersheyComplexSmall, 1, Scalar.Yellow, 1);
        }

        private void GetMaxClass(Mat probBlob, out int classId, out double classProb)
        {
            // flatten the prediction blob to a single row and find the max probability
            using var probMat = probBlob.Reshape(1, 1);
            Cv2.MinMaxLoc(probMat, out _, out classProb, out _, out var classNumber);
            classId = classNumber.X;
            Debug.WriteLine($"X: {classNumber.X} - Y: {classNumber.Y}");
        }
    }
}

Happy coding!

Greetings

El Bruno

References

#dotnet – Detecting Faces, DNN vs Haar Cascades from the 🎦 camera feed using #OpenCV and #net5

Hi !

In one session around computer vision, someone asked which approach is better:

Haar Cascades or DNN?

And the answer can be shown using the video below.

net5 opencv face detection comparison between dnn and haar cascades

As you can see, Haar Cascades work great for faces looking directly at the camera, with good light and in optimal conditions. However, once my face stops looking at the camera, the Haar Cascade stops detecting it, while the custom DNN model keeps working.

But, and this is important, the DNN model relies only on the set of faces that was used for training. If the training set doesn’t cover certain demographics, for example Asian faces, the model won’t work well with those subjects.

In the end, DNN models usually work better than Haar Cascades; however, it’s really important to know the limitations of each approach.
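
For reference, here is a minimal Python sketch of the same comparison. It assumes OpenCV is installed, the Haar cascade XML and the Caffe model files used in these posts sit next to the script, and sample.jpg is a hypothetical test photo:

import cv2

# Haar Cascade detector (the XML ships with OpenCV)
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# DNN detector, the same Caffe model used in the .NET samples
net = cv2.dnn.readNetFromCaffe('deploy.prototxt',
                               'res10_300x300_ssd_iter_140000_fp16.caffemodel')

frame = cv2.imread('sample.jpg')  # hypothetical test image
h, w = frame.shape[:2]

# Haar: needs a grayscale input and returns face rectangles directly
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
for (x, y, fw, fh) in face_cascade.detectMultiScale(gray, 1.3, 5):
    cv2.rectangle(frame, (x, y), (x + fw, y + fh), (0, 0, 255), 2)

# DNN: takes a 300x300 blob and returns a matrix of candidates with confidences
blob = cv2.dnn.blobFromImage(frame, 1.0, (300, 300), (104, 117, 123), False, False)
net.setInput(blob)
detections = net.forward()
for det in detections.reshape(-1, 7):
    if float(det[2]) > 0.7:  # confidence threshold, as in the .NET samples
        x1, y1 = int(det[3] * w), int(det[4] * h)
        x2, y2 = int(det[5] * w), int(det[6] * h)
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)

cv2.imwrite('comparison.jpg', frame)

Running this over a folder of test photos makes the trade-off above easy to see: both detectors agree on frontal faces, and only the DNN keeps up with profiles and bad lighting.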

Happy coding!

Greetings

El Bruno

References

#dotnet – Detecting Faces using DNN from the 🎦 camera feed in a WinForm using #OpenCV and #net5

Hi !

Let’s do some face detection using a DNN model (see references). As yesterday, I won’t go into the details; there are almost 20 years of online documentation available.

And, IMHO, code is much more useful than long prose, so let’s go there. First, load the Caffe model and the config file.

// download model and prototxt from https://github.com/spmallick/learnopencv/tree/master/FaceDetectionComparison/models
const string configFile = "deploy.prototxt";
const string faceModel = "res10_300x300_ssd_iter_140000_fp16.caffemodel";
_faceNet = CvDnn.ReadNetFromCaffe(configFile, faceModel);

And, once we grab the camera frame, let’s perform face detection using the DNN model:

int frameHeight = newImage.Rows;
int frameWidth = newImage.Cols;

using var blob = CvDnn.BlobFromImage(newImage, 1.0, new Size(300, 300),
    new Scalar(104, 117, 123), false, false);
_faceNet.SetInput(blob, "data");

using var detection = _faceNet.Forward("detection_out");
using var detectionMat = new Mat(detection.Size(2), detection.Size(3), MatType.CV_32F,
    detection.Ptr(0));
for (int i = 0; i < detectionMat.Rows; i++)
{
    float confidence = detectionMat.At<float>(i, 2);

    if (confidence > 0.7)
    {
        int x1 = (int)(detectionMat.At<float>(i, 3) * frameWidth);
        int y1 = (int)(detectionMat.At<float>(i, 4) * frameHeight);
        int x2 = (int)(detectionMat.At<float>(i, 5) * frameWidth);
        int y2 = (int)(detectionMat.At<float>(i, 6) * frameHeight);

        Cv2.Rectangle(newImage, new Point(x1, y1), new Point(x2, y2), Scalar.Green);
        Cv2.PutText(newImage, "Face Dnn", new Point(x1 + 2, y2 + 20),
            HersheyFonts.HersheyComplexSmall, 1, Scalar.Green, 2);
    }
}

And the full code is here:

using System;
using System.Threading;
using System.Windows.Forms;
using OpenCvSharp;
using OpenCvSharp.Dnn;
using OpenCvSharp.Extensions;
using Point = OpenCvSharp.Point;
using Size = OpenCvSharp.Size;

namespace Demo08_WinFormFaceDetectionDNN
{
    public partial class Form1 : Form
    {
        private bool _run = false;
        private bool _doFaceDetection = false;
        private VideoCapture _capture;
        private Mat _image;
        private Thread _cameraThread;
        private bool _fps = false;
        private Net _faceNet;

        public Form1()
        {
            InitializeComponent();
            Load += Form1_Load;
            Closed += Form1_Closed;
        }

        private void Form1_Closed(object sender, EventArgs e)
        {
            _cameraThread.Interrupt();
            _capture.Release();
        }

        private void btnStart_Click(object sender, EventArgs e)
        {
            _run = true;
        }

        private void btnStop_Click(object sender, EventArgs e)
        {
            _run = false;
        }

        private void btnFDDNN_Click(object sender, EventArgs e)
        {
            _doFaceDetection = !_doFaceDetection;
        }

        private void buttonFPS_Click(object sender, EventArgs e)
        {
            _fps = !_fps;
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            // download model and prototxt from https://github.com/spmallick/learnopencv/tree/master/FaceDetectionComparison/models
            const string configFile = "deploy.prototxt";
            const string faceModel = "res10_300x300_ssd_iter_140000_fp16.caffemodel";
            _faceNet = CvDnn.ReadNetFromCaffe(configFile, faceModel);
            _capture = new VideoCapture(0);
            _image = new Mat();
            _cameraThread = new Thread(new ThreadStart(CaptureCameraCallback));
            _cameraThread.Start();
        }

        private void CaptureCameraCallback()
        {
            while (true)
            {
                if (!_run) continue;
                var startTime = DateTime.Now;
                _capture.Read(_image);
                if (_image.Empty()) return;
                var imageRes = new Mat();
                Cv2.Resize(_image, imageRes, new Size(320, 240));
                var newImage = imageRes.Clone();
                if (_doFaceDetection)
                {
                    int frameHeight = newImage.Rows;
                    int frameWidth = newImage.Cols;
                    using var blob = CvDnn.BlobFromImage(newImage, 1.0, new Size(300, 300),
                        new Scalar(104, 117, 123), false, false);
                    _faceNet.SetInput(blob, "data");
                    using var detection = _faceNet.Forward("detection_out");
                    using var detectionMat = new Mat(detection.Size(2), detection.Size(3), MatType.CV_32F,
                        detection.Ptr(0));
                    for (int i = 0; i < detectionMat.Rows; i++)
                    {
                        float confidence = detectionMat.At<float>(i, 2);
                        if (confidence > 0.7)
                        {
                            int x1 = (int)(detectionMat.At<float>(i, 3) * frameWidth);
                            int y1 = (int)(detectionMat.At<float>(i, 4) * frameHeight);
                            int x2 = (int)(detectionMat.At<float>(i, 5) * frameWidth);
                            int y2 = (int)(detectionMat.At<float>(i, 6) * frameHeight);
                            Cv2.Rectangle(newImage, new Point(x1, y1), new Point(x2, y2), Scalar.Green);
                            Cv2.PutText(newImage, "Face Dnn", new Point(x1 + 2, y2 + 20),
                                HersheyFonts.HersheyComplexSmall, 1, Scalar.Green, 2);
                        }
                    }
                }
                if (_fps)
                {
                    var diff = DateTime.Now - startTime;
                    var fpsInfo = $"FPS: NaN";
                    if (diff.Milliseconds > 0)
                    {
                        var fpsVal = 1.0 / diff.Milliseconds * 1000;
                        fpsInfo = $"FPS: {fpsVal:00}";
                    }
                    Cv2.PutText(imageRes, fpsInfo, new Point(10, 20), HersheyFonts.HersheyComplexSmall, 1, Scalar.White);
                }
                var bmpWebCam = BitmapConverter.ToBitmap(imageRes);
                var bmpEffect = BitmapConverter.ToBitmap(newImage);
                pictureBoxWebCam.Image = bmpWebCam;
                pictureBoxEffect.Image = bmpEffect;
            }
        }
    }
}

That’s all for today!

Happy coding!

Greetings

El Bruno

References

#dotnet – Detecting Faces using Cascades from the 🎦 camera feed in a WinForm using #OpenCV and #net5

Hi !

Let’s do some face detection using one of the most popular methods: Haar Cascades (see references). I won’t write about cascades themselves; there are almost 20 years of online documentation available.

And, IMHO, code is much more useful than long prose, so let’s go there.

First, load the cascade definition file.

_faceCascade = new CascadeClassifier();
_faceCascade.Load("haarcascade_frontalface_default.xml");

And, once we grab the camera frame, let’s perform some face detection:

using var gray = new Mat();
Cv2.CvtColor(newImage, gray, ColorConversionCodes.BGR2GRAY);

var faces = _faceCascade.DetectMultiScale(gray, 1.3, 5);
foreach (var face in faces)
{
  Cv2.Rectangle(newImage, face, Scalar.Red);
  Cv2.PutText(newImage, "Face Cascade", new Point(face.Left + 2, face.Top + face.Width + 20), HersheyFonts.HersheyComplexSmall, 1, Scalar.Red, 2);
}

And the full code is here:

using System;
using System.Threading;
using System.Windows.Forms;
using OpenCvSharp;
using OpenCvSharp.Extensions;
using Point = OpenCvSharp.Point;
using Size = OpenCvSharp.Size;

namespace Demo07_WinFormFaceDetectionCascades
{
    public partial class Form1 : Form
    {
        private bool _run = true;
        private bool _doFaceDetection = false;
        private VideoCapture _capture;
        private Mat _image;
        private Thread _cameraThread;
        private bool _fps = false;
        private CascadeClassifier _faceCascade;

        public Form1()
        {
            InitializeComponent();
            Load += Form1_Load;
            Closed += Form1_Closed;
        }

        private void Form1_Closed(object sender, EventArgs e)
        {
            _cameraThread.Interrupt();
            _capture.Release();
        }

        private void btnStart_Click(object sender, EventArgs e)
        {
            _run = true;
        }

        private void btnStop_Click(object sender, EventArgs e)
        {
            _run = false;
        }

        private void btnFaceDetectionCascades_Click(object sender, EventArgs e)
        {
            _doFaceDetection = !_doFaceDetection;
        }

        private void buttonFPS_Click(object sender, EventArgs e)
        {
            _fps = !_fps;
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            _faceCascade = new CascadeClassifier();
            _faceCascade.Load("haarcascade_frontalface_default.xml");
            _capture = new VideoCapture(0);
            _image = new Mat();
            _cameraThread = new Thread(new ThreadStart(CaptureCameraCallback));
            _cameraThread.Start();
        }

        private void CaptureCameraCallback()
        {
            while (true)
            {
                if (!_run) continue;
                var startTime = DateTime.Now;
                _capture.Read(_image);
                if (_image.Empty()) return;
                var imageRes = new Mat();
                Cv2.Resize(_image, imageRes, new Size(320, 240));
                var newImage = imageRes.Clone();
                if (_doFaceDetection)
                {
                    using var gray = new Mat();
                    Cv2.CvtColor(newImage, gray, ColorConversionCodes.BGR2GRAY);
                    var faces = _faceCascade.DetectMultiScale(gray, 1.3, 5);
                    foreach (var face in faces)
                    {
                        Cv2.Rectangle(newImage, face, Scalar.Red);
                        Cv2.PutText(newImage, "Face Cascade", new Point(face.Left + 2, face.Top + face.Width + 20),
                            HersheyFonts.HersheyComplexSmall, 1, Scalar.Red, 2);
                    }
                }
                if (_fps)
                {
                    var diff = DateTime.Now - startTime;
                    var fpsInfo = $"FPS: NaN";
                    if (diff.Milliseconds > 0)
                    {
                        var fpsVal = 1.0 / diff.Milliseconds * 1000;
                        fpsInfo = $"FPS: {fpsVal:00}";
                    }
                    Cv2.PutText(newImage, fpsInfo, new Point(10, 20), HersheyFonts.HersheyComplexSmall, 1, Scalar.White);
                }
                var bmpWebCam = BitmapConverter.ToBitmap(imageRes);
                var bmpEffect = BitmapConverter.ToBitmap(newImage);
                pictureBoxWebCam.Image = bmpWebCam;
                pictureBoxEffect.Image = bmpEffect;
            }
        }
    }
}

That’s all for today!

Happy coding!

Greetings

El Bruno

References

#Event – Resources used on my session for the @netcoreconf

Hi !

Another huge and amazing event from my friends at @netcoreconf, the perfect excuse to talk about Python, machine learning, computer vision and more. This one was also tricky: no slides, just code for 50 minutes, so here are some related resources.

Code

The code used during the session is available here: https://github.com/elbruno/events/tree/main/2020%2010%2003%20Netcoreconf%20Virtual

Session Recording

Resources

#Coding4Fun – How to control your #drone with 20 lines of code! (21/N)

Hi !

In my post series I already wrote about how to detect faces with a camera and OpenCV. However, a drone can also be moved on command, so let’s write some lines to detect a face, and calculate the orientation and distance of the detected face from the center of the camera frame.

In order to do this, first let’s draw a grid in the camera frame, and once a face is detected, show its distance and orientation from the center.

face detected on camera and calculate position from center

Let’s start with the grid. The idea is to create a 3×3 grid in the camera frame, and use the center cell as the reference for detected objects. The code to create the 3×3 grid is this one:

def displayGrid(frame):
    # Add a 3x3 Grid
    cv2.line(frame, (int(camera_Width/2)-centerZone, 0)     , (int(camera_Width/2)-centerZone, camera_Heigth)    , lineColor, lineThickness)
    cv2.line(frame, (int(camera_Width/2)+centerZone, 0)     , (int(camera_Width/2)+centerZone, camera_Heigth)    , lineColor, lineThickness)
    cv2.line(frame, (0, int(camera_Heigth / 2) - centerZone), (camera_Width, int(camera_Heigth / 2) - centerZone), lineColor, lineThickness)
    cv2.line(frame, (0, int(camera_Heigth / 2) + centerZone), (camera_Width, int(camera_Heigth / 2) + centerZone), lineColor, lineThickness)

# Camera Settings
camera_Width  = 1024 # 1280 # 640
camera_Heigth = 780  # 960  # 480
centerZone    = 100

# GridLine color green and thickness
lineColor = (0, 255, 0) 
lineThickness = 2

We use the OpenCV line() function and do some calculations to get the start and end points for the 4 lines of the grid: 2 vertical and 2 horizontal. For this demo, I’ll test this with my main webcam.

drone 3x3 grid in the camera frame

Based on my face detection samples and other samples on GitHub (see references), I’ll now calculate the position of the detected face (given as x, y, h, w) relative to the center of the camera:

def calculatePositionForDetectedFace(frame, x, y, h , w):
    # calculate direction and relative position of the face
    cx = int(x + (w / 2))  # Center X of the Face
    cy = int(y + (h / 2))  # Center Y of the Face

    if (cx <int(camera_Width/2) - centerZone):
        cv2.putText  (frame, " LEFT " , (20, 50), cv2.FONT_HERSHEY_COMPLEX, 1 , colorGreen, 2)
        dir = 1
    elif (cx > int(camera_Width / 2) + centerZone):
        cv2.putText(frame, " RIGHT ", (20, 50), cv2.FONT_HERSHEY_COMPLEX,1,colorGreen, 3)
        dir = 2
    elif (cy < int(camera_Heigth / 2) - centerZone):
        cv2.putText(frame, " UP ", (20, 50), cv2.FONT_HERSHEY_COMPLEX,1,colorGreen, 3)
        dir = 3
    elif (cy > int(camera_Heigth / 2) + centerZone):
        cv2.putText(frame, " DOWN ", (20, 50), cv2.FONT_HERSHEY_COMPLEX, 1,colorGreen, 3)
        dir = 4
    else: dir=0

    # display detected face frame, line from center and direction to go
    cv2.line     (frame, (int(camera_Width/2),int(camera_Heigth/2)), (cx,cy), colorRed, messageThickness)
    cv2.rectangle(frame, (x, y), (x + w, y + h), colorBlue, messageThickness)
    cv2.putText  (frame, str(int(x)) + " " + str(int(y)), (x - 20, y - 45), cv2.FONT_HERSHEY_COMPLEX,0.7, colorRed, messageThickness)

The output is similar to this one

And now, with the base code completed, it’s time to add this logic to the drone samples!

Bonus: the complete code.

# Bruno Capuano 2020
# display the camera feed using OpenCV
# display a 3x3 Grid
# detect faces using openCV and haar cascades
# calculate the relative position for the face from the center of the camera
import os
import time
import cv2

def displayGrid(frame):
    # Add a 3x3 Grid
    cv2.line(frame, (int(camera_Width/2) - centerZone, 0), (int(camera_Width/2) - centerZone, camera_Heigth), lineColor, lineThickness)
    cv2.line(frame, (int(camera_Width/2) + centerZone, 0), (int(camera_Width/2) + centerZone, camera_Heigth), lineColor, lineThickness)
    cv2.line(frame, (0, int(camera_Heigth / 2) - centerZone), (camera_Width, int(camera_Heigth / 2) - centerZone), lineColor, lineThickness)
    cv2.line(frame, (0, int(camera_Heigth / 2) + centerZone), (camera_Width, int(camera_Heigth / 2) + centerZone), lineColor, lineThickness)

def calculatePositionForDetectedFace(frame, x, y, h, w):
    # calculate direction and relative position of the face
    cx = int(x + (w / 2))  # Center X of the Face
    cy = int(y + (h / 2))  # Center Y of the Face
    if (cx < int(camera_Width/2) - centerZone):
        cv2.putText(frame, " LEFT ", (20, 50), cv2.FONT_HERSHEY_COMPLEX, 1, colorGreen, 2)
        dir = 1
    elif (cx > int(camera_Width / 2) + centerZone):
        cv2.putText(frame, " RIGHT ", (20, 50), cv2.FONT_HERSHEY_COMPLEX, 1, colorGreen, 3)
        dir = 2
    elif (cy < int(camera_Heigth / 2) - centerZone):
        cv2.putText(frame, " UP ", (20, 50), cv2.FONT_HERSHEY_COMPLEX, 1, colorGreen, 3)
        dir = 3
    elif (cy > int(camera_Heigth / 2) + centerZone):
        cv2.putText(frame, " DOWN ", (20, 50), cv2.FONT_HERSHEY_COMPLEX, 1, colorGreen, 3)
        dir = 4
    else:
        dir = 0
    # display detected face frame, line from center and direction to go
    cv2.line(frame, (int(camera_Width/2), int(camera_Heigth/2)), (cx, cy), colorRed, messageThickness)
    cv2.rectangle(frame, (x, y), (x + w, y + h), colorBlue, messageThickness)
    cv2.putText(frame, str(int(x)) + " " + str(int(y)), (x - 20, y - 45), cv2.FONT_HERSHEY_COMPLEX, 0.7, colorRed, messageThickness)

# Camera Settings
camera_Width  = 1024  # 1280 # 640
camera_Heigth = 780   # 960  # 480
centerZone    = 100

# GridLine color green and thickness
lineColor = (0, 255, 0)
lineThickness = 2

# message color and thickness
colorBlue = (255, 0, 0)
colorGreen = (0, 255, 0)
colorRed = (0, 0, 255)
messageThickness = 2

dsize = (camera_Width, camera_Heigth)
video_capture = cv2.VideoCapture(1)
time.sleep(2.0)

# enable face detection
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

i = 0
while True:
    i = i + 1
    ret, frameOrig = video_capture.read()
    frame = cv2.resize(frameOrig, dsize)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    displayGrid(frame)

    # detect faces
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    for (x, y, w, h) in faces:
        # display face in grid
        calculatePositionForDetectedFace(frame, x, y, h, w)

    cv2.imshow('@ElBruno - Follow Faces', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

video_capture.release()
cv2.destroyAllWindows()

Happy coding!

Greetings

El Bruno

References

#Coding4Fun – How to control your #drone with 20 lines of code! (12/N)

Hi!

Today’s code objective is very simple, based on a request I received from the internet:

The drone is flying very happy, but if the camera detects a face, the drone will flip out !

Let’s take a look at the program working:

This one is very similar to the previous one. I also realized that I may need a better camera to record the live action side by side with the drone footage, but I think you get the idea. The command to make the drone flip is “flip x”, where “x” is the direction. For example:

"flip l" # flip left
"flip r" # flip right
"flip f" # flip forward
"flip b" # flip back

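Stripped of the camera and threading code, a flip is just another UDP text command sent to the drone. Here is a minimal sketch, assuming the computer is already connected to the Tello Wi-Fi; the fixed sleeps are a crude stand-in for the response handling done in the full program below:

import socket
import time

# Tello command endpoint (the same address used in the full program)
address = ('192.168.10.1', 8889)
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(('', 8889))

# enter SDK mode, take off, flip left and land
for command in ['command', 'takeoff', 'flip l', 'land']:
    sock.sendto(command.encode('utf-8'), address)
    time.sleep(5)  # crude wait instead of reading the drone response
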
Here is the code:

# Bruno Capuano
# detect faces using haar cascades from https://github.com/opencv/opencv/tree/master/data/haarcascades
# enable drone video camera
# display video camera using OpenCV and display FPS
# detect faces
# launch the drone with key T, and land with key L
# if the drone is flying, and a face is detected, the drone will flip left
import cv2
import socket
import time
import threading
import winsound

def receiveData():
    global response
    while True:
        try:
            response, _ = clientSocket.recvfrom(1024)
        except:
            break

def readStates():
    global battery
    while True:
        try:
            response_state, _ = stateSocket.recvfrom(256)
            if response_state != 'ok':
                response_state = response_state.decode('ASCII')
                list = response_state.replace(';', ':').split(':')
                battery = int(list[21])
        except:
            break

def sendCommand(command):
    global response
    timestamp = int(time.time() * 1000)
    clientSocket.sendto(command.encode('utf-8'), address)
    while response is None:
        if (time.time() * 1000) - timestamp > 5 * 1000:
            return False
    return response

def sendReadCommand(command):
    response = sendCommand(command)
    try:
        response = str(response)
    except:
        pass
    return response

def sendControlCommand(command):
    response = None
    for i in range(0, 5):
        response = sendCommand(command)
        if response == 'OK' or response == 'ok':
            return True
    return False

# -----------------------------------------------
# Main program
# -----------------------------------------------
# connection info
UDP_IP = '192.168.10.1'
UDP_PORT = 8889
last_received_command = time.time()
STATE_UDP_PORT = 8890
address = (UDP_IP, UDP_PORT)
response = None
response_state = None
clientSocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
clientSocket.bind(('', UDP_PORT))
stateSocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
stateSocket.bind(('', STATE_UDP_PORT))

# start threads
recThread = threading.Thread(target=receiveData)
recThread.daemon = True
recThread.start()
stateThread = threading.Thread(target=readStates)
stateThread.daemon = True
stateThread.start()

# connect to drone
response = sendControlCommand("command")
print(f'command response: {response}')
response = sendControlCommand("streamon")
print(f'streamon response: {response}')

# drone information
battery = 0

# enable face detection
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# open UDP video feed
print(f'opening UDP video feed, wait 2 seconds ')
videoUDP = 'udp://192.168.10.1:11111'
cap = cv2.VideoCapture(videoUDP)
time.sleep(2)

drone_flying = False
fpsInfo = ''  # initialized here so the console log below never hits an undefined name
i = 0
while True:
    i = i + 1
    start_time = time.time()
    try:
        _, frameOrig = cap.read()
        frame = cv2.resize(frameOrig, (480, 360))
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # detect faces
        faces = face_cascade.detectMultiScale(gray, 1.3, 5)
        for (x, y, w, h) in faces:
            cv2.rectangle(frame, (x, y), ((x + w), (y + h)), (0, 0, 255), 2)
            font = cv2.FONT_HERSHEY_COMPLEX_SMALL
            cv2.putText(frame, 'face', (h + 6, w - 6), font, 0.7, (255, 255, 255), 1)

        # flip if the drone is flying and a face was detected
        if (len(faces) > 0 and drone_flying == True):
            msg = "flip l"
            sendCommand(msg)

        # display fps
        if (time.time() - start_time) > 0:
            fpsInfo = "FPS: " + str(1.0 / (time.time() - start_time))  # FPS = 1 / time to process loop
            font = cv2.FONT_HERSHEY_DUPLEX
            cv2.putText(frame, fpsInfo, (10, 20), font, 0.4, (255, 255, 255), 1)

        cv2.imshow('@elbruno - DJI Tello Camera', frame)
        sendReadCommand('battery?')
        print(f'flying: {drone_flying} - battery: {battery} % - i: {i} {fpsInfo}')
    except Exception as e:
        print(f'exc: {e}')
        pass

    if cv2.waitKey(1) & 0xFF == ord('t'):
        drone_flying = True
        detection_started = True
        msg = "takeoff"
        sendCommand(msg)
    if cv2.waitKey(1) & 0xFF == ord('l'):
        drone_flying = False
        msg = "land"
        sendCommand(msg)
        time.sleep(5)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

msg = "land"
sendCommand(msg)  # land
response = sendControlCommand("streamoff")
print(f'streamoff response: {response}')

As I promised last time, in the next posts I’ll analyze in more detail how this works, plus a couple of improvements that I can implement.

Happy coding!

Greetings

El Bruno

References

My Posts

#Coding4Fun – How to control your #drone with 20 lines of code! (10/N)

Hi!

Back to some drone posts! I was kind of busy during the last few weeks, and now I can get back to writing about the drone.

OK, in the last posts I described how to connect and work with the drone camera feed using OpenCV. Now, with 2 extra lines of code, we can also detect faces. Let’s take a look at the final sample.
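
Those 2 extra lines are essentially the detectMultiScale() call and the loop that draws a rectangle for each result (shown in context in the full listing below):

# detect faces in the grayscale frame, then draw a frame around each one
faces = face_cascade.detectMultiScale(gray, 1.3, 5)
for (x, y, w, h) in faces:
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 255), 2)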

drone camera and camera view performing face detection

In the previous image we can see 2 camera feeds: my computer webcam, where you can see how I hold the drone with its camera pointing at my face, and the drone camera feed, presented using OpenCV with a frame drawn over each detected face.

Let’s share some code insights:

  • As usual, I resize the camera feed to 320 x 240
  • The average processing speed is between 40 and 70 FPS
  • I use a haar cascade classifier to detect the faces in each frame

Note: I need to write about Haar Cascades as part of my face detection post series.

# Bruno Capuano
# detect faces using haar cascades from https://github.com/opencv/opencv/tree/master/data/haarcascades
# enable drone video camera
# display video camera using OpenCV
# display FPS
# detect faces
import cv2
import socket
import time
import threading

def receiveData():
    global response
    while True:
        try:
            response, _ = clientSocket.recvfrom(1024)
        except:
            break

def readStates():
    global battery
    while True:
        try:
            response_state, _ = stateSocket.recvfrom(256)
            if response_state != 'ok':
                response_state = response_state.decode('ASCII')
                list = response_state.replace(';', ':').split(':')
                battery = int(list[21])
        except:
            break

def sendCommand(command):
    global response
    timestamp = int(time.time() * 1000)
    clientSocket.sendto(command.encode('utf-8'), address)
    while response is None:
        if (time.time() * 1000) - timestamp > 5 * 1000:
            return False
    return response

def sendReadCommand(command):
    response = sendCommand(command)
    try:
        response = str(response)
    except:
        pass
    return response

def sendControlCommand(command):
    response = None
    for i in range(0, 5):
        response = sendCommand(command)
        if response == 'OK' or response == 'ok':
            return True
    return False

# -----------------------------------------------
# Main program
# -----------------------------------------------
# connection info
UDP_IP = '192.168.10.1'
UDP_PORT = 8889
last_received_command = time.time()
STATE_UDP_PORT = 8890
address = (UDP_IP, UDP_PORT)
response = None
response_state = None
clientSocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
clientSocket.bind(('', UDP_PORT))
stateSocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
stateSocket.bind(('', STATE_UDP_PORT))

# start threads
recThread = threading.Thread(target=receiveData)
recThread.daemon = True
recThread.start()
stateThread = threading.Thread(target=readStates)
stateThread.daemon = True
stateThread.start()

# connect to drone
response = sendControlCommand("command")
print(f'command response: {response}')
response = sendControlCommand("streamon")
print(f'streamon response: {response}')

# drone information
battery = 0

# enable face detection
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# open UDP video feed
print(f'opening UDP video feed, wait 2 seconds ')
videoUDP = 'udp://192.168.10.1:11111'
cap = cv2.VideoCapture(videoUDP)
time.sleep(2)

fpsInfo = ''  # initialized here so the console log below never hits an undefined name
i = 0
while True:
    i = i + 1
    start_time = time.time()
    try:
        _, frameOrig = cap.read()
        frame = cv2.resize(frameOrig, (320, 240))
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # detect faces and draw a frame around each one
        faces = face_cascade.detectMultiScale(gray, 1.3, 5)
        for (x, y, w, h) in faces:
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 255), 2)

        # display fps
        if (time.time() - start_time) > 0:
            fpsInfo = "FPS: " + str(1.0 / (time.time() - start_time))  # FPS = 1 / time to process loop
            font = cv2.FONT_HERSHEY_DUPLEX
            cv2.putText(frame, fpsInfo, (10, 20), font, 0.4, (255, 255, 255), 1)

        cv2.imshow('@elbruno - DJI Tello Camera', frame)
        sendReadCommand('battery?')
        print(f'battery: {battery} % - i: {i} {fpsInfo}')
    except Exception as e:
        print(f'exc: {e}')
        pass

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

response = sendControlCommand("streamoff")
print(f'streamoff response: {response}')

In my next posts, I’ll add some drone-specific behaviors for each face detected.

Happy coding!

Greetings

El Bruno

References

My Posts

#RaspberryPi – Performance differences in #FaceRecognition using #OpenVino (code with @code!)

Hi !

I’ve been looking to use the amazing Intel Neural Compute Stick 2 for a while, and one of the first ideas that I had was to check how fast my Raspberry Pi 4 can run inference using this device.

The Intel team released a nice step-by-step installation guide for Raspberry Pi. It works great; there are a couple of minor glitches that you need to figure out, like the latest package version, but everything else works fine.

Note: I downloaded my OpenVINO toolkit from here (https://download.01.org/opencv/2019/openvinotoolkit/R3/), and the downloaded file is l_openvino_toolkit_runtime_raspbian_p_2019.3.334.tgz.

Once installed, the first Python sample is a face detection one. This sample analyzes an image file using OpenCV to detect faces, and creates a new output file with the detected faces marked. As I said, it’s very straightforward.

So, I decided to create a new Python sample to run live face detection using the camera feed and also display the FPS. This is the resulting code:

# perform face detection
# display detected face frame
# display FPS info in webcam video feed
# This is based on the official sample demo file described in the installer documentation
# Date: 2020 01 26
# Install OpenVINO™ toolkit for Raspbian* OS
# http://docs.openvinotoolkit.org/2019_R1/_docs_install_guides_installing_openvino_raspbian.html
import cv2
import time
import imutils

# Load the model.
net = cv2.dnn.readNet('face-detection-adas-0001.xml',
                      'face-detection-adas-0001.bin')

# Specify target device.
# ERROR net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
# OK    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
# ERROR net.setPreferableBackend(cv2.dnn.DNN_BACKEND_HALIDE)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD)

# open video frame
video_capture = cv2.VideoCapture(0)
while True:
    start_time = time.time()
    ret, frame = video_capture.read()

    # frame resize to improve performance
    frame = imutils.resize(frame, width=640, height=480)
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Prepare input blob and perform an inference.
    blob = cv2.dnn.blobFromImage(rgb_frame, size=(640, 480), ddepth=cv2.CV_8U)
    net.setInput(blob)
    out = net.forward()

    # Draw detected faces on the frame.
    for detection in out.reshape(-1, 7):
        confidence = float(detection[2])
        xmin = int(detection[3] * frame.shape[1])
        ymin = int(detection[4] * frame.shape[0])
        xmax = int(detection[5] * frame.shape[1])
        ymax = int(detection[6] * frame.shape[0])
        if confidence > 0.5:
            cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), color=(0, 255, 0))

    # display FPS
    fpsInfo = "FPS: " + str(1.0 / (time.time() - start_time))  # FPS = 1 / time to process loop
    print(fpsInfo)
    font = cv2.FONT_HERSHEY_DUPLEX
    cv2.putText(frame, fpsInfo, (10, 20), font, 0.4, (255, 255, 255), 1)

    cv2.imshow('Video', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

video_capture.release()
cv2.destroyAllWindows()

The code is very straightforward, and the main points are:

  • It uses the face-detection-adas-0001 model from the Intel model zoo, loaded from two files: the .xml topology and the .bin weights
  • The setPreferableBackend() and setPreferableTarget() calls are key: they tell OpenCV to load and run the model on the Intel device
  • I use imutils to resize the image to 640×480. Feel free to use any other library for this, even OpenCV
  • It also works with smaller resolutions; however, 640×480 is good for this demo

And here is the final app running, analyzing almost 8 frames per second (8 FPS).

That’s almost 10 times faster than the 0.7 FPS I get without the Intel NCS2.
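
The comparison only requires switching the preferred backend and target for the network. Here is a minimal sketch of the switch; note that the CPU fallback assumes a regular OpenCV build, since the OpenVINO runtime for Raspbian ships only the MYRIAD plugin:

import cv2

net = cv2.dnn.readNet('face-detection-adas-0001.xml',
                      'face-detection-adas-0001.bin')

use_ncs2 = True  # flip to False to measure the CPU-only baseline
if use_ncs2:
    # run the model on the Intel Neural Compute Stick 2
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD)
else:
    # run on the Raspberry Pi CPU (standard OpenCV constants)
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)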

And, as I already wrote, running Visual Studio Code on the Raspberry Pi (see references) is an amazing experience. I did all my Python coding in VS Code, remotely accessing my device via VNC. Python runs like a charm!

You can download the code from https://github.com/elbruno/rpiopenvino/tree/master/facedetection

References

My posts on Raspberry Pi

Dev posts for Raspberry Pi
Tools and Apps for Raspberry Pi
Setup the device
Hardware