#dotnet – Pose detection from the 🎦 camera feed using #OpenCV and #net5. Home-made #kinect!


Hi !

LearnOpenCV is an amazing resource for learning OpenCV, with lots of real-life problems solved with it. Most of the samples are in C++ or Python, so I decided to pick one about pose estimation and build something like this in a .NET 5 WinForms app:

net5 opencv pose estimation on real camera feed

The main model is OpenPose (see references). The model is amazing, and in this app it processes about 1 frame per second. There are other variations that cover body, foot, face, and hand estimation, and more. I’ll try to share how to use some of the other models from C# in future posts.

Now, as usual, a huge code snippet with only the frame recognition and processing to detect the body joints. The names nPoints, thresh, posePairs and _netPose are defined elsewhere in the form.

private void CaptureCameraCallback()
{
    while (true)
    {
        if (!_run) continue;
        var startTime = DateTime.Now;

        _capture.Read(_image);
        if (_image.Empty()) return;
        var imageRes = new Mat();
        Cv2.Resize(_image, imageRes, new Size(320, 240));
        if (_detectPose)
        {

            var frameWidth = imageRes.Cols;
            var frameHeight = imageRes.Rows;

            const int inWidth = 368;
            const int inHeight = 368;

            // Convert Mat to batch of images
            using var inpBlob = CvDnn.BlobFromImage(imageRes, 1.0 / 255, new Size(inWidth, inHeight), new Scalar(0, 0, 0), false, false);

            _netPose.SetInput(inpBlob);

            using var output = _netPose.Forward();
            var H = output.Size(2);
            var W = output.Size(3);

            var points = new List<Point>();

            for (var n = 0; n < nPoints; n++)
            {
                // Probability map of corresponding body's part.
                using var probMap = new Mat(H, W, MatType.CV_32F, output.Ptr(0, n));
                var p = new Point2f(-1, -1);

                Cv2.MinMaxLoc(probMap, out _, out var maxVal, out _, out var maxLoc);

                if (maxVal > thresh)
                {
                    p = maxLoc;
                    p.X *= (float)frameWidth / W;
                    p.Y *= (float)frameHeight / H;

                    Cv2.Circle(imageRes, (int)p.X, (int)p.Y, 8, Scalar.Azure, -1);
                    //Cv2.PutText(imageRes, Cv2.Format(n), new Point((int)p.X, (int)p.Y), HersheyFonts.HersheyComplex, 1, new Scalar(0, 0, 255), 1);
                }

                points.Add((Point)p);
            }

            WriteTextSafe(@$"Joints {nPoints} found");

            var nPairs = 14; //(POSE_PAIRS).Length / POSE_PAIRS[0].Length;

            for (var n = 0; n < nPairs; n++)
            {
                // lookup 2 connected body/hand parts
                var partA = points[posePairs[n][0]];
                var partB = points[posePairs[n][1]];
                if (partA.X <= 0 || partA.Y <= 0 || partB.X <= 0 || partB.Y <= 0)
                    continue;
                Cv2.Line(imageRes, partA, partB, new Scalar(0, 255, 255), 8);
                Cv2.Circle(imageRes, partA.X, partA.Y, 8, new Scalar(0, 0, 255), -1);
                Cv2.Circle(imageRes, partB.X, partB.Y, 8, new Scalar(0, 0, 255), -1);
            }

        }
// rest of the code to calc FPS and display the image
    }
}
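
The core of the loop above is "find the strongest cell of each probability map, then scale it back to the frame size". Here is a minimal sketch of that step in plain Python (not the C# from the post), so it can be tried without OpenCV. The function name `find_keypoint` and the default threshold of 0.1 are my assumptions for illustration; the post does not show the value of `thresh`.

```python
def find_keypoint(prob_map, frame_width, frame_height, thresh=0.1):
    """Return the (x, y) frame position of the strongest response, or None.

    prob_map is a list of rows (H x W) of confidence values, standing in for
    one channel of the network output. thresh is a hypothetical default.
    """
    map_height = len(prob_map)
    map_width = len(prob_map[0])

    # Equivalent of MinMaxLoc: strongest value and where it sits in the map.
    best_val, best_x, best_y = max(
        (value, x, y)
        for y, row in enumerate(prob_map)
        for x, value in enumerate(row)
    )

    # Weak peaks are treated as "joint not found".
    if best_val <= thresh:
        return None

    # Scale from map coordinates to frame coordinates.
    return (int(best_x * frame_width / map_width),
            int(best_y * frame_height / map_height))


if __name__ == "__main__":
    heatmap = [[0.0, 0.1, 0.0, 0.0],
               [0.0, 0.0, 0.9, 0.0]]
    print(find_keypoint(heatmap, 320, 240))
```

Returning None for a weak peak plays the same role as the (-1, -1) point in the C# code: the pair-drawing loop can skip joints that were not detected.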

Super fun! Check the references for the download location of the model and support files.

Happy coding!

Greetings

El Bruno

References

#dotnet – GoogleNet detection from the 🎦 camera feed using #OpenCV and #net5. Bonus: C++ to C# time!


Hi !

So I was browsing the OpenCV documentation and I found a nice sample that uses the opencv_dnn module for image classification, with a GoogLeNet network trained in the Caffe model zoo.

So I gave it a try, and got a decent .NET 5 WinForms app running at ~30 FPS.

opencv net5 load and analyze camera frames with googlenet

The model was trained on 1000 classes, and once an object is the main focus of the camera it works great with things like a machine, mug, bottle, etc. There is a fair amount of code here, and because the DNN analysis runs on a separate thread, the label has to be updated through Control.Invoke with a delegate (see WriteTextSafe) instead of being set directly.

using System;
using System.IO;
using System.Linq;
using System.Threading;
using System.Windows.Forms;
using OpenCvSharp;
using OpenCvSharp.Dnn;
using OpenCvSharp.Extensions;
using Point = OpenCvSharp.Point;
using Size = OpenCvSharp.Size;
namespace Demo11_WinFormGoogleNet
{
public partial class Form1 : Form
{
private bool _run = true;
private bool _useGoogleNet = false;
private VideoCapture _capture;
private Mat _image;
private Thread _cameraThread;
private bool _fps = false;
private Net _netGoogleNet;
private string[] _classNames;
const string ProtoTxt = @"models\bvlc_googlenet.prototxt";
const string CaffeModel = @"models\bvlc_googlenet.caffemodel";
const string SynsetWords = @"models\synset_words.txt";
private delegate void SafeCallDelegate(string text);
public Form1()
{
InitializeComponent();
Load += Form1_Load;
Closed += Form1_Closed;
}
private void Form1_Closed(object sender, EventArgs e)
{
_cameraThread.Interrupt();
_capture.Release();
}
private void btnStart_Click(object sender, EventArgs e)
{
_run = true;
}
private void btnStop_Click(object sender, EventArgs e)
{
_run = false;
}
private void btnGoogleNet_Click(object sender, EventArgs e)
{
_useGoogleNet = !_useGoogleNet;
}
private void buttonFPS_Click(object sender, EventArgs e)
{
_fps = !_fps;
}
private void Form1_Load(object sender, EventArgs e)
{
_classNames = File.ReadAllLines(SynsetWords)
.Select(line => line.Split(' ').Last())
.ToArray();
_netGoogleNet = CvDnn.ReadNetFromCaffe(ProtoTxt, CaffeModel);
_capture = new VideoCapture(0);
_image = new Mat();
_cameraThread = new Thread(new ThreadStart(CaptureCameraCallback));
_cameraThread.Start();
}
private void CaptureCameraCallback()
{
while (true)
{
if (!_run) continue;
var startTime = DateTime.Now;
_capture.Read(_image);
if (_image.Empty()) return;
var imageRes = new Mat();
Cv2.Resize(_image, imageRes, new Size(320, 240));
if (_useGoogleNet)
{
// Convert Mat to batch of images
using var inputBlob = CvDnn.BlobFromImage(imageRes, 1, new Size(224, 224), new Scalar(104, 117, 123));
_netGoogleNet.SetInput(inputBlob, "data");
using var prob = _netGoogleNet.Forward("prob");
// find the best class
GetMaxClass(prob, out int classId, out double classProb);
var msg = @$"Best class: #{classId} '{_classNames[classId]}' – Probability: {classProb:P2}";
// display output
WriteTextSafe(msg);
}
if (_fps)
{
var diff = DateTime.Now - startTime;
var fpsInfo = $"FPS: Nan";
// use the total elapsed time, not only the milliseconds component
if (diff.TotalMilliseconds > 0)
{
var fpsVal = 1000.0 / diff.TotalMilliseconds;
fpsInfo = $"FPS: {fpsVal:00}";
}
Cv2.PutText(imageRes, fpsInfo, new Point(10, 20), HersheyFonts.HersheyComplexSmall, 1, Scalar.White);
}
var bmpWebCam = BitmapConverter.ToBitmap(imageRes);
pictureBoxWebCam.Image = bmpWebCam;
}
}
private void WriteTextSafe(string text)
{
if (lblOutputAnalysis.InvokeRequired)
{
var d = new SafeCallDelegate(WriteTextSafe);
lblOutputAnalysis.Invoke(d, new object[] { text });
}
else
{
lblOutputAnalysis.Text = text;
}
}
private static void GetMaxClass(Mat probBlob, out int classId, out double classProb)
{
// reshape the blob to 1×1000 matrix
using (var probMat = probBlob.Reshape(1, 1))
{
Cv2.MinMaxLoc(probMat, out _, out classProb, out _, out var classNumber);
classId = classNumber.X;
}
}
}
}
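
Two small pieces of the listing above are worth isolating: turning a line of the synset file into a label, and picking the best class from the probability vector. This is a hedged sketch in plain Python rather than C#. It assumes each line of synset_words.txt looks like `<id> <comma-separated names>`, which is the usual layout of that file, and the function names are mine. Note that it keeps the full name list, while the C# code keeps only the last word of each line.

```python
def parse_synset_line(line):
    """Split '<id> <names>' into the id and the human-readable names."""
    wnid, _, names = line.strip().partition(" ")
    return wnid, names


def best_class(probabilities, class_names):
    """Return (class_id, class_name, probability) for the highest score."""
    class_id = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_id, class_names[class_id], probabilities[class_id]


if __name__ == "__main__":
    # Sample lines in the assumed synset format.
    lines = ["n01440764 tench, Tinca tinca", "n01443537 goldfish, Carassius auratus"]
    names = [parse_synset_line(line)[1] for line in lines]
    print(best_class([0.2, 0.8], names))
```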

Super fun! Check the references for the download location of the model and support files.

Happy coding!

Greetings

El Bruno

References

#dotnet – Age and Gender estimation from the 🎦 camera feed using #OpenCV and #net5


Hi !

With faces detected, the next step is to use some prebuilt models to do more: estimate the age and the gender of each face. To do this, I downloaded a couple of models from the LearnOpenCV repository (the URL is in the code comment below).

Disclaimer: these models are just sample models; do not use them in production. They do not cover all the scenarios needed for a real implementation.

And the final winform app is kind of cute!

Below you can find the complete Form1 source code. Before that, let’s take a look at the sample analyzing a magazine photo.

2020-11-23_16-37-43 opencv net 5 detecting multiple faces

So let’s analyze the code. For this sample, we load 3 models to work with age, faces and gender.

// # detect faces, age and gender using models from https://github.com/spmallick/learnopencv/tree/08e61fe80b8c0244cc4029ac11e44cd0fbb008c3/AgeGender
const string faceProto = "models/deploy.prototxt";
const string faceModel = "models/res10_300x300_ssd_iter_140000_fp16.caffemodel";
const string ageProto = @"models/age_deploy.prototxt";
const string ageModel = @"models/age_net.caffemodel";
const string genderProto = @"models/gender_deploy.prototxt";
const string genderModel = @"models/gender_net.caffemodel";
_ageNet = CvDnn.ReadNetFromCaffe(ageProto, ageModel);
_genderNet = CvDnn.ReadNetFromCaffe(genderProto, genderModel);
_faceNet = CvDnn.ReadNetFromCaffe(faceProto, faceModel);

Once the models are loaded, in the loop to analyze camera frames, we perform face detection, and then age and gender estimation.

while (true)
{
    if (!_run) continue;
    var startTime = DateTime.Now;

    _capture.Read(_image);
    if (_image.Empty()) return;
    var imageRes = new Mat();
    Cv2.Resize(_image, imageRes, new Size(320, 240));
    var newImage = imageRes.Clone();

    if (_doFaceDetection) DetectFaces(newImage, imageRes);

    if (_fps) CalculateFps(startTime, newImage);

    var bmpWebCam = BitmapConverter.ToBitmap(imageRes);
    var bmpEffect = BitmapConverter.ToBitmap(newImage);

    pictureBoxWebCam.Image = bmpWebCam;
    pictureBoxEffect.Image = bmpEffect;
}

For each detected face, we perform the age and gender estimation. In order to do this, we crop the detected face (plus a padding), and perform the estimation on the cropped image.

private void AnalyzeAgeAndGender(int x1, int y1, int x2, int y2, Mat imageRes, Mat newImage)
{
    // get face frame
    var x = x1 - Padding;
    var y = y1 - Padding;
    var w = (x2 - x1) + Padding * 3;
    var h = (y2 - y1) + Padding * 3;
    Rect roiNew = new Rect(x, y, w, h);
    var face = imageRes[roi: roiNew];

    var meanValues = new Scalar(78.4263377603, 87.7689143744, 114.895847746);
    var blobGender = CvDnn.BlobFromImage(face, 1.0, new Size(227, 227), mean: meanValues,
        swapRB: false);
    _genderNet.SetInput(blobGender);
    var genderPreds = _genderNet.Forward();

    GetMaxClass(genderPreds, out int classId, out double classProbGender);
    var gender = _genderList[classId];

    _ageNet.SetInput(blobGender);
    var agePreds = _ageNet.Forward();
    GetMaxClass(agePreds, out int classIdAge, out double classProbAge);
    var age = _ageList[classIdAge];

    var label = $"{gender},{age}";
    Cv2.PutText(newImage, label, new Point(x1 - 10, y2 + 20), HersheyFonts.HersheyComplexSmall, 1, Scalar.Yellow, 1);
}

private void GetMaxClass(Mat probBlob, out int classId, out double classProb)
{
    // reshape the blob to a single row (1 x number of classes)
    using var probMat = probBlob.Reshape(1, 1);
    Cv2.MinMaxLoc(probMat, out _, out classProb, out _, out var classNumber);
    classId = classNumber.X;
    Debug.WriteLine($"X: {classNumber.X} - Y: {classNumber.Y} ");
}

The GetMaxClass() function is also worth mentioning: it retrieves the best-scoring class from the prob result.
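
One thing to watch with the padded crop: when a face sits near the border of the frame, the padded rectangle can end up partly outside the image. Below is a minimal sketch in plain Python of clamping such a rectangle to the frame. It is not the code from the post: the helper name `padded_roi` is hypothetical, and it pads every side equally, which differs from the `Padding * 3` width and height used in the C# version.

```python
def padded_roi(x1, y1, x2, y2, frame_width, frame_height, padding=10):
    """Return (x, y, w, h) of the padded box clamped to the frame, or None.

    Hypothetical helper: pads each side by `padding` pixels and then clips
    the rectangle so it never leaves the image.
    """
    left = max(0, x1 - padding)
    top = max(0, y1 - padding)
    right = min(frame_width, x2 + padding)
    bottom = min(frame_height, y2 + padding)

    # Nothing left to crop if the box lies completely outside the frame.
    if right <= left or bottom <= top:
        return None
    return left, top, right - left, bottom - top


if __name__ == "__main__":
    # A face close to the bottom-right corner of a 320x240 frame.
    print(padded_roi(300, 200, 330, 250, 320, 240))
```

Returning None lets the caller skip the age and gender estimation for that detection instead of failing on the crop.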

And the complete source code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Windows.Forms;
using OpenCvSharp;
using OpenCvSharp.Dnn;
using OpenCvSharp.Extensions;
using Point = OpenCvSharp.Point;
using Size = OpenCvSharp.Size;
namespace Demo10_WinFormAgeAndGender
{
public partial class Form1 : Form
{
private bool _run = true;
private bool _doFaceDetection = true;
private bool _doAgeGender = false;
private VideoCapture _capture;
private Mat _image;
private Thread _cameraThread;
private bool _fps = false;
private Net _faceNet;
private Net _ageNet;
private Net _genderNet;
private const int LineThickness = 2;
private const int Padding = 10;
private readonly List<string> _genderList = new List<string> { "Male", "Female" };
private readonly List<string> _ageList = new List<string> { "(0-2)", "(4-6)", "(8-12)", "(15-20)", "(25-32)", "(38-43)", "(48-53)", "(60-100)" };
public Form1()
{
InitializeComponent();
Load += Form1_Load;
Closed += Form1_Closed;
}
private void Form1_Closed(object sender, EventArgs e)
{
_cameraThread.Interrupt();
_capture.Release();
}
private void btnStart_Click(object sender, EventArgs e)
{
_run = true;
}
private void btnStop_Click(object sender, EventArgs e)
{
_run = false;
}
private void btnFDDNN_Click(object sender, EventArgs e)
{
_doFaceDetection = !_doFaceDetection;
}
private void buttonFPS_Click(object sender, EventArgs e)
{
_fps = !_fps;
}
private void btnAgeGender_Click(object sender, EventArgs e)
{
_doAgeGender = !_doAgeGender;
}
private void Form1_Load(object sender, EventArgs e)
{
// # detect faces, age and gender using models from https://github.com/spmallick/learnopencv/tree/08e61fe80b8c0244cc4029ac11e44cd0fbb008c3/AgeGender
const string faceProto = "models/deploy.prototxt";
const string faceModel = "models/res10_300x300_ssd_iter_140000_fp16.caffemodel";
const string ageProto = @"models/age_deploy.prototxt";
const string ageModel = @"models/age_net.caffemodel";
const string genderProto = @"models/gender_deploy.prototxt";
const string genderModel = @"models/gender_net.caffemodel";
_ageNet = CvDnn.ReadNetFromCaffe(ageProto, ageModel);
_genderNet = CvDnn.ReadNetFromCaffe(genderProto, genderModel);
_faceNet = CvDnn.ReadNetFromCaffe(faceProto, faceModel);
_capture = new VideoCapture(0);
_image = new Mat();
_cameraThread = new Thread(new ThreadStart(CaptureCameraCallback));
_cameraThread.Start();
}
private void CaptureCameraCallback()
{
while (true)
{
if (!_run) continue;
var startTime = DateTime.Now;
_capture.Read(_image);
if (_image.Empty()) return;
var imageRes = new Mat();
Cv2.Resize(_image, imageRes, new Size(320, 240));
var newImage = imageRes.Clone();
if (_doFaceDetection) DetectFaces(newImage, imageRes);
if (_fps) CalculateFps(startTime, newImage);
var bmpWebCam = BitmapConverter.ToBitmap(imageRes);
var bmpEffect = BitmapConverter.ToBitmap(newImage);
pictureBoxWebCam.Image = bmpWebCam;
pictureBoxEffect.Image = bmpEffect;
}
}
private static void CalculateFps(DateTime startTime, Mat imageRes)
{
var diff = DateTime.Now - startTime;
var fpsInfo = $"FPS: Nan";
// use the total elapsed time, not only the milliseconds component
if (diff.TotalMilliseconds > 0)
{
var fpsVal = 1000.0 / diff.TotalMilliseconds;
fpsInfo = $"FPS: {fpsVal:00}";
}
Cv2.PutText(imageRes, fpsInfo, new Point(10, 20), HersheyFonts.HersheyComplexSmall, 1, Scalar.White);
}
private void DetectFaces(Mat newImage, Mat imageRes)
{
// DNN
int frameHeight = newImage.Rows;
int frameWidth = newImage.Cols;
using var blob = CvDnn.BlobFromImage(newImage, 1.0, new Size(300, 300), new Scalar(104, 117, 123), false, false);
_faceNet.SetInput(blob, "data");
using var detection = _faceNet.Forward("detection_out");
using var detectionMat = new Mat(detection.Size(2), detection.Size(3), MatType.CV_32F, detection.Ptr(0));
for (int i = 0; i < detectionMat.Rows; i++)
{
float confidence = detectionMat.At<float>(i, 2);
if (confidence > 0.7)
{
int x1 = (int)(detectionMat.At<float>(i, 3) * frameWidth);
int y1 = (int)(detectionMat.At<float>(i, 4) * frameHeight);
int x2 = (int)(detectionMat.At<float>(i, 5) * frameWidth);
int y2 = (int)(detectionMat.At<float>(i, 6) * frameHeight);
Cv2.Rectangle(newImage, new Point(x1, y1), new Point(x2, y2), Scalar.Green, LineThickness);
if (_doAgeGender)
AnalyzeAgeAndGender(x1, y1, x2, y2, imageRes, newImage);
}
}
}
private void AnalyzeAgeAndGender(int x1, int y1, int x2, int y2, Mat imageRes, Mat newImage)
{
// get face frame
var x = x1 - Padding;
var y = y1 - Padding;
var w = (x2 - x1) + Padding * 3;
var h = (y2 - y1) + Padding * 3;
Rect roiNew = new Rect(x, y, w, h);
var face = imageRes[roi: roiNew];
var meanValues = new Scalar(78.4263377603, 87.7689143744, 114.895847746);
var blobGender = CvDnn.BlobFromImage(face, 1.0, new Size(227, 227), mean: meanValues,
swapRB: false);
_genderNet.SetInput(blobGender);
var genderPreds = _genderNet.Forward();
GetMaxClass(genderPreds, out int classId, out double classProbGender);
var gender = _genderList[classId];
_ageNet.SetInput(blobGender);
var agePreds = _ageNet.Forward();
GetMaxClass(agePreds, out int classIdAge, out double classProbAge);
var age = _ageList[classIdAge];
var label = $"{gender},{age}";
Cv2.PutText(newImage, label, new Point(x1 - 10, y2 + 20), HersheyFonts.HersheyComplexSmall, 1, Scalar.Yellow, 1);
}
private void GetMaxClass(Mat probBlob, out int classId, out double classProb)
{
// reshape the blob to a single row (1 x number of classes)
using var probMat = probBlob.Reshape(1, 1);
Cv2.MinMaxLoc(probMat, out _, out classProb, out _, out var classNumber);
classId = classNumber.X;
Debug.WriteLine($"X: {classNumber.X} – Y: {classNumber.Y} ");
}
}
}

Happy coding!

Greetings

El Bruno

References

#dotnet – Detecting Faces, DNN vs Haar Cascades from the 🎦 camera feed using #OpenCV and #net5


Hi !

In a session about computer vision, someone asked which approach is better:

Haar Cascades or DNN?

The answer can be shown using the video below.

net5 opencv face detection comparison between dnn and haar cascades

As you can see, Haar Cascades work great for faces looking directly at the camera, with good lighting and in optimal conditions. However, once my face stops looking at the camera, the cascade stops detecting it, while the custom DNN model keeps working.

But, and this is important, the DNN model depends entirely on the set of faces used to train it. If the training set leaves out a demographic group, for example Asian faces, the model won’t work well with subjects from that group.

In the end, DNN models usually work better than Haar Cascades; however, it’s really important to know the limitations of each model.

Happy coding!

Greetings

El Bruno

References

#dotnet – Detecting Faces using DNN from the 🎦 camera feed in a WinForm using #OpenCV and #net5


Hi !

Let’s do some face detection using a DNN model (see references). As with yesterday’s post, I won’t go into details; there are almost 20 years of online documentation available.

And IMHO code is much more useful than long writing, so let’s go there. First, load the Caffe model and the config file.

// download model and prototxt from https://github.com/spmallick/learnopencv/tree/master/FaceDetectionComparison/models
const string configFile = "deploy.prototxt";
const string faceModel = "res10_300x300_ssd_iter_140000_fp16.caffemodel";
_faceNet = CvDnn.ReadNetFromCaffe(configFile, faceModel);

And, once we grab the camera frame, let’s perform face detection using the dnn model:

int frameHeight = newImage.Rows;
int frameWidth = newImage.Cols;

using var blob = CvDnn.BlobFromImage(newImage, 1.0, new Size(300, 300),
    new Scalar(104, 117, 123), false, false);
_faceNet.SetInput(blob, "data");

using var detection = _faceNet.Forward("detection_out");
using var detectionMat = new Mat(detection.Size(2), detection.Size(3), MatType.CV_32F,
    detection.Ptr(0));
for (int i = 0; i < detectionMat.Rows; i++)
{
    float confidence = detectionMat.At<float>(i, 2);

    if (confidence > 0.7)
    {
        int x1 = (int)(detectionMat.At<float>(i, 3) * frameWidth);
        int y1 = (int)(detectionMat.At<float>(i, 4) * frameHeight);
        int x2 = (int)(detectionMat.At<float>(i, 5) * frameWidth);
        int y2 = (int)(detectionMat.At<float>(i, 6) * frameHeight);

        Cv2.Rectangle(newImage, new Point(x1, y1), new Point(x2, y2), Scalar.Green);
        Cv2.PutText(newImage, "Face Dnn", new Point(x1 + 2, y2 + 20),
            HersheyFonts.HersheyComplexSmall, 1, Scalar.Green, 2);
    }
}
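
To make the indexes in that loop easier to read: each row of the detection matrix holds 7 values, where column 2 is the confidence and columns 3 to 6 are the box corners as fractions of the frame size (columns 0 and 1 are the image id and class id of the SSD output). Here is a hedged sketch of the same decoding in plain Python, with plain lists standing in for the Mat. The function name `decode_detections` is mine.

```python
def decode_detections(rows, frame_width, frame_height, min_confidence=0.7):
    """Turn SSD-style rows into pixel boxes (x1, y1, x2, y2, confidence).

    Each row is [image_id, class_id, confidence, x1, y1, x2, y2], with the
    corners normalized to the 0..1 range.
    """
    boxes = []
    for row in rows:
        confidence = row[2]
        if confidence <= min_confidence:
            continue  # same threshold test as the C# loop
        x1 = int(row[3] * frame_width)
        y1 = int(row[4] * frame_height)
        x2 = int(row[5] * frame_width)
        y2 = int(row[6] * frame_height)
        boxes.append((x1, y1, x2, y2, confidence))
    return boxes


if __name__ == "__main__":
    sample = [
        [0, 1, 0.9, 0.25, 0.5, 0.75, 1.0],  # confident detection
        [0, 1, 0.3, 0.0, 0.0, 1.0, 1.0],    # below the threshold, dropped
    ]
    print(decode_detections(sample, 320, 240))
```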

And, the full code is here

using System;
using System.Threading;
using System.Windows.Forms;
using OpenCvSharp;
using OpenCvSharp.Dnn;
using OpenCvSharp.Extensions;
using Point = OpenCvSharp.Point;
using Size = OpenCvSharp.Size;
namespace Demo08_WinFormFaceDetectionDNN
{
public partial class Form1 : Form
{
private bool _run = false;
private bool _doFaceDetection = false;
private VideoCapture _capture;
private Mat _image;
private Thread _cameraThread;
private bool _fps = false;
private Net _faceNet;
public Form1()
{
InitializeComponent();
Load += Form1_Load;
Closed += Form1_Closed;
}
private void Form1_Closed(object sender, EventArgs e)
{
_cameraThread.Interrupt();
_capture.Release();
}
private void btnStart_Click(object sender, EventArgs e)
{
_run = true;
}
private void btnStop_Click(object sender, EventArgs e)
{
_run = false;
}
private void btnFDDNN_Click(object sender, EventArgs e)
{
_doFaceDetection = !_doFaceDetection;
}
private void buttonFPS_Click(object sender, EventArgs e)
{
_fps = !_fps;
}
private void Form1_Load(object sender, EventArgs e)
{
// download model and prototxt from https://github.com/spmallick/learnopencv/tree/master/FaceDetectionComparison/models
const string configFile = "deploy.prototxt";
const string faceModel = "res10_300x300_ssd_iter_140000_fp16.caffemodel";
_faceNet = CvDnn.ReadNetFromCaffe(configFile, faceModel);
_capture = new VideoCapture(0);
_image = new Mat();
_cameraThread = new Thread(new ThreadStart(CaptureCameraCallback));
_cameraThread.Start();
}
private void CaptureCameraCallback()
{
while (true)
{
if (!_run) continue;
var startTime = DateTime.Now;
_capture.Read(_image);
if (_image.Empty()) return;
var imageRes = new Mat();
Cv2.Resize(_image, imageRes, new Size(320, 240));
var newImage = imageRes.Clone();
if (_doFaceDetection)
{
int frameHeight = newImage.Rows;
int frameWidth = newImage.Cols;
using var blob = CvDnn.BlobFromImage(newImage, 1.0, new Size(300, 300),
new Scalar(104, 117, 123), false, false);
_faceNet.SetInput(blob, "data");
using var detection = _faceNet.Forward("detection_out");
using var detectionMat = new Mat(detection.Size(2), detection.Size(3), MatType.CV_32F,
detection.Ptr(0));
for (int i = 0; i < detectionMat.Rows; i++)
{
float confidence = detectionMat.At<float>(i, 2);
if (confidence > 0.7)
{
int x1 = (int)(detectionMat.At<float>(i, 3) * frameWidth);
int y1 = (int)(detectionMat.At<float>(i, 4) * frameHeight);
int x2 = (int)(detectionMat.At<float>(i, 5) * frameWidth);
int y2 = (int)(detectionMat.At<float>(i, 6) * frameHeight);
Cv2.Rectangle(newImage, new Point(x1, y1), new Point(x2, y2), Scalar.Green);
Cv2.PutText(newImage, "Face Dnn", new Point(x1 + 2, y2 + 20),
HersheyFonts.HersheyComplexSmall, 1, Scalar.Green, 2);
}
}
}
if (_fps)
{
var diff = DateTime.Now - startTime;
var fpsInfo = $"FPS: Nan";
// use the total elapsed time, not only the milliseconds component
if (diff.TotalMilliseconds > 0)
{
var fpsVal = 1000.0 / diff.TotalMilliseconds;
fpsInfo = $"FPS: {fpsVal:00}";
}
Cv2.PutText(imageRes, fpsInfo, new Point(10, 20), HersheyFonts.HersheyComplexSmall, 1, Scalar.White);
}
var bmpWebCam = BitmapConverter.ToBitmap(imageRes);
var bmpEffect = BitmapConverter.ToBitmap(newImage);
pictureBoxWebCam.Image = bmpWebCam;
pictureBoxEffect.Image = bmpEffect;
}
}
}
}

That’s all for today!

Happy coding!

Greetings

El Bruno

References

#dotnet – Detecting Faces using Cascades from the 🎦 camera feed in a WinForm using #OpenCV and #net5


Hi !

Let’s do some face detection using one of the most popular methods: Haar Cascades (see references). I won’t write about Cascades; there are almost 20 years of online documentation available.

And IMHO code is much more useful than long writing, so let’s go there.

First, load the cascade definition file.

_faceCascade = new CascadeClassifier();
_faceCascade.Load("haarcascade_frontalface_default.xml");

And, once we grab the camera frame, let’s perform some face detection:

using var gray = new Mat();
Cv2.CvtColor(newImage, gray, ColorConversionCodes.BGR2GRAY);

var faces = _faceCascade.DetectMultiScale(gray, 1.3, 5);
foreach (var face in faces)
{
  Cv2.Rectangle(newImage, face, Scalar.Red);
  Cv2.PutText(newImage, "Face Cascade", new Point(face.Left + 2, face.Top + face.Width + 20), HersheyFonts.HersheyComplexSmall, 1, Scalar.Red, 2);
}

And, the full code is here

using System;
using System.Threading;
using System.Windows.Forms;
using OpenCvSharp;
using OpenCvSharp.Extensions;
using Point = OpenCvSharp.Point;
using Size = OpenCvSharp.Size;
namespace Demo07_WinFormFaceDetectionCascades
{
public partial class Form1 : Form
{
private bool _run = true;
private bool _doFaceDetection = false;
private VideoCapture _capture;
private Mat _image;
private Thread _cameraThread;
private bool _fps = false;
private CascadeClassifier _faceCascade;
public Form1()
{
InitializeComponent();
Load += Form1_Load;
Closed += Form1_Closed;
}
private void Form1_Closed(object sender, EventArgs e)
{
_cameraThread.Interrupt();
_capture.Release();
}
private void btnStart_Click(object sender, EventArgs e)
{
_run = true;
}
private void btnStop_Click(object sender, EventArgs e)
{
_run = false;
}
private void btnFaceDetectionCascades_Click(object sender, EventArgs e)
{
_doFaceDetection = !_doFaceDetection;
}
private void buttonFPS_Click(object sender, EventArgs e)
{
_fps = !_fps;
}
private void Form1_Load(object sender, EventArgs e)
{
_faceCascade = new CascadeClassifier();
_faceCascade.Load("haarcascade_frontalface_default.xml");
_capture = new VideoCapture(0);
_image = new Mat();
_cameraThread = new Thread(new ThreadStart(CaptureCameraCallback));
_cameraThread.Start();
}
private void CaptureCameraCallback()
{
while (true)
{
if (!_run) continue;
var startTime = DateTime.Now;
_capture.Read(_image);
if (_image.Empty()) return;
var imageRes = new Mat();
Cv2.Resize(_image, imageRes, new Size(320, 240));
var newImage = imageRes.Clone();
if (_doFaceDetection)
{
using var gray = new Mat();
Cv2.CvtColor(newImage, gray, ColorConversionCodes.BGR2GRAY);
var faces = _faceCascade.DetectMultiScale(gray, 1.3, 5);
foreach (var face in faces)
{
Cv2.Rectangle(newImage, face, Scalar.Red);
Cv2.PutText(newImage, "Face Cascade", new Point(face.Left + 2, face.Top + face.Width + 20),
HersheyFonts.HersheyComplexSmall, 1, Scalar.Red, 2);
}
}
if (_fps)
{
var diff = DateTime.Now - startTime;
var fpsInfo = $"FPS: Nan";
// use the total elapsed time, not only the milliseconds component
if (diff.TotalMilliseconds > 0)
{
var fpsVal = 1000.0 / diff.TotalMilliseconds;
fpsInfo = $"FPS: {fpsVal:00}";
}
Cv2.PutText(newImage, fpsInfo, new Point(10, 20), HersheyFonts.HersheyComplexSmall, 1, Scalar.White);
}
var bmpWebCam = BitmapConverter.ToBitmap(imageRes);
var bmpEffect = BitmapConverter.ToBitmap(newImage);
pictureBoxWebCam.Image = bmpWebCam;
pictureBoxEffect.Image = bmpEffect;
}
}
}
}

That’s all for today!

Happy coding!

Greetings

El Bruno

References

#dotnet – Display the 🎦 camera feed in a WinForm using #OpenCV and #net5


Hi !

Back in the Windows Forms days, cameras were tricky. We didn’t have a lot of libraries to work with, and they usually required some extra work to handle unexpected errors. With .NET 5 and OpenCvSharp, we can create a simple webcam viewer like this one.

net5 opencv live camera and effects

Let’s start with a [Take a Photo] Windows Forms app. The app has a PictureBox and a Button, and it’s all connected with these lines.

using System;
using System.Windows.Forms;
using OpenCvSharp;

namespace Demo03_WinForm
{
    public partial class Form1 : Form
    {
        private VideoCapture _capture;
        private Mat _image;

        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            _capture = new VideoCapture(0);
            _image = new Mat();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            _capture.Read(_image);
            if (_image.Empty()) return;
            pictureBox1.Image = OpenCvSharp.Extensions.BitmapConverter.ToBitmap(_image);
        }
    }
}

As you can see it’s super simple, and the main lesson to learn here is in the last line of button1_Click, where we need to convert the OpenCV Mat object to a standard Bitmap to be used in a PictureBox.

We can add some more features, like the ones in the animation above:

  • Start / Stop Camera Preview
  • Calculate FPS
  • Add Canny effect

with the following lines

using System;
using System.Threading;
using System.Windows.Forms;
using OpenCvSharp;
using OpenCvSharp.Extensions;
using Point = OpenCvSharp.Point;
using Size = OpenCvSharp.Size;
namespace Demo04_WinForm
{
public partial class Form1 : Form
{
private bool _run = false;
private bool _canny = false;
private VideoCapture _capture;
private Mat _image;
private Thread _cameraThread;
private bool _fps = false;
public Form1()
{
InitializeComponent();
Load += Form1_Load;
Closed += Form1_Closed;
}
private void Form1_Closed(object sender, EventArgs e)
{
_cameraThread.Interrupt();
_capture.Release();
}
private void btnStart_Click(object sender, EventArgs e)
{
_run = true;
}
private void btnStop_Click(object sender, EventArgs e)
{
_run = false;
}
private void btnCanny_Click(object sender, EventArgs e)
{
_canny = !_canny;
}
private void buttonFPS_Click(object sender, EventArgs e)
{
_fps = !_fps;
}
private void Form1_Load(object sender, EventArgs e)
{
_capture = new VideoCapture(0);
_image = new Mat();
_cameraThread = new Thread(new ThreadStart(CaptureCameraCallback));
_cameraThread.Start();
}
private void CaptureCameraCallback()
{
while (true)
{
if (!_run) continue;
var startTime = DateTime.Now;
_capture.Read(_image);
if (_image.Empty()) return;
var imageRes = new Mat();
Cv2.Resize(_image, imageRes, new Size(320, 240));
var newImage = imageRes.Clone();
if (_canny)
Cv2.Canny(imageRes, newImage, 50, 200);
if (_fps)
{
var diff = DateTime.Now - startTime;
var fpsInfo = $"FPS: Nan";
// use the total elapsed time, not only the milliseconds component
if (diff.TotalMilliseconds > 0)
{
var fpsVal = 1000.0 / diff.TotalMilliseconds;
fpsInfo = $"FPS: {fpsVal:00}";
}
Cv2.PutText(imageRes, fpsInfo, new Point(10, 20), HersheyFonts.HersheyComplexSmall, 1, Scalar.White);
}
var bmpWebCam = BitmapConverter.ToBitmap(imageRes);
var bmpEffect = BitmapConverter.ToBitmap(newImage);
pictureBoxWebCam.Image = bmpWebCam;
pictureBoxEffect.Image = bmpEffect;
}
}
}
}

Less than 100 lines of code to do some camera image processing in WinForms!
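
A quick note on the FPS number: it is just 1 divided by the total time one loop iteration takes. The sketch below shows that arithmetic in plain Python (not the C# from the listing), using `datetime.timedelta` as a stand-in for the elapsed time. The function name is mine. The thing to be careful about, in any language, is to use the total elapsed time and not only the sub-second component of the time span.

```python
from datetime import timedelta


def fps_from_elapsed(elapsed):
    """Return frames per second for one loop iteration, or None if unknown."""
    seconds = elapsed.total_seconds()  # whole duration, not just a component
    if seconds <= 0:
        return None
    return 1.0 / seconds


if __name__ == "__main__":
    slow_frame = timedelta(seconds=1, milliseconds=500)
    # The component alone would suggest half a second; the total is 1.5 s.
    print(slow_frame.microseconds, slow_frame.total_seconds())
    print(fps_from_elapsed(slow_frame))
    print(fps_from_elapsed(timedelta(milliseconds=40)))
```

With slow models (around 1 FPS) a frame can take longer than a second, which is exactly when the component and the total disagree.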

That’s all for today!

Happy coding!

Greetings

El Bruno

References

#Tools – How to create an IP Camera from a USB Camera on #Windows10


Hi!

So, I may need to create a new category for these posts, something like “scenarios that you will never need in your life”, or “I did this once and now I can’t remember what I did”. Today is a simple one:

Let’s create an IP camera from a USB webcam in Windows.

There are several ways to do this, and you can even write some code; however, I will use a deprecated piece of software: Dorgem (see references).

Dorgem is a webcam capture application for Windows 9x and up. Any Video for Windows compatible webcam (or other digital camera) is supported.
It has unlimited storage events that can put the captured image on an FTP site as well as a local disk, all with their own time interval. It can put a unlimited texts and bitmaps on the captured image before the image is stored.
Dorgem supports an unlimited number of simultaneous cameras. It has a built-in webserver for still images and can be used as security camera because of its motion detection.

Note: YawCam is also an option; however, it requires the Java Runtime, and I don’t want to install it unless it’s necessary.

Ok, once you have installed the app, we need to select the source camera.

We can also choose camera options like Resolution, Pixel Depth, etc.

In order to have an IP cam, we need to enable the WebServer from the Options button. Here we can define the port, refresh rate and more.

And that’s it, we have a small and useful webserver sharing our webcam via HTTP.

Happy coding!

Greetings

El Bruno

References

#VSCode – 20 lines to display a webcam camera feed with #Python using #OpenCV

Hi !

I always write this from scratch, so I’ll drop it here. Next time I search for this, I’ll find my own post.

import os
import cv2
import time

# init camera
execution_path = os.getcwd()
camera = cv2.VideoCapture(0)

while True:
    # Init and FPS process
    start_time = time.time()

    # Grab a single frame of video
    ret, frame = camera.read()
    if not ret:
        break  # no frame available (camera missing or disconnected)

    # calculate FPS >> FPS = 1 / time to process loop
    fpsInfo = "FPS: " + str(1.0 / (time.time() - start_time))
    print(fpsInfo)
    cv2.putText(frame, fpsInfo, (10, 10), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (255, 255, 255), 1)

    # Display the resulting image
    cv2.imshow('Video', frame)

    # Hit 'q' on the keyboard to quit!
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release handle to the webcam
camera.release()
cv2.destroyAllWindows()

And with some extra lines, we can even detect faces and display some face landmarks:

This is the base of so many image recognition scenarios, so I hope this will save me some local search time 😀

Happy coding!

Greetings @ Toronto

El Bruno

References

My posts on Face Recognition using Python

  1. Detecting Faces with 20 lines in Python
  2. Face Recognition with 20 lines in Python
  3. Detecting Facial Features with 20 lines in Python
  4. Facial Features and Face Recognition with 20 lines in Python
  5. Performance improvements with code
  6. More performance improvements, lowering the camera resolution

And some general Python posts

#Python – Detecting #Hololens in realtime in webcam feed using #ImageAI and #OpenCV with performance improvements

Hi!

In my previous post I created a sample on how to use ImageAI and OpenCV to detect Hololens from a webcam frame (see references). I added some code to the last sample, and I found that the performance was not very good.

python using imageai to detect hololens less than 1 fps

With the previous sample code, I couldn’t process more than 1 frame per second. So, I started to make some improvements and I got this result

python using imageai to detect hololens little more than 1 fps

Not an amazing one, but it’s still nice to analyze more than 1 frame per second.

# load HL detection model from imageAI
# open camera with openCV, analyze frame by frame
# draw a red frame around the detected object
# display FPS, resize image to 1/4 to improve performance

from imageai.Detection.Custom import CustomObjectDetection
import os
import cv2
import time

detector = CustomObjectDetection()
detector.setModelTypeAsYOLOv3()
detector.setModelPath("hololens-ex-60--loss-2.76.h5")
detector.setJsonPath("detection_config.json")
detector.loadModel()

# init camera
execution_path = os.getcwd()
camera = cv2.VideoCapture(0)
camera.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
camera.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

while True:
    # FPS process
    start_time = time.time()

    # Grab a single frame of video
    ret, frame = camera.read()
    if not ret:
        break  # no frame available (camera missing or disconnected)

    # detect on a 1/4-size copy; the boxes are scaled back up by 4 below
    fast_frame = cv2.resize(frame, (0, 0), fx=0.25, fy=0.25)
    detected_image, detections = detector.detectObjectsFromImage(input_image=fast_frame, input_type="array", output_type="array")

    for detection in detections:
        # frame for the detected object
        (x1, y1, x2, y2) = detection["box_points"]
        x1 *= 4
        y1 *= 4
        x2 *= 4
        y2 *= 4
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 255), 2)

        # Draw a label with the detected object type next to the frame
        font = cv2.FONT_HERSHEY_DUPLEX
        cv2.putText(frame, detection["name"], (x1 + 6, y1 - 6), font, 1.0, (255, 255, 255), 1)

    # display FPS
    fpsInfo = "FPS: " + str(1.0 / (time.time() - start_time))  # FPS = 1 / time to process loop
    print(fpsInfo)
    font = cv2.FONT_HERSHEY_DUPLEX
    cv2.putText(frame, fpsInfo, (10, 20), font, 0.4, (255, 255, 255), 1)

    # Display the resulting image
    cv2.imshow('Video', frame)

    # Hit 'q' on the keyboard to quit!
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release handle to the webcam
camera.release()
cv2.destroyAllWindows()
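
The trick in this version is to run the detector on a frame resized to 1/4 and then multiply the box coordinates by 4 before drawing on the full-size frame. As a small sketch under my own naming (the helper `scale_box` is not part of ImageAI or OpenCV), the same idea can be written so the factor is derived from the resize scale instead of being hard-coded:

```python
def scale_box(box_points, scale):
    """Map (x1, y1, x2, y2) from a resized frame back to the original frame.

    scale is the factor used when shrinking the frame (for example 0.25),
    so coordinates are multiplied by 1 / scale.
    """
    factor = 1.0 / scale
    return tuple(int(round(value * factor)) for value in box_points)


if __name__ == "__main__":
    # Box found on the 1/4-size frame, mapped back to the full frame.
    print(scale_box((10, 20, 30, 40), 0.25))
```

This way, changing fx and fy in the resize call only requires changing one number.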

I even removed the camera preview entirely, and it still runs at less than 1 FPS.

python using imageai to detect hololens no opencv camera preview

So, now it’s time to read the ImageAI code in depth and learn from it. Fun times!

Happy coding!

Greetings @ Burlington

El Bruno

References