#Python – Context Menu and Key Press with #PySimpleGUI


Hi !

After my base sample code for a GUI app, it's time to add some interaction features:

  • Context Menu
  • Capture Key Press

In the following example, I’m adding a context menu with the following elements:

python pysimplegui context menu

This can be done as part of the window definition, for example:

right_click_menu = ['Unused', ['&FPS', '---', 'Menu A', 'Menu B', 'Menu C', ['Menu C1', 'Menu C2'], '---', 'Exit']]

window    = sg.Window("El Bruno - Webcams and GrayScale with PySimpleGUI", layout, 
                    right_click_menu=right_click_menu,
                    no_titlebar=False, alpha_channel=1, grab_anywhere=False, 
                    return_keyboard_events=True, location=(100, 100))      

The menu definition and how to create submenus, separators, quick access keys and more are part of the PySimpleGUI documentation.
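To make the structure easier to parse, here is the same definition annotated (a sketch based on the line above; the behavior notes reflect my reading of the PySimpleGUI docs):

right_click_menu = ['Unused', [        # first string is a title, ignored for right-click menus
    '&FPS',                            # '&' marks F as the quick access key
    '---',                             # separator line
    'Menu A', 'Menu B',
    'Menu C', ['Menu C1', 'Menu C2'],  # a list right after an item becomes its submenu
    '---',
    'Exit']]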

Then, in order to capture events in the window, we need to check the events read on each iteration of the while loop. The following sample checks the window events to:

  • Close the window if the user presses the [X] to close the window, or clicks the [Exit] element on the context menu
  • Toggle the value of a boolean var if the user presses the [F] key, or clicks the [FPS] element on the context menu
    # process windows events
    event, values = window.read(timeout=20)
    if event == sg.WIN_CLOSED or event == "Exit":
        break
    if event == "f" or event == "F" or event == "FPS":
        display_fps = not display_fps

The full code:

# Bruno Capuano 2020
# display the camera feed using OpenCV
# display FPS

import time
import cv2
import PySimpleGUI as sg

# init Windows Manager
sg.theme("DarkBlue")

# def webcam col
colwebcam1_layout = [[sg.Text("Camera View", size=(60, 1), justification="center")],
                     [sg.Image(filename="", key="cam1")]]
colwebcam1 = sg.Column(colwebcam1_layout, element_justification='center')

colwebcam2_layout = [[sg.Text("Camera View GrayScale", size=(60, 1), justification="center")],
                     [sg.Image(filename="", key="cam1gray")]]
colwebcam2 = sg.Column(colwebcam2_layout, element_justification='center')

colslayout = [colwebcam1, colwebcam2]
rowfooter = [sg.Image(filename="avabottom.png", key="-IMAGEBOTTOM-")]
layout = [colslayout, rowfooter]

right_click_menu = ['Unused', ['&FPS', '---', 'Menu A', 'Menu B', 'Menu C', ['Menu C1', 'Menu C2'], '---', 'Exit']]

window = sg.Window("El Bruno - Webcams and GrayScale with PySimpleGUI", layout,
                   right_click_menu=right_click_menu,
                   no_titlebar=False, alpha_channel=1, grab_anywhere=False,
                   return_keyboard_events=True, location=(100, 100))

# Camera Settings
camera_Width = 480   # 640 # 1024 # 1280
camera_Heigth = 320  # 480 # 780 # 960
frameSize = (camera_Width, camera_Heigth)
video_capture = cv2.VideoCapture(0)
time.sleep(2.0)

display_fps = False

while True:
    start_time = time.time()

    # process window events
    event, values = window.read(timeout=20)
    if event == sg.WIN_CLOSED or event == "Exit":
        break
    if event == "f" or event == "F" or event == "FPS":
        display_fps = not display_fps

    # get camera frame
    ret, frameOrig = video_capture.read()
    frame = cv2.resize(frameOrig, frameSize)

    if display_fps and (time.time() - start_time) > 0:
        fpsInfo = "FPS: " + str(1.0 / (time.time() - start_time))  # FPS = 1 / time to process loop
        font = cv2.FONT_HERSHEY_DUPLEX
        cv2.putText(frame, fpsInfo, (10, 20), font, 0.4, (255, 255, 255), 1)

    # update webcam1
    imgbytes = cv2.imencode(".png", frame)[1].tobytes()
    window["cam1"].update(data=imgbytes)

    # transform frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # update webcam2
    imgbytes = cv2.imencode(".png", gray)[1].tobytes()
    window["cam1gray"].update(data=imgbytes)

video_capture.release()
cv2.destroyAllWindows()

Happy coding!

Greetings

El Bruno


#Python – Creating GUIs with #PySimpleGUI. 2 webcams view with 50 lines of code


Hi !

Working with Computer Vision is super fun. And there are some scenarios where displaying the step-by-step processing of an image is the best way to present it.

In most of my scenarios I use OpenCV, however for a more detailed presentation I needed to search for and learn a GUI framework in Python. That's how I got to PySimpleGUI (see references).

Note: As a long-time C# dev, I'm missing XAML a lot !

PySimpleGUI is very simple (as you can expect!), and with a few lines of code we can create a UI like this one:

Let’s display the camera feed and a gray scale view of the camera feed

2 cameras in Python, using PySimpleGUI to create a window to display them

Super easy !
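Before walking through the real sample, here is the bare PySimpleGUI pattern in isolation, a minimal sketch of my own: the layout is a list of rows, each row is a list of elements, and the window is polled in an event loop.

import PySimpleGUI as sg

# layout: a list of rows, each row a list of elements
layout = [[sg.Text("Hello PySimpleGUI")],
          [sg.Button("Close")]]

window = sg.Window("Demo", layout)
while True:
    event, values = window.read()
    if event == sg.WIN_CLOSED or event == "Close":
        break
window.close()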

Let’s take a look at the code

  • Lines 16-34. This is the main window definition. The window has 2 rows.
    • The 1st row has 2 columns with the 2 cameras; each camera has its own element key
    • The 2nd row has an image to display as a footer
  • Line 30. This is the final merge of the 2 rows
  • Lines 32-34. This is the window definition. We can define title, transparency, etc.
  • Lines 37-40. Window event management; I'll write more about this for sure. Right now, I'm only checking for window close to exit the loop.
  • Lines 51-53. Transform the camera frame to a byte array, based on the PNG format, and assign the array to the 1st camera viewer.
  • Lines 55-60. Transform the camera frame to a gray scale frame. Then transform the gray scale frame to a byte array, based on the PNG format, and assign the array to the 2nd camera viewer.

Done !

# Bruno Capuano 2020
# display the camera feed using OpenCV
# display the camera feed with grayscale using OpenCV

import time
import cv2
import PySimpleGUI as sg

# Camera Settings
camera_Width = 320   # 480 # 640 # 1024 # 1280
camera_Heigth = 240  # 320 # 480 # 780 # 960
frameSize = (camera_Width, camera_Heigth)
video_capture = cv2.VideoCapture(0)
time.sleep(2.0)

# init Windows Manager
sg.theme("DarkBlue")

# def webcam col
colwebcam1_layout = [[sg.Text("Camera View", size=(60, 1), justification="center")],
                     [sg.Image(filename="", key="cam1")]]
colwebcam1 = sg.Column(colwebcam1_layout, element_justification='center')

colwebcam2_layout = [[sg.Text("Camera View GrayScale", size=(60, 1), justification="center")],
                     [sg.Image(filename="", key="cam1gray")]]
colwebcam2 = sg.Column(colwebcam2_layout, element_justification='center')

colslayout = [colwebcam1, colwebcam2]
rowfooter = [sg.Image(filename="avabottom.png", key="-IMAGEBOTTOM-")]
layout = [colslayout, rowfooter]

window = sg.Window("El Bruno - Webcams and GrayScale with PySimpleGUI", layout,
                   no_titlebar=False, alpha_channel=1, grab_anywhere=False,
                   return_keyboard_events=True, location=(100, 100))

while True:
    start_time = time.time()
    event, values = window.read(timeout=20)
    if event == sg.WIN_CLOSED:
        break

    # get camera frame
    ret, frameOrig = video_capture.read()
    frame = cv2.resize(frameOrig, frameSize)

    # if (time.time() - start_time) > 0:
    #     fpsInfo = "FPS: " + str(1.0 / (time.time() - start_time))  # FPS = 1 / time to process loop
    #     font = cv2.FONT_HERSHEY_DUPLEX
    #     cv2.putText(frame, fpsInfo, (10, 20), font, 0.4, (255, 255, 255), 1)

    # update webcam1
    imgbytes = cv2.imencode(".png", frame)[1].tobytes()
    window["cam1"].update(data=imgbytes)

    # transform frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # update webcam2
    imgbytes = cv2.imencode(".png", gray)[1].tobytes()
    window["cam1gray"].update(data=imgbytes)

video_capture.release()
cv2.destroyAllWindows()

Happy coding!

Greetings

El Bruno


#Python – #FastAPI Webserver sharing information from values in a different thread


Hi !

After yesterday's post using Flask, I was sure that a FastAPI version would be needed, so here it goes:

I have a common scenario which involves:

  • A sensor collecting information
  • A web-server publishing the sensor information

Read my previous posts to understand why I think this is the simplest way to solve this: multi-threading.

  • Thread 1, where an infinite loop requests information from the sensor and stores the latest value to be shared.
  • Thread 2, where a web-server processes requests and shares the latest sensor information.

Easy ! And after a couple of tests, I managed to create a single file implementing this:

# Bruno Capuano
# simple webserver with fastapi
# run with: uvicorn 07:app --reload
# test with http://127.0.0.1:8000/getdata
# on each call, validate if the thread is started;
# if the thread is None, start a different thread that +1s a shared var

from typing import Optional
from fastapi import FastAPI
import threading
import time

stateThread = None
iCounter = 0

app = FastAPI()

def validateStateThread():
    global stateThread
    if stateThread is None:
        print(f"start thread")
        stateThread = threading.Thread(target=mainSum)
        stateThread.daemon = True
        stateThread.start()

@app.get("/getdata")
def main():
    global iCounter
    validateStateThread()
    t = time.localtime()
    current_time = time.strftime("%H:%M:%S", t)
    return str(f"{current_time} - data {iCounter}")

def mainSum():
    # increment counter every second
    global iCounter
    while True:
        iCounter = iCounter + 1
        t = time.localtime()
        current_time = time.strftime("%H:%M:%S", t)
        print(str(f"{current_time} - data {iCounter}"))
        time.sleep(1)
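To verify that the counter keeps incrementing between calls, you can poll the endpoint from another process. A minimal client sketch, assuming the server runs on uvicorn's default port 8000 and that the requests package is available:

import time
import requests

# poll /getdata a few times; the counter should keep growing
for _ in range(5):
    r = requests.get("http://127.0.0.1:8000/getdata")
    print(r.text)
    time.sleep(2)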

So at this point, you may think: why does El Bruno need this? So, let’s share an image that I’ll use in future posts:

thermal camera demo

Happy coding!

Greetings

El Bruno


#Python – Flask Webserver sharing information from values in a different thread


Hi !

I have a common scenario which involves:

  • A sensor collecting information
  • A web-server publishing the sensor information

This is simple, however the sensor does not support constant requests, and it may return a "too many requests" response when called directly. The idea of getting the sensor information directly in the web request was not valid from day zero.

I asked for support / guidance and my amazing and smart friends showed me the concept of OVER ENGINEERING. Docker, Compose, queues, coordination and more were part of some of the proposals. However, they also showed me the easiest and simplest way to solve this: multi-threading.

  • Thread 1, where an infinite loop requests information from the sensor and stores the latest value to be shared.
  • Thread 2, where a web-server processes requests and shares the latest sensor information.

Easy ! And after a couple of tests, I managed to create a single file implementing this:

# Bruno Capuano
# start a webserver with flask in a thread
# start a different thread +1 a shared var

from flask import Flask
import threading
import time

iCounter = 0
data = 'foo'

app = Flask(__name__)

def mainSum():
    # increment counter every second
    global iCounter
    while True:
        iCounter = iCounter + 1
        t = time.localtime()
        current_time = time.strftime("%H:%M:%S", t)
        print(str(f"{current_time} - data {iCounter}"))
        time.sleep(1)

def startWebServer():
    app.run(host='0.0.0.0', port=8080)

@app.route("/getdata")
def main():
    global iCounter
    t = time.localtime()
    current_time = time.strftime("%H:%M:%S", t)
    return str(f"{current_time} - data {iCounter}")

if __name__ == "__main__":
    stateThread = threading.Thread(target=mainSum)
    stateThread.daemon = True
    stateThread.start()
    webThread = threading.Thread(target=startWebServer)
    webThread.start()
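A design note: sharing a single int like this works for a demo, but if the sensor thread ever shares anything more complex, guard it with a lock. A minimal sketch of the same counter idea with an explicit threading.Lock; this is my own variation, not part of the original sample:

import threading
import time

iCounter = 0
counter_lock = threading.Lock()

def mainSum():
    global iCounter
    while True:
        with counter_lock:      # guard the shared value while writing
            iCounter = iCounter + 1
        time.sleep(1)

def readCounter():
    with counter_lock:          # guard the shared value while reading
        return iCounter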

So at this point, you may think: why does El Bruno need this? So, let’s share an image that I’ll use in future posts:

thermal camera demo

Note: Some very smart people also suggested implementing this using FastAPI instead of Flask, so a future post may include this.

Happy coding!

Greetings

El Bruno


#ComputerVision – Object Detection with #YoloV3 and #MobileNetSSD


Hi !

I have a ToDo on my list to add some new drone demos. In order to do this, I was planning to perform some tests with pretrained models and use them. The first 2 on my list are YOLO and MobileNetSSD (see references).

YoloV3

Let’s start with one of the most popular object detection tools, YOLOV3. The official definition:

YOLO (You Only Look Once) is a real-time object detection algorithm that is a single deep convolutional neural network that splits the input image into a set of grid cells, so unlike image classification or face detection, each grid cell in YOLO algorithm will have an associated vector in the output that tells us:

If an object exists in that grid cell.

The class of that object (i.e label).

The predicted bounding box for that object (location).

YoloV3

I picked up some sample code from GitHub repositories and, as usual, from PyImageSearch (see references), and I created a real-time object detection scenario using my webcam as the input feed for YoloV3.

Object Detection live sample with Yolo V3

The final demo works great; we can use the 80 classes that YoloV3 supports, and it runs at ~2 FPS.

MobileNetSSD

Another very popular Object Detection Tool is MobileNetSSD. And, the important part here is SSD, Single Shot Detection. Let’s go to the definition:

Single Shot object detection or SSD takes one single shot to detect multiple objects within the image. As you can see in the above image we are detecting coffee, iPhone, notebook, laptop and glasses at the same time.

It is composed of two parts

– Extract feature maps, and

– Apply convolution filter to detect objects

SSD is developed by Google researcher teams to maintain the balance between the two object detection methods which are YOLO and RCNN.

There are specifically two models of SSD available:

– SSD300: In this model the input size is fixed to 300×300. It is used with lower resolution images, offers faster processing speed, and is less accurate than SSD512

– SSD512: In this model the input size is fixed to 500×500. It is used with higher resolution images and is more accurate than other models.

SSD is faster than R-CNN because in R-CNN we need two shots, one for generating region proposals and one for detecting objects, whereas in SSD it can be done in a single shot.

The MobileNet SSD method was first trained on the COCO dataset and was then fine-tuned on PASCAL VOC reaching 72.7% mAP (mean average precision).

For this demo, I'll use the SSD300 model. Even if the drone supports better quality images and the SSD512 model works with bigger images, SSD300 is a good fit for this.
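One detail worth calling out before the listing: the SSD300 preprocessing below uses two magic numbers, the scale factor 0.007843 (which is 1/127.5) and the mean 127.5; together they map 0-255 pixel values into roughly [-1, 1]. A minimal sketch of that step, using the same parameters and model files as the full listing (sample.jpg is a hypothetical test image):

import cv2

net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt.txt",
                               "MobileNetSSD_deploy.caffemodel")
frame = cv2.imread("sample.jpg")  # hypothetical test image

# scale 1/127.5 and mean 127.5 normalize pixels to roughly [-1, 1]
blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
                             0.007843, (300, 300), 127.5)
net.setInput(blob)
detections = net.forward()  # shape: (1, 1, N, 7)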

Object Detection with MobileNetSSD

This sample works at ~20 FPS, and this triggered my curiosity to learn more about the 2nd one. I started to read a lot about this, and found some amazing articles and papers. At the end, if you are interested in my personal take, I really enjoyed this 30 min video about the different detectors side-by-side.

Source Code

YoloV3 webcam live object detection

# Bruno Capuano 2020
# display the camera feed using OpenCV
# display FPS
# load YOLO object detector trained with COCO Dataset (80 classes)
# analyze each camera frame using YoloV3

import numpy as np
import time
import cv2
import os

def initYoloV3():
    global labelColors, layerNames, net
    # random color collection for each class label
    np.random.seed(42)
    labelColors = np.random.randint(0, 255, size=(len(Labels), 3), dtype="uint8")
    # load model
    net = cv2.dnn.readNetFromDarknet(configPath, weightsPath)
    layerNames = net.getLayerNames()
    layerNames = [layerNames[i[0] - 1] for i in net.getUnconnectedOutLayers()]

def analyzeFrame(frame, displayBoundingBox=True, displayClassName=True, displayConfidence=True):
    global H, W
    # init
    if W is None or H is None:
        (H, W) = frame.shape[:2]
    if net is None:
        initYoloV3()

    yoloV3ImgSize = (416, 416)
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, yoloV3ImgSize, swapRB=True, crop=False)
    net.setInput(blob)
    start = time.time()
    layerOutputs = net.forward(layerNames)
    end = time.time()

    boxes = []
    confidences = []
    classIDs = []
    for output in layerOutputs:
        for detection in output:
            scores = detection[5:]
            classID = np.argmax(scores)
            confidence = scores[classID]
            if confidence > confidenceDef:
                box = detection[0:4] * np.array([W, H, W, H])
                (centerX, centerY, width, height) = box.astype("int")
                x = int(centerX - (width / 2))
                y = int(centerY - (height / 2))
                boxes.append([x, y, int(width), int(height)])
                confidences.append(float(confidence))
                classIDs.append(classID)

    idxs = cv2.dnn.NMSBoxes(boxes, confidences, confidenceDef, thresholdDef)
    if len(idxs) > 0:
        for i in idxs.flatten():
            (x, y) = (boxes[i][0], boxes[i][1])
            (w, h) = (boxes[i][2], boxes[i][3])
            if displayBoundingBox:
                color = [int(c) for c in labelColors[classIDs[i]]]
                cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
            if displayClassName and displayConfidence:
                text = "{}: {:.4f}".format(Labels[classIDs[i]], confidences[i])
                cv2.putText(frame, text, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
            elif displayClassName:
                text = str(f"{Labels[classIDs[i]]}:")
                cv2.putText(frame, text, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

# Camera Settings
camera_Width = 640   # 1024 # 1280 # 640
camera_Heigth = 480  # 780 # 960 # 480
frameSize = (camera_Width, camera_Heigth)
video_capture = cv2.VideoCapture(1)
time.sleep(2.0)
(W, H) = (None, None)

# YOLO Settings
weightsPath = "yolov3.weights"
configPath = "yolov3.cfg"
LabelsPath = "coco.names"
Labels = open(LabelsPath).read().strip().split("\n")
confidenceDef = 0.5
thresholdDef = 0.3
net = None
labelColors = None
layerNames = None

i = 0
detectionEnabled = False
while True:
    i = i + 1
    start_time = time.time()
    ret, frameOrig = video_capture.read()
    frame = cv2.resize(frameOrig, frameSize)

    if detectionEnabled:
        analyzeFrame(frame)

    if (time.time() - start_time) > 0:
        fpsInfo = "FPS: " + str(1.0 / (time.time() - start_time))  # FPS = 1 / time to process loop
        font = cv2.FONT_HERSHEY_DUPLEX
        cv2.putText(frame, fpsInfo, (10, 20), font, 0.4, (255, 255, 255), 1)

    cv2.imshow('@elbruno - YoloV3 Object Detection', frame)

    # key controller
    key = cv2.waitKey(1) & 0xFF
    if key == ord("d"):
        detectionEnabled = not detectionEnabled
    if key == ord("q"):
        break

video_capture.release()
cv2.destroyAllWindows()
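The script expects yolov3.weights, yolov3.cfg and coco.names next to it. A small download helper, a sketch using the usual public Darknet locations (these URLs may move over time, so treat them as an assumption):

import urllib.request

# usual public locations for the Darknet YoloV3 files; verify before relying on them
files = {
    "yolov3.weights": "https://pjreddie.com/media/files/yolov3.weights",
    "yolov3.cfg": "https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/yolov3.cfg",
    "coco.names": "https://raw.githubusercontent.com/pjreddie/darknet/master/data/coco.names",
}
for name, url in files.items():
    print(f"downloading {name} ...")
    urllib.request.urlretrieve(url, name)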

MobileNetSSD webcam live object detection

# Bruno Capuano 2020
# display the camera feed using OpenCV
# display FPS
# load MobileNetSSD object detector trained with COCO Dataset (20 classes)
# analyze each camera frame using MobileNet
# enable / disable obj detection pressing D key

import numpy as np
import time
import cv2
import os

def initMobileNetSSD():
    global classesMobileNetSSD, colorsMobileNetSSD, net
    classesMobileNetSSD = ["background", "aeroplane", "bicycle", "bird", "boat",
                           "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
                           "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
                           "sofa", "train", "tvmonitor"]
    colorsMobileNetSSD = np.random.uniform(0, 255, size=(len(classesMobileNetSSD), 3))
    net = cv2.dnn.readNetFromCaffe(prototxtFile, modelFile)

def analyzeFrame(frame, displayBoundingBox=True, displayClassName=True, displayConfidence=True):
    global H, W
    # init
    if W is None or H is None:
        (H, W) = frame.shape[:2]
    if net is None:
        initMobileNetSSD()

    mobileNetSSDImgSize = (300, 300)
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, mobileNetSSDImgSize), 0.007843, mobileNetSSDImgSize, 127.5)
    net.setInput(blob)
    detections = net.forward()

    for i in np.arange(0, detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence > confidenceDef:
            idx = int(detections[0, 0, i, 1])
            box = detections[0, 0, i, 3:7] * np.array([W, H, W, H])
            (startX, startY, endX, endY) = box.astype("int")
            if displayBoundingBox:
                cv2.rectangle(frame, (startX, startY), (endX, endY), colorsMobileNetSSD[idx], 2)
            if displayClassName and displayConfidence:
                label = "{}: {:.2f}%".format(classesMobileNetSSD[idx], confidence * 100)
                y = startY - 15 if startY - 15 > 15 else startY + 15
                cv2.putText(frame, label, (startX, y), cv2.FONT_HERSHEY_SIMPLEX, 0.5, colorsMobileNetSSD[idx], 2)
            elif displayClassName:
                label = str(f"{classesMobileNetSSD[idx]}")
                y = startY - 15 if startY - 15 > 15 else startY + 15
                cv2.putText(frame, label, (startX, y), cv2.FONT_HERSHEY_SIMPLEX, 0.5, colorsMobileNetSSD[idx], 2)

# Camera Settings
camera_Width = 640   # 1024 # 1280 # 640
camera_Heigth = 480  # 780 # 960 # 480
frameSize = (camera_Width, camera_Heigth)
video_capture = cv2.VideoCapture(1)
time.sleep(2.0)
(W, H) = (None, None)

# MobileNetSSD Settings
confidenceDef = 0.5
thresholdDef = 0.3
prototxtFile = "MobileNetSSD_deploy.prototxt.txt"
modelFile = "MobileNetSSD_deploy.caffemodel"
net = None
classesMobileNetSSD = None
colorsMobileNetSSD = None

i = 0
detectionEnabled = False
while True:
    i = i + 1
    start_time = time.time()
    ret, frameOrig = video_capture.read()
    frame = cv2.resize(frameOrig, frameSize)

    if detectionEnabled:
        analyzeFrame(frame)

    if (time.time() - start_time) > 0:
        fpsInfo = "FPS: " + str(1.0 / (time.time() - start_time))  # FPS = 1 / time to process loop
        font = cv2.FONT_HERSHEY_DUPLEX
        cv2.putText(frame, fpsInfo, (10, 20), font, 0.4, (255, 255, 255), 1)

    cv2.imshow('@elbruno - MobileNetSSD Object Detection', frame)

    # key controller
    key = cv2.waitKey(1) & 0xFF
    if key == ord("d"):
        detectionEnabled = not detectionEnabled
    if key == ord("q"):
        break

video_capture.release()
cv2.destroyAllWindows()

Happy coding!

Greetings

El Bruno


#Coding4Fun – How to control your #drone with 20 lines of code! (21/N)


Hi !

In my post series I already wrote about how to detect faces. We can do this with a camera and OpenCV. However, a drone can also be moved on command, so let's write some lines to detect a face, and calculate the orientation and distance of the detected face from the center of the camera.

In order to do this, 1st let’s draw a grid in the camera frame, and once a face is detected, let’s show the distance and orientation from the center.

face detected on camera and calculate position from center

Let’s start with a Grid. The idea is to create a 3×3 grid in the camera frame, and use the center cell as reference for the detected objects. The code to create a 3×3 grid is this one:

def displayGrid(frame):
    # Add a 3x3 Grid
    cv2.line(frame, (int(camera_Width/2)-centerZone, 0)     , (int(camera_Width/2)-centerZone, camera_Heigth)    , lineColor, lineThickness)
    cv2.line(frame, (int(camera_Width/2)+centerZone, 0)     , (int(camera_Width/2)+centerZone, camera_Heigth)    , lineColor, lineThickness)
    cv2.line(frame, (0, int(camera_Heigth / 2) - centerZone), (camera_Width, int(camera_Heigth / 2) - centerZone), lineColor, lineThickness)
    cv2.line(frame, (0, int(camera_Heigth / 2) + centerZone), (camera_Width, int(camera_Heigth / 2) + centerZone), lineColor, lineThickness)

# Camera Settings
camera_Width  = 1024 # 1280 # 640
camera_Heigth = 780  # 960  # 480
centerZone    = 100

# GridLine color green and thickness
lineColor = (0, 255, 0) 
lineThickness = 2

We use the line() function in OpenCV, and do some calculations to get the start and end points for the 4 lines of the grid: 2 vertical lines and 2 horizontal lines. For this demo, I'll implement this with my main webcam.

drone 3x3 grid in the camera frame

Based on my face detection samples and other samples in GitHub (see references), now I’ll calculate the position of the detected face (with x, y, h, w) from the center of the camera:

def calculatePositionForDetectedFace(frame, x, y, h , w):
    # calculate direction and relative position of the face
    cx = int(x + (w / 2))  # Center X of the Face
    cy = int(y + (h / 2))  # Center Y of the Face

    if (cx <int(camera_Width/2) - centerZone):
        cv2.putText  (frame, " LEFT " , (20, 50), cv2.FONT_HERSHEY_COMPLEX, 1 , colorGreen, 2)
        dir = 1
    elif (cx > int(camera_Width / 2) + centerZone):
        cv2.putText(frame, " RIGHT ", (20, 50), cv2.FONT_HERSHEY_COMPLEX,1,colorGreen, 3)
        dir = 2
    elif (cy < int(camera_Heigth / 2) - centerZone):
        cv2.putText(frame, " UP ", (20, 50), cv2.FONT_HERSHEY_COMPLEX,1,colorGreen, 3)
        dir = 3
    elif (cy > int(camera_Heigth / 2) + centerZone):
        cv2.putText(frame, " DOWN ", (20, 50), cv2.FONT_HERSHEY_COMPLEX, 1,colorGreen, 3)
        dir = 4
    else: dir=0

    # display detected face frame, line from center and direction to go
    cv2.line     (frame, (int(camera_Width/2),int(camera_Heigth/2)), (cx,cy), colorRed, messageThickness)
    cv2.rectangle(frame, (x, y), (x + w, y + h), colorBlue, messageThickness)
    cv2.putText  (frame, str(int(x)) + " " + str(int(y)), (x - 20, y - 45), cv2.FONT_HERSHEY_COMPLEX,0.7, colorRed, messageThickness)

The output is similar to this one

And now with the base code completed, it’s time to add this logic to the drone samples !
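As a teaser for that next step, here is a hypothetical mapping from the dir value to Tello SDK movement commands, assuming the sendControlCommand() helper from the earlier posts in this series; the left/right sense may need inverting if the camera feed is mirrored:

# hypothetical: steer the drone based on where the face sits in the grid
# dir: 1 = face left of center, 2 = right, 3 = up, 4 = down, 0 = centered
dirToCommand = {1: "left 20", 2: "right 20", 3: "up 20", 4: "down 20"}

def followFace(dir):
    if dir in dirToCommand:
        sendControlCommand(dirToCommand[dir])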

Bonus: the complete code.

# Bruno Capuano 2020
# display the camera feed using OpenCV
# display a 3x3 Grid
# detect faces using openCV and haar cascades
# calculate the relative position for the face from the center of the camera

import os
import time
import cv2

def displayGrid(frame):
    # Add a 3x3 Grid
    cv2.line(frame, (int(camera_Width/2)-centerZone, 0), (int(camera_Width/2)-centerZone, camera_Heigth), lineColor, lineThickness)
    cv2.line(frame, (int(camera_Width/2)+centerZone, 0), (int(camera_Width/2)+centerZone, camera_Heigth), lineColor, lineThickness)
    cv2.line(frame, (0, int(camera_Heigth / 2) - centerZone), (camera_Width, int(camera_Heigth / 2) - centerZone), lineColor, lineThickness)
    cv2.line(frame, (0, int(camera_Heigth / 2) + centerZone), (camera_Width, int(camera_Heigth / 2) + centerZone), lineColor, lineThickness)

def calculatePositionForDetectedFace(frame, x, y, h, w):
    # calculate direction and relative position of the face
    cx = int(x + (w / 2))  # Center X of the Face
    cy = int(y + (h / 2))  # Center Y of the Face

    if cx < int(camera_Width / 2) - centerZone:
        cv2.putText(frame, " LEFT ", (20, 50), cv2.FONT_HERSHEY_COMPLEX, 1, colorGreen, 2)
        dir = 1
    elif cx > int(camera_Width / 2) + centerZone:
        cv2.putText(frame, " RIGHT ", (20, 50), cv2.FONT_HERSHEY_COMPLEX, 1, colorGreen, 3)
        dir = 2
    elif cy < int(camera_Heigth / 2) - centerZone:
        cv2.putText(frame, " UP ", (20, 50), cv2.FONT_HERSHEY_COMPLEX, 1, colorGreen, 3)
        dir = 3
    elif cy > int(camera_Heigth / 2) + centerZone:
        cv2.putText(frame, " DOWN ", (20, 50), cv2.FONT_HERSHEY_COMPLEX, 1, colorGreen, 3)
        dir = 4
    else:
        dir = 0

    # display detected face frame, line from center and direction to go
    cv2.line(frame, (int(camera_Width/2), int(camera_Heigth/2)), (cx, cy), colorRed, messageThickness)
    cv2.rectangle(frame, (x, y), (x + w, y + h), colorBlue, messageThickness)
    cv2.putText(frame, str(int(x)) + " " + str(int(y)), (x - 20, y - 45), cv2.FONT_HERSHEY_COMPLEX, 0.7, colorRed, messageThickness)

# Camera Settings
camera_Width = 1024  # 1280 # 640
camera_Heigth = 780  # 960 # 480
centerZone = 100

# GridLine color green and thickness
lineColor = (0, 255, 0)
lineThickness = 2

# message color and thickness
colorBlue = (255, 0, 0)
colorGreen = (0, 255, 0)
colorRed = (0, 0, 255)
messageThickness = 2

dsize = (camera_Width, camera_Heigth)
video_capture = cv2.VideoCapture(1)
time.sleep(2.0)

# enable face and smile detection
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

i = 0
while True:
    i = i + 1
    ret, frameOrig = video_capture.read()
    frame = cv2.resize(frameOrig, dsize)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    displayGrid(frame)

    # detect faces
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    for (x, y, w, h) in faces:
        # display face in grid
        calculatePositionForDetectedFace(frame, x, y, h, w)

    cv2.imshow('@ElBruno - Follow Faces', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

video_capture.release()
cv2.destroyAllWindows()

Happy coding!

Greetings

El Bruno


#Coding4Fun – How to control your #drone with 20 lines of code! (20/N)


Hi !

We already have the drone camera feed ready to process, so let’s do some Image Segmentation today. As usual, let’s start with the formal definition of Image Segmentation

In digital image processing and computer vision, image segmentation is the process of partitioning a digital image into multiple segments (sets of pixels, also known as image objects). The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze.[1][2] Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics.

The result of image segmentation is a set of segments that collectively cover the entire image, or a set of contours extracted from the image (see edge detection). Each of the pixels in a region are similar with respect to some characteristic or computed property, such as color, intensity, or texture. Adjacent regions are significantly different with respect to the same characteristic(s).[1] When applied to a stack of images, typical in medical imaging, the resulting contours after image segmentation can be used to create 3D reconstructions with the help of interpolation algorithms like marching cubes.[3]

Wikipedia, Image Segmentation

The technique is amazing, and once it is attached to the drone camera, we can get something like this:

I used a Python library to do most of the work: PixelLib. It was created by an amazing set of colleagues, so please check the references and take a look at the project description.

PixelLib is a library built for an easy implementation of Image Segmentation in real life problems. PixelLib is a flexible library that can be integrated into software solutions that require the application of Image Segmentation.

PixelLib

Once I had all the pieces together, I submitted a Pull Request with a single change to allow the use of OpenCV and webcam camera frames, and I got a basic demo up and running.

Let’s review the code

  • Line 147. That's it: a single line performs the instance segmentation and also displays the bounding boxes (distilled in the sketch below).
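Distilled from the full listing below, the PixelLib round-trip is just three calls: create the segmenter, load the pretrained Mask R-CNN weights, and segment a frame. A minimal sketch (sample.png is a hypothetical test frame):

from pixellib.instance import instance_segmentation
import cv2

instance_seg = instance_segmentation()
instance_seg.load_model("mask_rcnn_coco.h5")

frame = cv2.imread("sample.png")  # hypothetical test frame
# returns the segmentation masks plus the frame with masks and boxes drawn
segmask, output = instance_seg.segmentFrame(frame, show_bboxes=True)
cv2.imwrite("sampleOut.png", output)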

Sample Code

# Bruno Capuano
# enable drone video camera
# display video camera using OpenCV
# display FPS
# add a bottom image overlay, using a background image
# key D enable / disable instance segmentation detection
# save a local video with the camera recorded

import pixellib
from pixellib.instance import instance_segmentation
import socket
import time
import threading
import os
import cv2

def receiveData():
    global response
    while True:
        try:
            response, _ = clientSocket.recvfrom(1024)
        except:
            break

def readStates():
    global battery
    while True:
        try:
            response_state, _ = stateSocket.recvfrom(256)
            if response_state != 'ok':
                response_state = response_state.decode('ASCII')
                list = response_state.replace(';', ':').split(':')
                battery = int(list[21])
        except:
            break

def sendCommand(command):
    global response
    timestamp = int(time.time() * 1000)
    clientSocket.sendto(command.encode('utf-8'), address)
    while response is None:
        if (time.time() * 1000) - timestamp > 5 * 1000:
            return False
    return response

def sendReadCommand(command):
    response = sendCommand(command)
    try:
        response = str(response)
    except:
        pass
    return response

def sendControlCommand(command):
    response = None
    for i in range(0, 5):
        response = sendCommand(command)
        if response == 'OK' or response == 'ok':
            return True
    return False

# -----------------------------------------------
# Main program
# -----------------------------------------------

# connection info
UDP_IP = '192.168.10.1'
UDP_PORT = 8889
last_received_command = time.time()
STATE_UDP_PORT = 8890
address = (UDP_IP, UDP_PORT)
response = None
response_state = None

clientSocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
clientSocket.bind(('', UDP_PORT))
stateSocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
stateSocket.bind(('', STATE_UDP_PORT))

# start threads
recThread = threading.Thread(target=receiveData)
recThread.daemon = True
recThread.start()
stateThread = threading.Thread(target=readStates)
stateThread.daemon = True
stateThread.start()

# connect to drone
response = sendControlCommand("command")
print(f'command response: {response}')
response = sendControlCommand("streamon")
print(f'streamon response: {response}')

# drone information
battery = 0

# open UDP
print(f'opening UDP video feed, wait 2 seconds ')
videoUDP = 'udp://192.168.10.1:11111'
cap = cv2.VideoCapture(videoUDP)
time.sleep(2)

# open video writer to save video
vid_cod = cv2.VideoWriter_fourcc(*'XVID')
vid_output = cv2.VideoWriter("cam_video.mp4", vid_cod, 20.0, (640, 480))
dsize = (640, 480)

# load bottom img
background = cv2.imread('Bottom03.png')
background = cv2.resize(background, dsize)

# load model
instance_seg = instance_segmentation()
instance_seg.load_model("mask_rcnn_coco.h5")

# main app
detectionEnabled = False
i = 0
while True:
    i = i + 1
    start_time = time.time()
    sendReadCommand('battery?')
    print(f'battery: {battery} % - i: {i}')

    try:
        ret, frame = cap.read()
        img = cv2.resize(frame, (640, 480))

        if detectionEnabled:
            # save image to disk and open it
            imgNumber = str(i).zfill(5)
            frameImageFileName = str(f'tmp\image{imgNumber}.png')
            outputImageName = str(f'tmp\image{imgNumber}Out.png')
            if os.path.exists(frameImageFileName):
                os.remove(frameImageFileName)
            cv2.imwrite(frameImageFileName, img)
            segmask, img = instance_seg.segmentFrame(img, show_bboxes=True)
            cv2.imwrite(outputImageName, img)

        # overlay background
        img = cv2.addWeighted(background, 1, img, 1, 0)

        if (time.time() - start_time) > 0:
            fpsInfo = "FPS: " + str(1.0 / (time.time() - start_time))  # FPS = 1 / time to process loop
            font = cv2.FONT_HERSHEY_DUPLEX
            cv2.putText(img, fpsInfo, (10, 20), font, 0.4, (255, 255, 255), 1)

        cv2.imshow('@elbruno - DJI Tello Camera', img)
        vid_output.write(img)
    except Exception as e:
        print(f'exc: {e}')
        pass

    # key controller
    key = cv2.waitKey(1) & 0xFF
    if key == ord("d"):
        detectionEnabled = not detectionEnabled
    if key == ord("q"):
        break

# release resources
response = sendControlCommand("streamoff")
print(f'streamoff response: {response}')

# close the already opened camera, and the video file
cap.release()
vid_output.release()
cv2.destroyAllWindows()

I'll show a couple of live demos of this in my next Global AI Community Drone AI sessions. Check my next events section!

Happy coding!

Greetings

El Bruno


#Coding4Fun – How to control your #drone with 20 lines of code! (19/N)


Hi !

Today I faced another challenge: I needed to overlay an image on top of another. Something like this.

camera overlay images with python

Lucky for me, and as usual, OpenCV allows us to do this with a few lines of code. Let's take a look.

  • Line 8. Define a custom size for all the images: background image and camera feed frame.
  • Lines 10-12. Load and resize background image.
  • Line 21. Overlay the camera frame and the background image.

Sample Code

# Bruno Capuano 2020
# display the camera feed using OpenCV
# add a bottom image overlay, using a background image

import time
import cv2

dsize = (640, 480)

# load bottom img
background = cv2.imread('Bottom03.png')
background = cv2.resize(background, dsize)

video_capture = cv2.VideoCapture(0)
time.sleep(2.0)

while True:
    ret, frameOrig = video_capture.read()
    frame = cv2.resize(frameOrig, dsize)
    img = cv2.addWeighted(background, 1, frame, 1, 0)
    cv2.imshow('Video', img)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

video_capture.release()
cv2.destroyAllWindows()
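One note on the blend: addWeighted() with both weights at 1 simply sums the two images, which only looks right because Bottom03.png is black everywhere except the logo. If your overlay artwork is not on a black background, a mask-based paste avoids washing out the camera frame. A minimal sketch of that alternative, my own variation on the sample above:

import cv2

frame = cv2.imread('frame.png')        # hypothetical camera frame
overlay = cv2.imread('Bottom03.png')
overlay = cv2.resize(overlay, (frame.shape[1], frame.shape[0]))

# build a mask of the non-black overlay pixels
gray = cv2.cvtColor(overlay, cv2.COLOR_BGR2GRAY)
_, mask = cv2.threshold(gray, 10, 255, cv2.THRESH_BINARY)

# black out the overlay area in the frame, then paste the overlay on top
frame_bg = cv2.bitwise_and(frame, frame, mask=cv2.bitwise_not(mask))
overlay_fg = cv2.bitwise_and(overlay, overlay, mask=mask)
combined = cv2.add(frame_bg, overlay_fg)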

And from here, I’ll update some posts with the drone camera.

Happy coding!

Greetings

El Bruno


#Coding4Fun – How to control your #drone with 20 lines of code! (18/N)


Hi !

Today I'll step back a couple of posts, and add 2 simple lines to allow me to save a video file from the drone camera. This was a request, and it makes a lot of sense to have a recorded file from the drone camera.

The video will later contain detected objects and more, so let's go with the code. All the magic happens here:

  • Lines 97-103. Open the drone camera stream, and also open a video output stream to save the video file.
  • Lines 123-124. Display the camera feed and add the camera frame to the output video file.
  • Lines 136-139. Dispose of the objects, and close the video output file.
# Bruno Capuano
# enable drone video camera
# display video camera using OpenCV
# display FPS

import socket
import time
import threading
import cv2

def receiveData():
    global response
    while True:
        try:
            response, _ = clientSocket.recvfrom(1024)
        except:
            break

def readStates():
    global battery
    while True:
        try:
            response_state, _ = stateSocket.recvfrom(256)
            if response_state != 'ok':
                response_state = response_state.decode('ASCII')
                list = response_state.replace(';', ':').split(':')
                battery = int(list[21])
        except:
            break

def sendCommand(command):
    global response
    timestamp = int(time.time() * 1000)
    clientSocket.sendto(command.encode('utf-8'), address)
    while response is None:
        if (time.time() * 1000) - timestamp > 5 * 1000:
            return False
    return response

def sendReadCommand(command):
    response = sendCommand(command)
    try:
        response = str(response)
    except:
        pass
    return response

def sendControlCommand(command):
    response = None
    for i in range(0, 5):
        response = sendCommand(command)
        if response == 'OK' or response == 'ok':
            return True
    return False

# -----------------------------------------------
# Main program
# -----------------------------------------------

# connection info
UDP_IP = '192.168.10.1'
UDP_PORT = 8889
last_received_command = time.time()
STATE_UDP_PORT = 8890
address = (UDP_IP, UDP_PORT)
response = None
response_state = None

clientSocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
clientSocket.bind(('', UDP_PORT))
stateSocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
stateSocket.bind(('', STATE_UDP_PORT))

# start threads
recThread = threading.Thread(target=receiveData)
recThread.daemon = True
recThread.start()
stateThread = threading.Thread(target=readStates)
stateThread.daemon = True
stateThread.start()

# connect to drone
response = sendControlCommand("command")
print(f'command response: {response}')
response = sendControlCommand("streamon")
print(f'streamon response: {response}')

# drone information
battery = 0

# open UDP
print(f'opening UDP video feed, wait 2 seconds ')
videoUDP = 'udp://192.168.10.1:11111'
cap = cv2.VideoCapture(videoUDP)
time.sleep(2)

# open video writer to save the camera feed to a file
vid_cod = cv2.VideoWriter_fourcc(*'XVID')
vid_output = cv2.VideoWriter("videos/cam_video.mp4", vid_cod, 20.0, (640, 480))

# open
i = 0
while True:
    i = i + 1
    start_time = time.time()
    sendReadCommand('battery?')
    print(f'battery: {battery} % - i: {i}')

    try:
        ret, frame = cap.read()
        img = cv2.resize(frame, (640, 480))

        if (time.time() - start_time) > 0:
            fpsInfo = "FPS: " + str(1.0 / (time.time() - start_time))  # FPS = 1 / time to process loop
            font = cv2.FONT_HERSHEY_DUPLEX
            cv2.putText(img, fpsInfo, (10, 20), font, 0.4, (255, 255, 255), 1)

        # display the camera feed and add the frame to the output video file
        cv2.imshow('@elbruno - DJI Tello Camera', img)
        vid_output.write(img)
    except Exception as e:
        print(f'exc: {e}')
        pass

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

response = sendControlCommand("streamoff")
print(f'streamoff response: {response}')

# close the already opened camera, and the video file
cap.release()
vid_output.release()
cv2.destroyAllWindows()
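A note on the writer: the listing pairs the XVID fourcc with a .mp4 file name, and some OpenCV builds will complain or write an empty file for that combination. If that happens, match the codec to the container; a minimal sketch, assuming 640x480 frames at 20 FPS:

import cv2

# 'mp4v' is the usual fourcc for .mp4 containers; XVID pairs with .avi
vid_cod = cv2.VideoWriter_fourcc(*'mp4v')
vid_output = cv2.VideoWriter("videos/cam_video.mp4", vid_cod, 20.0, (640, 480))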

Happy coding!

Greetings

El Bruno

#Coding4Fun – How to control your #drone with 20 lines of code! (17/N)


Hi !

Once we have a trained Custom Vision model instance, we can use it to recognize objects from the drone camera feed. Read my previous posts for descriptions of these.

Another interesting scenario is to save local files for every detected object. In the following code, I'll save 2 different files for every detected object:

  • A camera frame image, with a frame around the detected object
  • A plain text file with the JSON information

In the sample code below, the save process is in lines 122-129. And, not in a fancy way, the files share the same base name to correlate them.

drone recognized files
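Before the full listing, here is the request/response round-trip in isolation: a minimal sketch, assuming the Custom Vision container listens on http://127.0.0.1:8070/image as in the code below, and that it returns the standard predictions JSON (frame.png is a hypothetical saved frame):

import requests

api_url = "http://127.0.0.1:8070/image"

with open("frame.png", "rb") as f:
    img_data = f.read()

r = requests.post(api_url, data=img_data)
predictions = r.json()["predictions"]
for pred in sorted(predictions, key=lambda p: p["probability"], reverse=True):
    print(pred["tagName"], round(pred["probability"] * 100, 1))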

So let’s go to the full code:

# Bruno Capuano
# open camera with openCV
# analyze camera frame with local docker custom vision project
# draw bounding boxes for each recognized object

import socket
import time
import threading
import cv2
import urllib
import json
import requests
import os
from flask import Flask, request, jsonify

def receiveData():
    global response
    while True:
        try:
            response, _ = clientSocket.recvfrom(1024)
        except:
            break

def readStates():
    global battery, pitch
    while True:
        try:
            response_state, _ = stateSocket.recvfrom(256)
            if response_state != 'ok':
                response_state = response_state.decode('ASCII')
                list = response_state.replace(';', ':').split(':')
                battery = int(list[21])
                pitch = int(list[1])
        except:
            break

def sendCommand(command):
    global response
    timestamp = int(time.time() * 1000)
    clientSocket.sendto(command.encode('utf-8'), address)
    while response is None:
        if (time.time() * 1000) - timestamp > 5 * 1000:
            return False
    return response

def sendReadCommand(command):
    response = sendCommand(command)
    try:
        response = str(response)
    except:
        pass
    return response

def sendControlCommand(command):
    response = None
    for i in range(0, 5):
        response = sendCommand(command)
        if response == 'OK' or response == 'ok':
            return True
    return False

# -----------------------------------------------
# Local calls
# -----------------------------------------------
probabilityThreshold = 75

def displayPredictions(jsonPrediction, frame, frameImageFileName):
    global camera_Width, camera_Heigth
    jsonObj = json.loads(jsonPrediction)
    preds = jsonObj['predictions']
    sorted_preds = sorted(preds, key=lambda x: x['probability'], reverse=True)
    strSortedPreds = ""
    resultFound = False
    if sorted_preds:
        # open img to save results
        img = cv2.imread(frameImageFileName)
        detected = False
        for pred in sorted_preds:
            # tag name and prob * 100
            tagName = str(pred['tagName'])
            probability = pred['probability'] * 100
            # apply threshold
            if probability >= probabilityThreshold:
                detected = True
                bb = pred['boundingBox']
                resize_factor = 100
                height = int(bb['height'] * resize_factor)
                left = int(bb['left'] * resize_factor)
                top = int(bb['top'] * resize_factor)
                width = int(bb['width'] * resize_factor)
                print(f'height = {height} - left {left} - top {top} - width {width}')
                # adjust to size
                height = int(height * camera_Heigth / 100)
                left = int(left * camera_Width / 100)
                top = int(top * camera_Heigth / 100)
                width = int(width * camera_Width / 100)
                print(f'Adjusted height = {height} - left {left} - top {top} - width {width}')
                # draw bounding boxes
                start_point = (top, left)
                end_point = (top + height, left + width)
                print(f'MVP - {probability}')
                print(f'start point: {start_point} - end point: {end_point}')
                color = (255, 0, 0)
                thickness = 2
                cv2.rectangle(img, start_point, end_point, color, thickness)
                print(jsonPrediction)
        # save the detected image and the json prediction
        if detected:
            detImageFileName = frameImageFileName.replace('tmp', 'det')
            cv2.imwrite(detImageFileName, img)
            detJsonFileName = detImageFileName.replace('png', 'json')
            save_text = open(detJsonFileName, 'w')
            save_text.write(jsonStr)
            save_text.close()
    return strSortedPreds

# instantiate flask app and push a context
app = Flask(__name__)

# -----------------------------------------------
# Main program
# -----------------------------------------------

# connection info
UDP_IP = '192.168.10.1'
UDP_PORT = 8889
last_received_command = time.time()
STATE_UDP_PORT = 8890
address = (UDP_IP, UDP_PORT)
response = None
response_state = None

clientSocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
clientSocket.bind(('', UDP_PORT))
stateSocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
stateSocket.bind(('', STATE_UDP_PORT))

# start threads
recThread = threading.Thread(target=receiveData)
recThread.daemon = True
recThread.start()
stateThread = threading.Thread(target=readStates)
stateThread.daemon = True
stateThread.start()

# connect to drone
response = sendControlCommand("command")
print(f'command response: {response}')
response = sendControlCommand("streamon")
print(f'streamon response: {response}')

# drone information
battery = 0
pitch = 0

# open UDP
print(f'opening UDP video feed, wait 2 seconds ')
videoUDP = 'udp://192.168.10.1:11111'
cap = cv2.VideoCapture(videoUDP)
time.sleep(2)

camera_Width = 640
camera_Heigth = 480

# open
i = 0
while True:
    i = i + 1
    imgNumber = str(i).zfill(5)
    start_time = time.time()
    sendReadCommand('battery?')
    print(f'battery: {battery} % - pitch: {pitch} - i: {imgNumber}')

    try:
        ret, frame = cap.read()
        img = cv2.resize(frame, (camera_Width, camera_Heigth))

        # save image to disk and open it
        frameImageFileName = str(f'tmp\image{imgNumber}.png')
        cv2.imwrite(frameImageFileName, img)
        with open(frameImageFileName, 'rb') as f:
            img_data = f.read()

        # analyze file in local container
        api_url = "http://127.0.0.1:8070/image"
        r = requests.post(api_url, data=img_data)
        with app.app_context():
            jsonResults = jsonify(r.json())
            jsonStr = jsonResults.get_data(as_text=True)
        displayPredictions(jsonStr, frame, frameImageFileName)

        fpsInfo = ""
        if (time.time() - start_time) > 0:
            fpsInfo = "FPS: " + str(1.0 / (time.time() - start_time))  # FPS = 1 / time to process loop
            font = cv2.FONT_HERSHEY_DUPLEX
            cv2.putText(img, fpsInfo, (10, 20), font, 0.4, (255, 255, 255), 1)

        cv2.imshow('@elbruno - DJI Tello Camera', img)
    except Exception as e:
        print(f'exc: {e}')
        pass

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

response = sendControlCommand("streamoff")
print(f'streamoff response: {response}')

And if you want to see this up and running, it's much better to watch it in a video (start at ):

The complete source code can be found here https://github.com/elbruno/events/tree/master/2020%2004%2018%20Global%20AI%20On%20Tour%20MTY%20Drone%20AI%20Mex

Happy coding!

Greetings

El Bruno
