Jetson 20 Color Tracking Based on OpenCV

This section introduces how to implement color tracking using OpenCV, with the camera pan-tilt following the detected object.

What is MediaPipe?

MediaPipe is an open-source framework developed by Google for building machine learning-based multimedia processing applications. It provides a set of tools and libraries for processing video, audio, and image data, and applies machine learning models to achieve various functionalities such as pose estimation, gesture recognition, and face detection. MediaPipe is designed to offer efficient, flexible, and easy-to-use solutions, enabling developers to quickly build a variety of multimedia processing applications.

Preparation

The product automatically runs the main program at startup, and that program occupies the camera resource, so this tutorial cannot run while the main program is active. Either terminate the main program, or disable its automatic startup and then restart the robot.
Note that the robot's main program is multi-threaded and is configured through crontab to run automatically at startup, so the usual command sudo killall python typically does not work. The steps below use sudo killall -9 python instead.
If you have already disabled the automatic startup of the robot's main demo, you can skip the Terminate the Main Demo section below.

Terminate the Main Demo

1. Click the + icon next to the tab for this page to open a new tab called "Launcher."

2. Click on Terminal under Other to open a terminal window.

3. Type bash into the terminal window and press Enter.

4. Now you can use the Bash Shell to control the robot.

5. Enter the command: sudo killall -9 python.
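
The text above mentions that the main program is started through crontab. As a minimal sketch of what disabling that automatic startup amounts to, the helper below comments out every active @reboot entry in a piece of crontab text. The function name and the sample crontab content are hypothetical; the real entry on your robot will contain different paths.

```python
def disable_reboot_entries(crontab_text: str) -> str:
    """Return crontab text with every active @reboot entry commented out."""
    out = []
    for line in crontab_text.splitlines():
        if line.lstrip().startswith("@reboot"):
            # Commenting the line keeps it available for re-enabling later.
            out.append("# " + line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical crontab content; the real paths on your robot will differ.
sample = "@reboot /path/to/python /path/to/app.py\n0 3 * * * /usr/bin/true\n"
print(disable_reboot_entries(sample), end="")
```

In practice the same change is usually made by hand: crontab -l lists the current entries, and crontab -e opens them in an editor where the @reboot line can be commented out with a leading #.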

Demo

The following code block can be run directly:

1. Select the code block below.

2. Press Shift + Enter to run the code block.

3. Watch the real-time video window.

4. Press STOP to close the real-time video and release the camera resources.

If you cannot see the real-time camera feed when running:

Click on Kernel -> Shut down all kernels above.
Close the current section tab and open it again.
Click STOP to release the camera resources, then run the code block again.
Reboot the device.
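
The camera stays occupied until the background thread that reads it exits, which is why STOP has to be pressed before the code block is run again. The following is a minimal sketch of that pattern under stated assumptions: FakeCamera is a hypothetical stand-in for the real camera object, and a threading.Event plays the role of the STOP button.

```python
import threading
import time


class FakeCamera:
    """Hypothetical stand-in for a camera object such as Picamera2."""

    def __init__(self):
        self.is_open = True
        self.frames = 0

    def capture(self):
        self.frames += 1
        return self.frames

    def close(self):
        self.is_open = False


def capture_loop(camera, stop_event):
    try:
        # Keep capturing until the stop flag is set.
        while not stop_event.is_set():
            camera.capture()
            time.sleep(0.01)
    finally:
        # Runs on normal exit and on exceptions, so the camera is always released.
        camera.close()


camera = FakeCamera()
stop_event = threading.Event()
worker = threading.Thread(target=capture_loop, args=(camera, stop_event))
worker.start()
time.sleep(0.1)
stop_event.set()  # plays the role of pressing STOP
worker.join()
print("camera open:", camera.is_open)
```

If the loop never ends (for example, because the kernel was interrupted in another way), the camera is never closed, and the next run cannot open it. Shutting down the kernel or rebooting forces the release.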

Running

In this tutorial, the camera pan-tilt will rotate. Make sure your hands and any fragile objects are kept away from the pan-tilt's rotation radius.

The demo detects a blue ball by default. Make sure there are no other blue objects in the background of the picture, since they would interfere with color recognition. You can change the detection color through secondary development by editing the HSV bounds color_lower and color_upper in the code.
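
When changing the detection color, it helps to know where a color sits in OpenCV's 8-bit HSV space, where hue runs from 0 to 179 and saturation and value run from 0 to 255. The sketch below uses only the standard library to convert an RGB color to that scale and to build a pair of bounds around its hue. The helper names and the default tolerance values are assumptions for illustration, not values taken from the product.

```python
import colorsys


def rgb_to_opencv_hsv(r, g, b):
    """Convert 8-bit RGB to OpenCV-style HSV (H: 0-179, S and V: 0-255)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255))


def hsv_bounds(r, g, b, h_tol=15, s_min=120, v_min=90):
    """Build (lower, upper) HSV bounds around the hue of an RGB color.

    h_tol, s_min and v_min are illustrative defaults, not product values.
    Hue wrap-around (needed for red, near hue 0) is not handled here.
    """
    h, _, _ = rgb_to_opencv_hsv(r, g, b)
    lower = (max(h - h_tol, 0), s_min, v_min)
    upper = (min(h + h_tol, 179), 255, 255)
    return lower, upper


print(rgb_to_opencv_hsv(0, 0, 255))  # pure blue
print(hsv_bounds(0, 255, 0))         # bounds around pure green
```

The resulting tuples can be wrapped in np.array(...) and used in place of color_lower and color_upper. Real objects are rarely pure colors, so expect to widen or shift the bounds after testing under your own lighting.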

import math
import threading

import cv2
import imutils  # provides grab_contours()
import numpy as np
import ipywidgets as widgets
from IPython.display import display, Image
from picamera2 import Picamera2

# Stop button
# ================
stopButton = widgets.ToggleButton(
    value=False,
    description='Stop',
    disabled=False,
    button_style='danger', # 'success', 'info', 'warning', 'danger' or ''
    tooltip='Description',
    icon='square' # (FontAwesome names without the `fa-` prefix)
)


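# Gimbal state and tuning values used by gimbal_track().
# NOTE: the original listing does not define these names. The values below
# are assumptions; adjust them to match your robot and firmware.
gimbal_x, gimbal_y = 0, 0
track_spd_rate = 60   # assumed: speed per pixel of distance
track_acc_rate = 0.4  # assumed: acceleration per pixel of distance
CMD_GIMBAL = 133      # assumed gimbal command ID; check your JSON command list

# `base` is the controller object that sends JSON commands to the robot.
# Assumed setup, as in the product's other tutorials; adjust the serial port.
from base_ctrl import BaseController
base = BaseController('/dev/ttyTHS0', 115200)


def gimbal_track(fx, fy, gx, gy, iterate):
    """Turn the gimbal so the target (gx, gy) moves toward the frame center (fx, fy)."""
    global gimbal_x, gimbal_y
    distance = math.sqrt((fx - gx) ** 2 + (gy - fy) ** 2)
    # Proportional step: the farther the target is from the center, the larger the move.
    gimbal_x += (gx - fx) * iterate
    gimbal_y += (fy - gy) * iterate
    # Clamp to the pan-tilt's angle limits.
    gimbal_x = max(-180, min(180, gimbal_x))
    gimbal_y = max(-30, min(90, gimbal_y))
    # Speed and acceleration scale with the distance, with a floor of 1.
    gimbal_spd = max(1, int(distance * track_spd_rate))
    gimbal_acc = max(1, int(distance * track_acc_rate))
    base.base_json_ctrl({"T": CMD_GIMBAL, "X": gimbal_x, "Y": gimbal_y, "SPD": gimbal_spd, "ACC": gimbal_acc})
    return distance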


# Display function
# ================
def view(button):
    picam2 = Picamera2()
    picam2.configure(picam2.create_video_configuration(main={"format": 'XRGB8888', "size": (640, 480)}))
    picam2.start()
    display_handle=display(None, display_id=True)

    color_upper = np.array([120, 255, 220])
    color_lower = np.array([ 90, 120,  90])
    min_radius = 12
    track_color_iterate = 0.023
    
    while True:
        frame = picam2.capture_array()
        # frame = cv2.flip(frame, 1) # if your camera reverses your image

        # uncomment this line if you are using USB camera
        # frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

        img = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
        blurred = cv2.GaussianBlur(img, (11, 11), 0)
        hsv = cv2.cvtColor(blurred, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, color_lower, color_upper)
        mask = cv2.erode(mask, None, iterations=5)
        mask = cv2.dilate(mask, None, iterations=5)

        cnts = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL,
            cv2.CHAIN_APPROX_SIMPLE)
        cnts = imutils.grab_contours(cnts)
        center = None

        height, width = img.shape[:2]
        center_x, center_y = width // 2, height // 2

        if len(cnts) > 0:
            # find the largest contour in the mask, then use
            # it to compute the minimum enclosing circle and
            # centroid
            c = max(cnts, key=cv2.contourArea)
            ((x, y), radius) = cv2.minEnclosingCircle(c)
            M = cv2.moments(c)

            # only proceed if the radius meets a minimum size and the
            # contour area is non-zero (avoids a division by zero)
            if radius > min_radius and M["m00"] > 0:
                center = (int(M["m10"] / M["m00"]), int(M["m01"] / M["m00"]))
                distance = gimbal_track(center_x, center_y, center[0], center[1], track_color_iterate)
                # draw the detected circle on the frame that is displayed
                cv2.circle(frame, (int(x), int(y)), int(radius), (128, 255, 255), 1)

        _, frame = cv2.imencode('.jpeg', frame)
        display_handle.update(Image(data=frame.tobytes()))
        if stopButton.value==True:
            picam2.close()
            display_handle.update(None)
            break  # leave the loop so the thread ends after the camera is released
            
            
# Run
# ================
display(stopButton)
thread = threading.Thread(target=view, args=(stopButton,))
thread.start()
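
The tracking logic in gimbal_track is a proportional update: each frame, the pan and tilt angles move by the pixel offset between the target and the frame center, multiplied by a small gain (track_color_iterate, 0.023 in the demo), and are then clamped to the angle limits. The sketch below reproduces that update as a pure function and runs it against a simulated target, without any hardware. PIXELS_PER_DEGREE and the target angles are hypothetical values chosen only to make the simulation run.

```python
def clamp(value, low, high):
    return max(low, min(high, value))


def track_step(pan, tilt, frame_center, target, gain=0.023):
    """One proportional tracking step, mirroring the update in gimbal_track."""
    fx, fy = frame_center
    gx, gy = target
    pan = clamp(pan + (gx - fx) * gain, -180, 180)
    tilt = clamp(tilt + (fy - gy) * gain, -30, 90)
    return pan, tilt


# Hypothetical camera model: the target's pixel offset is proportional
# to the angle between the gimbal direction and the target direction.
PIXELS_PER_DEGREE = 10.0
target_pan, target_tilt = 20.0, 10.0
pan, tilt = 0.0, 0.0
center = (320, 240)

for _ in range(30):
    # Where the target appears in the image for the current gimbal angles.
    gx = center[0] + (target_pan - pan) * PIXELS_PER_DEGREE
    gy = center[1] - (target_tilt - tilt) * PIXELS_PER_DEGREE
    pan, tilt = track_step(pan, tilt, center, (gx, gy))

print(round(pan, 2), round(tilt, 2))
```

Under this simplified model the remaining error shrinks by a constant factor each frame, so the gimbal settles on the target. A larger gain converges faster but can overshoot and oscillate, which is a reason to keep the gain small.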
