23 Pose Detection with MediaPipe

From Waveshare Wiki

This section describes how to implement pose detection using MediaPipe + OpenCV.

What is MediaPipe?

MediaPipe is an open-source framework developed by Google for building machine learning-based multimedia processing applications. It provides a set of tools and libraries for processing video, audio, and image data, and applies machine learning models to achieve various functionalities such as pose estimation, gesture recognition, and face detection. MediaPipe is designed to offer efficient, flexible, and easy-to-use solutions, enabling developers to quickly build a variety of multimedia processing applications.

Preparation

The product runs its main program automatically at startup, and that program occupies the camera. While it is running, the code in this tutorial cannot access the camera, so you need to terminate the main program, or disable its automatic startup and then reboot the robot.
Because the robot's main program is multi-threaded and is started at boot through crontab, the usual command sudo killall python typically does not work. This tutorial therefore shows how to disable the automatic startup of the main program instead.
If you have already disabled the automatic startup of the robot's main demo, you can skip the Terminate the Main Demo section.
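
Before changing anything, it can help to check whether the main program is actually running. The following is a minimal sketch, not part of the robot's software: it assumes a Linux system that exposes /proc, and the helper names decode_cmdline and find_processes are made up for this illustration. It only lists matching processes and does not stop them.

```python
import os


def decode_cmdline(raw):
    """Turn the NUL-separated bytes of /proc/<pid>/cmdline into a readable string."""
    return raw.replace(b"\x00", b" ").decode("utf-8", "replace").strip()


def find_processes(needle):
    """Return a list of (pid, command line) whose command line contains `needle`."""
    matches = []
    proc_root = "/proc"
    if not os.path.isdir(proc_root):
        return matches  # No /proc available (not a Linux system)
    for entry in os.listdir(proc_root):
        if not entry.isdigit() or int(entry) == os.getpid():
            continue  # Skip non-process entries and this script itself
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                cmdline = decode_cmdline(f.read())
        except OSError:
            continue  # The process exited or is not readable
        if needle in cmdline:
            matches.append((int(entry), cmdline))
    return matches


# Assumption: the main program shows up with "ugv_pt_rpi/app.py" in its command line
running = find_processes("ugv_pt_rpi/app.py")
print("main program running:", bool(running))
```

If the list is empty, the main program is probably not holding the camera and the steps below may not be needed.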

Terminate the Main Demo

1. Click the "+" icon next to the tab for this page to open a new tab called "Launcher."
2. Click on "Terminal" under "Other" to open a terminal window.
3. Type bash into the terminal window and press Enter.
4. Now you can use the Bash Shell to control the robot.
5. Enter the command: crontab -e.
6. If prompted to choose an editor, enter 1 and press Enter to select nano.
7. After opening the crontab configuration file, you'll see the following two lines:
@reboot ~/ugv_pt_rpi/ugv-env/bin/python ~/ugv_pt_rpi/app.py >> ~/ugv.log 2>&1
@reboot /bin/bash ~/ugv_pt_rpi/start_jupyter.sh >> ~/jupyter_log.log 2>&1
8. Add a # character at the beginning of the line with ……app.py >> …… to comment out this line.
#@reboot ~/ugv_pt_rpi/ugv-env/bin/python ~/ugv_pt_rpi/app.py >> ~/ugv.log 2>&1
@reboot /bin/bash ~/ugv_pt_rpi/start_jupyter.sh >> ~/jupyter_log.log 2>&1
9. Press Ctrl + X in the terminal window to exit. nano asks "Save modified buffer?"; enter Y, then press Enter to save the changes.
10. Enter the reboot command: sudo reboot. This temporarily closes the current JupyterLab session. If you did not comment out the ……start_jupyter.sh >>…… line in the previous steps, you can still use JupyterLab normally after the robot reboots (JupyterLab and the robot's main program app.py run independently). You may need to refresh the page.
11. Note that the lower machine keeps communicating with the upper machine through the serial port, and the continuously changing serial port levels may prevent the host from starting up properly during the restart. If the upper machine is a Raspberry Pi and, after it shuts down, the green LED stays on steadily without blinking, turn the robot's power switch off and then on again, and the robot will restart normally.
12. Wait for the device to restart. During the restart the green LED of the Raspberry Pi blinks; when it blinks less frequently or goes out, startup has succeeded. Then refresh the page and continue with the rest of this tutorial.
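
The crontab edit described in the steps above can be sketched as a small text transformation. This is only an illustration under stated assumptions: the function name comment_out_entry is hypothetical, and the sketch works on a string holding the crontab contents rather than on the crontab itself.

```python
def comment_out_entry(crontab_text, marker="app.py"):
    """Prefix '#' to every active @reboot line that mentions `marker`.

    Lines that are already commented out are left untouched, so the
    function can be applied repeatedly without stacking '#' characters.
    """
    lines = []
    for line in crontab_text.splitlines():
        if marker in line and line.lstrip().startswith("@reboot"):
            line = "#" + line
        lines.append(line)
    return "\n".join(lines) + "\n"


# The two entries shown in the steps above
original = (
    "@reboot ~/ugv_pt_rpi/ugv-env/bin/python ~/ugv_pt_rpi/app.py >> ~/ugv.log 2>&1\n"
    "@reboot /bin/bash ~/ugv_pt_rpi/start_jupyter.sh >> ~/jupyter_log.log 2>&1\n"
)
print(comment_out_entry(original))
```

Only the app.py line is changed, so JupyterLab still starts at boot. In practice the text could be read with crontab -l and written back with crontab -, but editing by hand with crontab -e as described above is the safer route.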

Example

The following code block can be executed directly:

1. Select the code block below.
2. Press Shift + Enter to run the code block.
3. Watch the real-time video window.
4. Press STOP to close the real-time video and release the camera resources.
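
The stop-and-release behaviour in step 4 follows a common pattern: a background thread loops until a flag is set, then closes the camera and exits. The sketch below shows that pattern in isolation. FakeCamera is a hypothetical stand-in so the code runs without any hardware, and threading.Event plays the role of the STOP button; neither is taken from the example in this tutorial.

```python
import threading
import time


class FakeCamera:
    """Hypothetical stand-in for a real camera so the sketch runs anywhere."""

    def __init__(self):
        self.closed = False
        self.frames_read = 0

    def capture(self):
        if self.closed:
            raise RuntimeError("camera is closed")
        self.frames_read += 1
        return self.frames_read

    def close(self):
        self.closed = True


def stream(camera, stop_event):
    """Read frames until stop_event is set, then always release the camera."""
    try:
        while not stop_event.is_set():
            camera.capture()
            time.sleep(0.01)  # Stand-in for processing and displaying a frame
    finally:
        camera.close()  # Runs even if capture() raises an exception


camera = FakeCamera()
stop_event = threading.Event()
worker = threading.Thread(target=stream, args=(camera, stop_event))
worker.start()

time.sleep(0.05)          # Let the loop run briefly
stop_event.set()          # Equivalent to pressing STOP
worker.join(timeout=1.0)  # Wait for the thread to finish
print("camera released:", camera.closed)
```

Leaving the loop before touching the camera again is the important part: once the camera is closed, another capture would fail.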

If the real-time camera view is not visible during execution

  • Click Kernel - Shut Down All Kernels in the menu bar above.
  • Close the current chapter tab and reopen it.
  • Click STOP to release the camera resources, then run the code block again.
  • Reboot the device.

Note

If you are using a USB camera, you need to uncomment the line frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) in the code block.

Features

When the code block runs normally, MediaPipe will automatically mark the joints of the human body when there is a face in the frame.

import cv2  # Import the OpenCV library for image processing
import imutils, math  # Auxiliary libraries for image processing and mathematical operations
from picamera2 import Picamera2  # Library to access the Raspberry Pi Camera
from IPython.display import display, Image  # Library to display images in Jupyter Notebook
import ipywidgets as widgets  # Library for creating interactive widgets, such as buttons
import threading  # Library for creating new threads for asynchronous execution of tasks
import mediapipe as mp  # Import the MediaPipe library for pose detection

# Create a "Stop" button that users can click to stop the video stream
# ================
stopButton = widgets.ToggleButton(
    value=False,
    description='Stop',
    disabled=False,
    button_style='danger', # 'success', 'info', 'warning', 'danger' or ''
    tooltip='Description',
    icon='square' # (FontAwesome names without the `fa-` prefix)
)

# Initialize MediaPipe's drawing tools and pose detection model
mpDraw = mp.solutions.drawing_utils


# Create the MediaPipe pose detection model
mp_pose = mp.solutions.pose
pose = mp_pose.Pose(static_image_mode=False, 
                    model_complexity=1, 
                    smooth_landmarks=True, 
                    min_detection_confidence=0.5, 
                    min_tracking_confidence=0.5)

#  Define the display function to process video frames and perform pose detection
def view(button):
    picam2 = Picamera2()  # Create an instance of Picamera2
    picam2.configure(picam2.create_video_configuration(main={"format": 'XRGB8888', "size": (640, 480)}))  # Configure camera parameters
    picam2.start()  # Start the camera
    display_handle=display(None, display_id=True)  # Create a display handle to update the displayed image
    
    while True:
        frame = picam2.capture_array()
        # frame = cv2.flip(frame, 1) # if your camera reverses your image

        # uncomment this line if you are using USB camera
        # frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

        img = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
        
        results = pose.process(img) # Use MediaPipe to process the image and get pose detection results

        # If pose landmarks are detected
        if results.pose_landmarks:
            frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)  #  Convert the image from RGB to BGR for drawing
            mpDraw.draw_landmarks(frame, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)  # Use MediaPipe's drawing tools to draw pose landmarks and connections
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # Convert the image back from BGR to RGB for display
            
        _, frame = cv2.imencode('.jpeg', frame)  # Encode the processed frame into JPEG format
        display_handle.update(Image(data=frame.tobytes()))  # Update the displayed image
        if stopButton.value:  # Check if the "Stop" button is pressed
            picam2.close()  # If yes, close the camera
            display_handle.update(None)  # Clear the displayed content
            break  # Leave the loop so no frame is read from the closed camera

# Display the "Stop" button and start a thread to run the display function
display(stopButton)
thread = threading.Thread(target=view, args=(stopButton,))
thread.start()  # Start the thread
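
Beyond drawing the skeleton, the detected landmarks can be used for simple measurements. MediaPipe reports each pose landmark with x and y values normalized to the image width and height. The sketch below shows one possible way to convert such values to pixel coordinates and to compute the angle at a joint. The helper names to_pixel and joint_angle are hypothetical, and the coordinates in the example are made-up values, not real detection results.

```python
import math


def to_pixel(x_norm, y_norm, width, height):
    """Convert normalized landmark coordinates to pixel coordinates.

    The result is clamped to the image, since a landmark may lie
    outside the visible frame.
    """
    px = min(max(int(x_norm * width), 0), width - 1)
    py = min(max(int(y_norm * height), 0), height - 1)
    return px, py


def joint_angle(a, b, c):
    """Return the angle in degrees at point b, formed by the segments b-a and b-c."""
    angle = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    angle = abs(angle)
    if angle > 180:
        angle = 360 - angle  # Always report the smaller of the two angles
    return angle


# Made-up normalized coordinates for shoulder, elbow and wrist in a 640x480 frame
shoulder = to_pixel(0.50, 0.30, 640, 480)
elbow = to_pixel(0.50, 0.50, 640, 480)
wrist = to_pixel(0.70, 0.50, 640, 480)
print("elbow angle:", joint_angle(shoulder, elbow, wrist))
```

In the example above, the inputs would come from entries such as results.pose_landmarks.landmark[mp_pose.PoseLandmark.LEFT_ELBOW], using their x and y attributes.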