Video Processing Guide¶
This document provides instructions for using the psifx
package for video inference and visualization, including manipulation, pose, and face.
Manipulation¶
Process Video¶
Extract a specific time window from a video, crop it based on a bounding box, and optionally rescale the video.
psifx video manipulation process \
--in_video Video.mp4 \
--out_video VideoProcessed.mp4 \
--start 18 \
--end 210 \
--x_min 1347 \
--y_min 459 \
--x_max 2553 \
--y_max 1898 \
[--width 1920] \
[--height 1080]
--in_video
: Path to the input video file, such as/path/to/video.mp4
(or.avi
,.mkv
, etc.).--out_video
: Path to the output video file, such as/path/to/video.mp4
(or.avi
,.mkv
, etc.).--start
: Timestamp in seconds for the start of the selection.--end
: Timestamp in seconds for the end of the selection.--x_min
: X-axis coordinate of the top-left corner for cropping in pixels.--y_min
: Y-axis coordinate of the top-left corner for cropping in pixels.--x_max
: X-axis coordinate of the bottom-right corner for cropping in pixels.--y_max
: Y-axis coordinate of the bottom-right corner for cropping in pixels.--width
: Width of the resized output video, defaultNone
.--height
: Height of the resized output video, defaultNone
.
Pose¶
Pose Estimation¶
Detects and analyzes human poses in a video using MediaPipe.
psifx video pose mediapipe inference \
--video Video.mp4 \
--poses Poses.tar.xz \
[--masks Masks.mp4] \
[--mask_threshold 0.1] \
[--model_complexity 2] \
[--smooth] \
[--device cuda]
--video
: Input video file for pose estimation, can be.mp4
,.avi
,.mkv
, etc…--poses
: Path to save pose estimation data in.tar.xz
format.--masks
: Path to save a video mask showing detected body parts.--mask_threshold
: Threshold for the binarization of the segmentation mask, default0.1
.--model_complexity
: Complexity of the model: {0
,1
,2
}, higher means more FLOPs, but also more accurate results, default2
.--smooth
: Temporally smooth the inference results to reduce the jitter, defaultTrue
.--device
: Device on which to run the inference, eithercpu
orcuda
, defaultcpu
.
Pose Visualization¶
Creates a visual representation of detected poses in the video.
psifx video pose mediapipe visualization \
--video Video.mp4 \
--poses Poses.tar.xz \
--visualization Visualization.mediapipe.mp4 \
[--confidence_threshold 0.0]
--video
: Original video file for pose visualization overlay.--poses
: Pose estimation data file in.tar.xz
format.--visualization
: Path to save the visualized output video.--confidence_threshold
: Threshold for not displaying low confidence keypoints, default0.0
.
Face¶
Requirements¶
The face feature extraction requires OpenFace. We provide an API endpoint to use OpenFace, useable only if you comply with their license agreement, e.g. academic, research or non-commercial purposes.
Install the following system-wide:
sudo apt install \ build-essential \ cmake \ wget \ libopenblas-dev \ libopencv-dev \ libdlib-dev \ libboost-all-dev \ libsqlite3-dev
Install OpenFace using our fork.
wget https://raw.githubusercontent.com/GuillaumeRochette/OpenFace/master/install.py && \ python install.py
Face Feature Extraction¶
Extracts facial features from a video using OpenFace. To use OpenFace you need to install it beforehand follow the installation instructions.
psifx video face openface inference \
--video Video.mp4 \
--features Faces.tar.xz
--video
: Input video file for face feature extraction.--features
: Path to save extracted facial features in.tar.xz
format.
Face Visualization¶
Creates a visual overlay of facial features detected in the video.
psifx video face openface visualization \
--video Video.mp4 \
--features Faces.tar.xz \
--visualization Visualization.openface.mp4 \
[--depth 3.0] \
[--f_x 1600.0] [--f_y 1600.0] \
[--c_x 960.0] [--c_y 540.0]
--video
: Original video file for face feature visualization overlay.--features
: Extracted facial feature data file in.tar.xz
format.--visualization
: Path to save the visualized output video.--depth
: Projection: assumed static depth of the subject in meters, default3.0
.--f_x
,--f_y
: Projection: x-axis (respectively y-axis) of the focal length, defaultNone
.--c_x
,--c_y
: Projection: x-axis (respectively y-axis) of the principal point, defaultNone
.