Hi py-feat team,
I have been working on optimizing py-feat for video processing on CPU/low-resource environments.
Currently, the detector runs heavy models (like img2pose or RetinaFace) on every single frame, which is computationally expensive even when face positions are predictable.
The Solution
I implemented a Hybrid Tracking Engine that decouples "Search" (Detection) from "Tracking."
- Kalman Filter Integration: Uses filterpy to track face coordinates across frames.
- Hybrid Switching: Added a detection_interval argument. The heavy detector only runs every N frames (e.g., 5) to correct the tracker. Intermediate frames are predicted by the Kalman filter on the CPU.
- Adaptive Padding: Implemented a safety crop (20% padding) to prevent the "drifting crop" issue, where landmark detection fails if the tracked box is slightly misaligned.
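To make the three pieces above concrete, here is a minimal, dependency-free sketch of the control flow. It is not the code from my fork: `ConstantVelocityTracker` is a hypothetical stand-in that only extrapolates linearly (the real implementation uses a filterpy Kalman filter), `track_faces` and `pad_box` are illustrative names, the sketch assumes exactly one face per frame, and it assumes the 20% padding is applied to each side of the box.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Box = Tuple[float, ...]  # (x1, y1, x2, y2) in pixels


def pad_box(box: Box, frame_w: int, frame_h: int, padding: float = 0.20) -> Box:
    """Grow a box by `padding` of its width/height per side, clipped to the frame."""
    x1, y1, x2, y2 = box
    dx = (x2 - x1) * padding
    dy = (y2 - y1) * padding
    return (max(0.0, x1 - dx), max(0.0, y1 - dy),
            min(float(frame_w), x2 + dx), min(float(frame_h), y2 + dy))


class ConstantVelocityTracker:
    """Hypothetical stand-in for the Kalman filter: linear extrapolation only."""

    def __init__(self) -> None:
        self.box: Optional[Box] = None
        self.last_measured: Optional[Box] = None
        self.velocity: Box = (0.0, 0.0, 0.0, 0.0)

    def correct(self, measured: Box, frames_elapsed: int) -> None:
        # Re-estimate per-frame velocity from the last two detections.
        if self.last_measured is not None and frames_elapsed > 0:
            self.velocity = tuple(
                (m - p) / frames_elapsed
                for m, p in zip(measured, self.last_measured)
            )
        self.last_measured = measured
        self.box = measured

    def predict(self) -> Box:
        # Advance the current estimate by one frame.
        self.box = tuple(b + v for b, v in zip(self.box, self.velocity))
        return self.box


def track_faces(frames: Iterable, detect: Callable[[object], Box],
                detection_interval: int = 5) -> List[Box]:
    """Run `detect` every `detection_interval` frames; predict in between."""
    tracker = ConstantVelocityTracker()
    boxes: List[Box] = []
    last_detection: Optional[int] = None
    for i, frame in enumerate(frames):
        if i % detection_interval == 0:
            elapsed = 0 if last_detection is None else i - last_detection
            tracker.correct(detect(frame), elapsed)
            last_detection = i
            boxes.append(tracker.box)
        else:
            boxes.append(tracker.predict())
    return boxes


# Synthetic demo: the "frame" is just its index, and the face drifts 2 px/frame.
calls: List[int] = []


def fake_detect(frame_index: int) -> Box:
    calls.append(frame_index)
    x = 100.0 + 2.0 * frame_index
    return (x, 50.0, x + 40.0, 90.0)


boxes = track_faces(range(12), fake_detect, detection_interval=5)
crops = [pad_box(b, frame_w=640, frame_h=480) for b in boxes]
print(f"detector calls: {len(calls)} of {len(boxes)} frames")
```

The padded crop is what would be handed to the landmark stage, so a tracked box that lags the face by a few pixels still contains it.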
Benchmark Results
Tested on a simulated workload (0.1s load per detection) over 50 frames:
1. Running BASELINE (Interval=1)...
-> Time: 31.91s | FPS: 1.57
2. Running HYBRID (Interval=5)...
-> Time: 27.63s | FPS: 1.81
SPEEDUP ACHIEVED: 13.4%
PERFORMANCE BOOST: 1.2x Faster
Note: The speedup affects the Detection stage. The Analysis stage (Landmarks/AUs) still runs on every frame to ensure accuracy.
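For anyone checking the numbers, the short script below recomputes the reported figures from the two wall-clock times. It also estimates how much time skipping detections should save, under the assumption (mine, for illustration) that the detector runs on frame 0 and then on every `interval`-th frame. The helper name `detector_calls` is hypothetical.

```python
FRAMES = 50
SIMULATED_DETECTION_COST_S = 0.1  # simulated load per detection, from the benchmark
BASELINE_S = 31.91                # Interval=1
HYBRID_S = 27.63                  # Interval=5


def detector_calls(frames: int, interval: int) -> int:
    """Number of detector runs, assuming frame 0 and every `interval`-th frame."""
    return len(range(0, frames, interval))


baseline_fps = FRAMES / BASELINE_S
hybrid_fps = FRAMES / HYBRID_S
time_saved_s = BASELINE_S - HYBRID_S
time_reduction_pct = 100.0 * time_saved_s / BASELINE_S
speedup_ratio = BASELINE_S / HYBRID_S

# Detections skipped by the hybrid run, times the simulated cost of each one.
skipped = detector_calls(FRAMES, 1) - detector_calls(FRAMES, 5)
expected_saving_s = skipped * SIMULATED_DETECTION_COST_S

print(f"FPS: {baseline_fps:.2f} -> {hybrid_fps:.2f}")
print(f"time reduction: {time_reduction_pct:.1f}%  ratio: {speedup_ratio:.2f}x")
print(f"saved: {time_saved_s:.2f}s observed vs {expected_saving_s:.1f}s expected")
```

The 13.4% figure is the reduction in wall-clock time, and the 1.2x figure is the time ratio (about 1.15) rounded to one decimal. Under the stated assumption, skipping 40 of 50 detections should save about 4.0s, which is close to the 4.28s observed. The gain is bounded because the Analysis stage, which runs on every frame, accounts for most of the total time.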
Code
You can view the implementation in my fork here: [Link to feat/detector.py]
Next Steps
I have this working and tested on my fork. Would you be open to a PR for this feature?