Summary
Propose adding a new example notebook that implements an interactive, OpenCV-based GUI for Video Object Segmentation (VOS) using SAM2VideoPredictor. This would make it easier to segment and track objects across video frames interactively.
Motivation
Existing examples lack a direct, interactive demonstration of SAM-2's advanced VOS and object tracking capabilities. This new example will:
- Showcase Temporal Tracking: Directly utilize SAM2VideoPredictor's state-aware inference (inference_state, obj_id) for robust, continuous tracking (see the API sketch after this list).
- Enable User Refinement: Allow users to provide real-time positive/negative point prompts (mouse clicks) to correct segmentation errors in each frame.
- Provide a Practical Tool: Offer a ready-to-use boilerplate for interactive video annotation and qualitative model evaluation.
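For context, this is roughly the state-aware video API the notebook builds on; a minimal sketch in which the checkpoint/config paths and frames directory are placeholders, not part of this proposal:

```python
import numpy as np
from sam2.build_sam import build_sam2_video_predictor

# Placeholder paths -- substitute your own checkpoint, config, and frames directory.
predictor = build_sam2_video_predictor(
    "configs/sam2.1/sam2.1_hiera_l.yaml", "checkpoints/sam2.1_hiera_large.pt"
)
inference_state = predictor.init_state(video_path="path/to/video_frames")

# One positive click (label 1) on object 1 in frame 0.
_, obj_ids, mask_logits = predictor.add_new_points_or_box(
    inference_state=inference_state,
    frame_idx=0,
    obj_id=1,
    points=np.array([[210, 350]], dtype=np.float32),
    labels=np.array([1], dtype=np.int32),  # 1 = positive, 0 = negative
)

# The same inference_state carries the object's memory across frames.
for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(inference_state):
    masks = (mask_logits > 0.0).cpu().numpy()  # one binary mask per tracked obj_id
```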
Core Changes
The implementation is contained within a custom Python class (InteractiveAnnotatorWithSAM) that manages the following (a condensed sketch appears after the list):
- GUI Setup: Uses cv2 (OpenCV) to handle frame display and mouse/key callbacks.
- Prompting: Captures right-clicks (positive) and left-clicks (negative) to guide the segmentation.
- On-Demand Prediction: Triggers predictor.add_new_points_or_box(...) on keypress ('v') and displays the mask overlay immediately for visual feedback.
- Flow Control: Allows users to advance to the next frame ('n') while preserving the object's tracking state.
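To make the flow concrete, here is a condensed, illustrative sketch of how such a class could wire these pieces together. The actual notebook differs in detail; the window name, overlay style, and frame-loading convention here are assumptions:

```python
import cv2
import numpy as np

class InteractiveAnnotatorWithSAM:
    """Sketch: right/left-click to prompt, 'v' to predict, 'n' for next frame, 'q' to quit."""

    def __init__(self, predictor, inference_state, frames, obj_id=1):
        self.predictor = predictor            # SAM2VideoPredictor
        self.inference_state = inference_state
        self.frames = frames                  # list of BGR frames (np.ndarray)
        self.obj_id = obj_id
        self.frame_idx = 0
        self.points, self.labels = [], []     # prompts accumulated for the current frame
        self.mask = None

    def on_mouse(self, event, x, y, flags, param):
        if event == cv2.EVENT_RBUTTONDOWN:    # right-click = positive prompt
            self.points.append([x, y]); self.labels.append(1)
        elif event == cv2.EVENT_LBUTTONDOWN:  # left-click = negative prompt
            self.points.append([x, y]); self.labels.append(0)

    def predict(self):
        if not self.points:
            return
        _, _, mask_logits = self.predictor.add_new_points_or_box(
            inference_state=self.inference_state,
            frame_idx=self.frame_idx,
            obj_id=self.obj_id,
            points=np.array(self.points, dtype=np.float32),
            labels=np.array(self.labels, dtype=np.int32),
        )
        self.mask = (mask_logits[0, 0] > 0.0).cpu().numpy()  # (H, W) boolean mask

    def render(self):
        frame = self.frames[self.frame_idx].copy()
        if self.mask is not None:
            overlay = frame.copy()
            overlay[self.mask] = (0, 255, 0)  # green mask overlay
            frame = cv2.addWeighted(overlay, 0.5, frame, 0.5, 0)
        for (x, y), label in zip(self.points, self.labels):
            color = (0, 255, 0) if label == 1 else (0, 0, 255)
            cv2.circle(frame, (x, y), 4, color, -1)
        return frame

    def run(self):
        cv2.namedWindow("SAM-2 annotator")
        cv2.setMouseCallback("SAM-2 annotator", self.on_mouse)
        while self.frame_idx < len(self.frames):
            cv2.imshow("SAM-2 annotator", self.render())
            key = cv2.waitKey(20) & 0xFF
            if key == ord('v'):               # run prediction with current prompts
                self.predict()
            elif key == ord('n'):             # next frame; inference_state persists
                self.frame_idx += 1
                self.points, self.labels, self.mask = [], [], None
            elif key == ord('q'):
                break
        cv2.destroyAllWindows()
```

Note that only the per-frame prompts and overlay are cleared on 'n'; the shared inference_state is untouched, which is what lets the predictor keep tracking the same obj_id across frames.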
I have a working notebook ready for submission. Please let me know if this feature aligns with the repository's goals so I can prepare a Pull Request.