🚀 Feature Proposal: Interactive GUI for Video Object Segmentation (VOS) #722

@MDSALMANSHAMS

Description

Summary

I propose adding a new example notebook that implements an interactive, OpenCV-based GUI for Video Object Segmentation (VOS) using SAM2VideoPredictor. This improves the user experience for segmenting and tracking objects across video frames.

Motivation

Existing examples lack a direct, interactive demonstration of SAM-2's advanced VOS and object tracking capabilities. This new example will:

  1. Showcase Temporal Tracking: Directly utilize SAM2VideoPredictor's state-aware inference (inference_state, obj_id) for robust, continuous tracking.
  2. Enable User Refinement: Allow users to provide real-time positive/negative point prompts (mouse clicks) to correct segmentation errors in each frame.
  3. Provide a Practical Tool: Offer a ready-to-use boilerplate for interactive video annotation and qualitative model evaluation.

Core Changes

The implementation is contained within a custom Python class (InteractiveAnnotatorWithSAM) that manages the following:

  • GUI Setup: Uses cv2 (OpenCV) to handle frame display and mouse/key callbacks.
  • Prompting: Captures right-clicks (positive) and left-clicks (negative) to guide the segmentation.
  • On-Demand Prediction: Triggers predictor.add_new_points_or_box(...) on keypress ('v') and displays the mask overlay immediately for visual feedback.
  • Flow Control: Allows users to advance to the next frame ('n') while preserving the object's tracking state.
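To make the proposed design concrete, here is a minimal sketch of how such a class could be structured. This is an illustration under stated assumptions, not the submitted notebook: the class name InteractiveAnnotatorWithSAM comes from the proposal, but the method names (`add_click`, `predict`, `next_frame`, `run`) and the exact keyword signature assumed for `predictor.add_new_points_or_box(...)` are hypothetical. Any object exposing that method would work as the `predictor`.

```python
import numpy as np


class InteractiveAnnotatorWithSAM:
    """Sketch of the proposed interactive VOS annotator.

    Assumes a SAM2VideoPredictor-style object exposing
    add_new_points_or_box(inference_state=..., frame_idx=...,
    obj_id=..., points=..., labels=...); the keyword names here
    are an assumption for illustration.
    """

    def __init__(self, predictor, inference_state, obj_id=1):
        self.predictor = predictor
        self.state = inference_state   # tracking state preserved across frames
        self.obj_id = obj_id
        self.frame_idx = 0
        self.points = []               # (x, y) click coordinates, current frame
        self.labels = []               # 1 = positive prompt, 0 = negative prompt

    def add_click(self, x, y, positive):
        """Record one point prompt; the cv2 mouse callback delegates here."""
        self.points.append((x, y))
        self.labels.append(1 if positive else 0)

    def predict(self):
        """Send accumulated prompts to the predictor ('v' key in the GUI)."""
        if not self.points:
            return None
        return self.predictor.add_new_points_or_box(
            inference_state=self.state,
            frame_idx=self.frame_idx,
            obj_id=self.obj_id,
            points=np.array(self.points, dtype=np.float32),
            labels=np.array(self.labels, dtype=np.int32),
        )

    def next_frame(self):
        """Advance to the next frame ('n' key), keeping the tracking state."""
        self.frame_idx += 1
        self.points, self.labels = [], []

    def run(self, frames, window="SAM2 annotator"):
        """GUI loop: right-click = positive, left-click = negative,
        'v' = predict, 'n' = next frame, 'q' = quit."""
        import cv2  # imported here so the prompt logic above stays headless

        def on_mouse(event, x, y, flags, param):
            if event == cv2.EVENT_RBUTTONDOWN:
                self.add_click(x, y, positive=True)
            elif event == cv2.EVENT_LBUTTONDOWN:
                self.add_click(x, y, positive=False)

        cv2.namedWindow(window)
        cv2.setMouseCallback(window, on_mouse)
        while self.frame_idx < len(frames):
            cv2.imshow(window, frames[self.frame_idx])
            key = cv2.waitKey(20) & 0xFF
            if key == ord("v"):
                self.predict()  # mask-overlay drawing omitted in this sketch
            elif key == ord("n"):
                self.next_frame()
            elif key == ord("q"):
                break
        cv2.destroyAllWindows()
```

Separating the prompt bookkeeping (`add_click`, `predict`) from the cv2 event loop keeps the state-management logic testable without opening a window, and keeps the inference_state/obj_id handoff to the predictor in one place.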

I have a working notebook ready for submission. Please let me know if this feature aligns with the repository's goals so I can prepare a Pull Request.

@haithamkhedr @NielsRogge @ronghanghu
