(This is not my wheelhouse as a developer, but here is what my AI and I came up with for a bug report; I hope it is useful.) I got this same error when this was a branch (multitask-v2-detector), using Detectorv2 with both path and tensor inputs.
I have been testing the new DetectorV2 multitask branch on a System76 laptop with an NVIDIA RTX 2060 (6 GB VRAM). My environment is Python 3.12.3, PyTorch 2.12.1+cu130, CUDA runtime 13.0, cuDNN 9.2, NVIDIA proprietary driver 580.159.03. torch.cuda.is_available() is True, torch.backends.cudnn.is_available() is True, and the model parameters are on cuda:0 with torch.float32.
DetectorV2.detect() fails almost immediately during the first forward pass with:
RuntimeError: GET was unable to find an engine to execute this computation
The traceback points into the ConvNeXt backbone (timm) at the first depthwise convolution (conv_dw -> F.conv2d).
To isolate the problem, I created several standalone tests outside of py-feat:
- Standalone CUDA Conv2d works.
- Standalone depthwise Conv2d(groups=...) works.
- Standalone timm ConvNeXt (features_only=True) on CUDA works correctly, including batch size 16 and 256×256 inputs.
- cuDNN reports available (version=92000), and the PyTorch build reports USE_CUDA=ON, USE_CUDNN=ON.
- The GPU architecture (SM 7.5) is included in the PyTorch wheel.
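A standalone depthwise check of the kind listed above can be sketched roughly as follows. This is an illustrative reconstruction, not the exact test script: the function name `depthwise_conv_check`, the channel count (96), the spatial size (64) and the 7x7 kernel are assumptions chosen to resemble an early ConvNeXt stage, and the code falls back to CPU when no GPU is present.

```python
import torch
import torch.nn as nn


def depthwise_conv_check(channels=96, size=64, batch=16, kernel=7):
    """Run one depthwise Conv2d forward pass and return the output shape."""
    # Fall back to CPU so the sketch still runs on a machine without a GPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # groups == channels makes the convolution depthwise.
    conv = nn.Conv2d(
        channels,
        channels,
        kernel_size=kernel,
        padding=kernel // 2,  # "same" padding for an odd kernel size
        groups=channels,
    ).to(device)
    x = torch.rand(batch, channels, size, size, device=device)
    with torch.no_grad():
        y = conv(x)
    return tuple(y.shape)


if __name__ == "__main__":
    print(depthwise_conv_check())
```

If this passes on the GPU while the detector still fails, the difference is more likely in the tensor that py-feat hands to the backbone than in the convolution itself.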
This seems to rule out a general CUDA, cuDNN, driver, or PyTorch installation problem. The failure appears to be specific to the new DetectorV2 multitask inference path rather than the underlying ConvNeXt implementation itself.
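One detail that may be worth double-checking is how the cuDNN version integer maps to a release number. The sketch below assumes the encoding used in the cuDNN headers as I understand it (major * 10000 + minor * 100 + patch for cuDNN 9 and later, major * 1000 + minor * 100 + patch for earlier releases); `decode_cudnn_version` is a hypothetical helper, not part of PyTorch. Under that assumption, 92000 would decode to 9.20.0 rather than 9.2.0.

```python
def decode_cudnn_version(version: int) -> tuple[int, int, int]:
    """Split the integer from torch.backends.cudnn.version() into (major, minor, patch)."""
    if version >= 90000:
        # Assumed cuDNN 9+ scheme: major * 10000 + minor * 100 + patch
        major, rest = divmod(version, 10000)
    else:
        # Assumed cuDNN 8 and earlier scheme: major * 1000 + minor * 100 + patch
        major, rest = divmod(version, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


if __name__ == "__main__":
    print(decode_cudnn_version(92000))
```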
If it would be helpful, I'm happy to test patches, provide additional diagnostics, or run further experiments to narrow this down.
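One such experiment would be to log what actually reaches the failing convolution. The sketch below uses PyTorch's `register_forward_pre_hook` to record the shape, dtype, device and memory layout of a module's input; `describe` and `attach_input_logger` are hypothetical helper names, and the small Conv2d in the demo only stands in for the real `conv_dw` layer inside the detector's backbone.

```python
import torch
import torch.nn as nn


def describe(t):
    """Summarize tensor properties that can affect convolution kernel selection."""
    return {
        "shape": tuple(t.shape),
        "dtype": str(t.dtype),
        "device": str(t.device),
        "contiguous": t.is_contiguous(),
        "channels_last": t.is_contiguous(memory_format=torch.channels_last),
    }


def attach_input_logger(module, log):
    """Append a summary of the first positional input on every forward call."""
    def hook(mod, args):
        log.append(describe(args[0]))
    # The returned handle can be used to remove the hook again.
    return module.register_forward_pre_hook(hook)


if __name__ == "__main__":
    # Stand-in for the real depthwise layer; the sizes are arbitrary.
    conv = nn.Conv2d(8, 8, kernel_size=7, padding=3, groups=8)
    log = []
    handle = attach_input_logger(conv, log)
    with torch.no_grad():
        conv(torch.rand(2, 8, 16, 16))
    handle.remove()
    print(log)
```

Attaching the same hook to the first `conv_dw` module in both the failing detector and the working standalone timm model would show whether the two inputs differ in dtype, batch size or memory format.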
The script below (genfail.py) reproduces the failure on my machine, using a Pexels video (which I couldn't upload):
import os
import argparse
from feat import Detectorv2
from feat.utils.io import video_to_tensor
parser = argparse.ArgumentParser()
parser.add_argument('--skip', dest='skip', type=int, default=24)
parser.add_argument('--batch_size', dest='batch_size', type=int, default=1)
parser.add_argument('--num_workers', dest='num_workers', type=int, default=1)
parser.add_argument('video_file')
args = parser.parse_args()
print(f"processing: {args}")
detector = Detectorv2(device="cuda")
print(detector.info)
tvf = video_to_tensor(args.video_file)
fex = detector.detect(tvf, data_type="tensor", face_identity_threshold=0.95, face_detection_threshold=0.95, skip=args.skip, batch_size=args.batch_size, num_workers=args.num_workers, verbose=True )
print(fex)
Output is:
Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
{'face_model': 'retinaface', 'multitask_model': 'face_multitask_v2', 'identity_model': 'arcface', 'facepose_model': 'multitask', 'gaze_model': 'multitask'}
0%| | 0/801 [00:01<?, ?it/s]
Traceback (most recent call last):
File "/home/kcostilow/venvs/genfail.py", line 21, in
fex = detector.detect(tvf, data_type="tensor", face_identity_threshold=0.95, face_detection_threshold=0.95, skip=args.skip, batch_size=args.batch_size, num_workers=args.num_workers, verbose=True )
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/feat/detector_v2.py", line 534, in detect
batch_results = self.forward(faces_data, batch_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/feat/detector_v2.py", line 363, in forward
out = self.multitask(faces) # MultitaskOutput; faces already [0,1] 256 crops
^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/feat/multitask/inference.py", line 165, in call
out = self.model(x)
^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/feat/multitask/model_v2.py", line 695, in forward
feats = self.backbone(x)[-1] # [B, bb_ch, H, W]
^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/timm/models/_features.py", line 345, in forward
return list(self._collect(x).values())
^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/timm/models/_features.py", line 299, in _collect
x = module(x)
^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/timm/models/convnext.py", line 306, in forward
x = self.blocks(x)
^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/container.py", line 253, in forward
input = module(input)
^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/timm/models/convnext.py", line 200, in forward
x = self.conv_dw(x)
^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 565, in forward
return self._conv_forward(input, self.weight, self.bias)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 560, in _conv_forward
return F.conv2d(
^^^^^^^^^
RuntimeError: GET was unable to find an engine to execute this computation
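Since the error is raised while a convolution engine is being selected, a further check would be to repeat the failing call with cuDNN temporarily disabled. The sketch below uses the `torch.backends.cudnn.flags` context manager for this; `run_with_cudnn` is a hypothetical helper, and the small Conv2d in the demo is only a placeholder for the real model and input. This is a diagnostic idea under those assumptions, not a confirmed fix.

```python
import torch
import torch.nn as nn


def run_with_cudnn(module, x, enabled):
    """Run module(x) with cuDNN switched on or off; return (ok, detail)."""
    try:
        # The flags are restored automatically when the block exits.
        with torch.backends.cudnn.flags(enabled=enabled), torch.no_grad():
            y = module(x)
        return True, tuple(y.shape)
    except RuntimeError as err:
        # Keep the message so both runs can be compared side by side.
        return False, str(err)


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    conv = nn.Conv2d(4, 4, kernel_size=3, padding=1, groups=4).to(device)
    x = torch.rand(1, 4, 8, 8, device=device)
    for enabled in (True, False):
        print(enabled, run_with_cudnn(conv, x, enabled))
```

If the detector's forward pass succeeds with `enabled=False` but fails with `enabled=True`, that would point at cuDNN engine selection for this particular input rather than at the model weights.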