RuntimeError: GET was unable to find an engine to execute this computation from F.conv2d() #372

(This is not my wheelhouse as a developer, but here is what my AI and I came up with for a bug report; I hope it is useful.) I got this same error when this was still a branch (multitask-v2-detector), using Detectorv2 with both path and tensor inputs.

I have been testing the new `DetectorV2` multitask branch on a System76 laptop with an NVIDIA RTX 2060 (6 GB VRAM). My environment is Python 3.12.3, PyTorch 2.12.1+cu130, CUDA runtime 13.0, cuDNN 9.2, NVIDIA proprietary driver 580.159.03. `torch.cuda.is_available()` is `True`, `torch.backends.cudnn.is_available()` is `True`, and the model parameters are on `cuda:0` with `torch.float32`.
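For anyone comparing environments, the values above can be gathered in one go. This is a minimal sketch and not part of py-feat: `collect_env` is a hypothetical helper name, and it only reads standard `torch` attributes. It returns a reduced report if PyTorch isn't installed.

```python
import platform

try:
    import torch
except ImportError:  # keep the sketch usable on machines without PyTorch
    torch = None


def collect_env() -> dict:
    """Gather the version and capability facts quoted in this report.

    `collect_env` is a hypothetical helper, not a py-feat or PyTorch API.
    """
    info = {"python": platform.python_version(), "torch": None}
    if torch is None:
        return info
    info["torch"] = torch.__version__
    info["cuda_runtime"] = torch.version.cuda  # None on CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    info["cudnn_available"] = torch.backends.cudnn.is_available()
    info["cudnn_version"] = torch.backends.cudnn.version()
    if info["cuda_available"]:
        info["gpu"] = torch.cuda.get_device_name(0)
        info["sm"] = torch.cuda.get_device_capability(0)
        # Architectures the installed wheel was compiled for.
        info["wheel_arch_list"] = torch.cuda.get_arch_list()
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

Running `python -m torch.utils.collect_env` gives a fuller report if more detail is needed.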

`DetectorV2.detect()` fails almost immediately during the first forward pass with:

```text
RuntimeError: GET was unable to find an engine to execute this computation
```

The traceback points into the ConvNeXt backbone (`timm`) at the first depthwise convolution (`conv_dw -> F.conv2d`).

To isolate the problem, I created several standalone tests outside of py-feat:

* Standalone CUDA `Conv2d` works.
* Standalone depthwise `Conv2d(groups=...)` works.
* Standalone `timm` ConvNeXt (`features_only=True`) on CUDA works correctly, including batch size 16 and 256×256 inputs.
* cuDNN reports available (`version=92000`), and the PyTorch build reports `USE_CUDA=ON`, `USE_CUDNN=ON`.
* The GPU architecture (SM 7.5) is included in the PyTorch wheel.
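A standalone depthwise check of this kind could look roughly like the sketch below. It is an illustration, not the exact test script. The 7×7 kernel is the usual ConvNeXt default, while `channels=96` and the 64×64 feature map are assumptions, since the backbone variant isn't stated here. `depthwise_conv_ok` is a hypothetical name.

```python
try:
    import torch
    import torch.nn as nn
except ImportError:  # keep the sketch importable without PyTorch
    torch = None


def depthwise_conv_ok(channels=96, size=64, batch=1, device=None):
    """Run one depthwise Conv2d forward pass and return the output shape.

    Returns None if PyTorch is not installed. The default channel count
    and spatial size are assumptions, not values taken from py-feat.
    """
    if torch is None:
        return None
    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"
    # groups == in_channels makes the convolution depthwise.
    conv = nn.Conv2d(
        channels, channels, kernel_size=7, padding=3, groups=channels
    ).to(device)
    x = torch.rand(batch, channels, size, size, device=device)
    with torch.no_grad():
        y = conv(x)
    if str(device).startswith("cuda"):
        # CUDA kernels run asynchronously; force errors to surface here.
        torch.cuda.synchronize()
    return tuple(y.shape)


if __name__ == "__main__":
    print(depthwise_conv_ok())
    print(depthwise_conv_ok(batch=16))
```

A pass here only shows that this one shape, dtype, and memory layout has a usable kernel; it says nothing about the tensors the detector actually feeds the backbone.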

This seems to rule out a general CUDA, cuDNN, driver, or PyTorch installation problem. The failure appears to be specific to the new `DetectorV2` multitask inference path rather than the underlying ConvNeXt implementation itself.
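One experiment that might narrow this down is to record what each `Conv2d` actually receives inside the detector, so it can be compared with the standalone tests. The sketch below is only a suggestion: `describe` and `log_conv_inputs` are hypothetical helpers, and the attribute path `detector.multitask.model` in the usage comment is inferred from the traceback further down, so treat it as an assumption.

```python
try:
    import torch
    import torch.nn as nn
except ImportError:  # keep the sketch importable without PyTorch
    torch = None


def describe(t):
    """Summarise tensor properties that can affect kernel selection."""
    info = {
        "shape": tuple(t.shape),
        "dtype": str(t.dtype),
        "device": str(t.device),
        "contiguous": t.is_contiguous(),
    }
    if t.dim() == 4:  # channels_last is only defined for 4-D tensors
        info["channels_last"] = t.is_contiguous(
            memory_format=torch.channels_last
        )
    return info


def log_conv_inputs(model, records):
    """Attach a forward pre-hook to every Conv2d; return the hook handles."""
    handles = []
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            # Bind `name` as a default so each hook keeps its own layer name.
            def hook(mod, args, name=name):
                records.append((name, describe(args[0])))

            handles.append(module.register_forward_pre_hook(hook))
    return handles


if __name__ == "__main__" and torch is not None:
    records = []
    toy = nn.Sequential(nn.Conv2d(3, 4, 3, padding=1), nn.ReLU())
    handles = log_conv_inputs(toy, records)
    toy(torch.rand(2, 3, 8, 8))
    for handle in handles:
        handle.remove()
    print(records)
    # Assumed usage with the detector (attribute path is a guess):
    #   handles = log_conv_inputs(detector.multitask.model, records)
```

The last entry in `records` before the exception would be the input that cuDNN rejected. Wrapping the `detect()` call in `with torch.backends.cudnn.flags(enabled=False):` is another cheap check: if the error goes away, the problem is in cuDNN engine selection for that particular input rather than in the model weights.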

I'd be happy to test patches, provide additional diagnostics, or run any further experiments that would help narrow this down.

The script below (`genfail.py`) reproduces the failure on my machine, using a Pexels video (which I couldn't upload):

```python
import os
import argparse
from feat import Detectorv2
from feat.utils.io import video_to_tensor

parser = argparse.ArgumentParser()
parser.add_argument('--skip', dest='skip', type=int, default=24)
parser.add_argument('--batch_size', dest='batch_size', type=int, default=1)
parser.add_argument('--num_workers', dest='num_workers', type=int, default=1)
parser.add_argument('video_file')
args = parser.parse_args()

print(f"processing:  {args}")

detector = Detectorv2(device="cuda")

print(detector.info)

tvf = video_to_tensor(args.video_file)
fex = detector.detect(tvf, data_type="tensor", face_identity_threshold=0.95, face_detection_threshold=0.95, skip=args.skip, batch_size=args.batch_size, num_workers=args.num_workers, verbose=True )


print(fex)
```

Output is:

```text
Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
{'face_model': 'retinaface', 'multitask_model': 'face_multitask_v2', 'identity_model': 'arcface', 'facepose_model': 'multitask', 'gaze_model': 'multitask'}
  0%|                                                                                                                                                                                         | 0/801 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "/home/kcostilow/venvs/genfail.py", line 21, in <module>
    fex = detector.detect(tvf, data_type="tensor", face_identity_threshold=0.95, face_detection_threshold=0.95, skip=args.skip, batch_size=args.batch_size, num_workers=args.num_workers, verbose=True )
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/feat/detector_v2.py", line 534, in detect
    batch_results = self.forward(faces_data, batch_data)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/feat/detector_v2.py", line 363, in forward
    out = self.multitask(faces)  # MultitaskOutput; faces already [0,1] 256 crops
          ^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/feat/multitask/inference.py", line 165, in __call__
    out = self.model(x)
          ^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/feat/multitask/model_v2.py", line 695, in forward
    feats = self.backbone(x)[-1]                # [B, bb_ch, H, W]
            ^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/timm/models/_features.py", line 345, in forward
    return list(self._collect(x).values())
                ^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/timm/models/_features.py", line 299, in _collect
    x = module(x)
        ^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/timm/models/convnext.py", line 306, in forward
    x = self.blocks(x)
        ^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/container.py", line 253, in forward
    input = module(input)
            ^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/timm/models/convnext.py", line 200, in forward
    x = self.conv_dw(x)
        ^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 565, in forward
    return self._conv_forward(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kcostilow/venvs/3.12-py-feat/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 560, in _conv_forward
    return F.conv2d(
           ^^^^^^^^^
RuntimeError: GET was unable to find an engine to execute this computation
```

