Description
nvidia-smi
Mon Mar 2 19:54:20 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.126.20 Driver Version: 580.126.20 CUDA Version: 13.0 |
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Tue_Dec_16_07:23:41_PM_PST_2025
Cuda compilation tools, release 13.1, V13.1.115
Build cuda_13.1.r13.1/compiler.37061995_0
Run:
SAFETENSORS_FAST_GPU=1 vllm serve MiniMaxAI/MiniMax-M2.5 --trust-remote-code --tensor-parallel-size 4 --enable-auto-tool-choice --tool-call-parser minimax_m2 --reasoning-parser minimax_m2_append_think
(EngineCore_DP0 pid=120905) WARNING 03-02 19:48:25 [multiproc_executor.py:921] Reducing Torch parallelism from 112 threads to 1 to avoid unnecessary CPU contention. Set OMP_NUM_THREADS in the external environment to tune this value as needed.
ERROR 03-02 19:49:00 [multiproc_executor.py:783] WorkerProc failed to start.
ERROR 03-02 19:49:00 [multiproc_executor.py:783] Traceback (most recent call last):
ERROR 03-02 19:49:00 [multiproc_executor.py:783] File "/home/user/.venv/lib/python3.10/site-packages/vllm/v1/executor/multiproc_executor.py", line 754, in worker_main
ERROR 03-02 19:49:00 [multiproc_executor.py:783] worker = WorkerProc(*args, **kwargs)
ERROR 03-02 19:49:00 [multiproc_executor.py:783] File "/home/user/.venv/lib/python3.10/site-packages/vllm/v1/executor/multiproc_executor.py", line 571, in __init__
ERROR 03-02 19:49:00 [multiproc_executor.py:783] self.worker.init_device()
ERROR 03-02 19:49:00 [multiproc_executor.py:783] File "/home/user/.venv/lib/python3.10/site-packages/vllm/v1/worker/worker_base.py", line 322, in init_device
ERROR 03-02 19:49:00 [multiproc_executor.py:783] self.worker.init_device() # type: ignore
ERROR 03-02 19:49:00 [multiproc_executor.py:783] File "/home/user/.venv/lib/python3.10/site-packages/vllm/v1/worker/gpu_worker.py", line 224, in init_device
ERROR 03-02 19:49:00 [multiproc_executor.py:783] current_platform.set_device(self.device)
ERROR 03-02 19:49:00 [multiproc_executor.py:783] File "/home/user/.venv/lib/python3.10/site-packages/vllm/platforms/cuda.py", line 126, in set_device
ERROR 03-02 19:49:00 [multiproc_executor.py:783] torch.cuda.set_device(device)
ERROR 03-02 19:49:00 [multiproc_executor.py:783] File "/home/user/.venv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 567, in set_device
ERROR 03-02 19:49:00 [multiproc_executor.py:783] torch._C._cuda_setDevice(device)
ERROR 03-02 19:49:00 [multiproc_executor.py:783] File "/home/user/.venv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 410, in _lazy_init
ERROR 03-02 19:49:00 [multiproc_executor.py:783] torch._C._cuda_init()
ERROR 03-02 19:49:00 [multiproc_executor.py:783] RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 802: system not yet initialized
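
To check whether the failure is specific to vLLM, the CUDA initialization step from the traceback can be attempted in a bare Python process. The sketch below is an illustration and not part of the original run: `probe_cuda` is a hypothetical helper name, and it assumes the script is executed in the same virtual environment as the `vllm serve` command. Error 802 corresponds to `cudaErrorSystemNotReady` in the CUDA runtime, which the CUDA documentation associates with a required driver daemon not running. On NVSwitch-based machines that may point to the Fabric Manager service, though nothing in the log above confirms that this is the cause here.

```python
def probe_cuda():
    """Try to initialize CUDA through PyTorch and report what happened.

    Hypothetical diagnostic helper; it only wraps public torch.cuda calls.
    """
    try:
        import torch
    except ImportError:
        return {"status": "torch-missing", "detail": "PyTorch is not installed"}

    try:
        # Forces the lazy CUDA initialization that fails in the traceback.
        torch.cuda.init()
    except Exception as exc:
        return {"status": "init-failed", "detail": f"{type(exc).__name__}: {exc}"}

    count = torch.cuda.device_count()
    return {"status": "ok", "detail": f"{count} device(s) visible"}


if __name__ == "__main__":
    result = probe_cuda()
    print(result["status"], "-", result["detail"])
```

If this reports `init-failed` with the same Error 802 message, the problem would appear to sit below vLLM, in the driver or system setup, rather than in the serve command or the model configuration.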