Runtime Error when running pipeline #119

@e-bug

Hi,

I get a runtime error when I run the example command (and similar ones):

python vace/vace_pipeline.py --base wan --task depth --video assets/videos/test.mp4 --prompt 'xxx'

The error is raised by the unflatten call in flash_attention (wan/modules/attention.py, line 110; see the traceback below). Any thoughts on how to fix it?

python vace/vace_pipeline.py --base wan --task depth --video assets/videos/test.mp4 --prompt 'xxx'
Save frames result to processed/depth/2025-08-26-22-54-10/src_video-depth.mp4
preprocess_output: {'src_video': 'processed/depth/2025-08-26-22-54-10/src_video-depth.mp4'}
[2025-08-26 22:54:12,135] INFO: offload_model is not specified, set to True.
[2025-08-26 22:54:12,135] INFO: Generation job args: Namespace(model_name='vace-1.3B', size='480p', frame_num=81, ckpt_dir='models/Wan2.1-VACE-1.3B/', offload_model=True, ulysses_size=1, ring_size=1, t5_fsdp=False, t5_cpu=False, dit_fsdp=False, save_dir=None, save_file=None, src_video='processed/depth/2025-08-26-22-54-10/src_video-depth.mp4', src_mask=None, src_ref_images=None, prompt='xxx', use_prompt_extend='plain', base_seed=2025, sample_solver='unipc', sample_steps=50, sample_shift=16, sample_guide_scale=5.0)
[2025-08-26 22:54:12,135] INFO: Generation model config: {'__name__': 'Config: Wan T2V 1.3B', 't5_model': 'umt5_xxl', 't5_dtype': torch.bfloat16, 'text_len': 512, 'param_dtype': torch.bfloat16, 'num_train_timesteps': 1000, 'sample_fps': 16, 'sample_neg_prompt': '色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走', 't5_checkpoint': 'models_t5_umt5-xxl-enc-bf16.pth', 't5_tokenizer': 'google/umt5-xxl', 'vae_checkpoint': 'Wan2.1_VAE.pth', 'vae_stride': (4, 8, 8), 'patch_size': (1, 2, 2), 'dim': 1536, 'ffn_dim': 8960, 'freq_dim': 256, 'num_heads': 12, 'num_layers': 30, 'window_size': (-1, -1), 'qk_norm': True, 'cross_attn_norm': True, 'eps': 1e-06}
[2025-08-26 22:54:12,135] INFO: Input prompt: xxx
[2025-08-26 22:54:12,135] INFO: Creating WanT2V pipeline.
[2025-08-26 22:55:47,924] INFO: loading models/Wan2.1-VACE-1.3B/models_t5_umt5-xxl-enc-bf16.pth
[2025-08-26 22:56:07,823] INFO: loading models/Wan2.1-VACE-1.3B/Wan2.1_VAE.pth
[2025-08-26 22:56:09,288] INFO: Creating VaceWanModel from models/Wan2.1-VACE-1.3B/
[2025-08-26 22:56:23,905] INFO: Generating video...
  0%|          | 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/workdir/VACE/vace/vace_pipeline.py", line 58, in <module>
    main()
  File "/workdir/VACE/vace/vace_pipeline.py", line 53, in main
    preprocess_output = importlib.import_module(inference_name).main(inference_args)
  File "/workdir/VACE/vace/vace_wan_inference.py", line 291, in main
    video = wan_vace.generate(
  File "/workdir/VACE/vace/models/wan/wan_vace.py", line 402, in generate
    noise_pred_cond = self.model(
  File "/opt/python/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/python/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workdir/VACE/vace/models/wan/modules/model.py", line 227, in forward
    hints = self.forward_vace(x, vace_context, seq_len, kwargs)
  File "/workdir/VACE/vace/models/wan/modules/model.py", line 140, in forward_vace
    c = block(c, **new_kwargs)
  File "/opt/python/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/python/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workdir/VACE/vace/models/wan/modules/model.py", line 40, in forward
    c = super().forward(c, **kwargs)
  File "/opt/python/3.10/lib/python3.10/site-packages/wan/modules/model.py", line 302, in forward
    y = self.self_attn(
  File "/opt/python/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/python/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/python/3.10/lib/python3.10/site-packages/wan/modules/model.py", line 149, in forward
    x = flash_attention(
  File "/opt/python/3.10/lib/python3.10/site-packages/wan/modules/attention.py", line 110, in flash_attention
    deterministic=deterministic)[0].unflatten(0, (b, lq))
  File "/opt/python/3.10/lib/python3.10/site-packages/torch/_tensor.py", line 1421, in unflatten
    return super().unflatten(dim, sizes)
RuntimeError: unflatten: Provided sizes [1, 32760] don't multiply up to the size of dim 0 (12) in the input tensor
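
One possible reading of the message, not confirmed in this report: the 12 it mentions equals num_heads in the model config logged above, and 32760 is the token count. That would fit a case where the value being indexed with [0] is already the packed output tensor of shape (b * lq, num_heads, head_dim) rather than a tuple whose first element is that tensor, so [0] picks the first token and drops the token dimension. The sketch below reproduces only the shape arithmetic in plain Python. The helper unflatten_shape is hypothetical and stands in for torch.Tensor.unflatten, the "softmax_lse" placeholder is illustrative, and head_dim = 1536 // 12 is derived from the logged 'dim' and 'num_heads'.

```python
from math import prod


def unflatten_shape(shape, dim, sizes):
    """Shape-only stand-in for torch.Tensor.unflatten (hypothetical helper)."""
    if prod(sizes) != shape[dim]:
        raise RuntimeError(
            f"unflatten: Provided sizes {list(sizes)} don't multiply up to "
            f"the size of dim {dim} ({shape[dim]}) in the input tensor")
    return shape[:dim] + tuple(sizes) + shape[dim + 1:]


# Values taken from the log and traceback above.
b, lq = 1, 32760
num_heads, head_dim = 12, 1536 // 12  # 'num_heads': 12, 'dim': 1536

# Shape of a packed attention output: one row per token.
packed = (b * lq, num_heads, head_dim)

# Case A (assumed): the attention call returns a tuple (out, extra),
# so [0] selects the output and unflatten succeeds.
tuple_return = (packed, "softmax_lse")
ok = unflatten_shape(tuple_return[0], 0, (b, lq))

# Case B (assumed): the call returns the bare tensor, so [0] selects the
# first token and dim 0 becomes num_heads.
first_token = packed[1:]
try:
    unflatten_shape(first_token, 0, (b, lq))
    error = None
except RuntimeError as exc:
    error = str(exc)

print(ok)
print(error)
```

If this reading applies, checking type() and .shape of the value returned inside flash_attention, just before the [0], would confirm or rule it out.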
