I encountered a critical compatibility issue while attempting to quantize this model locally.
I found that the model architecture was accidentally registered as Qwen3 inside the configuration file. However, this multimodal model is essentially built on Qwen2.5.
Furthermore, the Transformers version recommended by this project does not yet support the Qwen3 architecture. Because of this mismatch, the model fails to load with a runtime error, so INT8 quantization never gets a chance to run.
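
For anyone who wants to check this on their own copy, here is a rough sketch of how the mismatch could be confirmed. It reads the `model_type` and `architectures` fields from a local `config.json` and, if Transformers is installed, asks whether that version recognizes the declared type. The helper names and the placeholder config values are mine and purely illustrative; the lookup assumes `CONFIG_MAPPING` is importable from `transformers.models.auto.configuration_auto` in your installed version.

```python
import json
import tempfile
from pathlib import Path


def read_declared_architecture(config_path):
    """Return (model_type, architectures) as declared in a config.json file."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    return config.get("model_type"), config.get("architectures", [])


def supported_by_installed_transformers(model_type):
    """Return True/False if Transformers is installed, or None if it is not.

    Assumption: CONFIG_MAPPING lists every model_type the installed
    version knows about and supports the `in` operator.
    """
    try:
        from transformers.models.auto.configuration_auto import CONFIG_MAPPING
    except ImportError:
        return None
    return model_type in CONFIG_MAPPING


if __name__ == "__main__":
    # Hypothetical placeholder config; point config_path at the real
    # config.json of the downloaded model instead.
    with tempfile.TemporaryDirectory() as tmp:
        config_path = Path(tmp) / "config.json"
        config_path.write_text(
            json.dumps(
                {
                    "model_type": "example_model_type",
                    "architectures": ["ExampleForCausalLM"],
                }
            ),
            encoding="utf-8",
        )
        model_type, architectures = read_declared_architecture(config_path)
        print("declared model_type:", model_type)
        print("declared architectures:", architectures)
        print("recognized by installed Transformers:",
              supported_by_installed_transformers(model_type))
```

If the last line prints `False`, the installed Transformers version does not know the declared type, which would be consistent with the loading failure described above. `None` just means Transformers is not installed in that environment.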