
Q2_0 group 64: Metal backend #41

Draft
khosravipasha wants to merge 1 commit into pr/q2_0-cpu from pr/q2_0-metal

@khosravipasha (Collaborator)

Draft PR for testing and reviews.

Copilot AI left a comment

Pull request overview

Adds Metal backend support for the Q2_0 quantization format (group size 64) so quantized weights can participate in common Metal paths (copy/dequant, get_rows, mul_mv, mul_mm, and mul_mat dispatch).

Changes:

  • Implements Q2_0 quantize/dequant routines in the Metal shader library and wires them into existing generic kernels (cpy/get_rows/mul_mm).
  • Adds a Q2_0-specific dot-product routine and a dedicated kernel_mul_mv_q2_0_f32 path.
  • Enables GGML_TYPE_Q2_0 in Metal op dispatch / device capability checks and sets up pipeline params for mul_mv/mul_mv_id.
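For readers unfamiliar with group-wise 2-bit quantization, the following is a minimal CPU-side sketch of what a quantize/dequant pair for one group of 64 weights could look like. Only the group size comes from this PR's title. Everything else is an assumption made for illustration: the `Q2BlockSketch` struct, the `float` scale, the packing of four codes per byte, and the symmetric level mapping `d * (2q - 3)` are all hypothetical and may not match the actual Q2_0 block layout or the Metal kernels added here.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>

// Group size taken from the PR title; all other layout choices are hypothetical.
constexpr int kGroup = 64;

// Hypothetical block: one scale plus 64 two-bit codes packed four per byte.
struct Q2BlockSketch {
    float d;                             // per-group scale (assumed float here)
    std::array<uint8_t, kGroup / 4> qs;  // 16 bytes of packed 2-bit codes
};

// Quantize 64 floats to codes q in {0,1,2,3}, representing d * {-3,-1,1,3}.
inline Q2BlockSketch quantize_group(const float * x) {
    float amax = 0.0f;
    for (int i = 0; i < kGroup; ++i) {
        amax = std::max(amax, std::fabs(x[i]));
    }

    Q2BlockSketch b{};                   // zero-initializes d and qs
    b.d = amax / 3.0f;                   // largest magnitude maps to level +/-3
    const float id = b.d > 0.0f ? 1.0f / b.d : 0.0f;

    for (int i = 0; i < kGroup; ++i) {
        // Map x/d from [-3, 3] to [0, 3], round to nearest code, and clamp.
        const long r = std::lround((x[i] * id + 3.0f) * 0.5f);
        const int  q = std::clamp(static_cast<int>(r), 0, 3);
        b.qs[i / 4] |= static_cast<uint8_t>(q << (2 * (i % 4)));
    }
    return b;
}

// Unpack the 2-bit codes and rescale them back to floats.
inline void dequantize_group(const Q2BlockSketch & b, float * y) {
    for (int i = 0; i < kGroup; ++i) {
        const int q = (b.qs[i / 4] >> (2 * (i % 4))) & 0x3;
        y[i] = b.d * static_cast<float>(2 * q - 3);
    }
}
```

Under these assumptions, a dot-product routine like the one mentioned above would unpack the codes in the same way and accumulate `d * (2q - 3) * y[i]` against the activation vector instead of writing the dequantized values out.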

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| ggml/src/ggml-metal/ggml-metal.metal | Adds Q2_0 quantize/dequant, dot-product helper, and kernel/template instantiations for copy/get_rows/mul_mm/mul_mv. |
| ggml/src/ggml-metal/ggml-metal-ops.cpp | Allows Q2_0 to use the small-batch mul-mv-ext path in MUL_MAT when applicable. |
| ggml/src/ggml-metal/ggml-metal-impl.h | Introduces N_R0_Q2_0 / N_SG_Q2_0 constants for pipeline configuration. |
| ggml/src/ggml-metal/ggml-metal-device.m | Updates Metal device op support checks to include Q2_0 for relevant copy/dup/cont paths. |
| ggml/src/ggml-metal/ggml-metal-device.cpp | Adds Q2_0 case for mul_mv and mul_mv_id pipeline configuration (nsg/nr0). |
