
fix(gguf): safely propagate runtime errors for unknown architectures #2106

Open
glaziermag wants to merge 1 commit into
EricLBuehler:master from
glaziermag:fix-gguf-architecture-panic


@glaziermag

@glaziermag glaziermag commented Apr 14, 2026

Contributor

Summary

Fixes #2098 by returning a normal runtime error for unsupported GGUF architecture metadata instead of panicking through unwrap() / expect().

This PR does not add Gemma 4 GGUF model support. It only changes the failure mode for unsupported architectures such as gemma4 from a Rust panic to an ordinary error.
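
The panic-to-error change described above can be illustrated with a minimal, standard-library-only sketch. The names `Arch`, `UnknownArch`, `arch_panicking`, and `arch_checked` are hypothetical and are not taken from mistralrs-core; the actual types and call sites in content.rs may differ.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical stand-in for the architectures a loader knows about.
#[derive(Debug, PartialEq)]
enum Arch {
    Llama,
    Phi3,
}

/// Hypothetical error type; the real crate's error type may differ.
#[derive(Debug, PartialEq)]
struct UnknownArch(String);

impl fmt::Display for UnknownArch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Unknown GGUF architecture `{}`", self.0)
    }
}

impl std::error::Error for UnknownArch {}

impl FromStr for Arch {
    type Err = UnknownArch;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "llama" => Ok(Arch::Llama),
            "phi3" => Ok(Arch::Phi3),
            other => Err(UnknownArch(other.to_string())),
        }
    }
}

/// Before: unwrap() turns an unsupported architecture into a panic.
fn arch_panicking(meta: &str) -> Arch {
    meta.parse::<Arch>().unwrap()
}

/// After: `?` hands the same error back to the caller as a normal value.
fn arch_checked(meta: &str) -> Result<Arch, Box<dyn std::error::Error>> {
    let arch = meta.parse::<Arch>()?;
    Ok(arch)
}
```

In this sketch, a caller of `arch_checked` can report the error and exit cleanly, whereas `arch_panicking` aborts the thread for the same input.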

Validation Result

Validation, 2026-05-14: ACTUAL, FEASIBLE_NOW.

Environment:

  • GPU validation environment
  • Rust 1.95.0
  • Real fixture: Unsloth Gemma 4 GGUF gemma-4-26B-A4B-it-UD-Q4_K_S.gguf
  • Synthetic fixtures: minimal local GGUFs with general.architecture = "definitely_unknown_arch" and general.architecture = "gemma4"

The commands used --cpu; GPU execution is not material because this failure occurs during GGUF metadata parsing before model execution.
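
For reviewers who want to reproduce the synthetic fixtures, the following is a rough sketch of how a tensor-free GGUF file carrying only general.architecture could be generated. It assumes the GGUF v3 little-endian layout (magic, version, tensor count, metadata count, then key/value pairs, with value type 8 meaning string). The helper names `put_str`, `minimal_gguf`, and `write_fixture` are hypothetical, and the fixtures actually used for this PR may have been built differently or contained more metadata.

```rust
use std::path::Path;

/// Serialize a GGUF string: u64 little-endian length followed by the UTF-8 bytes.
fn put_str(buf: &mut Vec<u8>, s: &str) {
    buf.extend_from_slice(&(s.len() as u64).to_le_bytes());
    buf.extend_from_slice(s.as_bytes());
}

/// Build a GGUF v3 image with zero tensors and a single metadata entry,
/// `general.architecture = <arch>`.
fn minimal_gguf(arch: &str) -> Vec<u8> {
    // Value-type tag for strings in the GGUF metadata encoding.
    const GGUF_TYPE_STRING: u32 = 8;

    let mut buf = Vec::new();
    buf.extend_from_slice(b"GGUF"); // magic
    buf.extend_from_slice(&3u32.to_le_bytes()); // format version
    buf.extend_from_slice(&0u64.to_le_bytes()); // tensor count
    buf.extend_from_slice(&1u64.to_le_bytes()); // metadata key/value count
    put_str(&mut buf, "general.architecture");
    buf.extend_from_slice(&GGUF_TYPE_STRING.to_le_bytes());
    put_str(&mut buf, arch);
    buf
}

/// Write the fixture to disk so it can be passed to the server with `-f`.
fn write_fixture(path: &Path, arch: &str) -> std::io::Result<()> {
    std::fs::write(path, minimal_gguf(arch))
}
```

Calling `write_fixture` once with "definitely_unknown_arch" and once with "gemma4" would produce two files analogous to the synthetic fixtures listed above. Whether a loader reaches the architecture check with such a minimal file depends on the order in which it reads metadata.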

Base commit tested: 2d4ba4f16f61e5e18be085d0dd137bc95cba038a

cargo build -p mistralrs-server --release
RUST_BACKTRACE=1 timeout 120 ./target/release/mistralrs-server --cpu --port 19063 gguf \
  -m /path/to/models \
  -f gemma-4-26B-A4B-it-UD-Q4_K_S.gguf

Base result: exit 101 with a Rust panic on the real issue fixture:

thread 'main' panicked at mistralrs-core/src/gguf/content.rs:151:22:
called `Result::unwrap()` on an `Err` value: Unknown GGUF architecture `gemma4`

The synthetic unknown and synthetic gemma4 fixtures also exited 101 with the same panic path.

PR head tested: f09a818633b6f8815cebe9f5bb39b7964b983f61

Same real-fixture command result: exit 1 with a normal runtime error and no Rust panic:

Error: Unknown GGUF architecture `gemma4`

The synthetic unknown and synthetic gemma4 fixtures also returned ordinary errors. With RUST_BACKTRACE=1, anyhow may still print an error stack; the important before/after difference is that the Rust panic/backtrace is gone and the unsupported architecture is surfaced as a runtime error.
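
The exit-code distinction above (101 for a Rust panic on the main thread, 1 for an error returned from main) can be checked mechanically. The sketch below is one possible harness, not part of this PR: `Outcome`, `classify`, and `run_fixture` are hypothetical names, the server flags are copied from the command shown earlier, and the `timeout` wrapper is omitted for brevity.

```rust
use std::process::Command;

/// How a single server invocation ended, judged only by its exit code.
#[derive(Debug, PartialEq)]
enum Outcome {
    Success,
    RuntimeError,
    Panic,
    Other(Option<i32>),
}

/// Map a process exit code to an outcome. A Rust panic on the main thread
/// exits with 101; an `Err` returned from `main` exits with 1.
fn classify(code: Option<i32>) -> Outcome {
    match code {
        Some(0) => Outcome::Success,
        Some(1) => Outcome::RuntimeError,
        Some(101) => Outcome::Panic,
        other => Outcome::Other(other),
    }
}

/// Run the server binary against one GGUF file and classify the result.
/// `server` is the path to the built binary, e.g. ./target/release/mistralrs-server.
fn run_fixture(server: &str, model_dir: &str, file: &str) -> std::io::Result<Outcome> {
    let status = Command::new(server)
        .args(["--cpu", "--port", "19063", "gguf", "-m", model_dir, "-f", file])
        .status()?;
    Ok(classify(status.code()))
}
```

Under this scheme the base commit would be expected to yield `Outcome::Panic` and the PR head `Outcome::RuntimeError` for the same unsupported-architecture fixture.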

This is ACTUAL before/after validation for the #2098 panic-to-error behavior.

@glaziermag glaziermag marked this pull request as ready for review April 14, 2026 00:14
@github-actions

github-actions Bot commented Apr 16, 2026

Code Metrics Report
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 Language              Files        Lines         Code     Comments       Blanks
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 C Header                  5          305          210           52           43
 CSS                       3          281          252            5           24
 CUDA                     59        17661        13824         1637         2200
 Dockerfile                1           38           21            8            9
 HTML                      2           27           27            0            0
 JavaScript                3          392          387            2            3
 Jinja2                    7          694          656            5           33
 JSON                     25         9346         9343            0            3
 Makefile                  1            6            5            0            1
 MDX                       1          147            0          132           15
 Metal Shading Lan|       31        11647         9007         1064         1576
 PowerShell                1          300          227           30           43
 Python                  129         9969         8194          456         1319
 Shell                     2          489          331           96           62
 Plain Text                3         3723            0         2413         1310
 TOML                     27         1309         1145           36          128
 TypeScript               11         1607         1371           66          170
 YAML                      3           25           23            2            0
─────────────────────────────────────────────────────────────────────────────────
 Jupyter Notebooks         3          122           83           23           16
 |- Markdown               1           60           30           22            8
 |- Python                 1          122          113            1            8
 (Total)                              304          226           46           32
─────────────────────────────────────────────────────────────────────────────────
 Markdown                119         8232            0         5591         2641
 |- BASH                  52          491          432           34           25
 |- Dockerfile             2            5            5            0            0
 |- JSON                  16          582          582            0            0
 |- PowerShell             3            5            5            0            0
 |- Python                22          687          604            5           78
 |- Rust                  13          415          362            1           52
 |- TOML                   9          107           83            3           21
 |- YAML                   1            9            9            0            0
 (Total)                            10533         2082         5634         2817
─────────────────────────────────────────────────────────────────────────────────
 Rust                    571       245656       216375         6437        22844
 |- Markdown             379         9235          452         7653         1130
 (Total)                           254891       216827        14090        23974
─────────────────────────────────────────────────────────────────────────────────
 Svelte                   18         1831         1696           50           85
 |- CSS                    1            4            4            0            0
 |- JavaScript            18          876          727           24          125
 (Total)                             2711         2427           74          210
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 Total                  1025       326405       266585        25848        33972
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

@glaziermag glaziermag force-pushed the fix-gguf-architecture-panic branch from df434ef to a6145ee Compare April 17, 2026 00:45
@glaziermag

Contributor Author

Housekeeping note: This branch currently bundles the CI fix from #2115 (.typos.toml, openapi_doc.rs, distributed/layers.rs). Once #2115 is merged, this branch will need a rebase onto updated master to drop the duplicate CI fix commit and resolve the resulting conflicts.

@glaziermag glaziermag force-pushed the fix-gguf-architecture-panic branch from a6145ee to c6eb266 Compare April 18, 2026 02:31
@glaziermag glaziermag force-pushed the fix-gguf-architecture-panic branch from c6eb266 to 9829e1a Compare April 27, 2026 16:47
@glaziermag glaziermag marked this pull request as draft May 5, 2026 19:16
@glaziermag

glaziermag commented May 18, 2026

Contributor Author

Wave 1 evidence bundle update:

PR: #2106
Linked issue: #2098
Base SHA: 2d4ba4f16f61e5e18be085d0dd137bc95cba038a
Current PR-head SHA: f09a818633b6f8815cebe9f5bb39b7964b983f61
Fixed-head SHA, if changed: N/A

Exact commands recorded in the PR body:

cargo build -p mistralrs-server --release
RUST_BACKTRACE=1 timeout 120 ./target/release/mistralrs-server --cpu --port 19063 gguf \
  -m /path/to/models \
  -f gemma-4-26B-A4B-it-UD-Q4_K_S.gguf

Synthetic unknown-architecture and synthetic gemma4 GGUF fixtures were also exercised.

Environment: GCP a2-highgpu-1g A100 host used as cloud host, Rust 1.95.0; commands used --cpu, so GPU behavior is not material.
A100 category: A100_HOST_OPTIONAL.
Base result: real and synthetic unsupported-architecture GGUFs exited 101 with Rust panic from content.rs unwrap/expect path.
Current PR-head result: same fixtures returned ordinary runtime errors with no Rust panic.
Tests added/changed: GGUF unsupported-architecture error propagation path.
Tests passed: real Gemma 4 GGUF fixture and synthetic unknown/gemma4 fixtures per PR body.
Side-effect controls: does not add Gemma 4 support; only changes panic-to-error behavior.
Raw logs/artifacts: PR body contains command list and panic/error excerpts; no separate raw log file is attached in this comment.
Remaining risks: standalone raw command logs are still useful if reviewers require downloadable artifacts.
Can say “Fixes #issue”: yes for #2098 panic-to-error behavior.
Safe wording: “Fixes #2098 by returning a normal unsupported-GGUF-architecture error instead of panicking; does not add Gemma 4 support.”
Readiness status: ready-now if PR-body log excerpts are accepted; otherwise standalone raw command logs remain to attach.

@glaziermag glaziermag marked this pull request as ready for review May 18, 2026 23:38
@glaziermag

Contributor Author

Marked ready for maintainer review. The validation evidence and narrowed claim wording attached in the PR body and discussion show the original/base failure and the current-head pass under the scoped issue conditions. The PR may use the Fixes #... wording already present in the body.

@glaziermag glaziermag force-pushed the fix-gguf-architecture-panic branch 2 times, most recently from 09505f4 to 173da6c Compare May 20, 2026 02:25
@glaziermag glaziermag force-pushed the fix-gguf-architecture-panic branch from 173da6c to 523993f Compare May 20, 2026 21:14

Development

Successfully merging this pull request may close these issues.

GGUF pipeline panics on gemma4 architecture instead of returning an error

1 participant