feat(catalog): add vertex_ai/gemini-3.5-flash to bundled model catalog #1138

Open
Serhan-Asad wants to merge 1 commit into main from fix/issue-1136-gemini-3.5-flash

@Serhan-Asad
Collaborator

Summary

  • Adds a Google Vertex AI,vertex_ai/gemini-3.5-flash row to pdd/data/llm_model.csv. Pricing $1.50/$9.00 matches Google's published Vertex rate and the pdd_cloud worker catalog (commit 6e08eb488b). ELO 1442 matches pdd_cloud. The preview row (vertex_ai/gemini-3-flash-preview) is kept for backward compatibility.
  • Adds vertex_ai/gemini-3.5-flash to _MANDATORY_MODEL_ROWS in pdd/generate_model_catalog.py so that catalog regeneration preserves the row even if LiteLLM's dedup removes it. Also adds "gemini-3.5-flash": 1442 to STATIC_ELO_FALLBACK so the row passes the ELO_CUTOFF guard in the mandatory-row seeder.
  • Four new tests (2 catalog, 2 llm_invoke) pin the fix; 344 existing tests all pass.
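
A minimal sketch of how the mandatory-row seeding and the ELO guard described above could interact. The names _MANDATORY_MODEL_ROWS, STATIC_ELO_FALLBACK, and ELO_CUTOFF come from this PR; their shapes, the cutoff value, and the helpers resolve_elo and seed_mandatory_rows are assumptions for illustration, not the actual code in pdd/generate_model_catalog.py.

```python
# Hypothetical shapes; the real structures in pdd/generate_model_catalog.py may differ.
ELO_CUTOFF = 1300  # assumed value, for illustration only

STATIC_ELO_FALLBACK = {"gemini-3.5-flash": 1442}

_MANDATORY_MODEL_ROWS = [
    {"provider": "Google Vertex AI", "model": "vertex_ai/gemini-3.5-flash"},
]


def resolve_elo(model_id, fallback):
    """Look up an ELO by the bare model name (the text after the last '/')."""
    return fallback.get(model_id.rsplit("/", 1)[-1])


def seed_mandatory_rows(rows, mandatory, fallback, cutoff):
    """Append mandatory rows missing from `rows`, skipping any that fail the cutoff."""
    present = {row["model"] for row in rows}
    seeded = list(rows)
    for entry in mandatory:
        if entry["model"] in present:
            continue
        elo = resolve_elo(entry["model"], fallback)
        if elo is None or elo < cutoff:
            continue  # with no fallback ELO, the mandatory row is dropped here
        seeded.append({**entry, "elo": elo})
    return seeded


# Upstream data that lacks the GA row, similar in spirit to the fake litellm in the tests.
upstream_rows = [
    {"provider": "Google Vertex AI", "model": "vertex_ai/gemini-3-flash-preview"},
]

with_fallback = seed_mandatory_rows(
    upstream_rows, _MANDATORY_MODEL_ROWS, STATIC_ELO_FALLBACK, ELO_CUTOFF
)
without_fallback = seed_mandatory_rows(
    upstream_rows, _MANDATORY_MODEL_ROWS, {}, ELO_CUTOFF
)
```

Under these assumptions, with_fallback gains the vertex_ai/gemini-3.5-flash row while without_fallback does not, which is why the STATIC_ELO_FALLBACK entry has to land together with the _MANDATORY_MODEL_ROWS entry.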

Why

pdd_cloud runs with LLM_INVOKE_DEFAULT_MODEL=vertex_ai/gemini-3.5-flash (pdd_cloud origin/main). Without this row, a local PDD_MODEL_DEFAULT=vertex_ai/gemini-3.5-flash run falls through to the surrogate-base path. Under the provider lock from PR #1115, the surrogate becomes vertex_ai/claude-opus-4-7 (first Google Vertex AI row) — wrong model, wrong pricing, wrong ELO for strength interpolation, wrong temperature-clamp behavior.
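
The failure mode above can be sketched as follows. This is an illustration only: pick_base_model and the row layout are hypothetical stand-ins for the selection logic in llm_invoke, and the only facts taken from this PR are the model ids and that vertex_ai/claude-opus-4-7 is the first Google Vertex AI row.

```python
# Hypothetical catalog rows in file order; the real CSV has more rows and columns.
CATALOG_BEFORE = [
    {"provider": "Google Vertex AI", "model": "vertex_ai/claude-opus-4-7"},
    {"provider": "Google Vertex AI", "model": "vertex_ai/gemini-3-flash-preview"},
]

# After this PR, the GA row is present as well.
CATALOG_AFTER = CATALOG_BEFORE + [
    {"provider": "Google Vertex AI", "model": "vertex_ai/gemini-3.5-flash"},
]


def pick_base_model(requested, catalog):
    """Return (row, how): an exact match, else the first row sharing the provider prefix."""
    for row in catalog:
        if row["model"] == requested:
            return row, "direct"
    prefix = requested.split("/", 1)[0]
    for row in catalog:
        if row["model"].split("/", 1)[0] == prefix:
            return row, "surrogate"  # provider-locked fallback
    raise LookupError(f"no catalog row for provider prefix {prefix!r}")


requested = "vertex_ai/gemini-3.5-flash"
before_row, before_how = pick_base_model(requested, CATALOG_BEFORE)
after_row, after_how = pick_base_model(requested, CATALOG_AFTER)
```

In this sketch the request resolves to the vertex_ai/claude-opus-4-7 surrogate before the change and directly to the GA row after it.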

Test plan

  • test_committed_csv_includes_vertex_gemini_3_5_flash_ga_default — CSV substring assertion
  • test_build_rows_includes_vertex_gemini_3_5_flash_ga_default — seeded via _MANDATORY_MODEL_ROWS with fake litellm that lacks the row
  • TestSelectModelCandidates::test_issue_1136_vertex_gemini_3_5_flash_resolves_directly — resolves to GA row, not vertex_ai/claude-opus-4-7 surrogate trap
  • TestSelectModelCandidates::test_issue_1136_gemini_3_5_flash_in_gemini_3_family_clamp — temperature clamp fires for GA id
  • pytest tests/test_generate_model_catalog.py tests/test_llm_invoke.py tests/test_llm_invoke_csv_model_registration.py tests/test_update_model_costs.py — 344 passed
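
For reviewers who want a stricter check than a substring match, a row-level assertion could look roughly like the sketch below. The column names (provider, model, input, output, coding_arena_elo) and the inline CSV text are assumptions for illustration; the real header in pdd/data/llm_model.csv may differ, and the preview row's values are intentionally left blank here.

```python
import csv
import io

# Inline stand-in for pdd/data/llm_model.csv; header names are assumed.
CSV_TEXT = """provider,model,input,output,coding_arena_elo
Google Vertex AI,vertex_ai/gemini-3-flash-preview,,,
Google Vertex AI,vertex_ai/gemini-3.5-flash,1.50,9.00,1442
"""


def find_row(csv_text, model_id):
    """Return the first parsed row whose model column equals model_id, or None."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["model"] == model_id:
            return row
    return None


ga_row = find_row(CSV_TEXT, "vertex_ai/gemini-3.5-flash")
preview_row = find_row(CSV_TEXT, "vertex_ai/gemini-3-flash-preview")
```

Parsing the row lets a test pin the provider, pricing, and ELO fields individually, whereas a substring assertion only proves that the model id appears somewhere in the file.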

Closes #1136.

🤖 Generated with Claude Code

pdd_cloud runs with LLM_INVOKE_DEFAULT_MODEL=vertex_ai/gemini-3.5-flash
(commit 6e08eb488b in pdd_cloud). Without this row in the bundled CSV,
PDD_MODEL_DEFAULT=vertex_ai/gemini-3.5-flash falls through to the
surrogate-base path. Under the provider lock from PR #1115, the surrogate
becomes vertex_ai/claude-opus-4-7 (the first Google Vertex AI row) —
wrong model, wrong pricing, wrong ELO for strength interpolation.

Changes:
- pdd/data/llm_model.csv: add Google Vertex AI,vertex_ai/gemini-3.5-flash
  row. Pricing $1.50/$9.00 matches Google's published Vertex rate and the
  pdd_cloud worker catalog. ELO 1442 matches the pdd_cloud catalog.
  Preview row (vertex_ai/gemini-3-flash-preview) is kept for users who
  pin it.
- pdd/generate_model_catalog.py: add vertex_ai/gemini-3.5-flash to
  _MANDATORY_MODEL_ROWS so catalog regeneration preserves the row even
  if LiteLLM's model_cost dedup removes it; add "gemini-3.5-flash": 1442
  to STATIC_ELO_FALLBACK so the mandatory-row ELO resolver passes the
  ELO_CUTOFF guard.
- tests/test_generate_model_catalog.py: pin both the committed-CSV
  assertion and the build_rows() seeding path (verified against fake
  litellm with no gemini-3.5-flash entry).
- tests/test_llm_invoke.py: regression that vertex_ai/gemini-3.5-flash
  resolves directly to the GA row (not the claude-opus-4-7 surrogate
  trap); pin that the model is in the Gemini 3 temperature-clamp family.

Closes #1136.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Add Gemini 3.5 Flash to the bundled LLM model catalog
