[FIX] add guidellm preflight check and pin vllm to <=0.19 #17

Open
VincentG1234 wants to merge 5 commits into main from fix/guidellm-cli-preflight

@VincentG1234 VincentG1234 commented Apr 28, 2026

Collaborator

Problem

GuideLLM does not launch when the vLLM server starts, which makes every trial fail. The error in #15 seems to come from vllm rather than transformers.
#19

Summary

  • Add a preflight check in the GuideLLM benchmark provider to fail fast when the GuideLLM CLI cannot start.
  • Validate guidellm benchmark --help before launching the benchmark process, with a short timeout.
  • Return a clear error with captured CLI output when preflight fails, instead of waiting for benchmark timeout.
  • Pin vllm to <=0.19 in pyproject.toml to avoid unsupported behavior with 0.20.
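As a rough illustration of the preflight idea in the list above, here is a minimal standalone sketch. It is not the code from this PR: `PreflightError` and `preflight_check` are hypothetical names, and the 30-second default is an assumption taken from the review discussion. The sketch runs a cheap command with a timeout and turns the three failure modes (missing executable, hang, non-zero exit) into one error that carries the captured output.

```python
import subprocess
from typing import Optional, Sequence


class PreflightError(RuntimeError):
    """Hypothetical error type: the CLI could not complete a trivial command."""


def preflight_check(
    cmd: Sequence[str],
    env: Optional[dict] = None,
    timeout: float = 30.0,  # assumed default, not confirmed by the PR
) -> None:
    """Run a cheap command and raise PreflightError if it cannot complete."""
    printable = " ".join(cmd)
    try:
        result = subprocess.run(
            list(cmd),
            capture_output=True,  # keep stdout/stderr for the error message
            text=True,
            timeout=timeout,
            env=env,  # None means "inherit the current environment"
        )
    except FileNotFoundError as exc:
        raise PreflightError(f"executable not found: {cmd[0]}") from exc
    except subprocess.TimeoutExpired as exc:
        raise PreflightError(
            f"{printable} did not finish within {timeout} s"
        ) from exc

    if result.returncode != 0:
        # Import errors usually land on stderr; fall back to stdout if empty.
        output = (result.stderr or result.stdout).strip()
        raise PreflightError(
            f"{printable} exited with code {result.returncode}:\n{output}"
        )
```

Under these assumptions, a provider could call `preflight_check(["guidellm", "benchmark", "--help"], env=env)` just before spawning the benchmark process, so that a broken install fails the trial in seconds rather than at the benchmark timeout.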

Why

Some trials were timing out because GuideLLM could crash at startup due to dependency/import issues. This change detects startup failures immediately and surfaces actionable diagnostics, while constraining vllm to the supported version range.

Signed-off-by: Vincent Gimenes <vincent.gimenes@gmail.com>
@VincentG1234 VincentG1234 marked this pull request as ready for review April 29, 2026 08:09
@VincentG1234 VincentG1234 requested a review from hgsmn April 29, 2026 08:09
Comment thread auto_tune_vllm/benchmarks/providers.py Outdated
        return self._process

    def _validate_guidellm_cli(self, env: dict[str, str]) -> None:
        """Fail fast if GuideLLM cannot start due to dependency/import issues."""

Collaborator

Is this the only possible error when launching guidellm --help?

Collaborator Author

No, but it is an easy way to catch problems early.

Comment thread auto_tune_vllm/benchmarks/providers.py Outdated
["guidellm", "benchmark", "--help"],
capture_output=True,
text=True,
timeout=15,

Collaborator

Isn't 15 seconds too short?

Collaborator Author

Yes, I meant to set it to 30 seconds. I will change it.

Comment thread pyproject.toml
@VincentG1234 VincentG1234 force-pushed the fix/guidellm-cli-preflight branch from 75ec8cb to 30d5933 Compare May 12, 2026 14:28
VincentG1234 added a commit that referenced this pull request May 20, 2026
## Summary
Add structured documentation for human maintainers and coding agents:
`AGENTS.md`, `.ai/context/`, `.ai/skills/`, and a user-facing
architecture guide with Mermaid diagrams. Link the new doc from README
and quick start. Align the local example YAML with vLLM V1 constraints.

## Why
The InseeFrLab fork needs a stable onboarding path for contributors and
agents (local backend focus, safe commands, known issues/PRs) without
reading the whole codebase. Architecture was previously only implicit in
code; diagrams improve onboarding and reviews.

## What changed
- `AGENTS.md` — entry point for agents (context files, skills, safe
commands).
- `.ai/context/` — repo map, execution flow, history, known issues,
current work snapshot, external links.
- `.ai/skills/` — pr-writer, pr-reviewer, test-writer, docs-writer,
architecture-diagrams.
- `docs/architecture.md` — end-to-end, layout, orchestration, trial
lifecycle, outputs (Mermaid).
- `README.md`, `docs/quick_start.md` — links to architecture doc.
- `examples/study_config_local_exec.yaml` — disable
`max_num_partial_prefills` (unsupported on V1; comment added).

## How tested
- [x] `ruff check .`
- [x] `pytest -v tests/`
- [x] Manual E2E (maintainer): not required (docs-only PR)

## Risks / limitations
- `.ai/context/current-work.md` is a point-in-time snapshot (open PRs
#13, #17, #21, #22); it will drift until refreshed after merges.
- Mermaid rendering depends on the viewer (GitHub, IDE); no runtime
behavior change except the example YAML default.

## Links
- (none — no issue closed)

Signed-off-by: Vincent Gimenes <vincent.gimenes@gmail.com>