[FEAT]: add warmup and cooldown for GuideLLM runs by VincentG1234 · Pull Request #24 · InseeFrLab/auto-tuning-vllm

VincentG1234 · 2026-05-20T14:22:48Z

Summary

Expose optional GuideLLM warmup and cooldown settings in the study YAML so that each trial (baseline and optimization) can exclude its cold-start and shutdown phases from reported metrics. This reduces benchmark variance without changing Optuna objectives or the trial lifecycle.

Why

Cold GPU/KV cache at the start of each benchmark run increases metric variance across trials. GuideLLM already supports --warmup and --cooldown; auto-tuning-vllm did not pass them through. Users had no config-level way to stabilize measurements.

What changed

auto_tune_vllm/benchmarks/config.py — add optional warmup / cooldown fields; validate values (> 0, fractional sum < 1).
auto_tune_vllm/benchmarks/providers.py — forward flags to guidellm benchmark when set.
docs/configuration.md — document fields, measured-duration note, production example comments.
examples/study_config.yaml, examples/study_config_minimal.yaml — commented example lines.
tests/benchmarks/test_guidellm_command.py — unit tests for CLI args and validation (no GPU).
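
For readers who want a feel for the validation and flag forwarding described above, here is a minimal sketch. It is not the code in `config.py` or `providers.py`: the class name `BenchmarkSettings` and the helper `guidellm_phase_args` are hypothetical, and the treatment of values below 1 as fractions of the run is an assumption inferred from the "fractional sum < 1" rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BenchmarkSettings:
    """Hypothetical stand-in for the project's benchmark config."""

    max_seconds: int = 240
    warmup: Optional[float] = None
    cooldown: Optional[float] = None

    def __post_init__(self) -> None:
        # Each field is optional, but must be strictly positive when set.
        for name in ("warmup", "cooldown"):
            value = getattr(self, name)
            if value is not None and value <= 0:
                raise ValueError(f"{name} must be > 0, got {value}")

        # Assumption: values below 1 are fractions of the run, so two
        # fractions must leave a non-empty measurement window.
        both_fractional = (
            self.warmup is not None
            and self.cooldown is not None
            and self.warmup < 1
            and self.cooldown < 1
        )
        if both_fractional and self.warmup + self.cooldown >= 1:
            raise ValueError("fractional warmup + cooldown must sum to < 1")


def guidellm_phase_args(settings: BenchmarkSettings) -> List[str]:
    """Return the extra CLI arguments, emitting a flag only when it is set."""
    args: List[str] = []
    if settings.warmup is not None:
        args += ["--warmup", str(settings.warmup)]
    if settings.cooldown is not None:
        args += ["--cooldown", str(settings.cooldown)]
    return args
```

Leaving both fields unset yields an empty list, so existing study configs would produce the same `guidellm benchmark` command as before.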

How tested

ruff check .
pytest -v tests/ (60 passed, including 7 new benchmark tests)
Manual E2E (maintainer): auto-tune-vllm optimize with benchmark.warmup: 0.1 / cooldown: 0.1 and verify GuideLLM CLI receives flags and trials complete

Risks / limitations

Requires a recent GuideLLM install that supports --warmup / --cooldown on guidellm benchmark (pyproject.toml still only requires guidellm>=0.1.0, so older CLIs may install cleanly and then fail at runtime).
Warmup/cooldown share the same max_seconds budget; users should increase max_seconds if they need a longer steady-state measurement window.
No change to trial timeout logic (max_seconds * 1.5); acceptable because GuideLLM applies warmup/cooldown inside the same max_seconds run, so total run time does not grow.
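
The budget trade-off above can be made concrete with a little arithmetic. The sketch below assumes fractional warmup and cooldown are taken against `max_seconds`; the function names are hypothetical and only the `max_seconds * 1.5` timeout factor comes from this PR.

```python
def measured_window_seconds(
    max_seconds: float, warmup: float = 0.0, cooldown: float = 0.0
) -> float:
    """Seconds of steady-state measurement left after excluding both phases.

    Assumption: warmup and cooldown are fractions (0 <= value < 1) of
    max_seconds, and together they must leave a non-empty window.
    """
    if not (0 <= warmup < 1 and 0 <= cooldown < 1):
        raise ValueError("this sketch only handles fractional values in [0, 1)")
    if warmup + cooldown >= 1:
        raise ValueError("warmup + cooldown must sum to < 1")
    return max_seconds * (1 - warmup - cooldown)


def trial_timeout_seconds(max_seconds: float) -> float:
    """Trial timeout as described in this PR: unchanged by warmup/cooldown."""
    return max_seconds * 1.5


if __name__ == "__main__":
    # With 10% warmup and 10% cooldown, a 300 s run measures about 240 s.
    print(measured_window_seconds(300, warmup=0.1, cooldown=0.1))
    print(trial_timeout_seconds(300))
```

Under these assumptions, keeping a 240-second measurement window with 10% warmup and 10% cooldown means raising `max_seconds` to 300, which also raises the trial timeout to 450 seconds.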

Branch: feat/guidellm-warmup-cooldown

Signed-off-by: Vincent Gimenes <vincent.gimenes@gmail.com>

## Summary

Add optional `benchmark.rampup` to study configs, forwarding GuideLLM's `--rampup` flag so concurrent benchmarks can ramp load linearly to target concurrency instead of starting at full rate.

## Why

Sudden full-concurrency load can skew benchmark metrics (cold caches, queue buildup, OOM risk). GuideLLM supports a ramp-up period; this fork already exposes `warmup` and `cooldown` but not `rampup`, so users could not control how load increases at the start of a run.

## What changed

- `auto_tune_vllm/benchmarks/config.py` — add optional `rampup` field; validate `> 0` in `__post_init__`
- `auto_tune_vllm/benchmarks/providers.py` — pass `--rampup` to GuideLLM CLI when set
- `tests/benchmarks/test_guidellm_command.py` — CLI construction and validation tests for rampup
- `docs/configuration.md` — document `rampup` (seconds, included in metrics, unlike warmup)
- `examples/study_config.yaml`, `examples/study_config_minimal.yaml` — commented example
- `README.md` — fork changelog entry

## How tested

- [x] `ruff check .`
- [x] `pytest -v tests/benchmarks/test_guidellm_command.py` (11 passed)
- [ ] Manual E2E (maintainer): `auto-tune-vllm optimize` with `benchmark.rampup: 10` and verify GuideLLM receives `--rampup 10`

## Risks / limitations

- Requires a GuideLLM version that supports `--rampup` (same pattern as existing warmup/cooldown flags).
- Ramp-up requests are included in reported metrics; only `warmup`/`cooldown` exclude phases from measurement.
- No interaction with fractional warmup/cooldown sum validation (rampup is always absolute seconds).

## Links

- Follows [#24](#24) (warmup/cooldown) and [#27](#27) (sample_requests) benchmark config extensions.

Signed-off-by: Vincent Gimenes <vincent.gimenes@gmail.com>
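
To illustrate what "ramp load linearly to target concurrency" means in practice, here is a small sketch. It is not GuideLLM's scheduler: `ramped_concurrency` is a hypothetical helper, and the choices to round up and to start from at least one stream are assumptions made for the example.

```python
import math


def ramped_concurrency(elapsed: float, rampup: float, target: int) -> int:
    """Concurrency allowed `elapsed` seconds into a run with a linear ramp.

    Illustrative only: the limit grows in proportion to elapsed time and
    reaches `target` once `rampup` seconds have passed.
    """
    if rampup <= 0:
        raise ValueError("rampup must be > 0")
    if elapsed >= rampup:
        return target
    # Round up and keep at least one stream so the run can start at t=0.
    return max(1, math.ceil(target * elapsed / rampup))


if __name__ == "__main__":
    # Sample the ramp for rampup=10 s and a target of 32 concurrent streams.
    for t in (0, 2.5, 5, 10, 20):
        print(t, ramped_concurrency(t, rampup=10, target=32))
```

Because requests sent during the ramp still count toward reported metrics, a long ramp relative to `max_seconds` would weight the results toward lower-concurrency behaviour.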

feat(benchmark): add warmup and cooldown for GuideLLM runs

c1dc97d

Signed-off-by: Vincent Gimenes <vincent.gimenes@gmail.com>

VincentG1234 merged commit 5841e59 into main May 20, 2026
7 checks passed

VincentG1234 deleted the FEAT/guidellm-warmup-cooldown branch May 20, 2026 16:47

VincentG1234 mentioned this pull request Jun 11, 2026

[FEAT] add rampup for GuideLLM concurrent benchmarks #31

Merged

3 tasks

VincentG1234 merged 1 commit into main from FEAT/guidellm-warmup-cooldown
