[FEAT] Add optimization.n_repeats for repeated benchmark runs by VincentG1234 · Pull Request #28 · InseeFrLab/auto-tuning-vllm

VincentG1234 · 2026-05-28T17:18:04Z

Summary

Add optimization.n_repeats (default 1) to run multiple GuideLLM benchmarks per Optuna trial configuration on the same vLLM server, report mean objectives to Optuna, and expose repeat spread in Optuna user attributes for log_metrics.
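The flow described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `run_trial_with_repeats` and the `run_benchmark` callable are hypothetical names, and the benchmark is assumed to return a flat dict of metric name to float.

```python
from statistics import fmean
from typing import Callable, Dict, List


def run_trial_with_repeats(
    run_benchmark: Callable[[], Dict[str, float]],  # hypothetical: one benchmark run
    n_repeats: int = 1,
) -> Dict[str, object]:
    """Run the benchmark n_repeats times against one running server and average."""
    if n_repeats < 1:
        raise ValueError("n_repeats must be >= 1")

    runs: List[Dict[str, float]] = []
    for _ in range(n_repeats):
        # An exception from any repeat propagates, so the whole trial fails.
        runs.append(run_benchmark())

    # Mean of each metric across repeats; this is what the sampler would see.
    means = {name: fmean(run[name] for run in runs) for name in runs[0]}

    result: Dict[str, object] = {"objectives": means}
    if n_repeats > 1:
        # Per-run values are kept only when there is more than one run.
        result["detailed_metrics"] = {"repeats": runs}
    return result
```

With `n_repeats=1` the result carries only the single run's values as the objectives, which matches the default behaviour described above.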

Why

Benchmark results are noisy. Repeating each configuration a few times (typically 2–3) reduces variance in the objective reported to the sampler, without changing the meaning of n_trials (still one unique config per trial).
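To make the variance argument concrete, here is a small simulation. It assumes independent Gaussian noise around a fixed true value, which is a simplification of real benchmark noise; the function names and the numbers (true value 100, noise 5) are illustrative only.

```python
import random
from statistics import fmean, stdev

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def noisy_benchmark(true_value: float = 100.0, noise: float = 5.0) -> float:
    """Simulated benchmark result: the true value plus Gaussian noise."""
    return rng.gauss(true_value, noise)


def objective_spread(n_repeats: int, n_trials: int = 2000) -> float:
    """Standard deviation of the mean objective over many simulated trials."""
    means = [
        fmean(noisy_benchmark() for _ in range(n_repeats))
        for _ in range(n_trials)
    ]
    return stdev(means)


spread_1 = objective_spread(1)  # one run per trial
spread_3 = objective_spread(3)  # mean of three runs per trial
print(f"n_repeats=1: {spread_1:.2f}, n_repeats=3: {spread_3:.2f}")
```

Under these assumptions the spread of the reported objective shrinks by roughly a factor of the square root of `n_repeats`, so three repeats cut it to a bit under 60% of the single-run spread, at three times the benchmarking cost per trial.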

What changed

auto_tune_vllm/core/config.py — add n_repeats: int = 1 and validation (>= 1)
auto_tune_vllm/execution/trial_controller.py — loop benchmark runs after a single vLLM startup; aggregate objectives/metrics by mean; store per-run values under detailed_metrics.repeats when n_repeats > 1; fail the whole trial if any repeat fails
auto_tune_vllm/core/study_controller.py — for log_metrics only, when n_repeats > 1, write metric_<name>, metric_<name>_rel_range, metric_<name>_values, and n_repeats as Optuna user attrs
docs/configuration.md — document n_repeats and repeat-related log_metrics attrs
examples/study_config.yaml — minimal smoke-test config (n_trials: 3, n_repeats: 2, baseline disabled)
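The user attributes listed for study_controller.py could be derived from the per-run values along these lines. This is a sketch under stated assumptions: `repeat_user_attrs` is a hypothetical helper, and the PR text does not define `rel_range`, so it is assumed here to be (max - min) / |mean|.

```python
from statistics import fmean
from typing import Dict, List


def repeat_user_attrs(values_by_metric: Dict[str, List[float]]) -> Dict[str, object]:
    """Build the repeat-related attributes for each logged metric."""
    attrs: Dict[str, object] = {}
    n_repeats = 0
    for name, values in values_by_metric.items():
        mean = fmean(values)
        # ASSUMPTION: relative range = (max - min) / |mean|; not defined in the PR.
        rel_range = (max(values) - min(values)) / abs(mean) if mean else 0.0
        attrs[f"metric_{name}"] = mean
        attrs[f"metric_{name}_rel_range"] = rel_range
        attrs[f"metric_{name}_values"] = list(values)
        n_repeats = len(values)
    attrs["n_repeats"] = n_repeats
    return attrs


attrs = repeat_user_attrs({"ttft_ms": [90.0, 110.0]})
```

Each key/value pair could then be stored on the trial with Optuna's `trial.set_user_attr(key, value)`, which is what makes the values visible in Optuna Dashboard.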

How tested

ruff check .
pytest -v tests/ (60 passed)
Manual E2E (maintainer): auto-tune-vllm optimize --config examples/study_config.yaml, then inspect ./optuna_studies/n_repeats_smoke_test/study.db in Optuna Dashboard for metric_*_rel_range / metric_*_values

Signed-off-by: Vincent Gimenes <vincent.gimenes@gmail.com>

[FEAT] Add optimization.n_repeats for repeated benchmark runs

ee2db53


VincentG1234 wants to merge 1 commit into main from FEAT/n-repeats-benchmark
