Add Windrose metrics exporter #15
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c639463e7e
- name: WINDROSE_GAME_CPU_LIMIT
  value: {{ .Values.resources.game.limits.cpu | default "" | quote }}
- name: WINDROSE_GAME_MEM_LIMIT
  value: {{ .Values.resources.game.limits.memory | default "" | quote }}
Source game limits from the game container, not sidecar
When .Values.resources.game.limits.* are unset, this sidecar gets empty WINDROSE_GAME_*_LIMIT values and metrics.py falls back to cgroup detection. In the windrose-metrics container, /sys/fs/cgroup/* reflects the sidecar’s own cgroup (for example its default 128Mi memory cap), not the game container’s, so windrose_cpu_limit_millicores / windrose_memory_limit_bytes can be wrong and dashboards/alerts will compute incorrect utilization. This is especially visible for installs that do not set explicit game CPU/memory limits in Helm values.
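The fallback described above can be sketched roughly as follows. This is an illustration only, not the actual code in metrics.py: the function names (`parse_memory_bytes`, `cgroup_memory_limit`, `resolve_memory_limit`, and so on) are hypothetical, and the sketch assumes a cgroup v2 layout with `memory.max` and `cpu.max` under `/sys/fs/cgroup`. Only the `WINDROSE_GAME_CPU_LIMIT` and `WINDROSE_GAME_MEM_LIMIT` variable names come from the diff.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple

# Kubernetes memory quantity suffixes; two-character suffixes are checked first.
_MEM_UNITS = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}
_MEM_SUFFIXES = ("Ki", "Mi", "Gi", "Ti", "k", "M", "G", "T")


def parse_cpu_millicores(value: str) -> Optional[int]:
    """Convert '500m' or '1.5' to millicores; empty input means 'not set'."""
    value = value.strip()
    if not value:
        return None
    if value.endswith("m"):
        return int(value[:-1])
    return int(round(float(value) * 1000))


def parse_memory_bytes(value: str) -> Optional[int]:
    """Convert '128Mi' or '2Gi' to bytes; empty input means 'not set'."""
    value = value.strip()
    if not value:
        return None
    for suffix in _MEM_SUFFIXES:
        if value.endswith(suffix):
            return int(float(value[: -len(suffix)]) * _MEM_UNITS[suffix])
    return int(value)


def cgroup_memory_limit(root: Path) -> Optional[int]:
    """Read the cgroup v2 memory limit of the *calling* container."""
    try:
        raw = (root / "memory.max").read_text().strip()
    except OSError:
        return None
    return None if raw == "max" else int(raw)


def cgroup_cpu_limit_millicores(root: Path) -> Optional[int]:
    """Read the cgroup v2 CPU quota of the *calling* container."""
    try:
        quota, period = (root / "cpu.max").read_text().split()
    except (OSError, ValueError):
        return None
    if quota == "max":
        return None
    return int(quota) * 1000 // int(period)


def resolve_memory_limit(
    env: Mapping[str, str] = os.environ,
    root: Path = Path("/sys/fs/cgroup"),
) -> Tuple[Optional[int], str]:
    """Prefer the explicit env value; otherwise fall back to the local cgroup."""
    explicit = parse_memory_bytes(env.get("WINDROSE_GAME_MEM_LIMIT", ""))
    if explicit is not None:
        return explicit, "env"
    # In a sidecar, this is the sidecar's limit, not the game container's.
    return cgroup_memory_limit(root), "cgroup"
```

Returning the source alongside the value makes the failure mode visible: with an empty `WINDROSE_GAME_MEM_LIMIT`, a sidecar capped at 128Mi would report 134217728 bytes with source `"cgroup"`, regardless of what the game container is allowed to use.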
d8bde2e to 981b1ef
Summary
- metrics.py, runnable as a standalone sidecar or imported by server.py for opt-in /metrics

Validation
- python3 -m py_compile server.py metrics.py scripts/patch-idle-cpu.py tests/test_metrics.py
- python3 tests/test_metrics.py
- python3 tests/test_retention.py
- python3 tests/test_restore.py
- python3 tests/test_auto_backup.py
- python3 tests/test_server_control.py
- python3 tests/test_schema.py
- python3 tests/test_http.py
- bash tests/test_install_env_merge.sh
- bash tests/test_engine_ini_reconcile.sh
- bash tests/test_shadow_stamp.sh
- docker compose config --quiet
- docker compose --profile metrics config --quiet
- helm lint ./helm/windrose
- metrics.port == service.port

Canary dogfood
- registry.local:30500/games/windrose-server:metrics-split-5554b8d
- manual-20260505T020754Z
- 4/4 Running, zero restarts, with windrose-metrics on 28082
- windrose_exporter_scrape_success 1 and build labels for Steam build 23065343
- windrose_exporter_scrape_success{job="windrose-canary"} returns one successful series
- Windrose Server at /d/windrose-server/windrose-server

Note: https://windrose-canary.shipstuff.fun/api/status timed out from this environment; current canary Helm values have ingress.enabled=false, so validation used in-cluster service/API.