feat(ktuner): add deterministic kernel-tuning engine #1278
Merged
4fccd95 to 5051fe6
Collaborator
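@jfeng18 I've reviewed the code and it looks very good; thanks for the contribution. Component integration comes with some documentation and distribution-format requirements. We plan to build these into github-ci and AGENTS.md, but unfortunately that work is not finished yet. @kongche-jbw @ikunkun-sys, please provide some guidance as a follow-up.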
Collaborator
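Overall, the ktuner part already looks complete: the code, README, CHANGELOG, CI scope, and commitlint scope have all been added. Thanks for this large change. There are two small repository-governance items to sync; please consider them:
Neither of these is a code implementation issue. They mainly make the new component easier to discover and maintain from the repository's documentation entry points.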
Add ktuner as a new component for automated kernel parameter tuning. The agent diagnoses system configuration against 207 rules, outputs JSON recommendations, and can safely apply/rollback changes. Subcommands: check, tune, fix, why, rollback. All stdout is JSON. Registered as a cosh skill for automatic agent discovery.
- Add ktuner row to root README component table
- Add ktuner scope to prelint validScopes (title + branch checks)
- Add src/ktuner/CHANGELOG.md ([Unreleased] stub)
- test-ktuner: cargo test --lib -> cargo test (parity, no-op today)

Address review feedback on alibaba#1278 from @kongche-jbw:
- AGENTS.md: add ktuner to component table, dev commands, Rust conventions list, and Scope Inference table
- Add docs/user-guide user-entrypoint/ktuner.md (en + zh) covering diagnose / dry-run / apply / rollback and the permission boundary
- Link ktuner from the user-guide index (en + zh)
a807e3d to 8bf35d6
Collaborator
Author
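Thanks for the review and the merge! 🎉 Both suggestions, AGENTS.md and the user guide, have been addressed as you proposed. I will keep following up on the first-run auto-check (#1279) and the packaging and distribution path, and will fill those in following guidance from you and @ikunkun-sys.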
Summary
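Adds ktuner, a new component for automated kernel parameter tuning, to anolisa. The cosh agent can use it to diagnose the system's kernel configuration, produce tuning recommendations, apply them in one step, and roll them back.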
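As a rough illustration of apply-with-rollback, the sketch below records each parameter's previous value in a journal before overwriting it, and restores those values in reverse order on failure or on request. Everything here is hypothetical and is not taken from the ktuner crate: `ParamStore`, `MemStore`, `Journal`, `apply`, and `rollback` are invented names, the parameter values are placeholders rather than recommendations, and an in-memory map stands in for /proc/sys so the example needs no root privileges.

```rust
use std::collections::BTreeMap;

/// Hypothetical abstraction over wherever parameters live (not ktuner's API).
pub trait ParamStore {
    fn read(&self, key: &str) -> Option<String>;
    fn write(&mut self, key: &str, value: &str) -> Result<(), String>;
}

/// In-memory stand-in for /proc/sys; only known keys can be written.
#[derive(Default)]
pub struct MemStore(pub BTreeMap<String, String>);

impl ParamStore for MemStore {
    fn read(&self, key: &str) -> Option<String> {
        self.0.get(key).cloned()
    }
    fn write(&mut self, key: &str, value: &str) -> Result<(), String> {
        match self.0.get_mut(key) {
            Some(slot) => {
                *slot = value.to_string();
                Ok(())
            }
            None => Err(format!("unknown parameter: {key}")),
        }
    }
}

/// Previous values of every parameter that was changed, in application order.
#[derive(Default)]
pub struct Journal {
    entries: Vec<(String, String)>,
}

/// Apply all changes; if any step fails, undo the earlier ones and return the error.
pub fn apply<S: ParamStore>(store: &mut S, changes: &[(&str, &str)]) -> Result<Journal, String> {
    let mut journal = Journal::default();
    for &(key, new) in changes {
        // Capture the old value first, then write the new one.
        let step = store
            .read(key)
            .ok_or_else(|| format!("unknown parameter: {key}"))
            .and_then(|old| store.write(key, new).map(|_| old));
        match step {
            Ok(old) => journal.entries.push((key.to_string(), old)),
            Err(e) => {
                rollback(store, &journal);
                return Err(e);
            }
        }
    }
    Ok(journal)
}

/// Restore journaled values, most recent change first (best effort).
pub fn rollback<S: ParamStore>(store: &mut S, journal: &Journal) {
    for (key, old) in journal.entries.iter().rev() {
        let _ = store.write(key, old);
    }
}
```

A caller would keep the `Journal` returned by `apply` and pass it to `rollback` later. A real tool would also have to persist the journal across processes, which this sketch leaves out.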
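Motivation
cosh currently lacks a systematic kernel-tuning capability. An agent can read individual kernel parameters, but it has no deterministic rule baseline: it does not know which parameters to check, what reasonable values are, or how to roll back safely after a change. ktuner provides that baseline: 207 rules systematically evaluate the current configuration and emit diagnostics and recommendations as JSON for the agent to consume. The rule engine does not depend on an LLM, so it has zero token cost; the LLM adds value only at the explanation layer (for example, "why is changing swappiness recommended?").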
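A minimal sketch of a deterministic rule check follows. It assumes a rule is simply a parameter name paired with an expected value; the actual ktuner rule format and JSON schema are not shown in this PR text. `Rule`, `Finding`, `check`, and `to_json` are hypothetical names, and the values used with them are placeholders, not tuning advice.

```rust
use std::collections::BTreeMap;

/// Hypothetical rule shape: one kernel parameter and the value the baseline expects.
#[derive(Debug, Clone)]
pub struct Rule {
    pub id: &'static str,
    pub param: &'static str,
    pub expected: &'static str,
}

/// One deviation from the baseline.
#[derive(Debug, PartialEq, Eq)]
pub struct Finding {
    pub rule_id: String,
    pub param: String,
    pub current: Option<String>, // None when the parameter is absent
    pub recommended: String,
}

/// Compare every rule against the current values; the same input always yields the same output.
pub fn check(rules: &[Rule], current: &BTreeMap<String, String>) -> Vec<Finding> {
    rules
        .iter()
        .filter(|r| current.get(r.param).map(String::as_str) != Some(r.expected))
        .map(|r| Finding {
            rule_id: r.id.to_string(),
            param: r.param.to_string(),
            current: current.get(r.param).cloned(),
            recommended: r.expected.to_string(),
        })
        .collect()
}

/// Escape a string for embedding in a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Serialize findings as a JSON array using only the standard library.
pub fn to_json(findings: &[Finding]) -> String {
    let items: Vec<String> = findings
        .iter()
        .map(|f| {
            let current = match &f.current {
                Some(v) => format!("\"{}\"", json_escape(v)),
                None => "null".to_string(),
            };
            format!(
                "{{\"rule_id\":\"{}\",\"param\":\"{}\",\"current\":{},\"recommended\":\"{}\"}}",
                json_escape(&f.rule_id),
                json_escape(&f.param),
                current,
                json_escape(&f.recommended)
            )
        })
        .collect();
    format!("[{}]", items.join(","))
}
```

Because `check` is a pure function over a rule slice and a snapshot of current values, it needs no LLM call and can be tested without touching the running kernel. An agent would only interpret the JSON it returns.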
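What changed
- src/ktuner/: new Rust crate (lib + bin), ~10k lines
- Subcommands: check (diagnose), tune (apply), fix (single rule), why (explain), rollback (undo)
- src/os-skills/system-admin/ktuner/SKILL.md: registered as a cosh skill, discovered automatically by the agent
- .github/workflows/ci.yaml: new Step 11 test-ktuner job (fmt + clippy + test)
- .github/commitlint.config.json: new ktuner scope
What did not change
No other anolisa component is modified.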
Safety design
ktuner writes to /proc/sys and /sys as root. Safety measures:
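Verification
Roadmap
This PR is the first step: a deterministic rule engine that cosh invokes on demand, covering 80% of common tuning scenarios.
Step two (agentic-doctor): ktuner will rely on the agentsight observability mechanism for continuous profiling, using BPF observability data for dynamic tuning and handling the edge cases that the rule engine cannot cover.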