Second-year PhD student in the Department of Computing at The Hong Kong Polytechnic University, advised by Prof. Wenjie Li (Maggie) and Prof. Wei Zhang.
I work on Environment-Centric AI: treating the training environment of intelligent agents as a designed object. The environment is not a given; it has pieces (reward, feedback, observation, evaluation), and those pieces can be analyzed and redesigned.
More at battam1111.github.io.
- Exact Is Easier: Credit Assignment for Cooperative LLM Agents (in submission, arXiv:2603.06859). Cooperative LLM histories are deterministic, so per-agent counterfactual credit is exactly computable. The paper delivers a learning algorithm that outperforms every approximate multi-agent RL alternative, plus the first method-agnostic auditing tool for credit quality.
- The Accuracy Paradox in RLHF (EMNLP 2024). Moderately accurate reward models train better language models than highly accurate ones on relevance, factuality, and completeness. The paper treats reward-model accuracy as an environment design property.
- battam1111.github.io. Source for my homepage: al-folio with custom SCSS, trilingual (EN/中/日).
- Email: yan-jun.chen@connect.polyu.hk
- Google Scholar · ORCID 0009-0001-9065-9137
- Hong Kong

