fix(review-filter): add `submit_filter_result` func for review filter stage by ScarletCarpet · Pull Request #295 · alibaba/open-code-review

ScarletCarpet · 2026-07-03T12:42:24Z

fix(review-filter): add submit_filter_result func for review filter stage

At review filter stage, LLM(test GLM5.2) may return natural language content such as:
"""
Looking at each comment...

commit 1: ...
commit 2: ...

Summary:
json [...]
"""

add function calling for LLM will get more stable result for some model of providers.
and this tool only appears at review filter stage, so it does no harm for review quality.

Description

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Refactoring (no functional changes)
Documentation update
CI / Build / Tooling

How Has This Been Tested?

make test passes locally
Manual testing (describe below)

Checklist

My code follows the project's coding style (go fmt, go vet)
I have performed a self-review of my code
I have added tests that prove my fix is effective or my feature works
New and existing unit tests pass locally with my changes
I have updated the documentation accordingly (if applicable)
I have signed the CLA

Related Issues

github-actions

🔍 OpenCodeReview found 3 issue(s) in this PR.

✅ 3 posted as inline comment(s)
📝 0 posted as summary

ScarletCarpet · 2026-07-03T13:21:37Z

@lizhengfeng101 这里咨询下，为什么reviewfiletr不用tool_call的方式限制LLM的输出呢？保证LLM的注意力么？

修改：
已将文本解析改为functioncalling：

… stage At review filter stage, LLM(test GLM5.2) may return natural language content such as: """ Looking at each comment... - commit 1: ... - commit 2: ... Summary: ```json [...]``` """ add function calling for LLM will get more stable result for some model of providers. and this tool only appears at review filter stage, so it does no harm for review quality.

revenant20 · 2026-07-03T18:21:39Z

+1 — independent repro with the exact same model, and some data from the session store showing this bug has real output-affecting impact.

Repro. v1.7.1 built from 10f5875, glm-5.2 via an OpenAI-protocol provider, range review of an external Kotlin repo (10 files, 10 comments). The filter ran 6 times (once per commented file); 4 of 6 calls failed to parse with invalid character 'L' — the model reasoned step by step in prose despite the "without any explanation" instruction.

The verdicts were computed and lost. The session store keeps the full filter responses, and all 4 unparseable ones have the same shape — reasoning prose with a correct fenced JSON array as the last line:

... The fundamental mechanism the reviewer describes — synchronous throwing from `launch` — does
not exist in the Kotlin coroutines API. This is verifiable from the diff.

```json
["c-0"]
```

Three of the four lost verdicts were [] (harmless), but the one above was ["c-0"]: the filter had correctly decided to remove a refuted comment, the parse failure discarded that verdict, and the comment shipped in the final output. So this PR fixes real damage, not cosmetics — thanks for it.

Two residual gaps you may want to cover here or in a follow-up (happy to send a PR for either):

The text fallback stays brittle. When a model ignores the tool or a provider lacks function calling, parseFilterResponse still requires the whole response to be a bare array — and the failure shape observed above (prose + trailing fenced array) loses the verdict again. A small salvage step on unmarshal failure — take the last balanced [...] fragment that unmarshals into []string — would have recovered all 4 responses in my run.
Parse failure is still indistinguishable from "checked and kept everything". The new review_filter.completed event reports removed=0 in both cases (and telemetry is off by default), and the raw-response preview in the log is truncated to 200 chars — which is exactly what hides the trailing array. A parse-failure counter surfaced in the run summary would make the fail-open visible.

ScarletCarpet · 2026-07-04T02:06:03Z

+1 — 独立重现，使用完全相同的模型，并且一些来自会话存储的数据表明这个错误确实对输出产生了影响。

Repro. v1.7.1 由 10f5875, glm-5.2 构建，通过 OpenAI 协议提供商，范围审查外部 Kotlin 代码库（10 个文件，10 条注释）。过滤器运行了 6 次（每个注释的文件运行一次）；6 次调用中有 4 次未能解析 与 invalid character 'L' — 模型尽管有“不作任何解释”的指令，还是逐步用散文进行推理。

判决结果已计算并丢失。 会话存储保持完整的过滤响应，所有 4 个无法解析的响应具有相同的形状 —— 以 正确括起来的 JSON 数组作为最后一行的推理文字:
... The fundamental mechanism the reviewer describes — synchronous throwing from `launch` — does
not exist in the Kotlin coroutines API. This is verifiable from the diff.

```json
["c-0"]
四个败诉中有三个是`[]`（无害的），但上面的那个不是`["c-0"]`：过滤器正确地决定删除一个被证伪的评论，解析失败丢弃了该败诉，而该评论被最终输出。所以这个PR修复了真正的损害，而不是表面问题——感谢你的努力。

两个剩余的差距，您可能希望在这里或后续阶段解决（很乐意为此发送一个PR）：

1. **文本回退仍然很脆弱。** 当模型忽略工具或提供者缺乏函数调用时，`parseFilterResponse` 仍然需要整个响应是一个裸数组——而上面观察到的失败形态（ prose + 末尾的围栏数组）再次失去了判断。在反序列化失败时进行一个小的恢复步骤——取**最后一个**平衡的`[...]`片段，该片段可以反序列化为`[]string`——将恢复我在运行中的所有4个响应。
2. **解析失败仍然无法与“已检查并保留所有内容”区分开来。** 新的 `review_filter.completed` 事件报告 `removed=0` 在两种情况下（默认情况下遥测数据是关闭的），日志中的原始响应预览被截断为200个字符——这正是隐藏数组尾巴的地方。运行总结中出现的解析失败计数器将使失败打开可见。

thanks a lot for reveiwing!

i'll try to do impromve the logic which you mentioned.

github-actions Bot reviewed Jul 3, 2026

View reviewed changes

Comment thread internal/llmloop/compression.go Outdated

Comment thread internal/llmloop/compression.go Outdated

Comment thread internal/agent/agent.go Outdated

ScarletCarpet force-pushed the fix/review-filter-fix branch from 2ffdf0e to 55bcd09 Compare July 3, 2026 12:57

ScarletCarpet force-pushed the fix/review-filter-fix branch from 55bcd09 to 90c919a Compare July 3, 2026 16:31

ScarletCarpet changed the title ~~fix(review-filter): recover JSON from natural language response~~ fix(review-filter): add submit_filter_result func for review filter stage Jul 3, 2026

ScarletCarpet force-pushed the fix/review-filter-fix branch from 90c919a to 8032c52 Compare July 3, 2026 16:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(review-filter): add `submit_filter_result` func for review filter stage#295

fix(review-filter): add `submit_filter_result` func for review filter stage#295
ScarletCarpet wants to merge 1 commit into
alibaba:mainfrom
ScarletCarpet:fix/review-filter-fix

ScarletCarpet commented Jul 3, 2026 •

edited

Loading

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ScarletCarpet commented Jul 3, 2026 •

edited

Loading

Uh oh!

revenant20 commented Jul 3, 2026

Uh oh!

ScarletCarpet commented Jul 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

ScarletCarpet commented Jul 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of Change

How Has This Been Tested?

Checklist

Related Issues

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ScarletCarpet commented Jul 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

revenant20 commented Jul 3, 2026

Uh oh!

ScarletCarpet commented Jul 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ScarletCarpet commented Jul 3, 2026 •

edited

Loading

ScarletCarpet commented Jul 3, 2026 •

edited

Loading