feat: add host-agent review bundle runtime#301
Conversation
|
Test seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
| if options.sessionID != "" { | ||
| if err := recordAgentEvent( | ||
| repoDir, | ||
| options.sessionID, | ||
| bundle.BundleID, | ||
| "validate", | ||
| session.AgentEvent{ | ||
| Files: comments.Summary.FilesReviewed, | ||
| Findings: len(comments.Comments), | ||
| Warnings: len(result.Warnings), | ||
| DurationMS: time.Since(started).Milliseconds(), | ||
| ValidationValid: &result.Valid, | ||
| }, | ||
| false, | ||
| ); err != nil { | ||
| fmt.Fprintf(os.Stderr, "Warning: agent session not recorded: %v\n", err) | ||
| } | ||
| } |
There was a problem hiding this comment.
Minor consistency issue: The helper function recordAgentEventBestEffort in agent_cmd.go already encapsulates this exact pattern (call recordAgentEvent, print warning to stderr on error). Using it here would reduce duplication and ensure consistent behavior if the error-handling strategy changes in the future.
recordAgentEventBestEffort(
repoDir,
options.sessionID,
bundle.BundleID,
"validate",
session.AgentEvent{...},
false,
)| manifest, manifestErr := reviewbundle.LoadScanManifest(bytes.NewReader(content)) | ||
| if manifestErr != nil { | ||
| return nil, nil, bundleErr | ||
| } |
There was a problem hiding this comment.
When the input file fails to parse as both a Bundle and a ScanManifest, only bundleErr is returned here, discarding manifestErr. This makes debugging difficult because the user won't know why the manifest interpretation also failed. Consider wrapping both errors to provide full diagnostic context.
Similarly, on line 177, when the bundleID is not found in a successfully parsed manifest, the error message doesn't mention that the file was initially tried as a standalone bundle (and failed), which could confuse users who expected it to be a bundle.
Suggestion:
| manifest, manifestErr := reviewbundle.LoadScanManifest(bytes.NewReader(content)) | |
| if manifestErr != nil { | |
| return nil, nil, bundleErr | |
| } | |
| manifest, manifestErr := reviewbundle.LoadScanManifest(bytes.NewReader(content)) | |
| if manifestErr != nil { | |
| return nil, nil, fmt.Errorf("input is neither a valid bundle (%v) nor a valid scan manifest (%v)", bundleErr, manifestErr) | |
| } |
| fileLevel := false | ||
| if rawValue, ok := comment["file_level_comment"]; ok { | ||
| _ = json.Unmarshal(rawValue, &fileLevel) | ||
| } | ||
| if fileLevel { | ||
| var startLine, endLine int | ||
| if rawValue, ok := comment["start_line"]; ok { | ||
| _ = json.Unmarshal(rawValue, &startLine) | ||
| } | ||
| if rawValue, ok := comment["end_line"]; ok { | ||
| _ = json.Unmarshal(rawValue, &endLine) | ||
| } |
There was a problem hiding this comment.
Bug: Errors from json.Unmarshal are silently discarded with _ =. If file_level_comment contains a non-boolean value (e.g., a string "true" or a number), the unmarshal will fail and fileLevel will remain false, causing the file-level constraint check to be skipped entirely. Similarly, if start_line/end_line contain non-integer values, they'll silently default to 0, potentially passing the startLine != 0 || endLine != 0 check incorrectly.
Since LoadComments aims to be a strict decoder (it uses decodeStrict and validates shape before struct decoding), these silent failures undermine the validation contract. At minimum, unmarshal errors should be propagated. Alternatively, since decodeStrict with DisallowUnknownFields will catch type mismatches during the subsequent struct decode, you could add a comment explaining that these unmarshal errors are intentionally ignored because the strict decode below will reject malformed types. However, even in that case, there's a subtle gap: if file_level_comment fails to unmarshal as bool, the file-level line-number constraint is never enforced, and the strict decode might still accept the document if the field happens to be a valid JSON type that Go coerces.
Suggestion:
| fileLevel := false | |
| if rawValue, ok := comment["file_level_comment"]; ok { | |
| _ = json.Unmarshal(rawValue, &fileLevel) | |
| } | |
| if fileLevel { | |
| var startLine, endLine int | |
| if rawValue, ok := comment["start_line"]; ok { | |
| _ = json.Unmarshal(rawValue, &startLine) | |
| } | |
| if rawValue, ok := comment["end_line"]; ok { | |
| _ = json.Unmarshal(rawValue, &endLine) | |
| } | |
| fileLevel := false | |
| if rawValue, ok := comment["file_level_comment"]; ok { | |
| if err := json.Unmarshal(rawValue, &fileLevel); err != nil { | |
| return fmt.Errorf("invalid comments schema: comments[%d].file_level_comment must be a boolean", index) | |
| } | |
| } | |
| if fileLevel { | |
| var startLine, endLine int | |
| if rawValue, ok := comment["start_line"]; ok { | |
| if err := json.Unmarshal(rawValue, &startLine); err != nil { | |
| return fmt.Errorf("invalid comments schema: comments[%d].start_line must be an integer", index) | |
| } | |
| } | |
| if rawValue, ok := comment["end_line"]; ok { | |
| if err := json.Unmarshal(rawValue, &endLine); err != nil { | |
| return fmt.Errorf("invalid comments schema: comments[%d].end_line must be an integer", index) | |
| } | |
| } |
| if startLine-1 > totalLines || (totalLines > 0 && startLine-1 >= totalLines) { | ||
| return "", fmt.Errorf("file %q has only %d lines, requested range starting at %d", path, totalLines, startLine) | ||
| } |
There was a problem hiding this comment.
Bug: When the file content is empty (totalLines == 0), this boundary check allows execution to proceed (both conditions evaluate to false), resulting in output with LINE_RANGE: 1-0 and zero content lines. This is misleading to the consumer.
Additionally, the compound condition is redundant — the first disjunct startLine-1 > totalLines is strictly weaker than the second startLine-1 >= totalLines (when totalLines > 0), making the logic harder to reason about.
Consider simplifying and handling the empty-file case explicitly:
Suggestion:
| if startLine-1 > totalLines || (totalLines > 0 && startLine-1 >= totalLines) { | |
| return "", fmt.Errorf("file %q has only %d lines, requested range starting at %d", path, totalLines, startLine) | |
| } | |
| if totalLines == 0 { | |
| return "", fmt.Errorf("file %q is empty", path) | |
| } | |
| if startLine > totalLines { | |
| return "", fmt.Errorf("file %q has only %d lines, requested range starting at %d", path, totalLines, startLine) | |
| } |
| // ponytail: minimal git-pathspec compatibility; expand if scan search needs full pathspec semantics. | ||
| if strings.HasPrefix(pattern, "*.") && strings.HasSuffix(path, strings.TrimPrefix(pattern, "*")) { | ||
| return true | ||
| } |
There was a problem hiding this comment.
Behavioral inconsistency: This function does not handle git-pathspec negation/exclude patterns such as :(exclude)*_test.go or :!vendor/. The non-scan search path uses git grep which natively supports these patterns. In scan mode, exclude patterns will silently fail to match any file (they won't match via filepath.Match, prefix check, or the *. suffix check), meaning all files will be included rather than excluded. This produces incorrect search results without any warning.
Consider either adding basic exclude-pattern support or returning an error/warning when an unsupported pattern syntax is detected.
| if int64(len(data)) > MaxProtocolDocumentBytes { | ||
| return nil, fmt.Errorf( | ||
| "document exceeds %d byte limit", | ||
| MaxProtocolDocumentBytes, | ||
| ) | ||
| } |
There was a problem hiding this comment.
Inconsistent error type: readLimited returns a generic fmt.Errorf when the document exceeds the byte limit, while validateProtocolDocumentSize and protocolDocumentWriter.Write return a structured *ProtocolError with a machine-readable Code field. Since readLimited is used by LoadBundle, LoadComments, and LoadScanManifest (in load.go), callers that use errors.As(err, &ProtocolError{}) to detect size violations will fail to match when the oversized document is read from an io.Reader but will succeed when validating an already-loaded byte slice or streaming write.
Consider returning a *ProtocolError here for consistency, e.g.:
return nil, &ProtocolError{
Code: "document_too_large",
Message: fmt.Sprintf(
"document exceeds %d byte protocol limit (%d bytes); reduce scope or bundle count",
MaxProtocolDocumentBytes,
len(data),
),
}Also note the error code should likely be document_too_large rather than manifest_too_large, since this function is used for bundles and comments as well.
| if len(files) == 1 { | ||
| path = files[0].Path | ||
| if files[0].Reviewable { | ||
| manifest.Summary.ReviewableFiles-- | ||
| manifest.Summary.ExcludedFiles++ | ||
| manifest.Summary.Insertions -= files[0].Insertions | ||
| manifest.Summary.Deletions -= files[0].Deletions | ||
| } |
There was a problem hiding this comment.
Bug: Summary counters not adjusted for non-reviewable skipped files.
When a file is skipped due to bundle_too_large, the summary adjustment (decrementing ReviewableFiles/Insertions/Deletions and incrementing ExcludedFiles) only happens when files[0].Reviewable == true. However, a non-reviewable file can also end up being skipped here (e.g., if it has a very large patch). In that case, the file is added to SkippedFiles but the manifest's Summary remains unchanged — meaning TotalFiles still counts it, yet it won't appear in any bundle. This creates an inconsistency between the manifest summary and the actual contents of the bundles + skipped list.
Consider adjusting the summary for non-reviewable files as well, or at minimum decrementing TotalFiles / ExcludedFiles appropriately so the aggregate stays consistent.
| _, encoded, err := buildDiffPartition(full, nil, maxBundleSize) | ||
| estimated := int64(4096) | ||
| if err == nil { | ||
| estimated = int64(len(encoded)) | ||
| } |
There was a problem hiding this comment.
Potential issue: Silent fallback on empty-bundle estimation error.
If buildDiffPartition fails when called with nil files (which shouldn't normally happen, but could occur due to marshal errors or other unexpected conditions), the error is silently discarded and a hardcoded 4096-byte baseline is used. If the actual empty bundle serialization is significantly different from 4096 bytes, all subsequent packing decisions will be based on an incorrect baseline, potentially causing either excessive partition splits or repeated overflow-and-re-split cycles.
Consider logging the error or using a more defensive approach (e.g., returning the error rather than silently falling back).
| bundle, encoded, err := buildDiffPartition(base, files, maxBundleSize) | ||
| if err == nil && int64(len(encoded)) <= maxBundleSize { | ||
| manifest.Bundles = append(manifest.Bundles, *bundle) | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Performance concern: Each call to buildDiffPartition performs full JSON marshaling via marshalWithStableSize, which itself iterates up to 8 times.
During the initial packing loop, estimateAddition uses lightweight JSON marshaling of individual files/rules. But when a flush occurs (flushDiffPartition → appendDiffPartition → buildDiffPartition), the entire partition is serialized with marshalWithStableSize (up to 8 marshal iterations). If the estimate is inaccurate and many flushes trigger recursive bisection, each bisection step also calls buildDiffPartition with its own marshalWithStableSize. For large diffs with many files, this could result in significant redundant serialization work.
Consider caching or short-circuiting the size check (e.g., skip marshalWithStableSize during bisection and use plain json.Marshal since the stable-size invariant is only needed for the final accepted bundle).
| warnings = append(warnings, ProtocolNotice{ | ||
| Code: "oversized_diff", | ||
| Message: fmt.Sprintf( | ||
| "%s (~%d tokens) exceeds 80%% of max review tokens (%d)", | ||
| path, | ||
| tokens, | ||
| maxTokens, | ||
| ), | ||
| }) |
There was a problem hiding this comment.
The ProtocolNotice struct has a dedicated Path field (see bundle.go line 131), but it's not being populated here. The path is only embedded in the human-readable Message string, which makes it harder for programmatic consumers to extract the file path from the warning. Consider populating the Path field as well.
Additionally, when both diff.NewPath and diff.OldPath are empty strings, the warning message will display an empty path, making it difficult to identify which file was skipped. Consider providing a fallback like "<unknown>".
Suggestion:
| warnings = append(warnings, ProtocolNotice{ | |
| Code: "oversized_diff", | |
| Message: fmt.Sprintf( | |
| "%s (~%d tokens) exceeds 80%% of max review tokens (%d)", | |
| path, | |
| tokens, | |
| maxTokens, | |
| ), | |
| }) | |
| if path == "" { | |
| path = "<unknown>" | |
| } | |
| warnings = append(warnings, ProtocolNotice{ | |
| Code: "oversized_diff", | |
| Path: path, | |
| Message: fmt.Sprintf( | |
| "%s (~%d tokens) exceeds 80%% of max review tokens (%d)", | |
| path, | |
| tokens, | |
| maxTokens, | |
| ), | |
| }) |
| contentSHA256 := hashFields([]byte(change.NewFileContent)) | ||
| if change.IsDeleted { | ||
| contentSHA256 = "" | ||
| } |
There was a problem hiding this comment.
Minor inefficiency: hashFields computes a SHA256 digest of change.NewFileContent even when the file is deleted, only to immediately discard the result. For deleted files, NewFileContent is typically empty, so the cost is trivial, but the code reads as if the hash is meaningful before being overridden. Consider guarding the computation to make intent clearer and avoid unnecessary work.
Suggestion:
| contentSHA256 := hashFields([]byte(change.NewFileContent)) | |
| if change.IsDeleted { | |
| contentSHA256 = "" | |
| } | |
| var contentSHA256 string | |
| if !change.IsDeleted { | |
| contentSHA256 = hashFields([]byte(change.NewFileContent)) | |
| } |
| func writeMarkdownBody(output *bytes.Buffer, value string) { | ||
| if strings.Contains(value, "```") { | ||
| writeFencedCode(output, value) | ||
| return | ||
| } | ||
| fmt.Fprintln(output, escapeMarkdownBody(value)) | ||
| } |
There was a problem hiding this comment.
Bug: When comment.Content or comment.Recommendation contains triple backticks (e.g., a code example within a prose explanation, which is very common in code reviews), the entire value is wrapped in a fenced code block. This causes all surrounding prose text to be rendered as literal code, making the review comment unreadable.
For example, if Content is:
This function has a race condition. Consider using a mutex:
```go
mu.Lock()
defer mu.Unlock()
This ensures thread safety.
The current logic would wrap everything (including "This function has..." and "This ensures...") inside a single code fence.
Consider instead splitting the content on code fences and rendering each segment appropriately (prose vs. code), or simply always rendering as escaped markdown body without the special-casing for triple backticks.
| for _, notice := range validation.Errors { | ||
| fmt.Fprintf(output, "\n- `%s`: %s\n", notice.Code, notice.Message) | ||
| } |
There was a problem hiding this comment.
Inconsistency: Validation error and warning messages (notice.Message) are interpolated directly into markdown output without escaping, while writeMarkdownNotices (line 189) correctly applies escapeMarkdownBody. If a validation message contains markdown-sensitive characters (e.g., lines starting with #), it could produce unintended formatting in the rendered report. Apply escapeMarkdownBody here for consistency.
Suggestion:
| for _, notice := range validation.Errors { | |
| fmt.Fprintf(output, "\n- `%s`: %s\n", notice.Code, notice.Message) | |
| } | |
| for _, notice := range validation.Errors { | |
| fmt.Fprintf(output, "\n- `%s`: %s\n", notice.Code, escapeMarkdownBody(notice.Message)) | |
| } |
| if err := json.Unmarshal(document, &instance); err != nil { | ||
| return fmt.Errorf("invalid %s schema: %w", label, err) | ||
| } |
There was a problem hiding this comment.
The error message "invalid %s schema" is misleading here. When json.Unmarshal(document, &instance) fails, it means the input document is not valid JSON — not that the schema is invalid. This makes debugging harder because the caller sees "invalid bundle schema: ..." when the actual problem is malformed input.
Consider distinguishing between document parse errors and schema validation errors, e.g.:
- Line 19:
"invalid %s document: %w"(document is not valid JSON) - Line 26:
"%s document failed schema validation: %w"(document doesn't conform to schema)
Suggestion:
| if err := json.Unmarshal(document, &instance); err != nil { | |
| return fmt.Errorf("invalid %s schema: %w", label, err) | |
| } | |
| if err := json.Unmarshal(document, &instance); err != nil { | |
| return fmt.Errorf("invalid %s document: %w", label, err) | |
| } |
| resolvedSchemaMu.Lock() | ||
| defer resolvedSchemaMu.Unlock() |
There was a problem hiding this comment.
The mutex is held during the potentially expensive json.Unmarshal(schemaBytes, &schema) and schema.Resolve(nil) calls below. This blocks all other goroutines from accessing the cache, including those that could get an immediate cache hit.
While in practice there are only 3 fixed embedded schemas (limiting real-world impact), a common improvement is to release the lock before doing the expensive work, then re-acquire it to store the result. Alternatively, use sync.Once per schema key or a single-check-then-compute pattern:
resolvedSchemaMu.Lock()
if resolved, ok := resolvedSchemas[cacheKey]; ok {
resolvedSchemaMu.Unlock()
return resolved, nil
}
resolvedSchemaMu.Unlock()
// ... do expensive unmarshal/resolve without holding the lock ...
resolvedSchemaMu.Lock()
defer resolvedSchemaMu.Unlock()
if existing, ok := resolvedSchemas[cacheKey]; ok {
return existing, nil // another goroutine beat us
}
resolvedSchemas[cacheKey] = resolved
return resolved, nil| if options.MaxTokenBudget > 0 { | ||
| sort.SliceStable(items, func(i, j int) bool { | ||
| left := scan.EstimateItemTokens(items[i], true) | ||
| right := scan.EstimateItemTokens(items[j], true) | ||
| if left != right { | ||
| return left < right | ||
| } | ||
| return items[i].Path < items[j].Path | ||
| }) | ||
| } |
There was a problem hiding this comment.
Performance: EstimateItemTokens calls llm.CountTokens(item.Content) which processes the full file content string each time. Inside a sort comparator, this is invoked O(n log n) times, meaning each file's token count is redundantly recomputed many times during sorting. For large scans with many files, this can be a significant performance bottleneck.
Pre-compute the token estimates once before sorting and use the cached values in the comparator.
Suggestion:
| if options.MaxTokenBudget > 0 { | |
| sort.SliceStable(items, func(i, j int) bool { | |
| left := scan.EstimateItemTokens(items[i], true) | |
| right := scan.EstimateItemTokens(items[j], true) | |
| if left != right { | |
| return left < right | |
| } | |
| return items[i].Path < items[j].Path | |
| }) | |
| } | |
| if options.MaxTokenBudget > 0 { | |
| // Pre-compute token estimates to avoid O(n log n) redundant CountTokens calls. | |
| tokenCache := make(map[string]int64, len(items)) | |
| for _, item := range items { | |
| tokenCache[item.Path] = scan.EstimateItemTokens(item, true) | |
| } | |
| sort.SliceStable(items, func(i, j int) bool { | |
| left := tokenCache[items[i].Path] | |
| right := tokenCache[items[j].Path] | |
| if left != right { | |
| return left < right | |
| } | |
| return items[i].Path < items[j].Path | |
| }) | |
| } |
| midpoint := len(items) / 2 | ||
| if err := appendScanBundles(ctx, manifest, items[:midpoint], batchIndex, targetHash, resolver, maxBundleSize); err != nil { | ||
| return err | ||
| } | ||
| return appendScanBundles(ctx, manifest, items[midpoint:], batchIndex, targetHash, resolver, maxBundleSize) |
There was a problem hiding this comment.
Correctness / Partial state mutation on error: When appendScanBundles recursively splits a batch due to bundle_too_large, it mutates manifest.Bundles, manifest.SkippedFiles, and manifest.Summary counters incrementally. If the first recursive half succeeds (appending bundles and updating summary) but the second half fails with a non-recoverable error, the manifest is left in an inconsistent state — some bundles are appended and summary counters partially updated, yet the error propagates up to PrepareScan which returns nil, nil, err.
While the caller discards the manifest on error, this pattern is fragile. If any future caller inspects the manifest after an error return, or if error recovery/retry logic is added, the partial mutations will cause bugs. Consider either: (1) building bundles into a local slice and only appending to the manifest on full success, or (2) documenting explicitly that the manifest is invalid after any error return from this function.
| var protocolError *ProtocolError | ||
| if !errors.As(err, &protocolError) || protocolError.Code != "bundle_too_large" || len(items) <= 1 { | ||
| if errors.As(err, &protocolError) && protocolError.Code == "bundle_too_large" && len(items) == 1 { | ||
| manifest.SkippedFiles = append(manifest.SkippedFiles, ScanSkippedFile{ | ||
| Path: items[0].Path, | ||
| Reason: "bundle_too_large", | ||
| }) | ||
| manifest.Summary.ReviewableFiles-- | ||
| manifest.Summary.ExcludedFiles++ | ||
| manifest.Summary.Insertions -= int64(items[0].LineCount) | ||
| manifest.Partial = true | ||
| return nil | ||
| } | ||
| return err | ||
| } |
There was a problem hiding this comment.
Correctness: The condition logic here is difficult to follow and verify. The outer if checks !errors.As(...) || protocolError.Code != "bundle_too_large" || len(items) <= 1, and then the inner if re-checks errors.As(...) && protocolError.Code == "bundle_too_large" && len(items) == 1. This double-check pattern is confusing because:
- When the outer condition is true AND the inner condition is also true (single oversized item), the file is skipped gracefully.
- When the outer condition is true AND the inner condition is false (non-
bundle_too_largeerror, orbundle_too_largewithlen(items) == 0), the error is returned. - When the outer condition is false (
bundle_too_largewithlen(items) > 1), execution falls through to the split logic.
Consider restructuring for clarity, e.g., using explicit early returns or a switch statement to make each case self-documenting.
| if _, err := hasher.Write(length[:]); err != nil { | ||
| return err | ||
| } | ||
| written, err := io.Copy(hasher, reader) |
There was a problem hiding this comment.
Potential issue: Unbounded read beyond expected size. io.Copy will read the entire reader to EOF, which means if the file grows between the os.Stat call (which determined size) and during reading, all extra bytes will be consumed into the hash before the mismatch is detected at line 67. While the error prevents returning an incorrect hash, this could cause unnecessary I/O and memory usage for large files that grow significantly.
Consider using io.LimitReader to cap the read at exactly size bytes, then verify the actual bytes read match:
limited := io.LimitReader(reader, size)
written, err := io.Copy(hasher, limited)This ensures no more than size bytes are ever read from the file, making the function more defensive against concurrent modifications.
| func lockSessionFile(*os.File) error { return nil } | ||
| func unlockSessionFile(*os.File) error { return nil } |
There was a problem hiding this comment.
The no-op implementation means there is no cross-process file locking on Windows. While the sync.Mutex in AgentRecorder protects concurrent writes within a single process, multiple processes writing to the same JSONL session file (e.g., parallel agent runs sharing a session) can corrupt the file with interleaved writes.
Windows provides LockFileEx / UnlockFile via the syscall or golang.org/x/sys/windows package that can provide equivalent exclusive file locking. Consider implementing actual file locking instead of silently returning nil, or at minimum logging a warning when locking is skipped so users are aware of the concurrency risk.
| if _, err := io.Copy(&body, file); err != nil { | ||
| return false, nil, 0, err | ||
| } |
There was a problem hiding this comment.
TOCTOU race: The caller checks info.Size() > p.maxFileSizeBytes via Lstat before calling this function, but the file could grow (or be replaced with a larger file) between the stat and the read. Since readRegularFile reads the entire file into memory without any size cap, this defeats the memory-protection purpose of maxFileSizeBytes. Consider adding an io.LimitReader inside readRegularFile (e.g., capped at maxFileSizeBytes + 1) and returning an error if the limit is exceeded, similar to how readLimited works in internal/reviewbundle/read_limit.go.
| return | ||
| } | ||
| if err := ValidateSessionID(sid); err != nil { | ||
| http.Error(w, "invalid session path", http.StatusBadRequest) |
There was a problem hiding this comment.
Minor: The error message says "invalid session path" but this is validating a session ID, not a path. Consider using "invalid session ID" for clarity and consistency with the ValidateSessionID function name.
Suggestion:
| http.Error(w, "invalid session path", http.StatusBadRequest) | |
| http.Error(w, "invalid session ID", http.StatusBadRequest) |
| contentCache := make(map[string][]byte) | ||
| for index := range comments.Comments { | ||
| validateOneComment(ctx, &result, bundle, files, contentCache, comments.Comments[index], index, repoDir, runner) | ||
| } |
There was a problem hiding this comment.
Bug: When readTargetFile fails (e.g., file not found or git error), the function returns without caching anything. However, when the read succeeds but the hash check at line 264 fails (stale content), the stale content remains cached. On subsequent comments referencing the same path, the stale content is reused from cache and the hash mismatch is re-detected — this is correct behavior.
However, there's a subtle correctness issue: if readTargetFile fails for a transient reason (e.g., temporary I/O error), the next comment on the same file will retry the read. But if the first read succeeds and caches content, then a different process modifies the file before the second comment is validated, the cached (now-stale) content is used instead of re-reading. Since validation is supposed to check against the current target state, this caching strategy can mask concurrent modifications within a single validation run.
Consider either not caching when staleness is detected, or documenting that the cache assumes no concurrent file modifications during validation.
| for index := range comments.Comments { | ||
| validateOneComment(ctx, &result, bundle, files, contentCache, comments.Comments[index], index, repoDir, runner) | ||
| } |
There was a problem hiding this comment.
Missing context cancellation check: The validation loop iterates over all comments without checking ctx.Done(). For bundles with many comments (especially when each comment triggers a git show call via readTargetFile), a cancelled context won't stop the loop promptly. Consider adding a context check inside the loop.
Suggestion:
| for index := range comments.Comments { | |
| validateOneComment(ctx, &result, bundle, files, contentCache, comments.Comments[index], index, repoDir, runner) | |
| } | |
| for index := range comments.Comments { | |
| select { | |
| case <-ctx.Done(): | |
| addValidationError(&result, "context_cancelled", "", nil, "validation cancelled") | |
| result.Valid = false | |
| return result | |
| default: | |
| } | |
| validateOneComment(ctx, &result, bundle, files, contentCache, comments.Comments[index], index, repoDir, runner) | |
| } |
| normalizedContent := "\n" + strings.TrimSuffix(string(content), "\n") + "\n" | ||
| normalizedSnippet := "\n" + strings.Trim(snippet, "\n") + "\n" |
There was a problem hiding this comment.
Potential inconsistency: countStandaloneSnippet normalizes the content with strings.TrimSuffix(string(content), "\n") (removes only ONE trailing newline) but normalizes the snippet with strings.Trim(snippet, "\n") (removes ALL leading and trailing newlines). This asymmetry means that if the file content ends with multiple newlines (e.g., "code\n\n") while the snippet also has trailing newlines, the boundary matching could produce unexpected results. Consider using consistent normalization for both content and snippet, e.g., strings.TrimRight(string(content), "\n") for the content side as well.
Suggestion:
| normalizedContent := "\n" + strings.TrimSuffix(string(content), "\n") + "\n" | |
| normalizedSnippet := "\n" + strings.Trim(snippet, "\n") + "\n" | |
| normalizedContent := "\n" + strings.TrimRight(string(content), "\n") + "\n" | |
| normalizedSnippet := "\n" + strings.Trim(snippet, "\n") + "\n" |
| scanner := bufio.NewScanner(file) | ||
| for scanner.Scan() { | ||
| var record map[string]any | ||
| if err := json.Unmarshal(scanner.Bytes(), &record); err != nil { | ||
| continue | ||
| } | ||
| if recordType, _ := record["type"].(string); recordType != "session_start" { | ||
| continue | ||
| } | ||
| timestamp, _ := record["timestamp"].(string) | ||
| started, err := time.Parse(time.RFC3339, timestamp) | ||
| if err != nil { | ||
| return fallback | ||
| } | ||
| return started | ||
| } | ||
| return fallback |
There was a problem hiding this comment.
Scanner buffer size inconsistency: The bufio.Scanner here (and in readAgentSessionBundleID, agentSessionHasEnd, and agentSessionFileHasEnd) uses the default max token size of 64KB. However, the viewer's peekSession and LoadSession in internal/viewer/store.go explicitly set a 10MB buffer (scanner.Buffer(buf, 10*1024*1024)). If any JSONL record exceeds 64KB (e.g., an event with a large files_reviewed list), these scanners will silently fail — scanner.Scan() returns false and scanner.Err() may return nil, causing session resumption to miss the start record, lose bundle IDs, or incorrectly determine that no session_end exists.
Recommend setting an explicit larger buffer on all scanners in this file to match the viewer's approach.
Suggestion:
| scanner := bufio.NewScanner(file) | |
| for scanner.Scan() { | |
| var record map[string]any | |
| if err := json.Unmarshal(scanner.Bytes(), &record); err != nil { | |
| continue | |
| } | |
| if recordType, _ := record["type"].(string); recordType != "session_start" { | |
| continue | |
| } | |
| timestamp, _ := record["timestamp"].(string) | |
| started, err := time.Parse(time.RFC3339, timestamp) | |
| if err != nil { | |
| return fallback | |
| } | |
| return started | |
| } | |
| return fallback | |
| scanner := bufio.NewScanner(file) | |
| buf := make([]byte, 0, 1024*1024) | |
| scanner.Buffer(buf, 10*1024*1024) | |
| for scanner.Scan() { | |
| var record map[string]any | |
| if err := json.Unmarshal(scanner.Bytes(), &record); err != nil { | |
| continue | |
| } | |
| if recordType, _ := record["type"].(string); recordType != "session_start" { | |
| continue | |
| } | |
| timestamp, _ := record["timestamp"].(string) | |
| started, err := time.Parse(time.RFC3339, timestamp) | |
| if err != nil { | |
| return fallback | |
| } | |
| return started | |
| } | |
| return fallback |
| unlockSessionFile(file) | ||
| return file.Close() |
There was a problem hiding this comment.
Inconsistent unlock error handling: In writeExclusiveStart, the return value of unlockSessionFile(file) on line 271 is silently discarded, while in the write() method (lines 314-320), unlock errors are properly checked and returned. Although on Unix flock(LOCK_UN) rarely fails, this inconsistency could mask issues on other platforms or in edge cases. Consider handling the unlock error consistently.
Suggestion:
| unlockSessionFile(file) | |
| return file.Close() | |
| if unlockErr := unlockSessionFile(file); unlockErr != nil { | |
| _ = file.Close() | |
| return fmt.Errorf("unlock agent session: %w", unlockErr) | |
| } | |
| return file.Close() |
| Partial bool `json:"partial,omitempty"` | ||
| DurationMS int64 `json:"duration_ms,omitempty"` | ||
| ValidationValid *bool `json:"validation_valid,omitempty"` | ||
| FilesReviewed []string `json:"files_reviewed,omitempty"` |
There was a problem hiding this comment.
Potential issue with FilesReviewed being dropped when empty: The AgentEvent.FilesReviewed field has omitempty, so when it is nil or empty, it will be omitted during the marshal→unmarshal round-trip in agentEventRecord. This means the files_reviewed key won't appear in the JSONL record. While this may be intentional for most events, note that in Finalize, record["files_reviewed"] is explicitly set afterward (line 137), which correctly overrides this. However, if Record() is ever called with a non-empty FilesReviewed, the omitempty behavior would still drop it if the slice happens to be empty. Verify this is the intended behavior.
| <td>{{if .Files}}{{.Files}}{{else}}-{{end}}</td> | ||
| <td>{{if .Findings}}{{.Findings}}{{else}}-{{end}}</td> | ||
| <td>{{if .Warnings}}{{.Warnings}}{{else}}-{{end}}</td> | ||
| <td>{{if .ContextCalls}}{{.ContextCalls}}{{else}}-{{end}}</td> |
There was a problem hiding this comment.
Bug: In Go's html/template, {{if .Files}} treats integer zero as falsy. When an agent event legitimately has Files: 0, Findings: 0, Warnings: 0, ContextCalls: 0, or DurationMS: 0, the template will display - instead of 0. This loses meaningful information — for example, a "report" event with 0 findings is different from one where findings were not tracked.
Consider using pointer types (e.g., *int) for these fields in AgentEvent so that nil means "not provided" and 0 means "explicitly zero". Alternatively, you could add separate boolean flags like HasFiles, HasFindings, etc., or compare against a sentinel value.
| var rec map[string]any | ||
| if err := json.Unmarshal(line, &rec); err == nil { | ||
| if typ, _ := rec["type"].(string); typ == "session_end" { | ||
| lastSessionEnd = append([]byte(nil), line...) | ||
| } | ||
| } |
There was a problem hiding this comment.
Performance: Every line is JSON-unmarshalled twice in peekSession.
When summary.Timestamp.IsZero() is true (i.e., for the first few lines until a timestamp is found), the line is parsed once at line 149 and then parsed again at line 179. After the timestamp is set, every remaining line is still parsed at line 179 just to check for session_end.
Since peekSession is called for every session during listing (ListSessions), this double-parsing adds unnecessary overhead — especially for large session files where only the last session_end record matters.
Consider restructuring the loop to avoid re-parsing. For example, you could track whether the current line was already parsed and reuse the result, or only attempt the session_end check inside an else branch / after the timestamp block using the already-parsed rec.
| <td>{{if .ContextCalls}}{{.ContextCalls}}{{else}}-{{end}}</td> | ||
| <td>{{if .Partial}}yes{{else}}-{{end}}</td> | ||
| <td>{{.ValidationLabel}}</td> | ||
| <td>{{if .DurationMS}}{{.DurationMS}} ms{{else}}-{{end}}</td> |
There was a problem hiding this comment.
Same zero-value issue as above: {{if .DurationMS}} will display - when the duration is legitimately 0 ms (e.g., a very fast operation). Since DurationMS is an int64, consider using a pointer type *int64 or a separate boolean to distinguish "not recorded" from "zero duration".
| if strings.Contains(s, "..") { | ||
| return "Error: file_patterns must not contain ..", nil | ||
| } |
There was a problem hiding this comment.
The strings.Contains(s, "..") check is overly broad and will produce false positives. Legitimate git pathspec patterns like file..go, my..package/, or glob patterns containing .. as a substring (but not as a path traversal component) would be incorrectly rejected.
A more precise approach would check for .. as an actual path component, e.g.:
if s == ".." || strings.HasPrefix(s, "../") || strings.HasSuffix(s, "/..") || strings.Contains(s, "/../") {
return "Error: file_patterns must not contain ..", nil
}Alternatively, you could use filepath.Clean and check if the cleaned path starts with ...
|
🔍 OpenCodeReview found 33 issue(s) in this PR.
📊 Posting Statistics:
|
Summary
This PR extracts and lands the host-agent review bundle runtime as a standalone, reviewable core.
It keeps OCR as the deterministic data plane for host-agent-led reviews:
ocr agent preparebuilds immutable review bundles/manifestsocr agent contextexposes target-aware read/find/diff/search helpersocr agent validate-commentsvalidates agent findings against bundle evidenceocr agent reportrenders validated findingsThe goal of this PR is to land the adapter-agnostic runtime first, without mixing in the larger surface/docs migration or the Cursor adapter and CI bridge.
What Is In Scope
ocr agent ...CLI/runtime plumbing required to expose and test the flowWhat Is Explicitly Out Of Scope
These will follow in separate PRs:
Design Notes
This follows the direction discussed in #276:
ocr review/ocr scanbehaviorValidation
GOTOOLCHAIN=auto go test ./...GOTOOLCHAIN=auto go build -o dist/opencodereview ./cmd/opencodereviewFollow-ups
Planned follow-up PRs after this lands or direction is confirmed: