fix: PID lock recycle, recall threshold bypass, orphaned compressor refs #1211

Open
JasonOA888 wants to merge 2 commits into volcengine:main from JasonOA888:fix/issue-1106-recall-threshold-bypass

@JasonOA888
Contributor

Fixes #1088, Fixes #1106, Fixes #1048

Bug 1: PID lock recycle → cascading Gateway failure (#1088)

On Linux, PIDs are recycled. After an OpenViking crash, _is_pid_alive() calls os.kill(pid, 0), which succeeds for any live process with that PID, not just OpenViking. If the PID recorded in the lock has been reused by an unrelated process, the stale lock looks held and DataDirectoryLocked is raised, which cascades into unhandled promise rejections and session deadlock in the OpenClaw Gateway.

Fix: After os.kill(pid, 0) succeeds on Linux, verify the process is actually OpenViking by reading /proc/{pid}/cmdline. If it's not an OpenViking process, treat the lock as stale and reclaim it.
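
For reviewers, the check described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the helper names, the "openviking" marker string, and the choice to keep the lock when /proc cannot be read are all assumptions.

```python
import os
import sys
from pathlib import Path
from typing import Optional


def is_pid_alive(pid: int) -> bool:
    """Probe a PID with signal 0 (POSIX only); nothing is actually sent."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but is owned by another user.
        return True
    return True


def read_cmdline(pid: int) -> Optional[bytes]:
    """Return the raw NUL-separated cmdline, or None if it cannot be read."""
    try:
        return Path(f"/proc/{pid}/cmdline").read_bytes()
    except OSError:
        return None


def cmdline_matches(raw: bytes, marker: str = "openviking") -> bool:
    """True if the marker (an assumed substring) appears in the cmdline."""
    text = raw.replace(b"\0", b" ").decode("utf-8", errors="replace")
    return marker in text.lower()


def lock_is_stale(pid: int) -> bool:
    """Decide whether a lock file that records `pid` can be reclaimed."""
    if not is_pid_alive(pid):
        return True
    if not sys.platform.startswith("linux"):
        return False  # no /proc to consult; keep the existing behaviour
    raw = read_cmdline(pid)
    if raw is None:
        return False  # cannot verify, so stay conservative (assumption)
    return not cmdline_matches(raw)
```

An empty cmdline (kernel threads, zombies) does not contain the marker, so in this sketch such a PID is treated as not OpenViking and the lock is reclaimed.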

Bug 2: recallScoreThreshold bypassed (#1106)

postProcessMemories correctly filters by score threshold, but pickMemoriesForInjection bypasses it when supplementing leaf memories with non-leaf items. The fallback loop at line 258 pushes items without checking clampScore(item.score) < scoreThreshold.

Fix: Add scoreThreshold parameter to pickMemoriesForInjection (default 0 for backward compat) and check threshold in the supplementation loop.
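
The intended selection logic, rendered in Python only to illustrate the control flow (the real function is TypeScript). The Memory shape, the is_leaf field, the limit parameter, and clamping to [0, 1] are assumptions made for this sketch, not taken from the plugin source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Memory:
    uri: str
    score: float
    is_leaf: bool  # hypothetical field standing in for the leaf check


def clamp_score(score: float) -> float:
    """Clamp to [0, 1]; NaN becomes 0 (assumed behaviour of clampScore)."""
    if score != score:  # NaN is the only value not equal to itself
        return 0.0
    return max(0.0, min(1.0, score))


def pick_memories_for_injection(
    items: List[Memory], limit: int, score_threshold: float = 0.0
) -> List[Memory]:
    """Prefer leaf memories, then top up with non-leaf items above threshold."""
    picked = [m for m in items if m.is_leaf][:limit]
    for item in items:
        if len(picked) >= limit:
            break
        if item.is_leaf:
            continue  # leaves were already considered above
        if clamp_score(item.score) < score_threshold:
            continue  # the check that the fallback loop was missing
        picked.append(item)
    return picked
```

With the default threshold of 0 nothing is filtered in the fallback loop, which is what keeps existing callers unaffected.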

Bug 3: Compressor infinite retry on orphaned refs (#1048)

When a memory file is deleted externally (e.g., manual cleanup), _merge_into_existing fails with a generic exception and returns False. The caller skips the merge but the orphaned reference stays in the compressor's tracking state, causing retries every few minutes indefinitely.

Fix: Catch FileNotFoundError separately, log a warning, and clean up the orphaned vector record via vikingdb.delete_uris so it won't be retried.
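
A rough sketch of the error-handling shape, assuming a synchronous merge and a vector store whose delete_uris accepts a list of URIs. FakeVectorStore, the function signature, and the append-style merge are illustrative stand-ins; only the names _merge_into_existing and delete_uris come from this PR.

```python
import logging
from pathlib import Path
from typing import Iterable, List

logger = logging.getLogger("compressor_sketch")


class FakeVectorStore:
    """Stand-in for the vector store; only delete_uris is modelled."""

    def __init__(self, uris: Iterable[str]) -> None:
        self.uris = set(uris)

    def delete_uris(self, uris: List[str]) -> None:
        self.uris.difference_update(uris)


def merge_into_existing(path: Path, new_text: str, store, uri: str) -> bool:
    """Append new_text to an existing memory file; drop orphaned refs."""
    try:
        existing = path.read_text(encoding="utf-8")
        path.write_text(existing + "\n" + new_text, encoding="utf-8")
        return True
    except FileNotFoundError:
        # The file was removed externally: delete the vector record so the
        # compressor stops retrying this reference.
        logger.warning("memory file missing, removing orphaned ref: %s", uri)
        store.delete_uris([uri])
        return False
    except Exception:
        # Any other failure keeps the previous behaviour: log and skip.
        logger.exception("merge failed for %s", uri)
        return False
```

FileNotFoundError has to be caught before the generic handler, otherwise the broad except would swallow it and the reference would never be cleaned up.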

Testing

  • Python files validated via ast.parse
  • TypeScript brace balance verified
  • All changes are minimal and backward-compatible

1. process_lock: verify PID is actually OpenViking on Linux by checking
   /proc/{pid}/cmdline. Prevents false DataDirectoryLocked when PIDs are
   recycled to unrelated processes after crash (Fixes volcengine#1088).

2. memory-ranking: add scoreThreshold param to pickMemoriesForInjection
   and filter non-leaf items below threshold. Previously low-scoring
   memories bypassed recallScoreThreshold when supplementing leaves
   (Fixes volcengine#1106).

3. compressor: catch FileNotFoundError separately in _merge_into_existing,
   clean up orphaned vector records so they are not retried indefinitely
   (Fixes volcengine#1048).
@github-actions

github-actions bot commented Apr 3, 2026

PR Code Suggestions ✨

No code suggestions found for the PR.
