Skip to content

proxy: add per-model ackTimeout keep-alive for slow upstream responses#735

Open
Ales999 wants to merge 5 commits into
mostlygeek:mainfrom
Ales999:feature/kilo-code-502
Open

proxy: add per-model ackTimeout keep-alive for slow upstream responses#735
Ales999 wants to merge 5 commits into
mostlygeek:mainfrom
Ales999:feature/kilo-code-502

Conversation

@Ales999

@Ales999 Ales999 commented May 3, 2026

Copy link
Copy Markdown

Add an optional ackTimeout field to model config that sends a keep-alive heartbeat to the client when the upstream inference server takes longer than the configured timeout to respond. This prevents client-side timeouts (e.g. Kilo Code task timeouts) during long generation or startup sessions.

Key changes:

  • Add AckTimeout int field to ModelConfig with YAML tag "ackTimeout"
  • Validate ackTimeout >= 0 in LoadConfigFromReader
  • Background goroutine in Process.ProxyRequest creates a per-model timer
  • Uses atomic.CompareAndSwapUint32 to ensure only one heartbeat is sent per model instance, preventing duplicate ACKs across concurrent requests
  • Heartbeat is an SSE comment (: heart-beat) which standard clients ignore but keeps the connection alive and prevents proxy/browser timeouts
  • Timer goroutine respects r.Context().Done() for graceful cancellation
  • Add ackTimeout to JSON schema (minimum: 0, default: 0 = disabled)

Example config:
models:
my-slow-model:
cmd: llama-server --model /path/to/model.gguf
ackTimeout: 110 # send heartbeat after 110s of upstream silence

Summary

Add per-model ackTimeout configuration to send a keep-alive heartbeat when the upstream inference server takes longer than expected to respond. This prevents client-side timeouts during long generation or model startup sessions.

  • Add AckTimeout field to ModelConfig (YAML: ackTimeout)
  • Background goroutine in Process.ProxyRequest sends SSE heartbeat after timeout elapses
  • Atomic CAS ensures only one heartbeat per model instance across concurrent requests
  • Config validation: ackTimeout >= 0, default 0 (disabled)

Add an optional ackTimeout field to model config that sends a keep-alive
heartbeat to the client when the upstream inference server takes longer
than the configured timeout to respond. This prevents client-side
timeouts (e.g. Kilo Code task timeouts) during long generation or startup
sessions.

Key changes:
- Add AckTimeout int field to ModelConfig with YAML tag "ackTimeout"
- Validate ackTimeout >= 0 in LoadConfigFromReader
- Background goroutine in Process.ProxyRequest creates a per-model timer
- Uses atomic.CompareAndSwapUint32 to ensure only one heartbeat is sent
  per model instance, preventing duplicate ACKs across concurrent requests
- Heartbeat is an SSE comment (: heart-beat) which standard clients ignore
  but keeps the connection alive and prevents proxy/browser timeouts
- Timer goroutine respects r.Context().Done() for graceful cancellation
- Add ackTimeout to JSON schema (minimum: 0, default: 0 = disabled)

Example config:
    models:
      my-slow-model:
        cmd: llama-server --model /path/to/model.gguf
        ackTimeout: 110  # send heartbeat after 110s of upstream silence
@coderabbitai

coderabbitai Bot commented May 3, 2026

Copy link
Copy Markdown

Caution

Review failed

Failed to post review comments

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds two per-model config options, heartbeatInterval and ackTimeout (integers ≥ 0, default 0) and implements a streaming keep‑alive path: when configured, the proxy starts a background heartbeat goroutine that emits SSE-style keep-alives until upstream writes or the client cancels. Config validation and a synchronized response writer were added to coordinate heartbeats and upstream writes.

Changes

Streaming keep‑alive

Layer / File(s) Summary
Schema
config-schema.json
Adds per-model heartbeatInterval and ackTimeout properties: integer, minimum: 0, default: 0 (inserted under properties.models.additionalProperties.properties).
Config Structure
proxy/config/model_config.go
Adds HeartbeatInterval int field to ModelConfig with YAML tag heartbeatInterval and comment describing keep‑alive behavior (0 disables).
Config Validation
proxy/config/config.go
Validates per‑model HeartbeatInterval: rejects negative values (model <id>: heartbeatInterval must be >= 0) and normalizes values >0 and <10 up to 10.
Process Initialization
proxy/process.go
Adds heartbeatInterval field to Process and initializes it from config.HeartbeatInterval in NewProcess.
ProxyRequest / Heartbeat Wiring
proxy/process.go
Reworks streaming path in ProxyRequest: extracts streaming flag from context, optionally creates a loading-status writer, wraps the http.ResponseWriter with a synchronized wrapper (responseSyncer / prw), sets SSE headers, and starts a heartbeat goroutine using heartbeatInterval to write keep‑alive SSE lines until the upstream performs the first write or the request context cancels. Final request handling is routed through the wrapper (proxyW).
Writer Synchronization
proxy/process.go
Introduces responseSyncer type and methods (Header, WriteHeartbeat, Write, FlushUnlocked, WriteHeader) to coordinate concurrent writes from the heartbeat goroutine and the upstream proxy, ensuring safe, ordered writes and signaling the first upstream write to stop heartbeats.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • mostlygeek/llama-swap#371: Modifies proxy/process.go with streaming-aware response writer wrappers and related streaming flow; closely related to the response-writer and heartbeat changes in this PR.
🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (3 warnings)

Check name Status Explanation Resolution
Title check ⚠️ Warning The title mentions 'ackTimeout' but the actual implementation has been renamed to 'HeartbeatInterval' across the codebase per commits, creating a mismatch between the title and implementation. Update the PR title to reflect the actual implementation: 'proxy: add per-model HeartbeatInterval keep-alive for slow upstream responses' or similar.
Description check ⚠️ Warning The description describes 'ackTimeout' configuration but the commits rename this to 'HeartbeatInterval', and the description does not mention the 10-second minimum enforcement or atomicity changes using atomic.Bool. Update the description to reflect the actual implementation changes, including the rename to HeartbeatInterval, the 10-second minimum validation, and the atomic.Bool-based responseSyncer mechanism.
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@proxy/process.go`:
- Around line 80-82: The ackSentFlag on the Process struct is shared across all
requests and never reset, causing the first timed-out request to permanently
disable future heartbeats; remove ackSentFlag from the Process struct and
replace it with a per-request flag (e.g., a local uint32 in the request-handling
function or attached to the request context) that you atomically set/read where
currently using ackSentFlag, ensuring it is initialized to 0 for each new
request and not reused across concurrent requests; update all usages (places
referencing ackSentFlag alongside ackTimeout and the methods that send the
heartbeat/ack) to use the new per-request variable so concurrent slow requests
each manage their own ack state.
- Around line 640-663: The goroutine currently writes headers/status/body to the
original ResponseWriter (w) before the upstream response exists, which commits a
200 and can corrupt non-streaming responses; change it to never modify or write
to w before the reverseProxy response is started: in the ackTimeout goroutine
(the function that references p.ackTimeout, p.ackSentFlag, proxyLogger,
requestBeginTime, r, w) remove the calls to w.Header().Set, w.WriteHeader,
fmt.Fprintf and any Flush; instead only atomically set p.ackSentFlag and log the
heartbeat (keep the Debugf), and if you need true heartbeat bytes implement it
in a streaming-specific code path (e.g., only write when a response-wrapping
ResponseWriter indicates headers have already been written or use a dedicated
streaming/hijack mechanism) so no concurrent/early writes occur before
reverseProxy.ServeHTTP runs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7c114581-ab9d-4ebb-94f4-54e7e113e8b6

📥 Commits

Reviewing files that changed from the base of the PR and between e261745 and e63c976.

📒 Files selected for processing (4)
  • config-schema.json
  • proxy/config/config.go
  • proxy/config/model_config.go
  • proxy/process.go

Comment thread proxy/process.go Outdated
Comment thread proxy/process.go Outdated
@Ales999

Ales999 commented May 3, 2026

Copy link
Copy Markdown
Author

Yes, I'll work on it a little more...

slot update_slots: id  0 | task 74 | prompt processing progress, n_tokens = 28869, batch.n_tokens = 1024, progress = 0.871359
slot update_slots: id  0 | task 74 | n_tokens = 28869, memory_seq_rm [28869, end)
slot update_slots: id  0 | task 74 | prompt processing progress, n_tokens = 29893, batch.n_tokens = 1024, progress = 0.902267
2026-05-04T04:01:55+05:00 [DEBUG] <qwen36th> - Sent keep-alive heartbeat for /v1/chat/completions after 1m50.0004718s
slot update_slots: id  0 | task 74 | n_tokens = 29893, memory_seq_rm [29893, end)
slot update_slots: id  0 | task 74 | prompt processing progress, n_tokens = 30917, batch.n_tokens = 1024, progress = 0.933174
slot update_slots: id  0 | task 74 | n_tokens = 30917, memory_seq_rm [30917, end)
slot update_slots: id  0 | task 74 | prompt processing progress, n_tokens = 31941, batch.n_tokens = 1024, progress = 0.964082
slot update_slots: id  0 | task 74 | n_tokens = 31941, memory_seq_rm [31941, end)

Task is continue after 110 sec and kilocode not restart this task after 10 secs.

Implements periodic keep-alive heartbeats when an upstream response is slow, preventing client timeouts during extended processing or generation sessions. This logic uses `ackTimeout` to determine the heartbeat frequency, sending HTTP 200 responses to the client as necessary.

Refactors the background goroutine from a single timer-based mechanism to a ticker-based approach for continuous heartbeats (mostlygeek#726)

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
proxy/process.go (1)

639-670: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Critical: Concurrent writes to ResponseWriter cause data races and corrupt responses.

This implementation still has the issues flagged in the previous review:

  1. Data race: The goroutine writes headers/body to w while reverseProxy.ServeHTTP(w, r) (line 677) runs concurrently — both writing to the same ResponseWriter without synchronization.

  2. Premature response commitment: Writing WriteHeader(http.StatusOK) before the upstream responds forces HTTP 200 regardless of actual upstream status. Non-streaming or error responses are corrupted.

  3. No termination signal: The goroutine has no way to know when the reverse proxy starts sending the actual response. It only exits on context cancellation, not when data begins flowing.

  4. Repeated header operations: Header().Set() and WriteHeader() are called every tick — ineffective after the first write and indicates the design needs rethinking.

To fix this properly, consider a coordinated approach where:

  • A custom ResponseWriter wrapper intercepts the first upstream write
  • The wrapper signals the heartbeat goroutine to stop via a channel/context before any upstream data is written
  • All writes go through the wrapper to ensure mutual exclusion
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proxy/process.go` around lines 639 - 670, The heartbeat goroutine is
concurrently writing to the original ResponseWriter (w) causing races and
premature committed responses; fix by replacing direct writes with a
synchronized wrapper: implement a custom responseWriter wrapper type (e.g.,
proxyResponseWriter) that embeds http.ResponseWriter and overrides Header(),
WriteHeader(), and Write() to acquire a mutex and detect the first write; have
the wrapper expose a channel or context-cancel function (e.g., firstWriteCh or
stopHeartbeat()) that the wrapper closes/signals when the first upstream Write
or WriteHeader occurs; change the heartbeat goroutine (spawned when p.ackTimeout
> 0) to accept the wrapper and stop sending heartbeats when it receives that
signal instead of writing to the raw w, and ensure reverseProxy.ServeHTTP is
called with the wrapper so all writes are funneled through the mutex-protected
methods to avoid concurrent writes and prevent premature WriteHeader calls.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@proxy/process.go`:
- Around line 639-670: The heartbeat goroutine is concurrently writing to the
original ResponseWriter (w) causing races and premature committed responses; fix
by replacing direct writes with a synchronized wrapper: implement a custom
responseWriter wrapper type (e.g., proxyResponseWriter) that embeds
http.ResponseWriter and overrides Header(), WriteHeader(), and Write() to
acquire a mutex and detect the first write; have the wrapper expose a channel or
context-cancel function (e.g., firstWriteCh or stopHeartbeat()) that the wrapper
closes/signals when the first upstream Write or WriteHeader occurs; change the
heartbeat goroutine (spawned when p.ackTimeout > 0) to accept the wrapper and
stop sending heartbeats when it receives that signal instead of writing to the
raw w, and ensure reverseProxy.ServeHTTP is called with the wrapper so all
writes are funneled through the mutex-protected methods to avoid concurrent
writes and prevent premature WriteHeader calls.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0dfe116c-e93d-4d8a-b020-6dfe0cad86a0

📥 Commits

Reviewing files that changed from the base of the PR and between e63c976 and da36fbb.

📒 Files selected for processing (1)
  • proxy/process.go

The ackTimeout heartbeat goroutine was writing directly to http.ResponseWriter
while reverseProxy.ServeHTTP did the same, causing data races and potential
premature response commits.

- Add proxyResponseWriter type wrapping http.ResponseWriter with mutex-based
  synchronization for all write operations (Header, WriteHeader, Write, Flush)
- Use firstWriteCh channel to signal heartbeat goroutine when upstream starts
  writing, allowing it to stop sending heartbeats automatically
- Route all response writes through the synchronized wrapper: heartbeat,
  reverseProxy.ServeHTTP, testHandler, and statusResponseWriter

fixes race condition described in PR review of proxy/process.go

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
proxy/process.go (1)

647-674: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Heartbeat path can pre-commit an incorrect response before upstream responds.

At Line 668–673, the goroutine writes SSE headers + 200 + body before reverseProxy.ServeHTTP(...) runs. That can overwrite intended upstream status/headers and corrupt non-SSE payloads.

#!/bin/bash
# Verify heartbeat writes happen before reverse proxy serving in the same handler path.
rg -n -C4 'if p\.ackTimeout > 0|WriteHeader\(http\.StatusOK\)|fmt\.Fprintf\(prw, ": heart-beat' proxy/process.go
rg -n -C3 'p\.testHandler\.ServeHTTP\(prw, r\)|p\.reverseProxy\.ServeHTTP\(srw, r\)|p\.reverseProxy\.ServeHTTP\(prw, r\)' proxy/process.go
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@proxy/process.go`:
- Around line 643-675: The Go file fails gofmt checks due to misformatted
goroutine/for/select block around the ackTimeout heartbeat logic; run gofmt -w
on this file to fix formatting (ensure the goroutine that uses p.ackTimeout,
tickInterval, the for range over ticker.C, the select handling prw.firstWriteCh
and r.Context().Done(), and the prw.Header()/prw.WriteHeader()/fmt.Fprintf(prw,
...) / prw.Flush() lines are properly indented and braces aligned), then re-run
gofmt -l . to confirm no files are reported.
- Around line 648-651: The heartbeat ticker is currently set to half of
p.ackTimeout (tickInterval := p.ackTimeout / 2), causing the first heartbeat to
fire too early; change the ticker interval logic in the goroutine that creates
ticker to use p.ackTimeout (or otherwise ensure the first tick occurs after
p.ackTimeout) so that the first and subsequent heartbeats honor the configured
ackTimeout; update references to tickInterval, ticker, and the anonymous
goroutine accordingly to reflect the corrected interval.
- Around line 793-797: The Header() method on proxyResponseWriter returns the
live http.Header after releasing p.mu, causing races with concurrent
WriteHeader() and other goroutines; change Header() to return a deep copy/clone
of p.writer.Header() (copy each header slice) instead of the live map so callers
mutate the copy, or alternatively provide and update usage to a locked setter
(e.g., a SetHeader/AddHeader method on proxyResponseWriter that takes
p.mu.Lock/p.mu.Unlock) and replace direct Header().Set usages; ensure changes
reference proxyResponseWriter.Header, p.writer, p.mu, and WriteHeader so all
header mutations are either performed on the cloned map or through the new
locked accessor to eliminate the race.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bb657c4b-6d04-4f1d-bb3e-f504777a5fd9

📥 Commits

Reviewing files that changed from the base of the PR and between da36fbb and 45bfea7.

📒 Files selected for processing (1)
  • proxy/process.go

Comment thread proxy/process.go Outdated
Comment thread proxy/process.go Outdated
Comment thread proxy/process.go Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
proxy/process.go (2)

648-651: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Heartbeat ticker fires at ackTimeout/2, not at ackTimeout.

tickInterval := p.ackTimeout / 2 means the first tick — and every subsequent tick — fires at half the configured value. A user who sets ackTimeout: 110 expects the keep-alive after 110 s but receives it after 55 s.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proxy/process.go` around lines 648 - 651, The heartbeat ticker currently uses
tickInterval := p.ackTimeout / 2 which fires at half the configured ackTimeout;
change the ticker interval to use the actual ack timeout value (e.g., set
tickInterval := p.ackTimeout) so the Ticker created by ticker :=
time.NewTicker(tickInterval) fires at the expected p.ackTimeout; update any
related comments or tests referencing the old behavior and ensure p.ackTimeout
is a time.Duration when used here.

793-797: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Header() returns the live map without lock protection on subsequent mutations.

Header() acquires RLock only to read p.writer, then returns the live http.Header map before releasing the lock. The .Set() calls at lines 668–670 (heartbeat goroutine) and 905–907 (newStatusResponseWriter) execute concurrently with WriteHeader() (which holds an exclusive Lock), creating an unguarded data race on the underlying map.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proxy/process.go` around lines 793 - 797, Header() currently returns the live
http.Header without protecting subsequent mutations, causing data races; change
Header() to return a deep copy (clone) of p.writer.Header() while holding the
lock, and add/convert callers to use new locked setters (e.g.,
proxyResponseWriter.SetHeader(key, value) and DelHeader(key)) that take
p.mu.Lock/p.mu.Unlock and call p.writer.Header().Set/Del, updating the heartbeat
goroutine (the .Set calls around lines 668–670) and newStatusResponseWriter
header mutations (around lines 905–907) to use these setter methods instead of
calling Header().Set concurrently; ensure WriteHeader() continues to use the
exclusive Lock as-is.
🧹 Nitpick comments (1)
proxy/process.go (1)

570-596: 💤 Low value

proxyResponseWriter creation and srw wiring look correct.

Worth noting: when srw != nil, newStatusResponseWriter immediately calls WriteHeader(200) on prw, which closes prw.firstWriteCh. The heartbeat goroutine (started later at line 649) will therefore exit on its very first tick, making ackTimeout a no-op in the loading-state path. This is the correct behavior — the client is already receiving updates — but it means the goroutine is unconditionally started and immediately exits in that path. Consider gating the goroutine start on srw == nil to avoid the unnecessary spawn.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proxy/process.go` around lines 570 - 596, The heartbeat goroutine is always
started even when srw != nil (i.e., newStatusResponseWriter already called
prw.WriteHeader and client is streaming), causing an unnecessary immediate-exit
goroutine; change the logic that starts the heartbeat goroutine to only spawn it
when srw == nil (keep existing behavior for proxyResponseWriter/prw and when no
statusResponseWriter is used), so locate the code that starts the heartbeat
goroutine (the goroutine started after creating prw/newProxyResponseWriter and
srw/newStatusResponseWriter) and wrap its creation in a conditional that checks
srw == nil before starting.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@proxy/process.go`:
- Around line 799-807: The WriteHeader method on proxyResponseWriter calls
p.writer.WriteHeader(code) unconditionally, causing superfluous WriteHeader
calls after firstWritten is set by the heartbeat; update WriteHeader so that the
call to p.writer.WriteHeader(code) is performed only inside the existing if
!p.firstWritten guard (after setting firstWritten=true and closing firstWriteCh)
and skipped on subsequent calls, ensuring only the first WriteHeader delegates
to p.writer.WriteHeader and avoiding redundant writes.
- Around line 647-676: The heartbeat goroutine is firing for non-streaming
requests and corrupting responses; gate it on the request's isStreaming flag
instead of running unconditionally. Before launching the goroutine (the block
that checks p.ackTimeout > 0 and uses prw.firstWriteCh, prw.Header,
prw.WriteHeader, fmt.Fprintf, prw.Flush), check the resolved isStreaming value
from the request context and only start the ticker if isStreaming is true;
additionally, inside the goroutine double-check isStreaming (or short-circuit)
before writing headers/body to ensure no SSE headers or ": heart-beat" lines are
sent for non-streaming handlers.

---

Duplicate comments:
In `@proxy/process.go`:
- Around line 648-651: The heartbeat ticker currently uses tickInterval :=
p.ackTimeout / 2 which fires at half the configured ackTimeout; change the
ticker interval to use the actual ack timeout value (e.g., set tickInterval :=
p.ackTimeout) so the Ticker created by ticker := time.NewTicker(tickInterval)
fires at the expected p.ackTimeout; update any related comments or tests
referencing the old behavior and ensure p.ackTimeout is a time.Duration when
used here.
- Around line 793-797: Header() currently returns the live http.Header without
protecting subsequent mutations, causing data races; change Header() to return a
deep copy (clone) of p.writer.Header() while holding the lock, and add/convert
callers to use new locked setters (e.g., proxyResponseWriter.SetHeader(key,
value) and DelHeader(key)) that take p.mu.Lock/p.mu.Unlock and call
p.writer.Header().Set/Del, updating the heartbeat goroutine (the .Set calls
around lines 668–670) and newStatusResponseWriter header mutations (around lines
905–907) to use these setter methods instead of calling Header().Set
concurrently; ensure WriteHeader() continues to use the exclusive Lock as-is.

---

Nitpick comments:
In `@proxy/process.go`:
- Around line 570-596: The heartbeat goroutine is always started even when srw
!= nil (i.e., newStatusResponseWriter already called prw.WriteHeader and client
is streaming), causing an unnecessary immediate-exit goroutine; change the logic
that starts the heartbeat goroutine to only spawn it when srw == nil (keep
existing behavior for proxyResponseWriter/prw and when no statusResponseWriter
is used), so locate the code that starts the heartbeat goroutine (the goroutine
started after creating prw/newProxyResponseWriter and
srw/newStatusResponseWriter) and wrap its creation in a conditional that checks
srw == nil before starting.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bd25a890-93c3-4fba-ba07-b92d88779979

📥 Commits

Reviewing files that changed from the base of the PR and between 45bfea7 and befeae9.

📒 Files selected for processing (1)
  • proxy/process.go

Comment thread proxy/process.go Outdated
Comment thread proxy/process.go Outdated
…y handling

Refactor the keep-alive heartbeat feature (issue mostlygeek#726):

- Rename AckTimeout config field to HeartbeatInterval across all files
- Enforce minimum 10 second interval for heartbeat
- Replace proxyResponseWriter with lighter responseSyncer using atomic.Bool
- Add FlushUnlocked to prevent deadlock during concurrent flush operations
- Activate heartbeat only for streaming requests (isStreaming check moved earlier)
- Use closeCh signal instead of firstWriteCh for upstream started detection
- Wrap statusResponseWriter in responseSyncer when SendLoadingState is enabled
@Ales999 Ales999 changed the title proxy: add per-model ackTimeout keep-alive for slow upstream responses (#726) proxy: add per-model ackTimeout keep-alive for slow upstream responses May 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant