Fix goroutine leak: cancel socket context after close handlers#3
Open
ertakbuga wants to merge 2 commits intoRobertWHurst:masterfrom
Open
Fix goroutine leak: cancel socket context after close handlers#3ertakbuga wants to merge 2 commits intoRobertWHurst:masterfrom
ertakbuga wants to merge 2 commits intoRobertWHurst:masterfrom
Conversation
When a connection closes via context.Canceled (the Eurus close path), handler goroutines blocked in ReceiveInto/ReceiveIntoWithContext wait on socket.Done() which never fires because cancelCtx() is never called. This causes permanent goroutine leaks (~3k/hour in production). Two fixes: 1. router.go: call socket.cancelCtx() after HandleClose returns in both HandleConnection and handleWebsocketConnection. This is the primary fix — it ensures all in-flight handler goroutines are unblocked after close handlers complete. 2. socket.go: call s.Close() on context.Canceled in HandleNextMessageWithNode for consistency with other error paths. This is defense-in-depth. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… sync Two issues that cause data races detected by `go test -race`: 1. Router.Handle/HandleOpen/HandleClose read `subCtx.currentHandlerNode` AFTER calling `subCtx.free()`, which returns the context to sync.Pool. Another goroutine can Get the same context and write to it concurrently. Fix: capture the field before calling free(). 2. HandleNextMessageWithNode spawns a goroutine for message handling but handleWebsocketConnection proceeds to HandleClose without waiting for it to complete. The goroutine's ctx.free() can race with HandleClose's contextFromPool(). Fix: track goroutines with sync.WaitGroup on Socket and wait before HandleClose. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
context.Canceled(the Eurus close path). Handler goroutines blocked inReceiveInto/ReceiveIntoWithContextwait onsocket.Done()which never fires becausecancelCtx()is never called after close handlers run.router.go): Callsocket.cancelCtx()afterHandleClosereturns in bothHandleConnectionandhandleWebsocketConnection. This ensures all in-flight handler goroutines are unblocked after close handlers complete.socket.go): Calls.Close()oncontext.CanceledinHandleNextMessageWithNodefor consistency with other error paths (websocket close status already callsClose).router_goroutine_leak_test.go): Reproduces the exact Eurus close scenario — a handler blocks onReceiveInto, connection returnscontext.Canceled, and verifies the handler goroutine exits (socket context is cancelled).Context
In production (Applications-Service), the
SubscribeEntryWebSocket handler spawns goroutines that block onReceiveIntowaiting for follow-up messages. When the connection closes via Eurus (which returnscontext.Canceled),HandleNextMessageWithNodereturnsfalsewithout callingsocket.Close(). The router then callssocket.HandleClose()which runs close handler callbacks but never callssocket.cancelCtx(). Handler goroutines blocked inReceiveIntowait onsocket.Done()which never fires → permanent goroutine leak (~3,000 goroutines/hour, causing OOMKills).Changes
router.gosocket.cancelCtx()afterHandleClosein bothHandleConnectionandhandleWebsocketConnectionsocket.gos.Close()inHandleNextMessageWithNodefor thecontext.Cancelederror pathrouter_goroutine_leak_test.gocontext.Canceledclose pathTest plan
TestHandlerGoroutineCleanupOnContextCanceledpassesgo test ./...)🤖 Generated with Claude Code