Feature: token streaming support for ReAct Agent #1851
thepatrickchin wants to merge 7 commits into NVIDIA:develop from
ReAct's _stream_llm explicitly passed self._runnable_config to the LLM, overriding LangGraph's injected runtime config and dropping its streaming callbacks. Fixed by merging both configs so LangGraph can observe tokens.

Without a _stream_fn, the framework had no streaming entry point and always fell back to the non-streaming path. Fixed by propagating the injected node config through agent_node down to _stream_llm, and registering a _stream_fn via FunctionInfo.create that yields ChatResponseChunk tokens from graph.astream(stream_mode="messages").

Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>
Buffer streamed tokens until FINAL_ANSWER_PATTERN is detected, then yield only the content after the marker. Falls back to yielding the full buffer for direct answers that omit the ReAct format.

Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>
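The buffering described in this commit message can be sketched roughly as below. This is an illustration only, not the PR's code: `FINAL_ANSWER_RE` is an assumed stand-in for FINAL_ANSWER_PATTERN (whose actual definition is not shown here), and `filter_final_answer` is a hypothetical helper name.

```python
import re
from collections.abc import Iterable, Iterator

# Assumed stand-in for FINAL_ANSWER_PATTERN; the real pattern may differ.
FINAL_ANSWER_RE = re.compile(r"final\s+answer\s*:", re.IGNORECASE)


def filter_final_answer(tokens: Iterable[str]) -> Iterator[str]:
    """Yield only the text that follows the Final Answer marker.

    Tokens are buffered until the marker appears. If the stream ends
    without a marker, the whole buffer is yielded as a fallback.
    """
    buffer = ""
    marker_seen = False
    for token in tokens:
        if marker_seen:
            # Marker already found: pass tokens straight through.
            yield token
            continue
        buffer += token
        match = FINAL_ANSWER_RE.search(buffer)
        if match:
            marker_seen = True
            tail = buffer[match.end():]
            if tail:
                yield tail
    if not marker_seen and buffer:
        # Direct answer without ReAct formatting: emit everything.
        yield buffer
```

Searching the accumulated buffer rather than each token matters because the marker may be split across several tokens, for example "Final " followed by "Answer:".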
Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>
Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>
Walkthrough
Adds optional
Changes
Sequence Diagram
sequenceDiagram
participant Client
participant ReActGraph
participant AgentNode
participant BaseAgent
participant Runnable
participant Stream
Client->>ReActGraph: astream(inputs, config)
ReActGraph->>AgentNode: agent_node(state, config)
AgentNode->>AgentNode: build inputs dict
AgentNode->>BaseAgent: _stream_llm(runnable, inputs, config)
BaseAgent->>BaseAgent: effective_config = merge_configs(self._runnable_config, config)
BaseAgent->>Runnable: astream(inputs, effective_config)
Runnable->>Stream: emit AIMessageChunk events
Stream->>AgentNode: deliver chunks (metadata: node="agent")
AgentNode->>ReActGraph: forward buffered/filtered chunks
ReActGraph->>Client: emit post-FINAL_ANSWER_PATTERN chunks
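The merge step in the diagram is what keeps the graph's streaming callbacks alive. The sketch below illustrates the idea with plain dictionaries. It is a simplified stand-in and does not reproduce the behaviour of any library's merge function; `merge_configs_sketch` and the handler names are hypothetical.

```python
from typing import Any, Optional


def merge_configs_sketch(*configs: Optional[dict]) -> dict:
    """Simplified config merge: later values win, but callback lists are
    concatenated and metadata dicts are combined so nothing is dropped."""
    merged: dict = {}
    for config in configs:
        if not config:
            continue
        for key, value in config.items():
            if key == "callbacks" and key in merged:
                merged[key] = [*merged[key], *value]
            elif key == "metadata" and key in merged:
                merged[key] = {**merged[key], **value}
            else:
                merged[key] = value
    return merged


# Hypothetical configs: one owned by the agent, one injected by the graph.
agent_config: dict = {"callbacks": ["agent_handler"], "recursion_limit": 10}
injected_config: dict = {
    "callbacks": ["stream_handler"],
    "metadata": {"node": "agent"},
}

# Passing only agent_config would lose "stream_handler"; merging keeps both.
effective_config = merge_configs_sketch(agent_config, injected_config)
```

With only `agent_config` passed to the runnable, the injected streaming handler would never be invoked, which matches the symptom described in the first commit message.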
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
🧹 Nitpick comments (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/register.py (1)
238-251: Consider using logger.exception() when re-raising or clarify intent.
Per coding guidelines: when catching and logging exceptions without re-raising, use logger.exception() to capture the full stack trace. Line 250 uses logger.error() before re-raising, which is correct per the guideline for re-raising. However, verify this is intentional.
The GraphRecursionError handling (lines 238-248) gracefully yields an error message instead of raising, which is appropriate for streaming: users receive a clear message rather than a broken stream.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/register.py` around lines 238 - 251, The current except Exception block logs with logger.error("%s ReAct Agent streaming failed with exception: %s", AGENT_LOG_PREFIX, ex) and then re-raises; if you want the full stack trace captured in logs change this to logger.exception(...) in the except Exception as ex handler (preserving AGENT_LOG_PREFIX and the raise), otherwise explicitly document with an inline comment that error-level logging without stack trace is intentional; refer to the except Exception handler, logger.error call, GraphRecursionError handling, and ChatResponseChunk.create_streaming_chunk to locate the area to change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/register.py`:
- Around line 238-251: The current except Exception block logs with
logger.error("%s ReAct Agent streaming failed with exception: %s",
AGENT_LOG_PREFIX, ex) and then re-raises; if you want the full stack trace
captured in logs change this to logger.exception(...) in the except Exception as
ex handler (preserving AGENT_LOG_PREFIX and the raise), otherwise explicitly
document with an inline comment that error-level logging without stack trace is
intentional; refer to the except Exception handler, logger.error call,
GraphRecursionError handling, and ChatResponseChunk.create_streaming_chunk to
locate the area to change.
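The guideline referenced in this comment can be illustrated with the standard library alone. The sketch below is not the code in register.py; the logger name, messages, and function names are made up for the example.

```python
import io
import logging

# Capture log output in memory so the difference is easy to inspect.
log_output = io.StringIO()
logger = logging.getLogger("react_agent_logging_sketch")
logger.addHandler(logging.StreamHandler(log_output))
logger.setLevel(logging.ERROR)
logger.propagate = False


def swallow_and_log() -> None:
    """The exception is handled here, so record the full traceback."""
    try:
        raise ValueError("bad chunk")
    except ValueError:
        logger.exception("streaming failed")


def log_and_reraise() -> None:
    """The exception propagates, so a one-line error avoids a duplicate
    traceback when a caller further up logs it again."""
    try:
        raise ValueError("bad chunk")
    except ValueError as ex:
        logger.error("streaming failed with exception: %s", ex)
        raise
```

`logger.exception()` logs at ERROR level and appends the active traceback, while `logger.error()` records only the formatted message unless `exc_info` is passed.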
📒 Files selected for processing (6)
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/base.py
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/prompt.py
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/register.py
- packages/nvidia_nat_langchain/tests/agent/test_base.py
- packages/nvidia_nat_langchain/tests/agent/test_react.py
willkill07 left a comment
Structurally, this is good.
We just need to ensure consistent interfaces, which are currently being violated, and ensure behavior is consistent between the streaming and non-streaming paths.
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/prompt.py
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/register.py
Outdated
…eter - Update agent_node of tool calling agent to match

Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>
Apply remove_r1_think_tags to the streaming buffer before searching for the Final Answer marker and before yielding the fallback response, so think tags are not leaked to the client when the LLM answers directly without ReAct format. This matches non-streaming behavior.

Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>
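The ordering described in this commit message (strip think tags first, then look for the marker) can be sketched as follows. Both regexes are assumptions made for illustration: `THINK_RE` stands in for whatever remove_r1_think_tags actually removes, `FINAL_ANSWER_RE` stands in for the Final Answer marker pattern, and `visible_answer` is a hypothetical helper.

```python
import re

# Assumed tag format; the real remove_r1_think_tags may handle more cases.
THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)
# Assumed stand-in for the Final Answer marker pattern.
FINAL_ANSWER_RE = re.compile(r"final\s+answer\s*:\s*", re.IGNORECASE)


def visible_answer(buffer: str) -> str:
    """Return the text a client should see for a completed buffer."""
    # Strip reasoning first so a marker inside it cannot match.
    cleaned = THINK_RE.sub("", buffer)
    match = FINAL_ANSWER_RE.search(cleaned)
    # Fallback: no marker means a direct answer, minus the think block.
    return cleaned[match.end():] if match else cleaned
```

Stripping before searching covers two cases: a marker that appears only inside the reasoning block is ignored, and the fallback path never returns the reasoning text itself.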
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/dual_node.py`:
- Around line 48-49: Add a Google-style docstring to the public async method
agent_node(self, state: BaseModel, config: RunnableConfig | None = None) ->
BaseModel explaining the expected types and semantics: describe the purpose of
the method, the meaning and required shape/fields of the state parameter
(BaseModel), the optional config parameter (RunnableConfig) and when it may be
None, the contract of the returned BaseModel (what fields/side-effects callers
should expect), any exceptions that implementations may raise, and thread/async
considerations; place the docstring immediately under the async def and ensure
it follows Google-style param/returns/raises sections so implementers have a
clear, consistent contract.
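A Google-style docstring of the kind requested here might look like the sketch below. The class, the stand-in types, and the wording of each section are assumptions for illustration; the real method takes a BaseModel state and an optional RunnableConfig, and its actual contract should come from the codebase.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Optional


class DualNodeSketch(ABC):
    """Hypothetical stand-in for the base class declaring agent_node."""

    @abstractmethod
    async def agent_node(self, state: Any, config: Optional[dict] = None) -> Any:
        """Run one agent step and return the updated graph state.

        Args:
            state: Current graph state passed in by the graph runtime.
            config: Runtime config injected for this node invocation.
                May be None when the node is called directly, for
                example from a unit test.

        Returns:
            The state object after the agent step has been applied.

        Raises:
            RuntimeError: If the underlying model call fails
                (illustrative; implementations define their own errors).
        """


class EchoAgent(DualNodeSketch):
    """Trivial implementation used only to exercise the interface."""

    async def agent_node(self, state: Any, config: Optional[dict] = None) -> Any:
        return {**state, "configured": config is not None}


result = asyncio.run(EchoAgent().agent_node({"messages": []}))
```

Keeping `config` optional with a default of None lets existing callers that pass only `state` continue to work.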
📒 Files selected for processing (3)
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/dual_node.py
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/register.py
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/agent.py
✅ Files skipped from review due to trivial changes (1)
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/agent.py
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/register.py
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/dual_node.py
/ok to test 2c371ee
Description
Previously, ReAct's _stream_llm explicitly passed self._runnable_config to the LLM, overriding LangGraph's injected runtime config and dropping its streaming callbacks. This was fixed by merging both configs so LangGraph can observe tokens.
The ReAct Agent was also missing _stream_fn, so the framework had no streaming entry point and always fell back to the non-streaming path. This was fixed by propagating the injected node config through agent_node down to _stream_llm, and registering a _stream_fn via FunctionInfo.create that yields ChatResponseChunk tokens from graph.astream.
Closes #1850
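A streaming entry point of the kind described above can be sketched as an async generator that filters a stream of (chunk, metadata) pairs. Everything here is a stand-in: `FakeChunk`, `fake_graph_stream`, and the `langgraph_node` metadata key are assumptions for the example, and the real function yields ChatResponseChunk objects rather than plain strings.

```python
import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass


@dataclass
class FakeChunk:
    """Stand-in for a message chunk; only the text content matters here."""

    content: str


async def fake_graph_stream() -> AsyncIterator[tuple]:
    """Mimics a (chunk, metadata) stream; the metadata key is an assumption."""
    yield FakeChunk("Thought"), {"langgraph_node": "agent"}
    yield FakeChunk("tool output"), {"langgraph_node": "tool"}
    yield FakeChunk(" done"), {"langgraph_node": "agent"}


async def stream_agent_tokens(stream, node: str = "agent") -> AsyncIterator[str]:
    """Yield text only from chunks emitted by the given node."""
    async for chunk, metadata in stream:
        if metadata.get("langgraph_node") == node and chunk.content:
            yield chunk.content


async def collect() -> list:
    """Drain the filtered stream into a list for inspection."""
    return [token async for token in stream_agent_tokens(fake_graph_stream())]
```

Filtering by node keeps tool output out of the client-facing stream, so only the agent's own tokens are forwarded.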
By Submitting this PR I confirm:
Summary by CodeRabbit
New Features
Improvements
Tests