fix: don't honor result_as_answer when tool execution errors #5157
Open
devin-ai-integration[bot] wants to merge 2 commits into main from
When a tool with result_as_answer=True raises an exception, the agent now continues reasoning about the error instead of treating the error message as the final answer. Fixes #5156 Co-Authored-By: João <joao@crewai.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Address Cursor Bugbot review: the add_image exception handlers in use() and ause() were missing the error flag, allowing result_as_answer to be incorrectly honored when those paths errored. Co-Authored-By: João <joao@crewai.com>

Summary
Fixes #5156. When a tool with `result_as_answer=True` raises an exception, the error message was being treated as the agent's final answer, preventing the agent from reflecting on the failure and retrying.

The fix adds error tracking across all tool execution code paths so that `result_as_answer` is only honored on successful tool executions:

- `tool_usage.py`: Added `_last_execution_errored` flag, set in all error branches (`ToolUsageError`, tool selection failure, runtime exception in `_use`/`_ause`)
- `tool_utils.py`: Both `execute_tool_and_check_finality` and `aexecute_tool_and_check_finality` check the flag before returning `result_as_answer=True`
- `crew_agent_executor.py`: Propagates `error_occurred` through execution result dict; `_append_tool_result_and_check_finality` gates on it
- `agent_utils.py`: Uses existing `error_event_emitted` to gate `result_as_answer`
- `experimental/agent_executor.py`: Same pattern applied to sequential loop, parallel results loop, and parallel error fallback

Review & Testing Checklist for Human

- `step_executor.py` coverage: This file was not modified. Confirm that its native tool path delegates to one of the fixed executors and doesn't have its own independent `result_as_answer` check that bypasses the fix.
- `_last_execution_errored` reliability: The flag is a mutable instance attribute on `ToolUsage`, reset at the top of `use()`/`ause()` and read immediately after by `tool_utils.py`. Confirm no intermediate call can reset it before it's read.
- `result_as_answer=True` tool that intentionally fails, and confirm the agent continues reasoning rather than returning the error as its final answer.
- `"original_tool": None` alongside `"error_occurred": True`; the `result_as_answer` guard is technically unreachable here since `original_tool` is falsy. Confirm this is acceptable.

Notes

- `ToolUsage` flag, `execute_tool_and_check_finality` (both error and success), and native tool execution in `AgentExecutor` (both error and success).
- `_last_execution_errored` flag on `ToolUsage` (for text/ReAct pattern), and an `error_occurred` dict key / `error_event_emitted` local variable (for native tool calling). This follows existing conventions in each module rather than introducing a new abstraction.

Link to Devin session: https://app.devin.ai/sessions/a7393abd35bf4141bf23fe9e1b86b364
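The flag-based gating described for the text/ReAct path can be illustrated with a minimal sketch. This is not the code from the PR: `SketchTool`, `SketchToolUsage`, and `execute_and_check_finality` are hypothetical stand-ins, and only the `_last_execution_errored` attribute name and the reset-then-read ordering are taken from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class SketchTool:
    """Hypothetical stand-in for a tool; not the crewAI tool class."""
    name: str
    func: Callable[[str], str]
    result_as_answer: bool = False


class SketchToolUsage:
    """Hypothetical stand-in for ToolUsage, reduced to the error flag."""

    def __init__(self) -> None:
        self._last_execution_errored = False

    def use(self, tool: SketchTool, tool_input: str) -> str:
        # Reset at the top of every call so a stale flag never leaks through.
        self._last_execution_errored = False
        try:
            return tool.func(tool_input)
        except Exception as exc:
            # Every error branch sets the flag before returning the message.
            self._last_execution_errored = True
            return f"Tool '{tool.name}' failed: {exc}"


def execute_and_check_finality(
    usage: SketchToolUsage, tool: SketchTool, tool_input: str
) -> Tuple[str, bool]:
    """Return (result, is_final); is_final is True only on a successful run."""
    result = usage.use(tool, tool_input)
    # Read the flag immediately after use(), before any other call can reset it.
    is_final = tool.result_as_answer and not usage._last_execution_errored
    return result, is_final
```

With this shape, a failing tool still returns its error text to the caller, but `is_final` stays `False`, so the agent loop would keep reasoning about the error instead of stopping.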
Note

Medium Risk
Changes tool-execution finality logic across multiple executors and hook wrappers; behavior around `result_as_answer` now depends on new error-tracking flags, which could alter when agents short-circuit after tool calls.

Overview
Prevents tools marked `result_as_answer=True` from prematurely short-circuiting the agent when the tool execution fails, allowing the model to see the error and continue reasoning/retrying.

This propagates explicit error state through native tool execution results (including parallel paths) in `CrewAgentExecutor` and the experimental `AgentExecutor`, and adds `_last_execution_errored` tracking in `ToolUsage` so `tool_utils.execute_*_tool_and_check_finality` only returns `result_as_answer` on successful runs. Adds regression tests covering both success/error cases for native tool execution and `ToolUsage`/`execute_tool_and_check_finality` behavior.

Written by Cursor Bugbot for commit f5dc745.
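For the native tool-calling path, the idea of carrying error state in the execution result can be sketched as below. The function names and the shape of the tool dict are assumptions made for illustration; only the `"original_tool"` and `"error_occurred"` keys come from the PR description, and the real executors may structure this differently.

```python
from typing import Any, Dict, Optional


def run_tool_call(
    tool: Optional[Dict[str, Any]], args: Dict[str, Any]
) -> Dict[str, Any]:
    """Run one tool call and record whether it errored.

    `tool` is a hypothetical dict with "func" and "result_as_answer" keys.
    """
    if tool is None:
        # Tool lookup failed: there is no tool object to attach to the result.
        return {
            "result": "Tool not found",
            "original_tool": None,
            "error_occurred": True,
        }
    try:
        output = str(tool["func"](**args))
        errored = False
    except Exception as exc:
        output = f"Error executing tool: {exc}"
        errored = True
    return {"result": output, "original_tool": tool, "error_occurred": errored}


def is_final_answer(execution_result: Dict[str, Any]) -> bool:
    """Honor result_as_answer only when the tool ran without error."""
    tool = execution_result.get("original_tool")
    return bool(
        tool
        and tool.get("result_as_answer")
        and not execution_result.get("error_occurred", False)
    )
```

In this sketch the lookup-failure branch never reaches the `result_as_answer` check because `original_tool` is `None`, which mirrors the unreachable-guard observation in the review checklist.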