Skip to content

fix: don't honor result_as_answer when tool execution errors#5157

Open
devin-ai-integration[bot] wants to merge 2 commits intomainfrom
devin/1774716145-fix-result-as-answer-on-error
Open

fix: don't honor result_as_answer when tool execution errors#5157
devin-ai-integration[bot] wants to merge 2 commits intomainfrom
devin/1774716145-fix-result-as-answer-on-error

Conversation

@devin-ai-integration
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration bot commented Mar 28, 2026

Summary

Fixes #5156. When a tool with result_as_answer=True raises an exception, the error message was being treated as the agent's final answer, preventing the agent from reflecting on the failure and retrying.

The fix adds error tracking across all tool execution code paths so that result_as_answer is only honored on successful tool executions:

  • tool_usage.py: Added _last_execution_errored flag, set in all error branches (ToolUsageError, tool selection failure, runtime exception in _use/_ause)
  • tool_utils.py: Both execute_tool_and_check_finality and aexecute_tool_and_check_finality check the flag before returning result_as_answer=True
  • crew_agent_executor.py: Propagates error_occurred through execution result dict; _append_tool_result_and_check_finality gates on it
  • agent_utils.py: Uses existing error_event_emitted to gate result_as_answer
  • experimental/agent_executor.py: Same pattern applied to sequential loop, parallel results loop, and parallel error fallback

Review & Testing Checklist for Human

  • Verify step_executor.py coverage: This file was not modified. Confirm that its native tool path delegates to one of the fixed executors and doesn't have its own independent result_as_answer check that bypasses the fix.
  • Verify _last_execution_errored reliability: The flag is a mutable instance attribute on ToolUsage, reset at the top of use()/ause() and read immediately after by tool_utils.py. Confirm no intermediate call can reset it before it's read.
  • End-to-end test: Create an agent with a result_as_answer=True tool that intentionally fails, and confirm the agent continues reasoning rather than returning the error as its final answer.
  • Verify parallel execution path in experimental executor: The parallel error fallback sets "original_tool": None alongside "error_occurred": True — the result_as_answer guard is technically unreachable here since original_tool is falsy. Confirm this is acceptable.

Notes

  • Six new unit tests were added covering the ToolUsage flag, execute_tool_and_check_finality (both error and success), and native tool execution in AgentExecutor (both error and success).
  • The fix uses two different error-tracking mechanisms depending on the code path: a _last_execution_errored flag on ToolUsage (for text/ReAct pattern), and an error_occurred dict key / error_event_emitted local variable (for native tool calling). This follows existing conventions in each module rather than introducing a new abstraction.

Link to Devin session: https://app.devin.ai/sessions/a7393abd35bf4141bf23fe9e1b86b364


Note

Medium Risk
Changes tool-execution finality logic across multiple executors and hook wrappers; behavior around result_as_answer now depends on new error-tracking flags, which could alter when agents short-circuit after tool calls.

Overview
Prevents tools marked result_as_answer=True from prematurely short-circuiting the agent when the tool execution fails, allowing the model to see the error and continue reasoning/retrying.

This propagates explicit error state through native tool execution results (including parallel paths) in CrewAgentExecutor and the experimental AgentExecutor, and adds _last_execution_errored tracking in ToolUsage so tool_utils.execute_*_tool_and_check_finality only returns result_as_answer on successful runs. Adds regression tests covering both success/error cases for native tool execution and ToolUsage/execute_tool_and_check_finality behavior.

Written by Cursor Bugbot for commit f5dc745. This will update automatically on new commits. Configure here.

When a tool with result_as_answer=True raises an exception, the agent
now continues reasoning about the error instead of treating the error
message as the final answer.

Fixes #5156

Co-Authored-By: João <joao@crewai.com>
@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

Prompt hidden (unlisted session)

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Address Cursor Bugbot review: the add_image exception handlers in
use() and ause() were missing the error flag, allowing result_as_answer
to be incorrectly honored when those paths errored.

Co-Authored-By: João <joao@crewai.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

0 participants