
fix: surface Anthropic stop_reason to detect truncation (#5148) #5149

Open
devin-ai-integration[bot] wants to merge 2 commits into main from
devin/1774643154-anthropic-stop-reason-truncation-warning

Conversation

Contributor

@devin-ai-integration devin-ai-integration bot commented Mar 27, 2026

Summary

Fixes #5148. Anthropic's Message response includes a stop_reason field that indicates why the API stopped generating (e.g. "max_tokens" means the output was truncated). Previously, AnthropicCompletion silently discarded this field, making it impossible for users to detect truncation via hooks or events.

Changes:

  • Add stop_reason: str | None = None field to LLMCallCompletedEvent
  • Add stop_reason parameter to BaseLLM._emit_call_completed_event
  • Add _extract_stop_reason() static method on AnthropicCompletion that safely extracts stop_reason as str | None (guards against non-string values, e.g. from MagicMock in tests)
  • Add _warn_if_truncated() helper on AnthropicCompletion that logs a warning when stop_reason == "max_tokens"
  • Plumb stop_reason through all 6 Anthropic completion methods (sync + async × regular/streaming/tool-use)
  • Add 7 unit tests covering warning behavior, event field propagation, and edge cases

Non-Anthropic providers are unaffected — they continue to emit stop_reason=None by default.

Review & Testing Checklist for Human

  • Verify all 6 methods are consistently updated: The same pattern (extract → warn → pass to event) is applied across _handle_completion, _handle_streaming_completion, _handle_tool_use_conversation, and their async counterparts. Confirm no code path was missed, especially around early returns (structured output, tool use blocks).
  • Streaming path correctness: In _handle_streaming_completion / _ahandle_streaming_completion, stop_reason is read from stream.get_final_message(). Verify that the reconstructed final_message actually carries the stop_reason from the stream (it should, per Anthropic SDK behavior).
  • Consider whether from_agent.role access is safe: _warn_if_truncated accesses from_agent.role when from_agent is truthy. This should always work since from_agent is typed as Agent | None, but confirm no caller passes a non-Agent truthy value.
  • Decide if other providers should also surface finish reasons: This PR is Anthropic-only. OpenAI has an analogous finish_reason field — consider whether a follow-up is needed.

Suggested manual test: Set max_tokens to a very small value (e.g. 50) on an Anthropic model, run a crew, and verify:

  1. A warning is logged containing stop_reason='max_tokens'
  2. The LLMCallCompletedEvent carries stop_reason="max_tokens" (observable via a custom event handler)

Notes

  • The stop_reason field on LLMCallCompletedEvent and the BaseLLM parameter are additive and backwards-compatible (default None).
  • _extract_stop_reason uses an isinstance(raw, str) guard so that non-string attribute values (e.g. auto-created MagicMock attributes in existing tests) safely become None rather than causing Pydantic validation errors.
  • No async-specific tests for the warning (only sync), though the code is symmetrical. The event-bus tests do exercise the full emit→capture path.

Link to Devin session: https://app.devin.ai/sessions/7214a66c41b94b07803ad5faacf12270

- Add stop_reason field to LLMCallCompletedEvent
- Update BaseLLM._emit_call_completed_event to accept and pass stop_reason
- Add _warn_if_truncated helper to AnthropicCompletion that logs a warning
  when stop_reason='max_tokens'
- Apply fix to all 6 Anthropic completion methods (sync and async):
  _handle_completion, _handle_streaming_completion,
  _handle_tool_use_conversation, _ahandle_completion,
  _ahandle_streaming_completion, _ahandle_tool_use_conversation
- Add 7 tests covering truncation warning, event field, and tool use paths

Co-Authored-By: João <joao@crewai.com>
@devin-ai-integration
Contributor Author

Prompt hidden (unlisted session)

@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.


… values

MagicMock objects (and other non-Anthropic responses) can return non-string
values for getattr(response, 'stop_reason', None). Add a typed extraction
helper that returns None unless the value is actually a string, preventing
Pydantic validation errors in LLMCallCompletedEvent.

Co-Authored-By: João <joao@crewai.com>


Development

Successfully merging this pull request may close these issues.

AnthropicCompletion._handle_completion silently discards stop_reason; no way to detect truncation via hooks or events
