[BUG] guardrail_redact_input overrides ltm_msg instead of the last user message #1639

@Deiugarte

Description

Checks

  • I have updated to the latest minor and patch version of Strands
  • I have checked the documentation and this is not expected behavior
  • I have searched ./issues and there are no duplicates of my issue

Strands Version

1.24.0

Python Version

3.13

Operating System

macOS 26

Installation Method

other

Steps to Reproduce

With input redaction enabled on the guardrail via

config["guardrail_redact_input"] = True
config["guardrail_redact_input_message"] = DEFAULT_BLOCKED_MESSAGE

and long-term memory configured as

session_manager = AgentCoreMemorySessionManager(
    agentcore_memory_config=AgentCoreMemoryConfig(
        memory_id=MEMORY_ID,
        actor_id=actor_id,
        session_id=session_id,
        retrieval_config={
            "/preferences/{actorId}": RetrievalConfig(top_k=5, relevance_score=0.7),
            "/facts/{actorId}": RetrievalConfig(top_k=10, relevance_score=0.3),
            "/summaries/{actorId}/{sessionId}": RetrievalConfig(top_k=5, relevance_score=0.5),
        },
    ),
    region_name=AWS_REGION,
)

When the user sends a message that triggers the guardrail, the message that gets redacted is the long-term memory context, not the user input.

Expected Behavior

When the user sends a message that triggers the guardrail, the message that is redacted is the message with the user input.

Actual Behavior

When the user sends a message that triggers the guardrail, the message that gets redacted is the long-term memory context, not the user input.

Additional Context

In the session manager, the long-term memory context is appended as the last message:

ltm_msg: Message = {
    "role": "assistant",
    "content": [{"text": f"<user_context>{context_text}</user_context>"}],
}
event.agent.messages.append(ltm_msg)

But later, in the agent code, the message that gets redacted is always the last one in the list:

self.messages[-1]["content"] = self._redact_user_content(
    self.messages[-1]["content"], str(event.chunk["redactContent"]["redactUserContentMessage"])
)
if self._session_manager:
    self._session_manager.redact_latest_message(self.messages[-1], self)

Also because of this, the following condition is never triggered (the last message has the assistant role), so the last user message is not wrapped:

# Wrap text or image content in guardrailContent if this is the last user message
if (
    guardrail_latest_message
    and idx == len(messages) - 1
    and message["role"] == "user"
    and ("text" in formatted_content or "image" in formatted_content)
):
    if "text" in formatted_content:
        formatted_content = {"guardContent": {"text": {"text": formatted_content["text"]}}}
    elif "image" in formatted_content:
        formatted_content = {"guardContent": {"image": formatted_content["image"]}}
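To illustrate why that condition never fires once the memory context has been appended, here is a minimal sketch using plain dicts instead of the Strands types. The helper names (`wrap_indices_current`, `wrap_indices_last_user`) are hypothetical and only mirror the role and index checks from the snippet above; they are not part of the library.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]  # simplified stand-in for the Strands Message type


def wrap_indices_current(messages: List[Message]) -> List[int]:
    """Mirror the existing check: wrap only if the LAST message is a user message."""
    return [
        idx
        for idx, message in enumerate(messages)
        if idx == len(messages) - 1 and message["role"] == "user"
    ]


def wrap_indices_last_user(messages: List[Message]) -> List[int]:
    """Hypothetical alternative: wrap the most recent user message, wherever it sits."""
    user_indices = [idx for idx, m in enumerate(messages) if m["role"] == "user"]
    return user_indices[-1:]  # empty list when there is no user message


# History as it looks after the session manager appends the memory context.
history: List[Message] = [
    {"role": "user", "content": [{"text": "input that should be guarded"}]},
    {"role": "assistant", "content": [{"text": "<user_context>likes tea</user_context>"}]},
]

current = wrap_indices_current(history)
proposed = wrap_indices_last_user(history)
```

With this history, `current` is empty because the last message has the assistant role, while `proposed` points at the user message at index 0.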

Possible Solution

Look for the last user message and redact that instead of the last message.
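A rough sketch of that idea, again with plain dicts rather than the real agent code. `find_last_user_index` and `redact_last_user_message` are hypothetical names, and replacing the whole content with a single text block is a simplification of whatever `_redact_user_content` actually does.

```python
from typing import Any, Dict, List, Optional

Message = Dict[str, Any]  # simplified stand-in for the Strands Message type


def find_last_user_index(messages: List[Message]) -> Optional[int]:
    """Return the index of the most recent message with role 'user', or None."""
    for idx in range(len(messages) - 1, -1, -1):
        if messages[idx]["role"] == "user":
            return idx
    return None


def redact_last_user_message(messages: List[Message], redact_text: str) -> Optional[int]:
    """Overwrite the last user message in place; return its index (None if absent)."""
    idx = find_last_user_index(messages)
    if idx is None:
        return None
    # Simplification: the whole content is replaced by one text block.
    messages[idx]["content"] = [{"text": redact_text}]
    return idx


messages: List[Message] = [
    {"role": "user", "content": [{"text": "input that triggers the guardrail"}]},
    {"role": "assistant", "content": [{"text": "<user_context>likes tea</user_context>"}]},
]
redacted_index = redact_last_user_message(messages, "[User input redacted.]")
```

Here the user message at index 0 is redacted and the appended memory context is left untouched. The session manager call would presumably need to target the same message; I have not checked whether `redact_latest_message` supports a message that is not the last one.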

Related Issues

#1324

Metadata


Labels

bug (Something isn't working)
