🔴 Required Information
Describe the Bug:
LoadSkillResourceTool.run_async returns RESOURCE_NOT_FOUND as a structured soft-error string when a path passed by the LLM does not exist inside the skill's bundled resources. Because the response is a normal tool result (not an exception or terminal signal), the LLM treats it as a transient/recoverable failure and retries — but critically, it hallucinates a different plausible path on every retry, not the same path. Nothing in SkillToolset tracks total failures across paths, so the loop continues until RunConfig.max_llm_calls is exhausted.
max_llm_calls defaults to 500 (src/google/adk/agents/run_config.py:314). This means a single invocation can silently consume the entire per-invocation call budget on repeated failing tool calls before the framework intervenes — and max_llm_calls is a global cap on legitimate reasoning, not a defense against a repeated-failure loop on one specific tool.
Steps to Reproduce:
- Install google-adk (any version that ships SkillToolset; verified on 1.32.0).
- Create an agent with a SkillToolset containing a skill whose SKILL.md references files by natural-language names (e.g. "Document 1", "the reference guide") without exact filenames.
- Issue a query that prompts the model to read one of those resources.
- Observe in the trace that the model calls load_skill_resource with a hallucinated path, receives RESOURCE_NOT_FOUND, then calls it again with a different hallucinated path, receives RESOURCE_NOT_FOUND again, and loops.
Expected Behavior:
After the first RESOURCE_NOT_FOUND within an invocation, any subsequent load_skill_resource failure should return a terminal error code that unambiguously instructs the LLM to stop retrying and report the error. The agent's overall reasoning budget (max_llm_calls) should not be the only thing standing between an imperfect prompt and a runaway invocation.
Observed Behavior:
The same RESOURCE_NOT_FOUND soft error is returned on every attempt regardless of path or how many times it has already failed. The loop terminates only when max_llm_calls is exceeded.
Live trace evidence (captured via GET /debug/trace/session/{session_id} against adk web):
SPAN: execute_tool load_skill_resource
args: {'file_path': 'references/reference_doc.md', 'skill_name': 'document-classifier'}
error_code: RESOURCE_NOT_FOUND
error: Resource 'references/reference_doc.md' not found in skill 'document-classifier'.
SPAN: execute_tool load_skill_resource
args: {'skill_name': 'document-classifier', 'file_path': 'references/Document1.md'}
error_code: RESOURCE_NOT_FOUND
error: Resource 'references/Document1.md' not found in skill 'document-classifier'.
The model tried references/reference_doc.md first, then hallucinated a completely different path (references/Document1.md) on the retry. Both returned the same soft error — the LLM had no signal to stop. This pattern continues indefinitely.
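The loop described above can be reproduced without a model. The following is a minimal sketch, not ADK code: `load_skill_resource_stub`, `guess_paths`, and `run_invocation` are hypothetical stand-ins for the unpatched tool, the model's path guessing, and the invocation loop, and the budget of 500 is the default cited in this report.

```python
from itertools import count

MAX_LLM_CALLS = 500  # default budget cited in this report


def load_skill_resource_stub(skill_name: str, file_path: str) -> dict:
    """Hypothetical stand-in for the unpatched tool: every miss looks identical."""
    return {
        "error_code": "RESOURCE_NOT_FOUND",
        "error": f"Resource '{file_path}' not found in skill '{skill_name}'.",
    }


def guess_paths():
    """Hypothetical stand-in for the model: a fresh plausible path on every retry."""
    for n in count(1):
        yield f"references/Document{n}.md"


def run_invocation(max_llm_calls: int = MAX_LLM_CALLS):
    """Retry until something other than the soft error comes back."""
    calls = 0
    codes_seen = set()
    for path in guess_paths():
        if calls >= max_llm_calls:
            break  # only the global budget stops the loop
        calls += 1
        result = load_skill_resource_stub("document-classifier", path)
        codes_seen.add(result["error_code"])
        if result["error_code"] != "RESOURCE_NOT_FOUND":
            break  # never reached: the stub has no terminal signal
    return calls, codes_seen


calls, codes_seen = run_invocation()
print(calls, codes_seen)  # 500 {'RESOURCE_NOT_FOUND'}
```

Every call returns the same error code, so the only exit condition that ever fires is the global call budget.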
Environment Details:
- ADK Library Version: 1.32.0 (issue exists on main as of commit 2d61cb69)
- Desktop OS: Linux (defect is in framework logic, not OS-specific)
- Python Version: 3.12.3
Model Information:
- Are you using LiteLLM: N/A (defect is provider-agnostic)
- Which model: gemini-3-flash-preview (observed; reproducible across any function-calling model, since the retry behavior is a consequence of the soft error signal and not model-specific)
🟡 Optional Information
Regression:
Not a regression. The defect has existed since SkillToolset was introduced — LoadSkillResourceTool.run_async has never had any retry-guard logic.
Additional Context:
Four factors combine to make this loop reachable through ordinary use:
- No resource manifest at L2: the load_skill response intentionally omits available file paths (progressive-disclosure spec). The LLM must infer paths from prose, and inferred paths are routinely wrong.
- Soft error string: RESOURCE_NOT_FOUND looks transient and recoverable to the model; retry is its default response.
- No terminal signal: nothing escalates after the first miss.
- No scope boundary in default prompt: the system instruction doesn't distinguish skill-bundled files from runtime user inputs (e.g. a PDF the user is processing), so the model sometimes routes runtime documents through load_skill_resource and loops on them.
Considered and rejected alternatives:
| Alternative | Why not |
| --- | --- |
| Per-path retry guard | LLM hallucinates a different path on each retry (confirmed in live trace); a per-path list never triggers |
| Tighten or default-lower max_llm_calls | Caps overall reasoning budget; punishes legitimate long-running agents |
| User-side after_tool_callback workaround | Symptomatic; pushes the fix onto every SkillToolset user |
| Add available_resources manifest to L2 load_skill | Defeats the lazy-loading / token-saving design |
| New list_skill_resources tool | Violates the L1→L2→L3 progressive disclosure contract |
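To illustrate why the per-path guard was rejected, here is a small sketch that replays the two paths from the live trace through both guard styles. `per_path_guard` and `total_guard` are hypothetical helpers written for this comparison, not ADK APIs, and every path is assumed to be missing.

```python
from typing import List, Optional


def per_path_guard(paths: List[str]) -> Optional[int]:
    """Index of the first call a per-path guard would block, or None if it never fires."""
    seen = set()
    for i, path in enumerate(paths):
        if path in seen:
            return i
        seen.add(path)
    return None


def total_guard(paths: List[str], allowed_misses: int = 1) -> Optional[int]:
    """Index of the first call a total-failure guard would escalate, or None."""
    misses = 0
    for i, _path in enumerate(paths):
        misses += 1  # assumption: every lookup in this sketch is a miss
        if misses > allowed_misses:
            return i
    return None


# The two paths observed in the live trace.
trace_paths = ["references/reference_doc.md", "references/Document1.md"]

print(per_path_guard(trace_paths))  # None
print(total_guard(trace_paths))     # 1
```

Because the two paths differ, the per-path guard never fires, while a guard keyed on the total number of misses escalates on the second call (index 1).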
Minimal Reproduction Code:
import asyncio
from unittest import mock

from google.adk.skills import models
from google.adk.tools import skill_toolset, tool_context

skill = mock.create_autospec(models.Skill, instance=True)
skill.name = "demo"
skill.resources = mock.MagicMock()
skill.resources.get_reference.return_value = None  # every path "missing"

ctx = mock.MagicMock(spec=tool_context.ToolContext)
ctx.state = {}
ctx.invocation_id = "inv1"
ctx._invocation_context = mock.MagicMock()
ctx.agent_name = "agent"

toolset_obj = skill_toolset.SkillToolset([skill])
tool = skill_toolset.LoadSkillResourceTool(toolset_obj)


async def main():
    paths = [
        "references/missing.md",
        "references/other_guess.md",  # different path: LLM hallucination pattern
        "references/yet_another.md",
    ]
    for i, path in enumerate(paths):
        r = await tool.run_async(
            args={"skill_name": "demo", "file_path": path},
            tool_context=ctx,
        )
        print(i, r["error_code"])
    # On main (unpatched): all 3 print RESOURCE_NOT_FOUND, so the LLM has no reason to stop
    # With fix applied: call 0 → RESOURCE_NOT_FOUND, calls 1-2 → RESOURCE_NOT_FOUND_FATAL


asyncio.run(main())
How often has this issue occurred?: Always (100%) — deterministic given any skill whose SKILL.md lets the model infer plausible-looking paths that don't literally exist.
Proposed Fix
A two-layer fix is in linked PR #5651:
Code: an invocation-scoped total failure counter inside LoadSkillResourceTool.run_async. The counter tracks the number of RESOURCE_NOT_FOUND responses across all paths within an invocation (not per-path — live testing confirmed the LLM uses a different path on each retry). State key:
temp:_adk_skill_resource_not_found_count_<invocation_id>
- First failure → RESOURCE_NOT_FOUND (unchanged behavior).
- Any subsequent failure → RESOURCE_NOT_FOUND_FATAL with an explicit stop instruction and failure count.
The temp: prefix uses ADK's existing convention to prevent persistence to durable storage. The <invocation_id> suffix isolates in-memory backends where temp: keys are not auto-cleared between invocations.
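The counter logic can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: `record_resource_miss` is a hypothetical helper, a plain dict stands in for the tool context's state, and the fatal message wording is modeled on the patched trace in this report.

```python
STATE_KEY_PREFIX = "temp:_adk_skill_resource_not_found_count_"


def record_resource_miss(
    state: dict, invocation_id: str, skill_name: str, file_path: str
) -> dict:
    """Count a miss for this invocation and build the tool's error response."""
    key = f"{STATE_KEY_PREFIX}{invocation_id}"  # one counter per invocation
    failures = state.get(key, 0) + 1
    state[key] = failures

    message = f"Resource '{file_path}' not found in skill '{skill_name}'."
    if failures == 1:
        # First miss: unchanged soft error.
        return {"error_code": "RESOURCE_NOT_FOUND", "error": message}
    # Any later miss, on any path: terminal error with a stop instruction.
    return {
        "error_code": "RESOURCE_NOT_FOUND_FATAL",
        "error": (
            f"{message} This is resource lookup failure #{failures} this "
            "invocation. Do not retry any path; report the error to the "
            "user and stop."
        ),
    }


state = {}
codes = [
    record_resource_miss(state, "inv1", "demo", "references/a.md")["error_code"],
    record_resource_miss(state, "inv1", "demo", "references/b.md")["error_code"],
    record_resource_miss(state, "inv2", "demo", "references/a.md")["error_code"],
]
print(codes)
# ['RESOURCE_NOT_FOUND', 'RESOURCE_NOT_FOUND_FATAL', 'RESOURCE_NOT_FOUND']
```

The third call uses a different invocation ID and therefore starts from a fresh counter, which is the isolation the `<invocation_id>` suffix is meant to provide when `temp:` keys survive between invocations.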
Prompt: a no-retry rule and a scope boundary added to _DEFAULT_SKILL_SYSTEM_INSTRUCTION.
Live trace with fix applied (same session, patched build):
SPAN: execute_tool load_skill_resource
args: {'file_path': 'references/reference_doc.md', 'skill_name': 'document-classifier'}
error_code: RESOURCE_NOT_FOUND
error: Resource 'references/reference_doc.md' not found in skill 'document-classifier'.
SPAN: execute_tool load_skill_resource
args: {'skill_name': 'document-classifier', 'file_path': 'references/Document1.md'}
error_code: RESOURCE_NOT_FOUND_FATAL
error: Resource 'references/Document1.md' not found in skill 'document-classifier'.
This is resource lookup failure #2 this invocation. Do not retry any path
— report the error to the user and stop.
Loop terminated on the second call. The model attempted a different path (Document1.md vs reference_doc.md) — exactly the hallucination pattern that a per-path guard would have missed.
Linked PR: #5651