Add Exa Search API support as internet search tool #1846
maxwbuckley wants to merge 6 commits into NVIDIA:develop from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Walkthrough
Adds an Exa-backed LangChain internet search tool: new config, async tool implementation with retries and result formatting, automatic registration, dependency addition, docs for Exa usage, and unit tests for the config.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Agent
    participant ExaTool as ExaInternetSearchTool
    participant Builder
    participant ExaClient
    participant ExaAPI
    Agent->>ExaTool: request internet_search(query)
    ExaTool->>Builder: resolve tool config & secrets
    ExaTool->>ExaClient: instantiate AsyncExa (use API key)
    ExaTool->>ExaClient: search_and_contents(query, params)
    ExaClient->>ExaAPI: HTTP request
    ExaAPI-->>ExaClient: search results
    ExaClient-->>ExaTool: results
    ExaTool-->>Agent: formatted <Document/> blocks or error message
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches: 🧪 Generate unit tests (beta)
Actionable comments posted: 3
🧹 Nitpick comments (2)
packages/nvidia_nat_langchain/pyproject.toml (1)
65-66: Keep dependency entries sorted to match the local file contract. The new `langchain-exa` entry breaks the declared "Keep sorted!!!" ordering in this dependency block. Please move it after `langchain-core` to preserve deterministic diffs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/pyproject.toml` around lines 65 - 66, The dependency list is out of sorted order: move the "langchain-exa>=1.1.0,<2.0.0" entry so it appears after "langchain-core>=1.2.6,<2.0.0" to restore the declared "Keep sorted!!!" ordering; ensure the two entries remain otherwise unchanged and the block stays alphabetically sorted.
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py (1)
46-47: Add an explicit return type to the public registration function. The async registration function should declare its yielded type for API clarity and static checks.
As per coding guidelines: "All public APIs require Python 3.11+ type hints on parameters and return values".
Proposed fix

```diff
+from collections.abc import AsyncGenerator
 ...
-async def exa_internet_search(tool_config: ExaInternetSearchToolConfig, builder: Builder):
+async def exa_internet_search(
+    tool_config: ExaInternetSearchToolConfig,
+    builder: Builder,
+) -> AsyncGenerator[FunctionInfo, None]:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 46 - 47, The public async registration function exa_internet_search is missing an explicit return type; update its signature to include a typed async generator return annotation (e.g., -> AsyncGenerator[Tool, None]) and add the necessary import from typing (AsyncGenerator) and the Tool type used by the registration system so the signature reads like: async def exa_internet_search(tool_config: ExaInternetSearchToolConfig, builder: Builder) -> AsyncGenerator[Tool, None]: ensuring the yielded type matches the actual yielded objects in the function body.
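For illustration, the annotated async-generator shape suggested above can be sketched as a minimal runnable example (the yielded string stands in for the plugin's FunctionInfo object; all names here are hypothetical, not the plugin's actual API):

```python
import asyncio
from collections.abc import AsyncGenerator


async def register_exa_tool(tool_name: str) -> AsyncGenerator[str, None]:
    # Setup (client construction, secret resolution) would happen here.
    yield f"registered:{tool_name}"
    # Teardown runs after the consumer finishes iterating.


async def main() -> list[str]:
    # Consume the generator the way a registration framework might.
    return [item async for item in register_exa_tool("internet_search")]


print(asyncio.run(main()))
```

The key point is that an `async def` containing `yield` is annotated with the type it yields, so static checkers can verify what consumers receive.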
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Line 37: The model field max_retries can be <= 0 which causes the retry loop
in _exa_internet_search to be skipped and the function to implicitly return
None; add a guard and ensure a non-None return: validate/normalize max_retries
on model initialization (e.g., enforce min 1 or coerce negatives to 0) and
modify _exa_internet_search so that when the retry loop is skipped or all
attempts fail it explicitly returns an empty list (or other documented default)
instead of None; update references to max_retries and the retry loop inside
_exa_internet_search to use the validated value and always return a concrete
value.
- Around line 53-58: The code mutates process-wide environment EXA_API_KEY
during tool setup; remove the conditional that sets os.environ["EXA_API_KEY"]
and instead rely solely on the explicit api_key argument (falling back to
os.environ.get("EXA_API_KEY") only when constructing ExaSearchResults). Update
the ExaSearchResults instantiation (ExaSearchResults(exa_api_key=...)) to use
api_key or os.environ.get(...) but do not write to os.environ anywhere in this
module (remove the block that assigns os.environ["EXA_API_KEY"]).
- Around line 38-43: Replace the loose string types for the config fields with
enum-like types so invalid values fail at parse time: change the annotations for
search_type and livecrawl to constrained types (e.g., from typing import Literal
and use search_type: Literal["neural","keyword","auto"] and livecrawl:
Literal["always","fallback","never"] or define enums via class SearchType(Enum)
and class Livecrawl(Enum) and use those types), keep the Field(...) calls for
defaults/description but update the defaults to one of the allowed values and
add the necessary imports (Literal or Enum) so pydantic validates inputs when
parsing the model.
---
Nitpick comments:
In `@packages/nvidia_nat_langchain/pyproject.toml`:
- Around line 65-66: The dependency list is out of sorted order: move the
"langchain-exa>=1.1.0,<2.0.0" entry so it appears after
"langchain-core>=1.2.6,<2.0.0" to restore the declared "Keep sorted!!!"
ordering; ensure the two entries remain otherwise unchanged and the block stays
alphabetically sorted.
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Around line 46-47: The public async registration function exa_internet_search
is missing an explicit return type; update its signature to include a typed
async generator return annotation (e.g., -> AsyncGenerator[Tool, None]) and add
the necessary import from typing (AsyncGenerator) and the Tool type used by the
registration system so the signature reads like: async def
exa_internet_search(tool_config: ExaInternetSearchToolConfig, builder: Builder)
-> AsyncGenerator[Tool, None]: ensuring the yielded type matches the actual
yielded objects in the function body.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 71f86120-bb29-4abf-929b-468819a50794
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- docs/source/get-started/tutorials/add-tools-to-a-workflow.md
- packages/nvidia_nat_langchain/pyproject.toml
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/register.py
- packages/nvidia_nat_langchain/tests/test_exa_internet_search.py
Actionable comments posted: 1
♻️ Duplicate comments (3)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py (3)
37-37: ⚠️ Potential issue | 🟠 Major
Guard retry bounds to avoid implicit `None` returns.
If `max_retries <= 0`, the loop at Line 73 is skipped and `_exa_internet_search` can return `None` implicitly.
Suggested fix

```diff
-    max_retries: int = Field(default=3, description="Maximum number of retries for the search request")
+    max_retries: int = Field(default=3, ge=1, description="Maximum number of retries for the search request")
 ...
     for attempt in range(tool_config.max_retries):
         try:
             ...
         except Exception:
             ...
             await asyncio.sleep(2**attempt)
+    return f"Web search failed after {tool_config.max_retries} attempts for: {question}"
```

Also applies to: 73-96
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` at line 37, The max_retries Field can be zero or negative causing the retry loop in _exa_internet_search to be skipped and the function to implicitly return None; add a guard in either the Field validation or at start of _exa_internet_search to coerce/validate max_retries to a positive integer (e.g., if max_retries is None or <=0 set to 1 or raise ValueError), and ensure _exa_internet_search always returns an explicit value (like an empty list or a standardized error result) rather than None so callers don’t get implicit None returns.
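The two-part fix described above can be sketched as follows (assuming pydantic v2; the class and function names are illustrative, not the plugin's actual ones):

```python
from pydantic import BaseModel, Field, ValidationError


class ExaToolConfig(BaseModel):
    # ge=1 makes pydantic reject zero/negative values at model-parse time,
    # so the retry loop below always runs at least once.
    max_retries: int = Field(default=3, ge=1,
                             description="Maximum number of retries for the search request")


def search_with_fallback(config: ExaToolConfig, question: str) -> str:
    for attempt in range(config.max_retries):
        # Real code would call the search API here and return on success.
        pass
    # Explicit fallback: callers never receive an implicit None.
    return f"Web search failed after {config.max_retries} attempts for: {question}"


try:
    ExaToolConfig(max_retries=0)
except ValidationError:
    print("max_retries=0 rejected at parse time")
```

Both halves matter: the field constraint blocks bad configs early, and the explicit return guards the degenerate path anyway.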
38-43: ⚠️ Potential issue | 🟠 Major
Constrain `search_type` and `livecrawl` at config-parse time.
Right now, invalid strings pass validation and fail only at runtime. Use enum-like typing (`Literal`) so bad values are rejected early.
Suggested fix

```diff
+from typing import Literal
 ...
-    search_type: str = Field(
+    search_type: Literal["auto", "neural", "keyword"] = Field(
         default="auto", description="Type of search to perform - 'neural', 'keyword', or 'auto'")
-    livecrawl: str = Field(
+    livecrawl: Literal["always", "fallback", "never"] = Field(
         default="fallback", description="Livecrawl behavior - 'always', 'fallback', or 'never'")
```

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 38 - 43, Replace the loose str types for search_type and livecrawl so invalid values are rejected during config parsing: change the type annotations on the model fields search_type and livecrawl from str to Literal types (e.g., Literal["neural","keyword","auto"] for search_type and Literal["always","fallback","never"] for livecrawl) or use an Enum, and keep the existing Field(...) defaults and descriptions; this ensures pydantic/schema validation fails at parse time instead of letting bad strings slip through to runtime in functions that rely on these fields.
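A runnable sketch of the `Literal`-typed fields (pydantic v2 assumed; the class name is illustrative):

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class ExaSearchConfig(BaseModel):
    # Literal narrows the accepted strings, so pydantic validates them
    # when the model is parsed rather than deep inside the tool at runtime.
    search_type: Literal["auto", "neural", "keyword"] = Field(
        default="auto", description="Type of search to perform")
    livecrawl: Literal["always", "fallback", "never"] = Field(
        default="fallback", description="Livecrawl behavior")


print(ExaSearchConfig().search_type)       # defaults parse fine
try:
    ExaSearchConfig(livecrawl="sometimes")  # rejected at parse time
except ValidationError:
    print("invalid livecrawl rejected")
```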
53-58: ⚠️ Potential issue | 🔴 Critical
Do not mutate process-wide `EXA_API_KEY` during tool setup.
Writing to `os.environ` here creates shared global state and can leak or cross-wire credentials under concurrency. Resolve the key locally and pass it directly to `Exa(...)`.
Suggested fix

```diff
-    if not os.environ.get("EXA_API_KEY"):
-        if api_key:
-            os.environ["EXA_API_KEY"] = api_key
-    # This Exa tool requires an API Key and it must be set as an environment variable (EXA_API_KEY)
-
-    exa_client = Exa(api_key=api_key or os.environ.get("EXA_API_KEY", ""))
+    resolved_api_key = api_key or os.environ.get("EXA_API_KEY", "")
+    exa_client = Exa(api_key=resolved_api_key)
```
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 53 - 58, Do not write to process-wide os.environ; instead resolve the key locally and pass it into the Exa constructor: compute a local variable (e.g., resolved_api_key = api_key or os.environ.get("EXA_API_KEY", "")) and instantiate exa_client = Exa(api_key=resolved_api_key) without assigning to os.environ; remove the branch that mutates EXA_API_KEY and optionally validate resolved_api_key and raise/handle missing key near where exa_client is created.
🧹 Nitpick comments (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py (1)
46-47: Add public API docstring and explicit return type annotation.
`exa_internet_search(...)` is a public registered function and should include a Google-style docstring plus an explicit return type (`AsyncGenerator[FunctionInfo, None]`).
As per coding guidelines, "Provide Google-style docstrings for every public module, class, function and CLI command" and "All public APIs require Python 3.11+ type hints on parameters and return values".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 46 - 47, Add a Google-style docstring to the public registered function exa_internet_search describing its purpose, parameters (tool_config: ExaInternetSearchToolConfig, builder: Builder), and yield behavior, and add an explicit return type annotation AsyncGenerator[FunctionInfo, None] to the function signature; ensure imports/types needed for AsyncGenerator and FunctionInfo are available and reference the registration via register_function so tooling recognizes the API.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Around line 75-81: The call to exa_client.search_and_contents is synchronous
inside async code; replace the blocking Exa usage by importing AsyncExa (change
`from exa_py import Exa` to `from exa_py import AsyncExa`), instantiate the
async client (replace where `exa_client = Exa(...)` is created) and call its
async method with await (use `await exa_client.search_and_contents(...)`),
ensuring any surrounding function is async and errors are awaited/handled; keep
the same arguments (question, num_results, type, livecrawl, text) and update any
teardown/close calls to the async client equivalents.
---
Duplicate comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Line 37: The max_retries Field can be zero or negative causing the retry loop
in _exa_internet_search to be skipped and the function to implicitly return
None; add a guard in either the Field validation or at start of
_exa_internet_search to coerce/validate max_retries to a positive integer (e.g.,
if max_retries is None or <=0 set to 1 or raise ValueError), and ensure
_exa_internet_search always returns an explicit value (like an empty list or a
standardized error result) rather than None so callers don’t get implicit None
returns.
- Around line 38-43: Replace the loose str types for search_type and livecrawl
so invalid values are rejected during config parsing: change the type
annotations on the model fields search_type and livecrawl from str to Literal
types (e.g., Literal["neural","keyword","auto"] for search_type and
Literal["always","fallback","never"] for livecrawl) or use an Enum, and keep the
existing Field(...) defaults and descriptions; this ensures pydantic/schema
validation fails at parse time instead of letting bad strings slip through to
runtime in functions that rely on these fields.
- Around line 53-58: Do not write to process-wide os.environ; instead resolve
the key locally and pass it into the Exa constructor: compute a local variable
(e.g., resolved_api_key = api_key or os.environ.get("EXA_API_KEY", "")) and
instantiate exa_client = Exa(api_key=resolved_api_key) without assigning to
os.environ; remove the branch that mutates EXA_API_KEY and optionally validate
resolved_api_key and raise/handle missing key near where exa_client is created.
---
Nitpick comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Around line 46-47: Add a Google-style docstring to the public registered
function exa_internet_search describing its purpose, parameters (tool_config:
ExaInternetSearchToolConfig, builder: Builder), and yield behavior, and add an
explicit return type annotation AsyncGenerator[FunctionInfo, None] to the
function signature; ensure imports/types needed for AsyncGenerator and
FunctionInfo are available and reference the registration via register_function
so tooling recognizes the API.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e896f727-5c3f-4946-8c76-57f8bc3fe183
📒 Files selected for processing (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py
Outdated
Show resolved
Hide resolved
Add `exa_internet_search` tool using the langchain-exa integration, mirroring the existing tavily_internet_search tool. Includes config class, tool registration, unit tests, dependency, and documentation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Max Buckley <maxwbuckley@gmail.com>
The langchain_exa ExaSearchResults wrapper doesn't pass num_results and other params through its .run() method. Use the exa_py.Exa client directly for correct behavior (max_results, search_type, livecrawl).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Max Buckley <maxwbuckley@gmail.com>
- Use AsyncExa instead of sync Exa to avoid blocking the event loop
- Remove os.environ mutation; resolve API key locally
- Use Literal types for search_type and livecrawl config validation
- Add ge=1 constraint on max_retries to prevent implicit None returns
- Add explicit return after retry loop as safety fallback
- Fix dependency sort order: langchain-core before langchain-exa
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Max Buckley <maxwbuckley@gmail.com>
Force-pushed from b8da77f to a69e612
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Line 35: The max_results field currently allows zero/negative values; update
the config model's declaration of max_results to enforce a minimum of 1 (e.g.,
replace "max_results: int = 3" with a pydantic constrained field like
"max_results: int = Field(3, ge=1)" or use conint(ge=1)), and add the necessary
import from pydantic (Field or conint) so model-parse time validation prevents
invalid values; ensure this change is applied to the config class that defines
max_results in exa_internet_search.py.
- Around line 57-70: In _exa_internet_search, add a fast-fail check at the top
to immediately return an error (or raise) when no Exa API key is available:
check both the configured key (tool_config.exa_api_key) and the environment
(os.environ.get('EXA_API_KEY')) and if both are empty, return/raise immediately
instead of proceeding into the retry/backoff loop that uses
tool_config.max_retries; place this check before the question truncation and the
for attempt in range(tool_config.max_retries) loop so unnecessary
retries/backoff are avoided.
- Around line 87-92: The except block in the web-search retry logic in
exa_internet_search.py currently catches all exceptions silently; change this to
import logging and create a module-level logger, narrow the except to retryable
exceptions (e.g., httpx.RequestError, httpx.ReadTimeout, asyncio.TimeoutError)
and call logger.exception(...) before each retry, and separately handle
non-retryable errors (e.g., ValueError, httpx.HTTPStatusError with 401/403) to
fail fast (log with logger.exception and return the fallback message
immediately). Ensure the final fallback return still logs the last exception
with logger.exception so the full stack trace is captured when giving up after
tool_config.max_retries.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e72dc882-36a7-402a-b00f-37853e2058f5
📒 Files selected for processing (2)
- packages/nvidia_nat_langchain/pyproject.toml
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py
✅ Files skipped from review due to trivial changes (1)
- packages/nvidia_nat_langchain/pyproject.toml
```python
            except Exception:
                # Return a graceful message instead of raising, so the agent can
                # continue reasoning without web search rather than failing entirely.
                if attempt == tool_config.max_retries - 1:
                    return f"Web search failed after {tool_config.max_retries} attempts for: {question}"
                await asyncio.sleep(2**attempt)
```
🧩 Analysis chain
🌐 Web query:
In the exa-py SDK version compatible with langchain-exa>=1.1.0,<2.0.0, what specific exception types can AsyncExa.search_and_contents(...) raise for (1) transient network/server errors and (2) auth/configuration errors?
💡 Result:
For langchain-exa>=1.1.0,<2.0.0, the compatible exa-py range is exa-py>=1.0.8,<2.0.0. [1] A representative compatible release (e.g. exa-py==1.16.1) depends on httpx, so AsyncExa surfaces httpx’s request/status exceptions. [2]
(1) Transient network / server errors
- Network/transport failures (DNS, connect/reset, timeouts, etc.): `httpx.RequestError` (and its subclasses like `httpx.ConnectError`, `httpx.ReadTimeout`, etc.). [3]
- Server-side HTTP errors (5xx) when the SDK raises on non-2xx: `httpx.HTTPStatusError` (raised for 4xx/5xx by `response.raise_for_status()`). [3]
(2) Auth / configuration errors
- Invalid request configuration / invalid option types passed into the SDK: `ValueError` (the SDK validates options and raises `ValueError` on invalid options/types). [4]
- Auth/permission HTTP responses (typically 401/403) when the SDK raises on non-2xx: `httpx.HTTPStatusError`. [3]
Sources: [1] [2] [3] [4]
🏁 Script executed:
```shell
cat -n packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py | head -120
```

Repository: NVIDIA/NeMo-Agent-Toolkit
Length of output: 5404
🏁 Script executed:
```shell
# Check if logger is used in other files in the same directory or similar plugin modules
rg "import logging\|from logging\|logger\." packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/ -A 2 -B 2 | head -50
```

Repository: NVIDIA/NeMo-Agent-Toolkit
Length of output: 51
🏁 Script executed:
```shell
# Check for logger usage patterns in similar langchain integrations or plugins
rg "logger\.(exception|error)" packages/nvidia_nat_langchain/src/ -B 3 -A 1 | head -60
```

Repository: NVIDIA/NeMo-Agent-Toolkit
Length of output: 6714
Add logging with logger.exception() and narrow exception handling.
Line 87 catches all exceptions without logging, violating the coding guideline: "When catching and logging exceptions without re-raising, always use logger.exception() to capture the full stack trace". This suppresses error context needed for production debugging.
Additionally, not all exceptions are retryable:
- Transient errors (httpx.RequestError, httpx.ReadTimeout, etc.) warrant retry.
- Non-transient errors (ValueError for invalid config, httpx.HTTPStatusError 401/403 for auth failures) should fail immediately or be handled separately.
Import logging, create a module-level logger, and log exceptions with full stack trace before retrying or returning the fallback message. Narrow the exception handler to catch only retryable exception types where appropriate.
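One possible shape for the narrowed, logged retry loop (the exception classes here are stdlib stand-ins for the `httpx` types, since exa-py does not export a typed hierarchy; names and messages are illustrative):

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

# Stand-ins for retryable transport errors (httpx.RequestError etc. in real code).
RETRYABLE = (ConnectionError, TimeoutError)


async def search_with_retries(search, question: str, max_retries: int = 3) -> str:
    for attempt in range(max_retries):
        try:
            return await search(question)
        except RETRYABLE:
            # logger.exception captures the full stack trace of the active exception.
            logger.exception("Search attempt %s/%s failed", attempt + 1, max_retries)
            if attempt == max_retries - 1:
                break
            await asyncio.sleep(0)  # 2 ** attempt in real code; 0 keeps the sketch fast
        except ValueError:
            # Non-retryable configuration error: log and fail fast.
            logger.exception("Non-retryable error for: %s", question)
            return f"Web search misconfigured for: {question}"
    return f"Web search failed after {max_retries} attempts for: {question}"


async def always_times_out(_q: str) -> str:
    raise TimeoutError


print(asyncio.run(search_with_retries(always_times_out, "q", max_retries=2)))
```

Transient errors get the backoff path; configuration errors return immediately so the agent is not stalled by pointless retries.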
🧰 Tools
🪛 Ruff (0.15.9)
[warning] 87-87: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`
around lines 87 - 92, The except block in the web-search retry logic in
exa_internet_search.py currently catches all exceptions silently; change this to
import logging and create a module-level logger, narrow the except to retryable
exceptions (e.g., httpx.RequestError, httpx.ReadTimeout, asyncio.TimeoutError)
and call logger.exception(...) before each retry, and separately handle
non-retryable errors (e.g., ValueError, httpx.HTTPStatusError with 401/403) to
fail fast (log with logger.exception and return the fallback message
immediately). Ensure the final fallback return still logs the last exception
with logger.exception so the full stack trace is captured when giving up after
tool_config.max_retries.
Partially addressed in 78dbf08 — added logger.exception(...) inside the retry except block so failures surface in logs. Kept the broad except Exception because exa-py does not expose a stable public exception hierarchy for auth vs. transient errors; happy to narrow this if/when upstream adds typed exceptions.
@maxwbuckley, that's a reasonable approach. The logging concern is now addressed with logger.exception(...). As for keeping except Exception — confirmed, the exa-py SDK does not expose a stable public exception hierarchy, so narrowing the catch to typed exceptions would require importing httpx directly (an implementation detail that could break across SDK versions).
To silence the Ruff BLE001 warning without changing the semantics, you can add a # noqa: BLE001 inline comment:
```python
except Exception:  # noqa: BLE001
```

This makes the intentional broad catch explicit to both tooling and future readers. Happy to leave the comment as resolved with that small addition, or mark it fully resolved as-is — your call.
✏️ Learnings added
Learnt from: maxwbuckley
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 1846
File: packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py:102-108
Timestamp: 2026-04-13T08:57:13.415Z
Learning: In `packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`, `except Exception` is intentionally broad because `exa-py` does not expose a stable public exception hierarchy. To suppress the Ruff BLE001 warning in this file, use `# noqa: BLE001` on the except line. The `logger.exception(...)` call inside the block satisfies the logging guideline for stack trace capture.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: mnajafian-nv
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 1716
File: ci/scripts/model_health_check.py:163-167
Timestamp: 2026-03-03T01:17:03.694Z
Learning: Do not rename external/SDK environment variables when applying the NAT prefix convention. NAT_ prefix is intended only for NAT-internal configuration variables (e.g., NAT_CONFIG_FILE, NAT_LOG_LEVEL, NAT_SPAN_PREFIX) and NAT CI infrastructure variables (e.g., NAT_CI_REDIS_HOST). External SDK environment variables like NVIDIA_API_KEY or OPENAI_API_KEY should remain as-is and not be prefixed with NAT_. When reviewing Python code (including ci/scripts/*.py), ensure SDK-related environment variable names are preserved and not inadvertently renamed to include NAT_. Ensure consistent, explicit usage in code and configuration.
Actionable comments posted: 2
♻️ Duplicate comments (2)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py (2)
87-92: ⚠️ Potential issue | 🟠 Major
Replace blind catch with logged, selective retry handling.
Current handling catches everything silently; this loses stack traces and may retry non-retryable failures.
🔧 Suggested direction

```diff
+import logging
@@
+logger = logging.getLogger(__name__)
@@
             except Exception:
+                logger.exception("Exa search attempt %s/%s failed", attempt + 1, tool_config.max_retries)
                 # Return a graceful message instead of raising, so the agent can
                 # continue reasoning without web search rather than failing entirely.
                 if attempt == tool_config.max_retries - 1:
                     return f"Web search failed after {tool_config.max_retries} attempts for: {question}"
                 await asyncio.sleep(2**attempt)
```

🌐 Web query: In exa-py versions compatible with langchain-exa>=1.1.0,<2.0.0, which exception classes can AsyncExa.search_and_contents raise for transient network/server failures versus auth/configuration failures?
As per coding guidelines: "When catching and logging exceptions without re-raising: always use `logger.exception()` to capture the full stack trace information."
logger.exception()to capture the full stack trace information."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 87 - 92, Replace the blind except in the retry loop around AsyncExa.search_and_contents with selective handling: catch transient/network/server exceptions thrown by AsyncExa.search_and_contents (e.g., connection/timeouts/retryable HTTP errors) and on those call logger.exception(...) to record the stack trace, perform the exponential backoff (await asyncio.sleep(2**attempt)), and only return the graceful failure message after exhausting tool_config.max_retries; for non-retryable errors (authentication/configuration errors) re-raise or return immediately so they are not retried. Locate the retry block in exa_internet_search.py around the AsyncExa.search_and_contents call and replace the broad except Exception with specific exception classes and logger.exception usage while preserving the existing max_retries/attempt logic.
53-71: ⚠️ Potential issue | 🟠 Major
Fail fast when no Exa API key is configured.
If both config and env are empty, the tool still enters retries and backoff, adding avoidable latency.
🔧 Suggested fix

```diff
 async def _exa_internet_search(question: str) -> str:
     """This tool retrieves relevant contexts from web search (using Exa) for the given question.
@@
     Returns:
         str: The web search results.
     """
+    if not resolved_api_key:
+        return "Web search is unavailable: `EXA_API_KEY` is not configured."
+
     # Exa API supports longer queries than Tavily but truncate at a reasonable limit
     if len(question) > 2000:
         question = question[:1997] + "..."
```
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 53 - 71, The function _exa_internet_search currently creates exa_client with resolved_api_key and then enters retry/backoff loop even when no key is configured; change it to fail fast by checking resolved_api_key (or api_key) before creating/using AsyncExa and raise/log a clear error or return immediately if it's empty so you don't enter the for attempt in range(tool_config.max_retries) loop; update the early check near where resolved_api_key/api_key and exa_client are set (and before the loop that uses tool_config.max_retries) to short-circuit execution when no API key is present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/source/get-started/tutorials/add-tools-to-a-workflow.md`:
- Around line 170-187: The Exa subsection shows configuring
functions.internet_search with _type: exa_internet_search but omits wiring the
tool into the workflow; update the docs to add an explicit workflow block that
sets workflow.tool_names to include internet_search and current_datetime (and
use the correct workflow._type, e.g., react_agent) so the example demonstrates
both function registration (functions.internet_search / current_datetime) and
adding those names to workflow.tool_names.
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Around line 82-84: The generated web_search_results string incorrectly uses a
self-closing opening tag plus a separate closing tag; update the formatting
where web_search_results is built (iterating over search_response.results and
using doc.url and doc.text) to use a proper opening tag with href (e.g.,
<Document href="...">) followed by the document text and then the closing
</Document> tag so the XML/HTML is well-formed.
---
Duplicate comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Around line 87-92: Replace the blind except in the retry loop around
AsyncExa.search_and_contents with selective handling: catch
transient/network/server exceptions thrown by AsyncExa.search_and_contents
(e.g., connection/timeouts/retryable HTTP errors) and on those call
logger.exception(...) to record the stack trace, perform the exponential backoff
(await asyncio.sleep(2**attempt)), and only return the graceful failure message
after exhausting tool_config.max_retries; for non-retryable errors
(authentication/configuration errors) re-raise or return immediately so they are
not retried. Locate the retry block in exa_internet_search.py around the
AsyncExa.search_and_contents call and replace the broad except Exception with
specific exception classes and logger.exception usage while preserving the
existing max_retries/attempt logic.
- Around line 53-71: The function _exa_internet_search currently creates
exa_client with resolved_api_key and then enters retry/backoff loop even when no
key is configured; change it to fail fast by checking resolved_api_key (or
api_key) before creating/using AsyncExa and raise/log a clear error or return
immediately if it's empty so you don't enter the for attempt in
range(tool_config.max_retries) loop; update the early check near where
resolved_api_key/api_key and exa_client are set (and before the loop that uses
tool_config.max_retries) to short-circuit execution when no API key is present.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: db1cc117-034d-4644-b824-f2cfbfcbf8e9
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
- docs/source/get-started/tutorials/add-tools-to-a-workflow.md
- packages/nvidia_nat_langchain/pyproject.toml
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/register.py
- packages/nvidia_nat_langchain/tests/test_exa_internet_search.py
✅ Files skipped from review due to trivial changes (3)
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/register.py
- packages/nvidia_nat_langchain/pyproject.toml
- packages/nvidia_nat_langchain/tests/test_exa_internet_search.py
```python
web_search_results = "\n\n---\n\n".join([
    f'<Document href="{doc.url}"/>\n{doc.text}\n</Document>'
    for doc in search_response.results if doc.text
])
```
Fix malformed Document wrapper formatting.
The current string uses a self-closing opening tag and a closing tag together, which is inconsistent.
🔧 Suggested fix
```diff
- f'<Document href="{doc.url}"/>\n{doc.text}\n</Document>'
+ f'<Document href="{doc.url}">\n{doc.text}\n</Document>'
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`
around lines 82 - 84, The generated web_search_results string incorrectly uses a
self-closing opening tag plus a separate closing tag; update the formatting
where web_search_results is built (iterating over search_response.results and
using doc.url and doc.text) to use a proper opening tag with href (e.g.,
<Document href="...">) followed by the document text and then the closing
</Document> tag so the XML/HTML is well-formed.
Not changed intentionally — the existing tavily_internet_search tool uses the same <Document href="..."/> format (tavily_internet_search.py:84), and this PR was designed to mirror Tavily. Fixing the XML wrapper should be done consistently across both tools in a separate change.
@maxwbuckley, understood — that's a fair point. Keeping the format consistent with tavily_internet_search makes sense for this PR. Would you like me to open a follow-up issue to track fixing the <Document href="..."/> wrapper format consistently across both exa_internet_search and tavily_internet_search in a future change?
🧠 Learnings used
Learnt from: mnajafian-nv
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 1716
File: ci/scripts/model_health_check.py:163-167
Timestamp: 2026-03-03T01:17:03.694Z
Learning: Do not rename external/SDK environment variables when applying the NAT prefix convention. NAT_ prefix is intended only for NAT-internal configuration variables (e.g., NAT_CONFIG_FILE, NAT_LOG_LEVEL, NAT_SPAN_PREFIX) and NAT CI infrastructure variables (e.g., NAT_CI_REDIS_HOST). External SDK environment variables like NVIDIA_API_KEY or OPENAI_API_KEY should remain as-is and not be prefixed with NAT_. When reviewing Python code (including ci/scripts/*.py), ensure SDK-related environment variable names are preserved and not inadvertently renamed to include NAT_. Ensure consistent, explicit usage in code and configuration.
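For reference, the two wrapper formats debated in this thread differ only in the opening tag. A small illustrative helper (the result tuples and function name here are made up, not the toolkit's API):

```python
def format_documents(results: list[tuple[str, str]], well_formed: bool = True) -> str:
    # well_formed=True emits <Document href="...">...</Document>;
    # False emits the Tavily-style self-closing opening tag this PR keeps.
    open_tmpl = '<Document href="{url}">' if well_formed else '<Document href="{url}"/>'
    return "\n\n---\n\n".join(
        f"{open_tmpl.format(url=url)}\n{text}\n</Document>" for url, text in results if text
    )
```

Either form is parseable by the agent, but only the first is well-formed XML — which is why the follow-up issue proposes fixing both tools together.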
- Add ge=1 constraint on max_results field
- Fail fast when no EXA_API_KEY is configured
- Add workflow.tool_names example to Exa docs section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Max Buckley <maxwbuckley@gmail.com>
♻️ Duplicate comments (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py (1)
85-88: ⚠️ Potential issue | 🟡 Minor — Fix malformed XML tag syntax.
The `<Document>` wrapper uses a self-closing opening tag (`/>`) combined with a separate closing tag, which is inconsistent XML.
Proposed fix
```diff
 web_search_results = "\n\n---\n\n".join([
-    f'<Document href="{doc.url}"/>\n{doc.text}\n</Document>'
+    f'<Document href="{doc.url}">\n{doc.text}\n</Document>'
     for doc in search_response.results if doc.text
 ])
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 85 - 88, The XML wrapper for search results is malformed: the opening tag in the web_search_results join uses a self-closing form ('<Document href="..."/>') but then adds a separate closing tag; update the string construction inside web_search_results (the list comprehension iterating over search_response.results and using doc.url/doc.text) so the opening tag is a proper start tag (e.g., '<Document href="...">') paired with the existing '</Document>' closing tag to produce well-formed XML.
🧹 Nitpick comments (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py (1)
90-96: Add logging for exception handling.
The code catches all exceptions silently without logging, making production debugging difficult. Per coding guidelines, use `logger.exception()` when catching exceptions without re-raising.
Additionally, not all exceptions warrant retry (e.g., auth failures from 401/403 should fail fast).
Proposed improvement
```diff
+import logging
+
+logger = logging.getLogger(__name__)
+
 # ... in the function ...
     except Exception:
+        logger.exception("Exa search attempt %d failed", attempt + 1)
         # Return a graceful message instead of raising, so the agent can
         # continue reasoning without web search rather than failing entirely.
         if attempt == tool_config.max_retries - 1:
             return f"Web search failed after {tool_config.max_retries} attempts for: {question}"
         await asyncio.sleep(2**attempt)
```
As per coding guidelines: "When catching and logging exceptions without re-raising, always use `logger.exception()` to capture the full stack trace."
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 90 - 96, The except block in the web search retry loop swallows all exceptions—update it to catch Exception as e and call logger.exception(...) to log the full stack trace and context (include question and attempt), and add a fast-fail for authorization errors by checking the exception for HTTP status 401/403 (e.g., inspect e.response.status or isinstance checks for HTTPError) and immediately return a clear failure string in that case; for other exceptions continue the existing exponential backoff (await asyncio.sleep(2**attempt)) and only return the final failure after tool_config.max_retries attempts. Reference the existing variables/methods: attempt, tool_config.max_retries, question, and logger/asyncio.sleep in exa_internet_search.py.
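A runnable sketch of the pattern this nitpick describes — selective retry with `logger.exception()` and exponential backoff. `TransientSearchError` is a stand-in for the SDK's real retryable exception classes, which are not named in this thread:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class TransientSearchError(Exception):
    """Stand-in for retryable network/server errors from the search SDK."""


async def search_with_retries(do_search, question: str, max_retries: int = 3) -> str:
    for attempt in range(max_retries):
        try:
            return await do_search(question)
        except TransientSearchError:
            # Log the full stack trace, then back off. Auth/config errors
            # (not caught here) propagate immediately instead of retrying.
            logger.exception("Search attempt %d failed", attempt + 1)
            if attempt == max_retries - 1:
                return f"Web search failed after {max_retries} attempts for: {question}"
            await asyncio.sleep(2 ** attempt)
    return ""  # unreachable when max_retries >= 1
```

Because only `TransientSearchError` is caught, a 401/403-style failure raised as a different exception type fails fast rather than sleeping through the backoff schedule.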
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Around line 85-88: The XML wrapper for search results is malformed: the
opening tag in the web_search_results join uses a self-closing form ('<Document
href="..."/>') but then adds a separate closing tag; update the string
construction inside web_search_results (the list comprehension iterating over
search_response.results and using doc.url/doc.text) so the opening tag is a
proper start tag (e.g., '<Document href="...">') paired with the existing
'</Document>' closing tag to produce well-formed XML.
---
Nitpick comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Around line 90-96: The except block in the web search retry loop swallows all
exceptions—update it to catch Exception as e and call logger.exception(...) to
log the full stack trace and context (include question and attempt), and add a
fast-fail for authorization errors by checking the exception for HTTP status
401/403 (e.g., inspect e.response.status or isinstance checks for HTTPError) and
immediately return a clear failure string in that case; for other exceptions
continue the existing exponential backoff (await asyncio.sleep(2**attempt)) and
only return the final failure after tool_config.max_retries attempts. Reference
the existing variables/methods: attempt, tool_config.max_retries, question, and
logger/asyncio.sleep in exa_internet_search.py.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 79ae3df7-b586-4594-bb27-53cf77807065
📒 Files selected for processing (2)
- docs/source/get-started/tutorials/add-tools-to-a-workflow.md
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py
Hi @maxwbuckley, appreciate your interest in contributing to NAT! Can you raise an issue describing the use case and argument to include Exa integration with NeMo Agent Toolkit? Why does this need to be a built-in tool directly in nat-langchain? When you raise the issue we can get product involved to consider as an RFR. Adding DO NOT MERGE to this PR, until issue is raised and approved |
Thanks @bbednarski9! Filed #1848 with the use case and rationale. Happy to iterate on it. |
```diff
@@ -0,0 +1,102 @@
# SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
```
Fixed — updated to Copyright (c) 2026 in both source and test files.
```python
async def exa_internet_search(tool_config: ExaInternetSearchToolConfig, builder: Builder):
    import os

    from exa_py import AsyncExa
```
The dependency declaration in `pyproject.toml` does not match this runtime import. Can we use `from langchain_exa ...` instead? This transitive dependency could break without warning.
Good catch — switched to `from langchain_exa import ExaSearchResults` so the runtime import matches the declared dependency.
```python
api_key = get_secret_value(tool_config.api_key) if tool_config.api_key else ""
resolved_api_key = api_key or os.environ.get("EXA_API_KEY", "")
exa_client = AsyncExa(api_key=resolved_api_key)
```
Client is created even if the key is empty. A safer pattern would be to check the key first, then instantiate lazily inside the invocation path.
Done — client is now instantiated lazily inside _exa_internet_search(), and only when resolved_api_key is non-empty. If the key is missing, it returns an early error message.
```python
if len(question) > 2000:
    question = question[:1997] + "..."
```
bit of a 🪄 magic number here. Is there a way to provide an override for users who want longer limits? Or provide a warning that the message was truncated?
Seems like something that should go inside of tool config
Added a max_query_length config field (default 2000, ge=1) so users can override it in YAML. Also added a logger.warning() when truncation occurs.
Agreed — now a max_query_length field on ExaInternetSearchToolConfig.
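A sketch of the configurable truncation described above. The field name and 2000-character default follow this thread; the standalone helper itself is illustrative, since in the PR the logic lives inline in the tool function:

```python
import logging

logger = logging.getLogger(__name__)


def truncate_query(question: str, max_query_length: int = 2000) -> str:
    # Keep room for the trailing ellipsis, and warn so truncation is visible in logs.
    if len(question) > max_query_length:
        logger.warning("Query truncated from %d to %d characters",
                       len(question), max_query_length)
        return question[: max_query_length - 3] + "..."
    return question
```

Users who need longer queries raise `max_query_length` in YAML instead of editing code, and the warning makes silent truncation debuggable.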
```python
import pytest
from pydantic import SecretStr


@pytest.mark.parametrize("constructor_args", [{}, {
    "api_key": ""
}, {
    "api_key": "my_api_key"
}],
                         ids=["default", "empty_api_key", "provided_api_key"])
def test_api_key_is_secret_str(constructor_args: dict):
    from nat.plugins.langchain.tools.exa_internet_search import ExaInternetSearchToolConfig
    expected_api_key = constructor_args.get("api_key", "")

    config = ExaInternetSearchToolConfig(**constructor_args)
    assert isinstance(config.api_key, SecretStr)

    api_key = config.api_key.get_secret_value()
    assert api_key == expected_api_key


def test_default_api_key_is_unique_instance():
    from nat.plugins.langchain.tools.exa_internet_search import ExaInternetSearchToolConfig

    config1 = ExaInternetSearchToolConfig()
    config2 = ExaInternetSearchToolConfig()

    assert config1.api_key is not config2.api_key
```
need more complete code coverage on these test cases: in general the tests are not checking the tool's behavior. Can you add some tests for retries, truncation, empty result, empty key?
Expanded to 12 tests — added: config validation (max_retries=0, max_results=0, invalid search_type, invalid livecrawl), empty key returns unavailable message, query truncation verified via mock, empty results handling, and retry count verification on exception.
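One way the requested retry test can be written with `unittest.mock.AsyncMock`. This is shown against a simplified stand-in loop (no backoff, for test speed), since the tool's internal call site is not reproduced in this thread:

```python
import asyncio
from unittest import mock


async def search_with_retries(client_call, question: str, max_retries: int = 3) -> str:
    # Simplified stand-in for the tool's retry loop.
    for attempt in range(max_retries):
        try:
            return await client_call(question)
        except Exception:
            if attempt == max_retries - 1:
                return f"Web search failed after {max_retries} attempts for: {question}"
    return ""


def test_retry_count_on_persistent_failure():
    # AsyncMock with side_effect raises on every await, exhausting the retries.
    client_call = mock.AsyncMock(side_effect=RuntimeError("boom"))
    result = asyncio.run(search_with_retries(client_call, "q", max_retries=3))
    assert client_call.await_count == 3
    assert "failed after 3 attempts" in result
```

`await_count` gives an exact retry count without patching sleeps or timing, which keeps the behavioral tests fast and deterministic.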
@maxwbuckley thanks for filing the issue. Did a first pass of the code. Would you mind addressing the comments above and I'll take another look? -Bryan |
- Fix copyright year to 2026
- Use langchain_exa.ExaSearchResults instead of exa_py directly to match the declared dependency
- Lazily instantiate client inside invocation path, only if key exists
- Add configurable max_query_length field (default 2000) with truncation warning log
- Expand test coverage: retries, truncation, empty results, empty key, config validation for invalid search_type/livecrawl/max_retries/max_results

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Max Buckley <maxwbuckley@gmail.com>
@bbednarski9 thanks for the detailed review! All six comments have been addressed across the follow-up commits (a69e612, 2e43508, 27786c9). Summary of changes:
Ready for another pass whenever you have a moment. Thanks!
Address CodeRabbit feedback to surface failures instead of swallowing them silently in the retry loop.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ran through the CodeRabbit comments — summary of resolution:

Already addressed in earlier commits:
Addressed in 78dbf08:
Intentionally not changed:
All 12 unit tests still pass and
Summary
- `exa_internet_search` tool using `langchain_exa.ExaSearchResults`, mirroring the existing `tavily_internet_search` tool
- `ExaInternetSearchToolConfig` with configurable `max_results`, `search_type` (`Literal["auto", "neural", "keyword"]`), `livecrawl` (`Literal["always", "fallback", "never"]`), `max_query_length`, and `api_key` (via config or `EXA_API_KEY` env var)
- `langchain-exa>=1.1.0,<2.0.0` dependency added to `nvidia-nat-langchain`

Closes #1848
Test plan
- Unit tests (`test_exa_internet_search.py` — config validation, retries, truncation, empty results, empty key)
- Tool registered in `GlobalTypeRegistry` and appears in `nat info components -t function`
- `ruff check` passes on all new/modified files
- Verified with `EXA_API_KEY` against live Exa API

🤖 Generated with Claude Code