diff --git a/docs/examples/deep_search_sandbox_cookbook.ipynb b/docs/examples/deep_search_sandbox_cookbook.ipynb new file mode 100644 index 0000000..424b960 --- /dev/null +++ b/docs/examples/deep_search_sandbox_cookbook.ipynb @@ -0,0 +1,1106 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "d6c1bd74", + "metadata": {}, + "source": [ + "# Deep Search with the Sandbox Tool\n", + "\n", + "The `web_search` tool in the Agent API returns ranked links and snippets, and the model reasons over them. That's plenty when the answer\n", + "is written in prose -- but some facts only live in structured metadata or deep in a long document,\n", + "where no snippet captures them. The `sandbox` [tool](https://docs.perplexity.ai/docs/agent-api/tools/sandbox) gives the model another move: write Python that\n", + "drives the same web search and page fetches *as code*, then parses the result. That's\n", + "[Search as Code](https://research.perplexity.ai/articles/rethinking-search-as-code-generation).\n", + "\n", + "We test both approaches on two questions -- one whose answer **is** in snippets, one whose answer **isn't** -- each\n", + "with `web_search` alone vs. `web_search + sandbox`." + ] + }, + { + "cell_type": "markdown", + "id": "41ca5a3b", + "metadata": {}, + "source": [ + "## 1. Setup\n", + "\n", + "Everything goes through the official [Perplexity Python SDK](https://github.com/ppl-ai/perplexity-python).\n", + "We submit jobs with `background=True` and poll `responses.retrieve` -- the `sandbox` tool only runs on\n", + "the background path.\n", + "\n", + "> Set your key as `PERPLEXITY_API_KEY` (or `PPLX_API_KEY`) -- get one at\n", + "> [perplexity.ai/account/api](https://www.perplexity.ai/account/api/keys)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "60f0f494", + "metadata": { + "execution": { + "iopub.execute_input": "2026-06-09T23:46:04.321872Z", + "iopub.status.busy": "2026-06-09T23:46:04.320837Z", + "iopub.status.idle": "2026-06-09T23:46:05.379278Z", + "shell.execute_reply": "2026-06-09T23:46:05.378963Z" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\r\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m26.0.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m26.1.2\u001b[0m\r\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\r\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Note: you may need to restart the kernel to use updated packages." + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n" + ] + } + ], + "source": [ + "%pip install --quiet perplexityai" + ] + }, + { + "cell_type": "markdown", + "id": "3ac17f35", + "metadata": {}, + "source": [ + "You can choose any frontier model listed at https://docs.perplexity.ai/docs/agent-api/models" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "409cc405", + "metadata": { + "execution": { + "iopub.execute_input": "2026-06-09T23:46:05.380723Z", + "iopub.status.busy": "2026-06-09T23:46:05.380589Z", + "iopub.status.idle": "2026-06-09T23:46:05.588131Z", + "shell.execute_reply": "2026-06-09T23:46:05.587890Z" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Perplexity SDK ready | model: openai/gpt-5.5\n" + ] + } + ], + "source": [ + "import os, time, json\n", + "from perplexity import Perplexity\n", + "\n", + "API_KEY = os.environ.get(\"PERPLEXITY_API_KEY\") or os.environ.get(\"PPLX_API_KEY\", \"\")\n", + "MODEL = 
\"openai/gpt-5.5\"\n", + "\n", + "assert API_KEY.startswith(\"pplx-\"), \"Set PERPLEXITY_API_KEY (or PPLX_API_KEY) to your Perplexity API key.\"\n", + "\n", + "client = Perplexity(api_key=API_KEY)\n", + "print(\"Perplexity SDK ready | model:\", MODEL)" + ] + }, + { + "cell_type": "markdown", + "id": "e3d4d1b0", + "metadata": {}, + "source": [ + "## 2. The A/B harness\n", + "\n", + "`run(prompt, use_sandbox)` creates a background job and polls until it's done. The only thing that\n", + "changes between runs is the tool list -- `web_search`, optionally plus `sandbox`. Same model, prompt,\n", + "and `max_steps`, so the comparison is fair. `poll` returns the response as a dict so the helpers below\n", + "can read it." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "6065abbf", + "metadata": { + "execution": { + "iopub.execute_input": "2026-06-09T23:46:05.589301Z", + "iopub.status.busy": "2026-06-09T23:46:05.589187Z", + "iopub.status.idle": "2026-06-09T23:46:05.603700Z", + "shell.execute_reply": "2026-06-09T23:46:05.603465Z" + } + }, + "outputs": [], + "source": [ + "def submit(prompt: str, use_sandbox: bool, max_steps: int = 20) -> str:\n", + " \"\"\"Create a background Agent API job via the SDK. 
Returns the response id.\"\"\"\n", + " tools = [{\"type\": \"web_search\"}]\n", + " if use_sandbox:\n", + " tools.append({\"type\": \"sandbox\"})\n", + " resp = client.responses.create(\n", + " model=MODEL, input=prompt, background=True, max_steps=max_steps, tools=tools,\n", + " )\n", + " return resp.id\n", + "\n", + "def poll(resp_id: str, interval: int = 12, timeout_s: int = 1200) -> dict:\n", + " \"\"\"Poll a background job until it reaches a terminal state, tolerating transient errors.\"\"\"\n", + " deadline = time.time() + timeout_s\n", + " while time.time() < deadline:\n", + " try:\n", + " resp = client.responses.retrieve(resp_id)\n", + " except Exception as e:\n", + " print(f\" transient error, retrying: {e}\")\n", + " time.sleep(interval)\n", + " continue\n", + " data = resp.model_dump(warnings=False)\n", + " status = data.get(\"status\")\n", + " print(f\" {resp_id[:18]}… -> {status}\")\n", + " if status in (\"completed\", \"failed\", \"incomplete\"):\n", + " return data\n", + " time.sleep(interval)\n", + " raise TimeoutError(f\"{resp_id} did not finish within {timeout_s}s\")\n", + "\n", + "def run(prompt: str, use_sandbox: bool, **kw) -> dict:\n", + " label = \"web_search + sandbox\" if use_sandbox else \"web_search only\"\n", + " print(f\"Submitting [{label}] …\")\n", + " return poll(submit(prompt, use_sandbox, **kw))" + ] + }, + { + "cell_type": "markdown", + "id": "a29d257e", + "metadata": {}, + "source": [ + "### Reading the response\n", + "\n", + "The response has typed `output` items. We use three: `message` (the answer), `sandbox_results` (each\n", + "Python cell the model ran), and `usage` (tokens and cost)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "c6278137", + "metadata": { + "execution": { + "iopub.execute_input": "2026-06-09T23:46:05.605150Z", + "iopub.status.busy": "2026-06-09T23:46:05.605059Z", + "iopub.status.idle": "2026-06-09T23:46:05.607443Z", + "shell.execute_reply": "2026-06-09T23:46:05.607257Z" + } + }, + "outputs": [], + "source": [ + "def answer_text(resp: dict) -> str:\n", + " for o in resp.get(\"output\", []):\n", + " if o.get(\"type\") == \"message\":\n", + " return \"\".join(c.get(\"text\", \"\") for c in o.get(\"content\", []))\n", + " return \"\"\n", + "\n", + "def usage_summary(resp: dict) -> dict:\n", + " u = resp.get(\"usage\", {}) or {}\n", + " cost = (u.get(\"cost\") or {}).get(\"total_cost\")\n", + " return {\"input_tokens\": u.get(\"input_tokens\"), \"output_tokens\": u.get(\"output_tokens\"),\n", + " \"total_cost_usd\": cost}\n", + "\n", + "def sandbox_cells(resp: dict) -> list:\n", + " cells = []\n", + " for o in resp.get(\"output\", []):\n", + " if o.get(\"type\") == \"sandbox_results\":\n", + " res = (o.get(\"results\") or [{}])[0]\n", + " cells.append({\"code\": o.get(\"code\", \"\"), \"exit_code\": res.get(\"exit_code\"),\n", + " \"duration_ms\": res.get(\"duration_ms\"), \"stdout\": res.get(\"stdout\", \"\"),\n", + " \"stderr\": res.get(\"stderr\", \"\")})\n", + " return cells" + ] + }, + { + "cell_type": "markdown", + "id": "2432d92f", + "metadata": {}, + "source": [ + "### Scorecard and trace helpers\n", + "\n", + "`print_scorecard` lays two runs side by side; `print_trace` dumps the Python the sandbox actually ran." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "663027dd", + "metadata": { + "execution": { + "iopub.execute_input": "2026-06-09T23:46:05.608535Z", + "iopub.status.busy": "2026-06-09T23:46:05.608463Z", + "iopub.status.idle": "2026-06-09T23:46:05.611148Z", + "shell.execute_reply": "2026-06-09T23:46:05.610974Z" + } + }, + "outputs": [], + "source": [ + "def make_row(label, resp, score, max_score):\n", + " u = usage_summary(resp)\n", + " return {\"run\": label, \"correct\": f\"{score}/{max_score}\", \"input_tokens\": u[\"input_tokens\"],\n", + " \"output_tokens\": u[\"output_tokens\"], \"total_cost_usd\": u[\"total_cost_usd\"],\n", + " \"sandbox_cells\": len(sandbox_cells(resp))}\n", + "\n", + "def print_scorecard(rows):\n", + " cols = [\"run\", \"correct\", \"input_tokens\", \"output_tokens\", \"total_cost_usd\", \"sandbox_cells\"]\n", + " w = {c: max(len(c), *(len(str(r[c])) for r in rows)) for c in cols}\n", + " print(\" | \".join(c.ljust(w[c]) for c in cols))\n", + " print(\"-+-\".join(\"-\" * w[c] for c in cols))\n", + " for r in rows:\n", + " print(\" | \".join(str(r[c]).ljust(w[c]) for c in cols))\n", + "\n", + "def print_trace(resp, max_chars=900):\n", + " cells = sandbox_cells(resp)\n", + " print(f\"The model ran {len(cells)} sandbox cell(s).\\n\")\n", + " for i, c in enumerate(cells, 1):\n", + " print(\"=\" * 72)\n", + " print(f\"CELL {i} (exit={c['exit_code']}, {c['duration_ms']} ms)\")\n", + " print(\"=\" * 72)\n", + " print(c[\"code\"])\n", + " if c[\"stdout\"]:\n", + " print(\"--- stdout ---\"); print(c[\"stdout\"][:max_chars])\n", + " if c[\"stderr\"]:\n", + " print(\"--- stderr ---\"); print(c[\"stderr\"][:400])\n", + " print()" + ] + }, + { + "cell_type": "markdown", + "id": "1d8089ed", + "metadata": {}, + "source": [ + "## Example 1: Fetching npm latest versions: `web_search` vs `sandbox`\n", + "\n", + "**Question:** the current published version of 15 npm packages, sourced from the web with a citation\n", + "for each.\n", + 
"\n", + "Both runs search the web -- the difference is *how*. `web_search` pulls snippets into the model's\n", + "context and reasons over them. The sandbox uses the same search stack **as code** — `pplx_sdk.search.web`\n", + "to find the official source, `content.fetch` to read it -- parses the version, and returns just that.\n", + "\"Latest version\" isn't in a snippet, so the first approach has to hunt; the second reads it straight\n", + "off the page. We grade both against the npm registry, so the answer key can't go stale." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "aa67e490", + "metadata": { + "execution": { + "iopub.execute_input": "2026-06-09T23:46:05.612157Z", + "iopub.status.busy": "2026-06-09T23:46:05.612095Z", + "iopub.status.idle": "2026-06-09T23:46:05.613747Z", + "shell.execute_reply": "2026-06-09T23:46:05.613536Z" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Build a reference table of the current latest published version of these npm packages. Source every value from the web and cite the exact URL you read it from; use the web-search and page-fetch tools available to you rather than any hardcoded API client, and don't answer from prior knowledge. Return one Markdown row per package: package | latest version | source URL. Packages: react, lodash, express, axios, webpack, typescript, eslint, next, vue, chalk, commander, jest, vite, redux, zod.\n" + ] + } + ], + "source": [ + "NPM_PACKAGES = [\"react\", \"lodash\", \"express\", \"axios\", \"webpack\", \"typescript\", \"eslint\",\n", + " \"next\", \"vue\", \"chalk\", \"commander\", \"jest\", \"vite\", \"redux\", \"zod\"]\n", + "\n", + "NPM_PROMPT = (\n", + " \"Build a reference table of the current latest published version of these npm packages. 
\"\n", + " \"Source every value from the web and cite the exact URL you read it from; use the web-search \"\n", + " \"and page-fetch tools available to you rather than any hardcoded API client, and don't answer \"\n", + " \"from prior knowledge. Return one Markdown row per package: package | latest version | source URL. \"\n", + " \"Packages: \" + \", \".join(NPM_PACKAGES) + \".\"\n", + ")\n", + "print(NPM_PROMPT)" + ] + }, + { + "cell_type": "markdown", + "id": "5bd59d45", + "metadata": {}, + "source": [ + "### Ground truth from the registry\n", + "\n", + "The npm registry is authoritative, so we use it as the answer key -- one small request per package for\n", + "its `latest` version. (This is just the grader; the model isn't allowed to shortcut to it.)" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "73dfcb1f", + "metadata": { + "execution": { + "iopub.execute_input": "2026-06-09T23:46:05.614737Z", + "iopub.status.busy": "2026-06-09T23:46:05.614675Z", + "iopub.status.idle": "2026-06-09T23:46:06.521243Z", + "shell.execute_reply": "2026-06-09T23:46:06.520844Z" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{'react': '19.2.7', 'lodash': '4.18.1', 'express': '5.2.1', 'axios': '1.17.0', 'webpack': '5.107.2', 'typescript': '6.0.3', 'eslint': '10.4.1', 'next': '16.2.9', 'vue': '3.5.35', 'chalk': '5.6.2', 'commander': '15.0.0', 'jest': '30.4.2', 'vite': '8.0.16', 'redux': '5.0.1', 'zod': '4.4.3'}\n" + ] + } + ], + "source": [ + "import urllib.request # answer key only — an independent source of truth for grading\n", + "\n", + "def npm_latest(pkg):\n", + " with urllib.request.urlopen(f\"https://registry.npmjs.org/{pkg}/latest\", timeout=20) as r:\n", + " return json.load(r)[\"version\"]\n", + "\n", + "NPM_TRUTH = {pkg: npm_latest(pkg) for pkg in NPM_PACKAGES}\n", + "\n", + "def grade_versions(resp, truth):\n", + " \"\"\"Count how many true version strings appear verbatim in the answer.\"\"\"\n", + " text = 
answer_text(resp)\n", + " hits = {pkg: (ver in text) for pkg, ver in truth.items()}\n", + " return sum(hits.values()), len(hits)\n", + "\n", + "print(NPM_TRUTH)" + ] + }, + { + "cell_type": "markdown", + "id": "7c40b1e9", + "metadata": {}, + "source": [ + "### Run both ways" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "d2f718e8", + "metadata": { + "execution": { + "iopub.execute_input": "2026-06-09T23:46:06.523045Z", + "iopub.status.busy": "2026-06-09T23:46:06.522902Z", + "iopub.status.idle": "2026-06-09T23:49:26.462411Z", + "shell.execute_reply": "2026-06-09T23:49:26.460725Z" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Submitting [web_search only] …\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_42e849da-ad20… -> queued\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_42e849da-ad20… -> in_progress\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_42e849da-ad20… -> in_progress\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_42e849da-ad20… -> in_progress\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_42e849da-ad20… -> in_progress\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_42e849da-ad20… -> in_progress\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_42e849da-ad20… -> in_progress\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_42e849da-ad20… -> in_progress\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_42e849da-ad20… -> completed\n", + "Submitting [web_search + sandbox] …\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_3aeb283f-07df… -> queued\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_3aeb283f-07df… 
-> in_progress\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_3aeb283f-07df… -> in_progress\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_3aeb283f-07df… -> in_progress\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_3aeb283f-07df… -> in_progress\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_3aeb283f-07df… -> in_progress\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_3aeb283f-07df… -> in_progress\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_3aeb283f-07df… -> in_progress\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + " resp_3aeb283f-07df… -> completed\n", + "\n", + "baseline answer:\n", + " | package | latest version | source URL |\n", + "|---|---:|---|\n", + "| react | 19.2.7 | https://security.snyk.io/package/npm/react |\n", + "| lodash | 4.18.1 | https://security.snyk.io/package/npm/lodash |\n", + "| express | 5.2.1 | https://security.snyk.io/package/npm/express |\n", + "| axios | 1.17.0 | https://security.snyk.io/package/npm/axios |\n", + "| webpack | 5.105.4 | https://security.snyk.io/package/npm/webpack |\n", + "| typescript | 6.0.3 | https://security.snyk.io/package/npm/typescript |\n", + "| eslint | 10.4.1 | https://secure.software/npm/packages/eslint/10.4.1 |\n", + "| next | 16.2.7 | https://security.snyk.io/package/npm/next |\n", + "| vue | 3.5.35 | https://snyk.io/test/npm/vue@3.5.35 |\n", + "| chalk | 5.6.2 | https://security.snyk.io/package/npm/chalk |\n", + "| commander | 14.0.3 | https://security.snyk.io/package/npm/commander |\n", + "| jest | 30.4.2 | https://security.snyk.io/package/npm/jest |\n", + "| vite | 8.0.16 | https://secure.software/npm/packages/vite/8.0.16 |\n", + "| redux | 5.0.1 | https://libup.wmcloud.org/library/npm/redux?branch=REL1_46 |\n", + "| zod | 4.4.3 | 
https://security.snyk.io/package/npm/zod |\n", + "\n", + "sandbox answer:\n", + " Read the `version` field from each package’s npm registry `/latest` page.\n", + "\n", + "| package | latest version | source URL |\n", + "|---|---:|---|\n", + "| react | 19.2.7 | https://registry.npmjs.org/react/latest |\n", + "| lodash | 4.18.1 | https://registry.npmjs.org/lodash/latest |\n", + "| express | 5.2.1 | https://registry.npmjs.org/express/latest |\n", + "| axios | 1.17.0 | https://registry.npmjs.org/axios/latest |\n", + "| webpack | 5.107.2 | https://registry.npmjs.org/webpack/latest |\n", + "| typescript | 6.0.3 | https://registry.npmjs.org/typescript/latest |\n", + "| eslint | 10.4.1 | https://registry.npmjs.org/eslint/latest |\n", + "| next | 16.2.9 | https://registry.npmjs.org/next/latest |\n", + "| vue | 3.5.35 | https://registry.npmjs.org/vue/latest |\n", + "| chalk | 5.6.2 | https://registry.npmjs.org/chalk/latest |\n", + "| commander | 15.0.0 | https://registry.npmjs.org/commander/latest |\n", + "| jest | 30.4.2 | https://registry.npmjs.org/jest/latest |\n", + "| vite | 8.0.16 | https://registry.npmjs.org/vite/latest |\n", + "| redux | 5.0.1 | https://registry.npmjs.org/redux/latest |\n", + "| zod | 4.4.3 | https://registry.npmjs.org/zod/latest |\n" + ] + } + ], + "source": [ + "npm_baseline = run(NPM_PROMPT, use_sandbox=False)\n", + "npm_sandbox = run(NPM_PROMPT, use_sandbox=True)\n", + "\n", + "print(\"\\nbaseline answer:\\n\", answer_text(npm_baseline))\n", + "print(\"\\nsandbox answer:\\n\", answer_text(npm_sandbox))" + ] + }, + { + "cell_type": "markdown", + "id": "ae0b28bd", + "metadata": {}, + "source": [ + "### Scorecard" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "2b29de75", + "metadata": { + "execution": { + "iopub.execute_input": "2026-06-09T23:49:26.469983Z", + "iopub.status.busy": "2026-06-09T23:49:26.469574Z", + "iopub.status.idle": "2026-06-09T23:49:26.475838Z", + "shell.execute_reply": "2026-06-09T23:49:26.475122Z" + } + }, 
+ "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "run | correct | input_tokens | output_tokens | total_cost_usd | sandbox_cells\n", + "---------------------+---------+--------------+---------------+----------------+--------------\n", + "web_search only | 12/15 | 597642 | 3251 | 1.01768 | 0 \n", + "web_search + sandbox | 15/15 | 55625 | 2588 | 0.2014 | 7 \n" + ] + } + ], + "source": [ + "print_scorecard([\n", + " make_row(\"web_search only\", npm_baseline, *grade_versions(npm_baseline, NPM_TRUTH)),\n", + " make_row(\"web_search + sandbox\", npm_sandbox, *grade_versions(npm_sandbox, NPM_TRUTH)),\n", + "])" + ] + }, + { + "cell_type": "markdown", + "id": "6a1ea984", + "metadata": {}, + "source": [ + "### What did the sandbox actually do?" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "abe3e022", + "metadata": { + "execution": { + "iopub.execute_input": "2026-06-09T23:49:26.478546Z", + "iopub.status.busy": "2026-06-09T23:49:26.478392Z", + "iopub.status.idle": "2026-06-09T23:49:26.481262Z", + "shell.execute_reply": "2026-06-09T23:49:26.480910Z" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The model ran 7 sandbox cell(s).\n", + "\n", + "========================================================================\n", + "CELL 1 (exit=1, 7314 ms)\n", + "========================================================================\n", + "import pplx_sdk, json\n", + "urls=['https://www.npmjs.com/package/react','https://www.npmjs.com/package/lodash']\n", + "res=pplx_sdk.content.fetch(urls, cache_enabled=False)\n", + "for r in res:\n", + " print('URL',r.get('url'),'error',r.get('error'),'title',r.get('title'))\n", + " c=r.get('content') or ''\n", + " print(c[:2000].replace('\\n',' ')[:2000])\n", + " print('---')\n", + "--- stderr ---\n", + "Traceback (most recent call last):\n", + " File \"\", line 5, in \n", + "AttributeError: 'pplx_sdk.PageResult' object has no attribute 'get'\n", + 
"\n", + "\n", + "========================================================================\n", + "CELL 2 (exit=1, 3259 ms)\n", + "========================================================================\n", + "import pplx_sdk, json\n", + "urls=['https://www.npmjs.com/package/react','https://www.npmjs.com/package/lodash']\n", + "res=pplx_sdk.content.fetch(urls, cache_enabled=False)\n", + "for rr in res:\n", + " r=dict(rr)\n", + " print('URL',r.get('url'),'error',r.get('error'),'title',r.get('title'))\n", + " c=r.get('content') or ''\n", + " print(c[:3000].replace('\\n',' ')[:3000])\n", + " print('---')\n", + "--- stderr ---\n", + "Traceback (most recent call last):\n", + " File \"\", line 5, in \n", + "TypeError: 'pplx_sdk.PageResult' object is not iterable\n", + "\n", + "\n", + "========================================================================\n", + "CELL 3 (exit=0, 4199 ms)\n", + "========================================================================\n", + "import pplx_sdk\n", + "res=pplx_sdk.content.fetch(['https://www.npmjs.com/package/react'], cache_enabled=False)\n", + "for r in res:\n", + " print(type(r), dir(r)[:50])\n", + " print(r)\n", + " print(getattr(r,'url',None), getattr(r,'content',None)[:1000] if getattr(r,'content',None) else None)\n", + "--- stdout ---\n", + " ['__class__', '__copy__', '__deepcopy__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'author', 'content', 'description', 'error', 'from_dict', 'hostname', 'is_cached', 'is_paywall', 'published_date', 'title', 'to_dict', 'url']\n", + "PageResult(url='https://www.npmjs.com/package/react', is_paywall=False, is_cached=False, error='disallow_by_robots')\n", + "https://www.npmjs.com/package/react None\n", + "\n", + 
"\n", + "========================================================================\n", + "CELL 4 (exit=0, 11596 ms)\n", + "========================================================================\n", + "import pplx_sdk\n", + "urls=['https://registry.npmjs.org/react','https://registry.npmjs.org/lodash']\n", + "res=pplx_sdk.content.fetch(urls, cache_enabled=False)\n", + "for r in res:\n", + " print('url',r.url,'err',r.error,'title',r.title)\n", + " c=r.content or ''\n", + " print(c[:1000])\n", + " print('---')\n", + "--- stdout ---\n", + "url https://registry.npmjs.org/react err None title react\n", + "{\"_id\":\"react\",\"_rev\":\"5274-7b0e898191d468cab53eaf6fecbb183b\",\"name\":\"react\",\"dist-tags\":{\"beta\":\"19.0.0-beta-26f2496093-20240514\",\"rc\":\"19.0.0-rc.1\",\"next\":\"19.3.0-canary-d5736f09-20260507\",\"backport\":\"19.1.8\",\"latest\":\"19.2.7\",\"experimental\":\"0.0.0-experimental-900ae094-20260605\",\"canary\":\"19.3.0-canary-900ae094-20260605\"},\"versions\":{\"0.0.1\":{\"name\":\"react\",\"version\":\"0.0.1\",\"author\":{\"name\":\"Jeff Barczewski\",\"email\":\"jeff.barczewski@gmail.com\"},\"_id\":\"react@0.0.1\",\"maintainers\":[{\"name\":\"jeffbski\",\"email\":\"jeff.barczewski@gmail.com\"}],\"bugs\":{\"url\":\"http://github.com/jeffbski/react/issues\"},\"dist\":{\"shasum\":\"c84d3dbff0c65577a52f0bfe431f8bcc155fa365\",\"tarball\":\"https://registry.npmjs.org/react/-/react-0.0.1.tgz\",\"integrity\":\"sha512-rFzmt9tKNA5BxD0teT90iTceOQsDrcdtnUTyn0EMReoKiUzbEcobYxpc1WW+trpHja5HtU2HQCoDYX6gBf+9xw==\",\"signat\n", + "\n", + "========================================================================\n", + "CELL 5 (exit=0, 8173 ms)\n", + "========================================================================\n", + "import pplx_sdk\n", + "urls=['https://registry.npmjs.org/react/latest','https://registry.npmjs.org/lodash/latest']\n", + "res=pplx_sdk.content.fetch(urls, cache_enabled=False)\n", + "for r in res:\n", + " 
print('url',r.url,'err',r.error,'title',r.title)\n", + " print((r.content or '')[:1000])\n", + " print('---')\n", + "--- stdout ---\n", + "url https://registry.npmjs.org/react/latest err None title latest\n", + "{\"name\":\"react\",\"version\":\"19.2.7\",\"keywords\":[\"react\"],\"license\":\"MIT\",\"_id\":\"react@19.2.7\",\"maintainers\":[{\"name\":\"fb\",\"email\":\"opensource+npm@fb.com\"},{\"name\":\"react-bot\",\"email\":\"react-core@meta.com\"}],\"homepage\":\"https://react.dev/\",\"bugs\":{\"url\":\"https://github.com/facebook/react/issues\"},\"dist\":{\"shasum\":\"1f47a1bfc06f8ec885752c6f4af14369a9f8260b\",\"tarball\":\"https://registry.npmjs.org/react/-/react-19.2.7.tgz\",\"fileCount\":27,\"integrity\":\"sha512-HNe9WslTbXmFK8o8cmwgAeJFSBvt1bPdHCVKtaaV+WlAN36mpT4hcRpwbf3fY56ar2oIXzsBpOAiIRHAdY0OlQ==\",\"signatures\":[{\"sig\":\"MEYCIQDqvJPxxAEO8owvoxucD5ckHemMeoUolX8Q6cnirHLUegIhAKRMNrX4t77u0C/Xsqh1BIpUeKe7dAWL9sydhTUycWhd\",\"keyid\":\"SHA256:DhQ8wR5APBvFHLF/+Tc+AYvPOdTpcIDqOhxsBHRwC7U\"}],\"attestations\":{\"url\":\"https://registry.npmjs.org/-/npm/v1/attestations/react@19.2.7\",\"provenance\":{\"predi\n", + "\n", + "========================================================================\n", + "CELL 6 (exit=0, 8747 ms)\n", + "========================================================================\n", + "import pplx_sdk, json\n", + "hits=pplx_sdk.search.web('npm registry package latest version metadata', limit=5)\n", + "for h in hits:\n", + " print(h.to_dict() if hasattr(h,'to_dict') else h)\n", + "\n", + "--- stdout ---\n", + "{'url': 'https://github.com/npm/registry/blob/master/docs/responses/package-metadata.md', 'title': 'package-metadata.md - npm/registry', 'domain': 'github.com', 'summary': '# Package Metadata\\nPackage *metadata* describes a package for its consumers: who wrote it, where its repository is, and what versions of it have been published.\\nIt also contains a description of each *version* of a package present in the registry, listing 
its dependencies, giving the url of its tarball, and so on.\\nPackage metadata is useful for finding packages and for installing them.\\nYou can request *package metadata* from this endpoint:\\n`GET https://registry.npmjs.org/:package`\\nThe registry responds with a JSON-formatted string containing metadata for the package named, either in full or abbreviated form depending on what you request in the `Accept` header.\\nIf you provide no Accept header, the full document \n", + "\n", + "========================================================================\n", + "CELL 7 (exit=0, 28430 ms)\n", + "========================================================================\n", + "import pplx_sdk, json, re\n", + "packages = ['react','lodash','express','axios','webpack','typescript','eslint','next','vue','chalk','commander','jest','vite','redux','zod']\n", + "urls = [f'https://registry.npmjs.org/{p}/latest' for p in packages]\n", + "res = pplx_sdk.content.fetch(urls, cache_enabled=False)\n", + "rows=[]\n", + "for p,r in zip(packages,res):\n", + " c = r.content or ''\n", + " err = r.error\n", + " try:\n", + " data=json.loads(c)\n", + " v=data.get('version')\n", + " except Exception as e:\n", + " # maybe c has preamble? 
regex\n", + " m=re.search(r'\"version\"\\s*:\\s*\"([^\"]+)\"', c)\n", + " v=m.group(1) if m else None\n", + " rows.append({'package':p,'version':v,'url':r.url,'error':err,'title':r.title,'len':len(c)})\n", + "print(json.dumps(rows, indent=2))\n", + "# save raw minimal rows\n", + "open('/home/user/workspace/npm_latest_versions.json','w').write(json.dumps(rows, indent=2))\n", + "--- stdout ---\n", + "[\n", + " {\n", + " \"package\": \"react\",\n", + " \"version\": \"19.2.7\",\n", + " \"url\": \"https://registry.npmjs.org/react/latest\",\n", + " \"error\": null,\n", + " \"title\": \"latest\",\n", + " \"len\": 1994\n", + " },\n", + " {\n", + " \"package\": \"lodash\",\n", + " \"version\": \"4.18.1\",\n", + " \"url\": \"https://registry.npmjs.org/lodash/latest\",\n", + " \"error\": null,\n", + " \"title\": \"latest\",\n", + " \"len\": 1669\n", + " },\n", + " {\n", + " \"package\": \"express\",\n", + " \"version\": \"5.2.1\",\n", + " \"url\": \"https://registry.npmjs.org/express/latest\",\n", + " \"error\": null,\n", + " \"title\": \"latest\",\n", + " \"len\": 3508\n", + " },\n", + " {\n", + " \"package\": \"axios\",\n", + " \"version\": \"1.17.0\",\n", + " \"url\": \"https://registry.npmjs.org/axios/latest\",\n", + " \"error\": null,\n", + " \"title\": \"latest\",\n", + " \"len\": 6569\n", + " },\n", + " {\n", + " \"package\": \"webpack\",\n", + " \"version\": \"5.107.2\",\n", + " \"url\": \"https://registry.npmjs.org/webpack/latest\",\n", + " \"error\": null,\n", + " \"title\": \"latest\",\n", + " \"len\": 10011\n", + " },\n", + " {\n", + " \"package\": \"typescript\",\n", + " \n", + "\n" + ] + } + ], + "source": [ + "print_trace(npm_sandbox)" + ] + }, + { + "cell_type": "markdown", + "id": "f2b63840", + "metadata": {}, + "source": [ + "## Example 2: Usain Bolt records: `web_search` vs `sandbox`\n", + "\n", + "**Question:** four facts about Usain Bolt's world records: his 100 m and 200 m record times, and the\n", + "city and year he set them.\n", + "\n", + "These are 
quoted in countless articles, so this is the opposite case: `web_search` finds them right in\n", + "the snippets. It's a fair check on when you **don't** need the sandbox." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b39bec20", + "metadata": { + "execution": { + "iopub.execute_input": "2026-06-09T23:49:26.483295Z", + "iopub.status.busy": "2026-06-09T23:49:26.483106Z", + "iopub.status.idle": "2026-06-09T23:49:26.486562Z", + "shell.execute_reply": "2026-06-09T23:49:26.486251Z" + } + }, + "outputs": [], + "source": [ + "BOLT_PROMPT = (\n", + " \"I need four facts about Usain Bolt's athletics world records, each with a source URL: \"\n", + " \"(1) his 100 m world-record time in seconds; (2) his 200 m world-record time in seconds; \"\n", + " \"(3) the city where he set both; (4) the year he set both.\"\n", + ")\n", + "\n", + "BOLT_KEY = {\"100m\": \"9.58\", \"200m\": \"19.19\", \"city\": \"Berlin\", \"year\": \"2009\"}\n", + "\n", + "def grade_bolt(resp):\n", + " text = answer_text(resp)\n", + " checks = [v in text for v in BOLT_KEY.values()]\n", + " return sum(checks), len(checks)" + ] + }, + { + "cell_type": "markdown", + "id": "87976446", + "metadata": {}, + "source": [ + "### Run both ways" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "7fb99753", + "metadata": { + "execution": { + "iopub.execute_input": "2026-06-09T23:49:26.488673Z", + "iopub.status.busy": "2026-06-09T23:49:26.488528Z", + "iopub.status.idle": "2026-06-09T23:52:46.300654Z", + "shell.execute_reply": "2026-06-09T23:52:46.299734Z" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Submitting [web_search only] …\n", + " resp_dedce1cd-af30… -> queued\n", + " resp_dedce1cd-af30… -> in_progress\n", + " resp_dedce1cd-af30… -> completed\n", + "Submitting [web_search + sandbox] …\n", + " resp_90f8cf97-9c7a… -> queued\n", + " resp_90f8cf97-9c7a… -> in_progress\n", + " resp_90f8cf97-9c7a… -> in_progress\n", + " 
resp_90f8cf97-9c7a… -> in_progress\n", + " resp_90f8cf97-9c7a… -> in_progress\n", + " resp_90f8cf97-9c7a… -> completed\n", + "\n", + "sandbox answer:\n", + " | # | Fact requested | Answer | Source URL |\n", + "|---|---|---:|---|\n", + "| 1 | Usain Bolt’s 100 m world-record time | **9.58 seconds** | https://worldathletics.org/records/by-discipline/sprints/100-metres/all/men |\n", + "| 2 | Usain Bolt’s 200 m world-record time | **19.19 seconds** | https://worldathletics.org/records/by-discipline/sprints/200-metres/outdoor/men |\n", + "| 3 | City where he set both records | **Berlin, Germany** | 100 m: https://worldathletics.org/records/by-discipline/sprints/100-metres/all/men ; 200 m: https://worldathletics.org/records/by-discipline/sprints/200-metres/outdoor/men |\n", + "| 4 | Year he set both records | **2009** | 100 m: https://worldathletics.org/records/by-discipline/sprints/100-metres/all/men ; 200 m: https://worldathletics.org/records/by-discipline/sprints/200-metres/outdoor/men |\n" + ] + } + ], + "source": [ + "bolt_baseline = run(BOLT_PROMPT, use_sandbox=False)\n", + "bolt_sandbox = run(BOLT_PROMPT, use_sandbox=True)\n", + "\n", + "print(\"\\nsandbox answer:\\n\", answer_text(bolt_sandbox))" + ] + }, + { + "cell_type": "markdown", + "id": "70fa9fa1", + "metadata": {}, + "source": [ + "### Scorecard" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "e5d91462", + "metadata": { + "execution": { + "iopub.execute_input": "2026-06-09T23:52:46.303690Z", + "iopub.status.busy": "2026-06-09T23:52:46.303479Z", + "iopub.status.idle": "2026-06-09T23:52:46.307344Z", + "shell.execute_reply": "2026-06-09T23:52:46.306896Z" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "run | correct | input_tokens | output_tokens | total_cost_usd | sandbox_cells\n", + "---------------------+---------+--------------+---------------+----------------+--------------\n", + "web_search only | 4/4 | 16362 | 917 | 0.10204 | 0 \n", + 
"web_search + sandbox | 4/4 | 18378 | 1144 | 0.08359 | 3 \n" + ] + } + ], + "source": [ + "print_scorecard([\n", + " make_row(\"web_search only\", bolt_baseline, *grade_bolt(bolt_baseline)),\n", + " make_row(\"web_search + sandbox\", bolt_sandbox, *grade_bolt(bolt_sandbox)),\n", + "])" + ] + }, + { + "cell_type": "markdown", + "id": "d48a1039", + "metadata": {}, + "source": [ + "Both runs land all four facts at about the same cost: on a question the snippets already answer, the\n", + "sandbox buys you nothing. Reach for plain `web_search` here -- it's the simpler tool. (The sandbox\n", + "answer is still fully auditable in the trace below.)" + ] + }, + { + "cell_type": "markdown", + "id": "4d57096d", + "metadata": {}, + "source": [ + "### The sandbox's trace\n", + "\n", + "Even here it really searches the web -- `pplx_sdk.search.web` plus the usual `WebHit` self-correction -- it just isn't worth the extra steps when the snippet already had the answer." + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "673108ad", + "metadata": { + "execution": { + "iopub.execute_input": "2026-06-09T23:52:46.309484Z", + "iopub.status.busy": "2026-06-09T23:52:46.309336Z", + "iopub.status.idle": "2026-06-09T23:52:46.312364Z", + "shell.execute_reply": "2026-06-09T23:52:46.312036Z" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The model ran 3 sandbox cell(s).\n", + "\n", + "========================================================================\n", + "CELL 1 (exit=1, 3945 ms)\n", + "========================================================================\n", + "import pplx_sdk\n", + "queries=[\"Usain Bolt 100m world record 9.58 Berlin 2009 World Athletics\", \"Usain Bolt 200m world record 19.19 Berlin 2009 World Athletics\"]\n", + "for q in queries:\n", + " print('QUERY',q)\n", + " hits=pplx_sdk.search.web(q, limit=5, domains=['worldathletics.org'])\n", + " for h in hits:\n", + " print(dict(h))\n", + " print()\n", + "\n", + 
"--- stdout ---\n", + "QUERY Usain Bolt 100m world record 9.58 Berlin 2009 World Athletics\n", + "\n", + "--- stderr ---\n", + "Traceback (most recent call last):\n", + " File \"\", line 7, in \n", + "TypeError: 'pplx_sdk.WebHit' object is not iterable\n", + "\n", + "\n", + "========================================================================\n", + "CELL 2 (exit=0, 3781 ms)\n", + "========================================================================\n", + "import pplx_sdk, json\n", + "hits=pplx_sdk.search.web(\"Usain Bolt 100m world record 9.58 Berlin 2009 World Athletics\", limit=3, domains=['worldathletics.org'])\n", + "for h in hits:\n", + " print(type(h))\n", + " print(dir(h)[:30])\n", + " print(getattr(h,'__dict__',None))\n", + " print(h)\n", + "\n", + "--- stdout ---\n", + "\n", + "['__class__', '__copy__', '__deepcopy__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'date', 'domain', 'from_dict']\n", + "None\n", + "WebHit(url='https://worldathletics.org/results/world-athletics-championships/2009/12th-ia...<+71 chars>', title='100 Metres Result | 12th IAAF World Championships in ...', domain='worldathletics.org', summary='### 12th IAAF World Championships in Athletics\\nBerlin (Olympiastadion), GERMA...<+190 chars>')\n", + "\n", + "['__class__', '__copy__', '__deepcopy__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate\n", + "\n", + "========================================================================\n", + "CELL 3 (exit=0, 32067 ms)\n", + "========================================================================\n", + "import pplx_sdk\n", + 
"urls=['https://worldathletics.org/records/by-discipline/sprints/100-metres/all/men','https://worldathletics.org/records/by-discipline/sprints/200-metres/outdoor/men','https://worldathletics.org/news/news/bolt-again-958-world-record-in-berlin-updat']\n", + "res=pplx_sdk.content.fetch(urls, cache_enabled=False)\n", + "for r in res:\n", + " print('URL',r.url)\n", + " print('title',r.title)\n", + " print('error',r.error)\n", + " print((r.content or '')[:1000].replace('\\n',' ') )\n", + " print('---')\n", + "\n", + "--- stdout ---\n", + "URL https://worldathletics.org/records/by-discipline/sprints/100-metres/all/men\n", + "title Stats | World Athletics | World Athletics\n", + "error None\n", + "Win A trip for 2 with flights to Copenhagen, accommodation, half marathon bib and full ASICS kit included ENTER NOW|Type|Progression|Mark|Wind|Competitor|DOB|Venue|Date| |--|--|--|--|--|--|--|--| |World Records|9.58|+0.9|Usain BOLT|21 AUG 1986|JAM|Olympiastadion, Berlin (GER)|16 AUG 2009| |World Championships in Athletics Records|9.58|+0.9|Usain BOLT|21 AUG 1986|JAM|Olympiastadion, Berlin (GER)|16 AUG 2009| |World Championships Combined Best Performances|10.23|-0.4|Ashton EATON|21 JAN 1988|USA|National Stadium, Beijing (CHN)|28 AUG 2015| |World Combined Best Performances|10.12|+0.9|Damian WARNER|04 NOV 1989|CAN|Mösle-Stadium, Götzis (AUT)|25 MAY 2019| |World Leading 2026|9.84|+0.7|Kayinsola AJAYI|14 SEP 2004|NGR|Un. of Kentucky Outdoor Track Facility\n", + "\n" + ] + } + ], + "source": [ + "print_trace(bolt_sandbox)" + ] + }, + { + "cell_type": "markdown", + "id": "7b8df8c8", + "metadata": {}, + "source": [ + "## Takeaways\n", + "\n", + "1. **Use `web_search` when the answer is written down.** For well-documented facts the sandbox adds no\n", + " accuracy and no real cost saving, so the simpler tool wins -- Example 2.\n", + "2. 
**Reach for `sandbox` the moment it isn't.** Both tools search the same web, but the sandbox does it\n", + " as code (search, fetch, parse), so the bulky page content never lands in the model's context. When\n", + " a fact lives in metadata or a long page, that's cheaper *and* more accurate -- Example 1.\n", + "3. **The sandbox answer is auditable.** Every value traces back to a page it searched for and read,\n", + " right there in the trace.\n", + "\n", + "**Try next:** swap in your own snippet-hostile question -- a number in a PDF table, a clause in a\n", + "filing, a value behind a lookup -- and watch the two scorecard rows diverge.\n", + "\n", + "> [Perplexity Research: *Rethinking Search as Code Generation*](https://research.perplexity.ai/articles/rethinking-search-as-code-generation)" + ] + }, + { + "cell_type": "markdown", + "id": "4575deb2", + "metadata": {}, + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.7" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}