docs(mcp): add control-surface fallback guidance (CDP/iPhone/screen)#1352
Andrew Gazelka (andrewgazelka) wants to merge 1 commit into
When a system has no programmatic integration (or a scope-limited one), the kernel can still drive it through a control surface. Document the three bundled surfaces and recommend CDP first (structured DOM, selector clicks) over screen/iphone pixel control.
AI review found issues in this pull request.
Verdict: patch is incorrect
Confidence: 0.86
The added guidance is runtime-facing model instruction text, and it includes non-awaited async browser calls that would not execute when copied into the notebook kernel. That makes the patch incorrect until the examples are fixed.
- P2 `packages/mcp/ix_notebook_mcp/guide.py:186`: Await async browser helpers in the control guide

"(`import browser`, then `await browser.goto(url)` / `browser.vdom()` / `browser.read()` / "
"`browser.shot()`), the iPhone over WebDriverAgent (`import iphone`: `iphone.tap`, "
The new control-surface recipe shows `browser.vdom()`, `browser.read()`, and `browser.shot()` without `await`, but these helpers are all async functions. In this kernel, a bare trailing coroutine is captured as the cell result rather than executed, so an agent following this guidance would get a coroutine repr instead of the DOM, readout, or screenshot, and the verification/control step would silently not happen. The example should await each browser helper, matching the browser module's own docs.
Suggested change, from:

"(`import browser`, then `await browser.goto(url)` / `browser.vdom()` / `browser.read()` / "
"`browser.shot()`), the iPhone over WebDriverAgent (`import iphone`: `iphone.tap`, "

to:

"(`import browser`, then `await browser.goto(url)` / `await browser.vdom()` / `await browser.read()` / "
"`await browser.shot()`), the iPhone over WebDriverAgent (`import iphone`: `iphone.tap`, "
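The failure mode described in this comment can be reproduced outside the kernel. The following is a minimal sketch, assuming a hypothetical `vdom` coroutine that stands in for `browser.vdom` (the real helper and its return value are not shown in this PR). It only illustrates that calling an async function without `await` produces a coroutine object and never runs the function body.

```python
import asyncio
import inspect

calls = []  # records each time the stand-in helper body actually runs


async def vdom():
    """Hypothetical stand-in for browser.vdom(); not the real helper."""
    calls.append("vdom")
    return "<html>...</html>"


async def main():
    pending = vdom()  # no await: a coroutine object is created, the body has not run
    was_coroutine = inspect.iscoroutine(pending)
    ran_before_await = bool(calls)
    pending.close()  # discard it cleanly so Python does not warn "never awaited"

    dom = await vdom()  # awaited: the body runs and the value is returned
    return was_coroutine, ran_before_await, dom


was_coroutine, ran_before_await, dom = asyncio.run(main())
print(was_coroutine, ran_before_await, dom)
```

Run as a plain script, this should show that the un-awaited call produced a coroutine without any side effect, while the awaited call returned the value. In the notebook kernel the same distinction would apply to the final expression of a cell.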
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3d9524127c
"(`import browser`, then `await browser.goto(url)` / `browser.vdom()` / `browser.read()` / "
"`browser.shot()`), the iPhone over WebDriverAgent (`import iphone`: `iphone.tap`, "
"`iphone.swipe`, `iphone.screenshot`), and the Mac itself by screen capture plus synthetic "
Prefix async control examples with await
In a `python_exec` cell, calls like `browser.vdom()` or `iphone.tap(...)` are captured as the final expression's value and are not executed unless they are awaited; they render as coroutine objects instead. Since `browser.vdom`/`read`/`shot` and `iphone.tap`/`swipe`/`screenshot` are all async functions, a user following these new examples will get no DOM read, screenshot, tap, or swipe, so these snippets should include `await`, just like the preceding `browser.goto(url)` example.
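One way to keep guide text like this from regressing is a small text check over the guide string. The sketch below assumes the helper names listed in the review comments are all async and that call examples are written with parentheses. `ASYNC_HELPERS`, `unawaited_calls`, `GUIDE_BEFORE`, and `GUIDE_AFTER` are hypothetical names for illustration, not part of the repository.

```python
import re

# Helper names taken from the review comments; assumed to all be async.
ASYNC_HELPERS = {
    "browser": ("goto", "vdom", "read", "shot"),
    "iphone": ("tap", "swipe", "screenshot"),
}


def unawaited_calls(text):
    """Return call-style mentions such as `browser.vdom(` not preceded by `await `."""
    found = []
    for module, names in ASYNC_HELPERS.items():
        # Negative lookbehind rejects matches that directly follow "await ".
        pattern = rf"(?<!await )\b{module}\.({'|'.join(names)})\("
        for match in re.finditer(pattern, text):
            found.append(f"{module}.{match.group(1)}")
    return found


# Guide fragment as quoted in the review, before and after the suggested fix.
GUIDE_BEFORE = (
    "(`import browser`, then `await browser.goto(url)` / `browser.vdom()` / "
    "`browser.read()` / `browser.shot()`)"
)
GUIDE_AFTER = (
    "(`import browser`, then `await browser.goto(url)` / `await browser.vdom()` / "
    "`await browser.read()` / `await browser.shot()`)"
)

print(unawaited_calls(GUIDE_BEFORE))
print(unawaited_calls(GUIDE_AFTER))
```

The check only flags call-style mentions, so bare names such as `iphone.tap` (written without parentheses in the quoted guide text) pass through; whether those should also carry `await` is a wording decision for the guide.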
Blast radius
Rebuilt checks by category:

| Category | Rebuilt checks |
| --- | --- |
| rust | 42 |
| image | 15 |
| site | 2 |
| agent | 1 |
| blast | 1 |
| eval | 1 |
| lint | 1 |

Flowchart nodes: `ix-notebook-mcp-module`, `blast-radius-test`, `agent-skills`, `lint`, `rust-mcp.viewSmoke`, `site-test`.

Flowchart edges shown from `ix-notebook-mcp-module`: `agent-skills`, `eval`, `image-development-base`, `image-kernel-dev`, `image-minecraft`.
changed checks (63)
Adds a `CONTROL` section to the kernel guide: when a system has no programmatic integration (or a scope-limited one), the kernel can still drive it through a bundled control surface. Documents the three surfaces (`browser` over CDP, `iphone` over WDA, `screen` for the Mac) and recommends CDP first (structured DOM + selector clicks) over pixel-based `screen`/`iphone` control.

Motivated by a real session where a scope-limited Slack token blocked the API path and the fallback to a control surface was not signposted in the guide.
(sent by an AI agent via Claude Code)
Note
Add control surface fallback guidance for CDP, iPhone, and screen capture to kernel guide
Adds a `CONTROL` constant to `guide.py` describing the three available control surfaces (Chromium via CDP, iPhone via WebDriverAgent, Mac screen capture) and when to use each. The constant is inserted into the `_KERNEL_GUIDE` block in `tools.py` between the `VERIFY` and `RESULT_SPLIT` sections, so agents receive this guidance at runtime.

Macroscope summarized 3d95241.