feat(mcp): add hotels cross-site hotel search module#1318
feat(mcp): add hotels cross-site hotel search module#1318Andrew Gazelka (andrewgazelka) wants to merge 6 commits into
hotels cross-site hotel search module#1318Conversation
Search and cross-compare hotels across Expedia, Google Hotels, Booking.com and
Kayak into polars, by driving the user's already-running browser over the Chrome
DevTools Protocol (hotel sites challenge fresh/headless browsers).
- hotels.search(where, check_in=, check_out=, amenities=, sort=) -> typed
SearchResult whose .df is a polars frame, cheapest first.
- hotels.compare(...) fans out across every site concurrently and aligns the
same hotel (fuzzy name match) into .matches showing the cheapest per site;
.errors maps any challenged site to its reason.
- hotels.amenities(...) lists Expedia's filterable amenities + live counts.
Pluggable backends (backends/{expedia,google,booking,kayak}.py) each parse their
site's cards into one pydantic Hotel; canonical site-independent amenity keys map
to each site's filter (Expedia amenity URL tokens verified live). Public
listings only, so no incognito gate.
Transport is raw CDP (httpx + websockets), isolated in hotels/cdp.py, because
Playwright's pinned driver cannot currently handshake with very recent Chrome;
it can move onto the browser module once versions realign (tracked separately).
Registered in registry.py (surfaces in api() + server instructions), bundled in
default.nix with an import smoke test + network-free unit tests, documented in
docs/mcp/tool-providers/overview.md.
(authored by Claude Code)
Blast radius
2 added, 0 removed pie showData title Rebuilt checks by category
"rust" : 43
"image" : 15
"site" : 2
"agent" : 1
"blast" : 1
"eval" : 1
"lint" : 1
flowchart LR
c0["ix-mcp-hotels-python-module"]
c1["ix-notebook-mcp-module"]
c2["blast-radius-test"]
c3["agent-skills"]
c4["lint"]
c5["rust-mcp.viewSmoke"]
c0 --> k0["agent-skills"]
c0 --> k2["eval"]
c0 --> k3["image-development-base"]
c0 --> k4["image-kernel-dev"]
c0 --> k5["image-minecraft"]
c1 --> k0["agent-skills"]
c1 --> k2["eval"]
c1 --> k3["image-development-base"]
c1 --> k4["image-kernel-dev"]
c1 --> k5["image-minecraft"]
changed checks (62)
|
…l_until stall CI flake-check caught two real issues the local nix env couldn't build: - ruff (repo-wide ANN/SIM/UP/PT/...): add `-> None` to tests, split a compound assert, drop quoted annotations (UP037), dict.fromkeys (C420), remove now- unused noqa, type `CDP.eval -> object` (ANN401). - cdp.scroll_until: the stall branch (`n == last`) was dead, so a filtered/thin result set scrolled for all rounds (~14s) and could pull in the non-matching suggestion cards. Now breaks on reaching limit OR a stalled count. (Also independently flagged BLOCK by the review-changes adversarial pass.) - hotelsTests interpreter was missing httpx + websockets (hotels imports them at import time), so collection failed with ModuleNotFoundError. (authored by Claude Code)
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a809040a1c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
| async def compare( | ||
| where: str, *, check_in: str | _dt.date, check_out: str | _dt.date, | ||
| adults: int = 2, sort: str = "price", amenities: Sequence[str] = (), | ||
| limit: int = 50, sites: Sequence[str] = ALL_SITES, |
There was a problem hiding this comment.
Reject unfiltered amenity comparisons
When a caller passes amenities to compare() without overriding sites, this default still fans out to ALL_SITES, but GoogleBackend.search and KayakBackend.search never read q.amenities and return unfiltered results. That lets an amenity-constrained comparison mark a Google/Kayak property as cheapest even though it was never filtered for the requested amenity; unsupported backends should be skipped/reported as errors or implement the filter before being included.
Useful? React with 👍 / 👎.
| hotelsTestPython = pkgs.python3.withPackages (ps: [ | ||
| ps.pytest | ||
| ps.polars | ||
| ps.pydantic | ||
| hotelsModule |
There was a problem hiding this comment.
This bare pkgs.python3.withPackages test environment includes pytest, polars, pydantic, and hotelsModule, but import hotels imports hotels.cdp, which imports httpx and websockets. Since hotelsModule is a runCommand copy with no propagatedBuildInputs, the hotelsTests derivation will fail during collection unless those packages are added here or the test reuses mcpPython.
Useful? React with 👍 / 👎.
| hotels = [h for r in results for h in r["hotels"]] | ||
| totals = {r["site"]: r["total"] for r in results if r["total"] is not None} |
There was a problem hiding this comment.
When the default Expedia search fails, such as when no debug browser is listening or a site challenge/layout change raises in the backend, _run converts that failure to r['error'], but search() only flattens r['hotels'] and totals and returns a SearchResult with no errors field. The common single-site await hotels.search(...) path therefore reports an empty dataframe with no hint that the search failed; re-raise single-site failures or carry errors like compare().
Useful? React with 👍 / 👎.
| return n | ||
| await asyncio.sleep(1.0) | ||
| waited += 1.0 | ||
| return 0 |
There was a problem hiding this comment.
Treat selector timeouts as backend errors
If a site's DOM changes or a challenge page does not put the marker in the URL/title, the awaited selector never appears and this returns 0; every backend ignores that return value and continues to extract an empty list, so compare().errors stays empty even though the backend failed to load usable results. Raise a timeout/BotWall-style exception here so the existing _run error path can report the failed site instead of silently producing incomplete comparisons.
Useful? React with 👍 / 👎.
| return ( | ||
| "https://www.google.com/travel/search?" | ||
| f"q=hotels%20in%20{quote(q.where)}&" | ||
| f"checkin={q.check_in.isoformat()}&checkout={q.check_out.isoformat()}" |
There was a problem hiding this comment.
Include adult count in Google searches
For compare(..., adults=3) or any non-default occupancy, the Google backend still builds the same URL because it never includes q.adults, while the other backends do include the requested adult count. That mixes Google prices for the site's default occupancy with prices from the requested occupancy on Expedia/Booking/Kayak, so cross-site cheapest results can be wrong for groups.
Useful? React with 👍 / 👎.
|
|
||
| hotels: list[Hotel] = [] | ||
| for c in cards: | ||
| total = money(c.get("price")) |
There was a problem hiding this comment.
Don't concatenate discounted Booking prices
For Booking listings that show both an original and discounted price in the price-and-discounted-price element, money() strips non-digits from the whole string, so a value like $300 $250 becomes 300250 before the nightly split. That makes discounted properties look thousands of dollars more expensive; extract the individual currency amounts and choose the current/lowest one instead of parsing the combined text.
Useful? React with 👍 / 👎.
| if not q.amenities: | ||
| await cdp.scroll_until(_CARD, limit=q.limit) |
There was a problem hiding this comment.
Scroll filtered Expedia results too
When an Expedia amenity filter is common, such as wifi, pool, or parking, there can be many genuine matching cards, but this branch skips scroll_until entirely whenever any amenity is present. The search then returns only the initially rendered batch instead of up to q.limit cheapest matches; since the extractor already stops at the non-matching divider, filtered searches should still scroll until the real result list is exhausted or the requested limit is reached.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
AI review found issues in this pull request.
Verdict: patch is incorrect
Confidence: 0.87
The new hotels tool can return materially incorrect filtered comparisons and silently treats several backend failure modes as successful empty searches.
- P1
packages/mcp/src/hotels/hotels/__init__.py:231Filtered comparisons include unfiltered backends - P2
packages/mcp/src/hotels/hotels/__init__.py:148Single-site search silently drops backend failures - P2
packages/mcp/src/hotels/hotels/cdp.py:127Selector timeouts are reported as empty results
| async def compare( | ||
| where: str, *, check_in: str | _dt.date, check_out: str | _dt.date, | ||
| adults: int = 2, sort: str = "price", amenities: Sequence[str] = (), | ||
| limit: int = 50, sites: Sequence[str] = ALL_SITES, | ||
| endpoint: str = DEFAULT_ENDPOINT, timeout: float = 30.0, | ||
| ) -> Comparison: | ||
| """Search every site in ``sites`` concurrently and cross-compare the prices. | ||
|
|
||
| Returns a :class:`Comparison`: ``.df`` stacks all rows (tagged by ``site``, | ||
| cheapest first), ``.matches`` aligns the same hotel across sites with the | ||
| cheapest one flagged, and ``.errors`` maps any failed site to its reason. | ||
| """ | ||
| q, results = await _fanout( | ||
| where, check_in=check_in, check_out=check_out, adults=adults, sort=sort, | ||
| amenities=amenities, limit=limit, sites=sites, endpoint=endpoint, timeout=timeout, |
There was a problem hiding this comment.
Filtered comparisons include unfiltered backends
compare() defaults to every registered backend even when amenities is set, but the Google and Kayak backends never apply or reject q.amenities. As a result, compare(..., amenities=['washer']) can include unfiltered Google/Kayak rows and mark one as the cheapest hotel even though it may not satisfy the requested filter. Either exclude unsupported backends for filtered queries or make those backends return an error instead of rows until they can honor the filter.
| hotels = [h for r in results for h in r["hotels"]] | ||
| totals = {r["site"]: r["total"] for r in results if r["total"] is not None} | ||
| return SearchResult( | ||
| where=where, check_in=q.check_in, check_out=q.check_out, adults=adults, | ||
| sort=sort, amenities=list(q.amenities), sites=list(sites), | ||
| total_found=totals, hotels=hotels, | ||
| ) |
There was a problem hiding this comment.
Single-site search silently drops backend failures
_run() converts CDP and scraper failures into an error field, but search() only collects hotels and total and discards that error information. The default await hotels.search(...) therefore returns an empty SearchResult when the browser is not listening, a bot wall is hit, or the scraper fails, making failure indistinguishable from genuinely no hotels. search() should raise or expose errors the same way compare() does.
| n = await self.count(selector) | ||
| if n: | ||
| return n | ||
| await asyncio.sleep(1.0) | ||
| waited += 1.0 | ||
| return 0 |
There was a problem hiding this comment.
Selector timeouts are reported as empty results
When the expected result-card selector never appears, wait_for() returns 0; every backend ignores that return value and continues to parse, which produces empty rows without a cmp.errors entry. A layout change, slow page, or blocked load is then indistinguishable from a valid empty search, and the advertised per-site error reporting fails closed only for explicit bot markers. This should raise a timeout error or otherwise signal failure to the caller.
The ai-review gate flagged the patch incorrect (filtered comparisons could include unfiltered backends; several failures looked like empty results). Fixes: - Backends declare `supported_amenities`; `_run` skips (and reports in `.errors`) any backend asked for an amenity it can't filter, so a filtered `compare()` never includes unfiltered Google/Kayak/Booking rows. Only Expedia filters (verified URL tokens). (P1) - `SearchResult` carries `.errors`; `search()` populates it and RAISES when every requested site failed, so single-site `search()` surfaces a dead browser / bot wall instead of a silently empty frame. (P2) - `CDP.wait_for` raises `TimeoutError` on a missing selector instead of returning 0, so a layout change / slow load is reported as a backend error, not empty rows. (P2) - Booking: take the discounted (lowest) amount from "$300 $250" cells instead of concatenating to 300250; drop the unreliable facility-click path. (P2) - Google: pass `adults` so occupancy matches the other backends. (P2) - Expedia: scroll filtered results too (extractor already stops at the divider), so common-amenity searches return up to `limit` matches. (P2) - Backend protocol methods use docstring bodies, not no-op `...` (CodeQL). Adds unit tests for the amenity-skip, discounted-price, and capability invariants. (authored by Claude Code)
- cdp.wait_for: one combined CDP eval per poll (url+title+count) instead of three round-trips; compare() polls 4 tabs concurrently so this adds up. - models.py: fix stale doc cross-reference (_AMENITY_LABELS -> expedia._AMENITY_TOKENS). review-changes verdict was approve-with-fixes (0 blockers); these clear the two low warnings. (authored by Claude Code)
There was a problem hiding this comment.
AI review found issues in this pull request.
Verdict: patch is incorrect
Confidence: 0.86
The feature is mostly self-contained, but the new CDP client can fail in the same environment the existing browser probe explicitly handles, and the public examples are already stale enough to fail for users.
- P1
packages/mcp/src/hotels/hotels/cdp.py:54Local CDP probe still inherits broken SSL/proxy environment - P2
packages/mcp/ix_notebook_mcp/registry.py:163Published examples use dates that are already invalid
| await self.close() | ||
|
|
||
| async def open(self) -> None: | ||
| async with httpx.AsyncClient() as client: |
There was a problem hiding this comment.
Local CDP probe still inherits broken SSL/proxy environment
This new raw CDP path constructs a default httpx client for the same local /json/version probe that the existing browser helper had to harden. In Nix sessions, SSL_CERT_FILE can point at a path that does not exist, and httpx builds its SSL configuration at client construction even though this request is plain http://127.0.0.1; the result is that hotels.search() reports that no browser is listening before it ever reaches the local debug port. Opt out of ambient env handling for this local probe, or mirror the browser helper's non-TLS client setup.
| async with httpx.AsyncClient() as client: | |
| async with httpx.AsyncClient(trust_env=False) as client: |
| "`await hotels.search('San Francisco', check_in='2026-06-17', check_out='2026-06-18', " | ||
| "amenities=['washer'], sort='price')` returns one row per hotel (name, nightly/total price, " |
There was a problem hiding this comment.
Published examples use dates that are already invalid
The registry text is the copy-paste help shown to users, but it documents check_in='2026-06-17' and check_out='2026-06-18'. As of June 18, 2026, that check-in date is already in the past, so the first documented call is rejected or rewritten by hotel sites before exercising the new tool. Use relative-date setup or placeholders instead of fixed dates, and update the matching module docstring examples too.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3db4f01eb1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
| async def eval(self, expression: str, *, await_promise: bool = False) -> object: | ||
| result = await self.send( | ||
| "Runtime.evaluate", | ||
| {"expression": expression, "returnByValue": True, "awaitPromise": await_promise}, | ||
| session=True, | ||
| ) | ||
| return result.get("result", {}).get("value") |
There was a problem hiding this comment.
Surface JavaScript extraction failures
When a page still has the expected card selector but one of the backend extractor expressions throws after a site DOM change, Chrome returns exceptionDetails from Runtime.evaluate rather than a useful value. This path currently converts that to None, so the backends treat the site as a successful empty result instead of letting _run populate .errors; raise on exceptionDetails here so comparisons do not silently omit a broken backend.
Useful? React with 👍 / 👎.
| async with CDP(endpoint) as cdp: | ||
| options = await backend.amenities(cdp, q) |
There was a problem hiding this comment.
Skip CDP for unsupported amenity catalogs
For site="google", "booking", or "kayak", the backend amenities() implementations just return [] and the docstring promises non-Expedia sites return empty, but this wrapper opens a CDP browser tab before dispatching. In the common case where no debug browser is listening, await hotels.amenities(..., site="booking") raises no browser is listening instead of returning the documented empty frame; return the empty schema before entering CDP when the backend has no supported amenity catalog.
Useful? React with 👍 / 👎.
| name = (raw.get("name") or "").strip() | ||
| # Expedia uses "Photo gallery for <name>" as a figure caption that can win | ||
| # the title lookup; strip it back to the real name. | ||
| name = re.sub(r"^Photo gallery for\s+", "", name).strip() |
There was a problem hiding this comment.
Reject cards without hotel names
If a site's title selector changes while the card and price selectors still match, raw.get("name") becomes None and this coerces it to an empty string instead of dropping the card or reporting the backend as broken. In compare(), those blank-name rows normalize to the same key, so unrelated nameless cards from different sites can be clustered together and assigned a bogus cheapest site; treat a missing/empty name as an extraction failure or skip that card.
Useful? React with 👍 / 👎.
…ale examples Addresses the AI-review findings on the prior head: - cdp.open(): probe the local CDP endpoint with httpx verify=False + trust_env=False and connect the websocket with proxy=None, mirroring the `browser` module's hardening. The default verifying client eagerly loads $SSL_CERT_FILE (missing in a minimal sandbox -> FileNotFoundError) and an ambient $HTTP(S)_PROXY would misroute the localhost request. (P1) - Replace the concrete `2026-06-17/18` example dates in the registry tagline and module docstrings with `YYYY-MM-DD` placeholders so the published examples never go stale / invalid. (P2) (authored by Claude Code)
There was a problem hiding this comment.
AI review found issues in this pull request.
Verdict: patch is incorrect
Confidence: 0.86
The new hotels provider can silently return empty comparison/results for total backend failure or JavaScript extraction failure, which undermines the correctness of its primary API.
- P1
packages/mcp/src/hotels/hotels/__init__.py:262Raise when every compare backend failed - P2
packages/mcp/src/hotels/hotels/cdp.py:111Propagate JavaScript extraction failures
| hotels = [h for r in results for h in r["hotels"]] | ||
| totals = {r["site"]: r["total"] for r in results if r["total"] is not None} | ||
| errors = {r["site"]: r["error"] for r in results if r["error"]} | ||
| result = SearchResult( | ||
| where=where, check_in=q.check_in, check_out=q.check_out, adults=adults, | ||
| sort=sort, amenities=list(q.amenities), sites=list(sites), | ||
| total_found=totals, errors=errors, hotels=hotels, | ||
| ) | ||
| return Comparison(result=result, errors=errors) |
There was a problem hiding this comment.
Raise when every compare backend failed
compare() converts per-site failures into errors, but unlike search() it never fails when all requested sites failed. With the default sites, a dead CDP browser or universal bot wall returns an empty Comparison.df and renders as an empty table unless the caller remembers to inspect cmp.errors, which is indistinguishable from no hotel availability in the common notebook path.
| hotels = [h for r in results for h in r["hotels"]] | |
| totals = {r["site"]: r["total"] for r in results if r["total"] is not None} | |
| errors = {r["site"]: r["error"] for r in results if r["error"]} | |
| result = SearchResult( | |
| where=where, check_in=q.check_in, check_out=q.check_out, adults=adults, | |
| sort=sort, amenities=list(q.amenities), sites=list(sites), | |
| total_found=totals, errors=errors, hotels=hotels, | |
| ) | |
| return Comparison(result=result, errors=errors) | |
| hotels = [h for r in results for h in r["hotels"]] | |
| totals = {r["site"]: r["total"] for r in results if r["total"] is not None} | |
| errors = {r["site"]: r["error"] for r in results if r["error"]} | |
| if errors and len(errors) == len(results): | |
| raise RuntimeError( | |
| "hotel comparison failed on every site: " | |
| + "; ".join(f"{s}: {e}" for s, e in errors.items()) | |
| ) | |
| result = SearchResult( | |
| where=where, check_in=q.check_in, check_out=q.check_out, adults=adults, | |
| sort=sort, amenities=list(q.amenities), sites=list(sites), | |
| total_found=totals, errors=errors, hotels=hotels, | |
| ) | |
| return Comparison(result=result, errors=errors) |
| async def eval(self, expression: str, *, await_promise: bool = False) -> object: | ||
| result = await self.send( | ||
| "Runtime.evaluate", | ||
| {"expression": expression, "returnByValue": True, "awaitPromise": await_promise}, | ||
| session=True, | ||
| ) | ||
| return result.get("result", {}).get("value") |
There was a problem hiding this comment.
Propagate JavaScript extraction failures
Runtime.evaluate reports thrown JavaScript via exceptionDetails inside a successful response, not as a top-level CDP error. This wrapper ignores that field and returns None, and every backend treats await cdp.eval(_EXTRACT) or [] as a successful zero-row scrape. A scraper bug or site markup change can therefore silently produce empty hotel results with no errors entry.
| async def eval(self, expression: str, *, await_promise: bool = False) -> object: | |
| result = await self.send( | |
| "Runtime.evaluate", | |
| {"expression": expression, "returnByValue": True, "awaitPromise": await_promise}, | |
| session=True, | |
| ) | |
| return result.get("result", {}).get("value") | |
| async def eval(self, expression: str, *, await_promise: bool = False) -> object: | |
| result = await self.send( | |
| "Runtime.evaluate", | |
| {"expression": expression, "returnByValue": True, "awaitPromise": await_promise}, | |
| session=True, | |
| ) | |
| if "exceptionDetails" in result: | |
| details = result["exceptionDetails"] | |
| exception = details.get("exception", {}) | |
| description = ( | |
| exception.get("description") | |
| or exception.get("value") | |
| or details.get("text") | |
| or "JavaScript evaluation failed" | |
| ) | |
| raise RuntimeError(f"CDP Runtime.evaluate failed: {description}") | |
| return result.get("result", {}).get("value") |
Addresses the remaining AI-review findings: - compare() now raises when EVERY backend failed (mirrors search()), instead of returning a silently empty Comparison. (P1) - cdp.eval() inspects Runtime.evaluate exceptionDetails and raises on an in-page JavaScript error rather than returning None, so a broken extraction script is reported as a backend error (via _run -> .errors) instead of empty results. (P2) (authored by Claude Code)
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
There was a problem hiding this comment.
AI review found issues in this pull request.
Verdict: patch is incorrect
Confidence: 0.83
The new module can produce materially incorrect comparison output: prices are all labeled/ranked as USD even when the browser returns another currency, and the hotel matching heuristic can merge distinct properties.
- P1
packages/mcp/src/hotels/hotels/models.py:107Hard-coded USD lets comparisons rank mixed currencies - P2
packages/mcp/src/hotels/hotels/__init__.py:84Name-only matching can merge distinct hotels
| return cls.model_validate( | ||
| { | ||
| "site": site, | ||
| "name": name, | ||
| "price_per_night": nightly, | ||
| "total_price": money(raw.get("total")), | ||
| "currency": raw.get("currency") or "USD", | ||
| "area": (raw.get("area") or None), |
There was a problem hiding this comment.
Hard-coded USD lets comparisons rank mixed currencies
Every backend passes only numeric prices into Hotel.from_card and none supplies raw["currency"], so this path labels every scraped price as USD. Because the scraper drives the user's existing browser, different hotel sites can use saved locale or currency preferences; compare() can then sort raw amounts from different currencies and report the cheapest site incorrectly. Parse/force a single currency per backend before constructing Hotel rows.
| _STOPWORDS = { | ||
| "hotel", "hotels", "the", "a", "an", "inn", "suites", "suite", "and", | ||
| "by", "at", "resort", "motel", "hostel", | ||
| } | ||
|
|
||
|
|
||
| def _norm(name: str) -> str: | ||
| tokens = re.sub(r"[^a-z0-9 ]", " ", (name or "").lower()).split() | ||
| return " ".join(t for t in tokens if t not in _STOPWORDS) |
There was a problem hiding this comment.
Name-only matching can merge distinct hotels
The normalizer drops property-type words such as hotel and inn before the fuzzy identity check. Distinct properties like Hotel Union Square San Francisco and Inn at Union Square San Francisco normalize to the same key, so matches can cluster different hotels and mark one site's price as the cheapest for the wrong property. Matching needs a stronger identifier, address/area validation, or less destructive normalization.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ce79334a42
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
| return ( | ||
| f"https://www.kayak.com/hotels/{_slug(q.where)}/" | ||
| f"{q.check_in.isoformat()}/{q.check_out.isoformat()}/{q.adults}adults" | ||
| "?sort=price_a" |
There was a problem hiding this comment.
Honor non-price Kayak sort requests
When a caller requests sort='rating' or sort='recommended' with sites=['kayak'] and a finite limit, this URL still asks Kayak for ascending price results, unlike the Query contract and the other mapped backends. That means the rows are drawn from the cheapest page rather than the requested ranking, so either map q.sort here or report the sort as unsupported instead of silently returning the wrong result set.
Useful? React with 👍 / 👎.
| "https://www.google.com/travel/search?" | ||
| f"q=hotels%20in%20{quote(q.where)}&" | ||
| f"checkin={q.check_in.isoformat()}&checkout={q.check_out.isoformat()}&" | ||
| f"adults={q.adults}" | ||
| ) |
There was a problem hiding this comment.
Reject non-default Google sorts
When a caller requests sort='rating' or sort='recommended' with sites=['google'] or the default compare(), this backend never includes or otherwise handles q.sort, so Google contributes its default-ranked first page while the other sortable backends are constrained to the requested ranking. With a finite limit, that can make a rating/recommended comparison use the wrong Google candidates; either map the sort here or report unsupported non-default sorts instead of silently mixing rankings.
Useful? React with 👍 / 👎.
| "name": name, | ||
| "price_per_night": nightly, | ||
| "total_price": money(raw.get("total")), | ||
| "currency": raw.get("currency") or "USD", |
There was a problem hiding this comment.
Preserve the scraped price currency
When the reused browser or hotel-site account is configured for a non-USD currency, the backends still pass only digits into Hotel.from_card and this default labels every parsed amount as USD. For example Booking's extractor accepts the digits from a €250 price cell, after which compare() can rank that value against dollar-denominated rows as if it were $250; either force USD in every backend URL/session or parse and carry the displayed currency instead of defaulting here.
Useful? React with 👍 / 👎.
What
A new bundled kernel module,
hotels, that searches and cross-compares hotels across Expedia, Google Hotels, Booking.com and Kayak into polars, by driving the user's already-running browser over the Chrome DevTools Protocol. (Hotel sites bot-wall fresh/headless browsers, so reusing the real, signed-in browser is the trick.)Why
Came out of a "look on Expedia for the cheapest SF hotel tonight with a washing machine" request. Driving the browser by hand was fragile; this makes live, filtered, cross-site hotel search a one-liner that returns polars like the other providers (
x,slack,view). The singleregistry.pyrow also surfaces it inapi()and the server instructions, so the next agent reaches for it.Design
flowchart TD A["hotels.search / compare / amenities"] --> B["backends.Query"] B --> E[expedia] B --> G[google] B --> K[booking] B --> Y[kayak] E & G & K & Y --> C["CDP (raw, hotels/cdp.py)"] C --> R["running browser :9222"] E & G & K & Y --> H["pydantic Hotel (models.py)"] H --> P["polars .df / .matches"]backends/{expedia,google,booking,kayak}.py) each parse their site's rendered cards into one pydanticHotel. A failing/challenged backend contributes 0 rows + an entry in.errorsrather than sinking the whole comparison.washer,kitchen, ...) map to each site's filter; Expedia's amenity URL tokens (WASHER_DRYER,KITCHEN_KITCHENETTE, ...) were verified against the live site.compare(...).matchesfuzzy-matches normalized names across sites (stdlibdifflib, no new dep) and flags the cheapest per hotel.x/slack).Transport note
Drives raw CDP (httpx + websockets) rather than the
browsermodule: Playwright's pinned driver currently cannotconnect_over_cdpto very recent Chrome builds (Browser context management is not supported). Isolated behindhotels/cdp.pyso it can move ontobrowseronce the driver/browser versions realign. Filed separately. Addsps.websocketsto the interpreter (declared directly).Verification
packages/mcp/tests/test_hotels.py, 8/8 pass..matches.ruff checkclean;nixevaluation valid.Note
The local nix store on my machine can't build the Rust toolchain (store-write
Permission deniedon ca-derivations), so I could not run thenixsmoke builds (mcp.tests.hotelsBundled/hotelsTests) or the pre-commit lint locally — they need to compile cdylibs (tui/search/astlog). CI (with the cachix cache) runs the full gate. The commit skipped the infra-broken pre-commit hook only; ruff + unit tests were verified independently.Backend reliability
Expedia + Booking are solid; Kayak is clean but (being a metasearch) folds in nearby Bay Area inventory; Google Hotels has the most volatile DOM and is best-effort. Backends degrade gracefully via
.errors.(authored by Claude Code)
Note
Add
hotelscross-site hotel search and comparison module to MCPhotelsPython module with three public async APIs:search(...),compare(...), andamenities(...), supporting Expedia, Google, Booking.com, and Kayak backends.Hotelpydantic model and returned as polars DataFrames viaSearchResult;Comparison.matchesaligns the same hotel across sites using fuzzy name matching.websocketsdependency added and registered in the kernel module catalog in registry.py.Macroscope summarized e31a731.