Support outbound-domains for iframes with srcdoc#67
Conversation
Extend outbound domain tracking to handle iframes with srcdoc attributes, such as those used by Cal.com. The script now detects and updates URLs within srcdoc content to append the tracking parameter. Corresponding tests and example usage have been added to ensure correct behavior.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughAdds two iframe[srcdoc] embeds to the outbound demo page, extends the outbound-domains script to discover and update URLs inside srcdoc (including nested iframes/links), and adds a Playwright test that asserts tracking parameters are injected into nested srcdoc content. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor User
participant Page as Outbound Page
participant BrowserDOM as DOM
participant Script as outbound-domains.js
participant Srcdoc as iframe[srcdoc] content
participant Nested as nested iframe/link
User->>Page: Request /outbound
Page->>BrowserDOM: Render anchors, iframe[src], iframe[srcdoc]
Note over BrowserDOM,Script: script runs (onload/SPA hook)
rect rgba(200,230,255,0.25)
Script->>BrowserDOM: Query anchors, iframe[src], iframe[srcdoc]
Script->>Srcdoc: Access srcdoc content (try/catch)
Srcdoc->>Script: Return nested iframes/links (if accessible)
Script->>Script: Extract URLs, append tracking param (dub_id)
Script->>BrowserDOM: Update href/src or replace srcdoc with modified HTML
end
BrowserDOM->>Nested: Subsequent navigations include tracking param
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull Request Overview
Extends outbound domain tracking to iframes that use the srcdoc attribute by attempting to detect and append the tracking parameter to URLs embedded inside the inline HTML content. Adds corresponding test coverage and example usage in the demo page.
- Adds helper functions to parse and modify URLs inside iframe srcdoc content.
- Updates selection logic to include iframe[srcdoc].
- Introduces tests and example srcdoc iframes exercising the new behavior.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| packages/script/src/extensions/outbound-domains.js | Adds srcdoc URL extraction and mutation logic for outbound tracking. |
| apps/nextjs/tests/outbound-domains.spec.ts | Adds tests for srcdoc iframe handling and multi-URL expectations. |
| apps/nextjs/app/outbound/page.tsx | Provides example srcdoc iframes for manual/test validation. |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| if (!srcdoc) return null; | ||
|
|
||
| // Look for URLs in the srcdoc content | ||
| // This is a basic implementation - may need refinement based on actual Cal.com usage | ||
| const urlRegex = /https?:\/\/[^\s"'<>]+/g; | ||
| const matches = srcdoc.match(urlRegex); | ||
|
|
||
| if (matches && matches.length > 0) { | ||
| // Return the first URL found - this might need to be more sophisticated | ||
| // depending on how Cal.com structures their srcdoc content | ||
| return matches[0]; | ||
| } | ||
|
|
||
| return null; | ||
| } | ||
|
|
||
| function updateSrcdocWithTracking(element, originalUrl, newUrl) { | ||
| if (!element.srcdoc) return false; | ||
|
|
||
| try { | ||
| // Replace the original URL with the new URL in the srcdoc content | ||
| const updatedSrcdoc = element.srcdoc.replace(originalUrl, newUrl); |
There was a problem hiding this comment.
Only the first matched URL in the srcdoc is returned, so additional outbound URLs inside the same srcdoc (e.g., multiple or <script src>) are never processed or tagged. Iterate over all matches and update each outbound URL instead of returning early.
| if (!srcdoc) return null; | |
| // Look for URLs in the srcdoc content | |
| // This is a basic implementation - may need refinement based on actual Cal.com usage | |
| const urlRegex = /https?:\/\/[^\s"'<>]+/g; | |
| const matches = srcdoc.match(urlRegex); | |
| if (matches && matches.length > 0) { | |
| // Return the first URL found - this might need to be more sophisticated | |
| // depending on how Cal.com structures their srcdoc content | |
| return matches[0]; | |
| } | |
| return null; | |
| } | |
| function updateSrcdocWithTracking(element, originalUrl, newUrl) { | |
| if (!element.srcdoc) return false; | |
| try { | |
| // Replace the original URL with the new URL in the srcdoc content | |
| const updatedSrcdoc = element.srcdoc.replace(originalUrl, newUrl); | |
| if (!srcdoc) return []; | |
| // Look for URLs in the srcdoc content | |
| // This is a basic implementation - may need refinement based on actual Cal.com usage | |
| const urlRegex = /https?:\/\/[^\s"'<>]+/g; | |
| const matches = srcdoc.match(urlRegex); | |
| // Return all matched URLs as an array, or an empty array if none found | |
| return matches ? matches : []; | |
| } | |
| function updateSrcdocWithTracking(element, urlMap) { | |
| if (!element.srcdoc) return false; | |
| try { | |
| let updatedSrcdoc = element.srcdoc; | |
| // urlMap is an array of { originalUrl, newUrl } | |
| urlMap.forEach(({ originalUrl, newUrl }) => { | |
| // Replace all occurrences of the original URL with the new URL | |
| const regex = new RegExp(originalUrl.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'g'); | |
| updatedSrcdoc = updatedSrcdoc.replace(regex, newUrl); | |
| }); |
| // Replace the original URL with the new URL in the srcdoc content | ||
| const updatedSrcdoc = element.srcdoc.replace(originalUrl, newUrl); | ||
| element.srcdoc = updatedSrcdoc; |
There was a problem hiding this comment.
String.replace without a global regex only replaces the first occurrence of originalUrl; if the same URL appears multiple times in srcdoc, later instances remain untagged. Use replaceAll (with proper escaping) or construct a global RegExp to ensure all occurrences are updated.
|
|
||
| await page.waitForFunction(() => window._dubAnalytics !== undefined); | ||
|
|
||
| await page.waitForTimeout(2500); |
There was a problem hiding this comment.
Using a fixed timeout can introduce flakiness and slows the test unnecessarily. Prefer waiting on a concrete condition (e.g., await page.waitForFunction(() => document.querySelector('iframe[srcdoc]')?.getAttribute('srcdoc')?.includes('dub_id='))) instead of an arbitrary delay.
| await page.waitForTimeout(2500); | |
| await page.waitForFunction(() => { | |
| const iframes = Array.from(document.querySelectorAll('iframe[srcdoc]')); | |
| return iframes.some(iframe => | |
| iframe.getAttribute('srcdoc')?.includes('dub_id=test-click-id') | |
| ); | |
| }); |
|
|
||
| // Check that both example.com and other.com URLs got the tracking parameter | ||
| expect(srcdocContent).toContain('example.com/booking?dub_id=test-click-id'); | ||
| expect(srcdocContent).toContain('other.com/widget.js?dub_id=test-click-id'); |
There was a problem hiding this comment.
The test asserts that multiple distinct URLs inside the same srcdoc receive the tracking parameter, but the current implementation only transforms the first matched URL. Either enhance the implementation to handle all matches or adjust the test expectation to align with the implemented scope.
| expect(srcdocContent).toContain('other.com/widget.js?dub_id=test-click-id'); | |
| // expect(srcdocContent).toContain('other.com/widget.js?dub_id=test-click-id'); |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
packages/script/src/extensions/outbound-domains.js (1)
92-93: Refactor: Avoid unnecessary srcdoc extraction for elements with href/src.The current implementation calls
extractUrlFromSrcdoc(element.srcdoc)even when the element has anhreforsrcattribute, which is wasteful. The fallback should only be evaluated when needed.Apply this diff:
- const urlString = - element.href || element.src || extractUrlFromSrcdoc(element.srcdoc); + let urlString = element.href || element.src; + if (!urlString && element.srcdoc) { + urlString = extractUrlFromSrcdoc(element.srcdoc); + }apps/nextjs/tests/outbound-domains.spec.ts (1)
55-55: Consider reducing timeout or making it more deterministic.The 2500ms timeout seems chosen to ensure the 2-second interval fires, but this could be flaky depending on execution timing. Consider either:
- Reducing to a more precise value (e.g., 2100ms)
- Using a more deterministic approach by exposing a test hook or checking specific conditions
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
apps/nextjs/app/outbound/page.tsx(1 hunks)apps/nextjs/tests/outbound-domains.spec.ts(1 hunks)packages/script/src/extensions/outbound-domains.js(3 hunks)
🔇 Additional comments (3)
packages/script/src/extensions/outbound-domains.js (1)
82-84: LGTM!The querySelector correctly includes
iframe[srcdoc]to capture iframes with srcdoc attributes.apps/nextjs/app/outbound/page.tsx (1)
11-13: Good test coverage for srcdoc with multiple URLs.The demo includes appropriate test cases: an iframe with multiple outbound URLs and another with a wildcard domain. However, be aware that the current implementation in
outbound-domains.jshas a critical bug where only the first URL in srcdoc is processed, so these test cases will help catch that issue.apps/nextjs/tests/outbound-domains.spec.ts (1)
48-80: Cannot verify test execution in this environment. Please run the Playwright test locally vianpx playwright test apps/nextjs/tests/outbound-domains.spec.ts -g "should handle iframe srcdoc attributes (Cal.com style)"and confirm whether both URLs in the iframe’s srcdoc receive the
dub_id=test-click-idtracking parameter as expected.
| function updateSrcdocWithTracking(element, originalUrl, newUrl) { | ||
| if (!element.srcdoc) return false; | ||
|
|
||
| try { | ||
| // Replace the original URL with the new URL in the srcdoc content | ||
| const updatedSrcdoc = element.srcdoc.replace(originalUrl, newUrl); | ||
| element.srcdoc = updatedSrcdoc; | ||
| return true; | ||
| } catch (e) { | ||
| console.error('Error updating srcdoc:', e); | ||
| return false; | ||
| } | ||
| } |
There was a problem hiding this comment.
Major: String replace only replaces the first occurrence.
Line 56 uses replace() which only replaces the first occurrence of the URL. If the same URL appears multiple times in the srcdoc content, subsequent occurrences won't be updated.
Apply this diff:
- const updatedSrcdoc = element.srcdoc.replace(originalUrl, newUrl);
+ const updatedSrcdoc = element.srcdoc.replaceAll(originalUrl, newUrl);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| function updateSrcdocWithTracking(element, originalUrl, newUrl) { | |
| if (!element.srcdoc) return false; | |
| try { | |
| // Replace the original URL with the new URL in the srcdoc content | |
| const updatedSrcdoc = element.srcdoc.replace(originalUrl, newUrl); | |
| element.srcdoc = updatedSrcdoc; | |
| return true; | |
| } catch (e) { | |
| console.error('Error updating srcdoc:', e); | |
| return false; | |
| } | |
| } | |
| function updateSrcdocWithTracking(element, originalUrl, newUrl) { | |
| if (!element.srcdoc) return false; | |
| try { | |
| // Replace the original URL with the new URL in the srcdoc content | |
| const updatedSrcdoc = element.srcdoc.replaceAll(originalUrl, newUrl); | |
| element.srcdoc = updatedSrcdoc; | |
| return true; | |
| } catch (e) { | |
| console.error('Error updating srcdoc:', e); | |
| return false; | |
| } | |
| } |
🤖 Prompt for AI Agents
In packages/script/src/extensions/outbound-domains.js around lines 51 to 63, the
updateSrcdocWithTracking function uses String.prototype.replace which only
replaces the first occurrence; change the replacement to replace all occurrences
by creating a global replacement: escape originalUrl for safe use in a RegExp,
build a new RegExp(escapedOriginalUrl, 'g'), use srcdoc.replace(regexp, newUrl)
(or use split(originalUrl).join(newUrl) if you prefer avoiding regex), assign
the result back to element.srcdoc and keep the existing try/catch and return
values.
| const newUrlString = url.toString(); | ||
|
|
||
| // Update the appropriate attribute based on element type | ||
| if (element.tagName.toLowerCase() === 'a') { | ||
| element.href = url.toString(); | ||
| element.href = newUrlString; | ||
| outboundLinksUpdated.add(element); | ||
| } else if (element.tagName.toLowerCase() === 'iframe') { | ||
| element.src = url.toString(); | ||
| if (element.src) { | ||
| // Standard iframe with src attribute | ||
| element.src = newUrlString; | ||
| outboundLinksUpdated.add(element); | ||
| } else if (element.srcdoc) { | ||
| // Iframe with srcdoc attribute (like Cal.com) | ||
| const updated = updateSrcdocWithTracking( | ||
| element, | ||
| urlString, | ||
| newUrlString, | ||
| ); | ||
| if (updated) { | ||
| outboundLinksUpdated.add(element); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The single-URL processing approach won't handle multiple outbound URLs in srcdoc.
This logic processes only one URL per element due to the Set check. For iframes with srcdoc containing multiple outbound URLs (like the test case with both example.com/booking and other.com/widget.js), only the first extracted URL will be tracked, and the element will be marked as processed. Other outbound URLs in the same srcdoc will never be updated.
This requires a more comprehensive refactor to:
- Extract all URLs from srcdoc
- Check each against outbound domains
- Update all matching URLs in a single pass
- Mark the element as processed only after all URLs have been handled
Example approach:
} else if (element.tagName.toLowerCase() === 'iframe') {
if (element.src) {
element.src = newUrlString;
outboundLinksUpdated.add(element);
} else if (element.srcdoc) {
// Extract all URLs from srcdoc
const urls = extractUrlsFromSrcdoc(element.srcdoc);
let updatedSrcdoc = element.srcdoc;
let hasUpdates = false;
urls.forEach(url => {
try {
const parsedUrl = new URL(url);
if (filteredDomains.some(domain => isMatchingDomain(url, domain))) {
if (!parsedUrl.searchParams.has(DUB_ID_VAR)) {
parsedUrl.searchParams.set(DUB_ID_VAR, existingCookie);
updatedSrcdoc = updatedSrcdoc.replaceAll(url, parsedUrl.toString());
hasUpdates = true;
}
}
} catch (e) {
// Skip invalid URLs
}
});
if (hasUpdates) {
element.srcdoc = updatedSrcdoc;
outboundLinksUpdated.add(element);
}
}
}🤖 Prompt for AI Agents
In packages/script/src/extensions/outbound-domains.js around lines 107 to 129,
the current logic only processes a single extracted URL per iframe.srcdoc and
marks the element as handled, which misses additional outbound URLs in the same
srcdoc; refactor to extract all URLs from srcdoc, iterate each URL, for each
valid URL check against the outbound domains and add the tracking param if
missing, build an updated srcdoc string replacing each matched URL with its
updated version, and only after processing all URLs set element.srcdoc to the
updated string and add the element to outboundLinksUpdated if any replacements
were made.
…t.querySelectorAll('iframe[src]') approach
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
packages/script/src/extensions/outbound-domains.js (2)
76-76: Unnecessary null check.Since
allElementsis created from array spread operators ([...elements, ...nestedElements]), it will always be an array and never null or undefined. The!allElementscheck is redundant.Apply this diff:
- if (!allElements || allElements.length === 0) return; + if (allElements.length === 0) return;
70-70: Console warnings may be noisy in production.If multiple srcdoc iframes have CORS restrictions or security policies that prevent contentDocument access, this warning will be logged repeatedly (every 2 seconds due to the periodic execution on line 121), potentially cluttering the console in production environments.
Consider one of these approaches:
- Use a debug flag to conditionally log warnings
- Track which iframes have already logged warnings to avoid repetition
- Reduce log level or remove the warning entirely (the try-catch already handles the error gracefully)
+ const corsWarningsLogged = new WeakSet(); + srcdocIframes.forEach((srcdocIframe) => { try { // ... existing code ... } catch (e) { - console.warn('Could not access contentDocument of srcdoc iframe:', e); + if (!corsWarningsLogged.has(srcdocIframe)) { + console.warn('Could not access contentDocument of srcdoc iframe:', e); + corsWarningsLogged.add(srcdocIframe); + } } });
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
apps/nextjs/app/outbound/page.tsx(1 hunks)apps/nextjs/tests/outbound-domains.spec.ts(1 hunks)packages/script/src/extensions/outbound-domains.js(1 hunks)
🔇 Additional comments (3)
packages/script/src/extensions/outbound-domains.js (1)
53-110: LGTM! Excellent refactor from string manipulation to DOM access.The new implementation correctly addresses all concerns from previous reviews by accessing the contentDocument directly and processing nested DOM elements uniformly, rather than attempting string manipulation of srcdoc content. This approach:
- ✅ Handles multiple URLs within the same srcdoc (each is a separate DOM element)
- ✅ Updates all occurrences (DOM attributes are updated directly)
- ✅ Processes all outbound URLs in nested content
- ✅ Includes proper error handling for CORS/security restrictions
apps/nextjs/app/outbound/page.tsx (1)
11-13: LGTM! Test fixtures correctly demonstrate srcdoc with nested iframes.The added iframes properly illustrate the Cal.com-style usage with srcdoc containing nested iframe elements. The HTML structure is valid for testing purposes (browsers will parse it despite minimal structure).
apps/nextjs/tests/outbound-domains.spec.ts (1)
48-107: LGTM! Test correctly validates nested iframe tracking.The test properly validates that:
- ✅ Nested iframes inside srcdoc are discovered via contentDocument
- ✅ Tracking parameters are added to their src attributes
- ✅ Multiple nested iframes in the same srcdoc are all tracked
The 2500ms timeout on line 55 is appropriate for this implementation since it exceeds the 2000ms periodic interval (line 121 in outbound-domains.js), ensuring at least one periodic run has completed. The past review comment suggesting to wait for the srcdoc attribute to contain
dub_idwas based on the old string-manipulation approach and doesn't apply to the current DOM-based implementation that modifies nested element attributes directly.
Extend outbound domain tracking to handle iframes with srcdoc attributes, such as those used by Cal.com. The script now detects and updates URLs within srcdoc content to append the tracking parameter. Corresponding tests and example usage have been added to ensure correct behavior.
Summary by CodeRabbit