Skip to content

Support outbound-domains for iframes with srcdoc#67

Merged
steven-tey merged 2 commits intomainfrom
iframe-srcdoc
Oct 8, 2025
Merged

Support outbound-domains for iframes with srcdoc#67
steven-tey merged 2 commits intomainfrom
iframe-srcdoc

Conversation

@steven-tey
Copy link
Contributor

@steven-tey steven-tey commented Oct 7, 2025

Extend outbound domain tracking to handle iframes with srcdoc attributes, such as those used by Cal.com. The script now detects and updates URLs within srcdoc content to append the tracking parameter. Corresponding tests and example usage have been added to ensure correct behavior.

Summary by CodeRabbit

  • New Features
    • Added outbound tracking support for srcdoc-based iframes (including nested embeds like Cal.com-style), appending tracking parameters to embedded URLs.
    • Expanded coverage beyond standard links and iframe src attributes for more reliable tracking.
  • Tests
    • Added end-to-end test validating tracking injection within srcdoc-based nested iframes.
  • Chores
    • Updated outbound demo page to include example srcdoc iframes for verification.

Extend outbound domain tracking to handle iframes with srcdoc attributes, such as those used by Cal.com. The script now detects and updates URLs within srcdoc content to append the tracking parameter. Corresponding tests and example usage have been added to ensure correct behavior.
@steven-tey steven-tey requested review from Copilot and devkiran October 7, 2025 19:43
@vercel
Copy link

vercel bot commented Oct 7, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
analytics-nextjs-geolocation-script Ready Ready Preview Comment Oct 7, 2025 8:32pm

@coderabbitai
Copy link

coderabbitai bot commented Oct 7, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds two iframe[srcdoc] embeds to the outbound demo page, extends the outbound-domains script to discover and update URLs inside srcdoc (including nested iframes/links), and adds a Playwright test that asserts tracking parameters are injected into nested srcdoc content.

Changes

Cohort / File(s) Summary
Outbound demo page
apps/nextjs/app/outbound/page.tsx
Adds two iframe elements using srcdoc that contain nested iframe HTML/scripts; no other component changes.
Playwright E2E test
apps/nextjs/tests/outbound-domains.spec.ts
New test navigating to /outbound, waiting for analytics init, collecting iframe[srcdoc] content and nested iframes, and asserting presence of injected tracking parameter(s) and expected nested URLs.
Outbound tracking script
packages/script/src/extensions/outbound-domains.js
Expands element collection to include content inside iframe[srcdoc] (nested iframe and a), unifies processing over combined element set, safely accesses srcdoc content with try/catch, and updates href/src or srcdoc content to append tracking parameters where applicable.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Page as Outbound Page
  participant BrowserDOM as DOM
  participant Script as outbound-domains.js
  participant Srcdoc as iframe[srcdoc] content
  participant Nested as nested iframe/link

  User->>Page: Request /outbound
  Page->>BrowserDOM: Render anchors, iframe[src], iframe[srcdoc]
  Note over BrowserDOM,Script: script runs (onload/SPA hook)

  rect rgba(200,230,255,0.25)
  Script->>BrowserDOM: Query anchors, iframe[src], iframe[srcdoc]
  Script->>Srcdoc: Access srcdoc content (try/catch)
  Srcdoc->>Script: Return nested iframes/links (if accessible)
  Script->>Script: Extract URLs, append tracking param (dub_id)
  Script->>BrowserDOM: Update href/src or replace srcdoc with modified HTML
  end

  BrowserDOM->>Nested: Subsequent navigations include tracking param
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

I’m a rabbit in a tiny page,
digging srcdoc tunnels at every stage—
I tuck a dub_id into each nested gate,
hop through tests to mark it straight.
🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title succinctly captures the primary change of adding support for srcdoc iframes in outbound domain tracking by directly referencing both “outbound-domains” and “iframes with srcdoc,” which aligns precisely with the PR’s objectives and content.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch iframe-srcdoc

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Extends outbound domain tracking to iframes that use the srcdoc attribute by attempting to detect and append the tracking parameter to URLs embedded inside the inline HTML content. Adds corresponding test coverage and example usage in the demo page.

  • Adds helper functions to parse and modify URLs inside iframe srcdoc content.
  • Updates selection logic to include iframe[srcdoc].
  • Introduces tests and example srcdoc iframes exercising the new behavior.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
packages/script/src/extensions/outbound-domains.js Adds srcdoc URL extraction and mutation logic for outbound tracking.
apps/nextjs/tests/outbound-domains.spec.ts Adds tests for srcdoc iframe handling and multi-URL expectations.
apps/nextjs/app/outbound/page.tsx Provides example srcdoc iframes for manual/test validation.

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines 35 to 56
if (!srcdoc) return null;

// Look for URLs in the srcdoc content
// This is a basic implementation - may need refinement based on actual Cal.com usage
const urlRegex = /https?:\/\/[^\s"'<>]+/g;
const matches = srcdoc.match(urlRegex);

if (matches && matches.length > 0) {
// Return the first URL found - this might need to be more sophisticated
// depending on how Cal.com structures their srcdoc content
return matches[0];
}

return null;
}

function updateSrcdocWithTracking(element, originalUrl, newUrl) {
if (!element.srcdoc) return false;

try {
// Replace the original URL with the new URL in the srcdoc content
const updatedSrcdoc = element.srcdoc.replace(originalUrl, newUrl);
Copy link

Copilot AI Oct 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the first matched URL in the srcdoc is returned, so additional outbound URLs inside the same srcdoc (e.g., multiple or <script src>) are never processed or tagged. Iterate over all matches and update each outbound URL instead of returning early.

Suggested change
if (!srcdoc) return null;
// Look for URLs in the srcdoc content
// This is a basic implementation - may need refinement based on actual Cal.com usage
const urlRegex = /https?:\/\/[^\s"'<>]+/g;
const matches = srcdoc.match(urlRegex);
if (matches && matches.length > 0) {
// Return the first URL found - this might need to be more sophisticated
// depending on how Cal.com structures their srcdoc content
return matches[0];
}
return null;
}
function updateSrcdocWithTracking(element, originalUrl, newUrl) {
if (!element.srcdoc) return false;
try {
// Replace the original URL with the new URL in the srcdoc content
const updatedSrcdoc = element.srcdoc.replace(originalUrl, newUrl);
if (!srcdoc) return [];
// Look for URLs in the srcdoc content
// This is a basic implementation - may need refinement based on actual Cal.com usage
const urlRegex = /https?:\/\/[^\s"'<>]+/g;
const matches = srcdoc.match(urlRegex);
// Return all matched URLs as an array, or an empty array if none found
return matches ? matches : [];
}
function updateSrcdocWithTracking(element, urlMap) {
if (!element.srcdoc) return false;
try {
let updatedSrcdoc = element.srcdoc;
// urlMap is an array of { originalUrl, newUrl }
urlMap.forEach(({ originalUrl, newUrl }) => {
// Replace all occurrences of the original URL with the new URL
const regex = new RegExp(originalUrl.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'g');
updatedSrcdoc = updatedSrcdoc.replace(regex, newUrl);
});

Copilot uses AI. Check for mistakes.
Comment on lines 55 to 57
// Replace the original URL with the new URL in the srcdoc content
const updatedSrcdoc = element.srcdoc.replace(originalUrl, newUrl);
element.srcdoc = updatedSrcdoc;
Copy link

Copilot AI Oct 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String.replace without a global regex only replaces the first occurrence of originalUrl; if the same URL appears multiple times in srcdoc, later instances remain untagged. Use replaceAll (with proper escaping) or construct a global RegExp to ensure all occurrences are updated.

Copilot uses AI. Check for mistakes.

await page.waitForFunction(() => window._dubAnalytics !== undefined);

await page.waitForTimeout(2500);
Copy link

Copilot AI Oct 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a fixed timeout can introduce flakiness and slows the test unnecessarily. Prefer waiting on a concrete condition (e.g., await page.waitForFunction(() => document.querySelector('iframe[srcdoc]')?.getAttribute('srcdoc')?.includes('dub_id='))) instead of an arbitrary delay.

Suggested change
await page.waitForTimeout(2500);
await page.waitForFunction(() => {
const iframes = Array.from(document.querySelectorAll('iframe[srcdoc]'));
return iframes.some(iframe =>
iframe.getAttribute('srcdoc')?.includes('dub_id=test-click-id')
);
});

Copilot uses AI. Check for mistakes.

// Check that both example.com and other.com URLs got the tracking parameter
expect(srcdocContent).toContain('example.com/booking?dub_id=test-click-id');
expect(srcdocContent).toContain('other.com/widget.js?dub_id=test-click-id');
Copy link

Copilot AI Oct 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test asserts that multiple distinct URLs inside the same srcdoc receive the tracking parameter, but the current implementation only transforms the first matched URL. Either enhance the implementation to handle all matches or adjust the test expectation to align with the implemented scope.

Suggested change
expect(srcdocContent).toContain('other.com/widget.js?dub_id=test-click-id');
// expect(srcdocContent).toContain('other.com/widget.js?dub_id=test-click-id');

Copilot uses AI. Check for mistakes.
@steven-tey
Copy link
Contributor Author

@coderabbitai full review

@coderabbitai
Copy link

coderabbitai bot commented Oct 7, 2025

✅ Actions performed

Full review triggered.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
packages/script/src/extensions/outbound-domains.js (1)

92-93: Refactor: Avoid unnecessary srcdoc extraction for elements with href/src.

The current implementation calls extractUrlFromSrcdoc(element.srcdoc) even when the element has an href or src attribute, which is wasteful. The fallback should only be evaluated when needed.

Apply this diff:

-        const urlString =
-          element.href || element.src || extractUrlFromSrcdoc(element.srcdoc);
+        let urlString = element.href || element.src;
+        if (!urlString && element.srcdoc) {
+          urlString = extractUrlFromSrcdoc(element.srcdoc);
+        }
apps/nextjs/tests/outbound-domains.spec.ts (1)

55-55: Consider reducing timeout or making it more deterministic.

The 2500ms timeout seems chosen to ensure the 2-second interval fires, but this could be flaky depending on execution timing. Consider either:

  1. Reducing to a more precise value (e.g., 2100ms)
  2. Using a more deterministic approach by exposing a test hook or checking specific conditions
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 887f973 and 7c60c31.

📒 Files selected for processing (3)
  • apps/nextjs/app/outbound/page.tsx (1 hunks)
  • apps/nextjs/tests/outbound-domains.spec.ts (1 hunks)
  • packages/script/src/extensions/outbound-domains.js (3 hunks)
🔇 Additional comments (3)
packages/script/src/extensions/outbound-domains.js (1)

82-84: LGTM!

The querySelector correctly includes iframe[srcdoc] to capture iframes with srcdoc attributes.

apps/nextjs/app/outbound/page.tsx (1)

11-13: Good test coverage for srcdoc with multiple URLs.

The demo includes appropriate test cases: an iframe with multiple outbound URLs and another with a wildcard domain. However, be aware that the current implementation in outbound-domains.js has a critical bug where only the first URL in srcdoc is processed, so these test cases will help catch that issue.

apps/nextjs/tests/outbound-domains.spec.ts (1)

48-80: Cannot verify test execution in this environment. Please run the Playwright test locally via

npx playwright test apps/nextjs/tests/outbound-domains.spec.ts -g "should handle iframe srcdoc attributes (Cal.com style)"

and confirm whether both URLs in the iframe’s srcdoc receive the dub_id=test-click-id tracking parameter as expected.

Comment on lines 51 to 63
function updateSrcdocWithTracking(element, originalUrl, newUrl) {
if (!element.srcdoc) return false;

try {
// Replace the original URL with the new URL in the srcdoc content
const updatedSrcdoc = element.srcdoc.replace(originalUrl, newUrl);
element.srcdoc = updatedSrcdoc;
return true;
} catch (e) {
console.error('Error updating srcdoc:', e);
return false;
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Major: String replace only replaces the first occurrence.

Line 56 uses replace() which only replaces the first occurrence of the URL. If the same URL appears multiple times in the srcdoc content, subsequent occurrences won't be updated.

Apply this diff:

-      const updatedSrcdoc = element.srcdoc.replace(originalUrl, newUrl);
+      const updatedSrcdoc = element.srcdoc.replaceAll(originalUrl, newUrl);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function updateSrcdocWithTracking(element, originalUrl, newUrl) {
if (!element.srcdoc) return false;
try {
// Replace the original URL with the new URL in the srcdoc content
const updatedSrcdoc = element.srcdoc.replace(originalUrl, newUrl);
element.srcdoc = updatedSrcdoc;
return true;
} catch (e) {
console.error('Error updating srcdoc:', e);
return false;
}
}
function updateSrcdocWithTracking(element, originalUrl, newUrl) {
if (!element.srcdoc) return false;
try {
// Replace the original URL with the new URL in the srcdoc content
const updatedSrcdoc = element.srcdoc.replaceAll(originalUrl, newUrl);
element.srcdoc = updatedSrcdoc;
return true;
} catch (e) {
console.error('Error updating srcdoc:', e);
return false;
}
}
🤖 Prompt for AI Agents
In packages/script/src/extensions/outbound-domains.js around lines 51 to 63, the
updateSrcdocWithTracking function uses String.prototype.replace which only
replaces the first occurrence; change the replacement to replace all occurrences
by creating a global replacement: escape originalUrl for safe use in a RegExp,
build a new RegExp(escapedOriginalUrl, 'g'), use srcdoc.replace(regexp, newUrl)
(or use split(originalUrl).join(newUrl) if you prefer avoiding regex), assign
the result back to element.srcdoc and keep the existing try/catch and return
values.

Comment on lines 107 to 129
const newUrlString = url.toString();

// Update the appropriate attribute based on element type
if (element.tagName.toLowerCase() === 'a') {
element.href = url.toString();
element.href = newUrlString;
outboundLinksUpdated.add(element);
} else if (element.tagName.toLowerCase() === 'iframe') {
element.src = url.toString();
if (element.src) {
// Standard iframe with src attribute
element.src = newUrlString;
outboundLinksUpdated.add(element);
} else if (element.srcdoc) {
// Iframe with srcdoc attribute (like Cal.com)
const updated = updateSrcdocWithTracking(
element,
urlString,
newUrlString,
);
if (updated) {
outboundLinksUpdated.add(element);
}
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

The single-URL processing approach won't handle multiple outbound URLs in srcdoc.

This logic processes only one URL per element due to the Set check. For iframes with srcdoc containing multiple outbound URLs (like the test case with both example.com/booking and other.com/widget.js), only the first extracted URL will be tracked, and the element will be marked as processed. Other outbound URLs in the same srcdoc will never be updated.

This requires a more comprehensive refactor to:

  1. Extract all URLs from srcdoc
  2. Check each against outbound domains
  3. Update all matching URLs in a single pass
  4. Mark the element as processed only after all URLs have been handled

Example approach:

} else if (element.tagName.toLowerCase() === 'iframe') {
  if (element.src) {
    element.src = newUrlString;
    outboundLinksUpdated.add(element);
  } else if (element.srcdoc) {
    // Extract all URLs from srcdoc
    const urls = extractUrlsFromSrcdoc(element.srcdoc);
    let updatedSrcdoc = element.srcdoc;
    let hasUpdates = false;
    
    urls.forEach(url => {
      try {
        const parsedUrl = new URL(url);
        if (filteredDomains.some(domain => isMatchingDomain(url, domain))) {
          if (!parsedUrl.searchParams.has(DUB_ID_VAR)) {
            parsedUrl.searchParams.set(DUB_ID_VAR, existingCookie);
            updatedSrcdoc = updatedSrcdoc.replaceAll(url, parsedUrl.toString());
            hasUpdates = true;
          }
        }
      } catch (e) {
        // Skip invalid URLs
      }
    });
    
    if (hasUpdates) {
      element.srcdoc = updatedSrcdoc;
      outboundLinksUpdated.add(element);
    }
  }
}
🤖 Prompt for AI Agents
In packages/script/src/extensions/outbound-domains.js around lines 107 to 129,
the current logic only processes a single extracted URL per iframe.srcdoc and
marks the element as handled, which misses additional outbound URLs in the same
srcdoc; refactor to extract all URLs from srcdoc, iterate each URL, for each
valid URL check against the outbound domains and add the tracking param if
missing, build an updated srcdoc string replacing each matched URL with its
updated version, and only after processing all URLs set element.srcdoc to the
updated string and add the element to outboundLinksUpdated if any replacements
were made.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
packages/script/src/extensions/outbound-domains.js (2)

76-76: Unnecessary null check.

Since allElements is created from array spread operators ([...elements, ...nestedElements]), it will always be an array and never null or undefined. The !allElements check is redundant.

Apply this diff:

-    if (!allElements || allElements.length === 0) return;
+    if (allElements.length === 0) return;

70-70: Console warnings may be noisy in production.

If multiple srcdoc iframes have CORS restrictions or security policies that prevent contentDocument access, this warning will be logged repeatedly (every 2 seconds due to the periodic execution on line 121), potentially cluttering the console in production environments.

Consider one of these approaches:

  1. Use a debug flag to conditionally log warnings
  2. Track which iframes have already logged warnings to avoid repetition
  3. Reduce log level or remove the warning entirely (the try-catch already handles the error gracefully)
+  const corsWarningsLogged = new WeakSet();
+
   srcdocIframes.forEach((srcdocIframe) => {
     try {
       // ... existing code ...
     } catch (e) {
-      console.warn('Could not access contentDocument of srcdoc iframe:', e);
+      if (!corsWarningsLogged.has(srcdocIframe)) {
+        console.warn('Could not access contentDocument of srcdoc iframe:', e);
+        corsWarningsLogged.add(srcdocIframe);
+      }
     }
   });
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7c60c31 and 8e9425f.

📒 Files selected for processing (3)
  • apps/nextjs/app/outbound/page.tsx (1 hunks)
  • apps/nextjs/tests/outbound-domains.spec.ts (1 hunks)
  • packages/script/src/extensions/outbound-domains.js (1 hunks)
🔇 Additional comments (3)
packages/script/src/extensions/outbound-domains.js (1)

53-110: LGTM! Excellent refactor from string manipulation to DOM access.

The new implementation correctly addresses all concerns from previous reviews by accessing the contentDocument directly and processing nested DOM elements uniformly, rather than attempting string manipulation of srcdoc content. This approach:

  • ✅ Handles multiple URLs within the same srcdoc (each is a separate DOM element)
  • ✅ Updates all occurrences (DOM attributes are updated directly)
  • ✅ Processes all outbound URLs in nested content
  • ✅ Includes proper error handling for CORS/security restrictions
apps/nextjs/app/outbound/page.tsx (1)

11-13: LGTM! Test fixtures correctly demonstrate srcdoc with nested iframes.

The added iframes properly illustrate the Cal.com-style usage with srcdoc containing nested iframe elements. The HTML structure is valid for testing purposes (browsers will parse it despite minimal structure).

apps/nextjs/tests/outbound-domains.spec.ts (1)

48-107: LGTM! Test correctly validates nested iframe tracking.

The test properly validates that:

  1. ✅ Nested iframes inside srcdoc are discovered via contentDocument
  2. ✅ Tracking parameters are added to their src attributes
  3. ✅ Multiple nested iframes in the same srcdoc are all tracked

The 2500ms timeout on line 55 is appropriate for this implementation since it exceeds the 2000ms periodic interval (line 121 in outbound-domains.js), ensuring at least one periodic run has completed. The past review comment suggesting to wait for the srcdoc attribute to contain dub_id was based on the old string-manipulation approach and doesn't apply to the current DOM-based implementation that modifies nested element attributes directly.

@steven-tey steven-tey merged commit fb4cfa9 into main Oct 8, 2025
4 checks passed
@steven-tey steven-tey deleted the iframe-srcdoc branch October 8, 2025 04:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants