Group the flat 17-section layout into five titled parts (Motivation, Architecture, Data Model & Validation, Operations, Reference) with short intros, add a design-spec status banner, add TL;DR leads to the densest sections, de-duplicate canonical-identity and producer-contract discussion, and add a manager-vs-cohort comparison table. Add five Operations sections promised but not previously specified: Testing Strategy, Performance Considerations, Rollback And Coexistence, Direct-Push API Surface, and Security Considerations. Open questions are marked inline so reviewers can react to concrete text.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Hello, I'm the AEM Code Sync Bot and I will run some actions to deploy your branch.
Pull request overview
Adds a design-specification document for a planned JsonLdGraphManager runtime, describing motivation, architecture/lifecycle, canonical graph/merge rules, operational concerns (logging/testing/perf/rollback), and reference examples.
Changes:
- Introduces a comprehensive JSON-LD graph-manager design spec (feature-flagging, lifecycle, data contracts).
- Defines normalization/merge/dedupe and provenance conventions for multi-producer JSON-LD aggregation.
- Documents operational strategy (observability via Lana, testing levels, performance envelope, rollback).
… remove self-import from push example
Second pass on the JsonLdGraphManager design doc focused on readability and presentation flow for a broader audience.
- Restructure into 6 parts (add Part II Rollout) with italic dicta under each section heading to anchor the key idea
- Add Quickstart, "Who this is for" audience matrix, and Glossary
- Add Mermaid diagrams: 3-beat architecture flowchart, before/after comparison, initialization and mutation sequence diagrams, canonical editorial and product page graph shapes
- Annotate Appendix A examples with "What to notice" callouts
- Consolidate all Open Questions into Appendix B table
Reorganize JsonLdGraphManager spec so the reading order follows the systems/design-paper convention (why -> what -> does-it-work -> how-we-ship -> caveats) instead of interleaving deployment before design.
- Part I Introduction (Abstract, Scope, Problem, Before/After, Contributions)
- Part II Design (Decision, Architecture, Lifecycle, DOM & Output Contracts, Producer Integration, Direct-Push API, Normalization, Canonical Graph Model)
- Part III Evaluation (Validation Cohort, Testing, Performance)
- Part IV Deployment (Feature Flag, Rollout, Rollback, Observability)
- Part V Security Considerations (promoted to top-level, RFC convention)
- Part VI Related Work & Reference (Authoring Catalog, References, Appendices A-D; Glossary moved to appendix)
Specific moves:
- Design Decision moves from Motivation to opener of Design
- Before/After moves from Architecture to Introduction (motivation device)
- Direct-Push API moves from Operations to Design (it's a public interface)
- Validation Cohort + Testing + Performance grouped in Evaluation
- Security promoted from Operations subsection to top-level part
- Glossary moves to Appendix D
- Rename "Data Model And Contracts" -> "DOM And Output Contracts" to eliminate name collision with the data-model material in Part II
- Add bulleted Contributions list in Introduction
No content changes; only section relocations, one rename, and the new Contributions list.
This PR has not been updated recently and will be closed in 7 days if no action is taken. Please ensure all checks are passing, https://github.com/orgs/adobecom/discussions/997 provides instructions. If the PR is ready to be merged, please mark it with the "Ready for Stage" label.
Reframe the spec to point at the requirements sheet in structured-data-json-ld.json as the machine-readable source of truth and keep the markdown doc as rationale and contract. Remove sections that restated rules now owned by the JSON sheet; remove provenance entirely (debug mode is the appropriate place to surface per-source origin).
- Externalize: drop "DOM And Output Contracts" subsections, identity policy table, dedupe policy, governing-rules bullets, and the "Manager guarantees vs. cohort expectations" table; replace each with a one-line pointer to the requirements sheet.
- Provenance: remove the provenance contract subsection, the Provenance preservation security bullet, the Provenance glossary entry, and all producerName/producerType/ingestMode/discoveryPhase references in the Producer Integration Model, Direct-Push API, runtime lifecycle, sequence diagram, and testing strategy. Reframe observability so debug mode logs the original captured payload and DOM location rather than persisting a provenance record.
- Naming: rename section 3 from "Evaluation" to "Conformance" -- the doc covers conformance to the requirements spec, not empirical evaluation. Rename section 4 from "Deployment" to "Operations" so feature flagging and observability sit naturally together.
- Section numbering: collapse the 2.1->2.2->2.3->2.6 gap to a contiguous 2.1->2.6 sequence after the renames; add 3.1, 3.2.
- Out of scope: add a 3.2 "Out Of Scope" note clarifying that search-engine effectiveness measurement (bot-traffic logs, GSC URL Inspection API) is not gated by this spec.
- Cross-references: drop the broken anchor link on the canonical-graph section (target was renumbered); drop "direct graph-manager push" from the merge priority since the direct-push API is no longer specified in this doc; drop BreadcrumbList from Article.hasPart in the editorial diagram and Example 1 since it isn't a supplemental per the supplemental-linkage rule.
- Typos and grammar: paramater, eachother, this these, fo this, compelete, it's complexity, on on, speadsheet, awkward "JSON-LD on page meets" wording in the e2e testing bullet.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 4 comments.
…pace, feature-flag default
Add a single-file ES module at libs/features/jsonld-graph-manager.js that collects all per-page JSON-LD emitted by existing producers and rewrites it as one canonical, linked @graph. Disabled by default; enabled per page via the jsonld-graph-manager metadata flag or URL query parameter (string 'true', case-insensitive).
The implementation is organized as pure helper functions plus a class, all in one file, with named exports for unit-testability:
- RULES table encodes the requirements sheet (WebPage, Organization, Article, BreadcrumbList, SoftwareApplication, HowTo, FAQPage, VideoObject, Event, Product): identity fragments, singleton flags, and default linkage edges.
- parsePayload: accepts object | array | { @graph } shapes; logs a Lana warning on parse failure.
- normalizeNode: strips per-node @context; rewrites @id to canonical page-scoped fragment (e.g. #article) or site-wide id (Organization).
- mergeNodes: resolves scalar conflicts by source priority (bootDom < runtime); unions reference arrays (hasPart, mainEntity, itemListElement) by @id.
- injectLinks: derives WebPage.mainEntity/breadcrumb/publisher and Article.isPartOf/mainEntityOfPage/publisher from the RULES table.
- JsonLdGraphManager class: boot scan of existing unmanaged scripts, MutationObserver on documentElement (childList + subtree), debounced rebuild queue, and rewrite() that synthesizes a minimal WebPage root when producers haven't provided one.
- init() default export: idempotent singleton stored on window.__jsonLdGraphManager.
Boot wiring added to documentPostSectionLoading in libs/utils/utils.js, placed before seotech/richresults so the MutationObserver is attached before those producers append their scripts.
Tests (37, all passing) cover: flattenPayload, parsePayload (valid shapes + invalid JSON → Lana warning), normalizeNode (canonical ids, context strip, unknown type retention), unionByRef, mergeNodes (priority resolution, field union, reference array union), injectLinks (forward/back links, no-overwrite), boot scan, singleton enforcement, output contract (one managed script, no per-node @context, WebPage-first ordering), MutationObserver pickup, and three e2e pipeline fixtures (editorial, product, multi-producer priority).
What v1 does not include: direct-push producer API, runtime fetch of the requirements sheet, provenance persistence, e2e cohort tests against live URLs, search-effectiveness measurement.
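The enablement rule above (the literal string 'true', case-insensitive, from either the URL query parameter or page metadata) could be sketched as follows. This is a minimal illustration, not the code in this PR: `isTrueFlag` and `isGraphManagerEnabled` are hypothetical names, and precedence between the two sources is deliberately not modeled.

```javascript
// Hypothetical helper: true only for the string 'true' in any letter case.
function isTrueFlag(value) {
  return typeof value === 'string' && value.toLowerCase() === 'true';
}

// Hypothetical helper. `search` is a URL query string; `metadataValue` is the
// content of the jsonld-graph-manager page metadata, or null when absent.
// Assumption: either source alone is enough to enable the feature.
function isGraphManagerEnabled(search, metadataValue) {
  const param = new URLSearchParams(search).get('jsonld-graph-manager');
  return isTrueFlag(param) || isTrueFlag(metadataValue);
}
```

Any other value, such as `1` or `yes`, leaves the feature off under this sketch.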
This PR does not qualify for the zero-impact label as it touches code outside of the allowed areas. The label is auto applied, do not manually apply the label.
…le logging
Add ?jsonld-graph-manager-debug=true URL flag that emits console.debug output at each queue lifecycle event:
- enqueue (source, DOM location, original payload)
- rebuild (batch size, graph size)
- parsed (types, node count)
- removed from DOM
- rewrite (node count, full expandable graph object)
The graph object logged on rewrite is the canonical @graph as produced, inspectable in DevTools without a separate console snippet. Debug output is gated entirely on the URL param and is independent of lanadebug and the Lana endpoint -- these are high-volume success-path events that should never be sent to Lana.
…ug flag doc
Organization synthesis:
- Always ensure a canonical Organization node is present in the graph. rewrite() synthesizes a minimal default if none is provided, or merges the default at graph-manager-generated priority (weight 2) so baseline fields (name, url, logo) always win over producer-supplied values while producer-only fields (e.g. sameAs) are preserved.
- Domain-aware: siteRoot() returns https://business.adobe.com for hostnames matching /business|bacom/i; defaults to https://www.adobe.com. defaultOrg() derives name ("Adobe" / "Adobe for Business"), url, logo, and @id from the site root. Both accept an optional hostname override for testability.
- 3-tier merge priority: generated (2) > runtime (1) > bootDom (0).
Inline entity extraction:
- extractInlineEntities() walks publisher, author, creator, provider, brand properties; hoists any inline typed object that lacks @id to a top-level graph node (via normalizeNode) and replaces the property value with an @id reference. Called during rebuild() after each node is normalized.
Doc (libs/utils/json-ld.md):
- Summary: add one-line mention of jsonld-graph-manager-debug=true.
- §4.1: add debug flag entry alongside the feature flag.
- §4.2: replace vague "debug logging conventions" bullets with a concrete description of the five lifecycle events logged by the debug param; remove stale lanadebug reference.
Tests: 45 passing (8 new cases covering synthesis, precedence, domain selection for www/business/bacom, inline extraction, and integration).
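The 3-tier priority described in this commit can be illustrated with a flat merge. This is only a sketch under stated assumptions: `mergeByPriority` is a hypothetical name, it handles top-level fields only (no nested merge, no reference-array union), and it assumes the later input wins a tie.

```javascript
// Weights taken from the commit message: generated (2) > runtime (1) > bootDom (0).
const PRIORITY = { bootDom: 0, runtime: 1, generated: 2 };

// Hypothetical flat merge: fields from the higher-priority source overwrite the
// lower-priority ones, while fields only the loser has are preserved.
// Assumption: on equal priority the second input (b) wins.
function mergeByPriority(a, srcA, b, srcB) {
  const aWins = PRIORITY[srcA] > PRIORITY[srcB];
  const [winner, loser] = aWins ? [a, b] : [b, a];
  return { ...loser, ...winner };
}
```

Merging a producer-supplied Organization at `runtime` priority with the synthesized default at `generated` priority keeps the default `name` while retaining a producer-only `sameAs`, which matches the behavior the commit describes for baseline fields.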
- Turn off no-continue globally in .eslintrc.js
- Add file-level no-use-before-define disable (lanaLog hoisted above parsePayload)
- Add inline no-nested-ternary disables for unionByRef coercions
- Add missing no-console disables for console.error/warn in lanaLog
- Rename _collect → collect (private method, underscore convention unnecessary)
- Rename window.__jsonLdGraphManager → window.miloJsonLdGraphManager
- Remove unused canonicalUrl import from test file
- Add no-promise-executor-return disable for test microtask flush
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 10 comments.
    if (this.isProcessing) return;
    this.isProcessing = true;
    try {
      const batch = this.queue.splice(0);
      debugLog('rebuild', { batchSize: batch.length, graphSize: this.graph.size });
      for (const { scriptEl, source } of batch) {
        const nodes = parsePayload(scriptEl);
        debugLog('parsed', { source, types: nodes.map((n) => n['@type']), nodeCount: nodes.length });
        scriptEl.remove();
        debugLog('removed from DOM', scriptEl.parentElement?.tagName ?? 'already detached');
        for (const raw of nodes) {
          const node = normalizeNode(raw);
          const inlined = extractInlineEntities(node);
          const toMerge = [node, ...inlined];
          for (const n of toMerge) {
            const id = n['@id'] ?? n['@type'] ?? JSON.stringify(n);
            if (this.graph.has(id)) {
              const prevSrc = this.sources.get(id) ?? 'bootDom';
              this.graph.set(id, mergeNodes(this.graph.get(id), n, prevSrc, source));
            } else {
              this.graph.set(id, n);
            }
            this.sources.set(id, source);
          }
        }
      }
      this.rewrite();
rebuild() returns early when isProcessing is true, but enqueued items can still be added to this.queue while processing a batch. If the debounced rebuild fires during a long-running rebuild, it will no-op due to isProcessing and no further rebuild may be scheduled, leaving queued scripts unprocessed. Consider setting a needsRebuild flag (or looping until queue is empty) so any items enqueued during processing are guaranteed to be handled after the current pass finishes.
    if (this.isProcessing) {
      this.needsRebuild = true;
      return;
    }
    this.isProcessing = true;
    try {
      do {
        this.needsRebuild = false;
        const batch = this.queue.splice(0);
        debugLog('rebuild', { batchSize: batch.length, graphSize: this.graph.size });
        for (const { scriptEl, source } of batch) {
          const nodes = parsePayload(scriptEl);
          debugLog('parsed', { source, types: nodes.map((n) => n['@type']), nodeCount: nodes.length });
          scriptEl.remove();
          debugLog('removed from DOM', scriptEl.parentElement?.tagName ?? 'already detached');
          for (const raw of nodes) {
            const node = normalizeNode(raw);
            const inlined = extractInlineEntities(node);
            const toMerge = [node, ...inlined];
            for (const n of toMerge) {
              const id = n['@id'] ?? n['@type'] ?? JSON.stringify(n);
              if (this.graph.has(id)) {
                const prevSrc = this.sources.get(id) ?? 'bootDom';
                this.graph.set(id, mergeNodes(this.graph.get(id), n, prevSrc, source));
              } else {
                this.graph.set(id, n);
              }
              this.sources.set(id, source);
            }
          }
        }
        this.rewrite();
      } while (this.needsRebuild || this.queue.length > 0);
Real bugs:
- mergeNodes(): when b wins (aWins=false), recursive calls passed srcA/srcB
unchanged, but vW/vL came from the swapped winner/loser. Compute
winnerSrc/loserSrc once and pass those to recursive merges so nested
scalars resolve under the correct priority.
- rebuild(): this.sources.set(id, source) overwrote with the last-enqueued
source even when mergeNodes preserved a higher-priority value. Now we
only update the source map when the new source priority >= prev, so
subsequent merges see the correct prevSrc weight.
Defensive / consistency:
- flattenPayload(): guard against null/primitive parsed JSON (returns []).
- rebuild(): capture parentTagName before scriptEl.remove() so the debug
log reflects the actual parent rather than always "already detached".
- Convert debugLog to thunk pattern with cached DEBUG flag at module init,
so expensive arg construction (JSON.parse, .map, .trim) is skipped when
the debug param is not set. Use `nodes` directly instead of
JSON.parse(payload) in the rewrite log.
- Remove info-level lanaLog ("Graph rewritten with N nodes") -- per spec,
high-volume success-path events should not be sent to Lana.
- Replace nested ternary in unionByRef with a small asArray() helper
(cleaner than two eslint-disable comments).
- Move no-continue from a global .eslintrc.js rule change to a
file-scoped disable comment to avoid affecting unrelated code.
Tooling:
- utils.js: swap flag precedence so `?jsonld-graph-manager=true` always
wins over metadata, matching mep-lingo-skip-qi/langfirst convention.
- Add JsonLdGraphManager.destroy() that disconnects the MutationObserver.
- Tests now wrap manager construction in trackedManager() and call
destroy() in resetManager() so observers do not leak across tests.
Tests: 45 passing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
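The thunk pattern mentioned in this commit can be sketched as below. This is an illustration rather than the module's actual code: `createDebugLog` is a hypothetical factory standing in for the module-level cached DEBUG flag, and the `sink` parameter exists only so the behavior can be observed in isolation.

```javascript
// Hypothetical factory: `enabled` plays the role of the DEBUG flag cached at
// module init. Callers pass a function (thunk) instead of a ready-made value,
// so the expensive argument is never built when debugging is off.
function createDebugLog(enabled, sink = (...args) => console.debug(...args)) {
  return (label, thunk) => {
    if (!enabled) return;
    sink(label, thunk());
  };
}
```

A call site would then read `debugLog('parsed', () => ({ types: nodes.map((n) => n['@type']) }))`, and the `.map` only runs when the debug parameter is set.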
…, org id canonicalization
Implementation (libs/features/jsonld-graph-manager.js):
- Add TYPE_TRANSFORMS table; normalizeNode() rewrites @type:Product to @type:SoftwareApplication so canonical primary type for product-oriented pages is uniform across producers (review block today, merch cards next).
- Add Offer to RULES (idFragment #offer, repeatable). Remove Product entry since it never reaches RULES lookup.
- Extend extractInlineEntities to walk seller/offers/itemOffered in addition to publisher/author/creator/provider/brand. Handle arrays (offers[]) and drop the old "no @id" guard so producer-identified inline entities are also hoisted with canonical @id. Restrict hoisting to recognized types (RULES or TYPE_TRANSFORMS) so anonymous Brand stays inline per the doc.
- Add canonicalizeOrgId() and canonicalizeReferences() for defensive rewriting of producer Organization aliases (#org, #publisher, #adobe -> #organization). Walks reference stubs only ({@id} with no @type).
- Wire canonicalizeReferences into rebuild() after normalization+extraction.
Doc (libs/utils/json-ld.md): no changes -- existing §2.4 type-specific transforms section already documents Product->SoftwareApplication; Appendix A.3 already shows the merch-card fixture transformation.
Tests: 59 passing (was 45).
New cases:
- normalizeNode Product->SoftwareApplication
- end-to-end review-block Product shape produces SoftwareApplication
- inline Offer array hoisting with @id rewrite
- BreadcrumbList itemListElement (URL strings) preserved as-is
- anonymous and identified Brand stays inline
- canonicalizeOrgId for #org, #publisher, #adobe; idempotent for canonical
- end-to-end seller alias #org -> #organization
- WebPage without BreadcrumbList omits breadcrumb property
- merch-card fixture: full Product transformation with Offer hoist, Brand inline, seller canonicalization, priceSpecification preserved
- Updated existing test to expect new behavior: inline objects with @id are hoisted (was: left untouched)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
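As a rough illustration of the alias rewriting described above, the following sketch maps the three listed fragments to `#organization`. It assumes the part of the id before the fragment is kept unchanged, which may differ from how the real canonicalizeOrgId() derives the site-wide id; `ORG_ALIASES` is a name chosen for this example.

```javascript
// Alias fragments listed in the commit message.
const ORG_ALIASES = ['#org', '#publisher', '#adobe'];

// Sketch only: rewrites an aliased Organization fragment to '#organization'.
// Assumption: everything before the '#' is preserved as-is.
function canonicalizeOrgId(id) {
  if (typeof id !== 'string') return id;
  const hashAt = id.indexOf('#');
  if (hashAt === -1) return id;
  const fragment = id.slice(hashAt);
  if (!ORG_ALIASES.includes(fragment)) return id;
  return `${id.slice(0, hashAt)}#organization`;
}
```

Because `#organization` is not itself in the alias list, applying the function to an already canonical id returns it unchanged, which is the idempotence the new tests check for.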
Move libs/utils/json-ld.md to libs/features/jsonld-graph-manager.md so the spec sits next to the implementation it describes. Matches the existing co-location pattern in libs/utils (lana.js + lana.md, service-config.js + service-config.md) and the README convention used by other libs/features modules (seotech, mep, spectrum-web-components).
Mirror the test path layout (test/features/jsonld-graph-manager/) by moving the implementation and design doc into libs/features/jsonld-graph-manager/. Matches the seotech precedent (libs/features/seotech/seotech.js + README.md).
- libs/features/jsonld-graph-manager.js -> libs/features/jsonld-graph-manager/jsonld-graph-manager.js
- libs/features/jsonld-graph-manager.md -> libs/features/jsonld-graph-manager/jsonld-graph-manager.md
- update relative import to '../../utils/action.js'
- update consumer in libs/utils/utils.js
- update test import path
Pages with no JSON-LD producers were getting no managed graph because rewrite() short-circuited on graph.size === 0. The requirements sheet mandates the graph always contain exactly one WebPage and one Organization node (webpage-singleton, organization-singleton, organization-default-*), so a manager-enabled page must always produce that baseline. Drop the early return; the rest of rewrite() already synthesizes WebPage and Organization when missing, links WebPage.publisher, and writes the managed script. Update the test that asserted the old skip behavior to assert the expected baseline graph instead.
…e strategy
Move the 38-rule requirements sheet (structured-data-json-ld.json) into a new normative section of the design doc so the manager spec lives in one place and the JSON sheet can be retired. While merging, capture three policy refinements raised in design review:
- Add 'softwareapplication-subtype-allowed' (info): preserve more specific schema.org subtypes (WebApplication, MobileApplication, VideoGame) when a producer supplies them; the Product->SoftwareApplication baseline transform never rewrites a producer subtype down to plain SA. Aligns with Google's Software App rich result, which explicitly supports these subtypes.
- Replace 'webpage-singleton' with 'webpage-canonical-singleton' and define cross-page WebPage rewriting (policy choice C): inline cross-page WebPage references in isPartOf/mainEntityOfPage are rewritten to the current canonical #webpage id and the inline body is dropped. Schema.org permits cross-page references and Google's spec is silent, but our managed graph stays single-page coherent.
- Add 'source-priority' (error): codify generated > runtime > bootDom resolution explicitly so the rule that runtime producers (e.g. review block aggregateRating) win over hardcoded bootDom values is normative, not implementation lore.
- Upgrade 'organization-default-logo' from favicon URL string to schema.org ImageObject pointing at the Adobe corporate horizontal red SVG, tied to Google's logo guidelines (112x112 minimum, ImageObject preferred).
Drop 'external-reference-includes-url' (redundant since referenced entities are top-level @graph nodes).
Add new section 4 (Per-Type Strategy) with one subsection per supported type (WebPage, Organization, BreadcrumbList, SoftwareApplication + subtypes, Article/NewsArticle, HowTo, FAQPage, VideoObject, Offer, Event, WebSite, seotech variable). Each subsection cites schema.org hierarchy, Google rich-result requirements, manager handling, and known producers in the milo repo (sourced from the integrations sheet).
Renumber Conformance to section 5 and Operations to section 6. Update Appendix A.3 example output to use the new ImageObject logo. Total: 39 normative rules across sections 3.1-3.8.
Remove inline JS comments from libs/features/jsonld-graph-manager.js that documented design decisions, after promoting each one into a normative requirement in the design doc:
- 'manager-baseline-graph' (error): manager always emits the baseline WebPage + Organization graph when enabled, even on producer-free pages
- 'organization-id-aliases' (info): defensive canonicalization of '#org', '#publisher', '#adobe' fragments to '#organization'
- 'repeatable-types' note: v1 collapses repeatable nodes (VideoObject, Offer) to a single canonical id when distinct producer @ids are absent
- 'source-priority' note: ratcheting behavior, where the recorded source moves monotonically toward higher-priority sources so a generated default cannot be overwritten by a later out-of-order bootDom payload
Also strip restate-the-rule comments (TYPE_TRANSFORMS, RULES table, priority weights, rewrite() synthesis); these are now fully covered by section 3 requirements. Keep ESLint pragmas.
Tests: 59/59 passing.
…logo
Implement the three manager changes captured in the design doc:
1. SoftwareApplication subtype preservation (softwareapplication-subtype-allowed).
- normalizeNode lands WebApplication / MobileApplication / VideoGame at the
canonical #softwareapplication @id but keeps the specific @type.
- mergeNodes adds type promotion: when one merge input is a SA-family
subtype, that subtype wins regardless of source priority. Source priority
still governs scalar field resolution as before.
- This solves the duplicate-primary-entity case (e.g. Acrobat compress-pdf):
team-hardcoded WebApplication and review-block Product->SoftwareApplication
now merge into one canonical WebApplication node.
2. Cross-page WebPage rewrite (webpage-canonical-singleton, policy choice C).
- New rewriteCrossPageRefs() handler runs on every ingested node before
merge: for isPartOf and mainEntityOfPage, any inline WebPage body or
reference stub ending in '#webpage' is rewritten to { @id: <current>#webpage }.
Inline cross-page WebPage bodies are discarded.
- This eliminates the phantom inline WebPage seen in producer markup
(e.g. acrobat WebApplication.isPartOf pointing at /online/#webpage).
- Non-WebPage isPartOf values (CreativeWorkSeries, etc.) pass through.
3. Organization.logo upgrade (organization-default-logo).
- defaultOrg().logo is now an ImageObject pointing at the canonical Adobe
corporate horizontal red SVG, satisfying Google's logo guidelines
(112x112 minimum, ImageObject preferred over bare URL string).
- Fidelity gain over the prior favicon.ico string default.
Tests: 68/68 passing (59 prior + 9 new across subtype preservation, mergeNodes
type promotion, end-to-end WebApplication+Product merge, cross-page rewrite
unit + e2e). Five existing logo assertions updated to expect the ImageObject
shape via a new ADOBE_LOGO_OBJECT test constant.
Lint: clean.
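The cross-page rewrite in item 2 can be sketched as a small pure function. This is a simplified illustration, not the handler in this PR: it only handles single object values (not arrays or array-valued @type), and the `canonicalPageUrl` parameter and `isWebPageRef` helper are assumptions made for the example.

```javascript
// Properties named in the commit message.
const PAGE_REF_PROPS = ['isPartOf', 'mainEntityOfPage'];

// Hypothetical helper: true for an inline WebPage body or a stub whose @id
// ends in '#webpage'.
function isWebPageRef(value) {
  if (!value || typeof value !== 'object' || Array.isArray(value)) return false;
  if (value['@type'] === 'WebPage') return true;
  return typeof value['@id'] === 'string' && value['@id'].endsWith('#webpage');
}

// Sketch: replaces WebPage references with a stub pointing at the current
// page's canonical #webpage id; the inline body is dropped. Other values
// (for example a CreativeWorkSeries) pass through untouched.
function rewriteCrossPageRefs(node, canonicalPageUrl) {
  const out = { ...node };
  for (const prop of PAGE_REF_PROPS) {
    if (isWebPageRef(out[prop])) out[prop] = { '@id': `${canonicalPageUrl}#webpage` };
  }
  return out;
}
```

Returning a shallow copy keeps the input node unmodified, which makes the step easy to unit test; the real handler may mutate in place.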
Introduced by a907365: when mergeNodes promoted @type to a SoftwareApplication subtype (WebApplication / MobileApplication / VideoGame), injectLinks() failed two ways:
1. WebPage.mainEntity was never set, because the byType index keyed on the exact @type and the lookup was 'byType.Article ?? byType.SoftwareApplication'. With @type=WebApplication, byType.SoftwareApplication is undefined.
2. provider / isPartOf weren't auto-injected on the SA-subtype node, because the linksBack rule lookup was 'RULES[node["@type"]]' and RULES has no entry for the subtype.
Fix: introduce effectiveType(t) that maps SA subtypes to 'SoftwareApplication', and apply it in two places:
- byType build: index the node under both its exact @type AND its effective parent (so byType.SoftwareApplication is populated when the node is a subtype)
- linksBack lookup: RULES[effectiveType(node['@type'])] so SA's linksBack rules apply to subtypes
Also extend the WebPage.mainEntity primary-type fallback to include NewsArticle (richresults emits this and it should attach as mainEntity the same way Article does).
Tests: 71/71 passing (68 + 3 new) covering mainEntity for WebApplication, auto-provider on WebApplication, and mainEntity for NewsArticle.
Lint: clean.
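The double-indexing fix can be sketched as follows. `effectiveType` follows the description in this commit; `indexByType` and `pickPrimary` are hypothetical helper names, and the fallback order in `pickPrimary` is an assumption since the commit does not state where NewsArticle sits in the chain.

```javascript
// Subtypes named in the commit message.
const SA_SUBTYPES = ['WebApplication', 'MobileApplication', 'VideoGame'];

// Maps a SoftwareApplication subtype to its parent; other types are unchanged.
function effectiveType(type) {
  return SA_SUBTYPES.includes(type) ? 'SoftwareApplication' : type;
}

// Hypothetical helper: index each node under its exact @type and its effective parent.
function indexByType(nodes) {
  const byType = {};
  for (const node of nodes) {
    const exact = node['@type'];
    byType[exact] = node;
    byType[effectiveType(exact)] = node;
  }
  return byType;
}

// Hypothetical helper; the fallback order here is an assumption.
function pickPrimary(byType) {
  return byType.Article ?? byType.NewsArticle ?? byType.SoftwareApplication;
}
```

With this index, a page whose only primary entity is a WebApplication still resolves through `byType.SoftwareApplication`, which is the lookup that previously returned undefined.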
Add AggregateRating to the canonical graph as its own top-level node:
- New requirement aggregaterating-singleton (error): at most one AggregateRating per page, at the canonical @id '{canonicalPageURL}#aggregaterating'.
- New requirement aggregaterating-extraction (info): inline aggregateRating values on host entities (SoftwareApplication, Article, Product, etc.) are hoisted to the top-level @graph and replaced with { @id } references.
- New section 4.10 AggregateRating: schema.org hierarchy, Google rich-result citations (Software App, Product, Course, Review snippet), manager handling, known producers (review flow).
Implementation:
- Add AggregateRating: { idFragment: '#aggregaterating', singleton: true } to RULES so normalizeNode rewrites the @id.
- Add 'aggregateRating' to ENTITY_PROPS so extractInlineEntities hoists it.
Why singleton: every Adobe.com primary entity that exposes ratings has exactly one canonical rating; multi-producer contributions describe the same product (team-hardcoded snapshot vs. live review-block fetch) and should merge. Source priority resolves freshness: runtime (review block) wins over bootDom (team hardcode), so the freshest counts surface to Google's software-app rich result.
Tests: 73/73 passing (71 + 2 new: extractInlineEntities hoisting, end-to-end merge with bootDom + runtime contributions). One existing end-to-end assertion updated to expect '{ @id }' instead of inline body.
Lint: clean.
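The hoisting behavior described in aggregaterating-extraction might look roughly like this for a single host node. It is a standalone sketch, not the extractInlineEntities() implementation: `hoistAggregateRating` and its return shape are invented for the example, and it assumes a value that is already a bare { @id } stub is left alone.

```javascript
// Sketch: lift an inline aggregateRating off its host into a top-level node at
// the canonical '{canonicalPageURL}#aggregaterating' id, leaving a reference behind.
function hoistAggregateRating(host, canonicalPageUrl) {
  const inline = host.aggregateRating;
  const isObject = inline && typeof inline === 'object' && !Array.isArray(inline);
  // Assumption: a bare { '@id': ... } stub is already a reference and is not hoisted.
  const isStub = isObject && Object.keys(inline).every((key) => key === '@id');
  if (!isObject || isStub) return { host, hoisted: [] };

  const id = `${canonicalPageUrl}#aggregaterating`;
  const rating = { ...inline, '@type': 'AggregateRating', '@id': id };
  return { host: { ...host, aggregateRating: { '@id': id } }, hoisted: [rating] };
}
```

Because every contribution lands at the same canonical id, a hardcoded bootDom rating and a runtime review-block rating end up as two inputs to one merge, where source priority decides which counts survive.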
This pull request is not passing all required checks. Please see this discussion for information on how to get all checks passing. Inconsistent checks can be manually retried. If a test absolutely can not pass for a good reason, please add a comment with an explanation to the PR.
Summary
The JSON-LD Graph Manager is a Milo feature that collects all the JSON-LD on a page and rewrites it as one canonical, linked @graph. This centralization enables the manager to automatically apply JSON-LD graph features that may improve search engine and LLM visibility, such as cross-entity @id linking and singleton enforcement for certain types.

Specification
See libs/utils/json-ld.md.

Testing
You can use the following URL query parameters with any AEM URL:
- milolibs=hgpa-jsonld-graph-manager to load Milo from this branch
- jsonld-graph-manager=true to enable the feature (off by default). This can also be done via page metadata.
- jsonld-graph-manager-debug=true to enable console.debug logging. Remember to add 'Verbose' to Console levels to view.

Example URLs:
Use the following JavaScript snippet to quickly parse available JSON-LD content:
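The snippet itself is not present in this copy of the description. As a stand-in, here is a minimal sketch of what such a console helper could look like; it is not the author's snippet, and `collectJsonLd` is a hypothetical name. It reads every `script[type="application/ld+json"]` element and parses its text.

```javascript
// Sketch: parse every JSON-LD script on the page. In DevTools, call
// collectJsonLd() with no arguments to use the current document.
function collectJsonLd(doc = document) {
  const scripts = doc.querySelectorAll('script[type="application/ld+json"]');
  return Array.from(scripts).map((el) => {
    try {
      return JSON.parse(el.textContent);
    } catch (e) {
      // Keep going on malformed payloads and record what failed.
      return { parseError: e.message };
    }
  });
}
```

With the feature enabled, the result would be expected to contain a single entry holding the managed @graph; with it disabled, one entry per producer script.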