feat(rum): scope latest-metrics RUM to locale path prefix#1707
feat(rum): scope latest-metrics RUM to locale path prefix#1707alinarublea wants to merge 5 commits into
Conversation
Locale-specific sites (e.g. https://example.com/de) have no RUM domain key of their own — keys exist only for the main domain. Today the totalMetrics query resolves the bare hostname and aggregates RUM for the entire parent domain, so locale sites report domain-wide numbers instead of their own subtree. Add result-level path scoping: - spacecat-shared-utils: getBaseURLPathPrefix(baseURL) extracts the locale path (e.g. '/de'), or null at the domain root. - spacecat-shared-rum-api-client: the totalMetrics handler now reads opts.pathPrefix and filters bundles to that subtree before aggregating, with boundary-safe matching ('/de' does not match '/design'). Opt-in: with no pathPrefix the handler behaves exactly as before. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Hey @alinarublea,
⚠ Degraded review - no spec document was found for this change (searched the PR links, the touched repos' docs, the architecture/guidelines docs, and linked Jira). This review covers code-level quality but could not validate the change against an agreed design, so confidence is reduced. Add a spec link (PR template section 4) and re-request review for a full-confidence pass.
Verdict: Approve - clean, well-scoped feature with solid test coverage.
Complexity: LOW - small diff, single package surface.
Changes: Adds opt-in locale path-prefix filtering to the totalMetrics RUM handler plus a getBaseURLPathPrefix utility for extracting the path prefix from a site's baseURL (6 files).
Non-blocking (3): minor issues and suggestions
- nit:
matchesPathPrefixdoes not normalizeprefix- if a caller passes/de/(trailing slash), the boundary checkpath.startsWith('/de//')silently fails to match nested paths. A defensiveprefix.replace(/\/+$/, '')would make this robust against future callers that skip normalization -packages/spacecat-shared-rum-api-client/src/functions/total-metrics.js:22 - nit: Silent catch blocks in both
matchesPathPrefixandgetBaseURLPathPrefixprovide no observability signal when bundles or URLs are dropped due to parse failure. For a data-aggregation pipeline, systematically malformed URLs would silently degrade metrics with no alert -packages/spacecat-shared-rum-api-client/src/functions/total-metrics.js:23 - suggestion: Add a test for empty-string
pathPrefixto lock in the falsy-means-no-filter contract, since''could arrive from upstream coercion of a null return -packages/spacecat-shared-rum-api-client/test/total-metrics.test.js:79
Skill: pr-review | Model: us.anthropic.claude-opus-4-6-v1[1m] | Duration: 8m 37s | Cost: $3.58 | Commit: 851176e272acc7e92d556b4533e2a1a80791e94a
If this code review was useful, please react with 👍. Otherwise, react with 👎.
|
This PR will trigger a minor release when merged. |
getBaseURLPathPrefix returned the full path for any non-root baseURL, so a site registered at a landing page (e.g. https://example.com/us/en.html) yielded pathPrefix=/us/en.html, which matches no RUM bundles and scopes metrics to ~0 (verified: a 700 PV/30d site dropped to 0). The locale directory cannot be reliably inferred from a file path — 'en.html' is a locale but 'home.html' in '/en/home.html' is a page, and the two are syntactically identical. So when the last path segment has an extension, return null and fall back to whole-domain metrics (today's behavior) rather than over-filtering. Clean locale directories like /jp and /en-gb still filter normally. Verified against live RUM: www.adobe.com 242.3M PV -> /jp scoped 9.49M (3.92%). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…g' into feature/rum-locale-path-filtering
1.121.0 is already published by an unrelated change and does NOT contain getBaseURLPathPrefix; the helper lands in the next utils release after adobe/spacecat-shared#1707 merges (expected 1.122.0). Still a prediction — reconcile both shared pins against the actually-published versions once #1707 releases, then take this PR out of draft. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Hey @alinarublea,
⚠ Degraded review - no spec document was found for this change (searched the PR links, the touched repos' docs, the architecture/guidelines docs, and linked Jira). This review covers code-level quality but could not validate the change against an agreed design, so confidence is reduced. Add a spec link (PR template section 4) and re-request review for a full-confidence pass.
Verdict: Approve - focused improvement with solid test coverage, no blocking issues.
Complexity: MEDIUM - 165 lines across 6 files in two packages.
Changes: Adds opt-in locale path-prefix filtering to totalMetrics and a getBaseURLPathPrefix utility with file-extension detection to avoid over-filtering on file-like URLs (6 files).
Non-blocking (3): minor issues and suggestions
- nit: The file-extension regex
/\.[a-z0-9]+$/imatches single-char extensions, so a versioned path like/v1.2has last segmentv1.2where.2matches - returning null instead of/v1.2. Unlikely for locale baseURLs but worth a test documenting the intended behavior -packages/spacecat-shared-utils/src/url-helpers.js:329 - suggestion: Add a test for empty-string
pathPrefix(handler(bundles, { pathPrefix: '' })) to lock in the falsy-means-no-filter contract against future refactors -packages/spacecat-shared-rum-api-client/test/total-metrics.test.js:79 - suggestion: Add a test confirming query strings and fragments in baseURL are ignored (e.g.
https://example.com/de?lang=dereturns/de) to guard the URL-constructor behavior against regressions -packages/spacecat-shared-utils/test/url-helpers.test.js:737
Skill: pr-review | Model: us.anthropic.claude-opus-4-6-v1[1m] | Duration: 1m 45s | Cost: $3.26 | Commit: f89c44dbaf9d831c1bdbd8653141dbe4579d37a4
If this code review was useful, please react with 👍. Otherwise, react with 👎.
Resolve package.json by keeping the currently-published shared dependency versions (spacecat-shared-utils 1.119.2, rum-api-client 2.40.13) rather than the unpublished 1.122.0/2.44.0 — so this PR builds and passes CI on existing deps without waiting on the spacecat-shared release. Because published spacecat-shared-utils does not yet contain getBaseURLPathPrefix, inline a local copy in src/support/utils.js (with the file-path fallback) and import it there instead of from the shared package. Replace with the shared import once adobe/spacecat-shared#1707 publishes and the dependency is bumped. Note: result-level bundle filtering lives in rum-api-client (the #1707 change), so on the current published rum-api-client, passing pathPrefix is a forward-compatible no-op; it activates once that dependency is bumped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Why
Locale-specific sites whose locale lives in the URL path (e.g.
https://example.com/de) have no RUM domain key of their own — in the RUM bundler, a domain key is issued per hostname, and a path-locale shares the apex hostname. TodaywwwUrlResolverstrips the path and thetotalMetricsquery aggregates RUM for the entire apex domain, so a path-locale site'slatest-metricsreports domain-wide numbers instead of its own subtree.What
Result-level path scoping for
totalMetrics, opt-in via a newpathPrefixoption:spacecat-shared-utils— newgetBaseURLPathPrefix(baseURL)helper: extracts the locale path (https://example.com/de→/de), returnsnullat the domain root, for a file-pointing baseURL (see below), or for unparseable input.spacecat-shared-rum-api-client— thetotalMetricshandler now readsopts.pathPrefixand filters bundles to that subtree before aggregating. Matching is boundary-safe:/dematches/deand/de/...but not/design.With no
pathPrefix, behavior is byte-for-byte identical to today — no existing caller is affected.Verified against live RUM data
Pulled real bundles with the success-studio RUM admin key:
/jpwww.adobe.comBundle URLs confirmed carrying locale paths (
/jp,/de,/uk,/in,/br…), so path filtering selects the correct subset. The www-vs-bare wrinkle (adobe.comreturns 0 PV,www.adobe.comreturns the data) is already handled by the existingresolveWwwUrl, which probeshasBundleDataand resolves to the variant that has data —pathPrefixrides alongside the resolveddomain.Page-path baseURLs (the second commit)
Some sites register their baseURL as a landing page, not a directory — e.g.
https://www2.deloitte.com/us/en.html. Returning the full path as the prefix (/us/en.html) matches no bundles and scopes RUM to ~0 (a 700 PV/30d site dropped to 0 in testing).The locale directory cannot be reliably inferred from a file path: in
/us/en.htmltheenis a locale, but in/en/home.htmlthehomeis a page — the two are syntactically identical. So when the last path segment has an extension,getBaseURLPathPrefixreturnsnulland the caller falls back to whole-domain metrics (today's behavior) rather than over-filtering. Clean locale directories (/jp,/en-gb) still filter normally.Why only path-locales, not subdomains
Subdomain-based locales (
de.example.com,business.adobe.com) are a different hostname, so the RUM bundler issues them their own domain key and their own bundles — confirmed live (business.adobe.com→key: YES, 4.59M PV of its own). They already resolve and scope correctly with no filtering, andgetBaseURLPathPrefixreturnsnullfor them. No change needed; no regression. This PR is specifically the path-locale fix.How it fits together
domainstays the bare hostname (required by the bundler URL and for domain-key resolution on the apex). The locale scoping happens purely at the result level. The consuming side —spacecat-api-service'sgetLatestSiteMetricsextracting the prefix fromsite.getBaseURL()and passing it through — lands in a follow-up PR (this shared release must publish first; see companion below).The prefix-matching logic is intentionally kept self-contained in
rum-api-clientrather than importing the new util, to avoid a cross-package version bump (rum-api-clientpins an exactspacecat-shared-utilsversion).Companion PR
Tests
getBaseURLPathPrefix: root → null, trailing slash, single/multi-segment, schema-less, unparseable, null/undefined, file-pointing paths → null (/us/en.html,/en/home.html,/index.php), extensionless deep path retained, hyphenated-locale not treated as a file.url-helpers.jsat 100% coverage.totalMetrics: aggregates all withoutpathPrefix(regression guard); scopes to subtree with it; rejects look-alike sibling paths; drops bundles with missing/invalid URLs.🤖 Generated with Claude Code