Skip to content

fix(db): nested toArray includes drop children when sibling groups share a correlation key (#1501)#1607

Open
kevin-dp wants to merge 7 commits into
TanStack:mainfrom
kevin-dp:repro/1501-nested-toarray-shared-buffer
Open

fix(db): nested toArray includes drop children when sibling groups share a correlation key (#1501)#1607
kevin-dp wants to merge 7 commits into
TanStack:mainfrom
kevin-dp:repro/1501-nested-toarray-shared-buffer

Conversation

@kevin-dp

@kevin-dp kevin-dp commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Fixes #1501.

The bug

With three (or more) levels of nested toArray includes (e.g. products → priceRanges → region), when two children in different parent groups share the same deepest correlation key, only one of them receives the nested rows — the other comes back as an empty array:

T-Shirt → priceRange 1 (regionId 1) → region: []          ❌  expected [Europe]
T-Shirt → priceRange 2 (regionId 2) → region: [North America]   ✓
Hoodie  → priceRange 3 (regionId 1) → region: [Europe]          ✓

Root cause

The nested pipeline writes results into a buffer shared across per-parent-group states, and routing assumes one parent per nested correlation key. Two compounding issues:

  1. 1:1 routing + destructive drain. nestedRoutingIndex mapped each nested correlation key to a single parent group (last-writer-wins), and drainNestedBuffers deleted the buffer entry after routing to the first match — so a sibling group sharing the key was dropped.
  2. No re-emit for late arrivals. The nested pipeline does not re-emit already-materialized rows, so a parent group that starts referencing an existing correlation key after the rows were drained (e.g. a sibling inserted after the initial load) saw nothing.

Fix

  • nestedRoutingIndex now maps a nested correlation key to a Set of parent groups; drainNestedBuffers fans buffered grandchild changes out to every ready parent group before dropping the buffer entry. Routing-index inserts/deletes and parent-delete cleanup maintain the per-key parent sets.
  • A per-level cumulative snapshot of net-present grandchild rows seeds late-arriving parent groups with the rows their siblings already received.

Tests

Three tests under nested toArray includes (depth 3+) cover initial load, a sibling inserted after load, and deleting one of two siblings that share a correlation key. Full tests/query/ suite (1596 tests) passes.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes
    • Fixed nested toArray and live nested includes behavior when sibling parent groups share the same deepest correlation key, preventing dropped or mis-associated child results.
    • Improved fan-out routing so nested updates are applied to all relevant parent groups, including correct seeding for newly-associated parents without breaking sibling isolation during inserts and deletes.
  • Tests
    • Added regression coverage for multi-level nested includes (region under priceRanges under products) for both live and materialized collection modes, including preload, post-load insert, updates, and multiple delete scenarios.

…#1501)

Adds a failing test reproducing TanStack#1501: with three collection levels
(products -> priceRanges -> region), when two priceRanges in different
parent groups share the same deepest correlation key (regionId === 1),
one of the two nested `region` arrays comes back empty.

The nested pipeline buffer is shared by reference across per-parent-group
states (createPerEntryIncludesStates) and drainNestedBuffers deletes a
buffer entry after routing it to the first matching parent group, so the
sibling that drains second finds nothing.

Note: the minimal repro in the issue does not trigger the bug as written
(its dummy `eq(p.id, _.id)` correlation against a single-row anchor with
findOne collapses to one product, so the two overlapping siblings never
coexist in the output). This test puts both sibling groups in the result
so the collision actually occurs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Nested include routing now tracks multiple parents per deepest correlation key, replays drained nested rows from snapshots, and adds regression coverage for shared-key three-level includes across preload, insert, delete, live collection, and materialized variants.

Changes

Nested toArray fan-out and snapshotting

Layer / File(s) Summary
Nested snapshot state
packages/db/src/query/live/collection-config-builder.ts
Adds snapshot row tracking, snapshot storage on nested include setup, parent-set routing indexes, and snapshot initialization for each nested pipeline.
Routing and snapshot replay
packages/db/src/query/live/collection-config-builder.ts
Adds helpers to accumulate drained nested rows into snapshots, seed late parent groups, fan buffered rows out to all matching parents, and preserve shared nested routing entries during insert/update/delete handling.
Nested includes regression coverage
.changeset/nested-toarray-shared-buffer-overlap.md, packages/db/tests/query/includes.test.ts
Updates the changeset note and adds regression tests for preload, late insertion, deletion, live collection, and materialized nested includes sharing the same deepest regionId.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • samwillis

Poem

🐇 Two parents hop the same key trail,
Snapshots keep the buffered tale.
Regions bloom for each small nest,
Tests make sure the paths behave best.
Hop-hop, the shared key stays in view,
And every sibling gets its due.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 77.78% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main fix for nested toArray includes dropping children with overlapping correlation keys.
Description check ✅ Passed The description explains the bug, root cause, fix, and tests, though it omits the repo's checklist and release-impact sections.
Linked Issues check ✅ Passed The changes address the linked bug by fixing nested routing fan-out, late-arriving siblings, and sibling deletion behavior.
Out of Scope Changes check ✅ Passed The PR stays focused on the nested include routing bug, associated tests, and the changeset with no clear unrelated additions.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 OpenGrep (1.23.0)
packages/db/src/query/live/collection-config-builder.ts

┌──────────────┐
│ Opengrep CLI │
└──────────────┘

�[32m✔�[39m �[1mOpengrep OSS�[0m
�[32m✔�[39m Basic security coverage for first-party code vulnerabilities.

[00.10][ERROR]: unable to find a config; path .coderabbit-opengrep-fallback.yml does not exist

packages/db/tests/query/includes.test.ts

┌──────────────┐
│ Opengrep CLI │
└──────────────┘

�[32m✔�[39m �[1mOpengrep OSS�[0m
�[32m✔�[39m Basic security coverage for first-party code vulnerabilities.

[00.10][ERROR]: unable to find a config; path .coderabbit-opengrep-fallback.yml does not exist


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@pkg-pr-new

pkg-pr-new Bot commented Jun 22, 2026

Copy link
Copy Markdown
More templates

@tanstack/angular-db

npm i https://pkg.pr.new/@tanstack/angular-db@1607

@tanstack/browser-db-sqlite-persistence

npm i https://pkg.pr.new/@tanstack/browser-db-sqlite-persistence@1607

@tanstack/capacitor-db-sqlite-persistence

npm i https://pkg.pr.new/@tanstack/capacitor-db-sqlite-persistence@1607

@tanstack/cloudflare-durable-objects-db-sqlite-persistence

npm i https://pkg.pr.new/@tanstack/cloudflare-durable-objects-db-sqlite-persistence@1607

@tanstack/db

npm i https://pkg.pr.new/@tanstack/db@1607

@tanstack/db-ivm

npm i https://pkg.pr.new/@tanstack/db-ivm@1607

@tanstack/db-sqlite-persistence-core

npm i https://pkg.pr.new/@tanstack/db-sqlite-persistence-core@1607

@tanstack/electric-db-collection

npm i https://pkg.pr.new/@tanstack/electric-db-collection@1607

@tanstack/electron-db-sqlite-persistence

npm i https://pkg.pr.new/@tanstack/electron-db-sqlite-persistence@1607

@tanstack/expo-db-sqlite-persistence

npm i https://pkg.pr.new/@tanstack/expo-db-sqlite-persistence@1607

@tanstack/node-db-sqlite-persistence

npm i https://pkg.pr.new/@tanstack/node-db-sqlite-persistence@1607

@tanstack/offline-transactions

npm i https://pkg.pr.new/@tanstack/offline-transactions@1607

@tanstack/powersync-db-collection

npm i https://pkg.pr.new/@tanstack/powersync-db-collection@1607

@tanstack/query-db-collection

npm i https://pkg.pr.new/@tanstack/query-db-collection@1607

@tanstack/react-db

npm i https://pkg.pr.new/@tanstack/react-db@1607

@tanstack/react-native-db-sqlite-persistence

npm i https://pkg.pr.new/@tanstack/react-native-db-sqlite-persistence@1607

@tanstack/rxdb-db-collection

npm i https://pkg.pr.new/@tanstack/rxdb-db-collection@1607

@tanstack/solid-db

npm i https://pkg.pr.new/@tanstack/solid-db@1607

@tanstack/svelte-db

npm i https://pkg.pr.new/@tanstack/svelte-db@1607

@tanstack/tauri-db-sqlite-persistence

npm i https://pkg.pr.new/@tanstack/tauri-db-sqlite-persistence@1607

@tanstack/trailbase-db-collection

npm i https://pkg.pr.new/@tanstack/trailbase-db-collection@1607

@tanstack/vue-db

npm i https://pkg.pr.new/@tanstack/vue-db@1607

commit: ea83de9

kevin-dp and others added 2 commits June 24, 2026 09:52
…ation key (TanStack#1501)

With 3+ levels of nested toArray includes, when two children in different
parent groups shared the same deepest correlation key, only one received the
nested rows and the other came back empty.

Two compounding causes:
- nestedRoutingIndex mapped each nested correlation key to a single parent
  group (last-writer-wins), and the shared buffer entry was deleted after
  routing to the first match, so sibling groups sharing the key were dropped.
- the nested pipeline does not re-emit already-materialized rows, so a parent
  group that starts referencing an existing correlation key after the rows
  were drained (e.g. a sibling inserted after the initial load) saw nothing.

Fixes:
- nestedRoutingIndex now maps a nested correlation key to a Set of parent
  groups; drainNestedBuffers fans buffered grandchild changes out to every
  ready parent group before dropping the buffer entry.
- a per-level cumulative snapshot of net-present grandchild rows seeds
  late-arriving parent groups from what their siblings already received.

Routing-index inserts/deletes and parent-delete cleanup are updated to
maintain the per-key parent sets.

Adds tests covering initial load, a sibling inserted after load, and deleting
one of two siblings that share a correlation key.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@kevin-dp kevin-dp changed the title test(db): failing repro for nested toArray dropped children (#1501) fix(db): nested toArray includes drop children when sibling groups share a correlation key (#1501) Jun 24, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
packages/db/tests/query/includes.test.ts (1)

5078-5079: 🩺 Stability & Availability | 🔵 Trivial

Replace the fixed 50ms sleeps with a deterministic wait helper. flushPromises() is already available in the test utilities, and the same change should be applied to the matching delete case at line 5160.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/db/tests/query/includes.test.ts` around lines 5078 - 5079, The tests
in `includes.test.ts` use fixed 50ms sleeps after `priceRanges.insert(...)`,
which should be replaced with the deterministic `flushPromises()` helper already
available in the test utilities. Update the insert case and the matching delete
case to await `flushPromises()` instead of `setTimeout`, keeping the same
behavior while removing timing flakiness.

Source: Coding guidelines

packages/db/src/query/live/collection-config-builder.ts (1)

1153-1158: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Avoid adding new any to snapshot state.

SnapshotRow.value and the new helper signature can use unknown here; the code only stores/replays the value and does not need unchecked access. As per coding guidelines, **/*.{ts,tsx}: Avoid using any types; use unknown instead when the type is truly unknown.

♻️ Proposed type tightening
 type SnapshotRow = {
-  value: any
+  value: unknown
   orderByIndex: string | undefined
   /** Net multiplicity (inserts − deletes) currently materialized for this row */
   count: number
 }
@@
 function accumulateSnapshot(
   setup: NestedIncludesSetup,
   nestedCorrelationKey: unknown,
-  childChanges: Map<unknown, Changes<any>>,
+  childChanges: Map<unknown, Changes<unknown>>,
 ): void {

Also applies to: 1373-1377

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/db/src/query/live/collection-config-builder.ts` around lines 1153 -
1158, The snapshot state is introducing an unnecessary any type in
SnapshotRow.value and the related helper signature, even though the value is
only stored and replayed. Tighten the types in collection-config-builder by
changing SnapshotRow and the affected helper used around the snapshot logic to
use unknown instead of any, and keep the existing handling unchanged except for
type-safe narrowing where needed.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/db/src/query/live/collection-config-builder.ts`:
- Around line 1387-1399: Clone snapshot values before assigning them into row
state in collection-config-builder’s snapshot/replay logic, because row.value is
currently sharing the same object as changes.value and later mutation of
changes.value can strip routing metadata. Update the row initialization and
replay paths in the affected update/merge flow (including the shared logic
around the count/orderByIndex handling) to store a cloned copy instead of the
original object, so later cleanup cannot affect previously buffered rows and
nested routing can be rebuilt correctly.

---

Nitpick comments:
In `@packages/db/src/query/live/collection-config-builder.ts`:
- Around line 1153-1158: The snapshot state is introducing an unnecessary any
type in SnapshotRow.value and the related helper signature, even though the
value is only stored and replayed. Tighten the types in
collection-config-builder by changing SnapshotRow and the affected helper used
around the snapshot logic to use unknown instead of any, and keep the existing
handling unchanged except for type-safe narrowing where needed.

In `@packages/db/tests/query/includes.test.ts`:
- Around line 5078-5079: The tests in `includes.test.ts` use fixed 50ms sleeps
after `priceRanges.insert(...)`, which should be replaced with the deterministic
`flushPromises()` helper already available in the test utilities. Update the
insert case and the matching delete case to await `flushPromises()` instead of
`setTimeout`, keeping the same behavior while removing timing flakiness.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ace73dfd-754c-45cb-8106-836a50331bd6

📥 Commits

Reviewing files that changed from the base of the PR and between b7229fc and 5d5259e.

📒 Files selected for processing (3)
  • .changeset/nested-toarray-shared-buffer-overlap.md
  • packages/db/src/query/live/collection-config-builder.ts
  • packages/db/tests/query/includes.test.ts
✅ Files skipped from review due to trivial changes (1)
  • .changeset/nested-toarray-shared-buffer-overlap.md

Comment on lines +1387 to +1399
row = {
value: changes.value,
orderByIndex: changes.orderByIndex,
count: 0,
}
snap.set(childKey, row)
}
row.count += changes.inserts - changes.deletes
if (changes.inserts > 0) {
row.value = changes.value
if (changes.orderByIndex !== undefined) {
row.orderByIndex = changes.orderByIndex
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Clone snapshot values before storing or replaying them.

row.value currently aliases changes.value; later cleanup deletes INCLUDES_ROUTING from changes.value at Line 1984. A late replay can then seed a row without routing metadata, preventing deeper nested routing from being rebuilt via Lines 1544-1547.

🐛 Proposed fix
+function cloneSnapshotValue<T>(value: T): T {
+  if (value == null || typeof value !== `object`) return value
+  if (Array.isArray(value)) return [...value] as T
+  return { ...(value as Record<PropertyKey, unknown>) } as T
+}
+
 function accumulateSnapshot(
   setup: NestedIncludesSetup,
   nestedCorrelationKey: unknown,
   childChanges: Map<unknown, Changes<any>>,
@@
       row = {
-        value: changes.value,
+        value: cloneSnapshotValue(changes.value),
         orderByIndex: changes.orderByIndex,
         count: 0,
       }
@@
     if (changes.inserts > 0) {
-      row.value = changes.value
+      row.value = cloneSnapshotValue(changes.value)
       if (changes.orderByIndex !== undefined) {
         row.orderByIndex = changes.orderByIndex
       }
@@
     byChild.set(childKey, {
       deletes: 0,
       inserts: row.count,
-      value: row.value,
+      value: cloneSnapshotValue(row.value),
       orderByIndex: row.orderByIndex,
     })

Also applies to: 1436-1443

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/db/src/query/live/collection-config-builder.ts` around lines 1387 -
1399, Clone snapshot values before assigning them into row state in
collection-config-builder’s snapshot/replay logic, because row.value is
currently sharing the same object as changes.value and later mutation of
changes.value can strip routing metadata. Update the row initialization and
replay paths in the affected update/merge flow (including the shared logic
around the count/orderByIndex handling) to store a cloned copy instead of the
original object, so later cleanup cannot affect previously buffered rows and
nested routing can be rebuilt correctly.

@kevin-dp

Copy link
Copy Markdown
Contributor Author

Here's a complete explanation of the bugs and the fix. This PR implements the snapshot accumulation fix but it means we store each distinct nested row in that snapshot. The explanation below also documents a potential alternative.

The example

products → priceRanges → region

For each product you want its price ranges, and for each price range its region. Three levels deep. Our problem data:

  • product 1 (T-Shirt) has priceRange 1 → regionId 1 (Europe)
  • product 2 (Hoodie) has priceRange 3 → regionId 1 (Europe) ← same region!

How includes actually compute (the important part)

You might imagine the engine runs the "region" query fresh for each price range. It doesn't. That would be slow and wouldn't update reactively. Instead:

There is ONE shared "region" pipeline. It runs once over all regions and emits region rows tagged with a correlation key — the value that says "who do I belong to". For region, that key is regionId. So the region pipeline emits one thing:

regionId 1 → { Europe }

Just one entry. Even though two price ranges want region 1, the pipeline produces it once, keyed by regionId 1.

Results land in a "buffer" — a temporary mailbox: regionId → the region rows. Think of it as "here are freshly-computed region rows, waiting to be delivered."

Then a "drain" step delivers buffered rows to the right parents. This is the routing step. To deliver region 1 to the correct price range, the engine keeps a routing index:

routingIndex:  regionId 1  →  "which price-range group does this belong to?"

Each price-range group has its own private copy of region results (its own little child collection), so the final tree can show each one its own region array. The drain reads the buffer, looks up the parent in the routing index, copies the rows into that parent's private copy, and then deletes the buffer entry (it's been delivered — don't deliver twice).

Why it broke

The whole design quietly assumed one nested key belongs to one parent. Two things follow from that assumption, and both are wrong when two siblings share regionId 1:

Bug 1 — the routing index only remembered one parent.

routingIndex.set(regionId 1, priceRange1)   // T-Shirt's
routingIndex.set(regionId 1, priceRange3)   // Hoodie's — OVERWRITES

It's a plain key→value map, so the second write clobbers the first. Now regionId 1 points to only one group. Then drain delivers region 1 to that one group and deletes the buffer entry. The other group's lookup finds nothing → its region comes back []. That's the empty array you saw.

Bug 2 — late arrivals get nothing.

Say T-Shirt's priceRange exists at load, but Hoodie's priceRange 3 is inserted later. By then the region pipeline already emitted region 1 (at load), it was drained, and the buffer entry was deleted. When priceRange 3 shows up, the pipeline does not re-emit region 1 — from its point of view nothing about regions changed; it already produced that row. So the buffer is empty and there's nothing to deliver to Hoodie. Empty array again.

Both of these were confirmed with debug logging before writing the fix.

The fix — two matching pieces

Fix 1: let one nested key map to many parents

The routing index value changes from a single parent to a Set of parents:

routingIndex:  regionId 1  →  { priceRange1, priceRange3 }
  • Adding a parent now adds to the set instead of overwriting (the updateRoutingIndex insert change).
  • Removing a parent removes just that one from the set, leaving the others (the delete change and cleanRoutingIndexOnDelete). This is what keeps the deletion case working — deleting Hoodie mustn't wipe regionId 1 while T-Shirt still uses it.
  • Drain now loops over every parent in the set and copies the rows into each one's private copy, and only then deletes the buffer entry. That's the for (const parentCorrelationKey of parentCorrelationKeys) loop. This fixes the initial-load case.

Fix 2: remember delivered rows so late arrivals can be seeded

Because the pipeline won't re-emit region 1 for a price range added later, the engine needs its own memory of "what region rows currently exist for regionId 1". That's the new snapshot:

snapshot:  regionId 1  →  { Europe }   (kept around, not deleted on drain)
  • Every time something is drained, it's also folded into the snapshot (accumulateSnapshot). The snapshot tracks a running count so that if a region is later deleted, it drops out of the snapshot too — it stays an accurate picture of "what currently exists", not a pile of stale rows.
  • When a new parent group first references an existing key (isNewParent in updateRoutingIndex), we seed it straight from the snapshot (seedParentFromSnapshot) — hand it the rows its siblings already have, without waiting on the pipeline. This fixes the insert-after-load case.

One-line summary

The nested-includes system assumed each nested key had exactly one parent and treated delivered rows as throwaway. The fix makes a nested key able to feed multiple parents (Set instead of single value + fan-out on drain), and keeps a snapshot of currently-existing rows so parents that show up late can be handed the same data their siblings already received.


Is the snapshot a memory problem?

Partly a fair concern, but it's bounded, not unbounded:

  • It stores one entry per distinct (nestedKey, childKey) — i.e. one copy of each distinct nested row, not one per parent. Siblings sharing a key share the single snapshot entry. So its size tracks the distinct nested result set, not parent count.
  • It holds references to the same row value objects already materialized elsewhere, not deep copies. The extra cost is Map overhead + one slot per distinct row.
  • It shrinks: accumulateSnapshot tracks a net count and deletes rows (and empty keys) when the count hits 0 on delete. So it's not an ever-growing log — it stays proportional to currently-live data.

So it's not a leak and not multiplied by parents. But it is a real added cost: roughly O(distinct nested rows) extra — essentially one more index over the deepest level, for the lifetime of the live query. For a query with a very large nested collection, that's non-trivial memory.


The alternative: copy from an existing sibling instead of keeping a snapshot

The key realization

When does a late-arriving parent need seeding at all? Only when the region rows were already delivered and the buffer entry deleted in an earlier flush, and the pipeline won't re-emit them. But "already delivered to whom?" — to a sibling. So whenever seeding is needed, at least one sibling already has the rows materialized in its own private copy.

That's the insight the alternative exploits: you don't need a separate stash of the data, because a sibling is already holding a perfectly good copy of it. Go ask the sibling.

How "copy from sibling" works

Walk through Hoodie's priceRange 3 arriving late, referencing regionId 1:

  1. Look up the routing index (which is now a Set thanks to Fix 1):

    regionId 1 → { priceRange1, priceRange3 }
    

    priceRange3 is the newcomer; priceRange1 is an existing sibling.

  2. Find a sibling that already has the data. priceRange1's private region state has a childRegistry keyed by regionId, and childRegistry[regionId 1] is a little collection already holding { Europe }.

  3. Copy those rows into priceRange3's pending changes, then let the normal materialize step run. priceRange3 now shows [Europe].

So instead of consulting a snapshot, step 2 reaches into a sibling's already-materialized collection.

What it costs you in code

The snapshot stored exactly what the delivery machinery needs: for each row, its childKey, its value, and its orderByIndex. A live child collection only stores the rows themselves. To rebuild "changes" from it you have to look those extra bits back up:

  • the sibling state's resultKeys map (row → childKey),
  • its orderByIndices map (row → orderByIndex),

then iterate the sibling's collection and reconstruct insert-changes from each row. It's the same idea as seedParentFromSnapshot, just reading from a collection + two side-maps instead of from one tidy snapshot Map.

You also need a small guard: "pick a sibling that is actually populated right now." The routing-index Set gives you candidates, but you'd skip the newcomer itself and any sibling that happens to be empty, and copy from the first good one. (As argued above, a good one is essentially guaranteed to exist whenever seeding is needed — if none did, the buffer path would still be handling it.)


The two approaches compared

Snapshot (current fix) Copy-from-sibling
Standing memory one extra index over distinct nested rows, kept for the query's life none — reuses data already in siblings
Work at steady state a little on every drain (fold into snapshot) none
Work on a late insert cheap: read tidy snapshot a bit more: find a ready sibling, rebuild changes from its collection + side-maps
Code complexity self-contained, one clean structure more fiddly: sibling selection + reconstructing changes
Correctness risk must keep the snapshot's net-count in sync with deletes/updates sibling is always the live truth, so no parallel structure to drift — but "is a sibling ready?" is a subtler question

How to choose

The snapshot trades memory for simplicity and predictability: one clean structure, consulted in one place, no "go find a healthy donor" logic. Copy-from-sibling trades that memory away but pays in code complexity and a reconstruction step on each late insert.

For typical includes (modest nested result sets) the snapshot's overhead is negligible and the simpler code wins. If you expect very large nested collections where doubling an index matters, copy-from-sibling is the better fit — and it's a localized swap: only accumulateSnapshot + seedParentFromSnapshot change; the Set-based routing fan-out (Fix 1) stays exactly as-is either way.

…lize includes

The nested-includes routing fix is independent of how each level is
materialized. Add regression tests proving sibling parent groups that share a
deepest correlation key resolve their grandchildren when the nested levels are
left as live Collections (no wrapper) and when wrapped with materialize(),
mirroring the existing toArray coverage.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/db/tests/query/includes.test.ts`:
- Around line 5171-5261: The current live/materialize regression tests only
verify preload behavior; extend the existing nested-collection coverage in the
relevant `it(...)` cases to assert post-load mutations as well. Use the same
`createLiveQueryCollection`, `toTree`, and nested `q.from(...)` setup to add
insert/delete assertions for the live and materialized variants, and include
async stale-order/race scenarios that exercise the same shared-correlation-key
path after initial load. Keep the new assertions anchored to the existing
collection setup so the bug is reproduced across both materialization modes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e827313e-ce5c-41ab-a147-be0bec3dfe49

📥 Commits

Reviewing files that changed from the base of the PR and between 5d5259e and 7bd39a8.

📒 Files selected for processing (1)
  • packages/db/tests/query/includes.test.ts

Comment thread packages/db/tests/query/includes.test.ts
@kevin-dp

Copy link
Copy Markdown
Contributor Author

Had a quick discussion with @samwillis on the 2 alternatives. We decided to go for correctness and simplicity so we pick the snapshot approach.

@kevin-dp kevin-dp requested a review from samwillis June 25, 2026 09:32
Add a post-load insert assertion to the shared-correlation-key
materialize() regression test: inserting a sibling group that references
an already-materialized correlation key must be seeded via the cumulative
snapshot without disturbing the existing group's nested rows.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@samwillis samwillis left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found one remaining routing hole in the new Set<parentCorrelationKey> approach.

A single parent correlation key can be backed by multiple child rows that share the same nested correlation key. The new route set dedupes those rows down to one parent entry, so when one of the child rows is deleted the delete path removes the whole parent route even though another child row in that same parent group still needs it. After that, later grandchild updates/deletes for the shared nested key no longer reach the surviving child.

The problematic path is around packages/db/src/query/live/collection-config-builder.ts:1591, where the delete path does parents.delete(correlationKey) without knowing whether another child row in that parent group still references the same nested key.

I pushed a minimal failing repro here:

The repro adds one product with two price ranges sharing regionId: 1, deletes one price range, then updates region 1. The surviving price range still shows Europe instead of Renamed Europe.

Targeted command used:

PATH=/Users/samwillis/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH pnpm exec vitest --run tests/query/includes.test.ts -t 'keeps routing when one of multiple same-parent siblings sharing a nested key is deleted'

This likely needs per-parent route multiplicity/refcounting, or tracking routes at the child-key level, so deleting one sibling does not remove routing for the surviving sibling.

The Set<parentCorrelationKey> routing index introduced by the previous
commit collapses multiple child rows in the same parent group that share
a nested correlation key into a single route entry. Deleting one such
sibling emptied the entry and dropped the whole route, so the surviving
sibling stopped receiving grandchild changes (reported by @samwillis).

Track the referencing child keys per (nestedKey, parentGroup) so the
parent route is only dropped once its last referencing child row is gone.

Also fix a routing hole this exposed: an update that changes a child
row's nested correlation key (e.g. a price range's regionId) only carries
the new key, so the row's stale reference under the old key was never
released — a later sibling delete then mis-routed grandchild changes. A
per-setup childKey -> nestedKey map records each row's current nested key
so updates can release the old reference, scoped per nested setup so a
change to one nested include never disturbs another on the same child.

Tests cover: same-parent siblings sharing a key with one deleted, an
update that changes the nested key followed by a sibling delete, and
isolation between two nested includes on the same child row.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@kevin-dp

Copy link
Copy Markdown
Contributor Author

Great catch @samwillis — confirmed and fixed in 33bfcaa.

I reproduced your repro against the branch and it failed exactly as you described: deleting one of two same-parent siblings that share a nested correlation key dropped the route for the survivor, so later grandchild updates never reached it. Root cause was exactly what you identified at collection-config-builder.ts:1591 — the Set<parentCorrelationKey> route had no notion of how many child rows in a parent group referenced a nested key, so the first sibling delete emptied the entry and dropped the whole route.

Fix: the routing index now tracks the referencing child keys per (nestedKey, parentGroup)Map<nestedKey, Map<parentCorrelationKey, Set<childKey>>> — and a parent group is only dropped from a route once its last referencing child row is removed. Extracted a removeChildKeyFromRoute helper so the delete path and parent-delete cleanup share one consistent cascade.

While verifying, I found a second routing hole your report exposed: an update that changes a child row's nested correlation key (e.g. a price range's regionId changes) only carries the new key, so the row's stale reference under the old key was never released. With the new refcounting that stale child key kept the old route alive, and a later sibling delete then mis-routed grandchild changes (the updated row lost its grandchildren). Fixed with a per-setup childKey → nestedKey map that releases the previous reference on update — scoped per nested setup so a change to one nested include can't disturb another include on the same child row.

New regression tests cover all three: your same-parent-sibling-delete case, the update-then-sibling-delete case, and cross-include isolation. Full tests/query/ suite passes (1601), no type errors, lint clean.

Separately — and not introduced by this PR — I noticed that an atomic mutual swap (two same-parent siblings exchanging nested keys in a single tick) produces wrong results; it reproduces on main too, so it predates this work. I'll open a separate issue with a repro rather than expand this PR's scope. Let me know if you'd prefer it folded in here.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/db/tests/query/includes.test.ts`:
- Around line 5179-5184: The test currently uses fixed `setTimeout(50)` sleeps,
which makes the async behavior timing-sensitive and flaky. In the affected
`includes.test.ts` cases, replace those waits with state-based synchronization
by polling or awaiting the expected collection state after each
`currencies.update(...)` mutation, using the existing test helpers or a
repo-local wait helper instead of arbitrary delays. Keep the assertions anchored
around the relevant `currencies` updates so the test verifies ordering and
completion directly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7e8a55a3-f66e-4f39-aed2-6091dd08e82f

📥 Commits

Reviewing files that changed from the base of the PR and between d914565 and ea83de9.

📒 Files selected for processing (2)
  • packages/db/src/query/live/collection-config-builder.ts
  • packages/db/tests/query/includes.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/db/src/query/live/collection-config-builder.ts

Comment on lines +5179 to +5184
await new Promise((r) => setTimeout(r, 50))

currencies.update(9, (draft) => {
draft.code = `USD`
})
await new Promise((r) => setTimeout(r, 50))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Replace fixed sleeps with state-based synchronization.

These setTimeout(50) waits make the regressions timing-sensitive: slow CI can still flake, and fast runs can hide ordering problems instead of asserting them. Please wait on the expected collection state (or a repo-local polling helper) after each mutation rather than sleeping for an arbitrary duration. As per coding guidelines, **/*.test.{ts,tsx,js}: “Test corner cases including … async race conditions …”.

Also applies to: 5269-5279, 5431-5436

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/db/tests/query/includes.test.ts` around lines 5179 - 5184, The test
currently uses fixed `setTimeout(50)` sleeps, which makes the async behavior
timing-sensitive and flaky. In the affected `includes.test.ts` cases, replace
those waits with state-based synchronization by polling or awaiting the expected
collection state after each `currencies.update(...)` mutation, using the
existing test helpers or a repo-local wait helper instead of arbitrary delays.
Keep the assertions anchored around the relevant `currencies` updates so the
test verifies ordering and completion directly.

Source: Coding guidelines

@samwillis samwillis left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. The previous same-parent repro is fixed now: the added test keeps routing when one of multiple same-parent siblings sharing a nested key is deleted passes on ea83de909.

I found one remaining routing hole in the new child-key refcounting approach, though. nestedRoutingIndex is still shared across all sibling nested includes at a level (nestedCorrelationKey -> parent -> childKeys), while nestedRoutingChildToNested is per nested setup. When two sibling nested includes use the same nested correlation value for the same child row, cleanup for one include can delete the shared route that the other include still needs.

Concrete failing case:

  • A price range has regionId: 1 and currencyId: 1.
  • The child select defines nested includes in this order: currency first, region second.
  • Updating only regionId from 1 to 2 makes the currency setup observe key 1 unchanged first, then the region setup removes child key 1 from the shared route for nested key 1.
  • Because the route is shared and only keyed by child key, the later removal deletes the route needed by currency.
  • A subsequent currency update no longer reaches the surviving currency include, so the row still shows EUR instead of USD.

The problematic shape is around packages/db/src/query/live/collection-config-builder.ts:1216 and the deletion at packages/db/src/query/live/collection-config-builder.ts:1556: the route refcount is not scoped by nested setup/include field, so equal correlation values from different nested includes collide.

Temporary repro I ran locally:

it(`keeps sibling include routing when an earlier include still references the old nested key`, async () => {
  // one price range: regionId: 1, currencyId: 1
  // select order: currency first, region second
  priceRanges.update(1, (draft) => {
    draft.regionId = 2
  })
  await new Promise((r) => setTimeout(r, 50))

  currencies.update(1, (draft) => {
    draft.code = `USD`
  })
  await new Promise((r) => setTimeout(r, 50))

  // Fails: currency remains EUR
})

Targeted command:

PATH=/Users/samwillis/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH pnpm exec vitest --run tests/query/includes.test.ts -t 'keeps sibling include routing when an earlier include still references the old nested key'

This likely needs the route reference tracking to be scoped per nested setup/include field as well as by parent and child key, or otherwise include setup identity in the routing key/refcount so sibling includes with equal correlation values cannot remove each other's routes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3-level nested toArray: shared buffer in createPerEntryIncludesStates drops children when correlation keys overlap across parent groups

2 participants