debezium/dbz#2040 feat: add Chroma adapter and end-to-end sync pipeli…#8

Merged
vjuranek merged 1 commit into gsoc-week-3-sync from gsoc-week-3-chroma on Jun 26, 2026

@KMohnishM

Contributor

Description

Fixes debezium/dbz#2040.

This pull request completes the Week 3 deliverables by introducing the Chroma Vector Store Adapter (ChromaAdapter) and adding the first suite of end-to-end pipeline integration tests to verify real-time event synchronization.

This PR builds directly on top of the SyncManager introduced in PR #7 (gsoc-week-3-sync).

Key Changes

  1. Chroma Vector Store Adapter (pydebeziumai/adapters/chroma.py):

    • Implements ChromaAdapter, which inherits from the base VectorStoreAdapter interface.
    • Wraps langchain_chroma.Chroma so the SyncManager can perform vector operations through it.
    • Implements upsert and delete operations for document lifecycle sync.
    • Exposes as_retriever(), which hooks directly into LangChain's retrieval chains.
  2. E2E Sync Integration Tests (tests/integration/test_chroma_pipeline.py):

    • Validates the complete CDC-to-Vector-Store pipeline end-to-end using an ephemeral, in-memory Chroma instance and FakeEmbeddings.
    • Covers the following scenarios through the entire flow (Event Ingestion ➔ Document Builder ➔ SyncManager ➔ Chroma Vector Store):
      • Insert/Snapshot (c/r): Verifies documents are embedded and upserted.
      • Update (u): Verifies documents are updated (deleted and re-upserted to prevent duplicate keys).
      • Hard Delete (d with soft_delete=False): Verifies documents are fully removed from the vector store.
      • Soft Delete (d with soft_delete=True): Verifies the document is retained in the vector store, its metadata is updated with _is_deleted=True, and its original state is preserved.
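
For readers who want the four scenarios in one place, here is a minimal sketch of the routing they exercise. It is not the code in this PR: InMemoryAdapter is a dict-backed stand-in for ChromaAdapter (no Chroma, no embeddings), the VectorStoreAdapter method signatures are assumed, apply_event is a hypothetical helper standing in for the SyncManager, and the event dicts are a simplified form of the Debezium envelope (op, before, after) with assumed id and text columns.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class VectorStoreAdapter(ABC):
    """Assumed shape of the base interface; the real one may differ."""

    @abstractmethod
    def upsert(self, doc_id: str, text: str, metadata: dict) -> None: ...

    @abstractmethod
    def delete(self, doc_id: str) -> None: ...


class InMemoryAdapter(VectorStoreAdapter):
    """Dict-backed stand-in for ChromaAdapter (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.docs: dict = {}

    def upsert(self, doc_id: str, text: str, metadata: dict) -> None:
        # Overwrites any existing entry with the same id.
        self.docs[doc_id] = {"text": text, "metadata": dict(metadata)}

    def delete(self, doc_id: str) -> None:
        # Deleting a missing id is a no-op.
        self.docs.pop(doc_id, None)


def apply_event(
    adapter: VectorStoreAdapter,
    event: dict,
    soft_delete: bool = False,
) -> None:
    """Route one simplified change event to the adapter (hypothetical helper)."""
    op: str = event["op"]
    before: Optional[dict] = event.get("before")
    after: Optional[dict] = event.get("after")

    if op in ("c", "r"):
        # Insert or snapshot read: store the new row state.
        assert after is not None
        adapter.upsert(str(after["id"]), after["text"], {"_is_deleted": False})
    elif op == "u":
        # Update: delete first, then re-upsert, so no duplicate key remains.
        assert after is not None
        doc_id = str(after["id"])
        adapter.delete(doc_id)
        adapter.upsert(doc_id, after["text"], {"_is_deleted": False})
    elif op == "d":
        assert before is not None
        doc_id = str(before["id"])
        adapter.delete(doc_id)
        if soft_delete:
            # Soft delete: keep the last known state, flag it as deleted.
            adapter.upsert(doc_id, before["text"], {"_is_deleted": True})
    else:
        raise ValueError(f"unsupported op: {op!r}")


if __name__ == "__main__":
    store = InMemoryAdapter()
    apply_event(store, {"op": "c", "after": {"id": 1, "text": "hello"}})
    apply_event(
        store,
        {"op": "d", "before": {"id": 1, "text": "hello"}, "after": None},
        soft_delete=True,
    )
    print(store.docs)
```

In the real adapter the dict would be replaced by the wrapped langchain_chroma.Chroma instance, and the embedding step (FakeEmbeddings in the tests) would run on upsert; neither is modeled here.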

Verification Results

All checks passed locally in the WSL testing environment:

  • Ruff formatting & linting: Passed (All checks passed!).
  • MyPy strict type-checking: Passed (Success: no issues found in 28 source files).
  • PyTest test suite: All 56 unit and integration tests passed, including the 4 new integration tests for the Chroma adapter pipeline.

@vjuranek vjuranek left a comment

Member

LGTM

@KMohnishM

Contributor Author

LGTM

@vjuranek I will update the other two PRs and rebase this one. Thank you for the review!

…ne tests

Signed-off-by: Mohnish <kmohnishm@gmail.com>
@KMohnishM KMohnishM force-pushed the gsoc-week-3-chroma branch from 9c7c425 to 0fbfe10 Compare June 18, 2026 15:40
@vjuranek vjuranek merged commit 392eae2 into gsoc-week-3-sync Jun 26, 2026
4 checks passed
@vjuranek

Member

Merged, thanks @KMohnishM!
