Duplicate liveliness samples in FetchingSubscriber (zenoh-ext) under concurrent fetch + live updates #2523

@yellowhatter

Describe the bug

We are observing duplicate liveliness samples when using FetchingSubscriber with liveliness subscriptions. The problem started surfacing consistently after switching to an io_uring-based transport, probably because the changed timing makes an existing race reproduce reliably.

Specifically, sub.recv_async() may return the same key expression (e.g. LIVELINESS_KEYEXPR_1) twice.

Expected Behavior

Liveliness tokens should be deduplicated by key expression, consistent with the session-level invariant in Zenoh (i.e., a token should not be observed twice unless it was actually re-declared).

Actual Behavior

The same liveliness key expression is delivered twice:

  • once from the fetch phase (liveliness().get(...))
  • once from the live subscription stream

Root Cause (Analysis)

FetchingSubscriber merges:

  1. fetched samples
  2. live samples

via an internal merge queue that deduplicates only by timestamp, not by key expression.

However:

  • fetched liveliness samples and live updates may have different timestamps (or synthetic ones),
  • therefore both entries are preserved,
  • resulting in duplicate delivery of the same liveliness token.

This is particularly visible in clique scenarios and under timing-sensitive transports like io_uring.
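
To make the mechanism concrete, here is a toy model of a merge queue that treats samples as duplicates only when their timestamps match. It is a sketch of the behavior described above, not the zenoh-ext code: `Sample`, `MergeQueue`, `merged_demo`, the `u64` timestamps, and the values 10 and 11 are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a sample: a key expression plus a timestamp.
/// (Hypothetical type, not zenoh's `Sample`.)
#[derive(Debug, Clone, PartialEq)]
struct Sample {
    key_expr: String,
    timestamp: u64,
}

/// Hypothetical merge queue that considers two samples duplicates
/// only when they carry the same timestamp.
#[derive(Default)]
struct MergeQueue {
    by_timestamp: BTreeMap<u64, Sample>,
}

impl MergeQueue {
    fn push(&mut self, sample: Sample) {
        // If an entry with this timestamp already exists, it is kept
        // and the new sample is dropped. The key expression is never compared.
        self.by_timestamp.entry(sample.timestamp).or_insert(sample);
    }

    fn drain(&mut self) -> Vec<Sample> {
        // Hand out everything in timestamp order and leave the queue empty.
        std::mem::take(&mut self.by_timestamp)
            .into_values()
            .collect()
    }
}

fn merged_demo() -> Vec<Sample> {
    let mut queue = MergeQueue::default();
    // Reply obtained during the fetch phase.
    queue.push(Sample {
        key_expr: "LIVELINESS_KEYEXPR_1".into(),
        timestamp: 10,
    });
    // Live update for the same token with a different timestamp.
    queue.push(Sample {
        key_expr: "LIVELINESS_KEYEXPR_1".into(),
        timestamp: 11,
    });
    queue.drain()
}
```

In this model `merged_demo()` returns two samples with the same key expression, which matches the duplicate delivery seen through sub.recv_async().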

Notes

From discussion with maintainers:

  • This is considered a bug in zenoh-ext, not in core zenoh.
  • Fetching subscribers are somewhat deprecated / discouraged in favor of history(true).
  • Liveliness support in fetching subscribers was likely added without full consideration of deduplication semantics.

Suggested Fix

Deduplicate liveliness samples in FetchingSubscriber by key expression, not just timestamp.

Possible approaches:

  • maintain a HashSet<KeyExpr> for liveliness tokens
  • or enforce the same invariant as session-level liveliness handling
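
The first approach could look roughly like the sketch below. It is only an illustration under stated assumptions: `LivelinessDedup`, `Kind`, and `accept` are hypothetical names, and plain `String` keys stand in for `KeyExpr` so the example compiles without zenoh. The set tracks which tokens are currently alive, so a token that is undeclared and later re-declared is still delivered again.

```rust
use std::collections::HashSet;

/// Simplified stand-in for the kind of a liveliness sample.
/// (Hypothetical enum, not zenoh's `SampleKind`.)
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Put,    // token declared
    Delete, // token undeclared
}

/// Hypothetical filter that remembers which tokens are currently alive.
#[derive(Default)]
struct LivelinessDedup {
    alive: HashSet<String>,
}

impl LivelinessDedup {
    /// Returns true if the sample should be delivered to the user.
    fn accept(&mut self, key_expr: &str, kind: Kind) -> bool {
        match kind {
            // `insert` returns false when the key was already present,
            // i.e. the token was already reported as alive.
            Kind::Put => self.alive.insert(key_expr.to_owned()),
            // `remove` returns false when the key was not tracked,
            // i.e. the token was never reported or is already gone.
            Kind::Delete => self.alive.remove(key_expr),
        }
    }
}
```

With this filter, the fetched sample and the live sample for LIVELINESS_KEYEXPR_1 would collapse into a single delivery regardless of their timestamps. Dropping a Delete for a token that was never reported is a design choice of this sketch, not something the issue prescribes.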

Workaround

For now, we are ignoring the failing test in CI:

#[ignore = "https://github.com/eclipse-zenoh/zenoh/issues/2523"]

To reproduce

Test:
test_liveliness_fetching_subscriber_clique, built with the "uring" feature enabled

Symptom:
sub.recv_async() yields duplicate LIVELINESS_KEYEXPR_1

System info

Linux

Labels: bug