Describe the bug
We are observing duplicate liveliness samples when using FetchingSubscriber with liveliness subscriptions. This started surfacing consistently after switching to an io_uring-based transport (likely due to timing changes making the race deterministic).
Specifically, sub.recv_async() may return the same key expression (e.g. LIVELINESS_KEYEXPR_1) twice.
Expected Behavior
Liveliness tokens should be deduplicated by key expression, consistent with the session-level invariant in Zenoh (i.e., a token should not be observed twice unless it was actually re-declared).
Actual Behavior
The same liveliness key expression is delivered twice:
- once from the fetch phase (liveliness().get(...))
- once from the live subscription stream
Root Cause (Analysis)
FetchingSubscriber merges:
- fetched samples
- live samples
via an internal merge queue that deduplicates only by timestamp, not by key expression.
However:
- fetched liveliness samples and live updates may have different timestamps (or synthetic ones),
- therefore both entries are preserved,
- resulting in duplicate delivery of the same liveliness token.
This is particularly visible in clique scenarios and under timing-sensitive transports like io_uring.
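The failure mode described above can be sketched with a toy model. This is a minimal illustration, not the zenoh-ext implementation: ToySample, TimestampMergeQueue and deliver_with_timestamp_dedup are hypothetical names, timestamps are plain integers, and key expressions are plain strings. It only shows that a queue keyed on timestamp keeps both copies of a token when the fetched and live samples carry different timestamps.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a sample (hypothetical, not a zenoh type):
/// a key expression plus an integer timestamp.
#[derive(Debug, Clone, PartialEq)]
struct ToySample {
    key_expr: String,
    timestamp: u64,
}

/// Hypothetical merge queue that treats two samples as duplicates
/// only when their timestamps are equal, as described in the analysis.
#[derive(Default)]
struct TimestampMergeQueue {
    by_timestamp: BTreeMap<u64, ToySample>,
}

impl TimestampMergeQueue {
    /// Keeps the sample unless one with the same timestamp is already queued.
    fn push(&mut self, sample: ToySample) {
        self.by_timestamp.entry(sample.timestamp).or_insert(sample);
    }

    /// Empties the queue, returning samples in timestamp order.
    fn drain(&mut self) -> Vec<ToySample> {
        std::mem::take(&mut self.by_timestamp)
            .into_values()
            .collect()
    }
}

/// Pushes all samples through the queue and returns the delivered key expressions.
fn deliver_with_timestamp_dedup(samples: Vec<ToySample>) -> Vec<String> {
    let mut queue = TimestampMergeQueue::default();
    for sample in samples {
        queue.push(sample);
    }
    queue.drain().into_iter().map(|s| s.key_expr).collect()
}

fn main() {
    // Same token, seen once by the fetch and once by the live stream,
    // with different (assumed) timestamps.
    let fetched = ToySample { key_expr: "LIVELINESS_KEYEXPR_1".to_string(), timestamp: 1 };
    let live = ToySample { key_expr: "LIVELINESS_KEYEXPR_1".to_string(), timestamp: 2 };

    let delivered = deliver_with_timestamp_dedup(vec![fetched, live]);
    // Both copies survive because their timestamps differ.
    println!("{delivered:?}");
}
```

In this model the duplicate disappears only when both samples happen to share a timestamp, which matches the timing sensitivity reported above.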
Notes
From discussion with maintainers:
- This is considered a bug in zenoh-ext, not in core zenoh.
- Fetching subscribers are somewhat deprecated / discouraged in favor of history(true).
- Liveliness support in fetching subscribers was likely added without full consideration of deduplication semantics.
Suggested Fix
Deduplicate liveliness samples in FetchingSubscriber by key expression, not just timestamp.
Possible approaches:
- maintain a HashSet<KeyExpr> for liveliness tokens
- or enforce the same invariant as session-level liveliness handling
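A rough sketch of the HashSet approach, under stated assumptions: ToyLivelinessEvent, LivelinessDedup and dedup_events are hypothetical names, key expressions are modelled as String rather than KeyExpr, and nothing here reflects the actual FetchingSubscriber internals. The set tracks tokens currently believed alive, so a repeated declaration is dropped while a genuine re-declaration after an undeclaration is still delivered.

```rust
use std::collections::HashSet;

/// Toy liveliness event (hypothetical, not a zenoh type):
/// a token appearing or disappearing.
#[derive(Debug, Clone, PartialEq)]
enum ToyLivelinessEvent {
    Declared(String),
    Undeclared(String),
}

/// Tracks which tokens are currently considered alive,
/// keyed by key expression rather than by timestamp.
#[derive(Default)]
struct LivelinessDedup {
    alive: HashSet<String>,
}

impl LivelinessDedup {
    /// Returns true if the event changes the alive set and should be delivered.
    fn accept(&mut self, event: &ToyLivelinessEvent) -> bool {
        match event {
            // `insert` returns false when the key was already present.
            ToyLivelinessEvent::Declared(key) => self.alive.insert(key.clone()),
            // `remove` returns false when the key was not present.
            ToyLivelinessEvent::Undeclared(key) => self.alive.remove(key),
        }
    }
}

/// Filters a merged stream of fetched and live events.
fn dedup_events(events: Vec<ToyLivelinessEvent>) -> Vec<ToyLivelinessEvent> {
    let mut dedup = LivelinessDedup::default();
    events.into_iter().filter(|e| dedup.accept(e)).collect()
}

fn main() {
    let key = "LIVELINESS_KEYEXPR_1".to_string();
    let merged = vec![
        ToyLivelinessEvent::Declared(key.clone()),   // from the fetch phase
        ToyLivelinessEvent::Declared(key.clone()),   // same token from the live stream
        ToyLivelinessEvent::Undeclared(key.clone()), // token dropped
        ToyLivelinessEvent::Declared(key.clone()),   // genuine re-declaration
    ];
    for event in dedup_events(merged) {
        println!("{event:?}");
    }
}
```

Removing the key on undeclaration is what keeps the filter from swallowing real re-declarations. A set that only ever grows would suppress those as well.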
Workaround
Temporarily ignoring the failing test in CI:
#[ignore = "https://github.com/eclipse-zenoh/zenoh/issues/2523"]
To reproduce
Test:
test_liveliness_fetching_subscriber_clique with the "uring" feature
Symptom:
sub.recv_async() yields duplicate LIVELINESS_KEYEXPR_1
System info
Linux