Describe the bug
Hi, thanks for maintaining zenoh.
We are seeing repeated `OpenSyn -> Close(INVALID)` errors when two zenoh peers are both configured to:
- run in `peer` mode,
- listen on a TCP endpoint,
- connect to each other,
- and use `transport.auth.usrpwd`.
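For reference, this is a minimal sketch of the kind of peer configuration we mean; the endpoint addresses, ports, user, and password below are placeholders, not the exact values from the repro branch:

```json5
// Sketch of peer A's config. Peer B mirrors it: it listens on its own
// TCP endpoint and connects back to peer A. All concrete values here
// are placeholders for illustration.
{
  mode: "peer",
  listen: { endpoints: ["tcp/0.0.0.0:7447"] },
  connect: { endpoints: ["tcp/<peer-b-host>:7448"] },
  transport: {
    auth: {
      usrpwd: {
        user: "peer_a",
        password: "secret",
      },
    },
  },
}
```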
To make this easier to investigate, I prepared a minimal repro branch based on upstream/main (branch and commit details are under System info below).
With that repro, whichever peer is started first keeps printing errors like:
```
ERROR zenoh_transport::unicast::establishment::open: Received a close message (reason INVALID) in response to an OpenSyn on: TransportLinkUnicast { ... }
```
If I reverse the startup order, the repeated error moves to the other peer instead.
Expected behavior:
- duplicate/bidirectional peer connection attempts should be handled cleanly,
- and they should not keep producing repeated `Close(INVALID)` responses.
Observed behavior:
- the peer started first keeps retrying and repeatedly logs `Received a close message (reason INVALID) in response to an OpenSyn`,
- while the peer started second does not emit the same repeated error.
As a workaround, we changed peer-to-peer auto-connect to use `greater-zid` (for example `autoconnect_strategy: { peer: { to_peer: "greater-zid" } }`). With that configuration change, we have not been able to reproduce this issue in our peer-to-peer deployment so far.
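For completeness, this is roughly where we placed the workaround in our configuration. The placement under `scouting` is an assumption based on the zenoh version we tested; the exact location of `autoconnect_strategy` may differ across versions, so please treat this as a sketch rather than a canonical config:

```json5
// Sketch of the workaround config. The nesting under "scouting" is an
// assumption from our setup; only the autoconnect_strategy value itself
// is the change we actually relied on.
{
  scouting: {
    gossip: {
      autoconnect_strategy: { peer: { to_peer: "greater-zid" } },
    },
    multicast: {
      autoconnect_strategy: { peer: { to_peer: "greater-zid" } },
    },
  },
}
```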
To reproduce
- Check out the repro branch:
  ```
  git checkout repro/usrpwd-duplicate-transport
  ```
- Build the example:
  ```
  cargo check -p zenoh-examples --example z_repro_usrpwd_peer
  ```
- In terminal A, run:
  ```
  RUST_LOG=zenoh_transport::unicast::manager=trace,zenoh_transport::unicast::establishment::accept=trace,zenoh_transport::unicast::establishment::open=trace,zenoh::net::runtime::orchestrator=debug ./repro/usrpwd-duplicate-transport/run-peer-a.sh
  ```
- In terminal B, run:
  ```
  RUST_LOG=zenoh_transport::unicast::manager=trace,zenoh_transport::unicast::establishment::accept=trace,zenoh_transport::unicast::establishment::open=trace,zenoh::net::runtime::orchestrator=debug ./repro/usrpwd-duplicate-transport/run-peer-b.sh
  ```
- Observe that the peer started first prints repeated `Close(INVALID)` / `OpenSyn` errors.
You can also start `peer-b` first and then `peer-a`; the repeated error follows the peer that was started first.
System info
- Minimal repro platform: macOS arm64 (Apple Silicon)
- Kernel: Darwin 25.3.0
- Zenoh base commit used for the minimal repro: d12952d8599b83fef49a5a1b290b6b039ea028e0
- Repro branch: origin/repro/usrpwd-duplicate-transport
- Repro commit: 1b92bc932c5e55ecfb90a6c1650240a2ba8cccd7
- Original environment where we first noticed this behavior: ROS 2 Humble with `rmw_zenoh` on its `humble` branch, with the commit titled "Bump zenoh to 1.8.0 - 2nd attempt"