Skip to content

fix routers unregistering liveness tokens and queryables#2603

Open
Evanev7 wants to merge 3 commits into
eclipse-zenoh:mainfrom
Evanev7:fix
Open

fix routers unregistering liveness tokens and queryables#2603
Evanev7 wants to merge 3 commits into
eclipse-zenoh:mainfrom
Evanev7:fix

Conversation

@Evanev7

@Evanev7 Evanev7 commented May 14, 2026

Copy link
Copy Markdown

Description

Zenoh routers did not unregister liveness tokens or queryables on cold disconnects due to early graph mutation logic

What does this PR do?

this pr removes the graph mutating logic and defers it until after all unregistering operations have completed

Why is this change needed?

after cold disconnecting two routers, they would not register that the remote router had disconnected, and would have to re-build their queries to be informed of the disconnection. after this change, liveliness tokens are more likely to be correctly updated after a network interface goes down

Related Issues

fixes #2602


🏷️ Label-Based Checklist

Based on the labels applied to this PR, please complete these additional requirements:

Labels: bug

🐛 Bug Fix Requirements

Since this PR is labeled as a bug fix, please ensure:

  • Root cause documented - Explain what caused the bug in the PR description
  • Reproduction test added - Test that fails on main branch without the fix
  • Test passes with fix - The reproduction test passes with your changes
  • Regression prevention - Test will catch if this bug reoccurs in the future
  • Fix is minimal - Changes are focused only on fixing the bug
  • Related bugs checked - Verified no similar bugs exist in related code

Why this matters: Bugs without tests often reoccur.

Instructions:

  1. Check off items as you complete them (change - [ ] to - [x])
  2. The PR checklist CI will verify these are completed

This checklist updates automatically when labels change, but preserves your checked boxes.

@Evanev7

Evanev7 commented May 14, 2026

Copy link
Copy Markdown
Author

happy to iterate on this more to bring it inline with the rest of the codebase

Evanev7 added 2 commits May 14, 2026 16:45
- router unregistering doesn't modify the net graph
- add cleanup at the end of send_close() to actually update the graph
- small refactor to cleanup duplicated logic

@fuzzypixelz fuzzypixelz left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution @Evanev7.

I think this is a regression introduced in #2096. I've confirmed that #2602 doesn't occur in Zenoh 1.7.1. I'm yet to check 1.8 but I'm positive that the issue is specific to 1.9.

remove_link is an old API: the root cause of the issue here is that the remove_node_* methods misuse its (undocumented) contract. So the proper fix would be to call it exactly once in a new remove_node_entities method such that:

fn remove_node_entities(&mut self, zid: &ZenohIdProto) -> UnregisterFaceEntitiesResult;

and:

struct UnregisterFaceEntitiesResult {
  subscribers: HashSet<Arc<Resource>>,
  queryables: HashSet<Arc<Resource>>,
  tokens: HashSet<Arc<Resource>>,
}

This also implies replacing unregister_face_subscribers with a new unregister_face_entities method.

The name UnregisterFaceEntitiesResult is a bit of a mouthful but it is consistent with other names such as UnregisterEntityResult.

@fuzzypixelz fuzzypixelz added the bug Something isn't working label May 22, 2026
@fuzzypixelz

Copy link
Copy Markdown
Member

@fuzzypixelz fuzzypixelz left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would also need a reproduction test under zenoh/src/net/tests/regions/declare.rs. But the framework has no support for closing connections yet. This can be the subject a separate pull request.

@Evanev7

Evanev7 commented May 28, 2026

Copy link
Copy Markdown
Author

ECA signed, ill get to this when I have a minute. thanks for the review!

@codecov

codecov Bot commented Jun 2, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 97.05882% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 74.04%. Comparing base (77507c1) to head (e819b10).
⚠️ Report is 16 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
zenoh/src/net/protocol/network.rs 96.77% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2603      +/-   ##
==========================================
- Coverage   74.05%   74.04%   -0.02%     
==========================================
  Files         400      400              
  Lines       60846    60863      +17     
==========================================
+ Hits        45060    45065       +5     
- Misses      15786    15798      +12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Router liveliness tokens aren't properly cleaned up on link disconnects

2 participants