feat: add exit node health checks and auto-failover #45
Open
mkarim1378 wants to merge 1 commit into masterking32:python_testing from
Monitors all configured exit node URLs in the background and automatically switches to the next healthy URL when one goes down, then switches back to the primary when it recovers.

- Add _exit_node_health_loop: background task that pings all exit node URLs every health_check_interval seconds (default 30s)
- Add _ping_exit_node: lightweight GET to detect reachability
- Add _record_exit_node_failure / _record_exit_node_success: track consecutive failures per URL with cooldown (2x interval)
- Add _try_exit_node_failover: switches _exit_node_url to the next alive URL; clears it to "" when all are down so traffic silently falls back to Apps Script
- Add _record_exit_node_success recovery: restores _exit_node_url from the all-down state and switches back to the primary when it recovers
- Support urls[] list in exit_node config for fallback URLs
- Promote the first entry of urls[] if the url field is empty
- Bump version to 1.2.0

Config additions (all optional, backward-compatible):

- exit_node.urls: fallback URL list
- exit_node.health_check_interval: default 30s (min 10s)
- exit_node.health_check_failures_before_failover: default 3
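The failure counting and failover described above can be sketched roughly as follows. This is not the code from the PR: the class `ExitNodeHealth` and its method names are hypothetical, the time-based cooldown (2x interval) is left out, and failing over only when the active URL dies is an assumption made for the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ExitNodeHealth:
    """Hypothetical stand-in for the failover state; not the PR's class."""

    urls: list[str]                    # ordered: primary first, then fallbacks
    failures_before_failover: int = 3  # mirrors the documented default
    active_url: str = ""               # "" means: fall back to Apps Script
    failures: dict[str, int] = field(default_factory=dict)
    dead: set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        # Start on the primary when at least one URL is configured.
        if self.urls and not self.active_url:
            self.active_url = self.urls[0]

    def record_failure(self, url: str) -> None:
        if url in self.dead:
            return  # already marked dead; do not keep incrementing
        self.failures[url] = self.failures.get(url, 0) + 1
        if self.failures[url] >= self.failures_before_failover:
            self.dead.add(url)
            if url == self.active_url:  # assumption: only the active URL triggers failover
                self._failover()

    def record_success(self, url: str) -> None:
        # Clear failure state for this URL.
        self.failures.pop(url, None)
        self.dead.discard(url)
        primary = self.urls[0] if self.urls else ""
        # Recover from the all-down state, or switch back to the primary.
        if not self.active_url or url == primary:
            self.active_url = url

    def _failover(self) -> None:
        # Pick the first URL still considered alive; "" when none are left.
        alive = [u for u in self.urls if u not in self.dead]
        self.active_url = alive[0] if alive else ""
```

In this sketch, three consecutive failures on the primary move `active_url` to the first fallback, losing every URL sets it to `""`, and a later success on the primary restores it.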
Summary
- Monitors all configured exit node URLs and automatically fails over to the next healthy URL when one goes down, then restores the primary when it recovers.
- `urls` list in `exit_node` config for declaring fallback URLs.

What changed
src/relay/domain_fronter.py
- `_exit_node_health_loop`: background task, pings every `health_check_interval` seconds
- `_ping_exit_node`: lightweight TCP+TLS GET, returns True if any HTTP response arrives
- `_record_exit_node_failure`: tracks consecutive failures per URL; marks a URL dead after the threshold and triggers failover; skips the increment if the URL is already in cooldown
- `_record_exit_node_success`: clears failure state; restores `_exit_node_url` from the all-down state; switches back to the primary when it recovers
- `_try_exit_node_failover`: switches the active URL to the next alive fallback; clears `_exit_node_url` to `""` when all are down so `_exit_node_matches` returns False and traffic silently falls back to Apps Script
- `_build_exit_node_url_list`: merges `url` + `urls[]` into a deduped ordered list; promotes the first `urls[]` entry if `url` is empty

config.example.json
- `urls`, `health_check_interval`, `health_check_failures_before_failover` fields

README.md

src/core/constants.py
- 1.1.0 → 1.2.0

Behavior
- All URLs down: `_exit_node_url` cleared → traffic falls back to Apps Script silently
- A URL recovers from the all-down state: `_exit_node_url` restored, exit node re-enabled
- [...] `urls`)

Test plan
- `exit_node.enabled: false`: no health task started, no matching
- `urls`-only config (empty `url` field): first entry promoted as primary
- `health_check_interval` and `health_check_failures_before_failover` values respected
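For reviewers checking the `urls`-only and interval cases, here is a minimal sketch of the config handling as described in this PR. The function names below are hypothetical and the `exit_node` config is assumed to be a plain dict; the real `_build_exit_node_url_list` may differ in detail.

```python
from __future__ import annotations


def build_exit_node_url_list(exit_node_cfg: dict) -> list[str]:
    """Merge `url` and `urls` into a deduped, ordered list (primary first)."""
    primary = (exit_node_cfg.get("url") or "").strip()
    fallbacks = exit_node_cfg.get("urls") or []
    ordered: list[str] = []
    for candidate in [primary, *fallbacks]:
        candidate = (candidate or "").strip()
        # Skip empty entries and duplicates; with an empty `url`, the first
        # `urls` entry naturally ends up at index 0, i.e. as the primary.
        if candidate and candidate not in ordered:
            ordered.append(candidate)
    return ordered


def effective_health_check_interval(exit_node_cfg: dict) -> float:
    """Apply the documented default (30s) and minimum (10s)."""
    return max(10.0, float(exit_node_cfg.get("health_check_interval", 30)))
```

With an empty `url` and two entries in `urls`, the first entry comes back at index 0, and an interval of 5 is raised to the 10s minimum.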