Add rathole as alternative port-forwarding tunnel backend (TLS via cert-manager + Traefik) #529
- Add TunnelType, RatholeExecutableURL, and RatholeCommand fields to Network config
- Add templates/rathole-template.yaml with Deployment/ConfigMap/Service/Ingress for a rathole server using WebSocket transport (compatible with nginx ingress)
- Embed rathole template; add DefaultRatholeExecutableURL and DefaultRatholeCommand constants; add annRatholeClientCmds annotation constant
- Modify executeWstunnelTemplate to select rathole template when TunnelType==rathole
- Modify addWstunnelClientAnnotation to generate rathole client TOML + bootstrap command when rathole mode is active; clear stale wstunnel annotation
- Update clearConflictingNetworkAnnotations to also remove annRatholeClientCmds in full-mesh mode
- Update cleanupWstunnelResources to also delete the rathole ConfigMap on teardown
- Add tests covering rathole template rendering, annotation generation, and the updated clearConflictingNetworkAnnotations behaviour

Agent-Logs-Url: https://github.com/interlink-hq/interLink/sessions/76bae3af-62af-4408-96ac-81a236bbae50
Co-authored-by: dciangot <4144326+dciangot@users.noreply.github.com>
…ment, log skipped UDP ports Agent-Logs-Url: https://github.com/interlink-hq/interLink/sessions/76bae3af-62af-4408-96ac-81a236bbae50 Co-authored-by: dciangot <4144326+dciangot@users.noreply.github.com>
@copilot lint action is failing
- Rewrite if-else chain in addWstunnelClientAnnotation as a switch statement (gocritic)
- Replace WriteString(fmt.Sprintf(...)) with fmt.Fprintf(...) (staticcheck QF1012)
- Fix struct field alignment in rathole_test.go (goimports)

Agent-Logs-Url: https://github.com/interlink-hq/interLink/sessions/d9236412-33e3-4b2c-af8a-71a1d7613670
Co-authored-by: dciangot <4144326+dciangot@users.noreply.github.com>
Fixed in 4fca6a0. Four lint issues addressed:
To implement Option A using Traefik Ingress (Kubernetes), you need to use an IngressRouteTCP resource. This allows Traefik to handle the TLS termination for your DNS name on port 443 while passing raw TCP traffic to the Rathole service.

1. The Traefik IngressRouteTCP

Standard Kubernetes Ingress only supports HTTP. To do this correctly, you must use the Traefik Custom Resource Definition (CRD) for TCP.

apiVersion: traefik.io/v1alpha1
kind: IngressRouteTCP
metadata:
  name: triton-rathole-tcp
  namespace: default
spec:
  entryPoints:
    - websecure
  routes:
    - match: HostSNI(`triton.yourdomain.com`)
      services:
        - name: rathole-server-service
          port: 1234 # The port Rathole is listening on inside the cluster
  tls:
    secretName: your-certs-secret # Your Let's Encrypt or manual SSL cert
2. Why this solves the latency issue
3. Your Rathole Configuration

With Traefik handling the TLS termination, your Rathole Server should listen for plain TCP, and your Rathole Client (where Triton is) will connect via TLS.

[server]
bind_addr = "0.0.0.0:1234" # Traefik sends plain TCP here
[server.services.triton]
token = "your_secure_token"

Rathole Client (client.toml):

[client]
remote_addr = "triton.yourdomain.com:443"
transport.type = "tls" # Client wraps in TLS to talk to Traefik
transport.tls.sni = "triton.yourdomain.com"
[client.services.triton]
local_addr = "triton-service:8000" # Your actual Triton service address
4. Critical Networking Tip: initcwnd

Even with this high-performance setup, you are still bound by the laws of TCP "Slow Start." When your batch hits 15KB, it is slightly larger than the default 14.6KB (10 segments) that a Linux server is allowed to send in its "first flight."

# Increase the initial window to ~45KB to ensure a 15KB batch clears in 1 RTT
sudo ip route change default via <gateway> dev eth0 initcwnd 30 initrwnd 30
Summary Checklist
Whether you need to pass a certificate or CA to the Rathole client depends on how "official" your SSL certificate is.

1. If you are using a Public CA (Let's Encrypt, ZeroSSL, etc.)

If Traefik is using a valid certificate from a trusted authority (like the ones managed by cert-manager in Kubernetes), you usually don't need to pass a CA file.
2. If you are using a Self-Signed Cert

If you generated your own certificate for triton.yourdomain.com, the client will reject the connection because it can't verify the "Chain of Trust."
[client]
remote_addr = "triton.yourdomain.com:443"
transport.type = "tls"
transport.tls.sni = "triton.yourdomain.com"
transport.tls.ca_crt = "/path/to/your/ca.crt" # Path to the CA that signed Traefik's cert
3. The "Insecure" Shortcut (Not recommended for Prod)

If you just want to test if this fixes your 15KB latency cliff and don't want to mess with certificates yet, you can tell Rathole to skip verification.
[client.transport.tls]
insecure = true
Why this matters for your Latency Cliff

Even if you use a certificate, the TLS Handshake only happens once when the Rathole client connects to the server.
Summary Checklist for your Client:
@copilot use rathole but with TLS, via a cert-manager self-signed CA provided by the admin. Sign the client cert with that CA too.
Agent-Logs-Url: https://github.com/interlink-hq/interLink/sessions/a0ffa622-8d08-4ef5-8c1b-b5bc31967057 Co-authored-by: dciangot <4144326+dciangot@users.noreply.github.com>
Done in 10d34ee. Here's what changed: Kubernetes side (
Server Compute side (client annotation): waits for cert-manager to issue

[client.transport]
type = "tls"
[client.transport.tls]
hostname = "rathole-<name>.<dns>"
trusted_root = "/tmp/rathole-ca.crt"
cert = "/tmp/rathole-client.crt"
key = "/tmp/rathole-client.key"

The CA cert, client cert, and client key are all base64-embedded in the bootstrap command annotation.

Config:

Network:
  TunnelType: "rathole"
  RatholeCAIssuerName: "my-admin-ca" # cert-manager ClusterIssuer name
  RatholeCAIssuerKind: "ClusterIssuer" # default; or "Issuer"

WebSocket transport is retained as a backward-compatible fallback when
@copilot lint action is failing
Pull request overview
This PR decouples the existing tunnel/port-forwarding implementation from wstunnel by adding rathole as an alternative backend, including optional TLS support via cert-manager + Traefik, and new config fields under Network to select and configure the tunnel type.
Changes:
- Add Network.TunnelType and rathole-specific configuration fields (download URL, command template, cert-manager issuer settings).
- Introduce an embedded rathole-template.yaml and select the embedded template based on TunnelType.
- Add dynamic-client-based management of cert-manager Certificate and Traefik IngressRouteTCP resources for rathole TLS mode, plus tests for template selection and rathole client annotation generation.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| pkg/virtualkubelet/virtualkubelet.go | Adds rathole defaults, embedded template selection, dynamic client init, TLS CRD apply/delete helpers, and tunnel cleanup updates. |
| pkg/virtualkubelet/mesh.go | Extends pod annotation generation to support rathole client bootstrap (TLS + WebSocket fallback) and clears rathole annotations in mesh mode. |
| pkg/virtualkubelet/templates/rathole-template.yaml | New rathole server ConfigMap/Deployment/Service template. |
| pkg/virtualkubelet/rathole_test.go | New unit tests covering rathole template execution and rathole annotation generation (TLS + fallback). |
| pkg/virtualkubelet/mesh_annotations_test.go | Extends tests to ensure full-mesh cleanup removes rathole annotations. |
| pkg/virtualkubelet/config.go | Adds TunnelType and rathole config fields to the Network struct. |
| pkg/virtualkubelet/config_test.go | Adds tests covering rathole config fields and default TunnelType behavior. |
	// Delete rathole configmap (used in rathole mode)
	err = p.clientSet.CoreV1().ConfigMaps(namespace).Delete(ctx, wstunnelName+"-rathole-config", metav1.DeleteOptions{})
	if err != nil {
		log.G(ctx).Warningf("Failed to delete rathole configmap %s/%s: %v", namespace, wstunnelName+"-rathole-config", err)
	}
}

log.G(ctx).Infof("Created wstunnel infrastructure for %s/%s", originalPod.Namespace, originalPod.Name)

DefaultRatholeCommand = "curl -L -f -k %s -o rathole.zip && unzip -q rathole.zip && chmod +x rathole && echo %s | base64 -d > /tmp/rathole-ca.crt && echo %s | base64 -d > /tmp/rathole-client.crt && echo %s | base64 -d > /tmp/rathole-client.key && echo %s | base64 -d > /tmp/rathole-client.toml && ./rathole /tmp/rathole-client.toml &"
// DefaultRatholeWSCommand is the fallback command template used when no CA issuer is configured
// (WebSocket transport, backward-compatible). Two %s args: download URL and base64 client TOML.
DefaultRatholeWSCommand = "curl -L -f -k %s -o rathole.zip && unzip -q rathole.zip && chmod +x rathole && echo %s | base64 -d > /tmp/rathole-client.toml && ./rathole /tmp/rathole-client.toml &"

timeout := 120 * time.Second
deadline := time.Now().Add(timeout)
for time.Now().Before(deadline) {
	secret, err := p.clientSet.CoreV1().Secrets(namespace).Get(ctx, secretName, metav1.GetOptions{})
	if err == nil && len(secret.Data["tls.crt"]) > 0 {
		return nil
	}
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-time.After(2 * time.Second):
	}
}
return fmt.Errorf("timed out waiting for cert-manager to issue secret %s/%s", namespace, secretName)

type Network struct {
	// EnableTunnel enables WebSocket tunneling for pod port exposure
	EnableTunnel bool `yaml:"EnableTunnel" default:"false"`
	// TunnelType selects the port-forwarding backend: "wstunnel" (default) or "rathole"

// Five %s format verbs are substituted in order: the rathole download URL, base64-encoded CA cert,
// base64-encoded client cert, base64-encoded client key, and base64-encoded client TOML config.
// When RatholeCAIssuerName is empty the legacy two-argument WebSocket command is used instead.
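
The five-argument substitution described in the comment above can be sketched as follows. This is a minimal illustration, not the PR's implementation: `exampleTemplate` is a shortened, hypothetical stand-in for `DefaultRatholeCommand`, and `buildBootstrapCommand` is a made-up helper name. Only the argument order (URL, CA cert, client cert, client key, client TOML) is taken from the source.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// exampleTemplate is a shortened, hypothetical stand-in for DefaultRatholeCommand.
// It keeps the documented argument order: URL, CA cert, client cert, client key, client TOML.
const exampleTemplate = "curl -L -f %s -o rathole.zip && " +
	"echo %s | base64 -d > /tmp/rathole-ca.crt && " +
	"echo %s | base64 -d > /tmp/rathole-client.crt && " +
	"echo %s | base64 -d > /tmp/rathole-client.key && " +
	"echo %s | base64 -d > /tmp/rathole-client.toml"

// buildBootstrapCommand (hypothetical helper) base64-encodes each piece of
// certificate material and substitutes it into the five-verb template.
func buildBootstrapCommand(tmpl, url string, caCrt, clientCrt, clientKey, clientTOML []byte) string {
	enc := base64.StdEncoding.EncodeToString
	return fmt.Sprintf(tmpl, url, enc(caCrt), enc(clientCrt), enc(clientKey), enc(clientTOML))
}

func main() {
	cmd := buildBootstrapCommand(
		exampleTemplate,
		"https://example.com/rathole.zip", // placeholder URL
		[]byte("CA"), []byte("CERT"), []byte("KEY"), []byte("[client]\n"),
	)
	fmt.Println(cmd)
}
```

Base64 keeps multi-line PEM and TOML content safe to embed in a single shell command line, which is presumably why the annotation carries it in that form.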
@copilot address these comments and issues |
…e cleanup, ticker, nginx Ingress for WS mode Agent-Logs-Url: https://github.com/interlink-hq/interLink/sessions/300c0012-13ca-4381-82ca-f53d8326ae51 Co-authored-by: dciangot <4144326+dciangot@users.noreply.github.com>
Fixed in 65a5aaa. Two
All addressed in 65a5aaa:
…add --client flag Agent-Logs-Url: https://github.com/interlink-hq/interLink/sessions/764971d7-9f40-49cf-b508-b26a749d454b Co-authored-by: dciangot <4144326+dciangot@users.noreply.github.com>
All open issues addressed across commits 65a5aaa and 3681321:
Lint is passing — the
Lint is clean on the current branch (0 issues with golangci-lint v2.11.2). The

if p.config.Network.TunnelType == tunnelTypeRathole && p.config.Network.RatholeCAIssuerName != "" {
	if tlsErr := p.applyRatholeTLSResources(ctx, templateData); tlsErr != nil {
		log.G(ctx).Warningf("Failed to apply rathole TLS resources for %s/%s: %v", originalPod.Namespace, originalPod.Name, tlsErr)

 // Delete ingress
 err = p.clientSet.NetworkingV1().Ingresses(namespace).Delete(ctx, wstunnelName, metav1.DeleteOptions{})
 if err != nil {
-	log.G(ctx).Warningf("Failed to delete wstunnel ingress %s/%s: %v", namespace, wstunnelName, err)
+	log.G(ctx).Warningf("Failed to delete tunnel ingress %s/%s: %v", namespace, wstunnelName, err)
 } else {
-	log.G(ctx).Infof("Successfully deleted wstunnel ingress %s/%s", namespace, wstunnelName)
+	log.G(ctx).Infof("Successfully deleted tunnel ingress %s/%s", namespace, wstunnelName)

 // cleanupWstunnelResources removes all tunnel resources for a given name and namespace
 func (p *Provider) cleanupWstunnelResources(ctx context.Context, wstunnelName, namespace string) {
-	log.G(ctx).Infof("Cleaning up wstunnel resources for %s/%s", namespace, wstunnelName)
+	log.G(ctx).Infof("Cleaning up tunnel resources for %s/%s", namespace, wstunnelName)

for {
	secret, err := p.clientSet.CoreV1().Secrets(namespace).Get(ctx, secretName, metav1.GetOptions{})
	if err == nil {
		allPresent := len(secret.Data["ca.crt"]) > 0 &&
			len(secret.Data["tls.crt"]) > 0 &&
			len(secret.Data["tls.key"]) > 0
		if allPresent {
			return nil
		}
	}

log.G(ctx).Warningf("Failed to create dynamic client (CRD resources will not be managed): %v", err)
// non-fatal: rathole TLS resources won't be applied, but wstunnel still works

ratholeCmd := p.config.Network.RatholeCommand
if ratholeCmd == "" {
	ratholeCmd = DefaultRatholeCommand
}
mainCmd = fmt.Sprintf(ratholeCmd, ratholeURL, caCrtB64, clientCrtB64, clientKeyB64, configB64)

{{- range .ExposedPorts}}

[server.services.p{{.Port}}]
token = "{{$.RandomPassword}}"
bind_addr = "0.0.0.0:{{.Port}}"
{{- end}}
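
To see what this `range` block expands to, the fragment can be rendered with Go's `text/template`. The sketch below is illustrative only: the `[server]` header with control port `2333`, the `exposedPort` and `templateData` types, and the `tcpOnly` filter are assumptions made so the example is self-contained. Only the per-port section mirrors the template shown above.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

const protocolUDP = "UDP"

// exposedPort and templateData are hypothetical shapes for the template input.
type exposedPort struct {
	Port     int
	Protocol string
}

type templateData struct {
	RandomPassword string
	ExposedPorts   []exposedPort
}

// serverTOML mirrors the per-port section of the template; the [server] header
// and its control port (2333) are assumptions for illustration.
const serverTOML = `[server]
bind_addr = "0.0.0.0:2333"
{{- range .ExposedPorts}}

[server.services.p{{.Port}}]
token = "{{$.RandomPassword}}"
bind_addr = "0.0.0.0:{{.Port}}"
{{- end}}
`

// tcpOnly drops UDP ports before rendering, so no service section is emitted for them.
func tcpOnly(ports []exposedPort) []exposedPort {
	var out []exposedPort
	for _, p := range ports {
		if p.Protocol != protocolUDP {
			out = append(out, p)
		}
	}
	return out
}

// render executes the template against the filtered port list.
func render(data templateData) (string, error) {
	data.ExposedPorts = tcpOnly(data.ExposedPorts)
	tmpl, err := template.New("server").Parse(serverTOML)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := render(templateData{
		RandomPassword: "secret",
		ExposedPorts:   []exposedPort{{8080, "TCP"}, {53, protocolUDP}},
	})
	fmt.Println(out, err)
}
```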

{{- range .ExposedPorts}}
- containerPort: {{.Port}}
  name: {{if .Name}}{{.Name}}{{else}}port-{{.Port}}{{end}}
  protocol: {{.Protocol}}
{{- end}}

TunnelType: "rathole",
WildcardDNS: "tunnel.example.com",
RatholeExecutableURL: "https://example.com/rathole.zip",
RatholeCommand: "curl -L %s -o rathole.zip && unzip rathole.zip && echo %s | base64 -d > /tmp/client.toml && ./rathole /tmp/client.toml &",
@copilot address the latest review comments |
…olling, UDP filter, verb validation, test fix Agent-Logs-Url: https://github.com/interlink-hq/interLink/sessions/35ce6acd-f8e9-43b3-8aaa-e3a97405d8d4 Co-authored-by: dciangot <4144326+dciangot@users.noreply.github.com>
….Count before Sprintf Agent-Logs-Url: https://github.com/interlink-hq/interLink/sessions/35ce6acd-f8e9-43b3-8aaa-e3a97405d8d4 Co-authored-by: dciangot <4144326+dciangot@users.noreply.github.com>
Fixed in 43a7d19. Latest review comments addressed:
The tunnel middleware was tightly coupled to wstunnel. This PR abstracts the port-forwarding backend to support rathole as an alternative, selectable via config.
Config changes (Network)

| Field | Type | Description |
|---|---|---|
| TunnelType | string | "" / "wstunnel" (default, backward-compatible) or "rathole" |
| RatholeExecutableURL | string | rathole-org/rathole) |
| RatholeCommand | string | %s args: URL, CA cert, client cert, client key, client TOML). Only used when RatholeCAIssuerName is set. |
| RatholeWSCommand | string | %s args: URL, client TOML). Only used when RatholeCAIssuerName is empty. |
| RatholeCAIssuerName | string | |
| RatholeCAIssuerKind | string | "ClusterIssuer" (default) or "Issuer" |

Kubernetes-side: templates/rathole-template.yaml

Embedded template producing:

- server.toml with plain TCP transport (no WebSocket framing) and one [server.services.pPORT] section per exposed port
- rapiz1/rathole:v0.5.0 (Docker Hub image published by the rathole-org/rathole CI) with TCP readiness/liveness probes
- RatholeCAIssuerName is empty; exposes the rathole control port over HTTP/WebSocket at rathole<name>.<WildcardDNS>:80

When RatholeCAIssuerName is set, applyRatholeTLSResources additionally creates (via the dynamic Kubernetes client):

- Certificate ({name}-rathole-server-tls): Traefik uses this to terminate TLS at port 443; DNS name rathole<name>.<WildcardDNS>
- Certificate ({name}-rathole-client-tls): signed by the same admin CA; embedded in the compute-side bootstrap command
- IngressRouteTCP: HostSNI match on rathole<name>.<WildcardDNS>, websecure entry point, TLS secret {name}-rathole-server-tls; rathole server receives plain TCP

Compute-side: annotation

addWstunnelClientAnnotation rathole branch:

- TLS mode (RatholeCAIssuerName set): waits for cert-manager to issue {name}-rathole-client-tls, validates that ca.crt, tls.crt, and tls.key are all present and non-empty, then builds a TLS client TOML. All cert material is base64-embedded in the bootstrap command written to interlink.eu/rathole-client-commands.
- WebSocket fallback (RatholeCAIssuerName not set): backward-compatible WebSocket transport on port 80; the nginx Ingress is included in the Kubernetes template so the compute-side client can reach the rathole server.

Other

- executeWstunnelTemplate selects the rathole or wstunnel template based on TunnelType; WstunnelTemplatePath overrides apply to both
- cleanupWstunnelResources deletes {name}-rathole-config ConfigMap, cert-manager Certificates, and Traefik IngressRouteTCP; NotFound errors are suppressed to avoid noisy warnings in wstunnel mode
- clearConflictingNetworkAnnotations removes annRatholeClientCmds in full-mesh mode
- k8s.io/client-go/dynamic) is added to Provider for managing CRD resources; it is initialised alongside clientSet and is nil-safe
- tunnelTypeRathole and protocolUDP package-level constants replace repeated string literals
- waitForRatholeCertSecret uses a ticker + timer select pattern instead of a per-iteration time.After allocation
- --client flag for consistency with --server on the server Deployment side
- DefaultRatholeExecutableURL references the rathole-org/rathole release with the correct x86_64-unknown-linux-gnu artifact (musl was dropped for x86_64 in v0.5.0)
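
The selection rules described in this summary (empty or "wstunnel" keeps the existing backend, "rathole" switches backend, and the presence of RatholeCAIssuerName picks TLS over the WebSocket fallback) can be condensed into a small decision function. The sketch below is a hypothetical illustration: `networkConfig`, `resolveBackend`, and the returned mode strings are invented for the example, and rejecting unknown values is an assumption, not behaviour confirmed by the PR.

```go
package main

import "fmt"

const (
	tunnelTypeWstunnel = "wstunnel"
	tunnelTypeRathole  = "rathole"
)

// networkConfig is a trimmed, hypothetical view of the Network config fields
// that influence backend selection.
type networkConfig struct {
	TunnelType          string
	RatholeCAIssuerName string
}

// resolveBackend maps the config to one of three modes. Returning an error for
// unknown TunnelType values is an assumption made for this sketch.
func resolveBackend(cfg networkConfig) (string, error) {
	switch cfg.TunnelType {
	case "", tunnelTypeWstunnel:
		return "wstunnel", nil // default, backward-compatible
	case tunnelTypeRathole:
		if cfg.RatholeCAIssuerName != "" {
			return "rathole-tls", nil // cert-manager + Traefik IngressRouteTCP
		}
		return "rathole-websocket", nil // nginx Ingress fallback on port 80
	default:
		return "", fmt.Errorf("unknown TunnelType %q", cfg.TunnelType)
	}
}

func main() {
	for _, cfg := range []networkConfig{
		{},
		{TunnelType: tunnelTypeRathole},
		{TunnelType: tunnelTypeRathole, RatholeCAIssuerName: "my-admin-ca"},
	} {
		mode, err := resolveBackend(cfg)
		fmt.Printf("%+v -> %s (err=%v)\n", cfg, mode, err)
	}
}
```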