[Networking] Geneve tunnel conncheck #3277

Open
cheina97 wants to merge 1 commit into liqotech:master from cheina97:frc/genevetunnelping

@cheina97 (Member) commented May 4, 2026

Summary

This PR adds an active connectivity health-check (ping) mechanism for Geneve tunnels, tracking connection status and latency directly on the GeneveTunnel CRD.

What changed:

  • GeneveTunnel CRD extended — added a Status subresource with value (Connected/Error) and latency fields, plus kubectl get print columns for quick status inspection.
  • Gateway InternalNodeReconciler — on reconcile, lazily initialises a ConnChecker bound to the gateway's inner Geneve IP; registers a UDP ping sender per tunnel and wires a callback that writes connectivity results back to GeneveTunnel status. Senders are stopped when the corresponding InternalNode is deleted.
  • Fabric InternalFabricReconciler — lazily starts a ConnChecker receiver bound to the node's inner Geneve IP, so the fabric side can respond to pings from the gateway.
  • ConnChecker improvements:
    • Supports binding to a specific IP (BindIP option), enabling per-interface isolation between Geneve and WireGuard ping traffic.
    • Latency is now smoothed with an EWMA (PingLatencyAlpha option, default 0.1) instead of using raw per-packet measurements.
  • New flags (gateway geneve-fabric component):
    • --geneve-ping-enabled (default true)
    • --geneve-ping-port (default 12346)
    • --geneve-ping-interval (default 2s)
    • --geneve-ping-loss-threshold (default 5)
    • --geneve-ping-update-status-interval (default 10s)
    • --geneve-ping-latency-alpha (default 0.1)
  • New flag (fabric component): --geneve-ping-port, which must match the gateway value.
  • New flag (connection component): --ping-latency-alpha applies the same EWMA smoothing to WireGuard connection checks.
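
To illustrate the EWMA smoothing mentioned above, here is a minimal sketch in Go. It is not the ConnChecker code from this PR: the type name `LatencyEWMA`, its constructor, and the choice to seed the average with the first sample are assumptions made for illustration. Only the smoothing factor (0.1) comes from the flag defaults listed above.

```go
package main

import (
	"fmt"
	"time"
)

// LatencyEWMA is a hypothetical helper (not taken from the PR) that smooths
// round-trip samples with an exponentially weighted moving average.
type LatencyEWMA struct {
	alpha  float64       // weight given to the newest sample, in (0, 1]
	value  time.Duration // current smoothed latency
	primed bool          // false until the first sample has been seen
}

// NewLatencyEWMA returns a smoother with the given alpha.
func NewLatencyEWMA(alpha float64) *LatencyEWMA {
	return &LatencyEWMA{alpha: alpha}
}

// Update folds one raw sample into the average and returns the new value.
// Assumption: the first sample seeds the average directly.
func (e *LatencyEWMA) Update(sample time.Duration) time.Duration {
	if !e.primed {
		e.value = sample
		e.primed = true
		return e.value
	}
	e.value = time.Duration(e.alpha*float64(sample) + (1-e.alpha)*float64(e.value))
	return e.value
}

func main() {
	e := NewLatencyEWMA(0.1)
	// A single 50ms spike among 10ms samples moves the average only slightly.
	for _, ms := range []int{10, 10, 50, 10} {
		smoothed := e.Update(time.Duration(ms) * time.Millisecond)
		fmt.Printf("sample=%dms smoothed=%v\n", ms, smoothed)
	}
}
```

With a small alpha such as 0.1, one slow packet shifts the reported latency by roughly a tenth of the difference, which is the motivation for preferring the smoothed value over raw per-packet measurements.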

How it works

Gateway InternalNodeReconciler
  └─ ConnChecker (sender, bound to gateway inner IP)
       └─ UDP ping → Fabric node inner IP : geneve-ping-port
            └─ ConnChecker (receiver, bound to node inner IP)
       └─ pong → callback → GeneveTunnel.Status.{Value, Latency}
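
The flow in the diagram can be sketched with plain UDP sockets. This is a self-contained illustration that runs over loopback, not the Liqo ConnChecker: the function names `startReceiver` and `ping`, the payload, and the echo-style pong are all assumptions. What it does show is the effect of binding both ends to a specific IP, as the BindIP option described above is meant to do.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// startReceiver (hypothetical name) binds a UDP socket to bindIP:port and
// echoes every datagram back to its sender. Port 0 lets the OS pick a port.
func startReceiver(bindIP string, port int) (*net.UDPConn, *net.UDPAddr, error) {
	conn, err := net.ListenUDP("udp", &net.UDPAddr{IP: net.ParseIP(bindIP), Port: port})
	if err != nil {
		return nil, nil, err
	}
	go func() {
		buf := make([]byte, 64)
		for {
			n, peer, err := conn.ReadFromUDP(buf)
			if err != nil {
				return // socket closed
			}
			_, _ = conn.WriteToUDP(buf[:n], peer) // the "pong"
		}
	}()
	return conn, conn.LocalAddr().(*net.UDPAddr), nil
}

// ping (hypothetical name) sends one datagram from bindIP to target, waits
// for the echo, and returns the measured round-trip time.
func ping(bindIP string, target *net.UDPAddr, timeout time.Duration) (time.Duration, error) {
	conn, err := net.DialUDP("udp", &net.UDPAddr{IP: net.ParseIP(bindIP)}, target)
	if err != nil {
		return 0, err
	}
	defer conn.Close()

	start := time.Now()
	if _, err := conn.Write([]byte("ping")); err != nil {
		return 0, err
	}
	if err := conn.SetReadDeadline(start.Add(timeout)); err != nil {
		return 0, err
	}
	buf := make([]byte, 64)
	if _, err := conn.Read(buf); err != nil {
		return 0, err // a timeout here would count as a lost ping
	}
	return time.Since(start), nil
}

func main() {
	recv, addr, err := startReceiver("127.0.0.1", 0)
	if err != nil {
		fmt.Println("receiver:", err)
		return
	}
	defer recv.Close()

	rtt, err := ping("127.0.0.1", addr, time.Second)
	if err != nil {
		fmt.Println("ping:", err)
		return
	}
	fmt.Println("round trip:", rtt)
}
```

In the real setup the two bind addresses would be the gateway and node inner Geneve IPs, and the port would be the value of --geneve-ping-port on both sides.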

When --geneve-ping-enabled=false, the status is set to Connected immediately, without running the ping loop.
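
One way to picture how the enabled flag and the loss threshold could combine into a status value is the sketch below. The `tracker` type and its fields are hypothetical, and two behaviours are assumptions rather than facts from this PR: that the threshold counts consecutive missed pongs, and that the status stays Connected until the threshold is reached.

```go
package main

import "fmt"

// Status mirrors the two values named in the PR description.
type Status string

const (
	Connected Status = "Connected"
	Errored   Status = "Error"
)

// tracker is a hypothetical helper that turns ping outcomes into a Status.
type tracker struct {
	enabled       bool // stands in for --geneve-ping-enabled
	lossThreshold int  // stands in for --geneve-ping-loss-threshold
	missed        int  // consecutive pings without a pong (assumed semantics)
}

// Observe records one ping outcome and returns the resulting status.
func (t *tracker) Observe(gotPong bool) Status {
	if !t.enabled {
		return Connected // checks disabled: report Connected without pinging
	}
	if gotPong {
		t.missed = 0
		return Connected
	}
	t.missed++
	if t.missed >= t.lossThreshold {
		return Errored
	}
	return Connected
}

func main() {
	t := &tracker{enabled: true, lossThreshold: 5}
	for i := 1; i <= 5; i++ {
		fmt.Printf("missed ping %d -> %s\n", i, t.Observe(false))
	}
	fmt.Printf("pong received -> %s\n", t.Observe(true))
}
```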

Test plan

  • Deploy two clusters peered with Geneve mode.
  • Verify kubectl get genevetunnels shows Connected and a latency value after a few seconds.
  • Delete an InternalNode; verify no goroutine leak (sender is stopped).
  • Test with --geneve-ping-enabled=false; verify status is immediately Connected.
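
For the goroutine-leak item in the test plan, the property being checked is that stopping a sender actually ends its loop. A minimal sketch of that pattern follows. The names `sender`, `startSender`, and `Stop` are hypothetical and not taken from the PR; the idea is that Stop cancels the context and then blocks until the goroutine has returned.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sender is a hypothetical periodic ping loop that can be stopped cleanly.
type sender struct {
	cancel context.CancelFunc
	done   chan struct{} // closed when the loop goroutine has returned
}

// startSender runs send every interval until Stop is called.
func startSender(interval time.Duration, send func()) *sender {
	ctx, cancel := context.WithCancel(context.Background())
	s := &sender{cancel: cancel, done: make(chan struct{})}
	go func() {
		defer close(s.done)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				send()
			}
		}
	}()
	return s
}

// Stop cancels the loop and waits for it to exit, so no goroutine is left behind.
func (s *sender) Stop() {
	s.cancel()
	<-s.done
}

func main() {
	s := startSender(10*time.Millisecond, func() { fmt.Println("ping") })
	time.Sleep(35 * time.Millisecond)
	s.Stop()
	fmt.Println("sender stopped")
}
```

Because Stop waits on the done channel, a caller such as a deletion handler knows the loop has exited by the time Stop returns.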

@adamjensenbot (Collaborator)

Hi @cheina97. Thanks for your PR!

I am @adamjensenbot.
You can interact with me by issuing a slash command in the first line of a comment.
Currently, I understand the following commands:

  • /rebase: Rebase this PR onto the master branch (You can add the option test=true to launch the tests
    when the rebase operation is completed)
  • /merge: Merge this PR into the master branch
  • /build: Build Liqo components
  • /test: Launch the E2E and unit tests
  • /hold, /unhold: Add/remove the hold label to prevent merging with /merge

Make sure this PR appears in the liqo changelog, adding one of the following labels:

  • feat: 🚀 New Feature
  • fix: 🐛 Bug Fix
  • refactor: 🧹 Code Refactoring
  • docs: 📝 Documentation
  • style: 💄 Code Style
  • perf: 🐎 Performance Improvement
  • test: ✅ Tests
  • chore: 🚚 Dependencies Management
  • build: 📦 Builds Management
  • ci: 👷 CI/CD
  • revert: ⏪ Reverts Previous Changes

github-actions bot added the feat (Adds a new feature to the codebase) label on May 4, 2026
cheina97 force-pushed the frc/genevetunnelping branch from ee5dcff to 68f9eb6 on May 4, 2026 08:21
@cheina97 (Member, Author) commented May 5, 2026

/build

Labels

feat (Adds a new feature to the codebase), size/L

2 participants