eBPF Acceleration via NovaNet¶
NovaEdge leverages eBPF/XDP acceleration for the data plane through NovaNet, the Nova CNI component. NovaEdge itself no longer loads or manages eBPF programs directly. Instead, it communicates with NovaNet via a gRPC client over a Unix domain socket to request eBPF services.
!!! note "L4 Load Balancing"
    Kubernetes Service L4 load balancing is handled by NovaNet. NovaEdge focuses on L7 ingress load balancing.
Overview¶
The following eBPF acceleration features are provided by NovaNet and consumed by NovaEdge via gRPC. All features are auto-detected at runtime by NovaNet. If NovaNet is not available, NovaEdge continues to operate without eBPF acceleration (graceful degradation).
| Feature | Provided By | Fallback (NovaNet unavailable) |
|---|---|---|
| SOCKMAP Same-Node Bypass | NovaNet | Kernel network stack |
| eBPF Mesh Redirect | NovaNet | nftables/iptables TPROXY |
| Rate Limiting | NovaNet | Userspace rate limiting |
| Health Monitoring | NovaNet | Dataplane health checks |
Architecture¶
Previous Architecture (Removed)¶
Previously, NovaEdge loaded and managed eBPF programs directly within the Go
agent, requiring `privileged: true`, `CAP_BPF`, `CAP_SYS_ADMIN`, and BPF
filesystem mounts.
Current Architecture¶
```mermaid
flowchart LR
    subgraph NovaEdge["NovaEdge Agent"]
        GC["gRPC Client"]
    end
    subgraph NovaNet["NovaNet (CNI)"]
        EBPF["eBPF Services"]
        SOCKMAP["SOCKMAP Bypass"]
        SKLOOKUP["SK_LOOKUP Redirect"]
        RL["Rate Limiting"]
        HM["Health Monitor"]
    end
    GC -->|"Unix socket<br/>/run/novanet/ebpf-services.sock"| EBPF
    EBPF --> SOCKMAP & SKLOOKUP & RL & HM
```
NovaEdge connects to NovaNet via a Unix domain socket at
`/run/novanet/ebpf-services.sock` (configurable via Helm). NovaNet handles all
eBPF program loading, lifecycle management, and kernel interactions.
Security Improvement¶
By removing direct eBPF management from NovaEdge, the agent container no longer requires elevated privileges for BPF operations:
| Requirement | Before (Direct eBPF) | After (NovaNet) |
|---|---|---|
| `privileged: true` | Required | Not required |
| `CAP_BPF` | Required | Not required |
| `CAP_SYS_ADMIN` | Required | Not required |
| `CAP_NET_RAW` | Required | Not required |
| `/sys/fs/bpf` mount | Required | Not required |
| `/run/novanet/` mount | Not applicable | Required (for socket access) |
The NovaEdge agent now runs with a significantly reduced privilege set. Only
`CAP_NET_ADMIN` and `CAP_NET_BIND_SERVICE` are needed for VIP management and
binding privileged ports.
Helm Configuration¶
Enabling NovaNet Integration¶
```yaml
# charts/novaedge-agent/values.yaml
novanet:
  # Enable NovaNet eBPF services integration
  enabled: true
  # Path to the NovaNet eBPF services Unix socket
  ebpfServicesSocket: /run/novanet/ebpf-services.sock

# Reduced security context -- no BPF capabilities needed
securityContext:
  capabilities:
    add:
      - NET_ADMIN
      - NET_BIND_SERVICE
    drop:
      - ALL
```
Volume Mount¶
The agent DaemonSet needs access to the NovaNet socket directory:
```yaml
volumes:
  - name: novanet-socket
    hostPath:
      path: /run/novanet
      type: DirectoryOrCreate

volumeMounts:
  - name: novanet-socket
    mountPath: /run/novanet
    readOnly: true
```
Mesh Integration¶
When the service mesh is enabled, the NovaEdge agent reconciles eBPF acceleration state on every config snapshot via the NovaNet gRPC client. This section describes the four acceleration features and how they integrate with the mesh manager.
SK_LOOKUP Mesh Redirect¶
Traditional mesh traffic interception uses nftables or iptables NAT REDIRECT
rules to divert ClusterIP traffic to the transparent listener on port 15001.
When NovaNet is available, the agent instead requests NovaNet to install
sk_lookup eBPF program entries that redirect matching traffic at the socket
lookup layer, bypassing the entire netfilter rule chain.
How it works:
- The controller pushes mesh-enabled `InternalService` entries to the agent via the ConfigSnapshot.
- The mesh manager calls `AddMeshRedirect(clusterIP, port, 15001)` for each new service and `RemoveMeshRedirect(clusterIP, port)` for services that are no longer enrolled.
- NovaNet installs an `sk_lookup` map entry that redirects TCP connections matching `(clusterIP, port)` to the agent's transparent listener socket.
- The agent tracks only successfully installed/removed entries so that failures are retried on the next reconciliation cycle.
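The tracking step above is essentially a set diff between desired and installed entries. Below is a minimal Go sketch of that reconciliation; the `redirectKey` type and `reconcileRedirects` name are assumptions for illustration (the real agent issues the corresponding `AddMeshRedirect`/`RemoveMeshRedirect` gRPC calls for each computed entry, and records only the ones that succeed):

```go
package main

import "fmt"

// redirectKey identifies one sk_lookup redirect entry.
// (Hypothetical type for this sketch.)
type redirectKey struct {
	ClusterIP string
	Port      uint16
}

// reconcileRedirects diffs the desired redirects against the entries already
// installed, returning what to add and what to remove. Because only
// successfully applied entries are recorded as installed, failed operations
// reappear in add or remove on the next reconciliation cycle.
func reconcileRedirects(desired, installed map[redirectKey]bool) (add, remove []redirectKey) {
	for k := range desired {
		if !installed[k] {
			add = append(add, k)
		}
	}
	for k := range installed {
		if !desired[k] {
			remove = append(remove, k)
		}
	}
	return add, remove
}

func main() {
	desired := map[redirectKey]bool{
		{ClusterIP: "10.96.0.10", Port: 80}:  true,
		{ClusterIP: "10.96.0.11", Port: 443}: true,
	}
	installed := map[redirectKey]bool{
		{ClusterIP: "10.96.0.10", Port: 80}: true, // already in place
		{ClusterIP: "10.96.0.9", Port: 80}:  true, // no longer enrolled
	}
	add, remove := reconcileRedirects(desired, installed)
	fmt.Println(len(add), len(remove)) // 1 1
}
```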
Benefits over nftables/iptables:
- No PREROUTING chain traversal -- redirect happens at socket lookup time
- No conntrack entries created for the redirect itself
- Atomic per-entry updates (no batch flush/replace)
- Lower latency for the first packet of each connection
The nftables/iptables rules remain active as a fallback when NovaNet is
unavailable or when sk_lookup is not supported by the kernel.
SOCKMAP Same-Node Acceleration¶
For mesh traffic between pods on the same node, NovaNet can install SOCKMAP entries that splice connected socket pairs directly in the kernel, bypassing the full TCP/IP stack for data transfer.
How it works:
- The mesh manager identifies endpoints running on the local node by comparing the `topology.kubernetes.io/node` label against the agent's node name, or by comparing the endpoint address against the node IP.
- For each local endpoint pod, the agent calls `EnableSockmap(namespace, name)` via the NovaNet client.
- NovaNet attaches `sk_msg` and `sk_skb` programs to the pod's sockets, enabling kernel-level splice for matching connections.
- When pods leave the mesh or move to a different node, the agent calls `DisableSockmap(namespace, name)` to clean up.
- On agent shutdown, all tracked SOCKMAP entries are cleaned up using the shutdown context to honor cancellation and timeout.
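The local-node check in the first step can be sketched as a small predicate. The `endpoint` struct and `isLocalEndpoint` name are assumptions for this sketch, not the agent's actual types:

```go
package main

import "fmt"

// endpoint carries the fields relevant to same-node detection.
// (Hypothetical struct for this sketch.)
type endpoint struct {
	Address string
	Labels  map[string]string
}

// isLocalEndpoint reports whether an endpoint runs on this node: it prefers
// the topology label when present and falls back to comparing the endpoint
// address against the node IP, mirroring the two checks described above.
func isLocalEndpoint(ep endpoint, nodeName, nodeIP string) bool {
	if n, ok := ep.Labels["topology.kubernetes.io/node"]; ok {
		return n == nodeName
	}
	return nodeIP != "" && ep.Address == nodeIP
}

func main() {
	labeled := endpoint{
		Address: "10.244.1.7",
		Labels:  map[string]string{"topology.kubernetes.io/node": "worker-1"},
	}
	unlabeled := endpoint{Address: "192.168.1.10", Labels: map[string]string{}}

	fmt.Println(isLocalEndpoint(labeled, "worker-1", "192.168.1.10"))   // true: label matches node name
	fmt.Println(isLocalEndpoint(unlabeled, "worker-1", "192.168.1.10")) // true: address matches node IP
}
```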
Requirements:
- The `NODE_IP` environment variable must be set in the agent pod spec (typically via the Downward API).
- The `--node-name` flag must match the Kubernetes node name for topology label matching.
- Endpoint objects must carry `kubernetes.io/namespace` and `kubernetes.io/name` labels identifying the pod. These labels must be populated by the snapshot builder when constructing `InternalService` endpoints.
!!! warning "Endpoint Label Requirement"
    SOCKMAP acceleration requires that `InternalService` endpoints carry
    `kubernetes.io/namespace`, `kubernetes.io/name`, and
    `topology.kubernetes.io/node` labels. If your snapshot builder does not
    populate these labels, SOCKMAP will not activate. See issue tracking for
    the snapshot builder enhancement.
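One way to make this requirement explicit is a pre-flight check before requesting SOCKMAP for an endpoint. The helper name `missingSockmapLabels` below is an assumption for this sketch, not part of the agent:

```go
package main

import "fmt"

// requiredLabels are the endpoint labels SOCKMAP activation depends on.
var requiredLabels = []string{
	"kubernetes.io/namespace",
	"kubernetes.io/name",
	"topology.kubernetes.io/node",
}

// missingSockmapLabels returns which required labels an endpoint lacks;
// an empty result means SOCKMAP can be requested for it.
// (Hypothetical helper illustrating the requirement above.)
func missingSockmapLabels(labels map[string]string) []string {
	var missing []string
	for _, l := range requiredLabels {
		if _, ok := labels[l]; !ok {
			missing = append(missing, l)
		}
	}
	return missing
}

func main() {
	labels := map[string]string{"kubernetes.io/namespace": "default"}
	fmt.Println(missingSockmapLabels(labels))
}
```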
Kernel-Level Rate Limiting¶
NovaNet can enforce per-CIDR rate limits directly in the eBPF datapath,
dropping packets before they reach the userspace proxy. The mesh manager
exposes an `ApplyRateLimits()` method that reconciles desired rate limit
entries against the currently installed set.

**Current status:** The `ApplyRateLimits()` method is implemented but not
yet wired into the agent's config application path. Per-CIDR rate limit
policies need to be extracted from the ConfigSnapshot's `ProxyPolicy`
objects and passed to this method. Until then, rate limiting falls back to
the Rust dataplane's userspace token-bucket implementation.
Planned integration:
- The agent's `applyAgentConfig()` callback will extract per-CIDR rate limit entries from policy objects in the ConfigSnapshot.
- These entries will be passed to `meshManager.ApplyRateLimits(ctx, entries)`.
- NovaNet will install TC or XDP rate-limit maps for each CIDR.
- Stale entries are automatically removed on reconciliation.
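The reconcile-and-remove-stale flow can be sketched as a diff over desired versus installed limits. The types and names below (`rateLimitEntry`, `diffRateLimits`) are assumptions for illustration; in the planned integration the entries would come from ProxyPolicy objects and be applied through `ApplyRateLimits`:

```go
package main

import "fmt"

// rateLimitEntry is one per-CIDR limit. (Hypothetical shape for this sketch.)
type rateLimitEntry struct {
	CIDR string
	PPS  uint64 // packets per second
}

// diffRateLimits returns entries to install (new CIDRs or changed rates) and
// stale CIDRs to remove, matching the reconcile-then-remove-stale flow above.
func diffRateLimits(desired []rateLimitEntry, installed map[string]uint64) (toApply []rateLimitEntry, toRemove []string) {
	want := make(map[string]uint64, len(desired))
	for _, e := range desired {
		want[e.CIDR] = e.PPS
		if pps, ok := installed[e.CIDR]; !ok || pps != e.PPS {
			toApply = append(toApply, e)
		}
	}
	for cidr := range installed {
		if _, ok := want[cidr]; !ok {
			toRemove = append(toRemove, cidr)
		}
	}
	return toApply, toRemove
}

func main() {
	installed := map[string]uint64{"10.0.0.0/8": 1000, "192.168.0.0/16": 500}
	desired := []rateLimitEntry{{CIDR: "10.0.0.0/8", PPS: 2000}} // rate changed; second CIDR dropped
	apply, remove := diffRateLimits(desired, installed)
	fmt.Println(len(apply), len(remove)) // 1 1
}
```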
Passive Backend Health Monitoring¶
NovaNet's eBPF layer passively monitors TCP connection outcomes (SYN-ACK latency, RST rates, connection failures) for backend endpoints without injecting active health-check probes. The agent streams these health events via a long-lived gRPC server-streaming RPC.
How it works:
- At startup, the agent calls `StartHealthStream()`, which opens a `StreamBackendHealth` RPC to NovaNet.
- The stream is started unconditionally -- if NovaNet is not yet connected, the stream loop waits with a 5-second backoff before retrying.
- If the stream is interrupted (NovaNet restart, network error), the agent waits for a backoff period before reconnecting, preventing CPU spin on rapid failures.
- Each received event contains per-backend metrics: total connections, failed connections, and failure rate.
**Current status:** Health events are logged at DEBUG level. A future enhancement will forward these events to the Rust dataplane's outlier detection subsystem to influence routing decisions (e.g., ejecting backends with high failure rates).
NovaNet Integration Requirements¶
To use eBPF acceleration features, the following must be in place:
| Requirement | Purpose |
|---|---|
| NovaNet installed as cluster CNI | Provides eBPF services on each node |
| `/run/novanet/ebpf-services.sock` mounted | gRPC communication channel |
| `NODE_IP` environment variable set | SOCKMAP same-node detection |
| `--node-name` flag matches K8s node name | Topology label matching |
| `--mesh-enabled` flag set | Enables mesh manager reconciliation |
| `--novanet-socket` flag (optional) | Override default socket path |
| Kernel 5.9+ (recommended) | Full `sk_lookup` and SOCKMAP support |
Graceful Degradation¶
If NovaNet is not available (not installed, socket not present, or service unavailable), NovaEdge continues to operate normally:
- SOCKMAP bypass: Falls back to kernel network stack for same-node traffic
- Mesh redirect: Falls back to nftables/iptables TPROXY rules
- Rate limiting: Falls back to userspace rate limiting in the Rust dataplane
- Health monitoring: Falls back to dataplane-side health checks
The agent logs a warning at startup when NovaNet is not available and periodically retries the connection.
```json
{"level":"warn","msg":"NovaNet eBPF services not available, operating in degraded mode","socket":"/run/novanet/ebpf-services.sock"}
```
Monitoring¶
Prometheus Metrics¶
NovaNet integration exposes the following metrics:
| Metric | Type | Description |
|---|---|---|
| `novaedge_novanet_connected` | Gauge | Whether the agent is connected to NovaNet (0 or 1) |
| `novaedge_novanet_requests_total` | Counter | Total gRPC requests to NovaNet |
| `novaedge_novanet_errors_total` | Counter | Total gRPC errors from NovaNet |
For eBPF program-level metrics (loaded programs, map operations, etc.), see the NovaNet documentation.
Troubleshooting¶
Agent cannot connect to NovaNet¶
**Symptom:** Agent logs `NovaNet eBPF services not available`
Common causes:
- NovaNet not installed -- install NovaNet as the cluster CNI
- Socket path incorrect -- verify `novanet.ebpfServicesSocket` matches the actual NovaNet socket location
- Missing volume mount -- ensure `/run/novanet/` is mounted in the agent pod
- Permissions -- the socket file must be readable by the agent process
Verifying NovaNet is providing eBPF services¶
```bash
# Check if the NovaNet socket exists on the node
ls -la /run/novanet/ebpf-services.sock

# Check agent logs for successful connection
kubectl logs -n nova-system -l app.kubernetes.io/name=novaedge-agent | grep novanet

# Expected on success:
# {"level":"info","msg":"Connected to NovaNet eBPF services","socket":"/run/novanet/ebpf-services.sock"}
```