
eBPF Destination Redirect: How connect() Syscalls Get Rewritten at the Kernel Level

The core mechanism behind transparent proxies used by Keploy, Cilium, and Istio

The core trick that makes Keploy's "zero code changes" possible is eBPF destination redirect. This isn't Keploy-specific: Cilium (a Kubernetes CNI) and Istio (a service mesh) use the same principle.

Why This Is Possible: The Structure of Syscalls

When an app communicates over the network, its requests to the kernel follow a fixed pattern:

  1. socket() - create a socket
  2. connect(fd, addr, addrlen) - connect to the destination address
  3. send()/recv() - send and receive data

The second argument of connect(), addr, contains the destination IP and port. For postgres:5432, it is a struct sockaddr_in that effectively holds {AF_INET, 5432, 10.0.0.5}.
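As a concrete reference point, here is that sequence in plain C; the address 10.0.0.5:5432 is the example Postgres destination from above:

  #include <arpa/inet.h>
  #include <netinet/in.h>
  #include <sys/socket.h>

  int main(void) {
      /* 1. socket() - create a TCP socket */
      int fd = socket(AF_INET, SOCK_STREAM, 0);

      /* 2. connect() - addr carries the destination IP and port */
      struct sockaddr_in addr = {0};
      addr.sin_family = AF_INET;               /* AF_INET */
      addr.sin_port   = htons(5432);           /* 5432, network byte order */
      inet_pton(AF_INET, "10.0.0.5", &addr.sin_addr);
      connect(fd, (struct sockaddr *)&addr, sizeof(addr));

      /* 3. send()/recv() would follow here */
      return 0;
  }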

eBPF can hook into the exact moment this connect() syscall executes. Before the kernel starts the TCP handshake, an eBPF program can rewrite the kernel's copy of that destination address.

Keploy's Implementation: 3 eBPF Maps

Looking at Keploy's source (pkg/agent/hooks/linux/), the eBPF program uses 3 shared maps:

clientRegistrationMap - registers the process (the app) Keploy should monitor. PID-based: "intercept only this process's network traffic."

agentRegistrationMap - registers the Keploy agent itself: "don't touch this process's (the proxy's) traffic." This prevents infinite loops.

redirectProxyMap - the central map. The key is the source port; the value is the original destination info (a DestInfo struct). When eBPF rewrites a destination, it saves the original here.
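In libbpf-style C, such shared maps would be declared roughly like this. This is a sketch: the map names follow the article, but the key/value layouts (including the DestInfo fields) are assumptions, not Keploy's actual definitions:

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  /* Assumed layout - Keploy's real DestInfo struct may differ. */
  struct dest_info {
      __u32 ip;     /* original destination IPv4, network byte order */
      __u32 port;   /* original destination port */
  };

  struct {
      __uint(type, BPF_MAP_TYPE_HASH);
      __uint(max_entries, 1024);
      __type(key, __u32);                  /* PID of the app to intercept */
      __type(value, __u8);
  } clientRegistrationMap SEC(".maps");

  struct {
      __uint(type, BPF_MAP_TYPE_HASH);
      __uint(max_entries, 16);
      __type(key, __u32);                  /* PID of the Keploy agent (proxy) */
      __type(value, __u8);
  } agentRegistrationMap SEC(".maps");

  struct {
      __uint(type, BPF_MAP_TYPE_HASH);
      __uint(max_entries, 16384);
      __type(key, __u16);                  /* client source port */
      __type(value, struct dest_info);     /* original destination */
  } redirectProxyMap SEC(".maps");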

Redirect Sequence

1. App: calls connect(postgres:5432)
2. eBPF connect4 hook fires
3. Check clientRegistrationMap → is this PID monitored? Yes
4. Save to redirectProxyMap: {srcPort: 54321} → {ip: 10.0.0.5, port: 5432}
5. Modify addr struct: {10.0.0.5:5432} → {127.0.0.1:16789}
6. Kernel starts TCP handshake with modified addr
7. App thinks it connected to postgres, but actually connected to local proxy

The app process's memory still holds the original destination. The swap happens only at the kernel level, so the app has no way of knowing.
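Condensed into code, a connect4 program implementing this sequence might look roughly as follows. This is an illustrative sketch, not Keploy's actual program: the maps are the ones declared above, 127.0.0.1:16789 is the example proxy address from step 5, and the source-port handling is simplified (the kernel assigns the ephemeral source port during connect, so a real implementation has to recover it later, e.g. in its sockops program, rather than hardcode it):

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_endian.h>

  /* clientRegistrationMap, agentRegistrationMap and redirectProxyMap
   * are declared as in the previous sketch. */

  SEC("cgroup/connect4")
  int redirect_connect4(struct bpf_sock_addr *ctx)
  {
      __u32 pid = bpf_get_current_pid_tgid() >> 32;

      /* Step 3: only touch PIDs registered for interception */
      if (!bpf_map_lookup_elem(&clientRegistrationMap, &pid))
          return 1;                     /* not monitored: leave untouched */

      /* Skip the agent's own connections (see "Preventing Infinite Loops") */
      if (bpf_map_lookup_elem(&agentRegistrationMap, &pid))
          return 1;

      /* Step 4: save the original destination, keyed by source port */
      __u16 src_port = 54321;           /* placeholder for the example */
      struct dest_info info = {
          .ip   = ctx->user_ip4,        /* 10.0.0.5, network byte order */
          .port = ctx->user_port,       /* 5432, network byte order */
      };
      bpf_map_update_elem(&redirectProxyMap, &src_port, &info, BPF_ANY);

      /* Step 5: rewrite the destination to the local proxy */
      ctx->user_ip4  = bpf_htonl(0x7F000001);   /* 127.0.0.1 */
      ctx->user_port = bpf_htons(16789);

      /* Step 6: returning 1 lets the kernel handshake with the new addr */
      return 1;
  }

  char LICENSE[] SEC("license") = "GPL";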

How the Proxy Restores the Original Destination

When Keploy's transparent proxy (pkg/agent/proxy/) accepts a connection:

  1. Checks the client's source port (e.g., 54321)
  2. GetDestinationInfo(54321) - looks up the original destination in the eBPF map
  3. Result: {ip: 10.0.0.5, port: 5432} - the postgres server
  4. Proxy connects to the real postgres
  5. Relays traffic between app ↔ proxy ↔ postgres while recording
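On the userspace side, that lookup is a single map read. A minimal sketch with libbpf in C (Keploy's actual GetDestinationInfo lives in its Go proxy; the pin path and struct layout here are assumptions):

  #include <arpa/inet.h>
  #include <bpf/bpf.h>
  #include <linux/types.h>
  #include <stdio.h>

  /* Assumed layout, matching the earlier sketch. */
  struct dest_info {
      __u32 ip;     /* original destination IPv4, network byte order */
      __u32 port;   /* original destination port, network byte order */
  };

  int main(void)
  {
      /* hypothetical pin path for the shared map */
      int map_fd = bpf_obj_get("/sys/fs/bpf/redirectProxyMap");
      if (map_fd < 0)
          return 1;

      __u16 src_port = 54321;       /* the accepted client's source port */
      struct dest_info info;

      if (bpf_map_lookup_elem(map_fd, &src_port, &info) == 0)
          printf("original destination port: %u\n", ntohs((__u16)info.port));
      return 0;
  }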

This is why it's called a "transparent proxy." Neither the app nor the server knows the proxy exists.

SockOps: cgroup-Level Socket Monitoring

connect4/6 alone isn't enough. The SockOps eBPF program attaches to cgroupv2 to monitor all socket events within a container.

When running an app in a Docker container, SockOps automatically attaches to the container's cgroup. Every socket event (connect, accept, close) inside the container passes through eBPF.
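The skeleton of such a program is small. A sketch of a sockops program reacting to connection establishment (Keploy's real program does more than log):

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  SEC("sockops")
  int monitor_sockets(struct bpf_sock_ops *skops)
  {
      switch (skops->op) {
      case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:   /* outbound connect completed */
      case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB:  /* inbound accept completed */
          /* local_port is host byte order; remote_port is network byte order */
          bpf_printk("sock established: local port %u", skops->local_port);
          break;
      }
      return 0;
  }

  char LICENSE[] SEC("license") = "GPL";

Once loaded, it can be attached to the container's cgroupv2 path with, for example, bpftool cgroup attach <cgroup-path> sock_ops pinned <prog-path>.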

Preventing Infinite Loops

There's a problem here. The proxy also calls connect() to reach the real postgres. If eBPF catches that too? Proxy → proxy → proxy → ... an infinite loop.

agentRegistrationMap prevents this. If the Keploy agent (proxy) process PID is registered there, eBPF skips that process's connect() calls.
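Registering the agent is then a single map update from userspace. A sketch (the pin path is a made-up example):

  #include <bpf/bpf.h>
  #include <linux/types.h>
  #include <unistd.h>

  int main(void)
  {
      /* hypothetical pin path for the shared map */
      int map_fd = bpf_obj_get("/sys/fs/bpf/agentRegistrationMap");
      if (map_fd < 0)
          return 1;

      __u32 pid = getpid();     /* the proxy registers its own PID */
      __u8 marker = 1;
      /* from now on, connect4 skips every connect() from this PID */
      return bpf_map_update_elem(map_fd, &pid, &marker, BPF_ANY);
  }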

How This Differs from Traditional Proxies

Traditional tools like mitmproxy, Charles, VCR require:

  • Setting HTTP_PROXY environment variables

  • Or installing SDKs/libraries in the app

  • Or manipulating DNS to reroute traffic

All of these modify the app or its environment. The eBPF approach operates at the kernel level: nothing in the app or its environment is touched. Run the app without keploy record, and no eBPF hooks attach and no proxy starts. Completely transparent.

Why Linux Only

eBPF is a Linux kernel feature. Introduced in Linux 3.15 and massively expanded in the 5.x series, it is effectively a virtual machine inside the kernel. macOS's XNU kernel and Windows's NT kernel don't have it.

On macOS, use Docker Desktop. It runs a Linux VM internally where eBPF works. The trade-off is wrapping everything in Docker. CI/CD (GitHub Actions, Jenkins, etc.) typically uses Linux runners, so no issues there.


  1. The eBPF connect4 hook fires when the connect() syscall executes and overwrites the destination IP:port in the addr struct with the proxy address
  2. The original destination is saved in redirectProxyMap (an eBPF shared map) as {source port → original IP:port}
  3. The proxy restores the original destination via GetDestinationInfo(srcPort) and forwards to the real server while recording
  4. agentRegistrationMap makes eBPF skip the proxy's own connect() calls, preventing infinite loops

Pros

  • Zero changes to app code/config/env vars โ€” operates in kernel, doesn't touch userspace
  • Language/framework agnostic โ€” every process uses connect() syscalls

Cons

  • Requires Linux kernel 5.15+ โ€” macOS (XNU) and Windows (NT) don't have eBPF
  • Can only intercept network socket traffic โ€” SQLite (file I/O), Unix domain sockets (partially) are out of scope

Use Cases

  • Keploy - auto-generates API tests: records all app traffic and converts it to test cases + mocks
  • Cilium - Kubernetes network policy enforcement: controls pod-to-pod traffic via eBPF
  • Istio - service mesh traffic routing via eBPF without sidecars