openshift 4.20 Network Observability with ovn egress firewall

OpenShift 4.20 ships Network Observability (NetObserv) with an eBPF-based agent. Unlike the earlier IPFIX-based approach, the eBPF agent runs as a DaemonSet directly on each node and captures packets at the kernel level, enabling RTT (Round-Trip Time) measurements with nanosecond precision that IPFIX could not provide.

This document walks through the full setup: installing LokiStack as the log backend, deploying the Network Observability Operator, setting up an Egress IP scenario to observe external traffic paths, and finally implementing an automated EgressFirewall to block Google traffic based on dynamically updated IP ranges.

The following diagram shows the overall architecture of this lab:

The lab uses a three-node compact cluster (three schedulable control-plane nodes). Network flows are captured by eBPF agents on each node, enriched with Kubernetes metadata by flowlogs-pipeline, then stored in LokiStack (backed by S3 object storage). The OCP web console plugin (netobserv-plugin) queries Loki to visualize flows in real time. Separately, aggregated metrics are exported to the built-in OCP Prometheus for dashboard display.

try with loki

install loki

Network Observability stores raw flow logs (each individual TCP/UDP flow as a JSON record) in LokiStack. Loki is a log aggregation system optimized for large volumes of structured data — it compresses and stores logs in object storage (S3), making it far cheaper than storing in Prometheus time-series format. Without Loki, you can still get aggregated metrics via Prometheus, but you lose the ability to browse individual flow records and filter by pod, namespace, port, or protocol in the traffic flows table.

RTT measurement data is stored as a field (TimeFlowRttNs) in each flow log record in Loki. This is why Loki is a prerequisite for RTT visibility.

create S3 bucket

Loki uses object storage (S3 or S3-compatible) as its primary data store. All flow log data — compressed chunks and index files — are written to S3. Without a working S3 backend, LokiStack pods will fail to start. In this lab we use rustfs (a lightweight, self-hosted S3-compatible server running on the helper node at 192.168.99.1:9000) as a stand-in for a cloud S3 bucket.

The screenshot below shows the rustfs web UI after creating the demo bucket that Loki will use for storage. The bucket must exist before deploying LokiStack.
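For reference, the bucket can also be created from the CLI instead of the web UI. A minimal sketch, assuming the AWS CLI is available on the helper node and rustfs accepts standard S3 calls (credentials match the loki-s3 Secret below):

        export AWS_ACCESS_KEY_ID=rustfsadmin
        export AWS_SECRET_ACCESS_KEY=rustfsadmin

        # create the bucket that LokiStack will use
        aws --endpoint-url http://192.168.99.1:9000 s3 mb s3://demo

        # verify it exists
        aws --endpoint-url http://192.168.99.1:9000 s3 ls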

install loki operator

With the S3 bucket ready, we install the Loki Operator from OperatorHub and then create a LokiStack custom resource. The key configuration decisions, all visible in the CR below, are the size tier (1x.demo for this lab), a replication factor of 1, the S3 secret reference, the storage class (nfs-csi), and the openshift-network tenant mode.

The screenshot below shows the Loki Operator successfully installed from OperatorHub, ready for LokiStack CR creation.


        oc new-project netobserv
        
        # netobserv is resource-hungry and puts heavy demands on the underlying loki; we tune the size tier, replica counts, etc. to fit our test environment.
        
        cat << EOF > ${BASE_DIR}/data/install/loki-netobserv.yaml
        ---
        apiVersion: v1
        kind: Secret
        metadata:
          name: loki-s3 
        stringData:
          access_key_id: rustfsadmin
          access_key_secret: rustfsadmin
          bucketnames: demo
          endpoint: http://192.168.99.1:9000
          # region: eu-central-1
        
        ---
        apiVersion: loki.grafana.com/v1
        kind: LokiStack
        metadata:
          name: loki
        spec:
          size: 1x.demo # 1x.medium , 1x.demo
          replication:
            factor: 1
          storage:
            schemas:
            - version: v13
              effectiveDate: '2022-06-01'
            secret:
              name: loki-s3
              type: s3
          storageClassName: nfs-csi
          tenants:
            mode: openshift-network
            openshift:
                adminGroups: 
                - cluster-admin
          template:
            gateway:
              replicas: 1
            ingester:
              replicas: 1
            indexGateway:
              replicas: 1
        
        EOF
        
        oc create --save-config -n netobserv -f ${BASE_DIR}/data/install/loki-netobserv.yaml
        
        # to delete
        
        # oc delete -n netobserv -f ${BASE_DIR}/data/install/loki-netobserv.yaml
        
        # oc get pvc -n netobserv | grep loki- | awk '{print $1}' | xargs oc delete -n netobserv pvc
        
        # run the following once so the admin user can access the network tenant (the group matches adminGroups in the LokiStack CR above); re-run after a reinstall
        
        oc adm groups new cluster-admin
        
        oc adm groups add-users cluster-admin admin
        
        oc adm policy add-cluster-role-to-group cluster-admin cluster-admin

install net observ

The Network Observability Operator is the core component that orchestrates the entire pipeline. Once installed, it manages three sub-components through a single FlowCollector custom resource: the eBPF agent DaemonSet, the flowlogs-pipeline enrichment layer, and the console plugin (netobserv-plugin).

Installation is straightforward via OperatorHub. However, there is a known issue with the eBPF agent: after initial deployment, some agents may not fully activate. Restarting the cluster nodes after installation resolves this — do not skip this step if the eBPF agents appear stuck.

The screenshots below walk through the installation steps in the OCP web console:

Step 1 — Search for “Network Observability” in OperatorHub:

Step 2 — Select the operator and click Install:

Step 3 — Create the FlowCollector CR. This is the main configuration object. Key fields include the Loki URL (pointing to our LokiStack), the agent type (eBPF), and the sampling rate:

Step 4 — Configure the Loki connection in the FlowCollector. The lokiStack section references the LokiStack resource we created earlier in the netobserv namespace:

Step 5 — eBPF agent settings: sampling rate, interfaces to monitor, and privilege settings. The eBPF agent needs elevated privileges to access the kernel network stack:

Step 6 — After applying the FlowCollector CR, all operator pods come up in the netobserv namespace. The eBPF agent pods run on every node. Once ready, the “Network Traffic” menu item appears in the OCP console:
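For CLI-driven installs, the FlowCollector can also be created without the console form. A minimal sketch matching the settings used in this lab (eBPF agent, sampling 50, LokiStack mode pointing at the loki instance in netobserv); treat it as a starting point rather than the exact CR the console generates:

        cat << EOF | oc apply -f -
        apiVersion: flows.netobserv.io/v1beta2
        kind: FlowCollector
        metadata:
          name: cluster
        spec:
          namespace: netobserv
          agent:
            type: eBPF
            ebpf:
              sampling: 50
          loki:
            enable: true
            mode: LokiStack
            lokiStack:
              name: loki
        EOF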

try it out

deploy egress IP

The goal of this test is to observe how the Egress IP feature interacts with Network Observability. An Egress IP assigns a stable, predictable source IP address to all outbound traffic from a given namespace. This matters when an external firewall or database must allowlist a fixed source address, or when outbound traffic needs to be attributed to a specific namespace for auditing.

In this lab, we assign 192.168.99.103 as the egress IP for the llm-demo namespace. All pods in that namespace will appear to originate from 192.168.99.103 when reaching external destinations — regardless of which node the pod is actually running on.

Without Egress IP, outbound traffic uses the node’s primary IP as the source address, which changes if the pod is rescheduled to a different node. With Egress IP, the source is always stable.


        # label a node to host egress ip
        
        oc label node --all k8s.ovn.org/egress-assignable="" --overwrite
        
        # label a namespace with env
        
        oc new-project llm-demo
        oc label ns llm-demo env=egress-demo
        
        
        # create a egress ip
        
        cat << EOF > ${BASE_DIR}/data/install/egressip.yaml
        apiVersion: k8s.ovn.org/v1
        kind: EgressIP
        metadata:
          name: egressips-prod
        spec:
          egressIPs:
          - 192.168.99.103
          namespaceSelector:
            matchLabels:
              env: egress-demo
        EOF
        
        oc apply -f ${BASE_DIR}/data/install/egressip.yaml
        
        # oc delete -f ${BASE_DIR}/data/install/egressip.yaml
        
        oc get egressip -o json | jq -r '.items[] | [.status.items[].egressIP, .status.items[].node] | @tsv'
        
        # 192.168.99.103  master-01-demo

make traffic and see result

With the Egress IP in place, we deploy a test pod in the llm-demo namespace on master-02-demo — a different node than where the egress IP is assigned. This is intentional: OVN-Kubernetes will route outbound traffic from master-02-demo through the egress node (master-01-demo) so it exits via the 192.168.99.103 IP. This cross-node egress path creates interesting RTT values because the traffic traverses an extra network hop inside the cluster before leaving.

The pod continuously curls https://www.google.com to generate external traffic. The eBPF agent on each node captures these flows, and flowlogs-pipeline enriches them with Kubernetes metadata before writing to Loki.


        # go back to helper
        
        # create a dummy pod
        
        cat << EOF > ${BASE_DIR}/data/install/demo1.yaml
        ---
        kind: Pod
        apiVersion: v1
        metadata:
          name: wzh-demo-pod
        spec:
          nodeSelector:
            kubernetes.io/hostname: 'master-02-demo'
          restartPolicy: Always
          containers:
            - name: demo1
              image: >- 
                quay.io/wangzheng422/qimgs:centos9-test-2025.12.18.v01
              env:
                - name: key
                  value: value
              command: [ "/bin/bash", "-c", "--" ]
              args: [ "tail -f /dev/null" ]
              # imagePullPolicy: Always
        EOF
        
        oc apply -n llm-demo -f ${BASE_DIR}/data/install/demo1.yaml
        
        # oc delete -n llm-demo -f ${BASE_DIR}/data/install/demo1.yaml
        
        oc exec -n llm-demo wzh-demo-pod -it -- bash
        
        # in the container terminal
        
        while true; do curl https://www.google.com && sleep 1; done;
        
        # while true; do curl http://192.168.77.8:13000/cache.db > /dev/null; done;

After the pod starts generating traffic, we can observe it in the OCP web console under Observe → Network Traffic or Pod → Network Traffic. The screenshots below walk through what you see in the UI:

You can see flows from wzh-demo-pod in llm-demo reaching external Google IP addresses (e.g., 142.251.x.x):

Each flow record stored in Loki contains a full JSON document. The following example shows a captured flow from wzh-demo-pod receiving a response from a Google server (142.251.152.119:443). Key fields to note are TimeFlowRttNs (the eBPF-measured RTT), Sampling, the TCP Flags array, and the DstK8S_* enrichment added by flowlogs-pipeline:

        {
          "AgentIP": "192.168.99.24",
          "Bytes": 6938,
          "Dscp": 0,
          "DstAddr": "10.133.0.21",
          "DstK8S_HostIP": "192.168.99.24",
          "DstK8S_HostName": "master-02-demo",
          "DstK8S_Name": "wzh-demo-pod",
          "DstK8S_Namespace": "llm-demo",
          "DstK8S_NetworkName": "primary",
          "DstK8S_OwnerName": "wzh-demo-pod",
          "DstK8S_OwnerType": "Pod",
          "DstK8S_Type": "Pod",
          "DstMac": "0a:58:64:58:00:02",
          "DstPort": 52960,
          "DstSubnetLabel": "Pods",
          "Etype": 2048,
          "Flags": [
            "ACK"
          ],
          "FlowDirection": "0",
          "IfDirections": [
            0,
            0
          ],
          "Interfaces": [
            "genev_sys_6081",
            "eth0"
          ],
          "K8S_FlowLayer": "app",
          "Packets": 3,
          "Proto": 6,
          "Sampling": 50,
          "SrcAddr": "142.251.152.119",
          "SrcMac": "0a:58:64:58:00:04",
          "SrcPort": 443,
          "TimeFlowEndMs": 1776231454328,
          "TimeFlowRttNs": 8421000,
          "TimeFlowStartMs": 1776231454316,
          "TimeReceived": 1776231455,
          "Udns": [
            ""
          ],
          "app": "netobserv-flowcollector"
        }

The remaining screenshots show additional views and dashboards available in the NetObserv UI:

block google with egress firewall

background

OVN EgressFirewall supports blocking traffic by CIDR range. It also supports dnsName rules, but these are unreliable for a target like Google: a DNS-based rule only covers the handful of IPs returned at resolution time, while Google serves from a vast, constantly rotating address pool with short TTLs.

The solution is to use Google’s own published IP range lists to compute the exact CIDRs, then automatically update the EgressFirewall daily.

ip range strategy

Google publishes two IP range lists: goog.json (every IP range Google announces) and cloud.json (the ranges allocated to Google Cloud customers), both served from https://www.gstatic.com/ipranges/.

The formula: goog.json minus cloud.json = Google’s own service IPs (Search, Gmail, YouTube, Maps, etc.)

This avoids over-blocking legitimate GCP-hosted services while targeting Google’s consumer/search services.
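As a quick sanity check of the formula before automating it, you can diff the two lists by hand. A rough sketch using curl and jq; note that a plain set difference is not subnet-aware, unlike the Python script used by the CronJob below:

        curl -s https://www.gstatic.com/ipranges/goog.json \
          | jq -r '.prefixes[].ipv4Prefix // empty' | sort > /tmp/goog.txt

        curl -s https://www.gstatic.com/ipranges/cloud.json \
          | jq -r '.prefixes[].ipv4Prefix // empty' | sort > /tmp/cloud.txt

        # prefixes present in goog.json but not in cloud.json
        comm -23 /tmp/goog.txt /tmp/cloud.txt | wc -l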

architecture

        flowchart TD
            subgraph nsUpdater[egress-fw-updater - no EgressFirewall]
                CJ[CronJob<br/>runs daily at 02h00]
                POD[Pod python3 + curl<br/>compute goog minus cloud = 91 CIDRs<br/>generate EgressFirewall YAML]
                CJ --> POD
            end

            GSTATIC[www.gstatic.com<br/>goog.json and cloud.json]

            subgraph nsDemo[llm-demo - has EgressFirewall]
                EFW[EgressFirewall default<br/>Allow 192.168.99.0/24 node network<br/>Allow 172.22.0.0/16 service network<br/>Allow 10.132.0.0/14 pod network<br/>Deny 91 Google CIDRs<br/>Allow 0.0.0.0/0 all other traffic]
            end

            POD -- "fetch IP lists" --> GSTATIC
            POD -- "PATCH via K8s API ClusterRole" --> EFW

            style CJ fill:#C8E6C9,stroke:#388E3C
            style POD fill:#C8E6C9,stroke:#388E3C
            style GSTATIC fill:#BBDEFB,stroke:#1976D2
            style EFW fill:#FFE0B2,stroke:#F57C00

Key Design Point: The CronJob must run in a separate namespace with no EgressFirewall.

If the CronJob is in the same namespace as the EgressFirewall it manages, it will be blocked from reaching www.gstatic.com (a Google IP) and fail to download the IP lists.
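You can verify the constraint directly with a throwaway pod: www.gstatic.com stays reachable from egress-fw-updater, while the same request from llm-demo times out once the firewall is applied. A sketch:

        oc run -n egress-fw-updater gstatic-test --rm -it --restart=Never \
          --image=quay.io/wangzheng422/qimgs:centos9-test-2025.12.18.v01 -- \
          curl -s --max-time 8 -o /dev/null -w "%{http_code}\n" \
          https://www.gstatic.com/ipranges/goog.json

        # 200  (no EgressFirewall in this namespace)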

deploy the automation


        # apply all resources at once:
        
        # - Namespace: egress-fw-updater (no EgressFirewall)
        
        # - ServiceAccount + ClusterRole + ClusterRoleBinding
        
        # - ConfigMap (Python script)
        
        # - CronJob (runs daily at 02:00)
        
        cat << 'EOF' > ${BASE_DIR}/data/install/egress-firewall-google-updater.yaml
        ---
        
        # Dedicated namespace for the updater - NO EgressFirewall here
        
        apiVersion: v1
        kind: Namespace
        metadata:
          name: egress-fw-updater
        
        ---
        apiVersion: v1
        kind: ServiceAccount
        metadata:
          name: egress-firewall-updater
          namespace: egress-fw-updater
        
        ---
        
        # ClusterRole: can manage EgressFirewall in any namespace
        
        apiVersion: rbac.authorization.k8s.io/v1
        kind: ClusterRole
        metadata:
          name: egress-firewall-updater
        rules:
        
        - apiGroups: ["k8s.ovn.org"]
          resources: ["egressfirewalls"]
          verbs: ["get", "create", "update", "patch", "delete"]
        
        ---
        apiVersion: rbac.authorization.k8s.io/v1
        kind: ClusterRoleBinding
        metadata:
          name: egress-firewall-updater
        subjects:
        
        - kind: ServiceAccount
          name: egress-firewall-updater
          namespace: egress-fw-updater
        roleRef:
          kind: ClusterRole
          name: egress-firewall-updater
          apiGroup: rbac.authorization.k8s.io
        
        ---
        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: egress-firewall-updater-script
          namespace: egress-fw-updater
        data:
          update.py: |
            import json, urllib.request, ipaddress, sys, os
        
            def fetch_json(url):
                with urllib.request.urlopen(url, timeout=30) as r:
                    return json.loads(r.read())
        
            # TARGET_NAMESPACE: the namespace to apply EgressFirewall to
            NS              = os.environ.get("TARGET_NAMESPACE", "llm-demo")
            MACHINE_NETWORK = os.environ.get("MACHINE_NETWORK", "192.168.99.0/24")
            SERVICE_NETWORK = os.environ.get("SERVICE_NETWORK", "172.22.0.0/16")
            CLUSTER_NETWORK = os.environ.get("CLUSTER_NETWORK", "10.132.0.0/14")
        
            print("Fetching goog.json from www.gstatic.com ...")
            goog  = fetch_json("https://www.gstatic.com/ipranges/goog.json")
            print("Fetching cloud.json from www.gstatic.com ...")
            cloud = fetch_json("https://www.gstatic.com/ipranges/cloud.json")
        
            def get_v4(data):
                return {ipaddress.ip_network(p["ipv4Prefix"])
                        for p in data["prefixes"] if "ipv4Prefix" in p}
        
            goog_v4  = get_v4(goog)
            cloud_v4 = get_v4(cloud)
        
            # goog - cloud = Google own service IPs (not GCP customer IPs)
            google_only = sorted(
                [net for net in goog_v4
                 if not any(net.subnet_of(c) for c in cloud_v4)],
                key=lambda n: (n.network_address, n.prefixlen)
            )
            print(f"goog IPv4: {len(goog_v4)}, cloud IPv4: {len(cloud_v4)}, google-only: {len(google_only)}")
        
            lines = []
            lines.append("apiVersion: k8s.ovn.org/v1")
            lines.append("kind: EgressFirewall")
            lines.append("metadata:")
            lines.append("  name: default")
            lines.append(f"  namespace: {NS}")
            lines.append("spec:")
            lines.append("  egress:")
            # Allow internal cluster networks first (must come before the deny rules)
            for cidr, comment in [
                (MACHINE_NETWORK, "node/machine network (API server access)"),
                (SERVICE_NETWORK, "service network"),
                (CLUSTER_NETWORK, "pod/cluster network"),
            ]:
                lines.append(f"  # {comment}")
                lines.append("  - type: Allow")
                lines.append("    to:")
                lines.append(f"      cidrSelector: {cidr}")
            # Deny Google-only CIDRs
            for net in google_only:
                lines.append("  - type: Deny")
                lines.append("    to:")
                lines.append(f"      cidrSelector: {net}")
            # Allow everything else
            lines.append("  - type: Allow")
            lines.append("    to:")
            lines.append("      cidrSelector: 0.0.0.0/0")
        
            with open("/tmp/egress-firewall.yaml", "w") as f:
                f.write("\n".join(lines))
            print(f"YAML written ({len(google_only) + 4} rules total)")
        
        ---
        apiVersion: batch/v1
        kind: CronJob
        metadata:
          name: egress-firewall-google-updater
          namespace: egress-fw-updater
        spec:
          schedule: "0 2 * * *"
          successfulJobsHistoryLimit: 3
          failedJobsHistoryLimit: 3
          jobTemplate:
            spec:
              template:
                spec:
                  serviceAccountName: egress-firewall-updater
                  restartPolicy: OnFailure
                  containers:
                  - name: updater
                    image: quay.io/wangzheng422/qimgs:centos9-test-2025.12.18.v01
                    env:
                    # Target namespace where EgressFirewall will be applied
                    - name: TARGET_NAMESPACE
                      value: "llm-demo"
                    # Cluster network CIDRs to allow (customize for your cluster)
                    - name: MACHINE_NETWORK
                      value: "192.168.99.0/24"
                    - name: SERVICE_NETWORK
                      value: "172.22.0.0/16"
                    - name: CLUSTER_NETWORK
                      value: "10.132.0.0/14"
                    command:
                    - /bin/bash
                    - -c
                    - |
                      set -e
                      echo "=== Step 1: Generate EgressFirewall YAML ==="
                      python3 /scripts/update.py
        
                      echo "=== Step 2: Apply via Kubernetes Server-Side Apply API ==="
                      TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
        
                      HTTP_RESULT=$(curl -k -s -w "\nHTTP_STATUS:%{http_code}" \
                        -X PATCH \
                        -H "Authorization: Bearer ${TOKEN}" \
                        -H "Content-Type: application/apply-patch+yaml" \
                        "https://kubernetes.default.svc/apis/k8s.ovn.org/v1/namespaces/${TARGET_NAMESPACE}/egressfirewalls/default?fieldManager=egress-firewall-updater&force=true" \
                        --data-binary @/tmp/egress-firewall.yaml)
        
                      HTTP_STATUS=$(echo "${HTTP_RESULT}" | grep HTTP_STATUS | cut -d: -f2)
                      echo "Apply HTTP status: ${HTTP_STATUS}"
                      if [[ "${HTTP_STATUS}" == "200" || "${HTTP_STATUS}" == "201" ]]; then
                        echo "=== EgressFirewall in ${TARGET_NAMESPACE} updated successfully ==="
                      else
                        echo "=== ERROR: HTTP ${HTTP_STATUS} ==="
                        echo "${HTTP_RESULT}"
                        exit 1
                      fi
                    volumeMounts:
                    - name: scripts
                      mountPath: /scripts
                  volumes:
                  - name: scripts
                    configMap:
                      name: egress-firewall-updater-script
        EOF
        
        oc apply -f ${BASE_DIR}/data/install/egress-firewall-google-updater.yaml
        
        # to delete
        
        # oc delete -f ${BASE_DIR}/data/install/egress-firewall-google-updater.yaml
        
        # oc delete egressfirewall default -n llm-demo

manually trigger and verify


        # manually trigger one run (for testing, without waiting for cron schedule)
        
        oc create job -n egress-fw-updater egress-fw-test-run \
          --from=cronjob/egress-firewall-google-updater
        
        # watch job status
        
        oc get job -n egress-fw-updater egress-fw-test-run -w
        
        # check job logs
        
        oc logs -n egress-fw-updater -l job-name=egress-fw-test-run
        
        # expected log output:
        
        # === Step 1: Generate EgressFirewall YAML ===
        
        # Fetching goog.json from www.gstatic.com ...
        
        # Fetching cloud.json from www.gstatic.com ...
        
        # goog IPv4: 96, cloud IPv4: 862, google-only: 91
        
        # YAML written (95 rules total)
        
        # === Step 2: Apply via Kubernetes Server-Side Apply API ===
        
        # Apply HTTP status: 200
        
        # === EgressFirewall in llm-demo updated successfully ===
        
        # verify EgressFirewall status
        
        oc get egressfirewall -n llm-demo
        
        # NAME      EGRESSFIREWALL STATUS
        
        # default   EgressFirewall Rules applied
        
        # check rule count
        
        oc get egressfirewall -n llm-demo default -o json | jq '.spec.egress | length'
        
        # 95

verify google is blocked


        # Before applying EgressFirewall - Google is accessible
        
        oc exec -n llm-demo wzh-demo-pod -- curl -s --max-time 8 \
          -o /dev/null -w "%{http_code}" https://www.google.com
        
        # 200
        
        # After applying EgressFirewall - Google is blocked (connection timeout)
        
        oc exec -n llm-demo wzh-demo-pod -- curl -s --max-time 8 \
          -o /dev/null -w "%{http_code}" https://www.google.com
        
        # 000  (exit code 28 = timeout, blocked by EgressFirewall)
        
        # Other sites remain accessible
        
        oc exec -n llm-demo wzh-demo-pod -- curl -s --max-time 8 \
          -o /dev/null -w "%{http_code}" https://www.baidu.com
        
        # 200

notes

eBPF deep-dive: testing every feature module

overview

The Network Observability Operator 1.10 on OCP 4.20 supports the following eBPF agent features, all configured via the spec.agent.ebpf.features list in the FlowCollector CR. This section documents a systematic test of every feature module, including deployment steps, CLI verification, flow record evidence, and CPU/memory impact analysis.

The following features were enabled simultaneously for testing:

        spec:
          agent:
            ebpf:
              features:
                - PacketDrop
                - DNSTracking
                - FlowRTT
                - NetworkEvents
                - PacketTranslation
                - UDNMapping
                - IPSec
              privileged: true       # required for PacketDrop, NetworkEvents, UDNMapping
              sampling: 50           # 1-in-50 packet sampling
              cacheMaxFlows: 120000
              cacheActiveTimeout: "15s"
| Feature | Purpose | Requires privileged: true | Key Flow Fields |
| --- | --- | --- | --- |
| FlowRTT | TCP round-trip time measurement | No | TimeFlowRttNs |
| DNSTracking | DNS request/response latency & RCODE | No | DnsId, DnsLatencyMs, DnsFlagsResponseCode |
| PacketDrop | Kernel-level packet drop tracking | Yes | PktDropBytes, PktDropPackets, PktDropLatestDropCause, PktDropLatestState |
| PacketTranslation | Service → Pod address translation (xlat) | No | XlatSrcAddr, XlatDstAddr, XlatSrcPort, XlatDstPort, XlatDstK8S_Name |
| NetworkEvents | OVN-Kubernetes ACL action tracking | Yes | NetworkEvents (array of Type/Action/Feature) |
| UDNMapping | User-defined network interface mapping | Yes | Udns |
| IPSec | IPSec encryption status tracking | No | IPSec-related fields |

test environment

        Cluster:       OCP 4.20.21 (3 control-plane/worker nodes)
        Operator:      Network Observability Operator 1.10
        Agent:         eBPF DaemonSet (3 pods, one per node)
        Backend:       LokiStack (1x.demo, backed by rustfs S3)
        Namespace:     netobserv-test (test workloads), llm-demo (egress IP test)

baseline resource usage (all features enabled)


        # eBPF agent resource consumption with ALL 7 features enabled
        
        oc adm top pods -n netobserv-privileged
        
        # NAME                         CPU(cores)   MEMORY(bytes)
        
        # netobserv-ebpf-agent-48t8v   11m          164Mi         (master-03)
        
        # netobserv-ebpf-agent-h9ldf   12m          160Mi         (master-02)
        
        # netobserv-ebpf-agent-q92x7   11m          144Mi         (master-01)
        
        # Node-level resource usage
        
        oc adm top nodes
        
        # NAME             CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)
        
        # master-01-demo   1341m        5%       9274Mi          32%
        
        # master-02-demo   1704m        7%       12737Mi         43%
        
        # master-03-demo   2296m        9%       15883Mi         54%

Key observation: Each eBPF agent pod uses approximately 11–12m CPU (0.01 cores) and 144–164 MiB memory with all 7 features enabled. This is remarkably lightweight — the eBPF programs run in kernel space and only the user-space agent process (which aggregates and forwards flows) shows up in the pod metrics. The actual kernel-side CPU overhead is not reflected in pod metrics but is included in the node-level CPU figures.

deploy test workloads


        # Create test namespace with Service + backend pods + traffic generator
        
        cat << EOF | oc apply -f -
        ---
        apiVersion: v1
        kind: Namespace
        metadata:
          name: netobserv-test
        ---
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: web-server
          namespace: netobserv-test
        spec:
          replicas: 2
          selector:
            matchLabels:
              app: web-server
          template:
            metadata:
              labels:
                app: web-server
            spec:
              containers:
              - name: nginx
                image: quay.io/wangzheng422/qimgs:centos9-test-2025.12.18.v01
                command: ["/bin/bash", "-c"]
                args:
                - |
                  python3 -m http.server 8080
                ports:
                - containerPort: 8080
        ---
        apiVersion: v1
        kind: Service
        metadata:
          name: web-svc
          namespace: netobserv-test
        spec:
          selector:
            app: web-server
          ports:
          - port: 80
            targetPort: 8080
            protocol: TCP
        ---
        apiVersion: v1
        kind: Pod
        metadata:
          name: traffic-gen
          namespace: netobserv-test
          labels:
            app: traffic-gen
        spec:
          nodeSelector:
            kubernetes.io/hostname: 'master-02-demo'
          containers:
          - name: client
            image: quay.io/wangzheng422/qimgs:centos9-test-2025.12.18.v01
            command: ["/bin/bash", "-c", "tail -f /dev/null"]
        EOF
        
        # Deploy NetworkPolicy to allow only port 8080 ingress to web-server
        
        # Traffic to other ports (e.g., 9999) will be dropped by OVN → captured by PacketDrop
        
        cat << EOF | oc apply -f -
        apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        metadata:
          name: deny-port-9999
          namespace: netobserv-test
        spec:
          podSelector:
            matchLabels:
              app: web-server
          policyTypes:
          - Ingress
          ingress:
          - from:
            - podSelector:
                matchLabels:
                  app: traffic-gen
            ports:
            - protocol: TCP
              port: 8080
          - from:
            - namespaceSelector: {}
            ports:
            - protocol: TCP
              port: 8080
        EOF

generate test traffic


        # 1. Service traffic for PacketTranslation (xlat) — via ClusterIP
        
        SVC_IP=$(oc get svc web-svc -n netobserv-test -o jsonpath="{.spec.clusterIP}")
        echo "Service ClusterIP: $SVC_IP"
        
        # Service ClusterIP: 172.22.114.234
        
        for i in $(seq 1 10); do
          oc exec -n netobserv-test traffic-gen -- \
            curl -s --max-time 3 -o /dev/null -w "req$i: HTTP %{http_code} " http://$SVC_IP:80/
        done
        
        # req1: HTTP 200 req2: HTTP 200 ... req10: HTTP 200
        
        # 2. DNS queries for DNSTracking
        
        for domain in www.google.com www.baidu.com kubernetes.default.svc redhat.com github.com; do
          oc exec -n netobserv-test traffic-gen -- nslookup $domain > /dev/null 2>&1 \
            && echo "DNS OK: $domain" || echo "DNS FAIL: $domain"
        done
        
        # DNS OK: www.google.com
        
        # DNS OK: www.baidu.com
        
        # DNS OK: kubernetes.default.svc
        
        # DNS OK: redhat.com
        
        # DNS OK: github.com
        
        # 3. External traffic for FlowRTT measurement
        
        oc exec -n netobserv-test traffic-gen -- \
          curl -s --max-time 5 -o /dev/null -w "HTTP %{http_code} RTT_connect=%{time_connect}s" \
          https://www.baidu.com
        
        # HTTP 200 RTT_connect=0.594255s
        
        # 4. Blocked traffic for PacketDrop — connect to port 9999 (blocked by NetworkPolicy)
        
        WEB_POD_IP=$(oc get pod -n netobserv-test -l app=web-server -o jsonpath="{.items[0].status.podIP}")
        oc exec -n netobserv-test traffic-gen -- \
          timeout 3 bash -c "echo test | nc -w 2 $WEB_POD_IP 9999"
        
        # Ncat: TIMEOUT. (connection blocked by NetworkPolicy — packet dropped by OVN)

feature 1: FlowRTT (TCP Round-Trip Time)

what it does

FlowRTT uses eBPF to measure TCP Smoothed Round-Trip Time (sRTT) at the kernel level. It hooks into the TCP stack and reads the tcp_sock->srtt_us value, recording it in nanoseconds as the TimeFlowRttNs field in each flow record. This provides accurate network latency measurement without any application-level instrumentation.
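You can cross-check these values against the kernel's own view for live sockets; ss prints the same sRTT that the eBPF hook copies. A sketch run inside the traffic pod, assuming iproute2's ss is present in the image:

        # rtt:<srtt>/<rttvar> in ms, for established connections to port 443
        oc exec -n llm-demo wzh-demo-pod -- bash -c \
          "ss -ti state established '( dport = :443 )'"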

evidence from Loki flow records

Multiple flows captured with TimeFlowRttNs values showing intra-cluster RTT measurements:

        Flow: 10.134.0.2:58136 -> 10.134.0.30:3101   RTT = 89,000 ns (0.089 ms)   — same-node
        Flow: 10.133.0.91:34732 -> 10.134.0.30:3100   RTT = 912,000 ns (0.9 ms)    — cross-node
        Flow: 10.134.0.14:45106 -> 10.134.0.29:9095   RTT = 13,508,000 ns (13.5 ms) — Loki ingester
        Flow: 192.168.99.25:10250 -> 10.132.0.90       RTT = 10,041,000 ns (10 ms)  — kubelet
        Flow: 10.133.0.4:8443 -> 10.133.0.91:53320     RTT = 15,425,000 ns (15.4 ms) — API server

Interpretation: Same-node flows show sub-millisecond RTT (0.089ms). Cross-node flows via GENEVE overlay show 0.9–15ms RTT depending on load and path. These RTT values are measured at the TCP layer by eBPF, not by ICMP ping, so they reflect actual application-perceived latency.

CPU impact

FlowRTT adds minimal overhead because it reads an existing kernel field (tcp_sock->srtt_us) rather than performing new measurements. The eBPF program simply copies a value that the kernel already maintains for TCP congestion control.

feature 2: DNSTracking

what it does

DNSTracking captures DNS request/response pairs at the eBPF level. It hooks into UDP port 53 traffic and parses DNS headers to extract the transaction ID (DnsId), the query-to-response latency (DnsLatencyMs), and the response code (DnsFlagsResponseCode).

observation

With sampling rate of 50 (1 in 50 packets), DNS tracking captures are sparse because DNS queries are typically single-packet request/response pairs. The probability of capturing both the request AND response for the same DNS transaction is (1/50) × (1/50) = 0.04%. To reliably observe DNS tracking data, reduce the sampling rate:


        # Temporarily reduce sampling to capture DNS flows
        
        oc patch flowcollector cluster --type=json \
          -p '[{"op": "replace", "path": "/spec/agent/ebpf/sampling", "value": 1}]'
        
        # After testing, restore default sampling
        
        oc patch flowcollector cluster --type=json \
          -p '[{"op": "replace", "path": "/spec/agent/ebpf/sampling", "value": 50}]'

CPU impact

DNSTracking adds a small overhead because the eBPF program must parse DNS packet headers (beyond the normal IP/TCP/UDP header inspection). For each DNS packet, the agent decodes the DNS header to extract the transaction ID, flags, and response code. At sampling=50, the impact is negligible. At sampling=1, expect a noticeable increase in CPU for DNS-heavy workloads.

feature 3: PacketDrop

what it does

PacketDrop hooks into the kernel’s kfree_skb tracepoint to detect when packets are dropped. It captures the dropped byte and packet counts (PktDropBytes, PktDropPackets), the kernel drop reason (PktDropLatestDropCause), and the TCP state and flags at drop time (PktDropLatestState, PktDropLatestFlags).

Requires privileged: true because the kfree_skb tracepoint is a privileged kernel operation.

evidence from Loki flow records

A packet drop was captured from the Loki internal traffic:

        {
          "SrcAddr": "10.132.0.90",
          "DstAddr": "10.134.0.30",
          "SrcPort": 47464,
          "DstPort": 3100,
          "PktDropBytes": 32,
          "PktDropPackets": 1,
          "PktDropLatestDropCause": "SKB_DROP_REASON_TCP_RESET",
          "PktDropLatestFlags": 16,
          "PktDropLatestState": "TCP_INVALID_STATE"
        }

Interpretation: This shows a TCP RST packet (flags=16=ACK) being dropped by the kernel because it arrived in an invalid TCP state. This is normal TCP behavior — connection teardown can result in transient invalid states. The important point is that eBPF captures these drops that are invisible to application-level monitoring.

test with NetworkPolicy-blocked traffic

The NetworkPolicy we deployed blocks traffic to port 9999 on web-server pods. When traffic-gen attempts to connect to port 9999, OVN drops the packet. The eBPF agent captures this as a SYN packet on the GENEVE tunnel interface:

        {
          "SrcAddr": "10.132.0.46",
          "DstAddr": "10.134.0.36",
          "SrcPort": 37186,
          "DstPort": 9999,
          "SrcK8S_Name": "traffic-gen",
          "DstK8S_Name": "web-server-65d6fb4bc4-2l2nb",
          "Flags": ["SYN"],
          "Interfaces": ["genev_sys_6081"],
          "Packets": 1,
          "Bytes": 74
        }

Note: The SYN packet is captured at the GENEVE tunnel interface before OVN drops it. PacketDrop fields (PktDropBytes, etc.) appear when the kernel’s kfree_skb is triggered, which may happen at a different point in the OVN processing pipeline.

CPU impact

PacketDrop attaches to the kfree_skb tracepoint which fires for every dropped packet in the kernel — not just sampled ones. The eBPF program then records the drop reason and associates it with the flow. On nodes with high packet drop rates (e.g., under DDoS or heavy NetworkPolicy enforcement), this can add measurable CPU overhead. In normal cluster operation, packet drops are infrequent, so the impact is minimal.

feature 4: PacketTranslation (xlat)

what it does

PacketTranslation enriches flow records with the translated endpoint information — i.e., when a packet passes through NAT/DNAT (e.g., Service ClusterIP → backend Pod IP), both the original and translated addresses are recorded. This lets you see which backend pod or node actually served a request addressed to a Service VIP, and trace a flow across the DNAT boundary.

evidence from Loki flow records

Service-to-node translation (Kubernetes API Server):

        {
          "SrcAddr": "10.132.0.54",
          "DstAddr": "172.22.0.1",
          "SrcPort": 38680,
          "DstPort": 443,
          "SrcK8S_Name": "machine-api-controllers-5644995fb-fpvtt",
          "SrcK8S_Namespace": "openshift-machine-api",
          "XlatDstAddr": "192.168.99.24",
          "XlatDstK8S_Name": "master-02-demo",
          "XlatDstK8S_Type": "Node",
          "XlatDstPort": 6443,
          "XlatSrcAddr": "10.132.0.54",
          "XlatSrcK8S_Name": "machine-api-controllers-5644995fb-fpvtt",
          "XlatSrcK8S_Type": "Pod"
        }

Interpretation: The machine-api-controllers pod accessed the Kubernetes API at 172.22.0.1:443 (the Service ClusterIP). OVN DNAT’d this to 192.168.99.24:6443 (master-02-demo’s actual API server port). The Xlat* fields reveal this translation, which is invisible in standard flow logs.

Node-to-node etcd translation:

        {
          "SrcAddr": "10.134.0.27",
          "DstAddr": "192.168.99.25",
          "DstPort": 2379,
          "XlatDstAddr": "192.168.99.25",
          "XlatDstK8S_Name": "master-03-demo",
          "XlatDstK8S_Type": "Node",
          "XlatSrcAddr": "192.168.99.23",
          "XlatSrcK8S_Name": "master-01-demo",
          "XlatSrcK8S_Type": "Node",
          "XlatSrcPort": 41234
        }

CPU impact

PacketTranslation uses eBPF’s connection tracking (conntrack) integration to look up NAT mappings. This adds a conntrack lookup for each captured packet, which is a constant-time hash table operation. The overhead is proportional to the number of sampled packets, making it negligible at the default sampling rate of 50.

feature 5: NetworkEvents

what it does

NetworkEvents tracks OVN-Kubernetes ACL (Access Control List) actions — whenever OVN processes a packet against NetworkPolicy, AdminNetworkPolicy, BaselineNetworkPolicy, EgressFirewall, UDN isolation, or Multicast rules, the action (allow/drop) is recorded in the NetworkEvents field.

This is a Technology Preview feature as of OCP 4.20.

observation

NetworkEvents requires OVN-Kubernetes to emit ACL action events via a specific mechanism that links eBPF observations to OVN internal state. In our testing with sampling=50, no NetworkEvents were captured in the 15-minute observation window. This is expected behavior:

  1. NetworkEvents are generated only for new connections (not every packet)
  2. OVN ACL events are relatively rare compared to data-plane traffic
  3. With 1-in-50 sampling, the probability of capturing the exact first packet of a new connection (where the ACL decision is recorded) is low

To reliably observe NetworkEvents, use the OCP console’s Network Traffic UI with the “Network events” filter, or temporarily reduce the sampling rate.

CPU impact

NetworkEvents adds overhead by hooking into the OVN ACL processing path. Since ACL evaluations happen for every new connection (not every packet), the per-packet overhead is minimal. The main cost is the additional metadata collection and correlation between eBPF and OVN event streams.

feature 6: UDNMapping

what it does

UDNMapping (User-Defined Network Mapping) maps network flows to user-defined networks (UDNs). When pods are attached to custom networks (created via UserDefinedNetwork CRDs), this feature identifies which network a flow belongs to via the Udns field.

Requires privileged: true because it needs access to network namespace information.

evidence from Loki flow records

All flow records include the Udns field showing the network name:

        {
          "Udns": ["default"]          // Pod on the default cluster network
        }
        {
          "Udns": ["", "default"]      // Flow observed on multiple interfaces
        }

In our cluster, all pods use the default OVN-Kubernetes network, so all Udns values show "default". When UserDefinedNetwork CRs are created and pods are attached to them, the Udns field would show the custom network name (e.g., "my-tenant-network").
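For reference, a minimal UserDefinedNetwork sketch (the name, namespace, and subnet here are hypothetical; no UDN was created in this lab):

        cat << EOF | oc apply -f -
        apiVersion: k8s.ovn.org/v1
        kind: UserDefinedNetwork
        metadata:
          name: my-tenant-network
          namespace: my-tenant-ns
        spec:
          topology: Layer2
          layer2:
            role: Primary
            subnets:
            - 10.200.0.0/16
        EOF

        # note: a Primary UDN must be created before any workload pods exist in its namespace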

CPU impact

UDNMapping adds a lookup to determine which UDN a network interface belongs to. This is a per-flow operation (not per-packet due to flow caching), so the overhead scales with the number of unique flows rather than packet rate. Impact is negligible for most workloads.

feature 7: IPSec

what it does

IPSec tracking detects whether traffic is encrypted/decrypted using IPSec. When IPSec is configured at the cluster level (via the Cluster Network Operator), this feature adds encryption status metadata to flow records.

observation

Our test cluster does not have cluster-level IPSec enabled (it requires spec.defaultNetwork.ovnKubernetesConfig.ipsecConfig in the Network operator CR). Therefore, no IPSec-specific fields appear in flow records. The feature is enabled in the FlowCollector but has no effect without IPSec infrastructure.

To test this feature, IPSec must be enabled at the cluster level, which requires re-encrypting all inter-node traffic — a significant infrastructure change not suitable for a simple lab test.
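For completeness, cluster-wide IPsec would be enabled roughly as follows (not run in this lab; it triggers a rolling re-encryption of all east-west traffic):

        # NOT executed in this lab
        oc patch networks.operator.openshift.io cluster --type=merge \
          -p '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"ipsecConfig":{"mode":"Full"}}}}}'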

eBPF Flow Filter (bonus feature)

what it does

The eBPF flow filter (spec.agent.ebpf.flowFilter) allows kernel-level filtering of flows before they leave the eBPF agent. This is different from Loki-side filtering — it reduces the amount of data sent from the agent, saving CPU, memory, and network bandwidth.

configuration example

        spec:
          agent:
            ebpf:
              flowFilter:
                enable: true
                rules:
                - cidr: "10.132.0.0/14"
                  action: Accept
                  protocol: TCP
                  ports: 8080
                  direction: Ingress
                - cidr: "0.0.0.0/0"
                  action: Reject

This configuration would:

  1. Accept only TCP port 8080 traffic within the pod network
  2. Reject all other traffic at the eBPF level (never sent to flowlogs-pipeline or Loki)

Note: Flow filtering was not applied during our testing because we wanted to capture all features’ data. In production, flow filtering can significantly reduce resource consumption by eliminating uninteresting infrastructure traffic.

CPU and system impact analysis

eBPF agent resource consumption

| Metric | Value (per agent) | Notes |
| --- | --- | --- |
| CPU usage | 11–12m (0.01 cores) | With all 7 features enabled |
| Memory usage | 144–164 MiB | Flow cache + eBPF maps |
| CPU request | 100m | Configured in FlowCollector |
| Memory limit | 800 MiB | Configured in FlowCollector |

kernel-side impact

The eBPF programs run inside the kernel and their CPU consumption is not reflected in the agent pod’s CPU metrics. Instead, it shows up in the node-level CPU figures. The kernel-side overhead depends on:

  1. Packet rate — More packets = more eBPF invocations
  2. Sampling rate — Lower sampling = more processing per packet
  3. Features enabled — Each feature adds hooks to different kernel subsystems
  4. Flow diversity — More unique flows = larger eBPF hash maps

In our 3-node cluster with light test traffic, the kernel-side eBPF overhead was too small to separate from normal control-plane noise in the node-level figures shown earlier.
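If you need to quantify the kernel-side share, the kernel can account per-program eBPF runtime. A sketch using the bpftool debug image from the deep-dive section below; bpftool reports run_time_ns/run_cnt only while stats are enabled, and writing the sysctl assumes the privileged oc debug node pod:

        IMG="quay.io/wangzheng422/qimgs:centos9-test-2025.12.18.v01"

        oc debug node/master-01-demo --image=$IMG -- bash -c '
          sysctl -w kernel.bpf_stats_enabled=1
          sleep 60
          bpftool prog list | grep run_time_ns | head -20
          sysctl -w kernel.bpf_stats_enabled=0
        '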

flowlogs-pipeline resource consumption

| Metric | Value |
| --- | --- |
| CPU usage | 49m (0.05 cores) |
| Memory usage | 63 MiB |

The flowlogs-pipeline pod processes and enriches flows before sending to Loki. Its resource consumption scales with the flow rate (which depends on cluster traffic volume and sampling rate).

recommendations for production

  1. Sampling rate: Start with the default of 50. Lower values increase accuracy but also increase CPU, memory, and storage. For most clusters, sampling=50 provides sufficient visibility.

  2. Feature selection: Enable only the features you need:

    • FlowRTT — low overhead, high value for latency troubleshooting
    • PacketTranslation — low overhead, essential for Service traffic visibility
    • PacketDrop — moderate overhead (requires privileged: true), valuable for NetworkPolicy debugging
    • DNSTracking — low overhead, useful for DNS troubleshooting
    • NetworkEvents — moderate overhead, useful for security audit
    • UDNMapping — low overhead, only needed with UserDefinedNetworks
    • IPSec — negligible overhead, only relevant with IPSec-enabled clusters
  3. Flow filtering: Use spec.agent.ebpf.flowFilter to exclude infrastructure traffic (e.g., etcd, kubelet health checks) at the kernel level, significantly reducing downstream resource usage.

  4. Memory limits: The default 800 MiB memory limit for the eBPF agent is sufficient for most workloads. If NetObservAgentFlowsDropped alerts fire, increase cacheMaxFlows and the memory limit.

querying Loki for flow records

You can query Loki directly to inspect flow records with feature fields:


        # Get authentication token
        
        TOKEN=$(oc whoami -t)
        
        # Query flows from a specific namespace (last 10 minutes)
        
        NOW=$(date +%s)
        START=$((NOW-600))
        
        oc exec -n netobserv flowlogs-pipeline-<pod-id> -- curl -s -k \
          "https://loki-gateway-http.netobserv.svc:8080/api/logs/v1/network/loki/api/v1/query_range" \
          --data-urlencode 'query={SrcK8S_Namespace="netobserv-test"}' \
          --data-urlencode "limit=10" \
          --data-urlencode "start=${START}000000000" \
          --data-urlencode "end=${NOW}000000000" \
          -H "Authorization: Bearer $TOKEN"

Important: Use the query_range endpoint (not query) because Loki stores flow logs as log streams, which require range queries.
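The response follows Loki's standard query_range shape: streams whose values are [timestamp, line] pairs, where each line is a flow JSON document. A sketch for pulling selected fields out with jq, assuming the curl output was saved to /tmp/flows.json:

        jq -r '.data.result[].values[][1] | fromjson
               | [.SrcAddr, .DstAddr, .DstPort, .TimeFlowRttNs] | @tsv' /tmp/flows.json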

feature fields summary

The following table shows all eBPF-specific fields that may appear in flow records stored in Loki:

| Field | Feature | Type | Description |
| --- | --- | --- | --- |
| TimeFlowRttNs | FlowRTT | integer | TCP smoothed RTT in nanoseconds |
| DnsId | DNSTracking | integer | DNS transaction ID |
| DnsLatencyMs | DNSTracking | integer | DNS query-to-response latency in ms |
| DnsFlagsResponseCode | DNSTracking | string | DNS RCODE (NOERROR, NXDOMAIN, etc.) |
| DnsErrno | DNSTracking | integer | System DNS error code |
| PktDropBytes | PacketDrop | integer | Total bytes in dropped packets |
| PktDropPackets | PacketDrop | integer | Number of dropped packets |
| PktDropLatestDropCause | PacketDrop | string | Kernel drop reason |
| PktDropLatestState | PacketDrop | string | TCP state at drop time |
| PktDropLatestFlags | PacketDrop | integer | TCP flags of dropped packet |
| XlatSrcAddr | PacketTranslation | string | Translated (post-NAT) source IP |
| XlatDstAddr | PacketTranslation | string | Translated (post-NAT) destination IP |
| XlatSrcPort | PacketTranslation | integer | Translated (post-NAT) source port |
| XlatDstPort | PacketTranslation | integer | Translated (post-NAT) destination port |
| XlatSrcK8S_Name | PacketTranslation | string | Translated source K8s object |
| XlatDstK8S_Name | PacketTranslation | string | Translated destination K8s object |
| NetworkEvents | NetworkEvents | array | OVN ACL actions (allow/drop) |
| Udns | UDNMapping | array | User-defined network names |

cleanup


        # Remove test workloads
        
        oc delete namespace netobserv-test
        
        # To disable specific features, patch the FlowCollector:
        
        # oc patch flowcollector cluster --type=json \
        
        #   -p '[{"op": "replace", "path": "/spec/agent/ebpf/features", "value": ["FlowRTT"]}]'

eBPF kernel-level deep-dive: bpftool inspection on CoreOS nodes

CoreOS nodes do not ship bpftool, so we use oc debug with a custom image that has it pre-installed. The --image flag overrides the debug pod’s container image, but we do not use chroot /host because the host filesystem lacks bpftool. Instead, the debug pod runs with host privileges and can directly access the kernel’s eBPF subsystem.


        IMG="quay.io/wangzheng422/qimgs:centos9-test-2025.12.18.v01"
        
        # list all eBPF programs on master-01
        
        oc debug node/master-01-demo --image=$IMG -- bash -c 'bpftool prog list'
        
        # list all eBPF maps
        
        oc debug node/master-01-demo --image=$IMG -- bash -c 'bpftool map list'
        
        # show network attachment points
        
        oc debug node/master-01-demo --image=$IMG -- bash -c 'bpftool net list'
        
        # export program list as JSON for scripted analysis
        
        oc debug node/master-01-demo --image=$IMG -- bash -c 'bpftool prog list --json'

NetObserv eBPF programs on master-01-demo

Running bpftool prog list on master-01-demo reveals 145 total eBPF programs loaded system-wide. Of these, 16 belong to netobserv-ebpf-agent (PID 7253). The following table maps each program to its NetObserv feature:

| ID | Type | Name | Feature | xlated | jited | memlock |
| --- | --- | --- | --- | --- | --- | --- |
| 133 | tracepoint | kfree_skb | PacketDrop | 4,208B | 2,375B | 12,288B |
| 134 | kprobe | network_events_monitoring | NetworkEvents | 6,560B | 3,695B | 16,384B |
| 135 | kprobe | probe_entry_SSL_write | TLS/SSL monitoring | 48B | 39B | 4,096B |
| 136 | sched_cls | tc_egress_flow_parse | Flow - TC egress | 9,904B | 6,148B | 20,480B |
| 137 | sched_cls | tc_egress_pca_parse | PacketCapture - TC | 80B | 69B | 12,288B |
| 138 | sched_cls | tc_ingress_flow_parse | Flow - TC ingress | 9,904B | 6,145B | 20,480B |
| 141 | tracing | tcp_rcv_fentry | FlowRTT - fentry | 2,936B | 1,716B | 12,288B |
| 142 | kprobe | tcp_rcv_kprobe | FlowRTT - kprobe fallback | 2,928B | 1,713B | 12,288B |
| 143 | sched_cls | tcx_egress_flow_parse | Flow - TCX egress | 9,912B | 6,151B | 20,480B |
| 145 | sched_cls | tcx_ingress_flow_parse | Flow - TCX ingress | 9,912B | 6,148B | 20,480B |
| 146 | sched_cls | tcx_ingress_pca_parse | PacketCapture - TCX | 88B | 69B | 12,288B |
| 147 | kprobe | track_nat_manip_pkt | PacketTranslation | 6,184B | 3,733B | 16,384B |
| 148 | kprobe | xfrm_input_kprobe | IPSec - input | 3,680B | 2,123B | 12,288B |
| 149 | kprobe | xfrm_input_kretprobe | IPSec - input return | 1,312B | 876B | 4,096B |
| 150 | kprobe | xfrm_output_kprobe | IPSec - output | 3,704B | 2,141B | 12,288B |
| 151 | kprobe | xfrm_output_kretprobe | IPSec - output return | 1,336B | 894B | 4,096B |

Total program memlock: 208.0 KB (all 16 NetObserv programs combined).

Key observations:

  • The four flow_parse programs (~10 KB xlated each) dominate the code footprint; they are the per-packet data path.
  • FlowRTT loads both a fentry program (141) and a kprobe fallback (142) for the same kernel function, tcp_rcv_established.
  • Total program memlock is only 208 KB; the real memory cost sits in the maps, covered next.

eBPF maps memory usage

The eBPF maps are where the real memory lives. Running bpftool map list shows 18 NetObserv-related maps:

| ID | Type | Name | max_entries | memlock | Purpose |
| --- | --- | --- | --- | --- | --- |
| 9 | array | .bss | 1 | 24,576B | Global state |
| 10 | array | .rodata | 1 | 8,192B | Read-only config |
| 11 | percpu_array | global_counters | 11 | 2,464B | Per-CPU counters |
| 12 | lpm_trie | filter_map | 1 | 0B | Flow filter |
| 13 | lpm_trie | peer_filter_map | 1 | 0B | Peer filter |
| 14 | percpu_hash | aggregated_flow | 120,000 | 2,108,536B | Flow aggregation - PacketDrop |
| 15 | percpu_hash | aggregated_flow | 120,000 | 2,098,816B | Flow aggregation - NetworkEvents |
| 16 | ringbuf | ssl_data_event_ | 4,096 | 16,680B | SSL event buffer |
| 17 | hash | dns_flows | 1,048,576 | 16,779,192B | DNS flow tracking ★ |
| 18 | percpu_array | dns_name_map | 1 | 1,040B | DNS name cache |
| 19 | hash | aggregated_flow | 120,000 | 2,196,928B | Flow aggregation - DNSTracking |
| 20 | percpu_hash | aggregated_flow | 120,000 | 2,105,440B | Flow aggregation - main |
| 21 | ringbuf | direct_flows | 4,096 | 16,680B | Direct flow export |
| 22 | ringbuf | packet_record | 4,096 | 16,680B | Packet capture buffer |
| 23 | percpu_hash | additional_flow | 120,000 | 2,218,696B | Additional flow data - FlowRTT |
| 24 | percpu_hash | aggregated_flow | 120,000 | 2,144,200B | Flow aggregation - PacketTranslation |
| 25 | hash | ipsec_ingress_m | 1,048,576 | 16,778,880B | IPSec ingress map ★ |
| 26 | hash | ipsec_egress_ma | 1,048,576 | 16,778,880B | IPSec egress map ★ |

The ★ marks indicate large pre-allocated maps (~16 MB each). The top 3 memory consumers are dns_flows (~16.8 MB), ipsec_ingress_m (~16.8 MB), and ipsec_egress_ma (~16.8 MB).

Total NetObserv map memlock: ~63.3 MB per node.

Important: The IPSec maps (32 MB combined) are pre-allocated even if IPSec is not actively used. Similarly, dns_flows (16 MB) is pre-allocated with max_entries=1048576 regardless of actual DNS traffic volume. These are the primary memory optimization targets if memory is constrained.
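If that memory matters on your nodes, trimming the feature list frees the corresponding maps on the next agent rollout. A sketch that keeps everything except IPSec and DNSTracking:

        # drop IPSec and DNSTracking to reclaim roughly 48 MB of eBPF map memory per node
        oc patch flowcollector cluster --type=json \
          -p '[{"op": "replace", "path": "/spec/agent/ebpf/features", "value": ["PacketDrop", "FlowRTT", "NetworkEvents", "PacketTranslation", "UDNMapping"]}]'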

Network attachment: TCX hooks on every interface

Running bpftool net list shows that tcx_ingress_flow_parse (prog_id 145) and tcx_egress_flow_parse (prog_id 143) are attached to every network interface on the node via TCX (TC eXpress) hooks. Below is a partial listing:

        tc:
        enp1s0(2)          tcx/ingress tcx_ingress_flow_parse  prog_id 145
        enp1s0(2)          tcx/egress  tcx_egress_flow_parse   prog_id 143
        enp2s0(3)          tcx/ingress tcx_ingress_flow_parse  prog_id 145
        enp2s0(3)          tcx/egress  tcx_egress_flow_parse   prog_id 143
        ovs-system(4)      tcx/ingress tcx_ingress_flow_parse  prog_id 145
        ovs-system(4)      tcx/egress  tcx_egress_flow_parse   prog_id 143
        ovn-k8s-mp0(6)     tcx/ingress tcx_ingress_flow_parse  prog_id 145
        ovn-k8s-mp0(6)     tcx/egress  tcx_egress_flow_parse   prog_id 143
        genev_sys_6081(7)  tcx/ingress tcx_ingress_flow_parse  prog_id 145
        genev_sys_6081(7)  tcx/egress  tcx_egress_flow_parse   prog_id 143
        br-int(8)          tcx/ingress tcx_ingress_flow_parse  prog_id 145
        br-int(8)          tcx/egress  tcx_egress_flow_parse   prog_id 143
        br-ex(9)           tcx/ingress tcx_ingress_flow_parse  prog_id 145
        br-ex(9)           tcx/egress  tcx_egress_flow_parse   prog_id 143
        ... (plus all pod veth interfaces)

On master-01-demo, these programs are attached to 37 interfaces (including physical NICs, OVS bridges, GENEVE tunnels, and pod veth ports), creating 74 TCX hook points (37 × 2 for ingress + egress). All hook points share the same 2 program IDs — the kernel reuses the JIT-compiled code across all interfaces.
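The interface count is easy to reproduce with a grep over bpftool's output (same debug image as above):

        oc debug node/master-01-demo --image=$IMG -- bash -c \
          'bpftool net list | grep -c tcx_ingress_flow_parse'

        # 37   (one ingress hook per interface; the egress count is the same)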

Cross-node comparison

Running the same bpftool analysis on all 3 nodes confirms that the NetObserv eBPF program set is identical across the cluster:

| Node | Total eBPF progs | NetObserv progs | Prog memlock | Map memlock (est.) |
| --- | --- | --- | --- | --- |
| master-01-demo | 145 | 16 | 208.0 KB | ~63.3 MB |
| master-02-demo | 203 | 16 | 208.0 KB | ~63.3 MB |
| master-03-demo | 209 | 16 | 208.0 KB | ~63.3 MB |

Observations:

  • The NetObserv program set, sizes, and memlock are identical on all three nodes, as expected for a DaemonSet loading the same objects everywhere.
  • Only the total program count differs, reflecting other eBPF users on each node.

eBPF program architecture: how features map to kernel hooks

Each FlowCollector feature maps to specific eBPF programs and kernel hooks:

| Feature | eBPF Program | Hook Type | Kernel Function / Path |
| --- | --- | --- | --- |
| FlowRTT | tcp_rcv_fentry | tracing/fentry | tcp_rcv_established |
| FlowRTT | tcp_rcv_kprobe | kprobe (fallback) | tcp_rcv_established |
| DNSTracking | (inline in flow_parse) | sched_cls/TCX | DNS parsed in TC path |
| PacketDrop | kfree_skb | tracepoint | kfree_skb (packet free = drop) |
| PacketTranslation | track_nat_manip_pkt | kprobe | nf_nat_manip_pkt |
| NetworkEvents | network_events_monitoring | kprobe | OVN network event hooks |
| IPSec | xfrm_input_kprobe | kprobe | xfrm_input |
| IPSec | xfrm_input_kretprobe | kprobe (return) | xfrm_input |
| IPSec | xfrm_output_kprobe | kprobe | xfrm_output_resume |
| IPSec | xfrm_output_kretprobe | kprobe (return) | xfrm_output_resume |
| UDNMapping | (inline in flow_parse) | sched_cls/TCX | Handled in flow_parse path |

The TC/TCX flow path is the core data path, attached to every interface:

The flow_parse programs (~10 KB xlated each) are the “main” eBPF programs that process every packet on every interface. They: (1) extract the 5-tuple (src/dst IP, port, protocol), (2) look up/update flow aggregation maps, (3) enrich with DNS, RTT, NAT translation data from other maps, and (4) export via ring buffers when a flow expires.
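To watch the aggregation in action, you can dump a few entries from the main flow map. A sketch; map IDs come from the bpftool map list output above and will differ per node and per agent restart:

        # dump the first entries of the main aggregated_flow map (id 20 on master-01 in this run)
        oc debug node/master-01-demo --image=$IMG -- bash -c \
          'bpftool map dump id 20 | head -20'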

Memory footprint summary

| Category | Memory |
| --- | --- |
| eBPF program code (16 progs) | 208 KB |
| Flow aggregation maps (6 maps) | ~12.9 MB |
| DNS flow map (1M entries) | ~16.0 MB |
| IPSec maps (2 × 1M entries) | ~32.0 MB |
| Ring buffers (3 buffers) | ~50 KB |
| Misc (counters, config, filter) | ~35 KB |
| Total per node | ~63.5 MB |
| Total across 3-node cluster | ~190.5 MB |

The IPSec maps (32 MB) and DNS flow map (16 MB) are pre-allocated at maximum capacity regardless of actual traffic. If IPSec and DNS tracking are not needed, disabling these features would save ~48 MB per node. The flow aggregation maps use cacheMaxFlows=120000 entries, which is configurable in the FlowCollector CR.
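Resizing the flow cache is a one-line FlowCollector patch (a sketch; the agent pods restart and re-create the maps at the new size):

        oc patch flowcollector cluster --type=json \
          -p '[{"op": "replace", "path": "/spec/agent/ebpf/cacheMaxFlows", "value": 60000}]'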

end