
Deep Dive: OVN Egress Service with MetalLB-Based BGP in OpenShift 4.19

Overview

In OpenShift 4.19, the Egress Service provides a powerful networking abstraction that combines the dynamic allocation of LoadBalancer IPs with the high-performance traffic interception of OVN-Kubernetes. By integrating MetalLB with the cluster’s shared FRR-K8s backbone, administrators can advertise Egress IPs to upstream BGP peers using standard Kubernetes CRDs. This document explores the architectural implementation, environment setup, and a deep-dive verification of the “tromboning” traffic flow (traffic from Pods on non-egress nodes is first steered through the elected egress node) and the SNAT mechanisms.


1. Laboratory Environment and Validation Topology

To demonstrate the feasibility of this solution, a verification environment was constructed using native Linux virtual machines (KVM/libvirt) on a CentOS 9 bare-metal host. The host bridge br-ocp uses the subnet 192.168.99.0/24 (Host IP: 192.168.99.1).

1.1 Environment Specifications

  1. OpenShift 4.19 Cluster:
    • 3-Node Compact Cluster (Master + Worker roles).
    • Network interfaces attached to br-ocp.
    • CNI: OVN-Kubernetes (Default).
    • Additional Add-on: MetalLB Operator.
    • Node IPs: 192.168.99.23–192.168.99.25.
    • Egress IP: 192.168.66.100 (a single address from the 192.168.66.0/24 range, advertised as a /32).
  2. KVM Router Node (CentOS 9):
    • Simulates a Datacenter Core Switch running FRRouting (FRR).
    • eth0 (Management/Internal): 192.168.99.12.
    • eth1 (External simulator): 192.168.55.12.
  3. KVM Server Node (CentOS 9):
    • Acts as an external application server to verify SNAT.
    • eth0: 192.168.55.13, with gateway pointing to 192.168.55.12.
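For reference, the host bridge described above can be created on the CentOS 9 bare-metal host with something like the following; the connection name and exact nmcli options are assumptions for this lab, not a prescribed recipe:

```shell
# Create the br-ocp Linux bridge and assign the host its gateway address
# (connection name "br-ocp" is our choice; adjust to your environment)
sudo nmcli connection add type bridge ifname br-ocp con-name br-ocp \
    ipv4.method manual ipv4.addresses 192.168.99.1/24
sudo nmcli connection up br-ocp

# VM interfaces are then attached to the bridge at install time, e.g.:
#   virt-install ... --network bridge=br-ocp,model=virtio ...
```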

2. Infrastructure Configuration

2.1 Router (FRR) Node Setup

The Router node acts as the BGP neighbor for the cluster. It must enable kernel forwarding and establish iBGP sessions with all OCP nodes.


        # 1. Install base routing components
        
        sudo dnf install -y frr
        
        # Explicitly enable bgpd in the daemons file before starting
        
        sudo sed -i 's/^bgpd=no/bgpd=yes/' /etc/frr/daemons
        
        sudo systemctl enable --now frr
        
        # 2. Enable IPv4 kernel forwarding
        
        echo "net.ipv4.ip_forward = 1" | sudo tee -a /etc/sysctl.d/99-ipforward.conf
        sudo sysctl -p /etc/sysctl.d/99-ipforward.conf
        
        # 3. Configure the core FRR configuration file
        
        cat <<EOF | sudo tee /etc/frr/frr.conf
        frr defaults traditional
        log syslog informational
        no ipv6 forwarding
        !
        router bgp 64512
         bgp router-id 192.168.99.12
         
         ! Configure iBGP neighbors for each OCP cluster node
         neighbor 192.168.99.23 remote-as 64512
         neighbor 192.168.99.24 remote-as 64512
         neighbor 192.168.99.25 remote-as 64512
        
         ! Configure iBGP ECMP and network advertisement within the address-family
         address-family ipv4 unicast
          network 192.168.55.0/24
          maximum-paths ibgp 4
         exit-address-family
        !
        line vty
        !
        EOF
        
        # 4. Reload FRR to apply changes
        
        sudo systemctl restart frr
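Once the cluster side comes up (section 4), the iBGP sessions can be confirmed from the Router node with standard FRR show commands:

```shell
# All three OCP neighbors should reach the Established state
sudo vtysh -c 'show bgp summary'

# Routes learned from the cluster (the Egress IP appears as a /32 after section 4.3)
sudo vtysh -c 'show ip bgp'
```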

2.2 Server Node Configuration (Traffic Destination)

The Server node requires a return route to the Egress IP via the Router. We use Python to spin up listeners for connectivity verification.


        # Configure route to the Egress IP pool via the Router
        
        sudo ip route add 192.168.66.100 via 192.168.55.12
        
        # Verify routing table
        
        ip r
        # default via 192.168.99.1 dev enp1s0 proto static metric 100
        # 192.168.55.0/24 dev enp1s0 proto kernel scope link src 192.168.55.13 metric 100
        # 192.168.66.100 via 192.168.55.12 dev enp1s0
        # 192.168.99.0/24 dev enp1s0 proto kernel scope link src 192.168.99.13 metric 100
        
        # Start background listeners to verify connectivity
        
        # (Listening on ports 80 and 8080 to prove Egress IP transparency)
        
        nohup python3 -m http.server 80 &
        nohup python3 -m http.server 8080 &
        
        # Monitor incoming traffic to observe the source (Egress IP 192.168.66.100)
        
        sudo tcpdump -i any 'tcp port 80 or tcp port 8080' -n

3. OpenShift Cluster Preparation

3.1 Network Operator and Node Configuration

In OpenShift 4.19, the BGP capability is provided by the FRR-K8s daemon. We must enable this provider in the Network.operator API.


        # Remove the auxiliary 192.168.66.0/24 addresses from the host interfaces
        # (the leading "-" in -ipv4.addresses deletes the address; they were only used for routing/testing)
        
        oc debug node/master-01-demo -- chroot /host /bin/bash -c "nmcli connection modify enp1s0 -ipv4.addresses 192.168.66.23/24"
        oc debug node/master-02-demo -- chroot /host /bin/bash -c "nmcli connection modify enp1s0 -ipv4.addresses 192.168.66.24/24"
        oc debug node/master-03-demo -- chroot /host /bin/bash -c "nmcli connection modify enp1s0 -ipv4.addresses 192.168.66.25/24"
        
        # Enable FRR provider and Route Advertisements in OVN-Kubernetes
        
        oc patch Network.operator.openshift.io cluster --type=merge -p \
        '{
          "spec": {
            "additionalRoutingCapabilities": {
              "providers": ["FRR"]
            },
            "defaultNetwork": {
              "ovnKubernetesConfig": {
                "routeAdvertisements": "Enabled"
              }
            }
          }
        }'
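The patch can be verified by reading the same fields back; the jsonpath expressions below mirror the structure of the patch above:

```shell
# Expect ["FRR"]
oc get Network.operator.openshift.io cluster \
    -o jsonpath='{.spec.additionalRoutingCapabilities.providers}'

# Expect Enabled
oc get Network.operator.openshift.io cluster \
    -o jsonpath='{.spec.defaultNetwork.ovnKubernetesConfig.routeAdvertisements}'
```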

4. BGP and Egress Service Implementation

4.1 Establishing BGP Connectivity (MetalLB CRDs)

Instead of hand-writing FRRConfiguration objects, we leverage the MetalLB Operator’s BGP CRDs. First, instantiate MetalLB itself:


        apiVersion: metallb.io/v1beta1
        kind: MetalLB
        metadata:
          name: metallb
          namespace: metallb-system

Then check the pods supporting BGP:


        # Check the status of the FRR pods
        
        oc get pods -n openshift-frr-k8s
        # NAME                                      READY   STATUS    RESTARTS   AGE
        # frr-k8s-952fv                             7/7     Running   0          63m
        # frr-k8s-webhook-server-79fdc47779-w8lv9   1/1     Running   0          63m
        # frr-k8s-z4xkk                             7/7     Running   0          63m
        # frr-k8s-zkwzh                             7/7     Running   0          63m
        
        # Check the status of the MetalLB pods
        
        oc get pods -n metallb-system -l app=metallb
        # NAME                                               READY   STATUS    RESTARTS   AGE
        # controller-6df44df4f5-wgpxs                        2/2     Running   0          63m
        # metallb-operator-webhook-server-7866f494cc-68qs9   1/1     Running   0          63m
        # speaker-7cxcm                                      2/2     Running   0          63m
        # speaker-8sxp7                                      2/2     Running   0          62m
        # speaker-fdhmj                                      2/2     Running   0          63m

4.2 MetalLB Operator Initialization

We define the BGPPeer to establish the session with the external router.


        # MetalLB Peer Configuration
        
        apiVersion: metallb.io/v1beta2
        kind: BGPPeer
        metadata:
          name: peer-sample1
          namespace: metallb-system
        spec:
          peerAddress: 192.168.99.12
          peerASN: 64512
          myASN: 64512
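The session can also be checked from the cluster side by running vtysh inside one of the frr-k8s pods. The `app=frr-k8s` label and the `frr` container name below are assumptions about the deployment layout; adjust them to match your cluster:

```shell
# Pick one frr-k8s daemon pod and query its BGP state
# (label and container name are assumptions; verify with `oc describe`)
FRR_POD=$(oc get pods -n openshift-frr-k8s -l app=frr-k8s \
    -o jsonpath='{.items[0].metadata.name}')
oc exec -n openshift-frr-k8s "$FRR_POD" -c frr -- vtysh -c 'show bgp summary'
```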

4.3 Egress IP Pool and Advertisement

We establish the IP address pool and the specific BGPAdvertisement rule, targeting the nodes managed by the EgressService.


        # Define the Egress IP range for the cluster
        
        apiVersion: metallb.io/v1beta1
        kind: IPAddressPool
        metadata:
          name: egress-pool
          namespace: metallb-system
        spec:
          addresses:
          - 192.168.66.100-192.168.66.100 # Static Egress IP
        ---
        
        # Advertisement logic
        
        apiVersion: metallb.io/v1beta1
        kind: BGPAdvertisement
        metadata:
          name: egress-identity-bgp-adv
          namespace: metallb-system
        spec:
          ipAddressPools: 
          - egress-pool
          # Only nodes labeled by OVN-K as Egress Gateways will advertise this IP
          nodeSelectors:
          - matchLabels:
              egress-service.k8s.ovn.org/demo-egress-egress-identity: ""

4.4 Deploying the Egress Service

The EgressService CRD instructs OVN-K to use the LoadBalancer IP for SNAT.

        oc new-project demo-egress
        
        # Apply the Service and EgressService bundle
        
        cat <<EOF | oc apply -f -
        apiVersion: v1
        kind: Service
        metadata:
          name: egress-identity
          namespace: demo-egress
          annotations:
            metallb.universe.tf/address-pool: egress-pool
        spec:
          type: LoadBalancer
          selector:
            app: requester
          ports:
            - name: http
              protocol: TCP
              port: 8080
              targetPort: 8080
        ---
        apiVersion: k8s.ovn.org/v1
        kind: EgressService
        metadata:
          name: egress-identity 
          namespace: demo-egress
        spec:
          sourceIPBy: "LoadBalancerIP"
          nodeSelector:
            matchLabels:
              node-role.kubernetes.io/master: ""
        EOF
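After the bundle is applied, OVN-K elects one node as the egress host, records it in the EgressService status, and labels that node, which is exactly what the BGPAdvertisement nodeSelector in 4.3 keys on. A quick sanity check, assuming the `status.host` field of the EgressService CRD:

```shell
# Which node currently hosts the egress SNAT for this service
# (status.host is assumed per the EgressService CRD; verify on your cluster)
oc get egressservice egress-identity -n demo-egress -o jsonpath='{.status.host}'

# The same node should carry the per-service label used by the BGPAdvertisement
oc get nodes -l 'egress-service.k8s.ovn.org/demo-egress-egress-identity'
```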

5. Verification and Result Analysis

5.1 Dynamic FRRConfiguration Sync

When MetalLB CRDs are applied in the metallb-system namespace (with the correct namespace labels or cluster configuration), the MetalLB Operator automatically synthesizes FRRConfiguration objects in the openshift-frr-k8s namespace.

        oc get FRRConfiguration -A
        # NAMESPACE           NAME                     AGE
        # openshift-frr-k8s   metallb-master-01-demo   15m
        # openshift-frr-k8s   metallb-master-02-demo   15m
        # openshift-frr-k8s   metallb-master-03-demo   15m
        
        # Inspecting the dynamically generated configurations
        
        oc get FRRConfiguration -n openshift-frr-k8s -o yaml
        # apiVersion: v1
        # items:
        # - apiVersion: frrk8s.metallb.io/v1beta1
        #   kind: FRRConfiguration
        #   metadata:
        #     creationTimestamp: "2026-03-05T02:22:03Z"
        #     generation: 2
        #     name: metallb-master-01-demo
        #     namespace: openshift-frr-k8s
        #     resourceVersion: "366323"
        #     uid: 1b71c010-5e53-4973-8c4e-49896cd8181a
        #   spec:
        #     bgp:
        #       routers:
        #       - asn: 64512
        #         neighbors:
        #         - address: 192.168.99.12
        #           asn: 64512
        #           disableMP: false
        #           dualStackAddressFamily: false
        #           passwordSecret: {}
        #           port: 179
        #           toAdvertise:
        #             allowed:
        #               mode: filtered
        #               prefixes:
        #               - 192.168.66.100/32
        #           toReceive:
        #             allowed:
        #               mode: filtered
        #         prefixes:
        #         - 192.168.66.100/32
        #     nodeSelector:
        #       matchLabels:
        #         kubernetes.io/hostname: master-01-demo
        #     raw: {}
        # - apiVersion: frrk8s.metallb.io/v1beta1
        #   kind: FRRConfiguration
        #   metadata:
        #     creationTimestamp: "2026-03-05T02:22:03Z"
        #     generation: 1
        #     name: metallb-master-02-demo
        #     namespace: openshift-frr-k8s
        #     resourceVersion: "365356"
        #     uid: 99415279-10a5-44fe-aef8-78a342ef589c
        #   spec:
        #     bgp:
        #       routers:
        #       - asn: 64512
        #         neighbors:
        #         - address: 192.168.99.12
        #           asn: 64512
        #           disableMP: false
        #           dualStackAddressFamily: false
        #           passwordSecret: {}
        #           port: 179
        #           toAdvertise:
        #             allowed:
        #               mode: filtered
        #           toReceive:
        #             allowed:
        #               mode: filtered
        #     nodeSelector:
        #       matchLabels:
        #         kubernetes.io/hostname: master-02-demo
        #     raw: {}
        # - apiVersion: frrk8s.metallb.io/v1beta1
        #   kind: FRRConfiguration
        #   metadata:
        #     creationTimestamp: "2026-03-05T02:22:03Z"
        #     generation: 1
        #     name: metallb-master-03-demo
        #     namespace: openshift-frr-k8s
        #     resourceVersion: "365355"
        #     uid: 38743b7a-903c-41c7-8027-7e905d0f49d4
        #   spec:
        #     bgp:
        #       routers:
        #       - asn: 64512
        #         neighbors:
        #         - address: 192.168.99.12
        #           asn: 64512
        #           disableMP: false
        #           dualStackAddressFamily: false
        #           passwordSecret: {}
        #           port: 179
        #           toAdvertise:
        #             allowed:
        #               mode: filtered
        #           toReceive:
        #             allowed:
        #               mode: filtered
        #     nodeSelector:
        #       matchLabels:
        #         kubernetes.io/hostname: master-03-demo
        #     raw: {}
        # kind: List
        # metadata:
        #   resourceVersion: ""

5.2 Connectivity and BGP Route Check

Verify that the external router has learned the /32 route from the correct node.


        # Verify route on the external FRR router
        
        vtysh -c 'show ip route bgp'
        
        # B>* 192.168.66.100/32 [200/0] via 192.168.99.23, enp1s0, weight 1, 00:18:37
        
        # Execute curl from Pods on different nodes
        
        oc exec -it $NODE3_POD -n demo-egress -- curl -I http://192.168.55.13 --connect-timeout 2

6. Under the Hood: Deep Dive into Interception

The EgressService in OCP 4.19 relies on a sophisticated interplay between OVS registers and native Linux nftables.

6.1 OVN Address Sets

OVN-K creates an AddressSet containing the IPs of all Pods in the namespace, so it can track which sources should be SNATed.
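The set can be inspected directly in the OVN northbound database. The `app=ovnkube-node` label and the `nbdb` container name below are assumptions based on common OVN-K layouts; treat this as a sketch, not an exact recipe:

```shell
# Query the NB DB on the egress node's ovnkube pod and look for egress-related sets
# (pod label and container name are assumptions; verify on your cluster)
OVN_POD=$(oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-node \
    --field-selector spec.nodeName=master-01-demo \
    -o jsonpath='{.items[0].metadata.name}')
oc exec -n openshift-ovn-kubernetes "$OVN_POD" -c nbdb -- \
    ovn-nbctl list address_set | grep -i -B 2 -A 2 egress
```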

6.2 OVS OpenFlow Redirection (Stage 1)

On the originating node (master-03), OVS modifies registers to redirect the packet into the Geneve tunnel towards the Egress node (master-01).


        # Check OVS flows for the source Pod IP
        
        oc exec -n openshift-ovn-kubernetes $OVN_NODE_03_POD -c ovn-controller -- ovs-ofctl dump-flows br-int | grep "10.133.0.3"
        
        # table=25, priority=101, ip, nw_src=10.133.0.3 actions=load:0x64580004->NXM_NX_XXREG0[96..127]...

6.3 NFTables Dynamic Maps (Stage 2)

On the egress node (master-01), the final SNAT transformation occurs using nftables maps for maximum throughput.
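The same map-based SNAT pattern can be reproduced in isolation to see why it scales: a single rule performs one map lookup regardless of how many Pods are mapped. This is a standalone sketch with table/chain names of our own choosing, not the rules OVN-K actually installs:

```shell
# Minimal reproduction of map-keyed SNAT (requires root; all names are hypothetical)
sudo nft add table ip demo
sudo nft add map ip demo snat-v4 '{ type ipv4_addr : ipv4_addr ; }'
sudo nft add element ip demo snat-v4 '{ 10.133.0.4 : 192.168.66.100 }'
sudo nft add chain ip demo postrouting \
    '{ type nat hook postrouting priority srcnat ; }'
# One rule covers every mapped source: SNAT to whatever address the map yields
sudo nft add rule ip demo postrouting snat to ip saddr map @snat-v4
```

Adding or removing a Pod is then a single `nft add element` / `delete element` operation, with no rule churn.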


        # Extract the dynamic SNAT map from the host
        
        oc debug node/master-01-demo -- chroot /host /bin/bash -c "nft list ruleset | grep 'egress-service-snat-v4' -A 5 -B 5"
                # map egress-service-snat-v4 {
                #         type ipv4_addr : ipv4_addr
                #         elements = { 10.132.0.39 comment "demo-egress/egress-identity" : 192.168.66.100, 10.133.0.4 comment "demo-egress/egress-identity" : 192.168.66.100 }
                # }
        
        oc debug node/master-02-demo -- chroot /host /bin/bash -c "nft list ruleset | grep 'egress-service-snat-v4' -A 5 -B 5"
                # map egress-service-snat-v4 {
                #         type ipv4_addr : ipv4_addr
                # }
        
        oc debug node/master-03-demo -- chroot /host /bin/bash -c "nft list ruleset | grep 'egress-service-snat-v4' -A 5 -B 5"
                # map egress-service-snat-v4 {
                #         type ipv4_addr : ipv4_addr
                # }
        
        
        
        # Extract the DNAT rules
        
        oc debug node/master-01-demo -- chroot /host /bin/bash -c "iptables -L -v -n -t nat | grep 'Chain OVN-KUBE-EXTERNALIP' -A 4 -B 0"
        # Chain OVN-KUBE-EXTERNALIP (2 references)
        #  pkts bytes target     prot opt in     out     source               destination
        #     0     0 DNAT       tcp  --  *      *       0.0.0.0/0            192.168.66.100       tcp dpt:8080 to:172.22.88.148:8080
        
        oc debug node/master-02-demo -- chroot /host /bin/bash -c "iptables -L -v -n -t nat | grep 'Chain OVN-KUBE-EXTERNALIP' -A 4 -B 0"
        # Chain OVN-KUBE-EXTERNALIP (2 references)
        #  pkts bytes target     prot opt in     out     source               destination
        #     0     0 DNAT       tcp  --  *      *       0.0.0.0/0            192.168.66.100       tcp dpt:8080 to:172.22.88.148:8080
        
        oc debug node/master-03-demo -- chroot /host /bin/bash -c "iptables -L -v -n -t nat | grep 'Chain OVN-KUBE-EXTERNALIP' -A 4 -B 0"
        # Chain OVN-KUBE-EXTERNALIP (2 references)
        #  pkts bytes target     prot opt in     out     source               destination
        #     0     0 DNAT       tcp  --  *      *       0.0.0.0/0            192.168.66.100       tcp dpt:8080 to:172.22.88.148:8080
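The SNAT map contents can also be extracted programmatically. The awk helper below parses `nft` output shaped like the capture above into pod-IP → egress-IP pairs; it is shown here against an inline sample rather than a live node:

```shell
# Parse 'elements = { <pod-ip> comment "..." : <egress-ip>, ... }' into pairs
parse_snat_map() {
    awk '/elements =/ {
        # strip everything up to the opening brace, and the closing brace
        sub(/.*\{ */, ""); sub(/ *\}.*/, "")
        n = split($0, pairs, /, */)
        for (i = 1; i <= n; i++) {
            split(pairs[i], kv, / *: */)
            split(kv[1], lhs, / /)     # lhs[1] = pod IP (comment discarded)
            print lhs[1], kv[2]
        }
    }'
}

# Example run against the map captured from master-01-demo
parse_snat_map <<'EOF'
map egress-service-snat-v4 {
        type ipv4_addr : ipv4_addr
        elements = { 10.132.0.39 comment "demo-egress/egress-identity" : 192.168.66.100, 10.133.0.4 comment "demo-egress/egress-identity" : 192.168.66.100 }
}
EOF
# → 10.132.0.39 192.168.66.100
# → 10.133.0.4 192.168.66.100
```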

Conclusion

The evolution from manual FRRConfiguration objects to integrated MetalLB CRDs in OpenShift 4.19 represents a significant improvement in usability and automation. By leveraging the shared FRR-K8s instance (enabled via namespace labeling), administrators can use standard Kubernetes networking abstractions to manage complex BGP routing policies, ensuring high-performance Egress traffic management with minimal manual overhead.