
# Understanding Pod Restarts Triggered by Secret Rotation in OpenShift

## 1. Introduction

In a dynamic container orchestration platform like OpenShift, automated processes such as certificate rotation are crucial for maintaining security and operational health. However, these automated events can sometimes lead to seemingly unexpected side effects, such as the automatic restart of certain pods.

This document provides a technical deep-dive into why pods associated with specific services, such as Prometheus, restart automatically when a related TLS secret is rotated. We will analyze the underlying mechanism, explain why this behavior is intentional (“by-design”), and provide robust recommendations for managing this behavior in a production environment to ensure application stability.

## 2. Problem Statement

A customer observed that after the metrics-client-certs secret in the openshift-monitoring namespace was rotated, several pods, including those for Prometheus, initiated a restart. This behavior raised concerns about potential service disruptions, especially if critical application workloads were to be affected by platform-level component restarts.

The primary goal of this analysis is to uncover the root cause of these restarts and provide clear, actionable guidance.

## 3. Root Cause Analysis

The investigation begins by examining the metrics-client-certs secret and its relationships within the cluster.

### 3.1. Identifying the Consuming Resource

First, let’s inspect the secret itself. It contains a standard TLS key pair.

```bash
oc get secret metrics-client-certs -n openshift-monitoring -o yaml
# apiVersion: v1
# data:
#   tls.crt: LS0tLS1.....
#   tls.key: LS0tLS1C....
# kind: Secret
# metadata:
#   annotations:
#     openshift.io/owning-component: Monitoring
#   creationTimestamp: "2025-09-28T13:59:54Z"
#   name: metrics-client-certs
#   namespace: openshift-monitoring
#   resourceVersion: "362238"
#   uid: 5b5e6060-634d-4724-bf1d-204e0d7f9d5c
# type: Opaque
```
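When investigating a rotation, it often helps to confirm the certificate's validity window. The sketch below generates a throwaway self-signed certificate to stand in for `tls.crt`; on a real cluster you would instead feed `openssl` the decoded secret data, for example via `oc get secret metrics-client-certs -n openshift-monitoring -o jsonpath='{.data.tls\.crt}' | base64 -d`.

```shell
#!/bin/bash
# Generate a throwaway self-signed certificate to stand in for the secret's
# tls.crt, then print its validity window the same way you would for the
# real, rotated certificate.
openssl req -x509 -newkey rsa:2048 -keyout /tmp/tls.key -out /tmp/tls.crt \
  -days 30 -nodes -subj "/CN=metrics-client" 2>/dev/null
openssl x509 -noout -startdate -enddate -in /tmp/tls.crt
```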

Next, we need to identify which component consumes this secret. By inspecting the Prometheuses Custom Resource (CR) named k8s, we can see that metrics-client-certs is one of the secrets it references.

```bash
oc get Prometheuses
# NAME   VERSION   DESIRED   READY   RECONCILED   AVAILABLE   AGE
# k8s    2.55.1    2         2       True         True        11d

oc get Prometheuses/k8s -o json | jq .spec.secrets
# [
#   "prometheus-k8s-tls",
#   "prometheus-k8s-thanos-sidecar-tls",
#   "kube-rbac-proxy",
#   "prometheus-k8s-kube-rbac-proxy-web",
#   "metrics-client-certs"
# ]
```
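A quick way to test whether a particular secret is referenced by the CR is to check membership in `.spec.secrets`. The sketch below runs the filter against an inline, abridged copy of the CR so it is self-contained; on a live cluster the same filter would be fed by `oc get Prometheuses/k8s -n openshift-monitoring -o json`.

```shell
#!/bin/bash
# Check whether a given secret name appears in the CR's .spec.secrets list.
# The JSON here is an abridged stand-in for the real Prometheuses/k8s CR.
cat > /tmp/prom-k8s.json <<'EOF'
{"spec":{"secrets":["prometheus-k8s-tls","metrics-client-certs"]}}
EOF

jq -e '.spec.secrets | index("metrics-client-certs") != null' /tmp/prom-k8s.json \
  && echo "metrics-client-certs is referenced by the Prometheus CR"
```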

### 3.2. The Operator Reconciliation Loop and Hash-Based Updates

The Prometheuses CR is managed by the Prometheus Operator. This operator follows a standard control loop pattern: it continuously watches its managed resources (like the Prometheuses CR and the secrets it references) and works to bring the cluster state in line with the desired configuration.

When a referenced secret like metrics-client-certs is updated, the Prometheus Operator detects this change. To ensure the running Prometheus instances load the new certificate, the operator performs the following steps:

  1. Calculate a Hash: The operator calculates a new hash based on the collective inputs that define the Prometheus StatefulSet, including the configuration and the content of all referenced secrets.
  2. Update Annotations: This new hash is then injected as an annotation into the pod template of the StatefulSet that manages the Prometheus pods.
  3. Trigger a Rolling Restart: Kubernetes detects that the StatefulSet’s template has changed (due to the new annotation value). This change automatically triggers a rolling update of the StatefulSet’s pods, allowing them to be recreated with the new configuration and mount the updated secret.
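The hashing step above can be illustrated with plain shell tools. This is only a conceptual sketch using `sha256sum`; the operator uses its own internal hashing scheme, but the principle is the same: any change to a referenced input produces a different hash.

```shell
#!/bin/bash
# Conceptual sketch: hash the concatenation of all "inputs". Rotating a
# certificate changes one input, so the combined hash changes; the changed
# pod-template annotation then triggers a rolling update.
hash_inputs() {
  printf '%s' "$*" | sha256sum | cut -d' ' -f1
}

before=$(hash_inputs "prometheus-config-v1" "cert-serial-1001")
after=$(hash_inputs "prometheus-config-v1" "cert-serial-1002")  # rotated cert

if [ "$before" != "$after" ]; then
  echo "input hash changed: a rolling update would be triggered"
fi
```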

This mechanism is deeply embedded in the operator’s source code, as seen in the following snippet from the Prometheus Operator repository.

```go
// WithInputHashAnnotation records the given hash string in the object's
// annotations.
func WithInputHashAnnotation(h string) ObjectOption {
	return func(o metav1.Object) {
		a := o.GetAnnotations()
		if a == nil {
			a = map[string]string{}
		}
		a[InputHashAnnotationName] = h
		o.SetAnnotations(a)
	}
}
```
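The resulting annotation can be read back from the StatefulSet. The filter below runs against an inline, abridged manifest so it is self-contained; on a live cluster the same filter would be fed by `oc get statefulset prometheus-k8s -n openshift-monitoring -o json`.

```shell
#!/bin/bash
# Extract the operator's input-hash annotation from a (saved, abridged)
# StatefulSet manifest. The hash value here is copied from the appendix
# sample output.
cat > /tmp/sts.json <<'EOF'
{"kind":"StatefulSet","metadata":{"name":"prometheus-k8s",
 "annotations":{"prometheus-operator-input-hash":"1758422184247178709"}}}
EOF

jq -r '.metadata.annotations["prometheus-operator-input-hash"]' /tmp/sts.json
```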


### 3.3. Conclusion: An Intentional Design Pattern

The pod restart is not an error but a direct consequence of a deliberate and robust design pattern. The operator's primary responsibility is to ensure the running configuration matches the declared state. By using a hash of the inputs, it can declaratively trigger updates whenever a dependency changes, ensuring configurations (like new TLS certificates) are applied reliably.

This behavior is not unique to the Prometheus Operator; many operators across the OpenShift and Kubernetes ecosystem use a similar hash-based annotation mechanism to manage workload updates.

## 4. Solutions and Recommendations

While the restart behavior is by design, its impact on application workloads can be mitigated through proper cluster architecture and operational procedures.

### 4.1. Primary Recommendation: Isolate Platform and Application Workloads

The most effective and architecturally sound solution is to prevent OpenShift platform components from impacting application performance by isolating them. OpenShift is designed with this separation in mind.

-   **Use Infrastructure Nodes**: Designate a set of nodes specifically for running cluster infrastructure components (like the monitoring stack, routers, registry, etc.).
-   **Apply Taints and Tolerations**:
    -   **Taint** the infrastructure nodes to repel regular application pods.
    -   Add **tolerations** to the platform components' deployments, allowing them to run on the tainted infrastructure nodes.
-   **Use Node Selectors**: Schedule application pods exclusively on dedicated application nodes.
-   **Use Workload Partitioning**: Pin OpenShift platform components to a reserved set of CPU cores so they do not contend with applications running on the same node.

By implementing this pattern, restarts of platform components (due to certificate rotations, updates, or other maintenance) will occur on the isolated infrastructure nodes and will **not** affect the nodes running your business-critical applications.
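The taint/toleration pairing above can be sketched with standard `oc` commands. The node name `infra-node-1` is a placeholder, and note that the monitoring stack's own placement is normally configured through the `cluster-monitoring-config` ConfigMap rather than by editing its deployments directly.

```shell
# Label a node as infra and taint it so regular application pods are repelled.
# "infra-node-1" is a placeholder node name.
oc label node infra-node-1 node-role.kubernetes.io/infra=""
oc adm taint nodes infra-node-1 node-role.kubernetes.io/infra=reserved:NoSchedule

# Platform components then need a matching toleration (and node selector) so
# they can land on the tainted infra nodes; for the monitoring stack this is
# set per component in the cluster-monitoring-config ConfigMap in the
# openshift-monitoring namespace.
```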

### 4.2. Secondary Recommendation: Schedule Changes During Maintenance Windows

If workload isolation is not yet implemented, a procedural approach is to perform any actions that might trigger restarts (like manual certificate rotation) during planned maintenance windows. This minimizes the perceived impact on end users.

### 4.3. Tertiary Recommendation: Raise an RFE for Service Continuity

If the restart of a specific OpenShift component still poses a significant business challenge, consider raising a support case or a Request for Enhancement (RFE) with Red Hat. This can drive improvements in future versions, potentially introducing features like in-memory certificate reloading without requiring a full pod restart for specific components.

## 5. Appendix: Auditing Scripts for Hash-Based Annotations

If you still wish to identify other workloads in your cluster that might be subject to similar restart behavior, the following scripts can be used as auditing tools. They search for workloads whose annotations contain hash values, which is a strong indicator of being managed by an operator.

**Disclaimer**: These scripts are heuristics. The presence of a hash does not guarantee restart behavior for all configuration changes, but it serves as an excellent starting point for further investigation.

### 5.1. Script 1: Find Workloads with "-input-hash" Annotations

This script looks for annotation keys ending in `-input-hash` (such as `prometheus-operator-input-hash`), which is a precise way to find workloads managed by the Prometheus Operator.

```bash
#!/bin/bash
# In all namespaces of an OpenShift cluster, find all Deployment, StatefulSet,
# DaemonSet, and DeploymentConfig resources whose metadata contains an
# annotation key ending with "-input-hash".

# Define the Kubernetes/OpenShift resource types to check
RESOURCE_TYPES=("deployment" "statefulset" "daemonset" "deploymentconfig")

echo "Searching globally for workloads with '*-input-hash' annotations..."
echo "================================================================================"

# Internal function to process all resources and output in tab-separated format
find_resources() {
  # Print the header first
  printf "NAMESPACE\tKIND\tNAME\n"

  # Loop through each resource type
  for KIND in "${RESOURCE_TYPES[@]}"; do
    # Get all resources of the current type from all namespaces in JSON format,
    # then use jq to parse, filter, and format the output.
    oc get "$KIND" --all-namespaces -o json 2>/dev/null | jq -r '
      .items[] |
      select(
        .metadata.annotations and
        (.metadata.annotations | keys | any(endswith("-input-hash")))
      ) |
      "\(.metadata.namespace)\t\(.kind)\t\(.metadata.name)"
    '
  done
}

# Call the function and pipe the output through column for table formatting
# (-t creates a table, -s $'\t' uses tab as the separator)
find_resources | column -t -s $'\t'

echo "================================================================================"
echo "Search complete."
```

**Sample Output**:

```
NAMESPACE             KIND         NAME
openshift-monitoring  StatefulSet  alertmanager-main
openshift-monitoring  StatefulSet  prometheus-k8s
```

### 5.2. Script 2: Find Workloads with Any Hash-Like Annotation Value

This script uses a regular expression to find any workload with an annotation value that looks like a long hexadecimal hash. This is a broader search and can help identify workloads managed by other operators.

```bash
#!/bin/bash
# Define the resource types to scan
RESOURCE_TYPES=("deployment" "statefulset" "daemonset" "deploymentconfig")

echo "Searching for workloads with annotations containing hash-like values..."
echo "================================================================================"

# Internal function to process resources and output in tab-separated format
find_resources() {
  # Print the header first
  printf "NAMESPACE\tKIND\tNAME\n"

  # Loop through each resource type
  for KIND in "${RESOURCE_TYPES[@]}"; do
    # Get all resources of the current kind from all namespaces in JSON format,
    # then use jq to parse, filter, and format the output.
    oc get "$KIND" --all-namespaces -o json 2>/dev/null | jq -r '
      .items[] |
      select(
        .metadata.annotations and
        # --- This is the key change ---
        # 1. Get all annotation values
        # 2. Check if ANY value matches the regex for a long hex string
        (.metadata.annotations | values | any(test("^[a-f0-9]{16,}$")))
      ) |
      # Format the output for matching resources
      "\(.metadata.namespace)\t\(.kind)\t\(.metadata.name)"
    '
  done
}

# Call the function and pipe the output through column for table formatting
# (-t creates a table, -s $'\t' uses tab as the separator)
find_resources | column -t -s $'\t'
```

**Sample Output**:

```
NAMESPACE                                KIND         NAME
openshift-apiserver                      Deployment   apiserver
openshift-authentication                 Deployment   oauth-openshift
openshift-catalogd                       Deployment   catalogd-controller-manager
openshift-cluster-storage-operator       Deployment   csi-snapshot-controller
openshift-console                        Deployment   console
openshift-console                        Deployment   downloads
openshift-controller-manager             Deployment   controller-manager
openshift-kube-storage-version-migrator  Deployment   migrator
openshift-machine-api                    Deployment   machine-api-controllers
openshift-machine-api                    Deployment   metal3
openshift-machine-api                    Deployment   metal3-baremetal-operator
openshift-machine-api                    Deployment   metal3-image-customization
openshift-oauth-apiserver                Deployment   apiserver
openshift-operator-controller            Deployment   operator-controller-controller-manager
openshift-route-controller-manager       Deployment   route-controller-manager
openshift-service-ca                     Deployment   service-ca
openshift-monitoring                     StatefulSet  alertmanager-main
openshift-monitoring                     StatefulSet  prometheus-k8s
openshift-image-registry                 DaemonSet    node-ca
openshift-machine-api                    DaemonSet    ironic-proxy
```
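The hash-like heuristic from Script 2 can be exercised in isolation. The sketch below reproduces the same pattern the script passes to jq's `test()`, here with `grep -E`:

```shell
#!/bin/bash
# The value heuristic from Script 2: a string of 16 or more lowercase hex
# characters is treated as "hash-like".
looks_like_hash() {
  printf '%s\n' "$1" | grep -Eq '^[a-f0-9]{16,}$'
}

looks_like_hash "5476d90eedb1386291" && echo "hash-like"
looks_like_hash "4.18.24"            || echo "not hash-like"
```

Note that purely numeric values such as the `prometheus-operator-input-hash` in the sample output also match, since decimal digits are a subset of the hex alphabet.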

### 5.3. Script 3: Displaying the Full Annotations

This modified version of the previous script also outputs the full JSON of the annotations for deeper analysis.

```bash
#!/bin/bash
# Define the resource types to scan
RESOURCE_TYPES=("deployment" "statefulset" "daemonset" "deploymentconfig")

echo "Searching for workloads with annotations containing hash-like values..."
echo "====================================================================================================================="

# Internal function to process resources and output in tab-separated format
find_resources() {
  # --- Change 1: Add ANNOTATIONS to the header ---
  printf "NAMESPACE\tKIND\tNAME\tANNOTATIONS\n"

  # Loop through each resource type
  for KIND in "${RESOURCE_TYPES[@]}"; do
    # Get all resources of the current kind from all namespaces in JSON format,
    # then use jq to parse, filter, and format the output.
    oc get "$KIND" --all-namespaces -o json 2>/dev/null | jq -r '
      .items[] |
      select(
        .metadata.annotations and
        (.metadata.annotations | values | any(test("^[a-f0-9]{16,}$")))
      ) |
      # --- Change 2: Add the annotations object (as a single JSON line) to the output ---
      "\(.metadata.namespace)\t\(.kind)\t\(.metadata.name)\t\(.metadata.annotations | @json)"
    '
  done
}

# Call the function and pipe the output through column for table formatting
# (-t creates a table, -s $'\t' uses tab as the separator)
find_resources | column -t -s $'\t'
```

**Sample Output**:


        Searching for workloads with annotations containing hash-like values...
        =====================================================================================================================
        NAMESPACE                                KIND         NAME                                    ANNOTATIONS
        openshift-apiserver                      Deployment   apiserver                               {"deployment.kubernetes.io/revision":"6","openshiftapiservers.operator.openshift.io/operator-pull-spec":"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c3b95206d3d2e0fd6597e1942a2413a4f22f4fc330d8f364b3984357fb9dac53","openshiftapiservers.operator.openshift.io/pull-spec":"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:eda9397d28c8ad22b2eb42c4e68d9e2b345c06e7fb3823ab9aabbef61da2aa01","operator.openshift.io/dep-desired.generation":"6","operator.openshift.io/dep-openshift-apiserver.config.configmap":"yUygSw==","operator.openshift.io/dep-openshift-apiserver.etcd-client.secret":"t3EW1g==","operator.openshift.io/dep-openshift-apiserver.etcd-serving-ca.configmap":"oCFg0A==","operator.openshift.io/dep-openshift-apiserver.image-import-ca.configmap":"OUXnWA==","operator.openshift.io/dep-openshift-apiserver.trusted-ca-bundle.configmap":"ElMHxA==","operator.openshift.io/spec-hash":"5476d90eedb138629193d3e6833708781ebac8c1b55901da00d3986db63556da"}
        openshift-authentication                 Deployment   oauth-openshift                         {"deployment.kubernetes.io/revision":"3","operator.openshift.io/rvs-hash":"9YrNp-rkOFV7dkiIBAEk_Ja34a0yuAvXGjQAJsT1xslo2fcw6E4_MtOM6ErG7YABQjNP5D893UZnH5nkaaAuIQ","operator.openshift.io/spec-hash":"b971292f840e8a4ff4b90b14d60a2b269132874ad8c0d6cd3f8076723e371c37"}
        openshift-catalogd                       Deployment   catalogd-controller-manager             {"deployment.kubernetes.io/revision":"1","kubectl.kubernetes.io/default-logs-container":"manager","operator.openshift.io/spec-hash":"6d5bf3cc306dcd6b197b3eb4592b1ad9b0e28029b204dbf127b6ba2d5b631b20"}
        openshift-cluster-storage-operator       Deployment   csi-snapshot-controller                 {"deployment.kubernetes.io/revision":"1","operator.openshift.io/spec-hash":"afe3e1215751d3db8dd652d7a9c4afa0ce445a54dab9b19fc41df903b8121002"}
        openshift-console                        Deployment   console                                 {"console.openshift.io/authn-ca-trust-config-version":"19072","console.openshift.io/console-config-version":"1180108","console.openshift.io/image":"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bf6a9e0d72842ba4d75805068cec11e7258ceacca387c153da7c15e9df883f9f","console.openshift.io/infrastructure-config-version":"536","console.openshift.io/oauth-secret-version":"19105","console.openshift.io/proxy-config-version":"554","console.openshift.io/service-ca-config-version":"22469","console.openshift.io/trusted-ca-config-version":"22474","deployment.kubernetes.io/revision":"32","operator.openshift.io/spec-hash":"b35f1ad529bd9eab6d04c8c3ef308d2e926348d0d1c7a057d969c3baa112816a"}
        openshift-console                        Deployment   downloads                               {"deployment.kubernetes.io/revision":"1","operator.openshift.io/spec-hash":"5aa6b0709352f48b7bfddc9f710da047108a83b47599594e00fe22f22f6e975e"}
        openshift-controller-manager             Deployment   controller-manager                      {"deployment.kubernetes.io/revision":"12","operator.openshift.io/spec-hash":"a6613ece1d17ddfab14bf7d4b97fe6a46765129c8839147a312c473e7a02a339","release.openshift.io/version":"4.18.24"}
        openshift-kube-storage-version-migrator  Deployment   migrator                                {"deployment.kubernetes.io/revision":"1","operator.openshift.io/spec-hash":"c6081fec4123a6931c2a6d33a723fb45ce4adb23f6c6c01fd58a17df0a127478"}
        openshift-machine-api                    Deployment   machine-api-controllers                 {"deployment.kubernetes.io/revision":"1","machine.openshift.io/owned":"","operator.openshift.io/dep-openshift-machine-api.mao-trusted-ca.configmap":"ElMHxA==","operator.openshift.io/spec-hash":"a12a3ac49310bace4d414c7f1a0bf00f26b7749be82a6eca2b1761a257e88641"}
        openshift-machine-api                    Deployment   metal3                                  {"baremetal.openshift.io/owned":"","deployment.kubernetes.io/revision":"1","operator.openshift.io/spec-hash":"96095467821b2ad2d7bf2a1adf96f87f815b7e553e5996b0aabc488bb23ec525"}
        openshift-machine-api                    Deployment   metal3-baremetal-operator               {"baremetal.openshift.io/owned":"","deployment.kubernetes.io/revision":"11","operator.openshift.io/spec-hash":"93eba5b33ce68ef394f6857a884b6e558bc7a9742e572f0594668390c45014a6"}
        openshift-machine-api                    Deployment   metal3-image-customization              {"deployment.kubernetes.io/revision":"1","operator.openshift.io/spec-hash":"ccdc9cf0b7248d6cb5bd31c68b2a2dbd3d535de44a9780ec9b579f1f3f1b06e5"}
        openshift-oauth-apiserver                Deployment   apiserver                               {"deployment.kubernetes.io/revision":"5","openshiftapiservers.operator.openshift.io/operator-pull-spec":"","operator.openshift.io/dep-openshift-oauth-apiserver.etcd-client.secret":"t3EW1g==","operator.openshift.io/dep-openshift-oauth-apiserver.etcd-serving-ca.configmap":"oCFg0A==","operator.openshift.io/spec-hash":"a33850925f3bd8adfab2c93432d9684b0b0267711a27a31d850be17300cf5848"}
        openshift-operator-controller            Deployment   operator-controller-controller-manager  {"deployment.kubernetes.io/revision":"1","kubectl.kubernetes.io/default-logs-container":"manager","operator.openshift.io/spec-hash":"e30cf1612910510a2486deee4d4ac9a06bdcbad81fc567b832dd33206aec51ab"}
        openshift-route-controller-manager       Deployment   route-controller-manager                {"deployment.kubernetes.io/revision":"10","operator.openshift.io/spec-hash":"5162147590a104746378aeaf57d5b0c7e2cfea27ad9845ef0659f1bab66eea4d","release.openshift.io/version":"4.18.24"}
        openshift-service-ca                     Deployment   service-ca                              {"deployment.kubernetes.io/revision":"1","operator.openshift.io/spec-hash":"53652ec8c019ec868855371b7424e32488d8aee83f2db09dd9c5d581872c801c"}
        openshift-monitoring                     StatefulSet  alertmanager-main                       {"operator.prometheus.io/controller-id":"openshift-monitoring/prometheus-operator","prometheus-operator-input-hash":"2744621181287598978"}
        openshift-monitoring                     StatefulSet  prometheus-k8s                          {"operator.prometheus.io/controller-id":"openshift-monitoring/prometheus-operator","prometheus-operator-input-hash":"1758422184247178709"}
        openshift-image-registry                 DaemonSet    node-ca                                 {"deprecated.daemonset.template.generation":"1","operator.openshift.io/spec-hash":"d745004a621fa53c4361fec938c808855b15a3467148f404d44ab77c5e8b8424"}
        openshift-machine-api                    DaemonSet    ironic-proxy                            {"deprecated.daemonset.template.generation":"8","operator.openshift.io/spec-hash":"0420378fab7b2b88ea2ec3a02e2d924d9248166ca79572a2b30ad4967232b624"}

As the analysis shows, the logic for these updates is deeply integrated into the source code of each respective operator. A comprehensive audit would require investigating the behavior of each operator individually. Therefore, the architectural recommendation of workload isolation remains the most robust strategy.