openshift4 内置 dns, haproxy, image registry / openshift4 embeds dns, haproxy, image registry
⚠️注意,本文所述操作,涉及更改 openshift 4 底层操作系统 rhcos,这有可能导致失去红帽支持资格,具体的情况,请和对口的红帽 GPS 团队沟通, 或者联系红帽 CEE 团队确认。这是因为本方案:
- 没有经过严格的测试
- 将在rhcos上安装rpm
- rpm来自于epel, DIY
⚠️Note that the operation described in this article involves changing the underlying operating system rhcos of openshift 4, which may lead to the loss of Red Hat support qualification. For specific circumstances, please communicate with the corresponding Red Hat GPS team, or contact the Red Hat CEE team for confirmation. This is because this solution:
- Not rigorously tested
- will install rpm on rhcos
- rpm from epel, DIY
rhcos 是一个特殊版本的coreos, 它是openshift 4的底座操作系统,在openshift 4的官方文档中,rhcos被描述成为不可变操作系统,这会让人误以为,rhcos是不可改变的。这个错误的认识,让openshift 4在项目实施的过程中,遇到很多尴尬,也让很多场景,支持起来非常的别扭。
rhcos is a special version of coreos, which is the base operating system of openshift 4. In the official documents of openshift 4, rhcos is described as an immutable operating system, which will make people mistakenly think that rhcos is immutable. This wrong understanding made Openshift 4 encounter a lot of embarrassment in the process of project implementation, and it also made many scenarios very awkward to support.
比如,我们有一个边缘的5GC的场景,客户要求服务器数量尽量少,并且要求高可用。而openshift 4如果要做到高可用,必须3台服务器,而如果考虑到,需要外部的dns, 负载分担,镜像仓库,并且考虑他们的HA,那么还需要2个服务器,这样一共就5台服务器了。这对于一个边缘部署来说,太重了。
For example, we have an edge 5GC scenario, where customers require as few servers as possible and high availability. If openshift 4 is to be highly available, 3 servers are required, and if it is considered that external dns, load sharing, mirror registry are required, and their HA is considered, then 2 servers are needed, so there are 5 servers in total. This is too heavy for an edge deployment.
openshift 4的竞品们,一般都是把dns,负载分担,镜像仓库等等周边组件,融入到集群内部,也就是在操作系统上直接部署,而openshift 4号称操作系统不可变,那是不是这些服务就不能部署到内部去呢?本文我们就来探索一下。
Competitors of openshift 4 generally integrate dns, load sharing, mirror registry and other peripheral components into the cluster, that is, deploy directly on the operating system, while openshift 4 claims that the operating system is immutable, is that right? Can't the service be deployed internally? In this article we will explore.
openshift4 虽然号称支持单节点,3节点的边缘部署模式,但是实际项目实施的时候,往往需要多一个节点,这个节点需要承载的任务有:
- DNS服务 : 因为k8s的各种内部服务,都依赖DNS解析
- load balancer 服务 : 3 k8s master是需要负载均衡服务的。
- 镜像仓库 : 这个是因为crio会在系统重启的时候,检查是否是意外重启,如果是,会清空本机镜像缓存,重新从镜像仓库下载。
- NTP服务 : 集群节点之间的时间同步服务,好在现在大多数 交换机/路由器 都可以提供这个服务。
Although openshift4 claims to support the edge deployment mode of single node and 3 nodes, when the actual project is implemented, one more node is often required. The tasks that this node needs to carry include:
- DNS service: Because various internal services of k8s rely on DNS resolution
- Load balancer service: 3 k8s master needs load balancing service.
- Mirror registry: This is because crio will check whether it is an accidental restart when the system restarts. If so, it will clear the local container image cache and download it from the mirror registry again.
- NTP service: Time synchronization service between cluster nodes. Fortunately, most switches/routers can provide this service.
上述服务,当然可以集中部署到核心区域,但是有些场景,比如私有5G核心网,我们必须把上述服务部署到边缘站点中,这是因为,私有5GC是断外网的环境。
The above services can of course be deployed in the core area, but in some scenarios, such as private 5G core networks, we must deploy the above services to edge sites, because private 5GC is an environment where the external network is disconnected.
我们还知道,openshift4 本身就是基于 rhcos / coreos 操作系统之上的 k8s, 我们自然希望可以把上述的服务,内嵌到 rhcos / coreos 里面去,实现真正意义上的 单节点/3节点 的部署模式。
We also know that openshift4 itself is based on k8s on the rhcos / coreos operating system. We naturally hope that the above services can be embedded in rhcos / coreos to achieve a true single-node/3-node deployment mode.
如果没有本方案,那么我们的部署会是这个样子的,可以看到,必须要有一个 helper 节点,提供辅助功能。
Without this solution, our deployment would look like this. As you can see, there must be a helper node to provide auxiliary functions.
以下是本方案的架构设计: / The following is the architectural design of this scheme:
让我们开始吧。 / Let's begin
视频讲解 / Video explanation
on single node ocp
我们从最简单的单节点openshift 4 集群开始。我们的目标,是把helper上的以下组件,用openshift 4的单节点中的组件替代:
We start with the simplest single node openshift 4 cluster. Our goal is to replace the following components on the helper with components in a single node of openshift 4:
- dns -> pdns (power dns)
- image registry -> docker distribution
我们不考虑 haproxy ,是因为单节点,没有外部负载分担的需要。
We do not consider haproxy because it is a single node and there is no need for external load sharing.
而NTP服务,我们认为网络交换机/路由器可以提供。或者在SNO场景下,可以不用NTP服务。也可以在SNO节点上直接启动一个NTP服务都可以。
And NTP service, we think network switch/router can provide. Or in the SNO scenario, the NTP service may not be used. You can also directly start an NTP service on the SNO node.
这里是这个single node ocp的day-0的部署过程记录。
Here is the deployment process record of day-0 of this single node ocp.
以下是day-0的时候,部署的架构图: / The following is the architecture diagram of the deployment at day-0:
我们的目标,是通过day-2的操作,把他变成这个样子: / Our goal is to make it look like this through the operation of day-2:
prepare docker registry content
我们需要先准备以下离线镜像仓库,openshift支持了一个oc-mirror的工具,我们可以方便的使用。我们先把离线镜像仓库下载到文件中。留着后面使用。
We need to prepare the following offline mirror images first. Openshift supports an oc-mirror tool, which we can use easily. We first download the offline mirror repository to a file. Save it for later use.
# setup a stand alone docker registry
# on helper
cat > /data/ocp4/mirror.yaml << EOF
apiVersion: mirror.openshift.io/v1alpha1
kind: ImageSetConfiguration
# archiveSize: 4
mirror:
ocp:
channels:
- name: stable-4.10
versions:
- '4.10.28'
- '4.10.26'
additionalImages:
- name: registry.redhat.io/redhat/redhat-operator-index:v4.10
- name: registry.redhat.io/redhat/certified-operator-index:v4.10
- name: registry.redhat.io/redhat/community-operator-index:v4.10
- name: registry.redhat.io/redhat/redhat-marketplace-index:v4.10
EOF
mkdir -p /data/install/mirror-tmp
cd /data/install/mirror-tmp
oc-mirror --config /data/ocp4/mirror.yaml file:///data/install/mirror-tmp
install rpm to rhcos
我们需要向rhcos直接安装pdns, docker distribution等软件,为什么不能用openshift的容器来提供这些服务呢?这里面有一个crio的bug,简单说,如果主机意外重启,crio会把本地镜像全部作废,然后重新从镜像仓库下载。所以,我们的dns, registry服务就不能用容器来启动,否则如果宿主机暴力重启,dns, registry的容器服务都启动不了,这个节点的openshift服务就无法启动了。
We need to install pdns, docker distribution and other software directly to rhcos, why can't we use openshift containers to provide these services? There is a crio bug here. Simply speaking, if the host restarts unexpectedly, crio will invalidate all the local images and download them from the mirror repository again. Therefore, our dns and registry services cannot be started with containers. Otherwise, if the host restarts violently, the container services of dns and registry cannot be started, and the openshift service of this node cannot be started.
有同事建议,可以使用podman/systemd的方式,在systemd里面注册一个服务,在服务里面通过podman启动pdns, registry,经过实验测试,断电重启的情况下,podman的镜像,也会丢失,所以对应的systemd service也启动不了。所以我们就彻底放弃容器解决方案。
A colleague suggested that we can use the podman/systemd method to register a service in systemd, and start pdns and registry through podman in the service. After experimental testing, in the case of power failure and restart, the image of podman will also be lost, so the corresponding The systemd service does not start either. So we dropped the container solution entirely.
我们还需要做一个rpm repo源,这里作者做好了一个demo rpm repo源,注意,这个源引用了epel的rpm, 还有作者自己打包的rpm。所以这个源只能作为学习和测试之用。
We also need to make an rpm repo source. Here the author has prepared a demo rpm repo source. Note that this source refers to the rpm of epel and the rpm packaged by the author himself. So this source should only be used for learning and testing purposes.
最后,用rpm-ostree向rhcos装rpm,这个技术是openshift 4自己就在使用的,openshift 4 extension功能,比如real-time kernel extension, 就是通过rpm-ostree向rhcos装了对应的kernel包实现的。
Finally, use rpm-ostree to install rpm to rhcos. This technology is used by openshift 4 itself. Openshift 4 extension functions, such as real-time kernel extension, are implemented by installing the corresponding kernel package to rhcos through rpm-ostree.
# on helper
mkdir -p /data/repo
cd /data/repo
# here is the demo simple repo
# you can build the repo by youself, just following rhel8.4 way
wget https://github.com/wangzheng422/release/releases/download/ocp.4.10.28.simple.repo/dnf-ocp-4.10-simple.tgz
tar zvxf dnf-ocp-4.10-simple.tgz
cd /data/repo/dnf-ocp-4.10-simple/
# start http server to serve the rpm repo
python3 -m http.server 5180
# Serving HTTP on 0.0.0.0 port 5180 (http://0.0.0.0:5180/) ...
# login into single node
export REPO_IP=http://192.168.7.11:5180
cat << EOF > /etc/yum.repos.d/wzh.repo
# RHEL repos
[simple]
baseurl=${REPO_IP}/
enabled=1
repo_gpgcheck=0
gpgcheck=0
EOF
rpm-ostree install htop pdns pdns-recursor docker-distribution
# Checking out tree 8b334e0... done
# Enabled rpm-md repositories: simple
# Updating metadata for 'simple'... done
# rpm-md repo 'simple'; generated: 2022-09-09T06:17:17Z
# Importing rpm-md... done
# Resolving dependencies... done
# Will download: 11 packages (12.9 MB)
# Downloading from 'simple'... done
# Importing packages... done
# Checking out packages... done
# Running pre scripts... done
# Running post scripts... done
# Running posttrans scripts... done
# Writing rpmdb... done
# Writing OSTree commit... done
# Staging deployment... done
# Added:
# boost-context-1.66.0-10.el8.x86_64
# boost-filesystem-1.66.0-10.el8.x86_64
# boost-program-options-1.66.0-10.el8.x86_64
# boost-system-1.66.0-10.el8.x86_64
# docker-distribution-2.8.1-0.el8.x86_64
# htop-3.2.1-1.el8.x86_64
# libsodium-1.0.18-2.el8.x86_64
# luajit-2.1.0-0.16beta3.el8.x86_64
# pdns-4.6.2-1.el8.x86_64
# pdns-recursor-4.3.6-1.el8.x86_64
# protobuf-3.5.0-13.el8.x86_64
# Run "systemctl reboot" to start a reboot
systemctl reboot
# after reboot
rpm-ostree status
# State: idle
# Deployments:
# ● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
重启以后,我们就能看到LayeredPackages了,以后版本的 openshift 4 会在集群层面支持 LayeredPackages 功能。目前我们只能直接登录rhcos来手动做安装。
After restarting, we can see LayeredPackages, and furture versions of openshift 4 will support the LayeredPackages function at the cluster / k8s level. At present, we can only log in directly to rhcos to do the installation manually.
embed registry on single node ocp
我们需要的软件,已经装在节点上了,接下来,我们就做一些配置,把本地的镜像仓库激活。注意,这里面我们使用的是docker distribution, 我们把之前helper上的镜像仓库的证书拿来直接给他用,这样之后,我们只要更改dns指向就可以了。
The software we need has been installed on the node. Next, we will do some configuration to activate the local mirror registry. Note that we are using the docker distribution here. We use the certificate of the image registry on the helper directly for it to use. After that, we only need to change the dns point.
我们的配置文件位于/etc下面, 上传的镜像位于/var下面,那么节点重启,集群升级,这些目录会不会被重置呢?目前的实测表明不会,按照文档的说法,/etc下面的内容,在升级的时候会进行合并,/var下面的内容,会保留。
Our configuration file is located under /etc, and the uploaded image is located under /var. Then, if the node is restarted and the cluster is upgraded, will these directories be reset? The current testing shows that it will not. According to the document, the content under /etc will be merged during the upgrade, and the content under /var will be retained.
export BASE_DIR='/home/sno/'
export VAR_CERT_DIR=/etc/crts/
echo "obase=8;ibase=10;420" | bc
# 644
echo "obase=10;ibase=8;700" | bc
# 448
#########################
# run with root
# to grant read access to key
chmod og+r $VAR_CERT_DIR/redhat.ren.key
#########################
cat << EOF > ${BASE_DIR}/data/sno/registry.images.bu
variant: openshift
version: 4.10.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-registry
storage:
files:
- path: /etc/wzh/redhat.ren.crt
overwrite: true
contents:
source: data:text/plain;charset=utf-8;base64,$( base64 -w 0 < ${VAR_CERT_DIR}/redhat.ren.crt )
mode: 420
user:
name: root
- path: /etc/wzh/redhat.ren.key
overwrite: true
contents:
source: data:text/plain;charset=utf-8;base64,$( base64 -w 0 < ${VAR_CERT_DIR}/redhat.ren.key )
mode: 420
user:
name: root
- path: /etc/wzh/registry-config.yml
overwrite: true
contents:
inline: |
version: 0.1
log:
accesslog:
disabled: true
fields:
service: registry
storage:
cache:
layerinfo: inmemory
filesystem:
rootdirectory: /var/wzh-registry
delete:
enabled: true
maintenance:
readonly:
enabled: false
http:
addr: :8443
tls:
certificate: /etc/wzh/redhat.ren.crt
key: /etc/wzh/redhat.ren.key
mode: 420
user:
name: root
systemd:
units:
- contents: |
[Unit]
Description=Set SELinux chcon for image registry
Before=docker-distribution.service
[Service]
Type=oneshot
RemainAfterExit=yes
User=root
ExecStartPre=-mkdir -p /var/wzh-registry
ExecStart=/usr/bin/chcon -Rt container_file_t /var/wzh-registry
[Install]
WantedBy=multi-user.target
enabled: true
name: hostpath-registry.service
- contents: |
[Unit]
Description=v2 Registry server for Docker
After=network.target hostpath-registry.service
Requires=hostpath-registry.service
Before=kubelet.service
[Service]
Type=simple
ExecStart=/usr/bin/registry serve /etc/wzh/registry-config.yml
[Install]
WantedBy=multi-user.target
enabled: true
name: docker-distribution.service
- name: kubelet.service
dropins:
- name: 99-after-registry.conf
contents: |
[Unit]
Requires=docker-distribution.service
After=docker-distribution.service
EOF
butane ${BASE_DIR}/data/sno/registry.images.bu > ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
oc create --save-config -f ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
# oc apply -f ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
# oc delete -f ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
upload registry content
有了镜像仓库,我们就把之前下载的离线镜像文件,导入到节点内置的镜像仓库中。
With the mirror registry, we import the offline mirror file downloaded before into the built-in mirror registry of the node.
# on helper
oc-mirror --dest-skip-tls --from mirror_seq1_000000.tar docker://192.168.7.13:8443
(optional) update registry config to read only
我们的离线镜像上传了,就不希望别别人改掉,那么我们可以把本地的镜像仓库设置成只读模式。
Our offline mirror is uploaded, and we don't want others to change it, then we can set the local mirror repository to read-only mode.
cat << EOF > ${BASE_DIR}/data/sno/registry.images.bu
variant: openshift
version: 4.10.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-registry
storage:
files:
- path: /etc/wzh/redhat.ren.crt
overwrite: true
contents:
source: data:text/plain;charset=utf-8;base64,$( base64 -w 0 < ${VAR_CERT_DIR}/redhat.ren.crt )
mode: 420
user:
name: root
- path: /etc/wzh/redhat.ren.key
overwrite: true
contents:
source: data:text/plain;charset=utf-8;base64,$( base64 -w 0 < ${VAR_CERT_DIR}/redhat.ren.key )
mode: 420
user:
name: root
- path: /etc/wzh/registry-config.yml
overwrite: true
contents:
inline: |
version: 0.1
log:
accesslog:
disabled: true
fields:
service: registry
storage:
cache:
layerinfo: inmemory
filesystem:
rootdirectory: /var/wzh-registry
delete:
enabled: false
maintenance:
readonly:
enabled: true
http:
addr: :5443
tls:
certificate: /etc/wzh/redhat.ren.crt
key: /etc/wzh/redhat.ren.key
mode: 420
user:
name: root
systemd:
units:
- contents: |
[Unit]
Description=Set SELinux chcon for image registry
Before=docker-distribution.service
[Service]
Type=oneshot
RemainAfterExit=yes
User=root
ExecStartPre=-mkdir -p /var/wzh-registry
ExecStart=/usr/bin/chcon -Rt container_file_t /var/wzh-registry
[Install]
WantedBy=multi-user.target
enabled: true
name: hostpath-registry.service
- contents: |
[Unit]
Description=v2 Registry server for Docker
After=network.target hostpath-registry.service
Requires=hostpath-registry.service
Before=kubelet.service
[Service]
Type=simple
ExecStart=/usr/bin/registry serve /etc/wzh/registry-config.yml
[Install]
WantedBy=multi-user.target
enabled: true
name: docker-distribution.service
- name: kubelet.service
dropins:
- name: 99-after-registry.conf
contents: |
[Unit]
Requires=docker-distribution.service
After=docker-distribution.service
EOF
butane ${BASE_DIR}/data/sno/registry.images.bu > ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
oc apply -f ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
deploy power dns (pdns) as local dns service
我们配置本地的power dns,把我们需要的dns记录都写进去,并且配置它在kubelet之前启动。
We configure the local power dns, write all the dns records we need, and configure it to start before the kubelet.
oc patch mcp/master --patch '{"spec":{"paused":true}}' --type=merge
oc patch mcp/worker --patch '{"spec":{"paused":true}}' --type=merge
cat > ${BASE_DIR}/data/sno/pdns.bu << 'EOF'
variant: openshift
version: 4.10.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-pdns
storage:
files:
- path: /etc/pdns/pdns.conf
overwrite: true
contents:
inline: |
launch=bind
local-address=0.0.0.0
local-port=53
setgid=pdns
setuid=pdns
bind-config=/etc/pdns/bind.conf
bind-check-interval=300
enable-lua-records=yes
mode: 420
user:
name: root
- path: /etc/pdns/bind.conf
overwrite: true
contents:
inline: |
zone "acm-demo-hub.redhat.ren" { type master; file "/etc/pdns/inside-out.xyz"; };
zone "infra.redhat.ren" { type master; file "/etc/pdns/infra.xyz"; };
mode: 420
user:
name: root
- path: /etc/pdns/inside-out.xyz
overwrite: true
contents:
inline: |
$TTL 10
@ IN SOA ns1.acm-demo-hub.redhat.ren. postmaster.acm-demo-hub.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN A 192.168.7.13
;ns1 IN A 8.8.8.8
;ns2 IN A 8.8.4.4
helper IN A 192.168.7.11
;
;
; The api points to the IP of your load balancer
api IN A 192.168.7.13
api-int IN A 192.168.7.13
;
; The wildcard also points to the load balancer
*.apps IN A 192.168.7.13
;
; Create entry for the bootstrap host
; bootstrap IN A 192.168.7.12
;
; Create entries for the master hosts
master-0 IN A 192.168.7.13
;master-1 IN A 192.168.7.14
;master-2 IN A 192.168.7.15
;
; Create entries for the worker hosts
;worker-0 IN A 192.168.7.16
;worker-1 IN A 192.168.7.17
;worker-2 IN A 192.168.7.18
;
; The ETCd cluster lives on the masters...so point these to the IP of the masters
;etcd-0 IN A 192.168.7.13
;etcd-1 IN A 192.168.7.14
;etcd-2 IN A 192.168.7.15
;
; Create entries for the other hosts
registry IN A 192.168.7.13
yum IN A 192.168.7.1
nexus IN A 192.168.7.1
git IN A 192.168.7.11
tmp-registry IN A 192.168.7.177
mode: 420
user:
name: root
- path: /etc/pdns/infra.xyz
overwrite: true
contents:
inline: |
$TTL 10
@ IN SOA ns1.infra.redhat.ren. postmaster.infra.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN A 192.168.7.13
quay IN A 192.168.7.13
quaylab IN A 192.168.7.13
mode: 420
user:
name: root
systemd:
units:
- name: pdns.service
enabled: true
- name: kubelet.service
dropins:
- name: 99-after-pdns.conf
contents: |
[Unit]
Requires=pdns.service
After=pdns.service
EOF
butane ${BASE_DIR}/data/sno/pdns.bu > ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
oc create --save-config -f ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
# oc apply -f ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
update registry.conf to point to local registry
默认情况下,这一步并不需要,但是作者的集群装的时候,对registries.conf做过特殊的配置,这里面就要把镜像仓库重新调整以下。image.registries.conf.sh脚本的源代码在这里。
By default, this step is not required, but when the author's cluster is installed, he has made special configuration to registries.conf, and the mirror warehouse needs to be readjusted as follows. The source code for the image.registries.conf.sh script is here.
######################
# run as root
cd /data/ocp4
bash image.registries.conf.sh quay.infra.redhat.ren:8443
######################
oc apply -f /data/ocp4/99-worker-container-registries.yaml
oc apply -f /data/ocp4/99-master-container-registries.yaml
set sno dns to local dns service
更改single node ocp的dns配置,根据集群安装的方法不同而不同。本次实验的集群的安装方法在这里,于是我们就这样来更改dns指向。
Change the dns configuration of the single node ocp, which varies according to the method of cluster installation. The installation method of the cluster in this experiment is here, so we will change the dns point like this.
NTP_SERVER=192.168.7.11
HELP_SERVER=192.168.7.11
KVM_HOST=192.168.7.11
API_VIP=192.168.7.100
INGRESS_VIP=192.168.7.101
CLUSTER_PROVISION_IP=192.168.7.103
BOOTSTRAP_IP=192.168.7.12
ACM_DEMO_MNGED_CLUSTER=acm-demo1
ACM_DEMO_MNGED_SNO_IP=192.168.7.15
# 定义单节点集群的节点信息
SNO_CLUSTER_NAME=acm-demo-hub
SNO_BASE_DOMAIN=redhat.ren
SNO_IP=192.168.7.13
# ocp bug, gateway needs to be online, otherwise, ovn will mis-behaviour, and ingress failed to start.
SNO_GW=192.168.7.9
SNO_NETMAST=255.255.255.0
SNO_NETMAST_S=24
SNO_HOSTNAME=acm-demo-hub-master
SNO_IF=enp1s0
SNO_IF_MAC=`printf '00:60:2F:%02X:%02X:%02X' $[RANDOM%256] $[RANDOM%256] $[RANDOM%256]`
SNO_DNS=192.168.7.11
SNO_DISK=/dev/vda
SNO_CORE_PWD=redhat
export BASE_DIR='/home/sno/'
cat << EOF > ${BASE_DIR}/data/sno/static.ip.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-static-ip
storage:
files:
- path: /etc/NetworkManager/system-connections/${SNO_IF}.nmconnection
mode: 0600
overwrite: true
contents:
inline: |
[connection]
id=${SNO_IF}
type=ethernet
autoconnect-retries=1
interface-name=${SNO_IF}
multi-connect=1
permissions=
wait-device-timeout=60000
[ethernet]
mac-address-blacklist=
[ipv4]
address1=${SNO_IP}/${SNO_NETMAST_S=24},${SNO_GW}
dhcp-hostname=${SNO_HOSTNAME}
dhcp-timeout=90
dns=${SNO_IP};
dns-search=
may-fail=false
method=manual
[ipv6]
addr-gen-mode=eui64
dhcp-hostname=${SNO_HOSTNAME}
dhcp-timeout=90
dns-search=
method=disabled
[proxy]
EOF
butane ${BASE_DIR}/data/sno/static.ip.bu > ${BASE_DIR}/data/sno/disconnected/99-zzz-master-ip.yaml
oc apply -f ${BASE_DIR}/data/sno/disconnected/99-zzz-master-ip.yaml
oc patch mcp/master --patch '{"spec":{"paused":false}}' --type=merge
oc patch mcp/worker --patch '{"spec":{"paused":false}}' --type=merge
test with force power off
我们知道,如果ocp node意外断电的话,启动的时候,他会重新下载集群需要的基础镜像,那么我们就暴力断电,来测试sno能否启动吧。
We know that if the ocp node is accidentally powered off, it will re-download the basic image required by the cluster when it starts up, so we will power off violently to test whether the sno can be started.
重启之后,正常启动。 / After restarting, start normally.
oc get co
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# authentication 4.10.26 True False False 30m
# baremetal 4.10.26 True False False 4d22h
# cloud-controller-manager 4.10.26 True False False 4d22h
# cloud-credential 4.10.26 True False False 4d22h
# cluster-autoscaler 4.10.26 True False False 4d22h
# config-operator 4.10.26 True False False 4d22h
# console 4.10.26 True False False 7m23s
# csi-snapshot-controller 4.10.26 True False False 4d22h
# dns 4.10.26 True False False 20m
# etcd 4.10.26 True False False 4d22h
# image-registry 4.10.26 True False False 4d22h
# ingress 4.10.26 True False False 4d22h
# insights 4.10.26 True False False 40s
# kube-apiserver 4.10.26 True False False 4d22h
# kube-controller-manager 4.10.26 True False False 4d22h
# kube-scheduler 4.10.26 True False False 4d22h
# kube-storage-version-migrator 4.10.26 True False False 3d18h
# machine-api 4.10.26 True False False 4d22h
# machine-approver 4.10.26 True False False 4d22h
# machine-config 4.10.26 True False False 4d22h
# marketplace 4.10.26 True False False 4d22h
# monitoring 4.10.26 True False False 4d22h
# network 4.10.26 True False False 4d22h
# node-tuning 4.10.26 True False False 30m
# openshift-apiserver 4.10.26 True False False 3d22h
# openshift-controller-manager 4.10.26 True False False 2d19h
# openshift-samples 4.10.26 True False False 3d23h
# operator-lifecycle-manager 4.10.26 True False False 4d22h
# operator-lifecycle-manager-catalog 4.10.26 True False False 4d22h
# operator-lifecycle-manager-packageserver 4.10.26 True False False 7m48s
# service-ca 4.10.26 True False False 4d22h
# storage 4.10.26 True False False 4d22h
test with ocp upgrade
我们上传的镜像,包括了4.10.26, 4.10.28两个版本,那么我们就来试试升级吧
The images we uploaded include two versions: 4.10.26 and 4.10.28, so let's try to upgrade
rpm-ostree status
# State: idle
# Deployments:
# ● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# before upgrade, make sure the rpm repo is online
# rpm-ostree will call rpm repo during upgrade
# although it will not download anything
# upgrade ocp to 4.10.28
oc adm upgrade \
--to-image=quay.io/openshift-release-dev/ocp-release@sha256:2127608ebd67a2470860c42368807a0de2308dba144ec4c298bec1c03d79cb52 \
--allow-explicit-upgrade --allow-upgrade-with-warnings=true --force=true
rpm-ostree status
# State: idle
# Deployments:
# ● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:822737b305b28aa4890f7bf847ebebc896cd7b549318195fc8c953ae3008cc44
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208161501-0 (2022-08-16T15:04:45Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor
oc get co
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# authentication 4.10.28 True False False 26m
# baremetal 4.10.28 True False False 130m
# cloud-controller-manager 4.10.28 True False False 130m
# cloud-credential 4.10.28 True False False 154m
# cluster-autoscaler 4.10.28 True False False 130m
# config-operator 4.10.28 True False False 142m
# console 4.10.28 True False False 26m
# csi-snapshot-controller 4.10.28 True False False 32m
# dns 4.10.28 True False False 26m
# etcd 4.10.28 True False False 138m
# image-registry 4.10.28 True False False 36m
# ingress 4.10.28 True False False 141m
# insights 4.10.28 True False False 17s
# kube-apiserver 4.10.28 True False False 131m
# kube-controller-manager 4.10.28 True False False 136m
# kube-scheduler 4.10.28 True False False 133m
# kube-storage-version-migrator 4.10.28 True False False 141m
# machine-api 4.10.28 True False False 130m
# machine-approver 4.10.28 True False False 141m
# machine-config 4.10.28 True False False 138m
# marketplace 4.10.28 True False False 141m
# monitoring 4.10.28 True False False 35m
# network 4.10.28 True False False 142m
# node-tuning 4.10.28 True False False 36m
# openshift-apiserver 4.10.28 True False False 36m
# openshift-controller-manager 4.10.28 True False False 131m
# openshift-samples 4.10.28 True False False 36m
# operator-lifecycle-manager 4.10.28 True False False 130m
# operator-lifecycle-manager-catalog 4.10.28 True False False 130m
# operator-lifecycle-manager-packageserver 4.10.28 True False False 104m
# service-ca 4.10.28 True False False 141m
# storage 4.10.28 True False False 130m
我们可以看到,能够正常的升级和启动。
We can see that it can be upgraded and started normally.
3 node cluster
接下来,我们尝试 3 node openshift / compact cluster。我们的目标,是把helper上的以下组件,用openshift 4的节点中的组件替代:
Next, we try a 3 node openshift / compact cluster. Our goal is to replace the following components on the helper with components in the openshift 4 node:
- dns -> pdns (power dns)
- haproxy -> pdns lua plugin (ifportup)
- image registry -> docker distribution
而NTP服务,我们依然认为网络交换机/路由器可以提供。
And NTP service, we still think that network switch/router can provide.
install rpm to rhcos
这个步骤,和single node ocp是一样的,只不过需要在 3 master 上都执行一遍。另外,我们多安装了一个pdns-selinux, 这个包和docker-distribution都是作者自己打包的,pdns-selinux补充了selinux规则,运行pdns能够做对外的端口检查。
This step is the same as single node ocp, except that it needs to be executed on all 3 masters. In addition, we have installed one more pdns-selinux. This package and docker-distribution are packaged by the author himself. pdns-selinux supplements the selinux rules, and running pdns can perform external port inspection.
# Delete cached rpm repo metadata
# rpm-ostree cleanup -m
rpm-ostree install htop pdns pdns-recursor docker-distribution pdns-selinux
# Added:
# pdns-selinux-0.0.1-0.el8.x86_64
# Run "systemctl reboot" to start a reboot
reboot
rpm-ostree status
# State: idle
# Deployments:
# ● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor pdns-selinux
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor
embed registry on each ocp node
这个步骤,也和 single node ocp是一样的。
This step is also the same as single node ocp.
export BASE_DIR='/home/3node/'
export VAR_CERT_DIR=/etc/crts/
# ......
upload registry content
这个步骤,和single node ocp是一样的,只不过需要为 3 master 都执行一遍。
This step is the same as single node ocp, but it needs to be executed for all 3 masters.
oc-mirror --dest-skip-tls --from mirror_seq1_000000.tar docker://192.168.7.13:8443
oc-mirror --dest-skip-tls --from mirror_seq1_000000.tar docker://192.168.7.14:8443
oc-mirror --dest-skip-tls --from mirror_seq1_000000.tar docker://192.168.7.15:8443
deploy power dns (pdns) as local dns service
我们配置本地的power dns,把我们需要的dns记录都写进去,并且配置它在kubelet之前启动。这一步和之前的single node ocp不一样,需要用到pdns lua plugin,用 ifportup 的方法,探测对应节点上的端口是否打开,如果没有打开,认为对应的服务没有启动,或者节点掉线,然后 pdns 就不会返回对应节点的解析。我们用这种方法,来代替haproxy。
We configure the local power dns, write all the dns records we need, and configure it to start before the kubelet. This step is different from the previous single node ocp. You need to use the pdns lua plugin. Use the ifportup method to detect whether the port on the corresponding node is open. If it is not open, it is considered that the corresponding service is not started, or the node is offline, and then pdns The parsing of the corresponding node will not be returned. We use this method to replace haproxy.
cat > ${BASE_DIR}/data/sno/pdns.bu << 'EOF'
variant: openshift
version: 4.10.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-pdns
storage:
files:
- path: /etc/pdns/pdns.conf
overwrite: true
contents:
inline: |
launch=bind
local-address=0.0.0.0
local-port=53
setgid=pdns
setuid=pdns
bind-config=/etc/pdns/bind.conf
bind-check-interval=300
enable-lua-records=yes
mode: 420
user:
name: root
- path: /etc/pdns/bind.conf
overwrite: true
contents:
inline: |
zone "acm-demo-hub.redhat.ren" { type master; file "/etc/pdns/inside-out.xyz"; };
zone "infra.redhat.ren" { type master; file "/etc/pdns/infra.xyz"; };
mode: 420
user:
name: root
- path: /etc/pdns/inside-out.xyz
overwrite: true
contents:
inline: |
$TTL 10
@ IN SOA ns1.acm-demo-hub.redhat.ren. postmaster.acm-demo-hub.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;ns1 IN A 8.8.8.8
;ns2 IN A 8.8.4.4
helper IN A 192.168.7.11
;
;
; The api points to the IP of your load balancer
api IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
api-int IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;
; The wildcard also points to the load balancer
*.apps IN LUA A "ifportup(443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;
; Create entry for the bootstrap host
; bootstrap IN A 192.168.7.12
;
; Create entries for the master hosts
;master-0 IN A 192.168.7.13
;master-1 IN A 192.168.7.14
;master-2 IN A 192.168.7.15
;
; Create entries for the worker hosts
;worker-0 IN A 192.168.7.16
;worker-1 IN A 192.168.7.17
;worker-2 IN A 192.168.7.18
;
; The ETCd cluster lives on the masters...so point these to the IP of the masters
;etcd-0 IN A 192.168.7.13
;etcd-1 IN A 192.168.7.14
;etcd-2 IN A 192.168.7.15
;
; Create entries for the other hosts
;registry IN LUA A "ifportup(8443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;yum IN A 192.168.7.1
;quay IN LUA A "ifportup(8443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
nexus IN A 192.168.7.1
git IN A 192.168.7.11
tmp-registry IN A 192.168.7.177
mode: 420
user:
name: root
- path: /etc/pdns/infra.xyz
overwrite: true
contents:
inline: |
$TTL 10
@ IN SOA ns1.infra.redhat.ren. postmaster.infra.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN A 192.168.7.13
quay IN LUA A "ifportup(8443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
quaylab IN LUA A "ifportup(8443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
mode: 420
user:
name: root
systemd:
units:
- name: pdns.service
enabled: true
- name: kubelet.service
dropins:
- name: 99-after-pdns.conf
contents: |
[Unit]
Requires=pdns.service
After=pdns.service
EOF
butane ${BASE_DIR}/data/sno/pdns.bu > ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
oc create --save-config -f ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
# oc apply -f ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
update registry.conf to point to local registry
这个步骤,也和 single node ocp是一样的。根据集群的安装方法不同,而不同。
This step is also the same as single node ocp. It varies according to the installation method of the cluster.
######################
# run as root
cd /data/ocp4
bash image.registries.conf.sh quay.infra.redhat.ren:8443
######################
oc patch mcp/master --patch '{"spec":{"paused":true}}' --type=merge
oc patch mcp/worker --patch '{"spec":{"paused":true}}' --type=merge
oc apply -f /data/ocp4/99-worker-container-registries.yaml
oc apply -f /data/ocp4/99-master-container-registries.yaml
oc patch mcp/master --patch '{"spec":{"paused":false}}' --type=merge
oc patch mcp/worker --patch '{"spec":{"paused":false}}' --type=merge
set sno dns to local dns service
把dns指向到本地的 power dns, 指向的方法根据集群安装的方法各不相同。作者的 3 node / compact cluster 是这么安装的,因为网络使用ovn,dns配置信息会在启动的时候,从网卡copy到 br-ex 上,所以作者需要在每个节点上,修改网卡的dns指向,然后重启。
Point the dns to the local power dns, the method of pointing varies according to the method of cluster installation. The author's 3 node / compact cluster is installed like this. Because the network uses ovn, the dns configuration information will be copied from the network card to br-ex at startup, so the author needs to modify the dns point of the network card on each node. Then reboot.
# for master-01
nmcli con mod enp1s0 ipv4.dns 192.168.7.13
reboot
# for master-02
nmcli con mod enp1s0 ipv4.dns 192.168.7.14
reboot
# for master-03
nmcli con mod enp1s0 ipv4.dns 192.168.7.15
reboot
# after reboot, test the dns
dig @127.0.0.1 quaylab.infra.redhat.ren
# ; <<>> DiG 9.11.26-RedHat-9.11.26-4.el8_4 <<>> @127.0.0.1 quaylab.infra.redhat.ren
# ; (1 server found)
# ;; global options: +cmd
# ;; Got answer:
# ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55590
# ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
# ;; WARNING: recursion requested but not available
# ;; OPT PSEUDOSECTION:
# ; EDNS: version: 0, flags:; udp: 1232
# ;; QUESTION SECTION:
# ;quaylab.infra.redhat.ren. IN A
# ;; ANSWER SECTION:
# quaylab.infra.redhat.ren. 10 IN A 192.168.7.15
# ;; Query time: 7 msec
# ;; SERVER: 127.0.0.1#53(127.0.0.1)
# ;; WHEN: Thu Sep 15 02:23:09 UTC 2022
# ;; MSG SIZE rcvd: 69
dig @127.0.0.1 api.acm-demo-hub.redhat.ren
# ; <<>> DiG 9.11.26-RedHat-9.11.26-4.el8_4 <<>> @127.0.0.1 api.acm-demo-hub.redhat.ren
# ; (1 server found)
# ;; global options: +cmd
# ;; Got answer:
# ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14103
# ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
# ;; WARNING: recursion requested but not available
# ;; OPT PSEUDOSECTION:
# ; EDNS: version: 0, flags:; udp: 1232
# ;; QUESTION SECTION:
# ;api.acm-demo-hub.redhat.ren. IN A
# ;; ANSWER SECTION:
# api.acm-demo-hub.redhat.ren. 10 IN A 192.168.7.15
# ;; Query time: 1 msec
# ;; SERVER: 127.0.0.1#53(127.0.0.1)
# ;; WHEN: Thu Sep 15 02:24:19 UTC 2022
# ;; MSG SIZE rcvd: 72
dig @127.0.0.1 a.apps.acm-demo-hub.redhat.ren
# ; <<>> DiG 9.11.26-RedHat-9.11.26-4.el8_4 <<>> @127.0.0.1 a.apps.acm-demo-hub.redhat.ren
# ; (1 server found)
# ;; global options: +cmd
# ;; Got answer:
# ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16264
# ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
# ;; WARNING: recursion requested but not available
# ;; OPT PSEUDOSECTION:
# ; EDNS: version: 0, flags:; udp: 1232
# ;; QUESTION SECTION:
# ;a.apps.acm-demo-hub.redhat.ren. IN A
# ;; ANSWER SECTION:
# a.apps.acm-demo-hub.redhat.ren. 10 IN A 192.168.7.14
# ;; Query time: 1 msec
# ;; SERVER: 127.0.0.1#53(127.0.0.1)
# ;; WHEN: Thu Sep 15 02:25:20 UTC 2022
# ;; MSG SIZE rcvd: 75
test with force power off
我们知道,如果ocp node意外断电的话,启动的时候,他会重新下载集群需要的基础镜像,那么我们就暴力断电其中一个节点,来测试这个节点能否启动吧。
We know that if the ocp node is accidentally powered off, it will re-download the basic image required by the cluster when it starts up. Then we will violently power off one of the nodes to test whether the node can be started.
oc get mcp
# NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
# master rendered-master-80dda25e010fb6de88514875eefd7c19 True False False 3 3 3 0 19h
# worker rendered-worker-df248a1c64755ca00714f4f2b6d13e48 True False False 0 0 0 0 19h
oc get node
# NAME STATUS ROLES AGE VERSION
# master-01-demo Ready master,worker 19h v1.23.5+012e945
# master-02-demo Ready master,worker 19h v1.23.5+012e945
# master-03-demo Ready master,worker 19h v1.23.5+012e945
oc get co
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# authentication 4.10.26 True False False 3m14s
# baremetal 4.10.26 True False False 19h
# cloud-controller-manager 4.10.26 True False False 19h
# cloud-credential 4.10.26 True False False 19h
# cluster-autoscaler 4.10.26 True False False 19h
# config-operator 4.10.26 True False False 19h
# console 4.10.26 True False False 3m58s
# csi-snapshot-controller 4.10.26 True False False 19h
# dns 4.10.26 True False False 153m
# etcd 4.10.26 True False False 19h
# image-registry 4.10.26 True False False 19h
# ingress 4.10.26 True False False 130m
# insights 4.10.26 True False False 55s
# kube-apiserver 4.10.26 True False False 19h
# kube-controller-manager 4.10.26 True False False 19h
# kube-scheduler 4.10.26 True False False 19h
# kube-storage-version-migrator 4.10.26 True False False 71m
# machine-api 4.10.26 True False False 19h
# machine-approver 4.10.26 True False False 19h
# machine-config 4.10.26 True False False 12h
# marketplace 4.10.26 True False False 19h
# monitoring 4.10.26 True False False 19h
# network 4.10.26 True False False 19h
# node-tuning 4.10.26 True False False 19h
# openshift-apiserver 4.10.26 True False False 131m
# openshift-controller-manager 4.10.26 True False False 19h
# openshift-samples 4.10.26 True False False 19h
# operator-lifecycle-manager 4.10.26 True False False 19h
# operator-lifecycle-manager-catalog 4.10.26 True False False 19h
# operator-lifecycle-manager-packageserver 4.10.26 True False False 131m
# service-ca 4.10.26 True False False 19h
# storage 4.10.26 True False False 19h
测试结果,能正常启动。
The test result shows that it can be started normally.
test showdown 1 master
我们关掉一个节点,然后看集群的状态
We shut down a node and see the status of the cluster
oc get node
# NAME STATUS ROLES AGE VERSION
# master-01-demo NotReady master,worker 19h v1.23.5+012e945
# master-02-demo Ready master,worker 19h v1.23.5+012e945
# master-03-demo Ready master,worker 19h v1.23.5+012e945
oc get co
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# authentication 4.10.26 True False False 8m5s
# baremetal 4.10.26 True False False 19h
# cloud-controller-manager 4.10.26 True False False 19h
# cloud-credential 4.10.26 True False False 19h
# cluster-autoscaler 4.10.26 True False False 19h
# config-operator 4.10.26 True False False 19h
# console 4.10.26 True False False 14m
# csi-snapshot-controller 4.10.26 True False False 19h
# dns 4.10.26 True True False 164m DNS "default" reports Progressing=True: "Have 2 available node-resolver pods, want 3."
# etcd 4.10.26 True False True 19h ClusterMemberControllerDegraded: unhealthy members found during reconciling members...
# image-registry 4.10.26 True False False 19h
# ingress 4.10.26 True False False 141m
# insights 4.10.26 True False False 93s
# kube-apiserver 4.10.26 True False True 19h NodeControllerDegraded: The master nodes not ready: node "master-01-demo" not ready since 2022-09-15 03:33:40 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
# kube-controller-manager 4.10.26 True False True 19h NodeControllerDegraded: The master nodes not ready: node "master-01-demo" not ready since 2022-09-15 03:33:40 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
# kube-scheduler 4.10.26 True False True 19h NodeControllerDegraded: The master nodes not ready: node "master-01-demo" not ready since 2022-09-15 03:33:40 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
# kube-storage-version-migrator 4.10.26 True False False 82m
# machine-api 4.10.26 True False False 19h
# machine-approver 4.10.26 True False False 19h
# machine-config 4.10.26 True False False 12h
# marketplace 4.10.26 True False False 19h
# monitoring 4.10.26 True False False 19h
# network 4.10.26 True True False 19h DaemonSet "openshift-multus/multus" is not available (awaiting 1 nodes)...
# node-tuning 4.10.26 True False False 19h
# openshift-apiserver 4.10.26 True False False 8m
# openshift-controller-manager 4.10.26 True False False 19h
# openshift-samples 4.10.26 True False False 19h
# operator-lifecycle-manager 4.10.26 True False False 19h
# operator-lifecycle-manager-catalog 4.10.26 True False False 19h
# operator-lifecycle-manager-packageserver 4.10.26 True False False 142m
# service-ca 4.10.26 True False False 19h
# storage 4.10.26 True False False 19h
关闭了一个节点,集群还能工作。
After shutting down a node, the cluster still works.
看看web console能否使用? / See if the web console can be used?
test with ocp upgrade
我们上传的镜像,包括了4.10.26, 4.10.28两个版本,那么我们就来试试升级吧
The images we uploaded include two versions: 4.10.26 and 4.10.28, so let's try to upgrade
oc get node
# NAME STATUS ROLES AGE VERSION
# master-01-demo Ready master,worker 19h v1.23.5+012e945
# master-02-demo Ready master,worker 19h v1.23.5+012e945
# master-03-demo Ready master,worker 19h v1.23.5+012e945
oc get clusterversion
# NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
# version 4.10.26 True False 19h Cluster version is 4.10.26
# upgrade ocp to 4.10.28
oc adm upgrade \
--to-image=quay.io/openshift-release-dev/ocp-release@sha256:2127608ebd67a2470860c42368807a0de2308dba144ec4c298bec1c03d79cb52 \
--allow-explicit-upgrade --allow-upgrade-with-warnings=true --force=true
# after upgrade
oc get clusterversion
# NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
# version 4.10.28 True False 43m Cluster version is 4.10.28
oc get co
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# authentication 4.10.28 True False False 62m
# baremetal 4.10.28 True False False 21h
# cloud-controller-manager 4.10.28 True False False 21h
# cloud-credential 4.10.28 True False False 22h
# cluster-autoscaler 4.10.28 True False False 21h
# config-operator 4.10.28 True False False 21h
# console 4.10.28 True False False 148m
# csi-snapshot-controller 4.10.28 True False False 21h
# dns 4.10.28 True False False 4h58m
# etcd 4.10.28 True False False 21h
# image-registry 4.10.28 True False False 21h
# ingress 4.10.28 True False False 4h35m
# insights 4.10.28 True False False 81s
# kube-apiserver 4.10.28 True False False 21h
# kube-controller-manager 4.10.28 True False False 21h
# kube-scheduler 4.10.28 True False False 21h
# kube-storage-version-migrator 4.10.28 True False False 54m
# machine-api 4.10.28 True False False 21h
# machine-approver 4.10.28 True False False 21h
# machine-config 4.10.28 True False False 129m
# marketplace 4.10.28 True False False 21h
# monitoring 4.10.28 True False False 21h
# network 4.10.28 True False False 21h
# node-tuning 4.10.28 True False False 100m
# openshift-apiserver 4.10.28 True False False 142m
# openshift-controller-manager 4.10.28 True False False 21h
# openshift-samples 4.10.28 True False False 98m
# operator-lifecycle-manager 4.10.28 True False False 21h
# operator-lifecycle-manager-catalog 4.10.28 True False False 21h
# operator-lifecycle-manager-packageserver 4.10.28 True False False 4h36m
# service-ca 4.10.28 True False False 21h
# storage 4.10.28 True False False 21h
oc get mcp
# NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
# master rendered-master-24f4773e2eb47a6524572c1e7185e836 True False False 3 3 3 0 21h
# worker rendered-worker-28261f188bfcb7348c5f6aab2e876b2e True False False 0 0 0 0 21h
rpm-ostree status
# State: idle
# Deployments:
# ● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:822737b305b28aa4890f7bf847ebebc896cd7b549318195fc8c953ae3008cc44
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208161501-0 (2022-08-16T15:04:45Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor pdns-selinux
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor pdns-selinux
我们可以看到,升级成功,各个后安装的软件包也都在。
We can see that the upgrade is successful, and all the installed packages are also there.
web console工作也正常。 / The web console works fine too.
finished
notes
research
yum install -y pdns pdns-recursor
mv /etc/pdns/pdns.conf /etc/pdns/pdns.conf.bak
cat << EOF > /etc/pdns/pdns.conf
launch=bind
local-address=127.0.0.1
local-port=5301
setgid=pdns
setuid=pdns
bind-config=/etc/pdns/bind.conf
bind-check-interval=300
enable-lua-records=yes
EOF
cat << EOF > /etc/pdns/bind.conf
zone "ocp4.redhat.ren" { type master; file "/etc/pdns/inside-out.xyz"; };
EOF
cat << 'EOF' > /etc/pdns/inside-out.xyz
$TTL 180
@ IN SOA ns1.ocp4.redhat.ren. postmaster.ocp4.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
IN NS ns1.ocp4.redhat.ren.
IN NS ns2.ocp4.redhat.ren.
@ IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
ns1 IN A 8.8.8.8
ns2 IN A 8.8.4.4
helper IN A 192.168.7.11
;
;
; The api points to the IP of your load balancer
api IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
api-int IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;
; The wildcard also points to the load balancer
*.apps IN LUA A "ifportup(443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;
; Create entry for the bootstrap host
; bootstrap IN A 192.168.7.12
;
; Create entries for the master hosts
master-0 IN A 192.168.7.13
master-1 IN A 192.168.7.14
master-2 IN A 192.168.7.15
;
; Create entries for the worker hosts
worker-0 IN A 192.168.7.16
worker-1 IN A 192.168.7.17
worker-2 IN A 192.168.7.18
;
; The ETCd cluster lives on the masters...so point these to the IP of the masters
etcd-0 IN A 192.168.7.13
etcd-1 IN A 192.168.7.14
etcd-2 IN A 192.168.7.15
;
; Create entries for the other hosts
registry IN LUA A "ifportup(5443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
yum IN A 192.168.7.1
quay IN LUA A "ifportup(5443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
nexus IN A 192.168.7.1
git IN A 192.168.7.11
tmp-registry IN A 192.168.7.177
EOF
# ausearch -c 'pdns_server' --raw | audit2allow -M my-pdnsserver
# semodule -X 300 -i my-pdnsserver.pp
# SELinux is preventing /usr/sbin/pdns_server from name_connect access on the tcp_socket port 6443.
# ***** Plugin connect_ports (92.2 confidence) suggests *********************
# If you want to allow /usr/sbin/pdns_server to connect to network port 6443
# Then you need to modify the port type.
# Do
# # semanage port -a -t PORT_TYPE -p tcp 6443
# where PORT_TYPE is one of the following: dns_port_t, dnssec_port_t, kerberos_port_t, ocsp_port_t.
# ***** Plugin catchall_boolean (7.83 confidence) suggests ******************
# If you want to allow system to run with NIS
# Then you must tell SELinux about this by enabling the 'nis_enabled' boolean.
# Do
# setsebool -P nis_enabled 1
# ***** Plugin catchall (1.41 confidence) suggests **************************
# If you believe that pdns_server should be allowed name_connect access on the port 6443 tcp_socket by default.
# Then you should report this as a bug.
# You can generate a local policy module to allow this access.
# Do
# allow this access for now by executing:
# # ausearch -c 'pdns/distributo' --raw | audit2allow -M my-pdnsdistributo
# # semodule -X 300 -i my-pdnsdistributo.pp
systemctl enable --now pdns
pdnsutil check-all-zones
mv /etc/pdns-recursor/recursor.conf /etc/pdns-recursor/recursor.conf.bak
cat << EOF > /etc/pdns-recursor/recursor.conf
local-address=0.0.0.0 ::
allow-from=192.168.7.0/0 #允许所有用户端请求
dnssec=off #关闭dnssec
forward-zones=ocp4.redhat.ren=127.0.0.1:5301
forward-zones-recurse=.=114.114.114.114
setgid=pdns-recursor
setuid=pdns-recursor
security-poll-suffix=
EOF
systemctl enable --now pdns-recursor
ausearch -m avc --start recent -i
audit2allow -a -M wzh-pdns
semodule -i wzh-pdns.pp
systemctl restart pdns
dig @127.0.0.1 helper.ocp4.redhat.ren
dig @127.0.0.1 api.ocp4.redhat.ren
dig @127.0.0.1 c.apps.ocp4.redhat.ren
dig @127.0.0.1 registry.ocp4.redhat.ren
test stand alone
dnf install -y epel-release
dnf install -y pdns pdns-recursor
dnf update -y
semodule -i wzh-pdns.pp
cat << EOF > /etc/pdns/pdns.conf
launch=bind
local-address=0.0.0.0 ::
# local-port=5301
setgid=pdns
setuid=pdns
bind-config=/etc/pdns/bind.conf
bind-check-interval=300
enable-lua-records=yes
EOF
cat << EOF > /etc/pdns/bind.conf
zone "ocp4.redhat.ren" { type master; file "/etc/pdns/inside-out.xyz"; };
EOF
cat << 'EOF' > /etc/pdns/inside-out.xyz
$TTL 180
@ IN SOA ns1.ocp4.redhat.ren. postmaster.ocp4.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
IN NS ns1.ocp4.redhat.ren.
IN NS ns2.ocp4.redhat.ren.
@ IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
ns1 IN A 8.8.8.8
ns2 IN A 8.8.4.4
helper IN A 192.168.7.11
;
;
; The api points to the IP of your load balancer
api IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
api-int IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;
; The wildcard also points to the load balancer
*.apps IN LUA A "ifportup(443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;
; Create entry for the bootstrap host
; bootstrap IN A 192.168.7.12
;
; Create entries for the master hosts
master-0 IN A 192.168.7.13
master-1 IN A 192.168.7.14
master-2 IN A 192.168.7.15
;
; Create entries for the worker hosts
worker-0 IN A 192.168.7.16
worker-1 IN A 192.168.7.17
worker-2 IN A 192.168.7.18
;
; The ETCd cluster lives on the masters...so point these to the IP of the masters
etcd-0 IN A 192.168.7.13
etcd-1 IN A 192.168.7.14
etcd-2 IN A 192.168.7.15
;
; Create entries for the other hosts
registry IN LUA A "ifportup(5443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
yum IN A 192.168.7.1
quay IN LUA A "ifportup(5443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
nexus IN A 192.168.7.1
git IN A 192.168.7.11
tmp-registry IN A 192.168.7.177
EOF
systemctl enable --now pdns
dig @127.0.0.1 helper.ocp4.redhat.ren
dig @127.0.0.1 api.ocp4.redhat.ren
dig @127.0.0.1 c.apps.ocp4.redhat.ren
dig @127.0.0.1 registry.ocp4.redhat.ren
test install
======================================================================================================================================================================================================
Package Architecture Version Repository Size
======================================================================================================================================================================================================
Installing:
pdns x86_64 4.6.2-1.el8 epel 3.7 M
pdns-recursor x86_64 4.3.6-1.el8 epel 2.0 M
Installing dependencies:
boost-context x86_64 1.66.0-10.el8 appstream 15 k
boost-program-options x86_64 1.66.0-10.el8 appstream 140 k
libsodium x86_64 1.0.18-2.el8 epel 162 k
luajit x86_64 2.1.0-0.16beta3.el8 epel 359 k
protobuf x86_64 3.5.0-13.el8 appstream 892 k
Transaction Summary
======================================================================================================================================================================================================
Install 7 Packages
registry
cat << EOF > /usr/lib/systemd/system/docker-distribution.service
[Unit]
Description=v2 Registry server for Docker
After=network.target
[Service]
Type=simple
ExecStart=/usr/bin/registry serve /etc/wzh/registry-config.yml
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
mkdir -p /etc/wzh
cat << EOF > /etc/wzh/registry-config.yml
version: 0.1
log:
accesslog:
disabled: true
fields:
service: registry
storage:
cache:
layerinfo: inmemory
filesystem:
rootdirectory: /var/wzh-registry
delete:
enabled: false
maintenance:
readonly:
enabled: true
http:
addr: :5443
tls:
certificate: /etc/wzh/redhat.ren.crt
key: /etc/wzh/redhat.ren.key
EOF
# 配置registry
export VAR_CERT_DIR=/etc/wzh/
mkdir -p ${VAR_CERT_DIR} && cd ${VAR_CERT_DIR}
# https://access.redhat.com/documentation/en-us/red_hat_codeready_workspaces/2.1/html/installation_guide/installing-codeready-workspaces-in-tls-mode-with-self-signed-certificates_crw
openssl genrsa -out ${VAR_CERT_DIR}/redhat.ren.ca.key 4096
openssl req -x509 \
-new -nodes \
-key ${VAR_CERT_DIR}/redhat.ren.ca.key \
-sha256 \
-days 36500 \
-out ${VAR_CERT_DIR}/redhat.ren.ca.crt \
-subj /CN="Local Red Hat Ren Signer" \
-reqexts SAN \
-extensions SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf '[SAN]\nbasicConstraints=critical, CA:TRUE\nkeyUsage=keyCertSign, cRLSign, digitalSignature'))
openssl genrsa -out ${VAR_CERT_DIR}/redhat.ren.key 2048
openssl req -new -sha256 \
-key ${VAR_CERT_DIR}/redhat.ren.key \
-subj "/O=Local Red Hat Ren /CN=*.ocp4.redhat.ren" \
-reqexts SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf "\n[SAN]\nsubjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth")) \
-out ${VAR_CERT_DIR}/redhat.ren.csr
openssl x509 \
-req \
-sha256 \
-extfile <(printf "subjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth") \
-days 36500 \
-in ${VAR_CERT_DIR}/redhat.ren.csr \
-CA ${VAR_CERT_DIR}/redhat.ren.ca.crt \
-CAkey ${VAR_CERT_DIR}/redhat.ren.ca.key \
-CAcreateserial -out ${VAR_CERT_DIR}/redhat.ren.crt
openssl x509 -in ${VAR_CERT_DIR}/redhat.ren.crt -text
/bin/cp -f ${VAR_CERT_DIR}/redhat.ren.ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
cat << EOF >> /etc/hosts
127.0.0.1 registry.redhat.ren
EOF
mkdir -p /var/wzh-registry
systemctl restart docker-distribution
podman for pdns & registrty
mkdir -p /data/pdns/conf
cd /data/pdns
cat > /data/pdns/pdns.Dockerfile << EOF
FROM docker.io/library/almalinux:8
RUN dnf -y install epel-release
RUN dnf -y update
RUN dnf -y install pdns pdns-recursor
ENTRYPOINT ["/usr/sbin/pdns_server"]
CMD ["--socket-dir=/tmp/pdns", "--guardian=no", "--daemon=no", "--disable-syslog", "--log-timestamp=no", "--write-pid=no"]
EOF
podman build --squash -t quay.io/nepdemo/pdns:4.6.2-alma8 -f pdns.Dockerfile .
podman push quay.io/nepdemo/pdns:4.6.2-alma8
cat > /data/pdns/pdns.Dockerfile << EOF
FROM registry.access.redhat.com/ubi8
RUN dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
RUN dnf -y update
RUN dnf -y install pdns pdns-recursor
ENTRYPOINT ["/usr/sbin/pdns_server"]
CMD ["--socket-dir=/tmp/pdns", "--guardian=no", "--daemon=no", "--disable-syslog", "--log-timestamp=no", "--write-pid=no"]
EOF
podman build --squash -t quay.io/nepdemo/pdns:4.6.2-ubi8 -f pdns.Dockerfile .
podman push quay.io/nepdemo/pdns:4.6.2-ubi8
cat > /data/pdns/conf/pdns.conf << EOF
launch=bind
local-address=0.0.0.0
local-port=53
setgid=pdns
setuid=pdns
bind-config=/etc/pdns/bind.conf
bind-check-interval=300
enable-lua-records=yes
EOF
cat > /data/pdns/conf/bind.conf << EOF
zone "acm-demo-hub.redhat.ren" { type master; file "/etc/pdns/inside-out.xyz"; };
zone "infra.redhat.ren" { type master; file "/etc/pdns/infra.xyz"; };
EOF
cat > /data/pdns/conf/inside-out.xyz << 'EOF'
$TTL 10
@ IN SOA ns1.acm-demo-hub.redhat.ren. postmaster.acm-demo-hub.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN A 192.168.7.13
;ns1 IN A 8.8.8.8
;ns2 IN A 8.8.4.4
helper IN A 192.168.7.11
;
;
; The api points to the IP of your load balancer
api IN A 192.168.7.13
api-int IN A 192.168.7.13
;
; The wildcard also points to the load balancer
*.apps IN A 192.168.7.13
;
; Create entry for the bootstrap host
; bootstrap IN A 192.168.7.12
;
; Create entries for the master hosts
master-0 IN A 192.168.7.13
;master-1 IN A 192.168.7.14
;master-2 IN A 192.168.7.15
;
; Create entries for the worker hosts
;worker-0 IN A 192.168.7.16
;worker-1 IN A 192.168.7.17
;worker-2 IN A 192.168.7.18
;
; The ETCd cluster lives on the masters...so point these to the IP of the masters
;etcd-0 IN A 192.168.7.13
;etcd-1 IN A 192.168.7.14
;etcd-2 IN A 192.168.7.15
;
; Create entries for the other hosts
registry IN A 192.168.7.13
yum IN A 192.168.7.1
nexus IN A 192.168.7.1
git IN A 192.168.7.11
tmp-registry IN A 192.168.7.177
EOF
cat > /data/pdns/conf/infra.xyz << 'EOF'
$TTL 10
@ IN SOA ns1.infra.redhat.ren. postmaster.infra.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN A 192.168.7.13
quay IN LUA A "ifportup(5180, {'158.247.225.4', '192.168.7.14', '192.168.7.15'})"
quaylab IN A 192.168.7.13
EOF
rm -f /tmp/pdns-*
podman run \
--name local-pdns \
--network=host \
-v /data/pdns/conf/:/etc/pdns/:z \
--conmon-pidfile /tmp/pdns-pid \
--cidfile /tmp/pdns-cid \
--cgroups=no-conmon \
--replace \
quay.io/nepdemo/pdns:4.6.2-ubi8
/usr/bin/podman stop --ignore --cidfile /tmp/pdns-cid -t 1
registry
cat << EOF > ${BASE_DIR}/data/sno/registry.images.bu
variant: openshift
version: 4.10.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-registry
storage:
files:
- path: /etc/wzh/redhat.ren.crt
overwrite: true
contents:
source: data:text/plain;charset=utf-8;base64,$( base64 -w 0 < ${VAR_CERT_DIR}/redhat.ren.crt )
mode: 420
user:
name: root
- path: /etc/wzh/redhat.ren.key
overwrite: true
contents:
source: data:text/plain;charset=utf-8;base64,$( base64 -w 0 < ${VAR_CERT_DIR}/redhat.ren.key )
mode: 420
user:
name: root
- path: /etc/wzh/registry-config.yml
overwrite: true
contents:
inline: |
version: 0.1
log:
accesslog:
disabled: true
fields:
service: registry
storage:
cache:
layerinfo: inmemory
filesystem:
rootdirectory: /var/wzh-registry
delete:
enabled: true
maintenance:
readonly:
enabled: false
http:
addr: :8443
tls:
certificate: /etc/wzh/redhat.ren.crt
key: /etc/wzh/redhat.ren.key
mode: 420
user:
name: root
systemd:
units:
- contents: |
[Unit]
Description=Set SELinux chcon for image registry
Before=docker-distribution.service
[Service]
Type=oneshot
RemainAfterExit=yes
User=root
ExecStartPre=-mkdir -p /var/wzh-registry
ExecStart=/usr/bin/chcon -Rt container_file_t /var/wzh-registry
[Install]
WantedBy=multi-user.target
enabled: true
name: hostpath-registry.service
- contents: |
[Unit]
Description=v2 Registry server for Docker
After=network.target hostpath-registry.service
Requires=hostpath-registry.service
Before=kubelet.service
[Service]
Type=simple
TimeoutStartSec=5m
ExecStartPre=-/bin/rm -f %t/%n-pid %t/%n-cid
ExecStart=/usr/bin/podman run \
--name local-registry \
--network=host \
-v /var/wzh-registry/:/var/lib/registry:z \
-v /etc/wzh:/certs:z \
-e REGISTRY_HTTP_ADDR=0.0.0.0:8443 \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/redhat.ren.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/redhat.ren.key \
--conmon-pidfile %t/%n-pid \
--cidfile %t/%n-cid \
--cgroups=no-conmon \
--replace \
docker.io/library/registry:2
ExecStop=-/usr/bin/podman stop --ignore --cidfile %t/%n-cid -t 1
ExecStopPost=-/usr/bin/podman rm --ignore -f --cidfile %t/%n-cid
PIDFile=%t/%n-pid
KillMode=none
Restart=always
RestartSec=30
[Install]
WantedBy=multi-user.target
enabled: true
name: docker-distribution.service
- name: kubelet.service
dropins:
- name: 99-after-registry.conf
contents: |
[Unit]
Requires=docker-distribution.service
After=docker-distribution.service
EOF
butane ${BASE_DIR}/data/sno/registry.images.bu > ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
oc create --save-config -f ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
# oc apply -f ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
# oc delete -f ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
pdns
cat > ${BASE_DIR}/data/sno/pdns.bu << 'EOF'
variant: openshift
version: 4.10.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-pdns
storage:
files:
- path: /etc/pdns/pdns.conf
overwrite: true
contents:
inline: |
launch=bind
local-address=0.0.0.0
local-port=53
setgid=pdns
setuid=pdns
bind-config=/etc/pdns/bind.conf
bind-check-interval=300
enable-lua-records=yes
mode: 420
user:
name: root
- path: /etc/pdns/bind.conf
overwrite: true
contents:
inline: |
zone "acm-demo-hub.redhat.ren" { type master; file "/etc/pdns/inside-out.xyz"; };
zone "infra.redhat.ren" { type master; file "/etc/pdns/infra.xyz"; };
mode: 420
user:
name: root
- path: /etc/pdns/inside-out.xyz
overwrite: true
contents:
inline: |
$TTL 10
@ IN SOA ns1.acm-demo-hub.redhat.ren. postmaster.acm-demo-hub.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN A 192.168.7.13
;ns1 IN A 8.8.8.8
;ns2 IN A 8.8.4.4
helper IN A 192.168.7.11
;
;
; The api points to the IP of your load balancer
api IN A 192.168.7.13
api-int IN A 192.168.7.13
;
; The wildcard also points to the load balancer
*.apps IN A 192.168.7.13
;
; Create entry for the bootstrap host
; bootstrap IN A 192.168.7.12
;
; Create entries for the master hosts
master-0 IN A 192.168.7.13
;master-1 IN A 192.168.7.14
;master-2 IN A 192.168.7.15
;
; Create entries for the worker hosts
;worker-0 IN A 192.168.7.16
;worker-1 IN A 192.168.7.17
;worker-2 IN A 192.168.7.18
;
; The ETCd cluster lives on the masters...so point these to the IP of the masters
;etcd-0 IN A 192.168.7.13
;etcd-1 IN A 192.168.7.14
;etcd-2 IN A 192.168.7.15
;
; Create entries for the other hosts
registry IN A 192.168.7.13
yum IN A 192.168.7.1
nexus IN A 192.168.7.1
git IN A 192.168.7.11
tmp-registry IN A 192.168.7.177
mode: 420
user:
name: root
- path: /etc/pdns/infra.xyz
overwrite: true
contents:
inline: |
$TTL 10
@ IN SOA ns1.infra.redhat.ren. postmaster.infra.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN A 192.168.7.13
quay IN A 192.168.7.13
quaylab IN A 192.168.7.13
mode: 420
user:
name: root
systemd:
units:
- contents: |
[Unit]
Description=PowerDNS Authoritative Server
After=network.target
Before=kubelet.service
[Service]
Type=simple
TimeoutStartSec=5m
ExecStartPre=-/bin/rm -f %t/%n-pid %t/%n-cid
ExecStart=/usr/bin/podman run \
--name local-pdns \
--network=host \
-v /etc/pdns/:/etc/pdns/:z \
--conmon-pidfile %t/%n-pid \
--cidfile %t/%n-cid \
--cgroups=no-conmon \
--replace \
quay.io/nepdemo/pdns:4.6.2-ubi8
ExecStop=-/usr/bin/podman stop --ignore --cidfile %t/%n-cid -t 1
ExecStopPost=-/usr/bin/podman rm --ignore -f --cidfile %t/%n-cid
PIDFile=%t/%n-pid
KillMode=none
Restart=always
SyslogIdentifier=pdns_server
User=pdns
Group=pdns
RestartSec=1
StartLimitInterval=0
RuntimeDirectory=pdns
[Install]
WantedBy=multi-user.target
name: pdns.service
enabled: true
- name: kubelet.service
dropins:
- name: 99-after-pdns.conf
contents: |
[Unit]
Requires=pdns.service
After=pdns.service
EOF
butane ${BASE_DIR}/data/sno/pdns.bu > ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
oc create --save-config -f ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
# oc apply -f ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml