Openshift4, step by step
This repository holds the author's technical notes from day-to-day systems work. That work includes many PoCs, new-system validations, and solution explorations, so there are plenty of hands-on opportunities covering OS installation, IaaS and PaaS platform builds, middleware validation, and application development and verification. Many of the procedures are fairly involved, so a central place to record them makes it easier to keep the notes organized and to share them online right away.
The author also built a Chrome extension that shows bing.com's daily image on the new tab page. It is simple and good looking; feel free to use it.
The author also publishes many video demos; you are welcome to subscribe to the channel.
License
The code in this book is released under the GNU v3 license.
Copyright
This book follows the CC-BY-NC-SA 4.0 license. Commercial redistribution requires the prior permission of the author, wangzheng422, and any reproduction must credit the source. The author reserves the right of final interpretation and the right to take legal action.
Get a free OpenShift4 download key
4.6 disconnected install, media preparation
The steps in this document are best run on a VPS in the US, then packaged up and transferred back.
The steps to prepare the offline installation source are:
- prepare the operator hub catalog; what we mainly need from it is the date information
- run the script that builds the offline installation source
Environment preparation
# on vultr
yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum -y install htop byobu ethtool dstat
rm -rf /data/ocp4
mkdir -p /data/ocp4
cd /data/ocp4
yum -y install podman docker-distribution pigz skopeo docker buildah jq python3-pip git python36
pip3 install yq
# https://blog.csdn.net/ffzhihua/article/details/85237411
# wget http://mirror.centos.org/centos/7/os/x86_64/Packages/python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm
# rpm2cpio python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm | cpio -iv --to-stdout ./etc/rhsm/ca/redhat-uep.pem | tee /etc/rhsm/ca/redhat-uep.pem
systemctl enable --now docker
# systemctl start docker
docker login -u ****** -p ******** registry.redhat.io
docker login -u ****** -p ******** registry.access.redhat.com
docker login -u ****** -p ******** registry.connect.redhat.com
podman login -u ****** -p ******** registry.redhat.io
podman login -u ****** -p ******** registry.access.redhat.com
podman login -u ****** -p ******** registry.connect.redhat.com
# to download the pull-secret.json, open following link
# https://cloud.redhat.com/openshift/install/metal/user-provisioned
cat << 'EOF' > /data/pull-secret.json
{"auths":{"cloud.openshift.com":*********************
EOF
cat << EOF >> /etc/hosts
127.0.0.1 registry.redhat.ren
EOF
# configure the registry
mkdir -p /etc/crts/ && cd /etc/crts
# https://access.redhat.com/documentation/en-us/red_hat_codeready_workspaces/2.1/html/installation_guide/installing-codeready-workspaces-in-tls-mode-with-self-signed-certificates_crw
openssl genrsa -out /etc/crts/redhat.ren.ca.key 4096
openssl req -x509 \
-new -nodes \
-key /etc/crts/redhat.ren.ca.key \
-sha256 \
-days 36500 \
-out /etc/crts/redhat.ren.ca.crt \
-subj /CN="Local Red Hat Ren Signer" \
-reqexts SAN \
-extensions SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf '[SAN]\nbasicConstraints=critical, CA:TRUE\nkeyUsage=keyCertSign, cRLSign, digitalSignature'))
openssl genrsa -out /etc/crts/redhat.ren.key 2048
openssl req -new -sha256 \
-key /etc/crts/redhat.ren.key \
-subj "/O=Local Red Hat Ren /CN=*.ocp4.redhat.ren" \
-reqexts SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf "\n[SAN]\nsubjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth")) \
-out /etc/crts/redhat.ren.csr
openssl x509 \
-req \
-sha256 \
-extfile <(printf "subjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth") \
-days 36500 \
-in /etc/crts/redhat.ren.csr \
-CA /etc/crts/redhat.ren.ca.crt \
-CAkey /etc/crts/redhat.ren.ca.key \
-CAcreateserial -out /etc/crts/redhat.ren.crt
openssl x509 -in /etc/crts/redhat.ren.crt -text
/bin/cp -f /etc/crts/redhat.ren.ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
cd /data/ocp4
# systemctl stop docker-distribution
/bin/rm -rf /data/registry
mkdir -p /data/registry
cat << EOF > /etc/docker-distribution/registry/config.yml
version: 0.1
log:
fields:
service: registry
storage:
cache:
layerinfo: inmemory
filesystem:
rootdirectory: /data/registry
delete:
enabled: true
http:
addr: :5443
tls:
certificate: /etc/crts/redhat.ren.crt
key: /etc/crts/redhat.ren.key
compatibility:
schema1:
enabled: true
EOF
# systemctl restart docker
# systemctl enable docker-distribution
# systemctl restart docker-distribution
# podman login registry.redhat.ren:5443 -u a -p a
systemctl enable --now docker-distribution
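# a quick sanity check (sketch): the registry should answer over TLS with the CA we just trusted
# curl https://registry.redhat.ren:5443/v2/_catalog
# expected on a fresh registry: {"repositories":[]}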
operator hub catalog
mkdir -p /data/ocp4
cd /data/ocp4
export BUILDNUMBER=4.6.28
wget -O openshift-client-linux-${BUILDNUMBER}.tar.gz https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-client-linux-${BUILDNUMBER}.tar.gz
wget -O openshift-install-linux-${BUILDNUMBER}.tar.gz https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-install-linux-${BUILDNUMBER}.tar.gz
tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/sbin/
tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/sbin/
wget -O operator.sh https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.6/scripts/operator.sh
bash operator.sh
# 2021.05.07.0344
Build the offline installation source
rm -rf /data/ocp4
mkdir -p /data/ocp4
cd /data/ocp4
# wget -O build.dist.sh https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.6/scripts/build.dist.sh
# bash build.dist.sh
wget -O prepare.offline.content.sh https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.6/scripts/prepare.offline.content.sh
# git clone https://github.com/wangzheng422/docker_env.git
# cd docker_env
# git checkout dev
# cp redhat/ocp4/4.6/scripts/prepare.offline.content.sh /data/ocp4/
# cd /data/ocp4
# rm -rf docker_env
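# arguments, as used in this document: -v the ocp release version to mirror,
# -m the minor version, -h the operator catalog date printed by operator.sh above
# (this mapping is inferred from the values used here, not from the script's own docs)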
bash prepare.offline.content.sh -v 4.6.28 -m 4.6 -h 2021.05.07.0344
output of the image mirroring
Success
Update image: registry.redhat.ren:5443/ocp4/openshift4:4.6.5
Mirror prefix: registry.redhat.ren:5443/ocp4/openshift4
To use the new mirrored repository to install, add the following section to the install-config.yaml:
imageContentSources:
- mirrors:
- registry.redhat.ren:5443/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- registry.redhat.ren:5443/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
To use the new mirrored repository for upgrades, use the following to create an ImageContentSourcePolicy:
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
name: example
spec:
repositoryDigestMirrors:
- mirrors:
- registry.redhat.ren:5443/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- registry.redhat.ren:5443/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
########################################
##
Success
Update image: openshift/release:4.3.3
To upload local images to a registry, run:
oc image mirror --from-dir=/data/mirror_dir file://openshift/release:4.3.3* REGISTRY/REPOSITORY
download images for components
########################################
# your images
cd /data/ocp4/
export MIRROR_DIR='/data/install.image'
/bin/rm -rf ${MIRROR_DIR}
bash add.image.sh install.image.list ${MIRROR_DIR}
export MIRROR_DIR='/data/poc.image'
/bin/rm -rf ${MIRROR_DIR}
bash add.image.sh poc.image.list ${MIRROR_DIR}
########################################
# common function: given an "<image> <operator>.<version>" list file, pick the
# newest version of the named operator and append its image lines to the output list
build_image_list() {
  VAR_INPUT_FILE=$1    # input list file
  VAR_OUTPUT_FILE=$2   # output list file (appended to)
  VAR_OPERATOR=$3      # name of the operator to select
  # newest version string of this operator (lexical sort, last entry wins)
  VAR_FINAL=`cat $VAR_INPUT_FILE | grep $VAR_OPERATOR | awk '{if ($2) print $2;}' | sort | uniq | tail -1`
  echo $VAR_FINAL
  # append every image line that belongs to that version
  cat $VAR_INPUT_FILE | grep $VAR_FINAL | awk '{if ($2) print $1;}' >> $VAR_OUTPUT_FILE
}
########################################
# redhat operator hub
export MIRROR_DIR='/data/redhat-operator'
/bin/rm -rf ${MIRROR_DIR}
/bin/rm -f /data/ocp4/mapping-redhat.list
wanted_operator_list=$(cat redhat-operator-image.list | awk '{if ($2) print $2;}' \
| sed 's/\..*//g' | sort | uniq
)
while read -r line; do
build_image_list '/data/ocp4/redhat-operator-image.list' '/data/ocp4/mapping-redhat.list' $line
done <<< "$wanted_operator_list"
bash add.image.sh mapping-redhat.list ${MIRROR_DIR}
# /bin/cp -f pull.add.image.failed.list pull.add.image.failed.list.bak
# bash add.image.resume.sh pull.add.image.failed.list.bak ${MIRROR_DIR}
cd ${MIRROR_DIR%/*}
tar cf - ${MIRROR_DIR##*/}/ | pigz -c > ${MIRROR_DIR##*/}.tgz
# to load image back
bash add.image.load.sh '/data/redhat-operator' 'registry.redhat.ren:5443'
######################################
# certified operator hub
export MIRROR_DIR='/data/certified-operator'
/bin/rm -rf ${MIRROR_DIR}
/bin/rm -f /data/ocp4/mapping-certified.list
wanted_operator_list=$(cat certified-operator-image.list | awk '{if ($2) print $2;}' \
| sed 's/\..*//g' | sort | uniq
)
while read -r line; do
build_image_list '/data/ocp4/certified-operator-image.list' '/data/ocp4/mapping-certified.list' $line
done <<< "$wanted_operator_list"
bash add.image.sh mapping-certified.list ${MIRROR_DIR}
# /bin/cp -f pull.add.image.failed.list pull.add.image.failed.list.bak
# bash add.image.resume.sh pull.add.image.failed.list.bak ${MIRROR_DIR}
cd ${MIRROR_DIR%/*}
tar cf - ${MIRROR_DIR##*/}/ | pigz -c > ${MIRROR_DIR##*/}.tgz
# bash add.image.sh mapping-certified.txt
#######################################
# community operator hub
export MIRROR_DIR='/data/community-operator'
/bin/rm -rf ${MIRROR_DIR}
/bin/rm -f /data/ocp4/mapping-community.list
wanted_operator_list=$(cat community-operator-image.list | awk '{if ($2) print $2;}' \
| sed 's/\..*//g' | sort | uniq
)
while read -r line; do
build_image_list '/data/ocp4/community-operator-image.list' '/data/ocp4/mapping-community.list' $line
done <<< "$wanted_operator_list"
bash add.image.sh mapping-community.list ${MIRROR_DIR}
# /bin/cp -f pull.add.image.failed.list pull.add.image.failed.list.bak
# bash add.image.resume.sh pull.add.image.failed.list.bak ${MIRROR_DIR}
cd ${MIRROR_DIR%/*}
tar cf - ${MIRROR_DIR##*/}/ | pigz -c > ${MIRROR_DIR##*/}.tgz
# bash add.image.sh mapping-community.txt
# to load image back
bash add.image.load.sh '/data/community-operator' 'registry.redhat.ren:5443'
#####################################
# samples operator
export MIRROR_DIR='/data/is.samples'
/bin/rm -rf ${MIRROR_DIR}
bash add.image.sh is.openshift.list ${MIRROR_DIR}
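# to load the samples images back into the mirror registry later, the same
# add.image.load.sh pattern used above should work (a sketch, not run here):
# bash add.image.load.sh '/data/is.samples' 'registry.redhat.ren:5443'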
Image registry proxy
Preparing an offline image registry is quite tedious. Fortunately we have one host with internet access, so we can use nexus to build an image registry proxy: run the PoC once on the connected environment, and the offline images can then be obtained from the image registry proxy.
- https://mtijhof.wordpress.com/2018/07/23/using-nexus-oss-as-a-proxy-cache-for-docker-images/
#####################################################
# init build the nexus fs
mkdir -p /data/ccn/nexus-image
chown -R 200 /data/ccn/nexus-image
# podman run -d -p 8082:8081 -p 8083:8083 -it --name nexus-image -v /data/ccn/nexus-image:/nexus-data:Z docker.io/sonatype/nexus3:3.29.0
podman run -d -p 8082:8081 -p 8083:8083 -it --name nexus-image -v /data/ccn/nexus-image:/nexus-data:Z docker.io/wangzheng422/imgs:nexus3-3.29.0-wzh
podman stop nexus-image
podman rm nexus-image
# get the admin password
cat /data/ccn/nexus-image/admin.password && echo
# 84091bcd-c82f-44a3-8b7b-dfc90f5b7da1
# open http://nexus.ocp4.redhat.ren:8082
# enable https
# https://blog.csdn.net/s7799653/article/details/105378645
# https://help.sonatype.com/repomanager3/system-configuration/configuring-ssl#ConfiguringSSL-InboundSSL-ConfiguringtoServeContentviaHTTPS
mkdir -p /data/install/tmp
cd /data/install/tmp
# export the certificate into pkcs#12 format
# you will be prompted for an export password here; use: password
openssl pkcs12 -export -out keystore.pkcs12 -inkey /etc/crts/redhat.ren.key -in /etc/crts/redhat.ren.crt
cat << EOF >> Dockerfile
FROM docker.io/sonatype/nexus3:3.29.0
USER root
COPY keystore.pkcs12 /keystore.pkcs12
RUN keytool -v -importkeystore -srckeystore keystore.pkcs12 -srcstoretype PKCS12 -destkeystore keystore.jks -deststoretype JKS -storepass password -srcstorepass password &&\
cp keystore.jks /opt/sonatype/nexus/etc/ssl/
USER nexus
EOF
buildah bud --format=docker -t docker.io/wangzheng422/imgs:nexus3-3.29.0-wzh -f Dockerfile .
buildah push docker.io/wangzheng422/imgs:nexus3-3.29.0-wzh
######################################################
# go to helper, update proxy setting for ocp cluster
cd /data/ocp4
bash image.registries.conf.sh nexus.ocp4.redhat.ren:8083
mkdir -p /etc/containers/registries.conf.d
/bin/cp -f image.registries.conf /etc/containers/registries.conf.d/
cd /data/ocp4
oc apply -f ./99-worker-container-registries.yaml -n openshift-config
oc apply -f ./99-master-container-registries.yaml -n openshift-config
######################################################
# dump the nexus image fs out
podman stop nexus-image
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
cd /data/ccn
tar cf - ./nexus-image | pigz -c > nexus-image.tgz
buildah from --name onbuild-container scratch
buildah copy onbuild-container nexus-image.tgz /
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container docker.io/wangzheng422/nexus-fs:image-$var_date
# buildah rm onbuild-container
# rm -f nexus-image.tgz
buildah push docker.io/wangzheng422/nexus-fs:image-$var_date
echo "docker.io/wangzheng422/nexus-fs:image-$var_date"
# the version below can serve as the initial image proxy; it already contains the nfs provisioner and the sample operator metadata. A nice discovery: image streams do not pull the whole image, they seem to fetch only the metadata, and the real image is pulled only when it is actually used.
# docker.io/wangzheng422/nexus-fs:image-2020-12-26-1118
##################################################
## call nexus api to get image list
# https://community.sonatype.com/t/how-can-i-get-a-list-of-tags-for-a-docker-image-akin-to-the-docker-hub-list/3210
# https://help.sonatype.com/repomanager3/rest-and-integration-api/search-api
curl -k -u admin:84091bcd-c82f-44a3-8b7b-dfc90f5b7da1 -X GET 'http://nexus.ocp4.redhat.ren:8082/service/rest/v1/search?repository=registry.redhat.io'
curl -u admin:84091bcd-c82f-44a3-8b7b-dfc90f5b7da1 -X GET 'http://nexus.ocp4.redhat.ren:8082/service/rest/v1/components?repository=registry.redhat.io'
podman pull docker.io/anoxis/registry-cli
podman run --rm anoxis/registry-cli -l admin:84091bcd-c82f-44a3-8b7b-dfc90f5b7da1 -r https://nexus.ocp4.redhat.ren:8083
# https://github.com/rpardini/docker-registry-proxy
REPO_URL=nexus.ocp4.redhat.ren:8083
curl -k -s -X GET https://$REPO_URL/v2/_catalog \
| jq '.repositories[]' \
| sort \
| xargs -I _ curl -s -k -X GET https://$REPO_URL/v2/_/tags/list
##################################################
## prepare for baidu disk
mkdir -p /data/ccn/baidu
cd /data/ccn
tar cf - ./nexus-image | pigz -c > /data/ccn/baidu/nexus-image.tgz
cd /data/ccn/baidu
split -b 20000m nexus-image.tgz nexus-image.tgz.
rm -f nexus-image.tgz
yum -y install python3-pip
pip3 install --user bypy
/root/.local/bin/bypy list
/root/.local/bin/bypy upload
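# to restore the archive on the target host, concatenate the parts and unpack them
# (a sketch; the part names follow the split command above):
# cat nexus-image.tgz.* | pigz -dc | tar xf -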
upload to baidu disk
export BUILDNUMBER=4.6.28
mkdir -p /data/bypy
cd /data
tar -cvf - ocp4/ | pigz -c > /data/bypy/ocp.$BUILDNUMBER.tgz
tar -cvf - registry/ | pigz -c > /data/bypy/registry.$BUILDNUMBER.tgz
cd /data/bypy
# https://github.com/houtianze/bypy
yum -y install python3-pip
pip3 install --user bypy
/root/.local/bin/bypy list
/root/.local/bin/bypy upload
openshift 4.9 single node, assisted install mode, without dhcp, connected
This article describes how to use the assisted service (the assisted installer) to install a single-node openshift4 cluster. The special part: by default openshift4 expects a DHCP service on the network so that a node can get an IP address at boot, download container images, and talk to the assisted service to fetch its configuration. Most customer networks, however, do not allow DHCP, so here we use a capability of the assisted service that is still hidden for now and deploy in static IP mode.
The customer environment/requirements assumed in this lab are:
- no dhcp on the lab network
- the lab network can reach the internet
- the lab has 2 hosts
- single-node openshift4 will be installed on 1 of those hosts (baremetal mode)
Because of the limits of the author's lab, we use kvm in place of real baremetal.
The installation flow is roughly:
- start the helper vm and configure a dns service on the helper node
- start the local assisted service
- configure things in the assisted service
- download the iso from the assisted service
- boot the kvm/baremetal host from the iso
- finish the configuration in the assisted service and start the installation
- watch and wait for the installation to finish
- get the openshift4 user name, password and so on, and log in to the cluster.
Architecture diagram of this lab:
Deploy dns
In assisted install mode, if you want a static-ip installation you need a dns service on the lab network. Because we are deploying single node openshift, it is enough to point the following 4 records at the same ip address. You do need to decide on the domain names in advance. A minimal dnsmasq sketch follows this list.
- api.ocp4s.redhat.ren
- api-int.ocp4s.redhat.ren
- *.apps.ocp4s.redhat.ren
- ocp4-sno.ocp4s.redhat.ren
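Here is one possible way to serve these records with dnsmasq; this is an illustrative sketch only, not the author's original setup (it assumes dnsmasq runs on the helper and that the records point at 172.21.6.13, the SNO_IP used later in this article):
cat << EOF > /etc/dnsmasq.d/sno.conf
# an address=/domain/ip rule also matches every subdomain, so it covers *.apps
address=/api.ocp4s.redhat.ren/172.21.6.13
address=/api-int.ocp4s.redhat.ren/172.21.6.13
address=/apps.ocp4s.redhat.ren/172.21.6.13
address=/ocp4-sno.ocp4s.redhat.ren/172.21.6.13
EOF
systemctl enable --now dnsmasq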
Deploy the assisted install service
There are two flavors of the assisted install service: the one hosted on cloud.redhat.com and a local version. They are functionally the same; because we need to customize it, we use the local version.
# https://github.com/openshift/assisted-service/blob/master/docs/user-guide/assisted-service-on-local.md
# https://github.com/openshift/assisted-service/tree/master/deploy/podman
podman version
# Version: 3.4.2
# API Version: 3.4.2
# Go Version: go1.16.12
# Built: Wed Feb 2 07:59:28 2022
# OS/Arch: linux/amd64
mkdir -p /data/assisted-service/
cd /data/assisted-service/
export http_proxy="http://192.168.195.54:5085"
export https_proxy=${http_proxy}
wget https://raw.githubusercontent.com/openshift/assisted-service/master/deploy/podman/configmap.yml
wget https://raw.githubusercontent.com/openshift/assisted-service/master/deploy/podman/pod.yml
unset http_proxy
unset https_proxy
sed -i 's/ SERVICE_BASE_URL:.*/ SERVICE_BASE_URL: "http:\/\/172.21.6.103:8090"/' configmap.yml
# start the local assisted service
podman play kube --configmap configmap.yml pod.yml
# use the following commands to stop/remove the local assisted service
cd /data/assisted-service/
podman play kube --down pod.yml
⚠️ Note: the local assisted service downloads isos for several versions from mirror.openshift.com, about 6GB in total. Wait for the download to finish.
podman exec assisted-installer-image-service du -h /data
# 6.3G /data
Once it is running, open the following url
http://172.21.6.103:8080
Create a cluster
Open the local assisted install service and create a cluster
Fill in the basic cluster information
Paste your own pull-secret and click next
On the next page, click add host
Click generate discovery iso directly; we will customize the ssh key later, so there is nothing to configure here.
Write down the download command, because we need the infra env id inside it
The command we got here is
wget -O discovery_image_ocp4s.iso 'http://127.0.0.1:8888/images/78506b3c-46e4-47f7-8a18-ec1ca4baa3b9?arch=x86_64&type=full-iso&version=4.9'
Customize the assisted install service configuration
The iso created by the assisted install service requires dhcp on the target network. We want static ip, so we customize the assisted install service and activate a capability that is still hidden for now (not yet officially supported).
# on helper
cd /data/sno
SNO_IP=172.21.6.13
SNO_GW=172.21.6.254
SNO_NETMAST=255.255.255.0
SNO_NETMAST_S=24
SNO_HOSTNAME=ocp4-sno
SNO_IF=enp1s0
SNO_IF_MAC=`printf '00:60:2F:%02X:%02X:%02X' $[RANDOM%256] $[RANDOM%256] $[RANDOM%256]`
SNO_DNS=172.21.1.1
SNO_DISK=/dev/vda
SNO_CORE_PWD=redhat
echo ${SNO_IF_MAC} > /data/sno/sno.mac
ASSISTED_SERVICE_URL=http://172.21.6.103:8080
# infra id is part of download url on UI
INFRA_ENV_ID=78506b3c-46e4-47f7-8a18-ec1ca4baa3b9
NODE_SSH_KEY="$(cat ~/.ssh/id_rsa.pub)"
request_body=$(mktemp)
cat << EOF > /data/sno/server-a.yaml
dns-resolver:
config:
server:
- ${SNO_DNS}
interfaces:
- ipv4:
address:
- ip: ${SNO_IP}
prefix-length: ${SNO_NETMAST_S}
dhcp: false
enabled: true
name: ${SNO_IF}
state: up
type: ethernet
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: ${SNO_GW}
next-hop-interface: ${SNO_IF}
table-id: 254
EOF
cat << EOF > /data/sno/static.ip.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-static-ip
storage:
files:
- path: /etc/NetworkManager/system-connections/${SNO_IF}.nmconnection
overwrite: true
contents:
inline: |
[connection]
id=${SNO_IF}
type=ethernet
autoconnect-retries=1
interface-name=${SNO_IF}
multi-connect=1
permissions=
wait-device-timeout=60000
[ethernet]
mac-address-blacklist=
[ipv4]
address1=${SNO_IP}/${SNO_NETMAST_S=24},${SNO_GW}
dhcp-hostname=${SNO_HOSTNAME}
dhcp-timeout=90
dns=${SNO_DNS};
dns-search=
may-fail=false
method=manual
[ipv6]
addr-gen-mode=eui64
dhcp-hostname=${SNO_HOSTNAME}
dhcp-timeout=90
dns-search=
method=disabled
[proxy]
EOF
# https://access.redhat.com/solutions/6194821
# butane /data/sno/static.ip.bu | python3 -c 'import json, yaml, sys; print(json.dumps(yaml.load(sys.stdin)))'
# https://stackoverflow.com/questions/2854655/command-to-escape-a-string-in-bash
# VAR_PULL_SEC=`printf "%q" $(cat /data/pull-secret.json)`
tmppath=$(mktemp)
butane /data/sno/static.ip.bu | python3 -c 'import json, yaml, sys; print(json.dumps(yaml.load(sys.stdin)))' | jq -c '.spec.config | .ignition.version = "3.1.0" ' > ${tmppath}
VAR_NMSTATIC=$(cat ${tmppath})
# rm -f ${tmppath}
jq -n --arg SSH_KEY "$NODE_SSH_KEY" \
--arg NMSTATE_YAML1 "$(cat server-a.yaml)" \
--arg MAC_ADDR "$(cat /data/sno/sno.mac)" \
--arg PULL_SEC "$(cat /data/pull-secret.json)" \
--arg NMSTATIC "${VAR_NMSTATIC}" \
'{
"proxy":{"http_proxy":"","https_proxy":"","no_proxy":""},
"ssh_authorized_key":$SSH_KEY,
"pull_secret":$PULL_SEC,
"image_type":"full-iso",
"ignition_config_override":$NMSTATIC,
"static_network_config": [
{
"network_yaml": $NMSTATE_YAML1,
"mac_interface_map": [{"mac_address": $MAC_ADDR, "logical_nic_name": "enp1s0"}]
}
]
}' > $request_body
# let's take a look at the request body we just built
cat $request_body
# send the request to the assisted install service to apply the customization
curl -H "Content-Type: application/json" -X PATCH -d @$request_body ${ASSISTED_SERVICE_URL}/api/assisted-install/v2/infra-envs/$INFRA_ENV_ID
# {"cluster_id":"850934fd-fa64-4057-b9d2-1eeebd890e1a","cpu_architecture":"x86_64","created_at":"2022-02-11T03:54:46.632598Z","download_url":"http://127.0.0.1:8888/images/89cc84a1-2dfd-4d7e-9ca3-903342c40d60?arch=x86_64&type=full-iso&version=4.9","email_domain":"Unknown","expires_at":"0001-01-01T00:00:00.000Z","href":"/api/assisted-install/v2/infra-envs/89cc84a1-2dfd-4d7e-9ca3-903342c40d60","id":"89cc84a1-2dfd-4d7e-9ca3-903342c40d60","kind":"InfraEnv","name":"ocp4s_infra-env","openshift_version":"4.9","proxy":{"http_proxy":"","https_proxy":"","no_proxy":""},"pull_secret_set":true,"ssh_authorized_key":"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCrkO4oLIFTwjkGON+aShlQRKwXHOf3XKrGDmpb+tQM3UcbsF2U7klsr9jBcGObQMZO7KBW8mlRu0wC2RxueBgjbqvylKoFacgVZg6PORfkclqE1gZRYFwoxDkLo2c5y5B7OhcAdlHO0eR5hZ3/0+8ZHZle0W+A0AD7qqowO2HlWLkMMt1QXFD7R0r6dzTs9u21jASGk3jjYgCOw5iHvqm2ueVDFAc4yVwNZ4MXKg5MRvqAJDYPqhaRozLE60EGIziy9SRj9HWynyNDncCdL1/IBK2z9T0JwDebD6TDNcPCtL+AeKIpaHed52PkjnFf+Q+8/0Z0iXt6GyFYlx8OkxdsiMgMxiXx43yIRaWZjx54kVtc9pB6CL50UKPQ2LjuFPIZSfaCab5KDgPRtzue82DE6Mxxg4PS+FTW32/bq1WiOxCg9ABrZ0n1CGaZWFepJkSw47wodMnvlBkcKY3Rn/SsLZVOUsJysd+b08LQgl1Fr3hjVrEQMLbyU0UxvoerYfk= root@ocp4-helper","static_network_config":"dns-resolver:\n config:\n server:\n - 172.21.1.1\ninterfaces:\n- ipv4:\n address:\n - ip: 172.21.6.13\n prefix-length: 24\n dhcp: false\n enabled: true\n name: enp1s0\n state: up\n type: ethernet\nroutes:\n config:\n - destination: 0.0.0.0/0\n next-hop-address: 172.21.6.254\n next-hop-interface: enp1s0\n table-id: 254HHHHH00:60:2F:8B:42:88=enp1s0","type":"full-iso","updated_at":"2022-02-11T04:01:14.008388Z","user_name":"admin"}
# on helper
cd /data/sno/
wget -O discovery_image_ocp4s.iso "http://172.21.6.103:8888/images/${INFRA_ENV_ID}?arch=x86_64&type=full-iso&version=4.9"
# coreos-installer iso kargs modify -a \
# " ip=${SNO_IP}::${SNO_GW}:${SNO_NETMAST}:${SNO_HOSTNAME}:${SNO_IF}:none nameserver=${SNO_DNS}" \
# /data/sno/discovery_image_ocp4s.iso
/bin/mv -f /data/sno/discovery_image_ocp4s.iso /data/sno/sno.iso
Boot the kvm guest
We go back to the kvm host, start the guest, and begin installing single node openshift
# back to kvm host
create_lv() {
var_vg=$1
var_lv=$2
var_size=$3
lvremove -f $var_vg/$var_lv
lvcreate -y -L $var_size -n $var_lv $var_vg
wipefs --all --force /dev/$var_vg/$var_lv
}
create_lv vgdata lvsno 120G
export KVM_DIRECTORY=/data/kvm
mkdir -p ${KVM_DIRECTORY}
cd ${KVM_DIRECTORY}
scp root@192.168.7.11:/data/sno/sno.* ${KVM_DIRECTORY}/
# on kvm host
# export KVM_DIRECTORY=/data/kvm
virt-install --name=ocp4-sno --vcpus=16 --ram=65536 \
--cpu=host-model \
--disk path=/dev/vgdata/lvsno,device=disk,bus=virtio,format=raw \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio,mac=$(<sno.mac) \
--graphics vnc,port=59012 \
--boot menu=on --cdrom ${KVM_DIRECTORY}/sno.iso
Configure the sno parameters in the assisted install service
Back in the assisted install service webUI, you can see that the node has been discovered
Click next and configure the machine network subnet for the host
Click next and review the cluster configuration
Start the installation; from here on we just wait
After a while, usually 20-30 minutes, the installation is done, assuming reasonably good network conditions.
⚠️ Do not forget to download the cluster kubeconfig, plus the webUI user name and password.
Access the sno cluster
# back to helper
# copy kubeconfig from web browser to /data/sno
export KUBECONFIG=/data/sno/auth/kubeconfig
oc get node
# NAME STATUS ROLES AGE VERSION
# ocp4-sno Ready master,worker 71m v1.22.3+e790d7f
oc get co
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# authentication 4.9.18 True False False 54m
# baremetal 4.9.18 True False False 58m
# cloud-controller-manager 4.9.18 True False False 63m
# cloud-credential 4.9.18 True False False 68m
# cluster-autoscaler 4.9.18 True False False 59m
# config-operator 4.9.18 True False False 69m
# console 4.9.18 True False False 54m
# csi-snapshot-controller 4.9.18 True False False 68m
# dns 4.9.18 True False False 58m
# etcd 4.9.18 True False False 62m
# image-registry 4.9.18 True False False 55m
# ingress 4.9.18 True False False 57m
# insights 4.9.18 True False False 63m
# kube-apiserver 4.9.18 True False False 58m
# kube-controller-manager 4.9.18 True False False 61m
# kube-scheduler 4.9.18 True False False 60m
# kube-storage-version-migrator 4.9.18 True False False 68m
# machine-api 4.9.18 True False False 59m
# machine-approver 4.9.18 True False False 60m
# machine-config 4.9.18 True False False 63m
# marketplace 4.9.18 True False False 68m
# monitoring 4.9.18 True False False 54m
# network 4.9.18 True False False 68m
# node-tuning 4.9.18 True False False 64m
# openshift-apiserver 4.9.18 True False False 55m
# openshift-controller-manager 4.9.18 True False False 60m
# openshift-samples 4.9.18 True False False 57m
# operator-lifecycle-manager 4.9.18 True False False 60m
# operator-lifecycle-manager-catalog 4.9.18 True False False 60m
# operator-lifecycle-manager-packageserver 4.9.18 True False False 58m
# service-ca 4.9.18 True False False 68m
# storage 4.9.18 True False False 63m
Access the cluster webUI
https://console-openshift-console.apps.ocp4s.redhat.ren/
The user name and password are: kubeadmin / 3QS3M-HA3Px-376HD-bvfif
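You can also log in from the CLI with the same credentials; a quick sketch, assuming the API endpoint follows the usual api.<cluster domain>:6443 convention from the dns section above:
oc login https://api.ocp4s.redhat.ren:6443 -u kubeadmin -p 3QS3M-HA3Px-376HD-bvfif
oc whoami
# expected: kube:admin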
reference
https://github.com/openshift/assisted-service/tree/master/docs/user-guide
- https://access.redhat.com/solutions/6135171
- https://github.com/openshift/assisted-service/blob/master/docs/user-guide/assisted-service-on-local.md
- https://github.com/openshift/assisted-service/blob/master/docs/user-guide/restful-api-guide.md
search
- pre-network-manager-config.sh
- /Users/wzh/Desktop/dev/assisted-service/internal/constants/scripts.go
- NetworkManager
https://superuser.com/questions/218340/how-to-generate-a-valid-random-mac-address-with-bash-shell
end
cat << EOF > test
02:00:00:2c:23:a5=enp1s0
EOF
cat test | cut -d= -f1 | tr '[:lower:]' '[:upper:]'
printf '00-60-2F-%02X-%02X-%02X\n' $[RANDOM%256] $[RANDOM%256] $[RANDOM%256]
virsh domifaddr freebsd11.1
openshift 4.9 single node, assisted install mode, without dhcp, disconnected
This article describes how to use the assisted service (the assisted installer) to install a single-node openshift4 cluster. The special part: by default openshift4 expects a DHCP service on the network so that a node can get an IP address at boot, download container images, and talk to the assisted service to fetch its configuration. Most customer networks, however, do not allow DHCP, so here we use a capability of the assisted service that is still hidden for now and deploy in static IP mode.
The customer environment/requirements assumed in this lab are:
- no dhcp on the lab network
- the lab network cannot reach the internet
- the lab has 2 hosts
- single-node openshift4 will be installed on 1 of those hosts (baremetal mode)
Because of the limits of the author's lab, we use kvm in place of real baremetal.
The installation flow is roughly:
- start the helper vm and configure a dns service on the helper node
- start the local assisted service
- configure things in the assisted service
- download the iso from the assisted service
- boot the kvm/baremetal host from the iso
- finish the configuration in the assisted service and start the installation
- watch and wait for the installation to finish
- get the openshift4 user name, password and so on, and log in to the cluster.
Architecture diagram of this lab:
Installation media
This install uses openshift 4.9.12. For convenience, the author packaged the installation media; besides the openshift images it also contains some auxiliary software and tools.
The packaged installer can be downloaded here, from Baidu disk; the version is 4.9.12:
- 4.9.12
- link: https://pan.baidu.com/s/1Wj5MUBLMFli1kOit1eafug extraction code: ur8r
Deploy dns
In assisted install mode, if you want a static-ip installation you need a dns service on the lab network. Because we are deploying single node openshift, it is enough to point the following 4 records at the same ip address. You do need to decide on the domain names in advance.
- api.ocp4s-ais.redhat.ren
- api-int.ocp4s-ais.redhat.ren
- *.apps.ocp4s-ais.redhat.ren
- ocp4-sno.ocp4s-ais.redhat.ren
cd /data/ocp4/ocp4-upi-helpernode-master/
cat << 'EOF' > /data/ocp4/ocp4-upi-helpernode-master/vars.yaml
---
ocp_version: 4.9.12
ssh_gen_key: false
staticips: true
firewalld: false
dns_forward: yes
iso:
iso_dl_url: "file:///data/ocp4/rhcos-live.x86_64.iso"
my_iso: "rhcos-live.iso" # this is internal file, just leave as it.
helper:
name: "helper"
ipaddr: "192.168.7.11"
networkifacename: "enp1s0"
gateway: "192.168.7.1"
netmask: "255.255.255.0"
dns:
domain: "redhat.ren"
clusterid: "sno"
forwarder1: "172.21.1.1"
forwarder2: "172.21.1.1"
bootstrap:
name: "bootstrap"
ipaddr: "192.168.7.112"
interface: "enp1s0"
install_drive: "vda"
masters:
- name: "master-0"
ipaddr: "192.168.7.113"
interface: "enp1s0"
install_drive: "vda"
# - name: "master-1"
# ipaddr: "192.168.7.14"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "master-2"
# ipaddr: "192.168.7.15"
# interface: "enp1s0"
# install_drive: "vda"
workers:
- name: "worker-0"
ipaddr: "192.168.7.116"
interface: "eno1"
install_drive: "sda"
- name: "worker-1"
ipaddr: "192.168.7.117"
interface: "enp1s0"
install_drive: "sda"
# - name: "worker-2"
# ipaddr: "192.168.7.18"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "infra-0"
# ipaddr: "192.168.7.19"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "infra-1"
# ipaddr: "192.168.7.20"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "worker-3"
# ipaddr: "192.168.7.21"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "worker-4"
# ipaddr: "192.168.7.22"
# interface: "enp1s0"
# install_drive: "vda"
others:
- name: "registry"
ipaddr: "192.168.7.1"
- name: "yum"
ipaddr: "192.168.7.1"
- name: "quay"
ipaddr: "192.168.7.1"
- name: "nexus"
ipaddr: "192.168.7.1"
- name: "git"
ipaddr: "192.168.7.1"
otherdomains:
- domain: "infra.redhat.ren"
hosts:
- name: "registry"
ipaddr: "192.168.7.1"
- name: "yum"
ipaddr: "192.168.7.1"
- name: "quay"
ipaddr: "192.168.7.1"
- name: "quaylab"
ipaddr: "192.168.7.1"
- name: "nexus"
ipaddr: "192.168.7.1"
- name: "git"
ipaddr: "192.168.7.1"
- domain: "ocp4s-ais.redhat.ren"
hosts:
- name: "api"
ipaddr: "192.168.7.13"
- name: "api-int"
ipaddr: "192.168.7.13"
- name: "ocp4-sno"
ipaddr: "192.168.7.13"
- name: "*.apps"
ipaddr: "192.168.7.13"
force_ocp_download: false
remove_old_config_files: false
ocp_client: "file:///data/ocp4/{{ ocp_version }}/openshift-client-linux-{{ ocp_version }}.tar.gz"
ocp_installer: "file:///data/ocp4/{{ ocp_version }}/openshift-install-linux-{{ ocp_version }}.tar.gz"
ppc64le: false
arch: 'x86_64'
chronyconfig:
enabled: true
content:
- server: "192.168.7.11"
options: iburst
setup_registry: # don't worry about this, just leave it here
deploy: false
registry_image: docker.io/library/registry:2
local_repo: "ocp4/openshift4"
product_repo: "openshift-release-dev"
release_name: "ocp-release"
release_tag: "4.6.1-x86_64"
ocp_filetranspiler: "file:///data/ocp4/filetranspiler.tgz"
registry_server: "registry.ocp4.redhat.ren:5443"
EOF
ansible-playbook -e @vars.yaml tasks/main.yml
/bin/cp -f /data/ocp4/rhcos-live.x86_64.iso /var/www/html/install/live.iso
Deploy the assisted install service
There are two flavors of the assisted install service: the one hosted on cloud.redhat.com and a local version. They are functionally the same; because we need to customize it, we use the local version.
# https://github.com/openshift/assisted-service/blob/master/docs/user-guide/assisted-service-on-local.md
# https://github.com/openshift/assisted-service/tree/master/deploy/podman
podman version
# Version: 3.4.2
# API Version: 3.4.2
# Go Version: go1.16.12
# Built: Wed Feb 2 07:59:28 2022
# OS/Arch: linux/amd64
mkdir -p /data/assisted-service/
cd /data/assisted-service/
export http_proxy="http://192.168.195.54:5085"
export https_proxy=${http_proxy}
wget https://raw.githubusercontent.com/openshift/assisted-service/master/deploy/podman/configmap.yml
wget https://raw.githubusercontent.com/openshift/assisted-service/master/deploy/podman/pod.yml
/bin/cp -f configmap.yml configmap.yml.bak
unset http_proxy
unset https_proxy
sed -i 's/ SERVICE_BASE_URL:.*/ SERVICE_BASE_URL: "http:\/\/172.21.6.103:8090"/' configmap.yml
cat << EOF > /data/assisted-service/os_image.json
[{
"openshift_version": "4.9",
"cpu_architecture": "x86_64",
"url": "http://192.168.7.11:8080/install/live.iso",
"rootfs_url": "http://192.168.7.11:8080/install/rootfs.img",
"version": "49.84.202110081407-0"
}]
EOF
cat << EOF > /data/assisted-service/release.json
[{
"openshift_version": "4.9",
"cpu_architecture": "x86_64",
"url": "quaylab.infra.redhat.ren/ocp4/openshift4:4.9.12-x86_64",
"version": "4.9.12",
"default": true
}]
EOF
cat configmap.yml.bak \
| python3 -c 'import json, yaml, sys; print(json.dumps(yaml.load(sys.stdin)))' \
| jq --arg OSIMAGE "$(jq -c . /data/assisted-service/os_image.json)" '. | .data.OS_IMAGES = $OSIMAGE ' \
| jq --arg RELEASE_IMAGES "$(jq -c . /data/assisted-service/release.json)" '. | .data.RELEASE_IMAGES = $RELEASE_IMAGES ' \
| python3 -c 'import yaml, sys; print(yaml.dump(yaml.load(sys.stdin), default_flow_style=False))' \
> configmap.yml
# start the local assisted service
cd /data/assisted-service/
podman play kube --configmap configmap.yml pod.yml
# inject the CA certificate of the offline image registry
podman cp /etc/crts/redhat.ren.ca.crt assisted-installer-service:/etc/pki/ca-trust/source/anchors/quaylab.crt
podman exec assisted-installer-service update-ca-trust
# use the following commands to stop/remove the local assisted service
cd /data/assisted-service/
podman play kube --down pod.yml
podman exec assisted-installer-image-service du -h /data
# 1.1G /data
Once it is running, open the following url
http://172.21.6.103:8080
Create a cluster
Open the local assisted install service and create a cluster, ocp4s-ais.redhat.ren
Fill in the basic cluster information
Paste your own pull-secret and click next
On the next page, click add host
Click generate discovery iso directly; we will customize the ssh key later, so there is nothing to configure here.
Write down the download command, because we need the infra env id inside it
The command we got here is
wget -O discovery_image_ocp4s-ais.iso 'http://127.0.0.1:8888/images/b6b173ab-f080-4378-a9e0-bb6ff02f78bb?arch=x86_64&type=full-iso&version=4.9'
Customize the assisted install service configuration
The iso created by the assisted install service requires dhcp on the target network. We want static ip, so we customize the assisted install service and activate a capability that is still hidden for now (not yet officially supported).
# on helper
cd /data/sno
ASSISTED_SERVICE_URL=http://172.21.6.103:8080
# infra id is part of download url on UI
INFRA_ENV_ID=b6b173ab-f080-4378-a9e0-bb6ff02f78bb
NODE_SSH_KEY="$(cat ~/.ssh/id_rsa.pub)"
SNO_IP=192.168.7.13
SNO_GW=192.168.7.1
SNO_NETMAST=255.255.255.0
SNO_NETMAST_S=24
SNO_HOSTNAME=ocp4-sno
SNO_IF=enp1s0
SNO_IF_MAC=`printf '00:60:2F:%02X:%02X:%02X' $[RANDOM%256] $[RANDOM%256] $[RANDOM%256]`
SNO_DNS=192.168.7.11
SNO_DISK=/dev/vda
SNO_CORE_PWD=redhat
echo ${SNO_IF_MAC} > /data/sno/sno.mac
cat << EOF > /data/sno/server-a.yaml
dns-resolver:
config:
server:
- ${SNO_DNS}
interfaces:
- ipv4:
address:
- ip: ${SNO_IP}
prefix-length: ${SNO_NETMAST_S}
dhcp: false
enabled: true
name: ${SNO_IF}
state: up
type: ethernet
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: ${SNO_GW}
next-hop-interface: ${SNO_IF}
table-id: 254
EOF
cat << EOF > /data/sno/static.ip.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-static-ip
EOF
VAR_INSTALL_IMAGE_REGISTRY=quaylab.infra.redhat.ren
cat << EOF > /data/sno/install.images.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-install-images
storage:
files:
- path: /etc/containers/registries.conf.d/base.registries.conf
overwrite: true
contents:
inline: |
unqualified-search-registries = ["registry.access.redhat.com", "docker.io"]
short-name-mode = ""
[[registry]]
prefix = ""
location = "quay.io/openshift-release-dev/ocp-release"
mirror-by-digest-only = true
[[registry.mirror]]
location = "${VAR_INSTALL_IMAGE_REGISTRY}/ocp4/openshift4"
[[registry.mirror]]
location = "${VAR_INSTALL_IMAGE_REGISTRY}/ocp4/release"
[[registry]]
prefix = ""
location = "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
mirror-by-digest-only = true
[[registry.mirror]]
location = "${VAR_INSTALL_IMAGE_REGISTRY}/ocp4/openshift4"
[[registry.mirror]]
location = "${VAR_INSTALL_IMAGE_REGISTRY}/ocp4/release"
EOF
cat << EOF > /data/sno/install.crts.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-install-crts
storage:
files:
- path: /etc/pki/ca-trust/source/anchors/quaylab.crt
overwrite: true
contents:
inline: |
$( cat /etc/crts/redhat.ren.ca.crt | sed 's/^/ /g' )
EOF
mkdir -p /data/sno/disconnected/
# copy ntp related config
/bin/cp -f /data/ocp4/ocp4-upi-helpernode-master/machineconfig/* /data/sno/disconnected/
# copy image registry proxy related config
cd /data/ocp4
bash image.registries.conf.sh nexus.infra.redhat.ren:8083
/bin/cp -f /data/ocp4/99-worker-container-registries.yaml /data/sno/disconnected/
/bin/cp -f /data/ocp4/99-master-container-registries.yaml /data/sno/disconnected/
cd /data/sno/
# helper to turn a MachineConfig yaml file into ignition "storage.files" entries
# run under bash
# 1st parameter: the file name that will be written onto coreos at first boot
# 2nd parameter: the local file whose content is embedded
# results come back in two globals:
#   RET_VAL   - an ignition file entry that writes the whole yaml to the path given as the 1st parameter
#   RET_VAL_2 - an ignition file entry rebuilt from the first storage.files item inside the MachineConfig itself
get_file_content_for_ignition() {
VAR_FILE_NAME=$1
VAR_FILE_CONTENT_IN_FILE=$2
tmppath=$(mktemp)
cat << EOF > $tmppath
{
"overwrite": true,
"path": "$VAR_FILE_NAME",
"user": {
"name": "root"
},
"contents": {
"source": "data:text/plain,$(cat $VAR_FILE_CONTENT_IN_FILE | python3 -c "import sys, urllib.parse; print(urllib.parse.quote(''.join(sys.stdin.readlines())))" )"
}
}
EOF
RET_VAL=$(cat $tmppath | jq -c .)
FILE_JSON=$(cat $VAR_FILE_CONTENT_IN_FILE | python3 -c 'import json, yaml, sys; print(json.dumps(yaml.load(sys.stdin)))')
cat << EOF > $tmppath
{
"overwrite": true,
"path": "$(echo $FILE_JSON | jq -r .spec.config.storage.files[0].path )",
"user": {
"name": "root"
},
"contents": {
"source": "$( echo $FILE_JSON | jq -r .spec.config.storage.files[0].contents.source )"
}
}
EOF
# cat $tmppath
RET_VAL_2=$(cat $tmppath | jq -c .)
/bin/rm -f $tmppath
}
get_file_content_for_ignition "/opt/openshift/openshift/99-master-chrony-configuration.yaml" "/data/sno/disconnected/99-master-chrony-configuration.yaml"
VAR_99_master_chrony=$RET_VAL
VAR_99_master_chrony_2=$RET_VAL_2
get_file_content_for_ignition "/opt/openshift/openshift/99-worker-chrony-configuration.yaml" "/data/sno/disconnected/99-worker-chrony-configuration.yaml"
VAR_99_worker_chrony=$RET_VAL
VAR_99_worker_chrony_2=$RET_VAL_2
get_file_content_for_ignition "/opt/openshift/openshift/99-master-container-registries.yaml" "/data/sno/disconnected/99-master-container-registries.yaml"
VAR_99_master_container_registries=$RET_VAL
VAR_99_master_container_registries_2=$RET_VAL_2
get_file_content_for_ignition "/opt/openshift/openshift/99-worker-container-registries.yaml" "/data/sno/disconnected/99-worker-container-registries.yaml"
VAR_99_worker_container_registries=$RET_VAL
VAR_99_worker_container_registries_2=$RET_VAL_2
butane /data/sno/install.images.bu > /data/sno/disconnected/99-zzz-master-install-images.yaml
get_file_content_for_ignition "/opt/openshift/openshift/99-zzz-master-install-images.yaml" "/data/sno/disconnected/99-zzz-master-install-images.yaml"
VAR_99_master_install_images=$RET_VAL
VAR_99_master_install_images_2=$RET_VAL_2
butane /data/sno/install.crts.bu > /data/sno/disconnected/99-zzz-master-install-crts.yaml
get_file_content_for_ignition "/opt/openshift/openshift/99-zzz-master-install-crts.yaml" "/data/sno/disconnected/99-zzz-master-install-crts.yaml"
VAR_99_master_install_crts=$RET_VAL
VAR_99_master_install_crts_2=$RET_VAL_2
# https://access.redhat.com/solutions/6194821
# butane /data/sno/static.ip.bu | python3 -c 'import json, yaml, sys; print(json.dumps(yaml.load(sys.stdin)))'
# https://stackoverflow.com/questions/2854655/command-to-escape-a-string-in-bash
# VAR_PULL_SEC=`printf "%q" $(cat /data/pull-secret.json)`
# https://access.redhat.com/solutions/221403
# VAR_PWD_HASH="$(openssl passwd -1 -salt 'openshift' 'redhat')"
VAR_PWD_HASH="$(python3 -c 'import crypt,getpass; print(crypt.crypt("redhat"))')"
tmppath=$(mktemp)
butane /data/sno/static.ip.bu \
| python3 -c 'import json, yaml, sys; print(json.dumps(yaml.load(sys.stdin)))' \
| jq '.spec.config | .ignition.version = "3.1.0" ' \
| jq --arg VAR "$VAR_PWD_HASH" --arg VAR_SSH "$NODE_SSH_KEY" '.passwd.users += [{ "name": "wzh", "system": true, "passwordHash": $VAR , "sshAuthorizedKeys": [ $VAR_SSH ], "groups": [ "adm", "wheel", "sudo", "systemd-journal" ] }]' \
| jq --argjson VAR "$VAR_99_master_chrony" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_worker_chrony" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_container_registries" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_worker_container_registries" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_install_images" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_install_crts" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_chrony_2" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_container_registries_2" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_install_images_2" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_install_crts_2" '.storage.files += [$VAR] ' \
| jq -c . \
> ${tmppath}
VAR_IGNITION=$(cat ${tmppath})
rm -f ${tmppath}
# cat /run/user/0/containers/auth.json
# {
# "auths": {
# "quaylab.infra.redhat.ren": {
# "auth": "cXVheWFkbWluOnBhc3N3b3Jk"
# }
# }
# }
request_body=$(mktemp)
jq -n --arg SSH_KEY "$NODE_SSH_KEY" \
--arg NMSTATE_YAML1 "$(cat server-a.yaml)" \
--arg MAC_ADDR "$(cat /data/sno/sno.mac)" \
--arg IF_NIC "${SNO_IF}" \
--arg PULL_SEC '{"auths":{"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"quaylab.infra.redhat.ren": {"auth": "cXVheWFkbWluOnBhc3N3b3Jk","email": "noemail@localhost"}}}' \
--arg IGNITION "${VAR_IGNITION}" \
'{
"proxy":{"http_proxy":"","https_proxy":"","no_proxy":""},
"ssh_authorized_key":$SSH_KEY,
"pull_secret":$PULL_SEC,
"image_type":"full-iso",
"ignition_config_override":$IGNITION,
"static_network_config": [
{
"network_yaml": $NMSTATE_YAML1,
"mac_interface_map": [{"mac_address": $MAC_ADDR, "logical_nic_name": $IF_NIC}]
}
]
}' > $request_body
# let's take a look at the request body we just built
cat $request_body
# send the request to the assisted install service to apply the customization
curl -H "Content-Type: application/json" -X PATCH -d @$request_body ${ASSISTED_SERVICE_URL}/api/assisted-install/v2/infra-envs/$INFRA_ENV_ID
# {"cluster_id":"850934fd-fa64-4057-b9d2-1eeebd890e1a","cpu_architecture":"x86_64","created_at":"2022-02-11T03:54:46.632598Z","download_url":"http://127.0.0.1:8888/images/89cc84a1-2dfd-4d7e-9ca3-903342c40d60?arch=x86_64&type=full-iso&version=4.9","email_domain":"Unknown","expires_at":"0001-01-01T00:00:00.000Z","href":"/api/assisted-install/v2/infra-envs/89cc84a1-2dfd-4d7e-9ca3-903342c40d60","id":"89cc84a1-2dfd-4d7e-9ca3-903342c40d60","kind":"InfraEnv","name":"ocp4s_infra-env","openshift_version":"4.9","proxy":{"http_proxy":"","https_proxy":"","no_proxy":""},"pull_secret_set":true,"ssh_authorized_key":"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCrkO4oLIFTwjkGON+aShlQRKwXHOf3XKrGDmpb+tQM3UcbsF2U7klsr9jBcGObQMZO7KBW8mlRu0wC2RxueBgjbqvylKoFacgVZg6PORfkclqE1gZRYFwoxDkLo2c5y5B7OhcAdlHO0eR5hZ3/0+8ZHZle0W+A0AD7qqowO2HlWLkMMt1QXFD7R0r6dzTs9u21jASGk3jjYgCOw5iHvqm2ueVDFAc4yVwNZ4MXKg5MRvqAJDYPqhaRozLE60EGIziy9SRj9HWynyNDncCdL1/IBK2z9T0JwDebD6TDNcPCtL+AeKIpaHed52PkjnFf+Q+8/0Z0iXt6GyFYlx8OkxdsiMgMxiXx43yIRaWZjx54kVtc9pB6CL50UKPQ2LjuFPIZSfaCab5KDgPRtzue82DE6Mxxg4PS+FTW32/bq1WiOxCg9ABrZ0n1CGaZWFepJkSw47wodMnvlBkcKY3Rn/SsLZVOUsJysd+b08LQgl1Fr3hjVrEQMLbyU0UxvoerYfk= root@ocp4-helper","static_network_config":"dns-resolver:\n config:\n server:\n - 172.21.1.1\ninterfaces:\n- ipv4:\n address:\n - ip: 172.21.6.13\n prefix-length: 24\n dhcp: false\n enabled: true\n name: enp1s0\n state: up\n type: ethernet\nroutes:\n config:\n - destination: 0.0.0.0/0\n next-hop-address: 172.21.6.254\n next-hop-interface: enp1s0\n table-id: 254HHHHH00:60:2F:8B:42:88=enp1s0","type":"full-iso","updated_at":"2022-02-11T04:01:14.008388Z","user_name":"admin"}
rm -f ${request_body}
# on helper
cd /data/sno/
wget -O discovery_image_ocp4s.iso "http://172.21.6.103:8888/images/${INFRA_ENV_ID}?arch=x86_64&type=full-iso&version=4.9"
# coreos-installer iso kargs modify -a \
# " ip=${SNO_IP}::${SNO_GW}:${SNO_NETMAST}:${SNO_HOSTNAME}:${SNO_IF}:none nameserver=${SNO_DNS}" \
# /data/sno/discovery_image_ocp4s.iso
/bin/mv -f /data/sno/discovery_image_ocp4s.iso /data/sno/sno.iso
Boot the kvm guest
We go back to the kvm host, start the guest, and begin installing single node openshift
# back to kvm host
create_lv() {
var_vg=$1
var_lv=$2
var_size=$3
lvremove -f $var_vg/$var_lv
lvcreate -y -L $var_size -n $var_lv $var_vg
wipefs --all --force /dev/$var_vg/$var_lv
}
create_lv vgdata lvsno 120G
export KVM_DIRECTORY=/data/kvm
mkdir -p ${KVM_DIRECTORY}
cd ${KVM_DIRECTORY}
scp root@192.168.7.11:/data/sno/sno.* ${KVM_DIRECTORY}/
# on kvm host
# export KVM_DIRECTORY=/data/kvm
virt-install --name=ocp4-sno --vcpus=16 --ram=65536 \
--cpu=host-model \
--disk path=/dev/vgdata/lvsno,device=disk,bus=virtio,format=raw \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio,mac=$(<sno.mac) \
--graphics vnc,port=59012 \
--boot menu=on --cdrom ${KVM_DIRECTORY}/sno.iso
Configure the sno parameters in the assisted install service
Back in the assisted install service webUI, you can see that the node has been discovered
Click next and configure the machine network subnet for the host
Click next and review the cluster configuration
Start the installation; from here on we just wait
After a while, usually 20-30 minutes, the installation is done, assuming reasonably good network conditions.
⚠️ Do not forget to download the cluster kubeconfig, plus the webUI user name and password.
Access the sno cluster
# back to helper
# copy kubeconfig from web browser to /data/sno
export KUBECONFIG=/data/sno/auth/kubeconfig
oc get node
# NAME STATUS ROLES AGE VERSION
# ocp4-sno Ready master,worker 9h v1.22.3+e790d7f
oc get co
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# authentication 4.9.12 True False False 6h6m
# baremetal 4.9.12 True False False 9h
# cloud-controller-manager 4.9.12 True False False 9h
# cloud-credential 4.9.12 True False False 9h
# cluster-autoscaler 4.9.12 True False False 9h
# config-operator 4.9.12 True False False 9h
# console 4.9.12 True False False 9h
# csi-snapshot-controller 4.9.12 True False False 9h
# dns 4.9.12 True False False 6h6m
# etcd 4.9.12 True False False 9h
# image-registry 4.9.12 True False False 9h
# ingress 4.9.12 True False False 9h
# insights 4.9.12 True False False 9h
# kube-apiserver 4.9.12 True False False 9h
# kube-controller-manager 4.9.12 True False False 9h
# kube-scheduler 4.9.12 True False False 9h
# kube-storage-version-migrator 4.9.12 True False False 9h
# machine-api 4.9.12 True False False 9h
# machine-approver 4.9.12 True False False 9h
# machine-config 4.9.12 True False False 9h
# marketplace 4.9.12 True False False 9h
# monitoring 4.9.12 True False False 9h
# network 4.9.12 True False False 9h
# node-tuning 4.9.12 True False False 9h
# openshift-apiserver 4.9.12 True False False 6h4m
# openshift-controller-manager 4.9.12 True False False 9h
# openshift-samples 4.9.12 True False False 6h4m
# operator-lifecycle-manager 4.9.12 True False False 9h
# operator-lifecycle-manager-catalog 4.9.12 True False False 9h
# operator-lifecycle-manager-packageserver 4.9.12 True False False 9h
# service-ca 4.9.12 True False False 9h
# storage 4.9.12 True False False 9h
Access the cluster webUI
https://console-openshift-console.apps.ocp4s-ais.redhat.ren/
The user name and password are: kubeadmin / Sb7Fp-U466I-SkPB4-6bpEn
reference
https://github.com/openshift/assisted-service/tree/master/docs/user-guide
- https://access.redhat.com/solutions/6135171
- https://github.com/openshift/assisted-service/blob/master/docs/user-guide/assisted-service-on-local.md
- https://github.com/openshift/assisted-service/blob/master/docs/user-guide/restful-api-guide.md
search
- pre-network-manager-config.sh
- /Users/wzh/Desktop/dev/assisted-service/internal/constants/scripts.go
- NetworkManager
https://superuser.com/questions/218340/how-to-generate-a-valid-random-mac-address-with-bash-shell
end
cat << EOF > test
02:00:00:2c:23:a5=enp1s0
EOF
cat test | cut -d= -f1 | tr '[:lower:]' '[:upper:]'
printf '00-60-2F-%02X-%02X-%02X\n' $[RANDOM%256] $[RANDOM%256] $[RANDOM%256]
virsh domifaddr freebsd11.1
cat configmap.yml | python3 -c 'import json, yaml, sys; print(json.dumps(yaml.load(sys.stdin)))' | jq -r .data.OS_IMAGES | jq '.[] | select( .openshift_version == "4.9" and .cpu_architecture == "x86_64" ) ' | jq .
# {
# "openshift_version": "4.9",
# "cpu_architecture": "x86_64",
# "url": "https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.9/4.9.0/rhcos-4.9.0-x86_64-live.x86_64.iso",
# "rootfs_url": "https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.9/4.9.0/rhcos-live-rootfs.x86_64.img",
# "version": "49.84.202110081407-0"
# }
cat configmap.yml | python3 -c 'import json, yaml, sys; print(json.dumps(yaml.load(sys.stdin)))' | jq -r .data.RELEASE_IMAGES | jq -r .
# [
# {
# "openshift_version": "4.6",
# "cpu_architecture": "x86_64",
# "url": "quay.io/openshift-release-dev/ocp-release:4.6.16-x86_64",
# "version": "4.6.16"
# },
# {
# "openshift_version": "4.7",
# "cpu_architecture": "x86_64",
# "url": "quay.io/openshift-release-dev/ocp-release:4.7.42-x86_64",
# "version": "4.7.42"
# },
# {
# "openshift_version": "4.8",
# "cpu_architecture": "x86_64",
# "url": "quay.io/openshift-release-dev/ocp-release:4.8.29-x86_64",
# "version": "4.8.29"
# },
# {
# "openshift_version": "4.9",
# "cpu_architecture": "x86_64",
# "url": "quay.io/openshift-release-dev/ocp-release:4.9.18-x86_64",
# "version": "4.9.18",
# "default": true
# },
# {
# "openshift_version": "4.9",
# "cpu_architecture": "arm64",
# "url": "quay.io/openshift-release-dev/ocp-release:4.9.18-aarch64",
# "version": "4.9.18"
# },
# {
# "openshift_version": "4.10",
# "cpu_architecture": "x86_64",
# "url": "quay.io/openshift-release-dev/ocp-release:4.10.0-rc.0-x86_64",
# "version": "4.10.0-rc.0"
# }
# ]
cat << EOF > /data/sno/static.ip.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-static-ip
# passwd:
# users:
# name: wzh
# password_hash: "$(openssl passwd -1 wzh)"
# storage:
# files:
# - path: /etc/NetworkManager/system-connections/${SNO_IF}.nmconnection
# overwrite: true
# contents:
# inline: |
# [connection]
# id=${SNO_IF}
# type=ethernet
# autoconnect-retries=1
# interface-name=${SNO_IF}
# multi-connect=1
# permissions=
# wait-device-timeout=60000
# [ethernet]
# mac-address-blacklist=
# [ipv4]
# address1=${SNO_IP}/${SNO_NETMAST_S=24},${SNO_GW}
# dhcp-hostname=${SNO_HOSTNAME}
# dhcp-timeout=90
# dns=${SNO_DNS};
# dns-search=
# may-fail=false
# method=manual
# [ipv6]
# addr-gen-mode=eui64
# dhcp-hostname=${SNO_HOSTNAME}
# dhcp-timeout=90
# dns-search=
# method=disabled
# [proxy]
EOF
# https://access.redhat.com/solutions/221403
# VAR_PWD_HASH="$(openssl passwd -1 -salt 'openshift' 'redhat')"
VAR_PWD_HASH="$(python3 -c 'import crypt,getpass; print(crypt.crypt("redhat"))')"
tmppath=$(mktemp)
butane /data/sno/static.ip.bu \
| python3 -c 'import json, yaml, sys; print(json.dumps(yaml.load(sys.stdin)))' \
| jq '.spec.config | .ignition.version = "3.1.0" ' \
| jq --arg VAR "$VAR_PWD_HASH" --arg VAR_SSH "$NODE_SSH_KEY" '.passwd.users += [{ "name": "wzh", "system": true, "passwordHash": $VAR , "sshAuthorizedKeys": [ $VAR_SSH ], "groups": [ "adm", "wheel", "sudo", "systemd-journal" ] }]' \
| jq --argjson VAR "$VAR_99_master_chrony" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_worker_chrony" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_container_registries" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_worker_container_registries" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_install_images" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_install_crts" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_chrony_2" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_container_registries_2" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_install_images_2" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_install_crts_2" '.storage.files += [$VAR] ' \
| jq -c . \
> ${tmppath}
VAR_IGNITION=$(cat ${tmppath})
rm -f ${tmppath}
bottom
openshift 4.6 static-ip disconnected baremetal install, including operator hub
Installation video
This article describes how to install ocp4.6 on baremetal (simulated with kvm) with static ip addresses, including the operator hub steps.
Offline installation package download
The ocp4 offline installation package is prepared differently from the 3.11 one; prepare it as described below. Also, because the default baremetal flow needs a dhcp/pxe environment, we prepare a helper machine that runs dhcp, tftp, haproxy and similar services. To make on-site work easier, a tool for editing ignition files is included as well, so the offline package carries a few extra third-party tools.
https://github.com/wangzheng422/ocp4-upi-helpernode this project is used to build the helper machine.
https://github.com/wangzheng422/filetranspiler this project is used to modify ignition files.
The packaged installer can be downloaded here, from Baidu disk; the version is 4.6.5:
link: https://pan.baidu.com/s/1-5QWpayV2leinq4DOtiFEg password: gjoe
It contains the following kinds of files:
- ocp4.tgz contains the iso and other installation media, the various installation scripts, the full list of downloaded images, and so on. It needs to be copied to the host machine and to the helper machine.
- registry.tgz is the packed docker image registry storage. If you need to add more images first, follow: 4.6.add.image.md
- install.image.tgz contains the extra images needed while installing the cluster.
- rhel-data.7.9.tgz is the yum repository for rhel 7 hosts; it is this large because it also contains gpu, epel and other packages. It is mainly used for the host machine, the helper machine, and rhel worker nodes.
Merge the split files with a command like this
cat registry.?? > registry.tgz
Prepare the offline installation source on a cloud host with internet access
The documentation for preparing the offline installation media has moved here: 4.6.build.dist.md
Host machine preparation
This lab runs on a single 32C, 256G host, installing and testing with a number of virtual machines, so we prepare that host first.
If you use multiple host machines, remember to align their clocks, otherwise the certificates will run into trouble.
The main preparation steps are
- configure the yum repository
- configure dns
- install the image registry
- set up a vnc environment
- configure the network needed by kvm
- create the helper kvm
- configure an haproxy that brings traffic from outside into the kvm network (see the sketch below)
Of the above, the dns part needs to be adapted to the actual project environment.
The host machine in this lab is a rhel7 box.
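The host-level haproxy mentioned in the list above is not shown elsewhere in this article; the following is only a rough sketch of what it could look like, assuming you just want to expose the cluster API and ingress on the host's public interface, and that the helper at 192.168.7.11 already balances those ports into the cluster (as set up by ocp4-upi-helpernode):
yum -y install haproxy
# selinux may require: setsebool -P haproxy_connect_any 1
cat << EOF >> /etc/haproxy/haproxy.cfg
frontend ocp4-api-in
    bind *:6443
    mode tcp
    default_backend ocp4-api
backend ocp4-api
    mode tcp
    server helper 192.168.7.11:6443
frontend ocp4-ingress-in
    bind *:443
    mode tcp
    default_backend ocp4-ingress
backend ocp4-ingress
    mode tcp
    server helper 192.168.7.11:443
EOF
systemctl enable --now haproxy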
cat << EOF >> /etc/hosts
127.0.0.1 registry.ocp4.redhat.ren
EOF
# prepare the yum repository
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
cat << EOF > /etc/yum.repos.d/remote.repo
[remote]
name=RHEL FTP
baseurl=ftp://127.0.0.1/data
enabled=1
gpgcheck=0
EOF
yum clean all
yum repolist
yum -y install byobu htop
systemctl disable --now firewalld
# configure the registry
mkdir -p /etc/crts/ && cd /etc/crts
openssl req \
-newkey rsa:2048 -nodes -keyout redhat.ren.key \
-x509 -days 3650 -out redhat.ren.crt -subj \
"/C=CN/ST=GD/L=SZ/O=Global Security/OU=IT Department/CN=*.ocp4.redhat.ren" \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf "[SAN]\nsubjectAltName=DNS:registry.ocp4.redhat.ren,DNS:*.ocp4.redhat.ren,DNS:*.redhat.ren"))
/bin/cp -f /etc/crts/redhat.ren.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
cd /data
mkdir -p /data/registry
# tar zxf registry.tgz
yum -y install podman docker-distribution pigz skopeo
# pigz -dc registry.tgz | tar xf -
cat << EOF > /etc/docker-distribution/registry/config.yml
version: 0.1
log:
fields:
service: registry
storage:
cache:
layerinfo: inmemory
filesystem:
rootdirectory: /data/4.6.5/registry
delete:
enabled: true
http:
addr: :5443
tls:
certificate: /etc/crts/redhat.ren.crt
key: /etc/crts/redhat.ren.key
compatibility:
schema1:
enabled: true
EOF
# systemctl restart docker
# systemctl stop docker-distribution
systemctl enable --now docker-distribution
# systemctl restart docker-distribution
# podman login registry.redhat.ren:5443 -u a -p a
# firewall-cmd --permanent --add-port=5443/tcp
# firewall-cmd --reload
# load more images
# unpack ocp4.tgz first
bash add.image.load.sh /data/4.6.5/install.image 'registry.ocp4.redhat.ren:5443'
# https://github.com/christianh814/ocp4-upi-helpernode/blob/master/docs/quickstart.md
# prepare the vnc environment
yum -y install tigervnc-server tigervnc gnome-terminal gnome-session \
gnome-classic-session gnome-terminal nautilus-open-terminal \
control-center liberation-mono-fonts google-noto-sans-cjk-fonts \
google-noto-sans-fonts fonts-tweak-tool
yum install -y qgnomeplatform xdg-desktop-portal-gtk \
NetworkManager-libreswan-gnome PackageKit-command-not-found \
PackageKit-gtk3-module abrt-desktop at-spi2-atk at-spi2-core \
avahi baobab caribou caribou-gtk2-module caribou-gtk3-module \
cheese compat-cheese314 control-center dconf empathy eog \
evince evince-nautilus file-roller file-roller-nautilus \
firewall-config firstboot fprintd-pam gdm gedit glib-networking \
gnome-bluetooth gnome-boxes gnome-calculator gnome-classic-session \
gnome-clocks gnome-color-manager gnome-contacts gnome-dictionary \
gnome-disk-utility gnome-font-viewer gnome-getting-started-docs \
gnome-icon-theme gnome-icon-theme-extras gnome-icon-theme-symbolic \
gnome-initial-setup gnome-packagekit gnome-packagekit-updater \
gnome-screenshot gnome-session gnome-session-xsession \
gnome-settings-daemon gnome-shell gnome-software gnome-system-log \
gnome-system-monitor gnome-terminal gnome-terminal-nautilus \
gnome-themes-standard gnome-tweak-tool nm-connection-editor orca \
redhat-access-gui sane-backends-drivers-scanners seahorse \
setroubleshoot sushi totem totem-nautilus vinagre vino \
xdg-user-dirs-gtk yelp
yum install -y cjkuni-uming-fonts dejavu-sans-fonts \
dejavu-sans-mono-fonts dejavu-serif-fonts gnu-free-mono-fonts \
gnu-free-sans-fonts gnu-free-serif-fonts \
google-crosextra-caladea-fonts google-crosextra-carlito-fonts \
google-noto-emoji-fonts jomolhari-fonts khmeros-base-fonts \
liberation-mono-fonts liberation-sans-fonts liberation-serif-fonts \
lklug-fonts lohit-assamese-fonts lohit-bengali-fonts \
lohit-devanagari-fonts lohit-gujarati-fonts lohit-kannada-fonts \
lohit-malayalam-fonts lohit-marathi-fonts lohit-nepali-fonts \
lohit-oriya-fonts lohit-punjabi-fonts lohit-tamil-fonts \
lohit-telugu-fonts madan-fonts nhn-nanum-gothic-fonts \
open-sans-fonts overpass-fonts paktype-naskh-basic-fonts \
paratype-pt-sans-fonts sil-abyssinica-fonts sil-nuosu-fonts \
sil-padauk-fonts smc-meera-fonts stix-fonts \
thai-scalable-waree-fonts ucs-miscfixed-fonts vlgothic-fonts \
wqy-microhei-fonts wqy-zenhei-fonts
vncpasswd
cat << EOF > ~/.vnc/xstartup
#!/bin/sh
unset SESSION_MANAGER
unset DBUS_SESSION_BUS_ADDRESS
vncconfig &
gnome-session &
EOF
chmod +x ~/.vnc/xstartup
vncserver :1 -geometry 1280x800
# if you want to stop the vnc server, do this
vncserver -kill :1
# firewall-cmd --permanent --add-port=6001/tcp
# firewall-cmd --permanent --add-port=5901/tcp
# firewall-cmd --reload
# connect vnc at port 5901
# export DISPLAY=:1
# https://www.cyberciti.biz/faq/how-to-install-kvm-on-centos-7-rhel-7-headless-server/
# configure the kvm environment
yum -y install qemu-kvm libvirt libvirt-python libguestfs-tools virt-install virt-viewer virt-manager
systemctl enable libvirtd
systemctl start libvirtd
lsmod | grep -i kvm
brctl show
virsh net-list
virsh net-dumpxml default
# 创建实验用虚拟网络
cat << EOF > /data/virt-net.xml
<network>
<name>openshift4</name>
<forward mode='nat'>
<nat>
<port start='1024' end='65535'/>
</nat>
</forward>
<bridge name='openshift4' stp='on' delay='0'/>
<domain name='openshift4'/>
<ip address='192.168.7.1' netmask='255.255.255.0'>
</ip>
</network>
EOF
virsh net-define --file /data/virt-net.xml
virsh net-autostart openshift4
virsh net-start openshift4
# restore back
virsh net-destroy openshift4
virsh net-undefine openshift4
# 创建工具机
mkdir -p /data/kvm
cd /data/kvm
lvremove -f datavg/helperlv
lvcreate -y -L 430G -n helperlv datavg
virt-install --name="ocp4-aHelper" --vcpus=2 --ram=4096 \
--disk path=/dev/datavg/helperlv,device=disk,bus=virtio,format=raw \
--os-variant centos7.0 --network network=openshift4,model=virtio \
--boot menu=on --location /data/kvm/rhel-server-7.8-x86_64-dvd.iso \
--initrd-inject helper-ks.cfg --extra-args "inst.ks=file:/helper-ks.cfg"
# virt-viewer --domain-name ocp4-aHelper
# virsh start ocp4-aHelper
# virsh list --all
# start chrony/ntp server on host
cat << EOF > /etc/chrony.conf
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
allow 192.0.0.0/8
local stratum 10
logdir /var/log/chrony
EOF
systemctl enable --now chronyd
# systemctl restart chronyd
chronyc tracking
chronyc sources -v
chronyc sourcestats -v
chronyc makestep
Helper node preparation
The following installation steps are performed inside the helper node.
The main tasks are:
- configure the yum repository
- run the ansible playbook that configures the helper node automatically
- upload the customized installation config file
- generate the ignition files
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
systemctl restart sshd
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
# in helper node
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak/
cat << EOF > /etc/yum.repos.d/remote.repo
[remote]
name=RHEL FTP
baseurl=ftp://192.168.7.1/data
enabled=1
gpgcheck=0
EOF
yum clean all
yum repolist
yum -y install ansible git unzip podman python36
mkdir -p /data/ocp4/
# scp ocp4.tgz to /data
cd /data
tar zvxf ocp4.tgz
cd /data/ocp4
# 这里使用了一个ansible的项目,用来部署helper节点的服务。
# https://github.com/wangzheng422/ocp4-upi-helpernode
unzip ocp4-upi-helpernode.zip
# 这里使用了一个ignition文件合并的项目,用来帮助自定义ignition文件。
# https://github.com/wangzheng422/filetranspiler
podman load -i filetranspiler.tgz
# 接下来,我们使用ansible来配置helper节点,装上各种openshift集群需要的服务
# 根据现场环境,修改 ocp4-upi-helpernode-master/vars-static.yaml
# 主要是修改各个节点的网卡和硬盘参数,还有IP地址
cd /data/ocp4/ocp4-upi-helpernode-master
ansible-playbook -e @vars-static.yaml -e '{staticips: true}' tasks/main.yml
# try this:
/usr/local/bin/helpernodecheck
mkdir -p /data/install
# GOTO image registry host
# copy crt files to helper node
scp /etc/crts/redhat.ren.ca.crt root@192.168.7.11:/data/install/
scp /etc/crts/redhat.ren.crt root@192.168.7.11:/data/install/
scp /etc/crts/redhat.ren.key root@192.168.7.11:/data/install/
# GO back to help node
/bin/cp -f /data/install/redhat.ren.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
# 定制ignition
cd /data/install
# 根据现场环境,修改 install-config.yaml
# 至少要修改ssh key, 还有 additionalTrustBundle,这个是镜像仓库的csr
# vi install-config.yaml
cat << EOF > /data/install/install-config.yaml
apiVersion: v1
baseDomain: redhat.ren
compute:
- hyperthreading: Enabled
name: worker
replicas: 3
controlPlane:
hyperthreading: Enabled
name: master
replicas: 3
metadata:
name: ocp4
networking:
clusterNetworks:
- cidr: 10.254.0.0/16
hostPrefix: 24
networkType: OpenShiftSDN
serviceNetwork:
- 172.30.0.0/16
platform:
none: {}
pullSecret: '{"auths":{"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ppa.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"}}}'
sshKey: |
$( cat /root/.ssh/helper_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /data/install/redhat.ren.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- registry.ocp4.redhat.ren:5443/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- registry.ocp4.redhat.ren:5443/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF
cd /data/install/
/bin/rm -rf *.ign .openshift_install_state.json auth bootstrap manifests master*[0-9] worker*[0-9]
openshift-install create ignition-configs --dir=/data/install
cd /data/ocp4/ocp4-upi-helpernode-master
# 我们来为每个主机,复制自己版本的ign,并复制到web server的目录下
ansible-playbook -e @vars-static.yaml -e '{staticips: true}' tasks/ign.yml
# 如果对每个主机有自己ign的独特需求,在这一步,去修改ign。
# 以下操作本来是想设置网卡地址,但是实践发现是不需要的。
# 保留在这里,是因为他可以在安装的时候注入文件,非常有用。
# mkdir -p bootstrap/etc/sysconfig/network-scripts/
# cat <<EOF > bootstrap/etc/sysconfig/network-scripts/ifcfg-ens3
# DEVICE=ens3
# BOOTPROTO=none
# ONBOOT=yes
# IPADDR=192.168.7.12
# NETMASK=255.255.255.0
# GATEWAY=192.168.7.1
# DNS=192.168.7.11
# DNS1=192.168.7.11
# DNS2=192.168.7.1
# DOMAIN=redhat.ren
# PREFIX=24
# DEFROUTE=yes
# IPV6INIT=no
# EOF
# filetranspiler -i bootstrap.ign -f bootstrap -o bootstrap-static.ign
# /bin/cp -f bootstrap-static.ign /var/www/html/ignition/
# 我们为每个节点创建各自的iso文件
cd /data/ocp4/ocp4-upi-helpernode-master
ansible-playbook -e @vars-static.yaml -e '{staticips: true}' tasks/iso.yml
Back on the KVM host
In theory we could start the installation at this point, but installing CoreOS by hand means typing a very long kernel command line. In practice it is almost impossible to type it correctly; one wrong character and the installation fails, forcing a reboot and another attempt.
To avoid that tedium, following approaches documented online, we build a customized iso for each host. Fortunately, the ansible playbook in the previous step already created the required isos; copy them to the KVM host and continue.
One pitfall: we do not know the NIC names of the hosts in advance. Boot the CoreOS iso once, drop into the live shell, and run ip a to find out; it is usually ens3.
Likewise, on physical machines you need the same trick to find the actual disk device. It is also recommended to install RHEL 8 on the physical machine first to confirm that it can run CoreOS at all. If the installer fails to write to disk on bare metal, try adding the boot parameter ignition.firstboot=1. A minimal check from the live environment is sketched below.
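As a minimal sketch, assuming you have booted the stock rhcos live iso and reached the live shell, these commands show the needed information (the device names are only examples and vary per machine):
# run inside the CoreOS live environment
ip -br a                              # list NICs in brief form, e.g. ens3
lsblk -d -o NAME,SIZE,TYPE,MODEL      # list whole disks, e.g. vda or sda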
# on kvm host
export KVM_DIRECTORY=/data/kvm
cd ${KVM_DIRECTORY}
scp root@192.168.7.11:/data/install/*.iso ${KVM_DIRECTORY}/
create_lv() {
var_name=$1
lvremove -f datavg/$var_name
lvcreate -y -L 120G -n $var_name datavg
# wipefs --all --force /dev/datavg/$var_name
}
create_lv bootstraplv
create_lv master0lv
create_lv master1lv
create_lv master2lv
create_lv worker0lv
create_lv worker1lv
create_lv worker2lv
# finally, we can start install :)
# 你可以一口气把虚拟机都创建了,然后喝咖啡等着。
# 从这一步开始,到安装完毕,大概30分钟。
virt-install --name=ocp4-bootstrap --vcpus=4 --ram=8192 \
--disk path=/dev/datavg/bootstraplv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-bootstrap.iso
# 想登录进coreos一探究竟?那么这么做
# ssh core@bootstrap
# journalctl -b -f -u bootkube.service
virt-install --name=ocp4-master0 --vcpus=4 --ram=16384 \
--disk path=/dev/datavg/master0lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-master-0.iso
# ssh core@192.168.7.13
virt-install --name=ocp4-master1 --vcpus=4 --ram=16384 \
--disk path=/dev/datavg/master1lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-master-1.iso
virt-install --name=ocp4-master2 --vcpus=4 --ram=16384 \
--disk path=/dev/datavg/master2lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-master-2.iso
virt-install --name=ocp4-worker0 --vcpus=4 --ram=32768 \
--disk path=/dev/datavg/worker0lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-worker-0.iso
virt-install --name=ocp4-worker1 --vcpus=4 --ram=16384 \
--disk path=/dev/datavg/worker1lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-worker-1.iso
virt-install --name=ocp4-worker2 --vcpus=4 --ram=16384 \
--disk path=/dev/datavg/worker2lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-worker-2.iso
# on workstation
# open http://192.168.7.11:9000/
# to check
# if you want to stop or delete vm, try this
virsh list --all
virsh destroy ocp4-bootstrap
virsh destroy ocp4-master0
virsh destroy ocp4-master1
virsh destroy ocp4-master2
virsh destroy ocp4-worker0
virsh destroy ocp4-worker1
virsh destroy ocp4-worker2
virsh undefine ocp4-bootstrap
virsh undefine ocp4-master0
virsh undefine ocp4-master1
virsh undefine ocp4-master2
virsh undefine ocp4-worker0
virsh undefine ocp4-worker1
virsh undefine ocp4-worker2
On the helper node
At this point the installation has already started automatically; all we need to do is go back to the helper node and watch.
During the bootstrap and master installation phase, use this command to follow the progress.
cd /data/install
export KUBECONFIG=/data/install/auth/kubeconfig
echo "export KUBECONFIG=/data/install/auth/kubeconfig" >> ~/.bashrc
oc completion bash | sudo tee /etc/bash_completion.d/openshift > /dev/null
cd /data/install
openshift-install wait-for bootstrap-complete --log-level debug
If everything is working, you will see the bootstrap process complete.
Sometimes the certificates have expired. To verify, log in to the bootstrap node and check the expiry time. If they have indeed expired, clear all of the cached files generated by openshift-install and start over.
echo | openssl s_client -connect localhost:6443 | openssl x509 -noout -text | grep Not
Generally, if you deleted the cached files before running openshift-install as this document describes, the expiry problem will not occur. A sketch of what to clean up is shown below.
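For reference, this is the cleanup used elsewhere in this document before re-running openshift-install (a sketch; it assumes the /data/install working directory used throughout and removes the generated ignition files plus the hidden installer state):
cd /data/install
# remove generated ignition configs and the cached installer state before retrying
/bin/rm -rf *.ign .openshift_install_state.json .openshift_install.log auth bootstrap manifests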
oc get nodes
At this point you will only see the masters, because the worker CSRs have not been approved yet. If the VMs were all created in one go, you will most likely not run into the issue below.
oc get csr
You will find many pending (unapproved) CSRs.
Approve them:
yum -y install jq
oc get csr | grep -v Approved
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
# oc get csr -o name | xargs oc adm certificate approve
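If the workers keep submitting new CSRs while they join, a small convenience loop like the following (a sketch, not part of the original scripts, built on the same approve command as above) can keep approving them until all nodes are Ready:
# approve any pending CSRs every 30 seconds; stop with Ctrl-C when all workers are Ready
while true; do
  oc get csr -o name | xargs --no-run-if-empty oc adm certificate approve
  sleep 30
done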
The worker nodes' CPU usage will then spike, and after that the workers show up in oc get nodes.
Wait a while and you should see all nodes become Ready.
Once the steps above are done, the final part of the installation can complete.
openshift-install wait-for install-complete --log-level debug
# here is the output
# INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.ocp4.redhat.ren
# INFO Login to the console with user: "kubeadmin", and password: "7MXaT-vqouq-UukdG-uzNEi"
Our helper node provides NFS, so let's configure proper NFS-backed storage instead of using emptyDir.
bash /data/ocp4/ocp4-upi-helpernode-master/files/nfs-provisioner-setup.sh
# oc edit configs.imageregistry.operator.openshift.io
# 修改 storage 部分
# storage:
# pvc:
# claim:
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Managed","storage":{"pvc":{"claim":""}}}}' --type=merge
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Removed"}}' --type=merge
oc get clusteroperator image-registry
oc get configs.imageregistry.operator.openshift.io cluster -o yaml
# 把imagepruner给停掉
# https://bugzilla.redhat.com/show_bug.cgi?id=1852501#c24
# oc patch imagepruner.imageregistry/cluster --patch '{"spec":{"suspend":true}}' --type=merge
# oc -n openshift-image-registry delete jobs --all
oc get configs.samples.operator.openshift.io/cluster -o yaml
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Managed"}}' --type=merge
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Unmanaged"}}' --type=merge
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Removed"}}' --type=merge
Configure local DNS on your workstation (resolve *.apps.ocp4.redhat.ren to 192.168.7.11), pointing at the haproxy on the helper node; then you can open the web console in a browser. A hosts-file workaround is sketched below.
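On a workstation without a configurable DNS server, one quick workaround (a sketch; /etc/hosts cannot express the wildcard, so list the specific app hostnames you need, here the default console and oauth routes):
cat << EOF >> /etc/hosts
192.168.7.11 console-openshift-console.apps.ocp4.redhat.ren
192.168.7.11 oauth-openshift.apps.ocp4.redhat.ren
EOF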
chrony/NTP settings
In OCP 4.6, NTP synchronization needs to be configured. The ansible playbook we ran earlier already generated the NTP MachineConfig; just apply it to the cluster.
oc apply -f /data/ocp4/ocp4-upi-helpernode-master/machineconfig/
Operator Hub offline installation
https://docs.openshift.com/container-platform/4.2/operators/olm-restricted-networks.html
https://github.com/operator-framework/operator-registry
https://www.cnblogs.com/ericnie/p/11777384.html?from=timeline&isappinstalled=0
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.2/html-single/images/index
The operator hub preparation has two levels. The first level, described in this article, is building the offline operator hub resources and mirroring the operator images themselves. With that done, the disconnected OCP 4 cluster can show the Operator Hub and deploy operators. But when an operator is then used to deploy its operand components, the operator will try to pull additional images; those images must also be mirrored offline. Since every operator needs different images and there is no single place that lists them, each project site has to mirror them as needed. This project tries to download as many of the required images as possible, but omissions cannot be completely avoided at the moment.
# on helper node, 在工具机上
cd /data/ocp4
# scp /etc/crts/redhat.ren.crt 192.168.7.11:/root/ocp4/
# https://docs.openshift.com/container-platform/4.4/builds/setting-up-trusted-ca.html
oc project openshift-config
oc create configmap ca.for.registry -n openshift-config \
--from-file=registry.ocp4.redhat.ren..5443=/data/install/redhat.ren.crt
# 如果你想删除这个config map,这么做
# oc delete configmap ca.for.registry
oc patch image.config.openshift.io/cluster -p '{"spec":{"additionalTrustedCA":{"name":"ca.for.registry"}}}' --type=merge
# oc patch image.config.openshift.io/cluster -p '{"spec":{"registrySources":{"insecureRegistries":["registry.redhat.ren"]}}}' --type=merge
oc get image.config.openshift.io/cluster -o yaml
# 以下这个步骤是官网文档要做的,实践中发现,disconnected环境不需要
# oc patch OperatorHub cluster --type json -p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
# 如果你不小心还是照着官网做了,用如下步骤删掉
# oc patch OperatorHub cluster --type json -p '[{"op": "remove", "path": "/spec/disableAllDefaultSources"}]'
oc patch OperatorHub cluster --type json \
-p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
oc get OperatorHub cluster -o yaml
# yum -y install python36
# 根据项目现场情况,调整参数,运行以下命令,生成配置文件,指向内网镜像仓库
cd /data/ocp4/
bash image.registries.conf.sh registry.ocp4.redhat.ren:5443
# 由于某些ocp 4.2的更新机制,以下操作会触发集群更新,
# 集群节点会逐个重启,集群组件也会逐个重启,请等待集群重启完毕。
oc apply -f ./99-worker-container-registries.yaml -n openshift-config
oc apply -f ./99-master-container-registries.yaml -n openshift-config
# !!!正常情况,以下操作不需要!!!
# 以下操作,删除mirror镜像信息,也会触发集群更新操作,请等待集群重启完毕
oc delete -f ./99-worker-container-registries.yaml -n openshift-config
oc delete -f ./99-master-container-registries.yaml -n openshift-config
watch oc get machineconfigpools
watch oc get node
From the monitoring console you can see the nodes being updated and rebooted.
# on helper node
# params for operator hub images
export var_date='2020.11.23.0135'
echo $var_date
export var_major_version='4.6'
echo ${var_major_version}
export LOCAL_REG='registry.ocp4.redhat.ren:5443'
# 如果想看到redhat的operator,这样做
# 镜像源在 docker.io/wangzheng422/operator-catalog:redhat-$var_major_version-$var_date
# 后面的参数,去build.dist.sh文件里面,查看
# var_date 和 var_major_version 参数得到
cat <<EOF > redhat-operator-catalog.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: redhat-operators-catalog
namespace: openshift-marketplace
spec:
displayName: Red Hat Operators
sourceType: grpc
image: ${LOCAL_REG}/ocp4/operator-catalog:redhat-${var_major_version}-${var_date}
publisher: Red Hat
EOF
oc create -f redhat-operator-catalog.yaml
# 如果想看到certified的operator,这样做
# 镜像源在 docker.io/wangzheng422/operator-catalog:certified-$var_major_version-$var_date
# 后面的参数,去build.dist.sh文件里面,查看
# var_date 和 var_major_version 参数得到
cat <<EOF > certified-operator-catalog.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: certified-operator-catalog
namespace: openshift-marketplace
spec:
displayName: Certified Operators
sourceType: grpc
image: ${LOCAL_REG}/ocp4/operator-catalog:certified-${var_major_version}-${var_date}
publisher: Red Hat
EOF
oc create -f certified-operator-catalog.yaml
# 如果想看到community的operator,这样做
# 镜像源在 docker.io/wangzheng422/operator-catalog:community-$var_major_version-$var_date
# 后面的参数,去build.dist.sh文件里面,查看
# var_date 和 var_major_version 参数得到
cat <<EOF > community-operator-catalog.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: community-operator-catalog
namespace: openshift-marketplace
spec:
displayName: Community Operator
sourceType: grpc
image: ${LOCAL_REG}/ocp4/operator-catalog:community-${var_major_version}-${var_date}
publisher: Red Hat
EOF
oc create -f community-operator-catalog.yaml
cat <<EOF > marketplace-operator-catalog.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: redhat-marketplace-catalog
namespace: openshift-marketplace
spec:
displayName: Red Hat Marketplace
sourceType: grpc
image: ${LOCAL_REG}/ocp4/operator-catalog:redhat-marketplace-${var_major_version}-${var_date}
publisher: Red Hat
EOF
oc create -f marketplace-operator-catalog.yaml
# 想删除这些离线operator hub,就这样做。
# find . -name "*-operator-catalog.yaml" -exec oc delete -f {} \;
oc get pods -n openshift-marketplace
oc get catalogsource -n openshift-marketplace
oc get packagemanifest -n openshift-marketplace
You should now see the operator list in Operator Hub.
Deploying an operator from it should also succeed.
# set master and worker combine
# https://github.com/openshift-telco/openshift4x-poc/blob/master/MASTER-WORKER-COMBINED.md
oc edit schedulers cluster
# apiVersion: config.openshift.io/v1
# kind: Scheduler
# metadata:
# name: cluster
# spec:
# mastersSchedulable: true
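If you prefer a non-interactive command over oc edit, the same field can be set with a merge patch against the Scheduler/cluster resource shown in the comments above:
oc patch schedulers.config.openshift.io cluster --type merge -p '{"spec":{"mastersSchedulable":true}}'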
Other links
https://www.cnblogs.com/ericnie/p/11764124.html
Reference materials
https://blog.openshift.com/openshift-4-2-disconnected-install/
https://blog.openshift.com/openshift-4-bare-metal-install-quickstart/
https://github.com/christianh814/ocp4-upi-helpernode#ocp4-upi-helper-node-playbook
https://github.com/openshift/cluster-samples-operator/blob/master/manifests/image-references
https://github.com/e-minguez/ocp4-upi-bm-pxeless-staticips/blob/master/docs/12-post-installation.md
https://www.openshift.com/blog/deploying-a-upi-environment-for-openshift-4-1-on-vms-and-bare-metal
openshift 4.6 offline baremetal installation with static IPs, including operator hub
Installation walkthrough video
This article describes how to install ocp4.6 on baremetal (simulated with kvm) using static IPs, including the operator hub steps.
Offline installation package download
Downloading the ocp4 offline installation package is quite different from 3.11; prepare it as follows. Also, because the default baremetal flow needs a dhcp and pxe environment, we need a helper node that runs dhcp, tftp, haproxy and similar services. To make on-site work easier, a tool for modifying ignition files is also included, so the offline package bundles a few extra third-party tools.
https://github.com/wangzheng422/ocp4-upi-helpernode : this tool is used to build the helper node.
https://github.com/wangzheng422/filetranspiler : this tool is used to modify ignition files.
The packaged installation media can be downloaded here (Baidu pan link, version 4.6.28):
- Link: https://pan.baidu.com/s/1XFbiOAcz7nul-N9U0aDxHg password: 6qtt
It contains the following types of files:
- ocp4.tgz: contains the isos and other installation media, the installation scripts, the full list of downloaded images, and so on. It needs to be copied to the KVM host and to the helper node.
- registry.tgz: the packaged docker image registry storage. If you need to add images first, follow: 4.6.add.image.md
- install.image.tgz: extra images needed while installing the cluster.
- rhel-data.7.9.tgz: the yum repositories for rhel 7 hosts. It is large because it also contains gpu, epel and other content. It is mainly used for the KVM host, the helper node, and rhel compute nodes.
To merge the split files, use a command like the following:
cat registry.?? > registry.tgz
Preparing the offline installation source on an internet-facing cloud host
The documentation for preparing the offline installation media has moved here: 4.6.build.dist.md
KVM host preparation
This lab runs the installation test with many VMs on a single 32C, 256G host, so prepare that host first.
If you use multiple hosts, make sure their clocks are aligned; otherwise the certificates will run into problems.
The main preparation tasks are:
- configure the yum repository
- configure dns
- install the image registry
- set up the vnc environment
- configure the network needed by kvm
- create the helper kvm
- configure an haproxy that forwards external traffic into the kvm network (see the sketch after this list)
For the preparation above, the dns part needs to be adjusted to the actual project environment.
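The haproxy forwarding mentioned in the list is not spelled out later in this document, so here is a minimal sketch on the KVM host. It assumes the cluster API and ingress are reachable via the helper node at 192.168.7.11 and simply passes TCP through; adjust the ports and addresses to the site and make sure nothing else already uses them:
dnf -y install haproxy
cat << EOF >> /etc/haproxy/haproxy.cfg
frontend ocp4-api
    bind *:6443
    mode tcp
    default_backend ocp4-api
backend ocp4-api
    mode tcp
    server helper 192.168.7.11:6443 check
frontend ocp4-ingress-https
    bind *:443
    mode tcp
    default_backend ocp4-ingress-https
backend ocp4-ingress-https
    mode tcp
    server helper 192.168.7.11:443 check
EOF
# if SELinux blocks binding the non-standard port (an assumption; check audit logs first):
# semanage port -a -t http_port_t -p tcp 6443
systemctl enable --now haproxy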
The host used here is a rhel8 machine; see rhel8.build.kernel.repo.cache.md for the offline repo and other basic configuration.
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
cat << EOF >> /etc/hosts
127.0.0.1 registry.ocp4.redhat.ren
EOF
dnf clean all
dnf repolist
dnf -y install byobu htop
systemctl disable --now firewalld
# 配置registry
mkdir -p /etc/crts/ && cd /etc/crts
openssl req \
-newkey rsa:2048 -nodes -keyout redhat.ren.key \
-x509 -days 3650 -out redhat.ren.crt -subj \
"/C=CN/ST=GD/L=SZ/O=Global Security/OU=IT Department/CN=*.ocp4.redhat.ren" \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf "[SAN]\nsubjectAltName=DNS:registry.ocp4.redhat.ren,DNS:*.ocp4.redhat.ren,DNS:*.redhat.ren"))
/bin/cp -f /etc/crts/redhat.ren.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
cd /data
mkdir -p /data/registry
# tar zxf registry.tgz
dnf -y install podman pigz skopeo jq
# pigz -dc registry.tgz | tar xf -
cd /data/ocp4
podman load -i /data/ocp4/registry.tgz
podman run --name local-registry -p 5443:5000 \
-d --restart=always \
-v /data/registry/:/var/lib/registry:z \
-v /etc/crts:/certs:z \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/redhat.ren.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/redhat.ren.key \
docker.io/library/registry:2
# firewall-cmd --permanent --add-port=5443/tcp
# firewall-cmd --reload
# 加载更多的镜像
# 解压缩 ocp4.tgz
bash add.image.load.sh /data/install.image 'registry.ocp4.redhat.ren:5443'
# https://github.com/christianh814/ocp4-upi-helpernode/blob/master/docs/quickstart.md
# 准备vnc环境
vncpasswd
cat << EOF > ~/.vnc/config
session=gnome
securitytypes=vncauth,tlsvnc
desktop=sandbox
geometry=1280x800
alwaysshared
EOF
cat << EOF >> /etc/tigervnc/vncserver.users
:1=root
EOF
systemctl start vncserver@:1
# 如果你想停掉vnc server,这么做
systemctl stop vncserver@:1
# firewall-cmd --permanent --add-port=6001/tcp
# firewall-cmd --permanent --add-port=5901/tcp
# firewall-cmd --reload
# connect vnc at port 5901
# export DISPLAY=:1
# 创建实验用虚拟网络
cat << EOF > /data/kvm/virt-net.xml
<network>
<name>openshift4</name>
<forward mode='nat'>
<nat>
<port start='1024' end='65535'/>
</nat>
</forward>
<bridge name='openshift4' stp='on' delay='0'/>
<domain name='openshift4'/>
<ip address='192.168.7.1' netmask='255.255.255.0'>
</ip>
</network>
EOF
virsh net-define --file /data/kvm/virt-net.xml
virsh net-autostart openshift4
virsh net-start openshift4
# restore back
virsh net-destroy openshift4
virsh net-undefine openshift4
# 创建工具机
mkdir -p /data/kvm
cd /data/kvm
lvremove -f rhel/helperlv
lvcreate -y -L 200G -n helperlv rhel
virt-install --name="ocp4-aHelper" --vcpus=2 --ram=4096 \
--disk path=/dev/rhel/helperlv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --location /data/kvm/rhel-8.3-x86_64-dvd.iso \
--initrd-inject helper-ks-rhel8.cfg --extra-args "inst.ks=file:/helper-ks-rhel8.cfg"
# restore kvm
virsh destroy ocp4-aHelper
virsh undefine ocp4-aHelper
# virt-viewer --domain-name ocp4-aHelper
# virsh start ocp4-aHelper
# virsh list --all
# start chrony/ntp server on host
/bin/cp -f /etc/chrony.conf /etc/chrony.conf.default
cat << EOF > /etc/chrony.conf
# pool 2.rhel.pool.ntp.org iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
allow 192.0.0.0/8
local stratum 10
logdir /var/log/chrony
EOF
systemctl enable --now chronyd
# systemctl restart chronyd
chronyc tracking
chronyc sources -v
chronyc sourcestats -v
chronyc makestep
# setup ftp data root
mount --bind /data/dnf /var/ftp/dnf
chcon -R -t public_content_t /var/ftp/dnf
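The remote.repo files below point at ftp://192.168.7.1, so an FTP service must be listening on the KVM host. If the base setup referenced above did not already provide one, a minimal sketch using vsftpd (an assumed choice; adjust to whatever FTP server the host actually uses) is:
dnf -y install vsftpd
# allow anonymous read-only access so the helper node can fetch packages from /var/ftp
sed -i 's/^anonymous_enable=.*/anonymous_enable=YES/' /etc/vsftpd/vsftpd.conf
systemctl enable --now vsftpd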
Helper node preparation
The following installation steps are performed inside the helper node.
The main tasks are:
- configure the yum repository
- run the ansible playbook that configures the helper node automatically
- upload the customized installation config file
- generate the ignition files
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
systemctl restart sshd
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
# in helper node
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
export YUMIP="192.168.7.1"
cat << EOF > /etc/yum.repos.d/remote.repo
[remote-epel]
name=epel
baseurl=ftp://${YUMIP}/dnf/epel
enabled=1
gpgcheck=0
[remote-epel-modular]
name=epel-modular
baseurl=ftp://${YUMIP}/dnf/epel-modular
enabled=1
gpgcheck=0
[remote-appstream]
name=appstream
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-appstream-rpms
enabled=1
gpgcheck=0
[remote-baseos]
name=baseos
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-baseos-rpms
enabled=1
gpgcheck=0
[remote-baseos-source]
name=baseos-source
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-baseos-source-rpms
enabled=1
gpgcheck=0
[remote-supplementary]
name=supplementary
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-supplementary-rpms
enabled=1
gpgcheck=0
[remote-codeready-builder]
name=supplementary
baseurl=ftp://${YUMIP}/dnf/codeready-builder-for-rhel-8-x86_64-rpms
enabled=1
gpgcheck=0
EOF
yum clean all
yum makecache
yum repolist
yum -y install ansible git unzip podman python3
yum -y update
reboot
# yum -y install ansible git unzip podman python36
mkdir -p /data/ocp4/
# scp ocp4.tgz to /data
# scp /data/down/ocp4.tgz root@192.168.7.11:/data/
cd /data
tar zvxf ocp4.tgz
cd /data/ocp4
# 这里使用了一个ansible的项目,用来部署helper节点的服务。
# https://github.com/wangzheng422/ocp4-upi-helpernode
unzip ocp4-upi-helpernode.zip
# 这里使用了一个ignition文件合并的项目,用来帮助自定义ignition文件。
# https://github.com/wangzheng422/filetranspiler
podman load -i filetranspiler.tgz
# 接下来,我们使用ansible来配置helper节点,装上各种openshift集群需要的服务
# 根据现场环境,修改 ocp4-upi-helpernode-master/vars-static.yaml
# 主要是修改各个节点的网卡和硬盘参数,还有IP地址
cd /data/ocp4/ocp4-upi-helpernode-master
ansible-playbook -e @vars-static.rhel8.yaml -e '{staticips: true}' tasks/main.yml
# try this:
/usr/local/bin/helpernodecheck
mkdir -p /data/install
# GOTO image registry host
# copy crt files to helper node
scp /etc/crts/redhat.ren.ca.crt root@192.168.7.11:/data/install/
scp /etc/crts/redhat.ren.crt root@192.168.7.11:/data/install/
scp /etc/crts/redhat.ren.key root@192.168.7.11:/data/install/
# GO back to help node
/bin/cp -f /data/install/redhat.ren.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
# 定制ignition
cd /data/install
# 根据现场环境,修改 install-config.yaml
# 至少要修改ssh key, 还有 additionalTrustBundle,这个是镜像仓库的csr
# vi install-config.yaml
cat << EOF > /data/install/install-config.yaml
apiVersion: v1
baseDomain: redhat.ren
compute:
- hyperthreading: Enabled
name: worker
replicas: 3
controlPlane:
hyperthreading: Enabled
name: master
replicas: 3
metadata:
name: ocp4
networking:
clusterNetworks:
- cidr: 10.254.0.0/16
hostPrefix: 24
networkType: OpenShiftSDN
serviceNetwork:
- 172.30.0.0/16
platform:
none: {}
pullSecret: '{"auths":{"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ppa.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"}}}'
sshKey: |
$( cat /root/.ssh/helper_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /data/install/redhat.ren.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- registry.ocp4.redhat.ren:5443/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- registry.ocp4.redhat.ren:5443/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF
cd /data/install/
/bin/rm -rf *.ign .openshift_install_state.json auth bootstrap manifests master*[0-9] worker*[0-9]
openshift-install create ignition-configs --dir=/data/install
cd /data/ocp4/ocp4-upi-helpernode-master
# 我们来为每个主机,复制自己版本的ign,并复制到web server的目录下
ansible-playbook -e @vars-static.rhel8.yaml -e '{staticips: true}' tasks/ign.yml
# 如果对每个主机有自己ign的独特需求,在这一步,去修改ign。
# 以下操作本来是想设置网卡地址,但是实践发现是不需要的。
# 保留在这里,是因为他可以在安装的时候注入文件,非常有用。
# mkdir -p bootstrap/etc/sysconfig/network-scripts/
# cat <<EOF > bootstrap/etc/sysconfig/network-scripts/ifcfg-ens3
# DEVICE=ens3
# BOOTPROTO=none
# ONBOOT=yes
# IPADDR=192.168.7.12
# NETMASK=255.255.255.0
# GATEWAY=192.168.7.1
# DNS=192.168.7.11
# DNS1=192.168.7.11
# DNS2=192.168.7.1
# DOMAIN=redhat.ren
# PREFIX=24
# DEFROUTE=yes
# IPV6INIT=no
# EOF
# filetranspiler -i bootstrap.ign -f bootstrap -o bootstrap-static.ign
# /bin/cp -f bootstrap-static.ign /var/www/html/ignition/
# 我们为每个节点创建各自的iso文件
cd /data/ocp4/ocp4-upi-helpernode-master
ansible-playbook -e @vars-static.rhel8.yaml -e '{staticips: true}' tasks/iso.yml
Back on the KVM host
In theory we could start the installation at this point, but installing CoreOS by hand means typing a very long kernel command line. In practice it is almost impossible to type it correctly; one wrong character and the installation fails, forcing a reboot and another attempt.
To avoid that tedium, following approaches documented online, we build a customized iso for each host. Fortunately, the ansible playbook in the previous step already created the required isos; copy them to the KVM host and continue.
One pitfall: we do not know the NIC names of the hosts in advance. Boot the CoreOS iso once, drop into the live shell, and run ip a to find out; it is usually ens3.
Likewise, on physical machines you need the same trick to find the actual disk device. It is also recommended to install RHEL 8 on the physical machine first to confirm that it can run CoreOS at all. If the installer fails to write to disk on bare metal, try adding the boot parameter ignition.firstboot=1.
# on kvm host
export KVM_DIRECTORY=/data/kvm
cd ${KVM_DIRECTORY}
scp root@192.168.7.11:/data/install/*.iso ${KVM_DIRECTORY}/
create_lv() {
var_vg=$1
var_lv=$2
lvremove -f $var_vg/$var_lv
lvcreate -y -L 120G -n $var_lv $var_vg
# wipefs --all --force /dev/datavg/$var_name
}
create_lv rhel bootstraplv
create_lv nvme master0lv
create_lv nvme master1lv
create_lv nvme master2lv
create_lv rhel worker0lv
create_lv rhel worker1lv
create_lv rhel worker2lv
# finally, we can start install :)
# 你可以一口气把虚拟机都创建了,然后喝咖啡等着。
# 从这一步开始,到安装完毕,大概30分钟。
virt-install --name=ocp4-bootstrap --vcpus=4 --ram=8192 \
--disk path=/dev/rhel/bootstraplv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-bootstrap.iso
# 想登录进coreos一探究竟?那么这么做
# ssh core@bootstrap
# journalctl -b -f -u bootkube.service
virt-install --name=ocp4-master0 --vcpus=4 --ram=16384 \
--disk path=/dev/nvme/master0lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-master-0.iso
# ssh core@192.168.7.13
virt-install --name=ocp4-master1 --vcpus=4 --ram=16384 \
--disk path=/dev/nvme/master1lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-master-1.iso
virt-install --name=ocp4-master2 --vcpus=4 --ram=16384 \
--disk path=/dev/nvme/master2lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-master-2.iso
virt-install --name=ocp4-worker0 --vcpus=4 --ram=32768 \
--disk path=/dev/rhel/worker0lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-worker-0.iso
virt-install --name=ocp4-worker1 --vcpus=4 --ram=16384 \
--disk path=/dev/rhel/worker1lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-worker-1.iso
virt-install --name=ocp4-worker2 --vcpus=4 --ram=16384 \
--disk path=/dev/rhel/worker2lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-worker-2.iso
# on workstation
# open http://192.168.7.11:9000/
# to check
# if you want to stop or delete vm, try this
virsh list --all
virsh destroy ocp4-bootstrap
virsh destroy ocp4-master0
virsh destroy ocp4-master1
virsh destroy ocp4-master2
virsh destroy ocp4-worker0
virsh destroy ocp4-worker1
virsh destroy ocp4-worker2
virsh undefine ocp4-bootstrap
virsh undefine ocp4-master0
virsh undefine ocp4-master1
virsh undefine ocp4-master2
virsh undefine ocp4-worker0
virsh undefine ocp4-worker1
virsh undefine ocp4-worker2
On the helper node
At this point the installation has already started automatically; all we need to do is go back to the helper node and watch.
During the bootstrap and master installation phase, use this command to follow the progress.
cd /data/ocp4
export KUBECONFIG=/data/install/auth/kubeconfig
echo "export KUBECONFIG=/data/install/auth/kubeconfig" >> ~/.bashrc
oc completion bash | sudo tee /etc/bash_completion.d/openshift > /dev/null
cd /data/install
openshift-install wait-for bootstrap-complete --log-level debug
If everything is working, you will see the bootstrap process complete.
Sometimes the certificates have expired. To verify, log in to the bootstrap node and check the expiry time. If they have indeed expired, clear all of the cached files generated by openshift-install and start over.
echo | openssl s_client -connect localhost:6443 | openssl x509 -noout -text | grep Not
Generally, if you deleted the cached files before running openshift-install as described earlier, the expiry problem will not occur.
oc get nodes
At this point you will only see the masters, because the worker CSRs have not been approved yet. If the VMs were all created in one go, you will most likely not run into the issue below.
oc get csr
You will find many pending (unapproved) CSRs.
Approve them:
yum -y install jq
oc get csr | grep -v Approved
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
# oc get csr -o name | xargs oc adm certificate approve
The worker nodes' CPU usage will then spike, and after that the workers show up in oc get nodes.
Wait a while and you should see all nodes become Ready.
Once the steps above are done, the final part of the installation can complete.
openshift-install wait-for install-complete --log-level debug
# here is the output
# INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.ocp4.redhat.ren
# INFO Login to the console with user: "kubeadmin", and password: "6yL7t-uDCaN-6grKP-VtYkx"
Our helper node provides NFS, so let's configure proper NFS-backed storage instead of using emptyDir.
bash /data/ocp4/ocp4-upi-helpernode-master/files/nfs-provisioner-setup.sh
# oc edit configs.imageregistry.operator.openshift.io
# 修改 storage 部分
# storage:
# pvc:
# claim:
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Managed","storage":{"pvc":{"claim":""}}}}' --type=merge
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Removed"}}' --type=merge
oc get clusteroperator image-registry
oc get configs.imageregistry.operator.openshift.io cluster -o yaml
# 把imagepruner给停掉
# https://bugzilla.redhat.com/show_bug.cgi?id=1852501#c24
# oc patch imagepruner.imageregistry/cluster --patch '{"spec":{"suspend":true}}' --type=merge
# oc -n openshift-image-registry delete jobs --all
oc get configs.samples.operator.openshift.io/cluster -o yaml
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Managed"}}' --type=merge
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Unmanaged"}}' --type=merge
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Removed"}}' --type=merge
Configure local DNS on your workstation (resolve *.apps.ocp4.redhat.ren to 192.168.7.11), pointing at the haproxy on the helper node; then you can open the web console in a browser.
chrony/NTP settings
In OCP 4.6, NTP synchronization needs to be configured. The ansible playbook we ran earlier already generated the NTP MachineConfig; just apply it to the cluster.
oc apply -f /data/ocp4/ocp4-upi-helpernode-master/machineconfig/
Operator Hub offline installation
https://docs.openshift.com/container-platform/4.2/operators/olm-restricted-networks.html
https://github.com/operator-framework/operator-registry
https://www.cnblogs.com/ericnie/p/11777384.html?from=timeline&isappinstalled=0
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.2/html-single/images/index
The operator hub preparation has two levels. The first level, described in this article, is building the offline operator hub resources and mirroring the operator images themselves. With that done, the disconnected OCP 4 cluster can show the Operator Hub and deploy operators. But when an operator is then used to deploy its operand components, the operator will try to pull additional images; those images must also be mirrored offline. Since every operator needs different images and there is no single place that lists them, each project site has to mirror them as needed. This project tries to download as many of the required images as possible, but omissions cannot be completely avoided at the moment.
# on helper node, 在工具机上
cd /data/ocp4
# scp /etc/crts/redhat.ren.crt 192.168.7.11:/root/ocp4/
# https://docs.openshift.com/container-platform/4.4/builds/setting-up-trusted-ca.html
oc project openshift-config
oc create configmap ca.for.registry -n openshift-config \
--from-file=registry.ocp4.redhat.ren..5443=/data/install/redhat.ren.crt
# 如果你想删除这个config map,这么做
# oc delete configmap ca.for.registry
oc patch image.config.openshift.io/cluster -p '{"spec":{"additionalTrustedCA":{"name":"ca.for.registry"}}}' --type=merge
# oc patch image.config.openshift.io/cluster -p '{"spec":{"registrySources":{"insecureRegistries":["registry.redhat.ren"]}}}' --type=merge
oc get image.config.openshift.io/cluster -o yaml
# 以下这个步骤是官网文档要做的,实践中发现,disconnected环境不需要
# oc patch OperatorHub cluster --type json -p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
# 如果你不小心还是照着官网做了,用如下步骤删掉
# oc patch OperatorHub cluster --type json -p '[{"op": "remove", "path": "/spec/disableAllDefaultSources"}]'
oc patch OperatorHub cluster --type json \
-p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
oc get OperatorHub cluster -o yaml
# yum -y install python36
# 根据项目现场情况,调整参数,运行以下命令,生成配置文件,指向内网镜像仓库
cd /data/ocp4/
bash image.registries.conf.sh registry.ocp4.redhat.ren:5443
# 由于某些ocp 4.2的更新机制,以下操作会触发集群更新,
# 集群节点会逐个重启,集群组件也会逐个重启,请等待集群重启完毕。
oc apply -f ./99-worker-container-registries.yaml -n openshift-config
oc apply -f ./99-master-container-registries.yaml -n openshift-config
# !!!正常情况,以下操作不需要!!!
# 以下操作,删除mirror镜像信息,也会触发集群更新操作,请等待集群重启完毕
oc delete -f ./99-worker-container-registries.yaml -n openshift-config
oc delete -f ./99-master-container-registries.yaml -n openshift-config
watch oc get machineconfigpools
watch oc get node
From the monitoring console you can see the nodes being updated and rebooted.
# on helper node
# params for operator hub images
export var_date='2020.11.23.0135'
echo $var_date
export var_major_version='4.6'
echo ${var_major_version}
export LOCAL_REG='registry.ocp4.redhat.ren:5443'
# 如果想看到redhat的operator,这样做
# 镜像源在 docker.io/wangzheng422/operator-catalog:redhat-$var_major_version-$var_date
# 后面的参数,去build.dist.sh文件里面,查看
# var_date 和 var_major_version 参数得到
cat <<EOF > redhat-operator-catalog.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: redhat-operators-catalog
namespace: openshift-marketplace
spec:
displayName: Red Hat Operators
sourceType: grpc
image: ${LOCAL_REG}/ocp4/operator-catalog:redhat-${var_major_version}-${var_date}
publisher: Red Hat
EOF
oc create -f redhat-operator-catalog.yaml
# 如果想看到certified的operator,这样做
# 镜像源在 docker.io/wangzheng422/operator-catalog:certified-$var_major_version-$var_date
# 后面的参数,去build.dist.sh文件里面,查看
# var_date 和 var_major_version 参数得到
cat <<EOF > certified-operator-catalog.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: certified-operator-catalog
namespace: openshift-marketplace
spec:
displayName: Certified Operators
sourceType: grpc
image: ${LOCAL_REG}/ocp4/operator-catalog:certified-${var_major_version}-${var_date}
publisher: Red Hat
EOF
oc create -f certified-operator-catalog.yaml
# 如果想看到community的operator,这样做
# 镜像源在 docker.io/wangzheng422/operator-catalog:community-$var_major_version-$var_date
# 后面的参数,去build.dist.sh文件里面,查看
# var_date 和 var_major_version 参数得到
cat <<EOF > community-operator-catalog.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: community-operator-catalog
namespace: openshift-marketplace
spec:
displayName: Community Operator
sourceType: grpc
image: ${LOCAL_REG}/ocp4/operator-catalog:community-${var_major_version}-${var_date}
publisher: Red Hat
EOF
oc create -f community-operator-catalog.yaml
cat <<EOF > marketplace-operator-catalog.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: redhat-marketplace-catalog
namespace: openshift-marketplace
spec:
displayName: Red Hat Marketplace
sourceType: grpc
image: ${LOCAL_REG}/ocp4/operator-catalog:redhat-marketplace-${var_major_version}-${var_date}
publisher: Red Hat
EOF
oc create -f marketplace-operator-catalog.yaml
# 想删除这些离线operator hub,就这样做。
# find . -name "*-operator-catalog.yaml" -exec oc delete -f {} \;
oc get pods -n openshift-marketplace
oc get catalogsource -n openshift-marketplace
oc get packagemanifest -n openshift-marketplace
You should now see the operator list in Operator Hub.
Deploying an operator from it should also succeed.
# set master and worker combine
# https://github.com/openshift-telco/openshift4x-poc/blob/master/MASTER-WORKER-COMBINED.md
oc edit schedulers cluster
# apiVersion: config.openshift.io/v1
# kind: Scheduler
# metadata:
# name: cluster
# spec:
# mastersSchedulable: true
Other links
https://www.cnblogs.com/ericnie/p/11764124.html
Reference materials
https://blog.openshift.com/openshift-4-2-disconnected-install/
https://blog.openshift.com/openshift-4-bare-metal-install-quickstart/
https://github.com/christianh814/ocp4-upi-helpernode#ocp4-upi-helper-node-playbook
https://github.com/openshift/cluster-samples-operator/blob/master/manifests/image-references
https://github.com/e-minguez/ocp4-upi-bm-pxeless-staticips/blob/master/docs/12-post-installation.md
https://www.openshift.com/blog/deploying-a-upi-environment-for-openshift-4-1-on-vms-and-bare-metal
openshift 4.6 offline baremetal IPI (fully automated) installation, single-network mode
Introduction
Video walkthrough
This article describes the IPI (fully automated) installation of ocp4.6 on baremetal (simulated with kvm).
According to the openshift documentation, a baremetal IPI installation has two modes: one with a dedicated provisioning network, and one where the provisioning network is merged with the baremetal (service) network. Considering typical PoC environments, this lab uses the simpler layout, i.e. the merged network mode. The install-config fragment that selects it is shown below.
The architecture diagram for this lab:
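The merged-network choice shows up in install-config.yaml as a disabled provisioning network; the relevant fragment, taken from the full config later in this article, looks like this:
platform:
  baremetal:
    apiVIP: 192.168.7.100
    ingressVIP: 192.168.7.101
    provisioningNetwork: "Disabled"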
Offline installation package download
The packaged installation media can be downloaded here (Baidu pan link, version 4.6.9-ccn):
Link: https://pan.baidu.com/s/1jJU0HLnZMnvCNMNq1OEDxA password: uaaw
It contains the following types of files:
- ocp4.tgz: contains the isos and other installation media, the installation scripts, the full list of downloaded images, and so on. It needs to be copied to the KVM host and to the helper node.
- registry.tgz: the packaged docker image registry storage. If you need to add images first, follow: 4.6.add.image.md
- nexus-image.tgz: a packaged nexus image repository; the cluster's image proxy points at nexus, which provides an image cache.
- poc.image.tgz: extra images that supplement registry.tgz, mainly for ccn. The list of added images is in poc.image.list; follow: 4.6.add.image.md
To merge the split files, use a command like the following:
cat registry.?? > registry.tgz
Note: you may need to update the ansible scripts used for the helper inside the offline package.
Preparing the offline installation source on an internet-facing cloud host
The documentation for preparing the offline installation media has moved here: 4.6.build.dist.md
KVM host preparation
This lab runs the installation test with many VMs on a single 32C, 256G host, so prepare that host first.
If you use multiple hosts, make sure their clocks are aligned; otherwise the certificates will run into problems.
The main preparation tasks are:
- configure the yum repository
- configure dns
- install the image registry
- set up the vnc environment
- configure the network needed by kvm
- create the helper kvm
For the preparation above, the dns part needs to be adjusted to the actual project environment.
The host used here is a rhel8 machine; see rhel8.build.kernel.repo.cache.md for the offline repo and other basic configuration.
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
cat << EOF >> /etc/hosts
127.0.0.1 registry.ocp4.redhat.ren nexus.ocp4.redhat.ren git.ocp4.redhat.ren
EOF
dnf clean all
dnf repolist
dnf -y install byobu htop jq ipmitool
systemctl disable --now firewalld
# 配置registry
mkdir -p /etc/crts/ && cd /etc/crts
# https://access.redhat.com/documentation/en-us/red_hat_codeready_workspaces/2.1/html/installation_guide/installing-codeready-workspaces-in-tls-mode-with-self-signed-certificates_crw
openssl genrsa -out /etc/crts/redhat.ren.ca.key 4096
openssl req -x509 \
-new -nodes \
-key /etc/crts/redhat.ren.ca.key \
-sha256 \
-days 36500 \
-out /etc/crts/redhat.ren.ca.crt \
-subj /CN="Local Red Hat Ren Signer" \
-reqexts SAN \
-extensions SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf '[SAN]\nbasicConstraints=critical, CA:TRUE\nkeyUsage=keyCertSign, cRLSign, digitalSignature'))
openssl genrsa -out /etc/crts/redhat.ren.key 2048
openssl req -new -sha256 \
-key /etc/crts/redhat.ren.key \
-subj "/O=Local Red Hat Ren /CN=*.ocp4.redhat.ren" \
-reqexts SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf "\n[SAN]\nsubjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth")) \
-out /etc/crts/redhat.ren.csr
openssl x509 \
-req \
-sha256 \
-extfile <(printf "subjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth") \
-days 36500 \
-in /etc/crts/redhat.ren.csr \
-CA /etc/crts/redhat.ren.ca.crt \
-CAkey /etc/crts/redhat.ren.ca.key \
-CAcreateserial -out /etc/crts/redhat.ren.crt
openssl x509 -in /etc/crts/redhat.ren.crt -text
/bin/cp -f /etc/crts/redhat.ren.ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
cd /data
mkdir -p /data/registry
# tar zxf registry.tgz
dnf -y install podman pigz skopeo jq
# pigz -dc registry.tgz | tar xf -
cd /data/ocp4
podman load -i /data/ocp4/registry.tgz
podman run --name local-registry -p 5443:5000 \
-d --restart=always \
-v /data/registry/:/var/lib/registry:z \
-v /etc/crts:/certs:z \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/redhat.ren.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/redhat.ren.key \
docker.io/library/registry:2
podman start local-registry
# firewall-cmd --permanent --add-port=5443/tcp
# firewall-cmd --reload
# 加载更多的镜像
# 解压缩 ocp4.tgz
bash add.image.load.sh /data/install.image 'registry.ocp4.redhat.ren:5443'
# https://github.com/christianh814/ocp4-upi-helpernode/blob/master/docs/quickstart.md
# 准备vnc环境
vncpasswd
cat << EOF > ~/.vnc/config
session=gnome
securitytypes=vncauth,tlsvnc
desktop=sandbox
geometry=1440x855
alwaysshared
EOF
cat << EOF >> /etc/tigervnc/vncserver.users
:1=root
EOF
systemctl start vncserver@:1
# 如果你想停掉vnc server,这么做
systemctl stop vncserver@:1
# firewall-cmd --permanent --add-port=6001/tcp
# firewall-cmd --permanent --add-port=5901/tcp
# firewall-cmd --reload
# connect vnc at port 5901
# export DISPLAY=:1
# 创建实验用虚拟网络
cat << 'EOF' > /data/kvm/bridge.sh
#!/usr/bin/env bash
PUB_CONN='eno1'
PUB_IP='172.21.6.105/24'
PUB_GW='172.21.6.254'
PUB_DNS='172.21.1.1'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down baremetal
nmcli con delete baremetal
# RHEL 8.1 appends the word "System" in front of the connection,delete in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname baremetal type bridge con-name baremetal ipv4.method 'manual' \
ipv4.address "$PUB_IP" \
ipv4.gateway "$PUB_GW" \
ipv4.dns "$PUB_DNS"
nmcli con add type bridge-slave ifname "$PUB_CONN" master baremetal
nmcli con down "$PUB_CONN";pkill dhclient;dhclient baremetal
nmcli con up baremetal
EOF
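Writing the script only puts it on disk; it still has to be run once on the KVM host to replace the plain NIC connection with the baremetal bridge. Run it from a console session (it temporarily drops networking), and adjust PUB_CONN and the addresses inside the script to the site first:
bash /data/kvm/bridge.sh
nmcli con show          # the baremetal bridge connection should now be active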
nmcli con mod baremetal +ipv4.address '192.168.7.1/24'
nmcli networking off; nmcli networking on
# 创建工具机
mkdir -p /data/kvm
cd /data/kvm
lvremove -f rhel/helperlv
lvcreate -y -L 200G -n helperlv rhel
virt-install --name="ocp4-aHelper" --vcpus=2 --ram=4096 \
--disk path=/dev/rhel/helperlv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--boot menu=on --location /data/kvm/rhel-8.3-x86_64-dvd.iso \
--initrd-inject helper-ks-rhel8-ipi.cfg --extra-args "inst.ks=file:/helper-ks-rhel8-ipi.cfg"
virsh start ocp4-aHelper
# DO NOT USE, restore kvm
virsh destroy ocp4-aHelper
virsh undefine ocp4-aHelper
# virt-viewer --domain-name ocp4-aHelper
# virsh start ocp4-aHelper
# virsh list --all
# start chrony/ntp server on host
/bin/cp -f /etc/chrony.conf /etc/chrony.conf.default
cat << EOF > /etc/chrony.conf
# pool 2.rhel.pool.ntp.org iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
allow 192.0.0.0/8
local stratum 10
logdir /var/log/chrony
EOF
systemctl enable --now chronyd
# systemctl restart chronyd
chronyc tracking
chronyc sources -v
chronyc sourcestats -v
chronyc makestep
# setup ftp data root
mount --bind /data/dnf /var/ftp/dnf
chcon -R -t public_content_t /var/ftp/dnf
# create the master and worker vm, but not start them
export KVM_DIRECTORY=/data/kvm
mkdir -p ${KVM_DIRECTORY}
cd ${KVM_DIRECTORY}
# scp root@192.168.7.11:/data/install/*.iso ${KVM_DIRECTORY}/
remove_lv() {
var_vg=$1
var_lv=$2
lvremove -f $var_vg/$var_lv
}
create_lv() {
var_vg=$1
var_lv=$2
lvcreate -y -L 120G -n $var_lv $var_vg
wipefs --all --force /dev/$var_vg/$var_lv
}
remove_lv nvme master0lv
remove_lv nvme master1lv
remove_lv nvme master2lv
remove_lv rhel worker0lv
remove_lv rhel worker1lv
remove_lv rhel worker2lv
# create_lv rhel bootstraplv
create_lv nvme master0lv
create_lv nvme master1lv
create_lv nvme master2lv
create_lv rhel worker0lv
create_lv rhel worker1lv
create_lv rhel worker2lv
virt-install --name=ocp4-master0 --vcpus=4 --ram=16384 \
--disk path=/dev/nvme/master0lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
--print-xml > ${KVM_DIRECTORY}/ocp4-master0.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-master0.xml
virt-install --name=ocp4-master1 --vcpus=4 --ram=16384 \
--disk path=/dev/nvme/master1lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
--print-xml > ${KVM_DIRECTORY}/ocp4-master1.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-master1.xml
virt-install --name=ocp4-master2 --vcpus=4 --ram=16384 \
--disk path=/dev/nvme/master2lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
--print-xml > ${KVM_DIRECTORY}/ocp4-master2.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-master2.xml
virt-install --name=ocp4-worker0 --vcpus=8 --ram=65536 \
--disk path=/dev/rhel/worker0lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
--print-xml > ${KVM_DIRECTORY}/ocp4-worker0.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-worker0.xml
virt-install --name=ocp4-worker1 --vcpus=4 --ram=32768 \
--disk path=/dev/rhel/worker1lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
--print-xml > ${KVM_DIRECTORY}/ocp4-worker1.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-worker1.xml
virt-install --name=ocp4-worker2 --vcpus=2 --ram=8192 \
--disk path=/dev/rhel/worker2lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
--print-xml > ${KVM_DIRECTORY}/ocp4-worker2.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-worker2.xml
cd /data/kvm/
for i in master{0..2} worker{0..2}
do
echo -ne "${i}\t" ;
virsh dumpxml ocp4-${i} | grep "mac address" | cut -d\' -f2 | tr '\n' '\t'
echo
done > mac.list
cat /data/kvm/mac.list
# master0 52:54:00:7b:5b:83
# master1 52:54:00:9b:f4:bc
# master2 52:54:00:72:16:ac
# worker0 52:54:00:19:f4:65
# worker1 52:54:00:88:4f:2c
# worker2 52:54:00:ed:25:30
# GOTO image registry & kvm host
# copy crt files to helper node
ssh-copy-id root@192.168.7.11
ssh root@192.168.7.11 mkdir -p /data/install
ssh root@192.168.7.11 mkdir -p /data/ocp4
scp /data/down/ocp4.tgz root@192.168.7.11:/data/
rsync -e ssh --info=progress2 -P --delete -arz /data/ocp4/ 192.168.7.11:/data/ocp4/
scp /etc/crts/redhat.ren.ca.crt root@192.168.7.11:/data/install/
scp /data/kvm/mac.list root@192.168.7.11:/data/install/
# install redfish for kvm
# https://access.redhat.com/solutions/4315581
# https://access.redhat.com/solutions/3057171
# https://docs.openstack.org/virtualbmc/latest/user/index.html
# https://docs.openstack.org/sushy-tools/latest/user/dynamic-emulator.html
dnf -y install python3-pip
# pip3 install --user sushy-tools
mkdir -p /data/install
cd /data/install
# podman create --name swap docker.io/wangzheng422/imgs:openshift-baremetal-install-4.6.5 ls
# podman cp swap:/openshift-baremetal-install ./
# podman rm -fv swap
podman create --name swap docker.io/wangzheng422/imgs:ocp.bm.ipi.python.dep.rhel8-4.6.7 ls
podman cp swap:/wheelhouse.tar.gz - > wheelhouse.tar.gz
tar zvxf wheelhouse.tar.gz
podman rm -fv swap
pip3 install --user -r wheelhouse/requirements.txt --no-index --find-links wheelhouse
/root/.local/bin/sushy-emulator -i 0.0.0.0 --ssl-certificate /etc/crts/redhat.ren.crt --ssl-key /etc/crts/redhat.ren.key
# curl https://registry.ocp4.redhat.ren:8000/redfish/v1/Systems/
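As written above, sushy-emulator occupies the foreground. One way to keep it running while continuing with the installation (a sketch, not part of the original scripts) is to push it into the background and log its output:
nohup /root/.local/bin/sushy-emulator -i 0.0.0.0 \
  --ssl-certificate /etc/crts/redhat.ren.crt --ssl-key /etc/crts/redhat.ren.key \
  > /data/install/sushy.log 2>&1 &
# verify the redfish endpoint answers (sushy-emulator listens on port 8000 by default)
curl -k https://192.168.7.1:8000/redfish/v1/Systems/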
# DO NOT USE, restore
# if you want to stop or delete vm, try this
virsh list --all
# virsh destroy ocp4-bootstrap
virsh destroy ocp4-master0
virsh destroy ocp4-master1
virsh destroy ocp4-master2
virsh destroy ocp4-worker0
virsh destroy ocp4-worker1
virsh destroy ocp4-worker2
# virsh undefine ocp4-bootstrap
virsh undefine ocp4-master0 --nvram
virsh undefine ocp4-master1 --nvram
virsh undefine ocp4-master2 --nvram
virsh undefine ocp4-worker0 --nvram
virsh undefine ocp4-worker1 --nvram
virsh undefine ocp4-worker2 --nvram
Helper node preparation
The following installation steps are performed inside the helper node.
The main tasks are:
- configure the yum repository
- run the ansible playbook that configures the helper node automatically
- upload the customized installation config file
- generate the ignition files
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
systemctl restart sshd
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
systemctl disable --now firewalld
# in helper node
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
export YUMIP="192.168.7.1"
cat << EOF > /etc/yum.repos.d/remote.repo
[remote-epel]
name=epel
baseurl=ftp://${YUMIP}/dnf/epel
enabled=1
gpgcheck=0
[remote-epel-modular]
name=epel-modular
baseurl=ftp://${YUMIP}/dnf/epel-modular
enabled=1
gpgcheck=0
[remote-appstream]
name=appstream
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-appstream-rpms
enabled=1
gpgcheck=0
[remote-baseos]
name=baseos
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-baseos-rpms
enabled=1
gpgcheck=0
[remote-baseos-source]
name=baseos-source
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-baseos-source-rpms
enabled=1
gpgcheck=0
[remote-supplementary]
name=supplementary
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-supplementary-rpms
enabled=1
gpgcheck=0
[remote-codeready-builder]
name=supplementary
baseurl=ftp://${YUMIP}/dnf/codeready-builder-for-rhel-8-x86_64-rpms
enabled=1
gpgcheck=0
EOF
yum clean all
yum makecache
yum repolist
yum -y install ansible git unzip podman python3
yum -y update
reboot
# yum -y install ansible git unzip podman python36
mkdir -p /data/ocp4/
# scp ocp4.tgz to /data
# scp /data/down/ocp4.tgz root@192.168.7.11:/data/
cd /data
tar zvxf ocp4.tgz
cd /data/ocp4
# 这里使用了一个ansible的项目,用来部署helper节点的服务。
# https://github.com/wangzheng422/ocp4-upi-helpernode
unzip ocp4-upi-helpernode.zip
# 这里使用了一个ignition文件合并的项目,用来帮助自定义ignition文件。
# https://github.com/wangzheng422/filetranspiler
podman load -i filetranspiler.tgz
mkdir -p /data/install
mkdir -p /data/ocp4/
cd /data/ocp4/
cat << 'EOF' > redfish.sh
#!/usr/bin/env bash
curl -k -s https://192.168.7.1:8000/redfish/v1/Systems/ | jq -r '.Members[]."@odata.id"' > list
while read -r line; do
curl -k -s https://192.168.7.1:8000/$line | jq -j '.Id, " ", .Name, "\n" '
done < list
EOF
bash redfish.sh > /data/install/vm.list
cat /data/install/vm.list
# 9cc02fbc-cbfe-4006-b5a9-f04712321157 ocp4-worker0
# b1a13dd1-7864-4b61-bd0c-851c11f87199 ocp4-master0
# 0a121472-6d24-47ae-9715-8e8e175ab397 ocp4-master2
# b30891d1-b14b-4645-9b05-504a58e1e059 ocp4-worker1
# fb261d6c-31c5-4e7e-8020-2789d5cc63e3 ocp4-aHelper
# 4497d313-390c-4c6b-a5d6-3f533e397aaf ocp4-master1
# f9b0a86d-1587-47ea-9a92-a2762b0684fd ocp4-worker2
cat << EOF > /data/ocp4/ocp4-upi-helpernode-master/vars-dhcp.rhel8.yaml
---
ssh_gen_key: true
staticips: false
bm_ipi: true
firewalld: false
dns_forward: false
iso:
iso_dl_url: "file:///data/ocp4/rhcos-live.x86_64.iso"
my_iso: "rhcos-live.iso"
helper:
name: "helper"
ipaddr: "192.168.7.11"
networkifacename: "enp1s0"
gateway: "192.168.7.1"
netmask: "255.255.255.0"
dns:
domain: "redhat.ren"
clusterid: "ocp4"
forwarder1: "192.168.7.1"
forwarder2: "192.168.7.1"
api_vip: "192.168.7.100"
ingress_vip: "192.168.7.101"
dhcp:
router: "192.168.7.1"
bcast: "192.168.7.255"
netmask: "255.255.255.0"
poolstart: "192.168.7.70"
poolend: "192.168.7.90"
ipid: "192.168.7.0"
netmaskid: "255.255.255.0"
bootstrap:
name: "bootstrap"
ipaddr: "192.168.7.12"
interface: "enp1s0"
install_drive: "vda"
macaddr: "52:54:00:7e:f8:f7"
masters:
- name: "master-0"
ipaddr: "192.168.7.13"
interface: "enp1s0"
install_drive: "vda"
macaddr: "$(cat /data/install/mac.list | grep master0 | awk '{print $2}')"
- name: "master-1"
ipaddr: "192.168.7.14"
interface: "enp1s0"
install_drive: "vda"
macaddr: "$(cat /data/install/mac.list | grep master1 | awk '{print $2}')"
- name: "master-2"
ipaddr: "192.168.7.15"
interface: "enp1s0"
install_drive: "vda"
macaddr: "$(cat /data/install/mac.list | grep master2 | awk '{print $2}')"
workers:
- name: "worker-0"
ipaddr: "192.168.7.16"
interface: "enp1s0"
install_drive: "vda"
macaddr: "$(cat /data/install/mac.list | grep worker0 | awk '{print $2}')"
- name: "worker-1"
ipaddr: "192.168.7.17"
interface: "enp1s0"
install_drive: "vda"
macaddr: "$(cat /data/install/mac.list | grep worker1 | awk '{print $2}')"
- name: "worker-2"
ipaddr: "192.168.7.18"
interface: "enp1s0"
install_drive: "vda"
macaddr: "$(cat /data/install/mac.list | grep worker2 | awk '{print $2}')"
others:
- name: "registry"
ipaddr: "192.168.7.1"
macaddr: "52:54:00:7e:f8:f7"
- name: "yum"
ipaddr: "192.168.7.1"
macaddr: "52:54:00:7e:f8:f7"
- name: "quay"
ipaddr: "192.168.7.1"
macaddr: "52:54:00:7e:f8:f7"
- name: "nexus"
ipaddr: "192.168.7.1"
macaddr: "52:54:00:7e:f8:f7"
- name: "git"
ipaddr: "192.168.7.1"
macaddr: "52:54:00:7e:f8:f7"
otherdomains:
- domain: "rhv.redhat.ren"
hosts:
- name: "manager"
ipaddr: "192.168.7.71"
- name: "rhv01"
ipaddr: "192.168.7.72"
- domain: "cmri-edge.redhat.ren"
hosts:
- name: "*"
ipaddr: "192.168.7.71"
- name: "*.apps"
ipaddr: "192.168.7.72"
force_ocp_download: false
remove_old_config_files: false
ocp_client: "file:///data/ocp4/4.6.9/openshift-client-linux-4.6.9.tar.gz"
ocp_installer: "file:///data/ocp4/4.6.9/openshift-install-linux-4.6.9.tar.gz"
ppc64le: false
arch: 'x86_64'
chronyconfig:
enabled: true
content:
- server: "192.168.7.1"
options: iburst
setup_registry:
deploy: false
registry_image: docker.io/library/registry:2
local_repo: "ocp4/openshift4"
product_repo: "openshift-release-dev"
release_name: "ocp-release"
release_tag: "4.6.1-x86_64"
registry_server: "registry.ocp4.redhat.ren:5443"
EOF
# 接下来,我们使用ansible来配置helper节点,装上各种openshift集群需要的服务
# 根据现场环境,修改 ocp4-upi-helpernode-master/vars-static.yaml
cd /data/ocp4/ocp4-upi-helpernode-master
ansible-playbook -e @vars-dhcp.rhel8.yaml -e '{ staticips: false, bm_ipi: true }' tasks/main.yml
# try this:
/usr/local/bin/helpernodecheck
mkdir -p /data/install
# GO back to help node
/bin/cp -f /data/install/redhat.ren.ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
# 根据现场环境,修改 install-config.yaml
# 至少要修改ssh key, 还有 additionalTrustBundle,这个是镜像仓库的csr
# copy your pull secret file into helper
# SEC_FILE='/data/pull-secret.json'
# cat << 'EOF' > $SEC_FILE
# 定制ignition
cd /data/install
# vi install-config.yaml
cat << EOF > /data/install/install-config.yaml
apiVersion: v1
baseDomain: redhat.ren
platform:
baremetal:
apiVIP: 192.168.7.100
ingressVIP: 192.168.7.101
bootstrapProvisioningIP: 192.168.7.102
provisioningHostIP: 192.168.7.103
provisioningNetwork: "Disabled"
bootstrapOSImage: http://192.168.7.11:8080/install/rhcos-qemu.x86_64.qcow2.gz?sha256=$(zcat /var/www/html/install/rhcos-qemu.x86_64.qcow2.gz | sha256sum | awk '{print $1}')
clusterOSImage: http://192.168.7.11:8080/install/rhcos-openstack.x86_64.qcow2.gz?sha256=$(sha256sum /var/www/html/install/rhcos-openstack.x86_64.qcow2.gz | awk '{print $1}')
hosts:
- name: master-0
role: master
bmc:
address: redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/$(cat vm.list | grep master0 | awk '{print $1}')
username: admin
password: password
disableCertificateVerification: True
bootMACAddress: $(cat mac.list | grep master0 | awk '{print $2}')
rootDeviceHints:
deviceName: "/dev/vda"
- name: master-1
role: master
bmc:
address: redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/$(cat vm.list | grep master1 | awk '{print $1}')
username: admin
password: password
disableCertificateVerification: True
bootMACAddress: $(cat mac.list | grep master1 | awk '{print $2}')
rootDeviceHints:
deviceName: "/dev/vda"
- name: master-2
role: master
bmc:
address: redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/$(cat vm.list | grep master2 | awk '{print $1}')
username: admin
password: password
disableCertificateVerification: True
bootMACAddress: $(cat mac.list | grep master2 | awk '{print $2}')
rootDeviceHints:
deviceName: "/dev/vda"
- name: worker-0
role: worker
bmc:
address: redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/$(cat vm.list | grep worker0 | awk '{print $1}')
username: admin
password: password
disableCertificateVerification: True
bootMACAddress: $(cat mac.list | grep worker0 | awk '{print $2}')
rootDeviceHints:
deviceName: "/dev/vda"
- name: worker-1
role: worker
bmc:
address: redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/$(cat vm.list | grep worker1 | awk '{print $1}')
username: admin
password: password
disableCertificateVerification: True
bootMACAddress: $(cat mac.list | grep worker1 | awk '{print $2}')
rootDeviceHints:
deviceName: "/dev/vda"
metadata:
name: ocp4
networking:
clusterNetworks:
- cidr: 10.254.0.0/16
hostPrefix: 24
networkType: OpenShiftSDN
serviceNetwork:
- 172.30.0.0/16
machineCIDR: 192.168.7.0/24
compute:
- name: worker
replicas: 2
controlPlane:
name: master
replicas: 3
platform:
baremetal: {}
pullSecret: '$( cat /data/pull-secret.json )'
sshKey: |
$( cat /root/.ssh/helper_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /data/install/redhat.ren.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- registry.ocp4.redhat.ren:5443/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- registry.ocp4.redhat.ren:5443/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF
# GO back to host
mkdir -p /data/install
cd /data/install
scp root@192.168.7.11:/data/install/install-config.yaml /data/install/
cd /data/install
for i in $(sudo virsh list --all | tail -n +3 | grep bootstrap | awk {'print $2'});
do
sudo virsh destroy $i;
sudo virsh undefine $i;
sudo virsh vol-delete $i --pool default;
sudo virsh vol-delete $i.ign --pool default;
virsh pool-destroy $i
virsh pool-delete $i
virsh pool-undefine $i
done
/bin/rm -rf .openshift_install.log .openshift_install_state.json terraform* auth tls
/data/ocp4/4.6.9/openshift-baremetal-install --dir /data/install/ --log-level debug create cluster
# INFO Install complete!
# INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.ocp4.redhat.ren
# INFO Login to the console with user: "kubeadmin", and password: "tjRNB-xHf2f-fFh8n-ppNXi"
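While create cluster runs, it can help to follow the installer log from a second terminal; openshift-install writes its log into the assets directory:
# follow the installer log while "create cluster" is running
tail -f /data/install/.openshift_install.log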
# on kvm host, copy back auth folder
rsync -arz /data/install/auth root@192.168.7.11:/data/install/
# Go back to helper
ansible localhost -m lineinfile -a 'path=$HOME/.bashrc regexp="^export KUBECONFIG" line="export KUBECONFIG=/data/install/auth/kubeconfig"'
source $HOME/.bashrc
oc get node
oc get pod -n openshift-machine-api
oc get BareMetalHost -n openshift-machine-api
oc get bmh -n openshift-machine-api
# NAME STATUS PROVISIONING STATUS CONSUMER BMC HARDWARE PROFILE ONLINE ERROR
# master-0 OK externally provisioned ocp4-zn8lq-master-0 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/965c420a-f127-4639-9184-fe3546d2bde4 true
# master-1 OK externally provisioned ocp4-zn8lq-master-1 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/46f9dff4-1b44-4286-8a7c-691673340030 true
# master-2 OK externally provisioned ocp4-zn8lq-master-2 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/9e544eb6-1b98-4b0a-ad32-7df232ae582a true
# worker-0 OK provisioned ocp4-zn8lq-worker-0-mv4d7 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/c399c6b7-525a-4f4e-8280-0472b6494fc5 unknown true
# worker-1 OK provisioned ocp4-zn8lq-worker-0-9frt6 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/a4052132-7598-4879-b3e1-c48c47cf67ed unknown true
With that we can see the BareMetalHost output, the node configuration in the web console now points to its BareMetalHost, and the familiar Machine objects are back as well.
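To also list those Machine objects from the command line, the usual machine-api queries work, for example:
# list the Machine objects backing the BareMetalHosts
oc get machine -n openshift-machine-api -o wide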
Add a new node
In IPI mode adding a new node is very convenient: just define another BareMetalHost.
cd /data/install/
cat << EOF > /data/install/bmh.yaml
---
apiVersion: v1
kind: Secret
metadata:
name: worker-2-bmc-secret
type: Opaque
data:
username: $(echo -ne "admin" | base64)
password: $(echo -ne "password" | base64)
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: worker-2
spec:
online: true
bootMACAddress: $(cat mac.list | grep worker2 | awk '{print $2}')
bmc:
address: redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/$(cat vm.list | grep worker2 | awk '{print $1}')
credentialsName: worker-2-bmc-secret
disableCertificateVerification: true
rootDeviceHints:
deviceName: /dev/vda
EOF
oc -n openshift-machine-api create -f bmh.yaml
# DO NOT USE, restore, delete the vm
oc -n openshift-machine-api delete -f bmh.yaml
oc get bmh -n openshift-machine-api
# NAME STATUS PROVISIONING STATUS CONSUMER BMC HARDWARE PROFILE ONLINE ERROR
# master-0 OK externally provisioned ocp4-zn8lq-master-0 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/965c420a-f127-4639-9184-fe3546d2bde4 true
# master-1 OK externally provisioned ocp4-zn8lq-master-1 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/46f9dff4-1b44-4286-8a7c-691673340030 true
# master-2 OK externally provisioned ocp4-zn8lq-master-2 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/9e544eb6-1b98-4b0a-ad32-7df232ae582a true
# worker-0 OK provisioned ocp4-zn8lq-worker-0-mv4d7 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/c399c6b7-525a-4f4e-8280-0472b6494fc5 unknown true
# worker-1 OK provisioned ocp4-zn8lq-worker-0-9frt6 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/a4052132-7598-4879-b3e1-c48c47cf67ed unknown true
# worker-2 OK inspecting redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/2eee2e57-e18b-460b-bb3f-7f048f84c69b true
oc get machinesets -n openshift-machine-api
# NAME DESIRED CURRENT READY AVAILABLE AGE
# ocp4-zn8lq-worker-0 2 2 2 2 155m
oc get machinesets -n openshift-machine-api -o json | jq -r .items[0].metadata.name
# scale the workers to 3 replicas; this triggers the deployment of worker-2
oc scale --replicas=3 machineset $(oc get machinesets -n openshift-machine-api -o json | jq -r .items[0].metadata.name) -n openshift-machine-api
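A rough way to follow the scale-out is to keep an eye on the BareMetalHost, Machine and Node objects together; a simple watch loop (the interval is arbitrary):
# worker-2 should move through inspecting -> provisioning -> provisioned, then register as a node
watch -n 10 "oc get bmh,machine -n openshift-machine-api; oc get node"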
Image registry proxy
Preparing a fully offline image registry is tedious. Fortunately we have a host with Internet access, so we can use nexus to build an image registry proxy: run the PoC once in the connected environment, and afterwards the proxy cache already holds the images we need offline.
- https://mtijhof.wordpress.com/2018/07/23/using-nexus-oss-as-a-proxy-cache-for-docker-images/
#####################################################
# init build the nexus fs
/bin/cp -f nexus-image.tgz /data/ccn/
cd /data/ccn
tar zxf nexus-image.tgz
chown -R 200 /data/ccn/nexus-image
# podman run -d -p 8082:8081 -p 8083:8083 -it --name nexus-image -v /data/ccn/nexus-image:/nexus-data:Z docker.io/sonatype/nexus3:3.29.0
podman run -d -p 8082:8081 -p 8083:8083 -it --name nexus-image -v /data/ccn/nexus-image:/nexus-data:Z docker.io/wangzheng422/imgs:nexus3-3.29.0-wzh
podman stop nexus-image
podman rm nexus-image
# get the admin password
cat /data/ccn/nexus-image/admin.password && echo
# 84091bcd-c82f-44a3-8b7b-dfc90f5b7da1
# open http://nexus.ocp4.redhat.ren:8082
# enable https
# https://blog.csdn.net/s7799653/article/details/105378645
# https://help.sonatype.com/repomanager3/system-configuration/configuring-ssl#ConfiguringSSL-InboundSSL-ConfiguringtoServeContentviaHTTPS
mkdir -p /data/install/tmp
cd /data/install/tmp
# export the certificate into PKCS#12 format
# you will be prompted for an export password; use "password"
openssl pkcs12 -export -out keystore.pkcs12 -inkey /etc/crts/redhat.ren.key -in /etc/crts/redhat.ren.crt
cat << EOF >> Dockerfile
FROM docker.io/sonatype/nexus3:3.29.0
USER root
COPY keystore.pkcs12 /keystore.pkcs12
RUN keytool -v -importkeystore -srckeystore keystore.pkcs12 -srcstoretype PKCS12 -destkeystore keystore.jks -deststoretype JKS -storepass password -srcstorepass password &&\
cp keystore.jks /opt/sonatype/nexus/etc/ssl/
USER nexus
EOF
buildah bud --format=docker -t docker.io/wangzheng422/imgs:nexus3-3.29.0-wzh -f Dockerfile .
buildah push docker.io/wangzheng422/imgs:nexus3-3.29.0-wzh
######################################################
# go to helper, update proxy setting for ocp cluster
cd /data/ocp4
bash image.registries.conf.sh nexus.ocp4.redhat.ren:8083
mkdir -p /etc/containers/registries.conf.d
/bin/cp -f image.registries.conf /etc/containers/registries.conf.d/
cd /data/ocp4
oc apply -f ./99-worker-container-registries.yaml -n openshift-config
oc apply -f ./99-master-container-registries.yaml -n openshift-config
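After applying the two MachineConfigs, the machine-config operator rolls the registries drop-in onto every node and reboots them pool by pool; a quick way to watch the rollout (assuming the MachineConfig names match the yaml file names above):
# the worker and master pools show UPDATING until the drop-in has landed everywhere
oc get mc | grep container-registries
oc get mcp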
######################################################
# dump the nexus image fs out
podman stop nexus-image
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
cd /data/ccn
tar cf - ./nexus-image | pigz -c > nexus-image.tgz
buildah from --name onbuild-container scratch
buildah copy onbuild-container nexus-image.tgz /
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container docker.io/wangzheng422/nexus-fs:image-$var_date
# buildah rm onbuild-container
# rm -f nexus-image.tgz
buildah push docker.io/wangzheng422/nexus-fs:image-$var_date
echo "docker.io/wangzheng422/nexus-fs:image-$var_date"
# The image version below can serve as the initial image proxy; it already contains the nfs provisioner images and the sample operator metadata. A nice discovery: image streams do not pull the full images, they appear to fetch only the metadata, and the actual image is downloaded only when it is really used.
# docker.io/wangzheng422/nexus-fs:image-2020-12-26-1118
Configure the CA of the image registry
The registry CA was already injected during installation, but the image streams do not seem to trust it, so let's configure it again explicitly.
oc project openshift-config
oc create configmap ca.for.registry -n openshift-config \
--from-file=registry.ocp4.redhat.ren..5443=/data/install/redhat.ren.ca.crt \
--from-file=nexus.ocp4.redhat.ren..8083=/data/install/redhat.ren.ca.crt
oc patch image.config.openshift.io/cluster -p '{"spec":{"additionalTrustedCA":{"name":"ca.for.registry"}}}' --type=merge
# oc patch image.config.openshift.io/cluster -p '{"spec":{"registrySources":{"insecureRegistries":["nexus.ocp4.redhat.ren:8083"]}}}' --type=merge
oc get image.config.openshift.io/cluster -o yaml
# re-import the image streams under the openshift project
oc get is -o json | jq -r '.items[].metadata.name' | xargs -L1 oc import-image --all
Configure the internal registry
Our helper node provides NFS, so let's back the internal registry with proper NFS storage instead of emptyDir.
bash /data/ocp4/ocp4-upi-helpernode-master/files/nfs-provisioner-setup.sh
# oc edit configs.imageregistry.operator.openshift.io
# edit the storage section
# storage:
# pvc:
# claim:
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Managed","storage":{"pvc":{"claim":""}}}}' --type=merge
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Removed"}}' --type=merge
oc get clusteroperator image-registry
oc get configs.imageregistry.operator.openshift.io cluster -o yaml
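If the operator is left in the Managed state with PVC storage, the nfs provisioner should bind a claim and the registry pod should come up; a quick check:
# the claim should be Bound and the image-registry pod Running
oc get pvc -n openshift-image-registry
oc get pod -n openshift-image-registry
oc get co image-registry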
# (optional) suspend the image pruner
# https://bugzilla.redhat.com/show_bug.cgi?id=1852501#c24
# oc patch imagepruner.imageregistry/cluster --patch '{"spec":{"suspend":true}}' --type=merge
# oc -n openshift-image-registry delete jobs --all
Configure the samples operator
OpenShift ships a samples operator that provides image streams and templates for a large set of Red Hat products.
oc get configs.samples.operator.openshift.io/cluster -o yaml
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Managed", "samplesRegistry": "nexus.ocp4.redhat.ren:8083"}}' --type=merge
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Unmanaged"}}' --type=merge
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Removed"}}' --type=merge
chrony/NTP settings
OCP 4.6 needs NTP synchronization configured. The ansible playbook we ran earlier already generated the chrony MachineConfig; we only need to apply it to the cluster.
oc apply -f /data/ocp4/ocp4-upi-helpernode-master/machineconfig/
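To confirm the chrony settings actually reached the nodes, check the MachineConfigs and spot-check one node (assuming the generated MachineConfig names mention chrony):
# list the chrony MachineConfigs, then verify a node syncs against 192.168.7.1
oc get mc | grep -i chrony
oc debug $(oc get node -o name | head -n 1) -- chroot /host chronyc sources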
Operator Hub offline installation
With nexus acting as an image proxy this offline step is no longer required, but for projects such as CCN that ship their own catalog we may still want to disable the default Operator Hub sources to avoid conflicts.
oc patch OperatorHub cluster --type json \
-p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
oc get OperatorHub cluster -o yaml
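To confirm the default sources are gone and only your own catalogs remain, check the marketplace namespace:
# after disabling the defaults, only custom CatalogSources should be listed
oc get catalogsource -n openshift-marketplace
oc get packagemanifest -n openshift-marketplace | head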
Patch the image streams in the openshift project
In a proxied network environment the image streams under the openshift project need a few patches.
cd /data/ocp4
bash is.patch.sh registry.ocp4.redhat.ren:5443/ocp4/openshift4
Replace the certificate of the router / ingress
Sometimes we need a certificate signed by a public CA for the router, so let's set that up.
https://docs.openshift.com/container-platform/4.6/security/certificates/replacing-default-ingress-certificate.html
mkdir -p /data/ccn/ingress-keys/etc
mkdir -p /data/ccn/ingress-keys/lib
cd /data/ccn/ingress-keys
podman run -it --rm --name certbot \
-v "/data/ccn/ingress-keys/etc:/etc/letsencrypt":Z \
-v "/data/ccn/ingress-keys/lib:/var/lib/letsencrypt":Z \
docker.io/certbot/certbot certonly -d "*.apps.ocp4.redhat.ren" --manual --preferred-challenges dns-01 --server https://acme-v02.api.letsencrypt.org/directory
cp ./etc/archive/apps.ocp4.redhat.ren/fullchain1.pem apps.ocp4.redhat.ren.crt
cp ./etc/archive/apps.ocp4.redhat.ren/privkey1.pem apps.ocp4.redhat.ren.key
ssh root@192.168.7.11 mkdir -p /data/install/ingress-key
scp apps.* root@192.168.7.11:/data/install/ingress-key
# on helper
cd /data/install/ingress-key
oc create secret tls wzh-ingress-key \
--cert=apps.ocp4.redhat.ren.crt \
--key=apps.ocp4.redhat.ren.key \
-n openshift-ingress
oc patch ingresscontroller.operator default \
--type=merge -p \
'{"spec":{"defaultCertificate": {"name": "wzh-ingress-key"}}}' \
-n openshift-ingress-operator
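To verify the router is now serving the new certificate, inspect any route under the apps domain, for example the console:
# the issuer should now be Let's Encrypt instead of the cluster-signed ingress CA
echo | openssl s_client -connect console-openshift-console.apps.ocp4.redhat.ren:443 -servername console-openshift-console.apps.ocp4.redhat.ren 2>/dev/null | openssl x509 -noout -issuer -dates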
Troubleshooting tips
# login to bootstrap to debug
# find the ip from kvm console
ssh -i ~/.ssh/helper_rsa core@192.168.7.75
journalctl -b -f -u release-image.service -u bootkube.service
journalctl -b -u release-image.service -u bootkube.service | grep -i baremetal
sudo -i
export KUBECONFIG=/etc/kubernetes/kubeconfig
oc get pod -n openshift-machine-api
oc get BareMetalHost -n openshift-machine-api
# debug why bootstrap can't be ping...
cat .openshift_install_state.json | jq '."*bootstrap.Bootstrap"'.Config.storage.files[].path
cat .openshift_install_state.json | jq -r '."*bootstrap.Bootstrap"'.File.Data | base64 -d | jq -r . > ign.json
cat .openshift_install_state.json | jq -r '."*bootstrap.Bootstrap".Config.storage.files[].contents.source ' | sed 's/.*base64,//g' | base64 -d > decode
cat .openshift_install_state.json | jq -r '."*bootstrap.Bootstrap".Config.storage.files[] | .path, .contents.source ' | while read -r line ; do if [[ $line =~ .*base64,.* ]]; then echo $(echo $line | sed 's/.*base64,//g' | base64 -d) ; else echo $line; fi; done > files
openshift 4.6 offline baremetal IPI (fully automated) installation, dual-network mode with a provisioning network
Introduction
Video walkthrough
This document describes an IPI (fully automated) installation of OCP 4.6 on bare metal (simulated with KVM).
According to the openshift documentation, baremetal IPI supports two network layouts: a dedicated provisioning network, or a provisioning network merged with the baremetal (service) network. To match the PoC environment on site, this exercise uses the more complex layout with separate baremetal and provisioning networks.
The architecture of this lab is shown below:
Offline installation package download
The packaged installation media, version 4.6.28, can be downloaded from Baidu Netdisk:
- Link: https://pan.baidu.com/s/1XFbiOAcz7nul-N9U0aDxHg Password: 6qtt
It contains the following files:
- ocp4.tgz: installation media such as the ISO, all installation scripts, the full list of downloaded images, etc. Copy it to the KVM host and to the helper node.
- registry.tgz: the packaged content of the docker image registry. If you need to add images first, follow 4.6.add.image.md.
Merge the split files with a command like:
cat registry.?? > registry.tgz
Note: you may need to update the helper ansible scripts contained in the offline package.
Prepare the offline installation source on a public cloud host
The documentation for preparing the offline installation media has moved here: 4.6.build.dist.md
KVM host preparation
This lab runs on a single host with 32 cores and 256 GB of RAM, using a number of virtual machines for the installation test, so we prepare that host first.
If you use multiple hosts, make sure their clocks are kept roughly in sync, otherwise the certificates will cause problems.
The main preparation steps are:
- configure the yum repository
- configure DNS
- install the image registry
- set up the VNC environment
- create the networks needed by KVM
- create the helper KVM
The DNS part of the preparation needs to be adapted to the actual project environment.
The host runs RHEL 8; see rhel8.build.kernel.repo.cache.md for the basic offline repo configuration.
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
cat << EOF >> /etc/hosts
127.0.0.1 registry.ocp4.redhat.ren
EOF
dnf clean all
dnf repolist
dnf -y install byobu htop jq ipmitool
systemctl disable --now firewalld
# configure the registry
mkdir -p /etc/crts/ && cd /etc/crts
# https://access.redhat.com/documentation/en-us/red_hat_codeready_workspaces/2.1/html/installation_guide/installing-codeready-workspaces-in-tls-mode-with-self-signed-certificates_crw
openssl genrsa -out /etc/crts/redhat.ren.ca.key 4096
openssl req -x509 \
-new -nodes \
-key /etc/crts/redhat.ren.ca.key \
-sha256 \
-days 36500 \
-out /etc/crts/redhat.ren.ca.crt \
-subj /CN="Local Red Hat Ren Signer" \
-reqexts SAN \
-extensions SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf '[SAN]\nbasicConstraints=critical, CA:TRUE\nkeyUsage=keyCertSign, cRLSign, digitalSignature'))
openssl genrsa -out /etc/crts/redhat.ren.key 2048
openssl req -new -sha256 \
-key /etc/crts/redhat.ren.key \
-subj "/O=Local Red Hat Ren /CN=*.ocp4.redhat.ren" \
-reqexts SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf "\n[SAN]\nsubjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth")) \
-out /etc/crts/redhat.ren.csr
openssl x509 \
-req \
-sha256 \
-extfile <(printf "subjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth") \
-days 365 \
-in /etc/crts/redhat.ren.csr \
-CA /etc/crts/redhat.ren.ca.crt \
-CAkey /etc/crts/redhat.ren.ca.key \
-CAcreateserial -out /etc/crts/redhat.ren.crt
openssl x509 -in /etc/crts/redhat.ren.crt -text
/bin/cp -f /etc/crts/redhat.ren.ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
cd /data
mkdir -p /data/registry
# tar zxf registry.tgz
dnf -y install podman pigz skopeo jq
# pigz -dc registry.tgz | tar xf -
cd /data/ocp4
podman load -i /data/ocp4/registry.tgz
podman run --name local-registry -p 5443:5000 \
-d --restart=always \
-v /data/registry/:/var/lib/registry:z \
-v /etc/crts:/certs:z \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/redhat.ren.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/redhat.ren.key \
docker.io/library/registry:2
podman start local-registry
# firewall-cmd --permanent --add-port=5443/tcp
# firewall-cmd --reload
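A quick sanity check that the registry container is up and serving TLS with our self-signed chain:
# list the repositories through the docker registry v2 API using the local CA
curl --cacert /etc/crts/redhat.ren.ca.crt https://registry.ocp4.redhat.ren:5443/v2/_catalog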
# load additional images
# extract ocp4.tgz first
bash add.image.load.sh /data/install.image 'registry.ocp4.redhat.ren:5443'
# https://github.com/christianh814/ocp4-upi-helpernode/blob/master/docs/quickstart.md
# prepare the vnc environment
vncpasswd
cat << EOF > ~/.vnc/config
session=gnome
securitytypes=vncauth,tlsvnc
desktop=sandbox
geometry=1280x800
alwaysshared
EOF
cat << EOF >> /etc/tigervnc/vncserver.users
:1=root
EOF
systemctl start vncserver@:1
# if you want to stop the vnc server, do this
systemctl stop vncserver@:1
# firewall-cmd --permanent --add-port=6001/tcp
# firewall-cmd --permanent --add-port=5901/tcp
# firewall-cmd --reload
# connect vnc at port 5901
# export DISPLAY=:1
# create the virtual networks for the lab
cat << 'EOF' > /data/kvm/bridge.sh
#!/usr/bin/env bash
PUB_CONN='eno1'
PUB_IP='172.21.6.105/24'
PUB_GW='172.21.6.254'
PUB_DNS='172.21.1.1'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down baremetal
nmcli con delete baremetal
# RHEL 8.1 appends the word "System" in front of the connection,delete in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname baremetal type bridge con-name baremetal ipv4.method 'manual' \
ipv4.address "$PUB_IP" \
ipv4.gateway "$PUB_GW" \
ipv4.dns "$PUB_DNS"
nmcli con add type bridge-slave ifname "$PUB_CONN" master baremetal
pkill dhclient;dhclient baremetal
nmcli con down baremetal
nmcli con up baremetal
EOF
bash /data/kvm/bridge.sh
nmcli con mod baremetal +ipv4.address '192.168.7.1/24'
cat << 'EOF' > /data/kvm/bridge.provisioning.sh
#!/usr/bin/env bash
PUB_CONN='eno2'
PUB_IP='172.22.0.1/24'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down provisioning
nmcli con delete provisioning
# RHEL 8.1 appends the word "System" in front of the connection,delete in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname provisioning type bridge con-name provisioning ipv4.addresses $PUB_IP ipv4.method manual
nmcli con add type bridge-slave ifname "$PUB_CONN" master provisioning
nmcli con down provisioning
nmcli con up provisioning
EOF
bash /data/kvm/bridge.provisioning.sh
nmcli networking off; nmcli networking on
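Before creating any VM, it is worth confirming that both bridges exist and carry the expected addresses:
# baremetal should own 172.21.6.105/24 and 192.168.7.1/24, provisioning should own 172.22.0.1/24
nmcli -f NAME,TYPE,DEVICE con show --active
ip -4 addr show baremetal
ip -4 addr show provisioning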
# create the helper VM
mkdir -p /data/kvm
cd /data/kvm
lvremove -f rhel/helperlv
lvcreate -y -L 100G -n helperlv rhel
virt-install --name="ocp4-aHelper" --vcpus=4 --ram=6144 \
--disk path=/dev/rhel/helperlv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--boot menu=on --location /data/kvm/rhel-8.3-x86_64-dvd.iso \
--initrd-inject helper-ks-rhel8-ipi.cfg --extra-args "inst.ks=file:/helper-ks-rhel8-ipi.cfg"
virsh start ocp4-aHelper
# DO NOT USE, restore kvm
virsh destroy ocp4-aHelper
virsh undefine ocp4-aHelper
# virt-viewer --domain-name ocp4-aHelper
# virsh start ocp4-aHelper
# virsh list --all
# start chrony/ntp server on host
/bin/cp -f /etc/chrony.conf /etc/chrony.conf.default
cat << EOF > /etc/chrony.conf
# pool 2.rhel.pool.ntp.org iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
allow 192.0.0.0/8
local stratum 10
logdir /var/log/chrony
EOF
systemctl enable --now chronyd
# systemctl restart chronyd
chronyc tracking
chronyc sources -v
chronyc sourcestats -v
chronyc makestep
# setup ftp data root
mount --bind /data/dnf /var/ftp/dnf
chcon -R -t public_content_t /var/ftp/dnf
# create the master and worker vm, but not start them
export KVM_DIRECTORY=/data/kvm
# cd ${KVM_DIRECTORY}
# scp root@192.168.7.11:/data/install/*.iso ${KVM_DIRECTORY}/
create_lv() {
var_vg=$1
var_lv=$2
lvremove -f $var_vg/$var_lv
lvcreate -y -L 120G -n $var_lv $var_vg
wipefs --all --force /dev/$var_vg/$var_lv
}
# create_lv rhel bootstraplv
create_lv nvme master0lv
create_lv nvme master1lv
create_lv nvme master2lv
create_lv rhel worker0lv
create_lv rhel worker1lv
create_lv rhel worker2lv
virt-install --name=ocp4-master0 --vcpus=4 --ram=16384 \
--disk path=/dev/nvme/master0lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=provisioning,model=virtio \
--network bridge=baremetal,model=virtio \
--boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
--print-xml > ${KVM_DIRECTORY}/ocp4-master0.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-master0.xml
virt-install --name=ocp4-master1 --vcpus=4 --ram=16384 \
--disk path=/dev/nvme/master1lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=provisioning,model=virtio \
--network bridge=baremetal,model=virtio \
--boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
--print-xml > ${KVM_DIRECTORY}/ocp4-master1.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-master1.xml
virt-install --name=ocp4-master2 --vcpus=4 --ram=16384 \
--disk path=/dev/nvme/master2lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=provisioning,model=virtio \
--network bridge=baremetal,model=virtio \
--boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
--print-xml > ${KVM_DIRECTORY}/ocp4-master2.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-master2.xml
virt-install --name=ocp4-worker0 --vcpus=4 --ram=32768 \
--disk path=/dev/rhel/worker0lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=provisioning,model=virtio \
--network bridge=baremetal,model=virtio \
--boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
--print-xml > ${KVM_DIRECTORY}/ocp4-worker0.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-worker0.xml
virt-install --name=ocp4-worker1 --vcpus=4 --ram=16384 \
--disk path=/dev/rhel/worker1lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=provisioning,model=virtio \
--network bridge=baremetal,model=virtio \
--boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
--print-xml > ${KVM_DIRECTORY}/ocp4-worker1.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-worker1.xml
virt-install --name=ocp4-worker2 --vcpus=4 --ram=16384 \
--disk path=/dev/rhel/worker2lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=provisioning,model=virtio \
--network bridge=baremetal,model=virtio \
--boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
--print-xml > ${KVM_DIRECTORY}/ocp4-worker2.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-worker2.xml
cd /data/kvm/
for i in master{0..2} worker{0..2}
do
echo -ne "${i}\t" ;
virsh dumpxml ocp4-${i} | grep "mac address" | cut -d\' -f2 | tr '\n' '\t'
echo
done > mac.list
cat /data/kvm/mac.list
# master0 52:54:00:a8:77:90 52:54:00:1f:1c:1f
# master1 52:54:00:8a:97:b3 52:54:00:a1:d6:df
# master2 52:54:00:54:8f:4a 52:54:00:0b:7c:61
# worker0 52:54:00:4c:8a:80 52:54:00:f0:f4:2b
# worker1 52:54:00:89:eb:62 52:54:00:ee:e4:2b
# worker2 52:54:00:e1:ec:6e 52:54:00:1b:d6:b5
# GOTO image registry & kvm host
# copy crt files to helper node
ssh-copy-id root@192.168.7.11
ssh root@192.168.7.11 mkdir -p /data/install
ssh root@192.168.7.11 mkdir -p /data/ocp4
scp /data/down/ocp4.tgz root@192.168.7.11:/data/
scp /etc/crts/redhat.ren.ca.crt root@192.168.7.11:/data/install/
scp /data/kvm/mac.list root@192.168.7.11:/data/install/
# install redfish for kvm
# https://access.redhat.com/solutions/4315581
# https://access.redhat.com/solutions/3057171
# https://docs.openstack.org/virtualbmc/latest/user/index.html
# https://docs.openstack.org/sushy-tools/latest/user/dynamic-emulator.html
dnf -y install python3-pip
# pip3 install --user sushy-tools
mkdir -p /data/install
cd /data/install
podman create --name swap docker.io/wangzheng422/imgs:openshift-baremetal-install-4.6.5 ls
podman cp swap:/openshift-baremetal-install ./
podman rm -fv swap
podman create --name swap docker.io/wangzheng422/imgs:ocp.bm.ipi.python.dep.rhel8-4.6.7 ls
podman cp swap:/wheelhouse.tar.gz - > wheelhouse.tar.gz
tar zvxf wheelhouse.tar.gz
podman rm -fv swap
pip3 install --user -r wheelhouse/requirements.txt --no-index --find-links wheelhouse
ps -ef | grep vbmcd | awk '{print $2}' | xargs kill
/bin/rm -f /root/.vbmc/master.pid
/root/.local/bin/vbmcd
# curl https://registry.ocp4.redhat.ren:8000/redfish/v1/Systems/
virsh list --all
# /root/.local/bin/vbmc add ocp4-bootstrap --port 6230 --username admin --password password
# /root/.local/bin/vbmc start ocp4-bootstrap
var_i=1
for i in master{0..2} worker{0..2}
do
/root/.local/bin/vbmc add ocp4-$i --port $(( 6230 + $var_i )) --username admin --password password
/root/.local/bin/vbmc start ocp4-$i
(( var_i += 1))
done
/root/.local/bin/vbmc list
# +--------------+---------+---------+------+
# | Domain name | Status | Address | Port |
# +--------------+---------+---------+------+
# | ocp4-master0 | running | :: | 6231 |
# | ocp4-master1 | running | :: | 6232 |
# | ocp4-master2 | running | :: | 6233 |
# | ocp4-worker0 | running | :: | 6234 |
# | ocp4-worker1 | running | :: | 6235 |
# | ocp4-worker2 | running | :: | 6236 |
# +--------------+---------+---------+------+
/root/.local/bin/vbmc show ocp4-master0
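Since ipmitool was installed earlier, each emulated BMC can be exercised directly before handing it to the installer, for example master0 on port 6231:
# query the power state through the virtual BMC; the other nodes use ports 6232-6236
ipmitool -I lanplus -H 192.168.7.1 -p 6231 -U admin -P password power status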
# DO NOT USE, restore
var_i=1
for i in master{0..2} worker{0..2}
do
/root/.local/bin/vbmc stop ocp4-$i
/root/.local/bin/vbmc delete ocp4-$i
(( var_i += 1))
done
# if you want to stop or delete vm, try this
virsh list --all
# virsh destroy ocp4-bootstrap
virsh destroy ocp4-master0
virsh destroy ocp4-master1
virsh destroy ocp4-master2
virsh destroy ocp4-worker0
virsh destroy ocp4-worker1
virsh destroy ocp4-worker2
# virsh undefine ocp4-bootstrap
virsh undefine ocp4-master0 --nvram
virsh undefine ocp4-master1 --nvram
virsh undefine ocp4-master2 --nvram
virsh undefine ocp4-worker0 --nvram
virsh undefine ocp4-worker1 --nvram
virsh undefine ocp4-worker2 --nvram
Helper node preparation
The following installation steps are performed on the helper node.
The main steps are:
- configure the yum repository
- run the ansible playbook to configure the helper automatically
- upload the customized installation config file
- generate the ignition files
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
systemctl restart sshd
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
systemctl disable --now firewalld
# in helper node
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
export YUMIP="192.168.7.1"
cat << EOF > /etc/yum.repos.d/remote.repo
[remote-epel]
name=epel
baseurl=ftp://${YUMIP}/dnf/epel
enabled=1
gpgcheck=0
[remote-epel-modular]
name=epel-modular
baseurl=ftp://${YUMIP}/dnf/epel-modular
enabled=1
gpgcheck=0
[remote-appstream]
name=appstream
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-appstream-rpms
enabled=1
gpgcheck=0
[remote-baseos]
name=baseos
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-baseos-rpms
enabled=1
gpgcheck=0
[remote-baseos-source]
name=baseos-source
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-baseos-source-rpms
enabled=1
gpgcheck=0
[remote-supplementary]
name=supplementary
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-supplementary-rpms
enabled=1
gpgcheck=0
[remote-codeready-builder]
name=supplementary
baseurl=ftp://${YUMIP}/dnf/codeready-builder-for-rhel-8-x86_64-rpms
enabled=1
gpgcheck=0
EOF
yum clean all
yum makecache
yum repolist
yum -y install ansible git unzip podman python3
yum -y update
reboot
# yum -y install ansible git unzip podman python36
mkdir -p /data/ocp4/
# scp ocp4.tgz to /data
# scp /data/down/ocp4.tgz root@192.168.7.11:/data/
cd /data
tar zvxf ocp4.tgz
cd /data/ocp4
# this ansible project deploys the services needed on the helper node
# https://github.com/wangzheng422/ocp4-upi-helpernode
unzip ocp4-upi-helpernode.zip
# this project merges files into ignition files, which helps to customize them
# https://github.com/wangzheng422/filetranspiler
podman load -i filetranspiler.tgz
mkdir -p /data/install
mkdir -p /data/ocp4/
cat << EOF > /data/ocp4/ocp4-upi-helpernode-master/vars-dhcp.rhel8.yaml
---
ssh_gen_key: true
staticips: false
bm_ipi: true
firewalld: false
dns_forward: false
iso:
iso_dl_url: "file:///data/ocp4/rhcos-live.x86_64.iso"
my_iso: "rhcos-live.iso"
helper:
name: "helper"
ipaddr: "192.168.7.11"
networkifacename: "enp1s0"
gateway: "192.168.7.1"
netmask: "255.255.255.0"
dns:
domain: "redhat.ren"
clusterid: "ocp4"
forwarder1: "192.168.7.1"
forwarder2: "192.168.7.1"
api_vip: "192.168.7.100"
ingress_vip: "192.168.7.101"
dhcp:
router: "192.168.7.1"
bcast: "192.168.7.255"
netmask: "255.255.255.0"
poolstart: "192.168.7.70"
poolend: "192.168.7.90"
ipid: "192.168.7.0"
netmaskid: "255.255.255.0"
bootstrap:
name: "bootstrap"
ipaddr: "192.168.7.12"
interface: "enp1s0"
install_drive: "vda"
macaddr: "52:54:00:7e:f8:f7"
masters:
- name: "master-0"
ipaddr: "192.168.7.13"
interface: "enp1s0"
install_drive: "vda"
macaddr: "$(cat /data/install/mac.list | grep master0 | awk '{print $3}')"
- name: "master-1"
ipaddr: "192.168.7.14"
interface: "enp1s0"
install_drive: "vda"
macaddr: "$(cat /data/install/mac.list | grep master1 | awk '{print $3}')"
- name: "master-2"
ipaddr: "192.168.7.15"
interface: "enp1s0"
install_drive: "vda"
macaddr: "$(cat /data/install/mac.list | grep master2 | awk '{print $3}')"
workers:
- name: "worker-0"
ipaddr: "192.168.7.16"
interface: "enp1s0"
install_drive: "vda"
macaddr: "$(cat /data/install/mac.list | grep worker0 | awk '{print $3}')"
- name: "worker-1"
ipaddr: "192.168.7.17"
interface: "enp1s0"
install_drive: "vda"
macaddr: "$(cat /data/install/mac.list | grep worker1 | awk '{print $3}')"
- name: "worker-2"
ipaddr: "192.168.7.18"
interface: "enp1s0"
install_drive: "vda"
macaddr: "$(cat /data/install/mac.list | grep worker2 | awk '{print $3}')"
others:
- name: "registry"
ipaddr: "192.168.7.1"
macaddr: "52:54:00:7e:f8:f7"
- name: "yum"
ipaddr: "192.168.7.1"
macaddr: "52:54:00:7e:f8:f7"
- name: "quay"
ipaddr: "192.168.7.1"
macaddr: "52:54:00:7e:f8:f7"
otherdomains:
- domain: "rhv.redhat.ren"
hosts:
- name: "manager"
ipaddr: "192.168.7.71"
- name: "rhv01"
ipaddr: "192.168.7.72"
- domain: "cmri-edge.redhat.ren"
hosts:
- name: "*"
ipaddr: "192.168.7.71"
- name: "*.apps"
ipaddr: "192.168.7.72"
force_ocp_download: false
remove_old_config_files: false
ocp_client: "file:///data/ocp4/4.6.5/openshift-client-linux-4.6.5.tar.gz"
ocp_installer: "file:///data/ocp4/4.6.5/openshift-install-linux-4.6.5.tar.gz"
ppc64le: false
arch: 'x86_64'
chronyconfig:
enabled: true
content:
- server: "192.168.7.1"
options: iburst
setup_registry:
deploy: false
registry_image: docker.io/library/registry:2
local_repo: "ocp4/openshift4"
product_repo: "openshift-release-dev"
release_name: "ocp-release"
release_tag: "4.6.1-x86_64"
registry_server: "registry.ocp4.redhat.ren:5443"
EOF
# Next we use ansible to configure the helper node and install the services the openshift cluster needs
# adjust ocp4-upi-helpernode-master/vars-dhcp.rhel8.yaml above to match the on-site environment
cd /data/ocp4/ocp4-upi-helpernode-master
ansible-playbook -e @vars-dhcp.rhel8.yaml -e '{ staticips: false, bm_ipi: true }' tasks/main.yml
# try this:
/usr/local/bin/helpernodecheck
# Go back to the helper node
/bin/cp -f /data/install/redhat.ren.ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
# customize the install configuration
cd /data/install
# adjust install-config.yaml to match the on-site environment
# at minimum change the ssh key and additionalTrustBundle (the CA certificate of the image registry)
# vi install-config.yaml
cat << EOF > /data/install/install-config.yaml
apiVersion: v1
baseDomain: redhat.ren
platform:
baremetal:
apiVIP: 192.168.7.100
ingressVIP: 192.168.7.101
# provisioningBridge: provisioning
provisioningNetworkCIDR: 172.22.0.0/24
# provisioningDHCPRange: 172.22.0.10,172.22.0.100
# clusterProvisioningIP: 172.22.0.3
# bootstrapProvisioningIP: 172.22.0.2
# provisioningNetwork: Managed
provisioningNetworkInterface: enp1s0
# externalBridge: baremetal
bootstrapOSImage: http://192.168.7.11:8080/install/rhcos-qemu.x86_64.qcow2.gz?sha256=$(zcat /var/www/html/install/rhcos-qemu.x86_64.qcow2.gz | sha256sum | awk '{print $1}')
clusterOSImage: http://192.168.7.11:8080/install/rhcos-openstack.x86_64.qcow2.gz?sha256=$(sha256sum /var/www/html/install/rhcos-openstack.x86_64.qcow2.gz | awk '{print $1}')
hosts:
- name: master-0
role: master
bmc:
address: ipmi://192.168.7.1:6231
username: admin
password: password
disableCertificateVerification: True
bootMACAddress: $(cat mac.list | grep master0 | awk '{print $2}')
hardwareProfile: default
rootDeviceHints:
deviceName: "/dev/vda"
- name: master-1
role: master
bmc:
address: ipmi://192.168.7.1:6232
username: admin
password: password
disableCertificateVerification: True
bootMACAddress: $(cat mac.list | grep master1 | awk '{print $2}')
hardwareProfile: default
rootDeviceHints:
deviceName: "/dev/vda"
- name: master-2
role: master
bmc:
address: ipmi://192.168.7.1:6233
username: admin
password: password
disableCertificateVerification: True
bootMACAddress: $(cat mac.list | grep master2 | awk '{print $2}')
hardwareProfile: default
rootDeviceHints:
deviceName: "/dev/vda"
- name: worker-0
role: worker
bmc:
address: ipmi://192.168.7.1:6234
username: admin
password: password
disableCertificateVerification: True
bootMACAddress: $(cat mac.list | grep worker0 | awk '{print $2}')
hardwareProfile: unknown
rootDeviceHints:
deviceName: "/dev/vda"
- name: worker-1
role: worker
bmc:
address: ipmi://192.168.7.1:6235
username: admin
password: password
disableCertificateVerification: True
bootMACAddress: $(cat mac.list | grep worker1 | awk '{print $2}')
hardwareProfile: unknown
rootDeviceHints:
deviceName: "/dev/vda"
metadata:
name: ocp4
networking:
clusterNetworks:
- cidr: 10.254.0.0/16
hostPrefix: 24
networkType: OpenShiftSDN
serviceNetwork:
- 172.30.0.0/16
machineCIDR: 192.168.7.0/24
compute:
- name: worker
replicas: 2
controlPlane:
name: master
replicas: 3
platform:
baremetal: {}
pullSecret: '{"auths":{"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ppa.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"}}}'
sshKey: |
$( cat /root/.ssh/helper_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /data/install/redhat.ren.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- registry.ocp4.redhat.ren:5443/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- registry.ocp4.redhat.ren:5443/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF
# GO back to host
mkdir -p /data/install
cd /data/install
scp root@192.168.7.11:/data/install/install-config.yaml /data/install/
cd /data/install
for i in $(sudo virsh list --all | tail -n +3 | grep bootstrap | awk {'print $2'});
do
sudo virsh destroy $i;
sudo virsh undefine $i;
sudo virsh vol-delete $i --pool default;
sudo virsh vol-delete $i.ign --pool default;
virsh pool-destroy $i
virsh pool-delete $i
virsh pool-undefine $i
done
/bin/rm -rf .openshift_install.log .openshift_install_state.json terraform* auth tls
./openshift-baremetal-install --dir /data/install/ --log-level debug create cluster
# INFO Install complete!
# INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.ocp4.redhat.ren
# INFO Login to the console with user: "kubeadmin", and password: "dTSbu-aIIZr-gxRxT-njrEr"
The installation is fully automated, so there is nothing to do but wait. In provisioning-network mode you can see that each master brings up two NICs.
Afterwards we can go back to the helper node and operate the cluster with the familiar oc commands.
# on kvm host, copy back auth folder
rsync -arz /data/install/auth root@192.168.7.11:/data/install/
# Go back to helper
ansible localhost -m lineinfile -a 'path=$HOME/.bashrc regexp="^export KUBECONFIG" line="export KUBECONFIG=/data/install/auth/kubeconfig"'
source $HOME/.bashrc
oc get node
oc get pod -n openshift-machine-api
oc get BareMetalHost -n openshift-machine-api
oc get bmh -n openshift-machine-api
# NAME STATUS PROVISIONING STATUS CONSUMER BMC HARDWARE PROFILE ONLINE ERROR
# master-0 OK externally provisioned ocp4-sbsqb-master-0 ipmi://192.168.7.1:6231 true
# master-1 OK externally provisioned ocp4-sbsqb-master-1 ipmi://192.168.7.1:6232 true
# master-2 OK externally provisioned ocp4-sbsqb-master-2 ipmi://192.168.7.1:6233 true
# worker-0 OK provisioned ocp4-sbsqb-worker-0-kcz5t ipmi://192.168.7.1:6234 unknown true
# worker-1 OK provisioned ocp4-sbsqb-worker-0-5ktqw ipmi://192.168.7.1:6235 unknown true
# worker-2 OK ready ipmi://192.168.7.1:6236 unknown false
oc get pod -n openshift-kni-infra
With that we can see the BareMetalHost output.
The node configuration in the web console now points to its BareMetalHost.
We can also see the familiar Machine objects again.
Add a new node
In IPI mode adding a new node is very convenient: just define another BareMetalHost.
cd /data/install/
cat << EOF > /data/install/bmh.yaml
---
apiVersion: v1
kind: Secret
metadata:
name: worker-2-bmc-secret
type: Opaque
data:
username: $(echo -ne "admin" | base64)
password: $(echo -ne "password" | base64)
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: worker-2
spec:
online: true
bootMACAddress: $(cat mac.list | grep worker2 | awk '{print $2}')
bmc:
address: ipmi://192.168.7.1:6236
credentialsName: worker-2-bmc-secret
disableCertificateVerification: true
hardwareProfile: unknown
rootDeviceHints:
deviceName: /dev/vda
EOF
oc -n openshift-machine-api create -f bmh.yaml
# DO NOT USE, restore, delete the vm
oc -n openshift-machine-api delete -f bmh.yaml
oc get bmh -n openshift-machine-api
# NAME STATUS PROVISIONING STATUS CONSUMER BMC HARDWARE PROFILE ONLINE ERROR
# master-0 OK externally provisioned ocp4-sbsqb-master-0 ipmi://192.168.7.1:6231 true
# master-1 OK externally provisioned ocp4-sbsqb-master-1 ipmi://192.168.7.1:6232 true
# master-2 OK externally provisioned ocp4-sbsqb-master-2 ipmi://192.168.7.1:6233 true
# worker-0 OK provisioned ocp4-sbsqb-worker-0-kcz5t ipmi://192.168.7.1:6234 unknown true
# worker-1 OK provisioned ocp4-sbsqb-worker-0-5ktqw ipmi://192.168.7.1:6235 unknown true
# worker-2 OK ready ipmi://192.168.7.1:6236 unknown false
oc get machinesets -n openshift-machine-api
# NAME DESIRED CURRENT READY AVAILABLE AGE
# ocp4-sbsqb-worker-0 2 2 2 2 99m
oc get machinesets -n openshift-machine-api -o json | jq -r .items[0].metadata.name
# scale the workers to 3 replicas; this triggers the deployment of worker-2
oc scale --replicas=3 machineset $(oc get machinesets -n openshift-machine-api -o json | jq -r .items[0].metadata.name) -n openshift-machine-api
Troubleshooting tips
# login to bootstrap to debug
# find the ip from kvm console
ssh -i ~/.ssh/helper_rsa core@192.168.7.75
journalctl -b -f -u release-image.service -u bootkube.service
journalctl -b -u release-image.service -u bootkube.service | grep -i baremetal
sudo -i
export KUBECONFIG=/etc/kubernetes/kubeconfig
oc get pod -n openshift-machine-api
oc get BareMetalHost -n openshift-machine-api
# debug why bootstrap can't be ping...
cat .openshift_install_state.json | jq '."*bootstrap.Bootstrap"'.Config.storage.files[].path
cat .openshift_install_state.json | jq -r '."*bootstrap.Bootstrap"'.File.Data | base64 -d | jq -r . > ign.json
cat .openshift_install_state.json | jq -r '."*bootstrap.Bootstrap".Config.storage.files[].contents.source ' | sed 's/.*base64,//g' | base64 -d > decode
cat .openshift_install_state.json | jq -r '."*bootstrap.Bootstrap".Config.storage.files[] | .path, .contents.source ' | while read -r line ; do if [[ $line =~ .*base64,.* ]]; then echo $(echo $line | sed 's/.*base64,//g' | base64 -d) ; else echo $line; fi; done > files
openshift 4.6 offline baremetal installation with static IPs, using the cilium network plugin
based on: https://docs.cilium.io/en/stable/gettingstarted/k8s-install-openshift-okd/
base on cilium v1.9.4
Installation video
This document describes a static-IP installation of OCP 4.6 on bare metal (simulated with KVM), using the cilium network plugin.
The architecture of this lab is shown below:
Offline installation package download
The OCP 4 offline installation package is prepared differently from 3.11; prepare it as follows. Because the default baremetal install requires a DHCP/PXE environment, we need a helper machine providing dhcp, tftp, haproxy and so on, and to make on-site work easier we also prepared a tool for modifying ignition files, so the offline package includes a few extra third-party tools.
https://github.com/wangzheng422/ocp4-upi-helpernode is used to build the helper machine.
https://github.com/wangzheng422/filetranspiler is used to modify ignition files.
The packaged installation media, version 4.6.12, can be downloaded from Baidu Netdisk:
It contains the following files:
- ocp4.tgz: installation media such as the ISO, all installation scripts, the full list of downloaded images, etc. Copy it to the KVM host and to the helper node.
- registry.tgz: the packaged content of the docker image registry. If you need to add images first, follow 4.6.add.image.md.
- install.image.tgz: additional images needed while installing the cluster.
- rhel-data.7.9.tgz: the yum repository for RHEL 7 hosts; it is this large because it also contains gpu, epel and other content. It is mainly used for the KVM host, the helper, and RHEL compute nodes.
Merge the split files with a command like:
cat registry.?? > registry.tgz
Prepare the offline installation source on a public cloud host
The documentation for preparing the offline installation media has moved here: 4.6.build.dist.md
KVM host preparation
This lab runs on a single host with 32 cores and 256 GB of RAM, using a number of virtual machines for the installation test, so we prepare that host first.
If you use multiple hosts, make sure their clocks are kept roughly in sync, otherwise the certificates will cause problems.
The main preparation steps are:
- configure the yum repository
- configure DNS
- install the image registry
- set up the VNC environment
- create the networks needed by KVM
- create the helper KVM
- configure an haproxy to forward external traffic into the KVMs
The DNS part of the preparation needs to be adapted to the actual project environment.
The host runs RHEL 8; see rhel8.build.kernel.repo.cache.md for the basic configuration.
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
cat << EOF >> /etc/hosts
127.0.0.1 registry.ocp4.redhat.ren
EOF
dnf clean all
dnf repolist
dnf -y install byobu htop
systemctl disable --now firewalld
# configure the registry
mkdir -p /etc/crts/ && cd /etc/crts
# https://access.redhat.com/documentation/en-us/red_hat_codeready_workspaces/2.1/html/installation_guide/installing-codeready-workspaces-in-tls-mode-with-self-signed-certificates_crw
openssl genrsa -out /etc/crts/redhat.ren.ca.key 4096
openssl req -x509 \
-new -nodes \
-key /etc/crts/redhat.ren.ca.key \
-sha256 \
-days 36500 \
-out /etc/crts/redhat.ren.ca.crt \
-subj /CN="Local Red Hat Ren Signer" \
-reqexts SAN \
-extensions SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf '[SAN]\nbasicConstraints=critical, CA:TRUE\nkeyUsage=keyCertSign, cRLSign, digitalSignature'))
openssl genrsa -out /etc/crts/redhat.ren.key 2048
openssl req -new -sha256 \
-key /etc/crts/redhat.ren.key \
-subj "/O=Local Red Hat Ren /CN=*.ocp4.redhat.ren" \
-reqexts SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf "\n[SAN]\nsubjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth")) \
-out /etc/crts/redhat.ren.csr
openssl x509 \
-req \
-sha256 \
-extfile <(printf "subjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth") \
-days 36500 \
-in /etc/crts/redhat.ren.csr \
-CA /etc/crts/redhat.ren.ca.crt \
-CAkey /etc/crts/redhat.ren.ca.key \
-CAcreateserial -out /etc/crts/redhat.ren.crt
openssl x509 -in /etc/crts/redhat.ren.crt -text
/bin/cp -f /etc/crts/redhat.ren.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
cd /data
mkdir -p /data/registry
# tar zxf registry.tgz
dnf -y install podman pigz skopeo jq
# pigz -dc registry.tgz | tar xf -
cd /data/ocp4
podman load -i /data/ocp4/registry.tgz
podman run --name local-registry -p 5443:5000 \
-d --restart=always \
-v /data/registry/:/var/lib/registry:z \
-v /etc/crts:/certs:z \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/redhat.ren.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/redhat.ren.key \
docker.io/library/registry:2
# firewall-cmd --permanent --add-port=5443/tcp
# firewall-cmd --reload
# load additional images
# extract ocp4.tgz first
bash add.image.load.sh /data/install.image 'registry.ocp4.redhat.ren:5443'
# https://github.com/christianh814/ocp4-upi-helpernode/blob/master/docs/quickstart.md
# prepare the vnc environment
vncpasswd
cat << EOF > ~/.vnc/config
session=gnome
securitytypes=vncauth,tlsvnc
desktop=sandbox
geometry=1280x800
alwaysshared
EOF
cat << EOF >> /etc/tigervnc/vncserver.users
:1=root
EOF
systemctl start vncserver@:1
# if you want to stop the vnc server, do this
systemctl stop vncserver@:1
# firewall-cmd --permanent --add-port=6001/tcp
# firewall-cmd --permanent --add-port=5901/tcp
# firewall-cmd --reload
# connect vnc at port 5901
# export DISPLAY=:1
# create the virtual network for the lab
cat << 'EOF' > /data/kvm/bridge.sh
#!/usr/bin/env bash
PUB_CONN='eno1'
PUB_IP='172.21.6.105/24'
PUB_GW='172.21.6.254'
PUB_DNS='172.21.1.1'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down baremetal
nmcli con delete baremetal
# RHEL 8.1 appends the word "System" in front of the connection,delete in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname baremetal type bridge con-name baremetal ipv4.method 'manual' \
ipv4.address "$PUB_IP" \
ipv4.gateway "$PUB_GW" \
ipv4.dns "$PUB_DNS"
nmcli con add type bridge-slave ifname "$PUB_CONN" master baremetal
nmcli con down "$PUB_CONN";pkill dhclient;dhclient baremetal
nmcli con up baremetal
EOF
bash /data/kvm/bridge.sh
nmcli con mod baremetal +ipv4.address '192.168.7.1/24'
nmcli networking off; nmcli networking on
# create the helper VM
mkdir -p /data/kvm
cd /data/kvm
lvremove -f rhel/helperlv
lvcreate -y -L 200G -n helperlv rhel
virt-install --name="ocp4-aHelper" --vcpus=2 --ram=4096 \
--disk path=/dev/rhel/helperlv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network network=openshift4,model=virtio \
--boot menu=on --location /data/kvm/rhel-8.3-x86_64-dvd.iso \
--initrd-inject helper-ks-rhel8.cfg --extra-args "inst.ks=file:/helper-ks-rhel8.cfg"
# restore kvm
virsh destroy ocp4-aHelper
virsh undefine ocp4-aHelper
# virt-viewer --domain-name ocp4-aHelper
# virsh start ocp4-aHelper
# virsh list --all
# start chrony/ntp server on host
/bin/cp -f /etc/chrony.conf /etc/chrony.conf.default
cat << EOF > /etc/chrony.conf
# pool 2.rhel.pool.ntp.org iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
allow 192.0.0.0/8
local stratum 10
logdir /var/log/chrony
EOF
systemctl enable --now chronyd
# systemctl restart chronyd
chronyc tracking
chronyc sources -v
chronyc sourcestats -v
chronyc makestep
# setup ftp data root
mount --bind /data/dnf /var/ftp/dnf
chcon -R -t public_content_t /var/ftp/dnf
Helper node preparation
The following installation steps are performed on the helper node.
The main steps are:
- configure the yum repository
- run the ansible playbook to configure the helper automatically
- upload the customized installation config file
- generate the ignition files
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
systemctl restart sshd
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
# in helper node
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
export YUMIP="192.168.7.1"
cat << EOF > /etc/yum.repos.d/remote.repo
[remote-epel]
name=epel
baseurl=ftp://${YUMIP}/dnf/epel
enabled=1
gpgcheck=0
[remote-epel-modular]
name=epel-modular
baseurl=ftp://${YUMIP}/dnf/epel-modular
enabled=1
gpgcheck=0
[remote-appstream]
name=appstream
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-appstream-rpms
enabled=1
gpgcheck=0
[remote-baseos]
name=baseos
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-baseos-rpms
enabled=1
gpgcheck=0
[remote-baseos-source]
name=baseos-source
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-baseos-source-rpms
enabled=1
gpgcheck=0
[remote-supplementary]
name=supplementary
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-supplementary-rpms
enabled=1
gpgcheck=0
[remote-codeready-builder]
name=supplementary
baseurl=ftp://${YUMIP}/dnf/codeready-builder-for-rhel-8-x86_64-rpms
enabled=1
gpgcheck=0
EOF
yum clean all
yum makecache
yum repolist
yum -y install ansible git unzip podman python3
yum -y update
reboot
# yum -y install ansible git unzip podman python36
mkdir -p /data/ocp4/
# scp ocp4.tgz to /data
# scp /data/down/ocp4.tgz root@192.168.7.11:/data/
# rsync -e ssh --info=progress2 -P --delete -arz /data/ocp4/ root@192.168.7.11:/data/ocp4/
cd /data
tar zvxf ocp4.tgz
cd /data/ocp4
# this ansible project deploys the services needed on the helper node
# https://github.com/wangzheng422/ocp4-upi-helpernode
unzip ocp4-upi-helpernode.zip
# this project merges files into ignition files, which helps to customize them
# https://github.com/wangzheng422/filetranspiler
podman load -i filetranspiler.tgz
# Next we use ansible to configure the helper node and install the services the openshift cluster needs
# adjust ocp4-upi-helpernode-master/vars-static.rhel8.yaml below to match the on-site environment
# mainly the NIC and disk parameters of each node, plus the IP addresses
cat << EOF > /data/ocp4/ocp4-upi-helpernode-master/vars-static.rhel8.yaml
---
ssh_gen_key: true
staticips: true
bm_ipi: false
firewalld: false
dns_forward: false
iso:
iso_dl_url: "file:///data/ocp4/rhcos-live.x86_64.iso"
my_iso: "rhcos-live.iso"
helper:
name: "helper"
ipaddr: "192.168.7.11"
networkifacename: "enp1s0"
gateway: "192.168.7.1"
netmask: "255.255.255.0"
dns:
domain: "redhat.ren"
clusterid: "ocp4"
forwarder1: "192.168.7.1"
forwarder2: "192.168.7.1"
api_vip: "192.168.7.11"
ingress_vip: "192.168.7.11"
dhcp:
router: "192.168.7.1"
bcast: "192.168.7.255"
netmask: "255.255.255.0"
poolstart: "192.168.7.70"
poolend: "192.168.7.90"
ipid: "192.168.7.0"
netmaskid: "255.255.255.0"
bootstrap:
name: "bootstrap"
ipaddr: "192.168.7.12"
interface: "enp1s0"
install_drive: "vda"
macaddr: "52:54:00:7e:f8:f7"
masters:
- name: "master-0"
ipaddr: "192.168.7.13"
interface: "enp1s0"
install_drive: "vda"
macaddr: ""
- name: "master-1"
ipaddr: "192.168.7.14"
interface: "enp1s0"
install_drive: "vda"
macaddr: ""
- name: "master-2"
ipaddr: "192.168.7.15"
interface: "enp1s0"
install_drive: "vda"
macaddr: ""
workers:
- name: "worker-0"
ipaddr: "192.168.7.16"
interface: "enp1s0"
install_drive: "vda"
macaddr: ""
- name: "worker-1"
ipaddr: "192.168.7.17"
interface: "enp1s0"
install_drive: "vda"
macaddr: ""
others:
- name: "registry"
ipaddr: "192.168.7.1"
macaddr: "52:54:00:7e:f8:f7"
- name: "yum"
ipaddr: "192.168.7.1"
macaddr: "52:54:00:7e:f8:f7"
- name: "quay"
ipaddr: "192.168.7.1"
macaddr: "52:54:00:7e:f8:f7"
- name: "nexus"
ipaddr: "192.168.7.1"
macaddr: "52:54:00:7e:f8:f7"
- name: "git"
ipaddr: "192.168.7.1"
macaddr: "52:54:00:7e:f8:f7"
otherdomains:
- domain: "rhv.redhat.ren"
hosts:
- name: "manager"
ipaddr: "192.168.7.71"
- name: "rhv01"
ipaddr: "192.168.7.72"
- domain: "cmri-edge.redhat.ren"
hosts:
- name: "*"
ipaddr: "192.168.7.71"
- name: "*.apps"
ipaddr: "192.168.7.72"
force_ocp_download: false
remove_old_config_files: false
ocp_client: "file:///data/ocp4/4.6.16/openshift-client-linux-4.6.16.tar.gz"
ocp_installer: "file:///data/ocp4/4.6.16/openshift-install-linux-4.6.16.tar.gz"
ppc64le: false
arch: 'x86_64'
chronyconfig:
enabled: true
content:
- server: "192.168.7.1"
options: iburst
setup_registry:
deploy: false
registry_image: docker.io/library/registry:2
local_repo: "ocp4/openshift4"
product_repo: "openshift-release-dev"
release_name: "ocp-release"
release_tag: "4.6.1-x86_64"
registry_server: "registry.ocp4.redhat.ren:5443"
EOF
cd /data/ocp4/ocp4-upi-helpernode-master
ansible-playbook -e @vars-static.rhel8.yaml -e '{staticips: true}' tasks/main.yml
# try this:
/usr/local/bin/helpernodecheck
mkdir -p /data/install
# GOTO image registry host
# copy crt files to helper node
scp /etc/crts/redhat.ren.ca.crt root@192.168.7.11:/data/install/
scp /etc/crts/redhat.ren.crt root@192.168.7.11:/data/install/
scp /etc/crts/redhat.ren.key root@192.168.7.11:/data/install/
# Go back to the helper node
/bin/cp -f /data/install/redhat.ren.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
# customize the ignition files
cd /data/install
# adjust install-config.yaml to match the on-site environment
# at minimum change the ssh key and additionalTrustBundle (the CA certificate of the image registry)
# vi install-config.yaml
cat << EOF > /data/install/install-config.yaml
apiVersion: v1
baseDomain: redhat.ren
compute:
- hyperthreading: Enabled
name: worker
replicas: 0
controlPlane:
hyperthreading: Enabled
name: master
replicas: 3
metadata:
name: ocp4
networking:
clusterNetworks:
- cidr: 10.254.0.0/16
hostPrefix: 24
networkType: Cilium
serviceNetwork:
- 172.30.0.0/16
platform:
none: {}
pullSecret: '$( cat /data/pull-secret.json )'
sshKey: |
$( cat /root/.ssh/helper_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /data/install/redhat.ren.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- registry.ocp4.redhat.ren:5443/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- registry.ocp4.redhat.ren:5443/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF
cd /data/install/
/bin/rm -rf *.ign .openshift_install_state.json auth bootstrap manifests master*[0-9] worker*[0-9]
openshift-install create manifests --dir "./"
cat << EOF > "/data/install/manifests/cluster-network-03-cilium-namespace.yaml"
apiVersion: v1
kind: Namespace
metadata:
name: cilium
annotations:
# node selector is required to make cilium-operator run on control plane nodes
openshift.io/node-selector: ""
labels:
name: cilium
# run level sets priority for Cilium to be deployed prior to other components
openshift.io/run-level: "0"
# enable cluster logging for Cilium namespace
openshift.io/cluster-logging: "true"
# enable cluster monitoring for Cilium namespace
openshift.io/cluster-monitoring: "true"
EOF
On a host with Internet access
cp /data/ocp4/clients/helm-linux-amd64 /usr/local/bin/helm
chmod +x /usr/local/bin/helm
mkdir -p /data/cilium
cd /data/cilium
helm repo add cilium https://helm.cilium.io/
helm template cilium/cilium --version 1.9.4 \
--namespace cilium \
--set ipam.mode=cluster-pool \
--set cni.binPath=/var/lib/cni/bin \
--set cni.confPath=/var/run/multus/cni/net.d \
--set ipam.operator.clusterPoolIPv4PodCIDR=10.254.0.0/16 \
--set ipam.operator.clusterPoolIPv4MaskSize=24 \
--set nativeRoutingCIDR=10.254.0.0/16 \
--set bpf.masquerade=false \
--set endpointRoutes.enabled=true \
--set hubble.enabled=true \
--set hubble.listenAddress=":4244" \
--set hubble.relay.enabled=true \
--set hubble.ui.enabled=true \
--output-dir "/data/cilium/"
Back on the helper
# upload /data/cilium/cilium/templates/ to /data/install/cilium/templates
cd /data/install
for resource in cilium/templates/*
do cp "${resource}" "./manifests/cluster-network-04-cilium-$(basename ${resource})"
done
# our environment has nexus, so apply the registry proxy patch
cd /data/ocp4
bash image.registries.conf.sh nexus.ocp4.redhat.ren:8083
mkdir -p /etc/containers/registries.conf.d
/bin/cp -f image.registries.conf /etc/containers/registries.conf.d/
cd /data/install
cp /data/ocp4/99-worker-container-registries.yaml ./manifests/
cp /data/ocp4/99-master-container-registries.yaml ./manifests/
cp /data/ocp4/ocp4-upi-helpernode-master/machineconfig/* ./manifests/
openshift-install create ignition-configs --dir=/data/install
cd /data/ocp4/ocp4-upi-helpernode-master
# generate each host's own copy of the ignition file and place it in the web server directory
ansible-playbook -e @vars-static.rhel8.yaml -e '{staticips: true}' tasks/ign.yml
# if a host needs its own customized ignition, modify it at this step
# the steps below were originally meant to set the NIC address, but in practice that turned out to be unnecessary
# they are kept here because injecting files at install time can be very useful
# mkdir -p bootstrap/etc/sysconfig/network-scripts/
# cat <<EOF > bootstrap/etc/sysconfig/network-scripts/ifcfg-ens3
# DEVICE=ens3
# BOOTPROTO=none
# ONBOOT=yes
# IPADDR=192.168.7.12
# NETMASK=255.255.255.0
# GATEWAY=192.168.7.1
# DNS=192.168.7.11
# DNS1=192.168.7.11
# DNS2=192.168.7.1
# DOMAIN=redhat.ren
# PREFIX=24
# DEFROUTE=yes
# IPV6INIT=no
# EOF
# filetranspiler -i bootstrap.ign -f bootstrap -o bootstrap-static.ign
# /bin/cp -f bootstrap-static.ign /var/www/html/ignition/
# create a dedicated iso for each node
cd /data/ocp4/ocp4-upi-helpernode-master
ansible-playbook -e @vars-static.rhel8.yaml -e '{staticips: true}' tasks/iso.yml
Back on the KVM host
In principle we could start the installation now, but installing coreos by hand means typing a very long kernel command line; in practice it is nearly impossible to get it right, and a single wrong character means rebooting and typing everything again.
To avoid that tedium, following approaches found online, we build a customized iso for each host. Fortunately the previous ansible step already created those isos; we just copy them to the KVM host and continue.
One pitfall: we do not know a host's NIC name in advance. Boot the plain coreos iso once, drop into single-user mode, and run ip a to find it; it is usually ens3.
Likewise, when installing on physical machines, use the same method to find the target disk. It is also recommended to install RHEL 8 on the physical box first to confirm it supports coreos. If the installer does not write to disk on a physical machine, try adding the boot parameter ignition.firstboot=1.
# on kvm host
export KVM_DIRECTORY=/data/kvm
cd ${KVM_DIRECTORY}
scp root@192.168.7.11:/data/install/*.iso ${KVM_DIRECTORY}/
remove_lv() {
var_vg=$1
var_lv=$2
lvremove -f $var_vg/$var_lv
}
create_lv() {
var_vg=$1
var_lv=$2
lvcreate -y -L 120G -n $var_lv $var_vg
wipefs --all --force /dev/$var_vg/$var_lv
}
remove_lv rhel bootstraplv
remove_lv nvme master0lv
remove_lv nvme master1lv
remove_lv nvme master2lv
remove_lv rhel worker0lv
remove_lv rhel worker1lv
create_lv rhel bootstraplv
create_lv nvme master0lv
create_lv nvme master1lv
create_lv nvme master2lv
create_lv rhel worker0lv
create_lv rhel worker1lv
# finally, we can start install :)
# you can create all the VMs in one go, then grab a coffee and wait
# from this step until the installation completes takes roughly 30 minutes
virt-install --name=ocp4-bootstrap --vcpus=4 --ram=8192 \
--cpu=host-model \
--disk path=/dev/rhel/bootstraplv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-bootstrap.iso
# want to log into coreos and poke around? do this:
# ssh core@bootstrap
# journalctl -b -f -u bootkube.service
virt-install --name=ocp4-master0 --vcpus=6 --ram=36864 \
--cpu=host-model \
--disk path=/dev/nvme/master0lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-master-0.iso
# ssh core@192.168.7.13
virt-install --name=ocp4-master1 --vcpus=6 --ram=36864 \
--cpu=host-model \
--disk path=/dev/nvme/master1lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-master-1.iso
virt-install --name=ocp4-master2 --vcpus=6 --ram=36864 \
--cpu=host-model \
--disk path=/dev/nvme/master2lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-master-2.iso
# we add gpu passthrough into kvm
# look ./4.6.gpu.passthrough.md to find how to
lspci -n | grep 10de:1eb8
# 05:00.0 0302: 10de:1eb8 (rev a1)
virsh nodedev-list | grep pci | grep 05_00_0
# pci_0000_05_00_0
# https://docs.nvidia.com/grid/11.0/grid-vgpu-user-guide/index.html#using-gpu-pass-through
virsh nodedev-dumpxml pci_0000_05_00_0| egrep 'domain|bus|slot|function'
# <domain>0</domain>
# <bus>5</bus>
# <slot>0</slot>
# <function>0</function>
# <capability type='virt_functions' maxCount='16'/>
# <address domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
# if it is gpu passthrough
virt-install --name=ocp4-worker0 --vcpus=6 --ram=36864 \
--cpu=host-model \
--disk path=/dev/rhel/worker0lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--host-device=pci_0000_05_00_0 \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-worker-0.iso
# if it is vgpu
virt-install --name=ocp4-worker0 --vcpus=6 --ram=36864 \
--cpu=host-model \
--disk path=/dev/rhel/worker0lv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-worker-0.iso
# on workstation
# open http://192.168.7.11:9000/
# to check
# if you want to stop or delete vm, try this
virsh list --all
virsh destroy ocp4-bootstrap
virsh undefine ocp4-bootstrap
virsh destroy ocp4-master0
virsh destroy ocp4-master1
virsh destroy ocp4-master2
virsh destroy ocp4-worker0
virsh destroy ocp4-worker1
virsh undefine ocp4-master0
virsh undefine ocp4-master1
virsh undefine ocp4-master2
virsh undefine ocp4-worker0
virsh undefine ocp4-worker1
On the helper node
At this point the installation is already running on its own; all we need to do is go back to the helper node and watch.
During the bootstrap and master phases, use this command to follow the progress.
cd /data/ocp4
export KUBECONFIG=/data/install/auth/kubeconfig
echo "export KUBECONFIG=/data/install/auth/kubeconfig" >> ~/.bashrc
oc completion bash | sudo tee /etc/bash_completion.d/openshift > /dev/null
cd /data/install
openshift-install wait-for bootstrap-complete --log-level debug
If everything is fine, you will see the bootstrap-complete message.
Sometimes the generated certificates have already expired. To verify, log into the bootstrap node and check the expiry time. If they are indeed expired, clear all of the caches openshift-install keeps for its generated config files and start over.
echo | openssl s_client -connect localhost:6443 | openssl x509 -noout -text | grep Not
Generally, if you deleted the cache files before the openshift-install step as the docs describe, the expiry problem does not appear. The usual cache locations are sketched below.
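A hedged example of what "clearing the cache" means here; these are the usual default locations and are assumptions, adjust for your install directory:
# remove the installer's saved state in the asset directory, and its local cache if present
# rm -f /data/install/.openshift_install_state.json /data/install/.openshift_install.log
# rm -rf ~/.cache/openshift-installer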
oc get nodes
At this point only the masters show up, because the workers' CSRs have not been approved yet. If you created all the VMs in one go, you will most likely not hit the issue below.
oc get csr
You will see quite a few pending ones.
Approve them:
yum -y install jq
oc get csr | grep -v Approved
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
# oc get csr -o name | xargs oc adm certificate approve
The worker nodes' CPU will then spike, and after that the workers appear.
Wait a while; once all nodes show Ready, you are good.
With the steps above done, finish the installation:
openshift-install wait-for install-complete --log-level debug
# here is the output
# INFO Install complete!
# INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.ocp4.redhat.ren
# INFO Login to the console with user: "kubeadmin", and password: "ngc8Z-hWogN-HcVYJ-UGXcs"
# the cilium pods do not seem to come up cleanly on first start, so restart them all once
oc -n cilium delete pod --all
Test cilium
# wget https://raw.githubusercontent.com/cilium/cilium/1.9.4/examples/kubernetes/connectivity-check/connectivity-check.yaml
kubectl apply -f - << EOF
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
name: cilium-test
allowHostPorts: true
allowHostNetwork: true
users:
- system:serviceaccount:cilium-test:default
priority: null
readOnlyRootFilesystem: false
runAsUser:
type: MustRunAsRange
seLinuxContext:
type: MustRunAs
volumes: null
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostPID: false
allowPrivilegeEscalation: false
allowPrivilegedContainer: false
allowedCapabilities: null
defaultAddCapabilities: null
requiredDropCapabilities: null
groups: null
EOF
cd /data/install
kubectl create ns cilium-test
kubectl apply -n cilium-test -f /data/install/cilium/connectivity-check.yaml
kubectl get pods -n cilium-test
# the result see below pic
# restore
kubectl delete ns cilium-test
kubectl delete scc cilium-test
Because this is an offline environment, the last two tests, which reach out to the Internet, are bound to fail; everything else passes, so the result is fine. A quick way to confirm is sketched below.
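A hedged check; the deployment names come from the upstream connectivity-check manifest and may differ between versions:
# in a disconnected cluster only the external-access checks should be unhealthy
kubectl -n cilium-test get pods | grep -Ev 'Running|Completed'
# kubectl -n cilium-test get pods | grep pod-to-external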
Next we install hubble, which is more or less a front end for cilium.
oc apply -f /data/install/cilium/templates/
kubectl -n cilium get pods
# restore
# oc delete -f /data/install/cilium/templates/
Cilium still seems to have some stability issues: a few pods need to be restarted before hubble installs cleanly. A hedged way to find and bounce them is shown below.
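A hedged helper; the label selectors are the upstream chart defaults and may vary by version:
# list cilium pods that are not healthy, then delete them so they get recreated
oc -n cilium get pods | grep -Ev 'Running|Completed'
# oc -n cilium delete pod -l k8s-app=hubble-relay
# oc -n cilium delete pod -l k8s-app=hubble-ui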
Next, let's try out the slick UI.
open: http://hubble-ui-cilium.apps.ocp4.redhat.ren/
# kubectl port-forward -n kube-system svc/hubble-ui --address 0.0.0.0 --address :: 12000:80
# create a test application
# wget https://raw.githubusercontent.com/cilium/cilium/1.9.4/examples/minikube/http-sw-app.yaml
oc create -n default -f /data/install/cilium/http-sw-app.yaml
oc expose svc hubble-ui -n cilium
oc project default
kubectl exec xwing -- curl -s -XPOST deathstar.default.svc.cluster.local/v1/request-landing
kubectl exec tiefighter -- curl -s -XPOST deathstar.default.svc.cluster.local/v1/request-landing
# kubectl port-forward -n cilium svc/hubble-ui --address 0.0.0.0 --address :: 12000:80
# kubectl -n cilium get pods -l k8s-app=cilium
POD=$( oc -n cilium get pod -l k8s-app=cilium -o json | jq -r '.items[0].metadata | select( .name | contains("cilium") ) | .name' )
echo $POD
kubectl -n cilium exec $POD -- cilium endpoint list
# ENDPOINT POLICY (ingress) POLICY (egress) IDENTITY LABELS (source:key[=value])
# IPv6 IPv4 STATUS
# ENFORCEMENT ENFORCEMENT
# 292 Disabled Disabled 47012 k8s:app=network-metrics-daemon
# 10.254.2.215 ready
# k8s:component=network
# k8s:io.cilium.k8s.namespace.labels.name=openshift-multus
# k8s:io.cilium.k8s.namespace.labels.olm.operatorgroup.uid/2280aae5-4b08-41a3-a491-$
# fb50008331e
# k8s:io.cilium.k8s.namespace.labels.openshift.io/cluster-monitoring=true
# k8s:io.cilium.k8s.namespace.labels.openshift.io/run-level=0
# k8s:io.cilium.k8s.policy.cluster=default
# k8s:io.cilium.k8s.policy.serviceaccount=metrics-daemon-sa
# k8s:io.kubernetes.pod.namespace=openshift-multus
# k8s:openshift.io/component=network
# k8s:type=infra
kubectl -n cilium exec $POD -- cilium status
# KVStore: Ok Disabled
# Kubernetes: Ok 1.19 (v1.19.0+e49167a) [linux/amd64]
# Kubernetes APIs: ["cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "core/v1::Namespace", "core/v1::Node", "core/v1::Pods", "core/v1::Service", "discovery/v1beta1::EndpointSlice", "networking.k8s.io/v1::NetworkPolicy"]
# KubeProxyReplacement: Probe [enp1s0 (Direct Routing)]
# Cilium: Ok OK
# NodeMonitor: Listening for events on 6 CPUs with 64x4096 of shared memory
# Cilium health daemon: Ok
# IPAM: IPv4: 10/255 allocated from 10.254.2.0/24,
# BandwidthManager: Disabled
# Host Routing: Legacy
# Masquerading: IPTables
# Controller Status: 50/50 healthy
# Proxy Status: OK, ip 10.254.2.79, 0 redirects active on ports 10000-20000
# Hubble: Ok Current/Max Flows: 4096/4096 (100.00%), Flows/s: 71.25 Metrics: Disabled
# Cluster health: 3/3 reachable (2021-03-08T13:10:50Z)
kubectl -n cilium exec $POD -- cilium status --all-addresses
# KVStore: Ok Disabled
# Kubernetes: Ok 1.19 (v1.19.0+e49167a) [linux/amd64]
# Kubernetes APIs: ["cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "core/v1::Namespace", "core/v1::Node", "core/v1::Pods", "core/v1::Service", "discovery/v1beta1::EndpointSlice", "networking.k8s.io/v1::NetworkPolicy"]
# KubeProxyReplacement: Probe [enp1s0 (Direct Routing)]
# Cilium: Ok OK
# NodeMonitor: Listening for events on 6 CPUs with 64x4096 of shared memory
# Cilium health daemon: Ok
# IPAM: IPv4: 12/255 allocated from 10.254.2.0/24,
# Allocated addresses:
# 10.254.2.131 (openshift-multus/multus-admission-controller-c56l8 [restored])
# 10.254.2.132 (openshift-monitoring/alertmanager-main-0 [restored])
# 10.254.2.15 (openshift-monitoring/prometheus-k8s-0 [restored])
# 10.254.2.171 (cilium/hubble-relay-95df958c6-2jxwl)
# 10.254.2.192 (cilium/hubble-ui-5df5fb587d-6l75r)
# 10.254.2.195 (health)
# 10.254.2.215 (openshift-multus/network-metrics-daemon-vjkmr [restored])
# 10.254.2.245 (openshift-controller-manager/controller-manager-bj29h [restored])
# 10.254.2.35 (openshift-apiserver/apiserver-5c67746947-bd2wp [restored])
# 10.254.2.79 (router)
# 10.254.2.90 (openshift-dns/dns-default-hggxx [restored])
# 10.254.2.97 (openshift-oauth-apiserver/apiserver-549f94565d-kl6bb [restored])
# BandwidthManager: Disabled
# Host Routing: Legacy
# Masquerading: IPTables
# Controller Status: 66/66 healthy
# Proxy Status: OK, ip 10.254.2.79, 0 redirects active on ports 10000-20000
# Hubble: Ok Current/Max Flows: 4096/4096 (100.00%), Flows/s: 68.96 Metrics: Disabled
# Cluster health: 3/3 reachable (2021-03-08T14:47:09Z)
kubectl get cn master-0 -o yaml
# ...
# spec:
# addresses:
# - ip: 192.168.7.13
# type: InternalIP
# - ip: 10.254.0.184
# type: CiliumInternalIP
# azure: {}
# encryption: {}
# eni: {}
# health:
# ipv4: 10.254.0.167
# ipam:
# podCIDRs:
# - 10.254.0.0/24
# ...
# use bpftool to see what is loaded on the node
oc -n cilium exec $POD -- bpftool net list
# xdp:
# tc:
# enp1s0(2) clsact/ingress bpf_netdev_enp1s0.o:[from-netdev] id 1077
# enp1s0(2) clsact/egress bpf_netdev_enp1s0.o:[to-netdev] id 1083
# cilium_net(3) clsact/ingress bpf_host_cilium_net.o:[to-host] id 1070
# cilium_host(4) clsact/ingress bpf_host.o:[to-host] id 1032
# cilium_host(4) clsact/egress bpf_host.o:[from-host] id 1045
# cilium_vxlan(5) clsact/ingress bpf_overlay.o:[from-overlay] id 923
# cilium_vxlan(5) clsact/egress bpf_overlay.o:[to-overlay] id 928
# lxc2818365dcf52(9) clsact/ingress bpf_lxc.o:[from-container] id 984
# lxc2818365dcf52(9) clsact/egress bpf_lxc.o:[to-container] id 1020
# lxcc899b45a3e26(11) clsact/ingress bpf_lxc.o:[from-container] id 944
# lxcc899b45a3e26(11) clsact/egress bpf_lxc.o:[to-container] id 975
# lxcad116d75c844(15) clsact/ingress bpf_lxc.o:[from-container] id 950
# lxcad116d75c844(15) clsact/egress bpf_lxc.o:[to-container] id 980
# lxc2fda292e3d0b(19) clsact/ingress bpf_lxc.o:[from-container] id 939
# lxc2fda292e3d0b(19) clsact/egress bpf_lxc.o:[to-container] id 968
# lxc3c9c10774fe9(21) clsact/ingress bpf_lxc.o:[from-container] id 948
# lxc3c9c10774fe9(21) clsact/egress bpf_lxc.o:[to-container] id 986
# lxc17eb3d9acd1f(25) clsact/ingress bpf_lxc.o:[from-container] id 1009
# lxc17eb3d9acd1f(25) clsact/egress bpf_lxc.o:[to-container] id 1063
# lxccd70d8b98510(27) clsact/ingress bpf_lxc.o:[from-container] id 1005
# lxccd70d8b98510(27) clsact/egress bpf_lxc.o:[to-container] id 1043
# lxc90379056af7f(29) clsact/ingress bpf_lxc.o:[from-container] id 941
# lxc90379056af7f(29) clsact/egress bpf_lxc.o:[to-container] id 977
# lxc_health(51) clsact/ingress bpf_lxc.o:[from-container] id 1089
# lxc_health(51) clsact/egress bpf_lxc.o:[to-container] id 1095
# lxc9b377d665f9f(53) clsact/ingress bpf_lxc.o:[from-container] id 1007
# lxc9b377d665f9f(53) clsact/egress bpf_lxc.o:[to-container] id 1040
# lxc1e3c45bc89b2(55) clsact/ingress bpf_lxc.o:[from-container] id 1018
# lxc1e3c45bc89b2(55) clsact/egress bpf_lxc.o:[to-container] id 1050
# flow_dissector:
On the node itself you can also see that the bpf programs are loaded.
# ip addresses inside the pod
ip a
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
# link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
# inet 127.0.0.1/8 scope host lo
# valid_lft forever preferred_lft forever
# inet6 ::1/128 scope host
# valid_lft forever preferred_lft forever
# 18: eth0@if19: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
# link/ether f2:93:a4:bd:7b:38 brd ff:ff:ff:ff:ff:ff
# inet 10.254.1.174/32 scope global eth0
# valid_lft forever preferred_lft forever
# inet6 fe80::f093:a4ff:febd:7b38/64 scope link
# valid_lft forever preferred_lft forever
Image registry proxy
Preparing an offline image registry is very tedious. Fortunately we have an Internet-connected host, so we can build an image registry proxy with nexus, run the PoC once in the online environment, and then obtain the offline images through that proxy.
- https://mtijhof.wordpress.com/2018/07/23/using-nexus-oss-as-a-proxy-cache-for-docker-images/
#####################################################
# init build the nexus fs
/bin/cp -f nexus-image.tgz /data/ccn/
tar zxf nexus-image.tgz
chown -R 200 /data/ccn/nexus-image
# podman run -d -p 8082:8081 -p 8083:8083 -it --name nexus-image -v /data/ccn/nexus-image:/nexus-data:Z docker.io/sonatype/nexus3:3.29.0
podman run -d -p 8082:8081 -p 8083:8083 -p 8084:8084 -it --name nexus-image -v /data/ccn/nexus-image:/nexus-data:Z docker.io/wangzheng422/imgs:nexus3-3.29.0-wzh
podman stop nexus-image
podman rm nexus-image
# get the admin password
cat /data/ccn/nexus-image/admin.password && echo
# 84091bcd-c82f-44a3-8b7b-dfc90f5b7da1
# open http://nexus.ocp4.redhat.ren:8082
# how to cleanup
# https://github.com/wangzheng422/nexus-docker-cleanup
# enable https
# https://blog.csdn.net/s7799653/article/details/105378645
# https://help.sonatype.com/repomanager3/system-configuration/configuring-ssl#ConfiguringSSL-InboundSSL-ConfiguringtoServeContentviaHTTPS
mkdir -p /data/install/tmp
cd /data/install/tmp
# export the certificate in pkcs12 format
# you will be prompted for an export passphrase; enter password (it must match the storepass used below)
openssl pkcs12 -export -out keystore.pkcs12 -inkey /etc/crts/redhat.ren.key -in /etc/crts/redhat.ren.crt
cat << EOF >> Dockerfile
FROM docker.io/sonatype/nexus3:3.29.0
USER root
COPY keystore.pkcs12 /keystore.pkcs12
RUN keytool -v -importkeystore -srckeystore keystore.pkcs12 -srcstoretype PKCS12 -destkeystore keystore.jks -deststoretype JKS -storepass password -srcstorepass password &&\
cp keystore.jks /opt/sonatype/nexus/etc/ssl/
USER nexus
EOF
buildah bud --format=docker -t docker.io/wangzheng422/imgs:nexus3-3.29.0-wzh -f Dockerfile .
buildah push docker.io/wangzheng422/imgs:nexus3-3.29.0-wzh
######################################################
# go to helper, update proxy setting for ocp cluster
cd /data/ocp4
bash image.registries.conf.sh nexus.ocp4.redhat.ren:8083
mkdir -p /etc/containers/registries.conf.d
/bin/cp -f image.registries.conf /etc/containers/registries.conf.d/
cd /data/ocp4
oc apply -f ./99-worker-container-registries.yaml -n openshift-config
oc apply -f ./99-master-container-registries.yaml -n openshift-config
######################################################
# dump the nexus image fs out
podman stop nexus-image
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
cd /data/ccn
tar cf - ./nexus-image | pigz -c > nexus-image.tgz
buildah from --name onbuild-container scratch
buildah copy onbuild-container nexus-image.tgz /
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container docker.io/wangzheng422/nexus-fs:image-$var_date
# buildah rm onbuild-container
# rm -f nexus-image.tgz
buildah push docker.io/wangzheng422/nexus-fs:image-$var_date
echo "docker.io/wangzheng422/nexus-fs:image-$var_date"
# the version below can be used as the initial image proxy; it already contains the nfs provisioner and the sample operator metadata. A nice discovery: image streams do not pull the full images, they seem to fetch only the metadata, and the actual layers are downloaded when first used. A restore sketch follows.
# docker.io/wangzheng422/nexus-fs:image-2020-12-26-1118
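If you need to seed a new host from that snapshot image, a minimal sketch (the tag is the one noted above; the target path is an assumption):
# pull the snapshot image and copy the nexus data tarball back out of it
buildah from --name nexus-fs-restore docker.io/wangzheng422/nexus-fs:image-2020-12-26-1118
# buildah mount prints the container rootfs path; the tgz sits at its root
cp $(buildah mount nexus-fs-restore)/nexus-image.tgz /data/ccn/
buildah umount nexus-fs-restore && buildah rm nexus-fs-restore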
Configure the registry CA
The install already injected the registry CA, but image streams do not seem to trust it, so let's add it again.
oc project openshift-config
oc create configmap ca.for.registry -n openshift-config \
--from-file=registry.ocp4.redhat.ren..5443=/data/install/redhat.ren.ca.crt \
--from-file=nexus.ocp4.redhat.ren..8083=/data/install/redhat.ren.ca.crt
oc patch image.config.openshift.io/cluster -p '{"spec":{"additionalTrustedCA":{"name":"ca.for.registry"}}}' --type=merge
# oc patch image.config.openshift.io/cluster -p '{"spec":{"registrySources":{"insecureRegistries":["nexus.ocp4.redhat.ren:8083"]}}}' --type=merge
oc get image.config.openshift.io/cluster -o yaml
# re-import the image streams in the openshift project
oc get is -o json | jq -r '.items[].metadata.name' | xargs -L1 oc import-image --all
Our helper node provides nfs, so configure proper nfs-backed storage for the image registry instead of emptyDir.
bash /data/ocp4/ocp4-upi-helpernode-master/files/nfs-provisioner-setup.sh
# oc edit configs.imageregistry.operator.openshift.io
# edit the storage section
# storage:
# pvc:
# claim:
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Managed","storage":{"pvc":{"claim":""}}}}' --type=merge
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Removed"}}' --type=merge
oc get clusteroperator image-registry
oc get configs.imageregistry.operator.openshift.io cluster -o yaml
# disable the imagepruner
# https://bugzilla.redhat.com/show_bug.cgi?id=1852501#c24
# oc patch imagepruner.imageregistry/cluster --patch '{"spec":{"suspend":true}}' --type=merge
# oc -n openshift-image-registry delete jobs --all
oc get configs.samples.operator.openshift.io/cluster -o yaml
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Managed"}}' --type=merge
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Unmanaged"}}' --type=merge
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Removed"}}' --type=merge
Configure local DNS (resolve *.apps.ocp4.redhat.ren to 192.168.7.11) so it points at the haproxy on the helper node; then the web console is reachable from a browser. One way to do this is sketched below.
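A hedged example assuming dnsmasq runs on your workstation; any resolver with wildcard support works:
# wildcard *.apps.ocp4.redhat.ren -> helper haproxy
cat << EOF > /etc/dnsmasq.d/ocp4-apps.conf
address=/apps.ocp4.redhat.ren/192.168.7.11
EOF
systemctl restart dnsmasq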
chrony/NTP setup
OCP 4.6 requires NTP synchronization to be configured. The earlier ansible run already generated the ntp MachineConfig, so just apply it to the cluster.
oc apply -f /data/ocp4/ocp4-upi-helpernode-master/machineconfig/
Operator Hub offline installation
https://docs.openshift.com/container-platform/4.2/operators/olm-restricted-networks.html
https://github.com/operator-framework/operator-registry
https://www.cnblogs.com/ericnie/p/11777384.html?from=timeline&isappinstalled=0
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.2/html-single/images/index
Operator hub preparation has two layers. The first, described here, is building the offline operator hub resources and mirroring the operator images; with that done, a disconnected ocp4.2 cluster can see the operator hub and deploy operators. But when an operator in turn deploys its components, it pulls yet more images, and that second layer also has to be mirrored offline. Since every operator needs different images and there is no single place that lists them, each project site has to mirror them as needed; this project downloads as many of the required images as possible, but omissions currently cannot be ruled out. The helper below can reveal what a running cluster actually pulls.
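A small helper for spotting images that still need mirroring on site; it is the same jq pattern used later in this doc:
# list every image referenced by running pods; anything not served by the mirror registry still needs mirroring
oc get pods --all-namespaces -o json | jq -r '.items[].spec.containers[].image' | sort -u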
# on helper node
cd /data/ocp4
# scp /etc/crts/redhat.ren.crt 192.168.7.11:/root/ocp4/
# https://docs.openshift.com/container-platform/4.4/builds/setting-up-trusted-ca.html
oc project openshift-config
oc create configmap ca.for.registry -n openshift-config \
--from-file=registry.ocp4.redhat.ren..5443=/data/install/redhat.ren.crt
# to delete this config map, do:
# oc delete configmap ca.for.registry
oc patch image.config.openshift.io/cluster -p '{"spec":{"additionalTrustedCA":{"name":"ca.for.registry"}}}' --type=merge
# oc patch image.config.openshift.io/cluster -p '{"spec":{"registrySources":{"insecureRegistries":["registry.redhat.ren"]}}}' --type=merge
oc get image.config.openshift.io/cluster -o yaml
# the official docs call for the following step; in practice a disconnected environment does not need it
# oc patch OperatorHub cluster --type json -p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
# if you followed the official docs anyway, undo it like this
# oc patch OperatorHub cluster --type json -p '[{"op": "remove", "path": "/spec/disableAllDefaultSources"}]'
oc patch OperatorHub cluster --type json \
-p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
oc get OperatorHub cluster -o yaml
# yum -y install python36
# adjust the parameters for your site, then run the following to generate the config pointing at the internal mirror registry
cd /data/ocp4/
bash image.registries.conf.sh registry.ocp4.redhat.ren:5443
# due to the ocp 4.2 update mechanism, the following triggers a cluster update:
# nodes and cluster components restart one by one; wait for the cluster to settle.
oc apply -f ./99-worker-container-registries.yaml -n openshift-config
oc apply -f ./99-master-container-registries.yaml -n openshift-config
# !!! normally the following is NOT needed !!!
# deleting the mirror registry config also triggers a cluster update; wait for the restarts to finish
oc delete -f ./99-worker-container-registries.yaml -n openshift-config
oc delete -f ./99-master-container-registries.yaml -n openshift-config
watch oc get machineconfigpools
watch oc get node
From the monitoring console you can watch the nodes updating and rebooting.
oc get pods -n openshift-marketplace
oc get catalogsource -n openshift-marketplace
oc get packagemanifest -n openshift-marketplace
The operator list is now visible.
Deploying an operator also succeeds.
other tips
# disable cluster upgrade check, and insight check
oc scale --replicas 0 -n openshift-cluster-version deployments/cluster-version-operator
oc scale --replicas 0 -n openshift-insights deployments/insights-operator
# set master and worker combine
# https://github.com/openshift-telco/openshift4x-poc/blob/master/MASTER-WORKER-COMBINED.md
oc edit schedulers cluster
# apiVersion: config.openshift.io/v1
# kind: Scheduler
# metadata:
# name: cluster
# spec:
# mastersSchedulable: true
References
https://www.openshift.com/blog/delivering-a-three-node-architecture-for-edge-deployments
nvidia gpu for openshift 4.6 disconnected
Overview
This lab is part of the OpenShift edge GPU scenario and focuses on installing the nvidia gpu stack in a disconnected environment. For how to pass the GPU through to KVM to simulate an edge GPU host, see that document.
Walkthrough video:
Architecture diagram of this lab:
Build a rhel8 repo / install source
The nvidia gpu operator downloads packages online to compile the driver, so in a disconnected scenario we first need to prepare a rhel8 repo.
export PROXY="127.0.0.1:18801"
subscription-manager --proxy=$PROXY release --list
subscription-manager --proxy=$PROXY release --set=8
subscription-manager --proxy=$PROXY repos --disable="*"
subscription-manager --proxy=$PROXY repos \
--enable="rhel-8-for-x86_64-baseos-rpms" \
--enable="rhel-8-for-x86_64-baseos-source-rpms" \
--enable="rhel-8-for-x86_64-appstream-rpms" \
--enable="rhel-8-for-x86_64-supplementary-rpms" \
--enable="codeready-builder-for-rhel-8-x86_64-rpms" \
--enable="rhocp-4.6-for-rhel-8-x86_64-rpms" \
--enable="rhel-8-for-x86_64-baseos-eus-rpms" \
# endline
mkdir -p /data/dnf/gaps
cd /data/dnf/gaps
# subscription-manager --proxy=$PROXY release --set=8.2
# subscription-manager --proxy=$PROXY release --set=8
# dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
dnf copr enable frostyx/modulemd-tools
dnf install -y modulemd-tools
# dnf install -y https://kojipkgs.fedoraproject.org//packages/modulemd-tools/0.9/1.fc32/noarch/modulemd-tools-0.9-1.fc32.noarch.rpm
# note: to know which packages are needed, deploy the gpu operator once and read the driver pod log to see what it installs, then substitute those packages here; different gpu operator versions require different packages, so this list varies.
/bin/rm -rf /data/dnf/gaps/*
# dnf download --resolve --releasever=8.2 --alldeps \
# --repo rhel-8-for-x86_64-baseos-eus-rpms,rhel-8-for-x86_64-baseos-rpms,rhel-8-for-x86_64-appstream-rpms,ubi-8-baseos,ubi-8-appstream \
# kernel-headers.x86_64 kernel-devel.x86_64 kernel-core.x86_64 systemd-udev.x86_64 elfutils-libelf.x86_64 elfutils-libelf-devel.x86_64 \
# kernel-headers-4.18.0-193.40.1.el8_2.x86_64 kernel-devel-4.18.0-193.40.1.el8_2.x86_64 kernel-core-4.18.0-193.40.1.el8_2.x86_64 systemd-udev-239-31.el8_2.2.x86_64 kernel-headers-4.18.0-193.41.1.el8_2.x86_64 kernel-devel-4.18.0-193.41.1.el8_2.x86_64 \
# elfutils-libelf-0.180-1.el8.x86_64
subscription-manager --proxy=$PROXY release --set=8.2
dnf download --resolve --releasever=8.2 --alldeps \
--repo rhel-8-for-x86_64-baseos-eus-rpms,rhel-8-for-x86_64-baseos-rpms,rhel-8-for-x86_64-appstream-rpms,ubi-8-baseos,ubi-8-appstream \
kernel-headers.x86_64 kernel-devel.x86_64 kernel-core.x86_64 systemd-udev.x86_64 elfutils-libelf.x86_64 elfutils-libelf-devel.x86_64 \
kernel-headers-4.18.0-193.41.1.el8_2.x86_64 kernel-devel-4.18.0-193.41.1.el8_2.x86_64 kernel-core-4.18.0-193.41.1.el8_2.x86_64
subscription-manager --proxy=$PROXY release --set=8
dnf download --resolve --alldeps \
--repo rhel-8-for-x86_64-baseos-rpms,rhel-8-for-x86_64-appstream-rpms,ubi-8-baseos,ubi-8-appstream \
elfutils-libelf.x86_64 elfutils-libelf-devel.x86_64
# https://access.redhat.com/solutions/4907601
createrepo ./
repo2module . \
--module-name foo \
--module-stream devel \
--module-version 123 \
--module-context f32
createrepo_mod .
Now the local /data/dnf/gaps/ directory is the repo. Expose it with an ftp service; for the details, see the reference here. A minimal sketch follows.
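A minimal sketch, assuming vsftpd with anonymous access serves /var/ftp on the helper (this matches the ftp://${YUMIP}/dnf/gaps baseurl used below; firewall rules are left out):
dnf install -y vsftpd
sed -i 's/^anonymous_enable=NO/anonymous_enable=YES/' /etc/vsftpd/vsftpd.conf
mkdir -p /var/ftp/dnf/gaps
mount --bind /data/dnf/gaps /var/ftp/dnf/gaps
systemctl enable --now vsftpd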
Customize the nvidia driver image
By default the nvidia gpu driver pod downloads packages from the Internet, which also involves subscriptions; that is painful and simply does not work offline.
We just built an offline repo, so we customize the driver image to use that repo directly.
Official driver image download: https://ngc.nvidia.com/catalog/containers/nvidia:driver/tags
mkdir -p /data/install/
cd /data/install
# /bin/rm -rf /etc/yum.repos.d/*
export YUMIP="192.168.7.1"
cat << EOF > ./remote.repo
[gaps]
name=gaps
baseurl=ftp://${YUMIP}/dnf/gaps
enabled=1
gpgcheck=0
EOF
oc create configmap repo-config -n gpu-operator-resources --from-file=./remote.repo
You can use the operator UI to create the ClusterPolicy; remember to adjust the driver repo config: repo-config -> /etc/yum.repos.d
Customize the driver image
If we have special requirements for the driver image, customize it like this.
here is a reference: https://github.com/dmc5179/nvidia-driver
Some people bake the rpm packages directly into the driver image, which is also a good approach; a hedged sketch of that idea follows.
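A hedged sketch of that "bake the rpms in" idea; the rpms/ directory and package set are assumptions, take them from whatever the driver pod log asks for:
cat << EOF > Dockerfile.rpm
FROM nvcr.io/nvidia/driver:450.80.02-rhcos4.6
# copy the kernel-devel/kernel-headers/elfutils rpms downloaded earlier into the image
COPY rpms/ /tmp/rpms/
RUN dnf install -y /tmp/rpms/*.rpm && rm -rf /tmp/rpms
EOF
# buildah bud --format=docker -t docker.io/wangzheng422/imgs:nvidia-gpu-driver-rpm-baked -f Dockerfile.rpm .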
# driver image
# nvidia-driver-daemonset
podman pull nvcr.io/nvidia/driver:450.80.02-rhcos4.6
# you can test the driver image, like this:
# podman run --rm -it --entrypoint='/bin/bash' nvcr.io/nvidia/driver:450.80.02-rhcos4.6
podman run --rm -it --entrypoint='/bin/bash' nvcr.io/nvidia/driver:460.32.03-rhcos4.6
mkdir -p /data/gpu/
cd /data/gpu
export YUMIP="192.168.7.1"
cat << EOF > /data/gpu/remote.repo
[gaps]
name=gaps
baseurl=ftp://${YUMIP}/dnf/gaps
enabled=1
gpgcheck=0
EOF
cat << EOF > /data/gpu/Dockerfile
FROM nvcr.io/nvidia/driver:450.80.02-rhcos4.6
RUN /bin/rm -rf /etc/yum.repos.d/*
COPY remote.repo /etc/yum.repos.d/remote.repo
EOF
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
buildah bud --format=docker -t docker.io/wangzheng422/imgs:nvidia-gpu-driver-$var_date-rhcos4.6 -f Dockerfile .
# podman run --rm -it --entrypoint='/bin/bash' docker.io/wangzheng422/imgs:nvidia-gpu-driver-2021-01-21-0942
buildah push docker.io/wangzheng422/imgs:nvidia-gpu-driver-$var_date-rhcos4.6
echo "docker.io/wangzheng422/imgs:nvidia-gpu-driver-$var_date-rhcos4.6"
# docker.io/wangzheng422/imgs:nvidia-gpu-driver-2021-02-05-1131-rhcos4.6
When using the tag of the final image, remember to drop the trailing rhcos4.6, because the operator UI appends it automatically.
Install the gpu operator offline
Follow nvidia's official installation docs.
First, install node feature discovery (nfd). Older nfd releases could only scan dedicated worker nodes, so the 3-node edge (compact) topology was not supported back then; later releases fixed that limitation and 3-node edge clusters work too.
Then create an nfd instance: just click create without changing anything, and note the namespace is openshift-operator.
Next, create the gpu-operator-resources namespace, then install the nvidia gpu operator.
In the gpu operator, create a ClusterPolicy and adjust its parameters. The example here uses the customized driver image; if you did not customize it, leave that parameter alone.
Finally, the detected GPU shows up in the node labels, as checked below.
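A hedged way to check the labels; the label keys come from gpu-feature-discovery and may vary by version:
# list nodes that were labeled as having a GPU, then inspect the nvidia labels on worker-0
oc get node -l nvidia.com/gpu.present=true
oc describe node worker-0 | grep -i 'nvidia.com' | head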
Test it
# first try what the official docs suggest
oc project gpu-operator-resources
POD_NAME=$(oc get pods -o json | jq -r '.items[] | select( .metadata.name | contains("nvidia-driver-daemonset") ) | .metadata.name' | head )
oc exec -it $POD_NAME -- nvidia-smi
# Thu Jan 21 04:12:36 2021
# +-----------------------------------------------------------------------------+
# | NVIDIA-SMI 450.80.02 Driver Version: 450.80.02 CUDA Version: 11.0 |
# |-------------------------------+----------------------+----------------------+
# | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
# | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
# | | | MIG M. |
# |===============================+======================+======================|
# | 0 Tesla T4 On | 00000000:05:00.0 Off | Off |
# | N/A 27C P8 14W / 70W | 0MiB / 16127MiB | 0% Default |
# | | | N/A |
# +-------------------------------+----------------------+----------------------+
# +-----------------------------------------------------------------------------+
# | Processes: |
# | GPU GI CI PID Type Process name GPU Memory |
# | ID ID Usage |
# |=============================================================================|
# | No running processes found |
# +-----------------------------------------------------------------------------+
# now try launching an application
# https://nvidia.github.io/gpu-operator/
# https://ngc.nvidia.com/catalog/containers/nvidia:tensorrt
# following that official doc, build a test image
# goto helper
cd /data/ocp4
cat << EOF > /data/ocp4/gpu.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo1
labels:
app: demo1
spec:
replicas: 1
selector:
matchLabels:
app: demo1
template:
metadata:
labels:
app: demo1
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0'
restartPolicy: Always
containers:
- name: demo1
image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda10.2-ubi8"
EOF
oc project demo
oc apply -f gpu.yaml
# [Vector addition of 50000 elements]
# Copy input data from the host memory to the CUDA device
# CUDA kernel launch with 196 blocks of 256 threads
# Copy output data from the CUDA device to the host memory
# Test PASSED
# Done
oc delete -f gpu.yaml
# on build host
# https://ngc.nvidia.com/catalog/containers/nvidia:tensorrt
# podman run -it nvcr.io/nvidia/tensorrt:20.12-py3
mkdir -p /data/gpu
cd /data/gpu
cat << EOF > /data/gpu/Dockerfile
FROM docker.io/wangzheng422/imgs:tensorrt-ljj
CMD tail -f /dev/null
EOF
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
buildah bud --format=docker -t docker.io/wangzheng422/imgs:tensorrt-ljj-$var_date -f Dockerfile .
buildah push docker.io/wangzheng422/imgs:tensorrt-ljj-$var_date
echo "docker.io/wangzheng422/imgs:tensorrt-ljj-$var_date"
# docker.io/wangzheng422/imgs:tensorrt-ljj-2021-01-21-1151
# go back to helper node
cat << EOF > /data/ocp4/gpu.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo1
labels:
app: demo1
spec:
replicas: 1
selector:
matchLabels:
app: demo1
template:
metadata:
labels:
app: demo1
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0'
restartPolicy: Always
containers:
- name: demo1
image: docker.io/wangzheng422/imgs:tensorrt-ljj-2021-01-21-1151
EOF
oc project demo
oc apply -f gpu.yaml
# oc rsh into the pod, run sample program using gpu
# cd tensorrt/bin/
# ./sample_mnist
# you will see this correct result
# &&&& PASSED TensorRT.sample_mnist # ./sample_mnist
oc delete -f gpu.yaml
tips
- if nfd fails to detect the gpu model, just reboot the node
- if gpu feature discovery misbehaves, just reboot the node
cat /proc/driver/nvidia/version
reference
https://www.openshift.com/blog/simplifying-deployments-of-accelerated-ai-workloads-on-red-hat-openshift-with-nvidia-gpu-operator
https://www.openshift.com/blog/how-to-use-entitled-image-builds-to-build-drivercontainers-with-ubi-on-openshift
https://access.redhat.com/solutions/5232901
https://docs.nvidia.com/datacenter/kubernetes/openshift-on-gpu-install-guide/
https://access.redhat.com/solutions/4907601
https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html
Dead ends below (kept for reference)
# add ubi support
cat << EOF > /etc/yum.repos.d/ubi.repo
[ubi-8-baseos]
name=ubi-8-baseos
baseurl=https://cdn-ubi.redhat.com/content/public/ubi/dist/ubi8/8/x86_64/baseos/os
enabled=1
gpgcheck=1
[ubi-8-appstream]
name=ubi-8-appstream
baseurl=https://cdn-ubi.redhat.com/content/public/ubi/dist/ubi8/8/x86_64/appstream/os
enabled=1
gpgcheck=1
[ubi-8-codeready-builder]
name=ubi-8-codeready-builder
baseurl=https://cdn-ubi.redhat.com/content/public/ubi/dist/ubi8/8/x86_64/codeready-builder/os/
enabled=1
gpgcheck=1
EOF
cd /data/dnf
dnf reposync -m --download-metadata --delete -n
cat << EOF > /data/ocp4/gpu.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo1
labels:
app: demo1
spec:
replicas: 1
selector:
matchLabels:
app: demo1
template:
metadata:
labels:
app: demo1
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0'
restartPolicy: Always
containers:
- name: demo1
image: nvidia/cuda:11.1.1-devel-centos8
EOF
oc project demo
oc apply -f gpu.yaml
oc delete -f gpu.yaml
cat << EOF > /data/gpu/Dockerfile
FROM nvcr.io/nvidia/tensorrt:20.12-py3
RUN /opt/tensorrt/python/python_setup.sh
RUN /opt/tensorrt/install_opensource.sh
RUN /opt/tensorrt/install_opensource.sh -b master
# RUN cd /workspace/tensorrt/samples && make -j4
CMD tail -f /dev/null
EOF
This section describes how to supplement missing offline images at a project site.
Thanks to william shen and kevin lin for their help and reminders, which greatly simplified the image-supplement process for ocp4.3.
The rough workflow is:
- edit the add.image.list file and add the images you want to supplement; lines starting with # are comments, and make sure the file ends with a few blank lines.
- on the Internet-connected host, run add.image.sh; it downloads the images into the given directory, then compress the directory into a tgz yourself.
- on the internal helper node, upload the tgz and unpack it.
- on the internal helper node, cd /data/ocp4 and run add.image.load.sh to load the images.
# on the Internet-connected cloud host
# on vultr
# edit add.image.list
export MIRROR_DIR='/data/redhat-operator'
/bin/rm -rf ${MIRROR_DIR}
cd /data/ocp4
bash add.image.sh add.image.list ${MIRROR_DIR}
# bash add.image.sh is.openshift.list
# on the internal helper node
# scp back /data/mirror_dir.tgz to /data/ocp4
bash add.image.load.sh /data/mirror_dir 'registry.redhat.ren:5443'
# bash add.image.load.sh /data/remote/4.3.3/is.samples/mirror_dir
openshift: supplementing the sample images
The openshift project in an openshift cluster ships many built-in image streams, and they point at public registries. In a disconnected environment, how do we import those images and update the image stream definitions?
# import the images
# unpack is.samples.tgz into /data
pigz -dc is.samples.tgz | tar xf -
# adjust add.image.load.sh for the site environment and run it
bash add.image.load.sh /data/is.samples/mirror_dir/
# fix up the image stream definitions
# adjust is.patch.sh for the site environment
bash is.patch.sh
openshift 4.3 calico offline deployment
https://docs.projectcalico.org/getting-started/openshift/requirements
image prepare
cd /data/ocp4
cat << EOF > add.image.list
quay.io/tigera/operator-init:v1.3.3
quay.io/tigera/operator:v1.3.3
docker.io/calico/ctl:v3.13.2
docker.io/calico/kube-controllers:v3.13.2
docker.io/calico/node:v3.13.2
docker.io/calico/typha:v3.13.2
docker.io/calico/pod2daemon-flexvol:v3.13.2
docker.io/calico/cni:v3.13.2
EOF
bash add.image.sh add.image.list
bash add.image.load.sh /data/down/mirror_dir
install
# scp install-config.yaml into /root/ocp4
# sed -i 's/OpenShiftSDN/Calico/' install-config.yaml
openshift-install create manifests --dir=/root/ocp4
# scp calico/manifests to manifests
openshift-install create ignition-configs --dir=/root/ocp4
# follow 4.3.disconnect.operator.md to install
oc get tigerastatus
oc get pod -n tigera-operator
oc get pod -n calico-system
# see which images are in use
oc project tigera-operator
oc get pod -o json | jq -r '.items[].spec.containers[].image' | sort | uniq
# quay.io/tigera/operator-init:v1.3.3
# quay.io/tigera/operator:v1.3.3
oc project calico-system
oc get pod -o json | jq -r '.items[].spec.containers[].image' | sort | uniq
# calico/ctl:v3.13.2
# docker.io/calico/kube-controllers:v3.13.2
# docker.io/calico/node:v3.13.2
# docker.io/calico/typha:v3.13.2
# docker.io/calico/pod2daemon-flexvol:v3.13.2
# docker.io/calico/cni:v3.13.2
# install the calicoctl control CLI
oc apply -f calicoctl.yaml
oc exec calicoctl -n calico-system -it -- /calicoctl get node -o wide
oc exec calicoctl -n calico-system -it -- /calicoctl ipam show --show-blocks
oc exec calicoctl -n calico-system -it -- /calicoctl get ipPool -o wide
calico: create pods in a specified ip pool
Video walkthrough
- https://youtu.be/GJSFF7DDCe8
- https://www.bilibili.com/video/BV14Z4y1p7wa/
https://www.tigera.io/blog/calico-ipam-explained-and-enhanced/
# create the ip pools
cat << EOF > calico.ip.pool.yaml
---
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
name: ip-pool-1
spec:
cidr: 172.110.110.0/24
ipipMode: Always
natOutgoing: true
---
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
name: ip-pool-2
spec:
cidr: 172.110.220.0/24
ipipMode: Always
natOutgoing: true
EOF
cat calico.ip.pool.yaml | oc exec calicoctl -n calico-system -i -- /calicoctl apply -f -
# check that the ip pools were created
oc exec calicoctl -n calico-system -it -- /calicoctl get ipPool -o wide
cat << EOF > calico.pod.yaml
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod1
namespace: demo
annotations:
cni.projectcalico.org/ipv4pools: '["ip-pool-1"]'
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod2
namespace: demo
annotations:
cni.projectcalico.org/ipv4pools: '["ip-pool-1"]'
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod3
namespace: demo
annotations:
cni.projectcalico.org/ipv4pools: '["ip-pool-2"]'
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod4
namespace: demo
annotations:
cni.projectcalico.org/ipv4pools: '["ip-pool-1"]'
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-1.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod5
namespace: demo
annotations:
cni.projectcalico.org/ipv4pools: '["ip-pool-2"]'
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-1.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
EOF
oc apply -f calico.pod.yaml
# check the pod IP assignments: they come from the ranges we specified
oc get pod -o wide -n demo
# [root@helper ocp4]# oc get pod -o wide -n demo
# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
# demo-pod1 1/1 Running 0 8m52s 172.110.110.67 worker-0.ocp4.redhat.ren <none> <none>
# demo-pod2 1/1 Running 0 8m52s 172.110.110.68 worker-0.ocp4.redhat.ren <none> <none>
# demo-pod3 1/1 Running 0 8m52s 172.110.220.64 worker-0.ocp4.redhat.ren <none> <none>
# demo-pod4 1/1 Running 0 8m52s 172.110.110.128 worker-1.ocp4.redhat.ren <none> <none>
# demo-pod5 1/1 Running 0 8m52s 172.110.220.130 worker-1.ocp4.redhat.ren <none> <none>
# get the ip addresses of every pod except demo-pod1
oc get pod -o json | jq -r '.items[] | select(.metadata.name != "demo-pod1") | .status.podIP'
# ping those pod ip addresses from demo-pod1; all of them are reachable.
for var_i in $(oc get pod -o json | jq -r '.items[] | select(.metadata.name != "demo-pod1") | .status.podIP'); do
oc exec -n demo demo-pod1 -it -- ping -c 5 ${var_i}
done
# clean up
oc delete -f calico.pod.yaml
cat calico.ip.pool.yaml | oc exec calicoctl -n calico-system -i -- /calicoctl delete -f -
calico + multus
Video walkthrough
- https://youtu.be/MQRv6UASZcA
- https://www.bilibili.com/video/BV1zi4y147sk/
- https://www.ixigua.com/i6825969911781655048/
# create the ip addresses needed by the multus macvlan networks
cat << EOF > calico.macvlan.yaml
apiVersion: operator.openshift.io/v1
kind: Network
metadata:
name: cluster
spec:
additionalNetworks:
- name: multus-macvlan-0
namespace: demo
type: SimpleMacvlan
simpleMacvlanConfig:
ipamConfig:
type: static
staticIPAMConfig:
addresses:
- address: 10.123.110.11/24
routes:
- name: multus-macvlan-1
namespace: demo
type: SimpleMacvlan
simpleMacvlanConfig:
ipamConfig:
type: static
staticIPAMConfig:
addresses:
- address: 10.123.110.22/24
EOF
oc apply -f calico.macvlan.yaml
# check the created addresses
oc get Network.operator.openshift.io -o yaml
# create pods configured with multus macvlan
cat << EOF > calico.pod.yaml
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod1
namespace: demo
annotations:
k8s.v1.cni.cncf.io/networks: '
[{
"name": "multus-macvlan-0"
}]'
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod2
namespace: demo
annotations:
k8s.v1.cni.cncf.io/networks: '
[{
"name": "multus-macvlan-1"
}]'
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-1.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
EOF
oc apply -f calico.pod.yaml
# look up the ip addresses on demo-pod2
var_ips=$(oc get pod -o json | jq -r '.items[] | select(.metadata.name != "demo-pod1") | .metadata.annotations["k8s.v1.cni.cncf.io/networks-status"] | fromjson | .[].ips[0] ' )
echo -e "$var_ips"
# oc get pod -o json | jq -r ' .items[] | select(.metadata.name != "demo-pod1") | { podname: .metadata.name, ip: ( .metadata.annotations["k8s.v1.cni.cncf.io/networks-status"] | fromjson | .[].ips[0] ) } | [.podname, .ip] | @tsv'
# from demo-pod1, ping both of demo-pod2's ip addresses
for var_i in $var_ips; do
oc exec -n demo demo-pod1 -it -- ping -c 5 ${var_i}
done
# restore
oc delete -f calico.pod.yaml
cat << EOF > calico.macvlan.yaml
apiVersion: operator.openshift.io/v1
kind: Network
metadata:
name: cluster
EOF
oc apply -f calico.macvlan.yaml
calico + static ip
https://docs.projectcalico.org/networking/use-specific-ip
Video walkthrough
- https://youtu.be/q8FtuOzBixA
- https://www.bilibili.com/video/BV1zz411q78i/
# create a static-ip deployment and a pod for testing
cat << EOF > demo.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo
spec:
replicas: 1
selector:
matchLabels:
app: demo
template:
metadata:
labels:
app: demo
annotations:
"cni.projectcalico.org/ipAddrs": '["10.254.22.33"]'
spec:
nodeSelector:
# kubernetes.io/hostname: 'worker-1.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod1
namespace: demo
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
EOF
oc apply -n demo -f demo.yaml
# check the pod ip addresses
oc get pod -o wide
# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
# demo-8688cf4477-s26rs 1/1 Running 0 5s 10.254.22.33 worker-1.ocp4.redhat.ren <none> <none>
# demo-pod1 1/1 Running 0 6s 10.254.115.48 worker-0.ocp4.redhat.ren <none> <none>
# ping test
oc exec -n demo demo-pod1 -it -- ping -c 5 10.254.22.33
# move the pod to another node
oc get pod -o wide
# ping test again
oc exec -n demo demo-pod1 -it -- ping -c 5 10.254.22.33
# clean up
oc delete -n demo -f demo.yaml
calico + mtu
https://docs.projectcalico.org/networking/mtu
Video walkthrough
- https://youtu.be/hTafoKlQiY0
- https://www.bilibili.com/video/BV1Tk4y167Zs/
# first check the current mtu
cat << EOF > demo.yaml
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod1
namespace: demo
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod2
namespace: demo
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-1.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
EOF
oc apply -n demo -f demo.yaml
# check the mtu: tunl0 is 1480 and eth0 is 1410
oc exec -it demo-pod1 -- ip a
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
# link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
# inet 127.0.0.1/8 scope host lo
# valid_lft forever preferred_lft forever
# inet6 ::1/128 scope host
# valid_lft forever preferred_lft forever
# 2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
# link/ipip 0.0.0.0 brd 0.0.0.0
# 4: eth0@if54: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue state UP group default
# link/ether c2:e9:6a:c8:62:77 brd ff:ff:ff:ff:ff:ff link-netnsid 0
# inet 10.254.115.50/32 scope global eth0
# valid_lft forever preferred_lft forever
# inet6 fe80::c0e9:6aff:fec8:6277/64 scope link
# valid_lft forever preferred_lft forever
# change the mtu from 1410 to 700
oc get installations.operator.tigera.io -o yaml
oc edit installations.operator.tigera.io
# spec:
# calicoNetwork:
# mtu: 700
# restart the calico node pods
# oc delete deploy calico-kube-controllers -n calico-system
# oc delete deploy calico-typha -n calico-system
# oc delete ds calico-node -n calico-system
oc delete -n demo -f demo.yaml
# reboot the worker nodes
# recreate the pods
# oc apply -n demo -f demo.yaml
# check the mtu
oc exec -i demo-pod1 -- ip a
oc exec -i demo-pod2 -- ip a
# assorted ping tests
var_ip=$(oc get pod -o json | jq -r '.items[] | select(.metadata.name == "demo-pod1") | .status.podIP')
echo $var_ip
# the IP stack adds ICMP and IP headers totalling 28 bytes, so subtract 28 from the target MTU
oc exec -i demo-pod2 -- ping -M do -s $((600-28)) -c 5 $var_ip
oc exec -i demo-pod2 -- ping -M do -s $((700-28)) -c 5 $var_ip
oc exec -i demo-pod2 -- ping -M do -s $((800-28)) -c 5 $var_ip
# restore the mtu from 700 back to 1410
oc edit installations.operator.tigera.io
# spec:
# calicoNetwork:
oc get installations.operator.tigera.io -o yaml
# restart the calico node pods
# oc delete deploy calico-kube-controllers -n calico-system
# oc delete deploy calico-typha -n calico-system
# oc delete ds calico-node -n calico-system
oc delete -n demo -f demo.yaml
# reboot the worker nodes
# recreate the pods
# oc apply -n demo -f demo.yaml
# check the mtu
oc exec -i demo-pod1 -- ip a
oc exec -i demo-pod2 -- ip a
# assorted ping tests
var_ip=$(oc get pod -o json | jq -r '.items[] | select(.metadata.name == "demo-pod1") | .status.podIP')
echo $var_ip
# the IP stack adds ICMP and IP headers totalling 28 bytes, so subtract 28 from the target MTU
oc exec -i demo-pod2 -- ping -M do -s $((600-28)) -c 5 $var_ip
oc exec -i demo-pod2 -- ping -M do -s $((700-28)) -c 5 $var_ip
oc exec -i demo-pod2 -- ping -M do -s $((800-28)) -c 5 $var_ip
# restore
oc delete -n demo -f demo.yaml
calico + ipv4/v6 dual stack
Video walkthrough
- https://youtu.be/ju4d7jWs7DQ
- https://www.bilibili.com/video/BV1va4y1e7c1/
- https://www.ixigua.com/i6827830624431112715/
# before installing the cluster, add the ipv6 address info to the install config
# install openshift with calico and ipv6 config
# networking:
# clusterNetworks:
# - cidr: 10.254.0.0/16
# hostPrefix: 24
# - cidr: fd00:192:168:7::/64
# hostPrefix: 80
# during the cluster installation, add ipv6 addresses to the hosts and the install proceeds normally
## add ipv6 address to hosts
# helper
nmcli con modify eth0 ipv6.address "fd00:192:168:7::11/64" ipv6.gateway fd00:192:168:7::1
nmcli con modify eth0 ipv6.method manual
nmcli con reload
nmcli con up eth0
# master0
nmcli con modify ens3 ipv6.address fd00:192:168:7::13/64 ipv6.gateway fd00:192:168:7::1 ipv6.method manual
nmcli con reload
nmcli con up ens3
# master1
nmcli con modify ens3 ipv6.address fd00:192:168:7::14/64 ipv6.gateway fd00:192:168:7::1 ipv6.method manual
nmcli con reload
nmcli con up ens3
# master2
nmcli con modify ens3 ipv6.address fd00:192:168:7::15/64 ipv6.gateway fd00:192:168:7::1 ipv6.method manual
nmcli con reload
nmcli con up ens3
# worker0
nmcli con modify ens3 ipv6.address fd00:192:168:7::16/64 ipv6.gateway fd00:192:168:7::1 ipv6.method manual
nmcli con reload
nmcli con up ens3
# worker1
nmcli con modify ens3 ipv6.address fd00:192:168:7::17/64 ipv6.gateway fd00:192:168:7::1 ipv6.method manual
nmcli con reload
nmcli con up ens3
oc apply -f calicoctl.yaml
oc exec calicoctl -n calico-system -it -- /calicoctl get node -o wide
oc exec calicoctl -n calico-system -it -- /calicoctl ipam show --show-blocks
oc exec calicoctl -n calico-system -it -- /calicoctl get ipPool -o wide
# deploy a tomcat from the openshift developer view
# access the route from a browser to test ipv4
# from master0, access the pod's ipv6 address on worker1 directly
curl -g -6 'http://[fd00:192:168:7:697b:8c59:3298:b950]:8080/'
# from outside the cluster, access the pod's ipv6 address on worker0 directly
ip -6 route add fd00:192:168:7:697b:8c59:3298::/112 via fd00:192:168:7::17 dev eth0
curl -g -6 'http://[fd00:192:168:7:697b:8c59:3298:b950]:8080/'
calico + bgp
cat << EOF > calico.serviceip.yaml
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
name: default
spec:
serviceClusterIPs:
- cidr: 10.96.0.0/16
EOF
cat calico.serviceip.yaml | oc exec calicoctl -n calico-system -i -- /calicoctl apply -f -
oc exec calicoctl -n calico-system -i -- /calicoctl patch bgpconfiguration default -p '{"spec": {"nodeToNodeMeshEnabled": true}}'
oc exec calicoctl -n calico-system -it -- /calicoctl get bgpconfig default -o yaml
oc exec calicoctl -n calico-system -it -- /calicoctl get node -o wide
oc exec calicoctl -n calico-system -it -- /calicoctl ipam show --show-blocks
oc exec calicoctl -n calico-system -it -- /calicoctl get ipPool -o wide
oc exec calicoctl -n calico-system -it -- /calicoctl get workloadEndpoint
oc exec calicoctl -n calico-system -it -- /calicoctl get BGPPeer
cat << EOF > calico.bgp.yaml
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
name: my-global-peer
spec:
peerIP: 192.168.7.11
asNumber: 64513
EOF
cat calico.bgp.yaml | oc exec calicoctl -n calico-system -i -- /calicoctl apply -f -
# on helper
# https://www.vultr.com/docs/configuring-bgp-using-quagga-on-vultr-centos-7
yum install quagga
systemctl start zebra
systemctl start bgpd
cp /usr/share/doc/quagga-*/bgpd.conf.sample /etc/quagga/bgpd.conf
vtysh
show running-config
configure terminal
no router bgp 7675
router bgp 64513
no auto-summary
no synchronization
neighbor 192.168.7.13 remote-as 64512
neighbor 192.168.7.13 description "calico"
neighbor 192.168.7.13 attribute-unchanged next-hop
neighbor 192.168.7.13 ebgp-multihop 255
neighbor 192.168.7.13 next-hop-self
# no neighbor 192.168.7.13 next-hop-self
neighbor 192.168.7.13 activate
interface eth0
exit
exit
write
show running-config
show ip bgp summary
# test it
cat << EOF > calico.ip.pool.yaml
---
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
name: ip-pool-1
spec:
cidr: 172.110.110.0/24
ipipMode: Always
natOutgoing: false
---
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
name: ip-pool-2
spec:
cidr: 172.110.220.0/24
ipipMode: Always
natOutgoing: false
EOF
cat calico.ip.pool.yaml | oc exec calicoctl -n calico-system -i -- /calicoctl apply -f -
oc exec calicoctl -n calico-system -it -- /calicoctl get ipPool -o wide
cat << EOF > calico.pod.yaml
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod1
namespace: demo
annotations:
cni.projectcalico.org/ipv4pools: '["ip-pool-1"]'
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod2
namespace: demo
annotations:
cni.projectcalico.org/ipv4pools: '["ip-pool-1"]'
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod3
namespace: demo
annotations:
cni.projectcalico.org/ipv4pools: '["ip-pool-2"]'
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod4
namespace: demo
annotations:
cni.projectcalico.org/ipv4pools: '["ip-pool-1"]'
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-1.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod5
namespace: demo
annotations:
cni.projectcalico.org/ipv4pools: '["ip-pool-2"]'
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-1.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
EOF
oc apply -f calico.pod.yaml
run calico/node with --backend=none
CALICO_NETWORKING_BACKEND none
https://docs.projectcalico.org/reference/node/configuration
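A hedged sketch of setting that env var on the calico-node daemonset; note this is untested here and the tigera operator may reconcile the change away, in which case it has to be set through the operator instead:
# oc set env ds/calico-node -n calico-system CALICO_NETWORKING_BACKEND=none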
backups
skopeo copy docker://quay.io/tigera/operator-init:v1.3.3 docker://registry.redhat.ren:5443/tigera/operator-init:v1.3.3
skopeo copy docker://quay.io/tigera/operator:v1.3.3 docker://registry.redhat.ren:5443/tigera/operator:v1.3.3
skopeo copy docker://docker.io/calico/ctl:v3.13.2 docker://registry.redhat.ren:5443/calico/ctl:v3.13.2
skopeo copy docker://docker.io/calico/kube-controllers:v3.13.2 docker://registry.redhat.ren:5443/calico/kube-controllers:v3.13.2
skopeo copy docker://docker.io/calico/node:v3.13.2 docker://registry.redhat.ren:5443/calico/node:v3.13.2
skopeo copy docker://docker.io/calico/typha:v3.13.2 docker://registry.redhat.ren:5443/calico/typha:v3.13.2
skopeo copy docker://docker.io/calico/pod2daemon-flexvol:v3.13.2 docker://registry.redhat.ren:5443/calico/pod2daemon-flexvol:v3.13.2
skopeo copy docker://docker.io/calico/cni:v3.13.2 docker://registry.redhat.ren:5443/calico/cni:v3.13.2
curl https://docs.projectcalico.org/manifests/ocp/crds/01-crd-installation.yaml -o manifests/01-crd-installation.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/01-crd-tigerastatus.yaml -o manifests/01-crd-tigerastatus.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/calico/kdd/02-crd-bgpconfiguration.yaml -o manifests/02-crd-bgpconfiguration.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/calico/kdd/02-crd-bgppeer.yaml -o manifests/02-crd-bgppeer.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/calico/kdd/02-crd-blockaffinity.yaml -o manifests/02-crd-blockaffinity.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/calico/kdd/02-crd-clusterinformation.yaml -o manifests/02-crd-clusterinformation.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/calico/kdd/02-crd-felixconfiguration.yaml -o manifests/02-crd-felixconfiguration.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/calico/kdd/02-crd-globalnetworkpolicy.yaml -o manifests/02-crd-globalnetworkpolicy.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/calico/kdd/02-crd-globalnetworkset.yaml -o manifests/02-crd-globalnetworkset.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/calico/kdd/02-crd-hostendpoint.yaml -o manifests/02-crd-hostendpoint.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/calico/kdd/02-crd-ipamblock.yaml -o manifests/02-crd-ipamblock.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/calico/kdd/02-crd-ipamconfig.yaml -o manifests/02-crd-ipamconfig.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/calico/kdd/02-crd-ipamhandle.yaml -o manifests/02-crd-ipamhandle.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/calico/kdd/02-crd-ippool.yaml -o manifests/02-crd-ippool.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/calico/kdd/02-crd-networkpolicy.yaml -o manifests/02-crd-networkpolicy.yaml
curl https://docs.projectcalico.org/manifests/ocp/crds/calico/kdd/02-crd-networkset.yaml -o manifests/02-crd-networkset.yaml
curl https://docs.projectcalico.org/manifests/ocp/tigera-operator/00-namespace-tigera-operator.yaml -o manifests/00-namespace-tigera-operator.yaml
curl https://docs.projectcalico.org/manifests/ocp/tigera-operator/02-rolebinding-tigera-operator.yaml -o manifests/02-rolebinding-tigera-operator.yaml
curl https://docs.projectcalico.org/manifests/ocp/tigera-operator/02-role-tigera-operator.yaml -o manifests/02-role-tigera-operator.yaml
curl https://docs.projectcalico.org/manifests/ocp/tigera-operator/02-serviceaccount-tigera-operator.yaml -o manifests/02-serviceaccount-tigera-operator.yaml
curl https://docs.projectcalico.org/manifests/ocp/tigera-operator/02-configmap-calico-resources.yaml -o manifests/02-configmap-calico-resources.yaml
curl https://docs.projectcalico.org/manifests/ocp/tigera-operator/02-configmap-tigera-install-script.yaml -o manifests/02-configmap-tigera-install-script.yaml
curl https://docs.projectcalico.org/manifests/ocp/tigera-operator/02-tigera-operator.yaml -o manifests/02-tigera-operator.yaml
curl https://docs.projectcalico.org/manifests/ocp/01-cr-installation.yaml -o manifests/01-cr-installation.yaml
curl https://docs.projectcalico.org/manifests/calicoctl.yaml -o manifests/calicoctl.yaml
oc get Network.operator.openshift.io -o yaml
# defaultNetwork:
# calicoSDNConfig:
# mtu: 700
# openshiftSDNConfig:
# mtu: 700
oc api-resources | grep -i calico
oc api-resources | grep -i tigera
oc get FelixConfiguration -o yaml
oc exec calicoctl -n calico-system -it -- /calicoctl get bgpconfig default
cat << EOF > calico.serviceip.yaml
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
name: default
spec:
serviceClusterIPs:
- cidr: 10.96.0.0/16
- cidr: fd00:192:168:7:1:1::/112
EOF
cat calico.serviceip.yaml | oc exec calicoctl -n calico-system -i -- /calicoctl apply -f -
oc exec calicoctl -n calico-system -it -- /calicoctl get workloadEndpoint
oc exec calicoctl -n calico-system -it -- /calicoctl get BGPPeer
cat << EOF > calico.bgp.yaml
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
name: my-global-peer
spec:
peerIP: 192.168.7.11
asNumber: 64513
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
name: my-global-peer-v6
spec:
peerIP: fd00:192:168:7::11
asNumber: 64513
EOF
cat calico.bgp.yaml | oc exec calicoctl -n calico-system -i -- /calicoctl apply -f -
# on helper
# https://www.vultr.com/docs/configuring-bgp-using-quagga-on-vultr-centos-7
yum install quagga
systemctl start zebra
systemctl start bgpd
cp /usr/share/doc/quagga-*/bgpd.conf.sample /etc/quagga/bgpd.conf
vtysh
show running-config
configure terminal
no router bgp 7675
router bgp 64513
no auto-summary
no synchronization
neighbor 192.168.7.13 remote-as 64512
neighbor 192.168.7.13 description "calico"
neighbor fd00:192:168:7::13 remote-as 64512
neighbor fd00:192:168:7::13 description "calico"
interface eth0
?? no ipv6 nd suppress-ra
exit
exit
write
show running-config
show ip bgp summary
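如果想进一步确认 quagga 侧确实从 calico 学到了路由,可以参考下面这个示意(假设 BGP 邻居已经建立,命令在 vtysh 里交互执行,仅供参考):
# 在 helper 节点的 vtysh 里执行
vtysh
show ip bgp
show bgp ipv6 unicast
show ip route bgp
exit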
# https://access.redhat.com/documentation/en-us/openshift_container_platform/4.3/html/networking/cluster-network-operator
oc get Network.operator.openshift.io -o yaml
oc edit Network.operator.openshift.io cluster
# - cidr: fd01:192:168:7:11:/64
# hostPrefix: 80
oc get network.config/cluster
oc edit network.config/cluster
oc get installations.operator.tigera.io -o yaml
oc edit installations.operator.tigera.io
# nodeAddressAutodetectionV6:
# firstFound: true
- blockSize: 122
cidr: fd01:192:168:7:11::/80
encapsulation: None
natOutgoing: Disabled
nodeSelector: all()
openshift4 集群升级
4.2的集群升级很简单,更新一下镜像仓库,然后运行一个命令,等着就好了。
# on base host
cat << EOF > /etc/docker-distribution/registry/config.yml
version: 0.1
log:
fields:
service: registry
storage:
cache:
layerinfo: inmemory
filesystem:
rootdirectory: /data/4.2.7/registry
delete:
enabled: true
http:
addr: :443
tls:
certificate: /etc/crts/redhat.ren.crt
key: /etc/crts/redhat.ren.key
EOF
systemctl restart docker-distribution
# on helper node
# oc patch OperatorHub cluster --type json -p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
oc patch OperatorHub cluster --type json -p '[{"op": "remove", "path": "/spec/disableAllDefaultSources"}]'
oc patch configs.samples.operator.openshift.io cluster -p '{"spec":{"managementState":"Removed"}}' --type=merge
oc adm upgrade --allow-explicit-upgrade --allow-upgrade-with-warnings=true --force=true --to-image=registry.redhat.ren/ocp4/openshift4:4.2.7
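升级命令发出以后,笔者一般用下面几个命令观察进度,仅供参考:
oc get clusterversion
oc adm upgrade
# 等所有 cluster operator 都恢复 Available=True / Progressing=False / Degraded=False,升级就算完成了
watch -n 10 "oc get co"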
openshift4 缩小 / & sysroot 分区大小
openshift4默认安装的时候,会把sda/vda整个硬盘占满。如果我们是baremetal安装,一般会配置1T大小的SSD/NVME,这样非常浪费。我们完全可以把硬盘空间节省下来,分出一些分区,给local storage operator用。
视频讲解:
# backup the ignition file you want
/bin/cp -f /var/www/html/ignition/worker-1.ign /var/www/html/ignition/worker-1.ign.bak
# 修改 /data/ocp4/partition.sh ,
# 主要是修改里面的root分区大小,默认是200G
# 然后是想要创建的数据分区的个数和大小参数,
# 默认会创建5个10G分区,5个5G分区。
bash /data/ocp4/partition.sh
butane /data/ocp4/root-partition.bu -r -o /data/install/partition-ric.ign
/bin/cp -f /var/www/html/ignition/worker-1.ign.bak /var/www/html/ignition/worker-1.ign
# merge the 2 ignition files
jq -s '.[0] * .[1]' /var/www/html/ignition/worker-1.ign /data/install/partition-ric.ign | jq -c . > /var/www/html/ignition/worker-1.ign.new
/bin/cp -f /var/www/html/ignition/worker-1.ign.new /var/www/html/ignition/worker-1.ign
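合并完以后,可以用 jq 简单确认一下分区配置确实进了最终的 ignition 文件,下面是一个示意:
# 应该能看到 root / data_10G_x / data_5G_x 这些 label
jq -r '.storage.disks[0].partitions[].label' /var/www/html/ignition/worker-1.ign
jq -r '.ignition.version' /var/www/html/ignition/worker-1.ign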
# then install using iso
# login to worker-1
lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
# sr0 11:0 1 1024M 0 rom
# vda 252:0 0 1T 0 disk
# ├─vda1 252:1 0 1M 0 part
# ├─vda2 252:2 0 127M 0 part
# ├─vda3 252:3 0 384M 0 part /boot
# ├─vda4 252:4 0 200G 0 part /sysroot
# ├─vda5 252:5 0 10G 0 part
# ├─vda6 252:6 0 10G 0 part
# ├─vda7 252:7 0 10G 0 part
# ├─vda8 252:8 0 10G 0 part
# ├─vda9 252:9 0 10G 0 part
# ├─vda10 252:10 0 5G 0 part
# ├─vda11 252:11 0 5G 0 part
# ├─vda12 252:12 0 5G 0 part
# ├─vda13 252:13 0 5G 0 part
# └─vda14 252:14 0 5G 0 part /var/lib/kubelet/pods/a364c83a-deae-4431-b7c3-bcef8457aed6/volumes/kubernetes.io~local-volume/local-pv-9fa7f
# let's check what we created
cat /data/ocp4/root-partition.bu
# variant: openshift
# version: 4.8.0
# metadata:
# name: root-storage
# labels:
# machineconfiguration.openshift.io/role: worker
# storage:
# disks:
# - device: /dev/vda
# wipe_table: false
# partitions:
# - number: 4
# label: root
# size_mib: 204800
# resize: true
# - label: data_10G_1
# size_mib: 10240
# - label: data_10G_2
# size_mib: 10240
# - label: data_10G_3
# size_mib: 10240
# - label: data_10G_4
# size_mib: 10240
# - label: data_10G_5
# size_mib: 10240
# - label: data_5G_1
# size_mib: 5120
# - label: data_5G_2
# size_mib: 5120
# - label: data_5G_3
# size_mib: 5120
# - label: data_5G_4
# size_mib: 5120
# - label: data_5G_5
# size_mib: 5120
cat /data/install/partition-ric.ign | jq .
# {
# "ignition": {
# "version": "3.2.0"
# },
# "storage": {
# "disks": [
# {
# "device": "/dev/vda",
# "partitions": [
# {
# "label": "root",
# "number": 4,
# "resize": true,
# "sizeMiB": 204800
# },
# {
# "label": "data_10G_1",
# "sizeMiB": 10240
# },
# {
# "label": "data_10G_2",
# "sizeMiB": 10240
# },
# {
# "label": "data_10G_3",
# "sizeMiB": 10240
# },
# {
# "label": "data_10G_4",
# "sizeMiB": 10240
# },
# {
# "label": "data_10G_5",
# "sizeMiB": 10240
# },
# {
# "label": "data_5G_1",
# "sizeMiB": 5120
# },
# {
# "label": "data_5G_2",
# "sizeMiB": 5120
# },
# {
# "label": "data_5G_3",
# "sizeMiB": 5120
# },
# {
# "label": "data_5G_4",
# "sizeMiB": 5120
# },
# {
# "label": "data_5G_5",
# "sizeMiB": 5120
# }
# ],
# "wipeTable": false
# }
# ]
# }
# }
local storage operator
我们有了很多分区,那么赶快来测试一下如何把他们变成 PV 吧
apiVersion: "local.storage.openshift.io/v1"
kind: "LocalVolume"
metadata:
name: "local-disks"
namespace: "openshift-local-storage"
spec:
nodeSelector:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- worker-1
storageClassDevices:
- storageClassName: "local-sc"
volumeMode: Filesystem
fsType: xfs
devicePaths:
- /dev/vda5
- /dev/vda14
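上面的 LocalVolume 可以保存成文件再应用,下面是一个示意,文件名 /data/install/local-volume.yaml 是假设的,请按实际情况调整:
oc create -f /data/install/local-volume.yaml
# 检查 operator 创建出来的 storage class 和 PV
oc get localvolume -n openshift-local-storage
oc get sc local-sc
oc get pv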
我们可以看到配置已经生效,系统已经帮我们创建好了PV。
我们创建pod,创建和使用pvc,然后弄点数据,然后删掉pod,删掉pvc。然后重新创建pod,创建和使用pvc,看看里面的数据是否会清空。
cat << EOF >> /data/install/pvc-demo.yaml
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: local-pvc-demo
spec:
accessModes:
- ReadWriteOnce
volumeMode: Filesystem
resources:
requests:
storage: 2Gi
storageClassName: local-sc
---
kind: Pod
apiVersion: v1
metadata:
annotations:
name: demo1
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-1'
restartPolicy: Always
containers:
- name: demo1
image: >-
quay.io/wangzheng422/qimgs:centos7-test
env:
- name: key
value: value
command:
- sleep
- infinity
imagePullPolicy: Always
volumeMounts:
- mountPath: /data
name: demo
readOnly: false
volumes:
- name: demo
persistentVolumeClaim:
claimName: local-pvc-demo
EOF
oc create -n default -f /data/install/pvc-demo.yaml
我们能看到 PVC 已经创建,PV 也已经挂载。
oc rsh pod/demo1
df -h
# Filesystem Size Used Avail Use% Mounted on
# overlay 200G 8.4G 192G 5% /
# tmpfs 64M 0 64M 0% /dev
# tmpfs 24G 0 24G 0% /sys/fs/cgroup
# shm 64M 0 64M 0% /dev/shm
# tmpfs 24G 64M 24G 1% /etc/hostname
# /dev/vda14 5.0G 68M 5.0G 2% /data
# /dev/vda4 200G 8.4G 192G 5% /etc/hosts
# tmpfs 24G 20K 24G 1% /run/secrets/kubernetes.io/serviceaccount
# tmpfs 24G 0 24G 0% /proc/acpi
# tmpfs 24G 0 24G 0% /proc/scsi
# tmpfs 24G 0 24G 0% /sys/firmware
echo wzh > /data/1
cat /data/1
# wzh
# destroy the pvc and pod
oc delete -n default -f /data/install/pvc-demo.yaml
# recreate
oc create -n default -f /data/install/pvc-demo.yaml
PVC重新创建了,PV也重新挂载了。
我们发现,PV release以后,重新挂载,之前的存储内容,就都没有了。
oc rsh pod/demo1
sh-4.2# cd /data
sh-4.2# ls
sh-4.2# ls -hl
total 0
openshift4 离线升级服务 / disconnected update service
openshift4默认的集群管理界面,会向公网的升级服务请求升级信息,如果在离线安装的情况,这个升级信息是拿不到的,于是集群的管理界面就会一堆报错,很难看。现在openshift4有一个update server operator,这个可以在集群内部创建一个离线的update server,提供升级信息,这样集群的管理界面就不会那么难看啦。
本次实验的部署架构:
视频讲解:
based on:
- https://www.openshift.com/blog/openshift-update-service-update-manager-for-your-cluster
- https://docs.openshift.com/container-platform/4.8/updating/installing-update-service.html
离线安装以后,不配置的话,系统管理页面是这个鬼样子:
# search OpenShift Update Service in operator hub, and install
# build a update container
mkdir -p /data/update
cd /data/update
cat << EOF > /data/update/Dockerfile
FROM registry.access.redhat.com/ubi8
RUN curl -L -o cincinnati-graph-data.tar.gz https://github.com/openshift/cincinnati-graph-data/archive/master.tar.gz
CMD exec /bin/bash -c "tar xvzf cincinnati-graph-data.tar.gz -C /var/lib/cincinnati/graph-data/ --strip-components=1"
EOF
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
buildah bud -f ./Dockerfile -t quay.io/wangzheng422/graph-data-image:$var_date
podman push quay.io/wangzheng422/graph-data-image:$var_date
echo quay.io/wangzheng422/graph-data-image:$var_date
# quay.io/wangzheng422/graph-data-image:2021-09-07-0709
cat << EOF > /data/install/update.yaml
apiVersion: updateservice.operator.openshift.io/v1
kind: UpdateService
metadata:
namespace: openshift-update-service
name: sample
spec:
graphDataImage: 'nexus.ocp4.redhat.ren:8083/wangzheng422/graph-data-image:2021-09-07-0709'
releases: 'registry.ocp4.redhat.ren:5443/ocp4/release'
replicas: 1
EOF
oc create -f /data/install/update.yaml
# to restore
oc delete -f /data/install/update.yaml
# 部署完了update service 以后,发现报错
# 发现update service operator依赖有password的registry
# 我们之前默认安装的registry是没有密码的,就不行
# 所以重新部署一个需要密码认证的registry就可以了。
oc get secret/pull-secret -n openshift-config -o json | jq '.data.".dockerconfigjson"' | jq -r . | base64 -d | jq .
# {
# "auths": {
# "registry.ocp4.redhat.ren:5443": {
# "username": "admin",
# "password": "redhat",
# "auth": "YWRtaW46cmVkaGF0",
# "email": "admin@redhat.ren"
# }
# }
# }
oc delete cm ca.for.registry -n openshift-config
oc create configmap ca.for.registry -n openshift-config \
--from-file=registry.ocp4.redhat.ren..5443=/etc/crts/redhat.ren.ca.crt \
--from-file=updateservice-registry=/etc/crts/redhat.ren.ca.crt
oc patch image.config.openshift.io/cluster -p '{"spec":{"additionalTrustedCA":{"name":"ca.for.registry"}}}' --type=merge
# oc patch image.config.openshift.io/cluster -p '{"spec":{"additionalTrustedCA":{"name":"ca.for.registry"}}}' --type=merge
# our router's https cert is self-signed,
# so the update service will report an error on this cert
# so we create a http route, to avoid this error
cat << EOF > /data/install/update-wzh-route.yaml
kind: Route
apiVersion: route.openshift.io/v1
metadata:
name: update-wzh
namespace: openshift-update-service
labels:
app: sample-policy-engine
spec:
to:
kind: Service
name: sample-policy-engine
weight: 100
port:
targetPort: policy-engine
EOF
oc create -f /data/install/update-wzh-route.yaml
oc patch clusterversion version --type='json' -p='[{"op": "replace", "path": "/spec/upstream", "value": "http://update-wzh-openshift-update-service.apps.ocp4.redhat.ren/api/upgrades_info/v1/graph"}]'
oc get clusterversion version -o yaml | more
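也可以直接 curl 一下上面这个 http route 的 graph 接口,验证离线升级数据源是否正常工作,下面是一个示意(channel 参数按实际集群版本调整):
curl -s --header 'Accept:application/json' \
  "http://update-wzh-openshift-update-service.apps.ocp4.redhat.ren/api/upgrades_info/v1/graph?channel=stable-4.8" \
  | jq . | head -n 30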
可以在operator的图形界面中,配置离线的update service参数
离线update service配置好了以后,看上去就非常舒适了。
windows node in openshift 4.8
在本文中,我们将安装一个win10节点,并加入到openshift 4.8集群中去。之后会部署一个演示应用。
经过测试,我们发现,当前的win10当作worker节点,还是不太适合,原因如下:
- windows要求容器的基础镜像版本,和宿主机的版本严格一致,这样就不能像rhel一样,在rhel8上运行rhel7的容器,在部署的时候会造成很大困惑。
- windows的容器,不能运行GUI app。虽然也有很多.net的web服务应用,但是更多的老旧windows应用,应该还是包含GUI的程序。这样大大限制了windows容器的应用场景。
- docker for windows版本,只能设置proxy,不能为第三方镜像仓库设置mirror,这样对于离线部署,就很难受了。
- 目前版本,对静态IP部署还不友好,需要手动配置windows网卡。
- 目前版本的稳定性还有待加强,会出现k8s的服务崩溃现象,只能做开发测试,体验用,当然如果我们用windows server来做,稳定性会好很多。
本次部署的架构图:
视频讲解:
安装 win10
安装win10,需要注意选择正确的版本,因为win10的docker镜像版本,要求和宿主机一致。 在这里查看 win10 docker image version.
在本文撰写的时候,版本是win10 20H2, 在这里下载这个版本的ISO.
选择好版本,我们就要开始安装了。
# 先要准备一下 virtio 的驱动,因为 win10 里面没有, 安装的时候找不到硬盘。
podman pull registry.redhat.io/container-native-virtualization/virtio-win
podman run --rm -it --name swap registry.redhat.io/container-native-virtualization/virtio-win bash
podman create --name swap registry.redhat.io/container-native-virtualization/virtio-win ls
podman cp swap:/disk/virtio-win.iso - > virtio-win.iso.tar
gzip virtio-win.iso.tar
podman rm swap
# 直接创建kvm, 自动开始安装啦。
export KVM_DIRECTORY=/data/kvm
virt-install --name=ocp4-windows --vcpus=6,cores=6 --ram=12288 \
--cpu=host-model \
--disk path=/data/nvme/ocp4-windows.qcow2,bus=virtio,size=100 \
--os-variant win10 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59017 \
--boot menu=on \
--cdrom ${KVM_DIRECTORY}/win10.iso \
--disk ${KVM_DIRECTORY}/virtio-win.iso,device=cdrom
win10的话,必须选择专业版。
选择自定义安装,因为我们要加载硬盘驱动
选择加载驱动程序
选择正确的驱动程序位置
选择驱动,下一步
默认安装整个硬盘
安装就自动进行
安装完成后,进入系统,把剩下的驱动,一口气都装了。
系统识别出了网卡,那就设置IP地址吧
我们需要装ssh服务端,从 设置-应用 中找
点击可选功能
点击添加功能
搜索ssh服务器,并安装
安装完了ssh是这样样子的
我们还需要打开防火墙端口,从网络配置进入
选择高级设置
新建入站规则
根据文档要求,打开 22, 10250 端口
允许连接
所有网络位置都允许
给起个名字
ssh服务默认不是自动启动的,我们把它设置成自动启动
选择自动
从外面,就能ssh到windows了
我把实验用的win10,打包到了一个镜像里面,需要的可以下载使用。
用户名密码是: wzh / redhat
ssh wzh@worker-1
# Microsoft Windows [版本 10.0.19043.1237]
# (c) Microsoft Corporation。保留所有权利。
# wzh@DESKTOP-FUIF19L C:\Users\wzh>
设置 ssh key auth
我们需要设置ssh使用key的方式自动登录,那么要有几个特殊的步骤。
Set-ExecutionPolicy unrestricted
接下来准备2个文件
参考这个文章,写一个允许ssh自动key登录的脚本,我们在里面还加上了自动激活hyper-v, windows container的步骤。
# the script here also enable hyper-v and windows container
cat << 'EOF' > /data/install/win-ssh.ps1
$acl = Get-Acl C:\ProgramData\ssh\administrators_authorized_keys
$acl.SetAccessRuleProtection($true, $false)
$administratorsRule = New-Object system.security.accesscontrol.filesystemaccessrule("Administrators","FullControl","Allow")
$systemRule = New-Object system.security.accesscontrol.filesystemaccessrule("SYSTEM","FullControl","Allow")
$acl.SetAccessRule($administratorsRule)
$acl.SetAccessRule($systemRule)
$acl | Set-Acl
Enable-WindowsOptionalFeature -Online -FeatureName $("Microsoft-Hyper-V", "Containers") -All
EOF
# 把脚本, key, 还有安装文件,复制到win10上
scp /data/install/win-ssh.ps1 wzh@worker-1:c:\\win-ssh.ps1
scp /root/.ssh/id_rsa.pub wzh@worker-1:C:\\ProgramData\\ssh\\administrators_authorized_keys
scp /data/down/Docker\ Desktop\ Installer.exe wzh@worker-1:c:\\docker-install.exe
scp /data/down/wsl_update_x64.msi wzh@worker-1:c:\\wsl_update_x64.msi
用管理员权限,打开power shell
运行我们的脚本
重启win10, 然后你就可以用key自动登录啦。
安装docker,并切换到windows container。
第一次启动docker,会提示wsl2 linux kernel需要更新,可以用我提供的文件直接更新,也可以直接切换到windows container,不用理会那个报警。
设置 docker for windows, 使用 process 来隔离, 因为kvm上的某种未知配置错误,默认hyper-v形式的隔离,启动不了容器,我们换成process来隔离.
{
"registry-mirrors": [],
"insecure-registries": [],
"debug": true,
"experimental": false,
"exec-opts": [
"isolation=process"
]
}
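改完配置、重启 docker 以后,可以从 helper 上远程确认一下隔离模式是否生效,下面是一个示意:
ssh wzh@worker-1 "docker info"
# 输出里应该能看到类似 Default Isolation: process 的字样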
配置界面长这样
记得改一下windows的主机名
backup win10 kvm
我们备份一下win10 kvm,并上传quay.io,方便以后重新做实验。
# poweroff your win10 vm
mkdir -p /data/nvme/bak
cd /data/nvme
virsh dumpxml ocp4-windows > /data/nvme/bak/ocp4-windows.xml
pigz -c ocp4-windows.qcow2 > /data/nvme/bak/ocp4-windows.qcow2.gz
cd /data/nvme/bak
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
buildah from --name onbuild-container scratch
buildah copy onbuild-container ocp4-windows.xml /
buildah copy onbuild-container ocp4-windows.qcow2.gz /
buildah umount onbuild-container
buildah commit --rm onbuild-container quay.io/wangzheng422/qimgs:win7-ssh-$var_date
# buildah rm onbuild-container
# rm -f nexus-image.tgz
echo "quay.io/wangzheng422/qimgs:win7-ssh-$var_date"
buildah push quay.io/wangzheng422/qimgs:win7-ssh-$var_date
# so, we got a image contain win10, and feature enabled.
# this is for win10 version 10.0.19043.1237
# quay.io/wangzheng422/qimgs:win7-ssh-2021-09-30-1340
你可以使用上面的这个版本的镜像,拉取到本地,并从中取出win10虚拟机,然后自己尝试啦。
安装 ocp, 使用 ovn with hybrid mode
参考官方文档:
- https://docs.openshift.com/container-platform/4.8/windows_containers/byoh-windows-instance.html
- https://docs.openshift.com/container-platform/4.8/windows_containers/enabling-windows-container-workloads.html
# vi install-config.yaml
cat << EOF > /data/install/install-config.yaml
apiVersion: v1
baseDomain: redhat.ren
compute:
- hyperthreading: Enabled
name: worker
replicas: 0
controlPlane:
hyperthreading: Enabled
name: master
replicas: 1
metadata:
name: ocp4
networking:
clusterNetworks:
- cidr: 10.128.0.0/16
hostPrefix: 23
networkType: OVNKubernetes
serviceNetwork:
- 172.30.0.0/16
platform:
none: {}
pullSecret: '{"auths":{"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ppa.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"}}}'
sshKey: |
$( cat /root/.ssh/id_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /etc/crts/redhat.ren.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- registry.ocp4.redhat.ren:5443/ocp4/openshift4
- registry.ocp4.redhat.ren:5443/ocp4/release
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- registry.ocp4.redhat.ren:5443/ocp4/openshift4
- registry.ocp4.redhat.ren:5443/ocp4/release
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF
cat << EOF > /data/install/manifests/cluster-network-03-config.yml
apiVersion: operator.openshift.io/v1
kind: Network
metadata:
name: cluster
spec:
defaultNetwork:
ovnKubernetesConfig:
hybridOverlayConfig:
hybridClusterNetwork:
- cidr: 10.132.0.0/16
hostPrefix: 23
hybridOverlayVXLANPort: 9898
EOF
安装windows machien config operator
# 导入ssh key
oc create secret generic cloud-private-key --from-file=private-key.pem=/root/.ssh/id_rsa \
-n openshift-windows-machine-config-operator
# 配置win10自动登录用户名和ip地址
cat << EOF > /data/install/win-node.yaml
kind: ConfigMap
apiVersion: v1
metadata:
name: windows-instances
namespace: openshift-windows-machine-config-operator
data:
192.168.7.17: |-
username=wzh
EOF
oc create -f /data/install/win-node.yaml
# to restore
oc delete -f /data/install/win-node.yaml
# csr is automatically approved
oc get csr
# NAME AGE SIGNERNAME REQUESTOR CONDITION
# csr-ff7q5 63m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
# csr-gzlpq 53s kubernetes.io/kubelet-serving system:node:worker-1 Approved,Issued
# csr-rgdzv 59s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
# csr-zkw8c 63m kubernetes.io/kubelet-serving system:node:master-0 Approved,Issued
# system:openshift:openshift-authenticator 59m kubernetes.io/kube-apiserver-client system:serviceaccount:openshift-authentication-operator:authentication-operator Approved,Issued
估计是当前实现的bug,或者其他原因,windows默认网卡上面的协议会被disable掉,造成windows node加入集群失败。目前暂时手动把这些协议都enable(只保留一个不激活)。当然,也可以只enable ipv4的配置。
之后就等着好了,openshift会自动上传程序和配置,并配置好windows node,加入集群,成功以后,我们就能看到如下的日志。
{"level":"info","ts":1633004643.789956,"logger":"controllers.configmap","msg":"processing","instances in":"windows-instances"}
{"level":"info","ts":1633004674.0080738,"logger":"wc 192.168.7.17","msg":"configuring"}
{"level":"info","ts":1633004675.3135288,"logger":"wc 192.168.7.17","msg":"transferring files"}
{"level":"info","ts":1633004693.670281,"logger":"wc 192.168.7.17","msg":"configured","service":"windows_exporter","args":"--collectors.enabled cpu,cs,logical_disk,net,os,service,system,textfile,container,memory,cpu_info\""}
{"level":"info","ts":1633004697.0266535,"logger":"controllers.CertificateSigningRequests","msg":"CSR approved","CSR":"csr-rgdzv"}
{"level":"info","ts":1633004703.104529,"logger":"controllers.CertificateSigningRequests","msg":"CSR approved","CSR":"csr-gzlpq"}
{"level":"info","ts":1633004726.9497287,"logger":"wc 192.168.7.17","msg":"configured kubelet","cmd":"C:\\k\\\\wmcb.exe initialize-kubelet --ignition-file C:\\Windows\\Temp\\worker.ign --kubelet-path C:\\k\\kubelet.exe --node-ip=192.168.7.17","output":"Bootstrapping completed successfully"}
{"level":"info","ts":1633004757.078427,"logger":"wc 192.168.7.17","msg":"configure","service":"hybrid-overlay-node","args":"--node worker-1 --hybrid-overlay-vxlan-port=9898 --k8s-kubeconfig c:\\k\\kubeconfig --windows-service --logfile C:\\var\\log\\hybrid-overlay\\hybrid-overlay.log\" depend= kubelet"}
{"level":"info","ts":1633004880.6788793,"logger":"wc 192.168.7.17","msg":"configured","service":"hybrid-overlay-node","args":"--node worker-1 --hybrid-overlay-vxlan-port=9898 --k8s-kubeconfig c:\\k\\kubeconfig --windows-service --logfile C:\\var\\log\\hybrid-overlay\\hybrid-overlay.log\" depend= kubelet"}
{"level":"info","ts":1633004928.5883121,"logger":"wc 192.168.7.17","msg":"configured kubelet for CNI","cmd":"C:\\k\\wmcb.exe configure-cni --cni-dir=\"C:\\k\\cni\\ --cni-config=\"C:\\k\\cni\\config\\cni.conf","output":"CNI configuration completed successfully"}
{"level":"info","ts":1633004941.3937094,"logger":"wc 192.168.7.17","msg":"configured","service":"kube-proxy","args":"--windows-service --v=4 --proxy-mode=kernelspace --feature-gates=WinOverlay=true --hostname-override=worker-1 --kubeconfig=c:\\k\\kubeconfig --cluster-cidr=10.132.0.0/24 --log-dir=C:\\var\\log\\kube-proxy\\ --logtostderr=false --network-name=OVNKubernetesHybridOverlayNetwork --source-vip=10.132.0.14 --enable-dsr=false --feature-gates=IPv6DualStack=false\" depend= hybrid-overlay-node"}
{"level":"info","ts":1633004956.4613981,"logger":"nc 192.168.7.17","msg":"instance has been configured as a worker node","version":"3.1.0+06e96071"}
{"level":"info","ts":1633004956.4949114,"logger":"metrics","msg":"Prometheus configured","endpoints":"windows-exporter","port":9182,"name":"metrics"}
{"level":"info","ts":1633004956.5283544,"logger":"controllers.configmap","msg":"processing","instances in":"windows-instances"}
{"level":"info","ts":1633004956.5387952,"logger":"controllers.configmap","msg":"instance is up to date","node":"worker-1","version":"3.1.0+06e96071"}
{"level":"info","ts":1633004956.5493839,"logger":"metrics","msg":"Prometheus configured","endpoints":"windows-exporter","port":9182,"name":"metrics"}
我们能看到 windows节点了。
oc get node
# NAME STATUS ROLES AGE VERSION
# master-0 Ready master,worker 19h v1.21.1+a620f50
# worker-1 Ready worker 4m50s v1.21.1-1398+98073871f173ba
oc get node --show-labels
# NAME STATUS ROLES AGE VERSION LABELS
# master-0 Ready master,worker 4h13m v1.21.1+a620f50 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=master-0,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhcos
# worker-1 Ready worker 5m25s v1.21.1-1398+98073871f173ba beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=windows,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker-1,kubernetes.io/os=windows,node-role.kubernetes.io/worker=,node.kubernetes.io/windows-build=10.0.19042,node.openshift.io/os_id=Windows,windowsmachineconfig.openshift.io/byoh=true
# 看来windows节点不占用machine config pool
oc get mcp
# NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
# master rendered-master-607708e411d75c10e680d8bf5e24de6f True False False 1 1 1 0 19h
# worker rendered-worker-cacf7f7f871c77ae92070b0a44fe0b91 True False False 0 0 0 0 19h
探索一下装了什么
进入win10,可以看到C:\下面,有一个k目录,还有一个var目录,k目录下面就是配置和可执行程序啦。
wzh@WORKER-1 c:\>dir
驱动器 C 中的卷没有标签。
卷的序列号是 C607-13D4
c:\ 的目录
2021/09/28 19:37 535,444,968 Docker Desktop Installer.exe
2021/09/29 11:12 <DIR> k
2019/12/07 17:14 <DIR> PerfLogs
2021/09/28 19:57 <DIR> Program Files
2021/04/09 21:57 <DIR> Program Files (x86)
2021/09/29 11:12 <DIR> Temp
2021/09/28 08:25 <DIR> Users
2021/09/29 11:11 <DIR> var
2021/09/28 17:51 428 win-ssh.ps1
2021/09/28 16:34 <DIR> Windows
2 个文件 535,445,396 字节
8 个目录 19,381,813,248 可用字节
wzh@WORKER-1 c:\>dir k
驱动器 C 中的卷没有标签。
卷的序列号是 C607-13D4
c:\k 的目录
2021/09/29 11:12 <DIR> .
2021/09/29 11:12 <DIR> ..
2021/09/29 11:12 10,908 bootstrap-kubeconfig
2021/09/29 11:12 <DIR> cni
2021/09/29 11:12 <DIR> etc
2021/09/29 11:12 47,493,632 hybrid-overlay-node.exe
2021/09/29 11:12 47,809,536 kube-proxy.exe
2021/09/29 11:12 10,132 kubeconfig
2021/09/29 11:12 5,875 kubelet-ca.crt
2021/09/29 11:12 739 kubelet.conf
2021/09/29 11:12 117,698,048 kubelet.exe
2021/09/29 11:12 <DIR> usr
2021/09/29 11:12 16,986,112 windows_exporter.exe
2021/09/29 11:12 16,331,776 wmcb.exe
9 个文件 246,346,758 字节
5 个目录 19,381,317,632 可用字节
wzh@WORKER-1 c:\>dir var\log
驱动器 C 中的卷没有标签。
卷的序列号是 C607-13D4
c:\var\log 的目录
2021/09/29 11:12 <DIR> .
2021/09/29 11:12 <DIR> ..
2021/09/29 11:12 <DIR> containers
2021/09/29 11:12 <DIR> hybrid-overlay
2021/09/29 11:16 <DIR> kube-proxy
2021/09/29 11:12 <DIR> kubelet
2021/09/29 11:12 <DIR> pods
0 个文件 0 字节
7 个目录 19,381,059,584 可用字节
wzh@WORKER-1 c:\>dir var\lib
驱动器 C 中的卷没有标签。
卷的序列号是 C607-13D4
c:\var\lib 的目录
2021/09/28 20:36 <DIR> .
2021/09/28 20:36 <DIR> ..
2021/09/28 20:36 <DIR> dockershim
2021/09/28 20:38 <DIR> kubelet
0 个文件 0 字节
4 个目录 19,381,043,200 可用字节
删除windows节点
除了按照官方文档修改config map之外,实践发现,最好还是重启一下windows node。
改了config map,耐心等着,最后oc get node,就会看到windows node没有了。
从operator的日志里面,可以看到如下的日志信息。
{"level":"info","ts":1632916600.248877,"logger":"controllers.configmap","msg":"processing","instances in":"windows-instances"}
{"level":"info","ts":1632916610.646764,"logger":"wc 192.168.7.17","msg":"deconfiguring"}
{"level":"info","ts":1632916641.877409,"logger":"wc 192.168.7.17","msg":"deconfigured","service":"windows_exporter"}
{"level":"info","ts":1632916672.9587948,"logger":"wc 192.168.7.17","msg":"deconfigured","service":"kube-proxy"}
{"level":"info","ts":1632916703.9290483,"logger":"wc 192.168.7.17","msg":"deconfigured","service":"hybrid-overlay-node"}
{"level":"info","ts":1632916734.8715909,"logger":"wc 192.168.7.17","msg":"deconfigured","service":"kubelet"}
{"level":"info","ts":1632916734.8733184,"logger":"wc 192.168.7.17","msg":"removing directories"}
{"level":"info","ts":1632916735.4904935,"logger":"wc 192.168.7.17","msg":"removing HNS networks"}
{"level":"info","ts":1632916924.5720427,"logger":"nc 192.168.7.17","msg":"instance has been deconfigured","node":"worker-1"}
{"level":"info","ts":1632916924.6041753,"logger":"metrics","msg":"Prometheus configured","endpoints":"windows-exporter","port":9182,"name":"metrics"}
{"level":"info","ts":1632916924.6054258,"logger":"controllers.configmap","msg":"processing","instances in":"windows-instances"}
{"level":"info","ts":1632916924.6281445,"logger":"metrics","msg":"Prometheus configured","endpoints":"windows-exporter","port":9182,"name":"metrics"}
resize qcow2 disk
https://computingforgeeks.com/how-to-extend-increase-kvm-virtual-machine-disk-size/
qemu-img info /data/nvme/ocp4-windows.qcow2
# image: /data/nvme/ocp4-windows.qcow2
# file format: qcow2
# virtual size: 50 GiB (53687091200 bytes)
# disk size: 43.3 GiB
# cluster_size: 65536
# Format specific information:
# compat: 1.1
# lazy refcounts: true
# refcount bits: 16
# corrupt: false
qemu-img resize /data/nvme/ocp4-windows.qcow2 +20G
# Image resized.
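扩容完成后,可以再查一次镜像信息,确认 virtual size 已经变大,下面是一个示意:
qemu-img info /data/nvme/ocp4-windows.qcow2 | grep 'virtual size'
# 此时 virtual size 应该显示为 70 GiB 左右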
windows workload
似乎现在的 docker for windows 并不支持给 mcr.microsoft.com 做镜像代理,只能配置一个proxy,这个太讨厌了,等以后迁移到 podman 或者 containerd 吧。所以我们现在基本上属于联网或者半联网的部署模式。
# pod pause的镜像
# mcr.microsoft.com/oss/kubernetes/pause:3.4.1
# 创建runtime class
cat << EOF > /data/install/win-runtime.yaml
apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
name: runtime-class-win10
handler: 'docker'
scheduling:
nodeSelector:
kubernetes.io/os: 'windows'
kubernetes.io/arch: 'amd64'
node.kubernetes.io/windows-build: '10.0.19042'
tolerations:
- effect: NoSchedule
key: os
operator: Equal
value: "Windows"
EOF
oc create -f /data/install/win-runtime.yaml
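创建完以后,可以确认一下 runtime class 是否就绪,下面是一个示意:
oc get runtimeclass
oc describe runtimeclass runtime-class-win10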
# https://hub.docker.com/_/microsoft-windows
# mcr.microsoft.com/windows:20H2
cat << 'EOF' > /data/install/win-dep.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: win-webserver
name: win-webserver
spec:
selector:
matchLabels:
app: win-webserver
replicas: 1
template:
metadata:
labels:
app: win-webserver
name: win-webserver
spec:
tolerations:
- key: "os"
value: "Windows"
effect: "NoSchedule"
containers:
- name: windowswebserver
image: mcr.microsoft.com/windows:20H2
imagePullPolicy: IfNotPresent
command:
- powershell.exe
- -command
- $listener = New-Object System.Net.HttpListener; $listener.Prefixes.Add('http://*:80/'); $listener.Start();Write-Host('Listening at http://*:80/'); while ($listener.IsListening) { $context = $listener.GetContext(); $response = $context.Response; $content='<html><body><H1>Red Hat OpenShift + Windows Container Workloads</H1></body></html>'; $buffer = [System.Text.Encoding]::UTF8.GetBytes($content); $response.ContentLength64 = $buffer.Length; $response.OutputStream.Write($buffer, 0, $buffer.Length); $response.Close(); };
securityContext:
windowsOptions:
runAsUserName: "ContainerAdministrator"
nodeSelector:
beta.kubernetes.io/os: windows
EOF
oc create -f /data/install/win-dep.yaml
# to restore
oc delete -f /data/install/win-dep.yaml
cat << EOF > /data/install/win-svc.yaml
---
apiVersion: v1
kind: Service
metadata:
name: win-webserver
labels:
app: win-webserver
spec:
ports:
# the port that this service should serve on
- port: 80
targetPort: 80
selector:
app: win-webserver
---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
name: win-webserver
spec:
port:
targetPort: 80
to:
kind: Service
name: win-webserver
---
EOF
oc create -f /data/install/win-svc.yaml
# try windows server core, if you run on windows server
# otherwise, it will fail, saying the os does not match the host:
# "The container operating system does not match the host operating system."
# https://hub.docker.com/_/microsoft-windows-servercore
# mcr.microsoft.com/windows/servercore:20H2
cat << EOF > /data/install/test-pod.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: mypod
labels:
app: mypod
spec:
replicas: 1
selector:
matchLabels:
app: mypod
template:
metadata:
labels:
app: mypod
spec:
containers:
- name: mypod
image: quay.io/wangzheng422/qimgs:centos7-test
command:
- sleep
- infinity
EOF
oc create -f /data/install/test-pod.yaml
oc get all
# NAME READY STATUS RESTARTS AGE
# pod/mypod-6b8b7b46cb-rrfmd 1/1 Running 1 21h
# pod/win-webserver-9f98c76d4-8nb2q 1/1 Running 0 110s
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# service/kubernetes ClusterIP 172.30.0.1 <none> 443/TCP 26h
# service/openshift ExternalName <none> kubernetes.default.svc.cluster.local <none> 25h
# service/win-webserver ClusterIP 172.30.240.75 <none> 80/TCP 21h
# NAME READY UP-TO-DATE AVAILABLE AGE
# deployment.apps/mypod 1/1 1 1 21h
# deployment.apps/win-webserver 1/1 1 1 110s
# NAME DESIRED CURRENT READY AGE
# replicaset.apps/mypod-6b8b7b46cb 1 1 1 21h
# replicaset.apps/win-webserver-9f98c76d4 1 1 1 110s
# NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
# route.route.openshift.io/win-webserver win-webserver-default.apps.ocp4.redhat.ren win-webserver 80 None
curl win-webserver-default.apps.ocp4.redhat.ren && echo
# <html><body><H1>Red Hat OpenShift + Windows Container Workloads</H1></body></html>
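也可以顺手确认一下 windows 负载确实被调度到了 windows 节点上,下面是一个示意:
oc get pod -o wide -l app=win-webserver
# NODE 一列应该显示 worker-1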
oc exec -it pod/win-webserver-9f98c76d4-8nb2q -- cmd
Microsoft Windows [Version 10.0.19042.1237]
(c) Microsoft Corporation. All rights reserved.
C:\>tasklist
Image Name PID Session Name Session# Mem Usage
========================= ======== ================ =========== ============
System Idle Process 0 0 8 K
System 4 0 148 K
smss.exe 9992 0 1,760 K
csrss.exe 6788 Services 3 4,524 K
wininit.exe 7096 Services 3 5,260 K
services.exe 6456 Services 3 6,668 K
lsass.exe 3324 Services 3 12,536 K
fontdrvhost.exe 5736 Services 3 2,860 K
svchost.exe 4948 Services 3 12,896 K
svchost.exe 6960 Services 3 8,180 K
svchost.exe 3332 Services 3 16,952 K
svchost.exe 756 Services 3 53,864 K
svchost.exe 5924 Services 3 9,728 K
svchost.exe 6412 Services 3 8,012 K
svchost.exe 5628 Services 3 6,740 K
svchost.exe 9488 Services 3 4,688 K
svchost.exe 8912 Services 3 12,896 K
CExecSvc.exe 5616 Services 3 4,020 K
svchost.exe 5916 Services 3 28,600 K
svchost.exe 2780 Services 3 4,404 K
powershell.exe 2816 Services 3 78,156 K
CompatTelRunner.exe 3056 Services 3 2,852 K
svchost.exe 9412 Services 3 11,104 K
conhost.exe 7748 Services 3 10,824 K
svchost.exe 3636 Services 3 7,404 K
conhost.exe 1288 Services 3 3,800 K
cmd.exe 5112 Services 3 2,884 K
svchost.exe 4492 Services 3 8,900 K
MicrosoftEdgeUpdate.exe 8808 Services 3 1,760 K
svchost.exe 7612 Services 3 10,112 K
conhost.exe 4944 Services 3 5,176 K
cmd.exe 9848 Services 3 5,140 K
MoUsoCoreWorker.exe 3016 Services 3 17,220 K
WmiPrvSE.exe 7924 Services 3 9,340 K
WmiPrvSE.exe 5976 Services 3 9,384 K
spoolsv.exe 6204 Services 3 6,580 K
conhost.exe 6184 Services 3 5,208 K
cmd.exe 5680 Services 3 4,428 K
tasklist.exe 8424 Services 3 8,812 K
在win10上,我们能从docker界面上,看到有2个container启动了。
同样,在docker界面上,我们能看到他下载了2个镜像,并且正在使用中。
排错
如果发现有异常,首先要做的是,查看kubelet, kubeproxy, hybrid-overlay-node 这3个服务,是不是还在运行,当前的版本,似乎这几个服务,很容易崩溃。
之后,就是看看默认网卡的ipv4配置,是否被禁用了,估计未来兼容性好了,就不用操心这个了。
# on windows cmd
netsh interface dump
openshift 4.9 静态IP 半离线 baremetal 安装,包含SNO(single node openshift)
安装过程视频
本文描述ocp4.9在baremetal(kvm模拟)上面,静态ip安装的方法。包括operator hub步骤。
离线安装包下载
ocp4的离线安装包下载和3.11不太一样,按照如下方式准备。另外,由于默认的baremetal安装需要dhcp, pxe环境,所以需要准备一个工具机,上面有dhcp, tftp, haproxy等工具;另外为了方便项目现场工作,还准备了ignition文件的修改工具,所以离线安装包里需要一些其他第三方的工具。
https://github.com/wangzheng422/ocp4-upi-helpernode 这个工具,是创建工具机用的。
https://github.com/wangzheng422/filetranspiler 这个工具,是修改ignition文件用的。
打包好的安装包,在这里下载,百度盘下载链接,版本是 4.9.12 :
- 4.9.12
- 链接: https://pan.baidu.com/s/1Wj5MUBLMFli1kOit1eafug 提取码: ur8r
其中包括如下类型的文件:
- ocp4.tgz 这个文件包含了iso等安装介质,以及各种安装脚本,全部下载的镜像列表等。需要复制到宿主机,以及工具机上去。
- registry.tgz 这个文件也是docker image registry的仓库打包文件。需要先补充镜像的话,按照这里操作: 4.6.add.image.md
合并这些切分文件,使用类似如下的命令
cat registry.?? > registry.tgz
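合并完成后,建议先验证一下压缩包的完整性,避免传输过程中损坏,下面是一个示意:
pigz -t registry.tgz && echo "registry.tgz is ok"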
在外网云主机上面准备离线安装源
准备离线安装介质的文档,已经转移到了这里:4.9.build.dist.md
宿主机准备
本次实验,是在一个32C, 256G 的主机上面,用很多个虚拟机安装测试。所以先准备这个宿主机。
如果是多台宿主机,记得一定要调整时间配置,让这些宿主机的时间基本一致,否则证书会出问题。
主要的准备工作有
- 配置yum源
- 配置dns
- 安装镜像仓库
- 配置vnc环境
- 配置kvm需要的网络
- 创建helper kvm
- 配置一个haproxy,从外部导入流量给kvm
以上准备工作,dns部分需要根据实际项目环境有所调整。
本次的宿主机是两台rocky linux
kvm host 101
# 因为是半离线,我们的host os还有helper os是联网的,那么我们就用在线的源吧。
# dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
dnf install -y epel-release
dnf install -y byobu htop dstat
# 准备vnc环境
vncpasswd
cat << EOF > ~/.vnc/config
session=gnome
securitytypes=vncauth,tlsvnc
# desktop=sandbox
geometry=1280x800
alwaysshared
EOF
cat << EOF >> /etc/tigervnc/vncserver.users
:1=root
EOF
# systemctl disable vncserver@:1
systemctl start vncserver@:1
# 如果你想停掉vnc server,这么做
systemctl stop vncserver@:1
/usr/libexec/vncsession-start :1
# 配置kvm环境
dnf -y groupinstall "Server with GUI"
dnf -y install qemu-kvm libvirt libguestfs-tools virt-install virt-viewer virt-manager tigervnc-server
systemctl disable --now firewalld
systemctl enable --now libvirtd
# 创建实验用虚拟网络
mkdir -p /data/kvm
cat << 'EOF' > /data/kvm/bridge.sh
#!/usr/bin/env bash
PUB_CONN='eno1'
PUB_IP='172.21.6.104/24'
PUB_GW='172.21.6.254'
PUB_DNS='172.21.1.1'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down baremetal
nmcli con delete baremetal
# RHEL 8.1 appends the word "System" in front of the connection,delete in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname baremetal type bridge con-name baremetal ipv4.method 'manual' \
ipv4.address "$PUB_IP" \
ipv4.gateway "$PUB_GW" \
ipv4.dns "$PUB_DNS"
nmcli con add type bridge-slave ifname "$PUB_CONN" master baremetal
nmcli con down "$PUB_CONN";pkill dhclient;dhclient baremetal
nmcli con up baremetal
EOF
bash /data/kvm/bridge.sh
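网桥创建完以后,可以确认一下 baremetal 这个 bridge 的状态和 IP 是否符合预期,下面是一个示意:
nmcli con show
ip addr show baremetal
bridge link show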
# 创建工具机
mkdir -p /data/kvm
cd /data/kvm
osinfo-query os | grep rhel8
# rhel8-unknown | Red Hat Enterprise Linux 8 Unknown | 8-unknown | http://redhat.com/rhel/8-unknown
# rhel8.0 | Red Hat Enterprise Linux 8.0 | 8.0 | http://redhat.com/rhel/8.0
# rhel8.1 | Red Hat Enterprise Linux 8.1 | 8.1 | http://redhat.com/rhel/8.1
# rhel8.2 | Red Hat Enterprise Linux 8.2 | 8.2 | http://redhat.com/rhel/8.2
# rhel8.3 | Red Hat Enterprise Linux 8.3 | 8.3 | http://redhat.com/rhel/8.3
# rhel8.4 | Red Hat Enterprise Linux 8.4 | 8.4 | http://redhat.com/rhel/8.4
wget https://mirrors.sjtug.sjtu.edu.cn/rocky/8.4/isos/x86_64/Rocky-8.4-x86_64-minimal.iso
# lvremove -f rhel/data
lvcreate -y -l 100%FREE -n data nvme
mkfs.xfs /dev/nvme/data
mkdir -p /data/nvme
mount /dev/nvme/data /data/nvme
cat << EOF >> /etc/fstab
/dev/nvme/data /data/nvme xfs defaults 0 0
EOF
cd /data/kvm
wget https://mirrors.sjtug.sjtu.edu.cn/rocky/8.4/isos/x86_64/Rocky-8.4-x86_64-minimal.iso
export http_proxy="http://192.168.195.54:5085"
export https_proxy=${http_proxy}
wget -O helper-ks-rhel8.cfg https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.9/scripts/helper-ks-rocky.cfg
unset http_proxy
unset https_proxy
sed -i '0,/^network.*/s/^network.*/network --bootproto=static --device=enp1s0 --gateway=192.168.7.1 --ip=192.168.7.71 --netmask=255.255.255.0 --nameserver=192.168.7.71 --ipv6=auto --activate/' helper-ks-rhel8.cfg
# https://stackoverflow.com/questions/18620153/find-matching-text-and-replace-next-line
sed -i '/^network.*/{n;s/^network.*/network --hostname=sno-helper/}' helper-ks-rhel8.cfg
export KVM_DIRECTORY=/home/data/kvm
virt-install --name="sno-aHelper" --vcpus=2 --ram=4096 \
--cpu=host-model \
--disk path=${KVM_DIRECTORY}/sno-aHelper.qcow2,bus=virtio,size=20 \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59200 \
--boot menu=on \
--location ${KVM_DIRECTORY}/rhel-8.3-x86_64-dvd.iso \
--disk ${KVM_DIRECTORY}/rhel-8.3-x86_64-dvd.iso,device=cdrom \
--initrd-inject helper-ks-rhel8.cfg --extra-args "inst.ks=file:/helper-ks-rhel8.cfg"
# virt-viewer --domain-name ocp4-aHelper
# virsh start ocp4-aHelper
# virsh list --all
# start chrony/ntp server on host
# cat << EOF > /etc/chrony.conf
# driftfile /var/lib/chrony/drift
# makestep 1.0 3
# rtcsync
# allow 192.0.0.0/8
# local stratum 10
# logdir /var/log/chrony
# EOF
# echo "allow 192.0.0.0/8" >> /etc/chrony.conf
# systemctl enable --now chronyd
# # systemctl restart chronyd
# chronyc tracking
# chronyc sources -v
# chronyc sourcestats -v
# chronyc makestep
工具机准备
以下是在工具机里面,进行的安装操作。
主要的操作有
- 配置yum源
- 运行ansible脚本,自动配置工具机
- 上传定制的安装配置文件
- 生成ignition文件
export YUMIP="192.168.7.1"
cat << EOF > /etc/yum.repos.d/remote.repo
[remote-ftp]
name=ftp
baseurl=ftp://${YUMIP}/
enabled=1
gpgcheck=0
EOF
# dnf install -y epel-release
# dnf install -y byobu
dnf update -y
reboot
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
systemctl restart sshd
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
echo "allow 192.0.0.0/8" >> /etc/chrony.conf
systemctl enable --now chronyd
# systemctl restart chronyd
chronyc tracking
chronyc sources -v
chronyc sourcestats -v
chronyc makestep
# nmcli con mod enp1s0 +ipv4.addresses "192.168.7.71/24"
# nmcli con up enp1s0
dnf -y install ansible git unzip podman python3 buildah skopeo
mkdir -p /data/ocp4/
# scp ocp4.tgz to /data
# scp * root@172.21.6.11:/data/
cd /data
tar zvxf ocp.*.tgz
tar zvxf registry.*.tgz
cd /data/ocp4
rm -f /data/*.tgz
# 配置registry
mkdir -p /etc/crts/ && cd /etc/crts
# https://access.redhat.com/documentation/en-us/red_hat_codeready_workspaces/2.1/html/installation_guide/installing-codeready-workspaces-in-tls-mode-with-self-signed-certificates_crw
openssl genrsa -out /etc/crts/redhat.ren.ca.key 4096
openssl req -x509 \
-new -nodes \
-key /etc/crts/redhat.ren.ca.key \
-sha256 \
-days 36500 \
-out /etc/crts/redhat.ren.ca.crt \
-subj /CN="Local Red Hat Ren Signer" \
-reqexts SAN \
-extensions SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf '[SAN]\nbasicConstraints=critical, CA:TRUE\nkeyUsage=keyCertSign, cRLSign, digitalSignature'))
openssl genrsa -out /etc/crts/redhat.ren.key 2048
openssl req -new -sha256 \
-key /etc/crts/redhat.ren.key \
-subj "/O=Local Red Hat Ren /CN=*.ocp4.redhat.ren" \
-reqexts SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf "\n[SAN]\nsubjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth")) \
-out /etc/crts/redhat.ren.csr
openssl x509 \
-req \
-sha256 \
-extfile <(printf "subjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth") \
-days 365 \
-in /etc/crts/redhat.ren.csr \
-CA /etc/crts/redhat.ren.ca.crt \
-CAkey /etc/crts/redhat.ren.ca.key \
-CAcreateserial -out /etc/crts/redhat.ren.crt
openssl x509 -in /etc/crts/redhat.ren.crt -text
/bin/cp -f /etc/crts/redhat.ren.ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
cd /data
# mkdir -p /data/registry
# tar zxf registry.tgz
dnf -y install podman pigz skopeo jq
# pigz -dc registry.tgz | tar xf -
cd /data/ocp4
podman load -i /data/ocp4/registry.tgz
systemctl disable --now firewalld
podman run --name local-registry -p 5443:5443 \
-d --restart=always \
-v /home/ocp.4.9.5/registry/:/var/lib/registry:z \
-v /etc/crts:/certs:z \
-e REGISTRY_HTTP_ADDR=0.0.0.0:5443 \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/redhat.ren.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/redhat.ren.key \
docker.io/library/registry:2
podman start local-registry
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/building_running_and_managing_containers/assembly_porting-containers-to-systemd-using-podman_building-running-and-managing-containers
# podman generate systemd --new --files --name local-registry
podman generate systemd --files --name local-registry
# /root/container-local-registry.service
cp -Z container-local-registry.service /usr/lib/systemd/system
systemctl enable --now container-local-registry.service
systemctl status container-local-registry.service
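registry 服务起来以后,可以用 curl 验证一下证书和服务是否正常,下面是一个示意(假设本机能把 registry.ocp4.redhat.ren 解析到自己):
curl -s --cacert /etc/crts/redhat.ren.ca.crt https://registry.ocp4.redhat.ren:5443/v2/_catalog | jq .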
# podman rm --storage 7cb9fcea76ad384313a682a469be6784786eb5004a190ad2abe68978b1566416
# firewall-cmd --permanent --add-port=5443/tcp
# firewall-cmd --reload
# 加载更多的镜像
# 解压缩 ocp4.tgz
# bash add.image.load.sh /data/4.6.5/install.image 'registry.ocp4.redhat.ren:5443'
# https://github.com/christianh814/ocp4-upi-helpernode/blob/master/docs/quickstart.md
# in helper node
# mkdir /etc/yum.repos.d.bak
# mv /etc/yum.repos.d/* /etc/yum.repos.d.bak/
# cat << EOF > /etc/yum.repos.d/remote.repo
# [remote]
# name=RHEL FTP
# baseurl=ftp://192.168.7.1/data
# enabled=1
# gpgcheck=0
# EOF
# yum clean all
# yum repolist
# 这里使用了一个ansible的项目,用来部署helper节点的服务。
# https://github.com/wangzheng422/ocp4-upi-helpernode
cd /data/ocp4
unzip ocp4-upi-helpernode.zip
# 这里使用了一个ignition文件合并的项目,用来帮助自定义ignition文件。
# https://github.com/wangzheng422/filetranspiler
# podman load -i filetranspiler.tgz
# podman load -i nexus.tgz
ssh-keygen
# 接下来,我们使用ansible来配置helper节点,装上各种openshift集群需要的服务
# 根据现场环境,修改 ocp4-upi-helpernode-master/vars-static.yaml
# 主要是修改各个节点的网卡和硬盘参数,还有IP地址
cd /data/ocp4/ocp4-upi-helpernode-master
# ansible-playbook -e @vars-static.yaml -e '{staticips: true}' tasks/main.yml
cat << 'EOF' > /data/ocp4/ocp4-upi-helpernode-master/vars.yaml
---
ocp_version: 4.9.5
ssh_gen_key: false
staticips: true
firewalld: false
dns_forward: yes
iso:
iso_dl_url: "/data/ocp4/rhcos-live.x86_64.iso"
my_iso: "rhcos-live.iso" # this is internal file, just leave as it.
helper:
name: "helper"
ipaddr: "192.168.7.71"
networkifacename: "enp1s0"
gateway: "192.168.7.1"
netmask: "255.255.255.0"
dns:
domain: "redhat.ren"
clusterid: "ocp4s"
forwarder1: "202.106.0.20"
forwarder2: "202.106.0.20"
bootstrap:
name: "bootstrap"
ipaddr: "192.168.7.72"
interface: "enp1s0"
install_drive: "vda"
manual: false
masters:
- name: "master-0"
ipaddr: "192.168.7.73"
interface: "enp103s0f1"
install_drive: "sda"
disable_interfaces:
- interface: "enp3s0"
ipaddr: "10.44.44.44"
- interface: "enp4s0"
ipaddr: "10.44.44.45"
- interface: "enp103s0f0"
ipaddr: "10.44.44.46"
manual: false
# - name: "master-1"
# ipaddr: "192.168.7.14"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "master-2"
# ipaddr: "192.168.7.15"
# interface: "enp1s0"
# install_drive: "vda"
# workers:
# - name: "worker-0"
# ipaddr: "192.168.7.16"
# interface: "ens3f0"
# install_drive: "sda"
# - name: "worker-1"
# ipaddr: "192.168.7.17"
# interface: "enp1s0"
# install_drive: "sda"
# - name: "worker-2"
# ipaddr: "192.168.7.18"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "infra-0"
# ipaddr: "192.168.7.19"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "infra-1"
# ipaddr: "192.168.7.20"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "worker-3"
# ipaddr: "192.168.7.21"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "worker-4"
# ipaddr: "192.168.7.22"
# interface: "enp1s0"
# install_drive: "vda"
others:
- name: "registry"
ipaddr: "192.168.7.1"
- name: "yum"
ipaddr: "192.168.7.1"
- name: "quay"
ipaddr: "192.168.7.11"
- name: "nexus"
ipaddr: "192.168.7.1"
- name: "git"
ipaddr: "192.168.7.11"
otherdomains:
- domain: "rhv.redhat.ren"
hosts:
- name: "manager"
ipaddr: "192.168.7.71"
- name: "rhv01"
ipaddr: "192.168.7.72"
- domain: "others.redhat.ren"
hosts:
- name: "*"
ipaddr: "192.168.7.71"
- name: "*.apps"
ipaddr: "192.168.7.71"
- domain: "ocp4.redhat.ren"
hosts:
- name: "registry"
ipaddr: "192.168.7.1"
- name: "yum"
ipaddr: "192.168.7.1"
- name: "quay"
ipaddr: "192.168.7.11"
- name: "nexus"
ipaddr: "192.168.7.1"
- name: "git"
ipaddr: "192.168.7.11"
force_ocp_download: false
remove_old_config_files: false
ocp_client: "file:///data/ocp4/{{ ocp_version }}/openshift-client-linux-{{ ocp_version }}.tar.gz"
ocp_installer: "file:///data/ocp4/{{ ocp_version }}/openshift-install-linux-{{ ocp_version }}.tar.gz"
ocp_bios: "file:///data/ocp4/rhcos-metal.x86_64.raw.gz"
ppc64le: false
arch: 'x86_64'
chronyconfig:
enabled: true
content:
- server: "192.168.7.1"
options: iburst
setup_registry: # don't worry about this, just leave it here
deploy: false
registry_image: docker.io/library/registry:2
local_repo: "ocp4/openshift4"
product_repo: "openshift-release-dev"
release_name: "ocp-release"
release_tag: "4.6.1-x86_64"
ocp_filetranspiler: "file:///data/ocp4/filetranspiler.tgz"
registry_server: "registry.ocp4.redhat.ren:5443"
EOF
ansible-playbook -e @vars.yaml tasks/main.yml
# try this:
/usr/local/bin/helpernodecheck
mkdir -p /data/install
# # GOTO image registry host
# # copy crt files to helper node
# scp /etc/crts/redhat.ren.ca.crt root@192.168.7.11:/data/install/
# scp /etc/crts/redhat.ren.crt root@192.168.7.11:/data/install/
# scp /etc/crts/redhat.ren.key root@192.168.7.11:/data/install/
# GO back to help node
# /bin/cp -f /data/install/redhat.ren.crt /etc/pki/ca-trust/source/anchors/
# update-ca-trust extract
# 定制ignition
cd /data/install
# 根据现场环境,修改 install-config.yaml
# 至少要修改ssh key, 还有 additionalTrustBundle,这个是镜像仓库的csr
# vi install-config.yaml
cat << EOF > /data/install/install-config.yaml
apiVersion: v1
baseDomain: redhat.ren
compute:
- hyperthreading: Enabled
name: worker
replicas: 0
controlPlane:
hyperthreading: Enabled
name: master
replicas: 1
metadata:
name: ocp4s
networking:
clusterNetworks:
- cidr: 10.128.0.0/14
hostPrefix: 23
networkType: OpenShiftSDN
serviceNetwork:
- 172.30.0.0/16
platform:
none: {}
pullSecret: '{"auths":{"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ppa.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"}}}'
sshKey: |
$( cat /root/.ssh/id_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /etc/crts/redhat.ren.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- registry.ocp4.redhat.ren:5443/ocp4/openshift4
- registry.ocp4.redhat.ren:5443/ocp4/release
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- registry.ocp4.redhat.ren:5443/ocp4/openshift4
- registry.ocp4.redhat.ren:5443/ocp4/release
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF
cd /data/install/
/bin/rm -rf *.ign .openshift_install_state.json auth bootstrap manifests master*[0-9] worker*[0-9]
openshift-install create manifests --dir=/data/install
# copy ntp related config
/bin/cp -f /data/ocp4/ocp4-upi-helpernode-master/machineconfig/* /data/install/openshift/
# copy image registry proxy related config
cd /data/ocp4
bash image.registries.conf.sh nexus.ocp4.redhat.ren:8083
/bin/cp -f /data/ocp4/image.registries.conf /etc/containers/registries.conf.d/
/bin/cp -f /data/ocp4/99-worker-container-registries.yaml /data/install/openshift
/bin/cp -f /data/ocp4/99-master-container-registries.yaml /data/install/openshift
cd /data/install/
openshift-install create ignition-configs --dir=/data/install
cd /data/ocp4/ocp4-upi-helpernode-master
# 我们来为每个主机,复制自己版本的ign,并复制到web server的目录下
ansible-playbook -e @vars.yaml tasks/ign.yml
# 如果对每个主机有自己ign的独特需求,在这一步,去修改ign。
# 以下操作本来是想设置网卡地址,但是实践发现是不需要的。
# 保留在这里,是因为他可以在安装的时候注入文件,非常有用。
# mkdir -p bootstrap/etc/sysconfig/network-scripts/
# cat <<EOF > bootstrap/etc/sysconfig/network-scripts/ifcfg-ens3
# DEVICE=ens3
# BOOTPROTO=none
# ONBOOT=yes
# IPADDR=192.168.7.12
# NETMASK=255.255.255.0
# GATEWAY=192.168.7.1
# DNS=192.168.7.11
# DNS1=192.168.7.11
# DNS2=192.168.7.1
# DOMAIN=redhat.ren
# PREFIX=24
# DEFROUTE=yes
# IPV6INIT=no
# EOF
# filetranspiler -i bootstrap.ign -f bootstrap -o bootstrap-static.ign
# /bin/cp -f bootstrap-static.ign /var/www/html/ignition/
# 我们为每个节点创建各自的iso文件
cd /data/ocp4/ocp4-upi-helpernode-master
ansible-playbook -e @vars.yaml tasks/iso.yml
# ansible-playbook -e @vars.yaml tasks/iso.small.yml
# if boot using live-iso, you need to run this cmd during install
nmtui
coreos-installer install --copy-network \
--ignition-url=http://192.168.7.71:8080/ignition/master-0.ign --insecure-ignition --image-url=http://192.168.7.71:8080/install/bios.raw.gz --insecure /dev/sda
回到宿主机
本来,到了这一步,就可以开始安装了,但是我们知道coreos安装的时候,要手动输入很长的命令行,实际操作的时候很难一次输对,输错一个字符,安装就失败,只能重启重新输入。
为了避免这种繁琐的操作,参考网上的做法,我们就需要为每个主机定制iso了。好在,之前的步骤,我们已经用ansible创建了需要的iso,我们把这些iso复制到宿主机上,就可以继续了。
这里面有一个坑,我们是不知道主机网卡名称的,只能先用coreos iso安装启动一次,进入单用户模式以后,用 ip a 查看一下才能知道,一般来说是ens3。
另外,如果是安装物理机,disk是哪个,也需要上述的方法,来看看具体的盘符。另外,推荐在物理机上安装rhel 8 来测试一下物理机是不是支持coreos。物理机安装的时候,遇到不写盘的问题,可以尝试添加启动参数: ignition.firstboot=1
# on kvm host 172.21.6.101
export KVM_DIRECTORY=/home/data/kvm
mkdir -p ${KVM_DIRECTORY}
cd ${KVM_DIRECTORY}
scp root@192.168.7.71:/data/install/{*boot*,*master-0,*worker-0}.iso ${KVM_DIRECTORY}/
# on kvm host 172.21.6.101
# finally, we can start install :)
# 你可以一口气把虚拟机都创建了,然后喝咖啡等着。
# 从这一步开始,到安装完毕,大概30分钟。
# export KVM_DIRECTORY=/data/kvm
virt-install --name=sno-bootstrap --vcpus=4 --ram=8192 \
--disk path=${KVM_DIRECTORY}/ocp4-bootstrap.qcow2,bus=virtio,size=30 \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59101 \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-bootstrap.iso
# 想登录进coreos一探究竟?那么这么做
# ssh core@bootstrap
# journalctl -b -f -u bootkube.service
# export KVM_DIRECTORY=/data/kvm
virt-install --name=sno-master-0 --vcpus=16 --ram=49152 \
--cpu=host-model \
--disk path=${KVM_DIRECTORY}/ocp4-master-0.qcow2,bus=virtio,size=120 \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59002 \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-master-0.iso
# virt-install --name=ocp4-master-1 --vcpus=10 --ram=20480 \
# --cpu=host-model \
# --disk path=/data/nvme/ocp4-master-1.qcow2,bus=virtio,size=120 \
# --os-variant rhel8.4 --network bridge=baremetal,model=virtio \
# --graphics vnc,port=59003 \
# --boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-master-1.iso
# # ssh core@192.168.7.13
# # on kvm host 172.21.6.103
# export KVM_DIRECTORY=/data/kvm
# virt-install --name=ocp4-master-2 --vcpus=22 --ram=30720 \
# --cpu=host-model \
# --disk path=/data/kvm/ocp4-master-2.qcow2,bus=virtio,size=120 \
# --os-variant rhel8.4 --network bridge=baremetal,model=virtio \
# --graphics vnc,port=59004 \
# --boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-master-2.iso
# on kvm host 172.21.6.104
export KVM_DIRECTORY=/data/kvm
# virt-install --name=ocp4-worker-0 --vcpus=4 --ram=10240 \
# --disk path=/data/kvm/ocp4-worker-0.qcow2,bus=virtio,size=120 \
# --os-variant rhel8.4 --network bridge=baremetal,model=virtio \
# --graphics vnc,port=59005 \
# --boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-worker-0.iso
# if install on baremetal
nmtui
coreos-installer install --copy-network --ignition-url=http://192.168.7.71:8080/ignition/master-0.ign --insecure-ignition /dev/sda
# on workstation
# open http://192.168.7.11:9000/
# to check
# if you want to stop or delete vm, try this
virsh list --all
virsh destroy ocp4-bootstrap
virsh destroy ocp4-master-0
# virsh destroy ocp4-master-1
# virsh destroy ocp4-master-2
# virsh destroy ocp4-worker0
# virsh destroy ocp4-worker1
# virsh destroy ocp4-worker2
virsh undefine ocp4-bootstrap --remove-all-storage
virsh undefine ocp4-master-0 --remove-all-storage
# virsh undefine ocp4-master-1 --remove-all-storage
# virsh undefine ocp4-master-2 --remove-all-storage
# virsh undefine ocp4-worker0
# virsh undefine ocp4-worker1
# virsh undefine ocp4-worker2
在工具机上面
这个时候,安装已经自动开始了,我们只需要回到工具机上静静的观察就可以了。
在bootstrap和装master阶段,用这个命令看进度。
cd /data/install
export KUBECONFIG=/data/install/auth/kubeconfig
echo "export KUBECONFIG=/data/install/auth/kubeconfig" >> ~/.bashrc
oc completion bash | sudo tee /etc/bash_completion.d/openshift > /dev/null
cd /data/install
openshift-install wait-for bootstrap-complete --log-level debug
一切正常的话,会看到这个。
有时候证书会过期,验证方法是登录 bootstrap, 看看过期时间。如果确定过期,要清除所有的openshift-install生成配置文件的缓存,重新来过。
echo | openssl s_client -connect localhost:6443 | openssl x509 -noout -text | grep Not
一般来说,如果在openshift-install这一步之前,按照文档,删除了缓存文件,就不会出现过期的现象。
oc get nodes
这个时候,只能看到master,是因为worker的csr没有批准。如果虚拟机是一口气创建的,那么多半不会遇到下面的问题。
oc get csr
会发现有很多没有被批准的
批准之
yum -y install jq
oc get csr | grep -v Approved
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
# oc get csr -o name | xargs oc adm certificate approve
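如果 csr 会陆续出现,可以用下面这个循环反复批准,直到节点都 Ready。这只是个示意,间隔和退出条件请按需调整:
while true; do
  oc get csr -o json | jq -r '.items[] | select(.status == {}) | .metadata.name' \
    | xargs --no-run-if-empty oc adm certificate approve
  sleep 30
done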
然后worker 节点cpu飙高,之后就能看到worker了。
等一会,会看到这个,就对了。
上面的操作完成以后,就可以完成最后的安装了
cd /data/install
openshift-install wait-for install-complete --log-level debug
# here is the output
# INFO Install complete!
# INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.ocp4s.redhat.ren
# INFO Login to the console with user: "kubeadmin", and password: "pg4cY-hBERh-GrAmI-Srku5"
# INFO Time elapsed: 0s
# we are testing env, so we don't need ingress replicas.
oc patch --namespace=openshift-ingress-operator --patch='{"spec": {"replicas": 1}}' --type=merge ingresscontroller/default
# after install finish, delete bootstrap vm,
# and we need to fix the dns setting,
# remove them from helper to master-0
cd /data/ocp4/ocp4-upi-helpernode-master
ansible-playbook -e @vars.yaml tasks/sno.dns.yml
镜像仓库代理 / image registry proxy
准备离线镜像仓库非常麻烦,好在我们找到了一台在线的主机,那么我们可以使用nexus构造image registry proxy,在在线环境上面,做一遍PoC,然后就能通过image registry proxy得到离线镜像了
- https://mtijhof.wordpress.com/2018/07/23/using-nexus-oss-as-a-proxy-cache-for-docker-images/
#############################################
## build nexus docker image
# 开启 https
# https://blog.csdn.net/s7799653/article/details/105378645
# https://help.sonatype.com/repomanager3/system-configuration/configuring-ssl#ConfiguringSSL-InboundSSL-ConfiguringtoServeContentviaHTTPS
mkdir -p /data/install/tmp
cd /data/install/tmp
# 将证书导出成pkcs格式
# 这里需要输入密码 用 password,
# openssl pkcs12 -export -out keystore.pkcs12 -inkey /etc/crts/redhat.ren.key -in /etc/crts/redhat.ren.crt
# cat << EOF >> Dockerfile
# FROM docker.io/sonatype/nexus3:3.29.0
# USER root
# COPY keystore.pkcs12 /keystore.pkcs12
# RUN keytool -v -importkeystore -srckeystore keystore.pkcs12 -srcstoretype PKCS12 -destkeystore keystore.jks -deststoretype JKS -storepass password -srcstorepass password &&\
# cp keystore.jks /opt/sonatype/nexus/etc/ssl/
# USER nexus
# EOF
# buildah bud --format=docker -t quay.io/wangzheng422/qimgs:nexus3-3.29.0-wzh -f Dockerfile .
# buildah push quay.io/wangzheng422/qimgs:nexus3-3.29.0-wzh
#####################################################
# init build the nexus fs
# /bin/cp -f nexus-image.tgz /data/ccn/
# tar zxf nexus-image.tgz
# chown -R 200 /data/ccn/nexus-image
###################################################
## import nexus fs
mkdir -p /data/ccn
cd /data/ccn
podman create --name swap quay.io/wangzheng422/qimgs:nexus-fs-image-2021-05-08-1516 ls
podman cp swap:/nexus-image.tgz - > /data/ccn/nexus-image.tgz.tar
podman rm -fv swap
tar vxf nexus-image.tgz.tar
tar zvxf nexus-image.tgz
rm -f nexus-image.tgz*
chown -R 200 /data/ccn/nexus-image
#########################################
## run the nexus for image
# podman run -d -p 8082:8081 -p 8083:8083 -it --name nexus-image -v /data/ccn/nexus-image:/nexus-data:Z docker.io/sonatype/nexus3:3.30.1
podman run -d -p 8082:8081 -p 8083:8083 -it --name nexus-image -v /data/ccn/nexus-image:/nexus-data:Z docker.io/sonatype/nexus3:3.33.1
podman generate systemd --files --name nexus-image
# 生成的文件是当前目录下的 container-nexus-image.service
cp -Z container-nexus-image.service /usr/lib/systemd/system
systemctl enable --now container-nexus-image.service
systemctl status container-nexus-image.service
# podman run -d -p 8082:8081 -p 8083:8083 -it --name nexus-image -v /data/ccn/nexus-image:/nexus-data:Z quay.io/wangzheng422/qimgs:nexus3-3.29.0-wzh
podman stop nexus-image
podman rm nexus-image
# get the admin password
cat /data/ccn/nexus-image/admin.password && echo
# 84091bcd-c82f-44a3-8b7b-dfc90f5b7da1
# open http://nexus.ocp4.redhat.ren:8082
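nexus 起来以后,可以先简单验证一下端口服务(只是示意:8083 上的 docker 代理仓库需要事先在 nexus 界面里配置好,docker v2 接口返回 200 或 401 都说明服务在工作):
curl -I http://nexus.ocp4.redhat.ren:8082
curl http://nexus.ocp4.redhat.ren:8083/v2/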
######################################################
# go to helper, update proxy setting for ocp cluster
cd /data/ocp4
bash image.registries.conf.sh nexus.ocp4.redhat.ren:8083
# mkdir -p /etc/containers/registries.conf.d
# /bin/cp -f image.registries.conf /etc/containers/registries.conf.d/
cd /data/ocp4
oc apply -f ./99-worker-container-registries.yaml -n openshift-config
oc apply -f ./99-master-container-registries.yaml -n openshift-config
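image.registries.conf.sh 生成的,就是 containers-registries.conf v2 的 mirror 配置,大致是下面这种格式(只是示意,具体的仓库前缀和路径以脚本实际输出为准):
[[registry]]
  prefix = ""
  location = "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
  mirror-by-digest-only = true

  [[registry.mirror]]
    location = "nexus.ocp4.redhat.ren:8083/openshift-release-dev/ocp-v4.0-art-dev"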
######################################################
# dump the nexus image fs out
podman stop nexus-image
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
cd /data/ccn
tar cf - ./nexus-image | pigz -c > nexus-image.tgz
buildah from --name onbuild-container scratch
buildah copy onbuild-container nexus-image.tgz /
buildah umount onbuild-container
buildah commit --rm onbuild-container quay.io/wangzheng422/qimgs:nexus-fs-image-$var_date
# buildah rm onbuild-container
# rm -f nexus-image.tgz
buildah push quay.io/wangzheng422/qimgs:nexus-fs-image-$var_date
echo "quay.io/wangzheng422/qimgs:nexus-fs-image-$var_date"
# 以下这个版本,可以作为初始化的image proxy,里面包含了nfs provision,以及sample operator的metadata。很高兴地发现,image stream并不会完整下载镜像,好像只是下载metadata,真正用的时候,才去下载。
# quay.io/wangzheng422/qimgs:nexus-fs-image-2022-01-14-2155
配置镜像仓库的ca
安装过程里面,已经把镜像仓库的ca放进去了,但是好像image stream不认,让我们再试试
oc project openshift-config
oc create configmap ca.for.registry -n openshift-config \
--from-file=registry.ocp4.redhat.ren..5443=/etc/crts/redhat.ren.ca.crt
# --from-file=nexus.ocp4.redhat.ren..8083=/data/install/redhat.ren.ca.crt
oc patch image.config.openshift.io/cluster -p '{"spec":{"additionalTrustedCA":{"name":"ca.for.registry"}}}' --type=merge
oc patch image.config.openshift.io/cluster -p '{"spec":{"registrySources":{"insecureRegistries":["nexus.ocp4.redhat.ren:8083"]}}}' --type=merge
oc get image.config.openshift.io/cluster -o yaml
# openshift project下面的image stream重新加载一下吧
oc get is -o json | jq -r '.items[].metadata.name' | xargs -L1 oc import-image --all
配置internal registry
我们的工具机是带nfs的,那么就给internal registry配置高档一些的nfs存储吧,不要用emptydir
bash /data/ocp4/ocp4-upi-helpernode-master/files/nfs-provisioner-setup.sh
# oc edit configs.imageregistry.operator.openshift.io
# 修改 storage 部分
# storage:
# pvc:
# claim:
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Managed","storage":{"pvc":{"claim":""}}}}' --type=merge
# if you want to restore
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Removed"}}' --type=merge
oc get clusteroperator image-registry
oc get configs.imageregistry.operator.openshift.io cluster -o yaml
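除了 operator 的状态,还可以顺便看看 nfs 上的 pvc 和 registry pod 是否就绪(示意):
oc get pvc -n openshift-image-registry
oc get pod -n openshift-image-registry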
# 把imagepruner给停掉
# https://bugzilla.redhat.com/show_bug.cgi?id=1852501#c24
# oc patch imagepruner.imageregistry/cluster --patch '{"spec":{"suspend":true}}' --type=merge
# oc -n openshift-image-registry delete jobs --all
配置sample operator
openshift内置了一个sample operator,里面有一大堆红帽的产品。
oc get configs.samples.operator.openshift.io/cluster -o yaml
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Managed", "samplesRegistry": "nexus.ocp4.redhat.ren:8083"}}' --type=merge
# if you want to restore
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Unmanaged"}}' --type=merge
# if you want to get rid of the sample operator
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Removed"}}' --type=merge
chrony/NTP 设置
在 ocp 4.6 里面,需要设定ntp同步,我们之前ansible脚本,已经创建好了ntp的mco配置,把他打到系统里面就好了。
oc apply -f /data/ocp4/ocp4-upi-helpernode-master/machineconfig/
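如果手头没有 ansible 生成的配置,也可以参考下面的思路手工拼一个 chrony 的 MachineConfig(只是示意:文件名是随便起的,ntp server 这里假设是 helper 的 192.168.7.11,ignition 版本请按集群实际支持的版本调整,4.6 一般是 3.1.0):
cat << EOF > /data/install/99-worker-chrony-wzh.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-chrony-wzh
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - path: /etc/chrony.conf
        mode: 420
        overwrite: true
        contents:
          source: data:text/plain;charset=utf-8;base64,$(printf 'server 192.168.7.11 iburst\ndriftfile /var/lib/chrony/drift\nmakestep 1.0 3\nrtcsync\nlogdir /var/log/chrony\n' | base64 -w0)
EOF
oc apply -f /data/install/99-worker-chrony-wzh.yaml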
Operator Hub 离线安装
使用nexus作为image proxy以后,就不需要做这个离线操作了。有些情况下,我们可能还需要屏蔽掉默认的operator hub
oc patch OperatorHub cluster --type json \
-p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
oc get OperatorHub cluster -o yaml
oc get OperatorHub
# NAME AGE
# cluster 20h
给 openshift project image stream 打补丁
在有代理的网络环境中,我们需要给openshift project下的image stream打一些补丁。
cd /data/ocp4
bash is.patch.sh registry.ocp4.redhat.ren:5443/ocp4/openshift4
disable offical helm chart & enable helm proxy
我们是半离线环境,所以openshift4内置的官方helm chart是无法访问的,我们禁用之。
oc get HelmChartRepository
# NAME AGE
# redhat-helm-repo 19h
# oc patch HelmChartRepository redhat-helm-repo --type json \
# -p '[{"op": "add", "path": "/spec/disabled", "value": true}]'
oc patch --patch='{"spec": {"disabled": true}}' --type=merge HelmChartRepository/openshift-helm-charts
cat << EOF > /data/install/helm.ocp.yaml
apiVersion: helm.openshift.io/v1beta1
kind: HelmChartRepository
metadata:
name: openshift-helm-charts-wzh
spec:
# optional name that might be used by console
name: openshift-helm-charts-wzh
connectionConfig:
url: http://nexus.ocp4.redhat.ren:8082/repository/charts.openshift.io/
EOF
oc create -f /data/install/helm.ocp.yaml
给 router / ingress 更换证书
有时候,我们需要公网CA认证的证书,给router来用,那么我们就搞一下
https://docs.openshift.com/container-platform/4.6/security/certificates/replacing-default-ingress-certificate.html
mkdir -p /data/ccn/ingress-keys/etc
mkdir -p /data/ccn/ingress-keys/lib
cd /data/ccn/ingress-keys
podman run -it --rm --name certbot \
-v "/data/ccn/ingress-keys/etc:/etc/letsencrypt":Z \
-v "/data/ccn/ingress-keys/lib:/var/lib/letsencrypt":Z \
docker.io/certbot/certbot certonly -d "*.apps.ocp4.redhat.ren" --manual --preferred-challenges dns-01 --server https://acme-v02.api.letsencrypt.org/directory
cp ./etc/archive/apps.ocp4.redhat.ren/fullchain1.pem apps.ocp4.redhat.ren.crt
cp ./etc/archive/apps.ocp4.redhat.ren/privkey1.pem apps.ocp4.redhat.ren.key
ssh root@192.168.7.11 mkdir -p /data/install/ingress-key
scp apps.* root@192.168.7.11:/data/install/ingress-key
# on helper
cd /data/install/ingress-key
oc create secret tls wzh-ingress-key \
--cert=apps.ocp4.redhat.ren.crt \
--key=apps.ocp4.redhat.ren.key \
-n openshift-ingress
oc patch ingresscontroller.operator default \
--type=merge -p \
'{"spec":{"defaultCertificate": {"name": "wzh-ingress-key"}}}' \
-n openshift-ingress-operator
更改系统默认时区
https://access.redhat.com/solutions/5487331
cat << EOF > /data/install/timezone.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: worker
name: worker-custom-timezone-configuration
spec:
config:
ignition:
version: 2.2.0
systemd:
units:
- contents: |
[Unit]
Description=set timezone
After=network-online.target
[Service]
Type=oneshot
ExecStart=timedatectl set-timezone Asia/Shanghai
[Install]
WantedBy=multi-user.target
enabled: true
name: custom-timezone.service
EOF
oc create -f /data/install/timezone.yaml
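上面这个 MachineConfig 只作用于 worker 节点;如果希望 master(比如 SNO 场景)也生效,可以照葫芦画瓢复制一份,把里面的 worker 换成 master(示意):
sed 's/worker/master/g' /data/install/timezone.yaml > /data/install/timezone-master.yaml
oc create -f /data/install/timezone-master.yaml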
time sync between kvm and hosts
CHAPTER 8. KVM GUEST TIMING MANAGEMENT
echo ptp_kvm > /etc/modules-load.d/ptp_kvm.conf
echo "refclock PHC /dev/ptp0 poll 2" >> /etc/chrony.conf
systemctl restart chronyd
chronyc sources
# 210 Number of sources = 1
# MS Name/IP address Stratum Poll Reach LastRx Last sample
# ===============================================================================
# #* PHC0 0 2 377 5 -4784ns[ -62us] +/- 56us
排错技巧
# login to bootstrap to debug
# find the ip from kvm console
ssh -i ~/.ssh/helper_rsa core@192.168.7.75
journalctl -b -f -u release-image.service -u bootkube.service
journalctl -b -u release-image.service -u bootkube.service | grep -i baremetal
sudo -i
export KUBECONFIG=/etc/kubernetes/kubeconfig
oc get pod -n openshift-machine-api
oc get BareMetalHost -n openshift-machine-api
# debug why bootstrap can't be ping...
cat .openshift_install_state.json | jq '."*bootstrap.Bootstrap"'.Config.storage.files[].path
cat .openshift_install_state.json | jq -r '."*bootstrap.Bootstrap"'.File.Data | base64 -d | jq -r . > ign.json
cat .openshift_install_state.json | jq -r '."*bootstrap.Bootstrap".Config.storage.files[].contents.source ' | sed 's/.*base64,//g' | base64 -d > decode
cat .openshift_install_state.json | jq -r '."*bootstrap.Bootstrap".Config.storage.files[] | .path, .contents.source ' | while read -r line ; do if [[ $line =~ .*base64,.* ]]; then echo $(echo $line | sed 's/.*base64,//g' | base64 -d) ; else echo $line; fi; done > files
# https://serverfault.com/questions/329287/free-up-not-used-space-on-a-qcow2-image-file-on-kvm-qemu
virt-sparsify disk.img new-file.img
# https://access.redhat.com/solutions/24029
xz -dc < /boot/initrd-$(uname -r).img | cpio -idmv
openshift 4.10 离线 baremetal IPI (全自动)安装 单网络 静态IP模式
简介
本文描述ocp4.10在baremetal(kvm模拟)上面,IPI (全自动)安装。由于4.10支持nmstate,所以它原生支持静态IP安装了。
根据openshift文档,baremetal IPI安装有两种模式,一种是provisioning网络独立,另外一种是provisioning网络和baremetal(服务)网络合并的模式。考虑到POC现场的环境,本次实验,使用简单的网络部署,也就是合并的网络模式。
以下是本次实验的架构图:
注意:本文使用 single node (sno) 模式,并使用 IPI(全自动)安装,这种组合官方是不支持的。我们这么做,是为了后续的 ACM zero touch provision(ZTP)实验:ZTP 实验需要 ACM hub 集群以 IPI 模式安装,而我们做实验资源紧张,所以搞了一个 sno with IPI 模式的安装步骤。本文中有一些手动执行的步骤,都是因为官方 IPI 不支持 sno,我们需要做一些小小的 patch 操作。
离线安装包下载
打包好的安装包,在这里下载,百度盘下载链接,版本是4.10.4:
链接:https://pan.baidu.com/s/16H8goM8AQ5ASXXPsWT4GAg?pwd=426x 提取码:426x
其中包括如下类型的文件:
- ocp4.tgz 这个文件包含了iso等安装介质,以及各种安装脚本,全部下载的镜像列表等。需要复制到宿主机,以及工具机上去。
- registry.tgz 这个文件也是docker image registry的仓库打包文件。需要先补充镜像的话,按照这里操作: 4.6.add.image.md
- nexus-image.tgz 这个是nexus的镜像仓库打包,集群的镜像proxy指向nexus,由nexus提供镜像的cache
- poc.image.tgz 这个是给registry.tgz补充的一些镜像,主要是ccn使用,补充的镜像列表在这里 poc.image.list ,按照这里操作: 4.6.add.image.md
合并这些切分文件,使用类似如下的命令
cat registry.?? > registry.tgz
注意,可能需要更新离线镜像包中的helper用的ansible脚本。
在外网云主机上面准备离线安装源
准备离线安装介质的文档,已经转移到了这里:4.10.build.dist.md
前期准备,主要在宿主机上
本次实验,是在一个24C, 128G 的主机上面,用很多个虚拟机安装测试。所以先准备这个宿主机。
如果是多台宿主机,记得一定要调整时间配置,让这些宿主机的时间基本一致,否则证书会出问题。
主要的准备工作有
- 配置yum源
- 配置dns
- 安装镜像仓库
- 配置vnc环境
- 配置kvm需要的网络
- 创建helper kvm
以上准备工作,dns部分需要根据实际项目环境有所调整。
本次的宿主机是一台rhel8, 参考这里进行离线repo等基本的配置rhel8.build.kernel.repo.cache.md
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
cat << EOF >> /etc/hosts
127.0.0.1 registry.ocp4.redhat.ren nexus.ocp4.redhat.ren git.ocp4.redhat.ren
EOF
dnf clean all
dnf repolist
dnf -y install byobu htop jq ipmitool
systemctl disable --now firewalld
# 配置registry
mkdir -p /etc/crts/ && cd /etc/crts
# https://access.redhat.com/documentation/en-us/red_hat_codeready_workspaces/2.1/html/installation_guide/installing-codeready-workspaces-in-tls-mode-with-self-signed-certificates_crw
openssl genrsa -out /etc/crts/redhat.ren.ca.key 4096
openssl req -x509 \
-new -nodes \
-key /etc/crts/redhat.ren.ca.key \
-sha256 \
-days 36500 \
-out /etc/crts/redhat.ren.ca.crt \
-subj /CN="Local Red Hat Ren Signer" \
-reqexts SAN \
-extensions SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf '[SAN]\nbasicConstraints=critical, CA:TRUE\nkeyUsage=keyCertSign, cRLSign, digitalSignature'))
openssl genrsa -out /etc/crts/redhat.ren.key 2048
openssl req -new -sha256 \
-key /etc/crts/redhat.ren.key \
-subj "/O=Local Red Hat Ren /CN=*.ocp4.redhat.ren" \
-reqexts SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf "\n[SAN]\nsubjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth")) \
-out /etc/crts/redhat.ren.csr
openssl x509 \
-req \
-sha256 \
-extfile <(printf "subjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth") \
-days 36500 \
-in /etc/crts/redhat.ren.csr \
-CA /etc/crts/redhat.ren.ca.crt \
-CAkey /etc/crts/redhat.ren.ca.key \
-CAcreateserial -out /etc/crts/redhat.ren.crt
openssl x509 -in /etc/crts/redhat.ren.crt -text
/bin/cp -f /etc/crts/redhat.ren.ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
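可以顺手验证一下证书链是否正常(示意,输出 OK 即为正常):
openssl verify -CAfile /etc/crts/redhat.ren.ca.crt /etc/crts/redhat.ren.crt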
配置镜像仓库
这里是旧的、使用 docker registry 搭建镜像仓库的方法,如果想配置 quay,可以参考这里。
cd /data
mkdir -p /data/registry
# tar zxf registry.tgz
dnf -y install podman pigz skopeo jq
# pigz -dc registry.tgz | tar xf -
cd /data/ocp4
podman load -i /data/ocp4/registry.tgz
podman run --name local-registry -p 5443:5000 \
-d --restart=always \
-v /data/registry/:/var/lib/registry:z \
-v /etc/crts:/certs:z \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/redhat.ren.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/redhat.ren.key \
docker.io/library/registry:2
podman start local-registry
# firewall-cmd --permanent --add-port=5443/tcp
# firewall-cmd --reload
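registry 容器起来以后,可以用 docker v2 API 简单确认一下服务(假设上面的 CA 已经加入系统信任,/etc/hosts 里也已经把域名指向本机):
curl https://registry.ocp4.redhat.ren:5443/v2/_catalog
# 正常会返回一个 json,类似 {"repositories":[...]}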
# 加载更多的镜像
# 解压缩 ocp4.tgz
bash add.image.load.sh /data/install.image 'registry.ocp4.redhat.ren:5443'
# https://github.com/christianh814/ocp4-upi-helpernode/blob/master/docs/quickstart.md
oc image mirror -a /data/registry.auth.json --from-dir=/data/file.registry/ 'file://openshift/release:4.10.4-x86_64*' quaylab.infra.redhat.ren/ocp4/openshift4
准备vnc环境
vncpasswd
cat << EOF > ~/.vnc/config
session=gnome
securitytypes=vncauth,tlsvnc
desktop=sandbox
geometry=1440x855
alwaysshared
EOF
cat << EOF >> /etc/tigervnc/vncserver.users
:1=root
EOF
systemctl start vncserver@:1
# 如果你想停掉vnc server,这么做
systemctl stop vncserver@:1
# firewall-cmd --permanent --add-port=6001/tcp
# firewall-cmd --permanent --add-port=5901/tcp
# firewall-cmd --reload
# connect vnc at port 5901
# export DISPLAY=:1
创建实验用虚拟网络
cat << 'EOF' > /data/kvm/bridge.sh
#!/usr/bin/env bash
PUB_CONN='eno1'
PUB_IP='172.21.6.105/24'
PUB_GW='172.21.6.254'
PUB_DNS='172.21.1.1'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down baremetal
nmcli con delete baremetal
# RHEL 8.1 appends the word "System" in front of the connection, delete it in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname baremetal type bridge con-name baremetal ipv4.method 'manual' \
ipv4.address "$PUB_IP" \
ipv4.gateway "$PUB_GW" \
ipv4.dns "$PUB_DNS"
nmcli con add type bridge-slave ifname "$PUB_CONN" master baremetal
nmcli con down "$PUB_CONN";pkill dhclient;dhclient baremetal
nmcli con up baremetal
EOF
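上面只是生成了脚本,网桥要真正创建出来,还需要执行一次(注意先按现场环境改好脚本里的网卡名和 IP):
bash /data/kvm/bridge.sh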
nmcli con mod baremetal +ipv4.address '192.168.7.1/24'
nmcli networking off; nmcli networking on
创建工具机
mkdir -p /data/kvm
cd /data/kvm
lvremove -f rhel/helperlv
lvcreate -y -L 200G -n helperlv rhel
virt-install --name="ocp4-aHelper" --vcpus=2 --ram=4096 \
--disk path=/dev/rhel/helperlv,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio \
--boot menu=on --location /data/kvm/rhel-8.3-x86_64-dvd.iso \
--initrd-inject helper-ks-rhel8-ipi.cfg --extra-args "inst.ks=file:/helper-ks-rhel8-ipi.cfg"
virsh start ocp4-aHelper
# DO NOT USE, restore kvm
virsh destroy ocp4-aHelper
virsh undefine ocp4-aHelper
# virt-viewer --domain-name ocp4-aHelper
# virsh start ocp4-aHelper
# virsh list --all
配置时间服务
# start chrony/ntp server on host
/bin/cp -f /etc/chrony.conf /etc/chrony.conf.default
cat << EOF > /etc/chrony.conf
# pool 2.rhel.pool.ntp.org iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
allow 192.0.0.0/8
local stratum 10
logdir /var/log/chrony
EOF
systemctl enable --now chronyd
# systemctl restart chronyd
chronyc tracking
chronyc sources -v
chronyc sourcestats -v
chronyc makestep
# setup ftp data root
mount --bind /data/dnf /var/ftp/dnf
chcon -R -t public_content_t /var/ftp/dnf
在helper上配置静态变量
在 helper / 工具机上,配置静态变量。这些变量,将帮助配置工作可以在不同项目之间复用。后续也许可以考虑把相关的脚本,放到ansible项目里面去。
# on helper define static parameter
NODE_SSH_KEY="$(cat ~/.ssh/id_rsa.pub)"
INSTALL_IMAGE_REGISTRY=quaylab.infra.redhat.ren
PULL_SECRET='{"auths":{"registry.redhat.io": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"'${INSTALL_IMAGE_REGISTRY}'": {"auth": "'$( echo -n 'quayadmin:password' | openssl base64 )'","email": "noemail@localhost"}}}'
NTP_SERVER=192.168.7.1
HELP_SERVER=192.168.7.11
KVM_HOST=192.168.7.1
API_VIP=192.168.7.100
INGRESS_VIP=192.168.7.101
CLUSTER_PROVISION_IP=192.168.7.103
BOOTSTRAP_IP=192.168.7.12
ACM_DEMO_MNGED_CLUSTER=acm-demo1
ACM_DEMO_MNGED_SNO_IP=192.168.7.15
echo $PULL_SECRET
# 定义单节点集群的节点信息
SNO_CLUSTER_NAME=acm-demo-hub
SNO_BASE_DOMAIN=redhat.ren
SNO_IP=192.168.7.13
SNO_GW=192.168.7.1
SNO_NETMAST=255.255.255.0
SNO_NETMAST_S=24
SNO_HOSTNAME=acm-demo-hub-master
SNO_IF=enp1s0
SNO_IF_MAC=`printf '00:60:2F:%02X:%02X:%02X' $[RANDOM%256] $[RANDOM%256] $[RANDOM%256]`
SNO_DNS=192.168.7.11
SNO_DISK=/dev/vda
SNO_CORE_PWD=redhat
echo ${SNO_IF_MAC} > /data/sno/sno.mac
创建openshift4集群节点vm模板
# back to kvm host
# create the master and worker vm, but not start them
export KVM_DIRECTORY=/data/kvm
mkdir -p ${KVM_DIRECTORY}
cd ${KVM_DIRECTORY}
# scp root@192.168.7.11:/data/install/*.iso ${KVM_DIRECTORY}/
scp root@192.168.7.11:/data/sno/sno.mac ${KVM_DIRECTORY}/
remove_lv() {
var_vg=$1
var_lv=$2
lvremove -f $var_vg/$var_lv
}
create_lv() {
var_vg=$1
var_lv=$2
lvcreate -y -L 120G -n $var_lv $var_vg
wipefs --all --force /dev/$var_vg/$var_lv
}
remove_lv vgdata lvacmdemo1
# remove_lv vgdata lvbootstrap
# remove_lv vgdata lvdata01
# remove_lv vgdata lvdata02
remove_lv vgdata lvmaster0
# remove_lv vgdata lvsno
# create_lv rhel bootstraplv
create_lv vgdata lvmaster0
virt-install --name=ocp4-master0 --vcpus=16 --ram=49152 \
--cpu=host-model \
--disk path=/dev/vgdata/lvmaster0,device=disk,bus=virtio,format=raw \
--os-variant rhel8.0 --network bridge=baremetal,model=virtio,mac=$(<sno.mac) \
--boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
--print-xml > ${KVM_DIRECTORY}/ocp4-master0.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-master0.xml
# --boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
# --boot hd,cdrom,menu=on \
cd /data/kvm/
# for i in master{0..2} worker{0..2}
for i in master{0..0}
do
echo -ne "${i}\t" ;
virsh dumpxml ocp4-${i} | grep "mac address" | cut -d\' -f2 | tr '\n' '\t'
echo
done > mac.list
cat /data/kvm/mac.list
# master0 00:60:2f:86:fc:ba
# GOTO image registry & kvm host
# copy crt files to helper node
ssh-copy-id root@192.168.7.11
ssh root@192.168.7.11 mkdir -p /data/install
ssh root@192.168.7.11 mkdir -p /data/ocp4
ssh root@192.168.7.11 mkdir -p /etc/crts
scp /data/down/ocp4.tgz root@192.168.7.11:/data/
rsync -e ssh --info=progress2 -P --delete -arz /data/ocp4/ 192.168.7.11:/data/ocp4/
# scp /etc/crts/redhat.ren.ca.crt root@192.168.7.11:/data/install/
scp /etc/crts/redhat.ren.ca.crt root@192.168.7.11:/etc/crts/
scp /data/kvm/mac.list root@192.168.7.11:/data/install/
配置redfish模拟
# install redfish for kvm
# https://access.redhat.com/solutions/4315581
# https://access.redhat.com/solutions/3057171
# https://docs.openstack.org/virtualbmc/latest/user/index.html
# https://docs.openstack.org/sushy-tools/latest/user/dynamic-emulator.html
dnf -y install python3-pip
# pip3 install --user sushy-tools
mkdir -p /data/install
cd /data/install
# podman create --name swap docker.io/wangzheng422/imgs:openshift-baremetal-install-4.6.5 ls
# podman cp swap:/openshift-baremetal-install ./
# podman rm -fv swap
# quay.io/wangzheng422/qimgs:ocp.bm.ipi.python.dep.rhel8-4.6.7
podman create --name swap quay.io/wangzheng422/qimgs:ocp.bm.ipi.python.dep.rhel8-4.10.4 ls
podman cp swap:/wheelhouse.tar.gz - > wheelhouse.tar.gz.tar
tar xvf wheelhouse.tar.gz.tar
tar zvxf wheelhouse.tar.gz
podman rm -fv swap
dnf groupinstall -y 'Development Tools'
dnf -y install python3-pip libvirt libvirt-devel python3-devel openssl-devel
pip3 install --user --no-index --find-links wheelhouse setuptools-rust
# export CRYPTOGRAPHY_DONT_BUILD_RUST=1
dnf install -y rust cargo
pip3 install --user -r wheelhouse/requirements.txt --no-index --find-links wheelhouse
/root/.local/bin/sushy-emulator -i 0.0.0.0 --ssl-certificate /etc/crts/redhat.ren.crt --ssl-key /etc/crts/redhat.ren.key
# curl https://registry.ocp4.redhat.ren:8000/redfish/v1/Systems/
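sushy-emulator 默认监听 8000 端口,上面的命令会一直在前台运行,实验时可以丢到 byobu 里,或者像下面这样放到后台,再用 curl 验证 redfish 接口(只是示意,日志路径是随便选的):
nohup /root/.local/bin/sushy-emulator -i 0.0.0.0 --ssl-certificate /etc/crts/redhat.ren.crt --ssl-key /etc/crts/redhat.ren.key > /var/log/sushy-emulator.log 2>&1 &
curl -k https://127.0.0.1:8000/redfish/v1/Systems/ | jq .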
# DO NOT USE, restore
# if you want to stop or delete vm, try this
virsh list --all
# virsh destroy ocp4-bootstrap
virsh destroy ocp4-master0
# virsh destroy ocp4-master1
# virsh destroy ocp4-master2
# virsh destroy ocp4-worker0
# virsh destroy ocp4-worker1
# virsh destroy ocp4-worker2
# virsh undefine ocp4-bootstrap
virsh undefine ocp4-master0 --nvram
# virsh undefine ocp4-master1 --nvram
# virsh undefine ocp4-master2 --nvram
# virsh undefine ocp4-worker0 --nvram
# virsh undefine ocp4-worker1 --nvram
# virsh undefine ocp4-worker2 --nvram
工具机上的准备工作
以下是在工具机里面,进行的安装操作。
主要的操作有
- 配置yum源
- 运行ansible脚本,自动配置工具机
- 上传定制的安装配置文件
- 生成ignition文件
工具机的基础配置
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
systemctl restart sshd
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
systemctl disable --now firewalld
# in helper node
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
export YUMIP="192.168.7.1"
cat << EOF > /etc/yum.repos.d/remote.repo
[remote-epel]
name=epel
baseurl=ftp://${YUMIP}/dnf/epel
enabled=1
gpgcheck=0
[remote-epel-modular]
name=epel-modular
baseurl=ftp://${YUMIP}/dnf/epel-modular
enabled=1
gpgcheck=0
[remote-appstream]
name=appstream
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-appstream-rpms
enabled=1
gpgcheck=0
[remote-baseos]
name=baseos
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-baseos-rpms
enabled=1
gpgcheck=0
[remote-baseos-source]
name=baseos-source
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-baseos-source-rpms
enabled=1
gpgcheck=0
[remote-supplementary]
name=supplementary
baseurl=ftp://${YUMIP}/dnf/rhel-8-for-x86_64-supplementary-rpms
enabled=1
gpgcheck=0
[remote-codeready-builder]
name=supplementary
baseurl=ftp://${YUMIP}/dnf/codeready-builder-for-rhel-8-x86_64-rpms
enabled=1
gpgcheck=0
EOF
yum clean all
yum makecache
yum repolist
yum -y install ansible git unzip podman python3
yum -y update
reboot
# yum -y install ansible git unzip podman python36
准备 openshift 的定制化 ansible 安装工具
mkdir -p /data/ocp4/
# scp ocp4.tgz to /data
# scp /data/down/ocp4.tgz root@192.168.7.11:/data/
cd /data
tar zvxf ocp4.tgz
cd /data/ocp4
# 这里使用了一个ansible的项目,用来部署helper节点的服务。
# https://github.com/wangzheng422/ocp4-upi-helpernode
unzip ocp4-upi-helpernode.zip
# 这里使用了一个ignition文件合并的项目,用来帮助自定义ignition文件。
# https://github.com/wangzheng422/filetranspiler
podman load -i filetranspiler.tgz
mkdir -p /data/install
# on helper
mkdir -p /data/ocp4/
cd /data/ocp4/
cat << EOF > redfish.sh
#!/usr/bin/env bash
curl -k -s https://${KVM_HOST}:8000/redfish/v1/Systems/ | jq -r '.Members[]."@odata.id"' > list
while read -r line; do
curl -k -s https://${KVM_HOST}:8000/\$line | jq -j '.Id, " ", .Name, "\n" '
done < list
EOF
bash redfish.sh > /data/install/vm.list
cat /data/install/vm.list
# 1bc0116f-d376-45e2-b28c-d6b4b772b2bf ocp4-master0
# e70f66bc-7878-4617-811d-89cdaf62cc8c ocp4-Helper
# 配置ansible脚本的参数,注意修改里面的静态参数
cat << EOF > /data/ocp4/ocp4-upi-helpernode-master/vars.yaml
---
ocp_version: 4.10.4
ssh_gen_key: false
staticips: true
bm_ipi: true
firewalld: false
dns_forward: true
iso:
iso_dl_url: "file:///data/ocp4/rhcos-live.x86_64.iso"
my_iso: "rhcos-live.iso"
helper:
name: "helper"
ipaddr: "${HELP_SERVER}"
networkifacename: "enp1s0"
gateway: "${SNO_GW}"
netmask: "${SNO_NETMAST}"
dns:
domain: "redhat.ren"
clusterid: "ocp4"
forwarder1: "172.21.1.1"
forwarder2: "172.21.1.1"
api_vip: "${API_VIP}"
ingress_vip: "${INGRESS_VIP}"
bootstrap:
name: "bootstrap"
ipaddr: "${BOOTSTRAP_IP}"
interface: "enp1s0"
install_drive: "vda"
masters:
- name: "master-0"
ipaddr: "192.168.7.13"
interface: "enp1s0"
install_drive: "vda"
others:
- name: "registry"
ipaddr: "192.168.7.103"
- name: "yum"
ipaddr: "172.21.6.103"
- name: "quay"
ipaddr: "172.21.6.103"
- name: "nexus"
ipaddr: "172.21.6.103"
- name: "git"
ipaddr: "172.21.6.103"
otherdomains:
- domain: "infra.redhat.ren"
hosts:
- name: "registry"
ipaddr: "192.168.7.1"
- name: "yum"
ipaddr: "192.168.7.1"
- name: "quay"
ipaddr: "192.168.7.1"
- name: "quaylab"
ipaddr: "192.168.7.1"
- name: "nexus"
ipaddr: "192.168.7.1"
- name: "git"
ipaddr: "192.168.7.1"
- domain: "${ACM_DEMO_MNGED_CLUSTER}.${SNO_BASE_DOMAIN}"
hosts:
- name: "api"
ipaddr: "${ACM_DEMO_MNGED_SNO_IP}"
- name: "api-int"
ipaddr: "${ACM_DEMO_MNGED_SNO_IP}"
- name: "${ACM_DEMO_MNGED_CLUSTER}-master"
ipaddr: "${ACM_DEMO_MNGED_SNO_IP}"
- name: "*.apps"
ipaddr: "${ACM_DEMO_MNGED_SNO_IP}"
- domain: "${SNO_CLUSTER_NAME}.${SNO_BASE_DOMAIN}"
hosts:
- name: "api"
ipaddr: "${SNO_IP}"
- name: "api-int"
ipaddr: "${SNO_IP}"
- name: "${SNO_CLUSTER_NAME}-master"
ipaddr: "${SNO_IP}"
- name: "*.apps"
ipaddr: "${SNO_IP}"
force_ocp_download: false
remove_old_config_files: false
ocp_client: "file:///data/ocp4/{{ ocp_version }}/openshift-client-linux-{{ ocp_version }}.tar.gz"
ocp_installer: "file:///data/ocp4/{{ ocp_version }}/openshift-install-linux-{{ ocp_version }}.tar.gz"
ppc64le: false
arch: 'x86_64'
chronyconfig:
enabled: true
content:
- server: "${NTP_SERVER}"
options: iburst
setup_registry: # don't worry about this, just leave it here
deploy: false
registry_image: docker.io/library/registry:2
local_repo: "ocp4/openshift4"
product_repo: "openshift-release-dev"
release_name: "ocp-release"
release_tag: "4.6.1-x86_64"
ocp_filetranspiler: "file:///data/ocp4/filetranspiler.tgz"
EOF
# 接下来,我们使用ansible来配置helper节点,装上各种openshift集群需要的服务
# 根据现场环境,修改 ocp4-upi-helpernode-master/vars.yaml 里面的静态参数
cd /data/ocp4/ocp4-upi-helpernode-master
ansible-playbook -e @vars.yaml -e '{ staticips: true, bm_ipi: true }' tasks/main.yml
# generate image registry proxy related config
cd /data/ocp4
bash image.registries.conf.sh nexus.ocp4.redhat.ren:8083
# try this:
/usr/local/bin/helpernodecheck
mkdir -p /data/install
# GO back to help node
# apply registry's CA
/bin/cp -f /etc/crts/redhat.ren.ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
配置 ignition 点火配置文件
openshift4安装的关键,就是ignition文件,更准确的说,是rhcos的点火配置文件,所有项目现场想做的定制,都在ignition文件里面。
rhcos就是一个rhel,所有你想要的定制化,都可以写成配置文件和脚本,加到ignition文件中去。但是,openshift4在安装过程中,至少要重启3次,我们的ignition文件中的配置,更多的是影响第一次启动,而之后的启动,rhcos会根据自身的升级机制,使用新的ignition去启动,这个新的ignition文件在哪里?怎么影响这个igntion文件的生成?作者现在也还在探索中,但是大致的方向是定制 /opt/openshift/openshift/ 下面的machine config yaml文件,把machine config写进去。
# on helper
# 根据现场环境,修改 install-config.yaml
# 至少要修改ssh key, 还有 additionalTrustBundle,这个是镜像仓库的ca证书
# copy your pull secret file into helper
# SEC_FILE='/data/pull-secret.json'
# cat << 'EOF' > $SEC_FILE
# 定制ignition
mkdir -p /data/install
cd /data/install
# vi install-config.yaml
cat << EOF > /data/install/install-config.yaml
apiVersion: v1
baseDomain: ${SNO_BASE_DOMAIN}
# bootMode: legacy
platform:
baremetal:
apiVIP: ${API_VIP}
ingressVIP: ${INGRESS_VIP}
bootstrapProvisioningIP: ${BOOTSTRAP_IP}
clusterProvisioningIP: ${CLUSTER_PROVISION_IP}
provisioningNetwork: "Disabled"
externalBridge: baremetal
bootstrapOSImage: http://${HELP_SERVER}:8080/install/rhcos-qemu.x86_64.qcow2.gz?sha256=$(zcat /var/www/html/install/rhcos-qemu.x86_64.qcow2.gz | sha256sum | awk '{print $1}')
clusterOSImage: http://${HELP_SERVER}:8080/install/rhcos-openstack.x86_64.qcow2.gz?sha256=$(zcat /var/www/html/install/rhcos-openstack.x86_64.qcow2.gz | sha256sum | awk '{print $1}')
hosts:
- name: ${SNO_HOSTNAME}
role: master
bmc:
address: redfish-virtualmedia://${KVM_HOST}:8000/redfish/v1/Systems/$(cat vm.list | grep master0 | awk '{print $1}')
username: admin
password: password
disableCertificateVerification: True
bootMACAddress: $(cat mac.list | grep master0 | awk '{print $2}')
rootDeviceHints:
deviceName: "${SNO_DISK}"
networkConfig:
dns-resolver:
config:
server:
- ${SNO_DNS}
interfaces:
- ipv4:
address:
- ip: ${SNO_IP}
prefix-length: ${SNO_NETMAST_S}
# - ip: ${API_VIP}
# prefix-length: 32
# - ip: ${INGRESS_VIP}
# prefix-length: 32
# - ip: ${CLUSTER_PROVISION_IP}
# prefix-length: 32
dhcp: false
enabled: true
name: ${SNO_IF}
state: up
type: ethernet
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: ${SNO_GW}
next-hop-interface: ${SNO_IF}
table-id: 254
metadata:
name: ${SNO_CLUSTER_NAME}
networking:
clusterNetworks:
- cidr: 10.254.0.0/16
hostPrefix: 24
networkType: OpenShiftSDN
serviceNetwork:
- 172.30.0.0/16
machineCIDR: 192.168.7.0/24
compute:
- name: worker
replicas: 0
controlPlane:
name: master
replicas: 1
platform:
baremetal: {}
pullSecret: '${PULL_SECRET}'
sshKey: |
$( cat /root/.ssh/id_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /etc/crts/redhat.ren.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF
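openshift-install 在生成 manifests / ignition 的时候,会把 install-config.yaml 消耗掉,习惯上可以先留一个备份(可选):
/bin/cp -f /data/install/install-config.yaml /data/install/install-config.yaml.bak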
在宿主机上开始安装
将配置文件复制到宿主机上
# GO back to host
mkdir -p /data/install
cd /data/install
/bin/rm -rf .openshift_install.log .openshift_install_state.json terraform* auth tls *
scp root@192.168.7.11:/data/install/install-config.yaml /data/install/
cd /data/install
for i in $(sudo virsh list --all | tail -n +3 | grep bootstrap | awk {'print $2'});
do
sudo virsh destroy $i;
sudo virsh undefine $i;
sudo virsh vol-delete $i --pool default;
sudo virsh vol-delete $i.ign --pool default;
virsh pool-destroy $i
virsh pool-delete $i
virsh pool-undefine $i
done
从ignition点火配置文件创建安装配置文件
export BUILDNUMBER=4.10.4
/data/ocp4/${BUILDNUMBER}/openshift-baremetal-install --dir /data/install/ create manifests
# copy ntp related config
scp root@192.168.7.11:/data/ocp4/ocp4-upi-helpernode-master/machineconfig/* /data/install/openshift/
# /bin/cp -f /data/ocp4/image.registries.conf /etc/containers/registries.conf.d/
scp root@192.168.7.11:/data/ocp4/99-worker-container-registries.yaml /data/install/openshift
scp root@192.168.7.11:/data/ocp4/99-master-container-registries.yaml /data/install/openshift
# /data/ocp4/${BUILDNUMBER}/openshift-baremetal-install --dir /data/install/ --log-level debug create cluster
/data/ocp4/${BUILDNUMBER}/openshift-baremetal-install --dir /data/install/ create ignition-configs
定制 bootstrap 的 ignition 点火配置文件
mkdir -p /data/sno/disconnected/
# 定义单节点集群的节点信息
BTS_CLUSTER_NAME=ocp4s-ais
BTS_BASE_DOMAIN=redhat.ren
BTS_IP=192.168.7.12
BTS_GW=192.168.7.1
BTS_NETMAST=255.255.255.0
BTS_NETMAST_S=24
BTS_HOSTNAME=ocp4s-ais-bootstrap
# SNO_CON="Wired connection 1"
BTS_CON="ens3"
BTS_IF=ens3
BTS_DNS=192.168.7.11
BTS_DISK=/dev/vda
BTS_CORE_PWD=redhat
SNO_HOSTNAME=acm-demo-hub-master
cat << EOF > /data/sno/static.ip.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-static-ip
storage:
files:
- path: /etc/NetworkManager/system-connections/${BTS_CON}.nmconnection
mode: 0600
overwrite: true
contents:
inline: |
[connection]
id=${BTS_IF}
# uuid=$(uuidgen)
type=ethernet
interface-name=${BTS_IF}
autoconnect=true
[ipv4]
address1=${BTS_IP}/${BTS_NETMAST_S=24},${BTS_GW}
dns=${BTS_DNS};
dns-search=
method=manual
[ipv6]
addr-gen-mode=eui64
dhcp-hostname=${BTS_HOSTNAME}
dhcp-timeout=90
dns-search=
method=disabled
[proxy]
EOF
# set static hostname for master
# only works for sno
# do not use this in 3-master cluster
# in 3-master cluster, use dhcp to set hostname instead.
cat << EOF > /data/sno/static.hostname.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-static-hostname
storage:
files:
- path: /etc/hostname
mode: 0644
overwrite: true
contents:
inline: |
${SNO_HOSTNAME}
EOF
source /data/ocp4/acm.fn.sh
butane /data/sno/static.ip.bu > /data/sno/disconnected/99-zzz-bootstrap-ip.yaml
get_file_content_for_ignition "/opt/openshift/openshift/99-zzz-bootstrap-ip.yaml" "/data/sno/disconnected/99-zzz-bootstrap-ip.yaml"
VAR_99_master_bootstrap_ip=$RET_VAL
VAR_99_master_bootstrap_ip_2=$RET_VAL_2
butane /data/sno/static.hostname.bu > /data/sno/disconnected/99-zzz-master-static-hostname.yaml
get_file_content_for_ignition "/opt/openshift/openshift/99-zzz-master-static-hostname.yaml" "/data/sno/disconnected/99-zzz-master-static-hostname.yaml"
VAR_99_master_master_static_hostname=$RET_VAL
VAR_99_master_master_static_hostname_2=$RET_VAL_2
VAR_PWD_HASH="$(python3 -c 'import crypt,getpass; print(crypt.crypt("redhat"))')"
tmppath=$(mktemp)
cat /data/install/bootstrap.ign \
| jq --arg VAR "$VAR_PWD_HASH" --arg VAR_SSH "$NODE_SSH_KEY" '.passwd.users += [{ "name": "wzh", "system": true, "passwordHash": $VAR , "sshAuthorizedKeys": [ $VAR_SSH ], "groups": [ "adm", "wheel", "sudo", "systemd-journal" ] }]' \
| jq --argjson VAR "$VAR_99_master_bootstrap_ip_2" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_master_static_hostname" '.storage.files += [$VAR] ' \
| jq -c . \
> ${tmppath}
/bin/cp -f ${tmppath} /data/install/bootstrap.ign
rm -f ${tmppath}
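替换完以后,可以用 jq 粗略检查一下注入是否成功(示意):
jq -r '.storage.files[].path' /data/install/bootstrap.ign | grep zzz
jq -r '.passwd.users[].name' /data/install/bootstrap.ign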
开始 IPI 安装 openshift4
/data/ocp4/${BUILDNUMBER}/openshift-baremetal-install --dir /data/install/ --log-level debug create cluster
安装自动开始,等2分钟以后,可以看到自动创建了一个bootstrap虚拟机
bootstrap运行一段时间后,会通过redfish,启动 master vm.
# we can login to the bootstrap by using username and password ( wzh/redhat ) in console
# or we can login using ssh
ssh core@192.168.7.12
# 在安装过程中,安装程序会检查master-0节点的hostname,如果还是localhost,就会一直等待网络配置把主机名设置好
# 这个超时时间还有点长,等不及的话,登录到master-0节点上,直接用以下命令改一下
# hostnamectl set-hostname acm-demo-hub-master
# 在安装过程中,apiVIP, ingressVIP 无法漂移到master-0上正常加载
# 我们手动加上去就好了
# 这并不算一个bug,而是因为IPI安装的设计要求3个master节点,SNO场景下只能这样绕过,也许以后会内置支持吧。
# on master-0 kvm
nmcli con mod enp1s0 +ipv4.addresses 192.168.7.100/32
nmcli con mod enp1s0 +ipv4.addresses 192.168.7.101/32
nmcli con mod enp1s0 +ipv4.addresses 192.168.7.103/32
nmcli con up enp1s0
/data/ocp4/${BUILDNUMBER}/openshift-baremetal-install --dir /data/install/ wait-for bootstrap-complete --log-level debug
# DEBUG Bootstrap status: complete
# INFO It is now safe to remove the bootstrap resources
# DEBUG Time elapsed per stage:
# DEBUG Bootstrap Complete: 14s
# DEBUG API: 14s
# INFO Time elapsed: 14s
/data/ocp4/${BUILDNUMBER}/openshift-baremetal-install --dir /data/install/ wait-for install-complete --log-level debug
# INFO Install complete!
# INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.acm-demo-hub.redhat.ren
# INFO Login to the console with user: "kubeadmin", and password: "FpbMV-zasXr-8xczB-SSuIy"
# DEBUG Time elapsed per stage:
# DEBUG Cluster Operators: 8m39s
# INFO Time elapsed: 8m39s
# on kvm host, copy back auth folder to helper node
rsync -arz /data/install/auth root@192.168.7.11:/data/install/
# Go back to helper
ansible localhost -m lineinfile -a 'path=$HOME/.bashrc regexp="^export KUBECONFIG" line="export KUBECONFIG=/data/install/auth/kubeconfig"'
source $HOME/.bashrc
oc get node
# NAME STATUS ROLES AGE VERSION
# acm-demo-hub-master Ready master,worker 143m v1.23.3+e419edf
oc get pod -n openshift-machine-api
# NAME READY STATUS RESTARTS AGE
# cluster-autoscaler-operator-86fb4975-ljssk 2/2 Running 8 137m
# cluster-baremetal-operator-5946dc9f9b-sksrh 2/2 Running 6 137m
# machine-api-controllers-9688d969d-qgn2j 7/7 Running 32 (34m ago) 135m
# machine-api-operator-568bb89984-s28kx 2/2 Running 6 137m
# metal3-d88947f6f-rbp9m 7/7 Running 24 (35m ago) 134m
# metal3-image-cache-vf548 1/1 Running 3 134m
# metal3-image-customization-577f886bb4-v7xg5 1/1 Running 3 134m
oc get all -n openshift-kni-infra
# NAME READY STATUS RESTARTS AGE
# pod/coredns-acm-demo-hub-master 2/2 Running 4 92m
# pod/haproxy-acm-demo-hub-master 2/2 Running 4 93m
# pod/keepalived-acm-demo-hub-master 2/2 Running 4 92m
oc get BareMetalHost -n openshift-machine-api
# NAME STATE CONSUMER ONLINE ERROR AGE
# acm-demo-hub-master externally provisioned acm-demo-hub-6rh7s-master-0 true 157m
oc get bmh -n openshift-machine-api
# NAME STATE CONSUMER ONLINE ERROR AGE
# acm-demo-hub-master externally provisioned acm-demo-hub-6rh7s-master-0 true 161m
可以看到web console上node的配置指向了bm
我们也可以看到久违的machine配置
machine set 也有了
有了machine 自然 machine health check 也有了
有一个单独的 baremetal hosts 的页面也出来了
静态添加 vip for api_server, ingress
我们是定制的 SNO IPI,其实不需要 api server、ingress 的 vip 漂移,所以我们就把这些 vip 写死到节点的启动脚本中,静态加上。但是默认 IPI 安装会有一个 keepalived static pod,启动的时候会清除掉这些 vip,所以我们还要把这个 keepalived static pod 关掉,否则会导致 vip 不可用。
# on helper
cat << EOF > /data/install/wzh.script
#!/bin/bash
nmcli con mod enp1s0 +ipv4.addresses 192.168.7.100/32
nmcli con mod enp1s0 +ipv4.addresses 192.168.7.101/32
nmcli con mod enp1s0 +ipv4.addresses 192.168.7.103/32
nmcli con up enp1s0
EOF
var_local=$(cat /data/install/wzh.script | python3 -c "import sys, urllib.parse; print(urllib.parse.quote(''.join(sys.stdin.readlines())))" )
cat <<EOF > /data/install/45-master-wzh-service.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 45-master-wzh-service
spec:
config:
ignition:
version: 3.2.0
storage:
files:
- contents:
source: data:text/plain,${var_local}
verification: {}
filesystem: root
mode: 0755
path: /etc/rc.d/wzh.local
- path: /etc/kubernetes/manifests/keepalived.yaml
contents:
source: data:text/plain,
verification: {}
filesystem: root
mode: 0644
overwrite: true
systemd:
units:
- name: wzh.service
enabled: true
contents: |
[Unit]
Description=/etc/rc.d/wzh.local Compatibility
ConditionFileIsExecutable=/etc/rc.d/wzh.local
After=network.target
[Service]
Type=oneshot
User=root
Group=root
ExecStart=/bin/bash -c /etc/rc.d/wzh.local
[Install]
WantedBy=multi-user.target
EOF
oc apply -f 45-master-wzh-service.yaml
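apply 之后,machine config operator 会滚动应用配置并重启节点,可以这样观察进度(示意):
oc get mc | grep wzh
oc get mcp master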
安装后的操作
添加一个新节点(sno未验证)
IPI 模式下,添加一个新节点非常方便,只要定义一个BareMetalHost就好了。
cd /data/install/
cat << EOF > /data/install/bmh.yaml
---
apiVersion: v1
kind: Secret
metadata:
name: worker-2-bmc-secret
type: Opaque
data:
username: $(echo -ne "admin" | base64)
password: $(echo -ne "password" | base64)
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: worker-2
spec:
online: true
bootMACAddress: $(cat mac.list | grep worker2 | awk '{print $2}')
bmc:
address: redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/$(cat vm.list | grep worker2 | awk '{print $1}')
credentialsName: worker-2-bmc-secret
disableCertificateVerification: true
rootDeviceHints:
deviceName: /dev/vda
EOF
oc -n openshift-machine-api create -f bmh.yaml
# DO NOT USE, restore, delete the vm
oc -n openshift-machine-api delete -f bmh.yaml
oc get bmh -n openshift-machine-api
# NAME STATUS PROVISIONING STATUS CONSUMER BMC HARDWARE PROFILE ONLINE ERROR
# master-0 OK externally provisioned ocp4-zn8lq-master-0 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/965c420a-f127-4639-9184-fe3546d2bde4 true
# master-1 OK externally provisioned ocp4-zn8lq-master-1 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/46f9dff4-1b44-4286-8a7c-691673340030 true
# master-2 OK externally provisioned ocp4-zn8lq-master-2 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/9e544eb6-1b98-4b0a-ad32-7df232ae582a true
# worker-0 OK provisioned ocp4-zn8lq-worker-0-mv4d7 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/c399c6b7-525a-4f4e-8280-0472b6494fc5 unknown true
# worker-1 OK provisioned ocp4-zn8lq-worker-0-9frt6 redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/a4052132-7598-4879-b3e1-c48c47cf67ed unknown true
# worker-2 OK inspecting redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/2eee2e57-e18b-460b-bb3f-7f048f84c69b true
oc get machinesets -n openshift-machine-api
# NAME DESIRED CURRENT READY AVAILABLE AGE
# ocp4-zn8lq-worker-0 2 2 2 2 155m
oc get machinesets -n openshift-machine-api -o json | jq -r .items[0].metadata.name
# 扩容worker到3副本,会触发worker-2的部署
oc scale --replicas=3 machineset $(oc get machinesets -n openshift-machine-api -o json | jq -r .items[0].metadata.name) -n openshift-machine-api
镜像仓库代理 / image registry proxy
准备离线镜像仓库非常麻烦。好在我们找到了一台在线的主机,可以用 nexus 构造一个 image registry proxy:先在在线环境上把 PoC 做一遍,之后就能通过这个 image registry proxy 把用到的镜像缓存下来,得到离线镜像了
- https://mtijhof.wordpress.com/2018/07/23/using-nexus-oss-as-a-proxy-cache-for-docker-images/
#####################################################
# init build the nexus fs
/bin/cp -f nexus-image.tgz /data/ccn/
tar zxf nexus-image.tgz
chown -R 200 /data/ccn/nexus-image
# podman run -d -p 8082:8081 -p 8083:8083 -it --name nexus-image -v /data/ccn/nexus-image:/nexus-data:Z docker.io/sonatype/nexus3:3.29.0
podman run -d -p 8082:8081 -p 8083:8083 -it --name nexus-image -v /data/ccn/nexus-image:/nexus-data:Z docker.io/wangzheng422/imgs:nexus3-3.29.0-wzh
podman stop nexus-image
podman rm nexus-image
# get the admin password
cat /data/ccn/nexus-image/admin.password && echo
# 84091bcd-c82f-44a3-8b7b-dfc90f5b7da1
# open http://nexus.ocp4.redhat.ren:8082
# 开启 https
# https://blog.csdn.net/s7799653/article/details/105378645
# https://help.sonatype.com/repomanager3/system-configuration/configuring-ssl#ConfiguringSSL-InboundSSL-ConfiguringtoServeContentviaHTTPS
mkdir -p /data/install/tmp
cd /data/install/tmp
# 将证书导出成pkcs格式
# 这里需要输入密码 用 password,
openssl pkcs12 -export -out keystore.pkcs12 -inkey /etc/crts/redhat.ren.key -in /etc/crts/redhat.ren.crt
cat << EOF >> Dockerfile
FROM docker.io/sonatype/nexus3:3.29.0
USER root
COPY keystore.pkcs12 /keystore.pkcs12
RUN keytool -v -importkeystore -srckeystore keystore.pkcs12 -srcstoretype PKCS12 -destkeystore keystore.jks -deststoretype JKS -storepass password -srcstorepass password &&\
cp keystore.jks /opt/sonatype/nexus/etc/ssl/
USER nexus
EOF
buildah bud --format=docker -t docker.io/wangzheng422/imgs:nexus3-3.29.0-wzh -f Dockerfile .
buildah push docker.io/wangzheng422/imgs:nexus3-3.29.0-wzh
######################################################
# go to helper, update proxy setting for ocp cluster
cd /data/ocp4
bash image.registries.conf.sh nexus.ocp4.redhat.ren:8083
mkdir -p /etc/containers/registries.conf.d
/bin/cp -f image.registries.conf /etc/containers/registries.conf.d/
cd /data/ocp4
oc apply -f ./99-worker-container-registries.yaml -n openshift-config
oc apply -f ./99-master-container-registries.yaml -n openshift-config
######################################################
# dump the nexus image fs out
podman stop nexus-image
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
cd /data/ccn
tar cf - ./nexus-image | pigz -c > nexus-image.tgz
buildah from --name onbuild-container scratch
buildah copy onbuild-container nexus-image.tgz /
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container docker.io/wangzheng422/nexus-fs:image-$var_date
# buildah rm onbuild-container
# rm -f nexus-image.tgz
buildah push docker.io/wangzheng422/nexus-fs:image-$var_date
echo "docker.io/wangzheng422/nexus-fs:image-$var_date"
# 以下这个版本,可以作为初始化的image proxy,里面包含了nfs provision,以及sample operator的metadata。很高兴地发现,image stream并不会完整下载镜像,好像只是下载metadata,真正用的时候,才去下载。
# docker.io/wangzheng422/nexus-fs:image-2020-12-26-1118
配置镜像仓库的ca
安装过程里面,已经把镜像仓库的ca放进去了,但是好像image stream不认,让我们再试试
oc project openshift-config
oc create configmap ca.for.registry -n openshift-config \
--from-file=registry.ocp4.redhat.ren..5443=/data/install/redhat.ren.ca.crt \
--from-file=nexus.ocp4.redhat.ren..8083=/data/install/redhat.ren.ca.crt
oc patch image.config.openshift.io/cluster -p '{"spec":{"additionalTrustedCA":{"name":"ca.for.registry"}}}' --type=merge
# oc patch image.config.openshift.io/cluster -p '{"spec":{"registrySources":{"insecureRegistries":["nexus.ocp4.redhat.ren:8083"]}}}' --type=merge
oc get image.config.openshift.io/cluster -o yaml
# openshift project下面的image stream重新加载一下吧
oc get is -o json | jq -r '.items[].metadata.name' | xargs -L1 oc import-image --all
配置internal registry
我们的工具机是带nfs的,那么就给internal registry配置高档一些的nfs存储吧,不要用emptydir
bash /data/ocp4/ocp4-upi-helpernode-master/files/nfs-provisioner-setup.sh
# oc edit configs.imageregistry.operator.openshift.io
# 修改 storage 部分
# storage:
# pvc:
# claim:
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Managed","storage":{"pvc":{"claim":""}}}}' --type=merge
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Removed"}}' --type=merge
oc get clusteroperator image-registry
oc get configs.imageregistry.operator.openshift.io cluster -o yaml
# 把imagepruner给停掉
# https://bugzilla.redhat.com/show_bug.cgi?id=1852501#c24
# oc patch imagepruner.imageregistry/cluster --patch '{"spec":{"suspend":true}}' --type=merge
# oc -n openshift-image-registry delete jobs --all
配置sample operator
openshift内置了一个sample operator,里面有一大堆红帽的产品。
oc get configs.samples.operator.openshift.io/cluster -o yaml
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Managed", "samplesRegistry": "nexus.ocp4.redhat.ren:8083"}}' --type=merge
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Unmanaged"}}' --type=merge
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Removed"}}' --type=merge
chrony/NTP 设置
在 ocp 4.6 里面,需要设定ntp同步,我们之前ansible脚本,已经创建好了ntp的mco配置,把他打到系统里面就好了。
oc apply -f /data/ocp4/ocp4-upi-helpernode-master/machineconfig/
Operator Hub 离线安装
使用nexus作为image proxy以后,就不需要做这个离线操作了,但是如果我们想搞CCN这种项目,因为它自带了一个catalog,为了避免冲突,我们可能还是需要屏蔽掉默认的operator hub
oc patch OperatorHub cluster --type json \
-p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
oc get OperatorHub cluster -o yaml
给 openshift project image stream 打补丁
在有代理的网络环境中,我们需要给openshift project下的image stream打一些补丁。
cd /data/ocp4
bash is.patch.sh registry.ocp4.redhat.ren:5443/ocp4/openshift4
给 router / ingress 更换证书
有时候,我们需要公网CA认证的证书,给router来用,那么我们就搞一下
https://docs.openshift.com/container-platform/4.6/security/certificates/replacing-default-ingress-certificate.html
mkdir -p /data/ccn/ingress-keys/etc
mkdir -p /data/ccn/ingress-keys/lib
cd /data/ccn/ingress-keys
podman run -it --rm --name certbot \
-v "/data/ccn/ingress-keys/etc:/etc/letsencrypt":Z \
-v "/data/ccn/ingress-keys/lib:/var/lib/letsencrypt":Z \
docker.io/certbot/certbot certonly -d "*.apps.ocp4.redhat.ren" --manual --preferred-challenges dns-01 --server https://acme-v02.api.letsencrypt.org/directory
cp ./etc/archive/apps.ocp4.redhat.ren/fullchain1.pem apps.ocp4.redhat.ren.crt
cp ./etc/archive/apps.ocp4.redhat.ren/privkey1.pem apps.ocp4.redhat.ren.key
ssh root@192.168.7.11 mkdir -p /data/install/ingress-key
scp apps.* root@192.168.7.11:/data/install/ingress-key
# on helper
cd /data/install/ingress-key
oc create secret tls wzh-ingress-key \
--cert=apps.ocp4.redhat.ren.crt \
--key=apps.ocp4.redhat.ren.key \
-n openshift-ingress
oc patch ingresscontroller.operator default \
--type=merge -p \
'{"spec":{"defaultCertificate": {"name": "wzh-ingress-key"}}}' \
-n openshift-ingress-operator
build the pip dependencies for rhel8
export BUILDNUMBER=4.10.4
dnf groupinstall -y 'Development Tools'
dnf -y install python3-pip libvirt libvirt-devel python3-devel
pip3 uninstall -y $(pip3 list --user --format=legacy | awk '{print $1}' | tr '\n' ' ' )
pip3 install --user setuptools-rust
pip3 install --user virtualbmc
pip3 install --user sushy-tools
pip3 freeze --user > requirements.txt
# pip3 install -r requirements.txt --user
mkdir -p wheelhouse
pip3 download -r requirements.txt -d wheelhouse
/bin/cp -f requirements.txt wheelhouse/
tar -zcf wheelhouse.tar.gz wheelhouse
buildah from --name onbuild-container scratch
buildah copy onbuild-container wheelhouse.tar.gz /
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container quay.io/wangzheng422/qimgs:ocp.bm.ipi.python.dep.rhel8-${BUILDNUMBER}
# buildah rm onbuild-container
buildah push quay.io/wangzheng422/qimgs:ocp.bm.ipi.python.dep.rhel8-${BUILDNUMBER}
echo "quay.io/wangzheng422/qimgs:ocp.bm.ipi.python.dep.rhel8-${BUILDNUMBER}"
# quay.io/wangzheng422/qimgs:ocp.bm.ipi.python.dep.rhel8-4.10.4
排错技巧
# login to bootstrap to debug
# find the ip from kvm console
ssh -i ~/.ssh/helper_rsa core@192.168.7.75
journalctl -b -f -u release-image.service -u bootkube.service
journalctl -b -u release-image.service -u bootkube.service | grep -i baremetal
sudo -i
export KUBECONFIG=/etc/kubernetes/kubeconfig
oc get pod -n openshift-machine-api
oc get BareMetalHost -n openshift-machine-api
# debug why bootstrap can't be ping...
cat .openshift_install_state.json | jq '."*bootstrap.Bootstrap"'.Config.storage.files[].path
cat .openshift_install_state.json | jq -r '."*bootstrap.Bootstrap"'.File.Data | base64 -d | jq -r . > ign.json
cat .openshift_install_state.json | jq -r '."*bootstrap.Bootstrap".Config.storage.files[].contents.source ' | sed 's/.*base64,//g' | base64 -d > decode
cat .openshift_install_state.json | jq -r '."*bootstrap.Bootstrap".Config.storage.files[] | .path, .contents.source ' | while read -r line ; do if [[ $line =~ .*base64,.* ]]; then echo $(echo $line | sed 's/.*base64,//g' | base64 -d) ; else echo $line; fi; done > files
cat bootstrap.ign | jq '.storage.files[] | select ( .path == "/opt/openshift/openshift/99_baremetal-provisioning-config.yaml" ) ' | jq -r .contents.source | sed 's/.*base64,//g' | base64 -d
cat bootstrap.ign | jq '.storage.files[] | select ( .path | contains("/opt/openshift/openshift/") ) ' | jq -r .contents.source | sed 's/.*base64,//g' | base64 -d
openshift4.10 acm with ztp disconnected static-ip auto
本文介绍,在openshift4.10上,装ACM组件以后,如何通过zero touch provision的方式,来部署一个单节点openshift4.10的集群(SNO),在部署的过程中,我们模拟离线的网络环境,并且禁止DHCP,只用静态IP。
ZTP(zero touch provision)模式之所以诱人,是因为他只需要baremetal的bmc信息,以及网卡的mac地址,就可以完成集群的部署。ACM会创建一个iso,并调用bmc的api,去挂载这个iso并启动。
本次实验,使用一个半自动流程,就是让ACM创建iso,但是手动用iso启动kvm。整个流程如下:
- 在openshift4上安装ACM组件
- 在ACM上配置cluster, infra env等配置。
- ACM通过网络启动kvm
- kvm自动开始集群安装,但是由于kvm+redfish的限制,需要手动配置,让之后的重启都从硬盘启动。
- 集群安装完成,保存集群登录信息
本次实验的部署架构图:
本次实验有一个前导实验,就是用一个单机版本的assisted install service部署一个SNO集群,这个SNO集群是本次实验部署ACM的基础。这个前导实验如何做,请参见这里。
参考资料:
- https://github.com/jparrill/ztp-the-hard-way/blob/main/docs/connected-ZTP-flow-hub-deployment.md
- https://github.com/jparrill/ztp-the-hard-way/blob/main/docs/disconnected-ZTP-flow-hub-deployment.md
视频讲解
静态变量和 kvm 配置
assisted install 模式下,如果想静态ip安装,需要在实验网络上部署一个dns服务。因为我们部署的是single node openshift,只需要把如下几个域名,指向同一个ip地址就可以。当然,你需要提前想好域名。同时,我们的实验环境里面,其实有2个SNO,所以要配置2套域名。
- acm-demo-hub.redhat.ren
- api.acm-demo-hub.redhat.ren
- api-int.acm-demo-hub.redhat.ren
- *.apps.acm-demo-hub.redhat.ren
- acm-demo-hub-master.acm-demo-hub.redhat.ren
- acm-demo1.redhat.ren
- api.acm-demo1.redhat.ren
- api-int.acm-demo1.redhat.ren
- *.apps.acm-demo1.redhat.ren
- acm-demo1-master.acm-demo1.redhat.ren
我们复用本作者基于上游改的一套ansible脚本来配置这个dns
# on helper
# 做一些配置参数定义
INSTALL_IMAGE_REGISTRY=quaylab.infra.redhat.ren
PULL_SECRET='{"auths":{"registry.redhat.io": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"'${INSTALL_IMAGE_REGISTRY}'": {"auth": "'$( echo -n 'quayadmin:password' | openssl base64 )'","email": "noemail@localhost"}}}'
ACM_DEMO_CLUSTER=acm-demo1
SNO_BASE_DOMAIN=redhat.ren
SNO_IP=192.168.7.15
SNO_GW=192.168.7.1
SNO_NETMAST=255.255.255.0
SNO_NETMAST_S=24
SNO_HOSTNAME=acm-demo1-master
SNO_IF=enp1s0
SNO_IF_MAC=`printf '00:60:2F:%02X:%02X:%02X' $[RANDOM%256] $[RANDOM%256] $[RANDOM%256]`
SNO_DNS=192.168.7.11
SNO_DISK=/dev/vda
SNO_CORE_PWD=redhat
echo ${SNO_IF_MAC} > /data/install/acm.demo1.mac
# back to kvm host
create_lv() {
var_vg=$1
var_lv=$2
var_size=$3
lvremove -f $var_vg/$var_lv
lvcreate -y -L $var_size -n $var_lv $var_vg
wipefs --all --force /dev/$var_vg/$var_lv
}
create_lv vgdata lvacmdemo1 120G
export KVM_DIRECTORY=/data/kvm
mkdir -p ${KVM_DIRECTORY}
cd ${KVM_DIRECTORY}
scp root@192.168.7.11:/data/install/acm.demo1.mac ${KVM_DIRECTORY}/
# on kvm host
# export KVM_DIRECTORY=/data/kvm
virt-install --name=ocp4-acm-demo1-master0 --vcpus=16 --ram=32768 \
--cpu=host-model \
--disk path=/dev/vgdata/lvacmdemo1,device=disk,bus=virtio,format=raw \
--disk device=cdrom \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio,mac=$(<acm.demo1.mac) \
--graphics vnc,port=59013 \
--boot uefi,nvram_template=/usr/share/OVMF/OVMF_VARS.fd,menu=on \
--print-xml > ${KVM_DIRECTORY}/ocp4-acm-demo1.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-acm-demo1.xml
cd /data/kvm/
# for i in master{0..2} worker{0..2}
for i in acm-demo1-master{0..0}
do
echo -ne "${i}\t" ;
virsh dumpxml ocp4-${i} | grep "mac address" | cut -d\' -f2 | tr '\n' '\t'
echo
done > mac.list
cat /data/kvm/mac.list
# acm-demo1-master0 00:60:2f:ee:aa:4e
scp /data/kvm/mac.list root@192.168.7.11:/data/install/
DNS 配置
# back to helper
# set up dns
cd /data/ocp4/ocp4-upi-helpernode-master/
cat << 'EOF' > /data/ocp4/ocp4-upi-helpernode-master/vars.yaml
---
ocp_version: 4.10.4
ssh_gen_key: false
staticips: true
bm_ipi: true
firewalld: false
dns_forward: true
iso:
iso_dl_url: "file:///data/ocp4/rhcos-live.x86_64.iso"
my_iso: "rhcos-live.iso"
helper:
name: "helper"
ipaddr: "192.168.7.11"
networkifacename: "enp1s0"
gateway: "192.168.7.1"
netmask: "255.255.255.0"
dns:
domain: "redhat.ren"
clusterid: "ocp4"
forwarder1: "172.21.1.1"
forwarder2: "172.21.1.1"
api_vip: "192.168.7.100"
ingress_vip: "192.168.7.101"
bootstrap:
name: "bootstrap"
ipaddr: "192.168.7.12"
interface: "enp1s0"
install_drive: "vda"
# macaddr: "52:54:00:7e:f8:f7"
masters:
- name: "master-0"
ipaddr: "192.168.7.13"
interface: "enp1s0"
install_drive: "vda"
# macaddr: "$(cat /data/install/mac.list | grep master0 | awk '{print $2}')"
# - name: "master-1"
# ipaddr: "192.168.7.14"
# interface: "enp1s0"
# install_drive: "vda"
# macaddr: "$(cat /data/install/mac.list | grep master1 | awk '{print $2}')"
# - name: "master-2"
# ipaddr: "192.168.7.15"
# interface: "enp1s0"
# install_drive: "vda"
# macaddr: "$(cat /data/install/mac.list | grep master2 | awk '{print $2}')"
# workers:
# - name: "worker-0"
# ipaddr: "192.168.7.16"
# interface: "enp1s0"
# install_drive: "vda"
# macaddr: "$(cat /data/install/mac.list | grep worker0 | awk '{print $2}')"
# - name: "worker-1"
# ipaddr: "192.168.7.17"
# interface: "enp1s0"
# install_drive: "vda"
# macaddr: "$(cat /data/install/mac.list | grep worker1 | awk '{print $2}')"
# - name: "worker-2"
# ipaddr: "192.168.7.18"
# interface: "enp1s0"
# install_drive: "vda"
# macaddr: "$(cat /data/install/mac.list | grep worker2 | awk '{print $2}')"
others:
- name: "registry"
ipaddr: "192.168.7.103"
- name: "yum"
ipaddr: "172.21.6.103"
- name: "quay"
ipaddr: "172.21.6.103"
- name: "nexus"
ipaddr: "172.21.6.103"
- name: "git"
ipaddr: "172.21.6.103"
otherdomains:
- domain: "infra.redhat.ren"
hosts:
- name: "registry"
ipaddr: "192.168.7.1"
- name: "yum"
ipaddr: "192.168.7.1"
- name: "quay"
ipaddr: "192.168.7.1"
- name: "quaylab"
ipaddr: "192.168.7.1"
- name: "nexus"
ipaddr: "192.168.7.1"
- name: "git"
ipaddr: "192.168.7.1"
- domain: "acm-demo1.redhat.ren"
hosts:
- name: "api"
ipaddr: "192.168.7.15"
- name: "api-int"
ipaddr: "192.168.7.15"
- name: "acm-demo1-master"
ipaddr: "192.168.7.15"
- name: "*.apps"
ipaddr: "192.168.7.15"
- domain: "acm-demo-hub.redhat.ren"
hosts:
- name: "api"
ipaddr: "192.168.7.13"
- name: "api-int"
ipaddr: "192.168.7.13"
- name: "acm-demo-hub-master"
ipaddr: "192.168.7.13"
- name: "*.apps"
ipaddr: "192.168.7.13"
force_ocp_download: false
remove_old_config_files: false
ocp_client: "file:///data/ocp4/{{ ocp_version }}/openshift-client-linux-{{ ocp_version }}.tar.gz"
ocp_installer: "file:///data/ocp4/{{ ocp_version }}/openshift-install-linux-{{ ocp_version }}.tar.gz"
ppc64le: false
arch: 'x86_64'
chronyconfig:
enabled: true
content:
- server: "192.168.7.11"
options: iburst
setup_registry: # don't worry about this, just leave it here
deploy: false
registry_image: docker.io/library/registry:2
local_repo: "ocp4/openshift4"
product_repo: "openshift-release-dev"
release_name: "ocp-release"
release_tag: "4.6.1-x86_64"
ocp_filetranspiler: "file:///data/ocp4/filetranspiler.tgz"
registry_server: "registry.ocp4.redhat.ren:5443"
EOF
ansible-playbook -e @vars.yaml tasks/main.yml
# then following AIS, to install sno using 192.168.7.13
部署CNV
我们部署ACM,是需要存储的,最简单的存储,就是本地目录啦,那我们就需要一个自动的auto provisioner,正好CNV带有一个hostpath auto provisioner,所以作者就犯懒,部署一个CNV,为的是里面的本地目录的自动部署。
# 首先需要一个本地目录
cat << EOF > /data/install/host-path.yaml
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
name: 50-set-selinux-for-hostpath-master
labels:
machineconfiguration.openshift.io/role: master
spec:
config:
ignition:
version: 3.2.0
systemd:
units:
- contents: |
[Unit]
Description=Set SELinux chcon for hostpath baicell
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=-mkdir -p /var/hostpath
ExecStart=chcon -Rt container_file_t /var/hostpath/
[Install]
WantedBy=multi-user.target
enabled: true
name: hostpath-baicell.service
EOF
oc create -f /data/install/host-path.yaml
# install operator OpenShift Virtualization
# active HostPathProvisioner deployment
# https://docs.openshift.com/container-platform/4.9/virt/install/installing-virt-cli.html
cat << EOF > /data/install/cnv.subscript.yaml
apiVersion: v1
kind: Namespace
metadata:
name: openshift-cnv
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: kubevirt-hyperconverged-group
namespace: openshift-cnv
spec:
targetNamespaces:
- openshift-cnv
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: hco-operatorhub
namespace: openshift-cnv
spec:
source: redhat-operators
sourceNamespace: openshift-marketplace
name: kubevirt-hyperconverged
# startingCSV: kubevirt-hyperconverged-operator.v4.9.3
channel: "stable"
EOF
oc create -f /data/install/cnv.subscript.yaml
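After creating the subscription, it is worth checking that the operator actually installed before moving on. A minimal sketch (the CSV name prefix is an assumption and carries a version suffix that differs per environment):
# wait until the CNV operator CSV reports Succeeded
oc get csv -n openshift-cnv | grep kubevirt-hyperconverged
oc get pod -n openshift-cnv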
# create the hostpath provisioner config
cat << EOF > /data/install/host-path-provision.yaml
apiVersion: hostpathprovisioner.kubevirt.io/v1beta1
kind: HostPathProvisioner
metadata:
name: hostpath-provisioner
spec:
imagePullPolicy: IfNotPresent
pathConfig:
path: "/var/hostpath"
useNamingPrefix: false
EOF
oc create -f /data/install/host-path-provision.yaml -n openshift-cnv
# create the storage class config
cat << EOF > /data/install/host-path-storage-class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: hostpath-provisioner
annotations:
storageclass.kubernetes.io/is-default-class: 'true'
provisioner: kubevirt.io/hostpath-provisioner
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF
oc create -f /data/install/host-path-storage-class.yaml
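A quick sanity check that the provisioner and the default storage class are in place (a minimal sketch; the grep pattern for the provisioner pods is an assumption and may differ by CNV version):
oc get sc
oc get pod -n openshift-cnv | grep -i hostpath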
After deployment it looks like this:
Deploy ACM
Next we deploy ACM, using the simplest deployment mode.
# install operator Advanced Cluster Management for Kubernetes
# https://docs.openshift.com/container-platform/4.9/scalability_and_performance/ztp-deploying-disconnected.html#enabling-assisted-installer-service-on-bare-metal_ztp-deploying-disconnected
# https://access.redhat.com/documentation/en-us/red_hat_advanced_cluster_management_for_kubernetes/2.4/html/install/installing#installing-from-the-cli
cat << EOF > /data/install/acm.subscript.yaml
apiVersion: v1
kind: Namespace
metadata:
name: open-cluster-management
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: open-cluster-management-wzh
namespace: open-cluster-management
spec:
targetNamespaces:
- open-cluster-management
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: advanced-cluster-management
namespace: open-cluster-management
spec:
sourceNamespace: openshift-marketplace
source: redhat-operators
channel: release-2.4
installPlanApproval: Automatic
name: advanced-cluster-management
# startingCSV: advanced-cluster-management.v2.4.2
EOF
oc create -f /data/install/acm.subscript.yaml
# RHACM create the MultiClusterHub resource
cat << EOF > /data/install/acm.mch.mch.yaml
apiVersion: operator.open-cluster-management.io/v1
kind: MultiClusterHub
metadata:
name: multiclusterhub
namespace: open-cluster-management
spec: {}
EOF
oc create -f /data/install/acm.mch.mch.yaml
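It takes a while for the hub to come up. A hedged way to watch it from the CLI (the Running phase value is what the author has seen on ACM 2.4; treat the field as an assumption):
# wait until the MultiClusterHub reports phase Running
oc get multiclusterhub multiclusterhub -n open-cluster-management -o jsonpath='{.status.phase}{"\n"}'
oc get pod -n open-cluster-management | grep -Ev 'Running|Completed'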
Once installed it looks like this:
We can access ACM through the web UI: https://multicloud-console.apps.acm-demo-hub.redhat.ren/overview
We can see a local-cluster, which is the cluster that ACM itself runs on:
Deploy an SNO in ZTP mode
If you have already deployed the assisted install service and used AIS to install an SNO, then deploying through ACM in ZTP mode is easy to understand: the whole process is the same. You configure the assisted install service inside ACM, it generates an ISO, and the BMC API is then called to mount that ISO directly and boot the host. A hedged sketch of what that call looks like is shown below.
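For reference, the BMC call is roughly the following at the Redfish level. This is only a hedged sketch against the sushy-tools KVM emulator used later in this document; the manager id, ISO URL and credentials here are placeholders, not values produced by ACM:
# list the virtual media devices exposed by the emulated BMC
curl -k -s -u admin:password https://192.168.7.1:8000/redfish/v1/Managers/<manager-id>/VirtualMedia/ | jq .
# insert (mount) an ISO, after which the host can be booted from it
curl -k -s -u admin:password -X POST \
  -H "Content-Type: application/json" \
  -d '{"Image": "http://192.168.7.11:8080/install/some.iso", "Inserted": true}' \
  https://192.168.7.1:8000/redfish/v1/Managers/<manager-id>/VirtualMedia/Cd/Actions/VirtualMedia.InsertMedia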
Configure the new cluster from the command line
The ACM 2.4 UI does not yet fully support ZTP, so some of the configuration has to be done from the command line; later releases will catch the UI up.
# https://access.redhat.com/documentation/en-us/red_hat_advanced_cluster_management_for_kubernetes/2.4/html-single/clusters/index#infra-env-prerequisites
oc project open-cluster-management
# not strictly needed any more, because this is acm 2.4.2 now
# but it does not seem to matter if you enable it anyway
oc patch hiveconfig hive --type merge -p '{"spec":{"targetNamespace":"hive","logLevel":"debug","featureGates":{"custom":{"enabled":["AlphaAgentInstallStrategy"]},"featureSet":"Custom"}}}'
oc get hiveconfig hive -o yaml
# ......
# spec:
# featureGates:
# custom:
# enabled:
# - AlphaAgentInstallStrategy
# featureSet: Custom
# logLevel: debug
# targetNamespace: hive
# ......
oc patch provisioning provisioning-configuration --type merge -p '{"spec":{"watchAllNamespaces": true }}'
oc get provisioning provisioning-configuration -o yaml
# ......
# spec:
# preProvisioningOSDownloadURLs: {}
# provisioningIP: 192.168.7.103
# provisioningMacAddresses:
# - 00:60:2f:ab:66:f6
# provisioningNetwork: Disabled
# provisioningNetworkCIDR: 192.168.7.0/24
# provisioningOSDownloadURL: http://192.168.7.11:8080/install/rhcos-openstack.x86_64.qcow2.gz?sha256=6b5731d90fa78eb50c07928811675d$f9c1d3f94eca0a94afef17cfbce706ddf
# watchAllNamespaces: true
# ......
cat << EOF > /data/install/acm.ocp.release.yaml
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
name: openshift-v4.10.4
namespace: open-cluster-management
spec:
releaseImage: ${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4:4.10.4-x86_64
EOF
oc create -f /data/install/acm.ocp.release.yaml
oc get ClusterImageSet
# NAME RELEASE
# openshift-v4.10.4 quaylab.infra.redhat.ren/ocp4/openshift4:4.10.4-x86_64
cat << EOF > /data/install/acm.cm.asc.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: assisted-service-config
namespace: open-cluster-management
labels:
app: assisted-service
data:
LOG_LEVEL: "debug"
EOF
oc create -f /data/install/acm.cm.asc.yaml
cat << EOF > /data/install/acm.secret.yaml
apiVersion: v1
kind: Secret
metadata:
name: assisted-deployment-pull-secret
namespace: open-cluster-management
stringData:
.dockerconfigjson: '$PULL_SECRET'
EOF
oc create -f /data/install/acm.secret.yaml
# oc get pod -A | grep metal3
# the result is empty, so we will go in manual way
oc get pod -A | grep metal3
# openshift-machine-api metal3-697fb46867-8zxxw 7/7 Running 8 (42m ago) 4h40m
# openshift-machine-api metal3-image-cache-hhvnm 1/1 Running 1 4h40m
# openshift-machine-api metal3-image-customization-577f886bb4-cwl2l 1/1 Running 1 4h40m
# curl -s https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/4.9.12/release.txt | grep 'machine-os '
cat /data/ocp4/4.10.4/release.txt | grep 'machine-os '
# machine-os 410.84.202203081640-0 Red Hat Enterprise Linux CoreOS
cat << EOF > /data/install/acm.mirror.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: hyper1-mirror-config
namespace: open-cluster-management
labels:
app: assisted-service
data:
ca-bundle.crt: |
$( cat /etc/crts/redhat.ren.ca.crt | sed 's/^/ /g' )
registries.conf: |
unqualified-search-registries = ["registry.access.redhat.com", "docker.io"]
[[registry]]
prefix = ""
location = "quay.io/openshift-release-dev/ocp-release"
mirror-by-digest-only = true
[[registry.mirror]]
location = "${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4"
[[registry]]
prefix = ""
location = "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
mirror-by-digest-only = true
[[registry.mirror]]
location = "${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4"
---
EOF
oc create -f /data/install/acm.mirror.yaml
cat << EOF > /data/install/acm.agentservicecofnig.yaml
apiVersion: agent-install.openshift.io/v1beta1
kind: AgentServiceConfig
metadata:
name: agent
namespace: open-cluster-management
### This is the annotation that injects modifications in the Assisted Service pod
annotations:
unsupported.agent-install.openshift.io/assisted-service-configmap: "assisted-service-config"
###
spec:
databaseStorage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 40Gi
filesystemStorage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 40Gi
### This is a ConfigMap that only will make sense on Disconnected environments
mirrorRegistryRef:
name: "hyper1-mirror-config"
###
osImages:
- openshiftVersion: "4.10"
version: "410.84.202203081640-0"
url: "http://192.168.7.11:8080/install/live.iso"
rootFSUrl: "http://192.168.7.11:8080/install/rootfs.img.4.9"
cpuArchitecture: x86_64
EOF
oc create -f /data/install/acm.agentservicecofnig.yaml
# oc delete -f /data/install/acm.agentservicecofnig.yaml
oc get AgentServiceConfig/agent -n open-cluster-management -o yaml
# ......
# status:
# conditions:
# - lastTransitionTime: "2022-04-06T09:38:21Z"
# message: AgentServiceConfig reconcile completed without error.
# reason: ReconcileSucceeded
# status: "True"
# type: ReconcileCompleted
# logs in infrastructure-operator
# stop here, and wait until the assisted-service pods reach Running status
oc get pod -n open-cluster-management | grep assisted
# assisted-image-service-b686cf67d-4hs2t 1/1 Running 0 44s
# assisted-service-7476bfdd8c-lnnn8 2/2 Running 0 44s
# begin to create new cluster
oc create ns ${ACM_DEMO_CLUSTER}
oc project ${ACM_DEMO_CLUSTER}
cat << EOF > /data/install/acm.managed.secret.yaml
apiVersion: v1
kind: Secret
metadata:
name: assisted-deployment-pull-secret
namespace: ${ACM_DEMO_CLUSTER}
stringData:
.dockerconfigjson: '$PULL_SECRET'
EOF
oc create -f /data/install/acm.managed.secret.yaml
cat << EOF > /data/install/acm.nmsc.yaml
apiVersion: agent-install.openshift.io/v1beta1
kind: NMStateConfig
metadata:
name: ${ACM_DEMO_CLUSTER}
namespace: ${ACM_DEMO_CLUSTER}
labels:
nmstate-conf-cluster-name: ${ACM_DEMO_CLUSTER}
spec:
config:
interfaces:
- name: ${SNO_IF}
type: ethernet
state: up
ipv4:
enabled: true
address:
- ip: ${SNO_IP}
prefix-length: ${SNO_NETMAST_S}
dhcp: false
dns-resolver:
config:
server:
- ${SNO_DNS}
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: ${SNO_GW}
next-hop-interface: ${SNO_IF}
table-id: 254
interfaces:
- name: "${SNO_IF}"
macAddress: ${SNO_IF_MAC}
EOF
oc create -f /data/install/acm.nmsc.yaml
cat << EOF > /data/install/acm.clusterdeployment.yaml
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
name: ${ACM_DEMO_CLUSTER}
namespace: ${ACM_DEMO_CLUSTER}
spec:
baseDomain: ${SNO_BASE_DOMAIN}
clusterName: ${ACM_DEMO_CLUSTER}
controlPlaneConfig:
servingCertificates: {}
installed: false
clusterInstallRef:
group: extensions.hive.openshift.io
kind: AgentClusterInstall
name: ${ACM_DEMO_CLUSTER}
version: v1beta1
platform:
agentBareMetal:
agentSelector:
matchLabels:
cluster-name: "${ACM_DEMO_CLUSTER}"
pullSecretRef:
name: assisted-deployment-pull-secret
EOF
oc create -f /data/install/acm.clusterdeployment.yaml
oc get ClusterDeployment/${ACM_DEMO_CLUSTER} -n ${ACM_DEMO_CLUSTER} -o json | jq .status | head
# {
# "conditions": [
# {
# "lastProbeTime": "2022-04-08T14:58:11Z",
# "lastTransitionTime": "2022-04-08T14:58:11Z",
# "message": "Platform credentials passed authentication check",
# "reason": "PlatformAuthSuccess",
# "status": "False",
# "type": "AuthenticationFailure"
# },
cat << EOF > /data/install/acm.agentclusterinstall.yaml
apiVersion: extensions.hive.openshift.io/v1beta1
kind: AgentClusterInstall
metadata:
name: ${ACM_DEMO_CLUSTER}
namespace: ${ACM_DEMO_CLUSTER}
# Only include the annotation if using OVN, otherwise omit the annotation
# annotations:
# agent-install.openshift.io/install-config-overrides: '{"networking":{"networkType":"OVNKubernetes"}}'
spec:
clusterDeploymentRef:
name: ${ACM_DEMO_CLUSTER}
imageSetRef:
name: openshift-v4.10.4
networking:
clusterNetwork:
- cidr: "10.128.0.0/14"
hostPrefix: 23
serviceNetwork:
- "172.30.0.0/16"
machineNetwork:
- cidr: "192.168.7.0/24"
provisionRequirements:
controlPlaneAgents: 1
sshPublicKey: "$(< ~/.ssh/id_rsa.pub)"
EOF
oc create -f /data/install/acm.agentclusterinstall.yaml
# oc delete -f /data/install/acm.agentclusterinstall.yaml
# wait a moment, and this will be ok
oc get AgentClusterInstall/${ACM_DEMO_CLUSTER} -n ${ACM_DEMO_CLUSTER} -o json | jq .status | head
# {
# "conditions": [
# {
# "lastProbeTime": "2022-04-06T13:50:43Z",
# "lastTransitionTime": "2022-04-06T13:50:43Z",
# "message": "SyncOK",
# "reason": "SyncOK",
# "status": "True",
# "type": "SpecSynced"
# },
cat << EOF > /data/install/acm.klusterletaddonconfig.yaml
apiVersion: agent.open-cluster-management.io/v1
kind: KlusterletAddonConfig
metadata:
name: ${ACM_DEMO_CLUSTER}
namespace: ${ACM_DEMO_CLUSTER}
spec:
clusterName: ${ACM_DEMO_CLUSTER}
clusterNamespace: ${ACM_DEMO_CLUSTER}
clusterLabels:
cloud: auto-detect
vendor: auto-detect
applicationManager:
enabled: true
certPolicyController:
enabled: true
iamPolicyController:
enabled: true
policyController:
enabled: true
searchCollector:
enabled: true
EOF
oc create -f /data/install/acm.klusterletaddonconfig.yaml
oc get KlusterletAddonConfig/${ACM_DEMO_CLUSTER} -n ${ACM_DEMO_CLUSTER} -o yaml
# apiVersion: agent.open-cluster-management.io/v1
# kind: KlusterletAddonConfig
# metadata:
# creationTimestamp: "2022-04-06T13:51:19Z"
# generation: 1
# name: acm-demo1
# namespace: acm-demo1
# resourceVersion: "2187935"
# uid: 1615ed53-80a3-48d2-823f-8dff08a97d75
# spec:
# applicationManager:
# enabled: true
# certPolicyController:
# enabled: true
# clusterLabels:
# cloud: auto-detect
# vendor: auto-detect
# clusterName: acm-demo1
# clusterNamespace: acm-demo1
# iamPolicyController:
# enabled: true
# policyController:
# enabled: true
# searchCollector:
# enabled: true
# status:
# conditions:
# - lastTransitionTime: "2022-04-06T13:51:19Z"
# message: The cluster is not provisioned by ACM.
# reason: OCPGlobalProxyNotDetected
# status: "False"
# type: OCPGlobalProxyDetected
# ocpGlobalProxy: {}
cat << EOF > /data/install/acm.managedcluster.yaml
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
name: ${ACM_DEMO_CLUSTER}
spec:
hubAcceptsClient: true
EOF
oc create -f /data/install/acm.managedcluster.yaml
# we are doing a disconnected install, so we need to customize the boot ignition config
# generate the ignition
cat << EOF > /data/sno/ign.base.json
{
"ignition": {
"version": "3.1.0"
}
}
EOF
cat << EOF > /data/sno/install.images.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-install-images
storage:
files:
- path: /etc/containers/registries.conf.d/base.registries.conf
overwrite: true
contents:
inline: |
unqualified-search-registries = ["registry.access.redhat.com", "docker.io"]
short-name-mode = ""
[[registry]]
prefix = ""
location = "quay.io/openshift-release-dev/ocp-release"
mirror-by-digest-only = true
[[registry.mirror]]
location = "${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4"
[[registry]]
prefix = ""
location = "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
mirror-by-digest-only = true
[[registry.mirror]]
location = "${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4"
EOF
cat << EOF > /data/sno/install.crts.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-install-crts
storage:
files:
- path: /etc/pki/ca-trust/source/anchors/quaylab.crt
overwrite: true
contents:
inline: |
$( cat /etc/crts/redhat.ren.ca.crt | sed 's/^/ /g' )
EOF
mkdir -p /data/sno/disconnected/
# copy ntp related config
/bin/cp -f /data/ocp4/ocp4-upi-helpernode-master/machineconfig/* /data/sno/disconnected/
# copy image registry proxy related config
cd /data/ocp4
bash image.registries.conf.sh nexus.infra.redhat.ren:8083
/bin/cp -f /data/ocp4/99-worker-container-registries.yaml /data/sno/disconnected/
/bin/cp -f /data/ocp4/99-master-container-registries.yaml /data/sno/disconnected/
cd /data/sno/
# load ignition file generation function
source /data/ocp4/acm.fn.sh
get_file_content_for_ignition "/opt/openshift/openshift/99-master-chrony-configuration.yaml" "/data/sno/disconnected/99-master-chrony-configuration.yaml"
VAR_99_master_chrony=$RET_VAL
VAR_99_master_chrony_2=$RET_VAL_2
get_file_content_for_ignition "/opt/openshift/openshift/99-worker-chrony-configuration.yaml" "/data/sno/disconnected/99-worker-chrony-configuration.yaml"
VAR_99_worker_chrony=$RET_VAL
VAR_99_worker_chrony_2=$RET_VAL_2
get_file_content_for_ignition "/opt/openshift/openshift/99-master-container-registries.yaml" "/data/sno/disconnected/99-master-container-registries.yaml"
VAR_99_master_container_registries=$RET_VAL
VAR_99_master_container_registries_2=$RET_VAL_2
get_file_content_for_ignition "/opt/openshift/openshift/99-worker-container-registries.yaml" "/data/sno/disconnected/99-worker-container-registries.yaml"
VAR_99_worker_container_registries=$RET_VAL
VAR_99_worker_container_registries_2=$RET_VAL_2
butane /data/sno/install.images.bu > /data/sno/disconnected/99-zzz-master-install-images.yaml
get_file_content_for_ignition "/opt/openshift/openshift/99-zzz-master-install-images.yaml" "/data/sno/disconnected/99-zzz-master-install-images.yaml"
VAR_99_master_install_images=$RET_VAL
VAR_99_master_install_images_2=$RET_VAL_2
butane /data/sno/install.crts.bu > /data/sno/disconnected/99-zzz-master-install-crts.yaml
get_file_content_for_ignition "/opt/openshift/openshift/99-zzz-master-install-crts.yaml" "/data/sno/disconnected/99-zzz-master-install-crts.yaml"
VAR_99_master_install_crts=$RET_VAL
VAR_99_master_install_crts_2=$RET_VAL_2
# https://access.redhat.com/solutions/6194821
# butane /data/sno/static.ip.bu | python3 -c 'import json, yaml, sys; print(json.dumps(yaml.load(sys.stdin)))'
# https://stackoverflow.com/questions/2854655/command-to-escape-a-string-in-bash
# VAR_PULL_SEC=`printf "%q" $(cat /data/pull-secret.json)`
# https://access.redhat.com/solutions/221403
# VAR_PWD_HASH="$(openssl passwd -1 -salt 'openshift' 'redhat')"
VAR_PWD_HASH="$(python3 -c 'import crypt,getpass; print(crypt.crypt("redhat"))')"
tmppath=$(mktemp)
cat /data/sno/ign.base.json \
| jq --arg VAR "$VAR_PWD_HASH" --arg VAR_SSH "$NODE_SSH_KEY" '.passwd.users += [{ "name": "wzh", "system": true, "passwordHash": $VAR , "sshAuthorizedKeys": [ $VAR_SSH ], "groups": [ "adm", "wheel", "sudo", "systemd-journal" ] }]' \
| jq --argjson VAR "$VAR_99_master_chrony" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_worker_chrony" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_container_registries" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_worker_container_registries" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_chrony_2" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_container_registries_2" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_install_images_2" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_install_crts_2" '.storage.files += [$VAR] ' \
| jq -c . \
> ${tmppath}
VAR_IGNITION=$(cat ${tmppath})
rm -f ${tmppath}
cat << EOF > /data/install/acm.infraenv.yaml
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
name: ${ACM_DEMO_CLUSTER}
namespace: ${ACM_DEMO_CLUSTER}
spec:
additionalNTPSources:
- 192.168.7.11
clusterRef:
name: ${ACM_DEMO_CLUSTER}
namespace: ${ACM_DEMO_CLUSTER}
sshAuthorizedKey: "$(< ~/.ssh/id_rsa.pub)"
pullSecretRef:
name: assisted-deployment-pull-secret
ignitionConfigOverride: '${VAR_IGNITION}'
nmStateConfigLabelSelector:
matchLabels:
nmstate-conf-cluster-name: ${ACM_DEMO_CLUSTER}
# imageType: "full-iso"
EOF
oc create -f /data/install/acm.infraenv.yaml
# oc delete -f /data/install/acm.infraenv.yaml
oc get infraenv/${ACM_DEMO_CLUSTER} -n ${ACM_DEMO_CLUSTER} -o json | jq .status
# {
# "agentLabelSelector": {
# "matchLabels": {
# "infraenvs.agent-install.openshift.io": "acm-demo1"
# }
# },
# "conditions": [
# {
# "lastTransitionTime": "2022-04-06T13:52:54Z",
# "message": "Image has been created",
# "reason": "ImageCreated",
# "status": "True",
# "type": "ImageCreated"
# }
# ],
# "createdTime": "2022-04-06T13:52:54Z",
# "debugInfo": {
# "eventsURL": ""
# },
# "isoDownloadURL": "https://assisted-image-service-open-cluster-management.apps.acm-demo-hub.redhat.ren/images/a87141aa-d980-4f34-ba59-d236e2158c98?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJpbmZyYV9lbnZfaWQiOiJhODcxNDFhYS1kOTgwLTRmMzQtYmE1OS1kMjM2ZTIxNThjOTgifQ.muD_hlhMIgcNaAZk00M09QW-EwI1REGkxavKo26P-CZ_IkPR3GcdPhWLVtBjdTkrcAOgt__pcWkmJQyko5sqtw&arch=x86_64&type=minimal-iso&version=4.10"
# }
# VAR_ISO=`oc get infraenv ${ACM_DEMO_CLUSTER} -n ${ACM_DEMO_CLUSTER} -o jsonpath={.status.isoDownloadURL}`
# cd /data/install/
# wget --no-check-certificate -O acm.demo1.iso $VAR_ISO
oc get pod -A | grep metal3
# openshift-machine-api metal3-6b7b4665f6-knwzr 7/7 Running 0 39m
# openshift-machine-api metal3-image-cache-hhvnm 1/1 Running 1 6h14m
# openshift-machine-api metal3-image-customization-577f886bb4-cwl2l 1/1 Running 1 6h13m
cd /data/ocp4/
cat << 'EOF' > redfish.sh
#!/usr/bin/env bash
curl -k -s https://192.168.7.1:8000/redfish/v1/Systems/ | jq -r '.Members[]."@odata.id"' > list
while read -r line; do
curl -k -s https://192.168.7.1:8000/$line | jq -j '.Id, " ", .Name, "\n" '
done < list
EOF
bash redfish.sh > /data/install/vm.list
cat /data/install/vm.list
# 075b17f7-9be9-4576-8d72-2ddd99909e19 ocp4-acm-demo1-master0
# c991312a-26de-438d-8c2d-6aa6cd586bca ocp4-master0
# e70f66bc-7878-4617-811d-89cdaf62cc8c ocp4-Helper
# oc patch provisioning provisioning-configuration --type merge -p '{"spec":{"watchAllNamespaces": true}}'
cat << EOF > /data/install/acm.demo.secret.bmc.yaml
apiVersion: v1
kind: Secret
metadata:
name: ${ACM_DEMO_CLUSTER}-bmc-master-0
namespace: ${ACM_DEMO_CLUSTER}
data:
password: $(echo password | base64)
username: $(echo admin | base64)
type: Opaque
EOF
oc create -f /data/install/acm.demo.secret.bmc.yaml
cat << EOF > /data/install/acm.demo.bmh.master.yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: ${ACM_DEMO_CLUSTER}-master0
namespace: ${ACM_DEMO_CLUSTER}
labels:
infraenvs.agent-install.openshift.io: "${ACM_DEMO_CLUSTER}"
annotations:
## Disable the Introspection
inspect.metal3.io: disabled
## Set Static Hostname
bmac.agent-install.openshift.io/hostname: "${SNO_HOSTNAME}"
## Set Static Role
bmac.agent-install.openshift.io/role: "master"
spec:
online: true
bmc:
address: redfish-virtualmedia://192.168.7.1:8000/redfish/v1/Systems/$(cat /data/install/vm.list | grep acm-demo1-master0 | awk '{print $1}')
credentialsName: ${ACM_DEMO_CLUSTER}-bmc-master-0
disableCertificateVerification: true
bootMACAddress: $(cat /data/install/mac.list | grep acm-demo1-master0 | awk '{print $2}')
automatedCleaningMode: disabled
EOF
oc create -f /data/install/acm.demo.bmh.master.yaml
# oc delete -f /data/install/acm.demo.bmh.master.yaml
Back in the ACM console, under Infrastructure we can now see the newly created host, and we can see ACM configuring this KVM guest through redfish.
This bare metal host is actually created by calling services on the openshift4 platform, so it can also be seen from the openshift4 console:
The openshift4 console shows the detailed information of this bare metal host:
Back in the ACM console, we can see the installation continuing:
From ACM's cluster view, we can see the detailed installation progress:
Halfway through the installation, however, we are prompted to intervene manually. This is because we are simulating the physical machine with KVM and emulating a redfish endpoint, and this redfish emulation is fairly basic: during the ocp installation the KVM guest reboots, but the remotely mounted ISO is not ejected, so we have to remove the virtual CD drive ourselves and then let the installation continue:
Open the KVM console and adjust the boot order:
Then reboot the KVM guest and wait a while; the infra env part of the installation completes.
The cluster itself, however, is still installing, so we just wait for the installation to finish. A hedged way to watch the progress from the CLI is sketched below.
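A hedged sketch of watching the install from the CLI while waiting (the debugInfo.state field is what the assisted installer exposed in this environment; field names may vary between versions):
# overall state of the agent-based install
oc get agentclusterinstall ${ACM_DEMO_CLUSTER} -n ${ACM_DEMO_CLUSTER} -o jsonpath='{.status.debugInfo.state}{"\n"}'
# or inspect the Completed condition
oc get agentclusterinstall ${ACM_DEMO_CLUSTER} -n ${ACM_DEMO_CLUSTER} -o json | jq '.status.conditions[] | select(.type=="Completed")'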
Installation complete
Once everything is installed, we can see the following in ACM: https://multicloud-console.apps.acm-demo-hub.redhat.ren/overview
The clusters now show a green, healthy status; here local-cluster is the cluster the ACM hub runs on:
The cluster's detailed information also looks healthy:
⚠️ Be sure to download the kubeconfig file and the password.
The cluster's node tab now has content:
The cluster add-ons we configured earlier are installed as well:
The infra env is green too: https://multicloud-console.apps.acm-demo-hub.redhat.ren/multicloud/infra-environments
Its detailed information is the same as before:
The hosts tab is also complete.
# on helper
useradd -m wzh
su - wzh
mkdir ~/auth
# upload kubeconfig.json to /home/wzh/auth/
ansible localhost -m lineinfile -a 'path=$HOME/.bashrc regexp="^export KUBECONFIG" line="export KUBECONFIG=~/auth/kubeconfig.json"'
source $HOME/.bashrc
check whether a DHCP server exists
We are installing with static IPs, so we need to confirm that DHCP is really turned off in the environment; the check is done as follows.
https://superuser.com/questions/750359/check-if-a-dhcp-server-existing-in-my-network-using-bash
dnf install nmap -y
# check for IPv4 and IPv6 DHCP servers respectively
nmap --script broadcast-dhcp-discover -e enp1s0
nmap --script broadcast-dhcp6-discover -e enp1s0
end
# reverse the order (delete in the opposite order of creation)
tac << EOF
oc delete -f /data/install/acm.ocp.release.yaml
oc delete -f /data/install/acm.cm.asc.yaml
oc delete -f /data/install/acm.secret.yaml
oc delete -f /data/install/acm.mirror.yaml
oc delete -f /data/install/acm.agentservicecofnig.yaml
oc delete -f /data/install/acm.managed.secret.yaml
oc delete -f /data/install/acm.agentclusterinstall.yaml
oc delete -f /data/install/acm.nmsc.yaml
oc delete -f /data/install/acm.clusterdeployment.yaml
oc delete -f /data/install/acm.klusterletaddonconfig.yaml
oc delete -f /data/install/acm.managedcluster.yaml
oc delete -f /data/install/acm.infraenv.yaml
EOF
oc delete -f /data/install/acm.infraenv.yaml
oc delete -f /data/install/acm.managedcluster.yaml
oc delete -f /data/install/acm.klusterletaddonconfig.yaml
oc delete -f /data/install/acm.clusterdeployment.yaml
oc delete -f /data/install/acm.nmsc.yaml
oc delete -f /data/install/acm.agentclusterinstall.yaml
oc delete -f /data/install/acm.managed.secret.yaml
oc delete -f /data/install/acm.agentservicecofnig.yaml
oc delete -f /data/install/acm.mirror.yaml
oc delete -f /data/install/acm.secret.yaml
oc delete -f /data/install/acm.cm.asc.yaml
oc delete -f /data/install/acm.ocp.release.yaml
coreos boot analysis
We know that coreos packages and upgrades the kernel and the root filesystem together, the so-called A/B switch upgrade. How exactly does it implement this? Let's analyze it now.
Video walkthrough
First, let's look at the /boot partition
ls -hl /boot/loader/entries/
total 4.0K
-rw-r--r--. 1 root root 629 Apr 9 11:57 ostree-1-rhcos.conf
-rw-r--r--. 1 root root 630 Apr 9 11:57 ostree-2-rhcos.conf
cat /boot/loader/entries/*.conf
title Red Hat Enterprise Linux CoreOS 49.84.202110081407-0 (Ootpa) (ostree:1)
version 1
options random.trust_cpu=on console=tty0 console=ttyS0,115200n8 ignition.platform.id=metal $ignition_firstboot ostree=/ostree/boot.0/rhcos/a10b07df1aa66c008cd3b9acb17d765f0755702cadfa0090155dced4d2e9bfe0/0 ip=enp1s0:dhcp root=UUID=0a0d4701-04bf-45a2-8b9b-f761542a617a rw rootflags=prjquota
linux /ostree/rhcos-a10b07df1aa66c008cd3b9acb17d765f0755702cadfa0090155dced4d2e9bfe0/vmlinuz-4.18.0-305.19.1.el8_4.x86_64
initrd /ostree/rhcos-a10b07df1aa66c008cd3b9acb17d765f0755702cadfa0090155dced4d2e9bfe0/initramfs-4.18.0-305.19.1.el8_4.x86_64.img
title Red Hat Enterprise Linux CoreOS 410.84.202203081640-0 (Ootpa) (ostree:0)
version 2
options random.trust_cpu=on console=tty0 console=ttyS0,115200n8 ignition.platform.id=metal $ignition_firstboot ostree=/ostree/boot.0/rhcos/838cd9a10892dbd5e32ffdbec249a4c0db18f6d1c56f416f7a59a2f806f55941/0 ip=enp1s0:dhcp root=UUID=0a0d4701-04bf-45a2-8b9b-f761542a617a rw rootflags=prjquota
linux /ostree/rhcos-838cd9a10892dbd5e32ffdbec249a4c0db18f6d1c56f416f7a59a2f806f55941/vmlinuz-4.18.0-305.40.1.el8_4.x86_64
initrd /ostree/rhcos-838cd9a10892dbd5e32ffdbec249a4c0db18f6d1c56f416f7a59a2f806f55941/initramfs-4.18.0-305.40.1.el8_4.x86_64.img
We can clearly see that two boot entries are defined here, and each entry points to its own kernel and initramfs under /boot/ostree/. The rpm-ostree view of the same two deployments is sketched after the references.
References:
- https://access.redhat.com/solutions/5847011
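The same two deployments can also be inspected from the ostree side on the node itself. A minimal sketch (output omitted):
# show the booted deployment and the rollback deployment, matching the two boot entries above
rpm-ostree status
ostree admin status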
Next, let's look at the mounts
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sr0 11:0 1 104M 0 rom
vda 252:0 0 120G 0 disk
├─vda1 252:1 0 1M 0 part
├─vda2 252:2 0 127M 0 part
├─vda3 252:3 0 384M 0 part /boot
└─vda4 252:4 0 119.5G 0 part /sysroot
mount | grep vda4
/dev/vda4 on /sysroot type xfs (ro,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota)
/dev/vda4 on / type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota)
/dev/vda4 on /etc type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota)
/dev/vda4 on /usr type xfs (ro,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota)
/dev/vda4 on /var type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota)
/dev/vda4 on /var/lib/containers/storage/overlay type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota)
/dev/vda4 on /var/lib/kubelet/pods/80389395-c0f4-4342-a2ee-2b8c31dbbdbc/volume-subpaths/etc/tuned/1 type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota)
/dev/vda4 on /var/lib/kubelet/pods/80389395-c0f4-4342-a2ee-2b8c31dbbdbc/volume-subpaths/etc/tuned/2 type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota)
/dev/vda4 on /var/lib/kubelet/pods/80389395-c0f4-4342-a2ee-2b8c31dbbdbc/volume-subpaths/etc/tuned/3 type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota)
/dev/vda4 on /var/lib/kubelet/pods/80389395-c0f4-4342-a2ee-2b8c31dbbdbc/volume-subpaths/etc/tuned/4 type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota)
/dev/vda4 on /var/lib/kubelet/pods/80389395-c0f4-4342-a2ee-2b8c31dbbdbc/volume-subpaths/etc/tuned/5 type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota)
This looks confusing: the mount output shows that vda4 is mounted many times, each time at a different path. Why is that?
cat /proc/1/mountinfo | grep vda4
99 102 252:4 / /sysroot ro,relatime - xfs /dev/vda4 rw,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota
102 1 252:4 /ostree/deploy/rhcos/deploy/b1df1247e3ad53173c1e13a913ec645d48a22f6a294e70e2ca5bda8c31f78d78.0 / rw,relatime shared:1 - xfs /dev/vda4 rw,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota
103 102 252:4 /ostree/deploy/rhcos/deploy/b1df1247e3ad53173c1e13a913ec645d48a22f6a294e70e2ca5bda8c31f78d78.0/etc /etc rw,relatime shared:2 - xfs /dev/vda4 rw,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota
104 102 252:4 /ostree/deploy/rhcos/deploy/b1df1247e3ad53173c1e13a913ec645d48a22f6a294e70e2ca5bda8c31f78d78.0/usr /usr ro,relatime shared:3 - xfs /dev/vda4 rw,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota
133 102 252:4 /ostree/deploy/rhcos/var /var rw,relatime shared:4 - xfs /dev/vda4 rw,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota
299 133 252:4 /ostree/deploy/rhcos/var/lib/containers/storage/overlay /var/lib/containers/storage/overlay rw,relatime - xfs /dev/vda4 rw,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota
7886 133 252:4 /ostree/deploy/rhcos/deploy/b1df1247e3ad53173c1e13a913ec645d48a22f6a294e70e2ca5bda8c31f78d78.0/etc/modprobe.d /var/lib/kubelet/pods/80389395-c0f4-4342-a2ee-2b8c31dbbdbc/volume-subpaths/etc/tuned/1 rw,relatime shared:2 - xfs /dev/vda4 rw,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota
5920 133 252:4 /ostree/deploy/rhcos/deploy/b1df1247e3ad53173c1e13a913ec645d48a22f6a294e70e2ca5bda8c31f78d78.0/etc/sysconfig /var/lib/kubelet/pods/80389395-c0f4-4342-a2ee-2b8c31dbbdbc/volume-subpaths/etc/tuned/2 rw,relatime shared:2 - xfs /dev/vda4 rw,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota
7429 133 252:4 /ostree/deploy/rhcos/deploy/b1df1247e3ad53173c1e13a913ec645d48a22f6a294e70e2ca5bda8c31f78d78.0/etc/sysctl.d /var/lib/kubelet/pods/80389395-c0f4-4342-a2ee-2b8c31dbbdbc/volume-subpaths/etc/tuned/3 rw,relatime shared:2 - xfs /dev/vda4 rw,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota
7965 133 252:4 /ostree/deploy/rhcos/deploy/b1df1247e3ad53173c1e13a913ec645d48a22f6a294e70e2ca5bda8c31f78d78.0/etc/sysctl.conf /var/lib/kubelet/pods/80389395-c0f4-4342-a2ee-2b8c31dbbdbc/volume-subpaths/etc/tuned/4 rw,relatime shared:2 - xfs /dev/vda4 rw,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota
8491 133 252:4 /ostree/deploy/rhcos/deploy/b1df1247e3ad53173c1e13a913ec645d48a22f6a294e70e2ca5bda8c31f78d78.0/etc/systemd /var/lib/kubelet/pods/80389395-c0f4-4342-a2ee-2b8c31dbbdbc/volume-subpaths/etc/tuned/5 rw,relatime shared:2 - xfs /dev/vda4 rw,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota
The answer is in /proc/1/mountinfo. Let's look carefully at its contents, especially the mounts of the root filesystem.
- In the first line, /dev/vda4 is the device, xfs is the filesystem on that device, / is the directory within that device being mounted, and /sysroot is the directory it is mounted on in the current mount namespace.
- In the second line, /dev/vda4 is the device, xfs is the filesystem on it, /ostree/deploy/rhcos/deploy/b1df1247e3ad53173c1e13a913ec645d48a22f6a294e70e2ca5bda8c31f78d78.0 is the directory within the device being mounted, and / is the directory it is mounted on in the current mount namespace.
So, to summarize: the directory layout on /dev/vda4 is not the usual root layout; after the system boots, the key paths are remounted back into their expected places. The small helper below shows how to read those mountinfo columns.
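A small helper for reading /proc/1/mountinfo: field 4 is the directory inside the device (the bind-mount source) and field 5 is the mount point in the current namespace. A minimal sketch:
# print "source dir on vda4 -> mount point" for every vda4 mount
grep vda4 /proc/1/mountinfo | awk '{print $4, "->", $5}'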
find /sysroot -maxdepth 3
/sysroot
/sysroot/boot
/sysroot/ostree
/sysroot/ostree/repo
/sysroot/ostree/repo/config
/sysroot/ostree/repo/tmp
/sysroot/ostree/repo/extensions
/sysroot/ostree/repo/state
/sysroot/ostree/repo/refs
/sysroot/ostree/repo/objects
/sysroot/ostree/repo/.lock
/sysroot/ostree/deploy
/sysroot/ostree/deploy/rhcos
/sysroot/ostree/boot.0.1
/sysroot/ostree/boot.0.1/rhcos
/sysroot/ostree/boot.0
/sysroot/.coreos-aleph-version.json
Investigate the systemd units related to mounting the filesystems
systemctl cat ostree-remount.service
[Unit]
Description=OSTree Remount OS/ Bind Mounts
Documentation=man:ostree(1)
DefaultDependencies=no
ConditionKernelCommandLine=ostree
OnFailure=emergency.target
Conflicts=umount.target
# Run after core mounts
After=-.mount var.mount
After=systemd-remount-fs.service
# But we run *before* most other core bootup services that need write access to /etc and /var
Before=local-fs.target umount.target
Before=systemd-random-seed.service plymouth-read-write.service systemd-journal-flush.service
Before=systemd-tmpfiles-setup.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/lib/ostree/ostree-remount
StandardInput=null
StandardOutput=journal
StandardError=journal+console
[Install]
WantedBy=local-fs.target
systemctl list-unit-files | grep mount
proc-sys-fs-binfmt_misc.automount static
boot.mount generated
dev-hugepages.mount static
dev-mqueue.mount static
proc-fs-nfsd.mount static
proc-sys-fs-binfmt_misc.mount static
run-vmblock\x2dfuse.mount disabled
sys-fs-fuse-connections.mount static
sys-kernel-config.mount static
sys-kernel-debug.mount static
tmp.mount disabled
var-lib-nfs-rpc_pipefs.mount static
var.mount generated
dracut-mount.service static
dracut-pre-mount.service static
nfs-mountd.service static
ostree-remount.service disabled
systemd-remount-fs.service static
umount.target static
systemctl cat dracut-mount.service
# /usr/lib/systemd/system/../../dracut/modules.d/98dracut-systemd/dracut-mount.service
# This file is part of dracut.
#
# See dracut.bootup(7) for details
[Unit]
Description=dracut mount hook
Documentation=man:dracut-mount.service(8)
After=initrd-root-fs.target initrd-parse-etc.service
After=dracut-initqueue.service dracut-pre-mount.service
ConditionPathExists=/usr/lib/initrd-release
ConditionDirectoryNotEmpty=|/lib/dracut/hooks/mount
ConditionKernelCommandLine=|rd.break=mount
DefaultDependencies=no
Conflicts=shutdown.target emergency.target
[Service]
Environment=DRACUT_SYSTEMD=1
Environment=NEWROOT=/sysroot
Type=oneshot
ExecStart=-/bin/dracut-mount
StandardInput=null
StandardOutput=syslog
StandardError=syslog+console
KillMode=process
RemainAfterExit=yes
# Bash ignores SIGTERM, so we send SIGHUP instead, to ensure that bash
# terminates cleanly.
KillSignal=SIGHUP
References:
- https://man7.org/linux/man-pages/man7/dracut.bootup.7.html
- https://ostreedev.github.io/ostree/adapting-existing/#booting-and-initramfs-technology
openshift 4.10 single node, installer-based install, disconnected with static IP
openshift single node can be installed with the installer, but many customers run into problems doing so, so let's try it here.
This article has a prerequisite lab: creating a helper node. That utility machine acts as a jump host and simulates the proxy of a disconnected environment.
Diagram of the installer's internal installation logic:
Video walkthrough
on helper node
NODE_SSH_KEY="$(cat ~/.ssh/id_rsa.pub)"
INSTALL_IMAGE_REGISTRY=quaylab.infra.redhat.ren:8443
PULL_SECRET='{"auths":{"registry.redhat.io": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"'${INSTALL_IMAGE_REGISTRY}'": {"auth": "'$( echo -n 'admin:shadowman' | openssl base64 )'","email": "noemail@localhost"}}}'
NTP_SERVER=192.168.7.11
HELP_SERVER=192.168.7.11
KVM_HOST=192.168.7.11
API_VIP=192.168.7.100
INGRESS_VIP=192.168.7.101
CLUSTER_PROVISION_IP=192.168.7.103
BOOTSTRAP_IP=192.168.7.12
ACM_DEMO_MNGED_CLUSTER=acm-demo1
ACM_DEMO_MNGED_SNO_IP=192.168.7.15
# define the node info of the single node cluster
SNO_CLUSTER_NAME=acm-demo-hub
SNO_BASE_DOMAIN=redhat.ren
SNO_IP=192.168.7.13
SNO_GW=192.168.7.11
SNO_NETMAST=255.255.255.0
SNO_NETMAST_S=24
SNO_HOSTNAME=acm-demo-hub-master
SNO_IF=enp1s0
SNO_IF_MAC=`printf '00:60:2F:%02X:%02X:%02X' $[RANDOM%256] $[RANDOM%256] $[RANDOM%256]`
SNO_DNS=192.168.7.11
SNO_DISK=/dev/vda
SNO_CORE_PWD=redhat
echo ${SNO_IF_MAC} > /data/sno/sno.mac
mkdir -p /data/install
cd /data/install
/bin/rm -rf *.ign .openshift_install_state.json auth bootstrap manifests master*[0-9] worker*[0-9]
cat << EOF > /data/install/install-config.yaml
apiVersion: v1
baseDomain: $SNO_BASE_DOMAIN
compute:
- name: worker
replicas: 0
controlPlane:
name: master
replicas: 1
metadata:
name: $SNO_CLUSTER_NAME
networking:
# OVNKubernetes , OpenShiftSDN
networkType: OVNKubernetes
clusterNetwork:
- cidr: 10.128.0.0/14
hostPrefix: 23
serviceNetwork:
- 172.30.0.0/16
platform:
none: {}
bootstrapInPlace:
installationDisk: $SNO_DISK
pullSecret: '${PULL_SECRET}'
sshKey: |
$( cat /root/.ssh/id_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /etc/crts/redhat.ren.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF
openshift-install create manifests --dir=/data/install
/bin/cp -f /data/ocp4/ocp4-upi-helpernode-master/machineconfig/* /data/install/openshift/
# copy image registry proxy related config
cd /data/ocp4
bash image.registries.conf.sh nexus.infra.redhat.ren:8083
/bin/cp -f /data/ocp4/image.registries.conf /etc/containers/registries.conf.d/
/bin/cp -f /data/ocp4/99-worker-container-registries.yaml /data/install/openshift
/bin/cp -f /data/ocp4/99-master-container-registries.yaml /data/install/openshift
cd /data/install/
openshift-install --dir=/data/install create single-node-ignition-config
alias coreos-installer='podman run --privileged --rm \
-v /dev:/dev -v /run/udev:/run/udev -v $PWD:/data \
-w /data quay.io/coreos/coreos-installer:release'
# /bin/cp -f bootstrap-in-place-for-live-iso.ign iso.ign
cat << EOF > /data/sno/static.hostname.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-static-hostname
storage:
files:
- path: /etc/hostname
mode: 0644
overwrite: true
contents:
inline: |
${SNO_HOSTNAME}
EOF
cat << EOF > /data/sno/static.ip.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-static-ip
storage:
files:
- path: /etc/NetworkManager/system-connections/${SNO_IF}.nmconnection
mode: 0600
overwrite: true
contents:
inline: |
[connection]
id=${SNO_IF}
type=ethernet
autoconnect-retries=1
interface-name=${SNO_IF}
multi-connect=1
permissions=
wait-device-timeout=60000
[ethernet]
mac-address-blacklist=
[ipv4]
address1=${SNO_IP}/${SNO_NETMAST_S=24},${SNO_GW}
dhcp-hostname=${SNO_HOSTNAME}
dhcp-timeout=90
dns=${SNO_DNS};
dns-search=
may-fail=false
method=manual
[ipv6]
addr-gen-mode=eui64
dhcp-hostname=${SNO_HOSTNAME}
dhcp-timeout=90
dns-search=
method=disabled
[proxy]
EOF
source /data/ocp4/acm.fn.sh
# butane /data/sno/static.bootstrap.ip.bu > /data/sno/disconnected/99-zzz-bootstrap-ip.yaml
# get_file_content_for_ignition "/opt/openshift/openshift/99-zzz-bootstrap-ip.yaml" "/data/sno/disconnected/99-zzz-bootstrap-ip.yaml"
# VAR_99_master_bootstrap_ip=$RET_VAL
# VAR_99_master_bootstrap_ip_2=$RET_VAL_2
butane /data/sno/static.hostname.bu > /data/sno/disconnected/99-zzz-master-static-hostname.yaml
get_file_content_for_ignition "/opt/openshift/openshift/99-zzz-master-static-hostname.yaml" "/data/sno/disconnected/99-zzz-master-static-hostname.yaml"
VAR_99_master_master_static_hostname=$RET_VAL
VAR_99_master_master_static_hostname_2=$RET_VAL_2
butane /data/sno/static.ip.bu > /data/sno/disconnected/99-zzz-master-ip.yaml
get_file_content_for_ignition "/opt/openshift/openshift/99-zzz-master-ip.yaml" "/data/sno/disconnected/99-zzz-master-ip.yaml"
VAR_99_master_ip=$RET_VAL
VAR_99_master_ip_2=$RET_VAL_2
# we create a user wzh with password redhat, so that on first boot we can log in
# directly from the console/ssh with username and password, which makes troubleshooting and research easier
VAR_PWD_HASH="$(python3 -c 'import crypt,getpass; print(crypt.crypt("redhat"))')"
# tmppath=$(mktemp)
cat /data/install/bootstrap-in-place-for-live-iso.ign \
| jq --arg VAR "$VAR_PWD_HASH" --arg VAR_SSH "$NODE_SSH_KEY" '.passwd.users += [{ "name": "wzh", "system": true, "passwordHash": $VAR , "sshAuthorizedKeys": [ $VAR_SSH ], "groups": [ "adm", "wheel", "sudo", "systemd-journal" ] }]' \
| jq --argjson VAR "$VAR_99_master_ip_2" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_master_static_hostname" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_ip" '.storage.files += [$VAR] ' \
| jq -c . \
> /data/install/iso.ign
# jump to other document here, if you want to customize the ignition file for partition and user
# then comeback
/bin/cp -f /data/ocp4/rhcos-live.x86_64.iso sno.iso
coreos-installer iso ignition embed -fi iso.ign sno.iso
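Before copying the ISO to the KVM host, it may be worth confirming the ignition really was embedded. A minimal check using the same coreos-installer container alias:
# print the embedded ignition and confirm the wzh user and the extra files are in it
coreos-installer iso ignition show sno.iso | jq '.passwd.users[].name, [.storage.files[].path]'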
on kvm host ( 103 )
# create the virtual network for the lab
mkdir -p /data/kvm
cd /data/kvm
cat << 'EOF' > /data/kvm/bridge.sh
#!/usr/bin/env bash
PUB_CONN='eno1'
PUB_IP='172.21.6.103/24'
PUB_GW='172.21.6.254'
PUB_DNS='172.21.1.1'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down baremetal
nmcli con delete baremetal
# RHEL 8.1 appends the word "System" in front of the connection,delete in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname baremetal type bridge con-name baremetal ipv4.method 'manual' \
ipv4.address "$PUB_IP" \
ipv4.gateway "$PUB_GW" \
ipv4.dns "$PUB_DNS"
nmcli con add type bridge-slave ifname "$PUB_CONN" master baremetal
nmcli con down "$PUB_CONN";pkill dhclient;dhclient baremetal
nmcli con up baremetal
EOF
bash /data/kvm/bridge.sh
nmcli con mod baremetal +ipv4.addresses "192.168.7.103/24"
nmcli con up baremetal
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
pvcreate -y /dev/vdb
vgcreate vgdata /dev/vdb
# https://access.redhat.com/articles/766133
lvcreate -y -n poolA -L 500G vgdata
lvcreate -y -n poolA_meta -L 10G vgdata
lvconvert -y --thinpool vgdata/poolA --poolmetadata vgdata/poolA_meta
scp root@192.168.7.11:/data/install/sno.iso /data/kvm/
virsh destroy ocp4-acm-hub
virsh undefine ocp4-acm-hub
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
create_lv vgdata poolA lvacmhub 100G recreate
create_lv vgdata poolA lvacmhub-data 100G recreate
SNO_MEM=64
virt-install --name=ocp4-acm-hub-master01 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lvacmhub,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lvacmhub-data,device=disk,bus=virtio,format=raw \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59002 \
--boot menu=on --cdrom /data/kvm/sno.iso
# --disk path=/dev/vgdata/lvacmhub-data,device=disk,bus=virtio,format=raw \
on helper to see result
cd /data/install
export KUBECONFIG=/data/install/auth/kubeconfig
echo "export KUBECONFIG=/data/install/auth/kubeconfig" >> ~/.bashrc
oc completion bash | sudo tee /etc/bash_completion.d/openshift > /dev/null
cd /data/install
openshift-install wait-for install-complete --log-level debug
# INFO Install complete!
# INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.acm-demo-hub.redhat.ren
# INFO Login to the console with user: "kubeadmin", and password: "M5hQw-NizfX-qKzEq-eUnNk"
# DEBUG Time elapsed per stage:
# DEBUG Cluster Operators: 9m39s
# INFO Time elapsed: 9m39s
back and merge kubeconfig
mkdir -p ~/.kube/bak/
var_date=$(date '+%Y-%m-%d-%H%M')
/bin/cp -f /data/install/auth/kubeconfig ~/.kube/bak/kubeconfig-$var_date
/bin/cp -f /data/install/auth/kubeadmin-password ~/.kube/bak/kubeadmin-password-$var_date
sed "s/admin/admin\/$SNO_CLUSTER_NAME/g" /data/install/auth/kubeconfig > /tmp/config.new
# https://medium.com/@jacobtomlinson/how-to-merge-kubernetes-kubectl-config-files-737b61bd517d
/bin/cp -f ~/.kube/config ~/.kube/config.bak && KUBECONFIG=~/.kube/config:/tmp/config.new kubectl config view --flatten > /tmp/config && /bin/mv -f /tmp/config ~/.kube/config
unset KUBECONFIG
add worker node
Now that the single node is installed, we can add worker nodes to it and turn this single node cluster into a single-master cluster.
# first, lets stick ingress to master
oc label node acm-demo-hub-master ocp-ingress-run="true"
oc patch ingresscontroller default -n openshift-ingress-operator --type=merge --patch='{"spec":{"nodePlacement":{"nodeSelector": {"matchLabels":{"ocp-ingress-run":"true"}}}}}'
# this is a test env, so we don't need multiple ingress replicas.
oc patch --namespace=openshift-ingress-operator --patch='{"spec": {"replicas": 1}}' --type=merge ingresscontroller/default
oc get -n openshift-ingress-operator ingresscontroller/default -o yaml
# then we get worker's ignition file, and start worker node, add it to cluster
oc extract -n openshift-machine-api secret/worker-user-data --keys=userData --to=- > /var/www/html/ignition/sno-worker.ign
HELP_SERVER=192.168.7.11
# define the node info for the new worker node
SNO_IP=192.168.7.16
SNO_GW=192.168.7.11
SNO_NETMAST=255.255.255.0
SNO_HOSTNAME=acm-demo-hub-worker-01
SNO_IF=enp1s0
SNO_DNS=192.168.7.11
SNO_DISK=/dev/vda
SNO_MEM=16
BOOT_ARG=" ip=$SNO_IP::$SNO_GW:$SNO_NETMAST:$SNO_HOSTNAME:$SNO_IF:none nameserver=$SNO_DNS coreos.inst.install_dev=${SNO_DISK##*/} coreos.inst.ignition_url=http://$HELP_SERVER:8080/ignition/sno-worker.ign"
/bin/cp -f /data/ocp4/rhcos-live.x86_64.iso sno.iso
coreos-installer iso kargs modify -a "$BOOT_ARG" sno.iso
# go to kvm host ( 103 )
scp root@192.168.7.11:/data/install/sno.iso /data/kvm/
virsh destroy ocp4-acm-hub-worker01
virsh undefine ocp4-acm-hub-worker01
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
create_lv vgdata poolA lvacmhub-worker01 120G recreate
# create_lv vgdata poolA lvacmhub-worker01-data 100G remove
virt-install --name=ocp4-acm-hub-worker01 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lvacmhub-worker01,device=disk,bus=virtio,format=raw \
`# --disk path=/dev/vgdata/lvacmhub-data,device=disk,bus=virtio,format=raw` \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59003 \
--boot menu=on --cdrom /data/kvm/sno.iso
# after the second boot (install, then reboot from disk),
# go back to helper
oc get csr
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
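CSRs for a new node normally arrive in two batches (the node-bootstrapper CSR first, then the kubelet serving CSR), so the approve command above may need to be run more than once. A quick confirmation that the worker joined:
oc get csr | grep -i pending
oc get node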
end
openshift 4.10, single node with customized partition
A common PoC problem: the customer's host has only a single 1T SSD. We know that a default ocp installation takes over this first disk and creates 4 partitions, with the root partition occupying almost the entire disk. In practice, though, ocp only uses about 100G at runtime, so most of that 1T SSD is wasted. We would like to carve out about 800G from this 1T disk for a storage solution such as ceph/odf. So how do we customize the installer during installation so that it creates a 5th partition for us?
This article describes how, on single node openshift with only one disk, to create a few extra partitions on that disk to use as data partitions.
This is very useful in some resource-constrained PoC scenarios, for example when you need to run ceph on a single-disk single node.
This article assumes some background knowledge, or a prerequisite lab: how to deploy an ordinary single node openshift.
The internal installation logic diagram is as follows:
Video walkthrough
additional steps during sno install
# download yq and install
mkdir tmp; cd tmp
wget https://github.com/mikefarah/yq/releases/download/v4.25.1/yq_linux_amd64.tar.gz
tar -zxvf yq_linux_amd64.tar.gz
install yq_linux_amd64 /usr/local/bin/yq
# calculate a password hash
# VAR_PWD=`podman run -ti --rm quay.io/coreos/mkpasswd --method=yescrypt redhat`
# $y$j9T$UCg7ef5in/0aw0C2ZqSFo.$n8gC9.kDzWwlq0GmXKDVH8KUuGNdj7l6tnAsR4RZaG5
VAR_PWD_HASH="$(python3 -c 'import crypt,getpass; print(crypt.crypt("redhat"))')"
# # https://docs.fedoraproject.org/en-US/fedora-coreos/storage/#_setting_up_separate_var_mounts
# cat << EOF > /data/sno/root-partition.bu
# variant: openshift
# version: 4.8.0
# metadata:
# name: root-storage
# labels:
# machineconfiguration.openshift.io/role: master
# storage:
# disks:
# - device: /dev/vda
# wipe_table: false
# partitions:
# - number: 4
# label: root
# size_mib: $(( 120 * 1024 ))
# resize: true
# - label: data_odf_lvm
# size_mib: 0
# EOF
# butane /data/sno/root-partition.bu -r -o /data/install/partition-ric.ign
# # merge the 2 ignition files
# jq -s '.[0] * .[1]' /data/install/iso.ign /data/install/partition-ric.ign | jq -c . > /data/install/iso.ign.new
# /bin/cp -f /data/install/iso.ign.new /data/install/iso.ign
# https://github.com/openshift/installer/blob/master/data/data/bootstrap/bootstrap-in-place/files/opt/openshift/bootstrap-in-place/master-update.fcc
# cat iso.ign | jq ' .storage.files[] | select ( .path == "/opt/openshift/bootstrap-in-place/master-update.fcc" ) ' | jq -r .contents.source | sed 's/.*base64,//g' | base64 -d > /data/install/master-update.fcc
cat << EOF > /data/install/root-partition.fc
variant: fcos
version: 1.3.0
# !!! do not include passwd / users in production system !!!
passwd:
users:
- name: wzh
password_hash: $VAR_PWD_HASH
system: true
ssh_authorized_keys:
- $NODE_SSH_KEY
groups:
- adm
- wheel
- sudo
- systemd-journal
storage:
disks:
- device: /dev/vda
wipe_table: false
partitions:
- number: 4
label: root
size_mib: $(( 120 * 1024 ))
resize: true
# - label: data_01
# size_mib: $(( 5 * 1024 ))
- label: data_odf_lvm
size_mib: 0
EOF
butane /data/install/root-partition.fc -r -o /data/install/partition-ric.ign
cat << EOF > /data/sno/root-partition.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-static-hostname
storage:
files:
- path: /opt/openshift/partition-ric.ign
mode: 0644
overwrite: true
contents:
local: partition-ric.ign
EOF
# yq '. *= load("/data/install/master-update.fcc")' /data/install/root-partition.fc > /data/install/root-partition.fcc
# config_source=$(cat /data/install/root-partition.fcc | python3 -c "import sys, urllib.parse; print(urllib.parse.quote(''.join(sys.stdin.readlines())))" )
# VAR_FCC_FILE="data:text/plain,${config_source}"
butane -d /data/install /data/sno/root-partition.bu > /data/sno/disconnected/99-zzz-master-root-partition.yaml
get_file_content_for_ignition "/opt/openshift/partition-ric.ign" "/data/sno/disconnected/99-zzz-master-root-partition.yaml"
VAR_99_master_fcc=$RET_VAL
VAR_99_master_fcc_2=$RET_VAL_2
cat iso.ign | jq ' .storage.files[] | select ( .path == "/usr/local/bin/bootstrap-in-place.sh" ) ' | jq -r .contents.source | sed 's/.*base64,//g' | base64 -d > /data/install/bootstrap-in-place.sh
# try to replace
# merge the 2 ignition files
cat << EOF > /data/install/bootstrap-in-place.sh.patch
jq -s '.[0] * .[1]' /opt/openshift/master.ign /opt/openshift/partition-ric.ign | jq -c . > /opt/openshift/master.ign.new
/bin/cp -f /opt/openshift/master.ign.new /opt/openshift/master.ign
EOF
# https://stackoverflow.com/questions/26141347/using-sed-to-insert-file-content-into-a-file-before-a-pattern
sed $'/touch master-ignition.done/{e cat \/data\/install\/bootstrap-in-place.sh.patch\n}' /data/install/bootstrap-in-place.sh > /data/install/bootstrap-in-place.sh.new
cat << EOF > /data/sno/bootstrap-in-place.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-bootstrap-in-place
storage:
files:
- path: /usr/local/bin/bootstrap-in-place.sh
mode: 0555
overwrite: true
contents:
local: bootstrap-in-place.sh.new
EOF
butane -d /data/install /data/sno/bootstrap-in-place.bu > /data/sno/disconnected/99-zzz-master-bootstrap-in-place.yaml
get_file_content_for_ignition "/usr/local/bin/bootstrap-in-place.sh" "/data/sno/disconnected/99-zzz-master-bootstrap-in-place.yaml"
VAR_99_master_bootstrap_sh=$RET_VAL
VAR_99_master_bootstrap_sh_2=$RET_VAL_2
cat iso.ign | jq ' del ( .storage.files[] | select ( .path == "/usr/local/bin/bootstrap-in-place.sh" ) )' > /data/install/iso.ign.new
# cat iso.ign | jq ' .storage.files[] | select ( .path == "/usr/local/bin/bootstrap-in-place.sh" ) ' | jq -r .contents.source
cat /data/install/iso.ign.new \
| jq --argjson VAR "$VAR_99_master_fcc_2" '.storage.files += [$VAR] ' \
| jq --argjson VAR "$VAR_99_master_bootstrap_sh_2" '.storage.files += [$VAR] ' \
| jq -c . \
> /data/install/iso.ign
# cat iso.ign | jq ' .storage.files[] | select ( .path == "/opt/openshift/bootstrap-in-place/master-update.fcc" ) ' | jq -r .contents.source
# cat iso.ign | jq ' .storage.files[] | select ( .path == "/opt/openshift/partition-ric.ign" ) ' | jq -r .contents.source
check the result
ssh -tt core@192.168.7.13 -- lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
# sr0 11:0 1 1024M 0 rom
# vda 252:0 0 400G 0 disk
# ├─vda1 252:1 0 1M 0 part
# ├─vda2 252:2 0 127M 0 part
# ├─vda3 252:3 0 384M 0 part /boot
# ├─vda4 252:4 0 120G 0 part /sysroot
# └─vda5 252:5 0 279.5G 0 part
local storage operator
Now that we have extra partitions, let's quickly test how to turn them into PVs.
cat << EOF > /data/install/local-storage.yaml
---
apiVersion: v1
kind: Namespace
metadata:
name: openshift-local-storage
annotations:
workload.openshift.io/allowed: management
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: openshift-local-storage
namespace: openshift-local-storage
spec:
targetNamespaces:
- openshift-local-storage
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: local-storage-operator
namespace: openshift-local-storage
spec:
channel: "stable"
installPlanApproval: Manual
name: local-storage-operator
source: redhat-operators
sourceNamespace: openshift-marketplace
EOF
oc create -f /data/install/local-storage.yaml
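Note that the subscription above uses installPlanApproval: Manual, so the operator will not install until its install plan is approved. A hedged sketch (the install plan name is generated, hence the placeholder):
oc get installplan -n openshift-local-storage
oc patch installplan <install-plan-name> -n openshift-local-storage --type merge -p '{"spec":{"approved":true}}'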
cat << EOF > /data/install/local-storage-lv.yaml
apiVersion: "local.storage.openshift.io/v1"
kind: "LocalVolume"
metadata:
name: "local-disks"
namespace: "openshift-local-storage"
spec:
nodeSelector:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- acm-demo-hub-master
storageClassDevices:
- storageClassName: "local-sc"
volumeMode: Filesystem
fsType: xfs
devicePaths:
- /dev/vda5
EOF
oc create -f /data/install/local-storage-lv.yaml
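Once the LocalVolume is reconciled, a PV backed by /dev/vda5 should appear. A quick check (the PV name is generated, so filter by storage class):
oc get pod -n openshift-local-storage
oc get pv | grep local-sc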
We create a pod, create and use a pvc, write some data into it, then delete the pod and the pvc. Then we recreate the pod and pvc and check whether the data inside has been wiped. A hedged sketch of this test is shown after the YAML below.
cat << EOF >> /data/install/pvc-demo.yaml
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: local-pvc-demo
spec:
accessModes:
- ReadWriteOnce
volumeMode: Filesystem
resources:
requests:
storage: 2Gi
storageClassName: local-sc
---
kind: Pod
apiVersion: v1
metadata:
annotations:
name: demo1
spec:
nodeSelector:
kubernetes.io/hostname: 'acm-demo-hub-master'
restartPolicy: Always
containers:
- name: demo1
image: >-
quay.io/wangzheng422/qimgs:centos7-test
env:
- name: key
value: value
command:
- sleep
- infinity
imagePullPolicy: Always
volumeMounts:
- mountPath: /data
name: demo
readOnly: false
volumes:
- name: demo
persistentVolumeClaim:
claimName: local-pvc-demo
EOF
oc create -n default -f /data/install/pvc-demo.yaml
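A hedged sketch of the data-persistence test described above (the pod name demo1 and the PVC local-pvc-demo come from the YAML; the rest is just one way to run the check):
# wait until the pod is Running, then write a marker file into the volume
oc exec -n default demo1 -- touch /data/wzh.testfile
oc exec -n default demo1 -- ls -l /data
# delete the pod and pvc, then recreate both from the same YAML
oc delete -n default -f /data/install/pvc-demo.yaml
oc create -n default -f /data/install/pvc-demo.yaml
# after the pod is Running again, check whether the marker file survived
oc exec -n default demo1 -- ls -l /data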
install lvm operator
lvm operator does NOT work.
tips
cat iso.ign | jq .storage.files[].path | grep fcc
# "/opt/openshift/bootstrap-in-place/master-update.fcc"
cat iso.ign | jq ' .storage.files[] | select ( .path == "/opt/openshift/bootstrap-in-place/master-update.fcc" ) ' | jq -r .contents.source | sed 's/.*base64,//g' | base64 -d
# variant: fcos
# version: 1.1.0
# ignition:
# config:
# merge:
# - local: original-master.ign
# storage:
# trees:
# - local: kubernetes/bootstrap-configs
# path: /etc/kubernetes/bootstrap-configs
# - local: tls/
# path: /etc/kubernetes/bootstrap-secrets
# - local: etcd-bootstrap/etc-kubernetes/static-pod-resources/etcd-member/
# path: /etc/kubernetes/static-pod-resources/etcd-member
# - local: etcd-data
# path: /var/lib/etcd
# files:
# - path: /etc/kubernetes/bootstrap-secrets/kubeconfig
# contents:
# local: auth/kubeconfig-loopback
# - path: /etc/kubernetes/manifests/etcd-pod.yaml
# contents:
# local: etcd-bootstrap/etc-kubernetes/manifests/etcd-member-pod.yaml
# - path: /etc/kubernetes/manifests/kube-apiserver-pod.yaml
# contents:
# local: bootstrap-manifests/kube-apiserver-pod.yaml
# - path: /etc/kubernetes/manifests/kube-controller-manager-pod.yaml
# contents:
# local: bootstrap-manifests/kube-controller-manager-pod.yaml
# - path: /etc/kubernetes/manifests/kube-scheduler-pod.yaml
# contents:
# local: bootstrap-manifests/kube-scheduler-pod.yaml
# - path: /usr/local/bin/bootstrap-in-place-post-reboot.sh
# contents:
# local: bootstrap-in-place/bootstrap-in-place-post-reboot.sh
# mode: 0555
# - path: /var/log/log-bundle-bootstrap.tar.gz
# contents:
# local: log-bundle-bootstrap.tar.gz
# - path: /usr/local/bin/installer-masters-gather.sh
# contents:
# local: bin/installer-masters-gather.sh
# mode: 0555
# - path: /usr/local/bin/installer-gather.sh
# contents:
# local: bin/installer-gather.sh
# mode: 0555
# systemd:
# units:
# - name: bootkube.service
# enabled: true
# contents: |
# [Unit]
# Description=Bootkube - bootstrap in place post reboot
# Wants=kubelet.service
# After=kubelet.service
# ConditionPathExists=/etc/kubernetes/bootstrap-secrets/kubeconfig
# [Service]
# Type=oneshot
# ExecStart=/usr/local/bin/bootstrap-in-place-post-reboot.sh
# RestartSec=5s
# [Install]
# WantedBy=multi-user.target
cat iso.ign | jq ' .storage.files[] | select ( .path == "/usr/local/bin/bootstrap-in-place.sh" ) ' | jq -r .contents.source | sed 's/.*base64,//g' | base64 -d
# ......
# echo "Adding bootstrap control plane and bootstrap installer-gather bundle to master ignition"
# bootkube_podman_run \
# --rm \
# --privileged \
# --volume "$PWD:/assets:z" \
# --volume "/usr/local/bin/:/assets/bin" \
# --volume "/var/lib/etcd/:/assets/etcd-data" \
# --volume "/etc/kubernetes:/assets/kubernetes" \
# "${CLUSTER_BOOTSTRAP_IMAGE}" \
# bootstrap-in-place \
# --asset-dir /assets \
# --input /assets/bootstrap-in-place/master-update.fcc \
# --output /assets/master.ign
# touch master-ignition.done
# record_service_stage_success
# fi
# https://github.com/openshift/cluster-bootstrap
cd /data/install
podman run --rm -it \
--privileged \
--volume "$PWD:/assets:z" \
--volume "/usr/local/bin/:/assets/bin" \
--volume "/var/lib/etcd/:/assets/etcd-data" \
--volume "/etc/kubernetes:/assets/kubernetes" \
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c29cb321d7ac72d86a86ba4a74a0774ed2ebf9910d65c1805245a17d7b005b88 \
bootstrap-in-place \
--asset-dir /assets \
--input /assets/bootstrap-in-place/master-update.fcc \
--output /assets/master.ign
lsblk -o PARTUUID,NAME,FSTYPE,LABEL,UUID,MOUNTPOINT
# PARTUUID NAME FSTYPE LABEL UUID MOUNTPOINT
# sr0
# vda
# e23d3123-1d83-4665-8b0f-1c39f8e8f533 ├─vda1
# ed26d305-052e-4148-9b44-05357053742a ├─vda2 vfat EFI-SYSTEM 1533-24B8
# ae634b25-a5b9-4667-85ce-119455a92e53 ├─vda3 ext4 boot 85555068-e37d-4773-837c-d279550eb818 /boot
# ef1b4117-0c2d-4f53-abd4-d3019ecf267e ├─vda4 xfs root 936512ae-5449-4a2f-808e-1c698859c877 /sysroot
# e7b459fb-f2e1-43c9-b638-c732898eedf5 ├─vda5
# 9f0f85c7-51c6-4f2a-b7b7-c8ea3131fb32 └─vda6
end
# browser console snippet: list the operator catalog tile titles shown on the OperatorHub page
Array.from(document.querySelectorAll("div[class='catalog-tile-pf-title']")).forEach(txt => console.log(txt.textContent));
openshift 4.10 single node, post-install, lvm and nfs
On a single node ocp, if there is a separate disk, we can use the lvm operator to allocate lvm automatically and create storage for applications. On top of that storage we then configure nfs, which gives us an nfs service inside the cluster.
Video explanation
install lvm operator
We need local storage, and since this is a single node openshift, we use the lvm operator. Find the operator in operator hub and install it:
The lvm operator is in Tech Preview, so it is buggy and needs some fixes.
# oc create ns lvm-operator-system
cat << EOF > /data/install/lvm-operator.yaml
---
apiVersion: v1
kind: Namespace
metadata:
name: lvm-operator-system
annotations:
workload.openshift.io/allowed: management
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: lvm-operator-system
namespace: lvm-operator-system
spec:
targetNamespaces:
- lvm-operator-system
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: odf-lvm-operator
namespace: lvm-operator-system
spec:
channel: "stable-4.10"
installPlanApproval: Manual
name: odf-lvm-operator
source: redhat-operators
sourceNamespace: openshift-marketplace
EOF
oc create -f /data/install/lvm-operator.yaml
# oc delete -f /data/install/lvm-operator.yaml
ssh -tt core@192.168.7.13 -- lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
# sr0 11:0 1 1024M 0 rom
# vda 252:0 0 120G 0 disk
# ├─vda1 252:1 0 1M 0 part
# ├─vda2 252:2 0 127M 0 part
# ├─vda3 252:3 0 384M 0 part /boot
# └─vda4 252:4 0 119.5G 0 part /sysroot
# vdb 252:16 0 100G 0 disk
oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:lvm-operator-system:topolvm-controller -n lvm-operator-system
oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:lvm-operator-system:vg-manager -n lvm-operator-system
oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:lvm-operator-system:topolvm-node -n lvm-operator-system
cat << EOF > /data/install/lvm.op.yaml
apiVersion: lvm.topolvm.io/v1alpha1
kind: LVMCluster
metadata:
name: lvmcluster-sample
spec:
storage:
deviceClasses:
- name: vg1
# thinPoolConfig:
# name: thin-pool-1
# sizePercent: 50
# overprovisionRatio: 50
EOF
oc create -n lvm-operator-system -f /data/install/lvm.op.yaml
kubectl patch storageclass odf-lvm-vg1 -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
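A quick check that the default-class annotation took effect:
oc get sc
# odf-lvm-vg1 should now show "(default)" next to its name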
cat << EOF > /data/install/lvm.op.pvc.sample.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: lvm-file-pvc
spec:
volumeMode: Filesystem
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
storageClassName: odf-lvm-vg1
EOF
oc create -f /data/install/lvm.op.pvc.sample.yaml -n default
cat <<EOF > /data/install/lvm.op.app.sample.yaml
apiVersion: v1
kind: Pod
metadata:
name: app-file
spec:
containers:
- name: app-file
image: registry.access.redhat.com/ubi8/ubi:8.4
imagePullPolicy: IfNotPresent
command: ["/usr/bin/bash", "-c", "/usr/bin/tail -f /dev/null"]
volumeMounts:
- mountPath: "/mnt/file"
name: lvm-file-pvc
volumes:
- name: lvm-file-pvc
persistentVolumeClaim:
claimName: lvm-file-pvc
EOF
oc create -f /data/install/lvm.op.app.sample.yaml -n default
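To verify that the sample PVC binds and is mounted by the pod (using the names from the manifests above):
oc get pvc lvm-file-pvc -n default
oc get pod app-file -n default
oc exec -n default app-file -- df -h /mnt/file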
install nfs service inside cluster
oc create ns nfs-system
oc project nfs-system
cd /data/install
wget -O nfs.all.yaml https://raw.githubusercontent.com/wangzheng422/nfs-ganesha-server-and-external-provisioner/wzh/deploy/openshift/nfs.all.yaml
oc create -n nfs-system -f nfs.all.yaml
# try it out
wget -O nfs.demo.yaml https://raw.githubusercontent.com/wangzheng422/nfs-ganesha-server-and-external-provisioner/wzh/deploy/openshift/nfs.demo.yaml
oc create -n default -f nfs.demo.yaml
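A rough way to confirm the in-cluster NFS service came up (a sketch; the exact resource names are defined in the yaml files downloaded above, so adjust as needed):
oc get pod,svc -n nfs-system
oc get sc
oc get pvc,pod -n default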
start install openshift 4.10 single node by booting from linux
When doing a PoC, customers often give us VMs on their virtualization platform. In that case we need to boot coreos from the VM and start the installation from there; this article describes the steps.
Internal flow diagram of the installation process:
Video explanation
from centos7
# install a centos7 vm
sed -i '0,/^network.*/s/^network.*/network --bootproto=static --device=eth0 --gateway=192.168.7.1 --ip=192.168.7.12 --netmask=255.255.255.0 --nameserver=192.168.7.11 --ipv6=auto --activate/' helper-ks.cfg
virsh destroy ocp4-acm-hub
virsh undefine ocp4-acm-hub
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
create_lv vgdata poolA lvacmhub 100G recreate
create_lv vgdata poolA lvacmhub-data 100G recreate
virt-install --name="ocp4-acm-hub" --vcpus=16 --ram=$((52*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lvacmhub,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lvacmhub-data,device=disk,bus=virtio,format=raw \
--os-variant rhel8.5 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59000 \
--boot menu=on --location /data/kvm/CentOS-7-x86_64-Minimal-2009.iso \
--initrd-inject helper-ks.cfg --extra-args "inst.ks=file:/helper-ks.cfg"
# copy ignition file to webserver
# /bin/cp -f iso.ign /var/www/html/ignition/iso.ign
# copy rhcos-live.x86_64.iso to centos
ssh-copy-id root@192.168.7.12
scp /data/install/sno.iso root@192.168.7.12:~/
# goto centos
ssh root@192.168.7.12
mount -o ro sno.iso /mnt
/bin/cp -f /mnt/images/pxeboot/{initrd.img,vmlinuz} /boot/
/bin/cp -f /mnt/images/ignition.img /boot/
SNO_IP=192.168.7.13
SNO_GW=192.168.7.11
SNO_NETMAST=255.255.255.0
SNO_HOSTNAME=acm-demo-hub-master
SNO_IF=enp1s0
SNO_DNS=192.168.7.11
SNO_DISK=/dev/vda
SNO_ROOTFS=http://192.168.7.11:8080/install/rootfs.img
SNO_IGN=http://192.168.7.11:8080/ignition/iso.ign
cat << EOF >> /etc/grub.d/40_custom
menuentry 'coreos' --class fedora --class gnu-linux --class gnu --class os {
insmod gzio
insmod part_msdos
insmod xfs
set root='hd0,msdos1'
echo 'Loading coreos kernel ...'
linux /vmlinuz rd.neednet=1 console=tty0 console=ttyS0 coreos.live.rootfs_url=$SNO_ROOTFS ip=$SNO_IP::$SNO_GW:$SNO_NETMAST:$SNO_HOSTNAME:$SNO_IF:none nameserver=$SNO_DNS ignition.firstboot ignition.platform.id=metal random.trust_cpu=on
echo 'Loading coreos initrd ...'
initrd /initrd.img /ignition.img
}
EOF
sed -i 's/^GRUB_DEFAULT=.*/GRUB_DEFAULT="coreos"/' /etc/default/grub
grub2-mkconfig -o /etc/grub2.cfg
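Before rebooting, it is worth confirming that the new entry made it into the generated config and is set as default (a quick check):
grep -A 3 "menuentry 'coreos'" /etc/grub2.cfg
grep GRUB_DEFAULT /etc/default/grub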
reboot
from rocky linux 8
# install a rocky8 vm
sed -i '0,/^network.*/s/^network.*/network --bootproto=static --device=enp1s0 --gateway=192.168.7.1 --ip=192.168.7.12 --netmask=255.255.255.0 --nameserver=192.168.7.11 --ipv6=auto --activate/' helper-ks-rocky.cfg
virsh destroy ocp4-acm-hub
virsh undefine ocp4-acm-hub
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
create_lv vgdata poolA lvacmhub 100G recreate
create_lv vgdata poolA lvacmhub-data 100G recreate
virt-install --name="ocp4-acm-hub" --vcpus=16 --ram=$((52*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lvacmhub,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lvacmhub-data,device=disk,bus=virtio,format=raw \
--os-variant rhel8.5 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59000 \
--boot menu=on --location /data/kvm/Rocky-8.6-x86_64-minimal.iso \
--initrd-inject helper-ks-rocky.cfg --extra-args "inst.ks=file:/helper-ks-rocky.cfg"
# copy ignition file to webserver
# /bin/cp -f iso.ign /var/www/html/ignition/iso.ign
# copy rhcos-live.x86_64.iso to centos
ssh-copy-id root@192.168.7.12
scp /data/install/sno.iso root@192.168.7.12:~/
# goto centos
ssh root@192.168.7.12
mount -o ro sno.iso /mnt
/bin/cp -f /mnt/images/pxeboot/{initrd.img,vmlinuz} /boot/
/bin/cp -f /mnt/images/ignition.img /boot/
SNO_IP=192.168.7.13
SNO_GW=192.168.7.11
SNO_NETMAST=255.255.255.0
SNO_HOSTNAME=acm-demo-hub-master
SNO_IF=enp1s0
SNO_DNS=192.168.7.11
SNO_DISK=/dev/vda
SNO_ROOTFS=http://192.168.7.11:8080/install/rootfs.img
SNO_IGN=http://192.168.7.11:8080/ignition/iso.ign
cat << EOF >> /etc/grub.d/40_custom
menuentry 'coreos' --class fedora --class gnu-linux --class gnu --class os {
insmod gzio
insmod part_msdos
insmod xfs
set root='hd0,msdos1'
echo 'Loading coreos kernel ...'
linux /vmlinuz rd.neednet=1 console=tty0 console=ttyS0 coreos.live.rootfs_url=$SNO_ROOTFS ip=$SNO_IP::$SNO_GW:$SNO_NETMAST:$SNO_HOSTNAME:$SNO_IF:none nameserver=$SNO_DNS ignition.firstboot ignition.platform.id=metal random.trust_cpu=on
echo 'Loading coreos initrd ...'
initrd /initrd.img /ignition.img
}
EOF
sed -i 's/^GRUB_DEFAULT=.*/GRUB_DEFAULT="coreos"/' /etc/default/grub
grub2-mkconfig -o /etc/grub2.cfg
reboot
openshift 4.10 single node with ODF
We can configure ceph/odf storage for a single node openshift. It can be a separate disk, or an extra data partition carved out of the system installation disk.
The prerequisite lab for this document is how to deploy a plain single node openshift.
Video explanation
reference:
install ceph components to ocp
# cat << EOF > /data/install/local-storage.yaml
# ---
# apiVersion: v1
# kind: Namespace
# metadata:
# name: openshift-local-storage
# annotations:
# workload.openshift.io/allowed: management
# ---
# apiVersion: operators.coreos.com/v1
# kind: OperatorGroup
# metadata:
# name: openshift-local-storage
# namespace: openshift-local-storage
# spec:
# targetNamespaces:
# - openshift-local-storage
# ---
# apiVersion: operators.coreos.com/v1alpha1
# kind: Subscription
# metadata:
# name: local-storage-operator
# namespace: openshift-local-storage
# spec:
# channel: "stable"
# installPlanApproval: Manual
# name: local-storage-operator
# source: redhat-operators
# sourceNamespace: openshift-marketplace
# EOF
# oc create -f /data/install/local-storage.yaml
cat << EOF > /data/install/openshift-storage.yaml
---
apiVersion: v1
kind: Namespace
metadata:
name: openshift-storage
annotations:
workload.openshift.io/allowed: management
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: openshift-storage
namespace: openshift-storage
spec:
targetNamespaces:
- openshift-storage
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: odf-operator
namespace: openshift-storage
spec:
channel: "stable-4.10"
installPlanApproval: Manual
name: odf-operator
source: redhat-operators
sourceNamespace: openshift-marketplace
EOF
oc create -f /data/install/openshift-storage.yaml
cd /data/install
cat << EOF > /data/install/ceph-cluster.yaml
---
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
name: main
namespace: openshift-storage
spec:
storage:
useAllNodes: true
useAllDevices: true
cephVersion:
# Ceph 16 (pacific)
image: quay.io/ceph/ceph:v16.2.6 # https://quay.io/repository/ceph/ceph?tab=tags
#image: registry.redhat.io/rhceph/rhceph-5-rhel8:5-14 # https://catalog.redhat.com/software/containers/rhceph/rhceph-5-rhel8/60ec72a74a6a2c7844abe5fb?tag=all
# Ceph 14 (nautilus)
#image: quay.io/ceph/ceph:v14.2.22
#image: registry.redhat.io/rhceph/rhceph-4-rhel8:4-59 # https://catalog.redhat.com/software/containers/detail/5e39df7cd70cc54b02baf33f?tag=all
# Ceph 12 (luminous)
#image: registry.redhat.io/rhceph/rhceph-3-rhel7:3-51 # https://catalog.redhat.com/software/containers/rhceph/rhceph-3-rhel7/5a15ec17ecb5244d5b553577?tag=all
mon:
allowMultiplePerNode: true
mgr:
allowMultiplePerNode: true
modules:
- name: balancer
enabled: true
- name: pg_autoscaler
enabled: true
- name: rook
enabled: true
dashboard:
enabled: true
port: 8443
ssl: false
monitoring:
enabled: true
rulesNamespace: openshift-storage
logCollector:
enabled: true
periodicity: 24h
disruptionManagement:
managePodBudgets: true
machineDisruptionBudgetNamespace: openshift-machine-api
priorityClassNames:
mgr: system-node-critical
mon: system-node-critical
osd: system-node-critical
dataDirHostPath: /var/lib/rook # under /host in CoreOS
continueUpgradeAfterChecksEvenIfNotHealthy: true
---
kind: ConfigMap
apiVersion: v1
metadata:
name: rook-config-override # this name is required!
namespace: openshift-storage
data:
config: |
[global]
osd_pool_default_size = 1
mon_warn_on_pool_no_redundancy = false
EOF
oc create -f /data/install/ceph-cluster.yaml
# oc apply -f /data/install/ceph-cluster.yaml
oc exec deployment/rook-ceph-operator -n openshift-storage -- \
ceph -c /var/lib/rook/openshift-storage/openshift-storage.config -s
# cluster:
# id: 17cb663d-e4f4-4f9b-9993-ce33c971496a
# health: HEALTH_OK
# services:
# mon: 3 daemons, quorum a,b,c (age 8m)
# mgr: a(active, since 7m)
# osd: 1 osds: 1 up (since 7m), 1 in (since 7m)
# data:
# pools: 1 pools, 128 pgs
# objects: 0 objects, 0 B
# usage: 5.4 MiB used, 100 GiB / 100 GiB avail
# pgs: 128 active+clean
# oc expose svc/rook-ceph-mgr-dashboard -n openshift-storage
oc create route edge --service=rook-ceph-mgr-dashboard -n openshift-storage
oc get route -n openshift-storage
# NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
# rook-ceph-mgr-dashboard rook-ceph-mgr-dashboard-openshift-storage.apps.acm-demo-hub.redhat.ren rook-ceph-mgr-dashboard http-dashboard None
oc get secret rook-ceph-dashboard-password --output=jsonpath="{['data']['password']}" -n openshift-storage | base64 -d && echo
# d%`1E#/jBL?7NcG0G5\*
# access the dashboard at https://rook-ceph-mgr-dashboard-openshift-storage.apps.acm-demo-hub.redhat.ren/
# with username admin
add cephfs support
cat << EOF > /data/install/ceph-cluster-config.yaml
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
name: main
namespace: openshift-storage
# See:
# https://github.com/rook/rook/blob/master/Documentation/ceph-filesystem.md
# https://github.com/rook/rook/blob/master/Documentation/ceph-filesystem-crd.md
# https://github.com/rook/rook/blob/master/Documentation/ceph-pool-crd.md
spec:
metadataPool:
replicated:
size: 1
requireSafeReplicaSize: false
dataPools:
- failureDomain: osd
replicated:
size: 1
requireSafeReplicaSize: false
metadataServer:
activeCount: 1
activeStandby: true
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: ceph-fs
reclaimPolicy: Delete
provisioner: openshift-storage.cephfs.csi.ceph.com
parameters:
clusterID: openshift-storage
fsName: main
pool: main-data0
csi.storage.k8s.io/provisioner-secret-namespace: openshift-storage
csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
csi.storage.k8s.io/controller-expand-secret-namespace: openshift-storage
csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
csi.storage.k8s.io/node-stage-secret-namespace: openshift-storage
csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
EOF
oc create -f /data/install/ceph-cluster-config.yaml
# oc delete -f /data/install/ceph-cluster-config.yaml
oc exec deployment/rook-ceph-operator -n openshift-storage -- ceph -c /var/lib/rook/openshift-storage/openshift-storage.config -s
# cluster:
# id: 3e7d32b0-9160-4421-9c7e-217116279601
# health: HEALTH_OK
# services:
# mon: 3 daemons, quorum a,b,c (age 4m)
# mgr: a(active, since 3m)
# mds: 1/1 daemons up, 1 hot standby
# osd: 1 osds: 1 up (since 3m), 1 in (since 4m)
# data:
# volumes: 1/1 healthy
# pools: 3 pools, 192 pgs
# objects: 22 objects, 2.3 KiB
# usage: 6.2 MiB used, 100 GiB / 100 GiB avail
# pgs: 192 active+clean
# io:
# client: 852 B/s rd, 1 op/s rd, 0 op/s wr
# progress:
add ceph-rbd support
cat << EOF > /data/install/ceph-cluster-config-rdb.yaml
---
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
name: replicapool
namespace: openshift-storage
spec:
failureDomain: osd
replicated:
size: 1
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: ceph-rbd
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: openshift-storage.rbd.csi.ceph.com
parameters:
# clusterID is the namespace where the rook cluster is running
clusterID: openshift-storage
# Ceph pool into which the RBD image shall be created
pool: replicapool
# (optional) mapOptions is a comma-separated list of map options.
# For krbd options refer
# https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
# For nbd options refer
# https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
# mapOptions: lock_on_read,queue_depth=1024
# (optional) unmapOptions is a comma-separated list of unmap options.
# For krbd options refer
# https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
# For nbd options refer
# https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
# unmapOptions: force
# RBD image format. Defaults to "2".
imageFormat: "2"
# RBD image features. Available for imageFormat: "2". CSI RBD currently supports only layering feature.
imageFeatures: layering
# The secrets contain Ceph admin credentials.
csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
csi.storage.k8s.io/provisioner-secret-namespace: openshift-storage
csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
csi.storage.k8s.io/controller-expand-secret-namespace: openshift-storage
csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
csi.storage.k8s.io/node-stage-secret-namespace: openshift-storage
csi.storage.k8s.io/fstype: ext4
# Delete the rbd volume when a PVC is deleted
reclaimPolicy: Delete
# Optional, if you want to add dynamic resize for PVC.
# For now only ext3, ext4, xfs resize support provided, like in Kubernetes itself.
allowVolumeExpansion: true
EOF
oc create -f /data/install/ceph-cluster-config-rdb.yaml
# oc delete -f /data/install/ceph-cluster-config-rdb.yaml
kubectl patch storageclass ceph-rbd -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
oc exec deployment/rook-ceph-operator -n openshift-storage -- ceph -c /var/lib/rook/openshift-storage/openshift-storage.config -s
# cluster:
# id: 17cb663d-e4f4-4f9b-9993-ce33c971496a
# health: HEALTH_WARN
# too many PGs per OSD (302 > max 250)
# services:
# mon: 3 daemons, quorum a,b,c (age 67m)
# mgr: a(active, since 38m)
# mds: 1/1 daemons up, 1 hot standby
# osd: 1 osds: 1 up (since 38m), 1 in (since 67m)
# data:
# volumes: 1/1 healthy
# pools: 4 pools, 302 pgs
# objects: 28 objects, 2.3 KiB
# usage: 33 MiB used, 100 GiB / 100 GiB avail
# pgs: 0.331% pgs not active
# 301 active+clean
# 1 peering
# progress:
# Global Recovery Event (4s)
# [===========================.]
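Before moving on, the new ceph-rbd class can be exercised with a small test PVC (a sketch; the file and claim names here are made up for illustration):
cat << EOF > /data/install/ceph-rbd-pvc-test.yaml
# hypothetical test claim, not part of the original walkthrough
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
  storageClassName: ceph-rbd
EOF
oc create -n default -f /data/install/ceph-rbd-pvc-test.yaml
oc get pvc rbd-test-pvc -n default
# clean up afterwards
# oc delete -n default -f /data/install/ceph-rbd-pvc-test.yaml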
add object storage / s3 support
cat << EOF > /data/install/ceph-cluster-config-object-store.yaml
---
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
name: my-store
namespace: openshift-storage
spec:
metadataPool:
failureDomain: osd
replicated:
size: 1
dataPool:
failureDomain: osd
# erasureCoded:
# dataChunks: 2
# codingChunks: 1
preservePoolsOnDelete: true
gateway:
sslCertificateRef:
port: 80
# securePort: 443
instances: 1
healthCheck:
bucket:
disabled: false
interval: 60s
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: ceph-bucket
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: openshift-storage.ceph.rook.io/bucket
reclaimPolicy: Delete
parameters:
objectStoreName: my-store
objectStoreNamespace: openshift-storage
EOF
oc create -f /data/install/ceph-cluster-config-object-store.yaml
# test out
cat << EOF > /data/install/ceph-cluster-config-s3.yaml
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
name: ceph-bucket
spec:
generateBucketName: ceph-bkt
storageClassName: ceph-bucket
EOF
oc create -n default -f /data/install/ceph-cluster-config-s3.yaml
# oc get -n default ObjectBucketClaim
# get parameters from ceph's object storage
export AWS_HOST=$(kubectl -n default get cm ceph-bucket -o jsonpath='{.data.BUCKET_HOST}')
export PORT=$(kubectl -n default get cm ceph-bucket -o jsonpath='{.data.BUCKET_PORT}')
export BUCKET_NAME=$(kubectl -n default get cm ceph-bucket -o jsonpath='{.data.BUCKET_NAME}')
export AWS_ACCESS_KEY_ID=$(kubectl -n default get secret ceph-bucket -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 --decode)
export AWS_SECRET_ACCESS_KEY=$(kubectl -n default get secret ceph-bucket -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 --decode)
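With these variables exported, the bucket can be exercised with any S3 client. A sketch using the AWS CLI (assuming aws is installed, and that BUCKET_HOST:PORT is reachable from where you run it, e.g. from inside the cluster or via a route/port-forward):
# round-trip a small object (hypothetical file name)
echo hello > /tmp/s3-test.txt
aws --endpoint-url http://${AWS_HOST}:${PORT} s3 cp /tmp/s3-test.txt s3://${BUCKET_NAME}/
aws --endpoint-url http://${AWS_HOST}:${PORT} s3 ls s3://${BUCKET_NAME}/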
customized coreos/rhcos for openshift4 / 定制 openshift4 的底层 coreos/rhcos 操作系统
我们做项目的时候,经常有对底层操作系统做修改的需求,比如
- 添加一些顺手的工具,如 htop, tcpdump, iperf 等,都是系统出故障的时候,排查用的利器
- 添加一些内核驱动程序,特别是我们有特殊的硬件插了进来,比如 DPU,GPU
- 我们有一些特殊的软件方案,需要在操作系统一层进行启动。
When we are working on projects, we often need to modify the underlying operating system, such as
- Add some handy tools, such as htop, tcpdump, iperf, etc., which are all useful tools for troubleshooting when the system fails
- Add some kernel drivers, especially we have special hardware plugged in, like DPU, GPU
- We have some special software solutions that need to be activated at the OS level.
而 openshift4 设计的初衷,是云原生安全,于是把底层操作系统,使用 coreos / rhcos 的方式提供,并且 rhcos 官方没有定制化方法和文档。这种方法确实提高了 openshift4 的安全性,稳定性,和全局的一致性,但是项目中也确实遇到了很多尴尬。
The original intention of openshift4's design is cloud-native security, so the underlying operating system is provided in the form of coreos / rhcos, and rhcos officially has no customization method or documentation. This approach does improve the security, stability, and global consistency of openshift4, but it also causes a lot of embarrassment in projects.
本文就针对以上问题,摸索出了如何定制底层 rhcos , 并且应用到 openshift4 之上的方法。其实这些方法,在 openshift 的 github 项目文档中都有,只不过之前没仔细研究罢了。
In view of the above problems, this article finds out how to customize the underlying rhcos and apply it to openshift4. In fact, these methods are available in the github project documentation of openshift, but they have not been carefully studied before.
⚠️⚠️⚠️注意,本文所述方法,涉及到了以下问题,不能使用在生产环境中,只能作为 PoC 应急,或者研究学习之用。如果确实是项目急需,请和红帽GPS部门沟(gěi)通(qián),获得支持。
- ⚠️编译需要多个 rhel 相关的特种源,而且还是 eus, tus 版本,这些都需要单独购买
- ⚠️编译需要一个红帽内部的 repo 源,属于红帽机密
- ⚠️自定义的 rhcos 不能得到红帽 CEE 支持
⚠️⚠️⚠️ Note that the method described in this article involves the following issues and cannot be used in a production environment. It can only be used as a PoC emergency or for research and learning. If it is really urgent for the project, please communicate with the Red Hat GPS department for support.
- ⚠️ Compilation requires multiple rhel-related special sources, and they are also eus and tus versions, which need to be purchased separately
- ⚠️ Compilation requires a Red Hat internal repo source, which is Red Hat Confidential
- ⚠️ Custom rhcos cannot be supported by Red Hat CEE
本次实验的架构图如下: The architecture diagram of this experiment is as follows:
过程中,重度使用了 cosa , 这个是 coreos-assembler 工具集中的命令,他封装了一系列的工具,根据一个配置文件项目,来自动化的编译出来 coreos/rhcos 镜像。
In the process, cosa is heavily used, which is a command in the coreos-assembler tool set. It encapsulates a series of tools and automatically compiles the coreos/rhcos image according to a configuration file project.
视频讲解 / Video explanation
准备 dnf repo 源 / Prepare the dnf repo source
注意,这些 repo 源都是需要特殊单独购买,请联系红帽销售和GPS服务部门。
Note that these repo sources are required to be purchased separately, please contact Red Hat Sales and GPS Services.
# install a rhel on vultr
# disable user/passwd login
# ChallengeResponseAuthentication no
# PasswordAuthentication no
# UsePAM no
sed -i 's/PasswordAuthentication yes/PasswordAuthentication no/g' /etc/ssh/sshd_config
# sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
systemctl restart sshd
ssh root@v.redhat.ren -o PubkeyAuthentication=no
# root@v.redhat.ren: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
subscription-manager register --auto-attach --username ******** --password ********
# subscription-manager release --list
# subscription-manager release --set=8.4
# subscription-manager config --rhsm.baseurl=https://china.cdn.redhat.com
subscription-manager repos --list > list
subscription-manager repos \
--enable="rhel-8-for-x86_64-baseos-rpms" \
--enable="rhel-8-for-x86_64-appstream-rpms" \
--enable="codeready-builder-for-rhel-8-x86_64-rpms" \
#
dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
dnf install -y byobu htop
# byobu
dnf update -y
reboot
mkdir -p /data/dnf
# Create new empty partitions, and filesystem
parted -s /dev/vdb mklabel gpt
parted -s /dev/vdb unit mib mkpart primary 0% 100%
mkfs.ext4 /dev/vdb1
cat << EOF >> /etc/fstab
/dev/vdb1 /data/dnf ext4 defaults,noatime,nofail 0 0
EOF
mount /dev/vdb1 /data/dnf
mkdir -p /data/dnf/dnf-ocp-4.10
cd /data/dnf/dnf-ocp-4.10
subscription-manager release --set=8.4
dnf reposync --repoid rhel-8-for-x86_64-baseos-eus-rpms -m --download-metadata --delete -n
dnf reposync --repoid=rhel-8-for-x86_64-appstream-eus-rpms -m --download-metadata --delete -n
# dnf reposync --repoid=rhel-8-for-x86_64-nfv-tus-rpms -m --download-metadata --delete -n
dnf reposync --repoid=rhel-8-for-x86_64-nfv-rpms -m --download-metadata --delete -n
dnf reposync --repoid=advanced-virt-for-rhel-8-x86_64-eus-rpms -m --download-metadata --delete -n
dnf reposync --repoid=fast-datapath-for-rhel-8-x86_64-rpms -m --download-metadata --delete -n
subscription-manager release --set=8
dnf -y install vsftpd
mkdir -p /var/ftp/dnf
mount --bind /data/dnf/dnf-ocp-4.10 /var/ftp/dnf
chcon -R -t public_content_t /var/ftp/dnf
sed -i "s/anonymous_enable=NO/anonymous_enable=YES/" /etc/vsftpd/vsftpd.conf
cat << EOF >> /etc/vsftpd/vsftpd.conf
pasv_enable=YES
pasv_max_port=10100
pasv_min_port=10090
EOF
systemctl disable --now firewalld
systemctl enable --now vsftpd
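A quick check that anonymous FTP is serving the repo (a sketch; it assumes this host resolves as v.redhat.ren, matching the REPO_IP substitution used later):
curl -s --list-only ftp://v.redhat.ren/dnf/ | head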
准备 build 服务器 / Prepare the build server
注意,build 服务器需要支持 kvm ,如果选用的云平台,需要云平台支持嵌套虚拟化。
本次实验,我们选用了一台 centos stream 8 的云主机。
Note that the build server needs to support kvm. If you choose a cloud platform, the cloud platform needs to support nested virtualization.
In this experiment, we chose a cloud host of centos stream 8.
# install a centos stream 8 on digitalocean,
# 2c 2G for ostree only
# 4c 8G for iso because it needs metal first
dnf install -y epel-release
dnf install -y byobu htop
dnf update -y
reboot
dnf groupinstall -y server
dnf install -y lftp podman
dnf -y install qemu-kvm libvirt libguestfs-tools virt-install virt-viewer virt-manager tigervnc-server
systemctl disable --now firewalld
systemctl enable --now libvirtd
开始编译 rhcos / Start compiling rhcos
cosa 的输入是一个配置文件项目,上游是 https://github.com/openshift/os , 我们做了下游扩展,加入了 epel 源,并且把操作系统名字,加入了 wzh 的标记,并且添加了 htop, tcpdump, iperf3 这几个常用的排错命令,作为演示。我们还从 epel 引入 pdns, pdns-recursor , 支持离线环境 dns 内嵌。 同时,我们从 https://github.com/distribution/distribution/releases 下载 docker image registry 的二进制文件加入进来,支持镜像仓库内嵌。
The input of cosa is a configuration file project, and the upstream is https://github.com/openshift/os. We made downstream extensions: added the epel source, added a wzh mark to the operating system name, and added htop, tcpdump, iperf3 as commonly used troubleshooting tools for demonstration. We also pull in pdns and pdns-recursor from epel to support embedded dns in offline environments, and download the docker image registry binary from https://github.com/distribution/distribution/releases to support an embedded image registry.
# machine-os-images just copy a iso into container
# machine-os-content is our target
# follow coreos-assembler instruction
# https://github.com/coreos/coreos-assembler/blob/main/docs/building-fcos.md
# https://coreos.github.io/coreos-assembler/
# https://github.com/openshift/os/blob/master/docs/development-rhcos.md
# https://github.com/openshift/os/blob/master/docs/development.md
# https://github.com/openshift/release/blob/master/core-services/release-controller/README.md#rpm-mirrors
export COREOS_ASSEMBLER_CONTAINER=quay.io/coreos-assembler/coreos-assembler:rhcos-4.10
# export COREOS_ASSEMBLER_CONTAINER=quay.io/coreos-assembler/coreos-assembler:latest
podman pull $COREOS_ASSEMBLER_CONTAINER
podman login ************* quay.io
cosa() {
env | grep COREOS_ASSEMBLER
local -r COREOS_ASSEMBLER_CONTAINER_LATEST="quay.io/coreos-assembler/coreos-assembler:latest"
if [[ -z ${COREOS_ASSEMBLER_CONTAINER} ]] && $(podman image exists ${COREOS_ASSEMBLER_CONTAINER_LATEST}); then
local -r cosa_build_date_str="$(podman inspect -f "{{.Created}}" ${COREOS_ASSEMBLER_CONTAINER_LATEST} | awk '{print $1}')"
local -r cosa_build_date="$(date -d ${cosa_build_date_str} +%s)"
if [[ $(date +%s) -ge $((cosa_build_date + 60*60*24*7)) ]] ; then
echo -e "\e[0;33m----" >&2
echo "The COSA container image is more that a week old and likely outdated." >&2
echo "You should pull the latest version with:" >&2
echo "podman pull ${COREOS_ASSEMBLER_CONTAINER_LATEST}" >&2
echo -e "----\e[0m" >&2
sleep 10
fi
fi
set -x
podman run --rm -ti --security-opt label=disable --privileged \
--uidmap=1000:0:1 --uidmap=0:1:1000 --uidmap 1001:1001:64536 \
-v ${PWD}:/srv/ --device /dev/kvm --device /dev/fuse \
-v /run/user/0/containers/auth.json:/home/builder/.docker/config.json \
--tmpfs /tmp -v /var/tmp:/var/tmp --name cosa \
${COREOS_ASSEMBLER_CONFIG_GIT:+-v $COREOS_ASSEMBLER_CONFIG_GIT:/srv/src/config/:ro} \
${COREOS_ASSEMBLER_GIT:+-v $COREOS_ASSEMBLER_GIT/src/:/usr/lib/coreos-assembler/:ro} \
${COREOS_ASSEMBLER_CONTAINER_RUNTIME_ARGS} \
${COREOS_ASSEMBLER_CONTAINER:-$COREOS_ASSEMBLER_CONTAINER_LATEST} "$@"
rc=$?; set +x; return $rc
}
rm -rf /data/rhcos
mkdir -p /data/rhcos
cd /data/rhcos
cosa init --branch wzh-ocp-4.10 https://github.com/wangzheng422/machine-os-content
sed -i 's/REPO_IP/v.redhat.ren/g' /data/rhcos/src/config/wzh.repo
wget -O src/config/overlay.d/99wzh-pdns/usr/bin/registry.tgz https://github.com/distribution/distribution/releases/download/v2.8.1/registry_2.8.1_linux_amd64.tar.gz
tar zvxf src/config/overlay.d/99wzh-pdns/usr/bin/registry.tgz -C src/config/overlay.d/99wzh-pdns/usr/bin
/bin/rm -f src/config/overlay.d/99wzh-pdns/usr/bin/registry.tgz
/bin/rm -f src/config/overlay.d/99wzh-pdns/usr/bin/LICENSE
/bin/rm -f src/config/overlay.d/99wzh-pdns/usr/bin/README.md
cosa fetch
cosa build ostree
# ......
# Ignored user missing from new passwd file: root
# New passwd entries: clevis, dnsmasq, gluster, systemd-coredump, systemd-journal-remote, systemd-resolve, tcpdump, unbound
# Ignored group missing from new group file: root
# New group entries: clevis, dnsmasq, gluster, input, kvm, printadmin, render, systemd-coredump, systemd-journal-remote, systemd-resolve, tcpdump, unbound
# Committing... done
# Metadata Total: 10907
# Metadata Written: 3721
# Content Total: 6584
# Content Written: 1344
# Content Cache Hits: 22043
# Content Bytes Written: 328474751
# 3721 metadata, 24647 content objects imported; 2.4 GB content written
# Wrote commit: 12876365301ad8f07ecf89b4fbe184f000a0816c895c6659ebc6822ef9c18ff7
# New image input checksum: 05e3c499a794b62d22ba12d8d73404ce5970d24b4f7a664b71d17c5cf50ccd4c
# None
# New build ID: 410.84.wzh.202208220647-0
# sha256:fa305389ffa50b73e259000d8f21753049de7e4c217c12df470798d34bd4b209
# Total objects: 28612
# No unreachable objects
# Ignoring non-directory /srv/builds/.build-commit
# + rc=0
# + set +x
# or build with default setting, ostree and qcow2
# cosa build
cosa list
# 410.84.wzh.202208220647-0
# Timestamp: 2022-08-22T06:59:55Z (0:02:49 ago)
# Artifacts: ostree
# Config: wzh-ocp-4.10 (16c263bb4b5c) (dirty)
cosa upload-oscontainer --name "quay.io/wangzheng422/ocp"
# quay.io/wangzheng422/ocp:410.84.202208220734-wzh-0 afbdcfab3ffa897842f181505897e6b448f40e961014f74d94996e0589934b7e
# for pdns
# quay.io/wangzheng422/ocp:410.84.202208251115-wzh-0 75beaec896b43eaa910e04f9c405687419baff09eb627c984382698f67066e8a
# for pdns, registry
# quay.io/wangzheng422/ocp:410.84.202208260926-wzh-0 13942b16d990b5934f8fc1dd344ffc2b7a009459a0af7d26624601b01a3ebe30
cosa buildextend-metal
# ......
# + cosa meta --workdir /srv --build 410.84.202208221336-wzh-0 --artifact metal --artifact-json /srv/tmp/build.metal/meta.json.new
# /srv/builds/410.84.202208221336-wzh-0/x86_64/meta.json wrote with version stamp 1661176393037675276
# + /usr/lib/coreos-assembler/finalize-artifact rhcos-410.84.202208221336-wzh-0-metal.x86_64.raw /srv/builds/410.84.202208221336-wzh-0/x86_64/rhcos-410.84.202208221336-wzh-0-metal.x86_64.raw
# + set +x
# Successfully generated: rhcos-410.84.202208221336-wzh-0-metal.x86_64.raw
cosa buildextend-metal4k
# ......
# + cosa meta --workdir /srv --build 410.84.202208221336-wzh-0 --artifact metal4k --artifact-json /srv/tmp/build.metal4k/meta.json.new
# /srv/builds/410.84.202208221336-wzh-0/x86_64/meta.json wrote with version stamp 1661176647683428498
# + /usr/lib/coreos-assembler/finalize-artifact rhcos-410.84.202208221336-wzh-0-metal4k.x86_64.raw /srv/builds/410.84.202208221336-wzh-0/x86_64/rhcos-410.84.202208221336-wzh-0-metal4k.x86_64.raw
# + set +x
# Successfully generated: rhcos-410.84.202208221336-wzh-0-metal4k.x86_64.raw
cosa buildextend-live
# ......
# Writing: Extension record Start Block 43
# Done with: Extension record Block(s) 1
# Writing: The File(s) Start Block 44
# 9.70% done, estimate finish Mon Aug 22 14:14:12 2022
# 19.36% done, estimate finish Mon Aug 22 14:14:12 2022
# 29.05% done, estimate finish Mon Aug 22 14:14:12 2022
# 38.71% done, estimate finish Mon Aug 22 14:14:12 2022
# 48.40% done, estimate finish Mon Aug 22 14:14:12 2022
# 58.06% done, estimate finish Mon Aug 22 14:14:12 2022
# 67.75% done, estimate finish Mon Aug 22 14:14:12 2022
# 77.41% done, estimate finish Mon Aug 22 14:14:12 2022
# 87.10% done, estimate finish Mon Aug 22 14:14:12 2022
# 96.78% done, estimate finish Mon Aug 22 14:14:12 2022
# Total translation table size: 2048
# Total rockridge attributes bytes: 2838
# Total directory bytes: 12288
# Path table size(bytes): 96
# Done with: The File(s) Block(s) 51483
# Writing: Ending Padblock Start Block 51527
# Done with: Ending Padblock Block(s) 150
# Max brk space used 24000
# 51677 extents written (100 MB)
# + /usr/bin/isohybrid --uefi /srv/tmp/buildpost-live/rhcos-410.84.202208221336-wzh-0-live.x86_64.iso.minimal
# + isoinfo -lR -i /srv/tmp/buildpost-live/rhcos-410.84.202208221336-wzh-0-live.x86_64.iso
# Embedded 262144 bytes Ignition config space at 4872192
# + coreos-installer iso extract pack-minimal-iso /srv/tmp/buildpost-live/rhcos-410.84.202208221336-wzh-0-live.x86_64.iso /srv/tmp/buildpost-live/rhcos-410.84.202208221336-wzh-0-live.x86_64.iso.minimal --consume
# Packing minimal ISO
# Matched 17 files of 17
# Total bytes skipped: 105419322
# Total bytes written: 486854
# Total bytes written (compressed): 2808
# Verifying that packed image matches digest
# Packing successful!
# Updated: builds/410.84.202208221336-wzh-0/x86_64/meta.json
# Create a new release based on openshift 4.10.28 and override a single image
oc adm release new -a /data/pull-secret.json \
--from-release quay.io/openshift-release-dev/ocp-release@sha256:2127608ebd67a2470860c42368807a0de2308dba144ec4c298bec1c03d79cb52 \
machine-os-content=quay.io/wangzheng422/ocp:410.84.202208260926-wzh-0 \
--to-image docker.io/wangzheng422/ocp:4.10-demo-pdns
oc image mirror docker.io/wangzheng422/ocp:4.10-demo-pdns quay.io/wangzheng422/ocp:4.10-demo-pdns
oc adm release info quay.io/wangzheng422/ocp:4.10-demo --commit-urls=true
# Name: 4.10.28
# Digest: sha256:57add9e36d950ea7eacfe8704279573952cfbed3192449b7cdcc8a72c4d28921
# Created: 2022-08-22T08:13:38Z
# OS/Arch: linux/amd64
# Manifests: 544
# Pull From: quay.io/wangzheng422/ocp@sha256:57add9e36d950ea7eacfe8704279573952cfbed3192449b7cdcc8a72c4d28921
# Release Metadata:
# Version: 4.10.28
# Upgrades: 4.9.19, 4.9.21, 4.9.22, 4.9.23, 4.9.24, 4.9.25, 4.9.26, 4.9.27, 4.9.28, 4.9.29, 4.9.30, 4.9.31, 4.9.32, 4.9.33, 4.9.34, 4.9.35, 4.9.36, 4.9.37, 4.9.38, 4.9.39, 4.9.40, 4.9.41, 4.9.42, 4.9.43, 4.9.44, 4.9.45, 4.9.46, 4.10.3, 4.10.4, 4.10.5, 4.10.6, 4.10.7, 4.10.8, 4.10.9, 4.10.10, 4.10.11, 4.10.12, 4.10.13, 4.10.14, 4.10.15, 4.10.16, 4.10.17, 4.10.18, 4.10.20, 4.10.21, 4.10.22, 4.10.23, 4.10.24, 4.10.25, 4.10.26, 4.10.27
# Metadata:
# url: https://access.redhat.com/errata/RHBA-2022:6095
# Component Versions:
# kubernetes 1.23.5
# machine-os 410.84.202208220734-wzh-0 Red Hat Enterprise Linux CoreOS
# Images:
# NAME URL
# alibaba-cloud-controller-manager https://github.com/openshift/cloud-provider-alibaba-cloud/commit/db2d118ad70ff62a2111e83a8d14c5b32e176b38
# alibaba-cloud-csi-driver https://github.com/openshift/alibaba-cloud-csi-driver/commit/3ddbb2b9d4994206183b5ffd6a0872ad9a5ce193
# alibaba-disk-csi-driver-operator https://github.com/openshift/alibaba-disk-csi-driver-operator/commit/f0d6966321e3d416efec2ac7405494b057cb35f8
# alibaba-machine-controllers https://github.com/openshift/cluster-api-provider-alibaba/commit/0206121348c9a0d220dd6805cea79d1eae7fd3e0
# ......
oc adm release info quay.io/wangzheng422/ocp:4.10-demo
# ......
# machine-config-operator sha256:6f0daed53e44e6377b0ac440f6293949278b912051b933b2dddfce0e6af2c70b
# machine-os-content quay.io/wangzheng422/ocp:410.84.202208220734-wzh-0
# machine-os-images sha256:783daa259e91647dec5b3e82ce496f8733345d707910d7dbbbdcaadcd75d599b
# ......
oc adm release info quay.io/wangzheng422/ocp:4.10-demo-pdns
# ......
# machine-config-operator sha256:6f0daed53e44e6377b0ac440f6293949278b912051b933b2dddfce0e6af2c70b
# machine-os-content sha256:99bfb9b88cd8bddcea353d304032a59d0734b2ef10353e105dbe4b6538207b88
# machine-os-images sha256:783daa259e91647dec5b3e82ce496f8733345d707910d7dbbbdcaadcd75d599b
# ......
应用到 openshift4 / Apply to openshift4
我们编译好了 rhcos, 那么怎么应用到 openshift4 集群上呢,一般来说,有3种办法,github上有文章写,笔者认为,直接集群级别强制升级最简单。当然,不同项目,不同情况,需要根据情况分析。
Now that we have compiled rhcos, how do we apply it to an openshift4 cluster? Generally speaking there are 3 ways (there is an article on github describing them); the author believes a direct, cluster-level forced upgrade is the easiest. Of course, different projects and situations need to be analyzed case by case.
直接强制升级 / Direct forced upgrade
# test upgrade
oc adm upgrade \
--to-image=quay.io/wangzheng422/ocp@sha256:b6d6fd197df2acf3ceafe60a7cc1023e7b192dffb43c675ddfcdfb4322828ddb \
--allow-explicit-upgrade --allow-upgrade-with-warnings=true --force=true
# after cluster upgrade
# check the os-release
cat /etc/os-release
# NAME="Red Hat Enterprise Linux CoreOS"
# VERSION="410.84.202208220734-wzh-0"
# ID="rhcos"
# ID_LIKE="rhel fedora"
# VERSION_ID="4.10"
# PLATFORM_ID="platform:el8"
# PRETTY_NAME="Red Hat Enterprise Linux CoreOS 410.84.202208220734-wzh-0 (Ootpa)"
# ANSI_COLOR="0;31"
# CPE_NAME="cpe:/o:redhat:enterprise_linux:8::coreos"
# HOME_URL="https://www.redhat.com/"
# DOCUMENTATION_URL="https://docs.openshift.com/container-platform/4.10/"
# BUG_REPORT_URL="https://bugzilla.redhat.com/"
# REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
# REDHAT_BUGZILLA_PRODUCT_VERSION="4.10"
# REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
# REDHAT_SUPPORT_PRODUCT_VERSION="4.10"
# OPENSHIFT_VERSION="4.10"
# RHEL_VERSION="8.4"
# OSTREE_VERSION='410.84.202208220734-wzh-0'
Below are screenshots. This is os-release, where the wzh mark can be seen:
And this is htop running directly on rhcos:
analyze the content
我们可以 dump 这个 machine-os-content 的内容出来仔细分析。
We can dump the contents of this machine-os-content and analyze it carefully.
export BUILDNUMBER=4.10.28
wget -O openshift-client-linux-${BUILDNUMBER}.tar.gz https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-client-linux-${BUILDNUMBER}.tar.gz
wget -O openshift-install-linux-${BUILDNUMBER}.tar.gz https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-install-linux-${BUILDNUMBER}.tar.gz
tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/sbin/
tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/sbin/
mkdir -p /data/ostree
cd /data/ostree
oc image extract --path /:/data/ostree --registry-config /run/user/0/containers/auth.json quay.io/wangzheng422/ocp:410.84.wzh.202208211552-0
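Once the extract finishes, the content can be browsed; a rough sketch (the directory layout depends on how the image was built, so treat the paths as assumptions):
du -sh /data/ostree
find /data/ostree -maxdepth 2 -type d | head -n 20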
end
rhcos / coreos install rpm using rpm-ostree | 给 rhcos / coreos 安装 rpm包
⚠️Note: the operations described in this article involve changing rhcos, the underlying operating system of openshift 4, which may cause you to lose Red Hat support. For your specific situation, please talk to your Red Hat GPS team, or confirm with the Red Hat CEE team.
rhcos is a special version of coreos and is the base operating system of openshift 4. In the official openshift 4 documentation, rhcos is described as an immutable operating system, which leads people to mistakenly believe that rhcos cannot be changed. This misunderstanding causes a lot of embarrassment during openshift 4 project delivery, and makes many scenarios very awkward to support.
In this article we explore how to install rpm packages on rhcos / coreos, and build a correct understanding of what an immutable operating system means.
Let's state the conclusion first. In the author's view, the immutable os of rhcos / coreos means the following:
- Important partitions of the operating system such as /usr, /lib and /boot are read-only
- /etc and /var are writable, and an upgrade or reboot preserves/merges the user's changes
- The whole filesystem of the operating system is managed in a git-like, versioned way, and (currently) at most 2 versions are kept
- Because of this git-like management, changes to the operating system fall into version switches and patches (layered packages). A version switch is a major upgrade pushed from the center, while a patch can be thought of as a small change made on an individual device.
The final experiment result tells us that rpm packages can indeed be installed on rhcos / coreos, and the command is rpm-ostree.
Next, let's run the experiment and explore. Below is the deployment architecture of the experiment. The setup is simple: a 6-node openshift 4.10.26 cluster, plus an external rhel 8.4 host serving as the repo source.
视频讲解 / Video explanation
reference
openshift 4 using rpm-ostree install
Using rpm-ostree install is not mysterious. When openshift 4 performs a machine config extension operation, it uses rpm-ostree install to install the packages. For example, if we enable openshift 4 real-time kernel support, on the node we can see that it is implemented by installing extra rpms.
rpm-ostree status
# State: idle
# Deployments:
# ● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:480e39d63063bae8992542905d48442fd1d9d1325a5136a3be8256d123efe490
# CustomOrigin: Managed by machine-config-operator
# Version: 49.84.202110220538-0 (2021-10-22T05:41:35Z)
# RemovedBasePackages: kernel-core kernel-modules kernel kernel-modules-extra 4.18.0-305.19.1.el8_4
# LayeredPackages: kernel-rt-core kernel-rt-kvm kernel-rt-modules kernel-rt-modules-extra
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:480e39d63063bae8992542905d48442fd1d9d1325a5136a3be8256d123efe490
# CustomOrigin: Managed by machine-config-operator
# Version: 49.84.202110220538-0 (2021-10-22T05:41:35Z)
Here we can see that it installs the real-time kernel related rpm packages, and also removes some kernel related packages.
using single rpm file
Let's start with a preparation experiment: if we have a single rpm file, can we download it and install it directly? And after openshift 4 is upgraded later, is the installed rpm still there?
To answer this question, we download an htop rpm from epel and install it to see.
# login to worker: ip-10-0-139-149 shell
curl -o htop-3.2.1-1.el8.x86_64.rpm https://rpmfind.net/linux/epel/8/Everything/x86_64/Packages/h/htop-3.2.1-1.el8.x86_64.rpm
rpm-ostree install ./htop-3.2.1-1.el8.x86_64.rpm
# Checking out tree 8b334e0... done
# No enabled rpm-md repositories.
# Importing rpm-md... done
# Resolving dependencies... done
# Checking out packages... done
# Running pre scripts... done
# Running post scripts... done
# Running posttrans scripts... done
# Writing rpmdb... done
# Writing OSTree commit... done
# Staging deployment... done
# Added:
# htop-3.2.1-1.el8.x86_64
# Run "systemctl reboot" to start a reboot
systemctl reboot
# after reboot
rpm-ostree status
# State: idle
# Deployments:
# * pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LocalPackages: htop-3.2.1-1.el8.x86_64
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
oc get mcp
# NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
# master rendered-master-c3ceea1602f442fde75df6aab905c41e True False False 3 3 3 0 11h
# worker rendered-worker-c527565b03d522c2eb9bf6f33c419175 True False False 3 3 3 0 11h
oc get node
# NAME STATUS ROLES AGE VERSION
# ip-10-0-133-232.us-east-2.compute.internal Ready master 11h v1.23.5+012e945
# ip-10-0-139-149.us-east-2.compute.internal Ready worker 11h v1.23.5+012e945
# ip-10-0-159-38.us-east-2.compute.internal Ready master 11h v1.23.5+012e945
# ip-10-0-167-145.us-east-2.compute.internal Ready worker 11h v1.23.5+012e945
# ip-10-0-189-34.us-east-2.compute.internal Ready master 11h v1.23.5+012e945
# ip-10-0-215-151.us-east-2.compute.internal Ready worker 11h v1.23.5+012e945
# upgrade from 4.10.26 to 4.10.28
rpm-ostree status
# State: idle
# Deployments:
# * pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:822737b305b28aa4890f7bf847ebebc896cd7b549318195fc8c953ae3008cc44
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208161501-0 (2022-08-16T15:04:45Z)
# LocalPackages: htop-3.2.1-1.el8.x86_64
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LocalPackages: htop-3.2.1-1.el8.x86_64
We can see that the htop rpm file can be installed on its own, and after the cluster upgrade the rpm is still there; the patch type is LocalPackages.
using repo source
In a typical project we need to install many rpms, and these rpms have dependencies. Can rhcos use the traditional repo approach, where we point it at a repo source and it resolves dependencies and installs automatically?
Next, we start from configuring an rpm repo and walk through it step by step to see the result.
rpm simplified list
First, let's settle on the rpm packages we want to install:
htop lshw numactl libhugetlbfs-utils iperf3 tcpdump pdns pdns-recursor
build the repo
Then, we build an rpm repo.
export REPO_IP=http://v.redhat.ren:5180
cat << EOF > /etc/yum.repos.d/wzh.repo
# RHEL repos
[rhel-8-baseos]
baseurl=${REPO_IP}/rhel-8-for-x86_64-baseos-eus-rpms
[rhel-8-appstream]
baseurl=${REPO_IP}/rhel-8-for-x86_64-appstream-eus-rpms
[rhel-8-fast-datapath]
baseurl=${REPO_IP}/fast-datapath-for-rhel-8-x86_64-rpms
[rhel-8-advanced-virt]
baseurl=${REPO_IP}/advanced-virt-for-rhel-8-x86_64-eus-rpms
[rhel-8-nfv]
baseurl=${REPO_IP}/rhel-8-for-x86_64-nfv-tus-rpms
# upstream: http://download.eng.bos.redhat.com/rcm-guest/puddles/RHAOS/plashets/4.10-el8/building/x86_64/os/
# it is internal resource right now, confidential.
# or: https://mirror.openshift.com/enterprise/reposync/
# https://mirror.openshift.com/enterprise/reposync/4.10/rhel-8-server-ose-rpms/
# it also require logins.
[rhel-8-server-ose]
baseurl=${REPO_IP}/rhel-8-server-ose
# mirror list
# https://mirrors.fedoraproject.org/mirrorlist?repo=epel-8&arch=x86_64&&country=us
[epel]
baseurl=https://mirror.fcix.net/epel/8/Everything/x86_64/
enabled=1
repo_gpgcheck=0
gpgcheck=0
EOF
mv /etc/yum.repos.d/redhat.repo /etc/yum.repos.d/redhat.repo.wzh
rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
dnf install -y byobu htop createrepo_c python39
mkdir -p /data/dnf/dnf-ocp-4.10-simple
cd /data/dnf/dnf-ocp-4.10-simple
# note: this also downloads the dependencies of the listed rpms
dnf download --resolve htop lshw numactl libhugetlbfs-utils iperf3 tcpdump pdns pdns-recursor
createrepo ./
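A quick sanity check of the freshly built repo (a sketch pointing dnf at the local directory):
ls repodata/
dnf repoquery --repofrompath=simple,$PWD --disablerepo='*' --enablerepo=simple --available | head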
At this point we have a directory containing a small rpm repo.
setup repo source
Next, we publish this directory over http, so that the openshift 4 nodes can use it.
systemctl disable --now firewalld
mkdir -p /data/dnf
mount /dev/vdb1 /data/dnf
cd /data/dnf/dnf-ocp-4.10-simple
python3 -m http.server 5180
install to rhcos
We use the rpm repo prepared above and install the packages we need on the worker node.
export REPO_IP=http://v.redhat.ren:5180
cat << EOF > /etc/yum.repos.d/wzh.repo
# RHEL repos
[simple]
baseurl=${REPO_IP}/
enabled=1
repo_gpgcheck=0
gpgcheck=0
EOF
rpm-ostree install htop lshw numactl libhugetlbfs-utils iperf3 tcpdump pdns pdns-recursor
systemctl reboot
# after reboot
rpm-ostree status
# State: idle
# Deployments:
# * pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LayeredPackages: htop iperf3 libhugetlbfs-utils lshw numactl pdns pdns-recursor tcpdump
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
We can see that the installation completed and many packages were added. So what happens if we upgrade the cluster?
# upgrade from 4.10.26 -> 4.10.28
rpm-ostree status
# State: idle
# Deployments:
# * pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:822737b305b28aa4890f7bf847ebebc896cd7b549318195fc8c953ae3008cc44
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208161501-0 (2022-08-16T15:04:45Z)
# LayeredPackages: htop iperf3 libhugetlbfs-utils lshw numactl pdns pdns-recursor tcpdump
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LayeredPackages: htop iperf3 libhugetlbfs-utils lshw numactl pdns pdns-recursor tcpdump
We can see that after the upgrade completes, the packages we installed are all still there.
research
setup repo source
systemctl disable --now firewalld
mkdir -p /data/dnf
mount /dev/vdb1 /data/dnf
cd /data/dnf/dnf-ocp-4.10
python3 -m http.server 5180
install to rhcos
export REPO_IP=http://v.redhat.ren:5180
cat << EOF > /etc/yum.repos.d/wzh.repo
# RHEL repos
[rhel-8-baseos]
baseurl=${REPO_IP}/rhel-8-for-x86_64-baseos-eus-rpms
[rhel-8-appstream]
baseurl=${REPO_IP}/rhel-8-for-x86_64-appstream-eus-rpms
[rhel-8-fast-datapath]
baseurl=${REPO_IP}/fast-datapath-for-rhel-8-x86_64-rpms
[rhel-8-advanced-virt]
baseurl=${REPO_IP}/advanced-virt-for-rhel-8-x86_64-eus-rpms
[rhel-8-nfv]
baseurl=${REPO_IP}/rhel-8-for-x86_64-nfv-tus-rpms
# upstream: http://download.eng.bos.redhat.com/rcm-guest/puddles/RHAOS/plashets/4.10-el8/building/x86_64/os/
# it is internal resource right now, confidential.
# or: https://mirror.openshift.com/enterprise/reposync/
# https://mirror.openshift.com/enterprise/reposync/4.10/rhel-8-server-ose-rpms/
# it also require logins.
[rhel-8-server-ose]
baseurl=${REPO_IP}/rhel-8-server-ose
# mirror list
# https://mirrors.fedoraproject.org/mirrorlist?repo=epel-8&arch=x86_64&&country=us
[epel]
baseurl=https://mirror.fcix.net/epel/8/Everything/x86_64/
enabled=1
repo_gpgcheck=0
gpgcheck=0
EOF
rpm-ostree install htop
# Checking out tree 203abe6... done
# Enabled rpm-md repositories: rhel-8-baseos rhel-8-appstream rhel-8-fast-datapath rhel-8-advanced-virt rhel-8-nfv rhel-8-server-ose epel
# rpm-md repo 'rhel-8-baseos' (cached); generated: 2022-07-19T19:30:27Z
# Updating metadata for 'rhel-8-appstream'... done
# rpm-md repo 'rhel-8-appstream'; generated: 2022-08-16T17:13:40Z
# Updating metadata for 'rhel-8-fast-datapath'... done
# rpm-md repo 'rhel-8-fast-datapath'; generated: 2022-08-01T13:46:17Z
# Updating metadata for 'rhel-8-advanced-virt'... done
# rpm-md repo 'rhel-8-advanced-virt'; generated: 2022-06-13T11:46:08Z
# Updating metadata for 'rhel-8-nfv'... done
# rpm-md repo 'rhel-8-nfv'; generated: 2022-07-19T19:21:36Z
# Updating metadata for 'rhel-8-server-ose'... done
# rpm-md repo 'rhel-8-server-ose'; generated: 2022-08-20T01:24:13Z
# Updating metadata for 'epel'... done
# rpm-md repo 'epel'; generated: 2022-09-01T10:12:52Z
# Importing rpm-md... done
# Resolving dependencies... done
# Will download: 1 package (173.6 kB)
# Downloading from 'epel'... done
# Importing packages... done
# Checking out packages... done
# Running pre scripts... done
# Running post scripts... done
# Running posttrans scripts... done
# Writing rpmdb... done
# Writing OSTree commit... done
# Staging deployment... done
# Freed: 1.2 GB (pkgcache branches: 0)
# Added:
# htop-3.2.1-1.el8.x86_64
# Run "systemctl reboot" to start a reboot
rpm-ostree status
# State: idle
# Deployments:
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:822737b305b28aa4890f7bf847ebebc896cd7b549318195fc8c953ae3008cc44
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208161501-0 (2022-08-16T15:04:45Z)
# Diff: 1 added
# LayeredPackages: htop
# * pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:822737b305b28aa4890f7bf847ebebc896cd7b549318195fc8c953ae3008cc44
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208161501-0 (2022-08-16T15:04:45Z)
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
systemctl reboot
# after reboot
rpm-ostree status
# State: idle
# Deployments:
# * pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:822737b305b28aa4890f7bf847ebebc896cd7b549318195fc8c953ae3008cc44
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208161501-0 (2022-08-16T15:04:45Z)
# LayeredPackages: htop
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:822737b305b28aa4890f7bf847ebebc896cd7b549318195fc8c953ae3008cc44
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208161501-0 (2022-08-16T15:04:45Z)
ostree admin status
# * rhcos 43a75eb50db67449b7b546dec6e30866a907d6b85317ce1ba5af71d07c903755.0
# Version: 410.84.202208161501-0
# origin: <unknown origin type>
# rhcos 203abe66048544a0415be2c3089e236da15b3a468f9e2bf3c6e2590c31ecc8db.0 (rollback)
# Version: 410.84.202208161501-0
# origin refspec: 203abe66048544a0415be2c3089e236da15b3a468f9e2bf3c6e2590c31ecc8db
which htop
# /usr/bin/htop
end
openshift 4 组件的版本 / components version of openshift 4
A customer raised a question in a project: openshift 4 is built from many open source components, with patches applied, so what versions are these open source components?
For this question, Red Hat has an official knowledge base that contains the version information of the core components:
- OpenShift Container Platform 4.x Tested Integrations
- OpenShift Container Platform 4.x Tested Integrations (for x86_x64)
But the knowledge base above only tells us the versions of crio, etcd, ovs and ovn; it says nothing about the others, for example the version of multus.
The customer needs the version information of many other components, so that they can check how well these match an existing solution. Let's walk through, step by step, how to find this version information.
In this article, we use multus as the example.
视频讲解 / Video explanation
begin
First, from the openshift release info we can get the source code address of multus.
oc adm release info `curl -s https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/4.9.30/release.txt | grep ocp-release | awk '{print $3}'` --commit-urls=true | grep multus
# ......
# multus-admission-controller https://github.com/openshift/multus-admission-controller/commit/3c28a57a831d11380e612a616820bf8a42261d9d
# multus-cni https://github.com/openshift/multus-cni/commit/c2499377b6fb43320618025876eb5b9751006222
# multus-networkpolicy https://github.com/openshift/multus-networkpolicy/commit/fd12fedeb9e05637279386aa2aacd443ac1c0da7
# multus-route-override-cni https://github.com/openshift/route-override-cni/commit/1953205643c2739486c315d4ea58e17d29cfa610
# multus-whereabouts-ipam-cni https://github.com/openshift/whereabouts-cni/commit/43552df5f301618a1857c9a1c2b51cbb7188ad38
# ......
We can see that the multus used by openshift 4 comes from https://github.com/openshift/multus-cni , at commit id c2499377b6fb43320618025876eb5b9751006222
We git clone this source project, open the git history, and look for the commit c2499377b6fb43320618025876eb5b9751006222; we can see that it corresponds to the release-4.9 branch.
Then we filter on the release-4.9 branch and look at what tags exist in its git history.
We can see that there are many patches on the release-4.9 branch, but the most recent tag is 3.7.1.
So now we know that the multus used by openshift 4.9 is a patched version built on top of 3.7.1.
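The same lookup can be done from the command line; a sketch (the commit id comes from the release info above; if tags are missing in the openshift fork, fetch them from the upstream multus repo first):
git clone https://github.com/openshift/multus-cni
cd multus-cni
# branches that contain the commit
git branch -r --contains c2499377b6fb43320618025876eb5b9751006222
# nearest tag reachable from the commit
git describe --tags c2499377b6fb43320618025876eb5b9751006222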
openshift4 内置 dns, haproxy, image registry / openshift4 embeds dns, haproxy, image registry
⚠️注意,本文所述操作,涉及更改 openshift 4 底层操作系统 rhcos,这有可能导致失去红帽支持资格,具体的情况,请和对口的红帽 GPS 团队沟通, 或者联系红帽 CEE 团队确认。这是因为本方案:
- 没有经过严格的测试
- 将在rhcos上安装rpm
- rpm来自于epel, DIY
⚠️Note that the operation described in this article involves changing the underlying operating system rhcos of openshift 4, which may lead to the loss of Red Hat support qualification. For specific circumstances, please communicate with the corresponding Red Hat GPS team, or contact the Red Hat CEE team for confirmation. This is because this solution:
- Not rigorously tested
- will install rpm on rhcos
- rpm from epel, DIY
rhcos 是一个特殊版本的coreos, 它是openshift 4的底座操作系统,在openshift 4的官方文档中,rhcos被描述成为不可变操作系统,这会让人误以为,rhcos是不可改变的。这个错误的认识,让openshift 4在项目实施的过程中,遇到很多尴尬,也让很多场景,支持起来非常的别扭。
rhcos is a special version of coreos, which is the base operating system of openshift 4. In the official documents of openshift 4, rhcos is described as an immutable operating system, which will make people mistakenly think that rhcos is immutable. This wrong understanding made Openshift 4 encounter a lot of embarrassment in the process of project implementation, and it also made many scenarios very awkward to support.
比如,我们有一个边缘的5GC的场景,客户要求服务器数量尽量少,并且要求高可用。而openshift 4如果要做到高可用,必须3台服务器,而如果考虑到,需要外部的dns, 负载分担,镜像仓库,并且考虑他们的HA,那么还需要2个服务器,这样一共就5台服务器了。这对于一个边缘部署来说,太重了。
For example, we have an edge 5GC scenario, where customers require as few servers as possible and high availability. If openshift 4 is to be highly available, 3 servers are required, and if it is considered that external dns, load sharing, mirror registry are required, and their HA is considered, then 2 servers are needed, so there are 5 servers in total. This is too heavy for an edge deployment.
openshift 4的竞品们,一般都是把dns,负载分担,镜像仓库等等周边组件,融入到集群内部,也就是在操作系统上直接部署,而openshift 4号称操作系统不可变,那是不是这些服务就不能部署到内部去呢?本文我们就来探索一下。
Competitors of openshift 4 generally integrate dns, load sharing, mirror registry and other peripheral components into the cluster, that is, deploy directly on the operating system, while openshift 4 claims that the operating system is immutable, is that right? Can't the service be deployed internally? In this article we will explore.
openshift4 虽然号称支持单节点,3节点的边缘部署模式,但是实际项目实施的时候,往往需要多一个节点,这个节点需要承载的任务有:
- DNS服务 : 因为k8s的各种内部服务,都依赖DNS解析
- load balancer 服务 : 3 k8s master是需要负载均衡服务的。
- 镜像仓库 : 这个是因为crio会在系统重启的时候,检查是否是意外重启,如果是,会清空本机镜像缓存,重新从镜像仓库下载。
- NTP服务 : 集群节点之间的时间同步服务,好在现在大多数 交换机/路由器 都可以提供这个服务。
Although openshift4 claims to support single-node and 3-node edge deployment modes, real projects usually end up needing one extra node. The tasks this node has to carry include:
- DNS service: the internal services of k8s all rely on DNS resolution
- Load balancer service: the 3 k8s masters need a load balancing service.
- Mirror registry: on restart, crio checks whether the shutdown was unexpected; if it was, it clears the local container image cache and pulls everything again from the mirror registry.
- NTP service: time synchronization between cluster nodes. Fortunately, most switches/routers can provide this nowadays.
上述服务,当然可以集中部署到核心区域,但是有些场景,比如私有5G核心网,我们必须把上述服务部署到边缘站点中,这是因为,私有5GC是断外网的环境。
These services can of course be deployed centrally in the core site, but in some scenarios, such as a private 5G core network, they must live in the edge site, because a private 5GC runs with no connection to the outside network.
我们还知道,openshift4 本身就是基于 rhcos / coreos 操作系统之上的 k8s, 我们自然希望可以把上述的服务,内嵌到 rhcos / coreos 里面去,实现真正意义上的 单节点/3节点 的部署模式。
We also know that openshift4 is essentially k8s running on the rhcos / coreos operating system, so naturally we want to embed the services above into rhcos / coreos and get a true single-node / 3-node deployment.
如果没有本方案,那么我们的部署会是这个样子的,可以看到,必须要有一个 helper 节点,提供辅助功能。
Without this solution, our deployment would look like this. As you can see, there must be a helper node to provide auxiliary functions.
以下是本方案的架构设计: / The following is the architectural design of this scheme:
让我们开始吧。 / Let's begin
视频讲解 / Video explanation
on single node ocp
我们从最简单的单节点openshift 4 集群开始。我们的目标,是把helper上的以下组件,用openshift 4的单节点中的组件替代:
We start with the simplest single node openshift 4 cluster. Our goal is to replace the following components on the helper with components in a single node of openshift 4:
- dns -> pdns (power dns)
- image registry -> docker distribution
我们不考虑 haproxy ,是因为单节点,没有外部负载分担的需要。
We do not consider haproxy because it is a single node and there is no need for external load sharing.
而NTP服务,我们认为网络交换机/路由器可以提供。或者在SNO场景下,可以不用NTP服务。也可以在SNO节点上直接启动一个NTP服务都可以。
As for NTP, we assume the network switch/router can provide it; in an SNO scenario NTP can even be skipped, or an NTP service can be started directly on the SNO node.
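If you do want NTP on the SNO node itself, rhcos already ships chronyd, and its configuration is normally delivered through a MachineConfig. A minimal sketch of a /etc/chrony.conf that serves this lab's 192.168.7.0/24 subnet even without an upstream source (an assumption of this note, not part of the original procedure):
# /etc/chrony.conf (sketch)
driftfile /var/lib/chrony/drift
makestep 1.0 3
# answer NTP queries from the cluster subnet
allow 192.168.7.0/24
# keep serving the local clock when no upstream server is reachable
local stratum 10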
这里是这个single node ocp的day-0的部署过程记录。
Here is the deployment process record of day-0 of this single node ocp.
以下是day-0的时候,部署的架构图: / The following is the architecture diagram of the deployment at day-0:
我们的目标,是通过day-2的操作,把他变成这个样子: / Our goal is to make it look like this through the operation of day-2:
prepare docker registry content
我们需要先准备以下离线镜像仓库,openshift支持了一个oc-mirror的工具,我们可以方便的使用。我们先把离线镜像仓库下载到文件中。留着后面使用。
We first need to prepare the offline image content. Openshift now ships an oc-mirror tool that makes this easy, so we download the offline images into a file and keep it for later use.
# setup a stand alone docker registry
# on helper
cat > /data/ocp4/mirror.yaml << EOF
apiVersion: mirror.openshift.io/v1alpha1
kind: ImageSetConfiguration
# archiveSize: 4
mirror:
ocp:
channels:
- name: stable-4.10
versions:
- '4.10.28'
- '4.10.26'
additionalImages:
- name: registry.redhat.io/redhat/redhat-operator-index:v4.10
- name: registry.redhat.io/redhat/certified-operator-index:v4.10
- name: registry.redhat.io/redhat/community-operator-index:v4.10
- name: registry.redhat.io/redhat/redhat-marketplace-index:v4.10
EOF
mkdir -p /data/install/mirror-tmp
cd /data/install/mirror-tmp
oc-mirror --config /data/ocp4/mirror.yaml file:///data/install/mirror-tmp
install rpm to rhcos
我们需要向rhcos直接安装pdns, docker distribution等软件,为什么不能用openshift的容器来提供这些服务呢?这里面有一个crio的bug,简单说,如果主机意外重启,crio会把本地镜像全部作废,然后重新从镜像仓库下载。所以,我们的dns, registry服务就不能用容器来启动,否则如果宿主机暴力重启,dns, registry的容器服务都启动不了,这个节点的openshift服务就无法启动了。
We need to install pdns, docker-distribution and other software directly onto rhcos. Why can't openshift containers provide these services? Because of a crio bug: put simply, if the host restarts unexpectedly, crio invalidates all local images and pulls them again from the registry. So our dns and registry services cannot run as containers; otherwise, after a hard reboot, the dns and registry containers cannot start, and neither can the openshift services on that node.
有同事建议,可以使用podman/systemd的方式,在systemd里面注册一个服务,在服务里面通过podman启动pdns, registry,经过实验测试,断电重启的情况下,podman的镜像,也会丢失,所以对应的systemd service也启动不了。所以我们就彻底放弃容器解决方案。
A colleague suggested the podman/systemd route: register a systemd service that starts pdns and the registry through podman. Testing showed that after a power-off restart the podman images are lost as well, so the corresponding systemd services cannot start either. So we dropped the container approach entirely.
我们还需要做一个rpm repo源,这里作者做好了一个demo rpm repo源,注意,这个源引用了epel的rpm, 还有作者自己打包的rpm。所以这个源只能作为学习和测试之用。
We also need an rpm repo. The author has prepared a demo repo; note that it pulls in rpms from epel plus rpms packaged by the author himself, so it should only be used for learning and testing.
最后,用rpm-ostree向rhcos装rpm,这个技术是openshift 4自己就在使用的,openshift 4 extension功能,比如real-time kernel extension, 就是通过rpm-ostree向rhcos装了对应的kernel包实现的。
Finally, we use rpm-ostree to install the rpms onto rhcos. This is the same mechanism openshift 4 itself relies on: the openshift 4 extension feature, for example the real-time kernel extension, is implemented by layering the corresponding kernel packages onto rhcos with rpm-ostree.
# on helper
mkdir -p /data/repo
cd /data/repo
# here is the demo simple repo
# you can build the repo by yourself, following the rhel8.4 way
wget https://github.com/wangzheng422/release/releases/download/ocp.4.10.28.simple.repo/dnf-ocp-4.10-simple.tgz
tar zvxf dnf-ocp-4.10-simple.tgz
cd /data/repo/dnf-ocp-4.10-simple/
# start http server to serve the rpm repo
python3 -m http.server 5180
# Serving HTTP on 0.0.0.0 port 5180 (http://0.0.0.0:5180/) ...
# login into single node
export REPO_IP=http://192.168.7.11:5180
cat << EOF > /etc/yum.repos.d/wzh.repo
# RHEL repos
[simple]
baseurl=${REPO_IP}/
enabled=1
repo_gpgcheck=0
gpgcheck=0
EOF
rpm-ostree install htop pdns pdns-recursor docker-distribution
# Checking out tree 8b334e0... done
# Enabled rpm-md repositories: simple
# Updating metadata for 'simple'... done
# rpm-md repo 'simple'; generated: 2022-09-09T06:17:17Z
# Importing rpm-md... done
# Resolving dependencies... done
# Will download: 11 packages (12.9 MB)
# Downloading from 'simple'... done
# Importing packages... done
# Checking out packages... done
# Running pre scripts... done
# Running post scripts... done
# Running posttrans scripts... done
# Writing rpmdb... done
# Writing OSTree commit... done
# Staging deployment... done
# Added:
# boost-context-1.66.0-10.el8.x86_64
# boost-filesystem-1.66.0-10.el8.x86_64
# boost-program-options-1.66.0-10.el8.x86_64
# boost-system-1.66.0-10.el8.x86_64
# docker-distribution-2.8.1-0.el8.x86_64
# htop-3.2.1-1.el8.x86_64
# libsodium-1.0.18-2.el8.x86_64
# luajit-2.1.0-0.16beta3.el8.x86_64
# pdns-4.6.2-1.el8.x86_64
# pdns-recursor-4.3.6-1.el8.x86_64
# protobuf-3.5.0-13.el8.x86_64
# Run "systemctl reboot" to start a reboot
systemctl reboot
# after reboot
rpm-ostree status
# State: idle
# Deployments:
# ● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
重启以后,我们就能看到LayeredPackages了,以后版本的 openshift 4 会在集群层面支持 LayeredPackages 功能。目前我们只能直接登录rhcos来手动做安装。
After the reboot we can see the LayeredPackages. Future versions of openshift 4 will support layered packages at the cluster / k8s level; for now we can only log in to rhcos directly and install them by hand.
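If the layered packages ever need to come off again, rpm-ostree can remove them or roll back to the previous deployment; a hedged sketch:
# drop the layered packages from the next deployment, then reboot
rpm-ostree uninstall htop pdns pdns-recursor docker-distribution
systemctl reboot
# or simply boot back into the previous deployment
# rpm-ostree rollback -r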
embed registry on single node ocp
我们需要的软件,已经装在节点上了,接下来,我们就做一些配置,把本地的镜像仓库激活。注意,这里面我们使用的是docker distribution, 我们把之前helper上的镜像仓库的证书拿来直接给他用,这样之后,我们只要更改dns指向就可以了。
The software we need is now installed on the node. Next we do some configuration to bring up the local image registry. Note that we use docker distribution here and reuse the certificate of the image registry on the helper as-is, so that afterwards we only have to repoint the dns records.
我们的配置文件位于/etc下面, 上传的镜像位于/var下面,那么节点重启,集群升级,这些目录会不会被重置呢?目前的实测表明不会,按照文档的说法,/etc下面的内容,在升级的时候会进行合并,/var下面的内容,会保留。
Our configuration files live under /etc and the uploaded images under /var. Will these directories be reset when the node reboots or the cluster upgrades? Testing so far says no: according to the documentation, content under /etc is merged during an upgrade and content under /var is preserved.
export BASE_DIR='/home/sno/'
export VAR_CERT_DIR=/etc/crts/
echo "obase=8;ibase=10;420" | bc
# 644
echo "obase=10;ibase=8;700" | bc
# 448
#########################
# run with root
# to grant read access to key
chmod og+r $VAR_CERT_DIR/redhat.ren.key
#########################
cat << EOF > ${BASE_DIR}/data/sno/registry.images.bu
variant: openshift
version: 4.10.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-registry
storage:
files:
- path: /etc/wzh/redhat.ren.crt
overwrite: true
contents:
source: data:text/plain;charset=utf-8;base64,$( base64 -w 0 < ${VAR_CERT_DIR}/redhat.ren.crt )
mode: 420
user:
name: root
- path: /etc/wzh/redhat.ren.key
overwrite: true
contents:
source: data:text/plain;charset=utf-8;base64,$( base64 -w 0 < ${VAR_CERT_DIR}/redhat.ren.key )
mode: 420
user:
name: root
- path: /etc/wzh/registry-config.yml
overwrite: true
contents:
inline: |
version: 0.1
log:
accesslog:
disabled: true
fields:
service: registry
storage:
cache:
layerinfo: inmemory
filesystem:
rootdirectory: /var/wzh-registry
delete:
enabled: true
maintenance:
readonly:
enabled: false
http:
addr: :8443
tls:
certificate: /etc/wzh/redhat.ren.crt
key: /etc/wzh/redhat.ren.key
mode: 420
user:
name: root
systemd:
units:
- contents: |
[Unit]
Description=Set SELinux chcon for image registry
Before=docker-distribution.service
[Service]
Type=oneshot
RemainAfterExit=yes
User=root
ExecStartPre=-mkdir -p /var/wzh-registry
ExecStart=/usr/bin/chcon -Rt container_file_t /var/wzh-registry
[Install]
WantedBy=multi-user.target
enabled: true
name: hostpath-registry.service
- contents: |
[Unit]
Description=v2 Registry server for Docker
After=network.target hostpath-registry.service
Requires=hostpath-registry.service
Before=kubelet.service
[Service]
Type=simple
ExecStart=/usr/bin/registry serve /etc/wzh/registry-config.yml
[Install]
WantedBy=multi-user.target
enabled: true
name: docker-distribution.service
- name: kubelet.service
dropins:
- name: 99-after-registry.conf
contents: |
[Unit]
Requires=docker-distribution.service
After=docker-distribution.service
EOF
butane ${BASE_DIR}/data/sno/registry.images.bu > ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
oc create --save-config -f ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
# oc apply -f ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
# oc delete -f ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
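Once the MachineConfig has rolled out and the node has rebooted, the two units can be checked directly on the node; a sketch (node name as reported by oc get node, here the SNO host acm-demo-hub-master):
# wait for the master pool to report UPDATED=True
oc get mcp master
# check the selinux-prep and registry units on the node
oc debug node/acm-demo-hub-master -- chroot /host \
  systemctl status hostpath-registry.service docker-distribution.service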
upload registry content
有了镜像仓库,我们就把之前下载的离线镜像文件,导入到节点内置的镜像仓库中。
With the mirror registry, we import the offline mirror file downloaded before into the built-in mirror registry of the node.
# on helper
oc-mirror --dest-skip-tls --from mirror_seq1_000000.tar docker://192.168.7.13:8443
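A quick way to confirm the content arrived is to query the registry catalog on the node (a sketch; the repository names depend on the oc-mirror layout):
curl -k -s https://192.168.7.13:8443/v2/_catalog | jq .
# expect entries such as openshift/release plus the mirrored operator catalogs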
(optional) update registry config to read only
我们的离线镜像上传了,就不希望别别人改掉,那么我们可以把本地的镜像仓库设置成只读模式。
Once our offline images are uploaded we don't want anyone to change them, so we can switch the local registry to read-only mode.
cat << EOF > ${BASE_DIR}/data/sno/registry.images.bu
variant: openshift
version: 4.10.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-registry
storage:
files:
- path: /etc/wzh/redhat.ren.crt
overwrite: true
contents:
source: data:text/plain;charset=utf-8;base64,$( base64 -w 0 < ${VAR_CERT_DIR}/redhat.ren.crt )
mode: 420
user:
name: root
- path: /etc/wzh/redhat.ren.key
overwrite: true
contents:
source: data:text/plain;charset=utf-8;base64,$( base64 -w 0 < ${VAR_CERT_DIR}/redhat.ren.key )
mode: 420
user:
name: root
- path: /etc/wzh/registry-config.yml
overwrite: true
contents:
inline: |
version: 0.1
log:
accesslog:
disabled: true
fields:
service: registry
storage:
cache:
layerinfo: inmemory
filesystem:
rootdirectory: /var/wzh-registry
delete:
enabled: false
maintenance:
readonly:
enabled: true
http:
addr: :5443
tls:
certificate: /etc/wzh/redhat.ren.crt
key: /etc/wzh/redhat.ren.key
mode: 420
user:
name: root
systemd:
units:
- contents: |
[Unit]
Description=Set SELinux chcon for image registry
Before=docker-distribution.service
[Service]
Type=oneshot
RemainAfterExit=yes
User=root
ExecStartPre=-mkdir -p /var/wzh-registry
ExecStart=/usr/bin/chcon -Rt container_file_t /var/wzh-registry
[Install]
WantedBy=multi-user.target
enabled: true
name: hostpath-registry.service
- contents: |
[Unit]
Description=v2 Registry server for Docker
After=network.target hostpath-registry.service
Requires=hostpath-registry.service
Before=kubelet.service
[Service]
Type=simple
ExecStart=/usr/bin/registry serve /etc/wzh/registry-config.yml
[Install]
WantedBy=multi-user.target
enabled: true
name: docker-distribution.service
- name: kubelet.service
dropins:
- name: 99-after-registry.conf
contents: |
[Unit]
Requires=docker-distribution.service
After=docker-distribution.service
EOF
butane ${BASE_DIR}/data/sno/registry.images.bu > ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
oc apply -f ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
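Note that this read-only variant listens on :5443 while the earlier one listened on :8443; keep whichever port the rest of your dns records and registries.conf configuration expect. A hedged check that the registry now rejects writes (the image name below is just a placeholder):
# reads still work
curl -k https://192.168.7.13:5443/v2/_catalog
# a push should now be refused by the read-only registry
# podman push --tls-verify=false <some-local-image> 192.168.7.13:5443/test/busybox:v1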
deploy power dns (pdns) as local dns service
我们配置本地的power dns,把我们需要的dns记录都写进去,并且配置它在kubelet之前启动。
We configure the local power dns, write all the dns records we need, and configure it to start before the kubelet.
oc patch mcp/master --patch '{"spec":{"paused":true}}' --type=merge
oc patch mcp/worker --patch '{"spec":{"paused":true}}' --type=merge
cat > ${BASE_DIR}/data/sno/pdns.bu << 'EOF'
variant: openshift
version: 4.10.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-pdns
storage:
files:
- path: /etc/pdns/pdns.conf
overwrite: true
contents:
inline: |
launch=bind
local-address=0.0.0.0
local-port=53
setgid=pdns
setuid=pdns
bind-config=/etc/pdns/bind.conf
bind-check-interval=300
enable-lua-records=yes
mode: 420
user:
name: root
- path: /etc/pdns/bind.conf
overwrite: true
contents:
inline: |
zone "acm-demo-hub.redhat.ren" { type master; file "/etc/pdns/inside-out.xyz"; };
zone "infra.redhat.ren" { type master; file "/etc/pdns/infra.xyz"; };
mode: 420
user:
name: root
- path: /etc/pdns/inside-out.xyz
overwrite: true
contents:
inline: |
$TTL 10
@ IN SOA ns1.acm-demo-hub.redhat.ren. postmaster.acm-demo-hub.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN A 192.168.7.13
;ns1 IN A 8.8.8.8
;ns2 IN A 8.8.4.4
helper IN A 192.168.7.11
;
;
; The api points to the IP of your load balancer
api IN A 192.168.7.13
api-int IN A 192.168.7.13
;
; The wildcard also points to the load balancer
*.apps IN A 192.168.7.13
;
; Create entry for the bootstrap host
; bootstrap IN A 192.168.7.12
;
; Create entries for the master hosts
master-0 IN A 192.168.7.13
;master-1 IN A 192.168.7.14
;master-2 IN A 192.168.7.15
;
; Create entries for the worker hosts
;worker-0 IN A 192.168.7.16
;worker-1 IN A 192.168.7.17
;worker-2 IN A 192.168.7.18
;
; The ETCd cluster lives on the masters...so point these to the IP of the masters
;etcd-0 IN A 192.168.7.13
;etcd-1 IN A 192.168.7.14
;etcd-2 IN A 192.168.7.15
;
; Create entries for the other hosts
registry IN A 192.168.7.13
yum IN A 192.168.7.1
nexus IN A 192.168.7.1
git IN A 192.168.7.11
tmp-registry IN A 192.168.7.177
mode: 420
user:
name: root
- path: /etc/pdns/infra.xyz
overwrite: true
contents:
inline: |
$TTL 10
@ IN SOA ns1.infra.redhat.ren. postmaster.infra.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN A 192.168.7.13
quay IN A 192.168.7.13
quaylab IN A 192.168.7.13
mode: 420
user:
name: root
systemd:
units:
- name: pdns.service
enabled: true
- name: kubelet.service
dropins:
- name: 99-after-pdns.conf
contents: |
[Unit]
Requires=pdns.service
After=pdns.service
EOF
butane ${BASE_DIR}/data/sno/pdns.bu > ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
oc create --save-config -f ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
# oc apply -f ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
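After the node picks up the pdns MachineConfig (the pools are unpaused further below), the zones can be tested against the node before repointing its dns; a sketch:
dig @192.168.7.13 api.acm-demo-hub.redhat.ren +short
# 192.168.7.13
dig @192.168.7.13 test.apps.acm-demo-hub.redhat.ren +short
# 192.168.7.13
dig @192.168.7.13 quaylab.infra.redhat.ren +short
# 192.168.7.13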
update registry.conf to point to local registry
默认情况下,这一步并不需要,但是作者的集群装的时候,对registries.conf做过特殊的配置,这里面就要把镜像仓库重新调整以下。image.registries.conf.sh脚本的源代码在这里。
By default this step is not required, but the author's cluster was installed with a customized registries.conf, so the registry mirror configuration has to be adjusted again here. The source code of the image.registries.conf.sh script is here.
######################
# run as root
cd /data/ocp4
bash image.registries.conf.sh quay.infra.redhat.ren:8443
######################
oc apply -f /data/ocp4/99-worker-container-registries.yaml
oc apply -f /data/ocp4/99-master-container-registries.yaml
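To confirm the mirror configuration actually landed on the node after the pools are unpaused and the node reboots, the rendered file can be read back; a sketch, assuming the generated MachineConfig writes /etc/containers/registries.conf (the usual location):
oc debug node/acm-demo-hub-master -- chroot /host \
  cat /etc/containers/registries.conf
# look for mirror entries pointing at quay.infra.redhat.ren:8443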
set sno dns to local dns service
更改single node ocp的dns配置,根据集群安装的方法不同而不同。本次实验的集群的安装方法在这里,于是我们就这样来更改dns指向。
How to change the dns configuration of the single node ocp depends on how the cluster was installed. The cluster in this experiment was installed as described here, so we repoint the dns as follows.
NTP_SERVER=192.168.7.11
HELP_SERVER=192.168.7.11
KVM_HOST=192.168.7.11
API_VIP=192.168.7.100
INGRESS_VIP=192.168.7.101
CLUSTER_PROVISION_IP=192.168.7.103
BOOTSTRAP_IP=192.168.7.12
ACM_DEMO_MNGED_CLUSTER=acm-demo1
ACM_DEMO_MNGED_SNO_IP=192.168.7.15
# define the node information of the single node cluster
SNO_CLUSTER_NAME=acm-demo-hub
SNO_BASE_DOMAIN=redhat.ren
SNO_IP=192.168.7.13
# ocp bug: the gateway needs to be online, otherwise ovn misbehaves and ingress fails to start.
SNO_GW=192.168.7.9
SNO_NETMAST=255.255.255.0
SNO_NETMAST_S=24
SNO_HOSTNAME=acm-demo-hub-master
SNO_IF=enp1s0
SNO_IF_MAC=`printf '00:60:2F:%02X:%02X:%02X' $[RANDOM%256] $[RANDOM%256] $[RANDOM%256]`
SNO_DNS=192.168.7.11
SNO_DISK=/dev/vda
SNO_CORE_PWD=redhat
export BASE_DIR='/home/sno/'
cat << EOF > ${BASE_DIR}/data/sno/static.ip.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-static-ip
storage:
files:
- path: /etc/NetworkManager/system-connections/${SNO_IF}.nmconnection
mode: 0600
overwrite: true
contents:
inline: |
[connection]
id=${SNO_IF}
type=ethernet
autoconnect-retries=1
interface-name=${SNO_IF}
multi-connect=1
permissions=
wait-device-timeout=60000
[ethernet]
mac-address-blacklist=
[ipv4]
address1=${SNO_IP}/${SNO_NETMAST_S=24},${SNO_GW}
dhcp-hostname=${SNO_HOSTNAME}
dhcp-timeout=90
dns=${SNO_IP};
dns-search=
may-fail=false
method=manual
[ipv6]
addr-gen-mode=eui64
dhcp-hostname=${SNO_HOSTNAME}
dhcp-timeout=90
dns-search=
method=disabled
[proxy]
EOF
butane ${BASE_DIR}/data/sno/static.ip.bu > ${BASE_DIR}/data/sno/disconnected/99-zzz-master-ip.yaml
oc apply -f ${BASE_DIR}/data/sno/disconnected/99-zzz-master-ip.yaml
oc patch mcp/master --patch '{"spec":{"paused":false}}' --type=merge
oc patch mcp/worker --patch '{"spec":{"paused":false}}' --type=merge
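Unpausing lets the queued MachineConfigs roll out together (typically with a single reboot); watch the pool until it reports Updated again:
oc get mcp -w
# wait until the master pool shows UPDATED=True, UPDATING=False
oc get node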
test with force power off
我们知道,如果ocp node意外断电的话,启动的时候,他会重新下载集群需要的基础镜像,那么我们就暴力断电,来测试sno能否启动吧。
We know that if an ocp node loses power unexpectedly, it re-downloads the base images the cluster needs when it starts up, so let's cut the power abruptly and test whether the sno comes back.
重启之后,正常启动。 / After restarting, it starts up normally.
oc get co
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# authentication 4.10.26 True False False 30m
# baremetal 4.10.26 True False False 4d22h
# cloud-controller-manager 4.10.26 True False False 4d22h
# cloud-credential 4.10.26 True False False 4d22h
# cluster-autoscaler 4.10.26 True False False 4d22h
# config-operator 4.10.26 True False False 4d22h
# console 4.10.26 True False False 7m23s
# csi-snapshot-controller 4.10.26 True False False 4d22h
# dns 4.10.26 True False False 20m
# etcd 4.10.26 True False False 4d22h
# image-registry 4.10.26 True False False 4d22h
# ingress 4.10.26 True False False 4d22h
# insights 4.10.26 True False False 40s
# kube-apiserver 4.10.26 True False False 4d22h
# kube-controller-manager 4.10.26 True False False 4d22h
# kube-scheduler 4.10.26 True False False 4d22h
# kube-storage-version-migrator 4.10.26 True False False 3d18h
# machine-api 4.10.26 True False False 4d22h
# machine-approver 4.10.26 True False False 4d22h
# machine-config 4.10.26 True False False 4d22h
# marketplace 4.10.26 True False False 4d22h
# monitoring 4.10.26 True False False 4d22h
# network 4.10.26 True False False 4d22h
# node-tuning 4.10.26 True False False 30m
# openshift-apiserver 4.10.26 True False False 3d22h
# openshift-controller-manager 4.10.26 True False False 2d19h
# openshift-samples 4.10.26 True False False 3d23h
# operator-lifecycle-manager 4.10.26 True False False 4d22h
# operator-lifecycle-manager-catalog 4.10.26 True False False 4d22h
# operator-lifecycle-manager-packageserver 4.10.26 True False False 7m48s
# service-ca 4.10.26 True False False 4d22h
# storage 4.10.26 True False False 4d22h
test with ocp upgrade
我们上传的镜像,包括了4.10.26, 4.10.28两个版本,那么我们就来试试升级吧
The images we uploaded include two versions, 4.10.26 and 4.10.28, so let's try an upgrade.
rpm-ostree status
# State: idle
# Deployments:
# ● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# before upgrade, make sure the rpm repo is online
# rpm-ostree will call rpm repo during upgrade
# although it will not download anything
# upgrade ocp to 4.10.28
oc adm upgrade \
--to-image=quay.io/openshift-release-dev/ocp-release@sha256:2127608ebd67a2470860c42368807a0de2308dba144ec4c298bec1c03d79cb52 \
--allow-explicit-upgrade --allow-upgrade-with-warnings=true --force=true
rpm-ostree status
# State: idle
# Deployments:
# ● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:822737b305b28aa4890f7bf847ebebc896cd7b549318195fc8c953ae3008cc44
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208161501-0 (2022-08-16T15:04:45Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor
oc get co
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# authentication 4.10.28 True False False 26m
# baremetal 4.10.28 True False False 130m
# cloud-controller-manager 4.10.28 True False False 130m
# cloud-credential 4.10.28 True False False 154m
# cluster-autoscaler 4.10.28 True False False 130m
# config-operator 4.10.28 True False False 142m
# console 4.10.28 True False False 26m
# csi-snapshot-controller 4.10.28 True False False 32m
# dns 4.10.28 True False False 26m
# etcd 4.10.28 True False False 138m
# image-registry 4.10.28 True False False 36m
# ingress 4.10.28 True False False 141m
# insights 4.10.28 True False False 17s
# kube-apiserver 4.10.28 True False False 131m
# kube-controller-manager 4.10.28 True False False 136m
# kube-scheduler 4.10.28 True False False 133m
# kube-storage-version-migrator 4.10.28 True False False 141m
# machine-api 4.10.28 True False False 130m
# machine-approver 4.10.28 True False False 141m
# machine-config 4.10.28 True False False 138m
# marketplace 4.10.28 True False False 141m
# monitoring 4.10.28 True False False 35m
# network 4.10.28 True False False 142m
# node-tuning 4.10.28 True False False 36m
# openshift-apiserver 4.10.28 True False False 36m
# openshift-controller-manager 4.10.28 True False False 131m
# openshift-samples 4.10.28 True False False 36m
# operator-lifecycle-manager 4.10.28 True False False 130m
# operator-lifecycle-manager-catalog 4.10.28 True False False 130m
# operator-lifecycle-manager-packageserver 4.10.28 True False False 104m
# service-ca 4.10.28 True False False 141m
# storage 4.10.28 True False False 130m
我们可以看到,能够正常的升级和启动。
We can see that it can be upgraded and started normally.
3 node cluster
接下来,我们尝试 3 node openshift / compact cluster。我们的目标,是把helper上的以下组件,用openshift 4的节点中的组件替代:
Next, we try a 3 node openshift / compact cluster. Our goal is to replace the following components on the helper with components running on the openshift 4 nodes:
- dns -> pdns (power dns)
- haproxy -> pdns lua plugin (ifportup)
- image registry -> docker distribution
而NTP服务,我们依然认为网络交换机/路由器可以提供。
As for NTP, we still assume the network switch/router can provide it.
install rpm to rhcos
这个步骤,和single node ocp是一样的,只不过需要在 3 master 上都执行一遍。另外,我们多安装了一个pdns-selinux, 这个包和docker-distribution都是作者自己打包的,pdns-selinux补充了selinux规则,运行pdns能够做对外的端口检查。
This step is the same as for single node ocp, except it has to be executed on all 3 masters. In addition we install one extra package, pdns-selinux; this package and docker-distribution are both packaged by the author. pdns-selinux adds the selinux rules that allow pdns to perform its outbound port checks.
# Delete cached rpm repo metadata
# rpm-ostree cleanup -m
rpm-ostree install htop pdns pdns-recursor docker-distribution pdns-selinux
# Added:
# pdns-selinux-0.0.1-0.el8.x86_64
# Run "systemctl reboot" to start a reboot
reboot
rpm-ostree status
# State: idle
# Deployments:
# ● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor pdns-selinux
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor
embed registry on each ocp node
这个步骤,也和 single node ocp是一样的。
This step is also the same as single node ocp.
export BASE_DIR='/home/3node/'
export VAR_CERT_DIR=/etc/crts/
# ......
upload registry content
这个步骤,和single node ocp是一样的,只不过需要为 3 master 都执行一遍。
This step is the same as single node ocp, but it needs to be executed for all 3 masters.
oc-mirror --dest-skip-tls --from mirror_seq1_000000.tar docker://192.168.7.13:8443
oc-mirror --dest-skip-tls --from mirror_seq1_000000.tar docker://192.168.7.14:8443
oc-mirror --dest-skip-tls --from mirror_seq1_000000.tar docker://192.168.7.15:8443
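The same upload can be looped over the three masters instead of being typed three times; a small convenience sketch:
for ip in 192.168.7.13 192.168.7.14 192.168.7.15; do
  oc-mirror --dest-skip-tls --from mirror_seq1_000000.tar docker://${ip}:8443
done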
deploy power dns (pdns) as local dns service
我们配置本地的power dns,把我们需要的dns记录都写进去,并且配置它在kubelet之前启动。这一步和之前的single node ocp不一样,需要用到pdns lua plugin,用 ifportup 的方法,探测对应节点上的端口是否打开,如果没有打开,认为对应的服务没有启动,或者节点掉线,然后 pdns 就不会返回对应节点的解析。我们用这种方法,来代替haproxy。
We configure the local power dns, write in all the dns records we need, and make it start before the kubelet. This step differs from the single node ocp case: we need the pdns lua plugin and its ifportup method to probe whether the given port on each node is open. If the port is not open, the corresponding service is considered down, or the node offline, and pdns will not return that node's address in its answers. We use this mechanism in place of haproxy.
cat > ${BASE_DIR}/data/sno/pdns.bu << 'EOF'
variant: openshift
version: 4.10.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-pdns
storage:
files:
- path: /etc/pdns/pdns.conf
overwrite: true
contents:
inline: |
launch=bind
local-address=0.0.0.0
local-port=53
setgid=pdns
setuid=pdns
bind-config=/etc/pdns/bind.conf
bind-check-interval=300
enable-lua-records=yes
mode: 420
user:
name: root
- path: /etc/pdns/bind.conf
overwrite: true
contents:
inline: |
zone "acm-demo-hub.redhat.ren" { type master; file "/etc/pdns/inside-out.xyz"; };
zone "infra.redhat.ren" { type master; file "/etc/pdns/infra.xyz"; };
mode: 420
user:
name: root
- path: /etc/pdns/inside-out.xyz
overwrite: true
contents:
inline: |
$TTL 10
@ IN SOA ns1.acm-demo-hub.redhat.ren. postmaster.acm-demo-hub.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;ns1 IN A 8.8.8.8
;ns2 IN A 8.8.4.4
helper IN A 192.168.7.11
;
;
; The api points to the IP of your load balancer
api IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
api-int IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;
; The wildcard also points to the load balancer
*.apps IN LUA A "ifportup(443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;
; Create entry for the bootstrap host
; bootstrap IN A 192.168.7.12
;
; Create entries for the master hosts
;master-0 IN A 192.168.7.13
;master-1 IN A 192.168.7.14
;master-2 IN A 192.168.7.15
;
; Create entries for the worker hosts
;worker-0 IN A 192.168.7.16
;worker-1 IN A 192.168.7.17
;worker-2 IN A 192.168.7.18
;
; The ETCd cluster lives on the masters...so point these to the IP of the masters
;etcd-0 IN A 192.168.7.13
;etcd-1 IN A 192.168.7.14
;etcd-2 IN A 192.168.7.15
;
; Create entries for the other hosts
;registry IN LUA A "ifportup(8443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;yum IN A 192.168.7.1
;quay IN LUA A "ifportup(8443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
nexus IN A 192.168.7.1
git IN A 192.168.7.11
tmp-registry IN A 192.168.7.177
mode: 420
user:
name: root
- path: /etc/pdns/infra.xyz
overwrite: true
contents:
inline: |
$TTL 10
@ IN SOA ns1.infra.redhat.ren. postmaster.infra.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN A 192.168.7.13
quay IN LUA A "ifportup(8443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
quaylab IN LUA A "ifportup(8443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
mode: 420
user:
name: root
systemd:
units:
- name: pdns.service
enabled: true
- name: kubelet.service
dropins:
- name: 99-after-pdns.conf
contents: |
[Unit]
Requires=pdns.service
After=pdns.service
EOF
butane ${BASE_DIR}/data/sno/pdns.bu > ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
oc create --save-config -f ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
# oc apply -f ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
update registry.conf to point to local registry
这个步骤,也和 single node ocp是一样的。根据集群的安装方法不同,而不同。
This step is also the same as for single node ocp; the details vary with how the cluster was installed.
######################
# run as root
cd /data/ocp4
bash image.registries.conf.sh quay.infra.redhat.ren:8443
######################
oc patch mcp/master --patch '{"spec":{"paused":true}}' --type=merge
oc patch mcp/worker --patch '{"spec":{"paused":true}}' --type=merge
oc apply -f /data/ocp4/99-worker-container-registries.yaml
oc apply -f /data/ocp4/99-master-container-registries.yaml
oc patch mcp/master --patch '{"spec":{"paused":false}}' --type=merge
oc patch mcp/worker --patch '{"spec":{"paused":false}}' --type=merge
set sno dns to local dns service
把dns指向到本地的 power dns, 指向的方法根据集群安装的方法各不相同。作者的 3 node / compact cluster 是这么安装的,因为网络使用ovn,dns配置信息会在启动的时候,从网卡copy到 br-ex 上,所以作者需要在每个节点上,修改网卡的dns指向,然后重启。
Point the dns at the local power dns; how to do this depends on how the cluster was installed. The author's 3 node / compact cluster was installed as described here. Because the network uses ovn, the dns configuration is copied from the NIC to br-ex at startup, so the author has to change the NIC's dns setting on each node and then reboot.
# for master-01
nmcli con mod enp1s0 ipv4.dns 192.168.7.13
reboot
# for master-02
nmcli con mod enp1s0 ipv4.dns 192.168.7.14
reboot
# for master-03
nmcli con mod enp1s0 ipv4.dns 192.168.7.15
reboot
# after reboot, test the dns
dig @127.0.0.1 quaylab.infra.redhat.ren
# ; <<>> DiG 9.11.26-RedHat-9.11.26-4.el8_4 <<>> @127.0.0.1 quaylab.infra.redhat.ren
# ; (1 server found)
# ;; global options: +cmd
# ;; Got answer:
# ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55590
# ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
# ;; WARNING: recursion requested but not available
# ;; OPT PSEUDOSECTION:
# ; EDNS: version: 0, flags:; udp: 1232
# ;; QUESTION SECTION:
# ;quaylab.infra.redhat.ren. IN A
# ;; ANSWER SECTION:
# quaylab.infra.redhat.ren. 10 IN A 192.168.7.15
# ;; Query time: 7 msec
# ;; SERVER: 127.0.0.1#53(127.0.0.1)
# ;; WHEN: Thu Sep 15 02:23:09 UTC 2022
# ;; MSG SIZE rcvd: 69
dig @127.0.0.1 api.acm-demo-hub.redhat.ren
# ; <<>> DiG 9.11.26-RedHat-9.11.26-4.el8_4 <<>> @127.0.0.1 api.acm-demo-hub.redhat.ren
# ; (1 server found)
# ;; global options: +cmd
# ;; Got answer:
# ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14103
# ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
# ;; WARNING: recursion requested but not available
# ;; OPT PSEUDOSECTION:
# ; EDNS: version: 0, flags:; udp: 1232
# ;; QUESTION SECTION:
# ;api.acm-demo-hub.redhat.ren. IN A
# ;; ANSWER SECTION:
# api.acm-demo-hub.redhat.ren. 10 IN A 192.168.7.15
# ;; Query time: 1 msec
# ;; SERVER: 127.0.0.1#53(127.0.0.1)
# ;; WHEN: Thu Sep 15 02:24:19 UTC 2022
# ;; MSG SIZE rcvd: 72
dig @127.0.0.1 a.apps.acm-demo-hub.redhat.ren
# ; <<>> DiG 9.11.26-RedHat-9.11.26-4.el8_4 <<>> @127.0.0.1 a.apps.acm-demo-hub.redhat.ren
# ; (1 server found)
# ;; global options: +cmd
# ;; Got answer:
# ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16264
# ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
# ;; WARNING: recursion requested but not available
# ;; OPT PSEUDOSECTION:
# ; EDNS: version: 0, flags:; udp: 1232
# ;; QUESTION SECTION:
# ;a.apps.acm-demo-hub.redhat.ren. IN A
# ;; ANSWER SECTION:
# a.apps.acm-demo-hub.redhat.ren. 10 IN A 192.168.7.14
# ;; Query time: 1 msec
# ;; SERVER: 127.0.0.1#53(127.0.0.1)
# ;; WHEN: Thu Sep 15 02:25:20 UTC 2022
# ;; MSG SIZE rcvd: 75
test with force power off
我们知道,如果ocp node意外断电的话,启动的时候,他会重新下载集群需要的基础镜像,那么我们就暴力断电其中一个节点,来测试这个节点能否启动吧。
We know that if an ocp node loses power unexpectedly, it re-downloads the base images the cluster needs when it starts up, so let's cut the power on one of the nodes abruptly and test whether that node comes back.
oc get mcp
# NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
# master rendered-master-80dda25e010fb6de88514875eefd7c19 True False False 3 3 3 0 19h
# worker rendered-worker-df248a1c64755ca00714f4f2b6d13e48 True False False 0 0 0 0 19h
oc get node
# NAME STATUS ROLES AGE VERSION
# master-01-demo Ready master,worker 19h v1.23.5+012e945
# master-02-demo Ready master,worker 19h v1.23.5+012e945
# master-03-demo Ready master,worker 19h v1.23.5+012e945
oc get co
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# authentication 4.10.26 True False False 3m14s
# baremetal 4.10.26 True False False 19h
# cloud-controller-manager 4.10.26 True False False 19h
# cloud-credential 4.10.26 True False False 19h
# cluster-autoscaler 4.10.26 True False False 19h
# config-operator 4.10.26 True False False 19h
# console 4.10.26 True False False 3m58s
# csi-snapshot-controller 4.10.26 True False False 19h
# dns 4.10.26 True False False 153m
# etcd 4.10.26 True False False 19h
# image-registry 4.10.26 True False False 19h
# ingress 4.10.26 True False False 130m
# insights 4.10.26 True False False 55s
# kube-apiserver 4.10.26 True False False 19h
# kube-controller-manager 4.10.26 True False False 19h
# kube-scheduler 4.10.26 True False False 19h
# kube-storage-version-migrator 4.10.26 True False False 71m
# machine-api 4.10.26 True False False 19h
# machine-approver 4.10.26 True False False 19h
# machine-config 4.10.26 True False False 12h
# marketplace 4.10.26 True False False 19h
# monitoring 4.10.26 True False False 19h
# network 4.10.26 True False False 19h
# node-tuning 4.10.26 True False False 19h
# openshift-apiserver 4.10.26 True False False 131m
# openshift-controller-manager 4.10.26 True False False 19h
# openshift-samples 4.10.26 True False False 19h
# operator-lifecycle-manager 4.10.26 True False False 19h
# operator-lifecycle-manager-catalog 4.10.26 True False False 19h
# operator-lifecycle-manager-packageserver 4.10.26 True False False 131m
# service-ca 4.10.26 True False False 19h
# storage 4.10.26 True False False 19h
测试结果,能正常启动。
The test result shows that it can be started normally.
test showdown 1 master
我们关掉一个节点,然后看集群的状态
We shut down a node and see the status of the cluster
oc get node
# NAME STATUS ROLES AGE VERSION
# master-01-demo NotReady master,worker 19h v1.23.5+012e945
# master-02-demo Ready master,worker 19h v1.23.5+012e945
# master-03-demo Ready master,worker 19h v1.23.5+012e945
oc get co
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# authentication 4.10.26 True False False 8m5s
# baremetal 4.10.26 True False False 19h
# cloud-controller-manager 4.10.26 True False False 19h
# cloud-credential 4.10.26 True False False 19h
# cluster-autoscaler 4.10.26 True False False 19h
# config-operator 4.10.26 True False False 19h
# console 4.10.26 True False False 14m
# csi-snapshot-controller 4.10.26 True False False 19h
# dns 4.10.26 True True False 164m DNS "default" reports Progressing=True: "Have 2 available node-resolver pods, want 3."
# etcd 4.10.26 True False True 19h ClusterMemberControllerDegraded: unhealthy members found during reconciling members...
# image-registry 4.10.26 True False False 19h
# ingress 4.10.26 True False False 141m
# insights 4.10.26 True False False 93s
# kube-apiserver 4.10.26 True False True 19h NodeControllerDegraded: The master nodes not ready: node "master-01-demo" not ready since 2022-09-15 03:33:40 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
# kube-controller-manager 4.10.26 True False True 19h NodeControllerDegraded: The master nodes not ready: node "master-01-demo" not ready since 2022-09-15 03:33:40 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
# kube-scheduler 4.10.26 True False True 19h NodeControllerDegraded: The master nodes not ready: node "master-01-demo" not ready since 2022-09-15 03:33:40 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
# kube-storage-version-migrator 4.10.26 True False False 82m
# machine-api 4.10.26 True False False 19h
# machine-approver 4.10.26 True False False 19h
# machine-config 4.10.26 True False False 12h
# marketplace 4.10.26 True False False 19h
# monitoring 4.10.26 True False False 19h
# network 4.10.26 True True False 19h DaemonSet "openshift-multus/multus" is not available (awaiting 1 nodes)...
# node-tuning 4.10.26 True False False 19h
# openshift-apiserver 4.10.26 True False False 8m
# openshift-controller-manager 4.10.26 True False False 19h
# openshift-samples 4.10.26 True False False 19h
# operator-lifecycle-manager 4.10.26 True False False 19h
# operator-lifecycle-manager-catalog 4.10.26 True False False 19h
# operator-lifecycle-manager-packageserver 4.10.26 True False False 142m
# service-ca 4.10.26 True False False 19h
# storage 4.10.26 True False False 19h
关闭了一个节点,集群还能工作。
After shutting down a node, the cluster still works.
看看web console能否使用? / See if the web console can be used?
test with ocp upgrade
我们上传的镜像,包括了4.10.26, 4.10.28两个版本,那么我们就来试试升级吧
The images we uploaded include two versions, 4.10.26 and 4.10.28, so let's try an upgrade.
oc get node
# NAME STATUS ROLES AGE VERSION
# master-01-demo Ready master,worker 19h v1.23.5+012e945
# master-02-demo Ready master,worker 19h v1.23.5+012e945
# master-03-demo Ready master,worker 19h v1.23.5+012e945
oc get clusterversion
# NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
# version 4.10.26 True False 19h Cluster version is 4.10.26
# upgrade ocp to 4.10.28
oc adm upgrade \
--to-image=quay.io/openshift-release-dev/ocp-release@sha256:2127608ebd67a2470860c42368807a0de2308dba144ec4c298bec1c03d79cb52 \
--allow-explicit-upgrade --allow-upgrade-with-warnings=true --force=true
# after upgrade
oc get clusterversion
# NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
# version 4.10.28 True False 43m Cluster version is 4.10.28
oc get co
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# authentication 4.10.28 True False False 62m
# baremetal 4.10.28 True False False 21h
# cloud-controller-manager 4.10.28 True False False 21h
# cloud-credential 4.10.28 True False False 22h
# cluster-autoscaler 4.10.28 True False False 21h
# config-operator 4.10.28 True False False 21h
# console 4.10.28 True False False 148m
# csi-snapshot-controller 4.10.28 True False False 21h
# dns 4.10.28 True False False 4h58m
# etcd 4.10.28 True False False 21h
# image-registry 4.10.28 True False False 21h
# ingress 4.10.28 True False False 4h35m
# insights 4.10.28 True False False 81s
# kube-apiserver 4.10.28 True False False 21h
# kube-controller-manager 4.10.28 True False False 21h
# kube-scheduler 4.10.28 True False False 21h
# kube-storage-version-migrator 4.10.28 True False False 54m
# machine-api 4.10.28 True False False 21h
# machine-approver 4.10.28 True False False 21h
# machine-config 4.10.28 True False False 129m
# marketplace 4.10.28 True False False 21h
# monitoring 4.10.28 True False False 21h
# network 4.10.28 True False False 21h
# node-tuning 4.10.28 True False False 100m
# openshift-apiserver 4.10.28 True False False 142m
# openshift-controller-manager 4.10.28 True False False 21h
# openshift-samples 4.10.28 True False False 98m
# operator-lifecycle-manager 4.10.28 True False False 21h
# operator-lifecycle-manager-catalog 4.10.28 True False False 21h
# operator-lifecycle-manager-packageserver 4.10.28 True False False 4h36m
# service-ca 4.10.28 True False False 21h
# storage 4.10.28 True False False 21h
oc get mcp
# NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
# master rendered-master-24f4773e2eb47a6524572c1e7185e836 True False False 3 3 3 0 21h
# worker rendered-worker-28261f188bfcb7348c5f6aab2e876b2e True False False 0 0 0 0 21h
rpm-ostree status
# State: idle
# Deployments:
# ● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:822737b305b28aa4890f7bf847ebebc896cd7b549318195fc8c953ae3008cc44
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208161501-0 (2022-08-16T15:04:45Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor pdns-selinux
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23d0609643c25efcd30a7a64483fdee2343ced26b1fd08c0cbf8d03a5d405939
# CustomOrigin: Managed by machine-config-operator
# Version: 410.84.202208030316-0 (2022-08-03T03:19:21Z)
# LayeredPackages: docker-distribution htop pdns pdns-recursor pdns-selinux
我们可以看到,升级成功,各个后安装的软件包也都在。
We can see that the upgrade succeeded and the packages we layered on afterwards are all still there.
web console工作也正常。 / The web console works fine too.
finished
notes
research
yum install -y pdns pdns-recursor
mv /etc/pdns/pdns.conf /etc/pdns/pdns.conf.bak
cat << EOF > /etc/pdns/pdns.conf
launch=bind
local-address=127.0.0.1
local-port=5301
setgid=pdns
setuid=pdns
bind-config=/etc/pdns/bind.conf
bind-check-interval=300
enable-lua-records=yes
EOF
cat << EOF > /etc/pdns/bind.conf
zone "ocp4.redhat.ren" { type master; file "/etc/pdns/inside-out.xyz"; };
EOF
cat << 'EOF' > /etc/pdns/inside-out.xyz
$TTL 180
@ IN SOA ns1.ocp4.redhat.ren. postmaster.ocp4.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
IN NS ns1.ocp4.redhat.ren.
IN NS ns2.ocp4.redhat.ren.
@ IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
ns1 IN A 8.8.8.8
ns2 IN A 8.8.4.4
helper IN A 192.168.7.11
;
;
; The api points to the IP of your load balancer
api IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
api-int IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;
; The wildcard also points to the load balancer
*.apps IN LUA A "ifportup(443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;
; Create entry for the bootstrap host
; bootstrap IN A 192.168.7.12
;
; Create entries for the master hosts
master-0 IN A 192.168.7.13
master-1 IN A 192.168.7.14
master-2 IN A 192.168.7.15
;
; Create entries for the worker hosts
worker-0 IN A 192.168.7.16
worker-1 IN A 192.168.7.17
worker-2 IN A 192.168.7.18
;
; The ETCd cluster lives on the masters...so point these to the IP of the masters
etcd-0 IN A 192.168.7.13
etcd-1 IN A 192.168.7.14
etcd-2 IN A 192.168.7.15
;
; Create entries for the other hosts
registry IN LUA A "ifportup(5443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
yum IN A 192.168.7.1
quay IN LUA A "ifportup(5443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
nexus IN A 192.168.7.1
git IN A 192.168.7.11
tmp-registry IN A 192.168.7.177
EOF
# ausearch -c 'pdns_server' --raw | audit2allow -M my-pdnsserver
# semodule -X 300 -i my-pdnsserver.pp
# SELinux is preventing /usr/sbin/pdns_server from name_connect access on the tcp_socket port 6443.
# ***** Plugin connect_ports (92.2 confidence) suggests *********************
# If you want to allow /usr/sbin/pdns_server to connect to network port 6443
# Then you need to modify the port type.
# Do
# # semanage port -a -t PORT_TYPE -p tcp 6443
# where PORT_TYPE is one of the following: dns_port_t, dnssec_port_t, kerberos_port_t, ocsp_port_t.
# ***** Plugin catchall_boolean (7.83 confidence) suggests ******************
# If you want to allow system to run with NIS
# Then you must tell SELinux about this by enabling the 'nis_enabled' boolean.
# Do
# setsebool -P nis_enabled 1
# ***** Plugin catchall (1.41 confidence) suggests **************************
# If you believe that pdns_server should be allowed name_connect access on the port 6443 tcp_socket by default.
# Then you should report this as a bug.
# You can generate a local policy module to allow this access.
# Do
# allow this access for now by executing:
# # ausearch -c 'pdns/distributo' --raw | audit2allow -M my-pdnsdistributo
# # semodule -X 300 -i my-pdnsdistributo.pp
systemctl enable --now pdns
pdnsutil check-all-zones
mv /etc/pdns-recursor/recursor.conf /etc/pdns-recursor/recursor.conf.bak
cat << EOF > /etc/pdns-recursor/recursor.conf
local-address=0.0.0.0 ::
allow-from=192.168.7.0/0 # allow requests from all clients
dnssec=off # disable dnssec
forward-zones=ocp4.redhat.ren=127.0.0.1:5301
forward-zones-recurse=.=114.114.114.114
setgid=pdns-recursor
setuid=pdns-recursor
security-poll-suffix=
EOF
systemctl enable --now pdns-recursor
ausearch -m avc --start recent -i
audit2allow -a -M wzh-pdns
semodule -i wzh-pdns.pp
systemctl restart pdns
dig @127.0.0.1 helper.ocp4.redhat.ren
dig @127.0.0.1 api.ocp4.redhat.ren
dig @127.0.0.1 c.apps.ocp4.redhat.ren
dig @127.0.0.1 registry.ocp4.redhat.ren
test stand alone
dnf install -y epel-release
dnf install -y pdns pdns-recursor
dnf update -y
semodule -i wzh-pdns.pp
cat << EOF > /etc/pdns/pdns.conf
launch=bind
local-address=0.0.0.0 ::
# local-port=5301
setgid=pdns
setuid=pdns
bind-config=/etc/pdns/bind.conf
bind-check-interval=300
enable-lua-records=yes
EOF
cat << EOF > /etc/pdns/bind.conf
zone "ocp4.redhat.ren" { type master; file "/etc/pdns/inside-out.xyz"; };
EOF
cat << 'EOF' > /etc/pdns/inside-out.xyz
$TTL 180
@ IN SOA ns1.ocp4.redhat.ren. postmaster.ocp4.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
IN NS ns1.ocp4.redhat.ren.
IN NS ns2.ocp4.redhat.ren.
@ IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
ns1 IN A 8.8.8.8
ns2 IN A 8.8.4.4
helper IN A 192.168.7.11
;
;
; The api points to the IP of your load balancer
api IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
api-int IN LUA A "ifportup(6443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;
; The wildcard also points to the load balancer
*.apps IN LUA A "ifportup(443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
;
; Create entry for the bootstrap host
; bootstrap IN A 192.168.7.12
;
; Create entries for the master hosts
master-0 IN A 192.168.7.13
master-1 IN A 192.168.7.14
master-2 IN A 192.168.7.15
;
; Create entries for the worker hosts
worker-0 IN A 192.168.7.16
worker-1 IN A 192.168.7.17
worker-2 IN A 192.168.7.18
;
; The ETCd cluster lives on the masters...so point these to the IP of the masters
etcd-0 IN A 192.168.7.13
etcd-1 IN A 192.168.7.14
etcd-2 IN A 192.168.7.15
;
; Create entries for the other hosts
registry IN LUA A "ifportup(5443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
yum IN A 192.168.7.1
quay IN LUA A "ifportup(5443, {'192.168.7.13', '192.168.7.14', '192.168.7.15'})"
nexus IN A 192.168.7.1
git IN A 192.168.7.11
tmp-registry IN A 192.168.7.177
EOF
systemctl enable --now pdns
dig @127.0.0.1 helper.ocp4.redhat.ren
dig @127.0.0.1 api.ocp4.redhat.ren
dig @127.0.0.1 c.apps.ocp4.redhat.ren
dig @127.0.0.1 registry.ocp4.redhat.ren
test install
======================================================================================================================================================================================================
Package Architecture Version Repository Size
======================================================================================================================================================================================================
Installing:
pdns x86_64 4.6.2-1.el8 epel 3.7 M
pdns-recursor x86_64 4.3.6-1.el8 epel 2.0 M
Installing dependencies:
boost-context x86_64 1.66.0-10.el8 appstream 15 k
boost-program-options x86_64 1.66.0-10.el8 appstream 140 k
libsodium x86_64 1.0.18-2.el8 epel 162 k
luajit x86_64 2.1.0-0.16beta3.el8 epel 359 k
protobuf x86_64 3.5.0-13.el8 appstream 892 k
Transaction Summary
======================================================================================================================================================================================================
Install 7 Packages
registry
cat << EOF > /usr/lib/systemd/system/docker-distribution.service
[Unit]
Description=v2 Registry server for Docker
After=network.target
[Service]
Type=simple
ExecStart=/usr/bin/registry serve /etc/wzh/registry-config.yml
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
mkdir -p /etc/wzh
cat << EOF > /etc/wzh/registry-config.yml
version: 0.1
log:
accesslog:
disabled: true
fields:
service: registry
storage:
cache:
layerinfo: inmemory
filesystem:
rootdirectory: /var/wzh-registry
delete:
enabled: false
maintenance:
readonly:
enabled: true
http:
addr: :5443
tls:
certificate: /etc/wzh/redhat.ren.crt
key: /etc/wzh/redhat.ren.key
EOF
# configure the registry
export VAR_CERT_DIR=/etc/wzh/
mkdir -p ${VAR_CERT_DIR} && cd ${VAR_CERT_DIR}
# https://access.redhat.com/documentation/en-us/red_hat_codeready_workspaces/2.1/html/installation_guide/installing-codeready-workspaces-in-tls-mode-with-self-signed-certificates_crw
openssl genrsa -out ${VAR_CERT_DIR}/redhat.ren.ca.key 4096
openssl req -x509 \
-new -nodes \
-key ${VAR_CERT_DIR}/redhat.ren.ca.key \
-sha256 \
-days 36500 \
-out ${VAR_CERT_DIR}/redhat.ren.ca.crt \
-subj /CN="Local Red Hat Ren Signer" \
-reqexts SAN \
-extensions SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf '[SAN]\nbasicConstraints=critical, CA:TRUE\nkeyUsage=keyCertSign, cRLSign, digitalSignature'))
openssl genrsa -out ${VAR_CERT_DIR}/redhat.ren.key 2048
openssl req -new -sha256 \
-key ${VAR_CERT_DIR}/redhat.ren.key \
-subj "/O=Local Red Hat Ren /CN=*.ocp4.redhat.ren" \
-reqexts SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf "\n[SAN]\nsubjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth")) \
-out ${VAR_CERT_DIR}/redhat.ren.csr
openssl x509 \
-req \
-sha256 \
-extfile <(printf "subjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth") \
-days 36500 \
-in ${VAR_CERT_DIR}/redhat.ren.csr \
-CA ${VAR_CERT_DIR}/redhat.ren.ca.crt \
-CAkey ${VAR_CERT_DIR}/redhat.ren.ca.key \
-CAcreateserial -out ${VAR_CERT_DIR}/redhat.ren.crt
openssl x509 -in ${VAR_CERT_DIR}/redhat.ren.crt -text
/bin/cp -f ${VAR_CERT_DIR}/redhat.ren.ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
cat << EOF >> /etc/hosts
127.0.0.1 registry.redhat.ren
EOF
mkdir -p /var/wzh-registry
systemctl restart docker-distribution
podman for pdns & registry
mkdir -p /data/pdns/conf
cd /data/pdns
cat > /data/pdns/pdns.Dockerfile << EOF
FROM docker.io/library/almalinux:8
RUN dnf -y install epel-release
RUN dnf -y update
RUN dnf -y install pdns pdns-recursor
ENTRYPOINT ["/usr/sbin/pdns_server"]
CMD ["--socket-dir=/tmp/pdns", "--guardian=no", "--daemon=no", "--disable-syslog", "--log-timestamp=no", "--write-pid=no"]
EOF
podman build --squash -t quay.io/nepdemo/pdns:4.6.2-alma8 -f pdns.Dockerfile .
podman push quay.io/nepdemo/pdns:4.6.2-alma8
cat > /data/pdns/pdns.Dockerfile << EOF
FROM registry.access.redhat.com/ubi8
RUN dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
RUN dnf -y update
RUN dnf -y install pdns pdns-recursor
ENTRYPOINT ["/usr/sbin/pdns_server"]
CMD ["--socket-dir=/tmp/pdns", "--guardian=no", "--daemon=no", "--disable-syslog", "--log-timestamp=no", "--write-pid=no"]
EOF
podman build --squash -t quay.io/nepdemo/pdns:4.6.2-ubi8 -f pdns.Dockerfile .
podman push quay.io/nepdemo/pdns:4.6.2-ubi8
cat > /data/pdns/conf/pdns.conf << EOF
launch=bind
local-address=0.0.0.0
local-port=53
setgid=pdns
setuid=pdns
bind-config=/etc/pdns/bind.conf
bind-check-interval=300
enable-lua-records=yes
EOF
cat > /data/pdns/conf/bind.conf << EOF
zone "acm-demo-hub.redhat.ren" { type master; file "/etc/pdns/inside-out.xyz"; };
zone "infra.redhat.ren" { type master; file "/etc/pdns/infra.xyz"; };
EOF
cat > /data/pdns/conf/inside-out.xyz << 'EOF'
$TTL 10
@ IN SOA ns1.acm-demo-hub.redhat.ren. postmaster.acm-demo-hub.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN A 192.168.7.13
;ns1 IN A 8.8.8.8
;ns2 IN A 8.8.4.4
helper IN A 192.168.7.11
;
;
; The api points to the IP of your load balancer
api IN A 192.168.7.13
api-int IN A 192.168.7.13
;
; The wildcard also points to the load balancer
*.apps IN A 192.168.7.13
;
; Create entry for the bootstrap host
; bootstrap IN A 192.168.7.12
;
; Create entries for the master hosts
master-0 IN A 192.168.7.13
;master-1 IN A 192.168.7.14
;master-2 IN A 192.168.7.15
;
; Create entries for the worker hosts
;worker-0 IN A 192.168.7.16
;worker-1 IN A 192.168.7.17
;worker-2 IN A 192.168.7.18
;
; The ETCd cluster lives on the masters...so point these to the IP of the masters
;etcd-0 IN A 192.168.7.13
;etcd-1 IN A 192.168.7.14
;etcd-2 IN A 192.168.7.15
;
; Create entries for the other hosts
registry IN A 192.168.7.13
yum IN A 192.168.7.1
nexus IN A 192.168.7.1
git IN A 192.168.7.11
tmp-registry IN A 192.168.7.177
EOF
cat > /data/pdns/conf/infra.xyz << 'EOF'
$TTL 10
@ IN SOA ns1.infra.redhat.ren. postmaster.infra.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN A 192.168.7.13
quay IN LUA A "ifportup(5180, {'158.247.225.4', '192.168.7.14', '192.168.7.15'})"
quaylab IN A 192.168.7.13
EOF
rm -f /tmp/pdns-*
podman run \
--name local-pdns \
--network=host \
-v /data/pdns/conf/:/etc/pdns/:z \
--conmon-pidfile /tmp/pdns-pid \
--cidfile /tmp/pdns-cid \
--cgroups=no-conmon \
--replace \
quay.io/nepdemo/pdns:4.6.2-ubi8
/usr/bin/podman stop --ignore --cidfile /tmp/pdns-cid -t 1
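To verify that the container actually serves the zones, query it with dig from the same host. A minimal sketch, assuming bind-utils is installed and nothing else is bound to port 53 on the host:
# both records should resolve to 192.168.7.13 per the zone files above
dig +short api.acm-demo-hub.redhat.ren @127.0.0.1
dig +short quaylab.infra.redhat.ren @127.0.0.1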
registry
cat << EOF > ${BASE_DIR}/data/sno/registry.images.bu
variant: openshift
version: 4.10.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-registry
storage:
files:
- path: /etc/wzh/redhat.ren.crt
overwrite: true
contents:
source: data:text/plain;charset=utf-8;base64,$( base64 -w 0 < ${VAR_CERT_DIR}/redhat.ren.crt )
mode: 420
user:
name: root
- path: /etc/wzh/redhat.ren.key
overwrite: true
contents:
source: data:text/plain;charset=utf-8;base64,$( base64 -w 0 < ${VAR_CERT_DIR}/redhat.ren.key )
mode: 420
user:
name: root
- path: /etc/wzh/registry-config.yml
overwrite: true
contents:
inline: |
version: 0.1
log:
accesslog:
disabled: true
fields:
service: registry
storage:
cache:
layerinfo: inmemory
filesystem:
rootdirectory: /var/wzh-registry
delete:
enabled: true
maintenance:
readonly:
enabled: false
http:
addr: :8443
tls:
certificate: /etc/wzh/redhat.ren.crt
key: /etc/wzh/redhat.ren.key
mode: 420
user:
name: root
systemd:
units:
- contents: |
[Unit]
Description=Set SELinux chcon for image registry
Before=docker-distribution.service
[Service]
Type=oneshot
RemainAfterExit=yes
User=root
ExecStartPre=-mkdir -p /var/wzh-registry
ExecStart=/usr/bin/chcon -Rt container_file_t /var/wzh-registry
[Install]
WantedBy=multi-user.target
enabled: true
name: hostpath-registry.service
- contents: |
[Unit]
Description=v2 Registry server for Docker
After=network.target hostpath-registry.service
Requires=hostpath-registry.service
Before=kubelet.service
[Service]
Type=simple
TimeoutStartSec=5m
ExecStartPre=-/bin/rm -f %t/%n-pid %t/%n-cid
ExecStart=/usr/bin/podman run \
--name local-registry \
--network=host \
-v /var/wzh-registry/:/var/lib/registry:z \
-v /etc/wzh:/certs:z \
-e REGISTRY_HTTP_ADDR=0.0.0.0:8443 \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/redhat.ren.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/redhat.ren.key \
--conmon-pidfile %t/%n-pid \
--cidfile %t/%n-cid \
--cgroups=no-conmon \
--replace \
docker.io/library/registry:2
ExecStop=-/usr/bin/podman stop --ignore --cidfile %t/%n-cid -t 1
ExecStopPost=-/usr/bin/podman rm --ignore -f --cidfile %t/%n-cid
PIDFile=%t/%n-pid
KillMode=none
Restart=always
RestartSec=30
[Install]
WantedBy=multi-user.target
enabled: true
name: docker-distribution.service
- name: kubelet.service
dropins:
- name: 99-after-registry.conf
contents: |
[Unit]
Requires=docker-distribution.service
After=docker-distribution.service
EOF
butane ${BASE_DIR}/data/sno/registry.images.bu > ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
oc create --save-config -f ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
# oc apply -f ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
# oc delete -f ${BASE_DIR}/data/sno/99-zzz-master-registry.yaml
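After the MachineConfig is created, the master pool reboots and the hostpath registry should come up on port 8443. A minimal verification sketch, assuming 192.168.7.13 is the master node (as in the zone files above) and skipping TLS verification for a quick check:
# watch the master pool pick up the new MachineConfig
oc get mcp master -w
# once the node is back, probe the node-local registry v2 endpoint
curl -k https://192.168.7.13:8443/v2/_catalog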
pdns
cat > ${BASE_DIR}/data/sno/pdns.bu << 'EOF'
variant: openshift
version: 4.10.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-pdns
storage:
files:
- path: /etc/pdns/pdns.conf
overwrite: true
contents:
inline: |
launch=bind
local-address=0.0.0.0
local-port=53
setgid=pdns
setuid=pdns
bind-config=/etc/pdns/bind.conf
bind-check-interval=300
enable-lua-records=yes
mode: 420
user:
name: root
- path: /etc/pdns/bind.conf
overwrite: true
contents:
inline: |
zone "acm-demo-hub.redhat.ren" { type master; file "/etc/pdns/inside-out.xyz"; };
zone "infra.redhat.ren" { type master; file "/etc/pdns/infra.xyz"; };
mode: 420
user:
name: root
- path: /etc/pdns/inside-out.xyz
overwrite: true
contents:
inline: |
$TTL 10
@ IN SOA ns1.acm-demo-hub.redhat.ren. postmaster.acm-demo-hub.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN A 192.168.7.13
;ns1 IN A 8.8.8.8
;ns2 IN A 8.8.4.4
helper IN A 192.168.7.11
;
;
; The api points to the IP of your load balancer
api IN A 192.168.7.13
api-int IN A 192.168.7.13
;
; The wildcard also points to the load balancer
*.apps IN A 192.168.7.13
;
; Create entry for the bootstrap host
; bootstrap IN A 192.168.7.12
;
; Create entries for the master hosts
master-0 IN A 192.168.7.13
;master-1 IN A 192.168.7.14
;master-2 IN A 192.168.7.15
;
; Create entries for the worker hosts
;worker-0 IN A 192.168.7.16
;worker-1 IN A 192.168.7.17
;worker-2 IN A 192.168.7.18
;
; The ETCd cluster lives on the masters...so point these to the IP of the masters
;etcd-0 IN A 192.168.7.13
;etcd-1 IN A 192.168.7.14
;etcd-2 IN A 192.168.7.15
;
; Create entries for the other hosts
registry IN A 192.168.7.13
yum IN A 192.168.7.1
nexus IN A 192.168.7.1
git IN A 192.168.7.11
tmp-registry IN A 192.168.7.177
mode: 420
user:
name: root
- path: /etc/pdns/infra.xyz
overwrite: true
contents:
inline: |
$TTL 10
@ IN SOA ns1.infra.redhat.ren. postmaster.infra.redhat.ren. (
2014080704 ; Serial Number (date YYYYMMDD++)
3H ; refresh (3 hours)
30M ; retry (30 minutes)
2W ; expiry (2 weeks)
1W ) ; minimum (1 week)
;IN NS ns1.ocp4.redhat.ren.
;IN NS ns2.ocp4.redhat.ren.
@ IN A 192.168.7.13
quay IN A 192.168.7.13
quaylab IN A 192.168.7.13
mode: 420
user:
name: root
systemd:
units:
- contents: |
[Unit]
Description=PowerDNS Authoritative Server
After=network.target
Before=kubelet.service
[Service]
Type=simple
TimeoutStartSec=5m
ExecStartPre=-/bin/rm -f %t/%n-pid %t/%n-cid
ExecStart=/usr/bin/podman run \
--name local-pdns \
--network=host \
-v /etc/pdns/:/etc/pdns/:z \
--conmon-pidfile %t/%n-pid \
--cidfile %t/%n-cid \
--cgroups=no-conmon \
--replace \
quay.io/nepdemo/pdns:4.6.2-ubi8
ExecStop=-/usr/bin/podman stop --ignore --cidfile %t/%n-cid -t 1
ExecStopPost=-/usr/bin/podman rm --ignore -f --cidfile %t/%n-cid
PIDFile=%t/%n-pid
KillMode=none
Restart=always
SyslogIdentifier=pdns_server
User=pdns
Group=pdns
RestartSec=1
StartLimitInterval=0
RuntimeDirectory=pdns
[Install]
WantedBy=multi-user.target
name: pdns.service
enabled: true
- name: kubelet.service
dropins:
- name: 99-after-pdns.conf
contents: |
[Unit]
Requires=pdns.service
After=pdns.service
EOF
butane ${BASE_DIR}/data/sno/pdns.bu > ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
oc create --save-config -f ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
# oc apply -f ${BASE_DIR}/data/sno/99-zzz-master-pdns.yaml
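Similarly, once the pdns MachineConfig rolls out, the master node should answer DNS queries on port 53. A quick check sketch, assuming 192.168.7.13 is the master node as in the zone data above:
# confirm the master pool finished updating
oc get mcp master
# query the node-hosted pdns directly
dig +short api.acm-demo-hub.redhat.ren @192.168.7.13
dig +short quaylab.infra.redhat.ren @192.168.7.13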
end
upgrade openshift 4.10 based rhcos to rhel 9.1 / 升级 openshift 4.10 基础操作系统到 rhel 9.1 支持海光x86 cpu
我们项目中,要求openshift支持海光x86 cpu,linux kernel大概是在4.20以后,合并了对海光x86 cpu支持的代码。但是当前版本的openshift(<4.12)都是基于rhel8的,rhel8的内核是基于4.18版本改造而来,还没有海光x86 cpu的支持。
好在redhat已经推出了rhel9, 是基于kernel 5.14的,经过实际测试,rhel9.1是能在海光x86 cpu上正常安装和运行的,那么我们就来试试,把openshift 4.10的底层操作系统rhcos,升级到rhel9.1的内核。
In our project, openshift is required to support the Hygon x86 cpu. Support for the Hygon x86 cpu was merged into the linux kernel around version 4.20. However, the current versions of openshift (<4.12) are all based on rhel8, whose kernel is derived from 4.18 and does not yet support the Hygon x86 cpu.
Fortunately, redhat has released rhel9, which is based on kernel 5.14. Actual testing shows that rhel9.1 installs and runs normally on the Hygon x86 cpu, so let's try upgrading rhcos, the underlying operating system of openshift 4.10, to the rhel 9.1 kernel.
⚠️⚠️⚠️注意,本文所述方法,涉及到了以下问题,不能使用在生产环境中,只能作为 PoC 应急,或者研究学习之用。如果确实是项目急需,请和红帽GPS部门沟(gěi)通(qián),获得支持。
- ⚠️编译需要多个 rhel 相关的特种源,而且还是 eus, tus 版本,这些都需要单独购买
- ⚠️编译需要一个红帽内部的 repo 源,属于红帽机密
- ⚠️自定义的 rhcos 不能得到红帽 CEE 支持
⚠️⚠️⚠️ Note that the method described in this article involves the following issues and cannot be used in a production environment. It can only be used as a PoC emergency or for research and learning. If it is really urgent for the project, please communicate with the Red Hat GPS department for support.
- ⚠️ Compilation requires multiple rhel-related special sources, and they are also eus and tus versions, which need to be purchased separately
- ⚠️ Compilation requires a Red Hat internal repo source, which is Red Hat Confidential
- ⚠️ Custom rhcos cannot be supported by Red Hat CEE
本次实验的架构图如下: The architecture diagram of this experiment is as follows:
过程中,重度使用了 cosa , 这个是 coreos-assembler 工具集中的命令,他封装了一系列的工具,根据一个配置文件项目,来自动化的编译出来 coreos/rhcos 镜像。
In the process, cosa is heavily used, which is a command in the coreos-assembler tool set. It encapsulates a series of tools and automatically compiles the coreos/rhcos image according to a configuration file project.
编译成果 / compiling result
以下是编译成果 / The following is the compiling result
- openshift4.10.41 release image
- quay.io/wangzheng422/ocp:4.10.41-rhel-9.1-v02
- openshift4.10.41 os images
- 百度分享 / baidu sharing: https://pan.baidu.com/s/16_T72CqQeS2rLJ4MzW4dEQ?pwd=zpbg
⚠️⚠️⚠️ 另外,编译成果并没有严格测试,还需要客户根据自己的场景,完善的测试以后,才可以使用。
⚠️⚠️⚠️ In addition, the compilation results have not been strictly tested; customers still need to test them thoroughly in their own scenarios before using them.
视频讲解 / Video explanation
准备 dnf repo 源 / Prepare the dnf repo source
注意,这些 repo 源都是需要特殊单独购买,请联系红帽销售和GPS服务部门。
Note that these repo sources are required to be purchased separately, please contact Red Hat Sales and GPS Services.
rhel 9.1
我们首先要做的,是准备一个rhel9.1的rpm repo,这里有准备步骤。很遗憾,其中有几个openshift专用的repo,是不公开的。如果客户必须要这些repo的访问权限,请联系对口的SA,在公司内部申请试试。
The first thing to do is to prepare an rhel9.1 rpm repo; the preparation steps are below. Unfortunately, a few of the openshift-specific repos are not public. If a customer really needs access to those repos, please contact the responsible SA and try to apply for access inside the company.
# install a rhel on vultr
# disable user/passwd login
# ChallengeResponseAuthentication no
# PasswordAuthentication no
# UsePAM no
# sed -i 's/PasswordAuthentication yes/PasswordAuthentication no/g' /etc/ssh/sshd_config
# sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
cat << EOF > /etc/ssh/sshd_config.d/99-wzh.conf
PasswordAuthentication no
UsePAM no
EOF
systemctl restart sshd
ssh root@v.redhat.ren -o PubkeyAuthentication=no
# root@v.redhat.ren: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
subscription-manager register --auto-attach --username ******** --password ********
# subscription-manager release --list
# subscription-manager release --set=8.4
# subscription-manager config --rhsm.baseurl=https://china.cdn.redhat.com
subscription-manager repos --list > list
subscription-manager repos \
--enable="rhel-9-for-x86_64-baseos-rpms" \
--enable="rhel-9-for-x86_64-appstream-rpms" \
--enable="codeready-builder-for-rhel-9-x86_64-rpms" \
#
dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
dnf install -y htop createrepo_c
dnf install -y https://download-ib01.fedoraproject.org/pub/epel/8/Everything/x86_64/Packages/b/byobu-5.133-1.el8.noarch.rpm
# byobu
dnf update -y
reboot
mkdir -p /data/dnf
# Create new empty partitions, and filesystem
parted -s /dev/vdb mklabel gpt
parted -s /dev/vdb unit mib mkpart primary 0% 100%
mkfs.ext4 /dev/vdb1
cat << EOF >> /etc/fstab
/dev/vdb1 /data/dnf ext4 defaults,noatime,nofail 0 0
EOF
mount /dev/vdb1 /data/dnf
mkdir -p /data/dnf/dnf-ocp
cd /data/dnf/dnf-ocp
# subscription-manager release --set=9.0
# dnf reposync --repoid rhel-9-for-x86_64-baseos-eus-rpms -m --download-metadata --delete -n
# dnf reposync --repoid=rhel-9-for-x86_64-appstream-eus-rpms -m --download-metadata --delete -n
dnf reposync --repoid rhel-9-for-x86_64-baseos-rpms -m --download-metadata --delete -n
dnf reposync --repoid=rhel-9-for-x86_64-appstream-rpms -m --download-metadata --delete -n
dnf reposync --repoid=rhel-9-for-x86_64-nfv-rpms -m --download-metadata --delete -n
# dnf reposync --repoid=advanced-virt-for-rhel-8-x86_64-eus-rpms -m --download-metadata --delete -n
dnf reposync --repoid=fast-datapath-for-rhel-9-x86_64-rpms -m --download-metadata --delete -n
subscription-manager release --set=9
# fix for coreos-installer version
mkdir -p /data/dnf/dnf-ocp/fixes
cd /data/dnf/dnf-ocp/fixes
# dnf download --resolve --alldeps coreos-installer coreos-installer-bootinfra
dnf download --resolve coreos-installer coreos-installer-bootinfra selinux-policy
createrepo ./
# username and password are confidential
cat << 'EOF' > /etc/yum.repos.d/ose.repo
[rhel-8-server-ose]
name=rhel-8-server-ose
enabled=1
gpgcheck=0
baseurl=https://mirror.openshift.com/enterprise/reposync/4.10/rhel-8-server-ose-rpms/
module_hotfixes=true
username=??????
password=??????
[rhel-9-server-ose]
name=rhel-9-server-ose
enabled=1
gpgcheck=0
baseurl=https://mirror.openshift.com/enterprise/reposync/4.13/rhel-9-server-ose-rpms/
module_hotfixes=true
username=??????
password=??????
[rhel-9-server-ironic]
name=rhel-9-server-ironic
enabled=1
gpgcheck=0
baseurl=https://mirror.openshift.com/enterprise/reposync/4.13/rhel-9-server-ironic-rpms/
module_hotfixes=true
username=??????
password=??????
EOF
dnf reposync --repoid=rhel-8-server-ose -m --download-metadata --delete -n
dnf reposync --repoid=rhel-9-server-ose -m --download-metadata --delete -n
dnf reposync --repoid=rhel-9-server-ironic -m --download-metadata --delete -n
systemctl disable --now firewalld
# host the repo with web service
cd /data/dnf/dnf-ocp
python3 -m http.server 5180
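Before pointing cosa at this repo server, it is worth confirming the metadata is reachable over HTTP from the build host. A minimal sketch, assuming the VPS IP is the one later substituted into rhel-9.0.repo (45.77.125.88 in one of the runs) and that reposync created directories named after the repo ids:
# a 200 response means the repo metadata is being served
curl -sI http://45.77.125.88:5180/rhel-9-for-x86_64-baseos-rpms/repodata/repomd.xml | head -n 1
curl -sI http://45.77.125.88:5180/fixes/repodata/repomd.xml | head -n 1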
准备 build 服务器 / Prepare the build server
注意,build 服务器需要支持 kvm ,如果选用的云平台,需要云平台支持嵌套虚拟化。
本次实验,我们选用了一台 centos stream 8 的云主机。
Note that the build server needs to support kvm. If you choose a cloud platform, the cloud platform needs to support nested virtualization.
In this experiment, we chose a centos stream 8 cloud host.
# install a centos stream 8 on digitalocean,
# 2c 2G for ostree only
# 4c 8G for iso because it needs metal first
dnf install -y epel-release
dnf install -y byobu htop
dnf update -y
reboot
dnf groupinstall -y server
dnf install -y lftp podman
dnf -y install qemu-kvm libvirt libguestfs-tools virt-install virt-viewer virt-manager tigervnc-server
systemctl disable --now firewalld
systemctl enable --now libvirtd
开始编译 rhcos / Start compiling rhcos
cosa 的输入是一个配置文件项目,上游是 https://github.com/openshift/os , 我们做了下游扩展,加入了各种repo源,并且把操作系统名字,加入了 wzh 的标记。
The input of cosa is a config-file project; the upstream is https://github.com/openshift/os. We made downstream extensions: we added the various rpm repo sources and added a wzh mark to the operating system name.
# machine-os-images just copy a iso into container
# machine-os-content is our target
# follow coreos-assembler instruction
# https://github.com/coreos/coreos-assembler/blob/main/docs/building-fcos.md
# https://coreos.github.io/coreos-assembler/
# https://github.com/openshift/os/blob/master/docs/development-rhcos.md
# https://github.com/openshift/os/blob/master/docs/development.md
# https://github.com/openshift/release/blob/master/core-services/release-controller/README.md#rpm-mirrors
podman login ************* quay.io
# export COREOS_ASSEMBLER_CONTAINER=quay.io/coreos-assembler/coreos-assembler:rhcos-4.12
export COREOS_ASSEMBLER_CONTAINER=quay.io/coreos-assembler/coreos-assembler:latest
podman pull $COREOS_ASSEMBLER_CONTAINER
cosa() {
env | grep COREOS_ASSEMBLER
local -r COREOS_ASSEMBLER_CONTAINER_LATEST="quay.io/coreos-assembler/coreos-assembler:latest"
if [[ -z ${COREOS_ASSEMBLER_CONTAINER} ]] && $(podman image exists ${COREOS_ASSEMBLER_CONTAINER_LATEST}); then
local -r cosa_build_date_str="$(podman inspect -f "{{.Created}}" ${COREOS_ASSEMBLER_CONTAINER_LATEST} | awk '{print $1}')"
local -r cosa_build_date="$(date -d ${cosa_build_date_str} +%s)"
if [[ $(date +%s) -ge $((cosa_build_date + 60*60*24*7)) ]] ; then
echo -e "\e[0;33m----" >&2
echo "The COSA container image is more that a week old and likely outdated." >&2
echo "You should pull the latest version with:" >&2
echo "podman pull ${COREOS_ASSEMBLER_CONTAINER_LATEST}" >&2
echo -e "----\e[0m" >&2
sleep 10
fi
fi
set -x
podman run --rm -ti --security-opt label=disable --privileged \
--uidmap=1000:0:1 --uidmap=0:1:1000 --uidmap 1001:1001:64536 \
-v ${PWD}:/srv/ --device /dev/kvm --device /dev/fuse \
-v /run/user/0/containers/auth.json:/home/builder/.docker/config.json \
--tmpfs /tmp -v /var/tmp:/var/tmp --name cosa \
${COREOS_ASSEMBLER_CONFIG_GIT:+-v $COREOS_ASSEMBLER_CONFIG_GIT:/srv/src/config/:ro} \
${COREOS_ASSEMBLER_GIT:+-v $COREOS_ASSEMBLER_GIT/src/:/usr/lib/coreos-assembler/:ro} \
${COREOS_ASSEMBLER_CONTAINER_RUNTIME_ARGS} \
${COREOS_ASSEMBLER_CONTAINER:-$COREOS_ASSEMBLER_CONTAINER_LATEST} "$@"
rc=$?; set +x; return $rc
}
rm -rf /data/rhcos
mkdir -p /data/rhcos
cd /data/rhcos
# cosa init --branch wzh-ocp-4.8-rhel-9.1 https://github.com/wangzheng422/machine-os-content
cosa init \
--branch wzh-ocp-4.10-based-on-4.13-rhel-9 \
--variant rhel-coreos-9 \
https://github.com/wangzheng422/machine-os-content
sed -i 's/REPO_IP/45.77.125.88:5180/g' /data/rhcos/src/config/rhel-9.0.repo
cosa fetch
# cosa build ostree
# ......
# Ignored user missing from new passwd file: root
# New passwd entries: clevis, dnsmasq, gluster, systemd-coredump, systemd-journal-remote, unbound
# Ignored group missing from new group file: root
# New group entries: clevis, dnsmasq, gluster, input, kvm, printadmin, render, systemd-coredump, systemd-journal-remote, unbound
# Committing... done
# Metadata Total: 9777
# Metadata Written: 3156
# Content Total: 6635
# Content Written: 1456
# Content Cache Hits: 19307
# Content Bytes Written: 149555523
# 3156 metadata, 22414 content objects imported; 2.0 GB content written
# Wrote commit: 9c9831a17f276a55d263c7856aa61af722ec84d9780405018ac46b3c2c7aa5d6
# New image input checksum: 9062762601fde9b726033297ef1c442589066328334c88268d3952dcf1014826
# None
# New build ID: 48.90.202211260320-wzh-0
# Running: rpm-ostree compose container-encapsulate --max-layers=50 --format-version=1 --repo=/srv/tmp/repo --label=coreos-assembler.image-config-checksum=e748dfefac80583a123d35bfdfe87fcce2c2757f15d8251e8482d1aeb7e4b7a0 --label=coreos-assembler.image-input-checksum=9062762601fde9b726033297ef1c442589066328334c88268d3952dcf1014826 --label=org.opencontainers.image.source=https://github.com/wangzheng422/machine-os-content --label=org.opencontainers.image.revision=331baaa292509c237e8647b598a9768aefbb984d 48.90.202211260320-wzh-0 oci-archive:rhcos-48.90.202211260320-wzh-0-ostree.x86_64.ociarchive.tmp:latest
# Reading packages... done
# Building package mapping... done
# 22414 objects in 511 packages (332 source)
# rpm size: 1978859148
# Earliest changed package: nss-altfiles-2.18.1-20.el9.x86_64 at 2021-08-02 15:39:20 UTC
# 1488 duplicates
# Multiple owners:
# /usr/lib/.build-id/93/1521a98c6e8ca8485e3508ac3ee12e7a0bb233
# /usr/lib/.build-id/fb/c60f5edbc2853811a813d9fb404cdaddfaf70a
# /usr/share/licenses/systemd/LICENSE.LGPL2.1
# Generating container image... done
# Pushed digest: sha256:95ea1eeff653f2ec7ee9a3826978cbe5cadad2e9894d76edffb6a425892fdbab
# Total objects: 25866
# No unreachable objects
# Ignoring non-directory /srv/builds/.build-commit
# + rc=0
# + set +x
# or build with default setting, ostree and qcow2
cosa build
# ......
# + cosa meta --workdir /srv --build 48.90.202211270909-wzh-0 --artifact qemu --artifact-json /srv/tmp/build.qemu/meta.json.new
# /srv/builds/48.90.202211270909-wzh-0/x86_64/meta.json wrote with version stamp 1669540779194835967
# + /usr/lib/coreos-assembler/finalize-artifact rhcos-48.90.202211270909-wzh-0-qemu.x86_64.qcow2 /srv/builds/48.90.202211270909-wzh-0/x86_64/rhcos-48.90.202211270909-wzh-0-qemu.x86_64.qcow2
# + set +x
# Successfully generated: rhcos-48.90.202211270909-wzh-0-qemu.x86_64.qcow2
cosa list
# 48.90.202211270909-wzh-0
# Timestamp: 2022-11-27T09:14:21Z (0:05:40 ago)
# Artifacts: ostree qemu
# Config: wzh-ocp-4.8-based-on-4.13-rhel-9.0 (64094f653298) (dirty)
cosa upload-oscontainer --name "quay.io/wangzheng422/ocp"
# ......
# 2022-11-27 09:22:35,785 INFO - Running command: ['ostree', '--repo=/srv/tmp/containers-storage/vfs/dir/da857426a657461466a3d17f4faa848f71a9a311b2fec5165946adabf5ea3900/srv/repo', 'pull-local', '--disable-fsync', '/srv/tmp/repo', '3c009c9794dc1deea6b419e84e56d17247954d236777842de59abef6ef82658f']
# Writing objects: 55
# 2022-11-27 09:22:41,424 INFO - Running command: ['tar', '-xf', '/srv/builds/48.90.202211270909-wzh-0/x86_64/rhcos-48.90.202211270909-wzh-0-extensions.x86_64.tar']
# 2022-11-27 09:22:41,665 INFO - Running command: ['buildah', '--root=./tmp/containers-storage', '--storage-driver', 'vfs', 'config', '--entrypoint', '["/noentry"]', '-l', 'com.coreos.ostree-commit=3c009c9794dc1deea6b419e84e56d17247954d236777842de59abef6ef82658f', '-l', 'version=48.90.202211270909-wzh-0', '-l', 'com.coreos.rpm.cri-o=1.25.0-53.rhaos4.12.git2002c49.el9.x86_64', '-l', 'com.coreos.rpm.ignition=2.13.0-1.el9.x86_64', '-l', 'com.coreos.rpm.kernel=5.14.0-70.30.1.el9_0.x86_64', '-l', 'com.coreos.rpm.ostree=2022.5-1.el9_0.x86_64', '-l', 'com.coreos.rpm.rpm-ostree=2022.2-2.el9.x86_64', '-l', 'com.coreos.rpm.runc=4:1.1.3-2.el9_0.x86_64', '-l', 'com.coreos.rpm.systemd=250-6.el9_0.1.x86_64', '-l', 'com.coreos.coreos-assembler-commit=538402ec655961f7a79e9745c9a3af67e1123e39', '-l', 'com.coreos.redhat-coreos-commit=64094f6532982cd2118224785b88ba2890659aee', '-l', 'com.coreos.os-extensions=kerberos;kernel-devel;kernel-rt;usbguard;sandboxed-containers', '-l', 'com.coreos.rpm.kernel=5.14.0-70.30.1.el9_0.x86_64', '-l', 'com.coreos.rpm.kernel-rt-core=5.14.0-70.30.1.rt21.102.el9_0.x86_64', '-l', 'io.openshift.build.version-display-names=machine-os=Red Hat Enterprise Linux CoreOS', '-l', 'io.openshift.build.versions=machine-os=48.90.202211270909-wzh-0', 'ubi-working-container']
# WARN[0000] cmd "/bin/bash" exists and will be passed to entrypoint as a parameter
# Committing container...
# Getting image source signatures
# Copying blob 33204bfe17ee skipped: already exists
# Copying blob 06081b81a130 done
# Copying config 031de9981c done
# Writing manifest to image destination
# Storing signatures
# quay.io/wangzheng422/ocp:48.90.202211270909-wzh-0 031de9981c87301aeaffa5c7a0166067dad7a5c7f86166e999694953b89ef264
# Pushing container
# 2022-11-27 09:23:24,398 INFO - Running command: ['buildah', '--root=./tmp/containers-storage', '--storage-driver', 'vfs', 'push', '--tls-verify', '--authfile=/home/builder/.docker/config.json', '--digestfile=tmp/oscontainer-digest', '--format=v2s2', 'quay.io/wangzheng422/ocp:48.90.202211270909-wzh-0']
# Getting image source signatures
# Copying blob 06081b81a130 done
# Copying blob 33204bfe17ee done
# Copying config 031de9981c done
# Writing manifest to image destination
# Storing signatures
cosa buildextend-metal
# ......
# + cosa meta --workdir /srv --build 48.90.202211270909-wzh-0 --artifact metal --artifact-json /srv/tmp/build.metal/meta.json.new
# /srv/builds/48.90.202211270909-wzh-0/x86_64/meta.json wrote with version stamp 1669541240634979743
# + /usr/lib/coreos-assembler/finalize-artifact rhcos-48.90.202211270909-wzh-0-metal.x86_64.raw /srv/builds/48.90.202211270909-wzh-0/x86_64/rhcos-48.90.202211270909-wzh-0-metal.x86_64.raw
# + set +x
# Successfully generated: rhcos-48.90.202211270909-wzh-0-metal.x86_64.raw
cosa buildextend-metal4k
# ......
# + cosa meta --workdir /srv --build 48.90.202211270909-wzh-0 --artifact metal4k --artifact-json /srv/tmp/build.metal4k/meta.json.new
# /srv/builds/48.90.202211270909-wzh-0/x86_64/meta.json wrote with version stamp 1669541380398141511
# + /usr/lib/coreos-assembler/finalize-artifact rhcos-48.90.202211270909-wzh-0-metal4k.x86_64.raw /srv/builds/48.90.202211270909-wzh-0/x86_64/rhcos-48.90.202211270909-wzh-0-metal4k.x86_64.raw
# + set +x
# Successfully generated: rhcos-48.90.202211270909-wzh-0-metal4k.x86_64.raw
cosa buildextend-live
# ......
# 2022-11-27 09:38:49,575 INFO - Running command: ['/usr/bin/isohybrid', '--uefi', '/srv/tmp/buildpost-live/rhcos-48.90.202211270909-wzh-0-live.x86_64.iso.minimal']
# 2022-11-27 09:38:49,661 INFO - Running command: ['/usr/lib/coreos-assembler/runvm-coreos-installer', 'builds/48.90.202211270909-wzh-0/x86_64/rhcos-48.90.202211270909-wzh-0-metal.x86_64.raw', '', 'pack', 'minimal-iso', '/srv/tmp/buildpost-live/rhcos-48.90.202211270909-wzh-0-live.x86_64.iso', '/srv/tmp/buildpost-live/rhcos-48.90.202211270909-wzh-0-live.x86_64.iso.minimal', '--consume']
# + RUST_BACKTRACE=full
# + chroot /sysroot/ostree/deploy/rhcos/deploy/3c009c9794dc1deea6b419e84e56d17247954d236777842de59abef6ef82658f.0 env -C /srv coreos-installer pack minimal-iso /srv/tmp/buildpost-live/rhcos-48.90.202211270909-wzh-0-live.x86_64.iso /srv/tmp/buildpost-live/rhcos-48.90.202211270909-wzh-0-live.x86_64.iso.minimal --consume
# Packing minimal ISO
# Matched 16 files of 16
# Total bytes skipped: 89430463
# Total bytes written: 747073
# Total bytes written (compressed): 2788
# Verifying that packed image matches digest
# Packing successful!
# + '[' -f /var/tmp/coreos-installer-output ']'
# Updated: builds/48.90.202211270909-wzh-0/x86_64/meta.json
# run them all
cat << 'EOF' > /root/build.sh
# exit when any command fails
set -e
set -x
rm -rf /data/rhcos
mkdir -p /data/rhcos
cd /data/rhcos
export COREOS_ASSEMBLER_CONTAINER=quay.io/coreos-assembler/coreos-assembler:latest
podman pull $COREOS_ASSEMBLER_CONTAINER
cosa() {
env | grep COREOS_ASSEMBLER
local -r COREOS_ASSEMBLER_CONTAINER_LATEST="quay.io/coreos-assembler/coreos-assembler:latest"
if [[ -z ${COREOS_ASSEMBLER_CONTAINER} ]] && $(podman image exists ${COREOS_ASSEMBLER_CONTAINER_LATEST}); then
local -r cosa_build_date_str="$(podman inspect -f "{{.Created}}" ${COREOS_ASSEMBLER_CONTAINER_LATEST} | awk '{print $1}')"
local -r cosa_build_date="$(date -d ${cosa_build_date_str} +%s)"
if [[ $(date +%s) -ge $((cosa_build_date + 60*60*24*7)) ]] ; then
echo -e "\e[0;33m----" >&2
echo "The COSA container image is more that a week old and likely outdated." >&2
echo "You should pull the latest version with:" >&2
echo "podman pull ${COREOS_ASSEMBLER_CONTAINER_LATEST}" >&2
echo -e "----\e[0m" >&2
sleep 10
fi
fi
set -x
podman run --rm -ti --security-opt label=disable --privileged \
--uidmap=1000:0:1 --uidmap=0:1:1000 --uidmap 1001:1001:64536 \
-v ${PWD}:/srv/ --device /dev/kvm --device /dev/fuse \
-v /run/user/0/containers/auth.json:/home/builder/.docker/config.json \
--tmpfs /tmp -v /var/tmp:/var/tmp --name cosa \
${COREOS_ASSEMBLER_CONFIG_GIT:+-v $COREOS_ASSEMBLER_CONFIG_GIT:/srv/src/config/:ro} \
${COREOS_ASSEMBLER_GIT:+-v $COREOS_ASSEMBLER_GIT/src/:/usr/lib/coreos-assembler/:ro} \
${COREOS_ASSEMBLER_CONTAINER_RUNTIME_ARGS} \
${COREOS_ASSEMBLER_CONTAINER:-$COREOS_ASSEMBLER_CONTAINER_LATEST} "$@"
rc=$?; set +x; return $rc
}
cosa init \
--branch wzh-ocp-4.10-based-on-4.13-rhel-9 \
--variant rhel-coreos-9 \
https://github.com/wangzheng422/machine-os-content
sed -i 's/REPO_IP/45.76.173.230:5180/g' /data/rhcos/src/config/rhel-9.0.repo
cosa fetch
cosa build
cosa upload-oscontainer --name "quay.io/wangzheng422/ocp"
cosa buildextend-metal
cosa buildextend-metal4k
cosa buildextend-live
EOF
cd /root
bash /root/build.sh
# podman pull quay.io/wangzheng422/ocp:410.91.202211291516-wzh-0
# podman pull quay.io/wangzheng422/ocp@sha256:c7209dcadf2d27892eab9c692e8afb6a752307270526231961500647591d7129
ls -l /data/rhcos/builds/latest/x86_64/
# total 10333424
# -r--r--r--. 1 root root 66639 Nov 29 15:24 commitmeta.json
# -r--r--r--. 1 root root 473 Nov 29 15:16 coreos-assembler-config-git.json
# -r--r--r--. 1 root root 346037 Nov 29 15:16 coreos-assembler-config.tar.gz
# -rw-r--r--. 1 root root 14107 Nov 29 15:16 manifest.json
# -r--r--r--. 1 root root 33628 Nov 29 15:21 manifest-lock.generated.x86_64.json
# -rw-r--r--. 1 root root 6965 Nov 29 15:43 meta.json
# -r--r--r--. 1 root root 34844 Nov 29 15:21 ostree-commit-object
# -rw-r--r--. 1 root root 347832320 Nov 29 15:28 rhcos-410.91.202211291516-wzh-0-extensions.x86_64.tar
# -rw-r--r--. 1 root root 80525940 Nov 29 15:42 rhcos-410.91.202211291516-wzh-0-live-initramfs.x86_64.img
# -rw-r--r--. 1 root root 11649784 Nov 29 15:43 rhcos-410.91.202211291516-wzh-0-live-kernel-x86_64
# -rw-r--r--. 1 root root 930239488 Nov 29 15:42 rhcos-410.91.202211291516-wzh-0-live-rootfs.x86_64.img
# -rw-r--r--. 1 root root 1028653056 Nov 29 15:43 rhcos-410.91.202211291516-wzh-0-live.x86_64.iso
# -r--r--r--. 1 root root 3596615680 Nov 29 15:34 rhcos-410.91.202211291516-wzh-0-metal4k.x86_64.raw
# -r--r--r--. 1 root root 3596615680 Nov 29 15:32 rhcos-410.91.202211291516-wzh-0-metal.x86_64.raw
# -r--r--r--. 1 root root 965853184 Nov 29 15:24 rhcos-410.91.202211291516-wzh-0-ostree.x86_64.ociarchive
# -r--r--r--. 1 root root 2383609856 Nov 29 15:26 rhcos-410.91.202211291516-wzh-0-qemu.x86_64.qcow2
# ocp 4.8 is too buggy, we switch to ocp 4.10
# https://bugzilla.redhat.com/show_bug.cgi?id=2044808
# Create a new release based on openshift 4.10.41 and override a single image
export BUILDNUMBER=4.10.41
export VAR_RELEASE_VER=$BUILDNUMBER-rhel-9.1-v02
oc adm release new -a /data/pull-secret.json \
--from-release ` curl -s https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/$BUILDNUMBER/release.txt | grep "Pull From:" | awk '{print $3}' ` \
machine-os-content=quay.io/wangzheng422/ocp@sha256:c7209dcadf2d27892eab9c692e8afb6a752307270526231961500647591d7129 \
--to-image docker.io/wangzheng422/ocp:$VAR_RELEASE_VER
# docker.io/wangzheng422/ocp:4.10.41-rhel-9.1-v02
oc image mirror docker.io/wangzheng422/ocp:$VAR_RELEASE_VER quay.io/wangzheng422/ocp:$VAR_RELEASE_VER
# podman pull quay.io/wangzheng422/ocp:4.10.41-rhel-9.1-v02
# podman pull quay.io/wangzheng422/ocp@sha256:73394d5833b12a848fed80154953fe97962362cc153b239e513afade7f87fb3c
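To confirm the machine-os-content override really landed in the new release payload, the release image can be inspected before using it. A sketch, assuming the same pull secret as above:
# print release metadata, then show which machine-os-content it references
oc adm release info -a /data/pull-secret.json quay.io/wangzheng422/ocp:$VAR_RELEASE_VER | head -n 20
oc adm release info -a /data/pull-secret.json --image-for=machine-os-content quay.io/wangzheng422/ocp:$VAR_RELEASE_VER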
try to install using UPI
我们已经准备好了镜像,那就试试装一个集群出来看看什么样子的。
We have prepared the image, so let's try to install a cluster to see what it looks like.
on vps, download image and binary for 4.10.41
第一步,还是在公网上,下载一些安装用的文件,这一步不是必须的。我们主要用里面的ansible工具,配置我们环境的dns。
The first step is to download some installation files from the public network. This step is not necessary. We mainly use the ansible tool inside to configure the dns of our environment.
# download oc client and installer binaries on the vps
# on vultr
rm -rf /data/ocp4/
mkdir -p /data/ocp4/
cd /data/ocp4
export BUILDNUMBER=4.11.18
wget -O openshift-client-linux-${BUILDNUMBER}.tar.gz https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/${BUILDNUMBER}/openshift-client-linux-${BUILDNUMBER}.tar.gz
wget -O openshift-install-linux-${BUILDNUMBER}.tar.gz https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/${BUILDNUMBER}/openshift-install-linux-${BUILDNUMBER}.tar.gz
tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
wget -O opm-linux.tar.gz https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/opm/4.6.1/opm-linux-4.6.1.tar.gz
tar -xzf opm-linux.tar.gz -C /usr/local/bin/
wget https://github.com/operator-framework/operator-registry/releases/download/v1.26.2/linux-amd64-opm
chmod +x linux-amd64-opm
install linux-amd64-opm /usr/local/bin/opm
rm -rf /data/ocp4/
mkdir -p /data/ocp4/tmp
cd /data/ocp4/tmp
git clone https://github.com/wangzheng422/openshift4-shell
cd openshift4-shell
git checkout ocp-4.8
/bin/cp -f prepare.content.with.oc.mirror.sh /data/ocp4/
rm -rf /data/ocp4/tmp
cd /data/ocp4
# bash prepare.content.with.oc.mirror.sh -v 4.11.5,${BUILDNUMBER}, -m ${BUILDNUMBER%.*} -b ocp-4.11
bash prepare.content.with.oc.mirror.sh -v ${BUILDNUMBER}, -m ${BUILDNUMBER%.*} -b ocp-4.8
import ocp content into quay
第二步,根据我们自定义的release image,同步安装镜像,到我们内部的镜像仓库,并且抽取安装二进制文件。
In the second step, we mirror our custom release image into the internal image registry and extract the installation binaries from it.
export BUILDNUMBER=4.11.18
pushd /data/ocp4/${BUILDNUMBER}
tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
# tar -xzf oc-mirror.tar.gz -C /usr/local/bin/
# chmod +x /usr/local/bin/oc-mirror
install -m 755 /data/ocp4/clients/butane-amd64 /usr/local/bin/butane
# install -m 755 /data/ocp4/clients/coreos-installer_amd64 /usr/local/bin/coreos-installer
popd
SEC_FILE="$XDG_RUNTIME_DIR/containers/auth.json"
# $XDG_RUNTIME_DIR/containers
mkdir -p ${SEC_FILE%/*}
# OR
# SEC_FILE="$HOME/.docker/config.json"
SEC_FILE="$HOME/.config/containers/auth.json"
mkdir -p ${SEC_FILE%/*}
# copy the password file
podman login quaylab.infra.wzhlab.top:5443 --username admin --password redhatadmin
export VAR_RELEASE_VER=4.10.41-rhel-9.1-v02
oc adm release mirror -a $SEC_FILE \
--from=quay.io/wangzheng422/ocp:$VAR_RELEASE_VER \
--to=quaylab.infra.wzhlab.top:5443/ocp4/openshift4
# ......
# Success
# Update image: quaylab.infra.wzhlab.top:5443/ocp4/openshift4:4.10.41-x86_64
# Mirror prefix: quaylab.infra.wzhlab.top:5443/ocp4/openshift4
# To use the new mirrored repository to install, add the following section to the install-config.yaml:
# imageContentSources:
# - mirrors:
# - quaylab.infra.wzhlab.top:5443/ocp4/openshift4
# source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
# - mirrors:
# - quaylab.infra.wzhlab.top:5443/ocp4/openshift4
# source: quay.io/wangzheng422/ocp
# To use the new mirrored repository for upgrades, use the following to create an ImageContentSourcePolicy:
# apiVersion: operator.openshift.io/v1alpha1
# kind: ImageContentSourcePolicy
# metadata:
# name: example
# spec:
# repositoryDigestMirrors:
# - mirrors:
# - quaylab.infra.wzhlab.top:5443/ocp4/openshift4
# source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
# - mirrors:
# - quaylab.infra.wzhlab.top:5443/ocp4/openshift4
# source: quay.io/wangzheng422/ocp
# !!!! Note: the steps below must be executed, because the version information is embedded inside the binaries in the release image !!!
mkdir -p /data/work/ext-client
cd /data/work/ext-client
RELEASE_IMAGE=quay.io/wangzheng422/ocp:$VAR_RELEASE_VER
LOCAL_SECRET_JSON=/data/pull-secret.json
oc adm release extract --registry-config ${LOCAL_SECRET_JSON} --command='openshift-baremetal-install' ${RELEASE_IMAGE}
oc adm release extract --registry-config ${LOCAL_SECRET_JSON} --command='openshift-install' ${RELEASE_IMAGE}
oc adm release extract --registry-config ${LOCAL_SECRET_JSON} --command='oc' ${RELEASE_IMAGE}
# oc adm release extract --registry-config ${LOCAL_SECRET_JSON} --tools=true ${RELEASE_IMAGE}
./openshift-install version
# ./openshift-install 4.10.41
# built from commit 14145f0cbc879ca19cfcb583c86bd01595afb9d5
# release image quay.io/wangzheng422/ocp@sha256:1c6a539ac44c65e2d1005a270e5d05442deaa9b3a0101edab695010a90f09aed
# release architecture amd64
install -m 755 /data/work/ext-client/openshift-install /usr/local/bin/openshift-install
install -m 755 /data/work/ext-client/oc /usr/local/bin/oc
# install -m 755 /data/ocp4/clients/butane-amd64 /usr/local/bin/butane
mirror for disconnected
我们把operator用到的镜像,都mirror到内部镜像仓库试试。
Let's try mirroring all of the images used by the operators into the internal image registry.
# we use oc-mirror from ocp 4.11
cat > /data/ocp4/mirror.yaml << EOF
apiVersion: mirror.openshift.io/v1alpha2
kind: ImageSetConfiguration
# archiveSize: 4
mirror:
platform:
architectures:
- amd64
# - arm64
channels:
# - name: stable-4.11
# type: ocp
# minVersion: 4.11.18
# maxVersion: 4.11.18
# shortestPath: true
# - name: stable-4.10
# type: ocp
# minVersion: 4.10.45
# maxVersion: 4.10.45
# shortestPath: true
graph: false
additionalImages:
- name: registry.redhat.io/redhat/redhat-operator-index:v4.10
- name: registry.redhat.io/redhat/certified-operator-index:v4.10
- name: registry.redhat.io/redhat/community-operator-index:v4.10
- name: registry.redhat.io/redhat/redhat-marketplace-index:v4.10
- name: quay.io/wangzheng422/local-storage-operator:wzh-ocp-4.10-v01
- name: quay.io/wangzheng422/local-storage-bundle:wzh-ocp-4.10-v01
- name: quay.io/wangzheng422/local-diskmaker:wzh-ocp-4.10-v01
- name: quay.io/wangzheng422/local-must-gather:wzh-ocp-4.10-v01
- name: quay.io/openshift/origin-kube-rbac-proxy:latest
- name: quay.io/wangzheng422/debug-pod:alma-9.1
operators:
- catalog: registry.redhat.io/redhat/redhat-operator-index:v4.10
packages:
- name: cluster-logging
channels:
- name: stable
minVersion: 5.5.5
- name: elasticsearch-operator
channels:
- name: stable
minVersion: 5.5.5
- name: jaeger-product
channels:
- name: stable
minVersion: 1.39.0-3
- name: kubernetes-nmstate-operator
channels:
- name: stable
minVersion: 4.10.0-202212061900
- name: odf-operator
channels:
- name: stable-4.10
minVersion: 4.10.9
- name: sriov-network-operator
channels:
- name: stable
minVersion: 4.10.0-202212061900
- name: kubevirt-hyperconverged
channels:
- name: stable
minVersion: 4.10.7
- catalog: quay.io/wangzheng422/local-storage-index:wzh-ocp-4.10-v01
packages:
- name: local-storage-operator
channels:
- name: preview
EOF
mkdir -p /data/install/mirror-tmp
cd /data/install/mirror-tmp
oc-mirror --config /data/ocp4/mirror.yaml docker://quaylab.infra.wzhlab.top:5443
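When mirroring straight to a registry, oc-mirror normally writes ImageContentSourcePolicy and CatalogSource manifests into an oc-mirror-workspace/results-* directory under the working directory; applying them is how the cluster learns about the mirrored catalogs. A sketch, assuming that layout (the results directory name is timestamped):
ls oc-mirror-workspace/results-*/
# oc apply -f oc-mirror-workspace/results-*/imageContentSourcePolicy.yaml
# oc apply -f oc-mirror-workspace/results-*/catalogSource-*.yaml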
mirror to files and import back
之前我们都是直接mirror到内部镜像仓库,但是实际项目环境,是根本不会联网的,所以我们需要先镜像到本地目录/文件,然后从目录/文件导入到内部镜像仓库。这里就按照这个流程做一遍。
Until now we have mirrored directly into the internal image registry, but a real project environment has no Internet connectivity at all, so we need to mirror to a local directory/file first and then import from that directory/file into the internal registry. Here we walk through that workflow once.
mkdir -p /data/ocp4/
mkdir -p /data/ocp-install/images/
mkdir -p /data/ocp-install/clients/
cd /data/ocp4/
export BUILDNUMBER=4.10.41
wget -O openshift-client-linux-${BUILDNUMBER}.tar.gz https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/${BUILDNUMBER}/openshift-client-linux-${BUILDNUMBER}.tar.gz
wget -O openshift-install-linux-${BUILDNUMBER}.tar.gz https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/${BUILDNUMBER}/openshift-install-linux-${BUILDNUMBER}.tar.gz
tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
export BUILDNUMBER=4.11.18
wget -O oc-mirror.tar.gz https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/${BUILDNUMBER}/oc-mirror.tar.gz
tar -xzf oc-mirror.tar.gz -C /usr/local/bin/
chmod +x /usr/local/bin/oc-mirror
# SEC_FILE="$XDG_RUNTIME_DIR/containers/auth.json"
# # $XDG_RUNTIME_DIR/containers
# mkdir -p ${SEC_FILE%/*}
# OR
SEC_FILE="$HOME/.config/containers/auth.json"
mkdir -p ${SEC_FILE%/*}
# copy the password file
# podman login quaylab.infra.redhat.ren:8443 --username admin --password redhatadmin
export VAR_RELEASE_VER=4.10.41-rhel-9.1-v02
oc adm release mirror -a $SEC_FILE \
--from=quay.io/wangzheng422/ocp:$VAR_RELEASE_VER \
--to-dir=/data/ocp-install/images/
# ......
# Success
# Update image: openshift/release:4.10.41-x86_64
# To upload local images to a registry, run:
# oc image mirror --from-dir=/data/ocp-install/images/ 'file://openshift/release:4.10.41-x86_64*' REGISTRY/REPOSITORY
cd /data/ocp-install/clients/
RELEASE_IMAGE=quay.io/wangzheng422/ocp:$VAR_RELEASE_VER
LOCAL_SECRET_JSON=$SEC_FILE
oc adm release extract --registry-config ${LOCAL_SECRET_JSON} --command='openshift-baremetal-install' ${RELEASE_IMAGE}
oc adm release extract --registry-config ${LOCAL_SECRET_JSON} --command='openshift-install' ${RELEASE_IMAGE}
oc adm release extract --registry-config ${LOCAL_SECRET_JSON} --command='oc' ${RELEASE_IMAGE}
/bin/cp -f /usr/local/bin/oc-mirror ./
cat > /data/ocp4/mirror.yaml << EOF
apiVersion: mirror.openshift.io/v1alpha2
kind: ImageSetConfiguration
# archiveSize: 4
mirror:
platform:
architectures:
- amd64
# - arm64
channels:
# - name: stable-4.11
# type: ocp
# minVersion: 4.11.18
# maxVersion: 4.11.18
# shortestPath: true
# - name: stable-4.10
# type: ocp
# minVersion: 4.10.45
# maxVersion: 4.10.45
# shortestPath: true
graph: false
additionalImages:
- name: registry.redhat.io/redhat/redhat-operator-index:v4.10
- name: registry.redhat.io/redhat/certified-operator-index:v4.10
- name: registry.redhat.io/redhat/community-operator-index:v4.10
- name: registry.redhat.io/redhat/redhat-marketplace-index:v4.10
- name: quay.io/wangzheng422/local-storage-operator:wzh-ocp-4.10-v01
- name: quay.io/wangzheng422/local-storage-bundle:wzh-ocp-4.10-v01
- name: quay.io/wangzheng422/local-diskmaker:wzh-ocp-4.10-v01
- name: quay.io/wangzheng422/local-must-gather:wzh-ocp-4.10-v01
- name: quay.io/openshift/origin-kube-rbac-proxy:latest
- name: quay.io/wangzheng422/debug-pod:alma-9.1
operators:
- catalog: registry.redhat.io/redhat/redhat-operator-index:v4.10
packages:
- name: cluster-logging
channels:
- name: stable
minVersion: 5.5.5
- name: elasticsearch-operator
channels:
- name: stable
minVersion: 5.5.5
- name: jaeger-product
channels:
- name: stable
minVersion: 1.39.0-3
- name: kubernetes-nmstate-operator
channels:
- name: stable
minVersion: 4.10.0-202212061900
- name: odf-operator
channels:
- name: stable-4.10
minVersion: 4.10.9
- name: sriov-network-operator
channels:
- name: stable
minVersion: 4.10.0-202212061900
- name: kubevirt-hyperconverged
channels:
- name: stable
minVersion: 4.10.7
- catalog: quay.io/wangzheng422/local-storage-index:wzh-ocp-4.10-v01
packages:
- name: local-storage-operator
channels:
- name: preview
EOF
mkdir -p /data/ocp-install/oc-mirror/
cd /data/ocp-install/oc-mirror/
oc-mirror --config /data/ocp4/mirror.yaml file:///data/ocp-install/oc-mirror/
mkdir -p /data/bypy
cd /data/bypy
cd /data
# export BUILDNUMBER=4.8.17
tar -cvf - ocp-install/ | pigz -c > /data/bypy/ocp-install.tgz
cd /data/bypy
# https://github.com/houtianze/bypy
# yum -y install python3-pip
# pip3 install --user bypy
# /root/.local/bin/bypy list
/root/.local/bin/bypy upload
# test import
tar zvxf ocp-install.tgz
/bin/cp -f ./ocp-install/clients/* /usr/local/bin/
oc image mirror --from-dir=./ocp-install/images/ 'file://openshift/release:4.10.41-x86_64*' quaylab.infra.wzhlab.top:5443/ocp4/openshift4
oc-mirror --from=./ocp-install/oc-mirror/mirror_seq1_000000.tar \
docker://quaylab.infra.wzhlab.top:5443
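After the import, the pushed repositories can be listed through the standard registry v2 API to double-check that everything landed. A sketch, assuming the quay admin credentials used earlier in this document:
curl -sk -u admin:redhatadmin https://quaylab.infra.wzhlab.top:5443/v2/_catalog | jq .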
try to config the ocp install
然后,我们就开始定义ocp的安装install配置文件,并且由于我们是UPI安装,我们还要定制iso。
Then, we start to define the installation configuration file of ocp, and since we are installing using UPI, we also need to customize the iso.
# export BUILDNUMBER=4.8.53
# pushd /data/ocp4/${BUILDNUMBER}
# tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
# tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
# tar -xzf oc-mirror.tar.gz -C /usr/local/bin/
# chmod +x /usr/local/bin/oc-mirror
# install -m 755 /data/ocp4/clients/butane-amd64 /usr/local/bin/butane
# install -m 755 /data/ocp4/clients/coreos-installer_amd64 /usr/local/bin/coreos-installer
# popd
# create a user and create the cluster under the user
useradd -m 3node
# useradd -G wheel 3node
usermod -aG wheel 3node
echo -e "%wheel\tALL=(ALL)\tNOPASSWD: ALL" > /etc/sudoers.d/020_sudo_for_me
su - 3node
ssh-keygen
cat << EOF > ~/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
chmod 600 ~/.ssh/config
cat << 'EOF' >> ~/.bashrc
export BASE_DIR='/home/3node/'
EOF
# export BASE_DIR='/home/3node/'
mkdir -p ${BASE_DIR}/data/{sno/disconnected,install}
# set some parameters of your cluster
NODE_SSH_KEY="$(cat ${BASE_DIR}/.ssh/id_rsa.pub)"
INSTALL_IMAGE_REGISTRY=quaylab.infra.wzhlab.top:5443
PULL_SECRET='{"auths":{"registry.redhat.io": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"'${INSTALL_IMAGE_REGISTRY}'": {"auth": "'$( echo -n 'admin:redhatadmin' | openssl base64 )'","email": "noemail@localhost"}}}'
# NTP_SERVER=192.168.7.11
# HELP_SERVER=192.168.7.11
# KVM_HOST=192.168.7.11
# API_VIP=192.168.7.100
# INGRESS_VIP=192.168.7.101
# CLUSTER_PROVISION_IP=192.168.7.103
# BOOTSTRAP_IP=192.168.7.12
# define the node information for the cluster
SNO_CLUSTER_NAME=acm-demo-one
SNO_BASE_DOMAIN=wzhlab.top
# echo ${SNO_IF_MAC} > /data/sno/sno.mac
mkdir -p ${BASE_DIR}/data/install
cd ${BASE_DIR}/data/install
/bin/rm -rf *.ign .openshift_install_state.json auth bootstrap manifests master*[0-9] worker*[0-9]
cat << EOF > ${BASE_DIR}/data/install/install-config.yaml
apiVersion: v1
baseDomain: $SNO_BASE_DOMAIN
compute:
- name: worker
replicas: 0
controlPlane:
name: master
replicas: 3
metadata:
name: $SNO_CLUSTER_NAME
networking:
# OVNKubernetes , OpenShiftSDN
networkType: OVNKubernetes
clusterNetwork:
- cidr: 10.128.0.0/14
hostPrefix: 23
- cidr: fd01::/48
hostPrefix: 64
serviceNetwork:
- 172.30.0.0/16
- fd02::/112
machineNetwork:
- cidr: 10.0.0.0/16
- cidr: fd03::/64
platform:
none: {}
pullSecret: '${PULL_SECRET}'
sshKey: |
$( cat ${BASE_DIR}/.ssh/id_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /etc/crts/redhat.ren.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4
source: quay.io/wangzheng422/ocp
EOF
/bin/cp -f ${BASE_DIR}/data/install/install-config.yaml ${BASE_DIR}/data/install/install-config.yaml.bak
openshift-install create manifests --dir=${BASE_DIR}/data/install
# additional ntp config
/bin/cp -f /data/ocp4/ansible-helper/files/* ${BASE_DIR}/data/install/openshift/
#############################################
# run as root if you have not run below, at least one time
# it will generate registry configuration
# copy image registry proxy related config
# cd /data/ocp4
# bash image.registries.conf.sh nexus.infra.redhat.ren:8083
# /bin/cp -f /data/ocp4/image.registries.conf /etc/containers/registries.conf.d/
#############################################
sudo bash -c "cd /data/ocp4 ; bash image.registries.conf.sh quaylab.infra.wzhlab.top:5443 ;"
/bin/cp -f /data/ocp4/99-worker-container-registries.yaml ${BASE_DIR}/data/install/openshift
/bin/cp -f /data/ocp4/99-master-container-registries.yaml ${BASE_DIR}/data/install/openshift
cd ${BASE_DIR}/data/install/
openshift-install --dir=${BASE_DIR}/data/install create ignition-configs
BOOTSTRAP_IP=192.168.77.22
MASTER_01_IP=192.168.77.23
MASTER_02_IP=192.168.77.24
MASTER_03_IP=192.168.77.25
BOOTSTRAP_IPv6=fd03::22
MASTER_01_IPv6=fd03::23
MASTER_02_IPv6=fd03::24
MASTER_03_IPv6=fd03::25
BOOTSTRAP_HOSTNAME=bootstrap-demo
MASTER_01_HOSTNAME=master-01-demo
MASTER_02_HOSTNAME=master-02-demo
MASTER_03_HOSTNAME=master-03-demo
BOOTSTRAP_INTERFACE=enp1s0
MASTER_01_INTERFACE=enp1s0
MASTER_02_INTERFACE=enp1s0
MASTER_03_INTERFACE=enp1s0
BOOTSTRAP_DISK=/dev/vda
MASTER_01_DISK=/dev/vda
MASTER_02_DISK=/dev/vda
MASTER_03_DISK=/dev/vda
OCP_GW=192.168.77.11
OCP_NETMASK=255.255.255.0
OCP_NETMASK_S=24
OCP_DNS=192.168.77.11
OCP_GW_v6=fd03::11
OCP_NETMASK_v6=64
# HTTP_PATH=http://192.168.7.11:8080/ignition
source /data/ocp4/acm.fn.sh
# We create a wzh user with the password redhat, so that on first boot we can log in directly from the console/ssh with username and password,
# which makes troubleshooting and research easier
VAR_PWD_HASH="$(python3 -c 'import crypt,getpass; print(crypt.crypt("redhat"))')"
cat ${BASE_DIR}/data/install/bootstrap.ign \
| jq --arg VAR "$VAR_PWD_HASH" --arg VAR_SSH "$NODE_SSH_KEY" '.passwd.users += [{ "name": "wzh", "system": true, "passwordHash": $VAR , "sshAuthorizedKeys": [ $VAR_SSH ], "groups": [ "adm", "wheel", "sudo", "systemd-journal" ] }]' \
| jq -c . \
> ${BASE_DIR}/data/install/bootstrap-iso.ign
cat ${BASE_DIR}/data/install/master.ign \
| jq --arg VAR "$VAR_PWD_HASH" --arg VAR_SSH "$NODE_SSH_KEY" '.passwd.users += [{ "name": "wzh", "system": true, "passwordHash": $VAR , "sshAuthorizedKeys": [ $VAR_SSH ], "groups": [ "adm", "wheel", "sudo", "systemd-journal" ] }]' \
| jq -c . \
> ${BASE_DIR}/data/install/master-iso.ign
VAR_IMAGE_VER=410.91.202211291516-wzh-0
cd ${BASE_DIR}/data/install/
/bin/cp -f /data/work/ext-client/iso/rhcos-$VAR_IMAGE_VER-live.x86_64.iso bootstrap.iso
/bin/cp -f bootstrap.iso master01.iso
/bin/cp -f bootstrap.iso master02.iso
/bin/cp -f bootstrap.iso master03.iso
sudo /bin/cp -f /data/work/ext-client/iso/rhcos-$VAR_IMAGE_VER-metal.x86_64.raw /data/dnf/
sudo /bin/cp -f ${BASE_DIR}/data/install/{bootstrap,master}-iso.ign /data/dnf/
# for ipv4 only
coreos-installer iso kargs modify -a "ip=$BOOTSTRAP_IP::$OCP_GW:$OCP_NETMASK:$BOOTSTRAP_HOSTNAME:$BOOTSTRAP_INTERFACE:none nameserver=$OCP_DNS coreos.inst.install_dev=$BOOTSTRAP_DISK coreos.inst.ignition_url=http://192.168.77.11:5000/bootstrap-iso.ign coreos.inst.image_url=http://192.168.77.11:5000/rhcos-$VAR_IMAGE_VER-metal.x86_64.raw coreos.inst.insecure" bootstrap.iso
coreos-installer iso kargs modify -a "ip=$MASTER_01_IP::$OCP_GW:$OCP_NETMASK:$MASTER_01_HOSTNAME:$MASTER_01_INTERFACE:none nameserver=$OCP_DNS coreos.inst.install_dev=$MASTER_01_DISK coreos.inst.ignition_url=http://192.168.77.11:5000/master-iso.ign coreos.inst.image_url=http://192.168.77.11:5000/rhcos-$VAR_IMAGE_VER-metal.x86_64.raw coreos.inst.insecure" master01.iso
coreos-installer iso kargs modify -a "ip=$MASTER_02_IP::$OCP_GW:$OCP_NETMASK:$MASTER_02_HOSTNAME:$MASTER_02_INTERFACE:none nameserver=$OCP_DNS coreos.inst.install_dev=$MASTER_02_DISK coreos.inst.ignition_url=http://192.168.77.11:5000/master-iso.ign coreos.inst.image_url=http://192.168.77.11:5000/rhcos-$VAR_IMAGE_VER-metal.x86_64.raw coreos.inst.insecure" master02.iso
coreos-installer iso kargs modify -a "ip=$MASTER_03_IP::$OCP_GW:$OCP_NETMASK:$MASTER_03_HOSTNAME:$MASTER_03_INTERFACE:none nameserver=$OCP_DNS coreos.inst.install_dev=$MASTER_03_DISK coreos.inst.ignition_url=http://192.168.77.11:5000/master-iso.ign coreos.inst.image_url=http://192.168.77.11:5000/rhcos-$VAR_IMAGE_VER-metal.x86_64.raw coreos.inst.insecure" master03.iso
# for ipv4 / ipv6 dual stack
coreos-installer iso kargs modify -a " ip=$BOOTSTRAP_IP::$OCP_GW:$OCP_NETMASK:$BOOTSTRAP_HOSTNAME:$BOOTSTRAP_INTERFACE:none nameserver=$OCP_DNS ip=[$BOOTSTRAP_IPv6]::[$OCP_GW_v6]:$OCP_NETMASK_v6:$BOOTSTRAP_HOSTNAME:$BOOTSTRAP_INTERFACE:none coreos.inst.install_dev=$BOOTSTRAP_DISK coreos.inst.ignition_url=http://192.168.77.11:5000/bootstrap-iso.ign coreos.inst.image_url=http://192.168.77.11:5000/rhcos-$VAR_IMAGE_VER-metal.x86_64.raw coreos.inst.insecure " bootstrap.iso
coreos-installer iso kargs modify -a " ip=$MASTER_01_IP::$OCP_GW:$OCP_NETMASK:$MASTER_01_HOSTNAME:$MASTER_01_INTERFACE:none nameserver=$OCP_DNS ip=[$MASTER_01_IPv6]::[$OCP_GW_v6]:$OCP_NETMASK_v6:$MASTER_01_HOSTNAME:$MASTER_01_INTERFACE:none coreos.inst.install_dev=$MASTER_01_DISK coreos.inst.ignition_url=http://192.168.77.11:5000/master-iso.ign coreos.inst.image_url=http://192.168.77.11:5000/rhcos-$VAR_IMAGE_VER-metal.x86_64.raw coreos.inst.insecure " master01.iso
coreos-installer iso kargs modify -a " ip=$MASTER_02_IP::$OCP_GW:$OCP_NETMASK:$MASTER_02_HOSTNAME:$MASTER_02_INTERFACE:none nameserver=$OCP_DNS ip=[$MASTER_02_IPv6]::[$OCP_GW_v6]:$OCP_NETMASK_v6:$MASTER_02_HOSTNAME:$MASTER_02_INTERFACE:none coreos.inst.install_dev=$MASTER_02_DISK coreos.inst.ignition_url=http://192.168.77.11:5000/master-iso.ign coreos.inst.image_url=http://192.168.77.11:5000/rhcos-$VAR_IMAGE_VER-metal.x86_64.raw coreos.inst.insecure " master02.iso
coreos-installer iso kargs modify -a " ip=$MASTER_03_IP::$OCP_GW:$OCP_NETMASK:$MASTER_03_HOSTNAME:$MASTER_03_INTERFACE:none nameserver=$OCP_DNS ip=[$MASTER_03_IPv6]::[$OCP_GW_v6]:$OCP_NETMASK_v6:$MASTER_03_HOSTNAME:$MASTER_03_INTERFACE:none coreos.inst.install_dev=$MASTER_03_DISK coreos.inst.ignition_url=http://192.168.77.11:5000/master-iso.ign coreos.inst.image_url=http://192.168.77.11:5000/rhcos-$VAR_IMAGE_VER-metal.x86_64.raw coreos.inst.insecure " master03.iso
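coreos-installer can print back the kernel arguments embedded in an ISO, which is a quick way to catch typos in the ip=/ignition_url= strings before booting. A sketch:
# show the kargs that were appended to each ISO
coreos-installer iso kargs show bootstrap.iso
coreos-installer iso kargs show master01.iso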
deploy on kvm host
有了iso文件,我们就可以用他们启动kvm,开始安装了,这一部分,可以参考引用文档,这里就不重复写了。
With the iso files, we can use them to start kvm and start the installation. For this part, you can refer to the reference document, so I will not repeat it here.
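For completeness, a minimal virt-install sketch for one node; the VM name, bridge, disk path and sizing here are placeholders, not values from the reference setup:
# boot one master from its customized live ISO
virt-install --name ocp-master-01 \
  --memory 32768 --vcpus 16 \
  --disk path=/var/lib/libvirt/images/ocp-master-01.qcow2,size=120 \
  --network bridge=br-ocp \
  --cdrom /data/install/master01.iso \
  --os-variant rhel8.4 \
  --graphics vnc --noautoconsole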
result
等着安装完成,什么都不需要做,然后运行下面的命令,就能得到我们集群的登录参数了。
之后,我们登录到节点,就能看到,节点的kernel已经升级好了。
Wait for the installation to complete; nothing else needs to be done. Then run the following command to get the login parameters of our cluster.
After that, when we log in to the node, we can see that the kernel of the node has been upgraded.
openshift-install wait-for install-complete --log-level debug
# ......
# INFO Waiting up to 10m0s (until 12:31PM) for the openshift-console route to be created...
# DEBUG Route found in openshift-console namespace: console
# DEBUG OpenShift console route is admitted
# INFO Install complete!
# INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/3node/data/install/auth/kubeconfig'
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.acm-demo-one.wzhlab.top
# INFO Login to the console with user: "kubeadmin", and password: "NpBWx-CM25p-oykYx-TBAoy"
# DEBUG Time elapsed per stage:
# DEBUG Cluster Operators: 6m44s
# INFO Time elapsed: 6m44s
password login and oc config
# init setting for helper node
cat << EOF > ~/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
chmod 600 ~/.ssh/config
cat > ${BASE_DIR}/data/install/crack.txt << 'EOF'
echo redhat | sudo passwd --stdin root
sudo sh -c 'echo "PasswordAuthentication yes" > /etc/ssh/sshd_config.d/99-wzh.conf '
sudo sh -c 'echo "PermitRootLogin yes" >> /etc/ssh/sshd_config.d/99-wzh.conf '
sudo sh -c 'echo "ClientAliveInterval 1800" >> /etc/ssh/sshd_config.d/99-wzh.conf '
sudo systemctl restart sshd
sudo sh -c 'echo "export KUBECONFIG=/etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/node-kubeconfigs/localhost.kubeconfig" >> /root/.bashrc'
sudo sh -c 'echo "RET=\`oc config use-context system:admin\`" >> /root/.bashrc'
EOF
for i in 23 24 25
do
ssh core@192.168.77.$i < ${BASE_DIR}/data/install/crack.txt
done
from other host
# https://unix.stackexchange.com/questions/230084/send-the-password-through-stdin-in-ssh-copy-id
dnf install -y sshpass
for i in 23 24 25
do
sshpass -p 'redhat' ssh-copy-id root@192.168.77.$i
done
log into ocp to check
我们登录到openshift里面,看看成果吧。
Let's log in to openshift and see the results.
# login to master-01
uname -a
# Linux master-01-demo 5.14.0-162.6.1.el9_1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Sep 30 07:36:03 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
cat /etc/os-release
# NAME="Red Hat Enterprise Linux CoreOS"
# ID="rhcos"
# ID_LIKE="rhel fedora"
# VERSION="410.91.202211291516-wzh-0"
# VERSION_ID="4.10"
# VARIANT="CoreOS"
# VARIANT_ID=coreos
# PLATFORM_ID="platform:el9"
# PRETTY_NAME="Red Hat Enterprise Linux CoreOS 410.91.202211291516-wzh-0 (Plow)"
# ANSI_COLOR="0;31"
# CPE_NAME="cpe:/o:redhat:enterprise_linux:9::coreos"
# HOME_URL="https://www.redhat.com/"
# DOCUMENTATION_URL="https://docs.openshift.com/container-platform/4.10/"
# BUG_REPORT_URL="https://bugzilla.redhat.com/"
# REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
# REDHAT_BUGZILLA_PRODUCT_VERSION="4.10"
# REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
# REDHAT_SUPPORT_PRODUCT_VERSION="4.10"
# OPENSHIFT_VERSION="4.10"
# RHEL_VERSION="9.1"
# OSTREE_VERSION="410.91.202211291516-wzh-0"
lscpu
# Architecture: x86_64
# CPU op-mode(s): 32-bit, 64-bit
# Address sizes: 48 bits physical, 48 bits virtual
# Byte Order: Little Endian
# CPU(s): 128
# On-line CPU(s) list: 0-127
# Vendor ID: HygonGenuine
# BIOS Vendor ID: Chengdu Hygon
# Model name: Hygon C86 7285 32-core Processor
# BIOS Model name: Hygon C86 7285 32-core Processor
# CPU family: 24
# Model: 1
# Thread(s) per core: 2
# Core(s) per socket: 32
# Socket(s): 2
# Stepping: 1
# Frequency boost: enabled
# CPU max MHz: 2000.0000
# CPU min MHz: 1200.0000
# BogoMIPS: 4000.04
# Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rd
# tscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_
# 2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce
# topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clf
# lushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid d
# ecodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca sme sev sev_es
# Virtualization features:
# Virtualization: AMD-V
# Caches (sum of all):
# L1d: 2 MiB (64 instances)
# L1i: 4 MiB (64 instances)
# L2: 32 MiB (64 instances)
# L3: 128 MiB (16 instances)
# NUMA:
# NUMA node(s): 8
# NUMA node0 CPU(s): 0-7,64-71
# NUMA node1 CPU(s): 8-15,72-79
# NUMA node2 CPU(s): 16-23,80-87
# NUMA node3 CPU(s): 24-31,88-95
# NUMA node4 CPU(s): 32-39,96-103
# NUMA node5 CPU(s): 40-47,104-111
# NUMA node6 CPU(s): 48-55,112-119
# NUMA node7 CPU(s): 56-63,120-127
# Vulnerabilities:
# Itlb multihit: Not affected
# L1tf: Not affected
# Mds: Not affected
# Meltdown: Not affected
# Mmio stale data: Not affected
# Retbleed: Mitigation; untrained return thunk; SMT vulnerable
# Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
# Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
# Spectre v2: Mitigation; Retpolines, IBPB conditional, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
# Srbds: Not affected
# Tsx async abort: Not affected
oc get mcp
# NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
# master rendered-master-e21c00ca880030866d0c598d24ca301b True False False 3 3 3 0 40m
# worker rendered-worker-537f39ac419adbe3ede22a4d09132329 True False False 0 0 0 0 40m
oc get node
# NAME STATUS ROLES AGE VERSION
# master-01-demo Ready master,worker 45m v1.23.12+8a6bfe4
# master-02-demo Ready master,worker 44m v1.23.12+8a6bfe4
# master-03-demo Ready master,worker 43m v1.23.12+8a6bfe4
oc get clusterversion
# NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
# version 4.10.41 True False 5h30m Cluster version is 4.10.41
oc get co
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# authentication 4.10.41 True False False 19m
# baremetal 4.10.41 True False False 43m
# cloud-controller-manager 4.10.41 True False False 49m
# cloud-credential 4.10.41 True False False 50m
# cluster-autoscaler 4.10.41 True False False 43m
# config-operator 4.10.41 True False False 44m
# console 4.10.41 True False False 28m
# csi-snapshot-controller 4.10.41 True False False 32m
# dns 4.10.41 True False False 32m
# etcd 4.10.41 True False False 42m
# image-registry 4.10.41 True False False 30m
# ingress 4.10.41 True False False 32m
# insights 4.10.41 True False False 90s
# kube-apiserver 4.10.41 True False False 40m
# kube-controller-manager 4.10.41 True False False 41m
# kube-scheduler 4.10.41 True False False 40m
# kube-storage-version-migrator 4.10.41 True False False 30m
# machine-api 4.10.41 True False False 43m
# machine-approver 4.10.41 True False False 43m
# machine-config 4.10.41 True False False 43m
# marketplace 4.10.41 True False False 43m
# monitoring 4.10.41 True False False 36m
# network 4.10.41 True False False 44m
# node-tuning 4.10.41 True False False 43m
# openshift-apiserver 4.10.41 True False False 32m
# openshift-controller-manager 4.10.41 True False False 32m
# openshift-samples 4.10.41 True False False 37m
# operator-lifecycle-manager 4.10.41 True False False 43m
# operator-lifecycle-manager-catalog 4.10.41 True False False 43m
# operator-lifecycle-manager-packageserver 4.10.41 True False False 32m
# service-ca 4.10.41 True False False 44m
# storage 4.10.41 True False False 44m
other config to fix hygon deploy errors
disk treated as removalable disk (flag RM)
不知道是海光 x86 cpu的问题,还是这个服务器主板的问题,所有内置硬盘都会被认成移动硬盘。在主板bios里面,sata controller没有可以配置的项,只有海光cpu有相关的AHCI配置,但没有关闭热插拔的选项。
It is unclear whether this is an issue with the Hygon x86 CPU or with the server's motherboard, but all internal disks are detected as removable disks. In the motherboard BIOS there is nothing configurable for the SATA controller; only the Hygon CPU exposes some AHCI-related settings, and there is no option to disable hot-plug.
这个问题,对于安装openshift倒是没看出来有什么影响,但是会影响安装openshift data foundation(odf),因为odf安装的时候,会默认扫描节点的硬盘,然后把移动硬盘都排除。结果,海光cpu的服务器,就变成没有硬盘可以用来装odf了。
This does not seem to affect the openshift installation itself, but it breaks openshift data foundation (odf): during installation odf scans the node's disks and excludes removable ones by default, so on the Hygon servers there is no disk left for odf to use.
没办法,我们只好定制local storage operator,这个东西是odf的底层,真正的硬盘扫描,就是这个operator干的。
So we have to customize the local storage operator, which is the layer underneath odf; the actual disk scanning is done by this operator.
# you can see the RM flag is set to 1
lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
# sda 8:0 1 3.6T 0 disk
# |-sda1 8:1 1 1M 0 part
# |-sda2 8:2 1 127M 0 part
# |-sda3 8:3 1 384M 0 part /boot
# `-sda4 8:4 1 3.6T 0 part /var/lib/kubelet/pods/9c993e46-ed1f-4f5c-a48a-bf563a29d6b8/volume-subpaths/etc/tuned/5
# /var/lib/kubelet/pods/9c993e46-ed1f-4f5c-a48a-bf563a29d6b8/volume-subpaths/etc/tuned/4
# /var/lib/kubelet/pods/9c993e46-ed1f-4f5c-a48a-bf563a29d6b8/volume-subpaths/etc/tuned/3
# /var/lib/kubelet/pods/9c993e46-ed1f-4f5c-a48a-bf563a29d6b8/volume-subpaths/etc/tuned/2
# /var/lib/kubelet/pods/9c993e46-ed1f-4f5c-a48a-bf563a29d6b8/volume-subpaths/etc/tuned/1
# /var/lib/containers/storage/overlay
# /var
# /sysroot/ostree/deploy/rhcos/var
# /usr
# /etc
# /
# /sysroot
# sdb 8:16 1 447.1G 0 disk
# sdc 8:32 1 447.1G 0 disk
dmesg | grep sdb
# [ 6.900118] sd 16:0:0:0: [sdb] 937703088 512-byte logical blocks: (480 GB/447 GiB)
# [ 6.900134] sd 16:0:0:0: [sdb] 4096-byte physical blocks
# [ 6.900206] sd 16:0:0:0: [sdb] Write Protect is off
# [ 6.900211] sd 16:0:0:0: [sdb] Mode Sense: 00 3a 00 00
# [ 6.900908] sd 16:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
# [ 6.953704] sd 16:0:0:0: [sdb] Attached SCSI removable disk
udevadm info --query all --path /sys/block/sdb --attribute-walk
# looking at device '/devices/pci0000:00/0000:00:08.1/0000:05:00.2/ata18/host17/target17:0:0/17:0:0:0/block/sdb':
# KERNEL=="sdb"
# SUBSYSTEM=="block"
# DRIVER==""
# ATTR{alignment_offset}=="0"
# ATTR{capability}=="1"
# ATTR{discard_alignment}=="0"
# ATTR{diskseq}=="2"
# ATTR{events}=="media_change"
# ATTR{events_async}==""
# ATTR{events_poll_msecs}=="-1"
# ATTR{ext_range}=="256"
# ATTR{hidden}=="0"
# ATTR{inflight}==" 0 0"
# ATTR{integrity/device_is_integrity_capable}=="0"
# ATTR{integrity/format}=="none"
# ATTR{integrity/protection_interval_bytes}=="0"
# ATTR{integrity/read_verify}=="0"
# ATTR{integrity/tag_size}=="0"
# ATTR{integrity/write_generate}=="0"
# ATTR{mq/0/cpu_list}=="0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127"
# ATTR{mq/0/nr_reserved_tags}=="0"
# ATTR{mq/0/nr_tags}=="32"
# ATTR{power/control}=="auto"
# ATTR{power/runtime_active_time}=="0"
# ATTR{power/runtime_status}=="unsupported"
# ATTR{power/runtime_suspended_time}=="0"
# ATTR{queue/add_random}=="0"
# ATTR{queue/chunk_sectors}=="0"
# ATTR{queue/dax}=="0"
# ATTR{queue/discard_granularity}=="4096"
# ATTR{queue/discard_max_bytes}=="2147450880"
# ATTR{queue/discard_max_hw_bytes}=="2147450880"
# ATTR{queue/discard_zeroes_data}=="0"
# ATTR{queue/fua}=="0"
# ATTR{queue/hw_sector_size}=="512"
# ATTR{queue/io_poll}=="0"
# ATTR{queue/io_poll_delay}=="-1"
# ATTR{queue/io_timeout}=="30000"
# ATTR{queue/iosched/async_depth}=="48"
# ATTR{queue/iosched/fifo_batch}=="16"
# ATTR{queue/iosched/front_merges}=="1"
# ATTR{queue/iosched/prio_aging_expire}=="10000"
# ATTR{queue/iosched/read_expire}=="500"
# ATTR{queue/iosched/write_expire}=="5000"
# ATTR{queue/iosched/writes_starved}=="2"
# ATTR{queue/iostats}=="1"
# ATTR{queue/logical_block_size}=="512"
# ATTR{queue/max_discard_segments}=="1"
# ATTR{queue/max_hw_sectors_kb}=="32767"
# ATTR{queue/max_integrity_segments}=="0"
# ATTR{queue/max_sectors_kb}=="1280"
# ATTR{queue/max_segment_size}=="65536"
# ATTR{queue/max_segments}=="168"
# ATTR{queue/minimum_io_size}=="4096"
# ATTR{queue/nomerges}=="0"
# ATTR{queue/nr_requests}=="64"
# ATTR{queue/nr_zones}=="0"
# ATTR{queue/optimal_io_size}=="0"
# ATTR{queue/physical_block_size}=="4096"
# ATTR{queue/read_ahead_kb}=="128"
# ATTR{queue/rotational}=="0"
# ATTR{queue/rq_affinity}=="1"
# ATTR{queue/scheduler}=="[mq-deadline] kyber bfq none"
# ATTR{queue/stable_writes}=="0"
# ATTR{queue/virt_boundary_mask}=="0"
# ATTR{queue/wbt_lat_usec}=="2000"
# ATTR{queue/write_cache}=="write back"
# ATTR{queue/write_same_max_bytes}=="0"
# ATTR{queue/write_zeroes_max_bytes}=="0"
# ATTR{queue/zone_append_max_bytes}=="0"
# ATTR{queue/zone_write_granularity}=="0"
# ATTR{queue/zoned}=="none"
# ATTR{range}=="16"
# ATTR{removable}=="1"
# ATTR{ro}=="0"
# ATTR{size}=="937703088"
# ATTR{stat}==" 94 0 4504 16 0 0 0 0 0 16 16 0 0 0 0 0 0"
# ATTR{trace/act_mask}=="disabled"
# ATTR{trace/enable}=="0"
# ATTR{trace/end_lba}=="disabled"
# ATTR{trace/pid}=="disabled"
# ATTR{trace/start_lba}=="disabled"
build local-storage-operator
我们要做的,就是修改local-storage-operator里面的源代码。源代码里写死了移动硬盘不能作为local-storage使用,我们就把这个限制放开。比较走运的是,这个项目的代码逻辑还算简单,让我们比较方便地找到了写死的地方。
What we need to do is modify the source code of the local-storage-operator: the code hard-codes that removable disks cannot be used for local storage, and we simply lift that restriction. Luckily the project's code logic is fairly simple, so the hard-coded check was easy to find.
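A quick way to locate that hard-coded check before patching it (a sketch only; the grep target is an assumption, adjust it to whatever the repo actually contains):
# after cloning the repo (see the build steps below), search for the removable-disk filter
grep -rn --include='*.go' -i 'removable' local-storage-operator/ | grep -v '_test.go'
# review the matches, relax the device filter, then rebuild with the steps below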
# https://github.com/wangzheng422/local-storage-operator
# dnf module -y install go-toolset docker ruby
dnf module -y install go-toolset ruby
dnf install -y docker
rm -rf /data/operator
mkdir -p /data/operator
cd /data/operator
git clone https://github.com/wangzheng422/local-storage-operator
cd local-storage-operator
git checkout wzh-ocp-4.10
export REGISTRY=quay.io/wangzheng422/
export VERSION=wzh-ocp-4.10-v01
sed -i 's/REPO_IP/45.76.77.134:5180/g' wzh.repo
make images
make push-images
# quay.io/wangzheng422/local-diskmaker:wzh-ocp-4.10-v01
# quay.io/wangzheng422/local-storage-operator:wzh-ocp-4.10-v01
# quay.io/wangzheng422/local-must-gather:wzh-ocp-4.10-v01
make bundle
# quay.io/wangzheng422/local-storage-bundle:wzh-ocp-4.10-v01
# quay.io/wangzheng422/ocp/local-storage-index:wzh-ocp-4.10-v01
deploy RM hotfix to openshift
我们编译好了自定义版本的local-storage-operator,里面包括了operator本身,还有catalog source。接下来,我们就部署这个local-storage-operator版本,然后在这个基础之上,再部署odf。
We have built our customized version of the local-storage-operator, including the operator itself and a catalog source. Next we deploy this version of the local-storage-operator, and then deploy odf on top of it.
cat << EOF >> ~/wzh/disk.fix.project.yaml
apiVersion: v1
kind: Namespace
metadata:
labels:
openshift.io/cluster-monitoring: "true"
name: openshift-local-storage
EOF
oc create --save-config -f ~/wzh/disk.fix.project.yaml
oc project openshift-local-storage
cat << EOF > ~/wzh/disk.fix.sub.yaml
# apiVersion: v1
# kind: Namespace
# metadata:
# labels:
# openshift.io/cluster-monitoring: "true"
# name: openshift-local-storage
---
apiVersion: operators.coreos.com/v1alpha2
kind: OperatorGroup
metadata:
name: local-operator-group
namespace: openshift-local-storage
spec:
targetNamespaces:
- openshift-local-storage
---
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: localstorage-operator-manifests
namespace: openshift-local-storage
spec:
sourceType: grpc
# replace this with your index image
image: quay.io/wangzheng422/local-storage-index:wzh-ocp-4.10-v01
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: local-storage-subscription
namespace: openshift-local-storage
spec:
channel: preview # this is the default channel name defined in config bundle file
name: local-storage-operator
source: localstorage-operator-manifests
sourceNamespace: openshift-local-storage
EOF
oc create --save-config -f ~/wzh/disk.fix.sub.yaml
# if you want to restore
# oc delete -f ~/wzh/disk.fix.sub.yaml
# after deploy ODF, set default storage class to rbd
oc patch storageclass ocs-storagecluster-ceph-rbd -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
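To confirm the annotation took effect, list the storage classes; the default one is marked accordingly:
oc get sc
# the class annotated as default is shown with "(default)" after its name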
sriov fix
实验环境有sriov官方不支持的网卡,那么我们需要激活这些网卡的支持。要做2件事情:一个是禁用webhook,另外一个是配置一个config map,把网卡识别信息放进去。
The lab has NICs that are not officially supported by the sriov operator, so we need to enable support for them. Two things are required: disable the operator admission webhook, and create a config map containing the NIC identification information.
# disable sriov webhook
# https://docs.openshift.com/container-platform/4.10/networking/hardware_networks/configuring-sriov-operator.html#disable-enable-sr-iov-operator-admission-control-webhook_configuring-sriov-operator
oc patch sriovoperatorconfig default --type=merge \
-n openshift-sriov-network-operator \
--patch '{ "spec": { "enableOperatorWebhook": false } }'
# add unsupport nic ids
cat << EOF > ~/wzh/sriov-unsupport.yaml
apiVersion: v1
data:
INTEL: 8086 10fb 10ed
I350: 8086 1521 1520
Wuxi: 8848 1000 1080
kind: ConfigMap
metadata:
name: unsupported-nic-ids
namespace: openshift-sriov-network-operator
EOF
oc apply -f ~/wzh/sriov-unsupport.yaml
# 如何查找上面的那些网卡参数?可以从 sysfs 里读出来。 / How do we find the NIC IDs above? They can be read from sysfs.
VAR_IF=ens19f0
cat /sys/class/net/$VAR_IF/device/vendor
# 0x8086
cat /sys/class/net/$VAR_IF/device/device
# 0x1521
cat /sys/class/net/$VAR_IF/device/sriov_vf_device
# 1520
[root@master1 device]# dmesg |grep i40
[ 3.700084] i40e: Intel(R) Ethernet Connection XL710 Network Driver
[ 3.700088] i40e: Copyright (c) 2013 - 2019 Intel Corporation.
[ 3.718875] i40e 0000:23:00.0: fw 8.84.66032 api 1.14 nvm 8.40 0x8000af82 20.5.13 [8086:1572] [8086:0006]
[ 3.815120] i40e 0000:23:00.0: MAC address: 6c:fe:54:44:29:60
[ 3.815438] i40e 0000:23:00.0: FW LLDP is enabled
[ 3.832075] i40e 0000:23:00.0: PCI-Express: Speed 8.0GT/s Width x8
[ 3.862256] i40e 0000:23:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 119 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[ 3.892534] i40e 0000:23:00.1: fw 8.84.66032 api 1.14 nvm 8.40 0x8000af82 20.5.13 [8086:1572] [8086:0006]
[ 3.977303] i40e 0000:23:00.1: MAC address: 6c:fe:54:44:29:61
[ 3.980272] i40e 0000:23:00.1: FW LLDP is enabled
[ 3.993587] i40e 0000:23:00.1: PCI-Express: Speed 8.0GT/s Width x8
[ 4.009877] i40e 0000:23:00.1: Features: PF-id[1] VFs: 64 VSIs: 66 QP: 119 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[ 4.033807] i40e 0000:51:00.0: fw 6.0.48442 api 1.7 nvm 6.01 0x80003554 1.1747.0 [8086:158b] [8086:0001]
[ 4.115076] i40e 0000:51:00.0: MAC address: 3c:fd:fe:c5:58:68
[ 4.120848] i40e 0000:51:00.0: FW LLDP is enabled
[ 4.136188] i40e 0000:51:00.0: PCI-Express: Speed 8.0GT/s Width x8
[ 4.139533] i40e 0000:51:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 119 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[ 4.158734] i40e 0000:51:00.1: fw 6.0.48442 api 1.7 nvm 6.01 0x80003554 1.1747.0 [8086:158b] [8086:0001]
[ 4.245403] i40e 0000:51:00.1: MAC address: 3c:fd:fe:c5:58:69
[ 4.248148] i40e 0000:51:00.1: FW LLDP is enabled
[ 4.260198] i40e 0000:51:00.1: PCI-Express: Speed 8.0GT/s Width x8
[ 4.262961] i40e 0000:51:00.1: Features: PF-id[1] VFs: 64 VSIs: 66 QP: 119 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[root@master3 device]# dmesg |grep i40
[ 3.776216] i40e: Intel(R) Ethernet Connection XL710 Network Driver
[ 3.778116] i40e: Copyright (c) 2013 - 2019 Intel Corporation.
[ 3.798495] i40e 0000:23:00.0: fw 8.13.63341 api 1.12 nvm 8.15 0x8000a4e8 1.2879.0 [8086:1572] [1bd4:0042]
[ 3.902899] i40e 0000:23:00.0: MAC address: b4:05:5d:e1:71:3e
[ 3.904856] i40e 0000:23:00.0: FW LLDP is disabled
[ 3.906678] i40e 0000:23:00.0: FW LLDP is disabled, attempting SW DCB
[ 3.924126] i40e 0000:23:00.0: SW DCB initialization succeeded.
[ 3.942003] i40e 0000:23:00.0: PCI-Express: Speed 8.0GT/s Width x8
[ 3.963194] i40e 0000:23:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 119 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[ 3.981141] i40e 0000:23:00.1: fw 8.13.63341 api 1.12 nvm 8.15 0x8000a4e8 1.2879.0 [8086:1572] [1bd4:0042]
[ 4.067137] i40e 0000:23:00.1: MAC address: b4:05:5d:e1:71:3f
[ 4.070012] i40e 0000:23:00.1: FW LLDP is disabled
[ 4.072641] i40e 0000:23:00.1: FW LLDP is disabled, attempting SW DCB
[ 4.085208] i40e 0000:23:00.1: SW DCB initialization succeeded.
[ 4.103701] i40e 0000:23:00.1: PCI-Express: Speed 8.0GT/s Width x8
[ 4.116830] i40e 0000:23:00.1: Features: PF-id[1] VFs: 64 VSIs: 66 QP: 119 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[ 4.127157] i40e 0000:23:00.1 ens22f1: renamed from eth0
[ 4.160401] i40e 0000:23:00.0 ens22f0: renamed from eth1
lspci -vs 0000:23:00.0
# 23:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)
# Subsystem: Intel Corporation Ethernet 10G 2P X710 Adapter
# Physical Slot: 22
# Flags: bus master, fast devsel, latency 0, IRQ 105, NUMA node 2, IOMMU group 53
# Memory at d7000000 (64-bit, prefetchable) [size=16M]
# Memory at d8008000 (64-bit, prefetchable) [size=32K]
# Expansion ROM at d9180000 [disabled] [size=512K]
# Capabilities: [40] Power Management version 3
# Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
# Capabilities: [70] MSI-X: Enable+ Count=129 Masked-
# Capabilities: [a0] Express Endpoint, MSI 00
# Capabilities: [e0] Vital Product Data
# Capabilities: [100] Advanced Error Reporting
# Capabilities: [140] Device Serial Number 60-29-44-ff-ff-54-fe-6c
# Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
# Capabilities: [1a0] Transaction Processing Hints
# Capabilities: [1b0] Access Control Services
# Capabilities: [1d0] Secondary PCI Express
# Kernel driver in use: i40e
# Kernel modules: i40e
lspci -vs 0000:23:00.0
# 23:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)
# Subsystem: Inspur Electronic Information Industry Co., Ltd. 10G SFP+ DP EP102Fi4 Adapter
# Physical Slot: 22
# Flags: bus master, fast devsel, latency 0, IRQ 130, NUMA node 2, IOMMU group 53
# Memory at d7000000 (64-bit, prefetchable) [size=8M]
# Memory at d8008000 (64-bit, prefetchable) [size=32K]
# Expansion ROM at d9180000 [disabled] [size=512K]
# Capabilities: [40] Power Management version 3
# Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
# Capabilities: [70] MSI-X: Enable+ Count=129 Masked-
# Capabilities: [a0] Express Endpoint, MSI 00
# Capabilities: [100] Advanced Error Reporting
# Capabilities: [140] Device Serial Number 3e-71-e1-ff-ff-5d-05-b4
# Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
# Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
# Capabilities: [1a0] Transaction Processing Hints
# Capabilities: [1b0] Access Control Services
# Capabilities: [1d0] Secondary PCI Express
# Kernel driver in use: i40e
# Kernel modules: i40e
hugepage numa allocation
默认hugepage会平均分配在numa node之上,而dpdk程序是绑定numa node运行的,所以一个不小心,就会出现hugepage不足,导致dpdk启动不了的情况。
By default hugepages are spread evenly across the numa nodes, while a dpdk application is pinned to a specific numa node, so it is easy to end up with too few hugepages on that node and dpdk failing to start.
这里,我们先看看这个环境里面的numa是个什么情况。
Here we first take a look at what the numa layout of this environment looks like.
debug pod
为了方便测试,我们搞一个debug pod,然后用oc debug node的方式来运行这个pod,这样以后方便查询主机上的各种硬件信息。
To make testing easier, we build a debug pod and run it via oc debug node, so that later we can conveniently inspect all kinds of hardware information on the hosts.
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/performance_tuning_guide/sect-red_hat_enterprise_linux-performance_tuning_guide-memory-configuring-huge-pages
# build debug pod
mkdir -p /data/pod
cd /data/pod
cat << EOF > debugpod.dockerfile
FROM docker.io/library/almalinux:9
RUN dnf install -y epel-release && dnf update -y
RUN dnf repolist
RUN dnf install -y --allowerasing which iproute bind-utils wget htop btop bash-completion curl net-tools java-1.8.0-openjdk git iperf3 tcpdump stress-ng fio numactl hwloc-gui lshw nc nmap-ncat dmidecode
RUN dnf clean all -y
EOF
# VAR_IMAGE=quay.io/wangzheng422/debug-pod:alma-9.1
podman build --squash -t quay.io/wangzheng422/debug-pod:alma-9.2 -f debugpod.dockerfile ./
podman push quay.io/wangzheng422/debug-pod:alma-9.2
podman tag quay.io/wangzheng422/debug-pod:alma-9.1 quaylab.infra.wzhlab.top:5443/wangzheng422/debug-pod:alma-9.1
podman push quaylab.infra.wzhlab.top:5443/wangzheng422/debug-pod:alma-9.1
# try it
oc debug node/master1.ocp.ytl.com --image=quay.io/wangzheng422/debug-pod:alma-9.1
numastat -cm | egrep 'Node|Huge'
# Node 0 Node 1 Node 2 Node 3 Node 4 Node 5 Node 6 Node 7 Total
# AnonHugePages 3570 1796 1830 2920 934 1366 2486 4482 19384
# ShmemHugePages 0 0 0 0 0 0 0 0 0
# HugePages_Total 0 0 24576 0 0 0 0 0 24576
# HugePages_Free 0 0 15360 0 0 0 0 0 15360
# HugePages_Surp 0 0 0 0 0 0 0 0 0
lstopo --of png > test.png
# check nic belongs to numa node
cat /sys/class/net/ens22f0/device/numa_node
# 2
# check hugepage belongs to numa node
cat /sys/devices/system/node/node2/hugepages/hugepages-1048576kB/nr_hugepages
# 24
config numa hugepage binding
我们参考官方文档,配置hugepage和numa的绑定关系。
Following the official documentation, we configure the binding between hugepages and numa nodes.
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/performance_tuning_guide/sect-red_hat_enterprise_linux-performance_tuning_guide-memory-configuring-huge-pages
oc patch mcp/master --patch '{"spec":{"paused":true}}' --type=merge
oc patch mcp/worker --patch '{"spec":{"paused":true}}' --type=merge
cat << EOF > ~/wzh/master-hugepage.yaml
kind: MachineConfig
apiVersion: machineconfiguration.openshift.io/v1
metadata:
#name: 80-worker-hugepages
name: 80-master-hugepages
labels:
# machineconfiguration.openshift.io/role: worker
machineconfiguration.openshift.io/role: master
spec:
osImageURL: ""
config:
ignition:
version: 3.1.0
kernelArguments:
- hugepagesz=1G
- hugepages=32
- hugepagesz=2M
- hugepages=0
- default_hugepagesz=1G
- intel_iommu=on
- iommu=pt
EOF
oc apply -f ~/wzh/master-hugepage.yaml
cat << 'EOF' > ~/wzh/hugepage.bu
variant: openshift
version: 4.10.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-hugetlb-gigantic-pages
storage:
files:
- path: /etc/lib/systemd/hugetlb-reserve-pages.sh
overwrite: true
contents:
inline: |
#!/bin/sh
nodes_path=/sys/devices/system/node/
if [ ! -d $nodes_path ]; then
echo "ERROR: $nodes_path does not exist"
exit 1
fi
reserve_pages()
{
echo $1 > $nodes_path/$2/hugepages/hugepages-1048576kB/nr_hugepages
}
reserve_pages 0 node0
reserve_pages 0 node1
reserve_pages 16 node2
reserve_pages 0 node3
reserve_pages 0 node4
reserve_pages 16 node5
reserve_pages 0 node6
reserve_pages 0 node7
mode: 493
user:
name: root
systemd:
units:
- contents: |
[Unit]
Description=HugeTLB Gigantic Pages Reservation
DefaultDependencies=no
Before=dev-hugepages.mount
ConditionPathExists=/sys/devices/system/node
ConditionKernelCommandLine=hugepagesz=1G
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/etc/lib/systemd/hugetlb-reserve-pages.sh
[Install]
WantedBy=sysinit.target
enabled: true
name: hugetlb-gigantic-pages.service
EOF
butane ~/wzh/hugepage.bu > ~/wzh/hugepage.yaml
oc apply -f ~/wzh/hugepage.yaml
# oc create --save-config -f ~/wzh/hugepage.yaml
# oc delete -f ~/wzh/hugepage.yaml
oc patch mcp/master --patch '{"spec":{"paused":false}}' --type=merge
oc patch mcp/worker --patch '{"spec":{"paused":false}}' --type=merge
cnv disable auto import
实验室环境的外网非常慢,而cnv安装完了,会自动导入centos、rhel的镜像,这些镜像我们根本用不到,那么就禁止这种自动下载和导入。
The lab's internet access is very slow, and after cnv is installed it automatically imports centos and rhel boot-source images that we do not need at all, so we disable this automatic download and import.
- https://docs.openshift.com/container-platform/4.10/virt/virtual_machines/advanced_vm_management/virt-automatic-bootsource-updates.html
oc patch hco kubevirt-hyperconverged -n openshift-cnv --type json -p '[{"op": "replace", "path": "/spec/featureGates/enableCommonBootImageImport", "value": false}]'
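To confirm the change, read the same feature gate back (the jsonpath simply mirrors the patch above):
oc get hco kubevirt-hyperconverged -n openshift-cnv \
  -o jsonpath='{.spec.featureGates.enableCommonBootImageImport}{"\n"}'
# false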
cluster logging storage sizing
默认ocp cluster logging operator会使用200G的存储,如果我们集群内部ODF的存储很小,那么我们要调整,减小存储需求,并且配置它每天清除旧的日志。
By default the ocp cluster logging operator uses 200G of storage. If the ODF storage inside the cluster is small, we need to tune this down to reduce the storage requirement and configure it to purge logs older than one day.
oc get clusterlogging/instance -n openshift-logging -o yaml
# ......
# logStore:
# elasticsearch:
# nodeCount: 3
# proxy:
# resources:
# limits:
# memory: 256Mi
# requests:
# memory: 256Mi
# redundancyPolicy: SingleRedundancy
# resources:
# limits:
# cpu: 1
# memory: 8Gi
# requests:
# cpu: 500m
# memory: 8Gi
# storage:
# size: 52Gi
# retentionPolicy:
# application:
# maxAge: 1d
# audit:
# maxAge: 1d
# infra:
# maxAge: 1d
# type: elasticsearch
# ......
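A minimal sketch of applying the smaller sizing shown above (assuming the default instance name; the 52Gi size and 1d retention mirror the values in the output, adjust them to your own capacity, and note that shrinking an already-bound elasticsearch PVC is not possible in place):
oc patch clusterlogging/instance -n openshift-logging --type merge -p '{"spec":{"logStore":{"elasticsearch":{"storage":{"size":"52Gi"}},"retentionPolicy":{"application":{"maxAge":"1d"},"audit":{"maxAge":"1d"},"infra":{"maxAge":"1d"}}}}}'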
numa
这里是一些关于numa和cpu manager的系统配置。
Here are some system-level settings for numa and the cpu manager (static cpu manager policy plus single-numa-node topology manager).
cat << EOF > ~/wzh/numa.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
name: worker-cpumanager-enabled
spec:
kubeletConfig:
cpuManagerPolicy: static
cpuManagerReconcilePeriod: 5s
maxPods: 1000
topologyManagerPolicy: single-numa-node
machineConfigPoolSelector:
matchLabels:
custom-kubelet: worker-cpumanager-enabled
EOF
oc apply -f ~/wzh/numa.yaml
node taint
oc adm taint nodes worker1.ocp.ytl.com intel_cpu=true:NoExecute
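Pods that should still run on the tainted node need a matching toleration; below is a minimal sketch (the pod name is a placeholder, the image reuses the debug pod from earlier):
cat << EOF > ~/wzh/taint.toleration.demo.yaml
apiVersion: v1
kind: Pod
metadata:
  name: taint-demo
spec:
  tolerations:
  - key: "intel_cpu"
    operator: "Equal"
    value: "true"
    effect: "NoExecute"
  containers:
  - name: demo
    image: quay.io/wangzheng422/debug-pod:alma-9.1
    command: ["sleep", "inf"]
EOF
oc apply -f ~/wzh/taint.toleration.demo.yaml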
odf error fix
odf有过一些故障,有一个比较大的,是硬盘上有残留的分区信息,于是ceph不纳管这块盘。另外,就是pod有遗留的volume,导致新的pod无法创建,于是手动清理了这些volume,让新的pod能继续创建。
odf has had a few failures. A big one was that a disk still carried old partition information, so ceph refused to manage it. Another was that orphaned pod volumes were left behind, preventing new pods from being created, so we cleaned up those volumes by hand to let new pods be created again.
# https://access.redhat.com/solutions/5512711
journalctl -r -u kubelet | grep 'orphaned pod' | head -1
journalctl -r -u kubelet | grep 'orphaned pod' | head -1 | sed 's/.*orphaned pod//' | sed 's/ found.*//' | xargs printf | xargs printf && echo
POD_NAME=`journalctl -r -u kubelet | grep 'orphaned pod' | head -1 | sed 's/.*orphaned pod//' | sed 's/ found.*//' | xargs printf | xargs printf`
echo $POD_NAME
# cd /var/lib/kubelet/pods
rm -rf /var/lib/kubelet/pods/$POD_NAME/volumes
journalctl -r -u kubelet | grep 'orphaned pod' | head -1
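For the other problem mentioned above, a disk with leftover partition data that ceph refuses to adopt, the usual fix is to wipe the old signatures before ODF/LSO scans the device. A sketch, assuming /dev/sdb is the disk to reclaim and it carries no data you still need:
# DANGEROUS: this destroys everything on the disk, double-check the device name first
DISK=/dev/sdb
wipefs --all --force $DISK                                    # remove filesystem / raid / lvm signatures
dd if=/dev/zero of=$DISK bs=1M count=100 oflag=direct,dsync   # clear the start of the disk (old partition table, ceph metadata)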
timezone
openshift默认的时区是UTC+0的,我们要按照中国时区显示时间,就可以这么做。
openshift defaults to the UTC+0 timezone; to display the time in the China timezone, we can do the following.
TZ=":Asia/Shanghai" date
disable container image wipe
openshift在节点上有一个crio-wipe.service的systemd服务,它会运行crio wipe来清空本地镜像缓存。至于原因,说是因为重启有一定概率损坏容器存储,所以最简单的办法就是删掉缓存,重新下载。
openshift ships a systemd unit called crio-wipe.service on every node; it runs crio wipe to clear the local image cache. The stated reason is that a reboot has a small chance of corrupting the container storage, so the simplest remedy is to drop the cache and re-download the images.
我们可以把这个服务屏蔽掉,然后看看实际测试的效果,如果可以接受,那么就避免重启后去下载系统镜像了。
We can mask this service and see how it behaves in practice; if the result is acceptable, the nodes no longer have to re-download system images after every reboot.
cat << EOF > ${BASE_DIR}/data/install/crio-wipe.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-disable-crio-wipe-master
systemd:
units:
- name: crio-wipe.service
mask: true
EOF
butane -d ${BASE_DIR}/data/install ${BASE_DIR}/data/install/crio-wipe.bu > ${BASE_DIR}/data/install/99-zzz-disable-crio-wipe-master.yaml
oc create --save-config -f ${BASE_DIR}/data/install/99-zzz-disable-crio-wipe-master.yaml
sctp with externalIP
ipv4 single stack
# https://docs.openshift.com/container-platform/4.10/networking/using-sctp.html
cat << EOF > ${BASE_DIR}/data/install/99-load-sctp-module-master.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
name: 99-load-sctp-module-master
labels:
machineconfiguration.openshift.io/role: master
spec:
config:
ignition:
version: 3.2.0
storage:
files:
- path: /etc/modprobe.d/sctp-blacklist.conf
mode: 0644
overwrite: true
contents:
source: data:,
- path: /etc/modules-load.d/sctp-load.conf
mode: 0644
overwrite: true
contents:
source: data:,sctp
EOF
oc create --save-config -f ${BASE_DIR}/data/install/99-load-sctp-module-master.yaml
cat << EOF > ${BASE_DIR}/data/install/sctp.demo.yaml
---
apiVersion: v1
kind: Pod
metadata:
name: sctpserver
labels:
app: sctpserver
spec:
containers:
- name: sctpserver
image: quay.io/wangzheng422/debug-pod:alma-9.1
imagePullPolicy: Always
command: ["/bin/sh", "-c"]
args:
[" ncat -l 30102 --sctp -v "]
ports:
- containerPort: 30102
name: sctpserver
protocol: SCTP
---
apiVersion: v1
kind: Service
metadata:
name: sctpservice
labels:
app: sctpserver
spec:
type: ClusterIP
externalIPs:
- 192.168.77.88
selector:
app: sctpserver
ports:
- name: sctpserver
protocol: SCTP
port: 30102
targetPort: 30102
---
apiVersion: v1
kind: Pod
metadata:
name: sctpclient
labels:
app: sctpclient
spec:
containers:
- name: sctpclient
image: quay.io/wangzheng422/debug-pod:alma-9.1
imagePullPolicy: Always
command: ["/bin/sh", "-c"]
args:
["sleep inf"]
---
EOF
oc create --save-config -n default -f ${BASE_DIR}/data/install/sctp.demo.yaml
# to restore
oc delete -n default -f ${BASE_DIR}/data/install/sctp.demo.yaml
oc get services sctpservice -o go-template='{{.spec.clusterIP}}{{"\n"}}'
# 172.30.40.207
oc get pod -n default -o wide
# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
# sctpclient 1/1 Running 0 15m 10.128.0.45 master-03-demo <none> <none>
# sctpserver 1/1 Running 4 (4m8s ago) 15m 10.128.0.44 master-03-demo <none> <none>
oc rsh sctpclient
echo '111' | ncat 172.30.40.207 30102 --sctp -v
# Ncat: Version 7.91 ( https://nmap.org/ncat )
# Ncat: Connected to 172.30.40.207:30102.
# Ncat: 4 bytes sent, 0 bytes received in 0.13 seconds.
# login to master-01
ssh root@192.168.77.23
echo '1111' | ncat 192.168.77.88 30102 --sctp -v
# Ncat: Version 7.91 ( https://nmap.org/ncat )
# Ncat: Connected to 192.168.77.88:30102.
# Ncat: 5 bytes sent, 0 bytes received in 0.12 seconds.
ipv4 and ipv6 dual stack
oc edit network/cluster
# remove external ip policy by set it to null.
# by setting .spec.externalIP.policy -> null
# apiVersion: config.openshift.io/v1
# kind: Network
# metadata:
# creationTimestamp: "2023-01-10T06:39:18Z"
# generation: 2
# name: cluster
# resourceVersion: "3473"
# uid: c871e247-f941-426f-8f0e-02ecd2d497b8
# spec:
# clusterNetwork:
# - cidr: 10.128.0.0/14
# hostPrefix: 23
# - cidr: fd01::/48
# hostPrefix: 64
# externalIP:
# policy: {}
# networkType: OVNKubernetes
# serviceNetwork:
# - 172.30.0.0/16
# - fd02::/112
# https://docs.openshift.com/container-platform/4.10/networking/using-sctp.html
cat << EOF > ${BASE_DIR}/data/install/99-load-sctp-module-master.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
name: 99-load-sctp-module-master
labels:
machineconfiguration.openshift.io/role: master
spec:
config:
ignition:
version: 3.2.0
storage:
files:
- path: /etc/modprobe.d/sctp-blacklist.conf
mode: 0644
overwrite: true
contents:
source: data:,
- path: /etc/modules-load.d/sctp-load.conf
mode: 0644
overwrite: true
contents:
source: data:,sctp
EOF
oc create --save-config -f ${BASE_DIR}/data/install/99-load-sctp-module-master.yaml
cat << EOF > ${BASE_DIR}/data/install/sctp.demo.yaml
---
apiVersion: v1
kind: Pod
metadata:
name: sctpserver
labels:
app: sctpserver
spec:
containers:
- name: sctpserver
image: quay.io/wangzheng422/debug-pod:alma-9.1
imagePullPolicy: Always
command: ["/bin/sh", "-c"]
args:
["sleep inf"]
# [" while true; do ncat -l 30102 --sctp -v 2>&1 ; done; "]
ports:
- containerPort: 30102
name: sctpserver
protocol: SCTP
---
apiVersion: v1
kind: Service
metadata:
name: sctpservice
labels:
app: sctpserver
spec:
type: ClusterIP
ipFamilyPolicy: RequireDualStack
ipFamilies:
- IPv4
- IPv6
externalIPs:
- 192.168.77.88
- fd03::88
selector:
app: sctpserver
ports:
- name: sctpserver
protocol: SCTP
port: 30102
targetPort: 30102
# ---
# apiVersion: v1
# kind: Service
# metadata:
# name: sctpservice-v6
# labels:
# app: sctpserver
# spec:
# type: ClusterIP
# ipFamilyPolicy: SingleStack
# ipFamilies:
# - IPv6
# externalIPs:
# - fd03::88
# selector:
# app: sctpserver
# ports:
# - name: sctpserver
# protocol: SCTP
# port: 30102
# targetPort: 30102
---
apiVersion: v1
kind: Pod
metadata:
name: sctpclient
labels:
app: sctpclient
spec:
containers:
- name: sctpclient
image: quay.io/wangzheng422/debug-pod:alma-9.1
imagePullPolicy: Always
command: ["/bin/sh", "-c"]
args:
["sleep inf"]
---
EOF
oc create --save-config -n default -f ${BASE_DIR}/data/install/sctp.demo.yaml
# run below command in terminal windows of sctp server
# while true; do ncat -l 30102 --sctp -v 2>&1 ; done;
# to restore
oc delete -n default -f ${BASE_DIR}/data/install/sctp.demo.yaml
oc get pod -n default -o wide
# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
# sctpclient 1/1 Running 0 112s 10.128.0.71 master-03-demo <none> <none>
# sctpserver 1/1 Running 0 112s 10.128.0.70 master-03-demo <none> <none>
oc get services sctpservice -n default -o json | jq -r .spec.clusterIPs[]
# 172.30.74.183
# fd02::776e
oc rsh -n default sctpclient
echo '12345' | ncat 172.30.74.183 30102 --sctp -v
# Ncat: Version 7.91 ( https://nmap.org/ncat )
# Ncat: Connected to 172.30.74.183:30102.
# Ncat: 6 bytes sent, 0 bytes received in 0.12 seconds.
echo '123456' | ncat fd02::776e 30102 --sctp -v
# Ncat: Version 7.91 ( https://nmap.org/ncat )
# Ncat: Connected to fd02::776e:30102.
# Ncat: 7 bytes sent, 0 bytes received in 0.12 seconds.
# login to master-01
ssh root@192.168.77.23
echo '12' | ncat 192.168.77.88 30102 --sctp -v
# Ncat: Version 7.91 ( https://nmap.org/ncat )
# Ncat: Connected to 192.168.77.88:30102.
# Ncat: 3 bytes sent, 0 bytes received in 0.13 seconds.
echo '123' | ncat fd03::88 30102 --sctp -v
# Ncat: Version 7.91 ( https://nmap.org/ncat )
# Ncat: Connected to fd03::88:30102.
# Ncat: 4 bytes sent, 0 bytes received in 3.11 seconds.
podman run -it --rm --network=host quay.io/wangzheng422/debug-pod:alma-9.1 bash
nmstate operator
https://github.com/openshift/kubernetes-nmstate
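The operator is driven by NodeNetworkConfigurationPolicy resources; a minimal sketch of one (the node name, interface and IP are placeholders, and this assumes the kubernetes-nmstate operator is already installed):
cat << EOF > ${BASE_DIR}/data/install/nncp.demo.yaml
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: ens19f1-demo-policy
spec:
  nodeSelector:
    kubernetes.io/hostname: master-01-demo
  desiredState:
    interfaces:
    - name: ens19f1
      type: ethernet
      state: up
      ipv4:
        enabled: true
        dhcp: false
        address:
        - ip: 192.168.99.21
          prefix-length: 24
EOF
oc apply -f ${BASE_DIR}/data/install/nncp.demo.yaml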
end
other backup
grow fs
dnf install -y cloud-utils-growpart
lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
# sr0 11:0 1 1024M 0 rom
# vda 253:0 0 50G 0 disk
# ├─vda1 253:1 0 1G 0 part /boot
# ├─vda2 253:2 0 2G 0 part [SWAP]
# └─vda3 253:3 0 47G 0 part /
# vdb 253:16 0 80G 0 disk
# └─vdb1 253:17 0 40G 0 part
growpart /dev/vdb 1
e2fsck -fp /dev/vdb1
mount /dev/vdb1 /data/dnf
resize2fs /dev/vdb1
disable udisk
cat << EOF > ~/wzh/blk.rm.flag.bu
variant: openshift
version: 4.10.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-blk-rm-flag
storage:
files:
- path: /etc/udev/rules.d/62-internal-disk.rules
mode: 0644
overwrite: true
contents:
inline: |
KERNEL=="sd[b-c]*",ENV{UDISKS_IGNORE}="1"
EOF
# /etc/udev/rules.d/62-internal-disk.rules
# KERNEL=="sd[b-c]*",ENV{UDISKS_IGNORE}="1"
butane ~/wzh/blk.rm.flag.bu > ~/wzh/99-zzz-master-blk-rm-flag.yaml
oc create --save-config -f ~/wzh/99-zzz-master-blk-rm-flag.yaml
openshift4.11 acm with hypershift on baremetal
本文介绍,在openshift4.11上,装 ACM 组件以后,然后通过hypershift的方式,来部署一个单worker节点openshift4.11的控制面托管的集群,在部署的过程中,我们模拟离线的网络环境,并且禁止DHCP,只用静态IP。
This document describes how to deploy a cluster with a single worker node and a hosted control plane, using hypershift, on an ocp 4.11 hub cluster with ACM. During the deployment we simulate an offline network environment, disable DHCP, and use only static IPs.
控制面托管(hypershift)模式,之所以诱人,是因为他能够让控制面变成一个namespace,然后托管到中心控制面集群上,这样就能把多个集群的控制面集中到一个中心集群上,能大大提高master节点的计算密度,节约master节点的成本。并且能够把集群master节点的运行维护工作,交给专业团队运维的控制面集群,作为最终用户,只要关心worker节点的运行和维护,而worker节点的运行维护相对来说,是非常简单的。
The control plane hosting (hypershift) mode is attractive because it can turn the control plane into a namespace and then host it on the central cluster, so that the control planes of multiple clusters can be concentrated on one central cluster, which can greatly increase the computing density of the master node and save the cost of the master node. And the operation and maintenance of the master node can be handed over to the central cluster operated by a professional team. As an end user, you only need to care about the operation and maintenance of the worker node, and the operation and maintenance of the worker node is relatively simple.
对比SNO,compact cluster这种master/worker混合部署的方案,hypershift通过剥离控制面业务负载,到中心集群,防止work load对master的不利影响,比如用户部署了一个UPF这种极度消耗CPU的应用,就会无意间影响master,从而让整个集群垮掉。而hypershift就从方案层面,避免了这种情况。而从中心集群的角度来说,他的业务负载种类比较单一,就能刚好的有针对性的优化和运维。
Compared with the combined master/worker deployment mode of SNO and compact clusters, hypershift strips the control-plane workload out and moves it to the central cluster, preventing workloads from harming the master. For example, if a user deploys an extremely CPU-hungry application such as a UPF, it can inadvertently affect the master and bring the whole cluster down; hypershift avoids this situation at the architecture level. From the perspective of the central cluster, its workload types are relatively uniform, so it can be optimized and operated in a targeted way.
本次实验,整个流程如下:
- 在openshift4上安装ACM组件。
- 在ACM上配置cluster, infra env等配置。
- MCE通过网络 redfish 协议启动kvm
- kvm自动开始集群安装,但是由于kvm+redfish的限制,安装过程中的重启,需要手动停止kvm,配置由硬盘启动,然后再手动启动kvm。
- 集群安装完成,保存集群登录信息
In this experiment, the whole process is as follows:
- Install the ACM component on openshift4.
- Configure cluster, infra env and other configurations on ACM.
- MCE starts kvm through network redfish protocol
- Kvm automatically starts the cluster installation, but due to the limitation of kvm+redfish, the restart during the installation process requires manually stopping kvm, configuring it to start from the hard disk, and then manually starting kvm.
- The cluster installation is complete, save the cluster login information
本次实验的部署架构图:
The deployment architecture diagram of this experiment:
本次实验的网络架构,和服务器, kvm部属架构,是依托之前的一个未完成的实验,工厂模式,虽然工厂模式实验的网络模型比较复杂,但是我们就不重复配置环境了。如果想了解IPI模式如何部署集群,可以参考上述文档。
The network architecture of this experiment, as well as the server and kvm layout, reuses a previous unfinished experiment, Factory Mode. Although the network model of that factory-mode experiment is more complicated, we will not repeat the environment configuration here. If you want to know how to deploy a cluster in IPI mode, refer to that document.
参考资料:
reference:
- https://cloud.redhat.com/blog/how-to-build-bare-metal-hosted-clusters-on-red-hat-advanced-cluster-management-for-kubernetes
- https://cloud.redhat.com/blog/a-guide-to-red-hat-hypershift-on-bare-metal
静态变量 / static variable
根据factory的安装过程,我们弄了一个 3 node IPI 模式安装的 openshift, 是一个 ipi 的 compact cluster. 我们把这个集群作为hub集群,里面要装ACM组件。
Following the factory installation process, we have an openshift installed in 3-node IPI mode, i.e. an ipi compact cluster. We use this cluster as the hub cluster and install the ACM components on it.
以下的参数,是我们用这个hub集群,通过hypershift创建出来新集群的参数,新集群只有1个worker节点。
The following parameters are the parameters of the new cluster created by using this hub cluster through hypershift. The new cluster has only one worker node.
# on helper
# 做一些配置参数定义
INSTALL_IMAGE_REGISTRY=quaylab.infra.wzhlab.top:8443
# PULL_SECRET='{"auths":{"registry.redhat.io": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"'${INSTALL_IMAGE_REGISTRY}'": {"auth": "'$( echo -n 'admin:redhatadmin' | openssl base64 )'","email": "noemail@localhost"}}}'
PULL_SECRET=$(cat /data/pull-secret.json)
ACM_DEMO_CLUSTER=edge01
SNO_BASE_DOMAIN=wzhlab.top
SNO_IP=192.168.12.33
SNO_GW=192.168.12.1
SNO_NETMAST=255.255.255.0
SNO_NETMAST_S=24
SNO_HOSTNAME=edge-worker-01
SNO_IF=enp1s0
SNO_IF_MAC=52:54:00:20:a2:01
SNO_DNS=192.168.77.11
SNO_DISK=/dev/vda
SNO_CORE_PWD=redhat
另外,要说明的是,我们发现参考材料里面,对dns的配置不需要那么搞,至少对于单一worker节点来说,apps都指向这个worker节点就可以,api,api-int的域名指向并不重要,因为我们的实验,通过nodeport暴露API server,然后ip地址和端口号被静态的写入了kubelet的配置。
In addition, it should be noted that the dns configuration in the reference materials does not need to be that elaborate: at least for a single worker node, it is enough to point *.apps at that worker node, and where the api and api-int names resolve to does not really matter, because in our experiment the API server is exposed through a nodeport, and its ip address and port are statically written into the kubelet configuration.
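For illustration, if the lab resolves names with dnsmasq, a single wildcard entry pointing the new cluster's apps domain at the worker is enough (the apps.edge01.wzhlab.top name is an assumption derived from the cluster name and base domain defined above):
# a sketch only -- the exact apps domain depends on how the hosted cluster is named
echo 'address=/apps.edge01.wzhlab.top/192.168.12.33' > /etc/dnsmasq.d/edge01.conf
systemctl restart dnsmasq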
部署ACM / deploy ACM
接下来,我们就部署ACM,我们用最简单的部署模式。
Next, we deploy ACM, we use the simplest deployment mode.
# install operator Advanced Cluster Management for Kubernetes
cat << EOF > ${BASE_DIR}/data/install/acm.subscript.ns.yaml
apiVersion: v1
kind: Namespace
metadata:
name: open-cluster-management
EOF
oc create -f ${BASE_DIR}/data/install/acm.subscript.ns.yaml
cat << EOF > ${BASE_DIR}/data/install/acm.subscript.yaml
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: open-cluster-management-wzh
namespace: open-cluster-management
spec:
targetNamespaces:
- open-cluster-management
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: advanced-cluster-management
namespace: open-cluster-management
spec:
sourceNamespace: openshift-marketplace
source: redhat-operators
channel: release-2.6
installPlanApproval: Automatic
name: advanced-cluster-management
EOF
oc create -f ${BASE_DIR}/data/install/acm.subscript.yaml
# RHACM create the MultiClusterHub resource
cat << EOF > ${BASE_DIR}/data/install/acm.mch.mch.yaml
apiVersion: operator.open-cluster-management.io/v1
kind: MultiClusterHub
metadata:
name: multiclusterhub
namespace: open-cluster-management
spec: {}
EOF
oc create -f ${BASE_DIR}/data/install/acm.mch.mch.yaml
oc patch mce multiclusterengine --type=merge -p '{"spec":{"overrides":{"components":[{"name":"hypershift-preview","enabled": true}]}}}'
# wait here until you can see the local-cluster
oc get ManagedCluster -A
# NAME HUB ACCEPTED MANAGED CLUSTER URLS JOINED AVAILABLE AGE
# local-cluster true https://api.factory.wzhlab.top:6443 True True 5h22m
cat << EOF > ${BASE_DIR}/data/install/managed-cluster-addon.yaml
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
name: hypershift-addon
namespace: local-cluster
spec:
installNamespace: open-cluster-management-agent-addon
EOF
oc create --save-config -f ${BASE_DIR}/data/install/managed-cluster-addon.yaml
# oc delete -f ${BASE_DIR}/data/install/managed-cluster-addon.yaml
oc get managedclusteraddons -A
# NAMESPACE NAME AVAILABLE DEGRADED PROGRESSING
# local-cluster application-manager True
# local-cluster cert-policy-controller True
# local-cluster cluster-proxy True
# local-cluster config-policy-controller True
# local-cluster governance-policy-framework True
# local-cluster hypershift-addon True
# local-cluster iam-policy-controller True
# local-cluster work-manager True
装好了是这样,我们能看到装了2个operator, ACM和MCE
Once installed it looks like this: we can see that two operators are installed, ACM and MCE.
我们可以通过webUI访问ACM:
We can access ACM through webUI:
https://console-openshift-console.apps.factory.wzhlab.top/multicloud/infrastructure/clusters/managed
可以看到,默认有一个local-cluster,类型是hub,这个就是我们这个装了ACM的集群。
As you can see, there is a local-cluster by default, the type is hub, and this is our cluster with ACM installed.
点击进去,就能看到这个cluster的详细信息。
Click into it, you can see the detailed information of this cluster.
以及这个cluster包含的节点。
And the nodes contained in this cluster.
这个集群装的ACM插件。
The ACM addon installed in this cluster.
新版本的ACM还有一个cluster set的概念,用来分类cluster.
The new version of ACM also has a concept of cluster set, which is used to classify clusters.
在ACM概览页面,能看到这个ACM管理的多云环境。
On the ACM overview page, you can see the multi-cloud environment managed by this ACM.
其他的链接,都没有内容,页面是空的。
Other links have no content and the page is empty.
用hypershift模式部署集群 / Deploy the cluster using hypershift
有过部署assisted install service,并通过AIS来部署SNO的经验,那么通过ACM,用hypershift的模式来部署,就容易理解了。整个过程一样,都是配置ACM里面的assisted install service,然后定义infra env,调用BMC API,来直接挂载iso,并启动主机。不同的地方是,以前的实验,之后是定义一个 ClusterDeployment,现在要定义一个 HostedCluster,这个hosted cluster会帮助我们创建 cluster deployment。
If you have already deployed the assisted install service and used AIS to deploy SNO, then deploying through ACM in hypershift mode is easy to understand. The overall flow is the same: configure the assisted install service inside ACM, define an infra env, and call the BMC API to mount the iso directly and boot the host. The difference is that in the previous experiment we then defined a ClusterDeployment, whereas now we define a HostedCluster, and this hosted cluster will create the cluster deployment for us.
setup ACM for agent service
ACM 2.6 UI 是完全支持hypershift的,但是,我们现在的实验,是为了项目上能定制,所以有些配置要用命令行完成。
ACM 2.6 UI fully supports hypershift, but our current experiment is for project customization, so some configurations need to be done using the command line.
本文就是手动创建yaml,然后一步一步的做,更深入的理解一下hypershift的过程。
This article is to manually create yaml, and then do it step by step to understand the process of hypershift more deeply.
oc project open-cluster-management
oc get hiveconfig hive -n multicluster-engine -o yaml
# ......
# spec: {}
# status:
# aggregatorClientCAHash: b30ffa769079a2ac0e37e40172084089
# conditions:
# - lastProbeTime: "2023-01-13T09:10:10Z"
# lastTransitionTime: "2023-01-13T09:10:10Z"
# message: Hive is deployed successfully
# reason: DeploymentSuccess
# status: "True"
# type: Ready
# configApplied: true
# observedGeneration: 1
oc patch provisioning provisioning-configuration --type merge -p '{"spec":{"watchAllNamespaces": true }}'
oc get provisioning provisioning-configuration -o yaml
# ......
# spec:
# preProvisioningOSDownloadURLs: {}
# provisioningMacAddresses:
# - 52:54:00:20:a1:01
# - 52:54:00:20:a1:02
# - 52:54:00:20:a1:03
# provisioningNetwork: Disabled
# provisioningOSDownloadURL: http://192.168.77.11:8080/rhcos-openstack.x86_64.qcow2.gz?sha256=506bb66f8cb407c74061a8201f13e7b1edd44000d944be85eb7a4df7058dcb79
# watchAllNamespaces: true
# ......
cat << EOF > ${BASE_DIR}/data/install/acm.ocp.release.yaml
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
name: openshift-v4.11.21
namespace: multicluster-engine
spec:
releaseImage: ${INSTALL_IMAGE_REGISTRY}/openshift/release-images:4.11.21-x86_64
EOF
oc create -f ${BASE_DIR}/data/install/acm.ocp.release.yaml
# oc delete -f ${BASE_DIR}/data/install/acm.ocp.release.yaml
oc get ClusterImageSet
# NAME RELEASE
# openshift-v4.11.21 quaylab.infra.wzhlab.top:8443/openshift/release-images:4.11.21-x86_64
cat << EOF > ${BASE_DIR}/data/install/acm.cm.asc.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: assisted-service-config
namespace: multicluster-engine
labels:
app: assisted-service
data:
LOG_LEVEL: "debug"
EOF
oc create -f ${BASE_DIR}/data/install/acm.cm.asc.yaml
# oc delete -f ${BASE_DIR}/data/install/acm.cm.asc.yaml
openshift-install version
# openshift-install 4.11.21
# built from commit d3fb15afdbf1558344ea88a1e134c8e9a011440f
# release image quay.io/openshift-release-dev/ocp-release@sha256:860cc37824074671c4cf76e02d224d243e670d2298e6dab8923ee391fbd0ae1c
# release architecture amd64
openshift-install coreos print-stream-json | jq .architectures.x86_64.artifacts.metal.release -r
# 411.86.202210041459-0
VAR_COREOS_VERSION=`openshift-install coreos print-stream-json | jq .architectures.x86_64.artifacts.metal.release -r`
# the config of CA is important here.
# assisted service will not use cluster's CA config
cat << EOF > ${BASE_DIR}/data/install/acm.mirror.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: hyper1-mirror-config
namespace: multicluster-engine
labels:
app: assisted-service
data:
ca-bundle.crt: |
$( cat /etc/crts/infra.wzhlab.top.crt | sed 's/^/ /g' )
registries.conf: |
unqualified-search-registries = ["registry.access.redhat.com", "docker.io"]
[[registry]]
prefix = ""
location = "quay.io/openshift-release-dev/ocp-release"
mirror-by-digest-only = true
[[registry.mirror]]
location = "${INSTALL_IMAGE_REGISTRY}/openshift/release-images"
[[registry]]
prefix = ""
location = "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
mirror-by-digest-only = true
[[registry.mirror]]
location = "${INSTALL_IMAGE_REGISTRY}/openshift/release"
---
EOF
oc create -f ${BASE_DIR}/data/install/acm.mirror.yaml
# oc delete -f ${BASE_DIR}/data/install/acm.mirror.yaml
cat << EOF > ${BASE_DIR}/data/install/acm.agentservicecofnig.yaml
apiVersion: agent-install.openshift.io/v1beta1
kind: AgentServiceConfig
metadata:
name: agent
namespace: multicluster-engine
### This is the annotation that injects modifications in the Assisted Service pod
annotations:
unsupported.agent-install.openshift.io/assisted-service-configmap: "assisted-service-config"
###
spec:
databaseStorage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 40Gi
filesystemStorage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 40Gi
### This is a ConfigMap that only will make sense on Disconnected environments
mirrorRegistryRef:
name: "hyper1-mirror-config"
###
osImages:
- openshiftVersion: "4.11"
version: "$VAR_COREOS_VERSION"
url: "http://192.168.77.11:8080/rhcos-live.x86_64.iso"
rootFSUrl: "http://192.168.77.11:8080/rhcos-live-rootfs.x86_64.img"
cpuArchitecture: x86_64
EOF
oc create -f ${BASE_DIR}/data/install/acm.agentservicecofnig.yaml
# oc delete -f ${BASE_DIR}/data/install/acm.agentservicecofnig.yaml
# oc get pod -n multicluster-engine -o json | jq .items[].metadata.name -r | xargs -I DEMO oc logs -n multicluster-engine --prefix=true DEMO | grep 'failed to add release image '
# wait here to see all the status is True
oc get AgentServiceConfig/agent -n multicluster-engine -o yaml
# ......
# status:
# conditions:
# - lastTransitionTime: "2023-01-13T01:38:25Z"
# message: AgentServiceConfig reconcile completed without error.
# reason: ReconcileSucceeded
# status: "True"
# type: ReconcileCompleted
# - lastTransitionTime: "2023-01-13T01:40:25Z"
# message: All the deployments managed by Infrastructure-operator are healthy.
# reason: DeploymentSucceeded
# status: "True"
# type: DeploymentsHealthy
# stop here, and wait the assisted-service pod run into ok status
oc get pod -n multicluster-engine | grep assisted
# assisted-image-service-0 1/1 Running 0 4m38s
# assisted-service-764cd98cf7-2r2db 2/2 Running 1 (2m59s ago) 4m40s
create the infra env
infra env这个概念比较古怪,他的意思是,一组相同的主机共享的配置,共享什么配置呢?主要是网络参数配置,启动盘ISO的定制化配置等等。
The concept of an infra env is a bit odd. It is a set of configurations shared by a group of similar hosts; what is shared? Mainly the network parameter configuration, the customization of the discovery boot ISO, and so on.
oc create ns ${ACM_DEMO_CLUSTER}
oc project ${ACM_DEMO_CLUSTER}
cat << EOF > ${BASE_DIR}/data/install/acm.managed.secret.yaml
apiVersion: v1
kind: Secret
metadata:
name: assisted-deployment-pull-secret
namespace: ${ACM_DEMO_CLUSTER}
stringData:
.dockerconfigjson: '$PULL_SECRET'
EOF
oc create -f ${BASE_DIR}/data/install/acm.managed.secret.yaml
# oc delete -f ${BASE_DIR}/data/install/acm.managed.secret.yaml
cat << EOF > ${BASE_DIR}/data/install/acm.nmsc.yaml
apiVersion: agent-install.openshift.io/v1beta1
kind: NMStateConfig
metadata:
name: ${ACM_DEMO_CLUSTER}
namespace: ${ACM_DEMO_CLUSTER}
labels:
nmstate-conf-cluster-name: ${ACM_DEMO_CLUSTER}
spec:
config:
interfaces:
- name: ${SNO_IF}
type: ethernet
state: up
ipv4:
enabled: true
address:
- ip: ${SNO_IP}
prefix-length: ${SNO_NETMAST_S}
dhcp: false
dns-resolver:
config:
server:
- ${SNO_DNS}
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: ${SNO_GW}
next-hop-interface: ${SNO_IF}
table-id: 254
interfaces:
- name: "${SNO_IF}"
macAddress: ${SNO_IF_MAC}
EOF
oc create -f ${BASE_DIR}/data/install/acm.nmsc.yaml
# oc delete -f ${BASE_DIR}/data/install/acm.nmsc.yaml
oc get NMStateConfig/${ACM_DEMO_CLUSTER} -n ${ACM_DEMO_CLUSTER}
# NAME AGE
# edge01 3h30m
cat << EOF > ${BASE_DIR}/data/install/acm.infraenv.yaml
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
name: ${ACM_DEMO_CLUSTER}
namespace: ${ACM_DEMO_CLUSTER}
labels:
agentclusterinstalls.extensions.hive.openshift.io/location: ${ACM_DEMO_CLUSTER}
networkType: static
spec:
agentLabels:
'agentclusterinstalls.extensions.hive.openshift.io/location': ${ACM_DEMO_CLUSTER}
additionalNTPSources:
- 192.168.77.11
# clusterRef:
# name: ${ACM_DEMO_CLUSTER}
# namespace: ${ACM_DEMO_CLUSTER}-${ACM_DEMO_CLUSTER}
sshAuthorizedKey: "$(< ~/.ssh/id_rsa.pub)"
pullSecretRef:
name: assisted-deployment-pull-secret
# ignitionConfigOverride: '${VAR_IGNITION}'
nmStateConfigLabelSelector:
matchLabels:
nmstate-conf-cluster-name: ${ACM_DEMO_CLUSTER}
# imageType: "full-iso"
EOF
oc create -f ${BASE_DIR}/data/install/acm.infraenv.yaml
# oc delete -f ${BASE_DIR}/data/install/acm.infraenv.yaml
oc get infraenv/${ACM_DEMO_CLUSTER} -n ${ACM_DEMO_CLUSTER} -o yaml
# additionalNTPSources:
# - 192.168.77.11
# agentLabels:
# agentclusterinstalls.extensions.hive.openshift.io/location: edge01
# cpuArchitecture: x86_64
# ipxeScriptType: DiscoveryImageAlways
# nmStateConfigLabelSelector:
# matchLabels:
# infraenvs.agent-install.openshift.io: edge01
# pullSecretRef:
# name: pullsecret-edge01
# sshAuthorizedKey: ssh-rsa .....
oc get infraenv/${ACM_DEMO_CLUSTER} -n ${ACM_DEMO_CLUSTER} -o json | jq .status
# {
# "agentLabelSelector": {
# "matchLabels": {
# "infraenvs.agent-install.openshift.io": "edge01"
# }
# },
# "bootArtifacts": {
# "initrd": "https://assisted-image-service-multicluster-engine.apps.factory.wzhlab.top/images/c70485f3-0b12-437f-9efe-85b17f0c627f/pxe-initrd?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJpbmZyYV9lbnZfaWQiOiJjNzA0ODVmMy0wYjEyLTQzN2YtOWVmZS04NWIxN2YwYzYyN2YifQ.rrkRFxLVcMjEw16W3brxl_YCxHtJtUu-h0KMHcvj3DO701_ZPUM6cDg765Q02CviGSNcSTmu0ic5g06AkU0Zzg&arch=x86_64&version=4.11",
# "ipxeScript": "https://assisted-service-multicluster-engine.apps.factory.wzhlab.top/api/assisted-install/v2/infra-envs/c70485f3-0b12-437f-9efe-85b17f0c627f/downloads/files?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJpbmZyYV9lbnZfaWQiOiJjNzA0ODVmMy0wYjEyLTQzN2YtOWVmZS04NWIxN2YwYzYyN2YifQ.3j_oKrmfOVQn85v2S3laLojUKaCTRqgkv_aSBPo-z_7k8-n2swb2m9aNT3uPr3CEstV4UVurkYwShtawFed0Cg&file_name=ipxe-script",
# "kernel": "https://assisted-image-service-multicluster-engine.apps.factory.wzhlab.top/boot-artifacts/kernel?arch=x86_64&version=4.11",
# "rootfs": "https://assisted-image-service-multicluster-engine.apps.factory.wzhlab.top/boot-artifacts/rootfs?arch=x86_64&version=4.11"
# },
# "conditions": [
# {
# "lastTransitionTime": "2023-01-13T03:15:17Z",
# "message": "Image has been created",
# "reason": "ImageCreated",
# "status": "True",
# "type": "ImageCreated"
# }
# ],
# "createdTime": "2023-01-13T03:15:16Z",
# "debugInfo": {
# "eventsURL": "https://assisted-service-multicluster-engine.apps.factory.wzhlab.top/api/assisted-install/v2/events?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJpbmZyYV9lbnZfaWQiOiJjNzA0ODVmMy0wYjEyLTQzN2YtOWVmZS04NWIxN2YwYzYyN2YifQ.W_KCQgx4SwgbErK6eiyh7EmxPb9L8KKawXLOWPgBoPxVPH79QXq5wb-X5DT48b6qBlk3xk-F7MCT_bEG1f30Ww&infra_env_id=c70485f3-0b12-437f-9efe-85b17f0c627f"
# },
# "isoDownloadURL": "https://assisted-image-service-multicluster-engine.apps.factory.wzhlab.top/images/c70485f3-0b12-437f-9efe-85b17f0c627f?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJpbmZyYV9lbnZfaWQiOiJjNzA0ODVmMy0wYjEyLTQzN2YtOWVmZS04NWIxN2YwYzYyN2YifQ.4FqFWSqfYijmGGWAKopqHIiKghDZBZ2NAqTY1hmUhwNfTzuKlFLZ2pDZAevAxtmf7aN96-6UCeNewIfqoLzPVQ&arch=x86_64&type=minimal-iso&version=4.11"
# }
# VAR_ISO=`oc get infraenv ${ACM_DEMO_CLUSTER} -n ${ACM_DEMO_CLUSTER} -o jsonpath={.status.isoDownloadURL}`
# cd /data/install/
# wget --no-check-certificate -O acm.demo1.iso $VAR_ISO
定义好了infra env,我们就能在ACM的web界面上看到啦。
After defining the infra env, we can see it on the ACM web interface.
infra env的详细信息,似乎没什么有用的,就是一些普通的配置。
The details of infra env seem to be of little use, just some common configurations.
在infra env的host配置里面,我们看到,现在还没有一个主机添加进来。
In the host configuration of infra env, we see that no host has been added yet.
add host to infra env
我们接下来要做的,就是给infra env添加主机,从web界面上看,大概有3种添加方法,一个是手动挂载discovery ISO,然后在infra env里面自动发现,一个是通过web界面,配置BMC等参数,来添加host,最后一种,是通过上传yaml配置文件来完成导入host的操作。
The next thing to do is to add hosts to the infra env. From the web interface there are roughly 3 ways to do this: manually mount the discovery ISO and let the infra env auto-discover the host; add the host through the web interface by filling in BMC and other parameters; or import the host by uploading a yaml configuration file.
本文是通过命令行的方式来添加,那么就类似界面上最后一种,通过上传yaml的方式来导入host。
In this article we add the host from the command line, which is similar to the last option in the web interface: importing the host by uploading yaml.
# lets confirm that the metal3 component is ready
# then we can use ocp to manage the baremetal
oc get pod -A | grep metal3
# openshift-machine-api metal3-8666f4cf4d-2bkfb 5/5 Running 5 12h
# openshift-machine-api metal3-image-cache-8jhtr 1/1 Running 1 13h
# openshift-machine-api metal3-image-cache-9jfs7 1/1 Running 1 13h
# openshift-machine-api metal3-image-cache-fl545 1/1 Running 1 13h
# openshift-machine-api metal3-image-customization-868d87999b-x2mnw 1/1 Running 1 13h
cat << EOF > ${BASE_DIR}/data/install/acm.demo.secret.bmc.yaml
apiVersion: v1
kind: Secret
metadata:
name: ${ACM_DEMO_CLUSTER}-bmc-master-01
namespace: ${ACM_DEMO_CLUSTER}
data:
password: $(echo password | base64)
username: $(echo admin | base64)
type: Opaque
EOF
oc create -f ${BASE_DIR}/data/install/acm.demo.secret.bmc.yaml
# oc delete -f ${BASE_DIR}/data/install/acm.demo.secret.bmc.yaml
cat << EOF > ${BASE_DIR}/data/install/acm.demo.bmh.master.yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: ${ACM_DEMO_CLUSTER}-${SNO_HOSTNAME}
namespace: ${ACM_DEMO_CLUSTER}
labels:
infraenvs.agent-install.openshift.io: "${ACM_DEMO_CLUSTER}"
annotations:
## Disable the Introspection
inspect.metal3.io: disabled
## Set Static Hostname
bmac.agent-install.openshift.io/hostname: "${SNO_HOSTNAME}"
## Set Static Role, auto-assign?
bmac.agent-install.openshift.io/role: "worker"
spec:
online: true
bmc:
address: redfish-virtualmedia://192.168.77.101:8000/redfish/v1/Systems/$(cat /data/install/vm.list.* | grep ocp4-ipi-edge-master-01 | awk '{print $1}')
credentialsName: ${ACM_DEMO_CLUSTER}-bmc-master-01
disableCertificateVerification: true
bootMACAddress: $(cat /data/install/mac.list.* | grep ocp4-ipi-edge-master-01 | awk '{print $2}')
automatedCleaningMode: disabled
EOF
oc create -f ${BASE_DIR}/data/install/acm.demo.bmh.master.yaml
# oc delete -f ${BASE_DIR}/data/install/acm.demo.bmh.master.yaml
oc get BareMetalHost/${ACM_DEMO_CLUSTER}-${SNO_HOSTNAME} -n ${ACM_DEMO_CLUSTER} -o yaml
# ......
# metadata:
# annotations:
# bmac.agent-install.openshift.io/hostname: edge-worker-01
# bmac.agent-install.openshift.io/role: worker
# inspect.metal3.io: disabled
# creationTimestamp: "2023-01-18T15:08:22Z"
# finalizers:
# - baremetalhost.metal3.io
# generation: 2
# labels:
# infraenvs.agent-install.openshift.io: edge01
# name: edge01-edge-worker-01
# namespace: edge01
# resourceVersion: "111945"
# uid: b21c5b31-c28c-4b43-b8c1-a0ba80581e60
# spec:
# automatedCleaningMode: disabled
# bmc:
# address: redfish-virtualmedia://192.168.77.101:8000/redfish/v1/Systems/a176e428-fea7-43ff-95c7-a927514227ed
# credentialsName: edge01-bmc-master-01
# disableCertificateVerification: true
# bootMACAddress: 52:54:00:20:a2:01
# image:
# format: live-iso
# url: https://assisted-image-service-multicluster-engine.apps.factory.wzhlab.top/images/2e9fa857-17c6-493f-8030-b4cb2b736fd1?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJpbmZyYV9lbnZfaWQiOiIyZTlmYTg1Ny0xN2M2LTQ5M2YtODAzMC1iNGNiMmI3MzZmZDEifQ.yF_UwtoDKWdc6dYUkcYpNDOWzLt_jVS1ZSqU-SLzZq4QZwt6v7x5Hl8azM3S9THX0xi0K-ert3gqVLbNV62s9Q&arch=x86_64&type=minimal-iso&version=4.11
# online: true
# ......
配置完成以后,在web界面上,就能看到这个主机啦,其实在openshift的界面里面,也能看到这个baremetal,我们看到系统正在试图配置这个主机。
After the configuration is complete, you can see this host on the web interface. In fact, you can also see this baremetal in the openshift interface. We can see that the system is trying to configure this host.
其实在目标kvm上,是启动了一个定制的coreos live cd,启动了以后,运行了一个服务,他会搜集本机的信息,然后上报,上述操作顺利的话,我们就能在界面上看到主机信息更新了。
In fact, a customized coreos live cd is booted on the target kvm. After it starts, it runs a service that collects the machine's information and reports it back; if everything goes well, we can see the host information updated in the web interface.
Each host shown here corresponds to an agent CR in the background, and we can inspect the agent's details from the command line.
oc get agent -n ${ACM_DEMO_CLUSTER}
# NAME CLUSTER APPROVED ROLE STAGE
# a176e428-fea7-43ff-95c7-a927514227ed true worker
oc get agent/a176e428-fea7-43ff-95c7-a927514227ed -n ${ACM_DEMO_CLUSTER} -o yaml
# ......
# metadata:
# annotations:
# inventory.agent-install.openshift.io/version: "0.1"
# creationTimestamp: "2023-01-18T15:11:47Z"
# finalizers:
# - agent.agent-install.openshift.io/ai-deprovision
# generation: 2
# labels:
# agent-install.openshift.io/bmh: edge01-edge-worker-01
# agent-install.openshift.io/clusterdeployment-namespace: ""
# agentclusterinstalls.extensions.hive.openshift.io/location: edge01
# infraenvs.agent-install.openshift.io: edge01
# inventory.agent-install.openshift.io/cpu-architecture: x86_64
# inventory.agent-install.openshift.io/cpu-virtenabled: "false"
# inventory.agent-install.openshift.io/host-isvirtual: "true"
# inventory.agent-install.openshift.io/host-manufacturer: RedHat
# inventory.agent-install.openshift.io/host-productname: KVM
# inventory.agent-install.openshift.io/storage-hasnonrotationaldisk: "false"
# name: a176e428-fea7-43ff-95c7-a927514227ed
# namespace: edge01
# resourceVersion: "117085"
# uid: c410d01b-1bdb-4ade-b5e6-630aadf634b3
# spec:
# approved: true
# hostname: edge-worker-01
# role: worker
# ......
begin to create new cluster - control plane
With everything in place, we can now create a new HyperShift-managed cluster with a hosted control plane.
cat << EOF > ${BASE_DIR}/data/install/capi-role-${ACM_DEMO_CLUSTER}.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: capi-provider-role
namespace: ${ACM_DEMO_CLUSTER}
rules:
- apiGroups:
- agent-install.openshift.io
resources:
- agents
verbs:
- '*'
EOF
oc create --save-config -f ${BASE_DIR}/data/install/capi-role-${ACM_DEMO_CLUSTER}.yaml
# nodepool -> config -> config map -> machine config
# we have a container image cache, so we add the customized config through machine config
cat << EOF > ${BASE_DIR}/data/install/hyper.mirror.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: hyper-mirror-config
namespace: ${ACM_DEMO_CLUSTER}
data:
config: |
$( cat /data/ocp4/99-worker-container-registries.yaml | sed 's/^/ /g' )
---
EOF
oc create -f ${BASE_DIR}/data/install/hyper.mirror.yaml
# oc delete -f ${BASE_DIR}/data/install/hyper.mirror.yaml
cat << EOF > ${BASE_DIR}/data/sno/install.images.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 99-zzz-master-install-images
storage:
files:
- path: /etc/containers/registries.conf.d/base.registries.conf
overwrite: true
contents:
inline: |
unqualified-search-registries = ["registry.access.redhat.com", "docker.io"]
short-name-mode = ""
[[registry]]
prefix = ""
location = "quay.io/openshift-release-dev/ocp-release"
mirror-by-digest-only = true
[[registry.mirror]]
location = "${INSTALL_IMAGE_REGISTRY}/openshift/release-images"
[[registry]]
prefix = ""
location = "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
mirror-by-digest-only = true
[[registry.mirror]]
location = "${INSTALL_IMAGE_REGISTRY}/openshift/release"
EOF
butane ${BASE_DIR}/data/sno/install.images.bu > ${BASE_DIR}/data/sno/disconnected/99-zzz-master-install-images.yaml
cat << EOF > ${BASE_DIR}/data/install/hyper.mirror.main.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: hyper-mirror-main-config
namespace: ${ACM_DEMO_CLUSTER}
data:
config: |
$( cat ${BASE_DIR}/data/sno/disconnected/99-zzz-master-install-images.yaml | sed 's/^/ /g' )
---
EOF
oc create -f ${BASE_DIR}/data/install/hyper.mirror.main.yaml
cat << EOF > ${BASE_DIR}/data/install/hosted-cluster-${ACM_DEMO_CLUSTER}.yaml
---
apiVersion: hypershift.openshift.io/v1alpha1
kind: HostedCluster
metadata:
name: ${ACM_DEMO_CLUSTER}
namespace: ${ACM_DEMO_CLUSTER}
labels:
"cluster.open-cluster-management.io/clusterset": 'default'
spec:
release:
image: ${INSTALL_IMAGE_REGISTRY}/openshift/release-images:4.11.21-x86_64
pullSecret:
name: pullsecret-cluster-${ACM_DEMO_CLUSTER}
sshKey:
name: sshkey-cluster-${ACM_DEMO_CLUSTER}
networking:
podCIDR: 10.132.0.0/14
serviceCIDR: 172.31.0.0/16
machineCIDR: 192.168.12.0/24
networkType: OpenShiftSDN
platform:
type: Agent
agent:
agentNamespace: ${ACM_DEMO_CLUSTER}
infraID: ${ACM_DEMO_CLUSTER}
dns:
baseDomain: '$SNO_BASE_DOMAIN'
services:
- service: APIServer
servicePublishingStrategy:
nodePort:
address: 192.168.12.23
port: 30000
type: NodePort
- service: OAuthServer
servicePublishingStrategy:
type: Route
- service: OIDC
servicePublishingStrategy:
type: Route
- service: Konnectivity
servicePublishingStrategy:
type: Route
- service: Ignition
servicePublishingStrategy:
type: Route
---
apiVersion: v1
kind: Secret
metadata:
name: pullsecret-cluster-${ACM_DEMO_CLUSTER}
namespace: ${ACM_DEMO_CLUSTER}
stringData:
'.dockerconfigjson': '$PULL_SECRET'
type: kubernetes.io/dockerconfigjson
---
apiVersion: v1
kind: Secret
metadata:
name: sshkey-cluster-${ACM_DEMO_CLUSTER}
namespace: ${ACM_DEMO_CLUSTER}
stringData:
id_rsa.pub: '$(< ~/.ssh/id_rsa.pub)'
---
apiVersion: hypershift.openshift.io/v1alpha1
kind: NodePool
metadata:
name: 'nodepool-${ACM_DEMO_CLUSTER}-01'
namespace: ${ACM_DEMO_CLUSTER}
spec:
clusterName: ${ACM_DEMO_CLUSTER}
config:
- name: hyper-mirror-config
- name: hyper-mirror-main-config
replicas: 1
management:
autoRepair: false
upgradeType: InPlace
platform:
type: Agent
agent:
agentLabelSelector:
matchLabels: {}
release:
image: ${INSTALL_IMAGE_REGISTRY}/openshift/release-images:4.11.21-x86_64
---
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
labels:
cloud: hypershift
name: ${ACM_DEMO_CLUSTER}
cluster.open-cluster-management.io/clusterset: 'default'
name: ${ACM_DEMO_CLUSTER}
spec:
hubAcceptsClient: true
---
apiVersion: agent.open-cluster-management.io/v1
kind: KlusterletAddonConfig
metadata:
name: ${ACM_DEMO_CLUSTER}
namespace: ${ACM_DEMO_CLUSTER}
spec:
clusterName: ${ACM_DEMO_CLUSTER}
clusterNamespace: ${ACM_DEMO_CLUSTER}
clusterLabels:
cloud: ai-hypershift
applicationManager:
enabled: true
policyController:
enabled: true
searchCollector:
enabled: true
certPolicyController:
enabled: true
iamPolicyController:
enabled: true
EOF
oc create --save-config -f ${BASE_DIR}/data/install/hosted-cluster-${ACM_DEMO_CLUSTER}.yaml
# oc delete -f ${BASE_DIR}/data/install/hosted-cluster-${ACM_DEMO_CLUSTER}.yaml
oc get HostedCluster -A
# NAMESPACE NAME VERSION KUBECONFIG PROGRESS AVAILABLE PROGRESSING MESSAGE
# edge01 edge01 edge01-admin-kubeconfig Partial True False The hosted control plane is available
oc get HostedCluster/${ACM_DEMO_CLUSTER} -n ${ACM_DEMO_CLUSTER} -o yaml | yq .spec
# autoscaling: {}
# clusterID: 8c0fb18c-22dd-4fb9-a2a6-420ee19d9f8a
# controllerAvailabilityPolicy: SingleReplica
# dns:
# baseDomain: wzhlab.top
# etcd:
# managed:
# storage:
# persistentVolume:
# size: 4Gi
# type: PersistentVolume
# managementType: Managed
# fips: false
# infraID: edge01
# infrastructureAvailabilityPolicy: SingleReplica
# issuerURL: https://kubernetes.default.svc
# networking:
# clusterNetwork:
# - cidr: 10.132.0.0/14
# machineNetwork:
# - cidr: 192.168.12.0/24
# networkType: OVNKubernetes
# serviceNetwork:
# - cidr: 172.31.0.0/16
# olmCatalogPlacement: management
# platform:
# agent:
# agentNamespace: edge01
# type: Agent
# pullSecret:
# name: pullsecret-cluster-edge01
# release:
# image: quaylab.infra.wzhlab.top:8443/openshift/release-images:4.11.21-x86_64
# services:
# - service: APIServer
# servicePublishingStrategy:
# nodePort:
# address: 192.168.12.23
# port: 30000
# type: NodePort
# - service: OAuthServer
# servicePublishingStrategy:
# type: Route
# - service: OIDC
# servicePublishingStrategy:
# type: Route
# - service: Konnectivity
# servicePublishingStrategy:
# type: Route
# - service: Ignition
# servicePublishingStrategy:
# type: Route
# sshKey:
# name: sshkey-cluster-edge01
oc get clusterdeployment -A
# NAMESPACE NAME INFRAID PLATFORM REGION VERSION CLUSTERTYPE PROVISIONSTATUS POWERSTATE AGE
# edge01-edge01 edge01 39d863f0-57f8-4ff4-a2b5-61e3e654c4db agent-baremetal 4.11.21 Provisioned 122m
oc get clusterdeployment/${ACM_DEMO_CLUSTER} -n ${ACM_DEMO_CLUSTER}-${ACM_DEMO_CLUSTER} -o yaml | yq .spec
# baseDomain: wzhlab.top
# clusterInstallRef:
# group: extensions.hive.openshift.io
# kind: AgentClusterInstall
# name: edge01
# version: v1beta1
# clusterMetadata:
# adminKubeconfigSecretRef:
# name: admin-kubeconfig
# clusterID: 28c54029-b032-4c48-8486-deb1dabe8ea8
# infraID: 28c54029-b032-4c48-8486-deb1dabe8ea8
# clusterName: edge01
# controlPlaneConfig:
# servingCertificates: {}
# installed: true
# platform:
# agentBareMetal:
# agentSelector: {}
# pullSecretRef:
# name: pull-secret
oc get AgentClusterInstall -A
# NAMESPACE NAME CLUSTER STATE
# edge01-edge01 edge01 edge01 adding-hosts
oc get AgentClusterInstall/${ACM_DEMO_CLUSTER} -n ${ACM_DEMO_CLUSTER}-${ACM_DEMO_CLUSTER} -o yaml | yq .spec
# clusterDeploymentRef:
# name: edge01
# ignitionEndpoint:
# caCertificateReference:
# name: ignition-server-ca-cert
# namespace: edge01-edge01
# url: https://ignition-server-edge01-edge01.apps.factory.wzhlab.top
# networking:
# userManagedNetworking: true
# provisionRequirements:
# controlPlaneAgents: 3
oc get agent -n ${ACM_DEMO_CLUSTER}
# NAME CLUSTER APPROVED ROLE STAGE
# a176e428-fea7-43ff-95c7-a927514227ed true worker
oc get agent/a176e428-fea7-43ff-95c7-a927514227ed -n ${ACM_DEMO_CLUSTER} -o yaml | yq .spec
# approved: true
# clusterDeploymentName:
# name: edge01
# namespace: edge01-edge01
# hostname: edge-worker-01
# ignitionEndpointTokenReference:
# name: agent-user-data-nodepool-edge01-01-e3fdfbf8
# namespace: edge01-edge01
# machineConfigPool: ignition
# role: worker
# wait here, and check the control plane creation.
oc get pod -n ${ACM_DEMO_CLUSTER}-${ACM_DEMO_CLUSTER}
# NAME READY STATUS RESTARTS AGE
# capi-provider-87b88465c-zgrx2 1/1 Running 0 10m
# catalog-operator-7dcf86576f-vffl6 2/2 Running 0 7m33s
# certified-operators-catalog-7b4bdcb679-25gls 1/1 Running 0 7m39s
# cluster-api-5984dc678b-46ms7 1/1 Running 0 10m
# cluster-autoscaler-5cd6b96d55-nzw4x 1/1 Running 0 9m33s
# cluster-network-operator-547f6988f4-6q2f2 1/1 Running 0 7m49s
# cluster-policy-controller-857bf8594f-9dhhj 1/1 Running 0 7m56s
# cluster-version-operator-85f5fd968f-rhchm 1/1 Running 0 7m55s
# community-operators-catalog-f6d797bc-87f9k 1/1 Running 0 7m38s
# control-plane-operator-65444fdff8-fzhvb 1/1 Running 0 10m
# etcd-0 1/1 Running 0 9m36s
# hosted-cluster-config-operator-cb8bd76f7-wvtfl 1/1 Running 0 7m41s
# ignition-server-57fbf98b8b-wvkv2 1/1 Running 0 9m26s
# ingress-operator-594bdd5d6d-2t6kw 2/2 Running 0 7m46s
# konnectivity-agent-67bd878b88-bwxcp 1/1 Running 0 9m35s
# konnectivity-server-764ffdb8fd-xgxqq 1/1 Running 0 9m36s
# kube-apiserver-7f85bd5d7f-cvd7r 3/3 Running 0 9m34s
# kube-controller-manager-7bd7ff884f-2c4jr 1/1 Running 0 6m35s
# kube-scheduler-68858b678d-jlpmx 1/1 Running 0 8m30s
# machine-approver-c6b6f6ff8-jh445 1/1 Running 0 9m33s
# oauth-openshift-5bb59d5596-55mtw 2/2 Running 0 6m15s
# olm-operator-949f6f76b-r8kkz 2/2 Running 0 7m32s
# openshift-apiserver-5ddbbd9847-n2824 2/2 Running 0 6m35s
# openshift-controller-manager-7cdd5bcc7b-p7kfb 1/1 Running 0 7m56s
# openshift-oauth-apiserver-8c76cb9b9-t9nts 1/1 Running 0 7m58s
# packageserver-58d5b997b9-wdn58 2/2 Running 0 7m32s
# redhat-marketplace-catalog-85748dc79-tl8sr 1/1 Running 0 7m38s
# redhat-operators-catalog-74849cb9d6-9bg49 1/1 Running 0 7m38s
oc get pod -n ${ACM_DEMO_CLUSTER}-${ACM_DEMO_CLUSTER} | tail -n +2 | wc -l
# 28
After the configuration is imported, we can see an additional cluster, edge01, whose type is hosted.
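The same can be confirmed from the hub CLI (a small sketch; the exact output depends on your environment):
# the managed clusters known to the hub; edge01 should show up next to local-cluster
oc get managedcluster
# and the hosted cluster object we created above
oc get hostedcluster -n ${ACM_DEMO_CLUSTER}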
The installation takes a little while; during this period we can watch the cluster status and the nodepool status change.
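If you want to follow the progress without the web console, these are the kind of objects you could poll (a sketch, using the same variables as above):
# the nodepool shows desired vs. current node counts for the hosted cluster
oc get nodepool -n ${ACM_DEMO_CLUSTER}
# the agent CR reports the install stage of the node being provisioned
oc get agent -n ${ACM_DEMO_CLUSTER}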
We can also see an edge01-edge01 namespace on the hub cluster, which contains the pods of the hosted control plane, including the familiar etcd and kube-apiserver.
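For example, the familiar control-plane pods can be picked out directly (same namespace naming as above):
# only show the etcd and kube-apiserver pods of the hosted control plane
oc get pod -n ${ACM_DEMO_CLUSTER}-${ACM_DEMO_CLUSTER} | grep -E 'etcd|kube-apiserver'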
import the hosted cluster
After a while the new cluster finishes installing, but the console indicates that it needs to be imported manually. We copy the commands from the page and run them on the helper node; they log in to the hosted control plane and then create a few CRs/CRDs in it.
# on helper
# copy-paste the 1st command
oc login https://192.168.12.23:30000 -u kubeadmin -p z2I9i-BZF8L-sYvUC-47c7x
# copy-paste the 2nd command
# it is too large, so we omit most of it
echo "Ci0tLQphc............" | base64 -d | oc create -f - || test $? -eq 0 && sleep 2 && echo "Ci0tLQphcGlWZ............" | base64 -d | oc apply -f - || echo "VGhlIGNsdXN..............." | base64 -d
# namespace/open-cluster-management-agent created
# serviceaccount/klusterlet created
# clusterrole.rbac.authorization.k8s.io/klusterlet created
# clusterrole.rbac.authorization.k8s.io/open-cluster-management:klusterlet-admin-aggregate-clusterrole created
# clusterrolebinding.rbac.authorization.k8s.io/klusterlet created
# Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "klusterlet" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "klusterlet" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "klusterlet" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "klusterlet" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
# deployment.apps/klusterlet created
# secret/bootstrap-hub-kubeconfig created
# klusterlet.operator.open-cluster-management.io/klusterlet created
# let's decode the first 2 base64 payloads; the 3rd one is just a message.
We are curious what exactly is being imported, so let's decode it. The first YAML applied to the hosted control plane is a CRD.
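A minimal sketch of how one might dump those payloads to files for review instead of piping them straight into oc create (the "............" placeholders stand for the full base64 strings copied from the console):
# decode each chunk to a file and read it before applying anything
echo "Ci0tLQphc............"     | base64 -d > /tmp/import-1-crd.yaml
echo "Ci0tLQphcGlWZ............" | base64 -d > /tmp/import-2-klusterlet.yaml
less /tmp/import-1-crd.yaml /tmp/import-2-klusterlet.yaml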
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: klusterlets.operator.open-cluster-management.io
spec:
conversion:
strategy: None
group: operator.open-cluster-management.io
names:
kind: Klusterlet
listKind: KlusterletList
plural: klusterlets
singular: klusterlet
scope: Cluster
preserveUnknownFields: false
versions:
- name: v1
schema:
openAPIV3Schema:
description: Klusterlet represents controllers to install the resources for a managed cluster. When configured, the Klusterlet requires a secret named bootstrap-hub-kubeconfig in the agent namespace to allow API requests to the hub for the registration protocol. In Hosted mode, the Klusterlet requires an additional secret named external-managed-kubeconfig in the agent namespace to allow API requests to the managed cluster for resources installation.
type: object
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
spec:
description: Spec represents the desired deployment configuration of Klusterlet agent.
type: object
properties:
clusterName:
description: ClusterName is the name of the managed cluster to be created on hub. The Klusterlet agent generates a random name if it is not set, or discovers the appropriate cluster name on OpenShift.
type: string
deployOption:
description: DeployOption contains the options of deploying a klusterlet
type: object
properties:
mode:
description: 'Mode can be Default or Hosted. It is Default mode if not specified In Default mode, all klusterlet related resources are deployed on the managed cluster. In Hosted mode, only crd and configurations are installed on the spoke/managed cluster. Controllers run in another cluster (defined as management-cluster) and connect to the mangaged cluster with the kubeconfig in secret of "external-managed-kubeconfig"(a kubeconfig of managed-cluster with cluster-admin permission). Note: Do not modify the Mode field once it''s applied.'
type: string
externalServerURLs:
description: ExternalServerURLs represents the a list of apiserver urls and ca bundles that is accessible externally If it is set empty, managed cluster has no externally accessible url that hub cluster can visit.
type: array
items:
description: ServerURL represents the apiserver url and ca bundle that is accessible externally
type: object
properties:
caBundle:
description: CABundle is the ca bundle to connect to apiserver of the managed cluster. System certs are used if it is not set.
type: string
format: byte
url:
description: URL is the url of apiserver endpoint of the managed cluster.
type: string
namespace:
description: 'Namespace is the namespace to deploy the agent. The namespace must have a prefix of "open-cluster-management-", and if it is not set, the namespace of "open-cluster-management-agent" is used to deploy agent. Note: in Detach mode, this field will be **ignored**, the agent will be deployed to the namespace with the same name as klusterlet.'
type: string
nodePlacement:
description: NodePlacement enables explicit control over the scheduling of the deployed pods.
type: object
properties:
nodeSelector:
description: NodeSelector defines which Nodes the Pods are scheduled on. The default is an empty list.
type: object
additionalProperties:
type: string
tolerations:
description: Tolerations is attached by pods to tolerate any taint that matches the triple <key,value,effect> using the matching operator <operator>. The default is an empty list.
type: array
items:
description: The pod this Toleration is attached to tolerates any taint that matches the triple <key,value,effect> using the matching operator <operator>.
type: object
properties:
effect:
description: Effect indicates the taint effect to match. Empty means match all taint effects. When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute.
type: string
key:
description: Key is the taint key that the toleration applies to. Empty means match all taint keys. If the key is empty, operator must be Exists; this combination means to match all values and all keys.
type: string
operator:
description: Operator represents a key's relationship to the value. Valid operators are Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for value, so that a pod can tolerate all taints of a particular category.
type: string
tolerationSeconds:
description: TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default, it is not set, which means tolerate the taint forever (do not evict). Zero and negative values will be treated as 0 (evict immediately) by the system.
type: integer
format: int64
value:
description: Value is the taint value the toleration matches to. If the operator is Exists, the value should be empty, otherwise just a regular string.
type: string
registrationImagePullSpec:
description: RegistrationImagePullSpec represents the desired image configuration of registration agent. quay.io/open-cluster-management.io/registration:latest will be used if unspecified.
type: string
workImagePullSpec:
description: WorkImagePullSpec represents the desired image configuration of work agent. quay.io/open-cluster-management.io/work:latest will be used if unspecified.
type: string
status:
description: Status represents the current status of Klusterlet agent.
type: object
properties:
conditions:
description: 'Conditions contain the different condition statuses for this Klusterlet. Valid condition types are: Applied: Components have been applied in the managed cluster. Available: Components in the managed cluster are available and ready to serve. Progressing: Components in the managed cluster are in a transitioning state. Degraded: Components in the managed cluster do not match the desired configuration and only provide degraded service.'
type: array
items:
description: "Condition contains details for one aspect of the current state of this API Resource. --- This struct is intended for direct use as an array at the field path .status.conditions. For example, type FooStatus struct{ // Represents the observations of a foo's current state. // Known .status.conditions.type are: \"Available\", \"Progressing\", and \"Degraded\" // +patchMergeKey=type // +patchStrategy=merge // +listType=map // +listMapKey=type Conditions []metav1.Condition `json:\"conditions,omitempty\" patchStrategy:\"merge\" patchMergeKey:\"type\" protobuf:\"bytes,1,rep,name=conditions\"` \n // other fields }"
type: object
required:
- lastTransitionTime
- message
- reason
- status
- type
properties:
lastTransitionTime:
description: lastTransitionTime is the last time the condition transitioned from one status to another. This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
type: string
format: date-time
message:
description: message is a human readable message indicating details about the transition. This may be an empty string.
type: string
maxLength: 32768
observedGeneration:
description: observedGeneration represents the .metadata.generation that the condition was set based upon. For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date with respect to the current state of the instance.
type: integer
format: int64
minimum: 0
reason:
description: reason contains a programmatic identifier indicating the reason for the condition's last transition. Producers of specific condition types may define expected values and meanings for this field, and whether the values are considered a guaranteed API. The value should be a CamelCase string. This field may not be empty.
type: string
maxLength: 1024
minLength: 1
pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
status:
description: status of the condition, one of True, False, Unknown.
type: string
enum:
- "True"
- "False"
- Unknown
type:
description: type of condition in CamelCase or in foo.example.com/CamelCase. --- Many .condition.type values are consistent across resources like Available, but because arbitrary conditions can be useful (see .node.status.conditions), the ability to deconflict is important. The regex it matches is (dns1123SubdomainFmt/)?(qualifiedNameFmt)
type: string
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
generations:
description: Generations are used to determine when an item needs to be reconciled or has changed in a way that needs a reaction.
type: array
items:
description: GenerationStatus keeps track of the generation for a given resource so that decisions about forced updates can be made. The definition matches the GenerationStatus defined in github.com/openshift/api/v1
type: object
properties:
group:
description: group is the group of the resource that you're tracking
type: string
lastGeneration:
description: lastGeneration is the last generation of the resource that controller applies
type: integer
format: int64
name:
description: name is the name of the resource that you're tracking
type: string
namespace:
description: namespace is where the resource that you're tracking is
type: string
resource:
description: resource is the resource type of the resource that you're tracking
type: string
version:
description: version is the version of the resource that you're tracking
type: string
observedGeneration:
description: ObservedGeneration is the last generation change you've dealt with
type: integer
format: int64
relatedResources:
description: RelatedResources are used to track the resources that are related to this Klusterlet.
type: array
items:
description: RelatedResourceMeta represents the resource that is managed by an operator
type: object
properties:
group:
description: group is the group of the resource that you're tracking
type: string
name:
description: name is the name of the resource that you're tracking
type: string
namespace:
description: namespace is where the thing you're tracking is
type: string
resource:
description: resource is the resource type of the resource that you're tracking
type: string
version:
description: version is the version of the thing you're tracking
type: string
served: true
storage: true
subresources:
status: {}
status:
acceptedNames:
kind: ""
plural: ""
conditions: []
storedVersions: []
The second YAML looks like this: it creates a new namespace and then deploys a klusterlet deployment plus its configuration. What exactly the klusterlet does, the author cannot say for now.
---
apiVersion: v1
kind: Namespace
metadata:
annotations:
workload.openshift.io/allowed: "management"
name: "open-cluster-management-agent"
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: klusterlet
namespace: "open-cluster-management-agent"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: klusterlet
rules:
- apiGroups: [""]
resources: ["secrets", "configmaps", "serviceaccounts"]
verbs: ["create", "get", "list", "update", "watch", "patch", "delete"]
- apiGroups: ["coordination.k8s.io"]
resources: ["leases"]
verbs: ["create", "get", "list", "update", "watch", "patch"]
- apiGroups: ["authorization.k8s.io"]
resources: ["subjectaccessreviews"]
verbs: ["create"]
- apiGroups: [""]
resources: ["namespaces"]
verbs: ["create", "get", "list", "watch","delete"]
- apiGroups: [""]
resources: ["nodes"]
verbs: ["get", "list", "watch"]
- apiGroups: ["", "events.k8s.io"]
resources: ["events"]
verbs: ["create", "patch", "update"]
- apiGroups: ["apps"]
resources: ["deployments"]
verbs: ["create", "get", "list", "update", "watch", "patch", "delete"]
- apiGroups: ["rbac.authorization.k8s.io"]
resources: ["clusterrolebindings", "rolebindings"]
verbs: ["create", "get", "list", "update", "watch", "patch", "delete"]
- apiGroups: ["rbac.authorization.k8s.io"]
resources: ["clusterroles", "roles"]
verbs: ["create", "get", "list", "update", "watch", "patch", "delete", "escalate", "bind"]
- apiGroups: ["apiextensions.k8s.io"]
resources: ["customresourcedefinitions"]
verbs: ["create", "get", "list", "update", "watch", "patch", "delete"]
- apiGroups: ["operator.open-cluster-management.io"]
resources: ["klusterlets"]
verbs: ["get", "list", "watch", "update", "patch", "delete"]
- apiGroups: ["operator.open-cluster-management.io"]
resources: ["klusterlets/status"]
verbs: ["update", "patch"]
- apiGroups: ["work.open-cluster-management.io"]
resources: ["appliedmanifestworks"]
verbs: ["list", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: open-cluster-management:klusterlet-admin-aggregate-clusterrole
labels:
rbac.authorization.k8s.io/aggregate-to-admin: "true"
rules:
- apiGroups: ["operator.open-cluster-management.io"]
resources: ["klusterlets"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: klusterlet
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: klusterlet
subjects:
- kind: ServiceAccount
name: klusterlet
namespace: "open-cluster-management-agent"
---
kind: Deployment
apiVersion: apps/v1
metadata:
name: klusterlet
namespace: "open-cluster-management-agent"
labels:
app: klusterlet
spec:
replicas: 1
selector:
matchLabels:
app: klusterlet
template:
metadata:
annotations:
target.workload.openshift.io/management: '{"effect": "PreferredDuringScheduling"}'
labels:
app: klusterlet
spec:
serviceAccountName: klusterlet
tolerations:
- key: "node-role.kubernetes.io/infra"
value: ""
effect: "NoSchedule"
operator: "Exists"
containers:
- name: klusterlet
image: registry.redhat.io/multicluster-engine/registration-operator-rhel8@sha256:183dc28f1991ad2aa2fcb987d217fc63863909497ae9291b14a96079640463d3
imagePullPolicy: IfNotPresent
args:
- "/registration-operator"
- "klusterlet"
- "--disable-leader-election"
livenessProbe:
httpGet:
path: /healthz
scheme: HTTPS
port: 8443
initialDelaySeconds: 2
periodSeconds: 10
readinessProbe:
httpGet:
path: /healthz
scheme: HTTPS
port: 8443
initialDelaySeconds: 2
---
apiVersion: v1
kind: Secret
metadata:
name: "bootstrap-hub-kubeconfig"
namespace: "open-cluster-management-agent"
type: Opaque
data:
kubeconfig: "YXBpVmVyc2............"
---
apiVersion: operator.open-cluster-management.io/v1
kind: Klusterlet
metadata:
name: klusterlet
spec:
deployOption:
mode: Default
registrationImagePullSpec: "registry.redhat.io/multicluster-engine/registration-rhel8@sha256:52efbbbd9deef8517ea2c96b1d4756c154ebf342a6331603c6942cf0a64ee133"
workImagePullSpec: "registry.redhat.io/multicluster-engine/work-rhel8@sha256:3e1a592361dc8176dae1eb5d2bc82bd3aabb6e370add47ae84325ddeb00d661c"
clusterName: "edge01"
namespace: "open-cluster-management-agent"
nodePlacement:
tolerations:
- key: "node-role.kubernetes.io/infra"
value: ""
effect: "NoSchedule"
operator: "Exists"
After importing the configuration, we can see that the cluster was imported successfully.
The cluster set page also shows everything as healthy.
The cluster details page shows everything as healthy as well.
The host page of the new cluster now also shows a new worker node.
The add-ons page of the cluster details also shows everything as healthy.
Let's log in to the management console of the newly installed edge01 cluster and take a look.
The new edge01 cluster cannot upgrade itself; the console notes that this is a special hosted cluster.
Recall that in the ACM console, edge01 is of type hosted.
Let's take a brief look at the resource consumption of this hosted control plane.
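From the CLI, one way to get a rough number (a sketch; it assumes cluster metrics are working on the hub) is:
# per-pod cpu/memory usage of the hosted control-plane namespace
oc adm top pod -n ${ACM_DEMO_CLUSTER}-${ACM_DEMO_CLUSTER}
# per-node view of the hub, to see where the control-plane pods land
oc adm top node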
Let's also take a look at what pods run in this control plane.
cli login to the hosted cluster
Next, we log in to the new edge01 cluster from the command line and see what is special about this cluster there.
oc extract -n ${ACM_DEMO_CLUSTER} secret/${ACM_DEMO_CLUSTER}-admin-kubeconfig --to=- > ${BASE_DIR}/data/install/kubeconfig-${ACM_DEMO_CLUSTER}
# approve the worker node, if the node can't import
# under normal circumstances, this is not needed.
oc --kubeconfig=${BASE_DIR}/data/install/kubeconfig-${ACM_DEMO_CLUSTER} get csr | grep -v Approved
oc --kubeconfig=${BASE_DIR}/data/install/kubeconfig-${ACM_DEMO_CLUSTER} get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc --kubeconfig=${BASE_DIR}/data/install/kubeconfig-${ACM_DEMO_CLUSTER} adm certificate approve
oc --kubeconfig=${BASE_DIR}/data/install/kubeconfig-${ACM_DEMO_CLUSTER} get co
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# console 4.11.21 True False False 6h22m
# csi-snapshot-controller 4.11.21 True False False 6h24m
# dns 4.11.21 True False False 6h23m
# image-registry 4.11.21 True False False 6h23m
# ingress 4.11.21 True False False 6h39m
# insights 4.11.21 True False False 6h25m
# kube-apiserver 4.11.21 True False False 6h40m
# kube-controller-manager 4.11.21 True False False 6h40m
# kube-scheduler 4.11.21 True False False 6h40m
# kube-storage-version-migrator 4.11.21 True False False 6h24m
# monitoring 4.11.21 True False False 6h20m
# network 4.11.21 True False False 6h24m
# openshift-apiserver 4.11.21 True False False 6h40m
# openshift-controller-manager 4.11.21 True False False 6h40m
# openshift-samples 4.11.21 True False False 6h23m
# operator-lifecycle-manager 4.11.21 True False False 6h40m
# operator-lifecycle-manager-catalog 4.11.21 True False False 6h40m
# operator-lifecycle-manager-packageserver 4.11.21 True False False 6h40m
# service-ca 4.11.21 True False False 6h25m
# storage 4.11.21 True False False 6h25m
oc --kubeconfig=${BASE_DIR}/data/install/kubeconfig-${ACM_DEMO_CLUSTER} get node
# NAME STATUS ROLES AGE VERSION
# edge-worker-01 Ready worker 17h v1.24.6+5658434
oc --kubeconfig=${BASE_DIR}/data/install/kubeconfig-${ACM_DEMO_CLUSTER} get mcp
# error: the server doesn't have a resource type "mcp"
oc --kubeconfig=${BASE_DIR}/data/install/kubeconfig-${ACM_DEMO_CLUSTER} get mc
# error: the server doesn't have a resource type "mc"
oc --kubeconfig=${BASE_DIR}/data/install/kubeconfig-${ACM_DEMO_CLUSTER} get all -o wide -n openshift-ingress
# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
# pod/router-default-bb569f544-cknjw 1/1 Running 0 6h41m 192.168.12.33 edge-master-01 <none> <none>
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
# service/router-internal-default ClusterIP 172.31.152.115 <none> 80/TCP,443/TCP,1936/TCP 6h41m ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default
# NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
# deployment.apps/router-default 1/1 1 1 6h41m router quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e0dc935b7825a800e32eac69fafa2d238e1d6eb2f344cdf29345cb1123c26a22 ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default
# NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
# replicaset.apps/router-default-bb569f544 1 1 1 6h41m router quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e0dc935b7825a800e32eac69fafa2d238e1d6eb2f344cdf29345cb1123c26a22 ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default,pod-template-hash=bb569f544
oc --kubeconfig=${BASE_DIR}/data/install/kubeconfig-${ACM_DEMO_CLUSTER} get pod -A | wc -l
# 56
oc --kubeconfig=${BASE_DIR}/data/install/kubeconfig-${ACM_DEMO_CLUSTER} get clusterversion
# NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
# version 4.11.21 True False 6h35m Cluster version is 4.11.21
post operation
With the installation done, we make a few configuration changes to the cluster nodes for the convenience of experiments. This lowers the security of the cluster, but for a lab environment it does not matter.
# on helper
# VAR_CLUSTER=edge01
# oc get secret/$VAR_CLUSTER-keypair -n $VAR_CLUSTER --template='{{index .data "id_rsa.key" | base64decode}}' > ${BASE_DIR}/data/install/edge.key
# chmod 600 ${BASE_DIR}/data/install/edge.key
# ssh -i ${BASE_DIR}/data/install/edge.key core@192.168.12.33
cat > ${BASE_DIR}/data/install/crack.txt << EOF
echo redhat | sudo passwd --stdin root
sudo sed -i "s|^PasswordAuthentication no$|PasswordAuthentication yes|g" /etc/ssh/sshd_config
sudo sed -i "s|^PermitRootLogin no$|PermitRootLogin yes|g" /etc/ssh/sshd_config
sudo sed -i "s|^#ClientAliveInterval 180$|ClientAliveInterval 1800|g" /etc/ssh/sshd_config
sudo systemctl restart sshd
sudo sh -c 'echo "export KUBECONFIG=/etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/node-kubeconfigs/localhost.kubeconfig" >> /root/.bashrc'
sudo sh -c 'echo "RET=\\\`oc config use-context system:admin\\\`" >> /root/.bashrc'
EOF
for i in 33
do
ssh core@192.168.12.$i < ${BASE_DIR}/data/install/crack.txt
done
for i in 33
do
sshpass -p 'redhat' ssh-copy-id root@192.168.12.$i
done
ssh root@192.168.12.33
end
openshift 4.12 UPI in agent way, 3 node.
There are already many ways to install OpenShift, and now there is one more: the agent-based installer. Its biggest selling point is that no extra bootstrap node is needed. This is great news, because in the past, when talking with customers before an installation, they never understood why Red Hat claims to support a 3-node deployment but asks for 4 servers. You can't really blame them: by the usual understanding, Red Hat did not support a strict 3-node deployment before, precisely because of that bootstrap node. Now, with the agent-based installer, a 3-node deployment is finally supported in the everyday sense.
According to the official documentation, the bootstrap node can be eliminated because the bootstrap-related services are squeezed onto one of the master nodes, and the assisted-installer flow is used, achieving a true 3-node install.
- https://docs.openshift.com/container-platform/4.12/installing/installing_with_agent_based_installer/preparing-to-install-with-agent-based-installer.html
In this article, we use the agent-based installer to install a 3-node OCP cluster. Unlike a single-node cluster, a 3-node cluster needs VIPs configured to carry the API server and the ingress.
on helper node
# switch to your install version
export BUILDNUMBER=4.12.9
pushd /data/ocp4/${BUILDNUMBER}
tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
# tar -xzf oc-mirror.tar.gz -C /usr/local/bin/
# chmod +x /usr/local/bin/oc-mirror
install -m 755 /data/ocp4/clients/butane-amd64 /usr/local/bin/butane
install -m 755 /data/ocp4/clients/coreos-installer_amd64 /usr/local/bin/coreos-installer
popd
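Optionally, verify that the extracted clients are the expected version (a quick sanity check, not required by the flow):
oc version --client
openshift-install version
butane --version
coreos-installer --version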
# create a user and create the cluster under the user
useradd -m 3node
su - 3node
ssh-keygen
cat << EOF > ~/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
chmod 600 ~/.ssh/config
cat << 'EOF' >> ~/.bashrc
export BASE_DIR='/home/3node/'
EOF
# export BASE_DIR='/home/3node/'
export BUILDNUMBER=4.12.9
mkdir -p ${BASE_DIR}/data/{sno/disconnected,install}
# set some parameters of your cluster
NODE_SSH_KEY="$(cat ${BASE_DIR}/.ssh/id_rsa.pub)"
INSTALL_IMAGE_REGISTRY=quaylab.infra.wzhlab.top:5443
# PULL_SECRET='{"auths":{"registry.redhat.io": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"'${INSTALL_IMAGE_REGISTRY}'": {"auth": "'$( echo -n 'admin:shadowman' | openssl base64 )'","email": "noemail@localhost"}}}'
PULL_SECRET=$(cat /data/pull-secret.json)
NTP_SERVER=192.168.77.11
# HELP_SERVER=192.168.7.11
# KVM_HOST=192.168.7.11
API_VIP=192.168.77.99
INGRESS_VIP=192.168.77.98
# CLUSTER_PROVISION_IP=192.168.7.103
# BOOTSTRAP_IP=192.168.7.12
# define the node information for the cluster
SNO_CLUSTER_NAME=osp-demo
SNO_BASE_DOMAIN=wzhlab.top
BOOTSTRAP_IP=192.168.77.42
MASTER_01_IP=192.168.77.43
MASTER_02_IP=192.168.77.44
MASTER_03_IP=192.168.77.45
BOOTSTRAP_IPv6=fd03::42
MASTER_01_IPv6=fd03::43
MASTER_02_IPv6=fd03::44
MASTER_03_IPv6=fd03::45
BOOTSTRAP_HOSTNAME=bootstrap-demo
MASTER_01_HOSTNAME=master-01-demo
MASTER_02_HOSTNAME=master-02-demo
MASTER_03_HOSTNAME=master-03-demo
BOOTSTRAP_INTERFACE=enp1s0
MASTER_01_INTERFACE=enp1s0
MASTER_02_INTERFACE=enp1s0
MASTER_03_INTERFACE=enp1s0
MASTER_01_INTERFACE_MAC=52:54:00:12:A1:01
MASTER_02_INTERFACE_MAC=52:54:00:12:A1:02
MASTER_03_INTERFACE_MAC=52:54:00:12:A1:03
BOOTSTRAP_DISK=/dev/vda
MASTER_01_DISK=/dev/vda
MASTER_02_DISK=/dev/vda
MASTER_03_DISK=/dev/vda
OCP_GW=192.168.77.11
OCP_NETMASK=255.255.255.0
OCP_NETMASK_S=24
OCP_DNS=192.168.77.11
OCP_GW_v6=fd03::11
OCP_NETMASK_v6=64
# echo ${SNO_IF_MAC} > /data/sno/sno.mac
mkdir -p ${BASE_DIR}/data/install
cd ${BASE_DIR}/data/install
/bin/rm -rf *.ign .openshift_install_state.json auth bootstrap manifests master*[0-9] worker*[0-9] *
cat << EOF > ${BASE_DIR}/data/install/install-config.yaml
apiVersion: v1
baseDomain: $SNO_BASE_DOMAIN
compute:
- name: worker
replicas: 0
controlPlane:
name: master
replicas: 3
metadata:
name: $SNO_CLUSTER_NAME
networking:
# OVNKubernetes , OpenShiftSDN
clusterNetwork:
- cidr: 172.21.0.0/16
hostPrefix: 23
# - cidr: fd02::/48
# hostPrefix: 64
machineNetwork:
- cidr: 192.168.77.0/24
# - cidr: 2001:DB8::/32
serviceNetwork:
- 172.22.0.0/16
# - fd03::/112
platform:
baremetal:
apiVIPs:
- $API_VIP
# - 2001:DB8::4
ingressVIPs:
- $INGRESS_VIP
# - 2001:DB8::5
pullSecret: '${PULL_SECRET}'
sshKey: |
$( cat ${BASE_DIR}/.ssh/id_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /etc/crts/redhat.ren.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF
cat << EOF > ${BASE_DIR}/data/install/agent-config.yaml
apiVersion: v1alpha1
kind: AgentConfig
metadata:
name: $SNO_CLUSTER_NAME
rendezvousIP: $MASTER_01_IP
additionalNTPSources:
- $NTP_SERVER
hosts:
- hostname: $MASTER_01_HOSTNAME
role: master
rootDeviceHints:
deviceName: "$MASTER_01_DISK"
interfaces:
- name: $MASTER_01_INTERFACE
macAddress: $MASTER_01_INTERFACE_MAC
networkConfig:
interfaces:
- name: $MASTER_01_INTERFACE
type: ethernet
state: up
mac-address: $MASTER_01_INTERFACE_MAC
ipv4:
enabled: true
address:
- ip: $MASTER_01_IP
prefix-length: $OCP_NETMASK_S
dhcp: false
dns-resolver:
config:
server:
- $OCP_DNS
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: $OCP_GW
next-hop-interface: $MASTER_01_INTERFACE
table-id: 254
- hostname: $MASTER_02_HOSTNAME
role: master
rootDeviceHints:
deviceName: "$MASTER_02_DISK"
interfaces:
- name: $MASTER_02_INTERFACE
macAddress: $MASTER_02_INTERFACE_MAC
networkConfig:
interfaces:
- name: $MASTER_02_INTERFACE
type: ethernet
state: up
mac-address: $MASTER_02_INTERFACE_MAC
ipv4:
enabled: true
address:
- ip: $MASTER_02_IP
prefix-length: $OCP_NETMASK_S
dhcp: false
dns-resolver:
config:
server:
- $OCP_DNS
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: $OCP_GW
next-hop-interface: $MASTER_02_INTERFACE
table-id: 254
- hostname: $MASTER_03_HOSTNAME
role: master
rootDeviceHints:
deviceName: "$MASTER_03_DISK"
interfaces:
- name: $MASTER_03_INTERFACE
macAddress: $MASTER_03_INTERFACE_MAC
networkConfig:
interfaces:
- name: $MASTER_03_INTERFACE
type: ethernet
state: up
mac-address: $MASTER_03_INTERFACE_MAC
ipv4:
enabled: true
address:
- ip: $MASTER_03_IP
prefix-length: $OCP_NETMASK_S
dhcp: false
dns-resolver:
config:
server:
- $OCP_DNS
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: $OCP_GW
next-hop-interface: $MASTER_03_INTERFACE
table-id: 254
EOF
/bin/cp -f ${BASE_DIR}/data/install/install-config.yaml ${BASE_DIR}/data/install/install-config.yaml.bak
openshift-install --dir=${BASE_DIR}/data/install agent create cluster-manifests
sudo bash -c "/bin/cp -f mirror/registries.conf /etc/containers/registries.conf.d/; chmod +r /etc/containers/registries.conf.d/*"
# /bin/cp -f /data/ocp4/ansible-helper/files/* ${BASE_DIR}/data/install/openshift/
sudo bash -c "cd /data/ocp4 ; bash image.registries.conf.sh quaylab.infra.wzhlab.top:5443 ;"
/bin/cp -f /data/ocp4/99-worker-container-registries.yaml ${BASE_DIR}/data/install/openshift
/bin/cp -f /data/ocp4/99-master-container-registries.yaml ${BASE_DIR}/data/install/openshift
cd ${BASE_DIR}/data/install/
# openshift-install --dir=${BASE_DIR}/data/install create ignition-configs
mkdir -p ~/.cache/agent/image_cache/
/bin/cp -f /data/ocp-$BUILDNUMBER/rhcos-live.x86_64.iso ~/.cache/agent/image_cache/coreos-x86_64.iso
openshift-install --dir=${BASE_DIR}/data/install agent create image --log-level=debug
# ......
# DEBUG Fetching image from OCP release (oc adm release info --image-for=machine-os-images --insecure=true --icsp-file=/tmp/icsp-file3636774741 quay.io/openshift-release-dev/ocp-release@sha256:96bf74ce789ccb22391deea98e0c5050c41b67cc17defbb38089d32226dba0b8)
# DEBUG The file was found in cache: /home/3node/.cache/agent/image_cache/coreos-x86_64.iso
# INFO Verifying cached file
# DEBUG extracting /coreos/coreos-x86_64.iso.sha256 to /tmp/cache1876698393, oc image extract --path /coreos/coreos-x86_64.iso.sha256:/tmp/cache1876698393 --confirm --icsp-file=/tmp/icsp-file455852761 quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:052130abddf741195b6753888cf8a00757dedeb7010f7d4dcc4b842b5bc705f6
# ......
coreos-installer iso ignition show agent.x86_64.iso > ignition.ign
# HTTP_PATH=http://192.168.7.11:8080/ignition
source /data/ocp4/acm.fn.sh
# we create a wzh user with password redhat, so that on the first boot you can log in directly from the console/ssh with a username and password
# this is convenient for troubleshooting and research
VAR_PWD_HASH="$(python3 -c 'import crypt,getpass; print(crypt.crypt("redhat"))')"
cat ${BASE_DIR}/data/install/ignition.ign \
| jq --arg VAR "$VAR_PWD_HASH" --arg VAR_SSH "$NODE_SSH_KEY" '.passwd.users += [{ "name": "wzh", "system": true, "passwordHash": $VAR , "sshAuthorizedKeys": [ $VAR_SSH ], "groups": [ "adm", "wheel", "sudo", "systemd-journal" ] }]' \
| jq '. += { "kernel_arguments" : { "should_exist" : [ "systemd.debug-shell=1" ] } }' \
| jq -c . \
> ${BASE_DIR}/data/install/ignition-iso.ign
coreos-installer iso ignition embed -f -i ignition-iso.ign agent.x86_64.iso
# VAR_IMAGE_VER=rhcos-410.86.202303200936-AnolisOS-0-live.x86_64.iso
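Before copying the ISO over, you can confirm the customized user really got embedded (a sketch, reusing the ignition show subcommand from above):
# the embedded ignition should now list the wzh user we just added
coreos-installer iso ignition show agent.x86_64.iso | jq -r '.passwd.users[].name'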
on kvm host ( 103 )
cleanup
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
virsh destroy ocp4-acm-one-bootstrap
virsh undefine ocp4-acm-one-bootstrap
create_lv vgdata poolA lvacm-one-bootstrap 500G
create_lv vgdata poolA lvacm-one-bootstrap-data 500G
virsh destroy ocp4-acm-one-master-01
virsh undefine ocp4-acm-one-master-01
create_lv vgdata poolA lvacm-one-master-01 500G
create_lv vgdata poolA lvacm-one-master-01-data 500G
virsh destroy ocp4-acm-one-master-02
virsh undefine ocp4-acm-one-master-02
create_lv vgdata poolA lvacm-one-master-02 500G
create_lv vgdata poolA lvacm-one-master-02-data 500G
virsh destroy ocp4-acm-one-master-03
virsh undefine ocp4-acm-one-master-03
create_lv vgdata poolA lvacm-one-master-03 500G
create_lv vgdata poolA lvacm-one-master-03-data 500G
begin
cat << EOF >> /etc/sysctl.d/99-wzh-sysctl.conf
vm.overcommit_memory = 1
EOF
sysctl --system
# create the virtual network for the lab
mkdir -p /data/kvm
cd /data/kvm
cat << 'EOF' > /data/kvm/bridge.sh
#!/usr/bin/env bash
PUB_CONN='eno1'
PUB_IP='172.21.6.103/24'
PUB_GW='172.21.6.254'
PUB_DNS='172.21.1.1'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down baremetal
nmcli con delete baremetal
# RHEL 8.1 appends the word "System" in front of the connection; delete it in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname baremetal type bridge con-name baremetal ipv4.method 'manual' \
ipv4.address "$PUB_IP" \
ipv4.gateway "$PUB_GW" \
ipv4.dns "$PUB_DNS"
nmcli con add type bridge-slave ifname "$PUB_CONN" master baremetal
nmcli con down "$PUB_CONN";pkill dhclient;dhclient baremetal
nmcli con up baremetal
EOF
bash /data/kvm/bridge.sh
nmcli con mod baremetal +ipv4.addresses "192.168.7.103/24"
nmcli con up baremetal
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
pvcreate -y /dev/vdb
vgcreate vgdata /dev/vdb
# https://access.redhat.com/articles/766133
lvcreate -y -n poolA -L 500G vgdata
lvcreate -y -n poolA_meta -L 10G vgdata
lvconvert -y --thinpool vgdata/poolA --poolmetadata vgdata/poolA_meta
lvextend -l +100%FREE vgdata/poolA
mkdir -p /data/kvm/one/
scp root@192.168.77.11:/home/3node/data/install/agent.x86_64.iso /data/kvm/one/
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
SNO_MEM=32
virsh destroy ocp4-acm-one-master-01
virsh undefine ocp4-acm-one-master-01
create_lv vgdata poolA lvacm-one-master-01 500G recreate
create_lv vgdata poolA lvacm-one-master-01-data 500G recreate
virt-install --name=ocp4-acm-one-master-01 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lvacm-one-master-01,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lvacm-one-master-01-data,device=disk,bus=virtio,format=raw \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio,mac=52:54:00:12:A1:01 \
--graphics vnc,port=59003 --noautoconsole \
--boot menu=on --cdrom /data/kvm/one/agent.x86_64.iso
virsh destroy ocp4-acm-one-master-02
virsh undefine ocp4-acm-one-master-02
create_lv vgdata poolA lvacm-one-master-02 500G recreate
create_lv vgdata poolA lvacm-one-master-02-data 500G recreate
virt-install --name=ocp4-acm-one-master-02 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lvacm-one-master-02,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lvacm-one-master-02-data,device=disk,bus=virtio,format=raw \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio,mac=52:54:00:12:A1:02 \
--graphics vnc,port=59004 --noautoconsole \
--boot menu=on --cdrom /data/kvm/one/agent.x86_64.iso
virsh destroy ocp4-acm-one-master-03
virsh undefine ocp4-acm-one-master-03
create_lv vgdata poolA lvacm-one-master-03 500G recreate
create_lv vgdata poolA lvacm-one-master-03-data 500G recreate
virt-install --name=ocp4-acm-one-master-03 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lvacm-one-master-03,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lvacm-one-master-03-data,device=disk,bus=virtio,format=raw \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio,mac=52:54:00:12:A1:03 \
--graphics vnc,port=59005 --noautoconsole \
--boot menu=on --cdrom /data/kvm/one/agent.x86_64.iso
on helper to see result
for unknown reasons, the VM will shut down instead of rebooting; you have to power it on manually.
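A small helper loop for the kvm host (a sketch, using the VM names defined above) can bring them back up when that happens:
# on the kvm host ( 103 ): start any master VM that is not currently running
for var_vm in ocp4-acm-one-master-01 ocp4-acm-one-master-02 ocp4-acm-one-master-03; do
  virsh domstate $var_vm | grep -q running || virsh start $var_vm
done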
cd ${BASE_DIR}/data/install
export KUBECONFIG=${BASE_DIR}/data/install/auth/kubeconfig
echo "export KUBECONFIG=${BASE_DIR}/data/install/auth/kubeconfig" >> ~/.bashrc
# oc completion bash | sudo tee /etc/bash_completion.d/openshift > /dev/null
cd ${BASE_DIR}/data/install
openshift-install --dir=${BASE_DIR}/data/install agent wait-for bootstrap-complete \
--log-level=debug
# INFO Uploaded logs for host master-02-demo cluster b1d26586-caae-4b49-a0c7-30c6f8c3b9db
# INFO Host: master-01-demo, reached installation stage Writing image to disk: 100%
# INFO Host: master-01-demo, reached installation stage Waiting for control plane: Waiting for masters to join bootstrap control plane
# INFO Bootstrap Kube API Initialized
# INFO Host: master-02-demo, reached installation stage Configuring
# INFO Host: master-03-demo, reached installation stage Configuring
# INFO Host: master-02-demo, reached installation stage Joined
# INFO Host: master-01-demo, reached installation stage Waiting for bootkube
# INFO Host: master-03-demo, reached installation stage Done
# INFO Host: master-01-demo, reached installation stage Waiting for controller: waiting for controller pod ready event
# INFO Bootstrap configMap status is complete
# INFO cluster bootstrap is complete
cd ${BASE_DIR}/data/install
openshift-install --dir=${BASE_DIR}/data/install agent wait-for install-complete
# INFO Waiting for cluster install to initialize. Sleeping for 30 seconds
# INFO Bootstrap Kube API Initialized
# INFO Bootstrap configMap status is complete
# INFO cluster bootstrap is complete
# INFO Cluster is installed
# INFO Install complete!
# INFO To access the cluster as the system:admin user when using 'oc', run
# INFO export KUBECONFIG=/home/3node/data/install/auth/kubeconfig
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.osp-demo.wzhlab.top
# INFO Login to the console with user: "kubeadmin", and password: "LsWCT-b8oaw-tvEKY-RUwKC"
password login and oc config
# init setting for helper node
cat << EOF > ~/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
chmod 600 ~/.ssh/config
# ssh core@*****
# sudo -i
# # change password for root
# echo 'redhat' | passwd --stdin root
# sed -i "s|^PasswordAuthentication no$|PasswordAuthentication yes|g" /etc/ssh/sshd_config
# sed -i "s|^PermitRootLogin no$|PermitRootLogin yes|g" /etc/ssh/sshd_config
# sed -i "s|^#ClientAliveInterval 180$|ClientAliveInterval 1800|g" /etc/ssh/sshd_config
# systemctl restart sshd
# # set env, so oc can be used
# cat << EOF >> ~/.bashrc
# export KUBECONFIG=/etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/node-kubeconfigs/localhost.kubeconfig
# RET=`oc config use-context system:admin`
# EOF
cat > ${BASE_DIR}/data/install/crack.txt << EOF
echo redhat | sudo passwd --stdin root
sudo sed -i "s|^PasswordAuthentication no$|PasswordAuthentication yes|g" /etc/ssh/sshd_config
sudo sed -i "s|^PermitRootLogin no$|PermitRootLogin yes|g" /etc/ssh/sshd_config
sudo sed -i "s|^#ClientAliveInterval 180$|ClientAliveInterval 1800|g" /etc/ssh/sshd_config
sudo systemctl restart sshd
sudo sh -c 'echo "export KUBECONFIG=/etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/node-kubeconfigs/localhost.kubeconfig" >> /root/.bashrc'
sudo sh -c 'echo "RET=\\\`oc config use-context system:admin\\\`" >> /root/.bashrc'
EOF
for i in 23 24 25
do
ssh core@192.168.7.$i < ${BASE_DIR}/data/install/crack.txt
done
from other host
# https://unix.stackexchange.com/questions/230084/send-the-password-through-stdin-in-ssh-copy-id
dnf install -y sshpass
for i in 23 24 25
do
sshpass -p 'redhat' ssh-copy-id root@192.168.7.$i
done
poweroff
for i in 23 24 25
do
ssh root@192.168.7.$i poweroff
done
poweron
virsh start ocp4-acm-one-master-01
virsh start ocp4-acm-one-master-02
virsh start ocp4-acm-one-master-03
back and merge kubeconfig
mkdir -p ~/.kube/bak/
var_date=$(date '+%Y-%m-%d-%H%M')
/bin/cp -f /data/install/auth/kubeconfig ~/.kube/bak/kubeconfig-$var_date
/bin/cp -f /data/install/auth/kubeadmin-password ~/.kube/bak/kubeadmin-password-$var_date
sed "s/admin/admin\/$SNO_CLUSTER_NAME/g" /data/install/auth/kubeconfig > /tmp/config.new
# https://medium.com/@jacobtomlinson/how-to-merge-kubernetes-kubectl-config-files-737b61bd517d
/bin/cp -f ~/.kube/config ~/.kube/config.bak && KUBECONFIG=~/.kube/config:/tmp/config.new kubectl config view --flatten > /tmp/config && /bin/mv -f /tmp/config ~/.kube/config
unset KUBECONFIG
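To confirm the merge worked, list the contexts in the combined config (a quick check, not part of the original flow):
# the new cluster should now appear as an additional context in ~/.kube/config
oc config get-contexts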
add worker node
Now that the single node is installed, the next step is to add a worker node to it, turning this single-node cluster into a cluster with a single master.
# first, lets stick ingress to master
oc label node acm-demo-hub-master ocp-ingress-run="true"
oc patch ingresscontroller default -n openshift-ingress-operator --type=merge --patch='{"spec":{"nodePlacement":{"nodeSelector": {"matchLabels":{"ocp-ingress-run":"true"}}}}}'
# we are testing env, so we don't need ingress replicas.
oc patch --namespace=openshift-ingress-operator --patch='{"spec": {"replicas": 1}}' --type=merge ingresscontroller/default
oc get -n openshift-ingress-operator ingresscontroller/default -o yaml
# then we get worker's ignition file, and start worker node, add it to cluster
oc extract -n openshift-machine-api secret/worker-user-data --keys=userData --to=- > /var/www/html/ignition/sno-worker.ign
HELP_SERVER=192.168.7.11
# define the node information for the single-node cluster
SNO_IP=192.168.7.16
SNO_GW=192.168.7.11
SNO_NETMAST=255.255.255.0
SNO_HOSTNAME=acm-demo-hub-worker-01
SNO_IF=enp1s0
SNO_DNS=192.168.7.11
SNO_DISK=/dev/vda
SNO_MEM=16
BOOT_ARG=" ip=$SNO_IP::$SNO_GW:$SNO_NETMAST:$SNO_HOSTNAME:$SNO_IF:none nameserver=$SNO_DNS coreos.inst.install_dev=${SNO_DISK##*/} coreos.inst.ignition_url=http://$HELP_SERVER:8080/ignition/sno-worker.ign"
/bin/cp -f /data/ocp4/rhcos-live.x86_64.iso sno.iso
coreos-installer iso kargs modify -a "$BOOT_ARG" sno.iso
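Before copying the ISO to the kvm host, you can double-check the injected kernel arguments (a sketch using coreos-installer's show subcommand):
# print the kernel arguments currently embedded in the ISO
coreos-installer iso kargs show sno.iso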
# go to kvm host ( 103 )
scp root@192.168.7.11:/data/install/sno.iso /data/kvm/
virsh destroy ocp4-acm-hub-worker01
virsh undefine ocp4-acm-hub-worker01
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
create_lv vgdata poolA lvacmhub-worker01 500G recreate
# create_lv vgdata poolA lvacmhub-worker01-data 500G remove
virt-install --name=ocp4-acm-hub-worker01 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lvacmhub-worker01,device=disk,bus=virtio,format=raw \
`# --disk path=/dev/vgdata/lvacmhub-data,device=disk,bus=virtio,format=raw` \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59003 \
--boot menu=on --cdrom /data/kvm/sno.iso
# after 2 boot up,
# go back to helper
oc get csr
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
end
openshift 4.12 UPI in agent way, single node
There are already many ways to install OpenShift, and now there is one more: the agent-based installer. Its biggest selling point is that no extra bootstrap node is needed. This is great news, because in the past, when talking with customers before an installation, they never understood why Red Hat claims to support a 3-node deployment but asks for 4 servers. You can't really blame them: by the usual understanding, Red Hat did not support a strict 3-node deployment before, precisely because of that bootstrap node. Now, with the agent-based installer, a 3-node deployment is finally supported in the everyday sense.
According to the official documentation, the bootstrap node can be eliminated because the bootstrap-related services are squeezed onto one of the master nodes, and the assisted-installer flow is used, achieving a true 3-node install.
- https://docs.openshift.com/container-platform/4.12/installing/installing_with_agent_based_installer/preparing-to-install-with-agent-based-installer.html
In this document, we will use the agent based installer to install a single-node ocp cluster.
on helper node
# switch to your install version
export BUILDNUMBER=4.12.9
pushd /data/ocp4/${BUILDNUMBER}
tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
# tar -xzf oc-mirror.tar.gz -C /usr/local/bin/
# chmod +x /usr/local/bin/oc-mirror
install -m 755 /data/ocp4/clients/butane-amd64 /usr/local/bin/butane
install -m 755 /data/ocp4/clients/coreos-installer_amd64 /usr/local/bin/coreos-installer
popd
# create a user and create the cluster under the user
useradd -m 3node
su - 3node
ssh-keygen
cat << EOF > ~/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
chmod 600 ~/.ssh/config
cat << 'EOF' >> ~/.bashrc
export BASE_DIR='/home/3node/'
EOF
# export BASE_DIR='/home/3node/'
export BUILDNUMBER=4.12.9
mkdir -p ${BASE_DIR}/data/{sno/disconnected,install}
# set some parameters of your cluster
NODE_SSH_KEY="$(cat ${BASE_DIR}/.ssh/id_rsa.pub)"
INSTALL_IMAGE_REGISTRY=quaylab.infra.wzhlab.top:5443
# PULL_SECRET='{"auths":{"registry.redhat.io": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"'${INSTALL_IMAGE_REGISTRY}'": {"auth": "'$( echo -n 'admin:shadowman' | openssl base64 )'","email": "noemail@localhost"}}}'
PULL_SECRET=$(cat /data/pull-secret.json)
NTP_SERVER=192.168.77.11
# HELP_SERVER=192.168.7.11
# KVM_HOST=192.168.7.11
# API_VIP=192.168.77.99
# INGRESS_VIP=192.168.77.98
# CLUSTER_PROVISION_IP=192.168.7.103
# BOOTSTRAP_IP=192.168.7.12
MACHINE_NETWORK='192.168.77.0/24'
# define the node info for the single node cluster
SNO_CLUSTER_NAME=osp-demo
SNO_BASE_DOMAIN=wzhlab.top
BOOTSTRAP_IP=192.168.77.42
MASTER_01_IP=192.168.77.43
MASTER_02_IP=192.168.77.44
MASTER_03_IP=192.168.77.45
BOOTSTRAP_IPv6=fd03::42
MASTER_01_IPv6=fd03::43
MASTER_02_IPv6=fd03::44
MASTER_03_IPv6=fd03::45
BOOTSTRAP_HOSTNAME=bootstrap-demo
MASTER_01_HOSTNAME=master-01-demo
MASTER_02_HOSTNAME=master-02-demo
MASTER_03_HOSTNAME=master-03-demo
BOOTSTRAP_INTERFACE=enp1s0
MASTER_01_INTERFACE=enp1s0
MASTER_02_INTERFACE=enp1s0
MASTER_03_INTERFACE=enp1s0
MASTER_01_INTERFACE_MAC=52:54:00:12:A1:01
MASTER_02_INTERFACE_MAC=52:54:00:12:A1:02
MASTER_03_INTERFACE_MAC=52:54:00:12:A1:03
BOOTSTRAP_DISK=/dev/vda
MASTER_01_DISK=/dev/vda
MASTER_02_DISK=/dev/vda
MASTER_03_DISK=/dev/vda
OCP_GW=192.168.77.11
OCP_NETMASK=255.255.255.0
OCP_NETMASK_S=24
OCP_DNS=192.168.77.11
OCP_GW_v6=fd03::11
OCP_NETMASK_v6=64
# echo ${SNO_IF_MAC} > /data/sno/sno.mac
mkdir -p ${BASE_DIR}/data/install
cd ${BASE_DIR}/data/install
/bin/rm -rf *.ign .openshift_install_state.json auth bootstrap manifests master*[0-9] worker*[0-9] *
cat << EOF > ${BASE_DIR}/data/install/install-config.yaml
apiVersion: v1
baseDomain: $SNO_BASE_DOMAIN
compute:
- name: worker
replicas: 0
controlPlane:
name: master
replicas: 1
metadata:
name: $SNO_CLUSTER_NAME
networking:
# OVNKubernetes , OpenShiftSDN
clusterNetwork:
- cidr: 172.21.0.0/16
hostPrefix: 23
# - cidr: fd02::/48
# hostPrefix: 64
machineNetwork:
- cidr: $MACHINE_NETWORK
# - cidr: 2001:DB8::/32
serviceNetwork:
- 172.22.0.0/16
# - fd03::/112
platform:
none: {}
pullSecret: '${PULL_SECRET}'
sshKey: |
$( cat ${BASE_DIR}/.ssh/id_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /etc/crts/redhat.ren.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF
cat << EOF > ${BASE_DIR}/data/install/agent-config.yaml
apiVersion: v1alpha1
kind: AgentConfig
metadata:
name: $SNO_CLUSTER_NAME
rendezvousIP: $MASTER_01_IP
additionalNTPSources:
- $NTP_SERVER
hosts:
- hostname: $MASTER_01_HOSTNAME
role: master
rootDeviceHints:
deviceName: "$MASTER_01_DISK"
interfaces:
- name: $MASTER_01_INTERFACE
macAddress: $MASTER_01_INTERFACE_MAC
networkConfig:
interfaces:
- name: $MASTER_01_INTERFACE
type: ethernet
state: up
mac-address: $MASTER_01_INTERFACE_MAC
ipv4:
enabled: true
address:
- ip: $MASTER_01_IP
prefix-length: $OCP_NETMASK_S
dhcp: false
dns-resolver:
config:
server:
- $OCP_DNS
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: $OCP_GW
next-hop-interface: $MASTER_01_INTERFACE
table-id: 254
EOF
/bin/cp -f ${BASE_DIR}/data/install/install-config.yaml ${BASE_DIR}/data/install/install-config.yaml.bak
openshift-install --dir=${BASE_DIR}/data/install agent create cluster-manifests
sudo bash -c "/bin/cp -f mirror/registries.conf /etc/containers/registries.conf.d/; chmod +r /etc/containers/registries.conf.d/*"
# /bin/cp -f /data/ocp4/ansible-helper/files/* ${BASE_DIR}/data/install/openshift/
sudo bash -c "cd /data/ocp4 ; bash image.registries.conf.sh quaylab.infra.wzhlab.top:5443 ;"
/bin/cp -f /data/ocp4/99-worker-container-registries.yaml ${BASE_DIR}/data/install/openshift
/bin/cp -f /data/ocp4/99-master-container-registries.yaml ${BASE_DIR}/data/install/openshift
cd ${BASE_DIR}/data/install/
# openshift-install --dir=${BASE_DIR}/data/install create ignition-configs
mkdir -p ~/.cache/agent/image_cache/
/bin/cp -f /data/ocp-$BUILDNUMBER/rhcos-live.x86_64.iso ~/.cache/agent/image_cache/coreos-x86_64.iso
openshift-install --dir=${BASE_DIR}/data/install agent create image --log-level=debug
# ......
# DEBUG Fetching image from OCP release (oc adm release info --image-for=machine-os-images --insecure=true --icsp-file=/tmp/icsp-file3636774741 quay.io/openshift-release-dev/ocp-release@sha256:96bf74ce789ccb22391deea98e0c5050c41b67cc17defbb38089d32226dba0b8)
# DEBUG The file was found in cache: /home/3node/.cache/agent/image_cache/coreos-x86_64.iso
# INFO Verifying cached file
# DEBUG extracting /coreos/coreos-x86_64.iso.sha256 to /tmp/cache1876698393, oc image extract --path /coreos/coreos-x86_64.iso.sha256:/tmp/cache1876698393 --confirm --icsp-file=/tmp/icsp-file455852761 quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:052130abddf741195b6753888cf8a00757dedeb7010f7d4dcc4b842b5bc705f6
# ......
coreos-installer iso ignition show agent.x86_64.iso > ignition.ign
# HTTP_PATH=http://192.168.7.11:8080/ignition
source /data/ocp4/acm.fn.sh
# we will create a user 'wzh' with password 'redhat', so that on first boot you can
# log in directly from the console/ssh with username and password,
# which is handy for troubleshooting and research
VAR_PWD_HASH="$(python3 -c 'import crypt,getpass; print(crypt.crypt("redhat"))')"
cat ${BASE_DIR}/data/install/ignition.ign \
| jq --arg VAR "$VAR_PWD_HASH" --arg VAR_SSH "$NODE_SSH_KEY" '.passwd.users += [{ "name": "wzh", "system": true, "passwordHash": $VAR , "sshAuthorizedKeys": [ $VAR_SSH ], "groups": [ "adm", "wheel", "sudo", "systemd-journal" ] }]' \
| jq '. += { "kernel_arguments" : { "should_exist" : [ "systemd.debug-shell=1" ] } }' \
| jq -c . \
> ${BASE_DIR}/data/install/ignition-iso.ign
coreos-installer iso ignition embed -f -i ignition-iso.ign agent.x86_64.iso
# VAR_IMAGE_VER=rhcos-410.86.202303200936-AnolisOS-0-live.x86_64.iso
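To double-check that the password hash and ssh key really made it into the ISO, you can dump the embedded ignition again; a quick sanity check using the same coreos-installer and jq tools as above:
# list the users injected into the embedded ignition, 'wzh' should show up
coreos-installer iso ignition show agent.x86_64.iso | jq -r '.passwd.users[].name'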
on kvm host ( 103 )
cleanup
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
virsh destroy ocp4-acm-one-bootstrap
virsh undefine ocp4-acm-one-bootstrap
create_lv vgdata poolA lvacm-one-bootstrap 500G
create_lv vgdata poolA lvacm-one-bootstrap-data 500G
virsh destroy ocp4-acm-one-master-01
virsh undefine ocp4-acm-one-master-01
create_lv vgdata poolA lvacm-one-master-01 500G
create_lv vgdata poolA lvacm-one-master-01-data 500G
virsh destroy ocp4-acm-one-master-02
virsh undefine ocp4-acm-one-master-02
create_lv vgdata poolA lvacm-one-master-02 500G
create_lv vgdata poolA lvacm-one-master-02-data 500G
virsh destroy ocp4-acm-one-master-03
virsh undefine ocp4-acm-one-master-03
create_lv vgdata poolA lvacm-one-master-03 500G
create_lv vgdata poolA lvacm-one-master-03-data 500G
begin
cat << EOF >> /etc/sysctl.d/99-wzh-sysctl.conf
vm.overcommit_memory = 1
EOF
sysctl --system
# create the virtual network for the lab
mkdir -p /data/kvm
cd /data/kvm
cat << 'EOF' > /data/kvm/bridge.sh
#!/usr/bin/env bash
PUB_CONN='eno1'
PUB_IP='172.21.6.103/24'
PUB_GW='172.21.6.254'
PUB_DNS='172.21.1.1'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down baremetal
nmcli con delete baremetal
# RHEL 8.1 appends the word "System" in front of the connection, delete it in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname baremetal type bridge con-name baremetal ipv4.method 'manual' \
ipv4.address "$PUB_IP" \
ipv4.gateway "$PUB_GW" \
ipv4.dns "$PUB_DNS"
nmcli con add type bridge-slave ifname "$PUB_CONN" master baremetal
nmcli con down "$PUB_CONN";pkill dhclient;dhclient baremetal
nmcli con up baremetal
EOF
bash /data/kvm/bridge.sh
nmcli con mod baremetal +ipv4.addresses "192.168.7.103/24"
nmcli con up baremetal
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
pvcreate -y /dev/vdb
vgcreate vgdata /dev/vdb
# https://access.redhat.com/articles/766133
lvcreate -y -n poolA -L 500G vgdata
lvcreate -y -n poolA_meta -L 10G vgdata
lvconvert -y --thinpool vgdata/poolA --poolmetadata vgdata/poolA_meta
lvextend -l +100%FREE vgdata/poolA
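To confirm the thin pool is laid out as expected, take a quick look at the volume group and its LVs with plain lvm2 commands:
# poolA should show up as a thin pool covering (almost) the whole VG
vgs vgdata
lvs -a vgdata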
mkdir -p /data/kvm/one/
scp root@192.168.77.11:/home/3node/data/install/agent.x86_64.iso /data/kvm/one/
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
SNO_MEM=64
virsh destroy ocp4-acm-one-master-01
virsh undefine ocp4-acm-one-master-01
create_lv vgdata poolA lvacm-one-master-01 500G recreate
create_lv vgdata poolA lvacm-one-master-01-data 500G recreate
virt-install --name=ocp4-acm-one-master-01 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lvacm-one-master-01,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lvacm-one-master-01-data,device=disk,bus=virtio,format=raw \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio,mac=52:54:00:12:A1:01 \
--graphics vnc,port=59003 --noautoconsole \
--boot menu=on --cdrom /data/kvm/one/agent.x86_64.iso
on helper to see result
For some unknown reason, the vm will shut down instead of rebooting; you have to power it on manually.
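If you do not want to watch the console, a tiny loop on the kvm host (103) can bring the domain back up whenever it powers off; just a sketch, using the domain name from the virt-install above (stop it with ctrl-c once the install is done):
# keep restarting the vm until the installation finishes
while true; do
  virsh domstate ocp4-acm-one-master-01 | grep -q running || virsh start ocp4-acm-one-master-01
  sleep 30
done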
cd ${BASE_DIR}/data/install
export KUBECONFIG=${BASE_DIR}/data/install/auth/kubeconfig
echo "export KUBECONFIG=${BASE_DIR}/data/install/auth/kubeconfig" >> ~/.bashrc
# oc completion bash | sudo tee /etc/bash_completion.d/openshift > /dev/null
cd ${BASE_DIR}/data/install
openshift-install --dir=${BASE_DIR}/data/install agent wait-for bootstrap-complete --log-level=debug
# ......
# DEBUG RendezvousIP from the AgentConfig 192.168.77.43
# INFO Bootstrap Kube API Initialized
# INFO Bootstrap configMap status is complete
# INFO cluster bootstrap is complete
cd ${BASE_DIR}/data/install
openshift-install --dir=${BASE_DIR}/data/install agent wait-for install-complete --log-level=debug
# ......
# INFO Install complete!
# INFO To access the cluster as the system:admin user when using 'oc', run
# INFO export KUBECONFIG=/home/3node/data/install/auth/kubeconfig
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.osp-demo.wzhlab.top
# INFO Login to the console with user: "kubeadmin", and password: "UmfI2-99uAb-BRdaS-LLjQ9"
password login and oc config
# init setting for helper node
cat << EOF > ~/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
chmod 600 ~/.ssh/config
# ssh core@*****
# sudo -i
# # change password for root
# echo 'redhat' | passwd --stdin root
# sed -i "s|^PasswordAuthentication no$|PasswordAuthentication yes|g" /etc/ssh/sshd_config
# sed -i "s|^PermitRootLogin no$|PermitRootLogin yes|g" /etc/ssh/sshd_config
# sed -i "s|^#ClientAliveInterval 180$|ClientAliveInterval 1800|g" /etc/ssh/sshd_config
# systemctl restart sshd
# # set env, so oc can be used
# cat << EOF >> ~/.bashrc
# export KUBECONFIG=/etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/node-kubeconfigs/localhost.kubeconfig
# RET=`oc config use-context system:admin`
# EOF
cat > ${BASE_DIR}/data/install/crack.txt << EOF
echo redhat | sudo passwd --stdin root
sudo sed -i "s|^PasswordAuthentication no$|PasswordAuthentication yes|g" /etc/ssh/sshd_config
sudo sed -i "s|^PermitRootLogin no$|PermitRootLogin yes|g" /etc/ssh/sshd_config
sudo sed -i "s|^#ClientAliveInterval 180$|ClientAliveInterval 1800|g" /etc/ssh/sshd_config
sudo systemctl restart sshd
sudo sh -c 'echo "export KUBECONFIG=/etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/node-kubeconfigs/localhost.kubeconfig" >> /root/.bashrc'
sudo sh -c 'echo "RET=\\\`oc config use-context system:admin\\\`" >> /root/.bashrc'
EOF
for i in 23 24 25
do
ssh core@192.168.7.$i < ${BASE_DIR}/data/install/crack.txt
done
from other host
# https://unix.stackexchange.com/questions/230084/send-the-password-through-stdin-in-ssh-copy-id
dnf install -y sshpass
for i in 23 24 25
do
sshpass -p 'redhat' ssh-copy-id root@192.168.7.$i
done
poweroff
for i in 23 24 25
do
ssh root@192.168.7.$i poweroff
done
poweron
virsh start ocp4-acm-one-master-01
virsh start ocp4-acm-one-master-02
virsh start ocp4-acm-one-master-03
backup and merge kubeconfig
mkdir -p ~/.kube/bak/
var_date=$(date '+%Y-%m-%d-%H%M')
/bin/cp -f /data/install/auth/kubeconfig ~/.kube/bak/kubeconfig-$var_date
/bin/cp -f /data/install/auth/kubeadmin-password ~/.kube/bak/kubeadmin-password-$var_date
sed "s/admin/admin\/$SNO_CLUSTER_NAME/g" /data/install/auth/kubeconfig > /tmp/config.new
# https://medium.com/@jacobtomlinson/how-to-merge-kubernetes-kubectl-config-files-737b61bd517d
/bin/cp -f ~/.kube/config ~/.kube/config.bak && KUBECONFIG=~/.kube/config:/tmp/config.new kubectl config view --flatten > /tmp/config && /bin/mv -f /tmp/config ~/.kube/config
unset KUBECONFIG
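A quick way to confirm the merge worked is to list the contexts in the merged config; standard kubectl usage:
kubectl config get-contexts
# the renamed context, something like admin/$SNO_CLUSTER_NAME, should be listed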
add worker node
Now that the single node is installed, the next step is to add a worker node to it, turning this single node cluster into a single-master cluster.
# first, let's pin the ingress to the master
oc label node acm-demo-hub-master ocp-ingress-run="true"
oc patch ingresscontroller default -n openshift-ingress-operator --type=merge --patch='{"spec":{"nodePlacement":{"nodeSelector": {"matchLabels":{"ocp-ingress-run":"true"}}}}}'
# this is a test env, so we don't need multiple ingress replicas.
oc patch --namespace=openshift-ingress-operator --patch='{"spec": {"replicas": 1}}' --type=merge ingresscontroller/default
oc get -n openshift-ingress-operator ingresscontroller/default -o yaml
# then we get worker's ignition file, and start worker node, add it to cluster
oc extract -n openshift-machine-api secret/worker-user-data --keys=userData --to=- > /var/www/html/ignition/sno-worker.ign
HELP_SERVER=192.168.7.11
# define the node info for the single node cluster
SNO_IP=192.168.7.16
SNO_GW=192.168.7.11
SNO_NETMAST=255.255.255.0
SNO_HOSTNAME=acm-demo-hub-worker-01
SNO_IF=enp1s0
SNO_DNS=192.168.7.11
SNO_DISK=/dev/vda
SNO_MEM=16
BOOT_ARG=" ip=$SNO_IP::$SNO_GW:$SNO_NETMAST:$SNO_HOSTNAME:$SNO_IF:none nameserver=$SNO_DNS coreos.inst.install_dev=${SNO_DISK##*/} coreos.inst.ignition_url=http://$HELP_SERVER:8080/ignition/sno-worker.ign"
/bin/cp -f /data/ocp4/rhcos-live.x86_64.iso sno.iso
coreos-installer iso kargs modify -a "$BOOT_ARG" sno.iso
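Before copying the ISO to the kvm host, you can verify that the kernel arguments were really appended; coreos-installer can print them back:
coreos-installer iso kargs show sno.iso
# the output should contain the ip=... and coreos.inst.ignition_url=... arguments from BOOT_ARG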
# go to kvm host ( 103 )
scp root@192.168.7.11:/data/install/sno.iso /data/kvm/
virsh destroy ocp4-acm-hub-worker01
virsh undefine ocp4-acm-hub-worker01
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
create_lv vgdata poolA lvacmhub-worker01 500G recreate
# create_lv vgdata poolA lvacmhub-worker01-data 500G remove
virt-install --name=ocp4-acm-hub-worker01 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lvacmhub-worker01,device=disk,bus=virtio,format=raw \
`# --disk path=/dev/vgdata/lvacmhub-data,device=disk,bus=virtio,format=raw` \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59003 \
--boot menu=on --cdrom /data/kvm/sno.iso
# after the second boot up,
# go back to helper
oc get csr
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
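A new worker usually raises two rounds of CSRs (the node-bootstrapper CSR first, then the kubelet serving CSR), so the approve command may need to be repeated; a small hedged loop saves retyping it:
# approve pending CSRs a few times while the worker joins
for i in {1..10}
do
  oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs -r oc adm certificate approve
  sleep 60
done
oc get node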
end
build a kernel driver rpm for openshift coreos
The author already has documents and projects describing how to build kernel drivers for devices, but on openshift the kernel used by rh-coreos is only available with a premium subscription. We have no way to get a rhel box with the same kernel as rh-coreos, so we cannot build the .ko the usual way.
Fortunately, the openshift release ships a container that contains the kernel development packages of that premium-subscription kernel, which lets us build the .ko and then package it into an rpm. So today let's walk through it step by step.
build a tool image
The openshift release ships a driver-toolkit image that carries the kernel development packages and satisfies the build requirements. Since our goal is to build an rpm, we extend this tool image a little.
OCP_VERSION=$(oc get clusterversion/version -ojsonpath={.status.desired.version})
DRIVER_TOOLKIT_IMAGE=$(oc adm release info $OCP_VERSION --image-for=driver-toolkit)
echo $OCP_VERSION
# 4.11.39
echo $DRIVER_TOOLKIT_IMAGE
# quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:dfed734e35163b1ab8483568780d13b528b4c0f558f8e727538af723b7a41ed4
# build a new image based on driver toolkit
# on a rhel
mkdir -p /data/driver
cd /data/driver
cat << EOF > docker.file
FROM quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:dfed734e35163b1ab8483568780d13b528b4c0f558f8e727538af723b7a41ed4
RUN dnf install -y rpm-build
RUN cd /root && git clone https://github.com/wangzheng422/nic-rpm-rnp
RUN cd /root/nic-rpm-rnp && git checkout ocp-4.11.36
RUN mv /root/nic-rpm-rnp/rpmbuild /root/
EOF
podman build --no-cache --authfile /data/pull-secret.json -t quay.io/wangzheng422/driver-toolkit:nic-rpm-rnp-v03 -f docker.file .
podman push quay.io/wangzheng422/driver-toolkit:nic-rpm-rnp-v03
build the rpm inside openshift
With the tool image in hand, we run it in privileged mode, then go into the pod and run the build commands to produce the rpm.
# come back to your cluster
# https://master.sdk.operatorframework.io/docs/best-practices/pod-security-standards/
oc create ns driver-build
oc label --overwrite ns driver-build \
pod-security.kubernetes.io/enforce=privileged
# oc create serviceaccount -n driver-build demo-app
# oc adm policy add-scc-to-user privileged -z demo-app -n driver-build
cat << EOF > ~/wzh/build.yaml
apiVersion: v1
kind: Pod
metadata:
name: kmod-driver-samplepod
annotations:
openshift.io/scc: privileged
# openshift.io/scc: restricted-v2
spec:
# serviceAccountName: demo-app
containers:
- image: quay.io/wangzheng422/driver-toolkit:nic-rpm-rnp-v03
name: simple-kmod-driver-container
imagePullPolicy: Always
command: [sleep, infinity]
securityContext:
      # 'AllowPrivilegedContainer' is not a valid pod securityContext field; 'privileged' is
      privileged: true
# nodeSelector:
# node-role.kubernetes.io/worker: ""
EOF
oc create --save-config -n driver-build -f ~/wzh/build.yaml
# oc delete -n driver-build -f ~/wzh/build.yaml
# oc get all -n driver-build
# NAME READY STATUS RESTARTS AGE
# pod/kmod-driver-samplepod 1/1 Running 0 22m
oc rsh -n driver-build pod/kmod-driver-samplepod
bash
cd ~/nic-rpm-rnp
tar zvxf rnp-nic-drv-0.1.6.rc44-35c40ea.tgz
cd rnp-nic-drv-0.1.6.rc44-35c40ea
cd rnp
bash do_build.sh
# MODPOST 1 modules
# CC /root/nic-rpm-rnp/rnp-nic-drv-0.1.6.rc44-35c40ea/rnp/rnp.mod.o
# LD [M] /root/nic-rpm-rnp/rnp-nic-drv-0.1.6.rc44-35c40ea/rnp/rnp.ko
# make[1]: Leaving directory '/usr/src/kernels/4.18.0-372.52.1.el8_6.x86_64'
exit
# copy the rpm out to helper node
mkdir -p ~/wzh/rsync
oc project driver-build
oc rsync kmod-driver-samplepod:/root/rpmbuild/RPMS/x86_64/ ~/wzh/rsync/
scp ~/wzh/rsync/rnp-nic-drv-0.1.6.rc44_35c40ea-1.el8.x86_64.rpm core@172.29.17.61:~/
install the rpm
Now that we have the driver rpm, let's install it directly on the node and see the result.
ssh core@172.29.17.61
sudo -i
rpm-ostree install /home/core/rnp-nic-drv-0.1.6.rc44_35c40ea-1.el8.x86_64.rpm
# wait 1 mins at least, then
systemctl reboot
rpm-ostree status
# State: idle
# Deployments:
# ● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9b2f4d103a9116e5fb0e5237dd7c932360dda0ef77d3d435374692eaa26dad7c
# CustomOrigin: Managed by machine-config-operator
# Version: 411.86.202304190130-0 (2023-04-19T01:34:04Z)
# LocalPackages: rnp-nic-drv-0.1.6.rc44_35c40ea-1.el8.x86_64
################
# nic driver update
oc project driver-build
oc cp ./rnp-0.2.0-wzh.tar.gz driver-build/kmod-driver-samplepod:/root/rnp-0.2.0-wzh.tar.gz
oc rsh -n driver-build pod/kmod-driver-samplepod
bash
cd /root
rpmbuild -tb rnp-0.2.0-wzh.tar.gz
oc cp driver-build/kmod-driver-samplepod:/root/rpmbuild/RPMS/x86_64/rnp-0.2.0-1.x86_64.rpm ./rnp-0.2.0-1.x86_64.rpm
scp rnp-0.2.0-1.x86_64.rpm core@172.29.17.61:~/
ssh core@172.29.17.61
sudo -i
rpm-ostree install /home/core/rnp-0.2.0-1.x86_64.rpm
end
#### Update the NIC firmware with ethtool
> The new firmware only takes effect after the device is rebooted.
1.1 Copy the firmware to /lib/firmware on the Linux system
cp xxx.img.bin /lib/firmware
1.2 Run the flashing command; replace <ethx> with the actual NIC port name
ethtool -f <ethx> xxx.img.bin 0
@Note: pick any one port on the NIC and run the firmware update once; that is enough.
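To confirm which firmware version is currently loaded (before and after the reboot), ethtool can print the driver info; replace <ethx> with the real port name:
ethtool -i <ethx>
# the firmware-version field should change to the new version after the reboot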
ocp crash
- https://access.redhat.com/solutions/5907731
rpm-ostree kargs --append='crashkernel=256M slub_debug=FZPU'
rpm-ostree kargs --delete='crashkernel=256M'
rpm-ostree kargs --delete='slub_debug=FZPU'
rpm-ostree kargs --append='slub_debug=F'
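To check which kernel arguments are currently set (the changes above only take effect after a reboot), rpm-ostree can list them:
rpm-ostree kargs
# prints the kernel arguments of the current deployment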
1. demo lab for openshift 4.12
In this document, we will record the steps to build a demo lab that shows the capabilities of openshift.
The key points to show include:
- agent based install ( 3 master node ) with static ip allocation
- worker node scale out
- data foundation install
Some additional technical skills covered include:
- simulate bmc for kvm
- lvm thin provision for kvm
- ansible tips
Suggested demo scenarios for the future:
- ODF DR
- CNV
- osp on ocp
The architecture of the demo lab is:
The purpose of this document is to show a practical way to build an openshift demo lab, so partners know where to start when building their own lab. For a production env, please contact redhat professional service (GPS) for assistance.
- 1. demo lab for openshift 4.12
- 2. remote access config
- 3. setup helper node
- 4. install 3 master compact cluster
- 5. scale out 3 kvm worker nodes
- 6. add 3 infra nodes
- 7. add 2 worker BM nodes
- 8. set up infra role on cluster
- 9. install ODF
- 10. enable integrated image register
- 11. end
2. remote access config
We will use zerotier to connect to the demo lab, and use the BM 192.168.25.90 as the jumpbox.
# on 192.168.25.90
# install zerotier
curl -s https://install.zerotier.com | sudo bash
# join zerotier network
zerotier-cli join xxxxxxxxxxxx
# using a moon to accelerate network speed
zerotier-cli orbit xxxxxxxxxxxx xxxxxxxxxxxx
# enable gui
dnf groupinstall -y 'server with gui'
# add some handy tools
dnf install -y \
https://download-ib01.fedoraproject.org/pub/epel/8/Everything/x86_64/Packages/b/byobu-5.133-1.el8.noarch.rpm \
https://dl.fedoraproject.org/pub/epel/8/Everything/x86_64/Packages/s/screen-4.6.2-12.el8.x86_64.rpm \
https://dl.fedoraproject.org/pub/epel/8/Everything/x86_64/Packages/h/htop-3.2.1-1.el8.x86_64.rpm
# add support for kvm and vnc
dnf -y install qemu-kvm libvirt libguestfs-tools virt-install virt-viewer virt-manager tigervnc-server
# auto start libvirt
systemctl enable --now libvirtd
# create password for vnc
# replace xxxxxx with your password
printf 'xxxxxx\nxxxxxx\n\n' | vncpasswd
# create vnc config for vnc starting up
cat << EOF > ~/.vnc/config
session=gnome
securitytypes=vncauth,tlsvnc
# desktop=sandbox
# localhost
geometry=1440x855
alwaysshared
EOF
# auto start vnc session for root user at port 5902
cat << EOF >> /etc/tigervnc/vncserver.users
:2=root
EOF
# auto start vnc session
systemctl enable --now vncserver@:2
# disable firewalld totally, just because I am lazy.
# DO NOT use at production env.
systemctl disable --now firewalld
3. setup helper node
We need a helper node, call it a base station if you like, to host several services such as the container image registry, dns, the load balancer for the api server, and a yum repo (depending on the use case). The helper node is also the operations console; the login key and kubeconfig are stored on the helper node by default.
We will use the helper node as the default gateway for our disconnected openshift cluster. Openshift needs the gateway to be alive; the gateway does not need to be fully functional (for example forwarding packets to the outside), it only needs to answer pings from the openshift nodes. If the gateway is lost, or cannot be pinged, the openshift installation will behave strangely and eventually fail.
We will bring in a few hacky tricks: powerdns is used as the dns service, and the load balancer (normally haproxy) is replaced with powerdns's lua plugin. DO NOT use this in a production env; it is just convenient for the author.
Since this is a disconnected env, we will download the installation media on a VPS and sync it to the helper node.
3.1. config host BM (97)
# DO NOT use at production env.
cat << EOF > ~/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
# setup ntp server on BM node
sed -i "s/#allow.*/allow all/" /etc/chrony.conf
systemctl enable --now chronyd
chronyc sources -v
# .-- Source mode '^' = server, '=' = peer, '#' = local clock.
# / .- Source state '*' = current best, '+' = combined, '-' = not combined,
# | / 'x' = may be in error, '~' = too variable, '?' = unusable.
# || .- xxxx [ yyyy ] +/- zzzz
# || Reachability register (octal) -. | xxxx = adjusted offset,
# || Log2(Polling interval) --. | | yyyy = measured offset,
# || \ | | zzzz = estimated error.
# || | | \
# MS Name/IP address Stratum Poll Reach LastRx Last sample
# ===============================================================================
# ^+ 111.235.248.121 1 8 377 31 -210us[ -210us] +/- 2855us
# ^- static.home.twn.sciurida> 2 7 377 129 +468us[ +448us] +/- 9558us
# ^* twtpe2-ntp-002.aaplimg.c> 1 7 377 33 -50us[ -76us] +/- 1457us
# ^- 114-33-15-129.hinet-ip.h> 2 9 377 335 +994us[ +957us] +/- 8159us
3.2. create helper vm
SNO_MEM=32
# clean up kvm, if we created it before.
virsh destroy ocp4-helper
virsh undefine ocp4-helper
virt-install --name=ocp4-helper --vcpus=8 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/image/ocp4-helper.qcow2,bus=virtio,size=800 \
--os-variant rhel8.3 --network bridge=br-int,model=virtio,mac=52:54:00:12:A1:01 \
--graphics vnc,port=59003 --noautoconsole \
--boot menu=on --cdrom /home/rhel-8.8-x86_64-dvd.iso
3.3. setup helper vm
# DO NOT use at production env.
cat << EOF > ~/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
# DO NOT use at production env.
systemctl disable --now firewalld
# ntp
mv /etc/chrony.conf /etc/chrony.conf.bak
cat << EOF > /etc/chrony.conf
server 192.168.10.90 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
allow all
logdir /var/log/chrony
EOF
systemctl restart chronyd
systemctl enable --now chronyd
# wait sometime, then check the status
chronyc sources -v
# .-- Source mode '^' = server, '=' = peer, '#' = local clock.
# / .- Source state '*' = current best, '+' = combined, '-' = not combined,
# | / 'x' = may be in error, '~' = too variable, '?' = unusable.
# || .- xxxx [ yyyy ] +/- zzzz
# || Reachability register (octal) -. | xxxx = adjusted offset,
# || Log2(Polling interval) --. | | yyyy = measured offset,
# || \ | | zzzz = estimated error.
# || | | \
# MS Name/IP address Stratum Poll Reach LastRx Last sample
# ===============================================================================
# ^* 192.168.10.90 3 6 7 10 -859ns[-1112ms] +/- 2795us
# setup http web server for yum repo
mkdir -p /data/yum.repos
rsync -P -arz root@192.168.10.90:/mnt/disc/BaseOS /data/yum.repos/
rsync -P -arz root@192.168.10.90:/mnt/disc/AppStream /data/yum.repos/
cat << EOF > /etc/yum.repos.d/wzh.repo
[BaseOS]
name=BaseOS
baseurl=file:////data/yum.repos/BaseOS
enabled=1
gpgcheck=0
[AppStream]
name=AppStream
baseurl=file:////data/yum.repos/AppStream
enabled=1
gpgcheck=0
EOF
dnf groupinstall -y 'development'
dnf install -y python3 nmstate ansible-core
cat << EOF > /etc/systemd/system/local-webserver-yum.service
[Unit]
Description=local-webserver-yum
[Service]
User=root
WorkingDirectory=/data/yum.repos
ExecStart=/bin/bash -c 'python3 -m http.server 5000'
Restart=always
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now local-webserver-yum.service
cat << EOF > /etc/yum.repos.d/wzh.repo
[BaseOS]
name=BaseOS
baseurl=http://192.168.10.10:5000/BaseOS
enabled=1
gpgcheck=0
[AppStream]
name=AppStream
baseurl=http://192.168.10.10:5000/AppStream
enabled=1
gpgcheck=0
[epel-fix]
name=epel-fix
baseurl=http://192.168.10.10:5000/epel-fix
enabled=1
gpgcheck=0
EOF
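A quick check that the repos resolve through the local web server:
dnf clean all
dnf repolist
# BaseOS and AppStream should be listed; epel-fix will work once its content is synced in the next section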
3.4. download installation media
we will download the installation media on VPS and sync it to helper node.
3.4.1. on a VPS with vultr
# on a vultr
dnf install -y createrepo_c
# add your ocp pull secret, the content can be downloaded from the redhat portal
SEC_FILE='/data/pull-secret.json'
cat << 'EOF' > $SEC_FILE
{"auths":xxxxxxxxxxxxxxxxxxxxxxxxxxx
EOF
SEC_FILE="$HOME/.docker/config.json"
mkdir -p ${SEC_FILE%/*}
cat << 'EOF' > $SEC_FILE
{"auths":xxxxxxxxxxxxxxxxxxxxxxxxxxx
EOF
/bin/rm -rf /data/ocp4
/bin/rm -rf /data/ocp4/tmp/
mkdir -p /data/ocp4/tmp/
cd /data/ocp4/tmp/
# export http_proxy="http://127.0.0.1:18801"
# export https_proxy=${http_proxy}
git clone https://github.com/wangzheng422/openshift4-shell
# unset http_proxy
# unset https_proxy
cd /data/ocp4/tmp/openshift4-shell
git checkout ocp-4.12
# git pull origin ocp-${var_major_version}
/bin/cp -rf /data/ocp4/tmp/openshift4-shell/* /data/ocp4/
/bin/rm -rf /data/ocp4/tmp/
mkdir -p /data/ocp4/container.images
cd /data/ocp4/container.images
podman pull registry.access.redhat.com/ubi8/pause:8.7-6
podman save registry.access.redhat.com/ubi8/pause:8.7-6 | pigz -c > pause.tgz
cd /data/ocp4/
bash helper.node.client.sh -v 4.12.16
tar -xzf /data/ocp-4.12.16/oc-mirror.tar.gz -C /usr/local/bin/
chmod +x /usr/local/bin/oc-mirror
cat > /data/ocp4/mirror.yaml << EOF
apiVersion: mirror.openshift.io/v1alpha2
kind: ImageSetConfiguration
# archiveSize: 4
mirror:
platform:
architectures:
- amd64
# - arm64
channels:
- name: stable-4.12
type: ocp
minVersion: 4.12.16
maxVersion: 4.12.16
shortestPath: true
graph: false
additionalImages:
- name: registry.redhat.io/redhat/redhat-operator-index:v4.12
- name: registry.redhat.io/redhat/certified-operator-index:v4.12
- name: registry.redhat.io/redhat/community-operator-index:v4.12
- name: registry.redhat.io/redhat/redhat-marketplace-index:v4.12
- name: quay.io/openshift/origin-kube-rbac-proxy:latest
- name: quay.io/wangzheng422/debug-pod:alma-9.1
# operators:
# - catalog: registry.redhat.io/redhat/redhat-operator-index:v4.10
# packages:
# - name: cluster-logging
# channels:
# - name: stable
# minVersion: 5.6.3
# - name: elasticsearch-operator
# channels:
# - name: stable
# minVersion: 5.6.3
# - name: jaeger-product
# channels:
# - name: stable
# minVersion: 1.39.0-3
# - name: kubernetes-nmstate-operator
# channels:
# - name: stable
# minVersion: 4.10.0-202303022128
# - name: odf-operator
# channels:
# - name: stable-4.10
# minVersion: 4.10.11
# - name: sriov-network-operator
# channels:
# - name: stable
# minVersion: 4.10.0-202302280915
# - name: kubevirt-hyperconverged
# channels:
# - name: stable
# minVersion: 4.10.8
EOF
mkdir -p /data/ocp-install/oc-mirror/
cd /data/ocp-install/oc-mirror/
oc-mirror --config /data/ocp4/mirror.yaml file:///data/ocp-install/oc-mirror/
# sync back to demo lab jumpbox
cd /data
rsync -P -arz /data/ocp4 root@10.229.104.55:/home/wzh/
rsync -P -arz /data/ocp-4.12.16 root@10.229.104.55:/home/wzh/
rsync -P -arz /data/ocp-install root@10.229.104.55:/home/wzh/
3.4.2. on helper vm node
sync back from demo lab jumpbox
# on helper vm node
rsync -P -arz root@192.168.10.90:/home/wzh/* /data/
mkdir -p /data/yum.repos/epel-fix
rsync -P -arz /data/ocp4/rpms/* /data/yum.repos/epel-fix/
3.5. automatic setup power dns
Set up pdns using an ansible playbook. Red Hatters have built several ansible projects to help deploy openshift; our ansible playbook reuses some scripts from them.
dnf install -y ansible-core
cd /data/ocp4/ansible-helper
cat > var.yaml << EOF
helper:
ip_addr: 192.168.10.10
nic: enp1s0
pdns:
bind: 0.0.0.0
port: 53
recursor_port: 5301
# forward: 172.21.1.1
static:
- base_domain: demolab-infra.wzhlab.top
record:
- name: registry
ip_addr: 192.168.10.10
- name: quay
ip_addr: 192.168.10.10
ntp:
server: 192.168.10.10
  # the records below are not needed after ocp-4.12 for the agent based installer,
  # because coredns and haproxy moved to static pods,
  # and they are configured to support local resolution and redirection;
  # keep them here for legacy compatibility
cluster:
- base_domain: demolab-ocp.wzhlab.top
node:
- ip_addr: 192.168.10.21
name: master-01
- ip_addr: 192.168.10.22
name: master-02
- ip_addr: 192.168.10.23
name: master-03
- ip_addr: 192.168.10.31
name: infra-01
- ip_addr: 192.168.10.32
name: infra-02
- ip_addr: 192.168.10.33
name: infra-03
- ip_addr: 192.168.10.41
name: worker-01
- ip_addr: 192.168.10.42
name: worker-02
- ip_addr: 192.168.10.51
name: scale-01
- ip_addr: 192.168.10.52
name: scale-02
- ip_addr: 192.168.10.53
name: scale-03
api:
- ip_addr: 192.168.10.11
api_int:
- ip_addr: 192.168.10.11
apps:
- ip_addr: 192.168.12.12
ptr:
- addr: 192.168.10
domain: ptr01.wzhlab.top
EOF
cd /data/ocp4/ansible-helper
# ansible-playbook -vvv -e @var.yaml helper.yaml
ansible-playbook -e @var.yaml helper.yaml
Also configure public dns records if your workstation's dns does not point to our helper node's powerdns.
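To verify the records resolve through the helper's powerdns (names and addresses are the ones defined in var.yaml above; assuming dig from bind-utils is installed):
dig +short quay.demolab-infra.wzhlab.top @192.168.10.10
# 192.168.10.10
dig +short api.demolab-ocp.wzhlab.top @192.168.10.10
# 192.168.10.11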
3.6. create ca key and crt
# on helper vm
mkdir -p /etc/crts/ && cd /etc/crts
# https://access.redhat.com/documentation/en-us/red_hat_codeready_workspaces/2.1/html/installation_guide/installing-codeready-workspaces-in-tls-mode-with-self-signed-certificates_crw
openssl genrsa -out /etc/crts/wzhlab.top.ca.key 4096
openssl req -x509 \
-new -nodes \
-key /etc/crts/wzhlab.top.ca.key \
-sha256 \
-days 36500 \
-out /etc/crts/wzhlab.top.ca.crt \
-subj /CN="Local wzh lab Signer" \
-reqexts SAN \
-extensions SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf '[SAN]\nbasicConstraints=critical, CA:TRUE\nkeyUsage=keyCertSign, cRLSign, digitalSignature'))
openssl genrsa -out /etc/crts/wzhlab.top.key 2048
openssl req -new -sha256 \
-key /etc/crts/wzhlab.top.key \
-subj "/O=Local wzh lab /CN=*.demolab-infra.wzhlab.top" \
-reqexts SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf "\n[SAN]\nsubjectAltName=DNS:*.demolab-infra.wzhlab.top,DNS:*.demolab-ocp.wzhlab.top,DNS:*.wzhlab.top\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth")) \
-out /etc/crts/wzhlab.top.csr
openssl x509 \
-req \
-sha256 \
-extfile <(printf "subjectAltName=DNS:*.demolab-infra.wzhlab.top,DNS:*.demolab-ocp.wzhlab.top,DNS:*.wzhlab.top\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth") \
-days 36500 \
-in /etc/crts/wzhlab.top.csr \
-CA /etc/crts/wzhlab.top.ca.crt \
-CAkey /etc/crts/wzhlab.top.ca.key \
-CAcreateserial -out /etc/crts/wzhlab.top.crt
openssl x509 -in /etc/crts/wzhlab.top.crt -text
/bin/cp -f /etc/crts/wzhlab.top.ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
3.7. setup image registry
# https://docs.openshift.com/container-platform/4.12/installing/disconnected_install/installing-mirroring-creating-registry.html
ssh-copy-id root@192.168.10.10
podman load -i /data/ocp4/container.images/pause.tgz
mkdir -p /data/quay
cd /data/ocp4/clients
tar zvxf mirror-registry.tar.gz
# replace the xxxxxx with your password
./mirror-registry install -v \
--initPassword xxxxxx --initUser admin \
-k ~/.ssh/id_rsa \
--quayHostname quay.demolab-infra.wzhlab.top --quayRoot /data/quay \
--targetHostname quay.demolab-infra.wzhlab.top \
--sslKey /etc/crts/wzhlab.top.key --sslCert /etc/crts/wzhlab.top.crt
# ......
# PLAY RECAP ****************************************************************************************************************************************************************
# root@quay.demolab-infra.wzhlab.top : ok=48 changed=26 unreachable=0 failed=0 skipped=19 rescued=0 ignored=0
# INFO[2023-05-25 13:04:43] Quay installed successfully, config data is stored in /data/quay
# INFO[2023-05-25 13:04:43] Quay is available at https://quay.demolab-infra.wzhlab.top:8443 with credentials (admin, xxxxxx)
podman pod ps
# POD ID NAME STATUS CREATED INFRA ID # OF CONTAINERS
# 5afa94fc84fc quay-pod Running 9 minutes ago b911a67bf5cb 4
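Before importing anything, it is worth confirming that the registry answers and that the self-signed CA is trusted; the admin password is the one passed to --initPassword:
podman login -u admin -p xxxxxx quay.demolab-infra.wzhlab.top:8443
# Login Succeeded!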
# import installation media into quay
mkdir -p $HOME/.local/bin
cat << 'EOF' >> ~/.bash_profile
PATH=$HOME/.local/bin:$PATH
export PATH
EOF
export BUILDNUMBER=4.12.16
pushd /data/ocp-${BUILDNUMBER}
tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C ~/.local/bin/
tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C ~/.local/bin/
tar -xzf oc-mirror.tar.gz -C ~/.local/bin/
chmod +x ~/.local/bin/oc-mirror
/bin/cp -f openshift-baremetal-install ~/.local/bin/
popd
SEC_FILE="$HOME/.docker/config.json"
mkdir -p ${SEC_FILE%/*}
cat << 'EOF' > $SEC_FILE
{"auths":xxxxxxxxxxxxxxxxxxxxxxxxxxx
EOF
mkdir -p /data/wzh.work
cd /data/wzh.work
oc-mirror --from=/data/ocp-install/oc-mirror/mirror_seq1_000000.tar \
docker://quay.demolab-infra.wzhlab.top:8443
After the import, you can check the result from the web console. As you can see, several repositories have been created.
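You can also check locally: oc-mirror writes the generated manifests (imageContentSourcePolicy.yaml, plus catalog sources when operators are mirrored) into a results directory under the working directory it was run from, /data/wzh.work in this case:
ls /data/wzh.work/oc-mirror-workspace/results-*/
# imageContentSourcePolicy.yaml ...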
4. install 3 master compact cluster
All the dependency services are installed and ready; now we will start to install the 3-master compact cluster. We begin with the 3-node compact cluster, then demo scaling out 3 kvm worker nodes, adding 3 infra nodes and 2 baremetal worker nodes.
4.1. config on helper node
# create a user to hold the config env for the new ocp cluster
useradd -m 3node
usermod -aG wheel 3node
echo -e "%wheel\tALL=(ALL)\tNOPASSWD: ALL" > /etc/sudoers.d/020_sudo_for_me
su - 3node
ssh-keygen
cat << EOF > ~/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
chmod 600 ~/.ssh/config
cat << 'EOF' >> ~/.bashrc
export BASE_DIR='/home/3node/'
EOF
export BUILDNUMBER=4.12.16
mkdir -p ~/.local/bin
pushd /data/ocp-${BUILDNUMBER}
tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C ~/.local/bin/
tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C ~/.local/bin/
install -m 755 /data/ocp4/clients/butane-amd64 ~/.local/bin/butane
install -m 755 /data/ocp4/clients/coreos-installer_amd64 ~/.local/bin/coreos-installer
popd
export BUILDNUMBER=4.12.16
mkdir -p ${BASE_DIR}/data/{sno/disconnected,install}
# set some parameters of your cluster
NODE_SSH_KEY="$(cat ${BASE_DIR}/.ssh/id_rsa.pub)"
INSTALL_IMAGE_REGISTRY=quay.demolab-infra.wzhlab.top:8443
# update the xxxxxx with your password for the image registry
PULL_SECRET='{"auths":{"registry.redhat.io": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"'${INSTALL_IMAGE_REGISTRY}'": {"auth": "'$( echo -n 'admin:xxxxxx' | openssl base64 )'","email": "noemail@localhost"}}}'
NTP_SERVER=192.168.10.10
# HELP_SERVER=192.168.7.11
# KVM_HOST=192.168.7.11
API_VIP=192.168.10.11
INGRESS_VIP=192.168.10.12
# CLUSTER_PROVISION_IP=192.168.7.103
# BOOTSTRAP_IP=192.168.7.12
# define the node info for the cluster
SNO_CLUSTER_NAME=demolab-ocp
SNO_BASE_DOMAIN=wzhlab.top
# BOOTSTRAP_IP=192.168.77.42
MASTER_01_IP=192.168.10.21
MASTER_02_IP=192.168.10.22
MASTER_03_IP=192.168.10.23
# BOOTSTRAP_IPv6=fd03::42
MASTER_01_IPv6=fd03::21
MASTER_02_IPv6=fd03::22
MASTER_03_IPv6=fd03::23
# BOOTSTRAP_HOSTNAME=bootstrap-demo
MASTER_01_HOSTNAME=master-01
MASTER_02_HOSTNAME=master-02
MASTER_03_HOSTNAME=master-03
# BOOTSTRAP_INTERFACE=enp1s0
MASTER_01_INTERFACE=enp1s0
MASTER_02_INTERFACE=enp1s0
MASTER_03_INTERFACE=enp1s0
MASTER_01_INTERFACE_MAC=52:54:00:13:A1:21
MASTER_02_INTERFACE_MAC=52:54:00:13:A1:22
MASTER_03_INTERFACE_MAC=52:54:00:13:A1:23
# BOOTSTRAP_DISK=/dev/vda
MASTER_01_DISK=/dev/vda
MASTER_02_DISK=/dev/vda
MASTER_03_DISK=/dev/vda
OCP_GW=192.168.10.10
OCP_NETMASK=255.255.255.0
OCP_NETMASK_S=24
OCP_DNS=192.168.10.10
OCP_GW_v6=fd03::10
OCP_NETMASK_v6=64
# echo ${SNO_IF_MAC} > /data/sno/sno.mac
mkdir -p ${BASE_DIR}/data/install
cd ${BASE_DIR}/data/install
/bin/rm -rf *.ign .openshift_install_state.json auth bootstrap manifests master*[0-9] worker*[0-9] *
cat << EOF > ${BASE_DIR}/data/install/install-config.yaml
apiVersion: v1
baseDomain: $SNO_BASE_DOMAIN
compute:
- name: worker
replicas: 0
controlPlane:
name: master
replicas: 3
metadata:
name: $SNO_CLUSTER_NAME
networking:
# OVNKubernetes , OpenShiftSDN
clusterNetwork:
- cidr: 172.21.0.0/16
hostPrefix: 23
# - cidr: fd02::/48
# hostPrefix: 64
machineNetwork:
- cidr: 192.168.10.0/24
# - cidr: 2001:DB8::/32
serviceNetwork:
- 172.22.0.0/16
# - fd03::/112
platform:
baremetal:
apiVIPs:
- $API_VIP
# - 2001:DB8::4
ingressVIPs:
- $INGRESS_VIP
# - 2001:DB8::5
pullSecret: '${PULL_SECRET}'
sshKey: |
$( cat ${BASE_DIR}/.ssh/id_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /etc/crts/wzhlab.top.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/openshift/release-images
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/openshift/release
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF
cat << EOF > ${BASE_DIR}/data/install/agent-config.yaml
apiVersion: v1alpha1
kind: AgentConfig
metadata:
name: $SNO_CLUSTER_NAME
rendezvousIP: $MASTER_01_IP
additionalNTPSources:
- $NTP_SERVER
hosts:
- hostname: $MASTER_01_HOSTNAME
role: master
rootDeviceHints:
deviceName: "$MASTER_01_DISK"
interfaces:
- name: $MASTER_01_INTERFACE
macAddress: $MASTER_01_INTERFACE_MAC
networkConfig:
interfaces:
- name: $MASTER_01_INTERFACE
type: ethernet
state: up
mac-address: $MASTER_01_INTERFACE_MAC
ipv4:
enabled: true
address:
- ip: $MASTER_01_IP
prefix-length: $OCP_NETMASK_S
dhcp: false
dns-resolver:
config:
server:
- $OCP_DNS
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: $OCP_GW
next-hop-interface: $MASTER_01_INTERFACE
table-id: 254
- hostname: $MASTER_02_HOSTNAME
role: master
rootDeviceHints:
deviceName: "$MASTER_02_DISK"
interfaces:
- name: $MASTER_02_INTERFACE
macAddress: $MASTER_02_INTERFACE_MAC
networkConfig:
interfaces:
- name: $MASTER_02_INTERFACE
type: ethernet
state: up
mac-address: $MASTER_02_INTERFACE_MAC
ipv4:
enabled: true
address:
- ip: $MASTER_02_IP
prefix-length: $OCP_NETMASK_S
dhcp: false
dns-resolver:
config:
server:
- $OCP_DNS
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: $OCP_GW
next-hop-interface: $MASTER_02_INTERFACE
table-id: 254
- hostname: $MASTER_03_HOSTNAME
role: master
rootDeviceHints:
deviceName: "$MASTER_03_DISK"
interfaces:
- name: $MASTER_03_INTERFACE
macAddress: $MASTER_03_INTERFACE_MAC
networkConfig:
interfaces:
- name: $MASTER_03_INTERFACE
type: ethernet
state: up
mac-address: $MASTER_03_INTERFACE_MAC
ipv4:
enabled: true
address:
- ip: $MASTER_03_IP
prefix-length: $OCP_NETMASK_S
dhcp: false
dns-resolver:
config:
server:
- $OCP_DNS
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: $OCP_GW
next-hop-interface: $MASTER_03_INTERFACE
table-id: 254
EOF
/bin/cp -f ${BASE_DIR}/data/install/install-config.yaml ${BASE_DIR}/data/install/install-config.yaml.bak
/bin/cp -f ${BASE_DIR}/data/install/agent-config.yaml ${BASE_DIR}/data/install/agent-config.yaml.bak
openshift-install --dir=${BASE_DIR}/data/install agent create cluster-manifests
sudo bash -c "/bin/cp -f mirror/registries.conf /etc/containers/registries.conf.d/; chmod +r /etc/containers/registries.conf.d/*"
mkdir -p ${BASE_DIR}/data/install/openshift/
# this is used to copy ntp config for ocp
# but not used anymore for agent based install mode
# /bin/cp -f /data/ocp4/ansible-helper/files/* ${BASE_DIR}/data/install/openshift/
sudo bash -c "cd /data/ocp4 ; bash image.registries.conf.sh quay.demolab-infra.wzhlab.top:8443 ;"
/bin/cp -f /data/ocp4/99-worker-container-registries.yaml ${BASE_DIR}/data/install/openshift/
/bin/cp -f /data/ocp4/99-master-container-registries.yaml ${BASE_DIR}/data/install/openshift/
cd ${BASE_DIR}/data/install/
# openshift-install --dir=${BASE_DIR}/data/install create ignition-configs
mkdir -p ~/.cache/agent/image_cache/
/bin/cp -f /data/ocp-$BUILDNUMBER/rhcos-live.x86_64.iso ~/.cache/agent/image_cache/coreos-x86_64.iso
openshift-install --dir=${BASE_DIR}/data/install agent create image --log-level=debug
# ......
# DEBUG Fetching image from OCP release (oc adm release info --image-for=machine-os-images --insecure=true --icsp-file=/tmp/icsp-file3636774741 quay.io/openshift-release-dev/ocp-release@sha256:96bf74ce789ccb22391deea98e0c5050c41b67cc17defbb38089d32226dba0b8)
# DEBUG The file was found in cache: /home/3node/.cache/agent/image_cache/coreos-x86_64.iso
# INFO Verifying cached file
# DEBUG extracting /coreos/coreos-x86_64.iso.sha256 to /tmp/cache1876698393, oc image extract --path /coreos/coreos-x86_64.iso.sha256:/tmp/cache1876698393 --confirm --icsp-file=/tmp/icsp-file455852761 quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:052130abddf741195b6753888cf8a00757dedeb7010f7d4dcc4b842b5bc705f6
# ......
# we will add another user for debugging
# DO NOT USE IN PRODUCTION
coreos-installer iso ignition show agent.x86_64.iso > ignition.ign
# HTTP_PATH=http://192.168.7.11:8080/ignition
source /data/ocp4/acm.fn.sh
# we will create a user 'wzh' with password 'redhat', so that on first boot you can
# log in directly from the console/ssh with username and password,
# which is handy for troubleshooting and research
VAR_PWD_HASH="$(python3 -c 'import crypt,getpass; print(crypt.crypt("redhat"))')"
cat ${BASE_DIR}/data/install/ignition.ign \
| jq --arg VAR "$VAR_PWD_HASH" --arg VAR_SSH "$NODE_SSH_KEY" '.passwd.users += [{ "name": "wzh", "system": true, "passwordHash": $VAR , "sshAuthorizedKeys": [ $VAR_SSH ], "groups": [ "adm", "wheel", "sudo", "systemd-journal" ] }]' \
| jq -c . \
> ${BASE_DIR}/data/install/ignition-iso.ign
coreos-installer iso ignition embed -f -i ignition-iso.ign agent.x86_64.iso
# VAR_IMAGE_VER=rhcos-410.86.202303200936-AnolisOS-0-live.x86_64.iso
4.2. boot 3 kvm for master node
# on helper node
# copy back the iso to baremetal 97
scp /home/3node/data/install/agent.x86_64.iso root@192.168.10.90:/home/wzh.iso/
# on baremetal 97
# cleanup
virsh destroy ocp4-master-01
virsh undefine ocp4-master-01
/bin/rm -f /image/ocp4-master-01.qcow2
virsh destroy ocp4-master-02
virsh undefine ocp4-master-02
/bin/rm -f /image/ocp4-master-02.qcow2
virsh destroy ocp4-master-03
virsh undefine ocp4-master-03
/bin/rm -f /image/ocp4-master-03.qcow2
SNO_MEM=48
virsh destroy ocp4-master-01
virsh undefine ocp4-master-01
virt-install --name=ocp4-master-01 --vcpus=12 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/image/ocp4-master-01.qcow2,bus=virtio,size=120 \
--os-variant rhel8.3 \
--network bridge=br-int,model=virtio,mac=52:54:00:13:A1:21 \
--graphics vnc,port=59021 --noautoconsole \
--boot menu=on --cdrom /home/wzh.iso/agent.x86_64.iso
virsh destroy ocp4-master-02
virsh undefine ocp4-master-02
virt-install --name=ocp4-master-02 --vcpus=12 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/image/ocp4-master-02.qcow2,bus=virtio,size=120 \
--os-variant rhel8.3 \
--network bridge=br-int,model=virtio,mac=52:54:00:13:A1:22 \
--graphics vnc,port=59022 --noautoconsole \
--boot menu=on --cdrom /home/wzh.iso/agent.x86_64.iso
virsh destroy ocp4-master-03
virsh undefine ocp4-master-03
virt-install --name=ocp4-master-03 --vcpus=12 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/image/ocp4-master-03.qcow2,bus=virtio,size=120 \
--os-variant rhel8.3 \
--network bridge=br-int,model=virtio,mac=52:54:00:13:A1:23 \
--graphics vnc,port=59023 --noautoconsole \
--boot menu=on --cdrom /home/wzh.iso/agent.x86_64.iso
The vm will reboot. On the first reboot, the kvm will not power on again after powering off; keep an eye on the kvm manager and start it manually.
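A small helper on the baremetal host saves some clicking; it simply starts any master domain that is not running (a sketch, run it again until all three are up):
for i in ocp4-master-0{1..3}
do
  virsh domstate $i | grep -q running || virsh start $i
done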
4.3. wait and check the result
cd ${BASE_DIR}/data/install
export KUBECONFIG=${BASE_DIR}/data/install/auth/kubeconfig
echo "export KUBECONFIG=${BASE_DIR}/data/install/auth/kubeconfig" >> ~/.bashrc
# oc completion bash | sudo tee /etc/bash_completion.d/openshift > /dev/null
cd ${BASE_DIR}/data/install
openshift-install --dir=${BASE_DIR}/data/install agent wait-for bootstrap-complete \
--log-level=debug
# DEBUG Host master-02 validation: Host subnets are not overlapping
# DEBUG Host master-02 validation: cnv is disabled
# DEBUG Host master-02 validation: lso is disabled
# DEBUG Host master-02 validation: lvm is disabled
# DEBUG Host master-02 validation: odf is disabled
# INFO Host: master-03, reached installation stage Done
# INFO Host: master-01, reached installation stage Waiting for controller: waiting for controller pod ready event
# INFO Bootstrap configMap status is complete
# INFO cluster bootstrap is complete
# if for some reason master-01 is pending approval to join the cluster,
# approve the csr to add master-01 back.
# you should not need the commands below in the normal case.
oc get csr
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
cd ${BASE_DIR}/data/install
openshift-install --dir=${BASE_DIR}/data/install agent wait-for install-complete
# INFO Bootstrap Kube API Initialized
# INFO Bootstrap configMap status is complete
# INFO cluster bootstrap is complete
# INFO Cluster is installed
# INFO Install complete!
# INFO To access the cluster as the system:admin user when using 'oc', run
# INFO export KUBECONFIG=/home/3node/data/install/auth/kubeconfig
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.demolab-ocp.wzhlab.top
# INFO Login to the console with user: "kubeadmin", and password: "jxjb8-PPkX5-4WF78-5w8eL"
# customize registry config for quay
# oc patch mcp/master --patch '{"spec":{"paused":true}}' --type=merge
# oc patch mcp/worker --patch '{"spec":{"paused":true}}' --type=merge
# oc create -f ${BASE_DIR}/data/install/99-worker-container-registries.yaml
# oc create -f ${BASE_DIR}/data/install/99-master-container-registries.yaml
# oc patch mcp/master --patch '{"spec":{"paused":false}}' --type=merge
# oc patch mcp/worker --patch '{"spec":{"paused":false}}' --type=merge
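With the install finished, a quick sanity check of the cluster from the helper, using standard oc commands:
oc get node
oc get clusterversion
oc get co
# all cluster operators should be Available=True, Progressing=False, Degraded=False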
5. scale out 3 kvm worker nodes
We will build 3 kvm worker nodes and let openshift scale out onto them, so we can demo the openshift scale-out and scale-in functions.
The lab BM's bmc is connected to br-mgmt, and br-int cannot route to br-mgmt, so metal3's pod cannot reach the bmc/idrac to insert the boot image; that is why we have to demo the scale-out function using kvm.
5.1. config on host server
# on baremetal 97
mkdir -p /home/wzh.work
# cleanup
virsh destroy ocp4-scale-01
virsh undefine ocp4-scale-01
/bin/rm -f /image/ocp4-scale-01.qcow2
virsh destroy ocp4-scale-02
virsh undefine ocp4-scale-02
/bin/rm -f /image/ocp4-scale-02.qcow2
virsh destroy ocp4-scale-03
virsh undefine ocp4-scale-03
/bin/rm -f /image/ocp4-scale-03.qcow2
# define scale worker
SNO_MEM=48
virsh destroy ocp4-scale-01
virsh undefine ocp4-scale-01
virt-install --name=ocp4-scale-01 --vcpus=12 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/image/ocp4-scale-01.qcow2,bus=virtio,size=100 \
--os-variant rhel8.3 \
--network bridge=br-int,model=virtio,mac=52:54:00:13:A1:51 \
--graphics vnc,port=59051 --noautoconsole \
--print-xml > /home/wzh.work/ocp4-scale-01.xml
virsh define --file /home/wzh.work/ocp4-scale-01.xml
virsh destroy ocp4-scale-02
virsh undefine ocp4-scale-02
virt-install --name=ocp4-scale-02 --vcpus=12 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/image/ocp4-scale-02.qcow2,bus=virtio,size=100 \
--os-variant rhel8.3 \
--network bridge=br-int,model=virtio,mac=52:54:00:13:A1:52 \
--graphics vnc,port=59052 --noautoconsole \
--print-xml > /home/wzh.work/ocp4-scale-02.xml
virsh define --file /home/wzh.work/ocp4-scale-02.xml
virsh destroy ocp4-scale-03
virsh undefine ocp4-scale-03
virt-install --name=ocp4-scale-03 --vcpus=12 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/image/ocp4-scale-03.qcow2,bus=virtio,size=100 \
--os-variant rhel8.3 \
--network bridge=br-int,model=virtio,mac=52:54:00:13:A1:53 \
--graphics vnc,port=59053 --noautoconsole \
--print-xml > /home/wzh.work/ocp4-scale-03.xml
virsh define --file /home/wzh.work/ocp4-scale-03.xml
# setup and start bmc simulator for kvm
dnf -y install python3-pip
python3 -m pip install --upgrade pip --user
pip3 install --user sushy-tools
mkdir -p /etc/crts
scp root@192.168.10.10:/etc/crts/* /etc/crts/
# /root/.local/bin/sushy-emulator -i 0.0.0.0 --ssl-certificate /etc/crts/redhat.ren.crt --ssl-key /etc/crts/redhat.ren.key
# try to deploy as systemd service
cat << EOF > /etc/systemd/system/sushy-emulator.service
[Unit]
Description=sushy-emulator
[Service]
User=root
WorkingDirectory=/root
ExecStart=/bin/bash -c '/root/.local/bin/sushy-emulator -i 0.0.0.0 --ssl-certificate /etc/crts/wzhlab.top.crt --ssl-key /etc/crts/wzhlab.top.key'
Restart=always
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now sushy-emulator.service
# collect mac and vm info for helper
# on helper clean all
# /bin/rm -f /data/install/mac.list.*
# /bin/rm -f /data/install/vm.list.*
# back on the kvm host (97)
cd /home/wzh.work
for i in ocp4-scale-0{1..3}
do
echo -ne "${i}\t" ;
virsh dumpxml ${i} | grep "mac address" | cut -d\' -f2 | tr '\n' '\t'
echo
done > mac.list.97
cat /home/wzh.work/mac.list.97
# ocp4-scale-01 52:54:00:13:a1:51
# ocp4-scale-02 52:54:00:13:a1:52
# ocp4-scale-03 52:54:00:13:a1:53
cat << 'EOF' > redfish.sh
#!/usr/bin/env bash
curl -k -s https://127.0.0.1:8000/redfish/v1/Systems/ | jq -r '.Members[]."@odata.id"' > list
while read -r line; do
curl -k -s https://127.0.0.1:8000/$line | jq -j '.Id, " ", .Name, "\n" '
done < list
EOF
bash redfish.sh | grep ocp4-scale > /home/wzh.work/vm.list.97
cat /home/wzh.work/vm.list.97
# e0113aa6-1465-40da-9128-ae9087c76924 ocp4-scale-02
# 97d16a4b-fae3-43b7-bd5b-711c83cf840f ocp4-scale-01
# 25dda43c-fb42-4ac8-bea1-46c46635e7fa ocp4-scale-03
scp /home/wzh.work/{mac,vm}.list.* root@192.168.10.10:/home/3node/data/install/
cat > /home/wzh.work/crack.txt << 'EOF'
chown 3node: /home/3node/data/install/*
EOF
ssh root@192.168.10.10 < /home/wzh.work/crack.txt
5.2. config on helper node
# on helper node
cd ${BASE_DIR}/data/install/
cat << EOF > ${BASE_DIR}/data/install/bmh-01.yaml
# below is for ocp4-scale-01
---
apiVersion: v1
kind: Secret
metadata:
name: scale-01-bmc-secret
type: Opaque
data:
username: $(echo -ne "admin" | base64)
password: $(echo -ne "password" | base64)
---
apiVersion: v1
kind: Secret
metadata:
name: ocp4-scale-01-network-config-secret
type: Opaque
stringData:
nmstate: |
dns-resolver:
config:
server:
- 192.168.10.10
interfaces:
- ipv4:
address:
- ip: 192.168.10.51
prefix-length: 24
dhcp: false
enabled: true
name: enp1s0
state: up
type: ethernet
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: 192.168.10.10
next-hop-interface: enp1s0
table-id: 254
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: ocp4-scale-01
spec:
online: false
bootMode: legacy
# externallyProvisioned: true
# hardwareProfile: unknown
bootMACAddress: $(cat ${BASE_DIR}/data/install/mac.list.* | grep ocp4-scale-01 | awk '{print $2}')
bmc:
address: redfish-virtualmedia://192.168.10.90:8000/redfish/v1/Systems/$(cat ${BASE_DIR}/data/install/vm.list.* | grep ocp4-scale-01 | awk '{print $1}')
credentialsName: scale-01-bmc-secret
disableCertificateVerification: true
rootDeviceHints:
deviceName: /dev/vda
preprovisioningNetworkDataName: ocp4-scale-01-network-config-secret
# below is for ocp4-scale-02
---
apiVersion: v1
kind: Secret
metadata:
name: scale-02-bmc-secret
type: Opaque
data:
username: $(echo -ne "admin" | base64)
password: $(echo -ne "password" | base64)
---
apiVersion: v1
kind: Secret
metadata:
name: ocp4-scale-02-network-config-secret
type: Opaque
stringData:
nmstate: |
dns-resolver:
config:
server:
- 192.168.10.10
interfaces:
- ipv4:
address:
- ip: 192.168.10.52
prefix-length: 24
dhcp: false
enabled: true
name: enp1s0
state: up
type: ethernet
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: 192.168.10.10
next-hop-interface: enp1s0
table-id: 254
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: ocp4-scale-02
spec:
online: false
bootMode: legacy
# externallyProvisioned: true
# hardwareProfile: unknown
bootMACAddress: $(cat ${BASE_DIR}/data/install/mac.list.* | grep ocp4-scale-02 | awk '{print $2}')
bmc:
address: redfish-virtualmedia://192.168.10.90:8000/redfish/v1/Systems/$(cat ${BASE_DIR}/data/install/vm.list.* | grep ocp4-scale-02 | awk '{print $1}')
credentialsName: scale-02-bmc-secret
disableCertificateVerification: true
rootDeviceHints:
deviceName: /dev/vda
preprovisioningNetworkDataName: ocp4-scale-02-network-config-secret
# below is for ocp4-scale-03
---
apiVersion: v1
kind: Secret
metadata:
name: scale-03-bmc-secret
type: Opaque
data:
username: $(echo -ne "admin" | base64)
password: $(echo -ne "password" | base64)
---
apiVersion: v1
kind: Secret
metadata:
name: ocp4-scale-03-network-config-secret
type: Opaque
stringData:
nmstate: |
dns-resolver:
config:
server:
- 192.168.10.10
interfaces:
- ipv4:
address:
- ip: 192.168.10.53
prefix-length: 24
dhcp: false
enabled: true
name: enp1s0
state: up
type: ethernet
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: 192.168.10.10
next-hop-interface: enp1s0
table-id: 254
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: ocp4-scale-03
spec:
online: false
bootMode: legacy
# externallyProvisioned: true
# hardwareProfile: unknown
bootMACAddress: $(cat ${BASE_DIR}/data/install/mac.list.* | grep ocp4-scale-03 | awk '{print $2}')
bmc:
address: redfish-virtualmedia://192.168.10.90:8000/redfish/v1/Systems/$(cat ${BASE_DIR}/data/install/vm.list.* | grep ocp4-scale-03 | awk '{print $1}')
credentialsName: scale-03-bmc-secret
disableCertificateVerification: true
rootDeviceHints:
deviceName: /dev/vda
preprovisioningNetworkDataName: ocp4-scale-03-network-config-secret
EOF
oc -n openshift-machine-api create -f ${BASE_DIR}/data/install/bmh-01.yaml
After applying the baremetal host config, the kvm boots and ocp detects the machine hardware config using ironic. The kvm is powered off after the inspection finishes.
Then you can see the baremetal host is ready to provision.
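A minimal CLI check (a sketch; the namespace is the one used above) to watch the BareMetalHost go from registering, through inspecting, to available:
# watch the BareMetalHost state until it reports 'available'
oc -n openshift-machine-api get bmh
oc -n openshift-machine-api get bmh ocp4-scale-01 -o jsonpath='{.status.provisioning.state}{"\n"}'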
5.3. scale out and check the result
Find the machineset entry in the web console and set the machine count to '1' (see the CLI sketch below).
You will see the kvm booting and being provisioned as a worker node.
After some time the worker node is provisioned, and you can see it in the CLI.
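If you prefer the CLI over the web console, a rough equivalent looks like this; the machineset name is hypothetical, so list yours first:
# list the machinesets, then scale the one backing the scale-out workers
oc -n openshift-machine-api get machineset
oc -n openshift-machine-api scale machineset <your-machineset-name> --replicas=1
# watch the machine appear, then check the nodes as below
oc -n openshift-machine-api get machine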
oc get node
# NAME STATUS ROLES AGE VERSION
# master-01 Ready control-plane,master,worker 3h19m v1.25.8+37a9a08
# master-02 Ready control-plane,master,worker 4h1m v1.25.8+37a9a08
# master-03 Ready control-plane,master,worker 4h3m v1.25.8+37a9a08
# scale-01.demolab-ocp.wzhlab.top Ready worker 3m55s v1.25.8+37a9a08
You can find the newly provisioned baremetal host in the web console.
You can also find the new machine in the web console.
And you can find the new node in the web console.
5.4. scale in and check the result
Scaling in is very simple: open the machineset config and decrease the number (or scale it via the CLI as below).
Then the vm is powered off, and the CRs, such as the machine and the node, are deleted.
You can confirm that in the CLI.
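The CLI equivalent is just scaling the same (hypothetical) machineset back down:
oc -n openshift-machine-api scale machineset <your-machineset-name> --replicas=0
oc -n openshift-machine-api get machine
oc -n openshift-machine-api get bmh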
oc get node
# NAME STATUS ROLES AGE VERSION
# master-01 Ready control-plane,master,worker 3h52m v1.25.8+37a9a08
# master-02 Ready control-plane,master,worker 4h33m v1.25.8+37a9a08
# master-03 Ready control-plane,master,worker 4h35m v1.25.8+37a9a08
6. add 3 infra nodes
Adding the 3 infra kvm nodes is simple, because we will not use metal3 to scale out automatically. We will build an ISO file for each kvm and boot the kvms from them.
6.1. config on helper node
# get ignition file for worker node
cd ${BASE_DIR}/data/install/
oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
# copy the ignition file to root of local web server
# later, during the rhcos booting, it will fetch the ignition file
# from the webserver
sudo mkdir -p /data/yum.repos/conf
sudo /bin/cp -f worker.ign /data/yum.repos/conf/
# some env
# BOOTSTRAP_IP=192.168.77.42
INFRA_01_IP=192.168.10.31
INFRA_02_IP=192.168.10.32
INFRA_03_IP=192.168.10.33
# BOOTSTRAP_IPv6=fd03::42
INFRA_01_IPv6=fd03::31
INFRA_02_IPv6=fd03::32
INFRA_03_IPv6=fd03::33
# BOOTSTRAP_HOSTNAME=bootstrap-demo
INFRA_01_HOSTNAME=infra-01
INFRA_02_HOSTNAME=infra-02
INFRA_03_HOSTNAME=infra-03
# BOOTSTRAP_INTERFACE=enp1s0
INFRA_01_INTERFACE=enp1s0
INFRA_02_INTERFACE=enp1s0
INFRA_03_INTERFACE=enp1s0
# BOOTSTRAP_DISK=/dev/vda
INFRA_01_DISK=/dev/vda
INFRA_02_DISK=/dev/vda
INFRA_03_DISK=/dev/vda
OCP_GW=192.168.10.10
OCP_NETMASK=255.255.255.0
OCP_NETMASK_S=24
OCP_DNS=192.168.10.10
OCP_GW_v6=fd03::10
OCP_NETMASK_v6=64
# build the iso file for each kvm
export BUILDNUMBER=4.12.16
cd ${BASE_DIR}/data/install/
/bin/cp -f /data/ocp-${BUILDNUMBER}/rhcos-live.x86_64.iso infra-01.iso
coreos-installer iso kargs modify -a "ip=$INFRA_01_IP::$OCP_GW:$OCP_NETMASK:$INFRA_01_HOSTNAME:$INFRA_01_INTERFACE:none nameserver=$OCP_DNS coreos.inst.install_dev=$INFRA_01_DISK coreos.inst.ignition_url=http://192.168.10.10:5000/conf/worker.ign coreos.inst.insecure systemd.debug-shell=1 " infra-01.iso
/bin/cp -f /data/ocp-${BUILDNUMBER}/rhcos-live.x86_64.iso infra-02.iso
coreos-installer iso kargs modify -a "ip=$INFRA_02_IP::$OCP_GW:$OCP_NETMASK:$INFRA_02_HOSTNAME:$INFRA_02_INTERFACE:none nameserver=$OCP_DNS coreos.inst.install_dev=$INFRA_02_DISK coreos.inst.ignition_url=http://192.168.10.10:5000/conf/worker.ign coreos.inst.insecure systemd.debug-shell=1 " infra-02.iso
/bin/cp -f /data/ocp-${BUILDNUMBER}/rhcos-live.x86_64.iso infra-03.iso
coreos-installer iso kargs modify -a "ip=$INFRA_03_IP::$OCP_GW:$OCP_NETMASK:$INFRA_03_HOSTNAME:$INFRA_03_INTERFACE:none nameserver=$OCP_DNS coreos.inst.install_dev=$INFRA_03_DISK coreos.inst.ignition_url=http://192.168.10.10:5000/conf/worker.ign coreos.inst.insecure systemd.debug-shell=1 " infra-03.iso
# transfer the iso file to kvm host server ( 98 )
scp infra-01.iso root@192.168.10.92:/data/kvm/
scp infra-02.iso root@192.168.10.92:/data/kvm/
scp infra-03.iso root@192.168.10.92:/data/kvm/
6.2. config infra host BM nodes (98)
# dnf setup for the server
cat << EOF > ~/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
# DO NOT use in a production env.
systemctl disable --now firewalld
# ntp
mv /etc/chrony.conf /etc/chrony.conf.bak
cat << EOF > /etc/chrony.conf
server 192.168.10.90 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
allow all
logdir /var/log/chrony
EOF
systemctl restart chronyd
systemctl enable --now chronyd
cat << EOF > /etc/yum.repos.d/wzh.repo
[BaseOS]
name=BaseOS
baseurl=http://192.168.10.10:5000/BaseOS
enabled=1
gpgcheck=0
[AppStream]
name=AppStream
baseurl=http://192.168.10.10:5000/AppStream
enabled=1
gpgcheck=0
[epel-fix]
name=epel-fix
baseurl=http://192.168.10.10:5000/epel-fix
enabled=1
gpgcheck=0
EOF
dnf groupinstall -y 'server with gui'
# add support for kvm and vnc
dnf -y install qemu-kvm libvirt libguestfs-tools virt-install virt-viewer virt-manager tigervnc-server
# auto start libvirt
systemctl enable --now libvirtd
# create password for vnc
# replace 'xxxxxx' with your password
printf 'xxxxxx\nxxxxxx\n\n' | vncpasswd
# create vnc config for vnc starting up
cat << EOF > ~/.vnc/config
session=gnome
securitytypes=vncauth,tlsvnc
# desktop=sandbox
geometry=1440x855
alwaysshared
EOF
# auto start vnc session for root user at port 5902
cat << EOF >> /etc/tigervnc/vncserver.users
:2=root
EOF
# auto start vnc session
systemctl enable --now vncserver@:2
#setup network bridge
cat << 'EOF' > /data/kvm/bridge.sh
#!/usr/bin/env bash
PUB_CONN='eno1'
PUB_IP='192.168.10.92/24'
PUB_GW='192.168.10.10'
PUB_DNS='192.168.10.10'
BR_IF='br-int'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down "$BR_IF"
nmcli con delete "$BR_IF"
# RHEL 8.1 appends the word "System" in front of the connection,delete in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname "$BR_IF" type bridge con-name "$BR_IF" ipv4.method 'manual' \
ipv4.address "$PUB_IP" \
ipv4.gateway "$PUB_GW" \
ipv4.dns "$PUB_DNS"
nmcli con add type bridge-slave ifname "$PUB_CONN" master "$BR_IF"
nmcli con down "$PUB_CONN";pkill dhclient;dhclient "$BR_IF"
nmcli con up "$BR_IF"
EOF
bash /data/kvm/bridge.sh
# setup the thin provision lvm
lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
# sda 8:0 0 4.4T 0 disk
# ├─sda1 8:1 0 600M 0 part /boot/efi
# ├─sda2 8:2 0 1G 0 part /boot
# └─sda3 8:3 0 500G 0 part /
# sr0 11:0 1 1024M 0 rom
fdisk /dev/sda
# n -> to create new partition
# w -> to write out the new partition
lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
# sda 8:0 0 4.4T 0 disk
# ├─sda1 8:1 0 600M 0 part /boot/efi
# ├─sda2 8:2 0 1G 0 part /boot
# ├─sda3 8:3 0 500G 0 part /
# └─sda4 8:4 0 3.9T 0 part
# sr0 11:0 1 1024M 0 rom
pvcreate -y /dev/sda4
vgcreate vgdata /dev/sda4
# https://access.redhat.com/articles/766133
lvcreate -y -n poolA -L 100G vgdata
lvcreate -y -n poolA_meta -L 1G vgdata
lvconvert -y --thinpool vgdata/poolA --poolmetadata vgdata/poolA_meta
# Thin pool volume with chunk size 64.00 KiB can address at most <15.88 TiB of data.
# WARNING: Converting vgdata/poolA and vgdata/poolA_meta to thin pool's data and metadata volumes with metadata wiping.
# THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
# Converted vgdata/poolA and vgdata/poolA_meta to thin pool.
lvextend -l +100%FREE vgdata/poolA
# Rounding size to boundary between physical extents: <3.88 GiB.
# Size of logical volume vgdata/poolA_tmeta changed from 1.00 GiB (256 extents) to <3.88 GiB (992 extents).
# Size of logical volume vgdata/poolA_tdata changed from 100.00 GiB (25600 extents) to <3.87 TiB (1013929 extents).
# Logical volume vgdata/poolA successfully resized.
lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
# sda 8:0 0 4.4T 0 disk
# ├─sda1 8:1 0 600M 0 part /boot/efi
# ├─sda2 8:2 0 1G 0 part /boot
# ├─sda3 8:3 0 500G 0 part /
# └─sda4 8:4 0 3.9T 0 part
# ├─vgdata-poolA_tmeta 253:0 0 3.9G 0 lvm
# │ └─vgdata-poolA 253:2 0 3.9T 0 lvm
# └─vgdata-poolA_tdata 253:1 0 3.9T 0 lvm
# └─vgdata-poolA 253:2 0 3.9T 0 lvm
# sr0 11:0 1 1024M 0 rom
Why do we use thin-provisioned lvm instead of qcow2 files on an xfs filesystem? Performance: lvm-backed disks are roughly 2x to 3x faster than qcow2 files on a filesystem.
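Because the LVs are thin-provisioned, you can sanity-check the real consumption at any time; the Data% and Meta% columns in the lvs output show how much of the pool is actually used versus the over-committed virtual sizes:
# check actual thin pool usage vs. the over-provisioned virtual sizes
lvs -a vgdata
vgs vgdata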
6.3. boot 3 infra kvm
# cleanup the kvm config
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
virsh destroy ocp4-infra-01
virsh undefine ocp4-infra-01
create_lv vgdata poolA lv-ocp4-infra-01 100G
create_lv vgdata poolA lv-ocp4-infra-01-data 1024G
virsh destroy ocp4-infra-02
virsh undefine ocp4-infra-02
create_lv vgdata poolA lv-ocp4-infra-02 100G
create_lv vgdata poolA lv-ocp4-infra-02-data 1024G
virsh destroy ocp4-infra-03
virsh undefine ocp4-infra-03
create_lv vgdata poolA lv-ocp4-infra-03 100G
create_lv vgdata poolA lv-ocp4-infra-03-data 1024G
# start the kvm
SNO_MEM=32
virsh destroy ocp4-infra-01
virsh undefine ocp4-infra-01
create_lv vgdata poolA lv-ocp4-infra-01 100G recreate
create_lv vgdata poolA lv-ocp4-infra-01-data 1024G recreate
virt-install --name=ocp4-infra-01 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lv-ocp4-infra-01,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-infra-01-data,device=disk,bus=virtio,format=raw \
--os-variant rhel8.3 --network bridge=br-int,model=virtio,mac=52:54:00:13:A1:31 \
--graphics vnc,port=59031 --noautoconsole \
--boot menu=on --cdrom /data/kvm/infra-01.iso
virsh destroy ocp4-infra-02
virsh undefine ocp4-infra-02
create_lv vgdata poolA lv-ocp4-infra-02 100G recreate
create_lv vgdata poolA lv-ocp4-infra-02-data 1024G recreate
virt-install --name=ocp4-infra-02 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lv-ocp4-infra-02,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-infra-02-data,device=disk,bus=virtio,format=raw \
--os-variant rhel8.3 --network bridge=br-int,model=virtio,mac=52:54:00:13:A1:32 \
--graphics vnc,port=59032 --noautoconsole \
--boot menu=on --cdrom /data/kvm/infra-02.iso
virsh destroy ocp4-infra-03
virsh undefine ocp4-infra-03
create_lv vgdata poolA lv-ocp4-infra-03 100G recreate
create_lv vgdata poolA lv-ocp4-infra-03-data 1024G recreate
virt-install --name=ocp4-infra-03 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lv-ocp4-infra-03,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-infra-03-data,device=disk,bus=virtio,format=raw \
--os-variant rhel8.3 --network bridge=br-int,model=virtio,mac=52:54:00:13:A1:33 \
--graphics vnc,port=59033 --noautoconsole \
--boot menu=on --cdrom /data/kvm/infra-03.iso
6.4. wait and check the result
# approval is usually automatic; if it is not,
# approve the new infra nodes to join the cluster manually
oc get csr
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
oc get node
# NAME STATUS ROLES AGE VERSION
# infra-01 Ready worker 117s v1.25.8+37a9a08
# infra-02 Ready worker 111s v1.25.8+37a9a08
# infra-03 Ready worker 110s v1.25.8+37a9a08
# master-01 Ready control-plane,master,worker 6h3m v1.25.8+37a9a08
# master-02 Ready control-plane,master,worker 6h45m v1.25.8+37a9a08
# master-03 Ready control-plane,master,worker 6h47m v1.25.8+37a9a08
7. add 2 worker BM nodes
Adding the 2 worker baremetal nodes works the same way as adding the 3 infra nodes.
If you still want to scale out via metal3, use the config parameters below after you plug the BMC network cable into br-int (a BareMetalHost sketch follows these parameters):
worker-01
F0:D4:E2:EA:6F:E0
idrac-virtualmedia://<ip of bmc>/redfish/v1/Systems/System.Embedded.1
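If you do go the metal3 route, the BareMetalHost would look roughly like the KVM ones in section 5.2, only with the iDRAC virtualmedia address above; the BMC credentials and the root device below are placeholders for this sketch:
---
apiVersion: v1
kind: Secret
metadata:
  name: worker-01-bmc-secret
type: Opaque
stringData:
  username: root
  password: changeme
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: worker-01
spec:
  online: false
  bootMACAddress: F0:D4:E2:EA:6F:E0
  bmc:
    address: idrac-virtualmedia://<ip of bmc>/redfish/v1/Systems/System.Embedded.1
    credentialsName: worker-01-bmc-secret
    disableCertificateVerification: true
  rootDeviceHints:
    deviceName: /dev/sdb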
7.1. config on helper node
# some env
# BOOTSTRAP_IP=192.168.77.42
WORKER_01_IP=192.168.10.41
WORKER_02_IP=192.168.10.42
# INFRA_03_IP=192.168.10.33
# BOOTSTRAP_IPv6=fd03::42
WORKER_01_IPv6=fd03::41
WORKER_02_IPv6=fd03::42
# INFRA_03_IPv6=fd03::33
# BOOTSTRAP_HOSTNAME=bootstrap-demo
WORKER_01_HOSTNAME=worker-01
WORKER_02_HOSTNAME=worker-02
# INFRA_03_HOSTNAME=infra-03
# BOOTSTRAP_INTERFACE=enp1s0
WORKER_01_INTERFACE=eno1
WORKER_02_INTERFACE=eno1
# INFRA_03_INTERFACE=enp1s0
# BOOTSTRAP_DISK=/dev/vda
WORKER_01_DISK=/dev/sdb
WORKER_02_DISK=/dev/sda
# INFRA_03_DISK=/dev/vda
OCP_GW=192.168.10.10
OCP_NETMASK=255.255.255.0
OCP_NETMASK_S=24
OCP_DNS=192.168.10.10
OCP_GW_v6=fd03::10
OCP_NETMASK_v6=64
# build the iso file for each node
export BUILDNUMBER=4.12.16
cd ${BASE_DIR}/data/install/
/bin/cp -f /data/ocp-${BUILDNUMBER}/rhcos-live.x86_64.iso worker-01.iso
coreos-installer iso kargs modify -a "ip=$WORKER_01_IP::$OCP_GW:$OCP_NETMASK:$WORKER_01_HOSTNAME:$WORKER_01_INTERFACE:none nameserver=$OCP_DNS coreos.inst.install_dev=$WORKER_01_DISK coreos.inst.ignition_url=http://192.168.10.10:5000/conf/worker.ign coreos.inst.insecure systemd.debug-shell=1 " worker-01.iso
/bin/cp -f /data/ocp-${BUILDNUMBER}/rhcos-live.x86_64.iso worker-02.iso
coreos-installer iso kargs modify -a "ip=$WORKER_02_IP::$OCP_GW:$OCP_NETMASK:$WORKER_02_HOSTNAME:$WORKER_02_INTERFACE:none nameserver=$OCP_DNS coreos.inst.install_dev=$WORKER_02_DISK coreos.inst.ignition_url=http://192.168.10.10:5000/conf/worker.ign coreos.inst.insecure systemd.debug-shell=1 " worker-02.iso
# transfer the iso file to host server
scp worker-01.iso root@192.168.10.90:/home/wzh.iso/
scp worker-02.iso root@192.168.10.90:/home/wzh.iso/
7.2. boot the BM and check the result
Power on the BM after attaching the ISO image to the virtual cdrom.
Before booting the BM with the iso, it is better to reset the integrated raid card config and reset all vdisks. Otherwise you will run into uefi boot issues.
Some machines, like the BM servers in the demo lab, need the virtual cdrom removed manually during the 1st reboot.
# approve the new worker nodes to join the cluster manually
oc get csr
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
8. set up infra role on cluster
The official documents are here:
- https://docs.openshift.com/container-platform/4.12/machine_management/creating-infrastructure-machinesets.html#creating-an-infra-node_creating-infrastructure-machinesets
- https://access.redhat.com/documentation/en-us/red_hat_openshift_data_foundation/4.12/html-single/managing_and_allocating_storage_resources/index#manual_creation_of_infrastructure_nodes
We will not create an infra machineset, because we could not find documentation on how to create one for a baremetal cluster.
8.1. basic cluster config
# currently the cluster looks like this
oc get node
# NAME STATUS ROLES AGE VERSION
# infra-01 Ready worker 23h v1.25.8+37a9a08
# infra-02 Ready worker 23h v1.25.8+37a9a08
# infra-03 Ready worker 23h v1.25.8+37a9a08
# master-01 Ready control-plane,master,worker 29h v1.25.8+37a9a08
# master-02 Ready control-plane,master,worker 30h v1.25.8+37a9a08
# master-03 Ready control-plane,master,worker 30h v1.25.8+37a9a08
# worker-01 Ready worker 3h4m v1.25.8+37a9a08
# worker-02 Ready worker 99m v1.25.8+37a9a08
oc get mcp
# NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
# master rendered-master-6b284ac2e77636bd9f5fe05b8f68bf3a True False False 3 3 3 0 30h
# worker rendered-worker-8404cadc036bdaa800e4924522f5ace6 True False False 5 5 5 0 30h
# add node labels for the app and infra roles
for i in worker-0{1..2}; do
oc label node $i node-role.kubernetes.io/app=""
done
for i in infra-0{1..3}; do
oc label node $i node-role.kubernetes.io/infra=""
# enable below if you want to run only ODF on infra
oc label node $i cluster.ocs.openshift.io/openshift-storage=""
done
oc get node
# NAME STATUS ROLES AGE VERSION
# infra-01 Ready infra,worker 23h v1.25.8+37a9a08
# infra-02 Ready infra,worker 23h v1.25.8+37a9a08
# infra-03 Ready infra,worker 23h v1.25.8+37a9a08
# master-01 Ready control-plane,master,worker 29h v1.25.8+37a9a08
# master-02 Ready control-plane,master,worker 30h v1.25.8+37a9a08
# master-03 Ready control-plane,master,worker 30h v1.25.8+37a9a08
# worker-01 Ready app,worker 3h12m v1.25.8+37a9a08
# worker-02 Ready app,worker 107m v1.25.8+37a9a08
cat << EOF > ${BASE_DIR}/data/install/infra.mcp.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
name: infra
spec:
machineConfigSelector:
matchExpressions:
- {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,infra]}
nodeSelector:
matchLabels:
node-role.kubernetes.io/infra: ""
EOF
oc create --save-config -f ${BASE_DIR}/data/install/infra.mcp.yaml
oc get mcp
# NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
# infra rendered-infra-8404cadc036bdaa800e4924522f5ace6 True False False 3 3 3 0 2m43s
# master rendered-master-6b284ac2e77636bd9f5fe05b8f68bf3a True False False 3 3 3 0 30h
# worker rendered-worker-8404cadc036bdaa800e4924522f5ace6 True False False 2 2 2 0 30h
# taint infra node
for i in infra-0{1..3}; do
# oc adm taint nodes $i node-role.kubernetes.io/infra=reserved:NoExecute
# remove the taint, just for our demo lab env
oc adm taint nodes $i node-role.kubernetes.io/infra:NoExecute-
# enable below if you want to run only ODF on infra
oc adm taint node $i node.ocs.openshift.io/storage="true":NoSchedule
done
# fix for dns
# https://access.redhat.com/solutions/6592171
cat << EOF > ${BASE_DIR}/data/install/patch-dns.yaml
spec:
nodePlacement:
tolerations:
- operator: Exists
EOF
oc patch dns.operator/default --type merge \
--patch-file=${BASE_DIR}/data/install/patch-dns.yaml
8.2. move workload to infra node
DO NOT move workloads to the infra nodes if you only have 3 infra nodes in the cluster, because we dedicate those 3 infra nodes to ODF.
8.2.1. for router
# for router
oc get ingresscontroller default -n openshift-ingress-operator -o json | jq .spec
# {
# "clientTLS": {
# "clientCA": {
# "name": ""
# },
# "clientCertificatePolicy": ""
# },
# "httpCompression": {},
# "httpEmptyRequestsPolicy": "Respond",
# "httpErrorCodePages": {
# "name": ""
# },
# "replicas": 2,
# "tuningOptions": {
# "reloadInterval": "0s"
# },
# "unsupportedConfigOverrides": null
# }
oc get pod -n openshift-ingress -o wide
# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
# router-default-656fd575-d7w8s 1/1 Running 0 40h 192.168.10.22 master-02 <none> <none>
# router-default-656fd575-s6tl6 1/1 Running 0 40h 192.168.10.23 master-03 <none> <none>
cat << EOF > ${BASE_DIR}/data/install/patch-router.yaml
spec:
nodePlacement:
nodeSelector:
matchLabels:
node-role.kubernetes.io/infra: ""
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/infra
value: reserved
- effect: NoExecute
key: node-role.kubernetes.io/infra
value: reserved
EOF
oc patch -n openshift-ingress-operator ingresscontroller/default --type merge \
--patch-file=${BASE_DIR}/data/install/patch-router.yaml
# to roll back only
# do not use this, it will delete the patch
cat << EOF > ${BASE_DIR}/data/install/patch-router.yaml
spec:
nodePlacement: null
EOF
oc patch -n openshift-ingress-operator ingresscontroller/default --type merge \
--patch-file=${BASE_DIR}/data/install/patch-router.yaml
oc get pod -n openshift-ingress -o wide
# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
# router-default-788c864f85-5dj9f 1/1 Running 0 90s 192.168.10.32 infra-02 <none> <none>
# router-default-788c864f85-qcmv7 1/1 Running 0 2m4s 192.168.10.33 infra-03 <none> <none>
8.2.2. for internal registry
oc get configs.imageregistry.operator.openshift.io/cluster -o json | jq .spec
# {
# "logLevel": "Normal",
# "managementState": "Removed",
# "observedConfig": null,
# "operatorLogLevel": "Normal",
# "proxy": {},
# "replicas": 1,
# "requests": {
# "read": {
# "maxWaitInQueue": "0s"
# },
# "write": {
# "maxWaitInQueue": "0s"
# }
# },
# "rolloutStrategy": "RollingUpdate",
# "storage": {},
# "unsupportedConfigOverrides": null
# }
oc get pods -o wide -n openshift-image-registry |grep registry
cat << EOF > ${BASE_DIR}/data/install/patch-registry.yaml
spec:
nodeSelector:
node-role.kubernetes.io/infra: ""
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/infra
value: reserved
- effect: NoExecute
key: node-role.kubernetes.io/infra
value: reserved
EOF
oc patch configs.imageregistry.operator.openshift.io/cluster --type merge \
--patch-file=${BASE_DIR}/data/install/patch-registry.yaml
# to roll back only
# do not use this, it will delete the patch
cat << EOF > ${BASE_DIR}/data/install/patch-registry.yaml
spec:
nodeSelector: null
tolerations: null
EOF
oc patch configs.imageregistry.operator.openshift.io/cluster --type merge \
--patch-file=${BASE_DIR}/data/install/patch-registry.yaml
8.2.3. for monitor
oc get pod -n openshift-monitoring -o wide
# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
# alertmanager-main-0 6/6 Running 0 40h 172.21.0.107 master-03 <none> <none>
# alertmanager-main-1 6/6 Running 1 (40h ago) 40h 172.21.2.40 master-02 <none> <none>
# cluster-monitoring-operator-7dd6795794-v9mqh 2/2 Running 0 40h 172.21.2.23 master-02 <none> <none>
# kube-state-metrics-6b66b788d5-j2v2j 3/3 Running 0 40h 172.21.0.95 master-03 <none> <none>
# node-exporter-54x95 2/2 Running 0 35h 192.168.10.33 infra-03 <none> <none>
# node-exporter-7gtr5 2/2 Running 0 35h 192.168.10.31 infra-01 <none> <none>
# node-exporter-bfbt6 2/2 Running 2 42h 192.168.10.22 master-02 <none> <none>
# node-exporter-cz8p8 2/2 Running 0 35h 192.168.10.32 infra-02 <none> <none>
# node-exporter-d759x 2/2 Running 2 42h 192.168.10.23 master-03 <none> <none>
# node-exporter-jplrr 2/2 Running 0 14h 192.168.10.42 worker-02 <none> <none>
# node-exporter-k498r 2/2 Running 2 41h 192.168.10.21 master-01 <none> <none>
# node-exporter-xcxv5 2/2 Running 0 15h 192.168.10.41 worker-01 <none> <none>
# openshift-state-metrics-86884485c8-4zcpf 3/3 Running 0 40h 172.21.0.91 master-03 <none> <none>
# prometheus-adapter-68759db859-m8hw7 1/1 Running 0 18h 172.21.4.36 master-01 <none> <none>
# prometheus-adapter-68759db859-mlxfz 1/1 Running 0 11h 172.21.12.7 worker-01 <none> <none>
# prometheus-k8s-0 6/6 Running 0 40h 172.21.0.109 master-03 <none> <none>
# prometheus-k8s-1 6/6 Running 0 40h 172.21.2.34 master-02 <none> <none>
# prometheus-operator-78b549956b-676kt 2/2 Running 0 40h 172.21.0.100 master-03 <none> <none>
# prometheus-operator-admission-webhook-746c7d6ffb-nmglp 1/1 Running 0 40h 172.21.2.28 master-02 <none> <none>
# prometheus-operator-admission-webhook-746c7d6ffb-w8tz6 1/1 Running 0 40h 172.21.0.105 master-03 <none> <none>
# thanos-querier-6b5bcc9cb-b9r4j 6/6 Running 0 40h 172.21.0.104 master-03 <none> <none>
# thanos-querier-6b5bcc9cb-xvsvl 6/6 Running 0 40h 172.21.2.30 master-02 <none> <none>
cat << EOF > ${BASE_DIR}/data/install/cm-monitor.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: cluster-monitoring-config
namespace: openshift-monitoring
data:
config.yaml: |+
alertmanagerMain:
nodeSelector:
node-role.kubernetes.io/infra: ""
tolerations:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
prometheusK8s:
nodeSelector:
node-role.kubernetes.io/infra: ""
tolerations:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
prometheusOperator:
nodeSelector:
node-role.kubernetes.io/infra: ""
tolerations:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
k8sPrometheusAdapter:
nodeSelector:
node-role.kubernetes.io/infra: ""
tolerations:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
kubeStateMetrics:
nodeSelector:
node-role.kubernetes.io/infra: ""
tolerations:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
telemeterClient:
nodeSelector:
node-role.kubernetes.io/infra: ""
tolerations:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
openshiftStateMetrics:
nodeSelector:
node-role.kubernetes.io/infra: ""
tolerations:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
thanosQuerier:
nodeSelector:
node-role.kubernetes.io/infra: ""
tolerations:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
EOF
oc create --save-config -n openshift-monitoring -f ${BASE_DIR}/data/install/cm-monitor.yaml
# oc delete -n openshift-monitoring -f ${BASE_DIR}/data/install/cm-monitor.yaml
oc get pod -n openshift-monitoring -o wide
# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
# alertmanager-main-0 6/6 Running 1 (2m29s ago) 2m33s 172.21.10.12 infra-03 <none> <none>
# alertmanager-main-1 6/6 Running 1 (3m2s ago) 3m7s 172.21.6.11 infra-01 <none> <none>
# cluster-monitoring-operator-7dd6795794-v9mqh 2/2 Running 0 40h 172.21.2.23 master-02 <none> <none>
# kube-state-metrics-857fc67cb9-snbpc 3/3 Running 0 3m11s 172.21.8.8 infra-02 <none> <none>
# node-exporter-54x95 2/2 Running 0 35h 192.168.10.33 infra-03 <none> <none>
# node-exporter-7gtr5 2/2 Running 0 35h 192.168.10.31 infra-01 <none> <none>
# node-exporter-bfbt6 2/2 Running 2 42h 192.168.10.22 master-02 <none> <none>
# node-exporter-cz8p8 2/2 Running 0 35h 192.168.10.32 infra-02 <none> <none>
# node-exporter-d759x 2/2 Running 2 42h 192.168.10.23 master-03 <none> <none>
# node-exporter-jplrr 2/2 Running 0 14h 192.168.10.42 worker-02 <none> <none>
# node-exporter-k498r 2/2 Running 2 42h 192.168.10.21 master-01 <none> <none>
# node-exporter-xcxv5 2/2 Running 0 15h 192.168.10.41 worker-01 <none> <none>
# openshift-state-metrics-6469575fd-sknv5 3/3 Running 0 3m11s 172.21.8.9 infra-02 <none> <none>
# prometheus-adapter-765d86b6c9-ffps5 1/1 Running 0 3m10s 172.21.6.9 infra-01 <none> <none>
# prometheus-adapter-765d86b6c9-s8r7p 1/1 Running 0 3m10s 172.21.8.10 infra-02 <none> <none>
# prometheus-k8s-0 6/6 Running 0 2m1s 172.21.10.13 infra-03 <none> <none>
# prometheus-k8s-1 6/6 Running 0 3m3s 172.21.8.11 infra-02 <none> <none>
# prometheus-operator-5d45f8bb65-fhgsf 2/2 Running 0 3m17s 172.21.10.10 infra-03 <none> <none>
# prometheus-operator-admission-webhook-b847d7dd4-82s44 1/1 Running 0 3m22s 172.21.10.9 infra-03 <none> <none>
# prometheus-operator-admission-webhook-b847d7dd4-f7gnt 1/1 Running 0 3m22s 172.21.6.8 infra-01 <none> <none>
# thanos-querier-696b585794-gwvdj 6/6 Running 0 3m8s 172.21.6.10 infra-01 <none> <none>
# thanos-querier-696b585794-ws5rr 6/6 Running 0 3m8s 172.21.10.11 infra-03 <none> <none>
8.2.4. for logging
The default openshift installation does not include logging, so there is nothing to move for now.
The official document:
- https://docs.openshift.com/container-platform/4.12/machine_management/creating-infrastructure-machinesets.html#infrastructure-moving-logging_creating-infrastructure-machinesets
9. install ODF
9.1. download additional installation media
# on helper
# try to find out the correct operator version.
# to list all channels
oc get PackageManifest -o json | jq -r ' .items[] | "\(.metadata.name),\(.status.channels[].name),\(.status.channels[].currentCSVDesc.version)" ' | column -ts $',' | grep odf
# ocs-client-operator stable-4.12 4.12.3-rhodf
# odf-multicluster-orchestrator stable-4.11 4.11.8
# odf-multicluster-orchestrator stable-4.12 4.11.8
# odf-multicluster-orchestrator stable-4.11 4.12.3-rhodf
# odf-multicluster-orchestrator stable-4.12 4.12.3-rhodf
# ocs-operator stable-4.12 4.12.3-rhodf
# ocs-operator stable-4.11 4.12.3-rhodf
# odr-hub-operator stable-4.11 4.12.3-rhodf
# odr-hub-operator stable-4.12 4.12.3-rhodf
# ibm-storage-odf-operator stable-v1.3 1.3.0
# mcg-operator stable-4.11 4.12.3-rhodf
# mcg-operator stable-4.12 4.12.3-rhodf
# odf-operator stable-4.11 4.11.8
# odf-operator stable-4.12 4.11.8
# odf-operator stable-4.11 4.12.3-rhodf
# odf-operator stable-4.12 4.12.3-rhodf
# odf-csi-addons-operator stable-4.11 4.11.8
# odf-csi-addons-operator stable-4.12 4.11.8
# odf-csi-addons-operator stable-4.11 4.12.3-rhodf
# odf-csi-addons-operator stable-4.12 4.12.3-rhodf
# odr-cluster-operator stable-4.11 4.12.3-rhodf
# odr-cluster-operator stable-4.12 4.12.3-rhodf
# on vultr host
cat > /data/ocp4/mirror.yaml << EOF
apiVersion: mirror.openshift.io/v1alpha2
kind: ImageSetConfiguration
# archiveSize: 4
mirror:
platform:
architectures:
- amd64
# - arm64
# channels:
# - name: stable-4.12
# type: ocp
# minVersion: 4.12.16
# maxVersion: 4.12.16
# shortestPath: true
# graph: false
# additionalImages:
# - name: registry.redhat.io/redhat/redhat-operator-index:v4.12
# - name: registry.redhat.io/redhat/certified-operator-index:v4.12
# - name: registry.redhat.io/redhat/community-operator-index:v4.12
# - name: registry.redhat.io/redhat/redhat-marketplace-index:v4.12
# - name: quay.io/openshift/origin-kube-rbac-proxy:latest
# - name: quay.io/wangzheng422/debug-pod:alma-9.1
operators:
- catalog: registry.redhat.io/redhat/redhat-operator-index:v4.12
packages:
- name: odf-operator
channels:
- name: stable-4.12
minVersion: 4.12.3-rhodf
- name: local-storage-operator
channels:
- name: stable
minVersion: 4.12.0-202305101515
EOF
mkdir -p /data/ocp-install/oc-mirror/
cd /data/ocp-install/oc-mirror/
cd /data/wzh.work
oc-mirror --config /data/ocp4/mirror.yaml file:///data/ocp-install/oc-mirror/
# sync back to demo lab jumpbox
cd /data
rsync -P -arz /data/ocp-install root@10.229.104.55:/home/wzh/
# on helper vm node
rsync -P -arz root@192.168.10.90:/home/wzh/ocp-install /data/
# import the image to internal registry
oc-mirror --from=/data/ocp-install/oc-mirror/mirror_seq1_000000.tar \
docker://quay.demolab-infra.wzhlab.top:8443
# as user 3node
oc get OperatorHub/cluster -o yaml
# ......
# spec: {}
# status:
# sources:
# - disabled: false
# name: certified-operators
# status: Success
# - disabled: false
# name: community-operators
# status: Success
# - disabled: false
# name: redhat-marketplace
# status: Success
# - disabled: false
# name: redhat-operators
# status: Success
cat << EOF > ${BASE_DIR}/data/install/hub.disable.yaml
spec:
sources: [
{
name: "certified-operators",
disabled: true
},
{
name: "community-operators",
disabled: true
},
{
name: "redhat-marketplace",
disabled: true
}
]
EOF
oc patch OperatorHub/cluster --type merge \
--patch-file=${BASE_DIR}/data/install/hub.disable.yaml
9.2. install ODF
Installing ODF is straightforward. Just follow the official document:
- https://access.redhat.com/documentation/en-us/red_hat_openshift_data_foundation/4.12/html-single/deploying_openshift_data_foundation_using_bare_metal_infrastructure/index#deploy-using-local-storage-devices-bm
First, you have to install the local storage operator; this operator initializes the disks and provides them for ODF to consume.
Click install.
Enable monitoring; this is optional.
Just wait, and it will be ready.
Then, install ODF (a CLI alternative is sketched below).
Click install.
Keep the default config.
After some time, the odf operator is ready.
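If you prefer to drive the operator install from the CLI instead of the console, a Subscription roughly like the following does the same thing; the namespace, channel and catalog source names are assumptions matching the mirror config above:
cat << EOF > ${BASE_DIR}/data/install/odf.subscription.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-storage
  labels:
    openshift.io/cluster-monitoring: "true"
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-storage-operatorgroup
  namespace: openshift-storage
spec:
  targetNamespaces:
  - openshift-storage
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: odf-operator
  namespace: openshift-storage
spec:
  channel: stable-4.12
  name: odf-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF
oc create -f ${BASE_DIR}/data/install/odf.subscription.yaml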
You have to initialize ODF by creating a storage system.
Keep the default config in the first step.
Then a config is applied to the local storage operator, which auto-discovers the local disks. Wait for some time and it will show all the nodes in the cluster and all their disks. Select only the infra nodes.
After clicking next, it creates a local volume set in the local storage operator. The local disks are encapsulated into local volumes and consumed by ODF.
In the next step, keep the defaults, or you can taint the nodes. We already tainted the nodes, so no need to worry here.
In the following step, keep the default config.
Review your config and begin the creation.
Wait for some time and it will be ready; remember to refresh the web console.
You can see that block and file storage are OK.
The object service is OK as well.
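A quick CLI sanity check at this point (a sketch; the resource names are the wizard defaults and may differ in your cluster):
oc -n openshift-local-storage get localvolumeset,pv
oc -n openshift-storage get storagecluster,cephcluster
oc -n openshift-storage get pod -o wide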
The default storage class does not suit us, so we will create a new one. To save space we will use 2 replicas instead of the default 3, and we will also enable compression.
Select the rbd provisioner and create a new pool.
In the popup, set the data replication policy to 2-way and enable compression.
The block pool is created successfully.
Keep the other config at its defaults.
Then the new storage class is ready to use.
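Behind the console wizard this boils down to a CephBlockPool with 2 replicas and compression enabled; a minimal sketch (the pool name is made up, and the console-generated objects remain the authoritative version):
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: wzh-blockpool-2rep
  namespace: openshift-storage
spec:
  replicated:
    size: 2
  compressionMode: aggressive
The matching storage class then uses the openshift-storage.rbd.csi.ceph.com provisioner with this pool name in its parameters, which is exactly what the console creates for you.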
9.3. patch for csi components
Bringing the csi components to the infra nodes is NOT needed in our demo lab.
You can ignore the steps below.
Official document:
- https://access.redhat.com/documentation/en-us/red_hat_openshift_data_foundation/4.12/html-single/managing_and_allocating_storage_resources/index#managing-container-storage-interface-component-placements_rhodf
# bring csi components to infra node
# NO need in our demo lab.
# you can ignore it
oc get configmap rook-ceph-operator-config -n openshift-storage -o yaml
# apiVersion: v1
# kind: ConfigMap
# metadata:
# creationTimestamp: "2023-06-01T13:00:58Z"
# name: rook-ceph-operator-config
# namespace: openshift-storage
# resourceVersion: "1139866"
# uid: 94177029-8189-4725-b712-0dbbc6fef71a
cat << EOF > ${BASE_DIR}/data/install/odf.csi-patch.yaml
data:
CSI_PLUGIN_TOLERATIONS: |
- key: nodetype
operator: Equal
value: infra
effect: NoSchedule
- key: node.ocs.openshift.io/storage
operator: Equal
value: "true"
effect: NoSchedule
EOF
oc patch configmap rook-ceph-operator-config -n openshift-storage --type merge \
  --patch-file=${BASE_DIR}/data/install/odf.csi-patch.yaml
10. enable integrated image registry
We have ODF installed, so we have backend storage; now we can activate the internal image registry with this ODF backend storage.
official document:
- Configuring Image Registry to use OpenShift Data Foundation
- Exposing the registry
- Pushing images to RHOCP 4 internal registry fails with denied error
# create the pvc for image registry
cat << EOF > ${BASE_DIR}/data/install/pvc.image.registry.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: ocs4registry
namespace: openshift-image-registry
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 200Gi
storageClassName: ocs-storagecluster-cephfs
EOF
oc create --save-config -f ${BASE_DIR}/data/install/pvc.image.registry.yaml
# then patch the cluster object to enable image register and use the pvc
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Managed","storage":{"pvc":{"claim":"ocs4registry"}}}}' --type=merge
# if you want to restore
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Removed"}}' --type=merge
oc get clusteroperator image-registry
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
# image-registry 4.12.16 True True False 3d5h Progressing: The deployment has not completed...
# wait some time for the image registry to be ready
# export the image registry route
oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"defaultRoute":true}}' --type=merge
HOST=$(oc get route default-route -n openshift-image-registry --template='{{ .spec.host }}')
echo $HOST
# default-route-openshift-image-registry.apps.demolab-ocp.wzhlab.top
# login as kubeadmin, and podman login to try
oc login https://api.demolab-ocp.wzhlab.top:6443 -u kubeadmin
podman login -u kubeadmin -p $(oc whoami -t) --tls-verify=false $HOST
# Login Succeeded!
# push a demo image to default namespace
podman pull quay.demolab-infra.wzhlab.top:8443/wangzheng422/debug-pod:alma-9.1
podman tag quay.demolab-infra.wzhlab.top:8443/wangzheng422/debug-pod:alma-9.1 $HOST/default/debug-pod:alma-9.1
podman push $HOST/default/debug-pod:alma-9.1
# Getting image source signatures
# Copying blob 4bfa56c571a7 skipped: already exists
# Copying blob 326f1ea0e5d2 skipped: already exists
# Copying config 1d9fb6b20f done
# Writing manifest to image destination
# Storing signatures
# you can see the image as image stream in ocp platform
oc get is
# NAME IMAGE REPOSITORY TAGS UPDATED
# debug-pod default-route-openshift-image-registry.apps.demolab-ocp.wzhlab.top/default/debug-pod alma-9.1 About a minute ago
# oc config switch context back
oc config use-context admin
oc get configs.imageregistry.operator.openshift.io cluster -o json | jq .spec
# {
# "defaultRoute": true,
# "httpSecret": "5bb05db2f8a67fcfe9809bf83ae4a492d4ebcf51a50a29a10bea5fda938300f7d6e6c12618a618f8444d0f9579e5ca2f26120b8d90c480a564011cbf356d3528",
# "logLevel": "Normal",
# "managementState": "Managed",
# "observedConfig": null,
# "operatorLogLevel": "Normal",
# "proxy": {},
# "replicas": 1,
# "requests": {
# "read": {
# "maxWaitInQueue": "0s"
# },
# "write": {
# "maxWaitInQueue": "0s"
# }
# },
# "rolloutStrategy": "RollingUpdate",
# "storage": {
# "managementState": "Unmanaged",
# "pvc": {
# "claim": "ocs4registry"
# }
# },
# "unsupportedConfigOverrides": null
# }
You can see the image stream in the web console.
11. end
Install and run openstack on openshift 4.11
This article describes how to activate the openstack operator on openshift and eventually install an openstack overcloud. The goal is to build an openstack on openshift cluster in a home lab, to make further learning and research easier.
Due to limits on the author's skill, time and lab environment, the approach used here has many shortcomings and even mistakes; corrections are welcome.
Background
Before getting into the details, a few words on why we run this openstack on openshift experiment at all.
Everyone says we are in the cloud era, and by that we usually mean clouds built on k8s and containers. Before k8s appeared, however, clouds were built on openstack and virtual machines. For historical reasons openstack still has a large installed base, especially in the telco industry, where many private clouds are built on it. How to move this group of users toward today's k8s / container clouds is a big question.
openstack on openshift is one attempt / technical approach to drive this transition. Most solutions we have seen so far use openstack as the foundation of a VM cloud, deploy k8s as an application / PaaS on top of that VM cloud, and then deliver the PaaS to the customer. openstack on openshift takes the opposite approach: k8s / openshift is the cloud foundation, openstack is deployed as an application on the container cloud, and the resulting openstack overcloud cluster is delivered to the customer.
In this article we show how openshift, as the cloud foundation, makes this possible.
Lab environment and architecture
This experiment runs mainly on a single physical host with 24 cores and 128G of memory. The deployment architecture is as follows.
- On the physical host, 2 bridges are created (this will be improved later, using ovs together with vlan configuration)
- 4 kvm guests are created, each with 4 network interfaces and, for convenience, 4 disks, although only the worker node actually needs the 3 extra disks.
- A virtual BMC is created, which the openshift IPI installation process requires.
- Memory overcommit and disk thin provisioning are enabled on the physical host to squeeze the most out of its resources.
Lab steps
- Install openshift in IPI mode, using the 3 node / compact cluster layout
- Bring the worker node / kvm under management
- Install openshift add-ons such as cnv, nmstate, sriov and nfs
- Install the openstack operator
- Configure openstack parameters and deploy the openstack overcloud
Video walkthrough
prepare host
Our experiment runs on a single physical host with limited resources, so we enable memory overcommit to let us create more virtual machines. In addition, the openstack controller runs as a virtual machine inside the openshift nodes; since our nodes are already kvm guests, we need to enable nested virtualization.
Another important step is preparing the network. We create 2 bridges, which is the official openshift approach; it is not a perfect fit for this experiment but it works well enough. The problems are that multiple vlans end up mixed in the same bridge, and that for a multi-host deployment, interfaces on a vlan cannot reach each other across hosts. We will improve this later with a manually deployed ovs; for now we make do.
memory over commit
First, let's enable memory overcommit. Without the following settings, the total memory of all kvm guests cannot exceed physical memory, which would be very inconvenient for our experiment, so we turn it on. No reboot is required.
cat << EOF >> /etc/sysctl.d/99-wzh-sysctl.conf
vm.overcommit_memory = 1
EOF
sysctl --system
nested virtualization
Next, enable nested virtualization so that we can start a kvm guest inside a kvm guest. Admittedly the inner kvm is really slow, but for an experiment being able to proceed is what matters. No reboot is required for the following steps.
# first, go to kvm host to config nested kvm
# https://zhuanlan.zhihu.com/p/35320117
cat /sys/module/kvm_intel/parameters/nested
# 0
cat << EOF > /etc/modprobe.d/kvm-nested.conf
options kvm_intel nested=1
options kvm-intel enable_shadow_vmcs=1
options kvm-intel enable_apicv=1
options kvm-intel ept=1
EOF
modprobe -r kvm_intel # unload the kvm_intel kernel module; make sure all VMs are shut down before running this
modprobe -a kvm_intel # reload the module
cat /sys/module/kvm_intel/parameters/nested
# 1
cat /proc/cpuinfo | grep vmx
# on guest os
# if you see this file, means it is success.
ls /dev/kvm
prepare network on 103
Next, we create 2 bridges on the physical host, baremetal and provisioning. Why two bridges? Because the openshift IPI installation requires it. Strictly speaking, IPI has two network modes: a single-network mode that only needs baremetal, and a dual-network mode that needs both the baremetal and provisioning networks. So can we get away with the single-network baremetal mode?
The openstack official documentation says the provisioning network is mandatory, and unfortunately it is right this time. The author tried the single-network baremetal mode: at the final step of deploying the openstack overcloud computeHCI node, the openstack operator instructs openshift to flash an operating system image onto the worker node, and this time it is not coreos but the rhel image we provide. At this step, probably due to a limitation of the openstack operator, the image must be served from the provisioning network; maybe a future software release will drop the dual-network requirement.
In the architecture diagram, this step corresponds to this part:
bridge: baremetal
Following the openshift official documentation, we first create the baremetal bridge.
# create the virtual network for the lab
mkdir -p /data/kvm
cd /data/kvm
cat << 'EOF' > /data/kvm/bridge.sh
#!/usr/bin/env bash
PUB_CONN='eno1'
PUB_IP='172.21.6.103/24'
PUB_GW='172.21.6.254'
PUB_DNS='172.21.1.1'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down baremetal
nmcli con delete baremetal
# RHEL 8.1 appends the word "System" in front of the connection,delete in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname baremetal type bridge con-name baremetal ipv4.method 'manual' \
ipv4.address "$PUB_IP" \
ipv4.gateway "$PUB_GW" \
ipv4.dns "$PUB_DNS"
nmcli con add type bridge-slave ifname "$PUB_CONN" master baremetal
nmcli con down "$PUB_CONN";pkill dhclient;dhclient baremetal
nmcli con up baremetal
EOF
bash /data/kvm/bridge.sh
nmcli con mod baremetal +ipv4.addresses "192.168.7.103/24"
nmcli con up baremetal
vxlan: provisioning
Then we need to create the other bridge, provisioning. If the physical host had 2 NICs we could of course create this bridge on the second NIC, which is what the author did before. Here we try a different approach, as a small experiment and preparation for switching to an sdn later.
On the same NIC as before, we create a vxlan interface and attach it to the provisioning bridge.
We simply follow the official documentation.
nmcli connection add type bridge con-name br-prov ifname br-prov ipv4.method disabled ipv6.method disabled
nmcli con modify br-prov ipv4.method manual ipv4.address 172.22.0.1/24
nmcli connection add type vxlan slave-type bridge con-name br-prov-vxlan5 ifname vxlan5 id 5 local 172.21.6.103 remote 172.21.6.102 master br-prov
nmcli connection up br-prov
bridge fdb show dev vxlan5
# c2:d2:b2:e7:6e:f5 vlan 1 master br-prov permanent
# c2:d2:b2:e7:6e:f5 master br-prov permanent
# 00:00:00:00:00:00 dst 172.21.6.103 self permanent
# ce:1f:f5:9e:f8:7f dst 172.21.6.103 self
cat << EOF > /data/kvm/vxlan5-bridge.xml
<network>
<name>provisioning</name>
<forward mode="bridge" />
<bridge name="br-prov" />
</network>
EOF
virsh net-define /data/kvm/vxlan5-bridge.xml
virsh net-start provisioning
virsh net-autostart provisioning
virsh net-list
# Name State Autostart Persistent
# -------------------------------------------------
# default active yes yes
# provisioning active yes yes
prepare rpm repo on helper
Deploying openstack on openshift is, at the moment, not conceptually complicated: openshift provides the virtual machines (through cnv) and the physical machines (through the machine api), all with the operating system already installed. openshift then prepares a set of ansible playbooks, and the administrator simply goes into a designated pod and runs those openstack installation playbooks. The second half of the procedure is the same as a regular openstack installation.
Since the process is the same as installing openstack, we have to prepare the rpm repos that openstack requires. As usual, we download them on an overseas VPS first and then sync them back.
Next, we create a repo config file for the downloaded rpm repos, so that it can be imported later during the openstack installation.
Finally, we set up an auto-starting web server to serve the rpm repos.
# sync repo on vultr
dnf install -y yum-utils
cd /mnt/blockstorage/
subscription-manager release --set=8.6
declare -a arr=("rhel-8-for-x86_64-baseos-eus-rpms"
"rhel-8-for-x86_64-appstream-eus-rpms"
"rhel-8-for-x86_64-highavailability-eus-rpms"
"ansible-2.9-for-rhel-8-x86_64-rpms"
openstack-16.2-for-rhel-8-x86_64-rpms
fast-datapath-for-rhel-8-x86_64-rpms
)
for i in "${arr[@]}"
do
dnf reposync --repoid="$i" -m --download-metadata -n --delete
done
# baseos should be synced including the old versions (not only the newest packages)
dnf reposync --repoid=rhel-8-for-x86_64-baseos-eus-rpms -m --download-metadata --delete
# on local / helper
declare -a arr=("rhel-8-for-x86_64-baseos-eus-rpms"
"rhel-8-for-x86_64-appstream-eus-rpms"
"rhel-8-for-x86_64-highavailability-eus-rpms"
"ansible-2.9-for-rhel-8-x86_64-rpms"
openstack-16.2-for-rhel-8-x86_64-rpms
fast-datapath-for-rhel-8-x86_64-rpms
)
VAR_IP=158.247.234.245
for i in "${arr[@]}"
do
rsync -P --delete -arz root@$VAR_IP:/mnt/blockstorage/$i /data/dnf/
done
# after download , we create a repo config file
# this will be used later when install openstack
echo > /data/dnf/osp.repo
for i in "${arr[@]}"
do
cat << EOF >> /data/dnf/osp.repo
[$i]
name=$i
baseurl=http://192.168.7.11:5000/$i
enabled=1
gpgcheck=0
EOF
done
# setup web server startup service
# let the web server auto start
cat << EOF > /etc/systemd/system/local-webserver-osp.service
[Unit]
Description=local-webserver-osp
[Service]
User=root
WorkingDirectory=/data/dnf
ExecStart=/bin/bash -c 'python3 -m http.server 5000'
Restart=always
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now local-webserver-osp.service
lvs config
To squeeze the server's resources we also configure lvm thin provisioning, which uses disk space efficiently and avoids waste. Simply put, lvm thin provisioning is over-selling the disk.
pvcreate -y /dev/sdb
vgcreate vgdata /dev/sdb
# https://access.redhat.com/articles/766133
lvcreate -y -n poolA -L 500G vgdata
lvcreate -y -n poolA_meta -L 1G vgdata
lvconvert -y --thinpool vgdata/poolA --poolmetadata vgdata/poolA_meta
# Thin pool volume with chunk size 64.00 KiB can address at most <15.88 TiB of data.
# WARNING: Converting vgdata/poolA and vgdata/poolA_meta to thin pool's data and metadata volumes with metadata wiping.
# THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
# Converted vgdata/poolA and vgdata/poolA_meta to thin pool.
lvextend -l +100%FREE vgdata/poolA
# Rounding size to boundary between physical extents: <1.09 GiB.
# Size of logical volume vgdata/poolA_tmeta changed from 1.00 GiB (256 extents) to <1.09 GiB (279 extents).
# Size of logical volume vgdata/poolA_tdata changed from 500.00 GiB (128000 extents) to <1.09 TiB (285457 extents).
# Logical volume vgdata/poolA successfully resized.
kvm setup
With the preparation above done, we can start creating the kvm guests. Since we reinstall repeatedly during experiments, there is a cleanup script first, and then other scripts to define the kvm guests. Note that we only define the kvm guests; we do not start them.
cleanup
We prepared a script to clean up the kvm guests and return the physical host to a clean state.
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
virsh destroy ocp4-ipi-osp-master-01
virsh undefine ocp4-ipi-osp-master-01
create_lv vgdata poolA lv-ocp4-ipi-osp-master-01 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-master-01-data 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-master-01-data-02 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-master-01-data-03 100G
virsh destroy ocp4-ipi-osp-master-02
virsh undefine ocp4-ipi-osp-master-02
create_lv vgdata poolA lv-ocp4-ipi-osp-master-02 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-master-02-data 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-master-02-data-02 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-master-02-data-03 100G
virsh destroy ocp4-ipi-osp-master-03
virsh undefine ocp4-ipi-osp-master-03
create_lv vgdata poolA lv-ocp4-ipi-osp-master-03 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-master-03-data 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-master-03-data-02 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-master-03-data-03 100G
virsh destroy ocp4-ipi-osp-worker-01
virsh undefine ocp4-ipi-osp-worker-01
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-01 200G
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-01-data 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-01-data-02 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-01-data-03 100G
virsh destroy ocp4-ipi-osp-worker-02
virsh undefine ocp4-ipi-osp-worker-02
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-02 200G
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-02-data 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-02-data-02 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-02-data-03 100G
virsh destroy ocp4-ipi-osp-worker-03
virsh undefine ocp4-ipi-osp-worker-03
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-03 200G
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-03-data 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-03-data-02 100G
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-03-data-03 100G
VAR_VM=`virsh list --all | grep bootstrap | awk '{print $2}'`
virsh destroy $VAR_VM
virsh undefine $VAR_VM
VAR_POOL=`virsh pool-list --all | grep bootstrap | awk '{print $1}'`
virsh pool-destroy $VAR_POOL
virsh pool-undefine $VAR_POOL
/bin/rm -rf /var/lib/libvirt/openshift-images/*
/bin/rm -rf /var/lib/libvirt/images/*
define kvm on 103
Then we can start defining the kvm guests. We must not start them here, because the defined kvm guests have no boot disk and starting them would not begin the installation; in IPI mode the installer calls the virtual BMC redfish interface, mounts a boot image on the kvm and starts the installation process.
For simplicity every kvm gets 4 disks and 4 NICs, although only the worker node kvm actually uses all 4 disks. The vda disk is a bit larger because it also hosts the in-cluster nfs server. Since we configured lvm thin provisioning, we can be generous with the lv sizes.
/bin/rm -rf /var/lib/libvirt/images/*
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
SNO_MEM=32
export KVM_DIRECTORY=/data/kvm
virsh destroy ocp4-ipi-osp-master-01
virsh undefine ocp4-ipi-osp-master-01
create_lv vgdata poolA lv-ocp4-ipi-osp-master-01 500G recreate
create_lv vgdata poolA lv-ocp4-ipi-osp-master-01-data 100G recreate
create_lv vgdata poolA lv-ocp4-ipi-osp-master-01-data-02 100G recreate
create_lv vgdata poolA lv-ocp4-ipi-osp-master-01-data-03 100G recreate
virt-install --name=ocp4-ipi-osp-master-01 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-master-01,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-master-01-data,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-master-01-data-02,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-master-01-data-03,device=disk,bus=virtio,format=raw \
--os-variant rhel8.4 \
--network bridge=baremetal,model=virtio \
--network network:provisioning,model=virtio \
--network bridge=baremetal,model=virtio \
--network bridge=baremetal,model=virtio \
--print-xml > ${KVM_DIRECTORY}/ocp4-ipi-osp-master-01.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-ipi-osp-master-01.xml
virsh destroy ocp4-ipi-osp-master-02
virsh undefine ocp4-ipi-osp-master-02
create_lv vgdata poolA lv-ocp4-ipi-osp-master-02 500G recreate
create_lv vgdata poolA lv-ocp4-ipi-osp-master-02-data 100G recreate
create_lv vgdata poolA lv-ocp4-ipi-osp-master-02-data-02 100G recreate
create_lv vgdata poolA lv-ocp4-ipi-osp-master-02-data-03 100G recreate
virt-install --name=ocp4-ipi-osp-master-02 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-master-02,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-master-02-data,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-master-02-data-02,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-master-02-data-03,device=disk,bus=virtio,format=raw \
--os-variant rhel8.4 \
--network bridge=baremetal,model=virtio \
--network network:provisioning,model=virtio \
--network bridge=baremetal,model=virtio \
--network bridge=baremetal,model=virtio \
--print-xml > ${KVM_DIRECTORY}/ocp4-ipi-osp-master-02.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-ipi-osp-master-02.xml
# SNO_MEM=64
virsh destroy ocp4-ipi-osp-master-03
virsh undefine ocp4-ipi-osp-master-03
create_lv vgdata poolA lv-ocp4-ipi-osp-master-03 500G recreate
create_lv vgdata poolA lv-ocp4-ipi-osp-master-03-data 100G recreate
create_lv vgdata poolA lv-ocp4-ipi-osp-master-03-data-02 100G recreate
create_lv vgdata poolA lv-ocp4-ipi-osp-master-03-data-03 100G recreate
virt-install --name=ocp4-ipi-osp-master-03 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-master-03,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-master-03-data,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-master-03-data-02,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-master-03-data-03,device=disk,bus=virtio,format=raw \
--os-variant rhel8.4 \
--network bridge=baremetal,model=virtio \
--network network:provisioning,model=virtio \
--network bridge=baremetal,model=virtio \
--network bridge=baremetal,model=virtio \
--print-xml > ${KVM_DIRECTORY}/ocp4-ipi-osp-master-03.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-ipi-osp-master-03.xml
SNO_MEM=16
virsh destroy ocp4-ipi-osp-worker-01
virsh undefine ocp4-ipi-osp-worker-01
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-01 500G recreate
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-01-data 100G recreate
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-01-data-02 100G recreate
create_lv vgdata poolA lv-ocp4-ipi-osp-worker-01-data-03 100G recreate
virt-install --name=ocp4-ipi-osp-worker-01 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-worker-01,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-worker-01-data,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-worker-01-data-02,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lv-ocp4-ipi-osp-worker-01-data-03,device=disk,bus=virtio,format=raw \
--os-variant rhel8.4 \
--network bridge=baremetal,model=virtio \
--network network:provisioning,model=virtio \
--network bridge=baremetal,model=virtio \
--network bridge=baremetal,model=virtio \
--print-xml > ${KVM_DIRECTORY}/ocp4-ipi-osp-worker-01.xml
virsh define --file ${KVM_DIRECTORY}/ocp4-ipi-osp-worker-01.xml
bmc simulator
With the kvm guests defined, we need a matching virtual BMC / redfish interface to control them. This all simulates real physical servers: in a real baremetal scenario, the openshift installer calls the redfish interface to control the machines.
We use the sushy tool from the openstack project as this virtual BMC. A single sushy instance can manage all the kvm guests on the same physical host, and it is simple to use.
Finally, we define an auto-starting systemd service to run sushy.
In the architecture diagram, this step corresponds to this part:
# try to install and run it manually
dnf -y install python3-pip
pip3 install --user sushy-tools
mkdir -p /etc/crts
scp root@192.168.7.11:/etc/crts/* /etc/crts/
/root/.local/bin/sushy-emulator -i 0.0.0.0 --ssl-certificate /etc/crts/redhat.ren.crt --ssl-key /etc/crts/redhat.ren.key
# try to deploy as systemd service
cat << EOF > /etc/systemd/system/sushy-emulator.service
[Unit]
Description=sushy-emulator
[Service]
User=root
WorkingDirectory=/root
ExecStart=/bin/bash -c '/root/.local/bin/sushy-emulator -i 0.0.0.0 --ssl-certificate /etc/crts/redhat.ren.crt --ssl-key /etc/crts/redhat.ren.key'
Restart=always
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now sushy-emulator.service
get mac and vm list on 103
With the virtual BMC in place, we extract some parameters the openshift installer needs: the mac addresses of the kvm guests and the uuids required by redfish.
The following script collects them automatically and uploads them to the helper node.
# on helper clean all
/bin/rm -f /data/install/mac.list.*
/bin/rm -f /data/install/vm.list.*
# back to 103
cd /data/kvm/
for i in ocp4-ipi-osp-master-0{1..3} ocp4-ipi-osp-worker-0{1..1}
do
echo -ne "${i}\t" ;
virsh dumpxml ${i} | grep "mac address" | cut -d\' -f2 | tr '\n' '\t'
echo
done > mac.list.103
cat /data/kvm/mac.list.103
# ocp4-ipi-osp-master-01 52:54:00:67:64:5f 52:54:00:e8:28:e7 52:54:00:4a:a4:39
# ocp4-ipi-osp-master-02 52:54:00:ac:ed:36 52:54:00:b5:34:c4 52:54:00:87:36:75
# ocp4-ipi-osp-master-03 52:54:00:ae:72:e5 52:54:00:87:19:c2 52:54:00:99:55:12
# ocp4-ipi-osp-worker-01 52:54:00:17:b2:2d 52:54:00:ca:74:c0 52:54:00:f4:5e:a8
cat << 'EOF' > redfish.sh
#!/usr/bin/env bash
curl -k -s https://127.0.0.1:8000/redfish/v1/Systems/ | jq -r '.Members[]."@odata.id"' > list
while read -r line; do
curl -k -s https://127.0.0.1:8000/$line | jq -j '.Id, " ", .Name, "\n" '
done < list
EOF
bash redfish.sh | grep ipi > /data/kvm/vm.list.103
cat /data/kvm/vm.list.103
# 6b9a4f6b-d751-4fd5-9493-39792039e9e2 ocp4-ipi-osp-worker-01
# 1a2d1e2a-5f50-49cf-920e-11f7b7f136dc ocp4-ipi-osp-master-02
# 9c7085a2-ed0c-4cbf-94ca-065d3e8db335 ocp4-ipi-osp-master-01
# 14474c89-152c-4580-8bbb-7f03e4e370e0 ocp4-ipi-osp-master-03
scp /data/kvm/{mac,vm}.list.* root@192.168.7.11:/data/install/
on helper node
At last all the preparation is done and we can start the openshift installation on the helper node. Before this there is also a helper node configuration step, mainly setting up dns and similar services; it is not repeated here, and those who need the details can refer to the documentation.
get installer binary
First, we extract the installer binaries from the installation media directory.
# switch to your install version
export BUILDNUMBER=4.11.6
pushd /data/ocp4/${BUILDNUMBER}
tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
tar -xzf oc-mirror.tar.gz -C /usr/local/bin/
chmod +x /usr/local/bin/oc-mirror
install -m 755 /data/ocp4/clients/butane-amd64 /usr/local/bin/butane
install -m 755 /data/ocp4/clients/coreos-installer_amd64 /usr/local/bin/coreos-installer
popd
prepare web server for iso/images
Next, we set up an auto-starting web server that serves downloads of the iso and other images.
############################
# as root create web server
cd /data/ocp4
python3 -m http.server 8080
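# (the line above is just a quick manual test; stop it with Ctrl-C before enabling the systemd unit below)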
cat << EOF > /etc/systemd/system/local-webserver.service
[Unit]
Description=local-webserver
[Service]
User=root
WorkingDirectory=/data/ocp4
ExecStart=/bin/bash -c 'python3 -m http.server 8080'
Restart=always
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now local-webserver.service
# end as root
############################
create the install yaml
Next we create the install configuration files. The key piece is the yaml template: in it we enable the IPI installation mode, configure the redfish endpoint information for the 3 masters, and enable static IP installation with the static IP details.
Once the install-config yaml is created, we call the installer to turn it into the real installation artifacts (ignition files and manifests), and copy them, together with the baremetal installer binary, to the physical host.
There are 2 binaries involved. One is the regular openshift installer; in the usual scenarios, such as public or private cloud, it is enough: it creates the ignition files and calls the cloud APIs to create virtual machines and start the installation.
For the baremetal scenario, however, there is a separate baremetal installer binary that reads the configuration, calls the BMC interfaces of the physical machines, and starts the installation. This distinction reflects the current openshift version; it may change in the future.
# create a user and create the cluster under the user
useradd -m 3nodeipi
su - 3nodeipi
ssh-keygen
cat << EOF > ~/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
chmod 600 ~/.ssh/config
cat << 'EOF' >> ~/.bashrc
export BASE_DIR='/home/3nodeipi/'
EOF
export BASE_DIR='/home/3nodeipi/'
export BUILDNUMBER=4.11.6
mkdir -p ${BASE_DIR}/data/{sno/disconnected,install}
# set some parameters of your cluster
NODE_SSH_KEY="$(cat ${BASE_DIR}/.ssh/id_rsa.pub)"
INSTALL_IMAGE_REGISTRY=quaylab.infra.redhat.ren:8443
PULL_SECRET='{"auths":{"registry.redhat.io": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"'${INSTALL_IMAGE_REGISTRY}'": {"auth": "'$( echo -n 'admin:shadowman' | openssl base64 )'","email": "noemail@localhost"}}}'
NTP_SERVER=192.168.7.11
HELP_SERVER=192.168.7.11
KVM_HOST=192.168.7.11
API_VIP=192.168.7.100
INGRESS_VIP=192.168.7.101
CLUSTER_PROVISION_IP=192.168.7.103
BOOTSTRAP_IP=192.168.7.12
# define the cluster node information
SNO_CLUSTER_NAME=acm-demo-one
SNO_BASE_DOMAIN=redhat.ren
BOOTSTRAP_IP=192.168.7.22
MASTER_01_IP=192.168.7.23
MASTER_02_IP=192.168.7.24
MASTER_03_IP=192.168.7.25
WORKER_01_IP=192.168.7.26
BOOTSTRAP_HOSTNAME=bootstrap-demo
MASTER_01_HOSTNAME=master-01-demo
MASTER_02_HOSTNAME=master-02-demo
MASTER_03_HOSTNAME=master-03-demo
WORKER_01_HOSTNAME=worker-01-demo
BOOTSTRAP_INTERFACE=enp1s0
MASTER_01_INTERFACE=enp1s0
MASTER_02_INTERFACE=enp1s0
MASTER_03_INTERFACE=enp1s0
WORKER_01_INTERFACE=enp1s0
BOOTSTRAP_DISK=/dev/vda
MASTER_01_DISK=/dev/vda
MASTER_02_DISK=/dev/vda
MASTER_03_DISK=/dev/vda
WORKER_01_DISK=/dev/vda
OCP_GW=192.168.7.11
OCP_NETMASK=255.255.255.0
OCP_NETMASK_S=24
OCP_DNS=192.168.7.11
# echo ${SNO_IF_MAC} > /data/sno/sno.mac
mkdir -p ${BASE_DIR}/data/install
cd ${BASE_DIR}/data/install
/bin/rm -rf *.ign .openshift_install_state.json auth bootstrap manifests master*[0-9] worker*[0-9] openshift
cat << EOF > ${BASE_DIR}/data/install/install-config.yaml
apiVersion: v1
baseDomain: $SNO_BASE_DOMAIN
compute:
- name: worker
replicas: 0
controlPlane:
name: master
replicas: 3
metadata:
name: $SNO_CLUSTER_NAME
networking:
# OVNKubernetes , OpenShiftSDN
networkType: OVNKubernetes
clusterNetwork:
- cidr: 10.128.0.0/14
hostPrefix: 23
serviceNetwork:
- 172.31.0.0/16
machineNetwork:
- cidr: 192.168.7.0/24
pullSecret: '${PULL_SECRET}'
sshKey: |
$( cat ${BASE_DIR}/.ssh/id_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /etc/crts/redhat.ren.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/openshift/release-images
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- ${INSTALL_IMAGE_REGISTRY}/openshift/release
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
platform:
baremetal:
apiVIP: $API_VIP
ingressVIP: $INGRESS_VIP
provisioningNetwork: "Managed"
provisioningNetworkCIDR: 172.22.0.0/24
provisioningNetworkInterface: enp2s0
provisioningBridge: br-prov
clusterProvisioningIP: 172.22.0.6
bootstrapProvisioningIP: 172.22.0.7
bootstrapExternalStaticIP: 192.168.7.22/24
bootstrapExternalStaticGateway: 192.168.7.11
externalBridge: baremetal
bootstrapOSImage: http://192.168.7.11:8080/rhcos-qemu.x86_64.qcow2.gz?sha256=$(zcat /data/ocp4/rhcos-qemu.x86_64.qcow2.gz | sha256sum | awk '{print $1}')
clusterOSImage: http://192.168.7.11:8080/rhcos-openstack.x86_64.qcow2.gz?sha256=$(sha256sum /data/ocp4/rhcos-openstack.x86_64.qcow2.gz | awk '{print $1}')
hosts:
- name: ocp4-ipi-osp-master-01
role: master
bootMode: legacy
bmc:
address: redfish-virtualmedia://192.168.7.103:8000/redfish/v1/Systems/$(cat /data/install/vm.list.* | grep master-01 | awk '{print $1}')
username: admin
password: password
disableCertificateVerification: True
bootMACAddress: $(cat /data/install/mac.list.* | grep master-01 | awk '{print $2}')
rootDeviceHints:
deviceName: "$MASTER_01_DISK"
networkConfig:
dns-resolver:
config:
server:
- ${OCP_DNS}
interfaces:
- ipv4:
address:
- ip: ${MASTER_01_IP}
prefix-length: ${OCP_NETMASK_S}
dhcp: false
enabled: true
name: ${MASTER_01_INTERFACE}
state: up
type: ethernet
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: ${OCP_GW}
next-hop-interface: ${MASTER_01_INTERFACE}
table-id: 254
- name: ocp4-ipi-osp-master-02
role: master
bootMode: legacy
bmc:
address: redfish-virtualmedia://192.168.7.103:8000/redfish/v1/Systems/$(cat /data/install/vm.list.* | grep master-02 | awk '{print $1}')
username: admin
password: password
disableCertificateVerification: True
bootMACAddress: $(cat /data/install/mac.list.* | grep master-02 | awk '{print $2}')
rootDeviceHints:
deviceName: "$MASTER_02_DISK"
networkConfig:
dns-resolver:
config:
server:
- ${OCP_DNS}
interfaces:
- ipv4:
address:
- ip: ${MASTER_02_IP}
prefix-length: ${OCP_NETMASK_S}
dhcp: false
enabled: true
name: ${MASTER_02_INTERFACE}
state: up
type: ethernet
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: ${OCP_GW}
next-hop-interface: ${MASTER_02_INTERFACE}
table-id: 254
- name: ocp4-ipi-osp-master-03
role: master
bootMode: legacy
bmc:
address: redfish-virtualmedia://192.168.7.103:8000/redfish/v1/Systems/$(cat /data/install/vm.list.* | grep master-03 | awk '{print $1}')
username: admin
password: password
disableCertificateVerification: True
bootMACAddress: $(cat /data/install/mac.list.* | grep master-03 | awk '{print $2}')
rootDeviceHints:
deviceName: "$MASTER_03_DISK"
networkConfig:
dns-resolver:
config:
server:
- ${OCP_DNS}
interfaces:
- ipv4:
address:
- ip: ${MASTER_03_IP}
prefix-length: ${OCP_NETMASK_S}
dhcp: false
enabled: true
name: ${MASTER_03_INTERFACE}
state: up
type: ethernet
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: ${OCP_GW}
next-hop-interface: ${MASTER_03_INTERFACE}
table-id: 254
EOF
/bin/cp -f ${BASE_DIR}/data/install/install-config.yaml ${BASE_DIR}/data/install/install-config.yaml.bak
/data/ocp4/${BUILDNUMBER}/openshift-baremetal-install --dir ${BASE_DIR}/data/install/ create manifests
/bin/cp -f /data/ocp4/ansible-helper/files/* ${BASE_DIR}/data/install/openshift/
#############################################
# run the following as root at least once, if you have not done so already
# it will generate registry configuration
# copy image registry proxy related config
cd /data/ocp4
bash image.registries.conf.sh nexus.infra.redhat.ren:8083
/bin/cp -f /data/ocp4/image.registries.conf /etc/containers/registries.conf.d/
#############################################
/bin/cp -f /data/ocp4/99-worker-container-registries.yaml ${BASE_DIR}/data/install/openshift
/bin/cp -f /data/ocp4/99-master-container-registries.yaml ${BASE_DIR}/data/install/openshift
cd ${BASE_DIR}/data/install/
# then, we copy baremetal install binary to kvm host
sshpass -p panpan ssh-copy-id root@172.21.6.103
scp /data/ocp4/${BUILDNUMBER}/openshift-baremetal-install root@172.21.6.103:/usr/local/bin/
# then, we copy the configuration files to the kvm host
cat << EOF > ${BASE_DIR}/data/install/scp.sh
ssh root@172.21.6.103 "rm -rf /data/install;"
scp -r ${BASE_DIR}/data/install root@172.21.6.103:/data/install
EOF
bash ${BASE_DIR}/data/install/scp.sh
kvm host (103) to begin install
At this point everything is ready, and we can actually start the installation on the physical host. There is nothing special for us to do here: because this is IPI mode, everything is automated. We just run the command, wait for the installation to succeed, and record the various passwords printed in the output.
cd /data/install
openshift-baremetal-install --dir /data/install/ --log-level debug create cluster
# ......
# INFO Install complete!
# INFO To access the cluster as the system:admin user when using 'oc', run
# INFO export KUBECONFIG=/data/install/auth/kubeconfig
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.acm-demo-one.redhat.ren
# INFO Login to the console with user: "kubeadmin", and password: "JgTXJ-d9Nsb-QHGS2-Puor3"
# DEBUG Time elapsed per stage:
# DEBUG bootstrap: 23s
# DEBUG masters: 16m31s
# DEBUG Bootstrap Complete: 19m11s
# DEBUG Bootstrap Destroy: 11s
# DEBUG Cluster Operators: 7m10s
# INFO Time elapsed: 43m37s
# tail -f /data/install/.openshift_install.log
on helper to see result
We need to copy the credential files and related information from the physical host back to the helper node, to make our subsequent operations easier.
# on helper node
scp -r root@172.21.6.103:/data/install/auth ${BASE_DIR}/data/install/auth
cd ${BASE_DIR}/data/install
export KUBECONFIG=${BASE_DIR}/data/install/auth/kubeconfig
echo "export KUBECONFIG=${BASE_DIR}/data/install/auth/kubeconfig" >> ~/.bashrc
# oc completion bash | sudo tee /etc/bash_completion.d/openshift > /dev/null
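# quick sanity check that the cluster answers from the helper node
oc get nodes
oc get co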
# if you power off the cluster for a long time
# you will need to re-approve the csrs
oc get csr | grep -v Approved
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
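# a hypothetical helper loop (not part of the original flow): keep approving new CSRs while the nodes rejoin
for x in {1..10}
do
  oc get csr -o json | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs --no-run-if-empty oc adm certificate approve
  sleep 30
done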
password login and oc config
The installation is done; now we adjust the ssh login settings on the nodes. By default openshift disables root ssh login and enables a session timeout, which makes lab work inconvenient, so we use a script here to lift these restrictions. In the end we can log in remotely as root directly with a password.
# init setting for helper node
cat << EOF > ~/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
chmod 600 ~/.ssh/config
cat > ${BASE_DIR}/data/install/crack.txt << EOF
echo redhat | sudo passwd --stdin root
sudo sed -i "s|^PasswordAuthentication no$|PasswordAuthentication yes|g" /etc/ssh/sshd_config
sudo sed -i "s|^PermitRootLogin no$|PermitRootLogin yes|g" /etc/ssh/sshd_config
sudo sed -i "s|^#ClientAliveInterval 180$|ClientAliveInterval 1800|g" /etc/ssh/sshd_config
sudo systemctl restart sshd
sudo sh -c 'echo "export KUBECONFIG=/etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/node-kubeconfigs/localhost.kubeconfig" >> /root/.bashrc'
sudo sh -c 'echo "RET=\\\`oc config use-context system:admin\\\`" >> /root/.bashrc'
EOF
for i in 23 24 25
do
ssh core@192.168.7.$i < ${BASE_DIR}/data/install/crack.txt
done
from other host
With remote password login working, we also want passwordless ssh. Since there are quite a few openshift nodes, configuring them one by one is tedious, so here is a script to do it in batch.
# https://unix.stackexchange.com/questions/230084/send-the-password-through-stdin-in-ssh-copy-id
dnf install -y sshpass
for i in 23 24 25
do
sshpass -p 'redhat' ssh-copy-id root@192.168.7.$i
done
power off
A home lab needs to be powered off to save electricity when nobody is using it, so here is a script to conveniently shut down the openshift cluster.
for i in 23 24 25
do
ssh root@192.168.7.$i poweroff
done
reboot
Sometimes all openshift nodes need a reboot to clear errors. Here is a script for that as well.
for i in 23 24 25
do
ssh root@192.168.7.$i reboot
done
power on
Similarly, we have a script to start the virtual machines in batch.
# or
for i in {1..3}
do
virsh start ocp4-ipi-osp-master-0$i
done
check info
Day to day we also need to collect information from each cluster node and run small scripted tasks, so script templates are provided here to help with routine work.
for i in 23 24 25
do
ssh root@192.168.7.$i "ip a"
done
cat > ${BASE_DIR}/data/install/crack.txt << 'EOF'
for i in {3..8}
do
nmcli con down enp${i}s0
nmcli con del enp${i}s0
done
EOF
for i in 23 24 25
do
ssh root@192.168.7.$i < ${BASE_DIR}/data/install/crack.txt
done
try to deploy gitea
Next we deploy a git service on the helper node. During the installation, openstack first uploads the deployment scripts and configuration to a git server as a git commit, and the actual deployment then downloads from that commit and executes it.
We use gitea to provide this git service; there are plenty of tutorials online, so we install it in the simplest way. However, openstack has a special requirement for the git service: it must be reachable over ssh, which needs testing. We expose it on a non-standard ssh port, which later causes a series of compatibility errors.
In any case, ssh access to the git service has to be configured, and we also need ssh key authentication so that access is key based.
Finally, the command lines to test ssh access to the git service are given for verification.
rm -rf /data/ccn/gitea
mkdir -p /data/ccn/gitea
chown -R 1000:1000 /data/ccn/gitea
podman run -d --replace --name gitea \
-v /data/ccn/gitea:/data:Z \
-v /etc/localtime:/etc/localtime:ro \
-e USER_UID=1000 \
-e USER_GID=1000 \
-p 10090:3000 \
-p 10022:22 \
docker.io/gitea/gitea:1.17.3
# use systemd to auto-run gitea
podman generate systemd --files --name gitea
/bin/cp -Zf container-gitea.service /etc/systemd/system/
systemctl daemon-reload
systemctl enable --now container-gitea.service
# http://quaylab.infra.redhat.ren:10090/
# root / redhat
# setup ssh key for gitea
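# a minimal sketch (assumption, not from the original doc): register the public half of the key pair
# that will be used for git access in the gitea web UI (user settings -> SSH / GPG Keys), for example:
# cat ~/.ssh/id_rsa.pub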
# test the ssh git access
ssh -T -p 10022 git@quaylab.infra.redhat.ren
git clone ssh://git@quaylab.infra.redhat.ren:10022/root/demo.git
add baremetal host
In IPI mode, adding a new node is very convenient: just define a BareMetalHost. For our osp on ocp experiment, adding one worker node is enough; later the osp operator will reformat this worker node and add it to the osp cluster.
The configuration is simple, but what exactly does openshift do behind the scenes to manage this baremetal node? After repeated experiments, the author summarizes the related behavior roughly as follows:
- When a BareMetalHost is defined for the first time, ocp calls the redfish endpoint to power on the node and mounts a customized rhcos iso to boot it. The iso carries some network parameters and defines auto-start services. After boot, an ironic agent is started by default; this agent connects to the machine api to fetch tasks. If there is no special task, it inspects the host environment (number of cores, memory, storage, and so on), reports the result to the machine api, and then powers the node off automatically.
- If the BareMetalHost also defines an image, the ironic agent downloads this image (converted earlier by the metal3 service), writes it to disk, and reboots. The commands that run automatically during this part were recorded by the author, here.
In the architecture diagram, this step corresponds to this part:
cd ${BASE_DIR}/data/install/
cat << EOF > ${BASE_DIR}/data/install/bmh-01.yaml
---
apiVersion: v1
kind: Secret
metadata:
name: worker-1-bmc-secret
type: Opaque
data:
username: $(echo -ne "admin" | base64)
password: $(echo -ne "password" | base64)
---
apiVersion: v1
kind: Secret
metadata:
name: ocp4-ipi-osp-worker-01-network-config-secret
type: Opaque
stringData:
nmstate: |
dns-resolver:
config:
server:
- 192.168.7.11
interfaces:
- ipv4:
address:
- ip: 192.168.7.26
prefix-length: 24
dhcp: false
enabled: true
name: enp1s0
state: up
type: ethernet
routes:
config:
- destination: 0.0.0.0/0
next-hop-address: 192.168.7.11
next-hop-interface: enp1s0
table-id: 254
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: ocp4-ipi-osp-worker-01
spec:
online: false
bootMode: legacy
# externallyProvisioned: true
# hardwareProfile: unknown
bootMACAddress: $(cat /data/install/mac.list.* | grep worker-01 | awk '{print $2}')
bmc:
address: redfish-virtualmedia://192.168.7.103:8000/redfish/v1/Systems/$(cat /data/install/vm.list.* | grep worker-01 | awk '{print $1}')
credentialsName: worker-1-bmc-secret
disableCertificateVerification: true
rootDeviceHints:
deviceName: /dev/vda
preprovisioningNetworkDataName: ocp4-ipi-osp-worker-01-network-config-secret
EOF
oc -n openshift-machine-api create -f ${BASE_DIR}/data/install/bmh-01.yaml
# oc delete -f ${BASE_DIR}/data/install/bmh.yaml -n openshift-machine-api
# DO NOT USE, restore, delete the vm
# oc -n openshift-machine-api delete -f ${BASE_DIR}/data/install/bmh.yaml
# oc delete -f ${BASE_DIR}/data/install/bmh-03.yaml -n openshift-machine-api
oc get bmh -n openshift-machine-api
# NAME STATE CONSUMER ONLINE ERROR AGE
# ocp4-ipi-osp-master-01 externally provisioned acm-demo-one-42z8b-master-0 true 3h23m
# ocp4-ipi-osp-master-02 externally provisioned acm-demo-one-42z8b-master-1 true 3h23m
# ocp4-ipi-osp-master-03 externally provisioned acm-demo-one-42z8b-master-2 true 3h23m
# ocp4-ipi-osp-worker-01 externally provisioned true 54s
oc get machinesets -n openshift-machine-api
# NAME DESIRED CURRENT READY AVAILABLE AGE
# acm-demo-one-42z8b-worker-0 0 0 3h25m
# oc get machinesets -n openshift-machine-api -o json | jq -r .items[0].metadata.name
# # scaling the workers up to 3 replicas would trigger the deployment of worker-2
# oc scale --replicas=1 machineset $(oc get machinesets -n openshift-machine-api -o json | jq -r .items[0].metadata.name) -n openshift-machine-api
# oc scale --replicas=0 machineset $(oc get machinesets -n openshift-machine-api -o json | jq -r .items[0].metadata.name) -n openshift-machine-api
install nfs
For any non-trivial workload, the ocp cluster we installed needs storage. Since this is a home lab, we want a lightweight storage solution. Red Hat's own ODF is fairly resource hungry, so we choose the k8s sig NFS solution and install an NFS service into the cluster. The idea of this approach is to turn one node of the cluster into a storage node and use a directory on that node as the data store, while applications/pods can run on any node in the cluster. Overall it is simple, but performance is limited.
If you have other requirements, consider the cnv host-path solution or openEBS.
add local volume
We create a local volume as the backend storage for the k8s sig NFS solution.
# go to master-03, this is as storage node
# create the storage path
mkdir -p /var/wzh-local-pv/
chcon -Rt container_file_t /var/wzh-local-pv/
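# optional sketch (assumption): the two commands above can also be run from the helper node over ssh,
# since root password login was enabled earlier (master-03 is 192.168.7.25 in this lab)
# ssh root@192.168.7.25 "mkdir -p /var/wzh-local-pv/ && chcon -Rt container_file_t /var/wzh-local-pv/"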
# on helper
cat << EOF > ${BASE_DIR}/data/install/local-pv.yaml
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: local-volume
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: example-local-pv
spec:
capacity:
storage: 450Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storageClassName: local-volume
local:
path: /var/wzh-local-pv/
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- one-master-03.acm-demo-one.redhat.ren
EOF
oc create --save-config -f ${BASE_DIR}/data/install/local-pv.yaml
# oc delete -f ${BASE_DIR}/data/install/local-pv.yaml
install nfs based on local pv
Next we deploy the NFS service directly from yaml. We have already customized the startup yaml of the k8s sig NFS service; we download it, adjust a couple of parameters, and it can start right away. It creates the corresponding role, deployment, and other resources.
oc create ns nfs-system
# oc project nfs-system
cd ${BASE_DIR}/data/install
export http_proxy="http://127.0.0.1:18801"
export https_proxy=${http_proxy}
wget -O nfs.all.yaml https://raw.githubusercontent.com/wangzheng422/nfs-ganesha-server-and-external-provisioner/wzh/deploy/openshift/nfs.all.local.pv.yaml
unset http_proxy
unset https_proxy
/bin/cp -f nfs.all.yaml nfs.all.yaml.run
# sed -i 's/storageClassName: odf-lvm-vg1/storageClassName: local-volume/' nfs.all.yaml.run
sed -i 's/one-master-03.acm-demo-one.redhat.ren/one-master-03.acm-demo-one.redhat.ren/' nfs.all.yaml.run
sed -i 's/storage: 5Gi/storage: 450Gi/' nfs.all.yaml.run
oc create --save-config -n nfs-system -f nfs.all.yaml.run
# oc delete -n nfs-system -f nfs.all.yaml.run
install cnv, nmstate, sriov operators
Following the official documentation of the openstack operator, we install a few dependent operators:
- cnv: runs kvm virtual machines inside the openshift cluster; the openstack overcloud controllers run in kvm guests started by cnv.
- nmstate: configures the network interfaces of openshift nodes; openstack defines quite complex network parameters, as the architecture diagram above shows. Note that nmstate can only modify interfaces on nodes managed by the ocp cluster; nodes whose base image has been replaced and that are now managed by osp are out of its reach.
- sriov: configures NIC passthrough. The author is not entirely sure where it is used; it currently appears that when cnv starts a kvm guest, sriov passes a set of NICs through into that guest.
During installation we first use automatic approval, which saves the manual approval clicks. After installation we switch to manual approval to prevent the cluster from upgrading the operators automatically. Automatic upgrades are a nice feature, but on an already-installed cluster an unnoticed automatic upgrade can easily fail and impact cluster functionality; a sketch of approving an upgrade by hand is shown below.
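With a subscription set to Manual, later operator upgrades wait for a human. A minimal sketch of approving a pending InstallPlan by hand, using openshift-cnv as an example namespace (the plan name is a placeholder, not from this document):
# list pending install plans, then approve the one you want by name (install-xxxxx is a placeholder)
oc get installplan -n openshift-cnv
# oc patch installplan install-xxxxx -n openshift-cnv --type merge -p '{"spec":{"approved":true}}'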
install cnv
We install CNV first; it is used to start virtual machines inside ocp. CNV brings up a large number of pods, so the installation takes a while.
# install cnv
cat << EOF > ${BASE_DIR}/data/install/cnv.yaml
apiVersion: v1
kind: Namespace
metadata:
name: openshift-cnv
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: kubevirt-hyperconverged-group
namespace: openshift-cnv
spec:
targetNamespaces:
- openshift-cnv
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: hco-operatorhub
namespace: openshift-cnv
spec:
source: redhat-operators
sourceNamespace: openshift-marketplace
name: kubevirt-hyperconverged
startingCSV: kubevirt-hyperconverged-operator.v4.11.0
channel: "stable"
EOF
oc create --save-config -f ${BASE_DIR}/data/install/cnv.yaml
# oc delete -f ${BASE_DIR}/data/install/cnv.yaml
oc get csv -n openshift-cnv
# NAME DISPLAY VERSION REPLACES PHASE
# kubevirt-hyperconverged-operator.v4.11.0 OpenShift Virtualization 4.11.0 kubevirt-hyperconverged-operator.v4.10.5 Succeeded
cat << EOF > ${BASE_DIR}/data/install/patch.yaml
spec:
installPlanApproval: Manual
EOF
oc patch -n openshift-cnv subscription/hco-operatorhub --type merge --patch-file=${BASE_DIR}/data/install/patch.yaml
cat << EOF > ${BASE_DIR}/data/install/cnv-hc.yaml
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
name: kubevirt-hyperconverged
namespace: openshift-cnv
spec:
EOF
oc create --save-config -f ${BASE_DIR}/data/install/cnv-hc.yaml
# cat << EOF > ${BASE_DIR}/data/install/hostpath.yaml
# apiVersion: hostpathprovisioner.kubevirt.io/v1beta1
# kind: HostPathProvisioner
# metadata:
# name: hostpath-provisioner
# spec:
# imagePullPolicy: IfNotPresent
# storagePools:
# - name: wzh-cnv-hostpath-storage-pool
# path: "/var/wzh-cnv-hostpath"
# workload:
# nodeSelector:
# kubernetes.io/os: linux
# EOF
# oc create --save-config -f ${BASE_DIR}/data/install/hostpath.yaml
# cat << EOF > ${BASE_DIR}/data/install/sc.yaml
# apiVersion: storage.k8s.io/v1
# kind: StorageClass
# metadata:
# name: hostpath-csi
# provisioner: kubevirt.io.hostpath-provisioner
# reclaimPolicy: Delete
# # volumeBindingMode: WaitForFirstConsumer
# volumeBindingMode: Immediate
# parameters:
# storagePool: wzh-cnv-hostpath-storage-pool
# EOF
# oc create --save-config -f ${BASE_DIR}/data/install/sc.yaml
# oc delete -f ${BASE_DIR}/data/install/sc.yaml
install nmstate
Next we install nmstate, which is used to configure NICs. Remember not to touch the interfaces that the ocp cluster itself uses for communication.
cat << EOF > ${BASE_DIR}/data/install/nmstat.yaml
---
apiVersion: v1
kind: Namespace
metadata:
labels:
kubernetes.io/metadata.name: openshift-nmstate
name: openshift-nmstate
name: openshift-nmstate
spec:
finalizers:
- kubernetes
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
annotations:
olm.providedAPIs: NMState.v1.nmstate.io
generateName: openshift-nmstate-
name: openshift-nmstate-wzh
namespace: openshift-nmstate
spec:
targetNamespaces:
- openshift-nmstate
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
labels:
operators.coreos.com/kubernetes-nmstate-operator.openshift-nmstate: ""
name: kubernetes-nmstate-operator
namespace: openshift-nmstate
spec:
channel: "4.11"
name: kubernetes-nmstate-operator
source: redhat-operators
sourceNamespace: openshift-marketplace
EOF
oc create --save-config -f ${BASE_DIR}/data/install/nmstat.yaml
oc get csv -n openshift-nmstate
# NAME DISPLAY VERSION REPLACES PHASE
# kubernetes-nmstate-operator.4.11.0-202210250857 Kubernetes NMState Operator 4.11.0-202210250857 Succeeded
cat << EOF > ${BASE_DIR}/data/install/patch.yaml
spec:
installPlanApproval: Manual
EOF
oc patch -n openshift-nmstate subscription/kubernetes-nmstate-operator --type merge --patch-file=${BASE_DIR}/data/install/patch.yaml
cat << EOF > ${BASE_DIR}/data/install/nmstat-stat.yaml
---
apiVersion: nmstate.io/v1
kind: NMState
metadata:
name: nmstate
EOF
oc create --save-config -f ${BASE_DIR}/data/install/nmstat-stat.yaml
install sriov
Finally, we install sriov, which configures NIC passthrough for the kvm guests started by cnv.
# oc annotate ns/openshift-sriov-network-operator workload.openshift.io/allowed=management
cat << EOF > ${BASE_DIR}/data/install/sriov.yaml
---
apiVersion: v1
kind: Namespace
metadata:
name: openshift-sriov-network-operator
annotations:
workload.openshift.io/allowed: management
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: sriov-network-operators
namespace: openshift-sriov-network-operator
spec:
targetNamespaces:
- openshift-sriov-network-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: sriov-network-operator-subscription
namespace: openshift-sriov-network-operator
spec:
channel: "4.11"
name: sriov-network-operator
source: redhat-operators
sourceNamespace: openshift-marketplace
EOF
oc create --save-config -f ${BASE_DIR}/data/install/sriov.yaml
oc get csv -n openshift-sriov-network-operator
# NAME DISPLAY VERSION REPLACES PHASE
# sriov-network-operator.4.11.0-202210250857 SR-IOV Network Operator 4.11.0-202210250857 Succeeded
oc get subscription -n openshift-sriov-network-operator
# NAME PACKAGE SOURCE CHANNEL
# sriov-network-operator-subscription sriov-network-operator redhat-operators 4.11
oc get subscription/sriov-network-operator-subscription -n openshift-sriov-network-operator -o json | jq .spec
# {
# "channel": "4.11",
# "name": "sriov-network-operator",
# "source": "redhat-operators",
# "sourceNamespace": "openshift-marketplace"
# }
cat << EOF > ${BASE_DIR}/data/install/patch.yaml
spec:
installPlanApproval: Manual
EOF
oc patch -n openshift-sriov-network-operator subscription/sriov-network-operator-subscription --type merge --patch-file=${BASE_DIR}/data/install/patch.yaml
install osp operator
We are about to start installing the openstack components. We follow the official documentation, which is carefully written, but it still contains some errors. We will correct them in the following steps.
build operator images
Surprisingly, the first installation step is to build the osp operator image ourselves. Well, this is tech-preview software, so some rough edges are understandable. According to the documentation, we need to find the latest version and build the operator catalog image ourselves. This image is an operator hub catalog; you can think of it as opening a new storefront in the ocp application store, named openstack, with exactly one product inside, also named openstack.
# https://github.com/openstack-k8s-operators/osp-director-operator
# [osp-director-operator](https://catalog.redhat.com/software/containers/rhosp-rhel8-tech-preview/osp-director-operator/607dd3bf87c834779d77611b)
# [osp-director-operator-bundle](https://catalog.redhat.com/software/containers/rhosp-rhel8-tech-preview/osp-director-operator-bundle/607dd43903f4b3563ab483b3)
#########################
# on helper
# run as root
cd /data/ocp4/4.11.6/
tar zvxf opm-linux-4.11.6.tar.gz
install opm /usr/local/bin/
/bin/cp -f /etc/containers/policy.json /etc/containers/policy.json.bak
cat << EOF > /etc/containers/policy.json
{
"default": [
{
"type": "insecureAcceptAnything"
}
],
"transports":
{
"docker-daemon":
{
"": [{"type":"insecureAcceptAnything"}]
}
}
}
EOF
# end run as root
#########################
# registry.redhat.io/rhosp-rhel8-tech-preview/osp-director-operator-bundle:1.2.3-12
BUNDLE_IMG="registry.redhat.io/rhosp-rhel8-tech-preview/osp-director-operator-bundle:1.2.3-12"
INDEX_IMG="quay.io/wangzheng422/osp-director-operator-index:1.2.3-12"
opm index add --bundles ${BUNDLE_IMG} --tag ${INDEX_IMG} -u podman --pull-tool podman
podman push ${INDEX_IMG}
install openstack director operator
With the openstack catalog image built, we use it to deploy a catalog source and install the openstack director operator.
oc new-project openstack
cat << EOF > ${BASE_DIR}/data/install/osp-director-operator.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: osp-director-operator-index
namespace: openstack
spec:
sourceType: grpc
# image: quay.io/openstack-k8s-operators/osp-director-operator-index:1.0.0-1
# image: quay.io/openstack-k8s-operators/osp-director-operator-index:1.2.3
image: quay.io/wangzheng422/osp-director-operator-index@sha256:ac810497a3b29662573e0843715285a1ad69e3fe7a8c7b6e5fe43d2f6d5bda8d
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: "osp-director-operator-group"
namespace: openstack
spec:
targetNamespaces:
- openstack
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: osp-director-operator-subscription
namespace: openstack
spec:
config:
env:
- name: WATCH_NAMESPACE
value: openstack,openshift-machine-api,openshift-sriov-network-operator
source: osp-director-operator-index
sourceNamespace: openstack
name: osp-director-operator
EOF
oc create --save-config -f ${BASE_DIR}/data/install/osp-director-operator.yaml
# oc delete -f ${BASE_DIR}/data/install/osp-director-operator.yaml
oc get operators
# NAME AGE
# kubernetes-nmstate-operator.openshift-nmstate 21h
# kubevirt-hyperconverged.openshift-cnv 22h
# osp-director-operator.openstack 17m
# sriov-network-operator.openshift-sriov-network-operator 21h
oc get csv -n openstack
# NAME DISPLAY VERSION REPLACES PHASE
# osp-director-operator.v1.2.3 OSP Director Operator 1.2.3 Succeeded
try to deploy osp
With the openstack director operator in place, we can really start installing an openstack overcloud step by step.
The document we follow is here: Chapter 7. Director operator deployment scenario: Overcloud with Hyper-Converged Infrastructure (HCI)
upload rhel image
openstack is a virtualization platform, so we need to prepare an operating system image. We download the official rhel 8.6 image and make the small customizations required by the documentation.
Then we upload the image with virtctl, the cnv command line tool.
# download rhel-8.6-x86_64-kvm.qcow2 from redhat website
ls -l /data/down | grep rhel
# -rw-r--r--. 1 root root 8770508800 Apr 27 2022 rhel-8.6-aarch64-dvd.iso
# -rw-r--r--. 1 root root 832438272 May 10 13:23 rhel-8.6-x86_64-kvm.qcow2
export PROXY="http://127.0.0.1:18801"
subscription-manager repos --proxy=$PROXY --enable=cnv-4.11-for-rhel-8-x86_64-rpms
dnf install -y kubevirt-virtctl libguestfs-tools-c
/bin/cp -f /data/down/rhel-8.6-x86_64-kvm.qcow2 /data/down/rhel-8.6-x86_64-kvm-wzh.qcow2
virt-customize -a /data/down/rhel-8.6-x86_64-kvm-wzh.qcow2 --run-command 'sed -i -e "s/^\(kernelopts=.*\)net.ifnames=0 \(.*\)/\1\2/" /boot/grub2/grubenv'
virt-customize -a /data/down/rhel-8.6-x86_64-kvm-wzh.qcow2 --run-command 'sed -i -e "s/^\(GRUB_CMDLINE_LINUX=.*\)net.ifnames=0 \(.*\)/\1\2/" /etc/default/grub'
virtctl version
# Client Version: version.Info{GitVersion:"v0.36.5-2-gdd266dff9", GitCommit:"dd266dff95f7de9f79e3e0e5d4867c5ba9d50c9d", GitTreeState:"clean", BuildDate:"2022-04-01T22:51:18Z", GoVersion:"go1.15.14", Compiler:"gc", Platform:"linux/amd64"}
# dial tcp [::1]:8080: connect: connection refused
# copy qcow2 to helper
scp /data/down/rhel-8.6-x86_64-kvm-wzh.qcow2 root@192.168.7.11:/data/swap/
# on helper download virtctl
export http_proxy="http://127.0.0.1:18801"
export https_proxy=${http_proxy}
export VERSION=v0.53.2
wget https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/virtctl-${VERSION}-linux-amd64
install -m 755 virtctl-${VERSION}-linux-amd64 /usr/local/bin/virtctl
unset http_proxy
unset https_proxy
su - 3nodeipi
oc get storageclass
# NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
# hostpath-csi kubevirt.io.hostpath-provisioner Delete Immediate false 8m36s
# redhat-ren-nfs redhat.ren/nfs Delete Immediate false 3m27s
virtctl image-upload dv openstack-base-img -n openstack --size=50Gi --image-path=/data/swap/rhel-8.6-x86_64-kvm-wzh.qcow2 --storage-class redhat-ren-nfs --access-mode ReadWriteOnce --insecure
# PVC openstack/openstack-base-img not found
# DataVolume openstack/openstack-base-img created
# Waiting for PVC openstack-base-img upload pod to be ready...
# Pod now ready
# Uploading data to https://cdi-uploadproxy-openshift-cnv.apps.acm-demo-one.redhat.ren
# 797.50 MiB / 797.50 MiB [===============================================================================================================================================================] 100.00% 13s
# Uploading data completed successfully, waiting for processing to complete, you can hit ctrl-c without interrupting the progress
# Processing completed successfully
# Uploading rhel-8.6-x86_64-kvm-wzh.qcow2 completed successfully
# virtctl image-upload dv openstack-base-img -n openstack --no-create --size=50Gi --image-path=/data/swap/rhel-8.6-x86_64-kvm-wzh.qcow2 --storage-class redhat-ren-nfs --access-mode ReadWriteOnce --insecure
oc get datavolume
# NAME PHASE PROGRESS RESTARTS AGE
# openstack-base-img UploadReady N/A 1 113s
# in some cases the import fails; just delete the data volume to restart it
# oc delete datavolume/openstack-base-img
# ensure there is only one pvc
oc get pv
# NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
# example-local-pv 450Gi RWO Retain Bound nfs-system/lvm-file-pvc local-volume 42m
# in some cases the import will never succeed,
# because cdi is killed by OOM; the reason is unknown.
# just reboot master-03 to fix that.
config key for git service, define default password
Next, we import the key for the git service; openstack will later upload the installation scripts there.
We also set the default username and password for the hosts.
oc create secret generic git-secret -n openstack --from-file=git_ssh_identity=${BASE_DIR}/.ssh/id_rsa --from-literal=git_url=ssh://git@quaylab.infra.redhat.ren:10022/root/openstack.git
# Setting the root password for nodes
echo -n "redhat" | base64
# cmVkaGF0
cat << EOF > ${BASE_DIR}/data/install/openstack-userpassword.yaml
apiVersion: v1
kind: Secret
metadata:
name: userpassword
namespace: openstack
data:
NodeRootPassword: "`echo -n "redhat" | base64`"
EOF
oc create --save-config -f ${BASE_DIR}/data/install/openstack-userpassword.yaml -n openstack
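# optional check (not from the original doc) that both secrets needed later are present
oc get secret git-secret userpassword -n openstack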
define network parameter
We define the network parameters used by openstack. This part is confusing: the IP address configuration in this definition is used by both the openstack controller and compute nodes, but the bridges, the NICs behind the bridges, and which bridge each network attaches to only apply to the openshift nodes.
Overall, this network configuration targets the openshift nodes, even though it is named OpenStackNetConfig.
In the architecture diagram, this step corresponds to this part:
# network link name no longer than 15
# https://access.redhat.com/solutions/2425471
# https://github.com/openstack-k8s-operators/osp-director-dev-tools/blob/osp16_tech_preview/ansible/templates/osp/tripleo_heat_envs/vlan/network-environment.yaml.j2
# https://github.com/openstack-k8s-operators/osp-director-dev-tools/blob/master/ansible/templates/osp/netconfig/osnetconfig.yaml.j2
cat << EOF > ${BASE_DIR}/data/install/openstacknetconfig.yaml
apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackNetConfig
metadata:
name: openstacknetconfig
spec:
attachConfigurations:
br-osp:
nodeNetworkConfigurationPolicy:
nodeSelector:
node-role.kubernetes.io/master: ""
desiredState:
interfaces:
- bridge:
options:
stp:
enabled: false
port:
- name: enp4s0
description: Linux bridge with enp4s0 as a port
name: br-osp
state: up
type: linux-bridge
mtu: 1500
br-osp-ex:
nodeNetworkConfigurationPolicy:
nodeSelector:
node-role.kubernetes.io/master: ""
desiredState:
interfaces:
- bridge:
options:
stp:
enabled: false
port:
- name: enp3s0
description: Linux bridge with enp3s0 as a port
name: br-osp-ex
state: up
type: linux-bridge
mtu: 1500
# optional DnsServers list
dnsServers:
- 192.168.7.11
# optional DnsSearchDomains list
dnsSearchDomains:
- osp-demo.redhat.ren
- some.other.domain
# DomainName of the OSP environment
domainName: osp-demo.redhat.ren
networks:
- name: Control
nameLower: ctlplane
subnets:
- name: ctlplane
ipv4:
allocationEnd: 192.168.7.60
allocationStart: 192.168.7.40
cidr: 192.168.7.0/24
gateway: 192.168.7.11
attachConfiguration: br-osp
- name: InternalApi
nameLower: internal_api
mtu: 1350
subnets:
- name: internal_api
attachConfiguration: br-osp
vlan: 20
ipv4:
allocationEnd: 172.17.0.250
allocationStart: 172.17.0.10
cidr: 172.17.0.0/24
- name: External
nameLower: external
subnets:
- name: external
ipv4:
allocationEnd: 172.21.6.60
allocationStart: 172.21.6.40
cidr: 172.21.6.0/24
gateway: 172.21.6.254
attachConfiguration: br-osp-ex
- name: Storage
nameLower: storage
mtu: 1500
subnets:
- name: storage
ipv4:
allocationEnd: 172.18.0.250
allocationStart: 172.18.0.10
cidr: 172.18.0.0/24
vlan: 30
attachConfiguration: br-osp
- name: StorageMgmt
nameLower: storage_mgmt
mtu: 1500
subnets:
- name: storage_mgmt
ipv4:
allocationEnd: 172.19.0.250
allocationStart: 172.19.0.10
cidr: 172.19.0.0/24
vlan: 40
attachConfiguration: br-osp
- name: Tenant
nameLower: tenant
vip: False
mtu: 1500
subnets:
- name: tenant
ipv4:
allocationEnd: 172.20.0.250
allocationStart: 172.20.0.10
cidr: 172.20.0.0/24
vlan: 50
attachConfiguration: br-osp
EOF
oc create --save-config -f ${BASE_DIR}/data/install/openstacknetconfig.yaml -n openstack
# oc delete -f ${BASE_DIR}/data/install/openstacknetconfig.yaml -n openstack
# oc apply -f ${BASE_DIR}/data/install/openstacknetconfig.yaml -n openstack
oc get openstacknetconfig/openstacknetconfig -n openstack
# NAME ATTACHCONFIG DESIRED ATTACHCONFIG READY NETWORKS DESIRED NETWORKS READY PHYSNETWORKS DESIRED PHYSNETWORKS READY STATUS REASON
# openstacknetconfig 2 2 6 6 1 1 Configured OpenStackNetConfig openstacknetconfig all resources configured
# oc get openstacknetattach -n openstack
oc get openstacknet -n openstack
# NAME CIDR DOMAIN MTU VLAN VIP GATEWAY ROUTES RESERVED IPS STATUS
# ctlplane 192.168.7.0/24 ctlplane.osp-demo.redhat.ren 1500 0 true 192.168.7.11 [] 0 Configured
# external 172.21.6.0/24 external.osp-demo.redhat.ren 1500 0 true 172.21.6.254 [] 0 Configured
# internalapi 172.17.0.0/24 internalapi.osp-demo.redhat.ren 1350 20 true [] 0 Configured
# storage 172.18.0.0/24 storage.osp-demo.redhat.ren 1500 30 true [] 0 Configured
# storagemgmt 172.19.0.0/24 storagemgmt.osp-demo.redhat.ren 1500 40 true [] 0 Configured
# tenant 172.20.0.0/24 tenant.osp-demo.redhat.ren 1500 50 false [] 0 Configured
oc get network-attachment-definitions -n openstack
# NAME AGE
# ctlplane 2m12s
# ctlplane-static 2m11s
# external 2m11s
# external-static 2m11s
# internalapi 2m11s
# internalapi-static 2m11s
# storage 2m11s
# storage-static 2m11s
# storagemgmt 2m10s
# storagemgmt-static 2m10s
# tenant 2m10s
# tenant-static 2m10s
oc get nncp
# NAME STATUS REASON
# br-osp Available SuccessfullyConfigured
# br-osp-ex Available SuccessfullyConfigured
deploy controller
We define a single-node controller. Once this definition is saved, openshift starts a kvm guest through cnv, using the rhel image we uploaded earlier as its operating system. After booting, it just sits there quietly as a bare operating system.
At the same time, an openstack client pod is started; all of our later day-to-day openstack operations happen inside this openstack client.
Note that the documentation has a bug here: the documented API version is v1beta2, while our image only ships v1beta1, so we need to make some small adjustments to the configuration.
In the architecture diagram, this step corresponds to this part:
# here the API version mismatches the current official document.
# we follow an older official document, which can no longer be found. :(
cat << EOF > ${BASE_DIR}/data/install/openstack-controller.yaml
apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackControlPlane
metadata:
name: overcloud
namespace: openstack
spec:
# openStackClientImageURL: registry.redhat.io/rhosp-beta/openstack-tripleoclient:16.2
openStackClientNetworks:
- ctlplane
- external
- internal_api
openStackClientStorageClass: redhat-ren-nfs
passwordSecret: userpassword
gitSecret: git-secret
virtualMachineRoles:
controller:
roleName: Controller
roleCount: 1
networks:
- ctlplane
- internal_api
- external
- tenant
- storage
- storage_mgmt
cores: 6
memory: 12
diskSize: 60
baseImageVolumeName: openstack-base-img
storageClass: redhat-ren-nfs
EOF
oc create --save-config -f ${BASE_DIR}/data/install/openstack-controller.yaml -n openstack
# oc delete -f ${BASE_DIR}/data/install/openstack-controller.yaml -n openstack
# oc apply -f ${BASE_DIR}/data/install/openstack-controller.yaml -n openstack
# here, it will take a long time, because it will clone pvc to 3 pvc
# half to 1 hour, based on your disk performance.
oc get openstackcontrolplane/overcloud -n openstack
# NAME VMSETS DESIRED VMSETS READY CLIENT READY STATUS REASON
# overcloud 1 1 true Provisioned All requested OSVMSets have been provisioned
oc get openstackcontrolplane -n openstack
# NAME VMSETS DESIRED VMSETS READY CLIENT READY STATUS REASON
# overcloud 1 1 true Provisioned All requested OSVMSets have been provisioned
oc get openstackvmsets -n openstack
# NAME DESIRED READY STATUS REASON
# controller 3 3 Provisioned All requested VirtualMachines have been provisioned
oc get virtualmachines -n openstack
# NAME AGE STATUS READY
# controller-0 107m Running True
# controller-1 107m Running True
# controller-2 107m Running True
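# optional sketch (not from the original doc): if a controller seems stuck, attach to its serial console via cnv (detach with Ctrl+])
# virtctl console -n openstack controller-0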
define openstack install script
Next, following the official documentation, we define the openstack install scripts; these scripts configure the network of the compute nodes.
The official documentation has a bug here: it does not define the StorageMgmt network, so we simply add it.
Defining the install scripts takes several steps: some are copy-pasted from the official documentation, others create templates inside the openstack client pod and export them. Just follow the steps; it is not difficult.
# on helper
mkdir -p ${BASE_DIR}/data/custom_templates
mkdir -p ${BASE_DIR}/data/custom_environment_files
/bin/rm -rf ${BASE_DIR}/data/custom_templates/*
/bin/rm -rf ${BASE_DIR}/data/custom_environment_files/*
cat << 'EOF' > ${BASE_DIR}/data/custom_templates/net-config-two-nic-vlan-computehci.yaml
heat_template_version: rocky
description: >
Software Config to drive os-net-config to configure VLANs for the Compute role.
parameters:
ControlPlaneIp:
default: ''
description: IP address/subnet on the ctlplane network
type: string
ControlPlaneSubnetCidr:
default: ''
description: >
The subnet CIDR of the control plane network. (The parameter is
automatically resolved from the ctlplane subnet's cidr attribute.)
type: string
ControlPlaneDefaultRoute:
default: ''
description: The default route of the control plane network. (The parameter
is automatically resolved from the ctlplane subnet's gateway_ip attribute.)
type: string
ControlPlaneStaticRoutes:
default: []
description: >
Routes for the ctlplane network traffic.
JSON route e.g. [{'destination':'10.0.0.0/16', 'nexthop':'10.0.0.1'}]
Unless the default is changed, the parameter is automatically resolved
from the subnet host_routes attribute.
type: json
ControlPlaneMtu:
default: 1500
description: The maximum transmission unit (MTU) size(in bytes) that is
guaranteed to pass through the data path of the segments in the network.
(The parameter is automatically resolved from the ctlplane network's mtu attribute.)
type: number
StorageIpSubnet:
default: ''
description: IP address/subnet on the storage network
type: string
StorageNetworkVlanID:
default: 30
description: Vlan ID for the storage network traffic.
type: number
StorageMtu:
default: 1500
description: The maximum transmission unit (MTU) size(in bytes) that is
guaranteed to pass through the data path of the segments in the
Storage network.
type: number
StorageInterfaceRoutes:
default: []
description: >
Routes for the storage network traffic.
JSON route e.g. [{'destination':'10.0.0.0/16', 'nexthop':'10.0.0.1'}]
Unless the default is changed, the parameter is automatically resolved
from the subnet host_routes attribute.
type: json
StorageMgmtIpSubnet:
default: ''
description: IP address/subnet on the storage_mgmt network
type: string
StorageMgmtNetworkVlanID:
default: 40
description: Vlan ID for the storage_mgmt network traffic.
type: number
StorageMgmtMtu:
default: 1500
description: The maximum transmission unit (MTU) size(in bytes) that is
guaranteed to pass through the data path of the segments in the
StorageMgmt network.
type: number
StorageMgmtInterfaceRoutes:
default: []
description: >
Routes for the storage_mgmt network traffic.
JSON route e.g. [{'destination':'10.0.0.0/16', 'nexthop':'10.0.0.1'}]
Unless the default is changed, the parameter is automatically resolved
from the subnet host_routes attribute.
type: json
InternalApiIpSubnet:
default: ''
description: IP address/subnet on the internal_api network
type: string
InternalApiNetworkVlanID:
default: 20
description: Vlan ID for the internal_api network traffic.
type: number
InternalApiMtu:
default: 1500
description: The maximum transmission unit (MTU) size(in bytes) that is
guaranteed to pass through the data path of the segments in the
InternalApi network.
type: number
InternalApiInterfaceRoutes:
default: []
description: >
Routes for the internal_api network traffic.
JSON route e.g. [{'destination':'10.0.0.0/16', 'nexthop':'10.0.0.1'}]
Unless the default is changed, the parameter is automatically resolved
from the subnet host_routes attribute.
type: json
TenantIpSubnet:
default: ''
description: IP address/subnet on the tenant network
type: string
TenantNetworkVlanID:
default: 50
description: Vlan ID for the tenant network traffic.
type: number
TenantMtu:
default: 1500
description: The maximum transmission unit (MTU) size(in bytes) that is
guaranteed to pass through the data path of the segments in the
Tenant network.
type: number
TenantInterfaceRoutes:
default: []
description: >
Routes for the tenant network traffic.
JSON route e.g. [{'destination':'10.0.0.0/16', 'nexthop':'10.0.0.1'}]
Unless the default is changed, the parameter is automatically resolved
from the subnet host_routes attribute.
type: json
ExternalMtu:
default: 1500
description: The maximum transmission unit (MTU) size(in bytes) that is
guaranteed to pass through the data path of the segments in the
External network.
type: number
DnsServers: # Override this via parameter_defaults
default: []
description: >
DNS servers to use for the Overcloud (2 max for some implementations).
If not set the nameservers configured in the ctlplane subnet's
dns_nameservers attribute will be used.
type: comma_delimited_list
DnsSearchDomains: # Override this via parameter_defaults
default: []
description: A list of DNS search domains to be added (in order) to resolv.conf.
type: comma_delimited_list
resources:
MinViableMtu:
# This resource resolves the minimum viable MTU for interfaces, bonds and
# bridges that carry multiple VLANs. Each VLAN may have different MTU. The
# bridge, bond or interface must have an MTU to allow the VLAN with the
# largest MTU.
type: OS::Heat::Value
properties:
type: number
value:
yaql:
expression: $.data.max()
data:
- {get_param: ControlPlaneMtu}
- {get_param: StorageMtu}
- {get_param: StorageMgmtMtu}
- {get_param: InternalApiMtu}
- {get_param: TenantMtu}
- {get_param: ExternalMtu}
OsNetConfigImpl:
type: OS::Heat::SoftwareConfig
properties:
group: script
config:
str_replace:
template:
get_file: /usr/share/openstack-tripleo-heat-templates/network/scripts/run-os-net-config.sh
params:
$network_config:
network_config:
- type: interface
name: nic4
mtu:
get_attr: [MinViableMtu, value]
use_dhcp: false
dns_servers:
get_param: DnsServers
domain:
get_param: DnsSearchDomains
addresses:
- ip_netmask:
list_join:
- /
- - get_param: ControlPlaneIp
- get_param: ControlPlaneSubnetCidr
routes:
list_concat_unique:
- get_param: ControlPlaneStaticRoutes
- - default: true
next_hop:
get_param: ControlPlaneDefaultRoute
- type: vlan
mtu:
get_param: StorageMtu
device: nic4
vlan_id:
get_param: StorageNetworkVlanID
addresses:
- ip_netmask:
get_param: StorageIpSubnet
routes:
list_concat_unique:
- get_param: StorageInterfaceRoutes
- type: vlan
device: nic4
mtu:
get_param: StorageMgmtMtu
vlan_id:
get_param: StorageMgmtNetworkVlanID
addresses:
- ip_netmask:
get_param: StorageMgmtIpSubnet
routes:
list_concat_unique:
- get_param: StorageMgmtInterfaceRoutes
- type: vlan
mtu:
get_param: InternalApiMtu
device: nic4
vlan_id:
get_param: InternalApiNetworkVlanID
addresses:
- ip_netmask:
get_param: InternalApiIpSubnet
routes:
list_concat_unique:
- get_param: InternalApiInterfaceRoutes
- type: ovs_bridge
# This will default to br-ex, anything else requires specific
# bridge mapping entries for it to be used.
name: bridge_name
mtu:
get_param: ExternalMtu
use_dhcp: false
members:
- type: interface
name: nic3
mtu:
get_param: ExternalMtu
use_dhcp: false
primary: true
- type: vlan
mtu:
get_param: TenantMtu
vlan_id:
get_param: TenantNetworkVlanID
addresses:
- ip_netmask:
get_param: TenantIpSubnet
routes:
list_concat_unique:
- get_param: TenantInterfaceRoutes
outputs:
OS::stack_id:
description: The OsNetConfigImpl resource.
value:
get_resource: OsNetConfigImpl
EOF
oc rsh -n openstack openstackclient
# in the shell
unset OS_CLOUD
cd /home/cloud-admin/
openstack overcloud roles generate Controller ComputeHCI > roles_data.yaml
exit
oc cp openstack/openstackclient:home/cloud-admin/roles_data.yaml ${BASE_DIR}/data/custom_templates/roles_data.yaml
cd ${BASE_DIR}/data/custom_templates
tar -cvzf custom-config.tar.gz *.yaml
oc delete configmap tripleo-tarball-config -n openstack
oc create configmap tripleo-tarball-config --from-file=custom-config.tar.gz -n openstack
oc get configmap/tripleo-tarball-config -n openstack
# NAME DATA AGE
# tripleo-tarball-config 1 7s
cat << EOF > ${BASE_DIR}/data/custom_environment_files/network-environment.yaml
resource_registry:
OS::TripleO::ComputeHCI::Net::SoftwareConfig: net-config-two-nic-vlan-computehci.yaml
# parameter_defaults:
# # self define
# NeutronBridgeMappings: datacentre:br-osp-ex
EOF
cat << EOF > ${BASE_DIR}/data/custom_environment_files/compute-hci.yaml
resource_registry:
OS::TripleO::Services::CephMgr: deployment/ceph-ansible/ceph-mgr.yaml
OS::TripleO::Services::CephMon: deployment/ceph-ansible/ceph-mon.yaml
OS::TripleO::Services::CephOSD: deployment/ceph-ansible/ceph-osd.yaml
OS::TripleO::Services::CephClient: deployment/ceph-ansible/ceph-client.yaml
parameter_defaults:
# needed for now because of the repo used to create tripleo-deploy image
CephAnsibleRepo: "rhelosp-ceph-4-tools"
CephAnsiblePlaybookVerbosity: 3
CinderEnableIscsiBackend: false
CinderEnableRbdBackend: true
CinderBackupBackend: ceph
CinderEnableNfsBackend: false
NovaEnableRbdBackend: true
GlanceBackend: rbd
CinderRbdPoolName: "volumes"
NovaRbdPoolName: "vms"
GlanceRbdPoolName: "images"
CephPoolDefaultPgNum: 32
CephPoolDefaultSize: 2
CephAnsibleDisksConfig:
devices:
- '/dev/vdb'
- '/dev/vdc'
- '/dev/vdd'
osd_scenario: lvm
osd_objectstore: bluestore
CephAnsibleExtraConfig:
is_hci: true
CephConfigOverrides:
rgw_swift_enforce_content_length: true
rgw_swift_versioning_enabled: true
EOF
oc delete configmap -n openstack heat-env-config
oc create configmap -n openstack heat-env-config --from-file=${BASE_DIR}/data/custom_environment_files/ --dry-run=client -o yaml | oc apply -f -
oc get configmap/heat-env-config -n openstack
# NAME DATA AGE
# heat-env-config 2 4s
define compute node
Next, we define the compute node. Before this, our openshift cluster already has one worker node, which is completely empty. By defining an OpenStackBaremetalSet, we invoke openshift's metal3-related functionality to re-image this worker node with the image we specify, turning it into a rhel node.
In the architecture diagram, this step corresponds to this part:
cat << EOF > ${BASE_DIR}/data/install/openstack-hcicompute.yaml
apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackBaremetalSet
metadata:
name: computehci
namespace: openstack
spec:
count: 1
baseImageUrl: http://192.168.7.11:8080/rhel-8.6-x86_64-kvm-wzh.qcow2
deploymentSSHSecret: osp-controlplane-ssh-keys
ctlplaneInterface: enp4s0
networks:
- ctlplane
- internal_api
- tenant
- storage
- storage_mgmt
roleName: ComputeHCI
passwordSecret: userpassword
EOF
oc create --save-config -f ${BASE_DIR}/data/install/openstack-hcicompute.yaml -n openstack
# oc delete -f ${BASE_DIR}/data/install/openstack-hcicompute.yaml -n openstack
# very tricky: after reading the source code, there is buggy logic that checks whether online is set to false.
# cat << EOF > ${BASE_DIR}/data/install/patch.yaml
# spec:
# online: false
# EOF
# oc patch -n openshift-machine-api BareMetalHost/ocp4-ipi-osp-worker-01 --type merge --patch-file=${BASE_DIR}/data/install/patch.yaml
# ssh into the worker-1, and add public access ip address
# so it can download ironic agent podman image
# and the ironic agent will write base image to disk
# but first, it will boot using coreos live iso
# ssh -i id_rsa core@172.22.0.199
# sudo -i
# nmcli con add ifname enp1s0 con-name enp1s0 type ethernet ipv4.method manual ipv4.addresses 192.168.7.26/24 ipv4.dns 192.168.7.11
# nmcli con up enp1s0
# /bin/qemu-img convert -O host_device -t directsync -S 0 -W /tmp/compressed-rhel-8.6-x86_64-kvm-wzh.qcow2 /dev/vda
# sgdisk -e /dev/vda
# sgdisk -Z /dev/vda3
oc describe crd openstackbaremetalset
oc get openstackbaremetalset -n openstack
# NAME DESIRED READY STATUS REASON
# computehci 1 1 Provisioned All requested BaremetalHosts have been provisioned
oc get openstackbaremetalset/computehci -n openstack
# NAME DESIRED READY STATUS REASON
# computehci 1 1 Provisioned All requested BaremetalHosts have been provisioned
oc get baremetalhosts -n openshift-machine-api
# NAME STATE CONSUMER ONLINE ERROR AGE
# ocp4-ipi-osp-master-01 externally provisioned acm-demo-one-8zwdl-master-0 true 126m
# ocp4-ipi-osp-master-02 externally provisioned acm-demo-one-8zwdl-master-1 true 126m
# ocp4-ipi-osp-master-03 externally provisioned acm-demo-one-8zwdl-master-2 true 126m
# ocp4-ipi-osp-worker-01 provisioned computehci true 54m
patch for openstack nodes
Before continuing, we need to push some configuration to the existing openstack nodes, both controller and compute. Because this is a disconnected environment, this mainly means pointing the yum repos and the container registry sources at the internal network.
###########################
# add repo for osp nodes
oc rsh -n openstack openstackclient
cd /home/cloud-admin
VAR_URL=http://192.168.7.11:5000/osp.repo
# ansible-playbook -i /home/cloud-admin/ctlplane-ansible-inventory ./rhsm.yaml
ansible -i /home/cloud-admin/ctlplane-ansible-inventory overcloud -a "sudo dnf config-manager --add-repo $VAR_URL"
ansible -i /home/cloud-admin/ctlplane-ansible-inventory overcloud -a "sudo mkdir -p /etc/cni/net.d"
scp root@192.168.7.11:/data/ocp4/image.registries.conf ./
sed -i 's/nexus.infra.redhat.ren/192.168.7.11/g' image.registries.conf
ansible -i /home/cloud-admin/ctlplane-ansible-inventory overcloud -a "sudo mkdir -p /etc/containers/registries.conf.d/"
ansible -i /home/cloud-admin/ctlplane-ansible-inventory overcloud -b -m copy -a "src=./image.registries.conf dest=/etc/containers/registries.conf.d/image.registries.conf"
cat << EOF > ./policy.json
{
"default": [
{
"type": "insecureAcceptAnything"
}
],
"transports":
{
"docker-daemon":
{
"": [{"type":"insecureAcceptAnything"}]
}
}
}
EOF
ansible -i /home/cloud-admin/ctlplane-ansible-inventory overcloud -b -m copy -a "src=./policy.json dest=/etc/containers/policy.json"
generate ansible playbooks
We have finally reached the last steps: creating the ansible playbooks used for the installation.
There is a small bug in the osp operator here. Because our git ssh port is not 22, our git uri is not of the form git@server:root/openstack.git but ssh://git@quaylab.infra.redhat.ren:10022/root/openstack.git, which breaks the operator's script parsing. There is no good fix for now; we can only go into the pod and patch the ssh config file.
cat << EOF > ${BASE_DIR}/data/install/openstack-config-generator.yaml
apiVersion: osp-director.openstack.org/v1beta1
kind: OpenStackConfigGenerator
metadata:
name: default
namespace: openstack
spec:
enableFencing: false
gitSecret: git-secret
# imageURL: registry.redhat.io/rhosp-rhel8/openstack-tripleoclient:16.2
heatEnvConfigMap: heat-env-config
# List of heat environment files to include from tripleo-heat-templates/environments
# heatEnvs:
# - ssl/tls-endpoints-public-dns.yaml
# - ssl/enable-tls.yaml
tarballConfigMap: tripleo-tarball-config
# interactive: true
EOF
oc create --save-config -f ${BASE_DIR}/data/install/openstack-config-generator.yaml -n openstack
# oc delete -f ${BASE_DIR}/data/install/openstack-config-generator.yaml -n openstack
oc get openstackconfiggenerator/default -n openstack
# NAME STATUS
# default Initializing
# fix for the ssh connection bug:
# if the git host serves ssh on a port other than 22,
# the osp script will be buggy
watch oc get pod -l job-name=generate-config-default
oc rsh $(oc get pod -o name -l job-name=generate-config-default)
# ls -la /home/cloud-admin/
cat /home/cloud-admin/.ssh/config
cat << EOF >> /home/cloud-admin/.ssh/config
Host quaylab.infra.redhat.ren
User root
IdentityFile /home/cloud-admin/.ssh/git_id_rsa
StrictHostKeyChecking no
EOF
# git clone ssh://git@quaylab.infra.redhat.ren:10022/root/openstack.git
oc get openstackconfiggenerator/default -n openstack
# NAME STATUS
# default Generating
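# optional (a generic watch, not from the original doc): wait until the status is no longer Generating before deploying
# oc get openstackconfiggenerator/default -n openstack -w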
run ansible playbooks
We will run the ansible playbooks inside the openstack client pod to deploy our overcloud. Here things differ from the official documentation again: the documentation uses OpenStackDeploy, which our operator does not have yet, so we need to run the playbooks manually.
Because our controllers are nested kvm, they run very slowly; the whole installation takes more than 2 hours in the author's home lab.
# official doc version mismatch
# there is no openstackDeploy CRD for current osd operator
# cat << EOF > ${BASE_DIR}/data/install/openstack-deployment.yaml
# apiVersion: osp-director.openstack.org/v1beta1
# kind: OpenStackDeploy
# metadata:
# name: default
# spec:
# configVersion: n54dh548h5d7h5f5h648h95h5b5h686h64bhf8h566h65fh5f7h674hdchdh59dh58hf7h667h7h57fh85h557hdh59bh5dh54ch7dh547h579hcfq
# configGenerator: default
# EOF
# oc create --save-config -f ${BASE_DIR}/data/install/openstack-deployment.yaml -n openstack
oc rsh -n openstack openstackclient
cd /home/cloud-admin
ansible -i /home/cloud-admin/ctlplane-ansible-inventory overcloud -a "sudo dnf -y install python3 lvm2"
# run ansible driven OpenStack deployment
cat << EOF >> /home/cloud-admin/.ssh/config
Host quaylab.infra.redhat.ren
User root
IdentityFile /home/cloud-admin/.ssh/git_id_rsa
StrictHostKeyChecking no
EOF
chmod 600 /home/cloud-admin/.ssh/config
# it is better to run this on the local machine through crictl exec -it **** bash
./tripleo-deploy.sh -a
/bin/cp tripleo-deploy.sh tripleo-deploy.wzh.sh
sed -i 's/ansible-playbook /ansible-playbook -v /g' tripleo-deploy.wzh.sh
./tripleo-deploy.wzh.sh -p
# because this is nested virtualization,
# it will take almost 2 hours to deploy
# debug
ssh cloud-admin@192.168.7.43
# podman pull registry.redhat.io/rhosp-rhel8/openstack-cron:16.2
# podman pull registry.redhat.io/rhosp-rhel8/openstack-ovn-controller:16.2
# podman pull registry.redhat.io/rhosp-rhel8/openstack-nova-libvirt:16.2
# podman pull registry.redhat.io/rhosp-rhel8/openstack-iscsid:16.2
# podman pull registry.redhat.io/rhosp-rhel8/openstack-nova-compute:16.2
# podman pull registry.redhat.io/rhosp-rhel8/openstack-neutron-metadata-agent-ovn:16.2
# access the webpage
oc get secret tripleo-passwords -o jsonpath='{.data.tripleo-overcloud-passwords\.yaml}' | base64 -d | grep AdminPassword
# AdminPassword: 9dhv6qr7xlsbkrzvlvvjtndks
# CephDashboardAdminPassword: 74rpbjqm8qt586v79rtcwhr2c
# CephGrafanaAdminPassword: hlg8k7m6fg2zqqvq799xpmsxx
# HeatStackDomainAdminPassword: flqfdp86gk7xf8rjh2f6nkxhl
# http://172.21.6.40/
# admin / ....
use the openstack overcloud
We now have an openstack overcloud installed. Next, let's use this overcloud and try to create a vm.
In the architecture diagram, this step corresponds to this part:
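The openstack commands below are run from inside the openstackclient pod, the same pod used earlier; a minimal reminder sketch (how the overcloud credentials are picked up, for example via OS_CLOUD or an rc file shipped in the pod, is an assumption here):
oc rsh -n openstack openstackclient
cd /home/cloud-admin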
openstack endpoint list
# +----------------------------------+-----------+--------------+----------------+---------+-----------+-----------------------------------------------+
# | ID | Region | Service Name | Service Type | Enabled | Interface | URL |
# +----------------------------------+-----------+--------------+----------------+---------+-----------+-----------------------------------------------+
# | 0841809b67a84e8a9f8b5fe1c3fe78b0 | regionOne | glance | image | True | internal | http://172.17.0.10:9292 |
# | 0fc84999e6e948cf82d2abb1ff8ffbaf | regionOne | heat | orchestration | True | public | http://172.21.6.40:8004/v1/%(tenant_id)s |
# | 1447065d224943c4a3ff886c3bb8c4b3 | regionOne | heat | orchestration | True | internal | http://172.17.0.10:8004/v1/%(tenant_id)s |
# | 1c57db71dfcf438b8607cf2549929757 | regionOne | cinderv3 | volumev3 | True | admin | http://172.17.0.10:8776/v3/%(tenant_id)s |
# | 21e24d92d592457782f7c8b39b55ab41 | regionOne | nova | compute | True | public | http://172.21.6.40:8774/v2.1 |
# | 26e93f7d318149d492268d7abbeebb8c | regionOne | placement | placement | True | public | http://172.21.6.40:8778/placement |
# | 3338679b75b949b0810c807a3dd5b175 | regionOne | heat-cfn | cloudformation | True | admin | http://172.17.0.10:8000/v1 |
# | 459fa0db22ce4edcba52309bf5157bd6 | regionOne | nova | compute | True | internal | http://172.17.0.10:8774/v2.1 |
# | 494e26a5e80d40258e07a32d7f7cd527 | regionOne | placement | placement | True | admin | http://172.17.0.10:8778/placement |
# | 4a612665e36840eeb354ebfc1540d372 | regionOne | swift | object-store | True | internal | http://172.18.0.10:8080/v1/AUTH_%(tenant_id)s |
# | 4ca6bd346e714c86934de3fd80199490 | regionOne | glance | image | True | admin | http://172.17.0.10:9292 |
# | 6d1cc6552c4f4bd99b78cfa173f4d30e | regionOne | nova | compute | True | admin | http://172.17.0.10:8774/v2.1 |
# | 73a38a88baa342f0882fe846bcf20c23 | regionOne | keystone | identity | True | internal | http://172.17.0.10:5000 |
# | 7a194566071140f1a4a88da42e131520 | regionOne | cinderv3 | volumev3 | True | internal | http://172.17.0.10:8776/v3/%(tenant_id)s |
# | 8058f1fd49b6474bab4b2c889bfb8769 | regionOne | cinderv3 | volumev3 | True | public | http://172.21.6.40:8776/v3/%(tenant_id)s |
# | 8158d87f9fb648939e70ebf84398dcb2 | regionOne | neutron | network | True | admin | http://172.17.0.10:9696 |
# | 8b6ccf76ea67428ba49431cb58a7d749 | regionOne | cinderv2 | volumev2 | True | admin | http://172.17.0.10:8776/v2/%(tenant_id)s |
# | 8e462b16b50840c380d20a04c05ef19d | regionOne | heat-cfn | cloudformation | True | internal | http://172.17.0.10:8000/v1 |
# | 96f9cc117a744e74af9a8c889bdcc294 | regionOne | neutron | network | True | internal | http://172.17.0.10:9696 |
# | 97996f0256db4ecd97d24dd09d122fed | regionOne | swift | object-store | True | admin | http://172.18.0.10:8080 |
# | a874381f22ef480eb858c2062ee0bc84 | regionOne | cinderv2 | volumev2 | True | internal | http://172.17.0.10:8776/v2/%(tenant_id)s |
# | ae9e4589fda4420ebfac1e4d385ebf39 | regionOne | heat-cfn | cloudformation | True | public | http://172.21.6.40:8000/v1 |
# | c25acb62d7fd4c0c9450c62f99e257e9 | regionOne | neutron | network | True | public | http://172.21.6.40:9696 |
# | e16700f441b64798beca1d95982743e0 | regionOne | keystone | identity | True | public | http://172.21.6.40:5000 |
# | e433ea6ae0bf4ce19b2fe5424204d35b | regionOne | heat | orchestration | True | admin | http://172.17.0.10:8004/v1/%(tenant_id)s |
# | ef151edac10b4838909009e8892fa3a4 | regionOne | placement | placement | True | internal | http://172.17.0.10:8778/placement |
# | f31465304ea249f5a4888e52269c6891 | regionOne | keystone | identity | True | admin | http://192.168.7.40:35357 |
# | f42b5746d871425db507cf32f4d7d536 | regionOne | cinderv2 | volumev2 | True | public | http://172.21.6.40:8776/v2/%(tenant_id)s |
# | fbe5ee79ad504ef995d111bb2e76a032 | regionOne | swift | object-store | True | public | http://172.21.6.40:8080/v1/AUTH_%(tenant_id)s |
# | fdefc2aa2c36430294c7c436662cfb16 | regionOne | glance | image | True | public | http://172.21.6.40:9292 |
# +----------------------------------+-----------+--------------+----------------+---------+-----------+-----------------------------------------------+
openstack flavor create --ram 512 --disk 1 --vcpu 1 --public tiny
# +----------------------------+--------------------------------------+
# | Field | Value |
# +----------------------------+--------------------------------------+
# | OS-FLV-DISABLED:disabled | False |
# | OS-FLV-EXT-DATA:ephemeral | 0 |
# | disk | 1 |
# | id | 511a9cc2-9f68-4850-a3e1-de40a68db8d7 |
# | name | tiny |
# | os-flavor-access:is_public | True |
# | properties | |
# | ram | 512 |
# | rxtx_factor | 1.0 |
# | swap | |
# | vcpus | 1 |
# +----------------------------+--------------------------------------+
#####################
# on helper
wget https://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img
oc cp /data/swap/cirros-0.4.0-x86_64-disk.img openstack/openstackclient:/home/cloud-admin/swap/
# end on helper
#####################
openstack image create cirros --container-format bare --disk-format qcow2 --public --file /home/cloud-admin/swap/cirros-0.4.0-x86_64-disk.img
# +------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | Field | Value |
# +------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | checksum | 443b7623e27ecf03dc9e01ee93f67afe |
# | container_format | bare |
# | created_at | 2022-11-18T15:38:57Z |
# | disk_format | qcow2 |
# | file | /v2/images/2d66e2af-8fcb-4d33-8232-5949787a6164/file |
# | id | 2d66e2af-8fcb-4d33-8232-5949787a6164 |
# | min_disk | 0 |
# | min_ram | 0 |
# | name | cirros |
# | owner | 3647f67bbd5844e38e13f418143c4b57 |
# | properties | direct_url='rbd://1438a42b-6d15-4143-8e73-8d9f2c9488be/images/2d66e2af-8fcb-4d33-8232-5949787a6164/snap', locations='[{'url': 'rbd://1438a42b-6d15-4143-8e73-8d9f2c9488be/images/2d66e2af-8fcb-4d33-8232-5949787a6164/snap', 'metadata': {'store': 'default_backend'}}]', os_hash_algo='sha512', os_hash_value='6513f21e44aa3da349f248188a44bc304a3653a04122d8fb4535423c8e1d14cd6a153f735bb0982e2161b5b5186106570c17a9e58b64dd39390617cd5a350f78', os_hidden='False', stores='default_backend' |
# | protected | False |
# | schema | /v2/schemas/image |
# | size | 12716032 |
# | status | active |
# | tags | |
# | updated_at | 2022-11-18T15:39:03Z |
# | virtual_size | None |
# | visibility | public |
# +------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
ssh-keygen -m PEM -t rsa -b 2048 -f ~/.ssh/id_rsa_pem
openstack keypair create --public-key ~/.ssh/id_rsa_pem.pub default
# +-------------+-------------------------------------------------+
# | Field | Value |
# +-------------+-------------------------------------------------+
# | fingerprint | 64:34:f6:33:9f:87:10:27:d6:5f:80:4c:e7:03:a7:2a |
# | name | default |
# | user_id | 0049debaf5d34a83a54486fd418b6981 |
# +-------------+-------------------------------------------------+
openstack security group create basic
# +-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | Field | Value |
# +-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | created_at | 2022-11-18T15:40:45Z |
# | description | basic |
# | id | c3f80883-589f-4cf4-b1b5-059e7966ae82 |
# | location | cloud='overcloud', project.domain_id=, project.domain_name='Default', project.id='3647f67bbd5844e38e13f418143c4b57', project.name='admin', region_name='regionOne', zone= |
# | name | basic |
# | project_id | 3647f67bbd5844e38e13f418143c4b57 |
# | revision_number | 1 |
# | rules | created_at='2022-11-18T15:40:46Z', direction='egress', ethertype='IPv4', id='184384ca-4048-4711-9419-b2a9c3c685f8', updated_at='2022-11-18T15:40:46Z' |
# | | created_at='2022-11-18T15:40:46Z', direction='egress', ethertype='IPv6', id='19cf0c67-6724-49e3-a13c-e366e662b63e', updated_at='2022-11-18T15:40:46Z' |
# | tags | [] |
# | updated_at | 2022-11-18T15:40:46Z |
# +-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
openstack security group rule create basic --protocol tcp --dst-port 22:22 --remote-ip 0.0.0.0/0
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | Field | Value |
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | created_at | 2022-11-18T15:41:24Z |
# | description | |
# | direction | ingress |
# | ether_type | IPv4 |
# | id | a44aa3ee-9fb3-4559-a655-eb90ea974cf8 |
# | location | cloud='overcloud', project.domain_id=, project.domain_name='Default', project.id='3647f67bbd5844e38e13f418143c4b57', project.name='admin', region_name='regionOne', zone= |
# | name | None |
# | port_range_max | 22 |
# | port_range_min | 22 |
# | project_id | 3647f67bbd5844e38e13f418143c4b57 |
# | protocol | tcp |
# | remote_group_id | None |
# | remote_ip_prefix | 0.0.0.0/0 |
# | revision_number | 0 |
# | security_group_id | c3f80883-589f-4cf4-b1b5-059e7966ae82 |
# | tags | [] |
# | updated_at | 2022-11-18T15:41:24Z |
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
openstack security group rule create --protocol icmp basic
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | Field | Value |
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | created_at | 2022-11-18T15:42:26Z |
# | description | |
# | direction | ingress |
# | ether_type | IPv4 |
# | id | dfe1760d-c76e-4797-9bbe-cf5cbb5a8386 |
# | location | cloud='overcloud', project.domain_id=, project.domain_name='Default', project.id='3647f67bbd5844e38e13f418143c4b57', project.name='admin', region_name='regionOne', zone= |
# | name | None |
# | port_range_max | None |
# | port_range_min | None |
# | project_id | 3647f67bbd5844e38e13f418143c4b57 |
# | protocol | icmp |
# | remote_group_id | None |
# | remote_ip_prefix | 0.0.0.0/0 |
# | revision_number | 0 |
# | security_group_id | c3f80883-589f-4cf4-b1b5-059e7966ae82 |
# | tags | [] |
# | updated_at | 2022-11-18T15:42:26Z |
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
openstack security group rule create --protocol udp --dst-port 53:53 basic
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | Field | Value |
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | created_at | 2022-11-18T15:42:58Z |
# | description | |
# | direction | ingress |
# | ether_type | IPv4 |
# | id | 339c1bfd-e812-472d-84db-77ba47425dfc |
# | location | cloud='overcloud', project.domain_id=, project.domain_name='Default', project.id='3647f67bbd5844e38e13f418143c4b57', project.name='admin', region_name='regionOne', zone= |
# | name | None |
# | port_range_max | 53 |
# | port_range_min | 53 |
# | project_id | 3647f67bbd5844e38e13f418143c4b57 |
# | protocol | udp |
# | remote_group_id | None |
# | remote_ip_prefix | 0.0.0.0/0 |
# | revision_number | 0 |
# | security_group_id | c3f80883-589f-4cf4-b1b5-059e7966ae82 |
# | tags | [] |
# | updated_at | 2022-11-18T15:42:58Z |
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
openstack network create --external --provider-physical-network datacentre --provider-network-type flat public
# +---------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | Field | Value |
# +---------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | admin_state_up | UP |
# | availability_zone_hints | |
# | availability_zones | |
# | created_at | 2022-11-18T15:47:04Z |
# | description | |
# | dns_domain | |
# | id | 38ae7119-1628-45c6-8763-24dd5eb967cc |
# | ipv4_address_scope | None |
# | ipv6_address_scope | None |
# | is_default | False |
# | is_vlan_transparent | None |
# | location | cloud='overcloud', project.domain_id=, project.domain_name='Default', project.id='3647f67bbd5844e38e13f418143c4b57', project.name='admin', region_name='regionOne', zone= |
# | mtu | 1500 |
# | name | public |
# | port_security_enabled | True |
# | project_id | 3647f67bbd5844e38e13f418143c4b57 |
# | provider:network_type | flat |
# | provider:physical_network | datacentre |
# | provider:segmentation_id | None |
# | qos_policy_id | None |
# | revision_number | 1 |
# | router:external | External |
# | segments | None |
# | shared | False |
# | status | ACTIVE |
# | subnets | |
# | tags | |
# | updated_at | 2022-11-18T15:47:06Z |
# +---------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
openstack network create --internal private
# +---------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | Field | Value |
# +---------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | admin_state_up | UP |
# | availability_zone_hints | |
# | availability_zones | |
# | created_at | 2022-11-18T15:48:08Z |
# | description | |
# | dns_domain | |
# | id | a33927cd-7983-490b-8ab1-e70887abc398 |
# | ipv4_address_scope | None |
# | ipv6_address_scope | None |
# | is_default | False |
# | is_vlan_transparent | None |
# | location | cloud='overcloud', project.domain_id=, project.domain_name='Default', project.id='3647f67bbd5844e38e13f418143c4b57', project.name='admin', region_name='regionOne', zone= |
# | mtu | 1442 |
# | name | private |
# | port_security_enabled | True |
# | project_id | 3647f67bbd5844e38e13f418143c4b57 |
# | provider:network_type | geneve |
# | provider:physical_network | None |
# | provider:segmentation_id | 44033 |
# | qos_policy_id | None |
# | revision_number | 1 |
# | router:external | Internal |
# | segments | None |
# | shared | False |
# | status | ACTIVE |
# | subnets | |
# | tags | |
# | updated_at | 2022-11-18T15:48:08Z |
# +---------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
export GATEWAY=172.21.6.254
# export STANDALONE_HOST=192.168.25.2
export PUBLIC_NETWORK_CIDR=172.21.6.0/24
export PRIVATE_NETWORK_CIDR=192.168.100.0/24
export PUBLIC_NET_START=172.21.6.70
export PUBLIC_NET_END=172.21.6.80
export DNS_SERVER=172.21.1.1
openstack subnet create public-net \
--subnet-range $PUBLIC_NETWORK_CIDR \
--no-dhcp \
--gateway $GATEWAY \
--allocation-pool start=$PUBLIC_NET_START,end=$PUBLIC_NET_END \
--network public
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | Field | Value |
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | allocation_pools | 172.21.6.70-172.21.6.80 |
# | cidr | 172.21.6.0/24 |
# | created_at | 2022-11-18T15:51:01Z |
# | description | |
# | dns_nameservers | |
# | enable_dhcp | False |
# | gateway_ip | 172.21.6.254 |
# | host_routes | |
# | id | 812aa93f-aa5b-42b5-97ef-63ae59e8c1da |
# | ip_version | 4 |
# | ipv6_address_mode | None |
# | ipv6_ra_mode | None |
# | location | cloud='overcloud', project.domain_id=, project.domain_name='Default', project.id='3647f67bbd5844e38e13f418143c4b57', project.name='admin', region_name='regionOne', zone= |
# | name | public-net |
# | network_id | 38ae7119-1628-45c6-8763-24dd5eb967cc |
# | prefix_length | None |
# | project_id | 3647f67bbd5844e38e13f418143c4b57 |
# | revision_number | 0 |
# | segment_id | None |
# | service_types | |
# | subnetpool_id | None |
# | tags | |
# | updated_at | 2022-11-18T15:51:01Z |
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
openstack subnet create private-net \
--subnet-range $PRIVATE_NETWORK_CIDR \
--network private
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | Field | Value |
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | allocation_pools | 192.168.100.2-192.168.100.254 |
# | cidr | 192.168.100.0/24 |
# | created_at | 2022-11-18T15:52:06Z |
# | description | |
# | dns_nameservers | |
# | enable_dhcp | True |
# | gateway_ip | 192.168.100.1 |
# | host_routes | |
# | id | 0a378d92-386f-437b-acf8-564595e394ba |
# | ip_version | 4 |
# | ipv6_address_mode | None |
# | ipv6_ra_mode | None |
# | location | cloud='overcloud', project.domain_id=, project.domain_name='Default', project.id='3647f67bbd5844e38e13f418143c4b57', project.name='admin', region_name='regionOne', zone= |
# | name | private-net |
# | network_id | a33927cd-7983-490b-8ab1-e70887abc398 |
# | prefix_length | None |
# | project_id | 3647f67bbd5844e38e13f418143c4b57 |
# | revision_number | 0 |
# | segment_id | None |
# | service_types | |
# | subnetpool_id | None |
# | tags | |
# | updated_at | 2022-11-18T15:52:06Z |
# +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# NOTE: In this case an IP will be automatically assigned
# from the allocation pool for the subnet.
openstack router create vrouter
# +-------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | Field | Value |
# +-------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | admin_state_up | UP |
# | availability_zone_hints | |
# | availability_zones | |
# | created_at | 2022-11-18T15:53:12Z |
# | description | |
# | external_gateway_info | null |
# | flavor_id | None |
# | id | cb1fcb45-1716-4676-8ecd-9a0ee22ce936 |
# | location | cloud='overcloud', project.domain_id=, project.domain_name='Default', project.id='3647f67bbd5844e38e13f418143c4b57', project.name='admin', region_name='regionOne', zone= |
# | name | vrouter |
# | project_id | 3647f67bbd5844e38e13f418143c4b57 |
# | revision_number | 1 |
# | routes | |
# | status | ACTIVE |
# | tags | |
# | updated_at | 2022-11-18T15:53:12Z |
# +-------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
openstack router set vrouter --external-gateway public
openstack router add subnet vrouter private-net
openstack floating ip create public
# +---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | Field | Value |
# +---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# | created_at | 2022-11-18T15:56:17Z |
# | description | |
# | dns_domain | |
# | dns_name | |
# | fixed_ip_address | None |
# | floating_ip_address | 172.21.6.79 |
# | floating_network_id | 38ae7119-1628-45c6-8763-24dd5eb967cc |
# | id | de30c4aa-3aac-4216-af83-f335aac2765e |
# | location | Munch({'cloud': 'overcloud', 'region_name': 'regionOne', 'zone': None, 'project': Munch({'id': '3647f67bbd5844e38e13f418143c4b57', 'name': 'admin', 'domain_id': None, 'domain_name': 'Default'})}) |
# | name | 172.21.6.79 |
# | port_details | None |
# | port_id | None |
# | project_id | 3647f67bbd5844e38e13f418143c4b57 |
# | qos_policy_id | None |
# | revision_number | 0 |
# | router_id | None |
# | status | DOWN |
# | subnet_id | None |
# | tags | [] |
# | updated_at | 2022-11-18T15:56:17Z |
# +---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
openstack server create --flavor tiny --image cirros --key-name default --network private --security-group basic myserver
# +-------------------------------------+-----------------------------------------------+
# | Field | Value |
# +-------------------------------------+-----------------------------------------------+
# | OS-DCF:diskConfig | MANUAL |
# | OS-EXT-AZ:availability_zone | |
# | OS-EXT-SRV-ATTR:host | None |
# | OS-EXT-SRV-ATTR:hypervisor_hostname | None |
# | OS-EXT-SRV-ATTR:instance_name | |
# | OS-EXT-STS:power_state | NOSTATE |
# | OS-EXT-STS:task_state | scheduling |
# | OS-EXT-STS:vm_state | building |
# | OS-SRV-USG:launched_at | None |
# | OS-SRV-USG:terminated_at | None |
# | accessIPv4 | |
# | accessIPv6 | |
# | addresses | |
# | adminPass | r9QVNEs5r8Ji |
# | config_drive | |
# | created | 2022-11-18T15:57:49Z |
# | flavor | tiny (511a9cc2-9f68-4850-a3e1-de40a68db8d7) |
# | hostId | |
# | id | c6488d98-bdc8-4439-a586-f74c7d31e64d |
# | image | cirros (2d66e2af-8fcb-4d33-8232-5949787a6164) |
# | key_name | default |
# | name | myserver |
# | progress | 0 |
# | project_id | 3647f67bbd5844e38e13f418143c4b57 |
# | properties | |
# | security_groups | name='c3f80883-589f-4cf4-b1b5-059e7966ae82' |
# | status | BUILD |
# | updated | 2022-11-18T15:57:50Z |
# | user_id | 0049debaf5d34a83a54486fd418b6981 |
# | volumes_attached | |
# +-------------------------------------+-----------------------------------------------+
openstack server add floating ip myserver 172.21.6.79
ssh -i ~/.ssh/id_rsa_pem cirros@172.21.6.79
ip a
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1
# link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
# inet 127.0.0.1/8 scope host lo
# valid_lft forever preferred_lft forever
# inet6 ::1/128 scope host
# valid_lft forever preferred_lft forever
# 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc pfifo_fast qlen 1000
# link/ether fa:16:3e:f5:62:0f brd ff:ff:ff:ff:ff:ff
# inet 192.168.100.108/24 brd 192.168.100.255 scope global eth0
# valid_lft forever preferred_lft forever
# inet6 fe80::f816:3eff:fef5:620f/64 scope link
# valid_lft forever preferred_lft forever
network config on computeHCI
[root@computehci-0 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:4a:71:e9 brd ff:ff:ff:ff:ff:ff
inet6 fe80::e112:e339:d036:a9ae/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:ec:40:fa brd ff:ff:ff:ff:ff:ff
inet 172.22.0.25/24 brd 172.22.0.255 scope global dynamic noprefixroute enp2s0
valid_lft 79sec preferred_lft 79sec
inet6 fe80::a494:89f6:d082:30ec/64 scope link noprefixroute
valid_lft forever preferred_lft forever
4: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000
link/ether 52:54:00:7c:23:1a brd ff:ff:ff:ff:ff:ff
inet6 fe80::5054:ff:fe7c:231a/64 scope link
valid_lft forever preferred_lft forever
5: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:36:ee:42 brd ff:ff:ff:ff:ff:ff
inet 192.168.7.43/24 brd 192.168.7.255 scope global enp4s0
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe36:ee42/64 scope link
valid_lft forever preferred_lft forever
6: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:88:1f:84 brd ff:ff:ff:ff:ff:ff
inet6 fe80::67c4:3ad8:90c6:9196/64 scope link noprefixroute
valid_lft forever preferred_lft forever
7: enp6s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:92:b5:ed brd ff:ff:ff:ff:ff:ff
inet6 fe80::6f9:254d:87cb:7f3e/64 scope link noprefixroute
valid_lft forever preferred_lft forever
8: enp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:d9:66:89 brd ff:ff:ff:ff:ff:ff
inet6 fe80::d41b:2e70:f90:39a7/64 scope link noprefixroute
valid_lft forever preferred_lft forever
9: enp8s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:d9:61:82 brd ff:ff:ff:ff:ff:ff
inet6 fe80::b779:5bdb:b449:3d9/64 scope link noprefixroute
valid_lft forever preferred_lft forever
10: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 16:c6:40:67:40:f9 brd ff:ff:ff:ff:ff:ff
11: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 52:54:00:7c:23:1a brd ff:ff:ff:ff:ff:ff
inet6 fe80::5054:ff:fe7c:231a/64 scope link
valid_lft forever preferred_lft forever
12: vlan50: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether ea:13:bc:e1:ae:8d brd ff:ff:ff:ff:ff:ff
inet 172.20.0.11/24 brd 172.20.0.255 scope global vlan50
valid_lft forever preferred_lft forever
inet6 fe80::e813:bcff:fee1:ae8d/64 scope link
valid_lft forever preferred_lft forever
13: vlan30@enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 52:54:00:36:ee:42 brd ff:ff:ff:ff:ff:ff
inet 172.18.0.12/24 brd 172.18.0.255 scope global vlan30
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe36:ee42/64 scope link
valid_lft forever preferred_lft forever
14: vlan40@enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 52:54:00:36:ee:42 brd ff:ff:ff:ff:ff:ff
inet 172.19.0.12/24 brd 172.19.0.255 scope global vlan40
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe36:ee42/64 scope link
valid_lft forever preferred_lft forever
15: vlan20@enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1350 qdisc noqueue state UP group default qlen 1000
link/ether 52:54:00:36:ee:42 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.13/24 brd 172.17.0.255 scope global vlan20
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe36:ee42/64 scope link
valid_lft forever preferred_lft forever
16: br-int: <BROADCAST,MULTICAST> mtu 1442 qdisc noop state DOWN group default qlen 1000
link/ether 3a:0d:9a:c5:a4:eb brd ff:ff:ff:ff:ff:ff
17: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
link/ether 92:b3:8a:d2:73:ac brd ff:ff:ff:ff:ff:ff
inet6 fe80::90b3:8aff:fed2:73ac/64 scope link
valid_lft forever preferred_lft forever
18: tapa46cc8cc-20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
link/ether fe:16:3e:f5:62:0f brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc16:3eff:fef5:620f/64 scope link
valid_lft forever preferred_lft forever
19: tapa33927cd-70@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovs-system state UP group default qlen 1000
link/ether 6a:49:ac:e8:bd:37 brd ff:ff:ff:ff:ff:ff link-netns ovnmeta-a33927cd-7983-490b-8ab1-e70887abc398
inet6 fe80::6849:acff:fee8:bd37/64 scope link
valid_lft forever preferred_lft forever
[root@computehci-0 ~]# ip netns exec ovnmeta-a33927cd-7983-490b-8ab1-e70887abc398 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: tapa33927cd-71@if19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether fa:16:3e:b0:e5:a8 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 169.254.169.254/16 brd 169.254.255.255 scope global tapa33927cd-71
valid_lft forever preferred_lft forever
inet 192.168.100.2/24 brd 192.168.100.255 scope global tapa33927cd-71
valid_lft forever preferred_lft forever
[root@computehci-0 ~]# ovs-vsctl show
54ff9cc9-ffe9-4a1d-9136-4a15c60f43dd
Manager "ptcp:6640:127.0.0.1"
is_connected: true
Bridge br-ex
fail_mode: standalone
Port patch-provnet-bfad145a-fb08-420c-8454-2c69a86ac674-to-br-int
Interface patch-provnet-bfad145a-fb08-420c-8454-2c69a86ac674-to-br-int
type: patch
options: {peer=patch-br-int-to-provnet-bfad145a-fb08-420c-8454-2c69a86ac674}
Port vlan50
tag: 50
Interface vlan50
type: internal
Port br-ex
Interface br-ex
type: internal
Port enp3s0
Interface enp3s0
Bridge br-int
fail_mode: secure
datapath_type: system
Port patch-br-int-to-provnet-bfad145a-fb08-420c-8454-2c69a86ac674
Interface patch-br-int-to-provnet-bfad145a-fb08-420c-8454-2c69a86ac674
type: patch
options: {peer=patch-provnet-bfad145a-fb08-420c-8454-2c69a86ac674-to-br-int}
Port tapa46cc8cc-20
Interface tapa46cc8cc-20
Port ovn-8c68c1-0
Interface ovn-8c68c1-0
type: geneve
options: {csum="true", key=flow, remote_ip="172.20.0.10"}
bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up}
Port br-int
Interface br-int
type: internal
Port tapa33927cd-70
Interface tapa33927cd-70
ovs_version: "2.15.7"
[root@computehci-0 ~]# ovs-dpctl dump-flows | grep 172.21.6.79
recirc_id(0),in_port(4),eth(src=00:0c:29:d9:f0:d9,dst=fa:16:3e:df:80:af),eth_type(0x0800),ipv4(src=172.21.6.0/255.255.255.192,dst=172.21.6.79,proto=1,ttl=64,frag=no), packets:1962, bytes:192276, used:0.645s, actions:ct(zone=1,nat),recirc(0x3)
[root@computehci-0 ~]# ovs-dpctl dump-flows
recirc_id(0),in_port(4),eth(src=02:2e:8d:00:00:06,dst=32:df:00:f0:a3:4d),eth_type(0x8100),vlan(vid=50,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:3438, bytes:412560, used:0.467s, actions:pop_vlan,5
recirc_id(0),in_port(4),eth(src=52:54:00:d9:66:89,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(src=0.0.0.0/255.0.0.0,dst=240.0.0.0/240.0.0.0,ttl=64,frag=no), packets:0, bytes:0, used:never, actions:3
recirc_id(0),in_port(4),eth(src=00:0c:29:d9:f0:d9,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=192.168.7.11,tip=192.168.7.22,op=1/0xff), packets:2111, bytes:126660, used:0.018s, actions:3
recirc_id(0),in_port(4),eth(src=90:b1:1c:44:d6:0f,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=172.21.6.103,tip=172.21.6.102,op=1/0xff,sha=90:b1:1c:44:d6:0f), packets:2280, bytes:95760, used:1.595s, actions:3
recirc_id(0x3),in_port(4),ct_state(-new+est-rel-rpl-inv+trk),ct_mark(0/0x1),eth(src=00:0c:29:d9:f0:d9,dst=fa:16:3e:df:80:af),eth_type(0x0800),ipv4(dst=192.168.100.50,proto=1,ttl=64,frag=no), packets:1988, bytes:194824, used:0.580s, actions:ct_clear,set(eth(src=fa:16:3e:89:22:a5,dst=fa:16:3e:bd:98:6b)),set(ipv4(ttl=63)),ct(zone=7),recirc(0x6)
recirc_id(0),in_port(6),eth(src=fa:16:3e:bd:98:6b,dst=fa:16:3e:89:22:a5),eth_type(0x0800),ipv4(src=192.168.100.50,dst=172.21.6.0/255.255.255.192,proto=1,frag=no), packets:1989, bytes:194922, used:0.579s, actions:ct(zone=7),recirc(0xa)
recirc_id(0),in_port(4),eth(src=52:54:00:3d:23:56,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(src=0.0.0.0/255.0.0.0,dst=240.0.0.0/240.0.0.0,ttl=64,frag=no), packets:0, bytes:0, used:never, actions:3
recirc_id(0),in_port(4),eth(src=52:54:00:aa:41:08,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=192.168.7.101,tip=192.168.7.101,op=1/0xff), packets:0, bytes:0, used:never, actions:3
recirc_id(0),in_port(4),eth(src=52:54:00:1e:bb:a4,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(src=0.0.0.0/255.0.0.0,dst=240.0.0.0/240.0.0.0,ttl=64,frag=no), packets:0, bytes:0, used:never, actions:3
recirc_id(0),tunnel(tun_id=0x0,src=172.20.0.10,dst=172.20.0.11,flags(-df+csum+key)),in_port(2),eth(),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784), packets:3437, bytes:226842, used:0.467s, actions:userspace(pid=3165666405,slow_path(bfd))
recirc_id(0),in_port(5),eth(src=32:df:00:f0:a3:4d,dst=02:2e:8d:00:00:06),eth_type(0x0800),ipv4(frag=no), packets:3458, bytes:401128, used:0.744s, actions:push_vlan(vid=50,pcp=0),4
recirc_id(0),in_port(4),eth(src=52:54:00:d9:66:89,dst=33:33:00:00:00:16),eth_type(0x86dd),ipv6(dst=ff02::16,proto=58,hlimit=1,frag=no), packets:1, bytes:90, used:5.509s, actions:3
recirc_id(0xe),in_port(6),ct_state(-new+est-rel+rpl-inv+trk),ct_mark(0/0x1),eth(src=fa:16:3e:df:80:af,dst=00:0c:29:d9:f0:d9),eth_type(0x0800),ipv4(src=128.0.0.0/192.0.0.0,dst=172.21.6.0/255.255.255.192,frag=no), packets:1988, bytes:194824, used:0.580s, actions:ct_clear,4
recirc_id(0),in_port(4),eth(src=00:17:94:73:12:8b,dst=01:00:0c:cc:cc:cc),eth_type(0/0xffff), packets:1, bytes:398, used:7.440s, actions:drop
recirc_id(0),in_port(4),eth(src=52:54:00:95:3f:da,dst=33:33:00:00:00:16),eth_type(0x86dd),ipv6(dst=ff02::16,proto=58,hlimit=1,frag=no), packets:1, bytes:150, used:3.723s, actions:3
recirc_id(0),in_port(4),eth(src=52:54:00:95:3f:da,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(src=0.0.0.0/255.0.0.0,dst=240.0.0.0/240.0.0.0,ttl=64,frag=no), packets:0, bytes:0, used:never, actions:3
recirc_id(0),in_port(4),eth(src=02:2e:8d:00:00:06,dst=32:df:00:f0:a3:4d),eth_type(0x8100),vlan(vid=50,pcp=0),encap(eth_type(0x0806)), packets:1, bytes:46, used:5.278s, actions:pop_vlan,5
recirc_id(0xa),in_port(6),ct_state(-new+est-rel+rpl-inv+trk),ct_mark(0/0x1),eth(src=fa:16:3e:bd:98:6b,dst=fa:16:3e:89:22:a5),eth_type(0x0800),ipv4(src=192.168.100.50,dst=172.21.6.11,proto=1,ttl=64,frag=no), packets:1989, bytes:194922, used:0.580s, actions:ct_clear,set(eth(src=fa:16:3e:df:80:af,dst=00:0c:29:d9:f0:d9)),set(ipv4(ttl=63)),ct(zone=1,nat),recirc(0xe)
recirc_id(0),in_port(4),eth(src=fe:54:00:7c:23:1a,dst=01:80:c2:00:00:00),eth_type(0/0xffff), packets:15467, bytes:804284, used:0.468s, actions:drop
recirc_id(0),in_port(4),eth(src=00:0c:29:d9:f0:d9,dst=fa:16:3e:df:80:af),eth_type(0x0800),ipv4(src=172.21.6.0/255.255.255.192,dst=172.21.6.79,proto=1,ttl=64,frag=no), packets:2011, bytes:197078, used:0.581s, actions:ct(zone=1,nat),recirc(0x3)
recirc_id(0x6),in_port(4),ct_state(-new+est-rel-rpl-inv+trk),ct_mark(0/0x1),eth(src=fa:16:3e:89:22:a5,dst=fa:16:3e:bd:98:6b),eth_type(0x0800),ipv4(dst=192.168.100.50,proto=1,frag=no), packets:1988, bytes:194824, used:0.581s, actions:6
recirc_id(0),in_port(4),eth(src=52:54:00:73:0f:fb,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(src=0.0.0.0/255.0.0.0,dst=240.0.0.0/240.0.0.0,ttl=64,frag=no), packets:0, bytes:0, used:never, actions:3
recirc_id(0),in_port(4),eth(src=52:54:00:3d:23:56,dst=33:33:00:00:00:16),eth_type(0x86dd),ipv6(dst=ff02::16,proto=58,hlimit=1,frag=no), packets:1, bytes:150, used:1.293s, actions:3
recirc_id(0),in_port(5),eth(src=32:df:00:f0:a3:4d,dst=02:2e:8d:00:00:06),eth_type(0x0806), packets:1, bytes:42, used:5.278s, actions:push_vlan(vid=50,pcp=0),4
recirc_id(0),in_port(4),eth(src=52:54:00:73:0f:fb,dst=33:33:00:00:00:16),eth_type(0x86dd),ipv6(dst=ff02::16,proto=58,hlimit=1,frag=no), packets:1, bytes:150, used:9.113s, actions:3
recirc_id(0),in_port(4),eth(src=00:0c:29:d9:f0:d9,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=192.168.7.11,tip=192.168.7.26,op=1/0xff), packets:2117, bytes:127020, used:0.021s, actions:3
recirc_id(0),in_port(4),eth(src=52:54:00:1e:bb:a4,dst=33:33:00:00:00:16),eth_type(0x86dd),ipv6(dst=ff02::16,proto=58,hlimit=1,frag=no), packets:1, bytes:150, used:5.710s, actions:3
network config on controller
[root@controller-0 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc mq state UP group default qlen 1000
link/ether 02:2e:8d:00:00:00 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.2/24 brd 10.0.2.255 scope global enp1s0
valid_lft forever preferred_lft forever
inet6 fe80::2e:8dff:fe00:0/64 scope link
valid_lft forever preferred_lft forever
3: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 02:2e:8d:00:00:01 brd ff:ff:ff:ff:ff:ff
inet 192.168.7.42/24 brd 192.168.7.255 scope global enp2s0
valid_lft forever preferred_lft forever
inet 192.168.7.40/32 brd 192.168.7.255 scope global enp2s0
valid_lft forever preferred_lft forever
inet6 fe80::2e:8dff:fe00:1/64 scope link
valid_lft forever preferred_lft forever
4: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
link/ether 02:2e:8d:00:00:02 brd ff:ff:ff:ff:ff:ff
inet6 fe80::2e:8dff:fe00:2/64 scope link
valid_lft forever preferred_lft forever
5: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1350 qdisc mq state UP group default qlen 1000
link/ether 02:2e:8d:00:00:03 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.12/24 brd 172.17.0.255 scope global enp4s0
valid_lft forever preferred_lft forever
inet 172.17.0.10/32 brd 172.17.0.255 scope global enp4s0
valid_lft forever preferred_lft forever
inet6 fe80::2e:8dff:fe00:3/64 scope link
valid_lft forever preferred_lft forever
6: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 02:2e:8d:00:00:04 brd ff:ff:ff:ff:ff:ff
inet 172.18.0.11/24 brd 172.18.0.255 scope global enp5s0
valid_lft forever preferred_lft forever
inet 172.18.0.10/32 brd 172.18.0.255 scope global enp5s0
valid_lft forever preferred_lft forever
inet6 fe80::2e:8dff:fe00:4/64 scope link
valid_lft forever preferred_lft forever
7: enp6s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 02:2e:8d:00:00:05 brd ff:ff:ff:ff:ff:ff
inet 172.19.0.11/24 brd 172.19.0.255 scope global enp6s0
valid_lft forever preferred_lft forever
inet 172.19.0.10/32 brd 172.19.0.255 scope global enp6s0
valid_lft forever preferred_lft forever
inet6 fe80::2e:8dff:fe00:5/64 scope link
valid_lft forever preferred_lft forever
8: enp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
link/ether 02:2e:8d:00:00:06 brd ff:ff:ff:ff:ff:ff
inet6 fe80::2e:8dff:fe00:6/64 scope link
valid_lft forever preferred_lft forever
9: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 7a:50:65:fe:ff:25 brd ff:ff:ff:ff:ff:ff
10: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 02:2e:8d:00:00:02 brd ff:ff:ff:ff:ff:ff
inet 172.21.6.42/24 brd 172.21.6.255 scope global br-ex
valid_lft forever preferred_lft forever
inet 172.21.6.40/32 brd 172.21.6.255 scope global br-ex
valid_lft forever preferred_lft forever
inet6 fe80::2e:8dff:fe00:2/64 scope link
valid_lft forever preferred_lft forever
11: br-tenant: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 02:2e:8d:00:00:06 brd ff:ff:ff:ff:ff:ff
inet 172.20.0.10/24 brd 172.20.0.255 scope global br-tenant
valid_lft forever preferred_lft forever
inet6 fe80::2e:8dff:fe00:6/64 scope link
valid_lft forever preferred_lft forever
12: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 7a:5c:8b:47:78:45 brd ff:ff:ff:ff:ff:ff
13: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
link/ether ce:f7:9c:bb:43:b7 brd ff:ff:ff:ff:ff:ff
inet6 fe80::ccf7:9cff:febb:43b7/64 scope link
valid_lft forever preferred_lft forever
[root@controller-0 ~]# ovs-vsctl show
78811126-7766-4789-98e3-e8c3cf1cff0e
Bridge br-int
fail_mode: secure
datapath_type: system
Port patch-br-int-to-provnet-bfad145a-fb08-420c-8454-2c69a86ac674
Interface patch-br-int-to-provnet-bfad145a-fb08-420c-8454-2c69a86ac674
type: patch
options: {peer=patch-provnet-bfad145a-fb08-420c-8454-2c69a86ac674-to-br-int}
Port ovn-980774-0
Interface ovn-980774-0
type: geneve
options: {csum="true", key=flow, remote_ip="172.20.0.11"}
bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up}
Port br-int
Interface br-int
type: internal
Bridge br-tenant
fail_mode: standalone
Port enp7s0
Interface enp7s0
Port br-tenant
Interface br-tenant
type: internal
Bridge br-ex
fail_mode: standalone
Port br-ex
Interface br-ex
type: internal
Port patch-provnet-bfad145a-fb08-420c-8454-2c69a86ac674-to-br-int
Interface patch-provnet-bfad145a-fb08-420c-8454-2c69a86ac674-to-br-int
type: patch
options: {peer=patch-br-int-to-provnet-bfad145a-fb08-420c-8454-2c69a86ac674}
Port enp3s0
Interface enp3s0
ovs_version: "2.15.7"
[root@controller-0 ~]# ovs-dpctl dump-flows
recirc_id(0),in_port(2),eth(src=00:0c:29:d9:f0:d9,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=192.168.7.11,tip=192.168.7.26,op=1/0xff), packets:2057, bytes:123420, used:0.758s, actions:1
recirc_id(0),in_port(2),eth(src=00:17:94:73:12:8b,dst=01:00:0c:00:00:00),eth_type(0/0xffff), packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(5),eth(src=00:17:94:73:12:8b,dst=01:00:0c:00:00:00),eth_type(0/0xffff), packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(5),eth(src=52:54:00:4a:71:e9,dst=33:33:00:00:00:16),eth_type(0x86dd),ipv6(frag=no), packets:1, bytes:90, used:9.340s, actions:6
recirc_id(0),in_port(5),eth(src=00:0c:29:d9:f0:d9,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=192.168.7.11,tip=192.168.7.22,op=1/0xff), packets:2046, bytes:122760, used:0.756s, actions:6
recirc_id(0),in_port(2),eth(src=00:17:94:73:12:8b,dst=01:00:0c:cc:cc:cc),eth_type(0/0xffff), packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(5),eth(src=fe:54:00:b9:09:0e,dst=01:80:c2:00:00:00),eth_type(0/0xffff), packets:1706, bytes:88712, used:0.181s, actions:drop
recirc_id(0),in_port(6),eth(src=02:2e:8d:00:00:06,dst=32:df:00:f0:a3:4d),eth_type(0x0806), packets:1, bytes:42, used:0.636s, actions:5
recirc_id(0),tunnel(tun_id=0x0,src=172.20.0.11,dst=172.20.0.10,flags(-df+csum+key)),in_port(4),eth(),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784), packets:3544, bytes:233904, used:0.822s, actions:userspace(pid=2361299521,slow_path(bfd))
recirc_id(0),in_port(2),eth(src=00:0c:29:d9:f0:d9,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=192.168.7.11,tip=192.168.7.22,op=1/0xff), packets:2057, bytes:123420, used:0.758s, actions:1
recirc_id(0),in_port(5),eth(src=32:df:00:f0:a3:4d,dst=02:2e:8d:00:00:06),eth_type(0x0806), packets:1, bytes:42, used:0.636s, actions:6
recirc_id(0),in_port(5),eth(src=00:17:94:73:12:8b,dst=01:00:0c:cc:cc:cc),eth_type(0/0xffff), packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(2),eth(src=fe:54:00:5c:61:1f,dst=01:80:c2:00:00:00),eth_type(0/0xffff), packets:1714, bytes:89128, used:0.181s, actions:drop
recirc_id(0),in_port(2),eth(src=52:54:00:72:47:a9,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=172.22.0.6,tip=172.22.0.240,op=1/0xff), packets:0, bytes:0, used:never, actions:1
recirc_id(0),in_port(5),eth(src=52:54:00:72:47:a9,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=172.22.0.6,tip=172.22.0.240,op=1/0xff), packets:0, bytes:0, used:never, actions:6
recirc_id(0),in_port(2),eth(src=90:b1:1c:44:d6:0f,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=172.21.6.103,tip=172.21.6.102,op=1/0xff), packets:2584, bytes:108528, used:1.372s, actions:1
recirc_id(0),in_port(5),eth(src=32:df:00:f0:a3:4d,dst=02:2e:8d:00:00:06),eth_type(0x0800),ipv4(frag=no), packets:3545, bytes:411220, used:0.822s, actions:6
recirc_id(0),in_port(5),eth(src=90:b1:1c:44:d6:0f,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=172.21.6.103,tip=172.21.6.102,op=1/0xff), packets:2571, bytes:107982, used:1.371s, actions:6
recirc_id(0),in_port(5),eth(src=00:0c:29:d9:f0:d9,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=192.168.7.11,tip=192.168.7.26,op=1/0xff), packets:2046, bytes:122760, used:0.756s, actions:6
recirc_id(0),in_port(6),eth(src=02:2e:8d:00:00:06,dst=32:df:00:f0:a3:4d),eth_type(0x0800),ipv4(frag=no), packets:3525, bytes:408900, used:0.280s, actions:5
recirc_id(0),in_port(2),eth(src=52:54:00:4a:71:e9,dst=33:33:00:00:00:16),eth_type(0x86dd),ipv6(src=fe80::/ffc0::,dst=ff02::16,proto=58,hlimit=1,frag=no),icmpv6(type=143), packets:1, bytes:90, used:9.344s, actions:1
tips
delete
oc delete -f ${BASE_DIR}/data/install/overcloud-network.yaml -n openstack
oc delete -f ${BASE_DIR}/data/install/ctlplane-network.yaml -n openstack
oc delete -f ${BASE_DIR}/data/install/openstack-userpassword.yaml -n openstack
oc delete -f ${BASE_DIR}/data/install/osp-director-operator.yaml
oc delete project openstack
fix nics
virsh list --all
# Id Name State
# -----------------------------------------
# - ocp4-acm-hub-master01 shut off
# - ocp4-acm-one-bootstrap shut off
# - ocp4-acm-one-master-01 shut off
# - ocp4-acm-one-master-02 shut off
# - ocp4-acm-one-master-03 shut off
# - ocp4-ipi-osp-master-01 shut off
# - ocp4-ipi-osp-master-02 shut off
# - ocp4-ipi-osp-master-03 shut off
# - ocp4-ipi-osp-worker-01 shut off
# - osp-17-0-all-in-one shut off
for i in {1..3}
do
for j in {1..4}
do
echo ocp4-ipi-osp-master-0$i
# virsh attach-interface --domain ocp4-ipi-osp-master-0$i --type bridge --source baremetal --model virtio
done
done
for i in 23 24 25
do
ssh root@192.168.7.$i "nmcli con del br-osp "
done
question
- network topo
- 2 hosts with ovs and vlan
end
FlexRAN 20.11 enable on ocp4, pf mode, option 7.2
This article describes how to port Intel's O-RAN solution, FlexRAN (version 20.11), to the openshift platform.
This environment runs on openshift 4.9.5, and the hardware includes an intel E810 NIC and an ACC100 accelerator card. Due to software limitations, the NIC has VF mode enabled, while the ACC100 is used in PF mode rather than VF mode. The PTP component does not use the ptp operator bundled with openshift; an upgraded custom build is used instead. The diagram below shows how the running container relates to the operators and the hardware:
The overall network architecture diagram of this experiment:
What the intel E810 NIC looks like:
What the intel ACC100 looks like:
The RU used for the experiment looks like this:
video walkthrough
For how to build the related base images, please refer to the environment development documentation.
build the application image
We have already built a base image, quay.io/nepdemo/flexran_vdu:flexran-20.11-dpdk-19.11-ocp4.9.5-ubi-8.4-core-conf. The image is large (>5G), so an image registry at the project site is very much needed. At the project site we also need to adjust the bbu application parameters, which is done by injecting a script through a config map.
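A minimal sketch of such a config map, assuming the vbbu-demo namespace created later in this article; the script name and its contents are placeholders for the real site-specific parameter tuning:
cat << 'EOF' > /data/install/vbbu-param.configmap.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: vbbu-param-script
  namespace: vbbu-demo
data:
  set_bbu_param.sh: |
    #!/bin/bash
    # placeholder: rewrite the bbu / phy config files for the local site here,
    # before the bbu application is started by the entrypoint script
    echo "patching bbu config for this site ..."
EOF
oc create -f /data/install/vbbu-param.configmap.yaml
# then mount the config map into the vbbu pod as a volume, and call the script from the entrypoint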
core configuration
The bbu application is a large dpdk application, and for dpdk applications the CPU pinning configuration is critical: a bad configuration directly causes the dpdk application to core dump, or even hangs the physical machine. Here we provide a 16-core configuration template. It uses cores 1-16, and actual testing shows its stability is acceptable.
A characteristic of the demo bbu application is that the physical layer uses 8 cores and l2/l3 use the remaining 8 cores; if these core sets overlap, the physical layer will core dump. One way to express the split is sketched below.
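A minimal sketch of that split, expressed as shell variables a startup script could consume; the variable names and the mask format are illustrative assumptions, not the real FlexRAN parameter names:
# illustrative only: the real parameter names live in the FlexRAN / bbu config files
PHY_CORES="1-8"      # physical layer (L1) pinned to cores 1-8
L2_L3_CORES="9-16"   # l2 / l3 pinned to cores 9-16, no overlap with L1
# the same split as cpu masks (bit n = core n)
PHY_MASK=0x1FE       # cores 1-8
L2_L3_MASK=0x1FE00   # cores 9-16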
build the image
The upstream image is a ubi-init image that contains systemd, with a set_ip.sh script inside and a corresponding systemd service configured. However, during the actual project we found that starting the bbu and other applications through systemd made them exit for no obvious reason, so we keep using this ubi-init image, but at startup we no longer use the default init; instead we specify a script to run.
Since we run our own script anyway, we do the environment initialization in that script and also put the bbu core-binding parameters in it. Finally, the bbu application is started from the k8s configuration, roughly as sketched after this section.
The detailed image build steps are quite tedious; see this document.
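A minimal sketch of "skip the default init and run our own script", assuming the vbbu-demo project and the svcacct-driver service account created later in this article; the entrypoint script path is a placeholder:
cat << EOF > /data/install/vbbu.pod.demo.yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: vbbu-demo-pod
  namespace: vbbu-demo
spec:
  serviceAccountName: svcacct-driver
  containers:
  - name: vbbu
    image: quay.io/nepdemo/flexran_vdu:flexran-20.11-dpdk-19.11-ocp4.9.5-ubi-8.4-core-conf
    securityContext:
      privileged: true
    # bypass the image's systemd init: run our own script, which initializes the
    # environment, applies the core-binding parameters, and then starts the bbu
    command: ["/bin/sh", "-c", "--"]
    args: ["/home/flexran/entrypoint.wzh.sh ; sleep infinity"]
EOF
# oc create -f /data/install/vbbu.pod.demo.yaml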
deploy on ocp 4.9.5
With the images ready, we start the deployment test on openshift4.
set security for temp image registry
We created a temporary image registry, so we need to push its configuration into the cluster, mainly so that the ocp cluster does not verify this new registry's certificate.
oc patch schedulers.config.openshift.io/cluster --type merge -p '{"spec":{"mastersSchedulable":false}}'
install /data/ocp4/clients/butane-amd64 /usr/local/bin/butane
cat << EOF > /data/sno/tmp.images.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: worker
name: 99-zzz-worker-temp-images
storage:
files:
- path: /etc/containers/registries.conf.d/temp.registries.conf
overwrite: true
contents:
inline: |
[[registry]]
location = "tmp-registry.ocp4.redhat.ren:5443"
insecure = true
blocked = false
mirror-by-digest-only = false
prefix = ""
EOF
butane /data/sno/tmp.images.bu > /data/sno/99-zzz-worker-temp-images.yaml
oc create -f /data/sno/99-zzz-worker-temp-images.yaml
worker-2 node, rt-kernel setting
The bbu application needs a real-time operating system, so we use openshift's performance addon operator (PAO) for this. PAO can enable the rt-kernel and also set kernel parameters; we need to configure hugepages and cpu isolation, and blacklist the in-tree e810 (ice) driver.
cat << EOF > /data/install/performance-2.yaml
---
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
name: wzh-performanceprofile-2
spec:
additionalKernelArgs:
- nmi_watchdog=0
- isolcpus=1-18
- nohz_full=1-18
- rcu_nocbs=1-18
- kthread_cpus=0,19
- irqaffinity=0,19
- iommu=pt
- intel_iommu=on
- intel_pstate=disable
# try to upgrade e810 driver
- module_name.blacklist=1
- rd.driver.blacklist=ice
# profile creator
- audit=0
- mce=off
- nmi_watchdog=0
globallyDisableIrqLoadBalancing: true
cpu:
isolated: "1-18"
reserved: "0,19"
hugepages:
defaultHugepagesSize: "1G"
pages:
- size: "1G"
count: 24
realTimeKernel:
enabled: true
numa:
topologyPolicy: "single-numa-node"
nodeSelector:
node-role.kubernetes.io/worker-rt-2: ""
machineConfigPoolSelector:
machineconfiguration.openshift.io/role: worker-rt-2
EOF
oc create --save-config -f /data/install/performance-2.yaml
# oc apply -f /data/install/performance-2.yaml
# oc delete -f /data/install/performance-2.yaml
oc label node worker-2.ocp4.redhat.ren node-role.kubernetes.io/worker-rt-2=""
intel e810 driver
The e810 driver (ice.ko) bundled with the coreos image for openshift 4.9.5 is a relatively old version that cannot support ptp, so we need to upgrade the driver. Upgrading a driver on coreos is awkward, so we create a systemd service that starts before kubelet; in that service, podman starts a privileged container and, inside the container, runs insmod ice.ko. The prerequisite, of course, is that automatic loading of the in-tree ice module is blacklisted via the kernel parameters.
cat << EOF > /data/sno/static-pod.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: worker-rt-2
name: 99-zzz-e810-dpdk-driver-static-worker-rt-2
storage:
files:
- path: /etc/modprobe.d/blacklist-ice.conf
mode: 0644
overwrite: true
contents:
inline: |
blacklist ice
systemd:
units:
- name: driver.ice.service
enabled: true
contents: |
[Unit]
Description=driver.ice service
Wants=network-online.target
After=network-online.target
[Service]
Type=oneshot
RemainAfterExit=yes
User=root
WorkingDirectory=/root/
ExecStart=podman run --privileged --rm -it quay.io/nepdemo/intel-driver:8.4-rt-1.9.7 /bin/sh -c " rmmod ice; rmmod auxiliary ; insmod /diy/auxiliary.ko; insmod /diy/ice.ko ; "
[Install]
WantedBy=multi-user.target
- name: kubelet.service
dropins:
- name: 99-after-ice.conf
contents: |
[Unit]
Requires=driver.ice.service
After=driver.ice.service
EOF
butane -d /data/install /data/sno/static-pod.bu > /data/install/99-zzz-e810-dpdk-driver-static-worker-rt-2.yaml
oc create --save-config -f /data/install/99-zzz-e810-dpdk-driver-static-worker-rt-2.yaml
# oc apply -f /data/install/99-zzz-e810-dpdk-driver-static-worker-rt-2.yaml
# oc delete -f /data/install/99-zzz-e810-dpdk-driver-static-worker-rt-2.yaml
linuxptp 3.1.1
vRAN applications, especially option 7.2 deployments, require a PTP timing solution. Physically, either a GPS grandmaster connects to a switch and the switch distributes time, or the GPS grandmaster connects directly to the NIC. On the server side, the NIC must support network time synchronization, the ptp-related services must be started on the host, and ntp must be disabled.
As the diagram above shows, ptp4l takes time from the network and writes it to the NIC clock, while phc2sys copies time from the NIC clock to the system clock. ts2phc appears to be for synchronizing local time out to other devices. The sketch below illustrates how the three daemons relate.
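A conceptual sketch of the chain, assuming the NIC interface is ens2f0 as in the ts2phc.cfg below; the flags shown are standard linuxptp options used for illustration, not the exact arguments of the deployment:
# 1. ptp4l disciplines the NIC hardware clock (PHC) from PTP packets on ens2f0
ptp4l -i ens2f0 -f /etc/ptp4l.conf -m &
# 2. phc2sys copies time from the NIC PHC into the Linux system clock
phc2sys -s ens2f0 -c CLOCK_REALTIME -w -m &
# 3. ts2phc disciplines the PHC from an external timing signal (e.g. GNSS / PPS)
ts2phc -f /etc/ts2phc.cfg -m &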
The ptp operator bundled with openshift is a relatively old version; since we need to upgrade, we build our own image and service.
build linuxptp container image
On a machine with Internet access, we build the image with linuxptp 3.1.1 and add support for injected parameters, so they can easily be adjusted at the project site.
# http://linuxptp.sourceforge.net/
# download linuxptp-3.1.1
mkdir -p /data/ptp
cd /data/ptp
wget https://nchc.dl.sourceforge.net/project/linuxptp/v3.1/linuxptp-3.1.1.tgz
tar zvxf linuxptp-3.1.1.tgz
cd linuxptp-3.1.1
make
cat << 'EOF' > ptp4l.sh
#!/bin/bash
if [ -z $DEMO_ENV_PRIO ]; then
/usr/local/sbin/ptp4l -f /etc/ptp4l.conf -m $DEMO_ENV_PTP4L_ARG
else
/usr/bin/chrt -f $DEMO_ENV_PRIO /usr/local/sbin/ptp4l -f /etc/ptp4l.conf -m $DEMO_ENV_PTP4L_ARG
fi
EOF
cat << 'EOF' > phc2sys.sh
#!/bin/bash
if [ -z $DEMO_ENV_PRIO ]; then
/usr/local/sbin/phc2sys -m -z /var/run/ptp4l -t [phc2sys] $DEMO_ENV_PHC2SYS_ARG
else
/usr/bin/chrt -f $DEMO_ENV_PRIO /usr/local/sbin/phc2sys -m -z /var/run/ptp4l -t [phc2sys] $DEMO_ENV_PHC2SYS_ARG
fi
EOF
cat << 'EOF' > ts2phc.sh
#!/bin/bash
if [ -z $DEMO_ENV_PRIO ]; then
/usr/local/sbin/ts2phc -f /etc/ts2phc.cfg -m $DEMO_ENV_TS2PHC_ARG
else
/usr/bin/chrt -f $DEMO_ENV_PRIO /usr/local/sbin/ts2phc -f /etc/ts2phc.cfg -m $DEMO_ENV_TS2PHC_ARG
fi
EOF
cat << EOF > ./ptp.dockerfile
FROM registry.access.redhat.com/ubi8/ubi:8.4
COPY hwstamp_ctl nsm phc2sys phc_ctl pmc ptp4l timemaster ts2phc incdefs.sh version.sh ptp4l.sh phc2sys.sh ts2phc.sh /usr/local/sbin/
RUN cd /usr/local/sbin/ && chmod +x hwstamp_ctl nsm phc2sys phc_ctl pmc ptp4l timemaster ts2phc incdefs.sh version.sh ptp4l.sh phc2sys.sh ts2phc.sh
EOF
podman build --squash -t quay.io/nepdemo/linuxptp:3.1.1-ubi-8.4-v06 -f ptp.dockerfile ./
podman push quay.io/nepdemo/linuxptp:3.1.1-ubi-8.4-v06
deploy linux ptp
With the image ready, we create a deployment to start ptp. Note that it contains 3 containers, plus a few configmaps that inject the configuration files. At the project site, remember to adjust the parameters in these configuration files.
oc new-project vbbu-demo
oc project vbbu-demo
export REG_TMP='tmp-registry.ocp4.redhat.ren:5443'
# kernel driver deployment
oc create serviceaccount svcacct-driver -n vbbu-demo
oc adm policy add-scc-to-user privileged -z svcacct-driver -n vbbu-demo
# oc adm policy add-scc-to-user anyuid -z mysvcacct -n vbbu-demo
# !!! remember to disable chronyd on dest host !!!
# we do not use the ptp operator, so we need to do it manually
# TODO
# https://docs.openshift.com/container-platform/4.10/scalability_and_performance/ztp-configuring-single-node-cluster-deployment-during-installation.html#sno-du-disabling-ntp_sno-du-deploying-distributed-units-manually-on-single-node-openshift
cat << 'EOF' > /data/install/ptp.chrony.conf
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: worker-rt-2
name: disable-chronyd
spec:
config:
systemd:
units:
- contents: |
[Unit]
Description=NTP client/server
Documentation=man:chronyd(8) man:chrony.conf(5)
After=ntpdate.service sntp.service ntpd.service
Conflicts=ntpd.service systemd-timesyncd.service
ConditionCapability=CAP_SYS_TIME
[Service]
Type=forking
PIDFile=/run/chrony/chronyd.pid
EnvironmentFile=-/etc/sysconfig/chronyd
ExecStart=/usr/sbin/chronyd $OPTIONS
ExecStartPost=/usr/libexec/chrony-helper update-daemon
PrivateTmp=yes
ProtectHome=yes
ProtectSystem=full
[Install]
WantedBy=multi-user.target
enabled: false
name: chronyd.service
ignition:
version: 2.2.0
EOF
oc create -f /data/install/ptp.chrony.conf
cat << EOF > /data/install/ptp4l.conf
[global]
#
# Default Data Set
#
twoStepFlag 1
slaveOnly 0
priority1 128
priority2 128
domainNumber 24
clockClass 248
clockAccuracy 0xFE
offsetScaledLogVariance 0xFFFF
free_running 0
freq_est_interval 0
#
# Port Data Set
# 16 TS a second use logSyncInterval -4
#
#logAnnounceInterval 4
logAnnounceInterval 1
logSyncInterval -4
logMinDelayReqInterval 0
logMinPdelayReqInterval 0
announceReceiptTimeout 3
syncReceiptTimeout 0
delayAsymmetry 0
fault_reset_interval 4
neighborPropDelayThresh 20000000
#
# Run time options
#
assume_two_step 0
logging_level 6
path_trace_enabled 0
follow_up_info 0
tx_timestamp_timeout 200
use_syslog 1
verbose 0
summary_interval 0
kernel_leap 1
check_fup_sync 0
#
# Servo Options
#
pi_proportional_const 0.0
pi_integral_const 0.0
pi_proportional_scale 0.0
pi_proportional_exponent -0.3
pi_proportional_norm_max 0.7
pi_integral_scale 0.0
pi_integral_exponent 0.4
pi_integral_norm_max 0.3
step_threshold 0.00000002
first_step_threshold 0.00002
max_frequency 900000000
clock_servo nullf
sanity_freq_limit 200000000
ntpshm_segment 0
#
# Transport options
#
transportSpecific 0x0
ptp_dst_mac 01:1B:19:00:00:00
p2p_dst_mac 01:80:C2:00:00:0E
udp6_scope 0x0E
uds_address /var/run/ptp4l
#
# Default interface options
#
network_transport UDPv4
#network_transport L2
delay_mechanism E2E
time_stamping hardware
delay_filter moving_median
delay_filter_length 10
egressLatency 0
ingressLatency 0
boundary_clock_jbod 0
#
# Clock description
#
productDescription ;;
revisionData ;;
manufacturerIdentity 00:00:00
userDescription ;
timeSource 0xA0
EOF
cat << EOF > /data/install/ts2phc.cfg
[global]
use_syslog 0
verbose 1
logging_level 7
ts2phc.pulsewidth 100000000
# For GNSS module
ts2phc.nmea_serialport /dev/ttyGNSS_6500_0
[ens2f0]
ts2phc.extts_polarity rising
EOF
oc delete configmap ptp-config -n vbbu-demo
oc create configmap ptp-config -n vbbu-demo --from-file=/data/install/ptp4l.conf --from-file=/data/install/ts2phc.cfg --save-config=true
# 06 for fifo
# 07 for nice
export VAR_IMAGE='quay.io/nepdemo/linuxptp:3.1.1-ubi-8.4-v06'
cat << EOF > /data/install/ptp.demo.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: nepdemo-linuxptp-daemon
labels:
app: nepdemo-linuxptp-daemon
spec:
replicas: 1
selector:
matchLabels:
app: nepdemo-linuxptp-daemon
template:
metadata:
annotations:
labels:
app: nepdemo-linuxptp-daemon
name: nepdemo-linuxptp-daemon
# namespace: openshift-ptp
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchFields:
- key: metadata.name
operator: In
values:
- worker-2.ocp4.redhat.ren
tolerations:
- key: "vbbu"
operator: "Exists"
effect: "NoSchedule"
containers:
- name: ptp4l
image: $VAR_IMAGE
command: ["/bin/sh", "-c", "--"]
args: [" /usr/local/sbin/ptp4l.sh ;"]
env:
- name: DEMO_ENV_PTP4L_ARG
value: " -i ens2f0 -2 "
- name: DEMO_ENV_PRIO
value: "65"
securityContext:
privileged: true
runAsUser: 0
volumeMounts:
- mountPath: /etc/ptp4l.conf
subPath: ptp4l.conf
name: config-volume
- mountPath: /var/run
name: socket-dir
- name: phc2sys
image: $VAR_IMAGE
imagePullPolicy: IfNotPresent
command: ["/bin/sh", "-c", "--"]
args: [" /usr/local/sbin/phc2sys.sh ;"]
env:
- name: DEMO_ENV_PHC2SYS_ARG
# value: " -s ens2f0 -O 0 -R 8 "
value: " -s ens2f0 -r -u 1 -O 0 -R 8 "
- name: DEMO_ENV_PRIO
value: "65"
securityContext:
privileged: true
runAsUser: 0
volumeMounts:
- mountPath: /etc/ptp4l.conf
subPath: ptp4l.conf
name: config-volume
- mountPath: /var/run
name: socket-dir
- name: ts2phc
image: $VAR_IMAGE
imagePullPolicy: IfNotPresent
command: ["/bin/sh", "-c", "--"]
args: [" /usr/local/sbin/ts2phc.sh ;"]
env:
- name: DEMO_ENV_TS2PHC_ARG
value: " -s generic -c ens2f0 "
- name: DEMO_ENV_PRIO
value: "65"
securityContext:
privileged: true
runAsUser: 0
volumeMounts:
- mountPath: /etc/ts2phc.cfg
subPath: ts2phc.cfg
name: config-volume
- mountPath: /var/run
name: socket-dir
- name: dev
mountPath: /dev
hostNetwork: true
# hostPID: true
serviceAccountName: svcacct-driver
volumes:
- configMap:
defaultMode: 420
name: ptp-config
name: config-volume
- name: socket-dir
emptyDir: {}
- name: dev
hostPath:
path: "/dev"
EOF
oc create --save-config -n vbbu-demo -f /data/install/ptp.demo.yaml
# oc delete -n vbbu-demo -f /data/install/ptp.demo.yaml
setup sriov operator
OpenShift ships an SR-IOV operator that officially supports the Intel E810 NIC, so we simply use it.
The environment has an Intel E810 NIC (vendor 8086, device 1593).
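Before installing the operator, it does not hurt to confirm the vendor/device ID from the node itself. A small check, assuming the PF shows up as ens2f0 on worker-2 (as in the SriovNetworkNodeState output further below):
ssh core@worker-2.ocp4.redhat.ren -- 'cat /sys/class/net/ens2f0/device/vendor /sys/class/net/ens2f0/device/device'
# 0x8086
# 0x1593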
# install sriov operator
cat << EOF > /data/install/sriov.yaml
---
apiVersion: v1
kind: Namespace
metadata:
name: openshift-sriov-network-operator
annotations:
workload.openshift.io/allowed: management
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: sriov-network-operators
namespace: openshift-sriov-network-operator
spec:
targetNamespaces:
- openshift-sriov-network-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: sriov-network-operator-subscription
namespace: openshift-sriov-network-operator
spec:
channel: "stable"
installPlanApproval: Manual
name: sriov-network-operator
source: redhat-operators
sourceNamespace: openshift-marketplace
EOF
oc create -f /data/install/sriov.yaml
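Because installPlanApproval is set to Manual above, the operator does not actually install until its InstallPlan is approved. A minimal sketch of approving it from the CLI, assuming a single pending install plan in the namespace:
oc get installplan -n openshift-sriov-network-operator
IP_NAME=$(oc get installplan -n openshift-sriov-network-operator -o jsonpath='{.items[0].metadata.name}')
oc patch installplan ${IP_NAME} -n openshift-sriov-network-operator --type merge -p '{"spec":{"approved":true}}'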
oc get SriovNetworkNodeState -n openshift-sriov-network-operator
# NAME AGE
# master-0 42m
# worker-0.ocp4.redhat.ren 42m
# worker-1 42m
# worker-2.ocp4.redhat.ren 42m
oc get SriovNetworkNodeState/worker-2.ocp4.redhat.ren -n openshift-sriov-network-operator -o yaml
# apiVersion: sriovnetwork.openshift.io/v1
# kind: SriovNetworkNodeState
# metadata:
# creationTimestamp: "2022-05-06T14:34:54Z"
# generation: 61
# name: worker-2.ocp4.redhat.ren
# namespace: openshift-sriov-network-operator
# ownerReferences:
# - apiVersion: sriovnetwork.openshift.io/v1
# blockOwnerDeletion: true
# controller: true
# kind: SriovNetworkNodePolicy
# name: default
# uid: 4eca5eea-e1e5-410f-8833-dd2de1434e53
# resourceVersion: "93262422"
# uid: 1d122c8e-b788-4f1e-a3d5-865c6230a476
# spec:
# dpConfigVersion: "93222170"
# status:
# interfaces:
# - deviceID: "1593"
# driver: ice
# linkSpeed: -1 Mb/s
# linkType: ETH
# mac: 40:a6:b7:82:0e:4c
# mtu: 1500
# name: ens2f0
# pciAddress: 0000:65:00.0
# totalvfs: 64
# vendor: "8086"
# - deviceID: "1593"
# driver: ice
# linkSpeed: -1 Mb/s
# linkType: ETH
# mac: 40:a6:b7:82:0e:4d
# mtu: 1500
# name: ens2f1
# pciAddress: 0000:65:00.1
# totalvfs: 64
# vendor: "8086"
# - deviceID: "1593"
# driver: ice
# linkSpeed: -1 Mb/s
# linkType: ETH
# mac: 40:a6:b7:82:0e:4e
# mtu: 1500
# name: ens2f2
# pciAddress: 0000:65:00.2
# totalvfs: 64
# vendor: "8086"
# - deviceID: "1593"
# driver: ice
# linkSpeed: -1 Mb/s
# linkType: ETH
# mac: 40:a6:b7:82:0e:4f
# mtu: 1500
# name: ens2f3
# pciAddress: 0000:65:00.3
# totalvfs: 64
# vendor: "8086"
# - deviceID: 37d1
# driver: i40e
# linkSpeed: 1000 Mb/s
# linkType: ETH
# mac: ac:1f:6b:ea:5b:32
# mtu: 1500
# name: eno1
# pciAddress: 0000:b5:00.0
# totalvfs: 32
# vendor: "8086"
# - deviceID: 37d1
# driver: i40e
# linkSpeed: 1000 Mb/s
# linkType: ETH
# mac: ac:1f:6b:ea:5b:33
# mtu: 1500
# name: eno2
# pciAddress: 0000:b5:00.1
# totalvfs: 32
# vendor: "8086"
# syncStatus: Succeeded
# how to use the sriov to create VF and attach to pod, depends on use case from nep demo request
# remember to active SRIOV in bios
# remember to active VT-d in bios
cat << EOF > /data/install/sriov.policy.yaml
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
name: policy-810-nic01-rt2
namespace: openshift-sriov-network-operator
spec:
resourceName: intel_810_nic01_rt2
nodeSelector:
kubernetes.io/hostname: worker-2.ocp4.redhat.ren
numVfs: 2
nicSelector:
vendor: "8086"
deviceID: "1593"
rootDevices:
- "0000:65:00.0"
# pfNames:
# - "ens2f0"
# linkType: eth
# isRdma: false
deviceType: vfio-pci
EOF
oc create -f /data/install/sriov.policy.yaml
# oc delete -f /data/install/sriov.policy.yaml
oc get sriovnetworknodestates/worker-2.ocp4.redhat.ren -n openshift-sriov-network-operator -o jsonpath='{.status.syncStatus}' && echo
# Succeeded
cat << EOF > /data/install/sriov.attach.yaml
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
name: intel-810-nic01-vf0-rt2
namespace: openshift-sriov-network-operator
spec:
resourceName: intel_810_nic01_rt2
networkNamespace: vbbu-demo
vlan: 5
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
name: intel-810-nic01-vf1-rt2
namespace: openshift-sriov-network-operator
spec:
resourceName: intel_810_nic01_rt2
networkNamespace: vbbu-demo
vlan: 5
EOF
oc create -f /data/install/sriov.attach.yaml
# oc delete -f /data/install/sriov.attach.yaml
oc get net-attach-def -n vbbu-demo
# NAME AGE
# intel-810-nic01-vf0-rt2 2m19s
# intel-810-nic01-vf1-rt2 2m19s
nepdemo license file
Put the license file into a configmap and inject it into the container.
(At the moment, though, we simply copy the license into the container during the image build step.)
# license file 加载到config map中
oc create configmap -n vbbu-demo license.for.nepdemo \
--from-file=license=./3496531EC238AD91DED6DBA5BD6B.lic
# to updated config map
oc create configmap -n vbbu-demo license.for.nepdemo --from-file=license=./3496531EC238AD91DED6DBA5BD6B.lic -o yaml --dry-run=client | oc apply -f -
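A quick check that the configmap really carries the license under the key license (the key name comes from the --from-file option above); this only prints the first bytes:
oc get configmap license.for.nepdemo -n vbbu-demo -o jsonpath='{.data.license}' | head -c 64 && echo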
create deployment for release/production
Finally we start the service. It is a DPDK application, so we set equal resource requests and limits to get dedicated (pinned) CPU cores.
oc new-project vbbu-demo
oc project vbbu-demo
# kernel driver deployment
oc create serviceaccount svcacct-driver -n vbbu-demo
oc adm policy add-scc-to-user privileged -z svcacct-driver -n vbbu-demo
# oc adm policy add-scc-to-user anyuid -z mysvcacct -n vbbu-demo
16-CPU-core config, converted automatically
The script below rewrites the vendor config files for CPU cores 1-16; the to_hex / to_dec helpers it calls are sketched right after it.
cat << 'EOF' > /data/install/bbu.core.conf.sh
#!/bin/bash
sed -i 's/<systemThread>.*</<systemThread>2, 0, 0</' /root/flexran/bin/nr5g/gnb/l1/phycfg_xran.xml
sed -i 's/<timerThread>.*</<timerThread>1, 96, 0</' /root/flexran/bin/nr5g/gnb/l1/phycfg_xran.xml
sed -i 's/<FpgaDriverCpuInfo>.*</<FpgaDriverCpuInfo>3, 96, 0</' /root/flexran/bin/nr5g/gnb/l1/phycfg_xran.xml
sed -i 's/<FrontHaulCpuInfo>.*</<FrontHaulCpuInfo>3, 96, 0</' /root/flexran/bin/nr5g/gnb/l1/phycfg_xran.xml
sed -i 's/<radioDpdkMaster>.*</<radioDpdkMaster>2, 99, 0</' /root/flexran/bin/nr5g/gnb/l1/phycfg_xran.xml
sed -i "s/<BbuPoolThreadDefault_0_63>.*</<BbuPoolThreadDefault_0_63>0x$(to_hex '10,11,12,13,14,15')</" /root/flexran/bin/nr5g/gnb/l1/phycfg_xran.xml
sed -i 's/<xRANThread>.*</<xRANThread>9, 96, 0</' /root/flexran/bin/nr5g/gnb/l1/xrancfg_sub6.xml
sed -i "s/<xRANWorker>.*</<xRANWorker>0x$(to_hex '16'), 96, 0</" /root/flexran/bin/nr5g/gnb/l1/xrancfg_sub6.xml
sed -i "s/OAM_SHARED_CORE_BITMAP=.*/OAM_SHARED_CORE_BITMAP=$(to_dec '3,4')/" /etc/BBU_cfg/cu_cfg/gNodeB_CU_Configuration.cfg
sed -i "s/L3_SHARED_CORE_BITMAP=.*/L3_SHARED_CORE_BITMAP=$(to_dec '4,5')/" /etc/BBU_cfg/cu_cfg/gNodeB_CU_Configuration.cfg
sed -i "s/PDCP_SHRED_CORE_BITMAP=.*/PDCP_SHRED_CORE_BITMAP=$(to_dec '7,8')/" /etc/BBU_cfg/cu_cfg/gNodeB_CU_Configuration.cfg
sed -i "s/RRM_SHARED_CORE_BITMAP=.*/RRM_SHARED_CORE_BITMAP=$(to_dec '1,8')/" /etc/BBU_cfg/cu_cfg/gNodeB_CU_Configuration.cfg
sed -i "s/SON_SHARED_CORE_BITMAP=.*/SON_SHARED_CORE_BITMAP=$(to_dec '1,2')/" /etc/BBU_cfg/cu_cfg/gNodeB_CU_Configuration.cfg
# https://unix.stackexchange.com/questions/487451/sed-replace-a-pattern-between-a-pattern-and-the-end-of-file
sed -i "/<oam_shm_logger_cfg>/,\$s/<cpu_bitmap>.*</<cpu_bitmap>$(to_dec '7')</" /etc/BBU_cfg/cu_cfg/Proprietary_gNodeB_CU_Data_Model.xml
sed -i "/<shm_logger_cfg>/,\$s/<cpu_bitmap>.*</<cpu_bitmap>$(to_dec '7')</" /etc/BBU_cfg/cu_cfg/Proprietary_gNodeB_CU_Data_Model.xml
sed -i '/<L3Params>/,$s/<core_no>.*</<core_no>2</' /etc/BBU_cfg/cu_cfg/Proprietary_gNodeB_CU_Data_Model.xml
sed -i '/<process_name>gnb_cu_son</,$s/<process_args>.* /<process_args>2 /' /etc/BBU_cfg/cu_cfg/Proprietary_gNodeB_CU_Data_Model.xml
sed -i '/<process_name>gnb_cu_rrm</,$s/<process_args>.* /<process_args>2 /' /etc/BBU_cfg/cu_cfg/Proprietary_gNodeB_CU_Data_Model.xml
sed -i '/<pdcp_index>0</,$s/<core_num_for_worker_thread>.*</<core_num_for_worker_thread>3</' /etc/BBU_cfg/cu_cfg/Proprietary_gNodeB_CU_Data_Model.xml
sed -i '/<pdcp_index>1</,$s/<core_num_for_worker_thread>.*</<core_num_for_worker_thread>3</' /etc/BBU_cfg/cu_cfg/Proprietary_gNodeB_CU_Data_Model.xml
sed -i '/<egtpu_instance>0</,$s/<core_num_of_worker_thread>.*</<core_num_of_worker_thread>6</' /etc/BBU_cfg/cu_cfg/Proprietary_gNodeB_CU_Data_Model.xml
sed -i '/<egtpu_instance>1</,$s/<core_num_of_worker_thread>.*</<core_num_of_worker_thread>6</' /etc/BBU_cfg/cu_cfg/Proprietary_gNodeB_CU_Data_Model.xml
sed -i '/<f1u_instance>0</,$s/<core_num_of_worker_thread>.*</<core_num_of_worker_thread>6</' /etc/BBU_cfg/cu_cfg/Proprietary_gNodeB_CU_Data_Model.xml
sed -i '/<f1u_instance>1</,$s/<core_num_of_worker_thread>.*</<core_num_of_worker_thread>6</' /etc/BBU_cfg/cu_cfg/Proprietary_gNodeB_CU_Data_Model.xml
sed -i 's/<core_num_mapping>.*</<core_num_mapping>4,4</' /etc/BBU_cfg/cu_cfg/Proprietary_gNodeB_CU_Data_Model.xml
sed -i 's/MAC_BINREAD_CORE_NUM=.*/MAC_BINREAD_CORE_NUM=1/' /etc/BBU_cfg/du_cfg/gNB_DU_Configuration.cfg
sed -i 's/RLC_BINREAD_CORE_NUM=.*/RLC_BINREAD_CORE_NUM=1/' /etc/BBU_cfg/du_cfg/gNB_DU_Configuration.cfg
sed -i 's/MAC_HP_CORE_NUM=.*/MAC_HP_CORE_NUM=5/' /etc/BBU_cfg/du_cfg/gNB_DU_Configuration.cfg
sed -i 's/RLC_MASTER_CORE_NUM=.*/RLC_MASTER_CORE_NUM=2/' /etc/BBU_cfg/du_cfg/gNB_DU_Configuration.cfg
sed -i "s/SHARED_CORE_BITMAP=.*/SHARED_CORE_BITMAP=$(to_dec '5,6')/" /etc/BBU_cfg/du_cfg/gNB_DU_Configuration.cfg
sed -i 's/RELAY_ADAPTER_RECVR_THREAD_CORE_NUM=.*/RELAY_ADAPTER_RECVR_THREAD_CORE_NUM=4/' /etc/BBU_cfg/du_cfg/gNB_DU_Configuration.cfg
sed -i '/<RlcProvsioningParams>/,$s/<CoreNumWorkerThread>.*</<CoreNumWorkerThread>7</' /etc/BBU_cfg/du_cfg/Proprietary_gNodeB_DU_Data_Model.xml
sed -i '/<RlclSystemParams>/,$s/<CoreNumWorkerThread>.*</<CoreNumWorkerThread>7</' /etc/BBU_cfg/du_cfg/Proprietary_gNodeB_DU_Data_Model.xml
sed -i '/<F1uProvisioningParams>/,$s/<numCoreWorkerThreads>.*</<numCoreWorkerThreads>7</' /etc/BBU_cfg/du_cfg/Proprietary_gNodeB_DU_Data_Model.xml
# --a=8 --t=8 --b=8
sed -i 's/\.\/gnb_cu_pdcp .* >/\.\/gnb_cu_pdcp --r=2 --a=8 --t=8 --m=2 --i=0 --b=8 --p=0 --s=50 --n=10 >/' /home/BaiBBU_XSS/BaiBBU_SXSS/gNB_app
EOF
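The script above relies on to_hex and to_dec helpers that are not part of this excerpt; they are expected to exist inside the flexran container image. A minimal sketch of what such helpers could look like (an assumption for illustration, not the vendor's implementation): to_hex turns a comma-separated core list into a hex CPU mask without the 0x prefix (the sed expressions add it), and to_dec prints the same mask as a decimal bitmap.
# hypothetical helpers matching the way bbu.core.conf.sh uses them
to_hex() {
  local mask=0 core
  for core in ${1//,/ }; do
    mask=$(( mask | (1 << core) ))   # set the bit for each listed core
  done
  printf '%x' "$mask"
}
to_dec() {
  local mask=0 core
  for core in ${1//,/ }; do
    mask=$(( mask | (1 << core) ))
  done
  printf '%d' "$mask"
}
# example: to_hex '10,11,12,13,14,15' prints fc00 ; to_dec '3,4' prints 24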
oc delete configmap vbbu-core-config -n vbbu-demo
oc create configmap vbbu-core-config -n vbbu-demo --from-file=/data/install/bbu.core.conf.sh --save-config=true
create vbbu deployment
oc adm taint nodes worker-2.ocp4.redhat.ren vbbu=realtime:NoSchedule
# oc adm taint nodes worker-2.ocp4.redhat.ren vbbu=realtime:NoExecute-
oc get nodes -o json | jq '.items[] | .metadata.name, (.spec.taints | tostring )' | paste - -
# "master-0" "null"
# "worker-1" "null"
# "worker-2.ocp4.redhat.ren" "[{\"effect\":\"NoSchedule\",\"key\":\"vbbu\",\"value\":\"realtime\"}]"
export REG_TMP='tmp-registry.ocp4.redhat.ren:5443'
# the pod with vbbu container and dev container
# later, it will change to deployment
cat << EOF > /data/install/vran.intel.flexran.yaml
---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: host-device-vbbu-demo
spec:
config: '{
"cniVersion": "0.3.1",
"type": "host-device",
"device": "ens2f2",
"ipam": {
"type": "static",
"addresses": [
{
"address": "192.168.12.20/24"
},
{
"address": "192.168.12.19/24"
}
]
}
}'
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: flexran-binary-release-deployment
labels:
app: flexran-binary-release-deployment
spec:
replicas: 1
selector:
matchLabels:
app: flexran-binary-release
template:
metadata:
labels:
app: flexran-binary-release
name: flexran-binary-release
annotations:
k8s.v1.cni.cncf.io/networks: |-
[
{
"name": "host-device-vbbu-demo"
},
{
"name": "intel-810-nic01-vf0-rt2",
"mac": "00:11:22:33:44:66"
},
{
"name": "intel-810-nic01-vf1-rt2",
"mac": "00:11:22:33:44:67"
}
]
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: "app"
operator: In
values:
- flexran-binary-release
topologyKey: "kubernetes.io/hostname"
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- worker-2.ocp4.redhat.ren
tolerations:
- key: "vbbu"
operator: "Exists"
effect: "NoSchedule"
serviceAccountName: svcacct-driver
containers:
- name: flexran-release-running
securityContext:
privileged: true
runAsUser: 0
# command: [ "/sbin/init" ]
command: [ "/bin/sh","-c","--" ]
args: [" /root/systemd/set_ip.sh ; cd /home/BaiBBU_XSS/tools/ ; ./XRAN_BBU start ; trap '{ cd /home/BaiBBU_XSS/tools/ ; ./XRAN_BBU stop ; exit 255; }' SIGINT SIGTERM ERR EXIT ; sleep infinity ; "]
tty: true
stdin: true
image: ${REG_TMP}/nepdemo/flexran_vdu:flexran-20.11-dpdk-19.11-ocp4.9.5-ubi-8.4-core-conf
imagePullPolicy: Always
resources:
requests:
cpu: 16
memory: "48Gi"
hugepages-1Gi: 24Gi
limits:
cpu: 16
memory: "48Gi"
hugepages-1Gi: 24Gi
volumeMounts:
- name: hugepage
mountPath: /hugepages
readOnly: False
- name: varrun
mountPath: /var/run/dpdk
readOnly: false
- name: lib-modules
mountPath: /lib/modules
- name: src
mountPath: /usr/src
- name: dev
mountPath: /dev
- name: cache-volume
mountPath: /dev/shm
- name: license-volume
mountPath: /nepdemo/lic
- mountPath: /root/bbu.core.conf.sh
subPath: bbu.core.conf.sh
name: vbbu-core-config-volume
volumes:
- name: hugepage
emptyDir:
medium: HugePages
- name: varrun
emptyDir: {}
- name: lib-modules
hostPath:
path: /lib/modules
- name: src
hostPath:
path: /usr/src
- name: dev
hostPath:
path: "/dev"
- name: cache-volume
emptyDir:
medium: Memory
sizeLimit: 1Gi
- name: license-volume
configMap:
name: license.for.nepdemo
items:
- key: license
path: license.lic
- name: vbbu-core-config-volume
configMap:
defaultMode: 420
name: vbbu-core-config
---
apiVersion: v1
kind: Service
metadata:
name: vbbu-http
spec:
ports:
- name: http
port: 80
targetPort: 80
nodePort: 31071
type: NodePort
selector:
app: flexran-binary-release
---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
name: vbbu-http
spec:
port:
targetPort: http
to:
kind: Service
name: vbbu-http
---
EOF
oc create -n vbbu-demo -f /data/install/vran.intel.flexran.yaml
# oc delete -n vbbu-demo -f /data/install/vran.intel.flexran.yaml
# below, used for debug
POD_ID=$(oc get pod -n vbbu-demo -o json | jq -r '.items[].metadata.name | select(. | contains("flexran-binary-release"))' )
oc rsh -c flexran-release-running ${POD_ID}
# below runs the command in the pod
bash
tail -100 /root/flexran/bin/nr5g/gnb/l1/Phy.log
# ......
# ==== l1app Time: 315002 ms NumCarrier: 1 NumBbuCores: 6. Tti2Tti Time: [ 0.00.. 0.00.. 0.00] usces
# ==== [o-du0][rx 17639351 pps 55999 kbps 1585561][tx 58086520 pps 184408 kbps 5133280] [on_time 17639351 early 0 late 0 corrupt 0 pkt_dupl 8 Total 17639351]
# Pusch[ 64000 63999 64000 64000 0 0 0 0] SRS[ 0]
# -------------------------------------------------------------------------------------------------------------------------------------------------------
# Cell DL Tput UL Tput UL BLER SRS SNR MIMO PCI
# 0 (Kbps) 1,329,880 34,314 / 36,537 0.00% 0 Db 4T4R 21
# -------------------------------------------------------------------------------------------------------------------------------------------------------
# Core Utilization [6 BBU core(s)]:
# Core Id: 10 11 12 13 14 15 Avg
# Util %: 14 20 18 17 16 21 17.67
# Xran Id: 9 16 Master Core Util: 61 %
# -------------------------------------------------------------------------------------------------------------------------------------------------------
top to show thread and core
When tuning, especially when assigning core bindings, we often need to see which threads exist, which cores they are running on, and which cores are busy, so we can rebalance. We need a handy tool for this, and fortunately top itself can do it: run top with -H to show threads, then enable the column that shows the last-used CPU of each thread.
sudo -i
# /root/.config/procps/toprc
mkdir /root/wzh
cat << 'EOF' > /root/wzh/.toprc
top's Config File (Linux processes with windows)
Id:i, Mode_altscr=0, Mode_irixps=1, Delay_time=3.0, Curwin=0
Def fieldscur=ķ&')*+,-./01258<>?ABCFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghij
winflags=193844, sortindx=18, maxtasks=0, graph_cpus=0, graph_mems=0
summclr=1, msgsclr=1, headclr=3, taskclr=1
Job fieldscur=(Ļ@<)*+,-./012568>?ABCFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghij
winflags=193844, sortindx=0, maxtasks=0, graph_cpus=0, graph_mems=0
summclr=6, msgsclr=6, headclr=7, taskclr=6
Mem fieldscur=<MBND34&'()*+,-./0125689FGHIJKLOPQRSTUVWXYZ[\]^_`abcdefghij
winflags=193844, sortindx=21, maxtasks=0, graph_cpus=0, graph_mems=0
summclr=5, msgsclr=5, headclr=4, taskclr=5
Usr fieldscur=)+,-./1234568;<=>?@ABCFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghij
winflags=193844, sortindx=3, maxtasks=0, graph_cpus=0, graph_mems=0
summclr=3, msgsclr=3, headclr=2, taskclr=3
Fixed_widest=0, Summ_mscale=1, Task_mscale=0, Zero_suppress=0
EOF
HOME="/root/wzh/" top -H
end
show netflow table in openshift 4.10
Beginning with OpenShift 4.10, an admin can configure OVS to export NetFlow records to a remote collector.
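In this note the export ends up being configured through the NetObserv FlowCollector below, but for reference OpenShift also exposes this directly on the cluster network operator. A sketch, with the collector address matching the ovs-vsctl output later in this section:
oc patch network.operator cluster --type merge -p '{"spec":{"exportNetworkFlows":{"netFlow":{"collectors":["192.168.7.13:2055"]}}}}'
# oc patch network.operator cluster --type merge -p '{"spec":{"exportNetworkFlows":null}}'   # remove it again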
install lvm operator
We need local storage, and this is single node OpenShift, so we use the LVM operator. Find the operator in OperatorHub and install it:
The LVM operator is still Tech Preview and somewhat rough, so a few fixes are needed.
oc create ns lvm-operator-system
ssh -tt core@192.168.7.13 -- lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
# sr0 11:0 1 1024M 0 rom
# vda 252:0 0 120G 0 disk
# ├─vda1 252:1 0 1M 0 part
# ├─vda2 252:2 0 127M 0 part
# ├─vda3 252:3 0 384M 0 part /boot
# └─vda4 252:4 0 119.5G 0 part /sysroot
# vdb 252:16 0 100G 0 disk
oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:lvm-operator-system:topolvm-controller -n lvm-operator-system
oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:lvm-operator-system:vg-manager -n lvm-operator-system
oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:lvm-operator-system:topolvm-node -n lvm-operator-system
cat << EOF > /data/install/lvm.op.yaml
apiVersion: lvm.topolvm.io/v1alpha1
kind: LVMCluster
metadata:
name: lvmcluster-sample
spec:
storage:
deviceClasses:
- name: vg1
# thinPoolConfig:
# name: thin-pool-1
# sizePercent: 50
# overprovisionRatio: 50
EOF
oc create -n lvm-operator-system -f /data/install/lvm.op.yaml
ssh -tt core@192.168.7.13 -- sudo pvs
# PV VG Fmt Attr PSize PFree
# /dev/vdb vg1 lvm2 a-- <100.00g <100.00g
ssh -tt core@192.168.7.13 -- sudo vgs
# VG #PV #LV #SN Attr VSize VFree
# vg1 1 0 0 wz--n- <100.00g <100.00g
oc get lvmvolumegroup vg1 -oyaml -n lvm-operator-system
# apiVersion: lvm.topolvm.io/v1alpha1
# kind: LVMVolumeGroup
# metadata:
# creationTimestamp: "2022-05-19T08:59:24Z"
# generation: 1
# name: vg1
# namespace: lvm-operator-system
# resourceVersion: "37141"
# uid: c67e2c71-06bc-42f8-be3e-18b7df220725
# spec: {}
oc get lvmvolumegroupnodestatuses.lvm.topolvm.io acm-demo-hub-master -oyaml -n lvm-operator-system
# apiVersion: lvm.topolvm.io/v1alpha1
# kind: LVMVolumeGroupNodeStatus
# metadata:
# creationTimestamp: "2022-05-19T09:02:34Z"
# generation: 1
# name: acm-demo-hub-master
# namespace: lvm-operator-system
# resourceVersion: "38271"
# uid: bc37f640-444c-4cca-bb2e-9235408b52e1
# spec:
# nodeStatus:
# - devices:
# - /dev/vdb
# name: vg1
# status: Ready
oc get storageclass
# NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
# odf-lvm-vg1 topolvm.cybozu.com Delete WaitForFirstConsumer true 17m
kubectl patch storageclass odf-lvm-vg1 -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
cat << EOF > /data/install/lvm.op.pvc.sample.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: lvm-file-pvc
spec:
volumeMode: Filesystem
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
storageClassName: odf-lvm-vg1
EOF
oc create -f /data/install/lvm.op.pvc.sample.yaml -n default
cat <<EOF > /data/install/lvm.op.app.sample.yaml
apiVersion: v1
kind: Pod
metadata:
name: app-file
spec:
containers:
- name: app-file
image: registry.access.redhat.com/ubi8/ubi:8.4
imagePullPolicy: IfNotPresent
command: ["/usr/bin/bash", "-c", "/usr/bin/tail -f /dev/null"]
volumeMounts:
- mountPath: "/mnt/file"
name: lvm-file-pvc
volumes:
- name: lvm-file-pvc
persistentVolumeClaim:
claimName: lvm-file-pvc
EOF
oc create -f /data/install/lvm.op.app.sample.yaml -n default
ssh -tt core@192.168.7.13 -- sudo lvs
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
# 34f10bb3-ebd0-4eab-acc9-41b68de832d0 vg1 -wi-ao---- 5.00g
install NetObserv Operator
install loki
Following the NetObserv operator's installation guide, you can install a simplified version of Loki.
# install Loki
kubectl create namespace network-observability
# oc delete ns network-observability
wget https://raw.githubusercontent.com/netobserv/documents/main/examples/zero-click-loki/1-storage.yaml
wget https://raw.githubusercontent.com/netobserv/documents/main/examples/zero-click-loki/2-loki.yaml
kubectl apply -f /data/install/1-storage.yaml -n network-observability
kubectl apply -f /data/install/2-loki.yaml -n network-observability
# oc delete -f /data/install/2-loki.yaml -n network-observability
# oc delete -f /data/install/1-storage.yaml -n network-observability
install NetObserv Operator
find the netobserv operator from operator hub, and install:
create flow collector with default config:
# check the result
for pod in $(oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-node -o jsonpath='{range@.items[*]}{.metadata.name}{"\n"}{end}'); do echo; echo $pod; oc -n openshift-ovn-kubernetes exec -c ovnkube-node $pod \
-- bash -c 'for type in ipfix sflow netflow ; do ovs-vsctl find $type ; done'; done
# ovnkube-node-988rk
# _uuid : 6a6c11b7-157c-4cce-be66-9bafec4627de
# cache_active_timeout: 60
# cache_max_flows : 100
# external_ids : {}
# obs_domain_id : []
# obs_point_id : []
# other_config : {}
# sampling : 400
# targets : ["192.168.7.13:2055"]
install grafana
select grafana community operator
create grafana instance with default setting
# create a route by yourself
oc expose service/grafana-service -n network-observability
oc get route -n network-observability
# NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
# grafana-service grafana-service-network-observability.apps.acm-demo-hub.redhat.ren grafana-service grafana None
# get username and password of the grafana
oc get secret/grafana-admin-credentials -n network-observability -o json | jq -r .data.GF_SECURITY_ADMIN_USER | base64 -d && echo
# admin
oc get secret/grafana-admin-credentials -n network-observability -o json | jq -r .data.GF_SECURITY_ADMIN_PASSWORD | base64 -d && echo
# ggQhu8PwVS0poQ==
# create a grafana and import dashboards
# https://github.com/netobserv/network-observability-operator/blob/release-4.10/config/samples/dashboards/Network%20Observability.json
import dashboards from :
- https://github.com/netobserv/network-observability-operator/blob/release-4.10/config/samples/dashboards/Network%20Observability.json
create loki datasource:
then the result:
from openshift console
end
install loki operator
FlexRAN 20.11 enable on ocp4
This section describes how to port Intel's O-RAN solution, FlexRAN, onto the OpenShift platform.
Container image build and runtime architecture, and the file/directory layout:
How the running container relates to the operators and the hardware:
Open questions
- The PTP service is configured, but how does the vBBU consume it?
prepare public cloud env
We first build the images in an internet-connected environment and push them to quay.io.
basic init setup
# vultr, ssh enhance
# disable user/passwd login
# ChallengeResponseAuthentication no
# PasswordAuthentication no
# UsePAM no
sed -i 's/PasswordAuthentication yes/PasswordAuthentication no/g' /etc/ssh/sshd_config
sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
systemctl restart sshd
ssh root@v.redhat.ren -o PubkeyAuthentication=no
# root@v.redhat.ren: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
subscription-manager register --auto-attach --username ******** --password ********
subscription-manager release --list
subscription-manager release --set=8.4
subscription-manager repos \
--enable="codeready-builder-for-rhel-8-x86_64-rpms"
dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
dnf install -y byobu htop fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
# [recidive]
# enabled = true
EOF
systemctl enable --now fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
[recidive]
enabled = true
EOF
systemctl restart fail2ban
# byobu
dnf update -y
reboot
install ocp rhcos rt kernel
mkdir -p /data/ostree
export BUILDNUMBER=4.9.5
wget -O openshift-client-linux-${BUILDNUMBER}.tar.gz https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-client-linux-${BUILDNUMBER}.tar.gz
wget -O openshift-install-linux-${BUILDNUMBER}.tar.gz https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-install-linux-${BUILDNUMBER}.tar.gz
tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/sbin/
tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/sbin/
oc image extract --path /:/data/ostree --registry-config /data/pull-secret.json ` curl -s https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/$BUILDNUMBER/release.txt | grep machine-os-content | awk '{print $2}' `
mkdir -p /data/dnf
mv /data/ostree/extensions /data/dnf/
rm -rf /data/ostree
mkdir -p /etc/yum.repos.d
cat > /etc/yum.repos.d/rt.repo << 'EOF'
[rt]
name=rt
baseurl=file:///data/dnf/extensions
gpgcheck=0
EOF
dnf install -y kernel-rt-core kernel-rt-devel kernel-rt-modules kernel-rt-modules-extra kernel-headers libhugetlbfs-devel zlib-devel numactl-devel cmake gcc gcc-c++
reboot
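After the reboot, a quick check that the realtime kernel is actually the running kernel:
uname -r
# expect an -rt kernel, something like 4.18.0-305.*.rt7.*.el8_4.x86_64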
build flexran with intel icc/icx
dnf groupinstall -y 'Development Tools'
dnf install -y cmake
# flexran install on host
# yum install centos-release-scl devtoolset-8 -y
# install intel icc icx
cd /data/down
tar zvxf system_studio_2019_update_3_ultimate_edition_offline.tar.gz
cd /data/down/system_studio_2019_update_3_ultimate_edition_offline
cat > s.cfg << 'EOF'
ACCEPT_EULA=accept
CONTINUE_WITH_OPTIONAL_ERROR=yes
PSET_INSTALL_DIR=/opt/intel
CONTINUE_WITH_INSTALLDIR_OVERWRITE=yes
COMPONENTS=ALL
PSET_MODE=install
ACTIVATION_SERIAL_NUMBER=******************
ACTIVATION_TYPE=serial_number
EOF
./install.sh -s s.cfg
echo "source /opt/intel/system_studio_2019/bin/compilervars.sh intel64" >> /root/.bashrc
cd /data/down/
# wget https://registrationcenter-download.intel.com/akdlm/irc_nas/18236/l_BaseKit_p_2021.4.0.3422_offline.sh
bash l_BaseKit_p_2021.4.0.3422_offline.sh
# source /opt/intel/oneapi/setvars.sh
echo "source /opt/intel/oneapi/setvars.sh" >> /root/.bashrc
download dpdk and patch, and compile flexran sdk
cd /data/down/
# wget https://fast.dpdk.org/rel/dpdk-19.11.tar.xz
tar xf dpdk-19.11.tar.xz
rm -rf /opt/dpdk-19.11
mv /data/down/dpdk-19.11 /opt
export RTE_SDK=/opt/dpdk-19.11
cd $RTE_SDK
patch -p1 < /data/down/dpdk_19.11_20.11.7.patch
# patch flexran
pip3 install meson ninja
# dnf install -y ninja-build
# dnf install -y cmake
rm -rf /data/flexran/
mkdir -p /data/flexran/
cd /data/down
tar zvxf FlexRAN-20.11.tar.gz -C /data/flexran/
export RTE_SDK=/opt/dpdk-19.11
cd /data/flexran
./extract.sh
cd /data/flexran
source set_env_var.sh -d
# for intel: /opt/intel/system_studio_2019/
# for dpdk: /opt/dpdk-19.11
# sourcing /opt/intel/system_studio_2019//bin/iccvars.sh intel64 -platform linux
# Set RTE_SDK=/opt/dpdk-19.11
# Set RTE_TARGET=x86_64-native-linuxapp-icc
# ====================================================================================
# Environment Variables:
# ====================================================================================
# RTE_SDK=/opt/dpdk-19.11
# RTE_TARGET=x86_64-native-linuxapp-icc
# WIRELESS_SDK_TARGET_ISA=avx512
# RPE_DIR=/data/flexran/libs/ferrybridge
# CPA_DIR=/data/flexran/libs/cpa
# ROE_DIR=/data/flexran/libs/roe
# XRAN_DIR=/data/flexran/xran
# DIR_WIRELESS_SDK_ROOT=/data/flexran/sdk
# SDK_BUILD=build-avx512-icc
# DIR_WIRELESS_SDK=/data/flexran/sdk/build-avx512-icc
# FLEXRAN_SDK=/data/flexran/sdk/build-avx512-icc/install
# DIR_WIRELESS_FW=/data/flexran/framework
# DIR_WIRELESS_TEST_4G=/data/flexran/tests/lte
# DIR_WIRELESS_TEST_5G=/data/flexran/tests/nr5g
# DIR_WIRELESS_TABLE_5G=/data/flexran/bin/nr5g/gnb/l1/table
# ====================================================================================
./flexran_build.sh -e -r 5gnr_sub6 -i avx512 -m sdk
# https://www.i4k.xyz/article/qq_40982287/119571504
sed -i "s/.ndo_tx_timeout = kni_net_tx_timeout,/\/\/.ndo_tx_timeout = kni_net_tx_timeout,/g" /opt/dpdk-19.11/kernel/linux/kni/kni_net.c
sed -i 's/DEFAULT_PATH=.*/DEFAULT_PATH=\/opt\/intel\/system_studio_2019\/bin\/iccvars.sh/' /opt/dpdk-19.11/usertools/dpdk-setup.sh
sed -i 's/CONFIG_RTE_BBDEV_SDK_AVX2=.*/CONFIG_RTE_BBDEV_SDK_AVX2=y/' /opt/dpdk-19.11/config/common_base
sed -i 's/CONFIG_RTE_BBDEV_SDK_AVX512=.*/CONFIG_RTE_BBDEV_SDK_AVX512=y/' /opt/dpdk-19.11/config/common_base
# DEFAULT_PATH=/opt/intel/system_studio_2019/bin/iccvars.sh
# sed -i 's/CONFIG_RTE_BUILD_SHARED_LIB=.*/CONFIG_RTE_BUILD_SHARED_LIB=y/' /opt/dpdk-19.11/config/common_base
sed -i 's/MODULE_CFLAGS += -Wall -Werror/#MODULE_CFLAGS += -Wall -Werror/' /opt/dpdk-19.11/kernel/linux/kni/Makefile
cd /opt/dpdk-19.11/usertools/
./dpdk-setup.sh
# 39
# 62
sed -i 's/#include <linux\/bootmem.h>/\/\/#include <linux\/bootmem.h>/' /data/flexran/libs/cpa/sub6/rec/drv/src/nr_dev.c
# export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/flexran/wls_mod/lib
# export CC=icc
# export DEV_OPT=" -Wl,--exclude-libs,/usr/lib64/libmvec_nonshared.a "
# export LDFLAGS=" -Wl,--exclude-libs,/usr/lib64/libmvec_nonshared.a "
# export RTE_LIBS=" -Wl,--exclude-libs,/usr/lib64/libmvec_nonshared.a "
# -Wl,--exclude-libs=libmvec_nonshared.a
# -Wl,--allow-multiple-definition
sed -i 's/@$(LD) -o $@ $(LD_FLAGS) -Wl,-L $(BUILDDIR) $(INC_LIBS) -lm -lrt -lpthread/@$(LD) -o $@ $(LD_FLAGS) -Wl,-L $(BUILDDIR) $(INC_LIBS) -lm -lrt -lpthread -Wl,--allow-multiple-definition/' /data/flexran/build/nr5g/gnb/l1app/makefile_phy
sed -i 's/@$(LD) -o $@ $(LD_FLAGS) -Wl,-L $(BUILDDIR) $(INC_LIBS) -lm -lrt -lpthread/@$(LD) -o $@ $(LD_FLAGS) -Wl,-L $(BUILDDIR) $(INC_LIBS) -lm -lrt -lpthread -Wl,--allow-multiple-definition -Wl,-lrte_port -Wl,-lrte_cryptodev -Wl,-lrte_eventdev/' /data/flexran/build/nr5g/gnb/testapp/linux/makefile_phy
sed -i 's/@$(LD) -o $@ $(LD_FLAGS) -Wl,-L $(BUILDDIR) $(INC_LIBS) -lm -lrt -lpthread/@$(LD) -o $@ $(LD_FLAGS) -Wl,-L $(BUILDDIR) $(INC_LIBS) -lm -lrt -lpthread -Wl,--allow-multiple-definition/' /data/flexran/build/lte/l1app_nbiot/makefile
sed -i 's/@$(LD) -o $@ $(LD_FLAGS) -Wl,-L $(BUILDDIR) $(INC_LIBS) -lm -lrt -lpthread/@$(LD) -o $@ $(LD_FLAGS) -Wl,-L $(BUILDDIR) $(INC_LIBS) -lm -lrt -lpthread -Wl,--allow-multiple-definition/' /data/flexran/build/lte/bbdevapp/Makefile
sed -i 's/@$(LD) -o $@ $(LD_FLAGS) -Wl,-L $(BUILDDIR) $(INC_LIBS) -lm -lrt -lpthread/@$(LD) -o $@ $(LD_FLAGS) -Wl,-L $(BUILDDIR) $(INC_LIBS) -lm -lrt -lpthread -Wl,--allow-multiple-definition/' /data/flexran/build/lte/l1app/makefile
sed -i 's/@$(LD) -o $@ $(LD_FLAGS) -Wl,-L $(BUILDDIR) $(INC_LIBS) -lm -lrt -lpthread/@$(LD) -o $@ $(LD_FLAGS) -Wl,-L $(BUILDDIR) $(INC_LIBS) -lm -lrt -lpthread -Wl,--allow-multiple-definition/' /data/flexran/build/nr5g/gnb/bbdevapp/Makefile
sed -i 's/@$(CC) -o $(APP) $(OBJS) $(RTE_LIBS) $(LDFLAGS)/@$(CC) -o $(APP) $(OBJS) $(RTE_LIBS) $(LDFLAGS) -Wl,--allow-multiple-definition/' /data/flexran/build/nr5g/gnb/testmac/makefile
sed -i 's/@$(CC) -o $(APP) $(OBJS) $(RTE_LIBS) $(LDFLAGS)/@$(CC) -o $(APP) $(OBJS) $(RTE_LIBS) $(LDFLAGS) -Wl,--allow-multiple-definition/' /data/flexran/build/lte/l1app_nbiot/makefile
# -Wl,-lrte_port -Wl,-lrte_cryptodev -Wl,-lrte_eventdev
# build/nr5g/gnb/testapp/linux/makefile_phy:540
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/flexran/wls_mod/lib
cd /data/flexran
./flexran_build.sh -e -r 5gnr_sub6 -i avx512 -b
# dnf install -y podman-docker
# export RTE_SDK=/opt/dpdk-19.11
# cd /data/flexran
# bash ./flexran_build_dockerfile.sh -v -e -i avx512 -r 5gnr_sub6 -b -m all
# podman image ls
# # REPOSITORY TAG IMAGE ID CREATED SIZE
# # flexran.docker.registry/flexran_vdu latest 8c5460a697e6 16 minutes ago 1.36 GB
# # quay.io/centos/centos 7.9.2009 8652b9f0cb4c 17 months ago 212 MB
# podman tag flexran.docker.registry/flexran_vdu:latest quay.io/nepdemo/flexran_vdu:flexran-20.11-dpdk-20.11.3-ocp4.9.5-centos-7.9
# podman push quay.io/nepdemo/flexran_vdu:flexran-20.11-dpdk-20.11.3-ocp4.9.5-centos-7.9
vsftpd
We need a local FTP server to host the rt-kernel repo; the container image builds later on need access to this temporary repo.
dnf install -y vsftpd
sed -i 's/anonymous_enable=NO/anonymous_enable=YES/g' /etc/vsftpd/vsftpd.conf
systemctl disable --now firewalld
systemctl enable --now vsftpd
mkdir -p /var/ftp/dnf
mount --bind /data/dnf /var/ftp/dnf
chcon -R -t public_content_t /var/ftp/dnf
find /data/dnf/extensions -type f -exec chmod 644 {} \;
chmod +x /etc/rc.d/rc.local
cat << EOF >>/etc/rc.d/rc.local
iptables -A INPUT -d 10.88.0.1 -j ACCEPT
iptables -A INPUT -p tcp --dport 21 -j REJECT
EOF
systemctl enable --now rc-local
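A quick sanity check that the repo is reachable over FTP at the address the container builds will use (10.88.0.1 is podman's default bridge address, as used in the dockerfile below):
curl -s ftp://10.88.0.1/dnf/extensions/ | head
# the directory listing should include repodata/, which means the builds can consume the repo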
flexran_vdu for rhel8.4
dnf install -y podman-docker
export RTE_SDK=/opt/dpdk-19.11
cd /data/flexran
bash ./flexran_build_dockerfile.wzh.sh -v -e -i avx512 -r 5gnr_sub6 -b -m all
podman tag flexran.docker.registry/flexran_vdu:latest quay.io/nepdemo/flexran_vdu:flexran-20.11-dpdk-19.11-ocp4.9.5-ubi-8.4
podman push quay.io/nepdemo/flexran_vdu:flexran-20.11-dpdk-19.11-ocp4.9.5-ubi-8.4
copy flexran sdk to image
cat << 'EOF' > /data/flexran.sdk.dockerfile
FROM registry.access.redhat.com/ubi8/ubi:8.4
RUN dnf repolist
RUN sed -i 's|enabled=1|enabled=0|g' /etc/yum/pluginconf.d/subscription-manager.conf
RUN sed -i 's|$releasever|8.4|g' /etc/yum.repos.d/redhat.repo
RUN sed -i '/codeready-builder-for-rhel-8-x86_64-rpms/,/\[/ s/enabled = 0/enabled = 1/' /etc/yum.repos.d/redhat.repo
RUN mv -f /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo.bak
RUN dnf -y update
RUN dnf -y install rsync
COPY flexran /data/flexran
EOF
cd /data
podman build --squash -t quay.io/nepdemo/flexran_basekit:flexran-sdk-20.11-ocp-4.9.5-ubi-8.4 -f flexran.sdk.dockerfile ./
podman push quay.io/nepdemo/flexran_basekit:flexran-sdk-20.11-ocp-4.9.5-ubi-8.4
copy intel icc to image
cat << 'EOF' > /opt/intel/flexran.intel.icc.dockerfile
FROM registry.access.redhat.com/ubi8/ubi:8.4
RUN dnf repolist
RUN sed -i 's|enabled=1|enabled=0|g' /etc/yum/pluginconf.d/subscription-manager.conf
RUN sed -i 's|$releasever|8.4|g' /etc/yum.repos.d/redhat.repo
RUN sed -i '/codeready-builder-for-rhel-8-x86_64-rpms/,/\[/ s/enabled = 0/enabled = 1/' /etc/yum.repos.d/redhat.repo
RUN mv -f /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo.bak
RUN dnf -y update
RUN dnf -y install rsync
COPY system_studio_2019 /opt/intel/system_studio_2019
COPY licenses /opt/intel/licenses
COPY packagemanager /opt/intel/packagemanager
EOF
cd /opt/intel
podman build --squash -t quay.io/nepdemo/flexran_basekit:intel.icc-21.11-ocp-4.9.5-ubi-8.4 -f flexran.intel.icc.dockerfile ./
podman push quay.io/nepdemo/flexran_basekit:intel.icc-21.11-ocp-4.9.5-ubi-8.4
copy intel icx to image
cat << 'EOF' > /opt/intel/flexran.intel.icx.dockerfile
FROM registry.access.redhat.com/ubi8/ubi:8.4
RUN dnf repolist
RUN sed -i 's|enabled=1|enabled=0|g' /etc/yum/pluginconf.d/subscription-manager.conf
RUN sed -i 's|$releasever|8.4|g' /etc/yum.repos.d/redhat.repo
RUN sed -i '/codeready-builder-for-rhel-8-x86_64-rpms/,/\[/ s/enabled = 0/enabled = 1/' /etc/yum.repos.d/redhat.repo
RUN mv -f /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo.bak
RUN dnf -y update
RUN dnf -y install rsync
COPY oneapi /opt/intel/oneapi
COPY licenses /opt/intel/licenses
COPY packagemanager /opt/intel/packagemanager
EOF
cd /opt/intel
podman build --squash -t quay.io/nepdemo/flexran_basekit:intel.icx-21.11-ocp-4.9.5-ubi-8.4 -f flexran.intel.icx.dockerfile ./
podman push quay.io/nepdemo/flexran_basekit:intel.icx-21.11-ocp-4.9.5-ubi-8.4
build dev docker image with dpdk 19.11
cat << 'EOF' > /opt/flexran.dpdk.dockerfile
FROM registry.access.redhat.com/ubi8/ubi:8.4
RUN dnf repolist
RUN sed -i 's|enabled=1|enabled=0|g' /etc/yum/pluginconf.d/subscription-manager.conf
RUN sed -i 's|$releasever|8.4|g' /etc/yum.repos.d/redhat.repo
RUN sed -i '/codeready-builder-for-rhel-8-x86_64-rpms/,/\[/ s/enabled = 0/enabled = 1/' /etc/yum.repos.d/redhat.repo
RUN mv -f /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo.bak
RUN echo -e "\
[localrepo]\n\
name=LocalRepo\n\
baseurl=ftp://10.88.0.1/dnf/extensions/\n\
enabled=1\n\
gpgcheck=0" \
> /etc/yum.repos.d/local.repo
RUN dnf -y update
RUN dnf -y install rsync
RUN dnf -y install kernel-rt-core kernel-rt-devel kernel-rt-modules kernel-rt-modules-extra kernel-headers libhugetlbfs-devel zlib-devel numactl-devel cmake gcc gcc-c++ libhugetlbfs-utils libhugetlbfs-devel libhugetlbfs numactl-devel pciutils libaio libaio-devel net-tools libpcap python3-pip
RUN dnf install -y --allowerasing coreutils
RUN dnf groupinstall -y development server
RUN pip-3 install meson ninja
COPY dpdk-19.11 /opt/dpdk-19.11
# RUN ln -s /opt/dpdk-stable-20.11.3 /opt/dpdk-20.11
EOF
cd /opt/
podman build --squash -t quay.io/nepdemo/flexran_basekit:dpdk-19.11-ocp-4.9.5-ubi-8.4 -f flexran.dpdk.dockerfile ./
podman push quay.io/nepdemo/flexran_basekit:dpdk-19.11-ocp-4.9.5-ubi-8.4
build in nepdemo env
In nepdemo's internal network, build the images and push them to nepdemo's image registry.
create an image registry to hold the large container images
# found a centos7 host
mkdir /etc/crts/ && cd /etc/crts
openssl req \
-newkey rsa:2048 -nodes -keyout redhat.ren.key \
-x509 -days 3650 -out redhat.ren.crt -subj \
"/C=CN/ST=GD/L=SZ/O=Global Security/OU=IT Department/CN=*.redhat.ren"
cp /etc/crts/redhat.ren.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
mkdir -p /home/data/registry
cd /data
# tar zxf registry.tgz
yum -y install docker-distribution
cat << EOF > /etc/docker-distribution/registry/config.yml
version: 0.1
log:
fields:
service: registry
storage:
cache:
layerinfo: inmemory
filesystem:
rootdirectory: /home/data/registry
delete:
enabled: true
http:
addr: :5443
tls:
certificate: /etc/crts/redhat.ren.crt
key: /etc/crts/redhat.ren.key
EOF
# systemctl restart docker
# systemctl stop docker-distribution
systemctl enable --now docker-distribution
build container image for intel sdk
cat << EOF >> /etc/hosts
192.168.123.252 reg-tmp.redhat.ren
EOF
export REG_TMP="reg-tmp.redhat.ren:5443"
podman tag flexran.docker.registry/flexran_vdu:latest ${REG_TMP}/nepdemo/flexran_vdu:flexran-20.11-dpdk-19.11-ocp4.9.5-ubi-8.4
podman push --tls-verify=false ${REG_TMP}/nepdemo/flexran_vdu:flexran-20.11-dpdk-19.11-ocp4.9.5-ubi-8.4
# copy flexran sdk to image
cd /data
podman build --squash -t ${REG_TMP}/nepdemo/flexran_basekit:flexran-sdk-20.11-ocp-4.9.5-ubi-8.4 -f flexran.sdk.dockerfile ./
podman push --tls-verify=false ${REG_TMP}/nepdemo/flexran_basekit:flexran-sdk-20.11-ocp-4.9.5-ubi-8.4
# dpdk-kmods
cd /data/git
podman build --squash -t ${REG_TMP}/nepdemo/flexran_vdu:dpdk-kmods-ocp-4.9.5-ubi -f flexran.sdk.dockerfile ./
podman push --tls-verify=false ${REG_TMP}/nepdemo/flexran_vdu:dpdk-kmods-ocp-4.9.5-ubi
# copy intel icc to image
cd /opt/intel
podman build --squash -t ${REG_TMP}/nepdemo/flexran_basekit:intel.icc-21.11-ocp-4.9.5-ubi-8.4 -f flexran.intel.icc.dockerfile ./
podman push --tls-verify=false ${REG_TMP}/nepdemo/flexran_basekit:intel.icc-21.11-ocp-4.9.5-ubi-8.4
# copy intel icx to image
cd /opt/intel
podman build --squash -t ${REG_TMP}/nepdemo/flexran_basekit:intel.icx-21.11-ocp-4.9.5-ubi-8.4 -f flexran.intel.icx.dockerfile ./
podman push --tls-verify=false ${REG_TMP}/nepdemo/flexran_basekit:intel.icx-21.11-ocp-4.9.5-ubi-8.4
# build dev docker image with dpdk 19.11
cat << 'EOF' > /opt/flexran.dpdk.dockerfile
FROM registry.access.redhat.com/ubi8/ubi:8.4
RUN dnf repolist
RUN sed -i 's|enabled=1|enabled=0|g' /etc/yum/pluginconf.d/subscription-manager.conf
RUN sed -i 's|$releasever|8.4|g' /etc/yum.repos.d/redhat.repo
RUN sed -i 's|cdn.redhat.com|china.cdn.redhat.com|g' /etc/yum.repos.d/redhat.repo
RUN sed -i '/codeready-builder-for-rhel-8-x86_64-rpms/,/\[/ s/enabled = 0/enabled = 1/' /etc/yum.repos.d/redhat.repo
RUN mv -f /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo.bak
RUN echo -e "\
[localrepo]\n\
name=LocalRepo\n\
baseurl=ftp://192.168.122.1/dnf/extensions/\n\
enabled=1\n\
gpgcheck=0" \
> /etc/yum.repos.d/local.repo
RUN dnf -y update
RUN dnf -y install rsync
RUN dnf -y install kernel-rt-core kernel-rt-devel kernel-rt-modules kernel-rt-modules-extra kernel-headers libhugetlbfs-devel zlib-devel numactl-devel cmake gcc gcc-c++ libhugetlbfs-utils libhugetlbfs-devel libhugetlbfs numactl-devel pciutils libaio libaio-devel net-tools libpcap python3-pip
RUN dnf install -y --allowerasing coreutils
RUN dnf groupinstall -y development server
RUN pip-3 install meson ninja
COPY dpdk-19.11 /opt/dpdk-19.11
# RUN ln -s /opt/dpdk-19.11 /opt/dpdk-20.11
EOF
cd /opt/
podman build --squash -t ${REG_TMP}/nepdemo/flexran_basekit:dpdk-19.11-ocp-4.9.5-ubi-8.4 -f flexran.dpdk.dockerfile ./
podman push --tls-verify=false ${REG_TMP}/nepdemo/flexran_basekit:dpdk-19.11-ocp-4.9.5-ubi-8.4
deploy on ocp 4.9.5
With all the images ready, we start deploying and testing on OpenShift 4.
set security for temp image registry
Since we created a temporary image registry, we need to push its configuration into the cluster, mainly so that the OCP cluster does not verify this new registry's certificate.
oc patch schedulers.config.openshift.io/cluster --type merge -p '{"spec":{"mastersSchedulable":false}}'
install /data/ocp4/clients/butane-amd64 /usr/local/bin/butane
cat << EOF > /data/sno/tmp.images.bu
variant: openshift
version: 4.9.0
metadata:
labels:
machineconfiguration.openshift.io/role: worker
name: 99-zzz-worker-temp-images
storage:
files:
- path: /etc/containers/registries.conf.d/temp.registries.conf
overwrite: true
contents:
inline: |
[[registry]]
location = "tmp-registry.ocp4.redhat.ren:5443"
insecure = true
blocked = false
mirror-by-digest-only = false
prefix = ""
EOF
butane /data/sno/tmp.images.bu > /data/sno/99-zzz-worker-temp-images.yaml
oc create -f /data/sno/99-zzz-worker-temp-images.yaml
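The new MachineConfig triggers a rolling update of the worker machine config pool; wait for it to settle before continuing:
oc get mcp worker
# UPDATED should return to True and UPDATING to False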
set a host-path dir for flexran sdk
We need a local directory on worker-2 to hold the huge flexran sdk, intel icc and intel icx trees, mainly because the development team wants to develop and test on the container platform. In a production runtime environment this local directory should not exist.
# do not need, as it is already deployed
cat << EOF > /data/install/host-path.yaml
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
name: 50-set-selinux-for-hostpath-nepdemo-worker-rt-2
labels:
machineconfiguration.openshift.io/role: worker-rt-2
spec:
config:
ignition:
version: 3.2.0
systemd:
units:
- contents: |
[Unit]
Description=Set SELinux chcon for hostpath nepdemo
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=-mkdir -p /var/nepdemo
ExecStart=chcon -Rt container_file_t /var/nepdemo/
[Install]
WantedBy=multi-user.target
enabled: true
name: hostpath-nepdemo.service
EOF
oc create -f /data/install/host-path.yaml
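Once the node has picked up the MachineConfig, a quick check that the one-shot unit ran and the directory carries the expected SELinux label:
ssh core@worker-2.ocp4.redhat.ren -- 'systemctl status hostpath-nepdemo.service --no-pager; sudo ls -dZ /var/nepdemo'
# expect the unit in active (exited) state and /var/nepdemo labeled container_file_t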
using job to copy files to local path
We use Jobs to copy the flexran sdk and the intel icc/icx sdks to the local directory on worker-2 for later use. A Job fits because this is one-off work, and since rsync is installed in the container images, if the local directory is ever damaged we can simply re-run the Jobs below to resynchronize it quickly.
export REG_TMP='tmp-registry.ocp4.redhat.ren:5443'
# copy dpdk to local
cat << EOF > /data/install/job.flexran.dpdk.yaml
---
apiVersion: batch/v1
kind: Job
metadata:
name: flexran.basekit.dpdk.copy
namespace: default
spec:
template:
spec:
containers:
- name: files
image: ${REG_TMP}/nepdemo/flexran_basekit:dpdk-19.11-ocp-4.9.5-ubi-8.4
command: [ "/bin/sh","-c","--" ]
# command: ["rsync", "--delete", "-arz", "/opt/dpdk-19.11", "/nepdemo/"]
args: [" rsync -P --delete -arz /opt/dpdk-19.11 /nepdemo/ "]
volumeMounts:
- name: nepdemo
mountPath: /nepdemo
restartPolicy: Never
nodeName: worker-2.ocp4.redhat.ren
volumes:
- name: nepdemo
hostPath:
path: /var/nepdemo
EOF
oc create -f /data/install/job.flexran.dpdk.yaml
# copy flexran sdk to local
cat << EOF > /data/install/job.flexran.sdk.yaml
---
apiVersion: batch/v1
kind: Job
metadata:
name: flexran.basekit.sdk.copy
namespace: default
spec:
template:
spec:
containers:
- name: files
image: ${REG_TMP}/nepdemo/flexran_basekit:flexran-sdk-20.11-ocp-4.9.5-ubi-8.4
command: [ "/bin/sh","-c","--" ]
# command: ["rsync", "--delete", "-arz", "/data/flexran", "/nepdemo/"]
args: [" rsync -P --delete -arz /data/flexran /nepdemo/ "]
volumeMounts:
- name: nepdemo
mountPath: /nepdemo
restartPolicy: Never
nodeName: worker-2.ocp4.redhat.ren
volumes:
- name: nepdemo
hostPath:
path: /var/nepdemo
EOF
oc create -f /data/install/job.flexran.sdk.yaml
# copy intel icc sdk to local
cat << EOF > /data/install/job.intel.icc.yaml
---
apiVersion: batch/v1
kind: Job
metadata:
name: flexran.basekit.intel.icc.copy
namespace: default
spec:
template:
spec:
containers:
- name: files
image: ${REG_TMP}/nepdemo/flexran_basekit:intel.icc-21.11-ocp-4.9.5-ubi-8.4
command: [ "/bin/sh","-c","--" ]
# command: ["rsync", "--delete", "-arz", "/opt/intel/system_studio_2019", "/nepdemo/"]
args: [" rsync -P --delete -arz /opt/intel/system_studio_2019 /nepdemo/ "]
volumeMounts:
- name: nepdemo
mountPath: /nepdemo
restartPolicy: Never
nodeName: worker-2.ocp4.redhat.ren
volumes:
- name: nepdemo
hostPath:
path: /var/nepdemo
EOF
oc create -f /data/install/job.intel.icc.yaml
# copy intel icx sdk to local
cat << EOF > /data/install/job.intel.icx.yaml
---
apiVersion: batch/v1
kind: Job
metadata:
name: flexran.basekit.intel.icx.copy
namespace: default
spec:
template:
spec:
containers:
- name: files
image: ${REG_TMP}/nepdemo/flexran_basekit:intel.icx-21.11-ocp-4.9.5-ubi-8.4
command: [ "/bin/sh","-c","--" ]
# command: ["rsync", "--delete", "-arz", "/opt/intel/oneapi", "/nepdemo/"]
args: [" rsync -P --delete -arz /opt/intel/oneapi /nepdemo/ "]
volumeMounts:
- name: nepdemo
mountPath: /nepdemo
restartPolicy: Never
nodeName: worker-2.ocp4.redhat.ren
volumes:
- name: nepdemo
hostPath:
path: /var/nepdemo
EOF
oc create -f /data/install/job.intel.icx.yaml
# copy intel license to local
cat << EOF > /data/install/job.intel.license.yaml
---
apiVersion: batch/v1
kind: Job
metadata:
  name: flexran.basekit.intel.license.copy
namespace: default
spec:
template:
spec:
containers:
- name: files
image: ${REG_TMP}/nepdemo/flexran_basekit:intel.icx-21.11-ocp-4.9.5-ubi-8.4
command: [ "/bin/sh","-c","--" ]
args: ["rsync -P --delete -arz /opt/intel/licenses /nepdemo/ ; rsync -P --delete -arz /opt/intel/packagemanager /nepdemo/ "]
volumeMounts:
- name: nepdemo
mountPath: /nepdemo
restartPolicy: Never
nodeName: worker-2.ocp4.redhat.ren
volumes:
- name: nepdemo
hostPath:
path: /var/nepdemo
EOF
oc create -f /data/install/job.intel.license.yaml
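Once the Jobs have run, verify they completed and that the trees landed on the node (the expected directory names follow from the rsync arguments above):
oc get jobs -n default
# COMPLETIONS should reach 1/1 for each copy job
ssh core@worker-2.ocp4.redhat.ren -- sudo ls /var/nepdemo
# expect dpdk-19.11, flexran, system_studio_2019, oneapi, licenses, packagemanager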
setup sriov operator
OpenShift ships an SR-IOV operator that officially supports the Intel X710 NIC, so we simply use it.
The environment has an Intel X710 NIC (vendor 8086, device 1572).
# install sriov operator
cat << EOF > /data/install/sriov.yaml
---
apiVersion: v1
kind: Namespace
metadata:
name: openshift-sriov-network-operator
annotations:
workload.openshift.io/allowed: management
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: sriov-network-operators
namespace: openshift-sriov-network-operator
spec:
targetNamespaces:
- openshift-sriov-network-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: sriov-network-operator-subscription
namespace: openshift-sriov-network-operator
spec:
channel: "4.9"
installPlanApproval: Manual
name: sriov-network-operator
source: redhat-operators
sourceNamespace: openshift-marketplace
EOF
oc create -f /data/install/sriov.yaml
oc get SriovNetworkNodeState -n openshift-sriov-network-operator
# NAME AGE
# master-0 42m
# worker-0.ocp4.redhat.ren 42m
# worker-1 42m
# worker-2.ocp4.redhat.ren 42m
oc get SriovNetworkNodeState/worker-2.ocp4.redhat.ren -n openshift-sriov-network-operator -o yaml
# apiVersion: sriovnetwork.openshift.io/v1
# kind: SriovNetworkNodeState
# metadata:
# creationTimestamp: "2022-05-06T14:34:54Z"
# generation: 1
# name: worker-2.ocp4.redhat.ren
# namespace: openshift-sriov-network-operator
# ownerReferences:
# - apiVersion: sriovnetwork.openshift.io/v1
# blockOwnerDeletion: true
# controller: true
# kind: SriovNetworkNodePolicy
# name: default
# uid: 4eca5eea-e1e5-410f-8833-dd2de1434e53
# resourceVersion: "70932404"
# uid: 1d122c8e-b788-4f1e-a3d5-865c6230a476
# spec:
# dpConfigVersion: "70930693"
# status:
# interfaces:
# - deviceID: "1572"
# driver: i40e
# linkSpeed: -1 Mb/s
# linkType: ETH
# mac: 90:e2:ba:a8:29:e6
# mtu: 1500
# name: ens2f0
# pciAddress: 0000:65:00.0
# totalvfs: 64
# vendor: "8086"
# - deviceID: "1572"
# driver: i40e
# linkSpeed: -1 Mb/s
# linkType: ETH
# mac: 90:e2:ba:a8:29:e7
# mtu: 1500
# name: ens2f1
# pciAddress: 0000:65:00.1
# totalvfs: 64
# vendor: "8086"
# - deviceID: 37d1
# driver: i40e
# linkSpeed: 1000 Mb/s
# linkType: ETH
# mac: ac:1f:6b:ea:5b:32
# mtu: 1500
# name: eno1
# pciAddress: 0000:b5:00.0
# totalvfs: 32
# vendor: "8086"
# - deviceID: 37d1
# driver: i40e
# linkSpeed: 1000 Mb/s
# linkType: ETH
# mac: ac:1f:6b:ea:5b:33
# mtu: 1500
# name: eno2
# pciAddress: 0000:b5:00.1
# totalvfs: 32
# vendor: "8086"
# syncStatus: Succeeded
# how to use the sriov to create VF and attach to pod, depends on use case from nep demo request
# remember to active SRIOV in bios
# remember to active VT-d in bios
cat << EOF > /data/install/sriov.policy.yaml
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
name: policy-710-nic01-rt2
namespace: openshift-sriov-network-operator
spec:
resourceName: intel_710_nic01_rt2
nodeSelector:
kubernetes.io/hostname: worker-2.ocp4.redhat.ren
numVfs: 4
nicSelector:
vendor: "8086"
deviceID: "1572"
rootDevices:
- "0000:65:00.0"
# pfNames:
# - "ens2f0"
# linkType: eth
# isRdma: false
deviceType: vfio-pci
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
name: policy-710-nic02-rt2
namespace: openshift-sriov-network-operator
spec:
resourceName: intel_710_nic02_rt2
nodeSelector:
kubernetes.io/hostname: worker-2.ocp4.redhat.ren
numVfs: 4
nicSelector:
vendor: "8086"
deviceID: "1572"
rootDevices:
- "0000:65:00.1"
# pfNames:
# - "ens2f1"
# linkType: eth
# isRdma: false
deviceType: vfio-pci
EOF
oc create -f /data/install/sriov.policy.yaml
# oc delete -f /data/install/sriov.policy.yaml
oc get sriovnetworknodestates/worker-2.ocp4.redhat.ren -n openshift-sriov-network-operator -o jsonpath='{.status.syncStatus}' && echo
# Succeeded
cat << EOF > /data/install/sriov.attach.yaml
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
name: intel-710-nic01-rt2
namespace: openshift-sriov-network-operator
spec:
resourceName: intel_710_nic01_rt2
networkNamespace: vbbu-demo
ipam: |-
{
"type": "static",
"addresses": [
{
"address": "192.168.12.21/24"
}
]
}
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
name: intel-710-nic02-rt2
namespace: openshift-sriov-network-operator
spec:
resourceName: intel_710_nic02_rt2
networkNamespace: vbbu-demo
# ipam: |-
# {
# "type": "dhcp"
# }
ipam: |-
{
"type": "static",
"addresses": [
{
"address": "192.168.22.21/24"
}
]
}
EOF
oc create -f /data/install/sriov.attach.yaml
# oc delete -f /data/install/sriov.attach.yaml
oc get net-attach-def -n vbbu-demo
# NAME AGE
# intel-710-nic01-rt2 34s
# intel-710-nic02-rt2 34s
setup fec sriov operator
Intel already provides an operator for its FEC accelerator card, along with very detailed documentation, so we can happily use it as-is.
- SEO Operator for Wireless FEC Accelerators documentation
- Intel's vRAN accelerators supported by SEO Operators on OpenShift
# install sriov operator
cat << EOF > /data/install/sriov.fec.yaml
---
apiVersion: v1
kind: Namespace
metadata:
name: vran-acceleration-operators
annotations:
workload.openshift.io/allowed: management
labels:
openshift.io/cluster-monitoring: "true"
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: vran-operators
namespace: vran-acceleration-operators
spec:
targetNamespaces:
- vran-acceleration-operators
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: sriov-fec-subscription
namespace: vran-acceleration-operators
spec:
channel: stable
installPlanApproval: Manual
name: sriov-fec
source: certified-operators
sourceNamespace: openshift-marketplace
EOF
oc create -f /data/install/sriov.fec.yaml
oc get csv -n vran-acceleration-operators
# NAME DISPLAY VERSION REPLACES PHASE
# performance-addon-operator.v4.9.0 Performance Addon Operator 4.9.0 Succeeded
# sriov-fec.v2.2.1 SEO SR-IOV Operator for Wireless FEC Accelerators 2.2.1 Succeeded
oc get sriovfecnodeconfig -n vran-acceleration-operators
# No resources found in vran-acceleration-operators namespace.
cat << EOF > /data/install/sriov.fec.config.yaml
apiVersion: sriovfec.intel.com/v2
kind: SriovFecClusterConfig
metadata:
name: config
namespace: vran-acceleration-operators
spec:
priority: 1
nodeSelector:
kubernetes.io/hostname: worker-2.ocp4.redhat.ren
acceleratorSelector:
pciAddress: 0000:17:00.0
physicalFunction:
pfDriver: "pci-pf-stub"
vfDriver: "vfio-pci"
vfAmount: 16
bbDevConfig:
acc100:
# pfMode selects the programming mode: true = PF programming, false = VF programming
pfMode: true
numVfBundles: 16
maxQueueSize: 1024
uplink4G:
numQueueGroups: 0
numAqsPerGroups: 16
aqDepthLog2: 4
downlink4G:
numQueueGroups: 0
numAqsPerGroups: 16
aqDepthLog2: 4
uplink5G:
numQueueGroups: 4
numAqsPerGroups: 16
aqDepthLog2: 4
downlink5G:
numQueueGroups: 4
numAqsPerGroups: 16
aqDepthLog2: 4
EOF
oc create -f /data/install/sriov.fec.config.yaml
# oc delete -f /data/install/sriov.fec.config.yaml
oc get sriovfecnodeconfig -n vran-acceleration-operators
# NAME CONFIGURED
# worker-2.ocp4.redhat.ren Succeeded
oc get sriovfecnodeconfig -n vran-acceleration-operators worker-2.ocp4.redhat.ren -o yaml
# apiVersion: sriovfec.intel.com/v2
# kind: SriovFecNodeConfig
# metadata:
# creationTimestamp: "2022-05-09T06:51:45Z"
# generation: 2
# name: worker-2.ocp4.redhat.ren
# namespace: vran-acceleration-operators
# resourceVersion: "72789505"
# uid: 265c42ae-f898-407c-a4bc-7f17aa8b94bb
# spec:
# physicalFunctions:
# - bbDevConfig:
# acc100:
# downlink4G:
# aqDepthLog2: 4
# numAqsPerGroups: 16
# numQueueGroups: 0
# downlink5G:
# aqDepthLog2: 4
# numAqsPerGroups: 16
# numQueueGroups: 4
# maxQueueSize: 1024
# numVfBundles: 16
# pfMode: true
# uplink4G:
# aqDepthLog2: 4
# numAqsPerGroups: 16
# numQueueGroups: 0
# uplink5G:
# aqDepthLog2: 4
# numAqsPerGroups: 16
# numQueueGroups: 4
# pciAddress: "0000:17:00.0"
# pfDriver: pci-pf-stub
# vfAmount: 16
# vfDriver: vfio-pci
# status:
# conditions:
# - lastTransitionTime: "2022-05-09T12:48:10Z"
# message: Configured successfully
# observedGeneration: 2
# reason: Succeeded
# status: "True"
# type: Configured
# inventory:
# sriovAccelerators:
# - deviceID: 0d5c
# driver: pci-pf-stub
# maxVirtualFunctions: 16
# pciAddress: "0000:17:00.0"
# vendorID: "8086"
# virtualFunctions:
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:00.0"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:00.1"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:01.2"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:01.3"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:01.4"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:01.5"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:01.6"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:01.7"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:00.2"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:00.3"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:00.4"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:00.5"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:00.6"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:00.7"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:01.0"
# - deviceID: 0d5d
# driver: vfio-pci
# pciAddress: "0000:18:01.1"
setup ptp
The Intel FlexRAN documentation says PTP is mandatory. That is expected: in the O-RAN architecture, PTP is required.
# install ptp operator
cat << EOF > /data/install/ptp.yaml
---
apiVersion: v1
kind: Namespace
metadata:
name: openshift-ptp
annotations:
workload.openshift.io/allowed: management
labels:
name: openshift-ptp
openshift.io/cluster-monitoring: "true"
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: ptp-operators
namespace: openshift-ptp
spec:
targetNamespaces:
- openshift-ptp
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: ptp-operator-subscription
namespace: openshift-ptp
spec:
channel: "4.9"
installPlanApproval: Manual
name: ptp-operator
source: redhat-operators
sourceNamespace: openshift-marketplace
EOF
oc create -f /data/install/ptp.yaml
oc get csv -n openshift-ptp
# NAME DISPLAY VERSION REPLACES PHASE
# performance-addon-operator.v4.9.0 Performance Addon Operator 4.9.0 Succeeded
# ptp-operator.4.9.0-202204211825 PTP Operator 4.9.0-202204211825 Succeeded
oc get csv -n openshift-ptp \
-o custom-columns=Name:.metadata.name,Phase:.status.phase
# Name Phase
# performance-addon-operator.v4.9.0 Succeeded
# ptp-operator.4.9.0-202204211825 Succeeded
# nepdemo asked to disable the phc2sys service, but we keep it enabled here.
# note on the tricky ptp4lConf field: after reading the source code, it turns out it must not contain any blank lines
cat << EOF > /data/install/ptp.config.yaml
apiVersion: ptp.openshift.io/v1
kind: PtpConfig
metadata:
name: ordinary-clock-ptp-config-worker-2
namespace: openshift-ptp
spec:
profile:
- name: "profile1"
interface: "ens2f1"
ptp4lOpts: "-2 -m"
phc2sysOpts: "-a -r"
ptp4lConf: |-
[global]
#
# Default Data Set
#
twoStepFlag 1
slaveOnly 0
priority1 128
priority2 128
domainNumber 24
#utc_offset 37
clockClass 248
clockAccuracy 0xFE
offsetScaledLogVariance 0xFFFF
free_running 0
freq_est_interval 1
dscp_event 0
dscp_general 0
dataset_comparison ieee1588
G.8275.defaultDS.localPriority 128
#
# Port Data Set
# 16 TS a second use logSyncInterval -4
logAnnounceInterval 1
logSyncInterval -4
logMinDelayReqInterval 0
logMinPdelayReqInterval 0
announceReceiptTimeout 3
syncReceiptTimeout 0
delayAsymmetry 0
fault_reset_interval 4
neighborPropDelayThresh 20000000
masterOnly 0
G.8275.portDS.localPriority 128
#
# Run time options
#
assume_two_step 0
logging_level 6
path_trace_enabled 0
follow_up_info 0
hybrid_e2e 0
inhibit_multicast_service 0
net_sync_monitor 0
tc_spanning_tree 0
tx_timestamp_timeout 1
unicast_listen 0
unicast_master_table 0
unicast_req_duration 3600
use_syslog 1
verbose 0
summary_interval 0
kernel_leap 1
check_fup_sync 0
#
# Servo Options
#
pi_proportional_const 0.0
pi_integral_const 0.0
pi_proportional_scale 0.0
pi_proportional_exponent -0.3
pi_proportional_norm_max 0.7
pi_integral_scale 0.0
pi_integral_exponent 0.4
pi_integral_norm_max 0.3
step_threshold 0.0
first_step_threshold 0.00002
max_frequency 900000000
clock_servo pi
sanity_freq_limit 200000000
ntpshm_segment 0
#
# Transport options
#
transportSpecific 0x0
ptp_dst_mac 01:1B:19:00:00:00
p2p_dst_mac 01:80:C2:00:00:0E
udp_ttl 1
udp6_scope 0x0E
uds_address /var/run/ptp4l
#
# Default interface options
#
clock_type OC
network_transport UDPv4
delay_mechanism E2E
time_stamping hardware
tsproc_mode filter
delay_filter moving_median
delay_filter_length 10
egressLatency 0
ingressLatency 0
boundary_clock_jbod 0
#
# Clock description
#
productDescription ;;
revisionData ;;
manufacturerIdentity 00:00:00
userDescription ;
timeSource 0xA0
ptpSchedulingPolicy: SCHED_FIFO
ptpSchedulingPriority: 65
recommend:
- profile: "profile1"
priority: 10
match:
- nodeLabel: "node-role.kubernetes.io/worker"
nodeName: "worker-2.ocp4.redhat.ren"
EOF
oc create -f /data/install/ptp.config.yaml
# oc delete -f /data/install/ptp.config.yaml
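To confirm the profile was picked up (my addition; the pod and container names below are the ptp-operator defaults and may differ slightly between versions), check the linuxptp daemon running on the node:
POD_ID=$(oc get pod -n openshift-ptp -o json | jq -r '.items[].metadata.name | select( contains("linuxptp-daemon") )' | head -n 1 )
oc logs -n openshift-ptp $POD_ID -c linuxptp-daemon-container --tail=20
# expect ptp4l/phc2sys messages showing the offset converging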
create deployment (put it all together)
Finally, we can assemble a complete deployment: a single pod with 2 containers. One container runs the vBBU application and is built the way the Intel SDK suggests, i.e. only the compiled application itself goes into the image rather than all the build dependencies, which keeps the image fairly small, around 2 GB. The other container is a development environment, because the development team needs a place to build the artifacts and then copy them into the vBBU application container.
Here, the flexran-release-running container is the one used for the final run, while flexran-dev-env is the development environment.
The current version is a development build. Once development and testing are finished, flexran-dev-env will be dropped, and the local host-path directories will be removed as well, i.e. the local copy of the Intel SDK will be deleted.
oc new-project vbbu-demo
oc project vbbu-demo
export REG_TMP='tmp-registry.ocp4.redhat.ren:5443'
# kernel driver deployment
oc create serviceaccount svcacct-driver -n vbbu-demo
oc adm policy add-scc-to-user privileged -z svcacct-driver -n vbbu-demo
# oc adm policy add-scc-to-user anyuid -z mysvcacct -n vbbu-demo
cat << EOF > /data/install/dpdk.kmod.driver.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: dpdk-kmod-driver
# namespace: default
labels:
app: dpdk-kmod-driver
spec:
replicas: 1
selector:
matchLabels:
app: dpdk-kmod-driver
template:
metadata:
labels:
app: dpdk-kmod-driver
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: "app"
operator: In
values:
- dpdk-kmod-driver
topologyKey: "kubernetes.io/hostname"
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- worker-2.ocp4.redhat.ren
# restartPolicy: Never
serviceAccountName: svcacct-driver
initContainers:
- name: copy
image: ${REG_TMP}/nepdemo/flexran_vdu:dpdk-kmods-ocp-4.9.5-ubi
command: ["/bin/sh", "-c", "--"]
args: ["/bin/cp -rf /data/* /nepdemo/"]
# imagePullPolicy: Always
volumeMounts:
- name: driver-files
mountPath: /nepdemo
containers:
- name: driver
image: ${REG_TMP}/nepdemo/flexran_vdu:flexran-20.11-dpdk-19.11-ocp4.9.5-ubi-8.4
imagePullPolicy: Always
command: ["/bin/sh", "-c", "--"]
args: ["insmod /nepdemo/dpdk-kmods/linux/igb_uio/igb_uio.ko ; sleep infinity ;"]
resources:
requests:
cpu: 10m
memory: 20Mi
securityContext:
privileged: true
runAsUser: 0
volumeMounts:
- name: driver-files
mountPath: /nepdemo
# - name: host
# mountPath: /host
volumes:
- name: driver-files
emptyDir: {}
# - name: host
# hostPath:
# path: /
# type: Directory
EOF
oc create -n vbbu-demo -f /data/install/dpdk.kmod.driver.yaml
# to restore
# oc delete -f /data/install/dpdk.kmod.driver.yaml
# the pod with vbbu container and dev container
# later, it will change to deployment
cat << EOF > /data/install/vran.intel.flexran.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: flexran-binary-release-deployment
labels:
app: flexran-binary-release-deployment
spec:
replicas: 1
selector:
matchLabels:
app: flexran-binary-release
template:
metadata:
labels:
app: flexran-binary-release
name: flexran-binary-release
annotations:
k8s.v1.cni.cncf.io/networks: |-
[
{
"name": "intel-710-nic01-rt2",
"mac": "00:11:22:33:44:01"
},
{
"name": "intel-710-nic02-rt2",
"mac": "00:11:22:33:44:02"
}
]
cpu-load-balancing.crio.io: "true"
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: "app"
operator: In
values:
- flexran-binary-release
topologyKey: "kubernetes.io/hostname"
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- worker-2.ocp4.redhat.ren
# nodeSelector:
# kubernetes.io/hostname: worker-2.ocp4.redhat.ren
runtimeClassName: performance-wzh-performanceprofile-2
serviceAccountName: svcacct-driver
containers:
- securityContext:
privileged: false
capabilities:
add:
#- SYS_ADMIN
- IPC_LOCK
- SYS_NICE
- SYS_RESOURCE
- NET_RAW
command: [ "/sbin/init" ]
# command: [ "/bin/sh","-c","--" ]
# args: [" sleep infinity ; "]
# tty: true
# stdin: true
image: ${REG_TMP}/nepdemo/flexran_vdu:flexran-20.11-dpdk-19.11-ocp4.9.5-ubi-8.4
# image: ${REG_TMP}/nepdemo/flexran_basekit:dpdk-19.11-ocp-4.9.5-ubi-8.4
# imagePullPolicy: Always
name: flexran-release-running
resources:
requests:
memory: "24Gi"
intel.com/intel_fec_acc100: '1'
hugepages-1Gi: 16Gi
limits:
memory: "24Gi"
intel.com/intel_fec_acc100: '1'
hugepages-1Gi: 16Gi
volumeMounts:
- name: hugepage
mountPath: /hugepages
- name: varrun
mountPath: /var/run/dpdk
readOnly: false
# - name: oneapi
# mountPath: /opt/intel/oneapi
# readOnly: false
# - name: system-studio-2019
# mountPath: /opt/intel/system_studio_2019
# readOnly: false
# - name: licenses
# mountPath: /opt/intel/licenses
# readOnly: false
# - name: packagemanager
# mountPath: /opt/intel/packagemanager
# readOnly: false
- name: dpdk-19-11
mountPath: /opt/dpdk-19.11
readOnly: false
- name: flexran
mountPath: /data/flexran
readOnly: false
- name: sys
mountPath: /sys/
readOnly: false
- securityContext:
privileged: false
command: [ "/bin/sh","-c","--" ]
args: [" echo 'source /opt/intel/system_studio_2019/bin/compilervars.sh intel64' >> /root/.bashrc ; echo 'source /opt/intel/oneapi/setvars.sh' >> /root/.bashrc ; sleep infinity"]
# tty: true
# stdin: true
# env:
image: ${REG_TMP}/nepdemo/flexran_basekit:dpdk-19.11-ocp-4.9.5-ubi-8.4
name: flexran-dev-env
volumeMounts:
- name: oneapi
mountPath: /opt/intel/oneapi
readOnly: false
- name: system-studio-2019
mountPath: /opt/intel/system_studio_2019
readOnly: false
- name: licenses
mountPath: /opt/intel/licenses
readOnly: false
- name: packagemanager
mountPath: /opt/intel/packagemanager
readOnly: false
- name: dpdk-19-11
mountPath: /opt/dpdk-19-11
readOnly: false
- name: flexran
mountPath: /data/flexran
readOnly: false
volumes:
- name: hugepage
emptyDir:
medium: HugePages
- name: varrun
emptyDir: {}
- name: dpdk-19-11
hostPath:
path: "/var/nepdemo/dpdk-19.11"
- name: flexran
hostPath:
path: "/var/nepdemo/flexran"
- name: oneapi
hostPath:
path: "/var/nepdemo/oneapi"
- name: system-studio-2019
hostPath:
path: "/var/nepdemo/system_studio_2019"
- name: licenses
hostPath:
path: "/var/nepdemo/licenses"
- name: packagemanager
hostPath:
path: "/var/nepdemo/packagemanager"
- name: sys
hostPath:
path: "/sys/"
EOF
oc create -n vbbu-demo -f /data/install/vran.intel.flexran.yaml
# oc delete -n vbbu-demo -f /data/install/vran.intel.flexran.yaml
POD_ID=$(oc get pod -n vbbu-demo -o json | jq -r '.items[].metadata.name | select(. | contains("flexran-binary-release"))' )
oc rsh -c flexran-dev-env ${POD_ID}
# switch to bash; it will run .bashrc, which will bring in the intel icc/icx sdk env.
# bash
# from the FEC device plugin logs, we can see the devices have been exposed
POD_ID=$(oc get pod -n vran-acceleration-operators -o json | jq -r ' .items[].metadata.name | select( contains( "device-plugin" ) ) ')
oc logs -n vran-acceleration-operators $POD_ID
# ......
# I0509 12:53:38.288275 1 server.go:119] Allocate() called with &AllocateRequest{ContainerRequests:[]*ContainerAllocateRequest{&ContainerAllocateRequest{DevicesIDs:[0000:18:01.2],},},}
# I0509 12:53:38.288326 1 accelResourcePool.go:46] GetDeviceSpecs(): for devices: [0000:18:01.2]
# I0509 12:53:38.288435 1 pool_stub.go:97] GetEnvs(): for devices: [0000:18:01.2]
# I0509 12:53:38.288443 1 pool_stub.go:113] GetMounts(): for devices: [0000:18:01.2]
# I0509 12:53:38.288447 1 server.go:128] AllocateResponse send: &AllocateResponse{ContainerResponses:[]*ContainerAllocateResponse{&ContainerAllocateResponse{Envs:map[string]string{PCIDEVICE_INTEL_COM_INTEL_FEC_ACC100: 0000:18:01.2,},Mounts:[]*Mount{},Devices:[]*DeviceSpec{&DeviceSpec{ContainerPath:/dev/vfio/vfio,HostPath:/dev/vfio/vfio,Permissions:mrw,},&DeviceSpec{ContainerPath:/dev/vfio/110,HostPath:/dev/vfio/110,Permissions:mrw,},},Annotations:map[string]string{},},},}
POD_ID=$(oc get pod -n openshift-sriov-network-operator -o json | jq -r ' .items[].metadata.name | select( contains( "device-plugin" ) ) ')
oc logs -n openshift-sriov-network-operator $POD_ID
# ......
# I0511 13:03:13.167902 1 server.go:115] Allocate() called with &AllocateRequest{ContainerRequests:[]*ContainerAllocateRequest{&ContainerAllocateRequest{DevicesIDs:[0000:65:02.0],},},}
# I0511 13:03:13.167961 1 netResourcePool.go:50] GetDeviceSpecs(): for devices: [0000:65:02.0]
# I0511 13:03:13.168068 1 pool_stub.go:97] GetEnvs(): for devices: [0000:65:02.0]
# I0511 13:03:13.168077 1 pool_stub.go:113] GetMounts(): for devices: [0000:65:02.0]
# I0511 13:03:13.168082 1 server.go:124] AllocateResponse send: &AllocateResponse{ContainerResponses:[]*ContainerAllocateResponse{&ContainerAllocateResponse{Envs:map[string]string{PCIDEVICE_OPENSHIFT_IO_INTEL_710_NIC01_RT2: 0000:65:02.0,},Mounts:[]*Mount{},Devices:[]*DeviceSpec{&DeviceSpec{ContainerPath:/dev/vfio/vfio,HostPath:/dev/vfio/vfio,Permissions:mrw,},&DeviceSpec{ContainerPath:/dev/vfio/108,HostPath:/dev/vfio/108,Permissions:mrw,},},Annotations:map[string]string{},},},}
# I0511 13:03:13.168369 1 server.go:115] Allocate() called with &AllocateRequest{ContainerRequests:[]*ContainerAllocateRequest{&ContainerAllocateRequest{DevicesIDs:[0000:65:0a.0],},},}
# I0511 13:03:13.168393 1 netResourcePool.go:50] GetDeviceSpecs(): for devices: [0000:65:0a.0]
# I0511 13:03:13.168470 1 pool_stub.go:97] GetEnvs(): for devices: [0000:65:0a.0]
# I0511 13:03:13.168477 1 pool_stub.go:113] GetMounts(): for devices: [0000:65:0a.0]
# I0511 13:03:13.168481 1 server.go:124] AllocateResponse send: &AllocateResponse{ContainerResponses:[]*ContainerAllocateResponse{&ContainerAllocateResponse{Envs:map[string]string{PCIDEVICE_OPENSHIFT_IO_INTEL_710_NIC02_RT2: 0000:65:0a.0,},Mounts:[]*Mount{},Devices:[]*DeviceSpec{&DeviceSpec{ContainerPath:/dev/vfio/vfio,HostPath:/dev/vfio/vfio,Permissions:mrw,},&DeviceSpec{ContainerPath:/dev/vfio/112,HostPath:/dev/vfio/112,Permissions:mrw,},},Annotations:map[string]string{},},},}
# verify inside the vbbu pod: the devices show up there as well.
POD_ID=$(oc get pod -n vbbu-demo -o json | jq -r '.items[].metadata.name | select(. | contains("flexran-binary-release"))' )
oc exec -c flexran-release-running ${POD_ID} -- ls /dev/vfio
# Defaulted container "flexran-release-running" out of: flexran-release-running, flexran-dev-env
# 110
# 112
# 97
# vfio
POD_ID=$(oc get pod -n vbbu-demo -o json | jq -r '.items[].metadata.name | select(. | contains("flexran-binary-release"))' )
oc rsh -c flexran-release-running ${POD_ID}
POD_ID=$(oc get pod -n vbbu-demo -o json | jq -r '.items[].metadata.name | select(. | contains("flexran-binary-release"))' )
oc exec -c flexran-release-running ${POD_ID} -- ip link
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
# link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
# 3: eth0@if30: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default
# link/ether 0a:58:0a:fe:0a:0a brd ff:ff:ff:ff:ff:ff link-netnsid 0
POD_ID=$(oc get pod -n vbbu-demo -o json | jq -r '.items[].metadata.name | select(. | contains("flexran-binary-release"))' )
oc exec -c flexran-release-running ${POD_ID} -- python3 /root/dpdk-19.11/usertools/dpdk-devbind.py -s
# Network devices using DPDK-compatible driver
# ============================================
# 0000:65:02.0 'Ethernet Virtual Function 700 Series 154c' drv=vfio-pci unused=iavf,igb_uio
# 0000:65:02.1 'Ethernet Virtual Function 700 Series 154c' drv=vfio-pci unused=iavf,igb_uio
# 0000:65:02.2 'Ethernet Virtual Function 700 Series 154c' drv=vfio-pci unused=iavf,igb_uio
# 0000:65:02.3 'Ethernet Virtual Function 700 Series 154c' drv=vfio-pci unused=iavf,igb_uio
# 0000:65:0a.0 'Ethernet Virtual Function 700 Series 154c' drv=vfio-pci unused=iavf,igb_uio
# 0000:65:0a.1 'Ethernet Virtual Function 700 Series 154c' drv=vfio-pci unused=iavf,igb_uio
# 0000:65:0a.2 'Ethernet Virtual Function 700 Series 154c' drv=vfio-pci unused=iavf,igb_uio
# 0000:65:0a.3 'Ethernet Virtual Function 700 Series 154c' drv=vfio-pci unused=iavf,igb_uio
# Network devices using kernel driver
# ===================================
# 0000:65:00.0 'Ethernet Controller X710 for 10GbE SFP+ 1572' if=ens2f0 drv=i40e unused=igb_uio,vfio-pci
# 0000:65:00.1 'Ethernet Controller X710 for 10GbE SFP+ 1572' if=ens2f1 drv=i40e unused=igb_uio,vfio-pci
# 0000:b5:00.0 'Ethernet Connection X722 for 1GbE 37d1' if=eno1 drv=i40e unused=igb_uio,vfio-pci
# 0000:b5:00.1 'Ethernet Connection X722 for 1GbE 37d1' if=eno2 drv=i40e unused=igb_uio,vfio-pci
# Baseband devices using DPDK-compatible driver
# =============================================
# 0000:18:00.0 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:00.1 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:00.2 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:00.3 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:00.4 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:00.5 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:00.6 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:00.7 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:01.0 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:01.1 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:01.2 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:01.3 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:01.4 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:01.5 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:01.6 'Device 0d5d' drv=vfio-pci unused=igb_uio
# 0000:18:01.7 'Device 0d5d' drv=vfio-pci unused=igb_uio
# Baseband devices using kernel driver
# ====================================
# 0000:17:00.0 'Device 0d5c' drv=pci-pf-stub unused=igb_uio,vfio-pci
# No 'Crypto' devices detected
# ============================
# No 'Eventdev' devices detected
# ==============================
# No 'Mempool' devices detected
# =============================
# No 'Compress' devices detected
# ==============================
# Misc (rawdev) devices using kernel driver
# =========================================
# 0000:00:04.0 'Sky Lake-E CBDMA Registers 2021' drv=ioatdma unused=igb_uio,vfio-pci
# 0000:00:04.1 'Sky Lake-E CBDMA Registers 2021' drv=ioatdma unused=igb_uio,vfio-pci
# 0000:00:04.2 'Sky Lake-E CBDMA Registers 2021' drv=ioatdma unused=igb_uio,vfio-pci
# 0000:00:04.3 'Sky Lake-E CBDMA Registers 2021' drv=ioatdma unused=igb_uio,vfio-pci
# 0000:00:04.4 'Sky Lake-E CBDMA Registers 2021' drv=ioatdma unused=igb_uio,vfio-pci
# 0000:00:04.5 'Sky Lake-E CBDMA Registers 2021' drv=ioatdma unused=igb_uio,vfio-pci
# 0000:00:04.6 'Sky Lake-E CBDMA Registers 2021' drv=ioatdma unused=igb_uio,vfio-pci
# 0000:00:04.7 'Sky Lake-E CBDMA Registers 2021' drv=ioatdma unused=igb_uio,vfio-pci
# No 'Regex' devices detected
# ===========================
oc debug node/worker-2.ocp4.redhat.ren -- ip link
# Starting pod/worker-2ocp4redhatren-debug ...
# To use host binaries, run `chroot /host`
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
# link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
# 2: ens2f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
# link/ether 90:e2:ba:a8:29:e6 brd ff:ff:ff:ff:ff:ff
# vf 0 link/ether 06:b4:8a:df:01:b6 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
# vf 1 link/ether 6a:f3:e9:2e:ce:95 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
# vf 2 link/ether 86:23:2b:24:12:8f brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
# vf 3 link/ether 00:11:22:33:44:01 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
# 3: ens2f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
# link/ether 90:e2:ba:a8:29:e7 brd ff:ff:ff:ff:ff:ff
# vf 0 link/ether 00:11:22:33:44:02 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
# vf 1 link/ether f6:9f:b3:a4:f2:da brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
# vf 2 link/ether 36:44:0f:fa:b9:84 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
# vf 3 link/ether fa:5b:75:f2:77:8c brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
# 4: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether ac:1f:6b:ea:5b:32 brd ff:ff:ff:ff:ff:ff
# 5: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether ac:1f:6b:ea:5b:33 brd ff:ff:ff:ff:ff:ff
# 10: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
# link/ether 52:50:27:19:21:e2 brd ff:ff:ff:ff:ff:ff
# 11: br0: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default qlen 1000
# link/ether fe:7b:d1:84:da:4f brd ff:ff:ff:ff:ff:ff
# 12: vxlan_sys_4789: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN mode DEFAULT group default qlen 1000
# link/ether b6:c9:1d:9d:77:aa brd ff:ff:ff:ff:ff:ff
# 13: tun0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
# link/ether 36:7a:65:37:c1:33 brd ff:ff:ff:ff:ff:ff
# 14: vethf21a4c33@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP mode DEFAULT group default
# link/ether ae:f2:57:a5:67:ad brd ff:ff:ff:ff:ff:ff link-netnsid 0
# 15: veth8662e3e2@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP mode DEFAULT group default
# link/ether 9e:49:15:3f:7c:a1 brd ff:ff:ff:ff:ff:ff link-netnsid 1
# 16: veth5d3ab571@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP mode DEFAULT group default
# link/ether aa:ad:f7:cc:b9:57 brd ff:ff:ff:ff:ff:ff link-netnsid 2
# 17: veth20ff5e06@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP mode DEFAULT group default
# link/ether 82:72:8e:6d:1a:4a brd ff:ff:ff:ff:ff:ff link-netnsid 3
# 18: vethd11f4604@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP mode DEFAULT group default
# link/ether 96:df:20:6a:a0:6f brd ff:ff:ff:ff:ff:ff link-netnsid 4
# 20: vethc860c9be@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP mode DEFAULT group default
# link/ether c6:c6:37:fb:1d:48 brd ff:ff:ff:ff:ff:ff link-netnsid 5
# 30: vethfe0374a4@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default
# link/ether 1e:a1:67:b2:00:f6 brd ff:ff:ff:ff:ff:ff link-netnsid 6
# 32: vethecce46ea@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP mode DEFAULT group default
# link/ether 2e:1d:11:80:37:29 brd ff:ff:ff:ff:ff:ff link-netnsid 8
# Removing debug pod ...
The above is a development-environment deployment. Note that the build results under /data/flexran have to be copied into /root/flexran, and the tests are then run from the release container; a rough sketch of that copy step follows.
Later, once development is finished, the release container will be rebuilt separately, the dev-related containers will no longer be used in the production environment, and likewise the file-copying jobs will not run on the production system.
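A rough sketch of that copy step (my addition; both containers mount the same flexran host-path volume, so adjust the paths to the actual build output layout):
POD_ID=$(oc get pod -n vbbu-demo -o json | jq -r '.items[].metadata.name | select(. | contains("flexran-binary-release"))' )
oc exec -c flexran-release-running ${POD_ID} -- bash -c "mkdir -p /root/flexran && /bin/cp -rf /data/flexran/. /root/flexran/"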
end
linuxptp 3.1.1
# http://linuxptp.sourceforge.net/
# download linuxptp-3.1.1
# on a rhel8.4
dnf install -y linuxptp
# /etc/ptp4l.conf
# /etc/sysconfig/phc2sys
# /etc/sysconfig/ptp4l
# /etc/timemaster.conf
# /usr/lib/systemd/system/phc2sys.service
# /usr/lib/systemd/system/ptp4l.service
# /usr/lib/systemd/system/timemaster.service
cat /etc/sysconfig/phc2sys
# OPTIONS="-a -r"
cat /etc/sysconfig/ptp4l
# OPTIONS="-f /etc/ptp4l.conf -i eth0"
systemctl cat phc2sys
# # /usr/lib/systemd/system/phc2sys.service
# [Unit]
# Description=Synchronize system clock or PTP hardware clock (PHC)
# After=ntpdate.service ptp4l.service
# [Service]
# Type=simple
# EnvironmentFile=-/etc/sysconfig/phc2sys
# ExecStart=/usr/sbin/phc2sys $OPTIONS
# [Install]
# WantedBy=multi-user.target
systemctl cat ptp4l.service
# # /usr/lib/systemd/system/ptp4l.service
# [Unit]
# Description=Precision Time Protocol (PTP) service
# After=network-online.target
# Wants=network-online.target
# [Service]
# Type=simple
# EnvironmentFile=-/etc/sysconfig/ptp4l
# ExecStart=/usr/sbin/ptp4l $OPTIONS
# [Install]
# WantedBy=multi-user.target
mkdir -p /data/ptp
cd /data/ptp
wget https://nchc.dl.sourceforge.net/project/linuxptp/v3.1/linuxptp-3.1.1.tgz
tar zvxf linuxptp-3.1.1.tgz
cd linuxptp-3.1.1
make
cat << 'EOF' > ptp4l.sh
#!/bin/bash
# echo $DEMO_ENV_NIC > /demo.txt
# echo $DEMO_ENV_PTP4L_ARG >> /demo.txt
# ./ptp4l -f ./configs/default_zill.cfg -2 -i enp101s0f0 -m > /home/ptp4l.log 2>&1 &
# /usr/local/sbin/ptp4l -f /etc/ptp4l.conf -2 -m -i $DEMO_ENV_NIC
/usr/local/sbin/ptp4l -f /etc/ptp4l.conf -m $DEMO_ENV_PTP4L_ARG
EOF
cat << 'EOF' > phc2sys.sh
#!/bin/bash
# echo $DEMO_ENV_NIC > /demo.1.txt
# echo $DEMO_ENV_PHC2SYS_ARG >> /demo1.txt
# ./phc2sys -s enp101s0f0 -O 0 -m -R 8 >/home/phc2sys.log 2>&1 &
# /usr/local/sbin/phc2sys -s $DEMO_ENV_NIC -a -r -m -u 1 -O 0 -R 8 -z /var/run/ptp4l -t [phc2sys]
/usr/local/sbin/phc2sys -m -z /var/run/ptp4l -t [phc2sys] $DEMO_ENV_PHC2SYS_ARG
EOF
cat << 'EOF' > ts2phc.sh
#!/bin/bash
# echo $DEMO_ENV_NIC > /demo.2.txt
# echo $DEMO_ENV_TS2PHC_ARG >> /demo2.txt
# ./ts2phc -f ./configs/ts2phc-generic_GNSS0.cfg -s generic -m -c enp23s0f0 > /home/ts2phc.log 2>&1 &
# /usr/local/sbin/ts2phc -f /etc/ts2phc.cfg -s generic -m -c $DEMO_ENV_NIC
/usr/local/sbin/ts2phc -f /etc/ts2phc.cfg -m $DEMO_ENV_TS2PHC_ARG
EOF
cat << EOF > ./ptp.dockerfile
FROM registry.access.redhat.com/ubi8/ubi:8.4
COPY hwstamp_ctl nsm phc2sys phc_ctl pmc ptp4l timemaster ts2phc incdefs.sh version.sh ptp4l.sh phc2sys.sh ts2phc.sh /usr/local/sbin/
RUN cd /usr/local/sbin/ && chmod +x hwstamp_ctl nsm phc2sys phc_ctl pmc ptp4l timemaster ts2phc incdefs.sh version.sh ptp4l.sh phc2sys.sh ts2phc.sh
EOF
podman build --squash -t quay.io/nepdemo/linuxptp:3.1.1-ubi-8.4-v04 -f ptp.dockerfile ./
podman push quay.io/nepdemo/linuxptp:3.1.1-ubi-8.4-v04
cat << EOF > /data/install/ptp4l.conf
[global]
#
# Default Data Set
#
twoStepFlag 1
slaveOnly 0
priority1 128
priority2 128
domainNumber 24
#utc_offset 37
clockClass 248
clockAccuracy 0xFE
offsetScaledLogVariance 0xFFFF
free_running 0
freq_est_interval 1
dscp_event 0
dscp_general 0
dataset_comparison ieee1588
G.8275.defaultDS.localPriority 128
#
# Port Data Set
# 16 TS a second use logSyncInterval -4
logAnnounceInterval 1
logSyncInterval -4
logMinDelayReqInterval 0
logMinPdelayReqInterval 0
announceReceiptTimeout 3
syncReceiptTimeout 0
delayAsymmetry 0
fault_reset_interval 4
neighborPropDelayThresh 20000000
masterOnly 0
G.8275.portDS.localPriority 128
#
# Run time options
#
assume_two_step 0
logging_level 6
path_trace_enabled 0
follow_up_info 0
hybrid_e2e 0
inhibit_multicast_service 0
net_sync_monitor 0
tc_spanning_tree 0
tx_timestamp_timeout 1
unicast_listen 0
unicast_master_table 0
unicast_req_duration 3600
use_syslog 1
verbose 0
summary_interval 0
kernel_leap 1
check_fup_sync 0
#
# Servo Options
#
pi_proportional_const 0.0
pi_integral_const 0.0
pi_proportional_scale 0.0
pi_proportional_exponent -0.3
pi_proportional_norm_max 0.7
pi_integral_scale 0.0
pi_integral_exponent 0.4
pi_integral_norm_max 0.3
step_threshold 0.0
first_step_threshold 0.00002
max_frequency 900000000
clock_servo pi
sanity_freq_limit 200000000
ntpshm_segment 0
#
# Transport options
#
transportSpecific 0x0
ptp_dst_mac 01:1B:19:00:00:00
p2p_dst_mac 01:80:C2:00:00:0E
udp_ttl 1
udp6_scope 0x0E
uds_address /var/run/ptp4l
#
# Default interface options
#
clock_type OC
network_transport UDPv4
delay_mechanism E2E
time_stamping hardware
tsproc_mode filter
delay_filter moving_median
delay_filter_length 10
egressLatency 0
ingressLatency 0
boundary_clock_jbod 0
#
# Clock description
#
productDescription ;;
revisionData ;;
manufacturerIdentity 00:00:00
userDescription ;
timeSource 0xA0
EOF
cat << EOF > /data/install/ts2phc.cfg
[global]
use_syslog 0
verbose 1
logging_level 7
ts2phc.pulsewidth 100000000
# For GNSS module
ts2phc.nmea_serialport /dev/ttyGNSS_6500_0
[ens18f0]
ts2phc.extts_polarity rising
EOF
oc delete configmap ptp-config -n vbbu-demo
oc create configmap ptp-config -n vbbu-demo --from-file=/data/install/ptp4l.conf --from-file=/data/install/ts2phc.cfg --save-config=true
cat << 'EOF' > /data/install/ptp.demo.yaml
---
apiVersion: v1
kind: Pod
metadata:
annotations:
labels:
app: nepdemo-linuxptp-daemon
name: nepdemo-linuxptp-daemon
# namespace: openshift-ptp
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchFields:
- key: metadata.name
operator: In
values:
- worker-0.ocp4.redhat.ren
containers:
- name: ptp4l
image: quay.io/nepdemo/linuxptp:3.1.1-ubi-8.4-v04
imagePullPolicy: IfNotPresent
command: ["/bin/sh", "-c", "--"]
args: [" /usr/local/sbin/ptp4l.sh ;"]
env:
- name: DEMO_ENV_PTP4L_ARG
value: " -i ens18f0 -2 "
securityContext:
privileged: true
runAsUser: 0
volumeMounts:
- mountPath: /etc/ptp4l.conf
subPath: ptp4l.conf
name: config-volume
- mountPath: /var/run/ptp4l
name: socket-dir
- name: phc2sys
image: quay.io/nepdemo/linuxptp:3.1.1-ubi-8.4-v04
imagePullPolicy: IfNotPresent
command: ["/bin/sh", "-c", "--"]
args: [" /usr/local/sbin/phc2sys.sh ;"]
env:
- name: DEMO_ENV_PHC2SYS_ARG
value: " -s ens18f0 -r -u 1 -O 0 -R 8 "
securityContext:
privileged: true
runAsUser: 0
volumeMounts:
- mountPath: /etc/ptp4l.conf
subPath: ptp4l.conf
name: config-volume
- mountPath: /var/run/ptp4l
name: socket-dir
- name: ts2phc
image: quay.io/nepdemo/linuxptp:3.1.1-ubi-8.4-v04
imagePullPolicy: IfNotPresent
command: ["/bin/sh", "-c", "--"]
args: [" /usr/local/sbin/ts2phc.sh ;"]
env:
- name: DEMO_ENV_TS2PHC_ARG
value: " -s generic -c ens18f0 "
securityContext:
privileged: true
runAsUser: 0
volumeMounts:
- mountPath: /etc/ts2phc.cfg
subPath: ts2phc.cfg
name: config-volume
- mountPath: /var/run/ptp4l
name: socket-dir
- name: dev
mountPath: /dev
hostNetwork: true
hostPID: true
serviceAccountName: svcacct-driver
volumes:
- configMap:
defaultMode: 420
name: ptp-config
name: config-volume
- hostPath:
path: /var/run/ptp
type: DirectoryOrCreate
name: socket-dir
- name: dev
hostPath:
path: "/dev"
EOF
oc create -n vbbu-demo -f /data/install/ptp.demo.yaml
# oc delete -n vbbu-demo -f /data/install/ptp.demo.yaml
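To confirm the three daemons came up (my addition), follow each container's log in the pod defined above:
oc logs -n vbbu-demo nepdemo-linuxptp-daemon -c ptp4l --tail=20
oc logs -n vbbu-demo nepdemo-linuxptp-daemon -c phc2sys --tail=20
oc logs -n vbbu-demo nepdemo-linuxptp-daemon -c ts2phc --tail=20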
baicell bbu
cd /home/BaiBBU_XSS/tools
./XRAN_BBU stop
./XRAN_BBU start
cat /home/BaiBBU_XSS/BaiBBU_SXSS/DU/bin/logs_gNB_DU
tail -f /home/BaiBBU_XSS/BaiBBU_SXSS/DU/bin/logs_gNB_DU
export tmp_path='/home/BaiBBU_XSS-A/BaiBBU_PXSS/PHY'
cd /data/flexran
cp -r libs $tmp_path/
cp -r sdk $tmp_path/
#cp -r tests flexran_build/flexran/
cp -r wls_mod $tmp_path/
cp -r xran $tmp_path/
#cd flexran_build/flexran/
#add remove flexran source code
rm -rf $tmp_path/sdk/test
rm -rf $tmp_path/sdk/source
rm -rf $tmp_path/libs/ferrybridge
cd /home/BaiBBU_XSS-A/BaiBBU_PXSS/PHY
cat /home/BaiBBU_XSS-A/BaiBBU_PXSS/PHY/bin/l1.sh
cat /home/BaiBBU_XSS-A/BaiBBU_PXSS/PHY/bin/Phy.log
# patch /home/BaiBBU_XSS-A/BaiBBU_PXSS/PHY/bin/l1.sh
# add env variable
# export DIR_WIRELESS_SDK=/data/flexran/sdk/build-avx512-icc
# export -n DIR_WIRELESS_SDK
# export DIR_WIRELESS_SDK=/home/BaiBBU_XSS-A/BaiBBU_PXSS/PHY/sdk/build-avx512-icc
cat /data/flexran/bin/nr5g/gnb/l1/l1.sh
cat /data/flexran/bin/nr5g/gnb/l1/Phy.log
Finally, we found out that l1app co-works with gnb_du_mac, but both run as 'EAL: Auto-detected process type: PRIMARY'. According to the DPDK docs, multiple processes can work together; the generic EAL flags involved are sketched below.
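This is only an illustration of the generic DPDK EAL multi-process flags (my addition), not the exact l1app / gnb_du_mac command lines; app_a and app_b are placeholders:
# --proc-type=primary|secondary|auto   role of this process
# --file-prefix=<name>                 shared-memory namespace; a primary/secondary
#                                      pair shares one prefix, two independent
#                                      primaries must use different prefixes
./app_a --proc-type=auto --file-prefix=vbbu &
./app_b --proc-type=auto --file-prefix=vbbu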
dhcp for ru
nmcli dev con ens1f0
nmcli connection mod ens1f0 ipv4.add 192.168.160.1/24 ipv4.method manual
nmcli con up ens1f0
cat /etc/sysconfig/dhcpd
# .......
# DHCPDARGS=ens1f0
cat /etc/dhcp/dhcpd.conf
# option callhomeip code 43 = string;
# subnet 192.168.160.0 netmask 255.255.255.0 {
# range 192.168.160.10 192.168.160.100;
# option domain-name-servers 192.168.160.1;
# option routers 192.168.160.1;
# option callhomeip 81:04:C0:A8:A0:A2;
# default-lease-time 600;
# max-lease-time 7200;
# }
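The callhomeip value appears to be a TLV in hex: sub-option code 0x81, length 0x04, then the IPv4 address bytes, so 81:04:C0:A8:A0:A2 encodes 192.168.160.162. A one-liner (my addition) to build the string for another address:
printf '81:04:%02X:%02X:%02X:%02X\n' 192 168 160 162
# 81:04:C0:A8:A0:A2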
some tests, not used here
# intel icc repo
# https://www.intel.com/content/www/us/en/developer/articles/guide/installing-intel-parallel-studio-xe-runtime-2020-using-yum-repository.html
# offical oneapi docker image build
# https://hub.docker.com/r/intel/oneapi-basekit
# https://github.com/intel/oneapi-containers/blob/12932f721dd0201dfae85cacb62495924ecf42cf/images/docker/basekit/Dockerfile.centos-8
# using files/flexran.dockerfile
# buildah bud --squash -t quay.io/nepdemo/flexran_basekit:oneapi-basekit-official-ocp-4.9.5-ubi-8.4 -f flexran.dockerfile ./
# buildah push quay.io/nepdemo/flexran_basekit:oneapi-basekit-official-ocp-4.9.5-ubi-8.4
podman build --squash -t quay.io/nepdemo/flexran_basekit:oneapi-basekit-official-ocp-4.9.5-ubi-8.4 -f flexran.dockerfile ./
podman push quay.io/nepdemo/flexran_basekit:oneapi-basekit-official-ocp-4.9.5-ubi-8.4
# in container
echo 'distroverpkg=redhat-release' >> /etc/yum.conf
rpm -q --qf %{version} redhat-release;echo
# 8.4
rpm -q --provides $(rpm -q --whatprovides "system-release(releasever)")
# base-module(platform:el8)
# config(redhat-release) = 8.4-0.6.el8
# redhat-release = 8.4-0.6.el8
# redhat-release(x86-64) = 8.4-0.6.el8
# redhat-release-client
# redhat-release-computenode
# redhat-release-server
# redhat-release-workstation
# system-release = 8.4-0.6.el8
# system-release(releasever) = 8
dnf repolist
sed -i 's|enabled=1|enabled=0|g' /etc/yum/pluginconf.d/subscription-manager.conf
sed -i 's|$releasever|8.4|g' /etc/yum.repos.d/redhat.repo
sed -i '/codeready-builder-for-rhel-8-x86_64-rpms/,/\[/ s/enabled = 0/enabled = 1/' /etc/yum.repos.d/redhat.repo
mv -f /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo.bak
cache dnf repo
mkdir -p /data/dnf
cd /data/dnf
dnf reposync -m --download-metadata --delete -n
dnf copr enable frostyx/modulemd-tools
dnf install -y modulemd-tools
createrepo ./
repo2module . \
--module-name foo \
--module-stream devel \
--module-version 123 \
--module-context f32
createrepo_mod .
sriov setting for non-dpdk
# oc label node worker-2.ocp4.redhat.ren feature.node.kubernetes.io/network-sriov.capable="true"
# https://docs.openshift.com/container-platform/4.9/networking/hardware_networks/configuring-sriov-ib-attach.html
# Dynamic IP address (DHCP) assignment configuration
# require a dhcp server in cluster
apiVersion: operator.openshift.io/v1
kind: Network
metadata:
name: cluster
spec:
additionalNetworks:
- name: dhcp-shim
namespace: default
type: Raw
rawCNIConfig: |-
{
"name": "dhcp-shim",
"cniVersion": "0.3.1",
"type": "bridge",
"ipam": {
"type": "dhcp"
}
}
# ...
oc get Network.operator.openshift.io/cluster -o yaml
# ......
# spec:
# clusterNetwork:
# - cidr: 10.254.0.0/16
# hostPrefix: 24
# defaultNetwork:
# type: OpenShiftSDN
# disableNetworkDiagnostics: false
# logLevel: Normal
# managementState: Managed
# observedConfig: null
# operatorLogLevel: Normal
# serviceNetwork:
# - 172.30.0.0/16
# unsupportedConfigOverrides: null
# ......
# if you use ipam dhcp, then you do this, otherwise skip
oc edit Network.operator.openshift.io/cluster
oc get pod -n openshift-multus | grep dhcp
# dhcp-daemon-4s2c4 1/1 Running 0 3h11m
# dhcp-daemon-9lvch 1/1 Running 0 3h11m
# dhcp-daemon-lhss5 1/1 Running 0 3h11m
# dhcp-daemon-q8qmh 1/1 Running 0 3h11m
a simple, working end-to-end CI/CD process demo
Customer requirements:
- implement a simple CI/CD process, since there is no CI/CD process for containers yet
- do not disturb the existing development process; in other words, hook into it manually and take the binaries directly from it
- public cloud services may be used, including github and quay.io
- the CI/CD process is triggered manually, and deployment to the test environment is triggered manually as well.
Constraints at the customer site:
- the internet connection is fairly slow, roughly 1MB/s
- server disk resources are relatively limited
- the servers are lab machines, so they may be temporarily repurposed for other work.
Architecture design:
Key points of the architecture design:
- use github and quay.io on the public cloud to persistently store code and images, so we do not depend on unstable intranet servers or run out of local disk space; the base images are also built on the public cloud services.
- deploy gitea and quay on the company intranet and keep them in sync with the public cloud services.
- use openshift pipelines and gitops to implement the CI/CD process.
Video walkthrough:
base image
We start by setting up the base-image build on the public cloud services. We use quay.io as the container image registry and github actions to build the images.
We pick github actions because, in the future, the base images will be built on top of red hat ubi, which requires importing red hat subscription files during the build. That puts some demands on the flexibility of the public-cloud CI/CD tool, so for now we build the base images with github actions.
quay.io
On quay.io, create a robot account.
View and record the robot account's username and password.
Grant the robot account the required permissions.
reference:
- https://event-driven.io/en/how_to_buid_and_push_docker_image_with_github_actions/
- https://github.com/docker/build-push-action
- https://docs.github.com/cn/actions/publishing-packages/publishing-docker-images
github
A separate github project has been created as the source project for the image build. In the project's centos7 directory there is a dockerfile that starts from the centos7 base image, installs some software, packages the result, and pushes it to quay.io. This dockerfile depends on another image, mainly for an installation package contained in that image. We designed it this way because we could not find a suitable place to host the large installation package for free on the public internet, so we baked the big package into an image, pushed it to a public registry, and pull it in with a multi-stage build whenever it is needed.
How the image containing the installation package is built is described in detail in the project documentation.
buildah from --name onbuild-container scratch
buildah copy onbuild-container nr5g_19.10.03.bz2 /
buildah umount onbuild-container
buildah commit --rm onbuild-container quay.io/baicell/nr5g:latest
buildah push quay.io/baicell/nr5g:latest
The main.yml file under the project's .github/workflow directory enables github actions and describes the CI/CD steps; refer to that file to see how the image is built on the public cloud.
The github action needs the quay.io robot account credentials, which we provide through github secrets. A minimal sketch of such a workflow is shown below.
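This is only an illustration of the approach, not the actual main.yml from the demo repo; the secret names (QUAY_ROBOT_USER / QUAY_ROBOT_TOKEN), the build context and the target image are assumptions.
# .github/workflows/main.yml (sketch)
name: build base image
on:
  workflow_dispatch:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: docker/login-action@v2
        with:
          registry: quay.io
          username: ${{ secrets.QUAY_ROBOT_USER }}
          password: ${{ secrets.QUAY_ROBOT_TOKEN }}
      - uses: docker/build-push-action@v3
        with:
          context: ./centos7
          push: true
          tags: quay.io/baicell/base:latest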
gitee
For reasons best left unsaid, access to github from inside China is quite unstable, so we use gitee to clone the github repo, effectively letting gitee act as a git proxy: github clone to gitee.
http proxy
Our openshift environment simulates a fully disconnected environment, yet some steps in our experiments/solution need internet access, so we deploy an http proxy to simulate the internet-access proxy commonly found inside an enterprise.
podman run -d --rm --name tinyproxy -p 18080:8888 ghcr.io/wangzheng422/tinyproxy:latest
export http_proxy="http://192.168.7.1:18080"
export https_proxy=${http_proxy}
curl https://ifconfig.co
unset http_proxy
unset https_proxy
quay
We deploy a quay service and enable its remote repository mirroring feature. Given the architecture design (the base images are already scanned on the public cloud) and the current server resources, we do not enable image scanning.
# on 103
cat << EOF >> /etc/hosts
172.21.6.103 quaylab.infra.redhat.ren
EOF
export QUAY=/data/quay
# generate cert for *.redhat.ren
# configure the registry
mkdir -p /etc/crts/ && cd /etc/crts
# https://access.redhat.com/documentation/en-us/red_hat_codeready_workspaces/2.1/html/installation_guide/installing-codeready-workspaces-in-tls-mode-with-self-signed-certificates_crw
openssl genrsa -out /etc/crts/redhat.ren.ca.key 4096
openssl req -x509 \
-new -nodes \
-key /etc/crts/redhat.ren.ca.key \
-sha256 \
-days 36500 \
-out /etc/crts/redhat.ren.ca.crt \
-subj /CN="Local Red Hat Ren Signer" \
-reqexts SAN \
-extensions SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf '[SAN]\nbasicConstraints=critical, CA:TRUE\nkeyUsage=keyCertSign, cRLSign, digitalSignature'))
openssl genrsa -out /etc/crts/redhat.ren.key 2048
openssl req -new -sha256 \
-key /etc/crts/redhat.ren.key \
-subj "/O=Local Red Hat Ren /CN=*.ocp4.redhat.ren" \
-reqexts SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf "\n[SAN]\nsubjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.ocp4a.redhat.ren,DNS:*.apps.ocp4a.redhat.ren,DNS:*.ocp4b.redhat.ren,DNS:*.apps.ocp4b.redhat.ren,DNS:*.ocp4c.redhat.ren,DNS:*.apps.ocp4c.redhat.ren,DNS:*.ocp4s.redhat.ren,DNS:*.apps.ocp4s.redhat.ren,DNS:*.infra.redhat.ren,DNS:*.tool.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth")) \
-out /etc/crts/redhat.ren.csr
openssl x509 \
-req \
-sha256 \
-extfile <(printf "subjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.ocp4a.redhat.ren,DNS:*.apps.ocp4a.redhat.ren,DNS:*.ocp4b.redhat.ren,DNS:*.apps.ocp4b.redhat.ren,DNS:*.ocp4c.redhat.ren,DNS:*.apps.ocp4c.redhat.ren,DNS:*.ocp4s.redhat.ren,DNS:*.apps.ocp4s.redhat.ren,DNS:*.infra.redhat.ren,DNS:*.tool.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth") \
-days 365 \
-in /etc/crts/redhat.ren.csr \
-CA /etc/crts/redhat.ren.ca.crt \
-CAkey /etc/crts/redhat.ren.ca.key \
-CAcreateserial -out /etc/crts/redhat.ren.crt
openssl x509 -in /etc/crts/redhat.ren.crt -text
/bin/cp -f /etc/crts/redhat.ren.ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
# first config quay
mkdir -p $QUAY/postgres-quay
setfacl -m u:26:-wx $QUAY/postgres-quay
podman run -d --rm --name postgresql-quay \
-e POSTGRESQL_USER=quayuser \
-e POSTGRESQL_PASSWORD=quaypass \
-e POSTGRESQL_DATABASE=quay \
-e POSTGRESQL_ADMIN_PASSWORD=adminpass \
-p 5432:5432 \
-v $QUAY/postgres-quay:/var/lib/pgsql/data:Z \
registry.redhat.io/rhel8/postgresql-10:1
# Ensure that the Postgres pg_trgm module is installed, as it is required by Quay
podman exec -it postgresql-quay /bin/bash -c 'echo "CREATE EXTENSION IF NOT EXISTS pg_trgm" | psql -d quay -U postgres'
# CREATE EXTENSION
podman run -d --rm --name redis \
-p 6379:6379 \
-e REDIS_PASSWORD=strongpassword \
registry.redhat.io/rhel8/redis-5:1
podman run --rm -it --name quay_config -p 80:8080 -p 443:8443 registry.redhat.io/quay/quay-rhel8:v3.6.2 config secret
# go to http://quaylab.infra.redhat.ren
# Log in with the username quayconfig and password secret
# make config, and download the config
Database Type: Postgres
Database Server: quaylab.infra.redhat.ren:5432
Username: quayuser
Password: quaypass
Database Name: quay
Redis Hostname: quaylab.infra.redhat.ren
Redis port: 6379 (default)
Redis password: strongpassword
log path: /logarchive
super user: quayadmin
ctrl-c to exit the container
# then run the quay
mkdir $QUAY/config
cp ~/Downloads/quay-config.tar.gz $QUAY/config
cd $QUAY/config
tar xvf quay-config.tar.gz
mkdir $QUAY/storage
setfacl -m u:1001:-wx $QUAY/storage
podman run -d --rm -p 80:8080 -p 443:8443 \
--name=quay \
-v $QUAY/config:/conf/stack:Z \
-v $QUAY/storage:/datastorage:Z \
registry.redhat.io/quay/quay-rhel8:v3.6.2
Visit http://quaylab.infra.redhat.ren
On first use, simply create a user. We create the quayadmin user, because quayadmin was configured as the super user earlier.
# try it out
podman login quaylab.infra.redhat.ren
# Username: quayadmin
# Password: password
/bin/cp -f /run/user/0/containers/auth.json /data/registry.auth.json
# setup quay mirror
podman run -d --name mirroring-worker \
-v $QUAY/config:/conf/stack:Z \
registry.redhat.io/quay/quay-rhel8:v3.6.2 repomirror
# auto restart
cd ~/
podman generate systemd --new --files --name redis
podman generate systemd --new --files --name postgresql-quay
podman generate systemd --new --files --name quay
podman generate systemd --new --files --name mirroring-worker
cp -Z container-redis.service /usr/lib/systemd/system
cp -Z container-postgresql-quay.service /usr/lib/systemd/system
cp -Z container-quay.service /usr/lib/systemd/system
cp -Z container-mirroring-worker.service /usr/lib/systemd/system
systemctl daemon-reload
systemctl enable --now container-redis.service
systemctl enable --now container-postgresql-quay.service
systemctl enable --now container-quay.service
systemctl enable --now container-mirroring-worker.service
rm -f container*
Log in with the newly created quayadmin user and create an organization. Inside the organization, create the image repo: base. To let this repo sync automatically from quay.io, set the internal repo to the mirror type. Then create a robot account for the sync operation; creating one is simple, just give it a name. Grant the robot account permissions on the repo; since we are pulling the remote repo in, the robot account needs write permission. In the repo, configure the mirror parameters, including the upstream repo location, the tags to sync, and the sync frequency. After saving, the mirror parameters take effect; click sync now to start a sync manually. The sync progress can be watched in the repo's history, and the repo tags will show that the remote repo has been synced over.
references:
openshift4
Our demo revolves around the openshift4 container platform inside the company network, so we install an openshift4 cluster and add the components we need.
install ocp4
We install a minimal openshift4: a single node acting as a combined master/worker, and that node is a kvm guest.
Besides the openshift4 node itself, we also need a helper kvm, because installing and running openshift4 depends on cloud services such as a load balancer and dns. In our lab these have to be provided by ourselves, so we create a helper kvm to simulate and host those cloud services.
# set the openshift version
# import openshift4 install images into quay
export BUILDNUMBER=4.9.12
# unpack the openshift client
tar -xzf /data/ocp4/${BUILDNUMBER}/openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/sbin/
# import the openshift4 install media into the internal quay registry
oc image mirror -a /data/registry.auth.json --from-dir=/data/file.registry/ 'file://openshift/release:4.9.12-x86_64*' quaylab.infra.redhat.ren/ocp4/openshift4
# our openshift4 runs in simulated disconnected mode, so we need a container image proxy to bridge image downloads.
# setup nexus
mkdir -p /data/ccn
cd /data/ccn
podman create --name swap quay.io/wangzheng422/qimgs:nexus-fs-image-2022-01-14-2155 ls
podman cp swap:/nexus-image.tgz - > /data/ccn/nexus-image.tgz.tar
podman rm -fv swap
tar vxf nexus-image.tgz.tar
tar zvxf nexus-image.tgz
rm -f nexus-image.tgz*
chown -R 200 /data/ccn/nexus-image
podman run -d -p 8082:8081 -p 8083:8083 -it --name nexus-image -v /data/ccn/nexus-image:/nexus-data:Z docker.io/sonatype/nexus3:3.33.1
# auto start nexus
cd ~/
podman generate systemd --files --name nexus-image
cp -Z container-nexus-image.service /usr/lib/systemd/system
systemctl daemon-reload
systemctl enable --now container-nexus-image.service
# prepare to install the helper node
# we follow single node ocp4 deployment
cd /data/kvm
wget -O rhel8.iso 'https://access.cdn.redhat.com/content/origin/files/sha256/1f/1f78e705cd1d8897a05afa060f77d81ed81ac141c2465d4763c0382aa96cadd0/rhel-8.5-x86_64-dvd.iso?user=a768b217cf6ae8041b67586bb4dd5c77&_auth_=1642400208_d400d34f0d5e2caab120537d05b0b8c9'
create_lv() {
var_vg=$1
var_lv=$2
var_size=$3
lvremove -f $var_vg/$var_lv
lvcreate -y -L $var_size -n $var_lv $var_vg
wipefs --all --force /dev/$var_vg/$var_lv
}
create_lv vgdata lvhelper 120G
create_lv vgdata lvbootstrap 120G
create_lv vgdata lvmaster0 120G
export http_proxy="http://192.168.195.54:5085"
export https_proxy=${http_proxy}
wget https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.9/scripts/helper-ks-rhel8.cfg
unset http_proxy
unset https_proxy
sed -i '0,/^network.*/s/^network.*/network --bootproto=static --device=enp1s0 --gateway=192.168.7.1 --ip=192.168.7.11 --netmask=255.255.255.0 --nameserver=192.168.7.11 --ipv6=auto --activate/' helper-ks-rhel8.cfg
# https://stackoverflow.com/questions/18620153/find-matching-text-and-replace-next-line
sed -i '/^network.*/{n;s/^network.*/network --hostname=ocp4-helper/}' helper-ks-rhel8.cfg
export KVM_DIRECTORY=/data/kvm
virt-install --name="ocp4-Helper" --vcpus=2 --ram=4096 \
--cpu=host-model \
--disk path=/dev/vgdata/lvhelper,device=disk,bus=virtio,format=raw \
--os-variant rhel8.5 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59200 \
--boot menu=on \
--location ${KVM_DIRECTORY}/rhel8.iso \
--disk ${KVM_DIRECTORY}/rhel8.iso,device=cdrom \
--initrd-inject helper-ks-rhel8.cfg --extra-args "inst.ks=file:/helper-ks-rhel8.cfg"
# with the helper vm installed, we need to configure it
# config helper vm
ssh root@192.168.7.11
export YUMIP="192.168.7.1"
cat << EOF > /etc/yum.repos.d/remote.repo
[BaseOS]
name=BaseOS
baseurl=ftp://$YUMIP/rhel/dnf/rhel-8-for-x86_64-baseos-rpms
enabled=1
gpgcheck=0
[AppStream]
name=AppStream
baseurl=ftp://$YUMIP/rhel/dnf/rhel-8-for-x86_64-appstream-rpms
enabled=1
gpgcheck=0
[Ansible]
name=Ansible
baseurl=ftp://$YUMIP/rhel/dnf/ansible-2.9-for-rhel-8-x86_64-rpms
enabled=1
gpgcheck=0
EOF
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
systemctl restart sshd
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
echo "allow 192.0.0.0/8" >> /etc/chrony.conf
systemctl enable --now chronyd
# systemctl restart chronyd
chronyc tracking
chronyc sources -v
chronyc sourcestats -v
chronyc makestep
dnf update -y
reboot
dnf -y install ansible git unzip podman python3 buildah skopeo jq pigz
# copy in the ocp installer
mkdir -p /data/ocp4/
# scp ocp4.tgz to /data
# scp * root@192.168.7.11:/data/
cd /data
tar zvxf ocp.*.tgz
tar zvxf registry.*.tgz
cd /data/ocp4
rm -f /data/*.tgz
# update the certification for quay
mkdir -p /etc/crts/ && cd /etc/crts
# scp * root@192.168.7.11:/etc/crts/
/bin/cp -f /etc/crts/redhat.ren.ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
# create ssh key
ssh-keygen
# extract install ansible playbooks
cd /data/ocp4
unzip ocp4-upi-helpernode.zip
cd /data/ocp4/ocp4-upi-helpernode-master
# write the vars file for the ansible playbook
cat << 'EOF' > /data/ocp4/ocp4-upi-helpernode-master/vars.yaml
---
ocp_version: 4.9.12
ssh_gen_key: false
staticips: true
firewalld: false
dns_forward: yes
iso:
iso_dl_url: "/data/ocp4/rhcos-live.x86_64.iso"
my_iso: "rhcos-live.iso" # this is internal file, just leave as it.
helper:
name: "helper"
ipaddr: "192.168.7.11"
networkifacename: "enp1s0"
gateway: "192.168.7.1"
netmask: "255.255.255.0"
dns:
domain: "redhat.ren"
clusterid: "ocp4"
forwarder1: "192.168.7.1"
forwarder2: "192.168.7.1"
bootstrap:
name: "bootstrap"
ipaddr: "192.168.7.12"
interface: "enp1s0"
install_drive: "vda"
manual: false
masters:
- name: "master-0"
ipaddr: "192.168.7.13"
interface: "enp1s0"
install_drive: "vda"
manual: false
# - name: "master-1"
# ipaddr: "192.168.7.14"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "master-2"
# ipaddr: "192.168.7.15"
# interface: "enp1s0"
# install_drive: "vda"
workers:
- name: "worker-0"
ipaddr: "192.168.7.16"
interface: "eno1"
install_drive: "sda"
# - name: "worker-1"
# ipaddr: "192.168.7.17"
# interface: "enp1s0"
# install_drive: "sda"
# - name: "worker-2"
# ipaddr: "192.168.7.18"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "infra-0"
# ipaddr: "192.168.7.19"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "infra-1"
# ipaddr: "192.168.7.20"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "worker-3"
# ipaddr: "192.168.7.21"
# interface: "enp1s0"
# install_drive: "vda"
# - name: "worker-4"
# ipaddr: "192.168.7.22"
# interface: "enp1s0"
# install_drive: "vda"
others:
- name: "registry"
ipaddr: "192.168.7.1"
- name: "yum"
ipaddr: "192.168.7.1"
- name: "quay"
ipaddr: "192.168.7.1"
- name: "nexus"
ipaddr: "192.168.7.1"
- name: "git"
ipaddr: "192.168.7.1"
otherdomains:
- domain: "rhv.redhat.ren"
hosts:
- name: "manager"
ipaddr: "192.168.7.71"
- name: "rhv01"
ipaddr: "192.168.7.72"
- domain: "others.redhat.ren"
hosts:
- name: "*"
ipaddr: "192.168.7.71"
- name: "*.apps"
ipaddr: "192.168.7.71"
- domain: "infra.redhat.ren"
hosts:
- name: "registry"
ipaddr: "192.168.7.1"
- name: "yum"
ipaddr: "192.168.7.1"
- name: "quaylab"
ipaddr: "192.168.7.1"
- name: "nexus"
ipaddr: "192.168.7.1"
- name: "git"
ipaddr: "192.168.7.1"
force_ocp_download: false
remove_old_config_files: false
ocp_client: "file:///data/ocp4/{{ ocp_version }}/openshift-client-linux-{{ ocp_version }}.tar.gz"
ocp_installer: "file:///data/ocp4/{{ ocp_version }}/openshift-install-linux-{{ ocp_version }}.tar.gz"
ocp_bios: "file:///data/ocp4/rhcos-metal.x86_64.raw.gz"
ppc64le: false
arch: 'x86_64'
chronyconfig:
enabled: true
content:
- server: "192.168.7.11"
options: iburst
setup_registry: # don't worry about this, just leave it here
deploy: false
registry_image: docker.io/library/registry:2
local_repo: "ocp4/openshift4"
product_repo: "openshift-release-dev"
release_name: "ocp-release"
release_tag: "4.6.1-x86_64"
ocp_filetranspiler: "file:///data/ocp4/filetranspiler.tgz"
registry_server: "registry.infra.redhat.ren:5443"
EOF
# the ansible playbook will be run several times; this is the first run, mainly to install and configure the supporting cloud services
cd /data/ocp4/ocp4-upi-helpernode-master
ansible-playbook -e @vars.yaml tasks/main.yml
mkdir -p /data/install
cd /data/install
# vi install-config.yaml
cat << EOF > /data/install/install-config.yaml
apiVersion: v1
baseDomain: redhat.ren
compute:
- hyperthreading: Enabled
name: worker
replicas: 0
controlPlane:
hyperthreading: Enabled
name: master
replicas: 1
metadata:
name: ocp4
networking:
clusterNetworks:
- cidr: 10.128.0.0/14
hostPrefix: 23
networkType: OVNKubernetes
serviceNetwork:
- 172.30.0.0/16
platform:
none: {}
pullSecret: '{"auths":{"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"quaylab.infra.redhat.ren": {"auth": "cXVheWFkbWluOnBhc3N3b3Jk","email": "noemail@localhost"}}}'
sshKey: |
$( cat /root/.ssh/id_rsa.pub | sed 's/^/ /g' )
additionalTrustBundle: |
$( cat /etc/crts/redhat.ren.ca.crt | sed 's/^/ /g' )
imageContentSources:
- mirrors:
- quaylab.infra.redhat.ren/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-release
- mirrors:
- quaylab.infra.redhat.ren/ocp4/openshift4
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF
# clear the previous openshift install cache and create fresh ignition files
cd /data/install/
/bin/rm -rf *.ign .openshift_install_state.json auth bootstrap manifests master*[0-9] worker*[0-9]
openshift-install create manifests --dir=/data/install
# we have some custom ignition configs; add them in
# copy ntp related config
/bin/cp -f /data/ocp4/ocp4-upi-helpernode-master/machineconfig/* /data/install/openshift/
# copy image registry proxy related config
cd /data/ocp4
bash image.registries.conf.sh nexus.infra.redhat.ren:8083
/bin/cp -f /data/ocp4/image.registries.conf /etc/containers/registries.conf.d/
/bin/cp -f /data/ocp4/99-worker-container-registries.yaml /data/install/openshift
/bin/cp -f /data/ocp4/99-master-container-registries.yaml /data/install/openshift
# create the ignition files
cd /data/install/
openshift-install create ignition-configs --dir=/data/install
cd /data/ocp4/ocp4-upi-helpernode-master
# copy each host's own ign file into the web server directory
ansible-playbook -e @vars.yaml tasks/ign.yml
# create a dedicated iso for each node
cd /data/ocp4/ocp4-upi-helpernode-master
ansible-playbook -e @vars.yaml tasks/iso.yml
# next, copy the boot isos for the master and worker to the kvm host
# and start the kvm guests; the master and worker nodes will then install automatically
# on kvm host 172.21.6.103
export KVM_DIRECTORY=/data/kvm
mkdir -p ${KVM_DIRECTORY}
cd ${KVM_DIRECTORY}
scp root@192.168.7.11:/data/install/{*boot*,*master-0,*worker-0}.iso ${KVM_DIRECTORY}/
virt-install --name=ocp4-bootstrap --vcpus=4 --ram=8192 \
--disk path=/dev/vgdata/lvbootstrap,device=disk,bus=virtio,format=raw \
--os-variant rhel8.5 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59001 \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-bootstrap.iso
virt-install --name=ocp4-master-0 --vcpus=16 --ram=73728 \
--cpu=host-model \
--disk path=/dev/vgdata/lvmaster0,device=disk,bus=virtio,format=raw \
--os-variant rhel8.5 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59002 \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-master-0.iso
# back on the helper vm, wait for the installation to finish
cd /data/install
export KUBECONFIG=/data/install/auth/kubeconfig
echo "export KUBECONFIG=/data/install/auth/kubeconfig" >> ~/.bashrc
oc completion bash | sudo tee /etc/bash_completion.d/openshift > /dev/null
dnf -y install jq
oc get csr | grep -v Approved
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
cd /data/install
openshift-install wait-for bootstrap-complete --log-level debug
cd /data/install
openshift-install wait-for install-complete --log-level debug
# INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.ocp4.redhat.ren
# INFO Login to the console with user: "kubeadmin", and password: "eLVhg-TUx3X-fWYL9-dHepi"
install tekton (ci/cd pipeline)
The official OpenShift Pipelines installation doc is well written; just follow it, a few clicks in the console are all it takes.
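If the console is not handy, the operator can also be subscribed to from the CLI. A minimal sketch, assuming the package name openshift-pipelines-operator-rh and the latest channel from the Red Hat catalog (in a disconnected cluster the source would be your mirrored catalog instead of redhat-operators):
cat << EOF > /data/install/pipelines.sub.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-pipelines-operator-rh
  namespace: openshift-operators
spec:
  channel: latest
  name: openshift-pipelines-operator-rh
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF
oc create -f /data/install/pipelines.sub.yaml
# wait until the csv reports Succeeded
oc get csv -n openshift-operators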
install argocd (ci/cd gitops)
The official OpenShift GitOps installation doc is also well written; just follow it, a few clicks in the console are all it takes.
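The same idea works for GitOps, sketched here with the openshift-gitops-operator package (channel and source names are assumptions, verify with `oc get packagemanifests -n openshift-marketplace | grep gitops` first):
cat << EOF > /data/install/gitops.sub.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-gitops-operator
  namespace: openshift-operators
spec:
  channel: latest
  name: openshift-gitops-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF
oc create -f /data/install/gitops.sub.yaml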
install hostpath-provisioner from kubevirt
We need a simple storage solution on OpenShift, so we borrow one from OpenShift Virtualization, which ships a hostpath-provisioner component.
The key configuration points are below.
# create the backing directory on the node, and set the selinux label
cat << EOF > /data/install/host-path.yaml
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
name: 50-set-selinux-for-hostpath-master
labels:
machineconfiguration.openshift.io/role: master
spec:
config:
ignition:
version: 3.2.0
systemd:
units:
- contents: |
[Unit]
Description=Set SELinux chcon for hostpath baicell
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=-mkdir -p /var/hostpath
ExecStart=chcon -Rt container_file_t /var/hostpath/
[Install]
WantedBy=multi-user.target
enabled: true
name: hostpath-baicell.service
EOF
oc create -f /data/install/host-path.yaml
# create the hostpath provisioner config
cat << EOF > /data/install/host-path-provision.yaml
apiVersion: hostpathprovisioner.kubevirt.io/v1beta1
kind: HostPathProvisioner
metadata:
name: hostpath-provisioner
spec:
imagePullPolicy: IfNotPresent
pathConfig:
path: "/var/hostpath"
useNamingPrefix: false
EOF
oc create -f /data/install/host-path-provision.yaml -n openshift-cnv
# create the storage class
cat << EOF > /data/install/host-path-storage-class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: hostpath-provisioner
provisioner: kubevirt.io/hostpath-provisioner
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF
oc create -f /data/install/host-path-storage-class.yaml
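To sanity-check the provisioner, a throwaway pvc plus pod sketch can be applied (the name test-hostpath and the busybox image are just illustrative):
cat << EOF > /data/install/host-path-test.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-hostpath
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: hostpath-provisioner
---
apiVersion: v1
kind: Pod
metadata:
  name: test-hostpath
spec:
  containers:
    - name: shell
      image: docker.io/busybox:1.34
      command: [ "sh", "-c", "echo hello > /data/hello && sleep infinity" ]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: test-hostpath
EOF
oc create -f /data/install/host-path-test.yaml
# the pvc stays Pending until the pod is scheduled (volumeBindingMode is WaitForFirstConsumer)
oc get pvc,pod
# to restore
oc delete -f /data/install/host-path-test.yaml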
CI/CD in shell
CI/CD is a mindset: move quickly from a business idea, to product code, to release, and on to automated maintenance and upgrades. Which tools to use differs between teams and companies, so you have to evaluate that yourself. The principle is simple: use whatever you are familiar with, can control, and can use to solve problems fast.
Following our overall CI/CD design, we build two versions: one implemented with the simplest possible scripts, the other with the tooling built into OpenShift 4. Neither is better in absolute terms; the script approach suits small teams, while the OpenShift 4 tooling suits large teams. In a large team, communication is a very expensive activity, and the tools OpenShift 4 provides greatly reduce that internal communication cost; the simplified, templated configuration also lowers the chance of misconfiguration, so it is the recommended choice for teams.
container image version numbering
Every container image carries a version, e.g. quay.io/baicell/fpga-driver:set_ip.v06, where set_ip.v06 is the tag. The tag format can be defined to fit the company or team, and usually encodes things like the software version, architecture, and build date. In this demo we mostly use a date/time stamp. Sometimes builder information is also attached to the image via labels, but that is not very visible, so the usual practice is to squeeze the needed information into the image tag itself.
Note that although the tag format is arbitrary, once it is defined within the company/team it should be followed consistently; see the sketch below for one possible convention.
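As one hypothetical convention (not what the demo repo enforces), a tag can combine the git branch, the short commit hash, and a UTC build timestamp:
# hypothetical tag convention: <branch>.<short sha>.<UTC timestamp>
var_branch=$(git rev-parse --abbrev-ref HEAD)
var_sha=$(git rev-parse --short HEAD)
var_date=$(date -u '+%Y%m%d-%H%M')
var_tag="${var_branch}.${var_sha}.${var_date}"
echo quaylab.infra.redhat.ren/baicell/vbbu:${var_tag}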
build image && sync image
Let's first look at how to automate the container image build and upload/sync with shell scripts.
for vbbu app
Start with the container image build for the vBBU application. This image's defining trait is that it is huge: we have already built its compile base image on the public cloud, about 6GB in size, and used quay's mirroring feature to sync it asynchronously into the corporate intranet. Here we do the incremental build on top of it, and push the result to the internal image registry.
# on helper vm
# get git repo from gitee, and copy to helper
mkdir -p /data/cicd
cd /data/cicd
wget -O main.zip https://gitee.com/wangzheng422/container.build.demo/repository/archive/main.zip
# scp main.zip root@192.168.7.11:/data/tmp/
cd /data/cicd
unzip main.zip
cd /data/cicd/container.build.demo-main/vbbu
var_date=$(date '+%Y-%m-%d-%H%M')
podman build --no-cache --build-arg REGISTRY=quaylab.infra.redhat.ren -t quaylab.infra.redhat.ren/baicell/vbbu:$var_date .
podman push quaylab.infra.redhat.ren/baicell/vbbu:$var_date
echo quaylab.infra.redhat.ren/baicell/vbbu:$var_date
# sync to public cloud
podman tag quaylab.infra.redhat.ren/baicell/vbbu:$var_date quay.io/baicell/vbbu:$var_date
podman push quay.io/baicell/vbbu:$var_date
for fpga driver
Next, let's look at how to build the fpga driver container image. This image is tiny, so we simply build it automatically on GitHub with an Action.
The steps below assume we need to compile it manually on a public cloud host.
# on public cloud host (vultr)
git clone https://github.com/wangzheng422/container.build.demo
cd container.build.demo/fpga
var_date=$(date '+%Y-%m-%d-%H%M')
podman build --no-cache -t quay.io/baicell/fpga-driver:$var_date -f driver.Dockerfile .
podman push quay.io/baicell/fpga-driver:$var_date
auto deploy to openshift
For automated deployment we use kustomize, which k8s supports natively. We pick kustomize not because it is powerful, but because it is simple and lets us bring the whole stack up and down in one go; a rough sketch of such a kustomization follows.
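As a rough idea of what such a kustomize layout looks like (the file names and tag below are illustrative, not the actual contents of deploy.demo):
cat << EOF > ./kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: baicell
resources:
  - deployment.yaml
  - service.yaml
images:
  - name: quaylab.infra.redhat.ren/baicell/vbbu
    newTag: 2021-12-03-0504   # bump this tag on every build
EOF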
# on helper vm
oc new-project baicell
oc project baicell
oc create sa demo
oc adm policy add-scc-to-user privileged -z demo
mkdir -p /data/cicd
cd /data/cicd
wget -O main.zip https://gitee.com/wangzheng422/container.build.demo/repository/archive/main.zip
unzip main.zip
cd container.build.demo-main/deploy.demo/
# oc new-project baicell
oc -n baicell kustomize .
oc -n baicell apply -k .
# to restore
oc -n baicell delete -k .
CI/CD in openshift4
Now let's look at how to implement CI/CD the open source way, using the features that ship with OpenShift 4.
tekton / pipeline
First, pipeline/tekton. We record the configuration process with screenshots.
We have already defined a pipeline made of 2 steps: one clones a project from a remote git repo, the other builds the image with buildah.
After clicking edit pipeline we land on the pipeline editor page. Skipping the name field, we see a flow-chart editor; hovering over a step lets us add new steps/tasks. Note the workspace at the end: we have to configure storage there so that data can flow between the tasks.
Clicking a step/task lets us set its parameters; for buildah, for example, we configure the image name and other parameters.
Every execution of the pipeline is recorded as a pipeline run; we can open each pipeline run and read the logs of that execution.
The pipeline also keeps simple statistics over all of its pipeline runs.
Next, a few configuration points.
oc new-project demo
oc project demo
# create a push token/secret for the pipeline service account
# https://docs.openshift.com/container-platform/4.9/openshift_images/managing_images/using-image-pull-secrets.html
oc create secret docker-registry pipeline-push-quaylab \
--docker-server=quaylab.infra.redhat.ren \
--docker-username=quayadmin \
--docker-password=password \
--docker-email=quayadmin@redhat.ren
oc secrets link pipeline pipeline-push-quaylab --for=pull,mount
# we need to define storage (a pvc) for the pipeline to use
cat << EOF > /data/cicd/pipeline.pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: pipeline-vbbu-image-build
namespace: demo
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: hostpath-provisioner
volumeMode: Filesystem
EOF
oc create -f /data/cicd/pipeline.pvc.yaml
# the pipeline we defined in the console looks like this in yaml; it can also be created directly from the CLI
cat << EOF > /data/cicd/pipeline.yaml
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
name: vbbu-build-image-pipeline
namespace: demo
spec:
params:
- default: demo
description: docker image tag
name: image_tag
type: string
tasks:
- name: git-clone
params:
- name: url
value: 'https://gitee.com/wangzheng422/container.build.demo'
- name: httpProxy
value: 'http://192.168.7.1:18080'
- name: httpsProxy
value: 'http://192.168.7.1:18080'
taskRef:
kind: ClusterTask
name: git-clone
workspaces:
- name: output
workspace: workspace-demo
- name: buildah
params:
- name: IMAGE
value: 'quaylab.infra.redhat.ren/baicell/vbbu:$(params.image_tag)'
- name: DOCKERFILE
value: vbbu/Dockerfile
- name: CONTEXT
value: vbbu/
- name: TLSVERIFY
value: 'false'
- name: BUILD_EXTRA_ARGS
value: '--build-arg REGISTRY=''quaylab.infra.redhat.ren'''
runAfter:
- git-clone
taskRef:
kind: ClusterTask
name: buildah
workspaces:
- name: source
workspace: workspace-demo
workspaces:
- name: workspace-demo
EOF
oc create -f /data/cicd/pipeline.yaml
argocd / gitops
Next, let's look at GitOps in OpenShift 4, which is built on argocd.
gitops/argocd has its own console; find the login entry here, click through, and log in with SSO. The operator configures all of this for you by default.
After entering the gitops/argocd console, we configure the git source.
Then we create the application.
The key pieces of the application configuration are the git source, the target cluster (the default is fine), and the path inside the git repo. The yaml further below can be used directly.
After the application is created, it looks like this.
Opening the application shows a nice topology view of its structure. We click sync to make gitops take effect; the system then creates the k8s objects according to the yaml in git.
Once gitops has succeeded, the topology view gets even richer: it also shows the system objects that were created implicitly.
Back on the overview page, we can see that our application is healthy.
Below are the commands used.
# label the managed project so that it is managed by gitops
oc label namespace baicell argocd.argoproj.io/managed-by=openshift-gitops
oc api-resources | grep argo
# applications app,apps argoproj.io/v1alpha1 true Application
# applicationsets appset,appsets argoproj.io/v1alpha1 true ApplicationSet
# appprojects appproj,appprojs argoproj.io/v1alpha1 true AppProject
# argocds argoproj.io/v1alpha1 true ArgoCD
oc project openshift-gitops
# create our gitops application; creating it from the CLI saves typing everything into the console
cat << EOF > /data/cicd/gitops-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: demo
namespace: openshift-gitops
spec:
destination:
namespace: baicell
server: https://kubernetes.default.svc
project: default
source:
path: deploy.demo
repoURL: https://gitee.com/wangzheng422/container.build.demo
EOF
oc create -f /data/cicd/gitops-app.yaml
oc get applications
# NAME SYNC STATUS HEALTH STATUS
# demo Synced Healthy
end
openshift/k8s, remote shell / oc exec, how it works
In day-to-day operation of openshift/k8s we often run the oc exec command, for example:
oc exec -it $pod_name -- bash
But sometimes the session fails to connect, and sometimes it drops abruptly, so let's try to see what this command does under the hood.
oc exec -v 6 -it pod/du-deployment1-58944f9f85-8m49m -- bash
# I1230 14:38:39.347429 188014 loader.go:372] Config loaded from file: /data/install/auth/kubeconfig
# I1230 14:38:39.370718 188014 round_trippers.go:454] GET https://api.ocp4s.redhat.ren:6443/api/v1/namespaces/default/pods/du-deployment1-58944f9f85-8m49m 200 OK in 10 milliseconds
# I1230 14:38:39.376109 188014 podcmd.go:88] Defaulting container name to du-container1
# I1230 14:38:39.396350 188014 round_trippers.go:454] POST https://api.ocp4s.redhat.ren:6443/api/v1/namespaces/default/pods/du-deployment1-58944f9f85-8m49m/exec?command=bash&container=du-container1&stdin=true&stdout=true&tty=true 101 Switching Protocols in 19 milliseconds
# [root@du-deployment1-58944f9f85-8m49m /]#
oc exec -v 7 -it pod/du-deployment1-58944f9f85-8m49m -- bash
# I1230 14:39:13.441167 188023 loader.go:372] Config loaded from file: /data/install/auth/kubeconfig
# I1230 14:39:13.450807 188023 round_trippers.go:432] GET https://api.ocp4s.redhat.ren:6443/api/v1/namespaces/default/pods/du-deployment1-58944f9f85-8m49m
# I1230 14:39:13.450830 188023 round_trippers.go:438] Request Headers:
# I1230 14:39:13.450837 188023 round_trippers.go:442] Accept: application/json, */*
# I1230 14:39:13.450842 188023 round_trippers.go:442] User-Agent: oc/4.9.0 (linux/amd64) kubernetes/96e95ce
# I1230 14:39:13.465425 188023 round_trippers.go:457] Response Status: 200 OK in 14 milliseconds
# I1230 14:39:13.473072 188023 podcmd.go:88] Defaulting container name to du-container1
# I1230 14:39:13.475155 188023 round_trippers.go:432] POST https://api.ocp4s.redhat.ren:6443/api/v1/namespaces/default/pods/du-deployment1-58944f9f85-8m49m/exec?command=bash&container=du-container1&stdin=true&stdout=true&tty=true
# I1230 14:39:13.475182 188023 round_trippers.go:438] Request Headers:
# I1230 14:39:13.475187 188023 round_trippers.go:442] X-Stream-Protocol-Version: v4.channel.k8s.io
# I1230 14:39:13.475191 188023 round_trippers.go:442] X-Stream-Protocol-Version: v3.channel.k8s.io
# I1230 14:39:13.475195 188023 round_trippers.go:442] X-Stream-Protocol-Version: v2.channel.k8s.io
# I1230 14:39:13.475199 188023 round_trippers.go:442] X-Stream-Protocol-Version: channel.k8s.io
# I1230 14:39:13.475203 188023 round_trippers.go:442] User-Agent: oc/4.9.0 (linux/amd64) kubernetes/96e95ce
# I1230 14:39:13.496289 188023 round_trippers.go:457] Response Status: 101 Switching Protocols in 21 milliseconds
# [root@du-deployment1-58944f9f85-8m49m /]#
In the two commands above we turned on logging at different verbosity levels. We can see that oc exec actually calls the pod exec endpoint on the api server, and the connection is then upgraded to a streaming channel protocol (note the X-Stream-Protocol-Version headers and the 101 Switching Protocols response).
So when oc exec is flaky in a project, start by checking whether the api server is healthy, whether there is a proxy such as haproxy on the path to the api server, and whether that proxy is healthy; work through it step by step.
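A few quick checks help with that triage (the health endpoints below are standard kube-apiserver endpoints; the unauthenticated curl may return 401/403 depending on cluster policy, but a TCP or TLS failure already tells you the path is broken):
# which api endpoint the client is talking to
oc whoami --show-server
# built-in health endpoints of the api server
oc get --raw='/readyz?verbose' | tail -n 5
oc get --raw='/livez'
# probe the load balancer / haproxy in front of the api server directly
curl -k https://api.ocp4s.redhat.ren:6443/readyz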
reference
- https://www.cnblogs.com/a00ium/p/10905279.html
- https://cloud.redhat.com/blog/executing-commands-in-pods-using-k8s-api
- https://docs.openshift.com/container-platform/4.9/rest_api/workloads_apis/pod-core-v1.html#apiv1namespacesnamespacepodsnameexec
nf_conntrack handling on openshift 4.9
Recently I saw a case where a host running heavily loaded docker applications reported the nf_conntrack table as full. This is an old problem: by default docker handles container networking with NAT, i.e. the container sees a private address space and the OS relies on iptables/nftables to translate it, and that translation needs nf_conntrack to track the connections.
There is no great fix: either use host networking to bypass NAT, or use NOTRACK rules so that iptables stops recording entries in nf_conntrack. Neither option is ideal.
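Whichever route you take, the first thing to check is how close the table is to its limit; these sysctls exist on any host with the nf_conntrack module loaded (raising the maximum is only a stopgap):
# current entries vs. the table limit
sysctl net.netfilter.nf_conntrack_count
sysctl net.netfilter.nf_conntrack_max
# size of the hash table backing the entries
cat /sys/module/nf_conntrack/parameters/hashsize
# temporary relief only; host network or NOTRACK is the real fix
# sysctl -w net.netfilter.nf_conntrack_max=1048576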
openshift
OpenShift is a container platform, so how does it handle this? Let's take a look.
# we can see that on openshift, vxlan traffic is exempted from nf_conntrack tracking, i.e. vxlan traffic is never recorded in nf_conntrack
iptables -L -v -n -t raw
# Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
# pkts bytes target prot opt in out source destination
# 88M 39G OPENSHIFT-NOTRACK all -- * * 0.0.0.0/0 0.0.0.0/0 /* disable conntrack for vxlan */
# Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
# pkts bytes target prot opt in out source destination
# 87M 54G OPENSHIFT-NOTRACK all -- * * 0.0.0.0/0 0.0.0.0/0 /* disable conntrack for vxlan */
# Chain OPENSHIFT-NOTRACK (2 references)
# pkts bytes target prot opt in out source destination
# 0 0 CT udp -- * * 0.0.0.0/0 0.0.0.0/0 udp dpt:4789 NOTRACK
# count the total number of conntrack entries
conntrack -L -o extended | wc -l
# conntrack v1.4.4 (conntrack-tools): 760 flow entries have been shown.
# 760
# count conntrack entries per TCP state
conntrack -L -o extended | awk '/^.*tcp.*$/ {sum[$6]++} END {for(i in sum) print i, sum[i]}'
# conntrack v1.4.4 (conntrack-tools): 774 flow entries have been shown.
# LAST_ACK 1
# CLOSE 78
# ESTABLISHED 428
# SYN_SENT 1
# TIME_WAIT 214
# count conntrack entries per source IP
conntrack -L -o extended | awk '{print $7}' | cut -d "=" -f 2 | sort | uniq -c | sort -nr | head -n 10
# conntrack v1.4.4 (conntrack-tools): 805 flow entries have been shown.
# 226 10.128.0.1
# 225 192.168.7.73
# 74 192.168.7.71
# 68 10.128.0.36
# 61 172.30.0.10
# 38 127.0.0.1
# 13 10.128.0.16
# 10 10.128.0.34
# 7 10.128.0.39
# 5 10.128.0.9
# if the conntrack tool is not installed, the same data can be read from /proc
awk -F'=' '{c[$2]++} END {for ( i in c) print i,c[i]}' /proc/net/nf_conntrack | sort -g -k 3
# ...ignored...
# 10.128.0.16 dst 13
# 10.128.0.37 dst 14
# 10.128.0.43 dst 27
# 127.0.0.1 dst 38
# 192.168.7.71 dst 62
# 10.128.0.36 dst 67
# 0000:0000:0000:0000:0000:0000:0000:0001 dst 208
# 192.168.7.73 dst 220
# 10.128.0.1 dst 230
# if there are too many entries, dump them first and sort on another machine
awk -F'=' '{print $2}' /proc/net/nf_conntrack > list
awk '{c[$1]++} END {for ( i in c) print i,c[i]}' list | sort -g -k 2
reference
- https://blog.cloudflare.com/conntrack-turns-a-blind-eye-to-dropped-syns/
- https://blog.cloudflare.com/conntrack-tales-one-thousand-and-one-flows/
- https://www.codeleading.com/article/31982187817/
- https://blog.longwin.com.tw/2018/07/linux-nf-conntrack-table-full-drop-packet-2018/
- https://www.reddit.com/r/docker/comments/iq04tw/nated_containers_conntrack_table_full_inside/
- https://forum.proxmox.com/threads/how-to-disable-nf_conntrack-completely.17957/
- https://docs.docker.com/network/iptables/
- https://blog.csdn.net/chunnidong6528/article/details/100975427
- https://www.cnblogs.com/sreops/p/14023368.html
- https://zyh.cool/posts/f41d0763/
- https://www.redhat.com/en/blog/mitigate-tcp-syn-flood-attacks-red-hat-enterprise-linux-7-beta
- https://access.redhat.com/discussions/6307391
- https://access.redhat.com/solutions/781873
openshift 4.9, loading third-party drivers / kernel modules
In projects we run into special hardware, such as an fpga card, for which the software vendor provides a driver/kernel module that we need to load into the system. This section describes how, on openshift 4.9, to inject that driver/kernel module into the host through a deployment / pod.
In this lab the physical machine has an fpga card, and we have obtained the matching driver nr_drv_wr.ko. Once loaded, the driver creates a network interface, which we then need to initialize.
Let's see how it is done.
build the image
We copy the driver into the image, along with an auto-load script. The script contains one small trick: the .ko file must be given the correct selinux label, otherwise insmod fails.
mkdir -p /data/wzh/fpga
cd /data/wzh/fpga
cat << 'EOF' > ./ocp4.install.sh
#!/bin/bash
set -e
set -x
if chroot /host lsmod | grep nr_drv > /dev/null 2>&1
then
echo NR Driver Module had loaded!
else
echo Inserting NR Driver Module
# chroot /host rmmod nr_drv > /dev/null 2>&1
if [ $(uname -r) == "4.18.0-305.19.1.rt7.91.el8_4.x86_64" ];
then
echo insmod nr_drv_wr.ko ...
/bin/cp -f nr_drv_wr.ko /host/tmp/nr_drv_wr.ko
chroot /host chcon -t modules_object_t /tmp/nr_drv_wr.ko
chroot /host insmod /tmp/nr_drv_wr.ko load_xeth=1
/bin/rm -f /host/tmp/nr_drv_wr.ko
CON_NAME=`chroot /host nmcli -g GENERAL.CONNECTION dev show xeth`
chroot /host nmcli connection modify "$CON_NAME" con-name xeth
chroot /host nmcli connection modify xeth ipv4.method disabled ipv6.method disabled
chroot /host nmcli dev conn xeth
else
echo insmod nr_drv_ko Failed!
fi
fi
EOF
cat << EOF > ./fpga.dockerfile
FROM docker.io/busybox:1.34
USER root
COPY Driver.PKG /Driver.PKG
COPY ocp4.install.sh /ocp4.install.sh
RUN chmod +x /ocp4.install.sh
WORKDIR /
EOF
buildah bud -t registry.ocp4.redhat.ren:5443/nep/fgpa-driver:v07 -f fpga.dockerfile .
buildah push registry.ocp4.redhat.ren:5443/nep/fgpa-driver:v07
deploy on openshift
Before deploying, we grant the service account privileged mode. In this lab we use the default service account in the default project, so the command is below; in a real project you would normally create a dedicated project and a dedicated service account.
We also use a couple of tricks. First, an init container copies the driver into the pod and hands it to the container that actually runs. Then we sleep forever to keep the pod alive: if the container exited normally, the deployment would restart it, and we do not want that restart, so we keep the pod running with an infinite sleep. Fortunately this pod consumes very little.
In the future this may be optimized to run as a job / static pod instead.
oc adm policy add-scc-to-user privileged -z default -n default
cat << EOF > /data/install/fpga.driver.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: fpga-driver
# namespace: default
labels:
app: fpga-driver
spec:
replicas: 1
selector:
matchLabels:
app: fpga-driver
template:
metadata:
labels:
app: fpga-driver
spec:
hostPID: true
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: "app"
operator: In
values:
- fpga-driver
topologyKey: "kubernetes.io/hostname"
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- worker-0
# restartPolicy: Never
initContainers:
- name: copy
image: registry.ocp4.redhat.ren:5443/nep/fgpa-driver:v07
command: ["/bin/sh", "-c", "tar zvxf /Driver.PKG --strip 1 -C /nep/driver/ && /bin/cp -f /ocp4.install.sh /nep/driver/ "]
imagePullPolicy: Always
volumeMounts:
- name: driver-files
mountPath: /nep/driver/
containers:
- name: driver
image: registry.redhat.io/rhel8/support-tools:8.4
# imagePullPolicy: Always
command: [ "/usr/bin/bash","-c","cd /nep/driver/ && bash ./ocp4.install.sh && sleep infinity " ]
# command: [ "/usr/bin/bash","-c","tail -f /dev/null || true " ]
resources:
requests:
cpu: 10m
memory: 20Mi
securityContext:
privileged: true
# runAsUser: 0
seLinuxOptions:
level: "s0"
volumeMounts:
- name: driver-files
mountPath: /nep/driver/
- name: host
mountPath: /host
volumes:
- name: driver-files
emptyDir: {}
- name: host
hostPath:
path: /
type: Directory
EOF
oc create -f /data/install/fpga.driver.yaml
# to restore
oc delete -f /data/install/fpga.driver.yaml
sign the kernel module
CHAPTER 4. SIGNING KERNEL MODULES FOR SECURE BOOT
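The RHEL flow referenced above boils down to generating a MOK key pair, enrolling the public key, and signing the .ko with the kernel's sign-file helper. A condensed sketch, assuming kernel-devel is installed and using placeholder file names:
# generate a signing key pair (DER-encoded public key for MOK enrollment)
openssl req -new -x509 -newkey rsa:2048 -nodes -days 36500 \
  -subj "/CN=wzh module signing key/" \
  -keyout MOK.priv -outform DER -out MOK.der
# enroll the public key into the MOK list (prompts for a password, confirm on next reboot)
mokutil --import MOK.der
# sign the module
/usr/src/kernels/$(uname -r)/scripts/sign-file sha256 MOK.priv MOK.der nr_drv_wr.ko
# verify the signature fields are present
modinfo nr_drv_wr.ko | grep -E 'signer|sig_key'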
helm chart / helm operator build
2021.12 helm chart/helm operator
build helm operator
mkdir -p /data/down
cd /data/down
wget https://mirror.openshift.com/pub/openshift-v4/clients/operator-sdk/latest/operator-sdk-linux-x86_64.tar.gz
tar zvxf operator-sdk-linux-x86_64.tar.gz
install operator-sdk /usr/local/bin/
operator-sdk init --plugins helm --help
mkdir -p /data/helm
cd /data/helm
# initialize the project
operator-sdk init \
--plugins=helm \
--project-name nep-helm-operator \
--domain=nep.com \
--group=apps \
--version=v1alpha1 \
--kind=VBBU
make bundle
# operator-sdk generate kustomize manifests -q
# Display name for the operator (required):
# > nep vBBU
# Description for the operator (required):
# > nep vRAN application including fpga driver, vCU, vDU
# Provider's name for the operator (required):
# > nep
# Any relevant URL for the provider name (optional):
# > na.nep.com
# Comma-separated list of keywords for your operator (required):
# > nep,vbbu,vran,vcu,vdu
# Comma-separated list of maintainers and their emails (e.g. 'name1:email1, name2:email2') (required):
# >
# No list provided.
# Comma-separated list of maintainers and their emails (e.g. 'name1:email1, name2:email2') (required):
# > wangzheng:wangzheng422@foxmail.com
# cd config/manager && /data/helm/bin/kustomize edit set image controller=quay.io/nep/nep-helm-operator:latest
# /data/helm/bin/kustomize build config/manifests | operator-sdk generate bundle -q --overwrite --version 0.0.1
# INFO[0001] Creating bundle.Dockerfile
# INFO[0001] Creating bundle/metadata/annotations.yaml
# INFO[0001] Bundle metadata generated suceessfully
# operator-sdk bundle validate ./bundle
# INFO[0000] All validation tests have completed successfully
cd /data/helm/helm-charts/vbbu
helm lint
dnf install -y podman-docker
cd /data/helm/
make docker-build
# docker build -t quay.io/nep/nep-helm-operator:v01 .
# Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
# STEP 1/5: FROM registry.redhat.io/openshift4/ose-helm-operator:v4.9
# STEP 2/5: ENV HOME=/opt/helm
# --> 1eec2f9c094
# STEP 3/5: COPY watches.yaml ${HOME}/watches.yaml
# --> 1836589a08c
# STEP 4/5: COPY helm-charts ${HOME}/helm-charts
# --> b6cd9f24e47
# STEP 5/5: WORKDIR ${HOME}
# COMMIT quay.io/nep/nep-helm-operator:v01
# --> 1f9bcc4cecc
# Successfully tagged quay.io/nep/nep-helm-operator:v01
# 1f9bcc4cecc55e68170e2a6f45dad7b318018df8bf3989bd990f567e3ccdfcd9
make docker-push
# docker push quay.io/nep/nep-helm-operator:v01
# Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
# Getting image source signatures
# Copying blob 8cd9b2cfbe06 skipped: already exists
# Copying blob 5bc03dec6239 skipped: already exists
# Copying blob 525ed45dbdb1 skipped: already exists
# Copying blob 758ace4ace74 skipped: already exists
# Copying blob deb6b0f93acd skipped: already exists
# Copying blob ac83cd3b61fd skipped: already exists
# Copying blob 12f964d7475b [--------------------------------------] 0.0b / 0.0b
# Copying config 1f9bcc4cec [--------------------------------------] 0.0b / 4.0KiB
# Writing manifest to image destination
# Copying config 1f9bcc4cec [--------------------------------------] 0.0b / 4.0KiB
# Writing manifest to image destination
# Storing signatures
make bundle-build BUNDLE_IMG=quay.io/nep/nep-helm-operator:bundle-v01
# docker build -f bundle.Dockerfile -t quay.io/nep/nep-helm-operator:bundle-v01 .
# Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
# STEP 1/14: FROM scratch
# STEP 2/14: LABEL operators.operatorframework.io.bundle.mediatype.v1=registry+v1
# --> Using cache b67edfbd23d6ba9c3f484a1e01f9da79fbffdc44e913423e2f616e477df372e1
# --> b67edfbd23d
# STEP 3/14: LABEL operators.operatorframework.io.bundle.manifests.v1=manifests/
# --> Using cache f2eef5180d3c9c63f40a98880ec95088b8395845e0f90960a194326d77a6f3b4
# --> f2eef5180d3
# STEP 4/14: LABEL operators.operatorframework.io.bundle.metadata.v1=metadata/
# --> Using cache 6fc10718a71e30d31cc652b47ac27ca87901ff4fda17a25e2d6bc53344e50673
# --> 6fc10718a71
# STEP 5/14: LABEL operators.operatorframework.io.bundle.package.v1=nep-helm-operator
# --> Using cache 6664d1d6c64c0954c18a432194845551e5a0c6f9bba33175d77c8791e2b0f6e0
# --> 6664d1d6c64
# STEP 6/14: LABEL operators.operatorframework.io.bundle.channels.v1=alpha
# --> Using cache 32878b9e903851bb51b6c0635c77112b4244f4ce7e9d8a7b0a0d8cf7fe7bbe0e
# --> 32878b9e903
# STEP 7/14: LABEL operators.operatorframework.io.metrics.builder=operator-sdk-v1.10.1-ocp
# --> Using cache c5482c80a3287494a5f35ee8df782f4499ad6def2aaa55652e5fc57d4dfa8f0d
# --> c5482c80a32
# STEP 8/14: LABEL operators.operatorframework.io.metrics.mediatype.v1=metrics+v1
# --> Using cache 68822f2fae03c5efc8b980882f66e870d8942d80dbf697e3d784c46f95c50437
# --> 68822f2fae0
# STEP 9/14: LABEL operators.operatorframework.io.metrics.project_layout=helm.sdk.operatorframework.io/v1
# --> Using cache a85519d2774008b3071baf6098ec59561102ef1f337acd19b2c7ef739ebae89e
# --> a85519d2774
# STEP 10/14: LABEL operators.operatorframework.io.test.mediatype.v1=scorecard+v1
# --> Using cache 17a1b08e1dca2295f98e3288d592a08636d15d7461e25e11744a499160a1546c
# --> 17a1b08e1dc
# STEP 11/14: LABEL operators.operatorframework.io.test.config.v1=tests/scorecard/
# --> Using cache 9b6a20b0ff75b501a321fe4fbdfd1d284763e65596dc85675f119e5e3de69657
# --> 9b6a20b0ff7
# STEP 12/14: COPY bundle/manifests /manifests/
# --> Using cache ff3aa5b299dae11f464d8ad56f4ae5130974e1cebd0cf273bc03aba11fcb7377
# --> ff3aa5b299d
# STEP 13/14: COPY bundle/metadata /metadata/
# --> Using cache 19395ef3259bbb4e1f5da9616195139698a3ef18e7f904a2a1cd7515cd9829f3
# --> 19395ef3259
# STEP 14/14: COPY bundle/tests/scorecard /tests/scorecard/
# --> Using cache 2268eb0a731f424f70e5b46222a1accd5344560ac9ab609ca3ccb5a4d0cd6669
# COMMIT quay.io/nep/nep-helm-operator:bundle-v01
# --> 2268eb0a731
# Successfully tagged quay.io/nep/nep-helm-operator:bundle-v01
# Successfully tagged quay.io/nep/nep-helm-operator-bundle:v0.0.1
# 2268eb0a731f424f70e5b46222a1accd5344560ac9ab609ca3ccb5a4d0cd6669
make bundle-push BUNDLE_IMG=quay.io/nep/nep-helm-operator:bundle-v01
# make docker-push IMG=quay.io/nep/nep-helm-operator:bundle-v01
# make[1]: Entering directory '/data/helm'
# docker push quay.io/nep/nep-helm-operator:bundle-v01
# Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
# Getting image source signatures
# Copying blob 24b54377030e skipped: already exists
# Copying blob 1929cd83db02 skipped: already exists
# Copying blob 44ef63131a17 [--------------------------------------] 0.0b / 0.0b
# Copying config 2268eb0a73 done
# Writing manifest to image destination
# Copying config 2268eb0a73 [--------------------------------------] 0.0b / 3.3KiB
# Writing manifest to image destination
# Storing signatures
# make[1]: Leaving directory '/data/helm'
make catalog-build CATALOG_IMG=quay.io/nep/nep-helm-operator:catalog-v01 BUNDLE_IMG=quay.io/nep/nep-helm-operator:bundle-v01
# ./bin/opm index add --mode semver --tag quay.io/nep/nep-helm-operator:catalog-v01 --bundles quay.io/nep/nep-helm-operator:bundle-v01
# INFO[0000] building the index bundles="[quay.io/nep/nep-helm-operator:bundle-v01]"
# INFO[0000] resolved name: quay.io/nep/nep-helm-operator:bundle-v01
# INFO[0000] fetched digest="sha256:1365e5913f05b733124a2a88c3113899db0c42f62b5758477577ef2117aff09f"
# INFO[0000] fetched digest="sha256:be008c9c2b4f2c031b301174608accb8622c8d843aba2d1af4d053d8b00373c2"
# INFO[0000] fetched digest="sha256:2268eb0a731f424f70e5b46222a1accd5344560ac9ab609ca3ccb5a4d0cd6669"
# INFO[0000] fetched digest="sha256:d8e28b323fec2e4de5aecfb46c4ce3e315e20f49b78f43eb7a1d657798695655"
# INFO[0000] fetched digest="sha256:c19ac761be31fa163ea3da95cb63fc0c2aaca3b316bfb049f6ee36f77522d323"
# INFO[0001] unpacking layer: {application/vnd.docker.image.rootfs.diff.tar.gzip sha256:d8e28b323fec2e4de5aecfb46c4ce3e315e20f49b78f43eb7a1d657798695655 2985 [] map[] <nil>}
# INFO[0001] unpacking layer: {application/vnd.docker.image.rootfs.diff.tar.gzip sha256:c19ac761be31fa163ea3da95cb63fc0c2aaca3b316bfb049f6ee36f77522d323 398 [] map[] <nil>}
# INFO[0001] unpacking layer: {application/vnd.docker.image.rootfs.diff.tar.gzip sha256:be008c9c2b4f2c031b301174608accb8622c8d843aba2d1af4d053d8b00373c2 438 [] map[] <nil>}
# INFO[0001] Could not find optional dependencies file dir=bundle_tmp582129875 file=bundle_tmp582129875/metadata load=annotations
# INFO[0001] found csv, loading bundle dir=bundle_tmp582129875 file=bundle_tmp582129875/manifests load=bundle
# INFO[0001] loading bundle file dir=bundle_tmp582129875/manifests file=apps.nep.com_vbbus.yaml load=bundle
# INFO[0001] loading bundle file dir=bundle_tmp582129875/manifests file=nep-helm-operator-controller-manager-metrics-service_v1_service.yaml load=bundle
# INFO[0001] loading bundle file dir=bundle_tmp582129875/manifests file=nep-helm-operator-manager-config_v1_configmap.yaml load=bundle
# INFO[0001] loading bundle file dir=bundle_tmp582129875/manifests file=nep-helm-operator-metrics-reader_rbac.authorization.k8s.io_v1_clusterrole.yaml load=bundle
# INFO[0001] loading bundle file dir=bundle_tmp582129875/manifests file=nep-helm-operator.clusterserviceversion.yaml load=bundle
# INFO[0001] Generating dockerfile bundles="[quay.io/nep/nep-helm-operator:bundle-v01]"
# INFO[0001] writing dockerfile: index.Dockerfile322782265 bundles="[quay.io/nep/nep-helm-operator:bundle-v01]"
# INFO[0001] running podman build bundles="[quay.io/nep/nep-helm-operator:bundle-v01]"
# INFO[0001] [podman build --format docker -f index.Dockerfile322782265 -t quay.io/nep/nep-helm-operator:catalog-v01 .] bundles="[quay.io/nep/nep-helm-operator:bundle-v01]"
make catalog-push CATALOG_IMG=quay.io/nep/nep-helm-operator:catalog-v01
# make docker-push IMG=quay.io/nep/nep-helm-operator:catalog-v01
# make[1]: Entering directory '/data/helm'
# docker push quay.io/nep/nep-helm-operator:catalog-v01
# Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
# Getting image source signatures
# Copying blob 8a20ae5d4166 done
# Copying blob a98a386b6ec2 skipped: already exists
# Copying blob 4e7f383eb531 skipped: already exists
# Copying blob bc276c40b172 skipped: already exists
# Copying blob b15904f6a114 skipped: already exists
# Copying blob 86aadf4df7dc skipped: already exists
# Copying config 5d5d1c219c done
# Writing manifest to image destination
# Storing signatures
# make[1]: Leaving directory '/data/helm'
export OPERATOR_VERION=v04
make docker-build IMG=quay.io/nep/nep-helm-operator:$OPERATOR_VERION
make docker-push IMG=quay.io/nep/nep-helm-operator:$OPERATOR_VERION
make bundle IMG=quay.io/nep/nep-helm-operator:$OPERATOR_VERION
make bundle-build bundle-push catalog-build catalog-push \
BUNDLE_IMG=quay.io/nep/nep-helm-operator:bundle-$OPERATOR_VERION \
CATALOG_IMG=quay.io/nep/nep-helm-operator:catalog-$OPERATOR_VERION
# on openshift helper node
cat << EOF > /data/install/nep.catalog.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: nep
namespace: openshift-marketplace
spec:
displayName: nep
publisher: nep
sourceType: grpc
image: ghcr.io/wangzheng422/nep-helm-operator:catalog-2021-12-03-0504
updateStrategy:
registryPoll:
interval: 10m
EOF
oc create -f /data/install/nep.catalog.yaml
# to restore
oc delete -f /data/install/nep.catalog.yaml
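Once the CatalogSource is healthy, the operator can be installed like any other one. A hedged sketch of the subscription; the package name nep-helm-operator and the alpha channel are what the bundle labels above imply, so double-check with the packagemanifests query first:
oc get packagemanifests -n openshift-marketplace | grep nep
cat << EOF > /data/install/nep.subscription.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: nep-helm-operator
  namespace: openshift-operators
spec:
  channel: alpha
  name: nep-helm-operator
  source: nep
  sourceNamespace: openshift-marketplace
EOF
oc create -f /data/install/nep.subscription.yaml
# to restore
oc delete -f /data/install/nep.subscription.yaml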
helm repository
https://medium.com/@mattiaperi/create-a-public-helm-chart-repository-with-github-pages-49b180dbb417
# try to build the repo, and add it into github action
# mkdir -p /data/helm/helm-repo
cd /data/helm/helm-repo
helm package ../helm-charts/*
helm repo index --url https://wangzheng422.github.io/nep-helm-operator/ .
# try to use the repo
helm repo add myhelmrepo https://wangzheng422.github.io/nep-helm-operator/
helm repo list
# NAME URL
# myhelmrepo https://wangzheng422.github.io/nep-helm-operator/
helm search repo vbbu
# NAME CHART VERSION APP VERSION DESCRIPTION
# myhelmrepo/vbbu 0.1.0 1.16.0 A Helm chart for Kubernetes
# for ocp, if you are disconnected
cat << EOF > /data/install/helm.nep.yaml
apiVersion: helm.openshift.io/v1beta1
kind: HelmChartRepository
metadata:
name: nep-helm-charts-wzh
spec:
# optional name that might be used by console
name: nep-helm-charts-wzh
connectionConfig:
url: http://nexus.ocp4.redhat.ren:8082/repository/wangzheng422.github.io/
EOF
oc create -f /data/install/helm.nep.yaml
MetalLB layer2 mode on openshift 4.8
By default OpenShift exposes services to the outside via the router, which is an haproxy underneath, but out of the box it only supports http/https; with some customization it can carry tcp. That configuration is not very intuitive, and the tcp support in particular is awkward.
We already know MetalLB can give services an external IP and advertise it via BGP, but in a PoC a BGP router can be hard to arrange; fortunately MetalLB also offers a layer2 mode, a simpler way to expose an external IP.
Deployment architecture for this lab:
Install MetalLB
Installing MetalLB is very simple.
https://metallb.universe.tf/installation/clouds/#metallb-on-openshift-ocp
mkdir -p /data/install/metallb
cd /data/install/metallb
wget https://raw.githubusercontent.com/metallb/metallb/v0.10.2/manifests/namespace.yaml
wget https://raw.githubusercontent.com/metallb/metallb/v0.10.2/manifests/metallb.yaml
sed -i '/runAsUser: 65534/d' ./metallb.yaml
oc create -f /data/install/metallb/namespace.yaml
oc adm policy add-scc-to-user privileged -n metallb-system -z speaker
oc create -f /data/install/metallb/metallb.yaml
# to restore
oc delete -f /data/install/metallb/metallb.yaml
Configure MetalLB
# on helper
cat << EOF > /data/install/metal-bgp.yaml
apiVersion: v1
kind: ConfigMap
metadata:
namespace: metallb-system
name: config
data:
config: |
address-pools:
- name: my-ip-space
protocol: layer2
addresses:
- 192.168.7.150-192.168.7.200
EOF
oc create -f /data/install/metal-bgp.yaml
# to restore
oc delete -f /data/install/metal-bgp.yaml
Create a test application
# back to helper vm
cat << EOF > /data/install/demo.yaml
---
apiVersion: v1
kind: Pod
metadata:
name: test-0
labels:
env: test
spec:
restartPolicy: OnFailure
nodeSelector:
kubernetes.io/hostname: 'master-0'
containers:
- name: php
image: "quay.io/wangzheng422/php:demo.02"
---
apiVersion: v1
kind: Pod
metadata:
name: test-1
labels:
env: test
spec:
restartPolicy: OnFailure
nodeSelector:
kubernetes.io/hostname: 'worker-0'
containers:
- name: php
image: "quay.io/wangzheng422/php:demo.02"
---
kind: Service
apiVersion: v1
metadata:
name: demo
spec:
type: LoadBalancer
ports:
- name: "http"
protocol: TCP
port: 80
targetPort: 80
selector:
env: test
EOF
oc create -f /data/install/demo.yaml
# to restore
oc delete -f /data/install/demo.yaml
oc get all
# NAME READY STATUS RESTARTS AGE
# pod/mypod-787d79b456-4f4xr 1/1 Running 4 4d17h
# pod/test-0 0/1 ContainerCreating 0 4s
# pod/test-1 1/1 Running 0 4s
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# service/demo LoadBalancer 172.30.178.14 192.168.7.150 80:30781/TCP 4s
# service/kubernetes ClusterIP 172.30.0.1 <none> 443/TCP 5d16h
# service/openshift ExternalName <none> kubernetes.default.svc.cluster.local <none> 5d16h
# NAME READY UP-TO-DATE AVAILABLE AGE
# deployment.apps/mypod 1/1 1 1 4d17h
# NAME DESIRED CURRENT READY AGE
# replicaset.apps/mypod-787d79b456 1 1 1 4d17h
oc get pod -o wide
# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
# mypod-787d79b456-4f4xr 1/1 Running 4 4d17h 10.254.1.19 worker-0 <none> <none>
# test-0 1/1 Running 0 9m36s 10.254.0.74 master-0 <none> <none>
# test-1 1/1 Running 0 9m36s 10.254.1.65 worker-0 <none> <none>
oc get svc/demo -o yaml
# apiVersion: v1
# kind: Service
# metadata:
# creationTimestamp: "2021-08-31T06:39:39Z"
# name: demo
# namespace: default
# resourceVersion: "2277414"
# uid: 6f36e7a4-ee2e-4f86-802e-6053debecfb2
# spec:
# clusterIP: 172.30.178.14
# clusterIPs:
# - 172.30.178.14
# externalTrafficPolicy: Cluster
# ipFamilies:
# - IPv4
# ipFamilyPolicy: SingleStack
# ports:
# - name: http
# nodePort: 30781
# port: 80
# protocol: TCP
# targetPort: 80
# selector:
# env: test
# sessionAffinity: None
# type: LoadBalancer
# status:
# loadBalancer:
# ingress:
# - ip: 192.168.7.150
for i in {1..10}
do
curl 192.168.7.150 && echo
done
# Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.1.65
# Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.1.65
# Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.1.65
# Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.1.65
# Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.0.74
# Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.1.65
# Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.0.74
# Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.1.65
# Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.0.74
# Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.1.65
arp -a
# ? (10.88.0.3) at 9a:b9:62:83:0f:75 [ether] on cni-podman0
# master-2.ocp4.redhat.ren (192.168.7.15) at <incomplete> on enp1s0
# ? (10.88.0.2) at 4e:de:d9:d5:f8:f1 [ether] on cni-podman0
# master-1.ocp4.redhat.ren (192.168.7.14) at <incomplete> on enp1s0
# ? (192.168.7.150) at 52:54:00:d2:ba:43 [ether] on enp1s0
# worker-1.ocp4.redhat.ren (192.168.7.17) at <incomplete> on enp1s0
# _gateway (172.21.6.254) at 00:17:94:73:12:c2 [ether] on enp1s0
# master-0.ocp4.redhat.ren (192.168.7.13) at 52:54:00:d2:ba:43 [ether] on enp1s0
# worker-0.ocp4.redhat.ren (192.168.7.16) at 90:b1:1c:44:d6:0f [ether] on enp1s0
# bootstrap.ocp4.redhat.ren (192.168.7.12) at <incomplete> on enp1s0
Go to worker-0 and look at the nft rules
# go to worker-0 to analyze the nat rules
nft list ruleset | grep 192.168.7.150
# meta l4proto tcp ip daddr 192.168.7.150 tcp dport 80 counter packets 0 bytes 0 jump KUBE-FW-CTBMGJDNUDRWEDVR
nft list ruleset | grep KUBE-FW-CTBMGJDNUDRWEDVR -A 5
# meta l4proto tcp ip daddr 192.168.7.150 tcp dport 80 counter packets 0 bytes 0 jump KUBE-FW-CTBMGJDNUDRWEDVR
# meta l4proto tcp @nh,96,16 != 2814 ip daddr 172.30.35.8 tcp dport 80 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
# meta l4proto tcp ip daddr 172.30.35.8 tcp dport 80 counter packets 0 bytes 0 jump KUBE-SVC-T3U64PSX3UGU57NF
# meta l4proto tcp @nh,96,16 != 2814 ip daddr 172.30.152.93 tcp dport 80 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
# meta l4proto tcp ip daddr 172.30.152.93 tcp dport 80 counter packets 0 bytes 0 jump KUBE-SVC-ZOXDBRX7A3I2MI4S
# meta l4proto tcp @nh,96,16 != 2814 ip daddr 172.30.99.142 tcp dport 8443 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
# --
# chain KUBE-FW-CTBMGJDNUDRWEDVR {
# counter packets 0 bytes 0 jump KUBE-MARK-MASQ
# counter packets 0 bytes 0 jump KUBE-SVC-CTBMGJDNUDRWEDVR
# counter packets 0 bytes 0 jump KUBE-MARK-DROP
# }
nft list ruleset | grep KUBE-SVC-CTBMGJDNUDRWEDVR -A 3
# meta l4proto tcp ip daddr 172.30.178.14 tcp dport 80 counter packets 0 bytes 0 jump KUBE-SVC-CTBMGJDNUDRWEDVR
# meta l4proto tcp ip daddr 192.168.7.150 tcp dport 80 counter packets 0 bytes 0 jump KUBE-FW-CTBMGJDNUDRWEDVR
# meta l4proto tcp @nh,96,16 != 2814 ip daddr 172.30.35.8 tcp dport 80 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
# meta l4proto tcp ip daddr 172.30.35.8 tcp dport 80 counter packets 0 bytes 0 jump KUBE-SVC-T3U64PSX3UGU57NF
# --
# meta l4proto tcp tcp dport 30781 counter packets 0 bytes 0 jump KUBE-SVC-CTBMGJDNUDRWEDVR
# }
# chain KUBE-SVC-HH47JV2DWEPNMQEX {
# --
# chain KUBE-SVC-CTBMGJDNUDRWEDVR {
# counter packets 0 bytes 0 jump KUBE-SEP-CGMBWTJH33MIKSJY
# counter packets 0 bytes 0 jump KUBE-SEP-V5VBCVCJRZSWQ4D6
# }
# --
# counter packets 0 bytes 0 jump KUBE-SVC-CTBMGJDNUDRWEDVR
# counter packets 0 bytes 0 jump KUBE-MARK-DROP
# }
nft list ruleset | grep KUBE-SEP-CGMBWTJH33MIKSJY -A 3
# counter packets 0 bytes 0 jump KUBE-SEP-CGMBWTJH33MIKSJY
# counter packets 0 bytes 0 jump KUBE-SEP-V5VBCVCJRZSWQ4D6
# }
# --
# chain KUBE-SEP-CGMBWTJH33MIKSJY {
# ip saddr 10.254.0.74 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
# meta l4proto tcp counter packets 0 bytes 0 dnat to 10.254.0.74:80
# }
nft list ruleset | grep KUBE-SEP-V5VBCVCJRZSWQ4D6 -A 3
# counter packets 0 bytes 0 jump KUBE-SEP-V5VBCVCJRZSWQ4D6
# }
# chain KUBE-FW-CTBMGJDNUDRWEDVR {
# --
# chain KUBE-SEP-V5VBCVCJRZSWQ4D6 {
# ip saddr 10.254.1.65 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
# meta l4proto tcp counter packets 0 bytes 0 dnat to 10.254.1.65:80
# }
nft --handle --numeric list ruleset | grep random
# counter packets 0 bytes 0 masquerade random-fully # handle 13
Look at the iptables rules
iptables -L -v -n -t nat | grep 192.168.7.150
# 0 0 KUBE-FW-CTBMGJDNUDRWEDVR tcp -- * * 0.0.0.0/0 192.168.7.150 /* default/demo:http loadbalancer IP */ tcp dpt:80
iptables -L -v -n -t nat | grep KUBE-FW-CTBMGJDNUDRWEDVR -A 5
# 0 0 KUBE-FW-CTBMGJDNUDRWEDVR tcp -- * * 0.0.0.0/0 192.168.7.150 /* default/demo:http loadbalancer IP */ tcp dpt:80
# 0 0 KUBE-MARK-MASQ tcp -- * * !10.254.0.0/16 172.30.210.66 /* openshift-kube-scheduler-operator/metrics:https cluster IP */ tcp dpt:443
# 0 0 KUBE-SVC-HH47JV2DWEPNMQEX tcp -- * * 0.0.0.0/0 172.30.210.66 /* openshift-kube-scheduler-operator/metrics:https cluster IP */ tcp dpt:443
# 0 0 KUBE-MARK-MASQ tcp -- * * !10.254.0.0/16 172.30.55.237 /* openshift-apiserver-operator/metrics:https cluster IP */ tcp dpt:443
# 0 0 KUBE-SVC-CIUYVLZDADCHPTYT tcp -- * * 0.0.0.0/0 172.30.55.237 /* openshift-apiserver-operator/metrics:https cluster IP */ tcp dpt:443
# 0 0 KUBE-MARK-MASQ tcp -- * * !10.254.0.0/16 172.30.134.31 /* openshift-pipelines/tekton-pipelines-controller:probes cluster IP */ tcp dpt:8080
# --
# Chain KUBE-FW-CTBMGJDNUDRWEDVR (1 references)
# pkts bytes target prot opt in out source destination
# 0 0 KUBE-MARK-MASQ all -- * * 0.0.0.0/0 0.0.0.0/0 /* default/demo:http loadbalancer IP */
# 0 0 KUBE-SVC-CTBMGJDNUDRWEDVR all -- * * 0.0.0.0/0 0.0.0.0/0 /* default/demo:http loadbalancer IP */
# 0 0 KUBE-MARK-DROP all -- * * 0.0.0.0/0 0.0.0.0/0 /* default/demo:http loadbalancer IP */
iptables -L -v -n -t nat | grep KUBE-SVC-CTBMGJDNUDRWEDVR -A 4
# 0 0 KUBE-SVC-CTBMGJDNUDRWEDVR tcp -- * * 0.0.0.0/0 172.30.178.14 /* default/demo:http cluster IP */ tcp dpt:80
# 0 0 KUBE-FW-CTBMGJDNUDRWEDVR tcp -- * * 0.0.0.0/0 192.168.7.150 /* default/demo:http loadbalancer IP */ tcp dpt:80
# 0 0 KUBE-MARK-MASQ tcp -- * * !10.254.0.0/16 172.30.210.66 /* openshift-kube-scheduler-operator/metrics:https cluster IP */ tcp dpt:443
# 0 0 KUBE-SVC-HH47JV2DWEPNMQEX tcp -- * * 0.0.0.0/0 172.30.210.66 /* openshift-kube-scheduler-operator/metrics:https cluster IP */ tcp dpt:443
# 0 0 KUBE-MARK-MASQ tcp -- * * !10.254.0.0/16 172.30.55.237 /* openshift-apiserver-operator/metrics:https cluster IP */ tcp dpt:443
# --
# 0 0 KUBE-SVC-CTBMGJDNUDRWEDVR tcp -- * * 0.0.0.0/0 0.0.0.0/0 /* default/demo:http */ tcp dpt:30781
# Chain KUBE-SVC-HH47JV2DWEPNMQEX (1 references)
# pkts bytes target prot opt in out source destination
# 0 0 KUBE-SEP-XIWZUKNCQE6LJCFA all -- * * 0.0.0.0/0 0.0.0.0/0 /* openshift-kube-scheduler-operator/metrics:https */
# --
# Chain KUBE-SVC-CTBMGJDNUDRWEDVR (3 references)
# pkts bytes target prot opt in out source destination
# 0 0 KUBE-SEP-CGMBWTJH33MIKSJY all -- * * 0.0.0.0/0 0.0.0.0/0 /* default/demo:http */ statistic mode random probability 0.50000000000
# 0 0 KUBE-SEP-V5VBCVCJRZSWQ4D6 all -- * * 0.0.0.0/0 0.0.0.0/0 /* default/demo:http */
# --
# 0 0 KUBE-SVC-CTBMGJDNUDRWEDVR all -- * * 0.0.0.0/0 0.0.0.0/0 /* default/demo:http loadbalancer IP */
# 0 0 KUBE-MARK-DROP all -- * * 0.0.0.0/0 0.0.0.0/0 /* default/demo:http loadbalancer IP */
# Chain KUBE-SEP-V5VBCVCJRZSWQ4D6 (1 references)
# pkts bytes target prot opt in out source destination
iptables -L -v -n -t nat | grep KUBE-SEP-CGMBWTJH33MIKSJY -A 3
# 0 0 KUBE-SEP-CGMBWTJH33MIKSJY all -- * * 0.0.0.0/0 0.0.0.0/0 /* default/demo:http */ statistic mode random probability 0.50000000000
# 0 0 KUBE-SEP-V5VBCVCJRZSWQ4D6 all -- * * 0.0.0.0/0 0.0.0.0/0 /* default/demo:http */
# Chain KUBE-FW-CTBMGJDNUDRWEDVR (1 references)
# --
# Chain KUBE-SEP-CGMBWTJH33MIKSJY (1 references)
# pkts bytes target prot opt in out source destination
# 0 0 KUBE-MARK-MASQ all -- * * 10.254.0.74 0.0.0.0/0 /* default/demo:http */
# 0 0 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 /* default/demo:http */ tcp to:10.254.0.74:80
iptables -L -v -n -t nat | grep KUBE-SEP-V5VBCVCJRZSWQ4D6 -A 3
# 0 0 KUBE-SEP-V5VBCVCJRZSWQ4D6 all -- * * 0.0.0.0/0 0.0.0.0/0 /* default/demo:http */
# Chain KUBE-FW-CTBMGJDNUDRWEDVR (1 references)
# pkts bytes target prot opt in out source destination
# --
# Chain KUBE-SEP-V5VBCVCJRZSWQ4D6 (1 references)
# pkts bytes target prot opt in out source destination
# 0 0 KUBE-MARK-MASQ all -- * * 10.254.1.65 0.0.0.0/0 /* default/demo:http */
# 0 0 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 /* default/demo:http */ tcp to:10.254.1.65:80
MetalLB BGP mode on openshift 4.8
By default OpenShift exposes services to the outside via the router, which is an haproxy underneath, but out of the box it only supports http/https; with some customization it can carry tcp. That configuration is not very intuitive, and the tcp support in particular is awkward. What we want is for a k8s service to expose an external service IP directly and advertise it via BGP. Here we look at how the MetalLB project helps us achieve that.
Deployment architecture for this lab:
Video walkthrough:
Install MetalLB
Installing MetalLB is very simple.
https://metallb.universe.tf/installation/clouds/#metallb-on-openshift-ocp
mkdir -p /data/install/metallb
cd /data/install/metallb
wget https://raw.githubusercontent.com/metallb/metallb/v0.10.2/manifests/namespace.yaml
wget https://raw.githubusercontent.com/metallb/metallb/v0.10.2/manifests/metallb.yaml
sed -i '/runAsUser: 65534/d' ./metallb.yaml
oc create -f namespace.yaml
oc adm policy add-scc-to-user privileged -n metallb-system -z speaker
oc create -f metallb.yaml
Create the router
We use a kvm guest to emulate a bgp router.
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_networking/setting-your-routing-protocols_configuring-and-managing-networking#intro-to-frr_setting-your-routing-protocols
- https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_bgp/configuration/xe-16/irg-xe-16-book/bgp-dynamic-neighbors.html
- https://ipbgp.com/2018/02/07/quagga/
- https://docs.frrouting.org/en/latest/bgp.html
# to setup a router vm for testing
# go to kvm host
cd /data/kvm
wget https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.8/scripts/helper-ks-rocky.cfg
sed -i '0,/^network.*/s/^network.*/network --bootproto=static --device=enp1s0 --gateway=172.21.6.254 --ip=172.21.6.10 --netmask=255.255.255.0 --nameserver=172.21.1.1 --ipv6=auto --activate/' helper-ks-rocky.cfg
sed -i '0,/^network --hostname.*/s/^network --hostname.*/network --hostname=bgp-router/' helper-ks-rocky.cfg
virt-install --name="bgp-router" --vcpus=2 --ram=2048 \
--cpu=host-model \
--disk path=/data/nvme/bgp-router.qcow2,bus=virtio,size=30 \
--os-variant rhel8.4 --network bridge=baremetal,model=virtio \
--graphics vnc,port=49000 \
--boot menu=on --location /data/kvm/Rocky-8.4-x86_64-minimal.iso \
--initrd-inject helper-ks-rocky.cfg --extra-args "inst.ks=file:/helper-ks-rocky.cfg"
# in the bgp-router vm
nmcli con mod enp1s0 +ipv4.addresses "192.168.7.10/24"
nmcli con up enp1s0
systemctl disable --now firewalld
dnf install -y frr
sed -i 's/bgpd=no/bgpd=yes/g' /etc/frr/daemons
systemctl enable --now frr
# enter the router configuration shell
vtysh
# bgp router configuration below (enter config mode first with `configure terminal`)
router bgp 64512
neighbor metallb peer-group
neighbor metallb remote-as 64512
bgp listen limit 200
bgp listen range 192.168.7.0/24 peer-group metallb
Configure MetalLB to peer with the bgp-router
# on helper
cat << EOF > /data/install/metal-bgp.yaml
apiVersion: v1
kind: ConfigMap
metadata:
namespace: metallb-system
name: config
data:
config: |
peers:
- my-asn: 64512
peer-asn: 64512
peer-address: 192.168.7.10
address-pools:
- name: my-ip-space
protocol: bgp
avoid-buggy-ips: true
addresses:
- 198.51.100.0/24
EOF
oc create -f /data/install/metal-bgp.yaml
# to restore
oc delete -f /data/install/metal-bgp.yaml
Back on the bgp-router, check the routing state
# back to bgp-router vm
vtysh
bgp-router# show ip bgp summary
IPv4 Unicast Summary:
BGP router identifier 192.168.7.10, local AS number 64512 vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 2, using 43 KiB of memory
Peer groups 1, using 64 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt
*192.168.7.13 4 64512 2 2 0 0 0 00:00:25 0 0
*192.168.7.16 4 64512 2 2 0 0 0 00:00:25 0 0
Total number of neighbors 2
* - dynamic neighbor
2 dynamic neighbor(s), limit 200
We can see that the 2 nodes in the cluster have each established a peer relationship with the router.
Create a test application
# back to helper vm
cat << EOF > /data/install/demo.yaml
---
apiVersion: v1
kind: Pod
metadata:
name: test-0
labels:
env: test
spec:
restartPolicy: OnFailure
nodeSelector:
kubernetes.io/hostname: 'master-0'
containers:
- name: php
image: "quay.io/wangzheng422/php:demo.02"
---
apiVersion: v1
kind: Pod
metadata:
name: test-1
labels:
env: test
spec:
restartPolicy: OnFailure
nodeSelector:
kubernetes.io/hostname: 'worker-0'
containers:
- name: php
image: "quay.io/wangzheng422/php:demo.02"
---
kind: Service
apiVersion: v1
metadata:
name: demo
spec:
type: LoadBalancer
ports:
- name: "http"
protocol: TCP
port: 80
targetPort: 80
selector:
env: test
EOF
oc create -f /data/install/demo.yaml
# to restore
oc delete -f /data/install/demo.yaml
oc get all
# NAME READY STATUS RESTARTS AGE
# pod/mypod-787d79b456-4f4xr 1/1 Running 3 3d23h
# pod/test-0 1/1 Running 0 2m28s
# pod/test-1 1/1 Running 0 2m28s
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# service/demo LoadBalancer 172.30.82.87 198.51.100.1 80:32203/TCP 2m28s
# service/kubernetes ClusterIP 172.30.0.1 <none> 443/TCP 4d22h
# service/openshift ExternalName <none> kubernetes.default.svc.cluster.local <none> 4d22h
# NAME READY UP-TO-DATE AVAILABLE AGE
# deployment.apps/mypod 1/1 1 1 3d23h
# NAME DESIRED CURRENT READY AGE
# replicaset.apps/mypod-787d79b456 1 1 1 3d23h
oc get pod -o wide
# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
# mypod-787d79b456-4f4xr 1/1 Running 3 4d 10.254.1.2 worker-0 <none> <none>
# test-0 1/1 Running 0 8m38s 10.254.0.66 master-0 <none> <none>
# test-1 1/1 Running 0 8m38s 10.254.1.230 worker-0 <none> <none>
oc get svc/demo -o yaml
# apiVersion: v1
# kind: Service
# metadata:
# creationTimestamp: "2021-08-30T12:42:21Z"
# name: demo
# namespace: default
# resourceVersion: "2046159"
# uid: 1af07435-5234-4062-994d-4715453118c6
# spec:
# clusterIP: 172.30.82.87
# clusterIPs:
# - 172.30.82.87
# externalTrafficPolicy: Cluster
# ipFamilies:
# - IPv4
# ipFamilyPolicy: SingleStack
# ports:
# - name: http
# nodePort: 32203
# port: 80
# protocol: TCP
# targetPort: 80
# selector:
# env: test
# sessionAffinity: None
# type: LoadBalancer
# status:
# loadBalancer:
# ingress:
# - ip: 198.51.100.1
Back on the bgp-router, check the route updates
# back to bgp-router
bgp-router# show ip bgp summary
IPv4 Unicast Summary:
BGP router identifier 192.168.7.10, local AS number 64512 vrf-id 0
BGP table version 1
RIB entries 1, using 192 bytes of memory
Peers 2, using 43 KiB of memory
Peer groups 1, using 64 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt
*192.168.7.13 4 64512 73 72 0 0 0 00:35:16 1 0
*192.168.7.16 4 64512 73 72 0 0 0 00:35:16 1 0
Total number of neighbors 2
* - dynamic neighbor
2 dynamic neighbor(s), limit 200
bgp-router# show ip bgp neighbors 192.168.7.13 routes
BGP table version is 1, local router ID is 192.168.7.10, vrf id 0
Default local pref 100, local AS 64512
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*>i198.51.100.1/32 192.168.7.13 0 0 ?
Displayed 1 routes and 2 total paths
bgp-router#
bgp-router# show ip bgp neighbors 192.168.7.16 routes
BGP table version is 1, local router ID is 192.168.7.10, vrf id 0
Default local pref 100, local AS 64512
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*=i198.51.100.1/32 192.168.7.16 0 0 ?
Displayed 1 routes and 2 total paths
Check the routes from the router's shell
ip r
# default via 172.21.6.254 dev enp1s0 proto static metric 100
# 172.21.6.0/24 dev enp1s0 proto kernel scope link src 172.21.6.10 metric 100
# 192.168.7.0/24 dev enp1s0 proto kernel scope link src 192.168.7.10 metric 100
# 198.51.100.1 proto bgp metric 20
# nexthop via 192.168.7.13 dev enp1s0 weight 1
# nexthop via 192.168.7.16 dev enp1s0 weight 1
[root@bgp-router ~]# curl 198.51.100.1 && echo
Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.0.66
[root@bgp-router ~]# curl 198.51.100.1 && echo
Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.0.66
[root@bgp-router ~]# curl 198.51.100.1 && echo
Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.0.66
[root@bgp-router ~]# curl 198.51.100.1 && echo
Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.0.66
[root@bgp-router ~]# curl 198.51.100.1 && echo
Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.1.230
[root@bgp-router ~]# curl 198.51.100.1 && echo
Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.0.66
[root@bgp-router ~]# curl 198.51.100.1 && echo
Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.0.66
[root@bgp-router ~]# curl 198.51.100.1 && echo
Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.0.66
[root@bgp-router ~]# curl 198.51.100.1 && echo
Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.1.230
[root@bgp-router ~]# curl 198.51.100.1 && echo
Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>10.254.0.66
Go to worker-0 and look at the nft rules
# go to worker-0 to analyze the nat rules
nft list ruleset | grep 198.51
# meta l4proto tcp ip daddr 198.51.100.1 tcp dport 80 counter packets 0 bytes 0 jump KUBE-FW-CTBMGJDNUDRWEDVR
nft list ruleset | grep KUBE-FW-CTBMGJDNUDRWEDVR -A 5
# meta l4proto tcp ip daddr 198.51.100.1 tcp dport 80 counter packets 0 bytes 0 jump KUBE-FW-CTBMGJDNUDRWEDVR
# meta l4proto tcp @nh,96,16 != 2814 ip daddr 172.30.145.124 tcp dport 443 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
# meta l4proto tcp ip daddr 172.30.145.124 tcp dport 443 counter packets 0 bytes 0 jump KUBE-SVC-L54HVQEJKTL2PXFK
# meta l4proto tcp @nh,96,16 != 2814 ip daddr 172.30.16.253 tcp dport 8443 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
# meta l4proto tcp ip daddr 172.30.16.253 tcp dport 8443 counter packets 0 bytes 0 jump KUBE-SVC-YVQ2VVJT4ABSS56R
# meta l4proto tcp @nh,96,16 != 2814 ip daddr 172.30.185.119 tcp dport 9091 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
# --
# chain KUBE-FW-CTBMGJDNUDRWEDVR {
# counter packets 0 bytes 0 jump KUBE-MARK-MASQ
# counter packets 0 bytes 0 jump KUBE-SVC-CTBMGJDNUDRWEDVR
# counter packets 0 bytes 0 jump KUBE-MARK-DROP
# }
nft list ruleset | grep KUBE-SVC-CTBMGJDNUDRWEDVR -A 3
# meta l4proto tcp ip daddr 172.30.82.87 tcp dport 80 counter packets 0 bytes 0 jump KUBE-SVC-CTBMGJDNUDRWEDVR
# meta l4proto tcp ip daddr 198.51.100.1 tcp dport 80 counter packets 11 bytes 660 jump KUBE-FW-CTBMGJDNUDRWEDVR
# meta l4proto tcp @nh,96,16 != 2814 ip daddr 172.30.145.124 tcp dport 443 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
# meta l4proto tcp ip daddr 172.30.145.124 tcp dport 443 counter packets 0 bytes 0 jump KUBE-SVC-L54HVQEJKTL2PXFK
# --
# meta l4proto tcp tcp dport 32203 counter packets 0 bytes 0 jump KUBE-SVC-CTBMGJDNUDRWEDVR
# }
# chain KUBE-SVC-DCLNKYLNAMROIJRV {
# --
# chain KUBE-SVC-CTBMGJDNUDRWEDVR {
# counter packets 9 bytes 540 jump KUBE-SEP-BKD3LMWAJNKW5GNU
# counter packets 2 bytes 120 jump KUBE-SEP-M5WVBCWAFJ2J2M2U
# }
# --
# counter packets 11 bytes 660 jump KUBE-SVC-CTBMGJDNUDRWEDVR
# counter packets 0 bytes 0 jump KUBE-MARK-DROP
# }
nft list ruleset | grep KUBE-SEP-BKD3LMWAJNKW5GNU -A 3
# counter packets 9 bytes 540 jump KUBE-SEP-BKD3LMWAJNKW5GNU
# counter packets 2 bytes 120 jump KUBE-SEP-M5WVBCWAFJ2J2M2U
# }
# --
# chain KUBE-SEP-BKD3LMWAJNKW5GNU {
# ip saddr 10.254.0.66 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
# meta l4proto tcp counter packets 9 bytes 540 dnat to 10.254.0.66:80
# }
nft list ruleset | grep KUBE-SEP-M5WVBCWAFJ2J2M2U -A 3
# counter packets 2 bytes 120 jump KUBE-SEP-M5WVBCWAFJ2J2M2U
# }
# chain KUBE-FW-CTBMGJDNUDRWEDVR {
# --
# chain KUBE-SEP-M5WVBCWAFJ2J2M2U {
# ip saddr 10.254.1.230 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
# meta l4proto tcp counter packets 2 bytes 120 dnat to 10.254.1.230:80
# }
Kata / sandbox container in openshift 4.8
The Red Hat OpenShift 4.8 container platform now supports Kata, also called sandboxed containers: a technique that starts a VM on the physical machine and runs the container processes inside that VM. The motivation is to improve security further and remove users' concerns about container escape. It is still Tech Preview, but we can already take a close look.
https://docs.openshift.com/container-platform/4.8/sandboxed_containers/understanding-sandboxed-containers.html
Video walkthrough:
First we install it: in OperatorHub, select the sandboxed containers operator and click install.
Then create a KataConfig in the operator; the defaults are fine, and since this is still Tech Preview there is not much to tune.
Once created, the kata operator lays down some configuration in the system; let's walk through it piece by piece.
# first is the runtime class, which declares that pods may use kata as their runtime
# note the overhead section: because kata runs qemu as a virtual machine there is some extra resource cost,
# and that cost has to be accounted for during scheduling, so it is configured statically here...
# not very flexible in my opinion, but that is how it is for now
oc get runtimeclass/kata -o yaml
# apiVersion: node.k8s.io/v1
# handler: kata
# kind: RuntimeClass
# metadata:
# name: kata
# overhead:
# podFixed:
# cpu: 250m
# memory: 350Mi
# scheduling:
# nodeSelector:
# node-role.kubernetes.io/worker: ""
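With the runtime class in place, a workload opts in simply by setting runtimeClassName. A minimal sketch (the pod name and image are just illustrative):
cat << EOF > /data/install/kata-demo.yaml
apiVersion: v1
kind: Pod
metadata:
  name: kata-demo
spec:
  runtimeClassName: kata
  containers:
    - name: php
      image: "quay.io/wangzheng422/php:demo.02"
EOF
oc create -f /data/install/kata-demo.yaml
# the pod runs inside a qemu micro-vm on a worker node
oc get pod kata-demo -o wide
# to restore
oc delete -f /data/install/kata-demo.yaml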
# ocp rolls kata out to the nodes via a machine config
oc get mc
# NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE
# 00-master 723a8a4992f42530af95202e51e5a940d2a3d169 3.2.0 15h
# 00-worker 723a8a4992f42530af95202e51e5a940d2a3d169 3.2.0 15h
# 01-master-container-runtime 723a8a4992f42530af95202e51e5a940d2a3d169 3.2.0 15h
# 01-master-kubelet 723a8a4992f42530af95202e51e5a940d2a3d169 3.2.0 15h
# 01-worker-container-runtime 723a8a4992f42530af95202e51e5a940d2a3d169 3.2.0 15h
# 01-worker-kubelet 723a8a4992f42530af95202e51e5a940d2a3d169 3.2.0 15h
# 50-enable-sandboxed-containers-extension 3.2.0 51m
# 99-master-chrony-configuration 2.2.0 15h
# 99-master-container-registries 3.1.0 15h
# 99-master-generated-registries 723a8a4992f42530af95202e51e5a940d2a3d169 3.2.0 15h
# 99-master-ssh 3.2.0 15h
# 99-worker-chrony-configuration 2.2.0 15h
# 99-worker-container-registries 3.1.0 15h
# 99-worker-generated-registries 723a8a4992f42530af95202e51e5a940d2a3d169 3.2.0 15h
# 99-worker-ssh 3.2.0 15h
# rendered-master-8c1e34a69aa4b919b6f2eec350570491 723a8a4992f42530af95202e51e5a940d2a3d169 3.2.0 15h
# rendered-worker-4afd90ddf39588aae385def4519e8da9 723a8a4992f42530af95202e51e5a940d2a3d169 3.2.0 51m
# rendered-worker-5abff4814eef2f9bc7535e5cbb10564c 723a8a4992f42530af95202e51e5a940d2a3d169 3.2.0 15h
# so what is inside this machine config? let's take a look
# it turns out it just adds an extension,
# and from reading the source code, this sandboxed-containers extension maps to the kata-containers rpm
oc get mc/50-enable-sandboxed-containers-extension -o yaml
# apiVersion: machineconfiguration.openshift.io/v1
# kind: MachineConfig
# metadata:
# labels:
# app: example-kataconfig
# machineconfiguration.openshift.io/role: worker
# name: 50-enable-sandboxed-containers-extension
# spec:
# config:
# ignition:
# version: 3.2.0
# extensions:
# - sandboxed-containers
# on worker-0 we can confirm that a new kata-containers rpm has indeed been installed
rpm-ostree status
# State: idle
# Deployments:
# ● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6ddc94ab09a4807ea3d1f29a922fe15f0b4ee863529258c486a04e7fb7b95a4b
# CustomOrigin: Managed by machine-config-operator
# Version: 48.84.202108161759-0 (2021-08-16T18:03:02Z)
# LayeredPackages: kata-containers
# pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6ddc94ab09a4807ea3d1f29a922fe15f0b4ee863529258c486a04e7fb7b95a4b
# CustomOrigin: Managed by machine-config-operator
# Version: 48.84.202108161759-0 (2021-08-16T18:03:02Z)
# Let's see which files the kata-containers rpm provides
rpm -ql kata-containers
# /etc/crio/crio.conf.d/50-kata
# /usr/bin/containerd-shim-kata-v2
# /usr/bin/kata-collect-data.sh
# /usr/bin/kata-monitor
# /usr/bin/kata-runtime
# /usr/lib/.build-id
# /usr/lib/.build-id/0f
# /usr/lib/.build-id/0f/dc6751937c4b54a2e10ed431f7969bfd85d2d7
# /usr/lib/.build-id/5e
# /usr/lib/.build-id/5e/ad1e1eca5ab8111a23bf094caf6acbd3b9d7af
# /usr/lib/.build-id/67
# /usr/lib/.build-id/67/e5107c68c0e147f24f6e8f4e96104564b8f223
# /usr/lib/.build-id/be
# /usr/lib/.build-id/be/0add7df48b5f06a305e95497355666a1e04e39
# /usr/lib/systemd/system/kata-osbuilder-generate.service
# /usr/libexec/kata-containers
# /usr/libexec/kata-containers/VERSION
# /usr/libexec/kata-containers/agent
# /usr/libexec/kata-containers/agent/usr
# /usr/libexec/kata-containers/agent/usr/bin
# /usr/libexec/kata-containers/agent/usr/bin/kata-agent
# /usr/libexec/kata-containers/agent/usr/lib
# /usr/libexec/kata-containers/agent/usr/lib/systemd
# /usr/libexec/kata-containers/agent/usr/lib/systemd/system
# /usr/libexec/kata-containers/agent/usr/lib/systemd/system/kata-agent.service
# /usr/libexec/kata-containers/agent/usr/lib/systemd/system/kata-containers.target
# /usr/libexec/kata-containers/kata-netmon
# /usr/libexec/kata-containers/osbuilder
# /usr/libexec/kata-containers/osbuilder/dracut
# /usr/libexec/kata-containers/osbuilder/dracut/dracut.conf.d
# /usr/libexec/kata-containers/osbuilder/dracut/dracut.conf.d/05-base.conf
# /usr/libexec/kata-containers/osbuilder/dracut/dracut.conf.d/15-dracut-rhel.conf
# /usr/libexec/kata-containers/osbuilder/initrd-builder
# /usr/libexec/kata-containers/osbuilder/initrd-builder/README.md
# /usr/libexec/kata-containers/osbuilder/initrd-builder/initrd_builder.sh
# /usr/libexec/kata-containers/osbuilder/kata-osbuilder.sh
# /usr/libexec/kata-containers/osbuilder/nsdax
# /usr/libexec/kata-containers/osbuilder/rootfs-builder
# /usr/libexec/kata-containers/osbuilder/rootfs-builder/README.md
# /usr/libexec/kata-containers/osbuilder/rootfs-builder/rootfs.sh
# /usr/libexec/kata-containers/osbuilder/scripts
# /usr/libexec/kata-containers/osbuilder/scripts/lib.sh
# /usr/share/bash-completion/completions/kata-runtime
# /usr/share/doc/kata-containers
# /usr/share/doc/kata-containers/CONTRIBUTING.md
# /usr/share/doc/kata-containers/README.md
# /usr/share/kata-containers
# /usr/share/kata-containers/defaults
# /usr/share/kata-containers/defaults/configuration.toml
# /usr/share/licenses/kata-containers
# /usr/share/licenses/kata-containers/LICENSE
# /var/cache/kata-containers
# Let's look at the VM image used by kata-containers
ls -Rl /var/cache/kata-containers
# /var/cache/kata-containers:
# total 0
# lrwxrwxrwx. 1 root root 121 Aug 26 05:22 kata-containers-initrd.img -> '/var/cache/kata-containers/osbuilder-images/4.18.0-305.12.1.el8_4.x86_64/"rhcos"-kata-4.18.0-305.12.1.el8_4.x86_64.initrd'
# drwxr-xr-x. 3 root root 42 Aug 26 05:22 osbuilder-images
# lrwxrwxrwx. 1 root root 50 Aug 26 05:22 vmlinuz.container -> /lib/modules/4.18.0-305.12.1.el8_4.x86_64//vmlinuz
# /var/cache/kata-containers/osbuilder-images:
# total 0
# drwxr-xr-x. 2 root root 62 Aug 26 05:22 4.18.0-305.12.1.el8_4.x86_64
# /var/cache/kata-containers/osbuilder-images/4.18.0-305.12.1.el8_4.x86_64:
# total 19224
# -rw-r--r--. 1 root root 19682871 Aug 26 05:22 '"rhcos"-kata-4.18.0-305.12.1.el8_4.x86_64.initrd'
# Let's look at where kata hooks into cri-o, namely cri-o's drop-in configuration file
cat /etc/crio/crio.conf.d/50-kata
# [crio.runtime.runtimes.kata]
# runtime_path = "/usr/bin/containerd-shim-kata-v2"
# runtime_type = "vm"
# runtime_root = "/run/vc"
# privileged_without_host_devices = true
# As we can see, at boot time a guest VM image for kata is built from the currently running operating system.
# If a project needs it later, this is the step to customize in order to produce the guest image a customer wants.
systemctl cat kata-osbuilder-generate.service
# # /usr/lib/systemd/system/kata-osbuilder-generate.service
# [Unit]
# Description=Generate Kata appliance image for host kernel
# [Service]
# Type=oneshot
# ExecStart=/usr/libexec/kata-containers/osbuilder/kata-osbuilder.sh -c
# ExecReload=/usr/libexec/kata-containers/osbuilder/kata-osbuilder.sh
# [Install]
# WantedBy=kubelet.service
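If a customized guest image is ever needed, the generator can simply be re-run by hand on the node; a minimal sketch, using only the paths already listed by the rpm above:
# re-run the same command the service uses (see ExecStart above)
/usr/libexec/kata-containers/osbuilder/kata-osbuilder.sh -c
# or just restart the generator service
systemctl restart kata-osbuilder-generate.service
# the resulting initrd ends up under /var/cache/kata-containers, as shown earlier
ls -l /var/cache/kata-containers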
# Let's create a pod and test it.
cat << EOF > /data/install/kata.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: mypod
labels:
app: mypod
spec:
replicas: 1
selector:
matchLabels:
app: mypod
template:
metadata:
labels:
app: mypod
spec:
runtimeClassName: kata
containers:
- name: mypod
image: quay.io/wangzheng422/qimgs:centos7-test
command:
- sleep
- infinity
EOF
oc create -f /data/install/kata.yaml
# to restore
oc delete -f /data/install/kata.yaml
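Before jumping onto worker-0, a quick check from the bastion that the pod really got the kata runtime (a sketch; the label comes from the deployment above):
oc get pod -l app=mypod -o wide
oc get pod -l app=mypod -o jsonpath='{.items[0].spec.runtimeClassName}{"\n"}'
# kata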
# On worker-0, we can see the qemu process.
ps aufx ww | grep qemu
# root 99994 0.0 0.0 12816 1076 pts/0 S+ 06:22 0:00 \_ grep --color=auto qemu
# root 93561 1.3 0.9 2466300 326724 ? Sl 06:19 0:03 /usr/libexec/qemu-kiwi -name sandbox-42f003b365352a71ab87e8a1f49b1c301b6c3c856ec5520b4986aa8b9e43151f -uuid 1cd86e5c-3f86-45e8-bce2-96b16dce635a -machine q35,accel=kvm,kernel_irqchip -cpu host,pmu=off -qmp unix:/run/vc/vm/42f003b365352a71ab87e8a1f49b1c301b6c3c856ec5520b4986aa8b9e43151f/qmp.sock,server=on,wait=off -m 2048M,slots=10,maxmem=33122M -device pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2 -device virtio-serial-pci,disable-modern=false,id=serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/42f003b365352a71ab87e8a1f49b1c301b6c3c856ec5520b4986aa8b9e43151f/console.sock,server=on,wait=off -device virtio-scsi-pci,id=scsi0,disable-modern=false -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0 -device vhost-vsock-pci,disable-modern=false,vhostfd=3,id=vsock-976011602,guest-cid=976011602 -chardev socket,id=char-b4b86634faff36bb,path=/run/vc/vm/42f003b365352a71ab87e8a1f49b1c301b6c3c856ec5520b4986aa8b9e43151f/vhost-fs.sock -device vhost-user-fs-pci,chardev=char-b4b86634faff36bb,tag=kataShared -netdev tap,id=network-0,vhost=on,vhostfds=4,fds=5 -device driver=virtio-net-pci,netdev=network-0,mac=0a:58:0a:fe:01:1a,disable-modern=false,mq=on,vectors=4 -rtc base=utc,driftfix=slew,clock=host -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic --no-reboot -daemonize -object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on -numa node,memdev=dimm1 -kernel /usr/lib/modules/4.18.0-305.12.1.el8_4.x86_64/vmlinuz -initrd /var/cache/kata-containers/osbuilder-images/4.18.0-305.12.1.el8_4.x86_64/"rhcos"-kata-4.18.0-305.12.1.el8_4.x86_64.initrd -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 cryptomgr.notests net.ifnames=0 pci=lastbus=0 quiet panic=1 nr_cpus=24 scsi_mod.scan=none -pidfile /run/vc/vm/42f003b365352a71ab87e8a1f49b1c301b6c3c856ec5520b4986aa8b9e43151f/pid -smp 1,cores=1,threads=1,sockets=24,maxcpus=24
# We are curious about kata's detailed configuration, so let's find its configuration files
kata-runtime --show-default-config-paths
# /etc/kata-containers/configuration.toml
# /usr/share/kata-containers/defaults/configuration.toml
# Let's look at the contents of the kata configuration file
cat /usr/share/kata-containers/defaults/configuration.toml
# And the configuration as seen by the kata runtime itself
kata-runtime env
# [Meta]
# Version = "1.0.25"
# [Runtime]
# Debug = false
# Trace = false
# DisableGuestSeccomp = true
# DisableNewNetNs = false
# SandboxCgroupOnly = true
# Path = "/usr/bin/kata-runtime"
# [Runtime.Version]
# OCI = "1.0.1-dev"
# [Runtime.Version.Version]
# Semver = "2.1.0"
# Major = 2
# Minor = 1
# Patch = 0
# Commit = "fa7b9408555e863d0f36f7d0640134069b0c70c8"
# [Runtime.Config]
# Path = "/usr/share/kata-containers/defaults/configuration.toml"
# [Hypervisor]
# MachineType = "q35"
# Version = "QEMU emulator version 5.2.0 (qemu-kvm-5.2.0-16.module+el8.4.0+11536+725e25d9.2)\nCopyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers"
# Path = "/usr/libexec/qemu-kiwi"
# BlockDeviceDriver = "virtio-scsi"
# EntropySource = "/dev/urandom"
# SharedFS = "virtio-fs"
# VirtioFSDaemon = "/usr/libexec/virtiofsd"
# Msize9p = 8192
# MemorySlots = 10
# PCIeRootPort = 0
# HotplugVFIOOnRootBus = false
# Debug = false
# [Image]
# Path = ""
# [Kernel]
# Path = "/usr/lib/modules/4.18.0-305.12.1.el8_4.x86_64/vmlinuz"
# Parameters = "scsi_mod.scan=none"
# [Initrd]
# Path = "/var/cache/kata-containers/osbuilder-images/4.18.0-305.12.1.el8_4.x86_64/\"rhcos\"-kata-4.18.0-305.12.1.el8_4.x86_64.initrd"
# [Agent]
# Debug = false
# Trace = false
# TraceMode = ""
# TraceType = ""
# [Host]
# Kernel = "4.18.0-305.12.1.el8_4.x86_64"
# Architecture = "amd64"
# VMContainerCapable = true
# SupportVSocks = true
# [Host.Distro]
# Name = "Red Hat Enterprise Linux CoreOS"
# Version = "4.8"
# [Host.CPU]
# Vendor = "GenuineIntel"
# Model = "Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz"
# CPUs = 24
# [Host.Memory]
# Total = 32868716
# Free = 27704960
# Available = 29880404
# [Netmon]
# Path = "/usr/libexec/kata-containers/kata-netmon"
# Debug = false
# Enable = false
# [Netmon.Version]
# Semver = "2.1.0"
# Major = 2
# Minor = 1
# Patch = 0
# Commit = "<<unknown>>"
# Let's look at the script that builds the kata VM image
cat /usr/libexec/kata-containers/osbuilder/kata-osbuilder.sh
try to debug
# try to debug
# To get inside the kata VM, we need to modify the kata configuration and enable the debug console
mkdir -p /etc/kata-containers/
install -o root -g root -m 0640 /usr/share/kata-containers/defaults/configuration.toml /etc/kata-containers
sed -i -e 's/^# *\(debug_console_enabled\).*=.*$/\1 = true/g' /etc/kata-containers/configuration.toml
# Then restart the pod, and we can connect straight into the kata VM.
# ps -ef | grep qemu-kiwi | sed 's/.* sandbox-\([^ ]*\) .*/\1/p' | grep -v qemu-kiwi
KATA_PID=`ps -ef | grep qemu-kiwi | sed 's/.* sandbox-\([^ ]*\) .*/\1/g' | grep -v qemu-kiwi`
kata-runtime exec $KATA_PID
in the kata vm
# Inside the VM is an extremely stripped-down system; most commands are missing
bash-4.4# cd /etc
# There is not even ls; echo * has to do instead.
bash-4.4# echo *
chrony.conf cmdline.d conf.d group ld.so.cache ld.so.conf ld.so.conf.d machine-id modules-load.d passwd resolv.conf systemd udev virc
# As you can see, the OS matches the host, because the image was built at boot time from the host's kernel
bash-4.4# uname -a
Linux mypod-787d79b456-4f4xr 4.18.0-305.12.1.el8_4.x86_64 #1 SMP Mon Jul 26 08:06:24 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
# Check which kernel modules are loaded
bash-4.4# lsmod
Module Size Used by
mcryptd 16384 0
virtio_blk 20480 0
virtio_console 36864 0
virtio_net 53248 0
net_failover 24576 1 virtio_net
sg 40960 0
virtio_scsi 20480 0
virtiofs 28672 1
failover 16384 1 net_failover
vmw_vsock_virtio_transport 16384 2
vmw_vsock_virtio_transport_common 32768 1 vmw_vsock_virtio_transport
vsock 45056 10 vmw_vsock_virtio_transport_common,vmw_vsock_virtio_transport
fuse 151552 1 virtiofs
# Check what is mounted
bash-4.4# mount
rootfs on / type rootfs (rw,size=964048k,nr_inodes=241012)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
devtmpfs on /dev type devtmpfs (rw,nosuid,size=964064k,nr_inodes=241016,mode=755)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev)
configfs on /sys/kernel/config type configfs (rw,relatime)
nsfs on /run/sandbox-ns/ipc type nsfs (rw)
nsfs on /run/sandbox-ns/uts type nsfs (rw)
kataShared on /run/kata-containers/shared/containers type virtiofs (rw,relatime)
shm on /run/kata-containers/sandbox/shm type tmpfs (rw,relatime)
tmpfs on /etc/resolv.conf type tmpfs (rw,nosuid,nodev,mode=755)
kataShared on /run/kata-containers/8330bf4c2a98360975ce16244af81c4a5dfa74d4ea3c8a520d9244f0c14e541b/rootfs type virtiofs (rw,relatime)
kataShared on /run/kata-containers/bc201bf92ec8dcad3435ff4191912a41efb64a1e0fb463ad4a651b4dea94a8a5/rootfs type virtiofs (rw,relatime)
# Check which processes are running
bash-4.4# ps efx ww
PID TTY STAT TIME COMMAND
2 ? S 0:00 [kthreadd]
3 ? I< 0:00 \_ [rcu_gp]
4 ? I< 0:00 \_ [rcu_par_gp]
6 ? I< 0:00 \_ [kworker/0:0H-events_highpri]
7 ? I 0:00 \_ [kworker/0:1-virtio_vsock]
8 ? I 0:00 \_ [kworker/u48:0-events_unbound]
9 ? I< 0:00 \_ [mm_percpu_wq]
10 ? S 0:00 \_ [ksoftirqd/0]
11 ? I 0:00 \_ [rcu_sched]
12 ? S 0:00 \_ [migration/0]
13 ? S 0:00 \_ [watchdog/0]
14 ? S 0:00 \_ [cpuhp/0]
16 ? S 0:00 \_ [kdevtmpfs]
17 ? I< 0:00 \_ [netns]
18 ? S 0:00 \_ [kauditd]
19 ? S 0:00 \_ [khungtaskd]
20 ? S 0:00 \_ [oom_reaper]
21 ? I< 0:00 \_ [writeback]
22 ? S 0:00 \_ [kcompactd0]
23 ? SN 0:00 \_ [ksmd]
24 ? SN 0:00 \_ [khugepaged]
25 ? I< 0:00 \_ [crypto]
26 ? I< 0:00 \_ [kintegrityd]
27 ? I< 0:00 \_ [kblockd]
28 ? I< 0:00 \_ [blkcg_punt_bio]
29 ? I< 0:00 \_ [tpm_dev_wq]
30 ? I< 0:00 \_ [md]
31 ? I< 0:00 \_ [edac-poller]
32 ? S 0:00 \_ [watchdogd]
33 ? I< 0:00 \_ [kworker/0:1H]
35 ? I 0:00 \_ [kworker/u48:1]
49 ? S 0:00 \_ [kswapd0]
132 ? I< 0:00 \_ [kthrotld]
133 ? I< 0:00 \_ [acpi_thermal_pm]
134 ? S 0:00 \_ [hwrng]
135 ? I< 0:00 \_ [kmpath_rdacd]
136 ? I< 0:00 \_ [kaluad]
137 ? I< 0:00 \_ [ipv6_addrconf]
138 ? I< 0:00 \_ [kstrp]
203 ? I 0:00 \_ [kworker/0:3-mm_percpu_wq]
206 ? S 0:00 \_ [scsi_eh_0]
207 ? I< 0:00 \_ [scsi_tmf_0]
218 ? S 0:00 \_ [khvcd]
1 ? Ss 0:00 /init HOME=/ TERM=linux
193 ? Ss 0:00 /usr/lib/systemd/systemd-journald PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin NOTIFY_SOCKET=/run/systemd/notify LISTEN_PID=193 LISTEN_FDS=3 LISTEN_FDNAMES=systemd-journald-dev-log.socket:systemd-journald.socket:systemd-journald.socket WATCHDOG_PID=193 WATCHDOG_USEC=180000000 INVOCATION_ID=00385279d7314bf5a02002d5f1e33050
201 ? Ss 0:00 /usr/lib/systemd/systemd-udevd PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin NOTIFY_SOCKET=/run/systemd/notify LISTEN_PID=201 LISTEN_FDS=2 LISTEN_FDNAMES=systemd-udevd-kernel.socket:systemd-udevd-control.socket WATCHDOG_PID=201 WATCHDOG_USEC=180000000 INVOCATION_ID=b3e4a3cd29b34c91a192bc9527da10cf JOURNAL_STREAM=9:10719
225 ? Ssl 0:02 /usr/bin/kata-agent PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin INVOCATION_ID=5683abfd11c542fe98c5f7ece1afa599 TERM=vt220
231 ? S 0:00 \_ /usr/bin/pod PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TERM=xterm HOME=/root
235 ? S 0:00 \_ sleep infinity PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TERM=xterm HOSTNAME=mypod-787d79b456-4f4xr NSS_SDB_USE_CACHE=no KUBERNETES_SERVICE_HOST=172.30.0.1 KUBERNETES_SERVICE_PORT=443 KUBERNETES_SERVICE_PORT_HTTPS=443 KUBERNETES_PORT=tcp://172.30.0.1:443 KUBERNETES_PORT_443_TCP=tcp://172.30.0.1:443 KUBERNETES_PORT_443_TCP_PROTO=tcp KUBERNETES_PORT_443_TCP_PORT=443 KUBERNETES_PORT_443_TCP_ADDR=172.30.0.1 HOME=/root
236 pts/0 Ss 0:00 \_ [bash] PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin INVOCATION_ID=5683abfd11c542fe98c5f7ece1afa599 TERM=vt220 RUST_BACKTRACE=full
268 pts/0 R+ 0:00 | \_ ps efx ww RUST_BACKTRACE=full INVOCATION_ID=5683abfd11c542fe98c5f7ece1afa599 PWD=/proc/net TERM=vt220 SHLVL=1 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin OLDPWD=/proc _=/usr/bin/ps
247 pts/1 Ss+ 0:00 \_ /bin/sh TERM=screen-256color HOSTNAME=mypod-787d79b456-4f4xr KUBERNETES_PORT_443_TCP_PORT=443 KUBERNETES_PORT=tcp://172.30.0.1:443 KUBERNETES_SERVICE_PORT=443 KUBERNETES_SERVICE_HOST=172.30.0.1 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin PWD=/ SHLVL=1 HOME=/root KUBERNETES_PORT_443_TCP_PROTO=tcp KUBERNETES_SERVICE_PORT_HTTPS=443 NSS_SDB_USE_CACHE=no KUBERNETES_PORT_443_TCP_ADDR=172.30.0.1 KUBERNETES_PORT_443_TCP=tcp://172.30.0.1:443 _=/bin/sh
# Check how much memory there is
bash-4.4# free -h
total used free shared buff/cache available
Mem: 1.9Gi 30Mi 1.8Gi 58Mi 72Mi 1.7Gi
Swap: 0B 0B 0B
# Check the kernel boot parameters
bash-4.4# cat cmdline
tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 cryptomgr.notests net.ifnames=0 pci=lastbus=0 quiet panic=1 nr_cpus=24 scsi_mod.scan=none agent.debug_console agent.debug_console_vport=1026
# There is no ip command, so we make do with the kernel interface to check the local IP addresses
bash-4.4# cat /proc/net/fib_trie
Main:
+-- 0.0.0.0/0 3 0 4
+-- 0.0.0.0/4 2 0 2
|-- 0.0.0.0
/0 universe UNICAST
+-- 10.254.0.0/23 2 0 1
|-- 10.254.0.0
/16 universe UNICAST
+-- 10.254.1.0/28 2 0 2
|-- 10.254.1.0
/32 link BROADCAST
/24 link UNICAST
|-- 10.254.1.14
/32 host LOCAL
|-- 10.254.1.255
/32 link BROADCAST
+-- 127.0.0.0/8 2 0 2
+-- 127.0.0.0/31 1 0 0
|-- 127.0.0.0
/32 link BROADCAST
/8 host LOCAL
|-- 127.0.0.1
/32 host LOCAL
|-- 127.255.255.255
/32 link BROADCAST
|-- 172.30.0.0
/16 universe UNICAST
|-- 224.0.0.0
/4 universe UNICAST
Local:
+-- 0.0.0.0/0 3 0 4
+-- 0.0.0.0/4 2 0 2
|-- 0.0.0.0
/0 universe UNICAST
+-- 10.254.0.0/23 2 0 1
|-- 10.254.0.0
/16 universe UNICAST
+-- 10.254.1.0/28 2 0 2
|-- 10.254.1.0
/32 link BROADCAST
/24 link UNICAST
|-- 10.254.1.14
/32 host LOCAL
|-- 10.254.1.255
/32 link BROADCAST
+-- 127.0.0.0/8 2 0 2
+-- 127.0.0.0/31 1 0 0
|-- 127.0.0.0
/32 link BROADCAST
/8 host LOCAL
|-- 127.0.0.1
/32 host LOCAL
|-- 127.255.255.255
/32 link BROADCAST
|-- 172.30.0.0
/16 universe UNICAST
|-- 224.0.0.0
/4 universe UNICAST
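As an aside, on a machine that does have awk, the host-local addresses can be pulled out of fib_trie with a one-liner like the following (a sketch; the stripped-down kata guest above may well lack awk too):
# print the address that precedes each '/32 host LOCAL' entry
awk '/32 host LOCAL/ {print addr} {addr=$NF}' /proc/net/fib_trie | sort -u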
# Check the systemd units
bash-4.4# systemctl list-units
UNIT LOAD ACTIVE SUB DESCRIPTION
sys-devices-pci0000:00-0000:00:01.0-virtio0-virtio\x2dports-vport0p0.device loaded active plugged /sys/devices/pci0000:00/0000:00:01.0/virtio0/virtio-ports/vport0p0
sys-devices-pci0000:00-0000:00:07.0-virtio5-net-eth0.device loaded active plugged /sys/devices/pci0000:00/0000:00:07.0/virtio5/net/eth0
sys-devices-platform-serial8250-tty-ttyS0.device loaded active plugged /sys/devices/platform/serial8250/tty/ttyS0
sys-devices-platform-serial8250-tty-ttyS1.device loaded active plugged /sys/devices/platform/serial8250/tty/ttyS1
sys-devices-platform-serial8250-tty-ttyS2.device loaded active plugged /sys/devices/platform/serial8250/tty/ttyS2
sys-devices-platform-serial8250-tty-ttyS3.device loaded active plugged /sys/devices/platform/serial8250/tty/ttyS3
sys-devices-virtual-tty-hvc0.device loaded active plugged /sys/devices/virtual/tty/hvc0
sys-devices-virtual-tty-hvc1.device loaded active plugged /sys/devices/virtual/tty/hvc1
sys-devices-virtual-tty-hvc2.device loaded active plugged /sys/devices/virtual/tty/hvc2
sys-devices-virtual-tty-hvc3.device loaded active plugged /sys/devices/virtual/tty/hvc3
sys-devices-virtual-tty-hvc4.device loaded active plugged /sys/devices/virtual/tty/hvc4
sys-devices-virtual-tty-hvc5.device loaded active plugged /sys/devices/virtual/tty/hvc5
sys-devices-virtual-tty-hvc6.device loaded active plugged /sys/devices/virtual/tty/hvc6
sys-devices-virtual-tty-hvc7.device loaded active plugged /sys/devices/virtual/tty/hvc7
sys-module-configfs.device loaded active plugged /sys/module/configfs
sys-module-fuse.device loaded active plugged /sys/module/fuse
sys-subsystem-net-devices-eth0.device loaded active plugged /sys/subsystem/net/devices/eth0
-.mount loaded active mounted Root Mount
etc-resolv.conf.mount loaded active mounted /etc/resolv.conf
run-kata\x2dcontainers-3daea1739ff15b732a2a1e7cf76d64b49f128a5a55bb8807c5ddde96d378e5cd-rootfs.mount loaded active mounted /run/kata-containers/3daea1739ff15b732a2a1e7cf76d64b49f128a5a55bb8807c5ddde96d378e5cd/rootfs
run-kata\x2dcontainers-e47a609923ce835a252c87d71fc3ba92adb974f00fdae194576b3d388b1bc770-rootfs.mount loaded active mounted /run/kata-containers/e47a609923ce835a252c87d71fc3ba92adb974f00fdae194576b3d388b1bc770/rootfs
run-kata\x2dcontainers-sandbox-shm.mount loaded active mounted /run/kata-containers/sandbox/shm
run-kata\x2dcontainers-shared-containers.mount loaded active mounted /run/kata-containers/shared/containers
run-sandbox\x2dns-ipc.mount loaded active mounted /run/sandbox-ns/ipc
run-sandbox\x2dns-uts.mount loaded active mounted /run/sandbox-ns/uts
sys-kernel-config.mount loaded active mounted Kernel Configuration File System
tmp.mount loaded active mounted Temporary Directory (/tmp)
systemd-ask-password-console.path loaded active waiting Dispatch Password Requests to Console Directory Watch
init.scope loaded active running System and Service Manager
kata-agent.service loaded active running Kata Containers Agent
kmod-static-nodes.service loaded active exited Create list of required static device nodes for the current kernel
systemd-journald.service loaded active running Journal Service
● systemd-modules-load.service loaded failed failed Load Kernel Modules
systemd-sysctl.service loaded active exited Apply Kernel Variables
systemd-tmpfiles-setup-dev.service loaded active exited Create Static Device Nodes in /dev
systemd-tmpfiles-setup.service loaded active exited Create Volatile Files and Directories
systemd-udev-trigger.service loaded active exited udev Coldplug all Devices
systemd-udevd.service loaded active running udev Kernel Device Manager
-.slice loaded active active Root Slice
system.slice loaded active active System Slice
systemd-journald-dev-log.socket loaded active running Journal Socket (/dev/log)
systemd-journald.socket loaded active running Journal Socket
systemd-udevd-control.socket loaded active running udev Control Socket
systemd-udevd-kernel.socket loaded active running udev Kernel Socket
basic.target loaded active active Basic System
kata-containers.target loaded active active Kata Containers Agent Target
local-fs.target loaded active active Local File Systems
multi-user.target loaded active active Multi-User System
paths.target loaded active active Paths
slices.target loaded active active Slices
sockets.target loaded active active Sockets
swap.target loaded active active Swap
sysinit.target loaded active active System Initialization
timers.target loaded active active Timers
# There is a kata-containers unit that looks interesting; let's see what is in it.
bash-4.4# systemctl cat kata-containers.target
# /usr/lib/systemd/system/kata-containers.target
#
# Copyright (c) 2018-2019 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
[Unit]
Description=Kata Containers Agent Target
Requires=basic.target
Requires=tmp.mount
Wants=chronyd.service
Requires=kata-agent.service
Conflicts=rescue.service rescue.target
After=basic.target rescue.service rescue.target
AllowIsolate=yes
bash-4.4# systemctl cat kata-agent.service
# /usr/lib/systemd/system/kata-agent.service
#
# Copyright (c) 2018-2019 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
[Unit]
Description=Kata Containers Agent
Documentation=https://github.com/kata-containers/kata-containers
Wants=kata-containers.target
[Service]
# Send agent output to tty to allow capture debug logs
# from a VM vsock port
StandardOutput=tty
Type=simple
ExecStart=/usr/bin/kata-agent
LimitNOFILE=1048576
# ExecStop is required for static agent tracing; in all other scenarios
# the runtime handles shutting down the VM.
ExecStop=/bin/sync ; /usr/bin/systemctl --force poweroff
FailureAction=poweroff
# Discourage OOM-killer from touching the agent
OOMScoreAdjust=-997
# Where are our containers? Found them.
bash-4.4# pwd
/run/kata-containers/e47a609923ce835a252c87d71fc3ba92adb974f00fdae194576b3d388b1bc770/rootfs
bash-4.4# echo *
anaconda-post.log bin check.sh dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var
Log in to the container from the helper node and see how things look.
[root@helper ~]# oc rsh pod/mypod-787d79b456-4f4xr
sh-4.2# ls
anaconda-post.log bin dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var
sh-4.2# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel state UP group default qlen 1000
link/ether 0a:58:0a:fe:01:0e brd ff:ff:ff:ff:ff:ff
inet 10.254.1.14/24 brd 10.254.1.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::858:aff:fefe:10e/64 scope link
valid_lft forever preferred_lft forever
inet6 fe80::5c25:c3ff:fe29:f429/64 scope link
valid_lft forever preferred_lft forever
sh-4.2# ps efx ww
PID TTY STAT TIME COMMAND
2 ? Ss 0:00 /bin/sh TERM=screen-256color HOSTNAME=mypod-787d79b456-4f4xr KUBERNETES_PORT_443_TCP_PORT=443 KUBERNETES_PORT=tcp://172.30.0.1:443 KUBERNETES_SERVICE_PORT=443 KUBERNETES_SERVICE_HOST=172.30.0.1 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin PWD=/ SHLVL=1 HOME=/root KUBERNETES_PORT_443_TCP_PROTO=tcp KUBERNETES_SERVICE_PORT_HTTPS=443 NSS_SDB_USE_CACHE=no KUBERNETES_PORT_443_TCP_ADDR=172.30.0.1 KUBERNETES_PORT_443_TCP=tcp://172.30.0.1:443 _=/bin/sh
9 ? R+ 0:00 \_ ps efx ww HOSTNAME=mypod-787d79b456-4f4xr KUBERNETES_PORT=tcp://172.30.0.1:443 KUBERNETES_PORT_443_TCP_PORT=443 TERM=screen-256color KUBERNETES_SERVICE_PORT=443 KUBERNETES_SERVICE_HOST=172.30.0.1 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin PWD=/ HOME=/root SHLVL=2 KUBERNETES_PORT_443_TCP_PROTO=tcp KUBERNETES_SERVICE_PORT_HTTPS=443 NSS_SDB_USE_CACHE=no KUBERNETES_PORT_443_TCP_ADDR=172.30.0.1 KUBERNETES_PORT_443_TCP=tcp://172.30.0.1:443 _=/usr/bin/ps
1 ? S 0:00 sleep infinity PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TERM=xterm HOSTNAME=mypod-787d79b456-4f4xr NSS_SDB_USE_CACHE=no KUBERNETES_SERVICE_HOST=172.30.0.1 KUBERNETES_SERVICE_PORT=443 KUBERNETES_SERVICE_PORT_HTTPS=443 KUBERNETES_PORT=tcp://172.30.0.1:443 KUBERNETES_PORT_443_TCP=tcp://172.30.0.1:443 KUBERNETES_PORT_443_TCP_PROTO=tcp KUBERNETES_PORT_443_TCP_PORT=443 KUBERNETES_PORT_443_TCP_ADDR=172.30.0.1 HOME=/root
Digging into the network
We care a lot about kata's networking model, and there is official documentation for it.
# On worker-0, let's look at the namespaces
[root@worker-0 ~]# lsns --output NS,TYPE,NETNSID,PID,COMMAND | grep qemu
4026533791 net 5 20394 /usr/libexec/qemu-kiwi -name sandbox-0f60fb9af6dbf8c8e355b9e27a62debe8276aa76f4246857e46520fa677ce40e -uuid 0a101364-3814-42a4-91b9-c8a81fc377ef -machine q35,accel=kvm,kernel_irqchip -cpu host,pmu=off -qmp unix:/run/vc/vm/0f60fb9af6dbf8c8e355b9e27a62debe8276aa76f4246857e46520fa677ce40e/qmp.sock,server=on,wait=off -m 2048M,slots=10,maxmem=33122M -device pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2 -device virtio-serial-pci,disable-modern=false,id=serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/0f60fb9af6dbf8c8e355b9e27a62debe8276aa76f4246857e46520fa677ce40e/console.sock,server=on,wait=off -device virtio-scsi-pci,id=scsi0,disable-modern=false -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0 -device vhost-vsock-pci,disable-modern=false,vhostfd=3,id=vsock-2809816003,guest-cid=2809816003 -chardev socket,id=char-3bb1f59f00a0b873,path=/run/vc/vm/0f60fb9af6dbf8c8e355b9e27a62debe8276aa76f4246857e46520fa677ce40e/vhost-fs.sock -device vhost-user-fs-pci,chardev=char-3bb1f59f00a0b873,tag=kataShared -netdev tap,id=network-0,vhost=on,vhostfds=4,fds=5 -device driver=virtio-net-pci,netdev=network-0,mac=0a:58:0a:81:00:12,disable-modern=false,mq=on,vectors=4 -rtc base=utc,driftfix=slew,clock=host -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic --no-reboot -daemonize -object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on -numa node,memdev=dimm1 -kernel /usr/lib/modules/4.18.0-305.19.1.el8_4.x86_64/vmlinuz -initrd /var/cache/kata-containers/osbuilder-images/4.18.0-305.19.1.el8_4.x86_64/"rhcos"-kata-4.18.0-305.19.1.el8_4.x86_64.initrd -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 cryptomgr.notests net.ifnames=0 pci=lastbus=0 quiet panic=1 nr_cpus=24 scsi_mod.scan=none agent.debug_console agent.debug_console_vport=1026 -pidfile /run/vc/vm/0f60fb9af6dbf8c8e355b9e27a62debe8276aa76f4246857e46520fa677ce40e/pid -smp 1,cores=1,threads=1,sockets=24,maxcpus=24
# Let's go into kata's netns and look at the network. The @if22 suffix on eth0 means its peer on the other side is interface number 22.
[root@worker-0 ~]# nsenter -t 20394 -n ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
3: eth0@if22: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default qlen 1000
link/ether 0a:58:0a:81:00:12 brd ff:ff:ff:ff:ff:ff link-netns a4db0b05-2ff7-4a29-98da-1df2491622fb
inet 10.129.0.18/23 brd 10.129.1.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::858:aff:fe81:12/64 scope link
valid_lft forever preferred_lft forever
4: tap0_kata: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc mq state UNKNOWN group default qlen 1000
link/ether 56:51:b2:40:7c:56 brd ff:ff:ff:ff:ff:ff
inet6 fe80::5451:b2ff:fe40:7c56/64 scope link
valid_lft forever preferred_lft forever
# On worker-0 we can see interface number 22, which is paired with interface 3 inside the kata netns
[root@worker-0 ~]# ip link | grep 22 -A3
link/ether 9e:88:4d:e5:55:80 brd ff:ff:ff:ff:ff:ff link-netns 7ccc8362-c042-4bf3-9ddc-fa4fef322134
18: 6f53bb03a970cf7@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP mode DEFAULT group default
link/ether 8e:a7:85:94:de:7b brd ff:ff:ff:ff:ff:ff link-netns 5f33c5e4-1788-4ab6-883b-78bf7ab5372e
22: 0f60fb9af6dbf8c@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP mode DEFAULT group default
link/ether 02:3c:63:91:ae:7f brd ff:ff:ff:ff:ff:ff link-netns 50226e1e-a0fd-48e3-b05c-7d5aa1d41acf
# Check whether there are nftables rules inside the kata netns
[root@worker-0 ~]# nsenter -t 20394 -n nft list ruleset
table ip filter {
chain INPUT {
type filter hook input priority filter; policy accept;
}
chain FORWARD {
type filter hook forward priority filter; policy accept;
meta l4proto tcp tcp dport 22623 tcp flags & (fin|syn|rst|ack) == syn counter packets 0 bytes 0 reject
meta l4proto tcp tcp dport 22624 tcp flags & (fin|syn|rst|ack) == syn counter packets 0 bytes 0 reject
meta l4proto tcp ip daddr 169.254.169.254 tcp dport != 53 counter packets 0 bytes 0 reject
meta l4proto udp ip daddr 169.254.169.254 udp dport 53 counter packets 0 bytes 0 reject
}
chain OUTPUT {
type filter hook output priority filter; policy accept;
meta l4proto tcp tcp dport 22623 tcp flags & (fin|syn|rst|ack) == syn counter packets 0 bytes 0 reject
meta l4proto tcp tcp dport 22624 tcp flags & (fin|syn|rst|ack) == syn counter packets 0 bytes 0 reject
meta l4proto tcp ip daddr 169.254.169.254 tcp dport != 53 counter packets 0 bytes 0 reject
meta l4proto udp ip daddr 169.254.169.254 udp dport 53 counter packets 0 bytes 0 reject
}
}
The tc (traffic control) configuration deserves proper study; the command line is fairly involved. The following material is useful:
Usable man pages:
- man tc-mirred
- man tc-ctinfo
- man tc-u32
- man tc-actions
Note the 'stolen' result on the actions below: it means that once a packet matches, the remaining tc processing is cut short and the packet moves on to the rest of the kernel path (netfilter and so on).
# Let's look at the tc configuration described in the docs; the idea is to cross-wire traffic between eth0 and tap0_kata
# According to documentation found online, 'tc qdisc add dev eth0 handle ffff: ingress' is equivalent to 'tc qdisc add dev eth0 ingress', and shows up as 'qdisc ingress ffff: dev enp0s31f6 parent ffff:fff1 ----------------'
[root@worker-0 ~]# nsenter -t 20394 -n tc -s -p qdisc show dev eth0
qdisc noqueue 0: root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc ingress ffff: parent ffff:fff1 ----------------
Sent 192 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
# According to documentation found online, the output below is the result of: tc filter add dev eth0 parent ffff: protocol all u32 match u32 0 0 action mirred egress redirect dev tap0_kata
[root@worker-0 ~]# nsenter -t 20394 -n tc -s -p filter show dev eth0 root
filter parent ffff: protocol all pref 49152 u32 chain 0
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800: ht divisor 1
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 terminal flowid ??? not_in_hw (rule hit 2 success 2)
match 00000000/00000000 at 0 (success 2 )
action order 1: mirred (Egress Redirect to device tap0_kata) stolen
index 1 ref 1 bind 1 installed 2310 sec used 2310 sec firstused 2310 sec
Action statistics:
Sent 192 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
[root@worker-0 ~]# nsenter -t 20394 -n tc -s -p filter show dev eth0 ingress
filter parent ffff: protocol all pref 49152 u32 chain 0
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800: ht divisor 1
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 terminal flowid ??? not_in_hw (rule hit 2 success 2)
match 00000000/00000000 at 0 (success 2 )
action order 1: mirred (Egress Redirect to device tap0_kata) stolen
index 1 ref 1 bind 1 installed 1797 sec used 1797 sec firstused 1797 sec
Action statistics:
Sent 192 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
[root@worker-0 ~]# nsenter -t 20394 -n tc -s -p filter show dev eth0 egress
filter parent ffff: protocol all pref 49152 u32 chain 0
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800: ht divisor 1
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 terminal flowid ??? not_in_hw (rule hit 2 success 2)
match 00000000/00000000 at 0 (success 2 )
action order 1: mirred (Egress Redirect to device tap0_kata) stolen
index 1 ref 1 bind 1 installed 2330 sec used 2330 sec firstused 2330 sec
Action statistics:
Sent 192 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
# According to documentation found online, the output below is the result of: tc filter add dev tap0_kata parent ffff: protocol all u32 match u32 0 0 action mirred egress redirect dev eth0
[root@worker-0 ~]# nsenter -t 20394 -n tc -s -p qdisc show dev tap0_kata
qdisc mq 0: root
Sent 1296 bytes 16 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1414 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
Sent 1296 bytes 16 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc ingress ffff: parent ffff:fff1 ----------------
Sent 880 bytes 14 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
[root@worker-0 ~]# nsenter -t 20394 -n tc -s -p filter show dev tap0_kata root
filter parent ffff: protocol all pref 49152 u32 chain 0
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800: ht divisor 1
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 terminal flowid ??? not_in_hw (rule hit 15 success 15)
match 00000000/00000000 at 0 (success 15 )
action order 1: mirred (Egress Redirect to device eth0) stolen
index 2 ref 1 bind 1 installed 2383 sec used 247 sec firstused 2380 sec
Action statistics:
Sent 936 bytes 15 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
[root@worker-0 ~]# nsenter -t 20394 -n tc -s -p filter show dev tap0_kata ingress
filter parent ffff: protocol all pref 49152 u32 chain 0
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800: ht divisor 1
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 terminal flowid ??? not_in_hw (rule hit 14 success 14)
match 00000000/00000000 at 0 (success 14 )
action order 1: mirred (Egress Redirect to device eth0) stolen
index 2 ref 1 bind 1 installed 1690 sec used 636 sec firstused 1687 sec
Action statistics:
Sent 880 bytes 14 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
[root@worker-0 ~]# nsenter -t 20394 -n tc -s -p filter show dev tap0_kata egress
filter parent ffff: protocol all pref 49152 u32 chain 0
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800: ht divisor 1
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 terminal flowid ??? not_in_hw (rule hit 15 success 15)
match 00000000/00000000 at 0 (success 15 )
action order 1: mirred (Egress Redirect to device eth0) stolen
index 2 ref 1 bind 1 installed 2400 sec used 264 sec firstused 2397 sec
Action statistics:
Sent 936 bytes 15 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
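Putting the observations above together, the plumbing kata sets up could be reproduced by hand in a fresh namespace with roughly the following commands (a sketch based on the filters shown; interface names are the ones in this netns):
# run inside the sandbox netns, e.g. nsenter -t 20394 -n bash
tc qdisc add dev eth0 ingress
tc filter add dev eth0 parent ffff: protocol all u32 match u32 0 0 \
  action mirred egress redirect dev tap0_kata
tc qdisc add dev tap0_kata ingress
tc filter add dev tap0_kata parent ffff: protocol all u32 match u32 0 0 \
  action mirred egress redirect dev eth0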
qemu-kiwi rpm sourcing
Let's find out where the qemu-kiwi rpm comes from. Red Hat's portal has a tool to look this up; the answer is Red Hat Enterprise Linux Advanced Virtualization 8 x86_64 (advanced-virt-for-rhel-8-x86_64-rpms).
rpm -qpi kata-containers-2.1.0-6.el8.x86_64.rpm
# warning: kata-containers-2.1.0-6.el8.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID fd431d51: NOKEY
# Name : kata-containers
# Version : 2.1.0
# Release : 6.el8
# Architecture: x86_64
# Install Date: (not installed)
# Group : Unspecified
# Size : 104672045
# License : ASL 2.0
# Signature : RSA/SHA256, Fri 13 Aug 2021 07:38:35 AM UTC, Key ID 199e2f91fd431d51
# Source RPM : kata-containers-2.1.0-6.el8.src.rpm
# Build Date : Thu 29 Jul 2021 08:43:06 PM UTC
# Build Host : x86-vm-56.build.eng.bos.redhat.com
# Relocations : (not relocatable)
# Packager : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
# Vendor : Red Hat, Inc.
# URL : https://github.com/kata-containers/kata-containers
# Summary : Kata Containers version 2.x repository
# Description :
# Kata Containers version 2.x repository. Kata Containers is an open source
# project and community working to build a standard implementation of lightweight
# Virtual Machines (VMs) that feel and perform like containers, but provide the
# workload isolation and security advantages of VMs. https://katacontainers.io/.
# %gopkg
rpm -qp --fileprovide kata-containers-2.1.0-6.el8.x86_64.rpm
# warning: kata-containers-2.1.0-6.el8.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID fd431d51: NOKEY
# /etc/crio/crio.conf.d/50-kata
# /usr/bin/containerd-shim-kata-v2
# /usr/bin/kata-collect-data.sh
# /usr/bin/kata-monitor
# /usr/bin/kata-runtime
# /usr/lib/.build-id
# /usr/lib/.build-id/05
# /usr/lib/.build-id/05/4f48f5aef5a7120fe76e8f41bc2e96fe82cb20
# /usr/lib/.build-id/50
# /usr/lib/.build-id/50/a5e84ca71250993215cb19c1fed802800fb358
# /usr/lib/.build-id/b1
# /usr/lib/.build-id/b1/b275acd0ff5df77c6f5abc9b6c8c5b2b4ac88e
# /usr/lib/.build-id/e7
# /usr/lib/.build-id/e7/6ecd091d646ac823c7292c65b2a186d40b8023
# /usr/lib/systemd/system/kata-osbuilder-generate.service
# /usr/libexec/kata-containers
# /usr/libexec/kata-containers/VERSION
# /usr/libexec/kata-containers/agent
# /usr/libexec/kata-containers/agent/usr
# /usr/libexec/kata-containers/agent/usr/bin
# /usr/libexec/kata-containers/agent/usr/bin/kata-agent
# /usr/libexec/kata-containers/agent/usr/lib
# /usr/libexec/kata-containers/agent/usr/lib/systemd
# /usr/libexec/kata-containers/agent/usr/lib/systemd/system
# /usr/libexec/kata-containers/agent/usr/lib/systemd/system/kata-agent.service
# /usr/libexec/kata-containers/agent/usr/lib/systemd/system/kata-containers.target
# /usr/libexec/kata-containers/kata-netmon
# /usr/libexec/kata-containers/osbuilder
# /usr/libexec/kata-containers/osbuilder/dracut
# /usr/libexec/kata-containers/osbuilder/dracut/dracut.conf.d
# /usr/libexec/kata-containers/osbuilder/dracut/dracut.conf.d/05-base.conf
# /usr/libexec/kata-containers/osbuilder/dracut/dracut.conf.d/15-dracut-rhel.conf
# /usr/libexec/kata-containers/osbuilder/initrd-builder
# /usr/libexec/kata-containers/osbuilder/initrd-builder/README.md
# /usr/libexec/kata-containers/osbuilder/initrd-builder/initrd_builder.sh
# /usr/libexec/kata-containers/osbuilder/kata-osbuilder.sh
# /usr/libexec/kata-containers/osbuilder/nsdax
# /usr/libexec/kata-containers/osbuilder/rootfs-builder
# /usr/libexec/kata-containers/osbuilder/rootfs-builder/README.md
# /usr/libexec/kata-containers/osbuilder/rootfs-builder/rootfs.sh
# /usr/libexec/kata-containers/osbuilder/scripts
# /usr/libexec/kata-containers/osbuilder/scripts/lib.sh
# /usr/share/bash-completion/completions/kata-runtime
# /usr/share/doc/kata-containers
# /usr/share/doc/kata-containers/CONTRIBUTING.md
# /usr/share/doc/kata-containers/README.md
# /usr/share/kata-containers
# /usr/share/kata-containers/defaults
# /usr/share/kata-containers/defaults/configuration.toml
# /usr/share/licenses/kata-containers
# /usr/share/licenses/kata-containers/LICENSE
# /var/cache/kata-containers
rpm -qp --requires kata-containers-2.1.0-6.el8.x86_64.rpm
# warning: kata-containers-2.1.0-6.el8.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID fd431d51: NOKEY
# /bin/bash
# /bin/sh
# /bin/sh
# /bin/sh
# dracut
# kernel
# libc.so.6()(64bit)
# libc.so.6(GLIBC_2.10)(64bit)
# libc.so.6(GLIBC_2.14)(64bit)
# libc.so.6(GLIBC_2.15)(64bit)
# libc.so.6(GLIBC_2.17)(64bit)
# libc.so.6(GLIBC_2.18)(64bit)
# libc.so.6(GLIBC_2.2.5)(64bit)
# libc.so.6(GLIBC_2.3)(64bit)
# libc.so.6(GLIBC_2.3.2)(64bit)
# libc.so.6(GLIBC_2.3.4)(64bit)
# libc.so.6(GLIBC_2.4)(64bit)
# libc.so.6(GLIBC_2.7)(64bit)
# libc.so.6(GLIBC_2.9)(64bit)
# libdl.so.2()(64bit)
# libdl.so.2(GLIBC_2.2.5)(64bit)
# libgcc_s.so.1()(64bit)
# libgcc_s.so.1(GCC_3.0)(64bit)
# libgcc_s.so.1(GCC_3.3)(64bit)
# libgcc_s.so.1(GCC_4.2.0)(64bit)
# libm.so.6()(64bit)
# libm.so.6(GLIBC_2.2.5)(64bit)
# libpthread.so.0()(64bit)
# libpthread.so.0(GLIBC_2.2.5)(64bit)
# libpthread.so.0(GLIBC_2.3.2)(64bit)
# libpthread.so.0(GLIBC_2.3.3)(64bit)
# libutil.so.1()(64bit)
# libutil.so.1(GLIBC_2.2.5)(64bit)
# qemu-kiwi >= 5.1.0-16
# rpmlib(CompressedFileNames) <= 3.0.4-1
# rpmlib(FileDigests) <= 4.6.0-1
# rpmlib(PayloadFilesHavePrefix) <= 4.0-1
# rpmlib(PayloadIsXz) <= 5.2-1
# rtld(GNU_HASH)
# systemd
# systemd
# systemd
rpm -qpi qemu-kiwi-5.2.0-16.module+el8.4.0+13460+2e130eec.13.x86_64.rpm
# warning: qemu-kiwi-5.2.0-16.module+el8.4.0+13460+2e130eec.13.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID fd431d51: NOKEY
# Name : qemu-kiwi
# Epoch : 15
# Version : 5.2.0
# Release : 16.module+el8.4.0+13460+2e130eec.13
# Architecture: x86_64
# Install Date: (not installed)
# Group : Development/Tools
# Size : 12941413
# License : GPLv2 and GPLv2+ and CC-BY
# Signature : RSA/SHA256, Tue 30 Nov 2021 10:43:30 PM UTC, Key ID 199e2f91fd431d51
# Source RPM : qemu-kvm-5.2.0-16.module+el8.4.0+13460+2e130eec.13.src.rpm
# Build Date : Fri 26 Nov 2021 09:59:08 PM UTC
# Build Host : x86-037.build.eng.bos.redhat.com
# Relocations : (not relocatable)
# Packager : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
# Vendor : Red Hat, Inc.
# URL : http://www.qemu.org/
# Summary : qemu-kiwi components
# Description :
# qemu-kiwi is a version of qemu-kvm with a restricted set of features
# intended for use by specific applications.
# It's experimental and unsupported.
rpm -qp --fileprovide qemu-kiwi-5.2.0-16.module+el8.4.0+13460+2e130eec.13.x86_64.rpm
# warning: qemu-kiwi-5.2.0-16.module+el8.4.0+13460+2e130eec.13.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID fd431d51: NOKEY
# /usr/lib/.build-id
# /usr/lib/.build-id/02
# /usr/lib/.build-id/02/3daf3e2bc89b7e0363ac89ea46bb70ddd74ae7
# /usr/libexec/qemu-kiwi
# /usr/share/systemtap/tapset/qemu-kiwi-log.stp
# /usr/share/systemtap/tapset/qemu-kiwi-simpletrace.stp
# /usr/share/systemtap/tapset/qemu-kiwi.stp
rpm -qp --requires qemu-kiwi-5.2.0-16.module+el8.4.0+13460+2e130eec.13.x86_64.rpm
# warning: qemu-kiwi-5.2.0-16.module+el8.4.0+13460+2e130eec.13.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID fd431d51: NOKEY
# libaio.so.1()(64bit)
# libaio.so.1(LIBAIO_0.1)(64bit)
# libaio.so.1(LIBAIO_0.4)(64bit)
# libc.so.6()(64bit)
# libc.so.6(GLIBC_2.10)(64bit)
# libc.so.6(GLIBC_2.11)(64bit)
# libc.so.6(GLIBC_2.12)(64bit)
# libc.so.6(GLIBC_2.14)(64bit)
# libc.so.6(GLIBC_2.17)(64bit)
# libc.so.6(GLIBC_2.2.5)(64bit)
# libc.so.6(GLIBC_2.25)(64bit)
# libc.so.6(GLIBC_2.27)(64bit)
# libc.so.6(GLIBC_2.28)(64bit)
# libc.so.6(GLIBC_2.3)(64bit)
# libc.so.6(GLIBC_2.3.2)(64bit)
# libc.so.6(GLIBC_2.3.4)(64bit)
# libc.so.6(GLIBC_2.4)(64bit)
# libc.so.6(GLIBC_2.7)(64bit)
# libc.so.6(GLIBC_2.8)(64bit)
# libc.so.6(GLIBC_2.9)(64bit)
# libgcc_s.so.1()(64bit)
# libgcc_s.so.1(GCC_3.0)(64bit)
# libgcc_s.so.1(GCC_3.3.1)(64bit)
# libgcc_s.so.1(GCC_3.4)(64bit)
# libgcc_s.so.1(GCC_4.7.0)(64bit)
# libgio-2.0.so.0()(64bit)
# libglib-2.0.so.0()(64bit)
# libgobject-2.0.so.0()(64bit)
# libm.so.6()(64bit)
# libm.so.6(GLIBC_2.2.5)(64bit)
# libnuma.so.1()(64bit)
# libnuma.so.1(libnuma_1.1)(64bit)
# libpixman-1.so.0()(64bit)
# libpmem.so.1()(64bit)
# libpmem.so.1(LIBPMEM_1.0)(64bit)
# libpthread.so.0()(64bit)
# libpthread.so.0(GLIBC_2.12)(64bit)
# libpthread.so.0(GLIBC_2.2.5)(64bit)
# libpthread.so.0(GLIBC_2.3.2)(64bit)
# libseccomp.so.2()(64bit)
# libutil.so.1()(64bit)
# libutil.so.1(GLIBC_2.2.5)(64bit)
# libz.so.1()(64bit)
# libz.so.1(ZLIB_1.2.0)(64bit)
# qemu-kvm-common = 15:5.2.0-16.module+el8.4.0+13460+2e130eec.13
# rpmlib(CompressedFileNames) <= 3.0.4-1
# rpmlib(FileDigests) <= 4.6.0-1
# rpmlib(PayloadFilesHavePrefix) <= 4.0-1
# rpmlib(PayloadIsXz) <= 5.2-1
# rtld(GNU_HASH)
end
sriov on openshift4 with unsupported NIC
OpenShift 4 ships with SR-IOV support, but because only certain NICs are certified against the kernel, OpenShift 4 has a built-in whitelist and only enables the SR-IOV feature for those cards. So how do we run the lab when we do not have one of those NICs, but our NIC itself supports SR-IOV? This section shows how.
Lab topology
Video walkthrough:
There is a built-in NIC whitelist for OpenShift 4's SR-IOV; to disable it, use:
- https://docs.openshift.com/container-platform/4.6/networking/hardware_networks/configuring-sriov-operator.html#disable-enable-sr-iov-operator-admission-control-webhook_configuring-sriov-operator
openshift
# The SR-IOV lab cannot be done inside KVM, since the SR-IOV PF cannot be passed through into the KVM guests here, so we bring in a bare-metal worker node
# check vendor id and device id
# https://access.redhat.com/solutions/56081
# on worker-1
lspci -vv | grep -i Mellanox
# 04:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
# Subsystem: Mellanox Technologies Device 0011
# 04:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
# Subsystem: Mellanox Technologies Device 0011
lspci -nvv | grep "04:00.0\|04:00.1"
# pcilib: sysfs_read_vpd: read failed: Input/output error
# 04:00.0 0200: 15b3:101d
# 04:00.1 0200: 15b3:101d
cat /sys/class/net/*/device/sriov_numvfs
# 0
# 0
cat /sys/class/net/*/device/sriov_totalvfs
# 8
# 8
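Before involving the operator, the plain kernel sysfs interface can be used to confirm the card really creates VFs (a sketch, run on worker-1; the interface name enp4s0f0 shows up in the operator output further below):
# create 4 VFs on the first ConnectX-6 port, then list them
echo 4 > /sys/class/net/enp4s0f0/device/sriov_numvfs
lspci | grep -i 'Virtual Function'
# set it back to 0 afterwards; from here on the operator manages this
echo 0 > /sys/class/net/enp4s0f0/device/sriov_numvfs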
install NFD ( node feature discovery) operator
install SRIOV operator
oc create namespace openshift-sriov-network-operator
oc create -f - <<EOF
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: sriov-network-operators
namespace: openshift-sriov-network-operator
spec:
targetNamespaces:
- openshift-sriov-network-operator
EOF
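The OperatorGroup alone installs nothing; a Subscription is still required. A minimal sketch (the channel and catalog source names are assumptions, check OperatorHub or your mirrored catalog for the exact values):
oc create -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: sriov-network-operator-subscription
  namespace: openshift-sriov-network-operator
spec:
  channel: "4.8"
  name: sriov-network-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF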
# https://catalog.redhat.com/software/containers/openshift4/dpdk-base-rhel8/5e32be6cdd19c77896004a41
# registry.redhat.io/openshift4/dpdk-base-rhel8:latest
# oc get sriovnetworknodestates -n openshift-sriov-network-operator -o jsonpath='{.items[*].status}' | jq
# As we can see, the NICs on worker-1 have been discovered, including their VF capability (totalvfs)
oc get sriovnetworknodestates -n openshift-sriov-network-operator -o json | jq ".items[] | (.metadata.name, .status)"
"master-0"
{
"interfaces": [
{
"deviceID": "1000",
"driver": "virtio-pci",
"pciAddress": "0000:00:03.0",
"vendor": "1af4"
}
],
"syncStatus": "Succeeded"
}
"master-1"
{
"interfaces": [
{
"deviceID": "1000",
"driver": "virtio-pci",
"pciAddress": "0000:00:03.0",
"vendor": "1af4"
}
],
"syncStatus": "Succeeded"
}
"master-2"
{
"interfaces": [
{
"deviceID": "1000",
"driver": "virtio-pci",
"pciAddress": "0000:00:03.0",
"vendor": "1af4"
}
],
"syncStatus": "Succeeded"
}
"worker-0"
{
"interfaces": [
{
"deviceID": "1000",
"driver": "virtio-pci",
"pciAddress": "0000:00:03.0",
"vendor": "1af4"
}
],
"syncStatus": "Succeeded"
}
"worker-1"
{
"interfaces": [
{
"deviceID": "165f",
"driver": "tg3",
"linkSpeed": "1000 Mb/s",
"linkType": "ETH",
"mac": "90:b1:1c:44:d6:0f",
"mtu": 1500,
"name": "eno1",
"pciAddress": "0000:01:00.0",
"vendor": "14e4"
},
{
"deviceID": "165f",
"driver": "tg3",
"linkSpeed": "-1 Mb/s",
"linkType": "ETH",
"mac": "90:b1:1c:44:d6:10",
"mtu": 1500,
"name": "eno2",
"pciAddress": "0000:01:00.1",
"vendor": "14e4"
},
{
"deviceID": "165f",
"driver": "tg3",
"linkSpeed": "-1 Mb/s",
"linkType": "ETH",
"mac": "90:b1:1c:44:d6:11",
"mtu": 1500,
"name": "eno3",
"pciAddress": "0000:02:00.0",
"vendor": "14e4"
},
{
"deviceID": "165f",
"driver": "tg3",
"linkSpeed": "-1 Mb/s",
"linkType": "ETH",
"mac": "90:b1:1c:44:d6:12",
"mtu": 1500,
"name": "eno4",
"pciAddress": "0000:02:00.1",
"vendor": "14e4"
},
{
"deviceID": "101d",
"driver": "mlx5_core",
"linkSpeed": "-1 Mb/s",
"linkType": "ETH",
"mac": "0c:42:a1:fa:18:52",
"mtu": 1500,
"name": "enp4s0f0",
"pciAddress": "0000:04:00.0",
"totalvfs": 8,
"vendor": "15b3"
},
{
"deviceID": "101d",
"driver": "mlx5_core",
"linkSpeed": "-1 Mb/s",
"linkType": "ETH",
"mac": "0c:42:a1:fa:18:53",
"mtu": 1500,
"name": "enp4s0f1",
"pciAddress": "0000:04:00.1",
"totalvfs": 8,
"vendor": "15b3"
}
],
"syncStatus": "Succeeded"
}
# config worker-1 with hugepage
cat << EOF > /data/install/worker-performance.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
name: worker-performance
spec:
machineConfigSelector:
matchExpressions:
- {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,worker-performance]}
nodeSelector:
matchLabels:
node-role.kubernetes.io/worker-performance: ""
EOF
oc create -f /data/install/worker-performance.yaml
# to restore
oc delete -f /data/install/worker-performance.yaml
oc label node worker-1 node-role.kubernetes.io/worker-performance=""
cat << EOF > /data/install/worker-1-hugepage.yaml
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
name: worker-1-hugepage
spec:
cpu:
isolated: "5-23"
reserved: "0-4"
hugepages:
defaultHugepagesSize: 1G
pages:
- count: 4
size: 1G
nodeSelector:
node-role.kubernetes.io/worker-performance: ''
EOF
oc create -f /data/install/worker-1-hugepage.yaml
# to restore
oc delete -f /data/install/worker-1-hugepage.yaml
# on worker-1
grep -i huge /proc/meminfo
# before
# AnonHugePages: 448512 kB
# ShmemHugePages: 0 kB
# HugePages_Total: 0
# HugePages_Free: 0
# HugePages_Rsvd: 0
# HugePages_Surp: 0
# Hugepagesize: 2048 kB
# Hugetlb: 0 kB
# after
# AnonHugePages: 376832 kB
# ShmemHugePages: 0 kB
# HugePages_Total: 4
# HugePages_Free: 4
# HugePages_Rsvd: 0
# HugePages_Surp: 0
# Hugepagesize: 1048576 kB
# Hugetlb: 4194304 kB
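The same change can be confirmed from the cluster side, since the 1Gi pages are exposed as a node resource (a sketch):
oc describe node worker-1 | grep -i hugepages-1Gi
# hugepages-1Gi: 4Gi   (once under Capacity, once under Allocatable)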
cat << EOF > /data/install/sriov-cx4.yaml
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
name: policy-cx4-net-1
namespace: openshift-sriov-network-operator
spec:
resourceName: cx4nic1
nodeSelector:
feature.node.kubernetes.io/network-sriov.capable: "true"
numVfs: 4
nicSelector:
vendor: "15b3"
deviceID: "101d"
# rootDevices:
# - "0000:19:00.0"
deviceType: netdevice
isRdma: true
EOF
oc create -f /data/install/sriov-cx4.yaml
# Error from server (vendor/device 15b3/101d is not supported): error when creating "/data/install/sriov-cx4.yaml": admission webhook "operator-webhook.sriovnetwork.openshift.io" denied the request: vendor/device 15b3/101d is not supported
# to restore
oc delete -f /data/install/sriov-cx4.yaml
oc get sriovoperatorconfig default -n openshift-sriov-network-operator -o yaml | yq e '.spec' -
# enableInjector: true
# enableOperatorWebhook: true
# logLevel: 2
oc patch sriovoperatorconfig default --type=merge \
-n openshift-sriov-network-operator \
--patch '{ "spec": { "enableOperatorWebhook": false } }'
oc get sriovoperatorconfig default -n openshift-sriov-network-operator -o yaml | yq e '.spec' -
# enableInjector: true
# enableOperatorWebhook: false
# logLevel: 2
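With the webhook switched off, the operator should also tear down its webhook pods after a short while; a quick way to confirm (a sketch):
oc get pods -n openshift-sriov-network-operator
# the operator-webhook-* pods are expected to disappear; the injector and config daemons stay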
oc create -f /data/install/sriov-cx4.yaml
# sriovnetworknodepolicy.sriovnetwork.openshift.io/policy-cx4-net-1 created
# you can see, VF num set to '4'
# oc get sriovnetworknodestates worker-1 -n openshift-sriov-network-operator -o json | jq "(.metadata.name, .status)"
oc get sriovnetworknodestates worker-1 -n openshift-sriov-network-operator -o yaml | yq e "del(.metadata.managedFields)" -
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodeState
metadata:
creationTimestamp: "2021-06-30T16:00:09Z"
generation: 4
name: worker-1
namespace: openshift-sriov-network-operator
ownerReferences:
- apiVersion: sriovnetwork.openshift.io/v1
blockOwnerDeletion: true
controller: true
kind: SriovNetworkNodePolicy
name: default
uid: cef00fc5-7952-42ec-b863-980fdc1e6318
resourceVersion: "4425538"
selfLink: /apis/sriovnetwork.openshift.io/v1/namespaces/openshift-sriov-network-operator/sriovnetworknodestates/worker-1
uid: fcf58d46-3127-4956-ac2f-df5ce2e2ac8c
spec:
dpConfigVersion: "4381421"
interfaces:
- name: enp4s0f0
numVfs: 4
pciAddress: "0000:04:00.0"
vfGroups:
- deviceType: netdevice
policyName: policy-cx4-net-1
resourceName: cx4nic1
vfRange: 0-3
- name: enp4s0f1
numVfs: 4
pciAddress: "0000:04:00.1"
vfGroups:
- deviceType: netdevice
policyName: policy-cx4-net-1
resourceName: cx4nic1
vfRange: 0-3
status:
interfaces:
- deviceID: 165f
driver: tg3
linkSpeed: 1000 Mb/s
linkType: ETH
mac: 90:b1:1c:44:d6:0f
mtu: 1500
name: eno1
pciAddress: "0000:01:00.0"
vendor: "14e4"
- deviceID: 165f
driver: tg3
linkSpeed: -1 Mb/s
linkType: ETH
mac: 90:b1:1c:44:d6:10
mtu: 1500
name: eno2
pciAddress: "0000:01:00.1"
vendor: "14e4"
- deviceID: 165f
driver: tg3
linkSpeed: -1 Mb/s
linkType: ETH
mac: 90:b1:1c:44:d6:11
mtu: 1500
name: eno3
pciAddress: "0000:02:00.0"
vendor: "14e4"
- deviceID: 165f
driver: tg3
linkSpeed: -1 Mb/s
linkType: ETH
mac: 90:b1:1c:44:d6:12
mtu: 1500
name: eno4
pciAddress: "0000:02:00.1"
vendor: "14e4"
- Vfs:
- deviceID: 101e
driver: mlx5_core
mac: 36:da:1c:a9:47:9a
mtu: 1500
name: enp4s0f0v0
pciAddress: "0000:04:00.2"
vendor: 15b3
vfID: 0
- deviceID: 101e
driver: mlx5_core
mac: 62:ab:95:db:e6:cc
mtu: 1500
name: enp4s0f0v1
pciAddress: "0000:04:00.3"
vendor: 15b3
vfID: 1
- deviceID: 101e
driver: mlx5_core
pciAddress: "0000:04:00.4"
vendor: 15b3
vfID: 2
- deviceID: 101e
driver: mlx5_core
mac: 5e:9f:cc:cc:e4:a1
mtu: 1500
name: enp4s0f0v3
pciAddress: "0000:04:00.5"
vendor: 15b3
vfID: 3
deviceID: 101d
driver: mlx5_core
eSwitchMode: legacy
linkSpeed: -1 Mb/s
linkType: ETH
mac: 0c:42:a1:fa:18:52
mtu: 1500
name: enp4s0f0
numVfs: 4
pciAddress: "0000:04:00.0"
totalvfs: 4
vendor: 15b3
- Vfs:
- deviceID: 101e
driver: mlx5_core
mac: e6:75:48:6f:56:33
mtu: 1500
name: enp4s0f1v0
pciAddress: "0000:04:00.6"
vendor: 15b3
vfID: 0
- deviceID: 101e
driver: mlx5_core
mac: 5a:74:7a:e7:3d:2b
mtu: 1500
name: enp4s0f1v1
pciAddress: "0000:04:00.7"
vendor: 15b3
vfID: 1
- deviceID: 101e
driver: mlx5_core
mac: 62:f8:19:98:d5:5f
mtu: 1500
name: enp4s0f1v2
pciAddress: "0000:04:01.0"
vendor: 15b3
vfID: 2
- deviceID: 101e
driver: mlx5_core
mac: f2:14:1e:93:e9:39
mtu: 1500
name: enp4s0f1v3
pciAddress: "0000:04:01.1"
vendor: 15b3
vfID: 3
deviceID: 101d
driver: mlx5_core
eSwitchMode: legacy
linkSpeed: -1 Mb/s
linkType: ETH
mac: 0c:42:a1:fa:18:53
mtu: 1500
name: enp4s0f1
numVfs: 4
pciAddress: "0000:04:00.1"
totalvfs: 4
vendor: 15b3
syncStatus: Succeeded
cat << EOF > /data/install/sriov-network.yaml
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
name: mlx-dpdk-network
namespace: openshift-sriov-network-operator
spec:
networkNamespace: demo
ipam: "{}"
resourceName: cx4nic1
EOF
oc create -f /data/install/sriov-network.yaml
# to restore
oc delete -f /data/install/sriov-network.yaml
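Behind the scenes, the SriovNetwork should result in a NetworkAttachmentDefinition in the demo namespace (assuming that namespace exists); that is what the pod annotation further below refers to. A quick check (a sketch):
oc get network-attachment-definitions -n demo
# NAME               AGE
# mlx-dpdk-network   ...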
# https://github.com/openshift/sriov-network-operator/issues/133
lspci -vv | grep -i Mellanox
# 04:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
# Subsystem: Mellanox Technologies Device 0011
# 04:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
# Subsystem: Mellanox Technologies Device 0011
# 04:00.2 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
# Subsystem: Mellanox Technologies Device 0011
# 04:00.3 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
# Subsystem: Mellanox Technologies Device 0011
# 04:00.4 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
# Subsystem: Mellanox Technologies Device 0011
# 04:00.5 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
# Subsystem: Mellanox Technologies Device 0011
# 04:00.6 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
# Subsystem: Mellanox Technologies Device 0011
# 04:00.7 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
# Subsystem: Mellanox Technologies Device 0011
# 04:01.0 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
# Subsystem: Mellanox Technologies Device 0011
# 04:01.1 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
# Subsystem: Mellanox Technologies Device 0011
lspci -nvv | grep "04:00.0\|04:00.1"
# pcilib: sysfs_read_vpd: read failed: Input/output error
# 04:00.0 0200: 15b3:101d
# 04:00.1 0200: 15b3:101d
lspci | grep -i Mellanox | awk '{print $1}' | xargs -I DEMO sh -c "lspci -nvv | grep DEMO "
# pcilib: sysfs_read_vpd: read failed: Input/output error
# 04:00.0 0200: 15b3:101d
# pcilib: sysfs_read_vpd: read failed: Input/output error
# 04:00.1 0200: 15b3:101d
# pcilib: sysfs_read_vpd: read failed: Input/output error
# 04:00.2 0200: 15b3:101e
# pcilib: sysfs_read_vpd: read failed: Input/output error
# 04:00.3 0200: 15b3:101e
# pcilib: sysfs_read_vpd: read failed: Input/output error
# 04:00.4 0200: 15b3:101e
# pcilib: sysfs_read_vpd: read failed: Input/output error
# 04:00.5 0200: 15b3:101e
# pcilib: sysfs_read_vpd: read failed: Input/output error
# 04:00.6 0200: 15b3:101e
# pcilib: sysfs_read_vpd: read failed: Input/output error
# 04:00.7 0200: 15b3:101e
# pcilib: sysfs_read_vpd: read failed: Input/output error
# 04:01.0 0200: 15b3:101e
# pcilib: sysfs_read_vpd: read failed: Input/output error
# 04:01.1 0200: 15b3:101e
# <human readable name>: <vendor ID> <pf ID> <vf ID>
cat << EOF > /data/install/sriov-unsupport.yaml
apiVersion: v1
data:
CX6DX: 15b3 101d 101e
kind: ConfigMap
metadata:
name: unsupported-nic-ids
namespace: openshift-sriov-network-operator
EOF
oc create -f /data/install/sriov-unsupport.yaml
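To confirm the SR-IOV operator picked up the unsupported-NIC list, a quick check (a minimal sketch using the resources created above):
# the ConfigMap should carry the CX6DX entry
oc get configmap unsupported-nic-ids -n openshift-sriov-network-operator -o yaml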
# try to deploy a demo pod
cat << EOF > /data/install/dpdk-test.yaml
apiVersion: v1
kind: Pod
metadata:
name: dpdk-app
annotations:
k8s.v1.cni.cncf.io/networks: mlx-dpdk-network
spec:
containers:
- name: testpmd
image: registry.redhat.io/openshift4/dpdk-base-rhel8:v4.6
securityContext:
capabilities:
add: ["IPC_LOCK"]
volumeMounts:
- mountPath: /dev/hugepages
name: hugepage
resources:
limits:
openshift.io/cx4nic1: "1"
memory: "1Gi"
cpu: "4"
hugepages-1Gi: "4Gi"
requests:
openshift.io/cx4nic1: "1"
memory: "1Gi"
cpu: "4"
hugepages-1Gi: "4Gi"
command: ["sleep", "infinity"]
volumes:
- name: hugepage
emptyDir:
medium: HugePages
EOF
oc create -n demo -f /data/install/dpdk-test.yaml
# to restore
oc delete -n demo -f /data/install/dpdk-test.yaml
# in the pod
rpm -ql dpdk-tools
# /usr/sbin/dpdk-devbind
# /usr/share/dpdk/usertools
# /usr/share/dpdk/usertools/cpu_layout.py
# /usr/share/dpdk/usertools/dpdk-devbind.py
# /usr/share/dpdk/usertools/dpdk-pmdinfo.py
# /usr/share/dpdk/usertools/dpdk-telemetry-client.py
/usr/share/dpdk/usertools/dpdk-devbind.py --status-dev net
# lspci: Unable to load libkmod resources: error -12
# lspci: Unable to load libkmod resources: error -12
# lspci: Unable to load libkmod resources: error -12
# lspci: Unable to load libkmod resources: error -12
# lspci: Unable to load libkmod resources: error -12
# lspci: Unable to load libkmod resources: error -12
# lspci: Unable to load libkmod resources: error -12
# Network devices using kernel driver
# ===================================
# 0000:01:00.0 'NetXtreme BCM5720 2-port Gigabit Ethernet PCIe 165f' if= drv=tg3 unused=
# 0000:01:00.1 'NetXtreme BCM5720 2-port Gigabit Ethernet PCIe 165f' if= drv=tg3 unused=
# 0000:02:00.0 'NetXtreme BCM5720 2-port Gigabit Ethernet PCIe 165f' if= drv=tg3 unused=
# 0000:02:00.1 'NetXtreme BCM5720 2-port Gigabit Ethernet PCIe 165f' if= drv=tg3 unused=
# 0000:04:00.0 'MT2892 Family [ConnectX-6 Dx] 101d' if= drv=mlx5_core unused=
# 0000:04:00.1 'MT2892 Family [ConnectX-6 Dx] 101d' if= drv=mlx5_core unused=
# 0000:04:00.2 'ConnectX Family mlx5Gen Virtual Function 101e' if= drv=mlx5_core unused=
# 0000:04:00.3 'ConnectX Family mlx5Gen Virtual Function 101e' if= drv=mlx5_core unused=
# 0000:04:00.4 'ConnectX Family mlx5Gen Virtual Function 101e' if=net1 drv=mlx5_core unused=
# 0000:04:00.5 'ConnectX Family mlx5Gen Virtual Function 101e' if= drv=mlx5_core unused=
# 0000:04:00.6 'ConnectX Family mlx5Gen Virtual Function 101e' if= drv=mlx5_core unused=
# 0000:04:00.7 'ConnectX Family mlx5Gen Virtual Function 101e' if= drv=mlx5_core unused=
# 0000:04:01.0 'ConnectX Family mlx5Gen Virtual Function 101e' if= drv=mlx5_core unused=
# 0000:04:01.1 'ConnectX Family mlx5Gen Virtual Function 101e' if= drv=mlx5_core unused=
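Inside the pod, the SR-IOV device plugin also advertises which VF was allocated through an injected environment variable; the variable name below is derived from the resource name openshift.io/cx4nic1 and is an assumption to verify on your cluster (a minimal sketch):
# the injected variable should point at one of the 15b3:101e VFs listed above
env | grep -i pcidevice
# e.g. PCIDEVICE_OPENSHIFT_IO_CX4NIC1=0000:04:00.4   (illustrative output)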
KVM does not support SR-IOV PF passthrough; it only supports VF passthrough.
- https://www.cnblogs.com/dion-90/articles/8522733.html
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_deployment_and_administration_guide/sect-pci_devices-pci_passthrough
# on 101
ls /sys/class/net/
lspci -vv | grep -i Mellanox
# pcilib: sysfs_read_vpd: read failed: Input/output error
# 05:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
# Subsystem: Mellanox Technologies Stand-up ConnectX-4 Lx EN, 25GbE dual-port SFP28, PCIe3.0 x8, MCX4121A-ACAT
# 05:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
# Subsystem: Mellanox Technologies Stand-up ConnectX-4 Lx EN, 25GbE dual-port SFP28, PCIe3.0 x8, MCX4121A-ACAT
# 07:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
# Subsystem: Mellanox Technologies Stand-up ConnectX-4 Lx EN, 25GbE dual-port SFP28, PCIe3.0 x8, MCX4121A-ACAT
# 07:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
# Subsystem: Mellanox Technologies Stand-up ConnectX-4 Lx EN, 25GbE dual-port SFP28, PCIe3.0 x8, MCX4121A-ACAT
virsh nodedev-list | grep 000_05
# pci_0000_05_00_0
# pci_0000_05_00_1
virsh nodedev-dumpxml pci_0000_05_00_0
<device>
<name>pci_0000_05_00_0</name>
<path>/sys/devices/pci0000:00/0000:00:06.0/0000:05:00.0</path>
<parent>pci_0000_00_06_0</parent>
<driver>
<name>mlx5_core</name>
</driver>
<capability type='pci'>
<domain>0</domain>
<bus>5</bus>
<slot>0</slot>
<function>0</function>
<product id='0x1015'>MT27710 Family [ConnectX-4 Lx]</product>
<vendor id='0x15b3'>Mellanox Technologies</vendor>
<capability type='virt_functions' maxCount='64'/>
<iommuGroup number='17'>
<address domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
</iommuGroup>
<pci-express>
<link validity='cap' port='0' speed='8' width='8'/>
<link validity='sta' speed='5' width='4'/>
</pci-express>
</capability>
</device>
virsh nodedev-dumpxml pci_0000_05_00_1
<device>
<name>pci_0000_05_00_1</name>
<path>/sys/devices/pci0000:00/0000:00:06.0/0000:05:00.1</path>
<parent>pci_0000_00_06_0</parent>
<driver>
<name>mlx5_core</name>
</driver>
<capability type='pci'>
<domain>0</domain>
<bus>5</bus>
<slot>0</slot>
<function>1</function>
<product id='0x1015'>MT27710 Family [ConnectX-4 Lx]</product>
<vendor id='0x15b3'>Mellanox Technologies</vendor>
<capability type='virt_functions' maxCount='64'/>
<iommuGroup number='18'>
<address domain='0x0000' bus='0x05' slot='0x00' function='0x1'/>
</iommuGroup>
<pci-express>
<link validity='cap' port='0' speed='8' width='8'/>
<link validity='sta' speed='5' width='4'/>
</pci-express>
</capability>
</device>
on 103
ls /sys/class/net/
# baremetal eno1 eno2 eno3 eno4 enp4s0f0 enp4s0f1 lo virbr0 virbr0-nic
echo 0 > /sys/class/net/enp4s0f0/device/sriov_numvfs
echo 0 > /sys/class/net/enp4s0f1/device/sriov_numvfs
lspci -vv | grep -i Mellanox
# 04:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
# Subsystem: Mellanox Technologies Device 0011
# 04:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
# Subsystem: Mellanox Technologies Device 0011
virsh nodedev-list | grep 000_04
# pci_0000_04_00_0
# pci_0000_04_00_1
virsh nodedev-dumpxml pci_0000_04_00_0
<device>
<name>pci_0000_04_00_0</name>
<path>/sys/devices/pci0000:00/0000:00:02.0/0000:04:00.0</path>
<parent>pci_0000_00_02_0</parent>
<driver>
<name>mlx5_core</name>
</driver>
<capability type='pci'>
<class>0x020000</class>
<domain>0</domain>
<bus>4</bus>
<slot>0</slot>
<function>0</function>
<product id='0x101d'>MT2892 Family [ConnectX-6 Dx]</product>
<vendor id='0x15b3'>Mellanox Technologies</vendor>
<capability type='virt_functions' maxCount='8'/>
<iommuGroup number='27'>
<address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
</iommuGroup>
<numa node='0'/>
<pci-express>
<link validity='cap' port='0' speed='16' width='16'/>
<link validity='sta' speed='8' width='8'/>
</pci-express>
</capability>
</device>
virsh nodedev-dumpxml pci_0000_04_00_1
<device>
<name>pci_0000_04_00_1</name>
<path>/sys/devices/pci0000:00/0000:00:02.0/0000:04:00.1</path>
<parent>pci_0000_00_02_0</parent>
<driver>
<name>mlx5_core</name>
</driver>
<capability type='pci'>
<class>0x020000</class>
<domain>0</domain>
<bus>4</bus>
<slot>0</slot>
<function>1</function>
<product id='0x101d'>MT2892 Family [ConnectX-6 Dx]</product>
<vendor id='0x15b3'>Mellanox Technologies</vendor>
<capability type='virt_functions' maxCount='8'/>
<iommuGroup number='28'>
<address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
</iommuGroup>
<numa node='0'/>
<pci-express>
<link validity='cap' port='0' speed='16' width='16'/>
<link validity='sta' speed='8' width='8'/>
</pci-express>
</capability>
</device>
For ocp4-aHelper, change the KVM config below
<hostdev mode='subsystem' type='pci' managed='yes'>
<driver name='vfio'/>
<source>
<address domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
</source>
<alias name='hostdev0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
<driver name='vfio'/>
<source>
<address domain='0x0000' bus='0x05' slot='0x00' function='0x1'/>
</source>
<alias name='hostdev1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
</hostdev>
to
<interface type='hostdev' managed='yes'>
<driver name='vfio'/>
<source>
<address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
</source>
</interface>
<interface type='hostdev' managed='yes'>
<driver name='vfio'/>
<source>
<address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x1'/>
</source>
</interface>
virsh edit ocp4-aHelper
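After editing, a quick sanity check (a minimal sketch, using the same domain and devices as above):
# on the KVM host: the hostdev entries should now appear as <interface type='hostdev'>
virsh dumpxml ocp4-aHelper | grep -B 2 -A 4 "hostdev"
# inside the guest, once it boots, the ConnectX-4 Lx functions should show up as normal PCI NICs
lspci | grep -i mellanox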
keepalived operator in openshift4
Pain points
The standard OpenShift 4 installation uses the router (HAProxy) as the ingress that brings traffic into the cluster. By default this only works at layer 7; it can be customized to work at layer 4, but either way it is inconvenient, both for managing the externally exposed IP addresses and for handling application port conflicts.
The root cause is that an on-premises OpenShift 4 installation does not support the LoadBalancer service type. So today we bring in the keepalived operator to make up for that gap.
Video walkthrough:
This article references the following OpenShift blog post and project:
- https://www.openshift.com/blog/self-hosted-load-balancer-for-openshift-an-operator-based-approach
- https://github.com/redhat-cop/keepalived-operator
Lab architecture
As shown, keepalived creates a secondary IP on a node according to the service definition, and external traffic enters the cluster through that IP. This is one way of implementing the Kubernetes LoadBalancer service type; compared with the ingress controller approach, it natively supports layer-4 TCP forwarding.
Installing the keepalived operator is straightforward.
After the installation in the web console, the nodes need to be labelled and the permissions adjusted:
oc label node master-2 node-role.kubernetes.io/loadbalancer=""
oc label node master-1 node-role.kubernetes.io/loadbalancer=""
oc adm policy add-scc-to-user privileged -z default -n keepalived-operator
Next, let's look at what is special about the keepalived deployment.
We can see that the keepalived pods use hostNetwork and privileged: true, but they do not mount any special host directories (the full pod definition is listed further below).
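A quick way to check where the keepalived daemonset pods landed (a minimal sketch; namespace as used in this lab):
oc get keepalivedgroup -n keepalived-operator
oc get pod -n keepalived-operator -o wide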
Deploy a test application
cat << 'EOF' > /data/install/network-patch.yaml
spec:
externalIP:
policy:
allowedCIDRs:
- ${ALLOWED_CIDR}
autoAssignCIDRs:
- "${AUTOASSIGNED_CIDR}"
EOF
# export VERSION="4.9.4"
# export BINARY="yq_linux_amd64"
# wget https://github.com/mikefarah/yq/releases/download/${VERSION}/${BINARY} -O /usr/local/bin/yq && chmod +x /usr/local/bin/yq
# prefix length -> number of addresses: /24=256, /25=128, /26=64, /27=32, /28=16
cd /data/install
export ALLOWED_CIDR="172.21.6.33/27"
export AUTOASSIGNED_CIDR="172.21.6.33/27"
oc patch network cluster -p "$(envsubst < ./network-patch.yaml | yq eval -j -)" --type=merge
oc get network cluster -o yaml
# spec:
# clusterNetwork:
# - cidr: 10.254.0.0/16
# hostPrefix: 24
# externalIP:
# autoAssignCIDRs:
# - 172.21.6.33/27
# policy:
# allowedCIDRs:
# - 172.21.6.33/27
# networkType: OpenShiftSDN
# serviceNetwork:
# - 172.30.0.0/16
# status:
# clusterNetwork:
# - cidr: 10.254.0.0/16
# hostPrefix: 24
# clusterNetworkMTU: 1450
# networkType: OpenShiftSDN
# serviceNetwork:
# - 172.30.0.0/16
oc new-project demo
cat << EOF > /data/install/demo.yaml
---
apiVersion: v1
kind: Pod
metadata:
name: test-0
labels:
env: test
spec:
restartPolicy: OnFailure
nodeSelector:
kubernetes.io/hostname: 'master-0'
containers:
- name: php
image: "quay.io/wangzheng422/php:demo.02"
---
apiVersion: v1
kind: Pod
metadata:
name: test-1
labels:
env: test
spec:
restartPolicy: OnFailure
nodeSelector:
kubernetes.io/hostname: 'master-2'
containers:
- name: php
image: "quay.io/wangzheng422/php:demo.02"
---
kind: Service
apiVersion: v1
metadata:
name: demo
annotations:
keepalived-operator.redhat-cop.io/keepalivedgroup: keepalived-operator/keepalivedgroup-workers
spec:
type: LoadBalancer
ports:
- name: "http"
protocol: TCP
port: 80
targetPort: 80
selector:
env: test
EOF
oc create -n demo -f /data/install/demo.yaml
# to restore
oc delete -n demo -f /data/install/demo.yaml
Analyze the application's behaviour
Looking at the service, we can see that an external IP has already been assigned:
oc get svc
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# demo LoadBalancer 172.30.203.237 172.21.6.50,172.21.6.50 80:31682/TCP 14m
curl http://172.21.6.50/
# Hello!<br>Welcome to RedHat Developer<br>Enjoy all of the ad-free articles<br>
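The external IP is carried as an additional address on one of the loadbalancer nodes; a minimal sketch to see it (node and address are the ones from this lab):
# on the node currently holding the VIP, e.g. master-1 or master-2
ip -4 addr show | grep 172.21.6.50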
On master-2, the related iptables rule:
0 0 KUBE-FW-ZFZLPEKTCJ3DBGAL tcp -- * * 0.0.0.0/0 172.21.6.50 /* demo/demo:http loadbalancer IP */ tcp dpt:80
We can see that the service's firewall chain distributes the traffic to the pods.
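To follow that chain further on the node (a minimal sketch; the KUBE-FW-... chain name is the one printed above and is specific to this cluster):
# on master-2
iptables -t nat -L KUBE-FW-ZFZLPEKTCJ3DBGAL -n -v
iptables-save -t nat | grep 172.21.6.50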
Keepalived pod definition
We can see that it uses hostNetwork and privileged: true:
kind: Pod
apiVersion: v1
metadata:
generateName: keepalivedgroup-workers-
annotations:
openshift.io/scc: privileged
selfLink: /api/v1/namespaces/keepalived-operator/pods/keepalivedgroup-workers-fgzv8
resourceVersion: '2700532'
name: keepalivedgroup-workers-fgzv8
uid: 1addc7c7-4e6d-49c7-ae5e-3a4e2963755b
creationTimestamp: '2021-06-09T08:51:40Z'
namespace: keepalived-operator
ownerReferences:
- apiVersion: apps/v1
kind: DaemonSet
name: keepalivedgroup-workers
uid: dba36a9c-f2aa-4951-aa60-a3836275ae1b
controller: true
blockOwnerDeletion: true
labels:
controller-revision-hash: 7459c85f64
keepalivedGroup: keepalivedgroup-workers
pod-template-generation: '1'
spec:
nodeSelector:
node-role.kubernetes.io/loadbalancer: ''
restartPolicy: Always
initContainers:
- resources: {}
terminationMessagePath: /dev/termination-log
name: config-setup
command:
- bash
- '-c'
- /usr/local/bin/notify.sh
env:
- name: file
value: /etc/keepalived.d/src/keepalived.conf
- name: dst_file
value: /etc/keepalived.d/dst/keepalived.conf
- name: reachip
- name: create_config_only
value: 'true'
securityContext:
runAsUser: 0
imagePullPolicy: Always
volumeMounts:
- name: config
readOnly: true
mountPath: /etc/keepalived.d/src
- name: config-dst
mountPath: /etc/keepalived.d/dst
terminationMessagePolicy: File
image: 'quay.io/redhat-cop/keepalived-operator:latest'
serviceAccountName: default
imagePullSecrets:
- name: default-dockercfg-2d5d5
priority: 0
schedulerName: default-scheduler
hostNetwork: true
enableServiceLinks: false
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchFields:
- key: metadata.name
operator: In
values:
- master-1
terminationGracePeriodSeconds: 30
shareProcessNamespace: true
preemptionPolicy: PreemptLowerPriority
nodeName: master-1
securityContext: {}
containers:
- resources: {}
terminationMessagePath: /dev/termination-log
name: keepalived
command:
- /bin/bash
env:
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
securityContext:
privileged: true
imagePullPolicy: Always
volumeMounts:
- name: lib-modules
readOnly: true
mountPath: /lib/modules
- name: config-dst
readOnly: true
mountPath: /etc/keepalived.d
- name: pid
mountPath: /etc/keepalived.pid
- name: stats
mountPath: /tmp
terminationMessagePolicy: File
image: registry.redhat.io/openshift4/ose-keepalived-ipfailover
args:
- '-c'
- >
exec /usr/sbin/keepalived --log-console --log-detail --dont-fork
--config-id=${POD_NAME} --use-file=/etc/keepalived.d/keepalived.conf
--pid=/etc/keepalived.pid/keepalived.pid
- resources: {}
terminationMessagePath: /dev/termination-log
name: config-reloader
command:
- bash
- '-c'
- /usr/local/bin/notify.sh
env:
- name: pid
value: /etc/keepalived.pid/keepalived.pid
- name: file
value: /etc/keepalived.d/src/keepalived.conf
- name: dst_file
value: /etc/keepalived.d/dst/keepalived.conf
- name: reachip
- name: create_config_only
value: 'false'
securityContext:
runAsUser: 0
imagePullPolicy: Always
volumeMounts:
- name: config
readOnly: true
mountPath: /etc/keepalived.d/src
- name: config-dst
mountPath: /etc/keepalived.d/dst
- name: pid
mountPath: /etc/keepalived.pid
terminationMessagePolicy: File
image: 'quay.io/redhat-cop/keepalived-operator:latest'
- resources: {}
terminationMessagePath: /dev/termination-log
name: prometheus-exporter
command:
- /usr/local/bin/keepalived_exporter
securityContext:
privileged: true
ports:
- name: metrics
hostPort: 9650
containerPort: 9650
protocol: TCP
imagePullPolicy: Always
volumeMounts:
- name: lib-modules
readOnly: true
mountPath: /lib/modules
- name: stats
mountPath: /tmp
terminationMessagePolicy: File
image: 'quay.io/redhat-cop/keepalived-operator:latest'
args:
- '-web.listen-address'
- ':9650'
- '-web.telemetry-path'
- /metrics
automountServiceAccountToken: false
serviceAccount: default
volumes:
- name: lib-modules
hostPath:
path: /lib/modules
type: ''
- name: config
configMap:
name: keepalivedgroup-workers
defaultMode: 420
- name: config-dst
emptyDir: {}
- name: pid
emptyDir:
medium: Memory
- name: stats
emptyDir: {}
dnsPolicy: ClusterFirst
tolerations:
- operator: Exists
- key: node.kubernetes.io/not-ready
operator: Exists
effect: NoExecute
- key: node.kubernetes.io/unreachable
operator: Exists
effect: NoExecute
- key: node.kubernetes.io/disk-pressure
operator: Exists
effect: NoSchedule
- key: node.kubernetes.io/memory-pressure
operator: Exists
effect: NoSchedule
- key: node.kubernetes.io/pid-pressure
operator: Exists
effect: NoSchedule
- key: node.kubernetes.io/unschedulable
operator: Exists
effect: NoSchedule
- key: node.kubernetes.io/network-unavailable
operator: Exists
effect: NoSchedule
status:
containerStatuses:
- restartCount: 0
started: true
ready: true
name: config-reloader
state:
running:
startedAt: '2021-06-09T08:52:34Z'
imageID: >-
quay.io/redhat-cop/keepalived-operator@sha256:dab32df252b705b07840dc0488fce0577ed743aaa33bed47e293f115bdda9348
image: 'quay.io/redhat-cop/keepalived-operator:latest'
lastState: {}
containerID: 'cri-o://2d9c37aea1c623f1ff4afb50233c1d67567d3315ea64d10476cd613e8ccc2d04'
- restartCount: 0
started: true
ready: true
name: keepalived
state:
running:
startedAt: '2021-06-09T08:52:34Z'
imageID: >-
registry.redhat.io/openshift4/ose-keepalived-ipfailover@sha256:385f014b07acc361d1bb41ffd9d3abc151ab64e01f42dacba80053a4dfcbd242
image: 'registry.redhat.io/openshift4/ose-keepalived-ipfailover:latest'
lastState: {}
containerID: 'cri-o://02b384c94506b7dcbd18cbf8ceadef83b366c356de36b8e2646cc233f1c23902'
- restartCount: 0
started: true
ready: true
name: prometheus-exporter
state:
running:
startedAt: '2021-06-09T08:52:34Z'
imageID: >-
quay.io/redhat-cop/keepalived-operator@sha256:dab32df252b705b07840dc0488fce0577ed743aaa33bed47e293f115bdda9348
image: 'quay.io/redhat-cop/keepalived-operator:latest'
lastState: {}
containerID: 'cri-o://daeb85bf94923d9562a0cc777664397269ed642bd0d86cf993f12a2ff6fff925'
qosClass: BestEffort
podIPs:
- ip: 192.168.7.14
podIP: 192.168.7.14
hostIP: 192.168.7.14
startTime: '2021-06-09T08:51:40Z'
initContainerStatuses:
- name: config-setup
state:
terminated:
exitCode: 0
reason: Completed
startedAt: '2021-06-09T08:51:54Z'
finishedAt: '2021-06-09T08:51:54Z'
containerID: >-
cri-o://9ecc0e9a469a0518a7ca2fc5feef551d56c052dfe569dba391d0c0fc998b2f41
lastState: {}
ready: true
restartCount: 0
image: 'quay.io/redhat-cop/keepalived-operator:latest'
imageID: >-
quay.io/redhat-cop/keepalived-operator@sha256:dab32df252b705b07840dc0488fce0577ed743aaa33bed47e293f115bdda9348
containerID: 'cri-o://9ecc0e9a469a0518a7ca2fc5feef551d56c052dfe569dba391d0c0fc998b2f41'
conditions:
- type: Initialized
status: 'True'
lastProbeTime: null
lastTransitionTime: '2021-06-09T08:51:55Z'
- type: Ready
status: 'True'
lastProbeTime: null
lastTransitionTime: '2021-06-09T08:52:35Z'
- type: ContainersReady
status: 'True'
lastProbeTime: null
lastTransitionTime: '2021-06-09T08:52:35Z'
- type: PodScheduled
status: 'True'
lastProbeTime: null
lastTransitionTime: '2021-06-09T08:51:40Z'
phase: Running
Prepare a PHP test image
# prepare a PHP test image
cat << 'EOF' > index.php
<?php
$localIP = getHostByName(getHostName());
ECHO "Hello!<br>";
echo "Welcome to RedHat Developer<br>";
EcHo "Enjoy all of the ad-free articles<br>".$localIP;
?>
EOF
cat << EOF > php.dockerfile
FROM php:apache
COPY . /var/www/html/
EOF
buildah bud -t quay.io/wangzheng422/php:demo.02 -f php.dockerfile .
buildah push quay.io/wangzheng422/php:demo.02
Real-Time Kernel for Openshift4
5G RAN vDU workloads have very strict real-time requirements on the operating system and essentially have to run on a real-time OS. OpenShift 4 is a PaaS platform tightly coupled with its operating system: it ships a real-time OS that uses the RHEL 8 kernel and is packaged with ostree.
There are two ways to enable the real-time OS on an OpenShift 4 node. One is the performance-addon operator:
- https://docs.openshift.com/container-platform/4.7/scalability_and_performance/cnf-performance-addon-operator-for-low-latency-nodes.html
The other is to do it directly with machine configs:
- https://docs.openshift.com/container-platform/4.7/post_installation_configuration/machine-configuration-tasks.html#nodes-nodes-rtkernel-arguments_post-install-machine-configuration-tasks
Deployment architecture for this lab
Video walkthrough:
How to do it at the operating-system level
A real-time OS is all about performance, so on a bare-metal host, container platform aside, how should we configure it so the real-time OS performs at its best?
In general there are two kinds of configuration:
- Tune the real-time operating system itself.
- Configure the server BIOS: disable hyper-threading, disable IRQ balancing, disable CPU C-states and other power-saving features.
For the first one, tuning the real-time OS, the host needs the following (a package sketch follows this list):
- install kernel-rt
- install rt-tests
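For reference, on a plain RHEL 8 host these packages come from the real-time repositories; a minimal sketch (the repo id assumes RHEL 8 on x86_64 with an entitled subscription):
subscription-manager repos --enable rhel-8-for-x86_64-rt-rpms
dnf install -y kernel-rt rt-tests tuned-profiles-realtime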
cat /etc/tuned/realtime-variables.conf
# isolated_cores=1-30
# isolate_managed_irq=Y
tuned-adm profile realtime
reboot
swapoff -a
systemctl stop irqbalance
For the second one, the BIOS configuration, consult the server vendor's documentation for its official low-latency tuning guide. For example:
System Setup Screen | Setting | Default | Recommended Alternative for Low-Latency Environments
---|---|---|---
Processor Settings | Logical Processor | Enabled | Disabled
Processor Settings | Turbo Mode | Enabled | Disabled
Processor Settings | C-States | Enabled | Disabled
Processor Settings | C1E | Enabled | Disabled
Power Management | Power Management | Active Power Controller | Maximum Performance
First we use the performance addon operator, which is the officially recommended approach.
The performance addon operator is an operator in OpenShift 4. Its job is to let the user write a simple piece of YAML and have the operator generate the complicated kernel parameter, kubelet, and tuned configuration on the user's behalf.
# on 104, create a new worker node
export KVM_DIRECTORY=/data/kvm
mkdir -p ${KVM_DIRECTORY}
cd ${KVM_DIRECTORY}
scp root@172.21.6.11:/data/install/{*worker-0}.iso ${KVM_DIRECTORY}/
virt-install --name=ocp4-worker0 --vcpus=4 --ram=8192 \
--disk path=/data/kvm/ocp4-worker0.qcow2,bus=virtio,size=120 \
--os-variant rhel8.0 --network bridge=br0,model=virtio \
--graphics vnc,listen=127.0.0.1,port=59005 \
--boot menu=on --cdrom ${KVM_DIRECTORY}/rhcos_install-worker-0.iso
# go back to helper
oc get csr
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
# install the performance addon operator following the official document
# https://docs.openshift.com/container-platform/4.7/scalability_and_performance/cnf-performance-addon-operator-for-low-latency-nodes.html
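Before creating any custom resources, it is worth confirming the operator actually installed; a minimal sketch (the namespace depends on how the operator was installed from OperatorHub, so treat it as an assumption):
oc get csv -A | grep -i performance
oc get pod -n openshift-performance-addon-operator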
cat << EOF > /data/install/worker-rt.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
name: worker-rt
labels:
machineconfiguration.openshift.io/role: worker-rt
spec:
machineConfigSelector:
matchExpressions:
- {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,worker-rt]}
nodeSelector:
matchLabels:
node-role.kubernetes.io/worker-rt: ""
EOF
oc create -f /data/install/worker-rt.yaml
oc label MachineConfigPool/worker-rt machineconfiguration.openshift.io/role=worker-rt
# to restore
oc delete -f /data/install/worker-rt.yaml
oc label node worker-0 node-role.kubernetes.io/worker-rt=""
# The configuration below reserves cores 0-1 for the system and gives cores 2-3 to applications; on a real physical server it is typically cores 2-19 for applications.
cat << EOF > /data/install/performance.yaml
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
name: example-performanceprofile
spec:
additionalKernelArgs:
- selinux=0
- intel_iommu=on
globallyDisableIrqLoadBalancing: true
cpu:
isolated: "2-3"
reserved: "0-1"
hugepages:
defaultHugepagesSize: "1G"
pages:
- size: "1G"
count: 2
node: 0
realTimeKernel:
enabled: true
numa:
topologyPolicy: "single-numa-node"
nodeSelector:
node-role.kubernetes.io/worker-rt: ""
EOF
oc create -f /data/install/performance.yaml
# restore
oc delete -f /data/install/performance.yaml
# check the result
ssh core@worker-0
uname -a
# Linux worker-0 4.18.0-240.22.1.rt7.77.el8_3.x86_64 #1 SMP PREEMPT_RT Fri Mar 26 18:44:48 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
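The rollout can also be checked from the cluster side (a minimal sketch, using the resource names created above):
oc get performanceprofile example-performanceprofile
oc get mcp worker-rt
# the worker-rt pool should eventually report UPDATED=True with the realtime kernel applied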
remove worker-0
oc delete node worker-0
virsh destroy ocp4-worker0
virsh undefine ocp4-worker0
Try machine config plus tuned; this is the DIY route if you like :)
The machine config approach is highly customizable: if a customer has already fine-tuned kernel parameters for their application on an rt-kernel operating system, those same parameters can be applied directly on OpenShift 4 through machine configs, without having to work out every tuning knob again.
# enable the real-time kernel on the nodes
# cat << EOF > /data/install/99-worker-realtime.yaml
# apiVersion: machineconfiguration.openshift.io/v1
# kind: MachineConfig
# metadata:
# labels:
# machineconfiguration.openshift.io/role: "worker-rt"
# name: 99-worker-realtime
# spec:
# kernelType: realtime
# EOF
# oc create -f /data/install/99-worker-realtime.yaml
# configure kernel boot parameters, one parameter per line
# http://abcdxyzk.github.io/blog/2015/02/11/kernel-base-param/
# no_timer_check clocksource=tsc tsc=perfect intel_pstate=disable selinux=0 enforcing=0 nmi_watchdog=0 softlockup_panic=0 isolcpus=2-19 nohz_full=2-19 idle=poll default_hugepagesz=1G hugepagesz=1G hugepages=32 skew_tick=1 rcu_nocbs=2-19 kthread_cpus=0-1 irqaffinity=0-1 rcu_nocb_poll iommu=pt intel_iommu=on
cat << EOF > /data/install/05-worker-kernelarg-realtime.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: worker-rt
name: 05-worker-kernelarg-realtime
spec:
config:
ignition:
version: 3.1.0
kernelArguments:
- no_timer_check # disable the kernel's clock-IRQ-source defect check; mainly works around high CPU usage and fast-running clocks on some AMD platforms
- clocksource=tsc # clocksource={jiffies|acpi_pm|hpet|tsc}; the TSC is a 64-bit Time Stamp Counter register inside the CPU. Unlike a periodic, interrupt-driven clock it is a monotonically increasing counter that software reads on demand, which is more precise and faster, but only usable on newer CPUs (Sandy Bridge and later)
- tsc=perfect
- intel_pstate=disable # intel_pstate handles frequency/thermal control on modern Intel processors; intel_pstate=disable forces the legacy acpi_cpufreq driver instead
- selinux=0
- enforcing=0
- nmi_watchdog=0 # disable the NMI (non-maskable interrupt) watchdog
- softlockup_panic=0 # do not panic the kernel when a soft lockup is detected
- isolcpus=2-19 # remove the listed CPUs from the kernel's SMP balancing and scheduling algorithms; they can still be used, but only by processes explicitly pinned to them (e.g. with taskset), so specific CPUs can be dedicated to specific processes
- nohz_full=2-19 # on a 16-core system, nohz_full=1-15 enables dynamic tickless operation on cores 1-15 and moves all timekeeping to the remaining core (core 0). Note: (1) the boot CPU (usually CPU 0) is unconditionally excluded from the list; (2) the CPUs listed here must also be listed in rcu_nocbs=...
- idle=poll # extra handling of CPU idle states; poll disables sleeping entirely (no C-states), which slightly improves CPU performance at a large cost in power consumption, so it is generally not recommended
- default_hugepagesz=1G
- hugepagesz=1G
- hugepages=32
- skew_tick=1 # Offset the periodic timer tick per cpu to mitigate xtime_lock contention on larger systems, and/or RCU lock contention on all systems with CONFIG_MAXSMP set. Note: increases power consumption, thus should only be enabled if running jitter sensitive (HPC/RT) workloads.
- rcu_nocbs=2-19 # specify which CPUs are no-callback (No-CB) CPUs
- kthread_cpus=0-1
- irqaffinity=0-1 # set the CPU affinity of Linux interrupts; non-CPU-bound interrupts are then handled by these cores by default, so IRQs are steered to cpu0/cpu1 and do not disturb the real-time applications on cpu2/cpu3
- rcu_nocb_poll # reduces the wakeups required from the offloaded CPUs by avoiding explicit wakeups of the rcuo kthreads, at the cost of some extra power
- iommu=pt
- intel_iommu=on
kernelType: realtime
EOF
oc create -f /data/install/05-worker-kernelarg-realtime.yaml
# CPU/NUMA pinning is usually needed as well; that is configured in the kubelet
cat << EOF > /data/install/cpumanager-kubeletconfig.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
name: cpumanager-enabled
spec:
machineConfigPoolSelector:
matchLabels:
custom-kubelet: cpumanager-enabled
kubeletConfig:
cpuManagerPolicy: static
cpuManagerReconcilePeriod: 5s
topologyManagerPolicy: single-numa-node
reservedSystemCPUs: "0,1"
EOF
oc create -f /data/install/cpumanager-kubeletconfig.yaml
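Note that the KubeletConfig above matches machine config pools by the custom-kubelet: cpumanager-enabled label; if the worker-rt pool does not carry that label yet, it can be added like this (a sketch matching the pool created earlier):
oc label machineconfigpool worker-rt custom-kubelet=cpumanager-enabled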
# skip the step below if irqbalance is already disabled in the BIOS
# cat << EOF > /data/install/99-custom-disable-irqbalance-worker.yaml
# apiVersion: machineconfiguration.openshift.io/v1
# kind: MachineConfig
# metadata:
# labels:
# machineconfiguration.openshift.io/role: worker-rt
# name: 99-custom-disable-irqbalance-worker
# spec:
# config:
# ignition:
# version: 2.2.0
# systemd:
# units:
# - enabled: false
# mask: true
# name: irqbalance.service
# EOF
# oc create -f /data/install/99-custom-disable-irqbalance-worker.yaml
# we take the example generated by the performance addon operator and modify it; this time it is based on the realtime tuned profile
cat << EOF > /data/install/tuned.yaml
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
name: wzh-realtime
namespace: openshift-cluster-node-tuning-operator
spec:
profile:
- data: |
[main]
summary=wzh version for realtime, 5G RAN
include=openshift-node,realtime
# Different values will override the original values in parent profiles.
[variables]
# isolated_cores take a list of ranges; e.g. isolated_cores=2,4-7
isolated_cores=2-19
isolate_managed_irq=Y
[service]
service.stalld=start,enable
name: wzh-realtime
recommend:
- machineConfigLabels:
machineconfiguration.openshift.io/role: worker-rt
priority: 20
profile: wzh-realtime
EOF
oc create -f /data/install/tuned.yaml
# to restore
oc delete -f /data/install/tuned.yaml
# https://zhuanlan.zhihu.com/p/336381111
# yum install rt-tests
# at the test site, after running over a whole night, the system showed very good real-time behaviour
# target result: the maximum latency should not exceed 6μs
cyclictest -m -p95 -d0 -a 2-17 -t 16
Try to deploy a vDU pod using the YAML below (apply commands are shown after the manifest)
---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: host-device-du
spec:
config: '{
"cniVersion": "0.3.0",
"type": "host-device",
"device": "ens81f1np1",
"ipam": {
"type": "host-local",
"subnet": "192.168.12.0/24",
"rangeStart": "192.168.12.105",
"rangeEnd": "192.168.12.105",
"routes": [{
"dst": "0.0.0.0/0"
}],
"gateway": "192.168.12.1"
}
}'
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: du-deployment1
labels:
app: du-deployment1
spec:
replicas: 1
selector:
matchLabels:
app: du-pod1
template:
metadata:
labels:
app: du-pod1
annotations:
k8s.v1.cni.cncf.io/networks: '[
{ "name": "host-device-du",
"interface": "net1" }
]'
spec:
containers:
- name: du-container1
image: "registry.ocp4.redhat.ren:5443/ocp4/centos:7.6.1810"
imagePullPolicy: IfNotPresent
tty: true
stdin: true
env:
- name: duNetProviderDriver
value: "host-netdevice"
command:
- sleep
- infinity
securityContext:
privileged: true
capabilities:
add:
- CAP_SYS_ADMIN
volumeMounts:
- mountPath: /hugepages
name: hugepage
- name: lib-modules
mountPath: /lib/modules
- name: src
mountPath: /usr/src
- name: dev
mountPath: /dev
- name: cache-volume
mountPath: /dev/shm
resources:
requests:
cpu: 16
memory: 48Gi
hugepages-1Gi: 8Gi
limits:
cpu: 16
memory: 48Gi
hugepages-1Gi: 8Gi
volumes:
- name: hugepage
emptyDir:
medium: HugePages
- name: lib-modules
hostPath:
path: /lib/modules
- name: src
hostPath:
path: /usr/src
- name: dev
hostPath:
path: "/dev"
- name: cache-volume
emptyDir:
medium: Memory
sizeLimit: 16Gi
nodeSelector:
node-role.kubernetes.io/worker-rt: ""
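To actually create the resources above, save the two manifests to a file and apply them (a minimal sketch; the file path is an arbitrary choice):
# assuming the YAML above was saved as /data/install/du-demo.yaml
oc apply -f /data/install/du-demo.yaml
oc get pod -o wide | grep du-deployment1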
research
oc get Tuned -n openshift-cluster-node-tuning-operator
# NAME AGE
# default 18d
# openshift-node-performance-example-performanceprofile 12d
# rendered 18d
oc get Tuned/default -o yaml -n openshift-cluster-node-tuning-operator
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
creationTimestamp: "2021-05-05T16:09:36Z"
generation: 1
name: default
namespace: openshift-cluster-node-tuning-operator
resourceVersion: "6067"
selfLink: /apis/tuned.openshift.io/v1/namespaces/openshift-cluster-node-tuning-operator/tuneds/default
uid: 205c01c5-2609-4f2f-b676-ad746ea3c9f3
spec:
profile:
- data: |
[main]
summary=Optimize systems running OpenShift (parent profile)
include=${f:virt_check:virtual-guest:throughput-performance}
[selinux]
avc_cache_threshold=8192
[net]
nf_conntrack_hashsize=131072
[sysctl]
net.ipv4.ip_forward=1
kernel.pid_max=>4194304
net.netfilter.nf_conntrack_max=1048576
net.ipv4.conf.all.arp_announce=2
net.ipv4.neigh.default.gc_thresh1=8192
net.ipv4.neigh.default.gc_thresh2=32768
net.ipv4.neigh.default.gc_thresh3=65536
net.ipv6.neigh.default.gc_thresh1=8192
net.ipv6.neigh.default.gc_thresh2=32768
net.ipv6.neigh.default.gc_thresh3=65536
vm.max_map_count=262144
[sysfs]
/sys/module/nvme_core/parameters/io_timeout=4294967295
/sys/module/nvme_core/parameters/max_retries=10
name: openshift
- data: |
[main]
summary=Optimize systems running OpenShift control plane
include=openshift
[sysctl]
# ktune sysctl settings, maximizing i/o throughput
#
# Minimal preemption granularity for CPU-bound tasks:
# (default: 1 msec# (1 + ilog(ncpus)), units: nanoseconds)
kernel.sched_min_granularity_ns=10000000
# The total time the scheduler will consider a migrated process
# "cache hot" and thus less likely to be re-migrated
# (system default is 500000, i.e. 0.5 ms)
kernel.sched_migration_cost_ns=5000000
# SCHED_OTHER wake-up granularity.
#
# Preemption granularity when tasks wake up. Lower the value to
# improve wake-up latency and throughput for latency critical tasks.
kernel.sched_wakeup_granularity_ns=4000000
name: openshift-control-plane
- data: |
[main]
summary=Optimize systems running OpenShift nodes
include=openshift
[sysctl]
net.ipv4.tcp_fastopen=3
fs.inotify.max_user_watches=65536
fs.inotify.max_user_instances=8192
name: openshift-node
recommend:
- match:
- label: node-role.kubernetes.io/master
- label: node-role.kubernetes.io/infra
operand:
debug: false
priority: 30
profile: openshift-control-plane
- operand:
debug: false
priority: 40
profile: openshift-node
status: {}
oc get Tuned/openshift-node-performance-example-performanceprofile -o yaml -n openshift-cluster-node-tuning-operator
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
name: openshift-node-performance-example-performanceprofile
namespace: openshift-cluster-node-tuning-operator
spec:
profile:
- data: "[main]\nsummary=Openshift node optimized for deterministic performance at the cost of increased power consumption, focused on low latency network performance. Based on Tuned 2.11 and Cluster node tuning (oc 4.5)\ninclude=openshift-node,cpu-partitioning\n\n# Inheritance of base profiles legend:\n# cpu-partitioning -> network-latency -> latency-performance\n# https://github.com/redhat-performance/tuned/blob/master/profiles/latency-performance/tuned.conf\n# https://github.com/redhat-performance/tuned/blob/master/profiles/network-latency/tuned.conf\n# https://github.com/redhat-performance/tuned/blob/master/profiles/cpu-partitioning/tuned.conf\n\n# All values are mapped with a comment where a parent profile contains them.\n# Different values will override the original values in parent profiles.\n\n[variables]\n# isolated_cores take a list of ranges; e.g. isolated_cores=2,4-7\n\nisolated_cores=2-3 \n\n\nnot_isolated_cores_expanded=${f:cpulist_invert:${isolated_cores_expanded}}\n\n[cpu]\nforce_latency=cstate.id:1|3 # latency-performance (override)\ngovernor=performance # latency-performance \nenergy_perf_bias=performance # latency-performance \nmin_perf_pct=100 # latency-performance \n\n[service]\nservice.stalld=start,enable\n\n[vm]\ntransparent_hugepages=never # network-latency\n\n\n[irqbalance]\n# Override the value set by cpu-partitioning with an empty one\nbanned_cpus=\"\"\n\n\n[scheduler]\ngroup.ksoftirqd=0:f:11:*:ksoftirqd.*\ngroup.rcuc=0:f:11:*:rcuc.*\n\ndefault_irq_smp_affinity = ignore\n\n\n[sysctl]\nkernel.hung_task_timeout_secs = 600 # cpu-partitioning #realtime\nkernel.nmi_watchdog = 0 # cpu-partitioning #realtime\nkernel.sched_rt_runtime_us = -1 # realtime \nkernel.timer_migration = 0 # cpu-partitioning (= 1) #realtime (= 0)\nkernel.numa_balancing=0 # network-latency\nnet.core.busy_read=50 # network-latency\nnet.core.busy_poll=50 # network-latency\nnet.ipv4.tcp_fastopen=3 # network-latency\nvm.stat_interval = 10 # cpu-partitioning #realtime\n\n# ktune sysctl settings for rhel6 servers, maximizing i/o throughput\n#\n# Minimal preemption granularity for CPU-bound tasks:\n# (default: 1 msec# (1 + ilog(ncpus)), units: nanoseconds)\nkernel.sched_min_granularity_ns=10000000 # latency-performance\n\n# If a workload mostly uses anonymous memory and it hits this limit, the entire\n# working set is buffered for I/O, and any more write buffering would require\n# swapping, so it's time to throttle writes until I/O can catch up. Workloads\n# that mostly use file mappings may be able to use even higher values.\n#\n# The generator of dirty data starts writeback at this percentage (system default\n# is 20%)\nvm.dirty_ratio=10 # latency-performance\n\n# Start background writeback (via writeback threads) at this percentage (system\n# default is 10%)\nvm.dirty_background_ratio=3 # latency-performance\n\n# The swappiness parameter controls the tendency of the kernel to move\n# processes out of physical memory and onto the swap disk.\n# 0 tells the kernel to avoid swapping processes out of physical memory\n# for as long as possible\n# 100 tells the kernel to aggressively swap processes out of physical memory\n# and move them to swap cache\nvm.swappiness=10 # latency-performance\n\n# The total time the scheduler will consider a migrated process\n# \"cache hot\" and thus less likely to be re-migrated\n# (system default is 500000, i.e. 
0.5 ms)\nkernel.sched_migration_cost_ns=5000000 # latency-performance\n\n[selinux]\navc_cache_threshold=8192 # Custom (atomic host)\n\n[net]\nnf_conntrack_hashsize=131072 # Custom (atomic host)\n\n[bootloader]\n# set empty values to disable RHEL initrd setting in cpu-partitioning \ninitrd_remove_dir= \ninitrd_dst_img=\ninitrd_add_dir=\n# overrides cpu-partitioning cmdline\ncmdline_cpu_part=+nohz=on rcu_nocbs=${isolated_cores} tuned.non_isolcpus=${not_isolated_cpumask} intel_pstate=disable nosoftlockup\n\ncmdline_realtime=+tsc=nowatchdog intel_iommu=on iommu=pt isolcpus=managed_irq,${isolated_cores} systemd.cpu_affinity=${not_isolated_cores_expanded}\n\ncmdline_hugepages=+ default_hugepagesz=1G \ncmdline_additionalArg=+\n"
name: openshift-node-performance-example-performanceprofile
recommend:
- machineConfigLabels:
machineconfiguration.openshift.io/role: worker-rt
priority: 20
profile: openshift-node-performance-example-performanceprofile
status: {}
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
name: openshift-node-performance-example-performanceprofile
namespace: openshift-cluster-node-tuning-operator
spec:
profile:
- data: |
[main]
summary=Openshift node optimized for deterministic performance at the cost of increased power consumption, focused on low latency network performance. Based on Tuned 2.11 and Cluster node tuning (oc 4.5)
include=openshift-node,cpu-partitioning
# Inheritance of base profiles legend:
# cpu-partitioning -> network-latency -> latency-performance
# https://github.com/redhat-performance/tuned/blob/master/profiles/latency-performance/tuned.conf
# https://github.com/redhat-performance/tuned/blob/master/profiles/network-latency/tuned.conf
# https://github.com/redhat-performance/tuned/blob/master/profiles/cpu-partitioning/tuned.conf
# All values are mapped with a comment where a parent profile contains them.
# Different values will override the original values in parent profiles.
[variables]
# isolated_cores take a list of ranges; e.g. isolated_cores=2,4-7
isolated_cores=2-3
not_isolated_cores_expanded=
[cpu]
force_latency=cstate.id:1|3 # latency-performance (override)
governor=performance # latency-performance
energy_perf_bias=performance # latency-performance
min_perf_pct=100 # latency-performance
[service]
service.stalld=start,enable
[vm]
transparent_hugepages=never # network-latency
[irqbalance]
# Override the value set by cpu-partitioning with an empty one
banned_cpus=""
[scheduler]
group.ksoftirqd=0:f:11:*:ksoftirqd.*
group.rcuc=0:f:11:*:rcuc.*
default_irq_smp_affinity = ignore
[sysctl]
kernel.hung_task_timeout_secs = 600 # cpu-partitioning #realtime
kernel.nmi_watchdog = 0 # cpu-partitioning #realtime
kernel.sched_rt_runtime_us = -1 # realtime
kernel.timer_migration = 0 # cpu-partitioning (= 1) #realtime (= 0)
kernel.numa_balancing=0 # network-latency
net.core.busy_read=50 # network-latency
net.core.busy_poll=50 # network-latency
net.ipv4.tcp_fastopen=3 # network-latency
vm.stat_interval = 10 # cpu-partitioning #realtime
# ktune sysctl settings for rhel6 servers, maximizing i/o throughput
#
# Minimal preemption granularity for CPU-bound tasks:
# (default: 1 msec# (1 + ilog(ncpus)), units: nanoseconds)
kernel.sched_min_granularity_ns=10000000 # latency-performance
# If a workload mostly uses anonymous memory and it hits this limit, the entire
# working set is buffered for I/O, and any more write buffering would require
# swapping, so it's time to throttle writes until I/O can catch up. Workloads
# that mostly use file mappings may be able to use even higher values.
#
# The generator of dirty data starts writeback at this percentage (system default
# is 20%)
vm.dirty_ratio=10 # latency-performance
# Start background writeback (via writeback threads) at this percentage (system
# default is 10%)
vm.dirty_background_ratio=3 # latency-performance
# The swappiness parameter controls the tendency of the kernel to move
# processes out of physical memory and onto the swap disk.
# 0 tells the kernel to avoid swapping processes out of physical memory
# for as long as possible
# 100 tells the kernel to aggressively swap processes out of physical memory
# and move them to swap cache
vm.swappiness=10 # latency-performance
# The total time the scheduler will consider a migrated process
# "cache hot" and thus less likely to be re-migrated
# (system default is 500000, i.e. 0.5 ms)
kernel.sched_migration_cost_ns=5000000 # latency-performance
[selinux]
avc_cache_threshold=8192 # Custom (atomic host)
[net]
nf_conntrack_hashsize=131072 # Custom (atomic host)
[bootloader]
# set empty values to disable RHEL initrd setting in cpu-partitioning
initrd_remove_dir=
initrd_dst_img=
initrd_add_dir=
# overrides cpu-partitioning cmdline
cmdline_cpu_part=+nohz=on rcu_nocbs= tuned.non_isolcpus= intel_pstate=disable nosoftlockup
cmdline_realtime=+tsc=nowatchdog intel_iommu=on iommu=pt isolcpus=managed_irq, systemd.cpu_affinity=
cmdline_hugepages=+ default_hugepagesz=1G
cmdline_additionalArg=+
name: openshift-node-performance-example-performanceprofile
recommend:
- machineConfigLabels:
machineconfiguration.openshift.io/role: worker-rt
priority: 20
profile: openshift-node-performance-example-performanceprofile
# tuned configuration; settings already applied in the BIOS can be skipped here. We take the example generated by the performance addon operator and modify it.
cat << EOF > /data/install/tuned.yaml
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
name: openshift-node-wzh-performance-profile
namespace: openshift-cluster-node-tuning-operator
spec:
profile:
- data: |
[main]
summary=Openshift node optimized for deterministic performance at the cost of increased power consumption, focused on low latency network performance. Based on Tuned 2.11 and Cluster node tuning (oc 4.5)
include=openshift-node,cpu-partitioning
# Inheritance of base profiles legend:
# cpu-partitioning -> network-latency -> latency-performance
# https://github.com/redhat-performance/tuned/blob/master/profiles/latency-performance/tuned.conf
# https://github.com/redhat-performance/tuned/blob/master/profiles/network-latency/tuned.conf
# https://github.com/redhat-performance/tuned/blob/master/profiles/cpu-partitioning/tuned.conf
# All values are mapped with a comment where a parent profile contains them.
# Different values will override the original values in parent profiles.
[variables]
# isolated_cores take a list of ranges; e.g. isolated_cores=2,4-7
isolated_cores=2-19
isolate_managed_irq=Y
not_isolated_cores_expanded=
[cpu]
# force_latency=cstate.id:1|3 # latency-performance (override)
governor=performance # latency-performance
energy_perf_bias=performance # latency-performance
min_perf_pct=100 # latency-performance
[service]
service.stalld=start,enable
[vm]
transparent_hugepages=never # network-latency
[irqbalance]
# Override the value set by cpu-partitioning with an empty one
banned_cpus=""
[scheduler]
group.ksoftirqd=0:f:11:*:ksoftirqd.*
group.rcuc=0:f:11:*:rcuc.*
default_irq_smp_affinity = ignore
[sysctl]
kernel.hung_task_timeout_secs = 600 # cpu-partitioning #realtime
kernel.nmi_watchdog = 0 # cpu-partitioning #realtime
kernel.sched_rt_runtime_us = -1 # realtime
kernel.timer_migration = 0 # cpu-partitioning (= 1) #realtime (= 0)
kernel.numa_balancing=0 # network-latency
net.core.busy_read=50 # network-latency
net.core.busy_poll=50 # network-latency
net.ipv4.tcp_fastopen=3 # network-latency
vm.stat_interval = 10 # cpu-partitioning #realtime
# ktune sysctl settings for rhel6 servers, maximizing i/o throughput
#
# Minimal preemption granularity for CPU-bound tasks:
# (default: 1 msec# (1 + ilog(ncpus)), units: nanoseconds)
kernel.sched_min_granularity_ns=10000000 # latency-performance
# If a workload mostly uses anonymous memory and it hits this limit, the entire
# working set is buffered for I/O, and any more write buffering would require
# swapping, so it's time to throttle writes until I/O can catch up. Workloads
# that mostly use file mappings may be able to use even higher values.
#
# The generator of dirty data starts writeback at this percentage (system default
# is 20%)
vm.dirty_ratio=10 # latency-performance
# Start background writeback (via writeback threads) at this percentage (system
# default is 10%)
vm.dirty_background_ratio=3 # latency-performance
# The swappiness parameter controls the tendency of the kernel to move
# processes out of physical memory and onto the swap disk.
# 0 tells the kernel to avoid swapping processes out of physical memory
# for as long as possible
# 100 tells the kernel to aggressively swap processes out of physical memory
# and move them to swap cache
vm.swappiness=10 # latency-performance
# The total time the scheduler will consider a migrated process
# "cache hot" and thus less likely to be re-migrated
# (system default is 500000, i.e. 0.5 ms)
kernel.sched_migration_cost_ns=5000000 # latency-performance
[selinux]
avc_cache_threshold=8192 # Custom (atomic host)
[net]
nf_conntrack_hashsize=131072 # Custom (atomic host)
[bootloader]
# set empty values to disable RHEL initrd setting in cpu-partitioning
initrd_remove_dir=
initrd_dst_img=
initrd_add_dir=
# overrides cpu-partitioning cmdline
cmdline_cpu_part=+nohz=on rcu_nocbs= tuned.non_isolcpus= intel_pstate=disable nosoftlockup
cmdline_realtime=+tsc=nowatchdog intel_iommu=on iommu=pt isolcpus=managed_irq, systemd.cpu_affinity=
cmdline_hugepages=+ default_hugepagesz=1G
cmdline_additionalArg=+
name: openshift-node-wzh-performance-profile
recommend:
- machineConfigLabels:
machineconfiguration.openshift.io/role: worker-rt
priority: 20
profile: openshift-node-wzh-performance-profile
EOF
oc create -f /data/install/tuned.yaml
# using the performance addon example profile gives very good results
cyclictest -m -p95 -d0 -a 2-17 -t 16
example config
oc get mc
NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE
00-master 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 62d
00-worker 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 62d
01-master-container-runtime 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 62d
01-master-kubelet 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 62d
01-worker-container-runtime 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 62d
01-worker-kubelet 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 62d
05-worker-kernelarg-rtran 3.1.0 62d
50-nto-worker-rt 3.1.0 58d
99-master-generated-registries 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 62d
99-master-ssh 3.1.0 62d
99-worker-generated-registries 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 62d
99-worker-realtime 62d
99-worker-rt-generated-kubelet 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 58d
99-worker-ssh 3.1.0 62d
load-sctp-module 3.1.0 6d9h
rendered-master-0629f16bcba29a60e894f3d9e14e47b9 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 62d
rendered-worker-7497d1b2e86631a4f390a6eba0aef74f 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 62d
rendered-worker-rt-1e40da418635be6c6b81ebc33a1f0640 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 62d
rendered-worker-rt-35d27df9ed0ff75a6a192700313a88f8 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 58d
rendered-worker-rt-3e87a41fe1e455977a4a972f8d4258aa 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 58d
rendered-worker-rt-4ba64193fdbace8fc101541335067ad4 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 62d
rendered-worker-rt-7497d1b2e86631a4f390a6eba0aef74f 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 62d
rendered-worker-rt-9cf8ebbc1c0cf88bb3a9716b6d66e60e 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 58d
rendered-worker-rt-bb3c16a689e7797fb4c828cec877c9ed 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 58d
rendered-worker-rt-ea53e6c4fc58b5f9f505ebed3cb32345 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 58d
rendered-worker-rt-fd13902df04099f149d7653da3552f5d 791d1cc2626d1e4e5da59f15c1a6166fd398aef8 3.1.0 6d9h
oc get mc/05-worker-kernelarg-rtran -o json | jq "del(.metadata.managedFields, .metadata.uid, .metadata.selfLink, .metadata.resourceVersion, .metadata.generation, .metadata.creationTimestamp)"
{
"apiVersion": "machineconfiguration.openshift.io/v1",
"kind": "MachineConfig",
"metadata": {
"labels": {
"machineconfiguration.openshift.io/role": "worker-rt"
},
"name": "05-worker-kernelarg-rtran"
},
"spec": {
"config": {
"ignition": {
"version": "3.1.0"
}
},
"kernelArguments": [
"no_timer_check",
"clocksource=tsc",
"tsc=perfect",
"selinux=0",
"enforcing=0",
"nmi_watchdog=0",
"softlockup_panic=0",
"isolcpus=2-19",
"nohz_full=2-19",
"idle=poll",
"default_hugepagesz=1G",
"hugepagesz=1G",
"hugepages=16",
"skew_tick=1",
"rcu_nocbs=2-19",
"kthread_cpus=0-1",
"irqaffinity=0-1",
"rcu_nocb_poll",
"iommu=pt",
"intel_iommu=on"
]
}
}
oc get mc/50-nto-worker-rt -o json | jq "del(.metadata.managedFields, .metadata.uid, .metadata.selfLink, .metadata.resourceVersion, .metadata.generation, .metadata.creationTimestamp)"
{
"apiVersion": "machineconfiguration.openshift.io/v1",
"kind": "MachineConfig",
"metadata": {
"annotations": {
"tuned.openshift.io/generated-by-controller-version": "v4.6.0-202104221811.p0-0-gfdb7aec-dirty"
},
"labels": {
"machineconfiguration.openshift.io/role": "worker-rt"
},
"name": "50-nto-worker-rt"
},
"spec": {
"config": {
"ignition": {
"config": {
"replace": {
"verification": {}
}
},
"proxy": {},
"security": {
"tls": {}
},
"timeouts": {},
"version": "3.1.0"
},
"passwd": {},
"storage": {},
"systemd": {}
},
"extensions": null,
"fips": false,
"kernelArguments": [
"skew_tick=1",
"isolcpus=managed_irq,domain,2-19",
"intel_pstate=disable",
"nosoftlockup",
"tsc=nowatchdog"
],
"kernelType": "",
"osImageURL": ""
}
}
oc get mc/99-worker-realtime -o json | jq "del(.metadata.managedFields, .metadata.uid, .metadata.selfLink, .metadata.resourceVersion, .metadata.generation, .metadata.creationTimestamp)"
{
"apiVersion": "machineconfiguration.openshift.io/v1",
"kind": "MachineConfig",
"metadata": {
"labels": {
"machineconfiguration.openshift.io/role": "worker-rt"
},
"name": "99-worker-realtime"
},
"spec": {
"kernelType": "realtime"
}
}
oc get mc/load-sctp-module -o json | jq "del(.metadata.managedFields, .metadata.uid, .metadata.selfLink, .metadata.resourceVersion, .metadata.generation, .metadata.creationTimestamp)"
{
"apiVersion": "machineconfiguration.openshift.io/v1",
"kind": "MachineConfig",
"metadata": {
"labels": {
"machineconfiguration.openshift.io/role": "worker-rt"
},
"name": "load-sctp-module"
},
"spec": {
"config": {
"ignition": {
"version": "3.1.0"
},
"storage": {
"files": [
{
"contents": {
"source": "data:,"
},
"mode": 420,
"overwrite": true,
"path": "/etc/modprobe.d/sctp-blacklist.conf"
},
{
"contents": {
"source": "data:,sctp"
},
"mode": 420,
"overwrite": true,
"path": "/etc/modules-load.d/sctp-load.conf"
}
]
}
}
}
}
oc get Tuned -n openshift-cluster-node-tuning-operator
NAME AGE
default 62d
rendered 62d
wzh-realtime 58d
oc get Tuned/wzh-realtime -n openshift-cluster-node-tuning-operator -o json | jq "del(.metadata.managedFields, .metadata.uid, .metadata.selfLink, .metadata.resourceVersion, .metadata.generation, .metadata.creationTimestamp)"
{
"apiVersion": "tuned.openshift.io/v1",
"kind": "Tuned",
"metadata": {
"name": "wzh-realtime",
"namespace": "openshift-cluster-node-tuning-operator"
},
"spec": {
"profile": [
{
"data": "[main]\nsummary=wzh version for realtime, 5G RAN\ninclude=openshift-node,realtime\n\n# Inheritance of base profiles legend:\n# cpu-partitioning -> network-latency -> latency-performance\n# https://github.com/redhat-performance/tuned/blob/master/profiles/latency-performance/tuned.conf\n# https://github.com/redhat-performance/tuned/blob/master/profiles/network-latency/tuned.conf\n# https://github.com/redhat-performance/tuned/blob/master/profiles/cpu-partitioning/tuned.conf\n\n# All values are mapped with a comment where a parent profile contains them.\n# Different values will override the original values in parent profiles.\n\n[variables]\n# isolated_cores take a list of ranges; e.g. isolated_cores=2,4-7\n\nisolated_cores=2-19\nisolate_managed_irq=Y\n",
"name": "wzh-realtime"
}
],
"recommend": [
{
"machineConfigLabels": {
"machineconfiguration.openshift.io/role": "worker-rt"
},
"priority": 20,
"profile": "wzh-realtime"
}
]
}
}
oc get Tuned/wzh-realtime -n openshift-cluster-node-tuning-operator -o json | jq ".spec.profile[0].data" | jq -r
[main]
summary=wzh version for realtime, 5G RAN
include=openshift-node,realtime
# Inheritance of base profiles legend:
# cpu-partitioning -> network-latency -> latency-performance
# https://github.com/redhat-performance/tuned/blob/master/profiles/latency-performance/tuned.conf
# https://github.com/redhat-performance/tuned/blob/master/profiles/network-latency/tuned.conf
# https://github.com/redhat-performance/tuned/blob/master/profiles/cpu-partitioning/tuned.conf
# All values are mapped with a comment where a parent profile contains them.
# Different values will override the original values in parent profiles.
[variables]
# isolated_cores take a list of ranges; e.g. isolated_cores=2,4-7
isolated_cores=2-19
isolate_managed_irq=Y
oc get deployment.apps/du-deployment1 -o json | jq "del(.metadata.managedFields, .metadata.uid, .metadata.selfLink, .metadata.resourceVersion, .metadata.generation, .metadata.creationTimestamp)"
{
"apiVersion": "apps/v1",
"kind": "Deployment",
"metadata": {
"annotations": {
"deployment.kubernetes.io/revision": "1",
"kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"apps/v1\",\"kind\":\"Deployment\",\"metadata\":{\"annotations\":{},\"labels\":{\"app\":\"du-deployment1\"},\"name\":\"du-deployment1\",\"namespace\":\"default\"},\"spec\":{\"replicas\":1,\"selector\":{\"matchLabels\":{\"app\":\"du-pod1\"}},\"template\":{\"metadata\":{\"annotations\":{\"k8s.v1.cni.cncf.io/networks\":\"[ { \\\"name\\\": \\\"host-device-du\\\", \\\"interface\\\": \\\"veth11\\\" } ]\"},\"labels\":{\"app\":\"du-pod1\"}},\"spec\":{\"containers\":[{\"command\":[\"sleep\",\"infinity\"],\"env\":[{\"name\":\"duNetProviderDriver\",\"value\":\"host-netdevice\"}],\"image\":\"registry.ocp4.redhat.ren:5443/ocp4/du:v1-wzh\",\"imagePullPolicy\":\"IfNotPresent\",\"name\":\"du-container1\",\"resources\":{\"limits\":{\"cpu\":16,\"hugepages-1Gi\":\"8Gi\",\"memory\":\"48Gi\"},\"requests\":{\"cpu\":16,\"hugepages-1Gi\":\"8Gi\",\"memory\":\"48Gi\"}},\"securityContext\":{\"capabilities\":{\"add\":[\"CAP_SYS_ADMIN\"]},\"privileged\":true},\"stdin\":true,\"tty\":true,\"volumeMounts\":[{\"mountPath\":\"/hugepages\",\"name\":\"hugepage\"},{\"mountPath\":\"/lib/modules\",\"name\":\"lib-modules\"},{\"mountPath\":\"/usr/src\",\"name\":\"src\"},{\"mountPath\":\"/dev\",\"name\":\"dev\"},{\"mountPath\":\"/dev/shm\",\"name\":\"cache-volume\"}]}],\"nodeSelector\":{\"node-role.kubernetes.io/worker-rt\":\"\"},\"volumes\":[{\"emptyDir\":{\"medium\":\"HugePages\"},\"name\":\"hugepage\"},{\"hostPath\":{\"path\":\"/lib/modules\"},\"name\":\"lib-modules\"},{\"hostPath\":{\"path\":\"/usr/src\"},\"name\":\"src\"},{\"hostPath\":{\"path\":\"/dev\"},\"name\":\"dev\"},{\"emptyDir\":{\"medium\":\"Memory\",\"sizeLimit\":\"16Gi\"},\"name\":\"cache-volume\"}]}}}}\n"
},
"labels": {
"app": "du-deployment1"
},
"name": "du-deployment1",
"namespace": "default"
},
"spec": {
"progressDeadlineSeconds": 600,
"replicas": 1,
"revisionHistoryLimit": 10,
"selector": {
"matchLabels": {
"app": "du-pod1"
}
},
"strategy": {
"rollingUpdate": {
"maxSurge": "25%",
"maxUnavailable": "25%"
},
"type": "RollingUpdate"
},
"template": {
"metadata": {
"annotations": {
"k8s.v1.cni.cncf.io/networks": "[ { \"name\": \"host-device-du\", \"interface\": \"veth11\" } ]"
},
"creationTimestamp": null,
"labels": {
"app": "du-pod1"
}
},
"spec": {
"containers": [
{
"command": [
"sleep",
"infinity"
],
"env": [
{
"name": "duNetProviderDriver",
"value": "host-netdevice"
}
],
"image": "registry.ocp4.redhat.ren:5443/ocp4/du:v1-wzh",
"imagePullPolicy": "IfNotPresent",
"name": "du-container1",
"resources": {
"limits": {
"cpu": "16",
"hugepages-1Gi": "8Gi",
"memory": "48Gi"
},
"requests": {
"cpu": "16",
"hugepages-1Gi": "8Gi",
"memory": "48Gi"
}
},
"securityContext": {
"capabilities": {
"add": [
"CAP_SYS_ADMIN"
]
},
"privileged": true
},
"stdin": true,
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"tty": true,
"volumeMounts": [
{
"mountPath": "/hugepages",
"name": "hugepage"
},
{
"mountPath": "/lib/modules",
"name": "lib-modules"
},
{
"mountPath": "/usr/src",
"name": "src"
},
{
"mountPath": "/dev",
"name": "dev"
},
{
"mountPath": "/dev/shm",
"name": "cache-volume"
}
]
}
],
"dnsPolicy": "ClusterFirst",
"nodeSelector": {
"node-role.kubernetes.io/worker-rt": ""
},
"restartPolicy": "Always",
"schedulerName": "default-scheduler",
"securityContext": {},
"terminationGracePeriodSeconds": 30,
"volumes": [
{
"emptyDir": {
"medium": "HugePages"
},
"name": "hugepage"
},
{
"hostPath": {
"path": "/lib/modules",
"type": ""
},
"name": "lib-modules"
},
{
"hostPath": {
"path": "/usr/src",
"type": ""
},
"name": "src"
},
{
"hostPath": {
"path": "/dev",
"type": ""
},
"name": "dev"
},
{
"emptyDir": {
"medium": "Memory",
"sizeLimit": "16Gi"
},
"name": "cache-volume"
}
]
}
}
},
"status": {
"availableReplicas": 1,
"conditions": [
{
"lastTransitionTime": "2021-07-21T06:21:57Z",
"lastUpdateTime": "2021-07-21T06:23:05Z",
"message": "ReplicaSet \"du-deployment1-d5dc9854d\" has successfully progressed.",
"reason": "NewReplicaSetAvailable",
"status": "True",
"type": "Progressing"
},
{
"lastTransitionTime": "2021-07-21T11:07:55Z",
"lastUpdateTime": "2021-07-21T11:07:55Z",
"message": "Deployment has minimum availability.",
"reason": "MinimumReplicasAvailable",
"status": "True",
"type": "Available"
}
],
"observedGeneration": 7,
"readyReplicas": 1,
"replicas": 1,
"updatedReplicas": 1
}
}
oc get net-attach-def
# NAME AGE
# host-device-du 6h32m
# macvlan-conf 23d
oc get net-attach-def/host-device-du -o json | jq "del(.metadata.managedFields, .metadata.uid, .metadata.selfLink, .metadata.resourceVersion, .metadata.generation, .metadata.creationTimestamp)"
{
"apiVersion": "k8s.cni.cncf.io/v1",
"kind": "NetworkAttachmentDefinition",
"metadata": {
"annotations": {
"kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"k8s.cni.cncf.io/v1\",\"kind\":\"NetworkAttachmentDefinition\",\"metadata\":{\"annotations\":{},\"name\":\"host-device-du\",\"namespace\":\"default\"},\"spec\":{\"config\":\"{ \\\"cniVersion\\\": \\\"0.3.0\\\", \\\"type\\\": \\\"host-device\\\", \\\"device\\\": \\\"ens81f1np1\\\", \\\"ipam\\\": { \\\"type\\\": \\\"host-local\\\", \\\"subnet\\\": \\\"192.168.12.0/24\\\", \\\"rangeStart\\\": \\\"192.168.12.105\\\", \\\"rangeEnd\\\": \\\"192.168.12.105\\\", \\\"routes\\\": [{ \\\"dst\\\": \\\"0.0.0.0/0\\\" }], \\\"gateway\\\": \\\"192.168.12.1\\\" } }\"}}\n"
},
"name": "host-device-du",
"namespace": "default"
},
"spec": {
"config": "{ \"cniVersion\": \"0.3.0\", \"type\": \"host-device\", \"device\": \"ens18f1\", \"ipam\": { \"type\": \"host-local\", \"subnet\": \"192.168.12.0/24\", \"rangeStart\": \"192.168.12.105\", \"rangeEnd\": \"192.168.12.105\", \"routes\": [{ \"dst\": \"0.0.0.0/0\" }], \"gateway\": \"192.168.12.1\" } }"
}
}
oc get net-attach-def/host-device-du -o json | jq "del(.metadata.managedFields, .metadata.uid, .metadata.selfLink, .metadata.resourceVersion, .metadata.generation, .metadata.creationTimestamp)" | jq .spec.config | jq "fromjson"
{
"cniVersion": "0.3.0",
"type": "host-device",
"device": "ens18f1",
"ipam": {
"type": "host-local",
"subnet": "192.168.12.0/24",
"rangeStart": "192.168.12.105",
"rangeEnd": "192.168.12.105",
"routes": [
{
"dst": "0.0.0.0/0"
}
],
"gateway": "192.168.12.1"
}
}
Injecting a kernel module (kmod / driver) from a container into the host
The biggest use case for injecting a kmod/driver from a container into the host is installing GPU and DPU drivers on a container platform; NVIDIA's GPU driver (the nvidia gpu operator) is delivered exactly this way, injected into the host from a container.
Another major use case is security platforms such as RHACS/StackRox, which inject a kernel module into the host for system monitoring.
Video walkthrough:
First, test a standalone version with podman
# on a centos8 to test the driver build
# https://blog.sourcerer.io/writing-a-simple-linux-kernel-module-d9dc3762c234
yum install -y epel-release
yum update -y
yum install -y byobu podman buildah
mkdir -p /data/kmod
cd /data/kmod
podman run -it --rm quay.io/generic/centos8 bash
# the following is typed and run inside the container
dnf update -y
dnf install -y make gcc wget perl createrepo kernel-core-$(uname -r) kernel-devel-$(uname -r) pciutils python36-devel ethtool lsof elfutils-libelf-devel rpm-build kernel-rpm-macros python36 tk numactl-libs libmnl tcl binutils kmod procps git autoconf automake libtool hostname
mkdir -p ~/src/lkm_example
cd ~/src/lkm_example
cat << 'EOF' > lkm_example.c
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Wandering Star");
MODULE_DESCRIPTION("A simple example Linux module.");
MODULE_VERSION("0.01");
static int __init lkm_example_init(void) {
printk(KERN_INFO "Hello, World, Wandering Star!\n");
return 0;
}
static void __exit lkm_example_exit(void) {
printk(KERN_INFO "Goodbye, World, Wandering Star!\n");
}
module_init(lkm_example_init);
module_exit(lkm_example_exit);
EOF
cat << EOF > Makefile
obj-m += lkm_example.o
all:
make -C /lib/modules/$(uname -r)/build M=$(pwd) modules
clean:
make -C/lib/modules/$(uname -r)/build M=$(pwd) clean
EOF
sed -i 's/^ /\t/g' Makefile
make
insmod lkm_example.ko
# insmod: ERROR: could not insert module lkm_example.ko: Operation not permitted
# PoC again, this time with --privileged
podman run -it --rm --privileged quay.io/generic/centos8 bash
# do the same above again
# yum install ..............
# ........
# make
insmod lkm_example.ko
# go to host
dmesg | grep Wandering
# [ 5197.673179] Hello, World, Wandering Star!
lsmod | grep example
# lkm_example 16384 0
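To clean up after the test, the module can be unloaded again; a minimal sketch (run on the host, or inside the same privileged container, since the module lives in the host kernel):
# unload the demo module and check the exit message
rmmod lkm_example
dmesg | tail -n 3
# expect to see: Goodbye, World, Wandering Star!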
try the demo on openshift4
first, we prepare the rpm repo for offline use
- https://www.openshift.com/blog/how-to-use-entitled-image-builds-to-build-drivercontainers-with-ubi-on-openshift
# on a vultr host, centos7
mkdir -p /data/rhel8/entitle
cd /data/rhel8/entitle
# goto https://access.redhat.com/management/subscriptions
# search employee sku, find a system, go into, and download from subscription
# or goto: https://access.redhat.com/management/systems/4d1e4cc0-2c99-4431-99ce-2f589a24ea11/subscriptions
yum install -y unzip
unzip *
unzip consumer_export.zip
find . -name *.pem -exec cp {} ./ \;
# podman run -ti --mount type=bind,source=/data/rhel8/entitle/$(ls *.pem | sed -n '2p'),target=/etc/pki/entitlement/entitlement.pem --mount type=bind,source=/data/rhel8/entitle/$(ls *.pem | sed -n '2p'),target=/etc/pki/entitlement/entitlement-key.pem registry.access.redhat.com/ubi8:latest bash -c "dnf search kernel-devel --showduplicates"
mkdir -p /data/rhel8/dnf
podman run -it --rm -v /data/rhel8/dnf:/data/dnf:z \
--mount type=bind,source=$(ls /data/rhel8/entitle/*.pem | sed -n '2p'),target=/etc/pki/entitlement/entitlement.pem \
--mount type=bind,source=$(ls /data/rhel8/entitle/*.pem | sed -n '2p'),target=/etc/pki/entitlement/entitlement-key.pem \
registry.access.redhat.com/ubi8:8.3 bash
cd /data/dnf
# dnf -y --enablerepo=rhel-8-for-x86_64-baseos-rpms --releasever=8.3 install make gcc wget perl createrepo pciutils python36-devel ethtool lsof elfutils-libelf-devel rpm-build kernel-rpm-macros python36 tk numactl-libs libmnl tcl binutils kmod procps git autoconf automake libtool hostname kernel-core-$(uname -r) kernel-devel-$(uname -r)
dnf -y --enablerepo=rhel-8-for-x86_64-baseos-rpms --releasever=8.3 install createrepo
dnf -y download --resolve --alldeps --releasever=8.3 \
make gcc wget perl createrepo pciutils python36-devel ethtool lsof elfutils-libelf-devel rpm-build kernel-rpm-macros python36 tk numactl-libs libmnl tcl binutils kmod procps git autoconf automake libtool hostname kernel-core-4.18.0-240.22.1.el8_3.x86_64 kernel-devel-4.18.0-240.22.1.el8_3.x86_64
dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
# dnf install -y https://kojipkgs.fedoraproject.org//packages/modulemd-tools/0.9/1.fc32/noarch/modulemd-tools-0.9-1.fc32.noarch.rpm
# https://copr.fedorainfracloud.org/coprs/frostyx/modulemd-tools/
dnf copr enable -y frostyx/modulemd-tools
dnf install -y modulemd-tools
createrepo ./
repo2module . \
--module-name foo \
--module-stream devel \
--module-version 123 \
--module-context f32
createrepo_mod .
# back to host
cd /data/rhel8
tar zcvf dnf.tgz dnf/
# upload dnf.tgz to helper /var/www/html/
# on helper
cd /var/www/html/
tar zvxf dnf.tgz
We will use an entrypoint script; the entrypoint script file is located here.
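The real entrypoint script is the one linked above and is not reproduced here; as a rough idea only, a kmod.entrypoint.sh for this demo might look like the sketch below. The repo URL, the module source location and the package list are assumptions, not taken from the original script.
#!/bin/bash
set -e
# point dnf at the offline repo prepared on the helper (assumed URL)
cat << EOF > /etc/yum.repos.d/wzh.repo
[wzh]
name=wzh
baseurl=http://helper.example.com/dnf
enabled=1
gpgcheck=0
EOF
# toolchain plus kernel headers matching the host kernel of the node we run on
dnf install -y make gcc kmod kernel-core-$(uname -r) kernel-devel-$(uname -r)
# build and load the demo module (assumed to be baked into the image under /root/src)
cd /root/src/lkm_example
make
insmod lkm_example.ko || true   # ignore the error if the module is already loaded
# keep the pod running
sleep infinity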
# on helper
mkdir -p /data/kmod
cd /data/kmod
cat << EOF > /data/kmod/Dockerfile
FROM registry.access.redhat.com/ubi8
WORKDIR /
COPY kmod.entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
EOF
buildah bud -t quay.io/wangzheng422/qimgs:kmod-demo.02 -f Dockerfile .
buildah push quay.io/wangzheng422/qimgs:kmod-demo.02
cd /data/install
cat << EOF > kmod-pod.yaml
apiVersion: v1
kind: Pod
metadata:
name: kmod-example
spec:
nodeSelector:
kubernetes.io/hostname: 'master-2'
restartPolicy: Never
containers:
- securityContext:
privileged: true
image: quay.io/wangzheng422/qimgs:kmod-demo.02
imagePullPolicy: Always
name: kmod-example
EOF
oc create -n demo -f kmod-pod.yaml
# to restore
oc delete -n demo -f kmod-pod.yaml
# login to master-2
ssh core@master-2
lsmod | grep example
# lkm_example 16384 0
dmesg | grep Wandering
# [40933.691925] Hello, World, Wandering Star!
RHACS/StackRox use case
We have completed the kernel module injection, but to make this kind of software work properly we usually also need to mount directories such as /sys and /dev into the container; below is a look at how RHACS/StackRox mounts them.
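The original mount list is not reproduced here; as a hedged illustration only, the collector-style pod spec of such a product typically mounts host directories roughly like this (names, image and paths are illustrative, not copied from the RHACS manifests):
spec:
  containers:
  - name: collector
    image: example.com/collector:latest   # placeholder image
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /host/sys
      name: sys-ro
      readOnly: true
    - mountPath: /host/dev
      name: dev-ro
      readOnly: true
  volumes:
  - name: sys-ro
    hostPath:
      path: /sys
  - name: dev-ro
    hostPath:
      path: /dev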
others
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
cat << EOF > /etc/yum.repos.d/remote.repo
[remote]
name=RHEL-Mirror
baseurl=http://v.redhat.ren:8080/
enabled=1
gpgcheck=0
EOF
GPU/vGPU sharing on openshift4
More and more AI/ML workloads run on openshift/k8s clusters, and most of them need GPU support. In the official NVIDIA/k8s device plugin, however, GPUs are scheduled one whole card at a time, which leads to wasted GPU resources at the k8s scheduling level.
Fortunately the community offers several solutions. Aliyun's, for example, is relatively simple, with correspondingly simple functionality. This article tries to run Aliyun's GPU sharing solution on openshift4.
Because Aliyun-style solutions are mostly built on nvidia-docker while openshift4 uses cri-o, some customization is required.
Due to time constraints, this article only gets the solution roughly working; a polished setup needs more customization, which is left to future project work.
Notes
- This is a scheduling-level sharing solution, not a sharing-with-isolation solution.
todo
- Verify in a real multi-GPU environment.
- Harden the scheduler extender's security.
Video walkthrough
Deploy and run the scheduler extender
Aliyun-style solutions all extend the k8s scheduler to add the missing behavior; in recent openshift4 versions this scheduler extension can be enabled through configuration.
cd /data/install
cat << EOF > ./policy.cfg
{
"kind" : "Policy",
"apiVersion" : "v1",
"predicates" : [
{"name" : "MaxGCEPDVolumeCount"},
{"name" : "GeneralPredicates"},
{"name" : "MaxAzureDiskVolumeCount"},
{"name" : "MaxCSIVolumeCountPred"},
{"name" : "CheckVolumeBinding"},
{"name" : "MaxEBSVolumeCount"},
{"name" : "MatchInterPodAffinity"},
{"name" : "CheckNodeUnschedulable"},
{"name" : "NoDiskConflict"},
{"name" : "NoVolumeZoneConflict"},
{"name" : "PodToleratesNodeTaints"}
],
"priorities" : [
{"name" : "LeastRequestedPriority", "weight" : 1},
{"name" : "BalancedResourceAllocation", "weight" : 1},
{"name" : "ServiceSpreadingPriority", "weight" : 1},
{"name" : "NodePreferAvoidPodsPriority", "weight" : 1},
{"name" : "NodeAffinityPriority", "weight" : 1},
{"name" : "TaintTolerationPriority", "weight" : 1},
{"name" : "ImageLocalityPriority", "weight" : 1},
{"name" : "SelectorSpreadPriority", "weight" : 1},
{"name" : "InterPodAffinityPriority", "weight" : 1},
{"name" : "EqualPriority", "weight" : 1}
],
"extenders": [
{
"urlPrefix": "http://127.0.0.1:32766/gpushare-scheduler",
"filterVerb": "filter",
"bindVerb": "bind",
"enableHttps": false,
"nodeCacheCapable": true,
"managedResources": [
{
"name": "aliyun.com/gpu-mem",
"ignoredByScheduler": false
}
],
"ignorable": false
}
]
}
EOF
oc delete configmap -n openshift-config scheduler-policy
oc create configmap -n openshift-config --from-file=policy.cfg scheduler-policy
oc patch Scheduler cluster --type=merge -p '{"spec":{"policy":{"name":"scheduler-policy"}}}'
Then we can deploy the scheduler extender.
curl -O https://raw.githubusercontent.com/AliyunContainerService/gpushare-scheduler-extender/master/config/gpushare-schd-extender.yaml
# replace docker image
cd /data/install
sed -i 's/image:.*/image: quay.io\/wangzheng422\/qimgs:gpushare-scheduler-extender-2021-02-26-1339/' gpushare-schd-extender.yaml
oc delete -f gpushare-schd-extender.yaml
oc create -f gpushare-schd-extender.yaml
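A quick, hedged check that the extender came up (the upstream yaml deploys it into the kube-system namespace; adjust the namespace and deployment name if you changed them):
oc get deploy -n kube-system | grep gpushare
oc logs -n kube-system deploy/gpushare-schd-extender --tail=20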
Add a catalog source to the operator hub
We customized the nvidia gpu-operator, so we need to add our new operator to the operator hub.
#
cat << EOF > /data/ocp4/my-catalog.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: wzh-operator-catalog
namespace: openshift-marketplace
spec:
displayName: WZH Operator Catalog
image: 'quay.io/wangzheng422/qimgs:registry-wzh-index.2021-02-28-1446'
publisher: WZH
sourceType: grpc
EOF
oc create -f /data/ocp4/my-catalog.yaml
oc delete -f /data/ocp4/my-catalog.yaml
At this point we can find two gpu-operators in the operator hub.
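A hedged way to confirm that the catalog source is serving and that the gpu operators show up:
oc get catalogsource -n openshift-marketplace
oc get packagemanifests -n openshift-marketplace | grep -i gpu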
Install the gpu-operator and configure ClusterPolicies
Click install on the nvidia & wzh one.
After the install succeeds, create the project gpu-operator-resources.
Then, in the project gpu-operator-resources, create a ClusterPolicy for the gpu-operator using the template below. It relies on an offline install source; refer to here for how to prepare it.
apiVersion: nvidia.com/v1
kind: ClusterPolicy
metadata:
name: gpu-cluster-policy
spec:
dcgmExporter:
nodeSelector: {}
imagePullSecrets: []
resources: {}
affinity: {}
podSecurityContext: {}
repository: nvcr.io/nvidia/k8s
securityContext: {}
version: 'sha256:85016e39f73749ef9769a083ceb849cae80c31c5a7f22485b3ba4aa590ec7b88'
image: dcgm-exporter
tolerations: []
devicePlugin:
nodeSelector: {}
imagePullSecrets: []
resources: {}
affinity: {}
podSecurityContext: {}
repository: quay.io/wangzheng422
securityContext: {}
version: gpu-aliyun-device-plugin-2021-02-24-1346
image: qimgs
tolerations: []
args:
- 'gpushare-device-plugin-v2'
- '-logtostderr'
- '--v=5'
env:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
driver:
nodeSelector: {}
imagePullSecrets: []
resources: {}
affinity: {}
podSecurityContext: {}
repository: nvcr.io/nvidia
securityContext: {}
repoConfig:
configMapName: repo-config
destinationDir: /etc/yum.repos.d
version: 'sha256:324e9dc265dec320207206aa94226b0c8735fd93ce19b36a415478c95826d934'
image: driver
tolerations: []
gfd:
nodeSelector: {}
imagePullSecrets: []
resources: {}
affinity: {}
podSecurityContext: {}
repository: nvcr.io/nvidia
securityContext: {}
version: 'sha256:8d068b7b2e3c0b00061bbff07f4207bd49be7d5bfbff51fdf247bc91e3f27a14'
image: gpu-feature-discovery
tolerations: []
migStrategy: single
sleepInterval: 60s
operator:
defaultRuntime: crio
validator:
image: cuda-sample
imagePullSecrets: []
repository: nvcr.io/nvidia/k8s
version: 'sha256:2a30fe7e23067bc2c3f8f62a6867702a016af2b80b9f6ce861f3fea4dfd85bc2'
deployGFD: true
toolkit:
nodeSelector: {}
imagePullSecrets: []
resources: {}
affinity: {}
podSecurityContext: {}
repository: nvcr.io/nvidia/k8s
securityContext: {}
version: 'sha256:81295a9eca36cbe5d94b80732210b8dc7276c6ef08d5a60d12e50479b9e542cd'
image: container-toolkit
tolerations: []
At this point the gpu-operator is installed. You will notice that the device-plugin validator does not run: that is because we customized the scheduler and nvidia.com/gpu has been replaced by aliyun.com/gpu-mem. Fixing this cleanly needs further customization, but the system already behaves as expected, so we leave that to future project work.
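A hedged check that the customized device plugin is advertising the new extended resource on the GPU node (the node name is an example):
# nvidia.com/gpu should be gone, aliyun.com/gpu-mem should be listed instead
oc describe node worker-0 | grep -A 10 Allocatable
oc get node worker-0 -o json | jq '.status.allocatable'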
Test it
Let's actually test the effect.
cat << EOF > /data/ocp4/gpu.test.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo1
labels:
app: demo1
spec:
replicas: 1
selector:
matchLabels:
app: demo1
template:
metadata:
labels:
app: demo1
spec:
# nodeSelector:
# kubernetes.io/hostname: 'worker-0'
restartPolicy: Always
containers:
- name: demo1
image: "docker.io/wangzheng422/imgs:tensorrt-ljj-2021-01-21-1151"
env:
- name: NVIDIA_VISIBLE_DEVICES
valueFrom:
fieldRef:
fieldPath: metadata.annotations['ALIYUN_COM_GPU_MEM_IDX']
resources:
limits:
# GiB
aliyun.com/gpu-mem: 3
EOF
oc create -n demo -f /data/ocp4/gpu.test.yaml
Enter the test container and check the environment variables; you can see that NVIDIA_VISIBLE_DEVICES is set automatically.
Look at the scheduler extender logs; you can see the scheduler trying to add an annotation to the pod.
Look at the device-plugin logs; you can see the device-plugin comparing memory and picking a GPU device.
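A hedged set of commands for the three checks above; the deployment and daemonset names are assumptions and may differ in your gpu-operator build:
# env var injected into the test pod
POD=$(oc get pod -n demo -l app=demo1 -o name | head -n 1)
oc exec -n demo ${POD} -- env | grep NVIDIA_VISIBLE_DEVICES
# scheduler extender log: look for the filter/bind messages
oc logs -n kube-system deploy/gpushare-schd-extender --tail=50
# device plugin log on the GPU node
oc logs -n gpu-operator-resources ds/nvidia-device-plugin-daemonset --tail=50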
headless service with router
This article looks at how the k8s/ocp ingress handles a headless service, and how that differs from a normal service.
Demo video
The conclusion: headless or not, openshift resolves the final pod IPs and writes them into the ingress/router/haproxy configuration, so traffic is sent directly to the pod IPs. A hedged verification is shown after the deployment script below.
# here is the demo environment deployment script
cat << EOF > headless.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: slb-001
spec:
replicas: 1
selector:
matchLabels:
pod: slb-001
template:
metadata:
labels:
pod: slb-001
spec:
restartPolicy: Always
containers:
- name: slb-001-pg
image: registry.redhat.ren:5443/docker.io/etherpad/etherpad:latest
imagePullPolicy: IfNotPresent
---
apiVersion: v1
kind: Service
metadata:
name: slb-001-service
spec:
selector:
pod: slb-001
ports:
- port: 9001
protocol: TCP
targetPort: 9001
---
apiVersion: v1
kind: Service
metadata:
name: slb-002-service
spec:
selector:
pod: slb-001
clusterIP: None
ports:
- port: 9001
protocol: TCP
targetPort: 9001
---
kind: Route
apiVersion: route.openshift.io/v1
metadata:
name: slb-001
spec:
to:
kind: Service
name: slb-001-service
port:
targetPort: 9001
---
kind: Route
apiVersion: route.openshift.io/v1
metadata:
name: slb-002
spec:
to:
kind: Service
name: slb-002-service
port:
targetPort: 9001
EOF
oc apply -n demo -f headless.yaml
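A hedged way to verify the conclusion above: look at the generated haproxy backends inside a router pod and compare them with the pod IP (pod selection and config path follow the usual OpenShift router layout):
oc get pod -n demo -l pod=slb-001 -o wide      # note the pod IP
ROUTER=$(oc get pod -n openshift-ingress -o name | head -n 1)
oc exec -n openshift-ingress ${ROUTER} -- grep -A 6 'slb-00' /var/lib/haproxy/conf/haproxy.config
# both the slb-001 and slb-002 backends should point at the pod IP, not at a service IP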
volume tests
this is for single node cluster:
https://docs.openshift.com/container-platform/4.3/storage/persistent_storage/persistent-storage-hostpath.html
local volume
https://docs.openshift.com/container-platform/4.3/storage/persistent_storage/persistent-storage-local.html
Local volume has a gotcha: it mounts devices, not directories on the node. So to use it you first have to carve the node's storage into a number of devices with LVM and then mount them as local volumes (see the sketch below)... which is clumsy. A commercial cloud-native storage solution is the better choice.
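A minimal hedged sketch of carving a node volume group into several LVs for the local storage operator (the VG name and sizes are examples):
# on the node, assuming a volume group named datavg already exists
lvcreate -L 100G -n lv01 datavg
lvcreate -L 100G -n lv02 datavg
ls -l /dev/datavg/lv01 /dev/datavg/lv02   # these device paths go into storageClassDevices.devicePaths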
# on worker-0
mkdir -p /data/demo
# on helper
oc project demo
oc get sa
oc create serviceaccount -n demo demo-app
oc adm policy add-scc-to-user privileged -z demo-app
local volume block share
When a local volume is used in block device mode, it can be shared by pods on the same node.
video
- https://youtu.be/P33sxtR57u8
- https://www.ixigua.com/i6841022539582407180/
- https://www.bilibili.com/video/BV115411W7FV/
# on infra0 create a lv
lvcreate --type raid0 -L 40G --stripes 12 -n sharelv datavg
apiVersion: "local.storage.openshift.io/v1"
kind: "LocalVolume"
metadata:
name: "local-share-block-disks"
namespace: "local-storage"
spec:
nodeSelector:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- infra0.hsc.redhat.ren
storageClassDevices:
- storageClassName: "local-share-block-sc"
volumeMode: Block
devicePaths:
- /dev/datavg/sharelv
cat << EOF > storage.yaml
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: localpvc
spec:
accessModes:
- ReadWriteOnce
volumeMode: Block
resources:
requests:
storage: 40Gi
storageClassName: local-share-block-sc
EOF
oc apply -n demo -f storage.yaml
cat << EOF > demo1.yaml
---
kind: Pod
apiVersion: v1
metadata:
annotations:
name: demo1
namespace: demo
spec:
nodeSelector:
kubernetes.io/hostname: 'infra0.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: [ "/bin/bash", "-c", "--" ]
args: [ "while true; do sleep 300000; done;" ]
imagePullPolicy: Always
securityContext:
privileged: true
volumeDevices:
- devicePath: /mnt/block
name: demo
serviceAccount: demo-app
volumes:
- name: demo
persistentVolumeClaim:
claimName: localpvc
---
kind: Pod
apiVersion: v1
metadata:
annotations:
name: demo2
namespace: demo
spec:
nodeSelector:
kubernetes.io/hostname: 'infra0.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: [ "/bin/bash", "-c", "--" ]
args: [ "while true; do sleep 300000; done;" ]
imagePullPolicy: Always
securityContext:
privileged: true
volumeDevices:
- devicePath: /mnt/block
name: demo
serviceAccount: demo-app
volumes:
- name: demo
persistentVolumeClaim:
claimName: localpvc
EOF
oc apply -f demo1.yaml
# write to the block device
oc exec -it -n demo demo1 -- bash -c "echo 'test 1' > /mnt/block"
oc exec -it -n demo demo1 -- head -n 1 /mnt/block
oc exec -it -n demo demo2 -- head -n 2 /mnt/block
oc delete -f demo1.yaml
local volume fs
cat << EOF > demo1.yaml
---
apiVersion: "local.storage.openshift.io/v1"
kind: "LocalVolume"
metadata:
name: "local-disks"
namespace: "local-storage"
spec:
nodeSelector:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- worker-0
storageClassDevices:
- storageClassName: "local-sc"
volumeMode: Filesystem
fsType: xfs
devicePaths:
- /data/lv01
- /data/lv02
EOF
oc apply -f demo1.yaml
oc delete -f demo1.yaml
oc get all -n local-storage
oc get pv
cat << EOF > demo1.yaml
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: local-pvc-name
spec:
accessModes:
- ReadWriteOnce
volumeMode: Filesystem
resources:
requests:
storage: 100Gi
storageClassName: local-sc
---
kind: Pod
apiVersion: v1
metadata:
annotations:
name: demo1
namespace: demo
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: [ "/bin/bash", "-c", "--" ]
args: [ "while true; do sleep 300000; done;" ]
imagePullPolicy: Always
securityContext:
privileged: true
volumeMounts:
- mountPath: /data
name: demo
readOnly: false
serviceAccount: demo-app
volumes:
- name: demo
persistentVolumeClaim:
claimName: local-pvc-name
EOF
oc apply -f demo1.yaml
demo for hostpath
https://docs.openshift.com/container-platform/4.3/storage/persistent_storage/persistent-storage-local.html
video
- https://www.bilibili.com/video/BV1MV411Z7ZK/
- https://youtu.be/Dzq-xZW3O5E
oc project demo
oc get sa
oc create serviceaccount -n demo demo-app
oc adm policy add-scc-to-user privileged -z demo-app
cat << EOF > demo1.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo1
namespace: demo
spec:
replicas: 1
selector:
matchLabels:
app: demo1
template:
metadata:
labels:
app: demo1
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
securityContext:
privileged: true
volumeMounts:
- mountPath: /data
name: demo
readOnly: false
serviceAccount: demo-app
volumes:
- name: demo
hostPath:
path: /data
type: Directory
EOF
oc apply -f demo1.yaml
oc delete -f demo1.yaml
demo for emptydir
https://kubernetes.io/docs/concepts/storage/volumes/
cat << EOF > demo1.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo1
namespace: demo
spec:
replicas: 1
selector:
matchLabels:
app: demo1
template:
metadata:
labels:
app: demo1
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
volumeMounts:
- mountPath: /data
name: demo
readOnly: false
volumes:
- name: demo
emptyDir: {}
EOF
oc apply -f demo1.yaml
oc delete -f demo1.yaml
secret
https://docs.openshift.com/container-platform/4.3/nodes/pods/nodes-pods-secrets.html
cat << EOF > demo1.yaml
---
apiVersion: v1
kind: Secret
metadata:
name: test-secret
namespace: demo
data:
username: dmFsdWUtMQ0K
password: dmFsdWUtMQ0KDQo=
stringData:
hostname: myapp.mydomain.com
secret.properties: |-
property1=valueA
property2=valueB
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo1
namespace: demo
spec:
replicas: 1
selector:
matchLabels:
app: demo1
template:
metadata:
labels:
app: demo1
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
volumeMounts:
- mountPath: /data
name: demo
readOnly: true
volumes:
- name: demo
secret:
secretName: test-secret
EOF
oc apply -f demo1.yaml
oc delete -f demo1.yaml
configmap
https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/
cat << EOF > demo1.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
name: special-config
namespace: demo
data:
SPECIAL_LEVEL: very
SPECIAL_TYPE: charm
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo1
namespace: demo
spec:
replicas: 1
selector:
matchLabels:
app: demo1
template:
metadata:
labels:
app: demo1
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
volumeMounts:
- mountPath: /data
name: demo
readOnly: true
volumes:
- name: demo
configMap:
name: special-config
EOF
oc apply -f demo1.yaml
oc delete -f demo1.yaml
nfs manual
https://docs.openshift.com/container-platform/4.3/storage/persistent_storage/persistent-storage-nfs.html
video
- https://www.bilibili.com/video/BV1Ng4y1z7Dj/
- https://youtu.be/DIM9fLGJZLU
# on helper
mkdir -p /data/export/lv01
mkdir -p /data/export/lv02
chown -R nfsnobody:nfsnobody /data/export/lv01
chown -R nfsnobody:nfsnobody /data/export/lv02
chmod 777 /data/export/lv01
chmod 777 /data/export/lv02
cat << EOF > /etc/exports
/data/export *(rw,sync,root_squash)
/data/export/lv01 *(rw,sync,root_squash)
/data/export/lv02 *(rw,sync,root_squash)
EOF
systemctl restart nfs-server
exportfs -s
cat << EOF > demo.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv0001
labels:
storage-purpose: demo
spec:
capacity:
storage: 5Gi
accessModes:
- ReadWriteOnce
nfs:
path: /data/export/lv01
server: 117.177.241.16
persistentVolumeReclaimPolicy: Retain
EOF
oc create -n demo -f demo.yaml
oc get pv
cat << EOF > demo.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: nfs-claim1
spec:
storageClassName: ""
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
selector:
matchLabels:
storage-purpose: demo
EOF
oc create -n demo -f demo.yaml
cat << EOF > demo.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo
spec:
replicas: 1
selector:
matchLabels:
app: demo
template:
metadata:
labels:
app: demo
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
volumeMounts:
- mountPath: /data
name: demo
volumes:
- name: demo
persistentVolumeClaim:
claimName: nfs-claim1
EOF
oc apply -n demo -f demo.yaml
nfs auto
https://github.com/kubernetes-incubator/external-storage/blob/master/nfs-client/deploy/test-claim.yaml
video
- https://www.bilibili.com/video/BV1vt4y1272R/
- https://youtu.be/aSfiv-G67Gg
cat << EOF > demo.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: pvc-demo
annotations:
volume.beta.kubernetes.io/storage-class: nfs-storage-provisioner
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 10Gi
EOF
oc create -n demo -f demo.yaml
cat << EOF > demo.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo
spec:
replicas: 1
selector:
matchLabels:
app: demo
template:
metadata:
labels:
app: demo
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
volumeMounts:
- mountPath: /data
name: demo
volumes:
- name: demo
persistentVolumeClaim:
claimName: pvc-demo
EOF
oc apply -n demo -f demo.yaml
oc delete -n demo -f demo.yaml
openshift 4.3 enable SupportPodPidsLimit
By default /sys/fs/cgroup/pids/pids.max is 1024, and some workloads need to go beyond that. If the limit is not relaxed you get errors like "read init-p: connection reset by peer", you cannot rsh into the pod, and customers' Java programs may fail to create threads.
The approach: do not enable the cluster's PodPidsLimit feature as the docs describe; instead use an mc (ContainerRuntimeConfig) to relax the pid limit in crio.conf.
https://www.redhat.com/en/blog/red-hat-openshift-container-platform-4-now-defaults-cri-o-underlying-container-engine
https://docs.openshift.com/container-platform/4.3/nodes/clusters/nodes-cluster-enabling-features.html
https://blog.spider.im/post/pid-limit-in-k8s/
This pids limit constrains threads + processes, roughly the number you see with pstree -pl; a quick check is sketched after the reference links below.
https://docs.openshift.com/container-platform/4.3/scalability_and_performance/recommended-host-practices.html
https://github.com/openshift/machine-config-operator/blob/master/pkg/apis/machineconfiguration.openshift.io/v1/types.go
https://github.com/openshift/machine-config-operator/blob/master/vendor/k8s.io/kubelet/config/v1beta1/types.go
https://github.com/cri-o/cri-o/issues/1921
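A minimal hedged sketch of comparing the two numbers from inside a running pod:
# current and maximum task count of this container's pids cgroup
cat /sys/fs/cgroup/pids/pids.current
cat /sys/fs/cgroup/pids/pids.max
# threads + processes, one line per thread; should roughly match pids.current
ps -eLf --no-headers | wc -l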
The correct way
Override the pid limit in /etc/crio/crio.conf directly
# check current pids limit
crictl ps | awk '{print $1}' | xargs -I DEMO crictl exec DEMO cat /sys/fs/cgroup/pids/pids.max
oc label mcp worker custom-kubelet-pod-pids-limit=true
cat << EOF > crio.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
name: set-log-and-pid
spec:
machineConfigPoolSelector:
matchLabels:
custom-kubelet-pod-pids-limit: 'true'
containerRuntimeConfig:
pidsLimit: 10240
EOF
oc apply -f crio.yaml
oc delete -f crio.yaml
The wrong way
# PodPidsLimit
oc label mcp worker custom-kubelet-pod-pids-limit=true
cat << EOF > PodPidsLimit.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
name: pod-pids-limit
spec:
machineConfigPoolSelector:
matchLabels:
custom-kubelet-pod-pids-limit: 'true'
kubeletConfig:
PodPidsLimit: 4096
EOF
oc apply -f PodPidsLimit.yaml
oc delete -f PodPidsLimit.yaml
cat << EOF > PodPidsLimit.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
name: pod-pids-limit
spec:
machineConfigPoolSelector:
matchLabels:
custom-kubelet-pod-pids-limit: 'true'
kubeletConfig:
PodPidsLimit: 10240
EOF
oc apply -f PodPidsLimit.yaml
Using RH SSO for openshift oauth authentication
https://access.redhat.com/documentation/en-us/red_hat_single_sign-on/7.3/html/red_hat_single_sign-on_for_openshift/index
The official documentation is well written, but it is based on OCP 3.11, so a few configuration points need adjusting (a sketch of the resulting OAuth configuration follows this list):
- When deploying from the catalog, be sure to set the admin username and password.
- issuer url: https://sso-sso-app-demo.apps.ocpef0a.sandbox1717.opentlc.com/auth/realms/OpenShift
- Valid Redirect URIs: https://oauth-openshift.apps.ocpef0a.sandbox1717.opentlc.com/*
- ca.crt: this file can be uploaded in the web UI, but which file is it? It is the tls.crt from the router-ca secret in openshift-ingress-operator.
- The openid login option sometimes does not show up on the login page: back all the way out to the console page, jump back, and refresh; refreshing the login page itself does not help. This looks like a small front-end bug.
- Once a user has signed in through RH SSO, logging out of openshift and switching to another user does not work. You have to log in to RH SSO and log out the previous user's session before openshift will accept a different user.
- Adding an oauth Identity Provider is easy, but there is no UI to delete one. You have to edit the Identity Providers yaml directly and remove the relevant configuration.
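As noted before the list, the end result is an OAuth cluster configuration roughly like the hedged sketch below; the client ID, client secret name and CA configmap name are placeholders, and the issuer is the one shown above:
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: rh-sso
    mappingMethod: claim
    type: OpenID
    openID:
      clientID: openshift-demo
      clientSecret:
        name: openid-client-secret
      ca:
        name: ca-config-map
      claims:
        preferredUsername:
        - preferred_username
      issuer: https://sso-sso-app-demo.apps.ocpef0a.sandbox1717.opentlc.com/auth/realms/OpenShift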
Detailed steps
Here is a screen recording of the configuration process:
- https://www.ixigua.com/i6800709743808610827/
- https://youtu.be/Ak9qdgIbOic
Create the project sso-app-demo
Create sso from the catalog; remember to set the sso admin password to save trouble later. https://access.redhat.com/documentation/en-us/red_hat_single_sign-on/7.3/html-single/red_hat_single_sign-on_for_openshift/index#deploying_the_red_hat_single_sign_on_image_using_the_application_template
Then log in to rh sso and configure it following the official documentation https://access.redhat.com/documentation/en-us/red_hat_single_sign-on/7.3/html-single/red_hat_single_sign-on_for_openshift/index#OSE-SSO-AUTH-TUTE
Fallback commands
# oc -n openshift import-image redhat-sso73-openshift:1.0
# oc new-project sso-app-demo
# oc policy add-role-to-user view system:serviceaccount:$(oc project -q):default
# oc policy remove-role-from-user view system:serviceaccount:$(oc project -q):default
# get issuer url
curl -k https://sso-sso-app-demo.apps.ocpef0a.sandbox1717.opentlc.com/auth/realms/OpenShift/.well-known/openid-configuration | python -m json.tool | grep issuer
# curl -k https://sso-sso-app-demo.apps.ocpef0a.sandbox1717.opentlc.com/auth/realms/OpenShift/.well-known/openid-configuration | jq | less
# # on mac create a ca
# cd ~/Downloads/tmp/tmp/
# openssl req \
# -newkey rsa:2048 -nodes -keyout redhat.ren.key \
# -x509 -days 3650 -out redhat.ren.crt -subj \
# "/C=CN/ST=GD/L=SZ/O=Global Security/OU=IT Department/CN=*.redhat.ren"
# # upload crt to ocp
# oc create configmap ca-config-map --from-file=ca.crt=./redhat.ren.crt -n openshift-config
# oc delete configmap ca-config-map -n openshift-config
oc get secrets router-ca -n openshift-ingress-operator -o jsonpath='{.data.tls\.crt}' | base64 -d > router.ca.crt
# oc get secrets router-ca -n openshift-ingress-operator -o jsonpath='{.data.tls\.key}' | base64 -d
# oc get OAuthClient
# if you want to debug, https://bugzilla.redhat.com/show_bug.cgi?id=1744599
oc patch authentication.operator cluster --type=merge -p "{\"spec\":{\"operatorLogLevel\": \"TraceAll\"}}"
oc patch authentication.operator cluster --type=merge -p "{\"spec\":{\"operatorLogLevel\": \"\"}}"
# update imate stream for offline
oc patch -n openshift is mysql -p "{\"spec\":{\"tags\":[{\"name\": \"5.7\",\"from\":{\"name\":\"registry.redhat.ren:5443/registry.redhat.io/rhscl/mysql-57-rhel7:latest\"}}]}}"
oc patch -n openshift is mysql -p "{\"spec\":{\"tags\":[{\"name\": \"8.0\",\"from\":{\"name\":\"registry.redhat.ren:5443/registry.redhat.io/rhscl/mysql-80-rhel7:latest\"}}]}}"
oc patch -n openshift is redhat-sso73-openshift -p "{\"spec\":{\"tags\":[{\"name\": \"1.0\",\"from\":{\"name\":\"registry.redhat.ren:5443/registry.redhat.io/redhat-sso-7/sso73-openshift:1.0\"}}]}}"
oc patch -n openshift is redhat-sso73-openshift -p "{\"spec\":{\"tags\":[{\"name\": \"latest\",\"from\":{\"name\":\"registry.redhat.ren:5443/registry.redhat.io/redhat-sso-7/sso73-openshift:1.0\"}}]}}"
oc create is ipa-server -n openshift
ocp scc
SecComp
https://docs.openshift.com/container-platform/4.3/authentication/managing-security-context-constraints.html
https://docs.docker.com/engine/security/seccomp/
https://docs.openshift.com/container-platform/4.3/nodes/nodes/nodes-nodes-managing.html
https://docs.openshift.com/container-platform/3.11/admin_guide/seccomp.html
https://gardener.cloud/050-tutorials/content/howto/secure-seccomp/
video
- https://www.bilibili.com/video/BV1Sa4y1x7UP/
- https://youtu.be/gwu53N4dIws
The experiment shows that changing the date inside the container does not affect the host.
# create a privileged service account in kube-system
oc project kube-system
oc create serviceaccount -n kube-system demo-app
oc adm policy add-scc-to-user privileged -z demo-app
# create the security profile that seccomp needs; here we create a profile that only blocks clock_settime
cat << EOF > demo.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
name: seccomp-profile
namespace: kube-system
data:
my-profile.json: |
{
"defaultAction": "SCMP_ACT_ALLOW",
"syscalls": [
{
"name": "clock_settime",
"action": "SCMP_ACT_ERRNO"
}
]
}
EOF
oc apply -f demo.yaml
# create a daemonset that copies our custom security profile into each node's container runtime seccomp directory, so the runtime can use this security profile when asked to
cat << EOF > demo.yaml
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: seccomp
namespace: kube-system
labels:
security: seccomp
spec:
selector:
matchLabels:
security: seccomp
template:
metadata:
labels:
security: seccomp
spec:
initContainers:
- name: installer
image: docker.io/library/alpine:latest
command: ["/bin/sh", "-c", "cp -r -L /seccomp/*.json /host/seccomp/"]
securityContext:
privileged: true
volumeMounts:
- name: profiles
mountPath: /seccomp
- name: hostseccomp
mountPath: /host/seccomp
readOnly: false
containers:
- name: pause
image: gcr.io/google_containers/pause-amd64:3.0
terminationGracePeriodSeconds: 5
serviceAccount: demo-app
volumes:
- name: hostseccomp
hostPath:
path: /var/lib/kubelet/seccomp
- name: profiles
configMap:
name: seccomp-profile
EOF
oc apply -f demo.yaml
# create a privileged service account in our demo project
oc project demo
oc create serviceaccount -n demo demo-app
oc adm policy add-scc-to-user privileged -z demo-app
# in the demo project, create an app that uses the security profile we just created; to show the effect we explicitly request the clock-setting capability, and below you can see the security profile still blocks the call.
cat << EOF > demo.yaml
---
apiVersion: v1
kind: Pod
metadata:
annotations:
seccomp.security.alpha.kubernetes.io/pod: "localhost/my-profile.json"
name: demo
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0.ocp4.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
securityContext:
capabilities:
add: ["CAP_SYS_TIME"]
serviceAccount: demo-app
EOF
oc apply -n demo -f demo.yaml
# exec into this pod and run the command below; it fails
# this will fail, even though you added the capability.
date -s "1 second"
# date: cannot set date: Operation not permitted
# Tue Mar 24 02:10:49 UTC 2020
# for comparison, change the security profile to allow everything
# try to allow
cat << EOF > demo.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
name: seccomp-profile
namespace: kube-system
data:
my-profile.json: |
{
"defaultAction": "SCMP_ACT_ALLOW"
}
EOF
oc apply -f demo.yaml
# relabel the daemonset so it restarts and pushes the updated security profile to every node
# restart damonset and restart pod.
# oc annotate -n kube-system ds seccomp last-update="`date`"
oc patch -n kube-system ds seccomp -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"date\":\"`date +'%s'`\"}}}}}"
oc get pod -n kube-system
# exec into the demo pod and run the command again; this time it succeeds.
# this command will succeed.
date -s "1 second"
# finally, to avoid security risks, reset the security profile to deny-all and restart the daemonset to push it to all nodes.
# finally, restore
cat << EOF > demo.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
name: seccomp-profile
namespace: kube-system
data:
my-profile.json: |
{
"defaultAction": "SCMP_ACT_ERRNO"
}
EOF
oc apply -f demo.yaml
# restart damonset and restart pod.
oc patch -n kube-system ds seccomp -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"date\":\"`date +'%s'`\"}}}}}"
capabilities
video
- https://youtu.be/yLdJghw-7xs
- https://www.bilibili.com/video/BV1x64y1T7BZ/
# create a pod with selinux restrictions and without the clock-setting capability.
cat << EOF > demo.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo
spec:
replicas: 1
selector:
matchLabels:
app: demo
template:
metadata:
labels:
app: demo
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
securityContext:
capabilities:
drop: ["CAP_SYS_TIME"]
serviceAccount: demo-app
EOF
oc apply -n demo -f demo.yaml
# exec into the pod and run the following command; it fails
date -s "1 second"
# update the pod, granting it the clock-setting capability
cat << EOF > demo.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo
spec:
replicas: 1
selector:
matchLabels:
app: demo
template:
metadata:
labels:
app: demo
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
securityContext:
capabilities:
add: ["CAP_SYS_TIME"]
serviceAccount: demo-app
EOF
oc apply -n demo -f demo.yaml
# exec into the pod and run the command; you can see it now succeeds
date -s "1 second"
# delete the demo app
oc delete -n demo -f demo.yaml
MCS
https://access.redhat.com/documentation/en-us/openshift_container_platform/3.3/html/installation_and_configuration/configuring-persistent-storage#selinuxoptions
http://www.178linux.com/98614
https://access.redhat.com/sites/default/files/video/files/mls_-_wide_8.pdf
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/selinux_users_and_administrators_guide/mls
https://www.cnblogs.com/charlieroro/p/10830721.html
video
- https://youtu.be/XoQ11ZXEL7Y
- https://www.bilibili.com/video/BV115411t777/
# on worker-0
# yum install selinux-policy-mls
# vi /etc/selinux/config
# # SELINUXTYPE=mls
# # SELINUXTYPE=mls
# getenforce
# fixfiles -F onboot
# cat /.autorelabel | less
# semanage login -l
# # semanage login --modify --range s0-s15:c0.c1023 root
# chcon -R -t default_t /data/mcs
# create a test directory and set special labels on it.
mkdir /data/mcs
chcon -R -l s0:c100 /data/mcs
chcon -R -t container_file_t /data/mcs
chown -R 1000:2000 /data/mcs
chmod -R 775 /data/mcs
# semanage fcontext -l | grep default_t
oc get project demo -o yaml
# metadata:
# annotations:
# openshift.io/description: ""
# openshift.io/display-name: ""
# openshift.io/requester: kube:admin
# openshift.io/sa.scc.mcs: s0:c23,c22
# openshift.io/sa.scc.supplemental-groups: 1000550000/10000
# openshift.io/sa.scc.uid-range: 1000550000/10000
# create a pod, specifying the selinux level s0:c99.
cat << EOF > demo.yaml
---
apiVersion: v1
kind: Pod
metadata:
annotations:
name: demo
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
securityContext:
runAsUser: 1000
runAsGroup: 2000
seLinuxOptions:
level: 's0:c99'
volumeMounts:
- mountPath: /data
name: demo
readOnly: false
serviceAccount: demo-app
volumes:
- name: demo
hostPath:
path: /data/mcs
type: Directory
EOF
oc apply -n demo -f demo.yaml
# check permissions inside the pod; since s0:c99 does not match the directory's s0:c100, the following fails
# below will fail
cd /data
# change the directory label so it matches the selinux level declared in the pod
# after change the host path selinux flag
chcon -R -l s0:c99 /data/mcs
# system_u:object_r:default_t:s0:c99
# system_u:system_r:container_t:s0:c99
# seinfo -tcontainer_t
# seinfo -rsystem_r
# back inside the pod, the operations below now succeed.
# then, below will be ok
cd /data
ls
touch test
ocp 4.3 recover from node not ready
https://access.redhat.com/solutions/4923031
cat << "EOF" > recover_kubeconfig.sh
#!/bin/bash
set -eou pipefail
# context
intapi=$(oc get infrastructures.config.openshift.io cluster -o "jsonpath={.status.apiServerInternalURI}")
context="$(oc config current-context)"
# cluster
cluster="$(oc config view -o "jsonpath={.contexts[?(@.name==\"$context\")].context.cluster}")"
server="$(oc config view -o "jsonpath={.clusters[?(@.name==\"$cluster\")].cluster.server}")"
# token
ca_crt_data="$(oc get secret -n openshift-machine-config-operator node-bootstrapper-token -o "jsonpath={.data.ca\.crt}" | base64 --decode)"
namespace="$(oc get secret -n openshift-machine-config-operator node-bootstrapper-token -o "jsonpath={.data.namespace}" | base64 --decode)"
token="$(oc get secret -n openshift-machine-config-operator node-bootstrapper-token -o "jsonpath={.data.token}" | base64 --decode)"
export KUBECONFIG="$(mktemp)"
oc config set-credentials "kubelet" --token="$token" >/dev/null
ca_crt="$(mktemp)"; echo "$ca_crt_data" > $ca_crt
oc config set-cluster $cluster --server="$intapi" --certificate-authority="$ca_crt" --embed-certs >/dev/null
oc config set-context kubelet --cluster="$cluster" --user="kubelet" >/dev/null
oc config use-context kubelet >/dev/null
cat "$KUBECONFIG"
EOF
chmod 755 recover_kubeconfig.sh
./recover_kubeconfig.sh > kubeconfig-bootstrap
# scp kubeconfig-bootstrap to each affected nodes
scp kubeconfig-bootstrap core@node.ip.address:~/
# on each affected nodes
systemctl stop kubelet
mkdir -p /root/backup-certs
cp -a /var/lib/kubelet/pki /var/lib/kubelet/kubeconfig /root/backup-certs
rm -rf /var/lib/kubelet/pki /var/lib/kubelet/kubeconfig
cp /home/core/kubeconfig-bootstrap /etc/kubernetes/kubeconfig
systemctl start kubelet
# on helper
oc get node
oc get csr
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
openshift 4.3 QoS
https://docs.openshift.com/container-platform/4.3/nodes/pods/nodes-pods-configuring.html
https://docs.openshift.com/container-platform/3.11/admin_guide/managing_pods.html#admin-guide-manage-pods-limit-bandwidth
video
- https://youtu.be/ghObMDoLcAQ
- https://www.bilibili.com/video/BV16Z4y1W75P/
# Create a server pod that runs iperf3 as the server, limited to 1 Mb/s, and a client pod that runs iperf3 as the client.
cat << EOF > demo.yaml
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod
annotations:
kubernetes.io/ingress-bandwidth: 1M
kubernetes.io/egress-bandwidth: 1M
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: iperf
spec:
replicas: 1
selector:
matchLabels:
app: iperf
template:
metadata:
labels:
app: iperf
spec:
nodeSelector:
kubernetes.io/hostname: 'infra0.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: iperf
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
EOF
oc apply -n demo -f demo.yaml
# find the server pod IP
oc get pod -o wide
# from the client pod, run the speed test
oc exec -it iperf-5b95866ff5-c9p9m -- iperf3 -t 20 -b 2M -p 6666 -c 10.254.5.52
# check the server pod's logs to see the traffic figures
# change the server bandwidth limit to 2M
oc delete pod -n demo demo-pod
cat << EOF > demo1.yaml
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod
annotations:
kubernetes.io/ingress-bandwidth: 2M
kubernetes.io/egress-bandwidth: 2M
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
EOF
oc apply -n demo -f demo1.yaml
# find the server pod IP
oc get pod -o wide
# from the client pod, run the speed test
oc exec -it iperf-5b95866ff5-c9p9m -- iperf3 -t 20 -b 2M -p 6666 -c 10.254.5.53
# check the server pod's logs to see the traffic figures
oc delete -n demo -f demo.yaml
openshift 4.3 QoS under heavy traffic
This article tests how the rate limiting of an openshift (ovs) pod behaves under heavy traffic.
video
- https://youtu.be/IaWdkPsRinw
- https://www.bilibili.com/video/BV1cV411d7LV/
Reference material:
https://docs.openshift.com/container-platform/4.3/nodes/pods/nodes-pods-configuring.html
https://docs.openshift.com/container-platform/3.11/admin_guide/managing_pods.html#admin-guide-manage-pods-limit-bandwidth
# check the NIC speed on infra0 and infra1; they are 10GE ports
ethtool em1
# Settings for em1:
# Supported ports: [ FIBRE ]
# Supported link modes: 1000baseT/Full
# 10000baseT/Full
# Supported pause frame use: Symmetric Receive-only
# Supports auto-negotiation: No
# Supported FEC modes: Not reported
# Advertised link modes: 10000baseT/Full
# Advertised pause frame use: No
# Advertised auto-negotiation: No
# Advertised FEC modes: Not reported
# Speed: 10000Mb/s
# Duplex: Full
# Port: FIBRE
# PHYAD: 1
# Transceiver: internal
# Auto-negotiation: off
# Supports Wake-on: g
# Wake-on: d
# Current message level: 0x00000000 (0)
# Link detected: yes
# Create two server pods running iperf3 as servers, with no rate limit, and a client pod running iperf3 as the client.
cat << EOF > demo.yaml
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod1
namespace: demo
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
resources:
requests:
cpu: 4.0
memory: 8Gi
limits:
cpu: 60.0
memory: 100Gi
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod2
namespace: default
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
resources:
requests:
cpu: 4.0
memory: 8Gi
limits:
cpu: 60.0
memory: 100Gi
---
kind: Pod
apiVersion: v1
metadata:
name: iperf
namespace: zte
spec:
nodeSelector:
kubernetes.io/hostname: 'infra0.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: iperf
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
resources:
requests:
cpu: 4.0
memory: 8Gi
limits:
cpu: 60.0
memory: 100Gi
EOF
oc apply -f demo.yaml
# find the server pod IPs
oc get pod -A -o wide | grep demo-pod
oc get pod -n zte -o wide
pod_demo1_ip=$(oc get pod -n demo demo-pod1 -o json | jq -r '.status.podIPs[0].ip')
pod_demo2_ip=$(oc get pod -n default demo-pod2 -o json | jq -r '.status.podIPs[0].ip')
echo $pod_demo1_ip
echo $pod_demo2_ip
# from the client pod, run speed tests against both server pods
/bin/rm -f nohup.out
nohup oc exec -n zte -it iperf -- iperf3 -T demo1 -i 10 -t 30 -b 3G -P 6 -p 6666 -c $pod_demo1_ip 2>&1 &
nohup oc exec -n zte -it iperf -- iperf3 -T demo2 -i 10 -t 30 -b 6G -P 6 -p 6666 -c $pod_demo2_ip 2>&1 &
tail -f nohup.out
# adjust the traffic and test again against both server pods
/bin/rm -f nohup.out
nohup oc exec -n zte -it iperf -- iperf3 -T demo1 -i 10 -t 30 -b 6G -P 6 -p 6666 -c $pod_demo1_ip 2>&1 &
nohup oc exec -n zte -it iperf -- iperf3 -T demo2 -i 10 -t 30 -b 6G -P 6 -p 6666 -c $pod_demo2_ip 2>&1 &
tail -f nohup.out
# adjust the traffic and test again against both server pods
/bin/rm -f nohup.out
nohup oc exec -n zte -it iperf -- iperf3 -T demo1 -i 10 -t 30 -b 8G -P 6 -p 6666 -c $pod_demo1_ip 2>&1 &
nohup oc exec -n zte -it iperf -- iperf3 -T demo2 -i 10 -t 30 -b 6G -P 6 -p 6666 -c $pod_demo2_ip 2>&1 &
tail -f nohup.out
# check the server pod logs to see the traffic figures
# change the server bandwidth limit to 6G
oc delete pod -n demo demo-pod1
cat << EOF > demo1.yaml
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod1
namespace: demo
annotations:
kubernetes.io/ingress-bandwidth: 6G
kubernetes.io/egress-bandwidth: 6G
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
EOF
oc apply -n demo -f demo1.yaml
# find the server pod IPs
oc get pod -A -o wide | grep demo-pod
oc get pod -n zte -o wide
pod_demo1_ip=$(oc get pod -n demo demo-pod1 -o json | jq -r '.status.podIPs[0].ip')
pod_demo2_ip=$(oc get pod -n default demo-pod2 -o json | jq -r '.status.podIPs[0].ip')
echo $pod_demo1_ip
echo $pod_demo2_ip
# adjust the traffic and test again against both server pods
/bin/rm -f nohup.out
nohup oc exec -n zte -it iperf -- iperf3 -T demo1 -i 10 -t 30 -b 8G -P 6 -p 6666 -c $pod_demo1_ip 2>&1 &
nohup oc exec -n zte -it iperf -- iperf3 -T demo2 -i 10 -t 30 -b 6G -P 6 -p 6666 -c $pod_demo2_ip 2>&1 &
tail -f nohup.out
# check the server pod logs to see the traffic figures
# change the server bandwidth limit to 3G
oc delete pod -n demo demo-pod1
cat << EOF > demo1.yaml
---
kind: Pod
apiVersion: v1
metadata:
name: demo-pod1
namespace: demo
annotations:
kubernetes.io/ingress-bandwidth: 3G
kubernetes.io/egress-bandwidth: 3G
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf3", "-s", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
EOF
oc apply -n demo -f demo1.yaml
# find the server pod IPs
oc get pod -A -o wide | grep demo-pod
oc get pod -n zte -o wide
pod_demo1_ip=$(oc get pod -n demo demo-pod1 -o json | jq -r '.status.podIPs[0].ip')
pod_demo2_ip=$(oc get pod -n default demo-pod2 -o json | jq -r '.status.podIPs[0].ip')
echo $pod_demo1_ip
echo $pod_demo2_ip
# adjust the traffic and test again against both server pods
/bin/rm -f nohup.out
nohup oc exec -n zte -it iperf -- iperf3 -T demo1 -i 10 -t 30 -b 8G -P 6 -p 6666 -c $pod_demo1_ip 2>&1 &
nohup oc exec -n zte -it iperf -- iperf3 -T demo2 -i 10 -t 30 -b 6G -P 6 -p 6666 -c $pod_demo2_ip 2>&1 &
tail -f nohup.out
# check the server pod logs to see the traffic figures
oc delete -f demo.yaml
package size
oc exec -n zte -it iperf -- iperf3 -T demo1 -V -b 10G -M 1500 -p 6666 -c $pod_demo1_ip
# demo1: Test Complete. Summary Results:
# demo1: [ ID] Interval Transfer Bandwidth Retr
# demo1: [ 4] 0.00-10.00 sec 3.66 GBytes 3.15 Gbits/sec 221 sender
# demo1: [ 4] 0.00-10.00 sec 3.66 GBytes 3.14 Gbits/sec receiver
# demo1: CPU Utilization: local/sender 38.5% (1.8%u/36.6%s), remote/receiver 9.6% (0.4%u/9.2%s)
oc exec -n zte -it iperf -- iperf3 -T demo1 -V -b 10G -M 1000 -p 6666 -c $pod_demo1_ip
# demo1: Test Complete. Summary Results:
# demo1: [ ID] Interval Transfer Bandwidth Retr
# demo1: [ 4] 0.00-10.00 sec 2.68 GBytes 2.30 Gbits/sec 304 sender
# demo1: [ 4] 0.00-10.00 sec 2.68 GBytes 2.30 Gbits/sec receiver
# demo1: CPU Utilization: local/sender 22.8% (1.0%u/21.7%s), remote/receiver 2.4% (0.2%u/2.2%s)
oc exec -n zte -it iperf -- iperf3 -T demo1 -V -b 10G -M 500 -p 6666 -c $pod_demo1_ip
# demo1: Test Complete. Summary Results:
# demo1: [ ID] Interval Transfer Bandwidth Retr
# demo1: [ 4] 0.00-10.00 sec 1.32 GBytes 1.14 Gbits/sec 195 sender
# demo1: [ 4] 0.00-10.00 sec 1.32 GBytes 1.13 Gbits/sec receiver
# demo1: CPU Utilization: local/sender 13.6% (0.9%u/12.7%s), remote/receiver 4.2% (0.3%u/4.0%s)
oc exec -n zte -it iperf -- iperf3 -T demo1 -V -b 10G -M 100 -p 6666 -c $pod_demo1_ip
# demo1: Test Complete. Summary Results:
# demo1: [ ID] Interval Transfer Bandwidth Retr
# demo1: [ 4] 0.00-10.00 sec 224 MBytes 188 Mbits/sec 590 sender
# demo1: [ 4] 0.00-10.00 sec 223 MBytes 187 Mbits/sec receiver
# demo1: CPU Utilization: local/sender 3.5% (0.2%u/3.3%s), remote/receiver 10.2% (0.1%u/10.1%s)
oc exec -n zte -it iperf -- iperf3 -T demo1 -V -b 10G -M 1500 -P 10 -p 6666 -c $pod_demo1_ip
# demo1: [SUM] 0.00-10.00 sec 9.21 GBytes 7.91 Gbits/sec 4804 sender
# demo1: [SUM] 0.00-10.00 sec 9.20 GBytes 7.90 Gbits/sec receiver
# demo1: CPU Utilization: local/sender 65.3% (2.5%u/62.8%s), remote/receiver 28.5% (0.4%u/28.1%s)
oc exec -n zte -it iperf -- iperf3 -T demo1 -V -b 10G -M 1000 -P 10 -p 6666 -c $pod_demo1_ip
# demo1: [SUM] 0.00-10.00 sec 8.62 GBytes 7.40 Gbits/sec 4354 sender
# demo1: [SUM] 0.00-10.00 sec 8.61 GBytes 7.40 Gbits/sec receiver
# demo1: CPU Utilization: local/sender 73.7% (2.4%u/71.3%s), remote/receiver 19.7% (0.9%u/18.8%s)
oc exec -n zte -it iperf -- iperf3 -T demo1 -V -b 10G -M 500 -P 10 -p 6666 -c $pod_demo1_ip
# demo1: [SUM] 0.00-10.00 sec 4.72 GBytes 4.05 Gbits/sec 7142 sender
# demo1: [SUM] 0.00-10.00 sec 4.71 GBytes 4.05 Gbits/sec receiver
# demo1: CPU Utilization: local/sender 49.4% (2.0%u/47.3%s), remote/receiver 17.6% (0.6%u/17.1%s)
oc exec -n zte -it iperf -- iperf3 -T demo1 -V -b 10G -M 100 -P 10 -p 6666 -c $pod_demo1_ip
# demo1: [SUM] 0.00-10.00 sec 895 MBytes 750 Mbits/sec 10362 sender
# demo1: [SUM] 0.00-10.00 sec 889 MBytes 745 Mbits/sec receiver
# demo1: CPU Utilization: local/sender 14.4% (0.6%u/13.7%s), remote/receiver 22.6% (0.3%u/22.3%s)
iperf3 -T demo1 -V -b 10G -M 1500 -p 6666 -c 117.177.241.24
# demo1: Test Complete. Summary Results:
# demo1: [ ID] Interval Transfer Bandwidth Retr
# demo1: [ 4] 0.00-10.00 sec 10.5 GBytes 8.98 Gbits/sec 0 sender
# demo1: [ 4] 0.00-10.00 sec 10.4 GBytes 8.98 Gbits/sec receiver
# demo1: CPU Utilization: local/sender 52.8% (2.7%u/50.2%s), remote/receiver 30.6% (1.0%u/29.5%s)
iperf3 -T demo1 -V -b 10G -M 1000 -p 6666 -c 117.177.241.24
# demo1: Test Complete. Summary Results:
# demo1: [ ID] Interval Transfer Bandwidth Retr
# demo1: [ 4] 0.00-10.00 sec 9.28 GBytes 7.97 Gbits/sec 0 sender
# demo1: [ 4] 0.00-10.00 sec 9.27 GBytes 7.96 Gbits/sec receiver
# demo1: CPU Utilization: local/sender 54.4% (3.2%u/51.2%s), remote/receiver 19.2% (0.1%u/19.1%s)
iperf3 -T demo1 -V -b 10G -M 500 -p 6666 -c 117.177.241.24
# demo1: Test Complete. Summary Results:
# demo1: [ ID] Interval Transfer Bandwidth Retr
# demo1: [ 4] 0.00-10.00 sec 6.14 GBytes 5.28 Gbits/sec 5857 sender
# demo1: [ 4] 0.00-10.00 sec 6.14 GBytes 5.27 Gbits/sec receiver
# demo1: CPU Utilization: local/sender 30.6% (2.1%u/28.5%s), remote/receiver 12.6% (0.1%u/12.5%s)
iperf3 -T demo1 -V -b 10G -M 100 -p 6666 -c 117.177.241.24
# demo1: Test Complete. Summary Results:
# demo1: [ ID] Interval Transfer Bandwidth Retr
# demo1: [ 4] 0.00-10.00 sec 1.41 GBytes 1.21 Gbits/sec 3499 sender
# demo1: [ 4] 0.00-10.00 sec 1.40 GBytes 1.21 Gbits/sec receiver
# demo1: CPU Utilization: local/sender 8.2% (0.9%u/7.4%s), remote/receiver 23.8% (0.1%u/23.7%s)
How to add an http_proxy for pulling images
If the deployment environment has an image proxy, the cluster can be configured to pull images through that proxy.
The key is CRI-O's environment variables, so we drop an environment file into /etc/systemd/system/crio.service.d/
cat << EOF > crio-env.conf
[Service]
Environment=HTTP_PROXY=http://v.redhat.ren:8080
Environment=HTTPS_PROXY=http://v.redhat.ren:8080
Environment=NO_PROXY=redhat.ren,10.254.0.0/16,172.30.0.0/16
EOF
config_source=$(cat ./crio-env.conf | python3 -c "import sys, urllib.parse; print(urllib.parse.quote(''.join(sys.stdin.readlines())))" )
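As an optional sanity check (not part of the original steps), the encoded string can be decoded back and compared with the source file:
# decode the urlencoded content and diff it against the original file; no output means they match
echo -n "${config_source}" | python3 -c "import sys, urllib.parse; sys.stdout.write(urllib.parse.unquote(sys.stdin.read()))" | diff - crio-env.conf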
cat <<EOF > 50-crio-env-conf.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: worker
name: 50-crio-env-conf
spec:
config:
ignition:
version: 2.2.0
storage:
files:
- contents:
source: data:text/plain,${config_source}
verification: {}
filesystem: root
mode: 0420
path: /etc/systemd/system/crio.service.d/20-wzh-env.conf
- contents:
source: data:text/plain,${config_source}
verification: {}
filesystem: root
mode: 0420
path: /etc/systemd/system/kubelet.service.d/20-wzh-env.conf
- contents:
source: data:text/plain,${config_source}
verification: {}
filesystem: root
mode: 0420
path: /etc/systemd/system/machine-config-daemon-host.service.d/20-wzh-env.conf
- contents:
source: data:text/plain,${config_source}
verification: {}
filesystem: root
mode: 0420
path: /etc/systemd/system/pivot.service.d/20-wzh-env.conf
EOF
oc apply -f 50-crio-env-conf.yaml -n openshift-config
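The MachineConfig triggers a rolling reboot of the worker pool. A generic way to watch the rollout (added here as a convenience, not in the original notes):
# watch the worker machine config pool until UPDATED becomes True
oc get mcp worker -w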
After the worker nodes have rebooted, run a quick test
cat << EOF > test-local-dc.yaml
kind: DeploymentConfig
apiVersion: apps.openshift.io/v1
metadata:
name: busybox
labels:
run: busybox
spec:
replicas: 1
template:
metadata:
labels:
run: busybox
spec:
containers:
- name: busybox
image: 'docker.io/busybox:1.28.0-glibc'
command:
- sleep
- '36000'
EOF
oc apply -f test-local-dc.yaml
Because of network problems in the lab environment the pull did not succeed, but we could see that the download was going through the proxy.
The following are dead ends (kept for the record)
With this in place, images could be pulled through the internal proxy server.
Tuning /etc/crio/crio.conf does not work: after reading the source code, of the operations described in the link below only three options are actually supported, and the rest are ignored. https://www.redhat.com/en/blog/red-hat-openshift-container-platform-4-now-defaults-cri-o-underlying-container-engine
The source code does show, happily, that /etc/systemd/system/crio.service.d/10-default-env.conf is rendered from the cluster proxy configuration and takes effect. https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/files/etc-systemd-system-crio.service.d-10-default-env.conf.yaml
Configure a cluster proxy https://access.redhat.com/solutions/3442811
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
name: cluster
spec:
httpProxy: http://v.redhat.ren:8080
httpsProxy: http://v.redhat.ren:8080
noProxy: example.com
cat << EOF > proxy.yaml
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
name: cluster
spec:
httpProxy: http://v.redhat.ren:8080
httpsProxy: http://v.redhat.ren:8080
readinessEndpoints:
- http://www.google.com
noProxy: example.com
trustedCA:
name: ca.for.proxy
EOF
oc apply -f proxy.yaml
cat << EOF > proxy.yaml
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
name: cluster
spec: {}
EOF
oc apply -f proxy.yaml
cat /etc/systemd/system/crio.service.d/10-default-env.conf
cat << EOF > ca.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: ca.for.proxy
namespace: openshift-config
data:
ca-bundle.crt: |
-----BEGIN PRIVATE KEY-----
MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQCpRqAtwkQsmdA5
qDyAV7ABoRmZdDh7aaH9OY+gHRVtMDYbEH1e3u4oIJ5CoAK4EiZ/AZA2Pb5xFO+5
63YwMFEucg0TcCAs20yFbhkRXac1UxsGmx3zUSfex6/A6yxwyx14/HBoli6Trqpr
oPxUFDFoHHe6zIqgQkdjdYttL/vwrVg2yH2Z3IS1qQ/uN8UpyL/yY48VRimQsGjX
9FmRusONsUdRYh29gbOI76hJ7ooCNGvgbXq/6L6OGu6by+g6MgqHtBWMjnObWkWV
ln1lRRfmhwlGO0136lURt58diJSIWPXOpSO4Ulc2JMH9D+pgAD59JU4pm1PvGotc
e+WIxvJ9AgMBAAECggEACpulcBirgwwEk4hqejSEkCWTYB17aKh/AUp5KLSJ4jTS
PzHyWV6pGBSrNkumv/hLN0xWyD9oTtfcCg+qcWylub5l+WDec1Eu43G52m+/CcVy
fSB9aQEd+YUUC4fxWgQwjaNsO/Gla5XXkjUdevtk+TxHeIpW6aIdrSrxmN8X78Yj
F0FIPYSAM4Lh2ZdykFS9igbteRN27WGlypKF6D7efDfbh4TLuVtSMRyehjewyy3U
DAYkkMm1SD/TH4HJQU8eU3Gp3ZZmP4uSTESfBc/6lrSy/ooXqtc/x8dv0SQtky0I
FQu/bTdrSjz3gOKZVfaLsG4LMiMo7M4SekyU2EGulQKBgQDUobsMXV0WrwVF4JFF
ug3PxXwcatlnesrlcOPQQdhZz4ngk3z49GxPrXykzFQ5KtMCsgyOhNpXOVu6vqew
0QmxJvF8Mo0GhwIOANlrQSn/Flt5s5GIPqteAE//RxSsAhRm6fDnxKik2aT5XOYl
9GQvFvPDtjSR0nBHQg5BuBgtbwKBgQDLzSDr61tbU02/bV/td6CkVMSSpGHpfUU+
0rGC9/JzBmBDr/mC5fDUN0bno1zq35HURxzhk306BJSbMMwnwmUFgWxPuJwlVo2V
Zs3x41eYzTj7JOPZ/AphR+6pdpXlsoxpXUQRgWq1j8hq0wUqDL8s0ltzoDJFMxri
J9N7fv6A0wKBgQChFk3Q1kKZ1sqV38XvHz8rcx/Nn51I6hwgqt/MfLXdhH+eJd59
9R7BVluhtjLwhGMMHbuplTic8BVwatQ7/oHrNeepAdsZYNrLpRUSTnH0kQmIL+RH
ZcMKGg6BBWbB0WmHdiBOVgy1pzV2vUyW4ImtqyPN15IID3eEZKTMYR3f/QKBgFke
QBEp/+71hH/64gHDV/nEH5lITJB/ePI5y+nLZrepyBqRLvhweFk0Oss8Anuqe+hp
mFWD2zStoBYkxoF0XhyENcq+nXkuWgdExzXJBhsJUqtvvDssHZXgkJqGApJI+2Fv
qT5Ga1UtpKQh1pZGsKp26gqruI/OAyl15OKR69SFAoGADAOAADooY3Qcn9AWH1e8
ebSDdimi4j1H9yFvcByaJkNrGhNgKwYYYeLsCvwxGLjRontoH6xOJAVdwmadV/CH
6Ket3yJLWRIuu1N1IKvfLEqLsp2sbWKInhohEfh5yZmvCeTUjJKkz62DYS20JsN0
1+gdBRElKgEz14GTvj7lpas=
-----END PRIVATE KEY-----
-----BEGIN CERTIFICATE-----
MIIDVzCCAj+gAwIBAgIJANzkXo7TCVYVMA0GCSqGSIb3DQEBCwUAMEIxCzAJBgNV
BAYTAlhYMRUwEwYDVQQHDAxEZWZhdWx0IENpdHkxHDAaBgNVBAoME0RlZmF1bHQg
Q29tcGFueSBMdGQwHhcNMjAwMjIyMDMxOTMxWhcNMjEwMjIxMDMxOTMxWjBCMQsw
CQYDVQQGEwJYWDEVMBMGA1UEBwwMRGVmYXVsdCBDaXR5MRwwGgYDVQQKDBNEZWZh
dWx0IENvbXBhbnkgTHRkMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA
qUagLcJELJnQOag8gFewAaEZmXQ4e2mh/TmPoB0VbTA2GxB9Xt7uKCCeQqACuBIm
fwGQNj2+cRTvuet2MDBRLnINE3AgLNtMhW4ZEV2nNVMbBpsd81En3sevwOsscMsd
ePxwaJYuk66qa6D8VBQxaBx3usyKoEJHY3WLbS/78K1YNsh9mdyEtakP7jfFKci/
8mOPFUYpkLBo1/RZkbrDjbFHUWIdvYGziO+oSe6KAjRr4G16v+i+jhrum8voOjIK
h7QVjI5zm1pFlZZ9ZUUX5ocJRjtNd+pVEbefHYiUiFj1zqUjuFJXNiTB/Q/qYAA+
fSVOKZtT7xqLXHvliMbyfQIDAQABo1AwTjAdBgNVHQ4EFgQUaTkD399lxrjHrHkl
Mq1se4L+yr0wHwYDVR0jBBgwFoAUaTkD399lxrjHrHklMq1se4L+yr0wDAYDVR0T
BAUwAwEB/zANBgkqhkiG9w0BAQsFAAOCAQEAkuBFWQV2dFfwVChhVGKxynQ3JD48
tT27b8G0YHMIM1WGkYIO7jWOx4Vvpo0ykqvwP1r7gVLHectPynCt55c1/lN9FxuV
o+VTGN2ObA8AyEr4pPUJf7rav9GBlyJlIGL2IM4A9b0aCqfwIg0OyTSQzI5E5Cv8
SDj1XTCPwkZT+Vq8aXorpej4dNhz//0AA872pAtwp9ex+KPOVRRZM4cQfQof3saB
oPSkc8R2sA1TYNweeF4cWctWz2G0Vy/uo0fwcTb9NJwpzZlRBclg2S9WA9dMwnV8
LVnyLpo2cf4R2z8zDcfDoQV7i6JxzfTQCeUO1Zy4zPTbtKt1k8g3dYfF0w==
-----END CERTIFICATE-----
EOF
oc apply -f ca.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
name: set-log-and-pid
spec:
machineConfigPoolSelector:
matchLabels:
debug-crio: config1
containerRuntimeConfig:
conmon_env: "[ HTTP_PROXY=http://v.redhat.ren:8080, HTTPS_PROXY=http://v.redhat.ren:8080 ]"
cat << EOF > crio.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
name: set-log-and-pid
spec:
machineConfigPoolSelector:
matchLabels:
debug-crio: config1
containerRuntimeConfig:
conmon_env: '[HTTP_PROXY=http://v.redhat.ren:8080,HTTPS_PROXY=http://v.redhat.ren:8080]'
EOF
oc apply -f crio.yaml
oc delete -f crio.yaml
oc edit MachineConfigPool/worker
oc get ContainerRuntimeConfig -o yaml
oc get MachineConfigs
python3 -c "import sys, urllib.parse; print(urllib.parse.unquote(sys.argv[1]))" $(oc get MachineConfig/rendered-worker-a01b5da25ec85d2f0ffabfeb1fbe996d -o YAML | grep -B4 crio.conf | grep source | tail -n 1 | cut -d, -f2) | grep conmon
numa
https://docs.openshift.com/container-platform/4.3/scalability_and_performance/using-topology-manager.html#topology_manager_policies_using-topology-manager
https://www.sharcnet.ca/help/index.php/Using_numactl
video
- https://youtu.be/J2VQQZxk3eY
- https://www.bilibili.com/video/BV1HK4y1r7Di/
oc get featuregate/cluster -o yaml
oc patch featuregate/cluster -p '{"spec": { "featureSet": "LatencySensitive" } }' --type=merge
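To confirm the feature set was applied, the spec can be read back (an optional check):
# print the currently configured feature set
oc get featuregate/cluster -o jsonpath='{.spec.featureSet}{"\n"}'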
oc get KubeletConfig -o yaml
cat << EOF > cpumanager-kubeletconfig.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
name: cpumanager-enabled
spec:
machineConfigPoolSelector:
matchLabels:
custom-kubelet: cpumanager-enabled
kubeletConfig:
cpuManagerPolicy: static
cpuManagerReconcilePeriod: 5s
topologyManagerPolicy: single-numa-node
EOF
oc apply -f cpumanager-kubeletconfig.yaml
oc project demo
cat << EOF > cpumanager-pod.yaml
apiVersion: v1
kind: Pod
metadata:
generateName: cpumanager-
spec:
containers:
- name: cpumanager
image: gcr.io/google_containers/pause-amd64:3.0
resources:
requests:
cpu: 1
memory: "1G"
limits:
cpu: 1
memory: "1G"
nodeSelector:
cpumanager: "true"
EOF
oc apply -f cpumanager-pod.yaml
# on the worker node
yum install numactl
# run COMMAND on NUMA node 0 (both CPU and memory come from NUMA node 0)
numactl --cpunodebind=0 --membind=0 COMMAND
# run COMMAND with CPU from NUMA node 1 and memory preferably from node 1; if node 1 runs out of memory, fall back to node 0
numactl --cpunodebind=1 --preferred=1 COMMAND
# get a process's CPU affinity mask
taskset -p <pid>
# pid 26624's current affinity mask: ff   (no affinity mask has been set)
# a process's per-NUMA-node memory usage can be queried with
numastat <pid>
# Per-node process memory usage (in MBs) for PID 26624 (firefox)
# Node 0 Total
# --------------- ---------------
# Huge 0.00 0.00
# Heap 0.00 0.00
# Stack 0.08 0.08
# Private 208.50 208.50
# ---------------- --------------- ---------------
# Total 208.58 208.58
# i.e. how much memory the process occupies on each NUMA node
# find which NUMA node a PCI NIC is attached to
cat /sys/class/net/<devicename>/device/numa_node
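Putting the pieces together, a small sketch that pins a workload to the NUMA node its NIC sits on (the device name ens1f0 and the iperf3 server are placeholders used purely for illustration):
# look up the NIC's NUMA node (prints -1 on hosts without NUMA locality info)
nic_node=$(cat /sys/class/net/ens1f0/device/numa_node)
# bind both CPU and memory of the workload to that node
numactl --cpunodebind=${nic_node} --membind=${nic_node} iperf3 -s -p 6666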
# back to normal
cat << EOF > cpumanager-kubeletconfig.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
name: cpumanager-enabled
spec:
machineConfigPoolSelector:
matchLabels:
custom-kubelet: cpumanager-enabled
kubeletConfig:
cpuManagerPolicy: static
cpuManagerReconcilePeriod: 5s
topologyManagerPolicy: none
EOF
oc apply -f cpumanager-kubeletconfig.yaml
# delete them all
oc delete -f cpumanager-kubeletconfig.yaml
openshift 4.3 network policy demo
https://docs.openshift.com/container-platform/4.3/networking/configuring-networkpolicy.html
video
- https://youtu.be/pbV2VwIExVg
- https://www.bilibili.com/video/BV1vz411B7pC/
# configure network policies for the zxcdn and demo namespaces: only traffic from inside the namespace and from the ingress controller is allowed, traffic from any other application is rejected.
cat << EOF > demo.yaml
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: deny-other-namespaces
spec:
podSelector: null
ingress:
- from:
- podSelector: {}
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-from-openshift-ingress
spec:
ingress:
- from:
- namespaceSelector:
matchLabels:
network.openshift.io/policy-group: ingress
podSelector: {}
policyTypes:
- Ingress
EOF
oc apply -n zxcdn -f demo.yaml
oc apply -n demo -f demo.yaml
# create one test pod in each of the demo and zxcdn namespaces
cat << EOF > demo.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo
spec:
replicas: 1
selector:
matchLabels:
app: demo
template:
metadata:
labels:
app: demo
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
EOF
oc apply -n demo -f demo.yaml
oc apply -n zxcdn -f demo.yaml
# find the IP address of the cdn pod
oc get pod -o wide -n zxcdn
# exec into the demo pod and ping the cdn pod; the ping should fail
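For example (the pod name and IP are placeholders for whatever oc get pod -o wide returned above):
# expect 100% packet loss while cross-namespace traffic is still blocked
oc exec -n demo <demo-pod-name> -- ping -c 3 <zxcdn-pod-ip>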
# update the network policy in the zxcdn namespace to also allow traffic from the demo namespace
oc label namespace demo name=demo
cat << EOF > demo.yaml
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: deny-other-namespaces
spec:
podSelector: null
ingress:
- from:
- podSelector: {}
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-from-openshift-ingress
spec:
ingress:
- from:
- namespaceSelector:
matchLabels:
network.openshift.io/policy-group: ingress
podSelector: {}
policyTypes:
- Ingress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-from-other
spec:
ingress:
- from:
- namespaceSelector:
matchLabels:
name: demo
podSelector: {}
policyTypes:
- Ingress
EOF
oc apply -n zxcdn -f demo.yaml
# exec into the demo pod and ping the cdn pod; the ping should now succeed
# exec into a pod in the zxcdn project and ping the demo pod; the ping should fail
oc get pod -n demo -o wide
# update the network policy in the demo namespace to also allow traffic from the zxcdn namespace
oc label namespace zxcdn name=zxcdn
cat << EOF > demo.yaml
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: deny-other-namespaces
spec:
podSelector: null
ingress:
- from:
- podSelector: {}
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-from-openshift-ingress
spec:
ingress:
- from:
- namespaceSelector:
matchLabels:
network.openshift.io/policy-group: ingress
podSelector: {}
policyTypes:
- Ingress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-from-other
spec:
ingress:
- from:
- namespaceSelector:
matchLabels:
name: zxcdn
podSelector: {}
policyTypes:
- Ingress
EOF
oc apply -n demo -f demo.yaml
# exec into a pod in the zxcdn project and ping the demo pod; the ping should now succeed
oc delete -n zxcdn -f demo.yaml
oc delete -n demo -f demo.yaml
openshift 4.3 multicast
This section tests multicast between openshift pods.
video:
- https://youtu.be/4UriNYHRbHk
- https://www.bilibili.com/video/BV1wk4y1k7sS/
References:
https://docs.openshift.com/container-platform/4.3/networking/openshift_sdn/using-multicast.html
https://pktgen-dpdk.readthedocs.io/en/latest/getting_started.html
https://access.redhat.com/solutions/406553
https://wenku.baidu.com/view/9a7c3c3dbdd126fff705cc1755270722182e5943.html?rec_flag=default
# enable multicast on the project's netnamespace
oc annotate netnamespace demo \
netnamespace.network.openshift.io/multicast-enabled=true
# create two multicast receiver pods plus one multicast test (sender) pod
cat << EOF > demo.yaml
---
kind: Pod
apiVersion: v1
metadata:
name: demo1
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf", "-s", "-u ","-B", "224.0.0.1", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
---
kind: Pod
apiVersion: v1
metadata:
name: demo2
spec:
nodeSelector:
kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["iperf", "-s", "-u ","-B", "224.0.0.1", "-p" ]
args: [ "6666" ]
imagePullPolicy: Always
---
kind: Pod
apiVersion: v1
metadata:
name: iperf
spec:
nodeSelector:
kubernetes.io/hostname: 'infra0.hsc.redhat.ren'
restartPolicy: Always
containers:
- name: iperf
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: ["/bin/bash", "-c", "--" ]
args: [ "trap : TERM INT; sleep infinity & wait" ]
imagePullPolicy: Always
EOF
oc apply -n demo -f demo.yaml
oc project demo
# check that the pods are running and scheduled to the expected nodes
oc get pod -o wide
# check the multicast addresses of receiver pod demo1
oc exec -it demo1 -- ipmaddr show dev eth0
# 3: eth0
# link 33:33:00:00:00:01
# link 01:00:5e:00:00:01
# link 33:33:ff:07:a8:2e
# inet 224.0.0.1
# inet6 ff02::1:ff07:a82e
# inet6 ff02::1
# inet6 ff01::1
# check the multicast addresses of receiver pod demo2
oc exec -it demo2 -- ipmaddr show dev eth0
# 3: eth0
# link 33:33:00:00:00:01
# link 01:00:5e:00:00:01
# link 33:33:ff:5c:ba:66
# inet 224.0.0.1
# inet6 ff02::1:ff5c:ba66
# inet6 ff02::1
# inet6 ff01::1
# from the test pod iperf, generate multicast traffic to 224.0.0.1
oc exec -it iperf -- iperf -c 224.0.0.1 -u -p 6666 -t 30 -i 1
# capture on receiver pod demo1; the multicast flow to 224.0.0.1 should be visible
oc exec -it demo1 -- tcpdump -i eth0 -nn
# capture on receiver pod demo2; the multicast flow to 224.0.0.1 should be visible
oc exec -it demo2 -- tcpdump -i eth0 -nn
# from the test pod iperf, generate multicast traffic to 225.0.0.2
oc exec -it iperf -- iperf -c 225.0.0.2 -u -p 6666 -t 30 -i 1
# capture on receiver pod demo1; the multicast flow to 225.0.0.2 should be visible
oc exec -it demo1 -- tcpdump -i eth0 -nn
# capture on receiver pod demo2; the multicast flow to 225.0.0.2 should be visible
oc exec -it demo2 -- tcpdump -i eth0 -nn
# clean up
oc delete -f demo.yaml
pktgen
oc annotate netnamespace demo \
netnamespace.network.openshift.io/multicast-enabled=true
# do the following on the node before creating the pod
modprobe pktgen
ps aux | grep pktgen
ls /proc/net/pktgen/
# create pod
oc project demo
oc get sa
oc create serviceaccount -n demo demo-app
oc adm policy add-scc-to-user privileged -z demo-app
cat << EOF > demo1.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
annotations:
name: demo1
namespace: demo
labels:
app: demo1
spec:
replicas: 2
selector:
matchLabels:
app: demo1
template:
metadata:
labels:
app: demo1
spec:
nodeSelector:
kubernetes.io/hostname: 'worker-0'
restartPolicy: Always
containers:
- name: demo1
image: >-
registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
env:
- name: key
value: value
command: [ "/bin/bash", "-c", "--" ]
args: [ "while true; do sleep 300000; done;" ]
imagePullPolicy: Always
securityContext:
privileged: true
serviceAccount: demo-app
EOF
oc apply -f demo1.yaml
ipmaddr show dev eth0
# 3: eth0
# link 33:33:00:00:00:01
# link 01:00:5e:00:00:01
# link 33:33:ff:ff:9d:55
# inet 224.0.0.1
# inet6 ff02::1:ffff:9d55
# inet6 ff02::1
# inet6 ff01::1
export IF=if581
echo "rem_device_all" > /proc/net/pktgen/kpktgend_0
echo "add_device eth0@${IF}" > /proc/net/pktgen/kpktgend_0
echo "max_before_softirq 100000" > /proc/net/pktgen/kpktgend_0
echo "count 100" > /proc/net/pktgen/eth0@${IF}
echo "clone_skb 1000000" > /proc/net/pktgen/eth0@${IF}
echo "pkt_size 1300" > /proc/net/pktgen/eth0@${IF}
echo "delay 0" > /proc/net/pktgen/eth0@${IF}
echo "dst 224.0.0.2" > /proc/net/pktgen/eth0@${IF}
echo "dst_mac 01:00:5e:00:00:02" > /proc/net/pktgen/eth0@${IF}
echo start > /proc/net/pktgen/pgctrl
cat /proc/net/pktgen/eth0@${IF}
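When the run is done, the generator can be stopped and the device detached again (standard pktgen controls, added here for completeness):
# stop all running pktgen threads
echo stop > /proc/net/pktgen/pgctrl
# detach the device from the kernel thread
echo "rem_device_all" > /proc/net/pktgen/kpktgend_0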
# oc rsh <another pod>
tcpdump -i eth0 -nn
openshift 4.3 firewall
This section records how to apply a host firewall on openshift cluster nodes, which is useful for customers that run internal security scans and audits.
The approach is simple: use systemd to inject a new service that runs a local customization script.
The same trick can be used for anything else you want to hack onto coreos :)
coreos
For coreos nodes, especially the masters.
cat << EOF > wzh.script
#!/bin/bash
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -s 127.0.0.1/32 -j ACCEPT
iptables -A INPUT -s 223.87.20.0/24 -j ACCEPT
iptables -A INPUT -s 117.177.241.0/24 -j ACCEPT
iptables -A INPUT -s 39.134.200.0/24 -j ACCEPT
iptables -A INPUT -s 192.168.7.0/24 -j ACCEPT
iptables -A INPUT -s 112.44.102.224/27 -j ACCEPT
iptables -A INPUT -s 47.93.86.113/32 -j ACCEPT
iptables -A INPUT -p tcp -j REJECT
iptables -A INPUT -p udp -j REJECT
EOF
var_local=$(cat ./wzh.script | python3 -c "import sys, urllib.parse; print(urllib.parse.quote(''.join(sys.stdin.readlines())))" )
cat <<EOF > 45-master-wzh-service.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 45-master-wzh-service
spec:
config:
ignition:
version: 2.2.0
storage:
files:
- contents:
source: data:text/plain,${var_local}
verification: {}
filesystem: root
mode: 0755
path: /etc/rc.d/wzh.local
systemd:
units:
- name: wzh.service
enabled: true
contents: |
[Unit]
Description=/etc/rc.d/wzh.local Compatibility
Documentation=zhengwan@redhat.com
ConditionFileIsExecutable=/etc/rc.d/wzh.local
After=network.target
[Service]
Type=oneshot
User=root
Group=root
ExecStart=/bin/bash -c /etc/rc.d/wzh.local
[Install]
WantedBy=multi-user.target
EOF
oc apply -f 45-master-wzh-service.yaml -n openshift-config
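After the masters have rebooted, the rules can be checked from a debug shell (the node name is a placeholder):
# confirm the wzh.local rules are loaded on a master
oc debug node/<master-node> -- chroot /host iptables -L INPUT -n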
oc delete -f 45-master-wzh-service.yaml -n openshift-config
for rhel with firewalld
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/security_guide/sec-setting_and_controlling_ip_sets_using_firewalld
https://unix.stackexchange.com/questions/159873/whitelist-source-ip-addresses-in-centos-7
firewall-cmd --get-ipset-types
firewall-cmd --permanent --get-ipsets
firewall-cmd --permanent --new-ipset=my-allow-list --type=hash:net
firewall-cmd --permanent --get-ipsets
# firewall-cmd --permanent --info-ipset=my-allow-list
cat > /root/ocp4/iplist.txt <<EOL
127.0.0.1/32
223.87.20.0/24
117.177.241.0/24
39.134.200.0/24
39.134.201.0/24
39.137.101.0/24
192.168.7.0/24
112.44.102.224/27
47.93.86.113/32
EOL
firewall-cmd --permanent --ipset=my-allow-list --add-entries-from-file=/root/ocp4/iplist.txt
firewall-cmd --permanent --ipset=my-allow-list --get-entries
firewall-cmd --permanent --zone=trusted --add-source=ipset:my-allow-list
firewall-cmd --reload
firewall-cmd --list-all
# firewall-cmd --permanent --zone=trusted --add-source=192.168.7.0/24
firewall-cmd --get-active-zones
# firewall-cmd --zone=block --change-interface=em1
firewall-cmd --set-default-zone=block
firewall-cmd --runtime-to-permanent
firewall-cmd --reload
firewall-cmd --list-all-zones
firewall-cmd --get-default-zone
for rhel with iptables
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/security_guide/sec-setting_and_controlling_ip_sets_using_iptables
# secure for anti-scan
cat << EOF >> /etc/rc.local
ipset create my-allow-set hash:net
ipset add my-allow-set 127.0.0.1/32
ipset add my-allow-set 223.87.20.0/24
ipset add my-allow-set 117.177.241.0/24
ipset add my-allow-set 39.134.200.0/24
ipset add my-allow-set 39.134.201.0/24
ipset add my-allow-set 39.137.101.0/24
ipset add my-allow-set 192.168.7.0/24
ipset add my-allow-set 112.44.102.224/27
ipset add my-allow-set 47.93.86.113/32
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m set --match-set my-allow-set src -j ACCEPT
iptables -A INPUT -p tcp -j REJECT
iptables -A INPUT -p udp -j REJECT
EOF
chmod +x /etc/rc.d/rc.local
systemctl enable rc-local
# systemctl start rc-local
ipset list
# 221.226.0.75
# 210.21.236.182
# 61.132.54.2
ipset add my-allow-set 221.226.0.75/32
ipset add my-allow-set 210.21.236.182/32
ipset add my-allow-set 61.132.54.2/32
other record
# https://bugzilla.redhat.com/show_bug.cgi?id=1723327
# https://access.redhat.com/solutions/4264181
for i in $(oc get pods -n openshift-machine-config-operator -l k8s-app=machine-config-daemon -o go-template --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}' | xargs); do oc rsh -n openshift-machine-config-operator $i chroot /rootfs rm -rf /run/pivot/reboot-needed; done
rpm-ostree rollback --reboot
cat << EOF > wzh.service
[Unit]
Description=/etc/rc.d/wzh.local Compatibility
Documentation=zhengwan@redhat.com
ConditionFileIsExecutable=/etc/rc.d/wzh.local
After=network.target
[Service]
Type=oneshot
User=root
Group=root
ExecStart=/bin/bash -c /etc/rc.d/wzh.local
[Install]
WantedBy=multi-user.target
EOF
var_service=$(cat ./wzh.service | python3 -c "import sys, urllib.parse; print(urllib.parse.quote(''.join(sys.stdin.readlines())))" )
openshift 4.3 using ldap
The demo scenario is as follows
- deploy openldap together with a web front end
- configure two groups in openldap, admins and users, and add one user to each group
- configure LDAP authentication on ocp
- use the CLI on ocp to sync from ldap and check that the groups and users have been created
- log in to ocp with one of these users and notice that it cannot do anything yet
- use the CLI on ocp to grant the admins group cluster view permission and the users group view permission on the demo project.
- log in again / refresh the page: the admin user can now see the whole cluster, and the users group member has access to the demo project.
video
- https://youtu.be/Sg3euS3ip4k
- https://www.bilibili.com/video/BV1XA411b7N6/
References:
- https://docs.openshift.com/container-platform/4.3/authentication/identity_providers/configuring-ldap-identity-provider.html
- https://docs.openshift.com/container-platform/4.3/authentication/ldap-syncing.html
- https://www.cnblogs.com/ericnie/p/10063816.html
- https://access.redhat.com/solutions/2484371
- https://access.redhat.com/solutions/3419841
openldap
skopeo copy docker://docker.io/osixia/openldap:latest docker://registry.redhat.ren:5443/docker.io/osixia/openldap:latest
skopeo copy docker://docker.io/osixia/phpldapadmin:latest docker://registry.redhat.ren:5443/docker.io/osixia/phpldapadmin:latest
# start the openldap service
podman run -p 389:389 --name openldap --hostname ldap.redhat.ren --env LDAP_ORGANISATION="redhat" --env LDAP_DOMAIN="redhat.ren" --env LDAP_ADMIN_PASSWORD="ldap123" --detach registry.redhat.ren:5443/docker.io/osixia/openldap:latest
# the default login user is admin
podman run -d -p 5080:80 --name phpldapadmin --env PHPLDAPADMIN_HTTPS=false --env PHPLDAPADMIN_LDAP_HOSTS=117.177.241.16 --detach registry.redhat.ren:5443/docker.io/osixia/phpldapadmin:latest
# http://helper.hsc.redhat.ren:5080
# Login DN: cn=admin,dc=redhat,dc=ren
# Password: ldap123
podman rm -fv phpldapadmin
podman rm -fv openldap
yum install -y openldap openldap-clients openldap-servers
systemctl status slapd
# add test user data to ldap
cat << EOF > base.ldif
dn: ou=users,dc=redhat,dc=ren
objectClass: organizationalUnit
objectClass: top
ou: users

dn: ou=groups,dc=redhat,dc=ren
objectClass: organizationalUnit
objectClass: top
ou: groups
EOF
ldapadd -x -D "cn=admin,dc=redhat,dc=ren" -w ldap123 -f base.ldif
# generate a password hash for the users
slappasswd -s redhat
# {SSHA}yiR9306gQWh4mdeOuJ1KUg5cxQ8uoWKK
cat << EOF >users.ldif
dn: cn=ocpadm,ou=users,dc=redhat,dc=ren
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
cn: ocpadm
sn: ocpadm
uid: ocpadm
displayName: ocpadm
mail: ocpadm@redhat.ren
userPassword: {SSHA}yiR9306gQWh4mdeOuJ1KUg5cxQ8uoWKK

dn: cn=wzh,ou=users,dc=redhat,dc=ren
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
cn: wzh
sn: wzh
uid: wzh
displayName: wzh
mail: wzh@redhat.ren
userPassword: {SSHA}yiR9306gQWh4mdeOuJ1KUg5cxQ8uoWKK

dn: cn=admins,ou=groups,dc=redhat,dc=ren
objectClass: groupOfNames
cn: admins
owner: cn=admin,dc=redhat,dc=ren
member: cn=ocpadm,ou=users,dc=redhat,dc=ren

dn: cn=normals,ou=groups,dc=redhat,dc=ren
objectClass: groupOfNames
cn: normals
owner: cn=admin,dc=redhat,dc=ren
member: cn=wzh,ou=users,dc=redhat,dc=ren
EOF
ldapadd -x -D "cn=admin,dc=redhat,dc=ren" -w ldap123 -f users.ldif
ldapsearch -x -D "cn=admin,dc=redhat,dc=ren" -w ldap123 -b dc=redhat,dc=ren
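Optionally, verify that a user can bind with the password set above (the clear text behind the SSHA hash is redhat, from the slappasswd step):
# bind as the wzh user to confirm the password works
ldapwhoami -x -D "cn=wzh,ou=users,dc=redhat,dc=ren" -w redhat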
ocp operation
oc get user
oc get group
oc get identity
# clean up stale user data
oc get user | grep ldap | awk '{print $1}' | xargs -I DEMO oc delete user DEMO
oc get identity | grep ldap | awk '{print $1}' | xargs -I DEMO oc delete identity DEMO
# create the bind password secret
oc create secret generic ldap-secret --from-literal=bindPassword=ldap123 -n openshift-config
# configure the ldap identity provider
cat << EOF > ldap.yaml
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
name: cluster
spec:
identityProviders:
- name: "Local Password"
mappingMethod: claim
type: HTPasswd
htpasswd:
fileData:
name: htpasswd
- name: ldapidp
mappingMethod: claim
type: LDAP
ldap:
attributes:
id:
- dn
email:
- mail
name:
- cn
preferredUsername:
- uid
bindDN: "cn=admin,dc=redhat,dc=ren"
bindPassword:
name: ldap-secret
insecure: true
url: "ldap://registry.redhat.ren:389/ou=users,dc=redhat,dc=ren?uid"
EOF
oc apply -f ldap.yaml
# sync group data from ldap
cat << EOF > ldapsync.yaml
kind: LDAPSyncConfig
apiVersion: v1
url: ldap://registry.redhat.ren:389
insecure: true
bindDN: cn=admin,dc=redhat,dc=ren
bindPassword: ldap123
groupUIDNameMapping:
"cn=admins,ou=groups,dc=redhat,dc=ren": Administrators
"cn=normals,ou=groups,dc=redhat,dc=ren": NormalUsers
rfc2307:
groupsQuery:
baseDN: "ou=groups,dc=redhat,dc=ren"
scope: sub
derefAliases: never
pageSize: 0
filter: (objectclass=groupOfNames)
groupUIDAttribute: dn
groupNameAttributes: [ cn ]
groupMembershipAttributes: [ member ]
usersQuery:
baseDN: "ou=users,dc=redhat,dc=ren"
scope: sub
derefAliases: never
pageSize: 0
userUIDAttribute: dn
userNameAttributes: [ cn ]
tolerateMemberNotFoundErrors: false
tolerateMemberOutOfScopeErrors: false
EOF
oc adm groups sync --sync-config=ldapsync.yaml --confirm
# prune groups that have already been deleted on the ldap side
# oc adm prune groups --sync-config=ldapsync.yaml --confirm
# at this point you can log in as wzh/ocpadm, but the users have no access to any project yet
# prepare to grant permissions to the groups
oc get clusterrole
oc get role
# grant different permissions to the admin and normal groups
oc adm policy add-cluster-role-to-group cluster-reader Administrators
oc policy add-role-to-group view NormalUsers -n demo
# log in again; the users now have the corresponding permissions
# revoke the group permissions
oc adm policy remove-cluster-role-from-group cluster-reader Administrators
oc policy remove-role-from-group view NormalUsers -n demo
# remove ldap
# clean up stale user data
oc get user | grep ldap | awk '{print $1}' | xargs -I DEMO oc delete user DEMO
oc get identity | grep ldap | awk '{print $1}' | xargs -I DEMO oc delete identity DEMO
cat << EOF > ldap.yaml
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
name: cluster
spec:
identityProviders:
- name: "Local Password"
mappingMethod: claim
type: HTPasswd
htpasswd:
fileData:
name: htpasswd
EOF
oc apply -f ldap.yaml
free ipa
skopeo copy docker://docker.io/freeipa/freeipa-server:latest docker://registry.redhat.ren:5443/docker.io/freeipa/freeipa-server:latest
mkdir -p /data/freeipa
cat << EOF > /data/freeipa/ipa-server-install-options
--realm=redhat.ren
--ds-password=The-directory-server-password
--admin-password=The-admin-password
EOF
# setsebool -P container_manage_cgroup 1
docker run --name freeipa-server-container -ti --privileged \
-e IPA_SERVER_IP=10.66.208.240 \
-p 3080:80 -p 3443:443 -p 389:389 -p 636:636 -p 88:88 -p 464:464 \
-p 88:88/udp -p 464:464/udp -p 123:123/udp \
-h ipa.redhat.ren \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
--tmpfs /run --tmpfs /tmp \
-v /data/freeipa:/data:Z \
docker.io/freeipa/freeipa-server ipa-server-install
docker start -ai freeipa-server-container
docker rm -fv $(docker ps -qa)
firewall-cmd --zone=public --add-port=3443/tcp --permanent
firewall-cmd --reload
image pull secret
https://docs.openshift.com/container-platform/4.3/openshift_images/managing_images/using-image-pull-secrets.html
https://docs.openshift.com/container-platform/4.3/installing/install_config/installing-restricted-networks-preparations.html
# accross projects
oc policy add-role-to-user \
system:image-puller system:serviceaccount:project-a:default \
--namespace=project-b
oc policy add-role-to-group \
system:image-puller system:serviceaccounts:project-a \
--namespace=project-b
# ref outside
oc create secret generic <pull_secret_name> \
--from-file=.dockercfg=<path/to/.dockercfg> \
--type=kubernetes.io/dockercfg
oc create secret generic <pull_secret_name> \
--from-file=.dockerconfigjson=<path/to/.docker/config.json> \
--type=kubernetes.io/dockerconfigjson
oc create secret docker-registry <pull_secret_name> \
--docker-server=<registry_server> \
--docker-username=<user_name> \
--docker-password=<password> \
--docker-email=<email>
oc secrets link default <pull_secret_name> --for=pull
oc secrets link builder <pull_secret_name>
# global
oc get secret/pull-secret -n openshift-config -o yaml
oc get secret/pull-secret -n openshift-config -o json | jq -r '.data.".dockerconfigjson"' | base64 -d
oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=<pull-secret-location>
cat ./pull-secret.text | jq . > <path>/<pull-secret-file>
# <credentials>
echo -n '<user_name>:<password>' | base64 -w0
# "auths": {
# ...
# "<local_registry_host_name>:<local_registry_host_port>": {
# "auth": "<credentials>",
# "email": "you@example.com"
# },
# ...
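A hedged sketch of how the pieces above can be combined with jq to append one more registry to the global pull secret (registry host, user name and password are placeholders):
# dump the current pull secret, add a new auth entry, and push it back
oc get secret/pull-secret -n openshift-config -o json | jq -r '.data.".dockerconfigjson"' | base64 -d > pull-secret.json
new_auth=$(echo -n '<user_name>:<password>' | base64 -w0)
jq --arg reg "<local_registry_host_name>:<local_registry_host_port>" --arg auth "$new_auth" \
  '.auths[$reg] = {"auth": $auth, "email": "you@example.com"}' pull-secret.json > pull-secret.new.json
oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=pull-secret.new.json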
openshift 4.3 huge page
video
- https://youtu.be/T7R-j0B9eSY
- https://www.bilibili.com/video/BV1De411W7JU/
https://docs.openshift.com/container-platform/4.3/scalability_and_performance/what-huge-pages-do-and-how-they-are-consumed-by-apps.html
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/performance_tuning_guide/sect-red_hat_enterprise_linux-performance_tuning_guide-configuring_transparent_huge_pages
# check original status
cat /sys/kernel/mm/transparent_hugepage/enabled
# [always] madvise never
cat /sys/kernel/mm/transparent_hugepage/defrag
# [always] madvise never
# begin to test
oc label node infra1.hsc.redhat.ren hugepages=true
cat << EOF > hugepages_tuning.yaml
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
name: hugepages
namespace: openshift-cluster-node-tuning-operator
spec:
profile:
- data: |
[main]
summary=Configuration for hugepages
include=openshift-node
[vm]
transparent_hugepages=never
[sysctl]
vm.nr_hugepages=1024
name: node-hugepages
recommend:
- match:
- label: hugepages
priority: 30
profile: node-hugepages
EOF
oc create -f hugepages_tuning.yaml
oc get pod -o wide -n openshift-cluster-node-tuning-operator
oc logs tuned-86g8b \
-n openshift-cluster-node-tuning-operator | grep 'applied$' | tail -n1
# check result
cat /sys/kernel/mm/transparent_hugepage/enabled
# always madvise [never]
cat /sys/kernel/mm/transparent_hugepage/defrag
# [always] madvise never
# the node label match has already triggered automatic selection of the tuned profile.
cat << EOF > hugepages-pod.yaml
apiVersion: v1
kind: Pod
metadata:
generateName: hugepages-volume-
spec:
containers:
- securityContext:
privileged: true
image: registry.redhat.ren:5443/docker.io/wangzheng422/centos:centos7-test
imagePullPolicy: Always
command:
- sleep
- inf
name: example
volumeMounts:
- mountPath: /dev/hugepages
name: hugepage
resources:
limits:
hugepages-2Mi: 100Mi
memory: "1Gi"
cpu: "1"
volumes:
- name: hugepage
emptyDir:
medium: HugePages
EOF
oc create -n demo -f hugepages-pod.yaml
# log in to the pod
oc rsh hugepages-volume-9nwlv
mount | grep page
# nodev on /dev/hugepages type hugetlbfs (rw,relatime,seclabel,pagesize=2Mi)
# check the system huge page status
# yum install libhugetlbfs-utils
hugeadm --explain
# according to the two posts below, huge pages are used for program memory allocation and cannot be demonstrated with plain file operations
# https://serverfault.com/questions/811670/how-to-create-copy-a-file-into-hugetlbfs
# https://stackoverflow.com/questions/40285971/how-to-load-text-segments-of-shared-libraries-into-huge-pages-on-linux
# sysbench memory --memory-hugetlb=on --memory-total-size=200M run
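A simple way to see that huge pages are reserved and consumed (a generic check that can be run inside the pod or on the node):
# HugePages_Total / HugePages_Free reflect the vm.nr_hugepages value applied by the tuned profile
grep -i hugepages /proc/meminfo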
# restore
oc delete -f hugepages_tuning.yaml
# reboot
openshift 4.3 helm
This section shows how to demo helm on openshift 4.3
video
- https://youtu.be/L6ioq_JMOtE
- https://www.bilibili.com/video/BV1qp4y197yH/
References:
https://docs.openshift.com/container-platform/4.3/cli_reference/helm_cli/getting-started-with-helm-on-openshift-container-platform.html
https://chartmuseum.com/docs/#installing-chartsinto-kubernetes
https://whmzsu.github.io/helm-doc-zh-cn/chart/chart_repository-zh_cn.html
Steps
# prepare the environment
skopeo copy docker://docker.io/gogs/gogs docker://registry.redhat.ren:5443/docker.io/gogs/gogs
skopeo copy docker://docker.io/chartmuseum/chartmuseum:latest docker://registry.redhat.ren:5443/docker.io/chartmuseum/chartmuseum:latest
skopeo copy docker://docker.io/ananwaresystems/webarchive:1.0 docker://registry.redhat.ren:5443/docker.io/ananwaresystems/webarchive:1.0
skopeo copy docker://docker.io/tomcat:7.0 docker://registry.redhat.ren:5443/docker.io/tomcat:7.0
# https://github.com/helm/charts/tree/master/stable/chartmuseum
# run a helm chart repository
mkdir -p /data/ocp4/helm/charts
podman run --rm -it \
-p 18080:8080 \
-v /data/ocp4/helm/charts:/charts:Z \
-e DEBUG=true \
-e STORAGE=local \
-e STORAGE_LOCAL_ROOTDIR=/charts \
--privileged \
registry.redhat.ren:5443/docker.io/chartmuseum/chartmuseum:latest
# prepare the helm client
curl -L https://mirror.openshift.com/pub/openshift-v4/clients/helm/latest/helm-linux-amd64 -o /usr/local/bin/helm
chmod +x /usr/local/bin/helm
helm version
helm repo add chartmuseum http://localhost:18080
helm repo list
# package a helm chart and upload it to the chart repository
cd /data/ocp4/helm/tomcat
helm lint
helm package .
curl --data-binary "@tomcat-0.4.1.tgz" http://localhost:18080/api/charts
helm repo update
helm search repo
# create a tomcat deployment from the helm chart
oc project demo
helm install example-tomcat chartmuseum/tomcat
helm list
# clean up
helm uninstall example-tomcat
helm repo remove chartmuseum
/bin/rm -f /data/ocp4/helm/charts/*
openshift tcp-router
This section describes how, by customizing the haproxy template and adding an annotation to a route, tcp routes can be exposed to the outside world. The related scripts and files are in the scripts directory.
Motivation and principle
L4 load-balancing tests come up frequently in openshift PoCs. The default ocp router is built on haproxy and by default only supports http and https; tls/sni passthrough is one way to carry tcp, but it is still layer 7. The official documentation simply says that other requirements can be met by customizing the haproxy template, yet it says very little about how, and gives few examples. This section customizes the haproxy template so that the router dynamically watches route configuration and opens tcp ports on demand.
Customizing the haproxy template requires understanding a few key points about how the openshift router works
- the openshift router is not just haproxy; it also contains a go program that watches openshift configuration and writes a set of map files, and these files are the key inputs consumed by the haproxy template.
- in the openshift router, a tls passthrough route maps to tcp mode in the haproxy configuration; this is exactly where we hook in our customization.
- the customization boils down to skipping the http/https edge and reencrypt handling and, for routes carrying the annotation, opening an extra tls-passthrough frontend
- the route annotation takes the form haproxy.router.openshift.io/external-tcp-port: "13306"
- ocp4 does not yet support customizing the router template of the default ingress controller, so this section simply creates a separate router deployment.
- when implementing on site, remember to change the router image; the image for each release can be found in the release.txt file.
Since this targets PoCs, there are limitations
- the tcp port opened through the route annotation is assigned manually and is shared across every project in the cluster, so port conflicts are inevitable. An existing port-management scheme is required; that is left to the services team (GPS).
Below is an example route configuration
kind: Route
apiVersion: route.openshift.io/v1
metadata:
name: ottcache-002
annotations:
haproxy.router.openshift.io/wzh-router-name: "wzh-router-1"
haproxy.router.openshift.io/external-tcp-port: "6620"
spec:
to:
kind: Service
name: ottcache-002-service
port:
targetPort: 6620
tls:
termination: passthrough
insecureEdgeTerminationPolicy: None
Below is the key customization inside the template
{{/*try to add tcp support*/}}
{{- if eq (env "WZH_ROUTER_NAME" "wzh-router-name") (index $cfg.Annotations "haproxy.router.openshift.io/wzh-router-name") }}
{{- if (isInteger (index $cfg.Annotations "haproxy.router.openshift.io/external-tcp-port")) }}
frontend tcp-{{ (index $cfg.Annotations "haproxy.router.openshift.io/external-tcp-port") }}
bind *:{{ (index $cfg.Annotations "haproxy.router.openshift.io/external-tcp-port") }}
mode tcp
default_backend {{genBackendNamePrefix $cfg.TLSTermination}}:{{$cfgIdx}}
{{- end}}{{/* end haproxy.router.openshift.io */}}
{{- end}}{{/* end WZH_ROUTER_NAME */}}
{{/*end try to add tcp support*/}}
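To see what the template actually renders, the generated haproxy config inside the custom router pod can be inspected (the deployment name router-wzh and the config path are assumptions taken from the commands commented out further below):
# dump the tcp frontends generated for annotated routes
oc -n openshift-ingress rsh deploy/router-wzh grep -A 3 '^frontend tcp-' /var/lib/haproxy/conf/haproxy.config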
Test steps
The test steps are straightforward: create the new router, then create applications in other projects and add the annotation to their routes.
The example here contains two applications, a web app and a mysql instance, both exposed to the outside through tcp ports.
# the tcp-router is installed in the same project as the openshift router
oc project openshift-ingress
# install the tcp-router and demo
oc create configmap customrouter-wzh --from-file=haproxy-config.template
oc apply -f haproxy.router.yaml
oc apply -f haproxy.demo.yaml
# test your tcp-router; replace the ip with your router's ip. Both commands should succeed.
curl 192.168.7.18:18080
podman run -it --rm registry.redhat.ren:5443/docker.io/mysql mysql -h 192.168.7.18 -P 13306 -u user -D db -p
# if you want to delete the tcp-router and demo
oc delete -f haproxy.router.yaml
oc delete configmap customrouter-wzh
oc delete -f haproxy.demo.yaml
# oc set volume deployment/router-wzh --add --overwrite \
# --name=config-volume \
# --mount-path=/var/lib/haproxy/conf/custom \
# --source='{"configMap": { "name": "customrouter-wzh"}}'
# oc set env dc/router \
# TEMPLATE_FILE=/var/lib/haproxy/conf/custom/haproxy-config.template
References
https://docs.openshift.com/container-platform/3.11/install_config/router/customized_haproxy_router.html#go-template-actions
https://www.haproxy.com/blog/introduction-to-haproxy-maps/
https://access.redhat.com/solutions/3495011
https://blog.zhaw.ch/icclab/openshift-custom-router-with-tcpsni-support/
The following is a dead end (kept for the record)
Reading the source code shows that the openshift router does extend haproxy: the map files are generated by the router's extension so that traffic is sent straight to endpoints, bypassing the service. So tcp forwarding can also be achieved by piggybacking on the sni/tcp passthrough path.
openshift 4.3 grafana
Showcasing the grafana features of openshift 4.3
video
- https://youtu.be/xGry0_LWFNw
- https://www.bilibili.com/video/BV1yV411d7vR/
cpu manager
https://docs.openshift.com/container-platform/4.3/scalability_and_performance/using-cpu-manager.html
video
- https://youtu.be/gzdb2AURhvo
- https://www.bilibili.com/video/BV1Ua4y1t7aQ/
oc get node
oc label node ip-10-0-138-181.us-west-2.compute.internal cpumanager=true
oc label node worker-0 cpumanager=true
oc label node infra0.hsc.redhat.ren --overwrite cpumanager=false
oc label node worker-0.ocpsc.redhat.ren --overwrite cpumanager=true
oc label node worker-1.ocpsc.redhat.ren --overwrite cpumanager=true
oc label node worker-2.ocpsc.redhat.ren --overwrite cpumanager=true
oc label node worker-3.ocpsc.redhat.ren --overwrite cpumanager=true
oc get machineconfigpool worker -o yaml
# oc edit machineconfigpool worker
# metadata:
# creationTimestamp: 2019-xx-xxx
# generation: 3
# labels:
# custom-kubelet: cpumanager-enabled
oc patch machineconfigpool worker -p '{"metadata":{"labels": { "custom-kubelet": "cpumanager-enabled" } } }' --type=merge
cat << EOF > cpumanager-kubeletconfig.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
name: cpumanager-enabled
spec:
machineConfigPoolSelector:
matchLabels:
custom-kubelet: cpumanager-enabled
kubeletConfig:
cpuManagerPolicy: static
cpuManagerReconcilePeriod: 6s
EOF
oc apply -f cpumanager-kubeletconfig.yaml
alias urldecode='python3 -c "import sys, urllib.parse as ul; \
print(ul.unquote_plus(sys.argv[1]))"'
alias urlencode='python3 -c "import sys, urllib.parse as ul; \
print (ul.quote_plus(sys.argv[1]))"'
worker_mc_kubelet_yaml=$(oc get mc | grep kubelet | grep 99 | awk '{print $1}')
urldecode $(oc get mc ${worker_mc_kubelet_yaml} -o json | jq -r .spec.config.storage.files[0].contents.source | sed "s/data:text\/plain,//g") | jq
oc debug node/infra0.hsc.redhat.ren
cat /host/etc/kubernetes/kubelet.conf | grep cpuManager
# cat /etc/kubernetes/kubelet.conf | grep cpuManager
cat << EOF > cpumanager-pod.yaml
apiVersion: v1
kind: Pod
metadata:
generateName: cpumanager-
spec:
containers:
- name: cpumanager
image: gcr.io/google_containers/pause-amd64:3.0
resources:
requests:
cpu: 1
memory: "1G"
limits:
cpu: 1
memory: "1G"
nodeSelector:
cpumanager: "true"
EOF
oc apply -f cpumanager-pod.yaml
systemctl status
# └─kubepods.slice
# ├─kubepods-podcc529083_9d0a_43aa_9d9f_1fc0dc3b626b.slice
# │ ├─crio-conmon-b67ba6af381740b5f9b459482e41a14d4ced2cd8e9431598d84066d20027ef06.scope
# │ │ └─1434963 /usr/libexec/crio/conmon -s -c b67ba6af381740b5f9b459482e41a14d4ced2cd8e9431598d84066d20027ef06 -n k8s_cpumanager_> │ ├─crio-conmon-4ab85736504471dcca960aea960ca01ab0fa582439e444d407ac8d001d6dbd2b.scope
# │ │ └─1434127 /usr/libexec/crio/conmon -s -c 4ab85736504471dcca960aea960ca01ab0fa582439e444d407ac8d001d6dbd2b -n k8s_POD_cpumana> │ ├─crio-b67ba6af381740b5f9b459482e41a14d4ced2cd8e9431598d84066d20027ef06.scope
# │ │ └─1434975 /pause
# │ └─crio-4ab85736504471dcca960aea960ca01ab0fa582439e444d407ac8d001d6dbd2b.scope
# │ └─1434151 /usr/bin/pod
cd /sys/fs/cgroup/cpuset/kubepods.slice/kubepods-podcc529083_9d0a_43aa_9d9f_1fc0dc3b626b.slice/crio-b67ba6af381740b5f9b459482e41a14d4ced2cd8e9431598d84066d20027ef06.scope
for i in `ls cpuset.cpus tasks` ; do echo -n "$i "; cat $i ; done
# cpuset.cpus 12
# tasks 30894
grep Cpus_allowed_list /proc/1434975/status
# Cpus_allowed_list: 12
systemctl status
# ├─kubepods-burstable.slice
# │ ├─kubepods-burstable-podb8410218_65e9_4ec2_b944_6f0f1709e6a9.slice
# │ │ │ └─6696 /usr/bin/configmap-reload --webhook-url=http://localhost:8080/-/reload --volume-dir=/etc/serving-certs-ca-bundle
# │ │ ├─crio-conmon-958273b72d8d6f1a06a640bd158aa1f5dcc9372b232c79af9f3731068b0bcb9f.scope
# │ │ │ └─6922 /usr/libexec/crio/conmon -s -c 958273b72d8d6f1a06a640bd158aa1f5dcc9372b232c79af9f3731068b0bcb9f -n k8s_kube-rbac-pr>
# │ │ ├─crio-conmon-dc78df658a47a6bcad1772c5f0154c058b3b517f924c842eb9ba2c878edf86a3.scope
# │ │ │ └─6256 /usr/libexec/crio/conmon -s -c dc78df658a47a6bcad1772c5f0154c058b3b517f924c842eb9ba2c878edf86a3 -n k8s_telemeter-cl>
# │ │ ├─crio-958273b72d8d6f1a06a640bd158aa1f5dcc9372b232c79af9f3731068b0bcb9f.scope
# │ │ │ └─6958 /usr/bin/kube-rbac-proxy --secure-listen-address=:8443 --upstream=http://127.0.0.1:8080/ --tls-cert-file=/etc/tls/p>
# │ │ ├─crio-conmon-7a9aaeff818804cb48c6de76ef604e1241717ef25f9d2e31502bca5e03a0a126.scope
# │ │ │ └─5215 /usr/libexec/crio/conmon -s -c 7a9aaeff818804cb48c6de76ef604e1241717ef25f9d2e31502bca5e03a0a126 -n k8s_POD_telemete>
# │ │ ├─crio-dc78df658a47a6bcad1772c5f0154c058b3b517f924c842eb9ba2c878edf86a3.scope
# │ │ │ └─6321 /usr/bin/telemeter-client --id=02b8c3b4-9aed-4268-b1b7-84c998b50184 --from=https://prometheus-k8s.openshift-monitor>
# │ │ ├─crio-conmon-6cefa86b950deb57dac809b57246fb553e0c96fc31ae1cd7b8efa43207995749.scope
# │ │ │ └─6635 /usr/libexec/crio/conmon -s -c 6cefa86b950deb57dac809b57246fb553e0c96fc31ae1cd7b8efa43207995749 -n k8s_reload_telem>
# │ │ └─crio-7a9aaeff818804cb48c6de76ef604e1241717ef25f9d2e31502bca5e03a0a126.scope
# │ │ └─5292 /usr/bin/pod
cat /sys/fs/cgroup/cpuset/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podb8410218_65e9_4ec2_b944_6f0f1709e6a9.slice/crio-dc78df658a47a6bcad1772c5f0154c058b3b517f924c842eb9ba2c878edf86a3.scope/cpuset.cpus
# 0-1,3
oc describe node ip-10-0-138-181.us-west-2.compute.internal
# the other pods are excluded from cpu 12; a few processes are not restricted, and those are control/infrastructure processes.
# cd /sys/fs/cgroup/cpuset/kubepods.slice/kubepods-burstable.slice/
cd /sys/fs/cgroup/cpuset/kubepods.slice/kubepods-besteffort.slice
find . -name cpuset.cpus | grep crio | xargs -I DEMO cat DEMO
# 0-11,13-23
# 0-23
# 0-11,13-23
# 0-23
# 0-11,13-23
# 0-23
# 0-11,13-23
# 0-23
# 0-11,13-23
# 0-23
# 0-23
# in pod
cat /sys/fs/cgroup/cpuset/cpuset.cpus
# 0-1,19
cat /proc/1/status | grep -i cpus_allow
# Cpus_allowed: 7fffc
# Cpus_allowed_list: 2-18
oc get pod -A | grep Running | awk '{print $1 "\t" $2}' > list
while read -r pod; do
echo "$pod"
oc exec -n $pod -- cat /sys/fs/cgroup/cpuset/cpuset.cpus
oc exec -n $pod -- cat /proc/1/status | grep -i cpus_allow
done < list
ls /proc | egrep '^[0-9]+$' | xargs -I DEMO echo " grep -s -i name /proc/DEMO/status | tr -d '\n'; echo -n -e '\t'; grep -s -i cpus_allowed_list /proc/DEMO/status ; " | sh
# Name: systemd Cpus_allowed_list: 0-1
# Name: ksoftirqd/0 Cpus_allowed_list: 0
# Name: migration/10 Cpus_allowed_list: 10
# Name: posixcputmr/10 Cpus_allowed_list: 10
# Name: rcuc/10 Cpus_allowed_list: 10
# Name: ksoftirqd/10 Cpus_allowed_list: 10
# Name: kworker/10:0-mm_percpu_wq Cpus_allowed_list: 10
# Name: kworker/10:0H Cpus_allowed_list: 10
# Name: rcuop/10 Cpus_allowed_list: 0-1
# Name: cpuhp/11 Cpus_allowed_list: 11
# Name: watchdog/11 Cpus_allowed_list: 0-19
# Name: migration/11 Cpus_allowed_list: 11
# Name: systemd-journal Cpus_allowed_list: 0-1
# Name: rcu_preempt Cpus_allowed_list: 0-1
# Name: posixcputmr/11 Cpus_allowed_list: 11
# Name: rcuc/11 Cpus_allowed_list: 11
# Name: systemd-udevd Cpus_allowed_list: 0-1
# Name: ksoftirqd/11 Cpus_allowed_list: 11
# Name: kworker/11:0-mm_percpu_wq Cpus_allowed_list: 11
# Name: irq/149-ioat-ms Cpus_allowed_list: 0
# Name: irq/151-ioat-ms Cpus_allowed_list: 1
# Name: irq/152-ioat-ms Cpus_allowed_list: 0
# Name: kworker/11:0H Cpus_allowed_list: 11
# Name: irq/153-ioat-ms Cpus_allowed_list: 1
# Name: irq/154-ioat-ms Cpus_allowed_list: 0
# Name: irq/155-ioat-ms Cpus_allowed_list: 1
# Name: irq/156-ioat-ms Cpus_allowed_list: 1
# Name: irq/157-mei_me Cpus_allowed_list: 0
# Name: irq/158-ioat-ms Cpus_allowed_list: 0
# Name: irq/16-i801_smb Cpus_allowed_list: 1
# Name: rcub/2 Cpus_allowed_list: 0-1
# Name: kipmi0 Cpus_allowed_list: 0-1
# Name: ib-comp-wq Cpus_allowed_list: 0-19
# Name: kworker/u41:0 Cpus_allowed_list: 0-1
# Name: ib-comp-unb-wq Cpus_allowed_list: 0-19
# Name: ib_mcast Cpus_allowed_list: 0-19
# Name: rcuop/11 Cpus_allowed_list: 0-1
# Name: ib_nl_sa_wq Cpus_allowed_list: 0-19
# Name: bnxt_re Cpus_allowed_list: 0-19
# Name: irq/159-bnxt_qp Cpus_allowed_list: 0
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: irq/160-bnxt_qp Cpus_allowed_list: 1
# Name: ttm_swap Cpus_allowed_list: 0-19
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: irq/161-bnxt_qp Cpus_allowed_list: 1
# Name: cpuhp/12 Cpus_allowed_list: 12
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: irq/162-bnxt_qp Cpus_allowed_list: 1
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: irq/163-bnxt_qp Cpus_allowed_list: 1
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: irq/164-bnxt_qp Cpus_allowed_list: 0-1
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: irq/165-bnxt_qp Cpus_allowed_list: 1
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: irq/166-bnxt_qp Cpus_allowed_list: 1
# Name: watchdog/12 Cpus_allowed_list: 0-19
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: irq/167-bnxt_qp Cpus_allowed_list: 1
# Name: ib_mad1 Cpus_allowed_list: 0-19
# Name: irq/168-bnxt_qp Cpus_allowed_list: 1
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: irq/169-bnxt_qp Cpus_allowed_list: 0
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: irq/170-bnxt_qp Cpus_allowed_list: 1
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: migration/12 Cpus_allowed_list: 12
# Name: irq/171-bnxt_qp Cpus_allowed_list: 0
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: irq/172-bnxt_qp Cpus_allowed_list: 0
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: irq/173-bnxt_qp Cpus_allowed_list: 0
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: irq/174-bnxt_qp Cpus_allowed_list: 0
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: irq/175-bnxt_qp Cpus_allowed_list: 0
# Name: bnxt_qplib_nq Cpus_allowed_list: 0-19
# Name: rcub/1 Cpus_allowed_list: 0-1
# Name: posixcputmr/12 Cpus_allowed_list: 12
# Name: irq/176-bnxt_qp Cpus_allowed_list: 0
# Name: nfit Cpus_allowed_list: 0-19
# Name: ib_mad1 Cpus_allowed_list: 0-19
# Name: rcuc/12 Cpus_allowed_list: 12
# Name: rdma-ndd Cpus_allowed_list: 0-1
# Name: ksoftirqd/12 Cpus_allowed_list: 12
# Name: rdma_cm Cpus_allowed_list: 0-19
# Name: kworker/12:0-mm_percpu_wq Cpus_allowed_list: 12
# Name: iw_cxgb4 Cpus_allowed_list: 0-19
# Name: kworker/12:0H Cpus_allowed_list: 12
# Name: Register_iWARP_ Cpus_allowed_list: 0-19
# Name: rcuop/12 Cpus_allowed_list: 0-1
# Name: rpciod Cpus_allowed_list: 0-19
# Name: xprtiod Cpus_allowed_list: 0-19
# Name: cpuhp/13 Cpus_allowed_list: 13
# Name: watchdog/13 Cpus_allowed_list: 0-19
# Name: migration/13 Cpus_allowed_list: 13
# Name: posixcputmr/13 Cpus_allowed_list: 13
# Name: irq/119-i40e-ve Cpus_allowed_list: 0
# Name: irq/120-i40e-ve Cpus_allowed_list: 1
# Name: irq/121-i40e-ve Cpus_allowed_list: 0
# Name: irq/122-i40e-ve Cpus_allowed_list: 1
# Name: irq/123-i40e-ve Cpus_allowed_list: 0
# Name: irq/124-i40e-ve Cpus_allowed_list: 1
# Name: irq/125-i40e-ve Cpus_allowed_list: 0
# Name: irq/126-i40e-ve Cpus_allowed_list: 1
# Name: irq/127-i40e-ve Cpus_allowed_list: 0
# Name: irq/128-i40e-ve Cpus_allowed_list: 1
# Name: irq/129-i40e-ve Cpus_allowed_list: 0
# Name: irq/130-i40e-ve Cpus_allowed_list: 0
# Name: irq/131-i40e-ve Cpus_allowed_list: 0
# Name: irq/132-i40e-ve Cpus_allowed_list: 1
# Name: irq/133-i40e-ve Cpus_allowed_list: 1
# Name: irq/134-i40e-ve Cpus_allowed_list: 0
# Name: irq/135-i40e-ve Cpus_allowed_list: 1
# Name: irq/136-i40e-ve Cpus_allowed_list: 1
# Name: irq/137-i40e-ve Cpus_allowed_list: 0
# Name: irq/138-i40e-ve Cpus_allowed_list: 0
# Name: conmon Cpus_allowed_list: 0-1
# Name: pod Cpus_allowed_list: 0-19
# Name: conmon Cpus_allowed_list: 0-1
# Name: sleep Cpus_allowed_list: 2-18
# Name: runc Cpus_allowed_list: 0-1
# Name: bash Cpus_allowed_list: 2-18
# Name: runc Cpus_allowed_list: 0-1
# Name: bash Cpus_allowed_list: 2-18
# Name: runc Cpus_allowed_list: 0-1
# Name: bash Cpus_allowed_list: 2-18
# Name: runc Cpus_allowed_list: 0-1
# Name: bash Cpus_allowed_list: 2-18
# Name: rcuc/0 Cpus_allowed_list: 0
# Name: rcuc/13 Cpus_allowed_list: 13
# Name: jbd2/sda1-8 Cpus_allowed_list: 0-1
# Name: ext4-rsv-conver Cpus_allowed_list: 0-19
# Name: ksoftirqd/13 Cpus_allowed_list: 13
# Name: kworker/13:0-mm_percpu_wq Cpus_allowed_list: 13
# Name: kworker/13:0H Cpus_allowed_list: 13
# Name: srp_remove Cpus_allowed_list: 0-19
# Name: licManager Cpus_allowed_list: 2-18
# Name: sh Cpus_allowed_list: 2-18
# Name: rcuop/13 Cpus_allowed_list: 0-1
# Name: post-office Cpus_allowed_list: 2-18
# Name: oam Cpus_allowed_list: 2-18
# Name: tr069-v2 Cpus_allowed_list: 2-18
# Name: ftp-func Cpus_allowed_list: 2-18
# Name: o-ru-controller Cpus_allowed_list: 2-18
# Name: lighttpd Cpus_allowed_list: 2-18
# Name: lighttpd Cpus_allowed_list: 2-18
# Name: gnb_cu_oam Cpus_allowed_list: 2-18
# Name: bin_reader Cpus_allowed_list: 2-18
# Name: duoam Cpus_allowed_list: 17-18
# Name: gnb_cu_son Cpus_allowed_list: 2-4
# Name: target_completi Cpus_allowed_list: 0-19
# Name: xcopy_wq Cpus_allowed_list: 0-19
# Name: cpuhp/14 Cpus_allowed_list: 14
# Name: licManager Cpus_allowed_list: 2-18
# Name: sh Cpus_allowed_list: 2-18
# Name: post-office Cpus_allowed_list: 2-18
# Name: oam Cpus_allowed_list: 2-18
# Name: tr069-v2 Cpus_allowed_list: 2-18
# Name: ftp-func Cpus_allowed_list: 2-18
# Name: o-ru-controller Cpus_allowed_list: 2-18
# Name: lighttpd Cpus_allowed_list: 2-18
# Name: lighttpd Cpus_allowed_list: 2-18
# Name: gnb_cu_oam Cpus_allowed_list: 2-18
# Name: bin_reader Cpus_allowed_list: 18
# Name: watchdog/14 Cpus_allowed_list: 0-19
# Name: duoam Cpus_allowed_list: 17-18
# Name: migration/14 Cpus_allowed_list: 14
# Name: licManager Cpus_allowed_list: 2-18
# Name: sh Cpus_allowed_list: 2-18
# Name: post-office Cpus_allowed_list: 2-18
# Name: oam Cpus_allowed_list: 2-18
# Name: tr069-v2 Cpus_allowed_list: 2-18
# Name: ftp-func Cpus_allowed_list: 2-18
# Name: o-ru-controller Cpus_allowed_list: 2-18
# Name: lighttpd Cpus_allowed_list: 2-18
# Name: lighttpd Cpus_allowed_list: 2-18
# Name: gnb_cu_oam Cpus_allowed_list: 2-3
# Name: gnb_cu_pdcp Cpus_allowed_list: 2-5
# Name: bin_reader Cpus_allowed_list: 18
# Name: bin_reader Cpus_allowed_list: 18
# Name: bin_reader Cpus_allowed_list: 18
# Name: bin_reader Cpus_allowed_list: 18
# Name: duoam Cpus_allowed_list: 17-18
# Name: dumgr Cpus_allowed_list: 17-18
# Name: gnb_du_layer2 Cpus_allowed_list: 5
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: gnb_cu_son Cpus_allowed_list: 2-4
# Name: gnb_cu_l3 Cpus_allowed_list: 7-9
# Name: posixcputmr/14 Cpus_allowed_list: 14
# Name: auditd Cpus_allowed_list: 0-1
# Name: rcuc/14 Cpus_allowed_list: 14
# Name: licManager Cpus_allowed_list: 2-18
# Name: sh Cpus_allowed_list: 2-18
# Name: post-office Cpus_allowed_list: 2-18
# Name: oam Cpus_allowed_list: 2-18
# Name: tr069-v2 Cpus_allowed_list: 2-18
# Name: ftp-func Cpus_allowed_list: 2-18
# Name: o-ru-controller Cpus_allowed_list: 2-18
# Name: lighttpd Cpus_allowed_list: 2-18
# Name: lighttpd Cpus_allowed_list: 2-18
# Name: gnb_cu_oam Cpus_allowed_list: 2-3
# Name: gnb_cu_pdcp Cpus_allowed_list: 2-5
# Name: bin_reader Cpus_allowed_list: 18
# Name: bin_reader Cpus_allowed_list: 18
# Name: bin_reader Cpus_allowed_list: 18
# Name: bin_reader Cpus_allowed_list: 18
# Name: duoam Cpus_allowed_list: 17-18
# Name: dumgr Cpus_allowed_list: 17-18
# Name: gnb_du_layer2 Cpus_allowed_list: 5
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: gnb_cu_son Cpus_allowed_list: 2-4
# Name: gnb_cu_l3 Cpus_allowed_list: 7-9
# Name: posixcputmr/0 Cpus_allowed_list: 0
# Name: ksoftirqd/14 Cpus_allowed_list: 14
# Name: kworker/14:0-mm_percpu_wq Cpus_allowed_list: 14
# Name: chronyd Cpus_allowed_list: 0-1
# Name: sssd Cpus_allowed_list: 0-1
# Name: kworker/14:0H Cpus_allowed_list: 14
# Name: dbus-daemon Cpus_allowed_list: 0-1
# Name: licManager Cpus_allowed_list: 2-18
# Name: sh Cpus_allowed_list: 2-18
# Name: rcuop/14 Cpus_allowed_list: 0-1
# Name: post-office Cpus_allowed_list: 2-18
# Name: oam Cpus_allowed_list: 2-18
# Name: tr069-v2 Cpus_allowed_list: 2-18
# Name: ftp-func Cpus_allowed_list: 2-18
# Name: o-ru-controller Cpus_allowed_list: 2-18
# Name: lighttpd Cpus_allowed_list: 2-18
# Name: lighttpd Cpus_allowed_list: 2-18
# Name: gnb_cu_oam Cpus_allowed_list: 2-3
# Name: gnb_cu_pdcp Cpus_allowed_list: 2-4,9
# Name: bin_reader Cpus_allowed_list: 18
# Name: bin_reader Cpus_allowed_list: 18
# Name: bin_reader Cpus_allowed_list: 18
# Name: bin_reader Cpus_allowed_list: 18
# Name: duoam Cpus_allowed_list: 17-18
# Name: dumgr Cpus_allowed_list: 17-18
# Name: gnb_du_layer2 Cpus_allowed_list: 5
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: gnb_cu_son Cpus_allowed_list: 2-4
# Name: gnb_cu_rrm Cpus_allowed_list: 5-6
# Name: gnb_cu_l3 Cpus_allowed_list: 2-18
# Name: cpuhp/15 Cpus_allowed_list: 15
# Name: watchdog/15 Cpus_allowed_list: 0-19
# Name: migration/15 Cpus_allowed_list: 15
# Name: posixcputmr/15 Cpus_allowed_list: 15
# Name: sssd_be Cpus_allowed_list: 0-1
# Name: licManager Cpus_allowed_list: 2-18
# Name: sh Cpus_allowed_list: 2-18
# Name: post-office Cpus_allowed_list: 2-18
# Name: oam Cpus_allowed_list: 2-18
# Name: tr069-v2 Cpus_allowed_list: 2-18
# Name: ftp-func Cpus_allowed_list: 2-18
# Name: o-ru-controller Cpus_allowed_list: 2-18
# Name: lighttpd Cpus_allowed_list: 2-18
# Name: lighttpd Cpus_allowed_list: 2-18
# Name: gnb_cu_oam Cpus_allowed_list: 2-3
# Name: gnb_cu_pdcp Cpus_allowed_list: 2-4,9
# Name: bin_reader Cpus_allowed_list: 18
# Name: bin_reader Cpus_allowed_list: 18
# Name: bin_reader Cpus_allowed_list: 18
# Name: bin_reader Cpus_allowed_list: 18
# Name: duoam Cpus_allowed_list: 17-18
# Name: dumgr Cpus_allowed_list: 17-18
# Name: gnb_du_layer2 Cpus_allowed_list: 5
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: bin_reader Cpus_allowed_list: 2
# Name: gnb_cu_son Cpus_allowed_list: 2-4
# Name: gnb_cu_rrm Cpus_allowed_list: 5-6
# Name: gnb_cu_l3 Cpus_allowed_list: 2-18
# Name: kworker/u40:0-events_unbound Cpus_allowed_list: 0-1
# Name: rcuc/15 Cpus_allowed_list: 15
# Name: ksoftirqd/15 Cpus_allowed_list: 15
# Name: migration/0 Cpus_allowed_list: 0
# Name: kworker/15:0-mm_percpu_wq Cpus_allowed_list: 15
# Name: sssd_nss Cpus_allowed_list: 0-1
# Name: kworker/15:0H Cpus_allowed_list: 15
# Name: systemd-logind Cpus_allowed_list: 0-1
# Name: rcuop/15 Cpus_allowed_list: 0-1
# Name: vim Cpus_allowed_list: 2-18
# Name: ovsdb-server Cpus_allowed_list: 0-1
# Name: cpuhp/16 Cpus_allowed_list: 16
# Name: watchdog/16 Cpus_allowed_list: 0-19
# Name: migration/16 Cpus_allowed_list: 16
# Name: posixcputmr/16 Cpus_allowed_list: 16
# Name: rcuc/16 Cpus_allowed_list: 16
# Name: ksoftirqd/16 Cpus_allowed_list: 16
# Name: kworker/16:0-mm_percpu_wq Cpus_allowed_list: 16
# Name: watchdog/0 Cpus_allowed_list: 0
# Name: kworker/16:0H Cpus_allowed_list: 16
# Name: rcuop/16 Cpus_allowed_list: 0-1
# Name: cpuhp/17 Cpus_allowed_list: 17
# Name: ovs-vswitchd Cpus_allowed_list: 0-1
# Name: watchdog/17 Cpus_allowed_list: 0-19
# Name: kworker/u40:1-events_unbound Cpus_allowed_list: 0-1
# Name: migration/17 Cpus_allowed_list: 17
# Name: kworker/0:0-events Cpus_allowed_list: 0
# Name: posixcputmr/17 Cpus_allowed_list: 17
# Name: rcuc/17 Cpus_allowed_list: 17
# Name: NetworkManager Cpus_allowed_list: 0-1
# Name: kworker/0:1-events Cpus_allowed_list: 0
# Name: ksoftirqd/17 Cpus_allowed_list: 17
# Name: kworker/u40:2-events_unbound Cpus_allowed_list: 0-1
# Name: sshd Cpus_allowed_list: 0-1
# Name: sshd Cpus_allowed_list: 0-1
# Name: bash Cpus_allowed_list: 0-1
# Name: sudo Cpus_allowed_list: 0-1
# Name: bash Cpus_allowed_list: 0-1
# Name: kworker/17:0-mm_percpu_wq Cpus_allowed_list: 17
# Name: irq/79-i40e-ens Cpus_allowed_list: 1
# Name: irq/80-i40e-ens Cpus_allowed_list: 0
# Name: irq/81-i40e-ens Cpus_allowed_list: 0
# Name: irq/82-i40e-ens Cpus_allowed_list: 1
# Name: kworker/17:0H Cpus_allowed_list: 17
# Name: irq/83-i40e-ens Cpus_allowed_list: 0
# Name: irq/84-i40e-ens Cpus_allowed_list: 0
# Name: kworker/1:1-xfs-cil/dm-0 Cpus_allowed_list: 1
# Name: irq/85-i40e-ens Cpus_allowed_list: 0
# Name: irq/86-i40e-ens Cpus_allowed_list: 0
# Name: irq/87-i40e-ens Cpus_allowed_list: 0
# Name: irq/88-i40e-ens Cpus_allowed_list: 0
# Name: irq/89-i40e-ens Cpus_allowed_list: 0
# Name: irq/90-i40e-ens Cpus_allowed_list: 0
# Name: irq/91-i40e-ens Cpus_allowed_list: 0
# Name: irq/92-i40e-ens Cpus_allowed_list: 0
# Name: cpuhp/0 Cpus_allowed_list: 0
# Name: rcuop/17 Cpus_allowed_list: 0-1
# Name: irq/93-i40e-ens Cpus_allowed_list: 1
# Name: irq/94-i40e-ens Cpus_allowed_list: 0
# Name: irq/95-i40e-ens Cpus_allowed_list: 1
# Name: irq/96-i40e-ens Cpus_allowed_list: 0
# Name: irq/97-i40e-ens Cpus_allowed_list: 1
# Name: irq/98-i40e-ens Cpus_allowed_list: 1
# Name: kworker/1:4-xfs-cil/dm-0 Cpus_allowed_list: 1
# Name: cpuhp/18 Cpus_allowed_list: 18
# Name: kworker/0:0H-xfs-log/dm-0 Cpus_allowed_list: 0
# Name: kworker/u40:3-events_unbound Cpus_allowed_list: 0-1
# Name: watchdog/18 Cpus_allowed_list: 0-19
# Name: kworker/1:0-events Cpus_allowed_list: 1
# Name: migration/18 Cpus_allowed_list: 18
# Name: sleep Cpus_allowed_list: 0-1,19
# Name: sleep Cpus_allowed_list: 0-1,19
# Name: sh Cpus_allowed_list: 0-1
# Name: posixcputmr/18 Cpus_allowed_list: 18
# Name: agetty Cpus_allowed_list: 0-1
# Name: agetty Cpus_allowed_list: 0-1
# Name: rcuc/18 Cpus_allowed_list: 18
# Name: ksoftirqd/18 Cpus_allowed_list: 18
# Name: kworker/18:0-mm_percpu_wq Cpus_allowed_list: 18
# Name: kworker/18:0H Cpus_allowed_list: 18
# Name: rcuop/18 Cpus_allowed_list: 0-1
# Name: cpuhp/1 Cpus_allowed_list: 1
# Name: cpuhp/19 Cpus_allowed_list: 19
# Name: irq/70-ens81f0n Cpus_allowed_list: 0
# Name: irq/71-ens81f0n Cpus_allowed_list: 1
# Name: irq/72-ens81f0n Cpus_allowed_list: 0
# Name: irq/73-ens81f0n Cpus_allowed_list: 1
# Name: irq/74-ens81f0n Cpus_allowed_list: 0
# Name: watchdog/19 Cpus_allowed_list: 0-19
# Name: irq/75-ens81f0n Cpus_allowed_list: 1
# Name: irq/76-ens81f0n Cpus_allowed_list: 0
# Name: irq/77-ens81f0n Cpus_allowed_list: 1
# Name: migration/19 Cpus_allowed_list: 19
# Name: posixcputmr/19 Cpus_allowed_list: 19
# Name: rcuc/19 Cpus_allowed_list: 19
# Name: irq/110-ens81f1 Cpus_allowed_list: 0
# Name: irq/111-ens81f1 Cpus_allowed_list: 1
# Name: irq/112-ens81f1 Cpus_allowed_list: 0
# Name: irq/113-ens81f1 Cpus_allowed_list: 1
# Name: irq/114-ens81f1 Cpus_allowed_list: 0
# Name: irq/115-ens81f1 Cpus_allowed_list: 1
# Name: irq/116-ens81f1 Cpus_allowed_list: 0
# Name: irq/117-ens81f1 Cpus_allowed_list: 1
# Name: ksoftirqd/19 Cpus_allowed_list: 19
# Name: kworker/19:0-mm_percpu_wq Cpus_allowed_list: 19
# Name: kworker/19:0H Cpus_allowed_list: 19
# Name: rcuop/19 Cpus_allowed_list: 0-1
# Name: watchdog/1 Cpus_allowed_list: 0-19
# Name: irq/4-ttyS0 Cpus_allowed_list: 0
# Name: kdevtmpfs Cpus_allowed_list: 0-1
# Name: netns Cpus_allowed_list: 0-19
# Name: rcu_tasks_kthre Cpus_allowed_list: 0-1
# Name: kauditd Cpus_allowed_list: 0-1
# Name: sshd Cpus_allowed_list: 0-1
# Name: rpcbind Cpus_allowed_list: 0-1
# Name: rpc.statd Cpus_allowed_list: 0-1
# Name: khungtaskd Cpus_allowed_list: 0-1
# Name: oom_reaper Cpus_allowed_list: 0-1
# Name: kthreadd Cpus_allowed_list: 0-1
# Name: migration/1 Cpus_allowed_list: 1
# Name: writeback Cpus_allowed_list: 0-19
# Name: kcompactd0 Cpus_allowed_list: 0-1
# Name: ksmd Cpus_allowed_list: 0-1
# Name: crypto Cpus_allowed_list: 0-19
# Name: kintegrityd Cpus_allowed_list: 0-19
# Name: kblockd Cpus_allowed_list: 0-19
# Name: irq/9-acpi Cpus_allowed_list: 0
# Name: tpm_dev_wq Cpus_allowed_list: 0-19
# Name: posixcputmr/1 Cpus_allowed_list: 1
# Name: md Cpus_allowed_list: 0-19
# Name: crio Cpus_allowed_list: 0-1
# Name: edac-poller Cpus_allowed_list: 0-19
# Name: watchdogd Cpus_allowed_list: 0-1
# Name: rcuc/1 Cpus_allowed_list: 1
# Name: conmon Cpus_allowed_list: 0-1
# Name: conmon Cpus_allowed_list: 0-1
# Name: conmon Cpus_allowed_list: 0-1
# Name: conmon Cpus_allowed_list: 0-1
# Name: conmon Cpus_allowed_list: 0-1
# Name: conmon Cpus_allowed_list: 0-1
# Name: ksoftirqd/1 Cpus_allowed_list: 1
# Name: kswapd0 Cpus_allowed_list: 0-1
# Name: pod Cpus_allowed_list: 0-1
# Name: pod Cpus_allowed_list: 0-1
# Name: pod Cpus_allowed_list: 0-1
# Name: pod Cpus_allowed_list: 0-1
# Name: pod Cpus_allowed_list: 0-1
# Name: pod Cpus_allowed_list: 0-1
# Name: conmon Cpus_allowed_list: 0-1
# Name: kworker/2:1-mm_percpu_wq Cpus_allowed_list: 2
# Name: pod Cpus_allowed_list: 0-1
# Name: kworker/3:1-mm_percpu_wq Cpus_allowed_list: 3
# Name: kworker/4:1-mm_percpu_wq Cpus_allowed_list: 4
# Name: kworker/1:1H-kblockd Cpus_allowed_list: 1
# Name: kworker/5:1-mm_percpu_wq Cpus_allowed_list: 5
# Name: kworker/6:1-mm_percpu_wq Cpus_allowed_list: 6
# Name: kworker/7:1-mm_percpu_wq Cpus_allowed_list: 7
# Name: kworker/8:1-mm_percpu_wq Cpus_allowed_list: 8
# Name: kworker/9:1-mm_percpu_wq Cpus_allowed_list: 9
# Name: kworker/10:1-mm_percpu_wq Cpus_allowed_list: 10
# Name: kworker/11:1-mm_percpu_wq Cpus_allowed_list: 11
# Name: kworker/12:1-mm_percpu_wq Cpus_allowed_list: 12
# Name: kworker/13:1-mm_percpu_wq Cpus_allowed_list: 13
# Name: conmon Cpus_allowed_list: 0-1
# Name: conmon Cpus_allowed_list: 0-1
# Name: kworker/14:1-mm_percpu_wq Cpus_allowed_list: 14
# Name: kworker/15:1-mm_percpu_wq Cpus_allowed_list: 15
# Name: cpuhp/2 Cpus_allowed_list: 2
# Name: kworker/16:1-mm_percpu_wq Cpus_allowed_list: 16
# Name: sh Cpus_allowed_list: 0-1,19
# Name: kworker/17:1-mm_percpu_wq Cpus_allowed_list: 17
# Name: tail Cpus_allowed_list: 0-1,19
# Name: kworker/18:1-mm_percpu_wq Cpus_allowed_list: 18
# Name: kworker/19:1-mm_percpu_wq Cpus_allowed_list: 19
# Name: conmon Cpus_allowed_list: 0-1
# Name: conmon Cpus_allowed_list: 0-1
# Name: watchdog/2 Cpus_allowed_list: 0-19
# Name: kubelet Cpus_allowed_list: 0-1
# Name: systemd Cpus_allowed_list: 0-1
# Name: (sd-pam) Cpus_allowed_list: 0-1
# Name: podman pause Cpus_allowed_list: 0-1
# Name: machine-config- Cpus_allowed_list: 0-1,19
# Name: kworker/0:1H-kblockd Cpus_allowed_list: 0
# Name: openshift-sdn-n Cpus_allowed_list: 0-1,19
# Name: migration/2 Cpus_allowed_list: 2
# Name: conmon Cpus_allowed_list: 0-1
# Name: run Cpus_allowed_list: 0-1,19
# Name: posixcputmr/2 Cpus_allowed_list: 2
# Name: rcu_gp Cpus_allowed_list: 0-19
# Name: rcuc/2 Cpus_allowed_list: 2
# Name: conmon Cpus_allowed_list: 0-1
# Name: conmon Cpus_allowed_list: 0-1
# Name: oauth-proxy Cpus_allowed_list: 0-1,19
# Name: kube-rbac-proxy Cpus_allowed_list: 0-1,19
# Name: openshift-tuned Cpus_allowed_list: 0-1,19
# Name: ksoftirqd/2 Cpus_allowed_list: 2
# Name: kworker/1:2H-kblockd Cpus_allowed_list: 1
# Name: polkitd Cpus_allowed_list: 0-1
# Name: kworker/2:0-mm_percpu_wq Cpus_allowed_list: 2
# Name: journalctl Cpus_allowed_list: 0-1,19
# Name: kworker/2:0H Cpus_allowed_list: 2
# Name: conmon Cpus_allowed_list: 0-1
# Name: node_exporter Cpus_allowed_list: 0-1,19
# Name: rcuop/2 Cpus_allowed_list: 0-1
# Name: conmon Cpus_allowed_list: 0-1
# Name: kthrotld Cpus_allowed_list: 0-19
# Name: irq/24-PCIe PME Cpus_allowed_list: 1
# Name: irq/26-PCIe PME Cpus_allowed_list: 1
# Name: kube-rbac-proxy Cpus_allowed_list: 0-1,19
# Name: irq/26-pciehp Cpus_allowed_list: 1
# Name: irq/26-s-pciehp Cpus_allowed_list: 1
# Name: irq/27-PCIe PME Cpus_allowed_list: 1
# Name: cpuhp/3 Cpus_allowed_list: 3
# Name: irq/65-PCIe PME Cpus_allowed_list: 0
# Name: irq/65-aerdrv Cpus_allowed_list: 0
# Name: irq/65-s-aerdrv Cpus_allowed_list: 0
# Name: acpi_thermal_pm Cpus_allowed_list: 0-19
# Name: kmpath_rdacd Cpus_allowed_list: 0-19
# Name: kaluad Cpus_allowed_list: 0-19
# Name: irq/66-xhci_hcd Cpus_allowed_list: 1
# Name: irq/8-rtc0 Cpus_allowed_list: 0
# Name: ipv6_addrconf Cpus_allowed_list: 0-19
# Name: kstrp Cpus_allowed_list: 0-19
# Name: watchdog/3 Cpus_allowed_list: 0-19
# Name: migration/3 Cpus_allowed_list: 3
# Name: posixcputmr/3 Cpus_allowed_list: 3
# Name: rcuc/3 Cpus_allowed_list: 3
# Name: conmon Cpus_allowed_list: 0-1
# Name: pod Cpus_allowed_list: 0-1
# Name: conmon Cpus_allowed_list: 0-1
# Name: network-metrics Cpus_allowed_list: 0-1,19
# Name: conmon Cpus_allowed_list: 0-1
# Name: rcu_par_gp Cpus_allowed_list: 0-19
# Name: ksoftirqd/3 Cpus_allowed_list: 3
# Name: pod Cpus_allowed_list: 0-1
# Name: conmon Cpus_allowed_list: 0-1
# Name: kube-rbac-proxy Cpus_allowed_list: 0-1,19
# Name: conmon Cpus_allowed_list: 0-1
# Name: kworker/3:0-mm_percpu_wq Cpus_allowed_list: 3
# Name: entrypoint.sh Cpus_allowed_list: 0-1,19
# Name: conmon Cpus_allowed_list: 0-1
# Name: coredns Cpus_allowed_list: 0-1,19
# Name: kworker/3:0H Cpus_allowed_list: 3
# Name: rcuop/3 Cpus_allowed_list: 0-1
# Name: conmon Cpus_allowed_list: 0-1
# Name: kube-rbac-proxy Cpus_allowed_list: 0-1,19
# Name: cpuhp/4 Cpus_allowed_list: 4
# Name: conmon Cpus_allowed_list: 0-1
# Name: bash Cpus_allowed_list: 0-1,19
# Name: watchdog/4 Cpus_allowed_list: 0-19
# Name: migration/4 Cpus_allowed_list: 4
# Name: posixcputmr/4 Cpus_allowed_list: 4
# Name: tuned Cpus_allowed_list: 0-1,19
# Name: rcuc/4 Cpus_allowed_list: 4
# Name: ksoftirqd/4 Cpus_allowed_list: 4
# Name: irqbalance Cpus_allowed_list: 0-1
# Name: stalld Cpus_allowed_list: 0-1
# Name: kworker/4:0-mm_percpu_wq Cpus_allowed_list: 4
# Name: kworker/4:0H Cpus_allowed_list: 4
# Name: iscsi_eh Cpus_allowed_list: 0-19
# Name: rcuop/4 Cpus_allowed_list: 0-1
# Name: cpuhp/5 Cpus_allowed_list: 5
# Name: watchdog/5 Cpus_allowed_list: 0-19
# Name: migration/5 Cpus_allowed_list: 5
# Name: posixcputmr/5 Cpus_allowed_list: 5
# Name: rcuc/5 Cpus_allowed_list: 5
# Name: ksoftirqd/5 Cpus_allowed_list: 5
# Name: kworker/5:0-mm_percpu_wq Cpus_allowed_list: 5
# Name: kworker/5:0H Cpus_allowed_list: 5
# Name: rcuop/5 Cpus_allowed_list: 0-1
# Name: cpuhp/6 Cpus_allowed_list: 6
# Name: watchdog/6 Cpus_allowed_list: 0-19
# Name: cnic_wq Cpus_allowed_list: 0-19
# Name: bnx2i_thread/0 Cpus_allowed_list: 0
# Name: bnx2i_thread/1 Cpus_allowed_list: 1
# Name: bnx2i_thread/2 Cpus_allowed_list: 2
# Name: bnx2i_thread/3 Cpus_allowed_list: 3
# Name: bnx2i_thread/4 Cpus_allowed_list: 4
# Name: bnx2i_thread/5 Cpus_allowed_list: 5
# Name: bnx2i_thread/6 Cpus_allowed_list: 6
# Name: migration/6 Cpus_allowed_list: 6
# Name: bnx2i_thread/7 Cpus_allowed_list: 7
# Name: bnx2i_thread/8 Cpus_allowed_list: 8
# Name: bnx2i_thread/9 Cpus_allowed_list: 9
# Name: bnx2i_thread/10 Cpus_allowed_list: 10
# Name: bnx2i_thread/11 Cpus_allowed_list: 11
# Name: bnx2i_thread/12 Cpus_allowed_list: 12
# Name: bnx2i_thread/13 Cpus_allowed_list: 13
# Name: bnx2i_thread/14 Cpus_allowed_list: 14
# Name: bnx2i_thread/15 Cpus_allowed_list: 15
# Name: bnx2i_thread/16 Cpus_allowed_list: 16
# Name: posixcputmr/6 Cpus_allowed_list: 6
# Name: bnx2i_thread/17 Cpus_allowed_list: 17
# Name: bnx2i_thread/18 Cpus_allowed_list: 18
# Name: bnx2i_thread/19 Cpus_allowed_list: 19
# Name: rcuc/6 Cpus_allowed_list: 6
# Name: ksoftirqd/6 Cpus_allowed_list: 6
# Name: kworker/6:0-mm_percpu_wq Cpus_allowed_list: 6
# Name: kmpathd Cpus_allowed_list: 0-19
# Name: kworker/6:0H Cpus_allowed_list: 6
# Name: kmpath_handlerd Cpus_allowed_list: 0-19
# Name: rcuop/6 Cpus_allowed_list: 0-1
# Name: cpuhp/7 Cpus_allowed_list: 7
# Name: watchdog/7 Cpus_allowed_list: 0-19
# Name: migration/7 Cpus_allowed_list: 7
# Name: posixcputmr/7 Cpus_allowed_list: 7
# Name: rcuc/7 Cpus_allowed_list: 7
# Name: ksoftirqd/7 Cpus_allowed_list: 7
# Name: kworker/7:0-mm_percpu_wq Cpus_allowed_list: 7
# Name: kworker/7:0H Cpus_allowed_list: 7
# Name: ata_sff Cpus_allowed_list: 0-19
# Name: i40e Cpus_allowed_list: 0-19
# Name: rcuop/7 Cpus_allowed_list: 0-1
# Name: bnxt_pf_wq Cpus_allowed_list: 0-19
# Name: irq/67-ahci[000 Cpus_allowed_list: 0
# Name: scsi_eh_0 Cpus_allowed_list: 0-1
# Name: scsi_tmf_0 Cpus_allowed_list: 0-19
# Name: scsi_eh_1 Cpus_allowed_list: 0-1
# Name: scsi_tmf_1 Cpus_allowed_list: 0-19
# Name: scsi_eh_2 Cpus_allowed_list: 0-1
# Name: scsi_tmf_2 Cpus_allowed_list: 0-19
# Name: cpuhp/8 Cpus_allowed_list: 8
# Name: scsi_eh_3 Cpus_allowed_list: 0-1
# Name: scsi_tmf_3 Cpus_allowed_list: 0-19
# Name: scsi_eh_4 Cpus_allowed_list: 0-1
# Name: scsi_tmf_4 Cpus_allowed_list: 0-19
# Name: scsi_eh_5 Cpus_allowed_list: 0-1
# Name: scsi_tmf_5 Cpus_allowed_list: 0-19
# Name: watchdog/8 Cpus_allowed_list: 0-19
# Name: irq/99-i40e-000 Cpus_allowed_list: 0
# Name: irq/78-i40e-000 Cpus_allowed_list: 0
# Name: irq/109-ahci[00 Cpus_allowed_list: 1
# Name: scsi_eh_6 Cpus_allowed_list: 0-1
# Name: migration/8 Cpus_allowed_list: 8
# Name: scsi_tmf_6 Cpus_allowed_list: 0-19
# Name: scsi_eh_7 Cpus_allowed_list: 0-1
# Name: scsi_tmf_7 Cpus_allowed_list: 0-19
# Name: scsi_eh_8 Cpus_allowed_list: 0-1
# Name: scsi_tmf_8 Cpus_allowed_list: 0-19
# Name: scsi_eh_9 Cpus_allowed_list: 0-1
# Name: scsi_tmf_9 Cpus_allowed_list: 0-19
# Name: scsi_eh_10 Cpus_allowed_list: 0-1
# Name: scsi_tmf_10 Cpus_allowed_list: 0-19
# Name: posixcputmr/8 Cpus_allowed_list: 8
# Name: scsi_eh_11 Cpus_allowed_list: 0-1
# Name: scsi_tmf_11 Cpus_allowed_list: 0-19
# Name: scsi_eh_12 Cpus_allowed_list: 0-1
# Name: scsi_tmf_12 Cpus_allowed_list: 0-19
# Name: scsi_eh_13 Cpus_allowed_list: 0-1
# Name: scsi_tmf_13 Cpus_allowed_list: 0-19
# Name: irq/139-i40e-00 Cpus_allowed_list: 0
# Name: rcuc/8 Cpus_allowed_list: 8
# Name: irq/118-i40e-00 Cpus_allowed_list: 1
# Name: ksoftirqd/8 Cpus_allowed_list: 8
# Name: kworker/8:0-mm_percpu_wq Cpus_allowed_list: 8
# Name: kworker/8:0H Cpus_allowed_list: 8
# Name: rcuop/8 Cpus_allowed_list: 0-1
# Name: cpuhp/9 Cpus_allowed_list: 9
# Name: mm_percpu_wq Cpus_allowed_list: 0-19
# Name: watchdog/9 Cpus_allowed_list: 0-19
# Name: migration/9 Cpus_allowed_list: 9
# Name: posixcputmr/9 Cpus_allowed_list: 9
# Name: kdmflush Cpus_allowed_list: 0-19
# Name: rcuc/9 Cpus_allowed_list: 9
# Name: xfsalloc Cpus_allowed_list: 0-19
# Name: xfs_mru_cache Cpus_allowed_list: 0-19
# Name: ksoftirqd/9 Cpus_allowed_list: 9
# Name: xfs-buf/dm-0 Cpus_allowed_list: 0-19
# Name: xfs-conv/dm-0 Cpus_allowed_list: 0-19
# Name: xfs-cil/dm-0 Cpus_allowed_list: 0-19
# Name: xfs-reclaim/dm- Cpus_allowed_list: 0-19
# Name: xfs-log/dm-0 Cpus_allowed_list: 0-19
# Name: xfs-eofblocks/d Cpus_allowed_list: 0-19
# Name: xfsaild/dm-0 Cpus_allowed_list: 0-1
# Name: kworker/9:0-mm_percpu_wq Cpus_allowed_list: 9
# Name: kworker/9:0H Cpus_allowed_list: 9
# Name: rcuop/9 Cpus_allowed_list: 0-1
# Name: cpuhp/10 Cpus_allowed_list: 10
# Name: watchdog/10 Cpus_allowed_list: 0-19
openshift 4.3 build config & hpa
video for build config & scale up
- https://youtu.be/O0TjPBisMVo
- https://www.bilibili.com/video/BV1rT4y137QJ/
- https://www.ixigua.com/i6824464593977344525/
video for scale up & service
- https://youtu.be/6fMe7T4RlCI
- https://www.bilibili.com/video/BV1Xt4y1y7xG/
- https://www.ixigua.com/i6824739572237206023/
php build config
# prepare a php test image
cat << EOF > php.dockerfile
FROM php:apache
COPY . /var/www/html/
EOF
cat <<EOF > index.php
<?php
ECHO "Hello!<br>";
echo "Welcome to RedHat Developer<br>";
EcHo "Enjoy all of the ad-free articles<br>";
?>
EOF
buildah build-using-dockerfile -t docker.io/wangzheng422/php:demo -f php.dockerfile .
podman run -it --rm -p 18080:80 --name my-running-app docker.io/wangzheng422/php:demo
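A quick smoke test before pushing the image is easy; a minimal check, assuming the container started by the previous command is still running and port 18080 is free on the host:
# hit the demo app through the published host port; it should print the three lines from index.php
curl http://localhost:18080/index.php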
# create a git server with gogs; after it starts, a bit of configuration is needed.
# configure resolv.conf
# configure app.ini
# [webhook]
# SKIP_TLS_VERIFY = true
mkdir -p /data/ocp4/gogs
podman run -d --name=gogs -p 10022:22 -p 10080:3000 -v /data/ocp4/gogs:/data:Z registry.redhat.ren:5443/docker.io/gogs/gogs
podman stop gogs
podman start gogs
# http://registry.redhat.ren:10080
# create the build config in the demo project
oc project demo
oc import-image php:apache-wzh --from=registry.redhat.ren:5443/docker.io/library/php:apache-wzh --confirm
# oc import-image php:apache-wzh --from=registry.redhat.ren:5443/docker.io/wangzheng422/php:apache --confirm
oc create is php-sample -n demo
cat << EOF > bc.is.yaml
kind: BuildConfig
apiVersion: build.openshift.io/v1
metadata:
  name: "php-sample-build"
spec:
  runPolicy: "Serial"
  triggers:
    - type: "Generic"
      generic:
        secret: "secret101"
    - type: "ImageChange"
  source:
    git:
      uri: "http://registry.redhat.ren:10080/root/php"
    dockerfile: "FROM php:apache\nCOPY . /var/www/html/"
  strategy:
    dockerStrategy:
      from:
        kind: "ImageStreamTag"
        name: "php:apache-wzh"
  output:
    to:
      kind: "ImageStreamTag"
      name: "php-sample:demo"
EOF
oc apply -f bc.is.yaml
# in the web console, create an application from the image, then push a code change to trigger a rebuild and redeployment.
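Besides the console, a build can also be kicked off from the command line or through the generic webhook declared in bc.is.yaml; a hedged sketch, where api.ocp4.redhat.ren:6443 stands in for your cluster API endpoint:
# start a build by hand and follow its log
oc start-build php-sample-build -n demo --follow
# or hit the generic webhook (secret101 comes from the BuildConfig above)
curl -k -X POST \
  https://api.ocp4.redhat.ren:6443/apis/build.openshift.io/v1/namespaces/demo/buildconfigs/php-sample-build/webhooks/secret101/generic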
hpa
Here we show how OpenShift automatically scales the number of pods up and down based on CPU load.
video
- https://youtu.be/_UTncz3StXE
- https://www.bilibili.com/video/BV1Tk4y1r7Be/
# oc autoscale dc/php-sample \
# --min 1 \
# --max 3 \
# --cpu-percent=50
# create an HPA for the deployment that was created earlier
cat << 'EOF' > demo.hpa.yaml
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: php-sample
  namespace: demo
spec:
  scaleTargetRef:
    kind: Deployment
    name: php-sample
    apiVersion: apps/v1
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
EOF
oc apply -n demo -f demo.hpa.yaml
# to avoid affecting other tests, pin the php pods to a single host
cat << 'EOF'
nodeSelector:
  kubernetes.io/hostname: 'infra1.hsc.redhat.ren'
EOF
# to make the scaling easier to see, limit the cpu the pod can use
cat << 'EOF'
resources:
  requests:
    cpu: '100m'
    memory: "1G"
  limits:
    cpu: '100m'
    memory: "1G"
EOF
# start the load test
ab -c 100 -n 99999 http://php-sample-demo.apps.ocpsc.redhat.ren/
# check the current status of the hpa
oc describe hpa/php-sample
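While ab is generating load, the scale-out can be watched from another terminal; a small sketch using the names defined in demo.hpa.yaml:
# watch the measured cpu utilization and replica count of the hpa
oc get hpa php-sample -n demo -w
# watch pods being added and removed as the hpa reacts
oc get pod -n demo -w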
Detours
skopeo copy docker://docker.io/php:apache docker-archive:///root/tmp/php.tar
gzip php.tar
skopeo copy docker-archive:///data/ocp4/tmp/php.tar.gz docker://registry.redhat.ren:5443/docker.io/library/php:apache
skopeo copy docker://docker.io/wangzheng422/php:apache docker://registry.redhat.ren:5443/docker.io/wangzheng422/php:apache
cat << EOF > docker.php.sh
#!/usr/bin/env bash
set -e
set -x
buildah from --name onbuild-container docker.io/php:apache
buildah run onbuild-container sed -i "s/80/8080/g" /etc/apache2/sites-available/000-default.conf /etc/apache2/ports.conf
buildah umount onbuild-container
buildah config -p 8080 onbuild-container
buildah commit --squash --rm --format=docker onbuild-container docker.io/wangzheng422/php:apache
buildah push docker.io/wangzheng422/php:apache
EOF
bash docker.php.sh
cat << EOF > docker.php.sh
#!/usr/bin/env bash
set -e
set -x
buildah from --name onbuild-container registry.redhat.ren:5443/docker.io/library/php:apache
buildah run onbuild-container sed -i "s/80/8080/g" /etc/apache2/sites-available/000-default.conf /etc/apache2/ports.conf
buildah umount onbuild-container
buildah config -p 8080 onbuild-container
buildah commit --squash --rm --format=docker onbuild-container registry.redhat.ren:5443/docker.io/library/php:apache-wzh
buildah push registry.redhat.ren:5443/docker.io/library/php:apache-wzh
EOF
bash docker.php.sh
# we do not need a complicated template
oc get template -n openshift | grep php
# the source-to-image feature is enough, so look up the image stream
oc get is -A | grep php
# change the state of the samples operator
oc get configs.samples.operator.openshift.io/cluster -o yaml
oc patch configs.samples.operator.openshift.io/cluster -p '{"spec":{"managementState": "Unmanaged"}}' --type=merge
export LOCAL_REG='registry.redhat.ren:5443'
var_is_name='php'
var_json=$(oc get is ${var_is_name} -n openshift -o json)
var_j=0
for var_is_tag in $(echo $var_json | jq -r ".spec.tags[].name"); do
var_is_image_name=$(echo $var_json | jq -r ".spec.tags[${var_j}].from.name")
var_is_image_kind=$(echo $var_json | jq -r ".spec.tags[${var_j}].from.kind")
if [[ $var_is_image_kind =~ 'DockerImage' ]]; then
var_new_is_image_name="${LOCAL_REG}/$var_is_image_name"
echo "###############################"
echo $var_is_image_name
echo $var_is_image_kind
echo $var_new_is_image_name
echo $var_is_tag
oc patch -n openshift is ${var_is_name} --type='json' -p="[{\"op\": \"replace\", \"path\": \"/spec/tags/${var_j}/from/name\", \"value\":\"${var_new_is_image_name}\"}]"
fi
var_j=$((var_j+1))
done
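To confirm the loop rewrote the tags, dump the from.name fields again; a minimal check reusing the same jq pattern as the loop:
# every DockerImage tag should now be prefixed with ${LOCAL_REG}
oc get is ${var_is_name} -n openshift -o json | jq -r '.spec.tags[].from.name'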
containered cloud-native (ccn) roadshow offline deployment
CCN is a nice set of training material that demonstrates ci/cd, cloud-native, istio and serverless on top of openshift, and the teaching content is very rich.
Module 1 focuses on how to split a monolithic application and how to bring the resulting services to the cloud.
Module 2 explains how to debug online and how to monitor the applications running on the cloud.
Module 3 converts the application to a service mesh (istio) architecture.
Module 4 develops the application with a serverless (knative) architecture.
Training videos
- The Containers and Cloud-Native Roadshow Dev Track - Module 1
- The Containers and Cloud-Native Roadshow Dev Track - Module 2
- The Containers and Cloud-Native Roadshow Dev Track - Module 3
- The Containers and Cloud-Native Roadshow Dev Track - Module 4
Installation videos
The upstream CCN, however, is based on the rh demo system and has to be online, so an offline version was made here for customers to use in disconnected environments.
Offline deployment architecture
This offline CCN is built on ocp 4.4.7. There are 4 modules in total.
Making CCN work offline mainly involves the following 3 parts:
- offline github
- offline maven and npm
- offline copies of the required images
The deployment architecture in the lab is as follows, for reference:
As you can see, it is no different from the standard deployment architecture, except that gogs and nexus are added on the helper node.
Installation media download
Download the installation media from the link below; note that it is built on ocp 4.4.7.
- Link: https://pan.baidu.com/s/1f3EcbojFss5cDDQBPBzA-A password: 1jun
Because the files were split into 5G chunks for uploading, merge them after downloading with a command like:
cat registry.?? > registry.tgz
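For reference, the chunks were presumably produced with split before uploading; a sketch assuming a 5G chunk size and the default two-letter suffix, which matches the registry.?? pattern above:
# split the big tarball into 5G pieces named registry.aa, registry.ab, ...
split -b 5G registry.tgz registry.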
There may also be patch files on the Baidu drive; for example, a file named agnosticd.zip is a patch file: upload it to the helper and replace the file of the same name extracted from ocp4.tgz.
Courseware revisions
The courseware has been revised from the upstream project, mostly small changes to the source code so it works in a fully offline environment. If a step cannot be completed during a class, it is most likely caused by the offline environment; refer to the recorded teaching videos, which show how to work around such issues.
Deployment details of the base ocp4.4 environment
- install ocp4 following the offline procedure, paying special attention to these installation details
- apply the offline registries.conf patch
- apply the local image registry ca patch
- configure the image registry
- configure the samples operator and patch the image streams
- deploy the offline operator hub
ccn for ocp-4.4 installation steps
It is recommended to install the ccn courseware on a dedicated ocp4 cluster, because it enables several operators cluster-wide and those operators may affect other workloads on the cluster.
# on helper
# deploy gogs
export LOCAL_REG='registry.redhat.ren:5443/'
# export LOCAL_REG=''
# gogs_var_date='2020-07-06'
podman stop gogs
podman rm -fv gogs
cd /data/ccn
rm -rf /data/ccn/gogs
podman run -d --name gogs-fs --entrypoint "tail" ${LOCAL_REG}docker.io/wangzheng422/gogs-fs:2020-07-17-1412 -f /dev/null
podman cp gogs-fs:/gogs.tgz /data/ccn/
tar zxf gogs.tgz
podman rm -fv gogs-fs
# change /data/ccn/gogs/resolv.conf to fit your env
# change /data/ccn/gogs/gogs/conf/app.ini to fit your env
# generally, tag latest works
podman run -d --name=gogs -p 10022:22 -p 10080:3000 -v /data/ccn/gogs:/data:Z -v /data/ccn/gogs/resolv.conf:/etc/resolv.conf:Z ${LOCAL_REG}docker.io/gogs/gogs
# for those not using provided source, try a specific tag.
podman run -d --name=gogs -p 10022:22 -p 10080:3000 -v /data/ccn/gogs:/data:Z -v /data/ccn/gogs/resolv.conf:/etc/resolv.conf:Z ${LOCAL_REG}docker.io/gogs/gogs:0.12.3
# restore if you need.
podman stop gogs
podman rm -fv gogs
# deploy nexus
mkdir -p /data/ccn/nexus
cd /data/ccn
rm -rf /data/ccn/nexus
podman run -d --name nexus-fs --entrypoint "tail" ${LOCAL_REG}docker.io/wangzheng422/nexus-fs:2020-07-20-0320 -f /dev/null
podman cp nexus-fs:/nexus.tgz /data/ccn/
tar zxf nexus.tgz ./
podman rm -fv nexus-fs
chown -R 200:root /data/ccn/nexus
# generally, tag latest works
podman run -d -p 8081:8081 -it --name nexus -v /data/ccn/nexus:/nexus-data:Z ${LOCAL_REG}docker.io/sonatype/nexus3:latest
# for those not using provided source, try a specific tag.
podman run -d -p 8081:8081 -it --name nexus -v /data/ccn/nexus:/nexus-data:Z ${LOCAL_REG}docker.io/sonatype/nexus3:3.26.1
# restore if you need.
podman stop nexus
podman rm -fv nexus
# deploy etherpad
mkdir -p /data/ccn/etherpad
chown -R 5001 /data/ccn/etherpad
podman run -d -p 9001:9001 -it --name etherpad -v /data/ccn/etherpad:/opt/etherpad-lite/var:z ${LOCAL_REG}docker.io/etherpad/etherpad:latest
# restore if you need.
podman stop etherpad
podman rm -fv etherpad
# agnosticd on helper
mkdir -p /data/pip3
cd /data/pip3
podman create --name swap registry.redhat.ren:5443/docker.io/wangzheng422/base-fs:pip3-whl-2020-07-05 ls
podman cp swap:/wheelhouse.tar.gz - > wheelhouse.tar.gz
tar vxf wheelhouse.tar.gz
podman rm -fv swap
pip3 install --user --upgrade -r wheelhouse/requirements.txt --no-index --find-links wheelhouse
# yum downgrade ansible-2.8.12-1.el7ae
# parameters for installing the ccn environment
oc login -u kubeadmin
# oc login -u system:admin
# TARGET_HOST="bastion.rhte-b5c8.openshiftworkshop.com"
OCP_USERNAME="system:admin"
WORKLOAD="ocp4-workload-ccnrd"
GUID=b5c8
USER_COUNT=2
MODULE_TYPE="m1;m2;m3;m4"
SSH_KEY=~/.ssh/id_rsa
WZH_SUBDOMIN_BASE=base.ocp4.redhat.ren
WZH_REGISTRY_SERVER=registry.redhat.ren:5443
WZH_GOGS_SERVER=gogs.redhat.ren:10080
# create users
BASE_DIR="/root/ocp4"
mkdir -p ${BASE_DIR}
cd ${BASE_DIR}
/bin/rm -f ${BASE_DIR}/htpasswd
touch ${BASE_DIR}/htpasswd
for i in $(seq 1 $USER_COUNT)
do
htpasswd -Bb ${BASE_DIR}/htpasswd user${i} redhat
done
oc create secret generic htpasswd --from-file=${BASE_DIR}/htpasswd -n openshift-config
oc apply -f - <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: Local Password
    mappingMethod: claim
    type: HTPasswd
    htpasswd:
      fileData:
        name: htpasswd
EOF
# oc delete secret htpasswd -n openshift-config
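Once the OAuth resource is applied and the authentication operator has rolled out, the generated users can be verified with a normal login; a minimal check using the password set above:
# log in as one of the generated users and confirm the identity
oc login -u user1 -p redhat
oc whoami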
# the installation steps are as follows
# a TARGET_HOST is specified in the command line, without using an inventory file
cd /root/ocp4/agnosticd/ansible
ansible-playbook -i localhost, ./configs/ocp-workloads/ocp-workload.yml \
-e"ansible_ssh_private_key_file=${SSH_KEY}" \
-e"ansible_user=root" \
-e"ocp_username=${OCP_USERNAME}" \
-e"ocp_workload=${WORKLOAD}" \
-e"silent=False" \
-e"guid=${GUID}" \
-e"num_users=${USER_COUNT}" \
-e"user_count=${USER_COUNT}" \
-e"module_type=${MODULE_TYPE}" \
-e"wzh_registry_server=${WZH_REGISTRY_SERVER}" \
-e"wzh_gogs_server=${WZH_GOGS_SERVER}" \
-e"ansible_python_interpreter=/usr/bin/python3" \
-e"subdomain_base=${WZH_SUBDOMIN_BASE}" \
-v \
-e"ACTION=create"
# the demo websites in the lab use some online static files; if the client browsers cannot reach the
# internet (or cannot reach it "fast" enough), these online resources need dns records that resolve to
# the platform router. the offline installation media contains a static-html image to serve these files.
# at.alicdn.com
# maxcdn.bootstrapcdn.com
# cdnjs.cloudflare.com
# ajax.googleapis.com
# code.jquery.com
# the steps below remove ccn; note that most of the operators will not be removed.
# a TARGET_HOST is specified in the command line, without using an inventory file
ansible-playbook -i localhost, ./configs/ocp-workloads/ocp-workload.yml \
-e"ansible_ssh_private_key_file=${SSH_KEY}" \
-e"ansible_user=root" \
-e"ocp_username=${OCP_USERNAME}" \
-e"ocp_workload=${WORKLOAD}" \
-e"silent=False" \
-e"guid=${GUID}" \
-e"num_users=${USER_COUNT}" \
-e"user_count=${USER_COUNT}" \
-e"module_type=${MODULE_TYPE}" \
-e"wzh_registry_server=${WZH_REGISTRY_SERVER}" \
-e"wzh_gogs_server=${WZH_GOGS_SERVER}" \
-e"ansible_python_interpreter=/usr/bin/python3" \
-e"subdomain_base=${WZH_SUBDOMIN_BASE}" \
-v \
-e"ACTION=remove"
Other notes
yum install -y wget jq
# Keycloak credentials: admin / 2kBdjDwcZK94
# STACK_ID: stacksq1xbet4os1uioep
manually patch image stream
- jenkins:2 to registry.redhat.ren/ocp4/openshift4@sha256:*****
- jenkins:latest to registry.redhat.ren/ocp4/openshift4@sha256:*****
tips
oc get istio-io -n opentlc-mgr-tutorial
oc new-build -i openshift/redhat-openjdk18-openshift:1.5 --binary --name=inventory-quarkus -l app=inventory-quarkus
npm run nodeshift -- --dockerImage=registry.redhat.ren:5443/docker.io/wangzheng422/cloudnative-workspaces-quarkus --imageTag=nodejs-10-2020-07-16-2155
todo
- PPT
Building the offline ccn (containered cloud native)
Basic approach
- an offline github is needed
- so far, gogs does not show up in the offline deployment scripts.
- deploy gogs outside the cluster, without an external database; later, consider how to deploy it inside the cluster and how to import the pv
- study the gogs api to batch-create users and projects (a hedged sketch follows the repo migration calls below)
- an offline maven proxy is needed
- so far it is not included in the offline scripts, but crw has a configuration that points to the offline proxy, so it looks easy to do.
- deploy nexus outside the cluster.
- the various images are needed
- so far most of them are consumed through image streams, which actually makes this easier
additional need:
- maven repository cache
- github clone site
- https://github.com/wangzheng422/cloud-native-workshop-v2m1-guides
- https://github.com/wangzheng422/cloud-native-workshop-v2m2-guides
- https://github.com/wangzheng422/cloud-native-workshop-v2m3-guides
- https://github.com/RedHat-Middleware-Workshops/cloud-native-workshop-v2m4-guides
- https://github.com/wangzheng422/cloud-native-workshop-v2-infra
- branch: dev-ocp-4.2
- https://github.com/wangzheng422/cloud-native-workshop-v2m1-labs
- https://github.com/wangzheng422/cloud-native-workshop-v2m2-labs
- https://github.com/wangzheng422/cloud-native-workshop-v2m3-labs
- https://github.com/RedHat-Middleware-Workshops/cloud-native-workshop-v2m4-labs
image need:
- gitlab/gitlab-ce:latest
- quay.io/osevg/workshopper
- quay.io/openshiftlabs/rhamt-web-openshift-messaging-executor:4.2.1.Final
- quay.io/openshiftlabs/rhamt-web-openshift:4.2.1.Final
- registry.redhat.io/openshift-service-mesh/istio-rhel8-operator:1.0.3
- is: jenkins:2 from ocp 4.2 install
- is: quarkus-stack:1.3 quay.io/openshiftlabs/cloudnative-workspaces-quarkus:1.3 to change .m2/settings.xml to add my mirror
reference:
- https://github.com/RedHat-Middleware-Workshops/cloud-native-workshop-v2-infra/tree/ocp-3.11 , we use ocp-4.2 branch right now.
my upstream repository
- https://github.com/wangzheng422/cloud-native-workshop-v2-infra
- quay.io/wangzheng422/gogs-fs
- quay.io/wangzheng422/nexus-fs
build github clone site, using gogs
it seems gogs is not covered by the offline deployment scripts
# http://gogs.redhat.ren:10080/
yum install firewalld
systemctl enable firewalld
systemctl start firewalld
yum -y install podman pigz skopeo buildah
podman stop gogs || true
podman rm -fv gogs || true
podman stop nexus || true
podman rm -fv nexus || true
podman stop etherpad || true
podman rm -fv etherpad || true
podman image prune -a
cd /data/ccn
rm -rf /data/ccn/gogs
podman run -d --name gogs-fs --entrypoint "tail" docker.io/wangzheng422/gogs-fs:init -f /dev/null
podman cp gogs-fs:/gogs.tgz /data/ccn/
tar zxf gogs.tgz
podman rm -fv gogs-fs
firewall-cmd --permanent --add-port=10080/tcp
firewall-cmd --reload
firewall-cmd --list-all
podman run -d --name=gogs -p 10022:22 -p 10080:3000 -v /data/ccn/gogs:/data:Z docker.io/gogs/gogs
# Custom config '/data/ccn/gogs/gogs/conf/app.ini'
# find the access key in pwd file
export ACCESS_KEY=""
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m1-guides
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m2-guides
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m3-guides
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m4-guides
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m1-labs
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m2-labs
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m3-labs
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m4-labs
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://gogs.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m1-guides.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m1-guides"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://gogs.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m2-guides.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m2-guides"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://gogs.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m3-guides.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m3-guides"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://gogs.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m4-guides.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m4-guides"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://gogs.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m1-labs.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m1-labs"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://gogs.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m2-labs.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m2-labs"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://gogs.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m3-labs.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m3-labs"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://gogs.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m4-labs.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m4-labs"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://gogs.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/spring-projects/spring-petclinic.git"'", "uid": '"1"', "repo_name": "'"spring-petclinic"'" }'
podman logs -f gogs
podman stop gogs
podman rm -fv gogs
# bash demo.env.build.sh
cd /data/ccn
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
podman stop gogs
podman rm -fv gogs
tar cf - ./gogs | pigz -c > gogs.tgz
buildah from --name onbuild-container docker.io/library/centos:centos7
buildah copy onbuild-container gogs.tgz /
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container docker.io/wangzheng422/gogs-fs:$var_date
# buildah rm onbuild-container
buildah push docker.io/wangzheng422/gogs-fs:$var_date
echo "docker.io/wangzheng422/gogs-fs:$var_date"
build maven repository cache
# http://nexus.redhat.ren:8081
mkdir -p /data/ccn
cd /data/ccn
rm -rf /data/ccn/nexus
podman run -d --name nexus-fs --entrypoint "tail" docker.io/wangzheng422/nexus-fs:2020-10-25-0919 -f /dev/null
podman cp nexus-fs:/nexus.tgz /data/ccn/
tar zxf nexus.tgz ./
podman rm -fv nexus-fs
podman run -d -p 8081:8081 -it --name nexus -v /data/ccn/nexus:/nexus-data:Z docker.io/sonatype/nexus3:3.26.1
## change code ready workspace
# change maven settings.xml for maven proxy
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
# on vultr init stack image
# mkdir -p /data/ccn/workspaces
# cd /data/ccn/workspaces
# # wget -O settings.xml https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/settings.xml
# wget -O settings.xml https://raw.githubusercontent.com/wangzheng422/agnosticd/wzh-dev/ansible/roles/ocp4-workload-ccnrd/files/settings.xml
# wget -O .npmrc https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/.npmrc
# wget -O stack.Dockerfile https://raw.githubusercontent.com/wangzheng422/agnosticd/wzh-dev/ansible/roles/ocp4-workload-ccnrd/files/stack.Dockerfile
# buildah bud --squash --format=docker -t docker.io/wangzheng422/cloudnative-workspaces-quarkus:init-2.1 -f stack.Dockerfile .
# buildah push docker.io/wangzheng422/cloudnative-workspaces-quarkus:init-2.1
# on vultr, update the stack image
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
mkdir -p /data/ccn/workspaces
cd /data/ccn/workspaces
# /bin/cp -f /data/order-service.tgz ./
wget -O settings.xml https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/settings.xml
wget -O .npmrc https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/.npmrc
wget -O .bowerrc https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/.bowerrc
wget --no-check-certificate --no-cache --no-cookies -O stack.Dockerfile https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/stack.dev.Dockerfile
buildah bud --squash --format=docker -t docker.io/wangzheng422/cloudnative-workspaces-quarkus:$var_date -f stack.Dockerfile .
buildah push docker.io/wangzheng422/cloudnative-workspaces-quarkus:$var_date
# on site stack update
buildah from --name onbuild-container registry.redhat.ren:5443/docker.io/wangzheng422/cloudnative-workspaces-quarkus:2020-07-08-1594213447
buildah run onbuild-container /bin/rm -rf /tmp/*
buildah umount onbuild-container
buildah commit --rm --squash --format=docker onbuild-container registry.redhat.ren:5443/docker.io/wangzheng422/cloudnative-workspaces-quarkus:$var_date
# buildah rm onbuild-container
buildah push registry.redhat.ren:5443/docker.io/wangzheng422/cloudnative-workspaces-quarkus:$var_date
echo "registry.redhat.ren:5443/docker.io/wangzheng422/cloudnative-workspaces-quarkus:$var_date"
# get nexus fs
podman stop nexus
podman rm -fv nexus
cd /data/ccn
tar cf - ./nexus | pigz -c > nexus.tgz
buildah from --name onbuild-container docker.io/library/centos:centos7
buildah copy onbuild-container nexus.tgz /
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container docker.io/wangzheng422/nexus-fs:$var_date
# buildah rm onbuild-container
buildah push docker.io/wangzheng422/nexus-fs:$var_date
echo "docker.io/wangzheng422/nexus-fs:$var_date"
# docker.io/wangzheng422/nexus-fs:2020-10-25-0919
nodejs image
# on vultr, update the stack image
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
mkdir -p /data/ccn/workspaces
cd /data/ccn/workspaces
# /bin/cp -f /data/order-service.tgz ./
wget -O settings.xml https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/settings.xml
wget -O .npmrc https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/.npmrc
wget --no-check-certificate --no-cache --no-cookies -O stack.Dockerfile https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/nodejs-10.Dockerfile
buildah bud --format=docker -t docker.io/wangzheng422/cloudnative-workspaces-quarkus:nodejs-10-$var_date -f stack.Dockerfile .
buildah push docker.io/wangzheng422/cloudnative-workspaces-quarkus:nodejs-10-$var_date
build static html file
# get source to image
# https://github.com/openshift/source-to-image
wget -O source-to-image.tgz https://github.com/openshift/source-to-image/releases/download/v1.3.0/source-to-image-v1.3.0-eed2850f-linux-amd64.tar.gz
tar zvxf source-to-image.tgz
mv s2i /usr/local/bin/
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
rm -rf /data/ccn/static-html
mkdir -p /data/ccn/static-html/files
cd /data/ccn/static-html/files
mkdir -p bootstrap/3.3.5/css/
wget -O bootstrap/3.3.5/css/bootstrap.min.css https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css
wget -O bootstrap/3.3.5/css/bootstrap-theme.min.css https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap-theme.min.css
mkdir -p bootstrap/3.3.5/js/
wget -O bootstrap/3.3.5/js/bootstrap.min.js https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/js/bootstrap.min.js
mkdir -p ajax/libs/jquery/2.1.4/
wget -O ajax/libs/jquery/2.1.4/jquery.min.js https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.js
mkdir -p bootstrap/3.3.5/fonts/
wget -O bootstrap/3.3.5/fonts/glyphicons-halflings-regular.woff2 https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/glyphicons-halflings-regular.woff2
wget -O bootstrap/3.3.5/fonts/glyphicons-halflings-regular.woff https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/glyphicons-halflings-regular.woff
wget -O bootstrap/3.3.5/fonts/glyphicons-halflings-regular.ttf https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/glyphicons-halflings-regular.ttf
mkdir -p t/
wget -O t/font_148784_v4ggb6wrjmkotj4i.woff https://at.alicdn.com/t/font_148784_v4ggb6wrjmkotj4i.woff
wget -O t/font_148784_v4ggb6wrjmkotj4i.ttf https://at.alicdn.com/t/font_148784_v4ggb6wrjmkotj4i.ttf
mkdir -p bootstrap/4.0.0-beta/css/
wget -O bootstrap/4.0.0-beta/css/bootstrap.min.css https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-beta/css/bootstrap.min.css
mkdir -p ajax/libs/patternfly/3.24.0/css/
wget -O ajax/libs/patternfly/3.24.0/css/patternfly.min.css https://cdnjs.cloudflare.com/ajax/libs/patternfly/3.24.0/css/patternfly.min.css
wget -O ajax/libs/patternfly/3.24.0/css/patternfly-additions.min.css https://cdnjs.cloudflare.com/ajax/libs/patternfly/3.24.0/css/patternfly-additions.min.css
wget -O jquery-3.2.1.min.js https://code.jquery.com/jquery-3.2.1.min.js
mkdir -p ajax/libs/jquery-timeago/1.6.1/
wget -O ajax/libs/jquery-timeago/1.6.1/jquery.timeago.min.js https://cdnjs.cloudflare.com/ajax/libs/jquery-timeago/1.6.1/jquery.timeago.min.js
mkdir -p ajax/libs/angularjs/1.4.8/
wget -O ajax/libs/angularjs/1.4.8/angular.min.js https://ajax.googleapis.com/ajax/libs/angularjs/1.4.8/angular.min.js
cd /data/ccn/static-html/
s2i build --rm files/ registry.redhat.io/rhscl/nginx-114-rhel7:latest nginx-sample-app
docker tag nginx-sample-app docker.io/wangzheng422/cloudnative-workspaces-quarkus:swap-$var_date
docker push docker.io/wangzheng422/cloudnative-workspaces-quarkus:swap-$var_date
echo docker.io/wangzheng422/cloudnative-workspaces-quarkus:swap-$var_date
wget -O mime.types https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/mime.types
wget -O nginx.conf https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/nginx.conf
cat << EOF > nginx.Dockerfile
FROM docker.io/wangzheng422/cloudnative-workspaces-quarkus:swap-$var_date
USER root
COPY mime.types /etc/nginx/
COPY nginx.conf /etc/nginx/
USER 1001
EOF
buildah bud --format=docker -t docker.io/wangzheng422/cloudnative-workspaces-quarkus:static-html-$var_date -f nginx.Dockerfile .
buildah push docker.io/wangzheng422/cloudnative-workspaces-quarkus:static-html-$var_date
echo "docker.io/wangzheng422/cloudnative-workspaces-quarkus:static-html-$var_date"
docker image prune -f
podman image prune -a
# oc -n labs-infra create route edge static-html-0 --service=static-html --hostname=maxcdn.bootstrapcdn.com
# oc -n labs-infra create route edge static-html-1 --service=static-html --hostname=ajax.googleapis.com
# oc -n labs-infra create route edge static-html-2 --service=static-html --hostname=at.alicdn.com
# oc -n labs-infra create route edge static-html-3 --service=static-html --hostname=cdnjs.cloudflare.com
# oc -n labs-infra create route edge static-html-4 --service=static-html --hostname=code.jquery.com
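If those routes are created, curl with --resolve can confirm the cdn hostnames are really served by the cluster router without touching dns; a hedged check where 1.2.3.4 is a placeholder for the ingress router ip:
# should return 200 once the static-html service and the edge route are in place
curl -k -o /dev/null -w '%{http_code}\n' \
  --resolve maxcdn.bootstrapcdn.com:443:1.2.3.4 \
  https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css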
pip for agnosticd
# on vultr perpare pip
# https://www.linuxtechi.com/use-ansible-galaxy-roles-ansible-playbook/
# https://docs.ansible.com/ansible/latest/scenario_guides/guide_kubernetes.html
# https://stackoverflow.com/questions/11091623/how-to-install-packages-offline
# https://www.activestate.com/resources/quick-reads/how-to-update-all-python-packages/
# yum install -y python2-pip
mkdir -p /data/pip3
cd /data/pip3
# pip install --upgrade pip
pip3 install --user --upgrade kubernetes openshift requests
pip3 freeze > requirements.txt
pip3 install -r requirements.txt --upgrade
mkdir -p wheelhouse
pip3 download -r requirements.txt -d wheelhouse
/bin/cp -f requirements.txt wheelhouse/
tar -zcf wheelhouse.tar.gz wheelhouse
var_date=$(date '+%Y-%m-%d')
echo $var_date
buildah from --name onbuild-container scratch
buildah copy onbuild-container wheelhouse.tar.gz /
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container docker.io/wangzheng422/base-fs:pip3-whl-$var_date
# buildah rm onbuild-container
buildah push docker.io/wangzheng422/base-fs:pip3-whl-$var_date
echo "docker.io/wangzheng422/base-fs:pip3-whl-$var_date"
labs sync
rsync -e ssh --info=progress2 -P --delete -arz bastion.fd21.example.opentlc.com:/data/ccn/nexus/ /data/ccn/nexus/
rsync -e ssh -P --delete -arz root@bastion.fd21.example.opentlc.com:/data/ccn/nexus/ ./nexus/
rsync -e ssh -P --delete -arz ./nexus/ root@192.168.7.11:/data/ccn/nexus/
chown -R 200:root nexus
rsync -e ssh --info=progress2 -P --delete -arz 192.168.252.11:/data/ccn/nexus/ ./nexus/
other tips
find the objects that block deleting a namespace/project
- https://access.redhat.com/solutions/4165791
PROJECT_NAME=user1-cloudnativeapps
oc api-resources --verbs=list --namespaced -o name | xargs -n 1 oc get --show-kind --ignore-not-found -n $PROJECT_NAME
oc api-resources --verbs=list --cached --namespaced -o name | xargs -n 1 oc get --show-kind --ignore-not-found -n $PROJECT_NAME
configuration.serving.knative.dev/payment
service.serving.knative.dev/payment
route.serving.knative.dev/payment
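When a project hangs in Terminating because of leftover knative objects like the ones above, clearing their finalizers usually lets the deletion finish; a hedged sketch (it bypasses the knative controllers, so use with care):
# strip finalizers from the stuck serving.knative.dev resources in the project
for r in configuration.serving.knative.dev/payment service.serving.knative.dev/payment route.serving.knative.dev/payment; do
  oc patch $r -n $PROJECT_NAME --type=merge -p '{"metadata":{"finalizers":[]}}'
done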
service mesh & knative
oc project istio-system
oc get pod -o json | jq -r '.items[].spec.containers[].image' > tmp.list
oc project istio-operator
oc get pod -o json | jq -r '.items[].spec.containers[].image' >> tmp.list
oc project knative-eventing
oc get pod -o json | jq -r '.items[].spec.containers[].image' >> tmp.list
oc project knative-serving
oc get pod -o json | jq -r '.items[].spec.containers[].image' >> tmp.list
oc project tekton-pipelines
oc get pod -o json | jq -r '.items[].spec.containers[].image' >> tmp.list
oc get pod -o json | jq -r '.items[].spec.initContainers[].image' >> tmp.list
oc project openshift-operators
oc get pod -o json | jq -r '.items[].spec.containers[].image' >> tmp.list
cat tmp.list | sort | uniq
oc project user0-catalog
oc get pod -o json | jq -r '.items[].spec.containers[].image'| sort | uniq
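The deduplicated image list can then be mirrored into the offline registry with the same skopeo pattern used elsewhere in this repo; a minimal sketch, assuming registry.redhat.ren:5443 as the target and that digest-only references are retagged by hand first:
# mirror every image referenced by the running pods into the offline registry
sort tmp.list | uniq | while read -r var_image; do
  skopeo copy docker://${var_image} docker://registry.redhat.ren:5443/${var_image}
done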
try the install shell
cd
git clone https://github.com/wangzheng422/cloud-native-workshop-v2-infra
cd cloud-native-workshop-v2-infra
git fetch origin
git checkout -b dev-ocp-4.2 origin/dev-ocp-4.2
# in local vm
rsync -e ssh --info=progress2 -P --delete -arz /data/registry-add root@base-pvg.redhat.ren:/data/
# on base-pvg
ansible localhost -m lineinfile -a 'path=/etc/hosts line="127.0.0.1 registry-add.redhat.ren"'
cat > /etc/dnsmasq.d/origin-upstream-dns.conf << EOF
server=10.66.208.137
EOF
systemctl restart dnsmasq
podman run -d --name mirror-registry \
-p 5000:5000 --restart=always \
-v /data/registry-add:/var/lib/registry:z \
-v /etc/crts/:/certs:z \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/redhat.ren.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/redhat.ren.key \
registry:2
###
skopeo copy docker://docker.io/wangzheng422/gogs-fs:2020-01-01 docker://registry.redhat.ren/docker.io/wangzheng422/gogs-fs:2020-01-01
skopeo copy docker://docker.io/wangzheng422/nexus-fs:2020-01-01 docker://registry.redhat.ren/docker.io/wangzheng422/nexus-fs:2020-01-01
# spring.datasource.initialization-mode: always
tips
- spring.datasource.initialization-mode=always
- prometheus: [ url ]
nodejs
git clone https://github.com/wangzheng422/cloud-native-workshop-v2m4-labs && cd cloud-native-workshop-v2m4-labs && git checkout ocp-4.4 && cd coolstore-ui
cat << EOF > Dockerfile
FROM docker.io/wangzheng422/cloudnative-workspaces-quarkus:nodejs-10-2020-07-16-2155
# Add application sources to a directory that the assemble script expects them
# and set permissions so that the container runs without root access
USER 0
ADD . /tmp/src
RUN chown -R 1001:0 /tmp/src
USER 1001
# Install the dependencies
RUN /usr/libexec/s2i/assemble
# Set the default command for the resulting image
CMD /usr/libexec/s2i/run
EOF
cat << "EOF" > post_install.sh
#!/bin/bash
var_new_domain="static-html-labs-infra.apps.redhat.container-contest.top"
var_new_domain_enc=$(echo $var_new_domain | sed "s/\./\\\./g")
# node_modules/.bin/bower install
# grep -rni "at.alicdn.com" *
# grep -rl 'at.alicdn.com' * | xargs sed -i "s/at\.alicdn\.com/$var_new_domain_enc/g"
grep -rl 'code.jquery.com' * | xargs sed -i "s/code\.jquery\.com/$var_new_domain_enc/g"
grep -rni "code.jquery.com" * || true
EOF
# change package.json
# change the postinstall script to run the shell above
# and try to fix domain issues.
podman build -t node-app .
The following were detours (dead ends)
build github clone site, using gitlab
yum -y install podman
rm -rf /data/ccn/gitlab
mkdir -p /data/ccn/gitlab/config
mkdir -p /data/ccn/gitlab/logs
mkdir -p /data/ccn/gitlab/data
# podman run --detach \
# --hostname local.redhat.ren \
# --env GITLAB_OMNIBUS_CONFIG="external_url 'http://local.redhat.ren:7080/'; gitlab_rails['lfs_enabled'] = true;" \
# --publish 7443:443 --publish 7080:80 --publish 7022:22 \
# --name gitlab \
# --restart always \
# --volume /data/ocp4/demo/gitlab/config:/etc/gitlab:Z \
# --volume /data/ocp4/demo/gitlab/logs:/var/log/gitlab:Z \
# --volume /data/ocp4/demo/gitlab/data:/var/opt/gitlab:Z \
# gitlab/gitlab-ce:latest
podman run --detach \
--hostname local.redhat.ren \
--publish 7443:443 --publish 7080:80 --publish 7022:22 \
--name gitlab \
--restart always \
--volume /data/ccn/gitlab/config:/etc/gitlab:Z \
--volume /data/ccn/gitlab/logs:/var/log/gitlab:Z \
--volume /data/ccn/gitlab/data:/var/opt/gitlab:Z \
gitlab/gitlab-ce:latest
# set default username / password
# root / redhat2019
podman stop gitlab
podman rm -fv gitlab
cd /data/ccn
# tar zcf gitlab.tgz ./gitlab
cat << EOF > /data/ccn/gitlab.files.Dockerfile
FROM registry.redhat.io/ubi7/ubi
COPY gitlab /gitlab
EOF
podman build --no-cache -f /data/ccn/gitlab.files.Dockerfile -t quay.io/wangzheng422/gitlab-fs /data/ccn/
podman push quay.io/wangzheng422/gitlab-fs
podman exec -it gitlab update-permissions
podman restart gitlab
podman logs -f gitlab
getfacl /data/ccn/gitlab/
# now we try to use it
rm -rf /data/ccn/gitlab
podman run -d --name gitlab-fs --entrypoint "tail" quay.io/wangzheng422/gitlab-fs -f /dev/null
podman cp gitlab-fs:/gitlab /data/ccn/
podman rm -fv gitlab-fs
# tar zxf gitlab.tgz
# chown -R root: /data/ccn/gitlab/
containered cloud-native (ccn) roadshow offline deployment
CCN is a nice set of training material that demonstrates ci/cd, cloud-native, istio and serverless on top of openshift, and the teaching content is very rich.
Module 1 focuses on how to split a monolithic application and how to bring the resulting services to the cloud.
Module 2 explains how to debug online and how to monitor the applications running on the cloud.
Module 3 converts the application to a service mesh (istio) architecture.
Module 4 develops the application with a serverless (knative) architecture.
Training videos
- The Containers and Cloud-Native Roadshow Dev Track - Module 1
- The Containers and Cloud-Native Roadshow Dev Track - Module 2
- The Containers and Cloud-Native Roadshow Dev Track - Module 3
- The Containers and Cloud-Native Roadshow Dev Track - Module 4
Installation videos
The upstream CCN, however, is based on the rh demo system and has to be online, so an offline version was made here for customers to use in disconnected environments.
Offline deployment architecture
This offline CCN is built on ocp 4.6.9. There are 4 modules in total.
Making CCN work offline mainly involves the following 3 parts:
- offline github
- offline maven and npm
- offline copies of the required images
The deployment architecture in the lab is as follows, for reference:
As you can see, it is no different from the standard deployment architecture, except that gogs and nexus are added on the helper node.
Installation media download
Download the installation media from the link below; note that it is built on ocp 4.6.9.
Link: https://pan.baidu.com/s/1jJU0HLnZMnvCNMNq1OEDxA password: uaaw
It contains the following types of files:
- ocp4.tgz contains the iso and other installation media, the various install scripts, the full list of downloaded images, and so on. It needs to be copied to the host machine and to the helper machine.
- registry.tgz is the packaged repository of the docker image registry. If you need to add images to it first, follow: 4.6.add.image.md
- nexus-image.tgz is the packaged nexus image repository; the cluster's image proxy points to nexus, and nexus provides the image cache
- poc.image.tgz contains extra images that supplement registry.tgz, mainly used by ccn; the list of added images is in poc.image.list, follow: 4.6.add.image.md
Because the files were split into 5G chunks for uploading, merge them after downloading with a command like:
cat registry.?? > registry.tgz
There may also be patch files on the Baidu drive; for example, a file named agnosticd.zip is a patch file: upload it to the helper and replace the file of the same name extracted from ocp4.tgz.
Courseware revisions
The courseware has been revised from the upstream project, mostly small changes to the source code so it works in a fully offline environment. If a step cannot be completed during a class, it is most likely caused by the offline environment; refer to the recorded teaching videos, which show how to work around such issues.
Deployment details of the base ocp4.6 environment
- install ocp4 following the offline procedure, paying special attention to these installation details
- deploy the nexus image registry proxy
- apply the offline registries.conf patch, pointing it at nexus
- configure a real (publicly signed) certificate for the ingress
- configure the image registry
- configure the samples operator and patch the image streams
- deploy the offline operator hub
ccn for ocp-4.6 installation steps
It is recommended to install the ccn courseware on a dedicated ocp4 cluster, because it enables several operators cluster-wide and those operators may affect other workloads on the cluster.
# on helper
# deploy gitea
export LOCAL_REG='registry.ocp4.redhat.ren:5443'
# export LOCAL_REG=''
# gogs_var_date='2020-07-06'
podman stop gitea
podman rm -fv gitea
mkdir -p /data/ccn/gitea
cd /data/ccn
podman create --name swap $LOCAL_REG/wangzheng422/gogs-fs:gitea-2020-12-26-1325 ls
podman cp swap:/gitea.tgz /data/ccn/gitea.tgz
podman rm -fv swap
tar zvxf gitea.tgz
rm -f gitea.tgz
chown -R 1000:1000 /data/ccn/gitea
podman run -d --name gitea \
-v /data/ccn/gitea:/data:Z \
-e USER_UID=1000 \
-e USER_GID=1000 \
-p 10080:3000 \
-p 10022:22 \
${LOCAL_REG}/gitea/gitea:1.13.0
# deploy nexus for maven
mkdir -p /data/ccn/nexus
cd /data/ccn/
podman create --name swap $LOCAL_REG/wangzheng422/nexus-fs:maven-2020-12-25-2024 ls
podman cp swap:/nexus.tgz /data/ccn/nexus.tgz
podman rm -fv swap
tar zvxf nexus.tgz
rm -f nexus.tgz
chown -R 200 /data/ccn/nexus
podman run -d -p 8081:8081 --name nexus -v /data/ccn/nexus:/nexus-data:Z $LOCAL_REG/sonatype/nexus3:3.29.0
# deploy etherpad for notes
mkdir -p /data/ccn/etherpad
chown -R 5001 /data/ccn/etherpad
podman run -d -p 9001:9001 -it --name etherpad -v /data/ccn/etherpad:/opt/etherpad-lite/var:z $LOCAL_REG/etherpad/etherpad:latest
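With gitea, nexus and etherpad running, a quick reachability check from the helper confirms the port mappings; a minimal sketch (the nexus status endpoint is an assumption based on the nexus3 REST API):
# gitea web ui
curl -sI http://localhost:10080 | head -n 1
# nexus status endpoint
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8081/service/rest/v1/status
# etherpad landing page
curl -sI http://localhost:9001 | head -n 1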
# deploy the mta vscode extension to the helper web server
mkdir -p /data/ccn/vscode
mkdir -p /var/www/html/ccn/
cd /data/ccn/vscode
podman create --name swap $LOCAL_REG/wangzheng422/imgs:mta-vscode-extension.vsix-2020-12-30-1012 ls
podman cp swap:/mta-vscode-extension.vsix /var/www/html/ccn/mta-vscode-extension.vsix
podman cp swap:/logo-eclipseche.svg /var/www/html/ccn/logo-eclipseche.svg
podman rm -fv swap
# agnosticd on helper
mkdir -p /data/pip3
cd /data/pip3
podman create --name swap $LOCAL_REG/wangzheng422/base-fs:pip3-whl-2020-07-05 ls
podman cp swap:/wheelhouse.tar.gz wheelhouse.tar.gz
tar vxf wheelhouse.tar.gz
podman rm -fv swap
pip3 install --user --upgrade -r wheelhouse/requirements.txt --no-index --find-links wheelhouse
# cluster certificate
# the ccn environment depends heavily on the ingress certificate; configure a real certificate signed by a public CA for *.apps.ocp4.redhat.ren
# install chrome on kvm host
wget https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm
yum install ./google-chrome-stable_current_*.rpm
google-chrome --no-sandbox --ignore-certificate-errors &
# fix js cache issue
cat << EOF >> /etc/hosts
127.0.0.1 maxcdn.bootstrapcdn.com ajax.googleapis.com at.alicdn.com cdnjs.cloudflare.com code.jquery.com
EOF
# parameters for installing the ccn environment
# oc login -u kubeadmin
oc login -u system:admin
# TARGET_HOST="bastion.rhte-b5c8.openshiftworkshop.com"
OCP_USERNAME="system:admin"
WORKLOAD="ocp4-workload-ccnrd"
GUID=b5c8
USER_COUNT=2
MODULE_TYPE="m1;m2;m3;m4"
SSH_KEY=~/.ssh/helper_rsa
WZH_SUBDOMIN_BASE=base.ocp4.redhat.ren
WZH_REGISTRY_SERVER=nexus.ocp4.redhat.ren:8083
WZH_GOGS_SERVER=git.ocp4.redhat.ren:10080
WZH_WEB_SERVER=helper.ocp4.redhat.ren:8080
ssh-copy-id -i ~/.ssh/helper_rsa.pub root@localhost
# create users
BASE_DIR="/data/install"
mkdir -p ${BASE_DIR}
cd ${BASE_DIR}
/bin/rm -f ${BASE_DIR}/htpasswd
touch ${BASE_DIR}/htpasswd
for i in $(seq 1 $USER_COUNT)
do
htpasswd -Bb ${BASE_DIR}/htpasswd user${i} redhat
done
oc create secret generic htpasswd --from-file=${BASE_DIR}/htpasswd -n openshift-config
oc apply -f - <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
name: cluster
spec:
identityProviders:
- name: HTPassword
mappingMethod: claim
type: HTPasswd
htpasswd:
fileData:
name: htpasswd
EOF
# oc delete secret htpasswd -n openshift-config
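# a hedged sketch to verify the htpasswd identity provider; user1/redhat comes from the
# loop above, and the authentication operator may take a minute to roll the change out
oc login -u user1 -p redhat "$(oc whoami --show-server)"
oc whoami
# switch back to the admin kubeconfig / system:admin afterwards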
# installation steps
# a TARGET_HOST is specified in the command line, without using an inventory file
oc project default
cd /data/ocp4/agnosticd/ansible
ansible-playbook -i localhost, ./configs/ocp-workloads/ocp-workload.yml \
-e"ansible_ssh_private_key_file=${SSH_KEY}" \
-e"ansible_user=root" \
-e"ocp_username=${OCP_USERNAME}" \
-e"ocp_workload=${WORKLOAD}" \
-e"silent=False" \
-e"guid=${GUID}" \
-e"num_users=${USER_COUNT}" \
-e"user_count=${USER_COUNT}" \
-e"module_type=${MODULE_TYPE}" \
-e"wzh_registry_server=${WZH_REGISTRY_SERVER}" \
-e"wzh_gogs_server=${WZH_GOGS_SERVER}" \
-e"wzh_web_server=${WZH_WEB_SERVER}" \
-e"ansible_python_interpreter=/usr/bin/python3" \
-e"subdomain_base=${WZH_SUBDOMIN_BASE}" \
-v \
-e"ACTION=create"
# the demo sites used in the labs reference some online static assets; if the student
# browsers cannot reach the internet (or cannot reach it fast enough), resolve these
# online resources in DNS to the platform router. the offline installation media
# contains a static-html image that serves these static files.
# at.alicdn.com
# maxcdn.bootstrapcdn.com
# cdnjs.cloudflare.com
# ajax.googleapis.com
# code.jquery.com
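# a hedged sketch of that DNS workaround on a student machine: point the CDN domains
# at the cluster ingress/router VIP (192.168.7.101 is an assumption, adjust to your lab)
cat << EOF >> /etc/hosts
192.168.7.101 maxcdn.bootstrapcdn.com ajax.googleapis.com at.alicdn.com cdnjs.cloudflare.com code.jquery.com
EOF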
# steps to remove ccn; note that most of the operators will not be removed.
# a TARGET_HOST is specified in the command line, without using an inventory file
cd /data/ocp4/agnosticd/ansible
ansible-playbook -i localhost, ./configs/ocp-workloads/ocp-workload.yml \
-e"ansible_ssh_private_key_file=${SSH_KEY}" \
-e"ansible_user=root" \
-e"ocp_username=${OCP_USERNAME}" \
-e"ocp_workload=${WORKLOAD}" \
-e"silent=False" \
-e"guid=${GUID}" \
-e"num_users=${USER_COUNT}" \
-e"user_count=${USER_COUNT}" \
-e"module_type=${MODULE_TYPE}" \
-e"wzh_registry_server=${WZH_REGISTRY_SERVER}" \
-e"wzh_gogs_server=${WZH_GOGS_SERVER}" \
-e"wzh_web_server=${WZH_WEB_SERVER}" \
-e"ansible_python_interpreter=/usr/bin/python3" \
-e"subdomain_base=${WZH_SUBDOMIN_BASE}" \
-v \
-e"ACTION=remove"
things to watch out for while doing the labs
# change git links to the corresponding addresses on gitea
# http://git.ocp4.redhat.ren:10080/root/cloud-native-workshop-v2m1-labs.git
# http://git.ocp4.redhat.ren:10080/root/cloud-native-workshop-v2m2-labs.git
# http://git.ocp4.redhat.ren:10080/root/cloud-native-workshop-v2m3-labs.git
# http://git.ocp4.redhat.ren:10080/root/cloud-native-workshop-v2m4-labs.git
# http://git.ocp4.redhat.ren:10080/root/vote-api.git
# http://git.ocp4.redhat.ren:10080/root/vote-ui.git
# wherever an oc command references an image, change it to the nexus address
oc new-build --docker-image=nexus.ocp4.redhat.ren:8083/ubi8/openjdk-11 --binary --name=catalog-springboot -l app=catalog-springboot
# in module 4, the nodeshift build command needs to be adjusted.
npm run nodeshift -- --dockerImage=nexus.ocp4.redhat.ren:8083/wangzheng422/imgs --imageTag=nodejs-10-wzh-2021-01-05
other notes
yum install -y wget jq
# Keycloak credentials: admin / 2kBdjDwcZK94
# STACK_ID: stacksq1xbet4os1uioep
todo
- PPT
building the offline ccn (containerized cloud-native) materials
basic approach
- an offline github replacement is needed
- for now, gogs is not reflected in the offline deployment scripts.
- gogs is deployed outside the cluster, without an external database; deploying it inside the cluster and importing the PV can be considered later.
- look into the gogs API to batch-create users and projects (see the sketch after this list).
- an offline maven proxy is needed
- for now it is not included in the offline scripts either, but crw has a configuration pointing at an offline proxy, so it looks doable.
- nexus is deployed outside the cluster.
- various container images are needed
- most of what is used are image streams, which actually makes it easier.
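A hedged sketch of batch-creating lab users through the standard Gitea admin API (the host, port, and admin token follow the values used elsewhere in this document; adjust the field values as needed):
export ACCESS_KEY="6d47a0172d53e567737f7a81bbb6dbff4c1565d1"
for i in $(seq 1 $USER_COUNT)
do
  curl -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" \
    -H "Content-Type: application/json" \
    -X POST http://git.ocp4.redhat.ren:10080/api/v1/admin/users \
    -d '{"username": "user'$i'", "email": "user'$i'@redhat.ren", "password": "redhat", "must_change_password": false}'
done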
additional need:
- maven repository cache
- github clone site
- https://github.com/wangzheng422/cloud-native-workshop-v2m1-guides
- https://github.com/wangzheng422/cloud-native-workshop-v2m2-guides
- https://github.com/wangzheng422/cloud-native-workshop-v2m3-guides
- https://github.com/RedHat-Middleware-Workshops/cloud-native-workshop-v2m4-guides
- https://github.com/wangzheng422/cloud-native-workshop-v2-infra
- branch: dev-ocp-4.2
- https://github.com/wangzheng422/cloud-native-workshop-v2m1-labs
- https://github.com/wangzheng422/cloud-native-workshop-v2m2-labs
- https://github.com/wangzheng422/cloud-native-workshop-v2m3-labs
- https://github.com/RedHat-Middleware-Workshops/cloud-native-workshop-v2m4-labs
image need:
- registry.redhat.io/openshift-service-mesh/istio-rhel8-operator:1.0.3
- is: jenkins:2 from ocp 4.2 install
- is: quarkus-stack:1.3 quay.io/openshiftlabs/cloudnative-workspaces-quarkus:1.3 to change .m2/settings.xml to add my mirror
reference:
- https://github.com/RedHat-Middleware-Workshops/cloud-native-workshop-v2-infra/tree/ocp-3.11 , we use ocp-4.2 branch right now.
my upstream repository
- quay.io/wangzheng422/gogs-fs
- quay.io/wangzheng422/nexus-fs
build github clone site, using gitea
It seems gitea is not included in the offline deployment scripts.
# http://git.ocp4.redhat.ren:10080/
cat << EOF >> /etc/hosts
127.0.0.1 registry.ocp4.redhat.ren nexus.ocp4.redhat.ren git.ocp4.redhat.ren
EOF
yum install -y firewalld
systemctl disable --now firewalld
# systemctl start firewalld
yum -y install podman pigz skopeo buildah
podman image prune -a
############################################
# build init fs
mkdir -p /data/ccn/gitea
cd /data/ccn
rm -rf /data/ccn/gitea
mkdir -p /data/ccn/gitea
chown -R 1000:1000 /data/ccn/gitea
podman run -d --name gitea \
-v /data/ccn/gitea:/data:Z \
-e USER_UID=1000 \
-e USER_GID=1000 \
-p 10080:3000 \
-p 10022:22 \
docker.io/gitea/gitea:1.13.0
# admin user: root / redhat
# api call token : 6d47a0172d53e567737f7a81bbb6dbff4c1565d1
cd /data/ccn
tar cf - ./gitea | pigz -c > gitea.tgz
buildah from --name onbuild-container scratch
buildah copy onbuild-container gitea.tgz /
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container docker.io/wangzheng422/gogs-fs:gitea-init
rm -f gitea.tgz
buildah push docker.io/wangzheng422/gogs-fs:gitea-init
echo "docker.io/wangzheng422/gogs-fs:gitea-init"
######################################################
# build gitea based on init fs
mkdir -p /data/ccn/gitea
cd /data/ccn
rm -rf /data/ccn/gitea
mkdir -p /data/ccn/gitea
chown -R 1000:1000 /data/ccn/gitea
cd /data/ccn
podman create --name swap docker.io/wangzheng422/gogs-fs:gitea-init ls
podman cp swap:/gitea.tgz - > gitea.tgz
podman rm -fv swap
tar zvxf gitea.tgz
rm -f gitea.tgz
chown -R 1000:1000 /data/ccn/gitea
podman run -d --name gitea \
-v /data/ccn/gitea:/data:Z \
-e USER_UID=1000 \
-e USER_GID=1000 \
-p 10080:3000 \
-p 10022:22 \
docker.io/gitea/gitea:1.13.0
# Custom config '/data/ccn/gogs/gogs/conf/app.ini'
# find the access key in pwd file
export ACCESS_KEY="6d47a0172d53e567737f7a81bbb6dbff4c1565d1"
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m1-guides
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m2-guides
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m3-guides
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m4-guides
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m1-labs
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m2-labs
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m3-labs
# curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X DELETE http://gogs.redhat.ren:10080/api/v1/repos/root/cloud-native-workshop-v2m4-labs
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://git.ocp4.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m1-guides.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m1-guides"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://git.ocp4.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m2-guides.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m2-guides"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://git.ocp4.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m3-guides.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m3-guides"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://git.ocp4.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m4-guides.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m4-guides"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://git.ocp4.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m1-labs.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m1-labs"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://git.ocp4.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m2-labs.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m2-labs"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://git.ocp4.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m3-labs.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m3-labs"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://git.ocp4.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/cloud-native-workshop-v2m4-labs.git"'", "uid": '"1"', "repo_name": "'"cloud-native-workshop-v2m4-labs"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://git.ocp4.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/spring-projects/spring-petclinic.git"'", "uid": '"1"', "repo_name": "'"spring-petclinic"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://git.ocp4.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/vote-api.git"'", "uid": '"1"', "repo_name": "'"vote-api"'" }'
curl -v -s -w '%{http_code}' -H "Authorization: token ${ACCESS_KEY}" -X POST http://git.ocp4.redhat.ren:10080/api/v1/repos/migrate \
-H "Content-Type: application/json" \
-d '{"clone_addr": "'"https://github.com/wangzheng422/vote-ui.git"'", "uid": '"1"', "repo_name": "'"vote-ui"'" }'
podman logs -f gitea
podman stop gitea
podman rm -fv gitea
# bash demo.env.build.sh
cd /data/ccn
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
tar cf - ./gitea | pigz -c > gitea.tgz
buildah from --name onbuild-container scratch
buildah copy onbuild-container gitea.tgz /
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container docker.io/wangzheng422/gogs-fs:gitea-$var_date
rm -f gitea.tgz
buildah push docker.io/wangzheng422/gogs-fs:gitea-$var_date
echo "docker.io/wangzheng422/gogs-fs:gitea-$var_date"
# docker.io/wangzheng422/gogs-fs:gitea-2021-01-06-0652
create an online nexus maven proxy
We use an online nexus proxy to cache maven artifacts.
- https://blog.csdn.net/kq1983/article/details/83066102
# get old fs
mkdir -p /data/ccn/nexus
cd /data/ccn/
podman create --name swap docker.io/wangzheng422/nexus-fs:2020-10-25-0919 ls
podman cp swap:/nexus.tgz - > /data/ccn/nexus.tgz
podman rm -fv swap
tar zvxf nexus.tgz
rm -f nexus.tgz
chown -R 200 /data/ccn/nexus
#####################################################
# init build the nexus fs
mkdir -p /data/ccn/nexus
chown -R 200 /data/ccn/nexus
podman run -d -p 8081:8081 --name nexus -v /data/ccn/nexus:/nexus-data:Z docker.io/sonatype/nexus3:3.29.0
podman stop nexus
podman rm nexus
# get the admin password
cat /data/ccn/nexus/admin.password && echo
# 8c9862da-5dcd-430c-a026-e3557539459a
# open http://nexus.ocp4.redhat.ren:8081
# add aliyun maven proxy
# https://blog.csdn.net/kq1983/article/details/83066102
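# instead of clicking through the UI, the aliyun proxy repository can also be created
# through the nexus REST API; a hedged sketch only -- the repository name, remote URL
# and payload values below are assumptions, and admin.password only exists until the
# password is changed on first login
curl -u admin:$(cat /data/ccn/nexus/admin.password) \
  -H "Content-Type: application/json" \
  -X POST http://localhost:8081/service/rest/v1/repositories/maven/proxy \
  -d '{"name":"aliyun","online":true,
       "storage":{"blobStoreName":"default","strictContentTypeValidation":true},
       "proxy":{"remoteUrl":"https://maven.aliyun.com/repository/public","contentMaxAge":-1,"metadataMaxAge":1440},
       "negativeCache":{"enabled":true,"timeToLive":1440},
       "httpClient":{"blocked":false,"autoBlock":true},
       "maven":{"versionPolicy":"RELEASE","layoutPolicy":"PERMISSIVE"}}'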
######################################################
# dump the nexus image fs out
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
cd /data/ccn
tar cf - ./nexus | pigz -c > nexus.tgz
buildah from --name onbuild-container scratch
buildah copy onbuild-container nexus.tgz /
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container docker.io/wangzheng422/nexus-fs:maven-$var_date
# buildah rm onbuild-container
rm -f nexus.tgz
buildah push docker.io/wangzheng422/nexus-fs:maven-$var_date
echo "docker.io/wangzheng422/nexus-fs:maven-$var_date"
# docker.io/wangzheng422/nexus-fs:maven-2021-01-06-1456
create code ready workspace image
CRW starts a container for each session, and that container image is the developer workbench; we customize it so that maven and friends all point at the internal proxy.
mkdir -p /data/ccn/workspaces
cd /data/ccn/workspaces
# /bin/cp -f /data/order-service.tgz ./
wget -O settings.xml https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.6/ccn/settings.xml
wget -O .npmrc https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.6/ccn/.npmrc
wget -O .bowerrc https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.6/ccn/.bowerrc
wget --no-check-certificate --no-cache --no-cookies -O stack.Dockerfile https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.6/ccn/stack.dev.Dockerfile
buildah bud --format=docker -t docker.io/wangzheng422/cloudnative-workspaces-quarkus:2.4.1-wzh -f stack.Dockerfile .
buildah push docker.io/wangzheng422/cloudnative-workspaces-quarkus:2.4.1-wzh
mta vscode extension
ccn 4.6 added a vscode extension, which also needs to be made available offline.
################################3
## build mta extension
# install nodejs
curl -sL https://rpm.nodesource.com/setup_10.x | sudo bash -
yum install -y nodejs
npm install -g typescript vsce
mkdir -p /data/ccn/vscode
cd /data/ccn/vscode
git clone https://github.com/wangzheng422/rhamt-vscode-extension
cd rhamt-vscode-extension
git checkout ocp-4.6-ccn
npm install
npm run vscode:prepublish
vsce package -o mta-vscode-extension.vsix
cp mta-vscode-extension.vsix ../
cd /data/ccn/vscode
###################################
## use redhat upstream
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
mkdir -p /data/ccn/vscode
cd /data/ccn/vscode
# wget -O mta-vscode-extension.vsix https://download.jboss.org/jbosstools/adapters/snapshots/mta-vscode-extension/mta-vscode-extension-0.0.48-662.vsix
wget https://www.eclipse.org/che/images/logo-eclipseche.svg
buildah from --name onbuild-container scratch
buildah copy onbuild-container mta-vscode-extension.vsix /
buildah copy onbuild-container logo-eclipseche.svg /
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container docker.io/wangzheng422/imgs:mta-vscode-extension.vsix-$var_date
cd /data/ccn
# rm -rf /data/ccn/vscode
buildah push docker.io/wangzheng422/imgs:mta-vscode-extension.vsix-$var_date
echo "docker.io/wangzheng422/imgs:mta-vscode-extension.vsix-$var_date"
# docker.io/wangzheng422/imgs:mta-vscode-extension.vsix-2020-12-30-1012
##############################
# use real upstream
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
buildah from --name onbuild-container quay.io/windupeng/mta-vscode-extension
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container docker.io/wangzheng422/imgs:mta-vscode-extension.base-$var_date
buildah push docker.io/wangzheng422/imgs:mta-vscode-extension.base-$var_date
echo "docker.io/wangzheng422/imgs:mta-vscode-extension.base-$var_date"
# docker.io/wangzheng422/imgs:mta-vscode-extension.base-2020-12-30-1340
# if you want to use a prebuilt newer version
# https://raw.githubusercontent.com/windup/rhamt-che-demo/master/meta.yaml
mkdir -p /data/ccn/vscode
cd /data/ccn/vscode
wget -O mta-vscode-extension.vsix https://download.jboss.org/jbosstools/adapters/snapshots/mta-vscode-extension/mta-vscode-extension-0.0.58-790.vsix
wget https://www.eclipse.org/che/images/logo-eclipseche.svg
buildah from --name onbuild-container scratch
buildah copy onbuild-container mta-vscode-extension.vsix /
buildah copy onbuild-container logo-eclipseche.svg /
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container docker.io/wangzheng422/imgs:mta-vscode-extension.vsix-0.0.48-662
cd /data/ccn
# rm -rf /data/ccn/vscode
buildah push docker.io/wangzheng422/imgs:mta-vscode-extension.vsix-0.0.48-662
oc get pod -o json | jq -r .items[0].metadata.name
oc get pod -o json | jq -r .items[0].spec.containers[].name
oc get pod -o json | jq -r .items[0].spec.initContainers[].name
oc rsh -c $(oc get pod -o json | jq -r '.items[0].spec.containers[] | select( .name | contains("rhamt-extension") ) | .name') $(oc get pod -o json | jq -r .items[0].metadata.name)
oc logs $(oc get pod -o json | jq -r .items[0].metadata.name) -c $(oc get pod -o json | jq -r '.items[0].spec.containers[] | select( .name | contains("rhamt-extension") ) | .name')
oc logs $(oc get pod -o json | jq -r .items[0].metadata.name) -c $(oc get pod -o json | jq -r '.items[0].spec.containers[] | select( .name | contains("theia-ide") ) | .name')
oc logs $(oc get pod -o json | jq -r .items[0].metadata.name) -c $(oc get pod -o json | jq -r '.items[0].spec.containers[] | select( .name | contains("vscode-quarkus") ) | .name')
oc logs $(oc get pod -o json | jq -r .items[0].metadata.name) -c $(oc get pod -o json | jq -r '.items[0].spec.containers[] | select( .name | contains("che-jwtproxy") ) | .name')
oc logs $(oc get pod -o json | jq -r .items[0].metadata.name) -c $(oc get pod -o json | jq -r '.items[0].spec.containers[] | select( .name | contains("quarkus-tools") ) | .name')
oc logs $(oc get pod -o json | jq -r .items[0].metadata.name) -c $(oc get pod -o json | jq -r '.items[0].spec.containers[] | select( .name | contains("che-machine-exe") ) | .name')
oc logs $(oc get pod -o json | jq -r .items[0].metadata.name) -c $(oc get pod -o json | jq -r '.items[0].spec.initContainers[] | select( .name | contains("remote-runtime-inject") ) | .name')
oc logs $(oc get pod -o json | jq -r .items[0].metadata.name) -c $(oc get pod -o json | jq -r '.items[0].spec.initContainers[] | select( .name | contains("pluginbroker-artifacts-rhel8") ) | .name')
oc exec $(oc get pod -o json | jq -r .items[0].metadata.name) -c $(oc get pod -o json | jq -r '.items[0].spec.containers[] | select( .name | contains("rhamt-extension") ) | .name') -- /usr/sbin/killall5
build static html file
# get source to image
# https://github.com/openshift/source-to-image
wget -O source-to-image.tgz https://github.com/openshift/source-to-image/releases/download/v1.3.0/source-to-image-v1.3.0-eed2850f-linux-amd64.tar.gz
tar zvxf source-to-image.tgz
mv s2i /usr/local/bin/
var_date=$(date '+%Y-%m-%d-%H%M')
echo $var_date
rm -rf /data/ccn/static-html
mkdir -p /data/ccn/static-html/files
cd /data/ccn/static-html/files
download_url() {
# https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css
var_url=$1
# bootstrap/3.3.5/css/bootstrap.min.css
var_file=${var_url#*.*/}
# bootstrap/3.3.5/css
var_path=${var_file%/*}
mkdir -p $var_path
wget -O $var_file $var_url
}
download_url https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css
download_url https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap-theme.min.css
download_url https://maxcdn.bootstrapcdn.com/font-awesome/4.4.0/css/font-awesome.min.css
download_url https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/js/bootstrap.min.js
download_url https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-beta/css/bootstrap.min.css
download_url https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.map
download_url https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.map
download_url https://ajax.googleapis.com/ajax/libs/angularjs/1.4.8/angular.min.js
download_url https://at.alicdn.com/t/font_148784_v4ggb6wrjmkotj4i.woff
download_url https://at.alicdn.com/t/font_148784_v4ggb6wrjmkotj4i.ttf
download_url https://cdnjs.cloudflare.com/ajax/libs/patternfly/3.24.0/css/patternfly.min.css
download_url https://cdnjs.cloudflare.com/ajax/libs/patternfly/3.24.0/css/patternfly-additions.min.css
download_url https://cdnjs.cloudflare.com/ajax/libs/jquery-cookie/1.4.1/jquery.cookie.js
download_url https://cdnjs.cloudflare.com/ajax/libs/jquery-timeago/1.6.1/jquery.timeago.min.js
wget -O jquery-3.2.1.min.js https://code.jquery.com/jquery-3.2.1.min.js
wget -O jquery-latest.min.js http://code.jquery.com/jquery-latest.min.js
mkdir -p bootstrap/3.3.5/fonts/
wget -O bootstrap/3.3.5/fonts/glyphicons-halflings-regular.woff2 https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/glyphicons-halflings-regular.woff2
wget -O bootstrap/3.3.5/fonts/glyphicons-halflings-regular.woff https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/glyphicons-halflings-regular.woff
wget -O bootstrap/3.3.5/fonts/glyphicons-halflings-regular.ttf https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/glyphicons-halflings-regular.ttf
cd /data/ccn/static-html/
s2i build --rm files/ registry.redhat.io/rhscl/nginx-114-rhel7:latest nginx-sample-app
docker tag nginx-sample-app docker.io/wangzheng422/cloudnative-workspaces-quarkus:swap-$var_date
docker push docker.io/wangzheng422/cloudnative-workspaces-quarkus:swap-$var_date
echo docker.io/wangzheng422/cloudnative-workspaces-quarkus:swap-$var_date
wget -O mime.types https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/mime.types
wget -O nginx.conf https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.4/ccn/nginx.conf
cat << EOF > nginx.Dockerfile
FROM docker.io/wangzheng422/cloudnative-workspaces-quarkus:swap-$var_date
USER root
COPY mime.types /etc/nginx/
COPY nginx.conf /etc/nginx/
USER 1001
EOF
buildah bud --format=docker -t docker.io/wangzheng422/cloudnative-workspaces-quarkus:static-html-$var_date -f nginx.Dockerfile .
buildah push docker.io/wangzheng422/cloudnative-workspaces-quarkus:static-html-$var_date
echo "docker.io/wangzheng422/cloudnative-workspaces-quarkus:static-html-$var_date"
docker image prune -f
podman image prune -a
# oc -n labs-infra create route edge static-html-0 --service=static-html --hostname=maxcdn.bootstrapcdn.com
# oc -n labs-infra create route edge static-html-1 --service=static-html --hostname=ajax.googleapis.com
# oc -n labs-infra create route edge static-html-2 --service=static-html --hostname=at.alicdn.com
# oc -n labs-infra create route edge static-html-3 --service=static-html --hostname=cdnjs.cloudflare.com
# oc -n labs-infra create route edge static-html-4 --service=static-html --hostname=code.jquery.com
pip for agnosticd
# on vultr perpare pip
# https://www.linuxtechi.com/use-ansible-galaxy-roles-ansible-playbook/
# https://docs.ansible.com/ansible/latest/scenario_guides/guide_kubernetes.html
# https://stackoverflow.com/questions/11091623/how-to-install-packages-offline
# https://www.activestate.com/resources/quick-reads/how-to-update-all-python-packages/
# yum install -y python2-pip
mkdir -p /data/pip3
cd /data/pip3
# pip install --upgrade pip
pip3 install --user --upgrade kubernetes openshift requests
pip3 freeze --user > requirements.txt
# pip3 install -r requirements.txt --upgrade
mkdir -p wheelhouse
pip3 download -r requirements.txt -d wheelhouse
/bin/cp -f requirements.txt wheelhouse/
tar -zcf wheelhouse.tar.gz wheelhouse
var_date=$(date '+%Y-%m-%d')
echo $var_date
buildah from --name onbuild-container scratch
buildah copy onbuild-container wheelhouse.tar.gz /
buildah umount onbuild-container
buildah commit --rm --format=docker onbuild-container docker.io/wangzheng422/base-fs:pip3-whl-$var_date
# buildah rm onbuild-container
buildah push docker.io/wangzheng422/base-fs:pip3-whl-$var_date
echo "docker.io/wangzheng422/base-fs:pip3-whl-$var_date"
nodejs
# docker.io/wangzheng422/cloudnative-workspaces-quarkus:nodejs-10-2020-07-16-2155
# this docker file is build using nodejs-10.Dockerfile
mkdir -p /data/ccn/nodejs
cd /data/ccn/nodejs
var_date=$(date '+%Y-%m-%d')
echo $var_date
wget -O .npmrc https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.6/ccn/.npmrc
wget -O .bowerrc https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.6/ccn/.bowerrc
wget https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.6/ccn/nodejs-10.Dockerfile
buildah bud --format=docker -t docker.io/wangzheng422/imgs:nodejs-10-wzh-$var_date -f nodejs-10.Dockerfile .
buildah push docker.io/wangzheng422/imgs:nodejs-10-wzh-$var_date
echo "docker.io/wangzheng422/imgs:nodejs-10-wzh-$var_date"
# docker.io/wangzheng422/imgs:nodejs-10-wzh-2021-01-05
build dist
cd /data/ocp4
wget -O poc.image.list https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.6/ccn/poc.image.list
export MIRROR_DIR='/data/poc.image'
/bin/rm -rf ${MIRROR_DIR}
bash add.image.sh poc.image.list ${MIRROR_DIR}
labs sync
rsync -e ssh --info=progress2 -P --delete -arz bastion.fd21.example.opentlc.com:/data/ccn/nexus/ /data/ccn/nexus/
rsync -e ssh -P --delete -arz root@bastion.fd21.example.opentlc.com:/data/ccn/nexus/ ./nexus/
rsync -e ssh -P --delete -arz ./nexus/ root@192.168.7.11:/data/ccn/nexus/
chown -R 200:root nexus
rsync -e ssh --info=progress2 -P --delete -arz 192.168.252.11:/data/ccn/nexus/ ./nexus/
other tips
find objects blocking namespace/project deletion
- https://access.redhat.com/solutions/4165791
PROJECT_NAME=user1-cloudnativeapps
oc api-resources --verbs=list --namespaced -o name | xargs -n 1 oc get --show-kind --ignore-not-found -n $PROJECT_NAME
oc api-resources --verbs=list --cached --namespaced -o name | xargs -n 1 oc get --show-kind --ignore-not-found -n $PROJECT_NAME
configuration.serving.knative.dev/payment
service.serving.knative.dev/payment
route.serving.knative.dev/payment
service mesh & knative
oc project istio-system
oc get pod -o json | jq -r '.items[].spec.containers[].image' > tmp.list
oc project istio-operator
oc get pod -o json | jq -r '.items[].spec.containers[].image' >> tmp.list
oc project knative-eventing
oc get pod -o json | jq -r '.items[].spec.containers[].image' >> tmp.list
oc project knative-serving
oc get pod -o json | jq -r '.items[].spec.containers[].image' >> tmp.list
oc project tekton-pipelines
oc get pod -o json | jq -r '.items[].spec.containers[].image' >> tmp.list
oc get pod -o json | jq -r '.items[].spec.initContainers[].image' >> tmp.list
oc project openshift-operators
oc get pod -o json | jq -r '.items[].spec.containers[].image' >> tmp.list
cat tmp.list | sort | uniq
oc project user0-catalog
oc get pod -o json | jq -r '.items[].spec.containers[].image'| sort | uniq
the following turned out to be a dead end
build github clone site, using gitlab
yum -y install podman
rm -rf /data/ccn/gitlab
mkdir -p /data/ccn/gitlab/config
mkdir -p /data/ccn/gitlab/logs
mkdir -p /data/ccn/gitlab/data
# podman run --detach \
# --hostname local.redhat.ren \
# --env GITLAB_OMNIBUS_CONFIG="external_url 'http://local.redhat.ren:7080/'; gitlab_rails['lfs_enabled'] = true;" \
# --publish 7443:443 --publish 7080:80 --publish 7022:22 \
# --name gitlab \
# --restart always \
# --volume /data/ocp4/demo/gitlab/config:/etc/gitlab:Z \
# --volume /data/ocp4/demo/gitlab/logs:/var/log/gitlab:Z \
# --volume /data/ocp4/demo/gitlab/data:/var/opt/gitlab:Z \
# gitlab/gitlab-ce:latest
podman run --detach \
--hostname local.redhat.ren \
--publish 7443:443 --publish 7080:80 --publish 7022:22 \
--name gitlab \
--restart always \
--volume /data/ccn/gitlab/config:/etc/gitlab:Z \
--volume /data/ccn/gitlab/logs:/var/log/gitlab:Z \
--volume /data/ccn/gitlab/data:/var/opt/gitlab:Z \
gitlab/gitlab-ce:latest
# set default username / password
# root / redhat2019
podman stop gitlab
podman rm -fv gitlab
cd /data/ccn
# tar zcf gitlab.tgz ./gitlab
cat << EOF > /data/ccn/gitlab.files.Dockerfile
FROM registry.redhat.io/ubi7/ubi
COPY gitlab /gitlab
EOF
podman build --no-cache -f /data/ccn/gitlab.files.Dockerfile -t quay.io/wangzheng422/gitlab-fs /data/ccn/
podman push quay.io/wangzheng422/gitlab-fs
podman exec -it gitlab update-permissions
podman restart gitlab
podman logs -f gitlab
getfacl /data/ccn/gitlab/
# now we try to use it
rm -rf /data/ccn/gitlab
podman run -d --name gitlab-fs --entrypoint "tail" quay.io/wangzheng422/gitlab-fs -f /dev/null
podman cp gitlab-fs:/gitlab /data/ccn/
podman rm -fv gitlab-fs
# tar zxf gitlab.tgz
# chown -R root: /data/ccn/gitlab/
openshift 4.10 ACM with observability use case
By default, observability is disabled; here we will enable it and see what it looks like.
Here is the architecture of the acm observability:
create the acm hub cluster
- install a sno cluster with 16C, 64GB memory, and 2 100GB disks
- install ODF from operator hub, and create a ceph cluster following the steps here.
- install ACM from operator hub
- install multiclusterhub with default setting
create the managed cluster
install a sno cluster with 16C, 32GB memory.
NODE_SSH_KEY="$(cat ~/.ssh/id_rsa.pub)"
INSTALL_IMAGE_REGISTRY=quaylab.infra.redhat.ren:8443
PULL_SECRET='{"auths":{"registry.redhat.io": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"registry.ocp4.redhat.ren:5443": {"auth": "ZHVtbXk6ZHVtbXk=","email": "noemail@localhost"},"'${INSTALL_IMAGE_REGISTRY}'": {"auth": "'$( echo -n 'admin:shadowman' | openssl base64 )'","email": "noemail@localhost"}}}'
NTP_SERVER=192.168.7.11
HELP_SERVER=192.168.7.11
KVM_HOST=192.168.7.11
API_VIP=192.168.7.100
INGRESS_VIP=192.168.7.101
CLUSTER_PROVISION_IP=192.168.7.103
BOOTSTRAP_IP=192.168.7.12
ACM_DEMO_MNGED_CLUSTER=acm-demo-man01
ACM_DEMO_MNGED_SNO_IP=192.168.7.23
# define the node information for the single-node cluster
SNO_CLUSTER_NAME=acm-demo-man01
SNO_BASE_DOMAIN=redhat.ren
SNO_IP=192.168.7.23
SNO_GW=192.168.7.11
SNO_NETMAST=255.255.255.0
SNO_NETMAST_S=24
SNO_HOSTNAME=acm-demo-man01-master
SNO_IF=enp1s0
SNO_IF_MAC=`printf '00:60:2F:%02X:%02X:%02X' $[RANDOM%256] $[RANDOM%256] $[RANDOM%256]`
SNO_DNS=192.168.7.11
SNO_DISK=/dev/vda
SNO_CORE_PWD=redhat
echo ${SNO_IF_MAC} > /data/sno/sno.mac
# goto kvm host ( 103 )
scp root@192.168.7.11:/data/install/sno.iso /data/kvm/
virsh destroy ocp4-acm-man01
virsh undefine ocp4-acm-man01
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
create_lv vgdata poolA lvacm-man01 100G recreate
create_lv vgdata poolA lvacm-man01-data 100G recreate
SNO_MEM=32
virt-install --name=ocp4-acm-man01-master01 --vcpus=16 --ram=$(($SNO_MEM*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lvacm-man01,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lvacm-man01-data,device=disk,bus=virtio,format=raw \
--os-variant rhel8.3 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59003 \
--boot menu=on --cdrom /data/kvm/sno.iso
# INFO Install complete!
# INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
# INFO Access the OpenShift web-console here: https://console-openshift-console.apps.acm-demo-man01.redhat.ren
# INFO Login to the console with user: "kubeadmin", and password: "FohuH-IwyJe-3UQPL-AakHm"
# INFO Time elapsed: 0s
enable acm observability in the acm hub cluster
official document: acm observability enablement; it will enable observability on managed clusters automatically.
# try to install acm observ
# https://access.redhat.com/documentation/en-us/red_hat_advanced_cluster_management_for_kubernetes/2.4/html-single/observability/index
oc create namespace open-cluster-management-observability
DOCKER_CONFIG_JSON=`oc extract secret/pull-secret -n openshift-config --to=-`
oc create secret generic multiclusterhub-operator-pull-secret \
-n open-cluster-management-observability \
--from-literal=.dockerconfigjson="$DOCKER_CONFIG_JSON" \
--type=kubernetes.io/dockerconfigjson
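# the thanos secret below needs S3-style parameters (BUCKET_NAME, AWS_HOST, access keys);
# on a hub backed by ODF/noobaa they can come from an ObjectBucketClaim -- a hedged
# sketch only, the claim name obc-observ is an assumption
cat << EOF | oc apply -f -
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: obc-observ
  namespace: openshift-storage
spec:
  generateBucketName: obc-observ
  storageClassName: openshift-storage.noobaa.io
EOF
BUCKET_NAME=$(oc get configmap obc-observ -n openshift-storage -o jsonpath='{.data.BUCKET_NAME}')
AWS_HOST=$(oc get configmap obc-observ -n openshift-storage -o jsonpath='{.data.BUCKET_HOST}')
AWS_ACCESS_KEY_ID=$(oc get secret obc-observ -n openshift-storage -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d)
AWS_SECRET_ACCESS_KEY=$(oc get secret obc-observ -n openshift-storage -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d)
# depending on how the s3 endpoint is exposed, you may need to append BUCKET_PORT from the same configmap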
cat << EOF > /data/install/acm.observ.secret.yaml
---
apiVersion: v1
kind: Secret
metadata:
name: thanos-object-storage
namespace: open-cluster-management-observability
type: Opaque
stringData:
thanos.yaml: |
type: s3
config:
bucket: $BUCKET_NAME
endpoint: $AWS_HOST
insecure: true
access_key: $AWS_ACCESS_KEY_ID
secret_key: $AWS_SECRET_ACCESS_KEY
EOF
oc create -f /data/install/acm.observ.secret.yaml
# oc delete -f /data/install/acm.observ.secret.yaml
cat << EOF > /data/install/acm.observ.yaml
---
apiVersion: observability.open-cluster-management.io/v1beta2
kind: MultiClusterObservability
metadata:
name: observability
spec:
observabilityAddonSpec: {}
storageConfig:
metricObjectStorage:
name: thanos-object-storage
key: thanos.yaml
---
EOF
oc create -f /data/install/acm.observ.yaml -n open-cluster-management
# oc delete -f /data/install/acm.observ.yaml -n open-cluster-management
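# a hedged sanity check that the observability stack comes up
oc get multiclusterobservability observability
oc get pod -n open-cluster-management-observability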
import second cluster into acm, by using kubeconfig file
click 'Grafana' on top right, you will see the grafana dashboard
you can see there are 3 default dashboards included with acm; 2 of them are usable for ocp4
look at the 'ACM - Clusters Overview' dashboard
look at the 'ACM - Resource Optimization / Cluster' dashboard
for acm hub cluster:
for managed cluster:
ansible platform 2.1 install
The customer wants to install Ansible Automation Platform in a fully offline environment, so we install it in the simplest single-node mode. The whole process is: install the base OS from the rhel 8.5 DVD, use the DVD as the system's dnf source, then import 3 container images and start a local docker registry service.
Note that a single-node install needs at least 8GB of memory, otherwise the installer's preflight checks fail.
install the operating system and configure basic services
# install rhel 8.5 using dvd iso
# reboot, and set dvd iso as dnf source
blkid | grep sr0
# /dev/sr0: BLOCK_SIZE="2048" UUID="2021-10-13-03-57-25-00" LABEL="RHEL-8-5-0-BaseOS-x86_64" TYPE="iso9660" PTUUID="4d694e6c" PTTYPE="dos"
blkid /dev/sr0 -o value | sed -n 2p
# 2021-10-13-03-57-25-00
mkdir -p /media/cdrom
mount /dev/sr0 /media/cdrom
cat << EOF >> /etc/fstab
UUID=`blkid /dev/sr0 -o value | sed -n 2p` /media/cdrom iso9660 ro,user,auto 0 0
EOF
cat << EOF > /etc/yum.repos.d/dvd.repo
[dvd-base]
name=dvd-base
baseurl=file:///media/cdrom/BaseOS
enabled=1
gpgcheck=0
[dvd-app]
name=dvd-app
baseurl=file:///media/cdrom/AppStream
enabled=1
gpgcheck=0
EOF
# we need to setup a docker registry
# and we need copy docker registry image into the disconnected host
podman pull docker.io/library/registry:2
podman save docker.io/library/registry:2 | pigz -c > registry.tgz
podman load -i registry.tgz
# Loaded image(s): docker.io/library/registry:2
# this is for testing/demo purposes only;
# do not turn off firewalld on a production system
systemctl disable --now firewalld
cat << EOF >> /etc/hosts
127.0.0.1 registry.redhat.ren
EOF
# configure the registry
mkdir -p /etc/crts/ && cd /etc/crts
openssl genrsa -out /etc/crts/redhat.ren.ca.key 4096
openssl req -x509 \
-new -nodes \
-key /etc/crts/redhat.ren.ca.key \
-sha256 \
-days 36500 \
-out /etc/crts/redhat.ren.ca.crt \
-subj /CN="Local Red Hat Ren Signer" \
-reqexts SAN \
-extensions SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf '[SAN]\nbasicConstraints=critical, CA:TRUE\nkeyUsage=keyCertSign, cRLSign, digitalSignature'))
openssl genrsa -out /etc/crts/redhat.ren.key 2048
openssl req -new -sha256 \
-key /etc/crts/redhat.ren.key \
-subj "/O=Local Red Hat Ren /CN=*.ocp4.redhat.ren" \
-reqexts SAN \
-config <(cat /etc/pki/tls/openssl.cnf \
<(printf "\n[SAN]\nsubjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth")) \
-out /etc/crts/redhat.ren.csr
openssl x509 \
-req \
-sha256 \
-extfile <(printf "subjectAltName=DNS:*.ocp4.redhat.ren,DNS:*.apps.ocp4.redhat.ren,DNS:*.redhat.ren\nbasicConstraints=critical, CA:FALSE\nkeyUsage=digitalSignature, keyEncipherment, keyAgreement, dataEncipherment\nextendedKeyUsage=serverAuth") \
-days 36500 \
-in /etc/crts/redhat.ren.csr \
-CA /etc/crts/redhat.ren.ca.crt \
-CAkey /etc/crts/redhat.ren.ca.key \
-CAcreateserial -out /etc/crts/redhat.ren.crt
openssl x509 -in /etc/crts/redhat.ren.crt -text
/bin/cp -f /etc/crts/redhat.ren.ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
cd /data/ocp4
# systemctl stop docker-distribution
/bin/rm -rf /data/registry
mkdir -p /data/registry
podman run --restart always --name local-registry -p 5443:5443 \
-d --restart=always \
-v /data/registry/:/var/lib/registry:z \
-v /etc/crts:/certs:z \
-e REGISTRY_HTTP_ADDR=0.0.0.0:5443 \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/redhat.ren.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/redhat.ren.key \
docker.io/library/registry:2
install ansible platform
The official documentation is very clear, so just follow it. The only part it does not spell out for a fully offline install is how to handle the docker images, so that part is supplemented here.
# document is here
# https://access.redhat.com/documentation/en-us/red_hat_ansible_automation_platform/2.1/pdf/red_hat_ansible_automation_platform_installation_guide/red_hat_ansible_automation_platform-2.1-red_hat_ansible_automation_platform_installation_guide-en-us.pdf
# goto https://access.redhat.com/downloads/content/480
# to download Ansible Automation Platform 2.1.0 Setup Bundle
mkdir -p /data
cd /data
tar zvxf ansible-automation-platform-setup-bundle-2.1.0-1.tar.gz
cd /data/ansible-automation-platform-setup-bundle-2.1.0-1
podman load -i images/ee-29-rhel8.tgz
# Loaded image(s): registry.redhat.io/ansible-automation-platform-21/ee-29-rhel8:latest
podman load -i images/ee-minimal-rhel8.tgz
# Loaded image(s): registry.redhat.io/ansible-automation-platform-21/ee-minimal-rhel8:latest
podman load -i images/ee-supported-rhel8.tgz
# Loaded image(s): registry.redhat.io/ansible-automation-platform-21/ee-supported-rhel8:latest
podman tag registry.redhat.io/ansible-automation-platform-21/ee-29-rhel8:latest registry.redhat.ren:5443/ansible-automation-platform-21/ee-29-rhel8:latest
podman push registry.redhat.ren:5443/ansible-automation-platform-21/ee-29-rhel8:latest
podman tag registry.redhat.io/ansible-automation-platform-21/ee-minimal-rhel8:latest registry.redhat.ren:5443/ansible-automation-platform-21/ee-minimal-rhel8:latest
podman push registry.redhat.ren:5443/ansible-automation-platform-21/ee-minimal-rhel8:latest
podman tag registry.redhat.io/ansible-automation-platform-21/ee-supported-rhel8:latest registry.redhat.ren:5443/ansible-automation-platform-21/ee-supported-rhel8:latest
podman push registry.redhat.ren:5443/ansible-automation-platform-21/ee-supported-rhel8:latest
/bin/cp -f inventory inventory.bak
cat << EOF > inventory
[automationcontroller]
127.0.0.1 ansible_connection=local
[database]
[all:vars]
admin_password='password'
pg_host=''
pg_port=''
pg_database='awx'
pg_username='awx'
pg_password='password'
registry_url='registry.redhat.ren:5443'
EOF
./setup.sh -e gpgcheck=0
# login using admin / password
# open browser to https://172.16.218.2/
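# a hedged check that the controller API answers before opening the browser;
# /api/v2/ping/ is the standard controller endpoint
curl -k -s https://localhost/api/v2/ping/ | jq .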
That completes the installation; open a browser and go to https://
and log in with username admin and password password.
activate the subscription
On the first login after a fresh install, you are asked to activate with a subscription. Since this is an offline installation, there is an extra step of exporting an offline manifest from the Red Hat customer portal.
After logging in to ansible platform you see the activation screen; click the link to go to the Red Hat portal.
On the Red Hat portal, click to create a new subscription allocation.
Give the new subscription allocation an easy-to-remember name. A subscription allocation is really a mechanism for distributing subscription certificates: after creating one, you add the purchased products to it (ansible, rhel, and so on) and then download a single zip file with everything packaged together, which is very convenient.
After creating the subscription allocation, click Subscriptions.
Then click Add Subscriptions.
Search for the product by keyword; if you only purchased a few products you can simply pick from the list. Next to the product, set the entitlement quantity; for example, to activate one system, set it to 1.
After clicking Submit, the subscription has been added.
Click Export Manifest to export the subscription certificate.
You will get a file with a name like manifest_ansible_20220107T110649Z.zip. Import this file into ansible platform.
Untick the user analytics/data collection options, because we are offline and cannot reach Red Hat's public systems.
After submitting, we land on the ansible platform home page.
virus test for docker image security scanning
Almost every container platform has a container security offering, such as the famous clair, but their scanning is not a deep scan: they read the history database of the package manager inside the container (yum, apk, and so on), look at the software versions installed, and decide from that whether there are vulnerabilities.
This approach is of course chosen for performance, but it also causes trouble in practice: engineers easily assume that once a container security platform is in place they can relax, which is not the case.
Below is a concrete example to show the effect, and then we think about how to respond.
test quay / clair / docker hub
We copy a test virus from the internet into a container, build the image, and push it to image registries to see the scan results. To make it more representative, we also copy the 'virus' over /usr/bin/java, then push the image to both quay.io and docker hub.
mkdir -p /data/tmp
cd /data/tmp
cat << EOF > ./virus.Dockerfile
FROM registry.access.redhat.com/ubi8/ubi-minimal
ADD https://www.ikarussecurity.com/wp-content/downloads/eicar_com.zip /wzh
ADD https://github.com/MalwareSamples/Linux-Malware-Samples/blob/main/00ae07c9fe63b080181b8a6d59c6b3b6f9913938858829e5a42ab90fb72edf7a /wzh01
ADD https://github.com/MalwareSamples/Linux-Malware-Samples/blob/main/00ae07c9fe63b080181b8a6d59c6b3b6f9913938858829e5a42ab90fb72edf7a /usr/bin/java
RUN chmod +x /wzh*
RUN chmod +x /usr/bin/java
EOF
buildah bud -t quay.io/wangzheng422/qimgs:virus -f virus.Dockerfile ./
buildah push quay.io/wangzheng422/qimgs:virus
buildah bud -t docker.io/wangzheng422/virus -f virus.Dockerfile ./
buildah push docker.io/wangzheng422/virus
We find that neither registry flags the image that contains the virus. This confirms that these scanners only read the package manager's history database and do not inspect the actual files inside the container.
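For comparison, a signature-based antivirus engine does flag the test file; a hedged local check with ClamAV (package names are the usual EPEL ones, and the exact signature name reported may vary):
cd /data/tmp
wget https://www.ikarussecurity.com/wp-content/downloads/eicar_com.zip
yum install -y clamav clamav-update
freshclam
clamscan eicar_com.zip
# the EICAR test file inside the zip should be reported as FOUND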
log4shell
Now for the famous log4j vulnerability: scanning platforms handle it differently. The ones that do detect it look for jar files inside the container, read the MANIFEST.MF inside each jar, check the declared package version, and raise an alert based on that.
Look at the scan result for the container quay.io/apoczeka/log4shell-vuln on quay.io: it cannot find the log4j vulnerability.
ACS
Let's see whether the Red Hat RHACS container security platform can detect it.
# on vultr
wget https://mirror.openshift.com/pub/rhacs/assets/latest/bin/linux/roxctl
install -m 755 roxctl /usr/local/bin/
# on ACS platform
# Integrations -> API Token -> Create Integration
# role -> continous-integration -> create
# copy the API token out
export ROX_API_TOKEN=<api_token>
export ROX_CENTRAL_ADDRESS=central-stackrox.apps.cluster-ms246.ms246.sandbox1059.opentlc.com:443
roxctl -e "$ROX_CENTRAL_ADDRESS" --insecure-skip-tls-verify image scan -i docker.io/elastic/logstash:7.13.0 | jq '.scan.components[] | .vulns[]? | select(.cve == "CVE-2021-44228") | .cve'
# "CVE-2021-44228"
# "CVE-2021-44228"
roxctl -e "$ROX_CENTRAL_ADDRESS" --insecure-skip-tls-verify image scan -i quay.io/apoczeka/log4shell-vuln | jq '.scan.components[] | .vulns[]? | select(.cve == "CVE-2021-44228") | .cve'
# "CVE-2021-44228"
roxctl -e "$ROX_CENTRAL_ADDRESS" --insecure-skip-tls-verify image check -r 0 -o json -i docker.io/elastic/logstash:7.13.0
We can see that ACS successfully detects the log4j vulnerability, so we can integrate ACS into our CI/CD pipeline to perform vulnerability scanning.
We can also use ACS's built-in console to quickly define policies and immediately block the affected images.
Here is the effect once the policy takes effect (allow a little time for it to propagate if the cluster is busy): ACS blocks images with the vulnerability from running.
Note, however, that ACS currently only enforces on deployment-style workloads; modifying the deployment, or running a bare pod directly, bypasses the ACS check. Hopefully a later ACS release will address this.
grype
There are many other command-line tools similar to ACS; here is one example.
# https://github.com/anchore/grype
grype -q quay.io/apoczeka/log4shell-vuln | grep log4j
# log4j-api 2.14.1 2.15.0 GHSA-jfh8-c2jp-5v3q Critical
# log4j-api 2.14.1 2.16.0 GHSA-7rjr-3q55-vv33 Medium
# log4j-api 2.14.1 CVE-2021-44228 Critical
# log4j-core 2.14.1 2.15.0 GHSA-jfh8-c2jp-5v3q Critical
# log4j-core 2.14.1 2.16.0 GHSA-7rjr-3q55-vv33 Medium
# log4j-core 2.14.1 CVE-2021-44228 Critical
# log4j-jul 2.14.1 CVE-2021-44228 Critical
# log4j-slf4j-impl 2.14.1 CVE-2021-44228 Critical
trivy
https://github.com/aquasecurity/trivy
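A hedged usage sketch with trivy's standard image subcommand (output format differs between versions):
# scan the same vulnerable image and filter for log4j findings
trivy image quay.io/apoczeka/log4shell-vuln 2>/dev/null | grep -i log4j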
openshift install cnv with ocs and external ceph
The target business scenario for this test: a CS service VM runs on openshift via CNV, with the VM image backed by ceph; we test VM live migration and VM cloning.
Because of test environment constraints, we set up a single-node ceph with three 5.5T disks. The ceph node is a KVM guest configured with 16C/32G; in practice 8C/32G also feels sufficient.
deployment architecture diagram
video walkthrough
single-node ceph installation, ocs installation, connecting to the external ceph storage
cnv installation, importing a VM image, live migration, cloning
install ceph
First we install this ceph node.
#####################################
## start to install ceph
cd /backup/wzh
lvremove -f ocp4/cephlv
lvcreate -y -L 230G -n cephlv ocp4
lvremove -f ocp4/cephdata01lv
lvcreate -y -L 3T -n cephdata01lv ocp4
lvremove -f ocp4/cephdata02lv
lvcreate -y -L 3T -n cephdata02lv ocp4
lvremove -f ocp4/cephdata03lv
lvcreate -y -L 3T -n cephdata03lv ocp4
virt-install --name=ocp4-ceph --vcpus=16 --ram=32768 \
--disk path=/dev/ocp4/cephlv,device=disk,bus=virtio,format=raw \
--disk path=/dev/ocp4/cephdata01lv,device=disk,bus=virtio,format=raw \
--disk path=/dev/ocp4/cephdata02lv,device=disk,bus=virtio,format=raw \
--disk path=/dev/ocp4/cephdata03lv,device=disk,bus=virtio,format=raw \
--os-variant centos7.0 --network network:openshift4,model=virtio \
--boot menu=on --location /home/data/openshift/ocp.4.3.21/rhel-server-7.8-x86_64-dvd.iso \
--initrd-inject rhel-ks-ceph.cfg --extra-args "inst.ks=file:/rhel-ks-ceph.cfg"
#######################################
# kvm's host bond and vlan
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/sec-configure_802_1q_vlan_tagging_using_the_command_line_tool_nmcli
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/sec-vlan_on_bond_and_bridge_using_the_networkmanager_command_line_tool_nmcli
nmcli con add type bond \
con-name bond-24 \
ifname bond-24 \
mode 802.3ad ipv4.method disabled ipv6.method ignore
nmcli con mod id bond-24 bond.options \
mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer2+3
nmcli con add type bond-slave ifname enp176s0f0 con-name enp176s0f0 master bond-24
nmcli con add type bond-slave ifname enp59s0f0 con-name enp59s0f0 master bond-24
nmcli con up bond-24
nmcli connection add type bridge con-name br-ceph ifname br-ceph ip4 192.168.18.200/24
nmcli con up br-ceph
nmcli con add type vlan con-name vlan-ceph ifname vlan-ceph dev bond-24 id 501 master br-ceph slave-type bridge
nmcli con up vlan-ceph
# no need below
# cat << EOF > /backup/wzh/virt-net.xml
# <network>
# <name>vm-br-ceph</name>
# <forward mode='bridge'>
# <bridge name='br-ceph'/>
# </forward>
# </network>
# EOF
# virsh net-define --file virt-net.xml
# virsh net-autostart br-ceph
# virsh net-start br-ceph
# virsh net-list
# # restore
# virsh net-undefine br-ceph
# virsh net-destroy br-ceph
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
# restore
nmcli con del vlan-ceph
nmcli con del br-ceph
nmcli con del enp59s0f0
nmcli con del enp176s0f0
nmcli con del bond-24
########################################
# go to ceph vm
# https://www.cyberciti.biz/faq/linux-list-network-cards-command/
cat /proc/net/dev
nmcli con add type ethernet ifname eth1 con-name eth1
nmcli con modify eth1 ipv4.method manual ipv4.addresses 192.168.18.203/24
nmcli con modify eth1 connection.autoconnect yes
nmcli con reload
nmcli con up eth1
# restore
nmcli con del eth1
##########################################
# go to worker2 vm, to test the ceph vlan
nmcli con add type ethernet ifname ens9 con-name ens9
nmcli con modify ens9 ipv4.method manual ipv4.addresses 192.168.18.209/24
nmcli con modify ens9 connection.autoconnect yes
nmcli con reload
nmcli con up ens9
# restore
nmcli con del ens9
nmcli con del 'Wired connection 1'
##########################################
# go to worker1 vm, to test the ceph vlan
nmcli con add type ethernet ifname ens9 con-name ens9
nmcli con modify ens9 ipv4.method manual ipv4.addresses 192.168.18.208/24
nmcli con modify ens9 connection.autoconnect yes
nmcli con reload
nmcli con up ens9
# restore
nmcli con del ens9
nmcli con del 'Wired connection 1'
##########################################
# go to worker0 vm, to test the ceph vlan
nmcli con add type ethernet ifname ens9 con-name ens9
nmcli con modify ens9 ipv4.method manual ipv4.addresses 192.168.18.207/24
nmcli con modify ens9 connection.autoconnect yes
nmcli con reload
nmcli con up ens9
##########################################
# go to master2 vm, to test the ceph vlan
nmcli con add type ethernet ifname ens9 con-name ens9
nmcli con modify ens9 ipv4.method manual ipv4.addresses 192.168.18.206/24
nmcli con modify ens9 connection.autoconnect yes
nmcli con reload
nmcli con up ens9
# restore
nmcli con del ens9
nmcli con del 'Wired connection 1'
##########################################
# go to master1 vm, to test the ceph vlan
nmcli con add type ethernet ifname ens9 con-name ens9
nmcli con modify ens9 ipv4.method manual ipv4.addresses 192.168.18.205/24
nmcli con modify ens9 connection.autoconnect yes
nmcli con reload
nmcli con up ens9
# restore
nmcli con del ens9
nmcli con del 'Wired connection 1'
##########################################
# go to master0 vm, to test the ceph vlan
nmcli con add type ethernet ifname ens9 con-name ens9
nmcli con modify ens9 ipv4.method manual ipv4.addresses 192.168.18.204/24
nmcli con modify ens9 connection.autoconnect yes
nmcli con reload
nmcli con up ens9
# restore
nmcli con del ens9
nmcli con del 'Wired connection 1'
##########################################
# go to worker4 baremetal, to test the ceph vlan
nmcli con del 'Wired connection 1'
nmcli con del 'Wired connection 2'
nmcli con del 'Wired connection 3'
nmcli con del 'Wired connection 4'
nmcli con del 'Wired connection 5'
nmcli con del ens35f0.991
nmcli con del ens35f1
# https://access.redhat.com/solutions/1526613
nmcli con add type bond \
con-name bond-24 \
ifname bond-24 \
mode 802.3ad ipv4.method disabled ipv6.method ignore
nmcli con mod id bond-24 bond.options \
mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer2+3
nmcli con add type bond-slave ifname ens49f0 con-name ens49f0 master bond-24
nmcli con add type bond-slave ifname ens35f0 con-name ens35f0 master bond-24
nmcli con up bond-24
nmcli con add type vlan con-name vlan-ceph ifname vlan-ceph dev bond-24 id 501 ip4 192.168.18.211/24
nmcli con up vlan-ceph
# restore
nmcli con del vlan-ceph
nmcli con del ens49f0 ens35f0
nmcli con del bond-24
#############################################
# go to worker3 baremetal, to test the ceph vlan
nmcli con del 'Wired connection 1'
nmcli con del 'Wired connection 2'
nmcli con del 'Wired connection 3'
nmcli con del 'Wired connection 4'
nmcli con del 'Wired connection 5'
nmcli con add type bond \
con-name bond-24 \
ifname bond-24 \
mode 802.3ad ipv4.method disabled ipv6.method ignore
nmcli con mod id bond-24 bond.options \
mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer2+3
nmcli con add type bond-slave ifname ens49f0 con-name ens49f0 master bond-24
nmcli con add type bond-slave ifname ens35f0 con-name ens35f0 master bond-24
nmcli con up bond-24
nmcli con add type vlan con-name vlan-ceph ifname vlan-ceph dev bond-24 id 501 ip4 192.168.18.210/24
nmcli con up vlan-ceph
# restore
nmcli con del vlan-ceph
nmcli con del ens49f0 ens35f0
nmcli con del bond-24
#################################################
## for ceph vm
# install a 'fast' http proxy, then
subscription-manager --proxy=127.0.0.1:6666 register --username **** --password ********
# subscription-manager --proxy=127.0.0.1:6666 refresh
subscription-manager config --rhsm.baseurl=https://china.cdn.redhat.com
# subscription-manager config --rhsm.baseurl=https://cdn.redhat.com
subscription-manager --proxy=127.0.0.1:6666 refresh
# https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html-single/installation_guide/index
subscription-manager --proxy=127.0.0.1:6666 repos --disable=*
subscription-manager --proxy=127.0.0.1:6666 repos --enable=rhel-7-server-rpms \
--enable=rhel-7-server-extras-rpms \
--enable=rhel-7-server-supplementary-rpms \
--enable=rhel-7-server-optional-rpms \
--enable=rhel-7-server-rhceph-4-tools-rpms --enable=rhel-7-server-ansible-2.8-rpms \
--enable=rhel-7-server-rhceph-4-mon-rpms \
--enable=rhel-7-server-rhceph-4-osd-rpms \
--enable=rhel-7-server-rhceph-4-tools-rpms
yum clean all
yum makecache
yum update -y
systemctl enable --now firewalld
systemctl start firewalld
systemctl status firewalld
firewall-cmd --zone=public --add-port=6789/tcp
firewall-cmd --zone=public --add-port=6789/tcp --permanent
firewall-cmd --zone=public --add-port=6800-7300/tcp
firewall-cmd --zone=public --add-port=6800-7300/tcp --permanent
firewall-cmd --zone=public --add-port=6800-7300/tcp
firewall-cmd --zone=public --add-port=6800-7300/tcp --permanent
firewall-cmd --zone=public --add-port=6800-7300/tcp
firewall-cmd --zone=public --add-port=6800-7300/tcp --permanent
firewall-cmd --zone=public --add-port=8080/tcp
firewall-cmd --zone=public --add-port=8080/tcp --permanent
firewall-cmd --zone=public --add-port=443/tcp
firewall-cmd --zone=public --add-port=443/tcp --permanent
# firewall-cmd --zone=public --add-port=9090/tcp
# firewall-cmd --zone=public --add-port=9090/tcp --permanent
ssh-keygen
sed -i 's/#UseDNS yes/UseDNS no/' /etc/ssh/sshd_config
systemctl restart sshd
ssh-copy-id root@ceph
yum install -y ceph-ansible docker
cd /usr/share/ceph-ansible
# yum install -y docker
systemctl enable --now docker
cd /usr/share/ceph-ansible
/bin/cp -f group_vars/all.yml.sample group_vars/all.yml
/bin/cp -f group_vars/osds.yml.sample group_vars/osds.yml
/bin/cp -f site-docker.yml.sample site-docker.yml
/bin/cp -f site.yml.sample site.yml
/bin/cp -f group_vars/rgws.yml.sample group_vars/rgws.yml
/bin/cp -f group_vars/mdss.yml.sample group_vars/mdss.yml
# remember to set the env
# https://access.redhat.com/RegistryAuthentication
# REGISTRY_USER_NAME=
# REGISTRY_TOKEN=
cat << EOF > ./group_vars/all.yml
fetch_directory: ~/ceph-ansible-keys
monitor_interface: eth1
public_network: 192.168.18.0/24
# ceph_docker_image: rhceph/rhceph-4-rhel8
# ceph_docker_image_tag: "latest"
# containerized_deployment: true
ceph_docker_registry: registry.redhat.io
ceph_docker_registry_auth: true
ceph_docker_registry_username: ${REGISTRY_USER_NAME}
ceph_docker_registry_password: ${REGISTRY_TOKEN}
ceph_origin: repository
ceph_repository: rhcs
# ceph_repository_type: cdn
ceph_repository_type: iso
ceph_rhcs_iso_path: /root/rhceph-4.1-rhel-7-x86_64.iso
ceph_rhcs_version: 4
bootstrap_dirs_owner: "167"
bootstrap_dirs_group: "167"
dashboard_admin_user: admin
dashboard_admin_password: Redhat!23
node_exporter_container_image: registry.redhat.io/openshift4/ose-prometheus-node-exporter:v4.1
grafana_admin_user: admin
grafana_admin_password: Redhat!23
grafana_container_image: registry.redhat.io/rhceph/rhceph-4-dashboard-rhel8
prometheus_container_image: registry.redhat.io/openshift4/ose-prometheus:4.1
alertmanager_container_image: registry.redhat.io/openshift4/ose-prometheus-alertmanager:4.1
radosgw_interface: eth1
radosgw_address_block: 192.168.18.0/24
radosgw_civetweb_port: 8080
radosgw_civetweb_num_threads: 512
ceph_conf_overrides:
global:
osd_pool_default_size: 3
osd_pool_default_min_size: 2
osd_pool_default_pg_num: 32
osd_pool_default_pgp_num: 32
osd:
osd_scrub_begin_hour: 22
osd_scrub_end_hour: 7
EOF
cat << EOF > ./group_vars/osds.yml
devices:
- /dev/vdb
- /dev/vdc
- /dev/vdd
EOF
cat << EOF > ./hosts
[grafana-server]
ceph
[mons]
ceph
[osds]
ceph
[mgrs]
ceph
EOF
sed -i "s/#copy_admin_key: false/copy_admin_key: true/" ./group_vars/rgws.yml
cd /usr/share/ceph-ansible
mkdir -p ~/ceph-ansible-keys
ansible all -m ping -i hosts
ansible-playbook -vv site.yml -i hosts
# You can access your dashboard web UI at http://ceph:8443/ as an 'admin' user with 'Redhat!23' password
cd /root
ceph osd getcrushmap -o crushmap
crushtool -d crushmap -o crushmap.txt
sed -i 's/step chooseleaf firstn 0 type host/step chooseleaf firstn 0 type osd/' crushmap.txt
grep 'step chooseleaf' crushmap.txt
crushtool -c crushmap.txt -o crushmap-new
ceph osd setcrushmap -i crushmap-new
cd /usr/share/ceph-ansible
# test the result
ceph health detail
ceph osd pool create test 8
ceph osd pool set test pg_num 128
ceph osd pool set test pgp_num 128
ceph osd pool application enable test rbd
ceph -s
ceph osd tree
ceph osd pool ls
ceph pg dump
cat << EOF > hello-world.txt
wangzheng
EOF
rados --pool test put hello-world hello-world.txt
rados --pool test get hello-world fetch.txt
cat fetch.txt
# continue to install
cat << EOF >> ./hosts
[rgws]
ceph
[mdss]
ceph
EOF
ansible-playbook -vv site.yml --limit mdss -i hosts
ansible-playbook -vv site.yml --limit rgws -i hosts
# change mon param for S3
# 416 (InvalidRange)
# https://www.cnblogs.com/flytor/p/11380026.html
# https://www.cnblogs.com/fuhai0815/p/12144214.html
# https://access.redhat.com/solutions/3328431
# add config line
vi /etc/ceph/ceph.conf
# mon_max_pg_per_osd = 300
systemctl restart ceph-mon@ceph.service
ceph tell mon.* injectargs '--mon_max_pg_per_osd=1000'
ceph --admin-daemon /var/run/ceph/ceph-mon.`hostname -s`.asok config show | grep mon_max_pg_per_osd
ceph --admin-daemon /var/run/ceph/ceph-mgr.`hostname -s`.asok config set mon_max_pg_per_osd 1000
ceph osd lspools
ceph osd dump | grep 'replicated size'
install ocs
Next, we install the OCS components inside OpenShift 4 and connect them to the Ceph cluster deployed above.
# check ceph version
ceph tell osd.* version
python ceph-external-cluster-details-exporter.py --help
python ceph-external-cluster-details-exporter.py --rbd-data-pool-name test --rgw-endpoint 192.168.18.203:8080 --run-as-user client.ocs
# [{"kind": "ConfigMap", "data": {"maxMonId": "0", "data": "ceph=192.168.18.203:6789", "mapping": "{}"}, "name": "rook-ceph-mon-endpoints"}, {"kind": "Secret", "data": {"mon-secret": "mon-secret", "fsid": "bfaeb4fb-2f44-41e7-9539-1ca75bb394a8", "cluster-name": "openshift-storage", "admin-secret": "admin-secret"}, "name": "rook-ceph-mon"}, {"kind": "Secret", "data": {"userKey": "AQBZUWdfavnEDBAA0qwn1WLRbFV+0bUY+8ZnMQ==", "userID": "client.ocs"}, "name": "rook-ceph-operator-creds"}, {"kind": "Secret", "data": {"userKey": "AQBZUWdfC1EzDhAAjVV7+S3jKk8LcPUxxkIF9A==", "userID": "csi-rbd-node"}, "name": "rook-csi-rbd-node"}, {"kind": "StorageClass", "data": {"pool": "test"}, "name": "ceph-rbd"}, {"kind": "Secret", "data": {"userKey": "AQBZUWdfG8pvEBAAnldlqNj72gqBRvSxc8FB+g==", "userID": "csi-rbd-provisioner"}, "name": "rook-csi-rbd-provisioner"}, {"kind": "Secret", "data": {"adminID": "csi-cephfs-provisioner", "adminKey": "AQBZUWdfCxXWExAAiiaU1KIyjFsBxZB6h9WVtw=="}, "name": "rook-csi-cephfs-provisioner"}, {"kind": "Secret", "data": {"adminID": "csi-cephfs-node", "adminKey": "AQBZUWdf52L9ERAAXbK5upV2lO5phttDrwzJyg=="}, "name": "rook-csi-cephfs-node"}, {"kind": "StorageClass", "data": {"pool": "cephfs_data", "fsName": "cephfs"}, "name": "cephfs"}, {"kind": "StorageClass", "data": {"endpoint": "192.168.18.203:8080", "poolPrefix": "default"}, "name": "ceph-rgw"}]
oc get cephcluster -n openshift-storage
oc get storagecluster -n openshift-storage
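For reference, the external-mode StorageCluster that the OCS operator consumes looks roughly like the sketch below; the exporter JSON above is normally pasted into the console wizard, which creates this object for you. A minimal sketch, assuming the OCS operator is already installed in openshift-storage:
cat << EOF > ocs-external-storagecluster.yaml
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-external-storagecluster
  namespace: openshift-storage
spec:
  # external mode: consume the ceph cluster prepared above instead of creating a new one
  externalStorage:
    enable: true
  labelSelector: {}
EOF
# oc apply -f ocs-external-storagecluster.yaml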
# install chrome on kvm host
wget https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm
yum install ./google-chrome-stable_current_*.rpm
google-chrome &
install cnv
# upload win10.qcow2 to http server(helper)
scp win10.qcow2.gz root@192.168.8.202:/var/www/html/
# on helper
chmod 644 /var/www/html/win10.qcow2.gz
oc project demo
cat << EOF > win10.dv.yaml
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
name: "example-import-dv-win10"
spec:
source:
http:
url: "http://192.168.8.202:8080/win10.qcow2.gz"
pvc:
volumeMode: Block
storageClassName: ocs-external-storagecluster-ceph-rbd
accessModes:
- ReadWriteMany
resources:
requests:
storage: "40Gi"
EOF
oc apply -n demo -f win10.dv.yaml
oc get dv,pvc
# create a vm, and test the live migration
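A minimal VirtualMachine sketch for that test, assuming the DataVolume example-import-dv-win10 created above; evictionStrategy: LiveMigrate plus the RWX block PVC is what makes live migration possible. CPU and memory values are illustrative:
cat << EOF > win10.vm.yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: win10
spec:
  running: true
  template:
    spec:
      # allow the VMI to be live-migrated when the node is drained
      evictionStrategy: LiveMigrate
      domain:
        cpu:
          cores: 2
        resources:
          requests:
            memory: 4Gi
        devices:
          disks:
          - name: rootdisk
            disk:
              bus: virtio
      volumes:
      - name: rootdisk
        dataVolume:
          name: example-import-dv-win10
EOF
# oc apply -n demo -f win10.vm.yaml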
###############################################################
# network
#####################################
# worker4 baremetal, nic bond + vlan + bridge for business
nmcli con add type bond \
con-name bond-13 \
ifname bond-13 \
mode 802.3ad ipv4.method disabled ipv6.method ignore
nmcli con mod id bond-13 bond.options \
mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer2+3
nmcli con add type bond-slave ifname ens49f1 con-name ens49f1 master bond-13
nmcli con add type bond-slave ifname ens35f1 con-name ens35f1 master bond-13
nmcli con up bond-13
nmcli connection add type bridge con-name br-business ifname br-business ip4 172.17.4.211/24
nmcli con up br-business
nmcli con add type vlan con-name vlan-business ifname vlan-business dev bond-13 id 991 master br-business slave-type bridge
nmcli con up vlan-business
#####################################
# worker4 baremetal, nic bond + vlan + bridge for business
nmcli con add type bond \
con-name bond-13 \
ifname bond-13 \
mode 802.3ad ipv4.method disabled ipv6.method ignore
nmcli con mod id bond-13 bond.options \
mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer2+3
nmcli con add type bond-slave ifname ens49f1 con-name ens49f1 master bond-13
nmcli con add type bond-slave ifname ens35f1 con-name ens35f1 master bond-13
nmcli con up bond-13
nmcli connection add type bridge con-name br-business ifname br-business ip4 172.17.4.210/24
nmcli con up br-business
nmcli con add type vlan con-name vlan-business ifname vlan-business dev bond-13 id 991 master br-business slave-type bridge
nmcli con up vlan-business
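The same bond + vlan + bridge layout can also be declared through kubernetes-nmstate instead of running nmcli on each node. A rough NodeNetworkConfigurationPolicy sketch; the interface names and vlan id come from the nmcli commands above, everything else is an assumption:
cat << EOF > br-business.nncp.yaml
apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: br-business-worker-4
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-4.ocp4.redhat.ren
  desiredState:
    interfaces:
    - name: bond-13
      type: bond
      state: up
      link-aggregation:
        mode: 802.3ad
        options:
          miimon: "100"
        slaves:
        - ens49f1
        - ens35f1
    - name: bond-13.991
      type: vlan
      state: up
      vlan:
        base-iface: bond-13
        id: 991
    - name: br-business
      type: linux-bridge
      state: up
      ipv4:
        enabled: true
        address:
        - ip: 172.17.4.211
          prefix-length: 24
      bridge:
        port:
        - name: bond-13.991
EOF
# oc apply -f br-business.nncp.yaml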
###############################
# try to add 2nd nic
cat << EOF > nic.vm.yaml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: bridge-network-business
annotations:
k8s.v1.cni.cncf.io/resourceName: bridge.network.kubevirt.io/br-business
spec:
config: '{
"cniVersion": "0.3.1",
"name": "bridge-network-business",
"plugins": [
{
"type": "cnv-bridge",
"bridge": "br-business"
},
{
"type": "cnv-tuning"
}
]
}'
EOF
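To actually attach the second NIC, the VM template then references this NetworkAttachmentDefinition through a Multus network. A minimal sketch of the relevant fragment of spec.template.spec; the interface/network name "business" is illustrative:
# fragment to merge into the VirtualMachine spec.template.spec
domain:
  devices:
    interfaces:
    - name: default
      masquerade: {}
    - name: business      # 2nd nic, bridged onto br-business
      bridge: {}
networks:
- name: default
  pod: {}
- name: business
  multus:
    networkName: bridge-network-business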
CS game workload scenario test
###################################
# add management vlan to kvm host
nmcli con add type vlan con-name vlan-management ifname vlan-management dev bond-24 id 500 ip4 1.41.0.124/27
nmcli con up vlan-management
#restore
nmcli con del vlan-management
# upload cs server image
# for python3
python -m http.server 7800
# for python2
python -m SimpleHTTPServer 7800
oc project demo
cat << EOF > cnv.cs.dv.yaml
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
name: "import-dv-cs-yitu"
spec:
source:
http:
url: "http://192.168.8.251:7800/yitu.raw"
pvc:
volumeMode: Block
storageClassName: ocs-external-storagecluster-ceph-rbd
accessModes:
- ReadWriteMany
resources:
requests:
storage: "150Gi"
EOF
oc apply -n demo -f cnv.cs.dv.yaml
oc get dv,pvc
The business test server runs a CS workload on Ubuntu 14. We start this VM and configure its network. Interface file /etc/network/interfaces.d/eth0.cfg for the CS server (Ubuntu 14):
# The primary network interface
auto eth0
iface eth0 inet static
address 172.17.4.215
netmask 255.255.255.0
gateway 172.17.4.254
dns-nameservers 114.114.114.114
ifdown eth0
ifup eth0
cnv live migration
# upload cs server image
# for python3
python -m http.server 7800
# for python2
python -m SimpleHTTPServer 7800
oc project demo
cat << EOF > cnv.cs.dv.yaml
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
name: "import-dv-rhel-78"
spec:
source:
http:
url: "http://192.168.8.251:7800/rhel7.8.img"
pvc:
volumeMode: Block
storageClassName: ocs-external-storagecluster-ceph-rbd
accessModes:
- ReadWriteMany
resources:
requests:
storage: "10Gi"
EOF
oc apply -n demo -f cnv.cs.dv.yaml
oc get dv,pvc
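Besides draining a node, a live migration can be triggered explicitly with a VirtualMachineInstanceMigration object. A minimal sketch, assuming a running VMI named rhel78 as used below:
cat << EOF > migration-rhel78.yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstanceMigration
metadata:
  name: migration-rhel78
  namespace: demo
spec:
  vmiName: rhel78
EOF
# oc apply -f migration-rhel78.yaml
# oc get vmim -n demo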
############################################
# try to debug the VM getting stuck after node failure; the attempts below do not work.
# we try to relax the pdb, but the vm still does not move to another node.
oc get pdb -n demo
# NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
# kubevirt-disruption-budget-j5zlc 2 N/A 0 12m
# kubevirt-disruption-budget-qsk9j 2 N/A 0 12m
oc patch pdb kubevirt-disruption-budget-j5zlc -n demo --type=merge -p '{"spec":{"minAvailable":0}}'
oc patch pdb kubevirt-disruption-budget-qsk9j -n demo --type=merge -p '{"spec":{"minAvailable":0}}'
# Cannot evict pod as it would violate the pod's disruption budget.
oc adm drain worker-3.ocp4.redhat.ren --grace-period=10 --force --delete-local-data --ignore-daemonsets
oc adm uncordon worker-3.ocp4.redhat.ren
debug for node failure scenario
# evictionStrategy: LiveMigrate
# power off and power on the VM
# remove evictionStrategy: LiveMigrate settings,
# and find out this doesn't work
oc patch -n demo vm/rhel78 --type json -p '[{"op": "remove", "path": "/spec/template/spec/evictionStrategy"}]'
oc get vm/rhel78 -o yaml | grep evictionStrategy
# restore evictionStrategy: LiveMigrate settings
oc patch -n demo vm/rhel78 --type=merge -p '{"spec": {"template": {"spec": {"evictionStrategy":"LiveMigrate"} } } }'
# oc delete pod -n openshift-storage noobaa-db-0 --force --grace-period=0
# oc get pod -n openshift-storage
# no output from these 2 commands
oc get pod/virt-launcher-rhel78-r6d9m -o yaml | grep -A2 finalizer
oc get vm/rhel78 -o yaml | grep -A2 finalizer
# we can see there are finalizers on vmi
oc get vmi/rhel78 -o yaml | grep -A2 finalizer
# finalizers:
# - foregroundDeleteVirtualMachine
# generation: 20
# poweroff the node, to reproduce the issue
# when the node is notready, and pod is terminating
oc get node
# NAME STATUS ROLES AGE VERSION
# master-0.ocp4.redhat.ren Ready master 102d v1.18.3+6c42de8
# master-1.ocp4.redhat.ren Ready master 102d v1.18.3+6c42de8
# master-2.ocp4.redhat.ren Ready master 102d v1.18.3+6c42de8
# worker-0.ocp4.redhat.ren Ready worker 102d v1.18.3+6c42de8
# worker-1.ocp4.redhat.ren Ready worker 102d v1.18.3+6c42de8
# worker-2.ocp4.redhat.ren Ready worker 102d v1.18.3+6c42de8
# worker-3.ocp4.redhat.ren NotReady cnv,worker 93d v1.18.3+6c42de8
# worker-4.ocp4.redhat.ren Ready cnv,worker 91d v1.18.3+6c42de8
oc get pod
# NAME READY STATUS RESTARTS AGE
# v2v-vmware-568b875554-lsj57 1/1 Running 0 2d6h
# virt-launcher-rhel78-r6d9m 1/1 Terminating 0 44m
oc get vmi
# NAME AGE PHASE IP NODENAME
# rhel78 52m Running 172.17.4.15/24 worker-3.ocp4.redhat.ren
# below is working
oc patch -n demo vmi/rhel78 --type=merge -p '{"metadata": {"finalizers":null}}'
# after node failure, delete vmi
oc delete vmi/rhel78
oc get pod
# NAME READY STATUS RESTARTS AGE
# v2v-vmware-568b875554-lsj57 1/1 Running 0 2d6h
# virt-launcher-rhel78-f5ltc 1/1 Running 0 32s
# virt-launcher-rhel78-r6d9m 1/1 Terminating 0 46m
# the MachineHealthCheck below has no effect, because we are on bare metal.
cat << EOF > healthcheck.yaml
apiVersion: machine.openshift.io/v1beta1
kind: MachineHealthCheck
metadata:
name: example
namespace: openshift-machine-api
spec:
selector:
matchLabels:
machine.openshift.io/cluster-api-machine-role: cnv
unhealthyConditions:
- type: "Ready"
timeout: "300s"
status: "False"
- type: "Ready"
timeout: "300s"
status: "Unknown"
maxUnhealthy: "80%"
EOF
oc apply -f healthcheck.yaml
oc get MachineHealthCheck -n openshift-machine-api
# NAME MAXUNHEALTHY EXPECTEDMACHINES CURRENTHEALTHY
# example 80%
Other notes
oc get nns worker-4.ocp4.redhat.ren -o yaml
apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkState
metadata:
creationTimestamp: "2020-09-16T03:15:51Z"
generation: 1
managedFields:
- apiVersion: nmstate.io/v1alpha1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:ownerReferences:
.: {}
k:{"uid":"135e4844-bf87-465a-8f6a-5fc1f85e5beb"}:
.: {}
f:apiVersion: {}
f:kind: {}
f:name: {}
f:uid: {}
f:status:
.: {}
f:currentState:
.: {}
f:dns-resolver:
.: {}
f:config:
.: {}
f:search: {}
f:server: {}
f:running:
.: {}
f:search: {}
f:server: {}
f:interfaces: {}
f:route-rules:
.: {}
f:config: {}
f:routes:
.: {}
f:config: {}
f:running: {}
f:lastSuccessfulUpdateTime: {}
manager: kubernetes-nmstate
operation: Update
time: "2020-09-23T01:38:50Z"
name: worker-4.ocp4.redhat.ren
ownerReferences:
- apiVersion: v1
kind: Node
name: worker-4.ocp4.redhat.ren
uid: 135e4844-bf87-465a-8f6a-5fc1f85e5beb
resourceVersion: "43763614"
selfLink: /apis/nmstate.io/v1alpha1/nodenetworkstates/worker-4.ocp4.redhat.ren
uid: 095a8223-d139-4add-9fcf-0e0435191f78
status:
currentState:
dns-resolver:
config:
search: []
server:
- 192.168.8.202
running:
search: []
server:
- 192.168.8.202
interfaces:
- ipv4:
dhcp: false
enabled: false
ipv6:
autoconf: false
dhcp: false
enabled: false
link-aggregation:
mode: 802.3ad
options:
ad_actor_system: "00:00:00:00:00:00"
lacp_rate: fast
miimon: "100"
xmit_hash_policy: layer2+3
slaves:
- ens49f1
- ens35f1
mac-address: B8:59:9F:EF:71:5D
mtu: 1500
name: bond-13
state: up
type: bond
- ipv4:
dhcp: false
enabled: false
ipv6:
autoconf: false
dhcp: false
enabled: false
link-aggregation:
mode: 802.3ad
options:
ad_actor_system: "00:00:00:00:00:00"
lacp_rate: fast
miimon: "100"
xmit_hash_policy: layer2+3
slaves:
- ens49f0
- ens35f0
mac-address: B8:59:9F:EF:71:5C
mtu: 1500
name: bond-24
state: up
type: bond
- bridge:
options:
group-forward-mask: 0
mac-ageing-time: 300
multicast-snooping: true
stp:
enabled: true
forward-delay: 15
hello-time: 2
max-age: 20
priority: 32768
port:
- name: vlan-business
stp-hairpin-mode: false
stp-path-cost: 100
stp-priority: 32
ipv4:
address:
- ip: 172.17.4.211
prefix-length: 24
dhcp: false
enabled: true
ipv6:
address:
- ip: fe80::1a6a:4414:8fec:940e
prefix-length: 64
auto-dns: true
auto-gateway: true
auto-routes: true
autoconf: true
dhcp: true
enabled: true
mac-address: B8:59:9F:EF:71:5D
mtu: 1500
name: br-business
state: up
type: linux-bridge
- ipv4:
enabled: false
ipv6:
enabled: false
mac-address: 1e:d4:cc:be:5e:49
mtu: 1450
name: br0
state: down
type: ovs-interface
- ethernet:
auto-negotiation: true
duplex: full
speed: 10000
sr-iov:
total-vfs: 0
vfs: []
ipv4:
dhcp: false
enabled: false
ipv6:
autoconf: false
dhcp: false
enabled: false
mac-address: B8:59:9F:EF:71:5C
mtu: 1500
name: ens35f0
state: up
type: ethernet
- ethernet:
auto-negotiation: true
duplex: full
speed: 10000
sr-iov:
total-vfs: 0
vfs: []
ipv4:
dhcp: false
enabled: false
ipv6:
autoconf: false
dhcp: false
enabled: false
mac-address: B8:59:9F:EF:71:5D
mtu: 1500
name: ens35f1
state: up
type: ethernet
- ipv4:
enabled: false
ipv6:
enabled: false
mac-address: B4:96:91:67:2D:A4
mtu: 1500
name: ens47f0
state: down
type: ethernet
- ethernet:
auto-negotiation: true
duplex: full
speed: 1000
sr-iov:
total-vfs: 0
vfs: []
ipv4:
address:
- ip: 192.168.8.211
prefix-length: 24
dhcp: false
enabled: true
ipv6:
address:
- ip: fe80::b696:91ff:fe67:2da5
prefix-length: 64
autoconf: false
dhcp: false
enabled: true
mac-address: B4:96:91:67:2D:A5
mtu: 1500
name: ens47f1
state: up
type: ethernet
- ethernet:
auto-negotiation: true
duplex: full
speed: 10000
sr-iov:
total-vfs: 0
vfs: []
ipv4:
dhcp: false
enabled: false
ipv6:
autoconf: false
dhcp: false
enabled: false
mac-address: B8:59:9F:EF:71:5C
mtu: 1500
name: ens49f0
state: up
type: ethernet
- ethernet:
auto-negotiation: true
duplex: full
speed: 10000
sr-iov:
total-vfs: 0
vfs: []
ipv4:
dhcp: false
enabled: false
ipv6:
autoconf: false
dhcp: false
enabled: false
mac-address: B8:59:9F:EF:71:5D
mtu: 1500
name: ens49f1
state: up
type: ethernet
- ipv4:
enabled: false
ipv6:
enabled: false
mtu: 65536
name: lo
state: down
type: unknown
- ipv4:
enabled: false
ipv6:
enabled: false
mac-address: de:b2:ca:03:6b:fa
mtu: 1450
name: tun0
state: down
type: ovs-interface
- ipv4:
dhcp: false
enabled: false
ipv6:
autoconf: false
dhcp: false
enabled: false
mac-address: B8:59:9F:EF:71:5D
mtu: 1500
name: vlan-business
state: up
type: vlan
vlan:
base-iface: bond-13
id: 991
- ipv4:
address:
- ip: 192.168.18.211
prefix-length: 24
dhcp: false
enabled: true
ipv6:
address:
- ip: fe80::e852:70de:e7be:8f04
prefix-length: 64
auto-dns: true
auto-gateway: true
auto-routes: true
autoconf: true
dhcp: true
enabled: true
mac-address: B8:59:9F:EF:71:5C
mtu: 1500
name: vlan-ceph
state: up
type: vlan
vlan:
base-iface: bond-24
id: 501
- ipv4:
enabled: false
ipv6:
enabled: false
mac-address: C2:AE:59:84:C6:E0
mtu: 65000
name: vxlan_sys_4789
state: down
type: vxlan
vxlan:
base-iface: ""
destination-port: 4789
id: 0
remote: ""
route-rules:
config: []
routes:
config:
- destination: 0.0.0.0/0
metric: -1
next-hop-address: 192.168.8.1
next-hop-interface: ens47f1
table-id: 0
running:
- destination: 172.17.4.0/24
metric: 425
next-hop-address: ""
next-hop-interface: br-business
table-id: 254
- destination: 0.0.0.0/0
metric: 104
next-hop-address: 192.168.8.1
next-hop-interface: ens47f1
table-id: 254
- destination: 192.168.8.0/24
metric: 104
next-hop-address: ""
next-hop-interface: ens47f1
table-id: 254
- destination: 192.168.18.0/24
metric: 400
next-hop-address: ""
next-hop-interface: vlan-ceph
table-id: 254
- destination: fe80::/64
metric: 425
next-hop-address: ""
next-hop-interface: br-business
table-id: 254
- destination: fe80::/64
metric: 256
next-hop-address: ""
next-hop-interface: ens47f1
table-id: 254
- destination: fe80::/64
metric: 400
next-hop-address: ""
next-hop-interface: vlan-ceph
table-id: 254
- destination: ff00::/8
metric: 256
next-hop-address: ""
next-hop-interface: br-business
table-id: 255
- destination: ff00::/8
metric: 256
next-hop-address: ""
next-hop-interface: ens47f1
table-id: 255
- destination: ff00::/8
metric: 256
next-hop-address: ""
next-hop-interface: vlan-ceph
table-id: 255
lastSuccessfulUpdateTime: "2020-09-23T01:38:50Z"
next step
Multi-Queue
- https://kubevirt.io/user-guide/#/creation/disks-and-volumes?id=virtio-block-multi-queue
- https://kubevirt.io/user-guide/#/creation/interfaces-and-networks?id=virtio-net-multiqueue
cases:
- https://access.redhat.com/support/cases/#/case/02763144
RHACS / stackrox
The official installation docs are detailed and accurate; just follow them.
- https://help.stackrox.com/docs/get-started/quick-start/
Video walkthrough
install rhacs
# below is no use for v3.0.59.1
cat <<EOF | oc apply -f -
apiVersion: helm.openshift.io/v1beta1
kind: HelmChartRepository
metadata:
name: rhacs-repo
spec:
name: rhacs-repo
connectionConfig:
url: http://registry.ocp4.redhat.ren:8080/rhacs-chart/
EOF
# restore
oc delete HelmChartRepository rhacs-repo
mkdir -p /data/install/rhacs
cd /data/install/rhacs
roxctl central generate interactive
# password: redhat
# Enter path to the backup bundle from which to restore keys and certificates (optional):
# Enter PEM cert bundle file (optional):
# Enter administrator password (default: autogenerated):
# Re-enter administrator password:
# Enter orchestrator (k8s, openshift): openshift
# Enter the directory to output the deployment bundle to (default: "central-bundle"):
# Enter the OpenShift major version (3 or 4) to deploy on (default: "0"): 4
# Enter Istio version when deploying into an Istio-enabled cluster (leave empty when not running Istio) (optional):
# Enter the method of exposing Central (route, lb, np, none) (default: "none"): route
# Enter main image to use (default: "stackrox.io/main:3.0.59.1"): registry.redhat.io/rh-acs/main:3.0.59.1
# Enter whether to run StackRox in offline mode, which avoids reaching out to the Internet (default: "false"): true
# Enter whether to enable telemetry (default: "true"):
# Enter the deployment tool to use (kubectl, helm, helm-values) (default: "kubectl"):
# Enter Scanner DB image to use (default: "stackrox.io/scanner-db:2.13.0"): registry.redhat.io/rh-acs/scanner-db:2.13.0
# Enter Scanner image to use (default: "stackrox.io/scanner:2.13.0"): registry.redhat.io/rh-acs/scanner:2.13.0
# Enter Central volume type (hostpath, pvc): pvc
# Enter external volume name (default: "stackrox-db"):
# Enter external volume size in Gi (default: "100"): 100
# Enter storage class name (optional if you have a default StorageClass configured):
# Generating deployment bundle...
# NOTE: Unless run in offline mode, StackRox Kubernetes Security Platform collects and transmits aggregated usage and system health information. If you want to OPT OUT from this, re-generate the deployment bundle with the '--enable-telemetry=false' flag
# Done!
# Wrote central bundle to "central-bundle"
# To deploy:
# - If you need to add additional trusted CAs, run central/scripts/ca-setup.sh.
# - Deploy Central
# - Run central/scripts/setup.sh
# - Run oc create -R -f central
# - Deploy Scanner
# If you want to run the StackRox Scanner:
# - Run scanner/scripts/setup.sh
# - Run oc create -R -f scanner
# PLEASE NOTE: The recommended way to deploy StackRox is by using Helm. If you have
# Helm 3.1+ installed, please consider choosing this deployment route instead. For your
# convenience, all required files have been written to the helm/ subdirectory, along with
# a README file detailing the Helm-based deployment process.
# For administrator login, select the "Login with username/password" option on
# the login page, and log in with username "admin" and the password found in the
# "password" file located in the same directory as this README.
./central-bundle/central/scripts/setup.sh
oc -n stackrox get route central
# NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
# central central-stackrox.apps.ocp4.redhat.ren central https passthrough None
cat central-bundle/password
# redhat
# open https://central-stackrox.apps.ocp4.redhat.ren
# with admin / redhat
./central-bundle/scanner/scripts/setup.sh
oc create -R -f central-bundle/scanner
# serviceaccount/scanner created
# clusterrole.rbac.authorization.k8s.io/stackrox-scanner-psp created
# rolebinding.rbac.authorization.k8s.io/stackrox-scanner-psp created
# podsecuritypolicy.policy/stackrox-scanner created
# securitycontextconstraints.security.openshift.io/scanner created
# secret/scanner-db-password created
# secret/scanner-tls created
# secret/scanner-db-tls created
# configmap/scanner-config created
# networkpolicy.networking.k8s.io/scanner created
# networkpolicy.networking.k8s.io/scanner-db created
# deployment.apps/scanner created
# deployment.apps/scanner-db created
# service/scanner created
# service/scanner-db created
# horizontalpodautoscaler.autoscaling/scanner created
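Once Central and Scanner are up, the roxctl CLI can talk to Central directly. A hedged example of checking an image against the built-in policies; the endpoint follows the route above, and the API token is assumed to have been generated in the Central UI (Platform Configuration -> Integrations):
# assumption: an API token created in the Central UI
export ROX_API_TOKEN='<token from Central UI>'
roxctl --insecure-skip-tls-verify \
  -e central-stackrox.apps.ocp4.redhat.ren:443 \
  image check --image registry.redhat.io/ubi8/ubi:latest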
install sensor
The sensor is the core of stackrox's runtime scanning. Essentially it is a kernel module / eBPF probe, injected from inside a container; I will explain how this works in a separate video.
To install the sensor, we need to add the cluster on the Central console. Log in, go to system configuration, clusters, and add a cluster:
The add-cluster form has 2 parameters for the sensor image addresses. We of course use the registry.redhat.io addresses, which do not require applying for a license; fill in the fields as follows:
- registry.redhat.io/rh-acs/main
- registry.redhat.io/rh-acs/collector
After clicking next, download the generated bundle file, then continue on the helper node.
cd /data/install/rhacs/
/bin/cp -f ~/Downloads/sensor-ocp4.zip /data/install/rhacs/
unzip -d sensor sensor-ocp4.zip
./sensor/sensor.sh
# namespace/stackrox annotated
# Now using project "stackrox" on server "https://api.ocp4.redhat.ren:6443".
# Creating sensor secrets...
# secret/sensor-tls created
# Creating sensor RBAC roles...
# serviceaccount/sensor created
# clusterrole.rbac.authorization.k8s.io/stackrox:view-cluster created
# clusterrolebinding.rbac.authorization.k8s.io/stackrox:monitor-cluster created
# role.rbac.authorization.k8s.io/edit created
# rolebinding.rbac.authorization.k8s.io/manage-namespace created
# clusterrole.rbac.authorization.k8s.io/stackrox:edit-workloads created
# clusterrolebinding.rbac.authorization.k8s.io/stackrox:enforce-policies created
# clusterrole.rbac.authorization.k8s.io/stackrox:network-policies created
# clusterrolebinding.rbac.authorization.k8s.io/stackrox:network-policies-binding created
# clusterrole.rbac.authorization.k8s.io/stackrox:update-namespaces created
# clusterrolebinding.rbac.authorization.k8s.io/stackrox:update-namespaces-binding created
# clusterrole.rbac.authorization.k8s.io/stackrox:create-events created
# clusterrolebinding.rbac.authorization.k8s.io/stackrox:create-events-binding created
# clusterrole.rbac.authorization.k8s.io/stackrox:review-tokens created
# clusterrolebinding.rbac.authorization.k8s.io/stackrox:review-tokens-binding created
# Creating sensor security context constraints...
# securitycontextconstraints.security.openshift.io/sensor created
# Creating sensor network policies...
# networkpolicy.networking.k8s.io/sensor created
# Creating sensor pod security policies...
# clusterrole.rbac.authorization.k8s.io/stackrox-sensor-psp created
# rolebinding.rbac.authorization.k8s.io/stackrox-sensor-psp created
# podsecuritypolicy.policy/stackrox-sensor created
# Enter username for docker registry at registry.redhat.io: wandering.star
# Enter password for wandering.star @ registry.redhat.io: secret/collector-stackrox created
# Creating admission controller security context constraints...
# securitycontextconstraints.security.openshift.io/admission-control created
# Creating admission controller secrets...
# secret/admission-control-tls created
# Creating admission controller RBAC roles...
# serviceaccount/admission-control created
# role.rbac.authorization.k8s.io/watch-config created
# rolebinding.rbac.authorization.k8s.io/admission-control-watch-config created
# Creating admission controller network policies...
# networkpolicy.networking.k8s.io/admission-control-no-ingress created
# Creating admission controller pod security policies...
# podsecuritypolicy.policy/stackrox-admission-control created
# clusterrole.rbac.authorization.k8s.io/stackrox-admission-control-psp created
# rolebinding.rbac.authorization.k8s.io/stackrox-admission-control-psp created
# Creating admission controller deployment...
# deployment.apps/admission-control created
# service/admission-control created
# W0507 18:24:56.251769 13915 warnings.go:70] admissionregistration.k8s.io/v1beta1 ValidatingWebhookConfiguration is deprecated in v1.16+, unavailable in v1.22+; use admissionregistration.k8s.io/v1 ValidatingWebhookConfiguration
# W0507 18:24:56.272199 13915 warnings.go:70] admissionregistration.k8s.io/v1beta1 ValidatingWebhookConfiguration is deprecated in v1.16+, unavailable in v1.22+; use admissionregistration.k8s.io/v1 ValidatingWebhookConfiguration
# validatingwebhookconfiguration.admissionregistration.k8s.io/stackrox created
# Creating collector security context constraints...
# securitycontextconstraints.security.openshift.io/collector created
# Creating collector secrets...
# secret/collector-tls created
# Creating collector RBAC roles...
# serviceaccount/collector created
# Creating collector network policies...
# networkpolicy.networking.k8s.io/collector-no-ingress created
# Creating collector pod security policies...
# clusterrole.rbac.authorization.k8s.io/stackrox-collector-psp created
# rolebinding.rbac.authorization.k8s.io/stackrox-collector-psp created
# podsecuritypolicy.policy/stackrox-collector created
# Creating collector daemon set...
# daemonset.apps/collector created
# Creating sensor deployment...
# deployment.apps/sensor created
# service/sensor created
# service/sensor-webhook created
# secret/helm-effective-cluster-name created
# Creating upgrader service account
# serviceaccount/sensor-upgrader created
# clusterrolebinding.rbac.authorization.k8s.io/stackrox:upgrade-sensors created
Let's take a quick look at the dmesg output on a master node after the sensor is installed: a collector kmod has been loaded, and it makes use of CPU instruction set features.
Running lsmod on the master node also shows this collector kmod.
lsmod | grep coll
# collector 651264 22
remove sensor
cd /data/install/rhacs
# ./sensor/delete-sensor.sh
kubectl delete --raw /apis/security.openshift.io/v1/securitycontextconstraints/collector
./sensor/delete-sensor.sh
bugfix for https://access.redhat.com/solutions/5911951
cd /data/install/rhacs
upgrade
https://help.stackrox.com/docs/upgrade-stackrox/from-44/
Finding a use case for RHACS: a security and compliance testing cloud
Video walkthrough
Thoughts on Red Hat cutting off CentOS sources
First, a disclaimer: the author's understanding is limited. If you disagree with this article, then you are surely right and the author is wrong.
Red Hat recently announced that it is shutting down git.centos.org. The announcement drew plenty of debate at home and abroad, and most people argue that Red Hat has betrayed the open source licenses and the open source spirit. Here is my take.
Conclusion first: I believe what Red Hat is doing is exploring a sustainable path for the open source movement.
Now, why do I think so. First, Red Hat's statement indirectly admits that it needs money; in other words, its business results are under pressure. That is easy to understand: with the rise of public cloud, RHEL's customer base has shrunk considerably. In addition, downstream rebuilds such as Rocky, which claim to track RHEL in real time, also cut into RHEL's revenue; there are public statements of customers moving to Rocky Linux.
Is this money problem unique to Red Hat? Clearly not. Remember the widely exploited Log4j vulnerability: only then did we realize that such a widely used component was maintained by unpaid volunteers, running purely on goodwill. We all know what eventually happens if we keep asking people to work for free.
There is a book that says the essence of business is secrets. For a company, secrets translate into either ownership (capital, IP) or operations. Red Hat owns no proprietary IP, so it relies on operations, running the open source movement, and operations are built on rules; the ultimate form of operations is traffic operations. Sounds familiar? It is the attention economy we keep hearing about on the Internet. In that sense Red Hat's model looks a lot like an Internet company's: use free offerings to beat competitors, gain traffic (a dominant position), then monetize that traffic. The difference is that Red Hat builds this business loop around open source, and if the model succeeds, so does open source.
Speaking of rules, I recall an interview that covered commodity trade, the patent system, and open source licenses. All of these are products of successive stages of commercial civilization, institutional guarantees designed to make commerce run efficiently; the commercial gains, however, go to whoever sets the rules, historically the Americans. Red Hat is a representative beneficiary of the open source license system, so naturally it has to find a way forward for that system. Fortunately Red Hat still believes in open source and actively contributes back, which is good news for the movement. China has the openEuler community and the Mulan license, and is drafting its own rules; presumably the future commercial beneficiaries of those rules will be Chinese. Let's wait and see.
So in my view Red Hat is exploring, in a new era, a sustainable path for open source; put differently, it is patching the GPL. Red Hat's patch takes the form of service fees. The GPL forbids charging license fees for redistribution, but can we charge service fees instead? That question can be left to the lawyers.
I am no authority on the GPL, but I still remember hearing a young Stallman speak at my university years ago. What struck me most was that open source lets programmers see the source code and work and collaborate better and more efficiently. Years later we have witnessed the huge success of open source, and I have benefited from it myself. Still, we must not forget that what open source protects is a way of working. If we can protect it for free, great; if not, we have to think about how much we are willing to pay to protect it.
So Red Hat's current exploration is meaningful: it protects the open source way of working. If anything, I would say it is moving too slowly; we cannot yet estimate how much money it takes to protect this way of working, or how it compares with the closed source alternative.
Finally, a prediction. I guess Red Hat will reach an agreement with the downstream rebuilds (Rocky, Alma, and so on): the patches and source code Red Hat publishes will only flow into the downstream distributions after a delay of six months to a year. The reason is that today the official way to obtain Red Hat's source code is through a Red Hat subscription agreement, i.e. the service fee agreement mentioned above, and that agreement has a term, usually one year. The source code you obtain under the agreement is bound by that term; after a year you can do whatever you like with it.
In short, Red Hat is exploring service fees as a way to protect the open source way of working, which creates an artificial time lag for the downstream rebuilds. Overall, for developers and programmers the open source way of working does not change; large end users, however, should plan for a bigger investment.
A simple demo of Satellite as a yum repo
A customer bought Red Hat RHEL subscriptions and wants to use them inside China, but activating a RHEL subscription on the OS requires reaching servers abroad. For well-known reasons that access is sometimes unstable, and the overseas servers themselves can be slow, so the customer needs a subscription proxy server. Red Hat's Satellite product is exactly such a subscription registration proxy.
Of course Satellite offers far more than that; subscription proxying is just one small feature. The standard Satellite scenario is: the user has an isolated network with one Internet-connected host; Satellite is installed on that host, the subscription manifest is imported, and yum repo mirroring, iPXE, DHCP, DNS and other services are started. Together these let the other hosts on the internal network install RHEL automatically right after power-on, and once RHEL is installed, Satellite keeps it updated.
So Satellite is a full OS lifecycle management product, install source included. The official documentation is here:
- https://access.redhat.com/documentation/en-us/red_hat_satellite/6.13
In this article we demonstrate the simplest scenario: install Satellite, register an internal RHEL host against it to activate the subscription, and use Satellite as the yum repo source.
Here is the lab architecture. Note that this demo only covers very simple Satellite features and scenarios; many others, such as content views, Satellite clustering, disconnected operation and so on, are left for you to explore.
- A simple demo of Satellite as a yum repo
- Install the Satellite server
- Download the subscription manifest
- Import the subscription manifest
- Configure the yum repo mirror
- Configure the activation key
- Register a host
- Increase the subscription quantity
- What happens when you over-use
- Enable Simple Content Access (SCA)
- Unregister hosts via the API
- Network firewall ports
- Port forwarding
- Acceleration for mainland China
- Install the Insights plugin
- Reinstall the OS
- Monitor subscriptions
- end
- next
Install the Satellite server
The full Satellite architecture has a server plus standalone capsules. We are doing a minimal deployment, and the server has a built-in capsule, so deploying one server is enough.
The server is a VM with 16 cores, 32G RAM and a 500G HDD; in a real project the disk should be larger.
The server also needs a domain name, with reverse DNS configured.
# satellite server
# 172.21.6.171
# dns resolve and reverse to panlab-satellite-server.infra.wzhlab.top
# satellite client host
# 172.21.6.172
# on satellite server
ssh root@172.21.6.171
# https://access.redhat.com/documentation/en-us/red_hat_satellite/6.13/html-single/installing_satellite_server_in_a_connected_network_environment/index
systemctl disable --now firewalld.service
hostnamectl set-hostname panlab-satellite-server.infra.wzhlab.top
ping -c1 localhost
# PING localhost(localhost (::1)) 56 data bytes
# 64 bytes from localhost (::1): icmp_seq=1 ttl=64 time=0.043 ms
ping -c1 `hostname -f`
# PING panlab-satellite-server.wzhlab.top (172.21.6.171) 56(84) bytes of data.
# 64 bytes from bogon (172.21.6.171): icmp_seq=1 ttl=64 time=0.047 ms
# activate the subscription on this rhel host.
subscription-manager register --auto-attach --username xxxxxxxxx --password xxxxxxxxxx
# add repo for satellite
subscription-manager repos --enable=rhel-8-for-x86_64-baseos-rpms \
--enable=rhel-8-for-x86_64-appstream-rpms \
--enable=satellite-6.13-for-rhel-8-x86_64-rpms \
--enable=satellite-maintenance-6.13-for-rhel-8-x86_64-rpms
# Repository 'rhel-8-for-x86_64-baseos-rpms' is enabled for this system.
# Repository 'rhel-8-for-x86_64-appstream-rpms' is enabled for this system.
# Repository 'satellite-6.13-for-rhel-8-x86_64-rpms' is enabled for this system.
# Repository 'satellite-maintenance-6.13-for-rhel-8-x86_64-rpms' is enabled for this system.
dnf module enable satellite:el8
dnf update -y
dnf install satellite chrony sos -y
systemctl enable --now chronyd
# begin install satellite
satellite-installer --scenario satellite \
--foreman-initial-organization "My_Organization" \
--foreman-initial-location "My_Location" \
--foreman-initial-admin-username admin \
--foreman-initial-admin-password redhat
# ......
# 2023-05-16 22:41:17 [NOTICE] [configure] System configuration has finished.
# Success!
# * Satellite is running at https://panlab-satellite-server.infra.wzhlab.top
# Initial credentials are admin / redhat
# * To install an additional Capsule on separate machine continue by running:
# capsule-certs-generate --foreman-proxy-fqdn "$CAPSULE" --certs-tar "/root/$CAPSULE-certs.tar"
# * Capsule is running at https://panlab-satellite-server.infra.wzhlab.top:9090
# The full log is at /var/log/foreman-installer/satellite.log
# Package versions are being locked.
The installation itself is easy but takes a while, over ten minutes; the official docs recommend running the installer inside tmux. Once it finishes, just open the URL in a browser.
We can see in the UI that the Satellite server itself already exists as a host.
Download the subscription manifest
Our scenario is that internal hosts register against Satellite, so we have to import the subscription information from the Red Hat portal into Satellite. Step by step:
First, on the Red Hat customer portal, create a subscription allocation. If we have 100 subscriptions to use with Satellite, allocate all 100. For this lab we allocate just 1, so we can later demonstrate over-use and adding quantity.
The allocation type must match the Satellite version we installed.
Switch to the subscription tab:
Adding a subscription opens a page where you search your subscriptions and pick one:
After choosing a subscription, set the quantity as needed; normally add all the subscriptions you have. For the lab we set 1, then download the manifest.
Import the subscription manifest
With the manifest file in hand, go back to the Satellite console and import it.
Once the import completes, the subscription shows up.
Configure the yum repo mirror
The goal of this lab is to build a yum repo mirror, but by default Satellite downloads RPMs on demand. We want everything downloaded proactively and up front, which requires one configuration change; a hammer CLI sketch follows.
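The same change can also be made from the CLI with hammer; a hedged sketch (the setting name and repository id are assumptions that may vary by Satellite version):
# default policy for newly enabled repositories
hammer settings set --name default_download_policy --value immediate
# or change an already-enabled repository (the id is a placeholder, see `hammer repository list`)
hammer repository update --id 1 --download-policy immediate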
Enable the immediate download policy
With the download policy in place, we add the yum repositories.
First search for appstream.
Then pick the minor release.
To make the effect obvious in the lab, we pick 8.6, which is not the latest release.
Search for baseos as well and select 8.6.
With the repos selected, start a manual sync.
Pick the repos to sync and start.
After a long wait, the download completes.
The basic services on the Satellite server side are now configured; let's check the system.
satellite-maintain service list
# Running Service List
# ================================================================================
# List applicable services:
# dynflow-sidekiq@.service indirect
# foreman-proxy.service enabled
# foreman.service enabled
# httpd.service enabled
# postgresql.service enabled
# pulpcore-api.service enabled
# pulpcore-content.service enabled
# pulpcore-worker@.service indirect
# redis.service enabled
# tomcat.service enabled
# All services listed [OK]
# --------------------------------------------------------------------------------
df -h
# Filesystem Size Used Avail Use% Mounted on
# devtmpfs 16G 0 16G 0% /dev
# tmpfs 16G 148K 16G 1% /dev/shm
# tmpfs 16G 8.9M 16G 1% /run
# tmpfs 16G 0 16G 0% /sys/fs/cgroup
# /dev/sda3 499G 106G 393G 22% /
# /dev/sda2 1014M 265M 749M 27% /boot
# /dev/sda1 599M 9.6M 590M 2% /boot/efi
# tmpfs 3.2G 0 3.2G 0% /run/user/0
free -h
# total used free shared buff/cache available
# Mem: 31Gi 21Gi 1.7Gi 567Mi 7.9Gi 8.6Gi
# Swap: 0B 0B 0B
Memory usage is about 21G and disk usage about 110G; this gives a baseline for sizing future deployments.
We can also check the capsule's resource usage.
Configure the activation key
We imported the subscription; to let RHEL hosts use it, we create an activation key and bind the subscription to it. Activation keys let us control how many RHEL hosts can activate, so we never over-use subscriptions.
Pick any name.
In the activation key details we set the host limit to unlimited; in practice it is better to set a concrete number so you cannot over-use. We also choose the environment; for a simple scenario the default is fine, and this setting lets us manage hosts in different groups. The content view is also left at the default; it lets different host groups see different RPM versions. Release version is left empty; it sets the default release for the hosts.
As you can see, Satellite has many features and is designed for large-scale host deployment.
Then we attach the subscription to the activation key (an equivalent hammer CLI sketch follows).
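An equivalent hammer sketch of this activation key setup; the names match the demo, the environment and content view are the defaults mentioned above, and exact option spellings may vary by Satellite version:
hammer activation-key create \
  --organization "My_Organization" \
  --name "demo-activate" \
  --lifecycle-environment "Library" \
  --content-view "Default Organization View" \
  --unlimited-hosts
# attach the subscription; the id comes from `hammer subscription list --organization "My_Organization"`
hammer activation-key add-subscription \
  --organization "My_Organization" \
  --name "demo-activate" \
  --subscription-id 1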
Our organization has Simple Content Access enabled; for comparison we disable it first and will enable it again later.
Disable SCA
Register a host
We create a registration URL; on the target RHEL host, curl-ing this URL downloads a script, and running the script registers the host to our Satellite server.
Configure it as shown; note that we enable insecure because we use a self-signed certificate.
In the detailed settings we disable everything, because we do not need Satellite to provision servers for us, and we keep the URL valid indefinitely.
Clicking generate produces a command; copy and save it.
With the command in hand, find a RHEL host and try it.
# on client host
curl -sS --insecure 'https://panlab-satellite-server.infra.wzhlab.top/register?activation_keys=demo-activate&location_id=2&organization_id=1&setup_insights=false&setup_remote_execution=false&setup_remote_execution_pull=false&update_packages=false' -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjo0LCJpYXQiOjE2ODQzMDU1MTYsImp0aSI6IjdiODBkNzdmMjVjYzY1MDZjODQ3OGI2Y2VjNzRkZWZjOGM2YjAyMDUxMDQ4YTcyYTJlMWE1YzRiNTgyMjE5NzAiLCJzY29wZSI6InJlZ2lzdHJhdGlvbiNnbG9iYWwgcmVnaXN0cmF0aW9uI2hvc3QifQ.EVXyW9gjWyAQIFYUxnwwdxAigrPmUo_XYWnqn-Wh1Fw' | bash
# #
# # Running registration
# #
# Updating Subscription Management repositories.
# Unable to read consumer identity
# This system is not registered with an entitlement server. You can use subscription-manager to register.
# Error: There are no enabled repositories in "/etc/yum.repos.d", "/etc/yum/repos.d", "/etc/distro.repos.d".
# The system has been registered with ID: e9d03372-d3f4-4970-bb38-3a2282458e29
# The registered system name is: panlab-satellite-client
# Installed Product Current Status:
# Product Name: Red Hat Enterprise Linux for x86_64
# Status: Subscribed
# # Running [panlab-satellite-client] host initial configuration
# Refreshing subscription data
# All local data refreshed
# Host [panlab-satellite-client] successfully configured.
# Successfully updated the system facts.
subscription-manager status
# +-------------------------------------------+
# System Status Details
# +-------------------------------------------+
# Overall Status: Current
# System Purpose Status: Not Specified
subscription-manager release --list
# +-------------------------------------------+
# Available Releases
# +-------------------------------------------+
# 8.6
subscription-manager release --set=8.6
subscription-manager config
# [server]
# hostname = panlab-satellite-server.infra.wzhlab.top
# ......
# [rhsm]
# auto_enable_yum_plugins = [1]
# baseurl = https://panlab-satellite-server.infra.wzhlab.top/pulp/content
# ......
dnf repolist
# Updating Subscription Management repositories.
# repo id repo name
# rhel-8-for-x86_64-appstream-rpms Red Hat Enterprise Linux 8 for x86_64 - AppStream (RPMs)
# rhel-8-for-x86_64-baseos-rpms Red Hat Enterprise Linux 8 for x86_64 - BaseOS (RPMs)
dnf makecache
# Updating Subscription Management repositories.
# Red Hat Enterprise Linux 8 for x86_64 - BaseOS (RPMs) 63 kB/s | 4.1 kB 00:00
# Red Hat Enterprise Linux 8 for x86_64 - AppStream (RPMs) 65 kB/s | 4.5 kB 00:00
# Metadata cache created.
subscription-manager repos
# +----------------------------------------------------------+
# Available Repositories in /etc/yum.repos.d/redhat.repo
# +----------------------------------------------------------+
# Repo ID: rhel-8-for-x86_64-baseos-rpms
# Repo Name: Red Hat Enterprise Linux 8 for x86_64 - BaseOS (RPMs)
# Repo URL: https://panlab-satellite-server.infra.wzhlab.top/pulp/content/My_Organization/Library/content/dist/rhel8/8.6/x86_64/baseos/os
# Enabled: 1
# Repo ID: rhel-8-for-x86_64-appstream-rpms
# Repo Name: Red Hat Enterprise Linux 8 for x86_64 - AppStream (RPMs)
# Repo URL: https://panlab-satellite-server.infra.wzhlab.top/pulp/content/My_Organization/Library/content/dist/rhel8/8.6/x86_64/appstream/os
# Enabled: 1
Back on the activation key we can see the enabled repos.
We did not configure a host collection, so that tab is empty.
Finally, in the activation key's host list we see the host we just registered.
Clicking into the host, Satellite already shows the security issues of its RPMs.
That is a lot of issues; let's run an update and see.
# on satellite-client
dnf update -y
All the issues are gone.
We can see that one subscription is now consumed.
In the subscription details we can also see one activation key.
The product content used by the subscription is baseos and appstream.
The host list now contains the host we just activated.
Increase the subscription quantity
If we buy more subscriptions, how do we add quantity? Here we simulate adding 1 more subscription.
Go to the Red Hat portal and open the subscription allocation created earlier.
Adjust the quantity to 2.
Back in Satellite, we maintain our manifest.
Click refresh and it updates online.
After the update, the quantity becomes 2 (the same refresh can be done with hammer, see the sketch below).
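The manifest refresh can also be scripted with hammer; a hedged sketch:
hammer subscription refresh-manifest --organization "My_Organization"
# verify the new quantity
hammer subscription list --organization "My_Organization"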
What happens when you over-use
We restore the subscription allocation to 1 and then activate the subscription on a second host; what happens?
# on client-02 , to try over use
curl -sS --insecure 'https://panlab-satellite-server.infra.wzhlab.top/register?activation_keys=demo-activate&location_id=2&organization_id=1&setup_insights=false&setup_remote_execution=false&setup_remote_execution_pull=false&update_packages=false' -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjo0LCJpYXQiOjE2ODQzMDU1MTYsImp0aSI6IjdiODBkNzdmMjVjYzY1MDZjODQ3OGI2Y2VjNzRkZWZjOGM2YjAyMDUxMDQ4YTcyYTJlMWE1YzRiNTgyMjE5NzAiLCJzY29wZSI6InJlZ2lzdHJhdGlvbiNnbG9iYWwgcmVnaXN0cmF0aW9uI2hvc3QifQ.EVXyW9gjWyAQIFYUxnwwdxAigrPmUo_XYWnqn-Wh1Fw' | bash
# #
# # Running registration
# #
# Updating Subscription Management repositories.
# Unable to read consumer identity
# This system is not registered with an entitlement server. You can use subscription-manager to register.
# Error: There are no enabled repositories in "/etc/yum.repos.d", "/etc/yum/repos.d", "/etc/distro.repos.d".
# The system has been registered with ID: 43e38f76-2416-49db-890f-1a3ad3973828
# The registered system name is: satellite-client-02
# Installed Product Current Status:
# Product Name: Red Hat Enterprise Linux for x86_64
# Status: Not Subscribed
# Unable to find available subscriptions for all your installed products.
subscription-manager list --consumed
# No consumed subscription pools were found.
subscription-manager repos
# This system has no repositories available through subscriptions.
subscription-manager status
# +-------------------------------------------+
# System Status Details
# +-------------------------------------------+
# Overall Status: Invalid
# Red Hat Enterprise Linux for x86_64:
# - Not supported by a valid subscription.
# System Purpose Status: Not Specified
We can see the subscription is not activated. Checking the subscription view, the consumed count is 1.
But in the activation key, the host count shows 2.
However, one host in this host list is not activated.
Enable Simple Content Access (SCA)
We enable SCA and limit the activation key's host count; this balances ease of use against not over-using subscriptions.
Enable SCA
Limit the host count to 1 (a hammer CLI sketch follows).
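The host limit can also be set from the CLI; a hedged hammer sketch (if the key currently has unlimited hosts enabled, that flag may need to be turned off as well; option spellings may vary by version):
hammer activation-key update \
  --organization "My_Organization" \
  --name "demo-activate" \
  --max-hosts 1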
Let's try to register on the second host again.
# on client-02 , to try over use
curl -sS --insecure 'https://panlab-satellite-server.infra.wzhlab.top/register?activation_keys=demo-activate&location_id=2&organization_id=1&setup_insights=false&setup_remote_execution=false&setup_remote_execution_pull=false&update_packages=false' -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjo0LCJpYXQiOjE2ODQzMDU1MTYsImp0aSI6IjdiODBkNzdmMjVjYzY1MDZjODQ3OGI2Y2VjNzRkZWZjOGM2YjAyMDUxMDQ4YTcyYTJlMWE1YzRiNTgyMjE5NzAiLCJzY29wZSI6InJlZ2lzdHJhdGlvbiNnbG9iYWwgcmVnaXN0cmF0aW9uI2hvc3QifQ.EVXyW9gjWyAQIFYUxnwwdxAigrPmUo_XYWnqn-Wh1Fw' | bash
# #
# # Running registration
# #
# Updating Subscription Management repositories.
# Unable to read consumer identity
# This system is not registered with an entitlement server. You can use subscription-manager to register.
# Error: There are no enabled repositories in "/etc/yum.repos.d", "/etc/yum/repos.d", "/etc/distro.repos.d".
# Max Hosts (1) reached for activation key 'demo-activate' (HTTP error code 409: Conflict)
Registration fails.
Over-use behavior
If SCA is enabled but the host count is not limited, can a host still activate successfully when we are over-using? Let's run an experiment.
First import the offline manifest containing only 1 subscription.
Then remove the host limit on the activation key.
Next, register on 2 hosts.
# on 172
# try to register
curl -sS --insecure 'https://panlab-satellite-server.infra.wzhlab.top:6443/register?activation_keys=demo-activate&location_id=2&organization_id=1&setup_insights=false&setup_remote_execution=false&update_packages=false' -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjo0LCJpYXQiOjE2OTM4ODg5NzIsImp0aSI6IjFlNzdkNDE1OWM4NmE3OGVjOWY5NjViMWQwODRlOWY5NThlN2ExZDBkNWZhZTJjZjY3NjMzMTQ1Nzk5NTRkNWEiLCJzY29wZSI6InJlZ2lzdHJhdGlvbiNnbG9iYWwgcmVnaXN0cmF0aW9uI2hvc3QifQ.aM6lKpNfBCH-FMHU0xkc6q4XaNeuS8JezLIQCf2faxI' | bash
# #
# # Running registration
# #
# Updating Subscription Management repositories.
# Unable to read consumer identity
# This system is not registered with an entitlement server. You can use subscription-manager to register.
# Error: There are no enabled repositories in "/etc/yum.repos.d", "/etc/yum/repos.d", "/etc/distro.repos.d".
# The system has been registered with ID: b853bd17-204a-4eeb-83c7-1d07f3dea7c6
# The registered system name is: client-0-changed
# # Running [client-0-changed] host initial configuration
# Refreshing subscription data
# All local data refreshed
# Host [client-0-changed] successfully configured.
# Successfully updated the system facts.
# on 173
# try to register
curl -sS --insecure 'https://panlab-satellite-server.infra.wzhlab.top:6443/register?activation_keys=demo-activate&location_id=2&organization_id=1&setup_insights=false&setup_remote_execution=false&update_packages=false' -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjo0LCJpYXQiOjE2OTM4ODg5NzIsImp0aSI6IjFlNzdkNDE1OWM4NmE3OGVjOWY5NjViMWQwODRlOWY5NThlN2ExZDBkNWZhZTJjZjY3NjMzMTQ1Nzk5NTRkNWEiLCJzY29wZSI6InJlZ2lzdHJhdGlvbiNnbG9iYWwgcmVnaXN0cmF0aW9uI2hvc3QifQ.aM6lKpNfBCH-FMHU0xkc6q4XaNeuS8JezLIQCf2faxI' | bash
# #
# # Running registration
# #
# Updating Subscription Management repositories.
# Unable to read consumer identity
# This system is not registered with an entitlement server. You can use subscription-manager to register.
# Error: There are no enabled repositories in "/etc/yum.repos.d", "/etc/yum/repos.d", "/etc/distro.repos.d".
# The system has been registered with ID: 4bba0a26-4f91-4bae-8752-4b073eeaee13
# The registered system name is: satellite-client-02
# # Running [satellite-client-02] host initial configuration
# Refreshing subscription data
# All local data refreshed
# Host [satellite-client-02] successfully configured.
# Successfully updated the system facts.
Unregister hosts via the API
Normally a host stays in Satellite after registration, but in a cloud environment hosts register and unregister frequently, so we need an automated way for the cloud platform to call the Satellite API and unregister hosts automatically.
Unregister by hostname
The official Satellite documentation already provides an API to unregister a host automatically.
In this experiment we try deleting client-2.
curl -s --request DELETE --insecure --user admin:redhat \
https://panlab-satellite-server.infra.wzhlab.top/api/v2/hosts/satellite-client-02 | jq .
# {
# "id": 3,
# "name": "satellite-client-02",
# "last_compile": "2023-05-17T10:21:24.000Z",
# "last_report": null,
# "updated_at": "2023-05-17T10:21:24.861Z",
# "created_at": "2023-05-17T10:19:49.756Z",
# "root_pass": null,
# "architecture_id": 1,
# "operatingsystem_id": 2,
# "ptable_id": null,
# "medium_id": null,
# "build": false,
# "comment": null,
# "disk": null,
# "installed_at": null,
# "model_id": 1,
# "hostgroup_id": null,
# "owner_id": 1,
# "owner_type": "User",
# "enabled": true,
# "puppet_ca_proxy_id": null,
# "managed": false,
# "use_image": null,
# "image_file": "",
# "uuid": null,
# "compute_resource_id": null,
# "puppet_proxy_id": null,
# "certname": "satellite-client-02",
# "image_id": null,
# "organization_id": 1,
# "location_id": 2,
# "otp": null,
# "realm_id": null,
# "compute_profile_id": null,
# "provision_method": "build",
# "grub_pass": null,
# "discovery_rule_id": null,
# "global_status": 2,
# "lookup_value_matcher": "fqdn=satellite-client-02",
# "openscap_proxy_id": null,
# "pxe_loader": null,
# "initiated_at": null,
# "build_errors": null,
# "content_facet_attributes": {
# "id": 2,
# "host_id": 3,
# "uuid": null,
# "content_view_id": 1,
# "lifecycle_environment_id": 1,
# "kickstart_repository_id": null,
# "content_source_id": null,
# "installable_security_errata_count": 0,
# "installable_enhancement_errata_count": 0,
# "installable_bugfix_errata_count": 0,
# "applicable_rpm_count": 0,
# "upgradable_rpm_count": 0,
# "applicable_module_stream_count": 0,
# "upgradable_module_stream_count": 0,
# "applicable_deb_count": 0,
# "upgradable_deb_count": 0
# }
# }
After calling the API, we can see the host client-2 has been unregistered.
This unregister-by-hostname method has a potential problem: what if the hostname changes? If we change the hostname on the host, will Satellite update it automatically or keep the old one? Let's keep experimenting.
First check the current hostname.
hostnamectl
# Static hostname: client-0
# Icon name: computer-vm
# Chassis: vm
# Machine ID: 75587495919e40b7a0d39f7168df895e
# Boot ID: a15f631019d0463395d12c332873eb52
# Virtualization: vmware
# Operating System: Red Hat Enterprise Linux 8.6 (Ootpa)
# CPE OS Name: cpe:/o:redhat:enterprise_linux:8::baseos
# Kernel: Linux 4.18.0-372.32.1.el8_6.x86_64
# Architecture: x86-64
Confirm the hostname in Satellite.
Next, change the hostname and refresh.
hostnamectl set-hostname client-0-changed
subscription-manager refresh
We confirm in Satellite that the hostname has not changed.
So when does the hostname in Satellite change? From my testing, only after the host unregisters and registers again.
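So to propagate a hostname change, unregister on the managed host and run the registration command again; a minimal sketch:
# on the managed host
subscription-manager unregister
subscription-manager clean
# then re-run the `curl ... | bash` registration command generated from the activation key page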
Unregister by host id
# get host uuid on the managed host
subscription-manager facts | grep system.uuid
# dmi.system.uuid: 4C6B4D56-ACB7-585F-EB20-90FD676DEA4B
# check how many uuid you can find, from another host
subscription-manager facts | grep uuid
# dmi.system.uuid: 8DF84D56-895F-6163-962B-30EF44BDE122
# virt.uuid: 8DF84D56-895F-6163-962B-30EF44BDE122
# get host id from satellite by uuid
curl -s --request GET --insecure --user admin:redhat \
https://panlab-satellite-server.infra.wzhlab.top/api/v2/hosts?search=facts.dmi::system::uuid=4C6B4D56-ACB7-585F-EB20-90FD676DEA4B | \
jq .results[0].id
# 8
# get host id from satellite by name
curl -s --request GET --insecure --user admin:redhat \
https://panlab-satellite-server.infra.wzhlab.top/api/hosts/panlab-satellite-client | jq .id
# 2
# delete host using host id
curl -s --request DELETE --insecure --user admin:redhat \
https://panlab-satellite-server.infra.wzhlab.top/api/hosts/2 | jq .
# {
# "id": 2,
# "name": "panlab-satellite-client",
# "last_compile": "2023-05-17T12:26:28.000Z",
# "last_report": null,
# "updated_at": "2023-05-17T12:26:28.289Z",
# "created_at": "2023-05-17T06:43:46.628Z",
# "root_pass": null,
# "architecture_id": 1,
# "operatingsystem_id": 2,
# "ptable_id": null,
# "medium_id": null,
# "build": false,
# "comment": null,
# "disk": null,
# "installed_at": "2023-05-17T06:44:01.221Z",
# "model_id": 1,
# "hostgroup_id": null,
# "owner_id": 1,
# "owner_type": "User",
# "enabled": true,
# "puppet_ca_proxy_id": null,
# "managed": false,
# "use_image": null,
# "image_file": "",
# "uuid": null,
# "compute_resource_id": null,
# "puppet_proxy_id": null,
# "certname": "panlab-satellite-client",
# "image_id": null,
# "organization_id": 1,
# "location_id": 2,
# "otp": null,
# "realm_id": null,
# "compute_profile_id": null,
# "provision_method": "build",
# "grub_pass": null,
# "discovery_rule_id": null,
# "global_status": 0,
# "lookup_value_matcher": "fqdn=panlab-satellite-client",
# "openscap_proxy_id": null,
# "pxe_loader": null,
# "initiated_at": "2023-05-17T06:43:59.574Z",
# "build_errors": null,
# "content_facet_attributes": {
# "id": 1,
# "host_id": 2,
# "uuid": "e9d03372-d3f4-4970-bb38-3a2282458e29",
# "content_view_id": 1,
# "lifecycle_environment_id": 1,
# "kickstart_repository_id": null,
# "content_source_id": null,
# "installable_security_errata_count": 0,
# "installable_enhancement_errata_count": 0,
# "installable_bugfix_errata_count": 0,
# "applicable_rpm_count": 0,
# "upgradable_rpm_count": 0,
# "applicable_module_stream_count": 0,
# "upgradable_module_stream_count": 0,
# "applicable_deb_count": 0,
# "upgradable_deb_count": 0
# },
# "subscription_facet_attributes": {
# "id": 1,
# "host_id": 2,
# "uuid": "e9d03372-d3f4-4970-bb38-3a2282458e29",
# "last_checkin": "2023-06-26T03:27:43.457Z",
# "service_level": "",
# "release_version": "8.6",
# "autoheal": true,
# "registered_at": "2023-05-17T06:43:47.000Z",
# "registered_through": "panlab-satellite-server.infra.wzhlab.top",
# "user_id": null,
# "hypervisor": false,
# "hypervisor_host_id": null,
# "purpose_usage": "",
# "purpose_role": "",
# "dmi_uuid": "4C6B4D56-ACB7-585F-EB20-90FD676DEA4B"
# }
# }
After calling this API, we can see the host has been unregistered.
Network firewall ports
The customer's network is strictly controlled; reaching the public Internet requires specific firewall rules. So which firewall rules does Satellite need?
Based on the following official knowledge base articles:
- How to access Red Hat Subscription Manager (RHSM) through a firewall or proxy
- Public CIDR Lists for Red Hat (IP Addresses for cdn.redhat.com)
- Downloading Packages via Red Hat Official Network is Slow in mainland China
- What is the IP address range for 'subscription.rhn.redhat.com' and 'subscription.rhsm.redhat.com'?
- How do I configure my firewall for api.access.redhat.com?
We summarize the domains that need to be opened:
- subscription.rhn.redhat.com:443 [https] AND subscription.rhsm.redhat.com:443 [https] (This is the new default address in newer versions of RHEL 7)
- cdn.redhat.com:443 [https]
- *.akamaiedge.net:443 [https] OR *.akamaitechnologies.com:443 [https]
- china.cdn.redhat.com:443 [https]
If the customer's firewall only supports IP addresses, the following ranges need to be opened. However, based on the author's testing, this IP list is not accurate, or at least not updated promptly.
Port forwarding
The customer's internal network has strict traffic controls and does not allow traffic on port 443, so Satellite's HTTPS port 443 has to be exposed as 6443. Let's try it.
First, make one configuration change in the web console.
# on satellite server
# redirect 6443 to 443
iptables -t nat -A PREROUTING -p tcp --dport 6443 -j REDIRECT --to-port 443
# block 443 traffic REJECT
# iptables -A INPUT -p tcp --dport 443 -j DROP
# iptables -A INPUT -p tcp --dport 443 -j REJECT
# iptables -A INPUT -p tcp --dport 443 -j ACCEPT
# iptables -A INPUT -p tcp --dport 80 -j ACCEPT
# persistent
iptables-save > /etc/sysconfig/iptables
cat << EOF > /etc/systemd/system/iptables.service
[Unit]
Description=iptables Firewall Rules
After=network.target
[Service]
ExecStart=/sbin/iptables-restore /etc/sysconfig/iptables
Type=oneshot
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
EOF
systemctl enable --now iptables.service
# systemctl disable --now iptables.service
# sed -i "s/443/6443/g" /etc/httpd/conf/ports.conf
# semanage port -a -t http_port_t -p tcp 6443
# on client node
# to register
curl -sS --insecure 'https://panlab-satellite-server.infra.wzhlab.top:6443/register?activation_keys=demo-activate&location_id=2&organization_id=1&setup_insights=false&setup_remote_execution=false&update_packages=false' -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjo0LCJpYXQiOjE2ODk4MjMxNDksImp0aSI6IjUxNTNiZmFjMDIxMjNjYjEzZDdjZjM5NWRkMWIyZWEzMWQ3NzA3YTczNzgxNzRhOWI5MDMzMzdjOTA4MzBlY2UiLCJzY29wZSI6InJlZ2lzdHJhdGlvbiNnbG9iYWwgcmVnaXN0cmF0aW9uI2hvc3QifQ.idNFXNsi6mz0fKef42yn_XwVWvwdKD2R3FolAHsrRmo' > sub.sh
sed -i 's/--server.port="443"/--server.port="6443"/g' sub.sh
sed -i 's|https://panlab-satellite-server.infra.wzhlab.top/|https://panlab-satellite-server.infra.wzhlab.top:6443/|g' sub.sh
# manually modify the shell
# comment out 2 step at the end of the script
# our scenario is simple, so the remaining steps are not needed
# #register_katello_host | bash
# echo 'skip step'
# else
# #register_host | bash
# echo 'skip step'
# fi
bash sub.sh
subscription-manager release --list
# +-------------------------------------------+
# Available Releases
# +-------------------------------------------+
# 8.6
subscription-manager release --set=8.6
subscription-manager config
# [server]
# hostname = panlab-satellite-server.infra.wzhlab.top
# insecure = [0]
# no_proxy = []
# port = 6443
# ......
# [rhsm]
# auto_enable_yum_plugins = [1]
# baseurl = https://panlab-satellite-server.infra.wzhlab.top:6443/pulp/content
# ......
subscription-manager list --installed
# +-------------------------------------------+
# Installed Product Status
# +-------------------------------------------+
# Product Name: Red Hat Enterprise Linux for x86_64
# Product ID: 479
# Version: 8.6
# Arch: x86_64
subscription-manager repos
# +----------------------------------------------------------+
# Available Repositories in /etc/yum.repos.d/redhat.repo
# +----------------------------------------------------------+
# Repo ID: rhel-8-for-x86_64-appstream-rpms
# Repo Name: Red Hat Enterprise Linux 8 for x86_64 - AppStream (RPMs)
# Repo URL: https://panlab-satellite-server.infra.wzhlab.top:6443/pulp/content/My_Organization/Library/content/dist/rhel8/8.6/x86_64/appstream/os
# Enabled: 1
# Repo ID: rhel-8-for-x86_64-baseos-rpms
# Repo Name: Red Hat Enterprise Linux 8 for x86_64 - BaseOS (RPMs)
# Repo URL: https://panlab-satellite-server.infra.wzhlab.top:6443/pulp/content/My_Organization/Library/content/dist/rhel8/8.6/x86_64/baseos/os
# Enabled: 1
# try to unregister using satellite API
# get host id from satellite
# do not run the command below on the satellite server itself, or the iptables redirect rule will get in the way
curl -s --request GET --insecure --user admin:redhat \
https://panlab-satellite-server.infra.wzhlab.top:6443/api/hosts/client-0-changed | jq .id
# 6
# delete host using host id
curl -s --request DELETE --insecure --user admin:redhat \
https://panlab-satellite-server.infra.wzhlab.top:6443/api/hosts/6 | jq .
# {
# "id": 6,
# "name": "client-0-changed",
# "last_compile": "2023-07-20T04:02:48.000Z",
# "last_report": null,
# "updated_at": "2023-07-20T04:02:48.132Z",
# "created_at": "2023-07-20T03:19:39.676Z",
# "root_pass": null,
# "architecture_id": 1,
# "operatingsystem_id": 2,
# "ptable_id": null,
# "medium_id": null,
# "build": false,
# "comment": null,
# "disk": null,
# "installed_at": "2023-07-20T03:20:28.857Z",
# "model_id": 1,
# "hostgroup_id": null,
# "owner_id": 1,
# "owner_type": "User",
# "enabled": true,
# "puppet_ca_proxy_id": null,
# "managed": false,
# "use_image": null,
# "image_file": "",
# "uuid": null,
# "compute_resource_id": null,
# "puppet_proxy_id": null,
# "certname": "client-0-changed",
# "image_id": null,
# "organization_id": 1,
# "location_id": 2,
# "otp": null,
# "realm_id": null,
# "compute_profile_id": null,
# "provision_method": "build",
# "grub_pass": null,
# "discovery_rule_id": null,
# "global_status": 0,
# "lookup_value_matcher": "fqdn=client-0-changed",
# "openscap_proxy_id": null,
# "pxe_loader": null,
# "initiated_at": "2023-07-20T03:20:27.055Z",
# "build_errors": null,
# "content_facet_attributes": {
# "id": 5,
# "host_id": 6,
# "uuid": "e91e4f8d-6ace-4a7a-8af0-dd7311786042",
# "content_view_id": 1,
# "lifecycle_environment_id": 1,
# "kickstart_repository_id": null,
# "content_source_id": null,
# "installable_security_errata_count": 0,
# "installable_enhancement_errata_count": 0,
# "installable_bugfix_errata_count": 0,
# "applicable_rpm_count": 0,
# "upgradable_rpm_count": 0,
# "applicable_module_stream_count": 0,
# "upgradable_module_stream_count": 0,
# "applicable_deb_count": 0,
# "upgradable_deb_count": 0
# },
# "subscription_facet_attributes": {
# "id": 9,
# "host_id": 6,
# "uuid": "e91e4f8d-6ace-4a7a-8af0-dd7311786042",
# "last_checkin": "2023-07-20T04:02:46.801Z",
# "service_level": "",
# "release_version": "8.6",
# "autoheal": true,
# "registered_at": "2023-07-20T03:20:16.000Z",
# "registered_through": "panlab-satellite-server.infra.wzhlab.top",
# "user_id": null,
# "hypervisor": false,
# "hypervisor_host_id": null,
# "purpose_usage": "",
# "purpose_role": "",
# "dmi_uuid": "4C6B4D56-ACB7-585F-EB20-90FD676DEA4B"
# }
# }
From the steps above we can see that when the requirement is this simple, we can indeed move the port from 443 to 6443. Roughly: configure the entry URL in the web console, configure iptables port forwarding on the host, and then customize the generated registration shell script on the managed node.
Note, however, that changing port 443 is a customization not supported by Red Hat, so it should only be used in very simple scenarios.
Acceleration for mainland China
By default Satellite downloads RPMs from cdn.redhat.com, which is slow from the customer's network; china.cdn.redhat.com is faster. How do we configure Satellite to use the China CDN?
Step one: create a Content Credential. Note that the file is /etc/rhsm/ca/redhat-uep.pem; you can download it and upload it, or paste its contents.
Step two: configure a "Custom CDN"; be sure to set the SSL CA Content Credential to the one created in step one.
Step three: refresh.
At this point the configuration is done and we can start syncing.
Install the Insights plugin
Satellite can also act as a proxy for Insights. Before trying it, follow the official knowledge base below and turn on the corresponding Insights switches.
For our lab environment, we need to turn on the Insights switch on the target host individually.
It is off (false) by default.
We turn it on.
Then we go to the target host and operate. First we simulate a disconnected environment with iptables rules that block all outbound traffic except to Satellite.
# block traffic to outside
# except to satellite
iptables -A OUTPUT -p tcp -d 172.21.6.171 -j ACCEPT
iptables -A OUTPUT -p tcp --sport 22 -j ACCEPT
iptables -A OUTPUT -p tcp -j REJECT
# try to register on insight
insights-client --register
# Successfully registered host client-0-changed
# Automatic scheduling for Insights has been enabled.
# Starting to collect Insights data for client-0-changed
# Uploading Insights data.
# Successfully uploaded report from client-0-changed to account 5910538.
# View the Red Hat Insights console at https://console.redhat.com/insights/
insights-client --check-results
insights-client --show-results
# [
# {
# "rule": {
# "rule_id": "generate_vmcore_failed_during_makedumpfile|GENERATE_VMCORE_FAILED_DURING_MAKEDUMPFILE",
# "created_at": "2023-02-08T08:31:18.561333Z",
# "updated_at": "2023-03-05T08:31:21.314917Z",
# "description": "The vmcore generation fails in RHEL 8.6 when \"cgroup_disable=memory\" is configured due to a known bug in the kernel",
# "active": true,
# "category": {
# "id": 1,
# "name": "Availability"
# },
# "impact": {
# "name": "Kernel Panic",
# "impact": 4
# },
# "likelihood": 3,
# "node_id": "6969010",
# "tags": "kdump kernel panic",
# "reboot_required": true,
# "publish_date": "2023-03-05T03:26:00Z",
# "summary": "The vmcore generation fails in RHEL 8.6 when \"cgroup_disable=memory\" is configured due to a known bug in the kernel.\n",
# "generic": "The vmcore generation fails in RHEL 8.6 when \"cgroup_disable=memory\" is configured due to a known bug in the kernel.\n",
# "reason": "This host is running **RHEL 8.6** with **kernel-{{=pydata.rhel_version}}** and \n**\"cgroup_disable=memory\"** is appended to the **KDUMP_COMMANDLINE_APPEND** in\nthe `/etc/sysconfig/kdump`:\n~~~\nKDUMP_COMMANDLINE_APPEND=\"{{=pydata.kdump_data_append}}\"\n~~~\n\nHowever, due to a known bug in the kernel versions prior to **4.18.0-372.40.1.el8_6**, \nthe vmcore generation fails when **\"cgroup_disable=memory\"** is appended to \n**KDUMP_COMMANDLINE_APPEND** in the `/etc/sysconfig/kdump`.\n",
# "more_info": "",
# "resolution_set": [
# {
# "system_type": 105,
# "resolution": "Red Hat recommends that you perform the following steps:\n\n{{?pydata.cur_lock && pydata.rcm_locks}}\n* Unset the release lock.\n ~~~\n # subscription-manager release --unset\n ~~~\n{{?}}\n\n{{?pydata.no_base &&\n (pydata.cur_lock==null || (pydata.cur_lock && pydata.rcm_locks))}}\n* Enable the RHEL base repo:\n ~~~\n # subscription-manager repos --enable={{=pydata.no_base}}\n ~~~\n Note: To fix the issue in the base channel, you have to enable the base channel at first.\n{{?}}\n\n{{?pydata.cur_lock && pydata.req_repos && pydata.rcm_locks==null}}\n* {{?Object.keys(pydata.req_repos).length > 1}}Enable one of the following channels{{??}}Enable the following channel{{?}}:\n ~~~\n {{~pydata.req_repos:e}}# subscription-manager repos --enable={{=e}}\n {{~}}\n ~~~\n Note: Red Hat only provides the resolution in the required channel{{?Object.keys(pydata.req_repos).length > 1}}s{{?}}. \n{{?}}\n* Update the `kernel` package:\n ~~~\n # yum update kernel\n ~~~\n* Reboot the system with the new kernel:\n ~~~\n # reboot\n ~~~\n{{?pydata.cur_lock && pydata.rcm_locks}}\n**Alternatively**, if unsetting the release lock is not an option, fix this issue by re-setting the release lock to {{?Object.keys(pydata.rcm_locks).length > 1}}one of the RHEL releases ``{{~pydata.rcm_locks:e}}{{=e}}, {{~}}``{{??}}the RHEL release ``{{=pydata.rcm_locks[0]}}``{{?}} and updating the package.{{?}}\n\n\nAfter applying the remediation, refresh the results of Advisor analysis by running the `insights-client` command on the system. \n~~~ \n# insights-client \n~~~ \n",
# "resolution_risk": {
# "name": "Upgrade Kernel",
# "risk": 3
# },
# "has_playbook": true
# }
# ],
# "total_risk": 3
# },
# "details": {
# "type": "rule",
# "error_key": "GENERATE_VMCORE_FAILED_DURING_MAKEDUMPFILE",
# "rhel_version": "4.18.0-372.32.1.el8_6.x86_64",
# "kdump_data_append": "irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never nokaslr novmcoredd hest_disable"
# },
# "resolution": {
# "system_type": 105,
# "resolution": "Red Hat recommends that you perform the following steps:\n\n{{?pydata.cur_lock && pydata.rcm_locks}}\n* Unset the release lock.\n ~~~\n # subscription-manager release --unset\n ~~~\n{{?}}\n\n{{?pydata.no_base &&\n (pydata.cur_lock==null || (pydata.cur_lock && pydata.rcm_locks))}}\n* Enable the RHEL base repo:\n ~~~\n # subscription-manager repos --enable={{=pydata.no_base}}\n ~~~\n Note: To fix the issue in the base channel, you have to enable the base channel at first.\n{{?}}\n\n{{?pydata.cur_lock && pydata.req_repos && pydata.rcm_locks==null}}\n* {{?Object.keys(pydata.req_repos).length > 1}}Enable one of the following channels{{??}}Enable the following channel{{?}}:\n ~~~\n {{~pydata.req_repos:e}}# subscription-manager repos --enable={{=e}}\n {{~}}\n ~~~\n Note: Red Hat only provides the resolution in the required channel{{?Object.keys(pydata.req_repos).length > 1}}s{{?}}. \n{{?}}\n* Update the `kernel` package:\n ~~~\n # yum update kernel\n ~~~\n* Reboot the system with the new kernel:\n ~~~\n # reboot\n ~~~\n{{?pydata.cur_lock && pydata.rcm_locks}}\n**Alternatively**, if unsetting the release lock is not an option, fix this issue by re-setting the release lock to {{?Object.keys(pydata.rcm_locks).length > 1}}one of the RHEL releases ``{{~pydata.rcm_locks:e}}{{=e}}, {{~}}``{{??}}the RHEL release ``{{=pydata.rcm_locks[0]}}``{{?}} and updating the package.{{?}}\n\n\nAfter applying the remediation, refresh the results of Advisor analysis by running the `insights-client` command on the system. \n~~~ \n# insights-client \n~~~ \n",
# "resolution_risk": {
# "name": "Upgrade Kernel",
# "risk": 3
# },
# "has_playbook": true
# },
# "impacted_date": "2023-09-05T05:09:54.945795Z"
# },
# {
# "rule": {
# "rule_id": "el8_to_el9_upgrade|RHEL8_TO_RHEL9_UPGRADE_AVAILABLE_V1",
# "created_at": "2023-07-18T08:45:10.136263Z",
# "updated_at": "2023-09-04T20:32:26.551599Z",
# "description": "RHEL 8 system is eligible for an in-place upgrade to RHEL 9 using the Leapp utility",
# "active": true,
# "category": {
# "id": 4,
# "name": "Performance"
# },
# "impact": {
# "name": "Best Practice",
# "impact": 1
# },
# "likelihood": 1,
# "node_id": "6955478",
# "tags": "autoack kernel leapp",
# "reboot_required": true,
# "publish_date": "2023-08-11T08:32:29Z",
# "summary": "Red Hat provides `leapp` utility to support upgrade from **RHEL 8** to **RHEL 9**. The current **RHEL 8** version is eligible for upgrade to **RHEL 9** via `leapp` utility. Red Hat recommends that you install `leapp` packages.\n\nOne way to install `leapp` is during execution of **RHEL preupgrade analysis utility** in *Automation Toolkit -> Tasks* service. Run this task to understand the impact of an upgrade on your fleet and make a remediation plan before your maintenance window begins.\n",
# "generic": "Red Hat provides `leapp` utility to support upgrade from **RHEL 8** to **RHEL 9**. The current **RHEL 8** version is eligible for upgrade to **RHEL 9** via `leapp` utility. Red Hat recommends that you install `leapp` packages.\n\nOne way to install `leapp` is during execution of **RHEL preupgrade analysis utility** in *Automation Toolkit -> Tasks* service. Run this task to understand the impact of an upgrade on your fleet and make a remediation plan before your maintenance window begins.\n",
# "reason": "{{? pydata.error_key == \"RHEL8_TO_RHEL9_UPGRADE_AVAILABLE\"}}\nThe current **RHEL** version **{{=pydata.supported_path[0]}}** is eligible for upgrade to **RHEL** version {{? pydata.supported_path.length > 2}}**{{=pydata.supported_path[2]}}** (default) or **{{=pydata.supported_path[1]}}**{{??}}**{{=pydata.supported_path[1]}}**{{?}} via the Leapp utility.\n{{?}}\n\n{{? pydata.error_key == \"RHEL8_TO_RHEL9_UPGRADE_AVAILABLE_RPMS\"}}\nThe Leapp utility is available on this system. The current **RHEL** version **{{=pydata.supported_path[0]}}** is eligible for upgrade to **RHEL** version {{? pydata.supported_path.length > 2}}**{{=pydata.supported_path[2]}}** (default) or **{{=pydata.supported_path[1]}}**{{??}}**{{=pydata.supported_path[1]}}**{{?}} via the Leapp utility.\n{{?}}\n",
# "more_info": "One way to install `leapp` is during execution of **RHEL preupgrade analysis utility** in [Automation Toolkit -> Tasks](https://console.redhat.com/insights/tasks) service. Run this task to understand the impact of an upgrade on your fleet and make a remediation plan before your maintenance window begins.\n",
# "resolution_set": [
# {
# "system_type": 105,
# "resolution": "Red Hat recommends that you upgrade to **RHEL9** with the following steps:\n\n{{? pydata.error_key == \"RHEL8_TO_RHEL9_UPGRADE_AVAILABLE\"}}\n1. Planning an upgrade according to these [points](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/upgrading_from_rhel_8_to_rhel_9/planning-an-upgrade_upgrading-from-rhel-8-to-rhel-9)\n1. Preparing a RHEL 8 system for the upgrade according to this [procedure](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/upgrading_from_rhel_8_to_rhel_9/assembly_preparing-for-the-upgrade_upgrading-from-rhel-8-to-rhel-9).\n\n1. Install `leapp` utility.\n ~~~\n # dnf install leapp-upgrade\n ~~~\n1. Identify potential upgrade problems before upgrade.\n ~~~\n # leapp preupgrade --target {{? pydata.supported_path.length > 2}}{{=pydata.supported_path[2]}}{{??}}{{=pydata.supported_path[1]}}{{?}}\n ~~~\n **Note**: Check `/var/log/leapp/leapp-report.txt` or web console for any pre-check failure and refer to [Reviewing the pre-upgrade report](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/upgrading_from_rhel_8_to_rhel_9/reviewing-the-pre-upgrade-report_upgrading-from-rhel-8-to-rhel-9) for more details. \n1. Start the upgrade.\n ~~~\n # leapp upgrade --target {{? pydata.supported_path.length > 2}}{{=pydata.supported_path[2]}}{{??}}{{=pydata.supported_path[1]}}{{?}}\n ~~~\n1. Reboot system.\n ~~~\n # reboot\n ~~~\n{{?}}\n\n{{? pydata.error_key == \"RHEL8_TO_RHEL9_UPGRADE_AVAILABLE_RPMS\"}}\n1. Planning an upgrade according to these [points](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/upgrading_from_rhel_8_to_rhel_9/planning-an-upgrade_upgrading-from-rhel-8-to-rhel-9)\n1. Preparing a RHEL 8 system for the upgrade according to this [procedure](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/upgrading_from_rhel_8_to_rhel_9/assembly_preparing-for-the-upgrade_upgrading-from-rhel-8-to-rhel-9).\n\n1. Identify potential upgrade problems before upgrade.\n ~~~\n # leapp preupgrade --target {{? pydata.supported_path.length > 2}}{{=pydata.supported_path[2]}}{{??}}{{=pydata.supported_path[1]}}{{?}}\n ~~~\n1. Start the upgrade.\n ~~~\n # leapp upgrade --target {{? pydata.supported_path.length > 2}}{{=pydata.supported_path[2]}}{{??}}{{=pydata.supported_path[1]}}{{?}}\n ~~~\n **Note**: Check `/var/log/leapp/leapp-report.txt` or web console for any pre-check failure and refer to [Reviewing the pre-upgrade report](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/upgrading_from_rhel_8_to_rhel_9/reviewing-the-pre-upgrade-report_upgrading-from-rhel-8-to-rhel-9) for more details.\n1. Reboot system.\n ~~~\n # reboot\n ~~~\n{{?}}\nFor more details about upgrading, refer to [Upgrading to RHEL9](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/upgrading_from_rhel_8_to_rhel_9/index).\n\n\nAfter applying the remediation, refresh the results of Advisor analysis by running the `insights-client` command on the system. \n~~~ \n# insights-client \n~~~ \n",
# "resolution_risk": {
# "name": "Upgrade RHEL",
# "risk": 3
# },
# "has_playbook": false
# }
# ],
# "total_risk": 1
# },
# "details": {
# "type": "rule",
# "error_key": "RHEL8_TO_RHEL9_UPGRADE_AVAILABLE_V1",
# "supported_path": [
# "8.6",
# "9.0"
# ]
# },
# "resolution": {
# "system_type": 105,
# "resolution": "Red Hat recommends that you upgrade to **RHEL9** with the following steps:\n\n{{? pydata.error_key == \"RHEL8_TO_RHEL9_UPGRADE_AVAILABLE\"}}\n1. Planning an upgrade according to these [points](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/upgrading_from_rhel_8_to_rhel_9/planning-an-upgrade_upgrading-from-rhel-8-to-rhel-9)\n1. Preparing a RHEL 8 system for the upgrade according to this [procedure](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/upgrading_from_rhel_8_to_rhel_9/assembly_preparing-for-the-upgrade_upgrading-from-rhel-8-to-rhel-9).\n\n1. Install `leapp` utility.\n ~~~\n # dnf install leapp-upgrade\n ~~~\n1. Identify potential upgrade problems before upgrade.\n ~~~\n # leapp preupgrade --target {{? pydata.supported_path.length > 2}}{{=pydata.supported_path[2]}}{{??}}{{=pydata.supported_path[1]}}{{?}}\n ~~~\n **Note**: Check `/var/log/leapp/leapp-report.txt` or web console for any pre-check failure and refer to [Reviewing the pre-upgrade report](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/upgrading_from_rhel_8_to_rhel_9/reviewing-the-pre-upgrade-report_upgrading-from-rhel-8-to-rhel-9) for more details. \n1. Start the upgrade.\n ~~~\n # leapp upgrade --target {{? pydata.supported_path.length > 2}}{{=pydata.supported_path[2]}}{{??}}{{=pydata.supported_path[1]}}{{?}}\n ~~~\n1. Reboot system.\n ~~~\n # reboot\n ~~~\n{{?}}\n\n{{? pydata.error_key == \"RHEL8_TO_RHEL9_UPGRADE_AVAILABLE_RPMS\"}}\n1. Planning an upgrade according to these [points](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/upgrading_from_rhel_8_to_rhel_9/planning-an-upgrade_upgrading-from-rhel-8-to-rhel-9)\n1. Preparing a RHEL 8 system for the upgrade according to this [procedure](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/upgrading_from_rhel_8_to_rhel_9/assembly_preparing-for-the-upgrade_upgrading-from-rhel-8-to-rhel-9).\n\n1. Identify potential upgrade problems before upgrade.\n ~~~\n # leapp preupgrade --target {{? pydata.supported_path.length > 2}}{{=pydata.supported_path[2]}}{{??}}{{=pydata.supported_path[1]}}{{?}}\n ~~~\n1. Start the upgrade.\n ~~~\n # leapp upgrade --target {{? pydata.supported_path.length > 2}}{{=pydata.supported_path[2]}}{{??}}{{=pydata.supported_path[1]}}{{?}}\n ~~~\n **Note**: Check `/var/log/leapp/leapp-report.txt` or web console for any pre-check failure and refer to [Reviewing the pre-upgrade report](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/upgrading_from_rhel_8_to_rhel_9/reviewing-the-pre-upgrade-report_upgrading-from-rhel-8-to-rhel-9) for more details.\n1. Reboot system.\n ~~~\n # reboot\n ~~~\n{{?}}\nFor more details about upgrading, refer to [Upgrading to RHEL9](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/upgrading_from_rhel_8_to_rhel_9/index).\n\n\nAfter applying the remediation, refresh the results of Advisor analysis by running the `insights-client` command on the system. \n~~~ \n# insights-client \n~~~ \n",
# "resolution_risk": {
# "name": "Upgrade RHEL",
# "risk": 3
# },
# "has_playbook": false
# },
# "impacted_date": "2023-09-05T05:09:54.945795Z"
# }
# ]
最后,我们能在公网的insight console上,看到我们的主机。
同时,我们注意到,虽然在公网的insight上,能看到这个主机,但是在access.redhat.com的订阅管理中,是看不到这个被管主机的。
总结一下:satellite可以作为insight的proxy使用。但是在实验过程中,我发现insight的结果只能在主机自己和insight公网console上展现;satellite上面虽然有insight结果展示的入口页面,但页面是空的,也许还需要别的配置吧。
重装os
客户有一个特殊场景,如果rhel重装了,那么satellite上面,原来的信息要怎么处理?是否可以不删除satellite上面的信息,直接在rhel上面注册,复用之前的注册信息呢?我们试试
我们重装实验环境中的一台主机
# before reinstall, we check the uuid
subscription-manager facts | grep uuid
# dmi.system.uuid: 8DF84D56-895F-6163-962B-30EF44BDE122
# virt.uuid: 8DF84D56-895F-6163-962B-30EF44BDE122
# after reinstall os
# we can see the uuid is the same
subscription-manager facts | grep uuid
# dmi.system.uuid: 8DF84D56-895F-6163-962B-30EF44BDE122
# virt.uuid: 8DF84D56-895F-6163-962B-30EF44BDE122
# try to register again
curl -sS --insecure 'https://panlab-satellite-server.infra.wzhlab.top/register?activation_keys=demo-activate&location_id=2&organization_id=1&setup_insights=false&setup_remote_execution=false&setup_remote_execution_pull=false&update_packages=false' -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjo0LCJpYXQiOjE2ODQzMDU1MTYsImp0aSI6IjdiODBkNzdmMjVjYzY1MDZjODQ3OGI2Y2VjNzRkZWZjOGM2YjAyMDUxMDQ4YTcyYTJlMWE1YzRiNTgyMjE5NzAiLCJzY29wZSI6InJlZ2lzdHJhdGlvbiNnbG9iYWwgcmVnaXN0cmF0aW9uI2hvc3QifQ.EVXyW9gjWyAQIFYUxnwwdxAigrPmUo_XYWnqn-Wh1Fw' | bash
# ......
# The DMI UUID of this host (8DF84D56-895F-6163-962B-30EF44BDE122) matches other registered hosts: satellite-client-02 (HTTP error code 422: Unprocessable Entity)
好了,我们看到了结论,satellite发现,已经有一个相同的uuid主机存在,不能再注册了。我们能做的,就是先在satellite里面,把现在已经存在的这个主机给删掉。
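也就是说,要想重装后重新纳管,只能先把 satellite 上的旧主机记录删掉,再重新注册。沿用本文前面查询和删除 host 的 API 用法,大致的处理如下(主机名、用户名密码、端口都按自己的环境替换):
# 按主机名查出 host id
HOST_ID=$(curl -s --request GET --insecure --user admin:redhat \
  https://panlab-satellite-server.infra.wzhlab.top/api/hosts/satellite-client-02 | jq .id)
# 删除这个 host
curl -s --request DELETE --insecure --user admin:redhat \
  https://panlab-satellite-server.infra.wzhlab.top/api/hosts/$HOST_ID | jq .
# 之后在重装好的主机上,重新执行 web console 生成的注册命令即可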
change uuid
我们知道了,uuid是注册satellite的一个key,但是,如果我们的环境特殊,uuid就是重复的,那么怎么办呢?官方有解决方案
[root@client ~]# vi /etc/rhsm/facts/uuid.facts
{"dmi.system.uuid": "customuuid"}
* customuuid = hostname which is unique for every machine.
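比如按照上面的思路,用主机名来生成这个自定义的 uuid fact,一个小示意如下:
# 用主机名生成自定义的 dmi.system.uuid fact(示例做法)
cat << EOF > /etc/rhsm/facts/uuid.facts
{"dmi.system.uuid": "$(hostname)"}
EOF
# 如果主机已经注册过,可以重新上报一下 facts
subscription-manager facts --update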
监控 subscription / 订阅
客户想自动化的监控订阅的过期时间,好及时的更新订阅。虽然我们可以在红帽的portal上面方便的看到订阅的状态,但是,如果我们是运维组,没有访问红帽portal的权限(内部沟通协调问题,你懂的),还是需要一个监控的工具来做这件事情。
那么我们就用 satellite 的 API 来做这件事情。
# you can get the org_id by search a host, the result json contain org_id
curl -s --request GET --insecure --user admin:redhat \
https://panlab-satellite-server.infra.wzhlab.top:6443//katello/api/subscriptions?organization_id=1 | \
jq -r '["Name","End Date"], (.results[] | [.name, .end_date] ) | @tsv '
# Name End Date
# Employee SKU 2027-01-01 04:59:59 UTC
从上面的例子,我们可以看到,从 satellite API 里面,能直接拿到订阅的过期时间。方便运维组监控。
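在此基础上,可以写一个小脚本,算一下距离过期还有多少天,低于阈值就告警。下面是一个思路示意(阈值、告警动作都是假设,按需替换):
# 距离订阅过期不足 30 天就打印告警(示例脚本)
THRESHOLD=30
curl -s --request GET --insecure --user admin:redhat \
  "https://panlab-satellite-server.infra.wzhlab.top:6443/katello/api/subscriptions?organization_id=1" | \
  jq -r '.results[] | [.name, .end_date] | @tsv' | \
while IFS=$'\t' read -r SUB_NAME END_DATE; do
  END_TS=$(date -d "$END_DATE" +%s)
  DAYS=$(( ( END_TS - $(date +%s) ) / 86400 ))
  if [ "$DAYS" -lt "$THRESHOLD" ]; then
    echo "WARNING: subscription '$SUB_NAME' expires in $DAYS days"
  fi
done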
end
next
RHEL 订阅在线注册相关问题
在线注册过程
国内客户,购买了rhel订阅以后,就可以把自己的系统,在线注册了。一般用如下的命令:
subscription-manager register --auto-attach --username ********* --password ********
上述命令在国内的网络情况下,经常出现速度慢,超时等错误。这是因为,register过程,要访问国外的服务器(subscription.rhsm.redhat.com)。那我们可以搞一个proxy,然后让注册过程走proxy,就能加速。
How to access Red Hat Subscription Manager (RHSM) through a firewall or proxy
export PROXY="127.0.0.1:18801"
subscription-manager register --proxy=$PROXY --auto-attach --username ********* --password ********
官方知识库: https://access.redhat.com/solutions/253273
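如果不想每次命令都带 --proxy 参数,也可以把 proxy 写进 rhsm 的配置里(地址、端口是示例):
# 把 proxy 写进 /etc/rhsm/rhsm.conf 的 [server] 段
subscription-manager config \
  --server.proxy_hostname=127.0.0.1 \
  --server.proxy_port=18801
# 确认一下配置
grep proxy /etc/rhsm/rhsm.conf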
debug
如果不太清楚慢的原因,那么就需要打开rhsm的log,看看日志,确定问题原因了。
sed -i 's/default_log_level = .*/default_log_level = DEBUG/' /etc/rhsm/rhsm.conf
subscription-manager status
cat /var/log/rhsm/rhsm.log
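排查完以后,记得把日志级别调回来,免得 rhsm.log 越来越大:
# 把日志级别恢复为 INFO
sed -i 's/default_log_level = .*/default_log_level = INFO/' /etc/rhsm/rhsm.conf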
后台持续访问红帽服务器
rhsm带了一些服务,其中有一个服务 rhsmcertd.service 默认是激活的。
systemctl list-unit-files | grep rhsm
# rhsm-facts.service disabled
# rhsm.service disabled
# rhsmcertd.service enabled
systemctl cat rhsmcertd.service
# # /usr/lib/systemd/system/rhsmcertd.service
# [Unit]
# Description=Enable periodic update of entitlement certificates.
# After=network.target
# [Service]
# Type=forking
# ExecStart=/usr/bin/rhsmcertd
# [Install]
# WantedBy=multi-user.target
我们可以看到,它启动了一个系统管理的服务,我们可以man rhsmcertd
看看这个服务是做什么的。原来,它是定期去红帽服务器检查和更新证书的。我们是在线系统,留着它就好。
dnf using subscription-manager as plugin
我们平常使用dnf的时候,会不会触发subscription-manager里面的功能呢?笔者认为基本不会。不过RHEL的dnf里面,确实有一个相关的plugin,我们看看它的配置:
cat /etc/dnf/plugins/subscription-manager.conf
# [main]
# enabled=1
# # When following option is set to 1, then all repositories defined outside redhat.repo will be disabled
# # every time subscription-manager plugin is triggered by dnf or yum
# disable_system_repos=0
我们可以看到,dnf有一个subscription-manager的plugin。具体它做什么,可以看看 dnf-plugin-subscription-manager 这个rpm,里面有几个python脚本:只有在特定的、有satellite的情况下,使用 upload subcommand 时,才会触发类似subscription-manager的逻辑,向satellite汇报本机情况。
Simple Content Access
红帽提供了一种新的消费订阅的模式,Simple Content Access,原来管理员需要一台主机一台主机的register, 然后在主机上添加订阅。这么操作有点麻烦。在新的 SCA 模式下,管理员只需要 register 这个主机就可以了,主机可以使用任何当前 org 下的订阅。
红帽的 SCA 政策,就是变相地鼓励大家超用订阅,然后第二年红帽销售就有理由向客户多要一笔钱了。这也是为什么,笔者建议激活SCA之前,要研究一下限制订阅使用量的方法和措施。
官方文档:
activation key
SCA 太好用了,怎么能严格地控制使用量呢?方法是 activation key 可以指定 host 数量,这样就可以避免超量使用啦。
具体方法见官方文档: https://access.redhat.com/articles/1378093
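除了在 web console 上设置,也可以用 satellite 的 API 来改 activation key 的 host limit。下面是一个示意,其中 max_hosts / unlimited_hosts 这些字段名是作者凭印象写的假设,使用前请核对 Katello API 文档:
# 先查出 activation key 的 id
curl -s --request GET --insecure --user admin:redhat \
  "https://panlab-satellite-server.infra.wzhlab.top:6443/katello/api/activation_keys?organization_id=1" | \
  jq -r '.results[] | [.id, .name] | @tsv'
# 把 host limit 设成 10(字段名为假设)
curl -s --request PUT --insecure --user admin:redhat \
  --header "Content-Type: application/json" \
  --data '{"unlimited_hosts": false, "max_hosts": 10}' \
  https://panlab-satellite-server.infra.wzhlab.top:6443/katello/api/activation_keys/1 | jq .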
取消订阅过程
如果vm要销毁了,那么怎么取消订阅的使用呢,很简单。但是一定要记得,在vm销毁之前运行哦。。。
subscription-manager unregister
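如果想清理得更干净一点,可以先把本机绑定的订阅都退掉,注销以后再把本地缓存清掉,这几个都是 subscription-manager 的标准子命令:
# 退掉本机已绑定的所有订阅
subscription-manager remove --all
# 注销本机
subscription-manager unregister
# 清理本地残留的订阅数据和缓存
subscription-manager clean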
离线注册过程
如果客户网络情况太特殊,那么我们还可以走离线注册过程。背后的原理是,之前的在线注册,经过用户名密码验证后,系统会下载一个证书,保存在系统里面,后续再和红帽系统建立连接,就使用这个证书了。
离线注册流程,就是去手动下载这个证书,导入到系统中去,然后走后续流程。
具体步骤,见这个在线知识库: https://access.redhat.com/solutions/3121571
CCSP订阅的注册过程
CCSP订阅是为云主机厂商提供的一种订阅方式。有了CCSP订阅,云主机厂商需要去维护一套RHUI(Red Hat Update Infrastructure),然后云上的rhel都去访问RHUI来获得更新。
rpm CDN 加速
上面说的都是注册过程,注册完了,就是下载rpm了。红帽的rpm有全球的CDN加速,由于众所周知的原因,如果客户感觉下载慢,可以切换国内的CDN
subscription-manager config --rhsm.baseurl=https://china.cdn.redhat.com
subscription-manager refresh
yum clean all
yum makecache
官方知识库: https://access.redhat.com/solutions/5090421
satellite
企业用户的私有云,都是离线的环境。红帽提供了一个产品叫satellite,相当于一个注册服务器的代理和rpm源的私有CDN。
local repo mirror
如果客户认为使用satellite太复杂,部署太麻烦,那么还有一种笨拙,但是简单的方法,就是先注册一台主机,把红帽官方的repo给镜像到本地,在这个主机上开启web服务,把这个主机给变成一个本地repo源。其他主机指向这个本地源就可以了。
官方知识库: https://access.redhat.com/solutions/23016
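大致的做法示意如下(repo 名、目录、IP 都是示例,按自己的环境替换):
# 在一台已注册的 rhel8 主机上,把 baseos / appstream 同步到本地,并用 httpd 发布
dnf install -y yum-utils httpd
mkdir -p /var/www/html/repos
dnf reposync --download-metadata -p /var/www/html/repos \
  --repoid=rhel-8-for-x86_64-baseos-rpms \
  --repoid=rhel-8-for-x86_64-appstream-rpms
systemctl enable --now httpd
# 其他主机上,配置指向这个本地源(IP 为示例)
cat << 'EOF' > /etc/yum.repos.d/local-mirror.repo
[local-baseos]
name=local-baseos
baseurl=http://192.168.7.11/repos/rhel-8-for-x86_64-baseos-rpms
enabled=1
gpgcheck=0
EOF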
第二年续订
通常,rhel订阅都是一年的,第二年续订就好。。。但是,续订以后,大部分情况,订阅号会改变,这种情况下,rhel上要做什么操作呢?需要刷新,并重新绑定。
# 刷新订阅信息
subscription-manager refresh
# 自动选择匹配的订阅附加
subscription-manager auto-attach
如果不清楚,或者忘记了有什么订阅了?那么用以下命令查看
# 此命令是查看当前账号下所有可用的有效订阅,其中也可以看到每个订阅的有效期
subscription-manager list --available
# 此命令是查看当前这台机器所消耗的订阅类型,其中也包括有效时间
subscription-manager list --consumed
如果担心订阅号改变会影响业务,那么我们可以在RHSM web console上,把新的订阅号加上,然后提前执行 subscription-manager refresh,这样就可以了。在RHSM web console上的操作,也可以通过rest api完成,方便有大量订阅的客户自动化处理。
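顺便给一个红帽 portal 侧 RHSM API 的使用示意:先在 access.redhat.com 上生成 offline token,换取 access token 之后,就能查询账号下的订阅了(端点和参数以红帽 RHSM API 官方文档为准):
# offline token 需要先在红帽 portal 的 API token 页面生成,这里用占位符
OFFLINE_TOKEN='********'
# 用 offline token 换取 access token
ACCESS_TOKEN=$(curl -s https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
  -d grant_type=refresh_token -d client_id=rhsm-api -d refresh_token=$OFFLINE_TOKEN | jq -r .access_token)
# 列出账号下的订阅,返回结果里有订阅号和有效期
curl -s -H "Authorization: Bearer $ACCESS_TOKEN" \
  "https://api.access.redhat.com/management/v1/subscriptions" | jq .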
通过新增系统启动项来原地重装操作系统
很多时候,我们有一台centos7的主机,但是没有cd-rom的访问权限,有可能也没有console的访问权限,而我们做实验,又需要把这个主机刷新成rhel8等操作系统,那么我们怎么才能把这个centos7主机,原地重新安装成其他操作系统呢?
之前,已经有文章,描述怎么从centos7开始一个openshift/coreos的安装。那么,本文就探讨一下,如何从centos7,自动化安装一个alma8。同时,为了探索在安装的时候,能加载某些第三方驱动,我们也试试如何从centos7 boot进入alma8的手动安装界面。
boot into auto install
我们先来做一个从centos7的系统,做一些配置以后,重启,自动化安装成alma8系统。
这个部署环境就是一个kvm,这个kvm原来装的是centos7。但是我们需要一个安装源,也就是一个外部的http web server,用来提供安装光盘的内容和kickstart配置文件。按理说,我们也可以把安装光盘里面的文件和kickstart文件都放到本地硬盘里去,但是那样在启动参数里需要指定这个硬盘的id,作者实在不知道怎么找到这些硬盘id;好在用外部的http web server,只要知道URL就可以了。
参考资料:
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/system_design_guide/index#starting-a-kickstart-installation-manually_starting-kickstart-installations
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/system_design_guide/index#updating-drivers-during-installation_system-design-guide
# create a kickstart file and copy to /data/dnf/
# 本文的目录里面,有kickstart的配置模板,我们改一下里面的IP地址配置和安装源的URL就能用了。
sed -i '0,/^network.*/s/^network.*/network --bootproto=static --device=enp1s0 --gateway=192.168.7.11 --ip=192.168.7.12 --netmask=255.255.255.0 --nameserver=192.168.7.11 --ipv6=auto --activate/' helper-ks-alma.cfg
sed -i '0,/^url --url.*/s/^url --url.*/url --url="http:\/\/192.168.7.11:5000\/cdrom\/"/' helper-ks-alma.cfg
# create a centos7 kvm
# 要装kvm,我们需要一个bridge
cat << 'EOF' > /data/kvm/bridge.sh
#!/usr/bin/env bash
PUB_CONN='eno1'
PUB_IP='172.21.6.102/24'
PUB_GW='172.21.6.254'
PUB_DNS='172.21.1.1'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down baremetal
nmcli con delete baremetal
# RHEL 8.1 appends the word "System" in front of the connection,delete in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname baremetal type bridge con-name baremetal ipv4.method 'manual' \
ipv4.address "$PUB_IP" \
ipv4.gateway "$PUB_GW" \
ipv4.dns "$PUB_DNS"
nmcli con add type bridge-slave ifname "$PUB_CONN" master baremetal
nmcli con down "$PUB_CONN";pkill dhclient;dhclient baremetal
nmcli con up baremetal
EOF
bash /data/kvm/bridge.sh
nmcli con mod baremetal +ipv4.addresses "192.168.7.102/24"
nmcli con up baremetal
mkdir -p /data/kvm
cd /data/kvm
# 我们就用centos7 minimal iso来安装好了。
# 先下载这个iso,作者发现,南京大学的mirror是真的快啊。。。
wget -O centos.iso http://mirrors.nju.edu.cn/centos/7.9.2009/isos/x86_64/CentOS-7-x86_64-Minimal-2207-02.iso
# 同样,作者提供了一个kvm安装centos的kickstart配置文件模板,替换一下ip地址配置就能用了。
sed -i '0,/^network.*/s/^network.*/network --bootproto=static --device=eth0 --gateway=192.168.7.9 --ip=192.168.7.12 --netmask=255.255.255.0 --nameserver=192.168.7.11 --ipv6=auto --activate/' helper-ks.cfg
# 接下来,我们定义kvm,给他创建存储空间,启动kvm,就开始自动安装centos kvm了
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
virsh destroy ocp4-acm-hub
virsh undefine ocp4-acm-hub
create_lv vgdata poolA lvacmhub 100G
create_lv vgdata poolA lvacmhub-data 100G
create_lv vgdata poolA lvacmhub 100G recreate
create_lv vgdata poolA lvacmhub-data 100G recreate
virt-install --name="ocp4-acm-hub" --vcpus=16 --ram=$((4*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lvacmhub,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lvacmhub-data,device=disk,bus=virtio,format=raw \
--os-variant rhel8.5 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59000 \
--boot menu=on --location /data/kvm/centos.iso \
--initrd-inject helper-ks.cfg --extra-args "inst.ks=file:/helper-ks.cfg"
# 等一会,我们就有了一个centos kvm了。
# on helper web server
# 然后,我们在web server上,下载alma8的安装光盘,我们用minimal的版本就好了。
cd /data/dnf
wget -O alma8.iso http://mirrors.nju.edu.cn/almalinux/8/isos/x86_64/AlmaLinux-8-latest-x86_64-minimal.iso
# wget -O rocky8.iso http://mirrors.nju.edu.cn/rocky/8/isos/x86_64/Rocky-x86_64-minimal.iso
# 我们把光盘挂载在本地,然后我们的web server会自动的发布出去。
mkdir -p /data/dnf/cdrom
mount alma8.iso /data/dnf/cdrom
# mount rocky8.iso /data/dnf/cdrom
# on the centos7 vm
# 登录到新安装的centos7, 修改启动项,让下次启动的时候,直接进入安装界面
sshpass -p 'redhat' ssh-copy-id root@192.168.7.12
ssh root@192.168.7.12
# 在centos7里面,下载alma8的内核和ram disk.
HELPER_URL=http://192.168.7.11:5000/cdrom/
curl -o /boot/initrd.img $HELPER_URL/images/pxeboot/initrd.img
curl -o /boot/vmlinuz $HELPER_URL/images/pxeboot/vmlinuz
SNO_IP=192.168.7.13
SNO_GW=192.168.7.11
SNO_NETMAST=255.255.255.0
SNO_HOSTNAME=acm-demo-hub-master
SNO_IF=enp1s0
SNO_DNS=192.168.7.11
SNO_DISK=/dev/vda
SNO_ROOTFS=http://192.168.7.11:5000/cdrom/
SNO_IGN=http://192.168.7.11:5000/helper-ks-alma.cfg
# 根据参数,我们自定义一个启动项,
# 这个启动项,用alma8的内核和ram disk启动,带IP地址参数,
# kickstart配置文件指向web server, 安装文件源也指向web server
cat << EOF > /etc/grub.d/40_custom
#!/bin/sh
exec tail -n +3 \$0
# This file provides an easy way to add custom menu entries. Simply type the
# menu entries you want to add after this comment. Be careful not to change
# the 'exec tail' line above.
menuentry 'coreos' --class fedora --class gnu-linux --class gnu --class os {
insmod gzio
insmod part_msdos
insmod xfs
set root='hd0,msdos1'
echo 'Loading coreos kernel ...'
linux /vmlinuz rd.neednet=1 ip=$SNO_IP::$SNO_GW:$SNO_NETMAST:$SNO_HOSTNAME:$SNO_IF:none nameserver=$SNO_DNS inst.ks=$SNO_IGN inst.repo=$SNO_ROOTFS
echo 'Loading coreos initrd ...'
initrd /initrd.img
}
EOF
# 把默认启动项设置为我们新定义的 'coreos' 这一项
sed -i 's/^GRUB_DEFAULT=.*/GRUB_DEFAULT="coreos"/' /etc/default/grub
# 写入我们的配置到grub
grub2-mkconfig -o /etc/grub2.cfg
# 重启等待就好了。
reboot
boot into install console
有的时候,我们是能接触到console的,而且自动化配置的很多参数,我们也不知道,那么我们就必须用手动的方式安装。同样的,我们假设设备已经装好了centos7,我们从这里开始开始。
这里面的区别和上面的步骤很小,就是在启动参数里面,我们不要加inst.ks这个参数,也就是没有自动化安装的配置文件,这样重启以后,我们就能停留在我们很熟悉的安装界面上了。
# create a centos7 kvm
cat << 'EOF' > /data/kvm/bridge.sh
#!/usr/bin/env bash
PUB_CONN='eno1'
PUB_IP='172.21.6.102/24'
PUB_GW='172.21.6.254'
PUB_DNS='172.21.1.1'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down baremetal
nmcli con delete baremetal
# RHEL 8.1 appends the word "System" in front of the connection,delete in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname baremetal type bridge con-name baremetal ipv4.method 'manual' \
ipv4.address "$PUB_IP" \
ipv4.gateway "$PUB_GW" \
ipv4.dns "$PUB_DNS"
nmcli con add type bridge-slave ifname "$PUB_CONN" master baremetal
nmcli con down "$PUB_CONN";pkill dhclient;dhclient baremetal
nmcli con up baremetal
EOF
bash /data/kvm/bridge.sh
nmcli con mod baremetal +ipv4.addresses "192.168.7.102/24"
nmcli con up baremetal
mkdir -p /data/kvm
cd /data/kvm
wget -O centos.iso http://mirrors.nju.edu.cn/centos/7.9.2009/isos/x86_64/CentOS-7-x86_64-Minimal-2207-02.iso
sed -i '0,/^network.*/s/^network.*/network --bootproto=static --device=eth0 --gateway=192.168.7.9 --ip=192.168.7.12 --netmask=255.255.255.0 --nameserver=192.168.7.11 --ipv6=auto --activate/' helper-ks.cfg
create_lv() {
var_vg=$1
var_pool=$2
var_lv=$3
var_size=$4
var_action=$5
lvremove -f $var_vg/$var_lv
# lvcreate -y -L $var_size -n $var_lv $var_vg
if [ "$var_action" == "recreate" ]; then
lvcreate --type thin -n $var_lv -V $var_size --thinpool $var_vg/$var_pool
wipefs --all --force /dev/$var_vg/$var_lv
fi
}
virsh destroy ocp4-acm-hub
virsh undefine ocp4-acm-hub
create_lv vgdata poolA lvacmhub 100G
create_lv vgdata poolA lvacmhub-data 100G
create_lv vgdata poolA lvacmhub 100G recreate
create_lv vgdata poolA lvacmhub-data 100G recreate
virt-install --name="ocp4-acm-hub" --vcpus=16 --ram=$((4*1024)) \
--cpu=host-model \
--disk path=/dev/vgdata/lvacmhub,device=disk,bus=virtio,format=raw \
--disk path=/dev/vgdata/lvacmhub-data,device=disk,bus=virtio,format=raw \
--os-variant rhel8.5 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59000 \
--boot menu=on --location /data/kvm/centos.iso \
--initrd-inject helper-ks.cfg --extra-args "inst.ks=file:/helper-ks.cfg"
# on helper web server
cd /data/dnf
wget -O alma8.iso http://mirrors.nju.edu.cn/almalinux/8/isos/x86_64/AlmaLinux-8-latest-x86_64-minimal.iso
mkdir -p /data/dnf/cdrom
mount alma8.iso /data/dnf/cdrom
# on the centos7 vm
sshpass -p 'redhat' ssh-copy-id root@192.168.7.12
ssh root@192.168.7.12
HELPER_URL=http://192.168.7.11:5000/cdrom/
curl -o /boot/initrd.img $HELPER_URL/images/pxeboot/initrd.img
curl -o /boot/vmlinuz $HELPER_URL/images/pxeboot/vmlinuz
SNO_IP=192.168.7.12
SNO_GW=192.168.7.11
SNO_NETMAST=255.255.255.0
SNO_HOSTNAME=acm-demo-hub-master
SNO_IF=enp1s0
SNO_DNS=192.168.7.11
SNO_DISK=/dev/vda
SNO_ROOTFS=http://192.168.7.11:5000/cdrom/
SNO_IGN=http://192.168.7.11:5000/helper-ks-alma8.cfg
cat << EOF > /etc/grub.d/40_custom
#!/bin/sh
exec tail -n +3 \$0
# This file provides an easy way to add custom menu entries. Simply type the
# menu entries you want to add after this comment. Be careful not to change
# the 'exec tail' line above.
menuentry 'coreos' --class fedora --class gnu-linux --class gnu --class os {
insmod gzio
insmod part_msdos
insmod xfs
set root='hd0,msdos1'
echo 'Loading coreos kernel ...'
linux /vmlinuz rd.neednet=1 ip=$SNO_IP::$SNO_GW:$SNO_NETMAST:$SNO_HOSTNAME:$SNO_IF:none nameserver=$SNO_DNS inst.repo=$SNO_ROOTFS
echo 'Loading coreos initrd ...'
initrd /initrd.img
}
EOF
sed -i 's/^GRUB_DEFAULT=.*/GRUB_DEFAULT="coreos"/' /etc/default/grub
grub2-mkconfig -o /etc/grub2.cfg
reboot
build nano boot disk
我们尝试做一个迷你启动盘,在这个迷你启动盘里面,除了内核和ram disk之外,什么也没有。同时,增加内核参数,把安装介质和kickstart配置文件,都指向web server。
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html-single/installation_guide/index#s2-kickstart2-boot-media
dnf install -y isomd5sum
cd /data/dnf
wget -O rocky8.iso http://mirrors.nju.edu.cn/rocky/8/isos/x86_64/Rocky-x86_64-minimal.iso
mount rocky8.iso /data/dnf/cdrom
# create a kickstart file and copy to /data/dnf/
# 本文的目录里面,有kickstart的配置模板,我们改一下里面的IP地址配置和安装源的URL就能用了。
cd /data/dnf
/bin/cp -f helper-ks-alma.cfg helper-ks-alma-wutong.cfg
# for 101
SNO_IP=172.21.6.101
SNO_IP_INSTALL=172.21.6.199
SNO_GW=172.21.6.254
SNO_NETMAST=255.255.255.0
SNO_HOSTNAME=panlab-101
SNO_IF=eno1
SNO_DNS=172.21.1.1
SNO_DISK=/dev/sda
SNO_ROOTFS=http://172.21.6.11:5000/cdrom/
SNO_IGN=http://172.21.6.11:5000/helper-ks-alma-wutong.cfg
# for 102
SNO_IP=172.21.6.102
SNO_IP_INSTALL=172.21.6.199
SNO_GW=172.21.6.254
SNO_NETMAST=255.255.255.0
SNO_HOSTNAME=panlab-102
SNO_IF=eno1
SNO_DNS=172.21.1.1
SNO_DISK=/dev/sda
SNO_ROOTFS=http://172.21.6.11:5000/cdrom/
SNO_IGN=http://172.21.6.11:5000/helper-ks-alma-wutong.cfg
# for 103
SNO_IP=172.21.6.103
SNO_IP_INSTALL=172.21.6.199
SNO_GW=172.21.6.254
SNO_NETMAST=255.255.255.0
SNO_HOSTNAME=panlab-103
SNO_IF=eno1
SNO_DNS=172.21.1.1
SNO_DISK=/dev/sda
SNO_ROOTFS=http://172.21.6.11:5000/cdrom/
SNO_IGN=http://172.21.6.11:5000/helper-ks-alma-wutong.cfg
sed -i "0,/^network.*/s/^network.*/network --bootproto=static --device=$SNO_IF --gateway=$SNO_GW --ip=$SNO_IP --netmask=$SNO_NETMAST --nameserver=$SNO_DNS --ipv6=auto --activate/" helper-ks-alma-wutong.cfg
sed -i "s/network --hostname=.*/network --hostname=$SNO_HOSTNAME/" helper-ks-alma-wutong.cfg
sed -i "0,/^url --url.*/s#^url --url.*#url --url=\"$SNO_ROOTFS\"#" helper-ks-alma-wutong.cfg
sed -i 's/vda/sda/g' helper-ks-alma-wutong.cfg
# mount /data/dnf/alma8.iso /data/dnf/cdrom
mkdir -p /data/tmp/
/bin/cp -pRf /data/dnf/cdrom/ /data/tmp/
cd /data/tmp/cdrom
rm -rf BaseOS/
rm -rf Minimal/
rm -f images/install.img
cat <<EOF > isolinux/isolinux.cfg
default vesamenu.c32
timeout 5
display boot.msg
# Clear the screen when exiting the menu, instead of leaving the menu displayed.
# For vesamenu, this means the graphical background is still displayed without
# the menu itself for as long as the screen remains in graphics mode.
menu clear
menu background splash.png
menu title AlmaLinux 8.7
menu vshift 8
menu rows 18
menu margin 8
#menu hidden
menu helpmsgrow 15
menu tabmsgrow 13
# Border Area
menu color border * #00000000 #00000000 none
# Selected item
menu color sel 0 #ffffffff #00000000 none
# Title bar
menu color title 0 #ff7ba3d0 #00000000 none
# Press [Tab] message
menu color tabmsg 0 #ff3a6496 #00000000 none
# Unselected menu item
menu color unsel 0 #84b8ffff #00000000 none
# Selected hotkey
menu color hotsel 0 #84b8ffff #00000000 none
# Unselected hotkey
menu color hotkey 0 #ffffffff #00000000 none
# Help text
menu color help 0 #ffffffff #00000000 none
# A scrollbar of some type? Not sure.
menu color scrollbar 0 #ffffffff #ff355594 none
# Timeout msg
menu color timeout 0 #ffffffff #00000000 none
menu color timeout_msg 0 #ffffffff #00000000 none
# Command prompt text
menu color cmdmark 0 #84b8ffff #00000000 none
menu color cmdline 0 #ffffffff #00000000 none
# Do not display the actual menu unless the user presses a key. All that is displayed is a timeout message.
menu tabmsg Press Tab for full configuration options on menu items.
menu separator # insert an empty line
menu separator # insert an empty line
label linux
menu label ^Install WZH Linux 8.7
menu default
kernel vmlinuz
append initrd=initrd.img rd.neednet=1 ip=$SNO_IP_INSTALL::$SNO_GW:$SNO_NETMAST:$SNO_HOSTNAME:$SNO_IF:none nameserver=$SNO_DNS inst.ks=$SNO_IGN inst.repo=$SNO_ROOTFS
label check
menu label Test this ^media & install WZH Linux 8.7
kernel vmlinuz
append initrd=initrd.img inst.stage2=hd:LABEL=AlmaLinux-8-7-x86_64-dvd rd.live.check quiet
menu separator # insert an empty line
# utilities submenu
menu begin ^Troubleshooting
menu title Troubleshooting
label vesa
menu indent count 5
menu label Install AlmaLinux 8.7 in ^basic graphics mode
text help
Try this option out if you're having trouble installing
AlmaLinux 8.7.
endtext
kernel vmlinuz
append initrd=initrd.img inst.stage2=hd:LABEL=AlmaLinux-8-7-x86_64-dvd nomodeset quiet
label rescue
menu indent count 5
menu label ^Rescue a AlmaLinux system
text help
If the system will not boot, this lets you access files
and edit config files to try to get it booting again.
endtext
kernel vmlinuz
append initrd=initrd.img inst.stage2=hd:LABEL=AlmaLinux-8-7-x86_64-dvd inst.rescue quiet
label memtest
menu label Run a ^memory test
text help
If your system is having issues, a problem with your
system's memory may be the cause. Use this utility to
see if the memory is working correctly.
endtext
kernel memtest
menu separator # insert an empty line
label local
menu label Boot from ^local drive
localboot 0xffff
menu separator # insert an empty line
menu separator # insert an empty line
label returntomain
menu label Return to ^main menu
menu exit
menu end
EOF
genisoimage -U -r -v -T -J -joliet-long -V "RHEL-6.9" -volset "RHEL-6.9" -A "RHEL-6.9" -b isolinux/isolinux.bin -c isolinux/boot.cat -no-emul-boot -boot-load-size 4 -boot-info-table -eltorito-alt-boot -e images/efiboot.img -no-emul-boot -o ../wzh.iso .
implantisomd5 ../wzh.iso
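做好的 iso,可以先简单验证一下再使用:
# 校验植入的 md5(isomd5sum 包提供的工具)
checkisomd5 ../wzh.iso
# 挂载看看关键文件是否齐全
mount -o loop ../wzh.iso /mnt
ls /mnt/isolinux /mnt/images
umount /mnt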
update minimal
dnf update-minimal --security
end
others
https://access.redhat.com/documentation/zh-cn/red_hat_enterprise_linux/7/html/installation_guide/chap-anaconda-boot-options
- inst.graphical
- inst.resolution=800x600
dmsetup info -c -o name,blkdevname,devnos_used,blkdevs_used
# Name BlkDevName DevNosUsed BlkDevNamesUsed
# vgdata-lvacmhub dm-4 253:2 dm-2
# vgdata-lvacmhub--data dm-5 253:2 dm-2
# vgdata-poolA dm-3 253:2 dm-2
# vgdata-poolA-tpool dm-2 253:1,253:0 dm-1,dm-0
# vgdata-poolA_tdata dm-1 8:16 sdb
# vgdata-poolA_tmeta dm-0 8:16 sdb
Relax and Recover(ReaR) / disaster recovery
Red Hat gives us a system backup and recovery solution (ReaR). Although it is not supported by Red Hat, it works. Now we test it out.
reference:
- https://access.redhat.com/solutions/2115051
# on nfs server
yum install -y nfs-utils
mkdir -p /storage
cat << EOF > /etc/exports
/storage *(fsid=0,rw,sync,no_root_squash,no_subtree_check,crossmnt)
EOF
cat /etc/exports
# /storage *(fsid=0,rw,sync,no_root_squash,no_subtree_check,crossmnt)
systemctl enable --now nfs
systemctl disable --now firewalld
# on target server
yum install -y rear pstree nfs-utils
cat << EOF > /etc/rear/local.conf
OUTPUT=ISO
OUTPUT_URL=nfs://192.168.203.134/storage
BACKUP=NETFS
BACKUP_URL=nfs://192.168.203.134/storage
BACKUP_PROG_EXCLUDE=("${BACKUP_PROG_EXCLUDE[@]}" '/media' '/var/tmp' '/var/crash')
NETFS_KEEP_OLD_BACKUP_COPY=
EOF
rear -d -v mkbackup
# on nfs server, new files created from target centos7 vm
tree /storage
# /storage
# └── target-centos7
# ├── backup.log
# ├── backup.tar.gz
# ├── README
# ├── rear-target-centos7.iso
# ├── rear-target-centos7.log
# ├── selinux.autorelabel
# └── VERSION
# now destroy the target centos vm, and recreate a new one
# boot the new vm using rear-target-centos7.iso
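# in the ReaR rescue environment booted from that iso, run the recovery
# (a rough sketch: rear recover recreates the disk layout and restores
#  files from backup.tar.gz on the NFS server; details depend on your layout)
rear -d -v recover
# when it finishes, reboot from the local disk
reboot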
After reboot, the system comes back.
no-cost rhel subscription / 红帽免费开发者订阅
自从centos宣布停止支持后,红帽为了照顾广大的开发者群体,推出了免费的开发者订阅,可以激活16个系统,还能免费看红帽的知识库,超值,现在就把注册和激活开发者账号的流程走一遍。
注册账户和激活订阅
首先,登录 https://access.redhat.com/ 去创建一个账号
然后访问: https://developers.redhat.com/products/rhel/download
接下来,我们确认一下我们的账号是否有 developer subscription, 访问 https://access.redhat.com/management
我们能够看到,我们刚刚激活了2个subscription,其中一个就是我们要的developer subscription
激活一个系统
接下来,我们用我们的用户名,密码,来激活一个rhel系统
subscription-manager register --auto-attach --username ******** --password ********
dnf repolist
# Updating Subscription Management repositories.
# repo id repo name
# rhel-8-for-x86_64-appstream-rpms Red Hat Enterprise Linux 8 for x86_64 - AppStream (RPMs)
# rhel-8-for-x86_64-baseos-rpms Red Hat Enterprise Linux 8 for x86_64 - BaseOS (RPMs)
访问 https://access.redhat.com/management/systems , 可以看到系统已经激活
能看知识库了
访问这个知识库文章,确认自己能访问知识库啦: https://access.redhat.com/solutions/6178422
调整分区
默认rhel安装,会给一个很大的/home,但是我们做实验,最好把空间都给 / , 不然很容易出现 / 空间不足的情况,那么怎么把 /home 删掉,并且扩大 / 分区呢?
lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
# sr0 11:0 1 1024M 0 rom
# vda 252:0 0 60G 0 disk
# ├─vda1 252:1 0 1G 0 part /boot
# └─vda2 252:2 0 59G 0 part
# ├─rhel_v-root 253:0 0 38.3G 0 lvm /
# ├─rhel_v-swap 253:1 0 2.1G 0 lvm [SWAP]
# └─rhel_v-home 253:2 0 18.7G 0 lvm /home
umount /home
lvremove -f /dev/rhel_v/home
# Logical volume "home" successfully removed.
# comment out the following line to skip the /home partition
sed -i -E 's/^(.*\/home)/# \1/g' /etc/fstab
lvextend -l +100%FREE /dev/rhel_v/root
# Size of logical volume rhel_v/root changed from <38.26 GiB (9794 extents) to <56.94 GiB (14576 extents).
# Logical volume rhel_v/root successfully resized.
xfs_growfs /dev/rhel_v/root
# meta-data=/dev/mapper/rhel_v-root isize=512 agcount=4, agsize=2507264 blks
# = sectsz=512 attr=2, projid32bit=1
# = crc=1 finobt=1, sparse=1, rmapbt=0
# = reflink=1
# data = bsize=4096 blocks=10029056, imaxpct=25
# = sunit=0 swidth=0 blks
# naming =version 2 bsize=4096 ascii-ci=0, ftype=1
# log =internal log bsize=4096 blocks=4897, version=2
# = sectsz=512 sunit=0 blks, lazy-count=1
# realtime =none extsz=4096 blocks=0, rtextents=0
# data blocks changed from 10029056 to 14925824
lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
# sr0 11:0 1 1024M 0 rom
# vda 252:0 0 60G 0 disk
# ├─vda1 252:1 0 1G 0 part /boot
# └─vda2 252:2 0 59G 0 part
# ├─rhel_v-root 253:0 0 57G 0 lvm /
# └─rhel_v-swap 253:1 0 2.1G 0 lvm [SWAP]
reference
- https://developers.redhat.com/blog/2021/02/10/how-to-activate-your-no-cost-red-hat-enterprise-linux-subscription#
- https://developers.redhat.com/products/rhel/download
在红帽官网查询rpm包属于哪个repo
红帽rhel8系统,可以配置多个repo,默认是baseos, appstream。但是我们在项目中,经常会被问到,某个rpm是从哪个repo中获取的。如果这个rpm不在baseos, appstream这两个默认的repo中,那么我们在系统上用命令是查不出来的(至少我现在还不知道有什么办法),必须先把repo加到系统上,才能用dnf命令查出来。这样就变成一个鸡生蛋、蛋生鸡的问题:我们还是不知道应该加哪个repo。
好在,红帽官网提供了一个工具,可以查询rpm包属于哪个repo,只要把rpm包信息输入到系统上,然后用这个工具查询,就可以知道属于哪个repo了。
我们访问这个网页,查询一个包 qemu-kiwi,我们注意到,他是根据我当前的订阅权限查询的,好在作者当前的订阅包含的内容还算多。
点击x86_64,我们就能看到rpm包的具体信息,包含了repo的名字,版本,描述,等等。
于是,我们就得到了最终的答案,qemu-kiwi这个rpm包,是属于 advanced-virt-for-rhel-8-x86_64-rpms 这个repo的
离线环境下 原地从 rhel7 向 rhel8 升级
随着rhel7生命周期结束的日子越来越近,加上rhel8的很多新特性在功能和性能上都有优势,很多客户都在考虑从rhel7向rhel8升级。一般来说,系统升级是高风险操作,非常推荐客户找备份主机,新装rhel8,把应用迁移过去,然后把rhel7这台主机的操作系统删除重装成rhel8,之后再把应用迁移回来。
但是很多客户的生产主机是非常高配的,并没有足够的备份主机,做上述操作。这样就要考虑在原地从rhel7向rhel8升级。由于原地升级风险很大,强烈建议客户联系专业团队,如红帽GPS,做全面的原地升级计划。
一般来说,原地升级要考虑如下问题:
- 系统存储情况,分区配置
- 是否有第三方内核驱动
- 操作系统做了什么定制化
- 启动了什么应用
红帽官方提供了leapp,以及boom来支持原地升级。但是这并不能完全消除原地升级的风险。本文就在一台宿主机上,安装一个rhel7 vm,然后模拟离线环境,升级到rhel8。目的是让准备实施原地升级的客户,能进行原地升级演练,更好地模拟目标系统的状态,并尽早发现原地升级过程中的问题和风险。
视频讲解:
参考材料:
- 红帽官方文档 UPGRADING FROM RHEL 7 TO RHEL 8
- 红帽官方博客 Upgrading from RHEL 7 to RHEL 8 with Leapp and BOOM
- Leapp utility metadata in-place upgrades of RHEL for disconnected upgrades
- Customizing your Red Hat Enterprise Linux in-place upgrade
- How to do an offline upgrade to RHEL 8 with Leapp?
- Boom! Booting RHEL from LVM snapshots
- Why do I get I/O errors when my LVM snapshot reaches 100% usage?
leapp
leapp是红帽官方的升级工具,在rhel8官方文档中,有详细的描述。本文聚焦在全离线环境下,如何使用leapp的方式来进行升级。注意,如果只使用leapp升级系统,那升级过程是单向的,也就是说,一旦开始升级,就不能再恢复到原来的状态,也不能降级。如何恢复或者降级,在后面boom的章节中描述。
订阅离线证书
为了在离线环境进行升级,我们需要先准备 rhel7 & rhel8 repo,那么这就需要订阅离线证书。这里我们就看看怎么下载,并解压缩出来。
我们访问红帽在线订阅系统,选择一个系统,给这个系统添加正确的订阅,然后点击下载,我们就能得到一个zip文件。
然后我们上传这个zip文件到服务器上,之后解压缩。
# on host, we use rocky 8.5
# prepare redhat subscription cert
mkdir -p /data/rhel8/entitle
cd /data/rhel8/entitle
# goto https://access.redhat.com/management/subscriptions
# search employee sku, find a system, go into, and download from subscription
# or goto: https://access.redhat.com/management/systems/4d1e4cc0-2c99-4431-99ce-2f589a24ea11/subscriptions
dnf install -y unzip
unzip *
unzip consumer_export.zip
find . -name *.pem -exec cp {} ./ \;
mkdir -p /data/dockerfile/
cd /data/dockerfile/
ls /data/rhel8/entitle/*.pem | sed -n '2p' | xargs -I DEMO /bin/cp -f DEMO ./
用容器的方式构建离线repo
我们采用容器的方式构建离线repo,这样就可以在离线环境中进行升级了。至于为什么用容器的方式,那是因为我们需要同时为rhel7, rhel8两个系统构建离线repo,一般的方法,这就需要2个系统,而我们只需要一个系统,那么我们就用容器的方式,来模拟两个操作系统环境,进行构建。
这里面的一个问题是,在容器环境中,红帽的订阅机制是不生效的,我们需要一些技巧来解决这个问题。
# prepare rhel8 repo
mkdir -p /data/rhel/dnf
podman run -it --rm -v /data/rhel/dnf:/data/dnf:z \
--mount type=bind,source=$(ls /data/rhel8/entitle/*.pem | sed -n '2p'),target=/etc/pki/entitlement/entitlement.pem,relabel=shared \
--mount type=bind,source=$(ls /data/rhel8/entitle/*.pem | sed -n '2p'),target=/etc/pki/entitlement/entitlement-key.pem,relabel=shared \
registry.access.redhat.com/ubi8 bash
# in podman shell
dnf -y update || true && \
sed -i 's|enabled=1|enabled=0|g' /etc/yum/pluginconf.d/subscription-manager.conf && \
sed -i 's|%(ca_cert_dir)sredhat-uep.pem|/etc/rhsm/ca/redhat-uep.pem|g' /etc/yum.repos.d/redhat.repo && \
sed -i '/ansible-2.9-for-rhel-8-x86_64-rpms/,/enabled = 0/s/enabled = 0/enabled = 1/' /etc/yum.repos.d/redhat.repo && \
sed -i 's|cdn.redhat.com|china.cdn.redhat.com|g' /etc/yum.repos.d/redhat.repo && \
dnf -y update && \
cd /data/dnf && \
dnf reposync -m --download-metadata --delete -n
# prepare rhel7 repo
mkdir -p /data/rhel/yum
podman run -it --rm -v /data/rhel/yum:/data/yum:z \
--mount type=bind,source=$(ls /data/rhel8/entitle/*.pem | sed -n '2p'),target=/etc/pki/entitlement/entitlement.pem,relabel=shared \
--mount type=bind,source=$(ls /data/rhel8/entitle/*.pem | sed -n '2p'),target=/etc/pki/entitlement/entitlement-key.pem,relabel=shared \
registry.access.redhat.com/ubi7 bash
# in podman shell
# https://unix.stackexchange.com/questions/677719/search-and-replace-lines-after-a-regex-match-using-sed
# https://stackoverflow.com/questions/148451/how-to-use-sed-to-replace-only-the-first-occurrence-in-a-file
sed -i 's|%(ca_cert_dir)sredhat-uep.pem|/etc/rhsm/ca/redhat-uep.pem|g' /etc/rhsm/rhsm.conf && \
yum -y update || true && \
sed -i 's|enabled=1|enabled=0|g' /etc/yum/pluginconf.d/subscription-manager.conf && \
sed -i 's|%(ca_cert_dir)sredhat-uep.pem|/etc/rhsm/ca/redhat-uep.pem|g' /etc/yum.repos.d/redhat.repo && \
sed -i 's|cdn.redhat.com|china.cdn.redhat.com|g' /etc/yum.repos.d/redhat.repo && \
sed -i '/rhel-7-server-extras-rpms/,/enabled = 0/s/enabled = 0/enabled = 1/' /etc/yum.repos.d/redhat.repo && \
yum -y update && \
cd /data/yum && \
yum install -y yum-utils createrepo && \
reposync -n -d -l -m && \
createrepo ./
使用ftp来提供repo服务
我们已经准备好了离线repo,那么我们启动一个ftp服务,来提供离线repo的服务。这里面会有一些权限, selinux的问题和解决技巧。
# setup ftp service for repo
dnf -y install vsftpd
sed -i 's/anonymous_enable=NO/anonymous_enable=YES/g' /etc/vsftpd/vsftpd.conf
systemctl enable --now vsftpd
systemctl disable --now firewalld
cd /data/
chcon -R -t public_content_t rhel
chown -R ftp:ftp rhel
cd /var/ftp
mkdir -p /var/ftp/rhel
# https://stackoverflow.com/questions/34736743/ftp-550-failed-to-change-directory
mount --bind /data/rhel /var/ftp/rhel
cat << EOF >> /etc/fstab
/data/rhel /var/ftp/rhel none bind 0 0
EOF
dnf install -y lftp
# try the ftp server
lftp 127.0.0.1
# ls rhel/yum/rhel-7-server-rpms/Packages/a/
安装 rhel7 vm
至此,我们的准备工作都完成了,开始安装rhel7的虚拟机。
# setup bridge for vm
mkdir -p /data/kvm
cat << 'EOF' > /data/kvm/bridge.sh
#!/usr/bin/env bash
PUB_CONN='eno2'
PUB_IP='192.168.7.11/24'
PUB_GW='192.168.7.11'
PUB_DNS='192.168.7.11'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down baremetal
nmcli con delete baremetal
# RHEL 8.1 appends the word "System" in front of the connection,delete in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname baremetal type bridge con-name baremetal ipv4.method 'manual' \
ipv4.address "$PUB_IP" \
ipv4.gateway "$PUB_GW" \
ipv4.dns "$PUB_DNS"
nmcli con add type bridge-slave ifname "$PUB_CONN" master baremetal
nmcli con down "$PUB_CONN";pkill dhclient;dhclient baremetal
nmcli con up baremetal
EOF
bash /data/kvm/bridge.sh
# install rhel7 vm
cd /data/kvm
osinfo-query os | grep rhel7
# rhel7-unknown | Red Hat Enterprise Linux 7 Unknown | 7-unknown | http://redhat.com/rhel/7-unknown
# rhel7.0 | Red Hat Enterprise Linux 7.0 | 7.0 | http://redhat.com/rhel/7.0
# rhel7.1 | Red Hat Enterprise Linux 7.1 | 7.1 | http://redhat.com/rhel/7.1
# rhel7.2 | Red Hat Enterprise Linux 7.2 | 7.2 | http://redhat.com/rhel/7.2
# rhel7.3 | Red Hat Enterprise Linux 7.3 | 7.3 | http://redhat.com/rhel/7.3
# rhel7.4 | Red Hat Enterprise Linux 7.4 | 7.4 | http://redhat.com/rhel/7.4
# rhel7.5 | Red Hat Enterprise Linux 7.5 | 7.5 | http://redhat.com/rhel/7.5
# rhel7.6 | Red Hat Enterprise Linux 7.6 | 7.6 | http://redhat.com/rhel/7.6
# rhel7.7 | Red Hat Enterprise Linux 7.7 | 7.7 | http://redhat.com/rhel/7.7
# rhel7.8 | Red Hat Enterprise Linux 7.8 | 7.8 | http://redhat.com/rhel/7.8
# rhel7.9 | Red Hat Enterprise Linux 7.9 | 7.9 | http://redhat.com/rhel/7.9
# download rhel7 iso
wget -O rhel7.iso 'https://access.cdn.redhat.com/content/origin/files/sha256/19/19d653ce2f04f202e79773a0cbeda82070e7527557e814ebbce658773fbe8191/rhel-server-7.9-x86_64-dvd.iso?user=a768b217cf6ae8041b67586bb4dd5c77&_auth_=1641893589_4f48191c0168e22e5cedac1a1ef79ef8'
pvcreate /dev/sdb
vgcreate vgdata /dev/sdb
create_lv() {
var_vg=$1
var_lv=$2
lvremove -f $var_vg/$var_lv
lvcreate -y -L 120G -n $var_lv $var_vg
wipefs --all --force /dev/$var_vg/$var_lv
}
create_lv vgdata lvrhel7
export http_proxy="http://192.168.195.54:5085"
export https_proxy=${http_proxy}
wget https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/notes/2022/files/helper-ks.cfg
unset http_proxy
unset https_proxy
# https://octowhale.gitbooks.io/kickstart/content/chapter2-kickstart-options-logvol.html
# https://octowhale.gitbooks.io/kickstart/content/chapter2-kickstart-options-network.html
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/performing_an_advanced_rhel_installation/kickstart-commands-and-options-reference_installing-rhel-as-an-experienced-user#network_kickstart-commands-for-network-configuration
sed -i '0,/^network.*/s/^network.*/network --bootproto=static --device=eth0 --gateway=192.168.7.1 --ip=192.168.7.12 --netmask=255.255.255.0 --nameserver=192.168.7.1 --noipv6 --activate/' helper-ks.cfg
sed -i 's/logvol \/ --fstype="xfs" .*/logvol \/ --fstype="xfs" --name=root --vgname=vg0 --percent=50/' helper-ks.cfg
# 配置kvm环境
dnf -y groupinstall "Server with GUI"
dnf -y install qemu-kvm libvirt libguestfs-tools virt-install virt-viewer virt-manager tigervnc-server
systemctl disable --now firewalld
systemctl enable --now libvirtd
# 准备vnc环境
vncpasswd
cat << EOF > ~/.vnc/config
session=gnome
securitytypes=vncauth,tlsvnc
# desktop=sandbox
geometry=1280x800
alwaysshared
EOF
cat << EOF >> /etc/tigervnc/vncserver.users
:1=root
EOF
# systemctl disable vncserver@:1
systemctl start vncserver@:1
# 如果你想停掉vnc server,这么做
systemctl stop vncserver@:1
# start to install the rhel7 vm
virt-install --name="rhel7" --vcpus=8 --ram=8192 \
--cpu=host-model \
--disk path=/dev/vgdata/lvrhel7,device=disk,bus=virtio,format=raw \
--os-variant rhel7.9 --network bridge=baremetal,model=virtio \
--graphics vnc,port=59000 \
--boot menu=on --location /data/kvm/rhel7.iso \
--initrd-inject helper-ks.cfg --extra-args "inst.ks=file:/helper-ks.cfg"
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
配置 rhel 7 vm
rhel7虚拟机装好以后,我们要对他做一些简单的配置,把他的更新源指向我们的离线repo
# setup rhel7 vm
ssh root@192.168.7.12
# disable dns lookup in sshd when ssh login
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
systemctl restart sshd
# link to local repo
cat << 'EOF' > /etc/yum.repos.d/remote.repo
[remote-rhel7]
name=remote-rhel7
baseurl=ftp://192.168.7.11/rhel/yum
enabled=1
gpgcheck=0
EOF
yum update -y
reboot
开始升级
我们使用leapp来升级。leapp会检查系统配置,并给出系统上有什么问题导致不能原地升级。我们要根据leapp的提示,进行系统配置,配置完成后再检查一次,如果检查通过,就可以原地升级了。
本文的系统环境非常简单,还遇到了2个问题,可以想象到,如果是生产环境,会遇到更多的问题。
# perform upgrade
# 先安装升级需要的软件
yum install -y leapp leapp-repository leapp-repository-deps lvm2-python-boom
# 配置升级过程中的安装源
cat << 'EOF' > /etc/leapp/files/leapp_upgrade_repositories.repo
[BaseOS]
name=BaseOS
baseurl=ftp://192.168.7.11/rhel/dnf/rhel-8-for-x86_64-baseos-rpms
enabled=1
gpgcheck=0
[AppStream]
name=AppStream
baseurl=ftp://192.168.7.11/rhel/dnf/rhel-8-for-x86_64-appstream-rpms
enabled=1
gpgcheck=0
EOF
# 因为我们是离线环境,需要有一些升级用的参数文件,需要手动的下载和导入
# https://access.redhat.com/articles/3664871
# download the leapp-data15.tar.gz to server
tar -xzf leapp-data15.tar.gz -C /etc/leapp/files
# 做第一次的升级前检测
# 从结果看,发现了2个问题,导致不能升级
leapp preupgrade --no-rhsm --enablerepo BaseOS --enablerepo AppStream
# .........
# ====> * verify_check_results
# Check all generated results messages and notify user about them.
# ============================================================
# UPGRADE INHIBITED
# ============================================================
# Upgrade has been inhibited due to the following problems:
# 1. Inhibitor: Possible problems with remote login using root account
# 2. Inhibitor: Missing required answers in the answer file
# Consult the pre-upgrade report for details and possible remediation.
# ============================================================
# UPGRADE INHIBITED
# ============================================================
# Debug output written to /var/log/leapp/leapp-preupgrade.log
# ============================================================
# REPORT
# ============================================================
# A report has been generated at /var/log/leapp/leapp-report.json
# A report has been generated at /var/log/leapp/leapp-report.txt
# ============================================================
# END OF REPORT
# ============================================================
# Answerfile has been generated at /var/log/leapp/answerfile
# 我们看看这两个问题是什么
# 还好,红帽工具给出了解决问题的方法和命令
cat /var/log/leapp/leapp-report.txt
# Risk Factor: high (inhibitor)
# Title: Possible problems with remote login using root account
# Summary: OpenSSH configuration file does not explicitly state the option PermitRootLogin in sshd_config file, which will default in RHEL8 to "prohibit-password".
# Remediation: [hint] If you depend on remote root logins using passwords, consider setting up a different user for remote administration or adding "PermitRootLogin yes" to sshd_config.
# Key: 3d21e8cc9e1c09dc60429de7716165787e99515f
# ----------------------------------------
# Risk Factor: high (inhibitor)
# Title: Missing required answers in the answer file
# Summary: One or more sections in answerfile are missing user choices: remove_pam_pkcs11_module_check.confirm
# For more information consult https://leapp.readthedocs.io/en/latest/dialogs.html
# Remediation: [hint] Please register user choices with leapp answer cli command or by manually editing the answerfile.
# [command] leapp answer --section remove_pam_pkcs11_module_check.confirm=True
# Key: d35f6c6b1b1fa6924ef442e3670d90fa92f0d54b
# ----------------------------------------
# ............
# 我们应用红帽的解决方案
sed -i 's/#PermitRootLogin yes/PermitRootLogin yes/g' /etc/ssh/sshd_config
leapp answer --section remove_pam_pkcs11_module_check.confirm=True
# 开始升级
leapp upgrade --no-rhsm --enablerepo BaseOS --enablerepo AppStream
# ..............
# Transaction Summary
# =========================================================================================================================
# Install 213 Packages
# Upgrade 285 Packages
# Remove 66 Packages
# Downgrade 7 Packages
# Total size: 589 M
# DNF will only download packages, install gpg keys, and check the transaction.
# Downloading Packages:
# Running transaction check
# Transaction check succeeded.
# Running transaction test
# Transaction test succeeded.
# Complete!
# ====> * add_upgrade_boot_entry
# Add new boot entry for Leapp provided initramfs.
# A reboot is required to continue. Please reboot your system.
# Debug output written to /var/log/leapp/leapp-upgrade.log
# ============================================================
# REPORT
# ============================================================
# A report has been generated at /var/log/leapp/leapp-report.json
# A report has been generated at /var/log/leapp/leapp-report.txt
# ============================================================
# END OF REPORT
# ============================================================
# Answerfile has been generated at /var/log/leapp/answerfile
reboot
第一次重启,我们能看到多了一个特殊的启动项,不用有任何操作,让他自动继续。
我们能看到启动过程是不一样的,在继续做系统升级的操作。
然后,系统会自动重启,我们能看到,重启以后,重新进行selinux relabel
之后,会再次自动重启,就完成升级了,可以看到简单的完成状态信息
升级之后的配置
至此,我们就完成了rhel7->rhel8的升级,我们要做一点配置,也就是把 rhel8 的更新源给配置进去。
# ssh into the new upgraded rhel8
cat << 'EOF' > /etc/yum.repos.d/remote.repo
[BaseOS]
name=BaseOS
baseurl=ftp://192.168.7.11/rhel/dnf/rhel-8-for-x86_64-baseos-rpms
enabled=1
gpgcheck=0
[AppStream]
name=AppStream
baseurl=ftp://192.168.7.11/rhel/dnf/rhel-8-for-x86_64-appstream-rpms
enabled=1
gpgcheck=0
EOF
dnf makecache
dnf upgrade -y
# ......
# Dependencies resolved.
# Nothing to do.
# Complete!
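升级完成后,还可以顺手确认一下系统版本,并检查有没有残留的 el7 包,这些残留包一般需要按官方升级文档做事后清理(下面只做检查,不做删除):
# 确认系统版本
cat /etc/redhat-release
# 列出残留的 el7 包和 leapp 相关包,后续可按官方文档清理
rpm -qa | grep -E '\.el7|leapp' | sort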
BOOM
之前说的leapp方法,有一个问题,就是如果系统升级失败,会让系统进入不可用状态。遗憾的是,对于定制化很多的生产系统,升级失败并不是小概率事件。为了避免系统升级失败,导致系统完全不可用的情况发生,红帽提供了boom工具,来帮助在升级之前,做一个系统快照,如果升级失败,那么就可以从这个系统快照中恢复系统。
boom工具并不是专为系统原地升级打造的,它是一个老工具。一个常用的使用场景是:先给系统做一个快照,然后对系统进行配置,如果发现系统配置正确,就删除这个快照;如果发现系统配置不正确,就恢复这个快照。
可以看出来,系统原地升级,只不过是boom的一个使用场景。
参考材料:
创建系统快照
我们检查系统当前的状态,并创建系统快照
# after rhel7 vm created
vgs
# VG #PV #LV #SN Attr VSize VFree
# vg0 1 2 0 wz--n- <119.00g 59.25g
lvs
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
# root vg0 -wi-ao---- <59.25g
# swap vg0 -wi-ao---- 512.00m
lvcreate -s -L 10G -n rollback vg0/root
# Logical volume "rollback" created.
lvs
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
# rollback vg0 swi-a-s--- 10.00g root 0.01
# root vg0 owi-aos--- <59.25g
# swap vg0 -wi-ao---- 512.00m
yum install -y leapp leapp-repository leapp-repository-deps lvm2-python-boom
boom create --title "RHEL7 Snapshot" --rootlv vg0/rollback
# WARNING - Boom configuration not found in grub.cfg
# WARNING - Run 'grub2-mkconfig > /boot/grub2/grub.cfg' to enable
# Created entry with boot_id 982beff:
# title RHEL7 Snapshot
# machine-id 036bb4e6c07a4ba9856c4bf68c1bd250
# version 3.10.0-1160.49.1.el7.x86_64
# linux /vmlinuz-3.10.0-1160.49.1.el7.x86_64
# initrd /initramfs-3.10.0-1160.49.1.el7.x86_64.img
# options root=/dev/vg0/rollback ro rd.lvm.lv=vg0/rollback
# grub_users $grub_users
# grub_arg --unrestricted
# grub_class kernel
grub2-mkconfig > /boot/grub2/grub.cfg
# Generating grub configuration file ...
# Found linux image: /boot/vmlinuz-3.10.0-1160.49.1.el7.x86_64
# Found initrd image: /boot/initramfs-3.10.0-1160.49.1.el7.x86_64.img
# Found linux image: /boot/vmlinuz-3.10.0-1160.el7.x86_64
# Found initrd image: /boot/initramfs-3.10.0-1160.el7.x86_64.img
# Found linux image: /boot/vmlinuz-0-rescue-036bb4e6c07a4ba9856c4bf68c1bd250
# Found initrd image: /boot/initramfs-0-rescue-036bb4e6c07a4ba9856c4bf68c1bd250.img
# done
boom list
# BootID Version Name RootDevice
# 982beff 3.10.0-1160.49.1.el7.x86_64 Red Hat Enterprise Linux Server /dev/vg0/rollback
lvs
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
# rollback vg0 swi-a-s--- 10.00g root 0.40
# root vg0 owi-aos--- <59.25g
# swap vg0 -wi-ao---- 512.00m
reboot
Upgrade the system
Next, follow the leapp steps from earlier and perform the in-place upgrade. After the reboot, check the system state: the snapshot volume is already close to 50% used. This is a reminder to give the snapshot volume enough space; if it fills up, the snapshot becomes invalid and the system can no longer be restored from it.
During boot, choose the default kernel.
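If the snapshot volume looks like it might fill up during the upgrade, it can be watched and grown on the fly. A minimal sketch, assuming the vg0/rollback names used above and that vg0 still has free extents:
# watch the Data% column of the snapshot during the upgrade
lvs vg0/rollback
# grow the snapshot before it reaches 100%, otherwise it becomes invalid
lvextend -L +5G vg0/rollback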
# perform upgrade to rhel8
# after upgrade
lvs
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
# rollback vg0 swi-a-s--- 10.00g root 44.57
# root vg0 owi-aos--- <59.25g
# swap vg0 -wi-ao---- 512.00m
rollback to rhel7
Next, we try rolling back to rhel7. Reboot the system and choose the snapshot entry to boot from.
Then perform the volume restore.
# boot using the snapshot
lvconvert --merge /dev/vg0/rollback
# Delaying merge since snapshot is open.
# Merging of snapshot vg0/rollback will occur on next activation of vg0/root.
reboot
After the reboot, choose the old rhel7 kernel to boot the system.
Reinstall the kernel so that the latest rhel7 kernel becomes the default again.
lvs
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
# root vg0 Owi-aos--- <59.25g 11.06
# swap vg0 -wi-ao---- 512.00m
yum list kernel*
# Installed Packages
# kernel.x86_64 3.10.0-1160.el7 @anaconda/7.9
# kernel.x86_64 3.10.0-1160.49.1.el7 @remote-rhel7
# kernel-tools.x86_64 3.10.0-1160.49.1.el7 @remote-rhel7
# kernel-tools-libs.x86_64 3.10.0-1160.49.1.el7 @remote-rhel7
# Available Packages
# kernel-abi-whitelists.noarch 3.10.0-1160.49.1.el7 remote-rhel7
# kernel-debug.x86_64 3.10.0-1160.49.1.el7 remote-rhel7
# kernel-debug-devel.x86_64 3.10.0-1160.49.1.el7 remote-rhel7
# kernel-devel.x86_64 3.10.0-1160.49.1.el7 remote-rhel7
# kernel-doc.noarch 3.10.0-1160.49.1.el7 remote-rhel7
# kernel-headers.x86_64 3.10.0-1160.49.1.el7 remote-rhel7
# https://access.redhat.com/solutions/4094081
yum remove -y kernel-3.10.0-1160.49.1.el7.x86_64 ; yum install -y kernel-3.10.0-1160.49.1.el7.x86_64
# grubby fatal error: unable to find a suitable template
# grubby: doing this would leave no kernel entries. Not writing out new config.
# Verifying : kernel-3.10.0-1160.49.1.el7.x86_64 1/1
# Installed:
# kernel.x86_64 0:3.10.0-1160.49.1.el7
grub2-mkconfig -o /boot/grub2/grub.cfg
yum remove -y kernel-3.10.0-1160.49.1.el7.x86_64 ; yum install -y kernel-3.10.0-1160.49.1.el7.x86_64
reboot
After the reboot, we can see that the latest rhel7 kernel is back as the default boot entry.
accept the upgraded rhel8
Finally, let's look at how to accept the upgrade if the in-place upgrade succeeds. The process is simple: delete the snapshot boot entry from boom and remove the snapshot volume.
# boot into the rhel8
uname -a
# Linux helper 4.18.0-348.7.1.el8_5.x86_64 #1 SMP Wed Dec 8 21:51:17 EST 2021 x86_64 x86_64 x86_64 GNU/Linux
lvs
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
# rollback vg0 swi-a-s--- 10.00g root 43.99
# root vg0 owi-aos--- <59.25g
# swap vg0 -wi-ao---- 512.00m
boom list
# WARNING - Options for BootEntry(boot_id=d291021) do not match OsProfile: marking read-only
# BootID Version Name RootDevice
# 6d82dac 3.10.0-1160.49.1.el7.x86_64 Red Hat Enterprise Linux Server /dev/vg0/rollback
# e1f4484 3.10.0-1160.49.1.el7.x86_64 Red Hat Enterprise Linux Server /dev/mapper/vg0-root
# f7da13a 3.10.0-1160.el7.x86_64 Red Hat Enterprise Linux Server /dev/mapper/vg0-root
# d291021 4.18.0-348.7.1.el8_5.x86_64 Red Hat Enterprise Linux /dev/mapper/vg0-root
boom entry delete 6d82dac
# WARNING - Options for BootEntry(boot_id=d291021) do not match OsProfile: marking read-only
# Deleted 1 entry
boom list
# WARNING - Options for BootEntry(boot_id=d291021) do not match OsProfile: marking read-only
# BootID Version Name RootDevice
# e1f4484 3.10.0-1160.49.1.el7.x86_64 Red Hat Enterprise Linux Server /dev/mapper/vg0-root
# f7da13a 3.10.0-1160.el7.x86_64 Red Hat Enterprise Linux Server /dev/mapper/vg0-root
# d291021 4.18.0-348.7.1.el8_5.x86_64 Red Hat Enterprise Linux /dev/mapper/vg0-root
lvs
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
# rollback vg0 swi-a-s--- 10.00g root 44.41
# root vg0 owi-aos--- <59.25g
# swap vg0 -wi-ao---- 512.00m
lvremove -f /dev/vg0/rollback
# Logical volume "rollback" successfully removed.
lvs
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
# root vg0 -wi-ao---- <59.25g
# swap vg0 -wi-ao---- 512.00m
reboot
After the reboot, the snapshot boot entry is gone.
lvm snapshot full issue
Some background knowledge: if the snapshot volume fills up, it becomes invalid and the system can no longer be restored from it.
Why do I get I/O errors when my LVM snapshot reaches 100% usage?
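LVM can also grow a snapshot automatically once it crosses a usage threshold, which reduces the risk described above. A minimal sketch of the relevant settings; the values are just an example, not what was used in this lab:
# /etc/lvm/lvm.conf, activation section
#   snapshot_autoextend_threshold = 70    # start extending once usage passes 70%
#   snapshot_autoextend_percent = 20      # grow by 20% of the current size each time
# make sure the monitoring service is running so the extension actually triggers
systemctl enable --now lvm2-monitor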
end
Parameters set in sysctl.conf are not applied
A customer hit a strange problem: net.netfilter.nf_conntrack_max was clearly configured in sysctl.conf, yet after a reboot the parameter still did not take effect. What is the cause?
The answer is in the Red Hat knowledge base:
- https://access.redhat.com/solutions/548813
We know that sysctl settings are applied at boot by systemd-sysctl.service. The customer environment also runs docker, and docker uses the iptables NAT functionality, which implicitly loads the nf_conntrack kernel module. Our guess is that the interplay between the docker service and systemd-sysctl at boot is what causes the kernel module parameter to not be applied.
Let's run an experiment to check.
Prepare the test environment
We install a CentOS 7 machine, disable firewalld, and install the community docker-ce. Note that we do not enable docker to start automatically at this point.
# disable firewalld
systemctl disable --now firewalld
# install docker ce
yum install -y yum-utils
yum-config-manager \
--add-repo \
https://download.docker.com/linux/centos/docker-ce.repo
yum install -y docker-ce docker-ce-cli containerd.io
# reboot is important
reboot
Investigate the environment
After the reboot, check whether the nf_conntrack kernel module is loaded and whether the corresponding parameter exists.
# check nf_conntrack module status
lsmod | grep nf_conntrack
# nothing
sysctl net.netfilter.nf_conntrack_max
# sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_max: No such file or directory
As we can see, the nf_conntrack module is not loaded and the parameter does not exist. Now start the docker service manually and look again.
# enable docker service and check nf_conntrack again
systemctl start docker
lsmod | grep nf_conntrack
# nf_conntrack_netlink 36396 0
# nf_conntrack_ipv4 15053 2
# nf_defrag_ipv4 12729 1 nf_conntrack_ipv4
# nf_conntrack 139264 6 nf_nat,nf_nat_ipv4,xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_netlink,nf_conntrack_ipv4
# libcrc32c 12644 2 nf_nat,nf_conntrack
sysctl net.netfilter.nf_conntrack_max
# net.netfilter.nf_conntrack_max = 65536
# check we didn't set the parameter for net.netfilter.nf_conntrack_max
find /etc -type f -exec grep -H nf_conntrack_max {} \;
# nothing
We can see that once the docker service starts, the nf_conntrack kernel module is loaded and net.netfilter.nf_conntrack_max has a value. Searching under /etc shows that we never configured net.netfilter.nf_conntrack_max anywhere.
So we can conclude that, without any extra configuration, the nf_conntrack kernel module is only loaded into the kernel when the docker service starts. If systemd-sysctl runs earlier than the docker service during boot, it cannot set net.netfilter.nf_conntrack_max, which produces exactly the failure we observed.
Next, let's confirm the guess by listing which services must start before the docker service, in other words which services docker depends on.
# check systemd init sequence
systemctl enable --now docker
# we can see, docker start after systemd-sysctl
systemctl list-dependencies docker
# docker.service
# ● ├─containerd.service
# ● ├─docker.socket
# ● ├─system.slice
# ● ├─basic.target
# ● │ ├─microcode.service
# ● │ ├─rhel-dmesg.service
# ● │ ├─selinux-policy-migrate-local-changes@targeted.service
# ● │ ├─paths.target
# ● │ ├─slices.target
# ● │ │ ├─-.slice
# ● │ │ └─system.slice
# ● │ ├─sockets.target
# ● │ │ ├─dbus.socket
# ● │ │ ├─systemd-initctl.socket
# ● │ │ ├─systemd-journald.socket
# ● │ │ ├─systemd-shutdownd.socket
# ● │ │ ├─systemd-udevd-control.socket
# ● │ │ └─systemd-udevd-kernel.socket
# ● │ ├─sysinit.target
# ● │ │ ├─dev-hugepages.mount
# ● │ │ ├─dev-mqueue.mount
# ● │ │ ├─kmod-static-nodes.service
# ● │ │ ├─plymouth-read-write.service
# ● │ │ ├─plymouth-start.service
# ● │ │ ├─proc-sys-fs-binfmt_misc.automount
# ● │ │ ├─rhel-autorelabel-mark.service
# ● │ │ ├─rhel-autorelabel.service
# ● │ │ ├─rhel-domainname.service
# ● │ │ ├─rhel-import-state.service
# ● │ │ ├─rhel-loadmodules.service
# ● │ │ ├─sys-fs-fuse-connections.mount
# ● │ │ ├─sys-kernel-config.mount
# ● │ │ ├─sys-kernel-debug.mount
# ● │ │ ├─systemd-ask-password-console.path
# ● │ │ ├─systemd-binfmt.service
# ● │ │ ├─systemd-firstboot.service
# ● │ │ ├─systemd-hwdb-update.service
# ● │ │ ├─systemd-journal-catalog-update.service
# ● │ │ ├─systemd-journal-flush.service
# ● │ │ ├─systemd-journald.service
# ● │ │ ├─systemd-machine-id-commit.service
# ● │ │ ├─systemd-modules-load.service
# ● │ │ ├─systemd-random-seed.service
# ● │ │ ├─systemd-sysctl.service
# ● │ │ ├─systemd-tmpfiles-setup-dev.service
# ● │ │ ├─systemd-tmpfiles-setup.service
# ● │ │ ├─systemd-udev-trigger.service
# ● │ │ ├─systemd-udevd.service
# ● │ │ ├─systemd-update-done.service
# ● │ │ ├─systemd-update-utmp.service
# ● │ │ ├─systemd-vconsole-setup.service
# ● │ │ ├─cryptsetup.target
# ● │ │ ├─local-fs.target
# ● │ │ │ ├─-.mount
# ● │ │ │ ├─rhel-readonly.service
# ● │ │ │ ├─systemd-fsck-root.service
# ● │ │ │ └─systemd-remount-fs.service
# ● │ │ └─swap.target
# ● │ └─timers.target
# ● │ └─systemd-tmpfiles-clean.timer
# ● └─network-online.target
We can clearly see that systemd-sysctl.service is one of the services that must start before the docker service.
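The ordering can also be checked directly on a running system; a small sketch using standard systemd tooling:
# show the startup chain that leads to docker.service
systemd-analyze critical-chain docker.service
# compare when the two units became active
systemctl show -p ActiveEnterTimestamp systemd-sysctl.service docker.service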
The fix
The Red Hat knowledge base gives the solution: use the built-in rhel-loadmodules.service, which starts before systemd-sysctl.service. Let's look at its content.
systemctl cat rhel-loadmodules.service
# # /usr/lib/systemd/system/rhel-loadmodules.service
# [Unit]
# Description=Load legacy module configuration
# DefaultDependencies=no
# Conflicts=shutdown.target
# After=systemd-readahead-collect.service systemd-readahead-replay.service
# Before=sysinit.target shutdown.target
# ConditionPathExists=|/etc/rc.modules
# ConditionDirectoryNotEmpty=|/etc/sysconfig/modules/
# [Service]
# ExecStart=/usr/lib/systemd/rhel-loadmodules
# Type=oneshot
# TimeoutSec=0
# RemainAfterExit=yes
# [Install]
# WantedBy=sysinit.target
We can see that rhel-loadmodules.service runs /usr/lib/systemd/rhel-loadmodules when /etc/sysconfig/modules/ is non-empty (or /etc/rc.modules exists). Let's look at what /usr/lib/systemd/rhel-loadmodules does.
cat /usr/lib/systemd/rhel-loadmodules
# #!/bin/bash
# # Load other user-defined modules
# for file in /etc/sysconfig/modules/*.modules ; do
# [ -x $file ] && $file
# done
# # Load modules (for backward compatibility with VARs)
# if [ -f /etc/rc.modules ]; then
# /etc/rc.modules
# fi
/usr/lib/systemd/rhel-loadmodules is simple: it walks through every *.modules file under /etc/sysconfig/modules/ and runs each one that is executable.
So the fix is to create a matching .modules file, so that these kernel modules are loaded early in boot and the later systemd-sysctl.service can set the parameters normally.
echo "modprobe nf_conntrack" >> /etc/sysconfig/modules/nf_conntrack.modules && chmod 775 /etc/sysconfig/modules/nf_conntrack.modules
echo "net.netfilter.nf_conntrack_max = 2097152" >> /etc/sysctl.d/99-nf_conntrack.conf
reboot
After the reboot, check the system state; as expected, everything works.
systemctl status rhel-loadmodules.service
# ● rhel-loadmodules.service - Load legacy module configuration
# Loaded: loaded (/usr/lib/systemd/system/rhel-loadmodules.service; enabled; vendor preset: enabled)
# Active: active (exited) since Sun 2022-01-02 08:08:22 UTC; 40s ago
# Process: 350 ExecStart=/usr/lib/systemd/rhel-loadmodules (code=exited, status=0/SUCCESS)
# Main PID: 350 (code=exited, status=0/SUCCESS)
# Tasks: 0
# Memory: 0B
# CGroup: /system.slice/rhel-loadmodules.service
# Jan 02 08:08:22 vultr.guest systemd[1]: Started Load legacy module configuration.
sysctl net.netfilter.nf_conntrack_max
# net.netfilter.nf_conntrack_max = 2097152
The wrong approach: rc-local.service
When sysctl parameters fail to load, the first idea that comes to mind is probably rc-local.service, which reads /etc/rc.local and executes its contents. However, a sysctl -w placed there still does not take effect; we suspect this is because rc-local.service is not guaranteed to run after docker.service.
Let's confirm that.
systemctl cat rc-local
# # /usr/lib/systemd/system/rc-local.service
# # This file is part of systemd.
# #
# # systemd is free software; you can redistribute it and/or modify it
# # under the terms of the GNU Lesser General Public License as published by
# # the Free Software Foundation; either version 2.1 of the License, or
# # (at your option) any later version.
# # This unit gets pulled automatically into multi-user.target by
# # systemd-rc-local-generator if /etc/rc.d/rc.local is executable.
# [Unit]
# Description=/etc/rc.d/rc.local Compatibility
# ConditionFileIsExecutable=/etc/rc.d/rc.local
# After=network.target
# [Service]
# Type=forking
# ExecStart=/etc/rc.d/rc.local start
# TimeoutSec=0
# RemainAfterExit=yes
systemctl list-dependencies rc-local
# rc-local.service
# ● ├─system.slice
# ● └─basic.target
# ● ├─microcode.service
# ● ├─rhel-dmesg.service
# ● ├─selinux-policy-migrate-local-changes@targeted.service
# ● ├─paths.target
# ● ├─slices.target
# ● │ ├─-.slice
# ● │ └─system.slice
# ● ├─sockets.target
# ● │ ├─dbus.socket
# ● │ ├─systemd-initctl.socket
# ● │ ├─systemd-journald.socket
# ● │ ├─systemd-shutdownd.socket
# ● │ ├─systemd-udevd-control.socket
# ● │ └─systemd-udevd-kernel.socket
# ● ├─sysinit.target
# ● │ ├─dev-hugepages.mount
# ● │ ├─dev-mqueue.mount
# ● │ ├─kmod-static-nodes.service
# ● │ ├─plymouth-read-write.service
# ● │ ├─plymouth-start.service
# ● │ ├─proc-sys-fs-binfmt_misc.automount
# ● │ ├─rhel-autorelabel-mark.service
# ● │ ├─rhel-autorelabel.service
# ● │ ├─rhel-domainname.service
# ● │ ├─rhel-import-state.service
# ● │ ├─rhel-loadmodules.service
# ● │ ├─sys-fs-fuse-connections.mount
# ● │ ├─sys-kernel-config.mount
# ● │ ├─sys-kernel-debug.mount
# ● │ ├─systemd-ask-password-console.path
# ● │ ├─systemd-binfmt.service
# ● │ ├─systemd-firstboot.service
# ● │ ├─systemd-hwdb-update.service
# ● │ ├─systemd-journal-catalog-update.service
# ● │ ├─systemd-journal-flush.service
# ● │ ├─systemd-journald.service
# ● │ ├─systemd-machine-id-commit.service
# ● │ ├─systemd-modules-load.service
# ● │ ├─systemd-random-seed.service
# ● │ ├─systemd-sysctl.service
# ● │ ├─systemd-tmpfiles-setup-dev.service
# ● │ ├─systemd-tmpfiles-setup.service
# ● │ ├─systemd-udev-trigger.service
# ● │ ├─systemd-udevd.service
# ● │ ├─systemd-update-done.service
# ● │ ├─systemd-update-utmp.service
# ● │ ├─systemd-vconsole-setup.service
# ● │ ├─cryptsetup.target
# ● │ ├─local-fs.target
# ● │ │ ├─-.mount
# ● │ │ ├─rhel-readonly.service
# ● │ │ ├─systemd-fsck-root.service
# ● │ │ └─systemd-remount-fs.service
# ● │ └─swap.target
# ● └─timers.target
# ● └─systemd-tmpfiles-clean.timer
We can see that rc-local and the docker service have no ordering relationship, so rc-local cannot be guaranteed to run after docker.service. When it runs first, the relevant kernel module is not loaded yet, and the sysctl setting fails.
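If someone insisted on the rc-local route anyway, a systemd drop-in could force the ordering; this is only a sketch of the idea, and the supported fix is still the module file shown earlier:
mkdir -p /etc/systemd/system/rc-local.service.d
cat << EOF > /etc/systemd/system/rc-local.service.d/after-docker.conf
[Unit]
After=docker.service
Wants=docker.service
EOF
systemctl daemon-reload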
reference
- https://access.redhat.com/solutions/548813
- https://www.dazhuanlan.com/bygxb/topics/1709928
others
systemctl list-unit-files | grep docker
# docker.service enabled
# docker.socket disabled
# https://stackoverflow.com/questions/29309717/is-there-any-way-to-list-systemd-services-in-linux-in-the-order-of-they-were-l#fromHistory
# systemd-analyze plot > startup_order.svg
yum install -y graphviz
systemd-analyze dot | dot -Tsvg > systemd.svg
# Color legend: black = Requires
# dark blue = Requisite
# dark grey = Wants
# red = Conflicts
# green = After
# CONNTRACK_MAX = hash table size (HASHSIZE, i.e. nf_conntrack_buckets) * bucket size
modinfo nf_conntrack
# filename: /lib/modules/3.10.0-1160.49.1.el7.x86_64/kernel/net/netfilter/nf_conntrack.ko.xz
# license: GPL
# retpoline: Y
# rhelversion: 7.9
# srcversion: 358A2186187A7E81339334C
# depends: libcrc32c
# intree: Y
# vermagic: 3.10.0-1160.49.1.el7.x86_64 SMP mod_unload modversions
# signer: CentOS Linux kernel signing key
# sig_key: 77:15:99:7F:C4:81:91:84:C7:45:27:B6:08:4B:C7:F9:BB:15:62:7D
# sig_hashalgo: sha256
# parm: tstamp:Enable connection tracking flow timestamping. (bool)
# parm: acct:Enable connection tracking flow accounting. (bool)
# parm: nf_conntrack_helper:Enable automatic conntrack helper assignment (default 1) (bool)
# parm: expect_hashsize:uint
cat /proc/sys/net/nf_conntrack_max
# 65536
cat /proc/sys/net/netfilter/nf_conntrack_max
# 65536
cat /proc/sys/net/netfilter/nf_conntrack_buckets
# 16384
# so the bucket size = 4
echo "`cat /proc/sys/net/netfilter/nf_conntrack_max` / `cat /proc/sys/net/netfilter/nf_conntrack_buckets`" | bc
# 4
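To make the formula concrete: with the default bucket size of 4, raising nf_conntrack_max alone leaves the hash table at its old size, so the bucket count is usually raised along with it. A sketch that matches the 2097152 value used above, assuming the nf_conntrack module exposes the hashsize parameter as it does on RHEL 7 kernels:
# 2097152 / 4 = 524288 buckets keeps the default ratio
echo "options nf_conntrack hashsize=524288" > /etc/modprobe.d/nf_conntrack.conf
# or change it at runtime; this is not persistent across reboots
echo 524288 > /sys/module/nf_conntrack/parameters/hashsize
cat /proc/sys/net/netfilter/nf_conntrack_buckets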
Flash the Mellanox BF2 NIC and test the DPI URL-filter scenario
This article configures the URL-filter scenario of the DPI feature on the BF2: when traffic passes through the BF2, its DPI engine analyzes the packets and blocks them according to the rules.
The rough flow of the experiment: the host runs Rocky Linux, the BF2 is flashed with the official (Ubuntu-based) image and configured, and then a few tests are run from the host.
One part of this article covers how to flash the official image onto the BF2 when the host runs Rocky Linux.
install host with rocky 8.5
First, install Rocky Linux 8.5 on the host.
# install rocky 8.5
export VAR_HOST='rl_panlab104'
# after the OS is installed, add kernel arguments, mainly intel_iommu=on iommu=pt, then reboot
cp /etc/default/grub /etc/default/grub.bak
sed -i "/GRUB_CMDLINE_LINUX/s/resume=[^[:space:]]*//" /etc/default/grub
sed -i "/GRUB_CMDLINE_LINUX/s/rd.lvm.lv=${VAR_HOST}\\/swap//" /etc/default/grub
# https://unix.stackexchange.com/questions/403706/sed-insert-text-after-nth-character-preceding-following-a-given-string
sed -i '/GRUB_CMDLINE_LINUX/s/"/ intel_iommu=on iommu=pt pci=realloc default_hugepagesz=1G hugepagesz=1G hugepages=16 rdblacklist=nouveau"/2' /etc/default/grub
grub2-mkconfig -o /boot/efi/EFI/rocky/grub.cfg
grub2-mkconfig -o /boot/grub2/grub.cfg
# enable kvm nested virtualization (cpu host mode); this step is optional
cat << EOF > /etc/modprobe.d/kvm-nested.conf
options kvm_intel nested=1
options kvm-intel enable_shadow_vmcs=1
options kvm-intel enable_apicv=1
options kvm-intel ept=1
EOF
# the default install has swap and home partitions; this is a test system, so remove them all
umount /home
swapoff /dev/$VAR_HOST/swap
cp /etc/fstab /etc/fstab.bak
sed -i 's/^[^#]*home/#&/' /etc/fstab
sed -i 's/^[^#]*swap/#&/' /etc/fstab
lvremove -f /dev/$VAR_HOST/home
lvremove -f /dev/$VAR_HOST/swap
lvextend -l +100%FREE /dev/$VAR_HOST/root
xfs_growfs /dev/$VAR_HOST/root
# on 104
# first, is console
# https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed
dnf install -y epel-release
dnf install -y byobu htop
dnf groupinstall -y 'Development Tools'
dnf groupinstall -y "Server with GUI"
dnf config-manager --set-enabled powertools
# https://bugzilla.redhat.com/show_bug.cgi?id=1814682
dnf install -y kernel-modules-extra psmisc
mkdir -p /data/down/
cd /data/down/
# next, install the BF2-specific packages to activate the serial console device the BF2 exposes to the host
# https://docs.nvidia.com/doca/sdk/installation-guide/index.html
# wget https://developer.nvidia.com/networking/secure/doca-sdk/doca_1.2.0/doca_120_b215/rshim-2.0.6-3.ge329c69.el7.centos.x86_64.rpm
yum install -y rshim*.rpm
dnf install -y rshim expect wget minicom rpm-build lshw
systemctl enable --now rshim
systemctl status rshim --no-pager -l
dnf install -y openssl-devel
export http_proxy="http://192.168.195.54:5085"
export https_proxy=${http_proxy}
git clone https://github.com/Mellanox/mstflint
cd mstflint
./autogen.sh
./configure --disable-inband
make && make install
# next, configure the host as a NAT router so the OS on the BF2 can reach the internet as well
# nat router on host
# https://access.redhat.com/discussions/4642721
cat << EOF >> /etc/sysctl.d/99-wzh-sysctl.conf
net.ipv4.ip_forward = 1
EOF
sysctl --system
systemctl disable --now firewalld
# on host
cat << EOF >> /etc/rc.d/rc.local
iptables -t nat -A POSTROUTING -o eno2 -j MASQUERADE
EOF
chmod +x /etc/rc.d/rc.local
systemctl enable --now rc-local
flash bf2 with offical image
if you want to flash the bf2 to offical doca ubuntu image, follow steps here.
# on host
mkdir -p /data/soft
cd /data/soft
cat << EOF > pwd
panpan
EOF
cat << EOF > bf.cfg
ubuntu_PASSWORD='`openssl passwd -1 -in pwd`'
EOF
dnf install -y pv
# https://docs.nvidia.com/doca/sdk/installation-guide/index.html
bfb-install --bfb /data/down/DOCA_v1.2.0_BlueField_OS_Ubuntu_20.04-5.4.0-1022-bluefield-5.5-1.0.3.2-3.8.0.11969-1.signed-aarch64.bfb --config bf.cfg --rshim rshim0
# console=hvc0 console=ttyAMA0 earlycon=pl011,0x01000000 fixrtc quiet
# on host
# set ip address to connect to bf2
# nmcli conn add type tun mode tap con-name tmfifo_net0 ifname tmfifo_net0 autoconnect yes ip4 192.168.100.1
nmcli conn modify tmfifo_net0 ipv4.address 192.168.100.1/30
nmcli conn up tmfifo_net0
# if you want to connect to bf2 through serial console
minicom --color on --baudrate 115200 --device /dev/rshim0/console
# on bf2
# login using ubuntu / panpan
sudo -i
passwd
sed -i 's/#PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config
systemctl restart sshd
# set ip address to connect from host
cat << EOF > /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg
network: {config: disabled}
EOF
cat << EOF > /etc/netplan/50-netcfg-wzh.yaml
network:
  ethernets:
    oob_net0:
      dhcp4: true
    tmfifo_net0:
      addresses:
        - 192.168.100.2/30
      dhcp4: false
      nameservers:
        addresses:
          - 172.21.1.1
      routes:
        - metric: 1025
          to: 0.0.0.0/0
          via: 192.168.100.1
  renderer: NetworkManager
  version: 2
EOF
netplan apply
/etc/init.d/networking restart
# on host
# now we can comfortably ssh from the host into the BF2 card
ssh root@192.168.100.2
dpi url-filter test
https://docs.nvidia.com/doca/sdk/url-filter/index.html
Following the official documentation, we run the DPI URL-Filter test.
# on bf2
cd /opt/mellanox/doca/examples/url_filter/bin
echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
systemctl restart mlx-regex
systemctl status mlx-regex
# ● mlx-regex.service - Regex daemon for BlueField 2
# Loaded: loaded (/etc/systemd/system/mlx-regex.service; enabled; vendor preset: enabled)
# Active: active (running) since Thu 2021-12-16 11:47:01 UTC; 7s ago
# Main PID: 55816 (mlx-regex)
# Tasks: 1 (limit: 19083)
# Memory: 564.0K
# CGroup: /system.slice/mlx-regex.service
# └─55816 /usr/bin/mlx-regex
# Dec 16 11:47:01 localhost systemd[1]: Started Regex daemon for BlueField 2.
/opt/mellanox/iproute2/sbin/mlxdevm port show
# pci/0000:03:00.0/294912: type eth netdev en3f0pf0sf0 flavour pcisf controller 0 pfnum 0 sfnum 0
# function:
# hw_addr 02:56:ae:76:cd:e9 state active opstate attached roce true max_uc_macs 128 trust off
# pci/0000:03:00.1/360448: type eth netdev en3f1pf1sf0 flavour pcisf controller 0 pfnum 1 sfnum 0
# function:
# hw_addr 02:26:61:34:13:9e state active opstate attached roce true max_uc_macs 128 trust off
/opt/mellanox/iproute2/sbin/mlxdevm port add pci/0000:03:00.0 flavour pcisf pfnum 0 sfnum 4
/opt/mellanox/iproute2/sbin/mlxdevm port add pci/0000:03:00.0 flavour pcisf pfnum 0 sfnum 5
/opt/mellanox/iproute2/sbin/mlxdevm port show
# pci/0000:03:00.0/294912: type eth netdev en3f0pf0sf0 flavour pcisf controller 0 pfnum 0 sfnum 0
# function:
# hw_addr 02:56:ae:76:cd:e9 state active opstate attached roce true max_uc_macs 128 trust off
# pci/0000:03:00.0/294913: type eth netdev en3f0pf0sf4 flavour pcisf controller 0 pfnum 0 sfnum 4
# function:
# hw_addr 00:00:00:00:00:00 state inactive opstate detached roce true max_uc_macs 128 trust off
# pci/0000:03:00.0/294914: type eth netdev en3f0pf0sf5 flavour pcisf controller 0 pfnum 0 sfnum 5
# function:
# hw_addr 00:00:00:00:00:00 state inactive opstate detached roce true max_uc_macs 128 trust off
# pci/0000:03:00.1/360448: type eth netdev en3f1pf1sf0 flavour pcisf controller 0 pfnum 1 sfnum 0
# function:
# hw_addr 02:26:61:34:13:9e state active opstate attached roce true max_uc_macs 128 trust off
/opt/mellanox/iproute2/sbin/mlxdevm port function set pci/0000:03:00.0/294913 hw_addr 02:25:f2:8d:a2:4c trust on state active
/opt/mellanox/iproute2/sbin/mlxdevm port function set pci/0000:03:00.0/294914 hw_addr 02:25:f2:8d:a2:5c trust on state active
ovs-vsctl del-br ovsbr1
ovs-vsctl add-br sf_bridge1
ovs-vsctl add-br sf_bridge2
ovs-vsctl add-port sf_bridge1 p0
ovs-vsctl add-port sf_bridge1 en3f0pf0sf4
ovs-vsctl add-port sf_bridge2 pf0hpf
ovs-vsctl add-port sf_bridge2 en3f0pf0sf5
ovs-vsctl show
# 04d25b73-2f63-4e47-b7d9-2362cc4d7fda
# Bridge ovsbr2
# Port p1
# Interface p1
# Port en3f1pf1sf0
# Interface en3f1pf1sf0
# Port ovsbr2
# Interface ovsbr2
# type: internal
# Port pf1hpf
# Interface pf1hpf
# Bridge sf_bridge2
# Port sf_bridge2
# Interface sf_bridge2
# type: internal
# Port en3f0pf0sf5
# Interface en3f0pf0sf5
# Port pf0hpf
# Interface pf0hpf
# Bridge sf_bridge1
# Port sf_bridge1
# Interface sf_bridge1
# type: internal
# Port en3f0pf0sf4
# Interface en3f0pf0sf4
# Port p0
# Interface p0
# ovs_version: "2.15.1"
ifconfig en3f0pf0sf4 up
ifconfig en3f0pf0sf5 up
echo mlx5_core.sf.4 > /sys/bus/auxiliary/drivers/mlx5_core.sf_cfg/unbind
echo mlx5_core.sf.4 > /sys/bus/auxiliary/drivers/mlx5_core.sf/bind
echo mlx5_core.sf.5 > /sys/bus/auxiliary/drivers/mlx5_core.sf_cfg/unbind
echo mlx5_core.sf.5 > /sys/bus/auxiliary/drivers/mlx5_core.sf/bind
ls /sys/bus/auxiliary/devices/mlx5_core.sf.*
# /sys/bus/auxiliary/devices/mlx5_core.sf.2:
# driver infiniband infiniband_mad infiniband_verbs mlx5_core.eth.2 mlx5_core.rdma.2 net power sfnum subsystem uevent
# /sys/bus/auxiliary/devices/mlx5_core.sf.3:
# driver infiniband infiniband_mad infiniband_verbs mlx5_core.eth.3 mlx5_core.rdma.3 net power sfnum subsystem uevent
# /sys/bus/auxiliary/devices/mlx5_core.sf.4:
# driver infiniband infiniband_mad infiniband_verbs mlx5_core.eth.4 mlx5_core.rdma.4 net power sfnum subsystem uevent
# /sys/bus/auxiliary/devices/mlx5_core.sf.5:
# driver infiniband infiniband_mad infiniband_verbs mlx5_core.eth.5 mlx5_core.rdma.5 net power sfnum subsystem uevent
cat /sys/bus/auxiliary/devices/mlx5_core.sf.4/sfnum
# 4
# on 104 host with bf2
# nmcli con modify enp6s0f1 ipv4.method manual ipv4.addresses 192.168.99.11/24
nmcli con down enp6s0f1
nmcli con modify enp6s0f0 ipv4.method manual ipv4.addresses 192.168.99.11/24
nmcli con up enp6s0f0
# on 104 bf2
# create the url filter rules
/opt/mellanox/doca/examples/url_filter/bin/doca_url_filter -a 0000:03:00.0,class=regex -a auxiliary:mlx5_core.sf.4,sft_en=1 -a auxiliary:mlx5_core.sf.5,sft_en=1 -- -p
URL FILTER>> create database
URL FILTER>> filter http wzh_hits_msg wzhtest
URL FILTER>> commit database /tmp/signature.txt
# /tmp/104052/signatures.rules
# rules file is /tmp/104052/signatures.rules
# Info: Setting target hardware version to v5.7...done
# Info: Setting virtual prefix mode to 0...done
# Info: Setting prefix capacity to 32K...done
# Info: Setting compiler objective value to 5...done
# Info: Setting number of threads for compilation to 1...done
# Info: Reading ruleset...done
# Info: Detected 2 rules
# Info: Enabling global single-line mode...done
# Info: Setting maximum TPE data width to 4...done
# Info: Scanning rules...[==============================]...done
# Info: Analising possible prefix usage...[==============================]...done
# Info: Mapping prefixes, phase 1...[==============================]...done
# Info: Mapping prefixes, phase 2...[==============================]...done
# Info: Running rules analysis...[==============================]...done
# Info: Optimizing memory map...[==============================]...done
# Info: Analyzing memory map...[==============================]...done
# Info: Calculating thread instructions...[==============================]...done
# Info: Beginning to write memory map for ROF2...done
# Info: PPE total 1-byte prefix usage: 0/256 (0%)
# Info: PPE total 2-byte prefix usage: 0/2048 (0%)
# Info: PPE total 3-byte prefix usage: 0/2048 (0%)
# Info: PPE total 4-byte prefix usage: 1/32768 (0.00305176%)
# Info: TPE instruction RAM TCM partition usage: 2048/2048 (100%)
# Info: TPE instruction RAM external memory partition usage: 6207/13M (0.0455343%)
# Info: TPE class RAM usage: 1/256 (0.390625%)
# Info: Estimated threads/byte: 5.183e-10
# Info: Finalizing memory map for ROF2...done
# Info: Storing ROF2 data...done
# Info: Number of rules compiled = 2/2
# Info: Writing ROF2 file to /tmp/104052/rof/signatures_compiled.rof2
# Info: Writing binary ROF2 file to /tmp/104052/rof/signatures_compiled.rof2.binary...done
URL FILTER>> [12:36:50:606702][DOCA][I][UFLTR::Core]: SIG ID: 1, URL MSG: wzh_hits_msg, SFT_FID: 1
# on 101
curl http://192.168.99.11
# ....
# <footer class="col-sm-12">
# <a href="https://apache.org">Apache™</a> is a registered trademark of <a href="https://apache.org">the Apache Software Foundation</a> in the United States and/or other countries.<br />
# <a href="https://nginx.org">NGINX™</a> is a registered trademark of <a href="https://">F5 Networks, Inc.</a>.
# </footer>
# </body>
# </html>
curl http://192.168.99.11/test
# <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
# <html><head>
# <title>404 Not Found</title>
# </head><body>
# <h1>Not Found</h1>
# <p>The requested URL was not found on this server.</p>
# </body></html>
# the following url hits the rule, so the request fails.
# urls that do not hit any rule can still reach the http service, as shown above.
curl http://192.168.99.11/wzhtest
# curl: (56) Recv failure: Connection timed out
performance test
A quick performance test. The physical setup is limited, so the numbers are not authoritative.
# on 104 host
dnf install -y iperf3
iperf3 -s -p 6666
# on 101 host
iperf3 -c 192.168.99.11 -p 6666
# Connecting to host 192.168.99.11, port 6666
# [ 5] local 192.168.99.21 port 37060 connected to 192.168.99.11 port 6666
# [ ID] Interval Transfer Bitrate Retr Cwnd
# [ 5] 0.00-1.00 sec 1.40 GBytes 12.1 Gbits/sec 17 905 KBytes
# [ 5] 1.00-2.00 sec 1.46 GBytes 12.6 Gbits/sec 26 795 KBytes
# [ 5] 2.00-3.00 sec 1.41 GBytes 12.1 Gbits/sec 71 922 KBytes
# [ 5] 3.00-4.00 sec 1.49 GBytes 12.8 Gbits/sec 0 998 KBytes
# [ 5] 4.00-5.00 sec 1.44 GBytes 12.4 Gbits/sec 44 1010 KBytes
# [ 5] 5.00-6.00 sec 1.34 GBytes 11.5 Gbits/sec 101 796 KBytes
# [ 5] 6.00-7.00 sec 1.45 GBytes 12.5 Gbits/sec 9 925 KBytes
# [ 5] 7.00-8.00 sec 1.39 GBytes 11.9 Gbits/sec 0 1014 KBytes
# [ 5] 8.00-9.00 sec 1.45 GBytes 12.4 Gbits/sec 62 930 KBytes
# [ 5] 9.00-10.00 sec 1.44 GBytes 12.3 Gbits/sec 157 1.07 MBytes
# - - - - - - - - - - - - - - - - - - - - - - - - -
# [ ID] Interval Transfer Bitrate Retr
# [ 5] 0.00-10.00 sec 14.3 GBytes 12.3 Gbits/sec 487 sender
# [ 5] 0.00-10.04 sec 14.3 GBytes 12.2 Gbits/sec receiver
# iperf Done.
ethtool enp5s0f1
# Settings for enp5s0f1:
# Supported ports: [ Backplane ]
# Supported link modes: 1000baseKX/Full
# 10000baseKR/Full
# 25000baseCR/Full
# 25000baseKR/Full
# 25000baseSR/Full
# Supported pause frame use: Symmetric
# Supports auto-negotiation: Yes
# Supported FEC modes: None RS BASER
# Advertised link modes: 1000baseKX/Full
# 10000baseKR/Full
# 25000baseCR/Full
# 25000baseKR/Full
# 25000baseSR/Full
# Advertised pause frame use: Symmetric
# Advertised auto-negotiation: Yes
# Advertised FEC modes: None RS BASER
# Link partner advertised link modes: Not reported
# Link partner advertised pause frame use: No
# Link partner advertised auto-negotiation: Yes
# Link partner advertised FEC modes: Not reported
# Speed: 25000Mb/s
# Duplex: Full
# Auto-negotiation: on
# Port: Direct Attach Copper
# PHYAD: 0
# Transceiver: internal
# Supports Wake-on: d
# Wake-on: d
# Current message level: 0x00000004 (4)
# link
# Link detected: yes
others
# firewall-cmd --permanent --direct --add-rule ipv4 nat POSTROUTING 0 -o eth_ext -j MASQUERADE
# firewall-cmd --permanent --direct --add-rule ipv4 filter FORWARD 0 -i eth_int -o eth_ext -j ACCEPT
# firewall-cmd --permanent --direct --add-rule ipv4 filter FORWARD 0 -i eth_ext -o eth_int -m state --state RELATED,ESTABLISHED -j ACCEPT
# firewall-cmd --permanent --add-port=80/tcp
# firewall-cmd --permanent --add-port=443/tcp
# firewall-cmd --permanent --add-port=53/tcp
# firewall-cmd --permanent --add-port=53/udp
# firewall-cmd --permanent --add-masquerade
# firewall-cmd --reload
# firewall-cmd --permanent --direct --remove-rule ipv4 nat POSTROUTING 0 -o eth_ext -j MASQUERADE
# firewall-cmd --permanent --direct --remove-rule ipv4 filter FORWARD 0 -i eth_int -o eth_ext -j ACCEPT
# firewall-cmd --permanent --direct --remove-rule ipv4 filter FORWARD 0 -i eth_ext -o eth_int -m state --state RELATED,ESTABLISHED -j ACCEPT
# firewall-cmd --permanent --remove-port=80/tcp
# firewall-cmd --permanent --remove-port=443/tcp
# firewall-cmd --permanent --remove-port=53/tcp
# firewall-cmd --permanent --remove-port=53/udp
# firewall-cmd --permanent --remove-masquerade
# firewall-cmd --reload
Enable SNAP on the Mellanox BF2 NIC and configure NVMe over Fabrics support
This article describes how to use the Mellanox BF2 NIC with SNAP to attach a remote NVMe device to the host, achieving NVMe over Fabrics.
The lab uses two physical machines, both with Rocky Linux 8.5 on the host: one machine has the NVMe device, the other has the BF2 NIC.
The lab architecture is as follows:
Set up the lab environment
First, flash the BF2 with Mellanox's official DOCA BFB image (roughly speaking, re-flashing the NIC firmware); follow the documentation in the previous section.
Next, give the host that owns the NVMe device the NVMe over Fabrics target capability.
# on 101
# config nvme storage server side
# https://access.redhat.com/documentation/zh-cn/red_hat_enterprise_linux/8/html/managing_storage_devices/overview-of-nvme-over-fabric-devicesmanaging-storage-devices
nmcli con modify enp5s0f1 ipv4.method manual ipv4.addresses 192.168.99.21/24
nmcli con up enp5s0f1
yum install -y nvmetcli
cd /data/down/
# wget http://git.infradead.org/users/hch/nvmetcli.git/blob_plain/0a6b088db2dc2e5de11e6f23f1e890e4b54fee64:/rdma.json
cat << EOF > /data/down/rdma.json
{
"hosts": [
{
"nqn": "hostnqn"
}
],
"ports": [
{
"addr": {
"adrfam": "ipv4",
"traddr": "192.168.99.21",
"treq": "not specified",
"trsvcid": "4420",
"trtype": "rdma"
},
"portid": 2,
"referrals": [],
"subsystems": [
"testnqn"
]
}
],
"subsystems": [
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1"
},
"namespaces": [
{
"device": {
"nguid": "ef90689c-6c46-d44c-89c1-4067801309a8",
"path": "/dev/nvme0n1"
},
"enable": 1,
"nsid": 1
}
],
"nqn": "testnqn"
}
]
}
EOF
modprobe nvmet-rdma
nvmetcli restore /data/down/rdma.json
dmesg
# ........
# [32664.912901] nvmet: adding nsid 1 to subsystem testnqn
# [32664.914013] nvmet_rdma: enabling port 2 (192.168.99.21:4420)
# to clear the config
nvmetcli clear
nvme list
# Node SN Model Namespace Usage Format FW Rev
# --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
# /dev/nvme0n1 CVCQ726600A0400AGN INTEL SSDPEDMW400G4 1 400.09 GB / 400.09 GB 512 B + 0 B 8EV10171
# run a quick test
yum install nvme-cli
modprobe nvme-rdma
nvme discover -t rdma -a 192.168.99.21 -s 4420
# Discovery Log Number of Records 1, Generation counter 2
# =====Discovery Log Entry 0======
# trtype: rdma
# adrfam: ipv4
# subtype: nvme subsystem
# treq: not specified, sq flow control disable supported
# portid: 2
# trsvcid: 4420
# subnqn: testnqn
# traddr: 192.168.99.21
# rdma_prtype: not specified
# rdma_qptype: connected
# rdma_cms: rdma-cm
# rdma_pkey: 0x0000
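As a side note, any ordinary Linux client that can reach 192.168.99.21 over RDMA could consume this target directly with nvme-cli, without a BF2 in the middle; a minimal sketch, assuming the same target parameters as above:
nvme connect -t rdma -n testnqn -a 192.168.99.21 -s 4420
# the remote namespace shows up as a local /dev/nvmeXnY block device
nvme list
# disconnect when done
nvme disconnect -n testnqn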
Next, we do some configuration on the BF2.
# on 104 bf2
# check the basic info first
ovs-vsctl show
# 04d25b73-2f63-4e47-b7d9-2362cc4d7fda
# Bridge ovsbr2
# Port p1
# Interface p1
# Port en3f1pf1sf0
# Interface en3f1pf1sf0
# Port ovsbr2
# Interface ovsbr2
# type: internal
# Port pf1hpf
# Interface pf1hpf
# Bridge ovsbr1
# Port en3f0pf0sf0
# Interface en3f0pf0sf0
# Port pf0hpf
# Interface pf0hpf
# Port p0
# Interface p0
# Port ovsbr1
# Interface ovsbr1
# type: internal
# ovs_version: "2.15.1"
# nmcli con modify enp3s0f0s0 ipv4.method manual ipv4.addresses 192.168.99.11/24
# nmcli con up enp3s0f0s0
# ip addr add 192.168.99.11/24 dev enp3s0f0s0
# ip addr del 192.168.99.11/24 dev enp3s0f0s0
# give one sf port an ip address so the bf2 can reach the remote nvme target
cat << EOF > /etc/netplan/70-wzh-mlnx.yaml
network:
  ethernets:
    enp3s0f0s0:
      addresses:
        - 192.168.99.11/24
      dhcp4: false
  renderer: NetworkManager
  version: 2
EOF
# configure the bf2 firmware parameters
mlxconfig -y -d /dev/mst/mt41686_pciconf0 s \
PF_BAR2_ENABLE=0 \
PER_PF_NUM_SF=1
mlxconfig -y -d /dev/mst/mt41686_pciconf0 s \
PCI_SWITCH_EMULATION_ENABLE=1 \
PCI_SWITCH_EMULATION_NUM_PORT=16 \
VIRTIO_NET_EMULATION_ENABLE=1 \
VIRTIO_NET_EMULATION_NUM_VF=0 \
VIRTIO_NET_EMULATION_NUM_PF=0 \
VIRTIO_NET_EMULATION_NUM_MSIX=16 \
ECPF_ESWITCH_MANAGER=1 \
ECPF_PAGE_SUPPLIER=1 \
SRIOV_EN=0 \
PF_SF_BAR_SIZE=8 \
PF_TOTAL_SF=64
mlxconfig -y -d /dev/mst/mt41686_pciconf0.1 s \
PF_SF_BAR_SIZE=10 \
PF_TOTAL_SF=64
mlxconfig -y -d /dev/mst/mt41686_pciconf0 s \
VIRTIO_BLK_EMULATION_ENABLE=1 \
VIRTIO_BLK_EMULATION_NUM_PF=0 \
VIRTIO_BLK_EMULATION_NUM_VF=0 \
VIRTIO_BLK_EMULATION_NUM_MSIX=16 \
EXP_ROM_VIRTIO_BLK_UEFI_x86_ENABLE=0
# clear the original snap config
# the system creates a demo nvme device by default; clear the default config so the experiment is easier to follow
/bin/cp -f /etc/mlnx_snap/snap_rpc_init_bf2.conf /etc/mlnx_snap/snap_rpc_init_bf2.conf.wzh
/bin/cp -f /etc/mlnx_snap/spdk_rpc_init.conf /etc/mlnx_snap/spdk_rpc_init.conf.wzh
echo "" > /etc/mlnx_snap/snap_rpc_init_bf2.conf
echo "" > /etc/mlnx_snap/spdk_rpc_init.conf
# remember to COLD reboot
poweroff
# on bf2
# after the reboot, set up the snap service manually, step by step, to get a deeper understanding of spdk and snap
# set the snap step by step
snap_rpc.py subsystem_nvme_create Mellanox_NVMe_SNAP "Mellanox NVMe SNAP Controller"
# {
# "nqn": "nqn.2021-06.mlnx.snap:8b82f658f138ceaf83e3bfc261a7fb14:0",
# "subsys_id": 0
# }
snap_rpc.py controller_nvme_create mlx5_0 --subsys_id 0 --pf_id 0
# {
# "name": "NvmeEmu0pf0",
# "cntlid": 0,
# "version": "1.3.0",
# "offload": false,
# "mempool": false,
# "max_nsid": 1024,
# "max_namespaces": 1024
# }
spdk_rpc.py bdev_nvme_attach_controller -b Nvme0 -t rdma -a 192.168.99.21 -f ipv4 -s 4420 -n testnqn
# Nvme0n1
snap_rpc.py controller_nvme_namespace_attach -c NvmeEmu0pf0 spdk Nvme0n1 1
snap_rpc.py emulation_device_attach --num_msix 8 mlx5_0 virtio_blk
# {
# "emulation_manager": "mlx5_0",
# "emulation_type": "virtio_blk",
# "pci_type": "physical function",
# "pci_index": 0
# }
snap_rpc.py controller_virtio_blk_create mlx5_0 --bdev_type spdk --bdev Nvme0n1 --pf_id 0 --num_queues 7
# VblkEmu0pf0
# configuration done, check the status
snap_rpc.py controller_nvme_namespace_list -n nqn.2021-06.mlnx.snap:8b82f658f138ceaf83e3bfc261a7fb14:0 -i 0
# {
# "name": "NvmeEmu0pf0",
# "cntlid": 0,
# "Namespaces": [
# {
# "nsid": 1,
# "bdev": "Nvme0n1",
# "bdev_type": "spdk",
# "qn": "",
# "protocol": "",
# "snap-direct": true
# }
# ]
# }
snap_rpc.py emulation_managers_list
# [
# {
# "emulation_manager": "mlx5_0",
# "hotplug_support": true,
# "supported_types": [
# "nvme",
# "virtio_blk",
# "virtio_net"
# ]
# }
# ]
spdk_rpc.py bdev_nvme_get_controllers
# [
# {
# "name": "Nvme0",
# "trid": {
# "trtype": "RDMA",
# "adrfam": "IPv4",
# "traddr": "192.168.99.21",
# "trsvcid": "4420",
# "subnqn": "testnqn"
# }
# }
# ]
snap_rpc.py controller_list
# [
# {
# "mempool": false,
# "name": "VblkEmu0pf0",
# "emulation_manager": "mlx5_0",
# "type": "virtio_blk",
# "pci_index": 0,
# "pci_bdf": "07:00.0"
# },
# {
# "subnqn": "nqn.2021-06.mlnx.snap:8b82f658f138ceaf83e3bfc261a7fb14:0",
# "cntlid": 0,
# "version": "1.3.0",
# "offload": false,
# "mempool": false,
# "max_nsid": 1024,
# "max_namespaces": 1024,
# "name": "NvmeEmu0pf0",
# "emulation_manager": "mlx5_0",
# "type": "nvme",
# "pci_index": 0,
# "pci_bdf": "06:00.2"
# }
# ]
Test
# on 101, rocky linux
lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
# sda 8:0 0 278.9G 0 disk
# ├─sda1 8:1 0 1G 0 part /boot
# └─sda2 8:2 0 277.9G 0 part
# └─rl_lab101-root 253:0 0 277.9G 0 lvm /
# sr0 11:0 1 1024M 0 rom
# nvme0n1 259:0 0 372.6G 0 disk
# └─nvme-data 253:1 0 372.6G 0 lvm
# on 104 host, rocky linux
# before snap setting
lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
# sda 8:0 0 278.9G 0 disk
# ├─sda1 8:1 0 600M 0 part /boot/efi
# ├─sda2 8:2 0 1G 0 part /boot
# └─sda3 8:3 0 277.3G 0 part
# └─rl_panlab104-root 253:0 0 277.3G 0 lvm /
# after snap setting
lsblk
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
# sda 8:0 0 278.9G 0 disk
# ├─sda1 8:1 0 600M 0 part /boot/efi
# ├─sda2 8:2 0 1G 0 part /boot
# └─sda3 8:3 0 277.3G 0 part
# └─rl_panlab104-root 253:0 0 277.3G 0 lvm /
# vda 252:0 0 372.6G 0 disk
# └─nvme-data 253:1 0 372.6G 0 lvm
mount /dev/mapper/nvme-data /mnt
ls /mnt
# bgp-router.qcow2 ocp4-master-0.qcow2 ocp4-windows.qcow2
Persist the configuration
The configuration above was experimental, done step by step by hand. To make it permanent, do the following.
# on 104, bf2
cat << EOF > snap_rpc_init_bf2.conf
subsystem_nvme_create Mellanox_NVMe_SNAP "Mellanox NVMe SNAP Controller"
controller_nvme_create mlx5_0 --subsys_id 0 --pf_id 0
controller_nvme_namespace_attach -c NvmeEmu0pf0 spdk Nvme0n1 1
emulation_device_attach --num_msix 8 mlx5_0 virtio_blk
controller_virtio_blk_create mlx5_0 --bdev_type spdk --bdev Nvme0n1 --pf_id 0 --num_queues 7
EOF
cat << EOF > spdk_rpc_init.conf
bdev_nvme_attach_controller -b Nvme0 -t rdma -a 192.168.99.21 -f ipv4 -s 4420 -n testnqn
EOF
# cold reboot
poweroff
Mellanox CX6 vDPA hardware offload, ovs-kernel mode
This article explains how to implement vDPA hardware offload with the Mellanox CX6 Dx NIC.
Video walkthrough:
Introduction to vDPA hardware offload
Since we are talking about vDPA offload, let's first briefly explain what it is.
vDPA (virtio data path acceleration) is a kernel framework that was formally merged in 2020. NIC vendors build vDPA NICs, meaning the data path follows the virtio spec while the control plane is provided by the vendor driver.
Below is the architecture when vDPA is deployed on a virtualization platform:
Below is the architecture when vDPA is deployed on a k8s platform:
The diagrams above are borrowed from Red Hat's articles introducing the background of vDPA. This experiment follows the Mellanox documentation; from Mellanox's point of view there are two ways to do vDPA:
- Configure ovs-dpdk: ovs sets up the vdpa port and creates the socket; the VM attaches the vdpa device through that socket.
- Configure ovs-kernel: run the vdpa-dpdk application, which creates the socket; the VM attaches the vdpa device through that socket.
For the first approach, the Mellanox documentation says ovs-dpdk is only supported up to rhel/centos 7; our environment is rhel/rocky 8.4, so we use the second approach.
The background here is deliberately brief; the links below go deeper:
- Introduction to vDPA kernel framework
- How vhost-user came into being: Virtio-networking and DPDK
- A journey to the vhost-users realm
- How deep does the vDPA rabbit hole go?
- Achieving network wirespeed in an open standard manner: introducing vDPA
- vDPA hands on: The proof is in the pudding
- vdpa-deployment from redhat-nfvpe
- Virtio-networking series from redhat blog
- How vDPA can help network service providers simplify CNF/VNF certification
- vDPA : On the road to production
- vDPA原理和实现
- VirtIO and TC
DPDK has a special concept, the VF representor, described in the DPDK documentation; loosely speaking, it is the control-plane counterpart of a VF.
- https://doc.dpdk.org/guides-18.11/prog_guide/switch_representation.html
.-------------. .-------------. .-------------.
| hypervisor | | VM 1 | | VM 2 |
| application | | application | | application |
`--+---+---+--' `----------+--' `--+----------'
| | | | |
| | `-------------------. | |
| `---------. | | |
| | | | |
.-----+-----. .-----+-----. .-----+-----. | |
| port_id 3 | | port_id 4 | | port_id 5 | | |
`-----+-----' `-----+-----' `-----+-----' | |
| | | | |
.-+--. .-----+-----. .-----+-----. .---+--. .--+---.
| PF | | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
`-+--' `-----+-----' `-----+-----' `---+--' `--+---'
| | | | |
| | .---------' | |
`-----. | | .-----------------' |
| | | | .---------------------'
| | | | |
.--+-------+---+---+---+--.
| managed interconnection |
`------------+------------'
|
.----+-----.
| physical |
| port 0 |
`----------'
The architecture of this experiment is as follows:
System installation
export VAR_HOST='rl_panlab105'
# after the OS is installed, add kernel arguments, mainly intel_iommu=on iommu=pt, then reboot
cp /etc/default/grub /etc/default/grub.bak
sed -i "/GRUB_CMDLINE_LINUX/s/resume=[^[:space:]]*//" /etc/default/grub
sed -i "/GRUB_CMDLINE_LINUX/s/rd.lvm.lv=${VAR_HOST}\\/swap//" /etc/default/grub
# https://unix.stackexchange.com/questions/403706/sed-insert-text-after-nth-character-preceding-following-a-given-string
sed -i '/GRUB_CMDLINE_LINUX/s/"/ intel_iommu=on iommu=pt default_hugepagesz=1G hugepagesz=1G hugepages=16 rdblacklist=nouveau"/2' /etc/default/grub
grub2-mkconfig -o /boot/efi/EFI/rocky/grub.cfg
grub2-mkconfig -o /boot/grub2/grub.cfg
# enable kvm nested virtualization (cpu host mode); this step is optional
cat << EOF > /etc/modprobe.d/kvm-nested.conf
options kvm_intel nested=1
options kvm-intel enable_shadow_vmcs=1
options kvm-intel enable_apicv=1
options kvm-intel ept=1
EOF
# the default install has swap and home partitions; this is a test system, so remove them all
umount /home
swapoff /dev/$VAR_HOST/swap
cp /etc/fstab /etc/fstab.bak
sed -i 's/^[^#]*home/#&/' /etc/fstab
sed -i 's/^[^#]*swap/#&/' /etc/fstab
lvremove -f /dev/$VAR_HOST/home
lvremove -f /dev/$VAR_HOST/swap
lvextend -l +100%FREE /dev/$VAR_HOST/root
xfs_growfs /dev/$VAR_HOST/root
# now start installing the NIC driver
# 103 driver install
# https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed
mkdir -p /data/down/
cd /data/down/
dnf groupinstall -y 'Development Tools'
dnf groupinstall -y "Server with GUI"
wget https://www.mellanox.com/downloads/ofed/MLNX_OFED-5.4-3.0.3.0/MLNX_OFED_LINUX-5.4-3.0.3.0-rhel8.4-x86_64.tgz
tar zvxf *.tgz
cd /data/down/MLNX_OFED_LINUX-5.4-3.0.3.0-rhel8.4-x86_64
dnf install -y tcl tk kernel-modules-extra python36 make gcc-gfortran tcsh unbound
./mlnxofedinstall --all --force --distro rhel8.4
# ./mlnxofedinstall --dpdk --ovs-dpdk --upstream-libs --add-kernel-support --force --distro rhel8.4
reboot
systemctl enable --now mst
systemctl enable --now openibd
cat << EOF > /etc/yum.repos.d/mlx.repo
[mlnx_ofed]
name=MLNX_OFED Repository
baseurl=file:///data/down/MLNX_OFED_LINUX-5.4-3.0.3.0-rhel8.4-x86_64/RPMS
enabled=1
gpgcheck=0
EOF
dnf makecache
# install the dpdk-related software
mkdir -p /data/soft
cd /data/soft
dnf config-manager --set-enabled powertools
dnf install -y ninja-build meson
# install the mlnx build of the dpdk components and the ovs package
# dnf group list
# dnf groupinstall -y 'Development Tools'
# install dpdk
dnf install -y mlnx-dpdk mlnx-dpdk-devel numactl-devel openvswitch openvswitch-selinux-policy libnl3-devel openssl-devel zlib-devel libpcap-devel elfutils-libelf-devel
# https://doc.dpdk.org/guides/linux_gsg/sys_reqs.html#compilation-of-the-dpdk
pip3 install --user pyelftools
systemctl enable --now openvswitch
export PATH=$PATH:/opt/mellanox/dpdk/bin/
echo 'export PATH=$PATH:/opt/mellanox/dpdk/bin/' >> ~/.bash_profile
# build the upstream dpdk package, because we need its vdpa sample application
cd /data/soft/
wget https://fast.dpdk.org/rel/dpdk-20.11.3.tar.xz
tar vxf dpdk-20.11.3.tar.xz
# https://core.dpdk.org/doc/quick-start/
cd /data/soft/dpdk-stable-20.11.3/
# meson -Dexamples=all build
meson --reconfigure -Dexamples=all build
ninja -C build
export PKG_CONFIG_PATH=/opt/mellanox/dpdk/lib64/pkgconfig/
cd /data/soft/dpdk-stable-20.11.3/examples/vdpa
make -j
# install the kvm-related packages
# install kvm with qemu
# dnf -y groupinstall "Server with GUI"
dnf -y install qemu-kvm libvirt libguestfs-tools virt-install virt-viewer virt-manager tigervnc-server
systemctl disable --now firewalld
systemctl enable --now libvirtd
# finally, set the mlx NIC parameters and enable sriov
# get the pci address with lspci -D | grep -i mell or lshw -c network -businfo
lspci -D | grep -i mell
# 0000:04:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
# 0000:04:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
lshw -c network -businfo
# Bus info Device Class Description
# =======================================================
# pci@0000:02:00.0 eno3 network NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
# pci@0000:02:00.1 eno4 network NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
# pci@0000:01:00.0 eno1 network NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
# pci@0000:01:00.1 eno2 network NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
# pci@0000:04:00.0 enp4s0f0 network MT2892 Family [ConnectX-6 Dx]
# pci@0000:04:00.1 enp4s0f1 network MT2892 Family [ConnectX-6 Dx]
# UCTX_EN is for enable DevX
# DevX allows to access firmware objects
mlxconfig -y -d 0000:04:00.0 set SRIOV_EN=1 UCTX_EN=1 NUM_OF_VFS=8
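The firmware settings can be queried back to confirm they stuck; note they only take effect after a cold reboot. A small sketch:
mlxconfig -d 0000:04:00.0 query | grep -E 'SRIOV_EN|UCTX_EN|NUM_OF_VFS'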
ovs-kernel approach
NIC setup script
# the ovs shipped with the mlnx packages is missing some selinux policy; add it here
# in a real project, add whatever selinux policy your environment still needs
semodule -i wzh-mellanox-ovs-dpdk.pp
# the script below configures and starts ovs: it first clears the ovs config, then sets the NIC mode, then starts ovs
cat << 'EOF' > /data/ovs-offload-env.sh
#!/usr/bin/env bash
set -e
set -x
systemctl restart openvswitch
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=try
systemctl restart openvswitch
ip link set dev ${IFNAME} down || true
ip link set dev ${IFNAME}_0 down || true
ip link set dev ${IFNAME}_1 down || true
ip link set dev ${IFNAME}v0 down || true
ip link set dev ${IFNAME}v1 down || true
ovs-vsctl del-port ovs-sriov ${IFNAME} || true
ovs-vsctl del-port ovs-sriov ${IFNAME}_0 || true
ovs-vsctl del-port ovs-sriov ${IFNAME}_1 || true
ovs-vsctl del-br ovs-sriov || true
ovs-vsctl del-port br0-ovs pf0vf0 || true
ovs-vsctl del-port br0-ovs pf0vf1 || true
ovs-vsctl del-port br0-ovs pf0 || true
ovs-vsctl del-br br0-ovs || true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=false
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-extra=" "
ovs-vsctl --no-wait set Open_vSwitch . other_config={}
# Turn off SR-IOV on the PF device.
echo 0 > /sys/class/net/$IFNAME/device/sriov_numvfs
cat /sys/class/net/$IFNAME/device/sriov_numvfs
# 0
systemctl restart openvswitch
# Turn ON SR-IOV on the PF device.
echo 2 > /sys/class/net/$IFNAME/device/sriov_numvfs
cat /sys/class/net/$IFNAME/device/sriov_numvfs
# 2
ip link set $IFNAME vf 0 mac ${VF1MAC}
ip link set $IFNAME vf 1 mac ${VF2MAC}
echo ${PCINUM%%.*}.2 > /sys/bus/pci/drivers/mlx5_core/unbind || true
echo ${PCINUM%%.*}.3 > /sys/bus/pci/drivers/mlx5_core/unbind || true
devlink dev eswitch set pci/$PCINUM mode switchdev
devlink dev eswitch show pci/$PCINUM
# # pci/0000:43:00.0: mode switchdev inline-mode none encap-mode basic
echo ${PCINUM%%.*}.2 > /sys/bus/pci/drivers/mlx5_core/bind
echo ${PCINUM%%.*}.3 > /sys/bus/pci/drivers/mlx5_core/bind
# systemctl enable --now openvswitch
# systemctl restart openvswitch
# Create an OVS bridge (here it's named ovs-sriov).
ovs-vsctl add-br ovs-sriov
ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
systemctl restart openvswitch
ovs-vsctl add-port ovs-sriov ${IFNAME}
ovs-vsctl add-port ovs-sriov ${IFNAME}_0
ovs-vsctl add-port ovs-sriov ${IFNAME}_1
ip link set dev ${IFNAME} up
ip link set dev ${IFNAME}_0 up
ip link set dev ${IFNAME}_1 up
ip link set dev ${IFNAME}v0 up
ip link set dev ${IFNAME}v1 up
# systemctl restart openvswitch
# ip addr add ${VF1IP} dev ${IFNAME}v0
# ip addr add ${VF2IP} dev ${IFNAME}v1
EOF
# for 103
# export IFNAME=enp4s0f0
# export PCINUM=0000:04:00.0
# export VF1MAC=e4:11:22:33:44:50
# export VF2MAC=e4:11:22:33:44:51
# export VF1IP=192.168.55.21/24
# export VF2IP=192.168.55.22/24
# bash /data/ovs-offload-env.sh
# after setting the environment variables, run the script to bring up ovs
# for 105
export IFNAME=enp67s0f0
export PCINUM=0000:43:00.0
export VF1MAC=e4:11:22:33:55:60
export VF2MAC=e4:11:22:33:55:61
# export VF1IP=192.168.55.31/24
# export VF2IP=192.168.55.32/24
bash /data/ovs-offload-env.sh
# we also need to start a dpdk application that provides the vdpa function and attaches it to the vf
/data/soft/dpdk-stable-20.11.3/examples/vdpa/build/vdpa -w ${PCINUM%%.*}.2,class=vdpa --log-level=pmd,info -- -i
create /tmp/sock-virtio0 0000:43:00.2
# EAL: Detected 24 lcore(s)
# EAL: Detected 2 NUMA nodes
# Option -w, --pci-whitelist is deprecated, use -a, --allow option instead
# EAL: Detected shared linkage of DPDK
# EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
# EAL: Selected IOVA mode 'VA'
# EAL: No available hugepages reported in hugepages-2048kB
# EAL: Probing VFIO support...
# EAL: Probe PCI driver: mlx5_pci (15b3:101e) device: 0000:43:00.2 (socket 1)
# mlx5_vdpa: ROCE is disabled by Netlink successfully.
# EAL: No legacy callbacks, legacy socket not created
# Interactive-mode selected
# vdpa> create /tmp/sock-virtio0 0000:43:00.2
# VHOST_CONFIG: vhost-user server: socket created, fd: 112
# VHOST_CONFIG: bind to /tmp/sock-virtio0
# vdpa>
vdpa> list
# device name queue num supported features
# 0000:43:00.2 256 0x114c60180b
vdpa> stats 0000:43:00.2 0
# Device 0000:43:00.2:
# Virtq 0:
# received_descriptors 1024
# completed_descriptors 39
# bad descriptor errors 0
# exceed max chain 0
# invalid buffer 0
# completion errors 0
kvm
Next, we create a kvm guest that uses our vdpa channel.
Since we created a socket and qemu needs permission to read it, change the qemu user to root.
sed -i.bak 's/#user = "root"/user = "root"/' /etc/libvirt/qemu.conf
# we also need a bridge so the kvm guest can reach the network through the host's NIC, which makes access and management easier
mkdir -p /data/kvm
cat << 'EOF' > /data/kvm/bridge.sh
#!/usr/bin/env bash
PUB_CONN='eno1'
PUB_IP='172.21.6.103/24'
PUB_GW='172.21.6.254'
PUB_DNS='172.21.1.1'
nmcli con down "$PUB_CONN"
nmcli con delete "$PUB_CONN"
nmcli con down baremetal
nmcli con delete baremetal
# RHEL 8.1 appends the word "System" in front of the connection,delete in case it exists
nmcli con down "System $PUB_CONN"
nmcli con delete "System $PUB_CONN"
nmcli connection add ifname baremetal type bridge con-name baremetal ipv4.method 'manual' \
ipv4.address "$PUB_IP" \
ipv4.gateway "$PUB_GW" \
ipv4.dns "$PUB_DNS"
nmcli con add type bridge-slave ifname "$PUB_CONN" master baremetal
nmcli con down "$PUB_CONN";pkill dhclient;dhclient baremetal
nmcli con up baremetal
EOF
bash /data/kvm/bridge.sh
# first, create, start and install a kvm guest in the standard way
cd /data/kvm
export DOMAIN=cx6.1
virt-install --name="${DOMAIN}" --vcpus=2 --ram=8192 \
--cputune vcpupin0.vcpu=14,vcpupin1.vcpu=16 \
--memorybacking hugepages.page0.size=1,hugepages.page0.unit=GiB \
--cpu host-model \
--disk path=/data/kvm/${DOMAIN}.qcow2,bus=virtio,size=30 \
--os-variant rhel8.4 \
--network bridge=baremetal,model=virtio \
--graphics vnc,port=59000 \
--boot menu=on --location /data/kvm/Rocky-8.4-x86_64-minimal.iso \
--initrd-inject helper-ks-rocky.cfg --extra-args "inst.ks=file:/helper-ks-rocky.cfg"
# next, reconfigure this kvm guest to add the vdpa channel to it
# https://unix.stackexchange.com/questions/235414/libvirt-how-to-pass-qemu-command-line-args
# virt-xml $DOMAIN --edit --confirm --qemu-commandline 'env=MY-ENV=1234'
virt-xml $DOMAIN --edit --qemu-commandline='-chardev socket,id=charnet1,path=/tmp/sock-virtio0'
virt-xml $DOMAIN --edit --qemu-commandline='-netdev vhost-user,chardev=charnet1,queues=16,id=hostnet1'
virt-xml $DOMAIN --edit --qemu-commandline='-device virtio-net-pci,mq=on,vectors=6,netdev=hostnet1,id=net1,mac=e4:11:c6:d3:45:f2,bus=pcie.0,addr=0x6,page-per-vq=on,rx_queue_size=1024,tx_queue_size=1024'
Next, edit the following configuration by hand. Note that the pinned CPU cores should all be on the same NUMA node.
virsh edit cx6.1
<cputune>
<vcpupin vcpu='0' cpuset='14'/>
<vcpupin vcpu='1' cpuset='16'/>
</cputune>
<cpu mode='host-model' check='partial'>
<numa>
<cell id='0' cpus='0-1' memory='8388608' unit='KiB' memAccess='shared'/>
</numa>
</cpu>
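To pick cores on the right NUMA node, check which node the NIC sits on and which cores belong to it first; a small sketch, assuming the 0000:43:00.0 PCI address used on host 105:
# NUMA node of the CX6
cat /sys/bus/pci/devices/0000:43:00.0/numa_node
# cores per NUMA node; the vcpupin cpuset values should come from the same node
lscpu | grep -i numa
numactl --hardware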
The final configuration looks like the example below; it can be used as a reference for troubleshooting.
virsh dumpxml cx6.1
<domain type='kvm' id='11' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
<name>cx6.1</name>
<uuid>5cbb6f7c-7122-4fc4-9706-ff46aed3bf25</uuid>
<metadata>
<libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
<libosinfo:os id="http://redhat.com/rhel/8.4"/>
</libosinfo:libosinfo>
</metadata>
<memory unit='KiB'>8388608</memory>
<currentMemory unit='KiB'>8388608</currentMemory>
<memoryBacking>
<hugepages>
<page size='1048576' unit='KiB'/>
</hugepages>
</memoryBacking>
<vcpu placement='static'>2</vcpu>
<cputune>
<vcpupin vcpu='0' cpuset='14'/>
<vcpupin vcpu='1' cpuset='16'/>
</cputune>
<resource>
<partition>/machine</partition>
</resource>
<os>
<type arch='x86_64' machine='pc-q35-rhel8.2.0'>hvm</type>
<boot dev='hd'/>
<bootmenu enable='yes'/>
</os>
<features>
<acpi/>
<apic/>
</features>
<cpu mode='custom' match='exact' check='full'>
<model fallback='forbid'>IvyBridge-IBRS</model>
<vendor>Intel</vendor>
<feature policy='require' name='ss'/>
<feature policy='require' name='vmx'/>
<feature policy='require' name='pdcm'/>
<feature policy='require' name='pcid'/>
<feature policy='require' name='hypervisor'/>
<feature policy='require' name='arat'/>
<feature policy='require' name='tsc_adjust'/>
<feature policy='require' name='umip'/>
<feature policy='require' name='md-clear'/>
<feature policy='require' name='stibp'/>
<feature policy='require' name='arch-capabilities'/>
<feature policy='require' name='ssbd'/>
<feature policy='require' name='xsaveopt'/>
<feature policy='require' name='pdpe1gb'/>
<feature policy='require' name='ibpb'/>
<feature policy='require' name='ibrs'/>
<feature policy='require' name='amd-stibp'/>
<feature policy='require' name='amd-ssbd'/>
<feature policy='require' name='skip-l1dfl-vmentry'/>
<feature policy='require' name='pschange-mc-no'/>
<numa>
<cell id='0' cpus='0-1' memory='8388608' unit='KiB' memAccess='shared'/>
</numa>
</cpu>
<clock offset='utc'>
<timer name='rtc' tickpolicy='catchup'/>
<timer name='pit' tickpolicy='delay'/>
<timer name='hpet' present='no'/>
</clock>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<pm>
<suspend-to-mem enabled='no'/>
<suspend-to-disk enabled='no'/>
</pm>
<devices>
<emulator>/usr/libexec/qemu-kvm</emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2'/>
<source file='/data/kvm/cx6.1.qcow2' index='2'/>
<backingStore/>
<target dev='vda' bus='virtio'/>
<alias name='virtio-disk0'/>
<address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
</disk>
<disk type='file' device='cdrom'>
<driver name='qemu'/>
<target dev='sda' bus='sata'/>
<readonly/>
<alias name='sata0-0-0'/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<controller type='usb' index='0' model='qemu-xhci' ports='15'>
<alias name='usb'/>
<address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
</controller>
<controller type='sata' index='0'>
<alias name='ide'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
</controller>
<controller type='pci' index='0' model='pcie-root'>
<alias name='pcie.0'/>
</controller>
<controller type='pci' index='1' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='1' port='0x10'/>
<alias name='pci.1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
</controller>
<controller type='pci' index='2' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='2' port='0x11'/>
<alias name='pci.2'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
</controller>
<controller type='pci' index='3' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='3' port='0x12'/>
<alias name='pci.3'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
</controller>
<controller type='pci' index='4' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='4' port='0x13'/>
<alias name='pci.4'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
</controller>
<controller type='pci' index='5' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='5' port='0x14'/>
<alias name='pci.5'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
</controller>
<controller type='pci' index='6' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='6' port='0x15'/>
<alias name='pci.6'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x5'/>
</controller>
<controller type='pci' index='7' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='7' port='0x16'/>
<alias name='pci.7'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x6'/>
</controller>
<controller type='virtio-serial' index='0'>
<alias name='virtio-serial0'/>
<address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
</controller>
<interface type='bridge'>
<mac address='52:54:00:8d:b6:8e'/>
<source bridge='baremetal'/>
<target dev='vnet2'/>
<model type='virtio'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</interface>
<serial type='pty'>
<source path='/dev/pts/6'/>
<target type='isa-serial' port='0'>
<model name='isa-serial'/>
</target>
<alias name='serial0'/>
</serial>
<console type='pty' tty='/dev/pts/6'>
<source path='/dev/pts/6'/>
<target type='serial' port='0'/>
<alias name='serial0'/>
</console>
<channel type='unix'>
<source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-11-cx6.1/org.qemu.guest_agent.0'/>
<target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/>
<alias name='channel0'/>
<address type='virtio-serial' controller='0' bus='0' port='1'/>
</channel>
<input type='tablet' bus='usb'>
<alias name='input0'/>
<address type='usb' bus='0' port='1'/>
</input>
<input type='mouse' bus='ps2'>
<alias name='input1'/>
</input>
<input type='keyboard' bus='ps2'>
<alias name='input2'/>
</input>
<graphics type='vnc' port='59000' autoport='no' listen='127.0.0.1'>
<listen type='address' address='127.0.0.1'/>
</graphics>
<video>
<model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
<alias name='video0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
</video>
<memballoon model='virtio'>
<stats period='5'/>
<alias name='balloon0'/>
<address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
</memballoon>
<rng model='virtio'>
<backend model='random'>/dev/urandom</backend>
<alias name='rng0'/>
<address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
</rng>
</devices>
<seclabel type='dynamic' model='selinux' relabel='yes'>
<label>system_u:system_r:svirt_t:s0:c46,c926</label>
<imagelabel>system_u:object_r:svirt_image_t:s0:c46,c926</imagelabel>
</seclabel>
<seclabel type='dynamic' model='dac' relabel='yes'>
<label>+0:+0</label>
<imagelabel>+0:+0</imagelabel>
</seclabel>
<qemu:commandline>
<qemu:arg value='-chardev'/>
<qemu:arg value='socket,id=charnet1,path=/tmp/sock-virtio0'/>
<qemu:arg value='-netdev'/>
<qemu:arg value='vhost-user,chardev=charnet1,queues=16,id=hostnet1'/>
<qemu:arg value='-device'/>
<qemu:arg value='virtio-net-pci,mq=on,vectors=6,netdev=hostnet1,id=net1,mac=e4:11:c6:d3:45:f2,bus=pcie.0,addr=0x6,page-per-vq=on,rx_queue_size=1024,tx_queue_size=1024'/>
</qemu:commandline>
</domain>
Try it out
Next we move on to the testing and hands-on part.
# in cx6.1 kvm
# nmcli dev connect enp0s6
nmcli con modify enp0s6 ipv4.method manual ipv4.addresses 192.168.99.11/24
# nmcli con modify enp0s6 ipv4.method manual ipv4.addresses 192.168.55.91/24
nmcli con up enp0s6
# on peer machine (102)
nmcli con modify enp66s0f0 ipv4.method manual ipv4.addresses 192.168.99.21/24
# nmcli con modify enp66s0f0 ipv4.method manual ipv4.addresses 192.168.55.92/24
# nmcli dev connect enp66s0f0
nmcli con up enp66s0f0
# run after the tcpdump is running
ping 192.168.99.21
# PING 192.168.99.21 (192.168.99.21) 56(84) bytes of data.
# 64 bytes from 192.168.99.21: icmp_seq=1 ttl=64 time=0.089 ms
# 64 bytes from 192.168.99.21: icmp_seq=2 ttl=64 time=0.044 ms
# 64 bytes from 192.168.99.21: icmp_seq=3 ttl=64 time=0.046 ms
# ....
# on 105
tcpdump -i enp67s0f0_0 -w dump.test
# dropped privs to tcpdump
# tcpdump: listening on enp67s0f0_0, link-type EN10MB (Ethernet), capture size 262144 bytes
# ^C2 packets captured
# 2 packets received by filter
# 0 packets dropped by kernel
tcpdump -i enp67s0f0 -w dump.test
# dropped privs to tcpdump
# tcpdump: listening on enp67s0f0, link-type EN10MB (Ethernet), capture size 262144 bytes
# ^C4 packets captured
# 4 packets received by filter
# 0 packets dropped by kernel
Opening the capture in wireshark shows standard ICMP packets, which confirms that what we built is a plain data path, not protocol encapsulation. Also note that although many pings were sent, only one packet was captured on the VF representor: once the NIC offloads the flow, only the first packet (the one that goes up to the kernel for the flow-table lookup) is visible, and everything after that is handled in hardware, so tcpdump never sees it.
The capture on the PF shows 4 packets; again these are only the first packet of each flow, and the rest are offloaded.
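Before reading too much into the captures, it is worth double-checking that hardware offload is really active on both ovs and the NIC; a minimal verification sketch, reusing the device names from above:
# confirm ovs has hardware offload enabled (prints "true" when the key is set)
ovs-vsctl get Open_vSwitch . other_config:hw-offload
# confirm tc flower offload is on for the PF and the VF representor
ethtool -k enp67s0f0 | grep hw-tc-offload
ethtool -k enp67s0f0_0 | grep hw-tc-offload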
# ovs-dpctl dump-flows
# on 105
# dump the ovs datapath flows: there are 2 arp (0x0806) flows, forward and reverse,
# and 2 ip (0x0800) flows, also forward and reverse
ovs-appctl dpctl/dump-flows type=offloaded
# recirc_id(0),in_port(2),eth(src=0c:42:a1:fa:18:8e,dst=e4:11:c6:d3:45:f2),eth_type(0x0800),ipv4(frag=no), packets:149, bytes:15198, used:0.510s, actions:3
# recirc_id(0),in_port(2),eth(src=0c:42:a1:fa:18:8e,dst=e4:11:c6:d3:45:f2),eth_type(0x0806), packets:0, bytes:0, used:8.700s, actions:3
# recirc_id(0),in_port(3),eth(src=e4:11:c6:d3:45:f2,dst=0c:42:a1:fa:18:8e),eth_type(0x0800),ipv4(frag=no), packets:149, bytes:14602, used:0.510s, actions:2
# recirc_id(0),in_port(3),eth(src=e4:11:c6:d3:45:f2,dst=0c:42:a1:fa:18:8e),eth_type(0x0806), packets:0, bytes:0, used:8.701s, actions:2
# now look at the tc configuration: ovs has pushed the flow rules down into tc
# this is the VF ingress traffic; the rule redirects (mirred) it to the parent uplink port, and is implemented in hardware (in_hw)
tc -s filter show dev enp67s0f0_0 ingress
# filter protocol ip pref 2 flower chain 0
# filter protocol ip pref 2 flower chain 0 handle 0x1
# dst_mac 0c:42:a1:fa:18:8e
# src_mac e4:11:c6:d3:45:f2
# eth_type ipv4
# ip_flags nofrag
# in_hw in_hw_count 1
# action order 1: mirred (Egress Redirect to device enp67s0f0) stolen
# index 4 ref 1 bind 1 installed 318 sec used 0 sec
# Action statistics:
# Sent 30380 bytes 310 pkt (dropped 0, overlimits 0 requeues 0)
# Sent software 0 bytes 0 pkt
# Sent hardware 30380 bytes 310 pkt
# backlog 0b 0p requeues 0
# cookie 8be6df4d7d4c33fce08f01a46fa10a4a
# no_percpu
# used_hw_stats delayed
# now the VF egress traffic
# there are 2 rules, one for arp and one for ip
# both redirect (mirred) the traffic to the parent uplink port, again implemented in hardware
tc -s filter show dev enp67s0f0_0 egress
# filter ingress protocol ip pref 2 flower chain 0
# filter ingress protocol ip pref 2 flower chain 0 handle 0x1
# dst_mac 0c:42:a1:fa:18:8e
# src_mac e4:11:c6:d3:45:f2
# eth_type ipv4
# ip_flags nofrag
# in_hw in_hw_count 1
# action order 1: mirred (Egress Redirect to device enp67s0f0) stolen
# index 4 ref 1 bind 1 installed 379 sec used 0 sec
# Action statistics:
# Sent 36260 bytes 370 pkt (dropped 0, overlimits 0 requeues 0)
# Sent software 0 bytes 0 pkt
# Sent hardware 36260 bytes 370 pkt
# backlog 0b 0p requeues 0
# cookie 8be6df4d7d4c33fce08f01a46fa10a4a
# no_percpu
# used_hw_stats delayed
# filter ingress protocol arp pref 4 flower chain 0
# filter ingress protocol arp pref 4 flower chain 0 handle 0x1
# dst_mac 0c:42:a1:fa:18:8e
# src_mac e4:11:c6:d3:45:f2
# eth_type arp
# in_hw in_hw_count 1
# action order 1: mirred (Egress Redirect to device enp67s0f0) stolen
# index 3 ref 1 bind 1 installed 13 sec used 6 sec
# Action statistics:
# Sent 60 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
# Sent software 0 bytes 0 pkt
# Sent hardware 60 bytes 1 pkt
# backlog 0b 0p requeues 0
# cookie 1fbfd56eae42f9dbe71bf99bd800cd6d
# no_percpu
# used_hw_stats delayed
tc qdisc show dev enp67s0f0_0
# qdisc mq 0: root
# qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
# qdisc ingress ffff: parent ffff:fff1 ----------------
# finally, record the system environment for later reference and for comparison with real project setups
# on 105
ip link
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
# link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
# 2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master baremetal state UP mode DEFAULT group default qlen 1000
# link/ether 90:b1:1c:40:59:27 brd ff:ff:ff:ff:ff:ff
# 3: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
# link/ether 90:b1:1c:40:59:28 brd ff:ff:ff:ff:ff:ff
# 4: eno3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
# link/ether 90:b1:1c:40:59:29 brd ff:ff:ff:ff:ff:ff
# 5: eno4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
# link/ether 90:b1:1c:40:59:2a brd ff:ff:ff:ff:ff:ff
# 6: enp67s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
# link/ether 0c:42:a1:fa:18:a2 brd ff:ff:ff:ff:ff:ff
# vf 0 link/ether e4:11:22:33:55:60 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
# vf 1 link/ether e4:11:22:33:55:61 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
# 7: enp67s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether 0c:42:a1:fa:18:a3 brd ff:ff:ff:ff:ff:ff
# 8: ib0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 4092 qdisc mq state DOWN mode DEFAULT group default qlen 256
# link/infiniband 00:00:10:28:fe:80:00:00:00:00:00:00:98:03:9b:03:00:cc:71:2c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
# 9: baremetal: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
# link/ether 90:b1:1c:40:59:27 brd ff:ff:ff:ff:ff:ff
# 10: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
# link/ether 52:54:00:8f:4a:bc brd ff:ff:ff:ff:ff:ff
# 11: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN mode DEFAULT group default qlen 1000
# link/ether 52:54:00:8f:4a:bc brd ff:ff:ff:ff:ff:ff
# 16: enp67s0f0_0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
# link/ether fa:cf:0f:6a:ec:45 brd ff:ff:ff:ff:ff:ff
# 17: enp67s0f0_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
# link/ether 76:65:93:70:96:ac brd ff:ff:ff:ff:ff:ff
# 18: enp67s0f0v0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether e4:11:22:33:55:60 brd ff:ff:ff:ff:ff:ff
# 19: enp67s0f0v1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether e4:11:22:33:55:61 brd ff:ff:ff:ff:ff:ff
# 20: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
# link/ether f6:e9:fd:16:8a:ea brd ff:ff:ff:ff:ff:ff
# 21: ovs-sriov: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
# link/ether 0c:42:a1:fa:18:a2 brd ff:ff:ff:ff:ff:ff
# 22: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master baremetal state UNKNOWN mode DEFAULT group default qlen 1000
# link/ether fe:54:00:8d:b6:8e brd ff:ff:ff:ff:ff:ff
ip a
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
# link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
# inet 127.0.0.1/8 scope host lo
# valid_lft forever preferred_lft forever
# inet6 ::1/128 scope host
# valid_lft forever preferred_lft forever
# 2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master baremetal state UP group default qlen 1000
# link/ether 90:b1:1c:40:59:27 brd ff:ff:ff:ff:ff:ff
# 3: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
# link/ether 90:b1:1c:40:59:28 brd ff:ff:ff:ff:ff:ff
# 4: eno3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
# link/ether 90:b1:1c:40:59:29 brd ff:ff:ff:ff:ff:ff
# 5: eno4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
# link/ether 90:b1:1c:40:59:2a brd ff:ff:ff:ff:ff:ff
# 6: enp67s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
# link/ether 0c:42:a1:fa:18:a2 brd ff:ff:ff:ff:ff:ff
# 7: enp67s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
# link/ether 0c:42:a1:fa:18:a3 brd ff:ff:ff:ff:ff:ff
# 8: ib0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 4092 qdisc mq state DOWN group default qlen 256
# link/infiniband 00:00:10:28:fe:80:00:00:00:00:00:00:98:03:9b:03:00:cc:71:2c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
# 9: baremetal: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
# link/ether 90:b1:1c:40:59:27 brd ff:ff:ff:ff:ff:ff
# inet 172.21.6.105/24 brd 172.21.6.255 scope global noprefixroute baremetal
# valid_lft forever preferred_lft forever
# inet6 fe80::12a7:202d:c70b:be14/64 scope link noprefixroute
# valid_lft forever preferred_lft forever
# 10: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
# link/ether 52:54:00:8f:4a:bc brd ff:ff:ff:ff:ff:ff
# inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
# valid_lft forever preferred_lft forever
# 11: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
# link/ether 52:54:00:8f:4a:bc brd ff:ff:ff:ff:ff:ff
# 16: enp67s0f0_0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
# link/ether fa:cf:0f:6a:ec:45 brd ff:ff:ff:ff:ff:ff
# 17: enp67s0f0_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
# link/ether 76:65:93:70:96:ac brd ff:ff:ff:ff:ff:ff
# 18: enp67s0f0v0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
# link/ether e4:11:22:33:55:60 brd ff:ff:ff:ff:ff:ff
# inet 192.168.55.31/24 scope global enp67s0f0v0
# valid_lft forever preferred_lft forever
# 19: enp67s0f0v1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
# link/ether e4:11:22:33:55:61 brd ff:ff:ff:ff:ff:ff
# inet 192.168.55.32/24 scope global enp67s0f0v1
# valid_lft forever preferred_lft forever
# 20: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
# link/ether f6:e9:fd:16:8a:ea brd ff:ff:ff:ff:ff:ff
# 21: ovs-sriov: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
# link/ether 0c:42:a1:fa:18:a2 brd ff:ff:ff:ff:ff:ff
# 22: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master baremetal state UNKNOWN group default qlen 1000
# link/ether fe:54:00:8d:b6:8e brd ff:ff:ff:ff:ff:ff
# inet6 fe80::fc54:ff:fe8d:b68e/64 scope link
# valid_lft forever preferred_lft forever
ovs-vsctl show
# 8f3eddeb-c42c-4af4-9dc8-a46169d91a7c
# Bridge ovs-sriov
# Port enp67s0f0_1
# Interface enp67s0f0_1
# Port ovs-sriov
# Interface ovs-sriov
# type: internal
# Port enp67s0f0
# Interface enp67s0f0
# Port enp67s0f0_0
# Interface enp67s0f0_0
# ovs_version: "2.14.1"
# on kvm
ip link
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
# link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
# 2: enp0s6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether e4:11:c6:d3:45:f2 brd ff:ff:ff:ff:ff:ff
# 3: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
# link/ether 52:54:00:8d:b6:8e brd ff:ff:ff:ff:ff:ff
ip a
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
# link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
# inet 127.0.0.1/8 scope host lo
# valid_lft forever preferred_lft forever
# inet6 ::1/128 scope host
# valid_lft forever preferred_lft forever
# 2: enp0s6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
# link/ether e4:11:c6:d3:45:f2 brd ff:ff:ff:ff:ff:ff
# inet 192.168.99.11/24 brd 192.168.99.255 scope global noprefixroute enp0s6
# valid_lft forever preferred_lft forever
# inet6 fe80::f3c:b686:1739:a748/64 scope link noprefixroute
# valid_lft forever preferred_lft forever
# 3: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
# link/ether 52:54:00:8d:b6:8e brd ff:ff:ff:ff:ff:ff
# inet 172.21.6.11/24 brd 172.21.6.255 scope global noprefixroute enp1s0
# valid_lft forever preferred_lft forever
# inet6 fe80::5054:ff:fe8d:b68e/64 scope link noprefixroute
# valid_lft forever preferred_lft forever
Performance test
# on 102
dnf install -y iperf3
systemctl disable --now firewalld
iperf3 -s -p 6666
# on 11
dnf install -y iperf3
iperf3 -t 20 -p 6666 -c 192.168.99.21
Connecting to host 192.168.99.21, port 6666
[ 5] local 192.168.99.11 port 50960 connected to 192.168.99.21 port 6666
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 1.40 GBytes 12.0 Gbits/sec 0 594 KBytes
[ 5] 1.00-2.00 sec 1.39 GBytes 12.0 Gbits/sec 0 594 KBytes
[ 5] 2.00-3.00 sec 1.39 GBytes 12.0 Gbits/sec 0 594 KBytes
[ 5] 3.00-4.00 sec 1.40 GBytes 12.0 Gbits/sec 0 624 KBytes
[ 5] 4.00-5.00 sec 1.40 GBytes 12.0 Gbits/sec 0 659 KBytes
[ 5] 5.00-6.00 sec 1.40 GBytes 12.0 Gbits/sec 0 659 KBytes
[ 5] 6.00-7.00 sec 1.40 GBytes 12.0 Gbits/sec 0 659 KBytes
[ 5] 7.00-8.00 sec 1.40 GBytes 12.0 Gbits/sec 0 1.03 MBytes
[ 5] 8.00-9.00 sec 1.40 GBytes 12.0 Gbits/sec 0 1.03 MBytes
[ 5] 9.00-10.00 sec 1.40 GBytes 12.0 Gbits/sec 0 1.03 MBytes
[ 5] 10.00-11.00 sec 1.39 GBytes 12.0 Gbits/sec 0 1.03 MBytes
[ 5] 11.00-12.00 sec 1.39 GBytes 12.0 Gbits/sec 0 1.03 MBytes
[ 5] 12.00-13.00 sec 1.39 GBytes 11.9 Gbits/sec 0 1.03 MBytes
[ 5] 13.00-14.00 sec 1.39 GBytes 11.9 Gbits/sec 0 1.03 MBytes
[ 5] 14.00-15.00 sec 1.39 GBytes 11.9 Gbits/sec 0 1.03 MBytes
[ 5] 15.00-16.00 sec 1.39 GBytes 11.9 Gbits/sec 0 1.03 MBytes
[ 5] 16.00-17.00 sec 1.39 GBytes 12.0 Gbits/sec 0 1.03 MBytes
[ 5] 17.00-18.00 sec 1.39 GBytes 11.9 Gbits/sec 0 1.03 MBytes
[ 5] 18.00-19.00 sec 1.39 GBytes 11.9 Gbits/sec 0 1.03 MBytes
[ 5] 19.00-20.00 sec 1.39 GBytes 11.9 Gbits/sec 0 1.03 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-20.00 sec 27.9 GBytes 12.0 Gbits/sec 0 sender
[ 5] 0.00-20.04 sec 27.9 GBytes 11.9 Gbits/sec receiver
iperf Done.
# on 105
systemctl disable --now irqbalance.service
mlnx_affinity start
# on 102
systemctl disable --now irqbalance.service
mlnx_affinity start
# on 102
dnf install -y qperf
qperf
# on 105
qperf 192.168.88.21 tcp_bw
tcp_bw:
bw = 2.8 GB/sec
# on 101
qperf 192.168.99.21 tcp_bw
tcp_bw:
bw = 1.48 GB/sec
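qperf can also report latency in the same setup; a small sketch, assuming the qperf server started on 102 above is still running:
# on 105
qperf 192.168.99.21 tcp_lat udp_lat
# latency numbers make it easier to compare the offloaded and non-offloaded paths
# beyond the raw bandwidth figures above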
RHEL/CentOS 8 build kernel
This section describes how to build a custom kernel on RHEL 8.
The background: a customer needs advanced features of a Mellanox NIC, and those features require certain kernel options to be enabled, so we build a new kernel with the options turned on.
Demo video
Lab steps
# https://access.redhat.com/articles/3938081
# grubby --info=ALL | grep title
# https://blog.packagecloud.io/eng/2015/04/20/working-with-source-rpms/
export PROXY="192.168.253.1:5085"
export PROXY="192.168.203.1:5085"
# we need rhel8.3, which is still in beta at this time, so a special (beta) subscription has to be registered
subscription-manager --proxy=$PROXY register --username **** --password ********
# subscription-manager config --rhsm.baseurl=https://china.cdn.redhat.com
# subscription-manager config --rhsm.baseurl=https://cdn.redhat.com
subscription-manager --proxy=$PROXY refresh
subscription-manager --proxy=$PROXY repos --help
subscription-manager --proxy=$PROXY repos --list > list
cat list | grep 'Repo ID' | grep -v source | grep -v debug
subscription-manager --proxy=$PROXY repos --disable="*"
subscription-manager --proxy=$PROXY repos \
--enable="rhel-8-for-x86_64-baseos-beta-rpms" \
--enable="rhel-8-for-x86_64-appstream-beta-rpms" \
--enable="rhel-8-for-x86_64-supplementary-beta-rpms" \
--enable="rhel-8-for-x86_64-rt-beta-rpms" \
--enable="rhel-8-for-x86_64-highavailability-beta-rpms" \
--enable="rhel-8-for-x86_64-nfv-beta-rpms" \
--enable="fast-datapath-beta-for-rhel-8-x86_64-rpms" \
--enable="codeready-builder-beta-for-rhel-8-x86_64-rpms" \
# --enable="dirsrv-beta-for-rhel-8-x86_64-rpms" \
# ansible-2.9-for-rhel-8-x86_64-rpms
cat << EOF >> /etc/dnf/dnf.conf
proxy=http://$PROXY
EOF
# building the kernel needs packages from the rhel7 and rhel8 epel repos
yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
yum -y install yum-utils rpm-build
yum list kernel.x86_64
# download the kernel source rpm
yumdownloader --source kernel.x86_64
# install the source rpm
rpm -ivh /root/kernel-4.18.0-221.el8.src.rpm
cd /root/rpmbuild/SPECS
# https://stackoverflow.com/questions/13227162/automatically-install-build-dependencies-prior-to-building-an-rpm-package
# install the build dependencies
yum-builddep kernel.spec
# prep the source tree (this also produces the configs/ directory)
rpmbuild -bp --target=x86_64 kernel.spec
# libbabeltrace-devel
# https://www.cnblogs.com/luohaixian/p/9313863.html
KERNELVERION=`uname -r | sed "s/.$(uname -m)//"`
KERNELRV=$(uname -r)
/bin/cp -f /root/rpmbuild/BUILD/kernel-${KERNELVERION}/linux-${KERNELRV}/configs/* /root/rpmbuild/SOURCES/
cd /root/rpmbuild/BUILD/kernel-${KERNELVERION}/linux-${KERNELRV}/
/bin/cp -f configs/kernel-4.18.0-`uname -m`.config .config
# cp /boot/config-`uname -r` .config
make oldconfig
# customize the config; see the demo video for details
make menuconfig
# vi .config
# CONFIG_MLX5_TC_CT=y
# CONFIG_NET_ACT_CT=m
# CONFIG_SKB_EXTENSIONS=y
# CONFIG_NET_TC_SKB_EXT=y
# CONFIG_NF_FLOW_TABLE=m
# CONFIG_NF_FLOW_TABLE_IPV4=m x
# CONFIG_NF_FLOW_TABLE_IPV6=m x
# CONFIG_NF_FLOW_TABLE_INET=m
# CONFIG_NET_ACT_CONNMARK=m x
# CONFIG_NET_ACT_IPT=m x
# CONFIG_NET_EMATCH_IPT=m x
# CONFIG_NET_ACT_IFE=m x
# mark the config as x86_64
# x86_64
sed -i '1s/^/# x86_64\n/' .config
/bin/cp -f .config configs/kernel-4.18.0-`uname -m`.config
/bin/cp -f .config configs/kernel-x86_64.config
/bin/cp -f configs/* /root/rpmbuild/SOURCES/
cd /root/rpmbuild/SPECS
# cp kernel.spec kernel.spec.orig
# https://fedoraproject.org/wiki/Building_a_custom_kernel
# customize the kernel name (buildid suffix)
sed -i "s/# define buildid \\.local/%define buildid \\.wzh/" kernel.spec
# rpmbuild -bb --target=`uname -m` --without kabichk kernel.spec 2> build-err.log | tee build-out.log
# rpmbuild -bb --target=`uname -m` --without debug --without debuginfo --without kabichk kernel.spec 2> build-err.log | tee build-out.log
rpmbuild -bb --target=`uname -m` --with baseonly --without debug --without debuginfo --without kabichk kernel.spec 2> build-err.log | tee build-out.log
cd /root/rpmbuild/RPMS/x86_64/
# install the freshly built kernel
INSTALLKV=4.18.0-221.el8.wzh
yum install ./kernel-$INSTALLKV.x86_64.rpm ./kernel-core-$INSTALLKV.x86_64.rpm ./kernel-modules-$INSTALLKV.x86_64.rpm
# after reboot, check that the new kernel module can be activated
grep -R --include=Makefile CONFIG_NET_ACT_IFE
# rpmbuild/BUILD/kernel-4.18.0-221.el8/linux-4.18.0-221.el8.wzh.x86_64/net/sched/Makefile:obj-$(CONFIG_NET_ACT_IFE) += act_ife.o
modprobe act_ife
lsmod | grep act_ife
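Beyond loading the module, the installed kernel config can be checked directly to confirm that the options we enabled made it into the build; a quick sketch, assuming the standard /boot/config-$(uname -r) file:
grep -E 'CONFIG_MLX5_TC_CT|CONFIG_NET_TC_SKB_EXT|CONFIG_NF_FLOW_TABLE=|CONFIG_NET_ACT_CT=' /boot/config-$(uname -r)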
The RHEL kernel packages built in this lab can be downloaded here:
Link: https://pan.baidu.com/s/1AG07HxpXy9hoCLMq9qXi0Q password: 7hkt
Check whether the host is a virtual machine, and basic host information
The output below is from a virtual machine.
https://www.cnblogs.com/klb561/p/10527197.html
dmidecode -s system-product-name
# OpenStack Nova
lshw -class system
# sz-mec-dev02
# description: Computer
# product: OpenStack Nova
# vendor: OpenStack Foundation
# version: 13.2.1-20190604220711
# serial: 261977f6-fc7a-49f3-954e-cf9feb70fc2c
# width: 64 bits
# capabilities: smbios-2.8 dmi-2.8 smp vsyscall32
# configuration: boot=normal family=Virtual Machine uuid=8C0EE55A-5F37-554D-8300-313E29EF58B0
# *-pnp00:00
# product: PnP device PNP0b00
# physical id: 1
# capabilities: pnp
# configuration: driver=rtc_cmos
dmesg |grep -i virtual
# [ 0.145659] Booting paravirtualized kernel on KVM
# [ 1.177345] input: VirtualPS/2 VMware VMMouse as /devices/platform/i8042/serio1/input/input4
# [ 1.178356] input: VirtualPS/2 VMware VMMouse as /devices/platform/i8042/serio1/input/input3
# [ 1.223866] systemd[1]: Detected virtualization kvm.
check core
https://www.cyberciti.biz/faq/check-how-many-cpus-are-there-in-linux-system/
echo "Number of CPU/cores online at $HOSTNAME: $(getconf _NPROCESSORS_ONLN)"
check memory
https://www.networkworld.com/article/3336174/how-much-memory-is-installed-and-being-used-on-your-linux-systems.html
dmidecode -t 17 | grep "Size.*MB" | awk '{s+=$2} END {print s / 1024 "GB"}'
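As a cross-check of the core and memory numbers above, the same information is available from lscpu and free (nothing assumed beyond standard RHEL tooling):
lscpu | grep -E '^CPU\(s\)|^Socket|^Core'
free -h | grep Mem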
openshift 4 kvm+ovs install
A common scenario in day-to-day openshift4 installs: the cluster has to be spread across several smaller hosts, so KVM guests on different hosts need to talk to each other. A plain bridge with directly routable IP addresses would be enough, but IP address management restrictions mean we have no routable addresses to use, so we solve the problem with ovs + vxlan.
This section covers two hosts: how to configure ovs and how to start the KVM guests.
References:
- https://stackoverflow.com/questions/30622680/kvm-ovs-bridged-network-how-to-configure
- https://stackoverflow.com/questions/31566658/setup-private-networking-between-two-hosts-and-two-vms-with-libvirt-openvswitc
- https://blog.csdn.net/wuliangtianzu/article/details/81870551
- https://pinrojas.com/2017/05/03/how-to-use-virt-install-to-connect-at-openvswitch-bridges/
- https://www.jianshu.com/p/658332deac99
- https://developer.gnome.org/NetworkManager/stable/nm-openvswitch.html
MTU tuning:
- https://www.cnblogs.com/JacZhu/p/11006738.html
- https://stackoom.com/question/3gFcR/%E6%97%A0%E6%B3%95%E5%9C%A8OVS%E9%9A%A7%E9%81%93%E4%B8%AD%E6%8D%95%E8%8E%B7%E5%A4%A7%E4%BA%8EMTU-%E7%9A%84%E6%B5%81%E9%87%8F
- https://serverfault.com/questions/680635/mtu-on-open-vswitch-bridge-port
- https://stackoverflow.com/questions/54398827/unable-to-capture-traffic-greater-than-mtu-1500-in-ovs-tunnel
vxlan
- https://blog.csdn.net/a363344923/article/details/98033856
- https://prolinuxhub.com/configure-start-up-scripts-for-ovs-on-centos-and-red-hat/
nat
- https://www.sdnlab.com/19842.html
- https://www.sdnlab.com/19802.html
- https://www.sdnlab.com/19765.html
For an ocp4 install based on this setup, see the notes: https://github.com/wangzheng422/docker_env/blob/master/redhat/prepare/cmri/lab.md
on redhat-01
yum -y install openvswitch2.11 NetworkManager-ovs
# install pkg for vnc and kvm
systemctl enable --now openvswitch
systemctl status openvswitch
systemctl enable --now libvirtd
cat << 'EOF' > /etc/sysconfig/network-scripts/ifcfg-br-int
DEVICE=br-int
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=static
HOTPLUG=no
IPADDR=192.168.7.1
PREFIX=24
MTU=1450
EOF
cat << 'EOF' > /etc/sysconfig/network-scripts/ifcfg-vxlan1
DEVICE=vxlan1
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSTunnel
OVS_BRIDGE=br-int
OVS_TUNNEL_TYPE=vxlan
OVS_TUNNEL_OPTIONS="options:remote_ip=172.29.159.100"
BOOTPROTO=static
HOTPLUG=no
EOF
systemctl restart network
ovs-vsctl show
# ovs-vsctl set int br-int mtu_request=1450
# ovs-vsctl set int br-int mtu_request=[]
mkdir -p /data/kvm
cd /data/kvm
# bridge mode
cat << 'EOF' > ovsnet.xml
<network>
<name>br-int</name>
<forward mode='bridge'/>
<bridge name='br-int'/>
<virtualport type='openvswitch'/>
</network>
EOF
virsh net-define ovsnet.xml
virsh net-start br-int
virsh net-autostart br-int
# restore
virsh net-destroy br-int
virsh net-undefine br-int
/bin/rm -f /etc/sysconfig/network-scripts/ifcfg-br-int
/bin/rm -f /etc/sysconfig/network-scripts/ifcfg-vxlan1
systemctl restart network
on redhat-02
yum -y install openvswitch2.11 NetworkManager-ovs
# install pkg for vnc and kvm
systemctl enable --now openvswitch
systemctl status openvswitch
systemctl enable --now libvirtd
ovs-vsctl show
cat << 'EOF' > /etc/sysconfig/network-scripts/ifcfg-br-int
DEVICE=br-int
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=static
HOTPLUG=no
IPADDR=192.168.7.2
PREFIX=24
MTU=1450
EOF
cat << 'EOF' > /etc/sysconfig/network-scripts/ifcfg-vxlan1
DEVICE=vxlan1
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSTunnel
OVS_BRIDGE=br-int
OVS_TUNNEL_TYPE=vxlan
OVS_TUNNEL_OPTIONS="options:remote_ip=172.29.159.99"
BOOTPROTO=static
HOTPLUG=no
EOF
systemctl restart network
ovs-vsctl show
# ovs-vsctl set int br-int mtu_request=1450
mkdir -p /data/kvm
cd /data/kvm
# bridge mode
cat << 'EOF' > ovsnet.xml
<network>
<name>br-int</name>
<forward mode='bridge'/>
<bridge name='br-int'/>
<virtualport type='openvswitch'/>
</network>
EOF
virsh net-define ovsnet.xml
virsh net-start br-int
virsh net-autostart br-int
# restore
virsh net-destroy br-int
virsh net-undefine br-int
Create the virtual machines
When creating the VMs, remember to adjust the MTU of each guest. The key point is the NIC MTU setting inside the guest OS, which is really a kernel boot-parameter question at install time; see: https://www.man7.org/linux/man-pages/man7/dracut.cmdline.7.html
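If a guest was already installed with the default 1500 MTU, one workaround (my own addition, not part of the original flow; the connection name enp1s0 is only an example) is to lower the MTU from inside the guest with NetworkManager instead of reinstalling:
# inside the guest
nmcli con modify enp1s0 802-3-ethernet.mtu 1450
nmcli con up enp1s0
ip link show enp1s0 | grep mtu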
mkdir -p /data/kvm
cd /data/kvm
lvremove -f datavg/helperlv
lvcreate -y -L 230G -n helperlv datavg
# 230G
virt-install --name="ocp4-aHelper" --vcpus=2 --ram=4096 \
--disk path=/dev/datavg/helperlv,device=disk,bus=virtio,format=raw \
--os-variant centos7.0 --network network:br-int,model=virtio \
--boot menu=on --location /data/kvm/rhel-server-7.8-x86_64-dvd.iso \
--initrd-inject /data/kvm/helper-ks.cfg --extra-args "inst.ks=file:/helper-ks.cfg"
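Once virt-install starts, the install can be followed and checked with virsh; a trivial sketch using the domain name from the --name parameter above:
virsh list --all
virsh console ocp4-aHelper
# press ctrl+] to leave the console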
Detours (dead ends tried along the way)
For VMs on the ovs bridge, enable TCP MTU probing:
sysctl -w net.ipv4.tcp_mtu_probing=1
cat << 'EOF' > /etc/sysctl.d/99-sysctl-wzh.conf
net.ipv4.tcp_mtu_probing = 1
EOF
sysctl --system
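To verify that 1450 bytes really fit through the vxlan tunnel end to end, a don't-fragment ping between the two guests is a quick test (a sketch, assuming the 192.168.7.1/192.168.7.2 addresses configured earlier):
# 1422 bytes of ICMP payload + 8 bytes ICMP header + 20 bytes IP header = 1450 on the wire
ping -M do -s 1422 192.168.7.2
# if this fails while smaller sizes work, some hop still has an MTU below 1450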
ovs-vsctl add-port br-int vxlan1 -- \
set Interface vxlan1 type=vxlan options:remote_ip=172.29.159.99
ovs-vsctl set int br-int mtu_request=1450
nmcli connection add type vxlan id 100 remote 172.29.159.99 ipv4.addresses 192.168.77.2/24 ipv4.method manual ifname vxlan1 connection.id vxlan1 vxlan.parent enp2s0f0
nmcli conn up vxlan1
nmcli conn del vxlan1
ovs-vsctl add-port br-int vxlan1 -- \
set Interface vxlan1 type=vxlan options:remote_ip=172.29.159.100
ovs-vsctl set int br-int mtu_request=1450
ovs-vsctl set int br-int mtu_request=[]
systemctl restart network
# restore
ovs-vsctl del-port br-int vxlan1
ovs-vsctl del-br br-int
rm -f /etc/sysconfig/network-scripts/ifcfg-br-int
systemctl restart network
man nm-openvswitch
nmcli con add type ovs-bridge \
con-name br-private \
ifname br-private \
ipv4.method 'manual' \
ipv4.address '192.168.7.1/24'
nmcli connection modify br-private ipv4.addresses 192.168.7.1/24
nmcli connection modify eno2 ipv4.gateway 192.168.39.254
nmcli connection modify eno2 ipv4.dns 192.168.39.129
nmcli connection modify br-private ipv4.method manual
nmcli connection modify br-private connection.autoconnect yes
nmcli connection reload
nmcli con del br-private
nmcli connection add type vxlan id 100 remote 172.29.159.100 ipv4.addresses 192.168.77.1/24 ipv4.method manual ifname vxlan1 connection.id vxlan1 vxlan.parent enp2s0f0
nmcli conn up vxlan1
nmcli conn del vxlan1
nmcli conn add type ovs-bridge conn.interface bridge0
nmcli conn add type ovs-port conn.interface port0 master bridge0
nmcli conn add type ovs-interface conn.interface iface0 master port0 \
ipv4.method manual ipv4.address 192.168.7.1/24
nmcli conn del ovs-slave-iface0
nmcli conn del ovs-slave-port0
nmcli conn del ovs-bridge-bridge0
ovs-vsctl add-br br-private
ovs-dpctl show
ovs-ofctl show br0
OpenShift and Container Storage for Administrators
This section covers the openshift4 administrator hands-on workshop. The highlights are the OpenShift storage module (OCS), centralized logging, and metering/chargeback; these modules need a fair amount of underlying resources, so it is rare to have an environment where they can be tried out.
workshop upstream github: https://github.com/openshift/openshift-cns-testdrive
WORKSHOP MODULES
Below are the training materials for each module of the workshop.
- Environment Overview
- Installation and Verification
- Application Management Basics
- Application Storage Basics
- MachineSets, Machines, and Nodes
- Infrastructure Nodes and Operators
- Deploying and Managing OpenShift Container Storage
- OpenShift Log Aggregation
- External (LDAP) Authentication Providers, Users, and Groups
- OpenShift Monitoring with Prometheus
- Project Template, Quota, and Limits
- OpenShift Networking and NetworkPolicy
- Disabling Project Self-Provisioning
- Cluster Resource Quotas
- Cluster Metering
- Taints and Tolerations
ocs (openshift container storage)
Centralized logging
Metering and chargeback
poc for sc
- poc for sc
- rhel host maintain
- install ocp
- helper node day1
- helper node day1 oper
- helper node day 2 sec
- helper node quay
- helper node zte oper
- helper host add vm-router
- helper node zte tcp-router
- helper node cluster tunning
- helper node local storage
- bootstrap node day1
- master1 node day1
- master0 node day1
- master2 node day1
- infra0 node day1
- infra1 node day1
- worker-0 day2 oper
- worker-1 day2 oper
- worker-2 day2 oper
- tips
rhel host maintain
aliyun host
ssh-copy-id root@
cat << EOF > /root/.ssh/config
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
export VULTR_HOST=helper.hsc.redhat.ren
rsync -e ssh --info=progress2 -P --delete -arz /data/rhel-data/data ${VULTR_HOST}:/data/rhel-data
rsync -e ssh --info=progress2 -P --delete -arz /data/registry ${VULTR_HOST}:/data/
rsync -e ssh --info=progress2 -P --delete -arz /data/ocp4 ${VULTR_HOST}:/data/
rsync -e ssh --info=progress2 -P --delete -arz /data/is.samples ${VULTR_HOST}:/data/
cd /data
tar -cvf - registry/ | pigz -c > registry.tgz
tar -cvf - ocp4/ | pigz -c > ocp4.tgz
tar -cvf - data/ | pigz -c > rhel-data.tgz
tar -cvf - is.samples/ | pigz -c > /data_hdd/down/is.samples.tgz
helper host
######################################################
# on helper
find . -name vsftp*
yum -y install ./data/rhel-7-server-rpms/Packages/vsftpd-3.0.2-25.el7.x86_64.rpm
systemctl start vsftpd
systemctl restart vsftpd
systemctl enable vsftpd
firewall-cmd --permanent --add-service=ftp
firewall-cmd --reload
mv data /var/ftp/
chcon -R -t public_content_t /var/ftp/data
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
cat << EOF > /etc/yum.repos.d/remote.repo
[remote]
name=RHEL FTP
baseurl=ftp://117.177.241.16/data
enabled=1
gpgcheck=0
EOF
yum clean all
yum repolist
yum -y update
hostnamectl set-hostname helper.hsc.redhat.ren
nmcli connection modify em1 ipv4.dns 114.114.114.114
nmcli connection reload
nmcli connection up em1
yum -y install fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
[recidive]
enabled = true
EOF
systemctl enable fail2ban
systemctl restart fail2ban
fail2ban-client status sshd
fail2ban-client status recidive
systemctl status fail2ban
tail -F /var/log/fail2ban.log
cp /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
diff /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
systemctl restart sshd
passwd
useradd -m wzh
lsblk | grep 446 | awk '{print $1}' | xargs -I DEMO echo -n "/dev/DEMO "
# /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
lsblk | grep 446 | awk '{print $1}' | wc -l
# 12
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_logical_volumes/assembly_configure-mange-raid-configuring-and-managing-logical-volumes
yum install -y lvm2
pvcreate -y /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
vgcreate datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
vgs
lvcreate --type raid10 -l 100%FREE --stripes 6 -n datalv datavg
umount /data_hdd
lvremove /dev/datavg/datalv
mkfs.xfs /dev/datavg/datalv
lvdisplay /dev/datavg/datalv -m
mkdir -p /data
cp /etc/fstab /etc/fstab.bak
cat << EOF >> /etc/fstab
/dev/datavg/datalv /data xfs defaults 0 0
EOF
mount -a
yum install -y sysstat
lsblk | grep disk | awk '{print $1}' | xargs -I DEMO echo -n "DEMO "
# sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm
iostat -h -m -x sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm 5
iostat -m -x dm-24 5
yum install -y chrony
systemctl enable chronyd
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
firewall-cmd --get-zones
# block dmz drop external home internal public trusted work
firewall-cmd --zone=public --list-all
firewall-cmd --permanent --zone=public --remove-port=2049/tcp
firewall-cmd --permanent --zone=public --add-rich-rule='rule family="ipv4" port port="2049" protocol="tcp" source address="117.177.241.0/24" accept'
firewall-cmd --permanent --zone=public --add-rich-rule='rule family="ipv4" port port="2049" protocol="tcp" source address="39.137.101.0/24" accept'
# firewall-cmd --permanent --zone=public --add-port=4443/tcp
firewall-cmd --reload
showmount -a
exportfs -s
cd /data_ssd/
scp *.tgz root@117.177.241.17:/data_hdd/down/
# https://access.redhat.com/solutions/3341191
# subscription-manager register --org=ORG ID --activationkey= Key Name
cat /var/log/rhsm/rhsm.log
subscription-manager config --rhsm.manage_repos=0
cp /etc/yum/pluginconf.d/subscription-manager.conf /etc/yum/pluginconf.d/subscription-manager.conf.orig
cat << EOF > /etc/yum/pluginconf.d/subscription-manager.conf
[main]
enabled=0
EOF
# https://access.redhat.com/products/red-hat-insights/#getstarted
subscription-manager register --auto-attach
yum --disableplugin=subscription-manager install insights-client
insights-client --register
yum --disableplugin=subscription-manager install ncdu
helper host day 2
####################################
# anti scan
firewall-cmd --permanent --zone=public --remove-rich-rule='rule family="ipv4" port port="2049" protocol="tcp" source address="117.177.241.0/24" accept'
firewall-cmd --permanent --zone=public --remove-rich-rule='rule family="ipv4" port port="2049" protocol="tcp" source address="39.137.101.0/24" accept'
firewall-cmd --permanent --new-ipset=my-allow-list --type=hash:net
firewall-cmd --permanent --get-ipsets
cat > /root/iplist.txt <<EOL
127.0.0.1/32
223.87.20.0/24
117.177.241.0/24
39.134.200.0/24
39.134.201.0/24
39.137.101.0/24
192.168.7.0/24
112.44.102.224/27
47.93.86.113/32
221.226.0.75/32
210.21.236.182/32
61.132.54.0/24
112.44.102.228/32
223.87.20.7/32
10.88.0.0/16
223.86.0.14/32
39.134.204.0/24
EOL
firewall-cmd --permanent --ipset=my-allow-list --add-entries-from-file=iplist.txt
firewall-cmd --permanent --ipset=my-allow-list --get-entries
firewall-cmd --permanent --zone=trusted --add-source=ipset:my-allow-list
firewall-cmd --reload
firewall-cmd --list-all
firewall-cmd --get-active-zones
firewall-cmd --zone=block --change-interface=em1
firewall-cmd --set-default-zone=block
firewall-cmd --runtime-to-permanent
firewall-cmd --reload
# setup time server
/bin/cp -f /etc/chrony.conf /etc/chrony.conf.bak
cat << EOF > /etc/chrony.conf
server 0.rhel.pool.ntp.org iburst
server 1.rhel.pool.ntp.org iburst
server 2.rhel.pool.ntp.org iburst
server 3.rhel.pool.ntp.org iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
allow 39.134.0.0/16
EOF
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
useradd -m zte
groupadd docker
usermod -aG docker zte
# https://github.com/containers/libpod/issues/5049
loginctl enable-linger zte
su -l zte
# https://www.redhat.com/en/blog/preview-running-containers-without-root-rhel-76
echo 10000 > /proc/sys/user/max_user_namespaces
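The echo above does not survive a reboot; to persist the rootless-podman setting, one option (my addition, following the same sysctl.d pattern used elsewhere in these notes) is:
cat << EOF > /etc/sysctl.d/99-user-ns.conf
user.max_user_namespaces = 10000
EOF
sysctl --system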
####################################
## trust podman
firewall-cmd --permanent --zone=trusted --add-interface=cni0
firewall-cmd --permanent --zone=trusted --remove-interface=cni0
firewall-cmd --reload
# update ntp
cat << EOF > /etc/chrony.conf
server 223.87.20.100 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
allow 39.134.0.0/16
EOF
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
bootstrap host
######################################################
# bootstrap
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
cat << EOF > /etc/yum.repos.d/remote.repo
[remote]
name=RHEL FTP
baseurl=ftp://117.177.241.16/data
enabled=1
gpgcheck=0
EOF
yum clean all
yum repolist
yum -y update
hostnamectl set-hostname bootstrap.hsc.redhat.ren
nmcli connection modify em1 ipv4.dns 117.177.241.16
nmcli connection reload
nmcli connection up em1
yum -y install fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
[recidive]
enabled = true
EOF
systemctl enable fail2ban
systemctl restart fail2ban
fail2ban-client status sshd
fail2ban-client status recidive
systemctl status fail2ban
tail -F /var/log/fail2ban.log
cp /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
diff /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
systemctl restart sshd
passwd
useradd -m wzh
lsblk | grep 446 | awk '{print $1}' | xargs -I DEMO echo -n "/dev/DEMO "
# /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
lsblk | grep 446 | awk '{print $1}' | wc -l
# 12
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_logical_volumes/assembly_configure-mange-raid-configuring-and-managing-logical-volumes
yum install -y lvm2
pvcreate -y /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
vgcreate datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
vgs
lvcreate --type raid10 -l 100%FREE --stripes 6 -n datalv datavg
mkfs.xfs /dev/datavg/datalv
lvdisplay /dev/datavg/datalv -m
mkdir -p /data
cp /etc/fstab /etc/fstab.bak
cat << EOF >> /etc/fstab
/dev/datavg/datalv /data xfs defaults 0 0
EOF
mount -a
yum install -y sysstat
lsblk | grep disk | awk '{print $1}' | xargs -I DEMO echo -n "DEMO "
# sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm
iostat -h -m -x sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm 5
iostat -m -x dm-24 5
yum install -y chrony
systemctl enable chronyd
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
# update ntp
cat << EOF > /etc/chrony.conf
server 223.87.20.100 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
master0 host
#####################################################
# master0
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
cat << EOF > /etc/yum.repos.d/remote.repo
[remote]
name=RHEL FTP
baseurl=ftp://117.177.241.16/data
enabled=1
gpgcheck=0
EOF
yum clean all
yum repolist
yum -y update
hostnamectl set-hostname master0.hsc.redhat.ren
nmcli connection modify em1 ipv4.dns 117.177.241.16
nmcli connection reload
nmcli connection up em1
yum -y install fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
EOF
systemctl enable fail2ban
systemctl restart fail2ban
fail2ban-client status sshd
fail2ban-client status recidive
systemctl status fail2ban
tail -F /var/log/fail2ban.log
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
[recidive]
enabled = true
EOF
cp /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
diff /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
systemctl restart sshd
passwd
useradd -m wzh
yum install -y chrony
systemctl enable chronyd
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
lsblk | grep 446 | awk '{print $1}' | xargs -I DEMO echo -n "/dev/DEMO "
# /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
lsblk | grep 446 | awk '{print $1}' | wc -l
# 12
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_logical_volumes/assembly_configure-mange-raid-configuring-and-managing-logical-volumes
yum install -y lvm2
pvcreate -y /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
vgcreate datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
vgs
lvcreate --type raid0 -l 100%FREE --stripes 12 -n datalv datavg
mkfs.xfs /dev/datavg/datalv
lvdisplay /dev/datavg/datalv -m
mkdir -p /data
mkdir -p /data_hdd
cp /etc/fstab /etc/fstab.bak
cat << EOF >> /etc/fstab
/dev/datavg/datalv /data_hdd xfs defaults 0 0
EOF
mount -a
# update ntp
cat << EOF > /etc/chrony.conf
server 223.87.20.100 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
master1 host
######################################################
# master1
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
cat << EOF > /etc/yum.repos.d/remote.repo
[remote]
name=RHEL FTP
baseurl=ftp://117.177.241.16/data
enabled=1
gpgcheck=0
EOF
yum clean all
yum repolist
yum -y update
hostnamectl set-hostname master1.hsc.redhat.ren
nmcli connection modify em1 ipv4.dns 117.177.241.16
nmcli connection reload
nmcli connection up em1
yum -y install fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
[recidive]
enabled = true
EOF
systemctl enable fail2ban
systemctl restart fail2ban
fail2ban-client status sshd
fail2ban-client status recidive
systemctl status fail2ban
tail -F /var/log/fail2ban.log
cp /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
diff /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
systemctl restart sshd
passwd
useradd -m wzh
yum install -y chrony
systemctl enable chronyd
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
mkdir -p /data_hdd
mkfs.xfs -f /dev/sdb
cat << EOF >> /etc/fstab
/dev/sdb /data_hdd xfs defaults 0 0
EOF
mount -a
# update ntp
cat << EOF > /etc/chrony.conf
server 223.87.20.100 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
master2 host
######################################################
# master2
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
cat << EOF > /etc/yum.repos.d/remote.repo
[remote]
name=RHEL FTP
baseurl=ftp://117.177.241.16/data
enabled=1
gpgcheck=0
EOF
yum clean all
yum repolist
yum -y update
hostnamectl set-hostname master2.hsc.redhat.ren
nmcli connection modify em1 ipv4.dns 117.177.241.16
nmcli connection reload
nmcli connection up em1
yum -y install fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
EOF
systemctl enable fail2ban
systemctl restart fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
[recidive]
enabled = true
EOF
fail2ban-client status
systemctl status fail2ban
tail -F /var/log/fail2ban.log
cp /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
diff /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
systemctl restart sshd
passwd
useradd -m wzh
lsblk | grep 446 | awk '{print $1}' | xargs -I DEMO echo -n "/dev/DEMO "
# /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
lsblk | grep 446 | awk '{print $1}' | wc -l
# 12
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_logical_volumes/assembly_configure-mange-raid-configuring-and-managing-logical-volumes
yum install -y lvm2
pvcreate -y /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
vgcreate datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
vgs
lvcreate --type raid0 -l 100%FREE --stripes 12 -n datalv datavg
mkfs.xfs /dev/datavg/datalv
lvdisplay /dev/datavg/datalv -m
mkdir -p /data
mkdir -p /data_hdd
cp /etc/fstab /etc/fstab.bak
cat << EOF >> /etc/fstab
/dev/datavg/datalv /data_hdd xfs defaults 0 0
EOF
mount -a
yum install -y sysstat
lsblk | grep disk | awk '{print $1}' | xargs -I DEMO echo -n "DEMO "
# sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm
iostat -m -x sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm 5
iostat -m -x dm-12 5
yum install -y chrony
systemctl enable chronyd
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
# update ntp
cat << EOF > /etc/chrony.conf
server 223.87.20.100 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
infra0 host
######################################################
# infra0
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
cat << EOF > /etc/yum.repos.d/remote.repo
[remote]
name=RHEL FTP
baseurl=ftp://117.177.241.16/data
enabled=1
gpgcheck=0
EOF
yum clean all
yum repolist
yum -y update
hostnamectl set-hostname infra0.hsc.redhat.ren
nmcli connection modify em1 ipv4.dns 117.177.241.16
nmcli connection reload
nmcli connection up em1
yum -y install fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
[recidive]
enabled = true
EOF
systemctl enable fail2ban
systemctl restart fail2ban
fail2ban-client status sshd
fail2ban-client status recidive
systemctl status fail2ban
tail -F /var/log/fail2ban.log
cp /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
diff /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
systemctl restart sshd
passwd
useradd -m wzh
lsblk | grep 446 | awk '{print $1}' | xargs -I DEMO echo -n "/dev/DEMO "
# /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
lsblk | grep 446 | awk '{print $1}' | wc -l
# 12
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_logical_volumes/assembly_configure-mange-raid-configuring-and-managing-logical-volumes
yum install -y lvm2
pvcreate -y /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
vgcreate datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
vgs
lvcreate --type raid0 -l 100%FREE --stripes 12 -n datalv datavg
mkfs.xfs /dev/datavg/datalv
lvdisplay /dev/datavg/datalv -m
mkdir -p /data
mkdir -p /data_hdd
cp /etc/fstab /etc/fstab.bak
cat << EOF >> /etc/fstab
/dev/datavg/datalv /data xfs defaults 0 0
EOF
mount -a
# https://access.redhat.com/solutions/769403
fuser -km /data
lvremove -f datavg/datalv
vgremove datavg
pvremove /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
pvcreate -y /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
vgcreate datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
lvcreate --type raid0 -L 400G --stripes 12 -n monitorlv datavg
yum install -y sysstat
lsblk | grep disk | awk '{print $1}' | xargs -I DEMO echo -n "DEMO "
# sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm
iostat -m -x sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm 5
iostat -m -x dm-12 5
yum install -y chrony
systemctl enable chronyd
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
# update ntp
cat << EOF > /etc/chrony.conf
server 223.87.20.100 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
infra1 host
######################################################
# infra1
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
cat << EOF > /etc/yum.repos.d/remote.repo
[remote]
name=RHEL FTP
baseurl=ftp://117.177.241.16/data
enabled=1
gpgcheck=0
EOF
yum clean all
yum repolist
yum -y update
hostnamectl set-hostname infra1.hsc.redhat.ren
nmcli connection modify em1 ipv4.dns 117.177.241.16
nmcli connection reload
nmcli connection up em1
yum -y install fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
[recidive]
enabled = true
EOF
systemctl enable fail2ban
systemctl restart fail2ban
fail2ban-client status sshd
fail2ban-client status recidive
systemctl status fail2ban
tail -F /var/log/fail2ban.log
cp /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
diff /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
systemctl restart sshd
passwd
useradd -m wzh
lsblk | grep 446 | awk '{print $1}' | xargs -I DEMO echo -n "/dev/DEMO "
# /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
lsblk | grep 446 | awk '{print $1}' | wc -l
# 12
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_logical_volumes/assembly_configure-mange-raid-configuring-and-managing-logical-volumes
yum install -y lvm2
pvcreate -y /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
vgcreate datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
vgs
lvcreate --type raid0 -l 100%FREE --stripes 12 -n datalv datavg
mkfs.xfs /dev/datavg/datalv
lvdisplay /dev/datavg/datalv -m
mkdir -p /data
mkdir -p /data_hdd
cp /etc/fstab /etc/fstab.bak
cat << EOF >> /etc/fstab
/dev/datavg/datalv /data xfs defaults 0 0
EOF
mount -a
# https://access.redhat.com/solutions/769403
fuser -km /data
lvremove -f datavg/datalv
vgremove datavg
pvremove /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
pvcreate -y /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
vgcreate datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
lvcreate --type raid0 -L 400G --stripes 12 -n monitorlv datavg
yum install -y sysstat
lsblk | grep disk | awk '{print $1}' | xargs -I DEMO echo -n "DEMO "
# sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm
iostat -m -x sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm 5
iostat -m -x dm-12 5
yum install -y chrony
systemctl enable chronyd
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
# update ntp
cat << EOF > /etc/chrony.conf
server 223.87.20.100 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
worker-0 host
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
cat << EOF > /etc/yum.repos.d/remote.repo
[remote]
name=RHEL FTP
baseurl=ftp://117.177.241.16/data
enabled=1
gpgcheck=0
EOF
yum clean all
yum --disableplugin=subscription-manager repolist
yum -y update
hostnamectl set-hostname worker-0.ocpsc.redhat.ren
nmcli connection modify enp3s0f0 ipv4.dns 117.177.241.16
nmcli connection reload
nmcli connection up enp3s0f0
yum -y install fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
EOF
systemctl enable fail2ban
systemctl restart fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
[recidive]
enabled = true
EOF
systemctl restart fail2ban
fail2ban-client status sshd
fail2ban-client status recidive
systemctl status fail2ban
tail -F /var/log/fail2ban.log
cp /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
diff /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
systemctl restart sshd
passwd
useradd -m wzh
lsblk | grep 446 | awk '{print $1}' | xargs -I DEMO echo -n "/dev/DEMO "
# /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
lsblk | grep 446 | awk '{print $1}' | wc -l
# 11
yum install -y lvm2
pvcreate -y /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
vgcreate datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
vgs
lvcreate --type raid0 -l 100%FREE --stripes 10 -n datalv datavg
mkfs.xfs /dev/datavg/datalv
lvdisplay /dev/datavg/datalv -m
mkdir -p /data
cp /etc/fstab /etc/fstab.bak
cat << EOF >> /etc/fstab
/dev/datavg/datalv /data xfs defaults 0 0
EOF
mount -a
yum install -y sysstat
lsblk | grep disk | awk '{print $1}' | xargs -I DEMO echo -n "DEMO "
# sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm
iostat -m -x sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk 5
iostat -m -x dm-10 5
####################################
# ntp
yum install -y chrony
systemctl enable chronyd
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
systemctl disable --now firewalld.service
# update ntp
cat << EOF > /etc/chrony.conf
server 223.87.20.100 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
#######################################
# nic bond
cat << 'EOF' > /root/nic.bond.sh
#!/bin/bash
# delete all connection
nmcli -g uuid con | while read i ; do nmcli c delete uuid ${i} ; done
nmcli con add type bond \
con-name bond0 \
ifname bond0 \
mode 802.3ad \
ipv4.method 'manual' \
ipv4.address '39.137.101.28/25' \
ipv4.gateway '39.137.101.126' \
ipv4.dns '117.177.241.16'
nmcli con mod id bond0 bond.options \
mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer2+3
nmcli con add type bond-slave ifname enp3s0f0 con-name enp3s0f0 master bond0
nmcli con add type bond-slave ifname enp3s0f1 con-name enp3s0f1 master bond0
# nmcli con down enp3s0f0 && nmcli con start enp3s0f0
# nmcli con down enp3s0f1 && nmcli con start enp3s0f1
# nmcli con down bond0 && nmcli con start bond0
systemctl restart network
EOF
cat > /root/nic.restore.sh << 'EOF'
#!/bin/bash
# delete all connection
nmcli -g uuid con | while read i ; do nmcli c delete uuid ${i} ; done
# re-create primary connection
nmcli con add type ethernet \
con-name enp3s0f0 \
ifname enp3s0f0 \
ipv4.method 'manual' \
ipv4.address '39.137.101.28/25' \
ipv4.gateway '39.137.101.126' \
ipv4.dns '117.177.241.16'
# restart interface
# nmcli con down enp3s0f0 && nmcli con up enp3s0f0
systemctl restart network
exit 0
EOF
chmod +x /root/nic.restore.sh
cat > ~/cron-network-con-recreate << EOF
*/2 * * * * /bin/bash /root/nic.restore.sh
EOF
crontab ~/cron-network-con-recreate
bash /root/nic.bond.sh
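# after running the bond script, a quick check that bond0 came up in 802.3ad mode with both slaves
cat /proc/net/bonding/bond0
ip addr show bond0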
worker-0 disk
#########################################
# ssd cache + hdd
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html-single/logical_volume_manager_administration/index#lvm_cache_volume_creation
umount /data
lsblk -d -o name,rota
lvremove /dev/datavg/datalv
pvcreate /dev/nvme0n1
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/logical_volume_manager_administration/vg_grow
vgextend datavg /dev/nvme0n1
## raid5 + cache
lvcreate --type raid5 -L 1G --stripes 9 -n hddlv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
lvcreate --type raid5 -L 3.8T --stripes 9 -n mixlv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
lvcreate -L 1G -n ssdlv datavg /dev/nvme0n1
# lvcreate --type cache-pool -L 300G -n cache1 datavg /dev/nvme0n1
lvcreate -L 1.4T -n cache1 datavg /dev/nvme0n1
lvcreate -L 14G -n cache1meta datavg /dev/nvme0n1
lvconvert --type cache-pool --poolmetadata datavg/cache1meta datavg/cache1
lvconvert --type cache --cachepool datavg/cache1 datavg/mixlv
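# optional sanity check after attaching the cache: watch the hit/miss counters
# (field names as reported by `lvs -o help`); a cached origin LV can be detached again
# with --uncache before it is removed
lvs -a -o name,size,attr,devices,cache_read_hits,cache_read_misses datavg
# lvconvert --uncache datavg/mixlv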
# lvcreate --type raid5 --stripes 9 -L 1T -I 16M -R 4096K -n hddlv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
# lvcreate --type raid5 --stripes 9 -L 1T -I 16M -R 4096K -n datalv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
# lvcreate --type raid5 --stripes 9 -L 1T -n datalv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
## raid0 + cache
lvcreate --type raid0 -L 4T --stripes 10 -n hddlv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
lvcreate --type raid0 -L 1T --stripes 10 -n mixlv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
lvcreate -L 300G -n ssdlv datavg /dev/nvme0n1
lvcreate --type cache-pool -L 300G -n cpool datavg /dev/nvme0n1
lvs -a -o name,size,attr,devices datavg
# lvconvert --type cache --cachepool cpool datavg/datalv
lvconvert --type cache --cachepool cpool datavg/mixlv
# lvconvert --type cache --cachepool cpool --cachemode writeback datavg/datalv
# lvs -a -o name,size,attr,devices datavg
# lvs -o+cache_mode datavg
# mkfs.xfs /dev/datavg/datalv
mkfs.xfs /dev/datavg/hddlv
mkfs.xfs /dev/datavg/ssdlv
mkfs.xfs /dev/datavg/mixlv
mkdir -p /data/
mkdir -p /data_ssd/
mkdir -p /data_mix/
cat /etc/fstab
cat << EOF >> /etc/fstab
/dev/datavg/hddlv /data xfs defaults 0 0
/dev/datavg/ssdlv /data_ssd xfs defaults 0 0
/dev/datavg/mixlv /data_mix xfs defaults 0 0
EOF
mount -a
df -h | grep \/data
# cleanup
umount /data/
umount /data_ssd/
umount /data_mix/
lvremove -f /dev/datavg/hddlv
lvremove -f /dev/datavg/ssdlv
lvremove -f /dev/datavg/mixlv
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--directory=./ --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--directory=./ --ioengine=sync --size=100g
blktrace /dev/datavg/mixlv /dev/nvme0n1 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
blkparse -o /dev/null -i dm-42 -d dm-42.bin
btt -i dm-42.blktrace.bin
blkparse -o /dev/null -i nvme0n1 -d nvme0n1.bin
btt -i nvme0n1.bin | less
blkparse -o /dev/null -i sdb -d sdb.bin
btt -i sdb.bin | less
dstat -D /dev/mapper/datavg-hddlv,sdd,nvme0n1 -N enp3s0f0
dstat -D /dev/mapper/datavg-hddlv,sdd,nvme0n1 --disk-util
bmon -p ens8f0,ens8f1,enp3s0f0,enp3s0f1
lvs -o+lv_all datavg/mixlv_corig
lvs -o+Layout datavg/mixlv_corig
lvs -o+CacheReadHits,CacheReadMisses
lvs -o+Layout
blockdev --report
# https://access.redhat.com/solutions/3588841
/sbin/blockdev --setra 262144 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 8192 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 0 /dev/mapper/datavg-hddlv
hdparm -t /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 4096 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 8192 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 16384 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 32768 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 65536 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 131072 /dev/mapper/datavg-hddlv
for f in /dev/mapper/datavg-hddlv_rimage_*; do /sbin/blockdev --setra 65536 $f ; done
for f in /dev/mapper/datavg-hddlv_rimage_*; do /sbin/blockdev --setra 131072 $f ; done
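# note: blockdev --setra counts 512-byte sectors and does not persist across reboot;
# the equivalent sysfs knob is read_ahead_kb (65536 sectors = 32768 KB), e.g. for the dm device behind hddlv:
echo 32768 > /sys/block/$(basename $(readlink -f /dev/mapper/datavg-hddlv))/queue/read_ahead_kb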
blktrace /dev/datavg/hddlv /dev/nvme0n1 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
# Generate distribution of file sizes from the command prompt
# https://superuser.com/questions/565443/generate-distribution-of-file-sizes-from-the-command-prompt
find /data/mnt/ -type f > list
cat list | xargs ls -l > list.size
cat list.size | awk '{ n=int(log($5)/log(2)); \
if (n<10) n=10; \
size[n]++ } \
END { for (i in size) printf("%d %d\n", 2^i, size[i]) }' \
| sort -n \
| awk 'function human(x) { x[1]/=1024; \
if (x[1]>=1024) { x[2]++; \
human(x) } } \
{ a[1]=$1; \
a[2]=0; \
human(a); \
printf("%3d%s: %6d\n", a[1],substr("kMGTEPYZ",a[2]+1,1),$2) }'
# 1k: 2
# 16k: 18875840
# 64k: 7393088
# 128k: 5093147
# 512k: 1968632
# 1M: 914486
cat list.size | awk '{size[int(log($5)/log(2))]++}END{for (i in size) printf("%10d %3d\n", 2^i, size[i])}' | sort -n
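# a shorter variant of the size histogram with GNU find, avoiding the ls output parsing
# (skips zero-byte files; buckets are powers of two in bytes)
find /data/mnt/ -type f -printf '%s\n' \
  | awk '{ if ($1>0) size[int(log($1)/log(2))]++ } END { for (i in size) printf("%10d %d\n", 2^i, size[i]) }' \
  | sort -n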
# 5.5
var_basedir="/data_ssd/mnt"
find $var_basedir -type f -size -16k > list.16k
find $var_basedir -type f -size -128k -size +16k > list.128k
find $var_basedir -type f -size +128k > list.+128k
find $var_basedir -type f > list
dstat --output /root/dstat.csv -D /dev/mapper/datavg-mixlv,/dev/mapper/datavg-mixlv_corig,sdh,sdab -N bond0
dstat -D /dev/mapper/datavg-hddlv,/dev/datavg/ext4lv,sdh,sdab -N bond0
i=0
while read f; do
/bin/cp -f $f /data_mix/mnt/$i
((i++))
done < list
find /data_mix/mnt/ -type f > list
cat list | shuf > list.shuf.all
cat list.16k | shuf > list.shuf.16k
cat list.128k | shuf > list.shuf.128k
cat list.+128k | shuf > list.shuf.+128k
cat list.128k list.+128k | shuf > list.shuf.+16k
# zte use 1800
var_total=10
rm -f split.list.*
split -n l/$var_total list.shuf.all split.list.all.
split -n l/$var_total list.shuf.16k split.list.16k.
split -n l/$var_total list.shuf.128k split.list.128k.
split -n l/$var_total list.shuf.+128k split.list.+128k.
split -n l/$var_total list.shuf.+16k split.list.+16k.
for f in split.list.16k.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
# for f in split.list.+16k.*; do
# cat $f | xargs -I DEMO cat DEMO > /dev/null &
# done
for f in split.list.128k.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
for f in split.list.+128k.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
for f in split.list.all.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
ps -ef | grep /data_ssd/mnt | grep cat | awk '{print $2}' | xargs -I DEMO kill DEMO
echo "wait to finish"
wait
# while true; do
# for f in split.list.all.*; do
# cat $f | xargs -I DEMO cat DEMO > /dev/null &
# done
# echo "wait to finish"
# wait
# done
kill -9 $(jobs -p)
jobs -p | xargs kill
ps -ef | grep /mnt/zxdfs | grep cat | awk '{print $2}' | xargs -I DEMO kill DEMO
ps -ef | grep /data_mix/mnt | grep cat | awk '{print $2}' | xargs -I DEMO kill DEMO
worker-1 host
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
cat << EOF > /etc/yum.repos.d/remote.repo
[remote]
name=RHEL FTP
baseurl=ftp://117.177.241.16/data
enabled=1
gpgcheck=0
EOF
yum clean all
yum --disableplugin=subscription-manager repolist
yum install -y byobu htop sysstat # iostat is provided by the sysstat package
yum -y update
hostnamectl set-hostname worker-1.ocpsc.redhat.ren
nmcli connection modify eno1 ipv4.dns 117.177.241.16
nmcli connection reload
nmcli connection up eno1
yum -y install fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
EOF
systemctl enable fail2ban
systemctl restart fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
[recidive]
enabled = true
EOF
systemctl restart fail2ban
fail2ban-client status sshd
fail2ban-client status recidive
systemctl status fail2ban
tail -F /var/log/fail2ban.log
cp /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
diff /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
systemctl restart sshd
passwd
useradd -m wzh
lsblk | grep 5.5 | awk '{print $1}' | xargs -I DEMO echo -n "/dev/DEMO "
# /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
lsblk | grep 5.5 | awk '{print $1}' | wc -l
# 24
yum install -y lvm2
pvcreate -y /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
vgcreate datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
vgs
lvcreate --type raid0 -l 100%FREE --stripes 24 -n datalv datavg
mkfs.xfs /dev/datavg/datalv
lvdisplay /dev/datavg/datalv -m
mkdir -p /data
cp /etc/fstab /etc/fstab.bak
cat << EOF >> /etc/fstab
/dev/datavg/datalv /data xfs defaults 0 0
EOF
mount -a
yum install -y sysstat
lsblk | grep disk | awk '{print $1}' | xargs -I DEMO echo -n "DEMO "
# sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm
iostat -m -x sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk 5
iostat -m -x dm-10 5
########################################
# ntp
yum install -y chrony
systemctl enable chronyd
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
systemctl disable --now firewalld.service
# setup time server
/bin/cp -f /etc/chrony.conf /etc/chrony.conf.bak
cat << EOF > /etc/chrony.conf
server 117.177.241.16 iburst
server 0.rhel.pool.ntp.org iburst
server 1.rhel.pool.ntp.org iburst
server 2.rhel.pool.ntp.org iburst
server 3.rhel.pool.ntp.org iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
chronyc sources -v
# update ntp
cat << EOF > /etc/chrony.conf
server 223.87.20.100 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
worker-1 disk
##################################
## config
mkdir -p /app_conf/zxcdn
#########################################
# ssd cache + hdd
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html-single/logical_volume_manager_administration/index#lvm_cache_volume_creation
umount /data
lsblk -d -o name,rota
lvremove /dev/datavg/datalv
# lsblk | grep 894 | awk '{print $1}'
pvcreate /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/logical_volume_manager_administration/vg_grow
vgextend datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
## raid5
lvcreate --type raid5 -L 3T --stripes 23 -n hddlv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid0 -L 1G --stripes 10 -n ssdlv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid5 -L 3T --stripes 23 -n mixlv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid5 -L 1T --stripes 9 -n cache1 datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid5 -L 10G --stripes 9 -n cache1meta datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvconvert --type cache-pool --poolmetadata datavg/cache1meta datavg/cache1
lvconvert --type cache --cachepool datavg/cache1 datavg/mixlv
# lvcreate --type raid5 --stripes 9 -L 1T -I 16M -R 4096K -n hddlv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
lvcreate --type raid5 -L 12T --stripes 23 -n mix0lv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid0 -L 4T --stripes 10 -n cachemix0 datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid0 -L 40G --stripes 10 -n cachemix0meta datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvconvert --type cache-pool --poolmetadata datavg/cachemix0meta datavg/cachemix0
lvconvert --type cache --cachepool datavg/cachemix0 datavg/mix0lv
lvcreate --type raid5 -L 1T --stripes 23 -n mix0weblv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid0 -L 162G --stripes 10 -n cachemix0web datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid0 -L 2G --stripes 10 -n cachemix0webmeta datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvconvert --type cache-pool --poolmetadata datavg/cachemix0webmeta datavg/cachemix0web
lvconvert --type cache --cachepool datavg/cachemix0web datavg/mix0weblv
# lvcreate --type raid0 -L 200G --stripes 10 -n ssd0lv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid0 -L 200G --stripes 4 -n ssd0lv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/ssd0lv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=1 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
lvremove -f datavg/ssd0lv
## raid0 + stripe
lvcreate --type raid0 -L 130T --stripes 24 -n hddlv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid0 -L 900G --stripesize 128k --stripes 24 -n testfslv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
mkfs.ext4 /dev/datavg/testfslv
mount /dev/datavg/testfslv /data_mix
lvcreate --type raid0 -L 5T --stripes 10 -n ssdlv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid5 -L 5T --stripes 9 -n ssdlv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
mkfs.ext4 /dev/datavg/ssdlv
mount /dev/datavg/ssdlv /data_ssd
rsync -e ssh --info=progress2 -P --delete -ar --files-from=list.20k / 39.134.201.65:/data_ssd/mnt/
rsync -e ssh --info=progress2 -P --delete -ar /data/mnt/ 39.134.201.65:/data_ssd/mnt/
rsync -e ssh --info=progress2 -P --delete -ar /data/mnt/zxdfs/webcache-011/ 39.134.201.65:/data_ssd/mnt/zxdfs/webcache-011/
rsync -e ssh --info=progress2 -P --delete -ar /data/mnt/zxdfs/webcache-012/ 39.134.201.65:/data_ssd/mnt/zxdfs/webcache-012/
# slow
lvcreate --type raid0 -L 400G --stripesize 128k --stripes 12 -n testfslv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl
# Generate distribution of file sizes from the command prompt
# https://superuser.com/questions/565443/generate-distribution-of-file-sizes-from-the-command-prompt
cat list | xargs ls -l > list.size
cat list.size | awk '{ n=int(log($5)/log(2)); \
if (n<10) n=10; \
size[n]++ } \
END { for (i in size) printf("%d %d\n", 2^i, size[i]) }' \
| sort -n \
| awk 'function human(x) { x[1]/=1024; \
if (x[1]>=1024) { x[2]++; \
human(x) } } \
{ a[1]=$1; \
a[2]=0; \
human(a); \
printf("%3d%s: %6d\n", a[1],substr("kMGTEPYZ",a[2]+1,1),$2) }'
lvcreate --type raid0 -L 1T --stripes 24 -n mixlv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid0 -L 300G --stripes 10 -n ssdlv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid0 -L 300G --stripes 10 -n cache1 datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid0 -L 3G --stripes 10 -n cache1meta datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvconvert --type cache-pool --poolmetadata datavg/cache1meta datavg/cache1
# lvs -a -o name,size,attr,devices datavg
lvconvert --type cache --cachepool datavg/cache1 datavg/mixlv
# lvs -a -o name,size,attr,devices datavg
# lvs -o+cache_mode datavg
mkfs.xfs /dev/datavg/hddlv
mkfs.xfs /dev/datavg/ssdlv
mkfs.xfs /dev/datavg/mixlv
mkfs.xfs /dev/datavg/mix0lv
mkfs.xfs /dev/datavg/mix0weblv
mkdir -p /data/
mkdir -p /data_ssd/
mkdir -p /data_mix/
mkdir -p /data_mix0
mkdir -p /data_mix0_web/
cat /etc/fstab
cat << EOF >> /etc/fstab
/dev/datavg/hddlv /data xfs defaults 0 0
# /dev/datavg/ssdlv /data_ssd xfs defaults 0 0
# /dev/datavg/mixlv /data_mix xfs defaults 0 0
# /dev/datavg/mix0lv /data_mix0 xfs defaults 0 0
# /dev/datavg/mix0weblv /data_mix0_web xfs defaults 0 0
EOF
mount -a
df -h | grep \/data
dd if=/dev/zero of=/data/testfile bs=4k count=9999 oflag=dsync
dd if=/dev/zero of=/data_ssd/testfile bs=4k count=9999 oflag=dsync
dd if=/dev/zero of=/data_mix/testfile bs=4k count=9999 oflag=dsync
dd if=/dev/zero of=/data/testfile bs=4M count=9999 oflag=dsync
dd if=/dev/zero of=/data_ssd/testfile bs=4M count=9999 oflag=dsync
dd if=/dev/zero of=/data_mix/testfile bs=4M count=9999 oflag=dsync
dd if=/data/testfile of=/dev/null bs=4k count=9999 oflag=dsync
dd if=/data_ssd/testfile of=/dev/null bs=4k count=9999 oflag=dsync
dd if=/data_mix/testfile of=/dev/null bs=4k count=9999 oflag=dsync
dd if=/dev/zero of=/data/testfile.large bs=4M count=9999 oflag=direct
dd if=/dev/zero of=/data_ssd/testfile.large bs=4M count=9999 oflag=direct
dd if=/dev/zero of=/data_mix/testfile.large bs=4M count=9999 oflag=direct
dd if=/dev/zero of=/data/testfile.large bs=4M count=9999
dd if=/dev/zero of=/data_ssd/testfile.large bs=4M count=9999
dd if=/dev/zero of=/data_mix/testfile.large bs=4M count=9999
dd if=/data/testfile.large of=/dev/null bs=4k count=9999 oflag=dsync
dd if=/data_ssd/testfile.large of=/dev/null bs=4k count=9999 oflag=dsync
dd if=/data_mix/testfile.large of=/dev/null bs=4k count=9999 oflag=dsync
dd if=/data/testfile.large of=/dev/null bs=4M count=9999 oflag=dsync
dd if=/data_ssd/testfile.large of=/dev/null bs=4M count=9999 oflag=dsync
dd if=/data_mix/testfile.large of=/dev/null bs=4M count=9999 oflag=dsync
dd if=/data/testfile.large of=/dev/null bs=4M count=9999
dd if=/data_ssd/testfile.large of=/dev/null bs=4M count=9999
dd if=/data_mix/testfile.large of=/dev/null bs=4M count=9999
dd if=/data/testfile.large of=/dev/null bs=40M count=9999
dd if=/data_ssd/testfile.large of=/dev/null bs=40M count=9999
dd if=/data_mix/testfile.large of=/dev/null bs=40M count=9999
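# note: the read tests above mostly hit the page cache right after the writes;
# drop the caches first if device-level read numbers are wanted
sync; echo 3 > /proc/sys/vm/drop_caches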
# cleanup
umount /data/
umount /data_ssd/
umount /data_mix/
umount /data_mix0/
lvremove -f /dev/datavg/hddlv
lvremove -f /dev/datavg/ssdlv
lvremove -f /dev/datavg/mixlv
lvremove -f /dev/datavg/mix0lv
# ssd tunning
# https://serverfault.com/questions/80134/linux-md-vs-lvm-performance
hdparm -tT /dev/md0
# https://www.ibm.com/developerworks/cn/linux/l-lo-io-scheduler-optimize-performance/index.html
cat /sys/block/*/queue/scheduler
lsblk | grep 894 | awk '{print $1}' | xargs -I DEMO cat /sys/block/DEMO/queue/scheduler
lsblk | grep 894 | awk '{print "echo deadline > /sys/block/"$1"/queue/scheduler"}'
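# to actually apply the scheduler (assuming the 894G devices are the SSDs as listed above), feed them into a loop
lsblk | grep 894 | awk '{print $1}' | while read d ; do echo deadline > /sys/block/$d/queue/scheduler ; done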
iostat -x -m 3 /dev/mapper/datavg-mix0weblv /dev/mapper/datavg-mix0weblv_corig /dev/mapper/datavg-cachemix0web_cdata /dev/mapper/datavg-cachemix0web_cmeta
dstat -D /dev/mapper/datavg-hddlv,sdh,sdab -N bond0
dstat -D /dev/mapper/datavg-hddlv,sdh,sdab --disk-util
bmon -p eno1,eno2,ens2f0,ens2f1,bond0
lvs -o+lv_all datavg/mixlv_corig
lvs -o+Layout datavg/mixlv_corig
lvs -o+CacheReadHits,CacheReadMisses
lvs -o+Layout
blockdev --report
# https://access.redhat.com/solutions/3588841
/sbin/blockdev --setra 1048576 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 524288 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 262144 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 131072 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 65536 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 32768 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 16384 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 8192 /dev/mapper/datavg-hddlv
/sbin/blockdev --setra 8192 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
for f in /dev/mapper/datavg-hddlv_rimage_*; do /sbin/blockdev --setra 8192 $f ; done
for f in /dev/mapper/datavg-hddlv_rimage_*; do /sbin/blockdev --setra 16384 $f ; done
blktrace /dev/datavg/hddlv /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
blkparse -o /dev/null -i dm-24 -d dm-24.bin
btt -i dm-24.bin | less
blkparse -o /dev/null -i sda -d sda.bin
btt -i sda.bin | less
# 5.5
# find /data/mnt/ -type f -size -2M -size +512k > list
var_basedir="/data_mix/mnt"
find $var_basedir -type f -size -2M > list.2m
find $var_basedir -type f -size -10M -size +2M > list.10m
find $var_basedir -type f -size +10M > list.100m
find /data/mnt/ -type f > list
dstat --output /root/dstat.csv -D /dev/mapper/datavg-mixlv,/dev/mapper/datavg-mixlv_corig,sdh,sdab -N bond0
dstat -D /dev/mapper/datavg-hddlv,/dev/datavg/testfslv,sdh,sdab -N bond0
mkdir -p /data_mix/mnt
i=11265199
while read f; do
/bin/cp -f $f /data_mix/mnt/$i &
((i++))
if (( $i % 200 == 0 )) ; then
wait
fi
done < list.100m
while true; do
df -h | grep /data
sleep 60
done
find /data_mix/mnt/ -type f > list
cat list | shuf > list.shuf.all
cat list.2m | shuf > list.shuf.2m
cat list.10m | shuf > list.shuf.10m
cat list.100m | shuf > list.shuf.100m
cat list.10m list.100m | shuf > list.shuf.+2m
# zte use 1800
var_total=10
rm -f split.list.*
split -n l/$var_total list.shuf.all split.list.all.
split -n l/$var_total list.shuf.2m split.list.2m.
split -n l/$var_total list.shuf.10m split.list.10m.
split -n l/$var_total list.shuf.100m split.list.100m.
split -n l/$var_total list.shuf.+2m split.list.+2m.
for f in split.list.2m.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
# for f in split.list.+2m.*; do
# cat $f | xargs -I DEMO cat DEMO > /dev/null &
# done
for f in split.list.10m.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
for f in split.list.100m.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
for f in split.list.all.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
jobs -p | xargs kill
ps -ef | grep xargs | grep DEMO | grep cat | awk '{print $2}' | xargs -I DEMO kill DEMO
ps -ef | grep /data_mix/mnt | grep cat | awk '{print $2}' | xargs -I DEMO kill DEMO
rclone sync /data/mnt/ /data/backup/mnt/ -P -L --transfers 64
rclone sync /data/home/ /data/backup/home/ -P -L --transfers 64
rclone sync /data/ztecdn/ /data/backup/ztecdn/ -P -L --transfers 64
rclone sync /data/backup/mnt/ /data/mnt/ -P -L --transfers 64
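# rclone supports a dry run to preview what sync would copy or delete before touching the target
rclone sync /data/mnt/ /data/backup/mnt/ -P -L --transfers 64 --dry-run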
# check sn
dmidecode -t 1
# # dmidecode 3.2
# Getting SMBIOS data from sysfs.
# SMBIOS 3.0.0 present.
# Handle 0x0001, DMI type 1, 27 bytes
# System Information
# Manufacturer: Huawei
# Product Name: 5288 V5
# Version: Purley
# Serial Number: 2102312CJSN0K9000028
# UUID: a659bd21-cc64-83c1-e911-6cd6de4f8050
# Wake-up Type: Power Switch
# SKU Number: Purley
# Family: Purley
# check disk
lshw -c disk
# *-disk:0
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.2.0
# bus info: scsi@0:0.2.0
# logical name: /dev/sda
# version: T010
# serial: xLkuQ2-XVVp-sfs3-8Rgm-vRgS-uysW-ncIudq
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:1
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.3.0
# bus info: scsi@0:0.3.0
# logical name: /dev/sdb
# version: T010
# serial: 5d2geD-fGih-Q6yK-2xVs-lWUG-tH38-qQWRC6
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:2
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.c.0
# bus info: scsi@0:0.12.0
# logical name: /dev/sdk
# version: T010
# serial: fePKOb-MTZv-j4Xz-qNjo-cPTr-078I-vZYiPH
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:3
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.d.0
# bus info: scsi@0:0.13.0
# logical name: /dev/sdl
# version: T010
# serial: fUTBJp-fXg0-0uJX-V4Qp-vSfZ-yxmb-G8LNam
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:4
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.e.0
# bus info: scsi@0:0.14.0
# logical name: /dev/sdm
# version: T010
# serial: SNfxce-ytX2-7j4p-opnQ-lOxC-AFIp-VbCfec
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:5
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.f.0
# bus info: scsi@0:0.15.0
# logical name: /dev/sdn
# version: T010
# serial: HJqH2G-XT7i-2R27-dSb0-q36n-T4Ut-Ml4GiE
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:6
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.10.0
# bus info: scsi@0:0.16.0
# logical name: /dev/sdo
# version: T010
# serial: IBh87y-SOWJ-rI3R-Mshu-agWM-TyHs-6ko0iu
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:7
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.11.0
# bus info: scsi@0:0.17.0
# logical name: /dev/sdp
# version: T010
# serial: erBKxc-gBsD-msEq-aXMJ-8akE-FGRb-SjBk1w
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:8
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.12.0
# bus info: scsi@0:0.18.0
# logical name: /dev/sdq
# version: T010
# serial: HsiL2h-6736-4x4H-0OTz-HuXj-My1c-RRShQP
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:9
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.13.0
# bus info: scsi@0:0.19.0
# logical name: /dev/sdr
# version: T010
# serial: yZQ8MH-7SCw-KIFL-fphN-S0W0-GS4V-Wc2gwx
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:10
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.14.0
# bus info: scsi@0:0.20.0
# logical name: /dev/sds
# version: T010
# serial: pp6xvN-MBT9-aLkB-65hF-7fwE-29vt-hA51K9
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:11
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.15.0
# bus info: scsi@0:0.21.0
# logical name: /dev/sdt
# version: T010
# serial: jXj3cL-qvoJ-JWP0-jvp9-WEbn-yD63-e6vFmP
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:12
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.4.0
# bus info: scsi@0:0.4.0
# logical name: /dev/sdc
# version: T010
# serial: Ca6Nyo-Oq5p-UdAY-oqIs-DlK5-1PPy-ugvF3P
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:13
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.16.0
# bus info: scsi@0:0.22.0
# logical name: /dev/sdu
# version: T010
# serial: GOTXh2-34fo-rZfh-IB5d-RkwW-o5EC-rDD4R1
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:14
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.17.0
# bus info: scsi@0:0.23.0
# logical name: /dev/sdv
# version: T010
# serial: 7Yn8xd-68Xu-A0RC-nx5Q-YEvJ-QPEG-CwjkP0
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:15
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.18.0
# bus info: scsi@0:0.24.0
# logical name: /dev/sdw
# version: T010
# serial: hdz5tv-f2Zm-wuf8-qtKO-XIlN-4Z1E-uHapKc
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:16
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.19.0
# bus info: scsi@0:0.25.0
# logical name: /dev/sdx
# version: T010
# serial: C3VFhO-mh9a-vKIR-Gi1o-pc05-LOqY-oErH8r
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:17
# description: SCSI Disk
# product: HW-SAS3408
# vendor: AVAGO
# physical id: 2.0.0
# bus info: scsi@0:2.0.0
# logical name: /dev/sdy
# version: 5.06
# serial: 00457f537b174eb025007018406c778a
# size: 446GiB (478GB)
# capabilities: gpt-1.00 partitioned partitioned:gpt
# configuration: ansiversion=5 guid=f72b8f56-6e5d-4a0c-a2a0-bf641ac2c2ff logicalsectorsize=512 sectorsize=4096
# *-disk:18
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.5.0
# bus info: scsi@0:0.5.0
# logical name: /dev/sdd
# version: T010
# serial: 1sulWQ-pttz-zf0P-WTEe-cydl-lY6Q-CdX4Hv
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:19
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.6.0
# bus info: scsi@0:0.6.0
# logical name: /dev/sde
# version: T010
# serial: JF6q37-XaYh-qoXg-mPeZ-4Ofr-Qrkt-nh21RR
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:20
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.7.0
# bus info: scsi@0:0.7.0
# logical name: /dev/sdf
# version: T010
# serial: vvF48a-k1sq-7v1m-dpSh-yb50-KLLk-otk7lA
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:21
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.8.0
# bus info: scsi@0:0.8.0
# logical name: /dev/sdg
# version: T010
# serial: NHU0VX-vm31-DyRP-V4dc-gx7T-dXGI-Bb8qlw
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:22
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.9.0
# bus info: scsi@0:0.9.0
# logical name: /dev/sdh
# version: T010
# serial: jCIRNL-K08S-oYZc-Q5Eb-Y2ht-0NYt-0luz1T
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:23
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.a.0
# bus info: scsi@0:0.10.0
# logical name: /dev/sdi
# version: T010
# serial: wiQiLJ-Arua-8vcg-m6ta-KgSL-f1kD-rgzKxD
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:24
# description: ATA Disk
# product: HUS726T6TALE600
# physical id: 0.b.0
# bus info: scsi@0:0.11.0
# logical name: /dev/sdj
# version: T010
# serial: T7vZ96-uTGr-tvFz-jKoZ-479j-vRvh-WeCVRJ
# size: 5589GiB (6001GB)
# capacity: 5589GiB (6001GB)
# capabilities: 7200rpm lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:0
# description: ATA Disk
# product: MTFDDAK960TDC-1A
# physical id: 0.e.0
# bus info: scsi@15:0.14.0
# logical name: /dev/sdz
# version: M030
# serial: HE21uM-4KRw-heFX-IFVf-zO8Y-Rzah-ncwlwL
# size: 894GiB (960GB)
# capacity: 894GiB (960GB)
# capabilities: lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:1
# description: ATA Disk
# product: MTFDDAK960TDC-1A
# physical id: 0.f.0
# bus info: scsi@15:0.15.0
# logical name: /dev/sdaa
# version: M030
# serial: RGeqtd-dTEc-hV8g-Xd9o-I1Ke-sDH1-UK6mZg
# size: 894GiB (960GB)
# capacity: 894GiB (960GB)
# capabilities: lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:2
# description: ATA Disk
# product: MTFDDAK960TDC-1A
# physical id: 0.10.0
# bus info: scsi@15:0.16.0
# logical name: /dev/sdab
# version: M030
# serial: 1ROsNp-0J4j-DuWM-1nNl-Fo3K-gWfg-d7VDLq
# size: 894GiB (960GB)
# capacity: 894GiB (960GB)
# capabilities: lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:3
# description: ATA Disk
# product: MTFDDAK960TDC-1A
# physical id: 0.11.0
# bus info: scsi@15:0.17.0
# logical name: /dev/sdac
# version: M030
# serial: s0XeSI-Zl3B-0xcU-8wi3-BvVo-vU3k-cLZx22
# size: 894GiB (960GB)
# capacity: 894GiB (960GB)
# capabilities: lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:4
# description: ATA Disk
# product: MTFDDAK960TDC-1A
# physical id: 0.12.0
# bus info: scsi@15:0.18.0
# logical name: /dev/sdad
# version: M030
# serial: rZZ7yM-KImV-6Ld8-xmOJ-KyiC-Wstp-4t35S3
# size: 894GiB (960GB)
# capacity: 894GiB (960GB)
# capabilities: lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:5
# description: ATA Disk
# product: MTFDDAK960TDC-1A
# physical id: 0.13.0
# bus info: scsi@15:0.19.0
# logical name: /dev/sdae
# version: M030
# serial: LI50dd-vn2G-RiYE-5iuL-nxYI-TXCT-zs1lSY
# size: 894GiB (960GB)
# capacity: 894GiB (960GB)
# capabilities: lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:6
# description: ATA Disk
# product: MTFDDAK960TDC-1A
# physical id: 0.14.0
# bus info: scsi@15:0.20.0
# logical name: /dev/sdaf
# version: M030
# serial: 2hkDxG-90a2-mkEJ-GxmQ-doAv-SPT1-8qyo10
# size: 894GiB (960GB)
# capacity: 894GiB (960GB)
# capabilities: lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:7
# description: ATA Disk
# product: MTFDDAK960TDC-1A
# physical id: 0.15.0
# bus info: scsi@15:0.21.0
# logical name: /dev/sdag
# version: M030
# serial: bMQrTa-IKF7-vDFU-5RSR-cj4a-cOUL-QAY2yI
# size: 894GiB (960GB)
# capacity: 894GiB (960GB)
# capabilities: lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:8
# description: ATA Disk
# product: MTFDDAK960TDC-1A
# physical id: 0.16.0
# bus info: scsi@15:0.22.0
# logical name: /dev/sdah
# version: M030
# serial: q0VZpE-4sub-HKbe-RkRx-G0wM-HOeU-NDRXRe
# size: 894GiB (960GB)
# capacity: 894GiB (960GB)
# capabilities: lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:9
# description: ATA Disk
# product: MTFDDAK960TDC-1A
# physical id: 0.17.0
# bus info: scsi@15:0.23.0
# logical name: /dev/sdai
# version: M030
# serial: fEj7Rr-FSS8-ruwb-IjSj-xW6l-oj6v-q1pSNV
# size: 894GiB (960GB)
# capacity: 894GiB (960GB)
# capabilities: lvm2
# configuration: ansiversion=6 logicalsectorsize=512 sectorsize=4096
# *-disk:10
# description: SCSI Disk
# product: HW-SAS3408
# vendor: AVAGO
# physical id: 2.0.0
# bus info: scsi@15:2.0.0
# logical name: /dev/sdaj
# version: 5.06
# serial: 00a6b489499e4cb02500904af3624ac6
# size: 893GiB (958GB)
# capabilities: partitioned partitioned:dos
# configuration: ansiversion=5 logicalsectorsize=512 sectorsize=4096 signature=550d3974
yum -y install fio
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/vdo-ev-performance-testing
lvs -o+cache_policy,cache_settings,chunksize datavg/mix0weblv
# https://access.redhat.com/solutions/2961861
for i in /proc/[0-9]* ; do echo $i >> /tmp/mountinfo ; grep -q "/dev/mapper/datavg-mix0weblv" $i/mountinfo ; echo $? >> /tmp/mountinfo ; done
grep -B 1 '^0$' /tmp/mountinfo
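# findmnt/fuser answer the same "who is using this LV" question more directly (assuming the LV is mounted)
findmnt /dev/mapper/datavg-mix0weblv
fuser -vm /dev/mapper/datavg-mix0weblv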
lvcreate --type raid5 -L 120G --stripes 23 -n mixtestlv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
fio --rw=rw --rwmixread=80 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/mixtestlv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
lvremove -f datavg/mixtestlv
# Run status group 0 (all jobs):
# READ: bw=587MiB/s (615MB/s), 587MiB/s-587MiB/s (615MB/s-615MB/s), io=79.9GiB (85.8GB), run=139473-139473msec
# WRITE: bw=147MiB/s (155MB/s), 147MiB/s-147MiB/s (155MB/s-155MB/s), io=20.1GiB (21.6GB), run=139473-139473msec
lvcreate --type raid6 -L 120G --stripes 22 -n mixtestlv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
fio --rw=rw --rwmixread=80 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/mixtestlv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
lvremove -f datavg/mixtestlv
# Run status group 0 (all jobs):
# READ: bw=586MiB/s (614MB/s), 586MiB/s-586MiB/s (614MB/s-614MB/s), io=79.9GiB (85.8GB), run=139739-139739msec
# WRITE: bw=147MiB/s (154MB/s), 147MiB/s-147MiB/s (154MB/s-154MB/s), io=20.1GiB (21.6GB), run=139739-139739msec
lvcreate --type raid0 -L 120G --stripes 24 -n mixtestlv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
fio --rw=rw --rwmixread=80 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/mixtestlv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
lvremove -f datavg/mixtestlv
# Run status group 0 (all jobs):
# READ: bw=1139MiB/s (1194MB/s), 1139MiB/s-1139MiB/s (1194MB/s-1194MB/s), io=79.9GiB (85.8GB), run=71841-71841msec
# WRITE: bw=286MiB/s (300MB/s), 286MiB/s-286MiB/s (300MB/s-300MB/s), io=20.1GiB (21.6GB), run=71841-71841msec
lvcreate --type raid0 -L 100G --stripes 10 -n mixtestlv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
fio --rw=rw --rwmixread=80 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/mixtestlv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
lvremove -f datavg/mixtestlv
# Run status group 0 (all jobs):
# READ: bw=1358MiB/s (1424MB/s), 1358MiB/s-1358MiB/s (1424MB/s-1424MB/s), io=79.9GiB (85.8GB), run=60282-60282msec
# WRITE: bw=341MiB/s (358MB/s), 341MiB/s-341MiB/s (358MB/s-358MB/s), io=20.1GiB (21.6GB), run=60282-60282msec
lvcreate --type raid5 -L 100G --stripes 9 -n mixtestlv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
fio --rw=rw --rwmixread=80 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/mixtestlv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
lvremove -f datavg/mixtestlv
lvcreate --type raid6 -L 100G --stripes 9 -n mixtestlv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
fio --rw=rw --rwmixread=80 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/mixtestlv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
lvremove -f datavg/mixtestlv
lvcreate --type raid5 -L 120G --stripes 23 -n mixtestlv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid0 -L 40G --stripes 10 -n cachetest datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid0 -L 400M --stripes 10 -n cachetestmeta datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvconvert --type cache-pool --poolmetadata datavg/cachetestmeta datavg/cachetest
lvconvert --type cache --cachepool datavg/cachetest datavg/mixtestlv
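# random_distribution=zoned:60/10:30/20:8/30:2/40 below skews the IO pattern:
# 60% of accesses land in the first 10% of the LV, 30% in the next 20%, 8% in the next 30%,
# and 2% in the last 40% -- a rough hot/cold working set so the cache has something to hit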
fio --rw=rw --rwmixread=80 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/mixtestlv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g -random_distribution=zoned:60/10:30/20:8/30:2/40
lvremove -f datavg/mixtestlv
# Run status group 0 (all jobs):
# READ: bw=716MiB/s (750MB/s), 716MiB/s-716MiB/s (750MB/s-750MB/s), io=31.0GiB (34.3GB), run=45744-45744msec
# WRITE: bw=180MiB/s (189MB/s), 180MiB/s-180MiB/s (189MB/s-189MB/s), io=8228MiB (8628MB), run=45744-45744msec
lvcreate --type raid5 -L 120G --stripes 23 -n mixtestlv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid5 -L 40G --stripes 9 -n cachetest datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid5 -L 400M --stripes 9 -n cachetestmeta datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvconvert --type cache-pool --poolmetadata datavg/cachetestmeta datavg/cachetest
lvconvert --type cache --cachepool datavg/cachetest datavg/mixtestlv
fio --rw=rw --rwmixread=80 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/mixtestlv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g -random_distribution=zoned:60/10:30/20:8/30:2/40
lvremove -f datavg/mixtestlv
# Run status group 0 (all jobs):
# READ: bw=487MiB/s (511MB/s), 487MiB/s-487MiB/s (511MB/s-511MB/s), io=79.9GiB (85.8GB), run=167880-167880msec
# WRITE: bw=122MiB/s (128MB/s), 122MiB/s-122MiB/s (128MB/s-128MB/s), io=20.1GiB (21.6GB), run=167880-167880msec
lvcreate -L 100G -n singledisklv datavg /dev/sda
fio --rw=rw --rwmixread=80 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/singledisklv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g -random_distribution=zoned:60/10:30/20:8/30:2/40
lvremove -f datavg/singledisklv
# Run status group 0 (all jobs):
# READ: bw=151MiB/s (158MB/s), 151MiB/s-151MiB/s (158MB/s-158MB/s), io=44.2GiB (47.5GB), run=300031-300031msec
# WRITE: bw=37.0MiB/s (39.8MB/s), 37.0MiB/s-37.0MiB/s (39.8MB/s-39.8MB/s), io=11.1GiB (11.9GB), run=300031-300031msec
lvcreate -L 20G -n singledisklv datavg /dev/sdai
fio --rw=rw --rwmixread=80 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/singledisklv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=20g -random_distribution=zoned:60/10:30/20:8/30:2/40
lvremove -f datavg/singledisklv
# Run status group 0 (all jobs):
# READ: bw=431MiB/s (452MB/s), 431MiB/s-431MiB/s (452MB/s-452MB/s), io=16.0GiB (17.2GB), run=38005-38005msec
# WRITE: bw=108MiB/s (113MB/s), 108MiB/s-108MiB/s (113MB/s-113MB/s), io=4088MiB (4287MB), run=38005-38005msec
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--directory=./ --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--directory=./ --ioengine=sync --size=100g
blktrace /dev/datavg/mixlv
# http benchmark tools
yum install httpd-tools
# https://github.com/philipgloyne/apachebench-for-multi-url
# https://hub.docker.com/r/chrisipa/ab-multi-url
# https://www.simonholywell.com/post/2015/06/parallel-benchmark-many-urls-with-apachebench/
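# a minimal single-URL baseline with ab from httpd-tools (host and path below are placeholders)
ab -n 1000 -c 50 http://39.134.201.65/testfile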
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/ssd0lv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
fio --rw=rw --rwmixread=99 --bsrange=128k-256k --name=vdo \
--filename=/dev/datavg/ssd0lv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
worker-1 nic bond
ip link show
# 2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether cc:64:a6:59:bd:24 brd ff:ff:ff:ff:ff:ff
# 3: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether cc:64:a6:59:bd:25 brd ff:ff:ff:ff:ff:ff
# 4: ens2f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether 08:4f:0a:b5:a2:be brd ff:ff:ff:ff:ff:ff
# 5: eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether cc:64:a6:59:bd:26 brd ff:ff:ff:ff:ff:ff
# 6: eno4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether cc:64:a6:59:bd:27 brd ff:ff:ff:ff:ff:ff
# 7: ens2f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether 08:4f:0a:b5:a2:bf brd ff:ff:ff:ff:ff:ff
ip a s eno1
# 2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
# link/ether cc:64:a6:59:bd:24 brd ff:ff:ff:ff:ff:ff
# inet 39.134.201.65/27 brd 39.134.201.95 scope global noprefixroute eno1
# valid_lft forever preferred_lft forever
# inet6 fe80::149f:d0ce:2700:4bf2/64 scope link noprefixroute
# valid_lft forever preferred_lft forever
ethtool eno1 # 10000baseT/Full
ethtool eno2 # 10000baseT/Full
ethtool eno3 # 1000baseT/Full
ethtool eno4 # 1000baseT/Full
ethtool ens2f0 # 10000baseT/Full
ethtool ens2f1 # 10000baseT/Full
nmcli con add type bond \
con-name bond0 \
ifname bond0 \
mode 802.3ad
nmcli con mod id bond0 bond.options \
mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer2+3
nmcli con add type bond-slave ifname eno2 con-name eno2 master bond0
nmcli con add type bond-slave ifname ens2f0 con-name ens2f0 master bond0
nmcli con add type bond-slave ifname ens2f1 con-name ens2f1 master bond0
nmcli con down eno2
nmcli con up eno2
nmcli con down ens2f0
nmcli con up ens2f0
nmcli con down ens2f1
nmcli con up ens2f1
nmcli con down bond0
nmcli con up bond0
#######################################
# nic bond
cat > /root/nic.bond.sh << 'EOF'
#!/bin/bash
set -x
# delete all connection
nmcli -g uuid con | while read i ; do nmcli c delete ${i} ; done
nmcli con add type bond \
con-name bond0 \
ifname bond0 \
mode 802.3ad \
ipv4.method 'manual' \
ipv4.address '39.134.201.65/27' \
ipv4.gateway '39.134.201.94' \
ipv4.dns '117.177.241.16'
nmcli con mod id bond0 bond.options \
mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer2+3
nmcli con add type bond-slave ifname eno1 con-name eno1 master bond0
nmcli con add type bond-slave ifname eno2 con-name eno2 master bond0
nmcli con add type bond-slave ifname ens2f0 con-name ens2f0 master bond0
nmcli con add type bond-slave ifname ens2f1 con-name ens2f1 master bond0
systemctl restart network
EOF
cat > /root/nic.restore.sh << 'EOF'
#!/bin/bash
set -x
# delete all connection
nmcli -g uuid con | while read i ; do nmcli c delete ${i} ; done
# re-create primary connection
nmcli con add type ethernet \
con-name eno1 \
ifname eno1 \
ipv4.method 'manual' \
ipv4.address '39.134.201.65/27' \
ipv4.gateway '39.134.201.94' \
ipv4.dns '117.177.241.16'
systemctl restart network
exit 0
EOF
chmod +x /root/nic.restore.sh
cat > ~/cron-network-con-recreate << EOF
*/20 * * * * /bin/bash /root/nic.restore.sh
EOF
crontab ~/cron-network-con-recreate
bash /root/nic.bond.sh
# debug
cat /proc/net/bonding/bond0
cat /sys/class/net/bond*/bonding/xmit_hash_policy
# https://access.redhat.com/solutions/666853
ip -s -h link show master bond0
worker-2 host
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
cat << EOF > /etc/yum.repos.d/remote.repo
[remote]
name=RHEL FTP
baseurl=ftp://117.177.241.16/data
enabled=1
gpgcheck=0
EOF
yum clean all
yum --disableplugin=subscription-manager repolist
yum install -y byobu htop sysstat # iostat is provided by the sysstat package
yum -y update
hostnamectl set-hostname worker-2.ocpsc.redhat.ren
nmcli connection modify eno1 ipv4.dns 117.177.241.16
nmcli connection reload
nmcli connection up eno1
yum -y install fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
EOF
systemctl enable fail2ban
systemctl restart fail2ban
cat << EOF > /etc/fail2ban/jail.d/wzh.conf
[sshd]
enabled = true
[recidive]
enabled = true
EOF
systemctl restart fail2ban
fail2ban-client status sshd
fail2ban-client status recidive
systemctl status fail2ban
tail -F /var/log/fail2ban.log
cp /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
diff /etc/ssh/sshd_config /etc/ssh/sshd_config.BAK
systemctl restart sshd
passwd
useradd -m wzh
lsblk | grep 5.5 | awk '{print $1}' | xargs -I DEMO echo -n "/dev/DEMO "
# /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
lsblk | grep 5.5 | awk '{print $1}' | wc -l
# 24
yum install -y lvm2
pvcreate -y /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
vgcreate datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
vgs
lvcreate --type raid0 -l 100%FREE --stripes 24 -n datalv datavg
mkfs.xfs /dev/datavg/datalv
lvdisplay /dev/datavg/datalv -m
mkdir -p /data
cp /etc/fstab /etc/fstab.bak
cat << EOF >> /etc/fstab
/dev/datavg/datalv /data xfs defaults 0 0
EOF
mount -a
yum install -y sysstat
lsblk | grep disk | awk '{print $1}' | xargs -I DEMO echo -n "DEMO "
# sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm
iostat -m -x sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk 5
iostat -m -x dm-10 5
########################################
# ntp
yum install -y chrony
systemctl enable chronyd
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
systemctl disable --now firewalld.service
# setup time server
/bin/cp -f /etc/chrony.conf /etc/chrony.conf.bak
cat << EOF > /etc/chrony.conf
server 117.177.241.16 iburst
server 0.rhel.pool.ntp.org iburst
server 1.rhel.pool.ntp.org iburst
server 2.rhel.pool.ntp.org iburst
server 3.rhel.pool.ntp.org iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
chronyc sources -v
# update ntp
cat << EOF > /etc/chrony.conf
server 223.87.20.100 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
worker-2 disk
#########################################
# ssd cache + hdd
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html-single/logical_volume_manager_administration/index#lvm_cache_volume_creation
umount /data
lsblk -d -o name,rota
lvremove /dev/datavg/datalv
# lsblk | grep 894 | awk '{print $1}'
pvcreate /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/logical_volume_manager_administration/vg_grow
vgextend datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
## raid5
lvcreate --type raid5 -L 1G --stripes 23 -n hddlv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid5 -L 1G --stripes 23 -n mixlv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid5 -L 1G --stripes 9 -n ssdlv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid5 -L 3T --stripes 23 -n mix0lv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid0 -L 1.3536T --stripes 10 -n cachemix0 datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid0 -L 13G --stripes 10 -n cachemix0meta datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvconvert --type cache-pool --poolmetadata datavg/cachemix0meta datavg/cachemix0
lvconvert --type cache --cachepool datavg/cachemix0 datavg/mix0lv
# lvcreate --type raid5 --stripes 9 -L 1T -I 16M -R 4096K -n hddlv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
## raid0 + stripe
lvcreate --type raid0 -L 1T --stripes 24 -n hdd0lv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/hdd0lv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=1 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
# Run status group 0 (all jobs):
# READ: bw=2453MiB/s (2572MB/s), 2453MiB/s-2453MiB/s (2572MB/s-2572MB/s), io=98.0GiB (106GB), run=41331-41331msec
# WRITE: bw=24.9MiB/s (26.1MB/s), 24.9MiB/s-24.9MiB/s (26.1MB/s-26.1MB/s), io=1029MiB (1079MB), run=41331-41331msec
lvs -o+stripesize,chunksize datavg/hdd0lv
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Stripe Chunk
# hdd0lv datavg rwi-aor--- 1.00t 64.00k 0
lvremove -f datavg/hdd0lv
lvcreate --type raid0 -L 1T -I 128 --stripes 24 -n hdd1lv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/hdd1lv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=1 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
# Run status group 0 (all jobs):
# READ: bw=2674MiB/s (2804MB/s), 2674MiB/s-2674MiB/s (2804MB/s-2804MB/s), io=98.0GiB (106GB), run=37912-37912msec
# WRITE: bw=27.1MiB/s (28.4MB/s), 27.1MiB/s-27.1MiB/s (28.4MB/s-28.4MB/s), io=1029MiB (1079MB), run=37912-37912msec
lvs -o+stripesize,chunksize datavg/hdd1lv
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Stripe Chunk
# hdd1lv datavg rwi-a-r--- 1.00t 128.00k 0
lvremove -f datavg/hdd1lv
lvcreate --type raid0 -L 1T -I 256 --stripes 24 -n hdd1lv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/hdd1lv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=1 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
# Run status group 0 (all jobs):
# READ: bw=2674MiB/s (2804MB/s), 2674MiB/s-2674MiB/s (2804MB/s-2804MB/s), io=98.0GiB (106GB), run=37912-37912msec
# WRITE: bw=27.1MiB/s (28.4MB/s), 27.1MiB/s-27.1MiB/s (28.4MB/s-28.4MB/s), io=1029MiB (1079MB), run=37912-37912msec
lvs -o+stripesize,chunksize datavg/hdd1lv
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Stripe Chunk
# hdd1lv datavg rwi-a-r--- 1.00t 256.00k 0k 0
lvremove -f datavg/hdd1lv
lvcreate --type raid0 -L 300G --stripes 10 -n ssd0lv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/ssd0lv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=1 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
# Run status group 0 (all jobs):
# READ: bw=2602MiB/s (2728MB/s), 2602MiB/s-2602MiB/s (2728MB/s-2728MB/s), io=98.0GiB (106GB), run=38965-38965msec
# WRITE: bw=26.4MiB/s (27.7MB/s), 26.4MiB/s-26.4MiB/s (27.7MB/s-27.7MB/s), io=1029MiB (1079MB), run=38965-38965msec
lvs -o+stripesize,chunksize datavg/ssd0lv
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Stripe Chunk
# ssd0lv datavg rwi-a-r--- 300.00g 64.00k 0
lvremove -f datavg/ssd0lv
lvcreate --type raid0 -L 300G -I 128 --stripes 10 -n ssd0lv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/ssd0lv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=1 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
# Run status group 0 (all jobs):
# READ: bw=2438MiB/s (2556MB/s), 2438MiB/s-2438MiB/s (2556MB/s-2556MB/s), io=98.0GiB (106GB), run=41584-41584msec
# WRITE: bw=24.7MiB/s (25.9MB/s), 24.7MiB/s-24.7MiB/s (25.9MB/s-25.9MB/s), io=1029MiB (1079MB), run=41584-41584msec
lvs -o+stripesize,chunksize datavg/ssd0lv
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Stripe Chunk
# ssd0lv datavg rwi-a-r--- 300.00g 128.00k 0
lvremove -f datavg/ssd0lv
lvcreate --type raid0 -L 300G -I 256 --stripes 10 -n ssd0lv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/ssd0lv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=1 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
# Run status group 0 (all jobs):
# READ: bw=1908MiB/s (2000MB/s), 1908MiB/s-1908MiB/s (2000MB/s-2000MB/s), io=98.0GiB (106GB), run=53135-53135msec
# WRITE: bw=19.4MiB/s (20.3MB/s), 19.4MiB/s-19.4MiB/s (20.3MB/s-20.3MB/s), io=1029MiB (1079MB), run=53135-53135msec
lvs -o+stripesize,chunksize datavg/ssd0lv
# LV     VG     Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Stripe Chunk
# ssd0lv datavg rwi-a-r--- 300.00g 256.00k 0 0
lvremove -f datavg/ssd0lv
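# the three ssd0lv runs above differ only in -I (stripe size); a small loop
# (sketch, reusing the exact lvcreate/fio commands from above) avoids the copy/paste:
for var_stripe in 64 128 256; do
    lvcreate --type raid0 -L 300G -I $var_stripe --stripes 10 -n ssd0lv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
    fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
      --filename=/dev/datavg/ssd0lv --ioengine=libaio --numjobs=1 --thread \
      --norandommap --runtime=300 --direct=1 --iodepth=8 \
      --scramble_buffers=1 --offset=0 --size=100g
    lvs -o+stripesize,chunksize datavg/ssd0lv
    lvremove -f datavg/ssd0lv
done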
lvcreate --type raid5 -L 120G --stripes 23 -n hdd5lv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/hdd5lv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=1 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
# Run status group 0 (all jobs):
# READ: bw=474MiB/s (497MB/s), 474MiB/s-474MiB/s (497MB/s-497MB/s), io=98.0GiB (106GB), run=214073-214073msec
# WRITE: bw=4920KiB/s (5038kB/s), 4920KiB/s-4920KiB/s (5038kB/s-5038kB/s), io=1029MiB (1079MB), run=214073-214073msec
lvs -o+stripesize,chunksize datavg/hdd5lv
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Stripe Chunk
# hdd5lv datavg rwi-a-r--- 120.03g 100.00 64.00k 0
lvremove -f datavg/hdd5lv
lvcreate --type raid5 -L 120G -I 128 --stripes 23 -n hdd5lv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/hdd5lv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=1 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
# Run status group 0 (all jobs):
# READ: bw=449MiB/s (471MB/s), 449MiB/s-449MiB/s (471MB/s-471MB/s), io=98.0GiB (106GB), run=225892-225892msec
# WRITE: bw=4663KiB/s (4775kB/s), 4663KiB/s-4663KiB/s (4775kB/s-4775kB/s), io=1029MiB (1079MB), run=225892-225892msec
lvs -o+stripesize,chunksize datavg/hdd5lv
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Stripe Chunk
# hdd5lv datavg rwi-a-r--- 120.03g 100.00 128.00k 0
lvremove -f datavg/hdd5lv
lvcreate --type raid5 -L 120G --stripes 23 -n mixtestlv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid0 -L 40G --stripes 10 -n cachetest datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid0 -L 1G --stripes 10 -n cache1testmeta datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvconvert --type cache-pool --poolmetadata datavg/cache1testmeta datavg/cachetest
lvconvert --type cache --cachepool datavg/cachetest datavg/mixtestlv
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/mixtestlv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=1 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
# Run status group 0 (all jobs):
# READ: bw=449MiB/s (471MB/s), 449MiB/s-449MiB/s (471MB/s-471MB/s), io=98.0GiB (106GB), run=225892-225892msec
# WRITE: bw=4663KiB/s (4775kB/s), 4663KiB/s-4663KiB/s (4775kB/s-4775kB/s), io=1029MiB (1079MB), run=225892-225892msec
lvs -o+stripesize,chunksize datavg/mixtestlv
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Stripe Chunk
# hdd5lv datavg rwi-a-r--- 120.03g 100.00 128.00k 0
lvremove -f datavg/mixtestlv
lvcreate --type raid0 -L 1T --stripes 24 -n hdd1lv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
fio --rw=randrw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/hdd1lv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=1 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
# Run status group 0 (all jobs):
# READ: bw=2453MiB/s (2572MB/s), 2453MiB/s-2453MiB/s (2572MB/s-2572MB/s), io=98.0GiB (106GB), run=41331-41331msec
# WRITE: bw=24.9MiB/s (26.1MB/s), 24.9MiB/s-24.9MiB/s (26.1MB/s-26.1MB/s), io=1029MiB (1079MB), run=41331-41331msec
lvs -o+stripesize,chunksize datavg/hdd1lv
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Stripe Chunk
# hdd0lv datavg rwi-aor--- 1.00t 64.00k 0
lvremove -f datavg/hdd1lv
lvcreate --type raid0 -L 300G --stripes 10 -n ssd0lv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
fio --rw=randrw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--filename=/dev/datavg/ssd0lv --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=1 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
# Run status group 0 (all jobs):
# READ: bw=1527MiB/s (1601MB/s), 1527MiB/s-1527MiB/s (1601MB/s-1601MB/s), io=98.0GiB (106GB), run=66375-66375msec
# WRITE: bw=15.5MiB/s (16.2MB/s), 15.5MiB/s-15.5MiB/s (16.2MB/s-16.2MB/s), io=1029MiB (1079MB), run=66375-66375msec
lvs -o+stripesize,chunksize datavg/ssd0lv
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Stripe Chunk
# ssd0lv datavg rwi-a-r--- 300.00g 64.00k 0
lvremove -f datavg/ssd0lv
lvcreate --type raid0 -L 1G --stripes 24 -n hddlv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid0 -L 130T --stripes 24 -n mixlv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
# lvcreate --type raid0 -L 300G --stripes 10 -n ssdlv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid0 -L 8.6T --stripes 10 -n cache1 datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid0 -L 40G --stripes 10 -n cache1meta datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvconvert --type cache-pool --poolmetadata datavg/cache1meta datavg/cache1
# lvs -a -o name,size,attr,devices datavg
lvconvert --type cache --cachepool datavg/cache1 datavg/mixlv
lvconvert --splitcache datavg/mixlv
# lvs -a -o name,size,attr,devices datavg
# lvs -o+cache_mode datavg
mkfs.xfs /dev/datavg/hddlv
mkfs.xfs /dev/datavg/ssdlv
mkfs.xfs /dev/datavg/mixlv
mkfs.xfs /dev/datavg/mix0lv
mkdir -p /data/
mkdir -p /data_ssd/
mkdir -p /data_mix/
mkdir -p /data_mix0
cat /etc/fstab
cat << EOF >> /etc/fstab
/dev/datavg/hddlv /data xfs defaults 0 0
/dev/datavg/ssdlv /data_ssd xfs defaults 0 0
/dev/datavg/mixlv /data_mix xfs defaults 0 0
/dev/datavg/mix0lv /data_mix0 xfs defaults 0 0
EOF
mount -a
df -h | grep \/data
dd if=/dev/zero of=/data/testfile bs=4k count=9999 oflag=dsync
dd if=/dev/zero of=/data_ssd/testfile bs=4k count=9999 oflag=dsync
dd if=/dev/zero of=/data_mix/testfile bs=4k count=9999 oflag=dsync
dd if=/dev/zero of=/data/testfile bs=4M count=9999 oflag=dsync
dd if=/dev/zero of=/data_ssd/testfile bs=4M count=9999 oflag=dsync
dd if=/dev/zero of=/data_mix/testfile bs=4M count=9999 oflag=dsync
dd if=/dev/zero of=/data/testfile.large bs=4M count=9999 oflag=direct
dd if=/dev/zero of=/data_ssd/testfile.large bs=4M count=9999 oflag=direct
dd if=/dev/zero of=/data_mix/testfile.large bs=4M count=9999 oflag=direct
dd if=/dev/zero of=/data/testfile.large bs=4M count=9999
dd if=/dev/zero of=/data_ssd/testfile.large bs=4M count=9999
dd if=/dev/zero of=/data_mix/testfile.large bs=4M count=9999
dd if=/data/testfile.large of=/dev/null bs=4k count=9999 oflag=dsync
dd if=/data_ssd/testfile.large of=/dev/null bs=4k count=9999 oflag=dsync
dd if=/data_mix/testfile.large of=/dev/null bs=4k count=999999 oflag=dsync
dd if=/data/testfile.large of=/dev/null bs=4M count=9999 oflag=dsync
dd if=/data_ssd/testfile.large of=/dev/null bs=4M count=9999 oflag=dsync
dd if=/data_mix/testfile.large of=/dev/null bs=4M count=9999 oflag=dsync
dd if=/data/testfile.large of=/dev/null bs=4M count=9999
dd if=/data_ssd/testfile.large of=/dev/null bs=4M count=9999
dd if=/data_mix/testfile.large of=/dev/null bs=4M count=9999
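# the dd matrix above can be collapsed into one loop (sketch, same mount points
# and flags as the individual commands above):
for d in /data /data_ssd /data_mix; do
    for bs in 4k 4M; do
        dd if=/dev/zero of=$d/testfile bs=$bs count=9999 oflag=dsync
        dd if=$d/testfile of=/dev/null bs=$bs count=9999
    done
done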
# cleanup
umount /data/
umount /data_ssd/
umount /data_mix/
umount /data_mix0/
lvremove -f /dev/datavg/hddlv
lvremove -f /dev/datavg/ssdlv
lvremove -f /dev/datavg/mixlv
lvremove -f /dev/datavg/mix0lv
# ssd tuning
# https://serverfault.com/questions/80134/linux-md-vs-lvm-performance
hdparm -tT /dev/md0
# https://www.ibm.com/developerworks/cn/linux/l-lo-io-scheduler-optimize-performance/index.html
cat /sys/block/*/queue/scheduler
lsblk | grep 894 | awk '{print $1}' | xargs -I DEMO cat /sys/block/DEMO/queue/scheduler
lsblk | grep 894 | awk '{print "echo deadline > /sys/block/"$1"/queue/scheduler"}'
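# the two lines above only inspect/print; to actually switch the SSDs to the
# deadline scheduler (sketch, assuming "894" still matches the SSD size column
# in lsblk output, as in the grep above):
lsblk | grep 894 | awk '{print $1}' | while read d; do
    echo deadline > /sys/block/$d/queue/scheduler
done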
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--directory=./ --ioengine=libaio --numjobs=1 --thread \
--norandommap --runtime=300 --direct=0 --iodepth=8 \
--scramble_buffers=1 --offset=0 --size=100g
fio --rw=rw --rwmixread=99 --bsrange=4k-256k --name=vdo \
--directory=./ --ioengine=sync --size=100g
blktrace /dev/datavg/mix0lv /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
blkparse -o /dev/null -i dm-244 -d dm-244.bin
btt -i dm-244.bin | less
blkparse -o /dev/null -i sdaa -d sdaa.bin
btt -i sdaa.bin | less
blkparse -o /dev/null -i sda -d sda.bin
btt -i sda.bin | less
blktrace /dev/datavg/ssd0lv /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvmconfig --typeconfig default --withcomments --withspaces
lvmconfig --type default --withcomments allocation/cache_policy
lvmconfig --type default --withcomments allocation/cache_settings
lvmconfig --type list --withcomments allocation/cache_settings
iostat -x -m 3 /dev/mapper/datavg-mixlv sdh sdab
dstat -D /dev/mapper/datavg-mixlv,/dev/mapper/datavg-mixlv_corig,sdh,sdab -N bond0
dstat -D /dev/mapper/datavg-mixlv,/dev/mapper/datavg-mixlv_corig,sdh,sdab --disk-util
bmon -p eno1,eno2,ens2f0,ens2f1,bond0
lvs -o+lv_all datavg/mixlv_corig
lvs -o+Layout datavg/mixlv_corig
lvs -o+CacheReadHits,CacheReadMisses
lvs -o+Layout
blockdev --report
# RO RA SSZ BSZ StartSec Size Device
# rw 8192 512 4096 0 478998953984 /dev/sdy
# rw 8192 512 512 2048 1073741824 /dev/sdy1
# rw 8192 512 4096 2099200 1073741824 /dev/sdy2
# rw 8192 512 4096 4196352 476849373184 /dev/sdy3
# rw 8192 512 4096 0 958999298048 /dev/sdaj
# rw 8192 512 4096 2048 958998249472 /dev/sdaj1
# rw 8192 512 4096 0 6001175126016 /dev/sda
# rw 8192 512 4096 0 6001175126016 /dev/sdd
# rw 8192 512 4096 0 6001175126016 /dev/sde
# rw 8192 512 4096 0 6001175126016 /dev/sdc
# rw 8192 512 4096 0 6001175126016 /dev/sdf
# rw 8192 512 4096 0 6001175126016 /dev/sdb
# rw 8192 512 4096 0 6001175126016 /dev/sdg
# rw 8192 512 4096 0 6001175126016 /dev/sdh
# rw 8192 512 4096 0 6001175126016 /dev/sdk
# rw 8192 512 4096 0 6001175126016 /dev/sdi
# rw 8192 512 4096 0 6001175126016 /dev/sdm
# rw 8192 512 4096 0 6001175126016 /dev/sdj
# rw 8192 512 4096 0 6001175126016 /dev/sdl
# rw 8192 512 4096 0 6001175126016 /dev/sdn
# rw 8192 512 4096 0 6001175126016 /dev/sdo
# rw 8192 512 4096 0 6001175126016 /dev/sdp
# rw 8192 512 4096 0 6001175126016 /dev/sdx
# rw 8192 512 4096 0 6001175126016 /dev/sdq
# rw 8192 512 4096 0 6001175126016 /dev/sdr
# rw 8192 512 4096 0 6001175126016 /dev/sdu
# rw 8192 512 4096 0 6001175126016 /dev/sdw
# rw 8192 512 4096 0 6001175126016 /dev/sds
# rw 8192 512 4096 0 6001175126016 /dev/sdt
# rw 8192 512 4096 0 6001175126016 /dev/sdv
# rw 8192 512 4096 0 960197124096 /dev/sdz
# rw 8192 512 4096 0 960197124096 /dev/sdaa
# rw 8192 512 4096 0 960197124096 /dev/sdac
# rw 8192 512 4096 0 960197124096 /dev/sdab
# rw 8192 512 4096 0 960197124096 /dev/sdad
# rw 8192 512 4096 0 960197124096 /dev/sdae
# rw 8192 512 4096 0 960197124096 /dev/sdag
# rw 8192 512 4096 0 960197124096 /dev/sdaf
# rw 8192 512 4096 0 960197124096 /dev/sdai
# rw 8192 512 4096 0 960197124096 /dev/sdah
# rw 8192 512 4096 0 5955689381888 /dev/dm-0
# rw 8192 512 4096 0 5955689381888 /dev/dm-1
# rw 8192 512 4096 0 5955689381888 /dev/dm-2
# rw 8192 512 4096 0 5955689381888 /dev/dm-3
# rw 8192 512 4096 0 5955689381888 /dev/dm-4
# rw 8192 512 4096 0 5955689381888 /dev/dm-5
# rw 8192 512 4096 0 5955689381888 /dev/dm-6
# rw 8192 512 4096 0 5955689381888 /dev/dm-7
# rw 8192 512 4096 0 5955689381888 /dev/dm-8
# rw 8192 512 4096 0 5955689381888 /dev/dm-9
# rw 8192 512 4096 0 5955689381888 /dev/dm-10
# rw 8192 512 4096 0 5955689381888 /dev/dm-11
# rw 8192 512 4096 0 5955689381888 /dev/dm-12
# rw 8192 512 4096 0 5955689381888 /dev/dm-13
# rw 8192 512 4096 0 5955689381888 /dev/dm-14
# rw 8192 512 4096 0 5955689381888 /dev/dm-15
# rw 8192 512 4096 0 5955689381888 /dev/dm-16
# rw 8192 512 4096 0 5955689381888 /dev/dm-17
# rw 8192 512 4096 0 5955689381888 /dev/dm-18
# rw 8192 512 4096 0 5955689381888 /dev/dm-19
# rw 8192 512 4096 0 5955689381888 /dev/dm-20
# rw 8192 512 4096 0 5955689381888 /dev/dm-21
# rw 8192 512 4096 0 5955689381888 /dev/dm-22
# rw 8192 512 4096 0 5955689381888 /dev/dm-23
# rw 8192 512 4096 0 142936545165312 /dev/dm-24
# rw 8192 512 4096 0 945580670976 /dev/dm-25
# rw 8192 512 4096 0 945580670976 /dev/dm-26
# rw 8192 512 4096 0 945580670976 /dev/dm-27
# rw 8192 512 4096 0 945580670976 /dev/dm-28
# rw 8192 512 4096 0 945580670976 /dev/dm-29
# rw 8192 512 4096 0 945580670976 /dev/dm-30
# rw 8192 512 4096 0 945580670976 /dev/dm-31
# rw 8192 512 4096 0 945580670976 /dev/dm-32
# rw 8192 512 4096 0 945580670976 /dev/dm-33
# rw 8192 512 4096 0 945580670976 /dev/dm-34
# rw 8192 512 4096 0 9455806709760 /dev/dm-35
# rw 8192 512 4096 0 4294967296 /dev/dm-36
# rw 8192 512 4096 0 4294967296 /dev/dm-37
# rw 8192 512 4096 0 4294967296 /dev/dm-38
# rw 8192 512 4096 0 4294967296 /dev/dm-39
# rw 8192 512 4096 0 4294967296 /dev/dm-40
# rw 8192 512 4096 0 4294967296 /dev/dm-41
# rw 8192 512 4096 0 4294967296 /dev/dm-42
# rw 8192 512 4096 0 4294967296 /dev/dm-43
# rw 8192 512 4096 0 4294967296 /dev/dm-44
# rw 8192 512 4096 0 4294967296 /dev/dm-45
# rw 8192 512 4096 0 42949672960 /dev/dm-46
# rw 8192 512 4096 0 142936545165312 /dev/dm-47
# rw 8192 512 4096 0 46137344 /dev/dm-48
# rw 8192 512 4096 0 46137344 /dev/dm-49
# rw 8192 512 4096 0 46137344 /dev/dm-50
# rw 8192 512 4096 0 46137344 /dev/dm-51
# rw 8192 512 4096 0 46137344 /dev/dm-52
# rw 8192 512 4096 0 46137344 /dev/dm-53
# rw 8192 512 4096 0 46137344 /dev/dm-54
# rw 8192 512 4096 0 46137344 /dev/dm-55
# rw 8192 512 4096 0 46137344 /dev/dm-56
# rw 8192 512 4096 0 46137344 /dev/dm-57
# rw 8192 512 4096 0 46137344 /dev/dm-58
# rw 8192 512 4096 0 46137344 /dev/dm-59
# rw 8192 512 4096 0 46137344 /dev/dm-60
# rw 8192 512 4096 0 46137344 /dev/dm-61
# rw 8192 512 4096 0 46137344 /dev/dm-62
# rw 8192 512 4096 0 46137344 /dev/dm-63
# rw 8192 512 4096 0 46137344 /dev/dm-64
# rw 8192 512 4096 0 46137344 /dev/dm-65
# rw 8192 512 4096 0 46137344 /dev/dm-66
# rw 8192 512 4096 0 46137344 /dev/dm-67
# rw 8192 512 4096 0 46137344 /dev/dm-68
# rw 8192 512 4096 0 46137344 /dev/dm-69
# rw 8192 512 4096 0 46137344 /dev/dm-70
# rw 8192 512 4096 0 46137344 /dev/dm-71
# rw 8192 512 4096 0 1107296256 /dev/dm-72
# https://access.redhat.com/solutions/3588841
/sbin/blockdev --setra 4096 /dev/mapper/datavg-mixlv
/sbin/blockdev --setra 8192 /dev/mapper/datavg-mixlv
/sbin/blockdev --setra 16384 /dev/mapper/datavg-mixlv
/sbin/blockdev --setra 32768 /dev/mapper/datavg-mixlv
/sbin/blockdev --setra 65536 /dev/mapper/datavg-mixlv
/sbin/blockdev --setra 131072 /dev/mapper/datavg-mixlv
/sbin/blockdev --setra 262144 /dev/mapper/datavg-mixlv
# final config
/sbin/blockdev --setra 16384 /dev/mapper/datavg-mixlv
for f in /dev/mapper/datavg-mixlv_corig_rimage_*; do /sbin/blockdev --setra 16384 $f ; done
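# blockdev --setra does not survive a reboot; one way to persist the final value
# (a sketch, assuming rc-local is enabled on this host as it is on the other
# workers in these notes):
cat << 'EOF' >> /etc/rc.d/rc.local
/sbin/blockdev --setra 16384 /dev/mapper/datavg-mixlv
for f in /dev/mapper/datavg-mixlv_corig_rimage_*; do /sbin/blockdev --setra 16384 $f ; done
EOF
chmod +x /etc/rc.d/rc.local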
# worker2
# 5.5
find /data_mix/mnt/ -type f > list
dstat --output /root/dstat.csv -D /dev/mapper/datavg-mixlv,/dev/mapper/datavg-mixlv_corig,sdh,sdab -N bond0
var_basedir="/data_mix/mnt"
find $var_basedir -type f -size -511M > list.512m
find $var_basedir -type f -size -2049M -size +511M > list.2g
find $var_basedir -type f -size +2049M > list.+2g
cat list | shuf > list.shuf.all
cat list.512m | shuf > list.shuf.512m
cat list.2g | shuf > list.shuf.2g
cat list.+2g | shuf > list.shuf.+2g
cat list.2g list.+2g | shuf > list.shuf.+512m
rm -f split.list.*
# zte use 1800
var_total=10
# split -n l/$var_total list.shuf.all split.list.all.
split -n l/$var_total list.shuf.512m split.list.512m.
split -n l/$var_total list.shuf.2g split.list.2g.
split -n l/$var_total list.shuf.+2g split.list.+2g.
split -n l/$var_total list.shuf.+512m split.list.+512m.
for f in split.list.512m.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
# for f in split.list.+512m.*; do
# cat $f | xargs -I DEMO cat DEMO > /dev/null &
# done
for f in split.list.2g.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
for f in split.list.+2g.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
ps -ef | grep /data_mix/mnt | grep cat | awk '{print $2}' | xargs -I DEMO kill DEMO
tmux kill-window -t 3
# rm -f split.*
# 2.8
var_num=`echo "scale=0;$(cat list | wc -l )/5" | bc -l`
head -n $var_num list > list.20
tail -n +$var_num list > list.80
var_total=1500
# split -n l/$(echo "scale=0;$var_total/5*4"|bc -l) list.20 split.list.20.
# while true; do
# for f in split.list.20.*; do
# cat $f | xargs -I DEMO cat DEMO > /dev/null &
# done
# echo "wait to finish"
# wait
# done
var_runtimes=$(echo "scale=0;$var_total/5*4"|bc -l)
while true; do
for ((i=1; i<=$var_runtimes; i++)); do
echo "Welcome $i times"
cat list.20 | shuf | xargs -I DEMO cat DEMO > /dev/null &
done
echo "wait to finish"
wait
done
var_total=1500
# split -n l/$(echo "scale=0;$var_total/5*1"|bc -l) list.80 split.list.80.
# while true; do
# for f in split.list.80.*; do
# cat $f | xargs -I DEMO cat DEMO > /dev/null &
# done
# echo "wait to finish"
# wait
# done
var_runtimes=$(echo "scale=0;$var_total/5*1"|bc -l)
while true; do
for ((i=1; i<=$var_runtimes; i++)); do
echo "Welcome $i times"
cat list.80 | shuf | xargs -I DEMO cat DEMO > /dev/null &
done
echo "wait to finish"
wait
done
# 500M-1.2GB/s
ps -ef | grep /data_mix/mnt | grep cat | awk '{print $2}' | xargs -I DEMO kill DEMO
worker-2 disk tuning
# 8.6T cache / 130T hdd = 6.6%
# 660G cache / 10T hdd
lvcreate --type raid0 -L 10T --stripesize 2048k --stripes 24 -n ext02lv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid0 -L 10T --stripesize 4096k --stripes 24 -n ext04lv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid5 -L 10T --stripesize 2048k --stripes 23 -n ext52lv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid5 -L 10T --stripesize 2048k --stripes 11 -n ext52lv12 datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl
lvcreate --type raid0 -L 10T --stripesize 2048k --stripes 24 -n xfs02lv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid0 -L 10T --stripesize 4096k --stripes 24 -n xfs04lv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid5 -L 10T --stripesize 2048k --stripes 23 -n xfs52lv datavg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid5 -L 10T --stripesize 2048k --stripes 11 -n xfs52lv12 datavg /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
lvcreate --type raid0 -L 3.5T --stripesize 1024k --stripes 10 -n ext01lvssd datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid0 -L 3.5T --stripesize 1024k --stripes 10 -n xfs01lvssd datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvcreate --type raid0 -L 700G --stripesize 2048k --stripes 10 -n cachelv datavg /dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai
lvconvert --type cache-pool datavg/cachelv
lvconvert --type cache --cachepool datavg/cachelv datavg/ext02lv
# lvconvert --splitcache datavg/ext02lv
# lvconvert --uncache datavg/ext02lv
lvs -o+layout,stripesize
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Layout Stripe
# ext01lvssd datavg rwi-a-r--- 3.50t raid,raid0 1.00m
# ext02lv datavg Cwi-a-C--- 10.00t [cachelv] [ext02lv_corig] 0.01 16.41 0.00 cache 0
# ext04lv datavg rwi-a-r--- 10.00t raid,raid0 4.00m
# ext52lv datavg rwi-a-r--- 10.00t 9.72 raid,raid5,raid5_ls 2.00m
# xfs01lvssd datavg rwi-a-r--- 3.50t raid,raid0 1.00m
mkdir -p /data_ext02
mkdir -p /data_ext04
mkdir -p /data_ext52
mkdir -p /data_ext01
mkdir -p /data_xfs01
mkdir -p /data_xfs02
mkdir -p /data_xfs04
mkdir -p /data_xfs52
mkdir -p /data_ext52_12
mkdir -p /data_xfs52_12
mkfs.ext4 /dev/datavg/ext02lv
mkfs.ext4 /dev/datavg/ext04lv
mkfs.ext4 /dev/datavg/ext52lv
mkfs.ext4 /dev/datavg/ext01lvssd
mkfs.xfs /dev/datavg/xfs01lvssd
mkfs.xfs /dev/datavg/xfs02lv
mkfs.xfs /dev/datavg/xfs04lv
mkfs.xfs /dev/datavg/xfs52lv
mkfs.ext4 /dev/datavg/ext52lv12
mkfs.xfs /dev/datavg/xfs52lv12
mount /dev/datavg/ext02lv /data_ext02
mount /dev/datavg/ext04lv /data_ext04
mount /dev/datavg/ext52lv /data_ext52
mount /dev/datavg/ext01lvssd /data_ext01
mount /dev/datavg/xfs01lvssd /data_xfs01
mount /dev/datavg/xfs02lv /data_xfs02
mount /dev/datavg/xfs04lv /data_xfs04
mount /dev/datavg/xfs52lv /data_xfs52
mount /dev/datavg/ext52lv12 /data_ext52_12
mount /dev/datavg/xfs52lv12 /data_xfs52_12
dstat -d -D /dev/datavg/ext02lv,/dev/datavg/ext04lv,/dev/datavg/ext52lv,/dev/datavg/ext01lvssd,/dev/datavg/xfs01lvssd,/dev/datavg/xfs02lv,/dev/datavg/xfs04lv,/dev/datavg/xfs52lv,/dev/datavg/ext52lv12,/dev/datavg/xfs52lv12,/dev/sdaa
dstat -d -D /dev/datavg/ext02lv,/dev/datavg/ext04lv,/dev/datavg/ext52lv,/dev/datavg/ext01lvssd,/dev/datavg/xfs01lvssd,/dev/datavg/xfs02lv,/dev/datavg/xfs04lv,/dev/datavg/xfs52lv,/dev/datavg/ext52lv12,/dev/datavg/xfs52lv12,/dev/sdaa,/dev/sdb --disk-util
bmon -p bond0,enp*
# on worker1
rclone config
rclone lsd worker-2:
rclone sync /data_ssd/mnt/ worker-2:/data_ext01/mnt/ -P -L --transfers 64
# on worker-2
# fill data
# for 512M
var_basedir_ext="/data_ext04/mnt"
mkdir -p $var_basedir_ext
# number of concurrent writers
var_total_write=10
# size of each file, in MB
# 512M
var_size=512
# total amount to write, in TB
# write 3T
var_total_size=3
var_number=$(echo "scale=0;$var_total_size*1024*1024/$var_size/$var_total_write"|bc -l)
var_len=$(echo "scale=0;$var_size*1024/1"|bc -l)
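# var_number = iterations: total size (TB -> MB) / file size (MB) / writers per iteration
# var_len    = file size converted to KB, matching head -c ${var_len}K below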
for ((i=1; i<=$var_number; i++)); do
for ((j=1; j<=$var_total_write; j++)); do
head -c ${var_len}K < /dev/urandom > $var_basedir_ext/$var_size-$j-$i &
done
echo "wait to finish: $i"
wait
done
# fill data
# for 1G
var_basedir_ext="/data_ext04/mnt"
mkdir -p $var_basedir_ext
# number of concurrent writers
var_total_write=10
# size of each file, in MB
# 1G
var_size=1024
# total amount to write, in TB
# write 3T
var_total_size=3
var_number=$(echo "scale=0;$var_total_size*1024*1024/$var_size/$var_total_write"|bc -l)
var_len=$(echo "scale=0;$var_size*1024/1"|bc -l)
for ((i=1; i<=$var_number; i++)); do
for ((j=1; j<=$var_total_write; j++)); do
head -c ${var_len}K < /dev/urandom > $var_basedir_ext/$var_size-$j-$i &
done
echo "wait to finish: $i"
wait
done
# fill data
# for 2G
var_basedir_ext="/data_ext04/mnt"
mkdir -p $var_basedir_ext
# number of concurrent writers
var_total_write=10
# size of each file, in MB
# 2G
var_size=2048
# total amount to write, in TB
# write 3T
var_total_size=3
var_number=$(echo "scale=0;$var_total_size*1024*1024/$var_size/$var_total_write"|bc -l)
var_len=$(echo "scale=0;$var_size*1024/1"|bc -l)
for ((i=1; i<=$var_number; i++)); do
for ((j=1; j<=$var_total_write; j++)); do
head -c ${var_len}K < /dev/urandom > $var_basedir_ext/$var_size-$j-$i &
done
echo "wait to finish: $i"
wait
done
# copy data
rclone sync /data_ext01/mnt/ /data_xfs01/mnt/ -P -L --transfers 64
rclone sync /data_ext04/mnt/ /data_xfs02/mnt/ -P -L --transfers 64
rclone sync /data_ext04/mnt/ /data_xfs04/mnt/ -P -L --transfers 10
rclone sync /data_ext04/mnt/ /data_xfs52/mnt/ -P -L --transfers 10
rclone sync /data_ext04/mnt/ /data_xfs52_12/mnt/ -P -L --transfers 10
rclone sync /data_ext04/mnt/ /data_ext02/mnt/ -P -L --transfers 10
rclone sync /data_ext04/mnt/ /data_ext52/mnt/ -P -L --transfers 10
rclone sync /data_ext04/mnt/ /data_ext52_12/mnt/ -P -L --transfers 10
var_truebase="/data_xfs52"
mkdir -p $var_truebase/list.tmp
cd $var_truebase/list.tmp
var_basedir="$var_truebase/mnt"
find $var_basedir -type f -size -600M > list.512m
find $var_basedir -type f -size -1100M -size +600M > list.1g
find $var_basedir -type f -size +1100M > list.+1g
find $var_basedir -type f > list
cat list | xargs ls -l > list.size
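# the awk pipeline below buckets the files into power-of-two size ranges
# (using the size column of ls -l) and prints one line per bucket:
# lower bound, upper bound, unit (k/M/G/...), file count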
cat list.size | awk '{ n=int(log($5)/log(2)); \
if (n<10) n=10; \
size[n]++ } \
END { for (i in size) printf("%d %d\n", 2^i, size[i]) }' \
| sort -n \
| awk 'function human(x) { x[1]/=1024; \
if (x[1]>=1024) { x[2]++; \
human(x) } } \
{ a[1]=$1; \
a[2]=0; \
human(a); \
printf("%3d - %4d %s: %6d\n", a[1], a[1]*2,substr("kMGTEPYZ",a[2]+1,1),$2) }'
# separate read
for i in 512m 1g +1g ; do
cat list.$i | shuf > list.shuf.$i
done
rm -f split.list.*
# zte use 1800
var_total=30
for i in 512m 1g +1g ; do
split -n l/$var_total list.shuf.$i split.list.$i.
done
for f in split.list.512m.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
for f in split.list.1g.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
for f in split.list.+1g.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
# mix read
for i in 512m 1g +1g ; do
cat list.$i | shuf > list.shuf.$i
done
rm -f split.list.*
# zte use 1800
var_total=10
for i in 512m 1g +1g ; do
split -n l/$var_total list.shuf.$i split.list.$i.
done
for i in 512m 1g +1g ; do
for f in split.list.$i.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
done
ps -ef | grep xargs | grep DEMO | grep cat | awk '{print $2}' | xargs -I DEMO kill DEMO
ps -ef | grep cat | grep /data | awk '{print $2}' | xargs -I DEMO kill -9 DEMO
lvconvert --splitcache datavg/ext02lv
var_truebase="/data_ext01"
mkdir -p $var_truebase/list.tmp
cd $var_truebase/list.tmp
var_basedir="$var_truebase/mnt"
find $var_basedir -type f -size -16k > list.16k
find $var_basedir -type f -size -128k -size +16k > list.128k
find $var_basedir -type f -size +128k > list.+128k
find $var_basedir -type f > list
cat list | xargs ls -l > list.size
cat list.size | awk '{ n=int(log($5)/log(2)); \
if (n<10) n=10; \
size[n]++ } \
END { for (i in size) printf("%d %d\n", 2^i, size[i]) }' \
| sort -n \
| awk 'function human(x) { x[1]/=1024; \
if (x[1]>=1024) { x[2]++; \
human(x) } } \
{ a[1]=$1; \
a[2]=0; \
human(a); \
printf("%3d - %4d %s: %6d\n", a[1], a[1]*2,substr("kMGTEPYZ",a[2]+1,1),$2) }'
# separate read
for i in 16k 128k +128k ; do
cat list.$i | shuf > list.shuf.$i
done
rm -f split.list.*
# zte use 1800
var_total=30
for i in 16k 128k +128k ; do
split -n l/$var_total list.shuf.$i split.list.$i.
done
for f in split.list.16k.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
for f in split.list.128k.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
for f in split.list.+128k.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
# mix read
for i in 16k 128k +128k ; do
cat list.$i | shuf > list.shuf.$i
done
rm -f split.list.*
# zte use 1800
var_total=10
for i in 16k 128k +128k ; do
split -n l/$var_total list.shuf.$i split.list.$i.
done
for i in 16k 128k +128k ; do
for f in split.list.$i.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
done
ps -ef | grep xargs | grep DEMO | grep cat | awk '{print $2}' | xargs -I DEMO kill DEMO
worker-2 nic bond
ip link show
# 2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether cc:64:a6:59:bb:80 brd ff:ff:ff:ff:ff:ff
# 3: ens2f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether 08:4f:0a:b5:a4:6e brd ff:ff:ff:ff:ff:ff
# 4: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether cc:64:a6:59:bb:81 brd ff:ff:ff:ff:ff:ff
# 5: eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether cc:64:a6:59:bb:82 brd ff:ff:ff:ff:ff:ff
# 6: ens2f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether 08:4f:0a:b5:a4:6f brd ff:ff:ff:ff:ff:ff
# 7: eno4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
# link/ether cc:64:a6:59:bb:83 brd ff:ff:ff:ff:ff:ff
ip a s eno1
# 2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
# link/ether cc:64:a6:59:bb:80 brd ff:ff:ff:ff:ff:ff
# inet 39.134.201.66/27 brd 39.134.201.95 scope global noprefixroute eno1
# valid_lft forever preferred_lft forever
# inet6 fe80::f690:1c45:b8c3:96d/64 scope link noprefixroute
# valid_lft forever preferred_lft forever
ethtool eno1 # 10000baseT/Full
ethtool eno2 # 10000baseT/Full
ethtool eno3 # 1000baseT/Full
ethtool eno4 # 1000baseT/Full
ethtool ens2f0 # 10000baseT/Full
ethtool ens2f1 # 10000baseT/Full
nmcli con add type bond \
con-name bond0 \
ifname bond0 \
mode 802.3ad
nmcli con mod id bond0 bond.options \
mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer2+3
nmcli con add type bond-slave ifname eno2 con-name eno2 master bond0
nmcli con add type bond-slave ifname ens2f0 con-name ens2f0 master bond0
nmcli con add type bond-slave ifname ens2f1 con-name ens2f1 master bond0
nmcli con down eno2
nmcli con up eno2
nmcli con down ens2f0
nmcli con up ens2f0
nmcli con down ens2f1
nmcli con up ens2f1
nmcli con down bond0
nmcli con up bond0
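# verify the bond came up with all slaves attached and in 802.3ad mode (sketch):
cat /proc/net/bonding/bond0
nmcli device status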
#######################################
# nic bond
cat > /root/nic.bond.sh << 'EOF'
#!/bin/bash
set -x
# delete all connection
nmcli -g uuid con | while read i ; do nmcli c delete ${i} ; done
nmcli con add type bond \
con-name bond0 \
ifname bond0 \
mode 802.3ad \
ipv4.method 'manual' \
ipv4.address '39.134.201.66/27' \
ipv4.gateway '39.134.201.94' \
ipv4.dns '117.177.241.16'
nmcli con mod id bond0 bond.options \
mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer2+3
nmcli con add type bond-slave ifname eno1 con-name eno1 master bond0
nmcli con add type bond-slave ifname eno2 con-name eno2 master bond0
nmcli con add type bond-slave ifname ens2f0 con-name ens2f0 master bond0
nmcli con add type bond-slave ifname ens2f1 con-name ens2f1 master bond0
systemctl restart network
EOF
cat > /root/nic.restore.sh << 'EOF'
#!/bin/bash
set -x
# delete all connection
nmcli -g uuid con | while read i ; do nmcli c delete ${i} ; done
# re-create primary connection
nmcli con add type ethernet \
con-name eno1 \
ifname eno1 \
ipv4.method 'manual' \
ipv4.address '39.134.201.66/27' \
ipv4.gateway '39.134.201.94' \
ipv4.dns '117.177.241.16'
systemctl restart network
exit 0
EOF
chmod +x /root/nic.restore.sh
cat > ~/cron-network-con-recreate << EOF
*/20 * * * * /bin/bash /root/nic.restore.sh
EOF
crontab ~/cron-network-con-recreate
bash /root/nic.bond.sh
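# the cron entry above is a safety net: if nic.bond.sh breaks remote access,
# nic.restore.sh re-creates the plain eno1 connection within 20 minutes;
# once bond0 is confirmed working, clear it with: crontab -r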
worker-3 host
systemctl stop firewalld
systemctl disable firewalld
cat << EOF > /etc/rc.local
#!/bin/bash
# THIS FILE IS ADDED FOR COMPATIBILITY PURPOSES
#
# It is highly advisable to create own systemd services or udev rules
# to run scripts during boot instead of using this file.
#
# In contrast to previous versions due to parallel execution during boot
# this script will NOT be run after all other services.
#
# Please note that you must run 'chmod +x /etc/rc.d/rc.local' to ensure
# that this script will be executed during boot.
touch /var/lock/subsys/local
ipset create my-allow-set hash:net
ipset add my-allow-set 127.0.0.1/32
ipset add my-allow-set 223.87.20.0/24
ipset add my-allow-set 117.177.241.0/24
ipset add my-allow-set 39.134.200.0/24
ipset add my-allow-set 39.134.201.0/24
ipset add my-allow-set 39.137.101.0/24
ipset add my-allow-set 192.168.7.0/24
ipset add my-allow-set 112.44.102.224/27
ipset add my-allow-set 47.93.86.113/32
ipset add my-allow-set 221.226.0.75/32
ipset add my-allow-set 210.21.236.182/32
ipset add my-allow-set 61.132.54.2/32
ipset add my-allow-set 39.134.198.0/24
ipset add my-allow-set 39.134.204.0/24
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m set --match-set my-allow-set src -j ACCEPT
iptables -A INPUT -p tcp -j REJECT
iptables -A INPUT -p udp -j REJECT
EOF
chmod +x /etc/rc.d/rc.local
systemctl enable rc-local
# systemctl restart rc-local
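# verify the allow-list firewall after boot (sketch):
ipset list my-allow-set
iptables -L INPUT -n -v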
#######################################
# nic bond
cat << 'EOF' > /root/nic.bond.sh
#!/bin/bash
# delete all connection
nmcli -g uuid con | while read i ; do nmcli c delete uuid ${i} ; done
nmcli con add type bond \
con-name bond0 \
ifname bond0 \
mode 802.3ad \
ipv4.method 'manual' \
ipv4.address '39.134.204.73/27' \
ipv4.gateway '39.134.204.65' \
ipv4.dns '117.177.241.16'
nmcli con mod id bond0 bond.options \
mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer2+3
nmcli con add type bond-slave ifname enp176s0f0 con-name enp176s0f0 master bond0
nmcli con add type bond-slave ifname enp176s0f1 con-name enp176s0f1 master bond0
systemctl restart network
EOF
cat > /root/nic.restore.sh << 'EOF'
#!/bin/bash
# delete all connection
nmcli -g uuid con | while read i ; do nmcli c delete uuid ${i} ; done
# re-create primary connection
nmcli con add type ethernet \
con-name enp176s0f0 \
ifname enp176s0f0 \
ipv4.method 'manual' \
ipv4.address '39.134.204.73/27' \
ipv4.gateway '39.134.204.65' \
ipv4.dns '117.177.241.16'
systemctl restart network
exit 0
EOF
chmod +x /root/nic.restore.sh
cat > ~/cron-network-con-recreate << EOF
*/2 * * * * /bin/bash /root/nic.restore.sh
EOF
crontab ~/cron-network-con-recreate
mkdir /etc/yum.repos.d.bak
mv /etc/yum.repos.d/* /etc/yum.repos.d.bak
cat << EOF > /etc/yum.repos.d/remote.repo
[remote]
name=RHEL FTP
baseurl=ftp://117.177.241.16/data
enabled=1
gpgcheck=0
EOF
yum clean all
yum --disableplugin=subscription-manager repolist
yum -y update
hostnamectl set-hostname worker-3.ocpsc.redhat.ren
nmcli connection modify enp176s0f0 ipv4.dns 117.177.241.16
nmcli connection reload
nmcli connection up enp176s0f0
# ntp
yum install -y chrony
systemctl enable chronyd
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
systemctl disable --now firewalld.service
# update ntp
cat << EOF > /etc/chrony.conf
server 223.87.20.100 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
systemctl restart chronyd
systemctl status chronyd
chronyc tracking
worker-3 disk
lshw -class disk
lsblk | grep 5.5 | awk '{print $1}' | xargs -I DEMO echo -n "/dev/DEMO "
# /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy
lsblk | grep 5.5 | awk '{print $1}' | wc -l
# 24
pvcreate -y /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy
vgcreate datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy
lsblk -d -o name,rota
lvcreate --type raid0 -L 120T --stripesize 128k --stripes 24 -n hddlv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy
mkfs.ext4 /dev/datavg/hddlv
lvcreate --type raid0 -L 5T --stripesize 512k --stripes 24 -n xfslv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy
lvcreate --type raid0 -L 110T --stripesize 4096k --stripes 24 -n extzxlv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy
lvcreate --type raid0 -L 3.5T --stripesize 4096k --stripes 24 -n ext04lv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy
lvcreate --type raid6 -L 3.5T --stripesize 2048k --stripes 22 -n ext62lv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy
lvcreate --type raid5 -L 3.5T --stripesize 2048k --stripes 23 -n ext52lv datavg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy
mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/mapper/fc-root
mkfs.xfs /dev/datavg/xfslv
mkfs.ext4 /dev/datavg/extlv
mkfs.ext4 /dev/datavg/ext04lv
mkfs.ext4 /dev/datavg/ext62lv
mkfs.ext4 /dev/datavg/ext52lv
mkfs.ext4 /dev/datavg/extzxlv
# mkfs.xfs /dev/datavg/extzxlv
mount /dev/datavg/extzxlv /data
rclone sync /data_ext04/mnt/ /data/redhat_mnt/ -P -L --transfers 64
mount /dev/datavg/xfslv /data_xfs
mount /dev/datavg/extlv /data_ext
mkdir -p /data_ext02
mkdir -p /data_ext04
mkdir -p /data_ext62
mkdir -p /data_ext52
mount /dev/datavg/ext02lv /data_ext02
mount /dev/datavg/ext04lv /data_ext04
# mount /dev/datavg/ext62lv /data_ext62
mount /dev/datavg/ext52lv /data_ext52
umount /data_xfs
lvremove -f datavg/xfslv
# rsync --info=progress2 -P -ar /data_ext/mnt/ /data_xfs/mnt/
rclone sync /data_ext/mnt/ /data_xfs/mnt/ -P -L --transfers 64
umount /data_ext
lvremove -f datavg/extlv
rclone sync /data_xfs/mnt/ /data_ext/mnt/ -P -L --transfers 64
umount /data_ext52
rclone sync /data_xfs/mnt/ /data_ext04/mnt/ -P -L --transfers 64
rclone sync /data_xfs/mnt/ /data_ext62/mnt/ -P -L --transfers 64
rclone sync /data_xfs/mnt/ /data_ext52/mnt/ -P -L --transfers 64
lvs -o+stripesize
dstat -D /dev/datavg/xfslv,/dev/datavg/extlv,/dev/sdb,/dev/sdc 5
dstat -D /dev/datavg/xfslv,/dev/datavg/extlv,/dev/sdb,/dev/sdc --disk-util
bmon -p bond0,enp*
blockdev --report
# https://access.redhat.com/solutions/3588841
# orig: 12288
/sbin/blockdev --setra 131072 /dev/datavg/xfslv
/sbin/blockdev --setra 131072 /dev/datavg/extlv
/sbin/blockdev --setra 12288 /dev/datavg/xfslv
/sbin/blockdev --setra 12288 /dev/datavg/extlv
mkdir -p /data/
cat /etc/fstab
cat << EOF >> /etc/fstab
/dev/datavg/hddlv /data ext4 defaults 0 0
EOF
mount -a
df -h | grep \/data
while true; do df -h | grep /data; sleep 10; done
dstat -D /dev/datavg/hddlv
dstat -D /dev/sdb,/dev/sdc
dstat -D /dev/sdb,/dev/sdc --disk-util
mkfs.xfs -f /dev/sdb
mkfs.ext4 -F /dev/sdc
mkdir -p /data_xfs
mkdir -p /data_ext
mount /dev/sdb /data_xfs
mount /dev/sdc /data_ext
# fill data
# for 1.5M
var_basedir_xfs="/data_xfs/mnt"
var_basedir_ext="/data_ext/mnt"
mkdir -p $var_basedir_xfs
mkdir -p $var_basedir_ext
var_basedir_xfs="/data_xfs/mnt"
var_basedir_ext="/data_ext/mnt"
var_total=10
# 512k
var_size=0.5
# write 1T
var_number=$(echo "scale=0;1024*1024/$var_size/$var_total"|bc -l)
var_len=$(echo "scale=0;$var_size*1024/1"|bc -l)
for ((i=1; i<=$var_number; i++)); do
for ((j=1; j<=$var_total; j++)); do
# echo "Welcome $i times"
head -c ${var_len}K < /dev/urandom > $var_basedir_xfs/$var_size-$j-$i &
head -c ${var_len}K < /dev/urandom > $var_basedir_ext/$var_size-$j-$i &
done
echo "wait to finish: $i"
wait
done
var_basedir_xfs="/data_xfs/mnt"
var_basedir_ext="/data_ext/mnt"
var_total=10
# 4M
var_size=4
# write 1T
var_number=$(echo "scale=0;1024*1024/$var_size/$var_total"|bc -l)
var_len=$(echo "scale=0;$var_size*1024/1"|bc -l)
for ((i=1; i<=$var_number; i++)); do
for ((j=1; j<=$var_total; j++)); do
# echo "Welcome $i times"
head -c ${var_len}K < /dev/urandom > $var_basedir_xfs/$var_size-$j-$i &
head -c ${var_len}K < /dev/urandom > $var_basedir_ext/$var_size-$j-$i &
done
echo "wait to finish: $i"
wait
done
var_basedir_xfs="/data_xfs/mnt"
var_basedir_ext="/data_ext/mnt"
var_total=10
# 8M
var_size=8
# write 1T
var_number=$(echo "scale=0;1024*1024/$var_size/$var_total"|bc -l)
var_len=$(echo "scale=0;$var_size*1024/1"|bc -l)
for ((i=1; i<=$var_number; i++)); do
for ((j=1; j<=$var_total; j++)); do
# echo "Welcome $i times"
head -c ${var_len}K < /dev/urandom > $var_basedir_xfs/$var_size-$j-$i &
head -c ${var_len}K < /dev/urandom > $var_basedir_ext/$var_size-$j-$i &
done
echo "wait to finish: $i"
wait
done
var_basedir_xfs="/data_xfs/mnt"
var_basedir_ext="/data_ext/mnt"
var_total=10
# 32M
var_size=32
# write 1T
var_number=$(echo "scale=0;1024*1024/$var_size/$var_total"|bc -l)
var_len=$(echo "scale=0;$var_size*1024/1"|bc -l)
for ((i=1; i<=$var_number; i++)); do
for ((j=1; j<=$var_total; j++)); do
# echo "Welcome $i times"
head -c ${var_len}K < /dev/urandom > $var_basedir_xfs/$var_size-$j-$i &
head -c ${var_len}K < /dev/urandom > $var_basedir_ext/$var_size-$j-$i &
done
echo "wait to finish: $i"
wait
done
var_basedir_xfs="/data_xfs/mnt"
var_basedir_ext="/data_ext/mnt"
var_total=10
# 64M
var_size=64
# write 1T
var_number=$(echo "scale=0;1024*1024/$var_size/$var_total"|bc -l)
var_len=$(echo "scale=0;$var_size*1024/1"|bc -l)
for ((i=1; i<=$var_number; i++)); do
for ((j=1; j<=$var_total; j++)); do
# echo "Welcome $i times"
head -c ${var_len}K < /dev/urandom > $var_basedir_xfs/$var_size-$j-$i &
head -c ${var_len}K < /dev/urandom > $var_basedir_ext/$var_size-$j-$i &
done
echo "wait to finish: $i"
wait
done
mkdir -p /data_xfs/list.tmp
cd /data_xfs/list.tmp
var_basedir="/data_xfs/mnt"
find $var_basedir -type f -size -2M > list.2m
find $var_basedir -type f -size -10M -size +2M > list.10m
find $var_basedir -type f -size +10M > list.100m
find $var_basedir -type f > list
var_truebase="/data"
mkdir -p $var_truebase/list.tmp
cd $var_truebase/list.tmp
var_basedir="$var_truebase/mnt"
find $var_basedir -type f -size -2M > list.2m
find $var_basedir -type f -size -10M -size +2M > list.10m
find $var_basedir -type f -size +10M > list.100m
find $var_basedir -type f > list
cat list | xargs ls -l > list.size
cat list.size | awk '{ n=int(log($5)/log(2)); \
if (n<10) n=10; \
size[n]++ } \
END { for (i in size) printf("%d %d\n", 2^i, size[i]) }' \
| sort -n \
| awk 'function human(x) { x[1]/=1024; \
if (x[1]>=1024) { x[2]++; \
human(x) } } \
{ a[1]=$1; \
a[2]=0; \
human(a); \
printf("%3d - %4d %s: %6d\n", a[1], a[1]*2,substr("kMGTEPYZ",a[2]+1,1),$2) }'
cat list | shuf > list.shuf.all
cat list.2m | shuf > list.shuf.2m
cat list.10m | shuf > list.shuf.10m
cat list.100m | shuf > list.shuf.100m
cat list.10m list.100m | shuf > list.shuf.+2m
rm -f split.list.*
# zte use 1800
var_total=10
split -n l/$var_total list.shuf.all split.list.all.
split -n l/$var_total list.shuf.2m split.list.2m.
split -n l/$var_total list.shuf.10m split.list.10m.
split -n l/$var_total list.shuf.100m split.list.100m.
split -n l/$var_total list.shuf.+2m split.list.+2m.
for f in split.list.2m.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
# for f in split.list.+2m.*; do
# cat $f | xargs -I DEMO cat DEMO > /dev/null &
# done
for f in split.list.10m.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
for f in split.list.100m.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
for f in split.list.all.*; do
cat $f | xargs -I DEMO cat DEMO > /dev/null &
done
jobs -p | xargs kill
ps -ef | grep xargs | grep DEMO | grep cat | awk '{print $2}' | xargs -I DEMO kill DEMO
install ocp
helper node day1
############################################################
# on macbook
mkdir -p /Users/wzh/Documents/redhat/tools/redhat.ren/etc
mkdir -p /Users/wzh/Documents/redhat/tools/redhat.ren/lib
mkdir -p /Users/wzh/Documents/redhat/tools/ocpsc.redhat.ren/etc
mkdir -p /Users/wzh/Documents/redhat/tools/ocpsc.redhat.ren/lib
rm -rf /Users/wzh/Documents/redhat/tools/apps.ocpsc.redhat.ren/
mkdir -p /Users/wzh/Documents/redhat/tools/apps.ocpsc.redhat.ren/etc
mkdir -p /Users/wzh/Documents/redhat/tools/apps.ocpsc.redhat.ren/lib
cd /Users/wzh/Documents/redhat/tools/redhat.ren/
docker run -it --rm --name certbot \
-v "/Users/wzh/Documents/redhat/tools/redhat.ren/etc:/etc/letsencrypt" \
-v "/Users/wzh/Documents/redhat/tools/redhat.ren/lib:/var/lib/letsencrypt" \
certbot/certbot certonly -d "*.redhat.ren" --manual --preferred-challenges dns-01 --server https://acme-v02.api.letsencrypt.org/directory
cp ./etc/archive/redhat.ren/fullchain4.pem redhat.ren.crt
cp ./etc/archive/redhat.ren/privkey4.pem redhat.ren.key
cd /Users/wzh/Documents/redhat/tools/ocpsc.redhat.ren/
docker run -it --rm --name certbot \
-v "/Users/wzh/Documents/redhat/tools/ocpsc.redhat.ren/etc:/etc/letsencrypt" \
-v "/Users/wzh/Documents/redhat/tools/ocpsc.redhat.ren/lib:/var/lib/letsencrypt" \
certbot/certbot certonly -d "*.ocpsc.redhat.ren" --manual --preferred-challenges dns-01 --server https://acme-v02.api.letsencrypt.org/directory
cp ./etc/archive/ocpsc.redhat.ren/fullchain1.pem ocpsc.redhat.ren.crt
cp ./etc/archive/ocpsc.redhat.ren/privkey1.pem ocpsc.redhat.ren.key
cd /Users/wzh/Documents/redhat/tools/apps.ocpsc.redhat.ren/
docker run -it --rm --name certbot \
-v "/Users/wzh/Documents/redhat/tools/apps.ocpsc.redhat.ren/etc:/etc/letsencrypt" \
-v "/Users/wzh/Documents/redhat/tools/apps.ocpsc.redhat.ren/lib:/var/lib/letsencrypt" \
certbot/certbot certonly -d "*.apps.ocpsc.redhat.ren" --manual --preferred-challenges dns-01 --server https://acme-v02.api.letsencrypt.org/directory
cp ./etc/archive/apps.ocpsc.redhat.ren/fullchain1.pem apps.ocpsc.redhat.ren.crt
cp ./etc/archive/apps.ocpsc.redhat.ren/privkey1.pem apps.ocpsc.redhat.ren.key
# scp these keys to helper
# /data/cert/*
####################################################
# on helper node
yum -y install podman docker-distribution pigz skopeo httpd-tools
# https://access.redhat.com/solutions/3175391
htpasswd -cbB /etc/docker-distribution/registry_passwd admin ***************
cat << EOF > /etc/docker-distribution/registry/config.yml
version: 0.1
log:
fields:
service: registry
storage:
cache:
layerinfo: inmemory
filesystem:
rootdirectory: /data/registry
delete:
enabled: true
http:
addr: :5443
tls:
certificate: /data/cert/redhat.ren.crt
key: /data/cert/redhat.ren.key
auth:
htpasswd:
realm: basic-realm
path: /etc/docker-distribution/registry_passwd
EOF
# systemctl restart docker
systemctl stop docker-distribution
systemctl enable docker-distribution
systemctl restart docker-distribution
#
firewall-cmd --permanent --add-port=5443/tcp
firewall-cmd --reload
podman login registry.redhat.ren:5443 -u admin -p *******************
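# quick check that the registry answers over TLS with htpasswd auth
# (sketch; PASSWORD is a placeholder for the real admin password):
curl -u admin:PASSWORD https://registry.redhat.ren:5443/v2/_catalog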
yum install -y docker
systemctl start docker
docker login registry.redhat.ren:5443 -u admin
# upload vars-static.yaml to helper
yum -y install ansible-2.8.10 git unzip podman python36
cd /data/ocp4/ocp4-upi-helpernode
ansible-playbook -e @vars-static.yaml -e staticips=true tasks/main.yml
# upload install-config.yaml to helper /data/ocp4
cd /data/ocp4
/bin/rm -rf *.ign .openshift_install_state.json auth bootstrap master0 master1 master2 worker0 worker1 worker2
openshift-install create ignition-configs --dir=/data/ocp4
/bin/cp -f bootstrap.ign /var/www/html/ignition/bootstrap-static.ign
/bin/cp -f master.ign /var/www/html/ignition/master-0.ign
/bin/cp -f master.ign /var/www/html/ignition/master-1.ign
/bin/cp -f master.ign /var/www/html/ignition/master-2.ign
/bin/cp -f worker.ign /var/www/html/ignition/worker-0.ign
/bin/cp -f worker.ign /var/www/html/ignition/worker-1.ign
/bin/cp -f worker.ign /var/www/html/ignition/worker-2.ign
chmod 644 /var/www/html/ignition/*
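# the nodes fetch these files over http during install; a quick reachability
# check (sketch, using the helper URL that the ISOs below are built with):
curl -s -o /dev/null -w "%{http_code}\n" http://117.177.241.16:8080/ignition/master-0.ign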
########################################################
# on helper node, create iso
yum -y install genisoimage libguestfs-tools
systemctl start libvirtd
export NGINX_DIRECTORY=/data/ocp4
export RHCOSVERSION=4.3.0
export VOLID=$(isoinfo -d -i ${NGINX_DIRECTORY}/rhcos-${RHCOSVERSION}-x86_64-installer.iso | awk '/Volume id/ { print $3 }')
TEMPDIR=$(mktemp -d)
echo $VOLID
echo $TEMPDIR
cd ${TEMPDIR}
# Extract the ISO content using guestfish (to avoid sudo mount)
guestfish -a ${NGINX_DIRECTORY}/rhcos-${RHCOSVERSION}-x86_64-installer.iso \
-m /dev/sda tar-out / - | tar xvf -
# Helper function to modify the config files
modify_cfg(){
for file in "EFI/redhat/grub.cfg" "isolinux/isolinux.cfg"; do
# Append the proper image and ignition urls
sed -e '/coreos.inst=yes/s|$| coreos.inst.install_dev=vda coreos.inst.image_url='"${URL}"'\/install\/'"${BIOSMODE}"'.raw.gz coreos.inst.ignition_url='"${URL}"'\/ignition\/'"${NODE}"'.ign ip='"${IP}"'::'"${GATEWAY}"':'"${NETMASK}"':'"${FQDN}"':'"${NET_INTERFACE}"':none:'"${DNS}"' nameserver='"${DNS}"'|' ${file} > $(pwd)/${NODE}_${file##*/}
# Boot directly in the installation
sed -i -e 's/default vesamenu.c32/default linux/g' -e 's/timeout 600/timeout 10/g' $(pwd)/${NODE}_${file##*/}
done
}
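# per-node workflow: set URL/IP/FQDN/NET_INTERFACE, then call modify_cfg, which
# writes <node>_grub.cfg and <node>_isolinux.cfg into the working directory with
# the image/ignition URLs and the static ip= kernel argument injected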
URL="http://117.177.241.16:8080/"
GATEWAY="117.177.241.1"
NETMASK="255.255.255.0"
DNS="117.177.241.16"
# BOOTSTRAP
# TYPE="bootstrap"
NODE="bootstrap-static"
IP="117.177.241.243"
FQDN="vm-bootstrap"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
# MASTERS
# TYPE="master"
# MASTER-0
NODE="master-0"
IP="117.177.241.240"
FQDN="vm-master0"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
# MASTER-1
NODE="master-1"
IP="117.177.241.241"
FQDN="vm-master1"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
# MASTER-2
NODE="master-2"
IP="117.177.241.242"
FQDN="vm-master2"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
# WORKERS
NODE="worker-0"
IP="117.177.241.244"
FQDN="vm-worker0"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
NODE="worker-1"
IP="117.177.241.245"
FQDN="vm-worker1"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
# Generate the images, one per node as the IP configuration is different...
# https://github.com/coreos/coreos-assembler/blob/master/src/cmd-buildextend-installer#L97-L103
for node in master-0 master-1 master-2 worker-0 worker-1 worker-2 bootstrap-static; do
# Overwrite the grub.cfg and isolinux.cfg files for each node type
for file in "EFI/redhat/grub.cfg" "isolinux/isolinux.cfg"; do
/bin/cp -f $(pwd)/${node}_${file##*/} ${file}
done
# As regular user!
genisoimage -verbose -rock -J -joliet-long -volset ${VOLID} \
-eltorito-boot isolinux/isolinux.bin -eltorito-catalog isolinux/boot.cat \
-no-emul-boot -boot-load-size 4 -boot-info-table \
-eltorito-alt-boot -efi-boot images/efiboot.img -no-emul-boot \
-o ${NGINX_DIRECTORY}/${node}.iso .
done
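# sanity check the generated ISOs before copying them out (sketch):
ls -lh ${NGINX_DIRECTORY}/*.iso
isoinfo -d -i ${NGINX_DIRECTORY}/master-0.iso | grep -i 'volume id'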
# Optionally, clean up
cd /data/ocp4
rm -Rf ${TEMPDIR}
cd ${NGINX_DIRECTORY}
# mkdir -p /data/ocp4
# mkdir -p /data/kvm
scp master-*.iso root@117.177.241.17:/data/ocp4/
scp master-*.iso root@117.177.241.21:/data/ocp4/
scp worker-*.iso root@117.177.241.21:/data/ocp4/
scp bootstrap-*.iso root@117.177.241.21:/data/ocp4/
scp master-*.iso root@117.177.241.18:/data/ocp4/
# after you create and boot master vm, worker vm, you can track the result
export KUBECONFIG=/data/ocp4/auth/kubeconfig
echo "export KUBECONFIG=/data/ocp4/auth/kubeconfig" >> ~/.bashrc
source ~/.bashrc
oc get nodes
openshift-install wait-for bootstrap-complete --log-level debug
oc get csr
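# (optional) a small helper loop while nodes join: keep approving pending CSRs (the same go-template is used again later in this doc); stop it with ctrl-c
while true; do
  oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
  sleep 30
done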
openshift-install wait-for install-complete
bash add.image.load.sh /data_ssd/is.samples/mirror_dir/
oc apply -f ./99-worker-zzz-container-registries.yaml -n openshift-config
oc apply -f ./99-master-zzz-container-registries.yaml -n openshift-config
helper node day1 oper
# https://docs.openshift.com/container-platform/4.3/openshift_images/managing_images/using-image-pull-secrets.html#images-update-global-pull-secret_using-image-pull-secrets
oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=/data/pull-secret.json
# https://docs.openshift.com/container-platform/4.3/networking/ingress-operator.html#nw-ingress-controller-tls-profiles_configuring-ingress
oc --namespace openshift-ingress-operator get ingresscontrollers
oc --namespace openshift-ingress create secret tls custom-certs-default --cert=/data/cert/apps.ocpsc.redhat.ren.crt --key=/data/cert/apps.ocpsc.redhat.ren.key
oc patch --type=merge --namespace openshift-ingress-operator ingresscontrollers/default \
--patch '{"spec":{"defaultCertificate":{"name":"custom-certs-default"}}}'
oc get --namespace openshift-ingress-operator ingresscontrollers/default \
--output jsonpath='{.spec.defaultCertificate}'
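# (optional) check which certificate the router actually serves now; the console hostname is assumed from the apps.ocpsc.redhat.ren wildcard used above
echo | openssl s_client -connect console-openshift-console.apps.ocpsc.redhat.ren:443 -servername console-openshift-console.apps.ocpsc.redhat.ren 2>/dev/null | openssl x509 -noout -subject -dates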
# upgrade ingress ca
oc --namespace openshift-ingress create secret tls custom-certs-default-01 --cert=/data/cert/apps.ocpsc.redhat.ren.crt --key=/data/cert/apps.ocpsc.redhat.ren.key
oc patch --type=merge --namespace openshift-ingress-operator ingresscontrollers/default \
--patch '{"spec":{"defaultCertificate":{"name":"custom-certs-default-01"}}}'
##################################################
# add rhel hw node, and remove vm worker node
ssh-copy-id root@infra-0.ocpsc.redhat.ren
ssh root@infra-0.ocpsc.redhat.ren
ssh-copy-id root@infra-1.ocpsc.redhat.ren
ssh root@infra-1.ocpsc.redhat.ren
# disable firewalld on infra-0, infra-1
yum -y install openshift-ansible openshift-clients jq
# create rhel-ansible-host
cat <<EOF > /data/ocp4/rhel-ansible-host
[all:vars]
ansible_user=root
#ansible_become=True
openshift_kubeconfig_path="/data/ocp4/auth/kubeconfig"
[new_workers]
infra-0.ocpsc.redhat.ren
infra-1.ocpsc.redhat.ren
EOF
ansible-playbook -i /data/ocp4/rhel-ansible-host /usr/share/ansible/openshift-ansible/playbooks/scaleup.yml
# then remove old vm-worker0, vm-worker1
oc get nodes -o wide
oc adm cordon vm-worker-0.ocpsc.redhat.ren
oc adm cordon vm-worker-1.ocpsc.redhat.ren
oc adm drain vm-worker-0.ocpsc.redhat.ren --force --delete-local-data --ignore-daemonsets
oc adm drain vm-worker-1.ocpsc.redhat.ren --force --delete-local-data --ignore-daemonsets
oc delete nodes vm-worker-0.ocpsc.redhat.ren
oc delete nodes vm-worker-1.ocpsc.redhat.ren
oc get nodes -o wide
# create nfs storage and enable image operator
bash ocp4-upi-helpernode/files/nfs-provisioner-setup.sh
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"managementState": "Managed","storage":{"pvc":{"claim":""}}}}' --type=merge
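# (optional) confirm the registry operator picks up the NFS-backed PVC and reports available
oc get clusteroperator image-registry
oc get pvc -n openshift-image-registry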
# create operator catalog
oc patch OperatorHub cluster --type json \
-p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
cat <<EOF > redhat-operator-catalog.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: redhat-operator-catalog
namespace: openshift-marketplace
spec:
displayName: Redhat Operator Catalog
sourceType: grpc
image: registry.redhat.ren:5443/docker.io/wangzheng422/operator-catalog:redhat-2020-03-23
publisher: Red Hat
EOF
oc create -f redhat-operator-catalog.yaml
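# (optional) confirm the catalog pod starts and packages become visible
oc get pods -n openshift-marketplace
oc get packagemanifests -n openshift-marketplace | head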
# create infra node
# https://access.redhat.com/solutions/4287111
oc get node
oc label node infra0.hsc.redhat.ren node-role.kubernetes.io/infra=""
oc label node infra1.hsc.redhat.ren node-role.kubernetes.io/infra=""
oc patch ingresscontroller default -n openshift-ingress-operator --type=merge --patch='{"spec":{"nodePlacement":{"nodeSelector": {"matchLabels":{"node-role.kubernetes.io/infra":""}}}}}'
oc patch configs.imageregistry.operator.openshift.io/cluster -n openshift-image-registry --type=merge --patch '{"spec":{"nodeSelector":{"node-role.kubernetes.io/infra":""}}}'
oc get pod -o wide -n openshift-image-registry --sort-by=".spec.nodeName"
cat <<EOF > /data/ocp4/monitoring-cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: cluster-monitoring-config
namespace: openshift-monitoring
data:
config.yaml: |+
alertmanagerMain:
nodeSelector:
node-role.kubernetes.io/infra: ""
prometheusK8s:
nodeSelector:
node-role.kubernetes.io/infra: ""
volumeClaimTemplate:
metadata:
name: localpvc
spec:
storageClassName: local-sc
resources:
requests:
storage: 400Gi
prometheusOperator:
nodeSelector:
node-role.kubernetes.io/infra: ""
grafana:
nodeSelector:
node-role.kubernetes.io/infra: ""
k8sPrometheusAdapter:
nodeSelector:
node-role.kubernetes.io/infra: ""
kubeStateMetrics:
nodeSelector:
node-role.kubernetes.io/infra: ""
telemeterClient:
nodeSelector:
node-role.kubernetes.io/infra: ""
EOF
oc apply -f /data/ocp4/monitoring-cm.yaml -n openshift-monitoring
oc get pods -n openshift-monitoring -o wide --sort-by=".spec.nodeName"
###########################################
## add user for zte
cd /data/ocp4
touch /data/ocp4/htpasswd
htpasswd -B /data/ocp4/htpasswd zteca
htpasswd -B /data/ocp4/htpasswd zteadm
oc create secret generic htpasswd --from-file=/data/ocp4/htpasswd -n openshift-config
oc apply -f - <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
name: cluster
spec:
identityProviders:
- name: Local Password
mappingMethod: claim
type: HTPasswd
htpasswd:
fileData:
name: htpasswd
EOF
watch oc get pod -n openshift-authentication
oc adm policy add-cluster-role-to-user cluster-admin zteca
oc new-project zte
oc adm policy add-role-to-user admin zteadm -n zte
oc get clusterrolebinding.rbac
oc get clusterrole.rbac
oc adm policy add-cluster-role-to-user cluster-reader zteadm
oc adm policy remove-cluster-role-from-user cluster-reader zteadm
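# (optional) smoke-test the htpasswd login once the authentication pods roll; the api endpoint is an assumption based on the ocpsc.redhat.ren cluster domain used elsewhere
oc login https://api.ocpsc.redhat.ren:6443 -u zteadm
oc whoami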
#########################################
# add more rhel-ansible-host
# scp vars-static.yaml to helper
cd /data/ocp4/ocp4-upi-helpernode
ansible-playbook -e @vars-static.yaml -e staticips=true tasks/main.yml
ssh-copy-id root@worker-0.ocpsc.redhat.ren
cat <<EOF > /data/ocp4/rhel-ansible-host
[all:vars]
ansible_user=root
#ansible_become=True
openshift_kubeconfig_path="/data/ocp4/auth/kubeconfig"
[workers]
infra-0.ocpsc.redhat.ren
infra-1.ocpsc.redhat.ren
[new_workers]
worker-0.ocpsc.redhat.ren
EOF
ansible-playbook -i /data/ocp4/rhel-ansible-host /usr/share/ansible/openshift-ansible/playbooks/scaleup.yml
#########################################
# add more rhel-ansible-host
cat << EOF > /etc/yum/pluginconf.d/subscription-manager.conf
[main]
enabled=0
EOF
# scp vars-static.yaml to helper
cd /data/ocp4/ocp4-upi-helpernode
ansible-playbook -e @vars-static.yaml -e staticips=true tasks/main.yml
ssh-copy-id root@worker-1.ocpsc.redhat.ren
ssh-copy-id root@worker-2.ocpsc.redhat.ren
cat <<EOF > /data/ocp4/rhel-ansible-host
[all:vars]
ansible_user=root
#ansible_become=True
openshift_kubeconfig_path="/data/ocp4/auth/kubeconfig"
[workers]
infra-0.ocpsc.redhat.ren
infra-1.ocpsc.redhat.ren
worker-0.ocpsc.redhat.ren
[new_workers]
worker-1.ocpsc.redhat.ren
worker-2.ocpsc.redhat.ren
EOF
ansible-playbook -i /data/ocp4/rhel-ansible-host /usr/share/ansible/openshift-ansible/playbooks/scaleup.yml
#########################################
# add worker-3 rhel-ansible-host
# upload vars-static.yaml
cd /data/ocp4/ocp4-upi-helpernode
ansible-playbook -e @vars-static.yaml -e staticips=true tasks/main.yml
cat << EOF > /etc/yum/pluginconf.d/subscription-manager.conf
[main]
enabled=0
EOF
# scp vars-static.yaml to helper
cd /data/ocp4/ocp4-upi-helpernode
ansible-playbook -e @vars-static.yaml -e staticips=true tasks/main.yml
ssh-copy-id root@worker-3.ocpsc.redhat.ren
cat <<EOF > /data/ocp4/rhel-ansible-host
[all:vars]
ansible_user=root
#ansible_become=True
openshift_kubeconfig_path="/data/ocp4/auth/kubeconfig"
[workers]
infra-0.ocpsc.redhat.ren
infra-1.ocpsc.redhat.ren
worker-0.ocpsc.redhat.ren
worker-1.ocpsc.redhat.ren
worker-2.ocpsc.redhat.ren
[new_workers]
worker-3.ocpsc.redhat.ren
EOF
ansible-playbook -i /data/ocp4/rhel-ansible-host /usr/share/ansible/openshift-ansible/playbooks/scaleup.yml
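# (optional) after each scaleup run, confirm the new RHEL worker registered and is Ready
oc get nodes -o wide
oc get nodes -l node-role.kubernetes.io/worker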
helper node day2 sec
cat << EOF > wzh.script
#!/bin/bash
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -s 127.0.0.1/32 -j ACCEPT
iptables -A INPUT -s 223.87.20.0/24 -j ACCEPT
iptables -A INPUT -s 117.177.241.0/24 -j ACCEPT
iptables -A INPUT -s 39.134.200.0/24 -j ACCEPT
iptables -A INPUT -s 39.134.201.0/24 -j ACCEPT
iptables -A INPUT -s 39.137.101.0/24 -j ACCEPT
iptables -A INPUT -s 192.168.7.0/24 -j ACCEPT
iptables -A INPUT -s 112.44.102.224/27 -j ACCEPT
iptables -A INPUT -s 47.93.86.113/32 -j ACCEPT
iptables -A INPUT -s 39.134.204.0/24 -j ACCEPT
iptables -A INPUT -p tcp -j REJECT
iptables -A INPUT -p udp -j REJECT
EOF
var_local=$(cat ./wzh.script | python3 -c "import sys, urllib.parse; print(urllib.parse.quote(''.join(sys.stdin.readlines())))" )
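# (optional) sanity check: decode the encoded script back and eyeball the first lines
python3 -c "import sys, urllib.parse; print(urllib.parse.unquote(sys.argv[1]))" "$var_local" | head -n 3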
cat <<EOF > 45-wzh-service.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: master
name: 45-wzh-service
spec:
config:
ignition:
version: 2.2.0
storage:
files:
- contents:
source: data:text/plain,${var_local}
verification: {}
filesystem: root
mode: 0755
path: /etc/rc.d/wzh.local
systemd:
units:
- name: wzh.service
enabled: true
contents: |
[Unit]
Description=/etc/rc.d/wzh.local Compatibility
Documentation=zhengwan@redhat.com
ConditionFileIsExecutable=/etc/rc.d/wzh.local
After=network.target
[Service]
Type=oneshot
User=root
Group=root
ExecStart=/bin/bash -c /etc/rc.d/wzh.local
[Install]
WantedBy=multi-user.target
EOF
oc apply -f 45-wzh-service.yaml -n openshift-config
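# (optional) watch the master pool pick up and roll the new MachineConfig
oc get mc | grep wzh
oc get mcp master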
helper node quay
# on helper node
firewall-cmd --permanent --zone=public --add-port=4443/tcp
firewall-cmd --reload
podman pod create --infra-image registry.redhat.ren:5443/gcr.io/google_containers/pause-amd64:3.0 --name quay -p 4443:8443
cd /data
rm -rf /data/quay
podman run -d --name quay-fs --entrypoint "tail" registry.redhat.ren:5443/docker.io/wangzheng422/quay-fs:3.2.0-init -f /dev/null
podman cp quay-fs:/quay.tgz /data/
tar zxf quay.tgz
podman rm -fv quay-fs
export MYSQL_CONTAINER_NAME=quay-mysql
export MYSQL_DATABASE=enterpriseregistrydb
export MYSQL_PASSWORD=zvbk3fzp5f5m2a8j
export MYSQL_USER=quayuser
export MYSQL_ROOT_PASSWORD=q98u335musckfqxe
podman run \
--detach \
--restart=always \
--env MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PASSWORD} \
--env MYSQL_USER=${MYSQL_USER} \
--env MYSQL_PASSWORD=${MYSQL_PASSWORD} \
--env MYSQL_DATABASE=${MYSQL_DATABASE} \
--name ${MYSQL_CONTAINER_NAME} \
--privileged=true \
--pod quay \
-v /data/quay/lib/mysql:/var/lib/mysql/data:Z \
registry.redhat.ren:5443/registry.access.redhat.com/rhscl/mysql-57-rhel7
podman run -d --restart=always \
--pod quay \
--privileged=true \
--name quay-redis \
-v /data/quay/lib/redis:/var/lib/redis/data:Z \
registry.redhat.ren:5443/registry.access.redhat.com/rhscl/redis-32-rhel7
sleep 10
/bin/cp -f /data/cert/redhat.ren.crt /data/quay/config/extra_ca_certs/redhat.ren.crt
/bin/cp -f /data/cert/redhat.ren.crt /data/quay/config/ssl.cert
/bin/cp -f /data/cert/redhat.ren.key /data/quay/config/ssl.key
podman run --restart=always \
--sysctl net.core.somaxconn=4096 \
--privileged=true \
--name quay-master \
--pod quay \
--add-host mysql:127.0.0.1 \
--add-host redis:127.0.0.1 \
--add-host clair:127.0.0.1 \
-v /data/quay/config:/conf/stack:Z \
-v /data/quay/storage:/datastorage:Z \
-d registry.redhat.ren:5443/quay.io/redhat/quay:v3.2.1
# https://registry.redhat.ren:4443/
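# (optional) check the pod members and Quay's health endpoint (health/instance is the usual Quay 3 check)
podman ps --pod
curl -k https://registry.redhat.ren:4443/health/instance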
podman run --name clair-postgres --pod quay \
-v /data/quay/lib/postgresql/data:/var/lib/postgresql/data:Z \
-d registry.redhat.ren:5443/docker.io/library/postgres
# change /data/quay/clair-config/config.yaml
# https://registry.redhat.ren:4443 -> https://registry.redhat.ren:8443
podman run --restart=always -d \
--name clair \
-v /data/quay/clair-config:/clair/config:Z \
-v /data/quay/clair-config/ca.crt:/etc/pki/ca-trust/source/anchors/ca.crt \
--pod quay \
registry.redhat.ren:5443/quay.io/redhat/clair-jwt:v3.2.1
# stop and restart
podman stop clair
podman stop clair-postgres
podman stop quay-master
podman stop quay-redis
podman stop quay-mysql
podman rm quay-master
podman rm quay-redis
podman rm quay-mysql
podman rm clair
podman rm clair-postgres
podman pod ps
podman pod stop quay
podman pod rm quay
helper node zte oper
cd /data/ocp4/zte
# create the project first if it does not exist: oc new-project zxcdn
oc project zxcdn
oc adm policy add-role-to-user admin zteadm -n zxcdn
oc create serviceaccount -n zxcdn zxcdn-app
oc adm policy add-scc-to-user privileged -z zxcdn-app -n zxcdn
# oc adm policy remove-scc-from-user privileged -z zxcdn-app
oc get networks.operator.openshift.io cluster -o yaml
oc apply -f zte-macvlan.yaml
oc apply -f slbl7-configmap.yaml
# oc apply -f slbl7-deployment.yaml
oc apply -f slbl7-pod.yaml
oc apply -f ottcache-configmap.yaml
oc apply -f ottcache-pod.yaml
# oc apply -f ott-service.yaml
oc delete -f slbl7-pod.yaml
oc delete -f ottcache-pod.yaml
## web cache
oc apply -f slb-configmap.yaml
oc apply -f slb-deployment.yaml
oc delete -f slb-deployment.yaml
oc apply -f webcache-configmap.yaml
oc apply -f webcache-deployment.yaml
oc delete -f webcache-deployment.yaml
helper host add vm-router
cd /data/ocp4/ocp4-upi-helpernode
ansible-playbook -e @vars-static.yaml -e staticips=true tasks/config.files.yml
# upload install-config.yaml to helper /data/ocp4
cd /data/ocp4
/bin/cp -f worker.ign /var/www/html/ignition/router-0.ign
/bin/cp -f worker.ign /var/www/html/ignition/router-1.ign
/bin/cp -f worker.ign /var/www/html/ignition/router-2.ign
/bin/cp -f worker.ign /var/www/html/ignition/router-3.ign
/bin/cp -f worker.ign /var/www/html/ignition/router-4.ign
/bin/cp -f worker.ign /var/www/html/ignition/router-5.ign
/bin/cp -f worker.ign /var/www/html/ignition/router-6.ign
/bin/cp -f worker.ign /var/www/html/ignition/router-7.ign
/bin/cp -f worker.ign /var/www/html/ignition/router-8.ign
chmod 644 /var/www/html/ignition/*
export NGINX_DIRECTORY=/data/ocp4
export RHCOSVERSION=4.3.0
export VOLID=$(isoinfo -d -i ${NGINX_DIRECTORY}/rhcos-${RHCOSVERSION}-x86_64-installer.iso | awk '/Volume id/ { print $3 }')
TEMPDIR=$(mktemp -d)
echo $VOLID
echo $TEMPDIR
cd ${TEMPDIR}
# Extract the ISO content using guestfish (to avoid sudo mount)
guestfish -a ${NGINX_DIRECTORY}/rhcos-${RHCOSVERSION}-x86_64-installer.iso \
-m /dev/sda tar-out / - | tar xvf -
# Helper function to modify the config files
modify_cfg(){
for file in "EFI/redhat/grub.cfg" "isolinux/isolinux.cfg"; do
# Append the proper image and ignition urls
sed -e '/coreos.inst=yes/s|$| coreos.inst.install_dev=vda coreos.inst.image_url='"${URL}"'\/install\/'"${BIOSMODE}"'.raw.gz coreos.inst.ignition_url='"${URL}"'\/ignition\/'"${NODE}"'.ign ip='"${IP}"'::'"${GATEWAY}"':'"${NETMASK}"':'"${FQDN}"':'"${NET_INTERFACE}"':none:'"${DNS}"' nameserver='"${DNS}"'|' ${file} > $(pwd)/${NODE}_${file##*/}
# Boot directly in the installation
sed -i -e 's/default vesamenu.c32/default linux/g' -e 's/timeout 600/timeout 10/g' $(pwd)/${NODE}_${file##*/}
done
}
URL="http://117.177.241.16:8080/"
GATEWAY="117.177.241.1"
NETMASK="255.255.255.0"
DNS="117.177.241.16"
NODE="router-0"
IP="117.177.241.243"
FQDN="vm-router-0"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
NODE="router-1"
IP="117.177.241.244"
FQDN="vm-router-1"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
NODE="router-2"
IP="117.177.241.245"
FQDN="vm-router-2"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
NODE="router-3"
IP="117.177.241.246"
FQDN="vm-router-3"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
NODE="router-4"
IP="117.177.241.247"
FQDN="vm-router-4"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
NODE="router-5"
IP="117.177.241.248"
FQDN="vm-router-5"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
NODE="router-6"
IP="117.177.241.249"
FQDN="vm-router-6"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
NODE="router-7"
IP="117.177.241.250"
FQDN="vm-router-7"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
NODE="router-8"
IP="117.177.241.251"
FQDN="vm-router-8"
BIOSMODE="bios"
NET_INTERFACE="ens3"
modify_cfg
# Generate the images, one per node as the IP configuration is different...
# https://github.com/coreos/coreos-assembler/blob/master/src/cmd-buildextend-installer#L97-L103
for node in router-0 router-1 router-2 router-3 router-4 router-5 router-6 router-7 router-8; do
# Overwrite the grub.cfg and isolinux.cfg files for each node type
for file in "EFI/redhat/grub.cfg" "isolinux/isolinux.cfg"; do
/bin/cp -f $(pwd)/${node}_${file##*/} ${file}
done
# As regular user!
genisoimage -verbose -rock -J -joliet-long -volset ${VOLID} \
-eltorito-boot isolinux/isolinux.bin -eltorito-catalog isolinux/boot.cat \
-no-emul-boot -boot-load-size 4 -boot-info-table \
-eltorito-alt-boot -efi-boot images/efiboot.img -no-emul-boot \
-o ${NGINX_DIRECTORY}/${node}.iso .
done
# Optionally, clean up
cd /data/ocp4
rm -Rf ${TEMPDIR}
cd ${NGINX_DIRECTORY}
scp router-*.iso root@117.177.241.21:/data/ocp4/
# after vm on bootstrap created
oc get csr
oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
oc label node vm-router-0.ocpsc.redhat.ren node-role.kubernetes.io/router=''
oc label node vm-router-1.ocpsc.redhat.ren node-role.kubernetes.io/router=''
oc label node vm-router-2.ocpsc.redhat.ren node-role.kubernetes.io/router=''
oc label node vm-router-3.ocpsc.redhat.ren node-role.kubernetes.io/router=''
oc label node vm-router-4.ocpsc.redhat.ren node-role.kubernetes.io/router=''
# oc label node vm-router-5.ocpsc.redhat.ren node-role.kubernetes.io/router=''
# oc label node vm-router-6.ocpsc.redhat.ren node-role.kubernetes.io/router=''
# oc label node vm-router-7.ocpsc.redhat.ren node-role.kubernetes.io/router=''
# oc label node vm-router-8.ocpsc.redhat.ren node-role.kubernetes.io/router=''
##########################
## secure the router vm
cat << EOF > router.mcp.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
name: router
spec:
machineConfigSelector:
matchExpressions:
- {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,router]}
nodeSelector:
matchLabels:
node-role.kubernetes.io/router: ""
EOF
oc apply -f router.mcp.yaml
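# (optional) confirm the new pool matches the router nodes before pushing configs to it
oc get mcp router
oc get nodes -l node-role.kubernetes.io/router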
cat << EOF > wzh.script
#!/bin/bash
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -s 127.0.0.1/32 -j ACCEPT
iptables -A INPUT -s 223.87.20.0/24 -j ACCEPT
iptables -A INPUT -s 117.177.241.0/24 -j ACCEPT
iptables -A INPUT -s 39.134.200.0/24 -j ACCEPT
iptables -A INPUT -s 39.134.201.0/24 -j ACCEPT
iptables -A INPUT -s 39.137.101.0/24 -j ACCEPT
iptables -A INPUT -s 192.168.7.0/24 -j ACCEPT
iptables -A INPUT -s 112.44.102.224/27 -j ACCEPT
iptables -A INPUT -s 47.93.86.113/32 -j ACCEPT
iptables -A INPUT -p tcp -j REJECT
iptables -A INPUT -p udp -j REJECT
EOF
var_local=$(cat ./wzh.script | python3 -c "import sys, urllib.parse; print(urllib.parse.quote(''.join(sys.stdin.readlines())))" )
cat <<EOF > 45-router-wzh-service.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: router
name: 45-router-wzh-service
spec:
config:
ignition:
version: 2.2.0
storage:
files:
- contents:
source: data:text/plain,${var_local}
verification: {}
filesystem: root
mode: 0755
path: /etc/rc.d/wzh.local
systemd:
units:
- name: wzh.service
enabled: true
contents: |
[Unit]
Description=/etc/rc.d/wzh.local Compatibility
Documentation=zhengwan@redhat.com
ConditionFileIsExecutable=/etc/rc.d/wzh.local
After=network.target
[Service]
Type=oneshot
User=root
Group=root
ExecStart=/bin/bash -c /etc/rc.d/wzh.local
[Install]
WantedBy=multi-user.target
EOF
oc apply -f 45-router-wzh-service.yaml -n openshift-config
# DO NOT
# cp 99-master-zzz-container-registries.yaml 99-router-zzz-container-registries.yaml
# # change: machineconfiguration.openshift.io/role: router
# oc apply -f ./99-router-zzz-container-registries.yaml -n openshift-config
# on helper node
cat << EOF > /etc/docker-distribution/registry/config.yml
version: 0.1
log:
fields:
service: registry
storage:
cache:
layerinfo: inmemory
filesystem:
rootdirectory: /data/registry
delete:
enabled: true
http:
addr: :5443
tls:
certificate: /data/cert/redhat.ren.crt
key: /data/cert/redhat.ren.key
EOF
systemctl restart docker-distribution
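# (optional) confirm docker-distribution answers the v2 API on 5443 again
curl -sk https://registry.redhat.ren:5443/v2/_catalog | jq .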
helper node zte tcp-router
oc project openshift-ingress
# install the tcp-router and demo
oc create configmap customrouter-wzh --from-file=haproxy-config.template
oc apply -f haproxy.router.yaml
oc project zxcdn
oc apply -f ott-service.tcp.route.yaml
helper node cluster tuning
# tuning for pids.max
oc label mcp worker custom-kubelet-pod-pids-limit=true
cat << EOF > PodPidsLimit.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
name: pod-pids-limit
spec:
machineConfigPoolSelector:
matchLabels:
custom-kubelet-pod-pids-limit: 'true'
kubeletConfig:
    podPidsLimit: 4096
EOF
oc apply -f PodPidsLimit.yaml
cat << EOF > crio.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
name: set-log-and-pid
spec:
machineConfigPoolSelector:
matchLabels:
custom-kubelet-pod-pids-limit: 'true'
containerRuntimeConfig:
pidsLimit: 10240
EOF
oc apply -f crio.yaml
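# (optional) after the worker pool rolls, spot-check the rendered pids limit on one worker (the file path is an assumption for a 4.3 node)
oc get mcp worker
oc debug node/worker-0.ocpsc.redhat.ren -- chroot /host grep -i pids /etc/crio/crio.conf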
helper node local storage
https://docs.openshift.com/container-platform/4.3/storage/persistent_storage/persistent-storage-local.html
oc new-project local-storage
apiVersion: "local.storage.openshift.io/v1"
kind: "LocalVolume"
metadata:
name: "local-disks"
namespace: "local-storage"
spec:
nodeSelector:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- infra0.hsc.redhat.ren
- infra1.hsc.redhat.ren
storageClassDevices:
- storageClassName: "local-sc"
volumeMode: Filesystem
fsType: xfs
devicePaths:
- /dev/datavg/monitorlv
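# a sketch: save the LocalVolume above to a file and apply it, assuming the Local Storage Operator is already installed from the catalog above (the file name local-disks.yaml is just an example)
oc apply -f local-disks.yaml -n local-storage
oc get sc local-sc
oc get pv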
bootstrap node day1
##########################################################
## on bootstrap
yum -y install tigervnc-server tigervnc gnome-terminal gnome-session gnome-classic-session gnome-terminal nautilus-open-terminal control-center liberation-mono-fonts google-noto-sans-cjk-fonts google-noto-sans-fonts fonts-tweak-tool
yum install -y qgnomeplatform xdg-desktop-portal-gtk NetworkManager-libreswan-gnome PackageKit-command-not-found PackageKit-gtk3-module abrt-desktop at-spi2-atk at-spi2-core avahi baobab caribou caribou-gtk2-module caribou-gtk3-module cheese compat-cheese314 control-center dconf empathy eog evince evince-nautilus file-roller file-roller-nautilus firewall-config firstboot fprintd-pam gdm gedit glib-networking gnome-bluetooth gnome-boxes gnome-calculator gnome-classic-session gnome-clocks gnome-color-manager gnome-contacts gnome-dictionary gnome-disk-utility gnome-font-viewer gnome-getting-started-docs gnome-icon-theme gnome-icon-theme-extras gnome-icon-theme-symbolic gnome-initial-setup gnome-packagekit gnome-packagekit-updater gnome-screenshot gnome-session gnome-session-xsession gnome-settings-daemon gnome-shell gnome-software gnome-system-log gnome-system-monitor gnome-terminal gnome-terminal-nautilus gnome-themes-standard gnome-tweak-tool nm-connection-editor orca redhat-access-gui sane-backends-drivers-scanners seahorse setroubleshoot sushi totem totem-nautilus vinagre vino xdg-user-dirs-gtk yelp
yum install -y cjkuni-uming-fonts dejavu-sans-fonts dejavu-sans-mono-fonts dejavu-serif-fonts gnu-free-mono-fonts gnu-free-sans-fonts gnu-free-serif-fonts google-crosextra-caladea-fonts google-crosextra-carlito-fonts google-noto-emoji-fonts jomolhari-fonts khmeros-base-fonts liberation-mono-fonts liberation-sans-fonts liberation-serif-fonts lklug-fonts lohit-assamese-fonts lohit-bengali-fonts lohit-devanagari-fonts lohit-gujarati-fonts lohit-kannada-fonts lohit-malayalam-fonts lohit-marathi-fonts lohit-nepali-fonts lohit-oriya-fonts lohit-punjabi-fonts lohit-tamil-fonts lohit-telugu-fonts madan-fonts nhn-nanum-gothic-fonts open-sans-fonts overpass-fonts paktype-naskh-basic-fonts paratype-pt-sans-fonts sil-abyssinica-fonts sil-nuosu-fonts sil-padauk-fonts smc-meera-fonts stix-fonts thai-scalable-waree-fonts ucs-miscfixed-fonts vlgothic-fonts wqy-microhei-fonts wqy-zenhei-fonts
vncpasswd
cat << EOF > ~/.vnc/xstartup
#!/bin/sh
unset SESSION_MANAGER
unset DBUS_SESSION_BUS_ADDRESS
gnome-session &
EOF
chmod +x ~/.vnc/xstartup
vncserver :1 -geometry 1280x800
# if you want to stop the vnc server, do this
vncserver -kill :1
firewall-cmd --permanent --add-port=6001/tcp
firewall-cmd --permanent --add-port=5901/tcp
firewall-cmd --reload
# set up the kvm environment
yum -y install qemu-kvm libvirt libvirt-python libguestfs-tools virt-install virt-viewer virt-manager
systemctl enable libvirtd
systemctl start libvirtd
brctl show
virsh net-list
cat << EOF > /data/virt-net.xml
<network>
<name>br0</name>
<forward mode='bridge'>
<bridge name='br0'/>
</forward>
</network>
EOF
virsh net-define --file /data/virt-net.xml
virsh net-dumpxml br0
# virsh net-undefine openshift4
# virsh net-destroy openshift4
virsh net-autostart br0
virsh net-start br0
cp /etc/sysconfig/network-scripts/ifcfg-em1 /etc/sysconfig/network-scripts/ifcfg-em1.orig
cat << EOF > /etc/sysconfig/network-scripts/ifcfg-em1
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=em1
DEVICE=em1
ONBOOT=yes
# IPADDR=117.177.241.21
# PREFIX=24
# GATEWAY=117.177.241.1
IPV6_PRIVACY=no
# DNS1=117.177.241.16
BRIDGE=br0
EOF
cat <<EOF > /etc/sysconfig/network-scripts/ifcfg-br0
TYPE=Bridge
BOOTPROTO=static
IPADDR=117.177.241.21
GATEWAY=117.177.241.1
DNS1=117.177.241.16
ONBOOT=yes
DEFROUTE=yes
NAME=br0
DEVICE=br0
PREFIX=24
EOF
systemctl restart network
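# (optional) confirm br0 came up with the expected address and the gateway is reachable
ip addr show br0
ping -c 2 117.177.241.1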
virt-install --name=ocp4-bootstrap --vcpus=2 --ram=16384 \
--disk path=/data/kvm/ocp4-bootstrap.qcow2,bus=virtio,size=200 \
--os-variant rhel8.0 --network bridge=br0,model=virtio \
--boot menu=on --cdrom /data/ocp4/bootstrap-static.iso
virt-install --name=ocp4-master0 --vcpus=8 --ram=65536 \
--disk path=/data/kvm/ocp4-master0.qcow2,bus=virtio,size=200 \
--os-variant rhel8.0 --network bridge=br0,model=virtio \
--boot menu=on --cdrom /data/ocp4/master-0.iso
# virt-install --name=ocp4-master1 --vcpus=20 --ram=200704 \
# --disk path=/data/kvm/ocp4-master1.qcow2,bus=virtio,size=200 \
# --os-variant rhel8.0 --network bridge=br0,model=virtio \
# --boot menu=on --cdrom /data/ocp4/master-1.iso
virt-install --name=ocp4-master2 --vcpus=8 --ram=65536 \
--disk path=/data/kvm/ocp4-master2.qcow2,bus=virtio,size=200 \
--os-variant rhel8.0 --network bridge=br0,model=virtio \
--boot menu=on --cdrom /data/ocp4/master-2.iso
virt-install --name=ocp4-worker0 --vcpus=4 --ram=32768 \
--disk path=/data/kvm/ocp4-worker0.qcow2,bus=virtio,size=200 \
--os-variant rhel8.0 --network bridge=br0,model=virtio \
--boot menu=on --cdrom /data/ocp4/worker-0.iso
virt-install --name=ocp4-worker1 --vcpus=4 --ram=32768 \
--disk path=/data/kvm/ocp4-worker1.qcow2,bus=virtio,size=200 \
--os-variant rhel8.0 --network bridge=br0,model=virtio \
--boot menu=on --cdrom /data/ocp4/worker-1.iso
cd /data/kvm
tar -cvf - ocp4-master0.qcow2 | pigz -c > /data/kvm/ocp4-master0.qcow2.tgz
rsync -e "ssh -c chacha20-poly1305@openssh.com" --info=progress2 -P -arz /data/kvm/ocp4-master0.qcow2.tgz root@117.177.241.18:/data/kvm/
tar -cvf - ocp4-master2.qcow2 | pigz -c > /data/kvm/ocp4-master2.qcow2.tgz
rsync -e "ssh -c chacha20-poly1305@openssh.com" --info=progress2 -P -arz /data/kvm/ocp4-master2.qcow2.tgz root@117.177.241.22:/data/kvm/
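# (optional) compare checksums on both ends before importing the images on the target hosts
sha256sum /data/kvm/ocp4-master0.qcow2.tgz /data/kvm/ocp4-master2.qcow2.tgz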
# anti scan
firewall-cmd --permanent --new-ipset=my-allow-list --type=hash:net
firewall-cmd --permanent --get-ipsets
cat > /root/iplist.txt <<EOL
127.0.0.1/32
223.87.20.0/24
117.177.241.0/24
39.134.200.0/24
39.134.201.0/24
39.137.101.0/24
192.168.7.0/24
112.44.102.224/27
47.93.86.113/32
EOL
firewall-cmd --permanent --ipset=my-allow-list --add-entries-from-file=/root/iplist.txt
firewall-cmd --permanent --ipset=my-allow-list --get-entries
firewall-cmd --permanent --zone=trusted --add-source=ipset:my-allow-list
firewall-cmd --reload
firewall-cmd --list-all
firewall-cmd --get-active-zones
firewall-cmd --set-default-zone=block
firewall-cmd --runtime-to-permanent
firewall-cmd --reload
# https://access.redhat.com/solutions/39604
virsh list
virsh dump ocp4-router-0 /data/tmp/ocp4-router-0.dump --memory-only --verbose
virsh dump ocp4-router-1 /data/tmp/ocp4-router-1.dump --memory-only --verbose
virsh dump ocp4-router-2 /data/tmp/ocp4-router-2.dump --memory-only --verbose
virsh dump ocp4-router-3 /data/tmp/ocp4-router-3.dump --memory-only --verbose
cd /data
tar -cvf - tmp/ | pigz -c > virsh.dump.tgz
################################
## add more router vm
virt-install --name=ocp4-router-0 --vcpus=4 --ram=16384 \
--disk path=/data/kvm/ocp4-router-0.qcow2,bus=virtio,size=200 \
--os-variant rhel8.0 --network bridge=br0,model=virtio \
--boot menu=on --cdrom /data/ocp4/router-0.iso
virt-install --name=ocp4-router-1 --vcpus=4 --ram=16384 \
--disk path=/data/kvm/ocp4-router-1.qcow2,bus=virtio,size=200 \
--os-variant rhel8.0 --network bridge=br0,model=virtio \
--boot menu=on --cdrom /data/ocp4/router-1.iso
virt-install --name=ocp4-router-2 --vcpus=4 --ram=16384 \
--disk path=/data/kvm/ocp4-router-2.qcow2,bus=virtio,size=200 \
--os-variant rhel8.0 --network bridge=br0,model=virtio \
--boot menu=on --cdrom /data/ocp4/router-2.iso
virt-install --name=ocp4-router-3 --vcpus=4 --ram=16384 \
--disk path=/data/kvm/ocp4-router-3.qcow2,bus=virtio,size=200 \
--os-variant rhel8.0 --network bridge=br0,model=virtio \
--boot menu=on --cdrom /data/ocp4/router-3.iso
virt-install --name=ocp4-router-4 --vcpus=4 --ram=16384 \
--disk path=/data/kvm/ocp4-router-4.qcow2,bus=virtio,size=200 \
--os-variant rhel8.0 --network bridge=br0,model=virtio \
--boot menu=on --cdrom /data/ocp4/router-4.iso
# virt-install --name=ocp4-router-5 --vcpus=2 --ram=8192 \
# --disk path=/data/kvm/ocp4-router-5.qcow2,bus=virtio,size=200 \
# --os-variant rhel8.0 --network bridge=br0,model=virtio \
# --boot menu=on --cdrom /data/ocp4/router-5.iso
# virt-install --name=ocp4-router-6 --vcpus=2 --ram=8192 \
# --disk path=/data/kvm/ocp4-router-6.qcow2,bus=virtio,size=200 \
# --os-variant rhel8.0 --network bridge=br0,model=virtio \
# --boot menu=on --cdrom /data/ocp4/router-6.iso
# virt-install --name=ocp4-router-7 --vcpus=2 --ram=8192 \
# --disk path=/data/kvm/ocp4-router-7.qcow2,bus=virtio,size=200 \
# --os-variant rhel8.0 --network bridge=br0,model=virtio \
# --boot menu=on --cdrom /data/ocp4/router-7.iso
# virt-install --name=ocp4-router-8 --vcpus=2 --ram=8192 \
# --disk path=/data/kvm/ocp4-router-8.qcow2,bus=virtio,size=200 \
# --os-variant rhel8.0 --network bridge=br0,model=virtio \
# --boot menu=on --cdrom /data/ocp4/router-8.iso
# helper node operation
master1 node day1
##########################################################
## on master1
yum -y install tigervnc-server tigervnc gnome-terminal gnome-session gnome-classic-session gnome-terminal nautilus-open-terminal control-center liberation-mono-fonts google-noto-sans-cjk-fonts google-noto-sans-fonts fonts-tweak-tool
yum install -y qgnomeplatform xdg-desktop-portal-gtk NetworkManager-libreswan-gnome PackageKit-command-not-found PackageKit-gtk3-module abrt-desktop at-spi2-atk at-spi2-core avahi baobab caribou caribou-gtk2-module caribou-gtk3-module cheese compat-cheese314 control-center dconf empathy eog evince evince-nautilus file-roller file-roller-nautilus firewall-config firstboot fprintd-pam gdm gedit glib-networking gnome-bluetooth gnome-boxes gnome-calculator gnome-classic-session gnome-clocks gnome-color-manager gnome-contacts gnome-dictionary gnome-disk-utility gnome-font-viewer gnome-getting-started-docs gnome-icon-theme gnome-icon-theme-extras gnome-icon-theme-symbolic gnome-initial-setup gnome-packagekit gnome-packagekit-updater gnome-screenshot gnome-session gnome-session-xsession gnome-settings-daemon gnome-shell gnome-software gnome-system-log gnome-system-monitor gnome-terminal gnome-terminal-nautilus gnome-themes-standard gnome-tweak-tool nm-connection-editor orca redhat-access-gui sane-backends-drivers-scanners seahorse setroubleshoot sushi totem totem-nautilus vinagre vino xdg-user-dirs-gtk yelp
yum install -y cjkuni-uming-fonts dejavu-sans-fonts dejavu-sans-mono-fonts dejavu-serif-fonts gnu-free-mono-fonts gnu-free-sans-fonts gnu-free-serif-fonts google-crosextra-caladea-fonts google-crosextra-carlito-fonts google-noto-emoji-fonts jomolhari-fonts khmeros-base-fonts liberation-mono-fonts liberation-sans-fonts liberation-serif-fonts lklug-fonts lohit-assamese-fonts lohit-bengali-fonts lohit-devanagari-fonts lohit-gujarati-fonts lohit-kannada-fonts lohit-malayalam-fonts lohit-marathi-fonts lohit-nepali-fonts lohit-oriya-fonts lohit-punjabi-fonts lohit-tamil-fonts lohit-telugu-fonts madan-fonts nhn-nanum-gothic-fonts open-sans-fonts overpass-fonts paktype-naskh-basic-fonts paratype-pt-sans-fonts sil-abyssinica-fonts sil-nuosu-fonts sil-padauk-fonts smc-meera-fonts stix-fonts thai-scalable-waree-fonts ucs-miscfixed-fonts vlgothic-fonts wqy-microhei-fonts wqy-zenhei-fonts
vncpasswd
cat << EOF > ~/.vnc/xstartup
#!/bin/sh
unset SESSION_MANAGER
unset DBUS_SESSION_BUS_ADDRESS
gnome-session &
EOF
chmod +x ~/.vnc/xstartup
vncserver :1 -geometry 1280x800
# if you want to stop the vnc server, do this
vncserver -kill :1
firewall-cmd --permanent --add-port=6001/tcp
firewall-cmd --permanent --add-port=5901/tcp
firewall-cmd --reload
# set up the kvm environment
yum -y install qemu-kvm libvirt libvirt-python libguestfs-tools virt-install virt-viewer virt-manager
systemctl enable libvirtd
systemctl start libvirtd
brctl show
virsh net-list
cat << EOF > /data/virt-net.xml
<network>
<name>br0</name>
<forward mode='bridge'>
<bridge name='br0'/>
</forward>
</network>
EOF
virsh net-define --file /data/virt-net.xml
virsh net-dumpxml br0
# virsh net-undefine openshift4
# virsh net-destroy openshift4
virsh net-autostart br0
virsh net-start br0
cp /etc/sysconfig/network-scripts/ifcfg-em1 /etc/sysconfig/network-scripts/ifcfg-em1.orig
cat << EOF > /etc/sysconfig/network-scripts/ifcfg-em1
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=em1
DEVICE=em1
ONBOOT=yes
# IPADDR=117.177.241.17
# PREFIX=24
# GATEWAY=117.177.241.1
IPV6_PRIVACY=no
# DNS1=117.177.241.16
BRIDGE=br0
EOF
cat <<EOF > /etc/sysconfig/network-scripts/ifcfg-br0
TYPE=Bridge
BOOTPROTO=static
IPADDR=117.177.241.17
GATEWAY=117.177.241.1
DNS1=117.177.241.16
ONBOOT=yes
DEFROUTE=yes
NAME=br0
DEVICE=br0
PREFIX=24
EOF
systemctl restart network
virt-install --name=ocp4-master1 --vcpus=20 --ram=200704 \
--disk path=/data/kvm/ocp4-master1.qcow2,bus=virtio,size=200 \
--os-variant rhel8.0 --network bridge=br0,model=virtio \
--boot menu=on --cdrom /data/ocp4/master-1.iso
virsh list --all
virsh start ocp4-master1
# anti scan
firewall-cmd --permanent --new-ipset=my-allow-list --type=hash:net
firewall-cmd --permanent --get-ipsets
cat > /root/iplist.txt <<EOL
127.0.0.1/32
223.87.20.0/24
117.177.241.0/24
39.134.200.0/24
39.134.201.0/24
39.137.101.0/24
192.168.7.0/24
112.44.102.224/27
47.93.86.113/32
EOL
firewall-cmd --permanent --ipset=my-allow-list --add-entries-from-file=/root/iplist.txt
firewall-cmd --permanent --ipset=my-allow-list --get-entries
firewall-cmd --permanent --zone=trusted --add-source=ipset:my-allow-list
firewall-cmd --reload
firewall-cmd --list-all
firewall-cmd --get-active-zones
firewall-cmd --set-default-zone=block
firewall-cmd --runtime-to-permanent
firewall-cmd --reload
master0 node day1
########################################################
# master0
yum -y install tigervnc-server tigervnc gnome-terminal gnome-session gnome-classic-session gnome-terminal nautilus-open-terminal control-center liberation-mono-fonts google-noto-sans-cjk-fonts google-noto-sans-fonts fonts-tweak-tool
yum install -y qgnomeplatform xdg-desktop-portal-gtk NetworkManager-libreswan-gnome PackageKit-command-not-found PackageKit-gtk3-module abrt-desktop at-spi2-atk at-spi2-core avahi baobab caribou caribou-gtk2-module caribou-gtk3-module cheese compat-cheese314 control-center dconf empathy eog evince evince-nautilus file-roller file-roller-nautilus firewall-config firstboot fprintd-pam gdm gedit glib-networking gnome-bluetooth gnome-boxes gnome-calculator gnome-classic-session gnome-clocks gnome-color-manager gnome-contacts gnome-dictionary gnome-disk-utility gnome-font-viewer gnome-getting-started-docs gnome-icon-theme gnome-icon-theme-extras gnome-icon-theme-symbolic gnome-initial-setup gnome-packagekit gnome-packagekit-updater gnome-screenshot gnome-session gnome-session-xsession gnome-settings-daemon gnome-shell gnome-software gnome-system-log gnome-system-monitor gnome-terminal gnome-terminal-nautilus gnome-themes-standard gnome-tweak-tool nm-connection-editor orca redhat-access-gui sane-backends-drivers-scanners seahorse setroubleshoot sushi totem totem-nautilus vinagre vino xdg-user-dirs-gtk yelp
yum install -y cjkuni-uming-fonts dejavu-sans-fonts dejavu-sans-mono-fonts dejavu-serif-fonts gnu-free-mono-fonts gnu-free-sans-fonts gnu-free-serif-fonts google-crosextra-caladea-fonts google-crosextra-carlito-fonts google-noto-emoji-fonts jomolhari-fonts khmeros-base-fonts liberation-mono-fonts liberation-sans-fonts liberation-serif-fonts lklug-fonts lohit-assamese-fonts lohit-bengali-fonts lohit-devanagari-fonts lohit-gujarati-fonts lohit-kannada-fonts lohit-malayalam-fonts lohit-marathi-fonts lohit-nepali-fonts lohit-oriya-fonts lohit-punjabi-fonts lohit-tamil-fonts lohit-telugu-fonts madan-fonts nhn-nanum-gothic-fonts open-sans-fonts overpass-fonts paktype-naskh-basic-fonts paratype-pt-sans-fonts sil-abyssinica-fonts sil-nuosu-fonts sil-padauk-fonts smc-meera-fonts stix-fonts thai-scalable-waree-fonts ucs-miscfixed-fonts vlgothic-fonts wqy-microhei-fonts wqy-zenhei-fonts
vncpasswd
cat << EOF > ~/.vnc/xstartup
#!/bin/sh
unset SESSION_MANAGER
unset DBUS_SESSION_BUS_ADDRESS
gnome-session &
EOF
chmod +x ~/.vnc/xstartup
vncserver :1 -geometry 1280x800
# if you want to stop the vnc server, do this
vncserver -kill :1
firewall-cmd --permanent --add-port=6001/tcp
firewall-cmd --permanent --add-port=5901/tcp
firewall-cmd --reload
# set up the kvm environment
yum -y install qemu-kvm libvirt libvirt-python libguestfs-tools virt-install virt-viewer virt-manager
systemctl enable libvirtd
systemctl start libvirtd
brctl show
virsh net-list
cat << EOF > /data/virt-net.xml
<network>
<name>br0</name>
<forward mode='bridge'>
<bridge name='br0'/>
</forward>
</network>
EOF
virsh net-define --file /data/virt-net.xml
virsh net-dumpxml br0
# virsh net-undefine openshift4
# virsh net-destroy openshift4
virsh net-autostart br0
virsh net-start br0
cp /etc/sysconfig/network-scripts/ifcfg-em1 /etc/sysconfig/network-scripts/ifcfg-em1.orig
cat << EOF > /etc/sysconfig/network-scripts/ifcfg-em1
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=em1
DEVICE=em1
ONBOOT=yes
# IPADDR=117.177.241.18
# PREFIX=24
# GATEWAY=117.177.241.1
IPV6_PRIVACY=no
# DNS1=117.177.241.16
BRIDGE=br0
EOF
cat <<EOF > /etc/sysconfig/network-scripts/ifcfg-br0
TYPE=Bridge
BOOTPROTO=static
IPADDR=117.177.241.18
GATEWAY=117.177.241.1
DNS1=117.177.241.16
ONBOOT=yes
DEFROUTE=yes
NAME=br0
DEVICE=br0
PREFIX=24
EOF
systemctl restart network
mkdir -p /data/ocp4
mkdir -p /data/kvm
cd /data/kvm
pigz -dc ocp4-master0.qcow2.tgz | tar xf -
virt-install --name=ocp4-master0 --vcpus=20 --ram=200704 \
--disk path=/data/kvm/ocp4-master0.qcow2,bus=virtio,size=200 \
--os-variant rhel8.0 --network bridge=br0,model=virtio \
--boot menu=on
virsh list --all
virsh start ocp4-master0
# anti scan
firewall-cmd --permanent --new-ipset=my-allow-list --type=hash:net
firewall-cmd --permanent --get-ipsets
cat > /root/iplist.txt <<EOL
127.0.0.1/32
223.87.20.0/24
117.177.241.0/24
39.134.200.0/24
39.134.201.0/24
39.137.101.0/24
192.168.7.0/24
112.44.102.224/27
47.93.86.113/32
EOL
firewall-cmd --permanent --ipset=my-allow-list --add-entries-from-file=/root/iplist.txt
firewall-cmd --permanent --ipset=my-allow-list --get-entries
firewall-cmd --permanent --zone=trusted --add-source=ipset:my-allow-list
firewall-cmd --reload
firewall-cmd --list-all
firewall-cmd --get-active-zones
firewall-cmd --set-default-zone=block
firewall-cmd --runtime-to-permanent
firewall-cmd --reload
master2 node day1
########################################################
# master2
yum -y install tigervnc-server tigervnc gnome-terminal gnome-session gnome-classic-session gnome-terminal nautilus-open-terminal control-center liberation-mono-fonts google-noto-sans-cjk-fonts google-noto-sans-fonts fonts-tweak-tool
yum install -y qgnomeplatform xdg-desktop-portal-gtk NetworkManager-libreswan-gnome PackageKit-command-not-found PackageKit-gtk3-module abrt-desktop at-spi2-atk at-spi2-core avahi baobab caribou caribou-gtk2-module caribou-gtk3-module cheese compat-cheese314 control-center dconf empathy eog evince evince-nautilus file-roller file-roller-nautilus firewall-config firstboot fprintd-pam gdm gedit glib-networking gnome-bluetooth gnome-boxes gnome-calculator gnome-classic-session gnome-clocks gnome-color-manager gnome-contacts gnome-dictionary gnome-disk-utility gnome-font-viewer gnome-getting-started-docs gnome-icon-theme gnome-icon-theme-extras gnome-icon-theme-symbolic gnome-initial-setup gnome-packagekit gnome-packagekit-updater gnome-screenshot gnome-session gnome-session-xsession gnome-settings-daemon gnome-shell gnome-software gnome-system-log gnome-system-monitor gnome-terminal gnome-terminal-nautilus gnome-themes-standard gnome-tweak-tool nm-connection-editor orca redhat-access-gui sane-backends-drivers-scanners seahorse setroubleshoot sushi totem totem-nautilus vinagre vino xdg-user-dirs-gtk yelp
yum install -y cjkuni-uming-fonts dejavu-sans-fonts dejavu-sans-mono-fonts dejavu-serif-fonts gnu-free-mono-fonts gnu-free-sans-fonts gnu-free-serif-fonts google-crosextra-caladea-fonts google-crosextra-carlito-fonts google-noto-emoji-fonts jomolhari-fonts khmeros-base-fonts liberation-mono-fonts liberation-sans-fonts liberation-serif-fonts lklug-fonts lohit-assamese-fonts lohit-bengali-fonts lohit-devanagari-fonts lohit-gujarati-fonts lohit-kannada-fonts lohit-malayalam-fonts lohit-marathi-fonts lohit-nepali-fonts lohit-oriya-fonts lohit-punjabi-fonts lohit-tamil-fonts lohit-telugu-fonts madan-fonts nhn-nanum-gothic-fonts open-sans-fonts overpass-fonts paktype-naskh-basic-fonts paratype-pt-sans-fonts sil-abyssinica-fonts sil-nuosu-fonts sil-padauk-fonts smc-meera-fonts stix-fonts thai-scalable-waree-fonts ucs-miscfixed-fonts vlgothic-fonts wqy-microhei-fonts wqy-zenhei-fonts
vncpasswd
cat << EOF > ~/.vnc/xstartup
#!/bin/sh
unset SESSION_MANAGER
unset DBUS_SESSION_BUS_ADDRESS
gnome-session &
EOF
chmod +x ~/.vnc/xstartup
vncserver :1 -geometry 1280x800
# if you want to stop the vnc server, do this
vncserver -kill :1
firewall-cmd --permanent --add-port=6001/tcp
firewall-cmd --permanent --add-port=5901/tcp
firewall-cmd --reload
# set up the kvm environment
yum -y install qemu-kvm libvirt libvirt-python libguestfs-tools virt-install virt-viewer virt-manager
systemctl enable libvirtd
systemctl start libvirtd
brctl show
virsh net-list
cat << EOF > /data/virt-net.xml
<network>
<name>br0</name>
<forward mode='bridge'>
<bridge name='br0'/>
</forward>
</network>
EOF
virsh net-define --file /data/virt-net.xml
virsh net-dumpxml br0
# virsh net-undefine openshift4
# virsh net-destroy openshift4
virsh net-autostart br0
virsh net-start br0
cp /etc/sysconfig/network-scripts/ifcfg-em1 /etc/sysconfig/network-scripts/ifcfg-em1.orig
cat << EOF > /etc/sysconfig/network-scripts/ifcfg-em1
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=em1
DEVICE=em1
ONBOOT=yes
# IPADDR=117.177.241.22
# PREFIX=24
# GATEWAY=117.177.241.1
IPV6_PRIVACY=no
# DNS1=117.177.241.16
BRIDGE=br0
EOF
cat <<EOF > /etc/sysconfig/network-scripts/ifcfg-br0
TYPE=Bridge
BOOTPROTO=static
IPADDR=117.177.241.22
GATEWAY=117.177.241.1
DNS1=117.177.241.16
ONBOOT=yes
DEFROUTE=yes
NAME=br0
DEVICE=br0
PREFIX=24
EOF
systemctl restart network
mkdir -p /data/ocp4
mkdir -p /data/kvm
cd /data/kvm
pigz -dc ocp4-master2.qcow2.tgz | tar xf -
virt-install --name=ocp4-master2 --vcpus=20 --ram=200704 \
--disk path=/data/kvm/ocp4-master2.qcow2,bus=virtio,size=200 \
--os-variant rhel8.0 --network bridge=br0,model=virtio \
--boot menu=on
virsh list --all
virsh start ocp4-master2
# anti scan
firewall-cmd --permanent --new-ipset=my-allow-list --type=hash:net
firewall-cmd --permanent --get-ipsets
cat > /root/iplist.txt <<EOL
127.0.0.1/32
223.87.20.0/24
117.177.241.0/24
39.134.200.0/24
39.134.201.0/24
39.137.101.0/24
192.168.7.0/24
112.44.102.224/27
47.93.86.113/32
EOL
firewall-cmd --permanent --ipset=my-allow-list --add-entries-from-file=/root/iplist.txt
firewall-cmd --permanent --ipset=my-allow-list --get-entries
firewall-cmd --permanent --zone=trusted --add-source=ipset:my-allow-list
firewall-cmd --reload
firewall-cmd --list-all
firewall-cmd --get-active-zones
firewall-cmd --set-default-zone=block
firewall-cmd --runtime-to-permanent
firewall-cmd --reload
infra0 node day1
systemctl disable firewalld.service
systemctl stop firewalld.service
# secure for anti-scan
cat << EOF >> /etc/rc.local
ipset create my-allow-set hash:net
ipset add my-allow-set 127.0.0.1/32
ipset add my-allow-set 223.87.20.0/24
ipset add my-allow-set 117.177.241.0/24
ipset add my-allow-set 39.134.200.0/24
ipset add my-allow-set 39.134.201.0/24
ipset add my-allow-set 39.137.101.0/24
ipset add my-allow-set 192.168.7.0/24
ipset add my-allow-set 112.44.102.224/27
ipset add my-allow-set 47.93.86.113/32
ipset add my-allow-set 39.134.204.0/24
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m set --match-set my-allow-set src -j ACCEPT
iptables -A INPUT -p tcp -j REJECT
iptables -A INPUT -p udp -j REJECT
EOF
chmod +x /etc/rc.d/rc.local
systemctl enable rc-local
# systemctl restart rc-local
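# (optional) after running rc.local once (or after a reboot), verify the allow set and rules are in place
ipset list my-allow-set
iptables -L INPUT -n | head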
# set up the kvm environment
yum -y install qemu-kvm libvirt libvirt-python libguestfs-tools virt-install virt-viewer virt-manager
systemctl enable libvirtd
systemctl start libvirtd
infra1 node day1
systemctl disable firewalld.service
systemctl stop firewalld.service
# secure for anti-scan
cat << EOF >> /etc/rc.local
ipset create my-allow-set hash:net
ipset add my-allow-set 127.0.0.1/32
ipset add my-allow-set 223.87.20.0/24
ipset add my-allow-set 117.177.241.0/24
ipset add my-allow-set 39.134.200.0/24
ipset add my-allow-set 39.134.201.0/24
ipset add my-allow-set 39.137.101.0/24
ipset add my-allow-set 192.168.7.0/24
ipset add my-allow-set 112.44.102.224/27
ipset add my-allow-set 47.93.86.113/32
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m set --match-set my-allow-set src -j ACCEPT
iptables -A INPUT -p tcp -j REJECT
iptables -A INPUT -p udp -j REJECT
EOF
chmod +x /etc/rc.d/rc.local
systemctl enable rc-local
# systemctl restart rc-local
# set up the kvm environment
yum -y install qemu-kvm libvirt libvirt-python libguestfs-tools virt-install virt-viewer virt-manager
systemctl enable libvirtd
systemctl start libvirtd
worker-0 day2 oper
podman login registry.redhat.ren:4443 -u zteadm
# localhost/ottcache-img:6.01.05.01T03
skopeo copy docker-archive:ZXCDN-OTT-IAS-IMGV6.01.05.01_TEST.tar docker://registry.redhat.ren:4443/zteadm/ottcache-img:6.01.05.01T03
# localhost/slbl7-img:6.01.05.01T03
skopeo copy docker-archive:ZXCDN-OTT-SLBL7-IMGV6.01.05.01_TEST.tar docker://registry.redhat.ren:4443/zteadm/slbl7-img:6.01.05.01T03
# localhost/webcache-img:v6.01.04.03
skopeo copy docker-archive:ZXCDN-CACHE-WEBCACHE-IMGV6.01.04.03.tar docker://registry.redhat.ren:4443/zteadm/webcache-img:v6.01.04.03
# localhost/pg-img:v1.01.01.01
skopeo copy docker-archive:ZXCDN-PG-IMGV1.01.01.01.tar docker://registry.redhat.ren:4443/zteadm/pg-img:v1.01.01.01
# localhost/slb-img:v6.01.04.03
skopeo copy docker-archive:ZXCDN-CACHE-SLB-IMGV6.01.04.03.tar docker://registry.redhat.ren:4443/zteadm/slb-img:v6.01.04.03
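# (optional) verify the pushed images are visible in the registry; skopeo reuses the login above
skopeo inspect docker://registry.redhat.ren:4443/zteadm/ottcache-img:6.01.05.01T03 | jq '{Digest, Created}'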
# io speed test
dd if=/dev/zero of=/data/testfile bs=1G count=10
# 10+0 records in
# 10+0 records out
# 10737418240 bytes (11 GB) copied, 6.85688 s, 1.6 GB/s
dd if=/dev/zero of=/data/testfile bs=1G count=10 oflag=direct
# 10+0 records in
# 10+0 records out
# 10737418240 bytes (11 GB) copied, 3.98098 s, 2.7 GB/s
dd if=/dev/zero of=/data/testfile bs=5M count=9999
# 9999+0 records in
# 9999+0 records out
# 52423557120 bytes (52 GB) copied, 27.8529 s, 1.9 GB/s
dd if=/dev/zero of=/data/testfile bs=5M count=9999 oflag=direct
# 9999+0 records in
# 9999+0 records out
# 52423557120 bytes (52 GB) copied, 16.1121 s, 3.3 GB/s
dd if=/dev/zero of=/data/testfile bs=5M count=9999 oflag=dsync
# 9999+0 records in
# 9999+0 records out
# 52423557120 bytes (52 GB) copied, 51.2713 s, 1.0 GB/s
dd if=/data/testfile of=/dev/null bs=1M count=9999 oflag=dsync
# 9999+0 records in
# 9999+0 records out
# 10484711424 bytes (10 GB) copied, 1.9141 s, 5.5 GB/s
dd if=/data/testfile of=/dev/null bs=5M count=9999 oflag=dsync
# 9999+0 records in
# 9999+0 records out
# 52423557120 bytes (52 GB) copied, 9.3676 s, 5.6 GB/s
# secure for anti-scan
cat << EOF > /etc/rc.local
#!/bin/bash
# THIS FILE IS ADDED FOR COMPATIBILITY PURPOSES
#
# It is highly advisable to create own systemd services or udev rules
# to run scripts during boot instead of using this file.
#
# In contrast to previous versions due to parallel execution during boot
# this script will NOT be run after all other services.
#
# Please note that you must run 'chmod +x /etc/rc.d/rc.local' to ensure
# that this script will be executed during boot.
touch /var/lock/subsys/local
ipset create my-allow-set hash:net
ipset add my-allow-set 127.0.0.1/32
ipset add my-allow-set 223.87.20.0/24
ipset add my-allow-set 117.177.241.0/24
ipset add my-allow-set 39.134.200.0/24
ipset add my-allow-set 39.134.201.0/24
ipset add my-allow-set 39.137.101.0/24
ipset add my-allow-set 192.168.7.0/24
ipset add my-allow-set 112.44.102.224/27
ipset add my-allow-set 47.93.86.113/32
ipset add my-allow-set 221.226.0.75/32
ipset add my-allow-set 210.21.236.182/32
ipset add my-allow-set 61.132.54.2/32
ipset add my-allow-set 39.134.198.0/24
ipset add my-allow-set 218.205.236.16/28
ipset add my-allow-set 39.134.204.0/24
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m set --match-set my-allow-set src -j ACCEPT
iptables -A INPUT -p tcp -j REJECT
iptables -A INPUT -p udp -j REJECT
EOF
chmod +x /etc/rc.d/rc.local
systemctl enable rc-local
# systemctl restart rc-local
ipset add my-allow-set 221.226.0.75/32
ipset add my-allow-set 210.21.236.182/32
ipset add my-allow-set 61.132.54.2/32
# set up the kvm environment
yum -y install qemu-kvm libvirt libvirt-python libguestfs-tools virt-install virt-viewer virt-manager
systemctl enable libvirtd
systemctl start libvirtd
worker-1 day2 oper
cat << EOF > /etc/rc.local
#!/bin/bash
# THIS FILE IS ADDED FOR COMPATIBILITY PURPOSES
#
# It is highly advisable to create own systemd services or udev rules
# to run scripts during boot instead of using this file.
#
# In contrast to previous versions due to parallel execution during boot
# this script will NOT be run after all other services.
#
# Please note that you must run 'chmod +x /etc/rc.d/rc.local' to ensure
# that this script will be executed during boot.
touch /var/lock/subsys/local
ipset create my-allow-set hash:net
ipset add my-allow-set 127.0.0.1/32
ipset add my-allow-set 223.87.20.0/24
ipset add my-allow-set 117.177.241.0/24
ipset add my-allow-set 39.134.200.0/24
ipset add my-allow-set 39.134.201.0/24
ipset add my-allow-set 39.137.101.0/24
ipset add my-allow-set 192.168.7.0/24
ipset add my-allow-set 112.44.102.224/27
ipset add my-allow-set 47.93.86.113/32
ipset add my-allow-set 221.226.0.75/32
ipset add my-allow-set 210.21.236.182/32
ipset add my-allow-set 61.132.54.2/32
ipset add my-allow-set 39.134.198.0/24
ipset add my-allow-set 218.205.236.16/28
ipset add my-allow-set 39.134.204.0/24
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m set --match-set my-allow-set src -j ACCEPT
iptables -A INPUT -p tcp -j REJECT
iptables -A INPUT -p udp -j REJECT
EOF
chmod +x /etc/rc.d/rc.local
systemctl enable rc-local
# systemctl restart rc-local
# set up the kvm environment
yum -y install qemu-kvm libvirt libvirt-python libguestfs-tools virt-install virt-viewer virt-manager
systemctl enable libvirtd
systemctl start libvirtd
worker-2 day2 oper
cat << EOF > /etc/rc.local
#!/bin/bash
# THIS FILE IS ADDED FOR COMPATIBILITY PURPOSES
#
# It is highly advisable to create own systemd services or udev rules
# to run scripts during boot instead of using this file.
#
# In contrast to previous versions due to parallel execution during boot
# this script will NOT be run after all other services.
#
# Please note that you must run 'chmod +x /etc/rc.d/rc.local' to ensure
# that this script will be executed during boot.
touch /var/lock/subsys/local
ipset create my-allow-set hash:net
ipset add my-allow-set 127.0.0.1/32
ipset add my-allow-set 223.87.20.0/24
ipset add my-allow-set 117.177.241.0/24
ipset add my-allow-set 39.134.200.0/24
ipset add my-allow-set 39.134.201.0/24
ipset add my-allow-set 39.137.101.0/24
ipset add my-allow-set 192.168.7.0/24
ipset add my-allow-set 112.44.102.224/27
ipset add my-allow-set 47.93.86.113/32
ipset add my-allow-set 221.226.0.75/32
ipset add my-allow-set 210.21.236.182/32
ipset add my-allow-set 61.132.54.2/32
ipset add my-allow-set 39.134.198.0/24
ipset add my-allow-set 218.205.236.16/28
ipset add my-allow-set 39.134.204.0/24
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m set --match-set my-allow-set src -j ACCEPT
iptables -A INPUT -p tcp -j REJECT
iptables -A INPUT -p udp -j REJECT
EOF
chmod +x /etc/rc.d/rc.local
systemctl enable rc-local
# systemctl restart rc-local
# set up the kvm environment
yum -y install qemu-kvm libvirt libvirt-python libguestfs-tools virt-install virt-viewer virt-manager
systemctl enable libvirtd
systemctl start libvirtd
systemctl status libvirtd
systemctl stop libvirtd
systemctl disable libvirtd
# Installed:
# libguestfs-tools.noarch 1:1.40.2-5.el7_7.3 libvirt.x86_64 0:4.5.0-23.el7_7.5 libvirt-python.x86_64 0:4.5.0-1.el7
# qemu-kvm.x86_64 10:1.5.3-167.el7_7.4 virt-install.noarch 0:1.5.0-7.el7 virt-manager.noarch 0:1.5.0-7.el7
# virt-viewer.x86_64 0:5.0-15.el7
# Dependency Installed:
# adwaita-cursor-theme.noarch 0:3.28.0-1.el7 adwaita-icon-theme.noarch 0:3.28.0-1.el7
# at-spi2-atk.x86_64 0:2.26.2-1.el7 at-spi2-core.x86_64 0:2.28.0-1.el7
# atk.x86_64 0:2.28.1-1.el7 augeas-libs.x86_64 0:1.4.0-9.el7
# autogen-libopts.x86_64 0:5.18-5.el7 cairo.x86_64 0:1.15.12-4.el7
# cairo-gobject.x86_64 0:1.15.12-4.el7 cdparanoia-libs.x86_64 0:10.2-17.el7
# celt051.x86_64 0:0.5.1.3-8.el7 colord-libs.x86_64 0:1.3.4-1.el7
# cyrus-sasl.x86_64 0:2.1.26-23.el7 dbus-x11.x86_64 1:1.10.24-13.el7_6
# dconf.x86_64 0:0.28.0-4.el7 dejavu-fonts-common.noarch 0:2.33-6.el7
# dejavu-sans-fonts.noarch 0:2.33-6.el7 flac-libs.x86_64 0:1.3.0-5.el7_1
# fontconfig.x86_64 0:2.13.0-4.3.el7 fontpackages-filesystem.noarch 0:1.44-8.el7
# fribidi.x86_64 0:1.0.2-1.el7_7.1 fuse.x86_64 0:2.9.2-11.el7
# fuse-libs.x86_64 0:2.9.2-11.el7 gdk-pixbuf2.x86_64 0:2.36.12-3.el7
# genisoimage.x86_64 0:1.1.11-25.el7 glib-networking.x86_64 0:2.56.1-1.el7
# glusterfs-api.x86_64 0:3.12.2-47.2.el7 glusterfs-cli.x86_64 0:3.12.2-47.2.el7
# gnome-icon-theme.noarch 0:3.12.0-1.el7 gnutls.x86_64 0:3.3.29-9.el7_6
# gnutls-dane.x86_64 0:3.3.29-9.el7_6 gnutls-utils.x86_64 0:3.3.29-9.el7_6
# gperftools-libs.x86_64 0:2.6.1-1.el7 graphite2.x86_64 0:1.3.10-1.el7_3
# gsettings-desktop-schemas.x86_64 0:3.28.0-2.el7 gsm.x86_64 0:1.0.13-11.el7
# gstreamer1.x86_64 0:1.10.4-2.el7 gstreamer1-plugins-base.x86_64 0:1.10.4-2.el7
# gtk-update-icon-cache.x86_64 0:3.22.30-3.el7 gtk-vnc2.x86_64 0:0.7.0-3.el7
# gtk3.x86_64 0:3.22.30-3.el7 gvnc.x86_64 0:0.7.0-3.el7
# harfbuzz.x86_64 0:1.7.5-2.el7 hexedit.x86_64 0:1.2.13-5.el7
# hicolor-icon-theme.noarch 0:0.12-7.el7 hivex.x86_64 0:1.3.10-6.9.el7
# ipxe-roms-qemu.noarch 0:20180825-2.git133f4c.el7 iso-codes.noarch 0:3.46-2.el7
# jasper-libs.x86_64 0:1.900.1-33.el7 jbigkit-libs.x86_64 0:2.0-11.el7
# json-glib.x86_64 0:1.4.2-2.el7 lcms2.x86_64 0:2.6-3.el7
# libICE.x86_64 0:1.0.9-9.el7 libSM.x86_64 0:1.2.2-2.el7
# libX11.x86_64 0:1.6.7-2.el7 libX11-common.noarch 0:1.6.7-2.el7
# libXau.x86_64 0:1.0.8-2.1.el7 libXcomposite.x86_64 0:0.4.4-4.1.el7
# libXcursor.x86_64 0:1.1.15-1.el7 libXdamage.x86_64 0:1.1.4-4.1.el7
# libXext.x86_64 0:1.3.3-3.el7 libXfixes.x86_64 0:5.0.3-1.el7
# libXft.x86_64 0:2.3.2-2.el7 libXi.x86_64 0:1.7.9-1.el7
# libXinerama.x86_64 0:1.1.3-2.1.el7 libXmu.x86_64 0:1.1.2-2.el7
# libXrandr.x86_64 0:1.5.1-2.el7 libXrender.x86_64 0:0.9.10-1.el7
# libXt.x86_64 0:1.1.5-3.el7 libXtst.x86_64 0:1.2.3-1.el7
# libXv.x86_64 0:1.0.11-1.el7 libXxf86misc.x86_64 0:1.0.3-7.1.el7
# libXxf86vm.x86_64 0:1.1.4-1.el7 libarchive.x86_64 0:3.1.2-14.el7_7
# libasyncns.x86_64 0:0.8-7.el7 libcacard.x86_64 40:2.5.2-2.el7
# libconfig.x86_64 0:1.4.9-5.el7 libepoxy.x86_64 0:1.5.2-1.el7
# libglvnd.x86_64 1:1.0.1-0.8.git5baa1e5.el7 libglvnd-egl.x86_64 1:1.0.1-0.8.git5baa1e5.el7
# libglvnd-glx.x86_64 1:1.0.1-0.8.git5baa1e5.el7 libgovirt.x86_64 0:0.3.4-3.el7
# libguestfs.x86_64 1:1.40.2-5.el7_7.3 libguestfs-tools-c.x86_64 1:1.40.2-5.el7_7.3
# libgusb.x86_64 0:0.2.9-1.el7 libibverbs.x86_64 0:22.1-3.el7
# libiscsi.x86_64 0:1.9.0-7.el7 libjpeg-turbo.x86_64 0:1.2.90-8.el7
# libmodman.x86_64 0:2.0.1-8.el7 libogg.x86_64 2:1.3.0-7.el7
# libosinfo.x86_64 0:1.1.0-3.el7 libproxy.x86_64 0:0.4.11-11.el7
# librdmacm.x86_64 0:22.1-3.el7 libsndfile.x86_64 0:1.0.25-10.el7
# libsoup.x86_64 0:2.62.2-2.el7 libthai.x86_64 0:0.1.14-9.el7
# libtheora.x86_64 1:1.1.1-8.el7 libtiff.x86_64 0:4.0.3-32.el7
# libusal.x86_64 0:1.1.11-25.el7 libusbx.x86_64 0:1.0.21-1.el7
# libvirt-bash-completion.x86_64 0:4.5.0-23.el7_7.5 libvirt-client.x86_64 0:4.5.0-23.el7_7.5
# libvirt-daemon.x86_64 0:4.5.0-23.el7_7.5 libvirt-daemon-config-network.x86_64 0:4.5.0-23.el7_7.5
# libvirt-daemon-config-nwfilter.x86_64 0:4.5.0-23.el7_7.5 libvirt-daemon-driver-interface.x86_64 0:4.5.0-23.el7_7.5
# libvirt-daemon-driver-lxc.x86_64 0:4.5.0-23.el7_7.5 libvirt-daemon-driver-network.x86_64 0:4.5.0-23.el7_7.5
# libvirt-daemon-driver-nodedev.x86_64 0:4.5.0-23.el7_7.5 libvirt-daemon-driver-nwfilter.x86_64 0:4.5.0-23.el7_7.5
# libvirt-daemon-driver-qemu.x86_64 0:4.5.0-23.el7_7.5 libvirt-daemon-driver-secret.x86_64 0:4.5.0-23.el7_7.5
# libvirt-daemon-driver-storage.x86_64 0:4.5.0-23.el7_7.5 libvirt-daemon-driver-storage-core.x86_64 0:4.5.0-23.el7_7.5
# libvirt-daemon-driver-storage-disk.x86_64 0:4.5.0-23.el7_7.5 libvirt-daemon-driver-storage-gluster.x86_64 0:4.5.0-23.el7_7.5
# libvirt-daemon-driver-storage-iscsi.x86_64 0:4.5.0-23.el7_7.5 libvirt-daemon-driver-storage-logical.x86_64 0:4.5.0-23.el7_7.5
# libvirt-daemon-driver-storage-mpath.x86_64 0:4.5.0-23.el7_7.5 libvirt-daemon-driver-storage-rbd.x86_64 0:4.5.0-23.el7_7.5
# libvirt-daemon-driver-storage-scsi.x86_64 0:4.5.0-23.el7_7.5 libvirt-daemon-kvm.x86_64 0:4.5.0-23.el7_7.5
# libvirt-glib.x86_64 0:1.0.0-1.el7 libvirt-libs.x86_64 0:4.5.0-23.el7_7.5
# libvisual.x86_64 0:0.4.0-16.el7 libvorbis.x86_64 1:1.3.3-8.el7.1
# libwayland-client.x86_64 0:1.15.0-1.el7 libwayland-cursor.x86_64 0:1.15.0-1.el7
# libwayland-egl.x86_64 0:1.15.0-1.el7 libwayland-server.x86_64 0:1.15.0-1.el7
# libxcb.x86_64 0:1.13-1.el7 libxkbcommon.x86_64 0:0.7.1-3.el7
# libxshmfence.x86_64 0:1.2-1.el7 lsof.x86_64 0:4.87-6.el7
# lzop.x86_64 0:1.03-10.el7 mesa-libEGL.x86_64 0:18.3.4-6.el7_7
# mesa-libGL.x86_64 0:18.3.4-6.el7_7 mesa-libgbm.x86_64 0:18.3.4-6.el7_7
# mesa-libglapi.x86_64 0:18.3.4-6.el7_7 mtools.x86_64 0:4.0.18-5.el7
# netcf-libs.x86_64 0:0.2.8-4.el7 nettle.x86_64 0:2.7.1-8.el7
# numad.x86_64 0:0.5-18.20150602git.el7 opus.x86_64 0:1.0.2-6.el7
# orc.x86_64 0:0.4.26-1.el7 osinfo-db.noarch 0:20190319-2.el7
# osinfo-db-tools.x86_64 0:1.1.0-1.el7 pango.x86_64 0:1.42.4-4.el7_7
# pcre2.x86_64 0:10.23-2.el7 perl-Sys-Guestfs.x86_64 1:1.40.2-5.el7_7.3
# perl-Sys-Virt.x86_64 0:4.5.0-2.el7 perl-hivex.x86_64 0:1.3.10-6.9.el7
# perl-libintl.x86_64 0:1.20-12.el7 pixman.x86_64 0:0.34.0-1.el7
# pulseaudio-libs.x86_64 0:10.0-5.el7 pulseaudio-libs-glib2.x86_64 0:10.0-5.el7
# pycairo.x86_64 0:1.8.10-8.el7 python-gobject.x86_64 0:3.22.0-1.el7_4.1
# qemu-img.x86_64 10:1.5.3-167.el7_7.4 qemu-kvm-common.x86_64 10:1.5.3-167.el7_7.4
# radvd.x86_64 0:2.17-3.el7 rdma-core.x86_64 0:22.1-3.el7
# rest.x86_64 0:0.8.1-2.el7 scrub.x86_64 0:2.5.2-7.el7
# seabios-bin.noarch 0:1.11.0-2.el7 seavgabios-bin.noarch 0:1.11.0-2.el7
# sgabios-bin.noarch 1:0.20110622svn-4.el7 spice-glib.x86_64 0:0.35-4.el7
# spice-gtk3.x86_64 0:0.35-4.el7 spice-server.x86_64 0:0.14.0-7.el7
# squashfs-tools.x86_64 0:4.3-0.21.gitaae0aff4.el7 supermin5.x86_64 0:5.1.19-1.el7
# syslinux.x86_64 0:4.05-15.el7 syslinux-extlinux.x86_64 0:4.05-15.el7
# trousers.x86_64 0:0.3.14-2.el7 unbound-libs.x86_64 0:1.6.6-1.el7
# usbredir.x86_64 0:0.7.1-3.el7 virt-manager-common.noarch 0:1.5.0-7.el7
# vte-profile.x86_64 0:0.52.2-2.el7 vte291.x86_64 0:0.52.2-2.el7
# xkeyboard-config.noarch 0:2.24-1.el7 xml-common.noarch 0:0.6.3-39.el7
# xorg-x11-server-utils.x86_64 0:7.7-20.el7 xorg-x11-xauth.x86_64 1:1.0.9-1.el7
# xorg-x11-xinit.x86_64 0:1.3.4-2.el7 yajl.x86_64 0:2.0.4-4.el7
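
Before using this host to run KVM guests, it is worth confirming that hardware virtualization is actually available. A minimal check, assuming virt-host-validate was pulled in by the libvirt packages above and libvirtd is running while it executes:

# count CPU flags for Intel VT-x / AMD-V; 0 means no hardware virtualization
egrep -c '(vmx|svm)' /proc/cpuinfo

# check that the kvm kernel modules are loaded
lsmod | grep kvm

# let libvirt validate the host itself (start libvirtd first if it was disabled above)
systemctl start libvirtd
virt-host-validate qemu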
Tips
- configure the local storage operator
- configure persistent storage for cluster monitoring (see the sketch after this list)
- benchmark the storage with a realistic workload
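
For the monitoring-storage item above, a hedged sketch of the usual approach: cluster monitoring reads a ConfigMap named cluster-monitoring-config in the openshift-monitoring namespace, and a volumeClaimTemplate there puts Prometheus on persistent storage. The storage class name local-sc and the 40Gi size are assumptions for illustration; replace them with whatever the local storage operator actually provides.

oc apply -f - << EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      volumeClaimTemplate:
        spec:
          # assumption: a storage class created by the local storage operator
          storageClassName: local-sc
          resources:
            requests:
              storage: 40Gi
EOF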
How to record system audio on macOS (OSX)
Recording the system's own audio on macOS has always been a problem. It comes up especially with online meetings, when you want to record the meeting on the Mac, which is not possible by default. Windows does not have this limitation; for some reason it is much more troublesome on macOS.
The solution is BackgroundMusic.
First, download and install BackgroundMusic; just follow the steps on the project's website.
Then, open the Audio MIDI Setup application and create an aggregate device.
Check the system's input and output settings.
Finally, start the screen-recording software and select the aggregate device as the input device.