These are the results of tests performed during Gisida's Database Operator In Kubernetes (DOIK) study.
Cloud Native PostgreSQL (CNPG) Operator Test
1. Installing CNPG (Cloud Native PostgreSQL)
2. Basic CNPG Usage
3. CNPG Failure Testing
4. CNPG Scaling & Rolling Updates
5. CNPG Miscellaneous
- What is Cloud Native PostgreSQL (CNPG)?
CloudNativePG is an open-source operator designed to manage PostgreSQL workloads on any supported Kubernetes cluster running in private, public, hybrid, or multi-cloud environments.
- Architecture
- Read-write workloads
Applications connect through the service with the -rw suffix to whichever PostgreSQL instance the Kubernetes operator has currently elected as primary. If the primary instance becomes temporarily or permanently unavailable, Kubernetes moves the -rw service to another instance in the cluster for high availability.
- Read-Only workloads
Applications can access the hot standby replicas through the -ro service provided by the operator, allowing them to offload read-only queries from the primary node.
- Multi-cluster deployments
PostgreSQL replicated across two different Kubernetes clusters: the primary cluster runs in the first Kubernetes cluster and the replica cluster in the second. The second Kubernetes cluster serves as the DR (disaster recovery) site.
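As a rough sketch of how an application would split traffic across these endpoints (assuming a Cluster named mycluster, which is what this post creates below; the -r and -any services also exist and are compared in section 2):
# Writes always go to the current primary via the -rw service
$ psql "host=mycluster-rw port=5432 user=app dbname=app"
# Read-only queries can be offloaded to the hot standbys via the -ro service
$ psql "host=mycluster-ro port=5432 user=app dbname=app"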
1. Installing CNPG (Cloud Native PostgreSQL)
▶ Add the Helm repo
$ helm repo add cnpg https://cloudnative-pg.github.io/charts
"cnpg" has been added to your repositories
▶ Install the operator
$ helm install cnpg cnpg/cloudnative-pg -f ~/DOIK/5/values.yaml
NAME: cnpg
LAST DEPLOYED: Wed Jun 22 10:27:50 2022
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
CloudNativePG operator should be installed in namespace "default".
You can now create a PostgreSQL cluster with 3 nodes in the current namespace as follows:
cat <<EOF | kubectl apply -f -
# Example of PostgreSQL cluster
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  instances: 3
  storage:
    size: 1Gi
EOF
$ kubectl get cluster
▶ Check the CRDs
$ kubectl get crd
NAME CREATED AT
backups.postgresql.cnpg.io 2022-06-22T01:27:52Z
clusters.postgresql.cnpg.io 2022-06-22T01:27:52Z
poolers.postgresql.cnpg.io 2022-06-22T01:27:52Z
scheduledbackups.postgresql.cnpg.io 2022-06-22T01:27:52Z
▶ Install the cluster (Deploy a PostgreSQL cluster)
$ cat ~/DOIK/5/mycluster1.yaml
# Example of PostgreSQL cluster
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: mycluster
spec:
  imageName: ghcr.io/cloudnative-pg/postgresql:14.2
  instances: 3
  storage:
    size: 3Gi
  postgresql:
    parameters:
      max_worker_processes: "40"
      timezone: "Asia/Seoul"
    pg_hba:
      - host all postgres all trust
  primaryUpdateStrategy: unsupervised
  enableSuperuserAccess: true
  bootstrap:
    initdb:
      database: app
      encoding: UTF8
      localeCType: C
      localeCollate: C
      owner: app
$ kubectl apply -f ~/DOIK/5/mycluster1.yaml
cluster.postgresql.cnpg.io/mycluster created
$ kubectl get pod -w
NAME READY STATUS RESTARTS AGE
cnpg-cloudnative-pg-5f8cc75df5-jk2v4 1/1 Running 0 17m
mycluster-1 1/1 Running 0 4m23s
mycluster-2 1/1 Running 0 3m36s
mycluster-3 1/1 Running 0 2m52s
▶ Verify the cluster
$ kubectl get all
NAME READY STATUS RESTARTS AGE
pod/cnpg-cloudnative-pg-5f8cc75df5-jk2v4 1/1 Running 0 30m
pod/mycluster-1 1/1 Running 0 16m
pod/mycluster-2 1/1 Running 0 16m
pod/mycluster-3 1/1 Running 0 15m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/cnpg-webhook-service ClusterIP 10.200.1.33 <none> 443/TCP 30m
$ kubectl get cluster
NAME AGE INSTANCES READY STATUS PRIMARY
mycluster 17m 3 3 Cluster in healthy state mycluster-1
$ kubectl cnpg status mycluster --verbose ## -v; confirms the config settings were applied
Cluster Summary
Name: mycluster
Namespace: default
System ID: 7111873327487692819
PostgreSQL Image: ghcr.io/cloudnative-pg/postgresql:14.2
Primary instance: mycluster-1
Status: Cluster in healthy state
Instances: 3
Ready instances: 3
Current Write LSN: 0/6000060 (Timeline: 1 - WAL File: 000000010000000000000006)
PostgreSQL Configuration
archive_command = '/controller/manager wal-archive --log-destination /controller/log/postgres.json %p'
archive_mode = 'on'
archive_timeout = '5min'
cluster_name = 'mycluster'
dynamic_shared_memory_type = 'posix'
full_page_writes = 'on'
hot_standby = 'true'
listen_addresses = '*'
log_destination = 'csvlog'
log_directory = '/controller/log'
log_filename = 'postgres'
log_rotation_age = '0'
log_rotation_size = '0'
log_truncate_on_rotation = 'false'
logging_collector = 'on'
max_parallel_workers = '32'
max_replication_slots = '32'
max_worker_processes = '40'
port = '5432'
restart_after_crash = 'false'
shared_memory_type = 'mmap'
shared_preload_libraries = ''
ssl = 'on'
ssl_ca_file = '/controller/certificates/client-ca.crt'
ssl_cert_file = '/controller/certificates/server.crt'
ssl_key_file = '/controller/certificates/server.key'
timezone = 'Asia/Seoul'
unix_socket_directories = '/controller/run'
wal_keep_size = '512MB'
wal_level = 'logical'
wal_log_hints = 'on'
wal_receiver_timeout = '5s'
wal_sender_timeout = '5s'
cnpg.config_sha256 = '8a3f5126327a68ea817baaac7b4e96184e10c86208e93c126750f40c92a30747'
PostgreSQL HBA Rules
# Grant local access
local all all peer map=local
# Require client certificate authentication for the streaming_replica user
hostssl postgres streaming_replica all cert
hostssl replication streaming_replica all cert
hostssl all cnpg_pooler_pgbouncer all cert
host all postgres all trust
# Otherwise use the default authentication method
host all all all scram-sha-256
Certificates Status
Certificate Name Expiration Date Days Left Until Expiration
---------------- --------------- --------------------------
mycluster-ca 2022-09-20 01:35:08 +0000 UTC 89.98
mycluster-replication 2022-09-20 01:35:08 +0000 UTC 89.98
mycluster-server 2022-09-20 01:35:08 +0000 UTC 89.98
Continuous Backup status
Not configured
Streaming Replication status
Name Sent LSN Write LSN Flush LSN Replay LSN Write Lag Flush Lag Replay Lag State Sync State Sync Priority
---- -------- --------- --------- ---------- --------- --------- ---------- ----- ---------- -------------
mycluster-2 0/6000060 0/6000060 0/6000060 0/6000060 00:00:00 00:00:00 00:00:00 streaming async 0
mycluster-3 0/6000060 0/6000060 0/6000060 0/6000060 00:00:00 00:00:00 00:00:00 streaming async 0
Instances status
Name Database Size Current LSN Replication role Status QoS Manager Version
---- ------------- ----------- ---------------- ------ --- ---------------
mycluster-1 33 MB 0/6000060 Primary OK BestEffort 1.15.1
mycluster-2 33 MB 0/6000060 Standby (async) OK BestEffort 1.15.1
mycluster-3 33 MB 0/6000060 Standby (async) OK BestEffort 1.15.1
▶ Check the basic resources created for the cluster
$ kubectl get pod,deploy
NAME READY STATUS RESTARTS AGE
pod/cnpg-cloudnative-pg-5f8cc75df5-jk2v4 1/1 Running 0 34m
pod/mycluster-1 1/1 Running 0 21m
pod/mycluster-2 1/1 Running 0 20m
pod/mycluster-3 1/1 Running 0 19m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/cnpg-cloudnative-pg 1/1 1 1 34m
$ kubectl get svc,ep,endpointslices -l cnpg.io/cluster=mycluster
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/mycluster-any ClusterIP 10.200.1.195 <none> 5432/TCP 22m
service/mycluster-r ClusterIP 10.200.1.232 <none> 5432/TCP 22m
service/mycluster-ro ClusterIP 10.200.1.69 <none> 5432/TCP 22m
service/mycluster-rw ClusterIP 10.200.1.174 <none> 5432/TCP 22m
NAME ENDPOINTS AGE
endpoints/mycluster-any 172.16.1.4:5432,172.16.2.6:5432,172.16.3.4:5432 22m
endpoints/mycluster-r 172.16.1.4:5432,172.16.2.6:5432,172.16.3.4:5432 22m
endpoints/mycluster-ro 172.16.1.4:5432,172.16.2.6:5432 22m
endpoints/mycluster-rw 172.16.3.4:5432 22m
NAME ADDRESSTYPE PORTS ENDPOINTS AGE
endpointslice.discovery.k8s.io/mycluster-any-gjmlp IPv4 5432 172.16.3.4,172.16.1.4,172.16.2.6 22m
endpointslice.discovery.k8s.io/mycluster-r-98r65 IPv4 5432 172.16.3.4,172.16.1.4,172.16.2.6 22m
endpointslice.discovery.k8s.io/mycluster-ro-kh6fm IPv4 5432 172.16.1.4,172.16.2.6 22m
endpointslice.discovery.k8s.io/mycluster-rw-4gcpn IPv4 5432 172.16.3.4 22m
$ kubectl get cm,secret
NAME DATA AGE
configmap/cnpg-controller-manager-config 0 34m
configmap/cnpg-default-monitoring 1 34m
configmap/kube-root-ca.crt 1 85m
NAME TYPE DATA AGE
secret/cnpg-ca-secret Opaque 2 34m
secret/cnpg-cloudnative-pg-token-c7sqs kubernetes.io/service-account-token 3 34m
secret/cnpg-webhook-cert kubernetes.io/tls 2 34m
secret/default-token-j8k8p kubernetes.io/service-account-token 3 85m
secret/mycluster-app kubernetes.io/basic-auth 3 22m
secret/mycluster-ca Opaque 2 22m
secret/mycluster-replication kubernetes.io/tls 2 22m
secret/mycluster-server kubernetes.io/tls 2 22m
secret/mycluster-superuser kubernetes.io/basic-auth 3 22m
secret/mycluster-token-jzdtw kubernetes.io/service-account-token 3 22m
secret/sh.helm.release.v1.cnpg.v1 helm.sh/release.v1 1 34m
$ kubectl get pdb
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
mycluster 1 N/A 1 22m
mycluster-primary 1 N/A 0 22m
2. Basic CNPG Usage
▶ Check the two secrets holding credentials
$ kubectl get secret -l cnpg.io/cluster=mycluster
NAME TYPE DATA AGE
mycluster-app kubernetes.io/basic-auth 3 123m
mycluster-superuser kubernetes.io/basic-auth 3 123m
▶ Check the DB account information
# Check the superuser account name
$ kubectl get secrets mycluster-superuser -o jsonpath={.data.username} | base64 -d ;echo
postgres
# app account name
$ kubectl get secrets mycluster-app -o jsonpath={.data.username} | base64 -d ;echo
app
# app account password
$ kubectl get secrets mycluster-app -o jsonpath={.data.password} | base64 -d ; echo
wgyiBXwnWHPoXF9plDaulXX34U8xQ0WvZOKdsoP5ozD95j9PMkkSRbHXW6AArRaq
▶ Store the app account password in a variable
$ AUSERPW=$(kubectl get secrets mycluster-app -o jsonpath={.data.password} | base64 -d)
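The superuser password can be captured the same way if needed (a sketch mirroring the command above):
$ SUSERPW=$(kubectl get secrets mycluster-superuser -o jsonpath={.data.password} | base64 -d)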
▶ Deploy two myclient pods using envsubst
$ cat ~/DOIK/5/myclient.yaml
apiVersion: v1
kind: Pod
metadata:
  name: ${PODNAME}
  labels:
    app: myclient
spec:
  nodeName: k8s-m
  containers:
  - name: ${PODNAME}
    image: bitnami/postgresql:${VERSION}
    command: ["tail"]
    args: ["-f", "/dev/null"]
  terminationGracePeriodSeconds: 0
$ for ((i=1; i<=2; i++)); do PODNAME=myclient$i VERSION=14.3.0 envsubst < ~/DOIK/5/myclient.yaml | kubectl apply -f - ; done
pod/myclient1 created
pod/myclient2 created
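Check the client pods, and optionally log in as the app user with the password captured earlier (a sketch; psql reads the PGPASSWORD environment variable):
$ kubectl get pod -l app=myclient
$ kubectl exec -it myclient1 -- env PGPASSWORD=$AUSERPW psql -U app -h mycluster-rw -p 5432 -d app -c "\conninfo"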
▶ [myclient1] Connect to the mycluster-rw service as the superuser
# Connect
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432
psql (14.3, server 14.2 (Debian 14.2-1.pgdg110+1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
postgres=#
# Check connection info
postgres=# \conninfo
You are connected to database "postgres" as user "postgres" on host "mycluster-rw" (address "10.200.1.174") at port "5432".
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
▶ Download sample data for testing (https://www.postgresqltutorial.com/postgresql-getting-started)
$ curl -LO https://www.postgresqltutorial.com/wp-content/uploads/2019/05/dvdrental.zip
$ apt install unzip -y && unzip dvdrental.zip
▶ Copy dvdrental.tar to the myclient1 pod
$ kubectl cp dvdrental.tar myclient1:/tmp
▶ [myclient1] Connect to the mycluster-rw service as the superuser and create a database
$ kubectl exec -it myclient1 -- createdb -U postgres -h mycluster-rw -p 5432 dvdrental
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+---------+-------+-----------------------
app | app | UTF8 | C | C |
dvdrental | postgres | UTF8 | C | C |
postgres | postgres | UTF8 | C | C |
template0 | postgres | UTF8 | C | C | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | C | C | =c/postgres +
| | | | | postgres=CTc/postgres
(5 rows)
▶ Import the DVD Rental sample database
$ kubectl exec -it myclient1 -- pg_restore -U postgres -d dvdrental /tmp/dvdrental.tar -h mycluster-rw -p 5432
# Query a table in the DVD Rental database
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d dvdrental -c "SELECT * FROM actor"
actor_id | first_name | last_name | last_update
----------+-------------+--------------+------------------------
1 | Penelope | Guiness | 2013-05-26 14:47:57.62
2 | Nick | Wahlberg | 2013-05-26 14:47:57.62
3 | Ed | Chase | 2013-05-26 14:47:57.62
4 | Jennifer | Davis | 2013-05-26 14:47:57.62
5 | Johnny | Lollobrigida | 2013-05-26 14:47:57.62
6 | Bette | Nicholson | 2013-05-26 14:47:57.62
~ (output truncated)
▶ Connect to each pod and verify DVD Rental database replication
# Store pod IPs in variables
$ POD1=$(kubectl get pod mycluster-1 -o jsonpath={.status.podIP})
$ POD2=$(kubectl get pod mycluster-2 -o jsonpath={.status.podIP})
$ POD3=$(kubectl get pod mycluster-3 -o jsonpath={.status.podIP})
# Query the actor table row count on each pod
$ kubectl exec -it myclient1 -- psql -U postgres -h $POD1 -p 5432 -d dvdrental -c "SELECT COUNT(*) FROM actor"
count
-------
200
(1 row)
$ kubectl exec -it myclient1 -- psql -U postgres -h $POD2 -p 5432 -d dvdrental -c "SELECT COUNT(*) FROM actor"
count
-------
200
(1 row)
$ kubectl exec -it myclient1 -- psql -U postgres -h $POD3 -p 5432 -d dvdrental -c "SELECT COUNT(*) FROM actor"
count
-------
200
(1 row)
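Replication can also be inspected from the primary itself through the standard pg_stat_replication view (a sketch, not part of the original walkthrough):
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -c "SELECT application_name, state, sync_state FROM pg_stat_replication;"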
▶ Compare the rw vs ro vs r vs any services
→ rw: connects only to the instance that accepts writes (the primary)
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -c "select inet_server_addr();"
for i in {1..30}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 172
==> 30 172.16.3.4
→ ro: connects only to read-only instances (the standbys)
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -c "select inet_server_addr();"
for i in {1..30}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 172
==> 18 172.16.1.4
==> 12 172.16.2.6
→ r: connects to any instance that can serve reads, the primary included
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-r -p 5432 -c "select inet_server_addr();"
for i in {1..30}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-r -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 172
==> 11 172.16.1.4
==> 10 172.16.2.6
==> 9 172.16.3.4
→ any: connects to any instance
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-any -p 5432 -c "select inet_server_addr();"
for i in {1..30}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-any -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 172
==> 12 172.16.1.4
==> 9 172.16.3.4
==> 9 172.16.2.6
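Another way to tell the endpoints apart: pg_is_in_recovery() returns f on the primary and t on standbys (a sketch):
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -c "select pg_is_in_recovery();"
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -c "select pg_is_in_recovery();"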
3. CNPG Failure Testing
▶ Preparation for the failure tests
# Store pod IPs in variables
POD1=$(kubectl get pod mycluster-1 -o jsonpath={.status.podIP})
POD2=$(kubectl get pod mycluster-2 -o jsonpath={.status.podIP})
POD3=$(kubectl get pod mycluster-3 -o jsonpath={.status.podIP})
# query.sql
$ cat ~/DOIK/5/query.sql
CREATE DATABASE test;
\c test;
CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 TEXT NOT NULL);
INSERT INTO t1 VALUES (1, 'Luis');
# Run the query from the SQL file
$ kubectl cp ~/DOIK/5/query.sql myclient1:/tmp
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -f /tmp/query.sql
CREATE DATABASE
psql (14.3, server 14.2 (Debian 14.2-1.pgdg110+1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
You are now connected to database "test" as user "postgres".
CREATE TABLE
INSERT 0 1
# Verify
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT * FROM t1"
c1 | c2
----+------
1 | Luis
(1 row)
# After inserting a second row (2, 'Luis2'):
c1 | c2
----+-------
1 | Luis
2 | Luis2
(2 rows)
# Insert 97 more rows into the test database
$ for ((i=3; i<=100; i++)); do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"
count
-------
100
(1 row)
# [Terminal 2] Monitoring
$ while true; do kubectl exec -it myclient2 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"; date;sleep 1; done
▶ [Failure 1] Force-delete one primary pod (instance) and observe the behavior
→ Check the primary pod
$ kubectl cnpg status mycluster
→ [Terminal 1], [Terminal 2] Monitoring
# [Terminal 1] Monitoring
$ watch kubectl get pod -l cnpg.io/cluster=mycluster
# [Terminal 2] Monitoring
$ while true; do kubectl exec -it myclient2 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"; date;sleep 1; done
→ [Terminal 3] Repeatedly INSERT a large number of rows into the test database
$ for ((i=301; i<=10000; i++)); do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done
# $ for ((i=10001; i<=20000; i++)); do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done
→ [Terminal 4] Delete the pod
$ kubectl delete pvc/mycluster-1 pod/mycluster-2
pod "mycluster-2" deleted
→ Verify the deletion
$ kubectl cnpg status mycluster
Cluster Summary
Name: mycluster
Namespace: default
System ID: 7111873327487692819
PostgreSQL Image: ghcr.io/cloudnative-pg/postgresql:14.2
Primary instance: mycluster-3
Status: Failing over Failing over from mycluster-2 to mycluster-3
Instances: 3
Ready instances: 2
Current Write LSN: 0/E007850 (Timeline: 3 - WAL File: 00000003000000000000000E)
Certificates Status
Certificate Name Expiration Date Days Left Until Expiration
---------------- --------------- --------------------------
mycluster-ca 2022-09-20 01:35:08 +0000 UTC 89.88
mycluster-replication 2022-09-20 01:35:08 +0000 UTC 89.88
mycluster-server 2022-09-20 01:35:08 +0000 UTC 89.88
Continuous Backup status
Not configured
Streaming Replication status
Name Sent LSN Write LSN Flush LSN Replay LSN Write Lag Flush Lag Replay Lag State Sync State Sync Priority
---- -------- --------- --------- ---------- --------- --------- ---------- ----- ---------- -------------
mycluster-4 0/E007850 0/E007850 0/E007850 0/E007850 00:00:00.000721 00:00:00.001346 00:00:00.001411 streaming async 0
Instances status
Name Database Size Current LSN Replication role Status QoS Manager Version
---- ------------- ----------- ---------------- ------ --- ---------------
mycluster-2 - - - pod not available BestEffort -
mycluster-3 57 MB 0/E007850 Primary OK BestEffort 1.15.1
mycluster-4 57 MB 0/E007908 Standby (async) OK BestEffort 1.15.1
→ After a few seconds of interruption, the INSERTs resume
psql: error: connection to server at "mycluster-rw" (10.200.1.174), port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
command terminated with exit code 2
psql: error: connection to server at "mycluster-rw" (10.200.1.174), port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
command terminated with exit code 2
INSERT 0 1
INSERT 0 1
INSERT 0 1
→ Check pod info: the pod was recreated
$ kubectl get pod -l cnpg.io/cluster=mycluster
NAME READY STATUS RESTARTS AGE
mycluster-2 1/1 Running 0 157m
mycluster-3 1/1 Running 0 157m
mycluster-4 1/1 Running 0 49s
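For comparison, a planned switchover (instead of a failover forced by deleting pods) can be requested through the cnpg plugin; a sketch, using the instance names above:
$ kubectl cnpg promote mycluster mycluster-2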
▶ [Failure 2] Drain the node hosting the primary pod (instance) and observe the behavior
→ (Optional) Check the operator logs
$ kubetail -l app.kubernetes.io/instance=cnpg -f
→ Drain the worker node
$ kubectl drain k8s-w1 --delete-emptydir-data --force --ignore-daemonsets && kubectl get node -w
node/k8s-w1 cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/kube-flannel-ds-gb76j, kube-system/kube-proxy-h22x6
evicting pod kube-system/coredns-64897985d-sbc7f
evicting pod default/mycluster-3
evicting pod kube-system/coredns-64897985d-mwx5s
error when evicting pods/"mycluster-3" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod default/mycluster-3
pod/coredns-64897985d-sbc7f evicted
pod/coredns-64897985d-mwx5s evicted
pod/mycluster-3 evicted
node/k8s-w1 drained
NAME STATUS ROLES AGE VERSION
k8s-m Ready control-plane,master 3h47m v1.23.6
k8s-w1 Ready,SchedulingDisabled <none> 3h47m v1.23.6
k8s-w2 Ready <none> 3h47m v1.23.6
k8s-w3 Ready <none> 3h47m v1.23.6
k8s-w3 Ready <none> 3h47m v1.23.6
k8s-w2 Ready <none> 3h47m v1.23.6
→ Check the cluster status
$ kubectl cnpg status mycluster
→ After verifying the behavior, uncordon the node
$ kubectl uncordon k8s-w1
→ Check the cluster status: a switchover to mycluster-2 is in progress, then the cluster returns to a healthy state
$ kubectl cnpg status mycluster
Cluster Summary
Name: mycluster
Namespace: default
System ID: 7111873327487692819
PostgreSQL Image: ghcr.io/cloudnative-pg/postgresql:14.2
Primary instance: mycluster-2
Status: Switchover in progress Switching over to mycluster-2, because primary instance was running on unschedulable node k8s-w1
Instances: 3
Ready instances: 2
Current Write LSN: 0/F003308 (Timeline: 4 - WAL File: 00000004000000000000000F)
Certificates Status
Certificate Name Expiration Date Days Left Until Expiration
---------------- --------------- --------------------------
mycluster-ca 2022-09-20 01:35:08 +0000 UTC 89.88
mycluster-replication 2022-09-20 01:35:08 +0000 UTC 89.88
mycluster-server 2022-09-20 01:35:08 +0000 UTC 89.88
Continuous Backup status
Not configured
Streaming Replication status
Name Sent LSN Write LSN Flush LSN Replay LSN Write Lag Flush Lag Replay Lag State Sync State Sync Priority
---- -------- --------- --------- ---------- --------- --------- ---------- ----- ---------- -------------
mycluster-4 0/F003308 0/F003308 0/F003308 0/F003308 00:00:00 00:00:00 00:00:00 streaming async 0
Instances status
Name Database Size Current LSN Replication role Status QoS Manager Version
---- ------------- ----------- ---------------- ------ --- ---------------
mycluster-2 57 MB 0/F003308 Primary OK BestEffort 1.15.1
mycluster-3 - - - pod not available BestEffort -
mycluster-4 57 MB 0/F003308 Standby (async) OK BestEffort 1.15.1
$ kubectl cnpg status mycluster
Cluster Summary
Name: mycluster
Namespace: default
System ID: 7111873327487692819
PostgreSQL Image: ghcr.io/cloudnative-pg/postgresql:14.2
Primary instance: mycluster-2
Status: Cluster in healthy state
Instances: 3
Ready instances: 3
Current Write LSN: 0/F003340 (Timeline: 4 - WAL File: 00000004000000000000000F)
Certificates Status
Certificate Name Expiration Date Days Left Until Expiration
---------------- --------------- --------------------------
mycluster-ca 2022-09-20 01:35:08 +0000 UTC 89.88
mycluster-replication 2022-09-20 01:35:08 +0000 UTC 89.88
mycluster-server 2022-09-20 01:35:08 +0000 UTC 89.88
Continuous Backup status
Not configured
Streaming Replication status
Name Sent LSN Write LSN Flush LSN Replay LSN Write Lag Flush Lag Replay Lag State Sync State Sync Priority
---- -------- --------- --------- ---------- --------- --------- ---------- ----- ---------- -------------
mycluster-4 0/F003340 0/F003340 0/F003340 0/F003340 00:00:00 00:00:00 00:00:00 streaming async 0
mycluster-3 0/F003340 0/F003340 0/F003340 0/F003340 00:00:00 00:00:00 00:00:00 streaming async 0
Instances status
Name Database Size Current LSN Replication role Status QoS Manager Version
---- ------------- ----------- ---------------- ------ --- ---------------
mycluster-2 57 MB 0/F003340 Primary OK BestEffort 1.15.1
mycluster-3 57 MB 0/F003340 Standby (async) OK BestEffort 1.15.1
mycluster-4 57 MB 0/F003340 Standby (async) OK BestEffort 1.15.1
4. CNPG Scaling & Rolling Updates
▶ CNPG scaling
→ Check the current state
$ kubectl cnpg status mycluster
Cluster Summary
Name: mycluster
Namespace: default
System ID: 7111873327487692819
PostgreSQL Image: ghcr.io/cloudnative-pg/postgresql:14.2
Primary instance: mycluster-2
Status: Cluster in healthy state
Instances: 3
Ready instances: 3
Current Write LSN: 0/11000000 (Timeline: 4 - WAL File: 000000040000000000000010)
Certificates Status
Certificate Name Expiration Date Days Left Until Expiration
---------------- --------------- --------------------------
mycluster-ca 2022-09-20 01:35:08 +0000 UTC 89.87
mycluster-replication 2022-09-20 01:35:08 +0000 UTC 89.87
mycluster-server 2022-09-20 01:35:08 +0000 UTC 89.87
Continuous Backup status
Not configured
Streaming Replication status
Name Sent LSN Write LSN Flush LSN Replay LSN Write Lag Flush Lag Replay Lag State Sync State Sync Priority
---- -------- --------- --------- ---------- --------- --------- ---------- ----- ---------- -------------
mycluster-4 0/11000000 0/11000000 0/11000000 0/11000000 00:00:00 00:00:00 00:00:00 streaming async 0
mycluster-3 0/11000000 0/11000000 0/11000000 0/11000000 00:00:00 00:00:00 00:00:00 streaming async 0
Instances status
Name Database Size Current LSN Replication role Status QoS Manager Version
---- ------------- ----------- ---------------- ------ --- ---------------
mycluster-2 57 MB 0/11000000 Primary OK BestEffort 1.15.1
mycluster-3 57 MB 0/11000000 Standby (async) OK BestEffort 1.15.1
mycluster-4 57 MB 0/11000000 Standby (async) OK BestEffort 1.15.1
$ kubectl get cluster mycluster
NAME AGE INSTANCES READY STATUS PRIMARY
mycluster 176m 3 3 Cluster in healthy state mycluster-2
→ Monitoring
$ watch kubectl get pod -l postgresql=mycluster
→ Scale out to 5 instances
$ kubectl patch cluster mycluster --type=merge -p '{"spec":{"instances":5}}'
cluster.postgresql.cnpg.io/mycluster patched
→ Verify the scale-out
$ kubectl get cluster mycluster
NAME AGE INSTANCES READY STATUS PRIMARY
mycluster 178m 5 5 Cluster in healthy state mycluster-2
$ kubectl cnpg status mycluster
Cluster Summary
Name: mycluster
Namespace: default
System ID: 7111873327487692819
PostgreSQL Image: ghcr.io/cloudnative-pg/postgresql:14.2
Primary instance: mycluster-2
Status: Cluster in healthy state
Instances: 5
Ready instances: 5
Current Write LSN: 0/14000060 (Timeline: 4 - WAL File: 000000040000000000000014)
Certificates Status
Certificate Name Expiration Date Days Left Until Expiration
---------------- --------------- --------------------------
mycluster-ca 2022-09-20 01:35:08 +0000 UTC 89.87
mycluster-replication 2022-09-20 01:35:08 +0000 UTC 89.87
mycluster-server 2022-09-20 01:35:08 +0000 UTC 89.87
Continuous Backup status
Not configured
Streaming Replication status
Name Sent LSN Write LSN Flush LSN Replay LSN Write Lag Flush Lag Replay Lag State Sync State Sync Priority
---- -------- --------- --------- ---------- --------- --------- ---------- ----- ---------- -------------
mycluster-4 0/14000060 0/14000060 0/14000060 0/14000060 00:00:00 00:00:00 00:00:00 streaming async 0
mycluster-3 0/14000060 0/14000060 0/14000060 0/14000060 00:00:00 00:00:00 00:00:00 streaming async 0
mycluster-5 0/14000060 0/14000060 0/14000060 0/14000060 00:00:00 00:00:00 00:00:00 streaming async 0
mycluster-6 0/14000060 0/14000060 0/14000060 0/14000060 00:00:00 00:00:00 00:00:00 streaming async 0
Instances status
Name Database Size Current LSN Replication role Status QoS Manager Version
---- ------------- ----------- ---------------- ------ --- ---------------
mycluster-2 57 MB 0/14000060 Primary OK BestEffort 1.15.1
mycluster-3 57 MB 0/14000060 Standby (async) OK BestEffort 1.15.1
mycluster-4 57 MB 0/14000060 Standby (async) OK BestEffort 1.15.1
mycluster-5 57 MB 0/14000060 Standby (async) OK BestEffort 1.15.1
mycluster-6 57 MB 0/14000060 Standby (async) OK BestEffort 1.15.1
→ Verify connections via the any service
$ for i in {1..30}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-any -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 172
8 172.16.1.9
7 172.16.2.7
6 172.16.2.10
5 172.16.3.8
4 172.16.1.5
→ Scale in to 3 instances
$ kubectl patch cluster mycluster --type=merge -p '{"spec":{"instances":3}}'
cluster.postgresql.cnpg.io/mycluster patched
→ Verify the scale-in
$ kubectl get cluster mycluster
NAME AGE INSTANCES READY STATUS PRIMARY
mycluster 3h 3 3 Cluster in healthy state mycluster-2
▶ Rolling update: standbys are updated first, then a switchover occurs before the primary is updated, minimizing downtime
→ Monitoring
# [Terminal 1] Monitoring
$ watch kubectl get pod -l cnpg.io/cluster=mycluster
NAME READY STATUS RESTARTS AGE
mycluster-2 1/1 Running 0 20m
mycluster-3 1/1 Running 0 16m
mycluster-4 1/1 Running 0 22m
# [Terminal 2] Monitoring
while true; do kubectl exec -it myclient2 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"; date;sleep 1; done
# [Terminal 3] INSERT a large number of rows into the test database
for ((i=10000; i<=20000; i++)); do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done
→ [Terminal 4] Update from postgresql:14.2 to postgresql:14.3
$ kubectl patch cluster mycluster --type=merge -p '{"spec":{"imageName":"ghcr.io/cloudnative-pg/postgresql:14.3"}}' && kubectl get pod -l postgresql=mycluster -w
→ Check the cluster status
$ kubectl get cluster mycluster
NAME AGE INSTANCES READY STATUS PRIMARY
mycluster 3h1m 2 2 Upgrading cluster mycluster-2
$ kubectl cnpg status mycluster | grep Image
PostgreSQL Image: ghcr.io/cloudnative-pg/postgresql:14.3
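To double-check which image each instance is actually running, the pod specs can be inspected directly (a sketch):
$ kubectl get pod -l cnpg.io/cluster=mycluster -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}{end}'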
→ The INSERTs are briefly interrupted during the patch
psql: error: connection to server at "mycluster-rw" (10.200.1.174), port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
command terminated with exit code 2
psql: error: connection to server at "mycluster-rw" (10.200.1.174), port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
command terminated with exit code 2
psql: error: connection to server at "mycluster-rw" (10.200.1.174), port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
command terminated with exit code 2
INSERT 0 1
INSERT 0 1
INSERT 0 1
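This behavior is driven by primaryUpdateStrategy: unsupervised in the cluster spec. With supervised, the operator updates the standbys but waits for a manual promotion before restarting the primary; a sketch of switching the strategy:
$ kubectl patch cluster mycluster --type=merge -p '{"spec":{"primaryUpdateStrategy":"supervised"}}'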
5. CNPG Miscellaneous
▶ Increasing pod volumes: online volume expansion is not possible here because the local-path provisioner does not support it!
→ Check pod and PVC info
$ watch kubectl get pod,pvc
NAME READY STATUS RESTARTS AGE
pod/cnpg-cloudnative-pg-5f8cc75df5-jk2v4 1/1 Running 0 3h19m
pod/myclient1 1/1 Running 0 60m
pod/myclient2 1/1 Running 0 60m
pod/mycluster-2 1/1 Running 0 4m55s
pod/mycluster-3 1/1 Running 0 5m25s
pod/mycluster-4 1/1 Running 0 5m50s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/mycluster-2 Bound pvc-c490f766-3a2b-4658-90d6-49dff5dc03e4 3Gi RWO local-path 3h6m
persistentvolumeclaim/mycluster-3 Bound pvc-d370b23b-14dc-4b90-bfb4-83fa9aa51290 3Gi RWO local-path 3h5m
persistentvolumeclaim/mycluster-4 Bound pvc-ca35067b-0291-4559-b919-639e91e8df67 3Gi RWO local-path 29m
→ Increase the PVC size from 3Gi to 5Gi (the size can only be increased, never decreased). Because local-path cannot expand volumes online, resizeInUseVolumes is first set to false so the new size only applies to newly created volumes.
$ kubectl patch cluster mycluster --type=merge -p '{"spec":{"storage":{"resizeInUseVolumes":false}}}'
cluster.postgresql.cnpg.io/mycluster patched
$ kubectl patch cluster mycluster --type=merge -p '{"spec":{"storage":{"size":"5Gi"}}}'
cluster.postgresql.cnpg.io/mycluster patched
$ kubectl describe cluster mycluster
→ Delete the pods/PVCs so they are recreated with the new size
$ kubectl delete pvc/mycluster-4 pod/mycluster-4
persistentvolumeclaim "mycluster-4" deleted
pod "mycluster-4" deleted
$ kubectl delete pvc/mycluster-3 pod/mycluster-3
persistentvolumeclaim "mycluster-3" deleted
pod "mycluster-3" deleted
$ kubectl delete pvc/mycluster-2 pod/mycluster-2
persistentvolumeclaim "mycluster-2" deleted
pod "mycluster-2" deleted
→ Verify the volume size increase
$ kubectl get pod,pvc
NAME READY STATUS RESTARTS AGE
pod/cnpg-cloudnative-pg-5f8cc75df5-jk2v4 1/1 Running 0 3h22m
pod/myclient1 1/1 Running 0 62m
pod/myclient2 1/1 Running 0 62m
pod/mycluster-7 1/1 Running 0 47s
pod/mycluster-8 1/1 Running 0 24s
pod/mycluster-9 1/1 Running 0 9s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/mycluster-7 Bound pvc-5c07d1ae-f6e1-4b05-829a-6512f6f33d30 5Gi RWO local-path 59s
persistentvolumeclaim/mycluster-8 Bound pvc-121e78c4-5f59-4895-aa14-5844a4b3098d 5Gi RWO local-path 35s
persistentvolumeclaim/mycluster-9 Bound pvc-75db182a-b42c-402a-8f23-119dfe9a848d 5Gi RWO local-path 20s
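Because each instance was deleted one at a time and re-cloned from the remaining members, the data should have survived; a quick sanity check (a sketch):
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "SELECT COUNT(*) FROM t1"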