This post documents what I tested during Gisida's Database Operator In Kubernetes (DOIK) study.
Cloud Native PostgreSQL (CNPG) Operator Test
1. CNPG (Cloud Native PostgreSQL) Installation
2. CNPG Basic Usage
3. CNPG Failure Tests
4. CNPG Scale & Rolling Update
5. CNPG Miscellaneous
- What is Cloud Native PostgreSQL (CNPG)?
CloudNativePG is an open-source operator designed to manage PostgreSQL workloads on any supported Kubernetes cluster, running in private, public, hybrid, or multi-cloud environments.
- Architecture
- Read-write workloads

Applications connect through the service with the -rw suffix to whichever PostgreSQL instance the operator currently designates as the primary. If the primary becomes temporarily or permanently unavailable, the operator moves the -rw service to another instance in the cluster for high availability.
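In practice an application just points its connection string at the -rw service. A minimal sketch, assuming the mycluster example created later in this post (its bootstrap step creates an app database owned by an app user):

# Hypothetical application-side DSN; the -rw host always resolves to the current primary
postgresql://app:<password>@mycluster-rw.default.svc:5432/app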
- Read-Only workloads

Applications can access the hot-standby replicas through the -ro service provided by the operator. This service lets applications offload read-only queries from the primary.
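Since hot standbys only accept read-only transactions, a quick sanity check is that any write sent through -ro fails. A sketch, assuming the mycluster setup built later in this post; the error would look something like:

$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -c "CREATE TABLE probe (c1 INT);"
ERROR:  cannot execute CREATE TABLE in a read-only transaction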
- Multi-cluster deployments

PostgreSQL replicated across two different Kubernetes clusters: the primary cluster lives in the first Kubernetes cluster and a replica cluster lives in the second. The second Kubernetes cluster acts as the disaster-recovery (DR) site.
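On the DR side this is declared as a replica Cluster that bootstraps from, and keeps following, the origin. A rough sketch of such a manifest, with hypothetical names and the TLS/credential plumbing omitted (see the CNPG docs for the full recipe):

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: mycluster-dr            # hypothetical; runs in the second Kubernetes cluster
spec:
  instances: 3
  bootstrap:
    pg_basebackup:
      source: mycluster-origin  # clone the initial data from the origin cluster
  replica:
    enabled: true               # keep replicating instead of promoting a local primary
    source: mycluster-origin
  externalClusters:
    - name: mycluster-origin
      connectionParameters:
        host: mycluster-rw.example.com   # reachable endpoint of the origin (assumed)
        user: streaming_replica
      # certificates/secrets for streaming replication omitted in this sketch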
1. CNPG (Cloud Native PostgreSQL) Installation
▶ Add the Helm repo
$ helm repo add cnpg https://cloudnative-pg.github.io/charts
"cnpg" has been added to your repositories
▶ Install the operator
$ helm install cnpg cnpg/cloudnative-pg -f ~/DOIK/5/values.yaml
NAME: cnpg
LAST DEPLOYED: Wed Jun 22 10:27:50 2022
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
CloudNativePG operator should be installed in namespace "default".
You can now create a PostgreSQL cluster with 3 nodes in the current namespace as follows:

cat <<EOF | kubectl apply -f -
# Example of PostgreSQL cluster
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  instances: 3
  storage:
    size: 1Gi
EOF

$ kubectl get cluster
▶ Check the CRDs
$ kubectl get crd
NAME                                  CREATED AT
backups.postgresql.cnpg.io            2022-06-22T01:27:52Z
clusters.postgresql.cnpg.io           2022-06-22T01:27:52Z
poolers.postgresql.cnpg.io            2022-06-22T01:27:52Z
scheduledbackups.postgresql.cnpg.io   2022-06-22T01:27:52Z
▶ Deploy a PostgreSQL cluster
$ cat ~/DOIK/5/mycluster1.yaml
# Example of PostgreSQL cluster
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: mycluster
spec:
  imageName: ghcr.io/cloudnative-pg/postgresql:14.2
  instances: 3
  storage:
    size: 3Gi
  postgresql:
    parameters:
      max_worker_processes: "40"
      timezone: "Asia/Seoul"
    pg_hba:
      - host all postgres all trust
  primaryUpdateStrategy: unsupervised
  enableSuperuserAccess: true
  bootstrap:
    initdb:
      database: app
      encoding: UTF8
      localeCType: C
      localeCollate: C
      owner: app

$ kubectl apply -f ~/DOIK/5/mycluster1.yaml
cluster.postgresql.cnpg.io/mycluster created

$ kubectl get pod -w
NAME                                   READY   STATUS    RESTARTS   AGE
cnpg-cloudnative-pg-5f8cc75df5-jk2v4   1/1     Running   0          17m
mycluster-1                            1/1     Running   0          4m23s
mycluster-2                            1/1     Running   0          3m36s
mycluster-3                            1/1     Running   0          2m52s
▶ Check the cluster
$ kubectl get all
NAME                                       READY   STATUS    RESTARTS   AGE
pod/cnpg-cloudnative-pg-5f8cc75df5-jk2v4   1/1     Running   0          30m
pod/mycluster-1                            1/1     Running   0          16m
pod/mycluster-2                            1/1     Running   0          16m
pod/mycluster-3                            1/1     Running   0          15m

NAME                           TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
service/cnpg-webhook-service   ClusterIP   10.200.1.33   <none>        443/TCP   30m

$ kubectl get cluster
NAME        AGE   INSTANCES   READY   STATUS                     PRIMARY
mycluster   17m   3           3       Cluster in healthy state   mycluster-1

# --verbose (-v) also prints the applied PostgreSQL configuration
$ kubectl cnpg status mycluster --verbose
Cluster Summary
Name:                mycluster
Namespace:           default
System ID:           7111873327487692819
PostgreSQL Image:    ghcr.io/cloudnative-pg/postgresql:14.2
Primary instance:    mycluster-1
Status:              Cluster in healthy state
Instances:           3
Ready instances:     3
Current Write LSN:   0/6000060 (Timeline: 1 - WAL File: 000000010000000000000006)

PostgreSQL Configuration
archive_command = '/controller/manager wal-archive --log-destination /controller/log/postgres.json %p'
archive_mode = 'on'
archive_timeout = '5min'
cluster_name = 'mycluster'
dynamic_shared_memory_type = 'posix'
full_page_writes = 'on'
hot_standby = 'true'
listen_addresses = '*'
log_destination = 'csvlog'
log_directory = '/controller/log'
log_filename = 'postgres'
log_rotation_age = '0'
log_rotation_size = '0'
log_truncate_on_rotation = 'false'
logging_collector = 'on'
max_parallel_workers = '32'
max_replication_slots = '32'
max_worker_processes = '40'
port = '5432'
restart_after_crash = 'false'
shared_memory_type = 'mmap'
shared_preload_libraries = ''
ssl = 'on'
ssl_ca_file = '/controller/certificates/client-ca.crt'
ssl_cert_file = '/controller/certificates/server.crt'
ssl_key_file = '/controller/certificates/server.key'
timezone = 'Asia/Seoul'
unix_socket_directories = '/controller/run'
wal_keep_size = '512MB'
wal_level = 'logical'
wal_log_hints = 'on'
wal_receiver_timeout = '5s'
wal_sender_timeout = '5s'
cnpg.config_sha256 = '8a3f5126327a68ea817baaac7b4e96184e10c86208e93c126750f40c92a30747'

PostgreSQL HBA Rules
# Grant local access
local all all peer map=local
# Require client certificate authentication for the streaming_replica user
hostssl postgres streaming_replica all cert
hostssl replication streaming_replica all cert
hostssl all cnpg_pooler_pgbouncer all cert
host all postgres all trust
# Otherwise use the default authentication method
host all all all scram-sha-256

Certificates Status
Certificate Name        Expiration Date                Days Left Until Expiration
----------------        ---------------                --------------------------
mycluster-ca            2022-09-20 01:35:08 +0000 UTC  89.98
mycluster-replication   2022-09-20 01:35:08 +0000 UTC  89.98
mycluster-server        2022-09-20 01:35:08 +0000 UTC  89.98

Continuous Backup status
Not configured

Streaming Replication status
Name          Sent LSN    Write LSN   Flush LSN   Replay LSN   Write Lag  Flush Lag  Replay Lag  State      Sync State  Sync Priority
----          --------    ---------   ---------   ----------   ---------  ---------  ----------  -----      ----------  -------------
mycluster-2   0/6000060   0/6000060   0/6000060   0/6000060    00:00:00   00:00:00   00:00:00    streaming  async       0
mycluster-3   0/6000060   0/6000060   0/6000060   0/6000060    00:00:00   00:00:00   00:00:00    streaming  async       0

Instances status
Name          Database Size   Current LSN   Replication role   Status   QoS          Manager Version
----          -------------   -----------   ----------------   ------   ---          ---------------
mycluster-1   33 MB           0/6000060     Primary            OK       BestEffort   1.15.1
mycluster-2   33 MB           0/6000060     Standby (async)    OK       BestEffort   1.15.1
mycluster-3   33 MB           0/6000060     Standby (async)    OK       BestEffort   1.15.1
▶ Check the basic resources created for the cluster
$ kubectl get pod,deploy
NAME                                       READY   STATUS    RESTARTS   AGE
pod/cnpg-cloudnative-pg-5f8cc75df5-jk2v4   1/1     Running   0          34m
pod/mycluster-1                            1/1     Running   0          21m
pod/mycluster-2                            1/1     Running   0          20m
pod/mycluster-3                            1/1     Running   0          19m

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cnpg-cloudnative-pg   1/1     1            1           34m

$ kubectl get svc,ep,endpointslices -l cnpg.io/cluster=mycluster
NAME                    TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/mycluster-any   ClusterIP   10.200.1.195   <none>        5432/TCP   22m
service/mycluster-r     ClusterIP   10.200.1.232   <none>        5432/TCP   22m
service/mycluster-ro    ClusterIP   10.200.1.69    <none>        5432/TCP   22m
service/mycluster-rw    ClusterIP   10.200.1.174   <none>        5432/TCP   22m

NAME                      ENDPOINTS                                         AGE
endpoints/mycluster-any   172.16.1.4:5432,172.16.2.6:5432,172.16.3.4:5432   22m
endpoints/mycluster-r     172.16.1.4:5432,172.16.2.6:5432,172.16.3.4:5432   22m
endpoints/mycluster-ro    172.16.1.4:5432,172.16.2.6:5432                   22m
endpoints/mycluster-rw    172.16.3.4:5432                                   22m

NAME                                                 ADDRESSTYPE   PORTS   ENDPOINTS                          AGE
endpointslice.discovery.k8s.io/mycluster-any-gjmlp   IPv4          5432    172.16.3.4,172.16.1.4,172.16.2.6   22m
endpointslice.discovery.k8s.io/mycluster-r-98r65     IPv4          5432    172.16.3.4,172.16.1.4,172.16.2.6   22m
endpointslice.discovery.k8s.io/mycluster-ro-kh6fm    IPv4          5432    172.16.1.4,172.16.2.6              22m
endpointslice.discovery.k8s.io/mycluster-rw-4gcpn    IPv4          5432    172.16.3.4                         22m

$ kubectl get cm,secret
NAME                                       DATA   AGE
configmap/cnpg-controller-manager-config   0      34m
configmap/cnpg-default-monitoring          1      34m
configmap/kube-root-ca.crt                 1      85m

NAME                                     TYPE                                  DATA   AGE
secret/cnpg-ca-secret                    Opaque                                2      34m
secret/cnpg-cloudnative-pg-token-c7sqs   kubernetes.io/service-account-token   3      34m
secret/cnpg-webhook-cert                 kubernetes.io/tls                     2      34m
secret/default-token-j8k8p               kubernetes.io/service-account-token   3      85m
secret/mycluster-app                     kubernetes.io/basic-auth              3      22m
secret/mycluster-ca                      Opaque                                2      22m
secret/mycluster-replication             kubernetes.io/tls                     2      22m
secret/mycluster-server                  kubernetes.io/tls                     2      22m
secret/mycluster-superuser               kubernetes.io/basic-auth              3      22m
secret/mycluster-token-jzdtw             kubernetes.io/service-account-token   3      22m
secret/sh.helm.release.v1.cnpg.v1        helm.sh/release.v1                    1      34m

$ kubectl get pdb
NAME                MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
mycluster           1               N/A               1                     22m
mycluster-primary   1               N/A               0                     22m
2. CNPG Basic Usage
▶ Check the two secrets that hold credentials
$ kubectl get secret -l cnpg.io/cluster=mycluster
NAME                  TYPE                       DATA   AGE
mycluster-app         kubernetes.io/basic-auth   3      123m
mycluster-superuser   kubernetes.io/basic-auth   3      123m
▶ Check the DB account information
# Superuser account name
$ kubectl get secrets mycluster-superuser -o jsonpath={.data.username} | base64 -d ;echo
postgres

# app account name
$ kubectl get secrets mycluster-app -o jsonpath={.data.username} | base64 -d ;echo
app

# app account password
$ kubectl get secrets mycluster-app -o jsonpath={.data.password} | base64 -d ; echo
wgyiBXwnWHPoXF9plDaulXX34U8xQ0WvZOKdsoP5ozD95j9PMkkSRbHXW6AArRaq
▶ Store the app account password in a variable
$ AUSERPW=$(kubectl get secrets mycluster-app -o jsonpath={.data.password} | base64 -d)
▶ Deploy two myclient pods using envsubst
$ cat ~/DOIK/5/myclient.yaml
apiVersion: v1
kind: Pod
metadata:
  name: ${PODNAME}
  labels:
    app: myclient
spec:
  nodeName: k8s-m
  containers:
    - name: ${PODNAME}
      image: bitnami/postgresql:${VERSION}
      command: ["tail"]
      args: ["-f", "/dev/null"]
  terminationGracePeriodSeconds: 0

$ for ((i=1; i<=2; i++)); do PODNAME=myclient$i VERSION=14.3.0 envsubst < ~/DOIK/5/myclient.yaml | kubectl apply -f - ; done
pod/myclient1 created
pod/myclient2 created
▶ [myclient1] Connect to the mycluster-rw service as the superuser
# Connect
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432
psql (14.3, server 14.2 (Debian 14.2-1.pgdg110+1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
postgres=#

# Check the connection info
postgres=# \conninfo
You are connected to database "postgres" as user "postgres" on host "mycluster-rw" (address "10.200.1.174") at port "5432".
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
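The app credentials from the mycluster-app secret work the same way. A quick sketch using the AUSERPW variable set earlier; the env wrapper simply injects PGPASSWORD into the psql process so no interactive prompt is needed:

$ kubectl exec -it myclient1 -- env PGPASSWORD=$AUSERPW psql -U app -h mycluster-rw -p 5432 -d app -c "\conninfo"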
▶ Download the sample data for testing (https://www.postgresqltutorial.com/postgresql-getting-started)
$ curl -LO https://www.postgresqltutorial.com/wp-content/uploads/2019/05/dvdrental.zip
$ apt install unzip -y && unzip dvdrental.zip
▶ Copy dvdrental.tar into the myclient1 pod
$ kubectl cp dvdrental.tar myclient1:/tmp
▶ [myclient1] Connect to the mycluster-rw service as the superuser and create a database
$ kubectl exec -it myclient1 -- createdb -U postgres -h mycluster-rw -p 5432 dvdrental
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -l
                                 List of databases
   Name    |  Owner   | Encoding | Collate | Ctype |   Access privileges
-----------+----------+----------+---------+-------+-----------------------
 app       | app      | UTF8     | C       | C     |
 dvdrental | postgres | UTF8     | C       | C     |
 postgres  | postgres | UTF8     | C       | C     |
 template0 | postgres | UTF8     | C       | C     | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres
 template1 | postgres | UTF8     | C       | C     | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres
(5 rows)
▶ DVD Rental Sample Database Import
$ kubectl exec -it myclient1 -- pg_restore -U postgres -d dvdrental /tmp/dvdrental.tar -h mycluster-rw -p 5432

# Query a table in the DVD Rental database
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d dvdrental -c "SELECT * FROM actor"
 actor_id | first_name |  last_name   |      last_update
----------+------------+--------------+------------------------
        1 | Penelope   | Guiness      | 2013-05-26 14:47:57.62
        2 | Nick       | Wahlberg     | 2013-05-26 14:47:57.62
        3 | Ed         | Chase        | 2013-05-26 14:47:57.62
        4 | Jennifer   | Davis        | 2013-05-26 14:47:57.62
        5 | Johnny     | Lollobrigida | 2013-05-26 14:47:57.62
        6 | Bette      | Nicholson    | 2013-05-26 14:47:57.62
... (omitted)
▶ Connect to each pod and verify the DVD Rental database is replicated in sync
# Store the pod IPs in variables
$ POD1=$(kubectl get pod mycluster-1 -o jsonpath={.status.podIP})
$ POD2=$(kubectl get pod mycluster-2 -o jsonpath={.status.podIP})
$ POD3=$(kubectl get pod mycluster-3 -o jsonpath={.status.podIP})

# Count the rows in the actor table on each pod
$ kubectl exec -it myclient1 -- psql -U postgres -h $POD1 -p 5432 -d dvdrental -c "SELECT COUNT(*) FROM actor"
 count
-------
   200
(1 row)

$ kubectl exec -it myclient1 -- psql -U postgres -h $POD2 -p 5432 -d dvdrental -c "SELECT COUNT(*) FROM actor"
 count
-------
   200
(1 row)

$ kubectl exec -it myclient1 -- psql -U postgres -h $POD3 -p 5432 -d dvdrental -c "SELECT COUNT(*) FROM actor"
 count
-------
   200
(1 row)
▶ Compare the rw, ro, r, and any services
→ rw: connects only to the node that accepts writes (the primary)
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -c "select inet_server_addr();"
$ for i in {1..30}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 172
==> 30 172.16.3.4
→ ro: connects only to read-only nodes (the standbys)
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -c "select inet_server_addr();"
$ for i in {1..30}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 172
==> 18 172.16.1.4
==> 12 172.16.2.6
→ r: connects to any instance for reads, the primary included (as the output below shows, requests land on all three nodes)
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-r -p 5432 -c "select inet_server_addr();"
$ for i in {1..30}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-r -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 172
==> 11 172.16.1.4
==> 10 172.16.2.6
==> 9 172.16.3.4
→ any: connects to any instance
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-any -p 5432 -c "select inet_server_addr();"
$ for i in {1..30}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-any -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 172
==> 12 172.16.1.4
==> 9 172.16.3.4
==> 9 172.16.2.6
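The routing difference comes from each Service's label selector: the operator keeps a role label on the pods in sync with their current replication role, so -rw selects only the primary, -ro only the replicas, and -r/-any all instances. The selectors can be inspected directly (a sketch; the exact label keys may differ across CNPG versions):

$ for s in rw ro r any; do echo -n "mycluster-$s: "; kubectl get svc mycluster-$s -o jsonpath='{.spec.selector}'; echo; done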
3. CNPG Failure Tests
▶ Preparation for the failure tests
# Store the pod IPs in variables
POD1=$(kubectl get pod mycluster-1 -o jsonpath={.status.podIP})
POD2=$(kubectl get pod mycluster-2 -o jsonpath={.status.podIP})
POD3=$(kubectl get pod mycluster-3 -o jsonpath={.status.podIP})

# query.sql
$ cat ~/DOIK/5/query.sql
CREATE DATABASE test;
\c test;
CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 TEXT NOT NULL);
INSERT INTO t1 VALUES (1, 'Luis');

# Run the SQL file
$ kubectl cp ~/DOIK/5/query.sql myclient1:/tmp
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -f /tmp/query.sql
CREATE DATABASE
psql (14.3, server 14.2 (Debian 14.2-1.pgdg110+1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
You are now connected to database "test" as user "postgres".
CREATE TABLE
INSERT 0 1

# Verify
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT * FROM t1"
 c1 |  c2
----+------
  1 | Luis
(1 row)

# After one more INSERT:
 c1 |  c2
----+-------
  1 | Luis
  2 | Luis2
(2 rows)

# INSERT 98 more rows into the test database
$ for ((i=3; i<=100; i++)); do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done
$ kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"
 count
-------
   100
(1 row)

# [Terminal 2] Monitoring
$ while true; do kubectl exec -it myclient2 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"; date;sleep 1; done
▶ [Failure 1] Force-delete the primary pod (instance) and check the behavior
→ Check which pod is the primary
$ kubectl cnpg status mycluster
→ [Terminal 1] and [Terminal 2] monitoring
# [Terminal 1] Monitoring
$ watch kubectl get pod -l cnpg.io/cluster=mycluster

# [Terminal 2] Monitoring
$ while true; do kubectl exec -it myclient2 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"; date;sleep 1; done
→ [Terminal 3] Repeatedly INSERT a large number of rows into the test database
$ for ((i=301; i<=10000; i++)); do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done
# $ for ((i=10001; i<=20000; i++)); do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done
→ [Terminal 4] Delete the pod
$ kubectl delete pvc/mycluster-1 pod/mycluster-2
pod "mycluster-2" deleted
→ Verify the deletion
$ kubectl cnpg status mycluster
Cluster Summary
Name:                mycluster
Namespace:           default
System ID:           7111873327487692819
PostgreSQL Image:    ghcr.io/cloudnative-pg/postgresql:14.2
Primary instance:    mycluster-3
Status:              Failing over  Failing over from mycluster-2 to mycluster-3
Instances:           3
Ready instances:     2
Current Write LSN:   0/E007850 (Timeline: 3 - WAL File: 00000003000000000000000E)

Certificates Status
Certificate Name        Expiration Date                Days Left Until Expiration
----------------        ---------------                --------------------------
mycluster-ca            2022-09-20 01:35:08 +0000 UTC  89.88
mycluster-replication   2022-09-20 01:35:08 +0000 UTC  89.88
mycluster-server        2022-09-20 01:35:08 +0000 UTC  89.88

Continuous Backup status
Not configured

Streaming Replication status
Name          Sent LSN    Write LSN   Flush LSN   Replay LSN   Write Lag        Flush Lag        Replay Lag       State      Sync State  Sync Priority
----          --------    ---------   ---------   ----------   ---------        ---------        ----------       -----      ----------  -------------
mycluster-4   0/E007850   0/E007850   0/E007850   0/E007850    00:00:00.000721  00:00:00.001346  00:00:00.001411  streaming  async       0

Instances status
Name          Database Size   Current LSN   Replication role   Status              QoS          Manager Version
----          -------------   -----------   ----------------   ------              ---          ---------------
mycluster-2   -               -             -                  pod not available   BestEffort   -
mycluster-3   57 MB           0/E007850     Primary            OK                  BestEffort   1.15.1
mycluster-4   57 MB           0/E007908     Standby (async)    OK                  BestEffort   1.15.1
→ Inserts resume after a few seconds of interruption
psql: error: connection to server at "mycluster-rw" (10.200.1.174), port 5432 failed: Connection refused
        Is the server running on that host and accepting TCP/IP connections?
command terminated with exit code 2
psql: error: connection to server at "mycluster-rw" (10.200.1.174), port 5432 failed: Connection refused
        Is the server running on that host and accepting TCP/IP connections?
command terminated with exit code 2
INSERT 0 1
INSERT 0 1
INSERT 0 1
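A client has to ride out this brief window on its own. A crude shell-level sketch (purely illustrative, not from the lab) that blocks until the -rw service answers again:

$ until kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -c "SELECT 1" >/dev/null 2>&1; do echo "primary not reachable yet, retrying..."; sleep 1; done; echo "primary is back"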
→ Check the pods: a replacement pod has been created
$ kubectl get pod -l cnpg.io/cluster=mycluster
NAME          READY   STATUS    RESTARTS   AGE
mycluster-2   1/1     Running   0          157m
mycluster-3   1/1     Running   0          157m
mycluster-4   1/1     Running   0          49s
▶ [Failure 2] Drain the node hosting the primary pod (instance) and check the behavior
→ (Optional) Check the operator logs
$ kubetail -l app.kubernetes.io/instance=cnpg -f
→ Drain the worker node
$ kubectl drain k8s-w1 --delete-emptydir-data --force --ignore-daemonsets && kubectl get node -w
node/k8s-w1 cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/kube-flannel-ds-gb76j, kube-system/kube-proxy-h22x6
evicting pod kube-system/coredns-64897985d-sbc7f
evicting pod default/mycluster-3
evicting pod kube-system/coredns-64897985d-mwx5s
error when evicting pods/"mycluster-3" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod default/mycluster-3
pod/coredns-64897985d-sbc7f evicted
pod/coredns-64897985d-mwx5s evicted
pod/mycluster-3 evicted
node/k8s-w1 drained
NAME     STATUS                     ROLES                  AGE     VERSION
k8s-m    Ready                      control-plane,master   3h47m   v1.23.6
k8s-w1   Ready,SchedulingDisabled   <none>                 3h47m   v1.23.6
k8s-w2   Ready                      <none>                 3h47m   v1.23.6
k8s-w3   Ready                      <none>                 3h47m   v1.23.6
k8s-w3   Ready                      <none>                 3h47m   v1.23.6
k8s-w2   Ready                      <none>                 3h47m   v1.23.6
→ Check the cluster status
$ kubectl cnpg status mycluster
Cluster Summary
Name:                mycluster
Namespace:           default
System ID:           7111873327487692819
PostgreSQL Image:    ghcr.io/cloudnative-pg/postgresql:14.2
Primary instance:    mycluster-3
Status:              Failing over  Failing over from mycluster-2 to mycluster-3
Instances:           3
Ready instances:     2
Current Write LSN:   0/E007850 (Timeline: 3 - WAL File: 00000003000000000000000E)

Certificates Status
Certificate Name        Expiration Date                Days Left Until Expiration
----------------        ---------------                --------------------------
mycluster-ca            2022-09-20 01:35:08 +0000 UTC  89.88
mycluster-replication   2022-09-20 01:35:08 +0000 UTC  89.88
mycluster-server        2022-09-20 01:35:08 +0000 UTC  89.88

Continuous Backup status
Not configured

Streaming Replication status
Name          Sent LSN    Write LSN   Flush LSN   Replay LSN   Write Lag        Flush Lag        Replay Lag       State      Sync State  Sync Priority
----          --------    ---------   ---------   ----------   ---------        ---------        ----------       -----      ----------  -------------
mycluster-4   0/E007850   0/E007850   0/E007850   0/E007850    00:00:00.000721  00:00:00.001346  00:00:00.001411  streaming  async       0

Instances status
Name          Database Size   Current LSN   Replication role   Status              QoS          Manager Version
----          -------------   -----------   ----------------   ------              ---          ---------------
mycluster-2   -               -             -                  pod not available   BestEffort   -
mycluster-3   57 MB           0/E007850     Primary            OK                  BestEffort   1.15.1
mycluster-4   57 MB           0/E007908     Standby (async)    OK                  BestEffort   1.15.1
→ After verifying the behavior, uncordon the node
$ kubectl uncordon k8s-w1
→ Check the cluster status again
$ kubectl cnpg status mycluster
Cluster Summary
Name:                mycluster
Namespace:           default
System ID:           7111873327487692819
PostgreSQL Image:    ghcr.io/cloudnative-pg/postgresql:14.2
Primary instance:    mycluster-2
Status:              Switchover in progress  Switching over to mycluster-2, because primary instance was running on unschedulable node k8s-w1
Instances:           3
Ready instances:     2
Current Write LSN:   0/F003308 (Timeline: 4 - WAL File: 00000004000000000000000F)

Certificates Status
Certificate Name        Expiration Date                Days Left Until Expiration
----------------        ---------------                --------------------------
mycluster-ca            2022-09-20 01:35:08 +0000 UTC  89.88
mycluster-replication   2022-09-20 01:35:08 +0000 UTC  89.88
mycluster-server        2022-09-20 01:35:08 +0000 UTC  89.88

Continuous Backup status
Not configured

Streaming Replication status
Name          Sent LSN    Write LSN   Flush LSN   Replay LSN   Write Lag  Flush Lag  Replay Lag  State      Sync State  Sync Priority
----          --------    ---------   ---------   ----------   ---------  ---------  ----------  -----      ----------  -------------
mycluster-4   0/F003308   0/F003308   0/F003308   0/F003308    00:00:00   00:00:00   00:00:00    streaming  async       0

Instances status
Name          Database Size   Current LSN   Replication role   Status              QoS          Manager Version
----          -------------   -----------   ----------------   ------              ---          ---------------
mycluster-2   57 MB           0/F003308     Primary            OK                  BestEffort   1.15.1
mycluster-3   -               -             -                  pod not available   BestEffort   -
mycluster-4   57 MB           0/F003308     Standby (async)    OK                  BestEffort   1.15.1

$ kubectl cnpg status mycluster
Cluster Summary
Name:                mycluster
Namespace:           default
System ID:           7111873327487692819
PostgreSQL Image:    ghcr.io/cloudnative-pg/postgresql:14.2
Primary instance:    mycluster-2
Status:              Cluster in healthy state
Instances:           3
Ready instances:     3
Current Write LSN:   0/F003340 (Timeline: 4 - WAL File: 00000004000000000000000F)

Certificates Status
Certificate Name        Expiration Date                Days Left Until Expiration
----------------        ---------------                --------------------------
mycluster-ca            2022-09-20 01:35:08 +0000 UTC  89.88
mycluster-replication   2022-09-20 01:35:08 +0000 UTC  89.88
mycluster-server        2022-09-20 01:35:08 +0000 UTC  89.88

Continuous Backup status
Not configured

Streaming Replication status
Name          Sent LSN    Write LSN   Flush LSN   Replay LSN   Write Lag  Flush Lag  Replay Lag  State      Sync State  Sync Priority
----          --------    ---------   ---------   ----------   ---------  ---------  ----------  -----      ----------  -------------
mycluster-4   0/F003340   0/F003340   0/F003340   0/F003340    00:00:00   00:00:00   00:00:00    streaming  async       0
mycluster-3   0/F003340   0/F003340   0/F003340   0/F003340    00:00:00   00:00:00   00:00:00    streaming  async       0

Instances status
Name          Database Size   Current LSN   Replication role   Status   QoS          Manager Version
----          -------------   -----------   ----------------   ------   ---          ---------------
mycluster-2   57 MB           0/F003340     Primary            OK       BestEffort   1.15.1
mycluster-3   57 MB           0/F003340     Standby (async)    OK       BestEffort   1.15.1
mycluster-4   57 MB           0/F003340     Standby (async)    OK       BestEffort   1.15.1
4. CNPG Scale & Rolling Update
▶ CNPG scaling
→ Check the current state
$ kubectl cnpg status mycluster
Cluster Summary
Name:                mycluster
Namespace:           default
System ID:           7111873327487692819
PostgreSQL Image:    ghcr.io/cloudnative-pg/postgresql:14.2
Primary instance:    mycluster-2
Status:              Cluster in healthy state
Instances:           3
Ready instances:     3
Current Write LSN:   0/11000000 (Timeline: 4 - WAL File: 000000040000000000000010)

Certificates Status
Certificate Name        Expiration Date                Days Left Until Expiration
----------------        ---------------                --------------------------
mycluster-ca            2022-09-20 01:35:08 +0000 UTC  89.87
mycluster-replication   2022-09-20 01:35:08 +0000 UTC  89.87
mycluster-server        2022-09-20 01:35:08 +0000 UTC  89.87

Continuous Backup status
Not configured

Streaming Replication status
Name          Sent LSN     Write LSN    Flush LSN    Replay LSN   Write Lag  Flush Lag  Replay Lag  State      Sync State  Sync Priority
----          --------     ---------    ---------    ----------   ---------  ---------  ----------  -----      ----------  -------------
mycluster-4   0/11000000   0/11000000   0/11000000   0/11000000   00:00:00   00:00:00   00:00:00    streaming  async       0
mycluster-3   0/11000000   0/11000000   0/11000000   0/11000000   00:00:00   00:00:00   00:00:00    streaming  async       0

Instances status
Name          Database Size   Current LSN   Replication role   Status   QoS          Manager Version
----          -------------   -----------   ----------------   ------   ---          ---------------
mycluster-2   57 MB           0/11000000    Primary            OK       BestEffort   1.15.1
mycluster-3   57 MB           0/11000000    Standby (async)    OK       BestEffort   1.15.1
mycluster-4   57 MB           0/11000000    Standby (async)    OK       BestEffort   1.15.1

$ kubectl get cluster mycluster
NAME        AGE    INSTANCES   READY   STATUS                     PRIMARY
mycluster   176m   3           3       Cluster in healthy state   mycluster-2
→ Monitoring
$ watch kubectl get pod -l postgresql=mycluster
→ Scale out to 5 instances
$ kubectl patch cluster mycluster --type=merge -p '{"spec":{"instances":5}}'
cluster.postgresql.cnpg.io/mycluster patched
→ Verify the scale-out
$ kubectl get cluster mycluster
NAME        AGE    INSTANCES   READY   STATUS                     PRIMARY
mycluster   178m   5           5       Cluster in healthy state   mycluster-2

$ kubectl cnpg status mycluster
Cluster Summary
Name:                mycluster
Namespace:           default
System ID:           7111873327487692819
PostgreSQL Image:    ghcr.io/cloudnative-pg/postgresql:14.2
Primary instance:    mycluster-2
Status:              Cluster in healthy state
Instances:           5
Ready instances:     5
Current Write LSN:   0/14000060 (Timeline: 4 - WAL File: 000000040000000000000014)

Certificates Status
Certificate Name        Expiration Date                Days Left Until Expiration
----------------        ---------------                --------------------------
mycluster-ca            2022-09-20 01:35:08 +0000 UTC  89.87
mycluster-replication   2022-09-20 01:35:08 +0000 UTC  89.87
mycluster-server        2022-09-20 01:35:08 +0000 UTC  89.87

Continuous Backup status
Not configured

Streaming Replication status
Name          Sent LSN     Write LSN    Flush LSN    Replay LSN   Write Lag  Flush Lag  Replay Lag  State      Sync State  Sync Priority
----          --------     ---------    ---------    ----------   ---------  ---------  ----------  -----      ----------  -------------
mycluster-4   0/14000060   0/14000060   0/14000060   0/14000060   00:00:00   00:00:00   00:00:00    streaming  async       0
mycluster-3   0/14000060   0/14000060   0/14000060   0/14000060   00:00:00   00:00:00   00:00:00    streaming  async       0
mycluster-5   0/14000060   0/14000060   0/14000060   0/14000060   00:00:00   00:00:00   00:00:00    streaming  async       0
mycluster-6   0/14000060   0/14000060   0/14000060   0/14000060   00:00:00   00:00:00   00:00:00    streaming  async       0

Instances status
Name          Database Size   Current LSN   Replication role   Status   QoS          Manager Version
----          -------------   -----------   ----------------   ------   ---          ---------------
mycluster-2   57 MB           0/14000060    Primary            OK       BestEffort   1.15.1
mycluster-3   57 MB           0/14000060    Standby (async)    OK       BestEffort   1.15.1
mycluster-4   57 MB           0/14000060    Standby (async)    OK       BestEffort   1.15.1
mycluster-5   57 MB           0/14000060    Standby (async)    OK       BestEffort   1.15.1
mycluster-6   57 MB           0/14000060    Standby (async)    OK       BestEffort   1.15.1
→ Check connections through the any service
$ for i in {1..30}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-any -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 172
   8 172.16.1.9
   7 172.16.2.7
   6 172.16.2.10
   5 172.16.3.8
   4 172.16.1.5
→ Scale in to 3 instances
$ kubectl patch cluster mycluster --type=merge -p '{"spec":{"instances":3}}'
cluster.postgresql.cnpg.io/mycluster patched
→ Verify the scale-in
$ kubectl get cluster mycluster
NAME        AGE   INSTANCES   READY   STATUS                     PRIMARY
mycluster   3h    3           3       Cluster in healthy state   mycluster-2
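The same scaling change can also be made declaratively, which keeps the running state in sync with a version-controlled manifest. A sketch using the mycluster1.yaml from the installation step:

# Edit spec.instances in the manifest (3 -> 5, or back to 3), then re-apply:
$ sed -i 's/instances: 3/instances: 5/' ~/DOIK/5/mycluster1.yaml
$ kubectl apply -f ~/DOIK/5/mycluster1.yaml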
▶ Rolling update: standbys are updated first, then a switchover happens before the old primary is updated, minimizing downtime
→ Monitoring
# [Terminal 1] Monitoring
$ watch kubectl get pod -l cnpg.io/cluster=mycluster
NAME          READY   STATUS    RESTARTS   AGE
mycluster-2   1/1     Running   0          20m
mycluster-3   1/1     Running   0          16m
mycluster-4   1/1     Running   0          22m

# [Terminal 2] Monitoring
$ while true; do kubectl exec -it myclient2 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"; date;sleep 1; done

# [Terminal 3] INSERT a large number of rows into the test database
$ for ((i=10000; i<=20000; i++)); do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done
→ [Terminal 4] Update from postgresql:14.2 to postgresql:14.3
$ kubectl patch cluster mycluster --type=merge -p '{"spec":{"imageName":"ghcr.io/cloudnative-pg/postgresql:14.3"}}' && kubectl get pod -l postgresql=mycluster -w
→ Check the cluster status
$ kubectl get cluster mycluster
NAME        AGE    INSTANCES   READY   STATUS              PRIMARY
mycluster   3h1m   2           2       Upgrading cluster   mycluster-2

$ kubectl cnpg status mycluster | grep Image
PostgreSQL Image:    ghcr.io/cloudnative-pg/postgresql:14.3
→ Inserts are briefly interrupted while the patch rolls out
psql: error: connection to server at "mycluster-rw" (10.200.1.174), port 5432 failed: Connection refused
        Is the server running on that host and accepting TCP/IP connections?
command terminated with exit code 2
psql: error: connection to server at "mycluster-rw" (10.200.1.174), port 5432 failed: Connection refused
        Is the server running on that host and accepting TCP/IP connections?
command terminated with exit code 2
psql: error: connection to server at "mycluster-rw" (10.200.1.174), port 5432 failed: Connection refused
        Is the server running on that host and accepting TCP/IP connections?
command terminated with exit code 2
INSERT 0 1
INSERT 0 1
INSERT 0 1
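This cluster uses primaryUpdateStrategy: unsupervised, so the operator performs the final switchover on its own. If that brief write outage needs to happen in a controlled window instead, the strategy can be set to supervised, in which case the operator updates only the standbys and waits for a manual promotion. A hedged sketch:

$ kubectl patch cluster mycluster --type=merge -p '{"spec":{"primaryUpdateStrategy":"supervised"}}'
# Later, during the maintenance window, promote an already-updated standby:
$ kubectl cnpg promote mycluster mycluster-3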
5. CNPG Miscellaneous
▶ Increasing pod volume size: online volume expansion is not possible here because the local-path provisioner does not support it!
→ Check the pods and PVCs
$ watch kubectl get pod,pvc
NAME                                       READY   STATUS    RESTARTS   AGE
pod/cnpg-cloudnative-pg-5f8cc75df5-jk2v4   1/1     Running   0          3h19m
pod/myclient1                              1/1     Running   0          60m
pod/myclient2                              1/1     Running   0          60m
pod/mycluster-2                            1/1     Running   0          4m55s
pod/mycluster-3                            1/1     Running   0          5m25s
pod/mycluster-4                            1/1     Running   0          5m50s

NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/mycluster-2   Bound    pvc-c490f766-3a2b-4658-90d6-49dff5dc03e4   3Gi        RWO            local-path     3h6m
persistentvolumeclaim/mycluster-3   Bound    pvc-d370b23b-14dc-4b90-bfb4-83fa9aa51290   3Gi        RWO            local-path     3h5m
persistentvolumeclaim/mycluster-4   Bound    pvc-ca35067b-0291-4559-b919-639e91e8df67   3Gi        RWO            local-path     29m
→ Increase the PVCs from 3Gi to 5Gi: the size can only be increased, never decreased
$ kubectl patch cluster mycluster --type=merge -p '{"spec":{"storage":{"resizeInUseVolumes":false}}}'
cluster.postgresql.cnpg.io/mycluster patched

$ kubectl patch cluster mycluster --type=merge -p '{"spec":{"storage":{"size":"5Gi"}}}'
cluster.postgresql.cnpg.io/mycluster patched

$ kubectl describe cluster mycluster
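Whether the PVCs can be expanded in place depends entirely on the StorageClass: it needs allowVolumeExpansion: true and a provisioner that implements expansion. This can be checked up front; an empty or "false" result, as with local-path here, means the delete-and-recreate approach below is required:

$ kubectl get storageclass local-path -o jsonpath='{.allowVolumeExpansion}'; echo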
→ Delete each pod/PVC pair so they are recreated with the new size
$ kubectl delete pvc/mycluster-4 pod/mycluster-4
persistentvolumeclaim "mycluster-4" deleted
pod "mycluster-4" deleted

$ kubectl delete pvc/mycluster-3 pod/mycluster-3
persistentvolumeclaim "mycluster-3" deleted
pod "mycluster-3" deleted

$ kubectl delete pvc/mycluster-2 pod/mycluster-2
persistentvolumeclaim "mycluster-2" deleted
pod "mycluster-2" deleted
→ Verify the increased volume size
$ kubectl get pod,pvc
NAME                                       READY   STATUS    RESTARTS   AGE
pod/cnpg-cloudnative-pg-5f8cc75df5-jk2v4   1/1     Running   0          3h22m
pod/myclient1                              1/1     Running   0          62m
pod/myclient2                              1/1     Running   0          62m
pod/mycluster-7                            1/1     Running   0          47s
pod/mycluster-8                            1/1     Running   0          24s
pod/mycluster-9                            1/1     Running   0          9s

NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/mycluster-7   Bound    pvc-5c07d1ae-f6e1-4b05-829a-6512f6f33d30   5Gi        RWO            local-path     59s
persistentvolumeclaim/mycluster-8   Bound    pvc-121e78c4-5f59-4895-aa14-5844a4b3098d   5Gi        RWO            local-path     35s
persistentvolumeclaim/mycluster-9   Bound    pvc-75db182a-b42c-402a-8f23-119dfe9a848d   5Gi        RWO            local-path     20s