Kubernetes部署高可用PostgreSQL集群

Prodan Labs 等级 101 0 0

Description PostgreSQL 是一个功能强大的开源关系数据库,它使用和扩展了 SQL 语言,并结合了许多安全存储和扩展最复杂数据工作负载的功能。PostgreSQL 的起源可以追溯到 1986 年,作为加利福尼亚大学伯克利分校 POSTGRES 项目的一部分,与 Linux 内核相似,PostgreSQL 是由社区驱动的开源项目,由全世界贡献者所维护。

PostgreSQL 号称是 "世界上最先进的开源关系型数据库",具有很多先进的数据库理念,它对标的应该是 Oracle 和 SQLserver 这类商用库,与 MySQL 相比各有优劣,从使用角度来看,MySQL 偏向开发者,不过在开源社区 PostgreSQL 更受青睐,大多开源项目都使用 PostgreSQL 取代了 MySQL。

1、Crunchy


PostgreSQL 有很多高可用方案,如 Pgpool II 、Bucardo、PostgresXL 和 Postgres-XC 等。Crunchy 是一个容器化解决方案,它能快速部署 PostgreSQL 。

PostgreSQL Operator 是一个 kubernetes 应用控制器,用于在 Kubernetes 上运行生产级 PostgreSQL ,它简化了 PostgreSQL 集群的部署,并提供了集群配置管理、高可用性、计划备份、灾难恢复、监控和连接池等功能。 Description

2、安装 Crunchy PostgreSQL Operator


PostgreSQL 数据需要持久化,kubernetes 需要先创建存储类,PostgreSQL Operator 支持 NFS、ceph 等存储类,这里用的是 rook-ceph-block 。

[root@k8s-test-master01 ~]# kubectl get sc
NAME              PROVISIONER         RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
ceph-storage      kubernetes.io/rbd   Delete          Immediate           true                   4d1h
rook-ceph-block   kubernetes.io/rbd   Delete          Immediate           true                   3d23h

创建命名空间及下载 yaml 文件

kubectl create namespace pgo
curl https://raw.githubusercontent.com/CrunchyData/postgres-operator/v4.6.1/installers/kubectl/postgres-operator.yml > postgres-operator.yml

修改 yaml 文件,加入下列环境变量,用于设置存储类

            - name: BACKREST_STORAGE
              value: "rook"
            - name: BACKUP_STORAGE
              value: "rook"
            - name: PRIMARY_STORAGE
              value: "rook"
            - name: REPLICA_STORAGE
              value: "rook"
            - name: STORAGE9_SIZE
              value: "50G"
# 安装 postgres operator
kubectl apply -f  postgres-operator.yml

安装 pgo 客户端

curl https://raw.githubusercontent.com/CrunchyData/postgres-operator/v4.6.1/installers/kubectl/client-setup.sh > client-setup.sh
chmod +x client-setup.sh
./client-setup.sh

cat <<EOF >> ~/.bashrc
export PGOUSER="${HOME?}/.pgo/pgo/pgouser"
export PGO_CA_CERT="${HOME?}/.pgo/pgo/client.crt"
export PGO_CLIENT_CERT="${HOME?}/.pgo/pgo/client.crt"
export PGO_CLIENT_KEY="${HOME?}/.pgo/pgo/client.key"
export PGO_APISERVER_URL='https://127.0.0.1:8443'
export PGO_NAMESPACE=pgo
EOF

source ~/.bashrc

配置 pgo 客户端

# postgres operator 的 service 类型默认是 ClusterIP ,需要开启端口转发
kubectl -n pgo port-forward svc/postgres-operator 8443:8443 &

wget https://github.com/CrunchyData/postgres-operator/releases/download/v4.6.1/pgo
chmod +x pgo 
mv pgo /usr/local/bin/
[root@k8s-test-master01 ~]# pgo version
pgo client version 4.6.1
Handling connection for 8443
pgo-apiserver version 4.6.1
[root@k8s-test-master01 ~]# 

# 注:kubernetes 节点需要安装 socat,否则提示 unable to do port forwarding: socat not found.
# RHEL 系
yum install socat
# Debian 系
sudo apt-get install socat 
# SUSE
sudo zypper in socat

3、安装 PostgreSQL 集群

PostgreSQL Operator 可定制很高,可以根据自己的实际情况进行设置

# PostgreSQL 集群的名称
pgo_cluster_name=hippo
# PostgreSQL 部署的命名空间
cluster_namespace=pgo
# 自定义 PostgreSQL 配置
cat postgres-ha.yaml
---
bootstrap:
  dcs:
    postgresql:
      parameters:
        max_connections: 30
        shared_buffers: 2GB
        password_encryption: scram-sha-256
# 创建 hippo-custom-config configmap
kubectl -n pgo create configmap hippo-custom-config --from-file=postgres-ha.yaml

# 超级用户密码
kubectl create secret generic -n "${cluster_namespace}" "${pgo_cluster_name}-postgres-secret" \
  --from-literal=username=postgres \
  --from-literal=password=Supersecurepassword*

# 同步复制用户密码
kubectl create secret generic -n "${cluster_namespace}" "${pgo_cluster_name}-primaryuser-secret" \
  --from-literal=username=primaryuser \
  --from-literal=password=Anothersecurepassword*

# 普通用户 hippo 用户密码
kubectl create secret generic -n "${cluster_namespace}" "${pgo_cluster_name}-hippo-secret" \
  --from-literal=username=hippo \
  --from-literal=password=Moresecurepassword*

创建 PostgreSQL 集群

pgo create cluster ${pgo_cluster_name} \
  --cpu=1.0 --cpu-limit=4.0 \
  --memory=2Gi --memory-limit=6Gi \
  --metrics \
  --exporter-cpu=0.5 --exporter-cpu-limit=1.0 \
  --exporter-memory=256Mi --exporter-memory-limit=1Gi \
  --replica-count=2 \
  --replica-storage-config='rook' \
  --wal-storage-config='rook' \
  --wal-storage-size=10Gi  \
  --custom-config=hippo-custom-config \
  --pgbouncer \
  --storage-config='rook' \
  --pvc-size=20Gi \
  --pgbackrest-cpu=0.5 --pgbackrest-cpu-limit=1.0 \
  --pgbackrest-memory=256Mi --pgbackrest-memory-limit=1Gi \
  --pgbackrest-pvc-size=100Gi \
  --pgbackrest-storage-config='rook' \
  --username=hippo 

--metrics 开启 PostgreSQL exporter,接入普罗米修斯监控 --pgbouncer 开启pgbouncer连接池 --replica-count=2 创建2个副本的高可用性PostgreSQL集群 --sync-replication 开启同步复制,防止事务丢失 生产环境可以设置 pod 调度策略,避免 pod 运行在同一个 node 节点 --pod-anti-affinity
--node-label=region=us-east-1 --node-affinity-type=required

查看 PostgreSQL 集群

[root@k8s-test-master01 ~]# pgo show cluster hippo

cluster : hippo (crunchy-postgres-ha:centos8-13.2-4.6.1)
        pod : hippo-5f69bb897d-cqjgh (Running) on k8s-test-node01 (2/2) (primary)
                pvc: hippo (20Gi)
        pod : hippo-pksw-7588f857f7-m6w72 (Running) on k8s-test-master01 (2/2) (replica)
                pvc: hippo-pksw (20Gi)
        pod : hippo-vpcy-6796f5b79b-kddqg (Running) on k8s-test-node01 (2/2) (replica)
                pvc: hippo-vpcy (20Gi)
        resources : CPU: 1 Memory: 2Gi
        limits : CPU: 4 Memory: 6Gi
        deployment : hippo
        deployment : hippo-backrest-shared-repo
        deployment : hippo-pgbouncer
        deployment : hippo-pksw
        deployment : hippo-vpcy
        service : hippo - ClusterIP (10.254.76.44) - Ports (9187/TCP, 2022/TCP, 5432/TCP)
        service : hippo-pgbouncer - ClusterIP (10.254.48.160) - Ports (5432/TCP)
        service : hippo-replica - ClusterIP (10.254.238.91) - Ports (9187/TCP, 2022/TCP, 5432/TCP)
        pgreplica : hippo-pksw
        pgreplica : hippo-vpcy
        labels : deployment-name=hippo name=hippo pg-cluster=hippo pgo-version=4.6.1 pgouser=admin workflowid=452b51f0-848c-4891-9131-f4e1b8fbcc36 crunchy-pgha-scope=hippo crunchy-postgres-exporter=true 

接入 Prometheus

---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  annotations:
    meta.helm.sh/release-name: demo
    meta.helm.sh/release-namespace: monitoring
  labels:
    app: prometheus-operator-postgres-replica-exporter
    app.kubernetes.io/managed-by: Helm
    chart: prometheus-operator-9.3.2
    heritage: Helm
    release: demo
  name: demo-prometheus-operator-postgres-replica-exporter
  namespace: monitoring
spec:
  endpoints:
  - port: postgres-exporter
    path: /metrics
    interval: 30s 
  namespaceSelector:
    matchNames:
    - pgo
  selector:
    matchLabels:
      name: hippo-replica
      pg-cluster: hippo
      vendor: crunchydata
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  annotations:
    meta.helm.sh/release-name: demo
    meta.helm.sh/release-namespace: monitoring
  labels:
    app: prometheus-operator-postgres-exporter
    app.kubernetes.io/managed-by: Helm
    chart: prometheus-operator-9.3.2
    heritage: Helm
    release: demo
  name: demo-prometheus-operator-postgres-exporter
  namespace: monitoring
spec:
  endpoints:
  - port: postgres-exporter
    path: /metrics
    interval: 30s 
  namespaceSelector:
    matchNames:
    - pgo
  selector:
    matchLabels:
      name: hippo
      pg-cluster: hippo
      vendor: crunchydata

Description

安装 pgAdmin 的图形管理工具

pgo create pgadmin -n pgo hippo

# service 类型默认是 ClusterIP ,新增一个 NodePort 类型的 service ,用与外部访问
apiVersion: v1
kind: Service
metadata:
  labels:
    name: hippo-pgadmin
    pg-cluster: hippo
    pgadmin: "true"
    vendor: crunchydata
  name: hippo-pgadmin-external
  namespace: pgo
spec:
  ports:
  - port: 5050
    protocol: TCP
    targetPort: 5050
    nodePort: 5050
  selector:
    name: hippo-pgadmin
  type: NodePort

# 如果登录 pgAdmin 提示密码错误,可以使用 pgo 更新下密码即可
pgo update user -n pgo hippo --username=hippo --password=Moresecurepassword*

Description

4、同步复制


进入 PostgreSQL 主实例创建一个表,并插入数据。

[root@k8s-test-master01 ~]# kubectl get po  -n pgo -lcrunchy-pgha-scope=hippo
NAME                          READY   STATUS    RESTARTS   AGE
hippo-749d494c4c-52cf6        2/2     Running   0          4d5h
hippo-hvaz-56bb495b94-qw6zl   2/2     Running   0          4d5h
hippo-snmw-6d7877d78b-rbrjs   2/2     Running   0          4d5h
[root@k8s-test-master01 ~]# kubectl -n pgo exec -it hippo-749d494c4c-52cf6 -- psql
Defaulting container name to database.
Use 'kubectl describe pod/hippo-749d494c4c-52cf6 -n pgo' to see all of the containers in this pod.
psql (13.2)
Type "help" for help.

postgres=# \c hippo
You are now connected to database "hippo" as user "postgres".
hippo=# CREATE TABLE demo (
hippo(#     id integer,
hippo(#     name text,
hippo(#     val numeric
hippo(# );
CREATE TABLE
hippo=# \dt
        List of relations
 Schema | Name | Type  |  Owner   
--------+------+-------+----------
 public | demo | table | postgres
(1 row)

hippo=# INSERT INTO demo VALUES (1, 'test', 9.99);
INSERT 0 1
hippo=# select * from demo;
 id | name | val  
----+------+------
  0 | test | 9.99
(1 row)

hippo=# 

进入 PostgreSQL副本库查看数据 Description 两个副本都可以同步主库数据。

5、逻辑备份


创建备份的 PVC

---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: postgres-backup
  namespace: pgo
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: rook-ceph-block

创建 hippo 库的逻辑备份,指定 PVC

[root@k8s-test-master01 ~]# pgo backup hippo --backup-type=pgdump --database=hippo --pvc-name=postgres-backup -n pgo
created Pgtask backup-hippo-pgdump
[root@k8s-test-master01 ~]# kubectl get po -n pgo  -lpgdump=true
NAME                             READY   STATUS      RESTARTS   AGE
backup-hippo-pgdump-fdgj-xwzml   0/1     Completed   0          56m
backup-hippo-pgdump-jxis-flhpr   0/1     Completed   0          34m
backup-hippo-pgdump-utaq-6tzfh   0/1     Completed   0          48m
backup-hippo-pgdump-vlzn-d5z5x   0/1     Completed   0          15s

查看备份

[root@k8s-test-master01 ~]# pgo show backup --backup-type=pgdump hippo
Handling connection for 8443

pgdump : backup-hippo-pgdump
        PVC Name:       postgres-backup
        Access Mode:    ReadWriteOnce
        PVC Size:       1G
        Creation:       2021-03-05 10:23:14 +0000 UTC
        CCPImageTag:    centos8-13.2-4.6.1
        Backup Status:  job completed [backup-hippo-pgdump-vlzn]
        Backup Host:    hippo
        Backup User Secret:     hippo-postgres-secret
        Backup Port:    5432
        Backup Opts:

删除表

postgres=# \c hippo 
You are now connected to database "hippo" as user "postgres".
hippo=# \dt
        List of relations
 Schema | Name | Type  |  Owner   
--------+------+-------+----------
 public | demo | table | postgres
(1 row)

hippo=# drop table demo ;
DROP TABLE
hippo=# \dt
Did not find any relations.
hippo=# 

从逻辑备份还原数据,需要查看逻辑备份存储到 PVC 以及逻辑备份创建的时间戳。

实际上通过 pgo show backup 获取时间是不准的,创建容器与启动备份任务是有时间差的,最准确的备份时间点应该查看 pod 的日志,或者把 PVC 挂载出来查看。

[root@k8s-test-master01 ~]# kubectl get po -n pgo  -lpgdump=true
NAME                             READY   STATUS      RESTARTS   AGE
backup-hippo-pgdump-fdgj-xwzml   0/1     Completed   0          62m
backup-hippo-pgdump-jxis-flhpr   0/1     Completed   0          39m
backup-hippo-pgdump-utaq-6tzfh   0/1     Completed   0          53m
backup-hippo-pgdump-vlzn-d5z5x   0/1     Completed   0          5m18s
[root@k8s-test-master01 ~]# kubectl logs -f backup-hippo-pgdump-vlzn-d5z5x   -n pgo 
Fri Mar  5 10:23:23 UTC 2021 INFO: Image mode found: pgdump
Fri Mar  5 10:23:23 UTC 2021 INFO: Starting in 'pgdump' mode
Fri Mar  5 10:23:23 UTC 2021 INFO: Waiting for PostgreSQL to be ready..
hippo:5432 - accepting connections
Fri Mar  5 10:23:23 UTC 2021 INFO: Checking if PostgreSQL is accepting queries..
              now              
-------------------------------
 2021-03-05 10:23:23.601422+00
(1 row)

Fri Mar  5 10:23:23 UTC 2021 INFO: Dumping to -f /pgdata/hippo-backups/2021-03-05-10-23-23/dump 
Fri Mar  5 10:23:23 UTC 2021 INFO: Taking logical backup of the hippo database..
Fri Mar  5 10:23:23 UTC 2021 INFO: Logical backup completed.  Exiting..

恢复数据

[root@k8s-test-master01 ~]# pgo restore hippo --backup-type=pgdump --backup-pvc=postgres-backup \
>   --pgdump-database=hippo --pitr-target="2021-03-05-10-23-23" -n pgo    
WARNING: Are you sure? (yes/no): yes
Handling connection for 8443
restore performed on hippo to postgres-backup opts= pitr-target=2021-03-05-10-23-23
[root@k8s-test-master01 ~]# kubectl get po -n pgo -ljob-name=pgrestore-hippo-tbps
NAME                         READY   STATUS      RESTARTS   AGE
pgrestore-hippo-tbps-g8xcv   0/1     Completed   0          67s

查看数据已经恢复。

postgres=# \c hippo 
You are now connected to database "hippo" as user "postgres".
hippo=# \dt
        List of relations
 Schema | Name | Type  |  Owner   
--------+------+-------+----------
 public | demo | table | postgres
(1 row)

hippo=# select * from demo;
 id | name | val  
----+------+------
  1 | test | 9.99
(1 row)

6、故障转移


手动故障转移,可以自动选择最佳的候选副本进行故障转移,或指定的副本候选者进行故障转移。

# 自动选择副本
pgo failover hippo
# 指定副本
pgo failover hippo --target==hippo-wvqy

查询集群的副本实例数

[root@k8s-test-master01 ~]# kubectl get po -n pgo -lcrunchy-pgha-scope=hippo
NAME                          READY   STATUS    RESTARTS   AGE
hippo-679cd9ddf5-5rfjf        2/2     Running   0          135m
hippo-ohbu-68d479877c-4mt5h   2/2     Running   0          134m
hippo-wvqy-7f949f79c6-2p7jq   2/2     Running   0          134m
[root@k8s-test-master01 ~]# pgo failover hippo --query

Cluster: hippo
REPLICA                 STATUS          NODE            REPLICATION LAG         PENDING RESTART
hippo-ohbu              running         k8s-test-master01                  0 MB                   false
hippo-wvqy              running         k8s-test-node01            0 MB                   false

手动故障转移

[root@k8s-test-master01 ~]# pgo failover hippo
WARNING: Are you sure? (yes/no): yes
failover success for cluster hippo

查看副本实例已经发生变化

[root@k8s-test-master01 ~]# pgo failover hippo --query

Cluster: hippo
REPLICA                 STATUS          NODE            REPLICATION LAG         PENDING RESTART
hippo                   running         k8s-test-node01            0 MB                   false
hippo-ohbu              running         k8s-test-master01                  0 MB                   false

进入转移后的主实例 hippo-wvqy ,删掉数据目录

[root@k8s-test-master01 ~]# kubectl exec -it -n pgo hippo-wvqy-7f949f79c6-2p7jq -- bash
Defaulting container name to database.
Use 'kubectl describe pod/hippo-wvqy-7f949f79c6-2p7jq -n pgo' to see all of the containers in this pod.
bash-4.4$ lsblk
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
rbd0   251:0    0  10G  0 disk 
rbd1   251:16   0  20G  0 disk 
rbd2   251:32   0  20G  0 disk /pgdata
rbd3   251:48   0  10G  0 disk /pgwal
vda    252:0    0  20G  0 disk 
└─vda1 252:1    0  20G  0 part /etc/resolv.conf
bash-4.4$ cd /pgdata/
bash-4.4$ ls 
hippo-wvqy  lost+found
bash-4.4$ rm -rf hippo-wvqy/
bash-4.4$ ls 
lost+found
bash-4.4$ 

再执行故障转移

[root@k8s-test-master01 ~]# pgo failover hippo
WARNING: Are you sure? (yes/no): yes
Handling connection for 8443
failover success for cluster hippo

再查看 hippo-wvqy 实例,数据已恢复。

[root@k8s-test-master01 ~]# pgo failover hippo --query
Handling connection for 8443

Cluster: hippo
REPLICA                 STATUS          NODE            REPLICATION LAG         PENDING RESTART
hippo-ohbu              running         k8s-test-master01                  0 MB                   false
hippo-wvqy              running         k8s-test-node01            0 MB                   false
[root@k8s-test-master01 ~]# kubectl exec -it -n pgo hippo-wvqy-7f949f79c6-2p7jq -- bash
Defaulting container name to database.
Use 'kubectl describe pod/hippo-wvqy-7f949f79c6-2p7jq -n pgo' to see all of the containers in this pod.
bash-4.4$ lsblk
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
rbd0   251:0    0  10G  0 disk 
rbd1   251:16   0  20G  0 disk 
rbd2   251:32   0  20G  0 disk /pgdata
rbd3   251:48   0  10G  0 disk /pgwal
vda    252:0    0  20G  0 disk 
└─vda1 252:1    0  20G  0 part /etc/resolv.conf
bash-4.4$ ls /pgdata/
hippo-wvqy  lost+found
bash-4.4$ du -sh /pgdata/hippo-wvqy/
160M    /pgdata/hippo-wvqy/

7、总结


除了文中列举的功能外,crunchy 还提供了连接池、定时备份、增量备份等,这里就不一一列举了,crunchy 的实现原理和逻辑还有待研究,感兴趣的读者可以到官网查看,官方的文档还是挺详细的。

参考文档: https://access.crunchydata.com/documentation/postgres-operator/4.6.1/


感兴趣的读者可以关注下微信号 Description

预览图
收藏
评论区