PostgreSQL 主机性能测试方法 - 单机多实例

    xiaoxiao2026-03-05  7

    背景

    业界有一些通用的数据库性能测试模型,可以用来测试硬件在不同测试模型下的性能表现。 参考http://www.tpc.org/https://github.com/oltpbenchmark/oltpbenchhttp://oltpbenchmark.com/

    本文主要以PostgreSQL为例,向大家介绍一下,如何使用PostgreSQL来测试硬件的性能。

    PostgreSQL 的功能非常的强大,所以可以适用于几乎所有的测试模型,同时用户还可以根据自己的应用场景设计测试模型。

    一、机器部署

    LVM部署

    可选,如果你的主机有多个块设备,可以使用LVM的条带化,提高整体的吞吐和IOPS能力,当然你也可以使用多个表空间来利用不同的块设备。

    举例

    常用的配置,例如4个组,每个组为3块盘组成的RAID 5 4 * (3 DISK RAID 5)

    lvcreate 参数举例

    -i 4: 4 个组,所以条带宽度设置为4 -I 16: (3-1) * 8K 每个组3块盘,实际数据盘为2块,所以将条带大小设置为 2*8KB (8K指数据块的块大小,或者WAL的块大小)

    分区与对齐举例(假设为SSD,需要对齐)

    parted -s /dev/sdb mklable gpt parted -s /dev/sdc mklable gpt parted -s /dev/sdd mklable gpt parted -s /dev/sde mklable gpt parted -s /dev/sdb mkpart primary 1MiB xxxxGB # 对齐, 起步为条带大小的倍数 1MiB/16K parted -s /dev/sdc mkpart primary 1MiB xxxxGB parted -s /dev/sdd mkpart primary 1MiB xxxxGB parted -s /dev/sde mkpart primary 1MiB xxxxGB

    逻辑卷

    pvcreate /dev/sd[bcde]1 vgcreate -s 128MB vgdata01 /dev/sd[bcde]1 lvcreate -i 4 -I 16 -n lv01 -L 5T vgdata01 lvcreate -i 4 -I 16 -n lv02 -l 100%FREE vgdata01

    创建文件系统(mkfs.ext4可以感知lvm条带配置,因此不需要设置mkfs.ext4的条带配置)

    mkfs.ext4 /dev/mapper/vgdata01-lv01 -m 0 -O extent,uninit_bg -E lazy_itable_init=1 -T largefile -L lv01 mkfs.ext4 /dev/mapper/vgdata01-lv02 -m 0 -O extent,uninit_bg -E lazy_itable_init=1 -T largefile -L lv02 mkdir /u01 mkdir /u02 mount -o defaults,noatime,nodiratime,nodelalloc,barrier=0,data=writeback LABEL=lv01 /u01 mount -o defaults,noatime,nodiratime,nodelalloc,barrier=0,data=writeback LABEL=lv02 /u02

    目录

    mkdir -p /data01/digoal mkdir -p /data02/digoal chown digoal /data01/digoal chown digoal /data02/digoal

    io 调度 : SSD建议使用deadline 或 noop

    echo deadline > /sys/block/sdb/queue/scheduler echo deadline > /sys/block/sdc/queue/scheduler echo deadline > /sys/block/sdd/queue/scheduler echo deadline > /sys/block/sde/queue/scheduler

    OS内核配置

    配置举例,请根据环境调整 1. /etc/sysctl.conf

    fs.aio-max-nr = 1048576 fs.file-max = 76724600 kernel.core_pattern= /data01/corefiles/core_%e_%u_%t_%s.%p kernel.sem = 4096 2147483647 2147483646 512000 kernel.shmall = 107374182 kernel.shmmax = 274877906944 kernel.shmmni = 819200 net.core.netdev_max_backlog = 10000 net.core.rmem_default = 262144 net.core.rmem_max = 4194304 net.core.wmem_default = 262144 net.core.wmem_max = 4194304 net.core.somaxconn = 4096 net.ipv4.tcp_max_syn_backlog = 4096 net.ipv4.tcp_keepalive_intvl = 20 net.ipv4.tcp_keepalive_probes = 3 net.ipv4.tcp_keepalive_time = 60 net.ipv4.tcp_mem = 8388608 12582912 16777216 net.ipv4.tcp_fin_timeout = 5 net.ipv4.tcp_synack_retries = 2 net.ipv4.tcp_syncookies = 1 net.ipv4.tcp_timestamps = 1 net.ipv4.tcp_tw_recycle = 0 net.ipv4.tcp_tw_reuse = 1 net.ipv4.tcp_max_tw_buckets = 262144 net.ipv4.tcp_rmem = 8192 87380 16777216 net.ipv4.tcp_wmem = 8192 65536 16777216 net.nf_conntrack_max = 1200000 net.netfilter.nf_conntrack_max = 1200000 vm.dirty_background_bytes = 409600000 vm.dirty_expire_centisecs = 3000 vm.dirty_ratio = 95 vm.dirty_writeback_centisecs = 100 vm.extra_free_kbytes = 4096000 vm.min_free_kbytes = 2097152 vm.mmap_min_addr = 65536 vm.overcommit_memory = 0 vm.overcommit_ratio = 90 vm.swappiness = 0 vm.zone_reclaim_mode = 0 net.ipv4.ip_local_port_range = 40000 65535

    2. /etc/security/limits.conf

    * soft nofile 655360 * hard nofile 655360 * soft nproc 655360 * hard nproc 655360 * soft memlock unlimited * hard memlock unlimited * soft core unlimited * hard core unlimited

    3. disable thp

    cat /sys/kernel/mm/transparent_hugepage/enabled cat /sys/kernel/mm/transparent_hugepage/defrag cat /sys/kernel/mm/redhat_transparent_hugepage/enabled cat /sys/kernel/mm/redhat_transparent_hugepage/defrag [never]

    4. io schedular

    /boot/grub/grub.conf title xxxxxxxxx root(0,0) kernel /vmlinuz-............. numa=off elevator=deadline

    5. disable selinux

    cat /etc/selinux/config SELINUX=disabled SELINUXTYPE=targeted

    6. cgroup

    yum install -y libcgroup mkdir -p /cgroup/cpu mkdir -p /cgroup/cpuacct mkdir -p /cgroup/memory mkdir -p /cgroup/blkio mount -t cgroup -o cpu cpu /cgroup/cpu mount -t cgroup -o cpuacct cpuacct /cgroup/cpuacct mount -t cgroup -o memory memory /cgroup/memory mount -t cgroup -o blkio blkio /cgroup/blkio

    二、自定义测试模型

    100个实例, 每个实例1亿记录 。

    压测 insert on conflict update 。 (前期插入,后期全更新,对IO考验较大)

    每个实例 2GB shared buffer 。

    (使用CGROUP)

    被测试机器部署

    1. install postgresql 9.6

    yum -y install rsync coreutils glib2 lrzsz sysstat e4fsprogs xfsprogs ntp readline-devel zlib zlib-devel openssl openssl-devel pam-devel libxml2-devel libxslt-devel python-devel tcl-devel gcc make smartmontools flex bison perl perl-devel perl-ExtUtils* OpenIPMI-tools openldap openldap-devel tar -jxvf postgresql-9.6rc1.tar.bz2 cd postgresql-9.6rc1 ./configure --prefix=/home/digoal/pgsql9.6rc1 make world -j 32 make install-world

    2. 配置环境变量

    $ vi ~/env.sh export PS1="$USER@`/bin/hostname -s`-> " export PGPORT=$1 export PGDATA=/data02/digoal/pg_root$PGPORT export LANG=en_US.utf8 export PGHOME=/home/digoal/pgsql9.6rc1 export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH export DATE=`date +"%Y%m%d%H%M"` export PATH=$PGHOME/bin:$PATH:. export MANPATH=$PGHOME/share/man:$MANPATH export PGHOST=$PGDATA export PGUSER=postgres export PGDATABASE=postgres alias rm='rm -i' alias ll='ls -lh' unalias vi

    3. 初始化数据库集群脚本

    $ vi ~/init.sh for ((i=1921;i<1921+$1;i++)) do . ~/env.sh $i initdb -D $PGDATA -E UTF8 --locale=C -U postgres -X /data01/digoal/pg_xlog$i echo "local all all trust" > $PGDATA/pg_hba.conf echo "host all all 127.0.0.1/32 trust" >> $PGDATA/pg_hba.conf echo "host all all ::1/128 trust" >> $PGDATA/pg_hba.conf echo "host all all 0.0.0.0/0 trust" >> $PGDATA/pg_hba.conf done

    4. 初始化数据库集群

    $ rm -rf /data01/digoal/* $ rm -rf /data02/digoal/* $ cd ~ $ . ~/init.sh 100

    5. 获取DM设备 major minor

    # dmsetup ls vgdata01-lv02 (253, 1) vgdata01-lv01 (253, 0)

    6. cgroup 限制

    xlog盘限制iops 4000 data盘限制iops 800 cpu 0.7核, 周期1秒, quota 0.7秒 cpu real-time 调度统一 周期1秒, runtime 0.001秒 memory, 4G, 打开oom

    7. 启动数据库集群脚本

    $ vi ~/start.sh for ((i=1921;i<1921+$1;i++)) do . /home/digoal/env.sh $i cgcreate -g cpu:RULE$i cgcreate -g cpuacct:RULE$i cgcreate -g memory:RULE$i cgcreate -g blkio:RULE$i echo "253:0 4000" > /cgroup/blkio/RULE$i/blkio.throttle.write_iops_device echo "253:0 4000" > /cgroup/blkio/RULE$i/blkio.throttle.read_iops_device echo "253:1 800" > /cgroup/blkio/RULE$i/blkio.throttle.write_iops_device echo "253:1 800" > /cgroup/blkio/RULE$i/blkio.throttle.read_iops_device echo "70" > /cgroup/cpu/RULE$i/cpu.shares echo "1000000" > /cgroup/cpu/RULE$i/cpu.cfs_period_us echo "700000" > /cgroup/cpu/RULE$i/cpu.cfs_quota_us echo "1000000" > /cgroup/cpu/RULE$i/cpu.rt_period_us echo "1000" > /cgroup/cpu/RULE$i/cpu.rt_runtime_us echo "4294967296" > /cgroup/memory/RULE$i/memory.limit_in_bytes cgexec -g cpu:RULE$i -g cpuacct:RULE$i -g memory:RULE$i -g blkio:RULE$i su - digoal -c ". ~/env.sh $i ; nohup postgres -B 1GB -c port=$i -c listen_addresses='0.0.0.0' -c synchronous_commit=on -c full_page_writes=on -c wal_buffers=128MB -c wal_writer_flush_after=0 -c bgwriter_delay=10ms -c max_connections=100 -c bgwriter_lru_maxpages=1000 -c bgwriter_lru_multiplier=10.0 -c unix_socket_directories='.' -c max_wal_size=4GB -c checkpoint_timeout=30min -c checkpoint_completion_target=0.5 -c log_checkpoints=on -c log_connections=on -c log_disconnections=on -c log_error_verbosity=verbose -c autovacuum_vacuum_scale_factor=0.002 -c autovacuum_max_workers=4 -c autovacuum_naptime=5s -c random_page_cost=1.0 -c constraint_exclusion=on -c log_destination='csvlog' -c logging_collector=on -c maintenance_work_mem=256MB -c autovacuum_work_mem=256MB -D $PGDATA -k $PGDATA >/dev/null 2>&1 &" done

    8. 启动数据库集群 因为需要设置CGROUP,需要超级用户执行

    $ sudo bash -c "bash" # . /home/digoal/start.sh 100

    9. 停集群脚本 为了尽快停库(checkpoint可能耗费大量IO),可以先将资源放大,然后停库。

    $ vi ~/stop.sh for ((i=1921;i<1921+$1;i++)) do . /home/digoal/env.sh $i cgdelete cpu:RULE$i cgdelete cpuacct:RULE$i cgdelete blkio:RULE$i su - digoal -c ". ~/env.sh $i ; pg_ctl stop -m fast -w -t 600 -D $PGDATA" echo "0" > /cgroup/memory/RULE$i/memory.force_empty cgdelete memory:RULE$i done

    需要超级用户执行

    $ sudo bash -c "bash" # . /home/digoal/stop.sh 100

    测试客户端机器部署

    假设已安装postgresql 9.6

    1. 目录

    mkdir /data01/digoal chown digoal /data01/digoal

    2. 环境脚本

    $ vi ~/env.sh export PS1="$USER@`/bin/hostname -s`-> " export PGPORT=$1 export PGDATA=/data02/digoal/pg_root$PGPORT export LANG=en_US.utf8 export PGHOME=/home/digoal/pgsql9.6rc1 export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH export DATE=`date +"%Y%m%d%H%M"` export PATH=$PGHOME/bin:$PATH:. export MANPATH=$PGHOME/share/man:$MANPATH export PGHOST=$PGDATA export PGUSER=postgres export PGDATABASE=postgres alias rm='rm -i' alias ll='ls -lh' unalias vi

    3. 生成压测数据结构以及pgbench调用的测试脚本 定制内容都在这里。

    $ vi ~/pgbench_init.sh for ((i=1921;i<1921+$1;i++)) do . ~/env.sh $i export PGHOST=$2 psql -c "drop table if exists test; create table test(id int primary key, info text, crt_time timestamp);" echo "\set id random(1,100000000)" > ~/test$i.sql echo "insert into test (id,info,crt_time) values (:id, md5(random()::text), now()) on conflict on constraint test_pkey do update set info=excluded.info, crt_time=excluded.crt_time;" >> ~/test$i.sql done

    不需要执行。

    4. 压测脚本

    $ vi ~/pgbench.sh mkdir -p /data01/digoal/log for ((i=1921;i<1921+$1;i++)) do . ~/env.sh $i export PGHOST=$2 nohup pgbench -M prepared -f ~/test$i.sql -n -r -P 1 -c 4 -j 4 -T 2592000 >>/data01/digoal/log/$2_$i.log 2>&1 & # 每个实例测试4个连接 done

    不需要执行,执行样例如下

    cd ~ . ~/pgbench.sh 100 目标主机IP

    5. 压测脚本

    $ vi test.sh #!/bin/bash cd ~ . ~/pgbench_init.sh 100 $1 for ((i=1;i>0;i=1)) do sleep 1 CNT=`ps -ewf|grep pgbench|grep -c -v grep` if [ $CNT -eq 0 ]; then . ~/pgbench.sh 100 $1 exit fi sleep 1 done $ chmod 500 test.sh

    6. 调用压测脚本

    nohup ./test.sh 目标IP >/dev/null 2>&1 &

    7. 压测结果

    $ cat /data01/digoal/log/$2_$i.log $ head -n 30000 /data01/digoal/log/$2_$i.log |tail -n 7200 > /tmp/1 $ cat /tmp/1|awk '{print $4 "," $7 "," $10}' >/tmp/2

    输出TPS,RT,标准差。

    TPS表示数据库视角的事务处理能力(也就是单个测试脚本的每秒调用次数)。

    RT表示响应时间。

    标准差可以用来表示抖动,通常应该在1以内(越大,说明抖动越厉害)。

    8. 主机性能结果

    $ sar -f ....

    三、内置的测试模型tpc-b

    只有有两处不一样,其他与定制测试一样。

    1. 生成压测数据结构

    $ vi ~/pgbench_init.sh for ((i=1921;i<1921+$1;i++)) do . ~/env.sh $i export PGHOST=$2 nohup pgbench -i -s 2800 >/dev/null 2>&1 & # 2.8 亿数据 done

    不需要执行。

    2. 压测脚本

    $ vi ~/pgbench.sh mkdir -p /data01/digoal/log for ((i=1921;i<1921+$1;i++)) do . ~/env.sh $i export PGHOST=$2 nohup pgbench -M prepared -n -r -P 10 -S -c 4 -j 4 -T 2592000 >>/data01/data/log/$2_$i.log 2>&1 & # 每个实例测试4个连接 done

    Count

    最新回复(0)