Deepgreen的TPC-H测试步骤(简略版)

    xiaoxiao2024-10-05  83

    本文仅简略介绍Deepgreen数据库在TPC-H测试中需要注意的几个步骤,为朋友们以后做测试防雷,详细步骤参见德哥帖子。

    1.上传文件并解压缩:

    ChrisdeMacBook-Pro:tpch chris$ scp master.zip tpc-h-tool.zip dgadmin@192.168.100.107:/home/dgadmin dgadmin@192.168.100.107's password: master.zip 100% 21KB 3.6MB/s 00:00 tpc-h-tool.zip unzip xxx

    2.设置Makefile参数:

    [dgadmin@linux1 ~]$ cd tpch_2_17_0/ [dgadmin@linux1 tpch_2_17_0]$ ls dbgen dev-tools ref_data [dgadmin@linux1 tpch_2_17_0]$ cd dbgen/ [dgadmin@linux1 dbgen]$ cp makefile.suite Makefile [dgadmin@linux1 dbgen]$ vim Makefile

    主要修改CC、Database、Machine、Workload四行:

    ################ ## CHANGE NAME OF ANSI COMPILER HERE ################ CC = gcc # Current values for DATABASE are: INFORMIX, DB2, TDAT (Teradata) # SQLSERVER, SYBASE, ORACLE, VECTORWISE # Current values for MACHINE are: ATT, DOS, HP, IBM, ICL, MVS, # SGI, SUN, U2200, VMS, LINUX, WIN32 # Current values for WORKLOAD are: TPCH DATABASE= ORACLE MACHINE = LINUX WORKLOAD = TPCH

    3.编译:make

    4.生成测试数据:

    [dgadmin@linux1 dbgen]$ ./dbgen -s 1 -f TPC-H Population Generator (Version 2.17.0) Copyright Transaction Processing Performance Council 1994 - 2010

    5.将测试数据转换为postgresql识别的格式,删除末尾的分隔符|

    [dgadmin@linux1 dbgen]$ for i in `ls *.tbl`; do sed 's/|$//' $i > ${i/tbl/csv}; done

    6.把包含csv文件的目录,软链接到/tmp/dss-data。tpch-pg脚本中一会要用到这个目录

    [dgadmin@linux1 dbgen]$ pwd /home/dgadmin/tpch_2_17_0/dbgen [dgadmin@linux1 dbgen]$ ln -s /home/dgadmin/tpch_2_17_0/dbgen /tmp/dss-data 7.切换到脚本文件master目录下:

    [dgadmin@linux1 ~]$ cd pg_tpch-master/ [dgadmin@linux1 pg_tpch-master]$ ls drop-caches.sh dss LICENSE process.php README.md tpch.sh [dgadmin@linux1 pg_tpch-master]$ cd dss/ [dgadmin@linux1 dss]$ ls templates tpch-alter.sql tpch-create.sql tpch-index.sql tpch-load.sql tpch-pkeys.sql 8.修改tpch-load.sql文件,适应dg语法:

    ##########COPY命令格式有问题,为了获得更好的效果,使用列存储,修改如下举例########## ) with (APPENDONLY=true,BLOCKSIZE=2097152,ORIENTATION=COLUMN,COMPRESSTYPE=lz4,CHECKSUM=true,OIDS=false);备注:greenplum的压缩类型为:QuickLZ

    单独用一个小文件测试数据是否可用:

    psql -d tech COPY region FROM '/tmp/dss-data/region.csv' WITH csv DELIMITER '|’;

    9.把master下的所有文件复制到tpch的dbgen目录下:

    [dgadmin@linux1 ~]$ cp -r pg_tpch-master/* tpch_2_17_0/dbgen/ 10.在dbgen文件夹下的dss目录下创建文件夹: [dgadmin@linux1 dbgen]$ mkdir dss/queries

    10.生成测试sql:

    [dgadmin@linux1 dbgen]$ for q in `seq 1 22` do DSS_QUERY=dss/templates ./qgen $q >> dss/queries/$q.sql sed 's/^select/explain select/' dss/queries/$q.sql > dss/queries/$q.explain.sql cat dss/queries/$q.sql >> dss/queries/$q.explain.sql; done

    11.连接并创建测试数据库tpch

    [dgadmin@linux1 dbgen]$ psql -d postgres -p 15432 psql (8.2.15) Type "help" for help. postgres=# create database tpch owner=dgadmin; CREATE DATABASE

    12.设置几个数据库参数:

    ##########设置几个参数########## gpconfig -c enable_nestloop -v off gpconfig -c work_mem -v 256MB gpstop -u

    13.执行测试:

    nohup ./tpch.sh ./results tpch dgadmin &

    备注:如果报错误或者bench.log报timeout,调整tpch.sh文件中的时间为半小时或者更长即可。

    相关资源:七夕情人节表白HTML源码(两款)
    最新回复(0)