Install and configure Hive and MySQL, then write HiveQL statements for simple CRUD operations

    xiaoxiao2025-05-15  47

    I. Install MySQL

    1. Download the repository package

    wget http://dev.mysql.com/get/mysql-community-release-el7-5.noarch.rpm

    2. Install the repository package, then install the MySQL server

    rpm -ivh mysql-community-release-el7-5.noarch.rpm
    yum install mysql-community-server

    3. Restart the MySQL service

    service mysqld restart

    4. Set a password for the root user (root)

    mysql -u root
    mysql> set password for 'root'@'localhost' = password('root');

    5. Edit the configuration file /etc/my.cnf

    vi /etc/my.cnf

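    6. Add the following client setting to the file

    [mysql]
    default-character-set=utf8

    7. Back in the mysql client, allow root to connect from any host, then reload the privileges

    grant all privileges on *.* to 'root'@'%' identified by 'root';
    flush privileges;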

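    II. Install and configure Hive

    1. Download the package from http://mirror.bit.edu.cn/apache/hive/

    2. Upload it to the virtual machine with Xftp and extract it under /opt/module

    3. Edit /etc/profile to add HIVE_HOME (the Hive installation path), then reload the file so the change takes effect

    source /etc/profile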
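Step 3 does not show the lines that go into /etc/profile. A minimal sketch, assuming Hive was extracted to /opt/module/hive (inferred from the HIVE_CONF_DIR value used in step 4; adjust it to the real directory):

```shell
# Assumption: Hive lives in /opt/module/hive; change this if the archive
# was extracted under a different name.
export HIVE_HOME=/opt/module/hive

# Put the hive, beeline and schematool launchers on the PATH.
export PATH="$PATH:$HIVE_HOME/bin"
```

After `source /etc/profile`, running `echo $HIVE_HOME` should print the path set here.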

    4. Configure hive-env.sh

    cp hive-env.sh.template hive-env.sh

    Set the Hadoop installation path:

    HADOOP_HOME=/opt/module/hadoop-2.7.3

    Set the path of Hive's conf directory:

    export HIVE_CONF_DIR=/opt/module/hive/conf

    5. Configure hive-site.xml (ConnectionUserName and ConnectionPassword must match the MySQL account set up in section I)

    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://127.0.0.1:3306/hive?characterEncoding=UTF-8&amp;serverTimezone=GMT%2B8</value>
      <description>
        JDBC connect string for a JDBC metastore.
        To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
        For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
      </description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.cj.jdbc.Driver</value>
      <description>Driver class name for a JDBC metastore</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>root</value>
      <description>Username to use against metastore database</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>root</value>
      <description>password to use against metastore database</description>
    </property>
    <property>
      <name>hive.exec.local.scratchdir</name>
      <value>/usr/local/hive/apache-hive-2.3.4-bin/tmp/${user.name}</value>
      <description>Local scratch space for Hive jobs</description>
    </property>
    <property>
      <name>hive.downloaded.resources.dir</name>
      <value>/usr/local/hive/apache-hive-2.3.4-bin/iotmp/${hive.session.id}_resources</value>
      <description>Temporary local directory for added resources in the remote file system.</description>
    </property>
    <property>
      <name>hive.querylog.location</name>
      <value>/usr/local/hive/apache-hive-2.3.4-bin/iotmp/${system:user.name}</value>
      <description>Location of Hive run time structured log file</description>
    </property>
    <property>
      <name>hive.server2.logging.operation.log.location</name>
      <value>/usr/local/hive/apache-hive-2.3.4-bin/iotmp/${system:user.name}/operation_logs</value>
      <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
    </property>
    <property>
      <name>hive.server2.thrift.bind.host</name>
      <value>bigdata</value>
      <description>Bind host on which to run the HiveServer2 Thrift service.</description>
    </property>
    <property>
      <name>system:java.io.tmpdir</name>
      <value>/usr/local/hive/apache-hive-2.3.4-bin/iotmp</value>
      <description/>
    </property>
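XML requires a literal & inside a value to be written as &amp;, and JDBC URLs with more than one parameter are an easy place to miss this. The sketch below is one rough way to spot the problem before starting Hive. The function name find_bare_ampersands is made up for illustration, and the check is a plain text scan, not an XML parser, so it also flags an & inside comments or CDATA.

```shell
# Hypothetical helper: print the lines of a file that still contain "&"
# once well-formed entity references (&amp; &lt; &#38; ...) are removed.
# Exit status: 0 if a bare "&" was found, 1 if the file looks clean.
find_bare_ampersands() {
    sed -E 's/&(amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);//g' "$1" | grep -n '&'
}

# Example call (path is an assumption based on HIVE_CONF_DIR):
#   find_bare_ampersands /opt/module/hive/conf/hive-site.xml
```

Each reported line shows its line number in the file, with the valid entities already stripped from the text.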

    6. Initialize the metastore schema

    schematool -dbType mysql -initSchema
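schematool connects to the database named in ConnectionURL (hive), and none of the earlier steps creates it. If the command above fails with an "Unknown database" error, the database probably has to be created first. A minimal sketch, where metastore_db_sql is a hypothetical helper that only prints the statement:

```shell
# Hypothetical helper: print the SQL that creates the metastore database.
# The default name "hive" is taken from the ConnectionURL in hive-site.xml.
metastore_db_sql() {
    db="${1:-hive}"
    printf 'CREATE DATABASE IF NOT EXISTS `%s`;\n' "$db"
}

# Show the statement. To run it, pipe it into the mysql client, e.g.
#   metastore_db_sql hive | mysql -u root -p
metastore_db_sql hive
```

Printing the statement first keeps the step reviewable; once the database exists, rerun the schematool command.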

    III. Write HiveQL statements to implement WordCount

    1. Upload the file to be counted to HDFS

    vim 1.txt
    hdfs dfs -mkdir /input
    hdfs dfs -put 1.txt /input
    hdfs dfs -ls /input

    2. Open beeline and create a managed (internal) table named words

    create table words(line string);

    3. Load the article text (load data inpath moves the HDFS file into the table's directory)

    load data inpath '/input/1.txt' overwrite into table words;

    4. Run the WordCount query and save the result in a new table, wordcount

    create table wordcount as
    select word, count(1) as count
    from (select explode(split(line, ' ')) as word from words) w
    group by word
    order by word;

    5. View the result

    select * from wordcount;
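To sanity-check the table, the same count can be reproduced on the local copy of 1.txt. The sketch below mirrors the query: split each line on single spaces, group identical words, sort by word. wordcount_local is a hypothetical helper name, and the byte-order sort (LC_ALL=C) is an assumption about how Hive orders strings.

```shell
# Hypothetical helper: local equivalent of the HiveQL WordCount.
# Prints "word<TAB>count", one word per line, ordered by word.
wordcount_local() {
    # tr:   one word per line (like explode(split(line, ' ')))
    # sort: bring identical words together, in byte order
    # uniq: count each group (like group by word + count(1))
    # awk:  swap uniq's "count word" layout into "word<TAB>count"
    tr ' ' '\n' < "$1" | LC_ALL=C sort | uniq -c | awk '{ print $2 "\t" $1 }'
}

# Example call:
#   wordcount_local 1.txt
```

Like split(line, ' '), this treats two consecutive spaces as surrounding an empty word, so such input produces a row with an empty word in both outputs.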