High availability with Hadoop + ZooKeeper

    xiaoxiao2022-07-07  156

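    I. Introduction to ZooKeeper

    ZooKeeper is an open-source coordination service for distributed applications. It is an open-source implementation of Google's Chubby and an important component of Hadoop and HBase. It provides consistency services to distributed applications, including configuration maintenance, naming, distributed synchronization, and group services.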

    II. Basic ZooKeeper workflow:

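    1. Elect a leader.

    2. Synchronize data.

    3. Several algorithms can be used to elect the leader, but they all have to satisfy the same election criteria.

    4. The leader must hold the highest execution ID, similar to root privileges.

    5. A majority of the machines in the cluster must respond to and accept the elected leader.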
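
    As a rough illustration of steps 4 and 5, the shell sketch below computes the majority an ensemble needs and picks the preferred candidate from a list of votes. The helper names quorum_size and pick_leader are hypothetical, and the sketch assumes each vote is a line of the form "myid zxid" with decimal numbers, where the highest zxid wins and ties go to the highest myid. It is a simplification for illustration, not ZooKeeper's actual election code.

```shell
#!/usr/bin/env bash

# Number of servers that must agree in an ensemble of $1 voting servers.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Read "myid zxid" lines on stdin and print the myid of the preferred
# candidate: highest zxid first, ties broken by the highest myid.
# (Assumption: zxid is given as a decimal number to keep sorting simple.)
pick_leader() {
  sort -k2,2nr -k1,1nr | head -n 1 | cut -d' ' -f1
}
```

    With the three-node ensemble built later in this article, quorum_size 3 gives 2, so the ensemble can keep working with one node down.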

    III. High availability with ZooKeeper

    1. Stop the previously running services and clean up the environment

    [hadoop@server1 hadoop]$ sbin/stop-yarn.sh
    Stopping nodemanagers
    Stopping resourcemanager
    [hadoop@server1 hadoop]$ sbin/stop-dfs.sh
    Stopping namenodes on [server1]
    Stopping datanodes
    Stopping secondary namenodes [server1]
    [hadoop@server1 hadoop]$ jps
    14909 Jps
    # on server1/2/3:
    rm -fr /tmp/*

    2. Install the JDK and configure the environment variables on server2, server3, server4 and server5:

    [hadoop@server5 ~]$ tar zxf jdk-8u181-linux-x64.tar.gz
    [hadoop@server5 ~]$ ln -s jdk1.8.0_181 java
    [hadoop@server5 ~]$ vim ~/.bash_profile
    PATH=$PATH:$HOME/.local/bin:$HOME/bin:$HOME/java/bin
    [hadoop@server5 ~]$ source ~/.bash_profile
    [hadoop@server5 ~]$ jps
    1388 Jps

    4. Install and configure Hadoop:

    [hadoop@server5 ~]$ tar zxf hadoop-3.0.3.tar.gz
    [hadoop@server5 ~]$ ln -s hadoop-3.0.3 hadoop
    [hadoop@server5 ~]$ cd hadoop
    [hadoop@server5 hadoop]$ cd etc/hadoop/
    [hadoop@server5 hadoop]$ vim hadoop-env.sh
    export JAVA_HOME=/home/hadoop/java

    5. Set up ZooKeeper (this can be done on any one of the nodes)

    [hadoop@server2 ~]$ tar zxf zookeeper-3.4.9.tar.gz
    [hadoop@server2 ~]$ cd zookeeper-3.4.9
    [hadoop@server2 zookeeper-3.4.9]$ cd conf/
    [hadoop@server2 conf]$ ls
    configuration.xsl  log4j.properties  zoo_sample.cfg
    [hadoop@server2 conf]$ cp zoo_sample.cfg zoo.cfg
    [hadoop@server2 conf]$ vim zoo.cfg
    server.1=172.25.60.2:2888:3888
    server.2=172.25.60.3:2888:3888
    server.3=172.25.60.4:2888:3888

    The configuration file is identical on every node:

    [hadoop@server2 conf]$ scp zoo.cfg server3:/home/hadoop/zookeeper-3.4.9/conf
    [hadoop@server2 conf]$ scp zoo.cfg server4:/home/hadoop/zookeeper-3.4.9/conf
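
    Since the server.N lines follow a fixed pattern, they can also be generated rather than typed by hand. The sketch below is one way to do that; the function name zk_server_lines is hypothetical, and it assumes the same peer and election ports (2888 and 3888) used in the listing above, with IDs assigned in argument order starting at 1.

```shell
#!/usr/bin/env bash

# Print one server.N line per host, numbering hosts from 1 in the
# order given. Ports 2888/3888 are the ones used in this article.
zk_server_lines() {
  local id=1 host
  for host in "$@"; do
    echo "server.${id}=${host}:2888:3888"
    id=$((id + 1))
  done
}
```

    For example, zk_server_lines 172.25.60.2 172.25.60.3 172.25.60.4 >> zoo.cfg would append the three entries shown above.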

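    6. In the /tmp/zookeeper directory on each node, create a myid file containing a unique number between 1 and 255; it must match that node's server.N entry in zoo.cfg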

    [hadoop@server2 conf]$ mkdir /tmp/zookeeper
    [hadoop@server2 conf]$ echo 1 > /tmp/zookeeper/myid
    [hadoop@server3 ~]$ mkdir /tmp/zookeeper
    [hadoop@server3 ~]$ echo 2 > /tmp/zookeeper/myid
    [hadoop@server4 ~]$ mkdir /tmp/zookeeper
    [hadoop@server4 ~]$ echo 3 > /tmp/zookeeper/myid
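
    Because each myid has to agree with the server.N entry for that host, it can be safer to look the number up in zoo.cfg than to type it. A minimal sketch, assuming the server.N=host:port:port format shown earlier; the function name myid_for_host is hypothetical.

```shell
#!/usr/bin/env bash

# Print the N of the server.N entry in zoo.cfg ($1) whose host part
# equals $2. Prints nothing if the host is not listed.
myid_for_host() {
  awk -F= -v host="$2" '
    /^server\.[0-9]+=/ {
      id = $1
      sub(/^server\./, "", id)   # keep only N from server.N
      split($2, addr, ":")       # addr[1] is the host part
      if (addr[1] == host) print id
    }' "$1"
}
```

    On server3, for instance, myid_for_host conf/zoo.cfg 172.25.60.3 > /tmp/zookeeper/myid would write the matching ID.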

    7. Start the service on each node:

    [hadoop@server2 zookeeper-3.4.9]$ bin/zkServer.sh start
    [hadoop@server3 zookeeper-3.4.9]$ bin/zkServer.sh start
    [hadoop@server4 zookeeper-3.4.9]$ bin/zkServer.sh start

    Check the state of each node:

    # follower
    [hadoop@server2 zookeeper-3.4.9]$ bin/zkServer.sh status
    ZooKeeper JMX enabled by default
    Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
    Mode: follower
    # follower
    [hadoop@server3 zookeeper-3.4.9]$ bin/zkServer.sh status
    ZooKeeper JMX enabled by default
    Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
    Mode: follower
    # leader
    [hadoop@server4 zookeeper-3.4.9]$ bin/zkServer.sh status
    ZooKeeper JMX enabled by default
    Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
    Mode: leader
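
    When checking several nodes, only the Mode line of the status output matters. The small filter below extracts it; the name zk_mode is hypothetical, and it assumes the "Mode: ..." line format shown in the output above.

```shell
#!/usr/bin/env bash

# Read zkServer.sh status output on stdin and print only the mode
# (for example "leader" or "follower").
zk_mode() {
  awk -F': ' '/^Mode:/ { print $2 }'
}
```

    Typical use would be bin/zkServer.sh status 2>/dev/null | zk_mode on each node; a healthy ensemble should report exactly one leader.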

    8. On server2, connect to ZooKeeper with the command-line client:

    [hadoop@server2 zookeeper-3.4.9]$ cd bin/
    [hadoop@server2 bin]$ ls
    README.txt  zkCli.cmd  zkEnv.cmd  zkServer.cmd  zkCleanup.sh  zkCli.sh  zkEnv.sh  zkServer.sh
    [hadoop@server2 bin]$ ./zkCli.sh    # connect to ZooKeeper
    [zk: localhost:2181(CONNECTED) 0] ls /
    [zookeeper]
    [zk: localhost:2181(CONNECTED) 1] ls /zookeeper
    [quota]
    [zk: localhost:2181(CONNECTED) 2] ls /zookeeper/quota
    []
    [zk: localhost:2181(CONNECTED) 3] get /zookeeper/quota
    cZxid = 0x0
    ctime = Thu Jan 01 08:00:00 CST 1970
    mZxid = 0x0
    mtime = Thu Jan 01 08:00:00 CST 1970
    pZxid = 0x0
    cversion = 0
    dataVersion = 0
    aclVersion = 0
    ephemeralOwner = 0x0
    dataLength = 0
    numChildren = 0

    9. Configure Hadoop on server1:

    [hadoop@server1 ~]$ cd hadoop/etc/hadoop/
    [hadoop@server1 hadoop]$ vim core-site.xml
    <configuration>
        <!-- The HDFS nameservice is called masters (the name is arbitrary) -->
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://masters</value>
        </property>
        <!-- Addresses of the ZooKeeper ensemble -->
        <property>
            <name>ha.zookeeper.quorum</name>
            <value>172.25.60.2:2181,172.25.60.3:2181,172.25.60.4:2181</value>
        </property>
    </configuration>
    [hadoop@server1 hadoop]$ vim hdfs-site.xml
    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>3</value>
        </property>
        <!-- The nameservice is masters; this must match the setting in core-site.xml -->
        <property>
            <name>dfs.nameservices</name>
            <value>masters</value>
        </property>
        <!-- masters contains two NameNodes, h1 and h2 -->
        <property>
            <name>dfs.ha.namenodes.masters</name>
            <value>h1,h2</value>
        </property>
        <!-- RPC address of h1 -->
        <property>
            <name>dfs.namenode.rpc-address.masters.h1</name>
            <value>172.25.60.1:9000</value>
        </property>
        <!-- HTTP address of h1 -->
        <property>
            <name>dfs.namenode.http-address.masters.h1</name>
            <value>172.25.60.1:9870</value>
        </property>
        <!-- RPC address of h2 -->
        <property>
            <name>dfs.namenode.rpc-address.masters.h2</name>
            <value>172.25.60.5:9000</value>
        </property>
        <!-- HTTP address of h2 -->
        <property>
            <name>dfs.namenode.http-address.masters.h2</name>
            <value>172.25.60.5:9870</value>
        </property>
        <!-- Where the NameNode edit log is stored on the JournalNodes -->
        <property>
            <name>dfs.namenode.shared.edits.dir</name>
            <value>qjournal://172.25.60.2:8485;172.25.60.3:8485;172.25.60.4:8485/masters</value>
        </property>
        <!-- Where each JournalNode keeps its data on local disk -->
        <property>
            <name>dfs.journalnode.edits.dir</name>
            <value>/tmp/journaldata</value>
        </property>
        <!-- Enable automatic failover when a NameNode fails -->
        <property>
            <name>dfs.ha.automatic-failover.enabled</name>
            <value>true</value>
        </property>
        <!-- How clients locate the active NameNode after a failover -->
        <property>
            <name>dfs.client.failover.proxy.provider.masters</name>
            <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>
        <!-- Fencing methods, one per line -->
        <property>
            <name>dfs.ha.fencing.methods</name>
            <value>
            sshfence
            shell(/bin/true)
            </value>
        </property>
        <!-- sshfence needs passwordless SSH -->
        <property>
            <name>dfs.ha.fencing.ssh.private-key-files</name>
            <value>/home/hadoop/.ssh/id_rsa</value>
        </property>
        <!-- Connection timeout for sshfence, in milliseconds -->
        <property>
            <name>dfs.ha.fencing.ssh.connect-timeout</name>
            <value>30000</value>
        </property>
    </configuration>
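
    The nameservice in fs.defaultFS has to match dfs.nameservices, and a one-character slip is easy to miss. The sketch below compares the two values. The names xml_prop and check_nameservice are hypothetical, and the parsing is deliberately naive: it assumes each <name> and single-line <value> element sits on its own line and is not a general XML parser.

```shell
#!/usr/bin/env bash

# Print the <value> that follows <name>$2</name> in the file $1.
# Assumes <name> and <value> are each on their own line.
xml_prop() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1; next }
    found && /<value>/ {
      gsub(/.*<value>|<\/value>.*/, "")   # strip the tags and indentation
      print
      exit
    }' "$1"
}

# Succeed only if fs.defaultFS in core-site.xml ($1) names the same
# nameservice as dfs.nameservices in hdfs-site.xml ($2).
check_nameservice() {
  local fs ns
  fs="$(xml_prop "$1" fs.defaultFS)"
  ns="$(xml_prop "$2" dfs.nameservices)"
  [ "${fs#hdfs://}" = "$ns" ]
}
```

    Running check_nameservice core-site.xml hdfs-site.xml && echo consistent in etc/hadoop gives a quick sanity check before the files are copied to the other NameNode.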

    10. Start a JournalNode on server2/3/4, alongside the ZooKeeper node (QuorumPeerMain) already running on each

    [hadoop@server2 ~]$ cd hadoop
    [hadoop@server2 hadoop]$ bin/hdfs --daemon start journalnode
    [hadoop@server2 hadoop]$ jps
    11430 Jps
    11399 JournalNode
    11244 QuorumPeerMain
    [hadoop@server3 ~]$ cd hadoop
    [hadoop@server3 hadoop]$ bin/hdfs --daemon start journalnode
    [hadoop@server3 hadoop]$ jps
    11252 QuorumPeerMain
    11415 Jps
    11375 JournalNode
    [hadoop@server4 ~]$ cd hadoop
    [hadoop@server4 hadoop]$ bin/hdfs --daemon start journalnode
    [hadoop@server4 hadoop]$ jps
    13076 Jps
    12904 QuorumPeerMain
    13035 JournalNode
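
    Reading jps output by eye on every node gets tedious, so the check can be scripted. The sketch below reports which expected daemons are absent from jps output; the function name missing_daemons is hypothetical, and it assumes the "PID Name" line format shown above.

```shell
#!/usr/bin/env bash

# Read jps output on stdin and print every expected daemon name
# (given as arguments) that does not appear in it.
missing_daemons() {
  local out name
  out="$(cat)"
  for name in "$@"; do
    printf '%s\n' "$out" \
      | awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }' \
      || echo "$name"
  done
}
```

    On server2, jps | missing_daemons JournalNode QuorumPeerMain should print nothing once both processes are up.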

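    11. Format the NameNode on server1, then copy its metadata and the configuration files to server5 to set up the HA pair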

    [hadoop@server1 hadoop]$ pwd
    /home/hadoop/hadoop
    [hadoop@server1 hadoop]$ bin/hdfs namenode -format
    [hadoop@server1 hadoop]$ scp -r /tmp/hadoop-hadoop 172.25.60.5:/tmp/
    [hadoop@server1 hadoop]$ cd etc/hadoop/
    [hadoop@server1 hadoop]$ scp core-site.xml hdfs-site.xml hadoop@172.25.60.5:/home/hadoop/hadoop/etc/hadoop

    Initialize the HA state in ZooKeeper:

    [hadoop@server1 hadoop]$ bin/hdfs zkfc -formatZK

    12. Start the HDFS cluster (on server1)

    [hadoop@server1 hadoop]$ sbin/start-dfs.sh
    Starting namenodes on [server1 server5]
    server1: namenode is running as process 15374. Stop it first.
    server5: WARNING: /home/hadoop/hadoop-3.0.3/logs does not exist. Creating.
    Starting datanodes
    172.25.60.2: datanode is running as process 11520. Stop it first.
    172.25.60.3: datanode is running as process 11495. Stop it first.
    172.25.60.4: datanode is running as process 13214. Stop it first.
    Starting journal nodes [172.25.60.2 172.25.60.3 172.25.60.4]
    172.25.60.3: journalnode is running as process 11375. Stop it first.
    172.25.60.4: journalnode is running as process 13035. Stop it first.
    172.25.60.2: journalnode is running as process 11399. Stop it first.
    Starting ZK Failover Controllers on NN hosts [server1 server5]
    [hadoop@server1 hadoop]$ jps
    15996 DFSZKFailoverController
    15374 NameNode
    16046 Jps
    [hadoop@server5 ~]$ jps
    10886 NameNode
    10983 DFSZKFailoverController
    11036 Jps

    13. Open a browser:

    http://172.25.60.1:9870/ shows that server1 is active

    http://172.25.60.5:9870 shows that server5 is standby

    Kill the NameNode process on server1; server5 then switches to active:

    [hadoop@server1 hadoop]$ jps
    15996 DFSZKFailoverController
    16076 Jps
    15374 NameNode
    [hadoop@server1 hadoop]$ kill -9 15374
    [hadoop@server1 hadoop]$ jps
    16087 Jps
    15996 DFSZKFailoverController
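
    The same check can be made from the shell instead of the browser, since hdfs haadmin -getServiceState reports whether a given NameNode ID is active or standby. The sketch below loops over the IDs and prints the first active one. The function name active_namenode and the HDFS_CMD override variable are hypothetical (the override exists only so the command path can be changed, for example to bin/hdfs); h1 and h2 are the IDs defined in hdfs-site.xml.

```shell
#!/usr/bin/env bash

# Print the first NameNode ID (from the arguments) whose HA state is
# "active". Returns non-zero if none is active or reachable.
# HDFS_CMD is a hypothetical override for the hdfs executable path.
active_namenode() {
  local id state
  for id in "$@"; do
    state="$("${HDFS_CMD:-hdfs}" haadmin -getServiceState "$id" 2>/dev/null)"
    if [ "$state" = "active" ]; then
      echo "$id"
      return 0
    fi
  done
  return 1
}
```

    After the kill above, HDFS_CMD=bin/hdfs active_namenode h1 h2 would be expected to print h2, the NameNode on server5.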

    Upload files from server1:

    [hadoop@server1 hadoop]$ bin/hdfs dfs -mkdir -p /user/hadoop
    [hadoop@server1 hadoop]$ bin/hdfs dfs -mkdir input
    [hadoop@server1 hadoop]$ bin/hdfs dfs -put etc/hadoop/* input

    The browser shows that the upload succeeded; it went through server5, which is now the active NameNode.

    Start the NameNode on server1 again:

    [hadoop@server1 hadoop]$ bin/hdfs --daemon start namenode
    [hadoop@server1 hadoop]$ jps
    17665 Jps
    17593 NameNode
    15996 DFSZKFailoverController

    server1 now rejoins the cluster in the standby state.

     

     

     

     

     
