Hadoop + ZooKeeper High Availability


    Lab environment

    IP            Hostname   Role
    172.25.38.1   server1    NameNode
    172.25.38.2   server2    JournalNode
    172.25.38.3   server3    JournalNode
    172.25.38.4   server4    JournalNode
    172.25.38.5   server5    NameNode

    1. Set up the ZooKeeper cluster

    [root@server1 ~]# /etc/init.d/nfs start          # start the NFS service
    [root@server1 ~]# showmount -e                   # list the exports
    Export list for server1:
    /home/hadoop *
    [root@server1 ~]# su - hadoop
    [hadoop@server1 ~]$ ls
    hadoop        hadoop-2.7.3.tar.gz  jdk1.7.0_79
    hadoop-2.7.3  java                 jdk-7u79-linux-x64.tar.gz
    [hadoop@server1 ~]$ rm -fr /tmp/*
    [hadoop@server1 ~]$ ls
    hadoop               java                       zookeeper-3.4.9.tar.gz
    hadoop-2.7.3         jdk1.7.0_79
    hadoop-2.7.3.tar.gz  jdk-7u79-linux-x64.tar.gz
    [hadoop@server1 ~]$ tar zxf zookeeper-3.4.9.tar.gz   # unpack the ZooKeeper tarball
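    This walkthrough assumes /home/hadoop was already exported over NFS on server1. A minimal /etc/exports entry consistent with the showmount output above might look like this; the export options are an assumption, since only the path and the wildcard are confirmed by the output:

    # /etc/exports on server1 (options assumed; uid 800 matches the hadoop user created below)
    /home/hadoop    *(rw,anonuid=800,anongid=800)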

    2. Configure server5 as the high-availability (standby) node

    [root@server5 ~]# yum install nfs-utils -y       # install the NFS client utilities
    [root@server5 ~]# /etc/init.d/rpcbind start      # start rpcbind
                                                               [  OK  ]
    [root@server5 ~]# /etc/init.d/nfs start          # start the NFS service
    [root@server5 ~]# useradd -u 800 hadoop
    [root@server5 ~]# mount 172.25.38.1:/home/hadoop/ /home/hadoop/   # mount the shared home
    [root@server5 ~]# df                             # verify the mount
    Filesystem                    1K-blocks    Used Available Use% Mounted on
    /dev/mapper/VolGroup-lv_root   19134332  929548  17232804   6% /
    tmpfs                            380140       0    380140   0% /dev/shm
    /dev/vda1                        495844   33478    436766   8% /boot
    172.25.38.1:/home/hadoop/      19134336 3289728  14872704  19% /home/hadoop
    [root@server5 ~]# su - hadoop
    [hadoop@server5 ~]$ ls                           # the shared files are already visible on every host
    hadoop        hadoop-2.7.3.tar.gz  jdk1.7.0_79
    hadoop-2.7.3  java                 jdk-7u79-linux-x64.tar.gz
    [hadoop@server5 ~]$ rm -fr /tmp/*

    3. Configure the slave nodes

    [root@server2 ~]# mount 172.25.38.1:/home/hadoop/ /home/hadoop/   # mount the shared home
    [root@server2 ~]# df                                              # verify the mount
    Filesystem                    1K-blocks    Used Available Use% Mounted on
    /dev/mapper/VolGroup-lv_root   19134332 1860136  16302216  11% /
    tmpfs                            251120       0    251120   0% /dev/shm
    /dev/vda1                        495844   33478    436766   8% /boot
    172.25.38.1:/home/hadoop/      19134336 3289728  14872704  19% /home/hadoop
    [root@server2 ~]# rm -fr /tmp/*
    [root@server2 ~]# su - hadoop
    [hadoop@server2 ~]$ ls
    hadoop               java                       zookeeper-3.4.9
    hadoop-2.7.3         jdk1.7.0_79                zookeeper-3.4.9.tar.gz
    hadoop-2.7.3.tar.gz  jdk-7u79-linux-x64.tar.gz
    [hadoop@server2 ~]$ cd zookeeper-3.4.9
    [hadoop@server2 zookeeper-3.4.9]$ ls
    bin          dist-maven       LICENSE.txt           src
    build.xml    docs             NOTICE.txt            zookeeper-3.4.9.jar
    CHANGES.txt  ivysettings.xml  README_packaging.txt  zookeeper-3.4.9.jar.asc
    conf         ivy.xml          README.txt            zookeeper-3.4.9.jar.md5
    contrib      lib              recipes               zookeeper-3.4.9.jar.sha1
    [hadoop@server2 zookeeper-3.4.9]$ cd conf/
    [hadoop@server2 conf]$ ls
    configuration.xsl  log4j.properties  zoo_sample.cfg

    4. Add the node entries (zoo.cfg and myid)

    [hadoop@server2 conf]$ cp zoo_sample.cfg zoo.cfg
    [hadoop@server2 conf]$ vim zoo.cfg
    [hadoop@server2 conf]$ cat zoo.cfg | tail -n 3
    server.1=172.25.38.2:2888:3888
    server.2=172.25.38.3:2888:3888
    server.3=172.25.38.4:2888:3888
    [hadoop@server2 conf]$ mkdir /tmp/zookeeper
    [hadoop@server2 conf]$ cd /tmp/zookeeper/
    [hadoop@server2 zookeeper]$ echo 1 > myid
    [hadoop@server2 zookeeper]$ ls
    myid
    [hadoop@server2 zookeeper]$ cd
    [hadoop@server2 ~]$ cd zookeeper-3.4.9/conf/
    [hadoop@server2 conf]$ ls
    configuration.xsl  log4j.properties  zoo.cfg  zoo_sample.cfg
    [hadoop@server2 conf]$ cd ..
    [hadoop@server2 zookeeper-3.4.9]$ cd bin/
    [hadoop@server2 bin]$ ls
    README.txt    zkCli.cmd  zkEnv.cmd  zkServer.cmd
    zkCleanup.sh  zkCli.sh   zkEnv.sh   zkServer.sh
    [hadoop@server2 bin]$ ./zkServer.sh start   # start the ZooKeeper service

    Every node uses the same zoo.cfg; in addition, each node needs a myid file in /tmp/zookeeper containing a unique number in the range 1-255. Configure the remaining nodes the same way, as shown below for server3 and server4.
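    For reference, the complete zoo.cfg after the edit would look roughly like this. Everything above the server.* lines is the zoo_sample.cfg default (an assumption, since only the tail of the file is shown above); note dataDir=/tmp/zookeeper, which is why the myid file must live in that directory:

    # zoo.cfg (sketch: zoo_sample.cfg defaults plus the three quorum members)
    tickTime=2000            # basic time unit in ms
    initLimit=10             # ticks a follower may take to connect and sync to the leader
    syncLimit=5              # ticks a follower may lag behind before being dropped
    dataDir=/tmp/zookeeper   # holds the myid file and the transaction logs
    clientPort=2181          # port that clients (and the ZKFC) connect to
    server.1=172.25.38.2:2888:3888   # myid 1: peer port 2888, leader-election port 3888
    server.2=172.25.38.3:2888:3888   # myid 2
    server.3=172.25.38.4:2888:3888   # myid 3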

    server3

    [root@server3 ~]# mount 172.25.38.1:/home/hadoop/ /home/hadoop/
    [root@server3 ~]# df
    Filesystem                    1K-blocks    Used Available Use% Mounted on
    /dev/mapper/VolGroup-lv_root   19134332 1537052  16625300   9% /
    tmpfs                            251124       0    251124   0% /dev/shm
    /dev/vda1                        495844   33478    436766   8% /boot
    172.25.38.1:/home/hadoop/      19134336 3289728  14872704  19% /home/hadoop
    [root@server3 ~]# rm -fr /tmp/*
    [root@server3 ~]# su - hadoop
    [hadoop@server3 ~]$ ls
    hadoop               java                       zookeeper-3.4.9
    hadoop-2.7.3         jdk1.7.0_79                zookeeper-3.4.9.tar.gz
    hadoop-2.7.3.tar.gz  jdk-7u79-linux-x64.tar.gz
    [hadoop@server3 ~]$ mkdir /tmp/zookeeper
    [hadoop@server3 ~]$ cd /tmp/zookeeper/
    [hadoop@server3 zookeeper]$ echo 2 > myid
    [hadoop@server3 zookeeper]$ cd
    [hadoop@server3 ~]$ cd zookeeper-3.4.9/bin/
    [hadoop@server3 bin]$ ls
    README.txt    zkCli.cmd  zkEnv.cmd  zkServer.cmd  zookeeper.out
    zkCleanup.sh  zkCli.sh   zkEnv.sh   zkServer.sh
    [hadoop@server3 bin]$ ./zkServer.sh start

    server4

    [root@server4 ~]# mount 172.25.38.1:/home/hadoop/ /home/hadoop/
    [root@server4 ~]# df
    Filesystem                    1K-blocks    Used Available Use% Mounted on
    /dev/mapper/VolGroup-lv_root   19134332 1350656  16811696   8% /
    tmpfs                            251124       0    251124   0% /dev/shm
    /dev/vda1                        495844   33478    436766   8% /boot
    172.25.38.1:/home/hadoop/      19134336 3289728  14872704  19% /home/hadoop
    [root@server4 ~]# rm -fr /tmp/*
    [root@server4 ~]# su - hadoop
    [hadoop@server4 ~]$ mkdir /tmp/zookeeper
    [hadoop@server4 ~]$ cd /tmp/zookeeper/
    [hadoop@server4 zookeeper]$ echo 3 > myid
    [hadoop@server4 zookeeper]$ ls
    myid
    [hadoop@server4 zookeeper]$ cd
    [hadoop@server4 ~]$ cd zookeeper-3.4.9/bin/
    [hadoop@server4 bin]$ ./zkServer.sh start
    ZooKeeper JMX enabled by default
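    With all three servers started, the quorum can be checked from any node: one member should report leader and the other two follower. Which node wins the election varies, so the output below is only a sketch:

    [hadoop@server3 bin]$ ./zkServer.sh status
    ZooKeeper JMX enabled by default
    Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
    Mode: follower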

    server2

    [hadoop@server2 bin]$ ls
    README.txt    zkCli.cmd  zkEnv.cmd  zkServer.cmd  zookeeper.out
    zkCleanup.sh  zkCli.sh   zkEnv.sh   zkServer.sh
    [hadoop@server2 bin]$ pwd
    /home/hadoop/zookeeper-3.4.9/bin
    [hadoop@server2 bin]$ ./zkCli.sh   # connect to ZooKeeper
    [zk: localhost:2181(CONNECTED) 0] ls /
    [zookeeper]
    [zk: localhost:2181(CONNECTED) 1] ls /zookeeper
    [quota]
    [zk: localhost:2181(CONNECTED) 2] ls /zookeeper/quota
    []
    [zk: localhost:2181(CONNECTED) 3] get /zookeeper/quota

    cZxid = 0x0
    ctime = Thu Jan 01 08:00:00 CST 1970
    mZxid = 0x0
    mtime = Thu Jan 01 08:00:00 CST 1970
    pZxid = 0x0
    cversion = 0
    dataVersion = 0
    aclVersion = 0
    ephemeralOwner = 0x0
    dataLength = 0
    numChildren = 0

    5. Hadoop configuration in detail

    core-site.xml

    [hadoop@server1 ~]$ ls
    hadoop               java                       zookeeper-3.4.9
    hadoop-2.7.3         jdk1.7.0_79                zookeeper-3.4.9.tar.gz
    hadoop-2.7.3.tar.gz  jdk-7u79-linux-x64.tar.gz
    [hadoop@server1 ~]$ cd hadoop/etc/hadoop/
    [hadoop@server1 hadoop]$ vim core-site.xml
    <configuration>
        <!-- Point HDFS at the logical nameservice "masters" (the name is arbitrary) -->
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://masters</value>
        </property>
        <!-- Addresses of the ZooKeeper quorum -->
        <property>
            <name>ha.zookeeper.quorum</name>
            <value>172.25.38.2:2181,172.25.38.3:2181,172.25.38.4:2181</value>
        </property>
    </configuration>

    hdfs-site.xml

    [hadoop@server1 hadoop]$ vim hdfs-site.xml
    [hadoop@server1 hadoop]$ cat hdfs-site.xml | tail -n 74
    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>3</value>
        </property>
        <!-- The nameservice id "masters" must match fs.defaultFS in core-site.xml -->
        <property>
            <name>dfs.nameservices</name>
            <value>masters</value>
        </property>
        <!-- The "masters" nameservice contains two NameNodes, h1 and h2 -->
        <property>
            <name>dfs.ha.namenodes.masters</name>
            <value>h1,h2</value>
        </property>
        <!-- RPC address of h1 -->
        <property>
            <name>dfs.namenode.rpc-address.masters.h1</name>
            <value>172.25.38.1:9000</value>
        </property>
        <!-- HTTP address of h1 -->
        <property>
            <name>dfs.namenode.http-address.masters.h1</name>
            <value>172.25.38.1:50070</value>
        </property>
        <!-- RPC address of h2 -->
        <property>
            <name>dfs.namenode.rpc-address.masters.h2</name>
            <value>172.25.38.5:9000</value>
        </property>
        <!-- HTTP address of h2 -->
        <property>
            <name>dfs.namenode.http-address.masters.h2</name>
            <value>172.25.38.5:50070</value>
        </property>
        <!-- Where the NameNode metadata (edit log) is shared on the JournalNodes -->
        <property>
            <name>dfs.namenode.shared.edits.dir</name>
            <value>qjournal://172.25.38.2:8485;172.25.38.3:8485;172.25.38.4:8485/masters</value>
        </property>
        <!-- Where each JournalNode stores its data on local disk -->
        <property>
            <name>dfs.journalnode.edits.dir</name>
            <value>/tmp/journaldata</value>
        </property>
        <!-- Enable automatic failover when a NameNode fails -->
        <property>
            <name>dfs.ha.automatic-failover.enabled</name>
            <value>true</value>
        </property>
        <!-- How clients locate the active NameNode after a failover -->
        <property>
            <name>dfs.client.failover.proxy.provider.masters</name>
            <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>
        <!-- Fencing methods, one per line -->
        <property>
            <name>dfs.ha.fencing.methods</name>
            <value>
            sshfence
            shell(/bin/true)
            </value>
        </property>
        <!-- The sshfence method requires passwordless SSH -->
        <property>
            <name>dfs.ha.fencing.ssh.private-key-files</name>
            <value>/home/hadoop/.ssh/id_rsa</value>
        </property>
        <!-- sshfence connection timeout (ms) -->
        <property>
            <name>dfs.ha.fencing.ssh.connect-timeout</name>
            <value>30000</value>
        </property>
    </configuration>
    [hadoop@server1 hadoop]$ pwd
    /home/hadoop/hadoop/etc/hadoop
    [hadoop@server1 hadoop]$ vim slaves
    [hadoop@server1 hadoop]$ cat slaves
    172.25.38.2
    172.25.38.3
    172.25.38.4

    Start the journalnode daemon on each of the three DataNodes in turn (on the first start of HDFS, the journalnodes must be started first):

    [hadoop@server2 hadoop]$ pwd
    /home/hadoop/hadoop
    [hadoop@server2 hadoop]$ ls
    bigfile  etc      input  libexec      logs        output      sbin
    bin      include  lib    LICENSE.txt  NOTICE.txt  README.txt  share
    [hadoop@server2 hadoop]$ sbin/hadoop-daemon.sh start journalnode
    starting journalnode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-journalnode-server2.out
    [hadoop@server2 hadoop]$ jps
    1459 JournalNode
    1508 Jps
    1274 QuorumPeerMain

    Repeat the same step on server3 and server4.
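    Both the start scripts and the sshfence method above rely on passwordless SSH as the hadoop user; the private-key path /home/hadoop/.ssh/id_rsa comes from hdfs-site.xml. The original does not show the key setup, but a typical sequence with standard OpenSSH tools would be:

    [hadoop@server1 ~]$ ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa   # generate a key pair with no passphrase
    [hadoop@server1 ~]$ ssh-copy-id hadoop@172.25.38.5             # authorize the key for server5

    Because /home/hadoop is shared over NFS, appending the public key to ~/.ssh/authorized_keys once makes it effective on every node that mounts the directory.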

    6. Test the passwordless connection to server5 and copy over the NameNode metadata to complete the HA setup

    [hadoop@server1 hadoop]$ pwd
    /home/hadoop/hadoop
    [hadoop@server1 hadoop]$ ls
    bigfile  etc      input  libexec      logs        output      sbin
    bin      include  lib    LICENSE.txt  NOTICE.txt  README.txt  share
    [hadoop@server1 hadoop]$ bin/hdfs namenode -format
    [hadoop@server1 hadoop]$ ssh server5
    [hadoop@server5 ~]$ exit
    logout
    Connection to server5 closed.
    [hadoop@server1 hadoop]$ ssh 172.25.38.5
    Last login: Tue Aug 28 10:40:53 2018 from server1
    [hadoop@server5 ~]$ exit
    logout
    Connection to 172.25.38.5 closed.
    [hadoop@server1 hadoop]$ scp -r /tmp/hadoop-hadoop/ 172.25.38.5:/tmp/
    fsimage_0000000000000000000              100%  353   0.3KB/s   00:00
    VERSION                                  100%  202   0.2KB/s   00:00
    seen_txid                                100%    2   0.0KB/s   00:00
    fsimage_0000000000000000000.md5          100%   62   0.1KB/s   00:00
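    Copying /tmp/hadoop-hadoop by hand works here because the NameNode metadata directory defaults to /tmp. The stock alternative, not used in this walkthrough, is to let the standby fetch the metadata from the active NameNode itself:

    [hadoop@server5 hadoop]$ bin/hdfs namenode -bootstrapStandby   # pull the formatted fsimage from the active NameNode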

    Format ZooKeeper (only needs to be run on h1)

    [hadoop@server1 hadoop]$ ls
    bigfile  etc      input  libexec      logs        output      sbin
    bin      include  lib    LICENSE.txt  NOTICE.txt  README.txt  share
    [hadoop@server1 hadoop]$ bin/hdfs zkfc -formatZK

    Start the HDFS cluster (only needs to be run on h1)

    [hadoop@server1 hadoop]$ sbin/start-dfs.sh

    If passwordless SSH is not fully set up, the script will hang at host-key prompts; type yes to continue.
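    A quick sanity check after startup: given the roles configured above, jps should show the following daemons per node (PIDs will differ; this is a sketch, not output from the original session):

    [hadoop@server1 hadoop]$ jps   # on the NameNodes: NameNode, DFSZKFailoverController
    [hadoop@server2 ~]$ jps        # on the slaves: DataNode, JournalNode, QuorumPeerMain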

    server2

    [hadoop@server2 ~]$ cd zookeeper-3.4.9
    [hadoop@server2 zookeeper-3.4.9]$ bin/zkCli.sh
    [zk: localhost:2181(CONNECTED) 0] ls /
    [zookeeper, hadoop-ha]
    [zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha
    [masters]
    [zk: localhost:2181(CONNECTED) 2] ls /hadoop-ha/masters
    [ActiveBreadCrumb, ActiveStandbyElectorLock]
    [zk: localhost:2181(CONNECTED) 3] ls /hadoop-ha/masters/ActiveBreadCrumb
    []
    [zk: localhost:2181(CONNECTED) 4] get /hadoop-ha/masters/ActiveBreadCrumb
    mastersh2server5�F(�>          # binary payload: the current active master is server5
    cZxid = 0x10000000a
    ctime = Tue Aug 28 10:46:47 CST 2018
    mZxid = 0x10000000a
    mtime = Tue Aug 28 10:46:47 CST 2018
    pZxid = 0x10000000a
    cversion = 0
    dataVersion = 0
    aclVersion = 0
    ephemeralOwner = 0x0
    dataLength = 28
    numChildren = 0

    7. Check the state of server1 and server5 in the web UI (http://172.25.38.1:50070 and http://172.25.38.5:50070): one is active, the other standby
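    The same check can be made from the command line with the h1/h2 IDs defined in hdfs-site.xml (output shown assuming server5 currently holds the active role, as the ZooKeeper breadcrumb above indicates):

    [hadoop@server1 hadoop]$ bin/hdfs haadmin -getServiceState h1
    standby
    [hadoop@server1 hadoop]$ bin/hdfs haadmin -getServiceState h2
    active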

    8. Test automatic failover

    [hadoop@server5 ~]$ ls
    hadoop        hadoop-2.7.3.tar.gz  jdk1.7.0_79                zookeeper-3.4.9
    hadoop-2.7.3  java                 jdk-7u79-linux-x64.tar.gz  zookeeper-3.4.9.tar.gz
    [hadoop@server5 ~]$ cd hadoop
    [hadoop@server5 hadoop]$ ls
    bigfile  etc      input  libexec      logs        output      sbin
    bin      include  lib    LICENSE.txt  NOTICE.txt  README.txt  share
    [hadoop@server5 hadoop]$ bin/hdfs dfs -mkdir /user
    [hadoop@server5 hadoop]$ bin/hdfs dfs -mkdir /user/hadoop
    [hadoop@server5 hadoop]$ bin/hdfs dfs -ls
    [hadoop@server5 hadoop]$ bin/hdfs dfs -put etc/hadoop/ input
    [hadoop@server5 hadoop]$ bin/hdfs dfs -ls
    Found 1 items
    drwxr-xr-x   - hadoop supergroup          0 2018-08-28 10:59 input
    [hadoop@server5 hadoop]$ jps
    1479 DFSZKFailoverController
    1382 NameNode
    1945 Jps
    [hadoop@server5 hadoop]$ kill -9 1382   # kill the active NameNode process outright
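    Within a few seconds the ZKFC on server1 should win the election and promote h1. A quick way to confirm, using the same haadmin check as above:

    [hadoop@server1 hadoop]$ bin/hdfs haadmin -getServiceState h1
    active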

    server2

    [zk: localhost:2181(CONNECTED) 6] get /hadoop-ha/masters/ActiveBreadCrumb
    mastersh1server1�F(�>          # the active master has switched to server1
    cZxid = 0x10000000a
    ctime = Tue Aug 28 10:46:47 CST 2018
    mZxid = 0x10000000f
    mtime = Tue Aug 28 11:00:22 CST 2018
    pZxid = 0x10000000a
    cversion = 0
    dataVersion = 1
    aclVersion = 0
    ephemeralOwner = 0x0
    dataLength = 28
    numChildren = 0

    server5

    [hadoop@server5 hadoop]$ pwd
    /home/hadoop/hadoop
    [hadoop@server5 hadoop]$ jps
    1479 DFSZKFailoverController
    1991 Jps
    [hadoop@server5 hadoop]$ sbin/hadoop-daemon.sh start namenode   # bring the failed node back
    starting namenode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-namenode-server5.out
    [hadoop@server5 hadoop]$ jps   # the NameNode process is back
    1479 DFSZKFailoverController
    2020 NameNode
    2100 Jps
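    Note that the recovered NameNode does not take the active role back: Hadoop HA performs no automatic failback, so server5 rejoins as standby. This can be confirmed with the same haadmin check:

    [hadoop@server5 hadoop]$ bin/hdfs haadmin -getServiceState h2
    standby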