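Contents
1. Bind a static IP address to the VMware virtual machine
2. Environment parameters
3. Environment setup
3.1 Install JDK 1.8
3.2 Install Hadoop -- cdh5.7.0
3.3 Install MySQL
3.4 Install Hive
3.5 Install Scala -- 2.12.8
3.6 Install Maven -- 3.5.4
3.7 Compile Spark from source
3.8 Set up the Spark local environment
3.9 Set up the Spark Standalone environment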
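1. Bind a static IP address to the VMware virtual machine
Look up the subnet IP, subnet mask, and gateway in VMware: Edit --> Virtual Network Editor --> NAT Settings
Edit the virtual machine's network interface configuration: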
sudo vim /etc/sysconfig/network-scripts/ifcfg-ens33
Set BOOTPROTO=static, then append the IPADDR, NETMASK, GATEWAY, and DNS1 settings to the end of the file:
IPADDR=192.168.48.143
NETMASK=255.255.255.0
GATEWAY=192.168.48.2
DNS1=8.8.8.8
Restart the network service so that the configuration takes effect:
sudo service network restart
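Before restarting the network, it can help to confirm that IPADDR and GATEWAY fall in the same subnet under the chosen NETMASK, since a mismatch would leave the VM without a usable route. The following is a minimal sketch in POSIX shell; the helper names `ip_to_int` and `same_subnet` are hypothetical and not part of any system tool.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed when two addresses share a subnet for the given mask.
# $1 = first address, $2 = second address, $3 = netmask
same_subnet() {
  a=$(ip_to_int "$1")
  b=$(ip_to_int "$2")
  m=$(ip_to_int "$3")
  [ $(( a & m )) -eq $(( b & m )) ]
}

# Values taken from the ifcfg-ens33 settings in this section.
if same_subnet 192.168.48.143 192.168.48.2 255.255.255.0; then
  echo "IPADDR and GATEWAY are on the same subnet"
else
  echo "IPADDR and GATEWAY are on different subnets"
fi
```

The check only compares the network portions of the two addresses; it does not verify that the gateway is actually reachable.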
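2. Environment parameters
Linux version: CentOS 7.2
JDK version: 1.8
Hadoop version: hadoop-2.6.0-cdh5.7.0
Scala version: 2.12.8
Spark version: spark-2.4.3 (building it requires Maven 3.5.4 or later; Scala 2.12 is installed here, although the default 2.4.3 build targets Scala 2.11 unless the scala-2.12 profile is enabled)
Development tool: IDEA
CDH download address: http://archive.cloudera.com/cdh5/cdh/5/
Project directories (as used in the commands below): ~/software for downloaded archives, ~/app for installed software, ~/source for the Spark source, ~/maven_repository for the local Maven repository
3. Environment setup
3.1 Install JDK 1.8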
[zcx@zoucaoxin software]$ rz -y
[zcx@zoucaoxin software]$ tar -zxvf jdk-8u191-linux-x64.tar.gz -C ~/app/
[zcx@zoucaoxin app]$ vim ~/.bash_profile
export JAVA_HOME=/home/zcx/app/jdk1.8.0_191
export JRE_HOME=/home/zcx/app/jdk1.8.0_191/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
source ~/.bash_profile
java -version
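Every installation step in this guide appends export lines to ~/.bash_profile, and re-running a step by hand can leave duplicate entries. As a rough sketch, the hypothetical helper `add_export` below appends a line only when the identical line is not already present; the demonstration writes to a scratch file rather than the real profile.

```shell
# Append "export NAME=VALUE" to a file unless that exact line already exists.
# $1 = profile file, $2 = variable name, $3 = value
add_export() {
  file=$1
  line="export $2=$3"
  # -F: fixed string, -x: match the whole line, -q: no output
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a scratch file standing in for ~/.bash_profile.
profile=$(mktemp)
add_export "$profile" JAVA_HOME /home/zcx/app/jdk1.8.0_191
add_export "$profile" JAVA_HOME /home/zcx/app/jdk1.8.0_191   # second call adds nothing
cat "$profile"
rm -f "$profile"
```

Values that contain variable references, such as `$JAVA_HOME/bin:$PATH`, would need to be passed in single quotes so that they are written to the file unexpanded.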
3.2 Install Hadoop -- cdh5.7.0
[zcx@zoucaoxin ~]$ sudo vim /etc/hosts
192.168.48.143 zoucaoxin
[zcx@zoucaoxin ~]$ ssh-keygen -t rsa
[zcx@zoucaoxin ~]$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
[zcx@zoucaoxin ~]$ ssh localhost
[zcx@zoucaoxin software]$ rz -y
[zcx@zoucaoxin software]$ tar -zxvf hadoop-2.6.0-cdh5.7.0.tar.gz -C ~/app/
[zcx@zoucaoxin app]$ vim ~/.bash_profile
export HADOOP_HOME=/home/zcx/app/hadoop-2.6.0-cdh5.7.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
source ~/.bash_profile
[zcx@zoucaoxin hadoop-2.6.0-cdh5.7.0]$ vim etc/hadoop/hadoop-env.sh
export JAVA_HOME=/home/zcx/app/jdk1.8.0_191
[zcx@zoucaoxin hadoop-2.6.0-cdh5.7.0]$ vim etc/hadoop/core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://zoucaoxin:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/zcx/app/tmp</value>
    </property>
</configuration>
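To double-check what was written to a Hadoop configuration file, a property can be read back from the XML. The sketch below uses awk and assumes each `<name>` and `<value>` element sits on a single line, as in the file above; `get_prop` is a hypothetical helper, not a Hadoop command, and it is not a general XML parser. The demonstration runs against a scratch copy of the configuration.

```shell
# Print the value of one property from a Hadoop-style XML config file.
# $1 = config file, $2 = property name
get_prop() {
  awk -v key="$2" '
    /<name>/ {
      n = $0
      sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n)
      wanted = (n == key)          # remember whether the last name matched
    }
    /<value>/ && wanted {
      v = $0
      sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
      print v
      exit
    }
  ' "$1"
}

# Demonstration on a scratch file mirroring core-site.xml from this section.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://zoucaoxin:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/zcx/app/tmp</value>
    </property>
</configuration>
EOF
get_prop "$conf" fs.defaultFS
rm -f "$conf"
```

Once Hadoop is on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves, which is the more reliable check.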
[zcx@zoucaoxin hadoop-2.6.0-cdh5.7.0]$ vim etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
[zcx@zoucaoxin hadoop-2.6.0-cdh5.7.0]$ hadoop namenode -format
[zcx@zoucaoxin hadoop-2.6.0-cdh5.7.0]$ start-dfs.sh
[zcx@zoucaoxin hadoop-2.6.0-cdh5.7.0]$ jps
[zcx@zoucaoxin hadoop-2.6.0-cdh5.7.0]$ cd etc/hadoop
[zcx@zoucaoxin hadoop]$ cp mapred-site.xml.template mapred-site.xml
[zcx@zoucaoxin hadoop]$ vim mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
[zcx@zoucaoxin hadoop]$ vim yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
[zcx@zoucaoxin hadoop]$ start-yarn.sh
[zcx@zoucaoxin hadoop]$ jps
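After start-dfs.sh and start-yarn.sh, jps on a single-node setup is expected to list NameNode, DataNode, and SecondaryNameNode for HDFS plus ResourceManager and NodeManager for YARN. As a small sketch, the hypothetical helper `missing_daemons` reads jps output on standard input and prints any expected name that is absent; the process IDs in the sample listing are made up for illustration.

```shell
# Print each expected daemon name that does not appear in a jps listing.
# stdin: output of `jps` ("<pid> <name>" per line); args: expected names
missing_daemons() {
  listing=$(cat)
  for d in "$@"; do
    # Compare the second field exactly, so NameNode does not
    # match SecondaryNameNode.
    printf '%s\n' "$listing" |
      awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
      printf '%s\n' "$d"
  done
}

# Sample listing with illustrative PIDs; NodeManager is deliberately absent.
printf '%s\n' \
  '3001 NameNode' \
  '3120 DataNode' \
  '3310 SecondaryNameNode' \
  '3502 ResourceManager' \
  '3900 Jps' |
  missing_daemons NameNode DataNode SecondaryNameNode ResourceManager NodeManager
```

On the real machine the listing would come from `jps | missing_daemons ...`; empty output means every expected daemon is running.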
3.3 Install MySQL
[zcx@zoucaoxin software]$ wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
[zcx@zoucaoxin ~]$ sudo rpm -ivh mysql-community-release-el7-5.noarch.rpm
[zcx@zoucaoxin ~]$ sudo yum update
[zcx@zoucaoxin ~]$ sudo yum install mysql mysql-server
[zcx@zoucaoxin ~]$ sudo systemctl start mysqld
[zcx@zoucaoxin ~]$ mysqladmin -u root password "zoucaoxin"
[zcx@zoucaoxin software]$ mysql -u root -p
Enter password:
mysql> grant all privileges on *.* to root@'%' identified by "zoucaoxin";
3.4 Install Hive
# Add to ~/.bash_profile, as in the previous steps
export HIVE_HOME=/home/zcx/app/hive-1.1.0-cdh5.7.0
export PATH=$HIVE_HOME/bin:$PATH
[zcx@zoucaoxin hive-1.1.0-cdh5.7.0]$ cd conf/
[zcx@zoucaoxin conf]$ cp hive-env.sh.template hive-env.sh
[zcx@zoucaoxin conf]$ vim hive-env.sh
HADOOP_HOME=/home/zcx/app/hadoop-2.6.0-cdh5.7.0
[zcx@zoucaoxin conf]$ vim hive-site.xml
<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost:3306/sparksql?createDatabaseIfNotExist=true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>zoucaoxin</value>
    </property>
</configuration>
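The ConnectionURL value packs the metastore host, port, and database name into one string, followed by an option that asks the driver to create the database if it does not exist. As an illustration of how the parts line up, the hypothetical helper `parse_jdbc_url` below splits such a URL with POSIX parameter expansion; it assumes the `jdbc:mysql://host:port/database?options` shape used above and nothing more general.

```shell
# Split a jdbc:mysql URL of the form jdbc:mysql://host:port/db?options
# and print "host port db".
parse_jdbc_url() {
  rest=${1#jdbc:mysql://}   # host:port/db?options
  hostport=${rest%%/*}      # host:port
  db=${rest#*/}             # db?options
  db=${db%%\?*}             # db
  echo "${hostport%%:*} ${hostport#*:} $db"
}

parse_jdbc_url 'jdbc:mysql://localhost:3306/sparksql?createDatabaseIfNotExist=true'
```

With the URL from hive-site.xml this yields the host `localhost`, port `3306`, and database `sparksql`, which are the MySQL endpoint and schema Hive uses for its metastore.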
[zcx@zoucaoxin software]$ cd ~/app/hive-1.1.0-cdh5.7.0/lib/
[zcx@zoucaoxin lib]$ cp ~/software/mysql-connector-java-5.1.27.jar .
[zcx@zoucaoxin ~]$ hive
3.5 Install Scala -- 2.12.8
[zcx@zoucaoxin software]$ rz -y
[zcx@zoucaoxin software]$ tar -zxvf scala-2.12.8.tgz -C ~/app/
[zcx@zoucaoxin app]$ vim ~/.bash_profile
export SCALA_HOME=/home/zcx/app/scala-2.12.8
export PATH=$SCALA_HOME/bin:$PATH
[zcx@zoucaoxin app]$ source ~/.bash_profile
[zcx@zoucaoxin app]$ scala
3.6 Install Maven -- 3.5.4
[zcx@zoucaoxin app]$ vim ~/.bash_profile
export MAVEN_HOME=/home/zcx/app/apache-maven-3.5.4
export PATH=$MAVEN_HOME/bin:$PATH
[zcx@zoucaoxin app]$ source ~/.bash_profile
mvn -v
# Create a local repository directory and point Maven's settings.xml at it
mkdir maven_repository
<localRepository>/home/zcx/maven_repository</localRepository>
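Because the Spark build used in this guide needs Maven 3.5.4 or later, it may be worth comparing the installed version against that minimum before starting a long compile. The sketch below is a plain POSIX shell comparison of dotted numeric versions; `version_ge` is a hypothetical helper and handles numeric fields only.

```shell
# Succeed when dotted version $1 is greater than or equal to $2.
version_ge() {
  a=$1
  b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    x=${a%%.*}                       # leading field of each version
    y=${b%%.*}
    [ "${x:-0}" -gt "${y:-0}" ] && return 0
    [ "${x:-0}" -lt "${y:-0}" ] && return 1
    # Drop the leading field; a version with no dot left becomes empty.
    case $a in *.*) a=${a#*.} ;; *) a= ;; esac
    case $b in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  return 0
}

if version_ge 3.5.4 3.5.4; then
  echo "Maven version is sufficient"
else
  echo "Maven version is too old"
fi
```

On a machine with Maven installed, the first argument could be taken from the `Apache Maven <version>` line that `mvn -v` prints, for example with `mvn -v | awk '/^Apache Maven/ { print $3 }'`.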
3.7 Compile Spark from source
[zcx@zoucaoxin source]$ tar -zxvf spark-2.4.3.tgz
# Build reference: https://spark.apache.org/docs/latest/building-spark.html
# Add the Cloudera repository to pom.xml so that the CDH artifacts can be resolved
<repository>
    <id>cloudera</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
./build/mvn -Pyarn -Phive -Phive-thriftserver -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.7.0 -DskipTests clean package
[zcx@zoucaoxin spark-2.4.3]$ ./dev/make-distribution.sh --name 2.6.0-cdh5.7.0 --tgz -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver -Dhadoop.version=2.6.0-cdh5.7.0
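The value given to `--name` ends up in the file name of the tarball that make-distribution.sh produces, following the pattern `spark-<version>-bin-<name>.tgz`. The short sketch below reproduces that naming so the archive to look for is known in advance; `dist_tarball_name` is a hypothetical helper for illustration, not part of the Spark scripts.

```shell
# Print the distribution tarball name for a Spark version and --name value.
# $1 = Spark version, $2 = value passed to --name
dist_tarball_name() {
  printf 'spark-%s-bin-%s.tgz\n' "$1" "$2"
}

dist_tarball_name 2.4.3 2.6.0-cdh5.7.0
```

For the command above this gives spark-2.4.3-bin-2.6.0-cdh5.7.0.tgz, created in the root of the Spark source directory.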
3.8 Set up the Spark local environment
[zcx@zoucaoxin spark-2.4.3]$ tar -zxvf spark-2.4.3-bin-2.6.0-cdh5.7.0.tgz -C ~/app/
[zcx@zoucaoxin bin]$ rm -rf *.cmd
[zcx@zoucaoxin bin]$ vim ~/.bash_profile
export SPARK_HOME=/home/zcx/app/spark-2.4.3-bin-2.6.0-cdh5.7.0
export PATH=$SPARK_HOME/bin:$PATH
[zcx@zoucaoxin bin]$ source ~/.bash_profile
[zcx@zoucaoxin spark-2.4.3-bin-2.6.0-cdh5.7.0]$ spark-shell --master local[2]
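One detail worth knowing about the `local[2]` master URL: to the shell, an unquoted `[2]` is a filename pattern, so the argument is passed through unchanged only when no file in the current directory happens to match it. The sketch below demonstrates the effect in a scratch directory containing a file named `local2`; `glob_demo` is a hypothetical name used only for this illustration.

```shell
# Show how an unquoted local[2] is expanded when a matching file exists.
glob_demo() {
  dir=$(mktemp -d) || return 1
  (
    cd "$dir" || exit 1
    : > local2          # a file whose name matches the pattern local[2]
    echo local[2]       # unquoted: the shell replaces it with "local2"
    echo 'local[2]'     # quoted: passed through literally
  )
  rm -rf "$dir"
}

glob_demo
```

Writing the option as `--master 'local[2]'` avoids the problem regardless of which files are present.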
3.9 Set up the Spark Standalone environment
[zcx@zoucaoxin conf]$ cp spark-env.sh.template spark-env.sh
[zcx@zoucaoxin conf]$ vim spark-env.sh
SPARK_MASTER_HOST=zoucaoxin
SPARK_WORKER_CORES=2
SPARK_WORKER_MEMORY=2g
SPARK_WORKER_INSTANCES=1
[zcx@zoucaoxin conf]$ vim slaves
zoucaoxin
[zcx@zoucaoxin sbin]$ vim spark-config.sh
export JAVA_HOME=/home/zcx/app/jdk1.8.0_191
[zcx@zoucaoxin sbin]$ ./start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /home/zcx/app/spark-2.4.3-bin-2.6.0-cdh5.7.0/logs/spark-zcx-org.apache.spark.deploy.master.Master-1-zoucaoxin.out
zoucaoxin: starting org.apache.spark.deploy.worker.Worker, logging to /home/zcx/app/spark-2.4.3-bin-2.6.0-cdh5.7.0/logs/spark-zcx-org.apache.spark.deploy.worker.Worker-1-zoucaoxin.out
[zcx@zoucaoxin spark-2.4.3-bin-2.6.0-cdh5.7.0]$ cat /home/zcx/app/spark-2.4.3-bin-2.6.0-cdh5.7.0/logs/spark-zcx-org.apache.spark.deploy.master.Master-1-zoucaoxin.out
Spark Command: /home/zcx/app/jdk1.8.0_191/bin/java -cp /home/zcx/app/spark-2.4.3-bin-2.6.0-cdh5.7.0/conf/:/home/zcx/app/spark-2.4.3-bin-2.6.0-cdh5.7.0/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host zoucaoxin --port 7077 --webui-port 8080
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/05/15 10:49:43 INFO Master: Started daemon with process name: 23060@zoucaoxin
19/05/15 10:49:43 INFO SignalUtils: Registered signal handler for TERM
19/05/15 10:49:43 INFO SignalUtils: Registered signal handler for HUP
19/05/15 10:49:43 INFO SignalUtils: Registered signal handler for INT
19/05/15 10:49:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/05/15 10:49:44 INFO SecurityManager: Changing view acls to: zcx
19/05/15 10:49:44 INFO SecurityManager: Changing modify acls to: zcx
19/05/15 10:49:44 INFO SecurityManager: Changing view acls groups to:
19/05/15 10:49:44 INFO SecurityManager: Changing modify acls groups to:
19/05/15 10:49:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(zcx); groups with view permissions: Set(); users with modify permissions: Set(zcx); groups with modify permissions: Set()
19/05/15 10:49:45 INFO Utils: Successfully started service 'sparkMaster' on port 7077.
19/05/15 10:49:45 INFO Master: Starting Spark master at spark://zoucaoxin:7077
19/05/15 10:49:45 INFO Master: Running Spark version 2.4.3
19/05/15 10:49:46 INFO Utils: Successfully started service 'MasterUI' on port 8080.
19/05/15 10:49:46 INFO MasterWebUI: Bound MasterWebUI to 0.0.0.0, and started at http://zoucaoxin:8080
19/05/15 10:49:46 INFO Master: I have been elected leader! New state: ALIVE
19/05/15 10:49:51 INFO Master: Registering worker 192.168.48.143:38593 with 2 cores, 2.0 GB RAM
# The master binds to port 7077; the master web UI is on port 8080
# Open the web UI: http://192.168.48.143:8080/
# View the worker log
[zcx@zoucaoxin spark-2.4.3-bin-2.6.0-cdh5.7.0]$ cat /home/zcx/app/spark-2.4.3-bin-2.6.0-cdh5.7.0/logs/spark-zcx-org.apache.spark.deploy.worker.Worker-1-zoucaoxin.out
Spark Command: /home/zcx/app/jdk1.8.0_191/bin/java -cp /home/zcx/app/spark-2.4.3-bin-2.6.0-cdh5.7.0/conf/:/home/zcx/app/spark-2.4.3-bin-2.6.0-cdh5.7.0/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://zoucaoxin:7077
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/05/15 10:49:47 INFO Worker: Started daemon with process name: 23139@zoucaoxin
19/05/15 10:49:47 INFO SignalUtils: Registered signal handler for TERM
19/05/15 10:49:48 INFO SignalUtils: Registered signal handler for HUP
19/05/15 10:49:48 INFO SignalUtils: Registered signal handler for INT
19/05/15 10:49:48 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/05/15 10:49:48 INFO SecurityManager: Changing view acls to: zcx
19/05/15 10:49:48 INFO SecurityManager: Changing modify acls to: zcx
19/05/15 10:49:48 INFO SecurityManager: Changing view acls groups to:
19/05/15 10:49:48 INFO SecurityManager: Changing modify acls groups to:
19/05/15 10:49:48 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(zcx); groups with view permissions: Set(); users with modify permissions: Set(zcx); groups with modify permissions: Set()
19/05/15 10:49:49 INFO Utils: Successfully started service 'sparkWorker' on port 38593.
19/05/15 10:49:50 INFO Worker: Starting Spark worker 192.168.48.143:38593 with 2 cores, 2.0 GB RAM
19/05/15 10:49:50 INFO Worker: Running Spark version 2.4.3
19/05/15 10:49:50 INFO Worker: Spark home: /home/zcx/app/spark-2.4.3-bin-2.6.0-cdh5.7.0
19/05/15 10:49:50 INFO Utils: Successfully started service 'WorkerUI' on port 8081.
19/05/15 10:49:50 INFO WorkerWebUI: Bound WorkerWebUI to 0.0.0.0, and started at http://zoucaoxin:8081
19/05/15 10:49:50 INFO Worker: Connecting to master zoucaoxin:7077...
19/05/15 10:49:50 INFO TransportClientFactory: Successfully created connection to zoucaoxin/192.168.48.143:7077 after 120 ms (0 ms spent in bootstraps)
19/05/15 10:49:51 INFO Worker: Successfully registered with master spark://zoucaoxin:7077
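The master log records a "Registering worker" line for each worker that joins, and its core and memory figures should match SPARK_WORKER_CORES and SPARK_WORKER_MEMORY from spark-env.sh. As a quick sketch, the hypothetical helper `registered_workers` below pulls the address, core count, and memory out of such lines with sed; it assumes the log format shown in this section.

```shell
# Extract "address cores memory" for each worker registration in a master log.
# stdin: contents of the Spark master log
registered_workers() {
  sed -n 's/.*Master: Registering worker \([^ ]*\) with \([0-9]*\) cores, \(.*\) RAM$/\1 \2 \3/p'
}

# Demonstration on the registration line from the master log in this section.
echo '19/05/15 10:49:51 INFO Master: Registering worker 192.168.48.143:38593 with 2 cores, 2.0 GB RAM' |
  registered_workers
```

Against the real log, the input would come from the master's .out file under the Spark logs directory; no output would suggest that no worker has registered yet.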
[zcx@zoucaoxin sbin]$ ./stop-all.sh
[zcx@zoucaoxin bin]$ spark-shell --master spark://zoucaoxin:7077