Once the configuration is done, run:
java -version

java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

Download Hadoop and unpack it to a directory of your choice, /opt in this walkthrough, then configure the environment variables:
vim ~/.bash_profile

export HADOOP_HOME=/opt/hadoop-2.7.3
export HADOOP_PREFIX=$HADOOP_HOME
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
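Optionally, reload the profile and confirm the Hadoop binaries are on the PATH before going further:

source ~/.bash_profile
hadoop version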
Next, edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh (the krb5 options silence the "Unable to load realm info from SCDynamicStore" warning Hadoop throws on macOS):

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home
export HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop
export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"

Edit $HADOOP_HOME/etc/hadoop/core-site.xml:
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
    <description>The name of the default file system.</description>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/Users/micmiu/tmp/hadoop</value>
    <description>A base for other temporary directories.</description>
</property>
<property>
    <name>io.native.lib.available</name>
    <value>false</value>
    <description>Should native hadoop libraries, if present, be used (default is true).</description>
</property>
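Hadoop can echo a configured value back without any daemons running, which is a quick way to catch typos here:

hdfs getconf -confKey fs.defaultFS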
Next, edit hdfs-site.xml:

<property>
    <name>dfs.replication</name>
    <value>1</value>
    <!-- Use 1 for a single node; on a cluster, set this according to the actual number of nodes -->
</property>

Edit yarn-site.xml:
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

Edit mapred-site.xml, which first has to be created from the bundled template:
cp mapred-site.xml.template mapred-site.xml

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <final>true</final>
</property>

Format the NameNode:
hdfs namenode -format

Start HDFS and YARN:
start-dfs.sh
start-yarn.sh

Check that the daemons are up:
jps
6917 DataNode
6838 NameNode
2810 Launcher
7130 ResourceManager
7019 SecondaryNameNode
7772 Jps
7215 NodeManager
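With everything up, the web UIs should also respond on the Hadoop 2.x default ports:

open http://localhost:50070   # NameNode UI
open http://localhost:8088    # ResourceManager UI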
A WordCount example:

hdfs dfs -mkdir -p /user/jjzhu/wordcount/in
hdfs dfs -put xxxxx.txt /user/jjzhu/wordcount/in
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /user/jjzhu/wordcount/in /user/jjzhu/wordcount/out

The job run:
17/04/07 13:04:10 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/04/07 13:04:10 INFO input.FileInputFormat: Total input paths to process : 1
17/04/07 13:04:10 INFO mapreduce.JobSubmitter: number of splits:1
17/04/07 13:04:11 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1491532908338_0004
17/04/07 13:04:11 INFO impl.YarnClientImpl: Submitted application application_1491532908338_0004
17/04/07 13:04:11 INFO mapreduce.Job: The url to track the job: http://jjzhu:8088/proxy/application_1491532908338_0004/
17/04/07 13:04:11 INFO mapreduce.Job: Running job: job_1491532908338_0004
17/04/07 13:04:18 INFO mapreduce.Job: Job job_1491532908338_0004 running in uber mode : false
17/04/07 13:04:18 INFO mapreduce.Job:  map 0% reduce 0%
17/04/07 13:04:23 INFO mapreduce.Job:  map 100% reduce 0%
17/04/07 13:04:29 INFO mapreduce.Job:  map 100% reduce 100%
17/04/07 13:04:29 INFO mapreduce.Job: Job job_1491532908338_0004 completed successfully
17/04/07 13:04:29 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=1141
        FILE: Number of bytes written=239913
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=869
        HDFS: Number of bytes written=779
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=2859
        Total time spent by all reduces in occupied slots (ms)=2527
        Total time spent by all map tasks (ms)=2859
        Total time spent by all reduce tasks (ms)=2527
        Total vcore-milliseconds taken by all map tasks=2859
        Total vcore-milliseconds taken by all reduce tasks=2527
        Total megabyte-milliseconds taken by all map tasks=2927616
        Total megabyte-milliseconds taken by all reduce tasks=2587648
    Map-Reduce Framework
        Map input records=1
        Map output records=118
        Map output bytes=1219
        Map output materialized bytes=1141
        Input split bytes=122
        Combine input records=118
        Combine output records=89
        Reduce input groups=89
        Reduce shuffle bytes=1141
        Reduce input records=89
        Reduce output records=89
        Spilled Records=178
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=103
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=329252864
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=747
    File Output Format Counters
        Bytes Written=779

View the results:
hdfs dfs -ls /user/jjzhu/wordcount/out
-rw-r--r--   1 didi supergroup          0 2017-04-07 13:04 /user/jjzhu/wordcount/out/_SUCCESS
-rw-r--r--   1 didi supergroup        779 2017-04-07 13:04 /user/jjzhu/wordcount/out/part-r-00000

hdfs dfs -cat /user/jjzhu/wordcount/out/part-r-00000
A	1
Other	1
Others	1
Some	2
There	1
a	1
access	2
access);	1
according	1
adding	1
allowing	1
......

Shut down Hadoop with:
stop-dfs.sh
stop-yarn.sh

Next up is Hive. Download and unpack it, then configure the environment variables:

export HIVE_HOME=/opt/hive-2.1.1
export PATH=$HIVE_HOME/bin:$PATH
Download mysql-connector-xx.xx.xx.jar into $HIVE_HOME/lib.
vim hive-site.xml

If hive-site.xml does not exist yet, create it from conf/hive-default.xml.template first. Then replace every occurrence of ${system:java.io.tmpdir} and ${system:user.name} with a concrete local temp directory and your user name, respectively.
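A rough way to do the substitution in one go (BSD sed on macOS; the temp directory /opt/hive-2.1.1/tmp is an arbitrary choice here, any writable local path works):

mkdir -p /opt/hive-2.1.1/tmp
sed -i '' \
    -e 's|${system:java.io.tmpdir}|/opt/hive-2.1.1/tmp|g' \
    -e "s|\${system:user.name}|$USER|g" hive-site.xml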
Then configure the MySQL connection settings (note that & must be escaped as &amp; inside an XML value):

<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
</property>

Create the HDFS directories for Hive:
hdfs dfs -mkdir -p /usr/hive/warehouse
hdfs dfs -mkdir -p /usr/hive/tmp
hdfs dfs -mkdir -p /usr/hive/log
hdfs dfs -chmod -R 777 /usr/hive
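These paths should line up with whatever hive-site.xml points at (typically hive.metastore.warehouse.dir, hive.exec.scratchdir, and hive.querylog.location); a quick listing confirms they exist:

hdfs dfs -ls /usr/hive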
Initialize the metastore database:

./bin/schematool -initSchema -dbType mysql

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
mysql> use hive;
Database changed
mysql> show tables;
+---------------------------+
| Tables_in_hive            |
+---------------------------+
| AUX_TABLE                 |
| BUCKETING_COLS            |
| SORT_COLS                 |
| TABLE_PARAMS              |
| TAB_COL_STATS             |
| TBLS                      |
| TBL_COL_PRIVS             |
| TBL_PRIVS                 |
| TXNS                      |
| TXN_COMPONENTS            |
| TYPES                     |
| TYPE_FIELDS               |
| VERSION                   |
| WRITE_SET                 |
+---------------------------+
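If the initialization went through, schematool can also report the schema version it recorded:

./bin/schematool -info -dbType mysql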
Start Hive:

jjzhu:opt didi$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive-2.1.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/opt/hive-2.1.1/lib/hive-common-2.1.1.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
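A one-line smoke test from the shell works too:

hive -e 'show databases;'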
Finally, Sqoop. Download and unpack it, then configure the environment variables:

export SQOOP_HOME=/opt/sqoop-1.99.7
export SQOOP_SERVER_EXTRA_LIB=$SQOOP_HOME/extra
export PATH=$SQOOP_HOME/bin:$PATH

The conf directory holds the two main configuration files, sqoop.properties and sqoop_bootstrap.properties; most of the changes go into sqoop.properties:
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/opt/hadoop-2.7.3/etc/hadoop
org.apache.sqoop.security.authentication.type=SIMPLE
org.apache.sqoop.security.authentication.handler=org.apache.sqoop.security.authentication.SimpleAuthenticationHandler
org.apache.sqoop.security.authentication.anonymous=true
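With the properties in place, the Sqoop 2 server can be checked and started using the scripts shipped in the 1.99.7 bin directory:

sqoop2-tool verify
sqoop2-server start
sqoop2-shell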