Hadoop is written in Java and was created to solve the two big problems of big-data scenarios: distributed storage and distributed processing. It bundles many components and sub-projects and needs to run on a Linux system. Its two core components are HDFS and MapReduce.
Download address: https://archive.apache.org/dist/hadoop/common/
Pick a tar.gz version that suits you; this document uses v3.2.1.
Hadoop is developed in Java, so it needs a JDK at runtime; install the JDK first.
The correspondence between Hadoop and JDK versions is as follows:
Hadoop version | JDK version |
---|---|
Hadoop 3.3 and above | Java 8 or Java 11 (runtime only) |
Hadoop 3.0 ~ 3.2 | Java 8 |
Hadoop 2.7 ~ 2.10 | Java 7 and Java 8 |
Hadoop can be installed in three modes: standalone (local) mode, pseudo-distributed mode, and fully distributed mode.
Standalone mode is mainly for testing and learning and stores data on the local file system underneath. Pseudo-distributed and fully distributed modes store data on HDFS.
Standalone mode installation
Upload the tar.gz package to a directory on the Linux machine, extract it, and rename the extracted directory to hadoop (a command sketch follows the hadoop-env.sh snippet below). Then edit the ./etc/hadoop/hadoop-env.sh file and configure the JDK path:
# The java implementation to use. By default, this environment
# variable is REQUIRED on ALL platforms except OS X!
# export JAVA_HOME=
export JAVA_HOME=/usr/java/jdk1.8.0_201-amd64
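As mentioned above, a minimal sketch of the upload/extract/rename step; the /home/bigData/soft directory and the exact archive file name are assumptions inferred from the paths used later in this document:
cd /home/bigData/soft          # directory the archive was uploaded to (illustrative)
tar -zxvf hadoop-3.2.1.tar.gz  # extract the distribution
mv hadoop-3.2.1 hadoop         # rename the extracted directory to hadoop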
Run the following command to verify the Hadoop installation:
[root@k8s-node-107 hadoop]# bin/hadoop version
Hadoop 3.2.1
Source code repository Unknown -r 7a3bc90b05f257c8ace2f76d74264906f0f7a932
Compiled by hexiaoqiao on 2021-01-03T09:26Z
Compiled with protoc 2.5.0
From source with checksum 5a8f564f46624254b27f6a33126ff4
This command was run using /home/bigData/soft/hadoop/share/hadoop/common/hadoop-common-3.2.2.jar
Run the following commands to exercise Hadoop with the bundled grep example (run them from the parent directory of the hadoop installation directory, which the relative paths below assume):
mkdir input
cp hadoop/etc/hadoop/*.xml input
hadoop/bin/hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar grep input output 'dfs[a-z.]+'
[root@localhost soft]# cat output/*
1 dfsadmin
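Note that MapReduce refuses to write into an existing output directory, so to re-run the example you must delete output first:
rm -r output    # otherwise the job aborts with an output-directory-already-exists error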
Takeaway: standalone mode is really only for checking that a developed jar works. The run above used MapReduce for the computation but did not touch HDFS at all.
Pseudo-distributed mode
Most of the problems people hit during a pseudo-distributed Hadoop installation come from unfamiliarity with everyday Linux operations such as creating users, granting permissions, and setting up passwordless SSH login.
1. Configure the Hadoop environment variables (append to /etc/profile):
# Hadoop Environment Variables
export HADOOP_HOME=/home/bigData/soft/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
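After saving /etc/profile, reload it in the current shell and verify the variables took effect, for example:
source /etc/profile
echo $HADOOP_HOME    # should print /home/bigData/soft/hadoop
hadoop version       # hadoop is now on the PATH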
2. Configure the user that will start Hadoop
In the pseudo-distributed/distributed architecture the master node starts the other daemons over SSH, and SSH asks for a password by default, so passwordless SSH login has to be set up. To avoid touching the SSH setup of the machine's existing users, we create a dedicated user, here named hadoop. The commands are as follows:
adduser hadoop            # create the hadoop user
passwd hadoop             # set its password
id hadoop                 # check the hadoop user and group information
usermod -g root hadoop    # put hadoop into the root group
su hadoop                 # switch to the hadoop account
sudo chmod -R 777 hadoop  # open up permissions on the hadoop installation directory (run from its parent directory)
3. Modify the sudoers file
Switch to the root account and modify the sudoers file; otherwise the new account cannot use sudo and you will get an error like:
hadoop is not in the sudoers file. This incident will be reported.
The edit is done as follows:
[root@localhost hadoop]# vi /etc/sudoers
Add the hadoop entry shown below. Since the file is read-only, save with :wq! (the ! is required, otherwise vi refuses to write a read-only file). Alternatively, temporarily add write permission with chmod u+w /etc/sudoers before editing and restore it with chmod 0440 /etc/sudoers afterwards, or simply use visudo, which handles this and validates the syntax for you.
#Allow root to run any commands anywhere
root ALL=(ALL) ALL
hadoop ALL=(ALL) ALL
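After saving, switch back to the hadoop account and confirm that sudo now works, for example:
su hadoop
sudo -l         # lists what hadoop may run via sudo; should include (ALL) ALL
sudo whoami     # should print root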
4. Set up passwordless SSH login
After switching to the hadoop user, test whether passwordless login already works by running ssh localhost:
[hadoop@localhost soft]$ ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:cbS92o4o5+EzTyMUh93la2K25R2niIP10hRIMmh/zRA.
ECDSA key fingerprint is MD5:d6:3b:b0:e7:6d:6f:b8:57:83:6c:db:9e:88:73:a8:e4.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
hadoop@localhost's password:
Permission denied, please try again.
hadoop@localhost's password:
If it prompts for a password as above, passwordless login is not yet set up. The following commands configure it:
[hadoop@localhost soft]$ cd ~/.ssh/ # if this directory does not exist, run ssh localhost once first
[hadoop@localhost .ssh]$ ssh-keygen -t rsa # just press Enter at every prompt
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:DptnAzC7LHbHldsrnr0aQrhIc157McjaLXvuM3D0CxQ hadoop@localhost.localdomain
The key's randomart image is:
+---[RSA 2048]----+
| |
| E |
| o . |
| * .o. |
| o + BoS. |
| . * O.Xo=. |
| + B XoX... |
| . o . B==.. |
| .=*=+. |
+----[SHA256]-----+
[hadoop@localhost .ssh]$ ls
id_rsa id_rsa.pub known_hosts
[hadoop@localhost .ssh]$ cat id_rsa.pub >> authorized_keys # add the key to the authorized list
[hadoop@localhost .ssh]$ ls
authorized_keys id_rsa id_rsa.pub known_hosts
[hadoop@localhost .ssh]$ chmod 600 ./authorized_keys # tighten the file permissions
Try ssh localhost again; output like the following means passwordless login is working:
[hadoop@localhost .ssh]$ ssh localhost
Last failed login: Thu Jul 1 17:32:19 CST 2021 from localhost on ssh:notty
There were 2 failed login attempts since the last successful login.
Last login: Thu Jul 1 17:31:21 2021
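If ssh still prompts for a password at this point, overly loose permissions on the home or .ssh directory are a common cause, because sshd then ignores authorized_keys. A sketch of tightening them:
chmod 700 ~          # the home directory must not be group/world writable
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys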
5. Modify the four key configuration files
vi etc/hadoop/core-site.xml
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/bigData/soft/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:39000</value>
    </property>
</configuration>
vi etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/bigData/soft/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/bigData/soft/hadoop/tmp/dfs/data</value>
    </property>
    <property>
        <name>dfs.namenode.http-address</name>
        <value>0.0.0.0:9870</value>
    </property>
</configuration>
vi etc/hadoop/mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
    </property>
</configuration>
vi etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>localhost</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
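After saving the files, a quick way to confirm that Hadoop actually picks up a value is hdfs getconf, for example:
hdfs getconf -confKey fs.defaultFS       # should print hdfs://localhost:39000
hdfs getconf -confKey dfs.replication    # should print 1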
6. Initialize HDFS
[hadoop@localhost hadoop]# hdfs namenode -format
WARNING: /home/bigData/soft/hadoop/logs does not exist. Creating.
2021-06-25 12:55:08,910 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = localhost/127.0.0.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 3.2.2
.....
2021-06-25 12:55:10,799 INFO util.GSet: 0.029999999329447746% max memory 4.3 GB = 1.3 MB
2021-06-25 12:55:10,799 INFO util.GSet: capacity = 2^17 = 131072 entries
2021-06-25 12:55:10,886 INFO namenode.FSImage: Allocated new BlockPoolId: BP-494110815-127.0.0.1-1624596910865
2021-06-25 12:55:10,902 INFO common.Storage: Storage directory /home/bigData/soft/hadoop/datanode/dfs/name has been successfully formatted.
2021-06-25 12:55:10,959 INFO namenode.FSImageFormatProtobuf: Saving image file /home/bigData/soft/hadoop/datanode/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
2021-06-25 12:55:11,100 INFO namenode.FSImageFormatProtobuf: Image file /home/bigData/soft/hadoop/datanode/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 399 bytes saved in 0 seconds .
2021-06-25 12:55:11,121 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2021-06-25 12:55:11,129 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid=0 when meet shutdown.
2021-06-25 12:55:11,129 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost/127.0.0.1
************************************************************/
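One caveat: running hdfs namenode -format again later generates a new clusterID, after which the DataNode refuses to start with an "Incompatible clusterIDs" error. On a throwaway test setup the simplest recovery is to wipe the data directories before reformatting; a sketch, assuming the dfs.*.dir paths configured above:
# DANGER: destroys all HDFS data; only acceptable on a disposable test environment
rm -rf /home/bigData/soft/hadoop/tmp/dfs/name/* /home/bigData/soft/hadoop/tmp/dfs/data/*
hdfs namenode -format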
7. Modify the HDFS start/stop scripts
Edit start-dfs.sh and stop-dfs.sh (under sbin/) and add the following lines at the top of each file:
HDFS_DATANODE_USER=hadoop
HADOOP_SECURE_DN_USER=hadoop
HDFS_NAMENODE_USER=hadoop
HDFS_SECONDARYNAMENODE_USER=hadoop
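For reference, these user variables are mainly consulted when the scripts are launched as root; if you later start YARN the same way and hit the analogous check, the corresponding variables for start-yarn.sh / stop-yarn.sh are:
YARN_RESOURCEMANAGER_USER=hadoop
YARN_NODEMANAGER_USER=hadoop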
8. Start DFS
Run start-dfs.sh; after it finishes, run jps to check the running processes. If the following four are present, the start succeeded:
[hadoop@localhost hadoop]$ jps
114497 Jps
113914 NameNode
114314 SecondaryNameNode
114044 DataNode
If the NameNode or any other process fails to start, be sure to check the logs under logs/.
Open http://ip:9870/ in a browser to reach the HDFS web UI, where ip is the address of the server the cluster is deployed on.
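You can also smoke-test HDFS from the command line at this point; a small sketch (the /user/hadoop path is just an example):
hdfs dfs -mkdir -p /user/hadoop                                     # create a home directory on HDFS
hdfs dfs -put $HADOOP_HOME/etc/hadoop/core-site.xml /user/hadoop/   # upload a file
hdfs dfs -ls /user/hadoop                                           # list it back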
Shutdown command:
stop-dfs.sh
9. Start YARN
Run start-yarn.sh; after it finishes, run jps. If the NodeManager and ResourceManager processes are present, the start succeeded:
[hadoop@localhost hadoop]$ jps
113914 NameNode
114314 SecondaryNameNode
121629 NodeManager
114044 DataNode
121516 ResourceManager
121759 Jps
Open http://ip:8088/ in a browser to reach the YARN web UI, where ip is the address of the server the cluster is deployed on.
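With both HDFS and YARN running, you can rerun the bundled grep example, this time reading from and writing to HDFS; a sketch, run from $HADOOP_HOME and reusing the example jar version from earlier:
hdfs dfs -mkdir -p input                # relative paths resolve to /user/hadoop on HDFS
hdfs dfs -put etc/hadoop/*.xml input    # the input now lives on HDFS, not the local disk
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar grep input output 'dfs[a-z.]+'
hdfs dfs -cat output/*                  # the result is written to HDFS as well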
Shutdown command:
stop-yarn.sh
10. View the logs
cd logs/
tail -300f hadoop-hadoop-namenode-localhost.localdomain.log
#export HADOOP_ROOT_LOGGER=DEBUG,console
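# (uncommenting the line above makes Hadoop commands print DEBUG-level logs to the console, which helps when a failure leaves no obvious message)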
11. Common problems
Q1: After going through the steps above, jps shows no NameNode process.
A1: Check the NameNode log under logs/, which contains the following error:
2021-07-05 16:23:01,750 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2021-07-05 16:23:01,750 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2021-07-05 16:23:01,751 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2021-07-05 16:23:01,761 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.net.BindException: Problem binding to [localhost:9000] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
This is clearly a conflict caused by port 9000 already being in use; changing the port in the config fixes it. Because I didn't think to look at the logs, this problem cost me a whole afternoon.
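Before changing the port, it is easy to confirm what is occupying it; a sketch using ss or netstat (whichever is installed):
ss -lntp | grep 9000        # or: netstat -lntp | grep 9000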
P.S.: The other problems I ran into have already been folded back into the installation steps above, so I won't repeat them here.
Installing Hadoop demands a fairly solid grasp of Linux basics, which is what makes its learning curve relatively steep. I stepped into quite a few pits myself and consolidated my fundamentals along the way; the notes above are the record, and discussion and corrections are welcome.