Hadoop 2.2.0 Production Environment Simulation

1: Planning
Build a Hadoop 2.2.0 environment on CentOS 6.4, Java version 7u21.
192.168.100.201 product201.product (namenode)
192.168.100.202 product202.product (datanode)
192.168.100.203 product203.product (datanode)
192.168.100.204 product204.product (datanode)
192.168.100.200 productserver.product (DNS, NFS)
Guiding principles:
A: The Hadoop 2.2.0 distribution is shared at productserver.product:/share/hadoop; each client copies it from there during installation via a script.
B: The SSH public-key file is shared by all nodes; each node appends its own public key to the shared authorized_keys (productserver.product:/share/.ssh/authorized_keys, mounted as /mnt/.ssh on every node).
C: The Hadoop configuration files are shared by all nodes as well; they live in the same shared directory on productserver.product and each node references them through symbolic links.
D: Each node is installed with install scripts that mount the shared directories at boot, register the node in the slaves file, copy the Hadoop distribution, create the symbolic links, and so on.


2: Create a virtual machine template (VMware or VirtualBox both work)
A: Install a CentOS 6.4 virtual machine product201.product, enable the ssh service, and disable the iptables service:
[root@hadoop1 ~]# chkconfig sshd on
[root@hadoop1 ~]# chkconfig iptables off
[root@hadoop1 ~]# chkconfig ip6tables off
[root@hadoop1 ~]# chkconfig postfix off

B: Edit /etc/sysconfig/selinux and disable SELinux:
SELINUX=disabled
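The change takes effect after a reboot; to also turn enforcement off on the running system right away:
[root@hadoop1 ~]# setenforce 0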

C: Edit the ssh configuration /etc/ssh/sshd_config and uncomment:
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
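Restart sshd so the configuration change takes effect:
[root@hadoop1 ~]# service sshd restart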

D: Install Java, and append the following to the environment file /etc/profile:
export JAVA_HOME=/usr/java/jdk1.7.0_21
export JRE_HOME=/usr/java/jdk1.7.0_21/jre
export HADOOP_PREFIX=/app/hadoop/hadoop220
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:${HADOOP_PREFIX}/bin:${HADOOP_PREFIX}/sbin:$PATH
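Reload the profile and verify the Java installation:
[root@hadoop1 ~]# source /etc/profile
[root@hadoop1 ~]# java -version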

E: Add a hadoop group and a hadoop user, and set the hadoop user's password.
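A minimal sketch of this step:
[root@hadoop1 ~]# groupadd hadoop
[root@hadoop1 ~]# useradd -g hadoop hadoop
[root@hadoop1 ~]# passwd hadoop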

3: Create the virtual machines
A: Shut down the template machine and clone it into product202.product, product203.product, product204.product, and productserver.product.
On each clone, edit the following files so that the network configuration matches the virtual machine's name (typical changes are sketched after the list):
/etc/udev/rules.d/70-persistent-net.rules
/etc/sysconfig/network
/etc/sysconfig/network-scripts/ifcfg-eth0
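Typical changes, using product202 as an example (a sketch, assuming the static addressing from the plan above):
/etc/udev/rules.d/70-persistent-net.rules : delete the stale eth0 entry (or empty the file) so the clone's new MAC address is re-registered as eth0 on the next boot
/etc/sysconfig/network : HOSTNAME=product202.product
/etc/sysconfig/network-scripts/ifcfg-eth0 : remove or update HWADDR and UUID, then set
DEVICE=eth0
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.100.202
NETMASK=255.255.255.0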

B: Start product201.product, product202.product, product203.product, product204.product, and productserver.product, and make sure they can all ping each other.

4: Server installation (provides the DNS and NFS services)
A: Install DNS
[root@productserver ~]# yum install bind-libs bind bind-utils
  **************************************************************************
To harden the system further, bind's root directory can be confined to a chroot by installing bind-chroot. In that case, "$AddUnixListenSocket /var/named/chroot/dev/log" must be added to /etc/rsyslog.conf, otherwise the rsyslog daemon cannot record bind's log messages.
[root@productserver ~]# yum install bind-libs bind bind-utils bind-chroot
[root@productserver ~]# vi /etc/rsyslog.conf
$AddUnixListenSocket /var/named/chroot/dev/log
**************************************************************************

B: Configure /etc/named.conf and /etc/named.rfc1912.zones
[root@productserver ~]# vi /etc/named.conf 
[root@productserver ~]# cat /etc/named.conf

// Provided by Red Hat bind package to configure the ISC BIND named(8) DNS
// server as a caching only nameserver (as a localhost DNS resolver only).
//
// See /usr/share/doc/bind*/sample/ for example named configuration files.
//

options {
listen-on port 53 { any; };
listen-on-v6 port 53 { ::1; };
directory "/var/named";
dump-file "/var/named/data/cache_dump.db";
statistics-file "/var/named/data/named_stats.txt";
memstatistics-file "/var/named/data/named_mem_stats.txt";
allow-query { any; };
recursion yes;
forwarders { 202.101.172.35; };

dnssec-enable yes;
dnssec-validation yes;
dnssec-lookaside auto;

/* Path to ISC DLV key */
bindkeys-file "/etc/named.iscdlv.key";

managed-keys-directory "/var/named/dynamic";
};

logging {
channel default_debug {
file "data/named.run";
severity dynamic;
};
};

zone "." IN {
type hint;
file "named.ca";
};

include "/etc/named.rfc1912.zones";
include "/etc/named.root.key";


[root@productserver ~]# vi /etc/named.rfc1912.zones
[root@productserver ~]# cat /etc/named.rfc1912.zones

// named.rfc1912.zones:
//
// Provided by Red Hat caching-nameserver package
//
// ISC BIND named zone configuration for zones recommended by
// RFC 1912 section 4.1 : localhost TLDs and address zones
// and http://www.ietf.org/internet-drafts/draft-ietf-dnsop-default-local-zones-02.txt
// (c)2007 R W Franks
//
// See /usr/share/doc/bind*/sample/ for example named configuration files.
//

zone "localhost.localdomain" IN {
type master;
file "named.localhost";
allow-update { none; };
};

zone "localhost" IN {
type master;
file "named.localhost";
allow-update { none; };
};

//comment out the following lines
//zone "1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa" IN {
// type master;
// file "named.loopback";
// allow-update { none; };
//};

//zone "1.0.0.127.in-addr.arpa" IN {
// type master;
// file "named.loopback";
// allow-update { none; };
//};

zone "0.in-addr.arpa" IN {
type master;
file "named.empty";
allow-update { none; };
};

zone "product" IN {
type master;
file "product.zone";
};

zone "100.168.192.in-addr.arpa" IN {
type master;
file "100.168.192.zone";
};



C: Configure the forward and reverse zone files
The zone files are easy to get wrong; they can be checked with "named-checkzone <zone name> <zone file>", as shown after the two files below.
[root@productserver ~]# vi /var/named/product.zone
[root@productserver ~]# cat /var/named/product.zone

$TTL 86400
@ IN SOA product. root.product. (
2013122801 ; serial (d. adams)
3H ; refresh
15M ; retry
1W ; expiry
1D ) ; minimum
@ IN NS productserver.
productserver IN A 192.168.100.200

; forward resolution records
product201 IN A 192.168.100.201
product202 IN A 192.168.100.202
product203 IN A 192.168.100.203
product204 IN A 192.168.100.204
product211 IN A 192.168.100.211
product212 IN A 192.168.100.212
product213 IN A 192.168.100.213
product214 IN A 192.168.100.214

[root@productserver ~]# vi /var/named/100.168.192.zone
[root@productserver ~]# cat /var/named/100.168.192.zone 

$TTL 86400
@ IN SOA productserver. root.productserver. (
2013122801 ; serial (d. adams)
3H ; refresh
15M ; retry
1W ; expiry
1D ) ; minimum
IN NS productserver.
200 IN PTR productserver.product.

; reverse resolution records
201 IN PTR product201.product.
202 IN PTR product202.product.
203 IN PTR product203.product.
204 IN PTR product204.product.
211 IN PTR product211.product.
212 IN PTR product212.product.
213 IN PTR product213.product.
214 IN PTR product214.product.
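After editing both files, check them (the zone names come from /etc/named.rfc1912.zones above):
[root@productserver ~]# named-checkzone product /var/named/product.zone
[root@productserver ~]# named-checkzone 100.168.192.in-addr.arpa /var/named/100.168.192.zone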


D: Set the DNS service to start at boot and start it
[root@productserver ~]# chkconfig named on
[root@productserver ~]# /etc/init.d/named restart
**************************************************************************
If starting named fails and it gets stuck at
Generating /etc/rndc.key:
the fix is
[root@productserver ~]# rndc-confgen -r /dev/urandom -a
after which named starts normally.
**************************************************************************
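A quick check that forward and reverse resolution both work against the new server:
[root@productserver ~]# nslookup product201.product 192.168.100.200
[root@productserver ~]# nslookup 192.168.100.201 192.168.100.200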

E: Install NFS
**************************************************************************
Check whether the packages required to run NFS are already installed:
[root@productserver ~]# rpm -qa |grep rpcbind
[root@productserver ~]# rpm -qa |grep nfs
If they are missing, install them:
[root@productserver ~]# yum install nfs-utils
**************************************************************************
Then set the NFS services to start at boot and start them:
[root@productserver ~]# chkconfig rpcbind on
[root@productserver ~]# chkconfig nfs on
[root@productserver ~]# chkconfig nfslock on
[root@productserver ~]# service rpcbind restart
[root@productserver ~]# service nfs restart
[root@productserver ~]# service nfslock restart

Extract the Hadoop distribution to /app/hadoop/hadoop220, give the whole /app/hadoop directory to hadoop:hadoop, and create a mydata directory under /app/hadoop/hadoop220 for data and a logs directory for logs (a sketch of the commands follows).
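A sketch of this step, assuming the distribution tarball is named hadoop-2.2.0.tar.gz and sits in root's home directory:
[root@productserver ~]# mkdir -p /app/hadoop
[root@productserver ~]# tar -zxf hadoop-2.2.0.tar.gz -C /app/hadoop
[root@productserver ~]# mv /app/hadoop/hadoop-2.2.0 /app/hadoop/hadoop220
[root@productserver ~]# mkdir -p /app/hadoop/hadoop220/mydata /app/hadoop/hadoop220/logs
[root@productserver ~]# chown -R hadoop:hadoop /app/hadoop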
Create the shared directories /share/hadoop (holding the Hadoop files, read-only for the hadoop user) and /share/.ssh (holding every node's public key, read-write for the hadoop user):
[root@productserver ~]# mkdir -p /share/hadoop
[root@productserver ~]# cp -r /app/hadoop/hadoop220 /share/hadoop/
[root@productserver ~]# chown -R hadoop:hadoop /share/hadoop
[root@productserver ~]# setfacl -m u:hadoop:rwx /share/hadoop
[root@productserver ~]# mkdir -p /share/.ssh
[root@productserver ~]# chmod 700 /share/.ssh
[root@productserver ~]# chown -R hadoop:hadoop /share/.ssh
[root@productserver ~]# setfacl -m u:hadoop:rwx /share/.ssh

Set up the export configuration for the shared directories:
[root@productserver ~]# vi /etc/exports
[root@productserver ~]# cat /etc/exports 
/share/hadoop 192.168.100.0/24(ro)
/share/.ssh 192.168.100.0/24(rw)

Apply the export configuration in /etc/exports:
[root@productserver ~]# exportfs -arv
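A quick check that both directories are exported:
[root@productserver ~]# showmount -e localhost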

5: Hadoop node installation and configuration
A: DNS client configuration
A client resolves a hostname to an IP address in this order: it first consults /etc/nsswitch.conf; if "files" is listed first it looks up /etc/hosts, and if "dns" is listed first it resolves through the name server in /etc/resolv.conf. In this environment resolution is done by DNS, so both /etc/nsswitch.conf and /etc/resolv.conf must be configured. Because the NetworkManager service on CentOS 6.x occasionally causes odd behaviour (see 鸟哥的Linux私房菜-服务器架设篇(第三版), p.597), the NetworkManager service is turned off as well.
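The state the two files should end up in on every node (the hadoop_root.sh script below makes these changes):
/etc/nsswitch.conf : hosts: dns files
/etc/resolv.conf : nameserver 192.168.100.200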

B: NFS client configuration
There are two ways to mount the NFS shares automatically at boot:
add the mount commands to /etc/rc.d/rc.local, or
configure the autofs service's /etc/auto.master file plus the mount-detail map it points to, and start the autofs service.
Because Hadoop needs the public keys in productserver:/share/.ssh when it starts, the shares must be mounted automatically whenever a node reboots. This setup uses the first method and disables the autofs service (a sketch of the autofs alternative follows for reference).
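For reference only, a sketch of the autofs alternative that this setup does not use (the map file name /etc/auto.nfs is an assumption):
[root@product201 ~]# cat /etc/auto.master
/mnt /etc/auto.nfs --timeout=60
[root@product201 ~]# cat /etc/auto.nfs
hadoop -ro productserver:/share/hadoop
.ssh -rw productserver:/share/.ssh
[root@product201 ~]# chkconfig autofs on
[root@product201 ~]# service autofs restart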

C: Generate the ssh public key and register the node name

D: Copy the Hadoop distribution

E: Start each virtual machine (product201, product202, product203, product204) and create two scripts, hadoop_root.sh and hadoop_hadoop.sh, run by the root user and the hadoop user respectively.
The hadoop_root.sh script, run by root:
[root@product201 ~]# vi /app/hadoop_root.sh

#!/bin/bash
echo "############ DNS client configuration ###############"
sed -i 's/hosts: files dns/hosts: dns files/g' `find /etc/ -name nsswitch.conf`
echo "nameserver 192.168.100.200" >>/etc/resolv.conf
chkconfig NetworkManager off
chkconfig autofs off

echo "############NFS客户端配置###############"
chkconfig rpcbind on
chkconfig nfslock on
service rpcbind restart
service nfslock restart
mkdir -p /mnt/hadoop
mkdir -p /mnt/.ssh
mount -t nfs productserver:/share/hadoop /mnt/hadoop
mount -t nfs productserver:/share/.ssh /mnt/.ssh
echo "mount -t nfs productserver:/share/hadoop /mnt/hadoop">>/etc/rc.d/rc.local
echo "mount -t nfs productserver:/share/.ssh /mnt/.ssh">>/etc/rc.d/rc.local
mkdir -p /app/hadoop/
chown -R hadoop:hadoop /app/hadoop

echo "----------END-----------------"


The hadoop_hadoop.sh script, run by the hadoop user:
[root@product201 ~]# vi /app/hadoop_hadoop.sh

#!/bin/bash
echo "############ generate ssh public key ###############"
mkdir -p /home/hadoop/.ssh && chmod 700 /home/hadoop/.ssh   # make sure .ssh exists before ssh-keygen writes the key into it
ssh-keygen -t rsa -N123456 -f /home/hadoop/.ssh/id_rsa
cat /home/hadoop/.ssh/id_rsa.pub>>/mnt/.ssh/authorized_keys
ln -sf /mnt/.ssh/authorized_keys /home/hadoop/.ssh/authorized_keys
hostname>>/mnt/.ssh/slaves

echo "############复制hadoop文件###############"
cp -r /mnt/hadoop/hadoop220 /app/hadoop/

echo "############链接hadoop配置文件###############"
rm -rf /app/hadoop/hadoop220/etc/hadoop/*
ln -sf /mnt/.ssh/capacity-scheduler.xml /app/hadoop/hadoop220/etc/hadoop/capacity-scheduler.xml
ln -sf /mnt/.ssh/configuration.xsl /app/hadoop/hadoop220/etc/hadoop/configuration.xsl
ln -sf /mnt/.ssh/container-executor.cfg /app/hadoop/hadoop220/etc/hadoop/container-executor.cfg
ln -sf /mnt/.ssh/core-site.xml /app/hadoop/hadoop220/etc/hadoop/core-site.xml
ln -sf /mnt/.ssh/hadoop-env.cmd /app/hadoop/hadoop220/etc/hadoop/hadoop-env.cmd
ln -sf /mnt/.ssh/hadoop-env.sh /app/hadoop/hadoop220/etc/hadoop/hadoop-env.sh
ln -sf /mnt/.ssh/hadoop-metrics2.properties /app/hadoop/hadoop220/etc/hadoop/hadoop-metrics2.properties
ln -sf /mnt/.ssh/hadoop-metrics.properties /app/hadoop/hadoop220/etc/hadoop/hadoop-metrics.properties
ln -sf /mnt/.ssh/hadoop-policy.xml /app/hadoop/hadoop220/etc/hadoop/hadoop-policy.xml
ln -sf /mnt/.ssh/hdfs-site.xml /app/hadoop/hadoop220/etc/hadoop/hdfs-site.xml
ln -sf /mnt/.ssh/httpfs-env.sh /app/hadoop/hadoop220/etc/hadoop/httpfs-env.sh
ln -sf /mnt/.ssh/httpfs-log4j.properties /app/hadoop/hadoop220/etc/hadoop/httpfs-log4j.properties
ln -sf /mnt/.ssh/httpfs-signature.secret /app/hadoop/hadoop220/etc/hadoop/httpfs-signature.secret
ln -sf /mnt/.ssh/httpfs-site.xml /app/hadoop/hadoop220/etc/hadoop/httpfs-site.xml
ln -sf /mnt/.ssh/log4j.properties /app/hadoop/hadoop220/etc/hadoop/log4j.properties
ln -sf /mnt/.ssh/mapred-env.cmd /app/hadoop/hadoop220/etc/hadoop/mapred-env.cmd
ln -sf /mnt/.ssh/mapred-env.sh /app/hadoop/hadoop220/etc/hadoop/mapred-env.sh
ln -sf /mnt/.ssh/mapred-queues.xml.template /app/hadoop/hadoop220/etc/hadoop/mapred-queues.xml.template
ln -sf /mnt/.ssh/mapred-site.xml /app/hadoop/hadoop220/etc/hadoop/mapred-site.xml
ln -sf /mnt/.ssh/mapred-site.xml.template /app/hadoop/hadoop220/etc/hadoop/mapred-site.xml.template
ln -sf /mnt/.ssh/masters /app/hadoop/hadoop220/etc/hadoop/masters
ln -sf /mnt/.ssh/slaves /app/hadoop/hadoop220/etc/hadoop/slaves
ln -sf /mnt/.ssh/ssl-client.xml.example /app/hadoop/hadoop220/etc/hadoop/ssl-client.xml.example
ln -sf /mnt/.ssh/ssl-server.xml.example /app/hadoop/hadoop220/etc/hadoop/ssl-server.xml.example
ln -sf /mnt/.ssh/yarn-env.cmd /app/hadoop/hadoop220/etc/hadoop/yarn-env.cmd
ln -sf /mnt/.ssh/yarn-env.sh /app/hadoop/hadoop220/etc/hadoop/yarn-env.sh
ln -sf /mnt/.ssh/yarn-site.xml /app/hadoop/hadoop220/etc/hadoop/yarn-site.xml
echo "----------END-----------------"



[root@product201 ~]# chmod 777 /app/*.sh
[root@product201 ~]# /app/hadoop_root.sh
[root@product201 ~]# su - hadoop
[hadoop@product201 ~]$ /app/hadoop_hadoop.sh
6: Start Hadoop
A: productserver:/share/.ssh/slaves now contains every node's hostname, so remove the namenode's hostname from it. In this experiment product201.product acts as the namenode.
B: Make sure the permissions on productserver:/share/.ssh/ are 700 and on productserver:/share/.ssh/authorized_keys are 600, otherwise passwordless ssh will not work.
C: From product201.product, ssh once to every node, answering the first-connection prompts so the host keys are recorded.
D: Format the namenode.
E: Start Hadoop. (A sketch of the commands for steps B, D, and E follows.)
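A minimal sketch of the commands for steps B, D, and E (run B on productserver as root, D and E on product201 as the hadoop user; hdfs and the start-*.sh scripts ship with the Hadoop 2.2.0 distribution and are already on the PATH set in /etc/profile):
[root@productserver ~]# chmod 700 /share/.ssh
[root@productserver ~]# chmod 600 /share/.ssh/authorized_keys
[hadoop@product201 ~]$ hdfs namenode -format
[hadoop@product201 ~]$ start-dfs.sh
[hadoop@product201 ~]$ start-yarn.sh
[hadoop@product201 ~]$ jps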
7: Tips
A: The ln commands in hadoop_hadoop.sh that link the Hadoop configuration files can be generated with awk, for example (here ccc is assumed to hold a listing of the configuration directory, with the file name in field 10):
cat ccc | awk '{ print "ln -sf " $10 }'
cat ccc | awk '{ print "ln -sf /mnt/.ssh/" $10 " /app/hadoop/hadoop220/etc/hadoop/" $10 }'
B: When using DNS, it is best for each node's hostname to take the form XXX.YYY; a bare single-label machine name makes the resolver configuration awkward.
