Hadoop 2.6 Cluster Setup (HA)

Below are my notes from setting up an HA cluster on our company's test servers.

Uninstall the JDK bundled with CentOS
yum -y remove java-1.7.0-openjdk-1.7.0.75-2.5.4.2.el7_0.x86_64
yum -y remove java-1.7.0-openjdk-headless-1.7.0.75-2.5.4.2.el7_0.x86_64
rpm -qa | grep java    # verify nothing is left
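If the installed OpenJDK packages differ from the versions above, a version-agnostic sweep works too (a sketch; package names vary by image):
rpm -qa | grep -i openjdk | xargs -r yum -y remove    # remove whatever OpenJDK packages are present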
-----------------------------------------------
Install the JDK
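A minimal sketch of a tarball install, assuming the archive is jdk-7u79-linux-x64.tar.gz (any JDK 7 build works) and the target is /usr/local/jdk, matching the JAVA_HOME used in hadoop-env.sh below:
tar -zxvf jdk-7u79-linux-x64.tar.gz -C /usr/local/
mv /usr/local/jdk1.7.0_79 /usr/local/jdk
echo 'export JAVA_HOME=/usr/local/jdk' >> /etc/profile
echo 'export PATH=$JAVA_HOME/bin:$PATH' >> /etc/profile
source /etc/profile
java -version    # verify the new JDK is picked up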
--------------------------------------------------
/etc/hosts
192.168.1.16 shaobao16
192.168.1.17 shaobao17
192.168.1.18 shaobao18
192.168.1.19 shaobao19

/etc/sysconfig/network
NETWORKING=yes
HOSTNAME=<hostname of this node, e.g. shaobao16>
reboot
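On CentOS 7 the hostname can also be set without editing files or rebooting:
hostnamectl set-hostname shaobao16    # run with the matching name on each node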
-----------------------------------------------------
Passwordless SSH login
(1) ssh-keygen -t rsa
(2) cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
(3) Copy the key to every other machine:
ssh-copy-id -i shaobao17
ssh-copy-id -i shaobao18
ssh-copy-id -i shaobao19
(Repeat all of the above on shaobao17, 18, and 19, against all machines, so every node can reach every other node.)
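A quick loop to verify that every hop now works without a password prompt:
for h in shaobao16 shaobao17 shaobao18 shaobao19; do ssh $h hostname; done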
--------------------------------------------------
Disable the CentOS 7 firewall
systemctl disable firewalld.service
systemctl stop firewalld.service
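Confirm the firewall is really off:
systemctl is-active firewalld     # expect "inactive"
systemctl is-enabled firewalld    # expect "disabled"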
----------------------
Install ZooKeeper
1.ZooKeeper
1.1 ZooKeeper is used to guarantee transactional consistency of data across the ZooKeeper ensemble.
2.How to set up a ZooKeeper server cluster
2.1 The ensemble should have no fewer than 3 nodes, and the system clocks of all servers must be kept in sync.
2.2 On hadoop0, extract zk....tar.gz under /usr/local and set the environment variables. (This section keeps the generic hadoop0/1/2 hostnames; in this cluster the ZooKeeper nodes are shaobao16/17/18, as referenced in ha.zookeeper.quorum below.)
2.3 In the conf directory, copy the sample config: cp zoo_sample.cfg zoo.cfg
2.4 Edit the file with vi zoo.cfg:
    change dataDir=/usr/local/zk/data
    add
    server.0=hadoop0:2888:3888
    server.1=hadoop1:2888:3888
    server.2=hadoop2:2888:3888
2.5 Create the data directory: mkdir -p /usr/local/zk/data
2.6 In the data directory, create a file named myid containing the value 0.
2.7 Copy the zk directory to hadoop1 and hadoop2.
2.8 Change the myid value to 1 on hadoop1
    and to 2 on hadoop2. (Steps 2.5-2.8 are consolidated in the sketch below.)
2.9 Start: run zkServer.sh start on each of the three nodes.
2.10 Verify: run zkServer.sh status on each of the three nodes; one should report leader and the others follower.
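A consolidated sketch of steps 2.5-2.8, assuming the install lives in /usr/local/zk and the generic hostnames above:
mkdir -p /usr/local/zk/data
echo 0 > /usr/local/zk/data/myid              # this node is server.0
scp -r /usr/local/zk hadoop1:/usr/local/
scp -r /usr/local/zk hadoop2:/usr/local/
ssh hadoop1 'echo 1 > /usr/local/zk/data/myid'
ssh hadoop2 'echo 2 > /usr/local/zk/data/myid'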
-------------------------
Set up Hadoop 2.6
1. Edit hadoop-env.sh
  export JAVA_HOME=/usr/local/jdk

2. Edit core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://shaobao</value>
  </property>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-2.6.0/tmp</value>
  </property>

  <property>
    <name>ha.zookeeper.quorum</name>
    <value>shaobao16:2181,shaobao17:2181,shaobao18:2181</value>
  </property>
</configuration>
3. Edit hdfs-site.xml
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<!-- nameservice ID (this is also how an HDFS federation would be declared) -->
<property>
  <name>dfs.nameservices</name>
  <value>shaobao</value>
</property>
<!-- HA configuration -->
<property>
  <name>dfs.ha.namenodes.shaobao</name>
  <value>shaobao16,shaobao18</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.shaobao.shaobao16</name>
  <value>shaobao16:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.shaobao.shaobao16</name>
  <value>shaobao16:50070</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.shaobao.shaobao18</name>
  <value>shaobao18:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.shaobao.shaobao18</name>
  <value>shaobao18:50070</value>
</property>

<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://shaobao16:8485/shaobao</value>
</property>

<property>
  <name>dfs.ha.automatic-failover.enabled.shaobao</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.shaobao</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/opt/hadoop-2.6.0/tmp/journal</value>
</property>

<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>

<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/root/.ssh/id_rsa</value>
</property>
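Once core-site.xml and hdfs-site.xml are in place, a quick sanity check that Hadoop parses the HA settings as intended (both getconf subcommands ship with Hadoop 2.6):
bin/hdfs getconf -confKey fs.defaultFS    # expect hdfs://shaobao
bin/hdfs getconf -namenodes               # expect shaobao16 shaobao18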
   
4. Edit mapred-site.xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>
         
5. Edit slaves
shaobao17
shaobao18
shaobao19
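With everything edited on shaobao16, push the configuration to the other nodes (a sketch; /opt/hadoop-2.6.0 is assumed as the install path, consistent with hadoop.tmp.dir above):
for h in shaobao17 shaobao18 shaobao19; do
  scp /opt/hadoop-2.6.0/etc/hadoop/* $h:/opt/hadoop-2.6.0/etc/hadoop/
done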

6. Summary
   JournalNode: shaobao16
   HA NameNodes: shaobao16, shaobao18
   The nameservice (cluster) name is shaobao
   slaves: shaobao17, shaobao18, shaobao19
  
7. Start the JournalNode (on shaobao16)
    sbin/hadoop-daemon.sh start journalnode
8. Format the failover state in ZooKeeper (the ZooKeeper ensemble must already be running)
   bin/hdfs zkfc -formatZK
9. Format and start the NameNodes
   On shaobao16:
   bin/hdfs namenode -format
   sbin/hadoop-daemon.sh start namenode
   On shaobao18:
   bin/hdfs namenode -bootstrapStandby
   sbin/hadoop-daemon.sh start namenode
10. On the Windows 7 client, add the following to C:\Windows\System32\drivers\etc\hosts:
192.168.1.16  shaobao16
192.168.1.17  shaobao17
192.168.1.18  shaobao18
192.168.1.19  shaobao19

11. Start the ZKFC
On shaobao16 and shaobao18:
sbin/hadoop-daemon.sh start zkfc

http://shaobao16:50070/dfshealth.html#tab-overview  (shaobao16 switches from standby to active)
http://shaobao18:50070/dfshealth.html#tab-overview
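The same active/standby check can be done from the shell; hdfs haadmin -getServiceState takes the NameNode IDs declared in dfs.ha.namenodes.shaobao:
bin/hdfs haadmin -getServiceState shaobao16    # expect active
bin/hdfs haadmin -getServiceState shaobao18    # expect standby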
12. Edit yarn-site.xml
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>shaobao16</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
13. Start YARN
sbin/start-yarn.sh
Browse to http://shaobao16:8088/ to see the ResourceManager and its configuration.

hadoop jar hadoop-mapreduce-examples-2.6.0.jar wordcount hdfs://shaobao/sort/a hdfs://shaobao/sort/out3/
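For reference, the input under /sort/a can be staged like this (a sketch; the local file name a is taken from the command above):
bin/hdfs dfs -mkdir -p /sort
bin/hdfs dfs -put a /sort/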

When I reran the job on 2 GB of data, the RM automatically killed the job's tasks with an out-of-memory error, so yarn-site.xml was revised as follows:
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>shaobao16</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>22528</value>
  <description>Memory available to containers on each node, in MB</description>
</property>

<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>3000</value>
  <description>Minimum memory a single container may request; default is 1024 MB</description>
</property>

<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>16384</value>
  <description>Maximum memory a single container may request; default is 8192 MB</description>
</property>

<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>3</value>
  <description>Virtual memory allowed per unit of physical memory; the default is 2.1, i.e. each 1 MB of physical memory may use up to 2.1 MB of virtual memory</description>
</property>
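With these numbers each NodeManager can host at most 22528 / 3000 ≈ 7 containers, and a single container may use up to 16384 MB of physical memory and 16384 × 3 = 49152 MB of virtual memory. Restart YARN for the changes to take effect:
sbin/stop-yarn.sh
sbin/start-yarn.sh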
