[Hadoop] Hadoop Ecosystem Series: HDFS Architecture Overview

Previous post: Hadoop Ecosystem Series: Hadoop Overview and Environment Setup

Contents

    • HDFS Architecture
      • Introduction
      • Architecture
        • NameNode & DataNodes
        • HDFS Is Not Good at Storing Small Files
        • HDFS Rack Awareness
        • SecondaryNameNode & NameNode
        • NameNode Startup Process
        • NameNode SafeMode (Safe Mode)
        • How Passwordless SSH Authentication Works
        • Trash
        • Directory Layout

HDFS Architecture

Introduction

The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has much in common with existing distributed file systems, but the differences are significant: HDFS is highly fault-tolerant and is designed to be deployed on low-cost machines, and it provides high-throughput access to application data, making it well suited to applications with large data sets. HDFS relaxes a few POSIX constraints to enable streaming access to file system data. It was originally built as infrastructure for the Apache Nutch search engine project and is now part of the Apache Hadoop Core project.

Architecture

NameNode & DataNodes

HDFS has a master/slave architecture. An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. In addition, there are a number of DataNodes, usually one per node in the cluster, which manage storage attached to the nodes that they run on. HDFS exposes a file system namespace and allows user data to be stored in files. Internally, a file is split into one or more blocks and these blocks are stored in a set of DataNodes. The NameNode executes file system namespace operations like opening, closing, and renaming files and directories. It also determines the mapping of blocks to DataNodes. The DataNodes are responsible for serving read and write requests from the file system’s clients. The DataNodes also perform block creation, deletion, and replication upon instruction from the NameNode.
— quoted from the official documentation

In summary: HDFS has a master/slave architecture. An HDFS cluster contains a single NameNode, the master server that manages the file system namespace and handles client access. In addition there are a number of DataNodes, each managing the storage attached to the host it runs on. HDFS exposes a file system namespace and lets user data be stored in files; internally, a file is split into one or more blocks, and these blocks are stored across a set of DataNodes. The NameNode performs namespace operations such as opening, closing, and renaming files and directories, and it determines the mapping of blocks to DataNodes. The DataNodes serve read and write requests from clients, and also create, delete, and replicate blocks on instruction from the NameNode.

[Figure 1: HDFS architecture]
Terminology:
NameNode: keeps the cluster's metadata in memory (the file system namespace, i.e. the directory tree, and the mapping of blocks to DataNodes).
DataNode: serves client read/write requests for blocks and reports its own state to the NameNode.
Block: the unit in which HDFS splits files, 128MB by default; a file has at most one block smaller than 128MB (its last one).
Replication factor: to guard against losing blocks when a DataNode fails, HDFS keeps multiple replicas of each block; the default is 3.
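These definitions can be checked from the command line. A minimal sketch (reusing the hadoop-2.9.2.tar.gz archive that appears later in this post; any large local file works): fsck shows how the file was split into blocks and where each replica lives, and setrep changes the replication factor of an existing file.

[root@CentOS ~]# hdfs dfs -put hadoop-2.9.2.tar.gz /demo.tar.gz
[root@CentOS ~]# hdfs fsck /demo.tar.gz -files -blocks -locations   # list the blocks and the DataNodes holding each replica
[root@CentOS ~]# hdfs dfs -setrep -w 2 /demo.tar.gz                 # lower the replication factor to 2 and wait for it to apply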

HDFS Is Not Good at Storing Small Files

Because the NameNode keeps all metadata in the memory of a single machine, every file consumes metadata space regardless of its size, so large numbers of small files waste NameNode memory, as the comparison below shows:

Case                        NameNode                              DataNode
1 file of 128MB             1 block-mapping metadata entry        128MB of disk * (replication factor)
1000 files totaling 128MB   1000 block-mapping metadata entries   128MB of disk * (replication factor)
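One common mitigation, not covered in the original post, is to pack many small files into a Hadoop Archive (HAR), so the NameNode tracks a single archive instead of thousands of files. A sketch with hypothetical paths:

[root@CentOS ~]# hadoop archive -archiveName small.har -p /user/root/input /user/root/archives   # pack /user/root/input into small.har
[root@CentOS ~]# hdfs dfs -ls har:///user/root/archives/small.har                                # the files remain readable through the har:// scheme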

HDFS Rack Awareness

A distributed cluster usually contains a large number of machines. Limited by rack slots and switch ports, a large cluster typically spans several racks, and the machines on those racks together form one cluster. Network bandwidth between machines within a rack is usually higher than between machines on different racks, and cross-rack communication is constrained by the bandwidth of the uplink switches.

With both data safety and efficiency in mind, HDFS stores three copies of each file by default, placed as follows:

  • The first replica is placed on the DataNode where the client runs (if the client is outside the cluster, a suitable DataNode is chosen at random from the whole cluster).

  • The second replica is placed on a different DataNode in the same rack as the first.

  • The third replica is placed on a node in a different rack.

With this policy, if the local copy is corrupted, a node can fetch the data from a neighbor in the same rack, which is faster than fetching it across racks; and if an entire rack loses network connectivity, the data is still available on another rack.
To reduce overall bandwidth consumption and read latency, HDFS tries to serve each read from the replica closest to the reader:
if a replica exists on the reader's own rack, that replica is used;
if the cluster spans multiple data centers, the client prefers a replica in its local data center.
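Rack awareness is not enabled by default: unless told otherwise, HDFS places every node in /default-rack. The cluster learns its topology from a script named by the net.topology.script.file.name property in core-site.xml. A minimal sketch, assuming a hypothetical one-subnet-per-rack mapping:

#!/bin/bash
# topology.sh -- referenced by net.topology.script.file.name in core-site.xml (path is up to you)
# Hadoop invokes it with one or more IPs/hostnames and expects one rack path per argument.
for node in "$@"; do
  case "$node" in
    192.168.73.*) echo "/rack1" ;;       # hypothetical subnet-to-rack mapping
    *)            echo "/default-rack" ;;
  esac
done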

Reference: https://www.cnblogs.com/zwgblog/p/7096875.html

SecondaryNameNode & NameNode

[Figure 2: SecondaryNameNode and NameNode checkpoint interaction]
Terminology:
fsimage: a binary file on the disk of the host running the NameNode; it holds a checkpoint of the metadata.

edits: a binary file on the disk of the host running the NameNode; it logs every operation that has modified the metadata since the last checkpoint.

The NameNode stores modifications to the file system as a log appended to a native file system file, edits. When a NameNode starts up, it reads HDFS state from an image file, fsimage, and then applies edits from the edits log file. It then writes new HDFS state to the fsimage and starts normal operation with an empty edits file. Since NameNode merges fsimage and edits files only during start up, the edits log file could get very large over time on a busy cluster. Another side effect of a larger edits file is that next restart of NameNode takes longer.

The secondary NameNode merges the fsimage and the edits log files periodically and keeps edits log size within a limit. It is usually run on a different machine than the primary NameNode since its memory requirements are on the same order as the primary NameNode.

The start of the checkpoint process on the secondary NameNode is controlled by two configuration parameters.

  • dfs.namenode.checkpoint.period, set to 1 hour by default, specifies the maximum delay between two consecutive checkpoints, and
  • dfs.namenode.checkpoint.txns, set to 1 million by default, defines the number of uncheckpointed transactions on the NameNode which will force an urgent checkpoint, even if the checkpoint period has not been reached.

The secondary NameNode stores the latest checkpoint in a directory which is structured the same way as the primary NameNode’s directory. So that the check pointed image is always ready to be read by the primary NameNode if necessary.
— quoted from the official documentation
In summary: whenever the NameNode starts, it loads fsimage and edits, merges them to obtain the latest metadata, and writes back an updated fsimage with an empty edits file. Once the service is running, fsimage is no longer updated; operations are only appended to edits. On a busy cluster the edits file can therefore grow very large, and the next NameNode restart becomes slow. To avoid this, HDFS introduces the Secondary NameNode, which periodically merges fsimage and edits on the NameNode's behalf while the service is running.
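The two checkpoint triggers named in the quote above live in hdfs-site.xml. A minimal sketch that simply spells out the default values stated in the documentation:

<!-- hdfs-site.xml: when the Secondary NameNode starts a checkpoint -->
<property>
  <name>dfs.namenode.checkpoint.period</name>
  <value>3600</value>    <!-- max seconds between two checkpoints (1 hour, the default) -->
</property>
<property>
  <name>dfs.namenode.checkpoint.txns</name>
  <value>1000000</value> <!-- uncheckpointed transactions that force an early checkpoint -->
</property>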

NameNode Startup Process

[Figure 3: NameNode startup process]
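The fsimage and edits files involved in startup can be inspected offline with the hdfs oiv (offline image viewer) and hdfs oev (offline edits viewer) tools. A sketch, assuming the NameNode stores its metadata under /tmp/hadoop-root/dfs/name/current (the actual directory is set by dfs.namenode.name.dir, and the transaction IDs in the file names will differ on your cluster):

[root@CentOS ~]# ls /tmp/hadoop-root/dfs/name/current                    # hypothetical default metadata directory
[root@CentOS ~]# hdfs oiv -p XML -i /tmp/hadoop-root/dfs/name/current/fsimage_0000000000000000042 -o /tmp/fsimage.xml
[root@CentOS ~]# hdfs oev -p XML -i /tmp/hadoop-root/dfs/name/current/edits_0000000000000000001-0000000000000000042 -o /tmp/edits.xml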

NameNode SafeMode (Safe Mode)

On startup, the NameNode enters a special state called Safemode. Replication of data blocks does not occur when the NameNode is in the Safemode state. The NameNode receives Heartbeat and Blockreport messages from the DataNodes. A Blockreport contains the list of data blocks that a DataNode is hosting. Each block has a specified minimum number of replicas. A block is considered safely replicated when the minimum number of replicas of that data block has checked in with the NameNode. After a configurable percentage of safely replicated data blocks checks in with the NameNode (plus an additional 30 seconds), the NameNode exits the Safemode state. It then determines the list of data blocks (if any) that still have fewer than the specified number of replicas. The NameNode then replicates these blocks to other DataNodes.
— quoted from the official documentation
Translation:
During startup the NameNode enters a special state called SafeMode. While in safe mode, HDFS does not replicate data blocks. The NameNode receives heartbeats and block reports from the DataNodes; each block report lists all the blocks held on that physical host. A block is considered safe once the configured minimum number of replicas (default 1) has been reported. When the fraction of safely replicated blocks reaches the configured threshold (99.9% by default) and an additional 30 seconds have passed, the NameNode exits safe mode automatically. It then determines which blocks, if any, still have fewer replicas than configured and instructs DataNodes to replicate them.

Note: HDFS enters and leaves safe mode automatically at startup. In production, administrators sometimes also force HDFS into safe mode in order to perform maintenance:

[root@CentOS ~]# hdfs dfsadmin -safemode get
Safe mode is OFF
[root@CentOS ~]# hdfs dfsadmin -safemode enter
Safe mode is ON
[root@CentOS ~]# hdfs dfs -put hadoop-2.9.2.tar.gz /
put: Cannot create file/hadoop-2.9.2.tar.gz._COPYING_. Name node is in safe mode.
[root@CentOS ~]# hdfs dfsadmin -safemode leave
Safe mode is OFF
[root@CentOS ~]# hdfs dfs -put hadoop-2.9.2.tar.gz /

How Passwordless SSH Authentication Works

SSH is a security protocol that operates at the application layer. It is a comparatively reliable protocol designed to secure remote login sessions and other network services, and it effectively prevents information leakage during remote administration. It offers two login methods:

  • Password-based authentication: a remote host can impersonate the target host and intercept the user's credentials.
  • Key-based authentication: what is authenticated is the identity of the machine itself, so no password needs to be sent.

[Figure 4: SSH public-key authentication flow]

① Generate a public/private key pair (RSA or DSA):

[root@CentOS ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:qWX5zumy1JS1f1uxPb3Gr+5e8F0REVueJew/WYrlxwc root@CentOS
The key's randomart image is:
+---[RSA 2048]----+
|             ..+=|
|              .o*|
|            .. +.|
|         o o .E o|
|        S o .+.*+|
|       + +  ..o=%|
|      . . o   o+@|
|       ..o .   ==|
|        .+=  +*+o|
+----[SHA256]-----+

  • By default this creates id_rsa (the private key) and id_rsa.pub (the public key) under ~/.ssh.

② Append the local public key to the target host's list of authorized keys:

[root@CentOS ~]# ssh-copy-id root@CentOS
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'centos (192.168.73.130)' can't be established.
ECDSA key fingerprint is SHA256:WnqQLGCjyJjgb9IMEUUhz1RLkpxvZJxzEZjtol7iLac.
ECDSA key fingerprint is MD5:45:05:12:4c:d6:1b:0c:1a:fc:58:00:ec:12:7e:c1:3d.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@centos's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'root@CentOS'"
and check to make sure that only the key(s) you wanted were added.

  • By default this appends the local public key to ~/.ssh/authorized_keys on the remote host. You can then verify the setup as sketched below.
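A quick check that the key was installed correctly; if a password prompt still appears, the most common cause (not covered in the original post) is overly permissive file modes, since sshd ignores an authorized_keys file that is group- or world-writable:

[root@CentOS ~]# ssh root@CentOS 'hostname'                              # should print CentOS without asking for a password
CentOS
[root@CentOS ~]# chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys    # tighten permissions if you were prompted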

Trash

To guard against data loss from accidental deletions, HDFS can be configured with a trash (recycle bin) facility. With trash enabled, deleting a file does not remove it immediately; the file is merely moved into a trash directory. Once the configured retention time has elapsed, the system deletes the file for good. To avoid losing a file, the user just has to move it out of the trash before it expires.

  • To enable trash, add the following to core-site.xml (the value is the retention time in minutes) and restart HDFS:

<property>
  <name>fs.trash.interval</name>
  <value>5</value>
</property>
[root@CentOS hadoop-2.9.2]# hdfs dfs -rm -r -f /jdk-8u191-linux-x64.rpm
20/09/25 20:09:24 INFO fs.TrashPolicyDefault: Moved: 'hdfs://CentOS:9000/jdk-8u191-linux-x64.rpm' to trash at: hdfs://CentOS:9000/user/root/.Trash/Current/jdk-8u191-linux-x64.rpm
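A trashed file can be restored simply by moving it back out before the interval expires; hdfs dfs -expunge removes trash checkpoints that have outlived the retention time. A sketch reusing the file from the transcript above:

[root@CentOS ~]# hdfs dfs -mv /user/root/.Trash/Current/jdk-8u191-linux-x64.rpm /   # restore the file to its original path
[root@CentOS ~]# hdfs dfs -expunge                                                  # purge expired trash checkpoints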

Directory Layout

[root@CentOS ~]# tree -L 1 /usr/hadoop-2.9.2/
/usr/hadoop-2.9.2/
├── bin      # core scripts: hdfs, hadoop, yarn
├── etc      # configuration directory (XML and plain-text files)
├── include  # C header files; rarely of interest
├── lib      # third-party native libraries (C implementations)
├── libexec  # scripts that load the configuration when Hadoop runs
├── LICENSE.txt
├── logs     # runtime log directory -- the first place to look when troubleshooting
├── NOTICE.txt
├── README.txt
├── sbin     # admin scripts, typically for starting services, e.g. start|stop-dfs.sh
└── share    # jars Hadoop depends on and the embedded webapps
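For example, the sbin scripts and the logs directory are usually used together when bringing HDFS up. A sketch (the log file name follows the pattern hadoop-<user>-namenode-<hostname>.log, so it will differ on your machine):

[root@CentOS ~]# /usr/hadoop-2.9.2/sbin/start-dfs.sh                                # start NameNode, DataNodes, SecondaryNameNode
[root@CentOS ~]# jps                                                                # confirm the Java daemons are running
[root@CentOS ~]# tail -f /usr/hadoop-2.9.2/logs/hadoop-root-namenode-CentOS.log     # follow the NameNode log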

Next post: Hadoop Ecosystem Series: Common HDFS Shell Commands and Using the Java API to Operate HDFS
