Check the DataNode log:
[root@slave01 mapred]# tail -100 /opt/modules/hadoop/hadoop-1.0.3/libexec/../logs/hadoop-hadoop-datanode-slave01.log
2013-11-12 19:19:22,650 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = slave01/10.10.205.200
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.0.3
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192; compiled by 'hortonfo' on Tue May 8 20:31:25 UTC 2012
************************************************************/
2013-11-12 19:19:27,175 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-11-12 19:19:27,272 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2013-11-12 19:19:27,455 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-11-12 19:19:27,455 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2013-11-12 19:19:28,054 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2013-11-12 19:19:29,224 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for /opt/data/hadoop/hdfs/data, expected: rwxr-xr-x, while actual: rwxrwxr-x
2013-11-12 19:19:29,224 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: All directories in dfs.data.dir are invalid.
2013-11-12 19:19:29,224 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2013-11-12 19:19:29,225 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
Finding: Invalid directory in dfs.data.dir: Incorrect permission for /opt/data/hadoop/hdfs/data, expected: rwxr-xr-x, while actual: rwxrwxr-x
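The relevant lines are easy to miss among the INFO entries. A minimal sketch for pulling out only the WARN, ERROR and FATAL entries is below. The function name show_log_problems is made up for illustration, and the pattern assumes the "date time LEVEL class: message" layout seen in the log above.

```shell
#!/bin/sh
# Hypothetical helper: print only WARN/ERROR/FATAL entries from a Hadoop
# daemon log. Assumes each entry starts with "date time LEVEL ".
show_log_problems() {
    log_file="$1"
    grep -E '^[0-9-]+ [0-9:,]+ (WARN|ERROR|FATAL) ' "$log_file"
}
```

It could be called as show_log_problems /opt/modules/hadoop/hadoop-1.0.3/logs/hadoop-hadoop-datanode-slave01.log, which for the log above should leave just the two permission-related entries.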
Fix:
[root@slave01 opt]# ll
total 24
drwxr-xr-x 3 hadoop hadoop 4096 Oct 30 02:52 data
drwxr-xr-x 3 root root 4096 Oct 30 02:52 modules
drwxr-xr-x 3 root root 4096 Oct 30 02:30 sun
[root@slave01 opt]# chmod g-w /opt/data/hadoop/hdfs/data
[root@slave01 opt]# cd /opt/data/hadoop/hdfs/data
[root@slave01 data]# ll
total 0
[root@slave01 data]# cd ../
[root@slave01 hdfs]# ll
total 16
drwxr-xr-x 2 hadoop hadoop 4096 Oct 30 20:38 data
drwxrwxr-x 2 hadoop hadoop 4096 Oct 30 20:38 name
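If several slaves need the same check, the manual steps above can be wrapped in a small script. This is only a sketch: the names dir_perm and fix_data_dir_perm are hypothetical, the expected string rwxr-xr-x is taken from the log message, and it uses chmod 755 (set the exact mode) rather than the chmod g-w used above.

```shell
#!/bin/sh
# Permission string the DataNode log reported as expected (mode 755).
EXPECTED_PERM="rwxr-xr-x"

# Print the nine permission characters of a directory, e.g. "rwxrwxr-x".
dir_perm() {
    ls -ld "$1" | cut -c2-10
}

# Hypothetical helper: report the current permission of a data directory
# and reset it to 755 only when it differs from the expected string.
fix_data_dir_perm() {
    dir="$1"
    actual=$(dir_perm "$dir")
    if [ "$actual" = "$EXPECTED_PERM" ]; then
        echo "OK: $dir is $actual"
    else
        echo "Fixing: $dir is $actual, expected $EXPECTED_PERM"
        chmod 755 "$dir"
    fi
}
```

On this node it would be run as fix_data_dir_perm /opt/data/hadoop/hdfs/data, by a user allowed to change the directory's mode, before restarting the DataNode.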
Check again after restarting:
[hadoop@master bin]$ hadoop dfsadmin -report
Warning: $HADOOP_HOME is deprecated.
Configured Capacity: 16002351104 (14.9 GB)
Present Capacity: 10700967966 (9.97 GB)
DFS Remaining: 10700845056 (9.97 GB)
DFS Used: 122910 (120.03 KB)
DFS Used%: 0%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)

Name: 10.10.205.200:50010
Decommission Status : Normal
Configured Capacity: 8001175552 (7.45 GB)
DFS Used: 61455 (60.01 KB)
Non DFS Used: 2656518129 (2.47 GB)
DFS Remaining: 5344595968(4.98 GB)
DFS Used%: 0%
DFS Remaining%: 66.8%
Last contact: Tue Nov 12 21:03:17 PST 2013

Name: 10.10.205.201:50010
Decommission Status : Normal
Configured Capacity: 8001175552 (7.45 GB)
DFS Used: 61455 (60.01 KB)
Non DFS Used: 2644865009 (2.46 GB)
DFS Remaining: 5356249088(4.99 GB)
DFS Used%: 0%
DFS Remaining%: 66.94%
Last contact: Tue Nov 12 21:03:17 PST 2013
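Cause: the extra group write permission on /opt/data/hadoop/hdfs/data (rwxrwxr-x instead of the expected rwxr-xr-x) makes the DataNode reject the directory and exit at startup, so that node never joins the cluster.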