Big data learning pitfalls series: Secondary NameNode failed to start via Ambari

The error log is below:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/snamenode.py", line 143, in <module>
    SNameNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/snamenode.py", line 51, in start
    snamenode(action="start")
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_snamenode.py", line 47, in snamenode
    create_log_dir=True
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 267, in service
    Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start secondarynamenode'' returned 1. starting secondarynamenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-secondarynamenode-slaver1.out
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000c8000000, 939524096, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 939524096 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /var/log/hadoop/hdfs/hs_err_pid30918.log


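The JVM lines at the end of the log show that the system could not allocate enough memory for Java: the request for 939524096 bytes (896 MiB) failed. Since I don't know Java, I went with the simplest fix I came across, the old standby: reboot the server.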
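Before rebooting, it can help to compare what the JVM asked for with what the machine actually has free. The following is only a rough sketch: the helper names `requested_bytes` and `mem_available_bytes` are made up for illustration, and it assumes a Linux kernel that exposes `MemAvailable` in `/proc/meminfo`.

```python
import re

# Sample line copied from the log above.
LOG_LINE = ("warning: INFO: os::commit_memory(0x00000000c8000000, 939524096, 0) "
            "failed; error='Cannot allocate memory' (errno=12)")


def requested_bytes(line):
    """Return the byte count the JVM tried to commit, or None if not found."""
    match = re.search(r"os::commit_memory\(0x[0-9a-fA-F]+,\s*(\d+),", line)
    return int(match.group(1)) if match else None


def mem_available_bytes(meminfo_text):
    """Return MemAvailable (in bytes) from /proc/meminfo text, or None."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) * 1024 if match else None


if __name__ == "__main__":
    needed = requested_bytes(LOG_LINE)
    print("JVM asked for %d bytes (%.0f MiB)" % (needed, needed / 2.0 ** 20))
    try:
        with open("/proc/meminfo") as handle:
            available = mem_available_bytes(handle.read())
    except (IOError, OSError):
        available = None  # not Linux, or /proc is not mounted
    if available is None:
        print("MemAvailable is not reported on this system")
    elif available < needed:
        print("only %d bytes available: the allocation would fail" % available)
    else:
        print("%d bytes available: the allocation should fit" % available)
```

If the available figure is below the requested one, freeing memory or lowering the SecondaryNameNode heap size in the HDFS configuration is probably a more durable fix than a reboot.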



After the reboot, ambari-agent start failed to bring the agent up. I then remembered that I had not disabled Transparent Huge Pages:

# cat /sys/kernel/mm/transparent_hugepage/enabled

[always] madvise never

# echo never > /sys/kernel/mm/transparent_hugepage/enabled

# echo never > /sys/kernel/mm/transparent_hugepage/defrag

# cat /sys/kernel/mm/transparent_hugepage/enabled

always madvise [never]
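In the output above, the value in square brackets is the active mode. Because values written to sysfs with echo do not survive a reboot, it may be worth checking the setting from a script after each restart. A minimal sketch, in which the function name `active_thp_mode` is hypothetical:

```python
import re

THP_ENABLED = "/sys/kernel/mm/transparent_hugepage/enabled"


def active_thp_mode(text):
    """Return the bracketed (active) mode from the sysfs line, or None."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    try:
        with open(THP_ENABLED) as handle:
            mode = active_thp_mode(handle.read())
    except (IOError, OSError):
        mode = None  # kernel without THP support, or not Linux
    if mode is None:
        print("could not determine the THP mode")
    elif mode == "never":
        print("THP is disabled")
    else:
        print("THP is still set to %r; the echo commands must be re-run" % mode)
```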

Running ambari-agent start again produced the following output:

Verifying Python version compatibility...
Using python  /usr/bin/python
Checking for previously running Ambari Agent...
/var/run/ambari-agent/ambari-agent.pid found with no process. Removing 1391...
Starting ambari-agent
Verifying ambari-agent process status...
Ambari Agent successfully started
Agent PID at: /var/run/ambari-agent/ambari-agent.pid
Agent out at: /var/log/ambari-agent/ambari-agent.out

Agent log at: /var/log/ambari-agent/ambari-agent.log

It reported success, but when I ran ambari-agent status I found that I had celebrated too early. The output was:

Found ambari-agent PID: 1472

ambari-agent not running. Stale PID File at: /var/run/ambari-agent/ambari-agent.pid    

The agent was not actually running. I deleted ambari-agent.pid and started the agent again.
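The "Stale PID File" message means the PID file names a process that no longer exists. The same check can be approximated by hand. This is a sketch of the general technique (sending signal 0), not Ambari's own implementation, and the helper names are made up for illustration:

```python
import errno
import os

# Path printed by ambari-agent in the output above.
PID_FILE = "/var/run/ambari-agent/ambari-agent.pid"


def read_pid(path):
    """Return the integer PID stored in path, or None if missing or invalid."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (IOError, OSError, ValueError):
        return None


def pid_is_running(pid):
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not one PID
    try:
        os.kill(pid, 0)  # signal 0 only checks existence; nothing is delivered
    except OSError as error:
        # EPERM: the process exists but belongs to another user.
        return error.errno == errno.EPERM
    return True


if __name__ == "__main__":
    pid = read_pid(PID_FILE)
    if pid is None:
        print("no usable PID file at %s" % PID_FILE)
    elif pid_is_running(pid):
        print("process %d is running" % pid)
    else:
        print("stale PID file: process %d does not exist" % pid)
```

A stale file right after a "successfully started" message suggests the agent died shortly after launch, so the agent's .out and .log files are the next place to look.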

This time it reported the following error:

[root@slaver1 ~]# ambari-agent start
Verifying Python version compatibility...
Using python  /usr/bin/python
Checking for previously running Ambari Agent...
Starting ambari-agent
Verifying ambari-agent process status...
ERROR: ambari-agent start failed. For more details, see /var/log/ambari-agent/ambari-agent.out:
====================
  File "/usr/lib/python2.6/site-packages/ambari_agent/Hardware.py", line 44, in __init__
    self.hardware.update(Facter().facterInfo())
  File "/usr/lib/python2.6/site-packages/ambari_agent/Facter.py", line 522, in facterInfo
    facterInfo = super(FacterLinux, self).facterInfo()
  File "/usr/lib/python2.6/site-packages/ambari_agent/Facter.py", line 217, in facterInfo
    facterInfo['ipaddress'] = self.getIpAddress()
  File "/usr/lib/python2.6/site-packages/ambari_agent/Facter.py", line 73, in getIpAddress
    return socket.gethostbyname(self.getFqdn().lower())
gaierror: [Errno -3] Temporary failure in name resolution


====================
Agent out at: /var/log/ambari-agent/ambari-agent.out

Agent log at: /var/log/ambari-agent/ambari-agent.log

As the gaierror line shows, the hostname could not be resolved.

# hostname

slaver1.novalocal

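Surprisingly, the hostname had a domain suffix. I then recalled that last time I had apparently only changed the hostname temporarily, so after the reboot it reverted to the name in /etc/hostname.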

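# hostnamectl set-hostname slaver1

This is the command for permanently changing the hostname on CentOS 7 and later; on earlier versions you have to edit the configuration file.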
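To confirm the fix before restarting the agent, the lookup from the traceback (`socket.gethostbyname(self.getFqdn().lower())`) can be reproduced directly. The sketch below assumes the agent's `getFqdn()` behaves roughly like `socket.getfqdn()`, which the traceback does not confirm; the `resolve` helper and its `resolver` parameter are hypothetical and exist only to make the check easy to try out.

```python
import socket


def resolve(name, resolver=socket.gethostbyname):
    """Return the IPv4 address for name, or None if the lookup fails."""
    try:
        return resolver(name)
    except socket.gaierror as error:
        # This is the exception type the agent hit in the traceback.
        print("cannot resolve %r: %s" % (name, error))
        return None


if __name__ == "__main__":
    fqdn = socket.getfqdn().lower()
    address = resolve(fqdn)
    if address is None:
        print("add %s to /etc/hosts or DNS, or correct the hostname" % fqdn)
    else:
        print("%s -> %s" % (fqdn, address))
```

If this prints an address, the agent's hardware detection should get past the getIpAddress step.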


Then I restarted once more:

[root@slaver1 ~]# ambari-agent start
Verifying Python version compatibility...
Using python  /usr/bin/python
Checking for previously running Ambari Agent...
/var/run/ambari-agent/ambari-agent.pid found with no process. Removing 1581...
Starting ambari-agent
Verifying ambari-agent process status...
Ambari Agent successfully started
Agent PID at: /var/run/ambari-agent/ambari-agent.pid
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log

Check the status:

[root@slaver1 ~]# ambari-agent status
Found ambari-agent PID: 1671
ambari-agent running.
Agent PID at: /var/run/ambari-agent/ambari-agent.pid
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log


