[Error] Unable to connect to a as user root ...
Date: 2023-09-12 01:37:02
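I ran into this problem while setting up Hadoop HA. The full error output is shown below. (The log comes from a Chinese-locale system; 拒绝连接 means "Connection refused".)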
com.jcraft.jsch.JSchException: Auth fail
at com.jcraft.jsch.Session.connect(Session.java:452)
at org.apache.hadoop.ha.SshFenceByTcpPort.tryFence(SshFenceByTcpPort.java:100)
at org.apache.hadoop.ha.NodeFencer.fence(NodeFencer.java:97)
at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:532)
at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:505)
at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:61)
at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:892)
at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:902)
at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:801)
at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:416)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2021-12-27 11:07:20,846 WARN org.apache.hadoop.ha.NodeFencer: Fencing method org.apache.hadoop.ha.SshFenceByTcpPort(null) was unsuccessful.
2021-12-27 11:07:20,846 ERROR org.apache.hadoop.ha.NodeFencer: Unable to fence service by any configured method.
2021-12-27 11:07:20,846 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
java.lang.RuntimeException: Unable to fence NameNode at a/192.168.0.149:8020
at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:533)
at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:505)
at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:61)
at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:892)
at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:902)
at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:801)
at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:416)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2021-12-27 11:07:20,846 INFO org.apache.hadoop.ha.ActiveStandbyElector: Trying to re-establish ZK session
2021-12-27 11:07:20,851 INFO org.apache.zookeeper.ZooKeeper: Session: 0x37df9b417310059 closed
2021-12-27 11:07:21,852 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=a:2181,b:2181,c:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@44a90199
2021-12-27 11:07:21,853 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server b/192.168.0.150:2181. Will not attempt to authenticate using SASL (unknown error)
2021-12-27 11:07:21,854 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to b/192.168.0.150:2181, initiating session
2021-12-27 11:07:21,859 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server b/192.168.0.150:2181, sessionid = 0x27df9b3aaf60068, negotiated timeout = 5000
2021-12-27 11:07:21,860 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2021-12-27 11:07:21,861 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
2021-12-27 11:07:21,862 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced...
2021-12-27 11:07:21,862 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old node exists: 0a096d79636c757374657212026e311a016120d43e28d33e
2021-12-27 11:07:21,864 INFO org.apache.hadoop.ha.ZKFailoverController: Should fence: NameNode at a/192.168.0.149:8020
2021-12-27 11:07:22,866 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: a/192.168.0.149:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
2021-12-27 11:07:22,867 WARN org.apache.hadoop.ha.FailoverController: Unable to gracefully make NameNode at a/192.168.0.149:8020 standby (unable to connect)
java.net.ConnectException: Call From b/192.168.0.150 to a:8020 failed on connection exception: java.net.ConnectException: 拒绝连接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor26.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1480)
at org.apache.hadoop.ipc.Client.call(Client.java:1407)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy9.transitionToStandby(Unknown Source)
at org.apache.hadoop.ha.protocolPB.HAServiceProtocolClientSideTranslatorPB.transitionToStandby(HAServiceProtocolClientSideTranslatorPB.java:112)
at org.apache.hadoop.ha.FailoverController.tryGracefulFence(FailoverController.java:172)
at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:514)
at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:505)
at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:61)
at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:892)
at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:902)
at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:801)
at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:416)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
Caused by: java.net.ConnectException: 拒绝连接
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529)
at org.apache.hadoop.ipc.Client.call(Client.java:1446)
... 14 more
Below are two possible causes of this error. Corrections and discussion are welcome.
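- Cause 1: passwordless SSH login may not be configured. From the machine that reported the error, try to SSH to the other nodes and check whether the login succeeds without a password prompt.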
- Cause 2: the dfs.ha.fencing.methods parameter is set to sshfence, which relies on the fuser command, and fuser may not be installed. It is needed on every NameNode host.
  Install command: yum -y install psmisc
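
Both causes can be checked from the shell before restarting the failover controller. The following is a minimal sketch under stated assumptions, not a definitive procedure: it assumes the fencing user is root (as in the error title), that the other NameNode host is named a (as in the log), and that the OpenSSH command-line client is available. The helper names check_fuser, check_passwordless_ssh and report are hypothetical and were introduced only for this example.

```shell
#!/bin/sh
# Sketch only: helper names and the user/host values are assumptions.

# Return 0 if the fuser binary (shipped in the psmisc package) is on PATH.
check_fuser() {
    command -v fuser >/dev/null 2>&1
}

# Return 0 if user $1 on host $2 accepts a key-based login without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_passwordless_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1@$2" true
}

# Print a one-line verdict for a named check ($1) given its exit status ($2).
report() {
    if [ "$2" -eq 0 ]; then
        echo "OK: $1"
    else
        echo "FAIL: $1"
    fi
}

# Example usage (run on each NameNode, targeting the other NameNode):
#   check_fuser; report "fuser installed" $?
#   check_passwordless_ssh root a; report "passwordless ssh to root@a" $?
```

A successful command-line login is a useful first check, but it is not a guarantee: sshfence connects through the JSch library using the key listed in dfs.ha.fencing.ssh.private-key-files, so that property should point at a private key the remote host accepts. Running `hdfs getconf -confKey dfs.ha.fencing.methods` prints the fencing methods the cluster is actually configured with.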