```
Query ID = root_20250610113434_c388a02f-b24b-4fb2-a37e-eb8619141174
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
java.net.ConnectException: Call From node1/192.168.142.129 to node2:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
	at sun.reflect.GeneratedConstructorAccessor59.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:930)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:845)
	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1571)
	at org.apache.hadoop.ipc.Client.call(Client.java:1513)
	at org.apache.hadoop.ipc.Client.call(Client.java:1410)
	at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258)
	at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139)
	at com.sun.proxy.$Proxy77.getNewApplication(Unknown Source)
	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:286)
	at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
	at com.sun.proxy.$Proxy78.getNewApplication(Unknown Source)
	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:284)
	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:292)
	at org.apache.hadoop.mapred.ResourceMgrDelegate.getNewJobID(ResourceMgrDelegate.java:201)
	at org.apache.hadoop.mapred.YARNRunner.getNewJobID(YARNRunner.java:272)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:157)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1678)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1675)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1675)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:423)
	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:328)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:241)
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:715)
	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:205)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:600)
	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:652)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:773)
	at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:347)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1632)
	at org.apache.hadoop.ipc.Client.call(Client.java:1457)
	... 55 more
Job Submission failed with exception 'java.net.ConnectException(Call From node1/192.168.142.129 to node2:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. Call From node1/192.168.142.129 to node2:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
```
### Troubleshooting Guide for Hadoop YARN "Connection refused" Errors
The following is a step-by-step workflow for diagnosing the `java.net.ConnectException: Connection refused` error that appears when Hive submits a MapReduce job:
#### 1. Basic Network Checks
```bash
# From node1, test basic connectivity to node2
ping node2
# Test reachability of port 8032 (the ResourceManager client/application-submission port)
telnet node2 8032   # "Connection refused" here confirms the problem
# Check whether the port is actually listening (run on node2)
sudo netstat -tuln | grep 8032
```
> If the port is not listening, continue with the checks below.
#### 2. Service Status Diagnostics
```bash
# Check the ResourceManager service status (run on node2)
sudo systemctl status hadoop-yarn-resourcemanager
# On a tarball install without systemd units, check the process instead:
jps | grep ResourceManager
# Inspect the service log for the key error clues
tail -50 /var/log/hadoop-yarn/yarn/yarn-resourcemanager-*.log
```
Typical error clues in the log:
```
ERROR - Bind failed on 0.0.0.0:8032
WARN - org.apache.hadoop.ipc.Server: Bind exception
```
#### 3. Key Configuration Checks (`yarn-site.xml`)
Verify the following settings on node2 (the client configuration on node1 must point at the same address):
```xml
  <!-- Confirm the ResourceManager hostname -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node2</value> <!-- must match the actual hostname of the ResourceManager node -->
  </property>
  <!-- Port 8032 is the client-facing RPC port (yarn.resourcemanager.address);
       the scheduler port yarn.resourcemanager.scheduler.address defaults to 8030 -->
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>${yarn.resourcemanager.hostname}:8032</value>
  </property>
  <!-- Make the ResourceManager listen on all interfaces, not only the loopback address -->
  <property>
    <name>yarn.resourcemanager.bind-host</name>
    <value>0.0.0.0</value>
  </property>
```
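To confirm which values the client on node1 actually submits with, it can help to dump the relevant properties from the local configuration file. Below is a minimal sketch, assuming the configuration lives in `/etc/hadoop/conf` (adjust `HADOOP_CONF_DIR` for your install) and that `xmllint` is available; the property names are the standard YARN keys discussed above:

```bash
# Print the ResourceManager-related properties from the local yarn-site.xml
CONF=${HADOOP_CONF_DIR:-/etc/hadoop/conf}/yarn-site.xml   # path is an assumption; adjust as needed
for key in yarn.resourcemanager.hostname \
           yarn.resourcemanager.address \
           yarn.resourcemanager.bind-host; do
  # XPath: select the <value> of the <property> whose <name> matches $key
  val=$(xmllint --xpath "string(//property[name='$key']/value)" "$CONF" 2>/dev/null)
  echo "$key = ${val:-<not set, default applies>}"
done
```

Running the same loop on node1 and node2 makes it easy to spot a client that points at the wrong host or port.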
#### 4. System Firewall and SELinux
```bash
# Check the firewalld state (on node2)
sudo firewall-cmd --state
# Open the common Hadoop/YARN ports (8030-8033 RM RPC, 8040 NM localizer, 8088 RM web UI)
sudo firewall-cmd --permanent --add-port={8030-8033,8040,8088}/tcp
sudo firewall-cmd --reload
# Temporarily set SELinux to permissive mode as an isolation test
sudo setenforce 0
```
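After adjusting the firewall, re-test reachability from node1 before moving on. A small sketch, assuming `nc` (netcat) is installed on node1:

```bash
# Reports success for each ResourceManager port that is reachable from node1
for port in 8030 8031 8032 8033 8088; do
  nc -zv -w 3 node2 "$port" 2>&1
done
```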
#### 5. Advanced Debugging
**Case A: the port is not listening**
```bash
# Force a restart of the ResourceManager service
sudo systemctl restart hadoop-yarn-resourcemanager
# Check what, if anything, is holding the port
sudo lsof -i :8032
```
**Case B: hostname resolution fails**
```bash
# Verify hostname mappings on every node (the file must list all cluster nodes)
cat /etc/hosts
# Test name resolution for both hosts
host node1
host node2
```
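**Case C: the port listens only on 127.0.0.1**

A related failure mode (also discussed on the Hadoop ConnectionRefused wiki page linked in the error) is the ResourceManager binding only to the loopback address, so local connections succeed while remote ones are refused. A quick check on node2, as a sketch:

```bash
# Run on node2: show the local address the ResourceManager listens on for port 8032
sudo ss -tlnp | grep ':8032'
# 127.0.0.1:8032            -> only local clients can connect; fix yarn.resourcemanager.hostname / bind-host
# 0.0.0.0:8032 or node2-ip  -> remote clients such as node1 should be able to connect
```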
#### 6. Multi-Node Consistency Check

| Check item | Verify on node1 | Verify on node2 |
| --- | --- | --- |
| `core-site.xml` configuration | identical | identical |
| `yarn-site.xml` configuration | contains the RM address | declares the RM service addresses |
| Hadoop version | same | same |
| JDK version | same | same |
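One way to automate this comparison, sketched under the assumptions of passwordless SSH from node1 to node2 and `hadoop`/`java` being on the PATH of both nodes:

```bash
# Compare config checksums and tool versions between node1 (local) and node2
CONF=${HADOOP_CONF_DIR:-/etc/hadoop/conf}   # path is an assumption; adjust for your install
for f in core-site.xml yarn-site.xml; do
  echo "== $f =="
  md5sum "$CONF/$f"               # local checksum
  ssh node2 "md5sum $CONF/$f"     # remote checksum; should normally match
done
hadoop version | head -1 && ssh node2 'hadoop version | head -1'
java -version 2>&1 | head -1 && ssh node2 'java -version 2>&1 | head -1'
```

Note that `yarn-site.xml` may legitimately differ in node-local settings; what matters is that both sides agree on the ResourceManager address.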
#### 7. Fallback Connection Parameter Test
If the client-side Hadoop configuration cannot be corrected directly, pin the ResourceManager address on node1, either in the client's `yarn-site.xml` or in `hive-site.xml` (which Hive also reads):
```xml
<property>
<name>yarn.resourcemanager.address</name>
<value>node2:8032</value>
</property>
```
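To see which value the Hive session actually resolves, the property can be echoed from the command line; this is a sketch assuming the Hive CLI is installed on node1:

```bash
# Print the ResourceManager address the Hive client will use (the bare "set <key>;" form echoes the value)
hive -e 'set yarn.resourcemanager.address;'
```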
> After completing the steps above, verify the fix with a small test job:
> ```sql
> SET mapred.job.name="Connection_Test";
> SELECT COUNT(1) FROM any_table; -- any simple query that triggers a MapReduce job
> ```
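If the query now launches, the submitted application should also be visible from YARN itself. A quick cross-check, assuming the `yarn` CLI is available on node1:

```bash
# List applications known to the ResourceManager; the Hive query should appear here while it runs
yarn application -list -appStates RUNNING,ACCEPTED
```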