Hadoop DataXceiver java.io.IOException: Connection reset by peer

Recently, when running MapReduce jobs, tasks have repeatedly been unstable: sometimes a task keeps retrying, so the whole job sits in one stage as if it were stuck, and after endless retries it may take hours to finish. The only option was to dig through the logs under each directory (problem tracking and resolution: http://blog.csdn.net/rzhzhz/article/details/7536285), where the following error turned up in the datanode log:

2012-04-27 10:40:30,683 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.64.49.22:50010, storageID=DS-1420900310-10.64.49.22-50010-1332741432282, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcher.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:202)
        at sun.nio.ch.IOUtil.read(IOUtil.java:175)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
        at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at java.io.DataInputStream.read(DataInputStream.java:132)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:264)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:354)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:375)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:528)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:397)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:107)
        at java.lang.Thread.run(Thread.java:662)


This is the official bug report for the problem; the issue has already been closed:

HADOOP-3678 (https://issues.apache.org/jira/browse/HADOOP-3678)

The official description is as follows:

When a client reads data using read(), it closes the sockets after it is done. 
Often it might not read till the end of a block. The datanode on the other side keeps writing data until the client connection is closed or end of the block is reached.
If the client does not read till the end of the block, Datanode writes an error message and stack trace to the datanode log. It should not.
This is not an error and it just pollutes the log and confuses the user. 
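
To make that description concrete, here is a minimal sketch (not from the original post) of an HDFS client that reads only the first few kilobytes of a file and then closes the stream early. The file path is hypothetical, and the example assumes the file is larger than one block; when the stream is closed before the end of the block, the datanode on the other end can log the "Connection reset by peer" stack trace shown above, even though nothing is actually wrong.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PartialBlockRead {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical path; assume the file spans more than one HDFS block.
        Path path = new Path("/tmp/large-file");

        byte[] buf = new byte[4096];
        FSDataInputStream in = fs.open(path);
        try {
            // Read only a small prefix of the block instead of the whole block.
            int n = in.read(buf, 0, buf.length);
            System.out.println("read " + n + " bytes");
        } finally {
            // Closing here tears down the socket before the datanode has
            // streamed the rest of the block, which is what produces the
            // harmless stack trace in the datanode log.
            in.close();
        }
        fs.close();
    }
}

In other words, the message can be treated as log noise rather than a sign of data loss.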

