Thursday, June 08, 2006

Non Blocking with Traditional Java IO - On the Use of InputStream.available() and Thread.sleep()

Some time ago I did quite a lot of IO in Java, yet I had never seen this way of reading an InputStream from a socket:
    InputStream in=channel.getInputStream();

    channel.connect();

    byte[] tmp=new byte[1024];
    while(true){
      while(in.available()>0){
        int i=in.read(tmp, 0, 1024);
        if(i<0)break;
        System.out.print(new String(tmp, 0, i));
      }
      if(channel.isClosed()){
        System.out.println("exit-status: "+channel.getExitStatus());
        break;
      }
      try{Thread.sleep(1000);}catch(Exception ee){}
    }
    channel.disconnect();


This comes from an example shipped with JSch, a good SSH client in Java. A colleague had the bad idea of removing the Thread.sleep call and was struggling to understand why the code then worked only some of the time. The way I would have done it is the following:
    InputStream in=channel.getInputStream();

    channel.connect();

    byte[] tmp=new byte[1024];
    int bytesRead = 0;
    while((bytesRead = in.read(tmp,0,1024)) >= 0){
      System.out.print(new String(tmp, 0, bytesRead));
    }

    if(channel.isClosed()){
      System.out.println("exit-status: "+channel.getExitStatus());
    }
    channel.disconnect();


This has the advantage of being more readable and of containing fewer secret ingredients. In the first code, the call to available() is non-blocking: without the Thread.sleep(), the outer loop spins at full speed, polling available() over and over and burning CPU while it waits for the socket buffer to fill. But which of the two is more efficient, the first code or the second?

I searched on Google to understand the point of the first code. The only advantages I found are the possibility of interrupting the thread running the code and finer-grained control over timeouts.
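
To make those two advantages concrete, here is a minimal sketch of a polling read with a deadline. The class PollingReader and the method readUntilDeadline are names I made up for illustration; they are not part of JSch or the JDK. The sketch assumes that available() on the stream in question reports buffered bytes reliably, which the InputStream contract does not guarantee for every implementation. Note also that available() cannot signal end of stream, which is presumably why the JSch example checks channel.isClosed() separately.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class PollingReader {

    /**
     * Collects whatever arrives on the stream until timeoutMillis has elapsed.
     * Thread.sleep() throws InterruptedException, so the calling thread stays
     * interruptible while it waits for data.
     */
    static byte[] readUntilDeadline(InputStream in, long timeoutMillis, long pollMillis)
            throws IOException, InterruptedException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (System.nanoTime() - deadline < 0) {
            if (in.available() > 0) {
                // Data is already buffered, so this read should not block.
                int n = in.read(buf, 0, buf.length);
                if (n < 0) {
                    break;
                }
                out.write(buf, 0, n);
            } else {
                // Nothing buffered yet: yield the CPU instead of spinning.
                Thread.sleep(pollMillis);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        InputStream in = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        byte[] got = readUntilDeadline(in, 200, 10);
        System.out.println(new String(got, StandardCharsets.UTF_8));
    }
}
```

The price of this flexibility is latency: data that arrives just after a poll waits up to pollMillis before it is read, and the method always runs until the deadline because it has no way of knowing that the stream has ended.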

There is a lengthy explanation by Doug Lea in his book "Concurrent Programming in Java". This book usually provides excellent explanations and is a must-read for anybody doing concurrent programming. But this time, on this subject, I did not find him that clear.

There is a simpler explanation in a course from San Diego State University (see the last example):
A read() on an inputstream or reader blocks. Once a thread calls read() it will not respond to interrupt() (or much else) until the read is completed. This is a problem when a read could take a long time: reading from a socket or the keyboard. If the input is not forth coming, the read() could block forever.

As usual, you should not trust everything you read on the web, as this page (SCJP Questions & Answers) shows:
Q. When will a Thread I/O blocked?
A:
When a thread executes a read() call on an InputStream, if no byte is available. The calling Thread blocks, in other words, stops executing until a byte is available or the Thread is interrupted.

Still, I wonder whether the second code would not simply throw an IOException (a socket timeout) once the timeout, adjustable with Socket.setSoTimeout, expires, and release the thread then. Do you have any idea when the first code could be better?
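
For a plain java.net.Socket, the timeout behaviour can be checked with a small sketch like the one below. ReadTimeoutDemo and readTimesOut are made-up names, and the loopback server that never writes anything is just a stand-in for a silent peer. Whether a JSch channel stream surfaces a socket timeout in the same way is an open question that this sketch does not answer, since the channel stream is not the raw socket stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class, for illustration only.
final class ReadTimeoutDemo {

    /** Returns true if a blocking read() gave up with a SocketTimeoutException. */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        // Loopback server that accepts the connection but never writes to it.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            client.setSoTimeout(timeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read();      // blocks, because the peer stays silent
                return false;   // only reached if data or end of stream arrived
            } catch (SocketTimeoutException e) {
                return true;    // the read was abandoned and the thread is free again
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

SocketTimeoutException is a subclass of IOException, and the socket remains usable after it is thrown, so the caller can decide to retry the read or give up.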

5 comments:

  1. The first code makes no sense as is. If it were reading more than one input stream, it would make a little more sense, but not as much sense as using NIO.

  2. Yes, the first code is a bad example.
    But it's a problem that the java runtime doesn't throw an exception when a thread blocked on an IO primitive like read() is interrupted.
    However it is possible to simulate this behavior by closing the socket on which the thread is blocked. (The limitation is that it is not always easy to get the socket reference.)

  3. That's actually quite a neat trick. Sure nio is better, but sometimes one is stuck in pre-1.5 land.

  4. The 2nd code will block once the connection is interrupted (hardware unplugged, etc) while transmitting. Whereas the 1st code will recover.

  5. I don't see why people say the first example is incorrect, or one should use NIO. The fact is listening on an SSH stream is a blocking event. NIO is useless here.

    The first example is correct, since one can inject a timeout into the loop and break, or look for end of input string like a prompt and break.

    The second example is incorrect. While elegant, it will hang waiting on end of channel.

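
The trick mentioned in comment 2, closing the socket to release a thread stuck in read(), can be sketched as follows for a plain java.net.Socket. CloseToUnblockDemo and closeUnblocksRead are made-up names, and the silent loopback server is again only a stand-in for a peer that sends nothing. The 200 ms sleep is a crude way of letting the reader thread reach read() first; it is an assumption of the sketch, not a synchronization guarantee.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class, for illustration only.
final class CloseToUnblockDemo {

    /** Returns true if closing the socket made the blocked reader thread exit. */
    static boolean closeUnblocksRead() throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            AtomicBoolean unblocked = new AtomicBoolean(false);
            Thread reader = new Thread(() -> {
                try {
                    client.getInputStream().read(); // blocks: the peer never writes
                } catch (IOException e) {
                    unblocked.set(true);            // thrown once the socket is closed
                }
            });
            reader.start();
            Thread.sleep(200);   // crude: give the reader time to block in read()
            client.close();      // closing from another thread aborts the pending read
            reader.join(2000);
            return unblocked.get() && !reader.isAlive();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("unblocked: " + closeUnblocksRead());
    }
}
```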