Make sure the Thread.run() method has terminated before closing the
socket. Currently, the socket is closed through Packetizer.close(),
which happens too early. Move the socket.close() call into
Transport.close(), after the Thread.join() call.
While at it, modify the stop_thread() method and use it in
Transport.close() to avoid code duplication. Use join() with a timeout
to make it possible to terminate the main thread with KeyboardInterrupt.
Also, remove the now-obsolete socket.close() call from Transport.atfork().
This fixes a potential infinite loop if paramiko.SSHClient is connected
through a paramiko.Channel instead of a regular socket (tunneling).
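The shutdown ordering described above can be sketched roughly as follows. This is a minimal illustration, not the actual paramiko code: the Worker class, the `active` flag, and the 0.1 second join timeout are assumptions made for the example.

```python
import socket
import threading
import time


class Worker(threading.Thread):
    """Hypothetical stand-in for Transport; all names are illustrative."""

    def __init__(self, sock):
        super().__init__(daemon=True)
        self.sock = sock
        self.active = True

    def run(self):
        # Placeholder for the real packet loop, which would use self.sock.
        while self.active:
            time.sleep(0.01)

    def stop_thread(self):
        # Ask run() to finish, then wait for it in short slices. Per the
        # commit message, the timeout keeps the caller interruptible.
        self.active = False
        if self is threading.current_thread():
            return  # a thread cannot join itself
        while self.is_alive():
            self.join(0.1)

    def close(self):
        # Close the socket only after run() has terminated.
        self.stop_thread()
        self.sock.close()
```

With this ordering, run() can never observe a socket that was closed underneath it.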
Details:
Using a debug patch to dump the current stack of the thread every
couple of seconds while trying to close it, I've seen the following
over and over again:
    Thread could not be stopped, still running.
    Current traceback (most recent call last):
      File "/usr/lib/python2.7/threading.py", line 524, in __bootstrap
        self.__bootstrap_inner()
      File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
        self.run()
      File ".../paramiko/transport.py", line 1564, in run
        self._channel_handler_table[ptype](chan, m)
      File ".../paramiko/channel.py", line 1102, in _handle_close
        self.transport._send_user_message(m)
      File ".../paramiko/transport.py", line 1418, in _send_user_message
        self._send_message(data)
      File ".../paramiko/transport.py", line 1398, in _send_message
        self.packetizer.send_message(data)
      File ".../paramiko/packet.py", line 319, in send_message
        self.write_all(out)
      File ".../paramiko/packet.py", line 248, in write_all
        n = self.__socket.send(out)
      File ".../paramiko/channel.py", line 732, in send
        self.lock.release()
The thread was running Packetizer.write_all() in an endless loop:
    while len(out) > 0:
        ...
        n = Channel.send(out)  # n == 0 because channel got closed
        ...
        out = out[n:]          # essentially out = out
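To make the failure mode concrete: when send() keeps returning 0, the slice out[n:] never shrinks. This commit addresses the problem by reordering the shutdown; the sketch below only illustrates one defensive guard that makes such a loop terminate. ClosedChannel and this standalone write_all() are hypothetical and are not the paramiko implementation.

```python
class ClosedChannel:
    """Hypothetical stand-in for a closed channel: send() accepts 0 bytes."""

    def send(self, data):
        return 0


def write_all(sock, out):
    """Send all of `out`, failing instead of spinning when nothing is accepted."""
    while len(out) > 0:
        n = sock.send(out)
        if n <= 0:
            # Without this check, out[0:] == out and the loop never ends.
            raise EOFError("no bytes accepted; peer appears closed")
        out = out[n:]
```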
Signed-off-by: Frank Arnold <farnold@amazon.com>
(bug 69222)
on some recent linux kernels, a socket can return "readable" from select,
but a subsequent read() will return EAGAIN. this is against the contract
of select(), so python's socketmodule doesn't catch it or handle it.
therefore, we need to. EAGAIN should now be treated the same as a
socket timeout.
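a minimal sketch of that handling, assuming a helper named read_some() (hypothetical, not the paramiko function) that returns None for "nothing to read yet" and lets every other error propagate:

```python
import errno
import socket


def read_some(sock, n):
    """Read up to n bytes; return None if no data is available yet."""
    try:
        return sock.recv(n)
    except socket.timeout:
        # Ordinary timeout: the caller simply tries again later.
        return None
    except OSError as e:
        if e.errno == errno.EAGAIN:
            # select() said "readable" but the read would block:
            # handle it exactly like a timeout.
            return None
        raise
```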
some performance improvements: be a LOT less aggressive about stirring the randpool; use buffering when reading the banner; add a hook for using a native-compiled hmac (which gives the biggest boost, but should probably be done in pycrypto)
don't attempt to start a rekey negotiation from within send_message -- always do it from the feeder thread. this prevents a situation where more than one thread may decide spontaneously to rekey, sending multiple kexinit messages, which confuses the hell out of the remote host :) also, do some locking around the clear-to-send event, to avoid a race when we first go into rekeying. add some tests for these things too
oooh maybe i'll test things before checking them in next time: rekeying was a little bit overzealous. now it's careful to only rekey once and reset the counters in sync
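a rough sketch of the idea, assuming a small helper class (RekeyTracker, hypothetical, not part of paramiko): sending threads only record that a rekey is due, and the feeder thread alone picks the request up, at most once, resetting the counter under the same lock.

```python
import threading


class RekeyTracker:
    """Hypothetical helper: senders flag a rekey, the feeder thread starts it."""

    def __init__(self, limit_bytes):
        self._lock = threading.Lock()
        self._limit = limit_bytes
        self._sent = 0
        self._need_rekey = False

    def note_sent(self, nbytes):
        # Called from any sending thread; never starts a negotiation itself.
        with self._lock:
            self._sent += nbytes
            if self._sent >= self._limit:
                self._need_rekey = True

    def take_rekey_request(self):
        # Called only by the feeder thread. Returns True at most once per
        # limit crossing and resets the counter in the same critical section.
        with self._lock:
            if not self._need_rekey:
                return False
            self._need_rekey = False
            self._sent = 0
            return True
```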
split out Packetizer, fix banner detection bug, new unit test
split out a chunk of BaseTransport into a Packetizer class, which handles
the in/out packet data, ciphers, etc. it didn't make the code any smaller
(transport.py is still close to 1500 lines, which is awful) but it did split
out a coherent chunk of functionality into a discrete unit.
in the process, fixed a bug that alain spineux pointed out: the banner
check was too forgiving and would block forever waiting for an SSH banner.
now it waits 5 seconds for the first line, and 2 seconds for each subsequent
line, before giving up.
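the timeout scheme can be sketched like this. the readline(timeout) callable, the max_lines cap, and the function name are assumptions for illustration; readline is expected to raise socket.timeout when no line arrives in time, which ends the wait.

```python
def find_banner(readline, first_timeout=5.0, next_timeout=2.0, max_lines=100):
    """Return the first line starting with "SSH-", or raise if none shows up.

    readline(timeout) is a hypothetical callable that returns one line of
    text or raises socket.timeout if nothing arrives within the timeout.
    """
    for i in range(max_lines):
        # Be patient for the first line, less so for every later one.
        timeout = first_timeout if i == 0 else next_timeout
        line = readline(timeout)
        if line.startswith("SSH-"):
            return line
    raise ValueError("no SSH banner found")
```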
added a unit test to test keepalive, since i wasn't sure that was still
working after pulling out Packetizer.