High Performance Enabled SSH/SCP News
Update 7.29.2008
HPN13 has been updated to v5 for use with OpenSSH 5.1. There are two minor changes in functionality. First, the progress meter will
no longer print an extra line with the peak throughput. Instead, the peak throughput will be displayed as the last update in the
meter itself. Second, I've increased the number of outstanding requests in sftp to 256. This will give users 8MB of outstanding
data. If you think you need more, then increase the number of requests or the size of the buffer (-R and -B respectively in sftp).
There are some internal changes which cut down on some of the complexity of the patch. As far as I can tell there hasn't been any
change that will be noticeable to the user. However, I'm often wrong, so if you run into a problem please contact us at
hpn-ssh@psc.edu.
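The 8MB figure is the number of outstanding requests multiplied by the size of each request. A quick check of that arithmetic, as a sketch: the 32KB per-request buffer is an assumption here (it matches the stock sftp default for -B, but it is not stated in the note above), and the function name is made up for illustration.

```python
def outstanding_bytes(num_requests: int, buffer_size: int) -> int:
    """Upper bound on data in flight: requests times bytes per request."""
    return num_requests * buffer_size


# 256 requests comes from the note above; 32 KB per request is an assumption.
total = outstanding_bytes(256, 32 * 1024)
print(total // (1024 * 1024), "MB outstanding")
```

Doubling either value doubles the amount of data that can be in flight, which is why -R and -B are the two knobs to turn.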
Update 4.07.2008
HPN13 has been updated to v3. This is for OpenSSH 5.0. v2 had been released for OpenSSH 4.9 but was withdrawn due to a
security problem in OpenSSH 4.9. We will not be making it available and we suggest everyone move to OpenSSH 5.0.
Update 2.07.2008
We've released the HPN13 set of patches to replace the HPN12 patch set. The main addition is a multi-threaded AES-CTR mode cipher. This allows
the use of multiple processing cores (using POSIX threads) to create a cipherstream indistinguishable from the more common single-threaded AES-CTR
mode cipher. As such, it's completely compatible with existing installations. We also made a change in how we distribute the patches. We'll be
providing them in a 'kitchen sink' format and as 'a la carte' so users can apply the patches that are most useful to them.
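Counter mode lends itself to this because each keystream block depends only on the key and a counter value, so blocks can be computed in any order and then concatenated. The sketch below illustrates that property only. It is not the patch's code: SHA-256 stands in for the AES block operation (an assumption made so the example needs nothing beyond the standard library), and all function names are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def keystream_block(key: bytes, counter: int) -> bytes:
    # Stand-in for AES(key, counter). NOT real AES; illustration only.
    return hashlib.sha256(key + counter.to_bytes(16, "big")).digest()


def keystream_serial(key: bytes, nblocks: int) -> bytes:
    # One thread walks the counter values in order.
    return b"".join(keystream_block(key, i) for i in range(nblocks))


def keystream_threaded(key: bytes, nblocks: int, workers: int = 4) -> bytes:
    # Several threads compute blocks independently; map() returns the
    # results in counter order, so the concatenation is unchanged.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        blocks = pool.map(lambda i: keystream_block(key, i), range(nblocks))
        return b"".join(blocks)


key = b"k" * 16
serial = keystream_serial(key, 64)
threaded = keystream_threaded(key, 64)
print(serial == threaded)  # True
```

The example demonstrates equivalence of the output, not a speedup; Python threads will not show the multi-core gains that POSIX threads in C can.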
Update 11.30.2007
The previous version of the HPN12 patch (revision 19) has been removed and replaced with revision 20.
In some circumstances a buffer_append_space bug resulted from using the wrong size check value. There was
also a typo which may have led to inappropriately large or small HPN buffers when using the HPNBufferSize
option with sshd.
Update 09.28.2007
The previous version of the HPN12 patch (revision 18) has been removed and replaced with revision 19. This revision removes some nonfunctional
code that had accumulated in the patch over the past couple of years. We also changed the default behaviour of the TCPRcvBufPoll option to
enabled. This will end up making things a bit easier for the people with autotuning kernels - which are generally in the majority now. People
without autotuning should disable this option as it's just wasting cycles for them. We also fixed a problem that was allowing people to specify the
none cipher for authentication. We included a failsafe that will not allow the use of the none cipher unless authentication has taken place. If
it is attempted the connection will close. We've also made a change where the HPN buffer will grow somewhat more aggressively than it has in the
past.
Also, we have a short paper available on the work we've done on HPN-SSH. It should be understandable to most users of the patch:
High Speed Bulk Data Transfer on the Grid Using the SSH Protocol.
Update 09.05.2007
A new version of OpenSSH was released today and so was a new version of the HPN-SSH patch. This patch doesn't provide any new functionality
*but* this should be seen as an interim patch. In the next month I expect to have a new patch available for the 4.7p1 code base which will have
some improvements. Also, not to jinx anything but I think we'll have an HPN13 patch set available in a few months. This will be of *real*
interest to people in multicore and SMP environments.
Update 03.27.2007
I was made aware that you could have both NoneSwitch and NoneEnabled set in the ssh_config file. While this may be convenient for those people
who always use the NONE cipher switch there is no way that this is acceptable default behaviour. So I changed the configuration so that you can
only use the NoneSwitch option from the command line. I also changed the maximum value of HPNBufferSize to 64MB. This is only used in HPN to
Non-HPN connections.
Additionally, I ran some tests recently. I wanted to make sure that all of the changes hadn't introduced any bottlenecks into the code.
So I took a stock version of OpenSSH and increased the default buffer size to 64MB (which was a two line change) and compared it to
HPN12v16. The end result is that they provided statistically identical throughput performance over a long fast network. I still need
to rerun the tests in a local area network to see if the short RTTs have an impact. More on that when it happens.
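Buffer size matters on a long fast network because a sender can have at most one window of data in flight per round trip, so throughput is bounded by window size divided by RTT. A rough sketch of that relationship follows; the 50 ms RTT and 1000 Mbps link are hypothetical values chosen for illustration, not measurements from these tests, and the function names are made up.

```python
def max_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput: one window of data per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e6


def required_window_bytes(link_mbps: float, rtt_seconds: float) -> int:
    """Bandwidth-delay product: the window needed to keep the link full."""
    return round(link_mbps * 1e6 / 8 * rtt_seconds)


rtt = 0.050  # hypothetical 50 ms round trip time
for window in (64 * 1024, 2 * 1024 * 1024, 64 * 1024 * 1024):
    print(window, "bytes ->", round(max_throughput_mbps(window, rtt), 1), "Mbps max")
print("window for 1000 Mbps:", required_window_bytes(1000, rtt), "bytes")
```

On a local area network the RTT is far smaller, so the same bound is much less restrictive, which is why the short-RTT case needs its own test.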
Update 02.20.2007
Very minor change. Some of the comments were in the C++ style as opposed to plain old C.
This was causing it to choke on some compilers. I've changed everything to the C style comments.
Update 11.13.2006
Yet another default buffer size bug. This time I neglected to make sure that the hpn buffer on the
server side was correctly sized. In versions prior to v14, unless the HPN buffer size was explicitly set,
it used a default value of 2MB. As of now it will use the current TCP buffer size. I also discovered that
when the hpn disabled option is set to yes the buffer is actually 128K and not 64K. I'll be working on that this week.
Update 10.31.2006
There was a documentation bug in the usage section of scp. This has been fixed in the v13 release
for 4.4p1. Happy Halloween, kids!
Update 10.23.2006
A bug was preventing some of the options in scp from working properly. This would have affected
people using the -1, -2, -4, -6, and -C options. This has been fixed in the v12 release for 4.4p1.
Update 10.12.2006
Quite a few things have happened. First we found a number of minor errors in the code and
one major error. None of these errors were security related though. Minor errors were conflicting
messages about the state of the NONE cipher, some documentation problems, and the like. The major
error was more problematic - due to a mistake in the way I was handling reading the configuration file
the HPN buffer was being set to 2MB pretty much no matter what. This has been corrected in the
4.4p1 HPNv12 patch but *not* in the 4.3p2 HPNv9 patch. This won't be an issue for most people but people
on very high speed links might have been artificially throttled because of this. While fixing this mistake
I also incorporated a new method for determining the maximum size of the HPN buffer. This means that the
size of the buffer will depend, in part, on configuration options the user chooses. These are explained in the
HPN12-README file. The end result of this all should be that the buffer won't be any larger than it
actually has to be.
I've started collecting data on HPN12 performance in terms of throughput, CPU usage, sensitivity to packet
loss and the like. I hope to have a paper on this work in the next 3 or 4 months.
Update 7.26.2006
Two minor changes. First, the version string now includes major and minor version. As of this
date it should report as hpn12v8. Second, there was a bug in the documentation in that it failed to
explain that the 'NONE' cipher will only work on bulk data transfers and is silently disabled
for all interactive sessions. It's a good compromise between performance and security.
Update 7.13.2006
Turns out that there was a bug in that the NoneSwitch configuration option wasn't accepting
yes/no modifiers. The NoneSwitch was just acting as a flag instead of an option. So I fixed it
so that -oNoneSwitch=yes/no works correctly. Otherwise no functionality was changed.
Update 5.19.2006
New version of the patch has been released. Please see the version history and
the patch information for more details. Comments or questions are appreciated.
Update 4.3.2006
We just found out that there was a typo in the patch for OpenSSH v4.3 with
the none switch. This would have caused a segmentation fault if the -R/-r option
was used. This typo has been fixed in the latest version of the patch.
Sorry about that.
Update 2.1.2006
We have just released a patch for OpenSSH v4.3. This is, again, in the
HPN-11 cycle so there isn't any additional functionality added. However,
due to a command line switch namespace collision we have had to change the
-w TCP receive window switch to -r in ssh and -R in scp.
Update 9.08.2005
We've just released a patch for OpenSSH v4.2. This is still in the
HPN-11 cycle so there isn't any different or new functionality at
this time.
Update 6.17.2005
We've just released an entirely new set of patches and removed all
of the older versions of the patches from the main site. If you'd
like to see the old site please go to
http://www.psc.edu/networking/projects/hpn-ssh/old-site/.
This set of patches brings OpenSSH 3.9, 4.0, 4.1 up to the same
level of functionality and provides some significant improvements
over the old patch. A new version checking system allows us to
provide almost full HPN speeds even if both ends of the connection
are not using HPN clients and servers. The speed improvement will
only be in the direction of whichever side is HPN. So an HPN server
will get a speed boost even if non-HPN clients connect to it.
Additionally, we've added the '-w' switch. This allows the
user to specify a TCP receive window size for that connection up to
the maximum allowed on that host. This doesn't require root
privileges nor will it affect other users on the system.
Additionally, we've decided to make the 'None' cipher available on
all versions using the '-z' switch. We no longer consider
this 'experimental' code but it should still be used at the user's
own risk. We think we've locked it up as much as possible but we
may have missed something. Please keep in mind that even with the
-z switch the authentication process is still fully
encrypted; only the bulk data transfer will happen in the clear.
Additionally, MAC will still be performed, which provides greater
assurance of the integrity of the data.
Update 5.11.2005
I've just put out the patch set for OpenSSH 4.0p1. This is
*experimental* and should be used with caution. We have found that
under some circumstances one of the buffers used by scp can grow
quite large rather quickly. This seems to be because the scp
process runs asymmetrically from the ssh transport. However, there
does seem to be a method to tell ssh to slow down if scp can't
write the data to disk fast enough. We've also added a new feature
we're calling user adjustable windows. The -w [size of receive
window in bytes] switch will set the local receive buffer of that
specific connection to a user defined size. The size of this window
cannot exceed the system defined maximum buffer size though. No
special permissions are necessary to set the buffer size. It is
very important to note that this will *only* set the local
receive buffer size. It's unlikely that users will ever be able to
set the remote receive buffer size for obvious reasons.
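The same per-connection behaviour can be seen with any ordinary socket: an unprivileged process may request a receive buffer size for its own socket, and the operating system decides how much to grant. The sketch below is an illustration of that general mechanism in Python, not the code used by the patch (which is C); the helper name and the 1MB request are arbitrary choices.

```python
import socket


def set_receive_buffer(sock: socket.socket, requested: int) -> int:
    """Ask for a receive buffer size and return what the OS reports back."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    requested = 1 << 20  # 1 MB, an arbitrary example value
    granted = set_receive_buffer(s, requested)
    print("requested", requested, "granted", granted)
```

The granted value may differ from the request: the system may cap it at its configured maximum, and some kernels report an adjusted figure. Only the local socket is affected, which matches the limitation described above.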
Update 3.25.2005
In all likelihood this is the last OpenSSH 3.9 patch we'll be
releasing. In this patch we remove all references to the
'unlimited' element we inserted in the buffer struct found in
buffer.c. It turns out that this was basically unnecessary once we
resolved some issues in channels.c. We've also updated channels.c
to handle the largewindow bug we discovered in a more practical
manner. Thanks to Darren Tucker for his help on that. Lastly, we
made a few minor modifications to scp.c to bring the size of its
read/write pipe buffers into closer alignment with the ssh
read/write pipe/buffers. The end result of this is a 60% reduction
in read/write syscalls when scp is the data source and a 20%
reduction when it is the data sink.
Update 1.15.2005
We discovered a minor problem in that the experimental none
cipher switching patch did not perform as expected. A failsafe
which I thought was in place to prevent switching to the none
cipher during interactive sessions wasn't working.
The result being that it was possible to send data in the clear.
However, to do this the user would have to explicitly pass
the -z switch to ssh. So if you didn't do that you did not send
data in plaintext. However, I urge all users of the
experimental patch to upgrade to the latest version as
soon as possible. In the new patch I explicitly test for the
tty_flag before the none cipher switch takes place. If the tty_flag
is true then the cipher switch fails silently and the session
continues with the original encryption cipher (this silent failure
mode may change in future releases of this patch). Additionally, we
now also check to make sure the -T (no_tty_flag) switch is not set
before enabling the none cipher.
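The decision described above can be sketched as a small guard function. This is only a model of the logic as stated in this note, not the patch's C code: tty_flag and no_tty_flag are the names used above, while choose_cipher, none_requested and the cipher name strings are hypothetical.

```python
def choose_cipher(current_cipher: str, none_requested: bool,
                  tty_flag: bool, no_tty_flag: bool) -> str:
    """Return the cipher to use for the rest of the session."""
    if not none_requested:
        # The user never passed -z, so nothing changes.
        return current_cipher
    if tty_flag or no_tty_flag:
        # Interactive session, or -T was given: fail silently and
        # keep the original encryption cipher.
        return current_cipher
    return "none"


print(choose_cipher("aes128-ctr", True, tty_flag=True, no_tty_flag=False))
print(choose_cipher("aes128-ctr", True, tty_flag=False, no_tty_flag=False))
```

Keeping every refusal path on the encrypted cipher means a mistake in the checks errs toward encryption rather than toward plaintext.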
Update 1.12.2005
We've started putting hpn-ssh on more production systems so we've
been able to get a better idea of how it might perform in the wild.
Initial results look good but it did illustrate the need to have
your network buffers tuned properly. hpn-ssh will not make ssh
run faster if your system is mistuned. Essentially it can only
work with what it has. Also, never forget that disk I/O operations
will be a likely bottleneck in some systems. An interesting test is
to transfer a large file to disk and then transfer the same file to
/dev/null. If there is a notable difference in throughput you are
disk bound. Also, we're about to start up more active development
again. We hope to tighten up some code and rethink some
assumptions. This will likely just be incremental improvements.
However, we also have some other ssh projects we'll be pursuing.
More on that in a couple of months.
Update 11.23.2004
Quick note here. The previously discovered problem with corrupted
MACs on input actually stems from a hardware and/or driver bug with
the Intel e1000 cards. We think this fully justifies our continued
use of HMAC even without data encryption - we'd not have found this
bug without it. We have seen some problems with the SysKonnect
interface we replaced it with though. It turns out that under high
loads the vm.min_free_kbytes setting is too low and crashes the interface.
We set it to 12MB and it seems to work fine now.
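A MAC catches this kind of corruption because the receiver recomputes the tag over the data it actually received and compares it with the tag that was sent; a single flipped bit is enough to make them differ. A minimal sketch using Python's standard hmac module follows. The key, the payload and the choice of SHA-256 are assumptions for illustration and do not reflect the MAC algorithm negotiated in any particular ssh session.

```python
import hashlib
import hmac


def make_tag(key: bytes, payload: bytes) -> bytes:
    """Compute the MAC the sender attaches to a packet."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify(key: bytes, payload: bytes, received_tag: bytes) -> bool:
    """Recompute the MAC on the receiver and compare in constant time."""
    return hmac.compare_digest(make_tag(key, payload), received_tag)


key = b"example-session-mac-key"          # hypothetical key
payload = b"bulk data sent in the clear"  # hypothetical packet contents
tag = make_tag(key, payload)

# Simulate a hardware fault flipping one bit in transit.
corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]

print(verify(key, payload, tag))    # True
print(verify(key, corrupted, tag))  # False
```

Nothing in this check depends on the payload being encrypted, which is why integrity protection still works when the none cipher is in use.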
Update 11.19.2004
Last night we conducted an endurance test using the cipher switching
version of hpn-ssh. Using pipes we pushed data from /dev/zero on a
host at NCSA to /dev/null on a host at PSC. We were able to move
4.343 terabytes of data in 17 hours 38 minutes at an average rate
of 71MBps. Instantaneous transfer rates over 100MBps were seen with
some regularity. Please note that this test did not make use of the
disk subsystem and therefore the results should be viewed as having
occurred under laboratory conditions. In more practical situations
users will see their throughput being limited by the speed of their
disk IO.
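The average rate can be rechecked from the totals given above. The sketch below assumes binary units (1 terabyte taken as 1024 x 1024 megabytes); that is an assumption, since the note does not say which convention was used, and the function name is made up.

```python
def average_rate_mb_per_s(terabytes: float, hours: int, minutes: int) -> float:
    """Average transfer rate in MB/s, using binary units (assumed)."""
    seconds = hours * 3600 + minutes * 60
    megabytes = terabytes * 1024 * 1024
    return megabytes / seconds


rate = average_rate_mb_per_s(4.343, 17, 38)
print(round(rate, 1), "MB/s")
```

Under that assumption the result lands between 71 and 72 MB/s, in line with the figure quoted above. With decimal units the same totals work out to roughly 68 MB/s.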
Update 11.12.2004
We're back from the Supercomputing Conference and I want to thank
everyone who saw the poster and talked to me during the course of
the past week. There has been a lot of interest and, hopefully, a
lot of support will be forthcoming. In the meantime I've decided to
release the combined hpn and none cipher patch. This is the first
iteration of the patch and the none cipher support might be
changing in the near future. So long term compatibility is not assured.
However, what this does is allow for midstream cipher switching. In
particular the undocumented '-z' switch to scp will provide for
encrypted authentication and unencrypted data transfer. We've seen
speeds of over 692Mbps using this patch. This will *not* work with
interactive (TTY) sessions - which you don't really want anyway. If
the server doesn't support the none switch then the connection will
fail silently. Regardless, use this patch at your own risk and view
this as an experimental patch. Lastly, the use of the none cipher
is *within* spec for the SSH v2.0 protocol. OpenSSH simply decided
to not implement it for obvious reasons.
Update 10.28.2004
We'll be presenting this work at the SuperComputing 2004
Conference in Pittsburgh, Pennsylvania November 9-12. The
poster presentation is officially from 5-7pm on the 9th so if you
want to stop by and say hello we'll be there. Also, you can see
some of the throughput rates we've been getting by going to the
Internet2 Weekly Top Flow Reports. We start showing up during the week of September
20, 2004. Look in the Non-Measurement flow section for flows
between PSC-NCNE and NCSA using port 52222. We've recently hit
50MB/s and we were strictly processor limited.
Update 9.27.2004
We have uncovered some problems with using 9000 byte packets on the
linux system. When we pull data from the linux system we
consistently get MAC errors (ssh checksum indicating data
corruption). We were able to recreate this with an unpatched SSH server
so it doesn't seem to be a result of our patch. Also, the
previously mentioned user report of asymmetrical transfer rates was
determined to be an issue with the system buffers and not the SSH
code. If anyone else is seeing this behaviour please let us know.
Update 9.24.2004
We resolved the problem on the linux 2.4 autotuning kernel by
upgrading to 2.6. There also seems to be a minor problem with the
way linux does memory accounting for the windows which might cause
a problem with the rcv_ssthresh. However, users with non-autotuning
kernels should not see a problem. As of this evening we were able
to sustain 35MBps (280Mbps) in both directions. The OS X issue is
still being explored.
Update 09/24/2004
We have recently heard of and experienced some problems with
asymmetrical performance. The first is on a linux 2.4 autotuning
kernel. In this case it seems that a tcp bug *might* be preventing
the window from updating properly. On Mac OS X (10.3.4, 1.33GHz CPU)
we were able to get the throughput asymmetry down to a 2:1 ratio (8.0 MBps
sending v. 3.5 MBps receiving) but there is indication of some CPU
bounding issues. We have another report of a user also seeing
highly asymmetrical throughput but we've yet to be able to
determine the cause. Reports from other users would be *very*
useful at this stage.