ZPROX v0.3 - (c) 1998 Willy Tarreau

WARNING: Zprox is still alpha software. It has just been completed and may
require a long "bug hunting" period before it becomes a stable and usable
proxy.

0/ Changes:
===========

0.3 : Completely rewrote the compress/decompress function transfer_data()
      so that deflate and inflate are applied to the largest possible amount
      of data at once, in order to improve the compression ratio by limiting
      the number of flushes, which occurred very often in v0.1 and v0.2.

0.2 : Many bugs fixed from v0.1:
      - socket descriptors are now correctly released if the connect fails.
      - fixed some buffer limit problems in the compress/decompress code
        which occasionally made some connections hang.
      - improved speed by making better use of the buffers.
      - added statistics to measure inflate and deflate performance.

0.1 : First release. It was supposed to be buggy and it was :-)

I/ Why ZPROX ?
==============

Zprox was born for two reasons:

1) After working on RIGAT (http://www-miaif.lip6.fr/rigat/) with a friend of
   mine, Nicolas Pronine (pronine@aemiaif.lip6.fr), I wanted to rewrite a
   very simple and generic TCP-level proxy which, like RIGAT, could be used
   nearly every time a TCP connection is needed, in order to interconnect
   hosts on networks that aren't directly routable. RIGAT, of course, does
   this job far better, but is not yet so simple to adapt to special
   situations.

2) I was fed up with waiting so long for my modem to retrieve web pages that
   were easily compressible. I knew the solution was to compress the data
   exchanged by the two modems; that should be obvious to anyone who has
   tried at least once to compress HTML or plain text documents. But I had
   neither the time nor the courage to write an "on the fly" compressor. So
   I had the idea of using SSH's proxying feature and its compression
   capability to create a local access point on my host to a remote Squid
   proxy. After a few tests, I obtained about 11 kB/s on HTML and PostScript
   with my 33600 modem, which did only 3.7 kB/s alone. This sounded very
   interesting, and I wanted to do better, thinking that the cryptographic
   algorithms in SSH might hurt the data transfer rate a little. Realizing
   that I had no use for crypto on my web transfers, I decided to write my
   own proxy, which would be able to compress data on the fly and would
   accept any number of asynchronous connections. During one night and the
   following day, I wrote the most buggy program I ever did :-)

II/ How does it work ?
======================

Zprox, unlike Webroute, the first simple web proxy I wrote a few years ago
(http://www-miaif.lip6.fr/willy/webroute/), doesn't fork to handle each new
connection! Although forking worked (Webroute is reported to still work on
several sites), it was very dirty for many reasons. Zprox instead relies on
one single listening socket and one single select() loop.

As in every good proxy (including RIGAT, of course), the connect() call is
non-blocking, to avoid freezing the proxy each time a connection is
requested to a long distance host. "Long distance" is to be understood at
the network level. For example, if I can ping a host thousands of kilometers
away in less than 30 ms, it is not that distant. But the web proxy I use
with Zprox sits two routers behind the remote modem, and the modem lines
give the data a 160 ms travel time over the network, which does qualify as
long distance here. (A small sketch of this technique is given at the end of
this section.)

For the compression, I used ZLIB 1.0.4, from Jean-Loup Gailly and Mark
Adler. It is the base of the very well-known GZIP program. (Note that SSH
also uses this library.)
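For readers who are not familiar with the technique, here is a minimal
sketch of such a non-blocking connect() as used inside a select() loop.
This is an illustration only, not Zprox's actual code; the function names
are mine, but the system calls are standard:

  /*
   * Illustrative sketch of a non-blocking connect() (NOT zprox's code):
   * the socket is put in non-blocking mode before connect(), so the call
   * returns immediately with EINPROGRESS, and select() later reports the
   * socket writable once the connection has been established (or failed).
   */
  #include <sys/types.h>
  #include <sys/socket.h>
  #include <netinet/in.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <errno.h>

  /* start a connection without blocking; returns the socket or -1 on error */
  int start_connect(const struct sockaddr_in *dest)
  {
      int fd = socket(AF_INET, SOCK_STREAM, 0);
      if (fd < 0)
          return -1;
      fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
      if (connect(fd, (const struct sockaddr *)dest, sizeof(*dest)) < 0 &&
          errno != EINPROGRESS) {
          close(fd);       /* immediate failure: release the descriptor */
          return -1;
      }
      return fd;           /* watch fd for writability in the select() loop */
  }

  /* once select() reports fd writable, check whether the connect succeeded */
  int connect_result(int fd)
  {
      int err = 0;
      socklen_t len = sizeof(err);
      getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
      return err;          /* 0 on success, otherwise the connect error */
  }

The descriptor is simply added to the write set watched by select(), and the
proxy keeps serving all the other connections in the meantime.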
The main difference between Zprox and SSH's proxy is that the user can not
only choose the compression strength, but can also choose it separately for
each of the two directions of communication. The idea is that a telnet
session, for example, can be significantly sped up by compressing data from
the server to the client (the result of an "ls -l", for instance, compresses
very well), but it is also slowed down when compressing from the client to
the server, because of the time and overhead generated for each single
character. This is also the case with HTTP connections: unless keep-alive
connections are used, there is little chance that the HTTP headers from the
client will compress well, because they are often too small (although with
certain browsers such as MSIE4, headers are so long that they can be
compressed).

III/ How to use it ?
====================

The user simply selects, for each direction of communication, the operation
to be performed by the proxy. It can be:
  - no compression, represented by the digit 0;
  - compression, by specifying a digit between 1 and 9 (1 corresponds to the
    lowest and fastest compression level, 9 to the highest and slowest);
  - decompression, represented by the character 'u', which restores the
    original data from a compressed stream.

One and only one of these characters must be specified after the letter 'i'
for the input stream, or the letter 'o' for the output stream. By input
stream, I mean the stream coming from the server and read by the client. The
output stream goes from the client to the server through the proxy. (A small
illustration of how these options map onto zlib is given at the end of this
section.)

Of course, if you choose to compress a stream, you'll need a second proxy to
decompress it. If you misconfigure the proxies and forget to decompress a
compressed stream, you will simply retrieve useless data. If you try to
decompress a stream which isn't compressed, you will generate an error,
since Zlib's inflate method cannot handle anything but deflated data. In
this case, the connection will be shut down in the direction where the error
occurred, possibly followed by a complete disconnection when the server or
the client reacts to the shutdown.

Unlike RIGAT, Zprox works with only one remote server. It is not possible to
specify a list of servers, nor to choose one depending on the user and/or
the client host. But this is not the goal of Zprox, which is mainly intended
to be used on Web connections. This means that if you use it to access the
web, you have two choices:
  - you access *all* the web through a proxy such as Squid, in which case
    Squid will be designated as the remote server, and the local Zprox
    instance will be referenced as an HTTP proxy;
  - you access only *one* web server, in which case it will be the remote
    server, and the local instance of Zprox will simply be accessed as the
    web server. In practice, this kind of setup will rarely be used for
    HTTP, but rather for telnet or POP3 sessions.
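To illustrate how the per-direction options ('0', '1'..'9', 'u') could map
onto zlib, here is a small hypothetical sketch. The structure and function
names are mine, not Zprox's internals; only the zlib calls are real:

  /*
   * Hypothetical sketch (not zprox's actual code): each direction of a
   * connection gets its own zlib stream, configured from the character
   * given after -i or -o: '0' = pass-through, '1'..'9' = deflate at that
   * level, 'u' = inflate.
   */
  #include <string.h>
  #include <zlib.h>

  enum dir_mode { MODE_COPY, MODE_DEFLATE, MODE_INFLATE };

  struct direction {
      enum dir_mode mode;
      z_stream strm;                         /* unused when MODE_COPY */
  };

  int setup_direction(struct direction *d, char opt)
  {
      memset(&d->strm, 0, sizeof(d->strm));  /* let zlib use its own malloc */
      if (opt == '0') {
          d->mode = MODE_COPY;               /* forward data unchanged */
          return Z_OK;
      } else if (opt == 'u') {
          d->mode = MODE_INFLATE;            /* restore the original data */
          return inflateInit(&d->strm);
      } else {
          d->mode = MODE_DEFLATE;            /* compress at the given level */
          return deflateInit(&d->strm, opt - '0');
      }
  }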
IV/ Examples of application:
============================

1) Speeding up a web proxy over a modem line, using Squid on the server:
-------------------------------------------------------------------------

* Originally:
-------------

Full IP
   |       /-------\                                         /--------\
   \       | Squid |                                         | Client |
    ~-----+  based +------[modem]--~~~~///~~~~--[modem]------+  (web  |
           | proxy |                                         |browser)|
           \-------/                                         \--------/
          prox:3128                  PPP                    netscape using
           (squid)                                          prox:3128 as
                                                             HTTP proxy.

* First speed-up, using the same Squid proxy and 2 Zprox:
---------------------------------------------------------

Full IP      S1          Z1                        Z2           C1
   |      /-------\   /-------\                 /-------\   /--------\
   \      | Squid |   | Zprox |                 | Zprox |   |        |
    ~----+  based +---+  -i6  +-[mdm]~~///~~[mdm]+  -iu +---+ Client |
          | proxy |   |  -o0  |                 |  -o0  |   |        |
          \-------/   \-------/                 \-------/   \--------/
         prox:3128   prox:3125        PPP      local:3125   netscape using
          (squid)    zprox using               using prox   local:3125 as
                     prox:3128 as              :3125 as     HTTP proxy.
                     remote server.            rmt serv.

This diagram shows that C1 issues HTTP requests to Z2, which leaves them
uncompressed and sends them directly to Z1, which of course doesn't modify
them either before handing them to S1. S1 analyses them and retrieves the
web page from the Internet or from its local cache, then sends the response
to Z1, which compresses the data at level 6 (a good compression/time ratio).
The compressed stream goes through the modem line to Z2, which uncompresses
it before delivering the original data to C1.

Although this already works very well, it is possible to gain even more: C1
often sends several simultaneous requests (the frames composing a page,
icons...). In this case, objects take some time to display, which may chop
the stream into small pieces. When the stream is fragmented, the compression
ratio drops, because each deflate can only work on the data present in Z1's
buffers at that moment. For this reason, the user can select a larger buffer
size. Moreover, a high water / low water mechanism has been implemented to
avoid calling deflate and/or accessing the network for amounts of data that
are too small (see the sketch at the end of this example).

To improve things even more, the user may install a local instance of Squid.
It can handle multiple continuous data streams better than any web browser,
and can cache data. This is particularly interesting for icons, which
generate connections for small amounts of data.

* The following diagrams involve 2 Zprox and 2 Squid:
------------------------------------------------------

- On the server side:

Full IP      S1          Z1
   |      /-------\   /-------\
   \      | Squid |   | Zprox |
    ~----+  based +---+  -i6  +----[modem]---~// PPP link ... client
          | proxy |   |  -o0  |
          \-------/   \-------/
         prox:3128   prox:3125
          (squid)    zprox using
                     prox:3128 as
                     remote server

- On the client side:

                             Z2          S2          C1
                          /-------\   /-------\   /--------\
                          | Zprox |   | Squid |   |        |
 ...PPP link //~--[modem]-+  -iu  +---+  based +--+ Client |
                          |  -o0  |   | proxy |   |        |
                          \-------/   \-------/   \--------/
                         local:3125  local:3128   netscape using
                         using prox  using local  local:3128 as
                         :3125 as    :3125 as     HTTP proxy.
                         rmt serv.   parent.

S1, Z1 and Z2 remain configured exactly as in the previous case. The only
modification is the insertion of S2, the Squid proxy which listens on port
3128 on the local host and uses Z2 as its single parent (which means that it
forwards to this parent all the requests it cannot satisfy from its cache).

First experimental results:
---------------------------

I'm currently using Zprox this way, and I'm getting an average throughput of
14.5 kB/s on HTML pages, even with frames, and on a PostScript report of
445 kB (which, of course, weren't in the local Squid cache). This is nearly
4 times as fast as the direct modem connection (3.7 kB/s), and about 32%
faster than SSH's proxying feature. When downloading already compressed data
(ZIP, TGZ, JPG, GIF...), there is no noticeable overhead, because the
deflate method is documented to expand its input by *at most* 0.1% plus 12
bytes. So it is worth using Zprox together with Squid.
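To make the high water / low water idea mentioned above more concrete, here
is a hypothetical sketch; the names and thresholds are invented for this
example and are not Zprox's actual values:

  /*
   * Hypothetical illustration of a high water / low water decision: input
   * is accumulated and deflate() is only called when enough data is
   * present, so each call compresses a large block and flushes happen
   * less often.
   */
  #define BUFSIZE    16384
  #define HIGH_WATER  8192   /* compress as soon as this much is buffered */
  #define LOW_WATER    256   /* below this, prefer waiting for more data  */

  struct side {
      unsigned char buf[BUFSIZE];
      int len;                 /* bytes waiting to be compressed */
  };

  /* decide whether it is worth calling deflate() right now */
  int should_deflate(const struct side *s, int peer_closed, int idle)
  {
      if (s->len >= HIGH_WATER)
          return 1;            /* enough data for a good ratio */
      if (peer_closed && s->len > 0)
          return 1;            /* the stream is ending: flush what remains */
      if (idle && s->len >= LOW_WATER)
          return 1;            /* nothing more is coming soon */
      return 0;                /* too little data: avoid a tiny flush */
  }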
2) Mail retrieval speed-up with 2 Zprox (POP3):
-----------------------------------------------

On the local host, install Zprox exactly as in the first case with Squid:

   POP          Z1                        Z2           C1
 /-------\   /-------\                 /-------\   /--------\
 | POP3  |   | Zprox |                 | Zprox |   |        |
 + mail  +---+  -i9  +-[mdm]~~///~~[mdm]+  -iu +---+ Client |
 | server|   |  -o0  |                 |  -o0  |   |        |
 \-------/   \-------/                 \-------/   \--------/
  mail:110   mail:3110        PPP      local:110    Mail reader using
             zprox using               using mail   local:110 as
             mail:110 as               :3110 as     POP3 server.
             remote server.            rmt serv.

Z2 has to run on local port 110, because most mail readers don't let the
user choose a port; it is fixed at 110, the POP3 default.

3) Mail sending with SMTP (outgoing compression)
------------------------------------------------

   SMTP         Z1                        Z2           C1
 /-------\   /-------\                 /-------\   /--------\
 | SMTP  |   | Zprox |                 | Zprox |   |        |
 + mail  +---+  -i0  +-[mdm]~~///~~[mdm]+  -i0 +---+ Client |
 | server|   |  -ou  |                 |  -o9  |   |        |
 \-------/   \-------/                 \-------/   \--------/
  mail:25    mail:3025        PPP      local:25     Mail sender using
             zprox using               using mail   local:25 as
             mail:25 as                :3025 as     SMTP server.
             remote server.            rmt serv.

Mail is sent to the local SMTP proxy, making the remote SMTP server appear
as a local one.

4) Telnet session compression
-----------------------------

It can be useful to compress a telnet session. Compressing the incoming
stream really improves speed over modem lines: try it with an
'ls -l /usr/bin' and you'll understand what I mean. Moreover, since the
compressed stream is meaningless on its own, it may be useful to also
compress the outgoing connection (the keyboard input) to make it
"unsniffable". This slows the session down a bit, but acts as a crude form
of encryption, which protects against simple sniffers such as those included
in root-kits. Note that if you really need security, Zprox is not suited to
the job; prefer SSH, which is strongly secured. But if you can't use SSH for
some reason, then Zprox can help you. Here is an example:

    T1          Z1                        Z2           C1
 /--------\  /-------\                 /-------\   /--------\
 | TELNET |  | Zprox |                 | Zprox |   | TELNET |
 + server +--+  -i1  +-[mdm]~~///~~[mdm]+  -iu +---+ Client |
 |        |  |  -ou  |                 |  -o6  |   |        |
 \--------/  \-------/                 \-------/   \--------/
  serv:23    serv:3023        PPP      local:3023   telnet connecting
             zprox using               using serv   local:3023 to
             serv:23 as                :3023 as     access telnetd
             remote server.            rmt serv.    server on T1.

5) Transactional applications on IP over serial lines
------------------------------------------------------

Some central systems use transaction-driven applications. The clients are
terminals which receive a full page at once, let the user fill in the
fields, and send the entire response back when the user hits a transmit key.
Many of these systems can work over IP, which allows them to serve terminals
nation-wide over X.25 lines, for instance. When multiple users share the
same line, access times may become significant. Using Zprox between the
central server and the terminals should therefore greatly decrease response
times. As these responses may already take several seconds, Zprox should be
used at the maximum compression level in both directions: the time needed
for compression will not noticeably increase this latency. The setup is
exactly the same as for the telnet session, except that the '-i1' and '-o6'
arguments could be replaced with '-i9' and '-o9'.

6) X-Window proxying
--------------------

As for telnet sessions, it can be interesting to forward X-Window sessions
over a compressed link.
The conditions are strictly the same: do not compress too much on the output
way, to avoid time overhead, and compress reasonably on the input way, to
benefit from long repetitive patterns. Do not attempt to transfer huge
amounts of data! On my PC, I can forward MPEGPLAY at its nominal playing
speed when no compression is used, but it slows down to 7 frames/s when
compressing with -o6. On the other hand, it can support 70 "xeyes" at about
30% CPU load when the mouse moves very fast (X11 takes more CPU than Zprox
in this case). An example follows. Don't forget that this time, the client
host is the X server and the server host is the X client, so the connections
run in the opposite direction from what you might expect.

  SERVER        Z1                        Z2          CLIENT
 /--------\  /-------\                 /-------\   /--------\
 |X-Window|  | Zprox |                 | Zprox |   |X-Window|
 + client +--+  -iu  +-[mdm]~~///~~[mdm]+  -i1 +---+ server |
 |(xterm) |  |  -o6  |                 |  -ou  |   |XFree86 |
 \--------/  \-------/                 \-------/   \--------/
   xterm     serv:6005        PPP      local:3005   X server on
   using     zprox using               using local  display :0
  serv:5 as  local:3005 as             :6000 as     (port 6000)
  display.   remote server.            rmt serv.

From the client, you first establish a telnet connection to the server, and
launch "xterm -display serv:5". If your X authorizations are set correctly,
you'll get an xterm on the client's display. (Remember that X display :N is
reached on TCP port 6000+N: "serv:5" above corresponds to port 6005, where
Z1 listens, and the client's display :0 is port 6000, where Z2 forwards.)

7) Other protocols
------------------

Many other protocols can be proxied by Zprox, such as news (NNTP). It is
always the same mechanism: you just have to know in which direction(s) the
bulk of the data will flow, and configure Zprox to suit your needs.

FTP cannot be proxied by Zprox, because in an FTP session, the outgoing
connection you make is only for control: the data connection comes in from
the server, on a port specified by your client. So Zprox could proxy the
control connection, which is pointless, but not the data connection, except
by analysing the control stream to detect PORT commands and automatically
start another relay. This is how masquerading does it, but it isn't worth it
here, because files retrieved from FTP servers are usually already
compressed (sometimes even with BZIP2). It is also possible to use Squid's
FTP proxying with web browsers; in this case Zprox can be used, but it won't
be very useful.

V/ How to compile it ?
======================

To compile Zprox, you'll need zlib release 1.0.4 or 1.1.2. Both have been
tested and work. Perhaps there are small bugs in one or the other, but I
really don't think so.

At the moment, I've used it on Linux 2.0 and 2.1. I've done a few tests on
Windows NT 4.0 with gcc, but I already had surprises such as connect()
refusing to work asynchronously, so we'll see... I've also tested it on Bull
OPEN7 v3.13. It works slowly, but it works. This system supports
asynchronous connect(), but doesn't support non-blocking read/write any more
than WinNT does. I must say I don't really know whether I should implement
non-blocking I/O or not. It could even improve network performance, thanks
to the ability to build bigger packets at a time. But if many systems don't
support this feature, it's not that interesting.
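Just to illustrate the idea, non-blocking reads and writes would look
roughly like this (a sketch only; this is not in Zprox today):

  /*
   * Sketch only (NOT what zprox currently does): the descriptor is put in
   * non-blocking mode, and a refused or partial write simply leaves the
   * remaining data in the buffer until select() reports the socket
   * writable again.
   */
  #include <unistd.h>
  #include <fcntl.h>
  #include <errno.h>

  void set_nonblock(int fd)
  {
      fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
  }

  /* try to push <len> bytes; returns the number accepted, or -1 on error */
  int try_write(int fd, const unsigned char *buf, int len)
  {
      int ret = write(fd, buf, len);
      if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
          return 0;            /* socket full: keep the data, retry later */
      return ret;
  }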
I'd like to test it on SunOS or Solaris, but I no longer have access to a
SPARC station. I think it could complain about some includes, but after a
few corrections, there is no reason for it not to work. Could someone test
it on FreeBSD with many connections (a hundred xeyes, for example)? FreeBSD
is reported to support very high loads easily.

As I'm lazy, I haven't written any Makefile yet: it was very late, and the
mere fact of stopping to write one would have put me to sleep! Compilation
is still simple enough (provided you have a copy of zlib in the current
directory):

  gcc -O2 -s -o zprox zprox.c -Lzlib-1.0.4 -lz

VI/ Possible extensions
=======================

As stated above, my first goal with Zprox was to write a generic proxy which
could easily be adapted to any usage with minor code changes. If you need to
proxy some protocols, to add cryptography, or other compression algorithms
(lossless or lossy), you might use Zprox as a base, because the most
annoying part of writing a proxy is implementing the select() loop with all
its error handling. But be careful: this proxy is strictly
connection-oriented, so it is limited to TCP (and perhaps, with minor
changes, to other protocols such as SPX). Message-oriented proxying is much
harder to implement (RIGAT was designed to work either with or without
connections, but the second mode isn't finished yet). So you can't use Zprox
to mount NFS drives, for example, but it may be possible to compress NetBIOS
sessions (TCP port 139).

VII/ Reliability
================

As long as Zprox remains alpha or beta software, I won't be able to tell you
what may happen to your data. You can try to crash it in any way you like;
I'll be interested in the results. Of course, you can have fun opening
multiple connections to a chargen service to eat your CPU, but I don't think
Zprox will crash. It might be more sensitive to very small buffer sizes
(which wouldn't contain enough data for deflate() to work) or to
disconnections occurring between the select() and the read() or write()
calls. In any case, when using it for test purposes only, with a few
connections, you can reduce the maximum number of file descriptors that can
be allocated. Feel free to play with it, but remember this: if you lose
data, I won't be able to do anything for you. I personally use it at my own
risk, so use it at your own risk too.

VIII/ TO DO
===========

Zprox works very well with large buffer sizes (16 kB). But these buffers
need to be allocated 4 at a time (2 for each direction), which means that at
each connect we have to malloc 64 kB, and this takes some time on slow
systems. To improve this, Zprox uses its own pool of used and free buffers,
which reduces the number of malloc() and free() calls by simply assigning a
free pointer to a new buffer. But to be really useful, this would require a
lot of preallocated buffers, which consume memory even when not used.
Perhaps it would be interesting to automatically free() one half of the
available buffers after a predefined time period, in order to reduce memory
usage when Zprox is idle.

Newer versions can be found at:
  http://www-miaif.lip6.fr/willy/zprox/

Have fun!

Willy Tarreau