ZPROX v0.3 - (c) 1998 Willy Tarreau

WARNING: Zprox is still alpha software. It has just been completed and may
require a long "bug hunting" period before it becomes a stable and usable
proxy.

0/ Changes:
===========

0.3 : Completely rewrote the compress/decompress function transfer_data()
      so that deflate and inflate are applied to the largest possible amount
      of data at once, in order to improve the compression ratio by limiting
      the number of flushes, which occurred very often in v0.1 and v0.2.

0.2 : Many bugs fixed from v0.1:
      - socket descriptors are now correctly released if the connect fails.
      - fixed some buffer limit problems in the compress/decompress code
        which occasionally made some connections hang.
      - improved speed by making better use of the buffers.
      - added statistics to measure inflate and deflate performance.

0.1 : First release. It was supposed to be buggy and it was :-)

I/ Why ZPROX ?
==============

Zprox was born for two reasons:

1) After working on RIGAT (http://www-miaif.lip6.fr/rigat/) with a friend of
   mine, Nicolas Pronine (pronine@aemiaif.lip6.fr), I wanted to rewrite a
   very simple and generic TCP-level proxy which, like RIGAT, could be used
   nearly every time a TCP connection is needed, in order to interconnect
   hosts on networks that aren't directly routable. RIGAT, of course, does
   this job far better, but is not yet so simple to adapt to special
   situations.

2) I was fed up with waiting so long for my modem to retrieve web pages that
   were easily compressible. I knew the solution was to compress the data
   exchanged by the two modems; that should be obvious to anyone who has
   tried at least once to compress HTML or plain text documents. But I had
   neither the time nor the courage to write an "on the fly" compressor. So
   I had the idea of using SSH's proxying feature and its compression
   capability to create a local access point on my host to a remote Squid
   proxy. After a few tests, I obtained about 11 kB/s on HTML and PostScript
   with my 33600 modem, which did only 3.7 kB/s alone. This sounded very
   interesting, and I wanted to do better, thinking that the cryptographic
   algorithms in SSH might hurt the data transfer rate a little. Realizing
   that I had no use for crypto on my web transfers, I decided to write my
   own proxy, which would be able to compress data on the fly and would
   accept any number of asynchronous connections. During one night and the
   following day, I wrote the most buggy program I ever did :-)

II/ How does it work ?
======================

Zprox, unlike Webroute, the first simple web proxy I wrote a few years ago
(http://www-miaif.lip6.fr/willy/webroute/), doesn't fork to handle each new
connection! Although forking worked (Webroute is reported to still work on
several sites), it was very dirty for many reasons. Zprox instead relies on
one single listening socket and one single select() loop.

As in every good proxy (including RIGAT, of course), the connect() call is
non-blocking, to avoid freezing the proxy each time a connection is
requested to a long distance host. "Long distance" is to be understood at
the network level. For example, if I can ping a host thousands of kilometers
away in less than 30 ms, it is not that distant. But the web proxy I use
with Zprox sits two routers behind the remote modem, and the modem lines
give the data a 160 ms travel time over the network, which does qualify as
long distance here. (A small sketch of this technique is given at the end of
this section.)

For the compression, I used ZLIB 1.0.4, from Jean-Loup Gailly and Mark
Adler. It is the base of the very well-known GZIP program. (Note that SSH
also uses this library.)
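For readers who are not familiar with the technique, here is a minimal
sketch of such a non-blocking connect() as used inside a select() loop.
This is an illustration only, not Zprox's actual code; the function names
are mine, but the system calls are standard:

  /*
   * Illustrative sketch of a non-blocking connect() (NOT zprox's code):
   * the socket is put in non-blocking mode before connect(), so the call
   * returns immediately with EINPROGRESS, and select() later reports the
   * socket writable once the connection has been established (or failed).
   */
  #include <sys/types.h>
  #include <sys/socket.h>
  #include <netinet/in.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <errno.h>

  /* start a connection without blocking; returns the socket or -1 on error */
  int start_connect(const struct sockaddr_in *dest)
  {
      int fd = socket(AF_INET, SOCK_STREAM, 0);
      if (fd < 0)
          return -1;
      fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
      if (connect(fd, (const struct sockaddr *)dest, sizeof(*dest)) < 0 &&
          errno != EINPROGRESS) {
          close(fd);       /* immediate failure: release the descriptor */
          return -1;
      }
      return fd;           /* watch fd for writability in the select() loop */
  }

  /* once select() reports fd writable, check whether the connect succeeded */
  int connect_result(int fd)
  {
      int err = 0;
      socklen_t len = sizeof(err);
      getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
      return err;          /* 0 on success, otherwise the connect error */
  }

The descriptor is simply added to the write set watched by select(), and the
proxy keeps serving all the other connections in the meantime.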
The main difference between Zprox and SSH's proxy is that the user can not
only choose the compression strength, but can also choose it separately for
each of the two directions of communication. The idea is that a telnet
session, for example, can be significantly sped up by compressing data from
the server to the client (the result of an "ls -l", for instance, compresses
very well), but it is also slowed down when compressing from the client to
the server, because of the time and overhead generated for each single
character. This is also the case with HTTP connections: unless keep-alive
connections are used, there is little chance that the HTTP headers from the
client will compress well, because they are often too small (although with
certain browsers such as MSIE4, headers are so long that they can be
compressed).

III/ How to use it ?
====================

The user simply selects, for each direction of communication, the operation
to be performed by the proxy. It can be:
  - no compression, represented by the digit 0;
  - compression, by specifying a digit between 1 and 9 (1 corresponds to the
    lowest and fastest compression level, 9 to the highest and slowest);
  - decompression, represented by the character 'u', which restores the
    original data from a compressed stream.

One and only one of these characters must be specified after the letter 'i'
for the input stream, or the letter 'o' for the output stream. By input
stream, I mean the stream coming from the server and read by the client. The
output stream goes from the client to the server through the proxy. (A small
illustration of how these options map onto zlib is given at the end of this
section.)

Of course, if you choose to compress a stream, you'll need a second proxy to
decompress it. If you misconfigure the proxies and forget to decompress a
compressed stream, you will simply retrieve useless data. If you try to
decompress a stream which isn't compressed, you will generate an error,
since Zlib's inflate method cannot handle anything but deflated data. In
this case, the connection will be shut down in the direction where the error
occurred, possibly followed by a complete disconnection when the server or
the client reacts to the shutdown.

Unlike RIGAT, Zprox works with only one remote server. It is not possible to
specify a list of servers, nor to choose one depending on the user and/or
the client host. But this is not the goal of Zprox, which is mainly intended
to be used on Web connections. This means that if you use it to access the
web, you have two choices:
  - you access *all* the web through a proxy such as Squid, in which case
    Squid will be designated as the remote server, and the local Zprox
    instance will be referenced as an HTTP proxy;
  - you access only *one* web server, in which case it will be the remote
    server, and the local instance of Zprox will simply be accessed as the
    web server. In practice, this kind of setup will rarely be used for
    HTTP, but rather for telnet or POP3 sessions.
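To illustrate how the per-direction options ('0', '1'..'9', 'u') could map
onto zlib, here is a small hypothetical sketch. The structure and function
names are mine, not Zprox's internals; only the zlib calls are real:

  /*
   * Hypothetical sketch (not zprox's actual code): each direction of a
   * connection gets its own zlib stream, configured from the character
   * given after -i or -o: '0' = pass-through, '1'..'9' = deflate at that
   * level, 'u' = inflate.
   */
  #include <string.h>
  #include <zlib.h>

  enum dir_mode { MODE_COPY, MODE_DEFLATE, MODE_INFLATE };

  struct direction {
      enum dir_mode mode;
      z_stream strm;                         /* unused when MODE_COPY */
  };

  int setup_direction(struct direction *d, char opt)
  {
      memset(&d->strm, 0, sizeof(d->strm));  /* let zlib use its own malloc */
      if (opt == '0') {
          d->mode = MODE_COPY;               /* forward data unchanged */
          return Z_OK;
      } else if (opt == 'u') {
          d->mode = MODE_INFLATE;            /* restore the original data */
          return inflateInit(&d->strm);
      } else {
          d->mode = MODE_DEFLATE;            /* compress at the given level */
          return deflateInit(&d->strm, opt - '0');
      }
  }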
IV/ Examples of application:
============================

1) Speeding up a web proxy over a modem line, using Squid on the server:
-------------------------------------------------------------------------

* Originally:
-------------

Full IP
   |       /-------\                                         /--------\
   \       | Squid |                                         | Client |
    ~-----+  based +------[modem]--~~~~///~~~~--[modem]------+  (web  |
           | proxy |                                         |browser)|
           \-------/                                         \--------/
          prox:3128                  PPP                    netscape using
           (squid)                                          prox:3128 as
                                                             HTTP proxy.

* First speed-up, using the same Squid proxy and 2 Zprox:
---------------------------------------------------------

Full IP      S1          Z1                        Z2           C1
   |      /-------\   /-------\                 /-------\   /--------\
   \      | Squid |   | Zprox |                 | Zprox |   |        |
    ~----+  based +---+  -i6  +-[mdm]~~///~~[mdm]+  -iu +---+ Client |
          | proxy |   |  -o0  |                 |  -o0  |   |        |
          \-------/   \-------/                 \-------/   \--------/
         prox:3128   prox:3125        PPP      local:3125   netscape using
          (squid)    zprox using               using prox   local:3125 as
                     prox:3128 as              :3125 as     HTTP proxy.
                     remote server.            rmt serv.

This diagram shows that C1 issues HTTP requests to Z2, which leaves them
uncompressed and sends them directly to Z1, which of course doesn't modify
them either before handing them to S1. S1 analyses them and retrieves the
web page from the Internet or from its local cache, then sends the response
to Z1, which compresses the data at level 6 (a good compression/time ratio).
The compressed stream goes through the modem line to Z2, which uncompresses
it before delivering the original data to C1.

Although this already works very well, it is possible to gain even more: C1
often sends several simultaneous requests (the frames composing a page,
icons...). In this case, objects take some time to display, which may chop
the stream into small pieces. When the stream is fragmented, the compression
ratio drops, because each deflate can only work on the data present in Z1's
buffers at that moment. For this reason, the user can select a larger buffer
size. Moreover, a high water / low water mechanism has been implemented to
avoid calling deflate and/or accessing the network for amounts of data that
are too small (see the sketch at the end of this example).

To improve things even more, the user may install a local instance of Squid.
It can handle multiple continuous data streams better than any web browser,
and can cache data. This is particularly interesting for icons, which
generate connections for small amounts of data.

* The following diagrams involve 2 Zprox and 2 Squid:
------------------------------------------------------

- On the server side:

Full IP      S1          Z1
   |      /-------\   /-------\
   \      | Squid |   | Zprox |
    ~----+  based +---+  -i6  +----[modem]---~// PPP link ... client
          | proxy |   |  -o0  |
          \-------/   \-------/
         prox:3128   prox:3125
          (squid)    zprox using
                     prox:3128 as
                     remote server

- On the client side:

                             Z2          S2          C1
                          /-------\   /-------\   /--------\
                          | Zprox |   | Squid |   |        |
 ...PPP link //~--[modem]-+  -iu  +---+  based +--+ Client |
                          |  -o0  |   | proxy |   |        |
                          \-------/   \-------/   \--------/
                         local:3125  local:3128   netscape using
                         using prox  using local  local:3128 as
                         :3125 as    :3125 as     HTTP proxy.
                         rmt serv.   parent.

S1, Z1 and Z2 remain configured exactly as in the previous case. The only
modification is the insertion of S2, the Squid proxy which listens on port
3128 on the local host and uses Z2 as its single parent (which means that it
forwards to this parent all the requests it cannot satisfy from its cache).

First experimental results:
---------------------------

I'm currently using Zprox this way, and I'm getting an average throughput of
14.5 kB/s on HTML pages, even with frames, and on a PostScript report of
445 kB (which, of course, weren't in the local Squid cache). This is nearly
4 times as fast as the direct modem connection (3.7 kB/s), and about 32%
faster than SSH's proxying feature. When downloading already compressed data
(ZIP, TGZ, JPG, GIF...), there is no noticeable overhead, because the
deflate method is documented to expand its input by *at most* 0.1% plus 12
bytes. So it is worth using Zprox together with Squid.
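To make the high water / low water idea mentioned above more concrete, here
is a hypothetical sketch; the names and thresholds are invented for this
example and are not Zprox's actual values:

  /*
   * Hypothetical illustration of a high water / low water decision: input
   * is accumulated and deflate() is only called when enough data is
   * present, so each call compresses a large block and flushes happen
   * less often.
   */
  #define BUFSIZE    16384
  #define HIGH_WATER  8192   /* compress as soon as this much is buffered */
  #define LOW_WATER    256   /* below this, prefer waiting for more data  */

  struct side {
      unsigned char buf[BUFSIZE];
      int len;                 /* bytes waiting to be compressed */
  };

  /* decide whether it is worth calling deflate() right now */
  int should_deflate(const struct side *s, int peer_closed, int idle)
  {
      if (s->len >= HIGH_WATER)
          return 1;            /* enough data for a good ratio */
      if (peer_closed && s->len > 0)
          return 1;            /* the stream is ending: flush what remains */
      if (idle && s->len >= LOW_WATER)
          return 1;            /* nothing more is coming soon */
      return 0;                /* too little data: avoid a tiny flush */
  }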
2) Mail retrieval speed-up with 2 Zprox (POP3):
-----------------------------------------------

On the local host, install Zprox exactly as in the first case with Squid:

   POP          Z1                        Z2           C1
 /-------\   /-------\                 /-------\   /--------\
 | POP3  |   | Zprox |                 | Zprox |   |        |
 + mail  +---+  -i9  +-[mdm]~~///~~[mdm]+  -iu +---+ Client |
 | server|   |  -o0  |                 |  -o0  |   |        |
 \-------/   \-------/                 \-------/   \--------/
  mail:110   mail:3110        PPP      local:110    Mail reader using
             zprox using               using mail   local:110 as
             mail:110 as               :3110 as     POP3 server.
             remote server.            rmt serv.

Z2 has to run on local port 110, because most mail readers don't let the
user choose a port; it is fixed at 110, the POP3 default.

3) Mail sending with SMTP (outgoing compression)
------------------------------------------------

   SMTP         Z1                        Z2           C1
 /-------\   /-------\                 /-------\   /--------\
 | SMTP  |   | Zprox |                 | Zprox |   |        |
 + mail  +---+  -i0  +-[mdm]~~///~~[mdm]+  -i0 +---+ Client |
 | server|   |  -ou  |                 |  -o9  |   |        |
 \-------/   \-------/                 \-------/   \--------/
  mail:25    mail:3025        PPP      local:25     Mail sender using
             zprox using               using mail   local:25 as
             mail:25 as                :3025 as     SMTP server.
             remote server.            rmt serv.

Mail is sent to the local SMTP proxy, making the remote SMTP server appear
as a local one.

4) Telnet session compression
-----------------------------

It can be useful to compress a telnet session. Compressing the incoming
stream really improves speed over modem lines: try it with an
'ls -l /usr/bin' and you'll understand what I mean. Moreover, since the
compressed stream is meaningless on its own, it may be useful to also
compress the outgoing connection (the keyboard input) to make it
"unsniffable". This slows the session down a bit, but acts as a crude form
of encryption, which protects against simple sniffers such as those included
in root-kits. Note that if you really need security, Zprox is not suited to
the job; prefer SSH, which is strongly secured. But if you can't use SSH for
some reason, then Zprox can help you. Here is an example:

    T1          Z1                        Z2           C1
 /--------\  /-------\                 /-------\   /--------\
 | TELNET |  | Zprox |                 | Zprox |   | TELNET |
 + server +--+  -i1  +-[mdm]~~///~~[mdm]+  -iu +---+ Client |
 |        |  |  -ou  |                 |  -o6  |   |        |
 \--------/  \-------/                 \-------/   \--------/
  serv:23    serv:3023        PPP      local:3023   telnet connecting
             zprox using               using serv   local:3023 to
             serv:23 as                :3023 as     access telnetd
             remote server.            rmt serv.    server on T1.

5) Transactional applications on IP over serial lines
------------------------------------------------------

Some central systems use transaction-driven applications. The clients are
terminals which receive a full page at once, let the user fill in the
fields, and send the entire response back when the user hits a transmit key.
Many of these systems can work over IP, which allows them to serve terminals
nation-wide over X.25 lines, for instance. When multiple users share the
same line, access times may become significant. Using Zprox between the
central server and the terminals should therefore greatly decrease response
times. As these responses may already take several seconds, Zprox should be
used at the maximum compression level in both directions: the time needed
for compression will not noticeably increase this latency. The setup is
exactly the same as for the telnet session, except that the '-i1' and '-o6'
arguments could be replaced with '-i9' and '-o9'.

6) X-Window proxying
--------------------

As for telnet sessions, it can be interesting to forward X-Window sessions
over a compressed link.
The conditions are strictly the same: do not compress too much on the output
way, to avoid time overhead, and compress reasonably on the input way, to
benefit from long repetitive patterns. Do not attempt to transfer huge
amounts of data! On my PC, I can forward MPEGPLAY at its nominal playing
speed when no compression is used, but it slows down to 7 frames/s when
compressing with -o6. On the other hand, it can support 70 "xeyes" at about
30% CPU load when the mouse moves very fast (X11 takes more CPU than Zprox
in this case). An example follows. Don't forget that this time, the client
host is the X server and the server host is the X client, so the connections
run in the opposite direction from what you might expect.

  SERVER        Z1                        Z2          CLIENT
 /--------\  /-------\                 /-------\   /--------\
 |X-Window|  | Zprox |                 | Zprox |   |X-Window|
 + client +--+  -iu  +-[mdm]~~///~~[mdm]+  -i1 +---+ server |
 |(xterm) |  |  -o6  |                 |  -ou  |   |XFree86 |
 \--------/  \-------/                 \-------/   \--------/
   xterm     serv:6005        PPP      local:3005   X server on
   using     zprox using               using local  display :0
  serv:5 as  local:3005 as             :6000 as     (port 6000)
  display.   remote server.            rmt serv.

From the client, you first establish a telnet connection to the server, and
launch "xterm -display serv:5". If your X authorizations are set correctly,
you'll get an xterm on the client's display. (Remember that X display :N is
reached on TCP port 6000+N: "serv:5" above corresponds to port 6005, where
Z1 listens, and the client's display :0 is port 6000, where Z2 forwards.)

7) Other protocols
------------------

Many other protocols can be proxied by Zprox, such as news (NNTP). It is
always the same mechanism: you just have to know in which direction(s) the
bulk of the data will flow, and configure Zprox to suit your needs.

FTP cannot be proxied by Zprox, because in an FTP session, the outgoing
connection you make is only for control: the data connection comes in from
the server, on a port specified by your client. So Zprox could proxy the
control connection, which is pointless, but not the data connection, except
by analysing the control stream to detect PORT commands and automatically
start another relay. This is how masquerading does it, but it isn't worth it
here, because files retrieved from FTP servers are usually already
compressed (sometimes even with BZIP2). It is also possible to use Squid's
FTP proxying with web browsers; in this case Zprox can be used, but it won't
be very useful.

V/ How to compile it ?
======================

To compile Zprox, you'll need zlib release 1.0.4 or 1.1.2. Both have been
tested and work. Perhaps there are small bugs in one or the other, but I
really don't think so.

At the moment, I've used it on Linux 2.0 and 2.1. I've done a few tests on
Windows NT 4.0 with gcc, but I already had surprises such as connect()
refusing to work asynchronously, so we'll see... I've also tested it on Bull
OPEN7 v3.13. It works slowly, but it works. This system supports
asynchronous connect(), but doesn't support non-blocking read/write any more
than WinNT does. I must say I don't really know whether I should implement
non-blocking I/O or not. It could even improve network performance, thanks
to the ability to build bigger packets at a time. But if many systems don't
support this feature, it's not that interesting.
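Just to illustrate the idea, non-blocking reads and writes would look
roughly like this (a sketch only; this is not in Zprox today):

  /*
   * Sketch only (NOT what zprox currently does): the descriptor is put in
   * non-blocking mode, and a refused or partial write simply leaves the
   * remaining data in the buffer until select() reports the socket
   * writable again.
   */
  #include <unistd.h>
  #include <fcntl.h>
  #include <errno.h>

  void set_nonblock(int fd)
  {
      fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
  }

  /* try to push <len> bytes; returns the number accepted, or -1 on error */
  int try_write(int fd, const unsigned char *buf, int len)
  {
      int ret = write(fd, buf, len);
      if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
          return 0;            /* socket full: keep the data, retry later */
      return ret;
  }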
I'd like to test it on SunOS or Solaris, but I no longer have access to a
SPARC station. I think it could complain about some includes, but after a
few corrections, there is no reason for it not to work. Could someone test
it on FreeBSD with many connections (a hundred xeyes, for example)? FreeBSD
is reported to support very high loads easily.

As I'm lazy, I haven't written any Makefile yet: it was very late, and the
mere fact of stopping to write one would have put me to sleep! Compilation
is still simple enough (provided you have a copy of zlib in the current
directory):

  gcc -O2 -s -o zprox zprox.c -Lzlib-1.0.4 -lz

VI/ Possible extensions
=======================

As stated above, my first goal with Zprox was to write a generic proxy which
could easily be adapted to any usage with minor code changes. If you need to
proxy some protocols, to add cryptography, or other compression algorithms
(lossless or lossy), you might use Zprox as a base, because the most
annoying part of writing a proxy is implementing the select() loop with all
its error handling. But be careful: this proxy is strictly
connection-oriented, so it is limited to TCP (and perhaps, with minor
changes, to other protocols such as SPX). Message-oriented proxying is much
harder to implement (RIGAT was designed to work either with or without
connections, but the second mode isn't finished yet). So you can't use Zprox
to mount NFS drives, for example, but it may be possible to compress NetBIOS
sessions (TCP port 139).

VII/ Reliability
================

As long as Zprox remains alpha or beta software, I won't be able to tell you
what may happen to your data. You can try to crash it in any way you like;
I'll be interested in the results. Of course, you can have fun opening
multiple connections to a chargen service to eat your CPU, but I don't think
Zprox will crash. It might be more sensitive to very small buffer sizes
(which wouldn't contain enough data for deflate() to work) or to
disconnections occurring between the select() and the read() or write()
calls. In any case, when using it for test purposes only, with a few
connections, you can reduce the maximum number of file descriptors that can
be allocated. Feel free to play with it, but remember this: if you lose
data, I won't be able to do anything for you. I personally use it at my own
risk, so use it at your own risk too.

VIII/ TO DO
===========

Zprox works very well with large buffer sizes (16 kB). But these buffers
need to be allocated 4 at a time (2 for each direction), which means that at
each connect we have to malloc 64 kB, and this takes some time on slow
systems. To improve this, Zprox uses its own pool of used and free buffers,
which reduces the number of malloc() and free() calls by simply assigning a
free pointer to a new buffer. But to be really useful, this would require a
lot of preallocated buffers, which consume memory even when not used.
Perhaps it would be interesting to automatically free() one half of the
available buffers after a predefined time period, in order to reduce memory
usage when Zprox is idle.

Newer versions can be found at:
  http://www-miaif.lip6.fr/willy/zprox/

Have fun!

Willy Tarreau