diff miscellany/compress-4.0/README3.0 @ 0:bce86c4163a3

Initial revision
author kono
date Mon, 18 Apr 2005 23:46:02 +0900
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/miscellany/compress-4.0/README3.0	Mon Apr 18 23:46:02 2005 +0900
@@ -0,0 +1,200 @@
+Enclosed is compress version 3.0 with the following changes:
+
+1.	"Block" compression is performed.  After the BITS run out, the
+	compression ratio is checked every so often.  If it is decreasing,
+	the table is cleared and a new set of substrings is generated.
+
+	This makes the output of compress 3.0 incompatible with that of
+	compress 2.0.  However, compress 3.0 still accepts the output of
+	compress 2.0.  To generate output that is compatible with compress
+	2.0, use the undocumented "-C" flag.
+
+2.	A quiet "-q" flag has been added for use by the news system.
+
+3.	The character chaining has been deleted and the program now uses
+	hashing.  This improves the speed of the program, especially
+	during decompression.  Other speed improvements have been made,
+	such as using putc() instead of fwrite().
+
+4.	A large table is used on large machines when a relatively small
+	number of bits is specified.  This saves much time when compressing
+	for a 16-bit machine on a 32-bit virtual machine.  Note that the
+	speed improvement only occurs when the input file is > 30000
+	characters, and the -b BITS is less than or equal to the cutoff
+	described below.
+
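+A minimal sketch of the ratio check described in item 1, for illustration
+only.  The names "ratio_monitor" and "should_clear_table" are hypothetical
+and are not taken from compress.c.  Treating any checkpoint that fails to
+beat the best ratio seen so far as "decreasing" is an assumption.
+
```c
#include <stdbool.h>

/* Hypothetical state for illustration; not the layout used by compress.c. */
struct ratio_monitor {
    long long best_ratio;   /* best scaled ratio seen since the last clear */
};

/*
 * Called at a checkpoint once the code table is full.
 * in_count:  input bytes consumed so far
 * bytes_out: compressed bytes produced so far
 * Returns true when the caller should clear the table and start over.
 */
static bool should_clear_table(struct ratio_monitor *m,
                               long in_count, long bytes_out)
{
    long long ratio;

    if (bytes_out <= 0)
        return false;               /* nothing written yet, nothing to judge */

    /* Fixed-point ratio with 8 fractional bits, so no floating point. */
    ratio = ((long long)in_count * 256) / bytes_out;

    if (ratio > m->best_ratio) {
        m->best_ratio = ratio;      /* still improving: keep the table */
        return false;
    }

    m->best_ratio = 0;              /* assumed policy: reset after a clear */
    return true;
}
```
+
+A caller would run this check every so often after the BITS run out and,
+on a true result, clear the table so that a new set of substrings is built.
+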
+Most of these changes were made by James A. Woods (ames!jaw).  Thank you
+James!
+
+Version 3.0 has been beta tested on many machines.
+
+To compile compress:
+
+	cc -O -DUSERMEM=usermem -o compress compress.c
+
+where "usermem" is the amount of physical user memory available (in bytes).
+If any physical memory is to be reserved for other processes, add
+"-DSACREDMEM=sacredmem", where "sacredmem" is the amount to be reserved.
+
+The difference "usermem-sacredmem" determines the maximum BITS that can be
+specified, and the cutoff bits where the large+fast table is used.
+
+memory in bytes (at least)    BITS    cutoff
+--------------------------    ----    ------
+                 4,718,592      16        13
+                 2,621,440      16        12
+                 1,572,864      16        11
+                 1,048,576      16        10
+                   631,808      16        --
+                   329,728      15        --
+                   178,176      14        --
+                    99,328      13        --
+                         0      12        --
+
+The default memory size is 750,000 which gives a maximum BITS=16 and no
+large+fast table.
+
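+The table above can be read as a simple lookup.  The sketch below is an
+illustration only: "limits_for_memory" and "struct limits" are hypothetical
+names, not part of compress.c, and a cutoff of 0 is an assumed stand-in for
+the "--" entries (no large+fast table).
+
```c
#include <stddef.h>

/* Hypothetical result type; cutoff == 0 means "no large+fast table". */
struct limits {
    int max_bits;
    int cutoff;
};

/*
 * Map the available memory in bytes (usermem - sacredmem) to the maximum
 * BITS and the large+fast table cutoff, using the rows of the README table.
 */
static struct limits limits_for_memory(long avail)
{
    static const struct {
        long at_least;
        int bits;
        int cutoff;
    } rows[] = {
        { 4718592L, 16, 13 },
        { 2621440L, 16, 12 },
        { 1572864L, 16, 11 },
        { 1048576L, 16, 10 },
        {  631808L, 16,  0 },
        {  329728L, 15,  0 },
        {  178176L, 14,  0 },
        {   99328L, 13,  0 },
        {       0L, 12,  0 },
    };
    struct limits out = { 12, 0 };  /* fallback, also used if avail < 0 */
    size_t i;

    /* Rows are sorted from largest to smallest; take the first that fits. */
    for (i = 0; i < sizeof rows / sizeof rows[0]; i++) {
        if (avail >= rows[i].at_least) {
            out.max_bits = rows[i].bits;
            out.cutoff = rows[i].cutoff;
            break;
        }
    }
    return out;
}
```
+
+With the default of 750,000 bytes this lookup lands on the 631,808 row,
+which matches the stated maximum BITS=16 with no large+fast table.
+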
+The maximum bits can be overridden by specifying "-DBITS=bits" at
+compilation time.
+
+If your machine doesn't support unsigned characters, define "NO_UCHAR" 
+when compiling.
+
+If "int" is 16 bits on your machine, define "SHORT_INT" when compiling.
+
+After compilation, move "compress" to a standard executable location, such 
+as /usr/local.  Then:
+	cd /usr/local
+	ln compress uncompress
+	ln compress zcat
+
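+The links work because a single binary can choose its behaviour from the
+name it was invoked under.  The sketch below shows one way to do that; the
+enum and the function name "mode_from_name" are hypothetical and are not
+taken from compress.c.
+
```c
#include <string.h>

/* Hypothetical modes for illustration. */
enum mode { MODE_COMPRESS, MODE_UNCOMPRESS, MODE_ZCAT };

/*
 * Pick a mode from argv[0].  Any leading directory is stripped first, so
 * "/usr/local/zcat" and "zcat" are treated the same way.
 */
static enum mode mode_from_name(const char *argv0)
{
    const char *base = strrchr(argv0, '/');

    base = base ? base + 1 : argv0;

    if (strcmp(base, "uncompress") == 0)
        return MODE_UNCOMPRESS;
    if (strcmp(base, "zcat") == 0)
        return MODE_ZCAT;
    return MODE_COMPRESS;       /* any other name compresses */
}
```
+
+A main() built this way would call mode_from_name(argv[0]) before parsing
+its flags.
+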
+On machines that have a fixed stack size (such as Perkin-Elmer), set the
+stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).
+
+Next, install the manual (compress.l).
+	cp compress.l /usr/man/manl
+	cd /usr/man/manl
+	ln compress.l uncompress.l
+	ln compress.l zcat.l
+
+		- or -
+
+	cp compress.l /usr/man/man1/compress.1
+	cd /usr/man/man1
+	ln compress.1 uncompress.1
+	ln compress.1 zcat.1
+
+The zmore shell script and manual page are for use on systems that have a
+"more(1)" program.  Install the shell script and the manual page in a "bin"
+and "man" directory, respectively.  If your system doesn't have the
+"more(1)" program, just skip "zmore".
+
+					regards,
+					petsd!joe
+
+Here is the README file from the previous version of compress (2.0):
+
+>Enclosed is compress.c version 2.0 with the following bugs fixed:
+>
+>1.	The packed files produced by compress are different on different
+>	machines and dependent on the vax sysgen option.
+>		The bug was in the different byte/bit ordering on the
+>		various machines.  This has been fixed.
+>
+>		This version is NOT compatible with the original vax posting
+>		unless the '-DCOMPATIBLE' option is specified to the C
+>		compiler.  The original posting has a bug which I fixed, 
+>		causing incompatible files.  I recommend that you NOT use this
+>		option unless you already have a lot of packed files from
+>		the original posting by thomas.
+>2.	The exit status is not well defined (on some machines) causing the
+>	scripts to fail.
+>		The exit status is now 0,1 or 2 and is documented in
+>		compress.l.
+>3.	The function getopt() is not available in all C libraries.
+>		The function getopt() is no longer referenced by the
+>		program.
+>4.	Error status is not being checked on the fwrite() and fflush() calls.
+>		Fixed.
+>
+>The following enhancements have been made:
+>
+>1.	Added facilities of "compact" into the compress program.  "Pack",
+>	"Unpack", and "Pcat" are no longer required (no longer supplied).
+>2.	Installed work around for C compiler bug with "-O".
+>3.	Added a magic number header (\037\235).  Put the bits specified
+>	in the file.
+>4.	Added "-f" flag to force overwrite of output file.
+>5.	Added "-c" flag and "zcat" program.  'ln compress zcat' after you
+>	compile.
+>6.	The 'uncompress' script has been deleted; simply 
+>	'ln compress uncompress' after you compile and it will work.
+>7.	Removed extra bit masking for machines that support unsigned
+>	characters.  If your machine doesn't support unsigned characters,
+>	define "NO_UCHAR" when compiling.
+>
+>Compile "compress.c" with "-O -o compress" flags.  Move "compress" to a
+>standard executable location, such as /usr/local.  Then:
+>	cd /usr/local
+>	ln compress uncompress
+>	ln compress zcat
+>
+>On machines that have a fixed stack size (such as Perkin-Elmer), set the
+>stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).
+>
+>Next, install the manual (compress.l).
+>	cp compress.l /usr/man/manl		- or -
+>	cp compress.l /usr/man/man1/compress.1
+>
+>Here is the README that I sent with my first posting:
+>
+>>Enclosed is a modified version of compress.c, along with scripts to make it
+>>run identically to pack(1), unpack(1), and pcat(1).  Here is what I
+>>(petsd!joe) and a colleague (petsd!peora!srd) did:
+>>
+>>1. Removed VAX dependencies.
+>>2. Changed the struct to separate arrays; saves mucho memory.
+>>3. Did comparisons in unsigned, where possible.  (Faster on Perkin-Elmer.)
+>>4. Sorted the character next chain and changed the search to stop
+>>prematurely.  This saves a lot on the execution time when compressing.
+>>
+>>This version is totally compatible with the original version.  Even though
+>>lint(1) -p has no complaints about compress.c, it won't run on a 16-bit
+>>machine, due to the size of the arrays.
+>>
+>>Here is the README file from the original author:
+>> 
+>>>Well, with all this discussion about file compression (for news batching
+>>>in particular) going around, I decided to implement the text compression
+>>>algorithm described in the June Computer magazine.  The author claimed
+>>>blinding speed and good compression ratios.  It's certainly faster than
+>>>compact (but, then, what wouldn't be), but it's also the same speed as
+>>>pack, and gets better compression than both of them.  On 350K bytes of
+>>>unix-wizards, compact took about 8 minutes of CPU, pack took about 80
+>>>seconds, and compress (herein) also took 80 seconds.  But, compact and
+>>>pack got about 30% compression, whereas compress got over 50%.  So, I
+>>>decided I had something, and that others might be interested, too.
+>>>
+>>>As is probably true of compact and pack (although I haven't checked),
+>>>the byte order within a word is probably relevant here, but as long as
+>>>you stay on a single machine type, you should be ok.  (Can anybody
+>>>elucidate on this?)  There are a couple of asm's in the code (extv and
+>>>insv instructions), so anyone porting it to another machine will have to
+>>>deal with this anyway (and could probably make it compatible with Vax
+>>>byte order at the same time).  Anyway, I've linted the code (both with
+>>>and without -p), so it should run elsewhere.  Note the longs in the
+>>>code, you can take these out if you reduce BITS to <= 15.
+>>>
+>>>Have fun, and as always, if you make good enhancements, or bug fixes,
+>>>I'd like to see them.
+>>>
+>>>=Spencer (thomas@utah-20, {harpo,hplabs,arizona}!utah-cs!thomas)
+>>
+>>					regards,
+>>					joe
+>>
+>>--
+>>Full-Name:  Joseph M. Orost
+>>UUCP:       ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
+>>US Mail:    MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724
+>>Phone:      (201) 870-5844