annotate miscellany/compress-4.0/README3.0 @ 0:bce86c4163a3

Initial revision
author kono
date Mon, 18 Apr 2005 23:46:02 +0900

Enclosed is compress version 3.0 with the following changes:

1.  "Block" compression is performed.  After the BITS run out (that is,
    once the string table is full at the maximum code size), the
    compression ratio is checked every so often.  If it is decreasing,
    the table is cleared and a new set of substrings is generated.

    This makes the output of compress 3.0 incompatible with that of
    compress 2.0.  However, compress 3.0 still accepts the output of
    compress 2.0.  To generate output that is compatible with compress
    2.0, use the undocumented "-C" flag.

2.  A quiet "-q" flag has been added for use by the news system.

3.  The character chaining has been deleted and the program now uses
    hashing.  This improves the speed of the program, especially
    during decompression.  Other speed improvements have been made,
    such as using putc() instead of fwrite().

4.  A large table is used on large machines when a relatively small
    number of bits is specified.  This saves much time when compressing
    for a 16-bit machine on a 32-bit virtual machine.  Note that the
    speed improvement only occurs when the input file is > 30000
    characters, and the -b BITS is less than or equal to the cutoff
    described below.
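As a rough illustration of the ratio check described in item 1, the
following C sketch decides when the table should be cleared.  It is not
taken from compress.c: the names (ratio_monitor, should_clear_table), the
10000-byte check interval, and the scaling of the ratio by 256 are all
assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check interval in input bytes; not stated in this README. */
#define CHECK_INTERVAL 10000L

/* State carried between checks.  All names here are illustrative. */
struct ratio_monitor {
    long next_check;  /* input byte count at which to check again */
    long best_ratio;  /* last accepted in/out ratio, scaled by 256 */
};

static void monitor_init(struct ratio_monitor *m)
{
    assert(m != NULL);
    m->next_check = CHECK_INTERVAL;
    m->best_ratio = 0;
}

/*
 * Meant to be called only once the string table is full.  Returns true
 * when the caller should clear the table and start collecting a new set
 * of substrings, false when the current table should be kept.
 */
static bool should_clear_table(struct ratio_monitor *m,
                               long bytes_in, long bytes_out)
{
    long ratio;

    assert(m != NULL);
    if (bytes_in < m->next_check || bytes_out <= 0)
        return false;                 /* not time to check yet */
    m->next_check = bytes_in + CHECK_INTERVAL;

    /* Input bytes per output byte, with 8 fractional bits. */
    ratio = (long)(((long long)bytes_in * 256) / bytes_out);
    if (ratio >= m->best_ratio) {
        m->best_ratio = ratio;        /* not decreasing: keep the table */
        return false;
    }
    m->best_ratio = 0;                /* decreasing: start over */
    return true;
}
```

A caller would pass its running input and output byte counts after each
code is written.  On a true result it would emit whatever "clear" signal
the output format uses and then reset its table.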

Most of these changes were made by James A. Woods (ames!jaw).  Thank you,
James!

Version 3.0 has been beta tested on many machines.

To compile compress:

    cc -O -DUSERMEM=usermem -o compress compress.c

Where "usermem" is the amount of physical user memory available (in bytes).
If any physical memory is to be reserved for other processes, put in
"-DSACREDMEM=sacredmem", where "sacredmem" is the amount to be reserved.

The difference "usermem-sacredmem" determines the maximum BITS that can be
specified, and the cutoff bits where the large+fast table is used.

    memory: at least    BITS    cutoff
    ------  -- -----    ----    ------
           4,718,592     16       13
           2,621,440     16       12
           1,572,864     16       11
           1,048,576     16       10
             631,808     16       --
             329,728     15       --
             178,176     14       --
              99,328     13       --
                   0     12       --

The default memory size is 750,000 which gives a maximum BITS=16 and no
large+fast table.
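The table above can be read as a simple threshold lookup.  The C sketch
below encodes the rows as data and returns the first row that fits
"usermem-sacredmem".  The README describes this choice as being made at
compilation time from the -D flags, so the runtime function here is only
an illustration; the names mem_row and lookup_limits are made up for the
example, and a cutoff of 0 stands for the "--" entries.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the README table.  cutoff == 0 stands for "--". */
struct mem_row {
    long min_memory;  /* usermem - sacredmem must be at least this (bytes) */
    int  max_bits;    /* maximum BITS that can be specified */
    int  cutoff;      /* cutoff bits for the large+fast table, 0 if none */
};

static const struct mem_row mem_table[] = {
    { 4718592L, 16, 13 },
    { 2621440L, 16, 12 },
    { 1572864L, 16, 11 },
    { 1048576L, 16, 10 },
    {  631808L, 16,  0 },
    {  329728L, 15,  0 },
    {  178176L, 14,  0 },
    {   99328L, 13,  0 },
    {       0L, 12,  0 },
};

/* Returns the first (largest) row whose threshold the usable memory meets. */
static const struct mem_row *lookup_limits(long usermem, long sacredmem)
{
    size_t n = sizeof mem_table / sizeof mem_table[0];
    long usable = usermem - sacredmem;
    size_t i;

    assert(n > 0);
    if (usable < 0)
        usable = 0;                   /* treat over-reservation as no memory */
    for (i = 0; i < n; i++) {
        if (usable >= mem_table[i].min_memory)
            return &mem_table[i];
    }
    return &mem_table[n - 1];         /* not reached: last threshold is 0 */
}
```

For the default of 750,000 bytes with nothing reserved, lookup_limits()
lands on the 631,808 row, which matches the README: BITS=16 and no
large+fast table.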

The maximum bits can be overruled by specifying "-DBITS=bits" at
compilation time.

If your machine doesn't support unsigned characters, define "NO_UCHAR"
when compiling.

If your machine has a 16-bit "int", define "SHORT_INT" when compiling.

After compilation, move "compress" to a standard executable location, such
as /usr/local.  Then:
    cd /usr/local
    ln compress uncompress
    ln compress zcat
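The links give one binary three names.  This README does not say how the
program tells them apart; a common approach, assumed here, is to look at
the name it was invoked under (argv[0]).  The sketch below shows that
idea only.  The enum and the function mode_from_name are hypothetical
names, not identifiers from compress.c.

```c
#include <assert.h>
#include <string.h>

enum tool_mode { MODE_COMPRESS, MODE_UNCOMPRESS, MODE_ZCAT };

/* Picks a mode from the name the program was invoked under (argv[0]). */
static enum tool_mode mode_from_name(const char *argv0)
{
    const char *base;

    assert(argv0 != NULL);
    base = strrchr(argv0, '/');
    base = (base != NULL) ? base + 1 : argv0;  /* strip any directory part */

    if (strcmp(base, "uncompress") == 0)
        return MODE_UNCOMPRESS;
    if (strcmp(base, "zcat") == 0)
        return MODE_ZCAT;
    return MODE_COMPRESS;             /* any other name compresses */
}
```

A main() built this way would call mode_from_name(argv[0]) once at
startup and branch on the result, which is why a plain "ln" is enough
and no separate uncompress or zcat binary is needed.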

On machines that have a fixed stack size (such as Perkin-Elmer), set the
stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer.)

Next, install the manual (compress.l).
    cp compress.l /usr/man/manl
    cd /usr/man/manl
    ln compress.l uncompress.l
    ln compress.l zcat.l

        - or -

    cp compress.l /usr/man/man1/compress.1
    cd /usr/man/man1
    ln compress.1 uncompress.1
    ln compress.1 zcat.1

The zmore shell script and manual page are for use on systems that have a
"more(1)" program.  Install the shell script and the manual page in a "bin"
and "man" directory, respectively.  If your system doesn't have the
"more(1)" program, just skip "zmore".

                        regards,
                        petsd!joe

Here is the README file from the previous version of compress (2.0):

>Enclosed is compress.c version 2.0 with the following bugs fixed:
>
>1. The packed files produced by compress are different on different
>   machines and dependent on the VAX sysgen option.
>       The bug was in the different byte/bit ordering on the
>       various machines.  This has been fixed.
>
>       This version is NOT compatible with the original VAX posting
>       unless the '-DCOMPATIBLE' option is specified to the C
>       compiler.  The original posting has a bug which I fixed,
>       causing incompatible files.  I recommend that you NOT use this
>       option unless you already have a lot of packed files from
>       the original posting by thomas.
>2. The exit status is not well defined (on some machines) causing the
>   scripts to fail.
>       The exit status is now 0, 1, or 2 and is documented in
>       compress.l.
>3. The function getopt() is not available in all C libraries.
>       The function getopt() is no longer referenced by the
>       program.
>4. Error status is not being checked on the fwrite() and fflush() calls.
>       Fixed.
>
>The following enhancements have been made:
>
>1. Added facilities of "compact" into the compress program.  "Pack",
>   "Unpack", and "Pcat" are no longer required (no longer supplied).
>2. Installed workaround for C compiler bug with "-O".
>3. Added a magic number header (\037\235).  The number of bits
>   specified is also stored in the file.
>4. Added "-f" flag to force overwrite of output file.
>5. Added "-c" flag and "zcat" program.  'ln compress zcat' after you
>   compile.
>6. The 'uncompress' script has been deleted; simply
>   'ln compress uncompress' after you compile and it will work.
>7. Removed extra bit masking for machines that support unsigned
>   characters.  If your machine doesn't support unsigned characters,
>   define "NO_UCHAR" when compiling.
>
>Compile "compress.c" with "-O -o compress" flags.  Move "compress" to a
>standard executable location, such as /usr/local.  Then:
>    cd /usr/local
>    ln compress uncompress
>    ln compress zcat
>
>On machines that have a fixed stack size (such as Perkin-Elmer), set the
>stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).
>
>Next, install the manual (compress.l).
>    cp compress.l /usr/man/manl         - or -
>    cp compress.l /usr/man/man1/compress.1
>
>Here is the README that I sent with my first posting:
>
>>Enclosed is a modified version of compress.c, along with scripts to make it
>>run identically to pack(1), unpack(1), and pcat(1).  Here is what I
>>(petsd!joe) and a colleague (petsd!peora!srd) did:
>>
>>1. Removed VAX dependencies.
>>2. Changed the struct to separate arrays; saves mucho memory.
>>3. Did comparisons in unsigned, where possible.  (Faster on Perkin-Elmer.)
>>4. Sorted the character next chain and changed the search to stop
>>prematurely.  This saves a lot on the execution time when compressing.
>>
>>This version is totally compatible with the original version.  Even though
>>lint(1) -p has no complaints about compress.c, it won't run on a 16-bit
>>machine, due to the size of the arrays.
>>
>>Here is the README file from the original author:
>>
>>>Well, with all this discussion about file compression (for news batching
>>>in particular) going around, I decided to implement the text compression
>>>algorithm described in the June Computer magazine.  The author claimed
>>>blinding speed and good compression ratios.  It's certainly faster than
>>>compact (but, then, what wouldn't be), but it's also the same speed as
>>>pack, and gets better compression than both of them.  On 350K bytes of
>>>unix-wizards, compact took about 8 minutes of CPU, pack took about 80
>>>seconds, and compress (herein) also took 80 seconds.  But, compact and
>>>pack got about 30% compression, whereas compress got over 50%.  So, I
>>>decided I had something, and that others might be interested, too.
>>>
>>>As is probably true of compact and pack (although I haven't checked),
>>>the byte order within a word is probably relevant here, but as long as
>>>you stay on a single machine type, you should be ok.  (Can anybody
>>>elucidate on this?)  There are a couple of asm's in the code (extv and
>>>insv instructions), so anyone porting it to another machine will have to
>>>deal with this anyway (and could probably make it compatible with Vax
>>>byte order at the same time).  Anyway, I've linted the code (both with
>>>and without -p), so it should run elsewhere.  Note the longs in the
>>>code, you can take these out if you reduce BITS to <= 15.
>>>
>>>Have fun, and as always, if you make good enhancements, or bug fixes,
>>>I'd like to see them.
>>>
>>>=Spencer (thomas@utah-20, {harpo,hplabs,arizona}!utah-cs!thomas)
>>
>>                        regards,
>>                        joe
>>
>>--
>>Full-Name:  Joseph M. Orost
>>UUCP:       ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
>>US Mail:    MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724
>>Phone:      (201) 870-5844