Enclosed is compress version 3.0 with the following changes:

1.  "Block" compression is performed.  After the BITS run out, the
    compression ratio is checked every so often.  If it is decreasing,
    the table is cleared and a new set of substrings is generated.

    This makes the output of compress 3.0 incompatible with that of
    compress 2.0.  However, compress 3.0 still accepts the output of
    compress 2.0.  To generate output that is compatible with compress
    2.0, use the undocumented "-C" flag.

2.  A quiet "-q" flag has been added for use by the news system.

3.  The character chaining has been deleted and the program now uses
    hashing.  This improves the speed of the program, especially
    during decompression.  Other speed improvements have been made,
    such as using putc() instead of fwrite().

4.  A large table is used on large machines when a relatively small
    number of bits is specified.  This saves much time when compressing
    for a 16-bit machine on a 32-bit virtual machine.  Note that the
    speed improvement only occurs when the input file is larger than
    30000 characters, and the -b BITS is less than or equal to the
    cutoff described below.

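The ratio check in item 1 can be pictured roughly as follows.  This is
only a sketch of the idea, not code taken from compress.c: the struct,
the function name, and the choice of a ratio scaled by 256 are all
assumptions made for illustration.

```c
/* Hypothetical state for this sketch; the names are not from compress.c. */
struct ratio_state {
    long best_ratio;    /* best input/output ratio seen so far, scaled by 256 */
};

/*
 * Call this at each checkpoint once the code table is full.
 * Returns 1 if the ratio has dropped and the table should be cleared,
 * 0 if compression is still holding up.
 *
 * Note: bytes_in * 256 can overflow a 32-bit long on very large inputs;
 * a real implementation would have to guard against that.
 */
static int should_clear(struct ratio_state *st, long bytes_in, long bytes_out)
{
    long ratio;

    if (bytes_out <= 0)
        return 0;               /* nothing written yet, no basis to decide */

    ratio = (bytes_in * 256) / bytes_out;

    if (ratio >= st->best_ratio) {
        st->best_ratio = ratio; /* holding steady or improving: keep the table */
        return 0;
    }

    st->best_ratio = 0;         /* start measuring afresh after the clear */
    return 1;
}
```

A caller would start with best_ratio set to 0, call should_clear() at
each checkpoint with the running byte counts, and rebuild the table
whenever it returns 1.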
Most of these changes were made by James A. Woods (ames!jaw).  Thank you
James!

Version 3.0 has been beta tested on many machines.

To compile compress:

    cc -O -DUSERMEM=usermem -o compress compress.c

Where "usermem" is the amount of physical user memory available (in bytes).
If any physical memory is to be reserved for other processes, put in
"-DSACREDMEM=sacredmem", where "sacredmem" is the amount to be reserved.

The difference "usermem-sacredmem" determines the maximum BITS that can be
specified, and the cutoff bits where the large+fast table is used.

    memory: at least        BITS        cutoff
    ------- -- -----        ----        ------
      4,718,592              16           13
      2,621,440              16           12
      1,572,864              16           11
      1,048,576              16           10
        631,808              16           --
        329,728              15           --
        178,176              14           --
         99,328              13           --
              0              12           --

The default memory size is 750,000 which gives a maximum BITS=16 and no
large+fast table.

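To make the table concrete, here is a small lookup sketch.  The
thresholds are copied from the table above; everything else (the struct,
the function name, using 0 to mean "no large+fast table") is invented
for illustration and is not how compress.c itself selects these values,
which happens at compile time.

```c
#include <stddef.h>

/* One row of the memory table.  cutoff == 0 means no large+fast table. */
struct mem_row {
    long min_mem;   /* usermem - sacredmem must be at least this many bytes */
    int  max_bits;  /* maximum BITS that can be specified */
    int  cutoff;    /* large+fast table is used when -b BITS <= cutoff */
};

static const struct mem_row mem_table[] = {
    { 4718592L, 16, 13 },
    { 2621440L, 16, 12 },
    { 1572864L, 16, 11 },
    { 1048576L, 16, 10 },
    {  631808L, 16,  0 },
    {  329728L, 15,  0 },
    {  178176L, 14,  0 },
    {   99328L, 13,  0 },
    {       0L, 12,  0 }
};

/* Return the first row whose threshold fits in usermem - sacredmem. */
static const struct mem_row *lookup_mem(long usermem, long sacredmem)
{
    const size_t n = sizeof mem_table / sizeof mem_table[0];
    long avail = usermem - sacredmem;
    size_t i;

    if (avail < 0)
        avail = 0;              /* treat an over-reservation as no memory */

    for (i = 0; i < n; i++)
        if (avail >= mem_table[i].min_mem)
            return &mem_table[i];

    return &mem_table[n - 1];   /* not reached: the last threshold is 0 */
}
```

With the default of 750,000 bytes and nothing reserved, the lookup lands
on the 631,808 row: BITS=16 and no large+fast table, as stated above.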
The maximum bits can be overruled by specifying "-DBITS=bits" at
compilation time.

If your machine doesn't support unsigned characters, define "NO_UCHAR"
when compiling.

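The reason this matters is that a plain "char" holding a byte above 127
may come out negative when used as a number, so it has to be masked
before it can serve as a table index.  The fragment below sketches the
general pattern; the type and macro names are made up for illustration
and should not be read as the definitions used in compress.c.

```c
#ifdef NO_UCHAR
/* No unsigned char on this machine: store bytes in plain char and mask
 * on every use so that values 128..255 do not turn negative. */
typedef char byte_type;
#define BYTE_VALUE(c)   ((c) & 0xff)
#else
/* unsigned char is available: the value is already 0..255, no mask needed. */
typedef unsigned char byte_type;
#define BYTE_VALUE(c)   (c)
#endif

/* Example use: turn a stored byte into a safe, non-negative array index. */
static int byte_index(byte_type c)
{
    return BYTE_VALUE(c);
}
```

Either way the caller sees a value in the range 0 to 255; the only
difference is whether the extra masking step is paid for.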
If "int" is 16 bits on your machine, define "SHORT_INT" when compiling.

After compilation, move "compress" to a standard executable location, such
as /usr/local.  Then:
    cd /usr/local
    ln compress uncompress
    ln compress zcat

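The links work because a single binary can look at the name it was
started under and pick its behaviour from that.  The following is a
minimal sketch of that technique; the enum and function are hypothetical
names chosen for this example, not an excerpt from compress.c.

```c
#include <string.h>

/* Hypothetical mode names for this sketch. */
enum mode { MODE_COMPRESS, MODE_UNCOMPRESS, MODE_ZCAT };

/*
 * Decide what to do from argv[0].  Any leading directory part is
 * stripped first, so "/usr/local/zcat" and "zcat" behave the same.
 */
static enum mode mode_from_name(const char *argv0)
{
    const char *base = strrchr(argv0, '/');

    base = (base != NULL) ? base + 1 : argv0;

    if (strcmp(base, "uncompress") == 0)
        return MODE_UNCOMPRESS;
    if (strcmp(base, "zcat") == 0)
        return MODE_ZCAT;
    return MODE_COMPRESS;       /* any other name: behave as compress */
}
```

A main() would call mode_from_name(argv[0]) once at startup and branch
on the result before looking at any flags.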
On machines that have a fixed stack size (such as Perkin-Elmer), set the
stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).

Next, install the manual (compress.l).
    cp compress.l /usr/man/manl
    cd /usr/man/manl
    ln compress.l uncompress.l
    ln compress.l zcat.l

        - or -

    cp compress.l /usr/man/man1/compress.1
    cd /usr/man/man1
    ln compress.1 uncompress.1
    ln compress.1 zcat.1

The zmore shell script and manual page are for use on systems that have a
"more(1)" program.  Install the shell script and the manual page in a "bin"
and "man" directory, respectively.  If your system doesn't have the
"more(1)" program, just skip "zmore".

    regards,
    petsd!joe

Here is the README file from the previous version of compress (2.0):

>Enclosed is compress.c version 2.0 with the following bugs fixed:
>
>1.  The packed files produced by compress are different on different
>    machines and dependent on the vax sysgen option.
>        The bug was in the different byte/bit ordering on the
>        various machines.  This has been fixed.
>
>        This version is NOT compatible with the original vax posting
>        unless the '-DCOMPATIBLE' option is specified to the C
>        compiler.  The original posting has a bug which I fixed,
>        causing incompatible files.  I recommend that you NOT use this
>        option unless you already have a lot of packed files from
>        the original posting by thomas.
>2.  The exit status is not well defined (on some machines) causing the
>    scripts to fail.
>        The exit status is now 0, 1 or 2 and is documented in
>        compress.l.
>3.  The function getopt() is not available in all C libraries.
>        The function getopt() is no longer referenced by the
>        program.
>4.  Error status is not being checked on the fwrite() and fflush() calls.
>        Fixed.
>
>The following enhancements have been made:
>
>1.  Added facilities of "compact" into the compress program.  "Pack",
>    "Unpack", and "Pcat" are no longer required (no longer supplied).
>2.  Installed workaround for C compiler bug with "-O".
>3.  Added a magic number header (\037\235).  Put the bits specified
>    in the file.
>4.  Added "-f" flag to force overwrite of output file.
>5.  Added "-c" flag and "zcat" program.  'ln compress zcat' after you
>    compile.
>6.  The 'uncompress' script has been deleted; simply
>    'ln compress uncompress' after you compile and it will work.
>7.  Removed extra bit masking for machines that support unsigned
>    characters.  If your machine doesn't support unsigned characters,
>    define "NO_UCHAR" when compiling.
>
>Compile "compress.c" with "-O -o compress" flags.  Move "compress" to a
>standard executable location, such as /usr/local.  Then:
>    cd /usr/local
>    ln compress uncompress
>    ln compress zcat
>
>On machines that have a fixed stack size (such as Perkin-Elmer), set the
>stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).
>
>Next, install the manual (compress.l).
>    cp compress.l /usr/man/manl             - or -
>    cp compress.l /usr/man/man1/compress.1
>
>Here is the README that I sent with my first posting:
>
>>Enclosed is a modified version of compress.c, along with scripts to make it
>>run identically to pack(1), unpack(1), and pcat(1).  Here is what I
>>(petsd!joe) and a colleague (petsd!peora!srd) did:
>>
>>1. Removed VAX dependencies.
>>2. Changed the struct to separate arrays; saves mucho memory.
>>3. Did comparisons in unsigned, where possible.  (Faster on Perkin-Elmer.)
>>4. Sorted the character next chain and changed the search to stop
>>prematurely.  This saves a lot on the execution time when compressing.
>>
>>This version is totally compatible with the original version.  Even though
>>lint(1) -p has no complaints about compress.c, it won't run on a 16-bit
>>machine, due to the size of the arrays.
>>
>>Here is the README file from the original author:
>>
>>>Well, with all this discussion about file compression (for news batching
>>>in particular) going around, I decided to implement the text compression
>>>algorithm described in the June Computer magazine.  The author claimed
>>>blinding speed and good compression ratios.  It's certainly faster than
>>>compact (but, then, what wouldn't be), but it's also the same speed as
>>>pack, and gets better compression than both of them.  On 350K bytes of
>>>unix-wizards, compact took about 8 minutes of CPU, pack took about 80
>>>seconds, and compress (herein) also took 80 seconds.  But, compact and
>>>pack got about 30% compression, whereas compress got over 50%.  So, I
>>>decided I had something, and that others might be interested, too.
>>>
>>>As is probably true of compact and pack (although I haven't checked),
>>>the byte order within a word is probably relevant here, but as long as
>>>you stay on a single machine type, you should be ok.  (Can anybody
>>>elucidate on this?)  There are a couple of asm's in the code (extv and
>>>insv instructions), so anyone porting it to another machine will have to
>>>deal with this anyway (and could probably make it compatible with Vax
>>>byte order at the same time).  Anyway, I've linted the code (both with
>>>and without -p), so it should run elsewhere.  Note the longs in the
>>>code, you can take these out if you reduce BITS to <= 15.
>>>
>>>Have fun, and as always, if you make good enhancements, or bug fixes,
>>>I'd like to see them.
>>>
>>>=Spencer (thomas@utah-20, {harpo,hplabs,arizona}!utah-cs!thomas)
>>
>>    regards,
>>    joe
>>
>>--
>>Full-Name:  Joseph M. Orost
>>UUCP:       ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
>>US Mail:    MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724
>>Phone:      (201) 870-5844