annotate zlib/contrib/asm586/README.586 @ 51:ae3a4bfb450b
add some files of version 4.4.3 that have been forgotten.
author | kent <kent@cr.ie.u-ryukyu.ac.jp> |
---|---|
date | Sun, 07 Feb 2010 18:27:48 +0900 |
parents | |
children |
rev | line source |
---|---|
1 This is a patched version of zlib modified to use |
2 Pentium-optimized assembly code in the deflation algorithm. The files |
3 changed/added by this patch are: |
4 |
5 README.586 |
6 match.S |
7 |
8 The effectiveness of these modifications is a bit marginal, as the |
9 program's bottleneck seems to be mostly L1-cache contention, for which |
10 there is no real way to work around without rewriting the basic |
11 algorithm. The speedup on average is around 5-10% (which is generally |
12 less than the amount of variance between subsequent executions). |
13 However, when used at level 9 compression, the cache contention can |
14 drop enough for the assembly version to achieve 10-20% speedup (and |
15 sometimes more, depending on the amount of overall redundancy in the |
16 files). Even here, though, cache contention can still be the limiting |
17 factor, depending on the nature of the program using the zlib library. |
18 This may also mean that better improvements will be seen on a Pentium |
19 with MMX, which suffers much less from L1-cache contention, but I have |
20 not yet verified this. |
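
Because the expected gain can be smaller than the run-to-run variance, a single timing says little. The helper below is a minimal sketch for repeating a measurement several times so the spread becomes visible. The function name time_runs is hypothetical, and the sketch assumes GNU date (the %N format is not POSIX) and awk.

```shell
# time_runs N CMD [ARGS...]
# Run CMD N times, discarding its standard output, and print the
# wall-clock duration of each run in seconds (one value per line).
# Assumption: GNU date, whose %N format yields nanoseconds.
time_runs() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        start=$(date +%s.%N)
        "$@" > /dev/null
        end=$(date +%s.%N)
        # awk does the floating-point subtraction that sh cannot.
        echo "$end $start" | awk '{ printf "%.3f\n", $1 - $2 }'
        i=$((i + 1))
    done
}
```

One possible use, assuming the minigzip example program built alongside zlib and a hypothetical input file sample.dat, is `time_runs 5 sh -c './minigzip -9 < sample.dat'`, run once against a build with match.S and once against a plain C build. The redirection sits inside `sh -c` so that every run reads the file from the start.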
21 |
22 Note that this code has been tailored for the Pentium in particular, |
23 and will not perform well on the Pentium Pro (due to the use of a |
24 partial register in the inner loop). |
25 |
26 If you are using an assembler other than GNU as, you will have to |
27 translate match.S to use your assembler's syntax. (Have fun.) |
28 |
29 Brian Raiter |
30 breadbox@muppetlabs.com |
31 April, 1998 |
32 |
33 |
34 Added for zlib 1.1.3: |
35 |
36 The patches come from |
37 http://www.muppetlabs.com/~breadbox/software/assembly.html |
38 |
39 To compile zlib with this asm file, copy match.S to the zlib directory |
40 then do: |
41 |
42 CFLAGS="-O3 -DASMV" ./configure |
43 make OBJA=match.o |
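
The copy and build steps above can be wrapped in a small script that stops early if match.S is missing. This is only a sketch: the function name build_zlib_asm586 is hypothetical, and the default location contrib/asm586/match.S inside the zlib tree is an assumption based on where this README lives.

```shell
# build_zlib_asm586 ZLIB_DIR [MATCH_S]
# Copy match.S into the top of the zlib tree, then configure and build
# with the assembly matcher enabled, as described in this README.
# Assumption: match.S defaults to ZLIB_DIR/contrib/asm586/match.S.
build_zlib_asm586() {
    zlib_dir=$1
    asm_src=${2:-"$zlib_dir/contrib/asm586/match.S"}

    if [ ! -f "$asm_src" ]; then
        echo "match.S not found: $asm_src" >&2
        return 1
    fi

    cp "$asm_src" "$zlib_dir/match.S" || return 1

    # Run in a subshell so the caller's working directory is unchanged.
    (
        cd "$zlib_dir" &&
        CFLAGS="-O3 -DASMV" ./configure &&
        make OBJA=match.o
    )
}
```

For example, `build_zlib_asm586 .` run from the zlib source directory performs the same three steps by hand-free means and returns a non-zero status if any of them fails.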