view memo/result.txt @ 54:6538c34155de

fix
author Masataka Kohagura <kohagura@cr.ie.u-ryukyu.ac.jp>
date Fri, 12 Feb 2016 14:58:02 +0900
parents a82607c0089d
children 49526135ba64

Fri Feb 12 12:15:42 JST 2016



-------------------------------------------

Wed Feb 10 11:06:12 JST 2016

[without file read]

'[A-Z][A-Za-z0-9]*'  500MB.txt
./regexParser -ts               16.150
./cerium/ceriumGrep -cpu 12      7.386
./cerium/ceriumGrep -cpu  2     15.401
./cerium/ceriumGrep -cpu  1     25.534


./cerium/ceriumGrep -cpu 2 -regex '[A-Z][A-Za-z0-9]*' -file file/500MB.txt >   24.73s user 0.57s system 164% cpu 15.401 total

./regexParser -regex '[A-Z][A-Za-z0-9]*' -ts -file file/500MB.txt > /dev/null  15.96s user 0.17s system 99% cpu 16.150 total

./cerium/ceriumGrep -cpu 12 -regex '[A-Z][A-Za-z0-9]*' -file file/500MB.txt >  27.08s user 0.66s system 375% cpu 7.386 total

./cerium/ceriumGrep -regex '[A-Z][A-Za-z0-9]*' -file file/500MB.txt >   25.09s user 0.53s system 100% cpu 25.534 total

./sequentialSearchCbC -file file/500MB.txt > /dev/null  10.47s user 0.17s system 99% cpu 10.647 total

[with file read]

'[A-Z][A-Za-z0-9]*'  500MB.txt
./regexParser -ts               21.171
./cerium/ceriumGrep -cpu 12     10.419
./cerium/ceriumGrep -cpu  2     27.061
egrep                           57.753

./cerium/ceriumGrep -cpu 2 -regex '[A-Z][A-Za-z0-9]*' -file file/500MB.txt >   25.00s user 0.74s system 95% cpu 27.061 total

./regexParser -regex '[A-Z][A-Za-z0-9]*' -ts -file file/500MB.txt > /dev/null  15.98s user 0.23s system 76% cpu 21.171 total

./cerium/ceriumGrep -regex '[A-Z][A-Za-z0-9]*' -file file/500MB.txt >   24.50s user 0.66s system 65% cpu 38.293 total

./cerium/ceriumGrep -cpu 12 -regex '[A-Z][A-Za-z0-9]*' -file file/500MB.txt >  25.65s user 0.83s system 254% cpu 10.419 total

egrep -o '[A-Z][A-Za-z0-9]*' file/500MB.txt > /dev/null  57.46s user 0.23s system 99% cpu 57.753 total

./sequentialSearchCbC -file file/500MB.txt > /dev/null  10.51s user 0.22s system 64% cpu 16.530 total
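The summary lines above follow the zsh `time` layout (user, system, cpu %, total). As a rough aid for comparing the cached and uncached runs, the sketch below parses one such line and recomputes the CPU percentage as (user + system) / total. The names TIME_RE, to_seconds and parse_time_line are hypothetical helpers, not part of any tool used in this memo.

```python
import re

# Matches the tail of a zsh `time` summary, e.g.
# "15.98s user 0.23s system 76% cpu 21.171 total"
TIME_RE = re.compile(
    r"(?P<user>[\d.]+)s user\s+(?P<sys>[\d.]+)s system\s+"
    r"(?P<pct>\d+)% cpu\s+(?P<total>[\d:.]+) total"
)


def to_seconds(text):
    """Convert 'SS.sss' or colon-separated 'M:SS.ss' into seconds."""
    secs = 0.0
    for part in text.split(":"):
        secs = secs * 60 + float(part)
    return secs


def parse_time_line(line):
    """Extract user/system/total seconds and recompute the CPU percentage."""
    m = TIME_RE.search(line)
    if m is None:
        raise ValueError("no zsh time summary found")
    user = float(m["user"])
    system = float(m["sys"])
    total = to_seconds(m["total"])
    return {
        "user": user,
        "system": system,
        "total": total,
        "cpu_pct": int((user + system) / total * 100),  # truncated, like the report
        "reported_pct": int(m["pct"]),
    }


if __name__ == "__main__":
    cold = parse_time_line("15.98s user 0.23s system 76% cpu 21.171 total")
    warm = parse_time_line("15.96s user 0.17s system 99% cpu 16.150 total")
    # Wall-clock time not covered by CPU time in the uncached run.
    print(round(cold["total"] - warm["total"], 3))
```

For the two regexParser runs, user time is nearly identical (15.98 s vs 15.96 s) while the total differs by about 5.0 s, which suggests the extra time is spent waiting for the file read rather than matching.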


[word count]
    [firefly]
    cached
wc 500MB.txt > /dev/null  3.94s user 0.14s system 99% cpu 4.079 total
    cpu time
     1  3.702293
     4  1.003647
     8  0.529814
    12  0.398372

    not cached
wc 500MB.txt >/dev/null  3.95s user 0.20s system 39% cpu 10.590 total
    [mmap]
    1   9.957094
    4   8.625949
    8  10.351554
    12  9.264983

    [bread]
    1   9.326340
    4   8.520677
    8   8.036562
    12  7.825948
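For reference, the two uncached read strategies compared above can be sketched as follows. This assumes [mmap] means mapping the whole file into memory and [bread] means reading it in fixed-size blocks (the -br option seen elsewhere in this memo); the block size and function names are made up for illustration and do not come from Cerium.

```python
import mmap
import os
import tempfile


def count_newlines_mmap(path):
    """Map the file read-only and search the mapping for newlines."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            count = 0
            pos = mm.find(b"\n")
            while pos != -1:
                count += 1
                pos = mm.find(b"\n", pos + 1)
            return count


def count_newlines_blocks(path, block_size=1 << 20):
    """Read the file in fixed-size blocks and count newlines per block."""
    count = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            count += block.count(b"\n")
    return count


if __name__ == "__main__":
    # Small self-contained check on a temporary file.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"one\ntwo\nthree\n" * 1000)
        print(count_newlines_mmap(path), count_newlines_blocks(path))
    finally:
        os.remove(path)
```

With mmap, pages are faulted in on first touch, so the read cost is mixed into the scan itself; with block reads, each read can in principle be handed to a worker as soon as it completes.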


---------------------------------------------------------------
cpu 6

./cerium/ceriumGrep -regex '(a|b)*a(a|b)(a|b)' -br -file file/ab500MB.txt -cp  32.90s user 1.28s system 99% cpu 34.514 total
cache 17.625

./cerium/ceriumGrep -regex '(a|b)*a(a|b)(a|b)(a|b)' -br -file file/ab500MB.tx  31.77s user 1.18s system 109% cpu 30.167 total
cache 19.153

./cerium/ceriumGrep -regex '(a|b)*a(a|b)(a|b)(a|b)(a|b)' -br -file  -cpu 6 >   31.82s user 1.17s system 113% cpu 29.160 total
17.193

./cerium/ceriumGrep -regex '(a|b)*a(a|b)(a|b)(a|b)(a|b)(a|b)' -br -file  -cpu  33.98s user 1.36s system 97% cpu 36.390 total
19.701

82.88s user 0.20s system 99% cpu 1:23.09 total
egrep 83.09

Tried increasing the number of states, but it does not seem to affect the speed??
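A note on the question above: for the pattern family (a|b)*a(a|b)...(a|b), subset construction gives a DFA whose state count doubles with every added (a|b), yet a table-driven DFA still performs one lookup per input byte, so match time does not have to grow with the number of states. The sketch below only counts the reachable DFA states. The NFA numbering and the name dfa_state_count are assumptions for illustration, not taken from ceriumGrep or regexParser.

```python
def dfa_state_count(n_tail):
    """Count DFA states for (a|b)*a followed by n_tail copies of (a|b).

    NFA used here: state 0 loops on a and b, 0 --a--> 1,
    i --a|b--> i+1 for 1 <= i < k, and state k = n_tail + 1 accepts.
    """
    k = n_tail + 1

    def step(subset, ch):
        nxt = {0}  # state 0 is always kept because of the (a|b)* loop
        for s in subset:
            if s == 0 and ch == "a":
                nxt.add(1)
            elif 1 <= s < k:
                nxt.add(s + 1)
        return frozenset(nxt)

    start = frozenset({0})
    seen = {start}
    todo = [start]
    while todo:
        cur = todo.pop()
        for ch in "ab":
            nxt = step(cur, ch)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return len(seen)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, dfa_state_count(n))
```

Each DFA state records which of the last k input symbols were 'a', which is why the count is 2 to the power k.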

egrep -o '(a|b)*a(a|b)(a|b)' file/ab500MB.txt > /dev/null

./cerium/ceriumGrep -regex '(a|b)*a(a|b)(a|b)' -file file/ab500MB.txt -cpu 6   32.82s user 0.85s system 202% cpu 16.595 total


[cached]
cpu time
 1 29.053
 4 21.091
 8 16.595
12 16.021

[not cached, mmap]
cpu time
 1 57.654
 4 43.959
 8 33.365
12 35.475

[not cached, bread]
cpu time
 1 40.494
 4 33.718
 8 34.257
12 32.457

./cerium/ceriumGrep -regex '[A-Z][a-zA-Z0-9_]*' -file file/500MB.txt >   25.29s user 0.53s system 100% cpu 25.721 total

egrep -o '[A-Z][a-zA-Z0-9_]*' 500MB.txt > /dev/null
56.34s user 0.16s system 99% cpu 56.506 total
line:13260580

[cached]
cpu time
 1  25.721
 4  11.165
 8   8.180
12   7.380
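To read the table above as scaling numbers, speedup is T(1) / T(n) and parallel efficiency is speedup / n. A small sketch using the cached times from this table (the function name scaling is hypothetical):

```python
def scaling(times):
    """Map cpu count -> (speedup, efficiency) relative to the 1-cpu time."""
    base = times[1]
    result = {}
    for n, t in sorted(times.items()):
        speedup = base / t
        result[n] = (speedup, speedup / n)
    return result


# Cached ceriumGrep times for '[A-Z][a-zA-Z0-9_]*' on 500MB.txt (seconds).
cached = {1: 25.721, 4: 11.165, 8: 8.180, 12: 7.380}

if __name__ == "__main__":
    for n, (speedup, eff) in scaling(cached).items():
        print(f"{n:2d} cpu  speedup {speedup:5.2f}  efficiency {eff:5.2f}")
```

This gives roughly 2.3x on 4 CPUs, 3.1x on 8 and 3.5x on 12, so efficiency drops to under 30% at 12 CPUs.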

egrep 56.506
[not cached : bread]
 1  30.682
 4  16.497
 8  15.907
12  15.004

[not cached : mmap]
 1  35.783
 4  17.189
 8  15.901
12  15.798

-- After generating the DFA, state transitions are done by array access --

[not cached]
./regexParser -subset -regex '[A-Z][a-zA-Z0-9_]*' -ts -file file/500MB.txt >   16.06s user 0.22s system 77% cpu 21.139 total

[cached]
./regexParser -subset -regex '[A-Z][a-zA-Z0-9_]*' -ts -file file/500MB.txt >   16.05s user 0.19s system 99% cpu 16.246 total
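A minimal sketch of what state transition by array access can look like for '[A-Z][a-zA-Z0-9_]*': a two-state table indexed as table[state][byte], scanned once over the input, reporting matches the way grep -o would. This is an illustration in Python under my own assumptions, not regexParser's actual code; build_table and find_matches are hypothetical names.

```python
def build_table():
    """Return table[state][byte] -> next state.

    State 0: outside a match. State 1: inside a match.
    """
    head = set(range(ord("A"), ord("Z") + 1))
    word = (
        head
        | set(range(ord("a"), ord("z") + 1))
        | set(range(ord("0"), ord("9") + 1))
        | {ord("_")}
    )
    outside = [1 if b in head else 0 for b in range(256)]
    inside = [1 if b in word else 0 for b in range(256)]
    return [outside, inside]


def find_matches(data):
    """Scan bytes once; each step is a single array lookup."""
    table = build_table()
    matches = []
    state = 0
    start = 0
    for i, b in enumerate(data):
        nxt = table[state][b]
        if state == 0 and nxt == 1:
            start = i  # a match begins at this byte
        elif state == 1 and nxt == 0:
            matches.append(data[start:i])  # the match ended just before i
        state = nxt
    if state == 1:
        matches.append(data[start:])  # match runs to end of input
    return matches


if __name__ == "__main__":
    print(find_matches(b"foo Bar_1 baz Qux9!x"))
```

A byte that ends a match can never start a new one here, because every character in [A-Z] is also in [a-zA-Z0-9_], so two states are enough for this pattern.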

-- cgrep , egrep --

cgrep -G '[A-Z][a-zA-Z0-9_]*' file/500MB.txt --no-line-umber --no-filename >/dev/null
Could not measure (left it running for about 2 hours)

egrep -o '(a|b)*a(a|b)(a|b)(a|b)(a|b)(a|b)(a|b)(a|b)' file/ab500MB.txt >   113.08s user 0.21s system 99% cpu 1:53.29 total
12503552

egrep -o '(a|b)*a(a|b)(a|b)(a|b)(a|b)(a|b)(a|b)' file/ab500MB.txt > /dev/null  103.32s user 0.18s system 99% cpu 1:43.50 total
14066496

egrep -o '(a|b)*a(a|b)(a|b)(a|b)(a|b)(a|b)' file/ab500MB.txt > /dev/null  98.29s user 0.18s system 99% cpu 1:38.47 total
15629440

egrep -o '(a|b)*a(a|b)(a|b)(a|b)(a|b)' file/ab500MB.txt > /dev/null  94.72s user 0.18s system 99% cpu 1:34.89 total
16410912

egrep -o '(a|b)*a(a|b)(a|b)(a|b)' file/ab500MB.txt > /dev/null  90.15s user 0.19s system 99% cpu 1:30.33 total
line:16410912

egrep -o '(a|b)*a(a|b)(a|b)' file/ab500MB.txt > /dev/null  82.88s user 0.20s system 99% cpu 1:23.09 total
line:19536800

egrep -o '[A-Z][a-zA-Z0-9_]*' 500MB.txt > /dev/null
56.34s user 0.16s system 99% cpu 56.506 total
line:13260580

After sudo purge (cache cleared)
egrep -o '[A-Z][a-zA-Z0-9_]*' 500MB.txt > /dev/null
57.37s user 0.22s system 98% cpu 58.382 total

No difference with or without the cache.
(Is it grepping while reading the file in every time?)


-------------------------------------------

Mon Feb  8 17:24:16 JST 2016
compare cgrep egrep ceriumGrep seqsearch
500MB.txt '[A-Z][a-zA-Z0-9_]*'
ab500MB.txt '(a|b)*a(a|b)(a|b)'


↓ measurement conditions changed

time cgrep -G '[A-Z][a-zA-Z0-9_]*' ../../Game/Cerium/example/bm_search/1GB.txt --no-line-umber --no-filename >/dev/null

[word count]
    firefly
    cpu time
     1  7.408101
     2  3.800094
     3  2.593649
     4  1.982035
     5  1.609130
     6  1.356986
     7  1.171626
     8  1.038483
     9  0.931845
    10  0.851650
    11  0.783369
    12  0.741725
    13  0.729744
    14  0.721221
    15  0.706474
    16  0.694984

    [mmap]
     1 19.124272
     4 17.701034
     8 17.517347
    16 16.844748

    [bread]
     1 15.219672
     4 15.892460
     8 13.709429
    16 13.913612

----------------------------------------
How to clear the cache
%sudo purge

./cerium/ceriumGrep  -regex '[A-Z][A-Za-z]*' -file ../../../Game/Cerium/example/bm_search/1GB.txt -cpu 16 -br

[firefly]
    [cached : no file read time]
    firefly
    | CPU | time |
     1   85.171171
     2   55.709298
     3   48.688031
     4   42.053209
     5   40.690125
     6   37.075352
     7   34.771558
     8   36.138412
     9   33.190304
    10   35.892051
    11   33.734864
    12   31.231748
    13   32.997263
    14   31.953924
    15   31.359396
    16   31.367073

    [not cached : includes file read time]
    [mmap]
     1   96.669395
     4   47.382920
     8   40.574622
    16   41.616542

    [bread]
     1   84.327310
     4   44.930445
     8   43.237358
    16   42.504598

    egrep -o

    [not cached]
    egrep -o '[A-Z][a-zA-Z0-9_]*' ../../../Game/Cerium/example/bm_search/1GB.txt
    110.78s user 24.05s system 99% cpu 2:15.22 total

    [cached]
    egrep -o '[A-Z][a-zA-Z0-9_]*' ../../../Game/Cerium/example/bm_search/1GB.txt
    111.36s user 24.28s system 99% cpu 2:16.33 total



    ./cerium/ceriumGrep  -regex '(a|b)*a(a|b)(a|b)' -file ../../../Game/Cerium/example/bm_search/1GB.txt -cpu 8
    cpu time
     1  58.409044
     2  30.587006
     3  19.761497
     4  15.099642
     5  12.150340
     6  10.202328
     7  8.794964
     8   7.791925
     9   6.884088
    10   6.195592
    11   5.702492
    12   5.412080
    13   5.330420
    14   5.247614
    15   5.165163
    16   5.115427
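The times above flatten out after about 12 CPUs. One way to quantify that is the Karp-Flatt metric, which estimates the serial fraction e from a measured speedup S on p CPUs as e = (1/S - 1/p) / (1 - 1/p). The sketch below applies it to a subset of the rows in this table; the function name karp_flatt is hypothetical and the interpretation is only a rough estimate.

```python
def karp_flatt(t1, tp, p):
    """Experimentally determined serial fraction for p > 1 CPUs."""
    speedup = t1 / tp
    return (1.0 / speedup - 1.0 / p) / (1.0 - 1.0 / p)


# ceriumGrep '(a|b)*a(a|b)(a|b)' on 1GB.txt, cached (seconds).
times = {1: 58.409044, 2: 30.587006, 4: 15.099642, 8: 7.791925, 16: 5.115427}

if __name__ == "__main__":
    for p in sorted(times):
        if p == 1:
            continue
        e = karp_flatt(times[1], times[p], p)
        print(f"{p:2d} cpu  serial fraction {e:.4f}")
```

The estimate stays near 1% at 4 and 8 CPUs and rises to roughly 2.7% at 16, which matches the small gain between 12 and 16 CPUs in the table.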

    [mmap]
     1  70.830896
     4  23.777594
     8  16.743966
    16  15.853613

    [bread]
     1  58.259406
     4  19.307748
     8  17.217379
    16  15.243179

    egrep -o '(a|b)*a(a|b)(a|b)' ../../../Game/Cerium/example/bm_search/1GB.txt
    106.43s user 0.32s system 99% cpu 1:46.75 total

----------------------------------------------------------------
    firefly
    Try increasing the number of (a|b)
    ./cerium/ceriumGrep -subset -regex '(a|b)*a(a|b)[...]' -file file/ab1GB.txt -cpu 8

    regex : (a|b)*a(a|b)

    cpu time
     8  130.188505

    regex : (a|b)*a(a|b)(a|b)
    cpu time
     8  113.549269

    regex : (a|b)*a(a|b)(a|b)(a|b)
     8  114.059856

    regex : (a|b)*a(a|b)(a|b)(a|b)(a|b)
     8  115.274656

    egrep -o '(a|b)*a(a|b)(a|b)' file/ab1GB.txt
    223.31s user 40.86s system 99% cpu 4:24.17 total

    egrep -o '(a|b)*a(a|b)(a|b)(a|b)' file/ab1GB.txt
    240.23s user 35.46s system 99% cpu 4:35.70 total

    egrep -o '(a|b)*a(a|b)(a|b)(a|b)(a|b)' file/ab1GB.txt
    252.63s user 35.24s system 99% cpu 4:47.87 total

    Hypothesis: the Print output is so large that most of the measured time is actually spent in Print
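One way to check this hypothesis is to run the same scan twice, once only counting matches and once also writing every match out, and compare the two times. The sketch below shows the idea with Python's re module and an in-memory buffer standing in for stdout; it is not ceriumGrep, and the function name scan and the test data are assumptions for illustration.

```python
import io
import re
import time

PATTERN = re.compile(rb"[A-Z][a-zA-Z0-9_]*")


def scan(data, sink=None):
    """Return (match count, elapsed seconds); write matches to sink if given."""
    started = time.perf_counter()
    count = 0
    for m in PATTERN.finditer(data):
        count += 1
        if sink is not None:
            sink.write(m.group(0))
            sink.write(b"\n")
    return count, time.perf_counter() - started


if __name__ == "__main__":
    data = b"Foo bar Baz_9 qux " * 20_000
    n_quiet, t_quiet = scan(data)
    n_print, t_print = scan(data, io.BytesIO())
    # Same number of matches; the time difference is the cost of emitting them.
    print(n_quiet, n_print, f"{t_quiet:.3f}s", f"{t_print:.3f}s")
```

If the difference between the two runs is a large share of the total, the output path rather than the matching dominates the measurement.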


    CentOS 7.2 os.cr.ie.u-ryukyu.ac.jp