annotate pyrect/translator/grep_translator.py @ 110:68b616dbe2c9
modify filtering rules.
author | Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp> |
---|---|
date | Sat, 12 Feb 2011 16:47:45 +0900 |
parents | d591da6e2988 |
children |
rev | line source |
---|---|
27 | 1 #!/usr/bin/env python |
27 | 2 |
49 | 3 import os |
47 | 4 from c_translator import CTranslator |
70 | 5 from pyrect.regexp import Regexp |
97 | 6 from pyrect.regexp.ast import ASTWalker, AnyChar, Character, SpecialInputNode |
12 | 7 |
14 | 8 class GREPTranslateExeption(Exception): |
14 | 9     pass |
14 | 10 |
12 | 11 class GREPTranslator(CTranslator): |
12 | 12     """GREPTranslator |
29 | 13     This class translates a DFA into grep source code, |
29 | 14     which is based on the (beautiful) mini-grep introduced in \"The Practice of Programming\" |
29 | 15     written by Rob Pike & Brian W. Kernighan (see template/grep.c). |
27 | 16     >>> string = \"(build|fndecl|gcc)\" |
12 | 17     >>> reg = Regexp(string) |
43 | 18     >>> tje = GREPTranslator(reg) |
12 | 19     >>> tje.translate() |
12 | 20     """ |
14 | 21 |
49 | 22     BASE_DIR = os.path.dirname(os.path.abspath(__file__)) |
49 | 23 |
43 | 24     def __init__(self, regexp): |
99 | 25         CTranslator.__init__(self, regexp) |
50 | 26         self.__bufsize = 1024 * 1024 |
59 | 27         self.thread_dfa = 1 |
59 | 28         self.thread_line = 1 |
110 | 29         self.filter = True |
68 | 30         self.filter_only = False |
70 | 31         self.filter_prefix = False |
70 | 32         self.skip_boost = True |
84 | 33         self.table_lookup = False |
67 | 34         self.start = "matcher" |
63 | 35         self.interface = "UCHARP beg, UCHARP buf, UCHARP end" |
63 | 36         self.args = "beg, buf, end" |
38 | 37 |
38 | 38     def getbufsize(self): |
38 | 39         return self.__bufsize |
38 | 40     def setbufsize(self, bufsize): |
38 | 41         self.__bufsize = abs(bufsize) |
38 | 42 |
75 | 43     bufsize = property(getbufsize, setbufsize) |
12 | 44 |
12 | 45     def emit_initialization(self): |
63 | 46         self.emit("#include <stdio.h>") |
49 | 47         self.emit("#include <stdlib.h>") |
63 | 48         self.emit("#include <sys/mman.h>") |
63 | 49         self.emit("#include <sys/types.h>") |
63 | 50         self.emit("#include <sys/stat.h>") |
63 | 51         self.emit("#include <fcntl.h>") |
80 | 52         self.emit("#include <unistd.h>") |
83 | 53         self.emit("#include <string.h>", 2) |
63 | 54 |
80 | 55         self.emit("typedef unsigned char UCHAR;") |
83 | 56         self.emit("typedef unsigned char *UCHARP;", 2) |
80 | 57 |
63 | 58         self.emit('void reject(%s);' % self.interface) |
83 | 59         self.emit("void matcher(%s);" % self.interface) |
83 | 60         self.emit('void accept(%s);' % self.interface, 2) |
63 | 61 |
70 | 62         key = None |
75 | 63 |
110 | 64         if (self.filter == True or self.filter == "bmh" or self.filter == "quick" or self.filter == "memchr")\ |
75 | 65            and self.regexp.must_words: |
70 | 66             key = max(self.regexp.must_words, key=len) |
70 | 67             if len(self.regexp.must_words) == 1 and len(key) == self.regexp.min_len: |
70 | 68                 self.filter_only = True |
70 | 69         else: |
70 | 70             self.filter = False |
70 | 71 |
70 | 72         if not self.filter_only: |
100 | 73             for state in self.fa.transition.iterkeys(): |
70 | 74                 self.emit("void %s(%s);" % (self.state_name(state), self.interface)) |
70 | 75             self.emit() |
70 | 76 |
75 | 77         if self.filter == "bmh": |
75 | 78             self.emit_bmh_filter(key) |
109 | 79         elif self.filter == "memchr": |
109 | 80             self.emit_memchr_filter(key) |
109 | 81         elif self.filter == "quick": |
109 | 82             self.emit_quick_filter(key) |
79 | 83         elif self.filter: |
109 | 84             if len(key) > 5: |
110 | 85                 print("hoge") |
109 | 86                 self.emit_quick_filter(key) |
109 | 87             else: |
110 | 88                 print("fuga") |
109 | 89                 self.emit_memchr_filter(key) |
70 | 90 |
70 | 91         if self.skip_boost and not self.filter_only and \ |
70 | 92            not AnyChar() in self.regexp.chars and \ |
106 | 93            self.regexp.min_len >= 2: |
70 | 94             self.emit_booster(self.regexp.min_len, self.regexp.chars) |
70 | 95         else: |
70 | 96             self.skip_boost = False |
57 | 97 |
49 | 98         grepsource = open(self.BASE_DIR + "/template/grep.c") |
34 | 99         self.emit(grepsource.read()) |
12 | 100 |
109 | 101     def emit_memchr_filter(self, key): |
109 | 102         l = len(key) |
109 | 103         def emit_next(): |
109 | 104             if self.filter_only: |
109 | 105                 self.emit("return accept(%s);" % self.args) |
109 | 106             elif self.filter_prefix: |
109 | 107                 self.emit("buf++;") |
109 | 108                 self.emit("return %s(%s);" % (self.state_name(self.fa.start), self.args)) |
109 | 109             else: |
109 | 110                 self.emit("beg = get_line_beg(buf, beg);") |
109 | 111                 self.emit("buf = beg;") |
109 | 112                 self.emit("return %s(%s);" % (self.state_name(self.fa.start), self.args)) |
109 | 113 |
109 | 114         self.emit("UCHARP get_line_beg(UCHARP p, UCHARP beg);", 2) |
109 | 115         self.emiti("void memchr_filter(%s) {" % self.interface) |
109 | 116         self.emit('static const UCHAR key[] = "%s";' % key) |
109 | 117 |
109 | 118         self.emit("int i, len = %d;" % l) |
109 | 119 |
109 | 120         self.emiti("while ((buf = memchr(buf, key[0], end-buf)) != NULL) {") |
109 | 121         self.emiti( "for (i = 1; i < len; i++) {") |
109 | 122         self.iemitd( "if (key[i] != buf[i]) goto retry;") |
109 | 123         self.demit( "}") |
109 | 124         self.emit( "goto next;") |
109 | 125         self.demiti("retry:") |
109 | 126         self.emit( "buf++;") |
109 | 127         self.demit("}") |
109 | 128         self.emit( "return;") |
109 | 129         self.emit( "next:") |
109 | 130         emit_next() |
109 | 131         self.demit("}", 2) |
109 | 132 |
75
e06786b3c2dc
modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
73
diff
changeset
|
133 def emit_bmh_filter(self, key): |
70
74f4e50c4f11
add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
68
diff
changeset
|
134 l = len(key) |
67
b02b321d0e06
implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
63
diff
changeset
|
135 def emit_next(): |
68
56a997f2c121
improve codegen. remove needless code (when filter-only, no need to emit dfa-code).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
67
diff
changeset
|
136 if self.filter_only: |
102
a38b57592d45
modify (add return statement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
100
diff
changeset
|
137 self.emit("return accept(%s);" % self.args) |
68
56a997f2c121
improve codegen. remove needless code (when filter-only, no need to emit dfa-code).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
67
diff
changeset
|
138 elif self.filter_prefix: |
77
c50511498bcf
improve filter.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
76
diff
changeset
|
139 self.emit("buf++;") |
102
a38b57592d45
modify (add return statement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
100
diff
changeset
|
140 self.emit("return %s(%s);" % (self.state_name(self.fa.start), self.args)) |
67
b02b321d0e06
implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
63
diff
changeset
|
141 else: |
72
8b9c3a924744
rename memrchr -> beg_get_line.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
71
diff
changeset
|
142 self.emit("beg = get_line_beg(buf, beg);") |
67
b02b321d0e06
implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
63
diff
changeset
|
143 self.emit("buf = beg;") |
102
a38b57592d45
modify (add return statement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
100
diff
changeset
|
144 self.emit("return %s(%s);" % (self.state_name(self.fa.start), self.args)) |
67
b02b321d0e06
implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
63
diff
changeset
|
145 |
72
8b9c3a924744
rename memrchr -> beg_get_line.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
71
diff
changeset
|
146 self.emit("UCHARP get_line_beg(UCHARP p, UCHARP beg);", 2) |
75
e06786b3c2dc
modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
73
diff
changeset
|
147 self.emiti("void bmh_filter(%s) {" % self.interface) |
57
81b44ae1cd73
add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
56
diff
changeset
|
148 l = len(key) |
81b44ae1cd73
add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
56
diff
changeset
|
149 if l == 1: |
67
b02b321d0e06
implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
63
diff
changeset
|
150 self.emit("buf = memchr(buf, %d, (end - buf));" % ord(key)) |
b02b321d0e06
implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
63
diff
changeset
|
151 self.emit("if (buf == NULL) return;") |
b02b321d0e06
implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
63
diff
changeset
|
152 emit_next() |
90
8cfa81638130
buf-fix: goto booster possibly, and improve code-gen routine (add some usefull functions -> demiti, iemit,,).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
85
diff
changeset
|
153 self.demit("}", 2) |
57
81b44ae1cd73
add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
56
diff
changeset
|
154 return |
81b44ae1cd73
add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
56
diff
changeset
|
155 |
77
c50511498bcf
improve filter.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
76
diff
changeset
|
156 self.emit('static const UCHAR key[] = "%s";' % key) |
70
74f4e50c4f11
add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
68
diff
changeset
|
157 |
73
a6a0504dea7b
modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
72
diff
changeset
|
158 skip = dict() |
a6a0504dea7b
modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
72
diff
changeset
|
159 for i in range(l-1): |
a6a0504dea7b
modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
72
diff
changeset
|
160 skip[key[i]] = l-1-i |
57
81b44ae1cd73
add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
56
diff
changeset
|
161 |
80
53c3ce58fc8a
modify code gen, for no-warnings (gcc -Wall).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
79
diff
changeset
|
162 self.emit("UCHARP tmp1, tmp2; buf += %d;" % (l-1), 2) |
79
623eccb93ca1
modify filter emit-option's bug.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
78
diff
changeset
|
163 |
76
dd6d2b9e48ad
improvement quick-filtering.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
75
diff
changeset
|
164 self.emiti("while (buf < end) {") |
77
c50511498bcf
improve filter.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
76
diff
changeset
|
165 self.emiti( "if (*buf == %d /* %s */) {" % (ord(key[-1]), Character.ascii(key[-1]))) |
c50511498bcf
improve filter.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
76
diff
changeset
|
166 self.emit( "tmp1 = buf, tmp2 = (UCHARP)key+%d;" % (l-1)) |
76
dd6d2b9e48ad
improvement quick-filtering.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
75
diff
changeset
|
167 self.emiti( "while (*(--tmp1) == *(--tmp2)) {") |
dd6d2b9e48ad
improvement quick-filtering.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
75
diff
changeset
|
168 self.emit( "if (tmp2 == key) goto next;") |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
169 self.demit( "}") |
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
170 self.demit( "}") |
73
a6a0504dea7b
modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
72
diff
changeset
|
171 self.emiti( "switch(*buf) {") |
a6a0504dea7b
modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
72
diff
changeset
|
172 for k, v in skip.iteritems(): |
a6a0504dea7b
modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
72
diff
changeset
|
173 self.emiti( "case %d: /* %s */" % (ord(k), Character.ascii(k))) |
a6a0504dea7b
modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
72
diff
changeset
|
174 self.emit( "buf += %d; break;" % v), self.dedent() |
a6a0504dea7b
modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
72
diff
changeset
|
175 self.emiti("default: buf += %d;" % l), self.dedent() |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
176 self.demit( "}") |
|
177 self.demit("}") |
67
b02b321d0e06
implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
63
diff
changeset
|
178 self.emit( "return;") |
|
179 self.emit( "next:") |
|
180 emit_next() |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
181 self.demit("}", 2) |
33
e9e90c006760
simplify grep.c, correnspod syntax '^'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
30
diff
changeset
|
182 |
75
e06786b3c2dc
modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
73
diff
changeset
|
183 def emit_quick_filter(self, key): |
|
184 l = len(key) |
|
185 def emit_next(): |
|
186 if self.filter_only: |
102
a38b57592d45
modify (add return statement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
100
diff
changeset
|
187 self.emit("return accept(%s);" % self.args) |
75
e06786b3c2dc
modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
73
diff
changeset
|
188 elif self.filter_prefix: |
|
189 self.emit("buf+%d;" % l) |
102
a38b57592d45
modify (add return statement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
100
diff
changeset
|
190 self.emit("return %s(%s);" % (self.state_name(self.fa.start) ,self.args)) |
75
e06786b3c2dc
modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
73
diff
changeset
|
191 else: |
|
192 self.emit("beg = get_line_beg(buf, beg);") |
|
193 self.emit("buf = beg;") |
102
a38b57592d45
modify (add return statement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
100
diff
changeset
|
194 self.emit("return %s(%s);" % (self.state_name(self.fa.start), self.args)) |
75
e06786b3c2dc
modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
73
diff
changeset
|
195 |
|
196 self.emit("UCHARP get_line_beg(UCHARP p, UCHARP beg);", 2) |
|
197 |
|
198 self.emiti("void quick_filter(%s) {" % self.interface) |
|
199 l = len(key) |
|
200 if l == 1: |
|
201 self.emit("buf = memchr(buf, %d, (end - buf));" % ord(key)) |
|
202 self.emit("if (buf == NULL) return;") |
|
203 emit_next() |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
204 self.demit("}", 2) |
75
e06786b3c2dc
modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
73
diff
changeset
|
205 return |
|
206 |
77
c50511498bcf
improve filter.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
76
diff
changeset
|
207 self.emit('static const UCHAR key[] = "%s";' % key) |
75
e06786b3c2dc
modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
73
diff
changeset
|
208 |
|
209 skip = dict() |
|
210 for i in range(l): |
|
211 skip[key[i]] = l-i |
|
212 |
80
53c3ce58fc8a
modify code gen, for no-warnings (gcc -Wall).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
79
diff
changeset
|
213 self.emit("UCHARP tmp1, tmp2, end_ = end - %d;" % (l-1), 2) |
76
dd6d2b9e48ad
improvement quick-filtering.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
75
diff
changeset
|
214 self.emiti("while (buf < end_) {") |
78
240475723cd8
add option "--filter=[bmh,quick,none]".
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
77
diff
changeset
|
215 self.emiti( "if (*buf == %d /* %s */) {" % (ord(key[0]), Character.ascii(key[0]))) |
77
c50511498bcf
improve filter.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
76
diff
changeset
|
216 self.emit( "tmp1 = buf, tmp2 = (UCHARP)key;") |
76
dd6d2b9e48ad
improvement quick-filtering.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
75
diff
changeset
|
217 self.emiti( "while (*(++tmp1) == *(++tmp2)){") |
|
218 self.emit( "if (tmp2 == key+%d) goto next;" % (l-1)) |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
219 self.demit( "}") |
|
220 self.demit( "}") |
97
5db856953793
implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
95
diff
changeset
|
221 if self.table_lookup: |
|
222 self.emiti("static const void * tbl[256] = {[0 ... 255] &&any, %s};" |
|
223 % ", ".join("[%d] &&add%s" % (ord(c), s) for c, s in skip.iteritems())) |
|
224 self.emit("goto *tbl[buf[%d]];" % l) |
|
225 defun = [] |
|
226 for s in skip.itervalues(): |
|
227 if s in defun: continue |
|
228 defun.append(s) |
|
229 self.emit("add%s: buf += %s; goto ends;" % (s, s)) |
|
230 self.emit("any: buf += %d; ends:;" % (l+1)) |
|
231 else: |
|
232 self.emiti( "switch(buf[%d]) {" % l) |
|
233 for k, v in skip.iteritems(): |
|
234 self.emiti( "case %d: /* %s */" % (ord(k), Character.ascii(k))) |
|
235 self.emit( "buf += %d; break;" % v), self.dedent() |
|
236 self.emiti("default: buf += %d;" % (l+1)), self.dedent() |
|
237 self.demit( "}") |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
238 self.demit("}") |
75
e06786b3c2dc
modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
73
diff
changeset
|
239 self.emit( "return;") |
|
240 self.emit( "next:") |
|
241 emit_next() |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
242 self.demit("}", 2) |
75
e06786b3c2dc
modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
73
diff
changeset
|
243 |
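The `emit_quick_filter` generator above builds a skip table from `key` and emits a C loop that shifts by the byte just past the current window, which resembles Sunday's quick-search. A minimal pure-Python sketch of that idea follows. `build_skip` and `quick_search` are hypothetical names that do not exist in pyrect, and the bounds check here is stricter than in the emitted C.

```python
def build_skip(key):
    # Same rule as the generator: skip[key[i]] = len(key) - i.
    # Later (rightmost) occurrences of a character overwrite earlier ones.
    skip = {}
    for i, c in enumerate(key):
        skip[c] = len(key) - i
    return skip


def quick_search(text, key):
    # Return the index of the first occurrence of key in text, or -1.
    l = len(key)
    skip = build_skip(key)
    pos = 0
    while pos + l <= len(text):
        if text[pos:pos + l] == key:
            return pos
        if pos + l >= len(text):
            break  # no character after the window to look at
        # Shift by the table entry for the character just past the window;
        # characters that are not in key allow the maximum shift of l + 1.
        pos += skip.get(text[pos + l], l + 1)
    return -1
```

In the generated C the `default:` branch plays the role of `skip.get(..., l + 1)`.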
70
74f4e50c4f11
add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
68
diff
changeset
|
244 def emit_booster(self, min_len, chars): |
|
245 self.emiti("void booster(%s) {" % self.interface) |
81
3dc381c90870
improve booster's routine.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
80
diff
changeset
|
246 self.emit( "UCHARP end_ = end - %d;" % (min_len-1)) |
|
247 self.emit( "if (buf > end_) return;") |
70
74f4e50c4f11
add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
68
diff
changeset
|
248 self.emiti( "do {") |
97
5db856953793
implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
95
diff
changeset
|
249 self.emiti( "switch (buf[%d]) {" % (min_len-1)) |
|
250 for c in chars: |
|
251 self.emit( "case %d: /* %s */" % (ord(c), Character.ascii(c))) |
|
252 self.emit( "goto ret;") |
|
253 self.demit( "}") |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
254 self.demit( "} while((buf += %d) <= end_);" % min_len) |
100 | 255 self.emit( "ret: return %s(%s);" % (self.state_name(self.fa.start) , self.args)) |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
256 self.demit("}", 2) |
70
74f4e50c4f11
add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
68
diff
changeset
|
257 |
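The loop emitted by `emit_booster` inspects the byte at offset `min_len - 1` and jumps ahead by `min_len` whenever that byte is not one of `chars`. A rough Python model of that skip loop is sketched below. `boost` is a hypothetical helper, it assumes `chars` is the set of characters that can occur in a match and `min_len` is the shortest possible match length, and it returns `None` at end of input where the generated C instead falls through to the start state.

```python
def boost(text, pos, min_len, chars):
    # Any match of at least min_len characters that starts inside the window
    # [pos, pos + min_len) must cover text[pos + min_len - 1]. If that
    # character cannot occur in a match, the whole window can be skipped.
    last = len(text) - min_len  # last index where a min_len match can start
    while pos <= last:
        if text[pos + min_len - 1] in chars:
            return pos  # hand control back to the DFA from here
        pos += min_len
    return None  # input exhausted without a candidate window
```

The bound `last` is chosen so that the probe never reads past the end of `text`; the emitted C uses `end - (min_len-1)` and relies on the caller's buffer layout.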
12
41391400fe68
add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
258 def emit_driver(self): |
67
b02b321d0e06
implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
63
diff
changeset
|
259 self.emiti("void matcher(%s) {" % self.interface) |
|
260 if self.filter: |
75
e06786b3c2dc
modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
73
diff
changeset
|
261 self.emit( "%s(%s);" % (self.filter + "_filter", self.args)) |
67
b02b321d0e06
implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
63
diff
changeset
|
262 else: |
100 | 263 self.emit( "%s(%s);" % (self.state_name(self.fa.start), self.args)) |
63
020ba001c58a
modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
59
diff
changeset
|
264 self.emit( "return;") |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
265 self.demit("}", 2) |
12
41391400fe68
add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
266 |
63
020ba001c58a
modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
59
diff
changeset
|
267 def emit_accept_state(self): |
|
268 self.emiti("void accept(%s) {" % self.interface) |
109
d591da6e2988
add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
106
diff
changeset
|
269 self.emit( "UCHARP ret = (UCHARP)memchr(buf, '\\n', (end - buf));") |
70
74f4e50c4f11
add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
68
diff
changeset
|
270 if self.skip_boost or self.filter: |
72
8b9c3a924744
rename memrchr -> beg_get_line.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
71
diff
changeset
|
271 self.emit( "beg = get_line_beg(buf, beg);") |
79
623eccb93ca1
modify filter emit-option's bug.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
78
diff
changeset
|
272 self.emiti( "if (ret == NULL) {") |
70
74f4e50c4f11
add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
68
diff
changeset
|
273 self.emit( "print_line(beg, end);") |
63
020ba001c58a
modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
59
diff
changeset
|
274 self.emit( "return;") |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
275 self.demit( "}") |
63
020ba001c58a
modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
59
diff
changeset
|
276 self.emit( "print_line(beg, ret);") |
|
277 self.emit( "beg = buf = ret + 1;") |
102
a38b57592d45
modify (add return statement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
100
diff
changeset
|
278 self.emit( "return %s(%s);" % (self.start, self.args)) |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
279 self.demit("}", 2) |
63
020ba001c58a
modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
59
diff
changeset
|
280 |
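The C `accept` function emitted above looks for the next newline with `memchr`, prints the line that starts at `beg`, and restarts matching just after the newline. A small Python model of that control flow might look like the following. `accept_line` is a hypothetical name, the `get_line_beg` adjustment is left out, and returning the line without its newline is an assumption, since `print_line` is defined elsewhere.

```python
def accept_line(data, beg, buf):
    # data.find plays the role of memchr(buf, '\n', end - buf).
    nl = data.find(b"\n", buf)
    if nl == -1:
        # No newline left: the matched line runs to the end of the input
        # and there is nothing further to scan.
        return data[beg:], None
    # Otherwise report the line and resume right after the newline,
    # mirroring "beg = buf = ret + 1" in the emitted C.
    return data[beg:nl], nl + 1
```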
|
281 def emit_reject_state(self): |
|
282 self.emiti("void reject(%s) {" % self.interface) |
|
283 self.emit( "if (buf >= end) return;") |
|
284 self.emit( "beg = buf;") |
70
74f4e50c4f11
add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
68
diff
changeset
|
285 self.emit( "return %s(%s);" % (self.start, self.args)) |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
286 self.demit("}", 2) |
63
020ba001c58a
modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
59
diff
changeset
|
287 |
|
288 def emit_switch(self, case, default=None): |
|
289 if not case: |
|
290 if default: |
|
291 self.emit("return %s(%s);" % (default, self.args)) |
|
292 return |
|
293 self.emiti("switch(*buf++) {") |
|
294 for case, next_ in case.iteritems(): |
|
295 self.trans_stmt.emit(case, self.state_name(next_)) |
100 | 296 if default == self.state_name(self.fa.start) and self.skip_boost: |
70
74f4e50c4f11
add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
68
diff
changeset
|
297 self.emit("default: return booster(%s);" % self.args) |
|
298 else: |
|
299 self.emit("default: return %s(%s);" % (default, self.args)) |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
300 self.demit("}") |
63
020ba001c58a
modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
59
diff
changeset
|
301 |
12
41391400fe68
add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
302 def emit_state(self, cur_state, transition): |
68
56a997f2c121
improve codegen. remove needless code (when filter-only, no need to emit dfa-code).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
67
diff
changeset
|
303 if self.filter_only: return |
|
304 |
63
020ba001c58a
modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
59
diff
changeset
|
305 self.emiti("void %s(%s) {" % (self.state_name(cur_state), self.interface)) |
|
306 |
100 | 307 if cur_state in self.fa.accepts: |
63
020ba001c58a
modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
59
diff
changeset
|
308 self.emit( "return accept(beg, buf-1, end);") |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
309 self.demit("}", 2) |
63
020ba001c58a
modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
59
diff
changeset
|
310 return |
|
311 |
85
b34a900a3a0b
modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
84
diff
changeset
|
312 if AnyChar() in transition: |
|
313 default = self.state_name(transition.pop(AnyChar())) |
|
314 else: |
100 | 315 default = self.state_name(self.fa.start) |
85
b34a900a3a0b
modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
84
diff
changeset
|
316 |
100 | 317 if self.table_lookup and (cur_state == self.fa.start or \ |
85
b34a900a3a0b
modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
84
diff
changeset
|
318 self.state_name(cur_state) == default): |
100 | 319 if self.skip_boost and default == self.state_name(self.fa.start): |
85
b34a900a3a0b
modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
84
diff
changeset
|
320 default = "booster" |
94
492f543703d5
improve jump-table initialize (when enable table-lookup). C99's range-initialize is awesome.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
92
diff
changeset
|
321 tbl = dict() |
85
b34a900a3a0b
modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
84
diff
changeset
|
322 for eol in self.eols: |
|
323 tbl[eol.char] = "reject" |
|
324 for c, n in transition.iteritems(): |
|
325 tbl[c.char] = self.state_name(n) |
94
492f543703d5
improve jump-table initialize (when enable table-lookup). C99's range-initialize is awesome.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
92
diff
changeset
|
326 self.emit( "static void (*%s_table[256])(UCHARP, UCHARP, UCHARP) = {[0 ... 255] = (void*)%s, %s};" |
|
327 % (self.state_name(cur_state), default, |
|
328 ", ".join("[%d] = %s" % (i, s) for (i, s) in tbl.items()))) |
85
b34a900a3a0b
modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
84
diff
changeset
|
329 self.emit( "return %s_table[*buf++](%s);" % (self.state_name(cur_state), self.args)) |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
330 self.demit("}", 2) |
83
68cefeb3bee1
experimentation, use table-lookup at first state's transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
81
diff
changeset
|
331 return |
68cefeb3bee1
experimentation, use table-lookup at first state's transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
81
diff
changeset
|
332 |
63
020ba001c58a
modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
59
diff
changeset
|
333 for eol in self.eols: |
|
334 transition[eol] = "reject" |
|
335 |
|
336 for input_ in transition.keys(): |
97
5db856953793
implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
95
diff
changeset
|
337 if isinstance(input_, SpecialInputNode): |
63
020ba001c58a
modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
59
diff
changeset
|
338 self.trans_stmt.emit(input_, self.state_name(transition.pop(input_))) |
|
339 |
|
340 self.emit_switch(transition, default) |
|
341 |
92
87cd1db7ec3f
modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
90
diff
changeset
|
342 self.demit("}", 2) |
63
020ba001c58a
modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
59
diff
changeset
|
343 |
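When `table_lookup` is enabled, `emit_state` above formats a 256-entry function-pointer table whose default entry is overridden per input byte. The string formatting can be tried in isolation with a sketch like the one below. `jump_table_decl` is a hypothetical helper, and the entries are sorted only to make the output deterministic, which the original code does not do. The `[0 ... 255]` range designator in the emitted C is a GNU C extension, so the generated code presumably targets GCC-compatible compilers.

```python
def jump_table_decl(state, default, tbl):
    # tbl maps byte values to the names of the C functions to call.
    entries = ", ".join("[%d] = %s" % (i, s) for i, s in sorted(tbl.items()))
    return ("static void (*%s_table[256])(UCHARP, UCHARP, UCHARP) = "
            "{[0 ... 255] = (void*)%s, %s};" % (state, default, entries))


# Example with made-up state names: newline rejects, 'a' moves to state_2.
decl = jump_table_decl("state_1", "state_0", {10: "reject", 97: "state_2"})
```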
|
344 class _trans_stmt(ASTWalker): |
|
345 def __init__(self, emit): |
|
346 self._emit = emit |
|
347 self.args = "beg, buf, end" |
|
348 |
|
349 def emit(self, input_node, next_): |
|
350 self.next = next_ |
|
351 input_node.accept(self) |
|
352 |
|
353 def visit(self, input_node): |
|
354 self._emit("/* UNKNOWN RULE */") |
|
355 self._emit("/* %s */" % repr(input_node)) |
|
356 |
|
357 def visit_Character(self, char): |
70
74f4e50c4f11
add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
68
diff
changeset
|
358 self._emit("case %d: /* %s */" % (char.char, char)) |
63
020ba001c58a
modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
59
diff
changeset
|
359 self._emit(" return %s(%s);" % (self.next, self.args)) |
360 |
361 # Special Rule |
362 def visit_BegLine(self, begline): |
363 self._emit("/* begin of line */") |
364 self._emit("if (buf == beg)") |
365 self._emit(" return %s(%s);" % (self.next, self.args), 2) |
366 |
367 def visit_Range(self, range): |
368 if isinstance(range.lower, MBCharacter) and not \ |
369 isinstance(range.upper, MBCharacter) or \ |
370 isinstance(range.upper, MBCharacter) and not \ |
371 isinstance(range.lower, MBCharacter): |
372 return |
373 |
374 if isinstance(range.lower, MBCharacter): |
375 self.visit(range) |
376 else: |
377 self._emit("if (%d <= *buf && *buf <= %d)" % (range.lower.char, range.upper.char)) |
378 self._emit(" return %s(beg, buf+1, end);" % self.next, 2) |
12
41391400fe68
add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
379 |
380 def test(): |
381 import doctest |
382 doctest.testmod() |
383 |
384 if __name__ == '__main__': test() |
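
The range rule in `visit_Range` can be read in isolation as the sketch below. It is a minimal illustration and not the repository's code: `Char`, `MBChar`, `emit_range` and `state_1` are hypothetical stand-ins, and it assumes that character nodes store integer byte values (as the `%d` in `visit_Character` suggests). The mixed single-byte/multi-byte test is written as an inequality of two booleans, which is equivalent to the chained `and not ... or ...` condition in the listing.

```python
class Char(object):
    """Hypothetical stand-in for a single-byte character node."""
    def __init__(self, code):
        self.char = code  # assumed to be an integer byte value


class MBChar(Char):
    """Hypothetical stand-in for a multi-byte character node."""


def emit_range(lower, upper, next_state, emit):
    """Emit C source for a range transition, mirroring visit_Range."""
    lower_mb = isinstance(lower, MBChar)
    upper_mb = isinstance(upper, MBChar)
    if lower_mb != upper_mb:
        # Exactly one bound is multi-byte: emit nothing, as in the listing.
        return
    if lower_mb:
        # Both bounds are multi-byte: fall back to the "unknown rule" comment.
        emit("/* UNKNOWN RULE */")
        return
    # Both bounds are single-byte: compare the current byte numerically.
    emit("if (%d <= *buf && *buf <= %d)" % (lower.char, upper.char))
    emit("  return %s(beg, buf+1, end);" % next_state)


lines = []
emit_range(Char(ord("a")), Char(ord("z")), "state_1", lines.append)
print("\n".join(lines))
```

Passing `lines.append` as the emitter keeps the sketch independent of the translator's own `_emit`, whose indentation argument is not reproduced here.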