annotate pyrect/translator/grep_translator.py @ 110:68b616dbe2c9

modify filtering rules.
author Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
date Sat, 12 Feb 2011 16:47:45 +0900
parents d591da6e2988
children
rev   line source
27
3db85244784b modify jitgrep, pre-compile grep main routine to libgrep.so. so JIT-compile only required DFA-transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 22
diff changeset
1 #!/usr/bin/env python
3db85244784b modify jitgrep, pre-compile grep main routine to libgrep.so. so JIT-compile only required DFA-transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 22
diff changeset
2
49
7f4221018adf accept UTF-8 encoding. but some foundational bug in converting algorithm NFA. maybe, which is not too difficult.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 47
diff changeset
3 import os
47
701beabd7d97 add input-rules, Range, CharacterClass, Anchor and MultiByte-Char(but not work) and more simplify NFA (is global improvement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 45
diff changeset
4 from c_translator import CTranslator
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
5 from pyrect.regexp import Regexp
97
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
6 from pyrect.regexp.ast import ASTWalker, AnyChar, Character, SpecialInputNode
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
7
14
55684cb51347 add LICENSE
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 12
diff changeset
8 class GREPTranslateExeption(Exception):
55684cb51347 add LICENSE
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 12
diff changeset
9 pass
55684cb51347 add LICENSE
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 12
diff changeset
10
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
11 class GREPTranslator(CTranslator):
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
12 """GREPTranslator
29
b833746d9d92 modify jitgrep.py and change linking method.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 27
diff changeset
13 This class translates a DFA into grep source code,
b833746d9d92 modify jitgrep.py and change linking method.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 27
diff changeset
14 which is based on the (beautiful) mini-grep introduced in "The Practice of Programming"
b833746d9d92 modify jitgrep.py and change linking method.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 27
diff changeset
15 written by Rob Pike & Brian W. Kernighan. (see template/grep.c)
27
3db85244784b modify jitgrep, pre-compile grep main routine to libgrep.so. so JIT-compile only required DFA-transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 22
diff changeset
16 >>> string = "(build|fndecl|gcc)"
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
17 >>> reg = Regexp(string)
43
83c69d42faa8 replace converting-flow, module dfareg with module regexp. it's is substantial changing in implimentation.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 38
diff changeset
18 >>> tje = GREPTranslator(reg)
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
19 >>> tje.translate()
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
20 """
14
55684cb51347 add LICENSE
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 12
diff changeset
21
49
7f4221018adf accept UTF-8 encoding. but some foundational bug in converting algorithm NFA. maybe, which is not too difficult.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 47
diff changeset
22 BASE_DIR = os.path.dirname(os.path.abspath(__file__))
7f4221018adf accept UTF-8 encoding. but some foundational bug in converting algorithm NFA. maybe, which is not too difficult.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 47
diff changeset
23
43
83c69d42faa8 replace converting-flow, module dfareg with module regexp. it's is substantial changing in implimentation.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 38
diff changeset
24 def __init__(self, regexp):
99
e327e93aeb3a remove callgraph and use Transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 97
diff changeset
25 CTranslator.__init__(self, regexp)
50
d1afae06e776 jitgrep: set bufsize default 1M. and remove with statement.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 49
diff changeset
26 self.__bufsize = 1024 * 1024
59
fd3d0b8326fe implement regexp-syntax any-char ('.').
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 58
diff changeset
27 self.thread_dfa = 1
fd3d0b8326fe implement regexp-syntax any-char ('.').
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 58
diff changeset
28 self.thread_line = 1
110
68b616dbe2c9 modify filtering rules.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 109
diff changeset
29 self.filter = True
68
56a997f2c121 improve codegen. remove needless code (when filter-only, no need to emit dfa-code).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 67
diff changeset
30 self.filter_only = False
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
31 self.filter_prefix = False
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
32 self.skip_boost = True
84
f5c4193913a1 add table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 83
diff changeset
33 self.table_lookup = False
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
34 self.start = "matcher"
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
35 self.interface = "UCHARP beg, UCHARP buf, UCHARP end"
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
36 self.args = "beg, buf, end"
38
06826250198b modify grep_translator, use property at bufsize.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 34
diff changeset
37
06826250198b modify grep_translator, use property at bufsize.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 34
diff changeset
38 def getbufsize(self):
06826250198b modify grep_translator, use property at bufsize.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 34
diff changeset
39 return self.__bufsize
06826250198b modify grep_translator, use property at bufsize.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 34
diff changeset
40 def setbufsize(self, bufsize):
06826250198b modify grep_translator, use property at bufsize.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 34
diff changeset
41 self.__bufsize = abs(bufsize)
06826250198b modify grep_translator, use property at bufsize.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 34
diff changeset
42
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
43 bufsize = property(getbufsize, setbufsize)
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
44
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
45 def emit_initialization(self):
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
46 self.emit("#include <stdio.h>")
49
7f4221018adf accept UTF-8 encoding. but some foundational bug in converting algorithm NFA. maybe, which is not too difficult.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 47
diff changeset
47 self.emit("#include <stdlib.h>")
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
48 self.emit("#include <sys/mman.h>")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
49 self.emit("#include <sys/types.h>")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
50 self.emit("#include <sys/stat.h>")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
51 self.emit("#include <fcntl.h>")
80
53c3ce58fc8a modify code gen, for no-warnings (gcc -Wall).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 79
diff changeset
52 self.emit("#include <unistd.h>")
83
68cefeb3bee1 experimentation, use table-lookup at first state's transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 81
diff changeset
53 self.emit("#include <string.h>", 2)
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
54
80
53c3ce58fc8a modify code gen, for no-warnings (gcc -Wall).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 79
diff changeset
55 self.emit("typedef unsigned char UCHAR;")
83
68cefeb3bee1 experimentation, use table-lookup at first state's transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 81
diff changeset
56 self.emit("typedef unsigned char *UCHARP;", 2)
80
53c3ce58fc8a modify code gen, for no-warnings (gcc -Wall).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 79
diff changeset
57
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
58 self.emit('void reject(%s);' % self.interface)
83
68cefeb3bee1 experimentation, use table-lookup at first state's transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 81
diff changeset
59 self.emit("void matcher(%s);" % self.interface)
68cefeb3bee1 experimentation, use table-lookup at first state's transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 81
diff changeset
60 self.emit('void accept(%s);' % self.interface, 2)
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
61
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
62 key = None
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
63
110
68b616dbe2c9 modify filtering rules.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 109
diff changeset
64 if (self.filter == True or self.filter == "bmh" or self.filter == "quick" or self.filter == "memchr")\
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
65 and self.regexp.must_words:
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
66 key = max(self.regexp.must_words, key=len)
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
67 if len(self.regexp.must_words) == 1 and len(key) == self.regexp.min_len:
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
68 self.filter_only = True
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
69 else:
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
70 self.filter = False
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
71
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
72 if not self.filter_only:
100
6aab6b1038f0 bug-fix
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 99
diff changeset
73 for state in self.fa.transition.iterkeys():
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
74 self.emit("void %s(%s);" % (self.state_name(state), self.interface))
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
75 self.emit()
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
76
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
77 if self.filter == "bmh":
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
78 self.emit_bmh_filter(key)
109
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
79 elif self.filter == "memchr":
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
80 self.emit_memchr_filter(key)
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
81 elif self.filter == "quick":
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
82 self.emit_quick_filter(key)
79
623eccb93ca1 modify filter emit-option's bug.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 78
diff changeset
83 elif self.filter:
109
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
84 if len(key) > 5:
110
68b616dbe2c9 modify filtering rules.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 109
diff changeset
85 # key longer than 5 characters: use the quick filter
109
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
86 self.emit_quick_filter(key)
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
87 else:
110
68b616dbe2c9 modify filtering rules.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 109
diff changeset
88 # short key: fall back to the memchr filter
109
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
89 self.emit_memchr_filter(key)
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
90
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
91 if self.skip_boost and not self.filter_only and \
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
92 AnyChar() not in self.regexp.chars and \
106
8102bf4bbec6 modify range stmt.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 102
diff changeset
93 self.regexp.min_len >= 2:
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
94 self.emit_booster(self.regexp.min_len, self.regexp.chars)
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
95 else:
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
96 self.skip_boost = False
57
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
97
49
7f4221018adf accept UTF-8 encoding. but some foundational bug in converting algorithm NFA. maybe, which is not too difficult.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 47
diff changeset
98 grepsource = open(self.BASE_DIR + "/template/grep.c")
34
50b10929be29 change compile-method to full-source-compile.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 33
diff changeset
99 self.emit(grepsource.read())
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
100
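The filter-selection rules in emit_initialization (source lines 62-89) can be restated as a small standalone function. This is only a sketch for reading the listing, not part of pyrect: the name `choose_filter` and its return tuple are hypothetical, and `must_words` and `min_len` are assumed to be plain values of the kind `self.regexp` exposes.

```python
def choose_filter(requested, must_words, min_len):
    """Restate the selection rules of emit_initialization (hypothetical helper).

    requested  -- True, False, "bmh", "quick" or "memchr" (the `filter` attribute)
    must_words -- literal strings that every match must contain
    min_len    -- minimum length of any match
    Returns (filter_name, key, filter_only); filter_name is None when
    no filter would be emitted.
    """
    if requested not in (True, "bmh", "quick", "memchr") or not must_words:
        return None, None, False

    # The longest required literal becomes the search key.
    key = max(must_words, key=len)

    # If the single required literal is as long as the shortest match,
    # finding the key already decides the match, so no DFA code is needed.
    filter_only = len(must_words) == 1 and len(key) == min_len

    if requested is True:
        # Automatic choice: quick filter for long keys, memchr for short ones.
        name = "quick" if len(key) > 5 else "memchr"
    else:
        name = requested
    return name, key, filter_only
```

For example, `choose_filter(True, ["fndecl", "gcc"], 3)` picks the quick filter with key `"fndecl"`, because the key is longer than five characters.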
109
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
101 def emit_memchr_filter(self, key):
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
102 l = len(key)
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
103 def emit_next():
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
104 if self.filter_only:
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
105 self.emit("return accept(%s);" % self.args)
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
106 elif self.filter_prefix:
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
107 self.emit("buf++;")
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
108 self.emit("return %s(%s);" % (self.state_name(self.fa.start), self.args))
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
109 else:
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
110 self.emit("beg = get_line_beg(buf, beg);")
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
111 self.emit("buf = beg;")
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
112 self.emit("return %s(%s);" % (self.state_name(self.fa.start), self.args))
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
113
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
114 self.emit("UCHARP get_line_beg(UCHARP p, UCHARP beg);", 2)
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
115 self.emiti("void memchr_filter(%s) {" % self.interface)
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
116 self.emit('static const UCHAR key[] = "%s";' % key)
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
117
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
118 self.emit("int i, len = %d;" % l)
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
119
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
120 self.emiti("while ((buf = memchr(buf, key[0], end-buf)) != NULL) {")
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
121 self.emiti( "for (i = 1; i < len; i++) {")
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
122 self.iemitd( "if (key[i] != buf[i]) goto retry;")
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
123 self.demit( "}")
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
124 self.emit( "goto next;")
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
125 self.demiti("retry:")
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
126 self.emit( "buf++;")
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
127 self.demit("}")
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
128 self.emit( "return;")
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
129 self.emit( "next:")
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
130 emit_next()
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
131 self.demit("}", 2)
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
132
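The C loop emitted by emit_memchr_filter can be easier to follow as a Python equivalent. The sketch below is an illustration under assumptions, not code from the repository: `memchr_filter` here is a hypothetical Python function that borrows the name of the emitted C function, and `bytes.find` stands in for `memchr`.

```python
def memchr_filter(data, key, start=0):
    """Return the index of the first occurrence of key in data, or -1.

    Mirrors the emitted C loop: locate the first byte of the key, verify
    the remaining bytes, and on mismatch retry one byte further on.
    """
    first = key[:1]
    pos = data.find(first, start)          # memchr(buf, key[0], end - buf)
    while pos != -1:
        # The emitted C compares key[1:] byte by byte in a for loop;
        # a slice comparison does the same and never reads past the end.
        if data[pos:pos + len(key)] == key:
            return pos                     # "goto next"
        pos = data.find(first, pos + 1)    # "retry: buf++"
    return -1                              # loop exhausted: plain "return"
```

As listed, the inner loop of the emitted C does not compare `buf + i` against `end`, so the bounds safety of the slice is a property of this sketch only.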
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
133 def emit_bmh_filter(self, key):
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
134 l = len(key)
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
135 def emit_next():
68
56a997f2c121 improve codegen. remove needless code (when filter-only, no need to emit dfa-code).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 67
diff changeset
136 if self.filter_only:
102
a38b57592d45 modify (add return statement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 100
diff changeset
137 self.emit("return accept(%s);" % self.args)
68
56a997f2c121 improve codegen. remove needless code (when filter-only, no need to emit dfa-code).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 67
diff changeset
138 elif self.filter_prefix:
77
c50511498bcf improve filter.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 76
diff changeset
139 self.emit("buf++;")
102
a38b57592d45 modify (add return statement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 100
diff changeset
140 self.emit("return %s(%s);" % (self.state_name(self.fa.start), self.args))
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
141 else:
72
8b9c3a924744 rename memrchr -> beg_get_line.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 71
diff changeset
142 self.emit("beg = get_line_beg(buf, beg);")
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
143 self.emit("buf = beg;")
102
a38b57592d45 modify (add return statement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 100
diff changeset
144 self.emit("return %s(%s);" % (self.state_name(self.fa.start), self.args))
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
145
72
8b9c3a924744 rename memrchr -> beg_get_line.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 71
diff changeset
146 self.emit("UCHARP get_line_beg(UCHARP p, UCHARP beg);", 2)
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
147 self.emiti("void bmh_filter(%s) {" % self.interface)
57
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
148 l = len(key)
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
149 if l == 1:
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
150 self.emit("buf = memchr(buf, %d, (end - buf));" % ord(key))
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
151 self.emit("if (buf == NULL) return;")
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
152 emit_next()
90
8cfa81638130 buf-fix: goto booster possibly, and improve code-gen routine (add some usefull functions -> demiti, iemit,,).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 85
diff changeset
153 self.demit("}", 2)
57
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
154 return
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
155
77
c50511498bcf improve filter.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 76
diff changeset
156 self.emit('static const UCHAR key[] = "%s";' % key)
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
157
73
a6a0504dea7b modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 72
diff changeset
158 skip = dict()
a6a0504dea7b modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 72
diff changeset
159 for i in range(l-1):
a6a0504dea7b modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 72
diff changeset
160 skip[key[i]] = l-1-i
57
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
161
80
53c3ce58fc8a modify code gen, for no-warnings (gcc -Wall).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 79
diff changeset
162 self.emit("UCHARP tmp1, tmp2; buf += %d;" % (l-1), 2)
79
623eccb93ca1 modify filter emit-option's bug.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 78
diff changeset
163
76
dd6d2b9e48ad improvement quick-filtering.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 75
diff changeset
164 self.emiti("while (buf < end) {")
77
c50511498bcf improve filter.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 76
diff changeset
165 self.emiti( "if (*buf == %d /* %s */) {" % (ord(key[-1]), Character.ascii(key[-1])))
c50511498bcf improve filter.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 76
diff changeset
166 self.emit( "tmp1 = buf, tmp2 = (UCHARP)key+%d;" % (l-1))
76
dd6d2b9e48ad improvement quick-filtering.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 75
diff changeset
167 self.emiti( "while (*(--tmp1) == *(--tmp2)) {")
dd6d2b9e48ad improvement quick-filtering.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 75
diff changeset
168 self.emit( "if (tmp2 == key) goto next;")
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
169 self.demit( "}")
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
170 self.demit( "}")
73
a6a0504dea7b modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 72
diff changeset
171 self.emiti( "switch(*buf) {")
a6a0504dea7b modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 72
diff changeset
172 for k, v in skip.iteritems():
a6a0504dea7b modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 72
diff changeset
173 self.emiti( "case %d: /* %s */" % (ord(k), Character.ascii(k)))
a6a0504dea7b modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 72
diff changeset
174 self.emit( "buf += %d; break;" % v), self.dedent()
a6a0504dea7b modify bm-filter's implimentation. table-lookup -> switch. it's more simple and beautiful.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 72
diff changeset
175 self.emiti("default: buf += %d;" % l), self.dedent()
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
176 self.demit( "}")
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
177 self.demit("}")
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
178 self.emit( "return;")
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
179 self.emit( "next:")
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
180 emit_next()
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
181 self.demit("}", 2)
33
e9e90c006760 simplify grep.c, correnspod syntax '^'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 30
diff changeset
182
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
183 def emit_quick_filter(self, key):
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
184 l = len(key)
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
185 def emit_next():
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
186 if self.filter_only:
102
a38b57592d45 modify (add return statement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 100
diff changeset
187 self.emit("return accept(%s);" % self.args)
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
188 elif self.filter_prefix:
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
189 self.emit("buf+%d;" % l)
102
a38b57592d45 modify (add return statement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 100
diff changeset
190 self.emit("return %s(%s);" % (self.state_name(self.fa.start), self.args))
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
191 else:
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
192 self.emit("beg = get_line_beg(buf, beg);")
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
193 self.emit("buf = beg;")
102
a38b57592d45 modify (add return statement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 100
diff changeset
194 self.emit("return %s(%s);" % (self.state_name(self.fa.start), self.args))
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
195
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
196 self.emit("UCHARP get_line_beg(UCHARP p, UCHARP beg);", 2)
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
197
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
198 self.emiti("void quick_filter(%s) {" % self.interface)
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
199 l = len(key)
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
200 if l == 1:
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
201 self.emit("buf = memchr(buf, %d, (end - buf));" % ord(key))
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
202 self.emit("if (buf == NULL) return;")
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
203 emit_next()
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
204 self.demit("}", 2)
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
205 return
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
206
77
c50511498bcf improve filter.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 76
diff changeset
207 self.emit('static const UCHAR key[] = "%s";' % key)
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
208
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
209 skip = dict()
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
210 for i in range(l):
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
211 skip[key[i]] = l-i
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
212
80
53c3ce58fc8a modify code gen, for no-warnings (gcc -Wall).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 79
diff changeset
213 self.emit("UCHARP tmp1, tmp2, end_ = end - %d;" % (l-1), 2)
76
dd6d2b9e48ad improvement quick-filtering.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 75
diff changeset
214 self.emiti("while (buf < end_) {")
78
240475723cd8 add option "--filter=[bmh,quick,none]".
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
215 self.emiti( "if (*buf == %d /* %s */) {" % (ord(key[0]), Character.ascii(key[0])))
77
c50511498bcf improve filter.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 76
diff changeset
216 self.emit( "tmp1 = buf, tmp2 = (UCHARP)key;")
76
dd6d2b9e48ad improvement quick-filtering.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 75
diff changeset
217 self.emiti( "while (*(++tmp1) == *(++tmp2)){")
dd6d2b9e48ad improvement quick-filtering.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 75
diff changeset
218 self.emit( "if (tmp2 == key+%d) goto next;" % (l-1))
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
219 self.demit( "}")
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
220 self.demit( "}")
97
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
221 if self.table_lookup:
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
222 self.emiti("static const void * tbl[256] = {[0 ... 255] &&any, %s};"
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
223 % ", ".join("[%d] &&add%s" % (ord(c), s) for c, s in skip.iteritems()))
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
224 self.emit("goto *tbl[buf[%d]];" % l)
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
225 defun = []
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
226 for s in skip.itervalues():
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
227 if s in defun: continue
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
228 defun.append(s)
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
229 self.emit("add%s: buf += %s; goto ends;" % (s, s))
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
230 self.emit("any: buf += %d; ends:;" % (l+1))
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
231 else:
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
232 self.emiti( "switch(buf[%d]) {" % l)
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
233 for k, v in skip.iteritems():
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
234 self.emiti( "case %d: /* %s */" % (ord(k), Character.ascii(k)))
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
235 self.emit( "buf += %d; break;" % v), self.dedent()
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
236 self.emiti("default: buf += %d;" % (l+1)), self.dedent()
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
237 self.demit( "}")
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
238 self.demit("}")
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
239 self.emit( "return;")
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
240 self.emit( "next:")
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
241 emit_next()
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
242 self.demit("}", 2)
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
243
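Note on emit_quick_filter: the skip table built above (`skip[key[i]] = l-i`, default shift `l+1`, indexed by `buf[l]`) has the shape of a quick-search style shift table. The following is a minimal Python sketch of that idea, not the generated C itself. The names `build_skip_table` and `quick_search` are hypothetical and do not appear in the source, and the sketch adds a bounds check that the emitted code does not obviously perform.

```python
def build_skip_table(key):
    """Map each key character to its shift, mirroring `skip[key[i]] = l - i`."""
    l = len(key)
    skip = {}
    for i in range(l):
        # A later occurrence overwrites an earlier one, so the rightmost
        # occurrence of each character decides the shift.
        skip[key[i]] = l - i
    return skip


def quick_search(text, key):
    """Return the index of the first occurrence of key in text, or -1."""
    l = len(key)
    skip = build_skip_table(key)
    pos = 0
    while pos + l <= len(text):
        if text[pos:pos + l] == key:
            return pos
        if pos + l >= len(text):
            # No character after the window: nothing left to shift on.
            break
        # Shift by the entry for the character just past the window,
        # or by l + 1 when that character does not occur in the key.
        pos += skip.get(text[pos + l], l + 1)
    return -1
```

For the key "abca" the table is {'a': 1, 'b': 3, 'c': 2}; a character outside the key shifts the window by 5.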
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
244 def emit_booster(self, min_len, chars):
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
245 self.emiti("void booster(%s) {" % self.interface)
81
3dc381c90870 improve booster's routine.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 80
diff changeset
246 self.emit( "UCHARP end_ = end - %d;" % (min_len-1))
3dc381c90870 improve booster's routine.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 80
diff changeset
247 self.emit( "if (buf > end_) return;")
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
248 self.emiti( "do {")
97
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
249 self.emiti( "switch (buf[%d]) {" % (min_len-1))
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
250 for c in chars:
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
251 self.emit( "case %d: /* %s */" % (ord(c), Character.ascii(c)))
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
252 self.emit( "goto ret;")
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
253 self.demit( "}")
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
254 self.demit( "} while((buf += %d) <= end_);" % min_len)
100
6aab6b1038f0 bug-fix
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 99
diff changeset
255 self.emit( "ret: return %s(%s);" % (self.state_name(self.fa.start), self.args))
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
256 self.demit("}", 2)
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
257
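Note on emit_booster: the emitted loop inspects `buf[min_len-1]` and, when that byte is not one of `chars`, advances `buf` by `min_len` before handing control back to the start state. The sketch below simulates that loop in Python under two assumptions: that `chars` holds every character that can occur in a match, and that `min_len` is the shortest possible match length. The name `booster_skip` is hypothetical, and the index guard is an addition of this sketch.

```python
def booster_skip(text, pos, min_len, chars):
    """Return the position where the DFA would be restarted.

    Returns None when pos is already past end_, which corresponds to the
    early `return` in the emitted C.
    """
    end_ = len(text) - (min_len - 1)
    if pos > end_:
        return None
    while True:
        probe = pos + min_len - 1
        # Guard added for the sketch: do not index past the end of text.
        if probe < len(text) and text[probe] in chars:
            return pos  # a match may overlap this window: restart the DFA here
        pos += min_len
        if pos > end_:
            return pos  # loop exhausted; the start state sees the end of input
```

With text "abcXXXdef", min_len 3 and chars {'X'}, the first window is skipped and the DFA restarts at index 3.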
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
258 def emit_driver(self):
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
259 self.emiti("void matcher(%s) {" % self.interface)
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
260 if self.filter:
75
e06786b3c2dc modify filtering algorithm, unloop string-compare!!
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 73
diff changeset
261 self.emit( "%s(%s);" % (self.filter + "_filter", self.args))
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
262 else:
100
6aab6b1038f0 bug-fix
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 99
diff changeset
263 self.emit( "%s(%s);" % (self.state_name(self.fa.start), self.args))
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
264 self.emit( "return;")
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
265 self.demit("}", 2)
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
266
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
267 def emit_accept_state(self):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
268 self.emiti("void accept(%s) {" % self.interface)
109
d591da6e2988 add memchr-filter. and fix emit buf.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 106
diff changeset
269 self.emit( "UCHARP ret = (UCHARP)memchr(buf, '\\n', (end - buf));")
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
270 if self.skip_boost or self.filter:
72
8b9c3a924744 rename memrchr -> beg_get_line.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 71
diff changeset
271 self.emit( "beg = get_line_beg(buf, beg);")
79
623eccb93ca1 modify filter emit-option's bug.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 78
diff changeset
272 self.emiti( "if (ret == NULL) {")
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
273 self.emit( "print_line(beg, end);")
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
274 self.emit( "return;")
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
275 self.demit( "}")
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
276 self.emit( "print_line(beg, ret);")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
277 self.emit( "beg = buf = ret + 1;")
102
a38b57592d45 modify (add return statement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 100
diff changeset
278 self.emit( "return %s(%s);" % (self.start, self.args))
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
279 self.demit("}", 2)
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
280
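Note on emit_accept_state: the emitted `accept` looks for the next newline with `memchr`, prints the line from `beg` up to it, and resumes matching just after it. The sketch below mirrors that control flow on a Python bytes object. It assumes `beg` already points at the start of the line, leaves out the `get_line_beg` adjustment, and returns the line instead of printing it. The function name and the return convention are hypothetical.

```python
def accept(data, beg, buf):
    """Return (matched_line, next_pos); next_pos is None at end of input."""
    ret = data.find(b"\n", buf)      # plays the role of memchr(buf, '\n', end - buf)
    if ret == -1:
        # No newline left: the matched line runs to the end of the data.
        return data[beg:], None
    # Otherwise the line ends at the newline; matching resumes after it,
    # like `beg = buf = ret + 1` in the emitted code.
    return data[beg:ret], ret + 1
```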
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
281 def emit_reject_state(self):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
282 self.emiti("void reject(%s) {" % self.interface)
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
283 self.emit( "if (buf >= end) return;")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
284 self.emit( "beg = buf;")
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
285 self.emit( "return %s(%s);" % (self.start, self.args))
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
286 self.demit("}", 2)
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
287
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
288 def emit_switch(self, case, default=None):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
289 if not case:
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
290 if default:
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
291 self.emit("return %s(%s);" % (default, self.args))
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
292 return
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
293 self.emiti("switch(*buf++) {")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
294 for case, next_ in case.iteritems():
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
295 self.trans_stmt.emit(case, self.state_name(next_))
100
6aab6b1038f0 bug-fix
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 99
diff changeset
296 if default == self.state_name(self.fa.start) and self.skip_boost:
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
297 self.emit("default: return booster(%s);" % self.args)
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
298 else:
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
299 self.emit("default: return %s(%s);" % (default, self.args))
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
300 self.demit("}")
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
301
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
302 def emit_state(self, cur_state, transition):
68
56a997f2c121 improve codegen. remove needless code (when filter-only, no need to emit dfa-code).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 67
diff changeset
303 if self.filter_only: return
56a997f2c121 improve codegen. remove needless code (when filter-only, no need to emit dfa-code).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 67
diff changeset
304
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
305 self.emiti("void %s(%s) {" % (self.state_name(cur_state), self.interface))
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
306
100
6aab6b1038f0 bug-fix
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 99
diff changeset
307 if cur_state in self.fa.accepts:
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
308 self.emit( "return accept(beg, buf-1, end);")
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
309 self.demit("}", 2)
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
310 return
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
311
85
b34a900a3a0b modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 84
diff changeset
312 if AnyChar() in transition:
b34a900a3a0b modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 84
diff changeset
313 default = self.state_name(transition.pop(AnyChar()))
b34a900a3a0b modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 84
diff changeset
314 else:
100
6aab6b1038f0 bug-fix
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 99
diff changeset
315 default = self.state_name(self.fa.start)
85
b34a900a3a0b modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 84
diff changeset
316
100
6aab6b1038f0 bug-fix
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 99
diff changeset
317 if self.table_lookup and (cur_state == self.fa.start or \
85
b34a900a3a0b modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 84
diff changeset
318 self.state_name(cur_state) == default):
100
6aab6b1038f0 bug-fix
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 99
diff changeset
319 if self.skip_boost and default == self.state_name(self.fa.start):
85
b34a900a3a0b modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 84
diff changeset
320 default = "booster"
94
492f543703d5 improve jump-table initialize (when enable table-lookup). C99's range-initialize is awesome.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 92
diff changeset
321 tbl = dict()
85
b34a900a3a0b modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 84
diff changeset
322 for eol in self.eols:
b34a900a3a0b modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 84
diff changeset
323 tbl[eol.char] = "reject"
b34a900a3a0b modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 84
diff changeset
324 for c, n in transition.iteritems():
b34a900a3a0b modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 84
diff changeset
325 tbl[c.char] = self.state_name(n)
94
492f543703d5 improve jump-table initialize (when enable table-lookup). C99's range-initialize is awesome.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 92
diff changeset
326 self.emit( "static void (*%s_table[256])(UCHARP, UCHARP, UCHARP) = {[0 ... 255] = (void*)%s, %s};"
492f543703d5 improve jump-table initialize (when enable table-lookup). C99's range-initialize is awesome.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 92
diff changeset
327 % (self.state_name(cur_state), default,
492f543703d5 improve jump-table initialize (when enable table-lookup). C99's range-initialize is awesome.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 92
diff changeset
328 ", ".join("[%d] = %s" % (i, s) for (i, s) in tbl.items())))
85
b34a900a3a0b modify table-lookup option.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 84
diff changeset
329 self.emit( "return %s_table[*buf++](%s);" % (self.state_name(cur_state), self.args))
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
330 self.demit("}", 2)
83
68cefeb3bee1 experimentation, use table-lookup at first state's transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 81
diff changeset
331 return
68cefeb3bee1 experimentation, use table-lookup at first state's transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 81
diff changeset
332
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
333 for eol in self.eols:
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
334 transition[eol] = "reject"
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
335
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
336 for input_ in transition.keys():
97
5db856953793 implement range-expression. and add repeat-mn syntax(ex. A{1,10}).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 95
diff changeset
337 if isinstance(input_, SpecialInputNode):
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
338 self.trans_stmt.emit(input_, self.state_name(transition.pop(input_)))
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
339
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
340 self.emit_switch(transition, default)
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
341
92
87cd1db7ec3f modify codegen-indent.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 90
diff changeset
342 self.demit("}", 2)
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
343
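Note on the table-lookup branch of emit_state: the jump table is emitted as one initializer that first fills all 256 slots with the default target and then overrides individual byte values. The sketch below isolates that string construction. The name `table_initializer` is hypothetical, and the entries are sorted only to make the output deterministic, which the original does not do. The `[0 ... 255]` range designator is a GNU C extension, so the emitted code presumably needs gcc or a compatible compiler.

```python
def table_initializer(name, default, entries):
    """Build the initializer for a 256-entry table of state functions.

    name    -- state name used as the table prefix
    default -- target used for every byte without an explicit entry
    entries -- dict mapping byte value (int) to target function name
    """
    overrides = ", ".join("[%d] = %s" % (code, target)
                          for code, target in sorted(entries.items()))
    return ("static void (*%s_table[256])(UCHARP, UCHARP, UCHARP) = "
            "{[0 ... 255] = (void*)%s, %s};" % (name, default, overrides))
```

For example, `table_initializer("state_0", "reject", {10: "reject", 97: "state_1"})` yields a table whose entry 97 points at `state_1` while every other byte falls through to `reject`.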
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
344 class _trans_stmt(ASTWalker):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
345 def __init__(self, emit):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
346 self._emit = emit
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
347 self.args = "beg, buf, end"
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
348
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
349 def emit(self, input_node, next_):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
350 self.next = next_
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
351 input_node.accept(self)
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
352
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
353 def visit(self, input_node):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
354 self._emit("/* UNKNOWN RULE */")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
355 self._emit("/* %s */" % repr(input_node))
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
356
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
    def visit_Character(self, char):
        self._emit("case %d: /* %s */" % (char.char, char))
        self._emit(" return %s(%s);" % (self.next, self.args))

    # Special Rule
    def visit_BegLine(self, begline):
        self._emit("/* begin of line */")
        self._emit("if (buf == beg)")
        self._emit(" return %s(%s);" % (self.next, self.args), 2)

    def visit_Range(self, range_):
        lower_is_mb = isinstance(range_.lower, MBCharacter)
        upper_is_mb = isinstance(range_.upper, MBCharacter)

        # Skip ranges whose bounds mix a multi-byte and a single-byte character.
        if lower_is_mb != upper_is_mb:
            return

        if lower_is_mb:
            # Both bounds are multi-byte: hand the node to the generic handler.
            self.visit(range_)
        else:
            self._emit("if ('%s' <= *buf && *buf <= '%s')"
                       % (range_.lower.char, range_.upper.char))
            self._emit(" return %s(beg, buf+1, end);" % self.next, 2)

def test():
    import doctest
    doctest.testmod()


if __name__ == '__main__':
    test()
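The visitor methods above turn one input node at a time into lines of C. The following is a minimal, self-contained sketch of that pattern, not pyrect's actual code. `Node`, `Unsupported`, `RuleEmitter`, the name-based dispatch in `accept`, and the meaning of the second `_emit` argument (taken here to be an indent width) are all assumptions made for illustration. The sketch also compares range bounds as integer byte values rather than quoted characters, which is a simplification of its own.

```python
class Node(object):
    """Hypothetical stand-in for an AST input node (not pyrect's real class)."""

    def accept(self, visitor):
        # Dispatch on the node's class name; fall back to the generic visit().
        method = getattr(visitor, "visit_" + type(self).__name__, visitor.visit)
        return method(self)


class Character(Node):
    def __init__(self, char):
        self.char = ord(char)  # stored as a byte value so it can be a C case label

    def __str__(self):
        return chr(self.char)


class Range(Node):
    def __init__(self, lower, upper):
        self.lower, self.upper = Character(lower), Character(upper)


class BegLine(Node):
    pass


class Unsupported(Node):
    """A node type with no dedicated visit_* method."""


class RuleEmitter(object):
    """Collects generated C lines in a list instead of writing a file."""

    def __init__(self):
        self.lines = []
        self.args = "beg, buf, end"
        self.next = None

    def _emit(self, line, indent=0):
        # Assumption: the optional second argument is an indent width.
        self.lines.append(" " * indent + line)

    def emit(self, input_node, next_):
        self.next = next_
        input_node.accept(self)

    def visit(self, node):
        self._emit("/* UNKNOWN RULE: %s */" % type(node).__name__)

    def visit_Character(self, char):
        self._emit("case %d: /* %s */" % (char.char, char))
        self._emit("return %s(%s);" % (self.next, self.args), 2)

    def visit_BegLine(self, begline):
        self._emit("if (buf == beg)")
        self._emit("return %s(%s);" % (self.next, self.args), 2)

    def visit_Range(self, range_):
        self._emit("if (%d <= *buf && *buf <= %d)"
                   % (range_.lower.char, range_.upper.char))
        self._emit("return %s(beg, buf+1, end);" % self.next, 2)


emitter = RuleEmitter()
emitter.emit(Character("a"), "state_1")
emitter.emit(BegLine(), "state_2")
emitter.emit(Range("0", "9"), "state_3")
emitter.emit(Unsupported(), "state_4")
print("\n".join(emitter.lines))
```

Each call to `emit` records the name of the next state and lets the node choose its handler. A node type without a handler falls through to `visit` and leaves a comment in the generated C rather than raising an error.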