annotate pyrect/translator/grep_translator.py @ 71:3be07ba2d648

bug-fix: modify booster's stop rule. EOF - > stop.
author Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
date Sun, 07 Nov 2010 13:45:20 +0900
parents 74f4e50c4f11
children 8b9c3a924744
rev   line source
27
3db85244784b modify jitgrep, pre-compile grep main routine to libgrep.so. so JIT-compile only required DFA-transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 22
diff changeset
1 #!/usr/bin/env python
3db85244784b modify jitgrep, pre-compile grep main routine to libgrep.so. so JIT-compile only required DFA-transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 22
diff changeset
2
49
7f4221018adf accept UTF-8 encoding. but some foundational bug in converting algorithm NFA. maybe, which is not too difficult.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 47
diff changeset
3 import os
47
701beabd7d97 add input-rules, Range, CharacterClass, Anchor and MultiByte-Char(but not work) and more simplify NFA (is global improvement).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 45
diff changeset
4 from c_translator import CTranslator
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
5 from pyrect.regexp import Regexp
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
6 from pyrect.regexp.ast import ASTWalker, AnyChar, Character
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
7
14
55684cb51347 add LICENSE
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 12
diff changeset
8 class GREPTranslateExeption(Exception):
55684cb51347 add LICENSE
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 12
diff changeset
9 pass
55684cb51347 add LICENSE
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 12
diff changeset
10
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
11 class GREPTranslator(CTranslator):
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
12 """GREPTranslator
29
b833746d9d92 modify jitgrep.py and change linking method.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 27
diff changeset
13 This class can translate from DFA into grep source-code.
b833746d9d92 modify jitgrep.py and change linking method.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 27
diff changeset
14 which is based on the (beautiful) mini-grep introduced in \"The Practice of Programming\"
b833746d9d92 modify jitgrep.py and change linking method.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 27
diff changeset
15 written by Rob Pike & Brian W. Kernighan. (see template/grep.c)
27
3db85244784b modify jitgrep, pre-compile grep main routine to libgrep.so. so JIT-compile only required DFA-transition.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 22
diff changeset
16 >>> string = \"(build|fndecl|gcc)\"
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
17 >>> reg = Regexp(string)
43
83c69d42faa8 replace converting-flow, module dfareg with module regexp. it's is substantial changing in implimentation.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 38
diff changeset
18 >>> tje = GREPTranslator(reg)
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
19 >>> tje.translate()
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
20 """
14
55684cb51347 add LICENSE
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 12
diff changeset
21
49
7f4221018adf accept UTF-8 encoding. but some foundational bug in converting algorithm NFA. maybe, which is not too difficult.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 47
diff changeset
22 BASE_DIR = os.path.dirname(os.path.abspath(__file__))
7f4221018adf accept UTF-8 encoding. but some foundational bug in converting algorithm NFA. maybe, which is not too difficult.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 47
diff changeset
23
43
83c69d42faa8 replace converting-flow, module dfareg with module regexp. it's is substantial changing in implimentation.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 38
diff changeset
24 def __init__(self, regexp):
49
7f4221018adf accept UTF-8 encoding. but some foundational bug in converting algorithm NFA. maybe, which is not too difficult.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 47
diff changeset
25 CTranslator.__init__(self, regexp, fa="DFA")
50
d1afae06e776 jitgrep: set bufsize default 1M. and remove with statement.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 49
diff changeset
26 self.__bufsize = 1024 * 1024
59
fd3d0b8326fe implement regexp-syntax any-char ('.').
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 58
diff changeset
27 self.thread_dfa = 1
fd3d0b8326fe implement regexp-syntax any-char ('.').
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 58
diff changeset
28 self.thread_line = 1
57
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
29 self.filter = True
68
56a997f2c121 improve codegen. remove needless code (when filter-only, no need to emit dfa-code).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 67
diff changeset
30 self.filter_only = False
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
31 self.filter_prefix = False
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
32 self.skip_boost = True
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
33 self.start = "matcher"
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
34 self.interface = "UCHARP beg, UCHARP buf, UCHARP end"
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
35 self.args = "beg, buf, end"
38
06826250198b modify grep_translator, use property at bufsize.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 34
diff changeset
36
06826250198b modify grep_translator, use property at bufsize.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 34
diff changeset
37 def getbufsize(self,):
06826250198b modify grep_translator, use property at bufsize.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 34
diff changeset
38 return self.__bufsize
06826250198b modify grep_translator, use property at bufsize.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 34
diff changeset
39 def setbufsize(self, bufsize):
06826250198b modify grep_translator, use property at bufsize.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 34
diff changeset
40 self.__bufsize = abs(bufsize)
06826250198b modify grep_translator, use property at bufsize.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 34
diff changeset
41
06826250198b modify grep_translator, use property at bufsize.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 34
diff changeset
42 bufsize = property(getbufsize, setbufsize)
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
43
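The `bufsize` property above (listing lines 37-42) stores the value in a name-mangled attribute and passes every assignment through `abs()`. The following is a minimal sketch of that behavior in isolation; `BufferConfig` is a hypothetical stand-in class and is not part of pyrect.

```python
class BufferConfig(object):
    """Hypothetical stand-in for the bufsize handling shown in the listing."""

    def __init__(self):
        # 1 MiB default, the same value the listing assigns in __init__.
        self.__bufsize = 1024 * 1024

    def getbufsize(self):
        return self.__bufsize

    def setbufsize(self, bufsize):
        # abs() silently turns a negative request into a positive size.
        self.__bufsize = abs(bufsize)

    bufsize = property(getbufsize, setbufsize)


config = BufferConfig()
config.bufsize = -4096
print(config.bufsize)  # 4096
```

One consequence of this design is that a negative size is never reported as an error; a caller that passes a wrong sign gets a working buffer of the same magnitude.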
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
44 def emit_initialization(self):
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
45 self.emit("#include <stdio.h>")
59
fd3d0b8326fe implement regexp-syntax any-char ('.').
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 58
diff changeset
46 self.emit("#define GREP grep")
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
47 self.emit("#define UCHAR unsigned char")
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
48 self.emit("#define UCHARP unsigned char *")
49
7f4221018adf accept UTF-8 encoding. but some foundational bug in converting algorithm NFA. maybe, which is not too difficult.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 47
diff changeset
49 self.emit("#include <stdlib.h>")
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
50 self.emit("#include <sys/mman.h>")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
51 self.emit("#include <sys/types.h>")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
52 self.emit("#include <sys/stat.h>")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
53 self.emit("#include <fcntl.h>")
49
7f4221018adf accept UTF-8 encoding. but some foundational bug in converting algorithm NFA. maybe, which is not too difficult.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 47
diff changeset
54 self.emit("#include <string.h>")
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
55
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
56 self.emit('void reject(%s);' % self.interface)
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
57 self.emit("void matcher(%s);" % self.interface, 2)
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
58 self.emit('void accept(%s);' % self.interface)
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
59
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
60 key = None
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
61 if self.filter and self.regexp.must_words:
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
62 key = max(self.regexp.must_words, key=len)
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
63 if len(self.regexp.must_words) == 1 and len(key) == self.regexp.min_len:
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
64 self.filter_only = True
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
65 else:
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
66 self.filter = False
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
67
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
68 if not self.filter_only:
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
69 for state in self.cg.map.iterkeys():
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
70 self.emit("void %s(%s);" % (self.state_name(state), self.interface))
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
71 self.emit()
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
72
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
73 if self.filter: self.emit_filter(key)
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
74
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
75 if self.skip_boost and not self.filter_only and \
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
76 not AnyChar() in self.regexp.chars and \
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
77 self.regexp.min_len > 2:
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
78 self.emit_booster(self.regexp.min_len, self.regexp.chars)
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
79 else:
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
80 self.skip_boost = False
57
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
81
49
7f4221018adf accept UTF-8 encoding. but some foundational bug in converting algorithm NFA. maybe, which is not too difficult.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 47
diff changeset
82 grepsource = open(self.BASE_DIR + "/template/grep.c")
34
50b10929be29 change compile-method to full-source-compile.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 33
diff changeset
83 self.emit(grepsource.read())
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
84
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
85 def emit_filter(self, key):
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
86 l = len(key)
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
87 def emit_next():
68
56a997f2c121 improve codegen. remove needless code (when filter-only, no need to emit dfa-code).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 67
diff changeset
88 if self.filter_only:
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
89 self.emit("accept(%s);" % self.args)
68
56a997f2c121 improve codegen. remove needless code (when filter-only, no need to emit dfa-code).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 67
diff changeset
90 elif self.filter_prefix:
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
91 self.emit("buf -= %d;" % l)
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
92 self.emit("%s(%s);" % (self.state_name(self.cg.start) ,self.args))
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
93 else:
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
94 self.emit("beg = memrchr(buf, '\\n', beg);")
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
95 self.emit("buf = beg;")
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
96 self.emit("%s(%s);" % (self.state_name(self.cg.start), self.args))
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
97
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
98 self.emit("UCHARP memrchr(UCHARP p, int c, UCHARP beg);", 2)
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
99
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
100 self.emiti("void bm_filter(%s) {" % self.interface)
57
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
101 l = len(key)
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
102 if l == 1:
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
103 self.emit("buf = memchr(buf, %d, (end - buf));" % ord(key))
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
104 self.emit("if (buf == NULL) return;")
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
105 emit_next()
57
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
106 self.emitd("}", 2)
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
107 return
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
108
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
109 self.emit('static UCHAR key[] = "%s";' % key)
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
110
57
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
111 skip = [str(l)] * 256
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
112 for i in range(l - 1):
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
113 skip[ord(key[i])] = str(l-1-i)
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
114
68
56a997f2c121 improve codegen. remove needless code (when filter-only, no need to emit dfa-code).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 67
diff changeset
115 self.emiti( "static const UCHAR skip[256] = {")
57
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
116 for i in range(8):
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
117 i = i * 32
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
118 self.emit(",".join(skip[i:i+32]) + ",")
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
119 self.emitd( "};")
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
120
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
121 self.emit("UCHARP tmp1, *tmp2;", 2)
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
122 self.emit("int i; buf += %d;" % (l-1))
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
123 self.emiti("while (buf < end) {")
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
124 self.emiti( "if (*buf == %d /*'%c'*/) {" % (ord(key[l-1]), key[l-1]))
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
125 self.emit( "tmp1 = key+%d; tmp2 = buf;" % (l-1))
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
126 self.emit( "while (*(--tmp1) == *(--tmp2)) {")
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
127 self.emit( "if (tmp1 == key) goto next;")
57
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
128 self.emitd( "}")
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
129 self.emitd( "}")
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
130 self.emit( "buf += skip[*buf];")
57
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
131 self.emitd("}")
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
132 self.emit( "return;")
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
133 self.emit( "next:")
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
134 emit_next()
57
81b44ae1cd73 add fixed-string filter(Boyer-Moore), and add option '--disable-filter'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 56
diff changeset
135 self.emitd("}", 2)
33
e9e90c006760 simplify grep.c, correnspod syntax '^'.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 30
diff changeset
136
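The skip table that `emit_filter` builds (listing lines 111-113) looks like the bad-character rule of the Boyer-Moore-Horspool search: every byte defaults to a shift of the key length, and each key byte except the last gets the distance from its last occurrence to the end of the key. The sketch below restates that idea in plain Python rather than in generated C. The names `build_skip_table` and `horspool_find` are illustrative assumptions, and the slice comparison stands in for the backward byte loop of the generated code.

```python
def build_skip_table(key):
    """Return 256 shift distances, built the same way as the listing's `skip`."""
    length = len(key)
    skip = [length] * 256
    for i in range(length - 1):
        # Later occurrences overwrite earlier ones, so the last one wins.
        skip[key[i]] = length - 1 - i
    return skip


def horspool_find(text, key):
    """Return the index of the first occurrence of key in text, or -1."""
    length = len(key)
    if length == 0:
        return 0
    skip = build_skip_table(key)
    pos = length - 1  # index in text aligned with the last byte of key
    while pos < len(text):
        if text[pos - length + 1:pos + 1] == key:
            return pos - length + 1
        # Shift by the table entry of the byte under the key's last position.
        pos += skip[text[pos]]
    return -1


print(horspool_find(b"the gcc build", b"build"))  # 8
```

Both arguments are expected to be `bytes`, so indexing yields integers in the range 0-255, which matches the 256-entry table.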
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
137 def emit_booster(self, min_len, chars):
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
138 self.emiti("void booster(%s) {" % self.interface)
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
139 self.emiti( "do {")
71
3be07ba2d648 bug-fix: modify booster's stop rule. EOF - > stop.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 70
diff changeset
140 self.emit( "if (buf > end) return;")
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
141 self.emiti( "switch (*(buf+%d)) {" % (min_len-1))
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
142 for c in chars:
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
143 self.emit( "case %d: /* %s */" % (ord(c), Character.ascii(c)))
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
144 self.emit( "goto ret;")
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
145 self.emitd( "}")
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
146 self.emitd( "} while(buf += %d);" % (min_len-1))
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
147 self.emit( "ret: return %s(%s);" % (self.state_name(self.cg.start) , self.args))
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
148 self.emitd("}", 2)
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
149
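The booster above appears to rely on one observation: every match is at least `min_len` bytes long, so if the byte at offset `min_len - 1` from the current position cannot occur in any match, no match can start at or before that byte. The sketch below is a Python restatement of that idea under stated assumptions, not the generated C. The name `boost` is hypothetical, `chars` is assumed to be a set of byte values, and the stride of `min_len - 1` is copied from the listing's loop.

```python
def boost(text, pos, min_len, chars):
    """Return an offset >= pos from which the DFA should start, or -1.

    A result of -1 means no match can start at or after pos.
    """
    if min_len < 2:
        raise ValueError("min_len must be at least 2 so the stride is positive")
    step = min_len - 1  # same stride as the loop in the listing
    while pos + step < len(text):
        if text[pos + step] in chars:
            # A match might cover this byte: hand over to the DFA here.
            return pos
        # The probed byte rules out every start up to and including itself.
        pos += step
    # Fewer than min_len bytes remain, so nothing can match.
    return -1


pattern_bytes = frozenset(b"buildfndeclgcc")
print(boost(b"xxxxxxxxgcc", 0, 3, pattern_bytes))  # 6
```

The returned offset is only a safe lower bound for where a match may begin; the DFA still has to confirm the match. The listing enables the booster only when `min_len > 2` and the pattern contains no any-char, which keeps the stride useful and the byte set small.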
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
150 def emit_driver(self):
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
151 self.emiti("void matcher(%s) {" % self.interface)
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
152 if self.filter:
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
153 self.emit( "%s(%s);" % ("bm_filter", self.args))
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
154 else:
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
155 self.emit( "%s(%s);" % (self.state_name(self.cg.start), self.args))
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
156 self.emit( "return;")
59
fd3d0b8326fe implement regexp-syntax any-char ('.').
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 58
diff changeset
157 self.emitd("}")
fd3d0b8326fe implement regexp-syntax any-char ('.').
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 58
diff changeset
158 return
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
159
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
160 def emit_accept_state(self):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
161 self.emiti("void accept(%s) {" % self.interface)
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
162 self.emit( "UCHARP ret = (UCHARP)memchr(buf, '\\n', (end - buf));")
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
163 if self.skip_boost or self.filter:
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
164 self.emit( "beg = memrchr(buf, '\\n', beg);")
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
165 self.emit( 'if (ret == NULL) ret = end;')
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
166 self.emiti( "if (ret > end) {")
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
167 self.emit( "print_line(beg, end);")
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
168 self.emit( "return;")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
169 self.emitd( "}")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
170 self.emit( "print_line(beg, ret);")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
171 self.emit( "beg = buf = ret + 1;")
67
b02b321d0e06 implement bm_filter on mmap. but it's slower than dfa. ?;-(
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 63
diff changeset
172 self.emit( "%s(%s);" % (self.start, self.args))
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
173 self.emitd("}", 2)
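The accept state emitted above finds the end of the matching line, prints the line, and resumes scanning just past the newline. As a rough model of that control flow in plain Python (not the generated C), here is a minimal sketch. `accept_line` is a hypothetical name, and whether the real `print_line` includes the trailing newline is not shown in this listing, so the sketch assumes it does not.

```python
def accept_line(data, beg, buf):
    """Model of the emitted accept state (hypothetical helper).

    data -- the whole input buffer as bytes
    beg  -- offset where the current line starts
    buf  -- offset just past the byte that completed the match
    Returns (line, next_pos); next_pos is None when no newline is left,
    which corresponds to the "print and return" branch of the C code.
    """
    end = len(data)
    ret = data.find(b"\n", buf, end)   # plays the role of memchr(buf, '\n', end - buf)
    if ret == -1:                      # no newline: the line runs to the end of the buffer
        return data[beg:end], None
    return data[beg:ret], ret + 1      # next scan starts at ret + 1, as in "beg = buf = ret + 1"


line, next_pos = accept_line(b"foo\nbar", 0, 1)
last_line, last_pos = accept_line(b"foo\nbar", 4, 5)
```

The second call shows the end-of-buffer case: there is no further newline, so the rest of the buffer is the line and scanning stops.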
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
174
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
175 def emit_reject_state(self):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
176 self.emiti("void reject(%s) {" % self.interface)
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
177 self.emit( "if (buf >= end) return;")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
178 self.emit( "beg = buf;")
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
179 self.emit( "return %s(%s);" % (self.start, self.args))
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
180 self.emitd("}", 2)
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
181
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
182 def emit_switch(self, case, default=None):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
183 if not case:
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
184 if default:
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
185 self.emit("return %s(%s);" % (default, self.args))
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
186 return
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
187 self.emiti("switch(*buf++) {")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
188 for input_, next_ in case.iteritems():
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
189 self.trans_stmt.emit(input_, self.state_name(next_))
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
190 if default == self.state_name(self.cg.start) and self.skip_boost:
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
191 self.emit("default: return booster(%s);" % self.args)
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
192 else:
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
193 self.emit("default: return %s(%s);" % (default, self.args))
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
194 self.emitd("}")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
195
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
196 def emit_state(self, cur_state, transition):
68
56a997f2c121 improve codegen. remove needless code (when filter-only, no need to emit dfa-code).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 67
diff changeset
197 if self.filter_only: return
56a997f2c121 improve codegen. remove needless code (when filter-only, no need to emit dfa-code).
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 67
diff changeset
198
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
199 self.emiti("void %s(%s) {" % (self.state_name(cur_state), self.interface))
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
200
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
201 if cur_state in self.cg.accepts:
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
202 self.emit( "return accept(beg, buf-1, end);")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
203 self.emitd("}", 2)
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
204 return
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
205
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
206 default = self.state_name(self.cg.start)
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
207 for eol in self.eols:
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
208 transition[eol] = "reject"
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
209
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
210 for input_ in transition.keys():
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
211 if type(input_) in self.special_rule:
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
212 self.trans_stmt.emit(input_, self.state_name(transition.pop(input_)))
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
213 elif type(input_) is AnyChar:
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
214 default = self.state_name(transition.pop(input_))
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
215
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
216 self.emit_switch(transition, default)
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
217
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
218 self.emitd("}", 2)
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
219
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
220 class _trans_stmt(ASTWalker):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
221 def __init__(self, emit):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
222 self._emit = emit
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
223 self.args = "beg, buf, end"
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
224
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
225 def emit(self, input_node, next_):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
226 self.next = next_
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
227 input_node.accept(self)
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
228
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
229 def visit(self, input_node):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
230 self._emit("/* UNKNOWN RULE */")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
231 self._emit("/* %s */" % repr(input_node))
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
232
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
233 def visit_Character(self, char):
70
74f4e50c4f11 add boost algorithm. but it's buggy, not work.
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 68
diff changeset
234 self._emit("case %d: /* %s */" % (char.char, char))
63
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
235 self._emit(" return %s(%s);" % (self.next, self.args))
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
236
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
237 # Special Rule
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
238 def visit_BegLine(self, begline):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
239 self._emit("/* begin of line */")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
240 self._emit("if (buf == beg)")
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
241 self._emit(" return %s(%s);" % (self.next, self.args), 2)
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
242
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
243 def visit_Range(self, range):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
244 if isinstance(range.lower, MBCharacter) and not \
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
245 isinstance(range.upper, MBCharacter) or \
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
246 isinstance(range.upper, MBCharacter) and not \
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
247 isinstance(range.lower, MBCharacter):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
248 return
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
249
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
250 if isinstance(range.lower, MBCharacter):
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
251 self.visit(range)
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
252 else:
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
253 self._emit("if (%d <= *buf && *buf <= %d)" % (range.lower.char, range.upper.char))
020ba001c58a modify I/O routine. use mmap. it's really faster than fgets ;-)
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents: 59
diff changeset
254 self._emit(" return %s(beg, buf+1, end);" % self.next, 2)
12
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
255
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
256 def test():
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
257 import doctest
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
258 doctest.testmod()
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
259
41391400fe68 add GREPTranslator(Translator) and implement jit-compile-grep,
Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
260 if __name__ == '__main__': test()
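To make the shape of the generated code easier to picture, here is a minimal, self-contained sketch of what `emit_state` and `emit_switch` appear to produce for one DFA state. It is not the project's implementation: `SketchEmitter` is a hypothetical stand-in for the `emit`/`emiti`/`emitd` helpers inherited from `CTranslator` (their indent and dedent behaviour is assumed from the names), the `UCHARP beg, UCHARP buf, UCHARP end` signature is an assumption about `self.interface`, and the `state_N` names are made up for illustration.

```python
class SketchEmitter(object):
    """Hypothetical stand-in for the emit helpers of CTranslator."""

    def __init__(self):
        self.lines = []
        self.depth = 0

    def emit(self, text):
        # Append one line at the current indentation depth.
        self.lines.append("    " * self.depth + text)

    def emiti(self, text):
        # Assumed behaviour: emit, then indent the lines that follow.
        self.emit(text)
        self.depth += 1

    def emitd(self, text):
        # Assumed behaviour: dedent, then emit.
        self.depth -= 1
        self.emit(text)


def sketch_state(name, transition, default, args="beg, buf, end"):
    """Render one state as a C function that switches on the next byte."""
    out = SketchEmitter()
    out.emiti("void %s(UCHARP beg, UCHARP buf, UCHARP end) {" % name)
    out.emiti("switch(*buf++) {")
    for byte, next_state in sorted(transition.items()):
        out.emit("case %d: /* %s */" % (byte, chr(byte)))
        out.emit("  return %s(%s);" % (next_state, args))
    # Any byte without an explicit transition falls back to the default state.
    out.emit("default: return %s(%s);" % (default, args))
    out.emitd("}")
    out.emitd("}")
    return "\n".join(out.lines)


generated = sketch_state("state_1",
                         {ord("a"): "state_2", ord("b"): "accept"},
                         "state_0")
print(generated)
```

Each state becomes a function that consumes one byte and tail-calls the next state, so the DFA is encoded in control flow rather than in a lookup table.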