annotate pyrect/translator/grep_translator.py @ 58:81337db23999
modify ternary operator (ex: return s1 if ~~ else s2). for python2.4 ;-(
author | Ryoma SHINYA <shinya@firefly.cr.ie.u-ryukyu.ac.jp> |
---|---|
date | Mon, 01 Nov 2010 14:50:52 +0900 |
parents | 81b44ae1cd73 |
children | fd3d0b8326fe |
rev | line source |
---|---|
27 [3db85244784b; parent 22] modify jitgrep, pre-compile grep main routine to libgrep.so. so JIT-compile only required DFA-transition. | 1 #!/usr/bin/env python |
27 | 2 |
49 [7f4221018adf; parent 47] accept UTF-8 encoding. but some foundational bug in converting algorithm NFA. maybe, which is not too difficult. | 3 import os |
47 [701beabd7d97; parent 45] add input-rules, Range, CharacterClass, Anchor and MultiByte-Char(but not work) and more simplify NFA (is global improvement). | 4 from c_translator import CTranslator |
57 [81b44ae1cd73; parent 56] add fixed-string filter(Boyer-Moore), and add option '--disable-filter'. | 5 from pyrect.regexp import Regexp, Analyzer |
12 [41391400fe68] add GREPTranslator(Translator) and implement jit-compile-grep, | 6 |
14 [55684cb51347; parent 12] add LICENSE | 7 class GREPTranslateExeption(Exception): |
14 | 8     pass |
14 | 9 |
12 | 10 class GREPTranslator(CTranslator): |
12 | 11     """GREPTranslator |
29 [b833746d9d92; parent 27] modify jitgrep.py and change linking method. | 12     This class can translate from a DFA into grep source-code, |
29 | 13     which is based on the (beautiful) mini-grep introduced in \"The Practice of Programming\" |
29 | 14     written by Rob Pike & Brian W. Kernighan. (see template/grep.c) |
27 | 15     >>> string = \"(build|fndecl|gcc)\" |
12 | 16     >>> reg = Regexp(string) |
43 [83c69d42faa8; parent 38] replace converting-flow, module dfareg with module regexp. it's is substantial changing in implimentation. | 17     >>> tje = GREPTranslator(reg) |
12 | 18     >>> tje.translate() |
12 | 19     """ |
14 | 20 |
49 | 21     BASE_DIR = os.path.dirname(os.path.abspath(__file__)) |
49 | 22 |
43 | 23     def __init__(self, regexp): |
49 | 24         CTranslator.__init__(self, regexp, fa="DFA") |
50 [d1afae06e776; parent 49] jitgrep: set bufsize default 1M. and remove with statement. | 25         self.__bufsize = 1024 * 1024 |
56 [ee9945561f80; parent 52] add parallel I/O grep (per line) with pthread. but it's very slow. really slow.. | 26         self.parallel_match = False |
56 | 27         self.thread_num = 0 |
57 | 28         self.filter = True |
38 [06826250198b; parent 34] modify grep_translator, use property at bufsize. | 29 |
38 | 30     def getbufsize(self,): |
38 | 31         return self.__bufsize |
38 | 32     def setbufsize(self, bufsize): |
38 | 33         self.__bufsize = abs(bufsize) |
38 | 34 |
38 | 35     bufsize = property(getbufsize, setbufsize) |
12 | 36 |
12 | 37     def emit_initialization(self): |
49 | 38         CTranslator.emit_initialization(self) |
57 | 39 |
57 | 40         if self.thread_num > 1: |
57 | 41             self.emit("#define GREP paragrep") |
57 | 42         else: |
57 | 43             self.emit("#define GREP grep") |
57 | 44 |
50 | 45         self.emit("#define LINEBUFSIZE %d" % self.bufsize) |
50 | 46         self.emit("#define READBUFSIZE %d" % self.bufsize) |
56 | 47         self.emit('#define THREAD_NUM %d' % self.thread_num) |
56 | 48         self.emit('#define THREAD_BUF %d' % 3) |
56 | 49         self.emit('#include <pthread.h>') |
49 | 50         self.emit("#include <stdlib.h>") |
49 | 51         self.emit("#include <string.h>") |
49 | 52         self.emit("char readbuf[%d];" % (self.bufsize)) |
49 | 53         self.emit("int DFA(unsigned char* s);", 2) |
57 | 54 |
57 | 55         if self.filter and self.regexp.must_words: |
57 | 56             self.emit_filter(self.regexp.must_words) |
57 | 57 |
49 | 58         grepsource = open(self.BASE_DIR + "/template/grep.c") |
34 [50b10929be29; parent 33] change compile-method to full-source-compile. | 59         self.emit(grepsource.read()) |
12 | 60 |
57 | 61     def emit_filter(self, words): |
57 | 62         def longest(s1, s2): |
58 [81337db23999; parent 57] modify ternary operator (ex: return s1 if ~~ else s2). for python2.4 ;-( | 63             if len(s1) >= len(s2): |
58 | 64                 return s1 |
58 | 65             else: |
58 | 66                 return s2 |
58 | 67 |
57 | 68         key = reduce(longest, words) |
57 | 69 |
57 | 70         if len(words) == 1: |
57 | 71             if len(key) == self.regexp.min_len: |
57 | 72                 self.emit("#define FILTER_ONLY 1", 1) |
57 | 73         else: |
57 | 74             self.emit("#define WITH_FILTER 1", 1) |
57 | 75 |
57 | 76         self.emiti("int FILTER(unsigned char* text, int n) {") |
57 | 77         l = len(key) |
57 | 78         if l == 1: |
57 | 79             self.emit(" return (strchr(text, %d) != NULL);" % ord(key)) |
57 | 80             self.emitd("}", 2) |
57 | 81             return |
57 | 82 |
57 | 83         skip = [str(l)] * 256 |
57 | 84         for i in range(l - 1): |
57 | 85             skip[ord(key[i])] = str(l-1-i) |
57 | 86 |
57 | 87         self.emit('static unsigned char key[] = "%s";' % key) |
57 | 88         self.emiti( "static int skip[256] = {") |
57 | 89         for i in range(8): |
57 | 90             i = i * 32 |
57 | 91             self.emit(",".join(skip[i:i+32]) + ",") |
57 | 92         self.emitd( "};") |
57 | 93 |
57 | 94         self.emit("int i = %d, j, k, len = %d;" % (l-1 ,l)) |
57 | 95         self.emit("unsigned char c, tail = %d; //'%c'" % (ord(key[l-1]), key[l-1]), 2) |
57 | 96         self.emiti("while (i < n) {") |
57 | 97         self.emit( "c = text[i];") |
57 | 98         self.emiti( "if (c == tail) {") |
57 | 99         self.emit( "j = len - 1; k = i;") |
57 | 100         self.emiti( "while (key[--j] == text[--k]) {") |
57 | 101         self.emit( "if (j == 0) return 1;") |
57 | 102         self.emitd( "}") |
57 | 103         self.emitd( "}") |
57 | 104         self.emit( "i += skip[c];") |
57 | 105         self.emitd("}") |
57 | 106         self.emit( "return 0;") |
57 | 107         self.emitd("}", 2) |
33 [e9e90c006760; parent 30] simplify grep.c, correnspod syntax '^'. | 108 |
12 | 109     def emit_driver(self): |
49 | 110         self.emiti("int DFA(unsigned char *text) {") |
49 | 111         self.emiti( "do {") |
49 | 112         self.emiti( "if(%s(text))" % self.state_name(self.cg.start)) |
49 | 113         self.emit( "return 1;") |
49 | 114         self.emitd( r"} while (*text++ != '\0');") |
52 [abb0691e792a; parent 50] bug fix. remove unnecessarily files. | 115         self.emitd("return 0;") |
49 | 116         self.emitd("}", 2) |
12 | 117 |
12 | 118     def emit_state(self, cur_state, transition): |
12 | 119         if cur_state in self.cg.accepts: |
49 | 120             self.emiti("int %s(unsigned char* s) {" % self.state_name(cur_state)) |
49 | 121             self.emit( "return accept(s);") |
49 | 122             self.emitd("}") |
12 | 123         else: |
49 | 124             CTranslator.emit_state(self, cur_state, transition) |
12 | 125 |
12 | 126 def test(): |
12 | 127     import doctest |
12 | 128     doctest.testmod() |
12 | 129 |
12 | 130 if __name__ == '__main__': test() |
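
The C function that `emit_filter` generates appears to be a Horspool-style variant of Boyer-Moore: a 256-entry skip table indexed by byte value, plus a scan that compares backwards from the last byte of the key. The sketch below re-expresses that logic directly in Python so it can be tried without compiling anything. It is an illustration only, not part of pyrect: the names `build_skip_table` and `filter_match` are hypothetical, it is written for Python 3 `bytes` (the listing above targets Python 2), and it assumes a non-empty key.

```python
def build_skip_table(key: bytes) -> list:
    """Build the 256-entry skip table the same way emit_filter fills `skip`.

    Every byte defaults to len(key); each byte of the key except the last
    gets its distance to the end of the key.
    """
    length = len(key)
    skip = [length] * 256
    for i in range(length - 1):
        skip[key[i]] = length - 1 - i
    return skip


def filter_match(text: bytes, key: bytes) -> bool:
    """Return True if `key` occurs in `text` (mirrors the generated FILTER)."""
    if not key:
        raise ValueError("key must be non-empty")  # assumption of this sketch
    length = len(key)
    if length == 1:
        # The generated C uses strchr() for single-byte keys.
        return key in text

    skip = build_skip_table(key)
    tail = key[-1]
    i = length - 1  # index in text aligned with the last byte of the key
    while i < len(text):
        c = text[i]
        if c == tail:
            # Compare the rest of the key backwards from the tail.
            j, k = length - 1, i
            while j > 0 and key[j - 1] == text[k - 1]:
                j -= 1
                k -= 1
            if j == 0:
                return True
        i += skip[c]  # always >= 1, so the scan terminates
    return False


if __name__ == "__main__":
    print(filter_match(b"use gcc here", b"gcc"))
    print(filter_match(b"no match", b"gcc"))
```

Because every table entry is at least 1, the scan always advances; bytes that never occur in the key let it jump a full key length at a time, which is where the speed-up over a byte-by-byte DFA walk would come from.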