annotate regexParser/TODO @ 304:c48a8671ce34

fix parallel search first match
author Shinji KONO <kono@ie.u-ryukyu.ac.jp>
date Mon, 08 Feb 2016 12:24:47 +0900
parents 27414e6fb33c
children
rev   line source
304
c48a8671ce34 fix parallel search first match
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 302
diff changeset
1 Mon Feb 8 12:13:08 JST 2016
c48a8671ce34 fix parallel search first match
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 302
diff changeset
2
c48a8671ce34 fix parallel search first match
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 302
diff changeset
3 Before handling word, would it be better to make CharClass an object? It would be less CbC-like, though.
c48a8671ce34 fix parallel search first match
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 302
diff changeset
4
302
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
5 Sat Feb 6 19:50:04 JST 2016
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
6
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
7 It is a bit crude, but
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
8
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
9 each block starts from state 1
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
10 if the ending state is not 1, redo only that part
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
11
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
12 would be the simple approach. In the worst case, everything might have to be redone, though...
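The retry scheme in this entry can be sketched as follows. This is a minimal Python illustration, not the regexParser/Cerium code: the transition table is the "word" DFA dumped further down in this file (state numbers read as hexadecimal bit patterns, which is an assumption), and step, run_block and blocked_search are hypothetical names.

```python
# Hypothetical sketch of "start every block from state 1, redo a block
# only when the previous block did not end in state 1".
WORD = {1: {"w": 4}, 4: {"o": 8}, 8: {"r": 0x10}, 0x10: {"d": 2}}
START = 1
ACCEPT = 2

def step(state, ch):
    """One DFA step; on a mismatch, restart (ch may begin a new match)."""
    nxt = WORD.get(state, {}).get(ch)
    if nxt is None:
        nxt = WORD[START].get(ch, START)
    return nxt

def run_block(state, block, offset):
    """Scan one block from `state`; return (end state, match end offsets)."""
    ends = []
    for i, ch in enumerate(block):
        state = step(state, ch)
        if state == ACCEPT:
            ends.append(offset + i + 1)
            state = START
    return state, ends

def blocked_search(text, size):
    blocks = [(off, text[off:off + size]) for off in range(0, len(text), size)]
    # Phase 1 (could run in parallel): every block starts from state 1.
    results = [run_block(START, blk, off) for off, blk in blocks]
    # Phase 2: redo only blocks whose predecessor did not end in state 1.
    redone = 0
    for k in range(1, len(blocks)):
        prev_end = results[k - 1][0]
        if prev_end != START:
            off, blk = blocks[k]
            results[k] = run_block(prev_end, blk, off)
            redone += 1
    matches = [end for _, ends in results for end in ends]
    return matches, redone

print(blocked_search("xxwordxx", 4))  # ([6], 1): the second block was redone
```

Phase 2 is sequential because a redone block can change its own ending state, which is why the worst case redoes every block.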
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
13
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
14 Wed Feb 3 21:15:49 JST 2016
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
15
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
16 With blockedSearch, at least one has to overlap.
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
17
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
18 (aaa|aaabb)
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
19 state : 1 [a-a] (14)
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
20 state : 2*
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
21 state : 4 [a-a] (8)
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
22 state : 8 [a-a] (2)
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
23 state : 10 [a-a] (20)
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
24 state : 20 [a-a] (40)
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
25 state : 40 [b-b] (80)
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
26 state : 80 [b-b] (2)
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
27 state : 14 [a-a] (28)
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
28 state : 28 [a-a] (42)
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
29 state : 42* [b-b] (80)
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
30
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
31 a | a | a bbb
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
32 prev 14 28
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
33 current 7F ... ..
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
34
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
35 a a | a | a bbb
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
36 prev 14 28
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
37 current 7F ... ..
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
38
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
39 There are false positives → re-check
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
40 Maximum match causes some matches to be missed (that is inherent to it anyway...)
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
41 Eliminating them is somewhat hard (every possible result would have to be carried through the transitions)
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
42 Without internal nondeterminism, this kind of problem does not arise
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
43
27414e6fb33c retrying blocked search
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 298
diff changeset
44
298
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
45 Wed Feb 3 08:20:06 JST 2016
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
46
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
47 state : 1 [w-w] (4)
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
48 state : 4 [o-o] (8)
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
49 state : 8 [r-r] (10)
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
50 node : a 10 -> 2 [d-d] (2)
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
51
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
52 w | o r d
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
53 4 8 10 2
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
54
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
55 x | w o r d
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
56 1 4 8 10 2
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
57
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
58 Tue Feb 2 11:21:14 JST 2016 kono
295
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 293
diff changeset
59
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 293
diff changeset
60 All that remains is handling word
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 293
diff changeset
61 charClassMerge has to be fixed
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 293
diff changeset
62 make merge produce a list of strings
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 293
diff changeset
63 split long ones
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 293
diff changeset
64 decompose substrings?
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 293
diff changeset
65
296
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 295
diff changeset
66 On the Cerium side, the first match is not displayed
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 295
diff changeset
67
298
63213964502a refactoring ....
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 296
diff changeset
68 Tue Feb 2 09:55:40 JST 2016 kono
293
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
69
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
70 % ./regexParser -subst -regex '(a|b)*a(a|b)(a|b)'
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
71 ---Print Node----
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
72 a(1)->(1)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
73 |
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
74 b(1)->(1)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
75 *
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
76 +
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
77 a(4)->(4)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
78 +
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
79 a(4)->(8)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
80 |
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
81 b(4)->(8)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
82 +
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
83 a(8)->(2)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
84 |
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
85 b(8)->(2)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
86 -----------------
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
87 state : 1
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
88 node : + 1 -> 1
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
89 [a-a] (5)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
90 [b-b] (1)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
91
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
92 state : 2*
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
93 node : e 2 -> 1
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
94
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
95 state : 4
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
96 node : | 4 -> 1
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
97 [a-a] (8)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
98 [b-b] (8)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
99
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
100 state : 8
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
101 node : | 8 -> 1
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
102 [a-a] (2)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
103 [b-b] (2)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
104
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
105 state : 5
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
106 [a-a] (1)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
107 [b-b] (9)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
108
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
109 state : 9
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
110 [a-a] (1) <---- wrong; it should have been merged with 2...
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
111 [b-b] (3)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
112
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
113 state : 3*
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
114 [a-a] (5)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
115 [b-b] (1)
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
116
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
117 As suspected, it was a bug in charClassMerge.
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
118
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
119 It would be good if createCharClassRange did not create a new one when an identical one already exists
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
120 That is, charClassMerge may return the same one in some cases
295
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 293
diff changeset
121 That only applies to ones with the same range and the same state, so it probably does not happen very often.
293
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
122
948428caf616 NFA maximum match worked
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 291
diff changeset
123
289
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
124 Mon Feb 1 01:51:10 JST 2016 kono
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
125
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
126 Maximum match does not behave well when there is nondeterminism
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
127 How should the termination condition "cannot be extended any further" be implemented?
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
128
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
129 ./regexParser -ts -subset -regex '(a|b)*a' -file ahoaho.txt
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
130
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
131 With this, once no a comes after a b, it accepts up to just before that b
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
132
291
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
133 Policy: leave subset construction untouched.
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
134
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
135
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
136 state : 1
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
137 node : + 1 -> 1
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
138 [a-a] (3)
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
139 [b-b] (1)
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
140
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
141 state : 2*
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
142 node : e 2 -> 1
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
143
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
144 state : 3*
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
145 [a-a] (3)
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
146 [b-b] (1)
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
147
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
148 * marks an accept state.
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
149
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
150 Doing stateMatch at [a-a] (3) would be fine, but with maximum match, stateMatch is not done while the match is still continuing.
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
151 Currently, stateMatch is done in a state marked with * when the input matches none of its conditions.
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
152 With this, a b in state 3 moves to state 1, and it fails when there is no a after that b. stateMatch should be done in state 3, before following the b.
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
153
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
154 Once there is no longer any possibility of a match, the earlier part has to be reported as the match.
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
155 * if not matching, update match top
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
156 * while matching, update the previous match
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
157 * when the match fails, return the previous match if there is one
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
158 Something like that?
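The three rules above can be sketched in Python as below. This is only an illustration under assumptions: the table is the (a|b)*a DFA dumped in this entry (state 1 and accept state 3), character classes are reduced to single characters, and maximum_match is a hypothetical name rather than a regexParser function.

```python
# Hypothetical sketch of the maximum match rules.
DFA = {1: {"a": 3, "b": 1}, 3: {"a": 3, "b": 1}}
ACCEPT = {3}
START = 1

def maximum_match(delta, accept, start, text):
    """Return (top, end) of the first longest match, or None."""
    top = 0
    while top < len(text):
        state = start
        last = None                      # previous match, if any
        for i in range(top, len(text)):
            state = delta.get(state, {}).get(text[i])
            if state is None:            # match fail
                break
            if state in accept:          # still matching: update previous match
                last = (top, i + 1)
        if last is not None:             # fail or end of input: return it
            return last
        top += 1                         # not matching: update match top
    return None

top, end = maximum_match(DFA, ACCEPT, START, "xxababbyy")
print("xxababbyy"[top:end])  # aba
```

On "xxababbyy" the scan passes through state 3 twice and keeps going; the match that is returned ends before the trailing b characters, which is the behaviour the note asks for.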
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
159
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
160 For minimum match:
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
161 * if not matching, update match top
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
162 * when a match occurs, return the previous match if there is one
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
163 perhaps?
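For comparison, a minimal Python sketch of the minimum match variant, under the same assumptions as the notes above: the (a|b)*a DFA with state 1 as start and state 3 as accept, single characters instead of character classes, and a hypothetical function name. The only difference from maximum match is that the scan returns as soon as an accept state is reached.

```python
# Hypothetical sketch of minimum match: stop at the first accept state.
DFA = {1: {"a": 3, "b": 1}, 3: {"a": 3, "b": 1}}
ACCEPT = {3}
START = 1

def minimum_match(delta, accept, start, text):
    """Return (top, end) of the first shortest match, or None."""
    top = 0
    while top < len(text):
        state = start
        for i in range(top, len(text)):
            state = delta.get(state, {}).get(text[i])
            if state is None:            # match fail from this top
                break
            if state in accept:          # matched: return immediately
                return (top, i + 1)
        top += 1                         # not matching: update match top
    return None

print(minimum_match(DFA, ACCEPT, START, "xxababbyy"))  # (2, 3)
```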
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
164
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
165 Make source generation support CbC. (Apparently it does not work otherwise.)
289
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
166
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
167
284
5d23dc02f60d add TODO
Masataka Kohagura <kohagura@cr.ie.u-ryukyu.ac.jp>
parents: 221
diff changeset
168 Sun Jan 31 20:37:49 JST 2016 masa
289
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
169 bug during parallel processing Ok
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
170 mistake in subset construction for (mili|have) Ok
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
171 segv in tSearch Ok
284
5d23dc02f60d add TODO
Masataka Kohagura <kohagura@cr.ie.u-ryukyu.ac.jp>
parents: 221
diff changeset
172
289
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
173 '(main|int) ' .. Ok
20ed7536784f add test file
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 287
diff changeset
174 '(main|int)\(' .. Ok
287
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 284
diff changeset
175
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 284
diff changeset
176 and the like do not work.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 284
diff changeset
177
291
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
178 If the accept flag is set on the start state, it matches ''. Generate that case separately.
1b75546ff65f fix TODO
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 289
diff changeset
179
221
78174ff2f338 add Todo
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 215
diff changeset
180 Sat Jan 2 15:29:16 JST 2016 kono
78174ff2f338 add Todo
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 215
diff changeset
181
78174ff2f338 add Todo
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 215
diff changeset
182 There are more state transitions than states, so doing CharClassWalk in subset construction is not a good idea.
78174ff2f338 add Todo
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 215
diff changeset
183 If new states are appended to the state list at mergeTransition time, CharClassWalk is unnecessary.
78174ff2f338 add Todo
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 215
diff changeset
184 At that point, do not put them into stateArray, because stateArray holds what has already been processed.
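The idea of appending newly merged states to a state list, and recording a state as processed only once its transitions have been generated, can be sketched like this. It is a Python illustration under assumptions: NFA states are single bits, DFA states are bit unions, character classes are reduced to single characters, and subset_construction with its todo queue is a hypothetical stand-in for the state list and stateArray mentioned above.

```python
from collections import deque

def subset_construction(nfa, start, accept_bits):
    """nfa maps an NFA state bit to {char: bitmask of next NFA states}."""
    dfa = {}                  # processed states only (the stateArray role)
    todo = deque([start])     # discovered but unprocessed (the state list role)
    seen = {start}
    while todo:
        state = todo.popleft()
        merged = {}
        bit = 1
        while bit <= state:
            if state & bit:
                # Merge the transitions of every NFA state in this set.
                for ch, nxt in nfa.get(bit, {}).items():
                    merged[ch] = merged.get(ch, 0) | nxt
            bit <<= 1
        dfa[state] = merged
        for nxt in merged.values():
            if nxt not in seen:   # a new state: just append it to the list
                seen.add(nxt)
                todo.append(nxt)
    accepting = {s for s in dfa if s & accept_bits}
    return dfa, accepting

# NFA for (a|b)*a: state 1 loops on a and b, and a also reaches accept state 2.
dfa, accepting = subset_construction({1: {"a": 1 | 2, "b": 1}}, 1, 2)
print(dfa, accepting)
```

For this NFA the result has the two states 1 and 3 with state 3 accepting, which agrees with the (a|b)*a dump earlier in the file.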
78174ff2f338 add Todo
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 215
diff changeset
185
78174ff2f338 add Todo
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 215
diff changeset
186 The EOF state has no cc, so it needs special handling.
78174ff2f338 add Todo
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 215
diff changeset
187
78174ff2f338 add Todo
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 215
diff changeset
188 Tue Dec 29 17:55:17 JST 2015 kono
215
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
189
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
190 New Todo entries are added at the top.
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
191
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
192 abc*d +
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
193 / \
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
194 + d
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
195 / \
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
196 + *
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
197 / \ |
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
198 a b c
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
199
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
200 By rewriting the Parser, this could become
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
201
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
202 abc*d +
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
203 / \
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
204 a +
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
205 / \
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
206 b +
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
207 / \
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
208 * d
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
209 |
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
210 c
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
211
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
212 instead. This form is probably better. However,
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
213 ((ab)(c*))d
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
214 should also be allowed, and it becomes the same as abc*d, so this does not solve the problem.
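The right-nested tree drawn above for abc*d can be produced by a sketch like the following. It is a Python illustration only: parse_concat is a hypothetical helper that handles nothing but literal characters and a postfix *, with no parentheses or alternation, so it does not address the ((ab)(c*))d case raised here.

```python
def parse_concat(pattern):
    """Build a right-nested concatenation tree; '+' is the concat node."""
    items = []
    for ch in pattern:
        if ch == "*":
            # Postfix star wraps the item just before it.
            items[-1] = ("*", items[-1])
        else:
            items.append(ch)
    # Fold from the right so each '+' has a leaf on its left.
    tree = items[-1]
    for item in reversed(items[:-1]):
        tree = ("+", item, tree)
    return tree

print(parse_concat("abc*d"))
# ('+', 'a', ('+', 'b', ('+', ('*', 'c'), 'd')))
```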
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
215
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
216 A sub tree has to return its first state. Otherwise,
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
217 (ab*|bc*)
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
218 and the like do not work properly.
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
219
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
220 When an expression ends with *, it has to be overlapped with the next expression. Therefore,
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
221 if there is a trailing *, carry it along
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
222 is the approach that seems best.
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
223
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
224 If stateAllocate and generateTransition are done in one pass, stateArray has to be grown incrementally.
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
225 At the very least, merging them into a single loop would probably be less error-prone.
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
226
210
e8aa8a1ea749 add benchmark TODO
Masataka Kohagura <kohagura@cr.ie.u-ryukyu.ac.jp>
parents: 204
diff changeset
227
e8aa8a1ea749 add benchmark TODO
Masataka Kohagura <kohagura@cr.ie.u-ryukyu.ac.jp>
parents: 204
diff changeset
228 Sun Dec 27 19:31:03 JST 2015
e8aa8a1ea749 add benchmark TODO
Masataka Kohagura <kohagura@cr.ie.u-ryukyu.ac.jp>
parents: 204
diff changeset
229 Example problems: count the number of accesses from a specific IP
e8aa8a1ea749 add benchmark TODO
Masataka Kohagura <kohagura@cr.ie.u-ryukyu.ac.jp>
parents: 204
diff changeset
230 concordance
e8aa8a1ea749 add benchmark TODO
Masataka Kohagura <kohagura@cr.ie.u-ryukyu.ac.jp>
parents: 204
diff changeset
231 conditional concordance using regex
e8aa8a1ea749 add benchmark TODO
Masataka Kohagura <kohagura@cr.ie.u-ryukyu.ac.jp>
parents: 204
diff changeset
232 conditional wordcount using regex
e8aa8a1ea749 add benchmark TODO
Masataka Kohagura <kohagura@cr.ie.u-ryukyu.ac.jp>
parents: 204
diff changeset
233 compare against perl scripts that do the same
215
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
234
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
235 Sat Dec 26 18:07:00 JST 2015
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
236 TODO write a routine test for CharClassWalker
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
237 TODO write a routine test for CharClassMerge
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
238 TODO write a routine test for searchBit
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
239 TODO write a routine test for subsetConstraction
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
240