annotate regexParser/TODO @ 287:2f3e7bba038e

fix \
author Shinji KONO <kono@ie.u-ryukyu.ac.jp>
date Sun, 31 Jan 2016 22:59:59 +0900
parents 5d23dc02f60d
children 20ed7536784f
rev   line source
284
5d23dc02f60d add TODO
Masataka Kohagura <kohagura@cr.ie.u-ryukyu.ac.jp>
parents: 221
diff changeset
1 Sun Jan 31 20:37:49 JST 2016 masa
2 Bug during parallel processing
3 Subset construction error for (mili|have)
4 segv in tSearch
5
287
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 284
diff changeset
6 '(main|int) '
7 '(main|int)\('
8
9 Patterns like these do not work.
10
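As a reference for what a fixed matcher would be expected to return, the two failing patterns can be checked against Python's standard re module. This is only an independent oracle, not the regexParser code; the sample text and the helper name expected_matches are assumptions made for illustration.

```python
import re

def expected_matches(pattern, text):
    """Return every non-overlapping match of pattern in text, left to right."""
    return [m.group(0) for m in re.finditer(pattern, text)]

# Hypothetical sample input, not taken from the project's test data.
sample = "int main(int argc) { return 0; }"

# '(main|int) ' : an alternation followed by a literal space.
space_hits = expected_matches(r"(main|int) ", sample)

# '(main|int)\(' : an alternation followed by an escaped parenthesis.
paren_hits = expected_matches(r"(main|int)\(", sample)

print(space_hits)
print(paren_hits)
```

Both patterns put a literal character directly after a closing group, which is the shape the note reports as broken.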
221
78174ff2f338 add Todo
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 215
diff changeset
11 Sat Jan 2 15:29:16 JST 2016 kono
12
13 There are more state transitions than states, so doing a CharClassWalk in subset construction is a bad idea.
14 If new states are linked onto the state list when mergeTransition runs, the CharClassWalk is unnecessary.
15 At that point, do not put them into stateArray, because the entries in stateArray are already processed.
16
17 The EOF state has no cc, so it needs special handling.
18
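A minimal sketch of the idea in lines 13-15, written in Python rather than the project's own language: states produced while merging transitions are appended to a pending list, so no separate walk over the transitions is needed, and states that are already known are never queued again. The names subset_construction, pending, and seen are hypothetical, plain characters stand in for cc (character classes), and the EOF state from line 17 is not modeled.

```python
from collections import deque

def subset_construction(nfa, start):
    """Build a DFA from an NFA without epsilon moves.

    nfa maps a state to {char: set of next states}.
    The result maps a frozenset of NFA states to {char: frozenset}.
    """
    start_set = frozenset([start])
    dfa = {}
    pending = deque([start_set])  # the "state list": created but not yet processed
    seen = {start_set}            # every DFA state created so far
    while pending:
        current = pending.popleft()
        # Merge the transitions of all NFA states in this subset.
        merged = {}
        for state in current:
            for ch, targets in nfa.get(state, {}).items():
                merged.setdefault(ch, set()).update(targets)
        dfa[current] = {}
        for ch, targets in merged.items():
            target = frozenset(targets)
            dfa[current][ch] = target
            # A new state is linked onto the list at merge time only.
            if target not in seen:
                seen.add(target)
                pending.append(target)
    return dfa

# Hypothetical NFA for ab|ac: state 0 branches to 1 and 2 on 'a'.
example_nfa = {0: {"a": {1, 2}}, 1: {"b": {3}}, 2: {"c": {4}}}
example_dfa = subset_construction(example_nfa, 0)
```

Because each subset is queued exactly once, the work is driven by the states that actually get created rather than by a scan over all transitions.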
19 Tue Dec 29 17:55:17 JST 2015 kono
215
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
20
21 New Todo entries are added at the top.
22
23 abc*d          +
24               / \
25              +   d
26             / \
27            +   *
28           / \  |
29          a   b c
30
31 The Parser could be rewritten to produce:
32
33 abc*d   +
34        / \
35       a   +
36          / \
37         b   +
38            / \
39           *   d
40           |
41           c
42
43 This is probably the better form. However, it should also be valid to write
44 ((ab)(c*))d
45 and that ends up the same as abc*d, so this does not solve the problem.
46
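The point of lines 31-45 can be sketched with a small recursive descent parser, written here in Python as an illustration and not taken from the project's Parser. It nests a flat sequence to the right, yet explicit parentheses still force the left-nested shape, which is why the rewrite alone does not settle the issue. The function name parse and the tuple encoding ('+' for concatenation, '*' for the star) are assumptions, and only literals, '*', and parentheses are handled.

```python
def parse(pattern):
    """Parse a pattern into nested tuples.

    ('+', left, right) is concatenation and ('*', child) is the star.
    """
    pos = 0

    def atom():
        nonlocal pos
        if pos >= len(pattern):
            raise ValueError("unexpected end of pattern")
        ch = pattern[pos]
        pos += 1
        if ch == "(":
            node = sequence()
            if pos >= len(pattern) or pattern[pos] != ")":
                raise ValueError("missing ')'")
            pos += 1
        else:
            node = ch
        # A star binds to the atom immediately before it.
        while pos < len(pattern) and pattern[pos] == "*":
            node = ("*", node)
            pos += 1
        return node

    def sequence():
        head = atom()
        # Right-nested: the rest of the sequence becomes the right child.
        if pos < len(pattern) and pattern[pos] != ")":
            return ("+", head, sequence())
        return head

    tree = sequence()
    if pos != len(pattern):
        raise ValueError("unbalanced ')'")
    return tree

flat = parse("abc*d")
grouped = parse("((ab)(c*))d")
```

Here flat has the right-nested shape of the second tree, while grouped reproduces the left-nested shape of the first one.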
47 A sub tree has to return its first state. Otherwise, patterns such as
48 (ab*|bc*)
49 do not work correctly.
50
51 When an expression ends with *, it has to be overlapped with the expression that follows. So
52 if there is a trailing *, it should be carried along;
53 I think that approach is best.
54
55 If stateAllocate and generateTransition are combined into one pass, stateArray has to grow incrementally.
56 At the very least, using a single loop would probably lead to fewer mistakes.
57
210
e8aa8a1ea749 add benchmark TODO
Masataka Kohagura <kohagura@cr.ie.u-ryukyu.ac.jp>
parents: 204
diff changeset
58
59 Sun Dec 27 19:31:03 JST 2015
60 Example problem: count the number of accesses from a specific IP
61 concordance
62 Conditional concordance using regex
63 Conditional wordcount using regex
64 Compare against a perl script that does the same
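A rough sketch of two of the benchmark tasks above, as a baseline to compare against. It is written in Python instead of the perl script named in line 64, and it assumes a log format in which each line begins with the client IP followed by whitespace. The function names count_ip_accesses and conditional_wordcount and the sample data are hypothetical.

```python
import re
from collections import Counter

def count_ip_accesses(lines, ip):
    """Count the lines whose first field is exactly the given IP."""
    # re.escape keeps the dots literal; \s stops 10.0.0.1 matching 10.0.0.10.
    pattern = re.compile(r"^" + re.escape(ip) + r"\s")
    return sum(1 for line in lines if pattern.match(line))

def conditional_wordcount(text, condition):
    """Count only the words that fully match the regex condition."""
    cond = re.compile(condition)
    words = re.findall(r"\w+", text)
    return Counter(w for w in words if cond.fullmatch(w))

# Hypothetical sample data for illustration.
log_lines = [
    "10.0.0.1 - - GET /",
    "10.0.0.10 - - GET /",
    "10.0.0.1 - - GET /a",
]
hits = count_ip_accesses(log_lines, "10.0.0.1")
counts = conditional_wordcount("main int main printf", r"main|int")
```

Running the same inputs through the regexParser build and through a script like this gives both a correctness check and a timing baseline.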
215
63e9224c7b2b try to fix asterisk
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents: 212
diff changeset
65
66 Sat Dec 26 18:07:00 JST 2015
67 TODO Write a routine test for CharClassWalker
68 TODO Write a routine test for CharClassMerge
69 TODO Write a routine test for searchBit
70 TODO Write a routine test for subsetConstraction
71