Mercurial > hg > Members > shoshi > iDB2010
annotate db.ind @ 6:2b708e45be0d default tip
fix
author | Shinji KONO <kono@ie.u-ryukyu.ac.jp> |
---|---|
date | Wed, 02 Jun 2010 14:58:12 +0900 |
parents | 5571683e9870 |
children |
rev | line source |
---|---|
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
1 -title: Programming Scalable Service in Code segment and Data segment |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
2 |
6 | 3 --author: Shosi TAMAKI and Shinji KONO |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
4 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
5 -abstract: |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
6 |
1 | 7 To implement scalable services, not only higher software design, |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
8 low level implementation is also important to achieve performance |
1 | 9 and reliability. |
10 A combination of fine grain task manager and continuation based | |
11 language is good to make Scalable Services on Many core architecture. | |
12 Code segment is a | |
13 small part of execution code written in a lower language of C. | |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
14 Data segments are fragments of memory and these are passed |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
15 among code segments and CPU cores. We discuss the pro and cons |
1 | 16 of our method. |
17 | |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
18 --Tools for implementing Distributed Application |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
19 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
20 On demanding construction of scalable services such as Twitter, |
1 | 21 FaceBook or Network based Book Publishing, we need new stage of programming |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
22 tools. Based on our experiences, we designed and implemented |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
23 two major tools to build scalable services: Code segments and |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
24 Data Segments. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
25 |
1 | 26 Not necessary mentioned SEDA \cite{SEDA2001}, scalable services |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
27 requires highly distributed servers and highly multi-threaded |
1 | 28 program on a server among them. This type of implementation works |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
29 fine in theory, but it heavyly depends on low level implementation, such |
1 | 30 as threads, synchronized queues and CAS operations. |
31 | |
32 We have successfully implemented | |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
33 WWW services using Classical tools such as C++, Java, or even C. |
1 | 34 Script Languages such as Perl, PHP or Python are used in front end, |
35 but in case of heavy duty database side, careful implementation | |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
36 is necessary to achieve good performance. |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
37 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
38 Now some of the services have more than 10 millions users, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
39 load balancing among several WWW front-end and many |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
40 memcached\cite{memcached04} servers to replicate Database accesses using |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
41 classical database such as Oracle, mySQL or Postgress, which |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
42 performs so badly, Internet companies have to |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
43 create Key Value Store system by themselves, such as BigTable |
1 | 44 \cite{Chang06bigtable:a} |
45 or Cassandra\cite{cassandra09}. | |
46 This situation is sometimes discussed in a context | |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
47 of ACID vs BASE database scheme \cite{Brewer:2000:TRD:343477.343502}. |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
48 |
1 | 49 Based on our works on Internet programming and Sony PS3 |
50 programming, that is Cell architecture\cite{Cell}, now, | |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
51 we are sure that we need more suitable tools to implement various |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
52 components in the scalable services, such as database server, |
1 | 53 web server or HTTP response generator even in a front end. |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
54 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
55 We separate difficulties in two point of views: Code and Data. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
56 This sounds very basic, but since our history is starting from |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
57 a single CPU with few memories, current tools are some how obsolete |
1 | 58 now, so we have to reconsider the situation |
59 (Fig. \ref{Data and Code in Internet Service}). | |
60 | |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
61 <center><img src="fig/two-side.pdf" alt="Data and Code in Internet Service"></center> |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
62 |
1 | 63 We are working on a combination of Continuation based C\cite{kono08f,cbc-sourceforge} |
64 and | |
65 Cerium Engine\cite{cerium-sourceforge}. | |
66 The former one is a lower language of C implemented | |
67 in GCC\cite{gcc}, and the later is a Open CL\cite{opencl} like fine grain task manager on | |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
68 PS3 Cell architecture, which supports data segment management on |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
69 SPE (Synergistic Processor Engine). Since SPE has only 256Kbytes |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
70 local memory, careful management is necessary, so we have to invent |
1 | 71 our own memory manager. We can use 6 SPE with 2Tbit/s ring bus in |
72 PS3 Linux (Fig. \ref{Cell Architecture}). | |
73 | |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
74 <center><img src="fig/Cell.pdf" alt="Cell Architecture"></center> |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
75 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
76 In this paper, first we analyze problems in scalable system. Then |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
77 we introduce new concepts: Code Segment and Data Segment. Code |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
78 segment is implemented in Continuation based C here after we call CbC and Data Segment |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
79 is implemented in a memory management module in many core task manager called |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
80 Cerium Engine. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
81 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
82 The basic idea is this. We pass the control among module layer without function |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
83 call. We cannot use conventional language because it has built-in function |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
84 call which which cannot be removed. These module layer segments are |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
85 called code segments. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
86 All the data are stored in a message |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
87 packed from, which we called data segment, which is controlled uniformly. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
88 Instead of using direct pointer access, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
89 data segments are copied among modules and CPU cores, which are carefully |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
90 adopted to the cache or interconnect communication such as DMA. |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
91 All the data segments are hashed in $2^n$ size memory pool similar to the |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
92 Unix malloc mechanism. This pool is in 64bit address space and |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
93 it makes data segment communication far simpler. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
94 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
95 It sounds like completely different from current Internet service scheme, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
96 we overcome the difference in following way using code transformation. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
97 First we make entire program in a conventional way. We divide it |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
98 into code segment and explicit stack manipulation. During this stage, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
99 communications are reformatted into data segment passing among |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
100 code segments. The program still working exactly the same before |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
101 transformation and we may use automatic conversion here. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
102 We reorganize it using data parallel and pipeline execution. At this |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
103 stage, automatic conversion is not suitable in many cases, so we have |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
104 to make translation by hands, but we can use possible equivalence checker |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
105 for the program correctness. In order to make pipeline execution, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
106 destructive modification of the content of data segments such as |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
107 classical object oriented programming is not allowed. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
108 We have to make copies. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
109 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
110 At the last section, we give some of our achievement and |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
111 comparison with other tools, such as SEDA or Open CL. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
112 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
113 --Problems |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
114 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
115 Let's think that we are going to make a network game. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
116 Maybe PS3 (6 SPU and 2 PPU )is used in a client side and 32 CPU ( 16 x 2 hyper thread ). We have to use highly pipelined thread and data parallel execution |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
117 in both client side and server side, something like SEDA architecture. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
118 We will demonstrate several problems based on our experiences. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
119 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
120 ---Our Experiences |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
121 |
1 | 122 Our PS3 implementation is SPURS\cite{spurs} like Pipeline Task Manager which |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
123 is called Cerium. (We had to write software rendering because of SCEI did |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
124 not open GPU information BTW.) Since PS3's SPU has only 256Kbytes memory, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
125 we have to carefully handle memory usage both in case of code and data. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
126 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
127 Data segment is copied from PPU to SPU via DMA, which overhead is hide |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
128 using Task Pipeline. But we have to avoid too much synchronization |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
129 of these copies. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
130 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
131 In case of Xeon architecture, CPU cores are shared all the memory, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
132 but actually it has a local cache which is interconnected using quick path. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
133 Cache size is 256Kbyte for each core. Implicit copy is done between |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
134 a cache memory and the main memory or between a cache and another cache. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
135 The situation is basically the same in PS3 and Xeon. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
136 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
137 ---Module Layer |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
138 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
139 Complex systems such as Operating systems, Database systems, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
140 Network Systems or Game libraries have several module layer. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
141 For examples, Database system has message packing module, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
142 query analyzing module and execution module. Network system |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
143 may have ISO standard layer such as presentation, transport |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
144 and network. Operating system have v-node file system and |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
145 device drivers. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
146 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
147 Each module may have 1-5 nested function calls, so we have |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
148 more then 10-30 nested function calls in Complex systems. |
1 | 149 It can be implemented in normal function calls (Fig.\ref{Layer by call}). |
150 <center><img src="fig/layer-func.pdf" alt="Layer by call"></center> | |
151 | |
152 Using our Continuation base C, layering can be implemented in | |
153 a goto statement. Since it has no stack operation, it works very fast (Fig.\ref{Layer by goto}). | |
154 <center><img src="fig/layer-continuation.pdf" alt="Layer by goto"></center> | |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
155 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
156 If we have several tasks to do, each |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
157 processing in modules can be executed in a pipelined way. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
158 In order to implement the pipeline, we assign threads from |
1 | 159 thread pools to each module layer . |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
160 Each thread is interconnected by a synchronized queue, which has |
1 | 161 certain overheads, but if it is carefully implemented, parallel |
162 processing hides its costs. | |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
163 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
164 Conversion from sequential execution to pipelined execution |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
165 is straight forward, but if it has a race condition, correcting |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
166 the problem is very difficult. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
167 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
168 ---SEDA |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
169 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
170 This combination of pipeline staging and data parallel execution |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
171 is the heart of SEDA. But it requires very complex programming. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
172 At first we have design communication queue among all the pipeline |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
173 stages. In case of C++, we have to managing all the queue manually |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
174 because it lacks garbage collection. It is not so easy and requires |
1 | 175 complex memory pools ( or conservative GC ), which is a bug prone |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
176 (Fig.\ref{Layer by Thread}). |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
177 <center><img src="fig/layer-continuation.pdf" alt="Layer by Thread"></center> |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
178 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
179 ---Thread Implementation |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
180 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
181 Theoretically SEDA architecture works fine, but it assumes very |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
182 fast thread execution with blocking queue. Cassandra key value store |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
183 system use Java to implement this architecture. Java 6 is far better |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
184 thread execution, but sine it is a combination of user level thread and |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
185 kernel level thread, it is not so easy to optimize its execution. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
186 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
187 If we use script language to implement this, thing become worse. For |
1 | 188 example, Python thread implementation is very bad concurrency\cite{python-gil}, |
189 and ruby does not support kernel thread. And their GC mechanism always interfere | |
190 with thread executions. | |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
191 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
192 ---Blocking queue |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
193 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
194 Each threads are executed in an event driven way. A task is |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
195 put into a blocking queue and it wakes up a thread. The thread |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
196 read the queue atomically. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
197 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
198 We can write this operation in following way. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
199 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
200 while(Task p = waitingQueue.get()) { |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
201 p.run(); |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
202 } |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
203 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
204 It looks good but {\tt p} is determines just before its execution, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
205 which is no good in terms of branch prediction. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
206 It looks like this delay is not so important, but |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
207 it has penalty around 10 clocks. If we have many small task |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
208 this situation is not so good. What we need is 10 to 20 instruction |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
209 cycle before executing the indirect call. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
210 |
1 | 211 Besides blocking queue's CAS costs, queue operations include |
212 allocation / deallocation | |
213 cost. In case of Java, to avoid GC penalty, link node is not reused and | |
214 it simply delete old one and create new one. If the new operation is | |
215 shared among threads (unlikely), it requires another CAS, otherwise | |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
216 it requires separate memory pool for each thread, which consumes a lot |
1 | 217 of memory. |
218 | |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
219 ---Scheduling |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
220 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
221 If cost of blocking queue operation is negligible, simple |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
222 FIFO scheduling is OK in SEDA from the through put point of view. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
223 But blocking queue requires 10 to 20 instruction cycle under |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
224 no race condition. Thread pool size is heavyly depends on |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
225 the architecture, that is number of CPU, number of requests, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
226 execution time of tasks. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
227 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
228 Sometimes it is better to reduce concurrency and skip these |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
229 synchronization costs. In this case, synchronization of threads |
1 | 230 becomes just a cost. |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
231 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
232 ---Garbage Collection |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
233 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
234 Basically communication between layers makes no garbage, because it is |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
235 generated and destroyed in fix amount size. But in case of programming |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
236 language with GC, if we use memory pool like technique, it makes many |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
237 references which GC have to take care of. It makes GC very slow. So |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
238 it is better to simply generate and destroy. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
239 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
240 Apache Web server features memory pool approach drastically, but it |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
241 is an convention, some module use malloc library call directly. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
242 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
243 ---Programming Correctness |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
244 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
245 SEDA architecture or SPURS architecture is very complex to implement |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
246 and the program working on it is very difficult also. It is very difficult |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
247 to test. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
248 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
249 For example, message packet between pipeline stages is created and destroyed |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
250 in exact moment. If we lost the correct timing, a bug will occur or not |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
251 if we are unlucky. This is odd, because even if program itself is deterministic, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
252 it behaves non deterministic dew to pipeline execution timing. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
253 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
254 ---Many Core Awareness |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
255 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
256 Open CL \cite{opencl} is a standard library to use Many Core |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
257 architecture, but it has very complex interfaces. We have to |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
258 write a program on a core in a string, which is compiled in |
1 | 259 LLVM \cite{llvm}. Data transfer API is vary and complex, which |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
260 requires large amount of code. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
261 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
262 In case of Java or Scripting language, we cannot directly control |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
263 the copy between cores, which means we cannot hide copy cost |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
264 explicitly. We have to care about SPU's local memory size or |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
265 cache memory size which is 256Kbytes in this time. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
266 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
267 The same careful management is necessary for executing code |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
268 which is a data on a core also. We have to transfer code segment |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
269 using copying cost hiding technique such as pipeline execution. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
270 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
271 These higher level pipeline optimization is very difficult and is not handle well |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
272 in compiles. Since compiler technique is working well on streaming instructions, it is some how contradict. It should be designed by hands. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
273 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
274 --New Tools |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
275 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
276 We introduce two main tools, one is Continuation based C and the other one |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
277 is Data segment management based on Cerium Task Manager. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
278 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
279 --Code Segment |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
280 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
281 Continuation based C is a C language which all the function is |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
282 forced to do tail call elimination. It is implemented using |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
283 GCC 4.x. Modification is not so large. We also force |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
284 FASTCALL option which assign arguments on registers. This |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
285 makes it faster. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
286 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
287 CbC Syntax is very simple. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
288 |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
289 struct interface1 { DataSegment<Data> *i;}; |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
290 struct interface2 { DataSegment<Data> *o;}; |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
291 |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
292 __code f(struct interface1 *a, |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
293 struct interface2 *b) { |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
294 b->o=a->i; |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
295 goto g(b); |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
296 } |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
297 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
298 In this example, a code segment |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
299 \verb+f+ has \verb+input a+ and sends \verb+output b+ to a code segment \verb+g+. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
300 There is no return from code segment \verb+b+, \verb+b+ should call another |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
301 continuation using \verb+goto+. Any control structure in C is allowed in CbC |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
302 language, but we may restrict ourselves to use \verb+if+ statement |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
303 only, because it is sufficient to implement C to CbC translation. In this case, |
5 | 304 code segment has one input interface and several output interfaces (fig.\ref{Code Segment}). |
305 <center><img src="fig/code.pdf" alt="Code Segment"></center> | |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
306 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
307 \verb+__code+ and parameterized global goto statement is an extension of |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
308 Continuation based C. Unlike \verb+C--+ \cite{cminusminus}'s parameterized goto, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
309 we cannot goto into normal C function because of forced FASTCALL option. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
310 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
311 ---Continuation |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
312 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
313 Since code segment has no stack, continuation of code segment is mere |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
314 entry address to the code segment. We can call it a light weight continuation. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
315 |
1 | 316 We also supports full continuation of normal C function using GCC nested function and statement expression. It is implemented some like this in GCC compiler in |
317 a pseudo code with GCC extensions. | |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
318 |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
319 void (*__return)(int retval_, void *_envp); |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
320 __return = ({ |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
321 nee_label__ _cbc_exit0; |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
322 void __return_func(int retval_, |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
323 void *_envp){ |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
324 retval = retval_; |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
325 goto exit0; |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
326 } |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
327 if (0) { |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
328 _cbc_exit0: |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
329 return retval; |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
330 } |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
331 __return_func; // return value |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
332 }); |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
333 |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
334 void *__environment = |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
335 __builtin_frame_pointer(); |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
336 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
337 We have a environment pointer which is usually the frame pointer, but |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
338 it is not used here, because this is a closure with a hidden environment. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
339 Since this closure is usually implemented using trampoline, that is executable code on stack, if execution code on stack is prohibited, it will not work, but it works on Linux and Mac OS X. In case of Windows case, we cannot use closure, |
1 | 340 so we have to assign frame pointer explicitly. If we don't have to |
341 handle frame pointer directly, generation of continuation is done in | |
342 parsing phase. This is important to make GCC modification minimum. | |
343 | |
344 Anyway this can be used like this. | |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
345 |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
346 int main() { |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
347 goto f(1, __environment, __return ); |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
348 .... |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
349 } |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
350 __code f(int, void *env, |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
351 __code (*continuation)(int retval_,void *fp)) { |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
352 goto (continuation)(-1, env); |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
353 } |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
354 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
355 In this example, \verb+main+ will return -1. When you want to return to |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
356 the middle of the normal function or code segment, put an extra function |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
357 call over it. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
358 |
1 | 359 |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
360 --Data Segment |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
361 |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
362 We have Open CL like task manager with data segment. |
1 | 363 |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
364 Data segment is a set of doubly linked fix size block which also |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
365 hashed by the 64bit address. It has $2^n$ size, so it is allocated |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
366 efficiently. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
367 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
368 Each site, CPU or cores expected to have separated data segment pool. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
369 Data segment address is unique in all CPUs. In case of PS3, SPU has |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
370 local storage that is 256Kbytes separate addressing space, which |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
371 local address is different from data segment global address. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
372 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
373 Code segment will not use global address directly but it will use |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
374 offset in data segments in its input interface. So we can use |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
375 same code segment both for 64 bit Xeon and SPU 256Kbytes memory. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
376 This means data segment size is normally limited by its hardware, |
1 | 377 Typically 16Kbytes (fig.\ref{Pipeline buffered data segment}). |
378 | |
379 <center><img src="fig/pipeline.pdf" alt="Pipeline buffered data segment"></center> | |
380 | |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
381 Each Core have to have two input segments and two output segments |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
382 to make pipeline correctly. With two extra segments are necessary |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
383 for task array it-selves, so we have 6 segments total. |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
384 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
385 ---Data Segment operations |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
386 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
387 Data segment has several operations, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
388 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
389 get with no global address |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
390 get with copy |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
391 get with no copy |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
392 get with copy with write back |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
393 get with no copy with write |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
394 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
395 Allocation/deallocation is not directly handled from its code segment, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
396 because it is handled by the Task Manager in a pipelined way. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
397 |
1 | 398 API can be called from a Task like this, |
399 | |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
400 Datasegment tile = |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
401 smanager->get_segment(addr); |
1 | 402 |
403 but usually it not visible from the task, because its reading | |
404 operations were done before its execution and its writing | |
405 execution will be done after the task execution. | |
406 | |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
407 Data segment may contains other data segments' global address, but |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
408 it may invalid. It is a kind of key in a key value store. Consistency |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
409 of data segment global address is maintained by the Cerium task manager. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
410 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
411 --Typical usage |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
412 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
413 A code segment is provided input interface, which contains array of |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
414 data segment in local address space or cache. Usually availability |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
415 of data segment is assured by the task manager. If it is not ready, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
416 the code segment waits and other ready-to-run code segments are executed. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
417 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
418 Loading necessary data segments in the input interfaces are done prior |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
419 to the execution, may be in a back ground of other code segments |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
420 execution. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
421 |
1 | 422 In following example, \verb+t_exec+ is created, and it has one input |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
423 data segment and one output data segment. It can be executed in |
1 | 424 any SPU (PS3's CPU core), and \verb+t_print+ task have to wait for |
425 its completion. Finally it is spawned. | |
426 | |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
427 HTask *t_exec = |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
428 manager->create_task(TASK_EXEC); |
1 | 429 t_exec->add_input_datasegment(i_data); |
430 t_exec->add_output_datasegment(o_data); | |
431 t_exec->set_cpu(SPE_ANY); | |
432 t_print->wait_for(t_exec); | |
433 t_exec->spawn(); | |
434 | |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
435 When all data segments are ready, the cod segment is executed. During |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
436 its execution, next input interfaces may be loading. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
437 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
438 After the execution, output interfaces are written into the global |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
439 address if necessary. This is also done in a pipelined way. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
440 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
441 ---Task dependency and Task array |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
442 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
443 Cerium task manager has very simple FIFO scheduler. It is sufficient |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
444 if only through put is matter, which is a usual case. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
445 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
446 All the task is stored in data segments, and connected wait-for link. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
447 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
448 After a task is finished, the task manager solve these dependencies, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
449 which is a rather heavy task. If tasks are grouped in terms of |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
450 dependencies, we can reduce this phase. This is called task array. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
451 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
452 All the data is stored in data segments and it is managed in data segment |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
453 pool in each separate CPU, that is we need no lock in its allocation. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
454 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
455 ---Task execution |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
456 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
457 After the loading of input interface, if we have a next task, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
458 we know where to execute it. It can be passed to the current |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
459 task. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
460 |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
461 _code task_a(next_task, interface input, |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
462 interface output) { |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
463 .... Task processing |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
464 goto next_task->code(next_task, |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
465 next_task->input, next_task->output); |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
466 } |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
467 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
468 If we have not task to execute more, we can put mail waiting task in |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
469 the \verb+next_task+. |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
470 In this way, \verb+next_task+ call address is determined well before the call. |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
471 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
472 ---Data segment deallocation timing |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
473 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
474 There are two types of data segment. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
475 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
476 The one is staying in a main memory |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
477 indefinitely, possibly replicated in more reliable storage hierarchy such |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
478 as SSD or Hard Disks. It's global address is persistent. It is basically |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
479 write only and remain forever in the life of the Internet service. In |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
480 other words, it will never be deallocated. We can call this a persistent |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
481 data segment. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
482 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
483 The other data segment is stayed in local cores. This is limited and |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
484 temporally. It is copied from the persistent data segment. After the |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
485 code segment execution, temporal data segment may be copied into |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
486 persistent data segment. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
487 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
488 Task itself don't care about reading and writing race conditions. It have |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
489 be controlled in terms of task dependency or be controlled by the task |
1 | 490 manager (fig.\ref{Global and local data segments}). |
491 | |
492 <center><img src="fig/global.pdf" alt="Global and local data segments"></center> | |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
493 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
494 ---Where is the synchronization? |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
495 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
496 In Cerium Task Manager, a core has a single threaded scheduler. It accept |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
497 an array of task as a data segment as a mail. There is a main task manager, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
498 which waits mails from schedulers in cores. Synchronization only |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
499 happens in mail transfer among main scheduler and sub schedulers in cores. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
500 This means synchronization itself can be delayed significantly in this scheme. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
501 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
502 If some service needs very fast response, dominating special task is necessary. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
503 For example, it wait some events using spin locks or hardware interrupts. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
504 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
505 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
506 ---Hardware support |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
507 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
508 This architecture requires explicit cache control. But now a day, most |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
509 architecture has this kind of cache control such as memory barrier. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
510 Unfortunately these are not standardized. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
511 Using Cerium task manager, we can hide these differences. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
512 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
513 ---With Conventional Operating System |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
514 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
515 Task Manager itself is running in a user space. Since tasks are in |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
516 data segments and it can be transferred to other user spaces, for |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
517 example in other clusters. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
518 Actually we build our task manager in user space. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
519 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
520 There are possible operating system supports for this task manager or |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
521 we can provide memory space management for code segments and data segments. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
522 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
523 ---Object Orientation |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
524 |
1 | 525 There are many object oriented programming style since Smalltalk-80\cite{smalltalk}. |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
526 C++ or Java has an object in fixed memory address. When a |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
527 field of an object is updated, fixed memory contents is updated. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
528 In case of highly pipelined execution, updating memory contents requires |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
529 synchronization when the object is shared. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
530 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
531 In our scheme, usually input interface and output interface point |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
532 different data segment to avoid synchronization. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
533 |
1 | 534 In ACT3\cite{actor87}, actor has a become operator. An object is replaced by |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
535 newly created object. This means object has multiple memory address |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
536 according to its update history. In Smalltalk-80, it has object |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
537 table and become operator is a replacement of pointer in the list. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
538 The list should be kept in a data segment and update by a single |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
539 threaded task. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
540 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
541 We can build actor like object oriented system on top of data segment |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
542 pools. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
543 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
544 ---Verification |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
545 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
546 Basically pipelined tasks are in fact, series of application of |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
547 tasks on requests. We can simply writes this using iterator. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
548 In case of word count in a file, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
549 |
3
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
550 foreach data segment d |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
551 in ( file ), |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
552 out in (partial_result) { |
8fd7e6fb855f
spell check pdf generated.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
1
diff
changeset
|
553 task_work_count(d,out); |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
554 } |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
555 task_sum_up(partial_result); |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
556 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
557 If pipeline execution is correct, we don't have to verify the pipeline |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
558 execution, but check the correctness of this sequential execution, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
559 which is much easier. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
560 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
561 Once we get verified sequential execution, we can put checking |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
562 stage on each pipeline stage. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
563 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
564 -- Comparison |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
565 |
1 | 566 Our architecture is a variant of SEDA, but it can reduce |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
567 synchronization cost. Using data segment copying, shared |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
568 data are reduced. Copying costs are hide using DMA or |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
569 cache management instruction. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
570 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
571 Open CL is recently introduced but it has very complex |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
572 data transfer operation. It is an assembler level description |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
573 to achieve best performance. Data segment handling makes it |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
574 simple both in syntactically and in memory management efficiency. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
575 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
576 Tasks in Open CL is stored in C strings. In our scheme |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
577 tasks are all written in code segments, which can run on |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
578 any architecture. Actually we can run Cerium both on Mac OS X and |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
579 PS3 Linux using the same code. If it contains an optimized code |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
580 for SPU, we can run a code has the same behavior with non optimized code. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
581 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
582 Script language is easy to describe, but it works sequentially |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
583 from the beginning. Python cannot achieve parallel processing |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
584 advantage because of interpreter restriction. Our approach is that |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
585 divide problem into code segments and data segments and execute |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
586 it in an iterator. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
587 |
1 | 588 This is something like FP\cite{key7}, but using data segments, its execution |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
589 becomes suitable for Many core architecture. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
590 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
591 KVS, key value store is a distributed database which is separated |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
592 from main service program, In our scheme, management of global |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
593 data segment can be done in a KVS. We can also use our architecture |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
594 to implement a KVS. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
595 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
596 In Persistent programming, records and transactions are introduced |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
597 in Programming Languages, but at that time, parallel execution is |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
598 not well considered. In our scheme, data segments behave as records, |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
599 which has several versions. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
600 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
601 This architecture uses a new language CbC. It is a lower level |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
602 language of C, but still programmers have to learn it. It is |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
603 very different programming style also. We think it is not so |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
604 easyly accepted by every one. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
605 |
1 | 606 In structured programming and data flow diagrams \cite{Bruza93thesemantics}, everything has record like structure, |
0
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
607 which is called container. In this architecture, containers has |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
608 common operations and managed in many core architecture. It is |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
609 also executed in CbC. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
610 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
611 There is a conversion algorithm from C to CbC, so we hope |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
612 some kind of half automatic conversion of sequential implementation |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
613 of the Internet service is possible. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
614 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
615 --Conclusions |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
616 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
617 We are developing SEDA like architecture for software service |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
618 architecture. It has code segment system based on Continuation |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
619 based C and data segment system based on |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
620 Cerium Engine. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
621 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
622 Combination of data segment and code segment provides a better |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
623 many core programming than Open CL. It is executed in multi |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
624 stage pipeline. Code segment provides good implementation |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
625 technique of pipeline scheduler. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
626 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
627 Data segment copying makes garbage collection unnecessary in |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
628 computation pipeline which cannot avoid in case of GC based |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
629 language such as Java. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
630 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
631 Cerium task engine and Continuation based C compiler is |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
632 developed openly and working. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
633 The combination of code segment and data segment is under construction. |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
634 |
201c0dfb14fd
first try with no figure, no reference.
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff
changeset
|
635 |