view paper/rectype.ind @ 14:66931f63db4d

modify
author Nobuyasu Oshiro <dimolto@cr.ie.u-ryukyu.ac.jp>
date Thu, 14 Jun 2012 23:45:05 +0900
parents bf3c780d3039
children 68d2c32f74cf
line wrap: on
line source

-title: Recursive type syntax in Continuation based C

\newcommand{\rectype}{{\tt \_\_rectype}}

--author: Shinji Kono, Nobuyasu Oshiro

--abstract: 
We have implemented Continuation based C (CbC).
CbC is an extension of C, which has parameterized goto statement.
It is useful for finite state automaton or many core tasks. 
Goto statement is a way to force tail call elimination.
The destination of goto statement is called Code Segment, which is actually a normal function of C.
To represent recursive function call, the type system of C is not enough, because
it has no recursive types.
We introduce \rectype keyword for recursive type, and it is implemented in GCC-4.6.0.
We will compare the conventional methods, \rectype keyword and a method using C structure.
Also we show usage of CbC and it's benchmark.


--Continuation based C

CbC's basic programming unit is a code segment. It is not a subroutine, but it
looks like a function, because it has input and output. We can use C struct
as input and output interfaces.

   struct interface1  { int i; };
   struct interface2  { int o; };

   __code f(struct interface1 a) { 
       struct interface2 b; b.o=a.i;  
       goto g(b); 
   }

In this example, a code segment
\verb+f+ has \verb+input a+ and sends \verb+output b+ to a code segment \verb+g+.
There is no return from code segment \verb+b+, \verb+b+ should call another
continuation using \verb+goto+. Any control structure in C is allowed in CwC
language, but in case of CbC, we restrict ourselves to use \verb+if+ statement
only, because it is sufficient to implement C to CbC translation. In this case,
code segment has one input interface and several output interfaces (fig.\ref{code}).

\includegraphics[width=6cm]{
\begin{figure}[htb]
\begin{center}
\includegraphics[width=6cm]{figure/code.pdf}
\caption{code}
\end{center}
\label{code}
\end{figure}



\verb+__code+ and parameterized global goto statement is an extension of
Continuation based C. Unlike \verb+C--+ \cite{cminusminus}'s parameterized goto,
we cannot goto into normal C function.

--Intermix with C

In CwC, we can go to a code segment from a C function and we can call C functions
in a code segment. So we don't have to shift completely from C to CbC. The later
one is straight forward, but the former one needs further extensions.

   void *env;
   __code (*exit)(int);

   __code h(char *s) { 
        printf(s);
        goto (*exit)(0),env;
   }

   int main() {
        env  = __environment;
        exit = __return;
        goto h("hello World\n");
   }

In this hello world example, the environment of \verb+main()+
and its continuation is kept in global variables. The environment
and the continuation can be get using \verb+__environment+,
and \verb+__return+. Arbitrary mixture of code segments and functions
are allowed (in CwC). The continuation of \verb+goto+ statement 
never returns to original function, but it goes to caller of original
function. In this case, it returns result 0 to the operating system.


--What's good?

CbC is a kind of high level assembler language. It can do several
original C language cannot do. For examples,

{\small
\begin{verbatim}
    Thread Scheduler 
    Context Switch
    Synchronization Primitives
    I/O wait semantics

\end{verbatim}
}

are impossible to write in C. Usually it requires some help of
assembler language such as \verb+__asm+ statement extension which is
of course not portable.

--Scheduler example

We can easily write these things in CbC, because
CbC has no hidden information behind the stack frame of C. 
A thread simply go to the scheduler,

    goto scheduler(self, task_list);


and the scheduler simply pass the control to the next
thread in the task queue.

    code scheduler(Thread self,TaskPtr list)
    {
        TaskPtr t = list;
        TaskPtr e;
        list = list->next;
        goto list->thread->next(list->thread,list);
    }

Of course it is a simulator, but it is an implementation
also. If we have a CPU resource API, we can write real multi
CPU scheduler in CbC.

This is impossible in C, because we cannot access the hidden
stack which is necessary to switch in the scheduler. In CbC,
everything is visible, so we can switch threads very easily.

This means we can use CbC as an executable  specification 
language of OS API.

--Self Verification

Since we can write a scheduler in CbC, we can also enumerate
all possible interleaving of a concurrent program. We have
implement a model checker in CwC. CbC can be a self verifiable
language\cite{kono08a}.

SPIN\cite{holzmann97model} is a very reliable model checker, but it have to
use special specification language PROMELA. We cannot directly
use PROMELA as an implementation language, and it is slightly
difficult to study its concurrent execution semantics including
communication ports.

There are another kind of model checker for real programming
language, such as Java PathFinder\cite{havelund98model}. Java PathFinder use
Java Virtual Machine (JVM) for state space enumeration which
is very expensive some time.

In CbC, state enumerator itself is written in CbC, and its concurrency
semantics is written in CbC itself. Besides it is very close
to the implementation. Actually we can use CbC as an implementation
language. Since enumerator is written in the application itself, we
can perform abstraction or approximation in the application specific
way, which is a little difficult in Java PathFinder. It is possible
to handle JVM API for the purpose, although.

We can use CPS transformed CbC source code for verification, but
we don't have to transform all of the source code, because CwC
supports all C constructs. (But not in C++... Theoretically it is
possible with using cfront converter, it should be difficult).


--As a target language

Now we have GCC implementation of CbC, it runs very fast. Many
popular languages are implemented on top of C. Some of them
uses very large switch statement for the byte code interpreter.
We don't have to use these hacks, when we use CbC as an implementation
language. 

CbC is naturally similar to the state charts. It means it is very
close to UML diagrams. Although CbC does not have Object Oriented
feature such as message passing nor inheritance, which is not
crucial in UML.


--Recursive type syntax

CbC's program pass next pointer of code segment on argument.
It is passed as follows.

   __code csA( __code (*p)() ) {
       goto p(csB);
   }

But, This declarationd is not right.
Because p have arguments.
p 



    __code csA( __code (*p)( __code (*)( __code (*)( __code )))) {
       goto p(csB);
    }




We implemented \rectype syntax in CbC on GCC.
\rectype syntax is declare a recursive type.
This example is using \rectype in CbC.

    __code csA( __rectype *p) {
       goto p(csB);
    }

*p represent pointer of csA at \ref{code:rectype} .
p's argument type is same csA that function pointer.

Recursive type Can not declarated in C.
Because 



It is the same as the following.

    void csA( void (*p)(void*)) {
       p(csB);
    }

--Implementation of \rectype in CbC on GCC


\begin{figure}[htpb]
  \begin{minipage}{0.5\hsize}
  \begin{center}
\scalebox{0.35}{\includegraphics{figure/tree1.pdf}}
  \end{center}
  \caption{}
  \label{fig:tree1}
  \end{minipage}
  \begin{minipage}{0.2\hsize}
  \begin{center}
\scalebox{0.35}{\includegraphics{figure/tree2.pdf}}
  \end{center}
  \caption{\rectype}
  \label{fig:tree2}
\end{minipage}
\end{figure}



--Method other than \rectype

    struct interface {
       __code (*next)(struct interface);
    };

    __code csA(struct interface p) {
        struct interface ds = { csB };
        goto p.next(ds);
    }

    int main() {
        struct interface ds = { print };
        goto csA(ds);
        return 0;
    }





    __code fibonacci(__rectype *p, int num,  int count, int result, int prev) {



\section{Comparision}


\begin{table}[htpb]
\centering
\small
\begin{tabular}{|l|r|r|r|} \hline
(unit: s) & ./conv1 1 & ./conv1 2 &  ./conv1 3 \\ \hline
Micro-C(32bit)       & 9.93 & 6.31 & 7.18 \\ \hline 
Micro-C(64bit)       & 5.03 & 5.12 & 5.00 \\ \hline \hline
GCC -O3(32bit)       & 2.52 & 2.34 & 1.53 \\ \hline
GCC -O3(64bit)       & 1.80 & 1.20 & 1.44 \\ \hline
\end{tabular}
\caption{Micro-C, GCC bench mark (in sec)}
\label{tab:mc,gcc,compare}
\end{table}