% author MasaKoha <kogagura@cr.ie.u-ryukyu.ac.jp>
% date Tue, 16 Jun 2015 11:55:03 +0900

\documentclass[twocolumn,twoside,9.5pt]{jarticle}
\usepackage[dvipdfmx]{graphicx}
\usepackage{picins}
\usepackage{fancyhdr}
\pagestyle{fancy}
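\lhead{\parpic{\includegraphics[height=1zw,clip,keepaspectratio]{pic/emblem-bitmap.pdf}}Organized by University of the Ryukyus: Department of Information Engineering, Faculty of Engineering, Graduation Research Presentation}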
\rhead{}
\cfoot{}

\setlength{\topmargin}{-1in \addtolength{\topmargin}{15mm}}
\setlength{\headheight}{0mm}
\setlength{\headsep}{5mm}
\setlength{\oddsidemargin}{-1in \addtolength{\oddsidemargin}{11mm}}
\setlength{\evensidemargin}{-1in \addtolength{\evensidemargin}{21mm}}
\setlength{\textwidth}{181mm}
\setlength{\textheight}{261mm}
\setlength{\footskip}{0mm}
\pagestyle{empty}

\begin{document}
\title{Implementing Asynchronous Read in Cerium}
\author{148585H {Masataka}{KOHAGURA}}
\date{}
\maketitle
\thispagestyle{fancy}

\section{Abstract}
We are developing Cerium, a parallel task manager.
In programs that include I/O, the read time is often larger than the processing time of the Tasks.
We want to write such programs as parallel programs, but when the I/O is slow, the whole program becomes slow.
In the conventional implementation, files are read with \texttt{mmap()} or \texttt{read()}.
This lowers the degree of parallelism, because the other CPUs stop while a file is being read.
In real read situations, asynchronous read sometimes gives good results on a word count example. We present the results and an analysis.
\section{Cerium Task Manager}
With Cerium Task Manager, parallel programs are written in units of Tasks.
Functions and subroutines are treated as Tasks, and for each Task we set its dependencies, input data, and output data.
The Task Manager then manages the Tasks according to these settings. In this paper, the ``input data'' is the text file to be searched.

Cerium Task Manager runs on PlayStation 3/Cell, Mac OS X, and Linux.

\section{Outline of Tasks including I/O}
After the file is read, it is split into pieces of a fixed size, and a string search is performed on each piece.
The result of each piece is then passed on to a final step that totals the counts
(Fig.~\ref{fig:includeio}).

\begin{figure}[htbp]
\begin{center}
\includegraphics[width=0.5\textwidth]{pic/includeio.pdf}
\end{center}
\caption{Task including I/O}
\label{fig:includeio}
\end{figure}
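
As a minimal sketch of this split-and-count flow, the following C code counts a search string over fixed-size chunks and sums the per-chunk counts. The names \texttt{count\_in\_range} and \texttt{count\_by\_chunks} are hypothetical and not part of Cerium. The chunks are processed sequentially here, whereas in Cerium each chunk would be handled by its own Task. The sketch also assumes that a match belongs to the chunk in which it starts, so a match that crosses a chunk boundary is counted exactly once.

\begin{footnotesize}
\begin{verbatim}
#include <stddef.h>
#include <string.h>

/* Count matches of pat that start in [begin, end).
 * A match may extend past end, up to len. */
static size_t count_in_range(const char *buf, size_t len,
                             size_t begin, size_t end,
                             const char *pat, size_t m)
{
    size_t n = 0;
    for (size_t i = begin; i < end && i + m <= len; i++)
        if (memcmp(buf + i, pat, m) == 0)
            n++;
    return n;
}

/* Split buf into chunks of a fixed size, count the
 * matches per chunk, and sum the results. */
size_t count_by_chunks(const char *buf, size_t len,
                       size_t chunk, const char *pat)
{
    size_t m = strlen(pat), total = 0;
    if (m == 0 || chunk == 0)
        return 0;
    for (size_t b = 0; b < len; b += chunk) {
        size_t e = len - b < chunk ? len : b + chunk;
        total += count_in_range(buf, len, b, e, pat, m);
    }
    return total;
}
\end{verbatim}
\end{footnotesize}

For example, \texttt{count\_by\_chunks("abcabcabc", 9, 4, "abc")} visits the chunks $[0,4)$, $[4,8)$, and $[8,9)$ and returns 3.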

\section{Design and implementation of parallel processing for I/O}

\subsection{Problems with the mmap implementation}
In our previous research, files were read with mmap.
With mmap, the file is not read when the mmap function is called; it is read when the mapped region is accessed for the first time.
As a result, a divided Task cannot start its string search immediately: the file is loaded into memory only when the Task first attempts the search.
Ideally, the Tasks would run simultaneously.
However, because the mmap read takes place inside each Task, Tasks end up waiting on the I/O bottleneck.
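
The deferred loading can be illustrated with a small C sketch that uses the POSIX calls \texttt{open}, \texttt{fstat}, \texttt{mmap}, and \texttt{munmap}. The function name \texttt{sum\_mapped\_file} is hypothetical, and the comments describe the typical behaviour of demand paging rather than a guarantee of any particular system.

\begin{footnotesize}
\begin{verbatim}
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a file read-only and sum its bytes.
 * Returns -1 on error or for an empty file. */
long sum_mapped_file(const char *path)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;
    /* Sets up the mapping only; on typical systems
     * no file data is read at this point. */
    unsigned char *p = mmap(NULL, len, PROT_READ,
                            MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid */
    if (p == MAP_FAILED)
        return -1;
    long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += p[i];  /* first touch of a page may
                         wait for disk I/O */
    munmap(p, len);
    return sum;
}
\end{verbatim}
\end{footnotesize}

In a search Task, the loop body corresponds to the string search, so any wait for the disk happens inside the Task itself.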

\subsection{Design and implementation of Asynchronous read}
Asynchronous read separates the work into a process that reads the file in pieces of a fixed size and a process that performs the string search.
To do this, we created a Task dedicated to reading, and separately generated Task Blocks that perform the string search.

The read Task does not read the entire file at once; it reads the file in pieces of a fixed size, and the string search on each piece is performed as soon as that piece has been read.
Starting the Tasks one by one would put pressure on memory, so multiple Tasks are grouped into a block and started block by block.

The part of the text file handled by one block is read by Asynchronous Read, and the Task Blocks for that range are started once the range has been loaded.
If a Task starts before Asynchronous Read has loaded the range it is responsible for, it does not return the correct result.
To prevent this, the Task Blocks always wait for Blocked Read before they start
(Fig.~\ref{fig:blockedreadwait}).

\begin{figure}[htbp]
\begin{center}
\includegraphics[width=0.5\textwidth]{pic/blockedreadwait.pdf}
\end{center}
\caption{Wait for Blocked Read}
\label{fig:blockedreadwait}
\end{figure}
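
The wait in Fig.~\ref{fig:blockedreadwait} can be sketched with POSIX threads. This is an illustration under assumptions, not the Cerium implementation: the structure \texttt{read\_progress} and the functions \texttt{blocked\_read} and \texttt{wait\_for\_range} are hypothetical, and the sketch assumes that one progress counter guarded by a mutex and a condition variable is enough to tell the search side how far the read has advanced.

\begin{footnotesize}
\begin{verbatim}
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical shared state, not a Cerium type. */
struct read_progress {
    pthread_mutex_t lock;
    pthread_cond_t  advanced;
    size_t done;  /* bytes already loaded into buf */
    int failed;   /* set if the read stopped early */
};

/* Read side: load the file block by block and
 * publish how far the read has progressed. */
void blocked_read(int fd, char *buf, size_t len,
                  size_t block, struct read_progress *rp)
{
    size_t off = 0;
    int failed = 0;
    while (off < len && !failed) {
        size_t want = len - off < block ? len - off
                                        : block;
        ssize_t got = pread(fd, buf + off, want,
                            (off_t)off);
        if (got <= 0)
            failed = 1;  /* error or early EOF */
        else
            off += (size_t)got;
        pthread_mutex_lock(&rp->lock);
        rp->done = off;
        rp->failed = failed;
        pthread_cond_broadcast(&rp->advanced);
        pthread_mutex_unlock(&rp->lock);
    }
}

/* Search side: wait until bytes [0, end) are loaded.
 * Returns 0 on success, -1 if the read failed. */
int wait_for_range(struct read_progress *rp, size_t end)
{
    pthread_mutex_lock(&rp->lock);
    while (rp->done < end && !rp->failed)
        pthread_cond_wait(&rp->advanced, &rp->lock);
    int ok = rp->done >= end;
    pthread_mutex_unlock(&rp->lock);
    return ok ? 0 : -1;
}
\end{verbatim}
\end{footnotesize}

In this sketch, \texttt{blocked\_read} would run on the read thread, and a search Task responsible for the bytes up to \texttt{end} would call \texttt{wait\_for\_range(rp, end)} before touching \texttt{buf}.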

\subsection{Implementation of the I/O thread}
In Cerium Task Manager, the CPU Type can be set for each Task.
If the Type is set to SPE\_ANY, Cerium Task Manager allocates a CPU automatically.
However, with this Type, a Task that interrupts the Blocked Read Task may be allocated.
To solve this problem, we implemented IO\_0, a thread dedicated to I/O.
This thread is tuned so that the I/O runs at the highest priority
(Fig.~\ref{fig:io0}).
%%
%(fig\ref{fig:speany})
%
%\begin{figure}[htbp]
%\begin{center}
%\includegraphics[width=0.4\textwidth]{pic/speany.pdf}
%\end{center}
%\caption{When set to SPE\_ANY}
%\label{fig:speany}
%\end{figure}


\begin{figure}[htbp]
\begin{center}
\includegraphics[width=0.4\textwidth]{pic/io0.pdf}
\end{center}
\caption{Implementation of IO\_0}
\label{fig:io0}
\end{figure}

\newpage

\section{Benchmark}

\begin{itemize}
 \item Mac OS X Mavericks (10.9.1)
 \item HDD 1 TB, Memory 16 GB, CPU 2 $\times$ 2.66 GHz 6-Core Intel Xeon
 \item Number of CPUs: 12
 \item Workload: Boyer-Moore string search over a 10 GB file, counting the occurrences of the search string
 \item Measured time: from the start of reading the file until the result is returned
\end{itemize}
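
The benchmark workload is a Boyer-Moore string search. As an illustration only, the following C sketch counts occurrences with the simpler Boyer-Moore-Horspool variant, which uses only a bad-character shift table. It is not the code used in the benchmark, and the name \texttt{bmh\_count} is hypothetical.

\begin{footnotesize}
\begin{verbatim}
#include <stddef.h>
#include <string.h>

/* Count occurrences (overlaps included) of pat in
 * text with Boyer-Moore-Horspool. */
size_t bmh_count(const unsigned char *text, size_t n,
                 const unsigned char *pat, size_t m)
{
    size_t shift[256], count = 0;
    if (m == 0 || n < m)
        return 0;
    /* A byte absent from pat allows a full shift. */
    for (size_t c = 0; c < 256; c++)
        shift[c] = m;
    /* Otherwise shift by the distance from the byte's
     * last occurrence (final byte excluded) to the
     * end of pat. */
    for (size_t i = 0; i + 1 < m; i++)
        shift[pat[i]] = m - 1 - i;
    size_t pos = 0;
    while (pos + m <= n) {
        if (memcmp(text + pos, pat, m) == 0)
            count++;
        /* Shift by the text byte under pat's end. */
        pos += shift[text[pos + m - 1]];
    }
    return count;
}
\end{verbatim}
\end{footnotesize}

Because the shift table lets the search skip over bytes that cannot start a match, the search itself tends to be cheap relative to reading a 10 GB file from disk.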

%The execution results are shown in Table \ref{table:result} below.
\begin{tiny}
  \begin{table}[ht]
    \begin{center}
      \small
      \begin{tabular}[t]{c|r}
        \hline
        Read Method & Average Time (s)\\
        \hline
        mmap & 154.6 \\
        \hline
        Bulk Read & 114.9 \\
        \hline
        Blocked Read \& SPE\_ANY & 106.0 \\
        \hline
        Blocked Read \& IO\_0 & 99.2 \\
        \hline
      \end{tabular}
      \caption{Average execution time for each read method}
      \label{table:result}
    \end{center}
  \end{table}
\end{tiny}

%From \ref{table:result}, the execution speed of Blocked Read \& IO\_0 improved by 36 \% over mmap.
According to Table 1, the execution time of Asynchronous Read (Blocked Read) with SPE\_ANY is 31\% shorter than that of mmap.
In addition, changing the CPU Type of Asynchronous Read from SPE\_ANY to IO\_0 gives a further 4\% improvement.
These results show that, for parallel processing that includes I/O, it is better to control the reading explicitly than to leave it to the automatic reading of mmap.
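
These percentages can be checked against the times in Table 1. Writing $T$ for the average time of each method and taking the mmap time as the baseline:
\[
\frac{T_{\mathrm{mmap}} - T_{\mathrm{SPE\_ANY}}}{T_{\mathrm{mmap}}}
= \frac{154.6 - 106.0}{154.6} \approx 0.314
\]
\[
\frac{T_{\mathrm{SPE\_ANY}} - T_{\mathrm{IO\_0}}}{T_{\mathrm{mmap}}}
= \frac{106.0 - 99.2}{154.6} \approx 0.044
\]
\[
\frac{T_{\mathrm{mmap}} - T_{\mathrm{IO\_0}}}{T_{\mathrm{mmap}}}
= \frac{154.6 - 99.2}{154.6} \approx 0.358
\]
The further 4\% therefore appears to be measured against the mmap time; measured against the SPE\_ANY time, the same change is $6.8 / 106.0 \approx 6.4\%$.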

\section{Conclusion}
When a Task that includes I/O is implemented with mmap, the file is read only when the mapped region is accessed, so the loading is left to each individual Task.
To solve this, we implemented Asynchronous Read so that the reading and the Tasks run in parallel.
In addition, by preventing Asynchronous Read from being interrupted by other Tasks, we observed a 35\% improvement in execution speed.
This study suggests that Tasks including I/O leave room for further improvement through tuning.

\thispagestyle{fancy}
\end{document}