Merge branch 'master' of bitbucket.org:winfried/hpc

Armin Friedl 2016-06-24 18:51:11 +02:00
commit 804c6be5a2


@ -191,12 +191,137 @@ However if the process is the root process the reduction is finished and can be
\end{algorithm}
The calculation of the parent and child nodes is the only aspect which has to be changed for all possible kinds of trees.
However there are of course certain optimizations possible to use some knowledge of a concrete tree.
However, certain optimizations are possible that exploit knowledge about the structure of a concrete tree.
Such implementation details will be shown in the following part.
The code for all our implementations can be found in the Appendix in \prettyref{sec:appendix}.
\FloatBarrier
\subsection{Binomial Tree Reduce}
\section{Implementation Details}
\label{sec:kernels}
The first of the three implementations we completed was the binomial tree reduction.
Since examples and explanations of how reductions and broadcasts work on binomial trees had already been presented during the lectures, this was probably the most straightforward part of the project.
Looking at trees of different sizes, we quickly noticed that the position of each node is static and the tree only grows in one direction.
Consequently, the children do not have to be precomputed; each child can instead be calculated inside the loop, right before the corresponding receive operation.
A comparison between a $B_2$ and a $B_3$ tree is shown in \prettyref{fig:binomtrees}.
\begin{figure}
\begin{center}
\begin{tikzpicture}
\begin{scope}
\node [circle,draw]{$0$}
child { node [circle,draw]{$1$}}
child {node [circle,draw] {$2$}
child {node [circle,draw] {$3$}}};
\end{scope}
\begin{scope}[shift={(5,0)}]
\node [circle,draw]{$0$}
child { node [circle,draw]{$1$}}
child {node [circle,draw] {$2$}
child {node [circle,draw] {$3$}}}
child {node [circle,draw] {$4$}
child {node [circle,draw] {$5$}}
child {node [circle,draw] {$6$}
child {node [circle,draw] {$7$}}}};
\end{scope}
\end{tikzpicture}
\caption{Comparison between a $B_2$ and a $B_3$}
\label{fig:binomtrees}
\end{center}
\end{figure}
From these trees we determined that the node with rank $r$ receives from the child $r+i$ in each iteration, where $i=1$ at the start and $i$ is doubled after each iteration.
Before each iteration an additional condition checks whether the node has any children left or whether it should send its result.
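
The child calculation described above can be sketched in C as follows.
This is an illustrative sketch and not the code from the appendix: the helper name \texttt{binomial\_links} and its signature are hypothetical, the stopping condition is assumed to be the usual bit test on the rank, and the process count is restricted to at most $2^{30}$ to avoid integer overflow.
The MPI calls are left out so that only the partner calculation is shown.

```c
#include <assert.h>

/* Hypothetical helper (illustration only): computes the communication
 * partners of `rank` in a binomial tree reduction over `size` processes.
 * The children are written to `children` in the order in which they would
 * be received and their count is returned. `*parent` is -1 for the root.
 * `children` must have room for at least 30 entries. */
int binomial_links(int rank, int size, int *children, int *parent)
{
    assert(size > 0 && size <= (1 << 30));
    assert(rank >= 0 && rank < size);

    int count = 0;
    *parent = -1;
    for (int i = 1; i < size; i *= 2) {
        if (rank & i) {
            /* Bit i of the rank is set: no children are left, so the
             * node would now send its result to the parent. */
            *parent = rank - i;
            break;
        }
        if (rank + i < size) {
            /* The child r + i exists and would be received from next. */
            children[count++] = rank + i;
        }
    }
    return count;
}
```

For the $B_3$ tree in \prettyref{fig:binomtrees}, rank $4$ obtains the children $5$ and $6$ and the parent $0$, which matches the figure.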
\subsection{Fibonacci Tree Reduce}
The core difference between a Fibonacci tree and a binomial tree is that the degree of each node is limited to $2$.
To guarantee the correct order of the operands in the computed operation, the position of a node inside the tree depends not only on the rank of the process but also on the total size of the tree.
This is because all ranks in one subtree must be lower than all ranks in the other subtree.
Therefore the position of a node with a certain rank changes depending on the tree size.
This can be seen in the comparison of the trees $F_2$ and $F_3$ in \prettyref{fig:fibtrees}.
\begin{figure}
\begin{center}
\begin{tikzpicture}
\begin{scope}
\node [circle,draw]{$0$}
child { node [circle,draw]{$1$}}
child {node [circle,draw] {$2$}
child {node [circle,draw] {$3$}}};
\end{scope}
\begin{scope}[shift={(5,0)}]
\node [circle,draw]{$0$}
child { node [circle,draw]{$1$}
child { node [circle,draw]{$2$}}}
child {node [circle,draw] {$3$}
child {node [circle,draw] {$4$}}
child {node [circle,draw] {$5$}
child {node [circle,draw] {$6$}}}};
\end{scope}
\end{tikzpicture}
\caption{Comparison between an $F_2$ and an $F_3$}
\label{fig:fibtrees}
\end{center}
\end{figure}
During the receiving step of the algorithm we no longer need a loop, since each node has at most two children.
On the other hand, the calculation of the children now has to be done in a loop.
As a first step we have to determine the size of the tree which can contain all processes.
This can be done by searching the Fibonacci numbers for the first value which is greater than the number of processes.
Since the Fibonacci numbers give us the sizes of both subtrees, we can determine whether a node belongs to the left or the right subtree.
Applying this recursively yields the position of a node and its children.
The runtime of this part depends on the size of the tree and is therefore bounded by the Fibonacci numbers.
Now that all communication partners have been determined, each process has to execute at most two receives followed by one send.
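
The descent into the subtrees can be sketched in C as follows.
The sketch is illustrative and is not the code from the appendix.
It assumes the layout that can be read off \prettyref{fig:fibtrees}: a tree $F_k$ consists of a root, a left subtree $F_{k-2}$ and a right subtree $F_{k-1}$, ranks are assigned in preorder, and ranks beyond the number of processes are simply dropped.
The names \texttt{fib\_tree\_size} and \texttt{fibonacci\_links} are hypothetical, and the process count is restricted to at most $2^{30}$ to avoid integer overflow.

```c
#include <assert.h>

/* Number of nodes in a tree of the given order, assuming
 * |F_k| = |F_(k-1)| + |F_(k-2)| + 1 with |F_0| = 1 and an empty F_(-1). */
static int fib_tree_size(int order)
{
    if (order < 0) {
        return 0;
    }
    int prev = 0, cur = 1; /* |F_(-1)| and |F_0| */
    for (int k = 1; k <= order; k++) {
        int next = cur + prev + 1;
        prev = cur;
        cur = next;
    }
    return cur;
}

/* Hypothetical helper (illustration only): computes the parent and the
 * children of `rank` for `size` processes. Returns the number of children
 * (0 to 2); `*parent` is -1 for the root. */
int fibonacci_links(int rank, int size, int children[2], int *parent)
{
    assert(size > 0 && size <= (1 << 30));
    assert(rank >= 0 && rank < size);

    /* Smallest order whose tree can contain all processes. */
    int order = 0;
    while (fib_tree_size(order) < size) {
        order++;
    }

    /* Walk down from the root until the subtree rooted at `rank` is found. */
    int root = 0;
    *parent = -1;
    while (root != rank) {
        int left_size = fib_tree_size(order - 2);
        *parent = root;
        if (rank <= root + left_size) { /* rank lies in the left subtree */
            root += 1;
            order -= 2;
        } else {                        /* rank lies in the right subtree */
            root += 1 + left_size;
            order -= 1;
        }
    }

    /* Children are the roots of the two subtrees, if they exist. */
    int count = 0;
    int left = root + 1;
    int right = root + 1 + fib_tree_size(order - 2);
    if (order >= 2 && left < size) {
        children[count++] = left;
    }
    if (order >= 1 && right < size) {
        children[count++] = right;
    }
    return count;
}
```

For the $F_3$ tree in \prettyref{fig:fibtrees}, rank $5$ obtains the parent $3$ and the single child $6$.
The size of each subtree is recomputed at every level for simplicity; a table of the sizes could be built once instead.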
Comparing this technique to the binomial tree, it is noticeable that the tree $F_3$ already has one node less than $B_3$.
This means that the binomial tree can handle more processes in the same number of rounds.
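
The node counts behind this comparison can be written down explicitly.
The recurrence for $F_k$ is an assumption read off \prettyref{fig:fibtrees}, where a tree $F_k$ consists of a root together with the subtrees $F_{k-2}$ and $F_{k-1}$, and where $|F_1| = 2$ and $|F_2| = 4$:
\[
|B_k| = 2^k, \qquad |F_k| = |F_{k-1}| + |F_{k-2}| + 1
\]
\[
|B_3| = 2^3 = 8, \qquad |F_3| = |F_2| + |F_1| + 1 = 4 + 2 + 1 = 7
\]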
\subsection{Binary Tree Reduce}
The reduction using a binary tree can be implemented in a very similar way to the Fibonacci tree, since the degree is also limited to two.
The key difference is of course the structure of the trees and therefore the calculation of the children.
Again, the position of a node changes depending on the size of the tree, since the lower ranks must be in the left subtree and the higher ones in the right subtree.
The structure of such trees can be illustrated rather nicely because they are simply complete binary trees.
A comparison between the trees $T_2$ and $T_3$ is shown in \prettyref{fig:bintrees}.
\begin{figure}
\begin{center}
\begin{tikzpicture}
\begin{scope}
\node [circle,draw]{$0$}
child { node [circle,draw]{$1$}
child { node [circle,draw]{$2$}}
child { node [circle,draw]{$3$}}}
child { node [circle,draw]{$4$}
child { node [circle,draw]{$5$}}
child { node [circle,draw]{$6$}}};
\end{scope}
\begin{scope}[shift={(5,0)}]
\node [circle,draw]{$0$}
child { node [circle,draw]{$1$}
child { node [circle,draw]{$2$}
child { node [circle,draw]{$3$}}
child { node [circle,draw]{$4$}}}
child { node [circle,draw]{$5$}
child { node [circle,draw]{$6$}}
child { node [circle,draw]{$7$}}}}
child { node [circle,draw]{$8$}
child { node [circle,draw]{$9$}
child { node [circle,draw]{$10$}}
child { node [circle,draw]{$11$}}}
child { node [circle,draw]{$12$}
child { node [circle,draw]{$13$}}
child { node [circle,draw]{$14$}}}};
\end{scope}
\end{tikzpicture}
\caption{Comparison between a $T_2$ and a $T_3$}
\label{fig:bintrees}
\end{center}
\end{figure}
Both the tree size and the child nodes can be computed from the base-2 logarithm of the number of processes.
The rest works in exactly the same way as the previously explained algorithms.
The drawback of structuring the tree like this is that in each round a node receives data from both children.
As a result, the number of rounds for this algorithm is the size of the tree plus an additional round.
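
The child calculation for the binary tree can be sketched in C in the same style.
The sketch is illustrative and is not the code from the appendix.
It assumes the layout shown in \prettyref{fig:bintrees}: a complete binary tree with ranks assigned in preorder, where ranks beyond the number of processes are dropped.
The name \texttt{binary\_links} is hypothetical, and the process count is restricted to fewer than $2^{30}$ to avoid integer overflow.

```c
#include <assert.h>

/* Hypothetical helper (illustration only): computes the parent and the
 * children of `rank` in a complete binary tree with preorder ranks over
 * `size` processes. Returns the number of children (0 to 2); `*parent`
 * is -1 for the root. */
int binary_links(int rank, int size, int children[2], int *parent)
{
    assert(size > 0 && size < (1 << 30));
    assert(rank >= 0 && rank < size);

    /* Smallest height whose complete tree (2^(h+1) - 1 nodes) can
     * contain all processes. */
    int height = 0;
    while ((1 << (height + 1)) - 1 < size) {
        height++;
    }

    /* Walk down from the root until the subtree rooted at `rank` is found. */
    int root = 0;
    *parent = -1;
    while (root != rank) {
        int half = (1 << height) - 1; /* nodes in each subtree below root */
        *parent = root;
        if (rank <= root + half) {
            root += 1;                /* rank lies in the left subtree */
        } else {
            root += 1 + half;         /* rank lies in the right subtree */
        }
        height--;
    }

    /* Inner nodes have the roots of both subtrees as children. */
    int count = 0;
    if (height >= 1) {
        int left = root + 1;
        int right = root + (1 << height);
        if (left < size) {
            children[count++] = left;
        }
        if (right < size) {
            children[count++] = right;
        }
    }
    return count;
}
```

For the $T_3$ tree in \prettyref{fig:bintrees}, rank $8$ obtains the parent $0$ and the children $9$ and $12$.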
\FloatBarrier
@ -209,6 +334,7 @@ However there are of course certain optimizations possible to use some knowledge
\label{sec:analysis}
\section{Appendix}
\label{sec:appendix}
\lstinputlisting[language=C]{../binom_reduce.c}