diff --git a/reduce/bin_reduce.c b/reduce/bin_reduce.c
index c420b2c..0bbd10b 100644
--- a/reduce/bin_reduce.c
+++ b/reduce/bin_reduce.c
@@ -1,10 +1,3 @@
-/*
- * bin_reduce.c
- *
- * Created on: 16 Jun 2016
- * Author: johannes
- */
-
 #include
 #include
 #include
diff --git a/reduce/binom_reduce.c b/reduce/binom_reduce.c
index e96db38..82737cc 100644
--- a/reduce/binom_reduce.c
+++ b/reduce/binom_reduce.c
@@ -1,10 +1,3 @@
-/*
- * binom_reduce.c
- *
- * Created on: 18 Jun 2016
- * Author: johannes
- */
-
 #include
 #include
 #include
diff --git a/reduce/fib_reduce.c b/reduce/fib_reduce.c
index b1c69cf..3479a63 100644
--- a/reduce/fib_reduce.c
+++ b/reduce/fib_reduce.c
@@ -1,10 +1,3 @@
-/*
- * fib_reduce.c
- *
- * Created on: 18 Jun 2016
- * Author: johannes
- */
-
 #include
 #include
 #include
diff --git a/reduce/hpc_mpi.c b/reduce/hpc_mpi.c
index c0d2e7d..b271da9 100644
--- a/reduce/hpc_mpi.c
+++ b/reduce/hpc_mpi.c
@@ -1,18 +1,10 @@
-/*
- ============================================================================
- Name : hpc_mpi.c
- Author :
- Version :
- Copyright : Your copyright notice
- Description : Hello MPI World in C
- ============================================================================
- */
 #include
 #include
 #include
 #include
 #include
 #include
+#include
 #include "bin_reduce.h"
 #include "binom_reduce.h"
 #include "fib_reduce.h"
diff --git a/reduce/report/nodeplot.pdf b/reduce/report/nodeplot.pdf
new file mode 100755
index 0000000..c64e5f2
Binary files /dev/null and b/reduce/report/nodeplot.pdf differ
diff --git a/reduce/report/report.pdf b/reduce/report/report.pdf
new file mode 100755
index 0000000..bf51283
Binary files /dev/null and b/reduce/report/report.pdf differ
diff --git a/reduce/report/report.tex b/reduce/report/report.tex
index df206a9..e3ca713 100755
--- a/reduce/report/report.tex
+++ b/reduce/report/report.tex
@@ -106,7 +106,7 @@ As a baseline for the comparison serves a given implementation of the MPI standa
  A binomial tree has a non-fixed degree where each tree $B_i$ has exactly $i$ subtrees of size $B_0$ to $B_{i-1}$.
  The number of nodes in such a tree is equal to $2^i$ and the depth is $i$.
 \item[Fibonacci Tree]
- The Fibonacci tree uses a fixed degree of $2$ where a tree of size $F_i$ has one subtree of size $T_{i-1}$ and one of $T_{i-2}$.
+ The Fibonacci tree uses a fixed degree of $2$ where a tree of size $F_i$ has one subtree of size $F_{i-1}$ and one of $F_{i-2}$.
  Therefore the number of nodes in this kind of tree is $fib(i+3)-1$ using the Fibonacci function $fib(x) = fib(x-1)+fib(x-2)$ and its depth is as well $i$.
 \item[Binary Tree]
  The binary tree used for reduction is a common complete binary tree where a tree $T_i$ has two subtrees $T_{i-1}$.
@@ -285,7 +285,11 @@ There is again a comparison between a tree $T_2$ and $T_3$ which is shown in \pr
 
 \begin{figure}
 \begin{center}
-\begin{tikzpicture}
+\begin{tikzpicture}[
+ auto,
+ level 1/.style={sibling distance=40mm},
+ level 2/.style={sibling distance=20mm},
+ level 3/.style={sibling distance=10mm}]
 \begin{scope}
 \node [circle,draw]{$0$}
 child { node [circle,draw]{$1$}
@@ -295,7 +299,7 @@ child { node [circle,draw]{$4$}
 child { node [circle,draw]{$5$}}
 child { node [circle,draw]{$6$}}};
 \end{scope}
-\begin{scope}[shift={(5,0)}]
+\begin{scope}[shift={(7,0)}]
 \node [circle,draw]{$0$}
 child { node [circle,draw]{$1$}
 child { node [circle,draw]{$2$}
@@ -328,6 +332,43 @@ As a result the number of rounds for this algorithm is the size of the tree plus
 
 \section{Results}
 \label{sec:results}
+Before comparing the runtime of our algorithms, their correctness has to be tested to a reasonable extent.
+As already stated in the project description, this can be done easily by comparing each result to that of the MPI\_Reduce function.
+This can be done for all implementations at once to quickly test them for correctness.
+With this method we tried various combinations of array sizes, process counts and operations, and the result always turned out to be correct.
+
+After these correctness tests we benchmarked our implementations.
+For this we used 36 nodes of the jupiter system with 16 processes each.
+We ran two kinds of tests, explained below, to compare the runtime of our implementations with that of the MPI\_Reduce function.
+
+\subsection{Process Count}
+
+For the first benchmark we varied the number of processes to check the scaling of all methods.
+The size of the array on each process for this test is $1000$.
+This is a rather low value, but since tree reduction is not optimal for large amounts of data, such a value makes sense.
+The number of processes was increased by starting with only one node and going up to all 36 available nodes.
+On each node all 16 processes were used in all tests.
+Therefore the total process count ranged from 16 to 576.
+For all our tests in this project we used a repetition count of 30, which allowed us to run a high number of different inputs in a reasonable amount of time.
+
+\begin{figure}
+ \begin{adjustbox}{center}
+ \includegraphics[width=0.8\linewidth]{nodeplot}
+ \end{adjustbox}
+ \caption{Average runtimes on 1 to 36 nodes with 16 processes each.}
+ \label{fig:roofline}
+\end{figure}
+
+\subsection{Array Size}
+
+Our second benchmark used a fixed number of processes while the size of the local arrays was increased.
+This should show how the different implementations perform for small arrays or even a single number.
+But it also shows how they perform with a large amount of data on each process.
+The number of nodes was fixed at 36 for the complete test.
+The size of the local arrays was increased by a factor of 10 in each iteration, starting with just 1 and going up to 1000000.
+The number of repetitions is 30, the same as for the previous test.
+
+
 
 \FloatBarrier
 \section{Analysis}