Merge branch 'master' of bitbucket.org:winfried/hpc
commit 7c2a8a1dbb
7 changed files with 45 additions and 33 deletions
@@ -1,10 +1,3 @@
-/*
- * bin_reduce.c
- *
- * Created on: 16 Jun 2016
- * Author: johannes
- */
-
 #include <stdio.h>
 #include <string.h>
 #include <mpi.h>
@@ -1,10 +1,3 @@
-/*
- * binom_reduce.c
- *
- * Created on: 18 Jun 2016
- * Author: johannes
- */
-
 #include <mpi.h>
 #include <string.h>
 #include <stdio.h>
@@ -1,10 +1,3 @@
-/*
- * fib_reduce.c
- *
- * Created on: 18 Jun 2016
- * Author: johannes
- */
-
 #include <stdio.h>
 #include <string.h>
 #include <mpi.h>
@@ -1,18 +1,10 @@
-/*
- ============================================================================
- Name : hpc_mpi.c
- Author :
- Version :
- Copyright : Your copyright notice
- Description : Hello MPI World in C
- ============================================================================
- */
 #include <stdio.h>
 #include <string.h>
 #include <mpi.h>
 #include <unistd.h>
 #include <stdlib.h>
 #include <time.h>
+#include <getopt.h>
 #include "bin_reduce.h"
 #include "binom_reduce.h"
 #include "fib_reduce.h"
BIN reduce/report/nodeplot.pdf
Executable file
Binary file not shown.
BIN reduce/report/report.pdf
Executable file
Binary file not shown.
@@ -106,7 +106,7 @@ As a baseline for the comparison serves a given implementation of the MPI standa
 A binomial tree has a non-fixed degree where each tree $B_i$ has exactly $i$ subtrees of size $B_0$ to $B_{i-1}$.
 The number of nodes in such a tree is equal to $2^i$ and the depth is $i$.
 \item[Fibonacci Tree]
-The Fibonacci tree uses a fixed degree of $2$ where a tree of size $F_i$ has one subtree of size $T_{i-1}$ and one of $T_{i-2}$.
+The Fibonacci tree uses a fixed degree of $2$ where a tree of size $F_i$ has one subtree of size $F_{i-1}$ and one of $F_{i-2}$.
 Therefore the number of nodes in this kind of tree is $fib(i+3)-1$ using the Fibonacci function $fib(x) = fib(x-1)+fib(x-2)$ and its depth is as well $i$.
 \item[Binary Tree]
 The binary tree used for reduction is a common complete binary tree where a tree $T_i$ has two subtrees $T_{i-1}$.
@@ -285,7 +285,11 @@ There is again a comparison between a tree $T_2$ and $T_3$ which is shown in \pr
 
 \begin{figure}
 \begin{center}
-\begin{tikzpicture}
+\begin{tikzpicture}[
+auto,
+level 1/.style={sibling distance=40mm},
+level 2/.style={sibling distance=20mm},
+level 3/.style={sibling distance=10mm}]
 \begin{scope}
 \node [circle,draw]{$0$}
 child { node [circle,draw]{$1$}
@@ -295,7 +299,7 @@ child { node [circle,draw]{$4$}
 child { node [circle,draw]{$5$}}
 child { node [circle,draw]{$6$}}};
 \end{scope}
-\begin{scope}[shift={(5,0)}]
+\begin{scope}[shift={(7,0)}]
 \node [circle,draw]{$0$}
 child { node [circle,draw]{$1$}
 child { node [circle,draw]{$2$}
@@ -328,6 +332,43 @@ As a result the number of rounds for this algorithm is the size of the tree plus
 \section{Results}
 \label{sec:results}
 
+Before comparing the runtime of our algorithms, their correctness has to be tested to a reasonable extent.
+As already stated in the project description, this can be done easily by comparing each result to that of the MPI\_Reduce function.
+This can be done for all implementations at once to quickly test them for correctness.
+With this method we tried various combinations of array sizes, process counts and operations, and the result always turned out to be correct.
+
+After these correctness tests we started benchmarking our implementations.
+For this we used 36 nodes of the jupiter system with 16 processes each.
+We ran two kinds of tests, explained below, to compare the runtime of our implementations with that of the MPI\_Reduce function.
+
+\subsection{Process Count}
+
+For the first benchmark we varied the number of processes to check the scaling of all methods.
+The size of the array on each process for this test is $1000$.
+This is a rather low value, but since we are using tree reduction, which is not optimal for large amounts of data, such a value makes sense.
+The number of nodes used was increased from one up to all 36 available nodes.
+On each node all 16 processes were used in all tests.
+Therefore the total process count ranged from 16 to 576.
+For all our tests in this project we used a repetition count of 30, which allowed us to run a high number of different inputs in a reasonable amount of time.
+
+\begin{figure}
+\begin{adjustbox}{center}
+\includegraphics[width=0.8\linewidth]{nodeplot}
+\end{adjustbox}
+\caption{Average runtimes on 1 to 36 nodes with 16 processes each.}
+\label{fig:roofline}
+\end{figure}
+
+\subsection{Array Size}
+
+Our second benchmark used a fixed number of processes while increasing the size of the local arrays.
+This should show how the different implementations perform for small arrays or even a single number.
+It also shows how they perform with a large amount of data on each process.
+The number of nodes was fixed at 36 for the complete test.
+The size of the local arrays was increased by a factor of 10 in each iteration, starting at 1 and going up to 1000000.
+The number of repetitions is 30, the same as for the previous test.
+
+
 \FloatBarrier
 
 \section{Analysis}
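The Results text added in this commit describes checking each tree reduction against a reference result. As a rough illustration of that idea, the sketch below simulates a binomial-tree sum reduction inside a single process and compares it with a plain sequential sum. It does not use MPI and is not the code from this repository: the function names `binomial_reduce_sim` and `check_binomial_reduce`, the fill pattern and the choice of integer addition as the operation are all assumptions made for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: simulates a binomial-tree sum reduction over p "ranks".
 * data holds p rows of n ints, row r being the buffer of rank r.
 * In the round with distance `mask`, rank r receives from rank r + mask
 * (if it exists) and adds that buffer to its own. Row 0 ends up with the total. */
void binomial_reduce_sim(int *data, int p, int n)
{
    for (int mask = 1; mask < p; mask <<= 1) {
        for (int r = 0; r < p; r += 2 * mask) {
            int partner = r + mask;
            if (partner < p) {
                for (int j = 0; j < n; j++)
                    data[(size_t)r * n + j] += data[(size_t)partner * n + j];
            }
        }
    }
}

/* Hypothetical check: returns 1 if the simulated tree reduction matches a
 * sequential reference sum, 0 otherwise (including allocation failure). */
int check_binomial_reduce(int p, int n)
{
    int *data = malloc((size_t)p * n * sizeof *data);
    int *ref = calloc((size_t)n, sizeof *ref);
    if (data == NULL || ref == NULL) {
        free(data);
        free(ref);
        return 0;
    }
    for (int r = 0; r < p; r++) {
        for (int j = 0; j < n; j++) {
            int v = r * 31 + j;            /* arbitrary deterministic input */
            data[(size_t)r * n + j] = v;
            ref[j] += v;                   /* reference: plain sequential sum */
        }
    }
    binomial_reduce_sim(data, p, n);
    int ok = memcmp(data, ref, (size_t)n * sizeof *ref) == 0;
    free(data);
    free(ref);
    return ok;
}
```

Calling `check_binomial_reduce(576, 1000)` mirrors the largest process count and the array size mentioned in the report text. In a real MPI program the reference buffer would instead come from `MPI_Reduce` on the same input.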
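The benchmark setup described in the added report text (local array sizes growing by a factor of 10 from 1 to 1000000, 30 repetitions per configuration, averaged runtimes in the plot) could be organised along the following lines. This is only a sketch under those stated parameters: the helper names `make_sizes` and `mean_runtime` and the constant `REPETITIONS` are hypothetical and do not come from the repository.

```c
#include <stddef.h>

#define REPETITIONS 30  /* repetition count stated in the report text */

/* Hypothetical helper: writes 1, 10, 100, ... up to max_size into sizes[]
 * and returns how many entries were written (at most capacity). */
size_t make_sizes(long max_size, long *sizes, size_t capacity)
{
    size_t count = 0;
    for (long n = 1; n <= max_size && count < capacity; n *= 10)
        sizes[count++] = n;
    return count;
}

/* Hypothetical helper: arithmetic mean of the measured runtimes
 * of one configuration; returns 0.0 for an empty sample set. */
double mean_runtime(const double *samples, size_t count)
{
    if (count == 0)
        return 0.0;
    double sum = 0.0;
    for (size_t i = 0; i < count; i++)
        sum += samples[i];
    return sum / (double)count;
}
```

With `max_size` set to 1000000 this yields seven sizes. A driver would run each implementation `REPETITIONS` times per size, store the elapsed times (for example from `MPI_Wtime`) and pass them to `mean_runtime`.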