\subsection{Theoretical Peak Performance}
The CPU under test was an Intel\textregistered{} Core\texttrademark{} i5-4210U. \prettyref{tbl:spec-4210} shows the relevant specifications for this processor according to \textcite{ark4210}.

\begin{table}[h!]
\centering
\begin{tabular}{ll}
\toprule
Specification & Value \\
\midrule
Instruction Set Extension & SSE4.1/4.2, AVX 2.0 \\
\# of Cores & 2 \\
Processor Base Frequency & 1.7 GHz \\
Max Turbo Frequency & 2.7 GHz \\
Microarchitecture & Haswell \\
\bottomrule
\end{tabular}
\caption{Intel\textregistered{} Core\texttrademark{} i5-4210U processor specifications~\cite{ark4210}}
\label{tbl:spec-4210}
\end{table}

The 4th-generation Intel Core processors provide the FMA\footnote{Fused Multiply-Add} and AVX\footnote{Advanced Vector Extensions} extensions~\cite[5-2 Vol.1]{intel2016}. An FMA unit supports ``[...] 256-bit floating-point instructions to perform computation on
256-bit vectors''~\cite[5-28 Vol.1]{intel2016}. One FMA instruction therefore performs 2 floating-point operations (a multiply and an add) on each of the 4 double-precision values that fit into a 256-bit vector. This results in $2 \times 4 = 8$ DP FLOPs per cycle for each FMA unit.

No authoritative source could be found, but according to \textcite{shimpi2012} the Haswell microarchitecture has 2 FMA units per core, which gives $2 \times 8 = 16$ DP FLOPs per cycle per core. The processor has 2 cores (\prettyref{tbl:spec-4210}), so both cores together reach $16 \times 2 = 32$ DP FLOPs per cycle.

At the maximum turbo frequency of 2.7 GHz, the processor is therefore capable of a theoretical peak performance of $32 \times 2.7 = 86.4$ GFLOP/s.
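
The individual steps of this estimate can be condensed into a single expression. The following is only a sketch, and the symbols are introduced here for illustration: $n_{\mathrm{cores}}$ is the number of cores, $n_{\mathrm{FMA}}$ the number of FMA units per core (2, assuming the figure from \textcite{shimpi2012} is correct), $w = 256/64 = 4$ the number of double-precision values per 256-bit vector, and $f$ the clock frequency in GHz. The factor 2 counts the multiply and the add of one FMA, and one FMA instruction per unit and cycle is assumed.
\[
  P_{\mathrm{peak}} = n_{\mathrm{cores}} \times n_{\mathrm{FMA}} \times w \times 2 \times f
\]
\[
  P_{\mathrm{peak}} = 2 \times 2 \times 4 \times 2 \times 2.7 = 86.4~\mathrm{GFLOP/s}
\]
Under the same assumptions, the base frequency of 1.7 GHz from \prettyref{tbl:spec-4210} gives $32 \times 1.7 = 54.4$ GFLOP/s, which can serve as a more conservative reference when the turbo frequency cannot be sustained.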
\printbibliography

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "../report"
%%% End: