report kernels + 8 kernel bug fixed
This commit is contained in:
parent
b5a5abf815
commit
85d20472b3
26 changed files with 2743 additions and 198 deletions
BIN
reduce/reduce
Executable file
Binary file not shown.
|
@ -1,6 +1,8 @@
|
||||||
According to the definition used, the arithmetic intensity is measured in operations per byte. This might not be adequate for Haswell processors (and later). Due to the fused multiply-add\footnote{although called multiply-add, there are 36 slightly different instructions} extension, two floating point operations can be performed with a single instruction.
|
According to the definition used, the arithmetic intensity is measured in operations per byte. This might not be adequate for Haswell processors (and later). Due to the fused multiply-add\footnote{although called multiply-add, there are 36 slightly different instructions} extension, two floating point operations can be performed with a single instruction.
|
||||||
|
|
||||||
- worse results for 4 threads @ NUMA-STREAM not necessarily expected
|
- worse results for 4 threads @ NUMA-STREAM not necessarily expected
|
||||||
|
- better results for triad possibly due to combined storage in FMA
|
||||||
|
- striding for arrays
|
||||||
|
|
||||||
%%% Local Variables:
|
%%% Local Variables:
|
||||||
%%% mode: latex
|
%%% mode: latex
|
||||||
|
|
|
@ -1,3 +1,115 @@
|
||||||
|
Kernels with operational intensity (OI) of $\rfrac{1}{16}$ and $8$ have been implemented. The kernels are introduced in the following sections.
|
||||||
|
|
||||||
|
However, the effective operational intensity of a given kernel written in a high-level language (such as C) is not obvious once it is compiled to processor instructions. Furthermore, due to today's advanced processor architectures, adaptations had to be made to account for special capabilities. This resulted in several different kernels. Not all of them are machine independent with regard to operational intensity.
|
||||||
|
|
||||||
|
All kernels were compiled with \verb|gcc 5.3.1| and different options. The compilation was checked with \verb|objdump -d -M intel-mnemonics|. For a more elaborate analysis of the disassembly on the tester's computer, please refer to the header file \verb|aikern.h| that should come with this report. Additionally, \verb|Makefile| provides all information about the compiler options used and tested.
|
||||||
|
|
||||||
|
Good results\footnote{all kernels, including the special FMA kernels, access memory only where expected and do everything else in registers} were achieved with \verb|-O2 -mavx -mfma|. However, \verb|-O2 -mavx -mfma| is a trade-off between the best possible results and obviously correct compiled code; in fact, the assembly looks almost handwritten. If even more optimization is wanted, \verb|-O3| can be used. To fully utilize FMA with packed doubles, \verb|-Ofast| has to be used (in gcc, \verb|-Ofast| already enables \verb|-ffast-math|). Be aware that any optimization beyond \verb|-O2 -mavx -mfma| results in a disassembly that is very hard to understand, and that \verb|-ffast-math| can even introduce rounding errors. It is not completely obvious that the highly optimized code still has the intended operational intensity. \verb|-O0| never produces a usable kernel.
|
||||||
|
|
||||||
|
\bigskip
|
||||||
|
\bigskip
|
||||||
|
\begin{footnotesize}
|
||||||
|
\noindent\emph{Remark:} Contrary to popular belief, the roofline model is built on the notion of operational intensity\footnote{FLOPs per byte of DRAM traffic}. The differences from arithmetic intensity are outlined in~\textcite{williams2009}. Depending on the definitions used, the two terms are not necessarily interchangeable. What the following sections call operational intensity might be what some would understand by the term arithmetic intensity.
|
||||||
|
\end{footnotesize}
|
||||||
|
|
||||||
|
\subsection{1/16 $\neq$ 1/16. Or: The Fancy Arithmetics of a Compiler}
|
||||||
|
|
||||||
|
In order to understand why the following kernels are implemented the way they are, an example of a badly behaving $\rfrac{1}{16}$ OI kernel is given in~\prettyref{lst:1-16-simple-dangerous}. The kernel has one FP operation ($*$) and reads 16 bytes (a[i], b[i]) from memory. In practice, however, this kernel does not behave as expected. There are several ways one could write and compile it, each with its own problem:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item Omitting \verb|volatile|. This results in the loop being optimized away completely for optimization levels above \verb|-O0|.
|
||||||
|
\item Using no optimization, i.e., \verb|-O0|. No advanced features of the processor are used (e.g., FMA requires at least \verb|-O2|). Also, nearly everything, even the loop variable, is read from and written to the stack. One might assume that the stack is cached anyway, but that cannot be relied upon.
|
||||||
|
\item Using \verb|volatile| and optimization. When \verb|volatile| is used, gcc reads the variable \verb|tmp| from and writes it to the stack, even with \verb|-O3|. Whether \verb|tmp| is cached is hard to predict. It is not improbable, but relying on that assumption can yield wrong results.
|
||||||
|
\item Using \verb|register|, \verb|volatile| and optimization. Unfortunately, \verb|register| just \emph{advises} the compiler to use a register. It does not force the compiler to do so. Seemingly \verb|volatile| overrules \verb|register| in this case: \verb|tmp| is still read from and written to the stack. Again, assuming any particular caching behaviour is risky at best.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
In the worst case found (no optimization, no \verb|volatile|, no \verb|register|) this results in reads of 16 bytes (a[i], b[i]) plus 8 bytes (i), and writes of 16 bytes (i and the assignment to tmp). Making no caching assumptions, this gives an effective operational intensity of $\rfrac{1}{40}$ for a nominally $\rfrac{1}{16}$ OI kernel. For more complex kernels the results get even worse: a triad \verb|t=a*b+c| will store easy-to-miss intermediate results on the stack if no special care is taken.
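
Written out under the no-caching assumption, with every stack access counted as memory traffic, the worst-case estimate reads as follows ($\mathrm{OI}_{\mathrm{eff}}$ is a label introduced here for the effective operational intensity):
\[
  \mathrm{OI}_{\mathrm{eff}}
  = \frac{1\,\mathrm{FLOP}}{(16 + 8)\,\mathrm{B}\ \mathrm{read} + 16\,\mathrm{B}\ \mathrm{written}}
  = \frac{1\,\mathrm{FLOP}}{40\,\mathrm{B}}
\]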
|
||||||
|
|
||||||
|
To prevent this, one could write assembly directly or rely on compiler intrinsics. The kernels in this report, however, consist only of plain C code that was hand-crafted until an acceptable compilation was reached. The generated machine code was disassembled and manually checked for hidden memory accesses. The results are therefore compiler and machine specific, but should generalize reasonably well for the most part.
|
||||||
|
|
||||||
|
\bigskip
|
||||||
|
\begin{lstlisting}[caption={Simple $\rfrac{1}{16}$ kernel with questionable compiled form}, label=lst:1-16-simple-dangerous]
|
||||||
|
volatile register double tmp = 0.1;
|
||||||
|
for(size_t i=0; i<size; i++)
|
||||||
|
tmp = a[i] * b[i];
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\subsection{The 1/16 OI Kernel}
|
||||||
|
\label{sec:1-16}
|
||||||
|
Two $\rfrac{1}{16}$ kernels have been implemented. The kernel in~\prettyref{lst:1-16-simple} is a standard kernel which does not assume special processor capabilities. The second kernel in~\prettyref{lst:1-16-fma} however is designed to make use of a processor's FMA unit.
|
||||||
|
|
||||||
|
The simple kernel in~\prettyref{lst:1-16-simple} reads 8 bytes (a[i]) once for both operands of $*$ and writes 8 bytes (again to a[i]), i.e., 16 bytes of memory traffic per iteration. Only one FP instruction is executed, namely $*$. At \verb|-O2| the loop variable is held in a register. This results in a $\rfrac{1}{16}$ OI kernel.
|
||||||
|
|
||||||
|
\begin{lstlisting}[caption={Simple $\rfrac{1}{16}$ OI kernel}, label=lst:1-16-simple]
|
||||||
|
(*\textcolor{Orchid}{\#pragma omp parallel for}*)
|
||||||
|
for(size_t i=0; i<size; i++)
|
||||||
|
a[i] = a[i] * a[i];
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The FMA aware kernel in~\prettyref{lst:1-16-fma} is a bit more involved. First, a triad operation is used (the $*$ and $+$ operations have to be balanced), so 2 FP operations are executed per iteration. $3*8=24$ bytes have to be read (a[i], b[i], c[i]) and 8 bytes have to be written (a[i]), 32 bytes in total. This results in a $\rfrac{2}{32} = \rfrac{1}{16}$ OI kernel. The loop variable is again held in a register.
|
||||||
|
|
||||||
|
Be aware that the FMA kernel \emph{cannot} be used on a non-FMA processor. For the FMA aware kernel to work correctly it is important that
|
||||||
|
\begin{enumerate*}[label=(\roman*)]
|
||||||
|
\item the processor has an FMA unit
|
||||||
|
\item the \verb|aikern.c| library is compiled with at least \verb|-O2 -mavx -mfma|
|
||||||
|
\item the compiled binary really makes use of an FMA instruction (such as \verb|vfmadd132sd|~\cite{intelvfmadd132sd} or even \verb|vfmadd132pd|~\cite{intelvfmadd132pd} on the tester's machine)
|
||||||
|
\end{enumerate*}.
|
||||||
|
Otherwise the results are meaningless due to write-outs of intermediary values.
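
Condition (ii) can be guarded at compile time. The following is only a minimal sketch and not taken from \verb|aikern.c|: it assumes gcc or clang, which define the macro \verb|__FMA__| when \verb|-mfma| (or a matching \verb|-march|) is in effect, and the function name \verb|aikern_built_with_fma| is hypothetical. It says nothing about conditions (i) and (iii); the disassembly still has to be inspected for those.

```c
/* Sketch: report whether this translation unit was compiled with FMA code
 * generation enabled. gcc and clang define __FMA__ when -mfma (or a -march
 * value that includes FMA) is in effect. The function name is hypothetical. */
int aikern_built_with_fma(void)
{
#ifdef __FMA__
    return 1;   /* the compiler may emit vfmadd* instructions */
#else
    return 0;   /* no FMA code generation: FMA kernel results are meaningless */
#endif
}
```

A benchmark driver could call this function once at start-up and refuse to run the FMA aware kernels when it returns 0.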
|
||||||
|
|
||||||
|
Also note that in order to use the full capabilities of Intel's FMA the doubles must be packed. This happens if \verb|-Ofast| is additionally given to \verb|gcc|. However, this also triggers other optimizations, so that the assembly gets long and complex. It is not immediately obvious that the generated assembly is correct. Still, apart from loading and storing array data, no instruction could be found that touches anything other than registers, which is exactly what is intended.
|
||||||
|
|
||||||
|
\begin{lstlisting}[caption={FMA aware $\rfrac{1}{16}$ OI kernel}, label=lst:1-16-fma]
|
||||||
|
(*\textcolor{Orchid}{\#pragma omp parallel for}*)
|
||||||
|
for(size_t i=0; i<size; i++)
|
||||||
|
a[i] = a[i] * b[i] + c[i];
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\subsection{The 8 OI Kernel}
|
||||||
|
In this section the implemented 8 OI kernels are shown. \prettyref{lst:8-1-simple} is a simple 8 OI kernel which should work on any processor. The kernel in~\prettyref{lst:8-1-fma} is tailored for processors with an FMA unit. The kernels use macros to repeat the floating-point instructions, which in effect amounts to a massive loop unrolling. Some of the repeat macros are shown in~\prettyref{lst:8-1-macros}.
|
||||||
|
|
||||||
|
\bigskip
|
||||||
|
\begin{lstlisting}[caption={Macros for bulk repeating instructions}, label=lst:8-1-macros]
|
||||||
|
#define REP0(X)
|
||||||
|
#define REP1(X) X
|
||||||
|
#define REP2(X) REP1(X) REP1(X)
|
||||||
|
#define REP3(X) REP2(X) REP1(X)
|
||||||
|
//[...]
|
||||||
|
#define REP9(X) REP8(X) REP1(X)
|
||||||
|
#define REP10(X) REP9(X) REP1(X)
|
||||||
|
#define REP20(X) REP10(X) REP10(X)
|
||||||
|
//[...]
|
||||||
|
#define REP100(X) REP50(X) REP50(X)
|
||||||
|
\end{lstlisting}
|
||||||
|
\bigskip
|
||||||
|
|
||||||
|
The simple kernel in~\prettyref{lst:8-1-simple} reads 8 bytes (a[i]) and writes 8 bytes (a[i]) while performing 128 FLOPs in total. Therefore this represents a $\rfrac{128}{16}=\rfrac{8}{1}$ OI kernel.
|
||||||
|
|
||||||
|
\bigskip
|
||||||
|
\begin{lstlisting}[caption={Simple $8$ OI kernel}, label=lst:8-1-simple]
|
||||||
|
(*\textcolor{Orchid}{\#pragma omp parallel for}*)
|
||||||
|
for(size_t i=0; i<size; i++){
|
||||||
|
a[i] = REP100(a[i]*)
|
||||||
|
REP20(a[i]*)
|
||||||
|
REP8(a[i]*)
|
||||||
|
REP1(a[i]);
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
\bigskip
|
||||||
|
|
||||||
|
For the most part, what was said in~\prettyref{sec:1-16} holds true for the 8 OI FMA aware kernel too. Please refer to~\prettyref{sec:1-16} for more detailed information about the rationale. Compiling this with \verb|-O2 -mavx -mfma| yields an obviously correct result. However, if one wants to make use of packed doubles, \verb|-Ofast| has to be used, which optimizes the code further so that the disassembly is hard to grasp. Still, it seems that with \verb|-Ofast| at least no spurious reads or writes are introduced.
|
||||||
|
|
||||||
|
The FMA aware kernel in~\prettyref{lst:8-1-fma} reads 8 bytes (a[i]) and writes 8 bytes (a[i], but only once per iteration), totalling 16 bytes. Keep in mind that intermediate values of a[i] are not written back but (at least with \verb|-O2| or higher) held in a register. There is only one \verb|vmovsd| instruction per iteration that writes the value back. The kernel executes $64*2 = 128$ FLOPs. Therefore this is a $\rfrac{128}{16} = \rfrac{8}{1}$ OI kernel.
|
||||||
|
|
||||||
|
\bigskip
|
||||||
|
\begin{lstlisting}[caption={FMA aware $8$ OI kernel}, label=lst:8-1-fma]
|
||||||
|
(*\textcolor{Orchid}{\#pragma omp parallel for}*)
|
||||||
|
for(size_t i=0; i<size; i++){
|
||||||
|
REP60(a[i] = a[i] * a[i] + a[i];)
|
||||||
|
REP4(a[i] = a[i] * a[i] + a[i];)
|
||||||
|
}
|
||||||
|
\end{lstlisting}
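
The FLOP counts claimed for both 8 OI kernels can be cross-checked by expanding the repeat macros into counters instead of arithmetic. The following is a sketch under assumptions: the intermediate macros \verb|REP4|, \verb|REP50| and \verb|REP60| are elided in the report and are defined here in one plausible way, and the three function names are hypothetical.

```c
/* Repeat macros in the style of the report's listing. The intermediate
 * definitions (REP4, REP50, REP60) are assumptions made for this sketch;
 * the report elides them. */
#define REP1(X)   X
#define REP2(X)   REP1(X) REP1(X)
#define REP4(X)   REP2(X) REP2(X)
#define REP8(X)   REP4(X) REP4(X)
#define REP10(X)  REP8(X) REP2(X)
#define REP20(X)  REP10(X) REP10(X)
#define REP50(X)  REP20(X) REP20(X) REP10(X)
#define REP60(X)  REP50(X) REP10(X)
#define REP100(X) REP50(X) REP50(X)

/* Simple 8 OI kernel: one '*' per repeated "a[i]*" fragment. */
int simple_kernel_flops(void)
{
    int flops = 0;
    REP100(flops += 1;) REP20(flops += 1;) REP8(flops += 1;)
    return flops;
}

/* FMA aware 8 OI kernel: each repeated statement is one '*' and one '+'. */
int fma_kernel_flops(void)
{
    int flops = 0;
    REP60(flops += 2;) REP4(flops += 2;)
    return flops;
}

/* Operational intensity in FLOPs per byte for a given memory traffic. */
double operational_intensity(int flops, int bytes)
{
    return (double)flops / (double)bytes;
}
```

Both counting functions return 128, so with 16 bytes of traffic per iteration \verb|operational_intensity| yields 8 for either kernel.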
|
||||||
|
\bigskip
|
||||||
|
|
||||||
|
|
||||||
%%% Local Variables:
|
%%% Local Variables:
|
||||||
%%% mode: latex
|
%%% mode: latex
|
||||||
|
|
|
@ -26,7 +26,7 @@ According to~\textcite[5-2 Vol.1]{intel2016} the 4th generation Intel Core proce
|
||||||
|
|
||||||
In general an FMA unit is capable of multiple floating-point (FP) operations during a single cycle. This is directly backed by the hardware (operations are \emph{``fused''} together). Specifically the FMA unit of a Haswell processor is capable of ``[...] 256-bit floating-point instructions to perform computation on 256-bit vectors''~\cite[5-28 Vol.1]{intel2016}.
|
In general an FMA unit is capable of multiple floating-point (FP) operations during a single cycle. This is directly backed by the hardware (operations are \emph{``fused''} together). Specifically the FMA unit of a Haswell processor is capable of ``[...] 256-bit floating-point instructions to perform computation on 256-bit vectors''~\cite[5-28 Vol.1]{intel2016}.
|
||||||
|
|
||||||
Since even a DP (double-precision) FP element has only 64 bits, 256 bits would obviously be overprovisioned. But the FMA instructions do not just take scalars as arguments. Instead up to 4 DP FP elements can be packed together in a vector and operations are conducted pairwise. An example multiply-add instruction is given in \cite{intel2}.
|
Since even a DP (double-precision) FP element has only 64 bits, 256 bits would obviously be overprovisioned. But the FMA instructions do not just take scalars as arguments. Instead up to 4 DP FP elements can be packed together in a vector and operations are conducted pairwise. An example multiply-add instruction is given in \cite{intelvfmadd132pd}.
|
||||||
|
|
||||||
Unfortunately no definite source could be found but according to \textcite{shimpi2012} the Haswell architecture is built with 2 FMA units per core. Taking all together we get:
|
Unfortunately no definite source could be found but according to \textcite{shimpi2012} the Haswell architecture is built with 2 FMA units per core. Taking all together we get:
|
||||||
|
|
||||||
|
|
|
@ -29,7 +29,7 @@
|
||||||
\abx@aux@cite{berstrom}
|
\abx@aux@cite{berstrom}
|
||||||
\abx@aux@cite{ark4210}
|
\abx@aux@cite{ark4210}
|
||||||
\abx@aux@cite{intel2016}
|
\abx@aux@cite{intel2016}
|
||||||
\abx@aux@cite{intel2}
|
\abx@aux@cite{intelvfmadd132pd}
|
||||||
\abx@aux@cite{shimpi2012}
|
\abx@aux@cite{shimpi2012}
|
||||||
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{\numberline {1}Introduction}{2}{section.1}}
|
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{\numberline {1}Introduction}{2}{section.1}}
|
||||||
\newlabel{sec:introduction}{{1}{2}{Introduction}{section.1}{}}
|
\newlabel{sec:introduction}{{1}{2}{Introduction}{section.1}{}}
|
||||||
|
@ -51,6 +51,23 @@
|
||||||
\newlabel{fig:roofline}{{1}{4}{Roofline graph from the values obtained in~\prettyref {sec:peak} and~\prettyref {sec:memory}\relax }{figure.caption.3}{}}
|
\newlabel{fig:roofline}{{1}{4}{Roofline graph from the values obtained in~\prettyref {sec:peak} and~\prettyref {sec:memory}\relax }{figure.caption.3}{}}
|
||||||
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{\numberline {3}Kernels}{4}{section.3}}
|
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{\numberline {3}Kernels}{4}{section.3}}
|
||||||
\newlabel{sec:kernels}{{3}{4}{Kernels}{section.3}{}}
|
\newlabel{sec:kernels}{{3}{4}{Kernels}{section.3}{}}
|
||||||
\newlabel{LastPage}{{}{4}{}{page.4}{}}
|
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {3.1}1/16 $\not =$ 1/16. Or: The Fancy Arithmetics of a Compiler}{5}{subsection.3.1}}
|
||||||
\xdef\lastpage@lastpage{4}
|
\newlabel{lst:1-16-simple-dangerous}{{2}{5}{Simple $\rfrac {1}{16}$ kernel with questionable compiled form}{lstlisting.2}{}}
|
||||||
\xdef\lastpage@lastpageHy{4}
|
\@writefile{lol}{\defcounter {refsection}{0}\relax }\@writefile{lol}{\contentsline {lstlisting}{\numberline {2}Simple ${}^{1}\tmspace -\thinmuskip {.1667em}/_{16}$ kernel with questionable compiled form}{5}{lstlisting.2}}
|
||||||
|
\abx@aux@cite{intelvfmadd132sd}
|
||||||
|
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {3.2}The 1/16 OI Kernel}{6}{subsection.3.2}}
|
||||||
|
\newlabel{sec:1-16}{{3.2}{6}{The 1/16 OI Kernel}{subsection.3.2}{}}
|
||||||
|
\newlabel{lst:1-16-simple}{{3}{6}{Simple $\rfrac {1}{16}$ OI kernel}{lstlisting.3}{}}
|
||||||
|
\@writefile{lol}{\defcounter {refsection}{0}\relax }\@writefile{lol}{\contentsline {lstlisting}{\numberline {3}Simple ${}^{1}\tmspace -\thinmuskip {.1667em}/_{16}$ OI kernel}{6}{lstlisting.3}}
|
||||||
|
\newlabel{lst:1-16-fma}{{4}{6}{FMA aware $\rfrac {1}{16}$ OI kernel}{lstlisting.4}{}}
|
||||||
|
\@writefile{lol}{\defcounter {refsection}{0}\relax }\@writefile{lol}{\contentsline {lstlisting}{\numberline {4}FMA aware ${}^{1}\tmspace -\thinmuskip {.1667em}/_{16}$ OI kernel}{6}{lstlisting.4}}
|
||||||
|
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {3.3}The 8 OI Kernel}{6}{subsection.3.3}}
|
||||||
|
\newlabel{lst:8-1-macros}{{5}{6}{Macros for bulk repeating instructions}{lstlisting.5}{}}
|
||||||
|
\@writefile{lol}{\defcounter {refsection}{0}\relax }\@writefile{lol}{\contentsline {lstlisting}{\numberline {5}Macros for bulk repeating instructions}{6}{lstlisting.5}}
|
||||||
|
\newlabel{lst:8-1-simple}{{6}{7}{Simple $8$ OI kernel}{lstlisting.6}{}}
|
||||||
|
\@writefile{lol}{\defcounter {refsection}{0}\relax }\@writefile{lol}{\contentsline {lstlisting}{\numberline {6}Simple $8$ OI kernel}{7}{lstlisting.6}}
|
||||||
|
\newlabel{lst:8-1-fma}{{7}{7}{FMA aware $8$ OI kernel}{lstlisting.7}{}}
|
||||||
|
\@writefile{lol}{\defcounter {refsection}{0}\relax }\@writefile{lol}{\contentsline {lstlisting}{\numberline {7}FMA aware $8$ OI kernel}{7}{lstlisting.7}}
|
||||||
|
\newlabel{LastPage}{{}{8}{}{page.8}{}}
|
||||||
|
\xdef\lastpage@lastpage{8}
|
||||||
|
\xdef\lastpage@lastpageHy{8}
|
||||||
|
|
|
@ -79,7 +79,7 @@
|
||||||
\verb https://www-ssl.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf
|
\verb https://www-ssl.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf
|
||||||
\endverb
|
\endverb
|
||||||
\endentry
|
\endentry
|
||||||
\entry{intel2}{online}{}
|
\entry{intelvfmadd132pd}{online}{}
|
||||||
\name{labelname}{1}{}{%
|
\name{labelname}{1}{}{%
|
||||||
{{hash=ff97a9fdede09eaf6e1c8ec9f6a61dd5}{{Intel}}{I\bibinitperiod}{}{}{}{}{}{}}%
|
{{hash=ff97a9fdede09eaf6e1c8ec9f6a61dd5}{{Intel}}{I\bibinitperiod}{}{}{}{}{}{}}%
|
||||||
}
|
}
|
||||||
|
@ -89,13 +89,32 @@
|
||||||
\strng{namehash}{ff97a9fdede09eaf6e1c8ec9f6a61dd5}
|
\strng{namehash}{ff97a9fdede09eaf6e1c8ec9f6a61dd5}
|
||||||
\strng{fullhash}{ff97a9fdede09eaf6e1c8ec9f6a61dd5}
|
\strng{fullhash}{ff97a9fdede09eaf6e1c8ec9f6a61dd5}
|
||||||
\field{sortinit}{I}
|
\field{sortinit}{I}
|
||||||
\field{labeltitle}{Intel Intrinsics Guide}
|
\field{labeltitle}{Intel Intrinsics Guide: vfmadd132pd}
|
||||||
\field{title}{Intel Intrinsics Guide}
|
\field{title}{Intel Intrinsics Guide: vfmadd132pd}
|
||||||
\field{urlday}{19}
|
\field{urlday}{19}
|
||||||
\field{urlmonth}{06}
|
\field{urlmonth}{06}
|
||||||
\field{urlyear}{2016}
|
\field{urlyear}{2016}
|
||||||
\verb{url}
|
\verb{url}
|
||||||
\verb https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=AVX2,FMA&text=madd&expand=2365
|
\verb https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=AVX2,FMA&text=vfmadd132pd&expand=2365
|
||||||
|
\endverb
|
||||||
|
\endentry
|
||||||
|
\entry{intelvfmadd132sd}{online}{}
|
||||||
|
\name{labelname}{1}{}{%
|
||||||
|
{{hash=ff97a9fdede09eaf6e1c8ec9f6a61dd5}{{Intel}}{I\bibinitperiod}{}{}{}{}{}{}}%
|
||||||
|
}
|
||||||
|
\name{author}{1}{}{%
|
||||||
|
{{hash=ff97a9fdede09eaf6e1c8ec9f6a61dd5}{{Intel}}{I\bibinitperiod}{}{}{}{}{}{}}%
|
||||||
|
}
|
||||||
|
\strng{namehash}{ff97a9fdede09eaf6e1c8ec9f6a61dd5}
|
||||||
|
\strng{fullhash}{ff97a9fdede09eaf6e1c8ec9f6a61dd5}
|
||||||
|
\field{sortinit}{I}
|
||||||
|
\field{labeltitle}{Intel Intrinsics Guide: vfmadd132sd}
|
||||||
|
\field{title}{Intel Intrinsics Guide: vfmadd132sd}
|
||||||
|
\field{urlday}{19}
|
||||||
|
\field{urlmonth}{06}
|
||||||
|
\field{urlyear}{2016}
|
||||||
|
\verb{url}
|
||||||
|
\verb https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=AVX2,FMA&text=vfmadd132sd&expand=2365,2403
|
||||||
\endverb
|
\endverb
|
||||||
\endentry
|
\endentry
|
||||||
\entry{ark4210}{online}{}
|
\entry{ark4210}{online}{}
|
||||||
|
|
|
@ -1971,12 +1971,16 @@
|
||||||
<bcf:citekey order="5">intel2016</bcf:citekey>
|
<bcf:citekey order="5">intel2016</bcf:citekey>
|
||||||
<bcf:citekey order="6">intel2016</bcf:citekey>
|
<bcf:citekey order="6">intel2016</bcf:citekey>
|
||||||
<bcf:citekey order="7">intel2016</bcf:citekey>
|
<bcf:citekey order="7">intel2016</bcf:citekey>
|
||||||
<bcf:citekey order="8">intel2</bcf:citekey>
|
<bcf:citekey order="8">intelvfmadd132pd</bcf:citekey>
|
||||||
<bcf:citekey order="9">shimpi2012</bcf:citekey>
|
<bcf:citekey order="9">shimpi2012</bcf:citekey>
|
||||||
<bcf:citekey order="10">shimpi2012</bcf:citekey>
|
<bcf:citekey order="10">shimpi2012</bcf:citekey>
|
||||||
<bcf:citekey order="11">berstrom</bcf:citekey>
|
<bcf:citekey order="11">berstrom</bcf:citekey>
|
||||||
<bcf:citekey order="12">bergstrom2</bcf:citekey>
|
<bcf:citekey order="12">bergstrom2</bcf:citekey>
|
||||||
<bcf:citekey order="13">williams2009</bcf:citekey>
|
<bcf:citekey order="13">williams2009</bcf:citekey>
|
||||||
|
<bcf:citekey order="14">williams2009</bcf:citekey>
|
||||||
|
<bcf:citekey order="15">williams2009</bcf:citekey>
|
||||||
|
<bcf:citekey order="16">intelvfmadd132sd</bcf:citekey>
|
||||||
|
<bcf:citekey order="17">intelvfmadd132pd</bcf:citekey>
|
||||||
</bcf:section>
|
</bcf:section>
|
||||||
<bcf:sortlist section="0" type="entry" label="nty">
|
<bcf:sortlist section="0" type="entry" label="nty">
|
||||||
<bcf:sorting>
|
<bcf:sorting>
|
||||||
|
|
|
@ -1,14 +1,14 @@
|
||||||
[0] Config.pm:318> INFO - This is Biber 1.8
|
[0] Config.pm:318> INFO - This is Biber 1.8
|
||||||
[0] Config.pm:321> INFO - Logfile is 'report.blg'
|
[0] Config.pm:321> INFO - Logfile is 'report.blg'
|
||||||
[71] biber:275> INFO - === Thu Jun 23, 2016, 02:25:38
|
[53] biber:275> INFO - === Thu Jun 23, 2016, 19:53:59
|
||||||
[71] Biber.pm:333> INFO - Reading 'report.bcf'
|
[53] Biber.pm:333> INFO - Reading 'report.bcf'
|
||||||
[152] Biber.pm:630> INFO - Found 7 citekeys in bib section 0
|
[116] Biber.pm:630> INFO - Found 8 citekeys in bib section 0
|
||||||
[164] Biber.pm:3053> INFO - Processing section 0
|
[127] Biber.pm:3053> INFO - Processing section 0
|
||||||
[187] Biber.pm:3190> INFO - Looking for bibtex format file 'roofline.bib' for section 0
|
[142] Biber.pm:3190> INFO - Looking for bibtex format file 'roofline.bib' for section 0
|
||||||
[188] bibtex.pm:937> INFO - Decoding LaTeX character macros into UTF-8
|
[144] bibtex.pm:937> INFO - Decoding LaTeX character macros into UTF-8
|
||||||
[189] bibtex.pm:812> INFO - Found BibTeX data source 'roofline.bib'
|
[144] bibtex.pm:812> INFO - Found BibTeX data source 'roofline.bib'
|
||||||
[236] Biber.pm:2939> INFO - Overriding locale 'en_US.UTF-8' default tailoring 'variable = shifted' with 'variable = non-ignorable'
|
[179] Biber.pm:2939> INFO - Overriding locale 'en_US.UTF-8' default tailoring 'variable = shifted' with 'variable = non-ignorable'
|
||||||
[236] Biber.pm:2945> INFO - Sorting 'entry' list 'nty' keys
|
[179] Biber.pm:2945> INFO - Sorting 'entry' list 'nty' keys
|
||||||
[236] Biber.pm:2949> INFO - No sort tailoring available for locale 'en_US.UTF-8'
|
[179] Biber.pm:2949> INFO - No sort tailoring available for locale 'en_US.UTF-8'
|
||||||
[252] bbl.pm:482> INFO - Writing 'report.bbl' with encoding 'UTF-8'
|
[197] bbl.pm:482> INFO - Writing 'report.bbl' with encoding 'UTF-8'
|
||||||
[254] bbl.pm:555> INFO - Output to report.bbl
|
[198] bbl.pm:555> INFO - Output to report.bbl
|
||||||
|
|
|
@ -1,11 +1,11 @@
|
||||||
# Fdb version 3
|
# Fdb version 3
|
||||||
["biber report"] 1466641538 "report.bcf" "report.bbl" "report" 1466642333
|
["biber report"] 1466704438 "report.bcf" "report.bbl" "report" 1466709195
|
||||||
"report.bcf" 1466642333 92144 b16bb4d23ff7f0d4a3e0ee2f3a7b2c36 ""
|
"report.bcf" 1466709195 92382 2683b542d57d2326e3b37a6a44222b52 ""
|
||||||
"roofline.bib" 1466632630 3723 5c74ca6da23b4936d86117884f95cb33 ""
|
"roofline.bib" 1466704433 4157 226e47c750579a202f66b6f0e4df67bb ""
|
||||||
(generated)
|
(generated)
|
||||||
"report.bbl"
|
"report.bbl"
|
||||||
"report.blg"
|
"report.blg"
|
||||||
["pdflatex"] 1466642332 "report.tex" "report.pdf" "report" 1466642333
|
["pdflatex"] 1466709193 "report.tex" "report.pdf" "report" 1466709195
|
||||||
"/usr/share/texlive/texmf-dist/fonts/enc/dvips/cm-super/cm-super-t1.enc" 1136849721 2971 def0b6c1f0b107b3b936def894055589 ""
|
"/usr/share/texlive/texmf-dist/fonts/enc/dvips/cm-super/cm-super-t1.enc" 1136849721 2971 def0b6c1f0b107b3b936def894055589 ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/enc/dvips/cm-super/cm-super-ts1.enc" 1136849721 2900 1537cc8184ad1792082cd229ecc269f4 ""
|
"/usr/share/texlive/texmf-dist/fonts/enc/dvips/cm-super/cm-super-ts1.enc" 1136849721 2900 1537cc8184ad1792082cd229ecc269f4 ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/map/fontname/texfonts.map" 1272929888 3287 e6b82fe08f5336d4d5ebc73fb1152e87 ""
|
"/usr/share/texlive/texmf-dist/fonts/map/fontname/texfonts.map" 1272929888 3287 e6b82fe08f5336d4d5ebc73fb1152e87 ""
|
||||||
|
@ -20,6 +20,7 @@
|
||||||
"/usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/ecrm0900.tfm" 1136768653 3584 d3d8ac8b25ca19c0a40b86a5db1e8ccc ""
|
"/usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/ecrm0900.tfm" 1136768653 3584 d3d8ac8b25ca19c0a40b86a5db1e8ccc ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/ecrm1095.tfm" 1136768653 3584 929cdff2b7a8c11bd4d49fd68cb0ae70 ""
|
"/usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/ecrm1095.tfm" 1136768653 3584 929cdff2b7a8c11bd4d49fd68cb0ae70 ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/ecrm1440.tfm" 1136768653 3584 3169d30142b88a27d4ab0e3468e963a2 ""
|
"/usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/ecrm1440.tfm" 1136768653 3584 3169d30142b88a27d4ab0e3468e963a2 ""
|
||||||
|
"/usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/ecti0900.tfm" 1136768653 3072 a603fa6d934ebc72197ed1c389943d86 ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/ecti1095.tfm" 1136768653 3072 b73d2778cc3af44970de4de5e032d7f6 ""
|
"/usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/ecti1095.tfm" 1136768653 3072 b73d2778cc3af44970de4de5e032d7f6 ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/ectt1095.tfm" 1136768653 1536 a988bfe554c1f79514bd46d13c3c64ce ""
|
"/usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/ectt1095.tfm" 1136768653 1536 a988bfe554c1f79514bd46d13c3c64ce ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/tcrm1095.tfm" 1136768653 1536 02c06700a42be0f5a28664c7273f82e7 ""
|
"/usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/tcrm1095.tfm" 1136768653 1536 02c06700a42be0f5a28664c7273f82e7 ""
|
||||||
|
@ -56,6 +57,7 @@
|
||||||
"/usr/share/texlive/texmf-dist/fonts/tfm/public/stmaryrd/stmary9.tfm" 1302307949 848 594c171945930dfc7cc52fb30457c803 ""
|
"/usr/share/texlive/texmf-dist/fonts/tfm/public/stmaryrd/stmary9.tfm" 1302307949 848 594c171945930dfc7cc52fb30457c803 ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmmi10.pfb" 1248133631 36299 5f9df58c2139e7edcf37c8fca4bd384d ""
|
"/usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmmi10.pfb" 1248133631 36299 5f9df58c2139e7edcf37c8fca4bd384d ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr10.pfb" 1248133631 35752 024fb6c41858982481f6968b5fc26508 ""
|
"/usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr10.pfb" 1248133631 35752 024fb6c41858982481f6968b5fc26508 ""
|
||||||
|
"/usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr12.pfb" 1248133631 32722 d7379af29a190c3f453aba36302ff5a9 ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr8.pfb" 1248133631 32726 0a1aea6fcd6468ee2cf64d891f5c43c8 ""
|
"/usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr8.pfb" 1248133631 32726 0a1aea6fcd6468ee2cf64d891f5c43c8 ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmsy10.pfb" 1248133631 32569 5e5ddc8df908dea60932f3c484a54c0d ""
|
"/usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmsy10.pfb" 1248133631 32569 5e5ddc8df908dea60932f3c484a54c0d ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfbx1095.pfb" 1215737283 154600 ea54091d31de803b613ba9e80ca51709 ""
|
"/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfbx1095.pfb" 1215737283 154600 ea54091d31de803b613ba9e80ca51709 ""
|
||||||
|
@ -68,6 +70,7 @@
|
||||||
"/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm0900.pfb" 1215737283 149037 995a6f1e12c1d647b99b1cf55db78699 ""
|
"/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm0900.pfb" 1215737283 149037 995a6f1e12c1d647b99b1cf55db78699 ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm1095.pfb" 1215737283 145929 f25e56369a345c4ff583b067cd87ce8e ""
|
"/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm1095.pfb" 1215737283 145929 f25e56369a345c4ff583b067cd87ce8e ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm1440.pfb" 1215737283 131078 d96015a2fa5c350129e933ca070b2484 ""
|
"/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm1440.pfb" 1215737283 131078 d96015a2fa5c350129e933ca070b2484 ""
|
||||||
|
"/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfti0900.pfb" 1215737283 183673 6df73819bb3e1246a6315a4913a2d331 ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfti1095.pfb" 1215737283 196446 8fbbe4b97b83e5182def6d29a44e57fb ""
|
"/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfti1095.pfb" 1215737283 196446 8fbbe4b97b83e5182def6d29a44e57fb ""
|
||||||
"/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sftt1095.pfb" 1215737283 169670 48d12e69c9a3b23c81f6d703ccbd4554 ""
|
"/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sftt1095.pfb" 1215737283 169670 48d12e69c9a3b23c81f6d703ccbd4554 ""
|
||||||
"/usr/share/texlive/texmf-dist/tex/context/base/supp-pdf.mkii" 1337017135 71627 94eb9990bed73c364d7f53f960cc8c5b ""
|
"/usr/share/texlive/texmf-dist/tex/context/base/supp-pdf.mkii" 1337017135 71627 94eb9990bed73c364d7f53f960cc8c5b ""
|
||||||
|
@ -158,8 +161,6 @@
|
||||||
"/usr/share/texlive/texmf-dist/tex/latex/listings/listings.cfg" 1394061314 1828 1429ae58d32ff215bffb2acf697ae41a ""
|
"/usr/share/texlive/texmf-dist/tex/latex/listings/listings.cfg" 1394061314 1828 1429ae58d32ff215bffb2acf697ae41a ""
|
||||||
"/usr/share/texlive/texmf-dist/tex/latex/listings/listings.sty" 1394061314 80361 048fe35275a1096660ea67eecd2213f4 ""
|
"/usr/share/texlive/texmf-dist/tex/latex/listings/listings.sty" 1394061314 80361 048fe35275a1096660ea67eecd2213f4 ""
|
||||||
"/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty" 1394061314 93168 df9863fadbf023e458067a158925eff9 ""
|
"/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty" 1394061314 93168 df9863fadbf023e458067a158925eff9 ""
|
||||||
"/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang2.sty" 1394061314 89980 e97cebbc4f0eae4011a8bea389a05d0a ""
|
|
||||||
"/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang3.sty" 1394061314 86841 4fa558f6bbd8f3d49e175c0dd27ff41a ""
|
|
||||||
"/usr/share/texlive/texmf-dist/tex/latex/listings/lstmisc.sty" 1394061314 77029 dfe676ac1c76cfa220c8107472a1da27 ""
|
"/usr/share/texlive/texmf-dist/tex/latex/listings/lstmisc.sty" 1394061314 77029 dfe676ac1c76cfa220c8107472a1da27 ""
|
||||||
"/usr/share/texlive/texmf-dist/tex/latex/logreq/logreq.def" 1284153563 1620 fb1c32b818f2058eca187e5c41dfae77 ""
|
"/usr/share/texlive/texmf-dist/tex/latex/logreq/logreq.def" 1284153563 1620 fb1c32b818f2058eca187e5c41dfae77 ""
|
||||||
"/usr/share/texlive/texmf-dist/tex/latex/logreq/logreq.sty" 1284153563 6187 b27afc771af565d3a9ff1ca7d16d0d46 ""
|
"/usr/share/texlive/texmf-dist/tex/latex/logreq/logreq.sty" 1284153563 6187 b27afc771af565d3a9ff1ca7d16d0d46 ""
|
||||||
|
@ -195,21 +196,22 @@
|
||||||
"/usr/share/texlive/texmf-dist/web2c/texmf.cnf" 1455657841 31706 2be2b4306fae7fc20493e3b90c2ad04d ""
|
"/usr/share/texlive/texmf-dist/web2c/texmf.cnf" 1455657841 31706 2be2b4306fae7fc20493e3b90c2ad04d ""
|
||||||
"/usr/share/texlive/texmf-var/web2c/pdftex/pdflatex.fmt" 1457104667 3492982 6abaa3262ef9227a797168d32888676c ""
|
"/usr/share/texlive/texmf-var/web2c/pdftex/pdflatex.fmt" 1457104667 3492982 6abaa3262ef9227a797168d32888676c ""
|
||||||
"inputs/introduction.tex" 1466184626 76 eaf0f76fa74815989416f6f6d1c36f8b ""
|
"inputs/introduction.tex" 1466184626 76 eaf0f76fa74815989416f6f6d1c36f8b ""
|
||||||
"inputs/kernels.tex" 1466184646 75 4edfbf753fb138c9886dd119053949bf ""
|
"inputs/kernels.tex" 1466709193 10203 9325fb415b03bebae73e25df31a02d19 ""
|
||||||
"inputs/roofline.tex" 1466642331 5522 4541d608767a130965ef6af1061bff79 ""
|
"inputs/roofline.tex" 1466704311 5532 18ef3b0c3e19883f9e4d68e1bd73b31b ""
|
||||||
"report.aux" 1466642333 3974 fbce129a17c9c0f39751b7114db01f4a ""
|
"report.aux" 1466709195 6200 ef3b9dffee45c82bc6071014e35f58b8 ""
|
||||||
"report.bbl" 1466641538 6814 69377a156548dd41d6fce56d0861beda "biber report"
|
"report.bbl" 1466704439 7655 4b5f697a70789470cde9f922b6440ee7 "biber report"
|
||||||
"report.out" 1466642333 334 a1cec9b42f1ecf30af112fc058dd7354 ""
|
"report.out" 1466709195 566 365a3bdfdb786abd7e70ca003f732afb ""
|
||||||
"report.run.xml" 1466642333 2317 80d7743117fafc51b1e42b536d793f68 ""
|
"report.run.xml" 1466709195 2317 80d7743117fafc51b1e42b536d793f68 ""
|
||||||
"report.tex" 1466626391 4716 59f1e8b52a6969670880343126dbe52a ""
|
"report.tex" 1466708164 4496 4af727a449506efbfccac9df327ec9fe ""
|
||||||
"report.toc" 1466642333 818 cfda5e6b9084ed337791b495536ef0b7 ""
|
"report.toc" 1466709195 1210 9050233c7a77a885db53f60f534c1c7a ""
|
||||||
"res/rooftop.png" 1466641296 38798 e83f8157e0a63985f174d5bf3128cc98 ""
|
"res/rooftop-eps-converted-to.pdf" 1466670002 22114 f6f2c1d53d8b6a5f4042e202648c7b36 ""
|
||||||
|
"res/rooftop.eps" 1466669975 36013 2a6358f72820d80a6e87ee15e92d5669 ""
|
||||||
(generated)
|
(generated)
|
||||||
"report-blx.bib"
|
|
||||||
"report.out"
|
"report.out"
|
||||||
"report.toc"
|
"report.toc"
|
||||||
"report.log"
|
|
||||||
"report.run.xml"
|
|
||||||
"report.bcf"
|
"report.bcf"
|
||||||
"report.pdf"
|
"report.run.xml"
|
||||||
|
"report-blx.bib"
|
||||||
|
"report.log"
|
||||||
"report.aux"
|
"report.aux"
|
||||||
|
"report.pdf"
|
||||||
|
|
|
@ -218,22 +218,6 @@ INPUT /usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.dfu
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.dfu
|
INPUT /usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.dfu
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang2.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang2.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang3.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang3.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang2.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang2.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang3.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang3.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang2.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang2.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang3.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang3.sty
|
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstmisc.sty
|
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstmisc.sty
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstmisc.sty
|
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstmisc.sty
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/biblatex/lbx/english.lbx
|
INPUT /usr/share/texlive/texmf-dist/tex/latex/biblatex/lbx/english.lbx
|
||||||
|
@ -345,15 +329,25 @@ INPUT /usr/share/texlive/texmf-dist/fonts/tfm/public/amsfonts/symbols/msbm10.tfm
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/public/amsfonts/symbols/msbm5.tfm
|
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/public/amsfonts/symbols/msbm5.tfm
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/public/stmaryrd/stmary9.tfm
|
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/public/stmaryrd/stmary9.tfm
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/public/stmaryrd/stmary5.tfm
|
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/public/stmaryrd/stmary5.tfm
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
INPUT res/rooftop.eps
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
INPUT ./res/rooftop.eps
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstmisc.sty
|
INPUT ./res/rooftop.eps
|
||||||
INPUT /usr/share/texlive/texmf-dist/tex/latex/listings/lstmisc.sty
|
INPUT ./res/rooftop-eps-converted-to.pdf
|
||||||
INPUT res/rooftop.png
|
INPUT ./res/rooftop-eps-converted-to.pdf
|
||||||
INPUT ./res/rooftop.png
|
INPUT ./res/rooftop.eps
|
||||||
INPUT ./res/rooftop.png
|
INPUT ./res/rooftop-eps-converted-to.pdf
|
||||||
|
INPUT ./res/rooftop-eps-converted-to.pdf
|
||||||
|
INPUT ./res/rooftop-eps-converted-to.pdf
|
||||||
INPUT inputs/kernels.tex
|
INPUT inputs/kernels.tex
|
||||||
INPUT inputs/kernels.tex
|
INPUT inputs/kernels.tex
|
||||||
|
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/ecti0900.tfm
|
||||||
|
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/public/cm/cmr12.tfm
|
||||||
|
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/public/cm/cmmi12.tfm
|
||||||
|
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/public/cm/cmsy10.tfm
|
||||||
|
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/public/cm/cmex10.tfm
|
||||||
|
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/public/amsfonts/symbols/msam10.tfm
|
||||||
|
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/public/amsfonts/symbols/msbm10.tfm
|
||||||
|
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/public/stmaryrd/stmary10.tfm
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/eccc1095.tfm
|
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/eccc1095.tfm
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/tcti1095.tfm
|
INPUT /usr/share/texlive/texmf-dist/fonts/tfm/jknappen/ec/tcti1095.tfm
|
||||||
INPUT report.aux
|
INPUT report.aux
|
||||||
|
@ -365,6 +359,7 @@ INPUT /usr/share/texlive/texmf-dist/fonts/enc/dvips/cm-super/cm-super-ts1.enc
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/enc/dvips/cm-super/cm-super-t1.enc
|
INPUT /usr/share/texlive/texmf-dist/fonts/enc/dvips/cm-super/cm-super-t1.enc
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmmi10.pfb
|
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmmi10.pfb
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr10.pfb
|
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr10.pfb
|
||||||
|
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr12.pfb
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr8.pfb
|
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr8.pfb
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmsy10.pfb
|
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmsy10.pfb
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfbx1095.pfb
|
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfbx1095.pfb
|
||||||
|
@ -377,5 +372,6 @@ INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm0800.pfb
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm0900.pfb
|
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm0900.pfb
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm1095.pfb
|
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm1095.pfb
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm1440.pfb
|
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm1440.pfb
|
||||||
|
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfti0900.pfb
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfti1095.pfb
|
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfti1095.pfb
|
||||||
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sftt1095.pfb
|
INPUT /usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sftt1095.pfb
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
This is pdfTeX, Version 3.14159265-2.6-1.40.15 (TeX Live 2014) (preloaded format=pdflatex 2016.3.4) 23 JUN 2016 02:38
|
This is pdfTeX, Version 3.14159265-2.6-1.40.15 (TeX Live 2014) (preloaded format=pdflatex 2016.3.4) 23 JUN 2016 21:13
|
||||||
entering extended mode
|
entering extended mode
|
||||||
restricted \write18 enabled.
|
restricted \write18 enabled.
|
||||||
%&-line parsing enabled.
|
%&-line parsing enabled.
|
||||||
|
@ -1184,30 +1184,6 @@ Package textcomp Info: Setting ptmj sub-encoding to TS1/4 on input line 340.
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
(/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
||||||
File: lstlang1.sty 2014/03/04 1.5c listings language file
|
File: lstlang1.sty 2014/03/04 1.5c listings language file
|
||||||
)
|
)
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang2.sty
|
|
||||||
File: lstlang2.sty 2014/03/04 1.5c listings language file
|
|
||||||
)
|
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang3.sty
|
|
||||||
File: lstlang3.sty 2014/03/04 1.5c listings language file
|
|
||||||
)
|
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
|
||||||
File: lstlang1.sty 2014/03/04 1.5c listings language file
|
|
||||||
)
|
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang2.sty
|
|
||||||
File: lstlang2.sty 2014/03/04 1.5c listings language file
|
|
||||||
)
|
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang3.sty
|
|
||||||
File: lstlang3.sty 2014/03/04 1.5c listings language file
|
|
||||||
)
|
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
|
||||||
File: lstlang1.sty 2014/03/04 1.5c listings language file
|
|
||||||
)
|
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang2.sty
|
|
||||||
File: lstlang2.sty 2014/03/04 1.5c listings language file
|
|
||||||
)
|
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang3.sty
|
|
||||||
File: lstlang3.sty 2014/03/04 1.5c listings language file
|
|
||||||
)
|
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/listings/lstmisc.sty
|
(/usr/share/texlive/texmf-dist/tex/latex/listings/lstmisc.sty
|
||||||
File: lstmisc.sty 2014/03/04 1.5c (Carsten Heinz)
|
File: lstmisc.sty 2014/03/04 1.5c (Carsten Heinz)
|
||||||
)
|
)
|
||||||
|
@ -1226,26 +1202,26 @@ Package biblatex Warning: 'babel/polyglossia' detected but 'csquotes' missing.
|
||||||
(./report.aux)
|
(./report.aux)
|
||||||
\openout1 = `report.aux'.
|
\openout1 = `report.aux'.
|
||||||
|
|
||||||
LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 81.
|
LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 80.
|
||||||
LaTeX Font Info: ... okay on input line 81.
|
LaTeX Font Info: ... okay on input line 80.
|
||||||
LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 81.
|
LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 80.
|
||||||
LaTeX Font Info: ... okay on input line 81.
|
LaTeX Font Info: ... okay on input line 80.
|
||||||
LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 81.
|
LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 80.
|
||||||
LaTeX Font Info: ... okay on input line 81.
|
LaTeX Font Info: ... okay on input line 80.
|
||||||
LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 81.
|
LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 80.
|
||||||
LaTeX Font Info: ... okay on input line 81.
|
LaTeX Font Info: ... okay on input line 80.
|
||||||
LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 81.
|
LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 80.
|
||||||
LaTeX Font Info: ... okay on input line 81.
|
LaTeX Font Info: ... okay on input line 80.
|
||||||
LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 81.
|
LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 80.
|
||||||
LaTeX Font Info: ... okay on input line 81.
|
LaTeX Font Info: ... okay on input line 80.
|
||||||
LaTeX Font Info: Checking defaults for PD1/pdf/m/n on input line 81.
|
LaTeX Font Info: Checking defaults for PD1/pdf/m/n on input line 80.
|
||||||
LaTeX Font Info: ... okay on input line 81.
|
LaTeX Font Info: ... okay on input line 80.
|
||||||
LaTeX Font Info: Checking defaults for TS1/cmr/m/n on input line 81.
|
LaTeX Font Info: Checking defaults for TS1/cmr/m/n on input line 80.
|
||||||
LaTeX Font Info: Try loading font information for TS1+cmr on input line 81.
|
LaTeX Font Info: Try loading font information for TS1+cmr on input line 80.
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/base/ts1cmr.fd
|
(/usr/share/texlive/texmf-dist/tex/latex/base/ts1cmr.fd
|
||||||
File: ts1cmr.fd 1999/05/25 v2.5h Standard LaTeX font definitions
|
File: ts1cmr.fd 1999/05/25 v2.5h Standard LaTeX font definitions
|
||||||
)
|
)
|
||||||
LaTeX Font Info: ... okay on input line 81.
|
LaTeX Font Info: ... okay on input line 80.
|
||||||
|
|
||||||
(/usr/share/texlive/texmf-dist/tex/context/base/supp-pdf.mkii
|
(/usr/share/texlive/texmf-dist/tex/context/base/supp-pdf.mkii
|
||||||
[Loading MPS to PDF converter (version 2006.09.02).]
|
[Loading MPS to PDF converter (version 2006.09.02).]
|
||||||
|
@ -1276,7 +1252,7 @@ File: epstopdf-sys.cfg 2010/07/13 v1.3 Configuration of (r)epstopdf for TeX Liv
|
||||||
e
|
e
|
||||||
))
|
))
|
||||||
\c@lstlisting=\count308
|
\c@lstlisting=\count308
|
||||||
LaTeX Info: Redefining \microtypecontext on input line 81.
|
LaTeX Info: Redefining \microtypecontext on input line 80.
|
||||||
Package microtype Info: Generating PDF output.
|
Package microtype Info: Generating PDF output.
|
||||||
Package microtype Info: Character protrusion enabled (level 2).
|
Package microtype Info: Character protrusion enabled (level 2).
|
||||||
Package microtype Info: Using default protrusion set `alltext'.
|
Package microtype Info: Using default protrusion set `alltext'.
|
||||||
|
@ -1292,7 +1268,7 @@ File: mt-cmr.cfg 2013/05/19 v2.2 microtype config. file: Computer Modern Roman
|
||||||
(RS)
|
(RS)
|
||||||
)
|
)
|
||||||
\AtBeginShipoutBox=\box37
|
\AtBeginShipoutBox=\box37
|
||||||
Package hyperref Info: Link coloring ON on input line 81.
|
Package hyperref Info: Link coloring ON on input line 80.
|
||||||
|
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/hyperref/nameref.sty
|
(/usr/share/texlive/texmf-dist/tex/latex/hyperref/nameref.sty
|
||||||
Package: nameref 2012/10/27 v2.43 Cross-referencing by name of section
|
Package: nameref 2012/10/27 v2.43 Cross-referencing by name of section
|
||||||
|
@ -1302,9 +1278,9 @@ Package: gettitlestring 2010/12/03 v1.4 Cleanup title references (HO)
|
||||||
)
|
)
|
||||||
\c@section@level=\count309
|
\c@section@level=\count309
|
||||||
)
|
)
|
||||||
LaTeX Info: Redefining \ref on input line 81.
|
LaTeX Info: Redefining \ref on input line 80.
|
||||||
LaTeX Info: Redefining \pageref on input line 81.
|
LaTeX Info: Redefining \pageref on input line 80.
|
||||||
LaTeX Info: Redefining \nameref on input line 81.
|
LaTeX Info: Redefining \nameref on input line 80.
|
||||||
|
|
||||||
(./report.out) (./report.out)
|
(./report.out) (./report.out)
|
||||||
\@outlinefile=\write4
|
\@outlinefile=\write4
|
||||||
|
@ -1316,7 +1292,7 @@ Package lastpage Info: Please have a look at the pageslts package at
|
||||||
(lastpage) or
|
(lastpage) or
|
||||||
(lastpage) http://www.ctan.org/tex-archive/
|
(lastpage) http://www.ctan.org/tex-archive/
|
||||||
(lastpage) install/macros/latex/contrib/pageslts.tds.zip
|
(lastpage) install/macros/latex/contrib/pageslts.tds.zip
|
||||||
(lastpage) ! on input line 81.
|
(lastpage) ! on input line 80.
|
||||||
Package caption Info: Begin \AtBeginDocument code.
|
Package caption Info: Begin \AtBeginDocument code.
|
||||||
Package caption Info: End \AtBeginDocument code.
|
Package caption Info: End \AtBeginDocument code.
|
||||||
Package biblatex Info: Input encoding 'utf8' detected.
|
Package biblatex Info: Input encoding 'utf8' detected.
|
||||||
|
@ -1327,9 +1303,9 @@ Package biblatex Info: Automatic encoding selection.
|
||||||
Package biblatex Info: Trying to load bibliographic data...
|
Package biblatex Info: Trying to load bibliographic data...
|
||||||
Package biblatex Info: ... file 'report.bbl' found.
|
Package biblatex Info: ... file 'report.bbl' found.
|
||||||
(./report.bbl)
|
(./report.bbl)
|
||||||
Package biblatex Info: Reference section=0 on input line 81.
|
Package biblatex Info: Reference section=0 on input line 80.
|
||||||
Package biblatex Info: Reference segment=0 on input line 81.
|
Package biblatex Info: Reference segment=0 on input line 80.
|
||||||
LaTeX Font Info: Try loading font information for U+msa on input line 91.
|
LaTeX Font Info: Try loading font information for U+msa on input line 90.
|
||||||
|
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsa.fd
|
(/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsa.fd
|
||||||
File: umsa.fd 2013/01/14 v3.01 AMS symbols A
|
File: umsa.fd 2013/01/14 v3.01 AMS symbols A
|
||||||
|
@ -1337,7 +1313,7 @@ File: umsa.fd 2013/01/14 v3.01 AMS symbols A
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/microtype/mt-msa.cfg
|
(/usr/share/texlive/texmf-dist/tex/latex/microtype/mt-msa.cfg
|
||||||
File: mt-msa.cfg 2006/02/04 v1.1 microtype config. file: AMS symbols (a) (RS)
|
File: mt-msa.cfg 2006/02/04 v1.1 microtype config. file: AMS symbols (a) (RS)
|
||||||
)
|
)
|
||||||
LaTeX Font Info: Try loading font information for U+msb on input line 91.
|
LaTeX Font Info: Try loading font information for U+msb on input line 90.
|
||||||
|
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsb.fd
|
(/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsb.fd
|
||||||
File: umsb.fd 2013/01/14 v3.01 AMS symbols B
|
File: umsb.fd 2013/01/14 v3.01 AMS symbols B
|
||||||
|
@ -1345,10 +1321,10 @@ File: umsb.fd 2013/01/14 v3.01 AMS symbols B
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/microtype/mt-msb.cfg
|
(/usr/share/texlive/texmf-dist/tex/latex/microtype/mt-msb.cfg
|
||||||
File: mt-msb.cfg 2005/06/01 v1.0 microtype config. file: AMS symbols (b) (RS)
|
File: mt-msb.cfg 2005/06/01 v1.0 microtype config. file: AMS symbols (b) (RS)
|
||||||
)
|
)
|
||||||
LaTeX Font Info: Try loading font information for U+stmry on input line 91.
|
LaTeX Font Info: Try loading font information for U+stmry on input line 90.
|
||||||
|
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/stmaryrd/Ustmry.fd)
|
(/usr/share/texlive/texmf-dist/tex/latex/stmaryrd/Ustmry.fd)
|
||||||
Package tocbasic Info: character protrusion at toc deactivated on input line 96
|
Package tocbasic Info: character protrusion at toc deactivated on input line 95
|
||||||
.
|
.
|
||||||
(./report.toc)
|
(./report.toc)
|
||||||
\tf@toc=\write5
|
\tf@toc=\write5
|
||||||
|
@ -1372,68 +1348,89 @@ Package microtype Info: Loading generic settings for font family
|
||||||
(microtype) For optimal results, create family-specific settings.
|
(microtype) For optimal results, create family-specific settings.
|
||||||
(microtype) See the microtype manual for details.
|
(microtype) See the microtype manual for details.
|
||||||
[2]
|
[2]
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/listings/lstlang1.sty
|
Package epstopdf Info: Source file: <res/rooftop.eps>
|
||||||
File: lstlang1.sty 2014/03/04 1.5c listings language file
|
(epstopdf) date: 2016-06-23 10:19:35
|
||||||
)
|
(epstopdf) size: 36013 bytes
|
||||||
(/usr/share/texlive/texmf-dist/tex/latex/listings/lstmisc.sty
|
(epstopdf) Output file: <res/rooftop-eps-converted-to.pdf>
|
||||||
File: lstmisc.sty 2014/03/04 1.5c (Carsten Heinz)
|
(epstopdf) date: 2016-06-23 10:20:02
|
||||||
)
|
(epstopdf) size: 22114 bytes
|
||||||
<res/rooftop.png, id=83, 586.8324pt x 442.2924pt>
|
(epstopdf) Command: <repstopdf --outfile=res/rooftop-eps-converted-
|
||||||
File: res/rooftop.png Graphic file (type png)
|
to.pdf res/rooftop.eps>
|
||||||
<use res/rooftop.png>
|
(epstopdf) \includegraphics on input line 70.
|
||||||
Package pdftex.def Info: res/rooftop.png used on input line 70.
|
Package epstopdf Info: Output file is already uptodate.
|
||||||
(pdftex.def) Requested size: 358.50612pt x 270.20964pt.
|
|
||||||
)
|
<res/rooftop-eps-converted-to.pdf, id=98, 587.19376pt x 442.65375pt>
|
||||||
[3] (./inputs/kernels.tex)
|
File: res/rooftop-eps-converted-to.pdf Graphic file (type pdf)
|
||||||
Overfull \hbox (19.7725pt too wide) in paragraph at lines 117--117
|
|
||||||
|
<use res/rooftop-eps-converted-to.pdf>
|
||||||
|
Package pdftex.def Info: res/rooftop-eps-converted-to.pdf used on input line 70
|
||||||
|
.
|
||||||
|
(pdftex.def) Requested size: 358.50612pt x 270.25478pt.
|
||||||
|
) [3] (./inputs/kernels.tex [4 <./res/rooftop-eps-converted-to.pdf>]
|
||||||
|
|
||||||
|
Package hyperref Warning: Token not allowed in a PDF string (PDFDocEncoding):
|
||||||
|
(hyperref) removing `math shift' on input line 15.
|
||||||
|
|
||||||
|
|
||||||
|
Package hyperref Warning: Token not allowed in a PDF string (PDFDocEncoding):
|
||||||
|
(hyperref) removing `\not' on input line 15.
|
||||||
|
|
||||||
|
|
||||||
|
Package hyperref Warning: Token not allowed in a PDF string (PDFDocEncoding):
|
||||||
|
(hyperref) removing `math shift' on input line 15.
|
||||||
|
|
||||||
|
[5] [6])
|
||||||
|
Overfull \hbox (19.7725pt too wide) in paragraph at lines 116--116
|
||||||
\T1/cmtt/m/n/10.95 blob / e5aa9ca4a77623ff6f1c2d5daa7995565b944506 / stream . c
|
\T1/cmtt/m/n/10.95 blob / e5aa9ca4a77623ff6f1c2d5daa7995565b944506 / stream . c
|
||||||
# L286$[][] \T1/cmr/m/n/10.95 (-20) (vis-ited on 06/20/2016).
|
# L286$[][] \T1/cmr/m/n/10.95 (-20) (vis-ited on 06/20/2016).
|
||||||
[]
|
[]
|
||||||
|
|
||||||
|
[7]
|
||||||
AED: lastpage setting LastPage
|
AED: lastpage setting LastPage
|
||||||
[4 <./res/rooftop.png>]
|
[8]
|
||||||
Package atveryend Info: Empty hook `BeforeClearDocument' on input line 118.
|
Package atveryend Info: Empty hook `BeforeClearDocument' on input line 117.
|
||||||
Package atveryend Info: Empty hook `AfterLastShipout' on input line 118.
|
Package atveryend Info: Empty hook `AfterLastShipout' on input line 117.
|
||||||
(./report.aux)
|
(./report.aux)
|
||||||
Package atveryend Info: Executing hook `AtVeryEndDocument' on input line 118.
|
Package atveryend Info: Executing hook `AtVeryEndDocument' on input line 117.
|
||||||
Package atveryend Info: Executing hook `AtEndAfterFileList' on input line 118.
|
Package atveryend Info: Executing hook `AtEndAfterFileList' on input line 117.
|
||||||
Package rerunfilecheck Info: File `report.out' has not changed.
|
Package rerunfilecheck Info: File `report.out' has not changed.
|
||||||
(rerunfilecheck) Checksum: A1CEC9B42F1ECF30AF112FC058DD7354;334.
|
(rerunfilecheck) Checksum: 365A3BDFDB786ABD7E70CA003F732AFB;566.
|
||||||
Package logreq Info: Writing requests to 'report.run.xml'.
|
Package logreq Info: Writing requests to 'report.run.xml'.
|
||||||
\openout1 = `report.run.xml'.
|
\openout1 = `report.run.xml'.
|
||||||
|
|
||||||
Package atveryend Info: Empty hook `AtVeryVeryEnd' on input line 118.
|
Package atveryend Info: Empty hook `AtVeryVeryEnd' on input line 117.
|
||||||
)
|
)
|
||||||
Here is how much of TeX's memory you used:
|
Here is how much of TeX's memory you used:
|
||||||
22318 strings out of 493339
|
21436 strings out of 493339
|
||||||
351456 string characters out of 6141383
|
338721 string characters out of 6141383
|
||||||
896953 words of memory out of 5000000
|
878402 words of memory out of 5000000
|
||||||
25280 multiletter control sequences out of 15000+600000
|
24309 multiletter control sequences out of 15000+600000
|
||||||
25899 words of font info for 101 fonts, out of 8000000 for 9000
|
29876 words of font info for 133 fonts, out of 8000000 for 9000
|
||||||
953 hyphenation exceptions out of 8191
|
953 hyphenation exceptions out of 8191
|
||||||
59i,8n,122p,1066b,1944s stack positions out of 5000i,500n,10000p,200000b,80000s
|
48i,8n,76p,1008b,1880s stack positions out of 5000i,500n,10000p,200000b,80000s
|
||||||
{/usr/share/texlive/texmf-dist/fonts/enc/dvips/cm-super/cm-super-ts1.enc}{/us
|
{/usr/share/texlive/texmf-dist/fonts/enc/dvips/cm-super/cm-super-ts1.enc}{/us
|
||||||
r/share/texlive/texmf-dist/fonts/enc/dvips/cm-super/cm-super-t1.enc}</usr/share
|
r/share/texlive/texmf-dist/fonts/enc/dvips/cm-super/cm-super-t1.enc}</usr/share
|
||||||
/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmmi10.pfb></usr/share/texli
|
/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmmi10.pfb></usr/share/texli
|
||||||
ve/texmf-dist/fonts/type1/public/amsfonts/cm/cmr10.pfb></usr/share/texlive/texm
|
ve/texmf-dist/fonts/type1/public/amsfonts/cm/cmr10.pfb></usr/share/texlive/texm
|
||||||
f-dist/fonts/type1/public/amsfonts/cm/cmr8.pfb></usr/share/texlive/texmf-dist/f
|
f-dist/fonts/type1/public/amsfonts/cm/cmr12.pfb></usr/share/texlive/texmf-dist/
|
||||||
onts/type1/public/amsfonts/cm/cmsy10.pfb></usr/share/texlive/texmf-dist/fonts/t
|
fonts/type1/public/amsfonts/cm/cmr8.pfb></usr/share/texlive/texmf-dist/fonts/ty
|
||||||
ype1/public/cm-super/sfbx1095.pfb></usr/share/texlive/texmf-dist/fonts/type1/pu
|
pe1/public/amsfonts/cm/cmsy10.pfb></usr/share/texlive/texmf-dist/fonts/type1/pu
|
||||||
blic/cm-super/sfbx1200.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/cm
|
blic/cm-super/sfbx1095.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/cm
|
||||||
-super/sfbx1440.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/
|
-super/sfbx1200.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/
|
||||||
sfbx2074.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfcc109
|
sfbx1440.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfbx207
|
||||||
5.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm0600.pfb><
|
4.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfcc1095.pfb><
|
||||||
/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm0800.pfb></usr/sh
|
/usr/share/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm0600.pfb></usr/sh
|
||||||
are/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm0900.pfb></usr/share/tex
|
are/texlive/texmf-dist/fonts/type1/public/cm-super/sfrm0800.pfb></usr/share/tex
|
||||||
live/texmf-dist/fonts/type1/public/cm-super/sfrm1095.pfb></usr/share/texlive/te
|
live/texmf-dist/fonts/type1/public/cm-super/sfrm0900.pfb></usr/share/texlive/te
|
||||||
xmf-dist/fonts/type1/public/cm-super/sfrm1440.pfb></usr/share/texlive/texmf-dis
|
xmf-dist/fonts/type1/public/cm-super/sfrm1095.pfb></usr/share/texlive/texmf-dis
|
||||||
t/fonts/type1/public/cm-super/sfti1095.pfb></usr/share/texlive/texmf-dist/fonts
|
t/fonts/type1/public/cm-super/sfrm1440.pfb></usr/share/texlive/texmf-dist/fonts
|
||||||
/type1/public/cm-super/sftt1095.pfb>
|
/type1/public/cm-super/sfti0900.pfb></usr/share/texlive/texmf-dist/fonts/type1/
|
||||||
Output written on report.pdf (4 pages, 290073 bytes).
|
public/cm-super/sfti1095.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/
|
||||||
|
cm-super/sftt1095.pfb>
|
||||||
|
Output written on report.pdf (8 pages, 328260 bytes).
|
||||||
PDF statistics:
|
PDF statistics:
|
||||||
192 PDF objects out of 1000 (max. 8388607)
|
353 PDF objects out of 1000 (max. 8388607)
|
||||||
164 compressed objects within 2 object streams
|
278 compressed objects within 3 object streams
|
||||||
31 named destinations out of 1000 (max. 500000)
|
81 named destinations out of 1000 (max. 500000)
|
||||||
22070 words of extra memory for PDF output out of 24883 (max. 10000000)
|
26190 words of extra memory for PDF output out of 29859 (max. 10000000)
|
||||||
|
|
||||||
|
|
Binary file not shown.
|
@ -46,6 +46,8 @@
|
||||||
\newrefformat{sec}{\hyperref[#1]{Section~\ref*{#1}}}
|
\newrefformat{sec}{\hyperref[#1]{Section~\ref*{#1}}}
|
||||||
\renewcommand{\arraystretch}{1.2}
|
\renewcommand{\arraystretch}{1.2}
|
||||||
|
|
||||||
|
\newcommand*\rfrac[2]{{}^{#1}\!/_{#2}}%running fraction with slash - requires math mode
|
||||||
|
|
||||||
\newcommand\bigforall{\mbox{\Large $\mathsurround0pt\forall$}}
|
\newcommand\bigforall{\mbox{\Large $\mathsurround0pt\forall$}}
|
||||||
\everymath{\displaystyle}
|
\everymath{\displaystyle}
|
||||||
|
|
||||||
|
@ -59,7 +61,7 @@
|
||||||
extendedchars=true, % lets you use non-ASCII characters; for 8-bits encodings only, does not work with UTF-8
|
extendedchars=true, % lets you use non-ASCII characters; for 8-bits encodings only, does not work with UTF-8
|
||||||
frame=single, % adds a frame around the code
|
frame=single, % adds a frame around the code
|
||||||
keepspaces=true, % keeps spaces in text, useful for keeping indentation of code (possibly needs columns=flexible)
|
keepspaces=true, % keeps spaces in text, useful for keeping indentation of code (possibly needs columns=flexible)
|
||||||
language=TeX, % the language of the code
|
language=C, % the language of the code
|
||||||
numbers=left, % where to put the line-numbers; possible values are (none, left, right)
|
numbers=left, % where to put the line-numbers; possible values are (none, left, right)
|
||||||
numbersep=5pt, % how far the line-numbers are from the code
|
numbersep=5pt, % how far the line-numbers are from the code
|
||||||
numberstyle=\tiny\color{gray}, % the style that is used for the line-numbers
|
numberstyle=\tiny\color{gray}, % the style that is used for the line-numbers
|
||||||
|
@ -70,12 +72,9 @@
|
||||||
stepnumber=1, % the step between two line-numbers. If it's 1, each line will be numbered
|
stepnumber=1, % the step between two line-numbers. If it's 1, each line will be numbered
|
||||||
tabsize=2, % sets default tabsize to 2 spaces
|
tabsize=2, % sets default tabsize to 2 spaces
|
||||||
title=\lstname, % show the filename of files included with \lstinputlisting; also try caption instead of title
|
title=\lstname, % show the filename of files included with \lstinputlisting; also try caption instead of title
|
||||||
emph=[3]{int:,array,set,of,int,if,then,else,constraint,var,union,endif,function,where,in,div,predicate,let,opt,full,format,def,for,True,False,return,or},
|
keywordstyle=\color{blue},
|
||||||
emphstyle=[3]\color{ForestGreen},
|
morekeywords={size_t},
|
||||||
emph=[2]{length,max,forall,startEmptyBuffer,fix,startEmptyBufferShow,exactly,cumulative,occurs,deopt,sum,,all},
|
commentstyle=\color{ForestGreen}
|
||||||
emphstyle=[2]\color{blue},
|
|
||||||
commentstyle=\color{BrickRed},
|
|
||||||
stringstyle =\color{red},
|
|
||||||
}
|
}
|
||||||
|
|
||||||
\begin{document}
|
\begin{document}
|
||||||
|
|
|
@ -13,3 +13,9 @@
|
||||||
\contentsline {subsection}{\numberline {2.3}Graph}{3}{subsection.2.3}
|
\contentsline {subsection}{\numberline {2.3}Graph}{3}{subsection.2.3}
|
||||||
\defcounter {refsection}{0}\relax
|
\defcounter {refsection}{0}\relax
|
||||||
\contentsline {section}{\numberline {3}Kernels}{4}{section.3}
|
\contentsline {section}{\numberline {3}Kernels}{4}{section.3}
|
||||||
|
\defcounter {refsection}{0}\relax
|
||||||
|
\contentsline {subsection}{\numberline {3.1}1/16 $\not =$ 1/16. Or: The Fancy Arithmetics of a Compiler}{5}{subsection.3.1}
|
||||||
|
\defcounter {refsection}{0}\relax
|
||||||
|
\contentsline {subsection}{\numberline {3.2}The 1/16 OI Kernel}{6}{subsection.3.2}
|
||||||
|
\defcounter {refsection}{0}\relax
|
||||||
|
\contentsline {subsection}{\numberline {3.3}The 8 OI Kernel}{6}{subsection.3.3}
|
||||||
|
|
BIN
roofline/report/res/rooftop-eps-converted-to.pdf
Normal file
Binary file not shown.
2350
roofline/report/res/rooftop.eps
Normal file
File diff suppressed because it is too large
Binary file not shown.
Before Width: | Height: | Size: 38 KiB |
|
@ -42,10 +42,20 @@
|
||||||
Timestamp = {2016.06.20}
|
Timestamp = {2016.06.20}
|
||||||
}
|
}
|
||||||
|
|
||||||
@Online{intel2,
|
@Online{intelvfmadd132pd,
|
||||||
Title = {Intel Intrinsics Guide},
|
Title = {Intel Intrinsics Guide: vfmadd132pd},
|
||||||
Author = {{Intel}},
|
Author = {{Intel}},
|
||||||
Url = {https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=AVX2,FMA&text=madd&expand=2365},
|
Url = {https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=AVX2,FMA&text=vfmadd132pd&expand=2365},
|
||||||
|
Urldate = {2016-06-19},
|
||||||
|
|
||||||
|
Owner = {armin},
|
||||||
|
Timestamp = {2016.06.22}
|
||||||
|
}
|
||||||
|
|
||||||
|
@Online{intelvfmadd132sd,
|
||||||
|
Title = {Intel Intrinsics Guide: vfmadd132sd},
|
||||||
|
Author = {{Intel}},
|
||||||
|
Url = {https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=AVX2,FMA&text=vfmadd132sd&expand=2365,2403},
|
||||||
Urldate = {2016-06-19},
|
Urldate = {2016-06-19},
|
||||||
|
|
||||||
Owner = {armin},
|
Owner = {armin},
|
||||||
|
|
|
@ -1,7 +1,8 @@
|
||||||
all: roofline roofline_avx roofline_o3avx roofline_o3 roofline_avxfma
|
all: roofline roofline_avx roofline_o3avx roofline_o3 roofline_avxfma roofline_avxfmafast
|
||||||
|
|
||||||
|
# Roofline Binary
|
||||||
roofline: roofline.c aikern.a
|
roofline: roofline.c aikern.a
|
||||||
gcc -Wall -Wextra -O3 -std=c99 -fopenmp $^ -o $@
|
gcc -Wall -Wextra -std=c99 -fopenmp $^ -o $@
|
||||||
|
|
||||||
roofline_avx: roofline.c aikern_avx.a
|
roofline_avx: roofline.c aikern_avx.a
|
||||||
gcc -Wall -Wextra -O3 -std=c99 -fopenmp $^ -o $@
|
gcc -Wall -Wextra -O3 -std=c99 -fopenmp $^ -o $@
|
||||||
|
@ -15,6 +16,10 @@ roofline_o3: roofline.c aikern_o3.a
|
||||||
roofline_avxfma: roofline.c aikern_avxfma.a
|
roofline_avxfma: roofline.c aikern_avxfma.a
|
||||||
gcc -Wall -Wextra -O3 -std=c99 -fopenmp $^ -o $@
|
gcc -Wall -Wextra -O3 -std=c99 -fopenmp $^ -o $@
|
||||||
|
|
||||||
|
roofline_avxfmafast: roofline.c aikern_avxfmafast.a
|
||||||
|
gcc -Wall -Wextra -O3 -std=c99 -fopenmp $^ -o $@
|
||||||
|
|
||||||
|
# Static Libraries
|
||||||
aikern.a: aikern.c aikern.h
|
aikern.a: aikern.c aikern.h
|
||||||
gcc -c -o aikern.o aikern.c
|
gcc -c -o aikern.o aikern.c
|
||||||
ar rcs aikern.a aikern.o
|
ar rcs aikern.a aikern.o
|
||||||
|
@ -36,8 +41,17 @@ aikern_avxfma.a: aikern.c aikern.h
|
||||||
gcc -O2 -mavx -mfma -c -o aikern_avxfma.o aikern.c
|
gcc -O2 -mavx -mfma -c -o aikern_avxfma.o aikern.c
|
||||||
ar rcs aikern_avxfma.a aikern_avxfma.o
|
ar rcs aikern_avxfma.a aikern_avxfma.o
|
||||||
|
|
||||||
|
aikern_avxfmafast.a: aikern.c aikern.h
|
||||||
|
gcc -O2 -mavx -mfma -Ofast -c -o aikern_avxfmafastmath.o aikern.c
|
||||||
|
ar rcs aikern_avxfmafast.a aikern_avxfmafastmath.o
|
||||||
|
|
||||||
|
aikern_avxfmafastmath.a: aikern.c aikern.h
|
||||||
|
gcc -O2 -mavx -mfma -Ofast -ffast-math -c -o aikern_avxfmafastmath.o aikern.c
|
||||||
|
ar rcs aikern_avxfmafastmath.a aikern_avxfmafastmath.o
|
||||||
|
|
||||||
|
|
||||||
clean:
|
clean:
|
||||||
rm -f roofline roofline_avx roofline_o3avx roofline_o3 roofline_avxfma
|
rm -f roofline roofline_avx roofline_o3avx roofline_o3 roofline_avxfma roofline_avxfmafast
|
||||||
rm -f *.o
|
rm -f *.o
|
||||||
rm -f *.a
|
rm -f *.a
|
||||||
rm -f *.so
|
rm -f *.so
|
||||||
|
|
|
@ -23,6 +23,8 @@ void kernel_1_16_fuseaware(double* a, double* b, double* c, size_t size)
|
||||||
vmovsd xmm1,QWORD PTR [rdx+rax*8] # 1 read
|
vmovsd xmm1,QWORD PTR [rdx+rax*8] # 1 read
|
||||||
vfmadd132sd xmm0,xmm1,QWORD PTR [rsi+rax*8] # 2 FLOPs + 1 read
|
vfmadd132sd xmm0,xmm1,QWORD PTR [rsi+rax*8] # 2 FLOPs + 1 read
|
||||||
vmovsd QWORD PTR [rdi+rax*8],xmm0 # 1 write
|
vmovsd QWORD PTR [rdi+rax*8],xmm0 # 1 write
|
||||||
|
|
||||||
|
Uses packed doubles with -Ofast.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
#pragma omp parallel for
|
#pragma omp parallel for
|
||||||
|
@ -36,6 +38,26 @@ void kernel_1_16_fuseaware(double* a, double* b, double* c, size_t size)
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#define REP0(X)
|
||||||
|
#define REP1(X) X
|
||||||
|
#define REP2(X) REP1(X) REP1(X)
|
||||||
|
#define REP3(X) REP2(X) REP1(X)
|
||||||
|
#define REP4(X) REP3(X) REP1(X)
|
||||||
|
#define REP5(X) REP4(X) REP1(X)
|
||||||
|
#define REP6(X) REP5(X) REP1(X)
|
||||||
|
#define REP7(X) REP6(X) REP1(X)
|
||||||
|
#define REP8(X) REP7(X) REP1(X)
|
||||||
|
#define REP9(X) REP8(X) REP1(X)
|
||||||
|
|
||||||
|
#define REP10(X) REP9(X) REP1(X)
|
||||||
|
#define REP20(X) REP10(X) REP10(X)
|
||||||
|
#define REP30(X) REP20(X) REP10(X)
|
||||||
|
#define REP40(X) REP30(X) REP10(X)
|
||||||
|
#define REP50(X) REP40(X) REP10(X)
|
||||||
|
#define REP60(X) REP50(X) REP10(X)
|
||||||
|
|
||||||
|
#define REP100(X) REP50(X) REP50(X)
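The REPn macros above repeat their argument n times during preprocessing, so the kernels can spell out long operation chains without a runtime loop. The following is a minimal sketch of that idea, not the commit's code: it uses a shortened, doubling variant of the macro set, and the helpers `count_rep8` and `pow9` are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

/* Shortened variant of the repetition macros (assumption: doubling
   instead of the one-by-one chain used in aikern.c). */
#define REP1(X) X
#define REP2(X) REP1(X) REP1(X)
#define REP4(X) REP2(X) REP2(X)
#define REP8(X) REP4(X) REP4(X)

/* Hypothetical helper: the statement "n++;" is pasted eight times,
   so the return value equals the repetition count. */
static size_t count_rep8(void)
{
    size_t n = 0;
    REP8(n++;)
    return n;
}

/* Hypothetical helper: "x *" is pasted eight times and closed with a
   final x, giving nine factors and therefore eight multiplications. */
static double pow9(double x)
{
    return REP8(x *) x;
}
```

The same counting applies to the kernel in this diff: REP100, REP20 and REP8 of `a[i]*` followed by one closing `a[i]` yield 129 factors, i.e. 128 multiplications.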
|
||||||
|
|
||||||
void kernel_8_1_simple(double* a, double* b, double* c, size_t size)
|
void kernel_8_1_simple(double* a, double* b, double* c, size_t size)
|
||||||
{
|
{
|
||||||
/* === Warning ===
|
/* === Warning ===
|
||||||
|
@ -49,7 +71,7 @@ void kernel_8_1_simple(double* a, double* b, double* c, size_t size)
|
||||||
|
|
||||||
vmovsd xmm1,QWORD PTR [rdi] # 1 read
|
vmovsd xmm1,QWORD PTR [rdi] # 1 read
|
||||||
vmulsd xmm0,xmm1,xmm1 # 1 FLOP+register shuffling
|
vmulsd xmm0,xmm1,xmm1 # 1 FLOP+register shuffling
|
||||||
vmulsd xmm0,xmm0,xmm1 # 15x 1 FLOP+register shuffling
|
vmulsd xmm0,xmm0,xmm1 # 127x 1 FLOP+register shuffling
|
||||||
# [...]
|
# [...]
|
||||||
vmovsd QWORD PTR [rdi-0x8],xmm0 # 1 write
|
vmovsd QWORD PTR [rdi-0x8],xmm0 # 1 write
|
||||||
*/
|
*/
|
||||||
|
@ -57,16 +79,14 @@ void kernel_8_1_simple(double* a, double* b, double* c, size_t size)
|
||||||
#pragma omp parallel for
|
#pragma omp parallel for
|
||||||
for(size_t i=0; i<size; i++){
|
for(size_t i=0; i<size; i++){
|
||||||
/*
|
/*
|
||||||
COMM: 1 read+1 write
|
COMM: 1 read+1 write = 16 byte
|
||||||
COMP: 16 FLOPs
|
COMP: 128 FLOPs
|
||||||
-> AI = 8
|
-> AI = 128/16 = 8
|
||||||
*/
|
*/
|
||||||
a[i] = a[i] * a[i] * a[i] *
|
a[i] = REP100(a[i]*)
|
||||||
a[i] * a[i] * a[i] *
|
REP20(a[i]*)
|
||||||
a[i] * a[i] * a[i] *
|
REP8(a[i]*)
|
||||||
a[i] * a[i] * a[i] *
|
REP1(a[i]);
|
||||||
a[i] * a[i] * a[i] *
|
|
||||||
a[i] * a[i];
|
|
||||||
}
|
}
|
||||||
}
|
}
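The comment in the kernel derives its intensity as 128 FLOPs over 16 bytes. The arithmetic can be sketched as a small function; `operational_intensity` is a hypothetical helper, not part of aikern.c, and it assumes that every read and write moves exactly one `double` (8 bytes on x86-64) to or from memory.

```c
#include <stddef.h>

/* Hypothetical helper: FLOPs per byte of memory traffic, assuming each
   read and each write transfers one double. */
static double operational_intensity(size_t flops, size_t reads, size_t writes)
{
    size_t bytes = (reads + writes) * sizeof(double);
    return (double)flops / (double)bytes;
}
```

With 128 FLOPs, one read and one write this gives 128/16 = 8, matching the comment. As a generic second example, 2 FLOPs over four `double` accesses give 2/32 = 1/16. Whether the compiled code really performs these counts still has to be checked in the disassembly, as the comments in the source do.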
|
||||||
|
|
||||||
|
@ -76,28 +96,25 @@ void kernel_8_1_fuseaware(double* a, double* b, double* c, size_t size)
|
||||||
With FMA (and -O2):
|
With FMA (and -O2):
|
||||||
|
|
||||||
vmovsd xmm0,QWORD PTR [rdi] # 1 read
|
vmovsd xmm0,QWORD PTR [rdi] # 1 read
|
||||||
vfmadd132sd xmm0,xmm0,xmm0 # 8x 2 FLOPs+register shuffling
|
vfmadd132sd xmm0,xmm0,xmm0 # 64 x 2 FLOPs+register shuffling
|
||||||
vmovsd QWORD PTR [rdi-0x8],xmm0 # 1 write
|
vmovsd QWORD PTR [rdi-0x8],xmm0 # 1 write
|
||||||
|
|
||||||
|
Uses packed doubles with -Ofast.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
#pragma omp parallel for
|
#pragma omp parallel for
|
||||||
for(size_t i=0; i<size; i++){
|
for(size_t i=0; i<size; i++){
|
||||||
/*
|
/*
|
||||||
COMM: 1 read + 1 write
|
COMM: 1 read + 1 write
|
||||||
COMP: 16 FLOP
|
COMP: 128 FLOP
|
||||||
-> AI = 8
|
-> AI = 8
|
||||||
*/
|
*/
|
||||||
a[i] = a[i] * a[i] + a[i];
|
REP60(a[i] = a[i] * a[i] + a[i];)
|
||||||
a[i] = a[i] * a[i] + a[i];
|
REP4(a[i] = a[i] * a[i] + a[i];)
|
||||||
a[i] = a[i] * a[i] + a[i];
|
|
||||||
a[i] = a[i] * a[i] + a[i];
|
|
||||||
a[i] = a[i] * a[i] + a[i];
|
|
||||||
a[i] = a[i] * a[i] + a[i];
|
|
||||||
a[i] = a[i] * a[i] + a[i];
|
|
||||||
a[i] = a[i] * a[i] + a[i];
|
|
||||||
}
|
}
|
||||||
}
|
}
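The fuse-aware kernel repeats `a[i] = a[i] * a[i] + a[i];` 64 times (REP60 plus REP4), and each repetition counts as one multiplication and one addition. A minimal sketch of that accounting follows; it uses a runtime loop instead of the macro expansion, and `chain_result` and `fma_chain` are hypothetical names. Whether the compiler actually emits a fused multiply-add for the expression depends on the target flags (e.g. `-mfma`) and is an assumption here.

```c
#include <stddef.h>

/* Hypothetical result type: final value plus the number of FLOPs counted. */
typedef struct {
    double value;
    size_t flops;
} chain_result;

/* Hypothetical helper: applies a = a*a + a `steps` times and counts
   two FLOPs (one multiply, one add) per step. */
static chain_result fma_chain(double a, size_t steps)
{
    chain_result r = { a, 0 };
    for (size_t k = 0; k < steps; k++) {
        r.value = r.value * r.value + r.value;
        r.flops += 2;
    }
    return r;
}
```

For 64 steps the counter reaches 128 FLOPs, which together with 16 bytes of traffic per element gives the intensity of 8 stated in the kernel comment.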
|
||||||
|
|
||||||
|
|
||||||
/* === FAILED KERNELS === */
|
/* === FAILED KERNELS === */
|
||||||
|
|
||||||
/*
|
/*
|
||||||
|
@ -127,7 +144,7 @@ void kernel_1_16_simple_dangerous(double* a, double* b, double* c, size_t size)
|
||||||
|
|
||||||
// volatile to prevent compiler from optimizing this away
|
// volatile to prevent compiler from optimizing this away
|
||||||
// register to advise compiler to put this in register
|
// register to advise compiler to put this in register
|
||||||
volatile register double tmp = 0.1;
|
double tmp = 0.1;
|
||||||
|
|
||||||
#pragma omp parallel for
|
#pragma omp parallel for
|
||||||
for(size_t i=0; i<size; i++){
|
for(size_t i=0; i<size; i++){
|
||||||
|
|
Binary file not shown.
Binary file not shown.
Binary file not shown.
BIN
roofline/src/roofline_avxfmafast
Executable file
Binary file not shown.
Binary file not shown.
Binary file not shown.