Commit b2f3e1ce authored by CHAMONT David's avatar CHAMONT David

Cours complet

parent 9368f3ab
#include <valarray>
#include <cstdlib>
#include <cassert>
#include <iostream>
#include <string_view>
#include <chrono>
template< typename Fonction, typename... ArgTypes >
auto time( std::string_view title, Fonction f, ArgTypes... args )
{
  using namespace std::chrono ;
  auto t1 {steady_clock::now()} ;
  auto res {f(args...)} ;
  auto t2 {steady_clock::now()} ;
  auto dt {duration_cast<microseconds>(t2-t1).count()} ;
  std::cout<<"("<<title<<" time: "<<dt<<" us)"<<std::endl ;
  return res ;
}

std::valarray<double> generate( int size )
{
  std::valarray<double> datas(size) ;
  for ( double & data : datas )
  { data = std::rand()/(RAND_MAX+1.) ; }
  return datas ;
}

double analyse1( std::valarray<double> const & datas, int power )
{
  double res = 0 ;
  for ( double data : datas )
  {
    double val = 1 ;
    for ( int j=0 ; j<power ; ++j )
    { val *= data ; }
    res += val ;
  }
  return res ;
}

double analyse2( std::valarray<double> datas, int power )
{
  std::valarray<double> vals(1.,datas.size()) ;
  for ( int j=0 ; j<power ; ++j )
  { vals *= datas ; }
  double res = 0 ;
  for ( double val : vals )
  { res += val ; }
  return res ;
}

int main( int argc, char * argv[] )
{
  assert(argc==3) ;
  int size {atoi(argv[1])} ;
  int power {atoi(argv[2])} ;
  auto datas = time("gen",generate,size) ;
  auto res1 = time("ana1",analyse1,datas,power) ;
  auto res2 = time("ana2",analyse2,datas,power) ;
  std::cout << res1 << " " << res2 << std::endl ;
}
> ./chrono.2.py 10 chrono.1.cpp 1024 100000
0.525744 0.525744
(gen mean time: 14.7)
(ana1 mean time: 2761842)
(ana2 mean time: 1319491)
#!/usr/bin/env python3
import os, sys
import re
import subprocess
import statistics
NB_RUNS = int(sys.argv[1])
SRC_FILE = sys.argv[2]
RUN_ARGS = ' '.join(sys.argv[3:])
exe_file = SRC_FILE.replace(".cpp",".exe")
compile_cmd = "rm -f {} && g++ -std=c++17 -O3 -march=native {} -o {}".format(exe_file,SRC_FILE,exe_file)
run_cmd = "./{} {}".format(exe_file,RUN_ARGS)
# Utility function
def run(cmd):
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True, executable='bash', universal_newlines=True, check=True)
    return proc.stdout.rstrip().split('\n')

# Compile & first run
expr_times = re.compile(r"^\((.*) time: (.*) us\)$")
times = {}
os.system(compile_cmd)
for line in run(run_cmd):
    match = expr_times.match(line)
    if match:
        times[match.group(1)] = []
    else:
        print(line)

# Repeat timings
for irun in range(NB_RUNS):
    for line in run(run_cmd):
        match = expr_times.match(line)
        if match:
            times[match.group(1)].append(int(match.group(2)))

# Display mean times
for ktime in times:
    print("({} mean time: {})".format(ktime,int(statistics.mean(times[ktime]))))
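The heart of the script above is the regular expression that extracts each timing line emitted by the C++ program, while passing every other line through unchanged. A minimal standalone illustration of that extraction (the sample lines are made up, mimicking the program's output format):

```python
import re

# Same pattern as in the script: captures the label and the microsecond count.
expr_times = re.compile(r"^\((.*) time: (.*) us\)$")

sample_output = [
    "0.525744 0.525744",        # ordinary program output: not a timing line
    "(gen time: 15 us)",
    "(ana1 time: 2761842 us)",
]

times = {}
for line in sample_output:
    match = expr_times.match(line)
    if match:
        times.setdefault(match.group(1), []).append(int(match.group(2)))

print(times)  # {'gen': [15], 'ana1': [2761842]}
```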
#include <benchmark/benchmark.h>
#include <valarray>
#include <cstdlib>

int constexpr size = 1024 ;
int constexpr power = 100000 ;

static void Analyse1(benchmark::State& state) {
  std::valarray<double> datas(size) ;
  for ( double & data : datas )
  { data = std::rand()/(RAND_MAX+1.) ; }
  for (auto _ : state) {
    double res {0.} ;
    for ( double data : datas ) {
      double val = 1 ;
      for ( int j=0 ; j<power ; ++j )
      { val *= data ; }
      res += val ;
    }
    benchmark::DoNotOptimize(res) ;
  }
}
BENCHMARK(Analyse1);

static void Analyse2(benchmark::State& state) {
  std::valarray<double> datas(size) ;
  for ( double & data : datas )
  { data = std::rand()/(RAND_MAX+1.) ; }
  for (auto _ : state) {
    double res {0.} ;
    std::valarray<double> vals(1.,datas.size()) ;
    for ( int j=0 ; j<power ; ++j )
    { vals *= datas ; }
    for ( double val : vals )
    { res += val ; }
    benchmark::DoNotOptimize(res);
  }
}
BENCHMARK(Analyse2);

BENCHMARK_MAIN();
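For reference, the snippet above targets the Google Benchmark library. Assuming the library and its headers are installed system-wide (the paths and the file name `chrono.benchmark.cpp` are assumptions, not part of the repository), a typical build and run looks like:

```shell
g++ -std=c++17 -O3 -march=native chrono.benchmark.cpp -lbenchmark -lpthread -o chrono.benchmark.exe
./chrono.benchmark.exe
```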
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"attributes": {
"classes": [
"cpp"
],
"id": ""
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Floating point basics\n",
"\n",
"In scientific computing, we most often encode real numbers using the \"floating-point format\". This encoding can represent a wide range of values, from very large to very small, and efficiently handles most real-world problems."
]
},
{
"cell_type": "markdown",
"metadata": {
"attributes": {
"classes": [
"cpp"
],
"id": ""
},
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"When such a large range is not useful, one can use fixed-point numbers to obtain faster calculations. When looking for maximum precision, while accepting a longer execution time, one can rely on quotients of integers."
]
},
{
"cell_type": "markdown",
"metadata": {
"attributes": {
"classes": [
"cpp"
],
"id": ""
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## General representation of a floating-point number\n",
"\n",
"In a given base *b*, which will be the same for all the numbers, the representation of a number includes: \n",
"* one bit for the sign (equal to −1 or 1),\n",
"* the mantissa,\n",
"* the exponent (relative integer).\n",
"\n",
"The encoded number is equivalent to *sign × mantissa × b^exponent*"
]
},
{
"cell_type": "markdown",
"metadata": {
"attributes": {
"classes": [
"cpp"
],
"id": ""
},
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"When the exponent varies, the point somehow “floats”, hence the name. The mantissa is a sequence of digits in base *b*, generally of fixed size, in which we choose to place the point at a given position: just before or just after the first digit, or even just after the last digit (in this case the mantissa is a natural number)."
]
},
{
"cell_type": "markdown",
"metadata": {
"attributes": {
"classes": [
"cpp"
],
"id": ""
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
    "Floating-point numbers are a kind of **computer equivalent of scientific notation**, the latter typically using base 10 and placing the point after the first digit of the mantissa.\n",
    "\n",
    "Over time, several computing variants have emerged, inducing portability problems. This motivated the creation of the IEEE 754 standard."
]
},
{
"cell_type": "markdown",
"metadata": {
"attributes": {
"classes": [
"cpp"
],
"id": ""
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## IEEE 754 standard\n",
"\n",
    "In 1985 the IEEE (Institute of Electrical and Electronics Engineers) defined its standard *IEEE 754*, which was revised in 2008 and, most recently, in July 2019.\n",
    "\n",
    "The standard proposes different formats in base 2 and in base 10. For base 2, it imposes that the point is placed after the first bit of the mantissa. Since, in base `2`, the first bit is always `1`, it is not stored: it is \"implicit\". For the rest, the standard distributes the bits as follows:\n",
"\n",
"| Name | Common name | Size | Sign | Exponent | Mantissa | Significant decimal digits |\n",
"| :-------- | ------------------- | -------: | ----: | -------: | --------: | -------------------------: |\n",
"| binary16 | Half precision | 16 bits | 1 bit | 5 bits | 10 bits | ? |\n",
"| binary32 | Single precision | 32 bits | 1 bit | 8 bits | 23 bits | about 7 |\n",
"| binary64 | Double precision | 64 bits | 1 bit | 11 bits | 52 bits | about 16 |\n",
"| binary128 | Quadruple precision | 128 bits | 1 bit | 15 bits | 112 bits | ? |"
]
},
{
"cell_type": "markdown",
"metadata": {
"attributes": {
"classes": [
"cpp"
],
"id": ""
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Predefined floating-point types in C++\n",
"\n",
    "The C++ standard does not enforce the size of floating-point types. This can cause portability issues from one machine to another. The only guarantee the standard provides is\n",
"\n",
"```\n",
"sizeof(float) <= sizeof(double) <= sizeof(long double)\n",
"```\n",
"\n",
    "However, nowadays, we can assume that all common architectures use the **IEEE 754 binary32 format for the type `float`** and the **IEEE 754 binary64 format for the type `double`**, with an efficient hardware implementation of the operations on these numbers."
]
},
{
"cell_type": "markdown",
"metadata": {
"attributes": {
"classes": [
"cpp"
],
"id": ""
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
    "The situation varies more widely for `long double`. Even if the size of the type on your architecture is 16 bytes (128 bits), it is common that only the first 80 bits are used for the computation, at least on an Intel processor.\n",
"\n",
"It is also possible that some operations are not directly supported by the hardware, but rather \"software-emulated\", with reduced performance."
]
},
{
"cell_type": "markdown",
"metadata": {
"attributes": {
"classes": [
"cpp"
],
"id": ""
},
"slideshow": {
"slide_type": "skip"
}
},
"source": [
    "To check for possible portability issues, you can add a `static_assert` to your code in order to verify that the size of the types is what you expect.\n",
    "\n",
    "If you restrict yourself to the g++ compiler, and accept its extensions to the standard, g++ offers you either the `__float80` or the `__float128` type, depending on the architecture you are compiling on."
]
},
{
"cell_type": "markdown",
"metadata": {
"attributes": {
"classes": [
"cpp"
],
"id": ""
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Exotic precisions\n",
"\n",
    "If the level of precision of `long double` is not enough for you, there are a few ways to explore (below). Of course, since there is no direct hardware support, the execution times may be much longer.\n",
"* the GNU library [**MPFR**](https://fr.wikipedia.org/wiki/GNU_MPFR) allows one to choose an arbitrary precision.\n",
"* of course, as for many other problems, there is the [Boost library](https://www.boost.org/doc/libs/1_71_0/libs/multiprecision/doc/html/boost_multiprecision/intro.html) :)\n",
"* some algorithms of **compensated arithmetic** make it possible to limit the losses linked to rounding during a calculation."
]
},
{
"cell_type": "markdown",
"metadata": {
"attributes": {
"classes": [
"cpp"
],
"id": ""
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
    "In addition, especially since the great comeback of machine learning, there is a growing need for numbers with lower precision:\n",
    "* **half precision**, stored in 2 bytes, also relies on an external library and is only effective on specific hardware such as GPGPUs;\n",
    "* **BF16** is a variant of the previous one which uses more bits for the exponent and fewer for the mantissa, and which therefore favors \"order of magnitude\" to the detriment of precision."
]
},
{
"cell_type": "markdown",
"metadata": {
"attributes": {
"classes": [
"cpp"
],
"id": ""
},
"slideshow": {
"slide_type": "skip"
}
},
"source": [
    "It should also be mentioned that the **80-bit extended precision** available on Intel processors used to be applied to intermediate calculations even when the input and output data were `double`. By contrast, it was not used for vectorized instructions, which led to different results depending on whether a code was vectorized or not. As far as we know, compilers now avoid this use of 80 bits, even for scalar computations, unless explicitly asked to."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Questions ?"
]
},
{
"cell_type": "markdown",
"metadata": {
"attributes": {
"classes": [
"cpp"
],
"id": ""
},
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"# Exercise\n",
"\n",
"Using the numeric traits defined via [std::numeric_limits](https://en.cppreference.com/w/cpp/types/numeric_limits) in the standard library, complete the following table:\n",
"\n",
"| Type | Size (in bytes) | Minimal value | Maximum value | Epsilon |\n",
"| :---------- | -------------------: | -------------------: | -------------------: | -------------------: |\n",
"| float | | | | |\n",
"| double | | | | |\n",
"| long double | | | | |\n",
"\n",
    "To make sure you don't lose any information in the printout, display 18 significant digits.\n",
    "* What do we mean here by \"minimal value\"? What is the difference with the \"lowest value\"?\n",
    "* What do you think of the results for `long double`: is it rather 80 or 128 bits?\n",
"\n",
"<!--\n",
    "Regarding long double, 80 or 128 bits? A priori the 80-bit format uses\n",
    "as many bits as the 128-bit one for the exponent, so the min/max is\n",
    "not discriminating. On the other hand, in 128 bits, it seems to me that\n",
    "the number of significant digits should be above 30, so the epsilon\n",
    "should be below 1e-30; otherwise it is most likely 80 bits...\n",
    "even though the apparent size is 16 bytes in all cases.\n",
"-->"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"## Sources\n",
"\n",
"* [Wikipedia](https://en.wikipedia.org/wiki/Floating-point_arithmetic)\n",
"* [Cpp Reference](https://en.cppreference.com/w/cpp/language/types)\n",
"* [QuadMath](https://people.sc.fsu.edu/~jburkardt/cpp_src/g++_quadmath/g++_quadmath.html)\n",
"\n",
"\n",
"* [What Every Programmer Should Know About Floating-Point Arithmetic](https://floating-point-gui.de/)\n",
"* [IEEE-754 Floating-Point Conversion](https://babbage.cs.qc.cuny.edu/IEEE-754.old/Decimal.html)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"© *CNRS 2021* \n",
"*This document was created by David Chamont and translated by Olga Abramkina. It is available under the [License Creative Commons - Attribution - No commercial use - Shared under the conditions 4.0 International](http://creativecommons.org/licenses/by-nc-sa/4.0/)*"
]
}
],
"metadata": {
"celltoolbar": "Diaporama",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"attributes": {
"classes": [
"cpp"
],
"id": ""
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Configuring the precision\n",
"\n",
    "Many scientists choose the double-precision type (`double`) by default, out of habit. They suspect that single precision (`float`) might be insufficient, and that higher precision (`long double`) might be slow and bulky. Do not rely on such fuzzy intuition: instead, write your code with a configurable precision, and **try out** different ones."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
    "### First prerequisite: accuracy control\n",
    "\n",
    "Before trying different floating-point types, one must wonder how to **validate the accuracy of the final result**, or at least check whether it is accurate enough."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
    "### Second prerequisite: execution time control\n",
    "\n",
    "When monitoring the execution time, there are a few rules to know about, especially when playing with small/toy code:\n",
    "* run your program many times and compute a mean execution time;\n",
    "* your program must run long enough, so that the processor pipelines get filled and you go well beyond the initial *computing latency*;\n",
    "* if the size of your data becomes too big, you may overflow the processor cache, become I/O bound, and the results will no longer express the CPU performance, but the bandwidth of the memory bus."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## First step with `using`\n",
"\n",
    "If you are starting from a code written with `double`, you can simply search and replace all `double` with, for example, `real`, and add `using real = double` at the beginning of all your `*.cc` body files, or at the beginning of some `*.h` header file included everywhere. \n",
    "\n",
    "From there on, you can change your alias into `using real = float`, recompile everything, then check whether the results are still accurate enough, and measure how much faster the execution is."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Writing tmp.precision.h\n"
]
}
],
"source": [