When doing standard double-precision floating-point operations in C or Fortran, you can expect 15-17 significant digits. But what do you do when 15 decimals are not enough? The GNU multiple precision arithmetic library (GMP) is the go-to library when you need not 15 but thousands of decimals, though for more modest needs, quadruple precision (128 bits) might be enough. In Fortran, 128 bits is REAL*16, and in C we can access it via the type __float128. These may not be available in all compilers, but GCC (4.6 and newer) and Intel C/Fortran do support it. According to the GCC manual, we can expect it to be “an order of magnitude or two slower” than double precision.
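As a minimal illustration (my own sketch, not part of the benchmark below), this is all it takes to compute and print a quad precision value in C with GCC:

#include <stdio.h>
#include <quadmath.h>

int main(void)
{
    __float128 x = 2.0Q;      /* 'Q' suffix marks a quad precision literal (GCC extension) */
    __float128 r = sqrtq(x);  /* libquadmath's square root for __float128 */

    char buf[128];
    /* printf cannot format a __float128 directly; quadmath_snprintf does,
       using the 'Q' length modifier */
    quadmath_snprintf(buf, sizeof buf, "%.33Qg", r);
    printf("sqrt(2) = %s\n", buf);
    return 0;
}

Compile and link with gcc hello_quad.c -lquadmath (the file name is my choice).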
I decided to test quad precision by doing a simple matrix-matrix multiplication test. Two 256x256 matrices are initialized with trigonometric expressions and then multiplied by each other. It is straightforward to change the program to quad precision: the double arrays should be declared as __float128, the trigonometric functions are called cosq/sinq, and we need to use the special function quadmath_snprintf to print the numbers to the screen. (In Fortran, you don’t even need to use a different print function.)
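A condensed sketch of what the quad precision version might look like (the exact initialization expressions are my assumption, and the timing code is omitted; the original matmul_quad.c is not reproduced here):

#include <stdio.h>
#include <quadmath.h>

#define N 256

/* Static arrays: 3 x 256 x 256 x 16 bytes = 3 MiB */
static __float128 A[N][N], B[N][N], C[N][N];

int main(void)
{
    /* Initialize with trigonometric expressions; cosq/sinq are the
       libquadmath counterparts of cos/sin */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            A[i][j] = cosq((__float128)(i + j));
            B[i][j] = sinq((__float128)(i - j));
            C[i][j] = 0.0Q;
        }

    /* Naive triple-loop matrix-matrix multiplication */
    for (int i = 0; i < N; i++)
        for (int k = 0; k < N; k++)
            for (int j = 0; j < N; j++)
                C[i][j] += A[i][k] * B[k][j];

    /* quadmath_snprintf is required to print a __float128 */
    char buf[64];
    quadmath_snprintf(buf, sizeof buf, "%.30Qf", C[0][0]);
    printf("C(0,0)-diagnostic: %s\n", buf);
    return 0;
}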
For simplicity, I look specifically at the decimal expansion of one matrix element (0,0). With gcc, I get
$ gcc -o matmul_double -O3 -mavx -std=c99 matmul_double.c -lm
$ ./matmul_double
Matrix size: 256 by 256 (double precision).
Time: 0.010 seconds (3355.4 MFLOP/s)
C(0,0)-diagnostic: -0.0939652685936642
$ gcc -o matmul_quad -std=c99 -O3 -mavx matmul_quad.c -lquadmath -lm
$ ./matmul_quad
Matrix size: 256 by 256 (quad precision).
Time: 4.140 seconds (8.1 MFLOP/s)
C(0,0)-diagnostic: -0.093965268593662358578620940776
which confirms that __float128 works in gcc, but with a significant speed penalty. In this case, the difference in runtime is around 400x.
Does Intel’s C compiler fare better?
$ icc -o matmul_double -std=c99 -O3 -xavx matmul_double.c -lm
$ ./matmul_double
Matrix size: 256 by 256 (double precision).
Time: 0.000 seconds (inf MFLOP/s)
C(0,0)-diagnostic: -0.0939652685936624
$ icc -o matmul_quad -std=c99 -O3 -mavx -I/software/apps/gcc/4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/include -L/software/apps/gcc/4.7.2/lib64/ matmul_quad.c -lquadmath -lm
$ ./matmul_quad
Matrix size: 256 by 256 (quad precision).
Time: 0.880 seconds (38.1 MFLOP/s)
C(0,0)-diagnostic: -0.093965268593662358578620940776
Yes, it is evident that the quad precision floating-point math runs a few times faster with ICC (roughly 4-5x in this test), even though the same underlying library is used.
If we actually look at the decimals and compare the results, we find something interesting. The quad precision results from GCC and ICC are identical, which is reassuring, but the double precision binary compiled by ICC delivers a more accurate result: it matches the quad precision value correctly rounded to 15 digits, despite using very high optimization.
GCC (double)  -0.09396526859366|42
ICC (double)  -0.09396526859366|24
GCC (quad)    -0.09396526859366|2358578620940776
ICC (quad)    -0.09396526859366|2358578620940776
Lowering GCC’s optimization did not help, and gcc is not supposed to introduce any unsafe mathematical operations unless they are explicitly enabled by -ffast-math or -Ofast in the first place, so it is unclear to me where the difference comes from. In practice, fluctuations in the last decimal are not a problem for many codes; plenty of them run fine even with gcc’s -ffast-math. But if you are struggling with ill-conditioned matrices, every decimal could count.
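To illustrate why flags like -ffast-math can perturb the last digits (my own toy example, unrelated to the benchmark): they allow the compiler to reassociate floating-point additions, and the order of additions changes the rounding:

#include <stdio.h>

int main(void)
{
    double a = 1.0e16, b = -1.0e16, c = 1.0;

    /* Left-to-right: (a + b) cancels exactly, then adding c gives 1 */
    printf("%.17g\n", (a + b) + c);

    /* Reassociated: c is absorbed when added to the huge b, so the result is 0 */
    printf("%.17g\n", a + (b + c));
    return 0;
}

Here the two orderings give 1 and 0; in a long dot product like a matrix element, the effect is normally confined to the last few digits.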