Which processor helps with the floating point calculation

Calculating the GFlops performance of processors

In the article "Benchmarking like the pros" in c’t 11/20, you present the Flops program. How can I check whether my processor is performing at its best?

Flops measures the floating-point performance of desktop, notebook and server processors using various modern instruction set extensions such as AVX (Advanced Vector Extensions) and FMA3 (fused multiply-add with three operands). CPU performance is typically compared at double precision with 64-bit values (FP64, double precision / DP) and stated in Flops (floating-point operations per second).

This value can be checked, because the theoretical maximum floating-point performance of a CPU is easy to calculate. The formula is: number of cores × clock frequency in GHz × floating-point operations per clock cycle = computing power in GFlops. Only physical cores count; the additional logical cores provided by SMT or Hyper-Threading are not included. With modern CPUs, however, the clock frequency is not that easy to determine, because the turbo clock fluctuates depending on the utilization of the execution units and the available thermal budget. Tools such as CPU-Z or HWInfo64 show the current clock rate; the display in the Windows 10 Task Manager is too unreliable for this. On the twelve-core Ryzen 9 3900X, the clock frequency at full load with AVX2/FMA3 was around 4.15 GHz.
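The formula above can be written as a small helper. This is only a minimal sketch: the function name `peak_gflops` and the example CPU (8 cores, 4.0 GHz, 16 operations per cycle) are hypothetical and chosen for illustration, not taken from a measurement.

```python
def peak_gflops(physical_cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak: physical cores x clock in GHz x FP operations per cycle."""
    if physical_cores <= 0 or clock_ghz <= 0 or flops_per_cycle <= 0:
        raise ValueError("all inputs must be positive")
    return physical_cores * clock_ghz * flops_per_cycle


# Hypothetical CPU: 8 physical cores, 4.0 GHz sustained under load,
# 16 FP64 operations per cycle and core.
print(f"{peak_gflops(8, 4.0, 16):.1f} GFlops")  # prints "512.0 GFlops"
```

The clock value should be the frequency actually sustained under the benchmark load, not the nominal base or maximum turbo clock.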

The architecture determines the number of operations per clock cycle: each of the two 256-bit-wide FMA units of a Zen 2 core works on four FP64 values at once, and a fused multiply-add counts as two operations per value, so together they reach 16 FP64 operations per cycle. This value also applies to Intel's desktop and notebook CPUs from the Core i-4000 to the Core i-10000 series as well as to the Ice Lake mobile processors of the Core i-1000G series. The latter have only one FMA unit, but with AVX-512 it can execute instructions that are twice as wide. The Core X CPUs achieve 32 Flops per clock with two such units. Ryzen processors of the first two generations (Zen, Zen+) manage only 8 FP64 operations per clock.
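The per-cycle figures follow from the number of FMA units and their vector width. The sketch below illustrates that arithmetic; the function name `fp_ops_per_cycle` is hypothetical, and the "two 128-bit units" entry for Zen/Zen+ is an assumption made here to reproduce the stated 8 operations, not something the text spells out.

```python
def fp_ops_per_cycle(fma_units: int, vector_bits: int, element_bits: int = 64) -> int:
    """FP operations per cycle and core for a given FMA configuration."""
    lanes = vector_bits // element_bits  # values processed per instruction
    return fma_units * lanes * 2         # an FMA counts as multiply + add


# (FMA units, vector width in bits) per core
configs = {
    "Zen 2 (AVX2)": (2, 256),
    "Ice Lake mobile (AVX-512)": (1, 512),
    "Core X (AVX-512)": (2, 512),
    "Zen / Zen+": (2, 128),  # assumption: two 128-bit units
}

for name, (units, width) in configs.items():
    print(f"{name}: {fp_ops_per_cycle(units, width)} FP64 operations per cycle")
```

Passing `element_bits=32` gives the corresponding single-precision (FP32) figure, which is twice as high for the same hardware.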

If you multiply the 12 cores of the Ryzen 9 3900X by the 4.15 GHz clock and 16 operations per clock, you get 796.8 GFlops, which agrees closely with our measured value of 788 billion floating-point operations per second. (chh)
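To judge how close a measurement comes to the theoretical limit, the two values can be compared directly. The following sketch uses the Ryzen 9 3900X figures from the text; the helper name `efficiency` is hypothetical.

```python
def efficiency(measured_gflops: float, peak_gflops: float) -> float:
    """Fraction of the theoretical peak that the benchmark actually reached."""
    return measured_gflops / peak_gflops


# Ryzen 9 3900X: 12 cores x 4.15 GHz x 16 FP64 operations per cycle
peak = 12 * 4.15 * 16
measured = 788.0  # GFlops, value reported by the Flops benchmark

print(f"peak {peak:.1f} GFlops, measured {measured:.1f} GFlops, "
      f"efficiency {efficiency(measured, peak):.1%}")
# prints "peak 796.8 GFlops, measured 788.0 GFlops, efficiency 98.9%"
```

A result well below the calculated peak suggests that the assumed clock rate is not being sustained or that the benchmark is not using the widest available instruction set.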