id: work_j6zirmtczrav3e56ilniy62vbi
author: Massimiliano Fasi
title: Numerical behavior of NVIDIA tensor cores
date: 2021
pages: 19
extension: .pdf
mime: application/pdf
words: 10811
sentences: 1223
flesch: 64
summary: We explore the floating-point arithmetic implemented in the NVIDIA tensor cores, Is the result of each floating-point operation in (2) normalized, or do tensor cores only available, in order to guarantee reproducibility and facilitate testing other matrix multiply-accumulate units, such as the third generation tensor cores in the latest NVIDIA A100 Rounding modes in tensor core computations Tests for determining what rounding modes are used in the inner products and the final rounding Features of the accumulator Tests that explore the number of extra bits in the alignment step of floating-point addition inside 2. Can tensor cores take binary32 subnormal numbers as inputs for C in (2) without 3. Can tensor cores compute subnormal numbers from normal numbers and return them? When tensor cores are used in binary16 mode, the result computed in the format of
cache: ./cache/work_j6zirmtczrav3e56ilniy62vbi.pdf
txt: ./txt/work_j6zirmtczrav3e56ilniy62vbi.txt