Tensor Cores maji presnost FP16/32 - viz. AT - "These cores are essentially a mass collection of ALUs for performing 4x4 Matrix operations; specifically a fused multiply add (A*B+C), multiplying two 4x4 FP16 matrices together, and then adding that result to an FP16 or FP32 4x4 matrix to generate a final 4x4 FP32 matrix."
Cela V100 ma ve skutecnosti 5376 (Cuda) + 672 (Tensor) = 6048 jader, protoze Tensor Cores jsou samostatna. Operace, ktere Tensor Cores delaji, jsou presne ty, co se pouzivaji ve strojovem uceni (konkretne deep learning). Na to ma byt nova Vega primo urcena, ale uprimne, s vykonem jen 22 TFLOPS proti 120 TFLOPS nema zadnou sanci. A to jeste pred tim, nez vubec stihla vyjit. Pro AMD naprosta katastrofa.