GeistHaus

Fast Multidimensional Matrix Multiplication on CPU from Scratch

siboehm.com

NumPy can multiply two 1024x1024 matrices on a 4-core Intel CPU in ~8ms. This is incredibly fast, considering this boils down to 18 FLOPs / core / cycle, with...

1 page links to this URL