Design of scalable dense linear algebra libraries for multithreaded architectures: the LU factorization | IEEE Conference Publication