Hi,
I am trying to speed up the factorization of a dense symmetric indefinite matrix, the size of my matrices is usually between 10 k and 20 k.
I am using LAPACK (dsytrf) and MKL 2018 and I run it on a supercomputer node with two Intel Xeon E5-2680 v3 Haswell CPUs (2 x 12 Cores, 2,5 GHz). I also tried a node with Intel Xeon Phi 7250-F Knights Landing CPU and 68 cores, 1.4 GHz. The problem is that the factorization does not seem to scale very well with the number of threads I use: with up to 8 threads I see some improvement (the run time is halfed) but after that there is even a slowdown.
Is this something that is to be expected from this MKL routine? And if so, do you know of any alternative that scales better?
Thanks
Daniel