Hello, experts,
Assume that I have a 4-core machine where each core has a 2 MB LLC slice and the LLC is inclusive of L2.
1) If I use single-threaded MKL, will the MKL instance use 2 MB of LLC or the full 8 MB?
2) If I use OpenMP threads to control the parallelism, will the MKL instance determine the available LLC based on the number of threads?
Any help is appreciated. Thanks.
Best Regards