Hey everyone,
I couldn't find any old topics that dealt with this question in detail, so here I am asking it again: is there a way to enable FMA math when using the MKL routines? Here is a sample program that, when built with MSVC 2017 against the latest MKL version (details in the output below) and run on an AVX2 processor, DOES NOT appear to use FMA in the MKL call:
#include <stdio.h>
#include <math.h>
#include <mkl.h>

/* Print the MKL version and the code path MKL selected for this CPU. */
void print_mkl_info()
{
    MKLVersion Version;
    mkl_get_version(&Version);
    printf("Major version: %d\n", Version.MajorVersion);
    printf("Minor version: %d\n", Version.MinorVersion);
    printf("Update version: %d\n", Version.UpdateVersion);
    printf("Product status: %s\n", Version.ProductStatus);
    printf("Build: %s\n", Version.Build);
    printf("Platform: %s\n", Version.Platform);
    printf("Processor optimization: %s\n", Version.Processor);
    printf("================================================================\n");
    printf("\n");
}

/* Plain multiply-then-add loop; the compiler decides whether to fuse. */
float standard_dot_product(float* a, float* b)
{
    float c = 0.0f;
    for (int i = 0; i < 4; i++) {
        c = c + (a[i] * b[i]);
    }
    return c;
}

/* Explicit fused multiply-add via fmaf. */
float standard_fma_dot_product(float* a, float* b)
{
    float c = 0.0f;
    for (int i = 0; i < 4; i++) {
        c = fmaf(a[i], b[i], c);
    }
    return c;
}

/* Same dot product through MKL's CBLAS interface. */
float mkl_dot_product(float* a, float* b)
{
    return cblas_sdot(4, a, 1, b, 1);
}

int main()
{
    print_mkl_info();
    float a[4] = { 1.907607, -.7862027, 1.148311, .9604002 };
    float b[4] = { -.9355000, -.6915108, 1.724470, -.7097529 };
    printf("Standard dot product is: %.23f\n", standard_dot_product(a, b));
    printf("Standard FMA dot product is: %.23f\n", standard_fma_dot_product(a, b));
    printf("MKL dot product is: %.23f\n", mkl_dot_product(a, b));
    return 0;
}
The above program outputs the following (compiled with /fp:fast and /O2; note that changing /O2 to /O1 changes the result of the standard_dot_product function, but not of the CBLAS routine):
Major version: 2019
Minor version: 0
Update version: 2
Product status: Product
Build: 20190118
Platform: 32-bit
Processor optimization: Intel(R) Advanced Vector Extensions 2 (Intel(R) AVX2) enabled processors
================================================================

Standard dot product is: 0.05768233537673950195313
Standard FMA dot product is: 0.05768235772848129272461
MKL dot product is: 0.05768233537673950195313
So is there any way to generate results with FMA in such cases? Or am I being a knobhead and missing something?
THANKS!
Swat