Hi, I get a segmentation fault when calling dcg_init. Please help me. Thanks.
KrylovFMM is the subroutine calling the iterative solver. Shelltest1.c is the main file.
I am trying to compile a program using the MKL (11.3, 2016.0.109) libraries with the gfortran (5.1.0) compiler and OpenMPI (1.8.5, compiled against gfortran 5.1.0).
I can successfully compile the program without any errors.
However, when executing my program I end up with this error:
Intel MKL FATAL ERROR: Cannot load symbol MKLMPI_Get_wrappers.
I have searched the Intel site for references regarding this issue to no avail; the reference manual (https://software.intel.com/en-us/mkl-reference-manual-for-fortran) does not supply this kind of information.
For your information my compilation flags are these:
-Wl,--no-as-needed -L/opt/intel/mkl/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64 -lmkl_lapack95_lp64 -lmkl_blas95_lp64 -lmkl_gf_lp64 -lmkl_core -lmkl_sequential
These compile fine, as noted above. I have also tried the flags suggested by the link line advisor (https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor), which behave similarly.
In addition, I have tried passing start-group and end-group to the linker, to no avail.
I have also tried linking the generic library
-lmkl_blacs_openmpi_lp64 -lmkl_blacs_lp64
which did not change anything.
To my knowledge I should be able to link MKL against gfortran, no?
I suspect a compatibility issue, since this is a run-time error.
Note that the same flags work without problems when I compile against OpenMPI with the Intel compiler.
Hi,
My code compiles fine when using Visual Studio 2013, but when converting it to VS15, I get this error:
Error LNK2001 unresolved external symbol _snprintf FEBio2 C:\FEBio2_15\VS2013\mkl_core.lib(mkl_aa_fw_device_threading_params.obj) 1
Any suggestions?
Thanks,
Dave
Hi,
There seems to be a bug in the Intel MKL library version 11.3 with the IFFT. The test program below (sorry, the attach feature doesn't work) performs the FFT on an input buffer and then the IFFT on the returned buffer. The result of the IFFT is completely different from the original input buffer. The problem was not present in Intel MKL version 10.3. It occurs on both Linux and Mac OS X, and the same code works fine with other FFTW3 implementations. Is this a known problem? Am I missing something?
Let me know if you need more information.
Etienne
#include <stdio.h>
#include <fftw/fftw3.h>
#include <unistd.h>
#include <string.h>
float input[] = {
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.98, 0.93, 0.93, 0.98, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.93, 0.75, 0.64, 0.57, 0.57, 0.64, 0.75, 0.93, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.87, 0.64, 0.43, 0.00, 0.00, 0.00, 0.00, 0.43, 0.64, 0.87, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.93, 0.64, 0.36, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.36, 0.64, 0.93, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.75, 0.43, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.43, 0.75, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.98, 0.64, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.64, 0.98, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.93, 0.57, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.57, 0.93, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.93, 0.57, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.57, 0.93, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.98, 0.64, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.64, 0.98, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.75, 0.43, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.43, 0.75, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.93, 0.64, 0.36, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.36, 0.64, 0.93, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.87, 0.64, 0.43, 0.00, 0.00, 0.00, 0.00, 0.43, 0.64, 0.87, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.93, 0.75, 0.64, 0.57, 0.57, 0.64, 0.75, 0.93, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.98, 0.93, 0.93, 0.98, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00,
1.00, 1.00, 1.00, 1.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00,
0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00
};
void print_float( float* buf, size_t w, size_t h, float divideby )
{
printf("{\n");
for (int j=0; j < h; ++j) {
for (int i=0; i < w; ++i) {
if ( i == 0 )
printf( " " );
printf("%.2f, ", buf[i + (j * w)]/divideby);
}
printf("\n");
}
printf("};\n");
}
void print_complex( fftwf_complex* buf, size_t w, size_t h )
{
printf("{\n");
for (int j=0; j < h; ++j) {
for (int i=0; i < w; ++i) {
if ( i == 0 )
printf( " " );
printf("(%.2f, %.2f), ", buf[i + (j * w)][0], buf[i + (j * w)][1]);
}
printf("\n");
}
printf("};\n");
}
int main()
{
size_t height = 46;
size_t width = 46;
size_t cwidth = width/2+1;
float* in = (float*)fftwf_malloc( sizeof( float ) * width * height );
memcpy( in, input, width * height * sizeof( float ) );
printf("INPUT\n");
print_float( in, width, height, 1.0f );
fftwf_complex* out =
(fftwf_complex*)fftwf_malloc( sizeof( fftwf_complex ) * cwidth * height );
fftwf_plan p_r2c_ =
fftwf_plan_dft_r2c_2d(
height, width, in, out, FFTW_ESTIMATE
);
fftwf_plan p_c2r_ =
fftwf_plan_dft_c2r_2d(
height, width, out, in, FFTW_ESTIMATE
);
fftwf_execute_dft_r2c(p_r2c_, in, out);
printf("\nINPUT => FFT\n");
print_complex( out, cwidth, height );
fftwf_execute_dft_c2r(p_c2r_, out, in);
printf("\nINPUT => FFT => IFFT\n");
print_float( in, width, height, 2116.0f );
}
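For anyone comparing the round trip above: FFTW-style interfaces compute unnormalized transforms, so a forward transform followed by a backward transform is expected to return the input scaled by the number of points (46 x 46 = 2116 here, which is the divisor passed to the last print_float call). The sketch below only illustrates that scaling rule. It is a naive 1-D DFT written for this note, not FFTW or MKL code, and the names naive_dft and round_trip are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive unnormalized DFT: sign = -1 for forward, +1 for backward.
// Hypothetical helper for illustration; it is not part of FFTW or MKL.
std::vector<std::complex<double>> naive_dft(
    const std::vector<std::complex<double>>& x, int sign)
{
    assert(sign == 1 || sign == -1);
    const std::size_t n = x.size();
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<std::complex<double>> y(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            // Reduce j*k modulo n to keep the angle small and accurate.
            const double angle = sign * two_pi * double((j * k) % n) / double(n);
            sum += x[j] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        y[k] = sum;
    }
    return y;
}

// Forward then backward transform, divided by n to undo the scaling.
std::vector<double> round_trip(const std::vector<double>& in)
{
    std::vector<std::complex<double>> x(in.begin(), in.end());
    const std::vector<std::complex<double>> back = naive_dft(naive_dft(x, -1), +1);
    std::vector<double> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = back[i].real() / double(in.size());
    return out;
}
```

If a round trip through a real library still differs from the input after dividing by the point count, the scaling is not the cause and the plan or buffer arguments are the next thing to check.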
Hello all,
I am a first time MKL user trying to use the library to fit a 3rd-order 2-d polynomial function to f(x,y). This algorithm works using pretty much the same exact approach in Python so I believe it to be conceptually sound.
I'm trying to use LAPACKE_dgelsy but my program dies whenever it's called, and I'm sure my function arguments are incorrect.
I created my A matrix to be 10 x 60,000, with each of the 10 rows representing a coefficient for {1, x, y, x^2, xy, y^2, etc.} at each index of data, and B[i] = f(x[i],y[i]) (B is size 1 x 60,000). dgelsy only takes a single pointer, so I've had "a" take the form of the array:
double * a = new double[n * 10];
and I just concatenated each row of data one after the other into this array. I'm under the impression that dgelsy's argument values "LAPACK_ROW_MAJOR","m", "n", and "nrhs" will order the data in a more typical A*x=B, m x n, matrix form for all the calculations (at least conceptually, not in actual memory). Do I have the dimensions and input form correct in this?
The rest of the code looks like this so far:
lapack_int matrixLayout = LAPACK_ROW_MAJOR;
lapack_int n = 60000;
lapack_int lda = n;
lapack_int ldb = n;
m = 10;
lapack_int nrhs = n;
int *jpvt = new int[n]; //No idea what this is
lapack_int rcond = -1; //This either
int *rank = new int[n]; // This is supposed to be an output variable, how should I initialize it?
int LapackResult = LAPACKE_dgelsy(matrixLayout, m, n, nrhs, a, lda, b, ldb, jpvt, rcond, rank);
It crashes the program at the last line obviously. I've looked at all the documentation and I can't figure out what it wants for the arguments: jpvt, rcond, and rank. Could anyone give me more info about what these mean?
Finally, when I do get dgelsy to work, how can I interpret the output (i.e. rank pointer) to find the coefficients of my fitted polynomial?
Sorry if I'm coming at this completely wrong. I've gotten the IPP library to work very well for me so I've been taking some of the things I learned there and trying to apply them here to no avail.
I appreciate any help you can offer!
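A note on the layout question, offered as a sketch and not a definitive answer. In the LAPACKE interface, m is the number of rows of A (one per data sample), n is the number of columns (the 10 polynomial terms), and nrhs is the number of right-hand-side columns (1 for a single f). jpvt is an array of n integers (all zeros marks every column as free), rcond is a double, rank points to a single integer, and the fitted coefficients come back in the first n entries of b. The code below shows only the row-major design-matrix layout. As a stand-in for dgelsy it solves the normal equations with Gaussian elimination, which is a different and less robust method; design_matrix and fit_poly are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

const int kTerms = 10;  // 1, x, y, x^2, xy, y^2, x^3, x^2 y, x y^2, y^3

// Build the m-by-10 design matrix in row-major order: row i holds the ten
// monomials evaluated at sample i, so element (i, j) is a[i*kTerms + j].
std::vector<double> design_matrix(const std::vector<double>& x,
                                  const std::vector<double>& y)
{
    assert(x.size() == y.size());
    std::vector<double> a(x.size() * kTerms);
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double u = x[i], v = y[i];
        const double row[kTerms] = {1.0, u, v, u*u, u*v, v*v,
                                    u*u*u, u*u*v, u*v*v, v*v*v};
        for (int j = 0; j < kTerms; ++j) a[i * kTerms + j] = row[j];
    }
    return a;
}

// Stand-in solver (NOT dgelsy): form the normal equations (A^T A) c = A^T f
// and solve them by Gaussian elimination with partial pivoting.
std::vector<double> fit_poly(const std::vector<double>& x,
                             const std::vector<double>& y,
                             const std::vector<double>& f)
{
    assert(f.size() == x.size());
    const std::vector<double> a = design_matrix(x, y);
    double g[kTerms][kTerms + 1] = {};  // augmented system [A^T A | A^T f]
    for (std::size_t i = 0; i < x.size(); ++i)
        for (int r = 0; r < kTerms; ++r) {
            for (int c = 0; c < kTerms; ++c)
                g[r][c] += a[i * kTerms + r] * a[i * kTerms + c];
            g[r][kTerms] += a[i * kTerms + r] * f[i];
        }
    for (int p = 0; p < kTerms; ++p) {          // forward elimination
        int best = p;
        for (int r = p + 1; r < kTerms; ++r)
            if (std::fabs(g[r][p]) > std::fabs(g[best][p])) best = r;
        for (int c = 0; c <= kTerms; ++c) std::swap(g[p][c], g[best][c]);
        for (int r = p + 1; r < kTerms; ++r) {
            const double factor = g[r][p] / g[p][p];
            for (int c = p; c <= kTerms; ++c) g[r][c] -= factor * g[p][c];
        }
    }
    std::vector<double> coef(kTerms);
    for (int r = kTerms - 1; r >= 0; --r) {     // back substitution
        double s = g[r][kTerms];
        for (int c = r + 1; c < kTerms; ++c) s -= g[r][c] * coef[c];
        coef[r] = s / g[r][r];
    }
    return coef;
}
```

With this layout the matrix handed to a row-major least-squares routine would be 60,000 x 10 (lda = 10), not 10 x 60,000, and the right-hand side would be a single column of 60,000 values.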
I am trying to solve a sparse system of equations but I am getting this error "MKL-DSS-DSS-Error, reordering problem" without any more specifics on what the error is. This is what my code looks like:
#include <math.h>
#include <stdio.h>
#include "mkl_dss.h"
#include "mkl_types.h"
/*
** Define the array and rhs vectors
*/
//#define NROWS 5
//#define NCOLS 5
//#define NNONZEROS 9
//#define NRHS 1
#if defined(MKL_ILP64)
#define MKL_INT long long
#else
#define MKL_INT int
#endif
//static const MKL_INT nRows = NROWS ;
//static const MKL_INT nCols = NCOLS ;
//static const MKL_INT nNonZeros = NNONZEROS ;
//static const MKL_INT nRhs = NRHS ;
//static _INTEGER_t rowIndex[NROWS+1] = { 1, 6, 7, 8, 9, 10 };
//static _INTEGER_t columns[NNONZEROS] = { 1, 2, 3, 4, 5, 2, 3, 4, 5 };
//static _DOUBLE_PRECISION_t values[NNONZEROS] = { 9, 1.5, 6, .75, 3, 0.5, 12, .625, 16 };
//static _DOUBLE_PRECISION_t rhs[NCOLS] = { 1, 2, 3, 4, 5 };
void mkl_dss(double* K, int* col_ind, int* row_ptr, double* F, int nn, int nnz) {
MKL_INT nRows = 2*nn;
MKL_INT nCols = 2*nn;
MKL_INT nNonZeros = nnz;
MKL_INT nRhs = 1;
MKL_INT i;
_INTEGER_t *rowIndex, *columns;
_DOUBLE_PRECISION_t *values, *rhs, *solValues;
rowIndex = new _INTEGER_t[nRows + 1];
columns = new _INTEGER_t[nnz];
values = new _DOUBLE_PRECISION_t[nnz];
rhs = new _DOUBLE_PRECISION_t[nCols];
solValues = new _DOUBLE_PRECISION_t[nCols];
for(i = 0; i < nnz; i++){
values[i] = K[i];
columns[i] = col_ind[i];
}
for(i = 0; i < nRows + 1; i++)
rowIndex[i] = row_ptr[i];
for(i = 0; i < nCols; i++){
rhs[i] = F[i];
solValues[i] = 0.0;
}
/* Allocate storage for the solver handle and the right-hand side. */
// _DOUBLE_PRECISION_t solValues[NROWS];
_MKL_DSS_HANDLE_t handle;
_INTEGER_t error;
_CHARACTER_t statIn[] = "determinant";
_DOUBLE_PRECISION_t statOut[5];
MKL_INT opt = MKL_DSS_DEFAULTS;
MKL_INT sym = MKL_DSS_SYMMETRIC;
MKL_INT type = MKL_DSS_POSITIVE_DEFINITE;
/* --------------------- */
/* Initialize the solver */
/* --------------------- */
error = dss_create(handle, opt );
if ( error != MKL_DSS_SUCCESS ) goto printError;
/* ------------------------------------------- */
/* Define the non-zero structure of the matrix */
/* ------------------------------------------- */
error = dss_define_structure(
handle, sym, rowIndex, nRows, nCols,
columns, nNonZeros );
if ( error != MKL_DSS_SUCCESS ) goto printError;
/* ------------------ */
/* Reorder the matrix */
/* ------------------ */
error = dss_reorder( handle, opt, 0);
if ( error != MKL_DSS_SUCCESS ) goto printError;
/* ------------------ */
/* Factor the matrix */
/* ------------------ */
error = dss_factor_real( handle, type, values );
if ( error != MKL_DSS_SUCCESS ) goto printError;
/* ------------------------ */
/* Get the solution vector */
/* ------------------------ */
error = dss_solve_real( handle, opt, rhs, nRhs, solValues );
if ( error != MKL_DSS_SUCCESS ) goto printError;
/* ------------------------ */
/* Get the determinant (not for a diagonal matrix) */
/*--------------------------*/
if ( nRows < nNonZeros ) {
error = dss_statistics(handle, opt, statIn, statOut);
if ( error != MKL_DSS_SUCCESS ) goto printError;
/*-------------------------*/
/* print determinant */
/*-------------------------*/
printf(" determinant power is %g \n", statOut[0]);
printf(" determinant base is %g \n", statOut[1]);
printf(" Determinant is %g \n", (pow(10.0,statOut[0]))*statOut[1]);
}
/* -------------------------- */
/* Deallocate solver storage */
/* -------------------------- */
error = dss_delete( handle, opt );
if ( error != MKL_DSS_SUCCESS ) goto printError;
/* ---------------------- */
/* Print solution vector */
/* ---------------------- */
printf(" Solution array: ");
for(i = 0; i< nCols; i++)
printf(" %g", solValues[i] );
printf("\n");
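printError:
/* This label is also reached on success, so only report real errors. */
if ( error != MKL_DSS_SUCCESS )
printf("Solver returned error code %d\n", error);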
for(i = 0; i < nCols; i++)
F[i] = solValues[i];
delete[] rowIndex;
delete[] columns;
delete[] values;
delete[] rhs;
delete[] solValues;
}
The same code works for the default example problem, so I know the sparse solver compiles and links correctly, but it fails when I use my own arrays. I have a hunch that opt = MKL_DSS_DEFAULTS is not working as it should. Should I be using something different? I have used Fortran-style indexing (starting from 1) for the column and row index vectors. I have also tried zero-based indexing with opt = MKL_DSS_MSG_LVL_WARNING + MKL_DSS_TERM_LVL_ERROR + MKL_DSS_ZERO_BASED_INDEXING, which also did not work.
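One thing worth ruling out before suspecting the options is the matrix structure itself. My understanding of the MKL sparse storage documentation, which should be treated as an assumption here, is that with MKL_DSS_SYMMETRIC the solver expects only the upper triangle (column >= row), column indices in ascending order within each row, and a rowIndex array of nRows + 1 entries whose last entry is nnz + base. The checker below is a hypothetical helper, not an MKL routine. It tests those conditions on the arrays before they are passed to dss_define_structure.

```cpp
#include <cassert>
#include <vector>

// Hypothetical checker, not an MKL routine. base is 1 for Fortran-style
// indexing and 0 for zero-based indexing.
// Returns 0 if the structure looks consistent, otherwise:
//   1 = rowIndex has the wrong length, or does not start at base / end at nnz + base
//   2 = rowIndex decreases or points past the end of columns
//   3 = column index out of range
//   4 = entry below the diagonal (lower triangle stored)
//   5 = columns not strictly ascending within a row
int check_symmetric_csr(const std::vector<int>& rowIndex,
                        const std::vector<int>& columns,
                        int nRows, int base)
{
    assert(base == 0 || base == 1);
    const int nnz = static_cast<int>(columns.size());
    if (nRows < 0 || static_cast<int>(rowIndex.size()) != nRows + 1) return 1;
    if (rowIndex[0] != base || rowIndex[nRows] != nnz + base) return 1;
    for (int r = 0; r < nRows; ++r) {
        const int begin = rowIndex[r] - base;
        const int end = rowIndex[r + 1] - base;
        if (begin > end || end > nnz) return 2;
        for (int k = begin; k < end; ++k) {
            const int c = columns[k] - base;  // zero-based column
            if (c < 0 || c >= nRows) return 3;
            if (c < r) return 4;
            if (k > begin && columns[k] <= columns[k - 1]) return 5;
        }
    }
    return 0;
}
```

The commented-out example arrays in the listing above (rowIndex = 1, 6, 7, 8, 9, 10 with columns = 1, 2, 3, 4, 5, 2, 3, 4, 5) pass this check with base 1. A full matrix with both triangles stored, which is a common output of finite-element assembly, would fail with code 4.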
Fortran and C applications that make calls to DSS (Direct Sparse Solver) produce huge EXE files on Windows when recent versions of Intel compilers and MKL are used, if static libraries are selected (/MT) rather than dynamic libraries (/MD). I noticed this problem when there was a noticeable delay before such a program, built from 35 lines of source code, ran and gave results from solving a set of five simultaneous linear equations. I happened to look at the HDD activity lights on my laptop, and then I checked the EXE size (all values given are byte counts).
Compiler | EXE Size (/MD) | EXE Size (/MT) | NOTE |
---|---|---|---|
CVF 6.6 | 36,864 | 831,488 | 32-bit, CXML |
11.1.70 | 14,336 | 7,812,608 | 32-bit |
14.0.4 | 18,944 | 44,312,064 | LP64 |
15.0.4 | 19,968 | 53,190,656 | LP64 |
16.0.0 | 23,040 | 61,577,216 | LP64 |
Even if it is felt that nothing is amiss, it would be nice for users to be aware of the huge EXE sizes produced when DSS routines are called and static libraries are used.
Here is the test program, which is a simplified version of one of the examples provided with MKL.
      PROGRAM DSS_test
      USE mkl_dss
      IMPLICIT NONE
C
      INTEGER, PARAMETER :: nRows=5, nCols=5, nNonZeros=9, nRhs=1
      INTEGER :: rowIndex(nRows + 1) = [ 1, 6, 7, 8, 9, 10 ]
      INTEGER :: columns(nNonZeros) = [ 1, 2, 3, 4, 5, 2, 3, 4, 5 ]
      DOUBLE PRECISION ::
     1 values(nNonZeros) = [9.,1.5,6.,.75,3.,0.5,12.,0.625,16.],
     2 rhs(nRows) = [ 1., 2., 3., 4., 5. ]
      DOUBLE PRECISION solution(nRows)
      INTEGER*8 handle
      INTEGER i, error, buf,idum(1)
C
      error = dss_create(handle, MKL_DSS_DEFAULTS)
      IF (error .NE. MKL_DSS_SUCCESS ) GO TO 999
C
      error = dss_define_structure( handle, MKL_DSS_SYMMETRIC,&
              rowIndex, nRows, nCols, columns, nNonZeros )
      IF (error .NE. MKL_DSS_SUCCESS ) GO TO 999
      error = dss_reorder( handle, MKL_DSS_DEFAULTS, idum)
      IF (error .NE. MKL_DSS_SUCCESS ) GO TO 999
      error = dss_factor_real( handle, MKL_DSS_DEFAULTS, VALUES)
      IF (error .NE. MKL_DSS_SUCCESS ) GO TO 999
      error = dss_solve_real( handle, MKL_DSS_DEFAULTS, rhs, nRhs,&
              solution)
      IF (error .NE. MKL_DSS_SUCCESS ) GO TO 999
      error = dss_delete( handle, MKL_DSS_DEFAULTS )
      IF (error .NE. MKL_DSS_SUCCESS ) GO TO 999
      WRITE(*,'(10ES12.4)') (solution(i), i = 1, nCols)
      STOP
  999 WRITE(*,*) "DSS error code ", error
      STOP 1
 1000 END PROGRAM
For speeding up the compilations, I used the following module, compiled once, instead of having include 'mkl_dss.fi' in my program source:
      module mkl_dss
c
      implicit none
      include 'mkl_dss.fi'
c
      end module
Hi,
I am using MKL (the student version) with MPICH2. In my Makefile, the paths for MKL are hardcoded. How can I make them more general? My professor will check the project, so assuming he has MKL installed on his system, how can he compile it? I would like to provide a Makefile that is (almost) ready to run.
OBJSDIR = obj
OBJS = $(OBJSDIR)/main.o $(OBJSDIR)/IO.o $(OBJSDIR)/alloc.o $(OBJSDIR)/communication.o $(OBJSDIR)/accuracy.o
SOURCE = main.cpp src/IO.cpp src/alloc.cpp src/communication.cpp src/accuracy.cpp
HEADER = headers/IO.h headers/alloc.h headers/communication.h headers/accuracy.h
OUT = test
CXX = ../../mpich-install/bin/mpic++
CXXFLAGS = -I../../intel/mkl/include -Wl,--start-group -Wl,--end-group -lpthread -lm -ldl -Wall
LDFLAGS = ../../intel/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group ../../intel/mkl/lib/intel64/libmkl_intel_lp64.a ../../intel/mkl/lib/intel64/libmkl_core.a ../../intel/mkl/lib/intel64/libmkl_sequential.a -Wl,--end-group ../../intel/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -lpthread -lm -ldl
all: $(OBJSDIR) $(OUT)
$(OBJSDIR):
	mkdir $(OBJSDIR)
$(OUT): $(OBJS)
	$(CXX) $(OBJS) -o $(OUT) $(CXXFLAGS) $(LDFLAGS)
# make -f Makefile clean
# create/compile the individual files >>separately<<
$(OBJSDIR)/main.o: main.cpp
	$(CXX) -c main.cpp $(CXXFLAGS) -o $@
$(OBJSDIR)/IO.o: src/IO.cpp
	$(CXX) -c src/IO.cpp $(CXXFLAGS) -o $@
$(OBJSDIR)/alloc.o: src/alloc.cpp
	$(CXX) -c src/alloc.cpp $(CXXFLAGS) -o $@
$(OBJSDIR)/communication.o: src/communication.cpp
	$(CXX) -c src/communication.cpp $(CXXFLAGS) -o $@
$(OBJSDIR)/accuracy.o: src/accuracy.cpp
	$(CXX) -c src/accuracy.cpp $(CXXFLAGS) -o $@
.PHONY: clean
clean:
	rm -rf $(OBJSDIR)/*.o
So, how to modify CXXFLAGS and LDFLAGS? The linker advisor didn't help much.
Dear Forum,
I am trying to make MKL accelerate a matrix multiplication for me. It works, but MKL insists on doing it with a single thread. I played around a bit. But regardless of what I do - even when multiplying two randomly initialized 10000x10000 matrices - MKL does not use multiple threads. Am I missing something?
Function:
BLAS sgemm, via libmkl_rt.so
Environment settings:
MKL_NUM_THREADS=8; export MKL_NUM_THREADS
OMP_NUM_THREADS=8; export OMP_NUM_THREADS
MKL_DYNAMIC=FALSE; export MKL_DYNAMIC
OMP_DYNAMIC=FALSE; export OMP_DYNAMIC
MKL_DOMAIN_NUM_THREADS=MKL_DOMAIN_ALL,8; export MKL_DOMAIN_NUM_THREADS
OMP_DOMAIN_NUM_THREADS=MKL_DOMAIN_ALL,8; export OMP_DOMAIN_NUM_THREADS
Machine:
LinuxMint 17.2, Kernel: 3.19.0-26, CPU: 4th gen. i7, HT activated in BIOS
Hello! I don't see MKL listed in my available downloads for "Community Licensing for Intel® Performance Libraries for OS X". It appears to be available for Linux and Windows, though. Will MKL be made available for OS X?
Thanks!
Tim
I am trying to use the dsyev routine. I modified your default example slightly to solve for the matrix:
2,1,3,
1,2,3,
3,3,20
I am getting:
Eigenvalues
1.00 2.00 21.00
Eigenvectors (stored columnwise)
0.71 0.69 0.16
-0.71 0.69 0.16
0.00 -0.23 0.97
I tried again with the matrix:
1,2
2,1
Eigenvalues
-1.00 3.00
Eigenvectors (stored columnwise)
-0.71 0.71
0.71 0.71
The eigenvalues are fine, but why are the eigenvectors wrong?
According to this and this the eigenvectors should have been:
[-1 , 1 , 0] , [-3,-3,1] , [1,1,6]
and [1,-1] , [1,1]
/*******************************************************************************
* Copyright (C) 2009-2015 Intel Corporation. All Rights Reserved.
* The information and material ("Material") provided below is owned by Intel
* Corporation or its suppliers or licensors, and title to such Material remains
* with Intel Corporation or its suppliers or licensors. The Material contains
* proprietary information of Intel or its suppliers and licensors. The Material
* is protected by worldwide copyright laws and treaty provisions. No part of
* the Material may be copied, reproduced, published, uploaded, posted,
* transmitted, or distributed in any way without Intel's prior express written
* permission. No license under any patent, copyright or other intellectual
* property rights in the Material is granted to or conferred upon you, either
* expressly, by implication, inducement, estoppel or otherwise. Any license
* under such intellectual property rights must be express and approved by Intel
* in writing.
*
********************************************************************************
*/
/*
   LAPACKE_dsyev Example.
   ======================
   Program computes all eigenvalues and eigenvectors of a real symmetric
   matrix A:
     1.96  -6.49  -0.47  -7.20  -0.65
    -6.49   3.80  -6.39   1.50  -6.34
    -0.47  -6.39   4.17  -1.51   2.67
    -7.20   1.50  -1.51   5.70   1.80
    -0.65  -6.34   2.67   1.80  -7.10
   Description.
   ============
   The routine computes all eigenvalues and, optionally, eigenvectors of an
   n-by-n real symmetric matrix A. The eigenvector v(j) of A satisfies
   A*v(j) = lambda(j)*v(j)
   where lambda(j) is its eigenvalue. The computed eigenvectors are
   orthonormal.
   Example Program Results.
   ========================
 LAPACKE_dsyev (row-major, high-level) Example Program Results
 Eigenvalues
 -11.07  -6.23   0.86   8.87  16.09
 Eigenvectors (stored columnwise)
  -0.30  -0.61   0.40  -0.37   0.49
  -0.51  -0.29  -0.41  -0.36  -0.61
  -0.08  -0.38  -0.66   0.50   0.40
   0.00  -0.45   0.46   0.62  -0.46
  -0.80   0.45   0.17   0.31   0.16
*/
#include <stdlib.h>
#include <stdio.h>
#include "mkl_lapacke.h"
/* Auxiliary routines prototypes */
extern void print_matrix( char* desc, MKL_INT m, MKL_INT n, double* a, MKL_INT lda );
/* Parameters */
#define N 2
#define LDA N
/* Main program */
int main() {
    /* Locals */
    MKL_INT n = N, lda = LDA, info;
    /* Local arrays */
    double w[N];
    double a[LDA*N] = {
        1,2,
        2,1
    };
    /* Executable statements */
    printf( "LAPACKE_dsyev (row-major, high-level) Example Program Results\n" );
    /* Solve eigenproblem */
    info = LAPACKE_dsyev( LAPACK_ROW_MAJOR, 'V', 'U', n, a, lda, w );
    /* Check for convergence */
    if( info > 0 ) {
        printf( "The algorithm failed to compute eigenvalues.\n" );
        exit( 1 );
    }
    /* Print eigenvalues */
    print_matrix( "Eigenvalues", 1, n, w, 1 );
    /* Print eigenvectors */
    print_matrix( "Eigenvectors (stored columnwise)", n, n, a, lda );
    exit( 0 );
} /* End of LAPACKE_dsyev Example */
/* Auxiliary routine: printing a matrix */
void print_matrix( char* desc, MKL_INT m, MKL_INT n, double* a, MKL_INT lda ) {
    MKL_INT i, j;
    printf( "\n %s\n", desc );
    for( i = 0; i < m; i++ ) {
        for( j = 0; j < n; j++ ) printf( " %6.2f", a[i*lda+j] );
        printf( "\n" );
    }
}
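A point that may help with the comparison: an eigenvector is only defined up to a nonzero scalar multiple, and the description in the listing states that the computed eigenvectors are orthonormal, so each printed column has unit length and an arbitrary overall sign. Scaling the hand-derived vectors to unit length makes the two sets comparable. The sketch below uses hypothetical helper names (normalized, same_eigenvector) and is not a LAPACK call.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scale v to unit Euclidean length (hypothetical helper, not a LAPACK call).
std::vector<double> normalized(const std::vector<double>& v)
{
    double norm2 = 0.0;
    for (double e : v) norm2 += e * e;
    assert(norm2 > 0.0);
    const double inv = 1.0 / std::sqrt(norm2);
    std::vector<double> u(v);
    for (double& e : u) e *= inv;
    return u;
}

// True when a and b lie along the same line, i.e. a = s*b for some s != 0.
// Compares the unit vectors up to an overall sign, within tolerance tol.
bool same_eigenvector(const std::vector<double>& a,
                      const std::vector<double>& b, double tol)
{
    if (a.size() != b.size()) return false;
    const std::vector<double> ua = normalized(a), ub = normalized(b);
    double dot = 0.0;
    for (std::size_t i = 0; i < ua.size(); ++i) dot += ua[i] * ub[i];
    return std::fabs(std::fabs(dot) - 1.0) <= tol;
}
```

For instance, [-1, 1, 0] divided by its length sqrt(2) is about (-0.71, 0.71, 0.00), which is the first printed column with the sign flipped. Because the printed values are rounded to two decimals, a tolerance around 1e-3 is a reasonable choice when comparing against them.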
I have two codes for solving linear equations, one from Intel:
/*******************************************************************************
* Copyright (C) 2009-2015 Intel Corporation. All Rights Reserved.
* The information and material ("Material") provided below is owned by Intel
* Corporation or its suppliers or licensors, and title to such Material remains
* with Intel Corporation or its suppliers or licensors. The Material contains
* proprietary information of Intel or its suppliers and licensors. The Material
* is protected by worldwide copyright laws and treaty provisions. No part of
* the Material may be copied, reproduced, published, uploaded, posted,
* transmitted, or distributed in any way without Intel's prior express written
* permission. No license under any patent, copyright or other intellectual
* property rights in the Material is granted to or conferred upon you, either
* expressly, by implication, inducement, estoppel or otherwise. Any license
* under such intellectual property rights must be express and approved by Intel
* in writing.
*
********************************************************************************
*/
/*
   LAPACKE_dgesv Example.
   ======================

   The program computes the solution to the system of linear
   equations with a square matrix A and multiple
   right-hand sides B, where A is the coefficient matrix:

     6.80  -6.05  -0.45   8.32  -9.67
    -2.11  -3.30   2.58   2.71  -5.14
     5.66   5.36  -2.70   4.35  -7.26
     5.97  -4.44   0.27  -7.17   6.08
     8.23   1.08   9.04   2.14  -6.87

   and B is the right-hand side matrix:

     4.02  -1.56   9.81
     6.19   4.00  -4.09
    -8.22  -8.67  -4.57
    -7.57   1.75  -8.61
    -3.03   2.86   8.99

   Description.
   ============

   The routine solves for X the system of linear equations A*X = B,
   where A is an n-by-n matrix, the columns of matrix B are individual
   right-hand sides, and the columns of X are the corresponding
   solutions.

   The LU decomposition with partial pivoting and row interchanges is
   used to factor A as A = P*L*U, where P is a permutation matrix, L
   is unit lower triangular, and U is upper triangular. The factored
   form of A is then used to solve the system of equations A*X = B.

   Example Program Results.
   ========================

 LAPACKE_dgesv (row-major, high-level) Example Program Results

 Solution
  -0.80  -0.39   0.96
  -0.70  -0.55   0.22
   0.59   0.84   1.90
   1.32  -0.10   5.36
   0.57   0.11   4.04

 Details of LU factorization
   8.23   1.08   9.04   2.14  -6.87
   0.83  -6.94  -7.92   6.55  -3.99
   0.69  -0.67 -14.18   7.24  -5.19
   0.73   0.75   0.02 -13.82  14.19
  -0.26   0.44  -0.59  -0.34  -3.43

 Pivot indices
      5      5      3      4      5
*/
#include <stdlib.h>
#include <stdio.h>
#include "mkl_lapacke.h"

/* Auxiliary routines prototypes */
extern void print_matrix( char* desc, MKL_INT m, MKL_INT n, double* a, MKL_INT lda );
extern void print_int_vector( char* desc, MKL_INT n, MKL_INT* a );

/* Parameters */
#define N 3
#define NRHS 1
#define LDA N
#define LDB NRHS

/* Main program */
int main() {
        /* Locals */
        MKL_INT n = N, nrhs = NRHS, lda = LDA, ldb = LDB, info;
        /* Local arrays */
        MKL_INT ipiv[N];
        double a[LDA*N] = {
            1,1,1,
            1,1,3,
            2,1,1
        };
        double b[LDB*N] = {
            1,
            2,
            3
        };
        /* Executable statements */
        printf( "LAPACKE_dgesv (row-major, high-level) Example Program Results\n" );
        /* Solve the equations A*X = B */
        info = LAPACKE_dgesv( LAPACK_ROW_MAJOR, n, nrhs, a, lda, ipiv, b, ldb );
        /* Check for the exact singularity */
        if( info > 0 ) {
                printf( "The diagonal element of the triangular factor of A,\n" );
                printf( "U(%i,%i) is zero, so that A is singular;\n", info, info );
                printf( "the solution could not be computed.\n" );
                exit( 1 );
        }
        /* Print solution */
        print_matrix( "Solution", n, nrhs, b, ldb );
        /* Print details of LU factorization */
        print_matrix( "Details of LU factorization", n, n, a, lda );
        /* Print pivot indices */
        print_int_vector( "Pivot indices", n, ipiv );
        exit( 0 );
} /* End of LAPACKE_dgesv Example */

/* Auxiliary routine: printing a matrix */
void print_matrix( char* desc, MKL_INT m, MKL_INT n, double* a, MKL_INT lda ) {
        MKL_INT i, j;
        printf( "\n %s\n", desc );
        for( i = 0; i < m; i++ ) {
                for( j = 0; j < n; j++ ) printf( " %6.2f", a[i*lda+j] );
                printf( "\n" );
        }
}

/* Auxiliary routine: printing a vector of integers */
void print_int_vector( char* desc, MKL_INT n, MKL_INT* a ) {
        MKL_INT j;
        printf( "\n %s\n", desc );
        for( j = 0; j < n; j++ ) printf( " %6i", a[j] );
        printf( "\n" );
}
and another which I created for GNU BLAS/LAPACK:
#include<stdio.h>
#include<iostream>
#include "lapacke.h"

using namespace std;

int main()
{
    // note, to understand this part take a look in the MAN pages, at section of parameters.
    char TRANS = 'T';
    int INFO=3;
    int LDA = 3;
    int LDB = 3;
    int N = 3;
    int NRHS = 1;
    int IPIV[3] ;
    /*
    double A[9]= { 1,2,-1, 2,1,1, -1,2,1, };
    double B[3]= { 4, -2, 2 };
    */
    double A[9] = {
        1,1,1,
        1,1,3,
        2,1,1
    };
    double B[3] = {
        1,
        2,
        3
    };
    // end of declarations

    cout << "compute the LU factorization..."<< endl << endl;
    //void LAPACK_dgetrf( lapack_int* m, lapack_int* n, double* a, lapack_int* lda, lapack_int* ipiv, lapack_int *info );
    LAPACK_dgetrf(&N,&N,A,&LDA,IPIV,&INFO);

    // checks INFO, if INFO != 0 something goes wrong, for more information see the MAN page of dgetrf.
    if(INFO)
    {
        cout << "an error occured : "<< INFO << endl << endl;
    }else{
        cout << "solving the system..."<< endl << endl;
        // void LAPACK_dgetrs( char* trans, lapack_int* n, lapack_int* nrhs, const double* a, lapack_int* lda, const lapack_int* ipiv,double* b, lapack_int* ldb, lapack_int *info );
        dgetrs_(&TRANS,&N,&NRHS,A,&LDA,IPIV,B,&LDB,&INFO);
        printf("IPIV= %d %d %d \n",IPIV[0],IPIV[1],IPIV[2]);
        if(INFO)
        {
            // checks INFO, if INFO != 0 something goes wrong, for more information see the MAN page of dgetrs.
            cout << "an error occured : "<< INFO << endl << endl;
        }else{
            cout << "print the result : {";
            int i;
            for (i=0;i<N;i++)
            {
                cout << B[i] << "";
            }
            cout << "}"<< endl << endl;
        }
    }
    cout << "program terminated."<< endl << endl;
    return 0;
}
The outputs are:

Output of the GNU version:

compute the LU factorization...
solving the system...
IPIV= 1 3 3
print the result : {2 -1.5 0.5 }
program terminated.

Output of the Intel version:

LAPACKE_dgesv (row-major, high-level) Example Program Results

 Solution
   2.00
  -1.50
   0.50

 Details of LU factorization
   2.00   1.00   1.00
   0.50   0.50   2.50
   0.50   1.00  -2.00

 Pivot indices
      3      2      3
Can Intel's ipiv differ from GNU's ipiv (because of a different algorithm), or is there some error in my code?
Awaiting your reply
Regards
Puneet
Hi
Short question: how does the execution time of FEAST scale with the subspace size M0?
I am trying to add support for the eigensolver FEAST to a large atomic structure program package, GRASP2K, and it seems to work just fine for small and medium-sized problems (matrix sizes up to about 3-4000, 100 000 non-zero elements and M0~200-300). For larger problems (N~200 000, NZ~50-60 000 000, M0~1000) the solution, or rather the execution time, depends a lot on the initial guess for emin and emax. I have an older approximate solver that can calculate approximate values for emin and emax, but with large error bars. If I use these values and use FEAST to estimate the number of eigenvalues (fpm(14)=2), I always get a very large recommended value for M0 (~20 000). I know from the physics involved that it is always the Neig lowest eigenvalues that I am interested in (Neig is a number that comes from the way the Hamiltonian is constructed, ~2-300).
I can either set emin and emax so that I am sure to bracket the eigenvalues I am interested in, accept a large value for M0 and only use the first few eigenvalues returned, or I can iterate and test lower and lower values for emax until the M0 returned with fpm(14)=2 comes close to Neig*1.5. It is not cheap to call FEAST many times to find a good energy range, as the solution is iterated over many times, improving the basis set in an outer loop. I could just test the two cases, but the resources I have at hand during development are a bit underpowered and each test run takes about a week, so any hint would be welcome.
/Per Andersson
Hello,
I wanted to use the function vmlSetMode from C++, but the input parameter seems to be ignored for all possible inputs. For example:
unsigned int x = vmlSetMode(VML_ERRMODE_IGNORE); unsigned int y = vmlGetMode();
The value of x seems to be random (0xffffb2c0, 0xffffaa36, ...) and y is always different from VML_ERRMODE_IGNORE and equals x.
System info: RedHat, Netbeans 8.0.2, Parallel Studio Composer 2016.
Does anybody have an idea?
Thanks
Hi,
I'm facing a problem that is probably due to misalignment, so I decided to use mkl_*alloc with the Parallel Direct Sparse Solver.
The solver needs 64-byte alignment, which is successfully obtained by mkl_malloc/mkl_calloc by specifying the alignment parameter.
Unfortunately this is not possible with mkl_realloc: its default is a simple 16-byte alignment and nothing can be done to bypass this default behaviour.
Any workaround?
Thanks,
Dario
Hi:
I need to use pardiso to solve a big matrix with a big right-hand side, and the physical memory is not big enough to store the matrix and the right-hand side at the same time. So I split the right-hand side into two parts: the first part has 10000 columns and the second part has 10001 columns. I then call pardiso with phase=33 to solve the first part and call phase=33 again to solve the second part. The problem is how to set nrhs in phase 12 and phase -1: 10000, 10001, or just 0? All the examples in solverc and solverf set nrhs=1 in phases 11, 22, 33 and -1, so I cannot find a useful example.
Hi,
I am using Pardiso, from the MKL that ships with icpc 16.0.0 on Mac OSX.
I have already computed an LU factorization, and I want to use Pardiso as an iterative solver using the previous LU factorization as a preconditioner. For that, I've set iparm[3] = 21 for 2 digits of accuracy. But pardiso_64 returns an error -4 and iparm[19] contains -18.
I don't understand what's going on, as the last digit of iparm[19] should be either 1, 2, 3, 4 or 5 according to the MKL documentation: https://software.intel.com/en-us/node/521691
Intel MKL is a popular math library used by many to create fast and reliable applications in science, engineering, and finance. Do you know it is now available for free (at no cost)? The community licensing program gives anyone, individuals or organizations, free license for the latest version of Intel MKL. There is no royalty for distributing the library in an application. The only restrictions are:
Better yet, this community licensing program is not just for MKL, it is applicable to other Intel performance libraries including:
Visit https://software.intel.com/sites/campaigns/nest/ to get yourself started and to learn the program's details
More Intel software tools, such as the Intel Compilers, Intel MPI, and tools for debugging and performance tuning, may also be available for free (at no cost) if you are qualified as an academic researcher, teacher, student, or an open source contributor. See https://software.intel.com/en-us/free_tools_and_libraries.
OS: Fedora 22
parallel_studio_xe_2016
Hardware : 8 Thread(s) per core: 2 Vendor ID: GenuineIntel Model name:
Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz - Sandybridge
R-3.2.2
Here is my build configuration:
-------------------------------------------------------
source /opt/intel/compilers_and_libraries_2016/linux/mkl/bin/mklvars.sh intel64
source /opt/intel/bin/compilervars.sh intel64
_mkllibpath=$MKLROOT/lib/intel64
_openmplibpath=${PROD_DIR}/compiler/lib/intel64
export LD_LIBRARY_PATH=${_mkllibpath}:${_openmplibpath}
export MKL="-L${_mkllibpath} -L${_openmplibpath} -lmkl_intel_lp64 -lkml_intel_thread -lkml_core -liomp5 -lpthread"
export CC="icc"
export F77="ifort"
export CXX="icpc"
export AR="xiar"
export LD="xild"
export CFLAGS="-O3 -ipo -openmp -parallel -xAVX"
export CXXFLAGS="-O3 -ipo -openmp -parallel -xAVX"
export FFLAGS="-O3 -ipo -openmp -parallel -xAVX"
export MAIN_LDFLAGS='-openmp'
./configure --with-lapack --with-blas="$MKL" --enable-R-shlib --enable-memory-profiling --enable-openmp --enable-BLAS-shlib --enable-lto F77=${F77} FC=${F77}
------------------------------------------------------------
After I run ./configure, it seems from config.log everything is fine:
checking for dgemm_ in
result: yes
checking whether double complex BLAS can be used
result: yes
checking whether the BLAS is complete
result: yes
The only error I can see is ld complaining about not finding -lRblas
----------------------------------------------------------------------------
Then run
$ make
with no errors.
Now, with no make install, I get this:
--------------------------------------------------------------------
$ ldd bin/exec/R
linux-vdso.so.1 (0x00007ffe073f3000)
libR.so => /usr/lib64/R/lib/libR.so (0x00007f43939e6000)
libRblas.so => not found
libm.so.6 => /lib64/libm.so.6 (0x00007f43936de000)
libiomp5.so => /opt/intel/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64/libiomp5.so (0x00007f439339c000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f4393185000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f4392f69000)
libc.so.6 => /lib64/libc.so.6 (0x00007f4392ba8000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f43929a4000)
libblas.so.3 => /lib64/libblas.so.3 (0x00007f439274b000)
libgfortran.so.3 => /lib64/libgfortran.so.3 (0x00007f439241f000)
libquadmath.so.0 => /lib64/libquadmath.so.0 (0x00007f43921e0000)
libreadline.so.6 => /lib64/libreadline.so.6 (0x00007f4391f96000)
libtre.so.5 => /lib64/libtre.so.5 (0x00007f4391d85000)
libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f4391b15000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f43918ef000)
libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f43916de000)
libz.so.1 => /lib64/libz.so.1 (0x00007f43914c8000)
librt.so.1 => /lib64/librt.so.1 (0x00007f43912c0000)
libicuuc.so.54 => /lib64/libicuuc.so.54 (0x00007f4390f2e000)
libicui18n.so.54 => /lib64/libicui18n.so.54 (0x00007f4390ad7000)
libgomp.so.1 => /lib64/libgomp.so.1 (0x00007f43908b5000)
/lib64/ld-linux-x86-64.so.2 (0x00005557e2243000)
libtinfo.so.5 => /lib64/libtinfo.so.5 (0x00007f439068a000)
libicudata.so.54 => /lib64/libicudata.so.54 (0x00007f438ec5f000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f438e8dc000)
-----------------------------------------------------------------------------------------------
Now a few questions:
1- Am I not supposed to see something like this in the output of the ldd command?
libmkl_intel_lp64.so =>
libmkl_intel_thread.so =>
libmkl_core.so =>
Or do I need to run $ make install before ldd?
2- When visiting https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor, here is what I get as linking and compiler options:
Linking options:
-L${MKLROOT}/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_intel_thread -lpthread -lm
Compiler options:
-DMKL_ILP64 -qopenmp -I${MKLROOT}/include
What is the difference between -openmp and -qopenmp? Should I indeed use the compiler and linking options above?
Thank you for your help with this topic, which is difficult for me.
Hi,
I would like to do a QR factorization using LAPACK. Based on the documentation available here https://software.intel.com/en-us/node/521003#E832D468-0891-40EC-9468-686... , I've decided to use geqrf for the factorization.
Since I need to solve a least-squares problem, I have to solve R.x = (Q1)^T b, as explained in the documentation. The matrix Q can be applied with the function ormqr. But which routine should I use to solve R.x = y?