Channel: Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library

MKL's (distributed) FFT library fails with a floating-point error


When repeatedly calling MKL's distributed (cluster) DFT library via the FFTW3 interface library, it fails with a floating-point error for certain combinations of grid size and MPI process count (e.g., a 1024 x 256 x 1 grid run with 17 MPI processes). This is repeatable, and I have uploaded an example code that demonstrates the problem. I am compiling with the "Composer XE 2015" tools (MKL), for example:

[jshaw@cl4n074]: make fft_mpi
echo "Building ... fft_mpi"
Building ... fft_mpi
g++ -I/home/jshaw/XCuda-2.4.0/inc -DNDEBUG -O2 -I/home/jshaw/XCuda-2.4.0/inc -I/opt/lic/intel13/impi/4.1.0.024/include64 -I/opt/lic/intel15/composer_xe_2015.1.133/mkl/include -I/opt/lic/intel15/composer_xe_2015.1.133/mkl/include/fftw fft_mpi.cpp -o fft_mpi -L/home/jshaw/XCuda-2.4.0/lib/x86_64-CentOS-6.5 -lXCuda -L/home/jshaw/XCuda-2.4.0/lib/x86_64-CentOS-6.5 -lXCut  -L/opt/lic/intel13/impi/4.1.0.024/lib64 -lmpi -L/home/jshaw/fftw-libs-1.1/x86_64-CentOS-6.5/lib -lfftw3x_cdft_lp64 -L/opt/lic/intel15/composer_xe_2015.1.133/mkl/lib/intel64 -lmkl_cdft_core -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_lp64 -lpthread -lm -ldl -lirc

[jshaw@cl4n074]: mpirun -np 17 fft_mpi

FFT-MPI for Complex Multi-dimensional Transforms (float)
MPI procs: 17, dim: 1024x256x1, loops: 100 (2 MBytes)

Allocating FFT memory ...
rank:   0 (16592     61      0)
rank:   2 (16592     61    122)
rank:   7 (16592     60    424)
rank:   8 (16592     60    484)
rank:   9 (16592     60    544)
rank:  10 (16592     60    604)
rank:  12 (16592     60    724)
rank:  13 (16592     60    784)
rank:  15 (16592     60    904)
rank:  16 (16592     60    964)
rank:   1 (16592     61     61)
rank:   3 (16592     61    183)
rank:   4 (16592     60    244)
rank:   5 (16592     60    304)
rank:   6 (16592     60    364)
rank:  11 (16592     60    664)
rank:  14 (16592     60    844)
Initializing ...
Planning ...
loop: <1fnr> <2fnr> <3fnr> <4fnr> <5fnr> <6fnr> <7fnr> <8fnr> <9fnr> <10fnr> <11fnr> <12fnr> <13fAPPLICATION TERMINATED WITH THE EXIT STRING: Floating point exception (signal 8)
[jshaw@cl4n074]:

Note that the XCuda libraries referenced in the compile/link line are NOT required for this example to work! The code does run correctly with many other grid sizes and numbers of MPI processes, so it is not simply a matter of too many or too few MPI processes, or of the FFT grid being too small or too large; I regularly run codes with large 3D grids. The issue is that one cannot pick arbitrary grid sizes and MPI rank counts, as one should be able to. Any help or thoughts on how this can be fixed would be appreciated. Is this a legitimate bug in MKL?
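For readers without the attachment, here is a minimal sketch of the call pattern the example exercises, written against the standard FFTW3 MPI interface (which MKL's fftw3x_cdft wrapper implements). It is not the attached fft_mpi.cpp: the single-precision fftwf_* names, grid dimensions, loop count, and printed fields are assumptions based on the run output above.

#include <mpi.h>
#include <fftw3-mpi.h>
#include <cstdio>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    fftwf_mpi_init();

    // Grid that triggers the SIGFPE with 17 MPI processes
    // (taken from the run output above).
    const ptrdiff_t n0 = 1024, n1 = 256, n2 = 1;

    // Each rank learns its slab of the first dimension: local allocation
    // size, local row count, and starting row offset.
    ptrdiff_t local_n0 = 0, local_0_start = 0;
    ptrdiff_t alloc = fftwf_mpi_local_size_3d(n0, n1, n2, MPI_COMM_WORLD,
                                              &local_n0, &local_0_start);

    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    std::printf("rank: %3d (%td  %td  %td)\n",
                rank, alloc, local_n0, local_0_start);

    fftwf_complex *data =
        (fftwf_complex *) fftwf_malloc(sizeof(fftwf_complex) * alloc);
    for (ptrdiff_t i = 0; i < alloc; ++i) {
        data[i][0] = 1.0f;
        data[i][1] = 0.0f;
    }

    fftwf_plan fwd = fftwf_mpi_plan_dft_3d(n0, n1, n2, data, data,
                                           MPI_COMM_WORLD,
                                           FFTW_FORWARD, FFTW_ESTIMATE);
    fftwf_plan bwd = fftwf_mpi_plan_dft_3d(n0, n1, n2, data, data,
                                           MPI_COMM_WORLD,
                                           FFTW_BACKWARD, FFTW_ESTIMATE);

    // The reported failure occurs partway through a loop of repeated
    // in-place forward/backward transforms.
    for (int loop = 0; loop < 100; ++loop) {
        fftwf_execute(fwd);
        fftwf_execute(bwd);
    }

    fftwf_destroy_plan(fwd);
    fftwf_destroy_plan(bwd);
    fftwf_free(data);
    fftwf_mpi_cleanup();
    MPI_Finalize();
    return 0;
}

Built against MKL's FFTW3 cluster wrappers (the -lfftw3x_cdft_lp64 ... -lmkl_cdft_core link line above) and run with mpirun -np 17, a program of this shape should reproduce the crash if the bug is in the library rather than in application code.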

Attachment: fft_mpi.cpp (8.67 KB)
