Pardiso mpi version with fatal error


Hello,

I receive a fatal error when using the MPI version of Intel PARDISO (the cluster sparse solver), and it would be nice if someone could help me with it. I compile the code with the following commands, taken from the Intel Link Line Advisor:

mpiifort -i8 -I${MKLROOT}/include -c -o mkl_cluster_sparse_solver.o ${MKLROOT}/include/mkl_cluster_sparse_solver.f90

mpiifort -i8 -I${MKLROOT}/include -c -o MPI.o MPI.f90

mpiifort mkl_cluster_sparse_solver.o MPI.o -o MPI.out \
  -Wl,--start-group \
  ${MKLROOT}/lib/intel64/libmkl_intel_ilp64.a \
  ${MKLROOT}/lib/intel64/libmkl_intel_thread.a \
  ${MKLROOT}/lib/intel64/libmkl_core.a \
  ${MKLROOT}/lib/intel64/libmkl_blacs_intelmpi_ilp64.a \
  -Wl,--end-group -liomp5 -lpthread -lm -ldl

and run it, for instance, on two nodes with

mpiexec -n 2 ./MPI.out
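
For context, MPI.f90 is essentially a trimmed-down cluster_sparse_solver driver. The following is only an illustrative sketch, not my actual code: the 5x5 test matrix, the iparm settings, and the 4-byte integer kinds for the MPI arguments are placeholders/assumptions, but the call sequence (phases 11, 22, 33, -1) is the one I use.

! Illustrative driver only - small 5x5 SPD test matrix instead of my real data.
! Assumes ILP64 MKL (compiled with -i8), so default integers are 8 bytes;
! the MPI error argument is kept 4-byte, assuming the default (non-ILP64) Intel MPI library.
program cluster_solver_sketch
    use mkl_cluster_sparse_solver
    implicit none
    include 'mpif.h'

    type(MKL_CLUSTER_SPARSE_SOLVER_HANDLE) :: pt(64)   ! internal solver handle
    integer   :: maxfct, mnum, mtype, phase, n, nrhs, msglvl, error
    integer   :: iparm(64), idum(1), i
    integer   :: ia(6), ja(9)
    real*8    :: a(9), b(5), x(5), ddum(1)
    integer*4 :: mpi_stat

    ! upper triangle of a 5x5 SPD tridiagonal matrix in 1-based CSR format
    data ia /1, 3, 5, 7, 9, 10/
    data ja /1, 2,  2, 3,  3, 4,  4, 5,  5/
    data a  /4d0, -1d0, 4d0, -1d0, 4d0, -1d0, 4d0, -1d0, 4d0/

    call mpi_init(mpi_stat)

    n = 5; nrhs = 1; maxfct = 1; mnum = 1
    mtype  = 2            ! real symmetric positive definite (placeholder)
    msglvl = 1            ! print statistical information
    b = 1d0; x = 0d0
    do i = 1, 64
        pt(i)%dummy = 0
        iparm(i) = 0
    end do
    iparm(1) = 1          ! do not use solver defaults
    iparm(2) = 2          ! nested-dissection reordering

    phase = 11            ! reordering and symbolic factorisation (this phase works for me)
    call cluster_sparse_solver(pt, maxfct, mnum, mtype, phase, n, a, ia, ja, &
         idum, nrhs, iparm, msglvl, ddum, ddum, MPI_COMM_WORLD, error)

    phase = 22            ! numerical factorisation (this is where the error appears)
    call cluster_sparse_solver(pt, maxfct, mnum, mtype, phase, n, a, ia, ja, &
         idum, nrhs, iparm, msglvl, ddum, ddum, MPI_COMM_WORLD, error)

    phase = 33            ! solve
    call cluster_sparse_solver(pt, maxfct, mnum, mtype, phase, n, a, ia, ja, &
         idum, nrhs, iparm, msglvl, b, x, MPI_COMM_WORLD, error)

    phase = -1            ! release internal memory
    call cluster_sparse_solver(pt, maxfct, mnum, mtype, phase, n, ddum, idum, idum, &
         idum, nrhs, iparm, msglvl, ddum, ddum, MPI_COMM_WORLD, error)

    call mpi_finalize(mpi_stat)
end program cluster_solver_sketch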

I use the 64-bit (ILP64) interface. The funny thing is that the reordering phase works perfectly; however, the factorisation and solve steps do not. The error message I get is the following:

Fatal error in PMPI_Bcast: Message truncated, error stack:
PMPI_Bcast(2654)..................: MPI_Bcast(buf=0x7ffe63518210, count=1, MPI_LONG_LONG_INT, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast_impl(1804).............: fail failed
MPIR_Bcast(1832)..................: fail failed
I_MPIR_Bcast_intra(2057)..........: Failure during collective
MPIR_Bcast_intra(1599)............: fail failed
MPIR_Bcast_binomial(247)..........: fail failed
MPIDI_CH3U_Receive_data_found(131): Message from rank 0 and tag 2 truncated; 1600 bytes received but buffer size is 8

So this seems to be a problem with a buffer size. At first I thought my problem was simply too large; however, this is not an issue of the matrix size. I tried to fix it by setting

export I_MPI_SHM_LMT_BUFFER_SIZE=2000

but it did not change anything. The Intel MPI manual also mentions I_MPI_SHM_LMT_BUFFER_NUM, and I tried setting that to a higher value as well (see the launch sequence sketched below). The versions used are: MKL 2017.4.256, ifort 17.0.6.256, Intel MPI 2017.4.239; I also tried newer versions, but nothing changed. If I should post an example, please let me know. My hope, however, is that this can be fixed easily by setting some buffer size (other than I_MPI_SHM_LMT_BUFFER_SIZE) to a higher value.
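
For completeness, the launch sequence I tried was along these lines (the I_MPI_SHM_LMT_BUFFER_NUM value here is only an illustration of "a higher value", not the exact number I used):

export I_MPI_SHM_LMT_BUFFER_SIZE=2000
export I_MPI_SHM_LMT_BUFFER_NUM=64      # illustrative value only
mpiexec -n 2 ./MPI.out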

Thanks in advance

