Channel: Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library

Bad interaction between mkl_dynamic and potri in mkl 19.02?


Hi

The program below inverts an autoregressive matrix: it builds the matrix, factorizes it with potrf, and then inverts it with potri.

Program Test
  use blas95
  use lapack95
  USE IFPORT
  use mkl_service
  implicit none
  integer(kind=8) :: istat, n, c1, c2, ise
  integer(kind=4) :: dy
  character(len=200) :: msg
  Real(kind=8), allocatable :: A(:,:)
  real(kind=8) :: r1=0.0D0, r2=0.0D0
  outer:block
    dy=1
    write(*,*) "dynamic: ", dy
    call mkl_set_dynamic(dy)
    call mkl_set_num_threads(mkl_get_max_threads())
    n=10000
    write(*,"(*(g0"",""))") n
    r1=dclock()
    !!start building the matrix
    allocate(&
      &A(n,n),&
      &stat=istat,errmsg=msg)
    if(istat/=0) Then
      write(*,*) msg;exit outer
    end if
    !$OMP PARALLEL DO PRIVATE(c1)
    Do c1=1,size(A,2)
      Do c2=c1,size(A,1)
        A(c2,c1)=0.5**(c2-c1)
      end Do
    end Do
    !$OMP END PARALLEL DO
    ise=size(A,1)
    !$OMP PARALLEL DO PRIVATE(c1) FIRSTPRIVATE(ise)
    Do c1=1,ise-1
      A(c1,(c1+1):ise)=A((c1+1):ise,c1)
    End Do
    !$OMP END PARALLEL DO
    r2=Dclock()
    write(*,*) "alloc: ", r2-r1
    !!end building matrix
    r2=Dclock()
    call potrf(A=A,UPLO="U",INFO=istat)
    r1=dclock()
    write(*,*) "potrf: ",r1-r2
    call POTRI(A=A,Info=istat)
    r2=dclock()
    write(*,*) "potri: ",r2-r1
  End block outer
End Program Test
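
For reference, the two loops above fill the lower triangle and then mirror it into the upper triangle, so (up to floating-point underflow far from the diagonal) the matrix being inverted is:

```latex
A_{ij} = 0.5^{\,|i-j|}, \qquad i, j = 1, \dots, n, \qquad n = 10000
```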

 

With mkl 17.08, setting mkl_dynamic to 0 or 1 made hardly any difference in processing time.

mkl_dynamic=0:

potrf: 0.88 seconds, potri: 2.12 seconds

mkl_dynamic=1:

potrf: 0.58 seconds, potri: 2.09 seconds

However, with mkl 19.02 the difference is so large that mkl_dynamic=0 makes the program unusable.

mkl_dynamic=0:

potrf: 0.37 seconds, potri: 110.69 seconds

mkl_dynamic=1:

potrf: 0.37 seconds, potri: 1.11 seconds

Times were obtained on Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz.

Environment variables were:

MKL_NUM_THREADS=36

KMP_AFFINITY=granularity=core,scatter

 

I noticed that potri in mkl 19.02 keeps all 72 hardware threads (i.e. including hyperthreading) busy for much of its run time.

 

Is this a newly introduced bug, or am I doing something wrong?

 

Thanks

