error in LAPACKE_dgesvd Example row major

July 16, 2018, 12:49 pm

Latest and popular articles on Intel Technologies

Hello,
I wanted to share an error I found in the row-major example for LAPACKE_dgesvd: https://software.intel.com/sites/products/documentation/doclib/mkl_sa/11...
In the call:

print_matrix( "Left singular vectors (stored columnwise)", m, n, u, ldu );

This prints the left singular matrix as m×n, while it is actually m×m. The matrix contains the correct data, but print_matrix takes its column count from the original matrix. So it should be:

print_matrix( "Left singular vectors (stored columnwise)", m, m, u, ldu );

Also note that the comment titled "Example Program Results" should be changed for the left singular vectors:
 Left singular vectors (stored columnwise)
  -0.59   0.26   0.36   0.31   0.23   0.55
  -0.40   0.24  -0.22  -0.75  -0.36   0.18
  -0.03  -0.60  -0.45   0.23  -0.31   0.54
  -0.43   0.24  -0.69   0.33   0.16  -0.39
  -0.47  -0.35   0.39   0.16  -0.52  -0.46
   0.29   0.58  -0.02   0.38  -0.65   0.11
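
To illustrate the point, here is a minimal Python sketch of a row-major print helper. It is only modeled on the idea of the example's print_matrix and is not the Intel code; the sizes and the data in `u` are made up for illustration. It shows that passing n as the column count hides the last m - n columns of an m×m matrix, even though the data is there.

```python
def print_matrix(desc, rows, cols, a, lda):
    """Print a rows x cols view of a row-major flat array with leading dimension lda."""
    lines = [" " + desc]
    for i in range(rows):
        lines.append(" ".join("%6.2f" % a[i * lda + j] for j in range(cols)))
    text = "\n".join(lines)
    print(text)
    return text

# Hypothetical small sizes with m > n, as in the dgesvd example.
m, n = 3, 2
ldu = m  # U is m x m, so its row-major leading dimension is m
u = [float(i * ldu + j) for i in range(m) for j in range(m)]  # placeholder data

truncated = print_matrix("U printed as m x n", m, n, u, ldu)  # last column missing
full = print_matrix("U printed as m x m", m, m, u, ldu)       # all m columns shown
```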

↧

ScaLAPACK Communicators

July 16, 2018, 2:37 pm

Even after a lot of research, I still do not understand how communicators are used in ScaLAPACK, especially in MKL. I want to call ScaLAPACK routines from only a subset of MPI_COMM_WORLD. For example, I start my program with 28 processes, but I want to use only the first 5x5 = 25 processes for ScaLAPACK and let the others do something else. How should I initialize ScaLAPACK accordingly?

Thank you in advance,

Dmitry
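
For readers following along, the bookkeeping behind the question can be sketched in plain Python. This is not an MKL or ScaLAPACK recipe and makes no MPI calls. It only shows one possible way to decide which ranks would belong to the 5x5 grid, for example as the "color" passed to something like MPI_Comm_split. The function name and the row-major placement are assumptions made for illustration.

```python
def grid_assignment(rank, nprow=5, npcol=5):
    """Return (color, row, col): color 0 for ranks inside the grid, 1 for the rest."""
    if rank < nprow * npcol:
        # Assumed row-major placement of ranks on the process grid.
        return 0, rank // npcol, rank % npcol
    # Ranks outside the grid get no coordinates and can do other work.
    return 1, -1, -1

world_size = 28  # number of processes from the question
assignments = [grid_assignment(r) for r in range(world_size)]
in_grid = [r for r, (color, _, _) in enumerate(assignments) if color == 0]
others = [r for r, (color, _, _) in enumerate(assignments) if color == 1]

print("grid ranks:", in_grid)
print("other ranks:", others)
```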

↧

Using Intel MKL with Armadillo: segmentation fault

July 17, 2018, 4:17 am

I have been searching for a while now, and nothing I found has solved my problem.

I installed the Intel MKL libraries on my Ubuntu 18.04 machine, and they are correctly linked to NumPy and SciPy. Now I want to do the same with Armadillo in C++. I installed it following the readme.txt instructions provided by the Armadillo project (using `cmake`), and I checked that it correctly detected the presence of MKL.
To check that it works, I build a matrix and diagonalize it using the following code:

    #include <iostream>
    #include <armadillo>

    using namespace std;
    using namespace arma;

    int main()
      {
        wall_clock timer;
        int dim = 100;

        cx_mat C = randu<cx_mat>(dim,dim);
        cx_mat D = C.t()*C;

        vec eigval2;
        cx_mat eigvec2;

        timer.tic(); // Initialize clock

        eig_sym(eigval2, eigvec2, D);
    
        double n = timer.toc();

        cout << "Elapsed time: "<< n << " seconds"<< endl;
        cout << eigval2 << endl;

      return 0;
      }

which is very basic. The problem is that when I run it with a matrix dimension of 500, I get a segmentation fault (core dumped) and nothing else. I don't know whether this has to do with the linking to MKL or with the Armadillo install. Note that I don't know how Armadillo "knows" that I want to compile and run with MKL, given that I also have OpenBLAS, LAPACK and BLAS installed, since I just use

 `g++ example.cpp -o example -O2 -larmadillo && ./example`.

I also tried commenting out the `#define ARMA_USE_LAPACK` and `#define ARMA_USE_BLAS` lines in the "include/armadillo_bits/config.hpp" file and rebuilding everything, but nothing changed.

I see a lot of answers pointing to a linking problem with MKL and redirecting to the Intel Link Line Advisor but I have no clue what half of the parameters are nor where I have to implement the necessary changes.

I would appreciate any hints/advice/references to solve that problem.

Thanks in advance,

↧

MPI Linpack from MKL, SSE4_2, turbo, and Skylake: Some SSE 4.2 threads run at the AVX512 turbo frequency

July 17, 2018, 12:33 pm

I am running Linpack from MKL (xhpl.2018.3.222.static) with MKL_ENABLE_INSTRUCTIONS=SSE4_2 on Skylake with turbo enabled.

I've tried this with three different releases of MKL and three different Skylake processors. They all show the same effect, but with different frequencies, of course.

The base thread of each of the MPI ranks runs at the AVX512 turbo frequency, while the other threads run at the expected non-AVX frequency.
If I specify AVX2, all threads run at the AVX 2.0 frequency, as expected.
If I specify AVX512, all threads run at the AVX 512 frequency, as expected.

At first I thought the SSE 4.2 run might be using 512-bit instructions on those two CPUs, but fiddling with the performance MSRs to look at the counters shows that only the expected floating-point double-precision instructions are being retired.

Here are some characteristics of my Skylake processor and the Linpack run (frequencies are all-cores-active max frequencies, in GHz):

Cores per processor: 8

                 frequency (GHz)   GFlops        run time (sec)
non-AVX turbo    4.1               2.07505e+02   222.87
AVX 2.0 turbo    3.7               8.22624e+02    56.22
AVX 512 turbo    3.0               1.30613e+03    35.41
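
As a rough sanity check on these figures, the small Python sketch below converts each row into achieved flops per cycle per core. It assumes 16 cores in total (two 8-core processors, which is what the 16 CPUs in the turbostat snapshot in this post suggest) and that every core runs at the listed frequency. The second assumption does not quite hold for the SSE4_2 run described here, so treat the non-AVX number as approximate.

```python
# (frequency in GHz, measured GFlops), copied from the table in this post.
runs = {
    "non-AVX": (4.1, 2.07505e+02),
    "AVX 2.0": (3.7, 8.22624e+02),
    "AVX 512": (3.0, 1.30613e+03),
}

TOTAL_CORES = 16  # assumption: 2 processors x 8 cores each

def flops_per_cycle_per_core(freq_ghz, gflops, cores=TOTAL_CORES):
    """GFlops divided by (GHz x cores) gives flops per cycle per core."""
    return gflops / (freq_ghz * cores)

per_core = {name: flops_per_cycle_per_core(f, g) for name, (f, g) in runs.items()}
for name, value in per_core.items():
    print("%-8s %6.2f flops/cycle/core" % (name, value))
```

The ratios between the three rows are what matter here: they track the vector width and FMA use of each instruction set, which is consistent with the counter observation that the SSE4_2 run is not retiring wider instructions.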

Below is a turbostat snapshot while running with SSE4_2

There's a bit of bouncing around of frequencies as the job runs, but you can see that the CPU 0 and 8 frequencies are low, tending toward 3.0 GHz, and the other 14 CPUs' frequencies are high, tending toward 4.1 GHz.

        Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ     SMI     CPU%c1  CPU%c3  CPU%c6  CPU%c7  CoreTmp PkgTmp  PkgWatt RAMWatt PKG_%   RAM_%
        -       -       3957    100.00  3967    3891    15914   0       0.00    0.00    0.00    0.00    69      69      317.50  0.00    0.00    0.00
        4       1       4090    100.00  4100    3891    5011    0       0.00    0.00    0.00    0.00    54
        8       2       4090    100.00  4100    3891    9       0       0.00    0.00    0.00    0.00    66
        9       3       4090    100.00  4100    3891    84      0       0.00    0.00    0.00    0.00    67
        11      4       4090    100.00  4100    3891    8       0       0.00    0.00    0.00    0.00    63
        16      0       3047    100.00  3054    3891    5626    0       0.00    0.00    0.00    0.00    55      67      153.59  0.00    0.00    0.00
        18      5       4090    100.00  4100    3891    9       0       0.00    0.00    0.00    0.00    67
        19      6       4090    100.00  4100    3891    9       0       0.00    0.00    0.00    0.00    64
        25      7       4090    100.00  4100    3891    9       0       0.00    0.00    0.00    0.00    63
        1       8       3006    100.00  3013    3891    5080    0       0.00    0.00    0.00    0.00    50      69      163.91  0.00    0.00    0.00
        2       9       4090    100.00  4100    3891    10      0       0.00    0.00    0.00    0.00    56
        3       10      4090    100.00  4100    3891    9       0       0.00    0.00    0.00    0.00    66
        4       11      4090    100.00  4100    3891    9       0       0.00    0.00    0.00    0.00    67
        8       12      4090    100.00  4100    3891    9       0       0.00    0.00    0.00    0.00    67
        18      13      4090    100.00  4100    3891    9       0       0.00    0.00    0.00    0.00    69
        24      14      4090    100.00  4100    3891    8       0       0.00    0.00    0.00    0.00    69
        27      15      4090    100.00  4100    3891    15      0       0.00    0.00    0.00    0.00    67

I used the attached script to reproduce this. It takes an optional argument for the desired setting for MKL_ENABLE_INSTRUCTIONS, defaulting to SSE4_2. It will create an HPL.dat file if it does not exist, and run Linpack with two MPI ranks.

-- Chuck Newman

Attachment	Size
Download RunLinpack.zip	1.73 KB

↧

How can I get a licence?!

July 17, 2018, 6:10 pm

OK, I have been googling "intel mkl", "mkl license", "mkl community licence", "intel licence mkl free", "mkl linux license" and so on for about half an hour, and I cannot find anywhere on Intel's website where I can actually get a free license or even purchase one! None of the product pages for MKL and the rest of the performance libraries link to any licensing page.

I am interested in the free community license for the MKL library for Linux. I'm a graduate student, and Intel has made a simple thing too difficult in this case.

Joseph

P.S. Also on Windows Anaconda Python is much faster than Intel Python even with MKL and the rest of the performance libraries installed, maybe the same guys worked on the website eh?

↧

Clarification on signed/unsigned input for cblas_gemm_s8u8s32

July 18, 2018, 5:35 am

Hello,

I have a quick question on cblas_gemm_s8u8s32.

What is the reasoning behind requiring one side to be signed and the other unsigned?

The cuBLAS equivalent of this function, cublasGemmEx, expects both a and b to be signed, which seems simpler to work with in my opinion.

Thanks,

Guillaume

↧

R on Xeon Phi (on Windows)

July 19, 2018, 12:38 am

Hello there,

I was wondering whether you might be able to share detailed instructions on how to compile R to run on the xeon phi co-processor, specifically tailored for Windows 10?

I have read in detail the following links:

...but as all of these are written assuming the user is running Linux, I am struggling to make any progress whatsoever (commands like "./configure" don't work in Windows). I have also searched high and low on the internet but have not been able to find anything.

My goal is to rebuild R 3.4.3 to run on the Xeon Phi 3120A in the Windows 10 environment. I have installed the coprocessor drivers and the unit is showing up correctly.

Many thanks in advance,

Keyur

↧

pardiso with c#

July 19, 2018, 11:56 pm

Dear all,

I have read all the documentation I could find about using PARDISO with C#, both on the web and on your site. However, when I run the main example I found on your site, it does not work.

I have attached the C# example and the error I received.

Can anyone help me understand what the issue is?

Ultimately, we are going to solve complex-valued systems.

The mkl_rt.dll used is version 2018.0.0.1.

Thank you

Gianluca

Attachment	Size
Download TestPardiso.zip	4.55 KB
Download mkl_error_pardiso.png	54.06 KB

↧

mkl, matmult.py in windows w/ mkl_rt.dll

July 11, 2018, 12:49 pm

I am trying to give my students examples of calling MKL directly from Python (using the Intel/Anaconda Python 2.7 distribution) in a Windows 10 environment: simple examples first, then moving on to PARDISO, etc. (Yes, everything works fine in Python with automatic linking of MKL for the standard scipy and scipy.sparse functions.) The students will need ctypes for specialty routines like PARDISO and for their own C++ code snippets.

MKL 2018.3 is installed and successfully accessed from C++ in VS2017, so the various DLLs (mkl_rt.dll etc.) are present in the appropriate redist directories. I am afraid I am not that familiar with ctypes in a Windows environment. In Python, I am trying to start by running the matmult.py example posted here: (https://software.intel.com/en-us/articles/using-intel-mkl-in-your-python...), but Python chokes on the cdll statement. How do I edit the cdll line

from ctypes import *
# Load the share library
mkl = cdll.LoadLibrary("./libmkl_rt.so")

for a Windows environment? So far, none of the following work:

mkl=cdll.LoadLibrary(“.\mkl_rt.dll”)  #with mkl_rt.dll in current directory
mkl=cdll.LoadLibrary(“MKLPATH\mkl_rt.dll”) 
 
mkl=windll.LoadLibrary(“.\mkl_rt.dll”) #with mkl_rt.dll in current directory
mkl=windll.LoadLibrary(“MKLPATH\mkl_rt.dll”)  
mkl=windll.LoadLibrary(“MKLPATH\mkl_rt”)

How do I edit the cdll line for a Windows environment?

OR is there a new version of matmult.py needed specific to Windows installation?

Given that we're using the Intel Python distribution, is there a better way to access things like PARDISO than using ctypes plus the independent MKL install?
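
For anyone hitting the same wall, here is a hedged sketch of how the load attempt could be written so that the failure reason is visible. Two things in the snippets above look suspicious, although the forum software may have introduced them: the curly quotes (“ ”) are not valid Python string delimiters, and backslashes in ordinary string literals can be read as escape sequences, so a raw string or os.path.join is safer. The directory below is a placeholder, not a real install path, and the helper name try_load is made up for this example. It may also be necessary for the DLLs that mkl_rt.dll depends on to be findable (for example through PATH), and for the Python and MKL builds to have the same bitness.

```python
import ctypes
import os

def try_load(directory, name="mkl_rt.dll"):
    """Try to load a shared library by absolute path; return the handle or None."""
    path = os.path.join(os.path.abspath(directory), name)
    try:
        return ctypes.CDLL(path)
    except OSError as exc:
        # Print the reason, which usually says whether the file or a dependency is missing.
        print("could not load %s: %s" % (path, exc))
        return None

# Placeholder location: replace with the redist directory of your own MKL install.
mkl_dir = r"C:\path\to\mkl\redist\intel64"
mkl = try_load(mkl_dir)
if mkl is None:
    print("MKL runtime not loaded; check the directory and its dependent DLLs")
```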

↧

Solving diffusion type equation using Poisson Solver

July 12, 2018, 9:19 am

Hi,

I am trying to solve a 3D diffusion-type equation, with periodic boundary conditions in X and Y and a Neumann boundary condition in the Z direction, using the MKL Poisson Solver, and I am facing a couple of problems.

First, if I use 'PPPPNN' for BCTYPE and put

bd_az[i + j * (nx+1)]= 0.0
bd_bz[i + j * (nx+1)]= 0.0

on the Z boundary, does it apply the zero Neumann condition accurately at the boundaries?

The other question: if I write the diffusion equation in terms of the Poisson equation, then my RHS, 'f', will be time dependent, as du/dt. In that case, will there be any conflict with the time scheme (u_old, u_new)?

Or is there any other way to use the MKL Poisson solver for time-dependent equations with the above-mentioned boundary conditions?

Please explain.

Thanks,

Swagnik
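
One way to look at the time-dependence question, offered as a sketch rather than a confirmed MKL recipe: with an implicit (backward Euler) step, each time step becomes a Helmholtz-type problem for u_new whose right-hand side depends only on u_old. The diffusion coefficient D and time step Δt below are symbols introduced here for illustration, and the target form −Δu + qu = f is the Helmholtz form as I understand the MKL Poisson Library documentation to state it.

```latex
\begin{align}
\frac{u^{\text{new}} - u^{\text{old}}}{\Delta t} &= D \, \nabla^2 u^{\text{new}} \\
-\nabla^2 u^{\text{new}} + \frac{1}{D \, \Delta t} \, u^{\text{new}} &= \frac{1}{D \, \Delta t} \, u^{\text{old}} \\
-\nabla^2 u + q \, u &= f,
\qquad q = \frac{1}{D \, \Delta t},
\quad f = \frac{u^{\text{old}}}{D \, \Delta t}
\end{align}
```

Under these assumptions, f is rebuilt from u_old at the start of every step while q stays constant, so u_old and u_new never appear in the same solve as unknowns.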

↧

MKL linking with MinGW64, is it still impossible?

July 12, 2018, 10:41 am

Hi, all.

There are similar topics on this, such as https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/277796

That one is from 2012, but what is the situation now in 2018? I am trying to replace FFTW3 with MKL for licensing reasons. The header has:

#include <mkl_dfti.h>
#include <fftw3_mkl.h>

which may be redundant, because fftw3_mkl.h includes most of what is needed. My compilation process is in a Makefile, and the MKL-related parts look like this:

MKL_DIR = "c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl"
MKL_INC = -I$(MKL_DIR)/include -I$(MKL_DIR)/include/fftw
MKL_LIBS = $(MKL_DIR)/lib/intel64_win
MKL_ALL = $(MKL_INC) -L$(MKL_LIBS) -lmkl_core -lmkl_intel_lp64 -lm

CXX = g++
CXXFLAGS = -Wall -pthread -mms-bitfields -m64

app:
    $(CXX) $(CXXFLAGS) -o $(BINDIR)/$(EXEC_WIN) $(HELPER_OBJS) $(APP_OBJ) $(MKL_ALL)

The objects are generated fine, as you probably know if you use MinGW64. But the app target, which does the linking, outputs the following errors for this configuration, which I borrowed from the Intel online linking guide:

==== BUNCH OF ERROR TEXT ====

c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(_free.obj):(.text[mkl_free]+0x4): undefined reference to `mkl_serv_free'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(_free.obj):(.text[MKL_free]+0x1): undefined reference to `mkl_serv_free'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(_malloc.obj):(.text[mkl_malloc]+0x6): undefined reference to `mkl_serv_malloc'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(_malloc.obj):(.text[MKL_malloc]+0x1): undefined reference to `mkl_serv_malloc'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(fftwf_plan_dft_r2c.obj):(.text[fftwf_plan_dft_r2c]+0x22c): undefined reference to `__security_check_cookie'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(fftwf_plan_dft_r2c.obj):(.xdata+0x10): undefined reference to `__GSHandlerCheck'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(fftwf_plan_guru64_dft_r2c.obj):(.text[fftwf_plan_guru64_dft_r2c]+0x207): undefined reference to `__security_check_cookie'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(fftwf_plan_guru64_dft_r2c.obj):(.xdata+0x14): undefined reference to `__GSHandlerCheck'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(dfticommitdescriptor_lp64.obj):(.text[DftiCommitDescriptor]+0x28): undefined reference to `mkl_dft_dfti_verbose'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(dfticreatedescriptor_s_1d_lp64.obj):(.text[DftiCreateDescriptor_s_1d]+0x61): undefined reference to `mkl_dft_dfti_create_sc1d'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(dfticreatedescriptor_s_1d_lp64.obj):(.text[DftiCreateDescriptor_s_1d]+0x79): undefined reference to `mkl_dft_dfti_create_sr1d'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(dfticreatedescriptor_s_1d_lp64.obj):(.text[DftiCreateDescriptor_s_1d]+0x8a): undefined reference to `mkl_dft_bless_node_omp'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(dfticreatedescriptor_s_md_lp64.obj):(.text[DftiCreateDescriptor_s_md]+0x220): undefined reference to `mkl_dft_dfti_create_scmd'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(dfticreatedescriptor_s_md_lp64.obj):(.text[DftiCreateDescriptor_s_md]+0x23f): undefined reference to `mkl_dft_dfti_create_srmd'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(dfticreatedescriptor_s_md_lp64.obj):(.text[DftiCreateDescriptor_s_md]+0x24e): undefined reference to `mkl_dft_bless_node_omp'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(dfticreatedescriptor_s_md_lp64.obj):(.text[DftiCreateDescriptor_s_md]+0x2a2): undefined reference to `__security_check_cookie'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(dfticreatedescriptor_s_md_lp64.obj):(.xdata+0xc): undefined reference to `__GSHandlerCheck'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(dftisetvalue_lp64.obj):(.text[DftiSetValue]+0x108): undefined reference to `mkl_serv_strnlen_s'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(dftisetvalue_lp64.obj):(.text[DftiSetValue]+0x316): undefined reference to `__security_check_cookie'
c:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2018.3.210/windows/mkl/lib/intel64_win/mkl_intel_lp64.lib(dftisetvalue_lp64.obj):(.xdata+0xc): undefined reference to `__GSHandlerCheck'
collect2.exe: error: ld returned 1 exit status
make: *** [Makefile:75: windows] Error 1

My MinGW64 is from an MSYS2 installation, version 7.3.0. Is this still a no-go, or can this linking be made to work?

↧

MKL_VERBOSE

July 13, 2018, 9:17 am

It would be useful to be able to limit the output of MKL_VERBOSE on a per-thread basis. For example, suppose you are using a KNL and running 1 process with 16 threads. You may want to limit the MKL_VERBOSE output to only 1 calling thread (though the MKL calls themselves may be using multiple threads).

↧

The Discrete Hartley Transform cannot get the right computing result on Windows

July 20, 2018, 8:27 pm

Hi,

I have a program that needs to use the FFTW interface in MKL, and I get the right result in the Linux version. But when I compiled it on Windows, I got a completely wrong answer. So I wrote a simple program to test FFTW, and I found that the result of FFTW_DHT (the Discrete Hartley Transform) is all zeros. I compiled this program with Visual Studio 2015 and the Fortran compiler 2018. The simple program is attached, and I would like to know how to solve this.

Best wishes!

Attachment	Size
Download Source1.f90	471 bytes

↧

MKL Batch GEMM with TBB threading solution gives no performance improvements

July 20, 2018, 11:17 pm

As part of the open source library ArrayFire, Intel MKL is used for GEMM operations, and we recently updated the code to use the batch version of GEMM. We have noticed that using GNU OpenMP or Intel OpenMP as the threading solution gives the expected speedups, but TBB does not. We wanted to bring this to your attention. Given below is the ArrayFire benchmark code used to time the GEMM operations.

#include <arrayfire.h>
#include <stdio.h>
#include <math.h>
#include <cstdlib>

using namespace af;

// create a small wrapper to benchmark
static array A; // populated before each timing
static void fn()
{
    array B = matmul(A, A);  // matrix multiply
    B.eval();                // ensure evaluated
}

int main(int argc, char ** argv)
{
    double peak = 0;
    try {
        int device = argc > 1 ? atoi(argv[1]) : 0;
        setDevice(device);
        info();

        printf("Benchmark N-by-N matrix multiply\n");
        for (int n = 128; n <= 2048; n += 128) {

            //printf("%4d x %4d: ", n, n);
            A = constant(1,n,n,3);
            double time = timeit(fn); // time in seconds
            double gflops = 2.0 * powf(n,3) / (time * 1e9);
            if (gflops > peak)
                peak = gflops;

            printf("%4.2f\n", gflops);
            fflush(stdout);
        }
    } catch (af::exception& e) {
        fprintf(stderr, "%s\n", e.what());
        throw;
    }


    printf(" ### peak %g GFLOPS\n", peak);

    return 0;
}

The benchmark results are provided in the form of an interactive chart at this URL

The usage of the batch GEMM call inside ArrayFire can be found in the following source file.

https://github.com/9prady9/arrayfire/blob/57eb26d03a738c8a99b664dcbe374b...

Thank you,

Pradeep.

↧

.NET core compatibility

July 21, 2018, 3:39 am

Hello.

I was able to use MKL under mono/linux.

However it fails with segmentation fault and/or illegal instruction when I use .NET core.

The same .NET core app works when launched under windows.

Are there known compatibility issues with dotnet core that for whatever reason wouldn't exist on mono?

Same OS, same libmkl_rt.so library, but when it starts executing, it halts everything and gives no tracing information of any sort.

I have not compiled MKL from source.

↧

Install issues

July 21, 2018, 9:12 am

I've been following the instructions for MKL 2018 Update 3. I had no problem figuring out the configuration for download (Ubuntu 16.04 LTS with gcc, etc.). The getting started guide first says to check your installation and make sure that "mklvars.sh" and "mklvars.csh" are in (install directory)/bin. In my case that's /opt/intel/bin, the default from the install script. In that directory there are two files, "compilervars.csh" and "compilervars.sh". OK, no big deal: Intel changed the script names and didn't update the guide. But when I go to the next step in the guide, "setting the environment variables", and execute the script, I get:

$ /opt/intel/bin/compilervars.sh intel64

ERROR: architecture is not defined. Accepted values: ia32, intel64

Syntax:
source mklvars.sh <arch> [MKL_interface] [mod]

<arch> must be one of the following
ia32 : Setup for IA-32 architecture
intel64 : Setup for Intel(R) 64 architecture

mod (optional) - set path to Intel(R) MKL F95 modules

MKL_interface (optional) - Intel(R) MKL programming interface for intel64
Not applicable without mod
lp64 : 4 bytes integer (default)
ilp64 : 8 bytes integer

If the arguments to the sourced script are ignored (consult docs for
your shell) the alternative way to specify target is environment
variables COMPILERVARS_ARCHITECTURE or MKLVARS_ARCHITECTURE to pass
<arch> to the script, MKLVARS_INTERFACE to pass <MKL_interface> and
MKLVARS_MOD to pass <mod>

/opt/intel/bin/compilervars.sh: 42: [: =: unexpected operator
/opt/intel/bin/compilervars.sh: 50: [: =: unexpected operator
/opt/intel/bin/compilervars.sh: 159: [: =: unexpected operator
/opt/intel/bin/compilervars.sh: 171: [: =: unexpected operator

This seems simple enough and I can't believe it is broken. What am I doing wrong!?!

↧
