Building the Project with DPC++#

This page describes building the oneMKL Interfaces with either the Intel(R) oneAPI DPC++ Compiler or open-source oneAPI DPC++ Compiler. For guidance on building the project with AdaptiveCpp, see Building the Project with AdaptiveCpp.

Environment Setup#

  1. Install the required DPC++ compiler (Intel(R) DPC++ or Open DPC++ - see Selecting a Compiler).

  2. Clone this project. The root directory of the cloned repository will be referred to as <path to onemkl>.

  3. Build and install all required dependencies.

Build Commands#

The build commands for various compilers and backends differ mostly in setting the values of CMake options for compiler and backend. In this section, we describe the common build commands. We will discuss backend-specific details in the Backends section and provide examples in CMake invocation examples.

On Linux, the common form of the build command looks as follows (see Building for Windows for building on Windows):

# Inside <path to onemkl>
mkdir build && cd build
cmake .. -DCMAKE_CXX_COMPILER=$CXX_COMPILER    \ # Should be icpx or clang++
        -DCMAKE_C_COMPILER=$C_COMPILER         \ # Should be icx or clang
        -DENABLE_MKLGPU_BACKEND=False          \ # Optional: The MKLCPU backend is True by default.
        -DENABLE_MKLGPU_BACKEND=False          \ # Optional: The MKLGPU backend is True by default.
        -DENABLE_<BACKEND_NAME>_BACKEND=True   \ # Enable any other backend(s) (optional)
        -DENABLE_<BACKEND_NAME_2>_BACKEND=True \ # Multiple backends can be enabled at once.
        -DBUILD_FUNCTIONAL_TESTS=False         \ # See page *Building and Running Tests* for more on building tests. True by default.
        -DBUILD_EXAMPLES=False                   # Optional: True by default.
cmake --build .
cmake --install . --prefix <path_to_install_dir>  # required to have full package structure

In the above, the $CXX_COMPILER and $C_COMPILER should be set to icpx and icx respectively when using the Intel(R) oneAPI DPC++ Compiler, or clang++ and clang respectively when using the Open DPC++ Compiler.

Backends should be enabled by setting -DENABLE_<BACKEND_NAME>_BACKEND=True for each desired backend. By default, only the MKLGPU and MKLCPU backends are enabled. Multiple backends for multiple device vendors can be enabled at once (albeit with limitations when using portBLAS and portFFT). The supported backends for the compilers are given in the table at oneMKL supported configurations table, and the CMake option names are given in the table below. Some backends may require additional parameters to be set. See the relevant section below for additional guidance.

If a backend library supports multiple domains (i.e., BLAS, LAPACK, DFT, RNG, sparse BLAS), it may be desirable to only enable selected domains. For this, the TARGET_DOMAINS variable should be set. See the section TARGET_DOMAINS.

By default, the library also additionally builds examples and tests. These can be disabled by setting the parameters BUILD_FUNCTIONAL_TESTS and BUILD_EXAMPLES to False. Building the functional tests requires additional external libraries for the BLAS and LAPACK domains. See the section Building and Running Tests for more information.

The most important supported build options are:

CMake Option

Supported Values

Default Value

ENABLE_MKLCPU_BACKEND

True, False

True

ENABLE_MKLGPU_BACKEND

True, False

True

ENABLE_CUBLAS_BACKEND

True, False

False

ENABLE_CUSOLVER_BACKEND

True, False

False

ENABLE_CUFFT_BACKEND

True, False

False

ENABLE_CURAND_BACKEND

True, False

False

ENABLE_NETLIB_BACKEND

True, False

False

ENABLE_ROCBLAS_BACKEND

True, False

False

ENABLE_ROCFFT_BACKEND

True, False

False

ENABLE_ROCSOLVER_BACKEND

True, False

False

ENABLE_ROCRAND_BACKEND

True, False

False

ENABLE_MKLCPU_THREAD_TBB

True, False

True

ENABLE_PORTBLAS_BACKEND

True, False

False

ENABLE_PORTFFT_BACKEND

True, False

False

BUILD_FUNCTIONAL_TESTS

True, False

True

BUILD_EXAMPLES

True, False

True

TARGET_DOMAINS (list)

blas, lapack, rng, dft, sparse_blas

All domains

Some additional build options are given in the section Additional build options.

TARGET_DOMAINS#

oneMKL supports multiple domains: BLAS, DFT, LAPACK, RNG and sparse BLAS. The domains built by oneMKL can be selected using the TARGET_DOMAINS parameter. In most cases, TARGET_DOMAINS is set automatically according to the domains supported by the backend libraries enabled. However, while most backend libraries support only one of these domains, but some may support multiple. For example, the MKLCPU backend supports every domain. To enable support for only the BLAS domain in the oneMKL Interfaces whilst compiling with MKLCPU, TARGET_DOMAINS could be set to blas. To enable BLAS and DFT, -DTARGET_DOMAINS="blas dft" would be used.

Backends#

Building for Intel(R) oneMKL#

The Intel(R) oneMKL backend supports multiple domains on both x86 CPUs and Intel GPUs. The MKLCPU backend using Intel(R) oneMKL for x86 CPU is enabled by default, and controlled with the parameter ENABLE_MKLCPU_BACKEND. The MKLGPU backend using Intel(R) oneMKL for Intel GPU is enabled by default, and controlled with the parameter ENABLE_MKLGPU_BACKEND.

When using the Intel(R) oneAPI DPC++ Compiler, it is likely that Intel(R) oneMKL will be found automatically. If it is not, the parameter MKL_ROOT can be set to point to the installation prefix of Intel(R) oneMKL. Alternatively, the MKLROOT environment variable can be set, either manually or by using an environment script provided by the package.

Building for CUDA#

The CUDA backends can be enabled with ENABLE_CUBLAS_BACKEND, ENABLE_CUFFT_BACKEND, ENABLE_CURAND_BACKEND, and ENABLE_CUSOLVER_BACKEND.

No additional parameters are required for using CUDA libraries. In most cases, the CUDA libraries should be found automatically by CMake.

Building for ROCm#

The ROCm backends can be enabled with ENABLE_ROCBLAS_BACKEND, ENABLE_ROCFFT_BACKEND, ENABLE_ROCSOLVER_BACKEND and ENABLE_ROCRAND_BACKEND.

For RocBLAS, RocSOLVER and RocRAND, the target device architecture must be set. This can be set with using the HIP_TARGETS parameter. For example, to enable a build for MI200 series GPUs, -DHIP_TARGETS=gfx90a should be set. Currently, DPC++ can only build for a single HIP target at a time. This may change in future versions.

A few often-used architectures are listed below:

Architecture

AMD GPU name

gfx90a

AMD Instinct(TM) MI210/250/250X Accelerator

gfx908

AMD Instinct(TM) MI 100 Accelerator

gfx906

AMD Radeon Instinct(TM) MI50/60 Accelerator
AMD Radeon(TM) (Pro) VII Graphics Card

gfx900

Radeon Instinct(TM) MI 25 Accelerator
Radeon(TM) RX Vega 64/56 Graphics

For a host with ROCm installed, the device architecture can be retrieved via the rocminfo tool. The architecture will be displayed in the Name: row.

Pure SYCL backends: portBLAS and portFFT#

portBLAS and portFFT are experimental pure-SYCL backends that work on all SYCL targets supported by the DPC++ compiler. Since they support multiple targets, they cannot be enabled with other backends in the same domain, or the MKLCPU or MKLGPU backends. Both libraries are experimental and currently only support a subset of operations and features.

For best performance, both libraries must be tuned. See the individual sections for more details.

Both portBLAS and portFFT are used as header-only libraries, and will be downloaded automatically if not found.

Building for portBLAS#

portBLAS is enabled by setting -DENABLE_PORTBLAS_BACKEND=True.

By default, the portBLAS backend is not tuned for any specific device. This tuning is required to achieve best performance. portBLAS can be tuned for a specific hardware target by adding compiler definitions in 2 ways:

  1. Manually specify a tuning target with -DPORTBLAS_TUNING_TARGET=<target>. The list of portBLAS targets can be found here. This will automatically set -fsycl-targets if needed.

  2. If one target is set via -fsycl-targets the configuration step will try to automatically detect the portBLAS tuning target. One can manually specify -fsycl-targets via CMAKE_CXX_FLAGS. See DPC++ User Manual for more information on -fsycl-targets.

portBLAS relies heavily on JIT compilation. This may cause time-outs on some systems. To avoid this issue, use ahead-of-time compilation through tuning targets or sycl-targets.

Building for portFFT#

portFFT is enabled by setting -DENABLE_PORTFFT_BACKEND=True.

By default, the portFFT backend is not tuned for any specific device. The tuning flags are detailed in the portFFT repository, and can set at configuration time. Note that some tuning configurations may be incompatible with some targets.

The portFFT library is compiled using the same -fsycl-targets as specified by the CMAKE_CXX_FLAGS. If none are found, it will compile for -fsycl-targets=spir64, and -if the compiler supports it- nvptx64-nvidia-cuda. To enable HIP targets, HIP_TARGETS must be specified. See DPC++ User Manual for more information on -fsycl-targets.

Additional Build Options#

When building oneMKL the SYCL implementation can be specified by setting the ONEMKL_SYCL_IMPLEMENTATION option. Possible values are:

Please see Building the Project with AdaptiveCpp if using this option.

The following table provides details of CMake options and their default values:

CMake Option

Supported Values

Default Value

BUILD_SHARED_LIBS

True, False

True

BUILD_DOC

True, False

False

Note

When building with clang++ for AMD backends, you must additionally set ONEAPI_DEVICE_SELECTOR to hip:gpu and provide -DHIP_TARGETS according to the targeted hardware. This backend has only been tested for the gfx90a architecture (MI210) at the time of writing.

Note

When building with BUILD_FUNCTIONAL_TESTS=True (default option) only single CUDA backend can be built (#270).

CMake invocation examples#

Build oneMKL with support for Nvidia GPUs with tests disabled using the Ninja build system:

cmake $ONEMKL_DIR \
    -GNinja \
    -DCMAKE_CXX_COMPILER=clang++ \
    -DCMAKE_C_COMPILER=clang \
    -DENABLE_MKLGPU_BACKEND=False \
    -DENABLE_MKLCPU_BACKEND=False \
    -DENABLE_CUFFT_BACKEND=True \
    -DENABLE_CUBLAS_BACKEND=True \
    -DENABLE_CUSOLVER_BACKEND=True \
    -DENABLE_CURAND_BACKEND=True \
    -DBUILD_FUNCTIONAL_TESTS=False

$ONEMKL_DIR points at the oneMKL source directly. The x86 CPU (MKLCPU) and Intel GPU (MKLGPU) backends are enabled by default, but are disabled here. The backends for Nvidia GPUs must all be explicilty enabled. The tests are disabled, but the examples will still be built.

Building oneMKL with support for AMD GPUs with tests disabled:

cmake $ONEMKL_DIR \
    -DCMAKE_CXX_COMPILER=clang++ \
    -DCMAKE_C_COMPILER=clang \
    -DENABLE_MKLCPU_BACKEND=False \
    -DENABLE_MKLGPU_BACKEND=False \
    -DENABLE_ROCFFT_BACKEND=True  \
    -DENABLE_ROCBLAS_BACKEND=True \
    -DENABLE_ROCSOLVER_BACKEND=True \
    -DHIP_TARGETS=gfx90a \
    -DBUILD_FUNCTIONAL_TESTS=False

$ONEMKL_DIR points at the oneMKL source directly. The x86 CPU (MKLCPU) and Intel GPU (MKLGPU) backends are enabled by default, but are disabled here. The backends for AMD GPUs must all be explicilty enabled. The tests are disabled, but the examples will still be built.

Build oneMKL for the DFT domain only with support for x86 CPU, Intel GPU, AMD GPU and Nvidia GPU with testing enabled:

cmake $ONEMKL_DIR \
    -DCMAKE_CXX_COMPILER=icpx \
    -DCMAKE_C_COMPILER=icx \
    -DENABLE_ROCFFT_BACKEND=True \
    -DENABLE_CUFFT_BACKEND=True \
    -DTARGET_DOMAINS=dft \
    -DBUILD_EXAMPLES=False

Note that this is not a supported configuration, and requires Codeplay’s oneAPI for AMD and Nvidia GPU plugins. The MKLCPU and MKLGPU backends are enabled by default, with backends for Nvidia GPU and AMD GPU explicitly enabled. -DTARGET_DOMAINS=dft causes only DFT backends to be built. If this was not set, the backend libraries to enable the use of BLAS, LAPACK and RNG with MKLGPU and MKLCPU would also be enabled. The build of examples is disabled. Since functional testing was not disabled, tests would be built.

Project Cleanup#

Most use-cases involve building the project without the need to clean up the build directory. However, if you wish to clean up the build directory, you can delete the build folder and create a new one. If you wish to clean up the build files but retain the build configuration, following commands will help you do so.

# If you use "GNU/Unix Makefiles" for building,
make clean

# If you use "Ninja" for building
ninja -t clean

Building for Windows#

The Windows build is similar to the Linux build, albeit that fewer backends are supported. Additionally, the Ninja build system must be used. For example:

# Inside <path to onemkl>
md build && cd build
cmake .. -G Ninja [-DCMAKE_CXX_COMPILER=<path_to_icx_compiler>\bin\icx] # required only if icx is not found in environment variable PATH
                  [-DCMAKE_C_COMPILER=<path_to_icx_compiler>\bin\icx]   # required only if icx is not found in environment variable PATH
                  [-DMKL_ROOT=<mkl_install_prefix>]                     # required only if environment variable MKLROOT is not set
                  [-DREF_BLAS_ROOT=<reference_blas_install_prefix>]     # required only for testing
                  [-DREF_LAPACK_ROOT=<reference_lapack_install_prefix>] # required only for testing
ninja
ctest
cmake --install . --prefix <path_to_install_dir> # required to have full package structure

Build FAQ#

clangrt builtins lib not found

Encountered when trying to build oneMKL with some ROCm libraries. There are several possible solutions: * If building Open DPC++ from source, add compiler-rt to the external projects compile option: --llvm-external-projects compiler-rt. * The clangrt from ROCm can be used, depending on ROCm version: export LIBRARY_PATH=/path/to/rocm-$rocm-version$/llvm/lib/clang/$clang-version$/lib/linux/:$LIBRARY_PATH

Could NOT find CBLAS (missing: CBLAS file)

Encountered when tests are enabled along with the BLAS domain. The tests require a reference BLAS implementation, but cannot find one. Either install or build a BLAS library and set -DREF_BLAS_ROOT` as described in Building and Running Tests. Alternatively, the tests can be disabled by setting -DBUILD_FUNCTIONAL_TESTS=False.

error: invalid target ID ‘’; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g.,’gfx908:sramecc+:xnack-‘)

The HIP_TARGET has not been set. Please see Building for ROCm.