====== CUDA Programming in EECS ======
Some EECS Linux systems have [[https://nvidia.com|NVIDIA]] [[wp>Graphics_processing_unit|GPUs]] capable of running CUDA applications. In addition to a compatible card, CUDA also requires a special driver. Below are some tips on how to get more information on CUDA capabilities and programming in EECS.
===== Does my system have a CUDA-capable GPU? =====
You can discover whether your computer has a CUDA-capable NVIDIA card by checking the PCI-connected hardware with the ''lspci'' command:
<code>
jruser:hydra9 ~> lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation GM107 [GeForce GTX 745] (rev a2)
01:00.1 Audio device: NVIDIA Corporation Device 0fbc (rev a1)
</code>
In this example, the system has an NVIDIA GeForce GTX 745 video card. NVIDIA provides [[https://developer.nvidia.com/cuda-gpus|more information]] on which cards have CUDA capabilities and at what level.
===== Where is the CUDA software? =====
On EECS IT-supported systems, the most recent version of NVIDIA's CUDA software is installed in ''/usr/local/cuda''. Some older versions may also be available in ''/usr/local'', in directories named by version number, such as ''/usr/local/cuda-10.1''.
===== How do I use CUDA? =====
If your system supports CUDA, you may want to start by adding ''/usr/local/cuda/bin'' to your shell's ''PATH'' variable. This can be done in your shell initialization files, e.g. by adding the line ''export PATH="$PATH:/usr/local/cuda/bin"'' to your ''.zshrc'' file. If you are using the default EECS shell initialization files, you will likely also have a ''.zsh.path'' file where you can alter your default ''PATH''.
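The ''PATH'' change described above can be made idempotent, so that sourcing your initialization file more than once does not add the same directory repeatedly. The following is a minimal sketch, not the EECS default configuration; ''path_append'' is a hypothetical helper name chosen for illustration.

```shell
# Hypothetical helper: append a directory to PATH only if it is not there yet.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already on PATH, nothing to do
        *) PATH="$PATH:$1" ;;        # otherwise append it at the end
    esac
}

# Add the default CUDA toolkit location named in this document.
path_append /usr/local/cuda/bin
export PATH
```

The ''case'' pattern wraps ''PATH'' in colons so that a directory matches only as a whole entry, not as a substring of another entry.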
For general information on CUDA programming, please see:
* [[https://developer.nvidia.com/cuda-education-training|NVIDIA CUDA Education and Training]]
* [[https://developer.nvidia.com/cuda-education|Training Materials and Code Samples]]
==== Using pkg-config for CUDA ====
If you are familiar with the Linux [[https://www.freedesktop.org/wiki/Software/pkg-config/|pkg-config]] command, you can use it to add the proper compiler and linker options to your ''gcc'' command-line and makefiles. To see what CUDA versions are supported via ''pkg-config'', try:
<code>
jruser:hydra9 /usr/local> pkg-config --list-all | grep cuda
cuda-10.1 cuda - CUDA Driver Library
cuda-10.2 cuda - CUDA Driver Library
cudart-10.2 cudart - CUDA Runtime Library
cudart-10.1 cudart - CUDA Runtime Library
</code>
To get the compiler flags for CUDA version 10.2, for example, you can run:
<code>
jruser:hydra9 /usr/local> pkg-config --cflags cuda-10.2
-I/usr/local/cuda-10.2/targets/x86_64-linux/include
</code>
For more information, read the [[https://linux.die.net/man/1/pkg-config|manual page]] for ''pkg-config'' and the documentation linked above.
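As an illustration of using ''pkg-config'' in a build, the sketch below wraps the lookup in a small function that fails cleanly when the requested package is not registered on the system. The function name ''cuda_flags'' is hypothetical, and the package name you pass (for example ''cudart-10.2'') must match one that ''pkg-config --list-all'' reports on your machine.

```shell
# Hypothetical helper: print compiler and linker flags for a pkg-config package.
# Returns 2 if pkg-config is missing, 1 if the package is unknown.
cuda_flags() {
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "pkg-config not found" >&2
        return 2
    fi
    if ! pkg-config --exists "$1"; then
        echo "no pkg-config entry for $1" >&2
        return 1
    fi
    # --cflags prints include paths, --libs prints linker options
    pkg-config --cflags --libs "$1"
}
```

A possible use, assuming a source file named ''app.c'' that calls the CUDA runtime API, would be ''gcc -o app app.c $(cuda_flags cudart-10.2)''.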
===== What about TensorFlow, NVIDIA cuDNN, pyCUDA, etc.? =====
Many software packages integrate with CUDA-capable GPUs. Some of them are installed by default on EECS systems; however, licensing restrictions do not always allow EECS IT to install a package for every user on a system. For example, [[https://developer.nvidia.com/cudnn|NVIDIA cuDNN]] requires end users to agree to a developer license agreement.
In general, you should be able to download your own copy of software that is not available on EECS IT systems and install it in your Linux home directory. For Python modules, use a [[https://docs.python-guide.org/dev/virtualenvs/|Python VirtualEnv]]. If you need a specific library or software product for a course, please [[:start|contact EECS IT]] to discuss options.

Further reading:
* [[knowledge-base/linux-topics/quick-reference|Linux Quick Reference]]
* [[knowledge-base/linux-topics/using-redhat-software-collections|Using RedHat Software Collections]]
[[:start|Contact EECS IT Support]] if you need software for a course.
===== What capabilities does my card have? =====
If you have determined that your system includes a CUDA-capable GPU, you can use the ''deviceQuery'' command to find out more about it, such as its "CUDA Capability" version, number of CUDA cores, etc.:
<code>
jruser:hydra9 ~> /usr/local/cuda/samples/bin/x86_64/linux/release/deviceQuery
/usr/local/cuda/samples/bin/x86_64/linux/release/deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 745"
CUDA Driver Version / Runtime Version 10.2 / 10.2
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 4036 MBytes (4231725056 bytes)
( 3) Multiprocessors, (128) CUDA Cores/MP: 384 CUDA Cores
GPU Max Clock rate: 1032 MHz (1.03 GHz)
Memory Clock rate: 900 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: No
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 1
Result = PASS
</code>
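If you only need one value from the ''deviceQuery'' report, such as the compute capability, the output can be filtered. The sketch below assumes the report contains a "CUDA Capability Major/Minor version number" line as in the sample output on this page; ''cuda_capability'' is a hypothetical helper name, and it reads the report from standard input.

```shell
# Hypothetical helper: print the compute capability of the first device
# found in deviceQuery output supplied on standard input.
cuda_capability() {
    awk -F: '
        /CUDA Capability Major\/Minor version number/ {
            gsub(/[[:space:]]/, "", $2)   # strip padding around the value
            print $2
            exit                          # stop after the first device
        }
    '
}
```

For example, ''/usr/local/cuda/samples/bin/x86_64/linux/release/deviceQuery | cuda_capability'' would print the capability version for the first GPU listed.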
===== Will you enable CUDA on my desktop Red Hat Enterprise Linux system? =====
If your system has a CUDA-capable GPU but you are currently unable to use the CUDA toolkit, the NVIDIA driver will likely need to be installed or updated. Red Hat Linux ships with the open-source [[https://nouveau.freedesktop.org/wiki/|Nouveau]] driver for NVIDIA video cards. However, while this driver supports display capabilities, it does not support CUDA programming. To enable CUDA, the proprietary NVIDIA driver needs to be installed. To check which driver is in use on your system, you can use the ''lsmod'' command to list the currently loaded kernel modules (drivers):
<code>
jruser:hydra9 ~> lsmod | grep -E "nvidia|nouveau"
nvidia_drm 39594 3
nvidia_modeset 1109637 6 nvidia_drm
nvidia_uvm 939731 0
nvidia 20390418 253 nvidia_modeset,nvidia_uvm
drm_kms_helper 186531 1 nvidia_drm
drm 456166 6 drm_kms_helper,nvidia_drm
ipmi_msghandler 56728 2 ipmi_devintf,nvidia
</code>
In the above example, the "nvidia" driver is loaded. On systems with Nouveau you will see something similar to this:
<code>
jruser:hydra9 ~> lsmod | grep -E "nvidia|nouveau"
nouveau 1898794 7
mxm_wmi 13021 1 nouveau
i2c_algo_bit 13413 1 nouveau
drm_kms_helper 186531 1 nouveau
ttm 96673 1 nouveau
drm 456166 7 ttm,drm_kms_helper,nouveau
wmi 21636 6 dell_smbios,dell_wmi_descriptor,dell_led,dell_wmi,mxm_wmi,nouveau
video 24538 1 nouveau
</code>
Please [[:start|contact EECS IT support]] for help getting the NVIDIA drivers installed on your system.
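The driver check can also be scripted. The following sketch reads ''lsmod'' output from standard input and reports which of the two drivers appears as a loaded module; ''gpu_driver'' is a hypothetical helper name, and the logic assumes the module names ''nvidia'' and ''nouveau'' seen in the ''lsmod'' listings on this page.

```shell
# Hypothetical helper: report "nvidia", "nouveau", or "none" based on
# lsmod output supplied on standard input.
gpu_driver() {
    awk '
        # The first column of lsmod output is the module name.
        $1 == "nvidia"                  { driver = "nvidia" }
        # Record nouveau only if nvidia has not been seen.
        $1 == "nouveau" && driver == "" { driver = "nouveau" }
        END { print (driver == "" ? "none" : driver) }
    '
}
```

Running ''lsmod | gpu_driver'' prints ''nvidia'' when the NVIDIA module is loaded, in which case CUDA should be usable, and ''nouveau'' when only the open-source driver is present.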