====== CUDA Programming in EECS ======

Some EECS Linux systems have [[https://nvidia.com|NVIDIA]] [[wp>Graphics_processing_unit|GPUs]] capable of running CUDA applications. In addition to a compatible card, CUDA requires a special driver. Below are some tips on how to get more information on CUDA capabilities and programming in EECS.

===== Does my system have a CUDA-capable GPU? =====

You can discover whether your computer has an NVIDIA card by checking the PCI-connected hardware with the ''lspci'' command:

  jruser:hydra9 ~> lspci | grep -i nvidia
  01:00.0 VGA compatible controller: NVIDIA Corporation GM107 [GeForce GTX 745] (rev a2)
  01:00.1 Audio device: NVIDIA Corporation Device 0fbc (rev a1)

In this example, the system has an NVIDIA GeForce GTX 745 video card. NVIDIA provides [[https://developer.nvidia.com/cuda-gpus|more information]] on which cards have CUDA capabilities and at what level.

===== Where is the CUDA software? =====

On EECS IT-supported systems, the most recent version of NVIDIA's CUDA software is installed in ''/usr/local/cuda''. Some older versions may also be available in ''/usr/local'' under versioned directory names such as ''/usr/local/cuda-10.1''.

===== How do I use CUDA? =====

If your system supports CUDA, you may want to start by adding ''/usr/local/cuda/bin'' to your shell's ''PATH'' variable. This can be done in your shell initialization files, e.g. by adding the line ''export PATH="$PATH:/usr/local/cuda/bin"'' to your ''.zshrc'' file. If you are using the default EECS shell initialization files, you will likely also have a ''.zsh.path'' file where you can alter your default ''PATH''.
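
The ''PATH'' change described above can be written so that it is safe to run more than once. The following is a minimal sketch, not an EECS-provided script: the helper name ''add_to_path'' is hypothetical, and the check at the end only assumes that the CUDA compiler ''nvcc'' lives in ''/usr/local/cuda/bin''.

```shell
# Hypothetical helper: append a directory to PATH only if it is not already there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$PATH:$1" ;;     # otherwise append it
    esac
    export PATH
}

add_to_path /usr/local/cuda/bin

# Report whether the CUDA compiler is now reachable.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
else
    echo "nvcc not found on PATH"
fi
```

Appending the directory, rather than prepending it, keeps system tools ahead of the CUDA tools in the search order, which matches the ''export'' line shown above.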

For general information on CUDA programming, please see:

  * [[https://developer.nvidia.com/cuda-education-training|NVIDIA CUDA Education and Training]]
  * [[https://developer.nvidia.com/cuda-education|Training Materials and Code Samples]]

==== Using pkg-config for CUDA ====

If you are familiar with the Linux [[https://www.freedesktop.org/wiki/Software/pkg-config/|pkg-config]] command, you can use it to add the proper compiler and linker options to your ''gcc'' command line and makefiles. To see which CUDA versions are supported via ''pkg-config'', try:

  jruser:hydra9 /usr/local> pkg-config --list-all | grep cuda
  cuda-10.1    cuda - CUDA Driver Library
  cuda-10.2    cuda - CUDA Driver Library
  cudart-10.2  cudart - CUDA Runtime Library
  cudart-10.1  cudart - CUDA Runtime Library

To get the compiler flags for CUDA version 10.2, for example, you can run:

  jruser:hydra9 /usr/local> pkg-config --cflags cuda-10.2
  -I/usr/local/cuda-10.2/targets/x86_64-linux/include

For more information, read the [[https://linux.die.net/man/1/pkg-config|manual page]] for ''pkg-config'' and the documentation linked above.

===== What about TensorFlow, NVIDIA cuDNN, pyCUDA, etc.? =====

Many software packages integrate with CUDA-capable GPUs. Some of them are installed by default on EECS systems; however, licensing restrictions do not always allow EECS IT to install a package for every user on a system. For example, [[https://developer.nvidia.com/cudnn|NVIDIA cuDNN]] requires end users to agree to a developer license agreement.

In general, you should be able to download your own copy of software that is not available on EECS IT systems and install it in your Linux home directory. For Python modules, use a [[https://docs.python-guide.org/dev/virtualenvs/|Python VirtualEnv]]. If you need a specific library or software product for a course, please [[:start|contact EECS IT]] to discuss options.
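
As an illustration of the virtual environment approach mentioned above, here is a minimal sketch that uses Python's built-in ''venv'' module. The directory ''~/venvs/cuda-tools'' and the helper name ''make_venv'' are hypothetical choices, and the ''pip install pycuda'' step in the comments is only an example of a package you might install; it assumes ''nvcc'' is already on your ''PATH''.

```shell
# Hypothetical location for the environment; any directory under $HOME works.
VENV_DIR="${VENV_DIR:-$HOME/venvs/cuda-tools}"

# Create the environment only if the directory does not exist yet.
make_venv() {
    [ -d "$1" ] || python3 -m venv "$1"
}

# Typical use, run by hand:
#   make_venv "$VENV_DIR"
#   . "$VENV_DIR/bin/activate"
#   pip install pycuda        # example package, installed into your home directory
```

Because everything lands under your home directory, no administrator rights are needed, and removing the directory removes the installed modules.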

Further reading:

  * [[knowledge-base/linux-topics/quick-reference|Linux Quick Reference]]
  * [[knowledge-base/linux-topics/using-redhat-software-collections|Using RedHat Software Collections]]

[[:start|Contact EECS IT Support]] if you need software for a course.

===== What capabilities does my card have? =====

If you have determined that your system includes a CUDA-capable GPU, you can use the ''deviceQuery'' command to find out more about it, such as its "CUDA Capability" version, number of CUDA cores, etc.:

  jruser:hydra9 ~> /usr/local/cuda/samples/bin/x86_64/linux/release/deviceQuery
  /usr/local/cuda/samples/bin/x86_64/linux/release/deviceQuery Starting...
   CUDA Device Query (Runtime API) version (CUDART static linking)
  Detected 1 CUDA Capable device(s)
  Device 0: "GeForce GTX 745"
    CUDA Driver Version / Runtime Version          10.2 / 10.2
    CUDA Capability Major/Minor version number:    5.0
    Total amount of global memory:                 4036 MBytes (4231725056 bytes)
    ( 3) Multiprocessors, (128) CUDA Cores/MP:     384 CUDA Cores
    GPU Max Clock rate:                            1032 MHz (1.03 GHz)
    Memory Clock rate:                             900 Mhz
    Memory Bus Width:                              128-bit
    L2 Cache Size:                                 2097152 bytes
    Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
    Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
    Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
    Total amount of constant memory:               65536 bytes
    Total amount of shared memory per block:       49152 bytes
    Total number of registers available per block: 65536
    Warp size:                                     32
    Maximum number of threads per multiprocessor:  2048
    Maximum number of threads per block:           1024
    Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
    Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
    Maximum memory pitch:                          2147483647 bytes
    Texture alignment:                             512 bytes
    Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
    Run time limit on kernels:                     Yes
    Integrated GPU sharing Host Memory:            No
    Support host page-locked memory mapping:       Yes
    Alignment requirement for Surfaces:            Yes
    Device has ECC support:                        Disabled
    Device supports Unified Addressing (UVA):      Yes
    Device supports Compute Preemption:            No
    Supports Cooperative Kernel Launch:            No
    Supports MultiDevice Co-op Kernel Launch:      No
    Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
    Compute Mode:
       < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
  deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 1
  Result = PASS

===== Will you enable CUDA on my desktop Red Hat Enterprise Linux system? =====

If your system has a CUDA-capable GPU but you are currently unable to use the CUDA toolkit, the NVIDIA driver will likely need to be installed or updated. Red Hat Linux ships with the open-source [[https://nouveau.freedesktop.org/wiki/|Nouveau]] driver for NVIDIA video cards. While this driver supports display capabilities, it does not support CUDA programming. To enable CUDA, the proprietary NVIDIA driver needs to be installed.

To check which driver is in use on your system, you can use the ''lsmod'' command to list the currently loaded kernel modules (drivers):

  jruser:hydra9 ~> lsmod | grep -E "nvidia|nouveau"
  nvidia_drm             39594  3
  nvidia_modeset       1109637  6 nvidia_drm
  nvidia_uvm            939731  0
  nvidia              20390418  253 nvidia_modeset,nvidia_uvm
  drm_kms_helper        186531  1 nvidia_drm
  drm                   456166  6 drm_kms_helper,nvidia_drm
  ipmi_msghandler        56728  2 ipmi_devintf,nvidia

In the above example, the "nvidia" driver is loaded. On systems with Nouveau you will see something similar to this:

  jruser:hydra9 ~> lsmod | grep -E "nvidia|nouveau"
  nouveau              1898794  7
  mxm_wmi                13021  1 nouveau
  i2c_algo_bit           13413  1 nouveau
  drm_kms_helper        186531  1 nouveau
  ttm                    96673  1 nouveau
  drm                   456166  7 ttm,drm_kms_helper,nouveau
  wmi                    21636  6 dell_smbios,dell_wmi_descriptor,dell_led,dell_wmi,mxm_wmi,nouveau
  video                  24538  1 nouveau

Please [[:start|contact EECS IT support]] for help getting the NVIDIA drivers installed on your system.
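
The ''lsmod'' check described above can also be scripted. The sketch below is illustrative only: the function name ''detect_gpu_driver'' is hypothetical, and it simply looks at the first column of ''lsmod''-style output for a module named exactly ''nvidia'' or ''nouveau''.

```shell
# Classify the GPU driver from lsmod-style output read on standard input.
# Prints "nvidia", "nouveau", or "none".
detect_gpu_driver() {
    awk '
        $1 == "nvidia"  { found = "nvidia" }
        $1 == "nouveau" { if (found == "") found = "nouveau" }
        END { print (found == "" ? "none" : found) }
    '
}

# Run the check against the live module list when lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    echo "GPU driver: $(lsmod | detect_gpu_driver)"
fi
```

A result of ''nouveau'' or ''none'' on a machine with a CUDA-capable card suggests that the NVIDIA driver still needs to be installed.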