46 | | +--------------------------+-----------------------------+----------+------------+--------+ |
47 | | | Image Name | Description | username | Updated | Status | |
48 | | +--------------------------+-----------------------------+----------+------------+--------+ |
49 | | | bare.ndz | ubuntu18.04 + basic config | root | 2020-05-11 | ready | |
50 | | +--------------------------+-----------------------------+----------+------------+--------+ |
51 | | | baseline.ndz | bare + omf tools | root | 2020-05-11 | ready | |
52 | | +--------------------------+-----------------------------+----------+------------+--------+ |
53 | | | baseline-uhd.ndz | baseline + uhd 3.15 | root | 2020-05-11 | ready | |
54 | | +--------------------------+-----------------------------+----------+------------+--------+ |
55 | | | baseline-gr.ndz | baseline-uhd + gnuradio 3.8 | root | 2020-05-11 | ready | |
56 | | +--------------------------+-----------------------------+----------+------------+--------+ |
57 | | | baseline-cuda.ndz | baseline + cuda + drivers | root | n/a | | |
58 | | +--------------------------+-----------------------------+----------+------------+--------+ |
59 | | | baseline-tensorflow.ndz | baseline-cuda + tensorflow | root | n/a | | |
60 | | +--------------------------+-----------------------------+----------+------------+--------+ |
61 | | | baseline-pytorch.ndz | baseline-cuda + pytorch | root | n/a | | |
62 | | +--------------------------+-----------------------------+----------+------------+--------+ |
63 | | }}} |
64 | | |
| 46 | +--------------------------+-----------------------------+----------+------------+----------+ |
| 47 | | Image Name | Description | username | Updated | Status | |
| 48 | +--------------------------+-----------------------------+----------+------------+----------+ |
| 49 | | bare.ndz | ubuntu18.04 + basic config | root | 2020-05-11 | ready | |
| 50 | +--------------------------+-----------------------------+----------+------------+----------+ |
| 51 | | baseline.ndz | bare + omf tools | root | 2020-05-11 | ready | |
| 52 | +--------------------------+-----------------------------+----------+------------+----------+ |
| 53 | | baseline-uhd.ndz | baseline + uhd 3.15 | root | 2020-05-11 | ready | |
| 54 | +--------------------------+-----------------------------+----------+------------+----------+ |
| 55 | | baseline-gr.ndz | baseline-uhd + gnuradio 3.8 | root | 2020-05-11 | ready | |
| 56 | +--------------------------+-----------------------------+----------+------------+----------+ |
| 57 | | baseline-cuda.ndz | baseline + cuda + drivers | root | 2020-05-20 | internal | |
| 58 | +--------------------------+-----------------------------+----------+------------+----------+ |
| 59 | | baseline-tensorflow.ndz | baseline-cuda + tensorflow | root | 2020-05-20 | internal | |
| 60 | +--------------------------+-----------------------------+----------+------------+----------+ |
| 61 | | baseline-pytorch.ndz | baseline-cuda + pytorch | root | 2020-05-20 | internal | |
| 62 | +--------------------------+-----------------------------+----------+------------+----------+ |
| 63 | }}} |
350 | | ===== Baseline CUDA |
351 | | The cuda baseline image is meant to be run on the cosmos server machines containing V100 GPUs. It is built with Nvidia drivers for the GPUs and CUDA libraries for general purpose GPU programming. The baseline image is built with driver version 410.104 with cuda 10.0 libraries. |
352 | | |
353 | | If you would like to create a cuda image using different versions of either the drivers or cuda, you can do so by starting with the baseline_1804 image. |
354 | | |
355 | | 1. Select the driver version you need from [https://www.nvidia.com/Download/Find.aspx?lang=en-us the Nvidia Driver Downloads Page]. Be sure to specify the product type as "Tesla" and the product series as "V-Series". Click download and then on the following page, right click on the "agree & download" and copy the link address. On the node, use wget or curl to download the link you copied. |
356 | | * "dpkg -i nvidia-diag-driver-local-repo-ubuntu1804-410.104_1.0-1_amd64.deb" note: you may be asked to add a gpg key during the installation process. Use the command that is given. |
357 | | * "apt-get update" |
358 | | * "apt-get install cuda-drivers" |
359 | | * log out of the node and use omf tell to turn it off and on again. When you log back into the node, running lsmod should demonstrate that the nvidia drivers have been loaded. |
360 | | 2. Select the version of cuda you need from [https://developer.nvidia.com/cuda-toolkit-archive the cuda toolkit archive], then choose your operating system (Linux), architecture (x86_64), distribution (Ubuntu), and version (18.04). Choose "deb(local)" as the installer type. Again, copy the download link and use wget to download it onto the node. You can then follow the installation instructions on the download page. |
| 351 | ==== Baseline CUDA |
| 352 | |
| 353 | The CUDA baseline image is meant to be run on the COSMOS server machines containing V100 GPUs. It is built with the Nvidia drivers and CUDA libraries for general-purpose GPU programming. The baseline image is built with driver version 440.33 and CUDA 10.2 libraries. If you need an older version of CUDA, there is also an image for version 10.1 called "baseline_cuda_1804_101.ndz", which has driver version 418.39. Note: all CUDA images are currently available only to members of WINLAB. Please contact a COSMOS administrator if you would like to use one of these images. |
| 354 | |
| 355 | [[CollapsibleStart(Build Steps)]] |
| 356 | |
| 357 | If you would like to create your own CUDA image, you can do so by starting with the baseline_1804 image. |
| 358 | |
| 359 | 1. Select the version of CUDA you need from [https://developer.nvidia.com/cuda-toolkit-archive the CUDA toolkit archive], then choose your operating system (Linux), architecture (x86_64), distribution (Ubuntu), and version (18.04). Choose "runfile(local)" as the installer type. If a download link is available, copy it and use wget to download the installer onto the node. If there is no download link, the page should provide a URL to use with wget. |
| 360 | * After the installer has downloaded, use `chmod +x` to make it executable. |
| 361 | * Blacklist nouveau by creating the file `/etc/modprobe.d/blacklist-nouveau.conf` with these lines: |
| 362 | {{{ |
| 363 | blacklist nouveau |
| 364 | options nouveau modeset=0 |
| 365 | }}} |
| 366 | * Run `sudo update-initramfs -u` to apply the changes. |
| 367 | * Reboot the node, then execute the installer. |
| 368 | 2. You will also have to add the directory of CUDA binaries to your path. Edit the `~/.bashrc` file and add: |
| 369 | {{{ |
| 370 | export PATH=/usr/local/cuda/bin${PATH:+:${PATH}} |
| 371 | export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}} |
| 372 | }}} |
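The blacklist and path steps above can be sketched as two small shell helpers. This is a minimal illustration, not part of the COSMOS tooling: the function names `write_nouveau_blacklist` and `prepend_path` and the optional directory argument are assumptions made here so the steps can be tried in a scratch directory before touching `/etc/modprobe.d`.

```shell
#!/bin/sh
# Sketch only: helper names and the optional directory argument are
# illustrative assumptions, not part of the COSMOS tooling.

# Write the nouveau blacklist file into the given directory
# (defaults to /etc/modprobe.d, which requires root).
write_nouveau_blacklist() {
    conf_dir="${1:-/etc/modprobe.d}"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' \
        > "$conf_dir/blacklist-nouveau.conf"
}

# Prepend a directory to a colon-separated search path, using the same
# ${VAR:+:${VAR}} expansion as the export lines above: the ':' separator
# is only emitted when the existing value is non-empty.
prepend_path() {
    new_dir="$1"
    current="$2"
    printf '%s\n' "${new_dir}${current:+:${current}}"
}
```

On a node, running `write_nouveau_blacklist` as root, followed by `sudo update-initramfs -u` and a reboot, corresponds to the blacklist step. After the installer has finished and the exports are in place, `lsmod | grep nouveau` should print nothing and `nvcc --version` should report the installed toolkit. The `${VAR:+:${VAR}}` form avoids leaving a trailing `:` in the path when the variable was previously unset, which would otherwise add the current directory to the search path.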