tamiwiki:projects:egpu [2023/06/18 18:07] – [P40] yair | tamiwiki:projects:egpu [2023/11/04 11:07] (current) – [1080Ti] yair | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== EGPU ====== | ====== EGPU ====== | ||
- | https:// | + | {{ : |
+ | |||
+ | we are using the [[https:// | ||
+ | |||
+ | Linux Kernel notes > https:// | ||
+ | [[https:// | ||
+ | |||
+ | |||
+ | === ThunderBolt check and setup === | ||
TLDR | TLDR | ||
- upgrade kernel (??) | - upgrade kernel (??) | ||
Line 46: | Line 55: | ||
<code bash> | <code bash> | ||
(base) user@eight: | (base) user@eight: | ||
- | 1 | ||
- | (base) user@eight: | ||
- | == / | ||
- | modalias : pci: | ||
- | vendor | ||
- | model : GF106GL [Quadro 2000] | ||
- | manual_install: | ||
- | driver | ||
- | driver | ||
- | |||
</ | </ | ||
- | just an old card... | ||
- | but EEK | ||
- | |||
- | <code bash> | ||
- | (base) user@eight: | ||
- | 08:04.0 PCI bridge: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge DD 2018] (rev 06) | ||
- | 09:00.0 VGA compatible controller: NVIDIA Corporation GF106GL [Quadro 2000] (rev a1) | ||
- | 09:00.1 Audio device: NVIDIA Corporation GF106 High Definition Audio Controller (rev a1) | ||
- | |||
- | $sudo dmesg | ||
- | [ 1041.053826] nvidia: module license ' | ||
- | [ 1041.053831] Disabling lock debugging due to kernel taint | ||
- | [ 1041.484017] nvidia-nvlink: | ||
- | [ 1041.484032] NVRM: The NVIDIA Quadro 2000 GPU installed in this system is | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | [ 1041.501047] NVRM: No NVIDIA GPU found. | ||
- | [ 1041.521176] nvidia-nvlink: | ||
- | [ 1042.332830] nvidia-nvlink: | ||
- | [ 1042.332842] NVRM: The NVIDIA Quadro 2000 GPU installed in this system is | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | [ 1042.335282] NVRM: No NVIDIA GPU found. | ||
- | [ 1042.335835] nvidia-nvlink: | ||
- | </ | ||
- | |||
- | <WRAP center round alert 33%>WE ARE TAINTED</ | ||
- | |||
- | ==== driver ==== | ||
- | |||
- | we went with ubuntu selection | ||
- | |||
- | but cute https:// | ||
- | |||
- | <code bash> | ||
- | $ sudo apt install nvidia-headless-535 | ||
- | |||
- | |||
- | #downgrade nvidia to quadro supported version | ||
- | sudo apt install nvidia-headless-390 | ||
- | |||
- | # EEK | ||
- | ERROR (dkms apport): kernel package linux-headers-6.3.7-060307-generic is not supported | ||
- | Error! Bad return status for module build on kernel: 6.3.7-060307-generic (x86_64) | ||
- | Consult / | ||
- | dpkg: error processing package nvidia-dkms-390 (--configure): | ||
- | | ||
- | dpkg: dependency problems prevent configuration of nvidia-headless-390: | ||
- | | ||
- | Package nvidia-dkms-390 is not configured yet. | ||
- | |||
- | dpkg: error processing package nvidia-headless-390 (--configure): | ||
- | | ||
- | Processing triggers for libc-bin (2.36-0ubuntu4) ... | ||
- | No apport report written because the error message indicates its a followup error from a previous failure. | ||
- | / | ||
- | ndi.so.4 is not a symbolic link | ||
- | |||
- | Processing triggers for man-db (2.10.2-2) ... | ||
- | Processing triggers for initramfs-tools (0.140ubuntu17) ... | ||
- | update-initramfs: | ||
- | Errors were encountered while processing: | ||
- | | ||
- | | ||
- | |||
- | |||
- | </ | ||
- | |||
- | downgrading but to headless,\\ | ||
- | without touching the x config? | ||
- | |||
- | |||
- | going with [[https:// | ||
- | |||
- | [1] A 3D video game environment and benchmark designed from scratch for reinforcement learning research | ||
- | |||
- | <code bash> | ||
- | conda create -n avalon python=3.9 | ||
- | conda activate avalon | ||
- | |||
- | sudo apt install --no-install-recommends libegl-dev libglew-dev libglfw3-dev libnvidia-gl libopengl-dev libosmesa6 mesa-utils-extra | ||
- | |||
- | #this will also install torch... | ||
- | pip install avalon-rl[train] | ||
- | |||
- | python -m avalon.install_godot_binary | ||
- | python -m avalon.common.check_install | ||
- | </ | ||
- | |||
- | why even bother, the Quadro is just a test. \\ | ||
- | need to cleanly remove the 390 driver and move back to | ||
- | |||
- | NVIDIA-CURRENT | ||
- | |||
- | |||
- | ==== P40 ==== | ||
- | NVIDIA Tesla P40 24GB GDDR5 GPU Accelerator Card Dual PCI-E 3.0 x16\\ | ||
- | needs to be retrofitted with a fan, it doesn't come with one | ||
- | |||
- | got one on ebay for 200$(+shipping) ([[https:// | ||
- | |||
- | some dude got it working, https:// | ||
- | |||
- | {{ : | ||
- | installing | ||
- | <code bash> | ||
- | sudo apt install nvidia-headless-535 | ||
- | </ | ||
- | |||
- | there is some issue: unlike other cards, the blue LED doesn't turn green on Thunderbolt connection.\\ | ||
- | no power is passing to the GPU,\\ | ||
- | unlike with the other cards we tried (Quadro 2000 and 660Ti) | ||
- | |||
- | :( | ||
==== 1080Ti ==== | ==== 1080Ti ==== | ||
Line 223: | Line 104: | ||
$ sudo ubuntu-drivers autoinstall | $ sudo ubuntu-drivers autoinstall | ||
- | The following additional packages will be installed: | ||
- | libnvidia-common-535 libnvidia-compute-535: | ||
- | libnvidia-decode-535 libnvidia-decode-535: | ||
- | libnvidia-encode-535 libnvidia-encode-535: | ||
- | libnvidia-extra-535 libnvidia-fbc1-535 libnvidia-fbc1-535: | ||
- | libnvidia-gl-535 libnvidia-gl-535: | ||
- | nvidia-settings nvidia-utils-535 screen-resolution-extra | ||
- | xserver-xorg-video-nvidia-535 | ||
- | The following packages will be REMOVED: | ||
- | libnvidia-common-390 libnvidia-gl-390 | ||
- | The following NEW packages will be installed: | ||
- | libnvidia-common-535 libnvidia-compute-535: | ||
- | libnvidia-decode-535 libnvidia-decode-535: | ||
- | libnvidia-encode-535 libnvidia-encode-535: | ||
- | libnvidia-extra-535 libnvidia-fbc1-535 libnvidia-fbc1-535: | ||
- | libnvidia-gl-535 libnvidia-gl-535: | ||
- | nvidia-prime nvidia-settings nvidia-utils-535 | ||
- | screen-resolution-extra xserver-xorg-video-nvidia-535 | ||
</ | </ | ||
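Before running the autoinstall it can help to see which driver the tool would pick. The following is only a minimal sketch: it assumes `ubuntu-drivers devices` prints lines of the form `driver : <package> - ... recommended`, and `recommended_driver` is a hypothetical helper name, not part of any package.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ubuntu-drivers devices` output on stdin and
# print the package marked "recommended" (prints nothing if none is marked).
# Assumption: driver lines look like "driver : <package> - ... recommended".
recommended_driver() {
    awk '$1 == "driver" && /recommended/ { print $3; exit }'
}

# Usage (needs the ubuntu-drivers-common package):
#   ubuntu-drivers devices | recommended_driver
```

If the printed package is not the one you expect (for example a legacy branch for an old card), it is probably safer to install a specific version by hand than to rely on the autoinstall.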
+ | |||
+ | |||
+ | ==== P40 ==== | ||
+ | <WRAP center round important 60%> | ||
+ | this doesn't work on our test machine | ||
+ | </ | ||
+ | |||
+ | |||
+ | {{ : | ||
+ | |||
+ | the P40 needs a modern motherboard that allows for '' | ||
+ | |||
+ | NVIDIA Tesla P40 24GB GDDR5 GPU Accelerator Card Dual PCI-E 3.0 x16\\ | ||
+ | needs to be retrofitted with a fan, it doesn't come with one | ||
+ | |||
+ | got one on ebay for 200$(+shipping) ([[https:// | ||
+ | |||
+ | some dude got it working, https:// | ||
+ | === SPECIFICATIONS: | ||
+ | |||
+ | * GPU Architecture: | ||
+ | * Single-Precision Performance 12 TeraFLOPS* | ||
+ | * Integer Operations (INT8) 47 TOPS* (TeraOperations per Second) | ||
+ | * GPU Memory 24 GB | ||
+ | * Memory Bandwidth 346 GB/s | ||
+ | * System Interface PCI Express 3.0 x16 | ||
+ | * Form Factor 4.4” H x 10.5” L, Dual Slot, Full Height | ||
+ | * Max Power 250 W | ||
+ | * Enhanced Programmability with Page Migration Engine Yes | ||
+ | * ECC Protection Yes | ||
+ | * Server-Optimized for Data Center Deployment Yes | ||
+ | * Hardware-Accelerated Video Engine 1x Decode Engine, 2x Encode Engine | ||
+ | * NVPN: 699-2G610-0200-100 | ||
+ | * NVIDIA® CUDA® cores: 3840 | ||
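As a sanity check on the spec list, the 12 TeraFLOPS and 3840 CUDA core figures can be tied together with a bit of arithmetic. This is a rough sketch that assumes each core retires one fused multiply-add (2 FLOP) per cycle; `implied_clock_ghz` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Implied clock (GHz) = TFLOPS * 1000 / (cores * 2 FLOP per cycle).
# Assumption: one fused multiply-add (2 FLOP) per CUDA core per cycle.
implied_clock_ghz() {
    local tflops="$1" cores="$2"
    awk -v t="$tflops" -v c="$cores" 'BEGIN { printf "%.4f\n", t * 1000 / (c * 2) }'
}

implied_clock_ghz 12 3840   # figures taken from the spec list; prints 1.5625
```

So the quoted single-precision number corresponds to a clock of roughly 1.56 GHz, which suggests it is a peak (boost) figure rather than a sustained one.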
+ | |||
+ | |||
+ | installing | ||
+ | <code bash> | ||
+ | sudo apt install nvidia-headless-535 | ||
+ | </ | ||
+ | |||
+ | there is some issue: unlike other cards, the blue LED doesn't turn green on Thunderbolt connection.\\ | ||
+ | no power is passing to the GPU.\\ | ||
+ | |||
+ | :( | ||
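To tell "no power / not enumerated" apart from a driver problem, one option is to check whether the card shows up on the PCI bus at all. A minimal sketch, assuming compute-only cards such as the P40 are listed by `lspci` as "3D controller" rather than "VGA compatible controller"; `nvidia_gpu_listed` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if lspci-style text on stdin lists an
# NVIDIA VGA or 3D controller.
# Assumption: compute-only cards appear as "3D controller".
nvidia_gpu_listed() {
    grep -Eiq '(VGA compatible|3D) controller: NVIDIA'
}

# Usage:
#   lspci | nvidia_gpu_listed && echo "GPU enumerated" || echo "GPU not visible on the bus"
```

If the card is not visible here, installing or changing drivers is unlikely to help; the problem is more likely power, the enclosure, or firmware settings on the host.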
+ | |||
==== misc ==== | ==== misc ==== | ||