We are using the TH3P4G3 external Thunderbolt eGPU dock.
Linux kernel notes > https://docs.kernel.org/admin-guide/thunderbolt.html
TLDR
The authorized attribute reads 0 which means no PCIe tunnels are created yet. The user can authorize the device by simply entering:
# echo 1 > /sys/bus/thunderbolt/devices/0-1/authorized
This will create the PCIe tunnels and the device is now connected.
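Before writing to the attribute it helps to see which devices are still unauthorized. A minimal sketch, assuming the standard sysfs layout from the kernel notes; the function name `tb_list_authorized` and its optional directory argument are made up here for illustration:

```shell
# Print every Thunderbolt device together with its "authorized" value.
# $1 is an optional directory override (hypothetical, handy for dry runs);
# the default is the real sysfs location.
tb_list_authorized() (
    root="${1:-/sys/bus/thunderbolt/devices}"
    for dev in "$root"/*; do
        # Skip entries that expose no "authorized" attribute.
        [ -f "$dev/authorized" ] || continue
        printf '%s authorized=%s\n' "$(basename "$dev")" "$(cat "$dev/authorized")"
    done
)
```

Calling `tb_list_authorized` with no argument reads the live sysfs tree; a device reporting `authorized=0` is the one to authorize with the `echo 1` command above.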
Install a newer kernel from mainline (the Ubuntu dist-upgrade kernel is too conservative, 5.19):
cd /tmp
rm -i *deb
wget -c https://kernel.ubuntu.com/~kernel-ppa/mainline/v6.3.7/amd64/linux-headers-6.3.7-060307-generic_6.3.7-060307.202306090936_amd64.deb
wget -c https://kernel.ubuntu.com/~kernel-ppa/mainline/v6.3.7/amd64/linux-headers-6.3.7-060307_6.3.7-060307.202306090936_all.deb
wget -c https://kernel.ubuntu.com/~kernel-ppa/mainline/v6.3.7/amd64/linux-image-unsigned-6.3.7-060307-generic_6.3.7-060307.202306090936_amd64.deb
wget -c https://kernel.ubuntu.com/~kernel-ppa/mainline/v6.3.7/amd64/linux-modules-6.3.7-060307-generic_6.3.7-060307.202306090936_amd64.deb
sudo dpkg -i *.deb
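The four download URLs differ only in the package name, so they can be generated from the version parts. A rough sketch, assuming the file naming seen in the 6.3.7 links stays the same for other mainline builds (not verified here); `mainline_urls` is a hypothetical helper name:

```shell
# Print the four mainline kernel package URLs for one build.
# Arguments: version, ABI tag, build stamp (all copied from the file names).
mainline_urls() (
    ver="$1"      # e.g. 6.3.7
    abi="$2"      # e.g. 060307
    stamp="$3"    # e.g. 202306090936
    base="https://kernel.ubuntu.com/~kernel-ppa/mainline/v${ver}/amd64"
    full="${ver}-${abi}"
    printf '%s\n' \
        "${base}/linux-headers-${full}-generic_${full}.${stamp}_amd64.deb" \
        "${base}/linux-headers-${full}_${full}.${stamp}_all.deb" \
        "${base}/linux-image-unsigned-${full}-generic_${full}.${stamp}_amd64.deb" \
        "${base}/linux-modules-${full}-generic_${full}.${stamp}_amd64.deb"
)
```

For example, `mainline_urls 6.3.7 060307 202306090936 | xargs -n 1 wget -c` fetches the same four files as the commands above.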
Hmm, the eGPU needs to be connected before boot.
now permissions
$ sudo dmesg
dprobe" pid=563 comm="apparmor_parser"
[ 7.888207] audit: type=1400 audit(1686781044.331:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=563 comm="apparmor_parser"
authorized the tamala!
(base) user@eight:~$ echo 1 | sudo tee /sys/bus/thunderbolt/devices/0-1/authorized
1
(base) user@eight:~$ sudo ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:1c.4/0000:04:00.0/0000:05:01.0/0000:07:00.0/0000:08:01.0/0000:09:00.0 ==
modalias : pci:v000010DEd00000DD8sv000010DEsd0000084Abc03sc00i00
vendor : NVIDIA Corporation
model : GF106GL [Quadro 2000]
manual_install: True
driver : nvidia-driver-390 - distro non-free recommended
driver : xserver-xorg-video-nouveau - distro free builtin
just an old card…
but EEK
(base) user@eight:~$ lspci | tail
08:04.0 PCI bridge: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge DD 2018] (rev 06)
09:00.0 VGA compatible controller: NVIDIA Corporation GF106GL [Quadro 2000] (rev a1)
09:00.1 Audio device: NVIDIA Corporation GF106 High Definition Audio Controller (rev a1)
$ sudo dmesg
[ 1041.053826] nvidia: module license 'NVIDIA' taints kernel.
[ 1041.053831] Disabling lock debugging due to kernel taint
[ 1041.484017] nvidia-nvlink: Nvlink Core is being initialized, major device number 509
[ 1041.484032] NVRM: The NVIDIA Quadro 2000 GPU installed in this system is
NVRM: supported through the NVIDIA 390.xx Legacy drivers. Please
NVRM: visit http://www.nvidia.com/object/unix.html for more
NVRM: information. The 535.43.02 NVIDIA driver will ignore
NVRM: this GPU. Continuing probe...
[ 1041.501047] NVRM: No NVIDIA GPU found.
[ 1041.521176] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 1042.332830] nvidia-nvlink: Nvlink Core is being initialized, major device number 509
[ 1042.332842] NVRM: The NVIDIA Quadro 2000 GPU installed in this system is
NVRM: supported through the NVIDIA 390.xx Legacy drivers. Please
NVRM: visit http://www.nvidia.com/object/unix.html for more
NVRM: information. The 535.43.02 NVIDIA driver will ignore
NVRM: this GPU. Continuing probe...
[ 1042.335282] NVRM: No NVIDIA GPU found.
[ 1042.335835] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
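The NVRM message names the legacy branch the card needs, so it can be pulled out of the kernel log instead of read by eye. A small sketch, assuming the message wording matches the log above; `nvrm_legacy_branch` is a made-up helper name:

```shell
# Read kernel log text on stdin and print the legacy driver branch
# (for example 390) that NVRM says the GPU needs.
# Prints nothing when no such message is present.
nvrm_legacy_branch() {
    sed -n 's/.*supported through the NVIDIA \([0-9][0-9]*\)\.xx Legacy drivers.*/\1/p' |
        sort -u
}
```

Usage would be `sudo dmesg | nvrm_legacy_branch`, which for the log above should print the branch named in the message.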
WE ARE TAINTED
we went with ubuntu selection
but cute https://www.nvidia.com/en-us/drivers/unix/
$ sudo apt install nvidia-headless-535
# downgrade nvidia to quadro supported version
sudo apt install nvidia-headless-390
# EEK
ERROR (dkms apport): kernel package linux-headers-6.3.7-060307-generic is not supported
Error! Bad return status for module build on kernel: 6.3.7-060307-generic (x86_64)
Consult /var/lib/dkms/nvidia/390.157/build/make.log for more information.
dpkg: error processing package nvidia-dkms-390 (--configure):
 installed nvidia-dkms-390 package post-installation script subprocess returned error exit status 10
dpkg: dependency problems prevent configuration of nvidia-headless-390:
 nvidia-headless-390 depends on nvidia-dkms-390; however:
  Package nvidia-dkms-390 is not configured yet.
dpkg: error processing package nvidia-headless-390 (--configure):
 dependency problems - leaving unconfigured
Processing triggers for libc-bin (2.36-0ubuntu4) ...
No apport report written because the error message indicates its a followup error from a previous failure.
/sbin/ldconfig.real: /lib/lib ndi.so.4 is not a symbolic link
Processing triggers for man-db (2.10.2-2) ...
Processing triggers for initramfs-tools (0.140ubuntu17) ...
update-initramfs: Generating /boot/initrd.img-6.3.7-060307-generic
Errors were encountered while processing:
 nvidia-dkms-390
 nvidia-headless-390
Downgrading, but to the headless driver, without touching the X config?
going with avalon readme
[1] A 3D video game environment and benchmark designed from scratch for reinforcement learning research
conda create -n avalon python=3.9
conda activate avalon
sudo apt install --no-install-recommends libegl-dev libglew-dev libglfw3-dev libnvidia-gl libopengl-dev libosmesa6 mesa-utils-extra
# this will also install torch...
pip install avalon-rl[train]
python -m avalon.install_godot_binary
python -m avalon.common.check_install
Why even bother, the Quadro is just a test.
Need to cleanly remove the 390 driver and move back to NVIDIA-CURRENT.
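To see what a clean removal of the 390 branch would touch, the installed package names can be filtered by branch first. A sketch only: the name pattern is an assumption based on the package names that show up in these notes, and `nvidia_branch_packages` is a hypothetical helper, so review the list before purging anything:

```shell
# Filter package names (one per line on stdin) down to those that
# belong to one NVIDIA driver branch, for example 390.
# The pattern is a guess from names like libnvidia-gl-390 and nvidia-dkms-390.
nvidia_branch_packages() (
    branch="$1"
    grep -E "^(lib)?nvidia-.*-${branch}(:[a-z0-9]+)?\$|^xserver-xorg-video-nvidia-${branch}\$" || true
)
```

One way to use it: `dpkg-query -W -f='${Package}\n' | nvidia_branch_packages 390` prints the candidates, which can then be passed to `sudo apt purge` once the list looks right.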
Unlike other cards, the blue LED doesn't turn green on Thunderbolt connection.
NVIDIA Tesla P40 24GB GDDR5 GPU Accelerator Card Dual PCI-E 3.0 x16
Need to retrofit it with a fan; it doesn't come with one.
Got one on eBay for $200 (+ shipping) (eBay mirror)
some dude got it working, https://github.com/JingShing/How-to-use-tesla-p40
installing
sudo apt install nvidia-headless-535
There is some issue: unlike other cards, the blue LED doesn't turn green on Thunderbolt connection.
No power is passing to the GPU.
:(
$ sudo dmesg -w
[96236.873213] nvidia-nvlink: Nvlink Core is being initialized, major device number 509
[96236.874544] nvidia 0000:09:00.0: enabling device (0006 -> 0007)
[96236.874646] nvidia 0000:09:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[96236.991272] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 535.43.02 Mon May 22 20:46:13 UTC 2023
[96237.009537] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 535.43.02 Mon May 22 20:25:24 UTC 2023
[96237.013346] [drm] [nvidia-drm] [GPU ID 0x00000900] Loading driver
[96238.239429] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:09:00.0 on minor 1
[96238.269008] nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
[96238.330257] nvidia-uvm: Loaded the UVM driver, major device number 507.
[96238.399348] NVRM: API mismatch: the client has the version 390.157, but
NVRM: this kernel module has the version 535.43.02. Please
NVRM: make sure that this kernel module and all NVIDIA driver
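The API mismatch line says the userspace side (the "client") is still at 390.157 while the loaded kernel module is 535.43.02. A rough sketch for pulling both versions out of the log, assuming the NVRM wording shown above and a grep that supports `-o` (GNU grep does); `nvrm_api_mismatch` is a made-up name:

```shell
# Read kernel log text on stdin and print the two versions from the
# NVRM "API mismatch" message as client=... and module=... lines.
nvrm_api_mismatch() {
    grep -oE '(client|kernel module) has the version [0-9]+(\.[0-9]+)+' |
        awk '{ who = ($1 == "client") ? "client" : "module"; print who "=" $NF }' |
        sort -u
}
```

Running `sudo dmesg | nvrm_api_mismatch` after a failed start shows at a glance which side is stale; when the two values differ, leftover userspace packages from the old branch are the first thing to check.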
update the driver to fit
$ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:1c.4/0000:04:00.0/0000:05:01.0/0000:07:00.0/0000:08:01.0/0000:09:00.0 ==
modalias : pci:v000010DEd00001B06sv00001458sd0000377Abc03sc00i00
vendor : NVIDIA Corporation
model : GP102 [GeForce GTX 1080 Ti]
manual_install: True
driver : nvidia-driver-450-server - distro non-free
driver : nvidia-driver-510 - distro non-free
driver : nvidia-driver-390 - distro non-free
driver : nvidia-driver-470 - distro non-free
driver : nvidia-driver-525-server - distro non-free
driver : nvidia-driver-525 - distro non-free
driver : nvidia-driver-535 - third-party non-free recommended
driver : nvidia-driver-515 - distro non-free
driver : nvidia-driver-515-server - distro non-free
driver : nvidia-driver-530 - distro non-free
driver : nvidia-driver-470-server - distro non-free
driver : xserver-xorg-video-nouveau - distro free builtin
$ sudo ubuntu-drivers autoinstall
The following additional packages will be installed:
  libnvidia-common-535 libnvidia-compute-535:i386 libnvidia-decode-535 libnvidia-decode-535:i386 libnvidia-encode-535 libnvidia-encode-535:i386 libnvidia-extra-535 libnvidia-fbc1-535 libnvidia-fbc1-535:i386 libnvidia-gl-535 libnvidia-gl-535:i386 nvidia-prime nvidia-settings nvidia-utils-535 screen-resolution-extra xserver-xorg-video-nvidia-535
The following packages will be REMOVED:
  libnvidia-common-390 libnvidia-gl-390
The following NEW packages will be installed:
  libnvidia-common-535 libnvidia-compute-535:i386 libnvidia-decode-535 libnvidia-decode-535:i386 libnvidia-encode-535 libnvidia-encode-535:i386 libnvidia-extra-535 libnvidia-fbc1-535 libnvidia-fbc1-535:i386 libnvidia-gl-535 libnvidia-gl-535:i386 nvidia-driver-535 nvidia-prime nvidia-settings nvidia-utils-535 screen-resolution-extra xserver-xorg-video-nvidia-535
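With this many candidates it is easy to miss which driver carries the "recommended" tag. A minimal sketch that picks it out of the `ubuntu-drivers devices` output, assuming the line layout shown above; `recommended_driver` is a hypothetical helper name:

```shell
# Read "ubuntu-drivers devices" output on stdin and print the package
# name on the driver line that ends with "recommended".
recommended_driver() {
    awk '$1 == "driver" && $NF == "recommended" { print $3 }'
}
```

Usage: `ubuntu-drivers devices | recommended_driver`, handy for checking what `autoinstall` is likely to pick before running it.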
lspci -v | grep -A 2 -E "(VGA comp|3D)"
00:02.0 VGA compatible controller: Intel Corporation Iris Pro Graphics 580 (rev 09) (prog-if 00 [VGA controller])
	DeviceName: CPU
	Subsystem: Intel Corporation Iris Pro Graphics 580
--
09:00.0 VGA compatible controller: NVIDIA Corporation GF106GL [Quadro 2000] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: NVIDIA Corporation GF106GL [Quadro 2000]
	Flags: bus master, fast devsel, latency 0
Power comes from the 12 V DC plug (150 W?)
https://www.reddit.com/r/eGPU/comments/ukqto9/comment/ige1rwv