Hello,
I am using a Jetson Orin Nano Developer Kit (8GB) with the following libraries:
CUDA:12.2.140
cuDNN: 1.0
TensorRT:8.6.2.3
VPI: 3.1.5
vulkan: 1.3.204
OpenCV: 4.8.0 with CUDA:NO
running on:
Python: 3.10.12
Jetpack: 6.0
L4T: 36.3.0
Ubuntu: 22.04
aarch64
I am trying to install PyTorch 2.3.0 (compatible with JetPack 6.0 and CUDA 12.2) using the following commands from PyTorch for Jetson:
$ wget https://nvidia.box.com/shared/static/p57jwntv436lfrd78inwl7iml6p13fzh.whl -O torch-2.3.0-cp310-cp310-linux_aarch64.whl
$ sudo apt-get install python3-pip libopenblas-base libopenmpi-dev libomp-dev
$ pip3 install 'Cython<3'
$ pip3 install numpy torch-2.3.0-cp310-cp310-linux_aarch64.whl
After running these commands, the output is:
Successfully installed torch-1.8.0
(flresearch@flresearch-desktop:~/FLresearch/lib64/python3.10/site-packages$ pip3 install ./torch-2.3.0-cp310-cp310-linux_aarch64.whl --force-reinstall --no-deps
Defaulting to user installation because normal site-packages is not writeable
Processing ./torch-2.3.0-cp310-cp310-linux_aarch64.whl
Installing collected packages: torch
Attempting uninstall: torch
Found existing installation: torch 1.8.0
Uninstalling torch-1.8.0:
Successfully uninstalled torch-1.8.0
Successfully installed torch-1.8.0
Successfully installed torch-1.8.0
)
This causes a conflict with OpenMPI when I try to run the YOLOv5 training script:
flresearch@flresearch-desktop:~/yolov5$ python3 train.py --img 640 --batch 4 --epochs 2 --data data/VOS2028.yaml --weights '' --cfg models/yolov5s.yaml --device 0
Traceback (most recent call last):
  File "/home/flresearch/yolov5/train.py", line 34, in <module>
    import torch
  File "/home/flresearch/.local/lib/python3.10/site-packages/torch/__init__.py", line 195, in <module>
    _load_global_deps()
  File "/home/flresearch/.local/lib/python3.10/site-packages/torch/__init__.py", line 148, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libmpi_cxx.so.20: cannot open shared object file: No such file or directory
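The traceback shows torch being imported from the user site-packages under ~/.local, so it can help to check which torch distribution the interpreter actually sees and where it lives. The following is only a minimal sketch using the standard library's importlib.metadata; the helper name `describe_install` is made up for illustration.

```python
from importlib import metadata


def describe_install(package):
    """Return (version, install directory) for `package`, or None if absent."""
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return None
    # locate_file("") resolves to the directory that holds the package files,
    # e.g. a user site-packages or a virtual environment's site-packages.
    return dist.version, str(dist.locate_file(""))


if __name__ == "__main__":
    # Reports the recorded version without importing torch itself,
    # so it works even when "import torch" fails.
    print(describe_install("torch"))
```

Because this reads only the installed metadata, it does not trigger the failing shared-library load.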
then I tried:
$ unzip -l torch-2.3.0-cp310-cp310-linux_aarch64.whl | grep dist-info
and the output was:
torch-1.8.0.dist-info/METADATA
... and so on
This is not expected. I think the wheel named torch-2.3.0 actually contains PyTorch 1.8.0 internally.
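One way to confirm this without relying on the file name is to read the version recorded inside the wheel itself. The following is a minimal sketch, assuming the wheel follows the standard layout with a `*.dist-info/METADATA` file; the helper name `wheel_metadata_version` is hypothetical.

```python
import zipfile
from email.parser import Parser


def wheel_metadata_version(wheel_path):
    """Return (name, version) as recorded in the wheel's dist-info/METADATA."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # A wheel is a zip archive; METADATA sits in <name>-<version>.dist-info/
        metadata_files = [
            name for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA")
        ]
        if not metadata_files:
            raise ValueError(f"no dist-info/METADATA found in {wheel_path}")
        text = wheel.read(metadata_files[0]).decode("utf-8")
    # METADATA uses RFC 822 style headers, so the email parser can read it.
    headers = Parser().parsestr(text, headersonly=True)
    return headers["Name"], headers["Version"]


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        print(wheel_metadata_version(sys.argv[1]))
```

Passing the downloaded .whl file as the first argument prints the name and version the wheel was actually built as, regardless of what the file was renamed to by `wget -O`.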
Then I tried downloading and installing these other PyTorch versions (which I think are not compatible with JetPack 6.0 and CUDA 12.2):
- PyTorch v2.3.0
  - JetPack 6.0 (L4T R36.2 / R36.3) + CUDA 12.4
    - torch 2.3 - torch-2.3.0-cp310-cp310-linux_aarch64.whl
    - torchaudio 2.3 - torchaudio-2.3.0+952ea74-cp310-cp310-linux_aarch64.whl
    - torchvision 0.18 - torchvision-0.18.0a0+6043bc2-cp310-cp310-linux_aarch64.whl
- PyTorch v2.2.0
  - JetPack 6.0 DP (L4T R36.2.0)
- PyTorch v2.1.0
  - JetPack 6.0 DP (L4T R36.2.0)
    - Python 3.10 - torch-2.1.0-cp310-cp310-linux_aarch64.whl (USE_DISTRIBUTED=on)
    - Python 3.10 -
These again installed torch-1.8.0, and I still get:
OSError: libmpi_cxx.so.20: cannot open shared object file: No such file or directory
As I understand it, this error means that Python (via PyTorch) is trying to load the shared library libmpi_cxx.so.20, but the version on my system is libmpi_cxx.so.40.
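This mismatch can be checked directly with ctypes, the same mechanism that fails in the traceback. The snippet below is a rough sketch rather than a definitive diagnostic; the helper name `can_load` is made up, and the two sonames are simply the ones mentioned above.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open `soname`, else False."""
    try:
        # ctypes.CDLL asks the system loader (dlopen on Linux) for the library
        # and raises OSError when it cannot be found or opened.
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    for soname in ("libmpi_cxx.so.20", "libmpi_cxx.so.40"):
        status = "loadable" if can_load(soname) else "NOT found"
        print(f"{soname}: {status}")
```

If only the .so.40 variant is reported as loadable, that would be consistent with the installed torch build having been linked against an older OpenMPI than the one shipped with Ubuntu 22.04.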
I am not sure how I should proceed with this. I hope I will get some insights from the Forum.
Thank you in advance.
-Monika