- Environment
- Error message
- Solution
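Compiling the mmcv source code with CUDA inside a Docker container fails with:
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
Many write-ups about this error do not explain it clearly, so this post walks through the setup, the error, and a fix that worked.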
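Environment
The container is started with docker-compose up, using the following deploy configuration:
    deploy:
      mode: replicated
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [ gpu ]
              count: all
          memory: 8g
This attaches the GPUs to the container, and they can be used normally inside it: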
nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03 Driver Version: 510.47.03 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:17:00.0 Off | N/A |
| 0% 43C P8 11W / 275W | 6MiB / 11264MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... Off | 00000000:65:00.0 Off | N/A |
| 0% 43C P8 13W / 275W | 101MiB / 11264MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1003 G /usr/lib/xorg/Xorg 4MiB |
| 1 N/A N/A 1003 G /usr/lib/xorg/Xorg 39MiB |
| 1 N/A N/A 1516 G /usr/bin/gnome-shell 58MiB |
+-----------------------------------------------------------------------------+
Error message
After downloading mmcv, run the following from the mmcv directory
cd mmcv
MMCV_WITH_OPS=1 pip install -e .
to install mmcv-full. The install fails with:
root@3be4a3494460:/home/MMCV_Frame/mmcv# MMCV_WITH_OPS=1 pip install -e .
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Obtaining file:///home/MMCV_Frame/mmcv
ERROR: Command errored out with exit status 1:
command: /opt/conda/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/home/MMCV_Frame/mmcv/setup.py'"'"'; __file__='"'"'/home/MMCV_Frame/mmcv/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-jl7t6bh1
cwd: /home/MMCV_Frame/mmcv/
Complete output (13 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/MMCV_Frame/mmcv/setup.py", line 422, in <module>
ext_modules=get_extensions(),
File "/home/MMCV_Frame/mmcv/setup.py", line 331, in get_extensions
extra_compile_args=extra_compile_args)
File "/opt/conda/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 932, in CUDAExtension
library_dirs += library_paths(cuda=True)
File "/opt/conda/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1040, in library_paths
if (not os.path.exists(_join_cuda_home(lib_dir)) and
File "/opt/conda/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 2058, in _join_cuda_home
raise EnvironmentError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
----------------------------------------
WARNING: Discarding file:///home/MMCV_Frame/mmcv. Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
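Solution
I tried many suggestions from other people that did not help, so I will not list them. Here is a fix that occurred to me and that did work.
The GPUs being visible does not mean the CUDA toolkit is available in the container: nvidia-smi comes with the driver, while compiling the ops needs nvcc and the rest of the CUDA install root, which is what CUDA_HOME has to point to.
So mount the host's CUDA directory into the container at run time: /usr/local/cuda:/usr/local/cuda:ro
The path differs from machine to machine. On the host, run which nvcc; here it returns /usr/local/cuda/bin/nvcc,
which shows where CUDA is installed. Mount that CUDA directory into the container and that is all.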
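As a small sketch of the lookup described above, the CUDA root can be derived from the nvcc path by stripping the trailing bin/nvcc. The function name cuda_root_from_nvcc is made up for this illustration, and the snippet assumes it is run on the host, where the CUDA toolkit is installed.

```shell
# Hypothetical helper: turn an nvcc path such as /usr/local/cuda/bin/nvcc
# into the CUDA install root (/usr/local/cuda) by dropping "bin/nvcc".
cuda_root_from_nvcc() {
    dirname "$(dirname "$1")"
}

# Look nvcc up on PATH and print the directory to mount into the container.
if command -v nvcc >/dev/null 2>&1; then
    cuda_root_from_nvcc "$(command -v nvcc)"
else
    echo "nvcc is not on PATH; locate the CUDA install directory manually" >&2
fi
```

The printed directory is the one to use on both sides of the mount, for example /usr/local/cuda:/usr/local/cuda:ro.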
Then, inside the container, set CUDA_HOME and run the install again:
export CUDA_HOME='/usr/local/cuda'
cd mmcv
MMCV_WITH_OPS=1 pip install -e .
After roughly ten minutes, the compilation finished.
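If the build still reports the CUDA_HOME error, a quick check inside the container can show whether the mount and the variable actually line up. This is only a sketch: check_cuda_home is a hypothetical helper, and it tests nothing more than whether an executable nvcc exists under the given directory.

```shell
# Hypothetical helper: succeed only if the directory looks like a CUDA
# install root, i.e. it contains an executable bin/nvcc.
check_cuda_home() {
    if [ -z "$1" ]; then
        echo "CUDA_HOME is empty" >&2
        return 1
    fi
    if [ ! -x "$1/bin/nvcc" ]; then
        echo "no executable nvcc under $1/bin" >&2
        return 1
    fi
    echo "CUDA_HOME looks usable: $1"
}

# Check the current environment; print a hint instead of failing hard.
check_cuda_home "${CUDA_HOME:-}" || echo "fix the mount or CUDA_HOME before building"
```

Run it in the same shell session as the pip install, since export only affects that session.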
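Also, the compilation really is slow. I spent a long time looking for a way to compile in parallel and nothing had any effect; apparently nobody cares about parallel builds through pip…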
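One thing that may be worth checking, and that was not tested in this post: PyTorch's BuildExtension compiles in parallel when ninja is installed, and it reads the MAX_JOBS environment variable to set the number of workers. The sketch below assumes mmcv's setup.py builds through BuildExtension, which the traceback above suggests but does not prove for this version.

```shell
# Check whether ninja is available; without it, PyTorch extension builds
# fall back to a slower, non-ninja code path.
if command -v ninja >/dev/null 2>&1; then
    echo "ninja found: $(ninja --version)"
else
    echo "ninja not found; it can be installed with: pip install ninja"
fi

# Use one build worker per CPU core (an arbitrary choice for this sketch).
MAX_JOBS="$(nproc)"
export MAX_JOBS
echo "MAX_JOBS=${MAX_JOBS}"

# Then rebuild as before (left commented out so the snippet has no side effects):
# MMCV_WITH_OPS=1 pip install -e .
```

Whether this shortens the ten-minute build depends on the machine and on the mmcv version, so treat it as something to try rather than a guaranteed fix.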
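If you do not need to modify the mmcv or mmdet framework itself, it is better to follow the official site and install a precompiled version directly with pip, which is much faster.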
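For reference, the prebuilt mmcv-full wheels are published per CUDA and PyTorch version under download.openmmlab.com. The sketch below only assembles the install command; mmcv_index_url is a hypothetical helper, and cu113 and torch1.10.0 are example tags, not the versions used in this post.

```shell
# Hypothetical helper: build the find-links URL for prebuilt mmcv-full wheels.
#   $1: CUDA tag, e.g. cu113
#   $2: PyTorch tag, e.g. torch1.10.0
mmcv_index_url() {
    echo "https://download.openmmlab.com/mmcv/dist/$1/$2/index.html"
}

# Example values only; use the tags that match the PyTorch build in your image.
url="$(mmcv_index_url cu113 torch1.10.0)"
echo "pip install mmcv-full -f ${url}"
```

The snippet prints the command instead of running it, so the tags can be checked against the installed PyTorch first.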