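Problem: previously, installing the GPU driver for Tesla-series or GeForce-series cards only required running the nvidiaxxxx.run driver package or installing the nvidiaxxxx.rpm package. With the newer NVIDIA Tesla A100, however, the driver did not seem to work after installation. After some digging through the documentation, it turned out that the following steps are needed before the GPUs can be used.
References: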
https://docs.nvidia.com/datacenter/tesla/pdf/fabric-manager-user-guide.pdf
https://docs.nvidia.com/datacenter/tesla/fabric-manager-user-guide/index.html#abstract
Driver download link: https://www.nvidia.cn/Download/index.aspx?lang=cn
Package used here (RHEL 7, driver 470.57.02): nvidia-driver-local-repo-rhel7-470.57.02-1.0-1.x86_64.rpm
rpm -ivh nvidia-driver-local-repo-rhel7-470.57.02-1.0-1.x86_64.rpm
yum clean all
yum install -y cuda-drivers
yum install -y cuda-drivers-fabricmanager libnvidia-nscq
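Fabric Manager is generally expected to be the same version as the driver, so it can be worth comparing the two before starting the service. The following is only a minimal sketch, not an official procedure. The helpers `same_version` and `extract_version` are made up for illustration, and the script assumes that `nvidia-smi` and `nv-fabricmanager` are on the PATH and that `nv-fabricmanager --version` prints a line containing the version number.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds only when both version strings
# are non-empty and identical.
same_version() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Hypothetical helper: pull the first dotted version number
# (e.g. 470.57.02) out of the text on stdin.
extract_version() {
    grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

if command -v nvidia-smi >/dev/null 2>&1 && command -v nv-fabricmanager >/dev/null 2>&1; then
    # Driver version as reported by the first GPU.
    driver_ver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    # Assumption: --version prints a line that contains the version number.
    fm_ver=$(nv-fabricmanager --version 2>/dev/null | extract_version)
    if same_version "$driver_ver" "$fm_ver"; then
        echo "OK: driver and Fabric Manager are both $driver_ver"
    else
        echo "Mismatch: driver=$driver_ver fabricmanager=$fm_ver"
    fi
else
    echo "nvidia-smi or nv-fabricmanager not found; skipping version check"
fi
```

If the script reports a mismatch, reinstalling both packages from the same local repository is the likely fix.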
Start the service:
systemctl enable nvidia-fabricmanager
systemctl start nvidia-fabricmanager
After this, NVSwitch and NVLink work normally.
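One way to confirm the result is sketched below, assuming the service name `nvidia-fabricmanager` used above. The `check` helper is hypothetical and only wraps a command to print a PASS/FAIL line. The NVLink status and topology output still have to be read by eye.

```shell
#!/usr/bin/env bash

# Hypothetical helper: run a command quietly and report PASS or FAIL
# under the given label. Returns 1 when the command fails.
check() {
    local label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS: $label"
    else
        echo "FAIL: $label"
        return 1
    fi
}

if command -v nvidia-smi >/dev/null 2>&1; then
    check "driver responds to nvidia-smi" nvidia-smi
    check "nvidia-fabricmanager service is active" \
        systemctl is-active --quiet nvidia-fabricmanager
    # Print per-GPU NVLink state and the GPU interconnect matrix
    # for manual inspection.
    nvidia-smi nvlink --status
    nvidia-smi topo -m
else
    echo "nvidia-smi not found; run this on the GPU host"
fi
```

If the service check fails, `systemctl status nvidia-fabricmanager` and `journalctl -u nvidia-fabricmanager` are reasonable places to look next.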