A concrete example: the new laptop arrived with an RTX 3050 Ti (4 GB VRAM) and otherwise fairly average specs. After installing the basic software, it was time to set up deep learning.
Problem description:
After some struggling I remembered how to install the GPU driver. According to the version table, TensorFlow 2.0 pairs with CUDA 10.0 and cuDNN 7.4. After the install succeeded, this is what it showed (note that CUDA is 10.0 while my driver supports up to CUDA 11.4; in principle the toolkit version just has to be no higher than what the driver allows):
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:04_Central_Daylight_Time_2018
Cuda compilation tools, release 10.0, V10.0.130
nvidia-smi
Sat Apr 16 13:06:29 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 472.47 Driver Version: 472.47 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... WDDM | 00000000:01:00.0 Off | N/A |
| N/A 41C P8 6W / N/A | 107MiB / 4096MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 12672 C+G ...mmandCenterBackground.exe N/A |
+-----------------------------------------------------------------------------+
After the installation, configuring tensorflow-gpu 2.0 in PyCharm produced an error saying the GPU driver was incompatible.
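A quick way to see whether this TF build can find the GPU at all is to list the physical devices from a Python console (a minimal sketch, assuming the tensorflow-gpu 2.0 setup described here):
import tensorflow as tf

# Prints an empty list when TF cannot use the GPU
# (wrong CUDA/cuDNN version or an incompatible driver).
print(tf.config.experimental.list_physical_devices('GPU'))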
Cause analysis:
The installed toolkit was CUDA 10, which is most likely too old: the RTX 30 series (Ampere, compute capability 8.6) is only supported from CUDA 11.0 onward, so CUDA 10 has no kernels for this card.
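One way to confirm the card's architecture from Python is TensorFlow's device_lib (a sketch; the same compute capability also shows up in TF's startup log when the GPU is detected):
from tensorflow.python.client import device_lib

# Each GPU entry's description ends with something like "compute capability: 8.6",
# the Ampere architecture that CUDA 10.x cannot generate kernels for.
for d in device_lib.list_local_devices():
    if d.device_type == 'GPU':
        print(d.physical_device_desc)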
Solution:
So I uninstalled CUDA again, installed CUDA 11.1, configured cuDNN 7.6.5, and the test passed. But when actually running a model it still could not use the GPU memory. Along the way I had glimpsed on Baidu that 30-series cards cannot be set up with CUDA versions below 11, but since I was set on running TF 2.0 at the time, I did not think much of it.
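For reference, the kind of quick check that can pass even though real training later fails is usually just running one small op on the GPU; a minimal sketch, assuming a TF 2.x build:
import tensorflow as tf

# Executing one op on the GPU exercises the CUDA/cuDNN kernels, but a convolution
# inside a full model can still fail even when this small matmul succeeds.
with tf.device('/GPU:0'):
    a = tf.random.normal([1024, 1024])
    b = tf.random.normal([1024, 1024])
    print(tf.matmul(a, b).shape)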
Problem description:
After more struggling I started downgrading TensorFlow, settling on tensorflow 1.14. It started out fine, and running the demos went okay. When reproducing a mask-detection demo, I found an open-source labeled dataset and open-source code and set about replicating it. After working through dataset and code compatibility issues, just as things were heading toward success, training told me it could not use the GPU:
Epoch 1/50
2022-04-16 13:17:51.918104: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] shape_optimizer failed: Invalid argument: Subshape must have computed start >= end since stride is negative, but is 0 and 2 (computed from start 0 and end 9223372036854775807 over shape with rank 2 and stride-1)
2022-04-16 13:17:51.959926: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] remapper failed: Invalid argument: Subshape must have computed start >= end since stride is negative, but is 0 and 2 (computed from start 0 and end 9223372036854775807 over shape with rank 2 and stride-1)
2022-04-16 13:17:52.102568: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] layout failed: Invalid argument: Subshape must have computed start >= end since stride is negative, but is 0 and 2 (computed from start 0 and end 9223372036854775807 over shape with rank 2 and stride-1)
2022-04-16 13:17:52.317837: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] shape_optimizer failed: Invalid argument: Subshape must have computed start >= end since stride is negative, but is 0 and 2 (computed from start 0 and end 9223372036854775807 over shape with rank 2 and stride-1)
2022-04-16 13:17:52.351374: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] remapper failed: Invalid argument: Subshape must have computed start >= end since stride is negative, but is 0 and 2 (computed from start 0 and end 9223372036854775807 over shape with rank 2 and stride-1)
2022-04-16 13:17:54.651630: W tensorflow/core/common_runtime/bfc_allocator.cc:237] Allocator (GPU_0_bfc) ran out of memory trying to allocate 831.81MiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2022-04-16 13:17:54.651962: W tensorflow/core/common_runtime/bfc_allocator.cc:237] Allocator (GPU_0_bfc) ran out of memory trying to allocate 831.81MiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2022-04-16 13:17:54.708460: W tensorflow/core/common_runtime/bfc_allocator.cc:237] Allocator (GPU_0_bfc) ran out of memory trying to allocate 760.50MiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2022-04-16 13:17:54.708782: W tensorflow/core/common_runtime/bfc_allocator.cc:237] Allocator (GPU_0_bfc) ran out of memory trying to allocate 760.50MiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2022-04-16 13:17:55.002098: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED
Traceback (most recent call last):
File "E:/depthLearning1.14/train.py", line 198, in <module>
_main()
File "E:/depthLearning1.14/train.py", line 73, in _main
callbacks=[logging, checkpoint])
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
(0) Internal: Blas SGEMM launch failed : m=692224, n=32, k=64
[[{{node conv2d_3/convolution}}]]
[[loss/add_74/_1451]]
(1) Internal: Blas SGEMM launch failed : m=692224, n=32, k=64
[[{{node conv2d_3/convolution}}]]
0 successful operations.
0 derived errors ignored.
Process finished with exit code 1
I figured I might as well try the CPU:
import os
import tensorflow as tf

# The machine only has GPU 0, so exposing only device '1' leaves TensorFlow with no
# visible GPU and it falls back to the CPU ('-1' would do the same explicitly).
os.environ["CUDA_VISIBLE_DEVICES"] = '1'

# GPU memory settings (only matter when a GPU is visible); add near the top of the script:
config = tf.ConfigProto()
config.gpu_options.allow_growth = True                    # allocate memory on demand
config.gpu_options.per_process_gpu_memory_fraction = 0.5  # cap at 50% of GPU memory
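For completeness, the ConfigProto only takes effect once it is attached to the session Keras uses; a minimal sketch, assuming TF 1.x with standalone Keras as in the traceback above:
import tensorflow as tf
import keras.backend as K

# Hand a configured session to Keras so the GPU options actually apply.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))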
And, what do you know, with the CPU it started training straight away:
Create YOLOv3 model with 9 anchors and 2 classes.
Load weights model_data/yolo_weights.h5.
Freeze the first 249 layers of total 252 layers.
WARNING:tensorflow:From E:\depthLearning1.14\venv\lib\site-packages\keras\backend\tensorflow_backend.py:3080: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From E:\depthLearning1.14\venv\lib\site-packages\keras\optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.
Train on 839 samples, val on 93 samples, with batch size 16.
WARNING:tensorflow:From E:\depthLearning1.14\venv\lib\site-packages\keras\callbacks.py:850: The name tf.summary.merge_all is deprecated. Please use tf.compat.v1.summary.merge_all instead.
WARNING:tensorflow:From E:\depthLearning1.14\venv\lib\site-packages\keras\callbacks.py:853: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.
Epoch 1/50
2022-04-16 13:16:50.892058: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] shape_optimizer failed: Invalid argument: Subshape must have computed start >= end since stride is negative, but is 0 and 2 (computed from start 0 and end 9223372036854775807 over shape with rank 2 and stride-1)
2022-04-16 13:16:50.931636: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] remapper failed: Invalid argument: Subshape must have computed start >= end since stride is negative, but is 0 and 2 (computed from start 0 and end 9223372036854775807 over shape with rank 2 and stride-1)
2022-04-16 13:16:51.290190: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] shape_optimizer failed: Invalid argument: Subshape must have computed start >= end since stride is negative, but is 0 and 2 (computed from start 0 and end 9223372036854775807 over shape with rank 2 and stride-1)
2022-04-16 13:16:51.325787: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] remapper failed: Invalid argument: Subshape must have computed start >= end since stride is negative, but is 0 and 2 (computed from start 0 and end 9223372036854775807 over shape with rank 2 and stride-1)
1/52 [..............................] - ETA: 5:17 - loss: 8804.4365
2/52 [>.............................] - ETA: 4:32 - loss: 8518.9624
Solution:
Either go straight to the latest TensorFlow release, or set up a dual boot or a virtual machine and move to Ubuntu.
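If going the latest-TensorFlow route (a sketch, assuming a recent TF 2.x release built against CUDA 11.x), it also helps on a 4 GB card to enable memory growth so TF does not reserve all VRAM up front:
import tensorflow as tf

# List detected GPUs and let TF allocate VRAM on demand instead of grabbing it all.
gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
print(gpus)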
A comment from an experienced user:
As things stand, the Windows platform is not supported; these cards only work with CUDA 11.0 and above. Installing a lower TF version will succeed, but training a network throws errors. I tried many times and it does not work. There are tutorials online for installing tf1.15 on 30-series cards under Linux, which I have not tried. Since all my code is written for 1.x, the only option is an older GPU.
I only saw this comment after trying everything myself. I was sick about it; all I can do now is go try a virtual machine.