[Deep Learning] [TensorFlow Error] [cuDNN] Failed to get convolution algorithm

TensorFlow Program Error

  • The Error
  • Solution
      • In Detail
        • `tf.ConfigProto`
        • `keras.backend`
  • Further Study
  • References

The Error

Traceback (most recent call last):
  File "E:/Pycharm Project/ImgSegBase/UNet_Training.py", line 127, in <module>
    callbacks=[checkpoint_period, reduce_lr])
  File "C:\Users\1\Anaconda3\envs\TensorField\lib\site-packages\tensorflow\python\util\deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "C:\Users\1\Anaconda3\envs\TensorField\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1829, in fit_generator
    initial_epoch=initial_epoch)
  File "C:\Users\1\Anaconda3\envs\TensorField\lib\site-packages\tensorflow\python\keras\engine\training.py", line 108, in _method_wrapper
    return method(self, *args, **kwargs)
  File "C:\Users\1\Anaconda3\envs\TensorField\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1098, in fit
    tmp_logs = train_function(iterator)
  File "C:\Users\1\Anaconda3\envs\TensorField\lib\site-packages\tensorflow\python\eager\def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)
  File "C:\Users\1\Anaconda3\envs\TensorField\lib\site-packages\tensorflow\python\eager\def_function.py", line 840, in _call
    return self._stateless_fn(*args, **kwds)
  File "C:\Users\1\Anaconda3\envs\TensorField\lib\site-packages\tensorflow\python\eager\function.py", line 2829, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "C:\Users\1\Anaconda3\envs\TensorField\lib\site-packages\tensorflow\python\eager\function.py", line 1848, in _filtered_call
    cancellation_manager=cancellation_manager)
  File "C:\Users\1\Anaconda3\envs\TensorField\lib\site-packages\tensorflow\python\eager\function.py", line 1924, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "C:\Users\1\Anaconda3\envs\TensorField\lib\site-packages\tensorflow\python\eager\function.py", line 550, in call
    ctx=ctx)
  File "C:\Users\1\Anaconda3\envs\TensorField\lib\site-packages\tensorflow\python\eager\execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node functional_1/conv1/Conv2D (defined at E:/Pycharm Project/ImgSegBase/UNet_Training.py:127) ]] [Op:__inference_train_function_7730]

Function call stack:
train_function

2021-02-12 13:48:05.362343: W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated.
	 [[{{node PyFunc}}]]


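Solution

A common cause of this error is that cuDNN cannot get the GPU memory it needs when it initializes, for example because TensorFlow or another process has already claimed nearly all of it. Capping the memory this process may take, before any model is built, often resolves it. If it does not, a mismatch between the installed CUDA/cuDNN versions and the TensorFlow build is the other usual suspect.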

import tensorflow as tf
config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
config.gpu_options.per_process_gpu_memory_fraction = 0.2
tf.compat.v1.keras.backend.set_session(tf.compat.v1.Session(config=config))

In Detail

tf.ConfigProto

import tensorflow as tf

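# In TensorFlow 2.x the TF 1.x ConfigProto lives under tf.compat.v1;
# plain tf.ConfigProto only exists in TensorFlow 1.x.
Session_Configuration = tf.compat.v1.ConfigProto(
    log_device_placement=True,
    inter_op_parallelism_threads=0,
    intra_op_parallelism_threads=0,
    allow_soft_placement=True,
)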

tf.ConfigProto configures how a tf.Session executes, for example whether operations run on the GPU or on the CPU. The solution code uses tf.compat.v1.ConfigProto because TensorFlow 2.x provides the TensorFlow 1.x API through the tf.compat.v1 compatibility module. The parameters of tf.ConfigProto are as follows:

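  • log_device_placement=True
    When True, TensorFlow logs which device (CPU or GPU) each operation is placed on.
  • inter_op_parallelism_threads=0
    Number of threads used to run independent operations in parallel. A value of 0 lets TensorFlow pick a suitable number.
  • intra_op_parallelism_threads=0
    Number of threads used to parallelize the work inside a single operation, such as a matrix multiplication. A value of 0 lets TensorFlow pick a suitable number.
  • allow_soft_placement=True
    Machines differ in which GPUs and CPUs they have. When True, an operation whose requested device is missing or unsupported is automatically placed on an available GPU or CPU instead of raising an error.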
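The four parameters above belong to the TF 1.x session API. For code that runs without a session, a rough TF 2.x counterpart can be sketched as follows. The lookup table and the helper name `apply_runtime_options` are made up for this illustration; the `tf.*` calls are the TF 2.x functions that, as far as I know, control the same behavior. TensorFlow is imported inside the helper so that defining it has no side effects.

```python
# ConfigProto field -> TF 2.x call that controls the same behavior.
# (Illustrative lookup table, not part of TensorFlow.)
TF2_EQUIVALENTS = {
    "log_device_placement": "tf.debugging.set_log_device_placement",
    "allow_soft_placement": "tf.config.set_soft_device_placement",
    "intra_op_parallelism_threads": "tf.config.threading.set_intra_op_parallelism_threads",
    "inter_op_parallelism_threads": "tf.config.threading.set_inter_op_parallelism_threads",
}


def apply_runtime_options(log_placement=True, soft_placement=True,
                          intra_threads=0, inter_threads=0):
    """Hypothetical helper: apply the four options through the TF 2.x API.

    Call it at program start. The thread settings must be applied before
    TensorFlow initializes its runtime, otherwise TensorFlow raises an error.
    """
    import tensorflow as tf  # lazy import: requires TensorFlow 2.x

    # Thread pool sizes; 0 lets TensorFlow choose.
    tf.config.threading.set_intra_op_parallelism_threads(intra_threads)
    tf.config.threading.set_inter_op_parallelism_threads(inter_threads)
    # Fall back to another device when the requested one is unavailable.
    tf.config.set_soft_device_placement(soft_placement)
    # Log the device each operation runs on.
    tf.debugging.set_log_device_placement(log_placement)
```

Calling `apply_runtime_options()` with its defaults mirrors the values used in the ConfigProto example.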

The object returned by tf.ConfigProto, here Session_Configuration, has the following attributes:

  • When a GPU is used, let TensorFlow start with a small allocation and grow its GPU memory use as needed, instead of reserving nearly all of it up front
Session_Configuration.gpu_options.allow_growth = True
  • When a GPU is used, cap the fraction of GPU memory this process may use
Session_Configuration.gpu_options.per_process_gpu_memory_fraction = 0.4
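To make the two memory options concrete, here is a minimal sketch. `memory_budget_mb` is plain arithmetic showing what a fraction means in megabytes. The other two helpers show what I believe are the closest TF 2.x counterparts of `allow_growth` and `per_process_gpu_memory_fraction`. All three function names, and the 8192 MB card in the usage note, are assumptions made for illustration.

```python
def memory_budget_mb(total_memory_mb, fraction):
    """Return the memory cap in MB implied by a per-process memory fraction."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return int(total_memory_mb * fraction)


def grow_gpu_memory_on_demand():
    """TF 2.x counterpart of allow_growth=True; returns the number of GPUs set up."""
    import tensorflow as tf  # lazy import: requires TensorFlow 2.x

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # Must run before the GPU is first used.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


def cap_first_gpu(limit_mb):
    """Cap TensorFlow's use of the first visible GPU at limit_mb megabytes."""
    import tensorflow as tf  # lazy import: requires TensorFlow 2.x

    gpus = tf.config.list_physical_devices("GPU")
    if gpus:
        # Must run before the GPU is first used.
        tf.config.experimental.set_virtual_device_configuration(
            gpus[0],
            [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=limit_mb)],
        )
    return bool(gpus)
```

On a hypothetical 8192 MB card, `memory_budget_mb(8192, 0.4)` gives 3276, so `cap_first_gpu(3276)` approximates a fraction of 0.4. Use either the growth helper or the cap on a given GPU, not both.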

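One more related call: check whether this TensorFlow build was compiled with CUDA (GPU) support. Note that it returns True for a GPU build even when no GPU is actually usable on the machine: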

tf.test.is_built_with_cuda()
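Because a CUDA build does not guarantee a usable GPU, it can help to combine that check with the list of GPUs TensorFlow actually sees. The sketch below does this. The helper names and the message wording are invented for illustration; `tf.config.list_physical_devices` is the TF 2.x call for listing devices.

```python
def describe_gpu_support(built_with_cuda, visible_gpu_count):
    """Turn the two checks into a one-line diagnosis (illustrative wording)."""
    if not built_with_cuda:
        return "TensorFlow build has no CUDA support; only the CPU can be used"
    if visible_gpu_count == 0:
        return "CUDA build, but no GPU is visible (check driver, CUDA and cuDNN)"
    return f"CUDA build with {visible_gpu_count} visible GPU(s)"


def query_gpu_support():
    """Ask the installed TensorFlow and describe the result."""
    import tensorflow as tf  # lazy import: requires TensorFlow 2.x

    return describe_gpu_support(
        tf.test.is_built_with_cuda(),
        len(tf.config.list_physical_devices("GPU")),
    )
```

Running `print(query_gpu_support())` before training is a quick way to tell a CPU-only install from a GPU install whose driver or cuDNN setup is broken.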

keras.backend

Keras can be regarded as a high-level wrapper around TensorFlow, so the underlying Graph and Session still matter when using Keras. Calling keras.layers.* makes Keras build nodes on a TensorFlow Graph. keras.backend.set_session and keras.backend.get_session set and retrieve the session Keras uses behind the scenes (in TensorFlow 2.x both live under tf.compat.v1.keras.backend). Because the two share one session, Keras code can be combined with, or converted to, plain TensorFlow code.
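The solution code can be wrapped into a small helper to show how the configured session reaches Keras. This is only a sketch: the name `make_keras_session` is hypothetical, and the default of 0.2 is simply the value used in the solution above.

```python
def make_keras_session(memory_fraction=0.2):
    """Create a TF1-style session with a GPU memory cap and register it with Keras."""
    # Validate first, so a bad value fails before TensorFlow touches the GPU.
    if not 0.0 < memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be in (0, 1]")

    import tensorflow as tf  # lazy import: requires TensorFlow

    config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
    config.gpu_options.per_process_gpu_memory_fraction = memory_fraction
    session = tf.compat.v1.Session(config=config)
    # From here on, Keras uses this session instead of creating its own.
    tf.compat.v1.keras.backend.set_session(session)
    return session
```

Call `make_keras_session()` once at the top of the training script, before any model is built. `tf.compat.v1.keras.backend.get_session()` can then be used to retrieve the session Keras is working with.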

Further Study

from tensorflow.keras.backend import *

In TensorFlow Core v2.4.1, this module provides the following functions; see the official TensorFlow documentation for their usage.

clear_session(...): Resets all state generated by Keras.

epsilon(...): Returns the value of the fuzz factor used in numeric expressions.

floatx(...): Returns the default float type, as a string.

get_uid(...): Associates a string prefix with an integer counter in a TensorFlow graph.

image_data_format(...): Returns the default image data format convention.

is_keras_tensor(...): Returns whether x is a Keras tensor.

reset_uids(...): Resets graph identifiers.

rnn(...): Iterates over the time dimension of a tensor.

set_epsilon(...): Sets the value of the fuzz factor used in numeric expressions.

set_floatx(...): Sets the default float type.

set_image_data_format(...): Sets the value of the image data format convention.
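As a small illustration of what the "fuzz factor" is for, the sketch below adds it to a denominator to avoid division by zero, and reads three of the settings listed above from Keras. `safe_divide` and `keras_defaults` are hypothetical names; the default of 1e-7 is chosen to match Keras's default epsilon.

```python
def safe_divide(numerator, denominator, eps=1e-7):
    """Divide while avoiding division by zero by adding a small fuzz factor."""
    return numerator / (denominator + eps)


def keras_defaults():
    """Return Keras's current fuzz factor, float type and image data format."""
    from tensorflow.keras import backend as K  # lazy import: requires TensorFlow

    return K.epsilon(), K.floatx(), K.image_data_format()
```

`safe_divide(1.0, 0.0)` returns a large finite number instead of raising ZeroDivisionError, which is the role epsilon plays inside numeric expressions such as loss functions.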

References

[1] https://blog.csdn.net/qq_29969029/article/details/108415432

[2] https://blog.csdn.net/qq_31261509/article/details/79746114

[3] https://blog.csdn.net/qq_31964037/article/details/106866396
