Notes on an error when training YOLOv3 with PyTorch

The error was as follows:

Model Summary: 222 layers, 6.15291e+07 parameters, 6.15291e+07 gradients
Caching labels (5547 found, 0 missing, 19 empty, 0 duplicate, for 5566 images): 100%|| 5566/5566 [00:00<00:00, 75
Caching labels (2381 found, 0 missing, 5 empty, 0 duplicate, for 2386 images): 100%|| 2386/2386 [00:00<00:00, 766
Using 8 dataloader workers
Starting training for 300 epochs...

     Epoch   gpu_mem      GIoU       obj       cls     total   targets  img_size
   246/299     4.26G      2.09      0.81      2.11      5.01        10       416: 100%|| 696/696 [03:01<00:00,  3
               Class    Images   Targets         P         R   mAP@0.5        F1:   0%|   | 0/150 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 469, in <module>
    train()  # train normally
  File "train.py", line 357, in train
    dataloader=testloader)
  File "/home/aistudio/facemask1/test.py", line 77, in test
    for batch_i, (imgs, targets, paths, shapes) in enumerate(tqdm(dataloader, desc=s)):
  File "/opt/conda/envs/facemask/lib/python3.7/site-packages/tqdm/std.py", line 1127, in __iter__
    for obj in iterable:
  File "/opt/conda/envs/facemask/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/opt/conda/envs/facemask/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/opt/conda/envs/facemask/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/opt/conda/envs/facemask/lib/python3.7/site-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
cv2.error: Caught error in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/opt/conda/envs/facemask/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/envs/facemask/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/envs/facemask/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/aistudio/facemask1/utils/datasets.py", line 432, in __getitem__
    img, ratio, pad = letterbox(img, shape, auto=False, scaleup=self.augment)
  File "/home/aistudio/facemask1/utils/datasets.py", line 630, in letterbox
    img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
cv2.error: OpenCV(4.2.0) /io/opencv/modules/core/src/copy.cpp:1421: error: (-215:Assertion failed) top >= 0 && bottom >= 0 && left >= 0 && right >= 0 && _src.dims() <= 2 in function 'copyMakeBorder'

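The cause was a problem with one image in the dataset. Validation failed on the very first image, so I deleted that image's path from 2007_val.txt, and the run then completed successfully.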

(Update, May 13) Thanks to qq_39878661 in the comments for the answer, quoted here in full:

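The root cause is that the val.shapes file in that directory was not updated. Once you delete the first path from the validation file val.txt, the software automatically recomputes and updates val.shapes. This file holds the dimensions of every image listed in val.txt. It is plain text, so you can open it and inspect it directly.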
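If the stale cache really is the cause, another option might be to delete the `.shapes` file before training, so that no image has to be dropped from the list. The sketch below is only an illustration under two assumptions: that the cache sits next to the list file and shares its name with the extension swapped for `.shapes` (as `val.txt` / `val.shapes` in the quote suggests), and that the training code rebuilds the file when it is missing. The helper name `remove_shapes_cache` is made up for this example.

```python
import os


def remove_shapes_cache(list_path):
    """Delete the .shapes cache that sits next to an image-list file.

    Assumption: the cache is named like the list file with its extension
    replaced by '.shapes' (e.g. val.txt -> val.shapes).
    Returns True if a cache file was removed, False if none existed.
    """
    cache_path = os.path.splitext(list_path)[0] + ".shapes"
    if os.path.isfile(cache_path):
        os.remove(cache_path)
        return True
    return False
```

Calling `remove_shapes_cache("2007_val.txt")` before launching train.py would then leave the image list itself untouched.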

The code is ultralytics' yolov3 from GitHub.
Attached is a link to the code author's reply on the related issue.

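The image below shows the asker's reply. I would like to know how they traced the problem back to the image path. If anyone knows, please tell me in the comments. Thanks!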
[Image 1: screenshot of the asker's reply]
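On tracing which image fails: one possible approach (not necessarily what the asker did) is to run the per-image loading step over every path in a single process and catch the exception, so each offending path is reported together with its error. In the sketch below, `find_failing_paths`, `read_image_list` and `load_fn` are hypothetical names. `load_fn` stands for whatever per-image work fails in the traceback, for example reading the image and passing it to `letterbox`.

```python
def read_image_list(list_path):
    """Read one image path per line, skipping blank lines."""
    with open(list_path) as f:
        return [line.strip() for line in f if line.strip()]


def find_failing_paths(paths, load_fn):
    """Call load_fn on each path and collect the ones that raise.

    Returns a list of (path, exception) pairs in input order.
    """
    failures = []
    for path in paths:
        try:
            load_fn(path)
        except Exception as exc:  # keep going so every bad file is reported
            failures.append((path, exc))
    return failures
```

Setting the number of dataloader workers to 0 is another way to make debugging easier, because the exception is then raised in the main process instead of a worker.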

I am still a beginner, so if there is a better solution, please share it in the comments. Much appreciated.
