Getting CLIP4STR running to generate pre-labels for character recognition

Project link: https://github.com/VamosC/CLIP4STR

Download the project from the link above, along with the model files clip4str_base16x16_d70bde1f2d.ckpt and ViT-B-16.pt.

First, set up the environment as described in the project's README.md:

Requires `Python >= 3.8` and `PyTorch >= 1.12`.
The following commands are tested on a Linux machine with CUDA Driver Version `525.105.17` and CUDA Version `11.3`.
```
conda create --name clip4str python==3.8
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 -c pytorch
pip install -r requirements.txt
```

Detailed steps:

1. Create the virtual environment at a specified location

conda create --prefix /home/fxp/fxp/envs/CLIP4STR python==3.8

2. Activate the virtual environment

source activate /home/fxp/fxp/envs/CLIP4STR

3. Install the required libraries

1)

conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 -c pytorch

2)

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
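Once both installs finish, it can be worth confirming that the environment really meets the README's minimums (`Python >= 3.8`, `PyTorch >= 1.12`) before touching the model. The following is only a minimal sketch: the helper names `version_tuple`, `meets_minimum` and `check_environment` are my own and are not part of CLIP4STR.

```python
import re
import sys
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn a version string such as '1.12.0+cu113' into (1, 12, 0)."""
    core = version.split("+")[0]  # drop a local build suffix like '+cu113'
    parts = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two version strings numerically, padding the shorter with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment() -> dict:
    """Report whether Python and PyTorch satisfy the README minimums."""
    report = {"python": sys.version_info >= (3, 8)}
    try:
        report["torch"] = meets_minimum(metadata.version("torch"), "1.12")
    except metadata.PackageNotFoundError:
        # PyTorch is not installed in the active environment.
        report["torch"] = False
    return report


if __name__ == "__main__":
    print(check_environment())
```

Run it with the interpreter of the activated environment; a `False` entry points to the component that needs reinstalling.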

Next, update the path

CLIP_PATH = '/PUT/YOUR/PATH/HERE/pretrained/clip'

This setting is on line 22 of CLIP4STR-main/strhub/models/vl_str/system.py.
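If the project gets re-downloaded often, the path edit can be scripted instead of done by hand. This is a rough sketch that assumes the assignment sits on a single line beginning with `CLIP_PATH =`, as in the snippet above; `set_clip_path` and `patch_file` are hypothetical helper names, and the example target directory is a placeholder.

```python
import re
from pathlib import Path

# Assumption: the assignment occupies one line that starts with "CLIP_PATH =".
CLIP_PATH_LINE = re.compile(r"^CLIP_PATH\s*=.*$", flags=re.MULTILINE)


def set_clip_path(source: str, new_path: str) -> str:
    """Return `source` with its first CLIP_PATH assignment set to `new_path`."""
    replacement = f"CLIP_PATH = {new_path!r}"
    # A function replacement avoids backslash-escape processing of the path.
    updated, count = CLIP_PATH_LINE.subn(lambda _: replacement, source, count=1)
    if count == 0:
        raise ValueError("no CLIP_PATH assignment found")
    return updated


def patch_file(system_py: Path, new_path: str) -> None:
    """Rewrite the CLIP_PATH line of the given file in place."""
    text = system_py.read_text(encoding="utf-8")
    system_py.write_text(set_clip_path(text, new_path), encoding="utf-8")
```

A call would look like `patch_file(Path("CLIP4STR-main/strhub/models/vl_str/system.py"), "/your/pretrained/clip")`, with the second argument replaced by your own directory.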

Run the test as described in README.md:

bash scripts/read.sh 0 clip4str_base16x16_d70bde1f2d.ckpt misc/test_images

I configured the following parameters in PyCharm:

/home/fxp/4tdisk/code/certificate_reader/CLIP4STR-main/weights/clip4str_base16x16_d70bde1f2d.ckpt
--images_path
/home/fxp/4tdisk/code/certificate_reader/CLIP4STR-main/misc/test_images

The test produced correct character recognition results.
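For pre-labeling several image folders, the README command can be launched from Python. The sketch below simply rebuilds the same `bash scripts/read.sh` call; I am assuming the first argument (`0`) is a GPU index, and since the output format is not described here, the raw stdout is returned without parsing. `build_read_command` and `run_reader` are hypothetical names.

```python
import subprocess


def build_read_command(gpu_id: int, checkpoint: str, images_dir: str) -> list:
    """Rebuild the README test command as an argument list."""
    # Assumption: the first argument of read.sh selects the GPU.
    return ["bash", "scripts/read.sh", str(gpu_id), checkpoint, images_dir]


def run_reader(repo_root: str, gpu_id: int, checkpoint: str, images_dir: str) -> str:
    """Run the command from the repository root and return its raw stdout."""
    result = subprocess.run(
        build_read_command(gpu_id, checkpoint, images_dir),
        cwd=repo_root,        # read.sh is referenced relative to the repo root
        capture_output=True,
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

For example, `run_reader("CLIP4STR-main", 0, "clip4str_base16x16_d70bde1f2d.ckpt", "misc/test_images")` mirrors the command shown above.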

Problems encountered in use:

1.

RuntimeError: The NVIDIA driver on your system is too old (found version 10010).  
Please update your GPU driver by downloading and installing a new version from the 
URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: 
https://pytorch.org to install a PyTorch version that has been compiled with your 
version of the CUDA driver.

Solution: install a newer GPU driver. For driver installation, see:

Reinstalling CUDA and cuDNN on Ubuntu and mounting a hard disk to home (CSDN blog)
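The number in the error message is easier to read once decoded. CUDA reports versions as the integer 1000 × major + 10 × minor, so `10010` means the driver supports up to CUDA 10.1, which is older than the CUDA `11.3` named in the README. The sketch below shows that arithmetic; treating 11.3 as the required version is my assumption based on the README, and the function names are my own.

```python
def decode_cuda_version(encoded: int) -> tuple:
    """Decode CUDA's integer version (1000 * major + 10 * minor) into (major, minor)."""
    return encoded // 1000, (encoded % 1000) // 10


def driver_supports(encoded_driver: int, required: tuple) -> bool:
    """True if the driver's CUDA version is at least the required (major, minor)."""
    return decode_cuda_version(encoded_driver) >= required


if __name__ == "__main__":
    found = 10010        # value taken from the error message
    required = (11, 3)   # assumption: CUDA version listed in the README
    print(decode_cuda_version(found), driver_supports(found, required))
```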

Follow-up:

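I turned to this large model in order to check manually annotated characters. However, the model's inference input size is 224*224: recognition of short text lines is acceptable, but on very long text lines it performs worse than PaddleOCR's large server model, so in the end I did not use it for label quality checking.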
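A quick back-of-the-envelope calculation illustrates why long lines suffer. The sketch assumes the whole text-line crop is resized straight to 224*224 without preserving its aspect ratio, which I have not verified against the preprocessing code; the function names and the example line sizes are made up for illustration.

```python
def pixels_per_char(num_chars: int, input_width: int = 224) -> float:
    """Average horizontal pixels available per character after resizing."""
    if num_chars <= 0:
        raise ValueError("num_chars must be positive")
    return input_width / num_chars


def resize_scales(orig_width: int, orig_height: int, input_size: int = 224) -> tuple:
    """Return (width_scale, height_scale) for a direct resize to a square input."""
    return input_size / orig_width, input_size / orig_height


if __name__ == "__main__":
    # A short 8-character line versus a long 40-character line.
    print(pixels_per_char(8), pixels_per_char(40))
    # A hypothetical 1280x32 crop: the width shrinks while the height stretches.
    print(resize_scales(1280, 32))
```

Under this assumption a 40-character line leaves only a few pixels of width per character, while an 8-character line keeps several times more, which matches the behavior I observed.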
