warren@伟_

[Engineering Deployment] Deploying OCR (text detection and recognition) on the RK3588 (DBNet + CRNN)

Hardware platform:

1. A Firefly RK3588 board running Ubuntu;

2. A Windows PC hosting an Ubuntu 18.04 virtual machine.

Reference manuals: 00-Rockchip_RKNPU_User_Guide_RKNN_API_V1.3.0_CN

RKNN Toolkit Lite2 User Guide

1. Text detection

Project repository:

GitHub - WenmuZhou/PytorchOCR: a PyTorch-based OCR toolkit that supports common text detection and recognition algorithms

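DBNet is a deep learning model for text detection; "DB" stands for Differentiable Binarization. It was proposed by Minghui Liao et al. in 2019, with the goal of keeping high accuracy while making detection faster. Segmentation-based text detectors usually predict a probability map and then binarize it with a fixed threshold in a separate post-processing step. Because that step sits outside training, the network cannot learn how the threshold should adapt to each region, and the post-processing tends to be slow and hand-tuned.

To address this, DBNet moves binarization into the network. It has two key parts:

(1) Backbone with feature fusion: a convolutional backbone produces feature maps at several scales, which are upsampled and fused FPN-style into a single feature map. The paper uses ResNet backbones with deformable convolution; the checkpoint exported later in this article (det_db_mbv3_new.pth) is, judging by its file name, a lighter MobileNetV3 variant.

(2) Prediction head with differentiable binarization: from the fused features the head predicts a probability map P and a threshold map T, and combines them into an approximate binary map B = 1 / (1 + exp(-k(P - T))), with k = 50 in the paper. Because this function is differentiable, the thresholds are learned together with the segmentation.

Training follows the usual forward and backward passes. In the forward pass the image goes through the backbone, feature fusion and prediction head to produce the probability, threshold and approximate binary maps. The loss between these maps and the ground-truth labels is then back-propagated to update the network parameters.

At inference time only the probability map is needed: it is binarized with a fixed threshold, contours are extracted, and each region is expanded (unclipped) into a text box, which is what the DBPostProcess class later in this article does. This keeps post-processing simple and lets DBNet reach high accuracy at high speed, which makes it practical for real applications such as document automation and scene-text reading. Paper: https://arxiv.org/abs/1911.08947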
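The paper linked above replaces the hard threshold step with a smooth approximation so that it can be trained. A minimal sketch of that per-pixel function, next to the hard threshold used at inference, is shown below; the function names are illustrative and are not taken from PytorchOCR.

```python
import math


def approx_binarize(p: float, t: float, k: float = 50.0) -> float:
    """Approximate binarization of one pixel.

    p: value from the probability map
    t: value from the threshold map
    k: amplifying factor (the paper uses 50)
    """
    return 1.0 / (1.0 + math.exp(-k * (p - t)))


def hard_binarize(p: float, thresh: float = 0.3) -> int:
    """Fixed-threshold binarization, as done at inference time."""
    return 1 if p > thresh else 0


# A pixel well above its threshold is pushed towards 1, one well below towards 0
high = approx_binarize(0.9, 0.3)
low = approx_binarize(0.1, 0.3)
```

With k this large the curve is very close to a step function, yet it still has a usable gradient near p = t, which is where the threshold map gets its training signal.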

Figure 1. DBNet network architecture

2. Text recognition

Project repository:

GitHub - WenmuZhou/PytorchOCR: a PyTorch-based OCR toolkit that supports common text detection and recognition algorithms

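CRNN (Convolutional Recurrent Neural Network) is a deep learning model that combines the strengths of convolutional neural networks (CNNs) and recurrent neural networks (RNNs), and is widely used for image-based text recognition (OCR). It was first proposed by Baoguang Shi et al. in 2015.

The idea behind CRNN is to use a CNN to extract image features and an RNN to model them as a sequence, so that the network can be trained end to end from the image level to the sequence level.

A CRNN model usually consists of the following parts:

(1) Convolutional layers: several convolutional layers extract local features from the image. They learn representations at different levels, from low-level features (such as edges and textures) to high-level features (such as shapes and patterns).

(2) Recurrent layers: after the convolutional layers, the feature map is read column by column as a sequence and passed to RNN layers. An RNN captures the context of a sequence, so it can handle text of varying length.

(3) Transcription layer: after the RNN layers, the transcription layer turns the per-frame outputs into the final text. A fully connected layer maps each RNN output to scores over the predefined character set plus a blank symbol, and CTC (Connectionist Temporal Classification) decoding then merges repeated characters and removes the blanks.

Training consists of two main steps: the forward pass and the backward pass. In the forward pass the image goes through the convolutional and recurrent layers to produce the per-frame predictions. The CTC loss between those predictions and the ground-truth label is then computed and back-propagated to update the network parameters, so that the predictions gradually approach the labels.

CRNN is widely used in OCR and can recognize text of different sizes, fonts, colors and backgrounds. It handles long text sequences well, and its end-to-end design avoids the complicated multi-stage pipelines of traditional OCR systems. It has therefore worked well in many practical scenarios, such as license plate recognition, scene-text recognition and handwriting recognition.

In short, CRNN is a deep learning model that combines a CNN and an RNN for image-based text recognition. Its end-to-end design and its sequence modelling ability have made it one of the standard OCR models. Paper: https://arxiv.org/abs/1507.05717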
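The last step of recognition, turning per-frame class indices into text, can be sketched in a few lines. This is a minimal greedy CTC decoder under the same convention as the CTCLabelConverter used later in this article (index 0 is the blank); the function name and the tiny character set are made up for the example.

```python
def ctc_greedy_decode(indices, charset, blank=0):
    """Collapse repeated indices, then drop blanks.

    indices: best class index for each time step
    charset: list mapping index -> character, with charset[blank] as the blank token
    """
    out = []
    prev = None
    for idx in indices:
        # Keep a character only if it is not blank and differs from the previous frame
        if idx != blank and idx != prev:
            out.append(charset[idx])
        prev = idx
    return "".join(out)


charset = ["[blank]", "a", "b"]
text = ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0], charset)
```

The blank between the two runs of index 1 is what allows a genuinely doubled letter to survive the merge step.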

Figure 2. CRNN architecture

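Environment setup

Setting up rknn-toolkit and rknpu_sdk

(Step by step) Setting up rknn-toolkit and rknpu_sdk, using the RK3588 as an example (warren@伟_'s blog on CSDN)

Model export and verification

Text detection

Exporting the ONNX model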

#!/usr/bin/env python3

'''

Author: warren

Date: 2023-06-07 14:52:27

LastEditors: warren

LastEditTime: 2023-06-12 15:20:28

FilePath: /warren/VanillaNet1/export_onnx.py

Description: export onnx model

Copyright (c) 2023, All Rights Reserved.

'''

import torch

from torchocr.networks import build_model

MODEL_PATH='./model/det_db_mbv3_new.pth'

DEVICE='cuda:0' if torch.cuda.is_available() else 'cpu'

print("-----------------------devices",DEVICE)


class DetInfer:

    def __init__(self, model_path):

        ckpt = torch.load(model_path, map_location=DEVICE)

        cfg = ckpt['cfg']

        self.model = build_model(cfg['model'])

        state_dict = {}

        for k, v in ckpt['state_dict'].items():

            state_dict[k.replace('module.', '')] = v

        self.model.load_state_dict(state_dict)

        self.device = torch.device(DEVICE)

        self.model.to(self.device)

        self.model.eval()


        # Prepare a dummy input tensor; its shape (1, 3, 640, 640) fixes the input size of the exported model

        input = torch.randn(1, 3, 640, 640, requires_grad=False).float().to(self.device)


        # Export the torch model as onnx

        print("-------------------export")

        torch.onnx.export(self.model,

            input,

            'detect_model_small.onnx', # name of the exported onnx model

            export_params=True,

            opset_version=12,

            do_constant_folding=False)


# Load the pretrained model and export it as onnx

model = DetInfer(MODEL_PATH)
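The export script removes the 'module.' prefix from the checkpoint keys with k.replace('module.', ''). That prefix typically appears when a model was saved while wrapped in torch.nn.DataParallel. Note that replace() removes the substring wherever it occurs in a key. A slightly stricter variant that only strips a leading prefix could look like this; the helper name is hypothetical and works on any plain dict.

```python
def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading prefix removed from each key.

    Keys that do not start with the prefix are kept unchanged.
    """
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Plain values stand in for tensors here
example = {"module.backbone.conv.weight": 1, "head.module.bias": 2}
cleaned = strip_module_prefix(example)
```

For the checkpoints used in this article both approaches presumably give the same result; the difference only matters if a layer name happens to contain "module." somewhere in the middle.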

Verification

import numpy as np

import cv2

import torch

from torchvision import transforms

import pyclipper

from shapely.geometry import Polygon

import onnxruntime


class DBPostProcess():

    def __init__(self, thresh=0.3, box_thresh=0.7, max_candidates=1000, unclip_ratio=2):

        self.min_size = 3

        self.thresh = thresh

        self.box_thresh = box_thresh

        self.max_candidates = max_candidates

        self.unclip_ratio = unclip_ratio


    def __call__(self, pred, h_w_list, is_output_polygon=False):

        '''

        batch: (image, polygons, ignore_tags)

        h_w_list: list of [h, w] pairs, one per image

        pred:

            binary: text region segmentation map, with shape (N, 1,H, W)

        '''

        pred = pred[:, 0, :, :]

        segmentation = self.binarize(pred)

        boxes_batch = []

        scores_batch = []

        for batch_index in range(pred.shape[0]):

            height, width = h_w_list[batch_index]

            boxes, scores = self.post_p(pred[batch_index], segmentation[batch_index], width, height,

                                        is_output_polygon=is_output_polygon)

            boxes_batch.append(boxes)

            scores_batch.append(scores)

        return boxes_batch, scores_batch


    def binarize(self, pred):

        return pred > self.thresh


    def post_p(self, pred, bitmap, dest_width, dest_height, is_output_polygon=False):

        '''

        _bitmap: single map with shape (H, W),

            whose values are binarized as {0, 1}

        '''

        height, width = pred.shape

        boxes = []

        new_scores = []

        # bitmap = bitmap.cpu().numpy()

        # OpenCV 3.x returns (image, contours, hierarchy); 4.x and later return (contours, hierarchy).

        # Taking the second-to-last element works for both and never leaves contours undefined.

        found = cv2.findContours((bitmap * 255).astype(np.uint8), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

        contours = found[-2]

        for contour in contours[:self.max_candidates]:

            epsilon = 0.005 * cv2.arcLength(contour, True)

            approx = cv2.approxPolyDP(contour, epsilon, True)

            points = approx.reshape((-1, 2))

            if points.shape[0] < 4:

                continue

            score = self.box_score_fast(pred, contour.squeeze(1))

            if self.box_thresh > score:

                continue

            if points.shape[0] > 2:

                box = self.unclip(points, unclip_ratio=self.unclip_ratio)

                if len(box) > 1:

                    continue

            else:

                continue

            four_point_box, sside = self.get_mini_boxes(box.reshape((-1, 1, 2)))

            if sside < self.min_size + 2:

                continue

            if not isinstance(dest_width, int):

                dest_width = dest_width.item()

                dest_height = dest_height.item()

            if not is_output_polygon:

                box = np.array(four_point_box)

            else:

                box = box.reshape(-1, 2)

            box[:, 0] = np.clip(np.round(box[:, 0] / width * dest_width), 0, dest_width)

            box[:, 1] = np.clip(np.round(box[:, 1] / height * dest_height), 0, dest_height)

            boxes.append(box)

            new_scores.append(score)

        return boxes, new_scores


    def unclip(self, box, unclip_ratio=1.5):

        poly = Polygon(box)

        distance = poly.area * unclip_ratio / poly.length

        offset = pyclipper.PyclipperOffset()

        offset.AddPath(box, pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)

        expanded = np.array(offset.Execute(distance))

        return expanded


    def get_mini_boxes(self, contour):

        bounding_box = cv2.minAreaRect(contour)

        points = sorted(list(cv2.boxPoints(bounding_box)), key=lambda x: x[0])


        index_1, index_2, index_3, index_4 = 0, 1, 2, 3

        if points[1][1] > points[0][1]:

            index_1 = 0

            index_4 = 1

        else:

            index_1 = 1

            index_4 = 0

        if points[3][1] > points[2][1]:

            index_2 = 2

            index_3 = 3

        else:

            index_2 = 3

            index_3 = 2


        box = [points[index_1], points[index_2], points[index_3], points[index_4]]

        return box, min(bounding_box[1])


    def box_score_fast(self, bitmap, _box):

        # bitmap = bitmap.detach().cpu().numpy()

        h, w = bitmap.shape[:2]

        box = _box.copy()

        # np.int was removed in NumPy 1.24; the built-in int is the drop-in replacement

        xmin = np.clip(np.floor(box[:, 0].min()).astype(int), 0, w - 1)

        xmax = np.clip(np.ceil(box[:, 0].max()).astype(int), 0, w - 1)

        ymin = np.clip(np.floor(box[:, 1].min()).astype(int), 0, h - 1)

        ymax = np.clip(np.ceil(box[:, 1].max()).astype(int), 0, h - 1)


        mask = np.zeros((ymax - ymin + 1, xmax - xmin + 1), dtype=np.uint8)

        box[:, 0] = box[:, 0] - xmin

        box[:, 1] = box[:, 1] - ymin

        cv2.fillPoly(mask, box.reshape(1, -1, 2).astype(np.int32), 1)

        return cv2.mean(bitmap[ymin:ymax + 1, xmin:xmax + 1], mask)[0]

def narrow_224_32(image, expected_size=(224,32)):

    ih, iw = image.shape[0:2]

    ew, eh = expected_size

    # scale = eh / ih

    scale = min((eh/ih),(ew/iw))

    # scale = eh / max(iw,ih)

    nh = int(ih * scale)

    nw = int(iw * scale)

    image = cv2.resize(image, (nw, nh), interpolation=cv2.INTER_CUBIC)


    top = 0

    bottom = eh - nh

    left = 0

    right = ew - nw


    new_img = cv2.copyMakeBorder(image, top, bottom, left, right, cv2.BORDER_CONSTANT, value=(114, 114, 114))

    return image,new_img

def draw_bbox(img_path, result, color=(0, 0, 255), thickness=2):

    import cv2

    if isinstance(img_path, str):

        img_path = cv2.imread(img_path)

        # img_path = cv2.cvtColor(img_path, cv2.COLOR_BGR2RGB)

    img_path = img_path.copy()

    for point in result:

        point = point.astype(int)

        cv2.polylines(img_path, [point], True, color, thickness)

    return img_path


if __name__ == '__main__':

    onnx_model = onnxruntime.InferenceSession("detect_model_small.onnx")

    input_name = onnx_model.get_inputs()[0].name

    # Set inputs

    img = cv2.imread('./pic/6.jpg')

    img0 , image= narrow_224_32(img,expected_size=(640,640))

   

    transform_totensor = transforms.ToTensor()

    tensor=transform_totensor(image)

    tensor_nor=transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

    tensor=tensor_nor(tensor)

    tensor = np.array(tensor,dtype=np.float32).reshape(1,3,640,640)


    post_proess = DBPostProcess()

    is_output_polygon = False

    #run

    outputs = onnx_model.run(None, {input_name:tensor})

    #post process

    feat_2 = torch.from_numpy(outputs[0])

    print(feat_2.size())

    box_list, score_list = post_proess(outputs[0], [image.shape[:2]], is_output_polygon=is_output_polygon)

    box_list, score_list = box_list[0], score_list[0]

    if len(box_list) > 0:

        idx = [x.sum() > 0 for x in box_list]

        box_list = [box_list[i] for i, v in enumerate(idx) if v]

        score_list = [score_list[i] for i, v in enumerate(idx) if v]

    else:

        box_list, score_list = [], []

    print("-----------------box list",box_list)

    img = draw_bbox(image, box_list)

    img = img[0:img0.shape[0],0:img0.shape[1]]

    print("============save pic")

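    # img was cropped back to the resized (unpadded) area above, so write it as-is;

    # forcing a (640, 640, 3) reshape fails whenever the source image is not square

    cv2.imwrite("img.jpg", img)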

    cv2.waitKey()
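The unclip step in DBPostProcess expands each detected region by a distance of area * unclip_ratio / perimeter before the minimum-area box is fitted. The arithmetic can be checked without shapely or pyclipper; the sketch below uses the shoelace formula in plain Python, and the helper names are illustrative only.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of a closed polygon."""
    n = len(points)
    return sum(
        math.hypot(points[(i + 1) % n][0] - points[i][0],
                   points[(i + 1) % n][1] - points[i][1])
        for i in range(n)
    )


def unclip_distance(points, unclip_ratio=2.0):
    """Offset distance used to grow a shrunken text region back out."""
    return polygon_area(points) * unclip_ratio / polygon_perimeter(points)


# A 100 x 20 text line: area 2000, perimeter 240
rect = [(0, 0), (100, 0), (100, 20), (0, 20)]
distance = unclip_distance(rect, unclip_ratio=2.0)
```

For this rectangle the offset is about 16.7 pixels on every side, which shows why a larger unclip_ratio quickly produces boxes that swallow neighbouring lines.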

Text recognition

Exporting the ONNX model

#!/usr/bin/env python3

import os

import sys

import pathlib

# Add the torchocr path to the Python path

__dir__ = pathlib.Path(os.path.abspath(__file__))

import numpy as np

sys.path.append(str(__dir__))

sys.path.append(str(__dir__.parent.parent))

import torch

from torchocr.networks import build_model


MODEL_PATH='./model/ch_rec_moblie_crnn_mbv3.pth'

DEVICE='cuda:0' if torch.cuda.is_available() else 'cpu'

print("-----------------------devices",DEVICE)


class RecInfer:

    def __init__(self, model_path, batch_size=1):

        ckpt = torch.load(model_path, map_location=DEVICE)

        cfg = ckpt['cfg']

        self.model = build_model(cfg['model'])

        state_dict = {}

        for k, v in ckpt['state_dict'].items():

            state_dict[k.replace('module.', '')] = v

        self.model.load_state_dict(state_dict)

        self.batch_size = batch_size

        self.device = torch.device(DEVICE)

        self.model.to(self.device)

        self.model.eval()

        # Prepare input tensor

        input = torch.randn(1, 3, 32, 224, requires_grad=False).float().to(torch.device(DEVICE))

        # Export the torch model as onnx

        print("-------------------export")

        torch.onnx.export(self.model,

            input,

            'rego_model_small.onnx',

            export_params=True,

            opset_version=12,

            do_constant_folding=False)

       

# Load the pretrained model and export it as onnx

model = RecInfer(MODEL_PATH)
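Because the model is exported with a fixed input of 1 x 3 x 32 x 224, every text crop has to be brought to exactly that size before inference. A minimal sketch of the geometry follows, assuming the crop is resized to the target height and padded on the right. Clamping crops that are too wide is a choice made for this example and is not something the original scripts do.

```python
def fit_to_rec_input(h, w, target_h=32, target_w=224):
    """Return (new_w, pad_right) for a crop of size h x w.

    The crop is scaled to target_h keeping its aspect ratio. If the scaled
    width exceeds target_w it is clamped (the text gets squeezed horizontally),
    otherwise the remainder is padded on the right.
    """
    if h <= 0 or w <= 0:
        raise ValueError("crop must have positive height and width")
    new_w = max(1, round(w * target_h / h))
    new_w = min(new_w, target_w)
    return new_w, target_w - new_w


narrow = fit_to_rec_input(64, 200)   # scaled width 100, padded by 124
wide = fit_to_rec_input(32, 500)     # scaled width 500, clamped to 224
```

Long text lines lose detail when clamped, so a wider export size is worth considering when the target text is long.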

Verification

import onnxruntime

import numpy as np

import cv2

import torch

DEVICE='cuda:0' if torch.cuda.is_available() else 'cpu'

IMG_WIDTH=448

ONNX_MODEL='./onnx_model/repvgg_s.onnx'

LABEL_FILE='/root/autodl-tmp/warren/PytorchOCR_OLD/torchocr/datasets/alphabets/dict_text.txt'

#ONNX_MODEL='./onnx_model/rego_model_small.onnx'

#LABEL_FILE='/root/autodl-tmp/warren/PytorchOCR_OLD/torchocr/datasets/alphabets/ppocr_keys_v1.txt'

PIC='./pic/img.jpg'


class CTCLabelConverter(object):

    """ Convert between text-label and text-index """


    def __init__(self, character):

        # character (str): set of the possible characters.

        dict_character = []

        with open(character, "rb") as fin:

            lines = fin.readlines()

            for line in lines:

                line = line.decode('utf-8').strip("\n").strip("\r\n")

                dict_character += list(line)

        self.dict = {}

        for i, char in enumerate(dict_character):

            # NOTE: 0 is reserved for 'blank' token required by CTCLoss

            self.dict[char] = i + 1

        #TODO replace ‘ ’ with special symbol

        self.character = ['[blank]'] + dict_character+[' ']  # dummy '[blank]' token for CTCLoss (index 0)


    def decode(self, preds, raw=False):

        """ convert text-index into text-label. """

        preds_idx = preds.argmax(axis=2)

        preds_prob = preds.max(axis=2)

        result_list = []

        for word, prob in zip(preds_idx, preds_prob):

            if raw:

                result_list.append((''.join([self.character[int(i)] for i in word]), prob))

            else:

                result = []

                conf = []

                for i, index in enumerate(word):

                    if word[i] != 0 and (not (i > 0 and word[i - 1] == word[i])):

                        result.append(self.character[int(index)])

                        conf.append(prob[i])

                result_list.append((''.join(result), conf))

        return result_list

   

def decode(preds, raw=False):

    """ convert text-index into text-label. """

    dict_character = []

    dict = {}

    character=LABEL_FILE

    with open(character, "rb") as fin:

        lines = fin.readlines()

        for line in lines:

            line = line.decode('utf-8').strip("\n").strip("\r\n")

            dict_character += list(line)

    for i, char in enumerate(dict_character):

        # NOTE: 0 is reserved for 'blank' token required by CTCLoss

        dict[char] = i + 1

    #TODO replace ‘ ’ with special symbol

    # Build the lookup table once, after the loop, so it also exists when the dictionary file is empty

    character = ['[blank]'] + dict_character+[' ']  # dummy '[blank]' token for CTCLoss (index 0)

    preds_idx = preds.argmax(axis=2)

    preds_prob = preds.max(axis=2)

    result_list = []

    for word, prob in zip(preds_idx, preds_prob):

        if raw:

            result_list.append((''.join([character[int(i)] for i in word]), prob))

        else:

            result = []

            conf = []

            for i, index in enumerate(word):

                if word[i] != 0 and (not (i > 0 and word[i - 1] == word[i])):

                    result.append(character[int(index)])

                    conf.append(prob[i])

            result_list.append((''.join(result), conf))

    return result_list


def width_pad_img(_img, _target_width, _pad_value=0):

        _height, _width, _channels = _img.shape

        to_return_img = np.ones([_height, _target_width, _channels], dtype=_img.dtype) * _pad_value

        to_return_img[:_height, :_width, :] = _img

        return to_return_img

def resize_with_specific_height(_img):

        resize_ratio = 32 / _img.shape[0]

        return cv2.resize(_img, (0, 0), fx=resize_ratio, fy=resize_ratio, interpolation=cv2.INTER_LINEAR)

def normalize_img(_img):

        return (_img.astype(np.float32) / 255 - 0.5) / 0.5


if __name__ == '__main__':

    onnx_model = onnxruntime.InferenceSession(ONNX_MODEL)

    input_name = onnx_model.get_inputs()[0].name

    # Set inputs

    imgs = cv2.imread(PIC)

    if not isinstance(imgs,list):

        imgs = [imgs]

    imgs = [normalize_img(resize_with_specific_height(img)) for img in imgs]

    widths = np.array([img.shape[1] for img in imgs])

    idxs = np.argsort(widths)

    txts = []

    label_convert=CTCLabelConverter(LABEL_FILE)

    for idx in range(len(imgs)):

        batch_idxs = idxs[idx:min(len(imgs), idx+1)]

        batch_imgs = [width_pad_img(imgs[idx],IMG_WIDTH) for idx in batch_idxs]

        batch_imgs = np.stack(batch_imgs)

        print(batch_imgs.shape)

        tensor =batch_imgs.transpose([0,3, 1, 2]).astype(np.float32)

        out = onnx_model.run(None, {input_name:tensor})

        tensor_out = torch.tensor(out)

        tensor_out = torch.squeeze(tensor_out,dim=1)

        softmax_output = tensor_out.softmax(dim=2)

        print("---------------out shape is",softmax_output.shape)

        txts.extend([label_convert.decode(np.expand_dims(txt, 0)) for txt in softmax_output])

    idxs = np.argsort(idxs)

    out_txts = [txts[idx] for idx in idxs]

    import sys

    import codecs


    sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())

    print(out_txts)

At this point both models have been exported and verified.
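The two verification scripts normalize their inputs differently: detection uses ToTensor() followed by Normalize(mean, std), while recognition uses (x / 255 - 0.5) / 0.5. If this normalization is later folded into a converted model that receives raw 0-255 pixels, the mean and std have to be expressed in pixel space. A minimal sketch of that conversion follows; the helper name is hypothetical.

```python
def to_pixel_space(mean, std, scale=255.0):
    """Convert mean/std defined for [0, 1] inputs into equivalents for 0-255 pixels.

    (x / 255 - m) / s  ==  (x - 255 * m) / (255 * s)
    """
    return [m * scale for m in mean], [s * scale for s in std]


# Detection preprocessing from the verification script
det_mean, det_std = to_pixel_space([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])

# Recognition preprocessing: (x / 255 - 0.5) / 0.5
rec_mean, rec_std = to_pixel_space([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])

# Spot check on one pixel value: both forms give the same normalized result
x = 200
in_unit_space = (x / 255.0 - 0.485) / 0.229
in_pixel_space = (x - det_mean[0]) / det_std[0]
```

The values are applied in channel order, so they should follow whatever order (BGR or RGB) the images have when they are fed to the model.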

Deployment on the RK3588 board

Converting to an RKNN model

from rknn.api import RKNN

ONNX_MODEL = 'xxx.onnx'
RKNN_MODEL = 'xxxx.rknn'
DATASET = './dataset.txt'

if __name__ == '__main__':

    # Create RKNN object
    rknn = RKNN(verbose=True)

    # pre-process config
    print('--> Config model')
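    # mean/std are given in 0-255 pixel space and must match the preprocessing used when verifying the ONNX model.
    # The values below are 255 * [0.485, 0.456, 0.406] and 255 * [0.229, 0.224, 0.225] (detection model).
    # For the recognition model, which uses (x / 255 - 0.5) / 0.5, use 127.5 for every mean and std value.
    # std must never be 0, because the input is divided by it.
    ret=rknn.config(mean_values=[[123.675, 116.28, 103.53]], std_values=[[58.395, 57.12, 57.375]],target_platform='rk3588')  #wzw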
    if ret != 0:
        print('config model failed!')
        exit(ret)
    print('done')

    # Load ONNX model
    print('--> Loading model')
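    # The export scripts above do not name the ONNX outputs, so let the toolkit use the graph's own output nodes;
    # pass outputs=[...] only if the names are known to exist in this model
    ret = rknn.load_onnx(model=ONNX_MODEL)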
    if ret != 0:
        print('Load model failed!')
        exit(ret)
    print('done')

    # Build model
    print('--> Building model')
    ret = rknn.build(do_quantization=True, dataset=DATASET)
    #ret = rknn.build(do_quantization=False)
    if ret != 0:
        print('Build model failed!')
        exit(ret)
    print('done')

    # Export RKNN model
    print('--> Export rknn model')
    ret = rknn.export_rknn(RKNN_MODEL)
    if ret != 0:
        print('Export rknn model failed!')
        exit(ret)
    print('done')
    #release rknn
    rknn.release()
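The conversion script quantizes the model with the images listed in ./dataset.txt. As far as I know, for a single-input model the toolkit expects this file to contain one image path per line. A minimal sketch for generating it from a folder of calibration images follows; the helper name and the folder in the usage comment are assumptions, not part of the original project.

```python
import os


def write_dataset_file(image_dir, dataset_path, exts=(".jpg", ".jpeg", ".png")):
    """Write one image path per line to dataset_path and return the paths written."""
    paths = sorted(
        os.path.join(image_dir, name)
        for name in os.listdir(image_dir)
        if name.lower().endswith(exts)
    )
    with open(dataset_path, "w", encoding="utf-8") as f:
        for path in paths:
            f.write(path + "\n")
    return paths


# Example usage (hypothetical folder name):
# write_dataset_file("./calib_images", "./dataset.txt")
```

The calibration images should resemble what the board will actually see: full scenes for the detection model and cropped text lines for the recognition model.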

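Developing the application with PyQt

PyQt software design

The application is built with PyQt; its UI is shown below.

Figure 6. PyQt-based UI

The interface has three buttons: one selects a static image, one turns on the camera, and one runs detection. A TextEdit shows the recognition results, and a label shows the processed image.

The software flow chart is as follows:

The overall directory layout is as follows:

The image-detection code is presented below: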

import platform

import sys

import cv2

import numpy as np

import torch

import pyclipper

from shapely.geometry import Polygon

from torchvision import transforms

import time

import os

import glob

import threading

from PyQt5.QtGui import *

from PyQt5.QtWidgets import *

from PyQt5.QtCore import *

from rknnlite.api import RKNNLite

# Remove QT_QPA_PLATFORM_PLUGIN_PATH if it is set; the None default avoids a KeyError when it is not

os.environ.pop("QT_QPA_PLATFORM_PLUGIN_PATH", None)

DETECT_MODEL = './model/model_small.rknn'

REGO_MODEL='./model/repvgg_s.rknn'

LABEL_FILE='./dict/dict_text.txt'

LABEL_SIZE_PRIVIOUS=0

LABEL_SIZE_LATTER=0

# Folder path

folder_path = './crop_pic'

# Use glob to collect the paths of all image files

image_files = glob.glob(os.path.join(folder_path, '*.png')) + glob.glob(os.path.join(folder_path, '*.jpg'))


def  resize_img_self(image,reszie_size=(0,0)):

    ih,iw=image.shape[0:2]

    ew,eh=reszie_size

    scale=eh/ih

    width=max(1,min(int(iw*scale),ew))  # clamp to the target width so the right padding below is never negative

    height=int(ih*scale)

    if height!=eh:

        height=eh

    image=cv2.resize(image,(width,height),interpolation=cv2.INTER_LINEAR)

    top = 0

    bottom = 0

    left = 0

    right = ew-width

    new_img = cv2.copyMakeBorder(image, top, bottom, left, right, cv2.BORDER_CONSTANT, value=(114, 114, 114))

    #print("new image shape",new_img.shape)

    return new_img

def narrow_224_32(image, expected_size=(224,32)):

    ih, iw = image.shape[0:2]

    ew, eh = expected_size

    # scale = eh / ih

    scale = min((eh/ih),(ew/iw))

    # scale = eh / max(iw,ih)

    nh = int(ih * scale)

    nw = int(iw * scale)

    image = cv2.resize(image, (nw, nh), interpolation=cv2.INTER_CUBIC)


    top = 0

    bottom = eh - nh

    left = 0

    right = ew - nw


    new_img = cv2.copyMakeBorder(image, top, bottom, left, right, cv2.BORDER_CONSTANT, value=(114, 114, 114))

    return image,new_img

def draw_bbox(img_path, result, color=(0, 0, 255), thickness=2):

    import cv2

    if isinstance(img_path, str):

        img_path = cv2.imread(img_path)

        # img_path = cv2.cvtColor(img_path, cv2.COLOR_BGR2RGB)

    img_path = img_path.copy()

    for point in result:

        point = point.astype(int)

        cv2.polylines(img_path, [point], True, color, thickness)

    return img_path

def delay_milliseconds(milliseconds):

    seconds = milliseconds / 1000.0

    time.sleep(seconds)


""" Convert between text-label and text-index """

class CTCLabelConverter(object):

    def __init__(self, character):

        # character (str): set of the possible characters.

        dict_character = []

        with open(character, "rb") as fin:

            lines = fin.readlines()

            for line in lines:

                line = line.decode('utf-8').strip("\n").strip("\r\n")

                dict_character += list(line)

        self.dict = {}

        for i, char in enumerate(dict_character):

            # NOTE: 0 is reserved for 'blank' token required by CTCLoss

            self.dict[char] = i + 1

        #TODO replace ‘ ’ with special symbol

        self.character = ['[blank]'] + dict_character+[' ']  # dummy '[blank]' token for CTCLoss (index 0)


    def decode(self, preds, raw=False):

        """ convert text-index into text-label. """

        preds_idx = preds.argmax(axis=2)

        preds_prob = preds.max(axis=2)

        result_list = []

        for word, prob in zip(preds_idx, preds_prob):

            if raw:

                result_list.append((''.join([self.character[int(i)] for i in word]), prob))

            else:

                result = []

                conf = []

                for i, index in enumerate(word):

                    if word[i] != 0 and (not (i > 0 and word[i - 1] == word[i])):

                        result.append(self.character[int(index)])

                        #conf.append(prob[i])

                #result_list.append((''.join(result), conf))

                result_list.append((''.join(result)))

        return result_list

class DBPostProcess():

    def __init__(self, thresh=0.3, box_thresh=0.7, max_candidates=1000, unclip_ratio=2):

        self.min_size = 3

        self.thresh = thresh

        self.box_thresh = box_thresh

        self.max_candidates = max_candidates

        self.unclip_ratio = unclip_ratio


    def __call__(self, pred, h_w_list, is_output_polygon=False):

        pred = pred[:, 0, :, :]

        segmentation = self.binarize(pred)

        boxes_batch = []

        scores_batch = []

        for batch_index in range(pred.shape[0]):

            height, width = h_w_list[batch_index]

            boxes, scores = self.post_p(pred[batch_index], segmentation[batch_index], width, height,

                                        is_output_polygon=is_output_polygon)

            boxes_batch.append(boxes)

            scores_batch.append(scores)

        return boxes_batch, scores_batch


    def binarize(self, pred):

        return pred > self.thresh


    def post_p(self, pred, bitmap, dest_width, dest_height, is_output_polygon=False):

        '''

        _bitmap: single map with shape (H, W),

            whose values are binarized as {0, 1}

        '''

        height, width = pred.shape

        boxes = []

        new_scores = []

        # bitmap = bitmap.cpu().numpy()

        # OpenCV 3.x returns (image, contours, hierarchy); 4.x and later return (contours, hierarchy).

        # Taking the second-to-last element works for both and never leaves contours undefined.

        found = cv2.findContours((bitmap * 255).astype(np.uint8), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

        contours = found[-2]

        for contour in contours[:self.max_candidates]:

            epsilon = 0.005 * cv2.arcLength(contour, True)

            approx = cv2.approxPolyDP(contour, epsilon, True)

            points = approx.reshape((-1, 2))

            if points.shape[0] < 4:

                continue

            score = self.box_score_fast(pred, contour.squeeze(1))

            if self.box_thresh > score:

                continue

            if points.shape[0] > 2:

                box = self.unclip(points, unclip_ratio=self.unclip_ratio)

                if len(box) > 1:

                    continue

            else:

                continue

            four_point_box, sside = self.get_mini_boxes(box.reshape((-1, 1, 2)))

            if sside < self.min_size + 2:

                continue

            if not isinstance(dest_width, int):

                dest_width = dest_width.item()

                dest_height = dest_height.item()

            if not is_output_polygon:

                box = np.array(four_point_box)

            else:

                box = box.reshape(-1, 2)

            box[:, 0] = np.clip(np.round(box[:, 0] / width * dest_width), 0, dest_width)

            box[:, 1] = np.clip(np.round(box[:, 1] / height * dest_height), 0, dest_height)

            boxes.append(box)

            new_scores.append(score)

        return boxes, new_scores


    def unclip(self, box, unclip_ratio=1.5):

        poly = Polygon(box)

        distance = poly.area * unclip_ratio / poly.length

        offset = pyclipper.PyclipperOffset()

        offset.AddPath(box, pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)

        expanded = np.array(offset.Execute(distance))

        return expanded


    def get_mini_boxes(self, contour):

        bounding_box = cv2.minAreaRect(contour)

        points = sorted(list(cv2.boxPoints(bounding_box)), key=lambda x: x[0])


        index_1, index_2, index_3, index_4 = 0, 1, 2, 3

        if points[1][1] > points[0][1]:

            index_1 = 0

            index_4 = 1

        else:

            index_1 = 1

            index_4 = 0

        if points[3][1] > points[2][1]:

            index_2 = 2

            index_3 = 3

        else:

            index_2 = 3

            index_3 = 2


        box = [points[index_1], points[index_2], points[index_3], points[index_4]]

        return box, min(bounding_box[1])


    def box_score_fast(self, bitmap, _box):

        # bitmap = bitmap.detach().cpu().numpy()

        h, w = bitmap.shape[:2]

        box = _box.copy()

        # np.int was removed in NumPy 1.24; the built-in int is the drop-in replacement

        xmin = np.clip(np.floor(box[:, 0].min()).astype(int), 0, w - 1)

        xmax = np.clip(np.ceil(box[:, 0].max()).astype(int), 0, w - 1)

        ymin = np.clip(np.floor(box[:, 1].min()).astype(int), 0, h - 1)

        ymax = np.clip(np.ceil(box[:, 1].max()).astype(int), 0, h - 1)


        mask = np.zeros((ymax - ymin + 1, xmax - xmin + 1), dtype=np.uint8)

        box[:, 0] = box[:, 0] - xmin

        box[:, 1] = box[:, 1] - ymin

        cv2.fillPoly(mask, box.reshape(1, -1, 2).astype(np.int32), 1)

        return cv2.mean(bitmap[ymin:ymax + 1, xmin:xmax + 1], mask)[0]

class Process_Class(QWidget):

    detect_end = pyqtSignal(str)

    clear_text = pyqtSignal()

    def __init__(self):

        super().__init__()

        self.image = None

        self.img=None

        self.camera_status=False

        self.result_string=None

        self.cap = cv2.VideoCapture()

        #detect

        rknn_model_detect = DETECT_MODEL

        self.rknn_lite_detect = RKNNLite()

        self.rknn_lite_detect.load_rknn(rknn_model_detect)# load RKNN model

        self.rknn_lite_detect.init_runtime(core_mask=RKNNLite.NPU_CORE_2)# init runtime environment

        #rego

        rknn_model_rego = REGO_MODEL

        self.rknn_lite_rego = RKNNLite()

        self.rknn_lite_rego.load_rknn(rknn_model_rego)# load RKNN model

        self.rknn_lite_rego.init_runtime(core_mask=RKNNLite.NPU_CORE_0_1)# init runtime environment

        self.detect_end.connect(self.update_text_box)

        self.clear_text.connect(self.clear_text_box)

    def cv2_to_qpixmap(self, cv_image):

        height, width, channel = cv_image.shape

        bytes_per_line = 3 * width

        q_image = QImage(cv_image.data, width, height, bytes_per_line, QImage.Format_RGB888).rgbSwapped()

        return QPixmap.fromImage(q_image)

    def show_pic(self, cv_image):

        pixmap = self.cv2_to_qpixmap(cv_image)

        if MainWindow.pic_label is not None:

            MainWindow.pic_label.setPixmap(pixmap)

            QApplication.processEvents()

        else:

            print("wrong!!!!!!!")

    def camera_open(self):

        self.camera_status = not self.camera_status

        print("------------camera status is",self.camera_status)

        if  self.camera_status:

            self.cap.open(12)

            if  self.cap.isOpened():

                print("run camera")

                while(True):

                    frame = self.cap.read()

                    if not frame[0]:

                        print("read frame failed!!!!")

                        exit()

                    self.image=frame[1]

                    self.detect_pic()

                    if not self.camera_status:  

                        break

            else:

                print("Cannot open camera")

                exit()

        else:

            self.release_camera()

    def release_camera(self):

        if self.cap.isOpened():

            self.cap.release()

            self.camera_status = False

            print("Camera closed")

    def open_file(self):

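        # Get the image path

        img_path, _ = QFileDialog.getOpenFileName()

        if img_path != '':

            self.image = cv2.imread(img_path)

        # Only display when an image is loaded; a cancelled dialog or an unreadable file leaves None

        if self.image is not None:

            self.show_pic(self.image)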

    def crop_and_save_image(self,image, box_points):

        global LABEL_SIZE_PRIVIOUS

        global LABEL_SIZE_LATTER

        i=-1

        # Convert box_points to a NumPy array of integers

        box_points = np.array(box_points, dtype=np.int32)

        mask = np.zeros_like(image)  # all-black image the same size as the input

        print("LABEL_SIZE_PRIVIOUS ",LABEL_SIZE_PRIVIOUS,"LABEL_SIZE_LATTER ",LABEL_SIZE_LATTER)

        if LABEL_SIZE_PRIVIOUS==LABEL_SIZE_LATTER:

            LABEL_SIZE_PRIVIOUS=len(box_points)

            for box_point in box_points:

                i=i+1

                cropped_image = image.copy()

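                # Crop the image with OpenCV, using the bounding rectangle of the box

                x, y, w, h = cv2.boundingRect(box_point)

                cropped_image = image[y:y+h, x:x+w]

                # Create an all-black mask the same size as the crop

                mask = np.zeros_like(cropped_image)

                # Draw the polygon on the mask

                cv2.fillPoly(mask, [box_point - (x, y)], (255, 255, 255))

                # Use bitwise_and to keep only the pixels inside the polygon

                masked_cropped_image = cv2.bitwise_and(cropped_image, mask)

                # Save the cropped image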

                output_path = f"{'./crop_pic/'}img_{i}.jpg"

                cv2.imwrite(output_path, masked_cropped_image)

        else:

            #self.clear_text.emit()

            LABEL_SIZE_LATTER=LABEL_SIZE_PRIVIOUS

            current_directory = os.getcwd()+'/crop_pic'  # Get the current directory

            for filename in os.listdir(current_directory):

                if filename.endswith(".jpg"):

                    file_path = os.path.join(current_directory, filename)

                    os.remove(file_path)

                    print(f"Deleted: {file_path}")

    def detect_thread(self):

        # Detection inference: prepare the frame for the 640x640 model input

        img0, image = narrow_224_32(self.image, expected_size=(640, 640))

        outputs = self.rknn_lite_detect.inference(inputs=[image])

        # DB post-processing turns the network output into boxes and scores

        post_process = DBPostProcess()

        is_output_polygon = False

        box_list, score_list = post_process(outputs[0], [image.shape[:2]], is_output_polygon=is_output_polygon)

        box_list, score_list = box_list[0], score_list[0]

        if len(box_list) > 0:

            idx = [x.sum() > 0 for x in box_list]

            box_list = [box_list[i] for i, v in enumerate(idx) if v]

            score_list = [score_list[i] for i, v in enumerate(idx) if v]

        else:

            box_list, score_list = [], []

        self.image = draw_bbox(image, box_list)

        self.crop_and_save_image(image,box_list)

        self.image = self.image[0:img0.shape[0],0:img0.shape[1]]

        self.show_pic(self.image)

    def rego_thread(self):

        label_convert=CTCLabelConverter(LABEL_FILE)

        self.clear_text.emit()

        for image_file in image_files:

            if os.path.exists(image_file):

                print('-----------image file',image_file,len(image_files))

                self.img = cv2.imread(image_file)

                image = resize_img_self(self.img,reszie_size=(448,32))

                # Inference

                outputs = self.rknn_lite_rego.inference(inputs=[image])

                # Post-process: decode the CTC output into text

                feat_2 = np.asarray(outputs[0], dtype=np.float32)

                txt = label_convert.decode(feat_2)

                self.result_string = ' '.join(txt)

                print(self.result_string)

                self.detect_end.emit(self.result_string)

            else:

                print("-----------no crop image!!!")

    def detect_pic(self):

        self.detect_thread()

        my_thread = threading.Thread(target=self.rego_thread)

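        # Start the recognition thread

        my_thread.start()

        # Wait for it to finish; note that join() blocks the caller until recognition completes

        my_thread.join()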

    def update_text_box(self, text):

        # Update the text box from the main thread

        MainWindow.text_box.append(text)

    def clear_text_box(self):

        print("clear--------------------------------")

        # Clear the text box from the main thread

        MainWindow.text_box.clear()

       

   

class MainWindow(QMainWindow):

    # Class-level handles, assigned in main() so Process_Class can reach the widgets

    pic_label = None

    text_box = None

    def __init__(self):

        super().__init__()

        self.process_functions = Process_Class()

        self.window = QWidget()

        # Create the widgets

        self.pic_label = QLabel('Show Window!', parent=self.window)

        self.pic_label.setMinimumHeight(500)  # Set the minimum height

        self.pic_label.setMaximumHeight(500)  # Set the maximum height

        self.pic_button = QPushButton('Picture', parent=self.window)

        self.pic_button.clicked.connect(self.process_functions.open_file)

        self.camera_button = QPushButton('Camera', parent=self.window)

        self.camera_button.clicked.connect(self.process_functions.camera_open)

        self.detect_button = QPushButton('Detect', parent=self.window)

        self.detect_button.clicked.connect(self.process_functions.detect_pic)

        self.text_box = QTextEdit()

        # Create the layout managers; the widgets are added to them in create_ui()

        self.left_layout = QVBoxLayout()

        self.right_layout = QVBoxLayout()

        self.layout = QHBoxLayout()

        self.create_ui()

        self.window.closeEvent = self.closeEvent


    def create_ui(self):

        self.window.setWindowTitle('Scene_text_rego')

        self.window.setGeometry(0, 0, 800, 600)  # Set the window position and size

        # Set up the main window layout

        self.pic_label.setStyleSheet('border: 2px solid black; padding: 10px;')

        self.left_layout.addWidget(self.pic_label)

        self.left_layout.addWidget(self.text_box)

        self.right_layout.addWidget(self.pic_button)

        self.right_layout.addWidget(self.camera_button)

        self.right_layout.addWidget(self.detect_button)

        self.layout.addLayout(self.left_layout)

        self.layout.addLayout(self.right_layout)

        self.window.setLayout(self.layout)

        self.window.show()

    def closeEvent(self, event):

        # Release the camera

        self.process_functions.release_camera()

        event.accept()


def main():

    # Create the application object

    app = QApplication(sys.argv)

    win = MainWindow()

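    MainWindow.pic_label = win.pic_label  # Expose the image label as a class attribute

    MainWindow.text_box = win.text_box  # Expose the text box as a class attribute

    # Run the event loop, then release the NPU contexts before exiting

    exit_code = app.exec_()

    win.process_functions.rknn_lite_detect.release()

    win.process_functions.rknn_lite_rego.release()

    sys.exit(exit_code)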


if __name__ == '__main__':

    main()
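
In `rego_thread`, the recognition output is turned into text by `CTCLabelConverter.decode`, which is defined elsewhere in the project. As a rough illustration of what greedy CTC decoding does, here is a minimal standard-library sketch. The function name `ctc_greedy_decode`, the choice of class 0 as the blank, and the mapping of class `k` to `charset[k - 1]` are assumptions made for this example; the actual converter may use a different layout and handle batches.

```python
from typing import List, Sequence


def ctc_greedy_decode(logits: Sequence[Sequence[float]], charset: str, blank: int = 0) -> str:
    """Greedy CTC decoding: argmax per time step, collapse repeats, drop blanks.

    Assumption for this sketch: class `blank` is the CTC blank and class k
    (k >= 1) maps to charset[k - 1].
    """
    # Pick the highest-scoring class at every time step
    best_path = [max(range(len(step)), key=step.__getitem__) for step in logits]

    chars: List[str] = []
    previous = blank
    for index in best_path:
        # Emit a character only when the class is not blank and differs from the previous step
        if index != blank and index != previous:
            chars.append(charset[index - 1])
        previous = index
    return "".join(chars)


# Toy input: one-hot scores for the class sequence 1, 1, blank, 1, 2, 2, blank, 3
path = [1, 1, 0, 1, 2, 2, 0, 3]
logits = [[1.0 if c == k else 0.0 for c in range(4)] for k in path]
decoded = ctc_greedy_decode(logits, "abc")
print(decoded)  # prints "aabc"
```

The blank between the two runs of class 1 is what lets the decoder output a doubled character; without it, the repeated class would collapse into a single "a".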

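Results

References

Blog post:

[Engineering Deployment] A step-by-step guide to deploying an OCR service on RKNN, Part 1 (original title: 【工程部署】手把手教你在RKNN上部署OCR服务（上）), blog of 三叔家的猫 on CSDN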
