Building a Question-Search Tool in Python with PaddleOCR

Posted on 2022-06-08 by WalkonNet

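Introduction

PaddleOCR is an OCR toolkit built on Baidu's PaddlePaddle framework. It includes an ultra-lightweight Chinese OCR system whose models total only 8.6M, and a single model can recognize mixed Chinese, English, and digits, vertical text, and long text. It also supports a range of training algorithms for text detection and text recognition.

This tutorial covers the basics of PaddleOCR and shows how to use it to build a small tool that searches for quiz questions automatically.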

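Project repository

Installation

PaddleOCR supports server-side deployment and can expose a recognition API, but what we need here is a local, offline OCR environment, so this tutorial only covers installing and using it locally.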

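Installing the PaddlePaddle framework

1. Environment preparation

1.1 Environments currently supported by PaddlePaddle

Windows 7/8/10 Professional/Enterprise (64-bit)

The GPU version supports CUDA 10.1/10.2/11.0/11.2, single GPU only

Python 3.6+/3.7+/3.8+/3.9+ (64-bit)

pip 20.2.2 or later (64-bit)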
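Before installing, it can help to confirm that the interpreter meets the requirements listed above. The following is a minimal sketch using only the standard library; the function name check_environment and the MIN_PYTHON constant are illustrative and not part of PaddlePaddle.

```python
import platform
import struct
import sys

# Minimum Python version taken from the requirements listed above.
MIN_PYTHON = (3, 6)


def check_environment():
    """Return a dict describing whether this interpreter meets the requirements."""
    # Pointer size in bits: 64 on a 64-bit Python build.
    bits = struct.calcsize("P") * 8
    return {
        "python_version": platform.python_version(),
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "is_64bit": bits == 64,
        "system": platform.system(),
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

The pip version is not checked here; running pip --version in a terminal shows it.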

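2. Installation command

pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple

(Note: this installs the CPU version. For the GPU version, see the PaddlePaddle documentation.)

After installation, run python to enter the Python interpreter, type import paddle, and then type paddle.utils.run_check()

If you see PaddlePaddle is installed successfully!, the installation succeeded.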
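The same check can be wrapped in a small script so that a missing installation produces a clear answer instead of a traceback. This is only a sketch: paddle_is_usable is a hypothetical helper name, and it relies on the paddle.utils.run_check() call mentioned above.

```python
def paddle_is_usable():
    """Return True if paddle imports and its self-check runs, False if it is not installed."""
    try:
        import paddle  # imported lazily so this script also runs without PaddlePaddle
    except ImportError:
        return False
    # Prints "PaddlePaddle is installed successfully!" on a working installation.
    paddle.utils.run_check()
    return True


if __name__ == "__main__":
    print("PaddlePaddle usable:", paddle_is_usable())
```

Only ImportError is handled here; any other error raised during the self-check is left to propagate so that it stays visible.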

Installing PaddleOCR

pip install "paddleocr>=2.0.1"  # version 2.0.1 or later is recommended

Usage

After installation, you can run the following code as a quick functional test:

from paddleocr import PaddleOCR, draw_ocr

# PaddleOCR currently supports Chinese+English, English, French, German, Korean, and Japanese.
# Switch languages with the lang parameter: `ch`, `en`, `french`, `german`, `korean`, `japan`.
ocr = PaddleOCR(use_angle_cls=True, lang="ch")  # need to run only once to download and load model into memory
# Path of the image to recognize
img_path = '11.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
    print(line)

# Visualize the result
# Note: the indexing below assumes the flat result format shown in the sample output;
# newer PaddleOCR releases may nest results per image (for example result[0]).
from PIL import Image

image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')

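The result is a list in which each item contains the text box, the recognized text, and the recognition confidence:

[[[24.0, 36.0], [304.0, 34.0], [304.0, 72.0], [24.0, 74.0]], ['純臻營養護發素', 0.964739]]
[[[24.0, 80.0], [172.0, 80.0], [172.0, 104.0], [24.0, 104.0]], ['產品信息/參數', 0.98069626]]
[[[24.0, 109.0], [333.0, 109.0], [333.0, 136.0], [24.0, 136.0]], ['（45元/每公斤，100公斤起訂）', 0.9676722]]
……

Visualization result

That covers the basic usage of PaddleOCR. With this in hand, we can build an OCR-based question-search tool.

For more usage details, see the PaddleOCR documentation.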
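To make the structure of each item concrete, here is a small sketch that unpacks rows of this shape and keeps only the text whose confidence clears a threshold. The helper extract_texts and its min_confidence default are illustrative choices, not part of PaddleOCR; the sample rows are copied from the output above.

```python
# Sample rows copied from the output above: [box, [text, confidence]].
sample_result = [
    [[[24.0, 36.0], [304.0, 34.0], [304.0, 72.0], [24.0, 74.0]], ['純臻營養護發素', 0.964739]],
    [[[24.0, 80.0], [172.0, 80.0], [172.0, 104.0], [24.0, 104.0]], ['產品信息/參數', 0.98069626]],
    [[[24.0, 109.0], [333.0, 109.0], [333.0, 136.0], [24.0, 136.0]], ['（45元/每公斤，100公斤起訂）', 0.9676722]],
]


def extract_texts(result, min_confidence=0.5):
    """Return the recognized strings whose confidence is at least min_confidence."""
    texts = []
    for box, (text, confidence) in result:
        # box holds the four corner points; it is not needed for plain text extraction
        if confidence >= min_confidence:
            texts.append(text)
    return texts


if __name__ == "__main__":
    for text in extract_texts(sample_result, min_confidence=0.9):
        print(text)
```

Raising the threshold trades completeness for reliability, which may be useful when a screenshot contains decorative text that is recognized poorly.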

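The question-search tool

Quiz-style mini games are common nowadays: players compete for the highest accuracy within a time limit. Some organizations also run skills competitions that involve answering questions online. Typing each question into a search engine by hand is slow and inefficient, so we can write a script to handle the search for us.

The basic idea is to capture the current screen through ADB, crop out the region containing the question, extract the question text with PaddleOCR, and then search for it in a search engine or in a question bank.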

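Installing ADB

Download and install ADB, then add it to your environment variables.

Once the environment variables are set, type adb in a terminal. If output like the following appears, ADB is installed correctly.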

Android Debug Bridge version 1.0.41
Version 31.0.3-7562133
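The same check can be done from Python before the script tries to take a screenshot. The sketch below looks for adb on PATH and reads the first line of adb version; find_adb and adb_version are hypothetical helper names.

```python
import shutil
import subprocess


def find_adb():
    """Return the full path of the adb executable, or None if it is not on PATH."""
    return shutil.which("adb")


def adb_version():
    """Return the first line printed by `adb version`, or None if adb is unavailable."""
    adb = find_adb()
    if adb is None:
        return None
    completed = subprocess.run(
        [adb, "version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # decode output as text (works on Python 3.6 as well)
    )
    lines = completed.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    version = adb_version()
    print(version if version else "adb was not found on PATH")
```

Failing early with a readable message is easier to debug than a missing screenshot.png later in the pipeline.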

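Taking a screenshot and saving the question region

import os
from PIL import Image

# Take a screenshot on the device and pull it into the current directory
def pull_screenshot():
    os.system('adb shell screencap -p /sdcard/screenshot.png')
    os.system('adb pull /sdcard/screenshot.png .')

pull_screenshot()
img = Image.open("./screenshot.png")
# Crop the question region
# crop() takes a box of (left, upper, right, lower) pixel coordinates
question = img.crop((10, 400, 1060, 1000))
# Save the question region
question.save("./question.png")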
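Pillow's crop() expects a box given as (left, upper, right, lower), not as a start point plus width and height. If you measured the question area as x, y, width, and height, a small conversion helper avoids mixing the two up. This is a sketch; region_to_box and clamp_box are illustrative names, and the coordinates depend on your device's resolution.

```python
def region_to_box(x, y, width, height):
    """Convert (x, y, width, height) into Pillow's (left, upper, right, lower) box."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return (x, y, x + width, y + height)


def clamp_box(box, image_size):
    """Limit a (left, upper, right, lower) box to the bounds of an image of the given size."""
    left, upper, right, lower = box
    image_width, image_height = image_size
    return (
        max(0, left),
        max(0, upper),
        min(image_width, right),
        min(image_height, lower),
    )


if __name__ == "__main__":
    # A region 1050 px wide and 600 px tall starting at (10, 400)
    box = region_to_box(10, 400, 1050, 600)
    print(box)
    print(clamp_box(box, (1080, 900)))
```

The resulting tuple can be passed directly to img.crop(), with img.size as the second argument to clamp_box.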

OCR recognition: extracting the question

from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=False,
                lang="ch",
                show_log=False
                )  # need to run only once to download and load model into memory
img_path = 'question.png'
result = ocr.ocr(img_path, cls=False)

# Collect the recognized text of each line
questionList = [line[1][0] for line in result]
# Join the list into a single string
text = "".join(questionList)
print(text)
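OCR output often carries a leading question number such as "12、" or "3.", which can make search results worse. The following sketch joins the recognized lines and strips such a prefix. It is a heuristic under stated assumptions: clean_question and the punctuation set in the pattern are illustrative, and a question that legitimately starts with a decimal like "3.14" would be trimmed incorrectly.

```python
import re

# A leading number followed by common list punctuation, e.g. "12、" or "3."
_LEADING_NUMBER = re.compile(r"^\s*\d+\s*[.、．)）]\s*")


def clean_question(lines):
    """Join OCR lines into one string and drop a leading question number, if any."""
    text = "".join(line.strip() for line in lines)
    return _LEADING_NUMBER.sub("", text)


if __name__ == "__main__":
    print(clean_question([" 12、abc", "def "]))
```

Lines are joined without spaces, which suits Chinese text; for English questions you may prefer " ".join instead.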

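Opening the browser to search

import urllib.parse
import webbrowser

# Search for the recognized question text (the `text` string from the OCR step)
webbrowser.open('https://baidu.com/s?wd=' + urllib.parse.quote(text))

You can then review the search results.

If you have a question bank, you can also use pyautogui to simulate mouse and keyboard input and drive software such as Word to search within it.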
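Building the URL in a separate function makes the encoding step easy to inspect before a browser window opens. This sketch uses urllib.parse.urlencode, which escapes characters such as & and = that would otherwise break the query string; build_search_url and SEARCH_BASE are illustrative names, and the endpoint is the one used in the snippet above.

```python
from urllib.parse import urlencode

# Same search endpoint as in the snippet above.
SEARCH_BASE = "https://baidu.com/s"


def build_search_url(query, base=SEARCH_BASE):
    """Return a search URL with the query safely percent-encoded in the wd parameter."""
    return base + "?" + urlencode({"wd": query})


if __name__ == "__main__":
    print(build_search_url("hello world"))
    print(build_search_url("a&b=c"))
```

The returned string can be passed to webbrowser.open() unchanged.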

Complete code

# -*- coding: utf-8 -*-

# @Author  : Pu Zhiwei
# @Time    : 2021-09-02 20:29

from PIL import Image
import os
from paddleocr import PaddleOCR
import pyperclip
import pyautogui
import time
import webbrowser
import urllib.parse

# Mouse position (overwritten at startup with the actual search-box position)
currentMouseX, currentMouseY = 60, 282

# Take a screenshot of the current question
def pull_screenshot():
    os.system('adb shell screencap -p /sdcard/screenshot.png')
    os.system('adb pull /sdcard/screenshot.png .')

# Move the mouse to the search box and paste the query
def MoveMouseToSearch():
    # duration is the travel time: the pointer takes 0.1 s to reach the target
    pyautogui.moveTo(currentMouseX, currentMouseY, duration=0.1)
    # Left-click twice
    pyautogui.click()
    pyautogui.click()
    # Simulate the paste shortcut
    pyautogui.hotkey('ctrl', 'v')

# Extend the query with the fourth OCR line, if there is one
def AddText(lines, length, text):
    if length > 3:
        return text + lines[3]
    else:
        return text

# Open the browser and search for the question
def open_webbrowser(question):
    webbrowser.open('https://baidu.com/s?wd=' + urllib.parse.quote(question))


# Print the full recognized question text
def ShowAllQuestionText(lines):
    print("".join(lines))



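if __name__ == "__main__":
    while True:
        print("\n\nPlace the mouse over the Word search box. In three seconds the script will record its position!\n\n")
        # Wait three seconds before reading the mouse position
        time.sleep(3)
        # Read the current mouse position
        currentMouseX, currentMouseY = pyautogui.position()
        print('Current mouse position: {0} , {1}'.format(currentMouseX, currentMouseY))
        start = input("Press y to start, or any other key to record the search box position again: ")
        if start == 'y':
            break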

    # Load PaddleOCR once, before the loop, so the model is not reloaded for every question
    # PaddleOCR currently supports Chinese+English, English, French, German, Korean, and Japanese.
    # Switch languages with the lang parameter: `ch`, `en`, `french`, `german`, `korean`, `japan`.

    # Custom model paths (optional):
    # det_model_dir='./inference/ch_ppocr_server_v2.0_det_train',
    #                rec_model_dir='./inference/ch_ppocr_server_v2.0_rec_pre',
    #                cls_model_dir='./inference/ch_ppocr_mobile_v2.0_cls_train',
    ocr = PaddleOCR(use_angle_cls=False,
                    lang="ch",
                    show_log=False
                    )

    while True:
        t = time.perf_counter()
        pull_screenshot()
        img = Image.open("./screenshot.png")
        # Crop the question region
        # crop() takes a box of (left, upper, right, lower) pixel coordinates
        question = img.crop((10, 400, 1060, 1000))
        # Save the question region
        question.save("./question.png")

        img_path = 'question.png'
        result = ocr.ocr(img_path, cls=False)

        questionList = [line[1][0] for line in result]
        length = len(questionList)
        # Choose the lines to search for; with two or more lines the first one is skipped
        text = ""
        if length == 1:
            text = questionList[0]
        elif length == 2:
            text = questionList[1]
        elif length > 2:
            text = questionList[1] + questionList[2]

        print('\n\n')
        ShowAllQuestionText(questionList)
        # Copy the query to the clipboard
        pyperclip.copy(text)
        # Click the search box and paste
        MoveMouseToSearch()

        # Report the elapsed time
        print('\n\n')
        end_time3 = time.perf_counter()
        print('Elapsed: {0}'.format(end_time3 - t))

        go = input('Press Enter to continue, e to search in the browser, a to lengthen the query, n to quit: ')
        if go == 'n':
            break

        if go == 'a':
            text = AddText(questionList, length, text)
            pyperclip.copy(text)
            # Click the search box and paste
            MoveMouseToSearch()
            stop = input("Press Enter to continue")
        elif go == 'e':
            # Open the browser
            open_webbrowser(text)
            stop = input("Press Enter to continue")

        print('\n------------------------\n\n')
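The line-selection rule in the main loop (skip the first OCR line, presumably a header, then take the next one or two lines, plus one more on request) can be isolated in a pure function, which makes the empty and single-line cases easy to handle and test without a phone attached. This is a sketch of that rule, not the author's code; choose_query and extra_lines are hypothetical names.

```python
def choose_query(lines, extra_lines=0):
    """Pick the text to search for from the OCR lines.

    With no lines, return an empty string. With one line, use it as is.
    Otherwise skip the first line and join up to two following lines,
    plus extra_lines more when available.
    """
    if not lines:
        return ""
    if len(lines) == 1:
        return lines[0]
    end = min(len(lines), 3 + extra_lines)
    return "".join(lines[1:end])


if __name__ == "__main__":
    ocr_lines = ["Question 3", "Which port does ", "HTTPS use ", "by default?"]
    print(choose_query(ocr_lines))
    print(choose_query(ocr_lines, extra_lines=1))
```

Calling choose_query(lines, extra_lines=1) corresponds to the a option in the loop, which appends the fourth line when one exists.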

