Extracting Data from Figures in Papers with the Python Module plotdigitizer: A Worked Example
Technical Background
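Researchers in every field regularly run into the same problem: a good article contains valuable data, but that data appears only as a figure. If we could extract the data from the image, we could present it again in our own format and add our own improved results alongside it. Extracting data from figures in papers is therefore a very practical need. As an example, we take an earlier blog post on quantum annealing, which contains the following figure:
In this article we show how to use Python to extract the data from such an image.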
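Installing plotdigitizer
Here we use pip to install the third-party Python library plotdigitizer, whose main purpose is to extract data from images automatically. The Tencent pip mirror can be used to speed up the installation: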
[dechin@dechin-manjaro plotdigitizer]$ python3 -m pip install -i https://mirrors.cloud.tencent.com/pypi/simple plotdigitizer
Looking in indexes: https://mirrors.cloud.tencent.com/pypi/simple
Collecting plotdigitizer
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/89/bb/ff753093458c05ce3b52fd17527b6b0622ca096aadcf561c6316320ab793/plotdigitizer-0.1.3-py3-none-any.whl (20 kB)
Collecting loguru<0.6.0,>=0.5.3
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/6d/48/0a7d5847e3de329f1d0134baf707b689700b53bd3066a5a8cfd94b3c9fc8/loguru-0.5.3-py3-none-any.whl (57 kB)
     |████████████████████████████████| 57 kB 521 kB/s
Collecting opencv-python<5.0.0,>=4.5.1
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/2a/9a/ff309b530ac1b029bfdb9af3a95eaff0f5f45f6a2dbe37b3454ae8412f4c/opencv_python-4.5.1.48-cp38-cp38-manylinux2014_x86_64.whl (50.4 MB)
     |████████████████████████████████| 50.4 MB 467 kB/s
Collecting numpy<2.0.0,>=1.19.5
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/c7/e6/dccac76b7e825915ffb906beeba5a953597b6cfe1fe686b5276e122cb07c/numpy-1.20.1-cp38-cp38-manylinux2010_x86_64.whl (15.4 MB)
     |████████████████████████████████| 15.4 MB 20.4 MB/s
Collecting matplotlib<4.0.0,>=3.3.4
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/ab/20/60cfe5d611ac86df07b7b1f9b9582f22f7eda5edbe2124ba85bdf3133822/matplotlib-3.3.4-cp38-cp38-manylinux1_x86_64.whl (11.6 MB)
     |████████████████████████████████| 11.6 MB 4.4 MB/s
Requirement already satisfied: python-dateutil>=2.1 in /home/dechin/anaconda3/lib/python3.8/site-packages (from matplotlib<4.0.0,>=3.3.4->plotdigitizer) (2.8.1)
Requirement already satisfied: cycler>=0.10 in /home/dechin/anaconda3/lib/python3.8/site-packages (from matplotlib<4.0.0,>=3.3.4->plotdigitizer) (0.10.0)
Requirement already satisfied: pillow>=6.2.0 in /home/dechin/anaconda3/lib/python3.8/site-packages (from matplotlib<4.0.0,>=3.3.4->plotdigitizer) (8.0.1)
Requirement already satisfied: kiwisolver>=1.0.1 in /home/dechin/anaconda3/lib/python3.8/site-packages (from matplotlib<4.0.0,>=3.3.4->plotdigitizer) (1.3.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in /home/dechin/anaconda3/lib/python3.8/site-packages (from matplotlib<4.0.0,>=3.3.4->plotdigitizer) (2.4.7)
Requirement already satisfied: six>=1.5 in /home/dechin/anaconda3/lib/python3.8/site-packages (from python-dateutil>=2.1->matplotlib<4.0.0,>=3.3.4->plotdigitizer) (1.15.0)
Installing collected packages: loguru, numpy, opencv-python, matplotlib, plotdigitizer
  Attempting uninstall: numpy
    Found existing installation: numpy 1.19.2
    Uninstalling numpy-1.19.2:
      Successfully uninstalled numpy-1.19.2
  Attempting uninstall: matplotlib
    Found existing installation: matplotlib 3.3.2
    Uninstalling matplotlib-3.3.2:
      Successfully uninstalled matplotlib-3.3.2
Successfully installed loguru-0.5.3 matplotlib-3.3.4 numpy-1.20.1 opencv-python-4.5.1.48 plotdigitizer-0.1.3
Running the help command shows whether the installation succeeded:
[dechin@dechin-manjaro plotdigitizer]$ plotdigitizer -h
usage: plotdigitizer [-h] --data-point DATA_POINT [--location LOCATION] [--plot PLOT] [--output OUTPUT] [--preprocess] [--debug] INPUT

Digitize image.

positional arguments:
  INPUT                 Input image file.

optional arguments:
  -h, --help            show this help message and exit
  --data-point DATA_POINT, -p DATA_POINT
                        Datapoints (min 3 required). You have to click on them later. At least 3 points are recommended. e.g -p 0,0 -p 10,0 -p 0,1 Make sure that point are comma separated without any space.
  --location LOCATION, -l LOCATION
                        Location of a points on figure in pixels (integer). These values should appear in the same order as -p option. If not given, you will be asked to click on the figure.
  --plot PLOT           Plot the final result. Requires matplotlib.
  --output OUTPUT, -o OUTPUT
                        Name of the output file else trajectory will be written to <INPUT>.traj.csv
  --preprocess          Preprocess the image. Useful with bad resolution images.
  --debug               Enable debug logger
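The same check can also be made from inside Python, which is handy in a script that should fail early when the tool is missing. The following is only a minimal sketch using the standard library; the helper names installed_version and command_on_path are made up for this illustration and are not part of plotdigitizer.

```python
import shutil
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def command_on_path(command_name):
    """Return the full path of a command-line program, or None if it is not on PATH."""
    return shutil.which(command_name)


# Both lines print None when plotdigitizer has not been installed
# in the environment that runs this script.
print("package version:", installed_version("plotdigitizer"))
print("command location:", command_on_path("plotdigitizer"))
```

A None result for the command while the package version is found usually means the script directory of that Python environment is not on PATH.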
Running the Command and Output Image
First place the image to be digitized in the current directory, then run the following command:
plotdigitizer ./test1.png -p 0,-1 -p 20,0 -p 0,0.1 --plot output.png
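The three -p values in this command are calibration points: known data coordinates whose positions in the image are then supplied by clicking or with -l. To make the idea concrete, here is a small sketch of how three such pairs are enough to convert any pixel position into data coordinates with an affine map. This illustrates the general principle only and is not taken from the plotdigitizer source; the pixel locations and the function names are hypothetical.

```python
def fit_axis(pixels, values):
    """Solve v = a*px + b*py + c from three (px, py) -> v pairs using Cramer's rule."""
    (x1, y1), (x2, y2), (x3, y3) = pixels
    v1, v2, v3 = values
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if det == 0:
        raise ValueError("calibration pixels must not lie on one straight line")
    a = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
    b = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
    c = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
         + v1 * (x2 * y3 - x3 * y2)) / det
    return a, b, c


def make_pixel_to_data(pixels, data_points):
    """Build a function that maps a pixel position to data coordinates."""
    ax, bx, cx = fit_axis(pixels, [p[0] for p in data_points])
    ay, by, cy = fit_axis(pixels, [p[1] for p in data_points])

    def convert(px, py):
        return (ax * px + bx * py + cx, ay * px + by * py + cy)

    return convert


# Data coordinates from the -p options in the command above.
DATA_POINTS = [(0, -1), (20, 0), (0, 0.1)]
# Hypothetical pixel positions of those three points (image y grows downward).
PIXEL_LOCATIONS = [(50, 400), (450, 100), (50, 70)]

pixel_to_data = make_pixel_to_data(PIXEL_LOCATIONS, DATA_POINTS)
print(pixel_to_data(250, 250))
```

The three calibration points must not be collinear in the image, otherwise the system has no unique solution; this is one reason to pick points on both axes.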
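This command extracts the data from test1.png; the -o option can be used to save the result as a CSV table. In practice we found that on Manjaro Linux, even without the --plot option, the program keeps displaying images over and over, and the process can only be stopped by force with kill -9. This may be a bug in the open-source library. The figure redrawn from the extracted data looks like this: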
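To redraw the data in this way, the extracted points first have to be read back in. The sketch below loads (x, y) pairs from the output file using only the standard library. The exact layout of the file is an assumption here: the loader accepts either commas or whitespace between the two numbers and skips lines it cannot parse. The function name load_trajectory and the file name in the usage note are hypothetical.

```python
import re


def load_trajectory(path):
    """Read (x, y) pairs from a text file with one pair per line.

    Assumption: the two values are separated by a comma or by whitespace.
    Blank lines, header lines and other unparsable lines are skipped.
    """
    points = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
            if len(fields) < 2:
                continue
            try:
                points.append((float(fields[0]), float(fields[1])))
            except ValueError:
                continue  # not numeric, for example a header row
    return points
```

With the points loaded, for example via points = load_trajectory("data.csv"), the x and y values can be separated with xs, ys = zip(*points) and passed to matplotlib's plt.plot(xs, ys) to draw the curve in your own style.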
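After the run finishes, the image is written to the temporary folder tmp/plotdigitizer/. Note, however, that images produced earlier are overwritten by later temporary files.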
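Since the process may not exit on its own, as described above, one possible workaround is to launch it from Python with a time limit instead of reaching for kill -9 by hand. This is a sketch under assumptions, not a fix for the underlying issue: the helper name run_with_timeout and the output file name data.csv are made up, the time limit has to be long enough to include clicking the calibration points, and whether the output has been fully written by the time the process is killed is not guaranteed.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run a command; return True if it exited by itself, False if it was killed.

    subprocess.run kills the child process when the timeout expires and then
    raises TimeoutExpired, which is caught here.
    """
    try:
        subprocess.run(command, timeout=timeout_seconds, check=False)
    except subprocess.TimeoutExpired:
        return False
    return True


# The command from this article as an argument list, with a hypothetical
# output file added through -o. It is only defined here, not executed.
PLOTDIGITIZER_COMMAND = [
    "plotdigitizer", "./test1.png",
    "-p", "0,-1", "-p", "20,0", "-p", "0,0.1",
    "-o", "data.csv",
]

# Harmless demonstration with a stand-in process that sleeps longer than the limit.
finished = run_with_timeout(
    [sys.executable, "-c", "import time; time.sleep(5)"], timeout_seconds=1
)
print("exited on its own:", finished)
```

To apply it to the real tool, call run_with_timeout(PLOTDIGITIZER_COMMAND, 120) or another limit that suits the image.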
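Summary
This article has only introduced and demonstrated the basic usage of plotdigitizer. As an image digitizing tool written in Python, it fits the habits and way of thinking of Python users better than the alternatives. Although various problems can come up in practice, it is on the whole a good tool and worth recommending.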
Copyright Notice
This article was first published at: https://www.cnblogs.com/dechinphy/p/plotdigitizer.html
Author ID: DechinPhy
More original articles: https://www.cnblogs.com/dechinphy/