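A case study: scanning target Python code for unsafe functions with bandit
Technical background
When security-scanning open-source Python libraries, we sometimes need to check whether the functions a library uses can affect the execution environment in unintended ways. A typical example is Python sandbox escape: certain Python libraries can run system shell commands, which falls outside what a Python sandbox protects against. We will not go into sandbox escape here. It has troubled the industry for years, and even the Python developers have acknowledged that there is no perfect protection scheme for a Python sandbox. We only use it as a background example: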
# subprocess_Popen.py

import subprocess
import uuid

subprocess.Popen('touch ' + str(uuid.uuid1()) +'.txt', shell = True)
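This example uses the subprocess library to open a system shell and run a touch command, which creates a file with the given name (much as mkdir creates a directory). After the script runs successfully, a txt file named after a UUID appears in the current directory: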
[dechin@dechin-manjaro bandit_test]$ python3 subprocess_Popen.py
[dechin@dechin-manjaro bandit_test]$ ll
total 4
-rw-r--r-- 1 dechin dechin   0 Jan 26 23:03 b7aa0fc8-5fe7-11eb-b5d3-058313e110e4.txt
-rw-r--r-- 1 dechin dechin 123 Jan 26 23:03 subprocess_Popen.py
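However, the point here is not what this script does but the fact that it imports subprocess. Because a name imported at module level becomes an attribute of that module, once any module in your system imports subprocess, any code that can import that module can also reach subprocess through it. Below is the Python code of a malicious user: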
# bad.py

from subprocess_Popen import subprocess as subprocess

subprocess.Popen('touch bad.txt', shell = True)
The goal of this code is to call subprocess functionality without writing import subprocess directly, by using the previously created subprocess_Popen.py as a bridge. Running the script gives:
[dechin@dechin-manjaro bandit_test]$ python3 bad.py
[dechin@dechin-manjaro bandit_test]$ ll
total 12
-rw-r--r-- 1 dechin dechin    0 Jan 26 23:13 0fda7ede-5fe9-11eb-80a8-ad279ab4e0a6.txt
-rw-r--r-- 1 dechin dechin    0 Jan 26 23:03 b7aa0fc8-5fe7-11eb-b5d3-058313e110e4.txt
-rw-r--r-- 1 dechin dechin  113 Jan 26 23:13 bad.py
-rw-r--r-- 1 dechin dechin    0 Jan 26 23:13 bad.txt
drwxr-xr-x 2 dechin dechin 4096 Jan 26 23:13 __pycache__
-rw-r--r-- 1 dechin dechin  123 Jan 26 23:03 subprocess_Popen.py
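This result means that bad.py successfully used the subprocess imported by subprocess_Popen.py and created a file named bad.txt with touch. (The second UUID-named txt file appears because importing subprocess_Popen also runs its module-level Popen call.)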
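The bridging effect can be reproduced without running any shell command. The following is a minimal sketch of the same mechanism; it uses a hypothetical throwaway module named bridge_demo (not part of the original example) whose only statement is import subprocess:

```python
import importlib
import subprocess
import sys
import tempfile
from pathlib import Path


def load_bridge_module():
    """Create and import a throwaway module that only does `import subprocess`."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # bridge_demo is a hypothetical module name used only for this sketch.
        Path(tmp_dir, "bridge_demo.py").write_text("import subprocess\n")
        sys.path.insert(0, tmp_dir)
        try:
            importlib.invalidate_caches()
            return importlib.import_module("bridge_demo")
        finally:
            sys.path.remove(tmp_dir)


bridge = load_bridge_module()

# The object reachable through bridge_demo is the very same module object
# that a direct `import subprocess` gives us.
print(bridge.subprocess is subprocess)
```

Because the attribute is the real module object, anything subprocess can do is available through the bridge as well.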
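That concludes the background example, but the logic behind it deserves a recap. We want to keep Python sandbox escapes out of our own system, so we scan code that other people submit, for example with the bandit tool introduced below, to block "dangerous functions" such as subprocess. However, if a library we wrote ourselves, or a third-party library we depend on, contains an import like the subprocess one above, that blocking is defeated: users can reach subprocess freely by bridging through the import. Under such requirements we therefore also need to scan our own code for unsafe functions, so that it does not bring unexpected security risks to other people's systems. bandit is just one tool for this kind of scan; its basic installation and usage follow.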
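As a complement to a source scan, one could also inspect an already imported module for bridges at run time. The sketch below is only an illustration of that idea, not something bandit does; the function name exposed_modules and the deny list are assumptions made for this example:

```python
import subprocess
import types

# Hypothetical deny list for this sketch; adjust it to your own policy.
DENIED_MODULES = frozenset({"subprocess", "os"})


def exposed_modules(module, denied=DENIED_MODULES):
    """Return attribute names of `module` that are bound to denied modules."""
    return sorted(
        name
        for name, value in vars(module).items()
        if isinstance(value, types.ModuleType) and value.__name__ in denied
    )


# Stand-in for a library module that imported subprocess under an alias.
demo = types.ModuleType("demo_library")
demo.sb = subprocess
demo.answer = 42

print(exposed_modules(demo))
```

The check looks at the module object's real name rather than the attribute name, so an alias such as sb is still reported.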
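Installing bandit with pip
Here we install bandit directly with pip; it can also be installed from source if needed. For configuring a domestic mirror for pip, see the introduction to installing third-party Python libraries in this blog post.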
[dechin@dechin-manjaro bandit_test]$ python3 -m pip install bandit
Collecting bandit
  Downloading bandit-1.7.0-py3-none-any.whl (115 kB)
     |████████████████████████████████| 115 kB 101 kB/s
Requirement already satisfied: PyYAML>=5.3.1 in /home/dechin/anaconda3/lib/python3.8/site-packages (from bandit) (5.3.1)
Collecting GitPython>=1.0.1
  Downloading GitPython-3.1.12-py3-none-any.whl (159 kB)
     |████████████████████████████████| 159 kB 28 kB/s
Requirement already satisfied: six>=1.10.0 in /home/dechin/anaconda3/lib/python3.8/site-packages (from bandit) (1.15.0)
Collecting stevedore>=1.20.0
  Downloading stevedore-3.3.0-py3-none-any.whl (49 kB)
     |████████████████████████████████| 49 kB 25 kB/s
Collecting gitdb<5,>=4.0.1
  Downloading gitdb-4.0.5-py3-none-any.whl (63 kB)
     |████████████████████████████████| 63 kB 28 kB/s
Collecting pbr!=2.1.0,>=2.0.0
  Downloading pbr-5.5.1-py2.py3-none-any.whl (106 kB)
     |████████████████████████████████| 106 kB 26 kB/s
Collecting smmap<4,>=3.0.1
  Downloading smmap-3.0.5-py2.py3-none-any.whl (25 kB)
Installing collected packages: smmap, gitdb, GitPython, pbr, stevedore, bandit
Successfully installed GitPython-3.1.12 bandit-1.7.0 gitdb-4.0.5 pbr-5.5.1 smmap-3.0.5 stevedore-3.3.0
After installation, verify that it succeeded with the following command:
[dechin@dechin-manjaro bandit_test]$ bandit -h
usage: bandit [-h] [-r] [-a {file,vuln}] [-n CONTEXT_LINES] [-c CONFIG_FILE]
              [-p PROFILE] [-t TESTS] [-s SKIPS] [-l] [-i]
              [-f {csv,custom,html,json,screen,txt,xml,yaml}]
              [--msg-template MSG_TEMPLATE] [-o [OUTPUT_FILE]] [-v] [-d] [-q]
              [--ignore-nosec] [-x EXCLUDED_PATHS] [-b BASELINE]
              [--ini INI_PATH] [--exit-zero] [--version]
              [targets [targets ...]]

Bandit - a Python source code security analyzer

positional arguments:
  targets               source file(s) or directory(s) to be tested

optional arguments:
  -h, --help            show this help message and exit
  -r, --recursive       find and process files in subdirectories
  -a {file,vuln}, --aggregate {file,vuln}
                        aggregate output by vulnerability (default) or by
                        filename
  -n CONTEXT_LINES, --number CONTEXT_LINES
                        maximum number of code lines to output for each issue
  -c CONFIG_FILE, --configfile CONFIG_FILE
                        optional config file to use for selecting plugins and
                        overriding defaults
  -p PROFILE, --profile PROFILE
                        profile to use (defaults to executing all tests)
  -t TESTS, --tests TESTS
                        comma-separated list of test IDs to run
  -s SKIPS, --skip SKIPS
                        comma-separated list of test IDs to skip
  -l, --level           report only issues of a given severity level or higher
                        (-l for LOW, -ll for MEDIUM, -lll for HIGH)
  -i, --confidence      report only issues of a given confidence level or
                        higher (-i for LOW, -ii for MEDIUM, -iii for HIGH)
  -f {csv,custom,html,json,screen,txt,xml,yaml}, --format {csv,custom,html,json,screen,txt,xml,yaml}
                        specify output format
  --msg-template MSG_TEMPLATE
                        specify output message template (only usable with
                        --format custom), see CUSTOM FORMAT section for list
                        of available values
  -o [OUTPUT_FILE], --output [OUTPUT_FILE]
                        write report to filename
  -v, --verbose         output extra information like excluded and included
                        files
  -d, --debug           turn on debug mode
  -q, --quiet, --silent
                        only show output in the case of an error
  --ignore-nosec        do not skip lines with # nosec comments
  -x EXCLUDED_PATHS, --exclude EXCLUDED_PATHS
                        comma-separated list of paths (glob patterns
                        supported) to exclude from scan (note that these are
                        in addition to the excluded paths provided in the
                        config file) (default:
                        .svn,CVS,.bzr,.hg,.git,__pycache__,.tox,.eggs,*.egg)
  -b BASELINE, --baseline BASELINE
                        path of a baseline report to compare against (only
                        JSON-formatted files are accepted)
  --ini INI_PATH        path to a .bandit file that supplies command line
                        arguments
  --exit-zero           exit with 0, even with results found
  --version             show program's version number and exit

CUSTOM FORMATTING
-----------------

Available tags:

    {abspath}, {relpath}, {line}, {test_id},
    {severity}, {msg}, {confidence}, {range}

Example usage:

    Default template:
    bandit -r examples/ --format custom --msg-template \
    "{abspath}:{line}: {test_id}[bandit]: {severity}: {msg}"

    Provides same output as:
    bandit -r examples/ --format custom

    Tags can also be formatted in python string.format() style:
    bandit -r examples/ --format custom --msg-template \
    "{relpath:20.20s}: {line:03}: {test_id:^8}: DEFECT: {msg:>20}"

    See python documentation for more information about formatting style:
    https://docs.python.org/3/library/string.html

The following tests were discovered and loaded:
-----------------------------------------------
  B101  assert_used
  B102  exec_used
  B103  set_bad_file_permissions
  B104  hardcoded_bind_all_interfaces
  B105  hardcoded_password_string
  B106  hardcoded_password_funcarg
  B107  hardcoded_password_default
  B108  hardcoded_tmp_directory
  B110  try_except_pass
  B112  try_except_continue
  B201  flask_debug_true
  B301  pickle
  B302  marshal
  B303  md5
  B304  ciphers
  B305  cipher_modes
  B306  mktemp_q
  B307  eval
  B308  mark_safe
  B309  httpsconnection
  B310  urllib_urlopen
  B311  random
  B312  telnetlib
  B313  xml_bad_cElementTree
  B314  xml_bad_ElementTree
  B315  xml_bad_expatreader
  B316  xml_bad_expatbuilder
  B317  xml_bad_sax
  B318  xml_bad_minidom
  B319  xml_bad_pulldom
  B320  xml_bad_etree
  B321  ftplib
  B323  unverified_context
  B324  hashlib_new_insecure_functions
  B325  tempnam
  B401  import_telnetlib
  B402  import_ftplib
  B403  import_pickle
  B404  import_subprocess
  B405  import_xml_etree
  B406  import_xml_sax
  B407  import_xml_expat
  B408  import_xml_minidom
  B409  import_xml_pulldom
  B410  import_lxml
  B411  import_xmlrpclib
  B412  import_httpoxy
  B413  import_pycrypto
  B501  request_with_no_cert_validation
  B502  ssl_with_bad_version
  B503  ssl_with_bad_defaults
  B504  ssl_with_no_version
  B505  weak_cryptographic_key
  B506  yaml_load
  B507  ssh_no_host_key_verification
  B601  paramiko_calls
  B602  subprocess_popen_with_shell_equals_true
  B603  subprocess_without_shell_equals_true
  B604  any_other_function_with_shell_equals_true
  B605  start_process_with_a_shell
  B606  start_process_with_no_shell
  B607  start_process_with_partial_path
  B608  hardcoded_sql_expressions
  B609  linux_commands_wildcard_injection
  B610  django_extra_used
  B611  django_rawsql_used
  B701  jinja2_autoescape_false
  B702  use_of_mako_templates
  B703  django_mark_safe
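The list of tests above shows what counts as a "dangerous function"; commonly used modules such as subprocess and random are both included. subprocess is listed because it can invoke a shell, while random is listed because it is a pseudo-random generator (briefly: the secrets module is now generally recommended for security-sensitive random values, although strictly speaking only the measurement of a quantum superposition can produce truly random numbers).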
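The difference between the two modules can be seen in a few lines. This is a minimal sketch using only the standard library; the seed value and the sizes are arbitrary choices for illustration:

```python
import random
import secrets

# random is a deterministic pseudo-random generator: re-using a seed
# reproduces the same sequence, which is why bandit flags it (B311).
random.seed(1234)
first = [random.randint(0, 9) for _ in range(5)]
random.seed(1234)
second = [random.randint(0, 9) for _ in range(5)]
print(first == second)

# secrets draws from the operating system's randomness source and
# offers no way to seed it.
token = secrets.token_hex(16)   # 16 random bytes as 32 hex characters
pin = secrets.randbelow(10000)  # integer in the range [0, 10000)
print(len(token), 0 <= pin < 10000)
```

For simulations and tests, reproducibility is a feature and random is fine; for tokens, passwords, and keys, secrets is the safer default.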
Common bandit usage
1. Scan a py file directly:
[dechin@dechin-manjaro bandit_test]$ bandit subprocess_Popen.py
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    running on Python 3.8.5
[node_visitor]  INFO    Unable to find qualified name for module: subprocess_Popen.py
Run started:2021-01-26 15:31:00.425603

Test results:
>> Issue: [B404:blacklist] Consider possible security implications associated with subprocess module.
   Severity: Low   Confidence: High
   Location: subprocess_Popen.py:3
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_imports.html#b404-import-subprocess
2
3       import subprocess
4       import uuid

--------------------------------------------------
>> Issue: [B602:subprocess_popen_with_shell_equals_true] subprocess call with shell=True identified, security issue.
   Severity: High   Confidence: High
   Location: subprocess_Popen.py:6
   More Info: https://bandit.readthedocs.io/en/latest/plugins/b602_subprocess_popen_with_shell_equals_true.html
5
6       subprocess.Popen('touch ' + str(uuid.uuid1()) +'.txt', shell = True)

--------------------------------------------------

Code scanned:
        Total lines of code: 3
        Total lines skipped (#nosec): 0

Run metrics:
        Total issues (by severity):
                Undefined: 0.0
                Low: 1.0
                Medium: 0.0
                High: 1.0
        Total issues (by confidence):
                Undefined: 0.0
                Low: 0.0
                Medium: 0.0
                High: 2.0
Files skipped (0):
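Scanning subprocess_Popen.py, the file we created that calls the dangerous subprocess module, identifies the "dangerous functions" in it. Note that the Issue ID for the Popen call is B602, rated Severity: High Confidence: High (the import itself is reported separately as B404, rated Severity: Low Confidence: High). But if we use bandit to scan bad.py, which reaches the dangerous function by bridging through another module's import, the result is different: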
[dechin@dechin-manjaro bandit_test]$ bandit bad.py
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    running on Python 3.8.5
[node_visitor]  INFO    Unable to find qualified name for module: bad.py
Run started:2021-01-26 15:30:47.370468

Test results:
>> Issue: [B404:blacklist] Consider possible security implications associated with subprocess module.
   Severity: Low   Confidence: High
   Location: bad.py:3
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_imports.html#b404-import-subprocess
2
3       from subprocess_Popen import subprocess as subprocess
4
5       subprocess.Popen('touch bad.txt', shell = True)

--------------------------------------------------
>> Issue: [B604:any_other_function_with_shell_equals_true] Function call with shell=True parameter identified, possible security issue.
   Severity: Medium   Confidence: Low
   Location: bad.py:5
   More Info: https://bandit.readthedocs.io/en/latest/plugins/b604_any_other_function_with_shell_equals_true.html
4
5       subprocess.Popen('touch bad.txt', shell = True)

--------------------------------------------------

Code scanned:
        Total lines of code: 2
        Total lines skipped (#nosec): 0

Run metrics:
        Total issues (by severity):
                Undefined: 0.0
                Low: 1.0
                Medium: 1.0
                High: 0.0
        Total issues (by confidence):
                Undefined: 0.0
                Low: 1.0
                Medium: 0.0
                High: 1.0
Files skipped (0):
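Note that although this script does the same thing as the previous example, the Issue ID here is B604 and the rating has become Severity: Medium Confidence: Low. What matters is not the new rating itself but the fact that it changed. bandit analyzes each file statically, matching the names that appear in that file's source rather than following imports at run time, so here it no longer recognizes the call as the standard subprocess.Popen and falls back to the generic shell=True check. For this kind of second-hand call bandit therefore cannot always identify the dangerous function accurately, and after bridging it may even fail to flag the risky call at all.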
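Why a per-file static analysis sees the two scripts differently can be illustrated with the standard ast module. This is a simplified sketch written for this article, not bandit's actual implementation; the function name qualified_call_names is an assumption:

```python
import ast


def qualified_call_names(source):
    """Return dotted call names, expanded only through this file's imports."""
    tree = ast.parse(source)
    aliases = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.asname:
                    aliases[alias.asname] = alias.name
                else:
                    # `import a.b` binds only the top-level name `a`.
                    top = alias.name.split(".")[0]
                    aliases[top] = top
        elif isinstance(node, ast.ImportFrom) and node.module:
            for alias in node.names:
                bound = alias.asname or alias.name
                aliases[bound] = f"{node.module}.{alias.name}"

    names = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            parts = []
            func = node.func
            while isinstance(func, ast.Attribute):
                parts.append(func.attr)
                func = func.value
            if isinstance(func, ast.Name):
                parts.append(aliases.get(func.id, func.id))
                names.append(".".join(reversed(parts)))
    return names


direct = "import subprocess\nsubprocess.Popen('whoami', shell=True)\n"
bridged = (
    "from subprocess_Popen import subprocess as subprocess\n"
    "subprocess.Popen('touch bad.txt', shell=True)\n"
)
print(qualified_call_names(direct))
print(qualified_call_names(bridged))
```

Seen only through the file's own imports, the bridged call resolves to a name under subprocess_Popen instead of subprocess.Popen, so a rule keyed on the exact name subprocess.Popen would not match it.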
2. Scan all py files in a directory and write the result to a txt file
[dechin@dechin-manjaro bandit_test]$ bandit *.py -o test_bandit.txt -f txt
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    running on Python 3.8.5
[node_visitor]  INFO    Unable to find qualified name for module: bad.py
[node_visitor]  INFO    Unable to find qualified name for module: subprocess_Popen.py
[text]  INFO    Text output written to file: test_bandit.txt
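This example scans all py files in the current directory, namely bad.py and subprocess_Popen.py, and saves the result to test_bandit.txt. We do not show the contents of the txt file here; it is essentially the two results from the previous section combined.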
3. Scan py files in nested subdirectories and write the result to an html file
Suppose we have the following directory structure to scan:
[dechin@dechin-manjaro bandit_test]$ tree
.
├── bad.py
├── bad.txt
├── level2
│   └── test_random.py
├── subprocess_Popen.py
├── test_bandit.html
└── test_bandit.txt

1 directory, 6 files
[dechin@dechin-manjaro bandit_test]$ cat level2/test_random.py
# test_bandit.py

import random

a = random.random()
We can run the following command in the current directory:
[dechin@dechin-manjaro bandit_test]$ bandit -r . -f html -o test_bandit.html
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    running on Python 3.8.5
[html]  INFO    HTML output written to file: test_bandit.html
The result is a test_bandit.html file, whose contents are shown in the figure below:
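4. Use a configuration file to disable certain Issues
Create a .bandit file in the directory where the scan is run; the following configuration skips the B404 check: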
[bandit]
skips: B404
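The .bandit file above is INI-style, so its contents can be sanity-checked with the standard configparser module. This is a minimal sketch; whether bandit parses the file in exactly this way is an assumption, and the helper name skipped_tests is made up for this example:

```python
import configparser

BANDIT_INI = """\
[bandit]
skips: B404
"""


def skipped_tests(ini_text):
    """Return the test IDs listed under `skips` in the [bandit] section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    raw = parser.get("bandit", "skips", fallback="")
    return [item.strip() for item in raw.split(",") if item.strip()]


print(skipped_tests(BANDIT_INI))
```

Several IDs can be given as a comma-separated list, matching the -s option described in the help text.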
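The scan result is shown in the figure below; the B404 Issues no longer appear in the list:
5. Bypass the bandit audit directly in the py file
Append the following comment after the risky call in the py file to be scanned, and bandit will skip it during the audit: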
# bad.py

from subprocess_Popen import subprocess as sb

sb.Popen('touch bad.txt', shell = 1) # nosec
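In the final audit result B604 has disappeared as well, as shown in the figure below. This example also shows that bandit is not a security protection tool. It only performs a fairly preliminary review of whether Python code follows safe function usage conventions, and whether the reported problems get fixed is ultimately up to the developers.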
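Since a # nosec comment silences findings, reviewers may want a list of where such comments occur (the help text also offers --ignore-nosec to disregard them during a scan). The following is a minimal sketch using the standard tokenize module; the function name nosec_lines is an assumption made for this example:

```python
import io
import tokenize


def nosec_lines(source):
    """Return the line numbers that carry a `nosec` comment."""
    lines = []
    for token in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only real comments count; "# nosec" inside a string literal does not.
        if token.type == tokenize.COMMENT and "nosec" in token.string:
            lines.append(token.start[0])
    return lines


BAD_PY = (
    "# bad.py\n"
    "\n"
    "from subprocess_Popen import subprocess as sb\n"
    "\n"
    "sb.Popen('touch bad.txt', shell = 1) # nosec\n"
)
print(nosec_lines(BAD_PY))
```

Working on tokens rather than raw text avoids false matches on string literals that merely contain the word.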
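A simple bandit performance test
Python is generally not a fast language, so bandit, which is itself written in Python, may also be slow. Let us measure where its performance actually stands. First we create a py file of 10000 lines consisting entirely of dangerous calls: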
# gen.py

# Write one import line followed by 10000 identical shell=True calls.
with open('test_bandit_power.py', 'w') as py_file:
    py_file.write('import subprocess as sb\n')
    for i in range(10000):
        py_file.write("sb.Popen('whoami', shell = 1)\n")
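Running python3 gen.py generates test_bandit_power.py, a file with 10000 lines of dangerous calls (plus the import line), about 300 KB in size. Scanning this single file with bandit takes noticeably long, and because every line is flagged the resulting report is very large: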
[dechin@dechin-manjaro bandit_test]$ time bandit test_bandit_power.py -f html -o test_power.html
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    running on Python 3.8.5
[node_visitor]  INFO    Unable to find qualified name for module: test_bandit_power.py
[html]  INFO    HTML output written to file: test_power.html

real    0m6.239s
user    0m6.082s
sys     0m0.150s
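A rough estimate: if 10000 lines take about 6 s to scan, then a larger project with 1000000+ lines could take 10 min or more, assuming scan time grows linearly with the number of lines. That is not extremely long, but for a large project it is certainly not a very efficient choice.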
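The estimate is a plain linear extrapolation from the single measurement above. Here is a minimal sketch of the arithmetic; proportional scaling is an assumption, and real projects, where most lines trigger no finding, may behave differently:

```python
def extrapolate_scan_seconds(measured_seconds, measured_lines, target_lines):
    """Linear extrapolation: assumes scan time is proportional to line count."""
    return measured_seconds / measured_lines * target_lines


# 6.239 s is the `real` time measured above for the 10000-line file.
seconds = extrapolate_scan_seconds(6.239, 10_000, 1_000_000)
print(f"{seconds:.0f} s, about {seconds / 60:.1f} min")
```

With the measured 6.239 s this gives roughly 624 s, a little over 10 minutes, for one million lines.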
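Summary
In development projects with high security requirements, the use of dangerous functions such as subprocess may be prohibited. bandit is meant to scan code and automatically give advice on the use of unsafe functions; whether to adopt that advice depends on the needs of each project's maintainers. Our test also suggests that bandit's performance in practice is not ideal, so we do not recommend it for large projects; if it must be used, consider configuring it selectively.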
Copyright notice
This article was first published at: https://www.cnblogs.com/dechinphy/p/bandit.html
Author ID: DechinPhy
More original articles: https://www.cnblogs.com/dechinphy/