Python Iterables, Iterators, and Generators Explained
1. Concepts
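An iterable is an object that can be iterated over. The built-in iter() function returns its iterator, and to support this the iterable implements an __iter__ method that returns the associated iterator.
An iterator does the actual element-by-element traversal. It implements __next__ to return the next element on each call, and it also implements __iter__ so that it can be used wherever an iterable is expected.
A generator is a kind of iterator that produces its data lazily: each element is created only when it is requested.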
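As a minimal sketch of these three roles, the snippet below uses a plain list as the iterable, the object returned by iter() as the iterator, and a small generator function. The names `numbers` and `countdown` are illustrative and do not come from the original article.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]            # a list is iterable, but it is not an iterator
it = iter(numbers)             # iter() returns the list's iterator

print(isinstance(numbers, Iterable), isinstance(numbers, Iterator))  # True False
print(isinstance(it, Iterable), isinstance(it, Iterator))            # True True
print(iter(it) is it)          # True: an iterator's __iter__ returns itself

first = next(it)               # next() calls it.__next__()
rest = list(it)                # consumes the remaining elements
print(first, rest)             # 1 [2, 3]


def countdown(n):
    """Hypothetical generator function: yields n, n-1, ..., 1 lazily."""
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)
print(isinstance(gen, Iterator))   # True: a generator is an iterator
print(list(gen))                   # [3, 2, 1]
```

The isinstance checks rely on the abstract base classes in collections.abc, which recognize any object that defines the relevant special methods.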
2. Iterability of Sequences
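Python's built-in sequences can be iterated with for. The interpreter calls iter() to obtain an iterator for the sequence. Because iter() also supports objects that implement only __getitem__, it automatically creates an iterator that fetches items by index, starting from 0, until IndexError is raised.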
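The example below iterates over such a class with both a for loop and explicit next() calls, then disassembles the function to show the resulting bytecode: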
    import re
    from dis import dis


    class WordAnalyzer:
        reg_word = re.compile(r'\w+')

        def __init__(self, text):
            self.words = self.__class__.reg_word.findall(text)

        def __getitem__(self, index):
            # Only __getitem__ is defined; iter() falls back to index-based access.
            return self.words[index]


    def iter_word_analyzer():
        wa = WordAnalyzer('this is mango word analyzer')
        print('start for wa')
        for w in wa:
            print(w)

        print('start while wa_iter')
        wa_iter = iter(wa)
        while True:
            try:
                print(next(wa_iter))
            except StopIteration as e:  # e is unused; kept so the source matches the bytecode below
                break


    iter_word_analyzer()
    dis(iter_word_analyzer)

    # start for wa
    # this
    # is
    # mango
    # word
    # analyzer
    # start while wa_iter
    # this
    # is
    # mango
    # word
    # analyzer
    #
    # Bytecode printed by dis. The opcodes and offsets appear to come from
    # CPython 3.9 and will differ in other versions.
    #  15           0 LOAD_GLOBAL              0 (WordAnalyzer)
    #               2 LOAD_CONST               1 ('this is mango word analyzer')
    #               4 CALL_FUNCTION            1
    #               6 STORE_FAST               0 (wa)
    #
    #  16           8 LOAD_GLOBAL              1 (print)
    #              10 LOAD_CONST               2 ('start for wa')
    #              12 CALL_FUNCTION            1
    #              14 POP_TOP
    #
    #  17          16 LOAD_FAST                0 (wa)
    #              18 GET_ITER
    #         >>   20 FOR_ITER                12 (to 34)
    #              22 STORE_FAST               1 (w)
    #
    #  18          24 LOAD_GLOBAL              1 (print)
    #              26 LOAD_FAST                1 (w)
    #              28 CALL_FUNCTION            1
    #              30 POP_TOP
    #              32 JUMP_ABSOLUTE           20
    #
    #  20     >>   34 LOAD_GLOBAL              1 (print)
    #              36 LOAD_CONST               3 ('start while wa_iter')
    #              38 CALL_FUNCTION            1
    #              40 POP_TOP
    #
    #  21          42 LOAD_GLOBAL              2 (iter)
    #              44 LOAD_FAST                0 (wa)
    #              46 CALL_FUNCTION            1
    #              48 STORE_FAST               2 (wa_iter)
    #
    #  23     >>   50 SETUP_FINALLY           16 (to 68)
    #
    #  24          52 LOAD_GLOBAL              1 (print)
    #              54 LOAD_GLOBAL              3 (next)
    #              56 LOAD_FAST                2 (wa_iter)
    #              58 CALL_FUNCTION            1
    #              60 CALL_FUNCTION            1
    #              62 POP_TOP
    #              64 POP_BLOCK
    #              66 JUMP_ABSOLUTE           50
    #
    #  25     >>   68 DUP_TOP
    #              70 LOAD_GLOBAL              4 (StopIteration)
    #              72 JUMP_IF_NOT_EXC_MATCH   114
    #              74 POP_TOP
    #              76 STORE_FAST               3 (e)
    #              78 POP_TOP
    #              80 SETUP_FINALLY           24 (to 106)
    #
    #  26          82 POP_BLOCK
    #              84 POP_EXCEPT
    #              86 LOAD_CONST               0 (None)
    #              88 STORE_FAST               3 (e)
    #              90 DELETE_FAST              3 (e)
    #              92 JUMP_ABSOLUTE          118
    #              94 POP_BLOCK
    #              96 POP_EXCEPT
    #              98 LOAD_CONST               0 (None)
    #             100 STORE_FAST               3 (e)
    #             102 DELETE_FAST              3 (e)
    #             104 JUMP_ABSOLUTE           50
    #         >>  106 LOAD_CONST               0 (None)
    #             108 STORE_FAST               3 (e)
    #             110 DELETE_FAST              3 (e)
    #             112 RERAISE
    #         >>  114 RERAISE
    #             116 JUMP_ABSOLUTE           50
    #         >>  118 LOAD_CONST               0 (None)
    #             120 RETURN_VALUE
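One consequence of this __getitem__ fallback is worth a quick check. The sketch below (the class name `Letters` is hypothetical) shows that an object with only __getitem__ works with iter(), yet isinstance(obj, Iterable) reports False, because that check only looks for an __iter__ method. Calling iter() and catching TypeError is therefore the more reliable test of iterability.

```python
from collections.abc import Iterable


class Letters:
    """Hypothetical sequence-like class that defines only __getitem__."""

    def __init__(self, text):
        self.text = text

    def __getitem__(self, index):
        # The IndexError raised past the end is what stops the fallback iterator.
        return self.text[index]


letters = Letters('abc')
print(isinstance(letters, Iterable))   # False: no __iter__ method
print(list(iter(letters)))             # ['a', 'b', 'c']

try:
    iter(42)                           # int has neither __iter__ nor __getitem__
except TypeError as exc:
    print('not iterable:', exc)
```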
3. The Classic Iterator Pattern
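A standard iterator implements two methods: __next__, which returns the next element, and __iter__, which simply returns self.
When all elements have been consumed, the iterator raises StopIteration. Python's built-in constructs such as for loops, list comprehensions, and tuple unpacking handle this exception automatically.
Implementing __iter__ on the iterator is mainly a convenience: it lets the iterator itself be used anywhere an iterable is expected, for example directly in a for loop.
An iterator can be traversed only once. To iterate again, call iter() on the iterable to get a new iterator. Each iterator therefore has to maintain its own internal state, which means an iterable container should not also act as its own iterator.
In terms of the classic object-oriented design pattern, the iterable can produce an associated iterator at any time, while the iterator is responsible for the actual traversal of the elements.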
    import re


    class WordAnalyzer:
        reg_word = re.compile(r'\w+')

        def __init__(self, text):
            self.words = self.__class__.reg_word.findall(text)

        def __iter__(self):
            # Return a fresh iterator on every call.
            return WordAnalyzerIterator(self.words)


    class WordAnalyzerIterator:
        def __init__(self, words):
            self.words = words
            self.index = 0  # per-iterator traversal state

        def __iter__(self):
            return self

        def __next__(self):
            try:
                word = self.words[self.index]
            except IndexError:
                raise StopIteration()
            self.index += 1
            return word


    def iter_word_analyzer():
        wa = WordAnalyzer('this is mango word analyzer')
        print('start for wa')
        for w in wa:
            print(w)

        print('start while wa_iter')
        wa_iter = iter(wa)
        while True:
            try:
                print(next(wa_iter))
            except StopIteration:
                break


    iter_word_analyzer()

    # start for wa
    # this
    # is
    # mango
    # word
    # analyzer
    # start while wa_iter
    # this
    # is
    # mango
    # word
    # analyzer
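To illustrate why the iterable and its iterator are kept separate, here is a small sketch of the same two-class design with shorter, hypothetical names (`Words` and `WordsIterator`). It checks that each call to iter() yields an independent cursor, that an exhausted iterator stays exhausted, and that tuple unpacking absorbs StopIteration on its own.

```python
class WordsIterator:
    """Iterator: holds the traversal state (the current index)."""

    def __init__(self, words):
        self.words = words
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= len(self.words):
            raise StopIteration
        word = self.words[self.index]
        self.index += 1
        return word


class Words:
    """Iterable: builds a fresh iterator on every __iter__ call."""

    def __init__(self, text):
        self.words = text.split()

    def __iter__(self):
        return WordsIterator(self.words)


words = Words('alpha beta gamma')

it1, it2 = iter(words), iter(words)
print(next(it1), next(it1))    # alpha beta
print(next(it2))               # alpha: it2 has its own index

a, b, c = words                # unpacking handles StopIteration internally
print(a, b, c)                 # alpha beta gamma

print(list(it1))               # ['gamma']
print(list(it1))               # []: an exhausted iterator stays exhausted
print(list(words))             # ['alpha', 'beta', 'gamma']: the iterable can be traversed again
```

If `Words` returned self from __iter__ and kept the index on itself, `it1` and `it2` would share one cursor and the second traversal would come back empty.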
4. Generators Are Iterators
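A generator is created by calling a generator function, which is a factory function whose body contains yield.
A generator is itself an iterator: it can be advanced with next(), and it raises StopIteration once it is exhausted.
When a generator runs, it pauses at each yield statement and returns the value of the expression to the right of yield.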
    def gen_func():
        print('first yield')
        yield 'first'
        print('second yield')
        yield 'second'


    print(gen_func)
    g = gen_func()
    print(g)
    for val in g:
        print(val)

    g = gen_func()
    print(next(g))
    print(next(g))
    print(next(g))  # the generator is exhausted, so this call raises StopIteration

    # <function gen_func at 0x7f1198175040>
    # <generator object gen_func at 0x7f1197fb6cf0>
    # first yield
    # first
    # second yield
    # second
    # first yield
    # first
    # second yield
    # second
    # StopIteration
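The pause-and-resume behaviour can be made visible by recording when each part of the body runs. In this sketch (the `events` list and the `traced` function are hypothetical names), nothing in the body executes until the first next() call, and each call runs only as far as the following yield.

```python
events = []


def traced():
    events.append('before first yield')
    yield 1
    events.append('between yields')
    yield 2
    events.append('after last yield')


g = traced()
print(events)           # []: calling the function only creates the generator
print(iter(g) is g)     # True: a generator is its own iterator

print(next(g))          # 1
print(events)           # ['before first yield']

print(next(g))          # 2
print(events)           # ['before first yield', 'between yields']

print(next(g, 'done'))  # done: the default is returned instead of raising StopIteration
print(events[-1])       # after last yield
```

Passing a default as the second argument to next() is a convenient way to avoid a try/except around the final call.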
We can write __iter__ itself as a generator function:
    import re


    class WordAnalyzer:
        reg_word = re.compile(r'\w+')

        def __init__(self, text):
            self.words = self.__class__.reg_word.findall(text)

        def __iter__(self):
            # A generator function: each call returns a new generator (an iterator).
            for word in self.words:
                yield word


    def iter_word_analyzer():
        wa = WordAnalyzer('this is mango word analyzer')
        print('start for wa')
        for w in wa:
            print(w)

        print('start while wa_iter')
        wa_iter = iter(wa)
        while True:
            try:
                print(next(wa_iter))
            except StopIteration:
                break


    iter_word_analyzer()

    # start for wa
    # this
    # is
    # mango
    # word
    # analyzer
    # start while wa_iter
    # this
    # is
    # mango
    # word
    # analyzer
5. Implementing a Lazy Iterator
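A key strength of iterators is that __next__ produces elements one at a time, which makes it feasible to traverse very large data containers.
Our earlier implementations called the pattern's findall method during initialization and built the full list of words up front, which is not ideal. We can use finditer instead, so that matches are produced only as the iteration proceeds.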
    import re


    class WordAnalyzer:
        reg_word = re.compile(r'\w+')

        def __init__(self, text):
            # Keep only the raw text; no words are extracted up front.
            self.text = text

        def __iter__(self):
            # finditer returns a lazy iterator of match objects.
            g = self.__class__.reg_word.finditer(self.text)
            print(g)
            for match in g:
                yield match.group()


    def iter_word_analyzer():
        wa = WordAnalyzer('this is mango word analyzer')
        print('start for wa')
        for w in wa:
            print(w)

        print('start while wa_iter')
        wa_iter = iter(wa)
        # A second, independent generator. Its body never runs because
        # next() is never called on it, so it prints nothing.
        wa_iter1 = iter(wa)
        while True:
            try:
                print(next(wa_iter))
            except StopIteration:
                break


    iter_word_analyzer()

    # start for wa
    # <callable_iterator object at 0x7feed103e040>
    # this
    # is
    # mango
    # word
    # analyzer
    # start while wa_iter
    # <callable_iterator object at 0x7feed103e040>
    # this
    # is
    # mango
    # word
    # analyzer
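One practical payoff of the lazy version is that a caller can stop early without paying for a scan of the whole text. The following is a sketch under that assumption, applying itertools.islice to finditer directly; the sample text and the `first_words` helper are hypothetical and not part of the original article.

```python
import re
from itertools import islice

WORD = re.compile(r'\w+')


def first_words(text, n):
    """Return at most n words, scanning only as far into text as needed."""
    matches = WORD.finditer(text)                # lazy: matching happens on demand
    return [m.group() for m in islice(matches, n)]


big_text = 'spam eggs ham ' * 100_000            # 300,000 words
print(first_words(big_text, 3))                  # ['spam', 'eggs', 'ham']
print(first_words('only two', 5))                # ['only', 'two']
```

With findall, the first call would have built a 300,000-element list before the first three words could be returned.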
6. Simplifying the Lazy Iterator with a Generator Expression
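A generator expression is a declarative way to define a generator. Its syntax resembles a list comprehension, except that the elements are produced lazily.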
    def gen_func():
        print('first yield')
        yield 'first'
        print('second yield')
        yield 'second'


    # The list comprehension runs the generator to completion immediately.
    l = [x for x in gen_func()]
    for x in l:
        print(x)

    print()

    # The generator expression runs the body only as each value is requested.
    ge = (x for x in gen_func())
    print(ge)
    for x in ge:
        print(x)

    # first yield
    # second yield
    # first
    # second
    #
    # <generator object <genexpr> at 0x7f78ff5dfd60>
    # first yield
    # first
    # second yield
    # second
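Because a generator expression never materializes its elements, it suits aggregations where only the final result matters. The sketch below compares the two container objects with sys.getsizeof, which measures only the list or generator object itself and not the integers it refers to. Exact byte counts vary by Python build, so none are shown. It also demonstrates that the result is single-pass.

```python
import sys

squares_list = [n * n for n in range(100_000)]   # all elements built up front
squares_gen = (n * n for n in range(100_000))    # nothing computed yet

print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

# When a generator expression is the sole argument, the extra parentheses can be dropped.
total = sum(n * n for n in range(100_000))
print(total == sum(squares_list))                # True

print(sum(squares_gen) == total)                 # True: the first pass consumes it
print(sum(squares_gen))                          # 0: the second pass finds it empty
```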
Implementing the word analyzer with a generator expression:
    import re


    class WordAnalyzer:
        reg_word = re.compile(r'\w+')

        def __init__(self, text):
            self.text = text

        def __iter__(self):
            # __iter__ is now an ordinary method that returns a generator
            # expression, replacing the explicit for/yield loop.
            ge = (match.group() for match in self.__class__.reg_word.finditer(self.text))
            print(ge)
            return ge


    def iter_word_analyzer():
        wa = WordAnalyzer('this is mango word analyzer')
        print('start for wa')
        for w in wa:
            print(w)

        print('start while wa_iter')
        wa_iter = iter(wa)
        while True:
            try:
                print(next(wa_iter))
            except StopIteration:
                break


    iter_word_analyzer()

    # start for wa
    # <generator object WordAnalyzer.__iter__.<locals>.<genexpr> at 0x7f4178189200>
    # this
    # is
    # mango
    # word
    # analyzer
    # start while wa_iter
    # <generator object WordAnalyzer.__iter__.<locals>.<genexpr> at 0x7f4178189200>
    # this
    # is
    # mango
    # word
    # analyzer