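Python Iterables, Iterators, and Generators Explained

1. Concepts

An iterable is an object that can be iterated over. Passing it to the built-in iter function returns its iterator; to support this, an iterable implements __iter__, which returns the associated iterator.

An iterator does the actual element-by-element traversal. It implements __next__ to return the underlying elements one at a time, and it also implements __iter__ so that it can be used wherever an iterable is expected.

A generator is a kind of iterator that produces its data lazily: each element is generated only when it is requested.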
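As a minimal sketch of these three roles, the snippet below uses only built-ins. The variable names are illustrative and do not come from the examples later in the article.

```python
numbers = [10, 20, 30]           # a list is an iterable
it = iter(numbers)               # iter() asks the iterable for an iterator

first = next(it)                 # next() calls the iterator's __next__
same_object = iter(it) is it     # an iterator's __iter__ returns itself

# A generator expression is an iterator that computes values on demand.
squares = (n * n for n in numbers)
first_square = next(squares)     # only the first square has been computed so far

print(first, same_object, first_square)
```

Each call to next advances the iterator by exactly one element, so the remaining elements of `it` at this point are 20 and 30.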

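2. Iterability of sequences

Python's built-in sequences can be iterated with for: the interpreter calls iter to obtain the sequence's iterator. Because iter also supports the sequence protocol, it automatically creates an iterator for any object that implements __getitem__, calling it with indexes starting from 0 until IndexError is raised.

The example below iterates over a class that defines only __getitem__, then disassembles the function to show how the for loop obtains its iterator: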

import re
from dis import dis

class WordAnalyzer:
    reg_word = re.compile(r'\w+')  # raw string, so \w is not read as a string escape

    def __init__(self, text):
        self.words = self.__class__.reg_word.findall(text)

    # No __iter__ here: iter() falls back to __getitem__ with indexes 0, 1, 2, ...
    def __getitem__(self, index):
        return self.words[index]

def iter_word_analyzer():
    wa = WordAnalyzer('this is mango word analyzer')
    print('start for wa')
    for w in wa:
        print(w)

    print('start while wa_iter')
    wa_iter = iter(wa)
    while True:
        try:
            print(next(wa_iter))
        except StopIteration as e:
            break

iter_word_analyzer()
dis(iter_word_analyzer)
# start for wa
# this
# is
# mango
# word
# analyzer
# start while wa_iter
# this
# is
# mango
# word
# analyzer
#
# Disassembly of iter_word_analyzer (opcode names and offsets vary between CPython versions).
# Note GET_ITER at offset 18: the for loop obtains its iterator the same way iter() does.
#  15           0 LOAD_GLOBAL              0 (WordAnalyzer)
#               2 LOAD_CONST               1 ('this is mango word analyzer')
#               4 CALL_FUNCTION            1
#               6 STORE_FAST               0 (wa)
# 
#  16           8 LOAD_GLOBAL              1 (print)
#              10 LOAD_CONST               2 ('start for wa')
#              12 CALL_FUNCTION            1
#              14 POP_TOP
# 
#  17          16 LOAD_FAST                0 (wa)
#              18 GET_ITER
#         >>   20 FOR_ITER                12 (to 34)
#              22 STORE_FAST               1 (w)
# 
#  18          24 LOAD_GLOBAL              1 (print)
#              26 LOAD_FAST                1 (w)
#              28 CALL_FUNCTION            1
#              30 POP_TOP
#              32 JUMP_ABSOLUTE           20
# 
#  20     >>   34 LOAD_GLOBAL              1 (print)
#              36 LOAD_CONST               3 ('start while wa_iter')
#              38 CALL_FUNCTION            1
#              40 POP_TOP
# 
#  21          42 LOAD_GLOBAL              2 (iter)
#              44 LOAD_FAST                0 (wa)
#              46 CALL_FUNCTION            1
#              48 STORE_FAST               2 (wa_iter)
# 
#  23     >>   50 SETUP_FINALLY           16 (to 68)
# 
#  24          52 LOAD_GLOBAL              1 (print)
#              54 LOAD_GLOBAL              3 (next)
#              56 LOAD_FAST                2 (wa_iter)
#              58 CALL_FUNCTION            1
#              60 CALL_FUNCTION            1
#              62 POP_TOP
#              64 POP_BLOCK
#              66 JUMP_ABSOLUTE           50
# 
#  25     >>   68 DUP_TOP
#              70 LOAD_GLOBAL              4 (StopIteration)
#              72 JUMP_IF_NOT_EXC_MATCH   114
#              74 POP_TOP
#              76 STORE_FAST               3 (e)
#              78 POP_TOP
#              80 SETUP_FINALLY           24 (to 106)
# 
#  26          82 POP_BLOCK
#              84 POP_EXCEPT
#              86 LOAD_CONST               0 (None)
#              88 STORE_FAST               3 (e)
#              90 DELETE_FAST              3 (e)
#              92 JUMP_ABSOLUTE          118
#              94 POP_BLOCK
#              96 POP_EXCEPT
#              98 LOAD_CONST               0 (None)
#             100 STORE_FAST               3 (e)
#             102 DELETE_FAST              3 (e)
#             104 JUMP_ABSOLUTE           50
#         >>  106 LOAD_CONST               0 (None)
#             108 STORE_FAST               3 (e)
#             110 DELETE_FAST              3 (e)
#             112 RERAISE
#         >>  114 RERAISE
#             116 JUMP_ABSOLUTE           50
#         >>  118 LOAD_CONST               0 (None)
#             120 RETURN_VALUE
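One consequence of the __getitem__ fallback is worth a small sketch: iter() accepts such an object, but the abstract base class collections.abc.Iterable does not recognize it, because that check only looks for __iter__. The Letters class below is a hypothetical example, not part of the article's code.

```python
from collections.abc import Iterable

class Letters:
    """Hypothetical sequence-like class with __getitem__ but no __iter__."""
    def __init__(self, text):
        self._text = text

    def __getitem__(self, index):
        return self._text[index]   # raises IndexError past the end

letters = Letters('abc')
collected = list(letters)                          # iter() falls back to __getitem__
passes_abc_check = isinstance(letters, Iterable)   # the ABC only looks for __iter__

print(collected, passes_abc_check)
```

For this reason, calling iter(obj) and catching TypeError is a more reliable test of iterability than the isinstance check.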

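3. The classic iterator pattern

A standard iterator implements two methods: __next__, which returns the next element, and __iter__, which simply returns self.

When an iterator has no more elements, it raises StopIteration. Python's built-in constructs such as for loops, list comprehensions, and tuple unpacking handle this exception automatically.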
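To illustrate, here is a small sketch with a hypothetical CountDown iterator. None of the consuming constructs needs a try/except around StopIteration.

```python
class CountDown:
    """Hypothetical iterator that counts down from start to 1."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration   # signals that the iterator is exhausted
        value = self.current
        self.current -= 1
        return value

as_list = [n for n in CountDown(3)]   # list comprehension
a, b, c = CountDown(3)                # tuple unpacking
total = sum(CountDown(3))             # built-in functions that consume iterables

print(as_list, (a, b, c), total)
```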

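Implementing __iter__ on the iterator is mainly a convenience: it lets the iterator itself be used anywhere an iterable is expected, for example directly in a for loop.

An iterator can be traversed only once. To iterate again, call iter on the iterable to obtain a new iterator. Each iterator therefore has to maintain its own internal state, which is why a container that must support repeated traversals should not act as its own iterator.

From the viewpoint of the classic object-oriented design pattern, the iterable can create an associated iterator at any time, and the iterator is responsible for the actual traversal of the elements.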
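The separation of roles can be seen with a plain list, as in this sketch (variable names are illustrative): two iterators over the same iterable keep independent positions, and an exhausted iterator stays exhausted.

```python
words = ['this', 'is', 'mango']

it1 = iter(words)
it2 = iter(words)            # a second, independent iterator

first_two = [next(it1), next(it1)]
other_first = next(it2)      # it2 is unaffected by it1's progress

rest = list(it1)             # consumes what is left of it1
again = list(it1)            # exhausted: nothing more to yield
fresh = list(iter(words))    # a new iterator starts from the beginning

print(first_two, other_first, rest, again, fresh)
```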

import re

class WordAnalyzer:
    """Iterable: builds a fresh iterator on every call to iter()."""
    reg_word = re.compile(r'\w+')

    def __init__(self, text):
        self.words = self.__class__.reg_word.findall(text)

    def __iter__(self):
        return WordAnalyzerIterator(self.words)

class WordAnalyzerIterator:
    """Iterator: tracks its own position in the word list."""
    def __init__(self, words):
        self.words = words
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        try:
            word = self.words[self.index]
        except IndexError:
            raise StopIteration()
        self.index += 1
        return word

def iter_word_analyzer():
    wa = WordAnalyzer('this is mango word analyzer')
    print('start for wa')
    for w in wa:
        print(w)
    print('start while wa_iter')
    wa_iter = iter(wa)
    while True:
        try:
            print(next(wa_iter))
        except StopIteration:
            break

iter_word_analyzer()
# start for wa
# this
# is
# mango
# word
# analyzer
# start while wa_iter
# this
# is
# mango
# word
# analyzer

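4. Generators are iterators

A generator is created by calling a generator function, which is a factory function whose body contains yield.

A generator is itself an iterator: it can be advanced with next, and it raises StopIteration once it is exhausted.

When a generator runs, it pauses at each yield statement and returns the value of the expression to the right of yield.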

def gen_func():
    print('first yield')
    yield 'first'
    print('second yield')
    yield 'second'
print(gen_func)
g = gen_func()
print(g)
for val in g:
    print(val)
g = gen_func()
print(next(g))
print(next(g))
print(next(g))  # third call: the generator is exhausted, so this raises StopIteration
# <function gen_func at 0x7f1198175040>
# <generator object gen_func at 0x7f1197fb6cf0>
# first yield
# first
# second yield
# second
# first yield
# first
# second yield
# second
# StopIteration
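The claim that a generator is an iterator can be checked directly. The sketch below uses a hypothetical countdown generator function and also shows that next accepts a default value, which avoids the uncaught StopIteration seen above.

```python
from collections.abc import Iterator

def countdown(n):
    """Hypothetical generator function yielding n, n-1, ..., 1."""
    while n > 0:
        yield n
        n -= 1

g = countdown(2)
is_iterator = isinstance(g, Iterator)   # generators provide __iter__ and __next__
returns_self = iter(g) is g             # like any iterator, __iter__ returns self

values = [next(g), next(g)]
after_end = next(g, 'done')             # the default is returned instead of raising

print(is_iterator, returns_self, values, after_end)
```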

We can write __iter__ itself as a generator function:

import re

class WordAnalyzer:
    reg_word = re.compile(r'\w+')

    def __init__(self, text):
        self.words = self.__class__.reg_word.findall(text)

    def __iter__(self):
        # Generator function: each call returns a new generator (an iterator).
        for word in self.words:
            yield word

def iter_word_analyzer():
    wa = WordAnalyzer('this is mango word analyzer')
    print('start for wa')
    for w in wa:
        print(w)
    print('start while wa_iter')
    wa_iter = iter(wa)
    while True:
        try:
            print(next(wa_iter))
        except StopIteration:
            break

iter_word_analyzer()
# start for wa
# this
# is
# mango
# word
# analyzer
# start while wa_iter
# this
# is
# mango
# word
# analyzer

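5. Implementing a lazy iterator

A major strength of iterators is that __next__ produces one element at a time, which makes it feasible to traverse very large data containers.

The earlier implementations called findall during initialization and built the complete list of words up front, which is not ideal. With finditer, the matches are produced only as we iterate.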

import re

class WordAnalyzer:
    reg_word = re.compile(r'\w+')

    def __init__(self, text):
        # Store only the text; no words are extracted until iteration starts.
        self.text = text

    def __iter__(self):
        # Generator function: this body does not run until the first next() call.
        g = self.__class__.reg_word.finditer(self.text)
        print(g)
        for match in g:
            yield match.group()

def iter_word_analyzer():
    wa = WordAnalyzer('this is mango word analyzer')
    print('start for wa')
    for w in wa:
        print(w)
    print('start while wa_iter')
    wa_iter = iter(wa)
    wa_iter1 = iter(wa)  # never advanced, so its body (and print) never runs
    while True:
        try:
            print(next(wa_iter))
        except StopIteration:
            break

iter_word_analyzer()
# start for wa
# <callable_iterator object at 0x7feed103e040>
# this
# is
# mango
# word
# analyzer
# start while wa_iter
# <callable_iterator object at 0x7feed103e040>
# this
# is
# mango
# word
# analyzer
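The practical benefit of laziness shows up when only part of the data is needed. The sketch below assumes a hypothetical LazyWords class that mirrors the lazy WordAnalyzer, and uses itertools.islice to take the first three words of a long text without collecting the rest.

```python
import re
from itertools import islice

class LazyWords:
    """Hypothetical stand-in for the lazy WordAnalyzer shown above."""
    pattern = re.compile(r'\w+')

    def __init__(self, text):
        self.text = text

    def __iter__(self):
        # finditer yields match objects one at a time as they are requested.
        for match in self.pattern.finditer(self.text):
            yield match.group()

big_text = 'mango ' * 100_000                        # 100,000 words
first_three = list(islice(LazyWords(big_text), 3))   # only three matches are consumed

print(first_three)
```

With the findall-based version, all 100,000 words would be stored in a list before the first one could be read.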

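6. Simplifying the lazy iterator with a generator expression

A generator expression is a declarative way to define a generator. Its syntax resembles a list comprehension, but its elements are produced lazily.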

def gen_func():
    print('first yield')
    yield 'first'
    print('second yield')
    yield 'second'
l = [x for x in gen_func()]
for x in l:
    print(x)
print()
ge = (x for x in gen_func())
print(ge)
for x in ge:
    print(x)
# first yield
# second yield
# first
# second
#
# <generator object <genexpr> at 0x7f78ff5dfd60>
# first yield
# first
# second yield
# second
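A rough way to see the difference in memory behavior is to compare the size of the two objects, as in this sketch. The exact byte counts reported by sys.getsizeof depend on the interpreter, so only the comparison matters here.

```python
import sys

n = 100_000
squares_list = [i * i for i in range(n)]   # all elements are built immediately
squares_gen = (i * i for i in range(n))    # nothing is computed yet

list_size = sys.getsizeof(squares_list)    # grows with n
gen_size = sys.getsizeof(squares_gen)      # small and independent of n

total = sum(squares_gen)                   # consumes the generator lazily
leftover = list(squares_gen)               # a generator can be consumed only once

print(gen_size < list_size, total == sum(squares_list), leftover)
```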

Implementing the word analyzer with a generator expression:

import re

class WordAnalyzer:
    reg_word = re.compile(r'\w+')

    def __init__(self, text):
        # self.words = self.__class__.reg_word.findall(text)
        self.text = text

    def __iter__(self):
        # Previous version, written as a generator function:
        # g = self.__class__.reg_word.finditer(self.text)
        # for match in g:
        #     yield match.group()
        # __iter__ is now a regular method that returns a generator expression.
        ge = (match.group() for match in self.__class__.reg_word.finditer(self.text))
        print(ge)
        return ge

def iter_word_analyzer():
    wa = WordAnalyzer('this is mango word analyzer')
    print('start for wa')
    for w in wa:
        print(w)
    print('start while wa_iter')
    wa_iter = iter(wa)
    while True:
        try:
            print(next(wa_iter))
        except StopIteration:
            break

iter_word_analyzer()
# start for wa
# <generator object WordAnalyzer.__iter__.<locals>.<genexpr> at 0x7f4178189200>
# this
# is
# mango
# word
# analyzer
# start while wa_iter
# <generator object WordAnalyzer.__iter__.<locals>.<genexpr> at 0x7f4178189200>
# this
# is
# mango
# word
# analyzer
 
