Python in Practice: MNIST Handwritten Digit Recognition Explained

Posted on 2022-01-05 by WalkonNet

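Dataset introduction

The MNIST dataset is a classic in machine learning. It consists of 60,000 training samples and 10,000 test samples, each a 28 × 28 pixel grayscale image of a handwritten digit, and it ships with Keras. This article builds the network with the Keras neural-network API bundled with TensorFlow (see the Keras Chinese documentation).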

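Before starting, recall the general machine-learning workflow (√ marks a step this article covers, × a step it does not):

1. Define the problem and collect a dataset (√)

2. Choose a measure of success (√)

3. Decide on an evaluation protocol (√)

4. Prepare the data (√)

5. Develop a model that beats a baseline (×)

6. Scale up the model (×)

7. Regularize the model and tune its parameters (×)

Choosing the last-layer activation function and the loss function

Now on to the main content.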

1. Data preprocessing

First we load the data with the mnist.load_data() function.

Here is its declaration in the source:

def load_data(path='mnist.npz'):
  """Loads the [MNIST dataset](http://yann.lecun.com/exdb/mnist/).

  This is a dataset of 60,000 28x28 grayscale images of the 10 digits,
  along with a test set of 10,000 images.
  More info can be found at the
  [MNIST homepage](http://yann.lecun.com/exdb/mnist/).


  Arguments:
      path: path where to cache the dataset locally
          (relative to `~/.keras/datasets`).

  Returns:
      Tuple of Numpy arrays: `(x_train, y_train), (x_test, y_test)`.
      **x_train, x_test**: uint8 arrays of grayscale image data with shapes
        (num_samples, 28, 28).

      **y_train, y_test**: uint8 arrays of digit labels (integers in range 0-9)
        with shapes (num_samples,).
  """

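As you can see, the docstring gives the dataset's download link and describes its size, dimensions, and data types. The function returns two tuples made up of four NumPy arrays.

We load the dataset, reshape it to the shape we want, and then normalize it.

In this step, Keras's built-in to_categorical() performs one-hot encoding: each label becomes an all-zero vector in which only the element at the label's index is 1.

e.g. col=10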

[0,1,9]-------->[ [1,0,0,0,0,0,0,0,0,0],
                  [0,1,0,0,0,0,0,0,0,0],
                  [0,0,0,0,0,0,0,0,0,1] ]

We can implement it by hand:

def one_hot(sequences, col):
    # One row per label, one column per class, initially all zeros
    results = np.zeros((len(sequences), col))
    for i, label in enumerate(sequences):
        # Each label is a single integer, so set exactly one element per row
        results[i, label] = 1
    return results
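The same encoding can also be written without an explicit loop. The sketch below is one possible NumPy variant, not the way to_categorical() is implemented; the function name one_hot_vectorized is made up for illustration.

```python
import numpy as np

def one_hot_vectorized(labels, num_classes):
    """One-hot encode integer labels by indexing rows of an identity matrix."""
    labels = np.asarray(labels, dtype=int)
    # Row k of the identity matrix is exactly the one-hot vector for class k.
    return np.eye(num_classes)[labels]

encoded = one_hot_vectorized([0, 1, 9], 10)
print(encoded.shape)  # (3, 10)
print(encoded[2])     # only the last element (index 9) is 1
```

Indexing the identity matrix trades a little memory (a num_classes × num_classes array) for brevity, which is negligible for 10 classes.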

The preprocessing routine is as follows:

def data_preprocess():
    (train_images, train_labels), (test_images, test_labels) = mnist.load_data()
    train_images = train_images.reshape((60000, 28, 28, 1))
    train_images = train_images.astype('float32') / 255
    #print(train_images[0])
    test_images = test_images.reshape((10000, 28, 28, 1))
    test_images = test_images.astype('float32') / 255

    train_labels = to_categorical(train_labels)
    test_labels = to_categorical(test_labels)
    return train_images,train_labels,test_images,test_labels
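The reshape and scaling steps can be checked without downloading anything. The sketch below runs them on a small synthetic batch; fake_images is random data standing in for MNIST (an assumption made only for this check), with the same dtype and value range the load_data() docstring describes.

```python
import numpy as np

# Synthetic stand-in for a batch of MNIST images: uint8 values in [0, 255].
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(8, 28, 28), dtype=np.uint8)

# Add the trailing channel axis expected by Conv2D, then scale to [0, 1].
batch = fake_images.reshape((-1, 28, 28, 1)).astype('float32') / 255

print(batch.shape, batch.dtype)
print(batch.min(), batch.max())
```

Using -1 for the first dimension lets NumPy infer the number of samples, so the same line works for both the 60,000-image training set and the 10,000-image test set.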

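2. Building the network

Here we build a convolutional neural network: a simple linear stack of convolution, pooling, and fully connected layers. A stack of purely linear layers still computes a linear operation, so adding layers does not enlarge the hypothesis space (the set of all possible linear transformations from input to output). A nonlinearity, or activation function, is therefore needed. relu is the most common activation function; prelu and elu are alternatives.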

def build_module():
    model = models.Sequential()
    # Layer 1: convolution; the first layer must specify input_shape
    model.add(layers.Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)))
    # Layer 2: max pooling
    model.add(layers.MaxPooling2D((2,2)))
    # Layer 3: convolution
    model.add(layers.Conv2D(64, (3,3), activation='relu'))
    # Layer 4: max pooling
    model.add(layers.MaxPooling2D((2,2)))
    # Layer 5: convolution
    model.add(layers.Conv2D(64, (3,3), activation='relu'))
    # Layer 6: Flatten, which unrolls the 3D tensor into a vector
    model.add(layers.Flatten())
    # Layer 7: fully connected
    model.add(layers.Dense(64, activation='relu'))
    # Layer 8: softmax, which performs the classification
    model.add(layers.Dense(10, activation='softmax'))
    return model
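The remark that stacked linear layers stay linear can be checked numerically. The sketch below uses small random matrices (W1, W2, and x are arbitrary illustrative values, not weights from the network in this article) to show that two linear layers collapse into one, and that inserting a ReLU breaks the symmetry every linear map has.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4,))
W1 = rng.normal(size=(5, 4))
W2 = rng.normal(size=(3, 5))

# Two stacked linear layers equal one linear layer with weights W2 @ W1.
stacked = W2 @ (W1 @ x)
collapsed = (W2 @ W1) @ x
print(np.allclose(stacked, collapsed))  # True

def relu(z):
    return np.maximum(z, 0.0)

def two_layer_relu(v):
    return W2 @ relu(W1 @ v)

# Any linear map f satisfies f(x) + f(-x) = 0; the ReLU network does not.
symmetry_gap = two_layer_relu(x) + two_layer_relu(-x)
print(symmetry_gap)
```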

Use model.summary() to inspect the structure of the network we built:
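The shapes and parameter counts that the summary should report can be worked out by hand. The sketch below does the arithmetic in plain Python, assuming the Keras defaults used in build_module(): 'valid' padding and stride 1 for Conv2D, and non-overlapping 2 × 2 windows for MaxPooling2D. The helper names are made up for illustration.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def conv_params(in_channels, out_channels, kernel=3):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels

def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units

size = 28
size = conv_out(size)    # Conv2D(32):   26 x 26 x 32
size = size // 2         # MaxPooling2D: 13 x 13 x 32
size = conv_out(size)    # Conv2D(64):   11 x 11 x 64
size = size // 2         # MaxPooling2D:  5 x 5 x 64
size = conv_out(size)    # Conv2D(64):    3 x 3 x 64
flat = size * size * 64  # Flatten:      576

total = (conv_params(1, 32) + conv_params(32, 64) + conv_params(64, 64)
         + dense_params(flat, 64) + dense_params(64, 10))
print(size, flat, total)  # 3 576 93322
```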

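3. Configuring the network

Once the network is built, one key step remains: configuring it. This covers the optimizer (the specific method by which gradient descent updates the parameters), the loss function (which measures the distance between the network's output and the target), and the evaluation metrics. All of these are passed as arguments to model.compile().

Let's look at the source of model.compile():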

  def compile(self,
              optimizer='rmsprop',
              loss=None,
              metrics=None,
              loss_weights=None,
              weighted_metrics=None,
              run_eagerly=None,
              steps_per_execution=None,
              **kwargs):
    """Configures the model for training.

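About the optimizer

The optimizer argument accepts either a string (the optimizer's name) or an optimizer instance.

String form, which uses the optimizer's default parameters:

model.compile(optimizer='rmsprop', loss='mean_squared_error')

Optimizer instance, which lets you pass parameters in:

keras.optimizers.RMSprop(lr=0.001, rho=0.9, epsilon=None, decay=0.0)

It is recommended to keep the optimizer's default parameters, except for the learning rate lr, which can be tuned freely. (Recent TensorFlow releases name this argument learning_rate.)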

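Parameters:

lr: float >= 0. Learning rate.
rho: float >= 0. Decay rate of the moving average of squared gradients in RMSProp.
epsilon: float >= 0. Fuzz factor. If None, defaults to K.epsilon().
decay: float >= 0. Learning-rate decay applied after each parameter update.

Many other optimizers are available as well, such as SGD, Adagrad, Adadelta, Adam, Adamax, and Nadam.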
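To make the roles of these four parameters concrete, here is a minimal sketch of a textbook RMSprop update for a single scalar weight. It is an illustration only: the Keras implementation may differ in details such as where epsilon is added, and the function name rmsprop_step, the default epsilon of 1e-7, and the time-based decay formula are assumptions of this sketch.

```python
import math

def rmsprop_step(w, grad, avg_sq, step, lr=0.001, rho=0.9, epsilon=1e-7, decay=0.0):
    """One RMSprop update for a single scalar parameter (illustrative)."""
    # Time-based learning-rate decay, applied once per update (assumed form).
    lr_t = lr / (1.0 + decay * step)
    # Moving average of squared gradients; rho sets how quickly it forgets.
    avg_sq = rho * avg_sq + (1.0 - rho) * grad ** 2
    # Scale the step by the root of that average; epsilon avoids division by zero.
    w = w - lr_t * grad / (math.sqrt(avg_sq) + epsilon)
    return w, avg_sq

# Minimize f(w) = w^2, whose gradient is 2w, starting from w = 1.0.
w, avg_sq = 1.0, 0.0
for step in range(200):
    w, avg_sq = rmsprop_step(w, 2.0 * w, avg_sq, step, lr=0.01)
print(w)
```

Because the gradient is divided by its own running magnitude, each step has a size close to lr regardless of how large the raw gradient is, which is why the learning rate is the parameter most worth tuning.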

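About the loss function

The choice depends on the task; in general, the loss function should characterize the task well. For example:

1. Regression problems

We want the network's output to be close to the ground truth, so a loss that measures distance is the better fit, such as L1 loss or MSE loss.

2. Classification problems

We want the class predicted by the network to match the ground-truth class, so a loss that characterizes the class distribution is the better fit, such as cross_entropy.

For the common choices, see the note on choosing the loss function at the start of the article.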
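The difference between the two families of losses can be seen on a tiny example. The sketch below computes categorical cross-entropy and mean squared error in plain Python for one three-class sample; the two prediction vectors are made-up numbers, and the helper functions are illustrative rather than the Keras implementations.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and a predicted distribution."""
    # Clamp probabilities away from zero so log() stays finite.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def mean_squared_error(y_true, y_pred):
    """Average squared difference between target and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

target = [0.0, 1.0, 0.0]        # the true class is index 1
confident = [0.05, 0.90, 0.05]  # made-up prediction, mostly right
unsure = [0.40, 0.30, 0.30]     # made-up prediction, mostly wrong

for name, pred in [("confident", confident), ("unsure", unsure)]:
    print(name,
          categorical_crossentropy(target, pred),
          mean_squared_error(target, pred))
```

Cross-entropy looks only at the probability assigned to the true class and grows without bound as that probability approaches zero, which is why it suits classification.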

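About metrics

For routine use, the common choices referred to above are sufficient. A custom metric function is also possible: it is passed in at compile time (compile), must take (y_true, y_pred) as its input arguments, and must return a tensor as its output.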

import keras.backend as K
def mean_pred(y_true, y_pred):
    return K.mean(y_pred)

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy', mean_pred])

4. Training and testing the network

1. Training (fitting)

Use model.fit(), which accepts the following parameters:

def fit(self,
          x=None,
          y=None,
          batch_size=None,
          epochs=1,
          verbose=1,
          callbacks=None,
          validation_split=0.,
          validation_data=None,
          shuffle=True,
          class_weight=None,
          sample_weight=None,
          initial_epoch=0,
          steps_per_epoch=None,
          validation_steps=None,
          validation_batch_size=None,
          validation_freq=1,
          max_queue_size=10,
          workers=1,
          use_multiprocessing=False):

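The source of this method runs to more than 300 lines; a detailed walkthrough is left for another time.

We split the training data into mini-batches of 64 samples to pass through the network, and iterate over the whole dataset 5 times.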

model.fit(train_images, train_labels, epochs = 5, batch_size=64)
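The number of weight updates implied by these settings is a quick calculation. The sketch below assumes the default behavior in which the final, smaller batch of each epoch is still used.

```python
import math

num_samples = 60000
batch_size = 64
epochs = 5

# 60000 is not a multiple of 64, so the last batch of each epoch is smaller.
steps_per_epoch = math.ceil(num_samples / batch_size)
last_batch = num_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, last_batch, total_updates)  # 938 32 4690
```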

2. Testing

Use the model.evaluate() function:

test_loss, test_acc = model.evaluate(test_images, test_labels)

The return declaration of the evaluation function:

Returns:
        Scalar test loss (if the model has a single output and no metrics)
        or list of scalars (if the model has multiple outputs
        and/or metrics). The attribute `model.metrics_names` will give you
        the display labels for the scalar outputs.
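With a model compiled using metrics=['accuracy'], evaluate() therefore returns the loss followed by the accuracy. What that accuracy measures for one-hot labels can be sketched in plain Python: a sample counts as correct when the index of the largest predicted probability matches the index of the 1 in its label. The labels and probabilities below are made up for illustration, and the helpers are not the Keras implementation.

```python
def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(y_true_one_hot, y_pred_probs):
    """Fraction of samples whose predicted class matches the labeled class."""
    hits = sum(argmax(t) == argmax(p)
               for t, p in zip(y_true_one_hot, y_pred_probs))
    return hits / len(y_true_one_hot)

# Made-up three-class example: the third sample is misclassified.
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
probs = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.3, 0.4, 0.3], [0.1, 0.6, 0.3]]
print(accuracy(labels, probs))  # 0.75
```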

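5. Plotting loss and accuracy against epochs

model.fit() returns a History object whose history attribute records all the data from the training run.

We plot with matplotlib.pyplot; see the complete code later for details.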

Returns:
        A `History` object. Its `History.history` attribute is
        a record of training loss values and metrics values
        at successive epochs, as well as validation loss values
        and validation metrics values (if applicable).

def draw_loss(history):
    loss=history.history['loss']
    epochs=range(1,len(loss)+1)
    plt.subplot(1,2,1)  # first plot
    plt.plot(epochs,loss,'bo',label='Training loss')
    plt.title("Training loss")
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()

    plt.subplot(1,2,2)  # second plot
    accuracy=history.history['accuracy']
    plt.plot(epochs,accuracy,'bo',label='Training accuracy')
    plt.title("Training accuracy")
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.suptitle("Train data")
    plt.legend()
    plt.show()

6. Complete code

from tensorflow.keras.datasets import mnist
from tensorflow.keras import models
from tensorflow.keras import layers
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt
import numpy as np
def data_preprocess():
    (train_images, train_labels), (test_images, test_labels) = mnist.load_data()
    train_images = train_images.reshape((60000, 28, 28, 1))
    train_images = train_images.astype('float32') / 255
    #print(train_images[0])
    test_images = test_images.reshape((10000, 28, 28, 1))
    test_images = test_images.astype('float32') / 255

    train_labels = to_categorical(train_labels)
    test_labels = to_categorical(test_labels)
    return train_images,train_labels,test_images,test_labels

# Build the network
def build_module():
    model = models.Sequential()
    # Layer 1: convolution
    model.add(layers.Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)))
    # Layer 2: max pooling
    model.add(layers.MaxPooling2D((2,2)))
    # Layer 3: convolution
    model.add(layers.Conv2D(64, (3,3), activation='relu'))
    # Layer 4: max pooling
    model.add(layers.MaxPooling2D((2,2)))
    # Layer 5: convolution
    model.add(layers.Conv2D(64, (3,3), activation='relu'))
    # Layer 6: Flatten, which unrolls the 3D tensor into a vector
    model.add(layers.Flatten())
    # Layer 7: fully connected
    model.add(layers.Dense(64, activation='relu'))
    # Layer 8: softmax, which performs the classification
    model.add(layers.Dense(10, activation='softmax'))
    return model
def draw_loss(history):
    loss=history.history['loss']
    epochs=range(1,len(loss)+1)
    plt.subplot(1,2,1)  # first plot
    plt.plot(epochs,loss,'bo',label='Training loss')
    plt.title("Training loss")
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()

    plt.subplot(1,2,2)  # second plot
    accuracy=history.history['accuracy']
    plt.plot(epochs,accuracy,'bo',label='Training accuracy')
    plt.title("Training accuracy")
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.suptitle("Train data")
    plt.legend()
    plt.show()
if __name__=='__main__':
    train_images,train_labels,test_images,test_labels=data_preprocess()
    model=build_module()
    model.summary()  # summary() prints the table itself and returns None
    model.compile(optimizer='rmsprop', loss = 'categorical_crossentropy', metrics=['accuracy'])
    history=model.fit(train_images, train_labels, epochs = 5, batch_size=64)
    draw_loss(history)
    test_loss, test_acc = model.evaluate(test_images, test_labels)
    print('test_loss=',test_loss,'  test_acc = ', test_acc)

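Change in loss and accuracy over the training epochs

Because the dataset is fairly simple, even a casually designed neural network can reach about 99.2% accuracy on the test set.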
