Sample code for simple image clustering with K-means in Python

Here is the first, straightforward version of the implementation:

import os
import numpy as np
from sklearn.cluster import KMeans
import cv2
from imutils import build_montages
import matplotlib.image as imgplt

image_path = []
all_images = []
images = os.listdir('./images')

# collect the path of every image under ./images
for image_name in images:
    image_path.append('./images/' + image_name)
# flatten each image into a 1-D pixel vector; all images must have the same
# dimensions so that the flattened vectors are of equal length
for path in image_path:
    image = imgplt.imread(path)
    image = image.reshape(-1, )
    all_images.append(image)

# cluster the flattened pixel vectors into 2 groups
clt = KMeans(n_clusters=2)
clt.fit(all_images)
labelIDs = np.unique(clt.labels_)

for labelID in labelIDs:
    # indices of all images assigned to this cluster
    idxs = np.where(clt.labels_ == labelID)[0]
    # sample up to 25 of them for display
    idxs = np.random.choice(idxs, size=min(25, len(idxs)), replace=False)
    show_box = []
    for i in idxs:
        image = cv2.imread(image_path[i])
        image = cv2.resize(image, (96, 96))
        show_box.append(image)
    # build a 5x5 montage of 96x96 thumbnails for this cluster
    montage = build_montages(show_box, (96, 96), (5, 5))[0]

    title = "Type {}".format(labelID)
    cv2.imshow(title, montage)
    cv2.waitKey(0)

The main thing to keep in mind is how K-means works: it clusters vectors. So if the inputs are, say, 224×224×3 RGB images, each image must first be converted into a 1-D vector. In the approach above, we simply flatten it:

image = image.reshape(-1, )

The drawback of doing this is obvious. For example, take two identical images and shift one of them to the left by a single pixel. Perceptually the two images are virtually indistinguishable, yet the shift makes their pixel-by-pixel comparison differ wildly.
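
To see how severe this is, here is a minimal sketch (not part of the original experiment; the random array simply stands in for an image, and np.roll provides the one-pixel shift) comparing an image with a shifted copy of itself in flattened pixel space:

import numpy as np

# stand-in for a 224x224x3 RGB image; a real photo shows the same effect, if less extreme
img = np.random.rand(224, 224, 3)
# the "same" image shifted one pixel to the left (the edge column wraps around)
shifted = np.roll(img, shift=-1, axis=1)

# K-means only sees flattened pixel vectors, and in that space the two
# vectors end up far apart even though the images look identical to a human
print(np.linalg.norm(img.reshape(-1) - shifted.reshape(-1)))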

Using an orange/car clustering task as an example, the experimental results are as follows:

[Figure: montages of the two clusters produced by clustering on raw pixels]

As you can see, the results are rather poor. We therefore improve the approach: use ResNet-50 to extract image features (embeddings) and cluster on those features instead of on raw pixels. The code is as follows:

import os
import numpy as np
from sklearn.cluster import KMeans
import cv2
from imutils import build_montages
import torch.nn as nn
import torchvision.models as models
from PIL import Image
from torchvision import transforms

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # pretrained ResNet-50 with avgpool and the fc classifier removed,
        # so the network outputs the raw layer4 feature map
        resnet50 = models.resnet50(pretrained=True)
        self.resnet = nn.Sequential(resnet50.conv1,
                                    resnet50.bn1,
                                    resnet50.relu,
                                    resnet50.maxpool,
                                    resnet50.layer1,
                                    resnet50.layer2,
                                    resnet50.layer3,
                                    resnet50.layer4)

    def forward(self, x):
        x = self.resnet(x)
        return x

net = Net().eval()  # eval mode so BatchNorm layers use their running statistics

image_path = []
all_images = []
images = os.listdir('./images')

for image_name in images:
    image_path.append('./images/' + image_name)
for path in image_path:
    image = Image.open(path).convert('RGB')
    # resize to the 224x224 input size expected by ResNet-50
    image = transforms.Resize([224, 224])(image)
    image = transforms.ToTensor()(image)
    image = image.unsqueeze(0)           # add a batch dimension
    image = net(image)                   # extract the layer4 feature map
    image = image.reshape(-1, )          # flatten the features into a 1-D vector
    all_images.append(image.detach().numpy())

# cluster the feature vectors into 2 groups
clt = KMeans(n_clusters=2)
clt.fit(all_images)
labelIDs = np.unique(clt.labels_)

for labelID in labelIDs:
    # sample up to 25 images from this cluster and display them as a 5x5 montage
    idxs = np.where(clt.labels_ == labelID)[0]
    idxs = np.random.choice(idxs, size=min(25, len(idxs)), replace=False)
    show_box = []
    for i in idxs:
        image = cv2.imread(image_path[i])
        image = cv2.resize(image, (96, 96))
        show_box.append(image)
    montage = build_montages(show_box, (96, 96), (5, 5))[0]

    title = "Type {}".format(labelID)
    cv2.imshow(title, montage)
    cv2.waitKey(0)

The results improve noticeably:

[Figure: montages of the two clusters produced by clustering on ResNet-50 features]
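
Two optional refinements, not present in the code above, are worth mentioning. The layer4 output has shape 1×2048×7×7, so each flattened feature vector has 100352 dimensions; routing the features through ResNet-50's own global average pooling instead collapses them to 2048 dimensions, which is much cheaper to cluster. In addition, the pretrained weights expect the standard ImageNet mean/std normalization, which the code above skips. A minimal sketch of the feature-extraction step with both changes (the embed helper is just illustrative):

import torch
import torch.nn as nn
import torchvision.models as models
from torchvision import transforms
from PIL import Image

# preprocessing with the ImageNet statistics the pretrained weights expect
preprocess = transforms.Compose([
    transforms.Resize([224, 224]),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

resnet50 = models.resnet50(pretrained=True).eval()
# keep everything up to and including global average pooling,
# dropping only the final fc classifier
backbone = nn.Sequential(*list(resnet50.children())[:-1])

def embed(path):
    # returns a 2048-dimensional embedding for one image file
    image = preprocess(Image.open(path).convert('RGB')).unsqueeze(0)
    with torch.no_grad():
        feature = backbone(image)       # shape: 1 x 2048 x 1 x 1
    return feature.reshape(-1).numpy()  # shape: (2048,)

The rest of the pipeline stays the same: build all_images from these embeddings and fit KMeans on them as before.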

This concludes this article on simple image clustering with K-means in Python. For more on Python K-means image clustering, search WalkonNet's earlier articles, and please keep supporting WalkonNet!
