Sample Code for Simple Image Clustering in Python with K-means
Here is the first, straightforward version:
```python
import os
import numpy as np
from sklearn.cluster import KMeans
import cv2
from imutils import build_montages
import matplotlib.image as imgplt

# Collect the paths of all images under ./images
image_path = []
all_images = []
images = os.listdir('./images')
for image_name in images:
    image_path.append('./images/' + image_name)

# Read each image and flatten it into a 1-D vector
for path in image_path:
    image = imgplt.imread(path)
    image = image.reshape(-1, )
    all_images.append(image)

# Cluster the flattened pixel vectors into 2 groups
clt = KMeans(n_clusters=2)
clt.fit(all_images)

# For each cluster, sample up to 25 images and show them as a montage
labelIDs = np.unique(clt.labels_)
for labelID in labelIDs:
    idxs = np.where(clt.labels_ == labelID)[0]
    idxs = np.random.choice(idxs, size=min(25, len(idxs)), replace=False)
    show_box = []
    for i in idxs:
        image = cv2.imread(image_path[i])
        image = cv2.resize(image, (96, 96))
        show_box.append(image)
    montage = build_montages(show_box, (96, 96), (5, 5))[0]
    title = "Type {}".format(labelID)
    cv2.imshow(title, montage)
    cv2.waitKey(0)
```
The main point to understand is how K-means works. K-means clusters vectors, so if the inputs are 224×224×3 RGB images, each one must first be converted into a 1-D vector. In the version above, we simply flatten the image:
image = image.reshape(-1, )
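To make the flattening concrete, here is a minimal sketch; a random array stands in for a real image (an assumption for illustration only):

```python
import numpy as np

# Stand-in for a 224x224 RGB image (random values instead of real pixels)
image = np.random.rand(224, 224, 3)

# reshape(-1,) flattens it into a single 1-D vector
flat = image.reshape(-1, )
print(flat.shape)  # (150528,) -- each image becomes one 224*224*3-dimensional sample
```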
The drawback of this approach is obvious. Take two identical images and shift one of them left by a single pixel. Visually the two are almost indistinguishable, but because every pixel has moved, a pixel-by-pixel comparison of the two image matrices shows a huge difference. Clustering a set of orange and car images illustrates this; the experimental results are as follows:
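The shift-sensitivity described above can be sketched as follows; a random array again stands in for a real photo (an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))          # stand-in for a real photo
shifted = np.roll(image, shift=1, axis=1)  # the "same" picture, moved by 1 pixel

# Pixel-by-pixel, the flattened vectors end up far apart in Euclidean distance,
# even though to the eye the two images would be nearly identical.
d_shift = np.linalg.norm(image.reshape(-1) - shifted.reshape(-1))
print(d_shift)
```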
As you can see, the results are rather poor. We therefore improve on this by using ResNet-50 to extract image features (embeddings) and clustering on those features instead of on raw pixels. The code is as follows:
```python
import os
import numpy as np
from sklearn.cluster import KMeans
import cv2
from imutils import build_montages
import torch.nn as nn
import torchvision.models as models
from PIL import Image
from torchvision import transforms

# Truncated ResNet-50: keep everything up to layer4, drop avgpool and fc,
# so the network outputs a feature map instead of class scores
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        resnet50 = models.resnet50(pretrained=True)
        self.resnet = nn.Sequential(
            resnet50.conv1, resnet50.bn1, resnet50.relu, resnet50.maxpool,
            resnet50.layer1, resnet50.layer2, resnet50.layer3, resnet50.layer4)

    def forward(self, x):
        x = self.resnet(x)
        return x

net = Net().eval()

image_path = []
all_images = []
images = os.listdir('./images')
for image_name in images:
    image_path.append('./images/' + image_name)

# Embed each image with the truncated ResNet-50, then flatten the feature map
for path in image_path:
    image = Image.open(path).convert('RGB')
    image = transforms.Resize([224, 224])(image)
    image = transforms.ToTensor()(image)
    image = image.unsqueeze(0)
    image = net(image)
    image = image.reshape(-1, )
    all_images.append(image.detach().numpy())

# Cluster the feature vectors rather than the raw pixels
clt = KMeans(n_clusters=2)
clt.fit(all_images)

labelIDs = np.unique(clt.labels_)
for labelID in labelIDs:
    idxs = np.where(clt.labels_ == labelID)[0]
    idxs = np.random.choice(idxs, size=min(25, len(idxs)), replace=False)
    show_box = []
    for i in idxs:
        image = cv2.imread(image_path[i])
        image = cv2.resize(image, (96, 96))
        show_box.append(image)
    montage = build_montages(show_box, (96, 96), (5, 5))[0]
    title = "Type {}".format(labelID)
    cv2.imshow(title, montage)
    cv2.waitKey(0)
```

(The original listing resized to `[224, 244]`, which appears to be a typo; it is corrected to `[224, 224]` above.)
The results are noticeably better:
This concludes this article on sample code for simple image clustering with Python K-means. For more on Python K-means image clustering, search WalkonNet's earlier articles. We hope you will continue to support WalkonNet!