Example code for simple image clustering with K-means in Python
Here is a straightforward first version of the implementation:
import os
import numpy as np
from sklearn.cluster import KMeans
import cv2
from imutils import build_montages
import matplotlib.image as imgplt

image_path = []
all_images = []
images = os.listdir('./images')

# Collect the path of every file in ./images
for image_name in images:
    image_path.append('./images/' + image_name)

# Read each image and flatten it into a 1-D vector
# (all images must have the same shape for KMeans to accept them)
for path in image_path:
    image = imgplt.imread(path)
    image = image.reshape(-1, )
    all_images.append(image)

# Cluster the flattened images into two groups
clt = KMeans(n_clusters=2)
clt.fit(all_images)
labelIDs = np.unique(clt.labels_)

# For each cluster, show up to 25 randomly chosen images as a 5x5 montage
for labelID in labelIDs:
    idxs = np.where(clt.labels_ == labelID)[0]
    idxs = np.random.choice(idxs, size=min(25, len(idxs)),
                            replace=False)
    show_box = []
    for i in idxs:
        image = cv2.imread(image_path[i])
        image = cv2.resize(image, (96, 96))
        show_box.append(image)
    montage = build_montages(show_box, (96, 96), (5, 5))[0]
    title = "Type {}".format(labelID)
    cv2.imshow(title, montage)
    cv2.waitKey(0)
The main point to keep in mind is how K-means works. K-means clusters vectors, so if the input is a 224×224×3 RGB image, it must first be converted into a one-dimensional vector. In the approach above, we simply flatten it:
image = image.reshape(-1, )
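To make the flattening step concrete, here is a minimal sketch using NumPy only. The four all-zero arrays are hypothetical placeholders for real photos, and the name `X` is just an illustrative choice. It shows that each 224×224×3 image becomes a vector of 150528 values, and that a list of such vectors stacks into the (n_samples, n_features) matrix a clustering routine expects.

```python
import numpy as np

# Hypothetical batch: four 224x224 RGB images, zeros used as placeholder pixels.
images = [np.zeros((224, 224, 3), dtype=np.uint8) for _ in range(4)]

# Flatten each image into a 1-D vector, as image.reshape(-1, ) does above.
vectors = [img.reshape(-1) for img in images]

# Stack the vectors into one matrix: one row per image, one column per pixel value.
X = np.stack(vectors)

print(vectors[0].shape)  # (150528,)
print(X.shape)           # (4, 150528)
```

Stacking only works when every image has the same height, width, and channel count, which the code above implicitly assumes.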
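The drawback of this approach is obvious. For example, take two identical images and shift one of them a single pixel to the left. The two images look almost the same to the eye, but because everything has moved, a pixel-by-pixel comparison of the two image matrices shows a large difference. Taking the clustering of oranges and cars as an example, the experimental results are shown below.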
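The shift sensitivity described above can be checked numerically. The sketch below is a toy setup, not the article's experiment: the image is random noise (an assumption made so the example needs no files), `np.roll` stands in for the one-pixel shift (it wraps the edge column around), and `flat_distance` is a hypothetical helper. Random pixels exaggerate the effect, since natural photos are smoother, but the direction of the result is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a photo: random 224x224x3 values in [0, 1).
image = rng.random((224, 224, 3))

# Shift one pixel to the left along the width axis (the edge column wraps around).
shifted = np.roll(image, shift=-1, axis=1)

# A completely unrelated random image, for comparison.
other = rng.random((224, 224, 3))


def flat_distance(a, b):
    """Euclidean distance between two images after flattening them."""
    return float(np.linalg.norm(a.reshape(-1) - b.reshape(-1)))


d_shift = flat_distance(image, shifted)  # same content, moved by one pixel
d_other = flat_distance(image, other)    # unrelated content

print(d_shift, d_other)
```

In this toy case the shifted copy ends up roughly as far from the original as an unrelated image does, which is the weakness of comparing raw pixel vectors.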


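As can be seen, the results are rather poor. We therefore improve the approach by using ResNet-50 to extract image features (embeddings) and clustering on those features rather than directly on the pixels. The code is as follows: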
import os
import numpy as np
from sklearn.cluster import KMeans
import cv2
from imutils import build_montages
import torch.nn as nn
import torchvision.models as models
from PIL import Image
from torchvision import transforms


class Net(nn.Module):
    """ResNet-50 truncated after layer4, used as a feature extractor."""

    def __init__(self):
        super(Net, self).__init__()
        resnet50 = models.resnet50(pretrained=True)
        self.resnet = nn.Sequential(resnet50.conv1,
                                    resnet50.bn1,
                                    resnet50.relu,
                                    resnet50.maxpool,
                                    resnet50.layer1,
                                    resnet50.layer2,
                                    resnet50.layer3,
                                    resnet50.layer4)

    def forward(self, x):
        x = self.resnet(x)
        return x


# eval() puts batch normalization into inference mode
net = Net().eval()

image_path = []
all_images = []
images = os.listdir('./images')

# Collect the path of every file in ./images
for image_name in images:
    image_path.append('./images/' + image_name)

# Extract a feature vector for each image
for path in image_path:
    image = Image.open(path).convert('RGB')
    # Resize to 224x224 (the original listing had [224,244], most likely a typo)
    image = transforms.Resize([224, 224])(image)
    image = transforms.ToTensor()(image)
    image = image.unsqueeze(0)  # add the batch dimension
    image = net(image)
    image = image.reshape(-1, )  # flatten the feature map into a 1-D vector
    all_images.append(image.detach().numpy())

# Cluster the feature vectors into two groups
clt = KMeans(n_clusters=2)
clt.fit(all_images)
labelIDs = np.unique(clt.labels_)

# For each cluster, show up to 25 randomly chosen images as a 5x5 montage
for labelID in labelIDs:
    idxs = np.where(clt.labels_ == labelID)[0]
    idxs = np.random.choice(idxs, size=min(25, len(idxs)),
                            replace=False)
    show_box = []
    for i in idxs:
        image = cv2.imread(image_path[i])
        image = cv2.resize(image, (96, 96))
        show_box.append(image)
    montage = build_montages(show_box, (96, 96), (5, 5))[0]
    title = "Type {}".format(labelID)
    cv2.imshow(title, montage)
    cv2.waitKey(0)
With this change, the clustering results improve noticeably.
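For reference, the truncated network stops after `layer4`, so for a 224×224 input the feature map has shape 2048×7×7 and the flattened vector passed to K-means has 100352 entries. The sketch below uses a random NumPy array as a stand-in for that feature map (an assumption, so it runs without PyTorch or pretrained weights). It also shows an optional variation that the code above does not use: averaging over the two spatial axes, as ResNet-50's own average-pooling layer does, which would give a 2048-dimensional vector instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the layer4 output of one 224x224 image:
# 2048 channels on a 7x7 spatial grid.
feature_map = rng.random((2048, 7, 7), dtype=np.float32)

# What the article's code does: flatten everything into one long vector.
flat = feature_map.reshape(-1)

# Optional variation (not in the article): average each channel over its 7x7 grid.
pooled = feature_map.mean(axis=(1, 2))

print(flat.shape)    # (100352,)
print(pooled.shape)  # (2048,)
```

Whether the pooled vector clusters better than the flattened one is not something the article tests; the sketch only illustrates the difference in dimensionality.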

