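Platform: Windows 10, Python 3.7, Jupyter
Data download: https://www.lanzous.com/iae2wyh
The image files are named in the form value_index.bmp, where value is the digit the image shows (for example, 0_1.bmp), as shown in the figure:
import numpy as np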
from sklearn.neighbors import KNeighborsClassifier
import matplotlib.pyplot as plt
%matplotlib inline
img = plt.imread('./data/0/0_1.bmp')
plt.imshow(img,cmap = plt.cm.gray) # the images are all black-and-white
data = []
for i in range(10):  # digits 0-9, one folder per digit
    for j in range(1,501):  # 500 images per digit, numbered from 1
        data.append(plt.imread('./data/%d/%d_%d.bmp'%(i,i,j)))
Check the size of data:
len(data)
5000
# data
X = np.array(data)
X.shape
(5000, 28, 28)
(1) Build the list
y = [0,1,2,3,4,5,6,7,8,9]*500
y
Result: the sequence 0-9 repeated 500 times, 5000 values in total.
(2) Convert the list to a NumPy array
y = np.array(y)
y
array([0, 1, 2, ..., 7, 8, 9])
(3) Sort, so that the labels line up with the order in which the images were loaded
y.sort()
y
array([0, 0, 0, ..., 9, 9, 9])
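The three steps above (build the list, convert, sort) can also be collapsed into a single call. A minimal sketch, assuming the same layout of 10 digits with 500 images each; `labels` and `y_sorted` are names introduced here for illustration only:

```python
import numpy as np

# Repeat each digit 0..9 exactly 500 times. The result is already sorted,
# matching the order in which the images were appended to data.
labels = np.repeat(np.arange(10), 500)

# The list-multiply, convert, sort route used in this post, for comparison.
y_sorted = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9] * 500)
y_sorted.sort()

print(labels.shape, np.array_equal(labels, y_sorted))
```

Either route works; np.repeat just skips the intermediate Python list and the sort.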
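index = np.random.randint(0,5000,size = 4000) # draw 4000 random indices (80% of 5000, sampled with replacement) for training
X_train = X[index]
y_train = y[index]
index = np.random.randint(0,5000,size = 1000) # draw 1000 random indices (20% of 5000, sampled with replacement) for testing
X_test = X[index]
y_test = y[index]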
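Because np.random.randint samples with replacement and the two calls are independent, the same image can land in both the training set and the test set, which tends to inflate the measured accuracy. One possible way to get a disjoint 80/20 split is to shuffle the indices once and slice them. The sketch below runs on stand-in arrays; `X_demo`, `y_demo`, and the fixed seed are assumptions for illustration, not part of the original notebook:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, assumed here for repeatability

# Stand-in data with the same shapes as X and y in this post.
X_demo = np.zeros((5000, 28, 28), dtype=np.uint8)
y_demo = np.repeat(np.arange(10), 500)

# Shuffle every index exactly once, then slice: no index is used twice.
perm = rng.permutation(len(X_demo))
train_idx, test_idx = perm[:4000], perm[4000:]

X_train_demo, y_train_demo = X_demo[train_idx], y_demo[train_idx]
X_test_demo, y_test_demo = X_demo[test_idx], y_demo[test_idx]

print(X_train_demo.shape, X_test_demo.shape)
print(len(np.intersect1d(train_idx, test_idx)))  # number of shared indices
```

sklearn.model_selection.train_test_split(X, y, test_size=0.2) does the same job in one call.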
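(1) Check the data shapes
Analysis: X_train is 3-dimensional, with shape (4000, 28, 28), while y_train is 1-dimensional, with shape (4000,). scikit-learn expects the features as a 2-D array of shape (n_samples, n_features), so each 28x28 image has to be flattened into a 784-element row before fitting. The next step reshapes X.
(2) Flatten the images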
X_train.reshape(4000,784)
array([[255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       ...,
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255]], dtype=uint8)
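Note that reshape returns a reshaped array and leaves X_train itself unchanged, which is why the later cells call reshape again when fitting and predicting. A small sketch with a stand-in array (`imgs` and `flat` are illustrative names, not from the notebook):

```python
import numpy as np

# Two fake 28x28 "images" whose pixel values are just running counters.
imgs = np.arange(2 * 28 * 28, dtype=np.uint16).reshape(2, 28, 28)

# -1 asks NumPy to infer the second dimension (28 * 28 = 784).
flat = imgs.reshape(len(imgs), -1)

print(imgs.shape)  # the original array keeps its 3-D shape
print(flat.shape)  # one row of 784 pixels per image

# Flattening is row-major: pixel (r, c) ends up at column r * 28 + c.
print(flat[1, 5 * 28 + 7] == imgs[1, 5, 7])
```

To avoid repeating the reshape, the flattened result can be assigned to a variable once and reused.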
%%time
knn = KNeighborsClassifier(n_neighbors=5) # use the 5 nearest neighbors
knn.fit(X_train.reshape(4000,-1),y_train) # -1 lets NumPy infer 784 (28*28), so it need not be worked out by hand
Wall time: 1.97 s
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_jobs=None, n_neighbors=5, p=2,
weights='uniform')
%%time
# predict
y_ = knn.predict(X_test.reshape(1000,784))
Wall time: 12.1 s
(1) Method 1: compare the predictions with the true labels
(y_test == y_).mean()
0.931
(2) Method 2: call knn.score
knn.score(X_test.reshape(1000,-1),y_test)
0.931
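Both methods compute the same quantity: the fraction of test samples whose predicted label equals the true label. A self-contained sketch, assuming scikit-learn's bundled `load_digits` dataset (8x8 digit images) as a stand-in for the BMP files used in this post; the variable names are illustrative:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data: 1797 digit images, already flattened to 64 features each.
digits = load_digits()
X_tr, X_te, y_tr, y_te = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0)

knn_demo = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)

# Method 1: mean of the element-wise comparison (True counts as 1).
manual_acc = (knn_demo.predict(X_te) == y_te).mean()
# Method 2: score() predicts internally and returns the mean accuracy.
score_acc = knn_demo.score(X_te, y_te)

print(manual_acc, score_acc)
```

Method 2 is shorter, while Method 1 keeps the predictions around for further inspection, such as finding which images were misclassified.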
%%time
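# n_neighbors = number of neighbors, weights = how neighbor votes are weighted, p = power of the Minkowski distance (p = 1 is Manhattan distance), n_jobs = number of parallel jobs (-1 uses all CPU cores)
knn = KNeighborsClassifier(n_neighbors=5,weights='distance', p = 1,n_jobs=-1)
knn.fit(X_train.reshape(4000,-1),y_train) # train the model
# predict and compute the accuracy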
knn.score(X_test.reshape(1000,-1),y_test)
Wall time: 3.88 s
0.953
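The parameters in this cell were picked by hand. One possible way to search over them systematically is a cross-validated grid search. The sketch below uses scikit-learn's `GridSearchCV` on the bundled `load_digits` data as a stand-in for the BMP files, and the grid values are assumptions chosen for illustration rather than recommendations:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()  # stand-in dataset, not the images from this post

# Candidate values for the same three parameters tuned above.
param_grid = {
    "n_neighbors": [3, 5, 7],
    "weights": ["uniform", "distance"],
    "p": [1, 2],
}

# 3-fold cross-validation over all 12 combinations.
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=3)
search.fit(digits.data, digits.target)

print(search.best_params_, search.best_score_)
```

The best combination on one dataset does not necessarily carry over to another, so the search would need to be rerun on the actual 28x28 images.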
img = plt.imread('./data/9/9_1.bmp') # read an arbitrary image
plt.imshow(img,cmap = plt.cm.gray) # display it
plt.show()
knn.predict(img.reshape(1,-1))[0]
9
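The reshape(1,-1) in the last cell is needed because predict expects a 2-D array with one row per sample, even when there is only one sample, and it returns an array of labels, hence the trailing [0]. A sketch of the shapes involved, again assuming the bundled `load_digits` data as a stand-in (8x8 images instead of 28x28); `clf`, `one_img`, and `one_row` are illustrative names:

```python
from sklearn.datasets import load_digits
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()

# Hold out the last image and train on all the others.
clf = KNeighborsClassifier(n_neighbors=5).fit(digits.data[:-1], digits.target[:-1])

one_img = digits.images[-1]       # a single 2-D image, shape (8, 8)
one_row = one_img.reshape(1, -1)  # one sample with 64 features, shape (1, 64)
pred = clf.predict(one_row)       # array holding one predicted label

print(one_row.shape, pred.shape, pred[0])
```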