本文主要介绍使用keras来做房屋价格预测。
示例代码:
from keras.datasets import boston_housing
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
from keras import backend as K
import matplotlib.pyplot as plt
(train_data, train_targets), (test_data, test_targets) = boston_housing.load_data()
'''
我们有404个训练样本和102个测试样本。 该数据包括13个功能。 输入数据中的13个功能如下
#1。人均犯罪率。
#2。占地面积超过25,000平方英尺的住宅用地比例。
#3。每个城镇非零售业务占比的比例。
#4。Charles River虚拟变量(如果管道限制河流则= 1;否则为0)。
#5。一氧化氮浓度(每千万份)。
#6。每个住宅的平均房间数。
#7。1940年以前建造的自住单位比例。
#8。到波士顿五个就业中心的加权距离。
#9。径向高速公路的可达性指数。
#10。每10,000美元的全额物业税率。
#11。城镇的学生与教师比例。
#12. 1000 *(Bk - 0.63)** 2其中Bk是城镇黑人的比例。
#13.人口状况较低。
#
#目标是自住房屋的中位数,以千美元计算:
'''
print(train_data.shape)
print(train_targets)
'''
将神经网络输入所有采用不同范围的值都是有问题的。 网络可能能够
自动适应这种异构数据,但肯定会使学习变得更加困难。 处理此类数据
的一种广泛的最佳实践是进行特征标准化:对于输入数据中的每个特征(
输入数据矩阵中的一列),我们将减去特征的均值并除以标准差,因此
该特征将以0为中心,并具有单位标准偏差。
'''
mean = train_data.mean(axis=0)
train_data -= mean
std = train_data.std(axis=0)
train_data /= std
test_data -= mean
test_data /= std
'''
由于可用的样本很少,我们将使用一个非常小的网络,其中有两个隐藏层,
每个层有64个单元。 通常,拥有的训练数据越少,过度拟合就越差,
使用小型网络是缓解过度拟合的一种方法。
'''
# 搭建网络
def build_model():
model = Sequential()
model.add(Dense(64, activation='relu',
input_shape=(train_data.shape[1],)))
model.add(Dense(64, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='rmsprop',
loss='mse',
metrics=['mae'])
return model
k = 4
num_val_samples = len(train_data) // k
num_epochs = 100
all_scores = []
for i in range(k):
print('processing fold #', i)
# 验证数据预处理:通过k来划分
val_data = train_data[i * num_val_samples: (i + 1) * num_val_samples]
val_targets = train_targets[i * num_val_samples: (i + 1) * num_val_samples]
# 准备训练数据
partial_train_data = np.concatenate(
[train_data[: i * num_val_samples],
train_data[(i + 1) * num_val_samples:]],
axis=0
)
partial_train_targets = np.concatenate(
[train_targets[: i * num_val_samples],
train_targets[(i + 1) * num_val_samples:]],
axis=0
)
# 创建已编译好的keras模型
model = build_model()
# 训练模型
model.fit(partial_train_data, partial_train_targets,
epochs=num_epochs, batch_size=1, verbose=0)
# 通过验证集验证模型
val_mse, val_mae = model.evaluate(val_data, val_targets, verbose=0)
all_scores.append(val_mae)
print(all_scores)
K.clear_session()
num_epochs = 500
all_mae_histories = []
for i in range(k):
print('processing fold #', i)
# Prepare the validation data: data from partition # k
val_data = train_data[i * num_val_samples: (i + 1) * num_val_samples]
val_targets = train_targets[i * num_val_samples: (i + 1) * num_val_samples]
# Prepare the training data: data from all other partitions
partial_train_data = np.concatenate(
[train_data[:i * num_val_samples],
train_data[(i + 1) * num_val_samples:]],
axis=0)
partial_train_targets = np.concatenate(
[train_targets[:i * num_val_samples],
train_targets[(i + 1) * num_val_samples:]],
axis=0)
# Build the Keras model (already compiled)
model = build_model()
# Train the model (in silent mode, verbose=0)
history = model.fit(partial_train_data, partial_train_targets,
validation_data=(val_data, val_targets),
epochs=num_epochs, batch_size=1, verbose=0)
mae_history = history.history['val_mean_absolute_error']
all_mae_histories.append(mae_history)
average_mae_history = [
np.mean([x[i] for x in all_mae_histories]) for i in range(num_epochs)]
plt.plot(range(1, len(average_mae_history) + 1), average_mae_history)
plt.xlabel('Epochs')
plt.ylabel('Validation MAE')
plt.show()
def smooth_curve(points, factor=0.9):
smoothed_points = []
for point in points:
if smoothed_points:
previous = smoothed_points[-1]
smoothed_points.append(previous * factor + point * (1 - factor))
else:
smoothed_points.append(point)
return smoothed_points
smooth_mae_history = smooth_curve(average_mae_history[10:])
plt.plot(range(1, len(smooth_mae_history) + 1), smooth_mae_history)
plt.xlabel('Epochs')
plt.ylabel('Validation MAE')
plt.show()
运行结果:
(404, 13)
[15.2 42.3 50. 21.1 17.7 18.5 11.3 15.6 15.6 14.4 12.1 17.9 23.1 19.9
15.7 8.8 50. 22.5 24.1 27.5 10.9 30.8 32.9 24. 18.5 13.3 22.9 34.7
16.6 17.5 22.3 16.1 14.9 23.1 34.9 25. 13.9 13.1 20.4 20. 15.2 24.7
22.2 16.7 12.7 15.6 18.4 21. 30.1 15.1 18.7 9.6 31.5 24.8 19.1 22.
14.5 11. 32. 29.4 20.3 24.4 14.6 19.5 14.1 14.3 15.6 10.5 6.3 19.3
19.3 13.4 36.4 17.8 13.5 16.5 8.3 14.3 16. 13.4 28.6 43.5 20.2 22.
23. 20.7 12.5 48.5 14.6 13.4 23.7 50. 21.7 39.8 38.7 22.2 34.9 22.5
31.1 28.7 46. 41.7 21. 26.6 15. 24.4 13.3 21.2 11.7 21.7 19.4 50.
22.8 19.7 24.7 36.2 14.2 18.9 18.3 20.6 24.6 18.2 8.7 44. 10.4 13.2
21.2 37. 30.7 22.9 20. 19.3 31.7 32. 23.1 18.8 10.9 50. 19.6 5.
14.4 19.8 13.8 19.6 23.9 24.5 25. 19.9 17.2 24.6 13.5 26.6 21.4 11.9
22.6 19.6 8.5 23.7 23.1 22.4 20.5 23.6 18.4 35.2 23.1 27.9 20.6 23.7
28. 13.6 27.1 23.6 20.6 18.2 21.7 17.1 8.4 25.3 13.8 22.2 18.4 20.7
31.6 30.5 20.3 8.8 19.2 19.4 23.1 23. 14.8 48.8 22.6 33.4 21.1 13.6
32.2 13.1 23.4 18.9 23.9 11.8 23.3 22.8 19.6 16.7 13.4 22.2 20.4 21.8
26.4 14.9 24.1 23.8 12.3 29.1 21. 19.5 23.3 23.8 17.8 11.5 21.7 19.9
25. 33.4 28.5 21.4 24.3 27.5 33.1 16.2 23.3 48.3 22.9 22.8 13.1 12.7
22.6 15. 15.3 10.5 24. 18.5 21.7 19.5 33.2 23.2 5. 19.1 12.7 22.3
10.2 13.9 16.3 17. 20.1 29.9 17.2 37.3 45.4 17.8 23.2 29. 22. 18.
17.4 34.6 20.1 25. 15.6 24.8 28.2 21.2 21.4 23.8 31. 26.2 17.4 37.9
17.5 20. 8.3 23.9 8.4 13.8 7.2 11.7 17.1 21.6 50. 16.1 20.4 20.6
21.4 20.6 36.5 8.5 24.8 10.8 21.9 17.3 18.9 36.2 14.9 18.2 33.3 21.8
19.7 31.6 24.8 19.4 22.8 7.5 44.8 16.8 18.7 50. 50. 19.5 20.1 50.
17.2 20.8 19.3 41.3 20.4 20.5 13.8 16.5 23.9 20.6 31.5 23.3 16.8 14.
33.8 36.1 12.8 18.3 18.7 19.1 29. 30.1 50. 50. 22. 11.9 37.6 50.
22.7 20.8 23.5 27.9 50. 19.3 23.9 22.6 15.2 21.7 19.2 43.8 20.3 33.2
19.9 22.5 32.7 22. 17.1 19. 15. 16.1 25.1 23.7 28.7 37.2 22.6 16.4
25. 29.8 22.1 17.4 18.1 30.3 17.5 24.7 12.6 26.5 28.7 13.3 10.4 24.4
23. 20. 17.8 7. 11.8 24.4 13.8 19.4 25.2 19.4 19.4 29.1]
processing fold # 0
WARNING:tensorflow:From D:\Users\Seavan_CC\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py:1349: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
2018-11-04 19:09:25.661535: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
processing fold # 1
processing fold # 2
processing fold # 3
[2.0597879414511198, 2.2206014052476033, 2.9226412867555522, 2.3122610349466304]
processing fold # 0
processing fold # 1
processing fold # 2
processing fold # 3