11. Using Keras to convert the CIFAR-10 Python batch-format data to JPG, and making

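CIFAR-10 is a common choice for CNN classification experiments, but the official site only distributes the dataset in serialized formats rather than as image files. Here we take the official Python version (http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz), download it with the dataset utility built into Keras, and convert it to JPG images.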
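For orientation, here is a minimal sketch of how one record of the Python batch format maps to an image. It relies on the layout documented on the CIFAR-10 dataset page: each row of a batch holds 3072 values, with the first 1024 being the red channel, the next 1024 green, and the last 1024 blue, each stored row by row for a 32x32 image. The helper name `row_to_pixels` is hypothetical, and the Keras loader used later does this rearrangement for you.

```python
def row_to_pixels(row, size=32):
    """Turn one flat CIFAR-10 row (R plane, G plane, B plane) into
    a size x size grid of (r, g, b) tuples."""
    plane = size * size  # number of values per colour channel
    pixels = []
    for r in range(size):
        line = []
        for c in range(size):
            i = r * size + c  # row-major index inside one channel plane
            line.append((row[i], row[plane + i], row[2 * plane + i]))
        pixels.append(line)
    return pixels
```

With NumPy the same rearrangement is usually written as `row.reshape(3, 32, 32).transpose(1, 2, 0)`.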

1. Install Keras

pip install keras -i https://pypi.tuna.tsinghua.edu.cn/simple --user

 

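Running the script below automatically downloads CIFAR-10 and saves it under ~/.keras/datasets/. Alternatively, download cifar-10-python.tar.gz from the official site, rename it to cifar-10-batches-py.tar.gz, copy it into ~/.keras/datasets/, and then run the script.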
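The manual route can be sketched as a small helper. This assumes the archive has already been downloaded somewhere on disk; `stage_cifar10_archive` is a hypothetical name, and the target file name simply follows the renaming step described above. Other Keras versions may lay out their cache differently, so treat it as a starting point rather than a guaranteed recipe.

```python
import os
import shutil


def stage_cifar10_archive(downloaded_path, datasets_dir=None):
    """Copy a manually downloaded archive into the Keras dataset cache.

    The default cache folder (~/.keras/datasets) and the target file
    name are taken from the text above, not verified against every
    Keras version.
    """
    if datasets_dir is None:
        datasets_dir = os.path.join(os.path.expanduser("~"), ".keras", "datasets")
    os.makedirs(datasets_dir, exist_ok=True)
    target = os.path.join(datasets_dir, "cifar-10-batches-py.tar.gz")
    shutil.copyfile(downloaded_path, target)
    return target
```

For example, `stage_cifar10_archive("cifar-10-python.tar.gz")` copies the file from the current directory into the cache under the new name.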

When the script finishes, the folders "cifar10_jpg_train" and "cifar10_jpg_test" contain the converted JPG images, grouped into one subfolder per label.

from keras.datasets import cifar10
from PIL import Image
import os
import shutil


def cifar10_to_jpg(img_dir, images, labels):
    """Save every image as <img_dir>/<label>/<label>_<index>.jpg."""
    # Start from an empty output directory on every run.
    if os.path.exists(img_dir):
        shutil.rmtree(img_dir)
    os.mkdir(img_dir)
    for idx, (x, y) in enumerate(zip(images, labels)):
        label = int(y[0])  # labels have shape (N, 1)
        label_dir = os.path.join(img_dir, str(label))
        os.makedirs(label_dir, exist_ok=True)
        img_path = os.path.join(label_dir, "{}_{}.jpg".format(label, idx))
        # x is a 32x32x3 uint8 RGB array; note that JPEG compression is lossy.
        Image.fromarray(x).save(img_path)


if __name__ == "__main__":
    (x_train, y_train), (x_test, y_test) = cifar10.load_data()
    cifar10_to_jpg("cifar10_jpg_train", x_train, y_train)
    cifar10_to_jpg("cifar10_jpg_test", x_test, y_test)
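A quick way to check the conversion is to count the JPG files in each label folder. The sketch below assumes the directory layout produced by the script above (one subfolder per numeric label); `count_per_label` is a hypothetical helper name. CIFAR-10 has 5000 training and 1000 test images per class, so those are the counts to expect.

```python
import os


def count_per_label(img_dir):
    """Return {label_folder_name: number_of_jpg_files} for img_dir."""
    counts = {}
    for label in sorted(os.listdir(img_dir)):
        label_dir = os.path.join(img_dir, label)
        if not os.path.isdir(label_dir):
            continue  # skip stray files such as .DS_Store
        counts[label] = sum(
            1 for name in os.listdir(label_dir) if name.endswith(".jpg")
        )
    return counts


if __name__ == "__main__":
    # Only report if the converted folder exists in the working directory.
    if os.path.isdir("cifar10_jpg_train"):
        print(count_per_label("cifar10_jpg_train"))
```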


 

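2. Use the following script to generate txt files that list each image path together with its label, which makes the data easy to read when training a model later.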

# coding: utf-8
import glob
import os

import numpy as np

img_path = "/Users/ming/Downloads/zhangming/pytorch_demo/data"
# train/val image folders
train_path = os.path.join(img_path, "cifar10_jpg_train")
val_path = os.path.join(img_path, "cifar10_jpg_test")
# output txt files
train_txt = "./cifar10_train.txt"
val_txt = "./cifar10_test.txt"


def read_img_path(root, shuffle=True):
    """Return one "path label\n" line for every JPG under root/<label>/."""
    lines = []
    for label in os.listdir(root):
        for img in glob.glob(os.path.join(root, label, "*.jpg")):
            lines.append(img + " " + str(label) + "\n")
    if shuffle:
        np.random.shuffle(lines)
    return lines


# Shuffle only the training list; the test list keeps directory order.
# The with-blocks make sure both files are flushed and closed.
with open(train_txt, "w") as f1:
    f1.writelines(read_img_path(train_path, True))

with open(val_txt, "w") as f2:
    f2.writelines(read_img_path(val_path, False))

The generated file looks like this; each line holds an image path and its label, separated by a space:

[Image 2]
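When training, the list files can be read back into (path, label) pairs. The following is a minimal sketch under the assumption that every line has the "path label" form written by the script above; `parse_list_file` is a hypothetical helper name. It splits on the last space so that paths containing spaces still parse.

```python
def parse_list_file(txt_path):
    """Read a "path label" list file into a list of (path, int_label)."""
    samples = []
    with open(txt_path) as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # ignore blank lines
            # Split on the last space only: the label never contains one.
            path, label = line.rsplit(" ", 1)
            samples.append((path, int(label)))
    return samples
```

A dataset class in the training code can then open each path with PIL and use the integer as the class index.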

 
