Example:
import numpy as np
b = np.zeros((2, 3, 5))
print(b)
print(b.ndim)
print(b.size)
print(b.shape)
Output:
[[[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]]
[[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]]]
3
30
(2, 3, 5)
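Explanation:
A multidimensional array in NumPy is called an ndarray. In the example above, ndim is the number of dimensions (3), size is the total number of elements (2 × 3 × 5 = 30), and shape is a tuple giving the length of each dimension, (2, 3, 5).
The most common way to create an array is the array() function. It has one required parameter, an array-like object; the remaining parameters, such as dtype, are optional.
We can pass a flat or nested list, nested tuples, a list of tuples, or a list (or tuple) that mixes tuples and lists. In short, any array-like sequence works. (Among Python's built-in containers, lists and tuples are treated as sequences. A dict or a set is not converted element by element; array() wraps it in a zero-dimensional object array instead.)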
import numpy as np
# Nested lists
a = np.array([[1, 2, 3], [4, 5, 6]])
print(a)
# Nested tuples
b = np.array(((3, 2, 1), (6, 5, 4)))
print(b)
# A list made of tuples and lists
c = np.array([(1, 2, 3), [4, 5, 6], (7, 8, 9)])
print(c)
# A tuple made of tuples and lists
d = np.array(([3, 2, 1], (6, 5, 4), [9, 8, 7]))
print(d)
Output:
[[1 2 3]
[4 5 6]]
[[3 2 1]
[6 5 4]]
[[1 2 3]
[4 5 6]
[7 8 9]]
[[3 2 1]
[6 5 4]
[9 8 7]]
To specify the element type of an array, set the dtype parameter of array(), for example:
Code:
a = np.array([[1,2],[3,4]],dtype = 'D')
a
Output:
array([[1.+0.j, 2.+0.j],
[3.+0.j, 4.+0.j]])
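NumPy supports many data types; each array holds elements of a single type.
A data type object is an instance of the numpy.dtype class.
Both array() and arange() accept a dtype parameter, which sets the data type of the elements.
Code: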
import numpy as np
a1 = np.array((2,3))
print('a1 = ', a1)
a2 = np.array((2, 3), dtype='f')  # single-precision float
print('a2 = ', a2)
b1 = np.arange(2, 13, 2)  # step of 2
print('b1 = ', b1)
b2 = np.arange(2, 13, 2, dtype='D')  # double-precision complex
print('b2 = ', b2)
Output:
a1 = [2 3]
a2 = [2. 3.]
b1 = [ 2 4 6 8 10 12]
b2 = [ 2.+0.j 4.+0.j 6.+0.j 8.+0.j 10.+0.j 12.+0.j]
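The one-character codes 'f' and 'D' used above are shorthand for concrete NumPy types. The following minimal sketch inspects the dtype attribute to see what each code resolves to; the variable names are illustrative only.

```python
import numpy as np

# 'f' and 'D' are type codes; dtype shows the concrete type behind each one.
a2 = np.array((2, 3), dtype='f')
b2 = np.arange(2, 13, 2, dtype='D')

print(a2.dtype, a2.dtype.itemsize)  # float32, 4 bytes per element
print(b2.dtype, b2.dtype.itemsize)  # complex128, 16 bytes per element

# The same type can be requested with an explicit NumPy type object.
a3 = np.array((2, 3), dtype=np.float32)
print(a3.dtype == a2.dtype)  # True
```

When dtype is omitted, as for a1 above, NumPy infers an integer type whose width depends on the platform.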
def zeros(shape, dtype=None, order='C'):
    """
    Return a new array of given shape and type, filled with zeros.
    Parameters
    ----------
    shape : int or tuple of ints
        Shape of the new array, e.g., (2, 3) or 2.
    dtype : data-type, optional
        The desired data-type for the array, e.g., numpy.int8. Default is
        numpy.float64.
    order : {'C', 'F'}, optional, default: 'C'
        Whether to store multi-dimensional data in row-major
        (C-style) or column-major (Fortran-style) order in
        memory.
    Returns
    -------
    out : ndarray
        Array of zeros with the given shape, dtype, and order.
    """
def ones(shape, dtype=None, order='C'):
    """
    Return a new array of given shape and type, filled with ones.
    """
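As a small illustration of the shape, dtype, and order parameters described above, the sketch below creates three arrays and checks their properties. The variable names are arbitrary.

```python
import numpy as np

z = np.zeros(4)                        # an int shape gives a 1-D array
i8 = np.zeros((2, 3), dtype=np.int8)   # explicit element type
f = np.ones((2, 3), order='F')         # column-major (Fortran-style) layout

print(z.shape, z.dtype)                # (4,) float64
print(i8.dtype)                        # int8
print(f.flags['F_CONTIGUOUS'])         # True
```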
arange([start,] stop[, step,], dtype=None)
    Parameters
    ----------
    start : number, optional
        Start of interval. The interval includes this value. The default
        start value is 0.
    stop : number
        End of interval. The interval does not include this value, except
        in some cases where step is not an integer and floating point
        round-off affects the length of out.
    step : number, optional
        Spacing between values. For any output out, this is the distance
        between two adjacent values, out[i+1] - out[i]. The default
        step size is 1. If step is specified as a position argument,
        start must also be given.
    dtype : dtype
        The type of the output array. If dtype is not given, infer the data
        type from the other input arguments.
    Returns
    -------
    arange : ndarray
Example:
print(np.zeros((3,3)))
print()
print(np.ones((3,3),dtype='f'))
print()
print(np.arange(6))
print(np.arange(1,6))
print(np.arange(1,6,2))
Output:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
[[1. 1. 1.]
[1. 1. 1.]
[1. 1. 1.]]
[0 1 2 3 4 5]
[1 2 3 4 5]
[1 3 5]
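The arange() docstring warns that a non-integer step can make the length of the result depend on floating-point round-off. The sketch below shows one common way to sidestep this, under the assumption that the values can be produced from an integer range and then scaled; the names r and s are illustrative.

```python
import numpy as np

# With a float step, the element count is subject to round-off,
# so it is printed here rather than assumed.
r = np.arange(1, 1.3, 0.1)
print(len(r), r)

# Building the range from integers and scaling afterwards keeps the
# element count exact: 10, 11, 12 -> 1.0, 1.1, 1.2
s = np.arange(10, 13) / 10
print(len(s), s)
```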
def linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None):
    """
    Return evenly spaced numbers over a specified interval.
    Returns num evenly spaced samples, calculated over the
    interval [start, stop].
    The endpoint of the interval can optionally be excluded.
    Parameters
    ----------
    start : scalar
        The starting value of the sequence.
    stop : scalar
        The end value of the sequence, unless endpoint is set to False.
        In that case, the sequence consists of all but the last of num + 1
        evenly spaced samples, so that stop is excluded. Note that the step
        size changes when endpoint is False.
    num : int, optional
        Number of samples to generate. Default is 50. Must be non-negative.
    endpoint : bool, optional
        If True, stop is the last sample. Otherwise, it is not included.
        Default is True.
    retstep : bool, optional
        If True, return (samples, step), where step is the spacing
        between samples.
    dtype : dtype, optional
        The type of the output array. If dtype is not given, infer the data
        type from the other input arguments.
        .. versionadded:: 1.9.0
    Returns
    -------
    samples : ndarray
        There are num equally spaced samples in the closed interval
        [start, stop] or the half-open interval [start, stop)
        (depending on whether endpoint is True or False).
    step : float, optional
        Only returned if retstep is True
        Size of spacing between samples.
    """
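By default, linspace() works on the closed interval [start, stop].
The num parameter is the number of samples to generate; the default is 50.
When endpoint is True, the endpoint stop is included as the last sample. When it is False, stop is excluded.
When retstep is True, the spacing between adjacent elements is returned together with the array. When it is False, the spacing is not returned.
Example: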
print(np.linspace(2,8,num=4))
print(np.linspace(2,8,num=4,retstep=True))
Output:
[2. 4. 6. 8.]
(array([2., 4., 6., 8.]), 2.0)
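The effect of endpoint on the step size is easiest to see side by side. This sketch requests three samples over the same interval with and without the endpoint; the variable names are illustrative.

```python
import numpy as np

# Closed interval [2, 8]: the step is (8 - 2) / (3 - 1) = 3
closed, step_closed = np.linspace(2, 8, num=3, retstep=True)

# Half-open interval [2, 8): the step is (8 - 2) / 3 = 2
half_open, step_open = np.linspace(2, 8, num=3, endpoint=False, retstep=True)

print(closed, step_closed)      # [2. 5. 8.] 3.0
print(half_open, step_open)     # [2. 4. 6.] 2.0
```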
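Addition (+), subtraction (-), and multiplication (*) are all element-wise: when two arrays are combined with one of these operators, the operation is applied to each pair of elements at corresponding positions.
NumPy uses the dot() function for the matrix product.
dot(a, b, out=None)
Example: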
import numpy as np
a = np.arange(1, 4)
print('a = ', a)
b = np.linspace(1, 3, 3)
print('b = ', b)
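print(np.dot(a, b))  # i.e. 1*1 + 2*2 + 3*3 = 14
print(a.dot(b))
print(np.dot(b, a))  # Note: the matrix product is not commutative in general, so operand order matters. For two 1-D arrays dot() is the inner product, so this example does not show the difference.
Output: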
a = [1 2 3]
b = [1. 2. 3.]
14.0
14.0
14.0
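Because the 1-D example above cannot show that operand order matters, here is a minimal 2-D sketch. The matrices A and B are arbitrary values chosen for illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

print(A * B)          # element-wise product: [[0 2] [3 0]]
print(np.dot(A, B))   # matrix product A·B:   [[2 1] [4 3]]
print(np.dot(B, A))   # matrix product B·A:   [[3 4] [1 2]]
```

Here A·B swaps the columns of A while B·A swaps its rows, so the two products differ.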
In-place increment (+=) and decrement (-=).
Example:
a = np.arange(1, 4)
print('a = ', a)
b = np.linspace(1, 3, 3)
print('b = ', b)
print()
a += 1
print(a)
b -= 1
print(b)
Output:
a = [1 2 3]
b = [1. 2. 3.]
[2 3 4]
[0. 1. 2.]
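A universal function (ufunc) operates on every element of the input array. The results form a new array whose size is the same as that of the input.
Examples: sqrt(), log(), sin().
An aggregate function operates on a set of values and returns a single value as the result.
Examples: sum(), min(), max(), mean(), std().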
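A short sketch contrasting the two kinds of function on the same input; the array values are chosen only so that the results are easy to check by hand.

```python
import numpy as np

a = np.array([1.0, 4.0, 9.0])

# Universal function: one result per input element
roots = np.sqrt(a)
print(roots)                       # [1. 2. 3.]
print(roots.size == a.size)        # True

# Aggregate functions: one value for the whole array
print(a.sum(), a.min(), a.max())   # 14.0 1.0 9.0
print(a.mean(), a.std())
```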
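Array indexing supports two mechanisms: positive indices (counting from the left, starting at 0) and negative indices (counting from the right, starting at -1).
1-D array example: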
import numpy as np
a = np.arange(1, 6)
print('a = ', a)
print(a[0])
print(a[1])  # positive index
print(a[-1])  # negative index
print(a[-2])
Output:
a = [1 2 3 4 5]
1
2
5
4
2-D array example:
b = np.arange(1,10).reshape((3,3))
print(b)
print(b[0][0])
print(b[1][1])
print(b[2][2])
Output:
[[1 2 3]
[4 5 6]
[7 8 9]]
1
5
9
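For 2-D arrays NumPy also accepts both indices in a single pair of brackets, and negative indices work on each axis. A minimal sketch using the same array b:

```python
import numpy as np

b = np.arange(1, 10).reshape((3, 3))

print(b[1, 1])     # 5, the same element as b[1][1]
print(b[-1, -1])   # 9, last row and last column
print(b[0, -1])    # 3, first row and last column
```

The b[i, j] form selects the element in one step, whereas b[i][j] first extracts row i and then indexes into it.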
import numpy as np
a = np.arange(0, 9)
print('a = ', a)
print('a[1:6] = ', a[1:6])  # the start index is inclusive, the stop index is exclusive
# Strided extraction
print('a[1:6:2] = ', a[1:6:2])  # a step of 2 within the slice takes every other element
print()
# Default values
print('a[0:6] = ', a[0:6])
print('a[:6:2] = ', a[:6:2])  # start omitted: the slice begins at the left end (index 0)
print()
print('a[1:] = ', a[1:])
print('a[1::2] = ', a[1::2])  # stop omitted: the slice runs to the right end of the array
print()
print('a[1:6] = ', a[1:6])
print('a[1:6:] = ', a[1:6:])  # step omitted: the step defaults to 1
Output:
a = [0 1 2 3 4 5 6 7 8]
a[1:6] = [1 2 3 4 5]
a[1:6:2] = [1 3 5]
a[0:6] = [0 1 2 3 4 5]
a[:6:2] = [0 2 4]
a[1:] = [1 2 3 4 5 6 7 8]
a[1::2] = [1 3 5 7]
a[1:6] = [1 2 3 4 5]
a[1:6:] = [1 2 3 4 5]
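Two further properties of slices are worth a quick sketch: a negative step walks the array backwards, and a basic slice is a view that shares data with the original array. The name s is illustrative.

```python
import numpy as np

a = np.arange(0, 9)

print(a[::-1])    # [8 7 6 5 4 3 2 1 0], a negative step reverses the order

s = a[1:6:2]      # view of the elements at indices 1, 3, 5
s[0] = 100        # writing through the view changes a as well
print(a[1])       # 100
```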
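import numpy as np
a = np.arange(1, 10).reshape((3, 3))
print(a, '\n')
print(a[:, 1], '\n')  # a bare colon in the first position selects all rows
print(a[1, :], '\n')  # a bare colon in the second position selects all columns
print(a[0:2, 1:3], '\n')  # rows 0 to 1, columns 1 to 2
print(a[[0, 2], 1:3])  # the list [0, 2] selects row 0 and row 2
Output: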
[[1 2 3]
[4 5 6]
[7 8 9]]
[2 5 8]
[4 5 6]
[[2 3]
[5 6]]
[[2 3]
[8 9]]
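One detail the example above does not show is the shape of the result: indexing an axis with an integer removes that axis, while slicing keeps it. A minimal sketch with the same array:

```python
import numpy as np

a = np.arange(1, 10).reshape((3, 3))

col_1d = a[:, 1]      # integer index: the column comes back as a 1-D array
col_2d = a[:, 1:2]    # slice of length 1: the result stays 2-D

print(col_1d.shape)   # (3,)
print(col_2d.shape)   # (3, 1)
```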
import numpy as np
# 1. Iterate over a 1-D array
a = np.arange(1, 6)
print(a, '\n')
for item in a:
    print(item)
print()
# 2. Iterate over a 2-D array
b = np.arange(1, 10).reshape((3, 3))
print(b, '\n')
print('2.1 Output row by row:')
for row in b:
    print(row)
print('2.2 Visit every element with nested loops:')
for row in b:
    for item in row:
        print(item)
print('2.3 Traverse the array with the 1-D iterator flat')
"""
ndarray.flat
A 1-D iterator over the array.
flat is an instance of numpy.flatiter. It visits the elements in row-major order without copying the array. (The method that returns a copy collapsed into one dimension is flatten().)
"""
for item in b.flat:  # visits the elements of b in order, as if b were one-dimensional
    print(item)
Output:
[1 2 3 4 5]
1
2
3
4
5
[[1 2 3]
 [4 5 6]
 [7 8 9]]
2.1 Output row by row:
[1 2 3]
[4 5 6]
[7 8 9]
2.2 Visit every element with nested loops:
1
2
3
4
5
6
7
8
9
2.3 Traverse the array with the 1-D iterator flat
1
2
3
4
5
6
7
8
9
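When the position of each element is needed during iteration, np.ndenumerate() yields the index tuple together with the value. This is an additional technique not used in the example above, sketched here on the same array b.

```python
import numpy as np

b = np.arange(1, 10).reshape((3, 3))

# Each step yields ((row, column), value)
pairs = []
for index, value in np.ndenumerate(b):
    pairs.append((index, int(value)))

print(pairs[0])    # ((0, 0), 1)
print(pairs[-1])   # ((2, 2), 9)
```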
import numpy as np
b = np.arange(1, 10).reshape((3, 3))
print(b, '\n')
# Aggregate functions
print(np.apply_along_axis(np.mean, axis=0, arr=b))  # mean of each column
print(np.apply_along_axis(np.mean, 1, b))  # mean of each row
print()
def foo(x):
    return x / 2
print(np.apply_along_axis(foo, 1, b))  # divide each row by 2
Output:
[[1 2 3]
[4 5 6]
[7 8 9]]
[4. 5. 6.]
[2. 5. 8.]
[[0.5 1. 1.5]
[2. 2.5 3. ]
[3.5 4. 4.5]]
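For built-in aggregates and simple arithmetic, the same results can usually be obtained without apply_along_axis(), using the axis argument of the aggregate or plain element-wise operators. The sketch below compares the two approaches on the array b from the example.

```python
import numpy as np

b = np.arange(1, 10).reshape((3, 3))

col_means = b.mean(axis=0)   # mean of each column: [4. 5. 6.]
row_means = b.mean(axis=1)   # mean of each row:    [2. 5. 8.]
halved = b / 2               # element-wise division, no helper function needed

# Both approaches agree
print(np.array_equal(col_means, np.apply_along_axis(np.mean, 0, b)))  # True
print(np.array_equal(row_means, np.apply_along_axis(np.mean, 1, b)))  # True
```

apply_along_axis() remains useful when the function applied to each 1-D slice has no vectorized equivalent.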
import numpy as np
a = np.ones(9).reshape(3, 3)
print('a = ', a, '\n')
b = np.arange(1, 10).reshape((3, 3))
print('b = ', b, '\n')
print('Horizontal stack = ', np.hstack((a, b)))
Output:
a = [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
b = [[1 2 3]
 [4 5 6]
 [7 8 9]]
Horizontal stack = [[1. 1. 1. 1. 2. 3.]
 [1. 1. 1. 4. 5. 6.]
 [1. 1. 1. 7. 8. 9.]]
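Horizontal stacking has a vertical counterpart, np.vstack(), and both are special cases of np.concatenate() along a chosen axis. A minimal sketch with the same two arrays:

```python
import numpy as np

a = np.ones(9).reshape(3, 3)
b = np.arange(1, 10).reshape((3, 3))

v = np.vstack((a, b))                # b is placed below a
h = np.concatenate((a, b), axis=1)   # same result as np.hstack((a, b))

print(v.shape)                               # (6, 3)
print(np.array_equal(h, np.hstack((a, b))))  # True
```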
import numpy as np
b = np.arange(1, 10).reshape((3, 3))
print('b = ', b, '\n')
print('Horizontal split = ', np.hsplit(b, 3))  # the cuts run vertically and move from left to right, so the array is divided by columns
print('hsplit is equivalent to split(axis=1)', np.split(b, indices_or_sections=3, axis=1))  # the second argument of split() is the number of sub-arrays; axis is the axis to split along
print('Vertical split = ', np.vsplit(b, 3))
Output:
b = [[1 2 3]
 [4 5 6]
 [7 8 9]]
Horizontal split = [array([[1],
       [4],
       [7]]), array([[2],
       [5],
       [8]]), array([[3],
       [6],
       [9]])]
hsplit is equivalent to split(axis=1) [array([[1],
       [4],
       [7]]), array([[2],
       [5],
       [8]]), array([[3],
       [6],
       [9]])]
Vertical split = [array([[1, 2, 3]]), array([[4, 5, 6]]), array([[7, 8, 9]])]
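The split functions also accept a list of indices instead of a count, which allows pieces of unequal size. The sketch below cuts the same array b after the first row and after the second column.

```python
import numpy as np

b = np.arange(1, 10).reshape((3, 3))

# Cut before row index 1: rows [0:1] and rows [1:]
top, bottom = np.split(b, [1], axis=0)
print(top)      # [[1 2 3]]
print(bottom)   # [[4 5 6] [7 8 9]]

# Cut before column index 2: columns [0:2] and columns [2:]
left, right = np.hsplit(b, [2])
print(left.shape, right.shape)   # (3, 2) (3, 1)
```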
import numpy as np
b = np.arange(1, 10).reshape((3, 3))
print('b = ', b, '\n')
print('tolist(): ', b.tolist())  # convert the array to a Python list
print('astype(): ', b.astype('complex'))  # convert the array elements to the given type
Output:
b = [[1 2 3]
[4 5 6]
[7 8 9]]
tolist(): [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
astype(): [[1.+0.j 2.+0.j 3.+0.j]
[4.+0.j 5.+0.j 6.+0.j]
[7.+0.j 8.+0.j 9.+0.j]]
Example:
import numpy as np
b = np.arange(1, 10).reshape(3, 3)
print('b = ', b)
b_copy = b.copy()
print('b_copy = ', b_copy)
b_view = b.view()
print('b_view = ', b_view, '\n')
b.flat = 0
print('b = ', b)
print('b_copy = ', b_copy)
print('b_view = ', b_view)
Output:
b = [[1 2 3]
[4 5 6]
[7 8 9]]
b_copy = [[1 2 3]
[4 5 6]
[7 8 9]]
b_view = [[1 2 3]
[4 5 6]
[7 8 9]]
b = [[0 0 0]
[0 0 0]
[0 0 0]]
b_copy = [[1 2 3]
[4 5 6]
[7 8 9]]
b_view = [[0 0 0]
[0 0 0]
[0 0 0]]
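The output above shows that the view follows changes to b while the copy does not. Whether two arrays share data can also be checked directly; the sketch below uses np.shares_memory() and the base attribute for that purpose.

```python
import numpy as np

b = np.arange(1, 10).reshape(3, 3)
b_copy = b.copy()
b_view = b.view()

print(np.shares_memory(b, b_view))   # True: the view uses b's data
print(np.shares_memory(b, b_copy))   # False: the copy has its own data
print(b_copy.base is None)           # True: a copy owns its memory
```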
Example:
import numpy as np
a = np.arange(16).reshape(4, 4)
np.save('saved_data', a)  # saves to saved_data.npy; the .npy extension is appended automatically
print('a = ', a, '\n')
loaded_data = np.load('saved_data.npy')  # when loading, the extension must be given explicitly
print('loaded_data = ', loaded_data)
Output:
a = [[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]]
loaded_data = [[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]]
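To store several arrays in one file, NumPy provides np.savez(), which writes an .npz archive keyed by name. The following is a minimal sketch; the file name arrays.npz and the keys a and b are hypothetical, and a temporary directory is used so that nothing is left on disk.

```python
import os
import tempfile

import numpy as np

a = np.arange(16).reshape(4, 4)
b = np.ones(3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'arrays.npz')   # hypothetical file name
    np.savez(path, a=a, b=b)                 # keyword names become the keys
    with np.load(path) as data:              # closes the archive on exit
        loaded_a = data['a']
        loaded_b = data['b']

print(np.array_equal(a, loaded_a))   # True
print(np.array_equal(b, loaded_b))   # True
```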
This example reads data from a CSV file. Note that some values in the file are missing:
#data.csv
id,value1,value2,value3
1,123,1.4,23
2,110,,18
3,,2.1,19
Example:
import numpy as np
data = np.genfromtxt('data.csv', delimiter = ',', names = True)
print(data)
print(data['id'])  # the header names act as labels, so a column can be selected by name
print(data[0])  # a numeric index selects a row
Output:
[(1., 123., 1.4, 23.) (2., 110., nan, 18.) (3., nan, 2.1, 19.)]
[1. 2. 3.]
(1., 123., 1.4, 23.)
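Missing fields are read as nan, so later calculations need to account for them. The sketch below is one possible way to do this: it reads the same CSV content from an in-memory string (an assumption made so that the example runs without a data.csv file), locates the missing entries with np.isnan(), and computes a mean that ignores them.

```python
import io

import numpy as np

# Same content as data.csv, held in memory for a self-contained example
csv_text = """id,value1,value2,value3
1,123,1.4,23
2,110,,18
3,,2.1,19
"""

data = np.genfromtxt(io.StringIO(csv_text), delimiter=',', names=True)

missing = np.isnan(data['value2'])
print(missing)                           # [False  True False]
mean_value2 = np.nanmean(data['value2'])
print(mean_value2)                       # mean of 1.4 and 2.1, ignoring nan
```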