SciPy Tutorial - Distance computation with scipy.spatial.distance

http://blog.csdn.net/pipisorry/article/details/48814183

Distance computation

Distance functions for matrices

Each row of the matrix argument is one observation, and the result is the metric distance between every pair of rows. Distance matrix computation from a collection of raw observation vectors stored in a rectangular array.

scipy.spatial.distance.pdist(X, metric='euclidean', p=2, w=None, V=None, VI=None)

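These functions compute pairwise distances, not similarities. For example, after computing the cosine distance you need 1 - (cosine distance) to get the cosine similarity. This follows from the definition of the cosine distance, 1 - u·v / (||u|| ||v||).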
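The distance/similarity relationship can be checked with a small sketch. The array X below is made up for illustration and is not from the original post.

```python
import numpy as np
from scipy.spatial.distance import pdist

# Three observations (rows) in 2-D space; illustrative values only.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# Condensed cosine distances, in the order (0,1), (0,2), (1,2).
cos_dist = pdist(X, metric='cosine')

# Cosine similarity is recovered as 1 - cosine distance.
cos_sim = 1.0 - cos_dist

print(cos_dist)
print(cos_sim)
```

Rows 0 and 1 are orthogonal, so their cosine distance is 1 and their similarity is 0.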

Pairwise distances between observations in n-dimensional space.

The larger the value, the less similar the two observations are.

Y = pdist(X, 'euclidean')    # d = sqrt((x1-x2)^2 + (y1-y2)^2 + (z1-z2)^2)

Y = pdist(X, 'minkowski', p=p)

...
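As a rough sketch of the two calls above, the following compares 'euclidean' with 'minkowski' for p=2 and p=1. The data is made up for illustration, and p is passed as a keyword argument.

```python
import numpy as np
from scipy.spatial.distance import pdist

# Three collinear points; illustrative values only.
X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [6.0, 8.0]])

d_euclid = pdist(X, 'euclidean')      # pairs (0,1), (0,2), (1,2)
d_mink2 = pdist(X, 'minkowski', p=2)  # Minkowski with p=2 is the Euclidean distance
d_mink1 = pdist(X, 'minkowski', p=1)  # Minkowski with p=1 is the city-block distance

print(d_euclid)
print(d_mink2)
print(d_mink1)
```

For n observations the condensed result has n*(n-1)/2 entries, here 3.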


scipy.spatial.distance.cdist(XA, XB, metric='euclidean', p=2, V=None, VI=None, w=None)

Computes distance between each pair of the two collections of inputs.

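In the simplest case XA and XB each hold a single vector, but each must still be passed as a 2-D array with one row per vector; otherwise cdist raises ValueError: XA must be a 2-dimensional array. The result is then the metric distance between the two vectors.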
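A minimal sketch of both uses of cdist: two single vectors reshaped to one-row 2-D arrays, and two small collections. All values are made up for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist

u = np.array([1.0, 2.0])
v = np.array([4.0, 6.0])

# cdist needs 2-D inputs, so each 1-D vector becomes a single-row array.
d = cdist(u.reshape(1, -1), v.reshape(1, -1), metric='euclidean')
print(d.shape, d[0, 0])

# Two collections: 2 observations in XA, 3 in XB.
XA = np.array([[0.0, 0.0],
               [1.0, 0.0]])
XB = np.array([[0.0, 1.0],
               [1.0, 1.0],
               [2.0, 0.0]])

# D[i, j] is the distance between XA[i] and XB[j].
D = cdist(XA, XB)
print(D)
```

Unlike pdist, the result is a full rectangular matrix with one row per observation in XA and one column per observation in XB.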

scipy.spatial.distance.squareform(X, force='no', checks=True)

Converts a vector-form distance vector to a square-form distance matrix, and vice-versa.

Note: when converting from square form, the distance matrix 'X' must be symmetric and its diagonal must be zero.
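The round trip between the two forms can be sketched as follows. The observations are made up for illustration.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Illustrative observations, one per row.
X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [6.0, 8.0]])

y = pdist(X)            # condensed vector of length n*(n-1)/2 = 3
D = squareform(y)       # 3x3 symmetric matrix with a zero diagonal
y_back = squareform(D)  # converting the square form back gives the condensed vector

print(D)
print(y_back)
```

squareform picks the direction of the conversion from the shape of its input: a 1-D array becomes a square matrix, and a square matrix becomes a 1-D array.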

Examples of matrix distance computation

Example 1

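# assumes: import numpy as np; from scipy.spatial import distance as dis
x = np.array([[0, 2, 3], [2, 0, 6], [3, 6, 0]])
x
array([[0, 2, 3],
       [2, 0, 6],
       [3, 6, 0]])
y=dis.pdist(x)    # each row of x is treated as one observation
y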
array([ 4.12310563,  5.83095189,  8.54400375])
z=dis.squareform(y)
z
array([[ 0.        ,  4.12310563,  5.83095189],
       [ 4.12310563,  0.        ,  8.54400375],
       [ 5.83095189,  8.54400375,  0.        ]])
type(z)
numpy.ndarray
type(y)
numpy.ndarray

Example 2

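# assumes: import numpy as np; from scipy import spatial
# sim = np.array([[-2.85, -0.45], [-2.5, 1.04]])
print(sim)
print(spatial.distance.cdist(sim[0].reshape((1, 2)), sim[1].reshape((1, 2)), metric='cosine'))
print(spatial.distance.pdist(sim, metric='cosine'))

Output:
[[-2.85 -0.45]
 [-2.5   1.04]]
[[ 0.14790689]]
[ 0.14790689]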

皮皮blog



Predicates for checking the validity of distance matrices

is_valid_dm(D[, tol, throw, name, warning]) Returns True if input array is a valid distance matrix.
is_valid_y(y[, warning, throw, name]) Returns True if the input array is a valid condensed distance matrix.
num_obs_dm(d) Returns the number of original observations that correspond to a square, redundant distance matrix.
num_obs_y(Y) Returns the number of original observations that correspond to a condensed distance matrix.
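A short sketch of the four helpers listed above, using made-up observations. The matrix `bad` is a hypothetical invalid input built by breaking the symmetry of a valid one.

```python
import numpy as np
from scipy.spatial.distance import (pdist, squareform, is_valid_dm,
                                    is_valid_y, num_obs_dm, num_obs_y)

# Four illustrative observations, one per row.
X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

y = pdist(X)       # condensed distance matrix, length 4*3/2 = 6
D = squareform(y)  # square, redundant distance matrix, shape (4, 4)

valid_dm = is_valid_dm(D)  # square, symmetric, zero diagonal
valid_y = is_valid_y(y)    # length is a valid n*(n-1)/2
n_from_dm = num_obs_dm(D)  # number of observations behind D
n_from_y = num_obs_y(y)    # number of observations behind y

# Break the symmetry to get an invalid distance matrix.
bad = D.copy()
bad[0, 1] += 1.0
valid_bad = is_valid_dm(bad)

print(valid_dm, valid_y, n_from_dm, n_from_y, valid_bad)
```

With the default throw=False the predicates return False on invalid input rather than raising an exception.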



Distance functions between two vectors u and v

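As an illustration, here are a few of the two-vector functions in scipy.spatial.distance. This selection and the vectors u and v are chosen for the example; the module offers many more metrics.

```python
import numpy as np
from scipy.spatial.distance import euclidean, cityblock, chebyshev, cosine

# Two illustrative 1-D vectors; unlike pdist and cdist, these functions take 1-D input.
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])

d_euclidean = euclidean(u, v)  # sqrt(sum((u - v)**2))
d_cityblock = cityblock(u, v)  # sum(|u - v|)
d_chebyshev = chebyshev(u, v)  # max(|u - v|)
d_cosine = cosine(u, v)        # 1 - u.v / (||u|| * ||v||)

print(d_euclidean, d_cityblock, d_chebyshev, d_cosine)
```

Each function returns a single scalar distance for one pair of vectors.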



ref: scipy-ref-0.14.0, p. 933

