Singular Value Decomposition: Study Notes

I was originally reading this paper: Information-theoretical label embeddings for large-scale image classification.

I emailed the author to ask whether the code could be shared, and this is what the author wrote back:

I don't have any of that code anymore. But the code for computing PMI-SVD embeddings is basically just a few lines of numpy. The code for training an image model is just regular Keras. There is nothing complicated or magic.

If occ is a vector counting label occurrences, coocc is a matrix counting cooccurrences, and cardinal is the total number of occurrences, then the embedding code is:

pmi = cardinal * coocc / (np.dot(occ, occ.T) + 10e-5)
pmi[coocc > 0.] = np.log(pmi[coocc > 0.])
pmi[coocc == 0.] = 0.
u, s, _ = np.linalg.svd(pmi)
sigma = np.eye(len(s)) * np.sqrt(s)  
embeddings = np.dot(u, sigma[:dim].T)
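
To check my understanding, here is a minimal runnable sketch of the snippet above. It assumes occ is a 1-D count vector, so it uses np.outer; with a 1-D array, np.dot(occ, occ.T) would return a scalar, so the email presumably treats occ as a column vector. The function name pmi_svd_embeddings, the eps parameter, and the toy counts are my own illustrations, not taken from the paper or the email.

```python
import numpy as np


def pmi_svd_embeddings(occ, coocc, cardinal, dim, eps=1e-4):
    """Sketch of the PMI-SVD recipe from the email (hypothetical helper)."""
    occ = np.asarray(occ, dtype=float)
    coocc = np.asarray(coocc, dtype=float)

    # Ratio of joint counts to the product of marginal counts.
    # np.outer builds the full matrix of occ[i] * occ[j] products.
    # eps=1e-4 is the same value as the 10e-5 in the email.
    ratio = cardinal * coocc / (np.outer(occ, occ) + eps)

    # Take the log only where two labels actually co-occur; other entries stay 0.
    pmi = np.zeros_like(ratio)
    mask = coocc > 0
    pmi[mask] = np.log(ratio[mask])

    # Keep the first `dim` left singular vectors, each scaled by the square
    # root of its singular value (equivalent to u @ sigma[:dim].T in the email).
    u, s, _ = np.linalg.svd(pmi)
    return u[:, :dim] * np.sqrt(s[:dim])


# Toy counts for three labels (made up for illustration).
occ = np.array([3.0, 2.0, 1.0])
coocc = np.array([[3.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 1.0]])
cardinal = 6.0

emb = pmi_svd_embeddings(occ, coocc, cardinal, dim=2)
print(emb.shape)  # (3, 2): one 2-dimensional embedding per label
```

Each row of emb is the embedding of one label; dim controls how many singular directions are kept.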

The PMI (pointwise mutual information) embedding computation here relies on singular value decomposition (SVD).

Here is the relevant description in the paper:

[Screenshot from the paper]

So I went looking for articles on singular value decomposition and found this answer on Zhihu.

The answer mentions rank-1 matrices, so I looked up a few more articles to refresh my memory of what rank is.
http://blog.csdn.net/u011240016/article/details/52811606
http://blog.csdn.net/u011240016/article/details/52926635
http://blog.csdn.net/u011240016/article/details/52805663
http://blog.csdn.net/u011240016/article/details/53386184
http://blog.csdn.net/u011240016/article/details/52869027

In short, the rank of a matrix is the maximum number of linearly independent row (or column) vectors it contains.
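
A quick way to check this definition numerically is np.linalg.matrix_rank. The matrices A, B, and C below are hypothetical examples of my own, not taken from the articles above.

```python
import numpy as np

# Second row is 2 * first row, so only one row is linearly independent.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

# The identity matrix has two independent rows.
B = np.array([[1.0, 0.0],
              [0.0, 1.0]])

# The outer product of two nonzero vectors is a rank-1 matrix:
# every row is a multiple of the same vector.
C = np.outer([1.0, 2.0, 3.0], [4.0, 5.0])

rank_a = np.linalg.matrix_rank(A)
rank_b = np.linalg.matrix_rank(B)
rank_c = np.linalg.matrix_rank(C)
print(rank_a, rank_b, rank_c)  # 1 2 1
```
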
For example, take the following matrix: