Paper reading: "Multi-view clustering via adversarial view embedding and adaptive view fusion"

Li Y, Liao H. Multi-view clustering via adversarial view embedding and adaptive view fusion[J]. Applied Intelligence, 2021, 51(3): 1201-1212.

Notes on the logic of the abstract

Background of the task:

Multi-view clustering, which explores complementarity and consistency among multiple distinct feature sets to boost clustering performance, is becoming more and more useful in many real-world applications.

First, the typical pipeline of traditional multi-view methods:

Traditional approaches usually map multiple views to a unified embedding, in which some weighted mechanisms are utilized to measure the importance of each view. The embedding, serving as a clustering-friendly representation, is then sent to extra clustering algorithms.

Then the paper points out two shortcomings:

(1) However, a unified embedding cannot cover both complementarity and consistency among views and the weighted scheme measuring the importance of each view as a whole ignores the differences of features in each view. (i.e., each view's specific information is ignored)
(2) Moreover, because of lacking in proper grouping structure constraint imposed on the unified embedding, it will lead to just multi-view representation learned, which is not clustering friendly. (i.e., the learned representation is not clustering-friendly)

The proposed method:

In this paper, we propose a novel multi-view clustering method to alleviate the above problems.

How the method works:

By dividing the embedding of a view into unified and view-specific vectors explicitly, complementarity and consistency can be reflected.
Besides, an adversarial learning process is developed to force the above embeddings to be non-trivial.
Then a fusion strategy is automatically learned, which will adaptively adjust weights for all the features in each view.
Finally, a Kullback-Leibler (KL) divergence based objective is developed to constrain the fused embedding for clustering friendly representation learning and to conduct clustering.

Keywords

Multi-view clustering
Adversarial view embedding
Adaptive view fusion
Clustering-friendly representation learning

Model overview

The model consists of five components:
view-specific encoder networks,
view reconstruction decoder networks,
view classification network,
adaptive multi-view fusion network,
a KL-divergence-based clustering module

Given multi-view data X = {X^{(1)}, …, X^{(V)}}, let X^{(v)} denote the v-th view.

  • {view-specific encoder networks, view reconstruction decoder networks, view classification network}
    Fully connected or convolutional networks are typically used as encoders to extract a high-level embedding for each view. Each learned embedding is explicitly split into two parts, a shared (view-common) part and a view-specific part, used to capture the consistency and the complementarity of the data, respectively.
    The encoding result z^{(v)} of view v is split into a shared part c^{(v)} and a view-specific part s^{(v)}. The decoder then reconstructs the view from z^{(v)}, i.e. X̂^{(v)} = Dec^{(v)}(z^{(v)}). In addition, a view classifier is built, which takes each view's embedding as input and predicts which view it came from.
    The view encoders, the decoders, and the view classification network together form an adversarial view-embedding learning process.
    Put plainly, this adversarial embedding learning is a min-max game: the classifier is trained to identify the source view of an embedding, while the encoders are trained to fool it, all while still supporting reconstruction.

    The reconstruction loss (the generative side: encoding and decoding) can be written as
        L_rec = Σ_v ‖X^{(v)} − X̂^{(v)}‖²_F ,
    and the classification loss is the ordinary multi-class cross-entropy over view labels.
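As a rough NumPy sketch of the two losses in this min-max game (shapes are hypothetical: `logits` is n_samples × n_views, one row of view-classifier scores per embedding; squared-error reconstruction is assumed):

```python
import numpy as np

def reconstruction_loss(x, x_hat):
    # Squared-error reconstruction loss for one view (generator side).
    return np.sum((x - x_hat) ** 2)

def view_classification_loss(logits, view_labels):
    # Multi-class cross-entropy: the view classifier (discriminator)
    # tries to tell which view each embedding came from.
    z = logits - logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -np.mean(log_probs[np.arange(len(view_labels)), view_labels])
```

In the adversarial game, the classifier's parameters are updated to decrease `view_classification_loss`, while the encoders are updated to decrease `reconstruction_loss` but increase the classification loss, pushing the embeddings toward view-indistinguishability.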

  • {adaptive multi-view fusion network}
    From the previous stage we obtain the view-specific and shared representations of all views; the goal of this module is to learn a complete multi-view representation for clustering. The authors point out that since the view-specific part s^{(v)} contains feature information that does not appear in the other views, the adaptive fusion is applied to the shared parts c^{(v)}. Specifically, instead of learning a single scalar weight per view for fusion, the authors learn a weight vector w^{(v)} for each view that measures the importance of every feature in that view, under the constraint that the weight vectors of all views sum to the all-ones vector:
        Σ_{v=1}^{V} w^{(v)} = 1 .

    The final fused representation is the concatenation of all view-specific features and the fused shared feature:
        h = [s^{(1)}; …; s^{(V)}; Σ_v w^{(v)} ⊙ c^{(v)}] .

    The weight vectors are learned as follows: the shared representations of the V views are fed into a two-layer fully connected network that outputs V vectors, and a softmax is applied across the views at each feature position. All samples share the same weight-computation network for learning the weight vectors.
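A minimal sketch of the position-wise fusion (the raw scores would come from the two-layer weight network; shapes here are illustrative):

```python
import numpy as np

def position_wise_softmax(scores):
    # scores: (V, d) raw weight scores, one row per view.
    # Softmax is taken across views at each feature position, so the
    # V weight vectors sum to the all-ones vector.
    e = np.exp(scores - scores.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def fuse_shared(shared, weights):
    # shared, weights: (V, d); feature-wise weighted combination of the
    # shared parts of all views.
    return (weights * shared).sum(axis=0)
```

Because the softmax runs over the view axis rather than the feature axis, each feature position receives its own convex combination of views, which is exactly what distinguishes this scheme from a single scalar weight per view.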

  • {a KL-divergence-based clustering module}

    This is similar to the deep clustering module in DEC, except that here the construction of the target distribution p is written in the squared form of q.
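A sketch of such a DEC-style clustering module (soft assignment q via a Student's t-kernel and a sharpened target p built from q², following DEC's formulation; this paper's exact p may differ in detail):

```python
import numpy as np

def soft_assignment(z, centroids, alpha=1.0):
    # q_ij: Student's t-kernel similarity between embedding i and
    # cluster centroid j, row-normalized (as in DEC).
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    # p_ij ∝ q_ij² / f_j, with f_j the soft cluster frequency — the
    # squared-q construction mentioned above.
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def kl_loss(p, q):
    # KL(P‖Q), minimized w.r.t. the embeddings and centroids.
    return np.sum(p * np.log(p / q))
```

Squaring q before renormalizing sharpens confident assignments and down-weights ambiguous ones, which is what makes the learned representation clustering-friendly.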


Overall, the main contributions are twofold: first, a single encoder produces two embedding vectors, distinguishing view-common from view-specific representations; second, on top of this a view classifier is introduced, forming an adversarial view-learning process. The novelty is more conceptual than technical. The so-called adaptive weight-vector learning, after much elaboration, ultimately fuses everything into a single vector, and the main body of the fused representation is still the concatenation of the view-specific parts. From this angle, I think it would be more interesting to transfer the idea to VAEs and their variants, letting the per-view latent codes act as the view-specific part while the global latent, learned as a data distribution, captures the view-common features. That would better suit the interpretation and transfer of the adversarial generative process. Just my humble opinion.
