Paper reading: Vision Transformers are Parameter-Efficient Audio-Visual Learners
Term definitions
LAVISH (LAtent Audio-VISual Hybrid) adapter

Abstract
Vision transformers (ViTs) have achieved impressive results on various computer vision tasks in the last several years. In this work, we study the capability of frozen ViTs, pretrai