ViT Vision

Vision Transformer

Vision-Language

NLPrompt: Noise-Label Prompt Learning for Vision-Language Models

  • Paper: https://arxiv.org/abs/2412.01256
  • Code: GitHub - qunovo/NLPrompt

PhysVLM: Enabling Visual Language Models to Understand Robotic Physical Reachability

  • Paper: https://arxiv.org/abs/2503.08481
  • Code: GitHub - unira-zwj/PhysVLM: PhysVLM: Enabling Visual Language Models to Understand Robotic Physical Reachability

MMRL: Multi-Modal Representation Learning for Vision-Language Models

  • Paper: https://arxiv.org/abs/2503.08497
  • Code: GitHub - yunncheng/MMRL: Code for CVPR2025 "MMRL: Multi-Modal Representation Learning for Vision-Language Models" and its extension "MMRL++: Parameter-Efficient and Interaction-Aware Representation Learning for Vision-Language Models".

Object Detection

LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models

  • Paper: https://arxiv.org/abs/2501.18954
  • Code: GitHub - iSEE-Laboratory/LLMDet: (CVPR 2025 highlight) Official repository of paper "LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models"

Mr. DETR: Instructive Multi-Route Training for Detection Transformers

  • Paper: https://arxiv.org/abs/2412.10028
  • Code: GitHub - Visual-AI/Mr.DETR: [CVPR 2025] Mr. DETR: Instructive Multi-Route Training for Detection Transformers

Anomaly Detection

Object Tracking

Multiple Object Tracking as ID Prediction

  • Paper: https://arxiv.org/abs/2403.16848
  • Code: GitHub - MCG-NJU/MOTIP: [CVPR 2025] Multiple Object Tracking as ID Prediction

Omnidirectional Multi-Object Tracking

  • Paper: https://arxiv.org/abs/2503.04565
  • Code: GitHub - xifen523/OmniTrack: The official implementation of OmniTrack: Omnidirectional Multi-Object Tracking (CVPR 2025)

Medical Image

BrainMVP: Multi-modal Vision Pre-training for Medical Image Analysis

  • Paper: https://arxiv.org/abs/2410.10604
  • Code: GitHub - shaohao011/BrainMVP: This is a PyTorch implementation of BrainMVP for mpMRI brain image analysis.

Medical Image Segmentation

Test-Time Domain Generalization via Universe Learning: A Multi-Graph Matching Approach for Medical Image Segmentation

  • Paper: https://arxiv.org/abs/2503.13012
  • Code: GitHub - Yore0/TTDG-MGM: [CVPR 2025] Test-Time Domain Generalization via Universe Learning: A Multi-Graph Matching Approach for Medical Image Segmentation

Autonomous Driving

LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes

  • Project: https://ldkong.com/LiMoE
  • Paper: https://arxiv.org/abs/2501.04004
  • Code: GitHub - Xiangxu-0103/LiMoE: [CVPR'25] LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes

3D Point Cloud

Unlocking Generalization Power in LiDAR Point Cloud Registration

  • Paper: https://arxiv.org/abs/2503.10149
  • Code: GitHub - peakpang/UGP: [CVPR 2025 Highlight] Unlocking Generalization Power in LiDAR Point Cloud Registration
