CVPR2025

CVPR paper list

Thesis-related (notes from abstracts)

  • SphereUFormer: A U-Shaped Transformer for Spherical 360 Perception
    Estimates depth from 360° RGB images.

  • CroCoDL: Cross-device Collaborative Dataset for Localization (not available)

  • SemAlign3D: Semantic Correspondence between RGB-Images through Aligning 3D Object-Class Representations
    Matches semantically identical objects across two images using a point-cloud representation.

  • PlanarSplatting: Accurate Planar Surface Reconstruction in 3 Minutes
    3D reconstruction and novel view synthesis.

  • A General Adaptive Dual-level Weighting Mechanism for Remote Sensing Pansharpening
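    Remote sensing pansharpening. (Difference from super-resolution: SR takes a single image as input, whereas this task takes two: a low-resolution multispectral image that carries color, and a high-resolution panchromatic image that carries texture but no color.)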

  • Towards Generalizable Scene Change Detection
    I don't quite see what the point of this one is. From the paper: "The ability to accurately identify meaningful changes in a scene across different time steps—despite challenges such as illumination variations, seasonal changes, and weather conditions—plays a key role in the system’s effectiveness and reliability."

  • Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method
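    The task is specified in language, decomposed by chain-of-thought into a sequence of sub-tasks, and then built up using vision information. Could be useful for navigation tasks such as UAV navigation.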

  • T-FAKE: Synthesizing Thermal Images for Facial Landmarking
    Facial landmarking.

  • Opportunistic Single-Photon Time of Flight (not available)

  • AerialMegaDepth: Learning Aerial-Ground Reconstruction and View Synthesis
    Aerial-ground reconstruction and novel view synthesis.

  • AG-VPReID: A Challenging Large-Scale Benchmark for Aerial-Ground Video-based Person Re-Identification
    Aerial-ground person re-identification, so it could also serve for aerial-ground object re-identification.

  • Object-aware Sound Source Localization via Audio-Visual Scene Understanding (not available)

  • Light3R-SfM: Towards Feed-forward Structure-from-Motion
    Input: an unordered set of images; output: a 3D point cloud.

  • High Temporal Consistency through Semantic Similarity Propagation in Semi-Supervised Video Semantic Segmentation for Autonomous Flight

  • EvOcc: Accurate Semantic Occupancy for Automated Driving Using Evidence Theory (not available)

  • RUBIK: A Structured Benchmark for Image Matching across Geometric Challenges
    Image matching, organized by overlap region, scale ratio, and viewpoint-change angle.

  • SplatFlow: Self-Supervised Dynamic Gaussian Splatting in Neural Motion Flow Field for Autonomous Driving

  • [x]Satellite to GroundScape - Large-scale Consistent Ground View Generation from Satellite Views
    Converts a target from satellite view to ground view, constructing the ground-level views.

  • **XLRS-Bench: Could Your Multimodal LLMs Understand Extremely Large Ultra-High-Resolution Remote Sensing Imagery?**
    How well large models understand ultra-high-resolution remote sensing imagery.

  • Satellite Observations Guided Diffusion Model for Accurate Meteorological States at Arbitrary Resolution
    Meteorological prediction; uses satellite observations rather than RGB or multispectral images.

  • ABBSPO: Adaptive Bounding Box Scaling and Symmetric Prior based Orientation Prediction for Detecting Aerial Image Objects (not available)

  • Multi-Modal Aerial-Ground Cross-View Place Recognition with Neural ODEs (not available)

  • RobSense: A Robust Multi-modal Foundation Model for Remote Sensing with Static, Temporal, and Incomplete Data Adaptability (not available)

  • SeCap: Self-Calibrating and Adaptive Prompts for Cross-view Person Re-Identification in Aerial-Ground Networks
    Person Re-Identification in Aerial-Ground Networks

  • ROS-SAM: High-Quality Interactive Segmentation for Remote Sensing Moving Object
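    Still object detection at heart, but it mainly targets moving objects (aircraft, vehicles, ships); the difficulty is that the targets are small.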

  • Detection-Friendly Nonuniformity Correction: A Union Framework for Infrared UAV Target Detection
    Infrared UAV detection; contributes a new dataset and a detection framework.

  • JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration
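    Targets image degradation under different weather conditions in autonomous driving perception; proposes a VLM-based system that selects suitable existing expert restoration models. Introduces a new dataset and human feedback.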

  • The Change You Want To Detect: Semantic Change Detection In Earth Observation With Hybrid Data Generation

  • Effective Cloud Removal for Remote Sensing Images by an Improved Mean-Reverting Denoising Model with Elucidated Design Space

  • Towards Satellite Image Road Graph Extraction: A Global-Scale Dataset and A Novel Method

  • Learning Occlusion-Robust Vision Transformers for Real-Time UAV Tracking

  • UCM-VeID V2: A Richer Dataset and A Pre-training Method for UAV Cross-Modality Vehicle Re-Identification

  • Galaxy Walker: Geometry-aware VLMs For Galaxy-scale Understanding

  • Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation

  • Horizon-GS: Unified 3D Gaussian Splatting for Large-Scale Aerial-to-Ground Scenes

Low-level related

  • Uncertainty-guided Perturbation for Image Super-Resolution Diffusion Model
  • Reconstructing Animals and the Wild
  • Training Data Provenance Verification: Did Your Model Use Synthetic Data from My Generative Model for Training?
  • Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models
  • Pathways on the Image Manifold: Image Editing via Video Generation
  • Augmenting Perceptual Super-Resolution via Image Quality Predictors
  • MaDCoW: Marginal Distortion Correction for Wide-Angle Photography with Arbitrary Objects
  • Localized Concept Erasure for Text-to-Image Diffusion Models Using Training-Free Gated Low-Rank Adaptation
  • UniRestore: Unified Perceptual and Task-Oriented Image Restoration Model Using Diffusion Prior
  • LSNet: See Large, Focus Small
  • Evaluating Model Perception of Color Illusions in Photorealistic Scenes
  • Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing
  • Degradation-Aware Feature Perturbation for All-in-One Image Restoration
  • Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion
  • Edge-SD-SR: Low Latency and Parameter Efficient On-device Super-Resolution with Stable Diffusion via Bidirectional Conditioning
  • A Regularization-Guided Equivariant Approach for Image Restoration
  • The Power of Context: How Multimodality Improves Image Super-Resolution
  • GenDeg: Diffusion-based Degradation Synthesis for Generalizable All-In-One Image Restoration
  • Exposure-slot: Exposure-centric Representations Learning with Slot-in-Slot Attention for Region-aware Exposure Correction
  • Positive2Negative: Breaking the Information-Lossy Barrier in Self-Supervised Single Image Denoising
  • MaIR: A Locality- and Continuity-Preserving Mamba for Image Restoration
  • Decouple Distortion from Perception: Region Adaptive Diffusion for Extreme-low Bitrate Perception Image Compression
  • Rotation-Equivariant Self-Supervised Method in Image Denoising
  • CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution
  • Towards RAW Object Detection in Diverse Conditions
  • ACL: Activating Capability of Linear Attention for Image Restoration
  • Image Quality Assessment: Investigating Causal Perceptual Effects with Abductive Counterfactual Inference
  • Progressive Focused Transformer for Single Image Super-Resolution
  • HIIF: Hierarchical Encoding based Implicit Image Function for Continuous Super-resolution
  • Efficient Diffusion as Low Light Enhancer
  • ADD: Attribution-Driven Data Augmentation Framework for Boosting Image Super-Resolution
  • Bridging the Gap between Gaussian Diffusion Models and Universal Quantization for Image Compression
  • Tokenize Image Patches: Global Context Fusion for Effective Haze Removal in Large Images
  • Adapting Text-to-Image Generation with Feature Difference Instruction for Generic Image Restoration
  • TSD-SR: One-Step Diffusion with Target Score Distillation for Real-World Image Super-Resolution
  • Do Computer Vision Foundation Models Learn the Low-level Characteristics of the Human Visual System?
  • Perceptual Video Compression with Neural Wrapping
  • From Zero to Detail: Deconstructing Ultra-High-Definition Image Restoration from Progressive Spectral Perspective
  • High Dynamic Range Video Compression: A Large-Scale Benchmark Dataset and A Learned Bit-depth Scalable Compression Algorithm
  • DL2G: Degradation-guided Local-to-Global Restoration for Eyeglass Reflection Removal
  • FLAVC: Learned Video Compression with Feature Level Attention
  • Binarized Mamba-Transformer for Lightweight Quad Bayer HybridEVS Demosaicing
  • ABC-Former: Auxiliary Bimodal Cross-domain Transformer with Interactive Channel Attention for White Balance
  • Reversing Flow for Image Restoration
  • Adaptive Rectangular Convolution for Remote Sensing Pansharpening
  • DnLUT: Ultra-Efficient Color Image Denoising via Channel-Aware Lookup Tables
  • PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution
  • Proximal Algorithm Unrolling: Flexible and Efficient Reconstruction Networks for Single-Pixel Imaging
  • GCC: Generative Color Constancy via Diffusing a Color Checker
  • WISE: A Framework for Gigapixel Whole-Slide-Image Lossless Compression
  • Visual-Instructed Degradation Diffusion for All-in-One Image Restoration
  • Can Text-to-Video Generation help Video-Language Alignment?
  • VoCo-LLaMA: Towards Vision Compression with Large Language Models
  • URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration
  • MambaIC: State Space Models for High-Performance Learned Image Compression
  • HVI: A New Color Space for Low-light Image Enhancement
  • OSMamba: Omnidirectional Spectral Mamba with Dual-Domain Prior Generator for Exposure Correction
  • Adversarial Diffusion Compression for Real-World Image Super-Resolution
  • Diffusion-based Event Generation for High-Quality Image Deblurring
  • CoA: Towards Real Image Dehazing via Compression-and-Adaptation
  • PICD: Versatile Perceptual Image Compression with Diffusion Rendering
  • UHD-processer: Unified UHD Image Restoration with Progressive Frequency Learning and Degradation-aware Prompts
  • QMambaBSR: Burst Image Super-Resolution with Query State Space Model
  • Balanced Rate-Distortion Optimization in Learned Image Compression
  • UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion
  • Frequency-Biased Synergistic Design for Image Compression and Compensation
  • Iterative Predictor-Critic Code Decoding for Real-World Image Dehazing
  • Reconciling Stochastic and Deterministic Strategies for Zero-shot Image Restoration using Diffusion Model in Dual
  • Complexity Experts are Task-Discriminative Learners for Any Image Restoration
  • Adaptive Dropout: Unleashing Dropout across Layers for Generalizable Image Super-Resolution
  • Linear Attention Modeling for Learned Image Compression
  • SnowMaster: Comprehensive Real-world Image Desnowing via MLLM with Multi-Model Feedback Optimization
  • Multi-Modal Synergistic Implicit Image Enhancement for Efficient Optical Flow Estimation
  • Channel Consistency Prior and Self-Reconstruction Strategy Based Unsupervised Image Deraining
  • Distilling Spatially-Heterogeneous Distortion Perception for Blind Image Quality Assessment
  • PIDSR: Complementary Polarized Image Demosaicing and Super-Resolution
  • A Polarization-Aided Transformer for Image Deblurring via Motion Vector Decomposition
  • Neural Video Compression with Context Modulation
  • Using Powerful Prior Knowledge of Diffusion Model in Deep Unfolding Networks for Image Compressive Sensing
  • Improving Visual and Downstream Performance of Low-Light Enhancer with Vision Foundation Models Collaboration

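Summary: there is more low-level work than I expected. Super-resolution is probably the largest group, and image compression seems to be getting considerable attention. Beyond that, directions such as dehazing, deraining, desnowing, and ISP basics have only 1-2 papers each. Some works use LVMs for low-level problems, which is worth keeping an eye on.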
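A rough way to sanity-check this impression is to tally topic keywords over the titles. The following is a minimal sketch, not a full count: the `TOPICS` groups, the function name `tally_topics`, and the choice of five sample titles (copied from the list above) are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical topic groups: each topic maps to lowercase substrings
# searched for in a title. These groupings are illustrative only.
TOPICS = {
    "super-resolution": ("super-resolution",),
    "compression": ("compression",),
    "dehazing": ("dehazing", "haze removal"),
    "deraining": ("deraining",),
}


def tally_topics(titles, topics=TOPICS):
    """Count how many titles mention each topic (at most once per title)."""
    counts = Counter()
    for title in titles:
        lowered = title.lower()
        for topic, needles in topics.items():
            if any(needle in lowered for needle in needles):
                counts[topic] += 1
    return counts


# Five titles taken from the low-level list above, as a small sample.
sample_titles = [
    "Progressive Focused Transformer for Single Image Super-Resolution",
    "Linear Attention Modeling for Learned Image Compression",
    "Adversarial Diffusion Compression for Real-World Image Super-Resolution",
    "CoA: Towards Real Image Dehazing via Compression-and-Adaptation",
    "Channel Consistency Prior and Self-Reconstruction Strategy Based Unsupervised Image Deraining",
]

counts = tally_topics(sample_titles)
for topic, n in counts.most_common():
    print(f"{topic}: {n}")
```

Plain substring matching over-counts: in this sample, "compression" also matches a super-resolution paper and a dehazing paper, so any tally like this only gives a rough ordering and the titles still need a manual pass.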

Other

  • Teaching Large Language Models to Regress Accurate Image Quality Scores Using Score Distribution

Summary

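Most frequent keywords: LVM/LLM, 3DGS, image generation/understanding/editing/inpainting, multi-view, robotics
Personal impression: LVM/LLM and robotics really are the big hot topics. LVM/LLM work mostly targets downstream tasks, with some fine-tuning work. The tasks in robotics are extremely varied.
What surprised me is how many works carry the multi-view keyword. My guess is that 2D CV has already reached a fairly mature stage but ultimately has to be applied in 3D, so the multi-view 2D-3D-2D pipeline is being actively explored.
Thesis-related work: the Aerial-to-Ground keyword shows up several times, and two of those papers are dataset contributions.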
