CVPR 2024 Video Processing Roundup (Video Surveillance, Video Understanding, Video Recognition, Video Prediction, and More)

1. Video Processing Roundup

  • Learning from One Continuous Video Stream
  • Deep Video Inverse Tone Mapping Based on Temporal Clues
  • VTimeLLM: Empower LLM to Grasp Video Moments
  • Combining Frame and GOP Embeddings for Neural Video Representation
  • Learning to Predict Activity Progress by Self-Supervised Video Alignment
  • CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
  • vid-TLDR: Training Free Token Merging for Light-weight Video Transformer
    ⭐code
  • Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video
    ⭐code
  • Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement
  • Understanding Video Transformers via Universal Concept Discovery
  • Video Recognition in Portrait Mode
    project
  • VideoRF: Rendering Dynamic Radiance Fields as 2D Feature Video Streams
    project
  • Just Add π! Pose Induced Video Transformers for Understanding Activities of Daily Living
    ⭐code
  • A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames
  • Vista-LLaMA: Reliable Video Narrator via Equal Distance to Visual Tokens
    project
  • Towards HDR and HFR Video from Rolling-Mixed-Bit Spikings
  • Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models
  • Sleep Monitoring
    • SleepVST: Sleep Staging from Near-Infrared Video Signals using Pre-Trained Transformers
  • Video Understanding
    • Compositional Video Understanding with Spatiotemporal Structure-based Transformers
    • Action Scene Graphs for Long-Form Understanding of Egocentric Videos
    • HIG: Hierarchical Interlacement Graph Approach to Scene Graph Generation in Video Understanding
    • A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives
      project
    • Koala: Key Frame-Conditioned Long Video-LLM
    • MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
      ⭐code
    • Abductive Ego-View Accident Video Understanding for Safe Driving Perception
      project
    • OmniVid: A Generative Framework for Universal Video Understanding
      ⭐code
    • A Unified Framework for Human-centric Point Cloud Video Understanding
    • Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection
    • MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
      project
    • TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
      ⭐code
    • Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
      ⭐code
  • Video Summarization
    • Previously on ... From Recaps to Story Summarization
      project
    • Scaling Up Video Summarization Pretraining with Large Language Models
    • CSTA: CNN-based Spatiotemporal Attention for Video Summarization
      ⭐code
  • Video Reconstruction
    • HDRFlow: Real-Time HDR Video Reconstruction with Large Motions
      ⭐code
  • Video Representation
    • DS-NeRV: Implicit Neural Video Representation with Decomposed Static and Dynamic Codes
      project
  • Video Interpretation
    • Visual Objectification in Films: Towards a New AI Task for Video Interpretation
  • Movie Description
    • MICap: A Unified Model for Identity-Aware Movie Descriptions
      project
  • Video Surveillance
    • Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges
      dataset
  • Video Prediction
    • Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes
    • ExtDM: Distribution Extrapolation Diffusion Model for Video Prediction
      ⭐code
      project
  • Video Stabilization
    • Harnessing Meta-Learning for Improving Full-Frame Video Stabilization
    • 3D Multi-frame Fusion for Video Stabilization
  • Video Recognition
    • OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
      ⭐code
      project
  • Video Conversation
    • BT-Adapter: Video Conversation is Feasible Without Video Instruction Tuning
      ⭐code
  • Video Relighting
    • Real-time 3D-aware Portrait Video Relighting
  • Video Harmonization
    • Video Harmonization with Triplet Spatio-Temporal Variation Patterns
      VILP
  • Video Frame Interpolation
    • Video Frame Interpolation via Direct Synthesis with the Event-based Reference
    • IQ-VFI: Implicit Quadratic Motion Estimation for Video Frame Interpolation
    • EVS-assisted Joint Deblurring, Rolling-Shutter Correction and Video Frame Interpolation through Sensor Inverse Modeling
    • TTA-EVF: Test-Time Adaptation for Event-based Video Frame Interpolation via Reliable Pixel and Sample Estimation
    • Sparse Global Matching for Video Frame Interpolation with Large Motion
      ⭐code
    • Perception-Oriented Video Frame Interpolation via Asymmetric Blending
      ⭐code
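      A breakthrough in visual quality for video frame interpolation: Shanghai Jiao Tong University proposes PerVFI, a new paradigm for video frame interpolation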
    • SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation
      project
  • Video Subject Swapping
    • VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence
      project
  • Video Anomaly Detection
    • Open-Vocabulary Video Anomaly Detection
    • Multi-Scale Video Anomaly Detection by Multi-Grained Spatio-Temporal Representation Learning
    • Harnessing Large Language Models for Training-free Video Anomaly Detection
      ⭐code
    • Collaborative Learning of Anomalies with Privacy (CLAP) for Unsupervised Video Anomaly Detection: A New Baseline
      ⭐code
    • Prompt-Enhanced Multiple Instance Learning for Weakly Supervised Video Anomaly Detection
    • MULDE: Multiscale Log-Density Estimation via Denoising Score Matching for Video Anomaly Detection
    • PREGO: Online Mistake Detection in PRocedural EGOcentric Videos
    • Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors
      ⭐code
    • Text Prompt with Normality Guidance for Weakly Supervised Video Anomaly Detection
    • GlitchBench: Can Large Multimodal Models Detect Video Game Glitches?
      project
  • Video Scene Detection
    • Neighbor Relations Matter in Video Scene Detection
  • Video Mirror Detection
    • Effective Video Mirror Detection with Inconsistent Motion Cues
  • Automated Movie Trailer Generation
    • Towards Automated Movie Trailer Generation
  • Conversational Music Recommendation for Videos
    • MuseChat: A Conversational Music Recommendation System for Videos
  • Video Paragraph Grounding
    • Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding
  • Video Grounding
    • SnAG: Scalable and Accurate Video Grounding
      ⭐code
    • Context-Guided Spatio-Temporal Video Grounding
      ⭐code
    • Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding
    • What, When, and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
