Update Dates 2401

2401

* *AI for 3D Content Creation
* *AI for Creative Video Editing and Understanding
* *AROW: Adversarial Robustness in the Real World
* *Artificial Intelligence for Humanitarian Assistance and Disaster Response
* *Artificial Social Intelligence Workshop and Challenge
* *Bio-Image Computing
* *Challenge for Out-of-Distribution Generalization in Computer Vision
* *Closing the Loop Between Vision and Language
* *Computer Vision Aided Architectural Design
* *Computer Vision for Automated Medical Diagnosis
* *Computer Vision for Metaverse
* *Computer Vision in Plant Phenotyping and Agriculture
* *DeepFake Analysis and Detection
* *eHeritage and Digital Art Preservation
* *Event detection for situation awareness in autonomous driving
* *International Workshop on Analysis and Modeling of Faces and Gestures
* *International Workshop on Assistive Computer Vision and Robotics
* *JRDB: Visual Perception for Navigation in Human Environments: The JackRabbot Social Grouping and Activity Dataset and Benchmark
* *LatinX in CV Research
* *New Ideas in Vision Transformers
* *Open-Vocabulary 3D Scene Understanding
* *Recovering 6D Object Pose
* *Representation Learning with Very Limited Images: The Potential of Self-, Synthetic- and Formula-Supervision
* *Resource Efficient Deep Learning for Computer Vision
* *Robustness and Reliability of Autonomous Vehicles in the Open-world
* *Solving CAD History and pArameters Recovery from Point clouds and 3D scans
* *To NeRF or not to NeRF: A View Synthesis Challenge for Human Heads
* *Uncertainty Estimation for Computer Vision
* *Vision-and-Language Algorithmic Reasoning Workshop
* *Visual Continual Learning
* *Visual Inductive Priors for Data-Efficient Deep Learning
* *Visual Object Tracking Challenge
* *Women in Computer Vision
* *Workshop on Foundation Models: Foundation Model Challenge
* *Workshop on Scene Graphs and Graph Representation Learning
* 20-Year Analysis of the Dynamics and Driving Factors of Grassland Desertification in Xilingol, China, A
* 2D Cross-View Object Segmentation and Perceptual Grouping in Computer-Aided Design Drawings
* 2D-3D Interlaced Transformer for Point Cloud Segmentation with Scene-Level Supervision
* 2D3D-MATR: 2D-3D Matching Transformer for Detection-free Registration between Images and Point Clouds
* 3-D Point Cloud Attribute Compression with p-Laplacian Embedding Graph Dictionary Learning
* 360VOT: A New Benchmark Dataset for Omnidirectional Visual Object Tracking
* 360° from a Single Camera: A Few-Shot Approach for LiDAR Segmentation
* 3D Distillation: Improving Self-Supervised Monocular Depth Estimation on Reflective Surfaces
* 3D Human Mesh Recovery with Sequentially Global Rotation Estimation
* 3D Implicit Transporter for Temporally Consistent Keypoint Discovery
* 3D Instance Segmentation via Enhanced Spatial and Semantic Supervision
* 3D Landslide Monitoring in High Spatial Resolution by Feature Tracking and Histogram Analyses Using Laser Scanners
* 3D Motion Magnification: Visualizing Subtle Motions with Time-Varying Radiance Fields
* 3D Neural Embedding Likelihood: Probabilistic Inverse Graphics for Robust 6D Pose Estimation
* 3D Segmentation of Humans in Point Clouds with Synthetic Data
* 3D Semantic Subspace Traverser: Empowering 3D Generative Model with Shape Editing Capability
* 3D surface Approximation of the Entire Bayeux Tapestry for Improved Pedagogical Access
* 3D U-Net based two stage deep learning framework for predicting dose distributions in radiation treatment planning, A
* 3D VR Sketch Guided 3D Shape Prototyping and Exploration
* 3D-aware Blending with Generative NeRFs
* 3D-Aware Generative Model for Improved Side-View Image Synthesis
* 3D-aware Image Generation using 2D Diffusion Models
* 3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose Estimation
* 3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment
* 3DHacker: Spectrum-based Decision Boundary Generation for Hard-label 3D Point Cloud Attack
* 3DHumanGAN: 3D-Aware Human Image Generation with 3D Pose Mapping
* 3DMASC: Accessible, explainable 3D point clouds classification. Application to BI-spectral TOPO-bathymetric lidar data
* 3DMiner: Discovering Shapes from Large-Scale Unannotated Image Datasets
* 3DMOTFormer: Graph Transformer for Online 3D Multi-Object Tracking
* 3DPPE: 3D Point Positional Encoding for Transformer-based Multi-Camera 3D Object Detection
* 4D Myocardium Reconstruction with Decoupled Motion and Shape Model
* 4D Panoptic Segmentation as Invariant and Equivariant Field Prediction
* 5-Point Minimal Solver for Event Camera Relative Motion Estimation, A
* : A Large-Scale Benchmark for Rib Labeling and Anatomical Centerline Extraction
* A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis
* A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance
* Abductive natural language inference by interactive model with structural loss
* Ablating Concepts in Text-to-Image Diffusion Models
* Aboveground Forest Biomass Estimation Using Tent Mapping Atom Search Optimized Backpropagation Neural Network with Landsat 8 and Sentinel-1A Data
* Accelerating Deep Neural Networks via Semi-Structured Activation Sparsity
* AccFlow: Backward Accumulation for Long-Range Optical Flow
* Accidental Turntables: Learning 3D Pose by Watching Objects Turn
* Accumulation Knowledge Distillation for Conditional GAN Compression
* Accurate 3D Face Reconstruction with Facial Component Tokens
* Accurate and Fast Compressed Video Captioning
* Accurate and fast extraction of adhesive cells based on concave points detection and matching
* Achievement-based Training Progress Balancing for Multi-Task Learning
* ACLS: Adaptive and Conditional Label Smoothing for Network Calibration
* ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation
* Action Sensitivity Learning for Temporal Action Localization
* ACTIS: Improving data efficiency by leveraging semi-supervised Augmentation Consistency Training for Instance Segmentation
* Activate and Reject: Towards Safe Domain Generalization under Category Shift
* Active Disparity Sampling for Stereo Matching With Adjoint Network
* Active Neural Mapping
* Active Stereo Without Pattern Projector
* Active-Passive Remote Sensing Evaluation of Ecological Environment Quality in Juye Mining Area, China
* ACTIVE: Towards Highly Transferable 3D Physical Camouflage for Universal and Robust Vehicle Evasion
* Actor-agnostic Multi-label Action Recognition with Multi-modal Query
* ActorsNeRF: Animatable Few-shot Human Rendering with Generalizable NeRFs
* AD-CLIP: Adapting Domains in Prompt Space Using CLIP
* Ada3D: Exploiting the Spatial Redundancy with Adaptive Inference for Efficient 3D Object Detection
* AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts
* AdaNIC: Towards Practical Neural Image Compression via Dynamic Transform Routing
* Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-free Continual Learning
* ADAPT: Efficient Multi-Agent Trajectory Prediction with Adaptation
* Adapting Vision Foundation Models for Plant Phenotyping
* Adaptive and Background-Aware Vision Transformer for Real-Time UAV Tracking
* Adaptive Calibrator Ensemble: Navigating Test Set Difficulty in Out-of-Distribution Scenarios
* Adaptive Frequency Filters As Efficient Global Token Mixers
* adaptive fuzzy control technique for a high-speed vehicular platoon experiencing communication delays, An
* Adaptive Illumination Mapping for Shadow Detection in Raw Images
* Adaptive Image Anonymization in the Context of Image Classification with Neural Networks
* Adaptive Model Ensemble Adversarial Attack for Boosting Adversarial Transferability, An
* Adaptive Nonlinear Latent Transformation for Conditional Face Editing
* Adaptive Positional Encoding for Bundle-Adjusting Neural Radiance Fields
* Adaptive Reordering Sampler with Neurally Guided MAGSAC
* Adaptive Rotated Convolution for Rotated Object Detection
* Adaptive Self-Training for Object Detection
* Adaptive Similarity Bootstrapping for Self-Distillation based Representation Learning
* Adaptive Space-Location-Weighting Function Method for High-Precision Density Inversion of Gravity Data
* Adaptive Spiral Layers for Efficient 3D Representation Learning on Meshes
* Adaptive Superpixel for Active Learning in Semantic Segmentation
* Adaptive Template Transformer for Mitochondria Segmentation in Electron Microscopy Images
* Adaptive Testing of Computer Vision Models
* Adaptively Optimized Masking EMD for Separating Intrinsic Oscillatory Modes of Nonstationary Signals
* Adding Conditional Control to Text-to-Image Diffusion Models
* ADMM-Net for Beamforming Based on Linear Rectification with the Atomic Norm Minimization
* ADNet: Lane Shape Prediction via Anchor Decomposition
* Advanced Augmentation and Ensemble Approaches for Classifying Long-Tailed Multi-Label Chest X-Rays
* Advancing Example Exploitation Can Alleviate Critical Challenges in Adversarial Training
* Advancing Referring Expression Segmentation Beyond Single Image
* Advancing sun glint correction in high-resolution marine UAV RGB imagery for coral reef monitoring
* AdvDiffuser: Natural Adversarial Example Synthesis with Diffusion Models
* AdVerb: Visually Guided Audio Dereverberation
* Adversarial Attacks Against Uncertainty Quantification
* Adversarial Bayesian Augmentation for Single-Source Domain Generalization
* Adversarial Examples with Specular Highlights
* Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff
* Adversarially Trained Persistent Homology Based Graph Convolutional Network for Disease Identification Using Brain Connectivity
* Adverse Weather Removal with Codebook Priors
* AerialVLN: Vision-and-Language Navigation for UAVs
* Aerosol Optical Depth Measurements from a Simulated Low-Cost Multi-Wavelength Ground-Based Camera: A Clear Case over a Peri-Urban Area
* AesPA-Net: Aesthetic Pattern-Aware Style Transfer Networks
* Affective Image Filter: Reflecting Emotions from Text to Images
* Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection
* Affordance segmentation of hand-occluded containers from exocentric images
* AffordPose: A Large-scale Dataset of Hand-Object Interactions with Affordance-driven Hand Pose
* AG3D: Learning to Generate 3D Avatars from 2D Image Collections
* AGG-Net: Attention Guided Gated-convolutional Network for Depth Image Completion
* Agglomerative Transformer for Human-Object Interaction Detection
* Aggregating Feature Point Cloud for Depth Completion
* Agile Modeling: From Concept to Classifier in Minutes
* AI for Marine, Ocean and Climate Change Monitoring
* AIDE: A Vision-Driven Multi-View, Multi-Modal, Multi-Tasking Dataset for Assistive Driving Perception
* AIROGS: Artificial Intelligence for Robust Glaucoma Screening Challenge
* ALAN: Self-Attention Is Not All You Need for Image Super-Resolution
* ALFA: Leveraging All Levels of Feature Abstraction for Enhancing the Generalization of Histopathology Image Classification Across Unseen Hospitals
* Algebraically rigorous quaternion framework for the neural network pose estimation problem
* Algorithm-Hardware Co-Optimization and Deployment Method for Field-Programmable Gate-Array-Based Convolutional Neural Network Remote Sensing Image Processing
* AlignDet: Aligning Pre-training and Fine-tuning in Object Detection
* Alignment and Generation Adapter for Efficient Video-Text Understanding
* Alignment Before Aggregation: Trajectory Memory Retrieval Network for Video Object Segmentation
* Alignment-free HDR Deghosting with Semantics Consistent Transformer
* ALIP: Adaptive Language-Image Pre-training with Synthetic Caption
* All in Tokens: Unifying Output Space of Visual Tasks via Soft Token
* All-Pairs Consistency Learning for Weakly Supervised Semantic Segmentation
* All-to-key Attention for Arbitrary Style Transfer
* All-Weather Monitoring of Ulva prolifera in the Yellow Sea Based on Sentinel-1, Sentinel-3, and NPP Satellite Data
* Alleviating Catastrophic Forgetting of Incremental Object Detection via Within-Class and Between-Class Knowledge Distillation
* ALWOD: Active Learning for Weakly-Supervised Object Detection
* Among Us: Adversarially Robust Collaborative Perception by Consensus
* AMSC: Adaptive Masking and Structure-Constraint Learning for Domain Adaptive Semantic Segmentation Under Adverse Conditions
* Analysis of Depths Derived by Airborne Lidar and Satellite Imaging to Support Bathymetric Mapping Efforts with Varying Environmental Conditions: Lower Laguna Madre, Gulf of Mexico
* Analysis of Earthquake-Triggered Landslides through an Integrated Unmanned Aerial Vehicle-Based Approach: A Case Study from Central Italy
* Analysis of Illumination Conditions in the Lunar South Polar Region Using Multi-Temporal High-Resolution Orbital Images
* Analysis of Light Obstruction from Street Lighting in Road Scenes
* Analysis of the Spatial Distribution and Deformation Types of Active Landslides in the Upper Jinsha River, China, Using Integrated Remote Sensing Technologies
* Analyzing the Behavior of Cauliflower Harvest-Readiness Models by Investigating Feature Relevances
* Analyzing the Ionospheric Irregularities Caused by the September 2017 Geomagnetic Storm Using Ground-Based GNSS, Swarm, and FORMOSAT-3/COSMIC Data near the Equatorial Ionization Anomaly in East Africa
* Anatomical Invariance Modeling and Semantic Alignment for Self-supervised Learning in 3D Medical Image Analysis
* Anchor Structure Regularization Induced Multi-view Subspace Clustering via Enhanced Tensor Rank Minimization
* Anchor-Intermediate Detector: Decoupling and Coupling Bounding Boxes for Accurate Object Detection
* Animal3D: A Comprehensive Dataset of 3D Animal Pose and Shape
* Anomaly Detection under Distribution Shift
* Anomaly Detection using Score-based Perturbation Resilience
* Anomaly-Aware Semantic Segmentation via Style-Aligned OoD Augmentation
* Antenna Pattern Calibration Method for Phased Array of High-Frequency Surface Wave Radar Based on First-Order Sea Clutter
* Anti-DreamBooth: Protecting users from personalized text-to-image synthesis
* AntiNODE: Evaluating Efficiency Robustness of Neural ODEs
* Aperture Diffraction for Compact Snapshot Spectral Imaging
* APNet: Urban-level Scene Segmentation of Aerial Images and Point Clouds
* Application of Machine Learning for Disease Detection Tasks in Olive Trees Using Hyperspectral Data
* AR-TTA: A Simple Method for Real-World Continual Test-Time Adaptation
* Are current long-term video understanding datasets long-term?
* AREA: Adaptive Reweighting via Effective Area for Long-Tailed Classification
* Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception
* ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes
* Artisanal Mining River Dredge Detection Using SAR: A Method Comparison
* Artistic Line Drawing Rendering With Priors of Depth and Edge Density
* ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation
* ASIC: Aligning Sparse in-the-wild Image Collections
* ASM: Adaptive Skinning Model for High-Quality 3D Face Modeling
* ASQ: An Ultra-Low Bit Rate ASR-Oriented Speech Quantization Method
* Assessing CYGNSS Satellite Soil Moisture Data for Drought Monitoring with Multiple Datasets and Indicators
* Assessing Shallow Soft Deposits through Near-Surface Geophysics and UAV-SfM: Application in Pocket Beaches Environments
* Assessing the Impact of Diversity on the Resilience of Deep Learning Ensembles: A Comparative Study on Model Architecture, Output, Activation, and Attribution
* Assessing the Potential of UAV-Based Multispectral and Thermal Data to Estimate Soil Water Content Using Geophysical Methods
* Assessment of Bottom-Up Satellite Precipitation Products on River Streamflow Estimations in the Peruvian Pacific Drainage
* Assessment of the GNSS-RTK for Application in Precision Forest Operations
* Assessment of the Impact of Pacific Inflow on Sea Surface Temperature Prior to the Freeze-Up Period over the Bering Sea
* Assessment of the Impact of Surface Water Content for Temperate Forests in SAR Data at C-Band
* AssetField: Assets Mining and Reconfiguration in Ground Feature Plane Representation
* Associating Anomaly Detection Strategy Based on Kittler's Taxonomy with Image Editing to Extend the Mapping of Polluted Water Bodies
* ASUR3D: Arbitrary Scale Upsampling and Refinement of 3D Point Clouds using Local Occupancy Fields
* Asymmetric Feature Enhancement Network for Multiple Object Tracking of Unmanned Aerial Vehicle, An
* Asynchronous Linear Filter Architecture for Hybrid Event-Frame Cameras, An
* Atmospheric Transmission and Thermal Inertia Induced Blind Road Segmentation with a Large-Scale Dataset TBRSD
* Atomic Norm-Based Joint Delay-Doppler Shift Estimation for OFDM Passive Radar
* ATT3D: Amortized Text-to-3D Object Synthesis
* Attending Generalizability in Course of Deep Fake Detection by Exploring Multi-task Learning
* Attention Discriminant Sampling for Point Clouds
* Attention Network with Outdoor Illumination Variation Prior for Spectral Reconstruction from RGB Images
* Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration
* Attention-Guided Fusion and Classification for Hyperspectral and LiDAR Data
* Attentive Mask CLIP
* AttT2M: Text-Driven Human Motion Generation with Multi-Perspective Attention Mechanism
* ATZSL: Defensive Zero-Shot Recognition in the Presence of Adversaries
* Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment
* Audio-Visual Class-Incremental Learning
* Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning
* Audio-Visual Glance Network for Efficient Video Recognition
* Audiovisual Masked Autoencoders
* Augmented Box Replay: Overcoming Foreground Shift for Incremental Object Detection
* Augmenting and Aligning Snippets for Few-Shot Video Domain Adaptation
* Augmenting Features via Contrastive Learning-based Generative Model for Long-Tailed Classification
* AutoAD II: The Sequel - Who, When, and What in Movie Audio Description
* AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration
* Automated detection and classification of skin diseases using diverse features and improved gray wolf-based multiple-layer perceptron neural network
* Automated High-Resolution Bathymetry from Sentinel-1 SAR Images in Deeper Nearshore Coastal Waters in Eastern Florida
* Automated Knowledge Distillation via Monte Carlo Tree Search
* Automated Surface Runoff Estimation with the Spectral Unmixing of Remotely Sensed Multispectral Imagery
* Automatic Animation of Hair Blowing in Still Portrait Photos
* Automatic Audio Feature Extraction for Keyword Spotting
* Automatic differentiation of Eucalyptus species through Sentinel-2 images, Worldview-3 images and LiDAR data
* Automatic Identification and Statistical Analysis of Data Steps in Electric Field Measurements from CSES-01 Satellite
* Automatic Identification of Human Subgroups in Time-Dependent Pedestrian Flow Networks
* Automatic left ventricle segmentation via edge-shape feature-based fully convolutional neural network
* automatic measurement method for ankle key angles based on point cloud segmentation network, An
* Automatic Network Pruning via Hilbert-Schmidt Independence Criterion Lasso under Information Bottleneck Principle
* Automatic semantic modeling of structured data sources with cross-modal retrieval
* Autonomous mobile robot for automatic out of stock detection in a supermarket
* AutoReP: Automatic ReLU Replacement for Fast Private Network Inference
* AutoSynth: Learning to Generate 3D Training Data for Object Point Cloud Registration
* Auxiliary Tasks Benefit 3D Skeleton-based Human Motion Prediction
* AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control
* AW-Net: A Novel Fully Connected Attention-based Medical Image Segmentation Model
* Back to Reality: Learning Data-Efficient 3D Object Detector with Shape Guidance
* Backpropagation Path Search On Adversarial Transferability
* BallGAN: 3D-aware Image Synthesis with a Spherical Background
* BANSAC: A dynamic BAyesian Network for adaptive SAmple Consensus
* BaRe-ESA: A Riemannian Framework for Unregistered Human Body Shapes
* Basis scaling and double pruning for efficient inference in network-based transfer learning
* Batch-based Model Registration for Fast 3D Sherd Reconstruction
* Bayesian Optimization Meets Self-Distillation
* Bayesian Prompt Learning for Image-Language Model Generalization
* Bayesian Source Model for the 2022 Mw6.6 Luding Earthquake, Sichuan Province, China, Constrained by GPS and InSAR Observations, A
* Bayesian Spatial Models for Projecting Corn Yields
* BCUIS-Net: A breast cancer ultrasound image segmentation network via boundary-aware and shape feature fusion
* BD-SKUNet: Selective-Kernel UNets for Building Damage Assessment in High-Resolution Satellite Images
* Be Everywhere - Hear Everything (BEE): Audio Scene Reconstruction by Sparse Audio-Visual Samples
* Beating Backdoor Attack at Its Own Game
* BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction
* Benchmark for Chinese-English Scene Text Image Super-resolution, A
* Benchmarking Algorithmic Bias in Face Recognition: An Experimental Approach Using Synthetic Faces and Human Evaluation
* Benchmarking and Analyzing Robust Point Cloud Recognition: Bag of Tricks for Defending Adversarial Examples
* Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models
* Benchmarking Image Classifiers for Physical Out-of-Distribution Examples Detection
* Benchmarking Low-Shot Robustness to Natural Distribution Shifts
* Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation
* Better localized predictions with Out-of-Scope information and Explainable AI: One-Shot SAR backscatter nowcast framework with data from neighboring region
* Better May Not Be Fairer: A Study on Subgroup Discrepancy in Image Classification
* BEV-DG: Cross-Modal Learning under Bird's-Eye View for Domain Generalization of 3D Semantic Segmentation
* BEVPlace: Learning LiDAR-based Place Recognition using Bird's Eye View Images
* Beyond Image Borders: Learning Feature Extrapolation for Unbounded Image Composition
* Beyond Object Recognition: A New Benchmark towards Object Concept Learning
* Beyond One-to-One: Rethinking the Referring Image Segmentation
* Beyond Pixel-Wise Unmixing: Spatial-Spectral Attention Fully Convolutional Networks for Abundance Estimation
* Beyond Single Path Integrated Gradients for Reliable Input Attribution via Randomized Path Sampling
* Beyond Skin Tone: A Multidimensional Measure of Apparent Skin Color
* Beyond the limitation of monocular 3D detector via knowledge distillation
* Beyond the Pixel: a Photometrically Calibrated HDR Dataset for Luminance and Color Prediction
* BGNet: Boundary-guided network for polyp segmentation
* Bi-Encoder Cascades for Efficient Image Search
* Biased Class disagreement: detection of out of distribution instances by using differently biased semantic segmentation models
* Bidirectional Alignment for Domain Adaptive Detection with Transformers
* Bidirectionally Deformable Motion Modulation For Video-based Human Pose Transfer
* BiFF: Bi-level Future Fusion with Polyline-based Coordinate for Interactive Trajectory Prediction
* BiLMa: Bidirectional Local-Matching for Text-based Person Re-identification
* Binary Noise Guidance Learning for Remote Sensing Image-to-Image Translation
* Bio-Inspired Small Target Motion Detection With Spatio-Temporal Feedback in Natural Scenes
* Bird's-Eye-View Scene Graph for Vision-Language Navigation
* BiViT: Extremely Compressed Binary Vision Transformers
* Black Box Few-Shot Adaptation for Vision-Language models
* Black-Box Attacks on Image Activity Prediction and its Natural Language Explanations
* Black-box Unsupervised Domain Adaptation with Bi-directional Atkinson-Shiffrin Memory
* Blended-NeRF: Zero-Shot Object Generation and Blending in Existing Neural Radiance Fields
* BlendFace: Re-designing Identity Encoders for Face-Swapping
* Blending-NeRF: Text-Driven Localized Editing in Neural Radiance Fields
* Blind quality-based pairwise ranking of contrast changed color images using deep networks
* BlindHarmony: Blind Harmonization for MR Images via Flow model
* Blockchain Data Mining With Graph Learning: A Survey
* BluNF: Blueprint Neural Field
* Blurry dense object extraction based on buffer parsing network for high-resolution satellite remote sensing imagery
* BoDiffusion: Diffusing Sparse Observations for Full-Body Human Motion Synthesis
* Body Knowledge and Uncertainty Modeling for Monocular 3D Human Body Reconstruction
* Bold but Cautious: Unlocking the Potential of Personalized Federated Learning through Cautiously Aggressive Collaboration
* BoMD: Bag of Multi-label Descriptors for Noisy Chest X-ray Classification
* Boosting 3-DoF Ground-to-Satellite Camera Localization Accuracy via Geometry-Guided Cross-View Transformer
* Boosting Adversarial Transferability via Gradient Relevance Attack
* Boosting Few-shot Action Recognition with Graph-guided Hybrid Matching
* Boosting Long-tailed Object Detection via Step-wise Learning on Smooth-tail Data
* Boosting Multi-modal Model Performance with Adaptive Gradient Modulation
* Boosting Novel Category Discovery Over Domains with Soft Contrastive Learning and All in One Classifier
* Boosting Point-BERT by Multi-Choice Tokens
* Boosting Positive Segments for Weakly-Supervised Audio-Visual Video Parsing
* Boosting Semantic Segmentation from the Perspective of Explicit Class Embeddings
* Boosting Semi-Supervised Learning by bridging high and low-confidence predictions
* Boosting Single Image Super-Resolution via Partial Channel Shifting
* Boosting Whole Slide Image Classification from the Perspectives of Distribution, Correlation and Magnification
* Bootstrap Motion Forecasting With Self-Consistent Constraints
* Borrowing Knowledge From Pre-trained Language Model: A New Data-efficient Visual Learning Paradigm
* Both Diverse and Realism Matter: Physical Attribute and Style Alignment for Rainy Image Generation
* Boundary-Aware Divide and Conquer: A Diffusion-based Solution for Unsupervised Shadow Removal
* Box-Based Refinement for Weakly Supervised and Unsupervised Localization Tasks
* BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
* BoxSnake: Polygonal Instance Segmentation with Box Supervision
* Brain tumor image pixel segmentation and detection using an aggregation of GAN models with vision transformer
* Brain tumor segmentation and survival time prediction using graph momentum fully convolutional network with modified Elman spike neural network
* brain tumour classification on the magnetic resonance images using convolutional neural network based privacy-preserving federated learning, A
* Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images
* Breaking Temporal Consistency: Generating Video Universal Adversarial Perturbations Using Image Models
* Breaking The Limits of Text-conditioned 3D Motion Synthesis with Elaborative Descriptions
* Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection
* Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation
* Brighten up Images via Dual-Branch Structure-Texture Awareness Feature Interaction
* Bring Clipart to Life
* BT2: Backward-compatible Training with Basis Transformation
* BuilDiff: 3D Building Shape Generation using Single-Image Conditional Point Cloud Diffusion Models
* Building a Winning Team: Selecting Source Model Ensembles using a Submodular Transferability Estimation Approach
* Building Bridge Across the Time: Disruption and Restoration of Murals In the Wild
* Building CAD Model Reconstruction from Point Clouds via Instance Segmentation, Signed Distance Function, and Graph Cut
* Building Vision Transformers with Hierarchy Aware Feature Aggregation
* Building3D: An Urban-Scale Dataset and Benchmarks for Learning Roof Structures from Point Clouds
* BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization
* C2F2NeUS: Cascade Cost Frustum Fusion for High Fidelity and Generalizable Neural Surface Reconstruction
* C2ST: Cross-modal Contextualized Sequence Transduction for Continuous Sign Language Recognition
* CAD-Estate: Large-scale CAD Model Annotation in RGB Videos
* CAFA: Class-Aware Feature Alignment for Test-Time Adaptation
* Calibrated Out-of-Distribution Detection with a Generic Representation
* Calibrating Panoramic Depth Estimation for Practical Localization and Mapping
* Calibrating Uncertainty for Semi-Supervised Crowd Counting
* Camera-Aware Recurrent Learning and Earth Mover's Test-Time Adaption for Generalizable Person Re-Identification
* Camera-Based Road Snow Coverage Estimation
* Camera-Driven Representation Learning for Unsupervised Domain Adaptive Person Re-identification
* Can Language Models Learn to Listen?
* Can Self-Supervised Representation Learning Methods Withstand Distribution Shifts and Corruptions?
* Can the Accuracy of Fine-Resolution Precipitation Products Be Assessed from the Surrounding Water Balance and Drought Chain (WBDC) in the Qinghai-Tibetan Plateau?
* Can Unstructured Pruning Reduce the Depth in Deep Neural Networks?
* Can Water-Detection Indices Be Reliable Proxies for Water Discharges in Mid-Sized Braided Rivers Using Coarse-Resolution Landsat Archives?
* CancerUniT: Towards a Single Unified Model for Effective Detection, Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection of CT Scans
* Candidate-Aware Selective Disambiguation Based on Normalized Entropy for Instance-Dependent Partial-Label Learning
* Canonical Factors for Hybrid Neural Fields
* CaPhy: Capturing Physical Properties for Animatable Human Avatars
* Cascade-DETR: Delving into High-Quality Universal Object Detection
* Cascading Machine Learning to Monitor Volcanic Thermal Activity Using Orbital Infrared Data: From Detection to Quantitative Evaluation
* CASSPR: Cross Attention Single Scan Place Recognition
* Category-aware Allocation Transformer for Weakly Supervised Object Localization
* CathEye: A Forward-Looking Ultrasound Catheter for Image-Guided Cardiovascular Procedures, The
* Causal-DFQ: Causality Guided Data-free Network Quantization
* Causality-Aware Graph Convolutional Network Framework for Rigidity Assessment in Parkinsonians, A
* Causality-Driven One-Shot Learning for Prostate Cancer Grading from MRI
* CauSSL: Causality-Inspired Semi-Supervised Learning for Medical Image Segmentation
* CAWM: Class-Aware Weight Map for Improved Semi-Supervised Nuclei Segmentation
* CBA: Improving Online Continual Learning via Continual Bias Adaptor
* CC3D: Layout-Conditioned Generation of Compositional 3D Scenes
* CCANet: A Collaborative Cross-Modal Attention Network for RGB-D Crowd Counting
* CDAC: Cross-domain Attention Consistency in Transformer for Domain Adaptive Semantic Segmentation
* Center-Based Decoupled Point Cloud Registration for 6D Object Pose Estimation
* CFCG: Semi-Supervised Semantic Segmentation via Cross-Fusion and Contour Guidance Supervision
* CGBA: Curvature-aware Geometric Black-box Attack
* CGOF++: Controllable 3D Face Synthesis With Conditional Generative Occupancy Fields
* Champagne: Learning Real-world Conversation from Large-Scale Web Videos
* Change You Want to See (Now in 3D), The
* Changes in Qinghai Lake Area and Their Interactions with Climatic Factors
* Chaotic World: A Large and Challenging Benchmark for Human Behavior Understanding in Chaotic Events
* Characteristics of Inter-System Bias between BDS-2 and BDS-3 and Its Impact on BDS Orbit and Clock Solutions
* Characterization of Active Riverbed Spatiotemporal Dynamics through the Definition of a Framework for Remote Sensing Procedures
* Characterization of BDS Multipath Effect Based on AT-Conv-LSTM Network
* Characterization of the Fels Landslide (Alaska) Using Combined Terrestrial, Aerial, and Satellite Remote Sensing Data
* Characterizing Face Recognition for Resource Efficient Deployment on Edge
* Characterizing the Water Storage Variation of Kusai Lake by Constructing Time Series from Multisource Remote Sensing Data
* ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules
* Chasing clouds: Differentiable volumetric rasterisation of point clouds as a highly efficient and accurate loss for large-scale deformable 3D registration
* CheckerPose: Progressive Dense Keypoint Localization for Object Pose Estimation with Graph Neural Network
* Chest X-Ray Feature Pyramid Sum Model with Diseased Area Data Augmentation Method
* CheXFusion: Effective Fusion of Multi-View Features using Transformers for Long-Tailed Chest X-Ray Classification
* ChildPlay: A New Benchmark for Understanding Children's Gaze Behaviour
* Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning
* Chop & Learn: Recognizing and Generating Object-State Compositions
* Chord: Category-level Hand-held Object Reconstruction via Shape Deformation
* Chordal Averaging on Flag Manifolds and Its Applications
* CHORUS: Learning Canonicalized 3D Human-Object Spatial Relations from Unbounded Synthesized Images
* Chroma Intra Prediction With Lightweight Attention-Based Neural Networks
* Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models
* CIRI: Curricular Inactivation for Residue-aware One-shot Video Inpainting
* CiT: Curation in Training for Effective Vision-Language Data
* CiteTracker: Correlating Image and Text for Visual Tracking
* City Scale Traffic Monitoring Using WorldView Satellite Imagery and Deep Learning: A Case Study of Barcelona
* CL-MVSNet: Unsupervised Multi-view Stereo with Dual-level Contrastive Learning
* Class Prior-Free Positive-Unlabeled Learning with Taylor Variational Loss for Hyperspectral Remote Sensing Imagery
* Class-aware Memory Guided Unbiased Weighting for Universal Domain Adaptive Object Detection
* Class-Aware Patch Embedding Adaptation for Few-Shot Image Classification
* Class-Guided Image-to-Image Diffusion: Cell Painting from Brightfield Images with Class Labels
* Class-incremental Continual Learning for Instance Segmentation with Image-level Weak Supervision
* Class-Incremental Grouping Network for Continual Audio-Visual Learning
* Class-Incremental Learning of Plant and Disease Detection: Growing Branches with Knowledge Distillation
* Class-Incremental Learning using Diffusion Model for Distillation and Replay
* Class-relation Knowledge Distillation for Novel Class Discovery
* Classification of River Sediment Fractions in a River Segment including Shallow Water Areas Based on Aerial Images from Unmanned Aerial Vehicles with Convolution Neural Networks
* Classification of Urban Surface Elements by Combining Multisource Data and Ontology
* Classification robustness to common optical aberrations
* CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning
* ClimateNeRF: Extreme Weather Synthesis in Neural Radiance Field
* Clinically-Inspired Multi-Agent Transformers for Disease Trajectory Forecasting From Multimodal Data
* CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition
* CLIP-Cluster: CLIP-Guided Attribute Hallucination for Face Clustering
* CLIP-Decoder: ZeroShot Multilabel Classification using Multimodal CLIP Aligned Representations
* CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection
* CLIP-FO3D: Learning Free Open-world 3D Scene Representations from 2D Dense CLIP
* CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-Training
* CLIPascene: Scene Sketching with Different Types and Levels of Abstraction
* CLIPath: Fine-tune CLIP with Visual Feature Fusion for Pathology Image Analysis Towards Minimizing Data Collection Efforts
* ClipCrop: Conditioned Cropping Driven by Vision-Language Model
* CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No
* CLIPTER: Looking at the Bigger Picture in Scene Text Recognition
* CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation
* CLNeRF: Continual Learning Meets NeRF
* Closed-Loop Training for Projected GAN
* Cloth2Body: Generating 3D Human Body Mesh from 2D Clothing
* ClothesNet: An Information-Rich 3D Garment Model Repository with Simulated Clothes Environment
* ClothPose: A Real-world Benchmark for Visual Analysis of Garment Pose via An Indirect Recording Solution
* Cloud-Type Classification for Southeast China Based on Geostationary Orbit EO Datasets and the LighGBM Model
* CLR: Channel-wise Lightweight Reprogramming for Continual Learning
* ClusT3: Information Invariant Test-Time Training
* Clustered FedStack: Intermediate Global Models with Bayesian Information Criterion
* Clusterformer: Cluster-based Transformer for 3D Object Detection in Point Clouds
* Clustering based Point Cloud Representation Learning for 3D Analysis
* Clustering-based Domain-Incremental Learning
* Clutter Detection and Removal in 3D Scenes with View-Consistent Inpainting
* CMAT: Integrating Convolution Mixer and Self-Attention for Visual Tracking
* CMDA: Cross-Modality Domain Adaptation for Nighttime Semantic Segmentation
* CMLocate: A cross-modal automatic visual geo-localization framework for a natural environment without GNSS information
* CNN based Cuneiform Sign Detection Learned from Annotated 3D Renderings and Mapped Photographs with Illumination Augmentation
* CNOS: A Strong Baseline for CAD-based Novel Object Segmentation
* Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video
* CO-Net: Learning Multiple Point Cloud Tasks at Once with A Cohesive Network
* CO-PILOT: Dynamic Top-Down Point Cloud with Conditional Neighborhood Aggregation for Multi-Gigapixel Histopathology Image Representation
* CO2 emissions and delivery time of last-mile drone delivery using trucks
* Coarse to Fine Frame Selection for Online Open-ended Video Question Answering
* Coarse- and Fine-Grained Fusion Hierarchical Network for Hole Filling in View Synthesis
* Coarse-Super-Resolution-Fine Network (CoSF-Net): A Unified End-to-End Neural Network for 4D-MRI With Simultaneous Motion Estimation and Super-Resolution
* Coarse-to-Fine Amodal Segmentation with Shape Prior
* Coarse-to-Fine: Learning Compact Discriminative Representation for Single-Stage Image Retrieval
* Coastline Monitoring and Prediction Based on Long-Term Remote Sensing Data: A Case Study of the Eastern Coast of Laizhou Bay, China
* COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts
* Coherent Event Guided Low-Light Video Enhancement
* CoIn: Contrastive Instance Feature Mining for Outdoor 3D Object Detection with Very Limited Annotations
* coincidence detection perspective for the maximum mean discrepancy, A
* CoinSeg: Contrast Inter- and Intra- Class Representations for Incremental Segmentation
* CoLaNet: Adaptive Context and Latent Information Blending for Face Image Inpainting
* COLD Fusion: Calibrated and Ordinal Latent Distribution Fusion for Uncertainty-Aware Multimodal Emotion Recognition
* Collaborative Propagation on Multiple Instance Graphs for 3D Instance Segmentation with Single-point Supervision
* Collaborative Tracking Learning for Frame-Rate-Insensitive Multi-Object Tracking
* Collecting The Puzzle Pieces: Disentangled Self-Driven Human Pose Transfer by Permuting Textures
* Color-Based Point Cloud Classification Using a Novel Gaussian Mixed Modeling-Based Approach versus a Deep Neural Network
* Combating Coronary Calcium Scoring Bias for Non-gated CT by Semantic Learning on Gated CT
* Combating Noisy Labels with Sample Selection by Mining High-Discrepancy Examples
* Combined Methodology for Rockfall Susceptibility Mapping Using UAV Imagery Data
* Combining RadCalNet Sites for Radiometric Cross Calibration of Landsat 9 and Landsat 8 Operational Land Imagers (OLIs)
* Combining Satellite Imagery and a Deep Learning Algorithm to Retrieve the Water Levels of Small Reservoirs
* Combining Texture, Color, and Vegetation Index from Unmanned Aerial Vehicle Multispectral Images to Estimate Winter Wheat Leaf Area Index during the Vegetative Growth Stage
* Communication-efficient Federated Learning with Single-Step Synthetic Features Compressor for Faster Convergence
* Communication-Efficient Vertical Federated Learning with Limited Overlapping Samples
* Comparative Study of Natural Replay and Experience Replay in Online Object Detection
* Comparative Study of Vision Transformer Encoders and Few-shot Learning for Medical Image Classification, A
* Comparative Study on Predicting Topography from Gravity Anomaly and Gravity Gradient Anomaly
* Comparison of Differences in Actual Cropland Evapotranspiration under Two Irrigation Methods Using Satellite-Based Model
* Comparison of Supervised Learning and Changepoint Detection for Insect Detection in Lidar Data
* Comparison of the Heights of Sporadic E Layers and Vertical Ion Convergence Parameters
* COMPASS: High-Efficiency Deep Image Compression with Arbitrary-scale Spatial Scalability
* Compatibility of Fundamental Matrices for Complete Viewing Graphs
* Complementary Domain Adaptation and Generalization for Unsupervised Continual Domain Shift Learning
* Complete Recipe for Diffusion Generative Models, A
* Complex Validation of Weather Research and Forecasting: Chemistry Modelling of Atmospheric CO2 in the Coastal Cities of the Gulf of Finland
* Complex-Valued Retrievals From Noisy Images Using Diffusion Models
* Compositional Feature Augmentation for Unbiased Scene Graph Generation
* Compound-Gaussian Model with Nakagami-Distributed Textures for High-Resolution Sea Clutter at Medium/High Grazing Angles
* Comprehensive Empirical Evaluation on Online Continual Learning, A
* Comprehensive Framework for Evaluating Deepfake Generators: Dataset, Metrics Performance, and Comparative Analysis, A
* Comprehensive Multimodal Segmentation in Medical Imaging: Combining YOLOv8 with SAM and HQ-SAM Models
* Comprehensive Study of Transfer Learning under Constraints, A
* comprehensive study on automatic non-informative frame detection in colonoscopy videos, A
* Computation and Data Efficient Backdoor Attacks
* Computational 3D Imaging with Position Sensors
* Computational Evaluation of the Combination of Semi-Supervised and Active Learning for Histopathology Image Segmentation with Missing Annotations
* Computationally-Efficient Neural Image Compression with Shallow Decoders
* Concept Graph Neural Networks for Surgical Video Understanding
* Concept-wise Fine-tuning Matters in Preventing Negative Transfer
* Conceptual and Hierarchical Latent Space Decomposition for Face Editing
* Conditional 360-degree Image Synthesis for Immersive Indoor Scene Decoration
* Confidence-aware Pseudo-label Learning for Weakly Supervised Visual Grounding
* Confidence-based Visual Dispersal for Few-shot Unsupervised Domain Adaptation
* Confusing Large Models by Confusing Small Models
* Confusion Mixup Regularized Multimodal Fusion Network for Continual Egocentric Activity Recognition
* Consistency Regularization for Generalizable Source-free Domain Adaptation
* Consistent Depth Prediction for Transparent Object Reconstruction from RGB-D Camera
* ConSlide: Asynchronous Hierarchical Interaction Transformer with Breakup-Reorganize Rehearsal for Continual Whole Slide Image Analysis
* Constraining Depth Map Geometry for Multi-View Stereo: A Dual-Depth Approach with Saddle-shaped Depth Cells
* Constraints on the Fault Dip Angles of Lunar Graben and Their Significance for Lunar Thermal Evolution
* Construction and Optimisation of Ecological Networks in High-Density Central Urban Areas: The Case of Fuzhou City, China
* Construction of an Ecological Security Pattern in Rapidly Urbanizing Areas Based on Ecosystem Sustainability, Stability, and Integrity
* Construction of Remote Sensing Indices Knowledge Graph (RSIKG) Based on Semantic Hierarchical Graph
* ContactGen: Generative Contact Modeling for Grasp Generation
* Contactless Pulse Estimation Leveraging Pseudo Labels and Self-Supervision
* Content-Aware Local GAN for Photo-Realistic Super-Resolution
* Context-Aware Planning and Environment-Aware Memory for Instruction Following Embodied Agents
* Context-VQA: Towards Context-Aware and Purposeful Visual Question Answering
* Continual Evidential Deep Learning for Out-of-Distribution Detection
* Continual Learning for Personalized Co-Speech Gesture Generation
* Continual Learning with Deep Streaming Regularized Discriminant Analysis
* Continual Segment: Towards a Single, Unified and Non-forgetting Continual Segmentation Model of 143 Whole-body Organs in CT Scans
* Continual Zero-Shot Learning through Semantically Guided Generative Random Walks
* Continuous Hand Gesture Recognition for Human-Robot Collaborative Assembly
* Continuously Masked Transformer for Image Inpainting
* Contrastive Continuity on Augmentation Stability Rehearsal for Continual Self-Supervised Learning
* Contrastive Correlation Preserving Replay for Online Continual Learning
* Contrastive Feature Masking Open-Vocabulary Vision Transformer
* Contrastive Image Synthesis and Self-supervised Feature Adaptation for Cross-Modality Biomedical Image Segmentation
* Contrastive Model Adaptation for Cross-Condition Robustness in Semantic Segmentation
* Contrastive Pseudo Learning for Open-World DeepFake Attribution
* Controllable Guide-Space for Generalizable Face Forgery Detection
* Controllable Inversion of Black-Box Face Recognition Models via Diffusion
* Controllable Person Image Synthesis with Pose-Constrained Latent Diffusion
* Controllable Video Generation With Text-Based Instructions
* Controllable Visual-Tactile Synthesis
* Convex Decomposition of Indoor Scenes
* Convolutional Networks with Oriented 1D Kernels
* COOL-CHIC: Coordinate-based Low Complexity Hierarchical Image Codec
* COOP: Decoupling and Coupling of Whole-Body Grasping Pose Generation
* Coordinate Quantized Neural Implicit Representations for Multi-view Reconstruction
* Coordinate Transformer: Achieving Single-stage Multi-person Mesh Recovery from Videos
* COPILOT: Human-Environment Collision Prediction and Localization from Egocentric Videos
* CopyRNeRF: Protecting the CopyRight of Neural Radiance Fields
* CORE: Co-planarity Regularized Monocular Geometry Estimation with Weak Supervision
* Core: Cooperative Reconstruction for Multi-Agent Perception
* CoroNetGAN: Controlled Pruning of GANs via Hypernetworks
* Correcting for Mobile X-Band Weather Radar Tilt Using Solar Interference
* Correction for Mechanical Inaccuracies in a Scanning Talbot-Lau Interferometer
* Correlation between the Channel Discharge Current and Spectrum of a Single-Stroke Lightning Flash to Canton Tower
* Correlative Scan Matching Position Estimation Method by Fusing Visual and Radar Line Features
* Corrupting Neuron Explanations of Deep Visual Features
* COSE: A Consistency-Sensitivity Metric for Saliency on Image Classification
* CoSign: Exploring Co-occurrence Signals in Skeleton-based Continuous Sign Language Recognition
* Cost Function Unrolling in Unsupervised Optical Flow
* CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection
* Counterfactual-based Saliency Map: Towards Visual Contrastive Explanations for Neural Networks
* Counting Crowds in Bad Weather
* CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation
* CPR-CLIP: Multimodal Pre-Training for Composite Error Recognition in CPR Training
* Creative Birds: Self-Supervised Single-View 3D Style Transfer
* CRN: Camera Radar Net for Accurate, Robust, Efficient 3D Perception
* CroCo v2: Improved Cross-view Completion Pre-training for Stereo Matching and Optical Flow
* Cross Contrasting Feature Perturbation for Domain Generalization
* Cross Modal Transformer: Towards Fast and Robust 3D Object Detection
* Cross-Dataset Study on the Brazilian Sign Language Translation, A
* Cross-Dimensional Refined Learning for Real-Time 3D Visual Perception from Monocular Video
* Cross-Domain Change Detection Network Based on Instance Normalization, A
* Cross-Domain Product Representation Learning for Rich-Content E-Commerce
* Cross-Domain Transfer Learning with CoRTe: Consistent and Reliable Transfer from Black-Box to Lightweight Segmentation Model
* Cross-Modal Dense Passage Retrieval for Outside Knowledge Visual Question Answering
* Cross-modal Latent Space Alignment for Image to Avatar Translation
* Cross-Modal Learning with 3D Deformable Attention for Action Recognition
* Cross-Modal Orthogonal High-Rank Augmentation for RGB-Event Transformer-trackers
* Cross-Modal Retrieval and Semantic Refinement for Remote Sensing Image Captioning
* Cross-modal Scalable Hyperbolic Hierarchical Clustering
* Cross-Modal Translation and Alignment for Survival Analysis
* Cross-model temporal cooperation via saliency maps for efficient frame classification
* Cross-Ray Neural Radiance Fields for Novel-view Synthesis from Unconstrained Image Collections
* Cross-view Semantic Alignment for Livestreaming Product Recognition
* Cross-view Topology Based Consistent and Complementary Information for Deep Multi-view Clustering
* CROSSFIRE: Camera Relocalization On Self-Supervised Features from an Implicit Representation
* CrossLoc3D: Aerial-Ground Cross-Source 3D Place Recognition
* CrossMatch: Source-Free Domain Adaptive Semantic Segmentation via Cross-Modal Consistency Training
* CSDA: Learning Category-Scale Joint Feature for Domain Adaptive Object Detection
* CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation
* CTVIS: Consistent Training for Online Video Instance Segmentation
* Cumulative Spatial Knowledge Distillation for Vision Transformers
* CuNeRF: Cube-Based Neural Radiance Field for Zero-Shot Medical Image Arbitrary-Scale Super Resolution
* Curvature-Aware Training for Coordinate Networks
* Customization of latent space in semi-supervised Variational AutoEncoder
* CVRecon: Rethinking 3D Geometric Feature Learning For Neural Reconstruction
* CVSformer: Cross-View Synthesis Transformer for Semantic Scene Completion
* Cyclic Test-Time Adaptation on Monocular Video for 3D Human Mesh Reconstruction
* Cyclic-Bootstrap Labeling for Weakly Supervised Object Detection
* Cylin-Painting: Seamless 360° Panoramic Image Outpainting and Beyond
* cyto-Knet: An instance segmentation approach for multiple myeloma plasma cells using conditional kernels
* D-IF: Uncertainty-aware Human Digitization via Implicit Distribution Field
* D-ViSA: A Dataset for Detecting Visual Sentiment from Art Images
* D3G: Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation
* DALL-EVAL: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models
* Dancing in the Dark: A Benchmark towards General Low-light Video Enhancement
* DandelionNet: Domain Composition with Instance Adaptive Classification for Domain Generalization
* Dark Side Augmentation: Generating Diverse Night Examples for Metric Learning
* DarSwin: Distortion Aware Radial Swin Transformer
* DARTH: Holistic Test-time Adaptation for Multiple Object Tracking
* Data Augmented Flatness-aware Gradient Projection for Continual Learning
* Data Driven Approach for Analyzing the Effect of Climate Change on Mosquito Abundance in Europe, A
* Data Efficient Single Image Dehazing via Adversarial Auto-Augmentation and extended Atmospheric Scattering Model
* Data-Free Class-Incremental Hand Gesture Recognition
* Data-free Knowledge Distillation for Fine-grained Visual Categorization
* DataDAM: Efficient Dataset Distillation with Attention Matching
* Dataset and Model for the Visual Quality Assessment of Inversely Tone-Mapped HDR Videos, A
* Dataset Quantization
* DatasetEquity: Are All Samples Created Equal? In The Quest For Equity Within Datasets
* DATran: Dual Attention Transformer for Multi-Label Image Classification
* DCELANM-Net: Medical image segmentation based on dual channel efficient layer aggregation network with learner
* DCMAI: A Dynamical Cross-Modal Alignment Interaction Framework for Document Key Information Extraction
* DCPB: Deformable Convolution based on the Poincaré Ball for Top-view Fisheye Cameras
* DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders
* DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion
* DDG-Net: Discriminability-Driven Graph Network for Weakly-supervised Temporal Action Localization
* DDIT: Semantic Scene Completion via Deformable Deep Implicit Templates
* DDOD: Dive Deeper into the Disentanglement of Object Detector
* DDP: Diffusion Model for Dense Visual Prediction
* DDS2M: Self-Supervised Denoising Diffusion Spatio-Spectral Model for Hyperspectral Image Restoration
* Dec-Adapter: Exploring Efficient Decoder-Side Adapter for Bridging Screen Content and Natural Image Compression
* Decision Boundary Optimization for Few-shot Class-Incremental Learning
* DECO: Dense Estimation of 3D Human-Scene Contact In The Wild
* Decoding Silent Reading EEG Signals Using Adaptive Feature Graph Convolutional Network
* Decoding the Dynamics of Climate Change Impact: Temporal Patterns of Surface Warming and Melting on the Nivlisen Ice Shelf, Dronning Maud Land, East Antarctica
* Decomposition-Based Variational Network for Multi-Contrast MRI Super-Resolution and Reconstruction
* Decouple Before Interact: Multi-Modal Prompt Learning for Continual Visual Question Answering
* Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection
* Decoupled Iterative Refinement Framework for Interacting Hands Reconstruction from a Single RGB Image
* DeDrift: Robust Similarity Search under Content Drift
* Deep Active Contours for Real-time 6-DoF Object Tracking
* Deep Blind Chest X-Ray Image Quality Assessment With Region-of-Interest-Guided Attention
* Deep Directly-Trained Spiking Neural Networks for Object Detection
* Deep Equilibrium Object Detection
* Deep Feature Deblurring Diffusion for Detecting Out-of-Distribution Objects
* Deep Fusion Transformer Network with Weighted Vector-Wise Keypoints Voting for Robust 6D Object Pose Estimation
* Deep Generalized Learning Model for PET Image Reconstruction
* Deep Generative Networks for Heterogeneous Augmentation of Cranial Defects
* Deep Geometrized Cartoon Line Inbetweening
* Deep geometry-aware camera self-calibration from video
* Deep Homography Mixture for Single Image Rolling Shutter Correction
* Deep Image Harmonization with Globally Guided Feature Transformation and Relation Distillation
* Deep Image Harmonization with Learnable Augmentation
* Deep Image Matting With Sparse User Interactions
* Deep Incubation: Training Large Models by Divide-and-Conquering
* Deep learning based 3d reconstruction for phenotyping of wheat seeds: a dataset, challenge, and baseline method
* Deep Learning Driven Detection of Tsunami Related Internal Gravity Waves: a path towards open-ocean natural hazards detection
* Deep Learning for Apple Fruit Quality Inspection using X-Ray Imaging
* Deep Learning for Integrated Speckle Reduction and Super-Resolution in Multi-Temporal SAR
* Deep Learning Framework using Sparse Diffusion MRI for Diagnosis of Frontotemporal Dementia
* Deep learning-based methods for detecting defects in cast iron parts and surfaces
* Deep learning-based survival prediction of brain tumor patients using attention-guided 3D convolutional neural network with radiomics approach from multimodality magnetic resonance imaging
* deep learning-based x-ray imaging diagnosis system for classification of tuberculosis, COVID-19, and pneumonia traits using evolutionary algorithm, A
* Deep Multitask Learning with Progressive Parameter Sharing
* Deep Multiview Clustering by Contrasting Cluster Assignments
* Deep Optics for Video Snapshot Compressive Imaging
* Deep Semantic-Aware Proxy Hashing for Multi-Label Cross-Modal Retrieval
* Deep Video Demoiréing via Compact Invertible Dyadic Decomposition
* DeepChange: A Long-Term Person Re-Identification Benchmark with Clothes Change
* DeepContrast: Deep Tissue Contrast Enhancement using Synthetic Data Degradations and OOD Model Predictions
* DeepCut: Unsupervised Segmentation using Graph Neural Networks Clustering
* DeepDIST: A Black-Box Anti-Collusion Framework for Secure Distribution of Deep Models
* Deepfakes Signatures Detection in the Handcrafted Features Space
* DeePoint: Visual Pointing Recognition and Direction Estimation
* DeepRoadNet: A deep residual based segmentation network for road map detection from remote aerial image
* DeepVAT: A Self-Supervised Technique for Cluster Assessment in Image Datasets
* Defense-Prefix for Preventing Typographic Attacks on CLIP
* DeFi: Detection and Filling of Holes in Point Clouds Towards Restoration of Digitized Cultural Heritage Models
* Deformable Model-Driven Neural Rendering for High-Fidelity 3D Reconstruction of Human Heads Under Low-View Settings
* Deformable Neural Radiance Fields using RGB and Event Cameras
* Deformer: Dynamic Fusion Transformer for Robust Hand Pose Estimation
* DeFormer: Integrating Transformers with Deformable Models for 3D Shape Abstraction from a Single Image
* DeformToon3d: Deformable Neural Radiance Fields for 3D Toonification
* Degradation-Resistant Unfolding Network for Heterogeneous Image Fusion
* DELFlow: Dense Efficient Learning of Scene Flow for Large-Scale Point Clouds
* Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement
* DeLiRa: Self-Supervised Depth, Light, and Radiance Fields
* DELO: Deep Evidential LiDAR Odometry using Partial Optimal Transport
* Delta Denoising Score
* Delving into Motion-Aware Matching for Monocular 3D Object Tracking
* Democratising 2D Sketch to 3D Shape Retrieval Through Pivoting
* Denoising Diffusion Autoencoders are Unified Self-supervised Learners
* Denoising Diffusion for 3D Hand Pose Estimation from Images
* Dense 2D-3D Indoor Prediction with Sound via Aligned Cross-Modal Distillation
* Dense Text-to-Image Generation with Attention Modulation
* DenseShift: Towards Accurate and Efficient Low-Bit Power-of-Two Quantization
* Density-invariant Features for Distant Point Cloud Registration
* DEQ-MPI: A Deep Equilibrium Reconstruction With Learned Consistency for Magnetic Particle Imaging
* Design and Implementation of K-Band Electromagnetic Wave Rain Gauge System
* Designing Phase Masks for Under-Display Cameras
* DETA: Denoised Task Adaptation for Few-Shot Learning
* Detecting Images Generated by Deep Diffusion Models using their Local Intrinsic Dimensionality
* Detecting Objects with Context-Likelihood Graphs and Graph Refinement
* Detection and Analysis of the Variation in the Minimum Ecological Instream Flow Requirement in the Chinese Northwestern Inland Arid Region by Using a New Remote Sensing Method
* Detection of Aphid-Infested Mustard Crop Using Ground Spectroscopy
* Detection of Changes in Buildings in Remote Sensing Images via Self-Supervised Contrastive Pre-Training and Historical Geographic Information System Vector Maps
* Detection of Fusarium Damaged Kernels in Wheat Using Deep Semi-Supervised Learning on a Novel WheatSeedBelt Dataset
* Detection of Rubber Tree Powdery Mildew from Leaf Level Hyperspectral Data Using Continuous Wavelet Transform and Machine Learning
* Detection Transformer with Stable Matching
* DetermiNet: A Large-Scale Diagnostic Dataset for Complex Visually-Grounded Referencing using Determiners
* Deterministic Neural Illumination Mapping for Efficient Auto-White Balance Correction
* DetOFA: Efficient Training of Once-for-All Networks for Object Detection using Path Filter
* DETR Does Not Need Multi-Scale or Locality Design
* DETRDistill: A Universal Knowledge Distillation Framework for DETR-families
* DETRs with Collaborative Hybrid Assignments Training
* DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds
* Developing Explainable Deep Model for Discovering Novel Control Mechanism of Neuro-Dynamics
* Developing Robust and Lightweight Adversarial Defenders by Enforcing Orthogonality on Attack-Agnostic Denoising Autoencoders
* Development and Evaluation of a Cloud-Gap-Filled MODIS Normalized Difference Snow Index Product over High Mountain Asia
* Development and validation of an artificial intelligence software for periodontal bone loss in panoramic imaging
* Development of a Numerical Simulation Model to Support the Design of a Ship-Satellite Communication System for Autonomous Marine Navigation
* Development of Intensity-Duration-Frequency (IDF) Curves over the United Arab Emirates (UAE) Using CHIRPS Satellite-Based Precipitation Products
* Devil is in the Crack Orientation: A New Perspective for Crack Detection, The
* Devil is in the Upsampling: Architectural Decisions Made Simpler for Denoising with Deep Image Prior, The
* DFA3D: 3D Deformable Attention For 2D-to-3D Feature Lifting
* DFM-X: Augmentation by Leveraging Prior Knowledge of Shortcut Learning
* DG-Recon: Depth-Guided Neural 3D Scene Reconstruction
* DG3D: Generating High Quality 3D Textured Shapes by Learning to Discriminate Multi-Modal Diffusion-Renderings
* Diagnosing of Parkinson's disease based on hand drawing analysis using Bi-Directional LSTM equipped with fuzzy inferential soft-max classifier
* DiFaReli: Diffusion Face Relighting
* Diff-Retinex: Rethinking Low-light Image Enhancement with A Generative Diffusion Model
* Diff3DHPE: A Diffusion Model for 3D Human Pose Estimation
* DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment
* DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability
* DiffDreamer: Towards Consistent Unsupervised Single-view Scene Extrapolation with Conditional Diffusion Models
* Differentiable Image Data Augmentation and Its Applications: A Survey
* Differentiable Transportation Pruning
* DiffFacto: Controllable Part-Based 3D Point Cloud Generation with Cross Diffusion
* DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-Efficient Fine-Tuning
* DiffGuard: Semantic Mismatch-Guided Out-of-Distribution Detection using Pre-trained Diffusion Models
* DiffHPE: Robust, Coherent 3D Human Pose Lifting with Diffusion
* DiffIR: Efficient Diffusion Model for Image Restoration
* DiffPose: Multi-hypothesis Human Pose Estimation using Diffusion Models
* DiffPose: SpatioTemporal Diffusion Model for Video-Based Human Pose Estimation
* Diffractive Sail-Based Displaced Orbits for High-Latitude Environment Monitoring
* DiffRate: Differentiable Compression Rate for Efficient Vision Transformers
* DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion
* DiffuMask: Synthesizing Images with Pixel-Level Annotations for Semantic Segmentation Using Diffusion Models
* Diffuse3D: Wide-Angle 3D Photography via Bilateral Diffusion
* Diffusion Action Segmentation
* Diffusion Based Augmentation for Captioning and Retrieval in Cultural Heritage
* Diffusion in Style
* Diffusion Mechanism in Residual Neural Network: Theory and Applications
* Diffusion Model as Representation Learner
* Diffusion Models as Masked Autoencoders
* Diffusion-Based 3D Human Pose Estimation with Multi-Hypothesis Aggregation
* Diffusion-based Image Translation with Label Guidance for Domain Adaptive Semantic Segmentation
* Diffusion-Guided Reconstruction of Everyday Hand-Object Interaction Clips
* Diffusion-SDF: Conditional Generative Modeling of Signed Distance Functions
* DiffusionDet: Diffusion Model for Object Detection
* DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
* DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding
* Digitization of the Built Cultural Heritage: An Integrated Methodology for Preservation and Accessibilization of an Art Nouveau Museum
* DiLiGenT-Pi: Photometric Stereo for Planar Surfaces with Rich Details - Benchmark Dataset and Beyond
* DIME-FM: DIstilling Multimodal and Efficient Foundation Models
* DINAR: Diffusion Inpainting of Neural Textures for One-Shot Human Avatars
* DIRE for Diffusion-Generated Image Detection
* Direct Unsupervised Denoising
* Discovering Spatio-Temporal Rationales for Video Question Answering
* Discrepant and Multi-instance Proxies for Unsupervised Person Re-identification
* Discrete Representation Learning for Modeling Imaging-based Spatial Transcriptomics Data
* Discriminative Class Tokens for Text-to-Image Diffusion Models
* Disentangle then Parse: Night-time Semantic Segmentation with Illumination Disentanglement
* Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning
* DISeR: Designing Imaging Systems with Reinforcement Learning
* DISGAN: Wavelet-informed Discriminator Guides GAN to MRI Super-resolution with Noise Cleaning
* Disjoint Pose and Shape for 3D Face Reconstruction
* Disperse Asymmetric Subspace Relation Hashing for Cross-Modal Retrieval
* Disposable Transfer Learning for Selective Source Task Unlearning
* Distance Matters For Improving Performance Estimation Under Covariate Shift
* Distillation, Ensemble and Selection for Building a Better and Faster Siamese Based Tracker
* DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation
* Distilled non-semantic speech embeddings with binary neural networks for low-resource devices
* Distilled Reverse Attention Network for Open-world Compositional Zero-Shot Learning
* Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding
* Distilling DETR with Visual-Linguistic Knowledge for Open-Vocabulary Object Detection
* Distilling from Similar Tasks for Transfer Learning on a Budget
* Distilling Large Vision-Language Model with Out-of-Distribution Generalizability
* Distilling Part-whole Hierarchical Knowledge from a Huge Pretrained Class Agnostic Segmentation Framework
* Distracting Downpour: Adversarial Weather Attacks for Motion Estimation
* Distributed bundle adjustment with block-based sparse matrix compression for super large scale datasets
* Distributed non-linear model predictive control with Gaussian process dynamics for two-dimensional motion of vehicle platoon
* Distribution Shift Matters for Knowledge Distillation with Webly Collected Images
* Distribution-Aligned Diffusion for Human Mesh Recovery
* Distribution-Aware Prompt Tuning for Vision-Language Models
* Distribution-Consistent Modal Recovering for Incomplete Multimodal Learning
* Diverse Cotraining Makes Strong Semi-Supervised Segmentor
* Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning
* Diverse Inpainting and Editing with GAN Inversion
* Divide and Conquer: 3D Point Cloud Instance Segmentation With Point-Wise Binarization
* Divide and Conquer: a Two-Step Method for High Quality Face De-identification with Model Explainability
* Divide&Classify: Fine-Grained Classification for City-Wide Visual Place Recognition
* DLGSANet: Lightweight Dynamic Local and Global Self-Attention Network for Image Super-Resolution
* DLT: Conditioned layout generation with Joint Discrete-Continuous Diffusion Layout Transformer
* DMCH: A Deep Metric and Category-Level Semantic Hashing Network for Retrieval in Remote Sensing
* DMMG: Dual Min-Max Games for Self-Supervised Skeleton-Based Action Recognition
* DMNet: Delaunay Meshing Network for 3D Shape Representation
* DMSC-GAN: A c-GAN-Based Framework for Super-Resolution Reconstruction of SAR Images
* DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering
* Do DALL-E and Flamingo Understand Each Other?
* Do Planar Constraints Improve Camera Pose Estimation in Monocular SLAM?
* DocTr: Document Transformer for Structured Information Extraction in Documents
* Document Understanding Dataset and Evaluation (DUDE)
* Does Physical Adversarial Example Really Matter to Autonomous Driving? Towards System-Level Effect of Adversarial Object Evasion Attack
* DOLCE: A Model-Based Probabilistic Diffusion Framework for Limited-Angle CT Reconstruction
* Domain Adaptive Few-Shot Open-Set Learning
* Domain Adversarial Learning Towards Underwater Image Enhancement
* Domain Generalization Guided by Gradient Signal to Noise Ratio of Parameters
* Domain generalization of 3D semantic segmentation in autonomous driving
* Domain Generalization via Balancing Training Difficulty and Model Capability
* Domain Generalization via Rationale Invariance
* Domain Incremental Object Detection Based on Feature Space Topology Preserving Strategy
* Domain Specified Optimization for Deployment Authorization
* Domain-Specificity Inducing Transformers for Source-Free Domain Adaptation
* DomainAdaptor: A Novel Approach to Test-time Adaptation
* DomainDrop: Suppressing Domain-Sensitive Channels for Domain Generalization
* DomainFeat: Learning Local Features With Domain Adaptation
* DONNAv2: Lightweight Neural Architecture Search for Vision tasks
* Doppelgangers: Learning to Disambiguate Images of Similar Structures
* DOT: A Distillation-Oriented Trainer
* Double-Factor Tensor Cascaded-Rank Decomposition for Hyperspectral Image Denoising
* Downscaled Representation Matters: Improving Image Rescaling with Collaborative Downscaled Images
* Downstream-agnostic Adversarial Examples
* DPF-Net: Combining Explicit Shape Priors in Deformable Primitive Field for Unsupervised Structural Reconstruction of 3D Objects
* DPM-OT: A New Diffusion Probabilistic Model Based on Optimal Transport
* DPS-Net: Deep Polarimetric Stereo Depth Estimation
* DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection
* DR-Tune: Improving Fine-tuning of Pretrained Visual Models by Distribution Regularization with Semantic Calibration
* DRAW: Defending Camera-shooted RAW against Image Manipulation
* DREAM: Efficient Dataset Distillation by Representative Matching
* DreamBooth3D: Subject-Driven Text-to-3D Generation
* DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion
* DreamTeacher: Pretraining Image Backbones with Deep Generative Models
* Dreamwalker: Mental Planning for Continuous Vision-Language Navigation
* DReg-NeRF: Deep Registration for Neural Radiance Fields
* DriveAdapter: Breaking the Coupling Barrier of Perception and Planning in End-to-End Autonomous Driving
* Drones4Good: Supporting Disaster Relief Through Remote Sensing and AI
* DS-Fusion: Artistic Typography via Discriminated and Stylized Diffusion
* DSMT-Net: Dual Self-Supervised Multi-Operator Transformation for Multi-Source Endoscopic Ultrasound Diagnosis
* DSRNet: Depth Super-Resolution Network guided by blurry depth and clear intensity edges
* DTR-Net: Dual-Space 3D Tooth Model Reconstruction From Panoramic X-Ray Images
* Dual Aggregation Transformer for Image Super-Resolution
* Dual Class Token Vision Transformer for Direction of Arrival Estimation in Low SNR
* Dual Learning with Dynamic Knowledge Distillation for Partially Relevant Video Retrieval
* Dual Meta-Learning with Longitudinally Generalized Regularization for One-Shot Brain Tissue Segmentation Across the Human Lifespan
* Dual Perspective of Human Motion Analysis: 3D Pose Estimation and 2D Trajectory Prediction, A
* Dual Pseudo-Labels Interactive Self-Training for Semi-Supervised Visible-Infrared Person Re-Identification
* Dual-Contrastive Dual-Consistency Dual-Transformer: A Semi-Supervised Approach to Medical Image Segmentation
* Dual-level Interaction for Domain Adaptive Semantic Segmentation
* Dual-Stream Complex-Valued Convolutional Network for Authentic Dehazed Image Quality Assessment
* DVGaze: Dual-View Gaze Estimation
* DVIS: Decoupled Video Instance Segmentation Framework
* DyGait: Exploiting Dynamic Representations for High-performance Gait Recognition
* Dynamic Analysis of a Long Run-Out Rockslide Considering Dynamic Fragmentation Behavior in Jichang Town: Insights from the Three-Dimensional Coupled Finite-Discrete Element Method
* Dynamic Dual-Processing Object Detection Framework Inspired by the Brain's Recognition Mechanism, A
* Dynamic Hyperbolic Attention Network for Fine Hand-object Reconstruction
* Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding
* Dynamic Mesh Recovery from Partial Point Cloud Sequence
* Dynamic Mesh-Aware Radiance Fields
* Dynamic Multiview Refinement of 3D Hand Datasets using Differentiable Ray Tracing
* Dynamic Neural Network is All You Need: Understanding the Robustness of Dynamic Mechanisms in Neural Networks
* Dynamic Perceiver for Efficient Visual Recognition
* Dynamic PlenOctree for Adaptive Sampling Refinement in Explicit NeRF
* Dynamic Point Fields
* Dynamic Residual Classifier for Class Incremental Learning
* Dynamic Scene Graph Representation for Surgical Video
* Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation
* Dynamic Texts From UAV Perspective Natural Images
* Dynamic Token Pruning in Plain Vision Transformers for Semantic Segmentation
* Dynamic weight HiLo attention network for medical image multiple organ segmentation
* DynamicISP: Dynamically Controlled Image Signal Processor for Image Recognition
* DynaMITe: Dynamic Query Bootstrapping for Multi-object Interactive Segmentation Transformer
* E-MLB: Multilevel Benchmark for Event-Based Camera Denoising
* E2E-LOAD: End-to-End Long-form Online Action Detection
* E2NeRF: Event Enhanced Neural Radiance Fields from Blurry Images
* E2VPT: An Effective and Efficient Approach for Visual Prompt Tuning
* E3Sym: Leveraging E(3) Invariance for Unsupervised 3D Planar Reflective Symmetry Detection
* Early Detection of Dicamba and 2,4-D Herbicide Drifting Injuries on Soybean with a New Spatial-Spectral Algorithm Based on LeafSpec, an Accurate Touch-Based Hyperspectral Leaf Scanner
* Early Detection of Drought Stress in Durum Wheat Using Hyperspectral Imaging and Photosystem Sensing
* EC-SfM: Efficient Covisibility-Based Structure-from-Motion for Both Sequential and Unordered Images
* ECO: Ensembling Context Optimization for Vision-Language Models
* Ecological Security Assessment of Grain-for-Green Program Typical Areas in Northern China Based on Multi-Source Remote Sensing Data
* EdaDet: Open-Vocabulary Object Detection Using Early Dense Alignment
* EDAPS: Enhanced Domain-Adaptive Panoptic Segmentation
* Edge aware depth inference for large-scale aerial building multi-view stereo
* Editable Image Geometric Abstraction via Neural Primitive Assembly
* Editing Implicit Assumptions in Text-to-Image Diffusion Models
* Effect of Grassland Fires on Dust Storms in Dornod Aimag, Mongolia
* Effect of Stage Training for Long-Tailed Multi-Label Image Classification
* effect of visibility on road traffic during foggy weather conditions, The
* effective human monkeypox classification using vision transformer, An
* Effective Real Image Editing with Accelerated Iterative Diffusion Inversion
* Effective Whole-body Pose Estimation with Two-stages Distillation
* effectiveness of MAE pre-pretraining for billion-scale pretraining, The
* Effects of Potential Large-Scale Irrigation on Regional Precipitation in Northwest China
* Efficient 3D Frequency Semi-Airborne Electromagnetic Modeling Based on Domain Decomposition
* Efficient 3D Reconstruction, Streaming and Visualization of Static and Dynamic Scene Parts for Multi-client Live-telepresence in Large-scale Environments
* Efficient 3D Semantic Segmentation with Superpoint Transformer
* Efficient Action Spotting Using Saliency Feature Weighting
* Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory
* Efficient Camouflaged Object Detection via Progressive Refinement Network
* Efficient Computation Sharing for Multi-Task Visual Scene Understanding
* Efficient Controllable Multi-Task Architectures
* Efficient Converted Spiking Neural Network for 3D and 2D Classification
* Efficient Crowd Counting via Dual Knowledge Distillation
* Efficient Decision-based Black-box Patch Attacks on Video Recognition
* Efficient Deep Neural Network to Classify Large 3D Images With Small Objects, An
* Efficient Deep Space Filling Curve
* Efficient Diffusion Training via Min-SNR Weighting Strategy
* Efficient Discovery and Effective Evaluation of Visual Perceptual Similarity: A Benchmark and Beyond
* Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation
* Efficient Grapevine Structure Estimation in Vineyards Conditions
* Efficient Graph Convolutional RVFL Network for Hyperspectral Image Classification, An
* Efficient Joint Optimization of Layer-Adaptive Weight Pruning in Deep Neural Networks
* Efficient LiDAR Point Cloud Oversegmentation Network
* efficient lung image classification and detection using spiral-optimized Gabor filter with convolutional neural network, An
* Efficient Model Personalization in Federated Learning via Client-Specific Prompt Generation
* Efficient Multi-View K-Means for Image Clustering
* Efficient Multiscale Spatial Rearrangement MLP Architecture for Image Restoration, An
* Efficient Neural PDE-Solvers using Quantization Aware Training
* Efficient neural supersampling on a novel gaming dataset
* efficient point cloud place recognition approach based on transformer in dynamic environment, An
* Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis
* Efficient Rep-Style Gaussian-Wasserstein Network: Improved UAV Infrared Small Object Detection for Urban Road Surveillance and Safety, An
* efficient stacked bidirectional GRU-LSTM network for intracranial hemorrhage detection, An
* Efficient Transformer-Based 3D Object Detection with Dynamic Token Halting
* Efficient Unified Demosaicing for Bayer and Non-Bayer Patterned Image Sensors
* Efficient Urban Flood Mapping Framework Towards Disaster Response Driven by Weakly Supervised Semantic Segmentation with Decoupled Training Samples, An
* Efficient Video Action Detection with Token Dropout and Context Refinement
* Efficient Video Prediction via Sparsely Conditioned Flow Matching
* Efficient View Synthesis with Neural Radiance Distribution Field
* efficient wavelet thresholding strategy and robust shrinkage approach for de-noising ECG signal, An
* Efficient, Self-Supervised Human Pose Estimation with Inductive Prior Tuning
* Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers
* Efficiently Robustify Pre-Trained Models
* EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones
* EfficientViT: Lightweight Multi-Scale Attention for High-Resolution Dense Prediction
* EGC: Image Generation and Classification via a Diffusion Energy-Based Model
* EGformer: Equirectangular Geometry-biased Transformer for 360 Depth Estimation
* EGMT-CD: Edge-Guided Multimodal Transformers Change Detection from Satellite and Aerial Images
* Ego-Only: Egocentric Action Detection without Exocentric Transferring
* EgoHumans: An Egocentric 3D Multi-Human Benchmark
* EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries
* EgoObjects: A Large-Scale Egocentric Dataset for Fine-Grained Object Understanding
* EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding
* EgoTV: Egocentric Task Verification from Natural Language Task Descriptions
* EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone
* EigenPlaces: Training Viewpoint Robust Models for Visual Place Recognition
* EigenTrajectory: Low-Rank Descriptors for Multi-Modal Trajectory Forecasting
* ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices
* Elevation-Dependent Contribution of the Response and Sensitivity of Vegetation Greenness to Hydrothermal Conditions on the Grasslands of Tibet Plateau from 2000 to 2021
* ELFNet: Evidential Local-global Fusion for Stereo Matching
* ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation
* Embarrassingly Simple Backdoor Attack on Self-supervised Learning, An
* Embedded Deformation-based Compression for Human 3D Dynamic Meshes with Changing Topology
* Embedded plant recognition: a benchmark for low footprint deep neural networks
* Embedded-GPU-Based Scheme for Real-Time Imaging Processing of Unmanned Aerial Vehicle Borne Video Synthetic Aperture Radar, An
* EMDB: The Electromagnetic Database of Global 3D Human Pose and Shape in the Wild
* EMMN: Emotional Motion Memory Network for Audio-driven Emotional Talking Face Generation
* EmoSet: A Large-scale Visual Emotion Dataset with Rich Attributes
* EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation
* Emotional Listener Portrait: Realistic Listener Motion Simulation in Conversation
* Empirical Analysis for Zero-Shot Multi-Label Classification on COVID-19 CT Scans and Uncurated Reports, An
* Empirical Analysis of Range for 3D Object Detection, An
* empirical study of the effect of video encoders on Temporal Video Grounding, An
* Empowering Low-Light Image Enhancer through Customized Learnable Priors
* Empowering ultrasound image filtering precision by reducing speckles and preserving edge cues
* EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization
* EMR-MSF: Self-Supervised Recurrent Monocular Scene Flow Exploiting Ego-Motion Rigidity
* Encoding of Media Value Chain Processes Through Blockchains and MPEG-21 Smart Contracts for Media
* Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories
* End-to-end 3D Tracking with Decoupled Queries
* End-to-End Deep Learning for Reconstructing Segmented 3D CT Image from Multi-Energy X-ray Projections
* End-to-End Diffusion Latent Optimization Improves Classifier Guidance
* End-to-End Instance-Level Human Parsing by Segmenting Persons
* End2End Multi-View Feature Matching with Differentiable Pose Optimization
* Energy-based Self-Training and Normalization for Unsupervised Domain Adaptation
* enhanced framework for identifying brain tumor using discrete wavelet transform, deep convolutional network, and feature fusion-based machine learning techniques, An
* Enhanced Meta Label Correction for Coping with Label Corruption
* Enhanced Micro-Doppler Feature Extraction Using Adaptive Short-Time Kernel-Based Sparse Time-Frequency Distribution
* Enhanced Soft Label for Semi-Supervised Semantic Segmentation
* Enhancing Adversarial Robustness in Low-Label Regime via Adaptively Weighted Regularization and Knowledge Distillation
* Enhancing brain tumor classification with transfer learning: Leveraging DenseNet121 for accurate and efficient detection
* Enhancing Classification Accuracy on Limited Data via Unconditional GAN
* Enhancing CLIP with GPT-4: Harnessing Visual Descriptions as Prompts
* Enhancing Differentiable Architecture Search: A Study on Small Number of Cell Blocks in the Search Stage, and Important Branches-based Cells Selection
* Enhancing explainability in brain tumor detection: A novel DeepEBTDNet model with LIME on MRI images
* Enhancing Fine-Tuning based Backdoor Defense with Sharpness-Aware Minimization
* Enhancing Generalization of Universal Adversarial Perturbation through Gradient Aggregation
* Enhancing Human-Robot Collaborative Object Search through Human Behavior Observation and Dialog
* Enhancing LiDAR-UAS Derived Digital Terrain Models with Hierarchic Robust and Volume-Based Filtering Approaches for Precision Topographic Mapping
* Enhancing Medical Image Segmentation: Optimizing Cross-Entropy Weights and Post-Processing with Autoencoders
* Enhancing Modality-Agnostic Representations via Meta-learning for Brain Tumor Segmentation
* Enhancing Multi-Label Long-Tailed Classification on Chest X-Rays through ML-GCN Augmentation
* Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts
* Enhancing Non-line-of-sight Imaging via Learnable Inverse Kernel and Attention Mechanisms
* Enhancing Object Detection in Remote Sensing: A Hybrid YOLOv7 and Transformer Approach with Automatic Model Selection
* Enhancing Privacy Preservation in Federated Learning via Learning Rate Perturbation
* Enhancing Representation Learning With Spatial Transformation and Early Convolution for Reinforcement Learning-Based Small Object Detection
* Enhancing Sample Utilization through Sample Adaptive Augmentation in Semi-Supervised Learning
* Ensuring a connected structure for Retinal Vessels Deep-Learning Segmentation
* ENTL: Embodied Navigation Trajectory Learner
* Entropic Score metric: Decoupling Topology and Size in Training-free NAS
* ENVIDR: Implicit Differentiable Renderer with Neural Environment Lighting
* Environment Agnostic Representation for Visual Reinforcement learning
* Environment-Invariant Curriculum Relation Learning for Fine-Grained Scene Graph Generation
* Environmental and Seasonal Variability of High Latitude Methane Emissions Based on Earth Observation Data and Atmospheric Inverse Modelling
* eP-ALM: Efficient Perceptual Augmentation of Language Models
* EP2P-Loc: End-to-End 3D Point to 2D Pixel Localization for Large-Scale Visual Localization
* EPiC: Ensemble of Partial Point Clouds for Robust Classification
* EQ-Net: Elastic Quantization Neural Networks
* Equivariant Similarity for Vision-Language Foundation Models
* Erasing Concepts from Diffusion Models
* ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution
* Essential Matrix Estimation using Convex Relaxations in Orthogonal Space
* ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer
* Estimating Fine Fuel Load Using Sentinel-2A Imagery and Machine Learning: A Case Study in the Mountainous Forests of Changsha, China
* Estimating the SPAD of Litchi in the Growth Period and Autumn Shoot Period Based on UAV Multi-Spectrum
* Estimating tree species composition from airborne laser scanning data using point-based deep learning models
* Estimation and Spatiotemporal Analysis of Surface Evaporation in the Yangtze River Basin from 2010 to 2019
* Estimation of Crop Production by Fusing Images and Crop Features
* Estimation of Human Condition at Disaster Site Using Aerial Drone Images
* Estimation of Oak Leaf Functional Traits for California Woodland Savannas and Mixed Forests: Comparison between Statistical, Physical, and Hybrid Methods Using Spectroscopy
* Estimator Meets Equilibrium Perspective: A Rectified Straight Through Estimator for Binary Neural Networks Training
* ETran: Energy-Based Transferability Estimation
* Euclidean Space is Evil: Hyperbolic Attribute Editing for Few-shot Image Generation, The
* Eulerian Single-Photon Vision
* Evaluating Data Attribution for Text-to-Image Models
* Evaluating Predictive Models of Tree Foliar Moisture Content for Application to Multispectral UAS Data: A Laboratory Study
* Evaluating the Performance of Satellite Derived Temperature and Precipitation Datasets in Ecuador
* Evaluating the Predictive Value of Glioma Growth Models for Low-Grade Glioma After Tumor Resection
* Evaluation and Analysis of Remote Sensing-Based Approach for Salt Marsh Monitoring
* Evaluation and Improvement of Interpretability for Self-Explainable Part-Prototype Networks
* Evaluation of 3D Reconstruction for Cultural Heritage Applications
* Evaluation of Open Geotechnical Knowledge in Urban Environments for 3D Modelling of the City of Seville (Spain)
* Evaluation of Suitable Habitats for Birds Based on MaxEnt and Google Earth Engine: A Case Study of Baer's Pochard (Aythya baeri) in Baiyangdian, China
* Evaluation of the CRTM Land Emissivity Model over Grass and Sand Surfaces Using Ground-Based Measurements
* Evaluation of the Spatiotemporal Change of Ecological Quality under the Context of Urban Expansion: A Case Study of Typical Urban Agglomerations in China
* Event Camera Data Pre-training
* Event-based Temporally Dense Optical Flow Estimation with Sequential Learning
* Event-Guided Procedure Planning from Instructional Videos with Text Supervision
* Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers
* EverLight: Indoor-Outdoor Editable HDR Lighting Estimation
* Evidential Pseudo-Label Ensemble for semi-supervised classification
* Evolution of Different Types of Eddies Originating from Different Baroclinic Instability Types
* Examining Autoexposure for Challenging Scenes
* ExBluRF: Efficient Radiance Fields for Extreme Motion Blurred Images
* Exemplar-Free Continual Transformer with Convolutions
* Expectation Propagation Detection With Physical Network Coding for Massive MIMO Systems
* Experience Replay as an Effective Strategy for Optimizing Decentralized Federated Learning
* Experimental Analysis of Deep-Sea AUV Based on Multi-Sensor Integrated Navigation and Positioning
* Experimental Determination of the Ionospheric Effects and Cycle Slip Phenomena for Galileo and GPS in the Arctic
* Experimental Protocol for Neural Architecture Search in Super-Resolution, An
* Experimental Results of Underwater Sound Speed Profile Inversion by Few-Shot Multi-Task Learning
* Explaining Adversarial Robustness of Neural Networks from Clustering Effect Perspective
* Explaining through Transformer Input Sampling
* Explaining Vision and Language through Graphs of Events in Space and Time
* Explanatory Object Part Aggregation for Zero-Shot Learning
* Explicit Motion Disentangling for Efficient Optical Flow Estimation
* Exploiting Proximity-Aware Tasks for Embodied Social Navigation
* Exploiting stance similarity and graph neural networks for fake news detection
* Exploiting temporal information to detect conversational groups in videos and predict the next speaker
* Exploration of Data Scene Characterization and 3D ROC Evaluation for Hyperspectral Anomaly Detection
* exploration of ventricle regions segmentation and multiclass disease detection using cardiac MRI, An
* Explore and Tell: Embodied Visual Captioning in 3D Environments
* Exploring Group Video Captioning with Efficient Relational Approximation
* Exploring Image Classification Robustness and Interpretability with Right for the Right Reasons Data Augmentation
* Exploring Inlier and Outlier Specification for Improved Medical OOD Detection
* Exploring Lightweight Hierarchical Vision Transformers for Efficient Visual Tracking
* Exploring Model Transferability through the Lens of Potential Energy
* Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection
* Exploring Open-Vocabulary Semantic Segmentation from CLIP Vision Encoder Distillation Only
* Exploring Positional Characteristics of Dual-Pixel Data for Camera Autofocus
* Exploring Predicate Visual Context in Detecting of Human-Object Interactions
* Exploring Self-Supervised Learning for Multi-Modal Remote Sensing Pre-Training via Asymmetric Attention Fusion
* Exploring Temporal Concurrency for Video-Language Representation Learning
* Exploring Temporal Frequency Spectrum in Deep Video Deblurring
* Exploring the association between socio-demographic factors and public acceptance towards fully automated vehicles: Insights from a survey in Australia
* Exploring the Benefits of Visual Prompting in Differential Privacy
* Exploring the Road Graph in Trajectory Forecasting for Autonomous Driving
* Exploring the Sim2Real Gap using Digital Twins
* Exploring the Spatiotemporal Dynamics and Driving Factors of Net Ecosystem Productivity in China from 1982 to 2020
* Exploring Transformers for Open-world Instance Segmentation
* Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives
* ExposureDiffusion: Learning to Expose for Low-light Image Enhancement
* Expressive Talking Head Video Encoding in StyleGAN2 Latent Space
* Expressive Text-to-Image Generation with Rich Text
* Extending TrOCR for Text Localization-Free OCR of Full-Page Scanned Receipt Images
* Extensible and Efficient Proxy for Neural Architecture Search
* External Commonsense Knowledge as a Modality for Social Intelligence Question-Answering
* Extract-and-Adaptation Network for 3D Interacting Hand Mesh Recovery
* Extracting Citrus-Growing Regions by Multiscale UNet Using Sentinel-2 Satellite Imagery
* Extracting Plastic Greenhouses from Remote Sensing Images with a Novel U-FDS Net
* Extraction of Building Roof Contours from Airborne LiDAR Point Clouds Based on Multidirectional Bands
* Extreme Marine Heatwave Event in the Yellow Sea during Winter 2019/20: Causes and Consequences, An
* F&F Attack: Adversarial Attack against Multiple Object Trackers by Inducing False Negatives and False Positives
* Face Clustering via Graph Convolutional Networks with Confidence Edges
* FaceCLIPNeRF: Text-driven 3D Face Manipulation using Deformable Neural Radiance Fields
* FACET: Fairness in Computer Vision Evaluation Benchmark
* Facial Prior Guided Micro-Expression Generation
* Facsimiles-based deep learning for matching relief-printed decorations on medieval ceramic sherds
* Factorized Dynamic Fully-Connected Layers for Neural Networks
* Factorized Inverse Path Tracing for Efficient and Accurate Material-Lighting Estimation
* FACTS: First Amplify Correlations and Then Slice to Discover Bias
* Fair Robust Active Learning by Joint Inconsistency
* False Correlation Reduction for Offline Reinforcement Learning
* Fan-Beam Binarization Difference Projection (FB-BDP): A Novel Local Object Descriptor for Fine-Grained Leaf Image Retrieval
* Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation
* Far Away in the Deep Space: Dense Nearest-Neighbor-Based Out-of-Distribution Detection
* FArMARe: a Furniture-Aware Multi-task methodology for Recommending Apartments based on the user interests
* FashionNTM: Multi-turn Fashion Image Retrieval via Cascaded Memory
* Fast Adversarial Training with Smooth Convergence
* Fast and Accurate Hyperspectral Image Classification with Window Shape Adaptive Singular Spectrum Analysis
* Fast and Accurate Transferability Measurement by Evaluating Intra-class Feature Variance
* Fast Full-frame Video Stabilization with Iterative Optimization
* Fast Globally Optimal Surface Normal from an Affine Correspondence
* Fast Inference and Update of Probabilistic Density Estimation on Trajectory Prediction
* Fast Neural Scene Flow
* Fast Object Detection in High-Resolution Videos
* Fast Solution of Scattering and Micro-Doppler Features from Moving Target Using a Tailored Shooting and Bouncing Ray Method
* Fast Thick Cloud Removal for Multi-Temporal Remote Sensing Imagery via Representation Coefficient Total Variation
* Fast UAV Image Mosaicking by a Triangulated Irregular Network of Bucketed Tiepoints
* Fast Unified System for 3D Object Detection and Tracking, A
* FastRecon: Few-shot Industrial Anomaly Detection via Fast Feature Reconstruction
* FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization
* FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
* FB-BEV: BEV Representation from Forward-Backward View Transformations
* FBLNet: FeedBack Loop Network for Driver Attention Prediction
* Fcaformer: Forward Cross Attention in Hybrid Vision Transformer
* FDViT: Improve the Hierarchical Architecture of Vision Transformer
* Feasibility Study of Thermal Infrared Imaging for Monitoring Natural Terrain: A Case Study in Hong Kong, A
* FeatEnHancer: Enhancing Hierarchical Features for Object Detection and Beyond Under Low-Light Vision
* Feature Modulation Transformer: Cross-Refinement of Global Representation via High-Frequency Prior for Image Super-Resolution
* Feature Prediction Diffusion Model for Video Anomaly Detection
* Feature Proliferation: the Cancer in StyleGAN and its Treatments
* Feature Scalar Field Grid-Guided Optical-Flow Image Matching for Multi-View Images of Asteroid
* Feature Weakening, Contextualization, and Discrimination for Weakly Supervised Temporal Action Localization
* FeatureNeRF: Learning Generalizable NeRFs by Distilling Foundation Models
* FedDP: Dual Personalization in Federated Medical Image Segmentation
* Federated Adversarial Domain Hallucination for Privacy-Preserving Domain Generalization
* Federated Learning Over Images: Vertical Decompositions and Pre-Trained Backbones Are Difficult to Beat
* FedLID: Self-Supervised Federated Learning for Leveraging Limited Image Data
* FedOSS: Federated Open Set Recognition via Inter-Client Discrepancy and Collaboration
* FedPD: Federated Open Set Recognition with Parameter Disentanglement
* FedPerfix: Towards Partial Model Personalization of Vision Transformers in Federated Learning
* FedRCIL: Federated Knowledge Distillation for Representation based Contrastive Incremental Learning
* FemtoDet: An Object Detection Baseline for Energy Versus Performance Tradeoffs
* FerKD: Surgical Label Adaptation for Efficient Distillation
* Few Labels are Enough! Semi-supervised Graph Learning for Social Interaction
* Few shot font generation via transferring similarity guided global style and quantization local style
* Few-Shot Common Action Localization via Cross-Attentional Fusion of Context and Temporal Dynamics
* Few-shot Continual Infomax Learning
* Few-Shot Dataset Distillation via Translative Pre-Training
* Few-Shot Physically-Aware Articulated Mesh Generation via Hierarchical Deformation
* Few-Shot Video Classification via Representation Fusion and Promotion Learning
* FewFaceNet: A Lightweight Few-Shot Learning-based Incremental Face Authentication for Edge Cameras
* Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model
* Fine-Grained is Too Coarse: A Novel Data-Centric Approach for Efficient Scene Graph Generation
* Fine-Grained Self-Supervision for Generalizable Semantic Segmentation
* Fine-grained Unsupervised Domain Adaptation for Gait Recognition
* Fine-grained Visible Watermark Removal
* Fine-Tuned but Zero-Shot 3D Shape Sketch View Similarity and Retrieval
* FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation
* FineRecon: Depth-aware Feed-forward Network for Detailed 3D Reconstruction
* Fingerprinting Deep Image Restoration Models
* FireFly: A Synthetic Dataset for Ember Detection in Wildfire
* First Session Adaptation: A Strong Replay-Free Baseline for Class-Incremental Learning
* First Step towards Meteosat Third Generation Day-2 Precipitation Rate Product: Deep Learning for Precipitation Rate Retrieval from Geostationary Infrared Measurements, A
* First Visual Object Tracking Segmentation VOTS2023 Challenge Results, The
* FishNet: A Large-scale Dataset and Benchmark for Fish Recognition, Detection, and Functional Trait Prediction
* FIVA: Facial Image and Video Anonymization and Anonymization Defense
* Flashback for Continual Learning
* Flatness-Aware Minimization for Domain Generalization
* FLatten Transformer: Vision Transformer using Focused Linear Attention
* Flexible Visual Recognition by Evidential Modeling of Confusion and Ignorance
* FLIP: Cross-domain Face Anti-spoofing with Language Guidance
* FlipNeRF: Flipped Reflection Rays for Few-shot Novel View Synthesis
* Floor Plan Reconstruction from Sparse Views: Combining Graph Neural Network with Constrained Diffusion
* Focal Network for Image Restoration
* FocalFormer3D: Focusing on Hard Instance for 3D Object Detection
* Focus on Content not Noise: Improving Image Generation for Nuclei Segmentation by Suppressing Steganography in CycleGAN
* Focus on Your Target: A Dual Teacher-Student Framework for Domain-adaptive Semantic Segmentation
* Focus the Discrepancy: Intra- and Inter-Correlation Learning for Image Anomaly Detection
* Forecast-MAE: Self-supervised Pre-training for Motion Forecasting with Masked Autoencoders
* Foreground and Text-lines Aware Document Image Rectification
* Foreground Object Search by Distilling Composite Image Feature
* Foreground-Background Distribution Modeling Transformer for Visual Object Tracking
* Foreground-Background Separation through Concept Distillation from Generative Image Foundation Models
* Forest/Nonforest Segmentation Using Sentinel-1 and -2 Data Fusion in the Bajo Cauca Subregion in Colombia
* Forming Adversarial Example Attacks Against Deep Neural Networks With Reinforcement Learning
* Forward Flow for Novel View Synthesis of Dynamic Scenes
* FPR: False Positive Rectification for Weakly Supervised Semantic Segmentation
* Frameworks for Privacy-Preserving Federated Learning
* FRAug: Tackling Federated Learning with Non-IID Features via Representation Augmentation
* FreeCOS: Self-Supervised Learning from Fractals and Unlabeled Images for Curvilinear Object Segmentation
* FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model
* Frequency Domain Imaging Algorithms for Short-Range Synthetic Aperture Radar
* Frequency-aware GAN for Adversarial Manipulation Generation
* Frequency-Aware Self-Supervised Long-Tailed Learning
* From Chaos Comes Order: Ordering Event Representations for Object Recognition and Detection
* From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels
* From Scarcity to Understanding: Transfer Learning for the Extremely Low Resource Irish Sign Language
* From Sky to the Ground: A Large-scale Benchmark and Simple Baseline Towards Real Rain Removal
* FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models
* FS-DETR: Few-Shot DEtection TRansformer with prompting and without re-training
* FSAR: Federated Skeleton-based Action Recognition with Adaptive Topology Structure and Knowledge Distillation
* FSI: Frequency and Spatial Interactive Learning for Image Restoration in Under-Display Cameras
* FSSDD: Few-shot steel defect detection based on multi-scale semantic enhancement representation and mask category information mapping
* Full-Body Articulated Human-Object Interaction
* FULLER: Unified Multi-modality Multi-task 3D Perception via Multi-level Gradient Calibration
* Fully Attentional Networks with Self-emerging Token Labeling
* Fully Differentiable Framework for 2D/3D Registration and the Projective Spatial Transformers, A
* FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods
* Fusing VHR Post-disaster Aerial Imagery and LiDAR Data for Roof Classification in the Caribbean
* Fusion Approaches to Predict Post-stroke Aphasia Severity from Multimodal Neuroimaging Data
* fusion-attention swin transformer for cardiac MRI image segmentation, A
* Future Land Use and Flood Risk Assessment in the Guanzhong Plain, China: Scenario Analysis and the Impact of Climate Change
* Fuzzy C-means clustering algorithm based on superpixel merging and multi-feature adaptive fusion measurement
* G2L: A High-Dimensional Geometric Approach for Automatic Generation of Highly Accurate Pseudo-Labels
* G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory
* GA-GWNN: Generalized Adaptive Graph Wavelet Neural Network
* GACE: Geometry Aware Confidence Enhancement for Black-box 3D Object Detectors on LiDAR-Data
* GAFlow: Incorporating Gaussian Attention into Optical Flow
* GAIT: Generating Aesthetic Indoor Tours with Deep Reinforcement Learning
* Game of Bundle Adjustment - Learning Efficient Convergence, A
* GameFormer: Game-theoretic Modeling and Learning of Transformer-based Interactive Prediction and Planning for Autonomous Driving
* GAMNet: Global attention via multi-scale context for depth estimation algorithm and application
* GANDiffFace: Controllable Generation of Synthetic Datasets for Face Recognition with Realistic Variations
* GaPro: Box-Supervised 3D Point Cloud Instance Segmentation Using Gaussian Processes as Pseudo Labelers
* GasMono: Geometry-Aided Self-Supervised Monocular Depth Estimation for Indoor Scenes
* Gated Attention Transformer for Multi-Person Pose Tracking, A
* Gaussian Image Anomaly Detection with Greedy Eigencomponent Selection
* Gaussian Latent Representations for Uncertainty Estimation using Mahalanobis Distance in Deep Classifiers
* GECCO: Geometrically-Conditioned Point Diffusion Models
* GEDepth: Ground Embedding for Monocular Depth Estimation
* geeSEBAL-MODIS: Continental-scale evapotranspiration based on the surface energy balance for South America
* Gender Artifacts in Visual Datasets
* General Deep Learning Point-Surface Fusion Framework for RGB Image Super-Resolution, A
* General Image-to-Image Translation with One-Shot Image Guidance
* General Pairwise Modification Framework for Reversible Data Hiding in JPEG Images
* General Planar Motion from a Pair of 3D Correspondences
* Generalist Framework for Panoptic Segmentation of Images and Videos, A
* Generalizable Decision Boundaries: Dualistic Meta-Learning for Open Set Domain Generalization
* Generalizable Heterogeneous Federated Cross-Correlation and Instance Similarity Learning
* Generalizable Neural Fields as Partially Observed Neural Processes
* Generalized Differentiable RANSAC
* Generalized Few-Shot Point Cloud Segmentation Via Geometric Words
* Generalized Lightness Adaptation with Channel Selective Normalization
* Generalized Sum Pooling for Metric Learning
* Generalized Unitary Joint Diagonalization Algorithm Based on Approximate Givens Rotations
* Generalizing Event-Based Motion Deblurring in Real-World Scenarios
* Generalizing Neural Human Fitting to Unseen Poses with Articulated SE(3) Equivariance
* Generating Dynamic Kernels via Transformers for Lane Detection
* Generating Instance-level Prompts for Rehearsal-free Continual Learning
* Generating Realistic Images from In-the-wild Sounds
* Generating Synthetic Computed Tomography (CT) Images to Improve the Performance of Machine Learning Model for Pediatric Abdominal Anomaly Detection
* Generating Visual Scenes from Touch
* Generative Action Description Prompts for Skeleton-based Action Recognition
* Generative Approach for Probabilistic Human Mesh Recovery using Diffusion Models
* Generative Gradient Inversion via Over-Parameterized Networks in Federated Learning
* Generative Multiplane Neural Radiance for 3D-Aware Image Generation
* Generative Novel View Synthesis with 3D-Aware Diffusion Models
* Generative Prompt Model for Weakly Supervised Object Localization
* Generative-Based Image Fusion Strategy for Visible-Infrared Person Re-Identification, A
* Genetic Programming Guided Mapping of Forest Canopy Height by Combining LiDAR Satellites with Sentinel-1/2, Terrain, and Climate Data
* Geodesic Regression Characterizes 3D Shape Changes in the Female Brain During Menstruation
* Geometric Superpixel Representations for Efficient Image Classification with Graph Neural Networks
* Geometric Viewpoint Learning with Hyper-Rays and Harmonics Encoding
* Geometrized Transformer for Self-Supervised Homography Estimation
* Geometry-guided Feature Learning and Fusion for Indoor Scene Reconstruction
* GeoMIM: Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D Understanding
* GeoUDF: Surface Reconstruction from 3D Point Clouds via Geometry-guided Distance Representation
* GePSAn: Generative Procedure Step Anticipation in Cooking Videos
* Get the Best of Both Worlds: Improving Accuracy and Transferability by Grassmann Class Representation
* Get3DHuman: Lifting StyleGAN-Human into a 3D Generative Model using Pixel-aligned Reconstruction Priors
* GeT: Generative Target Structure Debiasing for Domain Adaptation
* GET: Group Event Transformer for Event-Based Vision
* GETAvatar: Generative Textured Meshes for Animatable Human Avatars
* GIFD: A Generative Gradient Inversion Method with Feature Domain Optimization
* GIS-Based Progress Monitoring of SDGs towards Achieving Saudi Vision 2030
* GLA-GCN: Global-local Adaptive Graph Convolutional Network for 3D Human Pose Estimation from Monocular Video
* Glacier Surface Velocity Variations in the West Kunlun Mts. with Sentinel-1A Image Feature-Tracking (2014-2023)
* Glioma detection using EHO based FLAME clustering in MR brain images
* Global Adaptation meets Local Generalization: Unsupervised Domain Adaptation for 3D Human Pose Estimation
* Global Balanced Experts for Federated Long-Tailed Learning
* Global Features are All You Need for Image Retrieval and Reranking
* Global Knowledge Calibration for Fast Open-Vocabulary Segmentation
* Global Perception Based Autoregressive Neural Processes
* Global Shapes and Salient Joints Features Learning for Skeleton-Based Action Recognition
* Global Terrestrial Evapotranspiration Estimation from Visible Infrared Imaging Radiometer Suite (VIIRS) Data
* GlobalMapper: Arbitrary-Shaped Urban Layout Generation
* Gloss-free Sign Language Translation: Improving from Visual-Language Pretraining
* GlowGAN: Unsupervised Learning of HDR Images from LDR Images in the Wild
* GlueGen: Plug and Play Multi-Modal Encoders for X-to-Image Generation
* GlueStick: Robust Image Matching by Sticking Points and Lines Together
* GNOS-II on Fengyun-3 Satellite Series: Exploration of Multi-GNSS Reflection Signals for Operational Applications
* GNSS Carrier-Phase Multipath Modeling and Correction: A Review and Prospect of Data Processing Methods
* GO-SLAM: Global Optimization for Consistent 3D Instant Reconstruction
* Going Beyond Nouns With Vision & Language Models Using Synthetic Data
* Going Denser with Open-Vocabulary Part Segmentation
* Good Fences Make Good Neighbours
* Good Student is Cooperative and Reliable: CNN-Transformer Collaborative Learning for Semantic Segmentation, A
* GPA-3D: Geometry-aware Prototype Alignment for Unsupervised Domain Adaptive 3D Object Detection from Point Clouds
* GPFL: Simultaneously Learning Global and Personalized Feature Information for Personalized Federated Learning
* GPGait: Generalized Pose-based Gait Recognition
* GPS-GLASS: Learning Nighttime Semantic Segmentation Using Daytime Video and GPS data
* Gradient Estimation for Unseen Domain Risk Minimization with Pre-Trained Models
* Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection
* Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models
* Gram-based Attentive Neural Ordinary Differential Equations Network for Video Nystagmography Classification
* GRAM-HD: 3D-Consistent Image Generation at High Resolution with Generative Radiance Manifolds
* Gramian Attention Heads are Strong yet Efficient Vision Learners
* Graph Matching with Bi-level Noisy Correspondence
* GraphAlign: Enhancing Accurate Feature Alignment by Graph matching for Multi-Modal 3D Object Detection
* GraphCFC: A Directed Graph Based Cross-Modal Feature Complementation Approach for Multimodal Conversational Emotion Recognition
* GraphEcho: Graph-Driven Unsupervised Domain Adaptation for Echocardiogram Video Segmentation
* Graphics2RAW: Mapping Computer Graphics Images to Sensor RAW Images
* Green Fingerprint of Antarctica: Drones, Hyperspectral Imaging, and Machine Learning for Moss and Lichen Classification, A
* Greening the Urban Landscape: Assessing the Impact of Tree-Planting Initiatives and Climate Influences on Miami-Dade County's Greenness
* GridMM: Grid Memory Map for Vision-and-Language Navigation
* GridPull: Towards Scalability in Learning Implicit Representations from 3D Point Clouds
* Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation
* Grounded Image Text Matching with Mismatched Relation Reasoning
* Grounding 3D Object Affordance from 2D Interactions in Images
* Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment
* Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation
* Group-Conditional Conformal Prediction via Quantile Regression Calibration for Crop and Weed Classification
* GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training
* Growing a Brain with Sparsity-Inducing Generation for Continual Learning
* Guarding the Guardians: Automated Analysis of Online Child Sexual Abuse
* Guided Motion Diffusion for Controllable Human Motion Synthesis
* Guiding image captioning models toward more specific captions
* Guiding Local Feature Matching with Surface Curvature
* Guiding Video Prediction with Explicit Procedural Knowledge
* H3WB: Human3.6M 3D WholeBody Dataset and Benchmark
* HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending
* HairNeRF: Geometry-Aware Image Synthesis for Hairstyle Transfer
* HAL3D: Hierarchical Active Learning for Fine-Grained 3D Part Labeling
* HaMuCo: Hand Pose Estimation via Multiview Collaborative Self-Supervised Learning
* Hand Gesture Authentication by Discovering Fine-Grained Spatiotemporal Identity Characteristics
* HandR2N2: Iterative 3D Hand Pose Estimation Using a Residual Recurrent Neural Network
* Handwritten and Printed Text Segmentation: A Signature Case Study
* Hard No-Box Adversarial Attack on Skeleton-Based Human Action Recognition with Skeleton-Motion-Informed Gradient
* Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis
* Harvard Glaucoma Detection and Progression: A Multimodal Multitask Dataset and Generalization-Reinforced Semi-Supervised Learning
* Hashing Neural Video Decomposition with Multiplicative Residuals in Space-Time
* Hausdorff Regression Paradigm for Interval Privacy, A
* Haystack: A Panoptic Scene Graph Dataset to Evaluate Rare Predicate Classes
* HDG-ODE: A Hierarchical Continuous-Time Model for Human Pose Forecasting
* Helping Hands: An Object-Aware Ego-Centric Video Recognition Model
* Heterogeneous Diversity Driven Active Learning for Multi-Object Tracking
* Heterogeneous Forgetting Compensation for Class-Incremental Learning
* Heterogeneous Ship Data Classification with Spatial-Channel Attention with Bilinear Pooling Network
* Hidden Biases of End-to-End Driving Models
* Hidden classification layers: Enhancing linear separability between classes in neural networks layers
* Hiding Visual Information via Obfuscating Adversarial Perturbations
* Hierarchical Contrastive Learning for Pattern-Generalizable Image Corruption Detection
* Hierarchical Generation of Human-Object Interactions with Diffusion Probabilistic Models
* Hierarchical Knowledge Guided Learning for Real-World Retinal Disease Recognition
* Hierarchical Point-Based Active Learning for Semi-Supervised Point Cloud Semantic Segmentation
* Hierarchical Prior Mining for Non-local Multi-View Stereo
* Hierarchical Spatial-Temporal Adaptive Graph Fusion for Monocular 3D Human Pose Estimation
* Hierarchical Spatio-Temporal Representation Learning for Gait Recognition
* Hierarchical Spatiotemporal Transformers for Video Object Segmentation
* Hierarchical Visual Categories Modeling: A Joint Representation Learning and Density Estimation Framework for Out-of-Distribution Detection
* Hierarchical Visual Primitive Experts for Compositional Zero-Shot Learning
* Hierarchical-Instance Contrastive Learning for Minority Detection on Imbalanced Medical Datasets
* Hierarchically Decomposed Graph Convolutional Networks for Skeleton-Based Action Recognition
* HiFace: High-Fidelity 3D Face Reconstruction by Learning Static and Dynamic Details
* High performance image steganography integrating IWT and Hamming code within secret sharing
* High Quality Entity Segmentation
* High-Accuracy Mapping of Soil Parent Material Types in Hilly Areas at the County Scale Using Machine Learning Algorithms
* High-Frequency Microbarograph-Observed Pressure Variations Associated with Gust Fronts during an Extreme Rainfall Event
* High-Precision Baseline Calibration Method Based on Estimation of Azimuth Fringe Frequency with THz Interferometry SAR, A
* High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net
* High-Resolution Feature Pyramid Network for Small Object Detection on Drone View
* High-Resolution Mapping of Mangrove Species Height in Fujian Zhangjiangkou National Mangrove Nature Reserve Combined GF-2, GF-3, and UAV-LiDAR
* High-Temporal-Resolution Rock Slope Monitoring Using Terrestrial Structure-from-Motion Photogrammetry in an Application with Spatial Resolution Limitations
* Higher UHI Intensity, Higher Urban Temperature? A Synthetical Analysis of Urban Heat Environment in Urban Megaregion
* HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic Scene Graph Generation
* HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training
* HiVLP: Hierarchical Interactive Video-Language Pre-Training
* HM-ViT: Hetero-modal Vehicle-to-Vehicle Cooperative Perception with Vision Transformer
* HMARNET: A Hierarchical Multi-Attention Residual Network for Gleason scoring of prostate cancer
* HMD-NeMo: Online 3D Avatar Motion Generation From Sparse Observations
* Holistic Geometric Feature Learning for Structured Reconstruction
* Holistic Label Correction for Noisy Multi-Label Classification
* HollowNeRF: Pruning Hashgrid-Based NeRFs with Trainable Collision Mitigation
* HoloAssist: an Egocentric Human Interaction Dataset for Interactive AI Assistants in the Real World
* Homeomorphism Alignment for Unsupervised Domain Adaptation
* Homography Guided Temporal Fusion for Road Line and Marking Segmentation
* HopFIR: Hop-wise GraphFormer with Intragroup Joint Refinement for 3D Human Pose Estimation
* Horse with no Labels: Self-Supervised Horse Pose Estimation from Unlabelled Images and Synthetic Prior, A
* HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video
* Householder Projector for Unsupervised Latent Semantics Discovery
* How Do Driving Factors Affect Vegetation Coverage Change in the Shaanxi Region of the Qinling Mountains?
* How Far Pre-trained Models Are from Neural Collapse on the Target Dataset Informs their Transferability
* How Much Temporal Long-Term Context is Needed for Action Segmentation?
* How to Boost Face Recognition with StyleGAN?
* How to choose your best allies for a transferable attack?
* HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models
* HSE: Hybrid Species Embedding for Deep Metric Learning
* HSR-Diff: Hyperspectral Image Super-Resolution via Conditional Diffusion Models
* HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation
* Human from Blur: Human Pose Tracking from Blurry Images
* Human Gait Recognition Based on Frontal-View Sequences Using Gait Dynamics and Deep Learning
* Human Part-wise 3D Motion Context Learning for Sign Language Recognition
* Human Preference Score: Better Aligning Text-to-image Models with Human Preference
* Human-Centric Scene Understanding for 3D Large-Scale Scenarios
* Human-Inspired Facial Sketch Synthesis with Dynamic Adaptation
* HumanMAC: Masked Motion Completion for Human Motion Prediction
* Humans in 4D: Reconstructing and Tracking Humans with Transformers
* HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation
* Hybrid attention mechanism of feature fusion for medical image segmentation
* Hybrid Cross-Transformer-KPConv for Point Cloud Segmentation
* Hybrid Spectral Denoising Transformer with Guided Attention
* Hybrid Visual Transformer for Efficient Deep Human Activity Recognition, A
* Hybrid-NET: A fusion of DenseNet169 and advanced machine learning classifiers for enhanced brain tumor diagnosis
* Hybrid-Scale Feature Enhancement Network for Hyperspectral Image Classification, A
* HybridAugment++: Unified Frequency Spectra Perturbations for Model Robustness
* HybridVPS: Hybrid-Supervised Video Polyp Segmentation Under Low-Cost Labels
* Hydrological Response Assessment of Land Cover Change in a Peruvian Amazonian Basin Impacted by Deforestation Using the SWAT Model
* Hyperbolic Audio-visual Zero-shot Learning
* Hyperbolic Chamfer Distance for Point Cloud Completion
* Hyperbolic prototypical network for few shot remote sensing scene classification
* HyperCoil-Recon: A Hypernetwork-based Adaptive Coil Configuration Task Switching Network for MRI Reconstruction
* HyperDiffusion: Generating Implicit Neural Fields with Weight-Space Diffusion
* HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces
* HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization
* Hyperspectral Image Classification Using Spectral-Spatial Double-Branch Attention Mechanism
* Hyperspectral Imaging of In-Site Stained Glasses: Illumination Variation Compensation Using Two Perpendicular Scans
* I can't believe there's no images!: Learning Visual Tasks Using Only Language Supervision
* I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
* ICD-Face: Intra-class Compactness Distillation for Face Recognition
* ICE-NeRF: Interactive Color Editing of NeRFs via Decomposition-Aware Weight Optimization
* ICICLE: Interpretable Class Incremental Continual Learning
* ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction
* iDAG: Invariant DAG Searching for Domain Generalization
* Identification of Novel Classes for Improving Few-Shot Object Detection
* Identification of Systematic Errors of Image Classifiers on Rare Subgroups
* Identifying Crop Growth Stages from Solar-Induced Chlorophyll Fluorescence Data in Maize and Winter Wheat from Ground and Satellite Measurements
* Identifying Old-Growth Forests in Complex Landscapes: A New LiDAR-Based Estimation Framework and Conservation Implications
* Identifying Out-of-Domain Objects with Dirichlet Deep Neural Networks
* Identifying Systematic Errors in Object Detectors with the SCROD Pipeline
* Identity-Consistent Aggregation for Video Object Detection
* Identity-Seeking Self-Supervised Representation Learning for Generalizable Person Re-identification
* IDiff-Face: Synthetic-based Face Recognition through Fizzy Identity-Conditioned Diffusion Models
* IDTransformer: Transformer for Intrinsic Image Decomposition
* IFPNet: Integrated Feature Pyramid Network with Fusion Factor for Lane Detection
* IHNet: Iterative Hierarchical Network Guided by High-Resolution Estimated Information for Scene Flow Estimation
* IIEU: Rethinking Neural Feature Activation from Decision-Making
* ILSH: The Imperial Light-Stage Head Dataset for Human Head View Synthesis
* Image Captioning With Controllable and Adaptive Length Levels
* Image Data and Backbone in Weakly Supervised Fine-Grained Visual Categorization: A Revisit and Further Thinking, The
* Image Guided Inpainting with Parameter Efficient Learning
* Image Restoration via Frequency Selection
* Image-free Classifier Injection for Zero-Shot Classification
* ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition
* ImGeoNet: Image-induced Geometry-aware Voxel Representation for Multi-view 3D Object Detection
* Imitator: Personalized Speech-driven 3D Facial Animation
* Impacts of Marine Heatwave Events on Three Distinct Upwelling Systems and Their Implications for Marine Ecosystems in the Northwestern South China Sea
* Implementing the European Space Agency's SentiNel Application Platform's Open-Source Python Module for Differential Synthetic Aperture Radar Interferometry Coseismic Ground Deformation from Sentinel-1 Data
* Implicit Autoencoder for Point-Cloud Self-Supervised Representation Learning
* Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation
* Implicit Neural Representation for Cooperative Low-light Image Enhancement
* Implicit Neural Representation in Medical Imaging: A Comparative Survey
* Implicit Temporal Modeling with Learnable Alignment for Video Recognition
* Importance Weighted Structure Learning for Scene Graph Generation
* Improved Generalization of White Blood Cell Classification by Learnable Illumination Intensity Invariant Layer
* Improved Head and Data Augmentation to Reduce Artifacts at Grid Boundaries in Object Detection
* Improved Hybrid Beamforming Algorithm for Fast Target Tracking in Satellite and V2X Communication, An
* Improved Knowledge Transfer for Semi-supervised Domain Adaptation via Trico Training Strategy
* Improved Modeling of Gross Primary Production and Transpiration of Sugarcane Plantations with Time-Series Landsat and Sentinel-2 Images
* Improved Visual Fine-tuning with Natural Language Supervision
* Improving 3D Imaging with Pre-Trained Perpendicular 2D Diffusion Models
* Improving Adversarial Robustness of Masked Autoencoders via Test-time Frequency-domain Prompting
* Improving Automatic Endoscopic Stone Recognition Using a Multi-view Fusion Approach Enhanced with Two-Step Transfer Learning
* Improving CLIP Fine-tuning Performance
* Improving Colored Dissolved Organic Matter (CDOM) Retrievals by Sentinel2-MSI Data through a Total Suspended Matter (TSM)-Driven Classification: The Case of Pertusillo Lake (Southern Ital
* Improving Continuous Sign Language Recognition with Cross-Lingual Signs
* Improving Deep Learning on Hyperspectral Images of Grain by Incorporating Domain Knowledge from Chemometrics
* Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations
* Improving Equivariance in State-of-the-Art Supervised Depth and Normal Predictors
* Improving Generalization in Visual Reinforcement Learning via Conflict-aware Gradient Agreement Augmentation
* Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning
* Improving Geological Remote Sensing Interpretation via Optimal Transport-Based Point-Surface Data Fusion
* Improving Handwritten Mathematical Expression Recognition via Similar Symbol Distinguishing
* Improving Land Use and Land Cover Information of Wunbaik Mangrove Area in Myanmar Using U-Net Model with Multisource Remote Sensing Datasets
* Improving Landslide Prediction: Innovative Modeling and Evaluation of Landslide Scenario with Knowledge Graph Embedding
* Improving Lens Flare Removal with General-Purpose Pipeline and Multiple Light Sources Recovery
* Improving Online Lane Graph Extraction by Object-Lane Clustering
* Improving Out-of-Distribution Generalization in SAR Image Scene Classification with Limited Training Samples
* Improving Pixel-based MIM by Reducing Wasted Modeling Capability
* Improving Predictions of Tibetan Plateau Summer Precipitation Using a Sea Surface Temperature Analog-Based Correction Method
* Improving Radar Reflectivity Reconstruction with Himawari-9 and UNet++ for Off-Shore Weather Monitoring
* Improving Replay Sample Selection and Storage for Less Forgetting in Continual Learning
* Improving Representation Learning for Histopathologic Images with Cluster Constraints
* Improving Sample Quality of Diffusion Models Using Self-Attention Guidance
* Improving Transformer-based Image Matching by Cascaded Capturing Spatially Informative Keypoints
* In silico, in vitro, and in vivo validation of a microwave imaging system using a low-profile Ultra Wide Band Archimedean spiral antenna to detect skin cancer
* In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval
* Inclination Trend of the Agulhas Return Current Path in Three Decades
* Incorporating Bioclimatic Zones into Informing Ecological Networks for Better Biodiversity Conservation
* Incremental Generalized Category Discovery
* Indoor Depth Recovery Based on Deep Unfolding with Non-Local Prior
* Inducing Neural Collapse to a Fixed Hierarchy-Aware Frame for Reducing Mistake Severity
* Inductive Conformal Prediction for Harvest-Readiness Classification of Cauliflower Plants: A Comparative Study of Uncertainty Quantification Methods
* Inequality-Constrained 3D Morphable Face Model Fitting
* InfiniCity: Infinite-Scale City Synthesis
* influence of BRDF effects and representativeness of training data on tree species classification using multi-flightline airborne hyperspectral imagery, The
* Influence of South-to-North Water Diversion on Land Subsidence in North China Plain Revealed by Using Geodetic Measurements
* Information Extraction Method for Industrial and Commercial Rooftop Photovoltaics Based on GaoFen-7 Remote Sensing Images, An
* Informative Data Mining for One-shot Cross-Domain Semantic Segmentation
* Infrared Dim and Small Target Detection Based on Superpixel Segmentation and Spatiotemporal Cluster 4D Fully-Connected Tensor Network Decomposition
* Infrared Small Dim Target Detection Using Group Regularized Principle Component Pursuit
* InFusion: Inject and Attention Fusion for Multi Concept Zero-Shot Text-based Video Editing
* Inherent Redundancy in Spiking Neural Networks
* Innovating Real Fisheye Image Correction with Dual Diffusion Architecture
* InSAR Monitoring Using Persistent Scatterer Interferometry (PSI) and Small Baseline Subset (SBAS) Techniques for Ground Deformation Measurement in Metropolitan Area of Concepcion, Chile
* Inspecting the Geographical Representativeness of Images from Text-to-Image Models
* INSTA-BNN: Binary Neural Network with INSTAnce-aware Threshold
* Instance and Category Supervision are Alternate Learners for Continual Learning
* Instance Neural Radiance Field
* Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models
* Instant Continual Learning of Neural Radiance Fields
* InstaTune: Instantaneous Neural Architecture Search During Fine-Tuning
* Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions
* INT2: Interactive Trajectory Prediction at Intersections
* Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection
* Integrating Boxes and Masks: A Multi-Object Framework for Unified Visual Tracking and Segmentation
* Integrating GRACE/GRACE Follow-On and Wells Data to Detect Groundwater Storage Recovery at a Small-Scale in Beijing Using Deep Learning
* Integrating Language Guidance Into Image-Text Matching for Correcting False Negatives
* Intelligent Environment-Adaptive GNSS/INS Integrated Positioning with Factor Graph Optimization
* IntentQA: Context-aware Video Intent Reasoning
* Inter-Comparison of Satellite-Based Sea Ice Concentration in the Amundsen Sea, Antarctica
* Inter-Frame Compression for Dynamic Point Cloud Geometry Coding
* Inter-Realization Channels: Unsupervised Anomaly Detection Beyond One-Class Classification
* Interaction acceptance modelling and estimation for a proactive engagement in the context of human-robot interactions
* Interaction-aware Joint Attention Estimation Using People Attributes
* Interaction-Aware Prompting for Zero-Shot Spatio-Temporal Action Detection
* Interactive Class-Agnostic Object Counting
* Interactive Image Segmentation with Cross-Modality Vision Transformers
* interactive method for adaptive acquisition in Reflectance Transformation Imaging for cultural heritage, An
* Interannual Variability of Salinity in the Chukchi Sea and Its Relationships with the Dynamics of the East Siberian Current during 1993-2020
* InterAug: A Tuning-Free Augmentation Policy for Data-Efficient and Robust Object Detection
* InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion
* InterFormer: Real-time Interactive Image Segmentation
* Internal Tree Trunk Decay Detection Using Close-Range Remote Sensing Data and the PointNet Deep Learning Method
* Interpretable and Accurate Deep-Learning Diagnosis Framework Modeled With Fully and Semi-Supervised Reciprocal Learning, An
* Interpretable Framework to Characterize Compound Treatments on Filamentous Fungi using Cell Painting and Deep Metric Learning, An
* Interpretable-through-prototypes deepfake detection for diffusion models
* Interstation Undifferenced Real-Time Time Transfer Method with Refined Modeling of Receiver Clock, An
* Intra voxel analysis in magnetic resonance imaging via deep learning
* Intrinsic Appearance Decomposition Using Point Cloud Representation
* IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis
* Introducing Language Guidance in Prompt-based Continual Learning
* Introspection of 2D Object Detection using Processed Neural Activation Patterns in Automated Driving Systems
* Invariant Feature Regularization for Fair Face Recognition
* Invariant Training 2D-3D Joint Hard Samples for Few-Shot Point Cloud Recognition
* Inverse Compositional Learning for Weakly-supervised Relation Grounding
* Inverse problem regularization with hierarchical variational autoencoders
* Investigating the Spatial, Proximity, and Multiscale Effects of Influencing Factors in the Snowmelt Process in the Manas River Basin Using a Novel Zonal Spatial Panel Model
* Investigation of the Source of Iceland Basin Freshening: Virtual Particle Tracking with Satellite-Derived Geostrophic Surface Velocities
* Involving Distinguished Temporal Graph Convolutional Networks for Skeleton-Based Temporal Action Segmentation
* IOMatch: Simplifying Open-Set Semi-Supervised Learning with Joint Inliers and Outliers Utilization
* Ionospheric Plasma Perturbations before a Sequence of Strong Earthquakes in Southeast Asia and Northern Oceania in 2018, The
* IPCert: Provably Robust Intellectual Property Protection for Machine Learning
* Is context all you need? Scaling Neural Sign Language Translation to Large Domains of Discourse
* Is Imitation All You Need? Generalized Decision-Making with Dual-Phase Training
* Is there progress in activity progress prediction?
* Isomer: Isomerous Transformer for Zero-shot Video Object Segmentation
* IST-Net: Prior-free Category-level Pose Estimation with Implicit Space Transformation
* Iterative Denoiser and Noise Estimator for Self-Supervised Image Denoising
* Iterative Prompt Learning for Unsupervised Backlit Image Enhancement
* Iterative Robust Visual Grounding with Masked Reference based Centerpoint Supervision
* Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution
* Iterative Superquadric Recomposition of 3D Objects from Multiple Views
* IterDepth: Iterative Residual Refinement for Outdoor Self-Supervised Multi-Frame Monocular Depth Estimation
* ITI-Gen: Inclusive Text-to-Image Generation
* iVS-Net: Learning Human View Synthesis from Internet Videos
* JEDI: Joint Expert Distillation in a Semi-Supervised Multi-Dataset Student-Teacher Scenario for Video Action Recognition
* Joint Cross-Attention Network With Deep Modality Prior for Fast MRI Reconstruction
* Joint Demosaicing and Deghosting of Time-Varying Exposures for Single-Shot HDR Imaging
* Joint Implicit Neural Representation for High-fidelity and Compact Vector Fonts
* Joint Metrics Matter: A Better Standard for Trajectory Forecasting
* Joint-Guided Distillation Binary Neural Network via Dynamic Channel-Wise Diversity Enhancement for Object Detection
* Joint-Relation Transformer for Multi-Person Motion Prediction
* JOTR: 3D Joint Contrastive Learning with Transformers for Occluded Human Mesh Recovery
* Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers
* Just Ask Plus: Using Transcripts for VideoQA
* Kecor: Kernel Coding Rate Maximization for Active 3D Object Detection
* Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?
* Kick Back & Relax: Learning to Reconstruct the World by Watching SlowTV
* Kinematic and Microphysical Characteristics of Extremely Heavy Rainfall in Zhengzhou City on 20 July 2021 Observed with Dual-Polarization Radars and Disdrometers, The
* Kinship Representation Learning with Face Componential Relation
* Knowing Where to Focus: Event-aware Transformer for Video Grounding
* Knowledge Evolution Learning: A Cost-Free Weakly Supervised Semantic Segmentation Framework for High-Resolution Land Cover Classification
* Knowledge Informed Sequential Scene Graph Verification Using VQA
* Knowledge Proxy Intervention for Deconfounded Video Question Answering
* Knowledge Restore and Transfer for Multi-Label Class-Incremental Learning
* Knowledge-Aware Federated Active Learning with Non-IID Data
* Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models
* Knowledge-Spreader: Learning Semi-Supervised Facial Action Dynamics by Consistifying Knowledge Granularity
* L-DAWA: Layer-wise Divergence Aware Weight Aggregation in Federated Self-Supervised Visual Representation Learning
* LA-Net: Landmark-Aware Learning for Reliable Facial Expression Recognition under Label Noise
* Label Semantic Knowledge Distillation for Unbiased Scene Graph Generation
* Label Shift Adapter for Test-Time Adaptation under Covariate and Label Shifts
* Label-Efficient Online Continual Object Detection in Streaming Video
* Label-Free Event-based Object Recognition via Joint Learning with Image Reconstruction from Events
* Label-Guided Knowledge Distillation for Continual Semantic Segmentation on 2D Images and 3D Point Clouds
* Label-Noise Learning with Intrinsically Long-Tailed Data
* LAC: Latent Action Composition for Skeleton-based Action Segmentation
* LAN-HDR: Luminance-based Alignment Network for High Dynamic Range Video Reconstruction
* Land-Use Optimization Based on Ecological Security Pattern: A Case Study of Baicheng, Northeast China
* Landscape Learning for Neural Network Inversion
* Landslide Hazard Assessment Combined with InSAR Deformation: A Case Study in the Zagunao River Basin, Sichuan Province, Southwestern China
* Language-enhanced RNR-Map: Querying Renderable Neural Radiance Field maps with natural language
* LaPE: Layer-adaptive Position Embedding for Vision Transformers with Independent Layer Normalization
* Large Selective Kernel Network for Remote Sensing Object Detection
* Large-Scale Land Cover Mapping with Fine-Grained Classes via Class-Aware Semi-Supervised Semantic Segmentation
* Large-Scale Outdoor Multi-modal Dataset and Benchmark for Novel View Synthesis and Implicit Scene Reconstruction, A
* Large-Scale Person Detection and Localization using Overhead Fisheye Cameras
* Large-scale Study of Spatiotemporal Representation Learning with a New Benchmark on Action Recognition, A
* LaRS: A Diverse Panoptic Maritime Obstacle Detection Dataset and Benchmark
* Late Stopping: Avoiding Confidently Learning from Mislabeled Examples
* Latent Space of Stochastic Diffusion Models for Zero-Shot Image Editing and Guidance, A
* Latent-OFER: Detect, Mask, and Reconstruct with Latent Vectors for Occluded Facial Expression Recognition
* LatentSwap3D: Semantic Edits on 3D Image GANs
* LATR: 3D Lane Detection from Monocular Images with Transformer
* LAW-Diffusion: Complex Scene Generation by Diffusion with Layouts
* LayerNet: High-Resolution Semantic 3D Reconstruction of Clothed People
* LayoutDiffusion: Improving Graphic Layout Generation by Discrete Diffusion Probabilistic Models
* LCNME: Label Correction Using Network Prediction Based on Memorization Effects for Cross-Modal Retrieval With Noisy Labels
* LD-ZNet: A Latent Diffusion Approach for Text-Based Image Segmentation
* LDL: Line Distance Functions for Panoramic Localization
* LDP-Feat: Image Features with Local Differential Privacy
* LEA2: A Lightweight Ensemble Adversarial Attack via Non-overlapping Vulnerable Frequency Regions
* LeaF: Learning Frames for 4D Point Cloud Sequence Understanding
* Leaping Into Memories: Space-Time Deep Feature Synthesis
* Learn TAROT with MENTOR: A Meta-Learned Self-supervised Approach for Trajectory Prediction
* Learned Compressive Representations for Single-Photon 3D Imaging
* Learned Image Reasoning Prior Penetrates Deep Unfolding Network for Panchromatic and Multi-Spectral Image Fusion
* Learning a More Continuous Zero Level Set in Unsigned Distance Fields through Level Set Projection
* Learning A Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with Occupancy Aids Scene Representation
* Learning Adaptive Neighborhoods for Graph Neural Networks
* Learning Clothing and Pose Invariant 3D Shape Representation for Long-Term Person Re-Identification
* Learning Concise and Descriptive Attributes for Visual Recognition
* Learning Concordant Attention via Target-aware Alignment for Visible-Infrared Person Re-identification
* Learning Continuous Exposure Value Representations for Single-Image HDR Reconstruction
* Learning Correction Filter via Degradation-Adaptive Regression for Blind Single Image Super-Resolution
* Learning Cross-Modal Affinity for Referring Video Object Segmentation Targeting Limited Samples
* Learning Cross-Representation Affinity Consistency for Sparsely Supervised Biomedical Instance Segmentation
* Learning Data-Driven Vector-Quantized Degradation Model for Animation Video Super-Resolution
* Learning Deep Scene Curve for Fast and Robust Underwater Image Enhancement
* Learning Depth Estimation for Transparent and Mirror Surfaces
* Learning Diverse Tone Styles for Image Retouching
* Learning Fine-Grained Features for Pixel-wise Video Correspondences
* Learning Foresightful Dense Visual Affordance for Deformable Object Manipulation
* Learning From Coding Features: High Efficiency Rate Control for AOMedia Video 1
* Learning from Noisy Pseudo Labels for Semi-Supervised Temporal Action Localization
* Learning from Semantic Alignment between Unpaired Multiviews for Egocentric Video Recognition
* Learning From Teacher's Failure: A Reflective Learning Paradigm for Knowledge Distillation
* Learning Gabor Texture Features for Fine-Grained Recognition
* Learning Global Evapotranspiration Dataset Corrections from a Water Cycle Closure Supervision
* Learning Global-aware Kernel for Image Harmonization
* Learning Hierarchical Features with Joint Latent Space Energy-Based Prior
* Learning Hierarchical Modular Networks for Video Captioning
* Learning Human Dynamics in Autonomous Driving Scenarios
* Learning Human-Human Interactions in Images from Weak Textual Supervision
* Learning Image Harmonization in the Linear Color Space
* Learning Image-Adaptive Codebooks for Class-Agnostic Image Restoration
* Learning in Imperfect Environment: Multi-Label Classification with Long-Tailed Distribution and Partial Labels
* Learning Interpretable Forensic Representations via Local Window Modulation
* Learning Interpretable Rules for Scalable Data Representation and Classification
* Learning Long-range Information with Dual-Scale Transformers for Indoor Scene Completion
* Learning Navigational Visual Representations with Semantic Map Supervision
* Learning Neural Eigenfunctions for Unsupervised Semantic Segmentation
* Learning Neural Implicit Surfaces with Object-Aware Radiance Fields
* Learning Non-Local Spatial-Angular Correlation for Light Field Image Super-Resolution
* Learning Optical Flow from Event Camera with Rendered Dataset
* Learning Point Cloud Completion without Complete Point Clouds: A Pose-Aware Approach
* Learning Proposal-Aware Re-Ranking for Weakly-Supervised Temporal Action Localization
* Learning Pseudo-Relations for Cross-domain Semantic Segmentation
* Learning Rain Location Prior for Nighttime Deraining
* Learning Robust Representations with Information Bottleneck and Memory Network for RGB-D-based Gesture Recognition
* Learning Semi-supervised Gaussian Mixture Models for Generalized Category Discovery
* Learning Shape Primitives via Implicit Convexity Regularization
* Learning Spatial-Context-Aware Global Visual Feature Representation for Instance Image Retrieval
* Learning Support and Trivial Prototypes for Interpretable Image Classification
* Learning Symmetry-Aware Geometry Correspondences for 6D Object Pose Estimation
* Learning to Distill Global Representation for Sparse-View CT
* Learning to Generate Semantic Layouts for Higher Text-Image Correspondence in Text-to-Image Synthesis
* Learning to Ground Instructional Articles in Videos through Narrations
* Learning to Identify Critical States for Reinforcement Learning from Videos
* Learning to Learn: How to Continuously Teach Humans and Machines
* Learning to Prompt CLIP for Monocular Depth Estimation: Exploring the Limits of Human Language
* Learning to rank approach for refining image retrieval in visual arts
* Learning to Transform for Generalizable Instance-wise Invariance
* Learning to Upsample by Learning to Sample
* Learning Trajectory-Word Alignments for Video-Language Tasks
* Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis
* Learning Universal Semantic Correspondences with No Supervision and Automatic Data Curation
* Learning Versatile 3D Shape Generation with Improved Auto-regressive Models
* Learning Vision-and-Language Navigation from YouTube Videos
* Learning with Diversity: Self-Expanded Equalization for Better Generalized Deep Metric Learning
* Learning with Noisy Data for Semi-Supervised 3D Object Detection
* Learnt Contrastive Concept Embeddings for Sign Recognition
* Lecture Presentations Multimodal Dataset: Towards Understanding Multimodality in Educational Videos
* LEMMS: Label Estimation of Multi-feature Movie Segments
* Lens Parameter Estimation for Realistic Depth of Field Modeling
* LERF: Language Embedded Radiance Fields
* Less is More: Focus Attention for Efficient DETR
* Leveraging Classic Deconvolution and Feature Extraction in Zero-Shot Image Restoration
* Leveraging Inpainting for Single-Image Shadow Removal
* Leveraging Intrinsic Properties for Non-Rigid Garment Alignment
* Leveraging SE(3) Equivariance for Learning 3D Geometric Shape Assembly
* Leveraging Spatio-Temporal Dependency for Skeleton-Based Action Recognition
* Leveraging Visual Attention for out-of-distribution Detection
* LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval
* LFS-GAN: Lifelong Few-Shot Image Generation
* LiDAR-Camera Panoptic Segmentation via Geometry-Consistent and Semantic-Aware Alignment
* LiDAR-UDA: Self-ensembling Through Time for Unsupervised LiDAR Domain Adaptation
* LightDepth: Single-View Depth Self-Supervision from Illumination Decline
* LightGlue: Local Feature Matching at Light Speed
* Lighting Every Darkness in Two Pairs: A Calibration-Free Pipeline for RAW Denoising
* Lighting up NeRF via Unsupervised Decomposition and Enhancement
* LightNet: Generative Model for Enhancement of Low-Light Images
* Lightweight and Fast Low-Light Image Enhancement Method Based on PoolFormer
* Lightweight CNN-Conformer Model for Automatic Speaker Verification, A
* Lightweight Image Super-Resolution with Superpixel Token Interaction
* Lightweight Man-Overboard Detection and Tracking Model Using Aerial Images for Maritime Search and Rescue, A
* Lightweight Neural Network for Enhancing Imaging Performance of Under-Display Camera
* Lightweight Skeleton-Based 3D-CNN for Real-Time Fall Detection and Action Recognition, A
* Lightweight Vision Transformer with Spatial and Channel Enhanced Self-Attention
* LIMITR: Leveraging Local Information for Medical Image-Text Representation
* Linear Frequency Modulation and Orthogonal Code Modulation for Co-Located Multiple-Input Multiple-Output High-Frequency Surface Wave Radar
* Linear Spaces of Meanings: Compositional Structures in Vision-Language Models
* Linear-Covariance Loss for End-to-End Learning of 6D Pose Estimation
* Links between Land Cover and In-Water Optical Properties in Four Optically Contrasting Swedish Bays
* Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
* Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping
* LIST: Learning Implicitly from Spatial Transformers for Single-View 3D Reconstruction
* LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition
* LiveHand: Real-time and Photorealistic Neural Hand Rendering
* LivelySpeaker: Towards Semantic-Aware Co-Speech Gesture Generation
* LivePose: Online 3D Reconstruction from Monocular Video with Dynamic Camera Poses
* LLaViLo: Boosting Video Moment Retrieval via Adapter-Based Multimodal Modeling
* LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models
* LMR: A Large-Scale Multi-Reference Dataset for Reference-based Super-Resolution
* LNPL-MIL: Learning from Noisy Pseudo Labels for Promoting Multiple Instance Learning in Whole Slide Image
* Local and Global Logit Adjustments for Long-Tailed Learning
* Local Context-Aware Active Domain Adaptation
* Local or Global: Selective Knowledge Assimilation for Federated Learning with Limited Labels
* Localizing Moments in Long Video Via Multimodal Guidance
* Localizing Object-level Shape Variations with Text-to-Image Diffusion Models
* Locally Stylized Neural Radiance Fields
* Locating Noise is Halfway Denoising for Semi-Supervised Segmentation
* Locating X-Ray Coronary Angiogram Keyframes via Long Short-Term Spatiotemporal Attention With Image-to-Patch Contrastive Learning
* Locomotion-Action-Manipulation: Synthesizing Human-Scene Interactions in Complex 3D Environments
* LoCUS: Learning Multiscale 3D-consistent Features from Posed Images
* Logic-induced Diagnostic Reasoning for Semi-supervised Semantic Segmentation
* LogicSeg: Parsing Visual Semantics with Neural Logic Learning and Reasoning
* LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models
* LoLep: Single-View View Synthesis with Locally-Learned Planes and Self-Attention Occlusion Inference
* Long-Range Grouping Transformer for Multi-View 3D Reconstruction
* Long-range Multimodal Pretraining for Movie Understanding
* Long-Term Monitoring of Inland Water Quality Parameters Using Landsat Time-Series and Back-Propagated ANN: Assessment and Usability in a Real-Case Scenario
* Long-Term Photometric Consistent Novel View Synthesis with Diffusion Models
* Look at the Neighbor: Distortion-aware Unsupervised Domain Adaptation for Panoramic Semantic Segmentation
* Looking at Words and Points with Attention: A Benchmark for Text-to-Shape Coherence
* Looking through the past: better knowledge retention for generative replay in continual learning
* Lossy and Lossless (L2) Post-training Model Size Compression
* LoTE-Animal: A Long Time-span Dataset for Endangered Animal Behavior Understanding
* Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network
* Low-Light Image Enhancement with Multi-stage Residue Quantization and Brightness-aware Attention
* Low-Shot Object Counting Network With Iterative Prototype Adaptation, A
* Low-Tech and Low-Cost System for High-Resolution Underwater RTK Photogrammetry in Coastal Shallow Waters
* LPFF: A Portrait Dataset for Face Generators Across Large Poses
* LRRU: Long-short Range Recurrent Updating Networks for Depth Completion
* LSTM-based deep learning framework for adaptive identifying eco-driving on intelligent vehicle multivariate time-series data
* LU-NeRF: Scene and Pose Estimation by Synchronizing Local Unposed NeRFs
* Luminance-aware Color Transform for Multiple Exposure Correction
* LViT: Language Meets Vision Transformer in Medical Image Segmentation
* LVOS: A Benchmark for Long-term Video Object Segmentation
* M2C: Concise Music Representation for 3D Dance Generation
* M2T: Masking Transformers Twice for Faster Decoding
* MAAL: Multimodality-Aware Autoencoder-based Affordance Learning for 3D Articulated Objects
* MacD-Net: An automatic guided-ensemble approach for macular pathology detection using optical coherence tomography images
* Machine Learning in the Hyperspectral Classification of Glycaspis brimblecombei (Hemiptera Psyllidae) Attack Severity in Eucalyptus
* Machine Learning-Based Estimation of Tropical Cyclone Intensity from Advanced Technology Microwave Sounder Using a U-Net Algorithm
* Machine-Learning-Based Study on All-Day Cloud Classification Using Himawari-8 Infrared Data, A
* MAGI: Multi-Annotated Explanation-Guided Learning
* MagicFusion: Boosting Text-to-Image Generation Performance by Fusing Diffusion Models
* Make Encoder Great Again in 3D GAN Inversion through Geometry and Occlusion-Aware Encoding
* Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation
* Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior
* Making and Breaking of Camouflage, The
* MAMMOS: MApping Multiple human MOtion with Scene understanding and natural interactions
* MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation
* Manifestation of Internal Waves in the Structure of an Artificial Slick Band
* Manipulate by Seeing: Creating Manipulation Controllers from Pre-Trained Representations
* MAP: Towards Balanced Generalization of IID and OOD through Model-Agnostic Adapters
* MAPConNet: Self-supervised 3D Pose Transfer with Mesh and Point Contrastive Learning
* MapFormer: Boosting Change Detection by Using Pre-change Information
* Mapping Changes in Fractional Vegetation Cover on the Namib Gravel Plains with Satellite-Retrieved Land Surface Emissivity Data
* Mapping Memes to Words for Multimodal Hateful Meme Classification
* Mapping Multi-Depth Soil Salinity Using Remote Sensing-Enabled Machine Learning in the Yellow River Delta, China
* Mapping Multi-Modal Brain Connectome for Brain Disorder Diagnosis via Cross-Modal Mutual Learning
* Mapping Small-Scale Irrigation Areas Using Expert Decision Rules and the Random Forest Classifier in Northern Ethiopia
* MapPrior: Bird's-Eye View Map Layout Estimation with Generative Models
* MAR: Masked Autoencoders for Efficient Action Recognition
* March in Chat: Interactive Prompting for Remote Embodied Referring Expression
* Margin Contrastive Learning with Learnable-Vector for Continual Learning
* Markov Game Video Augmentation for Action Segmentation
* MARL: Multi-scale Archetype Representation Learning for Urban Building Energy Modeling
* MARS: Model-agnostic Biased Object Removal without Additional Supervision for Weakly-Supervised Semantic Segmentation
* MAS: Towards Resource-Efficient Federated Multiple-Task Learning
* MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
* Mask-Attention-Free Transformer for 3D Instance Segmentation
* Masked Autoencoders are Efficient Class Incremental Learners
* Masked Autoencoders Are Stronger Knowledge Distillers
* Masked Conditional Variational Autoencoders for Chromosome Straightening
* Masked Diffusion Transformer is a Strong Image Synthesizer
* Masked Motion Predictors are Strong 3D Action Representation Learners
* Masked Retraining Teacher-Student Framework for Domain Adaptive Object Detection
* Masked Spatio-Temporal Structure Prediction for Self-supervised Learning on Point Cloud Videos
* Masked Spiking Transformer
* Masking Strategies for Background Bias Removal in Computer Vision Models
* MasQCLIP for Open-Vocabulary Universal Image Segmentation
* Mastering Spatial Graph Prediction of Road Networks
* MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge
* MatchMakerNet: Enabling Fragment Matching for Cultural Heritage Analysis
* MATE: Masked Autoencoders are Online 3D Test-Time Learners
* MatrixCity: A Large-scale City Dataset for City-scale Neural Rendering and Beyond
* MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception
* MB-TaylorFormer: Multi-branch Efficient Transformer Expanded by Taylor Formula for Image Dehazing
* MBPTrack: Improving 3D Point Cloud Tracking with Memory networks and Box Priors
* MCAGCN: Multi-component attention graph convolutional neural network for road travel time prediction
* MCDC-Net: Multi-scale forgery image detection network based on central difference convolution
* MDCS: More Diverse Experts with Consistency Self-distillation for Long-tailed Recognition
* Measuring Asymmetric Gradient Discrepancy in Parallel Continual Learning
* Measuring the Multi-Scale Landscape Pattern of China's Largest Archipelago from a Dual-3D Perspective Based on Remote Sensing
* MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training for X-ray Diagnosis
* MEFLUT: Unsupervised 1D Lookup Tables for Multi-exposure Image Fusion
* MEGA: Multimodal Alignment Aggregation and Distillation For Cinematic Video Segmentation
* Membrane Potential Batch Normalization for Spiking Neural Networks
* Memory Population in Continual Learning via Outlier Elimination
* Memory-and-Anticipation Transformer for Online Action Understanding
* Memory-augmented Variational Adaptation for Online Few-shot Segmentation
* MemorySeg: Online LiDAR Semantic Segmentation with a Latent Memory
* MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking
* Mesh2Tex: Generating Mesh Textures from Image Queries
* Meta OOD Learning For Continuously Adaptive OOD Detection
* Meta-ZSDETR: Zero-Shot DETR with Meta-Learning
* MetaBEV: Solving Sensor Failures for 3D Detection and Map Segmentation
* MetaF2N: Blind Image Super-Resolution by Learning Efficient Model Adaptation from Faces
* MetaFormer Baselines for Vision
* MetaGCD: Learning to Continually Learn in Generalized Category Discovery
* MetaVG: A Meta-Learning Framework for Visual Grounding
* Method for Extracting Photovoltaic Panels from High-Resolution Optical Remote Sensing Images Guided by Prior Knowledge, A
* Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image
* MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
* MFECLIP: CLIP With Mapping-Fusion Embedding for Text-Guided Image Editing
* MGiaD: Multigrid in all dimensions. Efficiency and robustness by weight sharing and coarsening in resolution and channel dimensions
* MGMAE: Motion Guided Masking for Video Masked Autoencoding
* MHCN: A Hyperbolic Neural Network Model for Multi-view Hierarchical Clustering
* MHEntropy: Entropy Meets Multiple Hypotheses for Pose and Shape Recovery
* MI-GAN: A Simple Baseline for Image Inpainting on Mobile Devices
* MIAD: A Maintenance Inspection Dataset for Unsupervised Anomaly Detection
* Microaneurysm classification system in color fundus images using auto-weight dilated convolutional network
* Mid-Term Monitoring of Suspended Sediment Plumes of Greek Rivers Using Moderate Resolution Imaging Spectroradiometer (MODIS) Imagery
* Mimic3D: Thriving 3D-Aware GANs via 3D-to-2D Imitation
* MIMO-NeRF: Fast Neural Rendering with Multi-input Multi-output Neural Radiance Fields
* Mind the Clot: Automated LVO Detection on CTA using Deep Learning
* Minimal Solutions to Generalized Three-View Relative Pose Problem
* Minimal Solutions to Uncalibrated Two-view Geometry with Known Epipoles
* Minimum Latency Deep Online Video Stabilization
* Minimum Noise Fraction Analysis of TGO/NOMAD LNO Channel High-Resolution Nadir Spectra of Mars
* Mining bias-target Alignment from Voronoi Cells
* Mining Deformation Monitoring Based on Lutan-1 Monostatic and Bistatic Data
* MiniROAD: Minimal RNN Framework for Online Action Detection
* Mirror U-Net: Marrying Multimodal Fission with Multi-task Learning for Semantic Segmentation in Medical Imaging
* Misalign, Contrast then Distill: Rethinking Misalignments in Language-Image Pretraining
* Misalignment-Free Relation Aggregation for Multi-Source-Free Domain Adaptation
* Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning
* Mitigating and Evaluating Static Bias of Action Representations in the Background and the Foreground
* Mitigating Confounding Bias in Practical Recommender Systems With Partially Inaccessible Exposure Status
* MixBag: Bag-Level Data Augmentation for Learning from Label Proportions
* MixCycle: Mixup Assisted Semi-Supervised 3D Single Object Tracking with Cycle Consistency
* Mixed Neural Voxels for Fast Multi-view Video Synthesis
* MixPath: A Unified Approach for One-shot Neural Architecture Search
* MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation
* MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition
* MixSynthFormer: A Transformer Encoder-like Structure with Mixed Synthetic Self-attention for Efficient Human Pose Estimation
* MMAN-M2: Multiple multi-head attentions network based on encoder with missing modalities
* MMST-ViT: Climate Change-aware Crop Yield Prediction via Multi-Modal Spatial-Temporal Vision Transformer
* MMTF: Multi-Modal Temporal Fusion for Commonsense Video Question Answering
* MMTFN: Multi-modal multi-scale transformer fusion network for Alzheimer's disease diagnosis
* MMVP: Motion-Matrix-based Video Prediction
* MobileSAM-Track: Lightweight One-Shot Tracking and Segmentation of Small Objects on Edge Devices
* MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions
* Modality Unifying Network for Visible-Infrared Person Re-Identification
* Model Calibration in Dense Classification with Adaptive Label Perturbation
* ModelGiF: Gradient Fields for Model Functional Distance
* Modeling the Relative Visual Tempo for Self-supervised Skeleton-based Action Recognition
* Modeling Visual Impairments with Artificial Neural Networks: a Review
* Modelling Paroxysmal and Mild-Strombolian Eruptive Plumes at Stromboli and Mt. Etna on 28 August 2019
* MOFA: A Model Simplification Roadmap for Image Restoration on Mobile Devices
* MolGrapher: Graph-based Visual Recognition of Chemical Structures
* Moment Detection in Long Tutorial Videos
* Monitoring of Cotton Boll Opening Rate Based on UAV Multispectral Data
* Monitoring of Supraglacial Lake Distribution and Full-Year Changes Using Multisource Time-Series Satellite Imagery
* Monitoring Spatio-Temporal Variations of Ponds in Typical Rural Area in the Huai River Basin of China
* Monocular 3D Object Detection with Bounding Box Denoising in 3D by Perceiver
* MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection
* MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection
* MonoNeRF: Learning a Generalizable Dynamic Radiance Field from Monocular Videos
* Monte Carlo Linear Clustering with Single-Point Supervision is Enough for Infrared Small Target Detection
* MoreauGrad: Sparse and Robust Interpretation of Neural Networks via Moreau Envelope
* MOSAIC: Multi-Object Segmented Arbitrary Stylization Using CLIP
* MOSE: A New Dataset for Video Object Segmentation in Complex Scenes
* Most Important Person-guided Dual-branch Cross-Patch Attention for Group Affect Recognition
* MOST: Multiple Object localization with Self-supervised Transformers for object discovery
* MoTIF: Learning Motion Trajectories with Local Implicit Neural Functions for Continuous Space-Time Video Super-Resolution
* Motion Complement and Temporal Multifocusing for Skeleton-Based Action Recognition
* Motion-Guided Masking for Spatiotemporal Representation Learning
* MotionBERT: A Unified Perspective on Learning Human Motion Representations
* MotionDeltaCNN: Sparse CNN Inference of Frame Differences in Moving Camera Videos with Spherical Buffers and Padded Convolutions
* MotionLM: Multi-Agent Motion Forecasting as Language Modeling
* Movement Enhancement toward Multi-Scale Video Feature Representation for Temporal Action Detection
* MozzieNet: A deep learning approach to efficiently detect malaria parasites in blood smear images
* MPCViT: Searching for Accurate and Efficient MPC-Friendly Vision Transformer with Heterogeneous Attention
* MPI-Flow: Learning Realistic Optical Flow with Multiplane Images
* MRM: Masked Relation Modeling for Medical Image Pre-Training with Genetics
* MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition
* MSI: Maximize Support-Set Information for Few-Shot Segmentation
* MSISR-STF: Spatiotemporal Fusion via Multilevel Single-Image Super-Resolution
* MSSFF: Advancing Hyperspectral Classification through Higher-Accuracy Multistage Spectral-Spatial Feature Fusion
* MST-compression: Compressing and Accelerating Binary Neural Networks with Minimum Spanning Tree
* MSViT: Dynamic Mixed-scale Tokenization for Vision Transformers
* mu-Split: image decomposition for fluorescence microscopy
* MULLER: Multilayer Laplacian Resizer for Vision
* Multi-body Depth and Camera Pose Estimation from Multiple Views
* Multi-Camera 3D Position Estimation using Conditional Random Field
* Multi-classification of brain tumor by using deep convolutional neural network model in magnetic resonance imaging images
* Multi-ConDoS: Multimodal Contrastive Domain Sharing Generative Adversarial Networks for Self-Supervised Medical Image Segmentation
* Multi-Dimensional Low-Rank with Weighted Schatten p-Norm Minimization for Hyperspectral Anomaly Detection
* Multi-Directional Subspace Editing in Style-Space
* Multi-echo hyperspectral reflectance extraction method based on full waveform hyperspectral LiDAR
* Multi-event Video-Text Retrieval
* Multi-Exit Resource-Efficient Neural Architecture for Image Classification with Optimized Fusion Block
* Multi-Frequency Representation Enhancement with Privilege Information for Video Super-Resolution
* Multi-grained Temporal Prototype Learning for Few-shot Video Object Segmentation
* Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation
* Multi-Input Convolutional Neural Networks Model for Earthquake Precursor Detection Based on Ionospheric Total Electron Content, A
* Multi-interactive Feature Learning and a Full-time Multi-modality Benchmark for Image Fusion and Segmentation
* Multi-label affordance mapping from egocentric vision
* Multi-Label Knowledge Distillation
* Multi-Label Self-Supervised Learning with Scene Images
* Multi-Level Attention Interactive Network for Cloud and Snow Detection Segmentation
* Multi-metrics adaptively identifies backdoors in Federated learning
* Multi-Modal Continual Test-Time Adaptation for 3D Semantic Segmentation
* Multi-Modal Correlated Network with Emotional Reasoning Knowledge for Social Intelligence Question-Answering
* Multi-modal Gated Mixture of Local-to-Global Experts for Dynamic Image Fusion
* Multi-Modal Image Registration Based on Phase Exponent Differences of the Gaussian Pyramid
* Multi-Modal Neural Radiance Field for Monocular Dense SLAM with a Light-Weight ToF Sensor
* Multi-Object Discovery by Low-Dimensional Object Motion
* Multi-Object Navigation with dynamically learned neural implicit representations
* Multi-Resolution Segmentation of Solar Photovoltaic Systems Using Deep Learning
* Multi-Satellite Imaging Task Planning for Large Regional Coverage: A Heuristic Algorithm Based on Triple Grids Method
* Multi-Satellite Observation-Relay Transmission-Downloading Coupling Scheduling Method
* Multi-Scale Bidirectional Recurrent Network with Hybrid Correlation for Point Cloud Based Scene Flow Estimation
* Multi-Scale Fusion and Transformer Based Registration Guided Speckle Noise Reduction for OCT Images, A
* Multi-Scale Remote Sensing Assessment of Ecological Environment Quality and Its Driving Factors in Watersheds: A Case Study of Huashan Creek Watershed in China
* Multi-scale Residual Low-Pass Filter Network for Image Deblurring
* Multi-Sensor Observations Reveal Large-Amplitude Nonlinear Internal Waves in the Kara Gates, Arctic Ocean
* Multi-Source T-S Target Recognition via an Intuitionistic Fuzzy Method
* Multi-source to multi-target domain adaptation method based on similarity measurement
* Multi-Stage Asynchronous Federated Learning With Adaptive Differential Privacy
* Multi-Stage Progressive Change Detection on High Resolution Remote Sensing Imagery
* Multi-Task Consistency for Active Learning
* Multi-Task Hypergraphs for Semi-supervised Learning using Earth Observations
* Multi-Task Learning with Knowledge Distillation for Dense Prediction
* Multi-task View Synthesis with Neural Radiance Fields
* Multi-Temporal and Time-Lag Responses of Terrestrial Net Ecosystem Productivity to Extreme Climate from 1981 to 2019 in China
* Multi-View Active Fine-Grained Visual Recognition
* Multi-view Self-supervised Disentanglement for General Image Denoising
* Multi-view Spectral Polarization Propagation for Video Glass Segmentation
* Multi-weather Image Restoration via Domain Translation
* Multi3DRefer: Grounding Text Description to Multiple 3D Objects
* Multidimensional Analysis of Social Biases in Vision Transformers, A
* Multiencoder-based federated intelligent deep learning model for brain tumor segmentation
* MultiFeNet: Multi-scale feature scaling in deep neural network for the brain tumour classification in MRI images
* Multiloss strategy for breast cancer subtype classification using digital breast tomosynthesis
* Multimodal Contrastive Learning and Tabular Attention for Automated Alzheimer's Disease Prediction
* Multimodal Distillation for Egocentric Action Recognition
* Multimodal Error Correction with Natural Language and Pointing Gestures
* Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing
* Multimodal High-order Relation Transformer for Scene Boundary Detection
* Multimodal Motion Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection
* Multimodal Neurons in Pretrained Text-Only Transformers
* Multimodal Optimal Transport-based Co-Attention Transformer with Global Structure Consistency for Survival Prediction
* Multimodal Parameter-Efficient Few-Shot Class Incremental Learning
* Multimodal Variational Auto-encoder based Audio-Visual Segmentation
* Multiple Close-Range Geomatic Techniques for the Kinematic Study of the La Pael Rock Glacier, Southern Pyrenees
* Multiple Greenness Indexes Revealed the Vegetation Greening during the Growing Season and Winter on the Tibetan Plateau despite Regional Variations
* Multiple Instance Learning Framework with Masked Hard Instance Mining for Whole Slide Image Classification
* Multiple Planar Object Tracking
* Multiscale Fusion of Panchromatic and Multispectral Images Based on Adaptive Iterative Filtering
* Multiscale Representation for Real-Time Anti-Aliasing Neural Rendering
* Multiscale Structure Guided Diffusion for Image Deblurring
* Multisensory Analysis of the Moisture Course of the Cave of Altamira (Spain): Implications for Its Conservation, A
* Muscles in Action
* MUter: Machine Unlearning on Adversarially Trained Models
* Mutual Information Regularization for Weakly-Supervised RGB-D Salient Object Detection
* MUVA: A New Large-Scale Benchmark for Multi-view Amodal Instance Segmentation in the Shopping Scenario
* MV-DeepSDF: Implicit Modeling with Multi-Sweep Point Clouds for 3D Vehicle Reconstruction in Autonomous Driving
* MV-Map: Offboard HD Map Generation with Multi-view Consistency
* MVPSNet: Fast Generalizable Multi-view Photometric Stereo
* MVSSC: Meta-reinforcement learning based visual indoor navigation using multi-view semantic spatial context
* Name Your Colour For the Task: Artificially Discover Colour Naming via Colour Quantisation Transformer
* NAPA-VQ: Neighborhood Aware Prototype Augmentation with Vector Quantization for Continual Learning
* Narrator: Towards Natural Control of Human-Scene Interaction Generation via Relationship Reasoning
* NavCom Signal Authentication Scheme Based on Twice Two-Way Satellite Time Transfer, A
* Navigating to Objects Specified by Images
* NaviNeRF: NeRF-based 3D Representation Disentanglement by Latent Semantic Navigation
* NCHO: Unsupervised Learning for Neural 3D Composition of Humans and Objects
* NCQS: Nonlinear Convex Quadrature Surrogate Hyperparameter Optimization
* NDC-Scene: Boost Monocular 3D Semantic Scene Completion in Normalized Device Coordinates Space
* NDDepth: Normal-Distance Assisted Monocular Depth Estimation
* Nearest Neighbor Guidance for Out-of-Distribution Detection
* Neglected Free Lunch: Learning Image Classifiers Using Annotation Byproducts
* NeILF++: Inter-Reflectable Light Fields for Geometry and Material Estimation
* NeMF: Inverse Volume Rendering with Neural Microflake Field
* NEMTO: Neural Environment Matting for Novel View and Relighting Synthesis of Transparent Objects
* NeO 360: Neural Fields for Sparse View Synthesis of Outdoor Scenes
* NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection
* NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions
* NeRF-LOAM: Neural Implicit Representation for Large-Scale Incremental LiDAR Odometry and Mapping
* NeRF-MS: Neural Radiance Fields with Multi-Sequence
* NeRF-Pose: A First-Reconstruct-Then-Regress Approach for Weakly-supervised 6D Object Pose Estimation
* NerfAcc: Efficient Sampling Accelerates NeRFs
* Nerfbusters: Removing Ghostly Artifacts from Casually Captured NeRFs
* NeRFrac: Neural Radiance Fields through Refractive Surface
* NeSS-ST: Detecting Good and Stable Keypoints with a Neural Stability Score and the Shi-Tomasi detector
* NeTO: Neural Reconstruction of Transparent Objects with Self-Occlusion Aware Refraction-Tracing
* Neural Collage Transfer: Artistic Reconstruction via Material Manipulation
* Neural Deformable Models for 3D Bi-Ventricular Heart Shape Reconstruction and Modeling from 2D Sparse Cardiac Magnetic Resonance Imaging
* Neural Fields for Structured Lighting
* Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction
* Neural Implicit Surface Evolution
* Neural Interactive Keypoint Detection
* Neural LiDAR Fields for Novel View Synthesis
* Neural Microfacet Fields for Inverse Rendering
* Neural Radiance Fields with LiDAR Maps
* Neural Reconstruction of Relightable Human Model from Monocular Video
* Neural Video Depth Stabilizer
* Neural-PBIR Reconstruction of Shape, Material, and Illumination
* NeuRBF: A Neural Fields Representation with Adaptive Radial Basis Functions
* NeuS2: Fast Learning of Neural Implicit Surfaces for Multi-view Reconstruction
* New Dataset for End-to-End Sign Language Translation: The Greek Elementary School Dataset, A
* New Deep Neural Network Based on SwinT-FRM-ShipNet for SAR Ship Detection in Complex Near-Shore and Offshore Environments, A
* new hierarchical algorithm based on CapsGAN for imbalanced image classification, A
* New Insights into the Simulations of Electric Currents for Discharges and ULF Magnetic-Field Perturbations: Applications to the Popocatepetl Volcano and a Micro-Discharge Model
* new large dataset and a transfer learning methodology for plant phenotyping in Vertical Farms, A
* New Method for Deformation Monitoring of Structures by Precise Point Positioning, A
* New Remote Sensing Desert Vegetation Detection Index, A
* New Theory and Faster Computations for Subspace-Based Sensitivity Map Estimation in Multichannel MRI
* NIR-assisted Video Enhancement via Unpaired 24-hour Data
* NLOS-NeuS: Non-line-of-sight Neural Implicit Surface
* No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets
* No Fear of Classifier Biases: Neural Collapse Inspired Federated Learning with Synthetic and Fixed Classifier
* Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning
* Noise-in, Bias-out: Balanced and Real-time MoCap Solving
* Noise2Info: Noisy Image to Information of Noise for Self-Supervised Image Denoising
* Noisy Remote Sensing Scene Classification via Progressive Learning Based on Multiscale Information Exploration
* Non-Coaxial Event-guided Motion Deblurring with Spatial Alignment
* Non-Destructive Infield Quality Estimation of Strawberries using Deep Architectures
* Non-Fluent Synthetic Target-Language Data Improve Neural Machine Translation
* Non-Isocentric Geometry for Next-Generation Tomosynthesis With Super-Resolution
* Non-Semantics Suppressed Mask Learning for Unsupervised Video Semantic Compression
* Noninvasive Early Detection of Nutrient Deficiencies in Greenhouse-Grown Industrial Hemp Using Hyperspectral Imaging
* Nonrigid Object Contact Estimation With Regional Unwrapping Transformer
* Normalizing Flows for Human Pose Anomaly Detection
* Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior Refinement
* Not All Steps are Created Equal: Selective Diffusion Distillation for Image Manipulation
* Not Every Side Is Equal: Localization Uncertainty Estimation for Semi-Supervised 3D Object Detection
* Note on the Unit-Circle-Zeros Property of the MVDR Beamformer, A
* NOVA: NOvel View Augmentation for Neural Composition of Dynamic Objects
* Novel Approach to Evaluate GNSS-RO Signal Receiver Performance in Terms of Ground-Based Atmospheric Occultation Simulation System, A
* Novel Building Extraction Network via Multi-Scale Foreground Modeling and Gated Boundary Refinement, A
* Novel Chaotic-NLFM Signal under Low Oversampling Factors for Deception Jamming Suppression, A
* Novel Intrapulse Beamsteering SAR Imaging Mode Based on OFDM-Chirp Signals, A
* Novel Learning Dictionary for Sparse Coding-Based Key Point Detection, A
* Novel Polarization Scattering Decomposition Model and Its Application to Ship Detection, A
* Novel Scenes & Classes: Towards Adaptive Open-set Object Detection
* Novel-view Synthesis and Pose Estimation for Hand-Object Interaction from Sparse Views
* NPC: Neural Point Characters from Video
* NSF: Neural Surface Fields for Human Modeling from Monocular Depth
* NU-Net: a self-supervised smart filter for enhancing blobs in bioimages
* Nucleus-Aware Self-Supervised Pretraining Using Unpaired Image-to-Image Translation for Histopathology Images
* nuScenes Knowledge Graph - A comprehensive semantic representation of traffic scenes for trajectory prediction
* Object as Query: Lifting any 2D Object Detector to 3D Detection
* Object-aware Gaze Target Detection
* Object-Centric Multiple Object Tracking
* ObjectFusion: Multi-modal 3D Object Detection with Object-Centric Fusion
* Objects do not disappear: Video object detection by single-frame object location anticipation
* ObjectSDF++: Improved Object-Compositional Neural Implicit Surfaces
* Occ2Net: Robust Image Matching Based on 3D Occupancy Estimation for Occluded Regions
* OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction
* Occluded Gait Recognition via Silhouette Registration Guided by Automated Occlusion Degree Estimation
* Ocean Colour Atmospheric Correction for Optically Complex Waters under High Solar Zenith Angles: Facilitating Frequent Diurnal Monitoring and Management
* OCHID-Fi: Occlusion-Robust Hand Pose Estimation in 3D via RF-Vision
* Ocular Disease Recognition via Differential Privacy and Unsupervised Domain Regularizer
* OEGR-DETR: A Novel Detection Transformer Based on Orientation Enhancement and Group Relations for SAR Object Detection
* OFVL-MS: Once for Visual Localization across Multiple Indoor Scenes
* Oil Well Detection under Occlusion in Remote Sensing Images Using the Improved YOLOv5 Model
* Oilfield Reservoir Parameter Inversion Based on 2D Ground Deformation Measurements Acquired by a Time-Series MSBAS-InSAR Method
* OMG-Attack: Self-Supervised On-Manifold Generation of Transferable Evasion Attacks
* Omnidirectional Information Gathering for Knowledge Transfer-based Audio-Visual Navigation
* Omnidirectional Monolithic Marker for Intra-Operative MR-Based Positional Sensing in Closed MRI
* OmniLabel: A Challenging Benchmark for Language-Based Object Detection
* OmnimatteRF: Robust Omnimatte with 3D Background Modeling
* OmniZoomer: Learning to Move and Zoom in on Sphere at High-Resolution
* On Binary Sequence Design via PSL Minimization
* On Moving Object Segmentation from Monocular Video with Transformers
* On Offline Evaluation of 3D Object Detection for Autonomous Driving
* On the Adversarial Robustness of Multi-Modal Foundation Models
* On the Audio-visual Synchronization for Lip-to-Speech Synthesis
* On the Contractivity of Plug-and-Play Operators
* On the Effectiveness of LayerNorm Tuning for Continual Learning in Vision Transformers
* On the Effectiveness of Spectral Discriminators for Perceptual Quality Improvement
* On the Interplay of Convolutional Padding and Adversarial Robustness
* On the risk of manual annotations in 3D confocal microscopy image segmentation
* On the Robustness of Normalizing Flows for Inverse Problems in Imaging
* On the Robustness of Open-World Test-Time Training: Self-Training with Dynamic Prototype Expansion
* On the unreasonable vulnerability of transformers for image restoration - and an easy fix
* On Variational Block Sparse Recovery With Unknown Partition and L_0-Norm Constraint
* On-device Real-time Custom Hand Gesture Recognition
* On-Orbit Calibration Method for Correction Microwave Radiometer of the HY-2 Satellite Constellation
* Once Detected, Never Lost: Surpassing Human Performance in Offline LiDAR based 3D Object Detection
* One Transform to Compute Them All: Efficient Fusion-Based Full-Reference Video Quality Assessment
* One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training
* One-Shot Generative Domain Adaptation
* One-shot Implicit Animatable Avatars with Model-based Priors
* One-shot recognition of any material anywhere using contrastive learning with physics-based rendering
* One-Shot Weakly-Supervised Segmentation in 3D Medical Images
* Online Class Incremental Learning on Stochastic Blurry Task Boundary via Mask and Visual Prompt Tuning
* Online Clustered Codebook
* Online Continual Learning on Hierarchical Label Expansion
* Online Detection of AI-Generated Images
* Online Multiple Kernel Parallelizable Learning Scheme, An
* Online Prototype Learning for Online Continual Learning
* OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation
* OPAL: Occlusion Pattern Aware Loss for Unsupervised Light Field Disparity Estimation
* Open Problems in Computer Vision for Wilderness SAR and The Search for Patricia Wu-Murad
* Open Scene Understanding: Grounded Situation Recognition Meets Segment Anything for Helping People with Visual Impairments
* Open Set Video HOI detection from Action-centric Chain-of-Look Prompting
* Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities
* Open-Vocabulary Object Detection With an Open Corpus
* Open-vocabulary Object Segmentation with Diffusion Models
* Open-vocabulary Panoptic Segmentation with Embedding Modulation
* Open-Vocabulary Semantic Segmentation with Decoupled One-Pass Network
* Open-Vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models
* OpenIncrement: A Unified Framework for Open Set Recognition and Deep Class-Incremental Learning
* OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception
* OPERA: Omni-Supervised Representation Learning with Hierarchical Supervisions
* Operational Aspects of Landsat 8 and 9 Geometry
* Optical Solutions for Spectral Imaging Inverse Problems with a Shift-Variant System
* Optimal Composite Likelihood Estimation and Prediction for Distributed Gaussian Process Modeling
* Optimal dispatch of a mobile storage unit to support electric vehicles charging stations
* optimal system for increasing the contrast resolution qualities of histopathology images in the wavelet domain, An
* Optimized Ensemble Framework for Multi-Label Classification on Long-Tailed Chest X-ray Data, An
* Optimized Integer Aperture Bootstrapping for High-Integrity CDGNSS Applications
* Optimizing Back-Propagation Neural Network to Retrieve Sea Surface Temperature Based on Improved Sparrow Search Algorithm
* Optimizing the Placement of Roadside LiDARs for Autonomous Driving
* ORC: Network Group-based Knowledge Distillation using Online Role Change
* Ord2Seq: Regarding Ordinal Regression as Label Sequence Prediction
* Order-preserving Consistency Regularization for Domain Adaptation and Generalization
* Order-Prompted Tag Sequence Generation for Video Tagging
* Order-ViT: Order Learning Vision Transformer for Cancer Classification in Pathology Images
* Ordered Atomic Activity for Fine-grained Interactive Traffic Scenario Understanding
* Ordinal Label Distribution Learning
* Orientation-Aware Pedestrian Attribute Recognition Based on Graph Convolution Network
* OrthoPlanes: A Novel Representation for Better 3D-Awareness of GANs
* OSCNet: Orientation-Shared Convolutional Network for CT Metal Artifact Learning
* Out-of-Distribution Detection for Monocular Depth Estimation
* Out-of-domain GAN inversion via Invertibility Decomposition for Photo-Realistic Human Face Manipulation
* Overcoming Forgetting Catastrophe in Quantization-Aware Training
* Overcomplete graph convolutional denoising autoencoder for noisy skeleton action recognition
* Overview of GIS-RS Applications for Archaeological and Cultural Heritage under the DBAR-Heritage Mission, An
* Overwriting Pretrained Bias with Finetuning Data
* OxfordTVG-HIC: Can Machine Make Humorous Captions from Images?
* P1AC: Revisiting Absolute Pose From a Single Affine Correspondence
* P2C: Self-Supervised Point Cloud Completion from Single Partial Clouds
* PADCLIP: Pseudo-labeling with Adaptive Debiasing in CLIP for Unsupervised Domain Adaptation
* Padding Aware Neurons
* PADDLES: Phase-Amplitude Spectrum Disentangled Early Stopping for Learning with Noisy Labels
* Painter: Teaching Auto-regressive Language Models to Draw Sketches
* Pairwise Similarity Learning is SimPLE
* Pan-Sharpening Network of Multi-Spectral Remote Sensing Images Using Two-Stream Attention Feature Extractor and Multi-Detail Injection (TAMINet)
* PanFlowNet: A Flow-Based Deep Network for Pan-sharpening
* Panoramas from Photons
* PanoStyle: Semantic, Geometry-Aware and Shading Independent Photorealistic Style Transfer for Indoor Panoramic Scenes
* Parallax-Tolerant Unsupervised Deep Image Stitching
* Parallel Attention Interaction Network for Few-Shot Skeleton-Based Action Recognition
* Parallel-Cascaded Ensemble of Machine Learning Models for Crop Type Classification in Google Earth Engine Using Multi-Temporal Sentinel-1/2 and Landsat-8/9 Remote Sensing Data, A
* Parallelized Inter-Image k-Means Clustering Algorithm for Unsupervised Classification of Series of Satellite Images
* Parameterized Cost Volume for Stereo Matching
* Parametric Classification for Generalized Category Discovery: A Baseline Study
* Parametric Depth Based Feature Representation Learning for Object Detection and Segmentation in Bird's-Eye View
* Parametric Information Maximization for Generalized Category Discovery
* ParCNetV2: Oversized Kernel with Enhanced Attention
* PARF: Primitive-Aware Radiance Fusion for Indoor Scene Novel View Synthesis
* PARFormer: Transformer-Based Multi-Task Network for Pedestrian Attribute Recognition
* PARIS: Part-level Reconstruction and Motion Analysis for Articulated Objects
* parking occupancy prediction method incorporating time series decomposition and temporal pattern attention mechanism, A
* Parse-Then-Place Approach for Generating Graphic Layouts from Textual Descriptions, A
* Part-Aware Transformer for Generalizable Person Re-identification
* PARTICLE: Part Discovery and Contrastive Learning for Fine-grained Recognition
* Partition Speeds Up Learning Implicit Neural Representations Based on Exponential-Increase Hypothesis
* Partition-and-Debias: Agnostic Biases Mitigation via A Mixture of Biases-Specific Experts
* PARTNER: Level up the Polar Representation for LiDAR 3D Object Detection
* Passive Ultra-Wideband Single-Photon Imaging
* Pasta: Proportional Amplitude Spectrum Training Augmentation for Syn-to-Real Domain Generalization
* Pasture Biomass Estimation Using Ultra-High-Resolution RGB UAVs Images and Deep Learning
* PAT: Position-Aware Transformer for Dense Multi-Label Action Detection
* PatchCT: Aligning Patch Set and Label Set with Conditional Transport for Multi-Label Image Classification
* PatFig: Generating Short and Long Captions for Patent Figures
* Pathology-Based Ischemic Stroke Etiology Classification via Clot Composition Guided Multiple Instance Learning
* Patient-Specific Heart Geometry Modeling for Solid Biomechanics Using Deep Learning
* PATMAT: Person Aware Tuning of Mask-Aware Transformer for Face inpainting
* Pattern-Aware Transformer: Hierarchical Pattern Propagation in Sequential Medical Images
* PC-Adapter: Topology-Aware Adapter for Efficient Domain Adaption on Point Clouds with Rectified Pseudo-label
* PCTrans: Position-Guided Transformer with Query Contrast for Biological Instance Segmentation
* PDiscoNet: Semantically consistent part discovery for fine-grained recognition
* PEANUT: Predicting and Navigating to Unseen Targets
* Perceptual Artifacts Localization for Image Synthesis Tasks
* Perceptual Grouping in Contrastive Vision-Language Models
* Perceptual Quality Improvement in Videoconferencing Using Keyframes-Based GAN
* Performance Analysis and Improvement of Constrained Adaptive Monopulse Approach
* Performance Analysis of Artificial Intelligence Approaches for LEMP Classification
* Performance analysis of state-of-the-art CNN architectures for brain tumour detection
* Performance Assessment of a High-Frequency Radar Network for Detecting Surface Currents in the Pearl River Estuary
* Perils of Learning From Unlabeled Data: Backdoor Attacks on Semi-supervised Learning, The
* Periodically Exchange Teacher-Student for Source-Free Object Detection
* Perpetual Humanoid Control for Real-time Simulated Avatars
* Persistent-Transient Duality: A Multi-mechanism Approach for Modeling Human-Object Interaction
* Person Re-Identification without Identification via Event Anonymization
* Personalized 3D Human Pose and Shape Refinement
* Personalized Image Enhancement Featuring Masked Style Modeling
* Personalized Image Generation for Color Vision Deficiency Population
* Personalized Monitoring in Home Healthcare: An Assistive System for Post Hip Replacement Rehabilitation
* Personalized Semantics Excitation for Federated Image Classification
* PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images
* PFENet++: Boosting Few-Shot Semantic Segmentation With the Noise-Filtered Context-Aware Prior Mask
* PG-RCNN: Semantic Surface Point Generation for 3D Object Detection
* PGFed: Personalize Each Client's Global Objective for Federated Learning
* PhaseMP: Robust 3D Pose Estimation via Phase-conditioned Human Motion Prior
* Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption
* PHRIT: Parametric Hand Representation with Implicit Template
* PhysDiff: Physics-Guided Human Motion Diffusion Model
* Physically-plausible illumination distribution estimation
* Physics inspired hybrid attention for SAR target recognition
* Physics-Augmented Autoencoder for 3D Skeleton-Based Gait Recognition
* Physics-Driven Turbulence Image Restoration with Stochastic Refinement
* Physics-Informed Deep Learning Inversion with Application to Noisy Magnetotelluric Measurements
* PIDRo: Parallel Isomeric Attention with Dynamic Routing for Text-Video Retrieval
* Pigment Mapping for Tomb Murals using Neural Representation and Physics-based Model
* PIRNet: Privacy-Preserving Image Restoration Network via Wavelet Lifting
* PivotNet: Vectorized Pivot Learning for End-to-end HD Map Construction
* Pix2PixSSR: Spatial super-resolution synthesis and visualization for time-varying volumetric data
* Pix2Video: Video Editing using Image Diffusion
* Pixel Adaptive Deep Unfolding Transformer for Hyperspectral Image Reconstruction
* Pixel-Aligned Recurrent Queries for Multi-View 3D Object Detection
* Pixel-Wise Contrastive Distillation
* PlanarTrack: A Large-scale Challenging Benchmark for Planar Object Tracking
* PlaneRecTR: Unified Query Learning for 3D Plane Recovery from a Single View
* PlankAssembly: Robust 3D Reconstruction from Three Orthographic Views with Learnt Shape Programs
* Plant Root Occlusion Inpainting with Generative Adversarial Network
* Plausible Uncertainties for Human Pose Regression
* Plugging the Gaps in the Global PhenoCam Monitoring of Forests: The Need for a PhenoCam Network across Indian Forests
* Pluralistic Aging Diffusion Autoencoder
* PNI: Industrial Anomaly Detection using Position and Neighborhood Information
* PODIA-3D: Domain Adaptation of 3D Generative Model Across Large Domain Gap Using Pose-Preserved Text-to-Image Diffusion
* Poincaré ResNet
* Point Cloud Segmentation Method for Dim and Cluttered Underground Tunnel Scenes Based on the Segment Anything Model, A
* Point Contrastive Prediction with Semantic Clustering for Self-Supervised Learning on Point Cloud Videos
* Point-Query Quadtree for Crowd Counting, Localization, and More
* Point-SLAM: Dense Neural Point Cloud-based SLAM
* Point-TTA: Test-Time Adaptation for Point Cloud Registration Using Multitask Meta-Auxiliary Learning
* Point2Mask: Point-supervised Panoptic Segmentation via Optimal Transport
* PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning
* PointDC: Unsupervised Semantic Segmentation of 3D Point Clouds via Cross-modal Distillation and Super-Voxel Clustering
* Pointing Gesture Recognition via Self-supervised Regularization for ASD Screening
* Pointing out Human Answer Mistakes in a Goal-Oriented Visual Dialogue
* PointMBF: A Multi-scale Bidirectional Fusion Network for Unsupervised RGB-D Point Cloud Registration
* PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking
* PolicyCleanse: Backdoor Detection and Mitigation for Competitive Reinforcement Learning
* Pollinators as Data Collectors: Estimating Floral Diversity with Bees and Computer Vision
* Polygon Detection for Room Layout Estimation using Heterogeneous Graphs and Wireframes
* Ponder: Point Cloud Pre-training via Neural Rendering
* Population Monte Carlo With Normalizing Flow
* Pose-Free Neural Radiance Fields via Implicit Pose Regularization
* PoseBias: On Dataset Bias and Task Difficulty - Is there an Optimal Camera Position for Facial Image Analysis?
* PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment
* PoseFix: Correcting 3D Human Poses with Natural Language
* PoseMatcher: One-shot 6D Object Pose Estimation by Deep Feature Matching
* Posidonia oceanica Cartography and Evolution of the Balearic Sea (Western Mediterranean)
* Positive Unlabeled Fake News Detection via Multi-Modal Masked Transformer Network
* Post Training Mixed Precision Quantization of Neural Networks using First-Order Information
* POSTER: A Pyramid Cross-Fusion Transformer Network for Facial Expression Recognition
* PourIt!: Weakly-supervised Liquid Perception from a Single Image for Visual Closed-Loop Robotic Pouring
* Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion, The
* PPR: Physically Plausible Reconstruction from Monocular Videos
* Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study
* PRANC: Pseudo RAndom Networks for Compacting deep models
* PRAT: PRofiling Adversarial aTtacks
* Pre-Launch Multi-Energy Radiance Calibration of the OMS-N
* Pre-training Vision Transformers with Very Limited Synthesized Images
* Pre-training-free Image Manipulation Localization through Non-Mutually Exclusive Contrastive Learning
* Predict to Detect: Prediction-guided 3D Object Detection using Sequential Images
* Predicting coronary artery disease using learning rate based snow leopard optimization with soft swish scaling based bidirectional long short-term memory
* Predicting Radiologists' Gaze With Computational Saliency Models in Mammogram Reading
* Prediction of Open Woodland Transpiration Incorporating Sun-Induced Chlorophyll Fluorescence and Vegetation Structure
* Preface: A Data-driven Volumetric Prior for Few-shot Ultra High-resolution Face Synthesis
* Preparing the Future for Continual Semantic Segmentation
* Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models
* Preserving Modality Structure Improves Multi-Modal Learning
* Preserving Tumor Volumes for Unsupervised Medical Image Registration
* PreSTU: Pre-Training for Scene-Text Understanding
* Pretrained Language Models as Visual Planners for Human Assistance
* Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models
* Prior knowledge guided text to image generation
* Prior-guided Source-free Domain Adaptation for Human Pose Estimation
* PRIOR: Prototype Representation Joint Learning from Medical Images and Reports
* Priority-Centric Human Motion Generation in Discrete Latent Space
* Privacy Preserving Localization via Coordinate Permutations
* Privacy-Preserving Face Recognition Using Random Frequency Components
* Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views
* Probabilistic MIMO U-Net: Efficient and Accurate Uncertainty Estimation for Pixel-wise Regression
* Probabilistic Modeling of Inter- and Intra-observer Variability in Medical Image Segmentation
* Probabilistic Precision and Recall Towards Reliable Evaluation of Generative Models
* Probabilistic Triangulation for Uncalibrated Multi-View 3D Human Pose Estimation
* ProbVLM: Probabilistic Adapter for Frozen Vision-Language Models
* Progress and Limitations in the Satellite-Based Estimate of Burnt Areas
* Progressive Feature Adjustment for Semi-supervised Learning from Pretrained Models
* Progressive Learning of 3D Reconstruction Network From 2D GAN Data
* Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval
* Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval
* Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models
* Prompt-aligned Gradient for Prompt Tuning
* PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3
* PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization
* Propagation of Meteorological Drought to Agricultural and Hydrological Droughts in the Tropical Lancang-Mekong River Basin
* ProPainter: Improving Propagation and Transformer for Video Inpainting
* ProtoFL: Unsupervised Federated Learning via Prototypical Distillation
* ProtoTransfer: Cross-Modal Prototype Transfer for Point Cloud Segmentation
* Prototype Reminiscence and Augmented Asymmetric Knowledge Aggregation for Non-Exemplar Class-Incremental Learning
* Prototype-based Dataset Comparison
* Prototypical Kernel Learning and Open-set Foreground Perception for Generalized Few-shot Semantic Segmentation
* ProVLA: Compositional Image Search with Progressive Vision-Language Alignment and Multimodal Fusion
* Proxy Anchor-based Unsupervised Learning for Continuous Generalized Category Discovery
* Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation
* Pseudo Flow Consistency for Self-Supervised 6D Object Pose Estimation
* Pseudo-Label Alignment for Semi-Supervised Instance Segmentation
* Pseudo-Supervised Low-Light Image Enhancement With Mutual Learning
* PVO-Based Reversible Data Hiding Using Global Sorting and Fixed 2D Mapping Modification
* PVT++: A Simple End-to-End Latency-Aware Visual Tracking Framework
* Pyramid Dual Domain Injection Network for Pan-sharpening
* PØDA: Prompt-driven Zero-shot Domain Adaptation
* Q-Diffusion: Quantizing Diffusion Models
* QBitOpt: Fast and Accurate Bitwidth Reallocation during Training
* QD-BEV: Quantization-aware View-guided Distillation for Multi-view 3D Object Detection
* Quality Diversity for Visual Pre-Training
* Quality-Agnostic Deepfake Detection with Intra-model Collaborative Learning
* Quantifying Intermittent Flow Regimes in Ungauged Basins: Optimization of Remote Sensing Techniques for Ephemeral Channels Using a Flexible Statistical Classification
* Quantifying the Impact of Hurricane Harvey on Beach-Dune Systems of the Central Texas Coast and Monitoring Their Changes Using UAV Photogrammetry
* Quantitative analysis of facial soft tissue using weighted cascade regression model applicable for facial plastic surgery
* Quantitative Assessment of Factors Influencing the Spatiotemporal Variation in Carbon Dioxide Fluxes Simulated by Multi-Source Remote Sensing Data in Tropical Vegetation
* Quantization Step Estimation of Color Images Based on Res2Net-C With Frequency Clustering Prior Knowledge
* Quantized Generative Models for Solving Inverse Problems
* Quantum Algorithm for Signal Denoising
* Quantum-inspired hybrid algorithm for image classification and segmentation: Q-Means++ max-cut method
* Query Refinement Transformer for 3D Instance Segmentation
* Query6DoF: Learning Sparse Queries as Implicit Shape Prior for Category-Level 6DoF Pose Estimation
* R-Pred: Two-Stage Motion Prediction Via Tube-Query Attention-Based Trajectory Refinement
* R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras
* Radar Emitter Signal Intra-Pulse Modulation Open Set Recognition Based on Deep Neural Network
* Radar-Infrared Sensor Fusion Based on Hierarchical Features Mining
* Radio Frequency Interference Mitigation in Synthetic Aperture Radar Data Based on Instantaneous Spectrum Forward Consecutive Mean Excision
* Raising the Bar on the Evaluation of Out-of-Distribution Detection
* RANA: Relightable Articulated Neural Avatars
* Random Boxes Are Open-world Object Detectors
* Random Sub-Samples Generation for Self-Supervised Real Image Denoising
* Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning
* RankMatch: Fostering Confidence and Consistency in Learning with Noisy Labels
* RankMixup: Ranking-Based Mixup Training for Network Calibration
* Rapid Adaptation in Online Continual Learning: Are We Evaluating It Right?
* Rapid building damage assessment workflow: An implementation for the 2023 Rolling Fork, Mississippi tornado event
* Rapid Flood Inundation Forecast Using Fourier Neural Operator
* Rapid Mapping of Landslides Induced by Heavy Rainfall in the Emilia-Romagna (Italy) Region in May 2023
* Rapid Network Adaptation: Learning to Adapt Neural Networks Using Test-Time Feedback
* Rapid survey method for large-scale outdoor surveillance cameras using binary space partitioning
* Rapid tomato DUS trait analysis using an optimized mobile-based coarse-to-fine instance segmentation algorithm
* RAU-Net-Based Imaging Method for Spatial-Variant Correction and Denoising in Multiple-Input Multiple-Output Radar
* RawHDR: High Dynamic Range Image Reconstruction from a Single Raw Image
* Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation
* Ray-Patch: An Efficient Querying for Light Field Transformers
* RbA: Segmenting Unknown Regions Rejected by All
* RCA-NOC: Relative Contrastive Alignment for Novel Object Captioning
* RCD-SGD: Resource-Constrained Distributed SGD in Heterogeneous Environment Via Submodular Partitioning
* RCV2023 Challenges: Benchmarking Model Training and Inference for Resource-Constrained Deep Learning
* Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection
* Re-Parameterized Vision Transformer (ReVT) for Domain-Generalized Semantic Segmentation, A
* Re-ReND: Real-time Rendering of NeRFs across Devices
* Re:PolyWorld - A Graph Neural Network for Polygonal Scene Parsing
* ReactioNet: Learning High-order Facial Behavior from Universal Stimulus-Reaction by Dyadic Relation Reasoning
* Read-only Prompt Optimization for Vision-Language Few-shot Learning
* Real Aperture Radar Angular Super-Resolution Imaging Using Modified Smoothed L0 Norm with a Regularization Strategy
* real-time critical-scenario-generation framework for defect detection of autonomous driving system, A
* Real-Time Neural Rasterization for Large Scenes
* Real-Time Optimisation-Based Path Planning for Visually Impaired People in Dynamic Environments
* Real-Time Synchronous Acquisition and Processing of Signal in Coherent Doppler Wind Lidar Using FPGA
* RealGraph: A Multiview Dataset for 4D Real-world Context Graph Generation
* Realistic Full-Body Tracking from Sparse Observations via Joint-Level Modeling
* REAP: A Large-Scale Realistic Adversarial Patch Benchmark
* Recent Advances in Rate Control: From Optimization to Implementation and Beyond
* RECIST-Induced Reliable Learning: Geometry-Driven Label Propagation for Universal Lesion Segmentation
* Recognition Model Incorporating Geometric Relationships of Ship Components, A
* Reconciling Object-Level and Global-Level Objectives for Long-Tail Detection
* ReconFormer: Accelerated MRI Reconstruction Using Recurrent Transformer
* Reconstructed Convolution Module Based Look-Up Tables for Efficient Image Super-Resolution
* Reconstructing Groups of People with Hypergraph Relational Reasoning
* Reconstructing Interacting Hands with Interaction Prior from Monocular Images
* Reconstructing Pruned Filters using Cheap Spatial Transformations
* Reconstructing Snow Cover under Clouds and Cloud Shadows by Combining Sentinel-2 and Landsat 8 Images in a Mountainous Region
* Reconstruction of 3D Interaction Models from Images using Shape Prior
* Reconstruction of Continuous High-Resolution Sea Surface Temperature Data Using Time-Aware Implicit Neural Representation
* Recovering a Molecule's 3D Dynamics from Liquid-phase Electron Microscopy Movies
* RecRecNet: Rectangling Rectified Wide-Angle Images by Thin-Plate Spline Model and DoF-based Curriculum Learning
* Recursive Video Lane Detection
* RecursiveDet: End-to-End Region-based Recursive Object Detection
* RED-PSM: Regularization by Denoising of Partially Separable Models for Dynamic Imaging
* Reducing Training Time in Cross-Silo Federated Learning using Multigraph Topology
* Ref-NeuS: Ambiguity-Reduced Neural Implicit Surface Learning for Multi-View Reconstruction with Reflection
* RefEgo: Referring Expression Comprehension Dataset from First-Person Perception of Ego4D
* Reference-guided Controllable Inpainting of Neural Radiance Fields
* Referring Image Segmentation Using Text Supervision
* ReFit: Recurrent Fitting Network for 3D Human Recovery
* Refraction Correction Based on ATL03 Photon Parameter Tracking for Improving ICESat-2 Bathymetry Accuracy
* ReGen: A good Generative zero-shot video classifier should be Rewarded
* RegFormer: An Efficient Projection-Aware Transformer Network for Large-Scale Point Cloud Registration
* Region-Aware Portrait Retouching With Sparse Interactive Guidance
* Regional Accuracy Assessment of 30-Meter GLC_FCS30, GlobeLand30, and CLCD Products: A Case Study in Xinjiang Area
* Registration-enhanced multiple instance learning for cervical cancer whole slide image classification
* Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-trained Vision-Language Models
* Regularized Primitive Graph Learning for Unified Vector Mapping
* Rehearsal-Free Domain Continual Face Anti-Spoofing: Generalize More and Forget Less
* Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement
* Reinforced Disentanglement for Face Swapping without Skip Connection
* Reinforcement learning for instance segmentation with high-level priors
* Reinforcement learning with space carving for plant scanning
* Relational Prior Knowledge Graphs for Detection and Instance Segmentation
* ReLeaPS: Reinforcement Learning-based Illumination Planning for Generalized Photometric Stereo
* Reliable Event Generation With Invertible Conditional Normalizing Flow
* Relightify: Relightable 3D Faces from a Single Image via Diffusion Models
* Remembering Normality: Memory-guided Knowledge Distillation for Unsupervised Anomaly Detection
* ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model
* Remote Sensing Analysis for Vegetation Assessment of a Large-Scale Constructed Wetland Treating Produced Water Polluted with Oil Hydrocarbons
* Remote Sensing Estimation of CDOM and DOC with the Environmental Implications for Lake Khanka
* Remote Sensing Image Retrieval Algorithm for Dense Data
* Remote Sensing of Aerosols and Water-Leaving Radiance from Chinese FY-3/MERSI Based on a Simultaneous Method
* Remote ultrasound real-time consultation and quality control system
* Remotely Sensed Agroclimatic Classification and Zoning in Water-Limited Mediterranean Areas towards Sustainable Agriculture
* Removing Anomalies as Noises for Industrial Defect Localization
* RenderIH: A Large-scale Synthetic Dataset for 3D Interacting Hand Pose Estimation
* Rendering Humans from Object-Occluded Monocular Videos
* ReNeRF: Relightable Neural Radiance Fields with Nearfield Lighting
* Repetition-aware Image Sequence Sampling for Recognizing Repetitive Human Actions
* Replay: Multi-modal Multi-view Acted Videos for Casual Holography
* RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers
* Representation Disparity-aware Distillation for 3D Object Detection
* Representation transfer and data cleaning in multi-views for text simplification
* Research on prediction method of expressway section traffic flow considering anomaly detection probability
* Reshaping Leaf-Level Reflectance Data for Plant Species Discrimination: Exploring Image Shape's Impact on Deep Learning Results
* Residual Pattern Learning for Pixel-wise Out-of-Distribution Detection in Semantic Segmentation
* Response of Vegetation Phenology to Climate Change on the Tibetan Plateau Considering Time-Lag and Cumulative Effects
* Responses of Ozone to the Solar Eclipse on the 21st of June 2020 in the Mesosphere and Upper Stratosphere, The
* ResQ: Residual Quantization for Video Perception
* ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking
* Restoration of Bone-Conducted Speech With U-Net-Like Model and Energy Distance Loss
* Rethinking Amodal Video Segmentation from Learning Supervised Signals with Object-centric Representation
* Rethinking Data Distillation: Do Not Overlook Calibration
* Rethinking Fast Fourier Convolution in Image Inpainting
* Rethinking Mobile Block for Efficient Attention-based Models
* Rethinking Multi-Contrast MRI Super-Resolution: Rectangle-Window Cross-Attention Transformer and Arbitrary-Scale Upsampling
* Rethinking Object Saliency Ranking: A Novel Whole-Flow Processing Paradigm
* Rethinking Point Cloud Registration as Masking and Reconstruction
* Rethinking pose estimation in crowds: overcoming the detection information bottleneck and ambiguity
* Rethinking precision of pseudo label: Test-time adaptation via complementary learning
* Rethinking Range View Representation for LiDAR Segmentation
* Rethinking Safe Semi-supervised Learning: Transferring the Open-set Problem to A Close-set One
* Rethinking the Role of Pre-Trained Networks in Source-Free Domain Adaptation
* Rethinking Video Frame Interpolation from Shutter Mode Induced Degradation
* Rethinking Vision Transformers for MobileNet Size and Speed
* Retinal vessel segmentation to diagnose diabetic retinopathy using fundus images: A survey
* Retinexformer: One-stage Retinex-based Transformer for Low-light Image Enhancement
* Retrieval of Desert Microwave Land Surface Emissivity Based on Machine Learning Algorithms
* Retrieval of Subsurface Velocities in the Southern Ocean from Satellite Observations
* Retrieving-to-Answer: Zero-Shot Video Question Answering with Frozen Large Language Models
* Retro-FPN: Retrospective Feature Pyramid Network for Point Cloud Semantic Segmentation
* Retrospect to Multi-prompt Learning across Vision and Language, A
* Revisit PCA-based technique for Out-of-Distribution Detection
* Revisiting Domain-Adaptive 3D Object Detection by Reliable, Diverse and Class-balanced Pseudo-Labeling
* Revisiting Foreground and Background Separation in Weakly-supervised Temporal Action Localization: A Clustering-based Approach
* Revisiting Fully Convolutional Geometric Features for Object 6D Pose Estimation
* Revisiting Generalizability in Deepfake Detection: Improving Metrics and Stabilizing Transfer
* Revisiting Kernel Temporal Segmentation as an Adaptive Tokenizer for Long-form Video Understanding
* Revisiting Scene Text Recognition: A Data Perspective
* Revisiting the Parameter Efficiency of Adapters from the Perspective of Precision Redundancy
* RFD-ECNet: Extreme Underwater Image Compression with Reference to Feature Dictionary
* RFLA: A Stealthy Reflected Light Adversarial Attack in the Physical World
* RheumaVIT: transformer-based model for Automated Scoring of Hand Joints in Rheumatoid Arthritis
* Rich Action-Semantic Consistent Knowledge for Early Action Prediction
* Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis
* RICO: Regularizing the Unobservable for Indoor Compositional Reconstruction
* Riemannian SPD learning to represent and characterize fixational oculomotor Parkinsonian abnormalities
* RIGID: Recurrent GAN Inversion and Editing of Real Face Videos
* Risk Assessment and Analysis of Its Influencing Factors of Debris Flows in Typical Arid Mountain Environment: A Case Study of Central Tien Shan Mountains, China
* Risk-averse perimeter control for alleviating the congestion of an urban traffic network system with uncertainties
* risk-based driver behaviour model, A
* RLIPv2: Fast Scaling of Relational Language-Image Pre-training
* RLSAC: Reinforcement Learning enhanced Sample Consensus for End-to-End Robust Estimation
* RMP-Loss: Regularizing Membrane Potential Distribution for Spiking Neural Networks
* RNVE: A Real Nighttime Vision Enhancement Benchmark and Dual-Stream Fusion Network
* Road damage detection with bounding box and generative adversarial networks based augmentation methods
* Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
* Robust AMD Stage Grading with Exclusively OCTA Modality Leveraging 3D Volume
* Robust Asymmetric Loss for Multi-Label Long-Tailed Learning
* Robust Cross-Drone Multi-Target Association Using 3D Spatial Consistency
* Robust e-NeRF: NeRF from Sparse & Noisy Events under Non-Uniform Motion
* Robust Evaluation of Diffusion-Based Adversarial Purification
* Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes
* Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering
* Robust Heterogeneous Federated Learning under Data Corruption
* Robust Index Based on Phenological Features to Extract Sugarcane from Multisource Remote Sensing Data, A
* Robust Mixture-of-Expert Training for Convolutional Neural Networks
* Robust Monocular Depth Estimation under Challenging Conditions
* Robust MSFM Learning Network for Classification and Weakly Supervised Localization
* Robust Multi-Local to Global with Outlier Filtering for Point Cloud Registration, A
* Robust Object Modeling for Visual Tracking
* Robust One-Shot Face Video Re-enactment using Hybrid Latent Spaces of StyleGAN2
* Robust Referring Video Object Segmentation with Cyclic Structural Consensus
* Robust Semantic Segmentation UNCV2023 Challenge Results, The
* Robust Time-of-Arrival-Based Splitting Mean Moving Object Localization
* Robust Vascular Segmentation for Raw Complex Images of Laser Speckle Contrast Based on Weakly Supervised Learning
* Robust Visual Tracking Using Hierarchical Vision Transformer with Shifted Windows Multi-Head Self-Attention
* robust watermarking approach against high-density salt and pepper noise (RWSPN) to enhance medical image security, A
* Robustifying Token Attention for Vision Transformers
* Role-aware Interaction Generation from Textual Description
* ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation
* Room-Object Entity Prompting and Reasoning for Embodied Referring Expression
* Root Pose Decomposition Towards Generic Non-rigid 3D Reconstruction with Monocular Videos
* Rosetta Neurons: Mining the Common Units in a Model Zoo
* Rotation-invariant Hierarchical Segmentation on Poincaré Ball for 3D Point Cloud
* RPEFlow: Multimodal Fusion of RGB-PointCloud-Event for Joint Optical Flow and Scene Flow Estimation
* RPG-Palm: Realistic Pseudo-data Generation for Palmprint Recognition
* RPM 2.0: RF-Based Pose Machines for Multi-Person 3D Pose Estimation
* RRc-UNet 3D for lung tumor segmentation from CT scans of Non-Small Cell Lung Cancer patients
* RSFNet: A White-Box Image Retouching Approach using Region-Specific Color Filters
* RV-VAE: Integrating Random Variable Algebra into Variational Autoencoders
* S-TREK: Sequential Translation and Rotation Equivariant Keypoints for local feature extraction
* S-VolSDF: Sparse Multi-View Stereo Regularization of Neural Implicit Surfaces
* S2RF: Semantically Stylized Radiance Fields
* S3IM: Stochastic Structural SIMilarity and Its Unreasonable Effectiveness for Neural Fields
* SA-BEV: Generating Semantic-Aware Bird's-Eye-View Feature for Multi-view 3D Object Detection
* SAFARI: Versatile and Efficient Evaluations for Robustness of Interpretability
* SAFE: Machine Unlearning With Shard Graphs
* SAFE: Sensitivity-Aware Features for Out-of-Distribution Object Detection
* SAFL-Net: Semantic-Agnostic Feature Learning Network with Auxiliary Plugins for Image Manipulation Detection
* SAGA: Spectral Adversarial Geometric Attack on 3D Meshes
* SAL-ViT: Towards Latency Efficient Private Inference on ViT using Selective Attention Search with a Learnable Softmax Approximation
* SALAD: Part-Level Latent Diffusion for 3D Shape Generation and Manipulation
* Saliency Regularization for Self-Training with Partial Annotations
* Saliency-guided stairs detection on wearable RGB-D devices for visually impaired persons with Swin-Transformer
* Salient Object Detection Toward Single-Pixel Imaging
* Salient Object Detection With Edge-Guided Learning and Specific Aggregation
* Salient-Boundary-Guided Pseudo-Pixel Supervision for Weakly-Supervised Semantic Segmentation
* SAM-Adapter: Adapting Segment Anything in Underperformed Scenes
* Sample-adaptive Augmentation for Point Cloud Recognition Against Real-world Corruptions
* Sample-wise Label Confidence Incorporation for Learning with Noisy Labels
* Sample4Geo: Hard Negative Sampling For Cross-View Geo-Localisation
* SAMPLING: Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image
* Sat2Density: Faithful Density Learning from Satellite-Ground Image Pairs
* Satellite Estimation of pCO2 and Quantification of CO2 Fluxes in China's Chagan Lake in the Context of Climate Change
* SATHUR: Self Augmenting Task Hallucinal Unified Representation for Generalized Class Incremental Learning
* SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding
* SATR: Zero-Shot Semantic Segmentation of 3D Shapes
* SAtUNet: Series atrous convolution enhanced U-Net for lung nodule segmentation
* Save the Tiny, Save the All: Hierarchical Activation Network for Tiny Object Detection
* SC2GAN: Rethinking Entanglement by Self-correcting Correlated GAN Space
* SC3K: Self-supervised and Coherent 3D Keypoints Estimation from Rotated, Noisy, and Decimated Point Cloud Data
* Scalable Diffusion Models with Transformers
* Scalable Face Image Coding via StyleGAN Prior: Toward Compression for Human-Machine Collaborative Vision
* Scalable MAV Indoor Reconstruction with Neural Implicit Surfaces
* Scalable Multi-Temporal Remote Sensing Change Data Generation via Simulating Stochastic Change Process
* Scalable Retrieval of Similar Landscapes in Optical Satellite Imagery Using Unsupervised Representation Learning
* Scalable Video Object Segmentation with Simplified Framework
* Scale-Aware Modulation Meet Transformer
* Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning
* Scales of oblique photographs updated
* Scaling Data Generation in Vision-and-Language Navigation
* SCANet: Scene Complexity Aware Network for Weakly-Supervised Video Moment Retrieval
* ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes
* Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos
* ScatterNeRF: Seeing Through Fog with Physically-Based Inverse Neural Rendering
* Scene as Occupancy
* Scene Classification Method Based on Multi-Scale Convolutional Neural Network with Long Short-Term Memory and Whale Optimization Algorithm
* Scene Graph Contrastive Learning for Embodied Navigation
* Scene Matters: Model-based Deep Video Compression
* Scene-Aware Feature Matching
* Scene-Aware Label Graph Learning for Multi-Label Image Classification
* SceneGenie: Scene Graph Guided Diffusion Models for Image Synthesis
* SceneRF: Self-Supervised Monocular 3D Scene Reconstruction with Radiance Fields
* Scenimefy: Learning to Craft Anime Scene via Semi-Supervised Image-to-Image Translation
* SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap
* Score Priors Guided Deep Variational Inference for Unsupervised Real-World Single Image Denoising
* Score-Based Diffusion Models as Principled Priors for Inverse Imaging
* SCoTTi: Save Computation at Training Time with an adaptive framework
* Scratch Each Other's Back: Incomplete Multi-modal Brain Tumor Segmentation Via Category Aware Group Self-Support Learning
* Scratching Visual Transformer's Back with Uniform Attention
* Screening Image Features of Collapsed Buildings for Operational and Rapid Remote Sensing Identification
* ScrollNet: Dynamic Weight Importance for Continual Learning
* SCSC: Spatial Cross-scale Convolution Module to Strengthen both CNNs and Transformers
* SDAT-Former++: A Foggy Scene Semantic Segmentation Method with Stronger Domain Adaption Teacher for Remote Sensing Images
* SDBAD-Net: A Spatial Dual-Branch Attention Dehazing Network Based on Meta-Former Paradigm
* SDSNet: Building Extraction in High-Resolution Remote Sensing Images Using a Deep Convolutional Network with Cross-Layer Feature Information Interaction Filtering
* SE-Swin: An improved Swin-Transformer network of self-ensemble feature extraction framework for image retrieval
* Sea Surface pCO2 Response to Typhoon Wind Pump and Kuroshio Intrusion in the Northeastern South China Sea
* Seal-3D: Interactive Pixel-Level Editing for Neural Radiance Fields
* Search for or Navigate to? Dual Adaptive Thinking for Object Navigation
* Seasonal Variability of Arctic Mid-Level Clouds and the Relationships with Sea Ice from 2003 to 2022: A Satellite Perspective
* See More and Know More: Zero-shot Point Cloud Segmentation via Multi-modal Visual Data
* SeeABLE: Soft Discrepancies and Bounded Contrastive Learning for Exposing Deepfakes
* Seeing Beyond the Patch: Scale-Adaptive Semantic Segmentation of High-resolution Remote Sensing Imagery based on Reinforcement Learning
* SEFD: Learning to Distill Complex Pose and Occlusion
* SegDA: Maximum Separable Segment Mask with Pseudo Labels for Domain Adaptive Semantic Segmentation
* SegGPT: Towards Segmenting Everything In Context
* Segment Anything
* Segment Anything Model (SAM) Assisted Remote Sensing Supervision for Mariculture: Using Liaoning Province, China as an Example
* Segment Every Reference Object in Spatial and Temporal Spaces
* Segmentation of Tubular Structures Using Iterative Training with Tailored Samples
* Segmentation-based Assessment of Tumor-Vessel Involvement for Surgical Resectability Prediction of Pancreatic Ductal Adenocarcinoma
* Segmenting Known Objects and Unseen Unknowns without Prior Knowledge
* SegPrompt: Boosting Open-world Segmentation via Category-level Prompt Learning
* SegRCDB: Semantic Segmentation via Formula-Driven Supervised Learning
* SeiT: Storage-Efficient Vision Training with Tokens Using 1% of Pixel Storage
* Selective Freezing for Efficient Continual Learning
* SelectNAdapt: Support Set Selection for Few-Shot Domain Adaptation
* Self-Adaptive-Filling Deep Convolutional Neural Network Classification Method for Mountain Vegetation Type Based on High Spatial Resolution Aerial Images
* Self-Calibrated Cross Attention Network for Few-Shot Segmentation
* Self-Distillation-Based Polarimetric Image Classification with Noisy and Sparse Labels
* Self-Evolved Dynamic Expansion Model for Task-Free Continual Learning
* Self-Feedback DETR for Temporal Action Detection
* Self-Ordering Point Clouds
* Self-Organizing Pathway Expansion for Non-Exemplar Class-Incremental Learning
* Self-regulating Prompts: Foundational Model Adaptation without Forgetting
* Self-similarity Driven Scale-invariant Learning for Weakly Supervised Person Search
* Self-Supervised Anomaly Detection from Anomalous Training Data via Iterative Latent Token Masking
* Self-Supervised Burst Super-Resolution
* self-supervised causal feature reinforcement learning method for non-invasive hemoglobin prediction, A
* Self-supervised Character-to-Character Distillation for Text Recognition
* Self-supervised Cross-view Representation Reconstruction for Change Captioning
* Self-supervised Hypergraphs for Learning Multiple World Interpretations
* Self-supervised Image Denoising with Downsampled Invariance Loss and Conditional Blind-Spot Network
* Self-supervised knowledge distillation in counterfactual learning for VQA
* Self-supervised Learning of Implicit Shape Representation with Dense Correspondence for Deformable Objects
* Self-supervised Learning to Bring Dual Reversed Rolling Shutter Images Alive
* Self-Supervised Monocular Depth Estimation by Direction-aware Cumulative Convolution Network
* Self-supervised Monocular Depth Estimation: Let's Talk About The Weather
* Self-supervised Monocular Underwater Depth Recovery, Image Restoration, and a Real-sea Video Dataset
* Self-Supervised Multi-Scale Cropping and Simple Masked Attentive Predicting for Lung CT-Scan Anomaly Detection
* Self-Supervised Object Detection from Egocentric Videos
* Self-supervised pre-training for large-scale crop mapping using Sentinel-2 time series
* Self-supervised Pre-training for Mirror Detection
* Self-supervised Semantic Segmentation: Consistency over Transformation
* Self-training and multi-task learning for limited data: Evaluation study on object detection
* SelfGraphVQA: A Self-Supervised Graph Neural Network for Scene-based Question Answering
* Semantic Attention Flow Fields for Monocular Dynamic Scene Decomposition
* Semantic Information in Contrastive Learning
* Semantic Motif Segmentation of Archaeological Fresco Fragments
* Semantic Parsing of Colonoscopy Videos with Multi-Label Temporal Networks
* Semantic RGB-D Image Synthesis
* Semantic Segmentation of Crops and Weeds with Probabilistic Modeling and Uncertainty Quantification
* Semantic Segmentation using Foundation Models for Cultural Heritage: an Experimental Study on Notre-Dame de Paris
* Semantic-Aware Dynamic Parameter for Video Inpainting Transformer
* Semantic-Aware Implicit Template Learning via Part Deformation Consistency
* Semantic-Disentangled Transformer With Noun-Verb Embedding for Compositional Action Recognition
* Semantic-Layout-Guided Image Synthesis for High-Quality Synthetic-Aperature Radar Detection Sample Generation
* Semantically Enhanced Scene Captions with Physical and Weather Condition Changes
* Semantically Structured Image Compression via Irregular Group-Based Decoupling
* Semantics Meets Temporal Correspondence: Self-supervised Object-centric Learning in Videos
* Semantics-Consistent Feature Search for Self-Supervised Visual Representation Learning
* Semantify: Simplifying the Control of 3D Morphable Models using CLIP
* SemARFlow: Injecting Semantics into Unsupervised Optical Flow Estimation for Autonomous Driving
* SeMask: Semantically Masked Transformers for Semantic Segmentation
* Semi-Supervised Deep Learning Framework for Change Detection in Open-Pit Mines Using SAR Imagery, A
* Semi-Supervised Learning via Weight-aware Distillation under Class Distribution Mismatch
* Semi-supervised Quality Evaluation of Colonoscopy Procedures
* Semi-Supervised Semantic Segmentation under Label Noise via Diverse Learning Groups
* Semi-supervised Semantics-guided Adversarial Training for Robust Trajectory Prediction
* Semi-supervised Speech-driven 3D Facial Animation via Cross-modal Encoding
* Sempart: Self-supervised Multi-resolution Partitioning of Image Semantics
* SENSE: Self-Evolving Learning for Self-Supervised Monocular Depth Estimation
* Sensitivities of Vegetation Gross Primary Production to Precipitation Frequency in the Northern Hemisphere from 1982 to 2015
* Sensitivity analysis of AI-based algorithms for autonomous driving on optical wavefront aberrations induced by the windshield
* Sentence Attention Blocks for Answer Grounding
* Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance, A
* SEPAL: Spatial Gene Expression Prediction from Local Graphs
* Sequential Texts Driven Cohesive Motions Synthesis with Natural Transitions
* Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models
* Set-the-Scene: Global-Local Training for Generating Controllable NeRF Scenes
* SFHarmony: Source Free Domain Adaptation for Distributed Neuroimaging Analysis
* SG-Former: Self-guided Transformer with Evolving Token Reallocation
* SGAligner: 3D Scene Alignment with Scene Graphs
* SHACIRA: Scalable HAsh-grid Compression for Implicit Neural Representations
* Shannon Strikes Again! Entropy-based Pruning in Deep Neural Networks for Transfer Learning under Extreme Memory and Computation Budgets
* Shape Analysis of Euclidean Curves under Frenet-Serret Framework
* Shape Anchor Guided Holistic Indoor Scene Understanding
* ShapeScaffolder: Structure-Aware 3D Shape Generation from Text
* Shapley Deep Learning: A Consensus for General-Purpose Vision Systems
* Sharing is Caring: Concurrent Interactive Segmentation and Model Training using a Joint Model
* SHARP Challenge 2023: Solving CAD History and pArameters Recovery from Point clouds and 3D scans. Overview, Datasets, Metrics, and Baselines
* ShaRPy: Shape Reconstruction and Hand Pose Estimation from RGB-D with Uncertainty
* Shatter and Gather: Learning Referring Image Segmentation with Text Supervision
* Shell-Net: A robust deep neural network for the joint segmentation of retinal fragments
* SHERF: Generalizable Human NeRF from a Single Image
* Shift from Texture-bias to Shape-Bias: Edge Deformation-Based Augmentation for Robust Object Recognition
* ShiftNAS: Improving One-shot NAS via Probability Shift
* Ship-Go: SAR Ship Images Inpainting via instance-to-image Generative Diffusion Models
* Short-Long-Term Propagation-Based Video Inpainting
* Short-term origin-destination demand forecasting in bus rapid transit based on dual attentive multi-scale convolutional network
* Shortcut-V2V: Compression Framework for Video-to-Video Translation based on Temporal Redundancy Reduction
* SHOWMe: Benchmarking Object-agnostic Hand-Object 3D Reconstruction
* Shrinking Class Space for Enhanced Certainty in Semi-Supervised Learning
* SiameseNet Based Fine-Grained Semantic Change Detection for High Resolution Remote Sensing Images
* SIDGAN: High-Resolution Dubbed Video Generation via Shift-Invariant Learning
* sigma-Adaptive Decoupled Prototype for Few-Shot Object Detection
* SIGMA: Scale-Invariant Global Sparse Shape Matching
* Sigmoid Loss for Language Image Pre-Training
* Sign Language Translation with Iterative Prototype
* Significant Location Accuracy Changes Resulting from Lightning Detection Networks Deployed on Inclined Terrains
* SilentTrig: An imperceptible backdoor attack against speaker identification with hidden triggers
* SiLK: Simple Learned Keypoints
* SILT: Shadow-aware Iterative Label Tuning for Learning to Detect Shadows from Noisy Labels
* SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning
* Similarity Min-Max: Zero-Shot Day-Night Domain Adaptation
* SimMatchV2: Semi-Supervised Learning with Graph Consistency
* SimNP: Learning Self-Similarity Priors Between Neural Points
* Simoun: Synergizing Interactive Motion-appearance Understanding for Vision-based Reinforcement Learning
* Simple and Effective Out-of-Distribution Detection via Cosine-based Softmax Loss
* Simple and Explainable Method for Uncertainty Estimation using Attribute Prototype Networks, A
* Simple and Generic Framework for Feature Distillation via Channel-wise Transformation, A
* Simple and Robust Framework for Cross-Modality Medical Image Segmentation applied to Vision Transformers, A
* Simple Baselines for Interactive Video Retrieval with Questions and Answers
* Simple Framework for Open-Vocabulary Segmentation and Detection, A
* Simple Recipe to Meta-Learn Forward and Backward Transfer, A
* Simple Signal for Domain Shift, A
* Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection, A
* SimpleClick: Interactive Image Segmentation with Simple Vision Transformers
* Simulating Fluids in Real-World Still Images
* Simulation Framework of Unmanned Aerial Vehicles Route Planning Design and Validation for Landslide Monitoring, A
* SINC: Self-Supervised In-Context Learning for Vision-Language Tasks
* SINC: Spatial Composition of 3D Human Motions for Simultaneous Action Generation
* Single Depth-image 3D Reflection Symmetry and Shape Prediction
* Single Image Deblurring with Row-dependent Blur Magnitude
* Single Image Defocus Deblurring via Implicit Neural Inverse Kernels
* Single Image Reflection Separation via Component Synergy
* Single-Image Simultaneous Destriping and Denoising: Double Low-Rank Property
* Single-Shot Pruning for Pre-trained Models: Rethinking the Importance of Magnitude Pruning
* Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction
* SIRA-PCR: Sim-to-Real Adaptation for 3D Point Cloud Registration
* Size Does Matter: Size-aware Virtual Try-on via Clothing-oriented Transformation Try-on Network
* SKED: Sketch-guided Text-based 3D Editing
* skeletonization algorithm for gradient-based optimization, A
* SkeletonMAE: Graph-based Masked Autoencoder for Skeleton Sequence Pre-training
* SkeleTR: Towards Skeleton-based Action Recognition in the Wild
* Sketch and Text Guided Diffusion Model for Colored Point Cloud Generation
* Skill Transformer: A Monolithic Policy for Mobile Manipulation
* Skin cancer classification using improved transfer learning model-based random forest classifier and golden search optimization
* Skip-Plan: Procedure Planning in Instructional Videos via Condensed Action Space Learning
* SKiT: a Fast Key Information Video Transformer for Online Surgical Phase Recognition
* SlaBins: Fisheye Depth Estimation using Slanted Bins on Road Environments
* SLAN: Self-Locator Aided Network for Vision-Language Understanding
* SLCA: Slow Learner with Classifier Alignment for Continual Learning on a Pre-trained Model
* Sleep disorder identification using wavelet scattering on ECG signals
* Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning
* SMAUG: Sparse Masked Autoencoder for Efficient Video-Language Pre-training
* SMMix: Self-Motivated Image Mixing for Vision Transformers
* Smoothness Similarity Regularization for Few-Shot GAN Adaptation
* SMR: Spatial-Guided Model-Based Regression for 3D Hand Pose and Mesh Reconstruction
* Snow Removal in Video: A New Dataset and A Novel Method
* SOAR: Scene-debiasing Open-set Action Recognition
* Social Diffusion: Long-term Multiple Human Motion Anticipation
* SOCS: Semantically-aware Object Coordinate Space for Category-Level 6D Object Pose Estimation under Large Shape Variations
* SoDaCam: Software-defined Cameras via Single-Photon Imaging
* soft nearest-neighbor framework for continual semi-supervised learning, A
* Softmax Bias Correction for Quantized Generative Models
* SortedAP: Rethinking evaluation metrics for instance segmentation
* Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
* Sound Source Localization is All about Cross-Modal Alignment
* Sounding Video Generator: A Unified Framework for Text-Guided Sounding Video Generation
* Source-free Depth for Object Pop-out
* Source-free Domain Adaptive Human Pose Estimation
* Space Engage: Collaborative Space Supervision for Contrastive-based Semi-Supervised Semantic Segmentation
* Space-time Prompting for Video Class-incremental Learning
* Space-Time Variational Method for Retrieving Upper-Level Vortex Winds from GOES-16 Rapid Scans over Hurricanes, A
* SPACE: Speech-driven Portrait Animation with Controllable Expression
* SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference
* Spacetime Surface Regularization for Neural Dynamic Scene Reconstruction
* Spammer detection on short video applications
* SPARF: Large-Scale Learning of 3D Sparse Radiance Fields from Few Input Images
* Sparse Instance Conditioned Multimodal Trajectory Prediction
* Sparse Linear Concept Discovery Models
* Sparse Mix-Attention Transformer for Multispectral Image and Hyperspectral Image Fusion
* Sparse Point Guided 3D Lane Detection
* Sparse Sampling Transformer with Uncertainty-Driven Ranking for Unified Removal of Raindrops and Rain Streaks
* SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos
* SparseDet: Improving Sparsely Annotated Object Detection with Pseudo-positive Mining
* SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection
* SparseMAE: Sparse Training Meets Masked Autoencoders
* SparseNeRF: Distilling Depth Ranking for Few-shot Novel View Synthesis
* Spatial Distribution of Multiple Atmospheric Pollutants in China from 2015 to 2020
* Spatial Downscaling Framework for SMAP Soil Moisture Based on Stacking Strategy, A
* Spatial Estimation of Daily Growth Biomass in Paddy Rice Field Using Canopy Photosynthesis Model Based on Ground and UAV Observations
* Spatial Heterogeneity of Combined Factors Affecting Vegetation Greenness Change in the Yangtze River Economic Belt from 2000 to 2020
* Spatial Patterns of Turbidity in Cartagena Bay, Colombia, Using Sentinel-2 Imagery
* Spatial Self-Distillation for Object Detection with Inaccurate Bounding Boxes
* Spatial-Aware Token for Weakly Supervised Object Localization
* Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation
* Spatially and Spectrally Consistent Deep Functional Maps
* Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution
* Spatio-Temporal Analysis of Patient-Derived Organoid Videos Using Deep Learning for the Prediction of Drug Efficacy
* Spatio-Temporal Convolution-Attention Video Network
* Spatio-Temporal Crop Aggregation for Video Representation Learning
* Spatio-Temporal Domain Awareness for Multi-Agent Collaborative Perception
* Spatio-temporal Prompting Network for Robust Video Feature Extraction
* Spatiotemporal Analysis and Prediction of Carbon Emissions from Energy Consumption in China through Nighttime Light Remote Sensing
* Spatiotemporal Analysis of Soil Moisture Variability and Its Driving Factor
* Spatiotemporal Characteristics of Actual Evapotranspiration Changes and Their Climatic Causes in China
* Spatiotemporal Evaluation of Regional Land Use Dynamics and Its Potential Ecosystem Impact under Carbon Neutral Pathways in the Guangdong-Hong Kong-Macao Greater Bay Area
* Spatiotemporal Evolution and Rank-Size Pattern of Chinese Urban Settlements
* Special section: Best papers of the international conference on pattern recognition and artificial intelligence (ICPRAI) 2022
* Spectral Graphormer: Spectral Graph-based Transformer for Egocentric Two-Hand Reconstruction using Multi-View Color Images
* Spectral-Spatial Feature Extraction for Hyperspectral Image Classification Using Enhanced Transformer with Large-Kernel Attention
* Spectrum Extension of a Real-Aperture Microwave Radiometer Using a Spectrum Extension Convolutional Neural Network for Spatial Resolution Enhancement
* Spectrum-guided Multi-granularity Referring Video Object Segmentation
* Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video
* Speech4Mesh: Speech-Assisted Monocular 3D Facial Reconstruction for Speech-Driven 3D Facial Animation
* Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution
* SpinCam: High-Speed Imaging via a Rotating Point-Spread Function
* SPLiT: Single Portrait Lighting Estimation via a Tetrad of Face Intrinsics
* SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes
* Spurious Features Everywhere: Large-Scale Detection of Harmful Spurious Features in ImageNet
* SpyroPose: SE(3) Pyramids for Object Pose Distribution Estimation
* SQAD: Automatic Smartphone Camera Quality Assessment and Benchmarking
* SRFormer: Permuted Self-Attention for Single Image Super-Resolution
* SSB: Simple but Strong Baseline for Boosting Performance of Open-Set Semi-Supervised Learning
* SSDA: Secure Source-Free Domain Adaptation
* SSF: Accelerating Training of Spiking Neural Networks with Stabilized Spiking Flow
* SSGAM-Net: A Hybrid Semi-Supervised and Supervised Network for Robust Semantic Segmentation Based on Drone LiDAR Data
* SSIG: A Visually-Guided Graph Edit Distance for Floor Plan Similarity
* SSRL: Self-Supervised Spatial-Temporal Representation Learning for 3D Action Recognition
* Stabilizing Visual Reinforcement Learning via Asymmetric Interactive Cooperation
* Stable Cluster Discrimination for Deep Clustering
* Stable Signature: Rooting Watermarks in Latent Diffusion Models, The
* StableVideo: Text-driven Consistency-aware Diffusion Video Editing
* Stacked deep model-based classification of the multiclass brain hemorrhages in CT scans
* StageInteractor: Query-based Object Detector with Cross-stage Interaction
* Stain Detection Based on Unmanned Aerial Vehicle Hyperspectral Photovoltaic Module
* Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis
* STEERER: Resolving Scale Variations for Counting and Localization via Selective Inheritance Learning
* StegaNeRF: Embedding Invisible Information within Neural Radiance Fields
* step towards understanding why classification helps regression, A
* STEPs: Self-Supervised Key Step Extraction and Localization from Unlabeled Procedural Videos
* Stochastic Segmentation with Conditional Categorical Diffusion Models
* Story Visualization by Online Text Augmentation with Context Memory
* STPF-Net: Short-Term Precipitation Forecast Based on a Recurrent Neural Network
* STPrivacy: Spatio-Temporal Privacy-Preserving Action Recognition
* Strata-NeRF: Neural Radiance Fields for Stratified Scenes
* STRIDE: Street View-based Environmental Feature Detection and Pedestrian Collision Prediction
* Strip-MLP: Efficient Token Interaction for Vision MLP
* Strivec: Sparse Tri-Vector Radiance Fields
* Structural Alignment for Network Pruning through Partial Regularization
* Structural and Geomechanical Analysis of Natural Caves and Rock Shelters: Comparison between Manual and Remote Sensing Discontinuity Data Gathering
* Structural biomarker-based Alzheimer's disease detection via ensemble learning techniques
* Structural Priors Guided Network for the Corneal Endothelial Cell Segmentation
* Structure and Content-Guided Video Synthesis with Diffusion Models
* Structure Invariant Transformation for better Adversarial Transferability
* Structure-Aware Surface Reconstruction via Primitive Assembly
* Structure-Preserving and Illumination-Consistent Cycle Framework for Image Harmonization, A
* Studies on High-Resolution Airborne Synthetic Aperture Radar Image Formation with Pseudo-Random Agility of Interpulse Waveform Parameters
* Studying How to Efficiently and Effectively Guide Models with Explanations
* Studying the Impact of Augmentations on Medical Confidence Calibration
* Studying the Impact of the Geospace Environment on Solar Lithosphere Coupling and Earthquake Activity
* StyleDiffusion: Controllable Disentangled Style Transfer via Diffusion Models
* StyleDomain: Efficient and Lightweight Parameterizations of StyleGAN for One-shot and Few-shot Domain Adaptation
* StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces
* StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation
* StyleLipSync: Style-based Personalized Lip-sync Video Generation
* StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Model
* Suaeda salsa spectral index for Suaeda salsa mapping and fractional cover estimation in intertidal wetlands
* Sub-Ensembles for Fast Uncertainty Estimation in Neural Networks
* Sub-Nanosecond UTC Dissemination Based on BDS-3 PPP-B2b Service
* Sub-Pixel Checkerboard Corner Localization for Robust Vision Measurement
* Subclass-balancing Contrastive Learning for Long-tailed Recognition
* Suitable-Matching Areas' Selection Method Based on Multi-Level Saliency
* Sum-Path-Gain Maximization for IRS-Aided MIMO Communication System via Riemannian Gradient Descent Network
* SUMMIT: Source-Free Adaptation of Uni-Modal Models to Multi-Modal Targets
* Super Augmented Nested Arrays: A New Sparse Array for Improved DOA Estimation Accuracy
* Supervised Homography Learning with Realistic Dataset Generation
* SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection
* Support-Query Mutual Promotion and Classification Correction Network for Few-Shot Object Detection
* Surface Extraction from Neural Unsigned Distance Fields
* Surface Normal Clustering for Implicit Representation of Manhattan Scenes
* Surface Subsidence over a Coastal City Using SBAS-InSAR with Sentinel-1A Data: A Case of Nansha District, China
* SurfsUp: Learning Fluid Simulation for Novel Surfaces
* Surround the Nonlinearity: Inserting Foldable Convolutional Autoencoders to Reduce Activation Footprint
* Surround-View Vision-based 3D Detection for Autonomous Driving: A Survey
* SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving
* Survey of Object Detection for UAVs Based on Deep Learning, A
* SuS-X: Training-Free Name-Only Transfer of Vision-Language Models
* SVDFormer: Complementing Point Cloud via Self-view Augmentation and Self-structure Dual-generator
* SVDiff: Compact Parameter Space for Diffusion Fine-Tuning
* SVQNet: Sparse Voxel-Adjacent Query Network for 4D Spatio-Temporal LiDAR Semantic Segmentation
* SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications
* SwinLSTM: Improving Spatiotemporal Prediction Accuracy using Swin Transformer and LSTM
* Switchable-Encoder-Based Self-Supervised Learning Framework for Monocular Depth and Pose Estimation
* SYENet: A Simple Yet Effective Network for Multiple Low-Level Vision Tasks with Real-time Performance on Mobile Device
* SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling
* Synchronize Feature Extracting and Matching: A Single Branch Framework for 3D Object Tracking
* SynDrone: Multi-modal UAV Dataset for Urban Scenarios
* Synergistic Application of Multiple Machine Learning Algorithms and Hyperparameter Optimization Strategies for Net Ecosystem Productivity Prediction in Southeast Asia
* Synthesizing Diverse Human Motions in 3D Indoor Scenes
* Synthetic Dataset Acquisition for a Specific Target Domain
* Systematic Evaluation of Multi-Resolution ICESat-2 Canopy Height Data: A Case Study of the Taranaki Region
* Systematic Review of Radiative Transfer Models for Crop Yield Prediction and Crop Traits Retrieval, A
* T-FFTRadNet: Object Detection with Swin Vision Transformers from Raw ADC Radar Signals
* Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models
* Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors
* TALL: Thumbnail Layout for Deepfake Video Detection
* Taming Contrast Maximization for Learning Sequential, Low-latency, Event-based Optical Flow
* Tangent Model Composition for Ensembling and Continual Fine-tuning
* Tangent Sampson Error: Fast Approximate Two-view Reprojection Error for Central Camera Models
* TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement
* TARGET: Federated Class-Continual Learning via Exemplar-Free Distillation
* Targeted Adversarial Attacks on Generalizable Neural Radiance Fields
* Task Agnostic Restoration of Natural Video Dynamics
* Task-Aware Adaptive Learning for Cross-domain Few-Shot Learning
* Task-Oriented Multi-Modal Mutual Learning for Vision-Language Models
* TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts
* Taxonomy Adaptive Cross-Domain Adaptation in Medical Imaging via Optimization Trajectory Distillation
* TCOVIS: Temporally Consistent Online Video Instance Segmentation
* Teaching CLIP to Count to Ten
* TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for video Anomaly Detection
* TeleViT: Teleconnection-driven Transformers Improve Subseasonal to Seasonal Wildfire Forecasting
* Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer
* Template Inversion Attack against Face Recognition Systems using 3D Face Reconstruction
* Template-guided Hierarchical Feature Restoration for Anomaly Detection
* Template-guided Illumination Correction for Document Images with Imperfect Geometric Reconstruction
* TEMPO: Efficient Multi-View Pose Estimation, Tracking, and Forecasting
* Temporal Action Segmentation: An Analysis of Modern Techniques
* Temporal and Spatial Evolution Analysis and Correlation Measurement of Urban-Rural Fringes Based on Nighttime Light Data
* Temporal and Spatial Surface Heat Source Variation in the Gurbantunggut Desert from 1950 to 2021
* Temporal Collection and Distribution for Referring Video Object Segmentation
* Temporal DINO: A Self-supervised Video Strategy to Enhance Action Prediction
* Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction
* Temporal-Coded Spiking Neural Networks with Dynamic Firing Threshold: Learning with Event-Driven Backpropagation
* Temporally Consistent Semantic Segmentation using Spatially Aware Multi-view Semantic Fusion for Indoor RGB-D videos
* Tensor Factorization for Leveraging Cross-Modal Knowledge in Data-Constrained Infrared Object Detection
* Terrestrial Laser Scanning-Based Method for Indoor Geometric Quality Measurement, A
* Tessellating the Latent Space for Non-Adversarial Generative Auto-Encoders
* Test Time Adaptation for Blind Image Quality Assessment
* Test-time Personalizable Forecasting of 3D Human Poses
* Tetra-NeRF: Representing Neural Radiance Fields Using Tetrahedra
* TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models
* Text Position-Aware Pixel Aggregation Network With Adaptive Gaussian Threshold: Detecting Text in the Wild
* Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models
* Text-Driven Generative Domain Adaptation with Spectral Consistency Regularization
* Text2Performer: Text-Driven Human Video Generation
* Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models
* Text2Tex: Text-driven Texture Synthesis via Diffusion Models
* Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
* TextManiA: Enriching Visual Feature by Text-driven Manifold Augmentation
* TextPSG: Panoptic Scene Graph Generation from Textual Descriptions
* Texture Generation on 3D Meshes with Point-UV Diffusion
* Texture Learning Domain Randomization for Domain Generalized Segmentation
* TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition
* Theoretical Analysis of Density Peaks Clustering and the Component-Wise Peak-Finding Algorithm, A
* Theoretical and Numerical Analysis of 3D Reconstruction Using Point and Line Incidences
* Theory of Topological Derivatives for Inverse Rendering of Geometry, A
* Thinking Image Color Aesthetics Assessment: Models, Datasets and Benchmarks
* Three-Dimensional Distribution and Transport Features of Dust and Polluted Dust over China and Surrounding Areas from CALIPSO
* Three-Dimensional Resistivity and Chargeability Tomography with Expanding Gradient and Pole-Dipole Arrays in a Polymetallic Mine, China
* THÖR-Magni: Comparative Analysis of Deep Learning Models for Role-conditioned Human Motion Prediction
* TiDAL: Learning Training Dynamics for Active Learning
* TiDy-PSFs: Computational Imaging with Time-Averaged Dynamic Point-Spread-Functions
* TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
* TIJO: Trigger Inversion with Joint Optimization for Defending Multimodal Backdoored Models
* Tiled Multiplane Images for Practical 3D Photography
* Time-to-Contact Map by Joint Estimation of Up-to-Scale Inverse Depth and Global Motion using a Single Event Camera
* Tiny and Efficient Model for the Edge Detection Generalization
* TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance
* TKIL: Tangent Kernel Optimization for Class Balanced Incremental Learning
* TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration
* TMA: Temporal Motion Aggregation for Event-based Optical Flow
* TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis
* To Adapt or Not to Adapt? Real-Time Adaptation for Semantic Segmentation
* Tobias: A Random CNN Sees Objects
* Token-Label Alignment for Vision Transformers
* Too Large; Data Reduction for Vision-Language Pre-Training
* ToonTalker: Cross-Domain Face Reenactment
* Topo-CXR: Chest X-ray TB and Pneumonia Screening with Topological Machine Learning
* TopoSeg: Topology-Aware Nuclear Instance Segmentation
* TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer
* Total-Recon: Deformable Scene Reconstruction for Embodied View Synthesis
* Toward a European Facility for Ground-Based Radar Observations of Near-Earth Objects
* Toward a Redefinition of Agricultural Drought Periods: A Case Study in a Mediterranean Semi-Arid Region
* Toward more accurate diagnosis of multiple sclerosis: Automated lesion segmentation in brain magnetic resonance image using modified U-Net model
* Toward Multi-Granularity Decision-Making: Explicit Visual Reasoning with Hierarchical Knowledge
* Toward Unsupervised Realistic Visual Question Answering
* Towards Accurate Medical Image Segmentation with Gradient-Optimized Dice Loss
* Towards an Exhaustive Evaluation of Vision-Language Foundation Models
* Towards Attack-tolerant Federated Learning via Critical Parameter Analysis
* Towards Authentic Face Restoration with Iterative Diffusion Models and Beyond
* Towards Automated Regulation of Jacobaea Vulgaris in Grassland using Deep Neural Networks
* Towards Better Robustness against Common Corruptions for Unsupervised Domain Adaptation
* Towards Bridged Vision and Language: Learning Cross-Modal Knowledge Representation for Relation Extraction
* Towards Building More Robust Models with Frequency Bias
* Towards Content-based Pixel Retrieval in Revisited Oxford and Paris
* Towards Deeply Unified Depth-aware Panoptic Segmentation with Bi-directional Guidance Learning
* Towards Effective Instance Discrimination Contrastive Loss for Unsupervised Domain Adaptation
* Towards estimation of human intent in assistive robotic teleoperation using kinaesthetic and visual feedback
* Towards Fair and Comprehensive Comparisons for Image-Based 3D Object Detection
* Towards Fairness-aware Adversarial Network Pruning
* Towards Fixing Clever-Hans Predictors with Counterfactual Knowledge Distillation
* Towards General Low-Light Raw Noise Synthesis and Modeling
* Towards Generic Image Manipulation Detection with Weakly-Supervised Self-Consistency Learning
* Towards Geospatial Foundation Models via Continual Pretraining
* Towards Grand Unified Representation Learning for Unsupervised Visible-Infrared Person Re-Identification
* Towards Hierarchical Regional Transformer-based Multiple Instance Learning
* Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images
* Towards High-Quality Specular Highlight Removal by Leveraging Large-Scale Synthetic Data
* Towards Improved Input Masking for Convolutional Neural Networks
* Towards Inadequately Pre-trained Models in Transfer Learning
* Towards Instance-adaptive Inference for Federated Learning
* Towards Memory- and Time-Efficient Backpropagation for Training Spiking Neural Networks
* Towards Models that Can See and Read
* Towards Multi-Layered 3D Garments Animation
* Towards Nonlinear-Motion-Aware and Occlusion-Robust Rolling Shutter Correction
* Towards Open-Set Test-Time Adaptation Utilizing the Wisdom of Crowds in Entropy Minimization
* Towards Open-Vocabulary Video Instance Segmentation
* Towards Real-World Burst Image Super-Resolution: Benchmark and Method
* Towards Realistic Evaluation of Industrial Continual Learning Scenarios with an Emphasis on Energy Consumption and Computational Footprint
* Towards Robust and Smooth 3D Multi-Person Pose Estimation from Monocular Videos in the Wild
* Towards Robust Model Watermark via Reducing Parametric Vulnerability
* Towards Robust Natural-Looking Mammography Lesion Synthesis on Ipsilateral Dual-Views Breast Cancer Analysis
* Towards Saner Deep Image Registration
* Towards Semi-supervised Learning with Non-random Missing Labels
* Towards Understanding the Generalization of Deepfake Detectors from a Game-Theoretical View
* Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts
* Towards Universal Image Embeddings: A Large-Scale Dataset and Challenge for Generic Image Representations
* Towards Universal LiDAR-Based 3D Object Detection by Multi-Domain Knowledge Transfer
* Towards Unsupervised Domain Generalization for Face Anti-Spoofing
* Towards Viewpoint Robustness in Bird's Eye View Segmentation
* Towards Viewpoint-Invariant Visual Recognition via Adversarial Training
* Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP
* Towards Zero Domain Gap: A Comprehensive Study of Realistic LiDAR Simulation for Autonomy Testing
* Towards Zero-Shot Scale-Aware Monocular Depth Estimation
* TP-NoDe: Topology-aware Progressive Noising and Denoising of Point Clouds towards Upsampling
* Tracing the Influence of Predecessors on Trajectory Prediction
* Tracing the Origin of Adversarial Attack for Forensic Investigation and Deterrence
* TrackFlow: Multi-Object Tracking with Normalizing Flows
* Tracking Anything with Decoupled Video Segmentation
* Tracking by 3D Model Estimation of Unknown Objects in Videos
* Tracking by Natural Language Specification with Long Short-term Context Decoupling
* Tracking Everything Everywhere All at Once
* Tracking the Vegetation Change Trajectory over Large-Surface Coal Mines in the Jungar Coalfield Using Landsat Time-Series Data
* Tracking With Saliency Region Transformer
* Tracking without Label: Unsupervised Multiple Object Tracking via Contrastive Similarity Learning
* Traffic Mirror Detection and Annotation Methods from Street Images of Open Data for Preventing Accidents at Intersections by Alert
* TrainFors: A Large Benchmark Training Dataset for Image Manipulation Detection and Localization
* Training Methods of Multi-Label Prediction Classifiers for Hyperspectral Remote Sensing Images
* Traj-MAE: Masked Autoencoders for Trajectory Prediction
* Trajectory Unified Transformer for Pedestrian Trajectory Prediction
* Trajectory-Prediction with Vision: A Survey
* TrajectoryFormer: 3D Object Tracking Transformer with Predictive Trajectory Hypotheses
* TrajPAC: Towards Robustness Verification of Pedestrian Trajectory Prediction Models
* TransCNNLoc: End-to-end pixel-level learning for 2D-to-3D pose estimation in dynamic indoor scenes
* Transductive Learning With Prior Knowledge for Generalized Zero-Shot Action Recognition
* TransFace: Calibrating Transformer Training for Face Recognition from a Data-Centric Perspective
* Transferable Adversarial Attack for Both Vision Transformers and Convolutional Networks via Momentum Integrated Gradients
* Transferable Decoding with Visual Entities for Zero-Shot Image Captioning
* Transformer-based Detection of Microorganisms on High-Resolution Petri Dish Images
* Transformer-Based Sensor Fusion for Autonomous Driving: A Survey
* Transformer-Based Spatio-Temporal Analysis for Classification of Aortic Stenosis Severity From Echocardiography Cine Series
* Transformers Pay Attention to Convolutions Leveraging Emerging Properties of ViTs by Dual Attention-Image Network
* TransHuman: A Transformer-based Human Representation for Generalizable Neural Human Rendering
* TransIFF: An Instance-Level Feature Fusion Framework for Vehicle-Infrastructure Cooperative 3D Detection with Transformers
* TransInpaint: Transformer-based Image Inpainting with Context Adaptation
* Translating Images to Road Network: A Non-Autoregressive Sequence-to-Sequence Approach
* TransMatch: A Transformer-Based Multilevel Dual-Stream Feature Matching Network for Unsupervised Deformable Image Registration
* Transparent Shape from a Single View Polarization Image
* TransTIC: Transferring Transformer-based Image Compression from Human Perception to Machine Perception
* Treating Pseudo-labels Generation as Image Matting for Weakly Supervised Semantic Segmentation
* Tree Species Classification from Airborne Hyperspectral Images Using Spatial-Spectral Network
* Tree-Structured Shading Decomposition
* Trend Classification of InSAR Displacement Time Series Using SAE-CNN
* Tri-MipRF: Tri-Mip Representation for Efficient Anti-Aliasing Neural Radiance Fields
* Triple-loss driven generative adversarial network for pansharpening
* TripLe: Revisiting Pretrained Model Reuse and Progressive Learning for Efficient Vision Transformer Scaling and Searching
* TRM-UAP: Enhancing the Transferability of Data-Free Universal Adversarial Perturbation via Truncated Ratio Maximization
* Troubleshooting Ethnic Quality Bias with Curriculum Domain Adaptation for Face Image Quality Assessment
* TSOSVNet: Teacher-student collaborative knowledge distillation for Online Signature Verification
* Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation
* Tubelet-Contrastive Self-Supervision for Video-Efficient Generalization
* Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
* Tuning Pre-trained Model via Moment Probing
* Two Birds, One Stone: A Unified Framework for Joint Learning of Image and Video Style Transfers
* Two-in-One Depth: Bridging the Gap Between Monocular and Binocular Self-supervised Depth Estimation
* Two-Level Integrity-Monitoring Method for Multi-Source Information Fusion Navigation
* Two-stage coarse-to-fine method for pathological images in medical decision-making systems
* Two-stage feature distribution rectification for few-shot point cloud semantic segmentation
* TYC Dataset for Understanding Instance-Level Semantics and Motions of Cells in Microstructures, The
* U-RED: Unsupervised 3D Shape Retrieval and Deformation for Partial Point Clouds
* U2D2Net: Unsupervised Unified Image Dehazing and Denoising Network for Single Hazy Image Enhancement
* UATVR: Uncertainty-Adaptive Text-Video Retrieval
* UCF: Uncovering Common Features for Generalizable Deepfake Detection
* UGC: Unified GAN Compression for Efficient Image-to-Image Translation
* UHDNeRF: Ultra-High-Definition Neural Radiance Fields
* UMC: A Unified Bandwidth-efficient and Multi-resolution based Collaborative Perception Framework
* UMFuse: Unified Multi View Fusion for Human Editing applications
* UMIFormer: Mining the Correlations between Similar Tokens for Multi-View 3D Reconstruction
* Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers
* Uncertainty Analysis and Data Fusion of Multi-Source Land Evapotranspiration Products Based on the TCH Method
* Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching
* Uncertainty-aware State Space Transformer for Egocentric 3D Hand Trajectory Forecasting
* Uncertainty-aware Unsupervised Multi-Object Tracking
* Uncertainty-guided Learning for Improving Image Manipulation Detection
* UncLe-SLAM: Uncertainty Learning for Dense Neural SLAM
* Uncovering Plastic Litter Spectral Signatures: A Comparative Study of Hyperspectral Band Selection Algorithms
* Under-Display Camera Image Restoration with Scattering Effect
* Undercover Deepfakes: Detecting Fake Segments in Videos
* Understanding 3D Object Interaction from a Single Image
* Understanding and Accelerating Neural Architecture Search With Training-Free and Theory-Grounded Metrics
* Understanding Hessian Alignment for Domain Generalization
* Understanding Self-attention Mechanism via Dynamical System Perspective
* Understanding the Feature Norm for Out-of-Distribution Detection
* Understanding Video Scenes through Text: Insights from Text-based Video Question Answering
* Unfolding Framework with Prior of Convolution-Transformer Mixture and Uncertainty Estimation for Video Snapshot Compressive Imaging
* Uni-3D: A Universal Model for Panoptic 3D Scene Reconstruction
* Uni-NLX: Unifying Textual Explanations for Vision and Vision-Language Tasks
* UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning
* UniFace: Unified Cross-Entropy Loss for Deep Face Recognition
* Unified Adversarial Patch for Cross-modal Attacks in the Physical World
* Unified Approach for Occlusion Tolerant 3D Facial Pose Capture and Gaze Estimation using MocapNETs, A
* Unified Automatic Plant Cover and Phenology Prediction
* Unified Coarse-to-Fine Alignment for Video-Text Retrieval
* Unified Continual Learning Framework with General Parameter-Efficient Tuning, A
* Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning
* Unified Framework for Robustness on Diverse Sampling Errors, A
* Unified Out-Of-Distribution Detection: A Model-Specific Perspective
* Unified Pre-training with Pseudo Texts for Text-To-Image Person Re-identification
* Unified Transformer Framework for Group-Based Segmentation: Co-Segmentation, Co-Saliency Detection and Video Salient Object Detection, A
* Unified Visual Relationship Detection with Vision and Language Models
* UniFormerV2: Unlocking the Potential of Image ViTs for Video Understanding
* UniFusion: Unified Multi-view Fusion Transformer for Spatial-Temporal Representation in Bird's-Eye-View
* Unify, Align and Refine: Multi-Level Semantic Alignment for Radiology Report Generation
* UniKD: Universal Knowledge Distillation for Mimicking Homogeneous or Heterogeneous Object Detectors
* Unilaterally Aggregated Contrastive Learning with Hierarchical Augmentation for Anomaly Detection
* UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase
* UniT3D: A Unified Transformer for 3D Dense Captioning and Visual Grounding
* UnitedHuman: Harnessing Multi-Source Data for High-Resolution Human Generation
* UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation
* Universal Domain Adaptation via Compressive Attention Matching
* UniverSeg: Universal Medical Image Segmentation
* UniVTG: Towards Unified Video-Language Temporal Grounding
* Unleashing Text-to-Image Diffusion Models for Visual Perception
* Unleashing the Potential of Spiking Neural Networks with Dynamic Confidence
* Unleashing the Power of Gradient Signal-to-Noise Ratio for Zero-Shot NAS
* Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
* UnLoc: A Unified Framework for Video Localization Tasks
* Unlocking Comparative Plant Scoring with Siamese Neural Networks and Pairwise Pseudo Labelling
* Unmasked Teacher: Towards Training-Efficient Video Foundation Models
* Unmasking Anomalies in Road-Scene Segmentation
* Unpaired Multi-domain Attribute Translation of 3D Facial Shapes with a Square and Symmetric Geometric Map
* Unraveling a Decade: A Comprehensive Survey on Isolated Sign Language Recognition
* Unreasonable Effectiveness of Large Language-Vision Models for Source-free Video Domain Adaptation, The
* Unseen and Adverse Outdoor Scenes Recognition Through Event-based Captions
* Unsupervised 3D Perception with 2D Vision-Language Distillation for Autonomous Driving
* Unsupervised Accuracy Estimation of Deep Visual Models using Domain-Adaptive Adversarial Perturbation without Source Samples
* Unsupervised Camouflaged Object Segmentation as Domain Adaptation
* Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models
* Unsupervised Confidence Approximation: Trustworthy Learning from Noisy Labelled Data
* Unsupervised Domain Adaptation for Medical Image Segmentation by Disentanglement Learning and Self-Training
* Unsupervised domain adaptation for SAR target classification based on domain- and class-level alignment: From simulated to real data
* Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features
* Unsupervised Domain Adaptation for Training Event-Based Networks Using Contrastive Learning and Uncorrelated Conditioning
* Unsupervised Domain Adaptive Detection with Network Stability Analysis
* Unsupervised Facial Performance Editing via Vector-Quantized StyleGAN Representations
* Unsupervised Feature Representation Learning for Domain-generalized Cross-domain Image Retrieval
* Unsupervised Image Denoising in Real-World Scenarios via Self-Collaboration Parallel Generative Adversarial Branches
* Unsupervised Learning of Object-Centric Embeddings for Cell Instance Segmentation in Microscopy Images
* Unsupervised Linear Component Analysis for a Class of Probability Mixture Models
* Unsupervised Manifold Linearizing and Clustering
* Unsupervised Object Localization with Representer Point Selection
* Unsupervised Open-Vocabulary Object Localization in Videos
* Unsupervised Pathology Detection: A Deep Dive Into the State of the Art
* Unsupervised Prompt Tuning for Text-Driven Object Detection
* Unsupervised Self-Driving Attention Prediction via Uncertainty Mining and Knowledge Embedding
* Unsupervised Surface Anomaly Detection with Diffusion Probabilistic Model
* Unsupervised Video Deraining with An Event Camera
* Unsupervised Video Object Segmentation with Online Adversarial Self-Tuning
* UpCycling: Semi-supervised 3D Object Detection without Sharing Raw-level Unlabeled Scenes
* UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer
* Upward Robust Steganography Based on Overflow Alleviation
* Urban Radiance Field Representation with Deformable Neural Mesh Primitives
* UrbanGIRAFFE: Representing Urban Scenes as Compositional Generative Neural Feature Fields
* Usage of Airborne LiDAR Data and High-Resolution Remote Sensing Images in Implementing the Smart City Concept
* USAGE: A Unified Seed Area Generation Paradigm for Weakly Supervised Semantic Segmentation
* Use of covariance matrix images for electroencephalography signal classification for multiclass motor imagery-based brain computer interface
* Using a Waffle Iron for Automotive Point Cloud Semantic Segmentation
* Using and Abusing Equivariance
* Using Atmospheric Inverse Modelling of Methane Budgets with Copernicus Land Water and Wetness Data to Detect Land Use-Related Emissions
* Using Downwelling Far- and Thermal-Infrared Hyperspectral Radiance for Cloud Phase Classification in the Antarctic
* Using Keyhole Images to Map Soil Liquefaction Induced by the 1966 Xingtai Ms 6.8 and 7.2 Earthquakes, North China
* Using Large Text To Image Models with Structured Prompts for Skin Disease Identification: A Case Study
* Using normalized echo state network to detect abnormal ECG patterns
* U²-Former: Nested U-Shaped Transformer for Image Restoration via Multi-View Contrastive Learning
* V-FUSE: Volumetric Depth Map Fusion with Long-Range Constraints
* V3Det: Vast Vocabulary Visual Detection Dataset
* VAD: Vectorized Scene Representation for Efficient Autonomous Driving
* VADER: Video Alignment Differencing and Retrieval
* Validating GEDI tree canopy cover product across forest types using co-registered aerial LiDAR data
* Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction
* VAPCNet: Viewpoint-Aware 3D Point Cloud Completion
* Variability of Snow Cover and Its Contribution to Water Resources in the Chinese Altai Mountains from 2000 to 2022, The
* Variance Reduced Domain Randomization for Reinforcement Learning With Policy Gradient
* Variational Causal Inference Network for Explanatory Visual Question Answering
* Variational Degeneration to Structural Refinement: A Unified Framework for Superimposed Image Decomposition
* VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer
* Vegetation Land Segmentation with Multi-Modal and Multi-Temporal Remote Sensing Images: A Temporal Learning Approach and a New Dataset
* Vehicle Detection in UAV Images via Background Suppression Pyramid Network and Multi-Scale Task Adaptive Decoupled Head
* Verbs in Action: Improving verb understanding in video-language models
* VeRi3D: Generative Vertex-based Radiance Fields for 3D Controllable Human Image Synthesis
* Versatile Diffusion: Text, Images and Variations All in One Diffusion Model
* VertexSerum: Poisoning Graph Neural Networks for Link Inference
* Vertically Resolved Global Ocean Light Models Using Machine Learning
* VI-Net: Boosting Category-level 6D Object Pose Estimation via Learning Decoupled Rotations on the Spherical Representations
* Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data, The
* Video Action Recognition with Adaptive Zooming Using Motion Residuals
* Video Action Recognition with Attentive Semantic Units
* Video Action Segmentation via Contextually Refined Temporal Keypoints
* Video Adverse-Weather-Component Suppression Network via Weather Messenger and Adversarial Backpropagation
* Video Anomaly Detection via Sequentially Learning Multiple Pretext Tasks
* Video Attribute Prototype Network: A New Perspective for Zero-Shot Video Classification
* Video Background Music Generation: Dataset, Method and Evaluation
* Video BagNet: short temporal receptive fields increase robustness in long-term action recognition
* Video Frame Interpolation With Many-to-Many Splatting and Spatial Selective Refinement
* Video Object Segmentation-aware Video Frame Interpolation
* Video OWL-ViT: Temporally-consistent open-world localization in video
* Video State-Changing Object Segmentation
* Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving
* Video-and-Language (VidL) models and their cognitive relevance
* Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition
* VideoFlow: Exploiting Temporal Cues for Multi-frame Optical Flow Estimation
* VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs
* View Consistent Purification for Accurate Cross-View Localization
* Viewing Graph Solvability in Practice
* ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding
* Viewset Diffusion: (0-)Image-Conditioned 3D Generative Models from 2D Data
* ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data
* ViLTA: Enhancing Vision-Language Pre-training through Textual Augmentation
* ViM: Vision Middleware for Unified Downstream Transferring
* ViperGPT: Visual Inference via Python Execution for Reasoning
* Virtual perturbations to assess explainability of deep-learning based cell fate predictors
* Virtual Try-On with Pose-Garment Keypoints Guided Inpainting
* Visible-Infrared Person Re-Identification via Semantic Alignment and Affinity Inference
* Vision Grid Transformer for Document Layout Analysis
* Vision HGNN: An Image is More than a Graph of Nodes
* Vision Relation Transformer for Unbiased Scene Graph Generation
* Vision Transformer Adapters for Generalizable Multitask Learning
* Vision-based Monitoring of the Short-term Dynamic Behaviour of Plants for Automated Phenotyping
* Vision-Based Treatment Localization with Limited Data: Automated Documentation of Military Emergency Medical Procedures
* Vision-Based UAV Self-Positioning in Low-Altitude Urban Environments
* Vision-Language Models Performing Zero-Shot Tasks Exhibit Disparities Between Gender Groups
* Visual Explanations via Iterated Integrated Attributions
* Visual Representation-Guided Framework With Global Affinity for Weakly Supervised Salient Object Detection, A
* Visual Traffic Knowledge Graph Generation from Scene Images
* Visualizing and Understanding Contrastive Learning
* Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World
* ViTPose++: Vision Transformer for Generic Body Pose Estimation
* VL-Match: Enhancing Vision-Language Pretraining with Token-Level and Instance-Level Matching
* VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control
* VLMAH: Visual-Linguistic Modeling of Action History for Effective Action Anticipation
* VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation
* VLSlice: Interactive Vision-and-Language Slice Discovery
* Volumetric Fast Fourier Convolution for Detecting Ink on the Carbonized Herculaneum Papyri
* VoroMesh: Learning Watertight Surface Meshes with Voronoi Diagrams
* Vox-E: Text-guided Voxel Editing of 3D Objects
* VQ-InfraTrans: A Unified Framework for RGB-IR Translation with Hybrid Transformer
* VQ3D: Learning a 3D-Aware Generative Model on ImageNet
* VQA Therapy: Exploring Answer Differences by Visually Grounding Answers
* VQA-GNN: Reasoning with Multimodal Knowledge via Graph Neural Networks for Visual Question Answering
* VSCHH 2023: A Benchmark for the View Synthesis Challenge of Human Heads
* Waffling around for Performance: Visual Classification with Random Words and Broad Concepts
* WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction
* Walking Your LiDOG: A Journey Through Multiple Domains for LiDAR Semantic Segmentation
* Wasserstein Expansible Variational Autoencoder for Discriminative and Generative Continual Learning
* Water Quality and Flooding Impact of the Record-Breaking Storm Gloria in the Ebro Delta (Western Mediterranean)
* Water Storage Variations Recovered from Global Navigation Satellite System Network Using Spatial Constraints: A Case Study of the Contiguous United States
* WaterLo: Protect Images from Deepfakes Using Localized Semi-Fragile Watermark
* WaterMask: Instance Segmentation for Underwater Imagery
* WaveIPT: Joint Attention and Flow Alignment in the Wavelet domain for Pose Transfer
* Wavelet scattering- and object detection-based computer vision for identifying dengue from peripheral blood microscopy
* Wavelet Transform Feature Enhancement for Semantic Segmentation of Remote Sensing Images
* WaveNeRF: Wavelet-based Generalizable Neural Radiance Fields
* WDiscOOD: Out-of-Distribution Detection via Whitened Linear Discriminant Analysis
* Weakly Semi-supervised Detector-based Video Classification with Temporal Context for Lung Ultrasound
* Weakly Supervised Learning of Semantic Correspondence through Cascaded Online Correspondence Refinement
* Weakly Supervised Referring Image Segmentation with Intra-Chunk and Inter-Chunk Consistency
* Weakly-supervised 3D Pose Transfer with Keypoints
* Weakly-Supervised Action Localization by Hierarchically-structured Latent Attention Modeling
* Weakly-Supervised Text-driven Contrastive Learning for Facial Behavior Understanding
* Weed Mapping with Convolutional Neural Networks on High Resolution Whole-Field Images
* What can a cook in Italy teach a mechanic in India? Action Recognition Generalisation Over Scenarios and Locations
* What can Discriminator do? Towards Box-free Ownership Verification of Generative Adversarial Networks
* What Can Simple Arithmetic Operations Do for Temporal Modeling?
* What do neural networks learn in image classification? A frequency shortcut perspective
* What does a platypus look like? Generating customized prompts for zero-shot image classification
* What does CLIP know about a red circle? Visual prompt engineering for VLMs
* What Does Really Count? Estimating Relevance of Corner Cases for Semantic Segmentation in Automated Driving
* What If the TV was off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models
* When Do Curricula Work in Federated Learning?
* When Epipolar Constraint Meets Non-local Operators in Multi-View Stereo
* When Layers Play the Lottery, all Tickets Win at Initialization
* When Noisy Labels Meet Long Tail Dilemmas: A Representation Calibration Method
* When Prompt-based Incremental Learning Does Not Meet Strong Pretraining
* When to Learn What: Model-Adaptive Data Augmentation Curriculum
* Which Tokens to Use? Investigating Token Reduction in Vision Transformers
* Who are you referring to? Coreference resolution in image narrations
* Why do networks have inhibitory/negative connections?
* Why Is Prompt Tuning for Vision-Language Models Robust to Noisy Labels?
* Why Rivers Disappear: Remote Sensing Analysis of Postmining Factors Using the Example of the Sztola River, Poland
* Wildfire Burnt Area Severity Classification from UAV-Based RGB and Multispectral Imagery
* Will Large-scale Generative Models Corrupt Future Datasets?
* Window-Based Early-Exit Cascades for Uncertainty Estimation: When Deep Ensembles are More Efficient than Single Models
* Window-based Model Averaging Improves Generalization in Heterogeneous Federated Learning
* Windowed Radon Transform and Tensor Rank-1 Decomposition for Adaptive Beamforming in Ultrafast Ultrasound
* With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning
* WLQRP: A Weighting Link-Quality-Based Routing Protocol for Underwater Sensor Networks
* WOOD: Wasserstein-Based Out-of-Distribution Detection
* Workie-Talkie: Accelerating Federated Learning by Overlapping Computing and Communications via Contrastive Regularization
* X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance
* X-VoE: Measuring eXplanatory Violation of Expectation in Physical Events
* XiNet: Efficient Neural Networks for tinyML
* XMem++: Production-level Video Segmentation From Few Annotated Frames
* XNet: Wavelet-Based Low and High Frequency Fusion Networks for Fully- and Semi-Supervised Semantic Segmentation of Biomedical Images
* XVO: Generalized Visual Odometry via Cross-Modal Self-Training
* Yes, we CANN: Constrained Approximate Nearest Neighbors for local feature-based visual localization
* YOLOBench: Benchmarking Efficient Object Detectors on Embedded Systems
* YOLOv5 based detector for eight different urine particles components on single board computer
* You can have your ensemble and run it too: Deep Ensembles Spread Over Time
* You Never Get a Second Chance To Make a Good First Impression: Seeding Active Learning for 3D Semantic Segmentation
* Your Diffusion Model is Secretly a Zero-Shot Classifier
* Your Input Matters: Comparing Real-Valued PolSAR Data Representations for CNN-Based Segmentation
* Zenseact Open Dataset: A large-scale and diverse multimodal dataset for autonomous driving
* Zero-1-to-3: Zero-shot One Image to 3D Object
* Zero-guidance Segmentation Using Zero Segment Labels
* Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts
* Zero-Shot Composed Image Retrieval with Textual Inversion
* Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer
* Zero-Shot Point Cloud Segmentation by Semantic-Visual Aware Synthesis
* Zero-shot spatial layout conditioning for text-to-image diffusion models
* ZiCo-BC: A Bias Corrected Zero-Shot NAS for Vision Tasks
* Zip-NeRF: Anti-Aliased Grid-Based Neural Radiance Fields
* Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction
* zPROBE: Zero Peek Robustness Checks for Federated Learning
3412 for 2401

Last update: 20-Jan-25 12:09:31
Use price@usc.edu for comments.