2512
* *3D Imaging, Modeling, Processing, Visualization and Transmission
* *Affective Behavior Analysis In-the-Wild
* *AgriVision: Agriculture-Vision: Challenges and Opportunities for Computer Vision in Agriculture
* *AI for Creative Visual Content Generation, Editing and Understanding
* *Autonomous Driving
* *Catch UAVs That Want to Watch You: Detection and Tracking of Unmanned Aerial Vehicle in the Wild
* *Computer Vision for Drug Discovery: Where Are We and What is Beyond?
* *Computer Vision for Microscopy Image Analysis
* *Computer Vision in Sports
* *CVPR
* *Data Driven Autonomous Driving Simulation
* *Distillation of Foundation Models for Autonomous Driving
* *Domain Generalization: Evolution, Breakthroughs, and Future Horizons
* *Efficient and On-Device Generation
* *Efficient Large Vision Models
* *Event-Based Vision
* *Explainable AI for Computer Vision Workshop
* *Exploring the Next Generation of Data
* *Federated Learning for Computer Vision
* *Fine-grained Visual Categorization
* *Foundation Models for V2X-Based Cooperative Autonomous Driving
* *Human Motion Generation
* *Image Matching: Local Features and Beyond
* *LatinX in CV Research
* *Mechanistic Interpretability for Vision
* *Mobile AI
* *Monocular Depth Estimation Challenge
* *Multi-Agent Embodied Intelligent Systems Meet Generative-AI Era: Opportunities, Challenges and Futures
* *Multimodal Algorithmic Reasoning Workshop
* *MVA
* *Navigating the Future: Ensuring Trustworthiness in Multi-Modal Open-World Intelligence
* *New Trends in Image Restoration and Enhancement
* *Open-World 3D Scene Understanding
* *Perception Beyond the Visible Spectrum
* *Pixel-level understanding with Vision Foundation Models
* *Pixel-Level Video Understanding in the Wild Challenge
* *Precognition: Seeing Through the Future
* *ReGenAI: Second Workshop on Responsible Generative AI
* *Rhobin Challenge: Reconstruction of Human-Object Interaction
* *Safe Artificial Intelligence for All Domains
* *Sign Language Recognition, Translation and Production
* *SyntaGen: Harnessing Generative Models for Synthetic Visual Datasets
* *Test-time Scaling for Computer Vision
* *Uncertainty Quantification for Computer Vision
* *Urban Scene Modeling: Where Vision Meets Photogrammetry and Graphics
* *Visual Anomaly and Novelty Detection
* *Visual Odometry and Computer Vision Applications Based on Location Clues
* *What is Next in Multimodal Foundation Models?
* *Women in Computer Vision
* *Workshop of Adversarial Machine Learning on Computer Vision: Foundation Models + X
* *Workshop on Foundation and Large Vision Models in Remote Sensing
* 0.96 pJ/SOP Heterogeneous Neuromorphic Chip Toward Energy-Efficient Edge Visual Applications, A
* 2D-CFAR Target Detection Method in Sea Clutter Based on Copula Theory Using Dual-Observation Channels, A
* 360-GS: Layout-Guided Panoramic Gaussian Splatting for Indoor Roaming
* 3D Face Reconstruction From Radar Images
* 3D Hand Pose Estimation via Articulated Anchor-to-Joint 3D Local Regressors
* 3D High-Resolution Seismic Imaging of Elusive Seismogenic Faults: The Pantano-Ripa Rossa Fault, Southern Italy
* 3D MedDiffusion: A 3D Medical Latent Diffusion Model for Controllable and High-Quality Medical Image Generation
* 3D Object Reconstruction Through Integration of Hyperspectral and RGB-D Imaging
* 3D Reconstruction with Spatial Memory
* 3D Whole-Body Grasp Synthesis with Directional Controllability
* 3D-Aided Pedestrian Representation Learning for Video-Based Person Re-Identification
* 3D-GPT: Procedural 3D Modeling with Large Language Models
* 3Diface: Synthesizing and Editing Holistic 3D Facial Animation
* 3DPortraitGAN: Learning One-Quarter Headshot 3D GANs From a Single-View Portrait Dataset With Diverse Body Poses
* 3rd Multi-Modal Aerial View Image Challenge: Sensor Domain Translation - PBVS 2025
* 4D-Editor: Interactive Object-Level Editing in Dynamic Neural Radiance Fields via Semantic Distillation
* 4DStyleGaussian: Generalizable 4D style transfer with Gaussian splatting
* 4th Multi-Modal Aerial View Image Challenge: SAR Classification - PBVS 2025
* 6D Pose Estimation of Novel Objects: A Survey
* Medical Image Encryption (H4)
* UAV Object Detection (H3)
* A2-GNN: Angle-Annular GNN for Visual Descriptor-Free Camera Relocalization
* AASTFNet: An Attention-Aware Spatial-Temporal Fusion Network for Enhanced Pain Intensity Evaluation in Facial Image
* ABENet: Attention-based bidirectional enhancement network for collaborative camouflaged object detection
* ABFE-Net: Attention-Based Feature Enhancement Network for Few-Shot Point Cloud Classification
* Absolute Radiometric Calibration Evaluation of Uncrewed Aerial System (UAS) Headwall and MicaSense Sensors and Improving Data Quality Using the Empirical Line Method
* Accelerating Computation for Estimating Land Surface Temperature: An Efficient Global-Local Regression (EGLR) Framework
* Accounting for 10 m Resolution Mapping for Above-Ground Biomass of Urban Trees in C40 Cities Across Eurasia Continent
* ACLI: A CNN Pruning Framework Leveraging Adjacent Convolutional Layer Interdependence and gamma-Weakly Submodularity
* Action adversarial robust reinforcement learning via Stackelberg game
* Action Anticipation from Soccernet Football Video Broadcasts
* Action Valuation in Sports: A Survey
* Active Learning for Multiple Target Models
* Active-Disturbance-Rejection Cooperative Control for Multi-Train System With Constraints and Collision Avoidance
* ActNAS: Generating Efficient YOLO Models Using Activation NAS
* Ad-MNet with FConv: FPGA-enabled advanced MobileNet model with fast convolution accelerator for image resolution and quality enhancement
* ADA-Track++: End-to-End Multi-Camera 3D Multi-Object Tracking With Alternating Detection and Association
* AdaPL: Adaptive Pseudo Labeling for deep active learning in image classification
* Adaptive Clustering and Frequency Division Network for Efficient Monocular Depth Estimation
* Adaptive Depth-Converted-Scale Convolution for Self-Supervised Monocular Depth Estimation
* Adaptive Evidential Fusion of Light-Dark Features With Multi-Scan Mamba for Automated Macular Edema Diagnosis
* Adaptive Far-Field Region of Interest Extraction and its Applications for Long-Range Ground Surveillance
* Adaptive integration of textual context and visual embeddings for underrepresented vision classification
* Adaptive latent disease state learning for multimodal Alzheimer's disease biomarker detection with missing modalities
* Adaptive Multi-Feature Fusion Algorithm for Ship Rust Detection on Coating Surfaces
* Adaptive multi-view consistency clustering via structure-enhanced contrastive learning
* Adaptive Part Shifting for Fine-Grained Ship Classification in Remote Sensing Images
* Adaptive Sequential Phase Optimization Method Based on Coherence Stability Detection and Adjustment Correction, An
* Adaptive Spatio-Temporal Voxel-Based Trajectory Planning and Optimization for Close-Quarters Ships Collision Avoidance
* adaptive weighted active contour based HRNet for underwater image segmentation, An
* Adaptor: Adaptive Token Reduction for Video Diffusion Transformers
* AdaVid: Adaptive Video-Language Pretraining
* ADcFNet-deep learning based facial expression identification using FER vision transformer
* Added Value of Assimilating FY-4B AGRI Water Vapor Radiances on Analyses and Forecasts for 23-7 Heavy Rainfall
* Additive decomposition of one-dimensional signals using Transformers
* ADFNeT: Adaptive decomposition and fusion for color constancy
* Advanced fusion of IoT and AI technologies for smart environments: Enhancing environmental perception and mobility solutions for visually impaired individuals
* Advancements and Challenges in Arabic Optical Character Recognition: A Comprehensive Survey
* Advancements in Affective and Behavior Analysis: The 8th ABAW Workshop and Competition
* Advancing Ambient Lighting Normalization via Diffusion Shadow Generation
* Advancing Disease Detection Using Deep Learning in Low-Data Environments
* Advancing Facial Age Progression for Occluded Faces
* Advancing Hyperspectral LWIR Imaging of Soils with a Controlled Laboratory Setup
* Advancing image super-resolution techniques in remote sensing: A comprehensive survey
* Advancing Positioning, Navigation, and Timing (PNT) Service Using Satellite Navigation Technology
* Advancing Real-World Parking Slot Detection With Large-Scale Dataset and Semi-Supervised Baseline
* Advantage Analysis of Spaceborne SAR Imaging in Very Low Earth Orbit: A Case Study of Haishao-1
* Adversarially Domain-Adaptive Latent Diffusion for Unsupervised Semantic Segmentation
* Aegis: A domain generalization framework for medical image segmentation by mitigating feature misalignment
* Aerial Infrared Health Monitoring of Solar Photovoltaic Farms at Scale
* AerOSeg: Harnessing SAM for Open-Vocabulary Segmentation in Remote Sensing Images
* Affective Embodied Agent for Patient Assistance in Virtual Rehabilitation
* Affinity-aware uncertainty quantification for learning with noisy labels
* AFFusion: Atmospheric scattering enhancement and frequency integrated spatial-channel attention for infrared and visible image fusion
* AG-MAE: Anatomically Guided Spatio-Temporal Masked Auto-Encoder for Online Hand Gesture Recognition
* Age Prediction of Komatsuna using Hu Moments with Neural Networks for Small Datasets
* Agent-Based System for Location Privacy Protection in Location-Based Services, An
* AGFNet: Adaptive Guided Scanning and Frequency-Enhanced Network for High-Resolution Remote Sensing Building Change Detection
* AGILE: A Diffusion-Based Attention-Guided Image and Label Translation for Efficient Cross-Domain Plant Trait Identification
* Agri-FM+: a Self-Supervised Foundation Model for Agricultural Vision
* Agro-Net: a Convolution-Attention Fusion Based Hyperspectral Model for Agro-Food Quality Assessment
* AGS-Mesh: Adaptive Gaussian Splatting and Meshing with Geometric Priors for Indoor Room Reconstruction Using Smartphones
* AI Hiring with LLMs: A Context-Aware and Explainable Multi-Agent Framework for Resume Screening
* AI-Based Hate Speech Detection System Using Video URLs for Effective Content Moderation
* AI-Based Video Content Understanding for Automatic and Interactive Multimedia Retrieval
* Airborne G-Band Water Vapor Radiometer and Dropsonde Validation of Reanalysis and NWP Precipitable Water Vapor over the Korean Peninsula, An
* AirTraj-Diff: Generating Aircraft Trajectory With Conditional Diffusion Probabilistic Model
* Alignment-Condition-Based Iterative Learning Controller for High-Speed Trains With Norm-Bounded Uncertainties, An
* alpha-Surf: Implicit Surface Reconstruction for Semi-Transparent and Thin Objects with Decoupled Geometry and Opacity
* ALSA-UAD: Unsupervised anomaly detection on histopathology images using adversarial learning and simulated anomaly
* AM-ConvBLS: Adaptive Manifold Convolutional Broad Learning System for Cross-Session and Cross-Subject Emotion Recognition
* AM40: Enhancing action recognition through matting-driven interaction analysis
* AMF-UNet: A Lightweight Adaptive Multi-Mamba Fusion U-Shaped Network for Medical Image Segmentation
* Analysis of 2-D Signals With Fast Varying Instantaneous Frequencies: Extending Complex-Lag Time-Frequency Distribution, An
* Analysis of LightGlue Matching for Robust TIN-Based UAV Image Mosaicking
* Analysis of Regional Surface CO2 Fluxes Using the MEGA Satellite Data Assimilation System
* Analysis of Semi-Global Factors Influencing the Prediction of Crash Severity
* Analysis of the Statistical Relationship Between Vertical Ground Displacements and Selected Explanatory Factors: A Case Study of the Underground Gas Storage Area, Kosakowo, Poland
* Analyzing Hierarchical Structure in Vision Models with Sparse Autoencoders
* Analyzing Nighttime Lights Using Multi-Temporal Imagery from Luojia-1 and the International Space Station with In Situ and Land Use Data
* Anatomical foundation models for brain MRIs
* Anisotropic pth-order TV-based Retinex decomposition with adaptive reflectance regularizer for low-light image enhancement
* AnomalyHybrid: A Domain-Agnostic Generative Framework for General Anomaly Detection
* Any-scale Object Detection using Arbitrary-scaled Images
* AppleGrowthVision: A Large-Scale Stereo Dataset for Phenological Analysis, Fruit Detection, and 3D Reconstruction in Apple Orchards
* Application of High-Precision Classification Method Based on Spatiotemporal Stable Samples and Land Use Policy in Oasis-Desert Mosaic Landscape Areas
* Application of MLEM-TV Algorithm in Diffuse Correlation Tomography Blood Flow Imaging
* Application of MLS and UAS-SfM for Beach Management at the North Padre Island Seawall
* Application of Structure from Motion Techniques Using Historical Aerial Images, Orthomosaics, and Aerial LiDAR Point Cloud Datasets for the Investigation of Debris Flow Source Areas
* Approximate 2D-3D Shape Matching for Interactive Applications
* APSAM: Adaptive Progressive Learning for Segment Anything Model in anomaly detection
* APTNet: Attention-refined progressive three-branch network for real-time semantic segmentation
* Arbitrary-Scale Planetary Remote Sensing Super-Resolution via Adaptive Frequency-Spatial Neural Operator
* ARC-Flow: Articulated, Resolution-Agnostic, Correspondence-Free Matching and Interpolation of 3D Shapes Under Flow Fields
* ARC-NeRF: Area Ray Casting for Broader Unseen View Coverage in Few-Shot Object Rendering
* ARDGen: Augmentation Regularization for Domain-Generalized Medical Report Generation
* Are Sport Clubs Mediating Urban Expressive Crimes?: London as the Case Study
* Are Vision-Language Models Ready for Dietary Assessment? Exploring the Next Frontier in AI-Powered Food Image Recognition
* Are We There Yet? A Brief Survey of Music Emotion Prediction Datasets, Models and Outstanding Challenges
* ARF: Arbitrary Routing Framework for All-in-One Image Restoration
* ARISE: Explainable Multimodal Aggressive Driving Detection via Driver State and Environment Perception
* ARSGaussian: 3D Gaussian Splatting with LiDAR for aerial remote sensing novel view synthesis
* Artificial Intelligence in CT-Based Diagnosis of Small Pulmonary Nodules: Current Applications and Future Perspectives
* Artificial Intelligence-Based Brain Tumor Segmentation Using Adaptive Hybrid CNN and Classification by Multi-Scale Dilated MobileNet with Attention Mechanism for MRI Images
* ASROT: A Novel Resampling Algorithm to Balance Training Datasets for Classification of Minor Crops in High-Elevation Regions
* Assessing demographic bias in brain age prediction models using multiple deep learning paradigms
* Assessing Surface Water Dynamics of Wetlands in Reclaimed Mining Areas in the Athabasca Oil Sands Region, Alberta, Canada, with Time-Varying Sentinel-1 SAR and Sentinel-2 Multi-Spectral Imagery
* Assessing the Consistency Among Three Mascon Solutions and COST-G-Based Grid Products for Characterizing Antarctic Ice Sheet Mass Change
* Assessing Wildfire Impacts from the Perspectives of Social and Ecological Remote Sensing
* Assessment of Instrument Performance of the FY3E/JTSIM/DARA Radiometer Through the Analysis of TSI Observations
* Assessment of Multiple Satellite Precipitation Products over Italy
* AthletePose3D: A Benchmark Dataset for 3D Human Pose Estimation and Kinematic Validation in Athletic Movements
* Atmospheric Correction Inter-Comparison eXercise, ACIX-III Land: An Assessment of Atmospheric Correction Processors for EnMAP and PRISMA over Land
* Attacking Attention of Foundation Models Disrupts Downstream Tasks
* Attention-Aware Temporal Adversarial Shadows on Traffic Sign Sequences
* Attention-Based Shape-Deformation Networks for Artifact-Free Geometry Reconstruction of Lumbar Spine From MR Images
* Attention-Guided Hierarchical Defense for Multimodal Attacks in Vision-Language Models
* AttentiveGRU: Recurrent Spatio-Temporal Modeling for Advanced Radar-Based BEV Object Detection
* ATTSF-Net: Attention-Based Similarity Fusion Network for Audio-Visual Emotion Recognition
* Audio-Visual Emotion Classification Using Reinforcement Learning-Enhanced Particle Swarm Optimisation
* Augmented Reality Applications using Active Markers with an Event Camera
* Automated 3D Building Model Reconstruction from Satellite Images Using Two-Stage Polygon Decomposition and Adaptive Roof Fitting
* Automated 3D Reconstruction of Interior Structures from Unstructured Point Clouds
* Automated Essential Concept Discovery for Few-Shot Out-of-Distribution Detection
* Automated Lithology Segmentation of 3D Point Clouds from Highwalls Using Deep Learning
* Automated recognition of humerus anomalies with convolutional neural networks
* Automated Workflow for Generating 3D Solids from Indoor Point Clouds in a Cadastral Context, An
* Automatic Ghost Noise Labeling for 4D mmWave Radar Data in Underground Mine Environments Using LiDAR as Reference
* Automatic Rating Approach Using Machine Learning and Feature Selection for Finger Tapping in MDS-UPDRS Part III, An
* Automatic Segmentation of Metaplasia in an Endoscopic Decision Support System
* Automotive Cockpit-Driving Integration for Human-Centric Autonomous Driving: A Survey
* Autonomous Multimodal Reasoning via Implicit Chain-of-Vision
* AutoVFX: Physically Realistic Video Editing from Natural Language Instructions
* AV-Lip-Sync+: Leveraging AV-HuBERT to Exploit Multimodal Inconsistency for Deepfake Detection of Frontal Face Videos
* Aware Yet Biased: Investigating Emotional Reasoning and Appraisal Bias in Large Language Models
* AWM-GAN: SAR-to-Optical Image Translation with Adaptive Weight Maps
* Azimuth-Continuously Controllable SAR Image Generation Algorithm Based on GAN, An
* Backdoor defense based on adversarial prediction proximity and contrastive knowledge distillation
* Balancing Accuracy and Efficiency: HWBENet for Water Body Extraction in Complex Rural Landscapes
* Balancing Exploration and Cybersickness: Investigating Curiosity-Driven Behavior in Virtual Environments
* Balancing Privacy and Action Performance: A Penalty-Driven Approach to Image Anonymization
* BAP-DETR: Efficient drone object detection network based on bipartite attentive processing and dual fusion encoder
* Bayesian Posterior Distribution Estimation of Kinetic Parameters in Dynamic Brain PET Using Generative Deep Learning Models
* Bayesian Unsupervised Disentanglement of Anatomy and Geometry for Deep Groupwise Image Registration
* Behind the Magic, MERLIM: Multi-Modal Evaluation Benchmark for Large Image-Language Models
* Benchmark Framework for the Right Atrium Cavity Segmentation From LGE-MRIs, A
* Benchmarking Elevation Plus Land Surface Parameters Finds FathomDEM and Copernicus DEM Win as Best Global DEMs
* Benchmarking Multi-Modal Semantic Segmentation Under Sensor Failures: Missing and Noisy Modality Robustness
* Benchmarking tree instance segmentation of terrestrial laser scanning point clouds
* Benford's Law and Transport Infrastructure: The Analysis of the Main Road Network's Higher-Level Segments in the EU
* Best Linear Unbiased Estimation for 2D and 3D Flow with Event-Based Cameras
* Betsu-Betsu: Multi-View Separable 3D Reconstruction of Two Interacting Objects
* Better Coherence, Better Height: Fusing Physical Models and Deep Learning for Forest Height Estimation from Interferometric SAR Data
* Better early detector for high-performance detection transformer
* Beyond Academic Benchmarks: Critical Analysis and Best Practices for Visual Industrial Anomaly Detection
* Beyond Inserting: Learning Subject Embedding for Semantic-Fidelity Personalized Diffusion Generation
* Beyond Neurofibrillary Tangles: Explainable AI for Microscopic Tauopathy Classification in Immunofluorescence Imaging
* Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model
* Beyond static imaging: A dynamic decision paradigm for robust array-SAR in diverse sensing scenarios
* BFSTAL: Bidirectional Feature Splitting With Cross-Layer Fusion for Temporal Action Localization
* Bias-aware learning for unbiased scene graph generation in remote sensing imagery
* BiasBench: A Reproducible Benchmark for Tuning the Biases of Event Cameras
* Bidirectional Action Sequence Learning for Long-term Action Anticipation with Large Language Models
* Bidirectional Beta-Tuned Diffusion Model
* BiGS: Bidirectional Primitives for Relightable 3D Gaussian Splatting
* BIMA: Bijective Maximum Likelihood Learning Approach to Hallucination Prediction and Mitigation in Large Vision-Language Models
* BiMambaHSI: Bidirectional Spectral-Spatial State Space Model for Hyperspectral Image Classification
* Bimanual Manipulation of Steady-Hand Eye Robots With Adaptive Sclera Force Control: Cooperative Versus Teleoperation Strategies
* Binned MSE for Imbalanced Dust Density Estimation
* Binocular Vision-Based Infrastructure Crack Measurement with Morphological Union Enhancement
* Blind Quality Assessment of Wide-Angle Videos Based on Deformation Representation Learning and Multi-Dimensional Feature Fusion
* Boosting Domain Generalization in Remote Sensing Image Segmentation via Style Mapping and General Prototypical Contrast
* Boosting Geometric Invariants for Discriminative Forensics of Large-Scale Generated Visual Content
* Boosting neural network performance for high dimensional data through random projections
* Boosting SAR ATR Trustworthiness via ERFA: An Electromagnetic Reconstruction Feature Alignment Method
* Boosting Small Object Tracking via Collaborative Detection Transformer
* BORT2: Bi-level optimization for robust target training in multi-source domain adaptation
* Bottom-up building exposure modeling with multimodal earth vision
* Boundary-aware shape recognition using dynamic graph convolutional networks
* boundary-enhanced and target-driven deformable convolutional network for abdominal multi-organ segmentation, A
* BPSAC: A Blockchain and PUF-Enabled Security Architecture With C-IDS for Maritime Intelligent Transportation Systems
* BR-MoE: Blind Multi-Modal Tracking With Route-Dynamic Mixture of Experts
* BRAT: Bidirectional Relative Positional Attention Transformer for Event-Based Eye Tracking
* Breaking a New Image Cryptosystem From Three Perspectives
* Breaking Players' Expectations: The Role of Non-Player Characters' Coherence and Consistency
* Bridging Classical and Modern Computer Vision: PerceptiveNet for Tree Crown Semantic Segmentation
* Bridging Detection and Re-Identification: Evaluating Trustworthiness and Error Propagation in Face Recognition Pipelines
* Bridging Morphology and Molecular Signatures: Multi-Task Deep Learning for Multi-Omics Prediction from Histopathology
* Bridging Self-Supervision and Mechanism of Action Discovery in Morphological Profiling
* Bridging the Modality Gap: Training-Free Adaptation of Vision-Language Models for Remote Sensing via Visual Prototypes
* Bridging Vision Foundation and Vision-Language Models for Open-Vocabulary Semantic Segmentation of UAV Imagery
* Built-Up Surface Ensemble Model for Romania Based on OpenStreetMap, Microsoft Building Footprints, and Global Human Settlement Layer Data Sources Using Triple Collocation Analysis
* Bundle adjustment-based co-registration with high geolocation accuracy for UAV photogrammetry
* Cache-aided cross-modal correlation correction for unsupervised cross-domain text-based person search
* CACP: Context-Aware Copy-Paste to Enrich Image Content for Data Augmentation
* CaddieSet: A Golf Swing Dataset with Human Joint Features and Ball Information
* CadenceRAG: Context-Aware and Dependency-Enhanced Retrieval Augmented Generation for Holistic Video Understanding
* CAETFN: Context Adaptively Enhanced Text-Guided Fusion Network for Multimodal Sentiment Analysis
* CAGS: Open-vocabulary 3D scene understanding with context-aware Gaussian splatting
* Calibration-Free Method for Large-View Classroom People Counting with Object Detection-Based Structure Matching, A
* California Crop Yield Benchmark: Combining Satellite Image, Climate, Evapotranspiration, and Soil Data Layers for County-Level Yield Forecasting of Over 70 Crops
* CamCtrl3D: Single-Image Scene Exploration with Precise 3D Camera Control
* Camera-Only 3D Panoptic Scene Completion for Autonomous Driving Through Differentiable Object Shapes
* CameraHMR: Aligning People with Perspective
* Can Geometry Save Central Views for Sports Field Registration?
* Can Relevance Feedback, Conversational Search and Foundation Models Work Together for Interactive Video Search and Exploration?
* Can Vision-Language Models Understand and Interpret Dynamic Gestures from Pedestrians? Pilot Datasets and Exploration Towards Instructive Nonverbal Commands for Cooperative Autonomous Vehicles
* Canvas: Compositional Generation for Art Painting With Seamless Subject-Driven Infusion
* Capturing Fine-Grained Alignments Improves 3D Affordance Detection
* Capturing the Footsteps of Mobility: A Machine Learning-Based Study on the Relationship Between Streetscape and Consumption Vitality
* CARN: Complexity-Aware Routing Network for Efficient and Adaptive Inference
* Catching the Blackdog Easily: A Convenient Depression Diagnosis Method Based on Audio-Visual Deep Learning
* CatFree3D: Category-Agnostic 3D Object Detection with Diffusion
* Causal Inference-Enhanced UAV Detection and Identification for Low-Altitude Air City Transport
* CCNeXt: An effective self-supervised stereo depth estimation approach
* CDVS: Compressed Domain On Device Memory Efficient 8K Video SlowMo
* CE-NPBG: Connectivity Enhanced Neural Point-Based Graphics for Novel View Synthesis in Autonomous Driving Scenes
* Cell Tracking According to Biological Needs: Strong Mitosis-Aware Multi-Hypothesis Tracker With Aleatoric Uncertainty
* CellRep: Multichannel Image Representation Learning Model
* CerMapp: A Cloud-Based Geospatial Prototype for National Wildlife Disease Surveillance
* CETrack: A Feature-Match-Based Framework for Lesion Tracking in CE Videos
* CFE-PVTSeg: Cross-domain frequency-enhanced pyramid vision transformer segmentation network
* CFMW: Cross-Modality Fusion Mamba for Robust Object Detection Under Adverse Weather
* CFPNet: Improving Lightweight ToF Depth Completion via Cross-Zone Feature Propagation
* Chain of Diagnosis Framework for Accurate and Explainable Radiology Report Generation, A
* ChangeViT: Unleashing plain vision transformers for change detection in remote sensing images
* Channel scaling: An efficient feature representation to enhance the generalization of few-shot learning
* Channel Self-Attention Residual Network: Learning Micro-Expression Recognition Features From Augmented Motion Flow Images
* Channel-aware feature mining network for Visible-Infrared Person Re-identification
* Characterising Active Mobility in Urban Areas Through Street Network Indices
* Characterization and Modelling of Environmental Crime: A Case Study Applied to the Canary Islands (Spain)
* Characterizing the Surface Grain Size Distribution in a Gravel-Bed River Using UAV Optical Imagery and SfM Photogrammetry
* Chest X-Ray Foundation Model With Global and Local Representations Integration
* Choosing 'Right' from Wrong: A Closer Look at Selection Bias in Spatial Multiple-Choice Questions in Large Multimodal Models
* Choreme-Based Spatial Analysis and Tourism Assessment in the Oltenia de sub Munte Geopark, Romania
* CI-TransCNN: A class imbalance hybrid CNN-Transformer Network for facial attribute recognition
* CineMyoPS: Segmenting Myocardial Pathologies From Cine Cardiac MR
* Citygen: Infinite and Controllable City Layout Generation
* Class-Aware Unsupervised Domain Adaptation Framework for Cross-Continental Crop Classification with Sentinel-2 Time Series, A
* Class-mismatched semi-supervised learning from a new perspective
* Classification Drives Geographic Bias in Street Scene Segmentation
* CleanMAP: Distilling Multimodal LLMs for Confidence-Driven Crowdsourced HD Map Updates
* Climate Surpasses Soil Texture in Driving Soil Salinization Alleviation in Arid Xinjiang
* Clinical Stage Prompt Induced Multi-Modal Prognosis
* CLIP can understand depth
* CLIP-Driven with Dynamic Feature Selection and Alignment Network for Referring Remote Sensing Image Segmentation
* CLIP-Guided Cross-Modal Feature Fusion based Few-Shot Learning for Nighttime Pavement Defect Detection
* CLIP-SLA: Parameter-Efficient CLIP Adaptation for Continuous Sign Language Recognition
* Clip4Retrofit: Enabling Real-Time Image Labeling on Edge Devices via Cross-Architecture CLIP Distillation
* CLIPDraw++: Text-to-Sketch Synthesis with Simple Primitives
* CMDiff: Clip-guided multi-dimension mamba diffusion model for low light image enhancement
* CMSNet: A SAM-Enhanced CNN-Mamba Framework for Damaged Building Change Detection in Remote Sensing Imagery
* CoDEx: Combining Domain Expertise for Spatial Generalization in Satellite Image Analysis
* CoE: Deep Coupled Embedding for Non-Rigid Point Cloud Correspondences
* Collaborative feature alignment with global-local fusion for fine-grained sketch-based image retrieval
* Collaborative Feminist Cartography in Geographical Education: Mapping Gender Representation in Street Naming (Las Calles de las Mujeres)
* collaborative spatial-frequency learning network for infrared and visible image fusion, A
* Collective Migration-Inspired Large-Deformation Compensation for Nonrigid Image Registration
* Combining Vision-Language Models and Weak Supervision for Nuanced Vision Classification Tasks
* Common issues and human intervention in object detection from handcrafted features to deep learning: discussion
* Communication-Efficient Federated Multi-View Clustering
* Compact-pol to quad-pol SAR reconstruction via a joint mathematical-physical-constrained multimodal correlation-preserving latent diffusion framework
* Comparative Analysis of Low-Cost Devices for High-Precision Diameter at Breast Height Estimation, A
* Comparative Analysis of Object Detection Algorithms for Bolt Detection: Performance Evaluation of Faster R-CNN, SSD, RetinaNet and YOLOv8n
* Comparative Analysis of Tropospheric Correction Methods for Ground Deformation Monitoring over Mining Area with DS-InSAR
* Comparison Visual Instruction Tuning
* CompCraft: Foreground-Driven Image Synthesis With Customized Layouts
* Compensating for the Incomplete With the Complete: An Efficient Scene Text Detector
* Complex-valued mix transformer for SAR ship detection
* Compositional Distributed Learning for Multi-View Perception: A Maximal Coding Rate Reduction Perspective
* Compositional Generative Model of Unbounded 4D Cities
* Compositional Image-Text Matching and Retrieval by Grounding Entities
* CoMPR: Efficient point cloud dataset condensation via bidirectional matching and point recycling
* Comprehending C codes with LLMs: Effective comment generation through retrieval and reasoning
* Comprehensive Modelling Framework for Identifying Green Infrastructure Layout in Urban Flood Management of the Yellow River Basin, A
* comprehensive review of low-light image enhancement methods, A
* Comprehensive Survey on Datasets for Affective Computing and Mental Disorder, A
* comprehensive survey on magnetic resonance image reconstruction, A
* Compressed Domain Multiframe Processing
* Compression and Rendering of Time-varying Interplanetary Volumes
* Computation Offloading Optimization for Digital Twin Assisted 5G-Enabled Edge Computing Network in Urban Rail Transit
* Computer-aided design of personalized occlusal positioning splints using multimodal 3D data
* Condense loss: Exploiting vector magnitude during person Re-identification training process
* Condimen: Conditional Multi-Person Mesh Recovery
* Conditional Prototype Rectification Prompt Learning
* Confidence-based Adaptive Weighted Boxes Fusion for Multi-Object Tracking of Small Birds
* Confidence-Calibrated Covariate Shift Correction for Few-Shot Classification in Vision-Language Models
* ConMSDMamba: Multi-Scale Dilated Mamba Based on Conformer for Speech Emotion Recognition
* Connected Vehicle Data-Driven Robust Optimization for Traffic Signal Timing: Modeling Traffic Flow Variability and Errors
* conSAMmé: Achieving Consistent Segmentations with SAM
* Considering Sentiment Causes in In-Context Learning for Aspect-Based Sentiment Analysis
* Construction and Visualization of Levels of Detail for High-Resolution LiDAR-Derived Digital Outcrop Models
* Construction of Three-Dimensional Memristor-Enhanced Polynomial Hyperchaotic Map and its Application in Image Security Protection
* Content-Awareness Video Compression for Roadside Surveillance Cameras
* Context-Aware Anomaly Detection of Pedestrian Trajectories in Urban Back Streets Using a Variational Autoencoder
* Context-Aware Contrastive Learning for Virtual IHC Staining With Inconsistent Image Pairs
* Contextual Graph Reconstruction and Emotional Variation Learning for Conversational Emotion Recognition
* Continual face forgery detection based on relation-aware spatial-frequency interaction aggregation and contrastive learning
* Contrast and clustering: Learning neighborhood pair representation for source-free domain adaptation
* Contrastive Attention-Based Network for Self-Supervised Point Cloud Completion
* Contrastive Label Disambiguation for Self-Supervised Terrain Traversability Learning in Off-Road Environments
* Contrastive learning with auxiliary model
* Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting
* Cooperative Driving at Multiple Unsignalized Intersections in Fully Autonomous Driving Scenarios
* Cooperative multi-task learning and reliability assessment for glioma segmentation and IDH genotyping
* Coordinated Multi-Agent Reinforcement Learning Method for Integrating Transit Signal Priority and Speed Guidance Control
* COP-GEN-Beta: Unified Generative Modelling of COPernicus Imagery Thumbnails
* Copy-move forgery detection of social media images using tendency sparsity filtering and variable cluster spectral clustering
* Corn yield estimation under extreme climate stress with knowledge-encoded deep learning
* Correcting Atmospheric Temperature and Vapor Density Profiles of Ground-Based Microwave Radiometer in Diverse Skies by Regression Model and Artificial Neural Network Methods
* Correction of Refraction Effects on Unmanned Aerial Vehicle Structure-from-Motion Bathymetric Survey for Coral Reef Roughness Characterisation
* Could Micro-Expressions Be Quantified? Electromyography Gives Affirmative Evidence
* Coupled Diffusion Models for Metal Artifact Reduction of Clinical Dental CBCT Images
* CoupleFER: Dynamic Cross-Modal Fusion via Prompt Learning for Improved 2D+3D FER
* CPMFFormer: Class-Aware Progressive Multiscale Fusion Transformer for Hyperspectral Image Classification
* Creating Multimodal Interactive Digital Twin Characters From Videos: A Dataset and Baseline
* Critical Factors for the Application of InSAR Monitoring in Ports
* CRLSTM-HEXNET: Hybrid Deep Learning Framework with Harris Hawk Optimization in Multi-Label Classification
* Cross-channel blur invariants of color and multispectral images
* Cross-Context Aggregation for Multi-View Urban Scene and Building Facade Matching
* Cross-Domain Few-Shot Learning Method Based on Fractional Domain Information for Hyperspectral Image Multi-Class Change Detection
* Cross-Domain Hyperspectral Image Classification Based on Bi-Directional Domain Adaptation
* Cross-Domain Land Surface Temperature Retrieval via Strategic Fine-Tuning-Based Transfer Learning: Application to GF5-02 VIMI Imagery
* Cross-domain-aware deep unfolding transformer for hyperspectral image super-resolution
* Cross-Modal Consistency Learning for Sign Language Recognition
* Cross-Modal Deep Feature Fusion Framework Based on Ensemble Learning for Land Use Classification, A
* Cross-Modal Facial Expression Recognition with Global Channel-Spatial Attention: Modal Enhancement and Proportional Criterion Fusion
* Cross-Modal Knowledge Distillation from First-Person Views to Third-Person BEV Maps for Universal Point Goal Navigation
* Cross-modality white matter lesion segmentation by modality de-identification
* Cross-scale coupled attention network for underwater image enhancement
* Cross-Spectral Body Recognition with Side Information Embedding: Benchmarks on LLCM and Analyzing Range-Induced Occlusions on IJB-MDF
* Cross-View Geo-Localization via 3D Gaussian Splatting-Based Novel View Synthesis
* CrossGlue: Cross-Modal Image matching via potential message investigation and visual-gradient message integration
* CSRN: Cross-Sensor Robust Recognition Network for Multi-Modal Aerial View Object Classification
* CTC: Contribution to Classification of Complex Features
* CTIGEN-CDM: Controlled Text-to-Image Generation Using Cropped Diffusion Models
* CTNet: Color transformation network for low-light image enhancement
* Ctrl-Room: Controllable Text-to-3D Room Meshes Generation with Layout Constraints
* Curvature Learning for Generalization of Hyperbolic Neural Networks
* CurvNet: Latent contour representation and iterative data engine for curvature angle estimation
* Cybersecurity Standards Across Industries: A Critical Analysis of Current Practices and Future Directions in the Railway Sector
* Cycle Training with Semi-Supervised Domain Adaptation: Bridging Accuracy and Efficiency for Real-Time Mobile Scene Detection
* CYFLOD: Cyclic Filtering and Loss Damping for Alleviating Noisy Labels in Fine-Grained Visual Classification
* CytoFM: The First Cytology Foundation Model
* D-Feat Occlusions: Diffusion Features for Robustness to Partial Visual Occlusions in Object Recognition
* D2PCFN: Dual domain progressive cross-fusion network for remote sensing image pansharpening
* D3PD: Dual distillation and dynamic fusion for camera-radar 3D perception
* DAF: Distillation, Augmentation and Filtering Based Framework for Efficient Smartphone Human Activity Recognition
* DAFNet: A Dual Attention Fusion-Based Face Super-Resolution Network
* DAR-Prompt: Dynamic Regulation in Prompt Tuning for Multi-Label Zero-Shot Learning
* Dark light image recognition technology based on improved ssa and object detection
* Data Augmentation and Interpolation Improves Machine Learning-Based Pasture Biomass Estimation from Sentinel-2 Imagery
* Data Scaling Laws for End-to-End Autonomous Driving
* Data- and Knowledge-Driven Visual Abductive Reasoning
* Data-driven Head Motion Generation through Natural Gaze-Head Coordination
* Dataformer: Differential Additive Transformer for Lightweight Semantic Segmentation
* Dataset for Semantic and Instance Segmentation of Modern Fruit Orchards, A
* Datasets for Valence and Arousal Inference: A Survey
* Datasets of Smartphone Modalities for Depression Assessment: A Scoping Review
* DBaP-net: Deep network for image defogging based on physical properties prior
* DBL: Dual-Level balanced learning for long-Tailed classification
* DCal-Rec: A Spatio-Temporal Distribution Calibration Framework for Next-POI Recommendation
* DCSEG: Decoupled 3D Open-Set Segmentation using Gaussian Splatting
* DDaTR: Dynamic Difference-Aware Temporal Residual Network for Longitudinal Radiology Report Generation
* DDDMNet: A DSM Difference Normalization Module Network for Urban Building Change Detection
* Debiasing Medical Knowledge for Prompting Universal Model in CT Image Segmentation
* Decadal-Scale Warming Signals in Antarctic Ice Sheet Interior Revealed by L-Band Passive Microwave Observations from 2015 to 2025
* Decentralized Incentive-Based Pricing and Energy Management Strategy for Multi-Energy Electric Vehicle Charging Stations, A
* DECEPTIcON: Bridging Gaps in In-the-Wild Deception Research
* Deciding the Path: Leveraging Multi-Agent Systems for Solving Complex Tasks
* Decision-Making for Autonomous Vehicles in Multi-Scenarios With Global Map Model and Dynamic Safe Topological Structure
* DeclutterNeRF: Generative-Free 3D Scene Recovery for Occlusion Removal
* Decoding attention from the visual cortex: fMRI-based prediction of human saliency maps
* Decoding the brain via multi-view brain topology contrastive learning
* Decoding Vision Transformers: The Diffusion Steering Lens
* Decomposing Food Images for Better Nutrition Analysis: a Nutritionist-Inspired Two-Step Multimodal LLM Approach
* Decoupled Scale and Appearance for Optimal Deep Diamond ReID
* Decoupling Identity Confounders for Enhanced Facial Expression Recognition: An Information-Theoretic Approach
* Deep Diffusion Models and Unsupervised Hyperspectral Unmixing for Realistic Abundance Map Synthesis
* Deep intrinsic image decomposition via physics-aware neural networks
* Deep learning and multi-modal MRI for the segmentation of sub-acute and chronic stroke lesions
* Deep Learning Approach to Downscaling Microwave Land Surface Temperatures for a Clear-Sky Merged Infrared-Microwave Product, A
* Deep Learning Method for Predicting Terminal Air Traffic Congestion Levels Based on Multimode Spatiotemporal Propagation Networks, A
* Deep Learning Model With Fine-Tuning for Generalized Few-Shot Activity Recognition
* Deep Learning Model-Based Nudity Detection with Image Feature Extraction Approaches for GLAM Materials
* Deep Learning-Based Classification of Aquatic Vegetation Using GF-1/6 WFV and HJ-2 CCD Satellite Data
* Deep Learning-Based Classification of Planar 99mTc Pyrophosphate Scintigraphy for the Diagnosis of Cardiac Amyloidosis
* Deep Learning-Based Semantic Segmentation for Automatic Shoreline Extraction in Coastal Video Monitoring Systems
* Deep Line-Segment Detection-Driven Building Footprints Extraction from Backpack LiDAR Point Clouds for Urban Scene Reconstruction
* Deep Lookup Network
* Deep Medial Voxels: Learned Medial Axis Approximations for Anatomical Shape Modeling
* Deep noise-tolerant hashing for remote sensing image retrieval
* Deep Polycuboid Fitting for Compact 3D Representation of Indoor Scenes
* Deep Reinforcement Learning-Based Collaborative Computation Offloading for Distributed Vehicular Edge Computing
* Deep Reinforcement Learning-Based Task Scheduling and Resource Allocation for Vehicular Edge Computing: A Survey
* deep spatio-temporal architecture for dynamic ECN analysis with Granger causality based causal discovery, A
* DeepDC: Deep Distance Correlation as a Perceptual Image Quality Evaluator
* Deepfake Detection Leveraging Self-Blended Artifacts Guided by Facial Embedding Discrepancy
* DeepPartitioning: Deep Learning of Graph Partitioning for Neuron Segmentation From Electron Microscopy Volume via Graph Neural Network
* Defending Against Frequency-Based Attacks with Diffusion Models
* Defending Against Transfer-Based Adversarial Attacks Using SVD-Driven Feature Evolution
* Defenses in Adversarial Machine Learning: A Systematic Survey From the Lifecycle Perspective
* Define, Refine, Align: Correspondence-Free 3D Line Alignment with Attentional, Equivariant and Rotational Layers
* DeforHMR: Vision Transformer with Deformable Cross-Attention for 3D Human Mesh Recovery
* DEFT-VTON: Efficient Virtual Try-On with Consistent Generalised H-Transform
* Defurnishing with X-Ray Vision: Joint Removal of Furniture from Panoramas and Mesh
* DEGAS: Detailed Expressions on Full-Body Gaussian Avatars
* Dehaze-cGAN: Image dehazing using a multi-head attention-based conditional GAN for traffic video monitoring
* Delineation of Management Zones Based on the Agricultural Potential Concept for Potato Production Using Optical Satellite Images
* DELTA: Dense Depth from Events and LiDAR Using Transformer's Attention
* DEM-Based UAV Geolocation of Thermal Hotspots on Complex Terrain
* DEMO: Point-Feature Tracking for Pixel Processor Arrays
* Denoising Monte Carlo Renders with Diffusion Models
* Denoising VIIRS and Sentinel-2 MSI ocean color imagery for improved floating algae monitoring using noise-simulation-aided deep learning
* Density peaks clustering algorithm integrating manifold distance and mutual nearest neighbors
* Density-aware global-local attention network for point cloud segmentation
* Dental Lesion Segmentation Method Based on Hypernetwork Improved Unet
* Design and Optimization of Adaptive Cooperative MAC Protocol With Priority Scheduling for Train-to-Train Communications
* Design of a Provable Secure ECC and HMAC-Based Robust and Efficient Authentication Scheme for Maritime Transportation System
* Detailed 3D Modeling and Component Monomerization Extraction of Buildings Using Close-Range Photogrammetry
* Detect, Classify, Act: Categorizing Industrial Anomalies with Multi-Modal Large Language Models
* Detecting Burned Vegetation Areas by Merging Spectral and Texture Features in a ResNet Deep Learning Architecture
* Detecting Hand-Object Interaction Based on Movements in Hand Surrounding Region
* Detecting Localized Deepfake Manipulations Using Action Unit-Guided Video Representations
* Detecting Looted Archaeological Sites from Satellite Image Time Series
* Detecting Signs of Depression Using Social Media Texts Through an Ensemble of Ensemble Classifiers
* Detecting Sympathetic Discharges: Comparison of Electrodermal Activity and Skin Sympathetic Nerve Activity in Stimulation-to-Response Time and Recovery Time to Baseline
* Detection and Localization of Drones and UAVs Using Sound and Vision
* Detection of Medial Epicondyle Avulsion in Elbow Ultrasound Images via Bone Structure Reconstruction
* Detection of Schizophrenia Spectrum Disorder and Major Depression Disorder Using Automated Speech Analysis
* Detector-Free Image Matching with Lightweight Backbone and Feature Filtering
* Developing variants of the Lucy-Richardson algorithm for coded aperture imaging: tutorial
* DG-DETR: Toward domain generalized detection transformer
* DGC-Link: Dual-Gate Chebyshev Linkage Network on EEG Emotion Recognition
* DGFMamba: Model fine-tuning based on bidirectional state space for domain generalization semantic segmentation
* DI-Retinex: Digital-Imaging Retinex Model for Low-Light Image Enhancement
* Diabetes Screening Algorithm Embedded with Inception Deep Convolution in Swin Transformer, A
* Diagnosis Method of Major Depressive Disorder Using Wavelet Coherence and State-Pathology Separation Network, The
* Diagnosis of Pediatric Hypopigmentary Dermatoses Based on Lightweight HierAttn Network
* differentiable method for novel view SAR image generation via 3D Gaussian Splatting, A
* DiffExplainer: Towards cross-modal global explanations with diffusion models
* DiffRSD: Diffusion-Based and Integrity-Aware RGB-D Rail Surface Defect Inspection
* DiffTrajectory: Mitigating cumulative errors and enhancing inference efficiency in diffusion-based trajectory prediction
* DiffuseDoc: Document geometric rectification via diffusion model
* Diffusion model-based imbalanced diabetic retinal image classification
* Diffusion Models for Intelligent Transportation Systems: A Survey
* Diffusion With Reinforcement Learning for Pedestrian Trajectory Prediction
* Diffusion-Based Continuous Sign Language Generation with Cluster-Specific Fine-Tuning and Motion-Adapted Transformer
* Diffusion-Enhanced Underwater Debris Detection via Improved YOLOv12n Framework
* Direct and Explicit 3D Generation from a Single Image
* Direct Zero-Shot Indoor Scene Recognition Method Based on Visual Question Answering, A
* Direction-Aware Hybrid Representation Learning for 3D Hand Pose and Shape Estimation
* discDC: Unsupervised discriminative deep image clustering via confidence-driven self-labeling
* Discourse Structure- and Interlocutor-Guided Network for Dialogue Act Recognition and Sentiment Classification, A
* Discriminative attention based weighted sparse representation of visual objects in complex scenarios
* Discriminative response pruning for robust and efficient deep networks under label noise
* Disentangling Inter- and Intra-Video Relations for Multi-Event Video-Text Retrieval and Grounding
* Disentangling Polysemantic Channels in Convolutional Neural Networks
* Disentangling Visual Transformers: Patch-Level Interpretability for Image Classification
* Displacement Time Series Forecasting Using Sentinel-1 SBAS-InSAR Results in a Mining Subsidence Case Study: Evaluation of Machine Learning and Deep Learning Methods
* Dist-Tracker: A Small Object-Aware Detector and Tracker for UAV Tracking
* Distance self-adaptive fuzzy c-means and its application to image segmentation
* Distantly supervised reinforcement localization for real-world object distribution estimation
* Distillation-Supervised Convolutional Low-Rank Adaptation for Efficient Image Super-Resolution
* Distilling Grounding DINO for an Edge-Cloud Collaborative Advanced Driver Assistance System
* Distilling Hierarchical Knowledge From Multimodal Fusion for Unimodal Image Segmentation
* Distilling Normalizing Flows
* distributed deep learning approach for blood sample-based early detection of dementia, A
* Distributed Input Mapping Control for Multiple Mixed Platoons With a Flexible Structure Model
* Distributed Switching Model Predictive Control for Adaptive Human-Lead-Platooning in Mixed Traffic
* Distribution entropy regularized multimodal subspace support vector data description for anomaly detection
* Distribution Shift Problem in Transportation Networks Using Reinforcement Learning and AI, The
* Distribution Shifts at Scale: Out-of-distribution Detection in Earth Observation
* Diversity Learning Guided Dual Graph Autoencoder for Unsupervised Hyperspectral Band Selection
* DLSF: Dual-Layer Synergistic Fusion for High-Fidelity Image Synthesis
* DLST: Dual-Template Co-Evolution Learning for Robust Long-Term Drone Tracking in Dynamic Environments
* DM-MKGC: Multimodal Knowledge Graph Completion Based on Dynamic Prompt Learning and Multi-Granularity Aggregation
* DNMCN: Dual-Stage Normalization Based Modality-Collaborative Fusion Network for Multimodal Sentiment Analysis
* DOCB: A Dynamic Online Cross-Batch Hard Exemplar Recall for Cross-View Geo-Localization
* Document Image Rectification using Stable Diffusion Transformer
* Domain Adaptation for Skin Lesion: Evaluating Real-World Generalisation
* Domain Adaptation of VLM for Soccer Video Understanding
* Domain adapter for visual object tracking based on hyperspectral video
* Domain Generalization for Semantic Segmentation: A Survey
* Domain Generalization of Pathological Image Segmentation by Patch-Level and WSI-Level Contrastive Learning
* Domain Generalization Through Attenuation of Domain-Specific Information
* Domain-Adaptive Point Cloud Semantic Segmentation via Knowledge-Augmented Deep Learning
* Domain-Unified Adaptive Detection Framework for Small Vehicle Targets in Monostatic/Bistatic SAR Images
* Dominant Role of Meteorology and Aerosols in Regulating the Seasonal Variation of Urban Thermal Environment in Beijing
* Double Loop Trajectory Planning for Virtually Coupled Trains Considering Line Condition Disturbances
* DQP-PCQA: Deep Quantization Parameters Bring New Insight to Point Cloud Quality Assessment
* Dream-in-Style: Text-to-3D Generation Using Stylized Score Distillation
* DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer
* DressRecon: Freeform 4D Human Reconstruction from Monocular Video
* Drivable 3D Gaussian Avatars
* Drive4C: A Closed-Loop Benchmark on what Foundation Models Really Need to be Capable of for Language-Guided Autonomous Driving
* Driver Gaze Area Prediction During IVIS Secondary Tasks Based on Multivariate Features of Spatial-Temporal Distribution
* DRLN: Disparity-Aware Rescaling Learning Network for Multi-View Video Coding Optimization
* Drug Discovery Agent: An Automated Vision Detection System for Drug-Cell Interactions
* Drug response prediction: A critical systematic review of current datasets and methods
* DSCViTANet: A Hybrid Depthwise Separable Convolution and Vision Transformer for Early Alzheimer's Classification
* Dual Attention Guidance Network for Self-Supervised Monocular Depth Estimation
* Dual Precision Quantization for Efficient and Accurate Deep Neural Networks Inference
* Dual Prototypes-Based Personalized Federated Adversarial Cross-Modal Hashing
* Dual Uncertainty-Aware Correspondence Adapting and Retaining for Continual Composed Image Retrieval
* Dual-Branch Hyperspectral Open-Set Classification with Reconstruction-Prototype Fusion for Satellite IoT Perception
* Dual-branch interactive guided network based on gradient prior for image super-resolution
* Dual-branch manifold information consistency for unsupervised visible-infrared person re-identification
* Dual-decoder collaborative learning with multi-hybrid view augmentation for self-supervised 3D action recognition
* Dual-Domain Representation Alignment for Unsupervised Height Estimation from Cross-Resolution Remote Sensing Images
* Dual-Input Frequency-Aware Network for High-Quality Thermal Image Super-Resolution
* Dual-optimized two-stage Camouflaged Object Detection
* Dual-Path Enhancements in Event-Based Eye Tracking: Augmented Robustness and Adaptive Temporal Modeling
* Dual-Source CBCT for Large FoV Imaging Under Short-Scan Trajectories
* Dual-Stage Cross-Modal Network with Dynamic Feature Fusion for Emotional Mimicry Intensity Estimation
* Dual-stage encryption framework for secure 3D information hiding using single-pixel imaging and diffusion models
* Dual-teacher self-distillation registration for multi-modality medical image fusion
* DUALF-D: Disentangled dual-hyperprior approach for light field image compression
* DualRecon: Building 3D Reconstruction from Dual-View Remote Sensing Images
* DuoSpaceNet: Leveraging Both Bird's-Eye-View and Perspective View Representations for 3D Object Detection
* DURN: Data uncertainty-driven robust network for mural sketch detection
* Dust to Detail: Restoring Sand-dust Images with Frequency-Guided Attention and Multi-Scale Features
* Dyadic Control for Formation Maintenance and Collision Avoidance in Cooperative Road Transportation Systems
* Dyadic Mamba: Long-term Dyadic Human Motion Synthesis
* DynaGuide: A generalizable dynamic guidance framework for zero-shot guided unsupervised semantic segmentation
* Dynamic Age Estimation via Mixture of Experts: Bridging Semantic and Structural Models
* Dynamic clustering transformer for LiDAR-based 3D object detection
* Dynamic deep multi-label image data augmentation based on self-paced learning
* Dynamic EventNeRF: Reconstructing General Dynamic Scenes from Multi-View RGB and Event Streams
* Dynamic Expansion Planning of Charging Stations With Fixed and Mobile Chargers
* Dynamic proxy domain generalizes the crowd localization by better binary segmentation
* Dynamic State-Control Modeling for Generalized Remote Sensing Image Super-Resolution
* Dynamic Watermarks in Images Generated by Diffusion Models
* Dynamic Zoning Cooperative Control Method for Urban Road Traffic Flow in a Connected Traffic Environment, A
* Dynamic-Attention-Based EEG State Transition Modeling for Emotion Recognition
* Dynamical Causal Graph Neural Network for EEG Emotion Recognition
* DynOMo: Online Point Tracking by Dynamic Online Monocular Gaussian Reconstruction
* DySS: Dynamic Queries and State-Space Learning for Efficient 3D Object Detection from Multi-Camera Videos
* E-3DGS: Event-Based Novel View Rendering of Large-Scale Scenes Using 3D Gaussian Splatting
* E-BARF: Bundle Adjusting Neural Radiance Fields from a Moving Event Camera
* E-Net for pansharpening: A super-resolution perspective
* E-VLC: A Real-World Dataset for Event-Based Visible Light Communication and Localization
* EAME: Element-Aware Multi-UAV distributed autonomous exploration for efficient and complete under-canopy measurements
* EAS-Det: Edge-Aware Semantic Feature Fusion for Robust 3D Object Detection in LiDAR Point Clouds
* EBSnoR: Event-Based Snow Removal by Optimal Dwell Time Thresholding
* ECAKM: Efficient Conditional Anonymous Authentication Scheme With On-Chain Key Management in VANETs
* Echocardiography Video Segmentation via Neighborhood Correlation Mining
* ECO-AI: Energy-Conscious Optimization for AI Training
* EcoWikiRS: Learning Ecological Representation of Satellite Images from Weak Supervision with Species Observations and Wikipedia
* Edge Collaborative Caching Based on Incentive-Driven D3QN Combined With User Preferences in UAV-Assisted Vehicular Networks
* Edge Craft Odyssey: Navigating guided super-resolution with a fast, precise, and lightweight network
* Edge detection-driven LightGBM for fast intra partition of H.266/VVC
* Edge-Augmented HLAC and Gaussian Distribution-Based Weighted Feature Extraction for 1-ms Abnormal Detection System in Logistics
* EdgeSAM: Prompt-In-the-Loop Distillation for SAM
* Editorial to the special issue in vision language models for remote sensing analysis and interpretation
* Editorial: Special Section Forum for Information Retrieval Evaluation (FIRE) 2024
* EEG Coupled Scale-Invariant Dynamics for Emotion Recognition: A Domain Adaptation Approach
* EEG Neurofeedback-Based Gait Motor Imagery Training in Lokomat Enhances Motor Rhythms in Complete Spinal Cord Injury
* Effect of layers on CNN model accuracy for facial emotion recognition
* Effect of Mindfulness Meditation on Sensory Perception and Emotional Evaluation of Mid-Air Touch on the Forearm
* Effective Adversarial Attack Approach to Assess the Vulnerability of Autonomous Vehicle Trajectory Prediction Models
* Effective Finite Time Stability Control for Human-Machine Shared Vehicle Following System
* Effective SAR Image Despeckling Using Noise-Guided Transformer and Multi-Scale Feature Fusion
* Effectiveness of Max-Pooling for Fine-Tuning CLIP on Videos
* Effectiveness of Training with Procedurally Generated Synthetic Images of Crop Plants
* Effects of Prescribed Fire on Spatial Patterns of Plant Functional Traits and Spectral Diversity Using Hyperspectral Imagery from Savannah Landscapes on the Edwards Plateau of Texas, USA
* Efficacy of varying sensing features for enhanced performance of deep-learning-informed multidimensional force platform
* Efficiency of Data Clustering for Stratification and Sampling in the Two-Phase ALS-Enhanced Forest Stock Inventory
* Efficient 2D to Full 3D Human Pose Uplifting Including Joint Rotations
* Efficient 3D Surface Super-Resolution via Normal-Based Multimodal Restoration
* Efficient and compact tensor wheel decomposition for tensor completion
* efficient and lightweight pyramid attention for image deblurring, An
* Efficient and Scalable Framework for Lightweight Crop Disease Recognition in Low-Resource Settings, An
* Efficient Biomass Estimation Model for Large-Scale Olea europaea L. by Integrating UAV-RGB and U2-Net with Allometric Equations, An
* Efficient Burst Super-Resolution with One-Step Diffusion
* efficient community-aware pre-training method for graph neural networks, An
* Efficient Continuous Group Convolutions for Local SE(3) Equivariance in 3D Point Clouds
* efficient direct solution of the perspective-three-point problem, An
* Efficient High-Order Spatial Interactions for Visual Perception
* Efficient Image Generation with Variadic Attention Heads
* Efficient Method for Measuring Oil Casing Thread Geometric Parameters Using Point Cloud Data, An
* Efficient Motion Sickness Assessment: Recreation of On-Road Driving on a Compact Test Track
* Efficient multi-view discrete co-clustering with learned graph
* Efficient multi-view stereo with depth-aware iterations and hybrid loss strategy
* Efficient Nearest Neighbor Search Using Dynamic Programming
* Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation
* Efficient Skeleton-Based Action Recognition using Superposed Shape Subspace
* Efficient Synchronous Training Integrated Model for Driving Decision-Making Based on Deep Reinforcement Learning, An
* Efficient Task-Specific Conditional Diffusion Policies: Shortcut Model Acceleration and SO(3) Optimization
* Efficient VideoMAE via Temporal Progressive Training
* Efficiently Mitigating Video Content Misalignment on Large Vision Model with Time-Series Data Alignment
* EffiHeritageNet: Efficient Semantic Segmentation Method for Intangible Cultural Heritage Scenes
* EFMANet: An Edge-Fused Multidimensional Attention Network for Remote Sensing Semantic Segmentation
* EFMK: Extrinsic Parameters-Free Multi-View 3D Human Skeleton Estimation
* Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
* Egocentric Event-Based Vision for Ping Pong Ball Trajectory Prediction
* EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting
* EigenLoRAx: Recycling Adapters to Find Principal Subspaces for Resource-Efficient Adaptation and Inference
* EL-Attack: Explicit and Latent Space Hybrid Optimization based General and Effective Attack for Autonomous Driving Trajectory Prediction
* Electronic Library for Commercially Usable Emotional Stimuli (EL-CUES): An Annotated Image Database for Emotion Induction Validated in a German Population
* Eliminating Semantic Ambiguity in Human Pose Estimation via Stable Feature Upsampling
* Embedding Shift Dissection on CLIP: Effects of Augmentations on VLM's Representation Learning
* EmbryoVision AI: An explainable deep learning framework for enhanced blastocyst selection in assisted reproductive technologies
* Emergency Motor Intention Detection Based on Unpredictable Anticipatory Activity: An EEG Study
* eMotion-GAN: A motion-based GAN for photorealistic and facial expression preserving frontal view synthesis
* EmotionMIL: An End-to-End Multiple Instance Learning Framework for Emotion Recognition From EEG Signals
* Emotions in LatAm: A New Dataset and Benchmark for Emotion Recognition in Latin America
* EmoVLM-KD: Fusing Distilled Expertise with Vision-Language Models for Visual Emotion Analysis
* Empathetic Response Generation Through Multi-Modality
* Empathy Detection From Text, Audiovisual, Audio or Physiological Signals: A Systematic Review of Task Formulations and Machine Learning Methods
* Empirical Study for Efficient Video Quality Assessment, An
* Encoding Affective Cues in Multimodal Textual Transcriptions
* End-to-End Autonomous Driving Without Costly Modularization and 3D Manual Annotation
* End-to-End Depth-Based Pipeline for Selfie Image Rectification, An
* End-To-End Pipeline for Virtual Banner Replacement in Football Broadcasts, An
* Enforcing View-Consistency in Class-Agnostic 3D Segmentation Fields
* Enhance Image-to-Point-Cloud Registration with Beltrami Flow
* Enhance Then Search: an Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection
* Enhanced AlexNet for Detecting the Myocardial Infarction: An Efficient Approach
* Enhanced Computational Complexity in Continuous-Depth Models: Neural Ordinary Differential Equations With Trainable Numerical Schemes
* Enhanced Detection of Drought Events in California's Central Valley Basin Using Rauch-Tung-Striebel Smoothed GRACE Level-2 Data: Mechanistic Insights from Climate-Hydrology Interactions
* Enhanced Emotion Recognition in Conversations Through Hybrid Context Encoding and Latent Dependency Mining
* Enhanced licence plate detection using YOLO framework in challenging environments
* Enhanced Multi-Vehicle Trajectory Prediction via an Extended Temporal Sequence Fusion Attention Network
* Enhanced Multi-View Pedestrian Detection Using Probabilistic Occupancy Volume
* Enhanced Resolution of Martian Polar Stratigraphy via Structure Enhancement Denoising and Sparse Deterministic Deconvolution of SHARAD Data
* Enhanced Semantic Extraction and Guidance for UGC Image Super Resolution
* Enhanced skeleton-based Group Activity Recognition through spatio-temporal graph convolution with cross-dimensional attention
* Enhancing action recognition by leveraging the hierarchical structure of actions and textual context
* Enhancing aesthetic image generation with reinforcement learning guided prompt optimization in stable diffusion
* Enhancing Chlorophyll-a Estimation in Optically Complex Waters Using ZY-1 02E Hyperspectral Imagery: An Integrated Approach Combining Optical Classification and Multi-Index Blending Models
* Enhancing Facial Expression Recognition with LSTM Through Dual-Direction Attention Mixed Feature Networks and CLIP
* Enhancing Few-Shot Class-Incremental Learning via Frozen Feature Augmentation
* Enhancing lightweight image super-resolution with hybrid convolution and attention
* Enhancing Lung Disease Diagnosis: A High Performance Hybrid Deep Learning Framework for Multi-Class Chest X-Ray Analysis
* Enhancing Multi-Modal Automatic Target Recognition Using Out-of-Distribution Exploitation (MATRODE)
* Enhancing Multimodal Sarcasm Detection Via Global and Local Prompt Mechanisms
* Enhancing Polar Sea Ice Estimation: Deep SARU-Net for Spatiotemporal Super-Resolution Approach
* Enhancing Reliability of Medical Image Diagnosis through Top-rank Learning with Rejection Module
* Enhancing Semi-Supervised Instance Segmentation Through SAM-Driven Pseudo-Label Generation in Autonomous Driving Environment
* Enhancing snow depth estimation in forested regions of the northern hemisphere: A physically-constrained machine learning approach with spatiotemporal dynamics
* Enhancing Vehicular Network Security, Privacy, and Trust Through Reinforcement Learning: A Comprehensive Survey
* Enhancing Video Anomaly Understanding via Multi-Task Instruction Tuning
* Enhancing Vision Transformer Explainability using Artificial Astrocytes
* Enhancing visual representation of untrimmed videos by counteracting visuality threatening content
* Entropy-informed weighting channel normalizing flow for deep generative models
* ePBR: Extended PBR Materials in Image Synthesis
* Error-Resilient incomplete multi-View clustering: Mitigating imputation-induced error accumulation
* Estimating and Mapping Aboveground Biomass of Vegetation in Typical Lake Flooding Wetland Based on MODIS and Landsat Images Fusion
* Estimating Grazing Pressure from Satellite Time Series Without Reliance on Total Production
* Estimating Plant Physiological Parameters for Vitis vinifera L. Using In Situ Hyperspectral Measurements and Ensemble Machine Learning
* Estimation of All-Weather Daily Surface Net Radiation over the Tibetan Plateau Using an Optimized CNN Model
* Estimation of the Relationship Between Urban Landscape Pattern and Crop Yield by Remote Sensing Data and Field Measurement
* EV-Flying: An Event-Based Dataset for In-the-Wild Recognition of Flying Objects
* EV-LayerSegNet: Self-Supervised Motion Segmentation Using Event Cameras
* Evaluating Airborne Thermal Infrared Hyperspectral Data for Leaf Area Index Retrieval in Temperate Forests
* Evaluating Consistency and Accuracy of Public Tidal Flat Datasets in China's Coastal Zone
* Evaluating Multispectral Imagery and Lidar Data for Vegetation Classification: A Comparative Assessment of UASs and Traditional Field Methods to Support Coastal Restoration Monitoring
* Evaluating Terrestrial Water Storage, Fluxes, and Drivers in the Pearl River Basin from Downscaled GRACE/GFO and Hydrometeorological Data
* Evaluating Text-to-Video Alignment: A Hierarchical Benchmark for Video Generation Models
* Evaluating the applicability of exclusively optical sensors in flash drought monitoring with MODIS observations
* Evaluation of Fir Forest Die-Back and Regeneration After a Severe Bark Beetle Disturbance Using UAV-Based Remote Sensing
* Evaluation of Gap-Filling Methods for Inland Water Color Remote Sensing Data: A Case Study in Lake Taihu
* Evaluation of Metro Station Accessibility Based on Combined Weights and GRA-TOPSIS Method
* Evaluation of Multi-Source Satellite XCO2 Products over China Using the Three-Cornered Hat Method and Multi-Reference Comprehensive Comparisons
* Evaluation of Non-Spherical Particle Models for Mineral Dust in Multi-Wavelength Polarization Lidar Applications: Comparison of Spheroid, Super-Ellipsoid, and Irregular-Hexagonal Models
* Evaluation of the flagGraupelHail Product from Dual-Frequency Precipitation Radar Onboard the Global Precipitation Measurement Core Observatory Using Multi-Parameter Phased Array Weather Radar
* EvenFormer: Dynamic Even Transformer for Real-World Image Restoration
* Event Quality Score (EQS): Assessing the Realism of Simulated Event Camera Streams via Distances in Latent Space
* Event-Based Continuous Color Video Decompression from Single Frames
* Event-Based Eye Tracking. 2025 Event-Based Vision Workshop
* Event-Based Prescribed Performance Anti-Disturbance Switchable Trajectory-Tracking Control for Underactuated Surface Vessels With Input Saturation
* Event-Based Tracking and Imaging of Randomly Moving Objects in Dense Dynamical Scattering Media
* Event-Conditioned Dual-Modal Fusion for Motion Deblurring
* Event-Driven Dynamic Attention for Multi-Object Tracking on Neuromorphic Hardware
* evolution of humanoid robots, The
* Evolution of Spaceborne SAR Missions in Earth Orbit
* EWAM: Scene-Adaptive Infrared-Visible Image Matching with Radiation-Prior Encoding and Learnable Wavelet Edge Enhancement
* ExaM: Unsupervised Concept-Based Representation Learning to Better Explain Models in Vision Tasks
* Examining the Fourier Spectrum of Speech Signal From a Time-Frequency Perspective for Automatic Depression Level Prediction
* Exemplar Masking for Multimodal Incremental Learning
* Expanded SPAN for Efficient Super-Resolution
* Explainable multimodal brain imaging through a multiple-branch neural network
* Explainable Physical PolSAR Autoencoders for Soil Moisture Estimation
* Explainable Sentiment Analysis With DeepSeek-R1: Performance, Efficiency, and Few-Shot Learning
* Explaining 3D Point Cloud Semantic Segmentation Models Through Adversarial Attacks
* Explicable Artificial Intelligence for Affective Computing
* Exploiting Adversarial Learning and Topology Augmentation for Open-Set Visual Recognition
* Exploiting Diffusion Priors for Generalizable Few-Shot Satellite Image Semantic Segmentation
* Exploiting Frequency Correlation for Hyperspectral Image Reconstruction
* Exploration of the Mechanisms Underlying Corneal Decompensation Using Graph Neural Networks
* Exploratory Learning of Amis Indigenous Culture and Local Environments Using Virtual Reality and Drone Technology
* Exploring a Non-Parametric Uncertain Adaptive training method for facial expression recognition
* Exploring Audio Cues for Enhanced Test-Time Video Model Adaptation
* Exploring Cognitive and Aesthetic Causality for Multimodal Aspect-Based Sentiment Analysis
* Exploring efficient appearance prompts for light-weight object tracking
* Exploring Emotional Engagement with Responsible AI Constructs: A Video-Based Cognitive Experiment
* Exploring Missing Modality in Multimodal Egocentric Datasets
* Exploring Modality Guidance to Enhance VFM-Based Feature Fusion for UDA in 3D Semantic Segmentation
* Exploring Semi-Supervised Learning for Online Mapping
* Exploring Temporal Dynamics in Event-Based Eye Tracker
* Expressive Keypoints for Skeleton-Based Action Recognition via Progressive Skeleton Evolution
* Extra-Lightweight AI-Based Privacy Preserving Framework for Egocentric Wearable Cameras
* Extreme Lake Level Rise in the Zaysan Basin Driven by Intense Snowmelt Runoff
* Eyes Tell the Truth: GazeVal Highlights Shortcomings of Generative AI in Medical Imaging
* Face Reconstruction from Face Embeddings Using Adapter to a Face Foundation Model
* FaceGest: A Comprehensive Facial Gesture Dataset for Human-Computer Interaction
* FaceMINT: A library for gaining insights into biometric face recognition via mechanistic interpretability
* Facial Expression Recognition With an Efficient Mix Transformer for Affective Human-Robot Interaction
* Facial image super-resolution network for confusing arbitrary gender classifiers
* FADMB: Fully attention-based dual memory bank network for weakly supervised video anomaly detection
* Fair-DETR: Detection transformer with adaptive multi-scale attention and dual strong constraint-aware query selection
* Fairness-Aware Boosting Model for Imbalanced 3D Point Cloud Segmentation in Autonomous Driving
* FALCON: Fast Image Haze Removal Leveraging Continuous Density Mask
* FAPS-MER: Facial action position and semantic based interactive fusion for micro-expression recognition
* Fast adaptive QTMT partitioning for intra 360° video coding based on gradient boosted trees
* Fast Collaborative Representation Algorithm Based on Extended Multi-Attribute Profiles for Hyperspectral Anomaly Detection, A
* Fast multi-view discrete clustering with two solvers
* Fast Sphericity and Roundness Approximation in 2D and 3D Using Local Thickness
* FastGrasp: Efficient Grasp Synthesis with Diffusion
* FCTFANet: A Fused CNN-Transformer Feature Aggregator Network for Image Restoration
* Feature Attenuation of Defective Representation Can Resolve Incomplete Masking on Anomaly Detection
* Feature Matching in the Dark: Homography-Based RGB-IR Feature Transformation for Low-Light Vision
* Feature subset weighting for distance-based supervised learning
* Feature-aligned distillation for dense object detection via refined semantic guidance and distribution consistency
* FedAlign: Federated Domain Generalization With Cross-Client Feature Alignment
* FedCAPR: Federated Camera-Aware Unsupervised Person Re-Identification with Identity-Distributed Equalization for Decentralized Data Clustering
* FedCIAL: Federated Color-Invariant Adversarial Learning for Enhancing Fairness and Performance in Skin Lesion Classification
* FedDAG: Federated Domain Adversarial Generation Toward Generalizable Medical Image Analysis
* FedDG-MoE: Test-Time Mixture-of-Experts Fusion for Federated Domain Generalization
* Federated automatic latent variable selection in multi-output Gaussian processes
* Federated cross-source learning for lung nodule segmentation with data characteristic-aware weight optimization
* FedFAT: Frequency adaptive interpolation for federated domain generalization on heterogeneous medical images
* FedSECA: Sign Election and Coordinate-Wise Aggregation of Gradients for Byzantine Tolerant Federated Learning
* Few-Shot Adaptation of Grounding DINO for Agricultural Domain
* Few-shot image generation via information transfer from the built Geodesic surface
* Few-Shot Testing of Autonomous Vehicles With Scenario Similarity Learning
* Few-shot-based video generation via multimodal fusion and Fourier Spliter
* FGPR: A large-scale dataset and benchmark for fine-grained product retrieval
* FI-MambaNet: A Feature Integration Network with Mamba and Multi-Head Self-Attention for Remote Sensing Object Detection
* Fidelity-preserving zero-shot diffusion models for highly ill-posed inverse problems in lensless imaging
* FieldMOT: A Field-Registered Multi-Object Tracking for Sports Videos
* Findings from shared tasks on hate speech detection: Performance patterns for low-resource languages
* Fine-Grained Artist Identification Method for Authentication and Attribution of Drawings Using Hatching Lines, A
* Fine-Grained Few-Shot Classification with Part Matching
* Fine-Grained Interpretation of Remote Sensing Image: A Review
* Fine-Grained Multispectral Fusion for Oriented Object Detection in Remote Sensing
* Fine-Tuning Large Language Models With Behavioral Alignment for Depression Detection
* FineCausal: A Causal-Based Framework for Interpretable Fine-Grained Action Quality Assessment
* Finite-Time Lyapunov-Based Model Predictive Control of ASVs: An Enlarging Attraction Domain Strategy Against DoS Attacks
* Finite-Time Multi-Lane Fusion Control for 2-D Plane Vehicle Platoon With FDI Attacks
* FireRisk-Multi: A Dynamic Multimodal Fusion Framework for High-Precision Wildfire Risk Assessment
* Flar-SVD: Fast and Latency-Aware Singular Value Decomposition for Model Compression
* Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image
* Flexible Partial Screen-Shooting Watermarking With Provable Robustness
* Flow-Guided Deformable Alignment with Channel-Wise Self-Attention Reconstruct for Efficient Burst HDR Restoration
* Flow-Rate-Constrained Physics-Informed Neural Networks for Flow Field Error Correction in 4-D Flow Magnetic Resonance Imaging
* FlowLoss: Dynamic Flow-Conditioned Loss Strategy for Video Diffusion Models
* FlowMap: High-Quality Camera Poses, Intrinsics, and Depth via Gradient Descent
* FM-LoRA: Factorized Low-Rank Meta-Prompting for Continual Learning
* FMDP: Leveraging a Foundation Model for Dual-Pixel Disparity Estimation
* FOCUS: Multi-View Foot Reconstruction from Synthetically Trained Dense Correspondences
* Food Degradation Analysis Using Multimodal Fuzzy Clustering
* FoodVideoQA: A Novel Baseline Framework for Dietary Monitoring
* FORCE: Physics-Aware Human-Object Interaction
* ForesightNav: Learning Scene Imagination for Efficient Exploration
* Forest Fire and Smoke Recognition Based on YOLO
* Forget Less, Learn More: Contrastive-Based Federated Class Incremental Learning with a Low-Dimensional Projection Layer
* Foundation Model for Skeleton-Based Human Action Understanding
* Foundation Models for Remote Sensing: An Analysis of MLLMs for Object Localization
* Fourier Fusion Implicit Mamba Network for Remote Sensing Pansharpening
* Fourier-enhanced semi-supervised proxy learning for ultra-fine-grained novel class discovery
* FourieRF: Few-Shot NeRFs via Progressive Fourier Frequency Control
* Fourth Monocular Depth Estimation Challenge, The
* FovEx: Human-Inspired Explanations for Vision Transformers and Convolutional Neural Networks
* FPD: Fringe Photometric Deflectometry when Fringe Meets Photometric Stereo
* FQ-EMCI-Net: A Multi-Head Attention CNN-DQN Approach with Filtered Q-Learning and Equilibrium Monte Carlo Initialization for SP-DLBP
* Framework for Processing of CRISM Hyperspectral Data for Global Martian Mineralogy
* FreBIS: Frequency-Based Stratification for Neural Implicit Surface Representations
* Freight Flow Equilibrium Assignment on a Multimodal Transport Network Integrating Urban Roads and Passenger-Freight Metro Lines
* Frenet-Serret Frame-Based Decomposition for Part Segmentation of 3-D Curvilinear Structures
* FreqMamba: A Frequency-Aware Mamba Framework with Group-Separated Attention for Hyperspectral Image Classification
* Frequency-domain multi-scale graph learning with information-theoretic constraint for spatio-temporal prediction
* Frequency-Prior Enhanced Ambient Lighting Normalization Via Visual Perceptual Refinement
* Frequency-Prompted Image Restoration to Enhance Perception in Intelligent Transportation Systems
* FrogDogNet: Fourier Frequency Retained Visual Prompt Output Guidance for Domain Generalization of CLIP in Remote Sensing
* From Beats to Scores: A Multi-Modal Framework for Comprehensive Figure Skating Assessment
* From Broadcast to Minimap: Achieving State-of-the-Art SoccerNet Game State Reconstruction
* From Coast to Inland: Nonlinear and Temperature-Mediated Urbanization Effects on Vegetation Phenology in Shandong Province, China
* From Data to Design: Leveraging Frequency Statistics for Efficient Neural Network Architectures
* From Depletion to Recovery: Tracking Water Storage Changes in the Semiarid Region of Inner Mongolia, China
* From Experimental Field to Real Field: Monitoring Wheat Stripe Rust Based on Optimized Hyperspectral Vegetation Index
* From Precomputed Particle Shading to Volumetric Atmospheric Cloud Rendering for Real-Time Gaming: Methods and Advances
* FsBAD: Data-efficient feature reconstruction for few-shot brain anomaly detection
* FSF-Net: Enhance 4D occupancy forecasting with coarse BEV scene flow for autonomous driving
* FullCycle: Full Stage Adversarial Attack for Reinforcement Learning Robustness Evaluation
* Fully-Geometric Cross-Attention for Point Cloud Registration
* FungiTastic: A Multi-Modal Dataset and Benchmark for Image Categorization
* FusedVision: A Knowledge-Infusing Approach for Practical Anomaly Detection in Real-World Surveillance Videos
* Fusion or Confusion? A Look at Dataset Pooling for Infrared Object Detection
* FUSION: Frequency-Guided Underwater Spatial Image recOnstructioN
* FusionNet: Multi-Model Linear Fusion Framework for Low-Light Image Enhancement
* Fuzzy clustering algorithm with locality preserving based on anchor graph
* Fuzzy Granular Support Vector Machine for Network Traffic Anomaly Detection, A
* G-Buffer Supported Neural Screen-Space Refraction Baking for Real-Time Global Illumination
* GAN semantics for personalized facial beauty synthesis and enhancement
* GAN-Based Domain Adaptation for Image-Aware Layout Generation in Advertising Poster Design
* Garment3DGen: 3D Garment Stylization and Texture Generation
* GarmentDreamer: 3DGS Guided Garment Synthesis with Diverse Geometry and Texture Details
* Gated-enhanced attention addition network for indoor RGB-D semantic segmentation
* Gaussian Garments: Reconstructing Simulation-Ready Clothing with Photorealistic Appearance from Multi-View Video
* Gaussian splitting attack: Gaussian splatting-based multi-view 3D adversarial attack
* GaussianAvatar-Editor: Photorealistic Animatable Gaussian Head Avatar Editor
* Gaussians-to-Life: Text-Driven Animation of 3D Gaussian Splatting Scenes
* GaussianStyle: Gaussian Head Avatar via StyleGAN
* GaussianVideo: Efficient Video Representation and Compression by Gaussian Splatting
* Gaze Attention Estimation for Medical Environments
* GEDR: Gaussian-Enhanced Detail Reconstruction for Real-Time High-Fidelity 3D Scene Reconstruction
* Gen3DSR: Generalizable 3D Scene Reconstruction Via Divide and Conquer From a Single View
* GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing
* Gene Expression Programming-Based Ride Insert Policy for Online Electric Vehicle Ride-Hailing Optimization
* General Pruning Criteria for Fast SBL
* Generalizable Unsupervised Microscopy Video Denoising via Weighted SpatioTemporal Sampling
* Generalized referring expression segmentation driven by instance-oriented queries
* Generalized Variational Retrieval of Full Field-of-View Cloud Fraction and Precipitable Water Vapor from FY-4A/GIIRS Observations
* Generalizing to Unseen Speakers: Multimodal Emotion Recognition in Conversations With Speaker Generalization
* Generative AI for Film Creation: A Survey of Recent Advances
* Generative AI Game Jam Case Study From October 2024, A
* Generative Augmentation Hashing for Few-Shot Cross-Modal Retrieval
* Generative Causality-Driven Network for Graph Multi-Task Learning
* Generative models for noise-robust training in unsupervised domain adaptation
* GEO SAR Refocusing Algorithm of Ship Targets with Complex Motion via CFSFD-Based ISAR Technique
* Geo-MRC: Dynamic Boundary Inference in Machine Reading Comprehension for Nested Geographic Named Entity Recognition
* Geological Evolution of Rima Bode on the Moon Revealed by Multi-Source Remote Sensing Data
* Geological Map of the Proclus Crater: A Study Case to Integrate Composition and Morpho-Stratigraphic Mapping on the Moon
* Geometric Attribute Collaborative Method in Multi-Scale Polygonal Entity Matching Scenario: Integrating Sentence-BERT and Three-Branch Attention Network, A
* Geometric Consistency Refinement for Single Image Novel View Synthesis via Test-Time Adaptation of Diffusion Models
* Geometric Correspondence Consistency in RGB-D Relative Pose Estimation
* Geometrically Constrained Position Estimation through Low-level Tracking
* Geometry-Aware Feature Matching for Large-Scale Structure from Motion
* Geometry-Aware Texture Generation for 3D Head Modeling with Artist-Driven Control
* Geometry-Guided Cross-View Diffusion for One-to-Many Cross-View Image Synthesis
* Get a GRIP on Test Time Adaptation! - Group Robust Inference-Time Policy Optimization for Vision Models
* GLFFEN: A Global-Local Feature Fusion Enhancement Network for Hyperspectral Image Classification
* GLGF-CR: A Gated Local-Global Fusion approach for cloud removal in real-world remote sensing
* GLNet-YOLO: Research on Pedestrian Detection Technology Based on Multimodal Feature Fusion
* Global and local collaborative learning for no-reference omnidirectional image quality assessment
* Global Self-Attention-Driven Graph Clustering Ensemble
* gMINT: Gradient-based Membership Inference Test Applied to Image Models
* Go-Stereo: Geometry-Gated Offset Correction Stereo Matching for Autonomous Driving
* Goal-Driven Human Motion Synthesis in Diverse Tasks
* Good4cir: Generating Detailed Synthetic Captions for Composed Image Retrieval
* GPRNet: A Geometric Prior-Refined Semantic Segmentation Network for Land Use and Land Cover Mapping
* GPT-FL: Generative Pre-Trained Model-Assisted Federated Learning
* GRADE: A Generalization Robustness Assessment via Distributional Evaluation for Remote Sensing Object Detection
* Gradient-Guided Optimization for Large Motion Video Frame Interpolation
* graph contrastive learning network for change detection with heterogeneous remote sensing images, A
* Graph-Based Multi-Resolution Cosegmentation for Coarse-to-Fine Object-Level SAR Image Change Detection
* Graph-Based Relaxation for Over-Normalization Avoidance in Reflectance Normalization of Multi-Temporal Satellite Imagery
* Graph-Oriented Instruction Tuning of Large Language Models for Generic Graph Mining
* GraphProbe: Knowledge Probing for Graph Representation Learning
* GRIN: Zero-Shot Metric Depth with Pixel-Level Diffusion
* GRS: Generating Robotic Simulation Tasks from Real-World Images
* GS-Pose: Generalizable Segmentation-Based 6D Object Pose Estimation with 3D Gaussian Splatting
* GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers
* Gully Extraction in Northeast China's Black Soil Region: A Multi-CNN Comparison with Texture-Enhanced Remote Sensing
* GVP: Generative Volumetric Primitives
* H2OT: Hierarchical Hourglass Tokenizer for Efficient Video Pose Transformers
* HADDNLP: Hyperspectral anomaly detection via double nonlocal priors
* HADF: A Hybrid Attention and Dual-Branch Feature Fusion Method for Infrared and Visible Image Fusion
* HAECcity: Open-Vocabulary Scene Understanding of City-Scale Point Clouds with Superpoint Graph Clustering
* HAI-Net: Skin Lesion Segmentation Using a High-Performance Adaptive Attention and Information Interaction Network
* Hail Damage Detection: Integrating Sentinel-2 Images with Weather Radar Hail Kinetic Energy
* Hand Gesture Recognition System Based on Indian Sign Language Using SVM and CNN
* Harmonizing Attention Fields with Knowledge Distillation for Multi-View 3D Object Detection
* HARMONY: Hidden Activation Representations and Model Output-Aware Uncertainty Estimation for Vision-Language Models
* Harnessing Depth Gradients: A New Framework for Precise RGB-D Instance Segmentation
* HCS-DFC: A Diffusion Classifier for Mode of Action Prediction Using Morphological Profiles
* HCTANet: Hierarchical Cross-Temporal Attention Network for Semantic Change Detection in Complex Remote Sensing Scenes
* HDC: Hierarchical Distillation for Multi-Level Noisy Consistency in Semi-Supervised Fetal Ultrasound Segmentation
* HDRUnet3D: High dynamic range image reconstruction network with residual and illumination maps
* HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs
* HeadEvolver: Text to Head Avatars via Expressive and Attribute-Preserving Mesh Deformation
* HeadGAP: Few-Shot 3D Head Avatar via Generalizable Gaussian Priors
* HeAL3D: Heuristical-enhanced Active Learning for 3D Object Detection
* Heart Rate and Facial Expression Data Influence the Ease of Communication in a Remote Work Set-Up
* Heterogeneous graph contrastive learning with spectral augmentation and dual aggregation
* Heterospectral Structure Compensation Sampling for Hyperspectral Fusion Computational Imaging
* HF-D-FINE: High-resolution features enhanced D-FINE for tiny object detection in UAV image
* HFF-Net: An Efficient Hierarchical Feature Fusion Network for High-Quality Depth Completion
* HG-SFDA: HyperGraph Learning Meets Source-Free Unsupervised Domain Adaptation
* Hierarchical community-based graph generation model for improving structural diversity
* Hierarchical Control of Emotion Rendering in Speech Synthesis
* Hierarchical gradient modulation for multi-resolution image registration
* Hierarchical Multi-Task Restoration Network for Old Photo Enhancement
* Hierarchical Prompt Engineering for Remote Sensing Scene Understanding with Large Vision-Language Models
* Hierarchical Semantic Segmentation with Autoregressive Language Modeling
* Hierarchical Spherical CNNs With Lifting-Based Adaptive Wavelets for Pooling and Unpooling
* High-Fidelity Star Map Simulation Method for Airborne All-Time Three-FOV Star Sensor Under Dynamic Conditions, A
* High-precision flood change detection with lightweight SAR transformer network and context-aware attention for enriched-diverse and complex flooding scenarios
* High-Precision Geolocation of SAR Images via Multi-View Fusion Without Ground Control Points
* High-Precision Human Pose Estimation Algorithm Based on Multi-View LiDAR and Visible Light Sensors
* High-Resolution GPR Surveys to Investigate the Internal Structure of Pillars Inside the Cathedral of San Giorgio in Ragusa Ibla (Sicily, Italy)
* High-Resolution Mapping and Spatiotemporal Dynamics of Cropland Soil Temperature in the Huang-Huai-Hai Plain, China (2003-2020)
* High-Resolution Remote Sensing and People-to-Pixel Integration for Mapping Farmland Abandonment in Central Himalayan Villages
* High-Resolution Underwater Creature Segmentation
* High-Speed Train Positioning in 5G NR Signals: A Novel High-Order Extended Kalman Filter Utilizing an Auxiliary Model for High-Order Variables
* Highway Signage Breakage Detection Algorithm Based on Improved YOLOv8
* HILoF-DETR: A Lightweight Framework for SAR Ship Detection with Spatial Frequency Enhancement and Dynamic Alignment
* HiTAL: Hierarchical Thumbnail and Latent Augmentation for Deepfake Detection
* HMD2: Environment-Aware Motion Generation from Single Egocentric Head-Mounted Device
* HMPFormer: Hierarchical vision transformer with multi-perspective feature learning for precise polyp segmentation
* HMSNet: Hilbert curve enhanced Mamba for real-time semantic segmentation
* HOI-Diff: Text-Driven Synthesis of 3D Human-Object Interactions using Diffusion Models
* HoleGest: Decoupled Diffusion and Motion Priors for Generating Holisticly Expressive Co-Speech Gestures
* HopGAT: A multi-hop graph attention network with heterophily and degree awareness
* HopNet: Harmonizing Object Placement Network for Realistic Image Generation via Object Composition
* How Does Built Environment Influence Housing Prices in Large-Scale Areas? An Interpretable Machine Learning Method by Considering Multi-Dimensional Accessibility
* How Does the Machine Perceive Depth for Indoor Single Images with CNN?
* How Good is my Video-LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs
* How Much Noise is There in Labels Generated by Humans? A Method to Validate Automatically Generated Bounding Boxes
* How to Enhance Causal Discrimination of Emotional Utterances: A Case on LLMs
* HSIGene: A Foundation Model for Hyperspectral Image Generation
* Human Eye Optics Simulation and Visual Modeling of Myopia Correction
* human layout consistency framework for image-based virtual try-on, A
* Human Mesh Reconstruction of Sports Players with Multiple Dynamic Cameras
* Human vs. Machine Minds: Ego-Centric Action Recognition Compared
* Human-Machine Cooperative Control Strategy Based on Deep Reinforcement Learning to Enhance Heavy Vehicle Driving Safety, A
* Human-Robot Navigation Using Event-Based Cameras and Reinforcement Learning
* HumMorph: Generalized Dynamic Human Neural Fields From Few Views
* Hurricane Precipitation Intensity as a Function of Geometric Shape: The Evolution of Dvorak Geometries
* Hybrid AI-Physical Modeling for Penetration Bias Correction in X-Band InSAR DEMs: A Greenland Case Study
* Hybrid CNN and SVM model for Alzheimer's disease classification using categorical focal loss function
* Hybrid Strategy Combining Maritime Physical Data to the OpenSARShip RCS Statistics for Fast and Effective Vessel Detection in SAR Imagery, A
* HybridEditDif: Text and Exemplar Guided Image Editing with Diffusion Models
* HyperCASR: Spectral-Spatial Open-Set Recognition With Category-Aware Semantic Reconstruction for Hyperspectral Imagery
* Hypercomplex Neural Network and Cross-Modal Attention for Multi-Modal Emotion Recognition Using Physiological Signals
* Hypergraph regularization-based anchor learning for multi-view clustering
* Hypergraph-Based High-Order Correlation Analysis for Large-Scale Long-Tailed Data Classification
* Hyperspectral Change Detection Method for Small Vehicles, A
* Hyperspectral Object Tracking via Band and Context Refinement Network
* Hyperspectral Object Tracking With Spectral Information Prompt
* Hyperspectral space transformations for texture classification
* Hyperspectral Tracker With Constrained Object Adaptive Learning and Trajectory Construction
* I-Filtering: Implicit Filtering for Learning Neural Distance Functions From 3D Point Clouds
* IAUNet: Instance-Aware U-Net
* IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding
* Ice Hockey Puck Localization Using Contextual Cues
* ICT-QA: Question Answering Over Multi-Modal Contexts Including Image, Chart, and Text Modalities
* Identification of the Sediment Thickness Variation of a Tidal Mudflat in the South Yellow Sea via GPR
* Identifying real changes for height displaced buildings to aid in deep learning training sample generation
* Identifying rural roads in remote sensing imagery: From benchmark dataset to coarse-to-fine extraction network: A case study in China
* Idoldancenet: Indian Heritage Idol Dance Pose Classification
* IEMFormer: Internal and External Multi-Fusion Transformer for Indoor RGB-D Semantic Segmentation
* IFADiff: Training-Free Hyperspectral Image Generation via Integer-Fractional Alternating Diffusion Sampling
* iFusion: Inverting Diffusion for Pose-Free Reconstruction From Sparse Views
* IG-ODAM: Instance-Aware Visual Explanations for Object Detection with Integrated Gradients
* IGL-DT: Iterative Global-Local Feature Learning with Dual-Teacher Semantic Segmentation Framework Under Limited Annotation Scheme
* IL-NeRF: Incremental Learning for Neural Radiance Fields with Camera Pose Alignment
* Illumination and identity feature disentanglement network: Optimizing the performance of finger-vein recognition in outdoor multi-illumination
* Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions
* ImagingNet: A New Learnable SAR Imaging Method via Hierarchical U-Shaped Network
* IMC: A Benchmark for Invariant Learning Under Multiple Causes
* Impact of Joint Assimilating AWS and Radar Observations on the Analysis and Forecast of a Squall Line with Complex Terrain
* Impact of Optical System Size on Robustness in Laser Speckle Authentication
* Impact on Triple/N-Way Collocation-Based Validation of Remote Sensing Products Due to Non-Ideal Error Statistics, The
* Implicit Diffusion-Based Super-Resolution for Intangible Cultural Heritage Images
* Improved 3DUNet+ with Inter-Slice Difference Awareness for Pulmonary Vessel CT Image Segmentation, An
* Improved Atmospheric Correction for Remote Imaging Spectroscopy Missions with Accelerated Optimal Estimation
* Improved Change Detection Method for Time-Series Soil Moisture Retrieval in Semi-Arid Area, An
* Improved Data Processing and a Prior Profile Generation Method for Precise Retrieval of Atmospheric CO2 Based on a Laser Heterodyne Radiometer
* Improved Out-of-Distribution Detection with Additive Angular Margin Loss
* Improved Repeat and Concatenate: A More Effective 2D X-Ray to 3D CT Image Translation Model
* Improved YOLOv8n Algorithm for Small Object Detection in Road Scenes, An
* Improvement of Snow Albedo Simulation Considering Water Content
* Improving adversarial transferability via semantic-style joint expectation perturbations
* Improving Generalized Visual Grounding With Instance-Aware Joint Learning
* Improving lesion segmentation in medical images by global and regional feature compensation
* Improving Multimodal Hateful Meme Detection Exploiting LMM-Generated Knowledge
* Improving Noisy Sensor Positions Using Noisy Inter-Sensor AOA Measurements
* Improving Object Detection Models via LLM-Based Training Data Synthesis
* Improving Open-World Object Localization by Discovering Background
* Improving Optical Flow and Stereo Depth Estimation by Leveraging Uncertainty-Based Learning Difficulties
* Improving out-of-domain generalization in Multiple Sclerosis detection and segmentation using Random Convolutions
* Improving Post-Training Quantization via Probabilistic Programming
* Improving Robustness and Reliability in Medical Image Classification With Latent-Guided Diffusion and Nested-Ensembles
* Improving Satellite-Derived Bathymetry in Complex Coastal Environments: A Generalised Linear Model and Multi-Temporal Sentinel-2 Approach
* Improving Weather-Based OOD Generalisation in Lidar-Based Object Detection Models via Adversarial Training
* Imputing missing multi-sensor data in the healthcare domain: A systematic review
* In Vivo Laparoscopic Image De-Smoking Dataset, Evaluation, and Beyond
* Inaccurate DInSAR Time Series Underlie the Purported Evidence of a Recent, Rapid Ascent of a Magmatic Body in the Campi Flegrei Caldera (Italy)
* iNatAg: Multi-Class Classification Models Enabled by a Large-Scale Benchmark Dataset with 4.7M Images of 2,959 Crop and Weed Species
* Incorporating Dense Metric Depth into Neural 3D Representations for View Synthesis and Relighting
* Incremental pseudo-labeling for black-box unsupervised domain adaptation
* Individual Tree-Level Biomass Mapping in Chinese Coniferous Plantation Forests Using Multimodal UAV Remote Sensing Approach Integrating Deep Learning and Machine Learning
* Inferring Driving Maps by Deep Learning-Based Trail Map Extraction
* Inferring River Channel Geometry Based on Multi-Satellite Datasets and Hydraulic Modeling
* Influence of Different Jumping Gaps on Socket-Shield Technology: Three-Dimensional Finite Element Analysis
* Information transmission: Inferring change area from change moment in time series remote sensing images
* Inpaint-Outpaint Synergy: Mask Refinement for Trimap-Free Matting
* INPC: Implicit Neural Point Clouds for Radiance Field Rendering
* INRet: A General Framework for Accurate Retrieval of INRs for Shapes
* InSAR Reveals Coseismic Deformation and Coulomb Stress Changes of the 2025 Tingri Earthquake: Implications for Regional Hazard Assessment
* Instance Feature Caching for Cross-Domain Few-Shot Object Detection
* Instance-Aware Multi-Task Learning for Nuclei Segmentation
* Instance-Level Orientation Enhancement for Horizontal Box Supervised Oriented Object Detection in Remote Sensing Images
* Instruction-Augmented Multimodal Alignment for Image-Text and Element Matching
* Instrument-Tissue-Guided Surgical Action Triplet Detection via Textual-Temporal Trail Exploration
* Integrated Approach for Simultaneous Calibration and 3-D Coronary Artery Centerline Reconstruction From Two Non-Simultaneous Angiographic Images, An
* Integrated Fault Estimation and Fault-Tolerant Tracking Control for Unmanned Surface Vessels Under Connectivity-Hybrid Cyber-Attacks
* Integrated Feature Framework for Wetland Mapping Using Multi-Source Imagery, An
* Integrated multi-channel approach for speckle noise reduction in SAR imagery using gradient, spatial, and frequency analysis
* Integrated Optimization Approach for High-Speed Railway Rescheduling With Passenger Transfer and Compensation Mechanisms
* Integrating ConvNeXt and vision transformers for enhancing facial age estimation
* Integrating Driving-Aware World Model With MPC for Autonomous Driving at Unsignalized T-Intersections
* Integrating Knowledge for High-Fidelity Remote Sensing Detection of Cross-River Bridges
* Integrating Multiple Semantics of Street View Imagery for Semi-Supervised Building Function Identification
* Integrating Satellite and Field Data for Glacier Melt Modeling in High-Mountain Asia: A Case Study on Passu Glacier
* Integrating UAV Multi-Temporal Imagery and Machine Learning to Assess Biophysical Parameters of Douro Grapevines
* Intelligent Optimizations for UAV, Digital Twin, and ISCC Enabled Intelligent Transportation Systems
* Interactive Agent Foundation Model, An
* Interactive Dual-Branch Transformer for Precise Agricultural Parcel Delineation from Remote Sensing Imagery
* Interactive Humanoid: Online Full Body Human Motion Reaction Synthesis with Social Affordance Forecasting and Canonicalization
* Interactive Multimodal Framework with Temporal Modeling for Emotion Recognition
* Interactive Siamese Network-Based Roadside Perception for Multi-Vehicle Tracking
* Intercomparison, Fusion and Application of FY-3E/WindRAD and HY-2B/SCA Ocean Surface Wind Products for Tropical Cyclone Monitoring
* Interpretable Dynamic Brain Network Analysis With Functional and Structural Priors
* Intersection-based Ensemble for Small Multi-Object Tracking in Challenging Environments
* InterTrack: Tracking Human Object Interaction Without Object Templates
* Intertwined Factors Affecting Altimeter Sigma0, The
* Intriguing Properties of Robust Classification
* Investigating an Earthquake Surface Rupture Along the Kumysh Fault (Eastern Tianshan, Central Asia) from High-Resolution Topographic Data
* Investigating dementia: an analysis on machine learning strategies
* Investigating Mechanisms for In-Context Vision Language Binding
* Investigating the Effects of Sleep Conditions on Emotion Responses with EEG Signals and Eye Movements
* Investigation of Physics-Informed Methods for Improving Sea Surface Height Prediction Based on Neural Networks in the South China Sea
* Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting
* Ionospheric Corrections for Space Domain Awareness Using HF Line-of-Sight Radar
* IRIS: An information path planning method based on reinforcement learning and information-directed sampling
* IRR-RADA: A Reflection-Aware Saliency Map and Adaptive Curriculum Learning Based Data Augmentation Method for Image Reflection Removal
* Is Multi-Person Gait Recognition Feasible Under Mutual Occlusion? A Human Model Regression-Based Approach
* Is Temporal Prompting All We Need for Limited Labeled Action Recognition?
* ITACLIP: Boosting Training-Free Semantic Segmentation with Image, Text, and Architectural Enhancements
* Iterative Event-Based Motion Segmentation by Variational Contrast Maximization
* Iterative Learning Model Predictive Control for Robust Rescheduling of Intercity Express Trains
* Iterative Similarity Perturbation Point Cloud Registration Based on Deformation-Resistant Region Detection
* ITS-Net: A platform and sensor agnostic 3D deep learning model for individual tree segmentation using aerial LiDAR data
* JADE: Joint-Aware Latent Diffusion for 3D Human Generative Modeling
* Jailbreak Attack with Multimodal Virtual Scenario Hypnosis for Vision-Language Models
* Joint luminance-chrominance learning for quality assessment of low-light image enhancement
* Joint Perception and Prediction for Autonomous Driving: A Survey
* Joint Shape Reconstruction and Registration via a Shared Hybrid Diffeomorphic Flow
* Joint subgraph independence for graph out-of-distribution generalization
* Joker: Conditional 3D Head Synthesis with Extreme Facial Expressions
* Jump-Aware: Player Position Rectification and Identification in Dynamic Sports Using Jump Event Spotting
* KernFusNet: Implicit Kernel Modulation and Fusion for Blind Super-Resolution
* KFGOD: A Fine-Grained Object Detection Dataset in KOMPSAT Satellite Imagery
* Knowledge Distillation Approach for SOS Fusion Staging: Towards Fully Automated Skeletal Maturity Assessment
* Knowledge Distillation Based Binarized Separable Convolutional Neural Network for Underwater Acoustic Target Recognition, A
* Knowledge Distillation from First-Order Representation for Visual State Space Model
* Knowledge tailoring: Bridging the teacher-student gap in semantic segmentation
* KOFFVQA: An Objectively Evaluated Free-Form VQA Benchmark for Large Vision-Language Models in the Korean Language
* LA-GATs: A Multi-Feature Constrained and Spatially Adaptive Graph Attention Network for Building Clustering
* Label Feature Co-Learning for Facial and EEG Emotion Recognition
* Laboratory Calibration Comparison of Hyperspectral Ocean Color Radiometers in the Frame of the FRM4SOC Phase 2 Project
* LAC-PS: A Light Direction Selection Policy Under the Accuracy Constraint for Photometric Stereo
* LADI v2: Multi-Label Dataset and Classifiers for Low-Altitude Disaster Imagery
* Land Surface Reflection Differences Observed by Spaceborne Multi-Satellite GNSS-R Systems
* Land-cover prior diffusion probabilistic model for remote sensing image super resolution
* LangCoop: Collaborative Driving with Language
* LangGas: Introducing Language in Selective Zero-Shot Background Subtraction for Semi-Transparent Gas Leak Detection with a New Dataset
* LangOcc: Open Vocabulary Occupancy Estimation via Volume Rendering
* Language model encoded multi-scale feature fusion and transformation for predicting protein-peptide binding sites
* Language-guided Recursive Spatiotemporal Graph Modeling for Video Summarization
* Language-Guided Trajectory Traversal in Disentangled Stable Diffusion Latent Space for Factorized Medical Image Generation
* LAPIS: A Novel Dataset for Personalized Image Aesthetic Assessment
* LapisGS: Layered Progressive 3D Gaussian Splatting for Adaptive Streaming
* Large-Model-Enhanced Method for Rail Surface Defect Detection in Heavy-Haul Railway, A
* Large-Scale 3D Gaussian Reconstruction Method for Optimized Adaptive Density Control in Training Resource Scheduling, A
* Large-Scale Analysis on Contextual Self-Supervised Video Representation Learning, A
* Large-Scale Dataset of Gaussian Splats and Their Self-Supervised Pretraining, A
* Large-small model collaboration for medical visual question answering with task aware mixture of experts and relation knowledge distillation
* Latent Patched Efficient Diffusion Model for High Resolution Image Synthesis
* Lattice-Based Ring Signcryption Scheme for Secure Communication in 6G-Enabled Vehicular Ad Hoc Networks Using Blockchain, A
* Layer-wise correlation and attention discrepancy distillation for semantic segmentation
* LayerCLIP: A fine-grained class activation map for weakly supervised semantic segmentation
* LayerMix: Enhanced Data Augmentation for Robust Deep Learning
* LDCT image denoising model based on dual-path attention, An
* Learn depth space from light field via a distance-constraint query mechanism
* Learned Lightweight Smartphone ISP with Unpaired Data
* Learned Smartphone ISP on Mobile GPUs, Mobile AI 2025 Challenge: Report
* Learning a Perspective-Invariant Descriptor for Remote Sensing Image Matching
* Learning Assisted Interactive Modelling with Rough Freehand 3D Sketch Strokes
* Learning Classifier Performance in an Ensemble of Classifiers for Personality Prediction Using Laughter
* Learning from majority label: A novel problem in multi-class multiple-instance learning
* Learning from Noise: Enhancing DNNs for Event-Based Vision Through Controlled Noise Injection
* Learning Heterogeneous Mixture of Scene Experts for Large-Scale Neural Radiance Fields
* Learning hierarchical uncertainty from hybrid representations for neural active reconstruction
* Learning interpretable binary codes via semantic alignment for customized image retrieval
* Learning label-specific features for multi-dimensional classification
* Learning like a real student: Black-box domain adaptation with preview, differentiated learning and review
* Learning multi-scale spatial-frequency features for image denoising
* Learning Multi-View Stereo With Geometry-Aware Prior
* Learning multiscale residual prototypes and global-local correspondence for video anomaly detection
* Learning Naturally Aggregated Appearance for Efficient 3D Editing
* Learning Optical Flow Field via Neural Ordinary Differential Equation
* Learning Pose-Aware Representations in Vision Transformers for Understanding Activities of Daily Living
* Learning Safe, Optimal, and Real-Time Flight Interaction With Deep Confidence-Enhanced Reachability Guarantee
* Learning Semantic-Aware Point-Line Features for Localization and Reconstruction
* Learning to complement with multiple humans
* Learning to correct unevenly exposed face images using RGB-NIR pairs
* Learning to Drive from a World Model
* Learning to Rank Onset-Occurring-Offset Representations for Micro-Expression Recognition
* Learning to recognize correctly completed procedure steps in egocentric assembly videos through spatio-temporal modeling
* Learning unified patterns of multimodalities for video temporal grounding
* Learning-Based Geometric Tracking Control for Rigid Body Dynamics
* least-squares spatio-temporal adjustment algorithm (LSSTAA) for radiometric normalization of PlanetScope time series, A
* Less Biased Noise Scale Estimation for Threshold-Robust RANSAC
* LET-CViT: A low-light enhanced two-stream CNN and vision transformer for Deepfake detection
* Leveraging Anthropometric Measurements to Improve Human Mesh Estimation and Ensure Consistent Body Shapes
* Leveraging ChatGPT-Based Augmentation and Contrastive Learning for Chinese Massive Open Online Course Sentiment Analysis
* Leveraging Eye Movement for Instructing Robust Video-Based Facial Expression Recognition
* Leveraging Fixed and Dynamic Pseudo-Labels in Cross-Supervision Framework for Semi-Supervised Medical Image Segmentation
* Leveraging graph-order scheduling and cascade hashing for fast feature matching of spherical images
* Leveraging Intermediate Features of Vision Transformer for Face Anti-Spoofing
* Leveraging Lightweight Facial Models and Textual Modality in Audio-Visual Emotional Understanding in-the-Wild
* Leveraging Limited ISMN Soil Moisture Measurements to Develop the HYDRUS-1D Model and Explore the Potential of Remotely Sensed Precipitation for Soil Moisture Estimates in the Northern Territory, Australia
* Leveraging Masked Feature and Consistency Regularization for Unsupervised Domain Adaptation Based Semi-Supervised Semantic Segmentation
* Leveraging Multimodal Large Language Models for Joint Discrete and Continuous Evaluation in Text-to-Image Alignment
* Leveraging Multimodal Large Language Models for Referring Camouflaged Object Detection
* Leveraging Segment Anything Model for Source-Free Domain Adaptation via Dual Feature Guided Auto-Prompting
* Leveraging Spatial-Temporal Heterogeneity and Cross-Mode Interactions: A Meta-Learning Approach for Multimodal Transportation Demand Prediction
* Leveraging Synthetic Adult Datasets for Unsupervised Infant Pose Estimation
* Leveraging Vision-Language Foundation Models to Reveal Hidden Image-Attribute Relationships in Medical Imaging
* LFMix: A Lightweight Hybrid Architecture for Light Field Super-Resolution
* LFSSMam: Efficient Aggregation of Multi-Spatial-Angular-Modal Information Using Selective SSM for Light Field Semantic Segmentation
* LFTramba: Comprehensive Information Learning for Light Field Image Super-Resolution via a Hybrid Transformer-Mamba Framework
* LFTransMamba: A Hybrid Mamba-Transformer Model for Light Field Image Super-Resolution
* LGNet: A Lightweight Ghost-Enhanced Network for Efficient SAR Ship Detection
* LibEER: A Comprehensive Benchmark and Algorithm Library for EEG-Based Emotion Recognition
* Lifting Deep Image Denoisers to Video With Frame Interpolation Pre-Training
* Lightning Location and Imaging Based on Electromagnetic Time Reversal and HDBSCAN
* Lightplane: Highly-Scalable Components for Neural 3D Fields
* Lightweight adaptive spatiotemporal information fusion network for medical time series classification
* Lightweight Anonymous Authenticated Key Agreement Protocol for V2I with Multi-TA Model, A
* Lightweight Convolutional Neural Network for Underwater Image Quality Enhancement, A
* Lightweight Image Super-Resolution With Pyramid Clustering Transformer
* Lightweight Moment Retrieval System with Global Re-Ranking and Robust Adaptive Bidirectional Temporal Search, A
* lightweight multilevel multiscale dual-path fusion network for remote sensing semantic segmentation, A
* Lightweight Perception-Driven Compression Method for Social Media Images, A
* Lightweight Similar Vehicle Detection Method With Multi-Scale Attention and Feature Fusion, A
* Lightweight Spatio-Temporal Convolutional Neural Network for Audio-Visual Emotion Recognition
* Lightweight Unsupervised Homography Estimation for Infrared and Visible Images Based on UAV Perspective Enabling Real-Time Processing in Space-Air-Ground Integrated Network
* Linear Complexity Multi-View Unsupervised Feature Selection via Anchor-Based Feature Relationship Construction
* Linguistic query-guided mask generation for referring image segmentation
* LINR: A Plug-and-Play Local Implicit Neural Representation Module for Visual Object Tracking
* LiPar: A Lightweight Parallel Learning Model for Practical In-Vehicle Network Intrusion Detection
* Lipvis: A Novel Transient Viseme Extraction Framework for Lip Reading
* Live Demonstration: Neurotouch - A Neuromorphic Vision-Based Tactile Sensor for Real-Time Gesture Recognition
* Live Demonstration: Real-Time Event-Data Processing with Graph Convolutional Neural Networks and SoC FPGA
* LLaVA-SCo: Teach Vision Language Models to Self-Correct
* LLCSpike: Learned Lossless Compression for Spike Data With Implicit Spike Representations
* LLM Framework for Long-Form Video Retrieval and Audio-Visual Question Answering Using Qwen2/2.5, An
* LLM-Enabled Multi-Agent Autonomous Mechatronics Design Framework, An
* LLM-guided hybrid Mamba-Transformer architecture for part-to-whole motion synthesis, A
* LLMPi: Optimizing LLMs for High-Throughput on Raspberry Pi
* LMFormer: Lane Based Motion Prediction Transformer
* LNTransformer: Lung Nodule Transformer for Sparse CT Segmentation
* Local Vectorized High Definition Map Construction for Autonomous Driving: A Comprehensive Review
* Locality-sensitive hashing based instance selection method with its application to acceleration of feature selection, A
* Locate then Calibrate: A Synergistic Framework for Small Object Detection from Aerial Imagery to Ground-Level Views
* Location-Free Scene Graph Generation
* Long-Term LULC Monitoring in El Jadida, Morocco (1985-2020): A Machine Learning-Based Comparative Analysis
* Looking into the Shadow: Recording a Total Solar Eclipse with High-Resolution Event Cameras
* LoopSplat: Loop Closure by Registering 3D Gaussian Splats
* LoVCS: A local voxel center based descriptor for 3D object recognition
* Low Texture 3D Reconstruction System Based on Manhattan Axis and 2D/3D Line Features
* Low-Cost Real-Time Remote Sensing and Geolocation of Moving Targets via Monocular Bearing-Only Micro UAVs
* Low-Frame-Rate Cell Tracking: Unmet Needs and Future Directions
* Low-Latency Real-Time Audio-Driven Talking Head Generation Based on Future Speech Feature Prediction
* Low-Light Image Enhancement Algorithm Based on Information Fusion Strategy
* Low-Light Image Enhancement via Multi-Exposure Progressive Contrastive Regularization
* Low-Resource Video Super-Resolution using Memory, Wavelets, and Deformable Convolutions
* LSBE-Net: Semantic segmentation of large-scale point cloud scenes via local boundary feature and spatial attention aggregation
* LSE-NeRF: Learning Sensor Modeling Errors for Deblured Neural Radiance Fields with RGB-Event Stereo
* LSSInst: Improving Geometric Modeling in LSS-Based BEV Perception with Instance Representation
* LSTM-Improved NHC in Tightly Coupled GNSS PPP/INS/ODO System for Robust Vehicle Navigation, An
* LViT-GMMs: Semantic Segmentation for Maritime Object Detection
* LVOS: A Benchmark for Large-Scale Long-Term Video Object Segmentation
* LVP-CLIP: Revisiting CLIP for Continual Learning with Label Vector Pool
* M-Adaptor: Text-Driven Whole-Body Human Motion Generation
* M3D: A Multimodal, Multilingual and Multitask Dataset for Grounded Document-Level Information Extraction
* M3FNet: Multi-modal multi-temporal multi-scale data fusion network for tree species composition mapping
* MAC++: Going Further with Maximal Cliques for 3D Registration
* Machine Learning Approaches to Phytoplankton Identification and Classification Using GCOM-C/SGLI Imagery
* Machine Learning Insights into Supply-Demand Mismatch, Interactions and Driving Mechanisms of Ecosystem Services Across Scales: A Case Study of Xingtai, China
* Machine Unlearning in Hyperbolic vs. Euclidean Multimodal Contrastive Learning: Adapting Alignment Calibration to MERU
* MAD: Makeup All-in-One with Cross-Domain Diffusion Model
* MADRL-Based Multi-UAV 3D Trajectory Planning for 6G-Oriented Communication Assistance
* Magic for the Age of Quantized DNNs
* Magnetic Tile Defect Detection with Cross-Scale Visual Feature Fusion: A Cascade Framework of Improved YOLOv11 and SAM Segmentation
* MAIENet: Multi-Modality Adaptive Interaction Enhancement Network for SAR Object Detection
* Maize Ear Sensing for on-Farm Yield Predictions
* Make Identity Indistinguishable: Utility-Preserving Face Dataset Publication With Provable Privacy Guarantees
* Making Every Event Count: Balancing Data Efficiency and Accuracy in Event Camera Subsampling
* Mamba-VA: A Mamba-Based Approach for Continuous Emotion Recognition in Valence-Arousal Space
* MambaMatch: Establishing Reliable Correspondences via Multi-Scale State Space Model
* Manager: Aggregating Insights From Unimodal Experts in Two-Tower VLMs and MLLMs
* Manipulating Trajectory Prediction Models With Backdoors
* Mapping aboveground tree biomass and uncertainty using an upscaling approach: A case study of the larch forests in northeastern China using UAV laser scanning data
* Mapping Biodiversity at Very-High Resolution in Europe
* Mapping subtle-featured oyster rafts with high-resolution imagery and deep learning techniques
* Maps from Motion (MfM): Generating 2D Semantic Maps from Sparse Multi-View Images
* Marine Radar Oil Spill Monitoring Method Based on YOLOv11 and Improved NGO Algorithm
* Mask-DiFuser: A Masked Diffusion Model for Unified Unsupervised Image Fusion
* Mask-Informed Deep Contrastive Incomplete Multi-View Clustering
* MaskAdapt: Unsupervised Geometry-Aware Domain Adaptation Using Multimodal Contextual Learning and RGB-Depth Masking
* Masked Face Recognition Method with Arcface Fusion of Attention and Focal Loss
* Mass Change in Antarctica from 2002 to 2025 Using GRACE and GRACE-FO
* Massively Parallel Lagrangian Relaxation Algorithm for Solving Large-Scale Spatial Optimization Problems Using GPGPU
* MASt3R-SfM: A Fully-Integrated Solution for Unconstrained Structure-from-Motion
* MaterialFusion: Enhancing Inverse Rendering with Material Diffusion Priors
* Matrix formula for subpixel image registration
* MAVEN: Multi-Modal Attention for Valence-Arousal Emotion Network
* Maximizing Aerial Detection of Organic Objects in Non-Exhaustively Searchable Survey Areas
* mChartQA and mChartQABench: A multimodal-only solution for complex chart question-answering
* MCK-Net: A Pedestrian Trajectory Prediction Network With MRF Clique Model and KAN
* MCoCa: Towards fine-grained multimodal control in image captioning
* MDMP: Multi-Modal Diffusion for Supervised Motion Predictions with Uncertainty
* MDPNet: Multiscale Dynamic Polyp-Focus Network for Enhancing Medical Image Polyp Segmentation
* MDTN: Multi-scale Dense Transformer Network for single-image super-resolution
* Medical Knowledge Intervention Prompt Tuning for Medical Image Classification
* MedSetFeat++: An attention-enriched set feature framework for few-shot medical image classification
* MegaLoc: One Retrieval to Place Them All
* Memory-Based Temporal Transformer U-Net for Multi-Frame Infrared Small Target Detection
* MER-CLIP: AU-Guided Vision-Language Alignment for Micro-Expression Recognition
* MerCulture: A Comprehensive Benchmark to Evaluate Vision-Language Models on Cultural Understanding in Singapore
* MERIT dataset: Modelling and efficiently rendering interpretable transcripts, The
* MESA: Text-Driven Terrain Generation Using Latent Diffusion and Global Copernicus Data
* Mesh Extraction for Unbounded Scenes Using Camera-Aware Octrees
* MeshUp: Multi-Target Mesh Deformation via Blended Score Distillation
* method for automatic breast density classification in magnetic resonance imaging, A
* MetricCol: Metric Depth and Pose Estimation in Colonoscopy via Geometric Consistency and Domain Adaptation
* MFE-STN: A Versatile Front-End Module for SAR Deception Jamming False Target Recognition
* MFSR-GAN: Multi-Frame Super-Resolution with Handheld Motion Modeling
* MGAEPL: Multi-Granularity Automated and Editable Prompt Learning for brain tumor segmentation
* MGAF: LiDAR-Camera 3D Object Detection With Multiple Guidance and Adaptive Fusion
* MGFNet: Multi-granularity medical pattern fusion network for patient risk prediction
* MGMSDNet: Multi gradient multi scale attention driven denoiser network
* MHFNet: Multimodal hybrid fusion framework for misaligned SAR-Optical ship detection
* MICH-Net: A Novel Deep Learning Architecture With African Fire Hawk Optimization for Steering Angle Prediction in an Advanced Driver Assistance System
* Microwave-Optical Multi-Stage Synergistic Daily 30 m Soil Moisture Downscaling Framework, A
* MIEI: A KID-based quality assessment metric for grayscale industrial equipment images
* MIGF-Net: Multimodal interaction-guided fusion network for image aesthetics assessment
* miMamba: EEG-Based Emotion Recognition With Multi-Scale Inverted Mamba Models
* Mind the data: Evaluating data quality sensitivity in medical LLMs
* Minimalist Approach to HDR Image Compression with Applications to Low-Light Image Enhancement, A
* Mipmap-GS: Let Gaussians Deform with Scale-Specific Mipmap for Anti-Aliasing Rendering
* Mitigating task randomness in graph few-shot learning
* Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model
* MixL-CNN: Lightweight multi-scale model for cross-domain aspect term extraction
* Mixture-of-Shape-Experts (MoSE): End-to-End Shape Dictionary Framework to Prompt SAM for Generalizable Medical Segmentation
* MixViT: Single image dehazing using Mixed Attention based Vision Transformer
* MKFi: Temporally robust WiFi CSI-based activity recognition under data scarcity
* MLANet: Multilevel aggregation network for binocular eye-fixation prediction
* mli-NeRF: Multi-Light Intrinsic-Aware Neural Radiance Fields
* MLM-EOE: Automatic Depression Detection via Sentimental Annotation and Multi-Expert Ensemble
* mmDiffusion: mmWave Diffusion for Sequential 3D Human Dense Point Cloud Generation
* MMDrive: Multi-Modal Remote Physiological Signal Measurement Dataset for Driver Status Monitoring
* MMDStegNet: An Adversarial Steganography Framework With Maximum Mean Discrepancy Regularization
* MMFNet: A multi-scale memory fusion network based on simulated abnormal samples for anomaly detection
* MMP: Enhancing unsupervised graph anomaly detection with multi-view message passing
* MObi: Multimodal Object Inpainting Using Diffusion Models
* MobileSACNet: Lightweight Spectral-Spatial Compression for Hyperspectral Segmentation in Autonomous Driving Systems
* MoCLIP: Motion-Aware Fine-Tuning and Distillation of CLIP for Human Motion Generation
* Modality Confusion Learning: A Versatile Framework for Visible-Infrared Re-identification
* Modality Selection and Skill Segmentation via Cross-Modality Attention
* Model-Free Adaptive Fault-Tolerant Formation Control for Nonlinear MIMO Multiagent Systems Over Fading Channels
* Modeling and Solving Methods for Eco-Driving of Freight Trains With Traction Chains Temperature Models
* Modeling Temporal Resistance Assessment of Cotton to Verticillium Wilt Using Airborne Hyperspectral Data and Disease Progression Rates
* Modified Whale Algorithm and Morley PSO-ML-Based Hyperparameter Optimization for Intrusion Detection
* Modifying Generative Distributions in Latent Diffusion Models to Improve Alignment with Desired Properties
* MoExDA: Domain Adaptation for Edge-based Action Recognition
* Mof-Image: Generating Mixture-of-Features Video Game Image Dataset via GPU Rendering Simulation
* MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training
* mondrian conformal predictive system with improved decision trees for uncertainty quantification under heteroscedasticity, A
* Monitoring Sand Dune Height Change in Kubuqi Desert Based on a Bistatic InSAR-Measured DEM Differential Method
* Monitoring Wet-Snow Avalanche Risk in Southeastern Tibet with a UAV-Based Multi-Sensor Framework
* MonoA2: Adaptive depth with augmented head for monocular 3D object detection
* Monocular 3D lane detection with geometry-guided transformation and contextual enhancement
* Monocular 3D Reconstruction Based on Deep Convolutional Neural Networks
* MonoGSDet: Monocular 3D Object Detection With Gaussian Splatting in Autonomous Driving
* MonoPatchNeRF: Improving Neural Radiance Fields with Patch-Based Monocular Guidance
* Monte Carlo-Based Spatial Optimization of Simulation Plots for Forest Growth Modeling
* Monthly Convective Boundary Layer Height Study over Brazil Using Radiosonde, ERA5, and COSMIC-2 Data
* Mopeft: A Mixture-of-Pefts for the Segment Anything Model
* Morphological Correction Method for River Skeleton Lines Based on Sampling Point Offsets
* MorphoSkel3D: Morphological Skeletonization of 3D Point Clouds for Informed Sampling in Object Classification and Retrieval
* Mosaic Pattern Excavation Transformer for Spectral Imaging
* MOSAIC: A multi-view 2.5D organ slice selector with cross-attentional reasoning for anatomically-aware CT localization in medical organ segmentation
* MOT-STM: Maritime Object Tracking: A Spatial-Temporal and Metadata-based approach
* MotionDreamer: Exploring Semantic Video Diffusion Features for Zero-Shot 3D Mesh Animation
* MovieChat+: Question-Aware Sparse Memory for Long Video Question Answering
* MPRNet: A Temporal-Aware Cross-Modal Encoding Framework for Personality Recognition
* MRLF: Multi-Resolution Layered Fusion Network for Optical and SAR Images
* MSFA Image Denoising Using Physics-Based Noise Model and Noise-Decoupled Network
* MSFENet: Multi-Scale Filter-Enhanced Network architecture for digital image forgery trace localization
* MSFFNet: Multimodal Spatial-Frequency Fusion Network for RGB-DSM Remote Sensing Image Segmentation
* MSFIQA: A multi-scale feature fusion network based on human visual perception for no-reference image quality assessment
* MTA-VPS: A Large-Scale Benchmark for Video-Based Person Search
* MTADA: A Multi-Task Adversarial Domain Adaptation Network for EEG-Based Cross-Subject Emotion Recognition
* MTD-YOLO: A Multi-Scale Perception Framework with Task Decoupling and Dynamic Alignment for UAV Small Object Detection
* MTDA-STGCN: Modern Temporal and Dual-Attention-Based Spatiotemporal Graph Convolutional Network for 4D Trajectory Prediction
* MTDNet: A crowd counting network based on a multiscale transformer and dilated convolution
* Mtevent: A Multi-Task Event Camera Dataset for 6D Pose Estimation and Moving Object Detection
* MTGT: Multiscale Text Feature-Guided Transformer in medical image segmentation
* MTW-DETR: A multi-task collaborative optimization model for adverse weather object detection
* Multi-AD: cross-domain unsupervised anomaly detection for medical and industrial applications
* Multi-Agent Systems for Robotic Autonomy with LLMs
* Multi-aspect Knowledge Distillation with Large Language Model
* Multi-Centralized Strategy for Trajectory-Based Active Traffic Management With Cooperative Automation, A
* Multi-Dimensional Quality Assessment for UGC Videos via Modular Multi-Modal Vision-Language Models
* Multi-Entity Video Transformers for Fine-Grained Video Representation Learning
* Multi-exposure image enhancement and YOLO integration for nighttime pedestrian detection
* Multi-Flow: Multi-View-Enriched Normalizing Flows for Industrial Anomaly Detection
* Multi-layer graph constraint dictionary pair learning for image classification
* multi-layer processing and coarse filtering network for accurate feature matching, A
* Multi-Layer Radial Basis Function Networks for Out-of-Distribution Detection
* multi-modal 3D object detection framework based on enhanced Convolution, mixed Sampling, and Image-Point cloud bidirectional fusion, A
* Multi-Modal Cooperative Distillation for Zero-Shot Multi-Label Classification
* Multi-modal deep facial expression recognition framework combining knowledge distillation and retrieval-augmented generation
* Multi-modal mamba framework for RGB-T crowd counting with linear complexity
* Multi-Modal masked autoencoder and parallel Mamba for 3D brain tumor segmentation
* Multi-Modal Multi-Expert Framework for Pain Assessment in Postoperative Children, A
* Multi-Modal Prompts With Primitives Enhancement for Compositional Zero-Shot Learning
* Multi-Objective Agent-Based Model Predictive Controller for Plug-and-Play Vehicle Control
* Multi-Objective Model for Traffic Signal Coordination Control With Queue Profile Estimation, A
* Multi-Option Hierarchical Reinforcement Learning Framework With State Segmentation for Mixed On-Ramp Merging
* Multi-Person Pose Estimation Evaluation Using Optimal Transportation and Improved Pose Matching
* Multi-Prior Fusion Transfer Plugin for Adapting In-Air Models to Underwater Image Enhancement and Detection
* Multi-Scale Feature Refinement and Dual-Attention Enhanced Dynamic Convolutional Network for Speech-Based Depression and ADHD Assessment, A
* Multi-scale feature sharing and collaborative sampling for unsupervised vehicle re-identification
* Multi-Scale Information-Driven Rock Classification Algorithm Based on Enhanced ResNet, A
* Multi-Scale Reinforcement Learning of Dynamic Energy Controller for Connected Electrified Vehicles
* Multi-Sensor Machine Learning Framework for Field-Scale Soil Salinity Mapping Under Data-Scarce Conditions, A
* Multi-Source Data Fusion and Ensemble Learning for Canopy Height Estimation: Application of PolInSAR-Derived Labels in Tropical Forests
* Multi-Spectral Imaging and Data Fusion for Real-Time Bleeding Detection
* Multi-subspace graph clustering joint dimensionality reduction and feature selection
* Multi-target federated backdoor attack based on feature aggregation
* Multi-Task Learning for Ocean-Front Detection and Evolutionary Trend Recognition
* Multi-Task Learning Model for V-PCC Geometry Compression Artifact Removal
* Multi-Technique 3D Modelling of Narrow Gorges to Assess Stability: Case Study of Caminito Del Rey (Spain)
* Multi-Vehicle Collaborative Trajectory Planning for AVP in Parking Lots: A Bio-Inspired Evolutionary Reinforcement Learning Approach
* Multi-view biclustering via non-negative matrix tri-factorisation
* Multi-View Self-Supervised Domain Adaptation for EEG-Based Emotion Recognition
* Multidimensional Imaging Data Completion via Weighted Three-Directional Minimax Concave Penalty Regularization
* Multimodal 3D Object Detection on Unseen Domains
* Multimodal Dynamic Cost Matrix Adaptation under data imbalance for multimodal sentiment analysis
* Multimodal Emotion Prediction in Interpersonal Videos Integrating Facial and Speech Cues
* Multimodal Framework for Therapeutic Consultations
* Multimodal Generalized Category Discovery
* Multimodal Rationales for Explainable Visual Question Answering
* Multimodal Spatiotemporal Deep Fusion for Highway Traffic Accident Prediction in Toronto: A Case Study and Roadmap
* Multiplatform Remote Sensing Techniques for Active Tectonics, Seismotectonics, and Volcanic Hazard Assessment
* Multiple cross-modal complementation network for lightweight RGB-D salient object detection
* Multiple Instance Learning for Visual Grain Quality Analysis Without Instance-Level Annotation
* Multisensory Music Listening in Affective Virtual Environments
* Mutual Head Knowledge Distillation Framework for Lightweight RGB-T Crowd Counting, A
* Mutualistic Multi-Network Noisy Label Learning (MMNNLL) Method and Its Application to Transdiagnostic Classification of Bipolar Disorder and Schizophrenia
* MV2DFusion: Leveraging Modality-Specific Object Semantics for Multi-Modal 3D Detection
* MVA 2025 Small Multi-Object Tracking for Spotting Birds Challenge: Dataset, Methods, and Results
* MVCM: Enhancing Multi-View and Cross-Modality Alignment for Medical Visual Question Answering and Medical Image-Text Retrieval
* NadirFloorNet: Reconstructing Multi-Room Floorplans from a Small Set of Registered Panoramic Images
* Nano-plasmonic dual-mode probe for near-vector field scanning optical microscopy
* Nanoparticle Diameter Measurements with Event Camera Tracking
* NAS-GS: Normal Alignment and Surface-Constrained Optimization of 3DGS for High-Fidelity Surface Reconstruction
* Naturally Computed Scale Invariance in the Residual Stream of ResNet18
* Near-Incident Detection in Railroad Environments: Lateral Distance Estimation from Train-Mounted Monocular Camera
* Near-Real-Time Turbidity Monitoring at Global Scale Using Sentinel-2 Data and Machine Learning Techniques
* Neighbor-Based Feature and Index Enhancement for Person Re-Identification
* NeIn: Telling What You Don't Want
* NeRF gets personal: Mask-NeRF for targeted scene elements reconstruction
* Network-Wide Freeway Traffic Estimation Using Sparse Sensor Data: A Dirichlet Graph Auto-Encoder Approach
* NeuHMR: Neural Rendering-Guided Human Motion Reconstruction
* NeuRadar: Neural Radiance Fields for Automotive Radar Point Clouds
* Neural Solver With Traversal-Based Feature Representation and Adjacent Attention for Capacitated Arc Routing Problem, A
* Neurosurgical Craniotomy Training System Based on Haptic Virtual Reality Simulation, A
* New Advances Towards Early Warning Systems in the Mediterranean Sea Using the Real-Time RING GNSS Research Infrastructure
* New Approach to Characterize Dynamics of ECG-Derived Skin Nerve Activity via Time-Varying Spectral Analysis, A
* new framework for realizing fraction-order filters with robust performance, A
* New Tai Le Character Recognition System Based on ModelArts Platform, The
* Nexar Dashcam Collision Prediction Dataset and Challenge
* NExNet Seg: Neuron Expansion Network for Medical Image Segmentation
* Next-Generation C-Band SAR Mission: Design Concept for Earth Observation Service Continuity
* No reference Point Cloud Quality Assessment via cross-modal learning and contrastive enhancement
* No Train Yet Gain: Towards Generic Multi-Object Tracking in Sports and Beyond
* No-MambAAD: Revitalizing Conv-Only Networks for Unsupervised Anomaly Detection
* No-Reference Image Quality Assessment: Exploring Intrinsic Distortion Characteristics via Generative Noise Estimation With Mamba
* Noise Algorithms in Game Terrain Generation
* Noise Consistency Regularization for Improved Subject-Driven Image Synthesis
* Noise-aware state-space method for underwater object detection
* Noise-based Regularized Training for Diffusion Models
* Noise-tolerant scheme and explicit regularizer for deep active learning with noisy oracles
* NoKSR: Kernel-Free Neural Surface Reconstruction via Point Cloud Serialization
* Nonconvex Transform-Based Low-Rank Tensor Completion With Coupled Spatiotemporal Relation Learning for Traffic Data Recovery
* Nonuniform low-light image enhancement via noise-aware decomposition and adaptive correction
* Normalization-Calibration Model for Multi-Source Ground-Based FPAR Observations in Mountainous Forests, A
* Novel 3D Decoder with Weighted and Learnable Triple Attention for 3D Microscopy Image Segmentation, A
* Novel AIoT-Based and User Behavior-Driven Dockless Bike-Sharing Management System for Chaotic Operations in a Condensed City, A
* Novel Approach for Monitoring Vessel Load Capacity Through Ship Dimension Extraction From Synthetic Aperture Radar Images, A
* Novel Conditional Adversarial Domain Adaptation Network for EEG Cross-Subject Emotion Recognition, A
* Novel Few-Shot Learning Framework for Supervised Diffeomorphic Image Registration Network, A
* Novel Framework Based on Data Fusion and Machine Learning for Upscaling Evapotranspiration from Flux Towers to the Regional Scale, A
* Novel Framework Integrating Spectrum Analysis and AI for Near-Ground-Surface PM2.5 Concentration Estimation, A
* Novel Hybrid Attention-Based Dilated Network for Depression Classification Model from Multimodal Data Using Improved Heuristic Approach, A
* novel image enhancement method based on image decomposition and deep neural networks, A
* Novel Intelligent Operation State Recognition Approach for High-Temperature Superconducting Levitator of Maglev Vehicle Based on Thermal-Vibration Fusion Method, A
* novel method for remote sensing phycocyanin leveraging optical classification and an integrated machine learning algorithm, A
* Novel Robustness Measure for Evaluating Perceptions in Autonomous Driving, A
* Novel Three-Dimensional Imaging Method for Space Targets Utilizing Optical-ISAR Joint Observation, A
* Novel U-Shaped Network Combined with a Hierarchical Sparse Attention Mechanism for Coastal Aquaculture Area Extraction in a Complex Environment, A
* novel vision transformer with selective residual in multihead self-attention for pattern recognition, A
* Nowcasting Echo Top for Aviation Operations Using CNN-Transformer
* NPPCast: A Compact CNN Integrating Satellite Data for Global Ocean Net Primary Production Forecasts
* NTIRE 2025 Ambient Lighting Normalization Challenge Report
* NTIRE 2025 Challenge on Cross-Domain Few-Shot Object Detection: Methods and Results
* NTIRE 2025 Challenge on Day and Night Raindrop Removal for Dual-Focused Images: Methods and Results
* NTIRE 2025 Challenge on Efficient Burst HDR and Restoration: Datasets, Methods, and Results
* NTIRE 2025 Challenge on Event-Based Image Deblurring: Methods and Results
* NTIRE 2025 Challenge on HR Depth From Images of Specular and Transparent Surfaces
* NTIRE 2025 Challenge on Image Super-Resolution (×4): Methods and Results
* NTIRE 2025 Challenge on Light Field Image Super-Resolution: Methods and Results
* NTIRE 2025 Challenge on Low Light Image Enhancement: Methods and Results
* NTIRE 2025 Challenge on Night Photography Rendering
* NTIRE 2025 Challenge on Raw Image Restoration and Super-Resolution
* NTIRE 2025 Challenge on Real-World Face Restoration: Methods and Results
* NTIRE 2025 Challenge on Short-Form UGC Video Quality Assessment and Enhancement: KWAISR Dataset and Study
* NTIRE 2025 Challenge on Short-Form UGC Video Quality Assessment and Enhancement: Methods and Results
* NTIRE 2025 Challenge on Single Image Reflection Removal in the Wild: Datasets, Methods and Results
* NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment
* NTIRE 2025 Challenge on UGC Video Enhancement: Methods and Results
* NTIRE 2025 Challenge on Video Quality Enhancement for Video Conferencing: Datasets, Methods and Results
* NTIRE 2025 Image Shadow Removal Challenge Report
* NTIRE 2025 the 2nd Restore Any Image Model (RAIM) in the Wild Challenge
* NTIRE 2025 XGC Quality Assessment Challenge: Methods and Results
* Obfuscation Based Privacy Preserving Representations Are Recoverable Using Neighborhood Information
* Object Agnostic 3D Lifting in Space and Time
* Object Detection Data Synthesis via Box-to-Image Generation Based on Diffusion Models
* Object is Worth 64×64 Pixels: Generating 3D Object via Image Diffusion, An
* Object State Recognition in Cooking Videos through End State Frames Analysis
* ObjectCarver: Semi-Automatic Segmentation, Reconstruction and Separation of 3D Objects
* Oblique-MERF: Revisiting and Improving MERF for Oblique Photography
* Observed Mesoscale Wind Response to Sea Surface Temperature Patterns: Modulation by Large-Scale Physical Conditions
* OccludeNeRF: Geometry-Aware 3D Scene Inpainting with Collaborative Score Distillation in NeRF
* OD-NeRF: Efficient Training of On-the-Fly Dynamic Neural Radiance Fields
* Off-axis system misalignment detection technique based on hybrid features of the wavelet transform and the Gabor filter
* Off-Nadir Satellite Image Scene Classification: Benchmark Dataset, Angle-Aware Active Domain Adaptation, and Angular Impact Analysis
* Oil Spill Detection and Identification on Coastal Sandy Beaches: Application of Field Spectroscopy and CMOS Sensor Imagery
* On the robustness of adversarial training against uncertainty attacks
* On the Robustness of GUI Grounding Models Against Image Attacks
* On the Suitability of Reinforcement Fine-Tuning to Visual Tasks
* One-to-Many Fine-Grained Matching Between UAV Images and Satellite Images for UAV Self-Localization
* Online 3D Scene Reconstruction Using Neural Object Priors
* Online Gaussian Test-Time Adaptation of Vision-Language Models
* OnlyFlow: Optical Flow Based Motion Conditioning for Video Diffusion Models
* Open Dataset and Enhancement Method for Long-Wave Thermal Diurnal Material Classification
* Open-Vocabulary Semantic Part Segmentation of 3D Human
* OpenSplat3D: Open-Vocabulary 3D Instance Segmentation using Gaussian Splatting
* OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection
* Operational Optimization of the Intelligent Bi-Directional Suspended Monorail System for Sea-Rail Intermodal Transportation
* Optical Remote Sensing Ship Detection Combining Channel Shuffling and Bilinear Interpolation
* OpticFusion: Multi-Modal Neural Implicit 3D Reconstruction of Microstructures by Fusing White Light Interferometry and Optical Microscopy
* OptiFusionStack: A Physio-Spatial Stacking Framework for Shallow Water Bathymetry Integrating QAA-Derived Priors and Neighborhood Context
* Optimal Mixture Model Distribution Alignment-Based 3D-2D Gaussian Splatting Registration for Monocular Endoscopic AR Guidance
* Optimising Vision Transformer Performance on Limited Datasets: A Multi-Gradient Approach
* Optimization of Robotaxi Dispatch With Pick-Up/Drop-Off-Point and Boarding-Time Recommendation
* Optimization of Spatial Sampling in Satellite-UAV Integrated Remote Sensing: Rationale and Applications in Crop Monitoring
* Optimized GNSS RTK/INS/Vision Integration-Based Vehicle Positioning Model and Its Credibility Assessment, An
* Optimizing Local-Global Dependencies for Accurate 3D Human Pose Estimation
* Optimizing Right-of-Way Allocation in Urban Expressway Ramp Influence Areas: Placement of Autonomous Vehicle-Dedicated Lanes and Buffer Zones
* OptiPMB: Enhancing 3D Multi-Object Tracking With Optimized Poisson Multi-Bernoulli Filtering
* Oriented feature aggregation and sinkhorn distance fusion for few-shot object detection
* Out-of-Distribution Detection with Adversarial Outlier Exposure
* Out-of-Distribution Knowledge Inference-Based Approach for SAR Imagery Open-Set Recognition
* Out-of-Distribution Segmentation in Autonomous Driving: Problems and State of the Art
* Outlier-Robust Multi-Model Fitting on Quantum Annealers
* Overview of the 1st International Workshop on Interactive Video Search and Exploration
* P2P-NET: A PSO-Vision Framework for Accurate Detection and Multi-Class Classification of Parasitic Eggs in Human and Animal in Microscopy Images
* PACMAN: Rapid identification of keypoint patch-based fiducial marker in occluded environments
* PainFormer: A Vision Foundation Model for Automatic Pain Assessment
* PAL: Boosting Skin Lesion Segmentation via Probabilistic Attribute Learning
* PAN-RSVQA: Vision Foundation Models as Pseudo-Annotators for Remote Sensing Visual Question Answering
* PanoDreamer: Consistent Text to 360-Degree Scene Generation
* Panopticon: Advancing Any-Sensor Foundation Models for Earth Observation
* Para-Lane: Multi-Lane Dataset Registering Parallel Scans for Benchmarking Novel View Synthesis
* Parallel Sampling of Diffusion Models on SO(3)
* Parameter-efficient action planning with large language models for vision-and-language navigation
* Parameterized Low-Rank Regularizer for High-dimensional Visual Data
* Parse Trees Guided LLM Prompt Compression
* PARSEL: A Multimodal Dataset for Modeling Decision-Making Processes Involved in Selecting Partners for Joint Tasks
* Partial Multiview Incomplete Multilabel Learning via Uncertainty-Driven Reliable Dynamic Fusion
* Particle Rendering: Implicitly Aggregating Incident and Outgoing Light Fields for Novel View Synthesis
* PartStickers: Generating Parts of Objects for Rapid Prototyping
* PaSTe: Improving the Efficiency of Visual Anomaly Detection at the Edge
* PatchContrast: Self-Supervised Pre-Training for 3D Object Detection
* Pathological graph self-supervised learning for clear-cell renal cell carcinoma survival prediction
* Pathway-Aware Multimodal Transformer (PAMT): Integrating Pathological Image and Gene Expression for Interpretable Cancer Survival Analysis
* PCBEAR: Pose Concept Bottleneck for Explainable Action Recognition
* PCFFusion: Progressive cross-modal feature fusion network for infrared and visible images
* Perception-Based Feedback Tracking via Scalarized Pixel Data of Visual Frames
* Perception-Oriented Bidirectional Attention Network for Image Super-Resolution Quality Assessment
* Perceptual Transform Fusion of Infrared and Visible Images
* Perturbation distillation and backdoor feature induction for universal defense in deep vision models
* Perturbed State Space Feature Encoders for Optical Flow with Event Cameras
* PETAH: Parameter Efficient Task Adaptation for Hybrid Transformers
* PF3Det: A Prompted Foundation Feature Assisted Visual Lidar 3D Detector
* PGVS: A probabilistic graph-theoretic framework for view-graph selection in structure-from-motion
* Photovoltaic Power Station Identification Based on High-Resolution Network and Google Earth Engine: A Case Study of Qinghai Province, Northwest China
* Physical-Model-Guided Dual-Branch Generative Adversarial Network for Thin Cloud Removal
* Physics-Based Human Pose Estimation from a Single Moving RGB Camera
* Physics-Driven Machine-Learning Retrieval and Uncertainty Quantification of Crop Leaf Area Index
* Physics-Guided Conditional Diffusion Model for GPR Denoising and Signal Recovery in Complex Mining Environments
* Physics-Guided Detector for SAR Airplanes
* Physics-Informed Deep Learning for 3D Wind Field Retrieval of Open-Ocean Typhoons
* Physics-informed Machine Learning for Medical Image Analysis
* Physics-Informed Transformer Networks for Interpretable GNSS-R Wind Speed Retrieval
* PhysNav-DG: A Novel Adaptive Framework for Robust VLM-Sensor Fusion in Navigation Applications
* PhytoSynth: Leveraging Multi-modal Generative Models for Crop Disease Data Generation with Novel Benchmarking and Prompt Engineering Approach
* PiCaZo: Pixel-Aligned Contrastive Learning for Zero-Shot Domain Adaptation
* PineSORT: A Simple Online Real-Time Tracking Framework for Drone Videos in Agriculture
* PIR: Photometric Inverse Rendering with Shading Cues Modeling and Surface Reflectance Regularization
* PlanarTrack: A high-quality and challenging benchmark for large-scale planar object tracking
* PlaneRecTR++: Unified Query Learning for Joint 3D Planar Reconstruction and Pose Estimation
* Plant-to-camera enabled 3D morphological reconstruction: A high-fidelity approach for plant phenotyping
* Platoon Communication Power Control Under V2V Data Uncertainty: A Robust DRL Approach
* Plenoptic PNG: Real-Time Neural Radiance Fields in 150 KB
* Plug and play labeling strategies for boosting small brain lesion segmentation
* PLVM: A Tuning-Free Approach for Personalized Large Vision-Language Model
* PlückeRF: A Line-Based 3D Representation for Few-View Reconstruction
* PME-MER: Efficient Position and Motion Encoder for Micro-Expression Recognition
* PMGT-VR: A Decentralized Proximal-Gradient Algorithmic Framework With Variance Reduction
* PMI-Transformer: Parking Memory Interaction Transformer for Vehicle Intent Prediction via Cooperative Vehicle-Infrastructure Systems
* Point Cloud Edge Extraction Based on 3D Point Separability Filter with Spherical Mask
* Point Cloud Stitching Approach Based on Image Registration for High-Precision Threaded Surface Modeling in Multi-View 3D Imaging, A
* Point Evolution Hierarchy Network for Weak Single-Point Human Parsing
* Points-Supervised Fundus Vessel Segmentation via Shape Priors and Contrastive Learning
* Polar Coordinate-Based 2D Pose Prior with Neural Distance Field
* Polarization Compensation and Multi-Branch Fusion Network for UAV Recognition with Radar Micro-Doppler Signatures
* Polynomial Fitting-Based Estimation of Spatially Varying Point Spread Function From a Single Image
* Pose-Aware Weakly-Supervised Action Segmentation
* Pose-to-Pose: A New Task and Benchmark for Human Pose Transition in Yoga
* PoseGuru: Landmarks for Explainable Pose Correction using Exemplar-Guided Algorithmic Recourse
* PoseSynVIT: Lightweight and Scalable Vision Transformers for Human Pose Estimation
* Position-Invariant Graph Convolutional Recurrent Network for Traffic Forecasting
* Post-Fire Streamflow Prediction: Remote Sensing Insights from Landsat and an Unmanned Aerial Vehicle
* Posture Action Correction Method for Sports Dance Using Improved Deep Reinforcement Learning in IoT
* Power of Augmentations in IR Object Detection, The
* PPTracker: Tracking UAV Swarms with Prior Prompt
* Practical and Secure Authentication Protocol for Vehicle to Grid in Intelligent Transportation Systems
* PRANCE: Joint Token-Optimization and Structural Channel-Pruning for Adaptive ViT Inference
* Pre-Manipulation Alignment Prediction for Open-Vocabulary Object Manipulation Based on End-Effector Trajectories
* Predicting Butterfly Species Presence from Satellite Imagery Using Soft Contrastive Regularisation
* Predictive Benthic Habitat Mapping Reveals Significant Loss of Zostera marina in the Puck Lagoon, Baltic Sea, over Six Decades
* Predictive Display for Teleoperation Based on Vector Fields Using Lidar-Camera Fusion
* Preference isolation forest for structure-based anomaly detection
* Preserving privacy without compromising accuracy: Machine unlearning for handwritten text recognition
* PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario
* Prior tokenization-based interactive segmentation with Vision Transformers
* Privacy Preserving Ordinal-Meta Learning with VLMs for Fine-Grained Fruit Quality Prediction
* Proactive Risk Prediction Framework for Cut-In Maneuvers Incorporating Inherent Driving Styles, A
* Probabilistic Mission Design for Neuro-Symbolic Unmanned Aircraft Systems
* Probabilistic Online Event Downsampling
* Probabilistic Perspective-n-Lines for Indoor Camera Pose Estimation
* Probing Early and Long-Term Drought Responses in Kauri Using Canopy Hyperspectral Imaging
* Probing Vulnerabilities of Vision-Lidar Based Autonomous Driving Systems
* Proc-GS: Procedural Building Generation for City Assembly with 3D Gaussians
* progressive attention network with transformer for multi-label image recognition, A
* Progressive Autoregressive Video Diffusion Models
* Prompt Categories Cluster for Weakly Supervised Semantic Segmentation
* Prompt the Missing: Efficient and Robust Audio-Visual Classification Under Uncertain Modalities
* Prompt-Guided Attention Head Selection for Focus-Oriented Image Retrieval
* Prompt-Tuning SAM: From Generalist to Specialist with Only 2,048 Parameters and 16 Training Images
* PromptNorm: Image Geometry Guides Ambient Light Normalization
* Proposal-Guided Multi-Scale Radar and Vision Fusion for 3D Object Detection
* ProtoPatchNet: An Interpretable Patch-Based Prototypical Network
* Prototype-Based Continual Learning with Label-Free Replay Buffer and Cluster Preservation Loss
* Prototype-Guided Diffusion for Digital Pathology: Achieving Foundation Model Performance with Minimal Clinical Data
* PS4PRO: Pixel-to-Pixel Supervision for Photorealistic Rendering and Optimization
* Pseudo-Labelling Meets Label Smoothing for Noisy Partial Label Learning
* PSTF-AttControl: Per-subject-tuning-free personalized image generation with controllable face attributes
* Public Opinion Crisis Management via Social Media Mining
* PUF-Assisted Lightweight Mutual Authentication of Low-Cost RFID Tags for Medical Privacy Preservation, A
* Pureformer: Transformer-Based Image Denoising
* Pushing the Limits of LiDAR: Accurate Performance Analysis of Indoor 3D LiDARs
* PVUW 2025 Challenge Report: Advances in Pixel-Level Understanding of Complex Videos in the Wild
* Q-CIDNet: Perceptual Quality Aware Color and Intensity Decoupling Network for Video Quality Enhancement
* Q-Learning-Based Multi-Objective Heuristic Algorithm for Solving Multi-Resource Co-Scheduling Problem of Ship-Lock-Channel Considering Water Discharge
* QID: Efficient Query-Informed ViTs in Data-Scarce Regimes for OCR-Free Visual Document Understanding
* Quadrocular, Neuromorphic Stereo Triangulation and Asynchronous Data Fusion for 3D Object Tracking
* Quality Assessment for Talking Head Videos via Multi-Modal Feature Representation
* Quality-Aware Spatio-Temporal Transformer Network for RGBT Tracking
* Quantifying Motion Sickness in Virtual Reality Using a Multimodal 1CNN-GRU-Attention Approach With GSR Data
* Quantifying the Contribution of Forest Restoration to Wind Erosion Control Using RWEQ: A Case Study of Duolun County in Inner Mongolia, China
* Quantized Image Super-Resolution on Mobile NPUs, Mobile AI 2025 Challenge: Report
* Quantum Federated Learning for Multimodal Data: A Modality-Agnostic Approach
* Quaternionic reweighted amplitude flow for phase retrieval in image reconstruction
* query-driven twin network framework with optimization-based meta-learning for few-shot hyperspectral image classification, A
* RAD: Retrieval-Augmented Decision-Making of Meta-Actions with Vision-Language Models in Autonomous Driving
* RADLER: Radar Object Detection Leveraging Semantic 3D City Models and Self-Supervised Radar-Image Learning
* RadSplat: Radiance Field-Informed Gaussian Splatting for Robust Real-Time Rendering with 900+ FPS
* RALSD-YOLO: Lightweight Maize Tassel Detection Algorithm Based on Improved YOLOv8
* Random Walk Detection of Small Targets Based on Information Entropy and Intensity Local Contrast Method
* RankTuning: Cross-Image Partial Tuning Strategies for Rank Optimization in Visual Place Recognition
* Raw Image Reconstruction From RGB on Smartphones. NTIRE 2025 Challenge Report
* Ray-Bundle-Based X-Ray Representation and Reconstruction: An Alternative to Classic Tomography on Voxelized Volumes
* Read My Ears! Horse Ear Movement Detection for Equine Affective State Assessment
* Reading in the Dark with Foveated Event Vision
* Real-Time Detection Method for Surface Defects in 3D Printing Based on YOLOv12 Algorithm, A
* Real-Time Enhancements of Digital Twins With Incremental Time Series Data in Networked Air-Ground Cooperative UAV Swarm Systems
* Real-time facial expression recognition via quaternion Gabor convolutional neural network
* Real-Time Fire Detection Using Hybrid Feature Extraction: Color, Texture, and Motion Analysis
* Real-time fusion of stereo vision and hyperspectral imaging for objective decision support during surgery
* Real-Time Fusion of Two-Stage Point Cloud Clustering and Saliency Image for Water Surface Object Detection, A
* Real-Time Pedestrian Detection at the Edge on a Fully Asynchronous Neuromorphic System
* Real-Time Simulation of Destructible Objects: From Rigid Fractures to Soft-Body Deformation
* Real-Time Ultra-Fine-Grained Surgical Instrument Classification
* RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion
* Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model
* ReasonDrive: Efficient Visual Question Answering for Autonomous Vehicles with Reasoning-Enhanced Small Vision-Language Models
* Recognition and localisation of staircases for visually impaired people
* Reconstructed SWHs Based on a Deep Learning Method and the Revealed Long-Term SWH Variance Characteristics During 1993-2024
* Reconstruction of SMAP Soil Moisture Data Based on Residual Autoencoder Network with Convolutional Feature Extraction
* Recursive Multi-Exposure Alignment with Spatiotemporal Decoupling for Efficient Burst HDR and Restoration
* Reducing semantic ambiguity in open-vocabulary remote sensing image segmentation via knowledge graph-enhanced class representations
* REEF: Relevance-Aware and Efficient LLM Adapter for Video Understanding
* Reference Governor-Based Autonomous Driving Control Strategy for Guaranteed Safe Overtaking, A
* Reference Segmentation Network Based on Feature Interaction Enhancement
* ReferGPT: Towards Zero-Shot Referring Multi-Object Tracking
* RefineHOS: A high-performance hand-object segmentation with fine-grained spatial features
* Refining the granularity of smoke representation: SAM-powered density-aware progressive smoke segmentation framework
* Reflecting Reality: Enabling Diffusion Models to Produce Faithful Mirror Reflections
* Region Uncertainty Estimation for Medical Image Segmentation With Noisy Labels
* Region-Aware Driven Distribution Optimization for Stereo Matching
* Regional Cooperative Decision-Making Based on Coalition Game for Multilane Merging in Mixed Traffic
* Regional patch-based MRI brain age modeling with an interpretable cognitive reserve proxy
* Reinterpreting Hypergraph Kernels: Insights Through Homomorphism Analysis
* REJEPA: A Novel Joint-Embedding Predictive Architecture for Efficient Remote Sensing Image Retrieval
* Rel-SA: Alzheimer's Disease Detection Using Relevance-augmented Self Attention by Inducing Domain Priors in Vision Transformers
* Relation-Aware Meta-Learning for Zero-Shot Sketch-Based Image Retrieval
* Reliable classification through rank-based conformal prediction sets
* Remote Sensing and Multi-Level Data Analyses for Hum na Sutli Landslide Impact Assessment in a Changing Climate
* Remote Sensing for Quantifying Greenhouse Gas Emissions at Carbon Capture, Utilisation and Storage Facilities: A Review
* Remote Sensing Standardized Soil Moisture Index for Drought Monitoring: A Case Study in the Ebro Basin
* Remote Sensing-Based Spatiotemporal Assessment of Heat Risk in the Guangdong-Hong Kong-Macao Greater Bay Area
* RepFC: Universal Structural Reparametrization Block for High Performance, Lightweight Deep Neural Networks
* Representativeness Error Assessment and Multi-Method Scaling of HY-2B Altimeter Significant Wave Height
* Repurposing SAM for User-Defined Semantics Aware Segmentation
* Research on an adaptive robust ellipse fitting method integrating multiple weight strategies
* Research on an Intelligent Security Door Passenger Flow Statistics System Based on an Improved Deep Learning Human Body Recognition Algorithm
* Research on Defect Detection of Wire Rope of Mine Hoist Based on Feature Embedding
* Research on Forest Fire Smoke and Cloud Separation Method Based on Fisher Discriminant Analysis
* Research on Semantic Communication Based on Balancing of Task Distortion
* Research on Ultrasound Image Feature Enhancement Based on Frequency-Domain Self-Attention
* Resolving Surface Heat Island Effects in Fine-Scale Spatio-Temporal Domains for the Two Warmest Metropolitan Cities of Korea
* REST: A resolution preserving network for photorealistic style transfer via semantic distillation
* REST: Holistic Learning for End-to-End Semantic Segmentation of Whole-Scene Remote Sensing Imagery
* Rethinking Compressive Sensing: A Compression Framework for Video Super-Resolution
* Rethinking Emotion Annotations in the Era of Large Language Models
* Rethinking hard training sample generation for medical image segmentation
* Rethinking the Role of Spatial Mixing
* Retinex-guided generative diffusion prior for low-light image enhancement
* Retinex-Guided Histogram Transformer for Mask-Free Shadow Removal
* Retrieval augmented generation for smart calorie estimation in complex food scenarios
* Retrieval of Ozone Profiles from Limb Scattering Measurements of the OMS on FY-3F Satellite
* Revealing Emotional Insights From Mental Health Discussions on Instagram and TikTok Using BERT Models
* Reversible Grayscale Method Based on Bit-Field Multi-Channel Fusion Encoding, A
* Review of Artificial Intelligence and Remote Sensing for Marine Oil Spill Detection, Classification, and Thickness Estimation, A
* Review of Google Earth Engine for Land Use and Land Cover Change Analysis: Trends, Applications, and Challenges, A
* Review of Human Emotion Synthesis Based on Generative Technology, A
* Revisiting Multi-Modal LLM Evaluation
* Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models
* Revisiting Self-Generating Simple Figure Patterns for Learning Microscopy Image Segmentation
* Revisiting Transferable Adversarial Images: Systemization, Evaluation, and New Insights
* Revolutionizing Drug Discovery: Integrating Spatial Transcriptomics with Advanced Computer Vision Techniques
* ReX: Causal discovery based on machine learning and explainability techniques
* RF-REN: RGB-Frequency Relation Exploration Network for Micro-Expression Recognition
* RGB Photo Enhancement on Mobile GPUs, Mobile AI 2025 Challenge: Report
* RGB to Infrared Image Translation Based on Diffusion Bridges Under Aerial Perspective
* RGB-Thermal Cooperative Robot Vision Strategy for Multi-Person Tracking in Both Well-Lit and Low-Light Scenes
* RGBX-DiffusionDet: a framework for multi-modal RGB-X object detection using DiffusionDet
* Rig3DGS: Creating Controllable Portraits From Casual Monocular Videos
* Rigid Body Adversarial Attacks
* RIS-Aided Channel Estimation for Multi-User MIMO mmWave Systems Under Practical Hybrid Architecture With Direct Path
* RISE-SDF: A Relightable Information-Shared Signed Distance Field for Glossy Object Inverse Rendering
* Risk Assessment Method of Three-Dimensional Low-Altitude Airspace Based on Multi-Source Data, A
* Risk Preference-Based Decision-Making and Control Framework for Pedestrian Interaction
* Robotics and autonomous systems: Metaheuristic Optimisation and Deep Learning
* Robust 3D Visual Question Answering via Bias Learning
* Robust 3D Watermarking Method Based on Statistical Features, A
* Robust 6DoF Pose Estimation Against Depth Noise and a Comprehensive Evaluation on a Mobile Dataset
* Robust AD: A Real World Benchmark Dataset for Robustness in Industrial Anomaly Detection
* Robust and Reliable Positioning Method for Complex Environments Based on Quality-Controlled Multi-Sensor Fusion of GNSS, INS, and LiDAR, A
* Robust Collagen Texture Quantification in Nonlinear Microscopy by Combining the Gradient Structure Tensor With a Mixed Noise Model
* Robust High-Precision Time Synchronization for Distributed Sensor Systems in Challenging Environments
* Robust image hiding via conditional invertible neural network
* robust JPEG quantization step estimation method for image forensics, A
* Robust Polyp Detection and Diagnosis Through Compositional Prompt-Guided Diffusion Models
* Robust Reinforcement Learning Framework for Platoon Control of Heterogeneous Vehicles Under Uncertain Dynamics, A
* Robust scene text understanding with OCR token and word alignment for Text-VQA and text-caption
* Robust spatio-temporal graph neural networks with sparse structure learning
* Robust Specific Emitter Identification Under Label Noise and Quantity Limitations in Intelligent Transportation Systems
* Robust Stage-Wise LVLM Adaptation: Multi-Phase Prompt LoRA Fine-Tuning for Compound Expression Recognition
* Robust tracking via rethinking prediction head
* Robust Translation Synchronization Algorithm, A
* Robustifying Point Cloud Networks by Refocusing
* Robustness Evaluation for Video Models with Reinforcement Learning
* Robusto-1 Dataset: Comparing Humans and VLMs on Real Out-Of-Distribution Autonomous Driving VQA from Peru
* RotCLIP: Tuning CLIP with Visual Adapter and Textual Prompts for Rotation Robust Remote Sensing Image Classification
* RouteKG: A Knowledge Graph-Based Framework for Route Prediction on Road Networks
* Routing Optimization for Hazardous Materials Transportation Based on Vehicle Sharing Mode
* Routing Optimization of Takeout Delivery Routes Under Joint Delivery Model of Drones, Occasional Drivers, and Riders
* RSDB-Net: A Novel Rotation-Sensitive Dual-Branch Network with Enhanced Local Features for Remote Sensing Ship Detection
* RVMamba: Selective Text-Vision Mamba for Referring Video Object Segmentation
* S-Band SAR Target Classification Via 2D and 3D Deep Learning Methods
* S-EO: A Large-Scale Dataset for Geometry-Aware Shadow Detection in Remote Sensing Applications
* S2GL-MambaResNet: A Spatial-Spectral Global-Local Mamba Residual Network for Hyperspectral Image Classification
* s2p-hd: GPU-Accelerated Binocular Stereo Pipeline for Large-Scale Same-Date Stereo
* Safe-Construct: Redefining Construction Safety Violation Recognition as 3D Multi-View Engagement Task
* Safety-Driven Interpretable Model for Vehicle Control With Impact on Traffic, A
* SAGA: Semantic-Aware Gray Color Augmentation for Visible-to-Thermal Domain Adaptation Across Multi-View Drone and Ground-Based Vision Systems
* Salient Object Detection with Dynamic Convolutions
* Salient object ranking with reinforcement learning
* SAM-FDN: A SAM Fine-Tuning Adaptation Remote Sensing Change Detection Method Based on Fourier Frequency Domain Analysis Difference Reinforcement
* SAM-GFNet: Generalized Feature Fusion with Hierarchical Network for Hyperspectral Image Semantic Segmentation
* SAM-guided prompt learning for Multiple Sclerosis lesion segmentation
* SAM4EM: Efficient Memory-Based Two Stage Prompt-Free Segment Anything Model Adapter for Complex 3D Neuroscience Electron Microscopy Stacks
* SAMIRO: Spatial Attention Mutual Information Regularization with a pre-trained model as Oracle for lane detection
* SAMJAM: Zero-Shot Video Scene Graph Generation for Egocentric Kitchen Videos
* Sampled Radial Basis Function Neural Network Adaptive Observer-Based Platooning Control for Autonomous Vehicles With Cyber-Attacks
* SampleDet3D: Sample Enhanced 3D Object Detection
* SAMPro3D: Locating SAM Prompts in 3D for Zero-Shot Instance Segmentation
* SAR image change detection based on saliency region guidance and SIFT keypoint extraction
* SAR-Conditioned Consistency Model for Effective Cloud Removal in Remote Sensing Images
* SarcasmBench: Towards Evaluating Large Language Models on Sarcasm Understanding
* SARCLIP: a multimodal foundation framework for SAR imagery via contrastive language-image pre-training
* SARFormer - An Acquisition Parameter Aware Vision Transformer for Synthetic Aperture Radar Data
* SATE: Efficient knowledge distillation with implicit student-aware teacher ensembles
* Satellite Formation Flying Determination with Low-Cost GNSS Receivers Raw Data
* Satellite-Based Methane Emission Monitoring: A Review Across Industries
* Satellite-observed first large-scale Sargassum bloom in the northern South China Sea
* SBAS-InSAR Phase Unwrapping Method Integrating ICU and SNAPHU: A Case Study of Dalian City, An
* SBFS-Net: Smoke Segmentation via Separated Smoke and Background Features
* SC-NeRF: NeRF-Based Point Cloud Reconstruction Using a Stationary Camera for Agricultural Applications
* SCAF-YOLO: Multi-Scale Feature Fusion for Small Object Detection in Remote Sensing Images
* Scalable data twinning
* Scalable Fail-Degraded Systems for Autonomous Vehicles: A Survey
* Scalar computational primitives with perturbative phase interferometry
* Scale parallax network for few-shot learning
* Scale-Invariant Implicit Neural Representations for Object Counting
* Scale-Selectable Global Information and Discrepancy Learning Network for Multimodal Sentiment Analysis
* ScaleGraph: A scalable self-supervised framework for cross-domain zero-shot graph learning
* Scaling Chest X-Ray Foundation Models From Mixed Supervisions for Dense Prediction
* Scaling Laws in Zero-Shot Gender Classification Using CLIP
* Scaling On-Device GPU Inference for Large Generative Models
* Scaling Test-Time Compute can Outperform Larger Architectures in Computer Vision
* Scene Recognition Meets Knowledge Graphs: Enhancing Robustness to Object Diversity
* SCENE-Net: Geometric induction for interpretable and low-resource 3D pole detection with Group-Equivariant Non-Expansive Operators
* Scene-Specific Anomalous Relationship Detection Using Scene Graph Summarization
* SceneMotifCoder: Example-Driven Visual Program Learning for Generating 3D Object Arrangements
* SCESS-Net: Semantic consistency enhancement and segment selection network for audio-visual event localization
* SCFGL: Soft clustering based federated graph learning on Non-IID graphs
* Score-Based Spatial-Temporal Point Process for Traffic Accident Prediction
* ScoreCAM++: Gated Score-Weighted Visual Explanations for CNNs
* screen-shooting resilient watermarking based on Dual-Mode Convolution Block and dynamic learning strategy, A
* SDC-Net: A novel selective dilated convolution network for medical images segmentation
* SDWS-YOLO: A Lightweight Small Object Detection Model for Aerial Remote Sensing Images
* SEA-Net: Dual Attention U-Net for Bleeding Segmentation in Capsule Endoscopy Images
* Seabed-Net: A multi-task network for joint bathymetry estimation and seabed classification from remote sensing imagery in shallow waters
* SeaDiff: Underwater Image Enhancement With Degradation-Aware Diffusion Model
* Secret Point Recognition Algorithm via Test-Time Augmentation Based on Large Language Models
* Secure Medical Image Encryption Scheme Based on Cross-Ring Josephus Scrambling and Two-Dimensional Cellular Automata, A
* Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions
* Sediment-Deficit Sink-Zone Morphodynamics in Oceanic Island Dune Systems: Integration of Field Data and Remote Sources in the Macaronesian Region
* SEED-MYA: A Novel Myanmar Multimodal Dataset for Enhancing Emotion Recognition
* SeeD: Online similarity-preserving pattern discovery for streaming trajectories
* Seeing like a Cephalopod: Colour Vision with a Monochrome Event Camera
* SegGeo-SLAM: A real-time Visual SLAM system for dynamic environments
* Segment Any Primitive: Zero-Shot 3D Primitive Segmentation from Point Cloud
* Segment AnyNeuron
* Segmenting the Motion Components of a Video: A Long-Term Unsupervised Model
* Seismocardiography for Emotion Recognition: A Study on EmoWear With Insights From DEAP
* Select2Drive: Pragmatic Communications for Real-Time Collaborative Autonomous Driving
* Selecting the Optimal Approach for Individual Tree Segmentation in Euphrates Poplar Desert Riparian Forest Using Terrestrial Laser Scanning
* Selective Test-Time Domain Adaptation Using Fisher Information for Robust Facial Expression Recognition In-the-Wild
* Self-Guidance: Boosting Flow and Diffusion Generation on Their Own
* Self-supervised 3D Image Deblurring for Lattice Light Sheet Microscopy
* Self-supervised despeckling based solely on SAR intensity images: A general strategy
* Self-Supervised Learning for Soybean Disease Detection Using UAV Hyperspectral Imagery
* Self-Supervised Neuron Morphology Representation With Graph Transformer
* Self-Supervised Pretraining for Fine-Grained Plankton Recognition
* Self-Supervised Upsampling for Reconstructions With Generalized Enhancement in Photoacoustic Computed Tomography
* Semantic BIM enrichment for firefighting assets: Fire-ART dataset and panoramic image-based 3D reconstruction
* Semantic Matters: Multimodal Features for Affective Analysis
* Semantic Segmentation of iPS Cells: Case Study on Model Complexity in Biomedical Imaging
* Semantic-Assisted Object Clustering for Multi-Modal Referring Video Segmentation
* Semantic-assisted unpaired image dehazing
* Semantic-Aware Local Image Editing with a Single Mask Operation
* SemanticSugarBeets: A Multi-Task Framework and Dataset for Inspecting Harvest and Storage Characteristics of Sugar Beets
* Semi-Analytical-Empirical Hybrid Model for Shallow Water Bathymetry Using Multispectral Imagery Without In Situ Data, A
* Semi-BSU: A Boundary-Aware Semi-Supervised Semantic Segmentation Framework with Superpixel Refinement for Coastal Aquaculture Pond Extraction from Remote Sensing Images
* Semi-Self-Supervised Approach for Dense-Pattern Video Object Segmentation, A
* Semi-supervised feature selection with concept factorization and robust label learning
* Semi-supervised medical image segmentation via anatomy-preserving consistency training
* Semi-Supervised Object-Wise Anomaly Detection for Firearm and Firearm Component Detection in X-Ray Security Imagery
* SemMatcher: Semantic-aware feature matching with neighborhood consensus
* Sensor Agnostic Domain Generalization Framework for Leveraging Geospatial Foundation Models: Enhancing Semantic Segmentation via Synergistic Pseudo-Labeling and Generative Learning, A
* Sentence-Level Relation Semantics Learning via Contrastive Sentences
* Sentinel-1 Based Hybrid Interferometric Approach to Complement EGMS for Landslides Identification, A
* Separating Shared and Domain-Specific LoRAs for Multi-Domain Learning
* SequencePAR: Understanding pedestrian attributes via a sequence generation paradigm
* Serp-Mamba: Advancing High-Resolution Retinal Vessel Segmentation With Selective State-Space Model
* SetPeER: Set-Based Personalized Emotion Recognition With Weak Supervision
* SF3Net: Frequency-Domain Enhanced Segmentation Network for High-Resolution Remote Sensing Imagery
* SGA-YOLO: A Lightweight Real-Time Object Detection Network for UAV Infrared Images
* ShadowAug: A Multi-Strategy Data Augmentation Method for Image Shadow Removal
* ShadowSG: Spherical Gaussian Illumination from Shadows
* Ship Detection in SAR Images Using Sparse R-CNN with Wavelet Deformable Convolution and Attention Mechanism
* Ship Target Feature Detection of Airborne Scanning Radar Based on Trajectory Prediction Integration
* Shopformer: Transformer-Based Framework for Detecting Shoplifting via Human Pose
* Short-Term 3D Human Mesh Recovery with Virtual Markers Disentanglement
* Show or Tell? A Benchmark to Evaluate Visual and Textual Prompts in Semantic Segmentation
* SILK: Smooth InterpoLation Framework for Motion In-Betweening: A Simplified Computational Approach
* SilVar-Med: A Speech-Driven Visual Language Model for Explainable Abnormality Detection in Medical Imaging
* SimCache: Similarity Caching for Efficient VLM-based Scene Understanding
* Similarity Metric Method for Contour Line Groups Considering Terrain Features, A
* Simple Combination of Diffusion Models for Better Quality Trade-Offs in Image Denoising, A
* Simple Detector with Frame Dynamics is a Strong Tracker, A
* Simple Yet Effective Way to Use Polarimetric Information in Stereo Matching
* simple yet lightweight module for enhancing domain generalization through relative representation, A
* Simulation Analysis of Unmanned Aerial Vehicle-Based Laser Remote Sensing for Methane Point Source Traceability and Leakage Quantification
* Simultaneous acquisition of geometry and material for translucent objects
* Single stage weakly supervised semantic segmentation via enhanced patch affinity
* Single-channel SAR GMTI based on pixel-level difference feature extraction network
* Single-molecule orientation localization microscopy I: fundamental limits
* Single-sphere camera-projector calibration via dual epipolar geometry under active illumination
* Single-Stage Uncertainty-Aware Jersey Number Recognition in Soccer
* Singular Pooling: A Spectral Pooling Paradigm for Second-Trimester Prenatal Level II Ultrasound Standard Fetal Plane Identification
* SK-RD4AD: Skip-Connected Reverse Distillation for Robust One-Class Anomaly Detection
* Skin Lesion Classification using Dermoscopic Images and Clinical Metadata: Insights from Multimodal Models
* Skor-xG: Skeleton-Oriented Expected Goal Estimation in Soccer
* SkyCloud360: Sky and Cloud Segmentation in Equirectangular Images
* SL-Seg: A CNN-Transformer Fusion Network for Road Surface and Lane Segmentation in Complex Scenarios
* SLFusion: A Structure-Aware Infrared and Visible Image Fusion Network for Low-Light Scenes
* Slot Attention-Based Feature Filtering for Few-Shot Learning
* SLRTP2025 Sign Language Production Challenge: Methodology, Results, and Future Work
* Small object detection in aerial traffic imagery: A benchmark for motorbike-dominated road scenes
* SmallGS: Gaussian Splatting-Based Camera Pose Estimation for Small-Baseline Videos
* Smart Infrastructure Deployment Strategy Based on Hybrid Networking: Toward City-Scale Vehicle-Cloud Computation
* SmartHome-Bench: A Comprehensive Benchmark for Video Anomaly Detection in Smart Homes Using Multi-Modal Large Language Models
* SMLNet: A SPD Manifold Learning Network for Infrared and Visible Image Fusion
* Smooth Cache: A Universal Inference Acceleration Technique for Diffusion Transformers
* SMORE: Simultaneous Map and Object REconstruction
* Snap-it, Tap-it, Splat-it: Tactile-Informed 3D Gaussian Splatting for Reconstructing Challenging Surfaces
* Snapshot Hyperspectral Imaging using Petrographic Thin Section
* SNNTracker: Online High-Speed Multi-Object Tracking With Spike Camera
* SoccerNet-v3D: Leveraging Sports Broadcast Replays for 3D Scene Understanding
* Soil Freeze-Thaw Disturbance Index and Its Indicative Significance on the Qinghai-Tibet Plateau
* SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection
* SoyStageNet: Balancing Accuracy and Efficiency for Real-Time Soybean Growth Stage Detection
* SPADe: Spatial Plaid Attention Decoder for Semantic Segmentation of Street Views
* SPAFormer: Sequential 3D Part Assembly with Transformers
* Sparse-to-Dense Training: A Novel Training Scheme to Enhance Vision Transformers
* SparseGS: Sparse View Synthesis Using 3D Gaussian Splatting
* SparseTSF: Lightweight and Robust Time Series Forecasting via Sparse Modeling
* Spatial and Temporal Patterns of Mangrove Forest Change in the Mekong Region over Four Decades Based on a Remote Sensing Data-Driven Approach
* Spatial Cognition from Egocentric Video: Out of Sight, Not Out of Mind
* Spatial Contexts-Informed Self-Supervised Learning Approach for Pavement Distress Segmentation, A
* spatial features and weight adjusted loss infused Tiny YOLO for shadow detection, A
* Spatial Multi-Semantic Features Guided Spectral-Friendly Transformer Network for Hyperspectral Image Classification
* Spatial Planning Model for Obnoxious Facilities with Spatially Informed Constraints, A
* Spatial-Frequency Enhanced Mamba for Multi-Modal Image Fusion
* Spatio-Temporal State Space Model for Efficient Event-Based Optical Flow
* Spatio-Temporal Variability and Environmental Associations of Emergency Department Demand: A Longitudinal Analysis in Zaragoza, Spain (2011-2024)
* Spatiotemporal Analysis of Skier Versus Snowboarder Injury Patterns: A GIS-Based Comparative Study at a Large West Coast Resort
* Spatiotemporal Dynamics and Drivers of Agricultural Drought in the Huang-Huai-Hai Plain Based on Crop Water Stress Index and Spatial Machine Learning
* Spatiotemporal Dynamics of Surface Energy Balance over the Debris-Covered Glacier: A Case Study of Lirung Glacier in the Central Himalaya from 2017 to 2019
* Spatiotemporal Flight Trajectory Prediction and Online Learning Framework Based on Integrated Transformer-Bidirectional Gated Recurrent Unit, A
* Spatiotemporal Patterns of Climate-Vegetation Regulation of Soil Moisture with Phenological Feedback Effects Using Satellite Data
* speckle noise filtering method based on quantum-classical feature fusion neural networks with Monte Carlo Tree Search, A
* Speckle2Self: Learning Self-Supervised Despeckling with Attention Mechanism for SAR Images
* Spectral compression transformer with line pose graph for monocular 3D human pose estimation
* Spectral Prototype Attention Domain Adaptation for Hyperspectral Image Classification
* Spectral-Spatial Superpixel Bi-Stochastic Graph Learning for Large-Scale and High-Dimensional Hyperspectral Image Clustering
* Spectro-Textural Integration in Mangrove Delineation: A Case Analysis of Aboitiz Cleanergy Park, Davao City, Philippines
* SphereFusion: Efficient Panorama Depth Estimation via Gated Fusion
* Spherical target eccentricity correction in photogrammetric applications
* Spherical Vision Transformers for Audio-Visual Saliency Prediction in 360° Videos
* SPIdepth: Strengthened Pose Information for Self-Supervised Monocular Depth Estimation
* Splat-SLAM: Globally Optimized RGB-Only SLAM with 3D Gaussians
* SplatMesh: Interactive 3D Segmentation and Editing Using Mesh-Based Gaussian Splatting
* SplatTouch: Explicit 3D Representation Binding Vision and Touch
* SPM-CyViT: A self-supervised pre-trained cycle-consistent vision transformer with multi-branch for contrast-enhanced CT synthesis
* Spoken in Jest, Detected in Earnest: A Systematic Review of Sarcasm Recognition: Multimodal Fusion, Challenges, and Future Prospects
* Sporadic Federated Learning Approach in Quantum Environment to Tackle Quantum Noise
* Sport Field Calibration with NeRF-Guided Camera Optimization from a Single Image
* SportMamba: Adaptive Non-Linear Multi-Object Tracking with State Space Models for Team Sports
* SPSC-Net: Shared parallel space-channel attention mechanism transformer network for cell sequence image segmentation
* SPTPCA: Structure-Preserving Tensor Principal Component Analysis for Hyperspectral Dimensionality Reduction
* Spurfies: Sparse-View Surface Reconstruction Using Local Geometry Priors
* SqCLIRIL: Spoken query cross-lingual information retrieval in Indian languages
* SRLF: Sparse Representation Learning Framework for Railroad Surrounding Potential Risk Perception Using UAV Imagery
* SRVP: Strong Recollection Video Prediction Model Using Attention-Based Spatiotemporal Correlation Fusion
* SS-NeRF: Physically Based Sparse Spectral Rendering With Neural Radiance Field
* SSA-SARIMA-GSVR Hybrid Model Based on Singular Spectrum Analysis for O3-CPM Prediction, An
* SSCL: Semi-Supervised Comprehensive Learning for Nighttime Semantic Segmentation
* ssEM Image Restoration via Diffusion Models With Multi-Output Joint Strategy for Noise Estimation
* SSL4Eco: A Global Seasonal Dataset for Geospatial Foundation Models in Ecology
* SSRFlow: Semantic-Aware Fusion with Spatial Temporal Re-Embedding for Real-World Scene Flow
* SSUFormer: Spatial-spectral UnetFormer for improving hyperspectral image classification
* ST-GCN-AltFormer: Gesture Recognition With Spatial-Temporal Alternating Transformer
* Stable Attribute Group Editing for Reliable Few-Shot Image Generation
* STAM: Zero-Shot Style Transfer Using Diffusion Model via Attention Modulation
* Standardized Dataset and Image-Subspace-Based Method for Strip-Mode Synthetic Aperture Radar Block-Type Radio Frequency Interference Suppression
* STAPLE: Siamese Transformer Assisted Pseudo Label Ensembling for Unsupervised Domain Adaptation in No-Reference IQA
* STAR-Net: an interpretable model-aided network for remote sensing image denoising
* STARS: Semantics-Aware Text-guided Aerial Image Refinement and Synthesis
* Statistic Temporal Checking and Depth Layering based Multi-Object Relative Size Estimation from Monocular Video
* STEAM: Style Transfer Enabled Adversarial Attack With Attention Mechanism on Remote Sensing Image Scene Classification
* Stochastic latent feature distillation: Enhancing dataset distillation via structured uncertainty modeling
* Stochastic-Based Patch Filtering for Few-Shot Learning
* Stock Price Prediction and Investment Strategy via Machine Learning Model Fusion
* Stokes-S0 Prior-Guided Dual-Branch Network for Polarized Image Enhancement
* Strong Baseline for Multi-Person Tracking in Thermal Infrared Imagery, A
* Strong Baseline: Multi-UAV Tracking via YOLOv12 with BoT-SORT-ReID
* StrongSiamTracker: A Siamese Tracker with Dynamic Global Detection for Robust Anti-UAV Tracking
* STRRNet: Semantics-Guided Two-Stage Raindrop Removal Network
* Structural-prior guided bi-generative network for image inpainting
* Structural-Spectral Graph Convolution With Evidential Edge Learning for Hyperspectral Image Clustering
* Structure-aware spherical density steered cross-domain learning for effective point cloud understanding
* Structure-Induced Gradient Regulation for Generalizable Vision-Language Models
* Structuring the processing frameworks for data stream evaluation and application
* Student gaze target estimation based on depth transformation on dual-view classroom images
* Study of Emotion Concept Formation by Integrating Vision, Physiology, and Word Information Using Multilayered Multimodal Latent Dirichlet Allocation
* Studying Image Diffusion Features for Zero-Shot Video Object Segmentation
* Style-Preserving Diffusion for Scene Text Editing
* Style-Preserving Generator for Synthetic License Plate Recognition
* Subtyping Autism Spectrum Disorder Using Multimodal Multilayer Hypergraphs
* SUIT: Spatial-Spectral Union-Intersection Interaction Network for Hyperspectral Object Tracking
* Super Resolution of Satellite-Based Land Surface Temperature Through Airborne Thermal Imaging
* SUPER-Net: Trustworthy image segmentation via uncertainty propagation in encoder-decoder networks
* Superpixel segmentation of remote sensing images via edge extension and adaptive region merging
* Superpixel Segmentation With Edge Guided Local-Global Attention Network
* Supervised Contrastive Learning With Mixed Samples for Long-Tailed Recognition
* Supervised Domain Adaptation from Scene Text Recognition for Licence Plate Recognition
* Surface Defect Detection of Chip Images Based on the Improved FCOS with SENet, The
* Surface Roughness in Geomorphometry: From Basic Metrics Toward a Coherent Framework
* Surface Uplift Induced by Groundwater Level Variations Revealed Using MT-InSAR Time-Series Observations
* Surface Velocity and Dynamics of the Southern Patagonian Icefield Using Feature and Speckle Tracking Methods on Sentinel-1 SAR Images During 2019-2020
* SurfNet: Reconstruction of Cortical Surfaces via Coupled Diffeomorphic Deformations
* SurfR: Surface Reconstruction with Multi-Scale Attention
* Surprising Utility of Group Partitioning in Improving Conformal Prediction of Visual Classifiers Under Distributional Shifts, The
* survey of publicly available multi-temporal point cloud datasets, A
* Survey of State of the Art Large Vision Language Models: Alignment, Benchmark, Evaluations and Challenges, A
* Survey on Reinforcement Learning Methods for UAV Systems, A
* Survey on the Use of Agent-Based Modeling and Simulation for the Vehicle Routing Problem, A
* SVAD: From Single Image to 3D Avatar via Synthetic Data Generation with Video Diffusion and Data Augmentation
* Swarm of MASSs Cyber-Security for Anti-Hijack System Based on Blockchain and Chaotic-Steganography Using VRF-PBFT and Encoder-Decoder Deep Neural Networks
* SwarmDiff: Swarm Robotic Trajectory Planning in Cluttered Environments via Diffusion Transformer
* SweepEvGS: Event-Based 3D Gaussian Splatting for Macro and Micro Radiance Field Rendering From a Single Sweep
* Swin Transformer-based maritime objects instance segmentation with dual attention and multi-scale fusion
* SwinPaste: A Swin Transformer-Based Framework for RGB-Guided Thermal Image Super-Resolution
* Switcher-HNet: A switchable hierarchical network for tree species classification from forest stand to individual tree tasks
* Syn3DTxt: Embedding 3D Cues for Scene Text Generation
* SynerCD: Synergistic Tri-Branch and Vision-Language Coupling for Remote Sensing Change Detection
* Synergistic Fusion Network of Microscopic Hyperspectral and RGB Images for Multi-Perspective Segmentation
* Synergistic Mechanism of Spatiotemporal Dynamics in Urban Thermal Environments and Air Pollutants in China
* Synergy of Digital Twins and Machine Learning in Sustainable Water Management: SWIM, The
* Synthesizing Consistent Novel Views Via 3D Epipolar Attention Without Re-Training
* Synthetic Data Augmentation using Pre-trained Diffusion Models for Long-tailed Food Image Classification
* Synthetic Dataset for Group Activity Recognition
* T-SAM: Transductive Learning for Segment Anything Model
* T2BR: A Hierarchical Repositioning Approach for Autonomous Mobility on Demand Systems
* Tadmo: A tabular distance measure with move operations
* TAHAG: Two-Stage Domain Adaptation With Hybrid Adaptive Graph Learning for EEG Emotion Recognition
* Talk2Traffic: Interactive and Editable Traffic Scenario Generation for Autonomous Driving with Multimodal Large Language Model
* Task-Agnostic Attacks Against Vision Foundation Models
* Task-Conditioned Ensemble of Expert Models for Continuous Learning
* Task-Distributionally Robust Data-Free Meta-Learning
* Task-Informed Meta-Learning for Remote Sensing
* Task-Level Contrastiveness for Cross-Domain Few-Shot Learning
* Task-Specific Electroencephalogram Analysis: A Novel ICA and Dynamic Multi-Stage Clustering Approach for Neural Signal Processing
* TasselNetV4: A vision foundation model for cross-scene, cross-scale, and cross-species plant counting
* TB-Bench: Training and Testing Multi-Modal AI for Understanding Spatio-Temporal Traffic Behaviors from Dashcam Images/Videos
* TBGA-Net: Trigonometric Bilinear Attention and Global-Aware Aggregation Network for Large-Scale 3D Point Cloud Segmentation
* TBiGAN-based parallel networks for remaining useful life prediction of multi-stage degraded bearings
* TDBCL: A time series dual-branch balance contrastive learning for imbalanced classification
* TEDRA: Text-Based Editing of Dynamic and Photoreal Actors
* Temporal Conditioning for Realistic Performance Video Generation from Instrumental Sounds
* Temporal Consistent Semantic Video Color Transfer from Multiple References
* Tenth NTIRE 2025 Efficient Super-Resolution Challenge Report, The
* Tenth NTIRE 2025 Image Denoising Challenge Report, The
* Terahertz Squint SAR Imaging Based on Decoupled Frequency Scaling Algorithm
* TerraMesh: A Planetary Mosaic of Multimodal Earth Observation Data
* Text Prompt Region Decomposition for Effective Facial Expression Recognition
* Text-Guided Patch Scoring and Local Distortion Guidance for Image Quality Assessment
* Text-Guided Semantic Alignment Network With Spatial-Frequency Interaction for Infrared-Visible Image Fusion Under Extreme Illumination
* text-only weakly supervised learning framework for text spotting via text-to-polygon generator, A
* TextInVision: Text and Prompt Complexity Driven Visual Text Generation Benchmark
* Texture-aware fast mode decision and complexity allocation for VVC based point cloud compression
* Texture-aware transformer with pose-patch mapping for occluded person re-identification
* Texture2LoD3: Enabling LoD3 Building Reconstruction with Panoramic Images
* TGKAV: Tree-Based Group Key Agreement Scheme With Practical Antenna Implementation for Vehicle Platoon
* Thermal Image Super-Resolution Challenge Results - PBVS 2025
* Thermal Pedestrian Multiple Object Tracking Challenge (TP-MOT)
* ThiefCloud: A Thickness Fused Thin Cloud Removal Network for Optical Remote Sensing Image With Self-Supervised Learnable Cloud Prior
* Three-dimensional reconstruction method for sparsely textured surfaces via freely moving line-structured-light-based binocular stereo vision
* Threshold Effect in the Street Vitality Formation Mechanism, The
* Tightly Coupled RTK-Visual-Inertial Integration With a Novel Sliding Ambiguity Window Optimization Framework
* Time series adaptive mode decomposition (TAMD): Method for improving forecasting accuracy in the apparel industry
* Tiny object detection based on dynamic scale-awareness label assignment and contextual enhancement
* TinyDef-DETR: A Transformer-Based Framework for Defect Detection in Transmission Lines from UAV Imagery
* TLAC: Two-Stage LMM Augmented CLIP for Zero-Shot Classification
* To Match or Not to Match: Revisiting Image Matching for Reliable Visual Place Recognition
* ToF-360 - A Panoramic Time-of-Flight RGB-D Dataset for Single Capture Indoor Semantic 3D Reconstruction
* Tokenfocus-VQA: Enhancing Text-to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs
* Tomato Plant Disease Identification via Deep Learning Technique
* TOODIB: Task-aligned one-stage object detection with interactions between branches
* ToothMaker: Realistic Panoramic Dental Radiograph Generation via Disentangled Control
* Topology Optimization in Medical Image Segmentation With Fast χ Euler Characteristic
* TopoSegNet: Scalable topology preservation in image segmentation via critical points
* Touch and Feel Virtual Objects
* Toward Automation in Text-Based Video Retrieval with LLM Assistance
* Toward Camera Open-Set 3D Object Detection for Autonomous Driving Scenarios
* Toward Democratizing High-Definition Map Update Through Consortium Blockchain
* Toward Effective Knowledge Distillation: Navigating Beyond Small-Data Pitfall
* Toward Efficient Traffic State Imputation From Sparse Observations on Road Network Scale: A Spatial-Temporal Query Matching Approach
* Toward Generalizable Prompt Learning via Multi-Regularization Guided Knowledge Distillation
* Toward noise-resilient retrieval of land surface temperature and emissivity using airborne thermal infrared hyperspectral imagery
* Toward Optimal Mixture of Experts System for 3D Object Detection: A Game of Accuracy, Efficiency and Adaptivity
* Toward Reliable Medical Image Segmentation by Modeling Evidential Calibrated Uncertainty
* Toward Resilience in Broadacre Agriculture: A Methodological Review of Remote Sensing in Crop Productivity, Phenology, and Environmental Stress Detection
* Towards 4D human video stylization
* Towards Accurate Prediction of Runout Distance of Rainfall-Induced Shallow Landslides: An Integrated Remote Sensing and Explainable Machine Learning Framework in Southeast China
* Towards Ball Spin and Trajectory Analysis in Table Tennis Broadcast Videos via Physically Grounded Synthetic-to-Real Transfer
* Towards Efficient and Robust Moment Retrieval System: A Unified Framework for Multi-Granularity Models and Temporal Reranking
* Towards Efficient Benchmarking of Foundation Models in Remote Sensing: A Capabilities Encoding Approach
* Towards Evaluating the Robustness of Visual State Space Models
* Towards Exploring Continual Learning for Toxicologic Pathology in Pharmaceutical Drug Discovery
* Towards Faster and More Compact Foundation Models for Molecular Property Prediction
* Towards Fine-Grained Spatial Control for Soccer Game Image Generation
* Towards Foundation Models for 3D Vision: How Close are We?
* Towards Greedy Iterative Adversarial Attack With Distortion Maps Against Deep Face Recognition
* Towards Holistic Visual Quality Assessment of AI-Generated Videos: A LLM-Based Multi-Dimensional Evaluation Model
* Towards Low-Latency Event-Based Obstacle Avoidance on a FPGA-Drone
* Towards resolution-arbitrary remote sensing change detection with Spatial-frequency dual domain learning
* Towards Robust Multimodal AU Detection: STN-Enhanced Visual Encoding and Audio-Visual Spatial-Temporal Alignment
* Towards Scale-Aware Low-Light Enhancement Via Structure-Guided Transformer Design
* Towards Synthetic Concept Activation Vectors via Generative Models
* Towards Trustworthy Autonomous Vehicles with Vision-Language Models under Targeted and Untargeted Adversarial Attacks
* Towards Unconstrained 2D Pose Estimation of the Human Spine
* Towards unified molecule-enhanced pathology image representation learning via integrating spatial transcriptomics
* TP-GLIF: Trajectory Prediction of Surrounding Vehicles in Unsignalized Roundabouts Based on Global-Local History Information Fusion
* TP-LReID: Lifelong person re-identification using text prompts
* TraDiffusion: Trajectory-Based Training-Free Image Generation
* Traffic Sign Recognition Under Visual Perturbations: Shadows, Light Patches, and Simulated Obstructions
* Train-Borne Video Intelligent Solution for High-Speed Railway Infrastructure Inspection, A
* Training Data Reconstruction: Privacy Due to Uncertainty?
* Training Neural Networks on RAW and HDR Images for Restoration Tasks
* Training-Free Color-Style Disentanglement for Constrained Text-to-Image Synthesis
* TrajGNAS: Heterogeneous Multiagent Trajectory Prediction Based on a Graph Neural Architecture Search
* TransBridge: Boost 3D Object Detection by Scene-Level Completion With Transformer Decoder
* Transfer Learning of Stochastic Kriging for Individualized Prediction
* TransformAR: A light-weight transformer-based metric for Augmented Reality quality assessment
* Transformer-based dynamic cell bounding box refinement for end-to-end Table Structure Recognition
* Transformer-Based Lung Infection Severity Prediction with Cross Attention and Conditional TransMix Augmentation
* Transformer-based Visual Grounding with Inter-Modality Cross Attention
* Translating Images to Road Network: A Sequence-to-Sequence Perspective
* TransSTC: transformer tracker meets efficient spatial-temporal cues
* Travel Time and Weather-Aware Traffic Forecasting in a Conformal Graph Neural Network Framework
* Tree-Based Regressor Comparison for Burn Severity Mapping: Spatially Blocked Validation Within and Across Fires
* Tree-Structure Transformer for Skeleton-Based Human Action Recognition
* Trend-aware time series clustering via self-attentive LSTM
* TripleMixer: A Triple-Domain Mixing Model for Point Cloud Denoising Under Adverse Weather
* TRIS: A multimodal and multitask framework for unifying text-image retrieval and referring image segmentation
* Trishul: Towards Region Identification and Screen Hierarchy Understanding for Large VLM Based GUI Agents
* True Hyperspectral Image Super-Resolution Dataset, A
* Trustworthy Multi-UAV Collaboration: A Self-Supervised Framework for Explainable and Adversarially Robust Decision-Making
* TSAR: A two-stage approach to motion artifact reduction in OCTA images
* TSMnet: Two-step separation pipeline based on threshold shrinkage memory network for weakly-supervised video anomaly detection
* TT3D: Table Tennis 3D Reconstruction
* TTGen: Incorporating Test-Time Scaling to Diffusion Models
* TTT-KD: Test-Time Training for 3D Semantic Segmentation Through Knowledge Distillation From Foundation Models
* Tube Arrangement Using a Mesh-Based Sorting Approach in Video Synopsis
* Turbo principles meet compression: Rethinking nonlinear transformations in learned image compression
* Turin3D: Evaluating Adaptation Strategies Under Label Scarcity in Urban Lidar Segmentation with Semi-Supervised Techniques
* Twitter User Geolocation Based on Multi-Graph Feature Fusion with Gating Mechanism
* Two Views are Better Than One: Monocular 3D Pose Estimation with Multiview Consistency
* Two-Dimensional InSAR-Based Framework for Landslide Identification and Movement Pattern Classification, A
* U-ARE-ME: Uncertainty-Aware Rotation Estimation in Manhattan Environments
* U-Shape Mamba: State Space Model for Faster Diffusion
* UAV-based monocular 3D panoptic mapping for fruit shape completion in orchard
* UAV-Enabled Covert Autonomous Vehicular Communication: Joint Trajectory and Resource Allocation Design
* UIT-OpenViIC: An open-domain benchmark for evaluating image captioning in Vietnamese
* Ultra-efficient 3D shape reconstruction: Line-coded absolute phase unwrapping algorithm
* Uncertainty Aware Training to Improve Uncertainty Active Learning for Semantic Segmentation
* Uncertainty Mixture of Experts Model for Long Tail Crop Type Mapping
* Uncertainty Quantification for Gradient-Based Explanations in Neural Networks
* Uncertainty Quantification for Semi-Supervised Object Detection in Remote Sensing Images
* Uncertainty-Guided Adaptive Correction for Semi-Supervised Medical Image Segmentation
* Uncertainty-Guided Style-Aware Probabilistic Perceptual Quality Assessment for AI-Generated Images
* Uncovering Branch Specialization in InceptionV1 Using K Sparse Autoencoders
* Understanding and Mitigating Toxicity in Image-Text Pretraining Datasets: A Case Study on LLaVA
* Understanding and Predicting Temporal Visual Attention Influenced by Dynamic Highlights in Monitoring Task
* Understanding and tackling the modality imbalance problem in multimodal survival prediction
* Understanding Depth and Height Perception in Large Visual-Language Models
* Understanding the Anomalies in Exotic Annual Grass Cover in Precipitation Scenario Maps of Rangelands in the Western United States
* Understanding the Effect of using Semantically Meaningful Tokens for Visual Representation Learning
* Uni4DPC: A Unified 4D Sparse Tensor Network for Intra- and Inter-Frame Dynamic Point Cloud Geometry Compression
* Unicamp-UAV: An open dataset for human detection in UAV imagery
* Unified complex-valued high-resolution frequency representation with cross-domain attention for radar-based physiological state recognition
* Unified Perspective for Loss-Oriented Imbalanced Learning via Localization, A
* Unified-Modal Salient Object Detection via Adaptive Prompt Learning
* UniForCE: The Unimodality Forest method for Clustering and Estimation of the number of clusters
* Unimotion: Unifying 3D Human Motion Synthesis and Understanding
* UNIT: Unsupervised Online Instance Segmentation Through Time
* United We Stand, Divided We Fall: Handling Weak Complementarity for Audio-Visual Emotion Recognition in Valence-Arousal Space
* UniToken: Harmonizing Multimodal Understanding and Generation Through Unified Visual Encoding
* Universal Shape of Strong Remote Adversarial Patches for Object Detection with Convolutional Neural Networks
* Universal Vessel Segmentation for Multi-Modality Retinal Images
* Unlimited Sampling of Multiband Signals: Single-Channel Acquisition and Recovery
* UnMA-CapSumT: Unified and Multi-Head attention-driven caption summarization transformer
* Unmasking Deepfakes: Impacts, Detection Challenges, and Responsible Response
* UnravelNet: A backbone for enhanced multi-scale and low-quality feature extraction in remote sensing object detection
* Unsupervised 3D Braided Hair Reconstruction from a Single-View Image
* Unsupervised Clustering of InSAR Time-Series Deformation in Mandalay Region from 2022 to 2025 Using Dynamic Time Warping and Longest Common Subsequence
* Unsupervised contrastive analysis for anomaly detection in brain MRIs via conditional diffusion models
* Unsupervised domain adaptation for cardiac MRI segmentation via adversarial learning in latent space
* Unsupervised domain adaptation via style-aware self-intermediate domain
* Unsupervised Face Super-Resolution via Integrating Faithful 3D Facial Priors
* Unsupervised Few-Shot Food Recognition With Intra-Class Variation and Inter-Class Similarity Modeling
* Unsupervised Gaze Representation Learning by Switching Features
* Unsupervised instance segmentation with superpixels
* Unsupervised Learning Approach for Reconstructing 3T-Like Images From 0.3T MRI Without Paired Training Data, An
* Unsupervised Object Localization driven by self-supervised foundation models: A comprehensive review
* Unsupervised Stratification of Patients With Myocardial Infarction Based on Imaging and In-Silico Biomarkers
* Unveiling Histopathological Features of Breast Cancers Using Limited Data
* UP-Person: Unified Parameter-Efficient Transfer Learning for Text-Based Person Retrieval
* UPPET: Unified Pedestrian Pose Estimation in Thermal Imaging
* Urban Grid Population Inflow Prediction via POI-Enhanced Conditional Diffusion with Dual-Dimensional Attention
* Urban Informal Settlement Classification via Cross-Scale Hierarchical Perception Fusion Network Using Remote Sensing and Street View Images
* UrbanIR: Large-Scale Urban Scene Inverse Rendering from a Single Video
* Using dynamic knowledge for kernel modulation: Towards image generation via one-shot multi-domain adaptation
* Using K-Means-Derived Pseudo-Labels and Machine Learning Classification on Sentinel-2 Imagery to Delineate Snow Cover Ratio and Snowline Altitude: A Case Study on White Glacier from 2019 to 2024
* USRNet: Unified scene recovery network for image restoration under multiple adverse weather conditions
* V-NAW: Video-Based Noise-Aware Adaptive Weighting for Facial Expression Recognition
* V2X-Reg++: A Real-Time Global Registration Method for Multi-End Sensing System in Urban Intersections
* V3LMA: Visual 3D-Enhanced Language Model for Autonomous Driving
* Validation of Soil Temperature Sensing Depth Estimates Using High-Temporal Resolution Data from NEON and SMAP Missions
* Validation of the MODIS Clumping Index: A Case Study in Saihanba National Forest Park
* Valve Fleets: A Novel Control Method for Mixed Traffic Flow Regulation With Applications in Bottleneck Segments
* variable gaussian kernel scale active contour model based on Jeffreys divergence for ICT image segmentation, A
* Variable-rate learned image compression with integer-arithmetic-only inference
* Variable-Size Symmetry-Based Graph Fourier Transforms for Image Compression
* Variation Characteristics of the Ionospheric E Layer over the Tibetan Plateau and Surrounding Areas During a Full Solar Cycle
* Variational Gaussian Mixture Model for Tracking Multiple Extended Targets or Unresolvable Group Targets in Closely Spaced Scenarios
* Vehicle Accident Detection in Video Surveillance Based on BiFPN-YOLOv8
* Vehicle Detection Under Complex Weather Conditions Based on an Adaptive Model
* Vehicle Localization Framework Using Georeferenced Snow Poles and LiDAR in GNSS-Limited Environments Under Nordic Conditions
* Velocity2DMs: A Contextual Modeling Approach to Dynamics Marking Prediction in Piano Performance
* Versatile Fourier Transform Spectrometer Model for Earth Observation Missions Validated with In-Flight Systems Measurements
* Vertical Characteristics of an Ozone Pollution Episode in Hong Kong Under the Typhoon Mawar: A Case Study
* Very Similar Appearance Feature Classification for Chronic Endometritis Diagnosis in Hysteroscopy Images
* Video and text semantic center alignment for text-video cross-modal retrieval
* Video Steganography With Optimized Robust Modulation Paths for Lossy Channels
* Video, How do Your Tokens Merge?
* VideoARD: An Analysis-Ready Multi-Level Data Model for Remote Sensing Video
* ViDROP: Video Dense Representation Through Spatio-Temporal Sparsity
* Viewpoint-Aware 3D Dense Captioning
* Viewport Prediction for Volumetric Video Streaming by Exploring Video Saliency and User Trajectory Information
* Virtual Pose Coach: A Motion-Retargeting Approach for Pose Training
* Visible-Infrared Person Re-Identification with Modality-Specific Expert
* Vision Language Models for Massive MIMO Semantic Communication
* Vision-by-prompt: Context-aware dual prompts for composed video retrieval
* Vision-language tracking with attention-based optimization
* VisionCube: 3D-Aware Vision-Language Model for Multi-Step Spatial Reasoning
* ViSkin: Physics-Based Simulation of Virtual Skin on Personalized Avatars
* VISTA-CLIP: Visual Incremental Self-Tuned Adaptation for Efficient Continual Panoptic Segmentation
* VISTANet: VIsual Spoken Textual Additive Net for Interpretable Multimodal Emotion Recognition
* Visual and Textual Prompts in VLLMs for Enhancing Emotion Recognition
* Visual Question Answering on Multiple Remote Sensing Image Modalities
* Visual RAG Pipeline for Few-Shot Fine-Grained Product Classification, A
* Visualizing and Controlling Cortical Responses Using Voxel-Weighted Activation Maximization
* Visually Interpretable Subtask Reasoning for Visual Question Answering
* Vit4v: a Video Classification Method for the Detection of Varroa Destructor from Honeybees
* VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior
* VMM: Video-Music Mamba for generating background music from videos
* VNL-STES: A Benchmark Dataset and Model for Spatiotemporal Event Spotting in Volleyball Analytics
* Vocabulary-Free 3D Instance Segmentation with Vision-Language Assistant
* Vocabulary-Free Few-Shot Learning for Vision-Language Models
* VolTex: Food Volume Estimation Using Text-Guided Segmentation and Neural Surface Reconstruction
* Volume Measurement Technology of Dispensing Transparent Adhesives Based on Line Laser Scanning
* VPA: Multi-Modal Virtual Point Augmentation for 3D Object Detection
* VRAG: Retrieval-Augmented Video Question Answering for Long-Form Videos
* VRU-CIPI: Crossing Intention Prediction at Intersections for Improving Vulnerable Road Users Safety
* VXP: Voxel-Cross-Pixel Large-Scale Camera-LiDAR Place Recognition
* W-EICMFusion: A fusion network for infrared and visible images utilising WOA hyperparameter optimisation
* W-MambaFuse: A wavelet decomposition and adaptive state-space modeling approach for anatomical and functional image fusion
* WaterSplatting: Fast Underwater 3D Scene Reconstruction Using Gaussian Splatting
* WaveDIF: Wavelet Sub-Band based Deepfake Identification in Frequency Domain
* Waveform Analysis for Enhancing Airborne LiDAR Bathymetry in Turbid and Shallow Tidal Flats of the Korean West Coast
* Wavelet-Based Mechanistic Interpretability of Vision Transformers via Frequency-Aware Ablations
* Wavelet-based physically guided normalization network for real-time traffic dehazing
* Wavelet-guided diffusion enhancement network with directional learning for single-pixel imaging
* Way Up: a Dataset for Hold Usage Detection in Sport Climbing, The
* Weakly and Self-Supervised Class-Agnostic Motion Prediction for Autonomous Driving
* Weakly Supervised Bilinear Convolutional Neural Network for Fine-Grained Vehicle Classification
* Weakly Supervised Panoptic Segmentation for Defect-Based Grading of Fresh Produce
* Weakly Supervised SAR Ship Oriented-Detection Algorithm Based on Pseudo-Label Generation Optimization and Guidance
* Weakly supervised semantic segmentation of SAR images via multi-level pseudo-label fusion
* Weighted Probabilistic Mask Aggregation for Fault Tolerant Federated Learning
* WFDENet: Wavelet-based frequency decomposition and enhancement network for diabetic retinopathy lesion segmentation
* What is the Added Value of UDA in the VFM Era?
* What Makes for a Good Stereoscopic Image?
* Wheat3DGS: In-Field 3D Reconstruction, Instance Segmentation and Phenotyping of Wheat Heads with Gaussian Splatting
* When Textures Deceive: Weakly Supervised Industrial Anomaly Detection with Adapted-Loss CycleGAN
* When Voices Speak Louder: Leveraging Audio Signals in Emotion-Cause Extraction via Large Multilingual Multimodal Indian Dialogue Datasets
* Where Is the Ball: 3D Ball Trajectory Estimation From 2D Monocular Tracking
* Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models
* Wide Field-of-View MMW SISO-SAR Image Reconstruction Based on Curved Linear Array
* WildlifeReID-10k: Wildlife Re-Identification Dataset with 10k Individual Animals
* Window normalization: Enhancing point cloud understanding by unifying inconsistent point densities
* Window self-attention and 3D volumetric refinement for large vessel occlusion detection in brain angiography
* Window Token Concatenation for Efficient Visual Large Language Models
* Winter Wheat-Yield Estimation in the Huang-Huai-Hai Region Based on KNN-Ward Phenological Zoning and Multi-Source Data
* WMFA-AT: Adaptive Teacher with Weighted Multi-Layer Feature Alignment for Cross-Domain UAV Object Detection
* WQLCP: Weighted Adaptive Conformal Prediction for Robust Uncertainty Quantification Under Distribution Shifts
* X-Edit: Detecting and Localizing Edits in Images Altered by Text-Guided Diffusion Models
* X-Fake: Juggling Utility Evaluation and Explanation of Simulated SAR Images
* XiEff Representation for Interpretable Near-Field Imaging
* XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis
* XLITE-Unet: Extremely Light and Efficient Deep learning architecture with selective atrous and axial depthwise convolution for image segmentation
* XYScanNet: A State Space Model for Single Image Deblurring
* YOLOv8-GAD: A Lightweight Model for Wheat Ear Counting in Field for UAV Edge Computing
* You look from old classes: Towards accurate few shot class-incremental learning
* YUW-Net: A Y-shaped deep network for enhancement of Under-water images
* Z-SASLM: Zero-Shot Style-Aligned SLI Blending Latent Manipulation
* ZECO: ZeroFusion Guided 3D MRI Conditional Generation
* Zero-Shot Denoising for Fluorescence Lifetime Imaging Microscopy with Intensity-Guided Learning
* Zero-Shot Interpretable Image Steganalysis for Invertible Image Hiding
* Zero-Shot Object Detection with Knowledge Enhancement Via Dual-Branch Subgraph Reasoning
* Zero6DOT: Zero-Shot 6D Object Pose Tracking With Monocular RGB Video
* ZeroPS: High-Quality Cross-Modal Knowledge Transfer for Zero-Shot 3D Part Segmentation
* Zfusion: an Effective Fuser of Camera and 4D Radar for 3D Object Perception in Autonomous Driving
2327 for 2512