Keith Price Bibliography update Details for 2111

Update Dates 2111

2111
* *CVPR
* *SVO Pro: Semi-direct Visual-Inertial Odometry and SLAM for Monocular, Stereo, and Wide Angle Cameras
* 2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition
* 3-D Characterization of Urban Areas Using High-Resolution Polarimetric SAR Tomographic Techniques and a Minimal Number of Acquisitions
* 3-D Crosswell Electromagnetic Inversion Based on General Measures
* 3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding
* 3D CNNs with Adaptive Temporal Feature Resolutions
* 3D GAN for Improved Large-pose Facial Recognition, A
* 3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation, Diagnosis, and Quantitative Patient Management
* 3D Human Action Representation Learning via Cross-View Consistency Pursuit
* 3D Object Detection with Pointformer
* 3D Shape Generation with Grid-based Implicit Functions
* 3D Spatial Recognition without Spatially Labeled 3D
* 3D Video Stabilization with Depth Estimation by CNN-based Optimization
* 3D-FUTURE: 3D Furniture Shape with TextURE
* 3D-MAN: 3D Multi-frame Attention Network for Object Detection
* 3D-to-2D Distillation for Indoor Scene Parsing
* 3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction
* 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection
* 4D Hyperspectral Photoacoustic Data Restoration with Reliability Analysis
* 4D Panoptic LiDAR Segmentation
* A2-FPN: Attention Aggregation based Feature Pyramid Network for Instance Segmentation
* ABMDRNet: Adaptive-weighted Bi-directional Modality Difference Reduction Network for RGB-T Semantic Segmentation
* Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution
* Accurate Few-shot Object Detection with Support-Query Mutual Guidance and Hybrid Loss
* Accurate Mapping Method for UAV Photogrammetry Without Ground Control Points in the Map Projection Frame
* Achieving robustness in classification using optimal transport with hinge regularization
* ACRE: Abstract Causal REasoning Beyond Covariation
* Action Shuffle Alternating Learning for Unsupervised Action Segmentation
* Action Unit Memory Network for Weakly Supervised Temporal Action Localization
* ACTION-Net: Multipath Excitation for Action Recognition
* Activate or Not: Learning Customized Activation
* Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization
* AdaBins: Depth Estimation Using Adaptive Bins
* Adaptive Aggregation Networks for Class-Incremental Learning
* Adaptive Class Suppression Loss for Long-Tail Object Detection
* Adaptive Consistency Prior based Deep Network for Image Denoising
* Adaptive Consistency Regularization for Semi-Supervised Transfer Learning
* Adaptive Convolutions for Structure-Aware Style Transfer
* Adaptive Cross-Modal Prototypes for Cross-Domain Visual-Language Retrieval
* Adaptive direction-guided structure tensor total variation
* Adaptive Image Transformer for One-Shot Object Detection
* Adaptive LiDAR Sampling and Depth Completion Using Ensemble Variance
* Adaptive Methods for Real-World Domain Generalization
* Adaptive Prototype Learning and Allocation for Few-Shot Segmentation
* Adaptive Rank Estimate in Robust Principal Component Analysis
* Adaptive Subtraction Based on U-Net for Removing Seismic Multiples
* Adaptive Weighted Discriminator for Training Generative Adversarial Networks
* AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching
* AdCo: Adversarial Contrast for Efficient Learning of Unsupervised Representations from Self-Trained Negative Adversaries
* AdderSR: Towards Energy Efficient Image Super-Resolution
* Adversarial Generation of Continuous Images
* Adversarial Imaging Pipelines
* Adversarial Invariant Learning
* Adversarial Laser Beam: Effective Physical-World Attack to DNNs in a Blink
* Adversarial Robustness Across Representation Spaces
* Adversarial Robustness under Long-Tailed Distribution
* Adversarially Adaptive Normalization for Single Domain Generalization
* AdvSim: Generating Safety-Critical Scenarios for Self-Driving Vehicles
* (AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network
* Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality
* Affective Growth of Computer Vision, The
* Affective Processes: stochastic modelling of temporal context for emotion and facial expression recognition
* Affordance Transfer Learning for Human-Object Interaction Detection
* AGORA: Avatars in Geography Optimized for Regression Analysis
* AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning
* AIFit: Automatic 3D Human-Interpretable Feedback Models for Fitness Training
* All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training
* Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation
* AlphaMatch: Improving Consistency for Semi-supervised Learning with Alpha-divergence
* Alternative Probabilistic Interpretation of the Huber Loss, An
* Amalgamating Knowledge from Heterogeneous Graph Neural Networks
* Analytic Algorithm for Dipole Electromagnetic Field in Fully Anisotropic Planar-Stratified Media, An
* Anchor-Constrained Viterbi for Set-Supervised Action Segmentation
* Anchor-Free Person Search
* Animating Pictures with Eulerian Motion Fields
* Anomaly Detection in Hyperspectral Imagery Based on Gaussian Mixture Model
* Anomaly Detection in Video via Self-Supervised and Multi-Task Learning
* ANR: Articulated Neural Rendering for Virtual Avatars
* Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation
* Anti-aliasing Semantic Reconstruction for Few-Shot Semantic Segmentation
* Anticipating human actions by correlating past with the future with Jaccard similarity measures
* Anycost GANs for Interactive Image Synthesis and Editing
* ApodNet: Learning for High Frame Rate Synthetic Transmit Aperture Ultrasound Imaging
* AQD: Towards Accurate Quantized Object Detection
* Architectural Adversarial Robustness: The Case for Deep Pursuit
* Are Labels Always Necessary for Classifier Accuracy Evaluation?
* ArtCoder: An End-to-end Method for Generating Scanning-robust Stylized QR Codes
* ArtEmis: Affective Language for Visual Art
* ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows
* ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring
* Asymmetric Gained Deep Image Compression With Continuous Rate Adaptation
* Asymmetric metric learning for knowledge transfer
* ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Image Segmentation
* Attend to the Difference: Cross-Modality Person Re-Identification via Contrastive Correlation
* Attention-Based Second-Order Pooling Network for Hyperspectral Image Classification
* Attention-guided Image Compression by Deep Reconstruction of Compressive Sensed Saliency Skeleton
* Attentional Local Contrast Networks for Infrared Small Target Detection
* AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling
* Audio-Driven Emotional Video Portraits
* Audio-Visual Instance Discrimination with Cross-Modal Agreement
* Augmentation Strategies for Learning with Noisy Labels
* Auto-Exposure Fusion for Single-Image Shadow Removal
* AutoDO: Robust AutoAugment for Biased Data with Label Noise via Scalable Probabilistic Implicit Differentiation
* AutoFlow: Learning a Better Training Set for Optical Flow
* AutoInt: Automatic Integration for Fast Neural Volume Rendering
* Automated Detection of Marine Glacier Calving Fronts Using the 2-D Wavelet Transform Modulus Maxima Segmentation Method
* Automated Log-Scale Quantization for Low-Cost Deep Neural Networks
* Automatic Correction of Internal Units in Generative Neural Networks
* Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-constrained Optimization
* Autoregressive Stylized Motion Synthesis with Generative Flow
* BABEL: Bodies, Action and Behavior with English Labels
* Back to Event Basics: Self-Supervised Learning of Image Reconstruction for Event Cameras via Photometric Constancy
* Back to the Feature: Learning Robust Camera Localization from Pixels to Pose
* Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds
* Backdoor Attacks Against Deep Learning Systems in the Physical World
* Background Splitting: Finding Rare Classes in a Sea of Background
* Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation
* BASAR: Black-box Attack on Skeletal Action Recognition
* BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond
* Bayesian Inversion of Time-Lapse Seismic AVO Data for Multimodal Reservoir Properties
* Bayesian Nested Neural Networks for Uncertainty Calibration and Adaptive Compression
* BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation
* BCNet: Searching for Network Width with Bilaterally Coupled Network
* Behavior-Driven Synthesis of Human Dynamics
* Benchmarking Representation Learning for Natural World Image Collections
* Bespoke Fractal Sampling Patterns for Discrete Fourier Space via the Kaleidoscope Transform
* Beyond Bounding-Box: Convex-hull Feature Adaptation for Oriented and Densely Packed Object Detection
* Beyond Image to Depth: Improving Depth Prediction using Echoes
* Beyond Max-Margin: Class Margin Equilibrium for Few-shot Object Detection
* Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories
* Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video
* Bi-GCN: Binary Graph Convolutional Network
* BiCnet-TKS: Learning Efficient Spatial-Temporal Representation for Video Person Re-Identification
* Bidirectional Projection Network for Cross Dimension Scene Understanding
* Bifurcated Backbone Strategy for RGB-D Salient Object Detection
* Bilateral Grid Learning for Stereo Matching Networks
* Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction
* Bilinear Parameterization for Non-Separable Singular Value Penalties
* Binary Graph Neural Networks
* Binary TTC: A Temporal Geofence for Autonomous Navigation
* Bipartite Graph Network with Adaptive Message Passing for Unbiased Scene Graph Generation
* Birds of a Feather: Capturing Avian Shape Models from Images
* Black-box Explanation of Object Detectors via Saliency Maps
* Blessings of Unlabeled Background in Untrimmed Videos, The
* Blind Deblurring for Saturated Images
* Blind Primed Supervised (BLIPS) Learning for MR Image Reconstruction
* Blind video quality assessment based on multilevel video perception
* Blocks-World Cameras
* Blur, Noise, and Compression Robust Generative Adversarial Networks
* Blurring/Clutter Mitigation in Quarry Monitoring by Ground-Based Synthetic Aperture Radar
* Body Meshes as Points
* Body2Hands: Learning to Infer 3D Hands from Conversational Gesture Body Dynamics
* Boosting Ensemble Accuracy by Revisiting Ensemble Diversity Metrics
* Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging
* Boosting Video Representation Learning with Multi-Faceted Integration
* Bottleneck Transformers for Visual Recognition
* Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression
* Bottom-Up Shift and Reasoning for Referring Image Segmentation
* Boundary IoU: Improving Object-Centric Image Segmentation Evaluation
* BoxInst: High-Performance Instance Segmentation with Box Annotations
* Brain Image Synthesis with Unsupervised Multivariate Canonical CSCℓ4Net
* BRepNet: A topological message passing system for solid models
* Bridge to Answer: Structure-aware Graph Interaction Network for Video Question Answering
* Bridging the Visual Gap: Wide-Range Image Blending
* Building Reliable Explanations of Unreliable Neural Networks: Locally Smoothing Perspective of Model Interpretation
* Calibrated RGB-D Salient Object Detection
* Camera Pose Matters: Improving Depth Prediction by Mitigating Pose Distribution Bias
* Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration
* CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency
* Camouflaged Object Segmentation with Distraction Mining
* Can audio-visual integration strengthen robustness under multimodal attacks?
* Can We Characterize Tasks Without Labels or Features?
* CanonPose: Self-Supervised Monocular 3D Human Pose Estimation in the Wild
* Capsule Network is Not More Robust than Convolutional Network
* CapsuleRRT: Relationships-aware Regression Tracking via Capsules
* Capturing Omni-Range Context for Omnidirectional Segmentation
* Cascaded and Recursive ConvNets (CRCNN): An effective and flexible approach for image denoising
* Cascaded Prediction Network via Segment Tree for Temporal Video Grounding
* CASTing Your Model: Learning to Localize Improves Self-Supervised Representations
* Categorical Depth Distribution Network for Monocular 3D Object Detection
* Causal Attention for Vision-Language Tasks
* Causal Hidden Markov Model for Time Series Disease Forecasting
* CausalVAE: Disentangled Representation Learning via Neural Structural Causal Models
* CDFI: Compression-Driven Network Design for Frame Interpolation
* Center-based 3D Object Detection and Tracking
* CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching
* CGA-Net: Category Guided Aggregation for Point Cloud Semantic Segmentation
* ChallenCap: Monocular 3D Capture of Challenging Human Performances using Multi-Modal References
* Checkerboard Context Model for Efficient Learned Image Compression
* Circular-Structured Representation for Visual Emotion Distribution Learning, A
* Class-Aware Robust Adversarial Training for Object Detection
* ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic
* CLCC: Contrastive Learning for Color Constancy
* clDice: A Novel Topology-Preserving Loss Function for Tubular Structure Segmentation
* CLEAR: Comprehensive Learning Enabled Adversarial Reconstruction for Subtle Structure Enhanced Low-Dose CT Imaging
* Closed-Form Factorization of Latent Semantics in GANs
* Closed-Form Hybrid Cramer-Rao Bound for DOA Estimation by an Acoustic Vector Sensor Under Orientation Deviation
* Closer Look at Fourier Spectrum Discrepancies for CNN-generated Images Detection, A
* Closing the Loop: Joint Rain Generation and Removal via Disentangled Image Translation
* Cloud2Curve: Generation and Vectorization of Parametric Sketches
* Clusformer: A Transformer based Clustering Approach to Unsupervised Large-scale Face and Visual Landmark Recognition
* Cluster, Split, Fuse, and Update: Meta-Learning for Open Compound Domain Adaptive Semantic Segmentation
* Cluster-based WSN protocol that improves network lifetime through CH replacement
* Cluster-wise Hierarchical Generative Model for Deep Amortized Clustering
* Co-Attention for Conditioned Image Matching
* Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos
* Coarse-Fine Networks for Temporal Activity Detection in Videos
* Coarse-to-Fine Domain Adaptive Semantic Segmentation with Photometric Alignment and Category-Center Regularization
* Coarse-to-Fine Person Re-Identification with Auxiliary-Domain Classification and Second-Order Information Bottleneck
* CoCoNets: Continuous Contrastive 3D Scene Representations
* CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation
* CodedStereo: Learned Phase Masks for Large Depth-of-field Stereo
* CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning
* Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation
* ColorRL: Reinforced Coloring for End-to-End Instance Segmentation
* Combinatorial Learning of Graph Edit Distance via Dynamic Embedding
* Combined Depth Space based Architecture Search For Person Re-identification
* Combining Semantic Guidance and Deep Reinforcement Learning For Generating Human Level Paintings
* Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization
* Communication Efficient SGD via Gradient Sampling with Bayes Prior
* CoMoGAN: continuous model-guided image-to-image translation
* Compact Band Weighting Module Based on Attention-Driven for Hyperspectral Image Classification
* Compatibility-aware Heterogeneous Visual Search
* Complementary Relation Contrastive Distillation
* Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds
* COMPLETER: Incomplete Multi-view Clustering via Contrastive Prediction
* Composing Photos Like a Photographer
* CompositeTasking: Understanding Images by Spatial Composition of Tasks
* Compressed Sensing-Based Multitarget CFAR Detection Algorithm for FMCW Radar
* Compressive Beamforming Based on Multiconstraint Bayesian Framework
* Computer Vision and Pattern Recognition 2020
* Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
* CondenseNet V2: Sparse Feature Reactivation for Deep Networks
* Conditional Bures Metric for Domain Adaptation
* Confluent Vessel Trees with Accurate Bifurcations
* Connecting What to Say With Where to Look by Modeling Human Attention Traces
* Consensus Maximisation Using Influences of Monotone Boolean Functions
* Consistent Instance False Positive Improves Fairness in Face Recognition
* ContactOpt: Optimizing Contact to Improve Grasps
* Content-Aware GAN Compression
* Context Modeling in 3D Human Pose Estimation: A Unified Perspective
* Context-aware Biaffine Localizing Network for Temporal Sentence Grounding
* Context-Aware Layout to Image Generation with Enhanced Object Appearance
* Continual Adaptation of Visual Representations via Domain Randomization and Meta-learning
* Continual Learning via Bit-Level Information Preserving
* Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations
* Continuous Conversion of CT Kernel Using Switchable CycleGAN With AdaIN
* Continuous Face Aging via Self-estimated Residual Age Embedding
* Contrastive Embedding for Generalized Zero-Shot Learning
* Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification
* Contrastive Learning for Compact Single Image Dehazing
* Contrastive Neural Architecture Search with Neural Architecture Comparators
* Controllable Image Restoration for Under-Display Camera in Smartphones
* Controllable Sketch-to-Image Translation for Robust Face Synthesis
* Controlling the Rain: from Removal to Rendering
* Convolutional Dynamic Alignment Networks for Interpretable Classifications
* Convolutional Hough Matching Networks
* Convolutional Neural Network Pruning with Structural Redundancy Reduction
* Cooperative pixel clustering for accurate automatic inflamed appendix extraction from ultrasound images
* Coordinate Attention for Efficient Mobile Network Design
* Correction of Time-Varying Baseline Errors Based on Multibaseline Airborne Interferometric Data Without High-Precision DEMs
* Correlated Input-Dependent Label Noise in Large-Scale Image Classification
* CorrNet3D: Unsupervised End-to-end Learning of Dense Correspondence for 3D Point Clouds
* CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback
* Counterfactual VQA: A Cause-Effect Look at Language Bias
* Counterfactual Zero-Shot and Open-Set Visual Recognition
* CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning
* CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement
* Cross Modal Focal Loss for RGBD Face Anti-Spoofing
* Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation
* Cross-Domain Gradient Discrepancy Minimization for Unsupervised Domain Adaptation
* Cross-Domain Similarity Learning for Face Recognition in Unseen Domains
* Cross-Iteration Batch Normalization
* Cross-Modal Center Loss for 3D Cross-Modal Retrieval
* Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting
* Cross-Modal Contrastive Learning for Text-to-Image Generation
* Cross-MPI: Cross-scale Stereo for Image Super-Resolution using Multiplane Images
* Cross-Scene Hyperspectral Image Classification With Discriminative Cooperative Alignment
* Cross-View Cross-Scene Multi-View Crowd Counting
* Cross-View Gait Recognition with Deep Universal Linear Embeddings
* Cross-View Regularization for Domain Adaptive Panoptic Segmentation
* Crossing cuts polygonal puzzles: Models and Solvers
* CT Reconstruction With PDF: Parameter-Dependent Framework for Data From Multiple Geometries and Dose Levels
* CT-Net: Complementary Transfering Network for Garment Transfer with Arbitrary Geometric Changes
* CTU depth decision algorithms for HEVC: A survey
* Cuboids Revisited: Learning Robust 3D Shape Fitting to Single RGB Images
* Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation
* CutPaste: Self-Supervised Learning for Anomaly Detection and Localization
* Cycle4Completion: Unpaired Point Cloud Completion using Cycle Transformation with Missing Region Coding
* Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation
* CycN-Net: A Convolutional Neural Network Specialized for 4D CBCT Images Refinement
* Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation
* D-NeRF: Neural Radiance Fields for Dynamic Scenes
* D2IM-Net: Learning Detail Disentangled Implicit Fields from Single Images
* DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation
* DAP: Detection-Aware Pre-training with Weak Supervision
* DARCNN: Domain Adaptive Region-based Convolutional Neural Network for Unsupervised Instance Segmentation in Biomedical Images
* DAT: Training Deep Networks Robust to Label-Noise by Matching the Feature Distributions
* Data Extrapolation From Learned Prior Images for Truncation Correction in Computed Tomography
* Data-Free Knowledge Distillation For Image Super-Resolution
* Data-Free Model Extraction
* Data-Uncertainty Guided Multi-Phase Learning for Semi-Supervised Object Detection
* DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort
* DCNAS: Densely Connected Neural Architecture Search for Semantic Image Segmentation
* DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation
* De-rendering the World's Revolutionary Artefacts
* Debiased Subjective Assessment of Real-World Image Enhancement
* Decomposition Model for Stereo Matching, A
* DECOR-GAN: 3D Shape Detailization by Conditional Refinement
* Decoupled Dynamic Filter Networks
* Deep Active Surface Models
* Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition
* Deep Animation Video Interpolation in the Wild
* Deep Burst Super-Resolution
* Deep Compositional Metric Learning
* Deep Convolutional Dictionary Learning for Image Denoising
* Deep Convolutional Neural Network Framework for Subpixel Mapping
* Deep Denoising of Flash and No-Flash Pairs for Photography in Low-Light Environments
* Deep Dual Consecutive Network for Human Pose Estimation
* Deep Emulator for Secondary Motion of 3D Characters, A
* Deep Gaussian Scale Mixture Prior for Spectral Compressive Imaging
* Deep Gradient Projection Networks for Pan-sharpening
* Deep Graph Matching under Quadratic Constraint
* Deep Hashing Learning for Visual and Semantic Retrieval of Remote Sensing Images
* Deep High-Resolution Representation Learning for Cross-Resolution Person Re-Identification
* Deep Homography for Efficient Stereo Image Compression
* Deep Implicit Moving Least-Squares Functions for 3D Reconstruction
* Deep Implicit Templates for 3D Shape Representation
* Deep Interactive Denoiser (DID) for X-Ray Computed Tomography
* Deep learning based HEVC in-loop filter and noise reduction
* Deep Learning for Ultrasound Beamforming in Flexible Array Transducer
* Deep Learning in Latent Space for Video Prediction and Compression
* Deep Learning-Based Model That Reduces Speed of Sound Aberrations for Improved In Vivo Photoacoustic Imaging, A
* Deep Lesion Tracker: Monitoring Lesions in 4D Longitudinal Imaging Studies
* Deep Lucas-Kanade Homography for Multimodal Image Alignment
* Deep Multi-Task Learning for Joint Localization, Perception, and Prediction
* Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers
* Deep Optimized Priors for 3D Shape Modeling and Reconstruction
* Deep Perceptual Preprocessing for Video Coding
* Deep Polarization Imaging for 3D Shape and SVBRDF Acquisition
* Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion
* Deep Stable Learning for Out-Of-Distribution Generalization
* Deep Texture Recognition via Exploiting Cross-Layer Statistical Self-Similarity
* Deep Tomographic Image Reconstruction: Yesterday, Today, and Tomorrow: Editorial for the 2nd Special Issue Machine Learning for Image Reconstruction
* Deep Trajectory Post-Processing and Position Projection for Single & Multiple Camera Multiple Object Tracking
* Deep Two-View Structure-from-Motion Revisited
* Deep Video Matting via Spatio-Temporal Alignment and Aggregation
* Deep-NC: A secure image transmission using deep learning and network coding
* DeepACG: Co-Saliency Detection via Semantic-aware Contrast Gromov-Wasserstein Distance
* DeepFuseNet of Omnidirectional Far-Infrared and Visual Stream for Vegetation Detection
* DeepI2P: Image-to-Point Cloud Registration via Deep Classification
* DeepLM: Large-scale Nonlinear Least Squares on Deep Learning Frameworks using Stochastic Domain Decomposition
* Deeply Shape-guided Cascade for Instance Segmentation
* DeepMetaHandles: Learning Deformation Meta-Handles of 3D Meshes with Biharmonic Coordinates
* DeepSurfels: Learning Online Appearance Fusion
* DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images
* DeepVideoMVS: Multi-View Stereo on Video with Recurrent Spatio-Temporal Fusion
* Defending Multimodal Fusion Models against Single-Source Adversaries
* DeFLOCNet: Deep Image Editing via Flexible Low-level Controls
* DeFlow: Learning Complex Image Degradations from Unpaired Data with Conditional Flows
* DeFMO: Deblurring and Shape Recovery of Fast Moving Objects
* Deformed Implicit Field: Modeling 3D Shapes with Learned Dense Correspondence
* Delving Deep into Many-to-many Attention for Few-shot Video Object Segmentation
* Delving into Data: Effectively Substitute Training for Black-box Attack
* Delving into Localization Errors for Monocular 3D Object Detection
* DeMoCap: Low-Cost Marker-Based Motion Capture
* Denoise and Contrast for Category Agnostic Shape Completion
* Dense Contrastive Learning for Self-Supervised Visual Pre-Training
* Dense Label Encoding for Boundary Discontinuity Free Rotation Detection
* Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection
* Densely connected multidilated convolutional networks for dense prediction tasks
* Depth Completion using Plane-Residual Representation
* Depth Completion with Twin Surface Extrapolation at Occlusion Boundaries
* Depth from Camera Motion and Object Detection
* Depth-Aware Mirror Segmentation
* Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection
* DER: Dynamically Expandable Representation for Class Incremental Learning
* DeRF: Decomposed Radiance Fields
* DESA: Disparity Estimation With Surface Awareness
* Detecting Human-Object Interaction via Fabricated Compositional Learning
* Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark
* DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution
* DexYCB: A Benchmark for Capturing Hand Grasping of Objects
* DG-Font: Deformable Generative Networks for Unsupervised Font Generation
* DI-Fusion: Online Implicit 3D Reconstruction with Deep Priors
* Dictionary-guided Scene Text Recognition
* Differentiable Diffusion for Dense Depth Estimation from Multi-view Images
* Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
* Differentiable Patch Selection for Image Recognition
* Differentiable SLAM-net: Learning Particle SLAM for Visual Navigation
* Diffusion Probabilistic Models for 3D Point Cloud Generation
* Digital Gimbal: End-to-end Deep Image Stabilization with Learnable Exposure Times
* DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation
* DISCO: Dynamic and Invariant Sensitive Channel Obfuscation for deep neural networks
* Discover Cross-Modality Nuances for Visible-Infrared Person Re-Identification
* Discovering Hidden Physics Behind Transport Dynamics
* Discovering Interpretable Latent Space Directions of GANs Beyond Binary Attributes
* Discovering Relationships between Object Categories via Universal Canonical Maps
* Discrete-continuous Action Space Policy Gradient-based Attention for Image-Text Matching
* Discrimination-Aware Mechanism for Fine-grained Representation Learning
* Discriminative Appearance Modeling with Multi-track Pooling for Real-time Multi-object Tracking
* Disentangled Cycle Consistency for Highly-realistic Virtual Try-On
* Disentangling Label Distribution for Long-tailed Visual Recognition
* Distilling Audio-Visual Knowledge by Compositional Contrastive Learning
* Distilling Causal Effect of Data in Class-Incremental Learning
* Distilling Knowledge via Knowledge Review
* Distilling Object Detectors via Decoupled Features
* Distractor-Aware Fast Tracking via Dynamic Convolutions and MOT Philosophy
* Distributed Beams: Concept of Operations for Polarimetric Rotating Phased Array Radar
* Distribution Alignment: A Unified Framework for Long-tail Visual Recognition
* Distribution-aware Adaptive Multi-bit Quantization
* DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network
* Dive into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty Estimation for Facial Expression Recognition
* Divergence Optimization for Noisy Universal Domain Adaptation
* Diverse Branch Block: Building a Convolution as an Inception-like Unit
* Diverse Part Discovery: Occluded Person Re-identification with Part-Aware Transformer
* Diverse Semantic Image Synthesis via Probability Distribution Modeling
* Diversifying Sample Generation for Accurate Data-Free Quantization
* Divide-and-Conquer for Lane-Aware Diverse Trajectory Prediction
* DoDNet: Learning to Segment Multi-Organ and Tumors from Multiple Partially Labeled Datasets
* Dogfight: Detecting Drones from Drones Videos
* Domain Adaptation with Auxiliary Target Domain-Oriented Classifier
* Domain Consensus Clustering for Universal Domain Adaptation
* Domain Discrepancy Elimination and Mean Face Representation Learning for NIR-VIS Face Recognition
* Domain-Independent Dominance of Adaptive Methods
* Domain-robust VQA with diverse datasets and methods but no target labels
* Domain-Specific Suppression for Adaptive Object Detection
* DotFAN: A Domain-Transferred Face Augmentation Net
* DOTS: Decoupling Operation and Topology in Differentiable Architecture Search
* Double low-rank representation with projection distance penalty for clustering
* Downsampled Imaging Geometric Modeling for Accurate CT Reconstruction via Deep Learning
* Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer
* DRANet: Disentangling Representation and Adaptation Networks for Unsupervised Cross-Domain Adaptation
* DriveGAN: Towards a Controllable High-Quality Neural Simulation
* DRONE: Dual-Domain Residual-based Optimization NEtwork for Sparse-View CT Reconstruction
* DSC-PoseNet: Learning 6DoF Object Pose Estimation via Dual-scale Consistency
* DSRNA: Differentiable Search of Robust Neural Architectures
* Dual Attention Guided Gaze Target Detection in the Wild
* Dual Attention Suppression Attack: Generate Adversarial Camouflage in Physical World
* Dual Contradistinctive Generative Autoencoder
* Dual Iterative Refinement Method for Non-rigid Shape Matching, A
* Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration
* Dual-Constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior
* Dual-GAN: Joint BVP and Noise Modeling for Remote Physiological Measurement
* Dual-Stream Multiple Instance Learning Network for Whole Slide Image Classification with Self-Supervised Contrastive Learning
* Dual-View Snapshot Compressive Imaging via Optical Flow Aided Recurrent Neural Network
* DualAST: Dual Style-Learning Networks for Artistic Style Transfer
* DualGraph: A graph-based method for reasoning about label noise
* DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution
* DyGLIP: A Dynamic Graph Model with Link Prediction for Accurate Multi-Camera Multiple Object Tracking
* Dynamic Class Queue for Large Scale Face Recognition In the Wild
* Dynamic Domain Adaptation for Efficient Inference
* Dynamic Head: Unifying Object Detection Heads with Attentions
* Dynamic Imaging Using a Deep Generative SToRM (Gen-SToRM) Model
* Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales
* Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction
* Dynamic Probabilistic Graph Convolution for Facial Action Unit Intensity Estimation
* Dynamic Region-Aware Convolution
* Dynamic Slimmable Network
* Dynamic Transfer for Multi-Source Domain Adaptation
* Dynamic Weighted Learning for Unsupervised Domain Adaptation
* DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping
* ECKPN: Explicit Class Knowledge Propagation Network for Transductive Few-shot Learning
* EDNet: Efficient Disparity Estimation with Cost Volume Combination and Attention-based Spatial Residual
* Effective Snapshot Compressive-spectral Imaging via Deep Denoising and Total Variation Priors
* Effective Sparsification of Neural Networks with Global Sparsity Constraint
* Effects of Induced Field Rotation From Rough Surface on H-Alpha Decomposition of Full-Polarimetric GPR
* Efficient Conditional GAN Transfer with Knowledge Propagation across Classes
* Efficient deformable shape correspondence via multiscale spectral manifold wavelets preservation
* Efficient Feature Transformations for Discriminative and Generative Continual Learning
* Efficient Initial Pose-graph Generation for Global SfM
* Efficient Multi-Stage Video Denoising with Recurrent Spatio-Temporal Fusion
* Efficient Object Embedding for Spliced Image Retrieval
* Efficient Regional Memory Network for Video Object Segmentation
* EffiScene: Efficient Per-Pixel Rigidity Inference for Unsupervised Joint Learning of Optical Flow, Depth, Camera Pose and Motion Segmentation
* Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos
* Elastic Least-Squares Reverse-Time Migration Based on a Modified Acoustic-Elastic Coupled Equation for OBS Four-Component Data
* Embedded Discriminative Attention Mechanism for Weakly Supervised Semantic Segmentation
* Embedding Transfer with Label Relaxation for Improved Metric Learning
* Embracing Uncertainty: Decoupling and De-bias for Robust Temporal Grounding
* Encoder Fusion Network with Co-Attention Embedding for Referring Image Segmentation
* Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation
* End-to-end High Dynamic Range Camera Pipeline Optimization
* End-to-End Human Object Interaction Detection with HOI Transformer
* End-to-End Human Pose and Mesh Reconstruction with Transformers
* End-to-End Learning for Joint Image Demosaicing, Denoising and Super-Resolution
* End-to-End Object Detection with Fully Convolutional Network
* End-to-End Rotation Averaging with Multi-Source Propagation
* End-to-End Video Instance Segmentation with Transformers
* EnD: Entangling and Disentangling deep representations for bias correction
* Energy-Based Learning for Scene Graph Generation
* Enhance Curvature Information by Structured Stochastic Quasi-Newton Methods
* Enhanced CRC-based correction of multiple errors with candidate validation
* Enhancing the Transferability of Adversarial Attacks through Variance Tuning
* Enriching ImageNet with Human Similarity Judgments and Psychological Embeddings
* Ensemble single image deraining network via progressive structural boosting constraints
* Ensembling with Deep Generative Views
* entropy minimization histogram mergence scheme and its application in image compression, An
* Equalization Loss v2: A New Gradient Balance Approach for Long-tailed Object Detection
* Equivariant Point Network for 3D Point Cloud Analysis
* Euro-PVI: Pedestrian Vehicle Interactions in Dense Urban Centers
* Evaluation of Rainfall Estimation Derived From Commercial Interactive DVB Receivers Using Disdrometer, Rain Gauge, and Weather Radar
* Evaluation of Winter Wheat Yield Simulation Based on Assimilating LAI Retrieved From Networked Optical and SAR Remotely Sensed Images Into the WOFOST Model
* EvDistill: Asynchronous Events to End-task Learning via Bidirectional Reconstruction-guided Cross-modal Knowledge Distillation
* Event-based Bispectral Photometry using Temporally Modulated Illumination
* Event-based Synthetic Aperture Imaging with a Hybrid Network
* EventZoom: Learning to Denoise and Super Resolve Neuromorphic Events
* Every Annotation Counts: Multi-label Deep Supervision for Medical Image Segmentation
* Exemplar-Based Open-Set Panoptic Segmentation Network
* Explaining Classifiers using Adversarial Perturbations on the Perceptual Ball
* Explicit Knowledge Incorporation for Visual Reasoning
* Exploit Visual Dependency Relations for Semantic Segmentation
* Exploiting & Refining Depth Distributions with Triangulation Light Curtains
* Exploiting Aliasing for Manga Restoration
* Exploiting Edge-Oriented Reasoning for 3D Point-based Scene Graph Analysis
* Exploiting Semantic Embedding and Visual Feature for Facial Action Unit Detection
* Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing
* Explore Image Deblurring via Encoded Blur Kernel Space
* Exploring Adversarial Fake Images on Face Manifold
* Exploring and Distilling Posterior and Prior Knowledge for Radiology Report Generation
* Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning
* Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts
* Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing
* Exploring Intermediate Representation for Monocular Vehicle Pose Estimation
* Exploring Simple Siamese Representation Learning
* Exploring Sparsity in Image Super-Resolution for Efficient Inference
* Exponential Moving Average Normalization for Self-supervised and Semi-supervised Learning
* Extreme Low-Light Environment-Driven Image Denoising over Permanently Shadowed Lunar Regions with a Physical Noise Model
* Extreme Rotation Estimation using Dense Correlation Volumes
* Face Forensics in the Wild
* Face Forgery Detection by 3D Decomposition
* FaceInpainter: High Fidelity Face Adaptation to Heterogeneous Domains
* FACESEC: A Fine-grained Robustness Evaluation Framework for Face Recognition Systems
* Facial Action Unit Detection With Transformers
* FAIEr: Fidelity and Adequacy Ensured Image Caption Evaluation
* Fair Attribute Classification through Latent Space De-biasing
* Fair Feature Distillation for Visual Recognition
* FAPIS: A Few-shot Anchor-free Part-based Instance Segmenter
* Farewell to Mutual Information: Variational Distillation for Cross-Modal Person Re-Identification
* Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback
* Fast and Accurate Model Scaling
* Fast Bayesian Uncertainty Estimation and Reduction of Batch Normalized Single Image Super-Resolution Network
* Fast end-to-end learning on protein surfaces
* Fast Momentum Contrast Learning for Unsupervised Person Re-Identification
* Fast Pixel-Superpixel Region Merging for SAR Image Segmentation
* Fast Sinkhorn Filters: Using Matrix Scaling for Non-Rigid Shape Correspondence with Functional Maps
* Faster Meta Update Strategy for Noise-Robust Deep Learning
* FBI-Denoiser: Fast Blind Image Denoiser for Poisson-Gaussian Noise
* FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining
* FCPose: Fully Convolutional Multi-Person Pose Estimation with Dynamic Instance-Aware Convolutions
* Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition
* Feature-Level Collaboration: Joint Unsupervised Learning of Optical Flow, Stereo Depth and Camera Motion
* FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space
* Feedback-based object detection for multi-person pose estimation
* FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds
* Few-shot 3D Point Cloud Semantic Segmentation
* Few-Shot Classification with Feature Map Reconstruction Networks
* Few-Shot Human Motion Transfer by Personalized Geometry and Texture Modeling
* Few-shot Image Generation via Cross-domain Correspondence
* Few-Shot Incremental Learning with Continually Evolved Classifiers
* Few-Shot Object Detection via Classification Refinement and Distractor Retreatment
* Few-shot Open-set Recognition by Transformation Consistency
* Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?
* Few-Shot Transformation of Common Actions into Time and Space
* FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation
* Fine-grained Angular Contrastive Learning with Coarse Labels
* Fine-Grained Shape-Appearance Mutual Learning for Cloth-Changing Person Re-Identification
* Fingerspelling Detection in American Sign Language
* FixBi: Bridging Domain Spaces for Unsupervised Domain Adaptation
* Flow Guided Transformable Bottleneck Networks for Motion Retargeting
* Flow-based Kernel Prior with Application to Blind Super-Resolution
* Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset
* FlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation
* Focus on Local: Detecting Lane Marker from Bottom Up via Key Point
* Footprint Size Design of Large-Footprint Full-Waveform LiDAR for Forest and Topography Applications: A Theoretical Study
* Forecasting Irreversible Disease via Progression Learning
* ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis
* Fostering Generalization in Single-view 3D Reconstruction by Learning a Hierarchy of Local and Global Shape Priors
* Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules
* Fourier Contour Embedding for Arbitrary-Shaped Text Detection
* Fourier-based Framework for Domain Generalization, A
* FP-NAS: Fast Probabilistic Neural Architecture Search
* FrameExit: Conditional Early Exiting for Efficient Video Recognition
* Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection
* From Points to Multi-Object 3D Reconstruction
* From Rain Generation to Rain Removal
* From Semantic Categories to Fixations: A Novel Weakly-supervised Visual-auditory Saliency Detection Approach
* From Shadow Generation to Shadow Removal
* From Synthetic to Real: Unsupervised Domain Adaptation for Animal Pose Estimation
* FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism
* FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding
* FSDR: Frequency Space Domain Randomization for Domain Generalization
* Fully Convolutional Networks for Panoptic Segmentation
* Fully Convolutional Scene Graph Generation
* Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction
* Function4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors
* functional approach to rotation equivariant non-linearities for Tensor Field Networks, A
* Fusing the Old with the New: Learning Relative Camera Pose with Geometry-Guided Uncertainty
* FVC: An End-to-End Framework Towards Deep Video Compression in Feature Space
* GAIA: A Transfer Learning System of Object Detection that Fits Your Needs
* GAN Prior Embedded Network for Blind Face Restoration in the Wild
* GANmut: Learning Interpretable Conditional Space for Gamut of Emotions
* Gated Spatio-Temporal Attention-Guided Video Deblurring
* GATSBI: Generative Agent-centric Spatio-temporal Object Interaction
* Gaussian Context Transformer
* GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation
* General Instance Distillation for Object Detection
* General Multi-label Image Classification with Transformers
* Generalizable Pedestrian Detection: The Elephant In The Room
* Generalizable Person Re-identification with Relevance-aware Mixture of Experts
* Generalization on Unseen Domains via Inference-time Label-Preserving Target Projections
* Generalized Domain Adaptation
* Generalized Few-Shot Object Detection without Forgetting
* Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection
* Generalized Loss Function for Crowd Counting and Localization, A
* Generalizing Face Forgery Detection with High-frequency Features
* Generalizing to the Open World: Deep Visual Odometry with Online Adaptation
* Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE
* Generating Manga from Illustrations via Mimicking Manga Creation Workflow
* Generative Classifiers as a Basis for Trustworthy Image Classification
* Generative Hierarchical Features from Synthesizing Images
* Generative Interventions for Causal Learning
* Generative PointNet: Deep Energy-Based Learning on Unordered Point Sets for 3D Generation, Reconstruction and Classification
* Generic Perceptual Loss for Modeling Structured Output Dependencies
* Geo-FARM: Geodesic Factor Regression Model for Misaligned Pre-shape Responses in Statistical Shape Analysis
* geodesic multipolar parameterization-based representation for 3D face recognition, A
* Geolocation Error Estimation and Correction on Long-Term MWRI Data
* GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving
* GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields
* Glance and Gaze: Inferring Action-aware Points for One-Stage Human-Object Interaction Detection
* Glancing at the Patch: Anomaly Localization with Global and Local Feature Comparison
* GLAVNet: Global-Local Audio-Visual Cues for Fine-Grained Material Recognition
* GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution
* Global Transport for Fluid Reconstruction with Learned Self-Supervision
* Global2Local: Efficient Structure Search for Video Action Segmentation
* Globally Optimal Relative Pose Estimation with Gravity Prior
* GMOT-40: A Benchmark for Generic Multiple Object Tracking
* Goal-Oriented Gaze Estimation for Zero-Shot Learning
* GPU-Based Supervoxel Generation With a Novel Anisotropic Metric
* Gradient Forward-Propagation for Large-Scale Temporal Video Modelling
* Gradient-based Algorithms for Machine Teaching
* Graph Attention Tracking
* Graph Stacked Hourglass Networks for 3D Human Pose Estimation
* Graph-based High-Order Relation Discovery for Fine-grained Recognition
* Graph-based High-order Relation Modeling for Long-term Action Recognition
* Grayscale-inversion and rotation invariant image description with sorted LBP features
* Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction
* GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection
* Group Collaborative Learning for Co-Salient Object Detection
* Group Whitening: Balancing Learning Efficiency and Representational Capacity
* Group-Aware Label Transfer for Domain Adaptive Person Re-Identification
* Guest Editorial: Special Issue on Deep Learning for Video Analysis and Compression
* Guided Integrated Gradients: An Adaptive Path Method for Removing Noise
* Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps
* Hallucination Improves Few-Shot Object Detection
* Hardness Sampling for Self-Training Based Transductive Zero-Shot Learning
* Harmonious Semantic Line Detection via Maximal Weight Clique Selection
* HCRF-Flow: Scene Flow from Point Clouds with Continuous High-order CRFs and Position-aware Flow Embedding
* HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps
* HDR Environment Map Estimation for Real-Time Augmented Reality
* Head pose estimation: A survey of the last ten years
* Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures, The
* Heterogeneous Grid Convolution for Adaptive, Efficient, and Controllable Computation
* Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph
* Hierarchical Domain-Adapted Feature Learning for Video Saliency Prediction
* Hierarchical Layout-Aware Graph Convolutional Network for Unified Aesthetics Assessment
* Hierarchical Lovász Embeddings for Proposal-free Panoptic Segmentation
* Hierarchical Motion Understanding via Motion Programs
* Hierarchical Video Prediction using Relational Layouts for Human-Object Interactions
* High Frequency Detail Accentuation in CNN Image Restoration
* High Pitch Helical CT Reconstruction
* High-Fidelity and Arbitrary Face Editing
* High-fidelity Face Tracking for AR/VR via Deep Lighting Adaptation
* High-Fidelity Neural Human Motion Transfer from Monocular Video
* High-Quality Stereo Image Restoration from Double Refraction
* High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network
* High-speed Image Reconstruction through Short-term Plasticity for Spiking Cameras
* Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs
* Hilbert Sinkhorn Divergence for Optimal Transport
* HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms
* HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching
* HLA-Face: Joint High-Low Adaptation for Low Light Face Detection
* HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features
* Holistic 3D Human and Scene Mesh Estimation from Single View Images
* Holistic 3D Scene Understanding from a Single Image with Implicit Representation
* Holograms on the Horizon?
* Home Action Genome: Cooperative Compositional Action Understanding
* HOTR: End-to-End Human-Object Interaction Detection with Transformers
* HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens
* House-GAN++: Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent for Professional Architects
* How does topology influence gradient propagation and model performance of deep networks with DenseNet-type skip connections?
* How Privacy-Preserving are Line Clouds? Recovering Scene Details from 3D Lines
* How Robust are Randomized Smoothing based Defenses to Data Poisoning?
* How to Exploit the Transferability of Learned Image Compression to Conventional Codecs
* How Transferable are Reasoning Patterns in VQA?
* How Well Do Self-Supervised Models Transfer?
* How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language
* HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers
* Human De-occlusion: Invisible Perception and Recovery for Humans
* Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors
* Human-like Controllable Image Captioning with Verb-specific Semantic Roles
* HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences
* Humble Teachers Teach Better Students for Semi-Supervised Object Detection
* HVPR: Hybrid Voxel-Point Representation for Single-stage 3D Object Detection
* Hybrid Inference Network for Few-Shot SAR Automatic Target Recognition
* Hybrid Message Passing with Performance-Driven Structures for Facial Action Unit Detection
* Hybrid Rotation Averaging: A Fast and Robust Rotation Averaging Approach
* HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation
* Hyper-LifelongGAN: Scalable Lifelong Learning for Image Conditioned Generation
* Hyperbolic-to-Hyperbolic Graph Convolutional Network, A
* Hyperdimensional computing as a framework for systematic aggregation of image descriptors
* HyperSeg: Patch-wise Hypernetwork for Real-time Semantic Segmentation
* Hyperspectral Anomaly Detection via Deep Plug-and-Play Denoising CNN Regularization
* i3DMM: Deep Implicit 3D Morphable Model of Human Heads
* I3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors
* IBRNet: Learning Multi-View Image-Based Rendering
* ID-Unet: Iterative Soft and Hard Deformation for View Synthesis
* IIRC: Incremental Implicitly-Refined Classification
* Im2Vec: Synthesizing Vector Graphics without Vector Supervision
* Image Change Captioning by Learning from an Auxiliary Task
* Image Generators with Conditionally-Independent Pixel Synthesis
* Image Inpainting Guided by Coherence Priors of Semantics and Textures
* Image Inpainting with External-internal Learning and Monochromic Bottleneck
* Image Restoration for Under-Display Camera
* Image Source Identification with Known Post-Processed Based on Convolutional Neural Network
* Image Super-Resolution with Non-Local Sparse Attention
* Image synthesis via adversarial geometric consistency pursuit
* Image-to-image Translation via Hierarchical Style Disentanglement
* Images of 3-D Maneuvering Motion Targets for Interferometric ISAR With 2-D Joint Sparse Reconstruction
* IMAGINE: Image Synthesis by Image-Guided Model Inversion
* img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation
* iMiGUE: An Identity-free Video Dataset for Micro-Gesture Understanding and Emotion Analysis
* IMODAL: creating learnable user-defined deformation models
* Implicit Feature Alignment: Learn to Convert Text Recognizer to Text Spotter
* improved edge detection technique, An
* Improved Handling of Motion Blur in Online Object Detection
* Improved Image Matting via Real-time User Clicks and Uncertainty Estimation
* Improving Accuracy of Binary Neural Networks using Unbalanced Activation Distribution
* Improving Calibration for Long-Tailed Recognition
* Improving Multiple Object Tracking with Single Object Tracking
* Improving Multiple Pedestrian Tracking by Track Management and Occlusion Handling
* Improving OCR-based Image Captioning by Incorporating Geometrical Relationship
* Improving Panoptic Segmentation at All Scales
* Improving Sign Language Translation with Monolingual Data by Sign Back-Translation
* Improving the Efficiency and Robustness of Deepfakes Detection through Precise Geometric Features
* Improving the Transferability of Adversarial Samples with Adversarial Transformations
* Improving Transferability of Adversarial Patches on Face Recognition with Generative Models
* Improving Unsupervised Image Clustering With Robust Learning
* Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation
* In the light of feature distributions: moment matching for Neural Style Transfer
* in-depth analysis of hyperspectral target detection with shadow compensation via LiDAR, An
* Inception Convolution with Efficient Dilation Search
* Incremental Few-Shot Instance Segmentation
* Incremental Learning via Rate Reduction
* Indoor Lighting Estimation using an Event Camera
* Indoor Panorama Planar 3D Reconstruction via Divide and Conquer
* Inferring and Improving Street Maps with Data-Driven Automation
* Inferring CAD Modeling Sequences Using Zone Graphs
* Information Bottleneck Disentanglement for Identity Swapping
* Information-Theoretic Segmentation by Inpainting Error Maximization
* Informative and Consistent Correspondence Mining for Cross-Domain Weakly Supervised Object Detection
* Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation
* Instance Localization for Self-supervised Detection Pretraining
* Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework
* Integrated Online Prediction Model for IoT Data
* Intelligent Carpet: Inferring 3D Human Pose from Tactile Signals
* Intentonomy: a Dataset and Study towards Human Intent Understanding
* Interactive Self-Training with Mean Teachers for Semi-supervised Object Detection
* Interpolation-based Semi-supervised Learning for Object Detection
* Interpretable Social Anchors for Human Trajectory Forecasting in Crowds
* Interpreting Super-Resolution Networks with Local Attribution Maps
* Interventional Video Grounding with Dual Contrastive Learning
* Intra-Inter Camera Similarity for Unsupervised Person Re-Identification
* Intrapulse Azimuth Frequency Scanning-Based 2-D Scanning SAR for HRWS Imaging
* Intrinsic Image Harmonization
* Introvert: Human Trajectory Prediction via Conditional 3D Attention
* Inverse Simulation: Reconstructing Dynamic Geometry of Clothed Humans via Optimal Control
* InverseForm: A Loss Function for Structured Boundary-Aware Segmentation
* Invertible Denoising Network: A Light Solution for Real Noise Removal
* Invertible Image Signal Processing
* Inverting Generative Adversarial Renderer for Face Reconstruction
* Invisible Perturbations: Physical Adversarial Examples Exploiting the Rolling Shutter Effect
* Involution: Inverting the Inherence of Convolution for Visual Recognition
* IoU Attack: Towards Temporally Coherent Black-Box Adversarial Attack for Visual Object Tracking
* IQDet: Instance-wise Quality Distribution Sampling for Object Detection
* IronMask: Modular Architecture for Protecting Deep Face Template
* ISCL: Interdependent Self-Cooperative Learning for Unpaired Image Denoising
* Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations
* Isometric Multi-Shape Matching
* Iterative Enhanced Multivariance Products Representation for Effective Compression of Hyperspectral Images
* Iterative Filter Adaptive Network for Single Image Defocus Deblurring
* Iterative Shrinking for Referring Expression Grounding Using Deep Reinforcement Learning
* iVPF: Numerical Invertible Volume Preserving Flow for Efficient Lossless Compression
* Jigsaw Clustering for Unsupervised Visual Representation Learning
* Jo-SRC: A Contrastive Approach for Combating Noisy Labels
* Joint Deep Model-based MR Image and Coil Sensitivity Reconstruction Network (Joint-ICNet) for Fast MRI
* Joint Generative and Contrastive Learning for Unsupervised Person Re-identification
* Joint Learning of 3D Shape Retrieval and Deformation
* Joint Negative and Positive Learning for Noisy Labels
* Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification
* Joint-DetNAS: Upgrade Your Detector with NAS, Pruning and Dynamic Distillation
* JPSS-1 ATMS Postlaunch Active Geolocation Analysis
* Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
* Keep your Eyes on the Lane: Real-time Attention-guided Lane Detection
* KeepAugment: A Simple Information-Preserving Data Augmentation Approach
* Kernel eigenmaps based multiscale sparse model for hyperspectral image classification
* Keypoint-graph-driven learning framework for object pose estimation
* KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control
* Knowledge Evolution in Neural Networks
* KOALAnet: Blind Super-Resolution using Kernel-Oriented Adaptive Local Adjustment
* KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA
* KSM: Fast Multiple Task Adaption via Kernel-wise Soft Mask Learning
* L2M-GAN: Learning to Manipulate Latent Space Semantics for Facial Attribute Editing
* Labeled from Unlabeled: Exploiting Unlabeled Data for Few-shot Deep HDR Deghosting
* LAFEAT: Piercing Through Adversarial Defenses with Latent Features
* Landmark Regularization: Ranking Guided Super-Net Training in Neural Architecture Search
* LaPred: Lane-Aware Prediction of Multi-Modal Future Trajectories of Dynamic Agents
* Large-capacity Image Steganography Based on Invertible Neural Networks
* Large-scale Localization Datasets in Crowded Indoor Spaces
* Large-Scale Study on Unsupervised Spatiotemporal Representation Learning, A
* LAU-Net: Latitude Adaptive Upscaling Network for Omnidirectional Image Super-resolution
* Layer-wise Searching for 1-bit Detectors
* Layerwise Optimization by Gradient Decomposition for Continual Learning
* Layout-Guided Novel View Synthesis from a Single Indoor Panorama
* LayoutGMN: Neural Graph Matching for Structural Layout Similarity
* LayoutTransformer: Scene Layout Generation with Conceptual and Spatial Diversity
* LEAP: Learning Articulated Occupancy of People
* Learnable Companding Quantization for Accurate Low-bit Neural Networks
* Learnable Graph Matching: Incorporating Graph Partitioning with Deep Feature Learning for Multiple Object Tracking
* Learnable Motion Coherence for Correspondence Pruning
* Learned Initializations for Optimizing Coordinate-Based Neural Representations
* Learning 3D Shape Feature for Texture-Insensitive Person Re-Identification
* Learning a Facial Expression Embedding Disentangled from Identity
* Learning a Non-blind Deblurring Network for Night Blurry Images
* Learning a Proposal Classifier for Multiple Object Tracking
* Learning a Self-Expressive Network for Subspace Clustering
* Learning Accurate Dense Correspondences and When to Trust Them
* Learning Affinity-Aware Upsampling for Deep Image Matting
* Learning An Explicit Weighting Scheme for Adapting Complex HSI Noise
* Learning Asynchronous and Sparse Human-Object Interaction in Videos
* Learning Better Visual Dialog Agents with Pretrained Visual-Linguistic Representation
* Learning by Aligning Videos in Time
* Learning by Planning: Language-Guided Global Image Editing
* Learning by Watching
* Learning Calibrated Medical Image Segmentation via Multi-rater Agreement Modeling
* Learning Camera Localization via Dense Scene Matching
* Learning Complete 3D Morphable Face Models from Images and Videos
* Learning Compositional Radiance Fields of Dynamic Human Heads
* Learning Compositional Representation for 4D Captures with Neural ODE
* Learning Continuous Image Representation with Local Implicit Image Function
* Learning Cross-Modal Retrieval with Noisy Labels
* Learning Data Consistency and its Application to Dynamic MR Imaging
* Learning Decision Trees Recurrently Through Communication
* Learning Deep Classifiers Consistent with Fine-Grained Novelty Detection
* Learning Deep Latent Variable Models by Short-Run MCMC Inference with Optimal Transport Correction
* Learning Delaunay Surface Elements for Mesh Reconstruction
* Learning Discriminative Prototypes with Dynamic Time Warping
* Learning Dynamic Alignment via Meta-Filter for Few-Shot Learning
* Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation
* Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking
* Learning Feature Aggregation for Deep 3D Morphable Models
* Learning Fine-Grained Segmentation of 3D Shapes without Part Labels
* Learning from the Master: Distilling Cross-modal Advanced Knowledge for Lip Reading
* Learning Goals from Failure
* Learning Graph Embeddings for Compositional Zero-shot Learning
* Learning Graphs for Knowledge Transfer with Limited Labels
* Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos
* Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation
* Learning monocular 3D reconstruction of articulated categories from motion
* Learning Multi-Scale Photo Exposure Correction
* Learning Neural Representation of Camera Pose with Matrix Representation of Pose Shift via View Synthesis
* Learning Normal Dynamics in Videos with Meta Prototype Network
* Learning Optical Flow from a Few Matches
* Learning optical flow from still images
* Learning Parallel Dense Correspondence from Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction
* Learning Placeholders for Open-Set Recognition
* Learning Position and Target Consistency for Memory-based Video Object Segmentation
* Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression
* Learning Progressive Point Embeddings for 3D Point Cloud Generation
* Learning Salient Boundary Feature for Anchor-free Temporal Action Localization
* Learning Scalable ℓ∞-constrained Near-lossless Image Compression via Joint Lossy Image and Residual Compression
* Learning Scene Structure Guidance via Cross-Task Knowledge Transfer for Single Depth Super-Resolution
* Learning Semantic Person Image Generation by Region-Adaptive Normalization
* Learning Semantic-Aware Dynamics for Video Prediction
* Learning Spatial-Semantic Relationship for Facial Attribute Recognition with Limited Labeled Data
* Learning Spatially-Variant MAP Models for Non-blind Image Deblurring
* Learning Statistical Texture for Semantic Segmentation
* Learning Student Networks in the Wild
* Learning Temporal Consistency for Low Light Video Enhancement from Single Images
* Learning Tensor Low-Rank Prior for Hyperspectral Image Reconstruction
* Learning the Best Pooling Strategy for Visual Semantic Embedding
* Learning the Non-differentiable Optimization for Blind Super-Resolution
* Learning the Predictability of the Future
* Learning the Superpixel in a Non-iterative and Lifelong Manner
* Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection
* Learning to Associate Every Segment for Video Panoptic Segmentation
* Learning To Count Everything
* Learning to Filter: Siamese Relation Network for Robust Tracking
* Learning to Fuse Asymmetric Feature Maps in Siamese Trackers
* Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification
* Learning to Identify Correct 2D-2D Line Correspondences on Sphere
* Learning to Predict Visual Attributes in the Wild
* Learning to Rank Proposals for Siamese Visual Tracking
* Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild
* Learning to Reconstruct CT Images From the VVBP-Tensor
* Learning to Reconstruct High Speed and High Dynamic Range Videos from Events
* Learning to Recover 3D Scene Shape from a Single Image
* Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation
* Learning to Restore Hazy Video: A New Real-World Dataset and A New Method
* Learning to Segment Actions from Visual and Language Instructions via Differentiable Weak Sequence Alignment
* Learning to Segment Rigid Motions from Two Frames
* Learning to Track Instances without Video Annotations
* Learning to Warp for Style Transfer
* Learning Triadic Belief Dynamics in Nonverbal Communication from Videos
* Learning View Selection for 3D Scenes
* Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization
* Learning-based Image Registration with Meta-Regularization
* LED2-Net: Monocular 360° Layout Estimation via Differentiable Depth Rendering
* Lesion-Aware Transformers for Diabetic Retinopathy Grading
* Less is More: CLIPBERT for Video-and-Language Learning via Sparse Sampling
* Leveraging Large-Scale Weakly Labeled Data for Semi-Supervised Mass Detection in Mammograms
* Leveraging Line-point Consistence to Preserve Structures for Wide Parallax Image Stitching
* Leveraging the Availability of Two Cameras for Illuminant Estimation
* LiBRe: A Practical Bayesian Approach to Adversarial Detection
* LiDAR R-CNN: An Efficient and Universal 3D Object Detector
* LiDAR-Aug: A General Rendering-based Augmentation Framework for 3D Object Detection
* LiDAR-based Panoptic Segmentation via Dynamic Shifting Network
* Lifelong Person Re-Identification via Adaptive Knowledge Accumulation
* Lifting 2D StyleGAN for 3D-Aware Face Generation
* Light Field Super-Resolution with Zero-Shot Learning
* LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search
* Limitations of Post-Hoc Feature Alignment for Robustness
* Line Segment Detection Using Transformers without Edges
* Linear Semantics in Generative Adversarial Networks
* Linguistic Structures as Weak Supervision for Visual Scene Graph Generation
* Lips Don't Lie: A Generalisable and Robust Approach to Face Forgery Detection
* Lipstick ain't enough: Beyond Color Matching for In-the-Wild Makeup Transfer
* LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization
* Lite-HRNet: A Lightweight High-Resolution Network
* Localizing Visual Sounds the Hard Way
* Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration
* Locate then Segment: A Strong Pipeline for Referring Image Segmentation
* LoFTR: Detector-Free Local Feature Matching with Transformers
* LOHO: Latent Optimization of Hairstyles via Orthogonalization
* Long-Tailed Multi-Label Visual Recognition by Collaborative Training on Uniform and Re-balanced Samplings
* Look Before You Leap: Learning Landmark Features for One-Stage Visual Grounding
* Look Before you Speak: Visually Contextualized Utterances
* Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation
* Looking into Your Speech: Learning Cross-modal Affinity for Audio-visual Speech Separation
* Loop Closure Detection by Using Global and Local Features With Photometric and Viewpoint Invariance
* Lottery Ticket Hypothesis for Object Recognition, The
* Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models, The
* low light natural image statistical model for joint contrast enhancement and denoising, A
* Low-light image enhancement via deep Retinex decomposition and bilateral learning
* LPSNet: A lightweight solution for fast panoptic segmentation
* LQF: Linear Quadratic Fine-Tuning
* LR-Net: Low-Rank Spatial-Spectral Network for Hyperspectral Image Denoising
* M3DSSD: Monocular 3D Single Stage Object Detector
* M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training
* MagDR: Mask-guided Detection and Reconstruction for Defending Deepfakes
* MagFace: A Universal Representation for Face Recognition and Quality Assessment
* Magic Layouts: Structural Prior for Component Detection in User Interface Designs
* Manifold Regularized Dynamic Network Pruning
* ManipulaTHOR: A Framework for Visual Object Manipulation
* Mapping of Directional Ocean Wave Spectra in Hurricanes and Other Environments
* Maritime filtering for images and videos
* MASA-SR: Matching Acceleration and Spatial Adaptation for Reference-Based Image Super-Resolution
* Mask Guided Matting via Progressive Refinement Network
* Mask-Embedded Discriminator with Region-based Semantic Regularization for Semi-Supervised Class-Conditional Image Synthesis
* Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging
* Masksembles for Uncertainty Estimation
* MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers
* MaxUp: Lightweight Adversarial Training with Data Augmentation Improves Neural Network Training
* MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation
* MDPET: A Unified Motion Correction and Denoising Adversarial Network for Low-Dose Gated PET
* MeanShift++: Extremely Fast Mode-Seeking With Applications to Segmentation and Object Tracking
* Measuring Deformed Sea Ice in Seasonal Ice Zones Using L-Band SAR Images
* MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection
* Melanoma skin cancer identification with amalgamated TSBTC and BTC colour features using ensemble of machine learning algorithms
* Memory Oriented Transfer Learning for Semi-Supervised Image Deraining
* Memory-Efficient Network for Large-scale Video Compressive Sensing
* Memory-guided Unsupervised Image-to-image Translation
* Mesh Saliency: An Independent Perceptual Measure or A Derivative of Image Saliency?
* Mesoscopic photogrammetry with an unstabilized phone camera
* Meta Batch-Instance Normalization for Generalizable Person Re-Identification
* Meta Pseudo Labels
* Meta-Mining Discriminative Samples for Kinship Verification
* MetaAlign: Coordinating Domain Alignment and Classification for Unsupervised Domain Adaptation
* MetaCorrection: Domain-aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation
* Metadata Normalization
* MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition
* MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition
* MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing
* MetaSets: Meta-Learning on Point Sets for Generalizable Representations
* Method for Deriving Relative Humidity From MODIS Data Under All-Sky Conditions, A
* MetricOpt: Learning to Optimize Black-Box Evaluation Metrics
* Micro-expression recognition from local facial regions
* Minimally Invasive Surgery for Sparse Neural Networks in Contrastive Manner
* Mining Better Samples for Contrastive Learning of Temporal Correspondence
* Mirror3D: Depth Refinement for Mirror Surfaces
* MIST: Multiple Instance Self-Training Framework for Video Anomaly Detection
* MIST: Multiple Instance Spatial Transformer
* Mitigating Face Recognition Bias via Group Adaptive Classifier
* Mixed-Privacy Forgetting in Deep Networks
* MobileDets: Searching for Object Detection Architectures for Mobile Accelerators
* Model-Aware Gesture-to-Gesture Translation
* Model-based 3D Hand Reconstruction via Self-Supervised Learning
* Model-Contrastive Federated Learning
* Modeling Level 2 Passive Microwave Precipitation Retrieval Error Over Complex Terrain Using a Nonparametric Statistical Technique
* Modeling Multi-Label Action Dependencies for Temporal Action Localization
* Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion
* Mol2Image: Improved Conditional Flow Models for Molecule to Image Synthesis
* MongeNet: Efficient Sampler for Geometric Deep Learning
* Monocular 3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks
* Monocular 3D Object Detection: An Extrinsic Parameter Free Approach
* Monocular Depth Estimation via Listwise Ranking using the Plackett-Luce Model
* Monocular Real-time Full Body Capture with Inter-part Correlations
* Monocular Reconstruction of Neural Face Reflectance Fields
* MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera
* MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation
* Monte Carlo Scene Search for 3D Scene Understanding
* MOOD: Multi-level Out-of-distribution Detection
* More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval
* MOS: Towards Scaling Out-of-distribution Detection for Large Semantic Space
* MOST: A Multi-Oriented Scene Text Detector with Localization Refinement
* Motion Representations for Articulated Animation
* MotionRNN: A Flexible Model for Video Prediction with Spacetime-Varying Motions
* MoViNets: Mobile Video Networks for Efficient Video Recognition
* MP3: A Unified Model to Map, Perceive, Predict and Plan
* MR Image Super-Resolution with Squeeze and Excitation Reasoning Attention Network
* Multi-attentional Deepfake Detection
* Multi-Branch Tensor Network Structure for Tensor-Train Discriminant Analysis
* Multi-Decoding Deraining Network and Quasi-Sparsity Based Training
* Multi-focus image fusion via Joint convolutional analysis and synthesis sparse representation
* Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning
* Multi-Itinerary Optimization as Cloud Service
* Multi-Label Activity Recognition using Activity-specific Features and Activity Correlations
* Multi-Label Learning from Single Positive Labels
* Multi-Level Difference Repair Architecture for Face Hallucination
* Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
* Multi-Modal Relational Graph for Cross-Modal Video Moment Retrieval
* Multi-Objective CNN-Based Algorithm for SAR Despeckling
* Multi-Objective Interpolation Training for Robustness to Label Noise
* Multi-person Implicit Reconstruction from a Single Image
* Multi-Perspective LSTM for Joint Visual Representation Learning
* Multi-Scale Aligned Distillation for Low-Resolution Detection
* Multi-Scale Spatial Attention-Guided Monocular Depth Estimation With Semantic Enhancement
* Multi-scale spatial convolution algorithm for lane line detection and lane offset estimation in complex road conditions
* Multi-shot Temporal Event Localization: a Benchmark
* Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation
* Multi-stage Aggregated Transformer Network for Temporal Language Localization in Videos
* Multi-Stage Progressive Image Restoration
* Multi-Target Domain Adaptation with Collaborative Consistency Learning
* Multi-Task Network for Joint Specular Highlight Detection and Removal, A
* Multi-Temporal Urban Development SpaceNet Dataset, The
* Multi-view 3D Reconstruction of a Texture-less Smooth Surface of Unknown Generic Reflectance
* Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks
* Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo
* Multi-wavelet guided deep mean-shift prior for image restoration
* MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization
* MultiLink: Multi-class Structure Recovery via Agglomerative Clustering and Model Selection
* Multimodal Contrastive Training for Visual Representation Learning
* Multimodal Motion Prediction with Stacked Transformers
* Multiparameter discrete transforms based on discrete orthogonal polynomials and their application to image watermarking
* Multiple Instance Active Learning for Object Detection
* Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles
* Multiple Object Tracking with Correlation Learning
* Multiplexed Network for End-to-End, Multilingual OCR, A
* Multiresolution Knowledge Distillation for Anomaly Detection
* Multispectral Photometric Stereo for Spatially-Varying Spectral Reflectances: A well posed problem?
* Multitemporal Hyperspectral Images Change Detection Based on Joint Unmixing and Information Coguidance Strategy
* Multiview vision-based human crowd localization for UAV fleet flight safety
* MUST-GAN: Multi-level Statistics Transfer for Self-driven Person Image Generation
* Mutual CRF-GNN for Few-shot Learning
* Mutual Graph Learning for Camouflaged Object Detection
* Mutual Information-Based Self-Supervised Learning Model for PolSAR Land Cover Classification, A
* NAS-FCOS: Efficient Search for Object Detection Architectures
* Natural Adversarial Examples
* Navigating the GAN Parameter Space for Semantic Image Editing
* NBNet: Noise Basis Learning for Image Denoising with Subspace Projection
* Nearest Neighbor Matching for Deep Clustering
* Neighbor2Neighbor: Self-Supervised Denoising from Single Noisy Images
* Neighborhood Contrastive Learning for Novel Class Discovery
* Neighborhood Normalization for Robust Geometric Feature Learning
* NeRD: Neural 3D Reflection Symmetry Detector
* NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections
* NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis
* NetAdaptV2: Efficient Neural Architecture Search with Fast Super-Network Training and Architecture Optimization
* Network Pruning via Performance Maximization
* Network Quantization with Element-wise Gradient Scaling
* Neural Architecture Search with Random Labels
* Neural Auto-Exposure for High-Dynamic Range Object Detection
* Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans
* Neural Camera Simulators
* Neural Cellular Automata Manifold
* Neural Deformation Graphs for Globally-consistent Non-rigid Reconstruction
* Neural Descent for Visual 3D Human Pose and Shape
* Neural Feature Search for RGB-Infrared Person Re-Identification
* Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes
* Neural Lumigraph Rendering
* Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks
* Neural Prototype Trees for Interpretable Fine-grained Image Recognition
* Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation
* Neural Response Interpretation through the Lens of Critical Pathways
* Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes
* Neural Scene Graphs for Dynamic Scenes
* Neural Side-By-Side: Predicting Human Preferences for No-Reference Super-Resolution Evaluation
* Neural Splines: Fitting 3D Surfaces with Infinitely-Wide Neural Networks
* Neural Surface Maps
* Neural Tangent Link Between CNN Denoisers and Non-Local Filters, The
* NeuralFusion: Online Depth Fusion in Latent Space
* NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering using RGB Cameras
* NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video
* NeuroMorph: Unsupervised Shape Interpolation and Correspondence in One Go
* NeuTex: Neural Texture Mapping for Volumetric Neural Rendering
* NewtonianVAE: Proportional Control and Goal Identification from Pixels via Physical Latent Spaces
* NeX: Real-time View Synthesis with Neural Basis Expansion
* NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions
* Nighttime Visibility Enhancement by Increasing the Dynamic Range and Suppression of Light Effects
* No frame left behind: Full Video Action Recognition
* No Shadow Left Behind: Removing Objects and their Shadows using Approximate Lighting and Geometry
* Noise-resistant Deep Metric Learning with Ranking-based Instance Selection
* Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation
* Normal Integration via Inverse Plane Fitting with Minimum Point-to-Plane Distance
* NormalFusion: Real-Time Acquisition of Surface Normals for High-Resolution RGB-D Scanning
* Normalized Avatar Synthesis Using StyleGAN and Perceptual Refinement
* Not just Compete, but Collaborate: Local Image-to-Image Translation via Cooperative Mask Prediction
* Novel hyper chaotic color image encryption based on pixel and bit level scrambling with diffusion
* Novel Motion Compensation Scheme for 2-D Multichannel SAR Systems With Quaternion Posture Calculation, A
* NPAS: A Compiler-aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration
* Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food
* Object classification from randomized EEG trials
* Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild with Pose Annotations
* Objects are Different: Flexible Monocular 3D Object Detection
* OBoW: Online Bag-of-Visual-Words Generation for Self-Supervised Learning
* OCONet: Image Extrapolation by Object Completion
* Offboard 3D Object Detection from Point Cloud Sequences
* OLE: A Novel Oceanic Lidar Emulator
* Omni-supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning
* Omnimatte: Associating Objects and Their Effects in Video
* On Feature Normalization and Data Augmentation
* On Focal Loss for Class-Posterior Probability Estimation: A Theoretical Perspective
* On Hallucinations in Tomographic Image Reconstruction
* On Learning the Geodesic Path for Incremental Learning
* On Robustness and Transferability of Convolutional Neural Networks
* On Self-Contact and Human Pose
* On Semantic Similarity in Video Retrieval
* On the Difficulty of Membership Inference Attacks
* One Shot Face Swapping on Megapixels
* One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation
* One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing
* One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking
* Online Learning of a Probabilistic and Adaptive Scene Representation
* Online Multiple Object Tracking with Cross-Task Synergy
* OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection
* Open Domain Generalization with Domain-Augmented Meta-Learning
* Open World Compositional Zero-Shot Learning
* Open-book Video Captioning with Retrieve-Copy-Generate Network
* Open-Vocabulary Object Detection Using Captions
* OpenMix: Reviving Known Knowledge for Discovering Novel Visual Categories in an Open World
* OpenRooms: An Open Framework for Photorealistic Indoor Scene Datasets
* Optimal Gradient Checkpoint Search for Arbitrary Computation Graphs
* Optimal Quantization using Scaled Codebook
* Order Estimation via Matrix Completion for Multi-Switch Antenna Selection
* ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-supervised Continual Learning
* Orthogonal Over-Parameterized Training
* OSTeC: One-Shot Texture Completion
* OTA: Optimal Transport Assignment for Object Detection
* OTCE: A Transferability Metric for Cross-Domain Cross-Task Representations
* Out-of-Distribution Detection Using Union of 1-Dimensional Subspaces
* Over-the-Air Adversarial Flickering Attacks against Video Recognition Networks
* PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds
* Pair comparison based progressive subjective quality ranking for underwater images
* PANDA: Adapting Pretrained Features for Anomaly Detection and Segmentation
* Panoptic Segmentation Forecasting
* Panoptic-PolarNet: Proposal-free LiDAR Point Cloud Panoptic Segmentation
* Panoramic Image Reflection Removal
* Parameterization of the Cloud Scattering Polarization Signal Derived From GPM Observations for Microwave Fast Radiative Transfer Models, A
* Pareidolia Face Reenactment
* Pareto Self-Supervised Training for Few-Shot Learning
* Parser-Free Virtual Try-on via Distilling Appearance Flows
* Part-aware Panoptic Segmentation
* Partial Feature Selection and Alignment for Multi-Source Domain Adaptation
* Partial Person Re-identification with Part-Part Correspondence Learning
* Partially View-aligned Representation Learning with Noise-robust Contrastive Loss
* Partition-Guided GANs
* Passive Inter-Photon Imaging
* Passive Synthetic Aperture Radar Imaging Using Radio-Astronomical Sources
* Patch-Based U-Net Model for Isotropic Quantitative Differential Phase Contrast Imaging
* Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition
* Patch-VQ: Patching Up the Video Quality Problem
* Patch2Pix: Epipolar-Guided Pixel-Level Correspondences
* PatchMatch-Based Neighborhood Consensus for Semantic Correspondence
* PatchmatchNet: Learned Multi-View Patchmatch Stereo
* Patchwise Generative ConvNet: Training Energy-Based Models from a Single Natural Image for Internal Learning
* PAUL: Procrustean Autoencoder for Unsupervised Lifting
* PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers
* PD-GAN: Probabilistic Diverse GAN for Image Inpainting
* Pedestrian and Ego-vehicle Trajectory Prediction from Monocular Camera
* Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts, A
* Perception Matters: Detecting Perception Failures of VQA Models Using Metamorphic Testing
* Perceptual Indistinguishability-Net (PI-Net): Facial Image Obfuscation with Manipulable Semantics
* Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks
* Permuted AdaIN: Reducing the Bias Towards Global Statistics in Image Classification
* Person Re-identification using Heterogeneous Local Graph Attention Networks
* Person30K: A Dual-Meta Generalization Network for Person Re-Identification
* Personalized Outfit Recommendation with Learnable Anchors
* PGT: A Progressive Method for Training Models on Long Videos
* PhD Learning: Learning with Pompeiu-hausdorff Distances for Video-based Vehicle Re-Identification
* PhySG: Inverse Rendering with Spherical Gaussians for Physics-based Material Editing and Relighting
* Physically-aware Generative Network for 3D Shape Modeling
* Physics-based Iterative Projection Complex Neural Network for Phase Retrieval in Lensless Microscopy Imaging
* pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis
* Picasso: A CUDA-based Library for Deep Learning over 3D Meshes
* PiCIE: Unsupervised Semantic Segmentation using Invariance and Equivariance in Clustering
* PISE: Person Image Synthesis and Editing with Decoupled GAN
* Pixel Codec Avatars
* Pixel-aligned Volumetric Avatars
* Pixel-in-Pixel Net: Towards Efficient Facial Landmark Detection in the Wild
* Pixel-wise Anomaly Detection in Complex Driving Scenes
* pixelNeRF: Neural Radiance Fields from One or Few Images
* PixMatch: Unsupervised Domain Adaptation via Pixelwise Consistency Training
* PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation with Neural Positional Encoding and Distilled Matting Loss
* Plan2Scene: Converting Floorplans to 3D Scenes
* Planimetric Location Method for Laser Footprints of the Chinese Gaofen-7 Satellite Using Laser Spot Center Detection and Image Matching to Stereo Image Product, A
* Playable Video Generation
* PLOP: Learning without Forgetting for Continual Semantic Segmentation
* PlückerNet: Learn to Register 3D Line Reconstructions
* PML: Progressive Margin Loss for Long-tailed Age Classification
* PMP-Net: Point Cloud Completion by Learning Multi-step Point Moving Paths
* Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos
* Point Cloud Instance Segmentation using Probabilistic Embeddings
* Point Cloud Registration Based on One-Point RANSAC and Scale-Annealing Biweight Estimation
* Point Cloud Upsampling via Disentangled Refinement
* Point2Skeleton: Learning Skeletal Representations from Point Clouds
* PointAugmenting: Cross-Modal Augmentation for 3D Object Detection
* PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency
* PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation
* PointGuard: Provably Robust 3D Point Cloud Classification
* PointNetLK Revisited
* Points as Queries: Weakly Semi-supervised Object Detection by Points
* Polarimetric Normal Stereo
* Polka Lines: Learning Structured Illumination and Reconstruction for Active Stereo
* Polygonal Building Extraction by Frame Field Learning
* Polygonal Point Set Tracking
* Populating 3D Scenes by Learning Human-Scene Interaction
* Pose Recognition with Cascade Transformers
* Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation
* Pose-Guided Human Animation from a Single Image in the Wild
* PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation
* POSEFusion: Pose-guided Selective Fusion for Single-view Human Volumetric Capture
* Positional Encoding as Spatial Inductive Bias in GANs
* Positive Sample Propagation along the Audio-Visual Event Line
* Positive-Congruent Training: Towards Regression-Free Model Updates
* Positive-Unlabeled Data Purification in the Wild for Object Detection
* Post-hoc Uncertainty Calibration for Domain Drift Scenarios
* Posterior Promoted GAN with Distribution Discriminator for Unsupervised Image Synthesis
* PPR10K: A Large-Scale Portrait Photo Retouching Dataset with Human-Region Mask and Group-Level Consistency
* PQA: Perceptual Question Answering
* Practical Single-Image Super-Resolution Using Look-Up Table
* Practical Wide-Angle Portraits Correction with Deep Structured Models
* Practice of Applying AI to Benefit Visually Impaired People in China, The
* Pre-Trained Image Processing Transformer
* PREDATOR: Registration of 3D Point Clouds with Low Overlap
* Predicting Human Scanpaths in Visual Question Answering
* Primitive Representation Learning for Scene Text Recognition
* Prior Based Human Completion
* Prioritized Architecture Sampling with Monto-Carlo Tree Search
* Privacy Preserving Localization and Mapping from Uncalibrated Cameras
* Privacy-preserving Collaborative Learning with Automatic Transformation Search
* Privacy-Preserving Image Features via Adversarial Affine Subspace Embeddings
* Probabilistic 3D Human Shape and Pose Estimation from Multiple Unconstrained Images in the Wild
* Probabilistic Embeddings for Cross-Modal Retrieval
* Probabilistic Model Distillation for Semantic Correspondence
* Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation
* Probabilistic Selective Encryption of Convolutional Neural Networks for Hierarchical Services
* Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking
* Progressive Contour Regression for Arbitrary-Shape Scene Text Detection
* Progressive Diversified Augmentation for General Robustness of DNNs: A Unified Approach
* Progressive Domain Expansion Network for Single Domain Generalization
* Progressive Modality Reinforcement for Human Multimodal Emotion Recognition from Unaligned Multimodal Sequences
* Progressive Semantic Segmentation
* Progressive Semantic-Aware Style Transformation for Blind Face Restoration
* Progressive Stage-wise Learning for Unsupervised Feature Representation Enhancement
* Progressive Temporal Feature Alignment Network for Video Inpainting
* Progressive Unsupervised Learning for Visual Object Tracking
* Progressively Complementary Network for Fisheye Image Rectification Using Appearance Flow
* Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-view Transformation
* Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning
* ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks
* Protecting Intellectual Property of Generative Adversarial Networks from Ambiguity Attacks
* Prototype Augmentation and Self-Supervision for Incremental Learning
* Prototype Completion with Primitive Knowledge for Few-Shot Learning
* Prototype-Guided Saliency Feature Learning for Person Search
* Prototype-supervised Adversarial Network for Targeted Attack of Deep Hashing
* Prototypical Cross-domain Self-supervised Learning for Few-shot Unsupervised Domain Adaptation
* Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation
* PSD: Principled Synthetic-to-Real Dehazing Guided by Physical Priors
* Pseudo 3D Auto-Correlation Network for Real Image Denoising
* Pseudo Facial Generation with Extreme Poses for Face Recognition
* PSRR-MaxpoolNMS: Pyramid Shifted MaxpoolNMS with Relationship Recovery
* PU-GCN: Point Cloud Upsampling using Graph Convolutional Networks
* Pulsar: Efficient Sphere-based Neural Rendering
* Pushing it out of the Way: Interactive Visual Navigation
* PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds
* PVGNet: A Bottom-Up One-Stage 3D Object Detector with Integrated Multi-Level Features
* PVO-based reversible data hiding using adaptive multiple histogram generation and modification
* PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization
* QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval
* QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information
* QPP: Real-Time Quantization Parameter Prediction for Deep Neural Networks
* Quality-Agnostic Image Recognition via Invertible Decoder
* Quantifying Explainers of Graph Neural Networks in Computational Pathology
* Quantum Permutation Synchronization
* Quasi-Dense Similarity Learning for Multiple Object Tracking
* Quasiconvex Formulation for Radial Cameras, A
* Radar-Camera Pixel Depth Association for Depth Completion
* Radio Frequency Interference Detection and Localization in Sentinel-1 Images
* RAFT-3D: Scene Flow using Rigid-Motion Embeddings
* Railroad is not a Train: Saliency as Pseudo-pixel Supervision for Weakly Supervised Semantic Segmentation
* Rainbow Memory: Continual Learning with a Memory of Diverse Samples
* RangeIoUDet: Range Image based Real-Time 3D Object Detector Optimized by Intersection over Union
* Rank-One Prior: Toward Real-Time Scene Recovery
* RankDetNet: Delving into Ranking Constraints for Object Detection
* Ranking Neural Checkpoints
* RaScaNet: Learning Tiny Models by Raster-Scanning Images
* Re-labeling ImageNet: From Single to Multi-Labels, from Global to Localized Labels
* Read and Attend: Temporal Localisation in Sign Language Videos
* Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition
* ReAgent: Point Cloud Registration using Imitation and Reinforcement Learning
* Real-time 16K video coding on a GPU with complexity scalable BPC-PaCo
* Real-Time High-Resolution Background Matting
* Real-Time Selfie Video Stabilization
* Real-Time Sphere Sweeping Stereo from Multiview Fisheye Images
* Realistic Evaluation of Semi-Supervised Learning for Fine-Grained Classification, A
* Reciprocal Landmark Detection and Tracking with Extremely Few Annotations
* Reciprocal Transformations for Unsupervised Video Object Segmentation
* Recognizing Actions in Videos from Unseen Viewpoints
* Reconsidering Representation Alignment for Multi-view Clustering
* Reconstructing 3D Human Pose by Watching Humans in the Mirror
* Reconstruction of Orthographic Mosaics From Perspective X-Ray Images
* Recorrupted-to-Recorrupted: Unsupervised Deep Learning for Image Denoising
* Rectification-based Knowledge Retention for Continual Learning
* Recurrent Multi-view Alignment Network for Unsupervised Surface Registration
* Recursive residual atrous spatial pyramid pooling network for single image deraining
* ReDet: A Rotation-equivariant Detector for Aerial Object Detection
* Reducing Domain Gap by Reducing Style Bias
* Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images
* Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation
* RefineMask: Towards High-Quality Instance Segmentation with Fine-Grained Features
* Refining Pseudo Labels with Clustering Consensus over Generations for Unsupervised Object Re-identification
* Reflection Removal Using Dual-Polarization and Saliency in Millimeter-Wave and Terahertz Imaging
* Reformulating HOI Detection as Adaptive Set Prediction
* Region-aware Adaptive Instance Normalization for Image Harmonization
* Regressive Domain Adaptation for Unsupervised Keypoint Detection
* Regularization by Denoising super-resolution method based on genetic algorithms, A
* Regularization Strategy for Point Cloud via Rigidly Mixed Sample
* Regularizing Generative Adversarial Networks under Limited Data
* Regularizing Neural Networks via Adversarial Model Perturbation
* Reinforced Attention for Few-Shot Learning and Beyond
* Relation-aware Instance Refinement for Weakly Supervised Visual Grounding
* Relative multiscale deep depth from focus
* Relative Order Analysis and Optimization for Unsupervised Deep Metric Learning
* Relevance-CAM: Your Model Already Knows Where to Look
* ReMix: Towards Image-to-Image Translation with Limited Data
* Removing Diffraction Image Artifacts in Under-Display Camera via Dynamic Skip Connection Network
* Removing Raindrops and Rain Streaks in One Go
* Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning
* ReNAS: Relativistic Evaluation of Neural Architecture Search
* Repetitive Activity Counting by Sight and Sound
* Repopulating Street Scenes
* Representation Learning via Global Temporal Alignment and Cycle-Consistency
* Representative Batch Normalization with Feature Calibration
* Representative Forgery Mining for Fake Face Detection
* Representing Videos as Discriminative Sub-graphs for Action Recognition
* Repurposing GANs for One-Shot Semantic Part Segmentation
* RepVGG: Making VGG-style ConvNets Great Again
* Residential floor plan recognition and reconstruction
* Restore from Restored: Video Restoration with Pseudo Clean Video
* Restoring Extremely Dark Images in Real Time
* Rethinking and Improving the Robustness of Image Style Transfer
* Rethinking BiSeNet For Real-time Semantic Segmentation
* Rethinking Channel Dimensions for Efficient Model Design
* Rethinking Class Relations: Absolute-relative Supervised and Unsupervised Few-shot Learning
* Rethinking Graph Neural Architecture Search from Message-passing
* Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
* Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes
* Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach
* Rethinking the Heatmap Regression for Bottom-up Human Pose Estimation
* Retinex-inspired Unrolling with Cooperative Prior Architecture Search for Low-light Image Enhancement
* Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning
* Revisiting Knowledge Distillation: An Inheritance and Exploration Framework
* Revisiting Superpixels for Active Learning in Semantic Segmentation with Realistic Annotation Costs
* RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction
* RGB-D Local Implicit Function for Depth Completion of Transparent Objects
* Rich Context Aggregation with Reflection Prior for Glass Surface Detection
* Rich features for perceptual quality assessment of UGC videos
* Ridgelet-Nets With Speckle Reduction Regularization for SAR Image Scene Classification
* Riggable 3D Face Reconstruction via In-Network Optimization
* Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations
* Robust and Accurate Object Detection via Adversarial Learning
* Robust and reversible image watermarking in homomorphic encrypted domain
* Robust Audio-Visual Instance Discrimination
* Robust Bayesian Neural Networks by Spectral Expectation Bound Regularization
* Robust Consistent Video Depth Estimation
* Robust Instance Segmentation through Reasoning about Multi-Object Occlusion
* Robust multi-source co-adaptation with adaptive loss minimization
* Robust Multimodal Vehicle Detection in Foggy Weather Using Complementary Lidar and Radar Signals
* Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments
* Robust Point Cloud Registration Framework Based on Deep Graph Matching
* Robust Reference-based Super-Resolution via C2-Matching
* Robust Reflection Removal with Reflection-free Flash-only Cues
* Robust Representation Learning with Feedback for Single Image Deraining
* RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening
* Roof-GAN: Learning to Generate Roof Geometry and Relations for Residential Houses
* Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression
* Roses are Red, Violets are Blue… But Should VQA expect Them To?
* Rotation Coordinate Descent for Fast Globally Optimal Rotation Averaging
* Rotation Equivariant Siamese Networks for Tracking
* Rotation-Only Bundle Adjustment
* RPN Prototype Alignment For Domain Adaptive Object Detector
* RPSRNet: End-to-End Trainable Rigid Point Set Registration Network using Barnes-Hut 2D-Tree Representation
* RSG: A Simple but Effective Module for Learning Imbalanced Datasets
* RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection
* RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words
* S2-aware network for visual recognition
* S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration
* S2R-DepthNet: Learning a Generalizable Depth-specific Structural Representation
* S3: Learnable Sparse Signal Superdensity for Guided Depth Estimation
* S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling
* Safe Local Motion Planning with Self-Supervised Freespace Forecasting
* SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction from Video Data
* Saliency-Guided Image Translation
* Salient Object Detection Based on Progressively Supervised Learning for Remote Sensing Images
* SC-PNN: Saliency Cascade Convolutional Neural Network for Pansharpening
* Scalability vs. Utility: Do We Have to Sacrifice One for the Other in Data Importance Quantification?
* Scalable Differential Privacy with Sparse Network Finetuning
* Scale-aware Automatic Augmentation for Object Detection
* Scale-Aware Graph Neural Network for Few-Shot Semantic Segmentation
* Scale-Localized Abstract Reasoning
* SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements
* Scaled-YOLOv4: Scaling Cross Stage Partial Network
* Scaling Local Self-Attention for Parameter Efficient Visual Backbones
* Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
* SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks
* Scene Essence
* Scene Text Retrieval via Joint Text Detection and Similarity Learning
* Scene Text Telescope: Text-Focused Scene Image Super-Resolution
* Scene-aware Generative Network for Human Motion Synthesis
* Scene-Intuitive Agent for Remote Embodied Visual Grounding
* SceneGen: Learning to Generate Realistic Traffic Scenes
* SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences
* SCF-Net: Learning Spatial Contextual Features for Large-Scale Point Cloud Segmentation
* SDD-FIQA: Unsupervised Face Image Quality Assessment with Similarity Distribution Distance
* SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud
* Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator
* Searching for Fast Model Families on Datacenter Accelerators
* Second-Order Approach to Learning with Instance-Dependent Label Noise, A
* See through Gradients: Image Batch Recovery via GradInversion
* Seeing Behind Objects for 3D Multi-Object Tracking in RGB-D Sequences
* Seeing in Extra Darkness Using a Deep-Red Flash
* Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning
* Seeking the Shape of Sound: An Adaptive Framework for Learning Voice-Face Association
* Seesaw Loss for Long-Tailed Instance Segmentation
* Segmentation of turbulent computational fluid dynamics simulations with unsupervised ensemble learning
* Self-Aligned Video Deraining with Transmission-Depth Consistency
* Self-attention based Text Knowledge Mining for Text Detection
* Self-boosting Framework for Automated Radiographic Report Generation, A
* Self-generated Defocus Blur Detection via Dual Adversarial Discriminators
* Self-Guided and Cross-Guided Learning for Few-Shot Segmentation
* Self-Point-Flow: Self-Supervised Scene Flow Estimation from Point Clouds with Optimal Transport and Random Walk
* Self-Promoted Prototype Refinement for Few-Shot Class-Incremental Learning
* Self-SAGCN: Self-Supervised Semantic Alignment for Graph Convolution Network
* Self-Supervised 3D Mesh Reconstruction from Single Images
* Self-supervised Augmentation Consistency for Adapting Semantic Segmentation
* Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual Try-On
* Self-supervised Geometric Perception
* Self-Supervised Learning for Semi-Supervised Temporal Action Proposal
* Self-supervised Learning of Depth Inference for Multi-view Stereo
* Self-Supervised Learning on 3D Point Clouds by Learning Discrete Generative Models
* Self-supervised Motion Learning from Static Images
* Self-Supervised Multi-Frame Monocular Scene Flow
* Self-Supervised Pillar Motion Learning for Autonomous Driving
* Self-Supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map
* Self-Supervised Video GANs: Learning for Appearance Consistency and Motion Coherency
* Self-supervised Video Hashing via Bidirectional Transformers
* Self-Supervised Video Representation Learning by Context and Motion Decoupling
* Self-Supervised Visibility Learning for Novel View Synthesis
* Self-Supervised Wasserstein Pseudo-Labeling for Semi-Supervised Image Classification
* SelfAugment: Automatic Augmentation Policies for Self-Supervised Learning
* SelfDoc: Self-Supervised Document Representation Learning
* Semantic Audio-Visual Navigation
* Semantic Image Matting
* Semantic Palette: Guiding Scene Generation with Class Proportions
* Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection
* Semantic Scene Completion via Integrating Instances and Scene in-the-Loop
* Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion
* Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization
* Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning
* Semantic-Aware Video Text Detection
* Semi-Supervised 3D Hand-Object Poses Estimation with Interactions in Time
* Semi-Supervised Action Recognition with Temporal Contrastive Learning
* Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation
* Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision
* Semi-supervised Semantic Segmentation with Directional Context-aware Consistency
* Semi-supervised Synthesis of High-Resolution Editable Textures for 3D Humans
* Semi-Supervised Video Deraining with Dynamical Rain Generator
* Separating Skills and Concepts for Novel Visual Question Answering
* Sequence-to-Sequence Contrastive Learning for Text Recognition
* Sequential Graph Convolutional Network for Active Learning
* SetVAE: Learning Hierarchical Composition for Generative Modeling of Set-Structured Data
* Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark
* SG-Net: Spatial Granularity Network for One-Stage Video Instance Segmentation
* SGCN: Sparse Graph Convolution Network for Pedestrian Trajectory Prediction
* Shallow Feature Matters for Weakly Supervised Object Localization
* Shape and Material Capture at Home
* Shape from Sky: Polarimetric Normal Recovery Under The Sky
* Shared Cross-Modal Trajectory Prediction for Autonomous Driving
* Shelf-Supervised Mesh Prediction in the Wild
* Ship Detection in SAR Images via Enhanced Nonnegative Sparse Locality-Representation of Fisher Vectors
* Shot Contrastive Self-Supervised Learning for Scene Boundary Detection
* Siamese Natural Language Tracker: Tracking by Natural Language Descriptions with Siamese Trackers
* SiamMOT: Siamese Multi-Object Tracking
* Sign-Agnostic Implicit Learning of Surface Self-Similarities for Shape Modeling and Reconstruction from Raw Point Clouds
* Similarity-Based Ranking Method for Hyperspectral Band Selection, A
* Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
* SimPLE: Similar Pseudo Label Exploitation for Semi-Supervised Classification
* Simpler Certified Radius Maximization by Propagating Covariances
* Simplified square to hexagonal lattice conversion based on 1-D multirate processing
* SimPoE: Simulated Character Control for 3D Human Pose Estimation
* Simulating Unknown Target Models for Query-Efficient Black-box Attacks
* Simultaneously Localize, Segment and Rank the Camouflaged Objects
* Single Image Depth Prediction with Wavelet Decomposition
* Single Image Reflection Removal with Absorption Effect
* Single Pair Cross-Modality Super Resolution
* Single-Shot Freestyle Dance Reenactment
* Single-Stage Instance Shadow Detection with Bidirectional Relation Learning
* Single-View 3D Object Reconstruction from Shape Priors in Memory
* Single-view robot pose and joint angle estimation via render & compare
* SIPSA-Net: Shift-Invariant Pan Sharpening with Moving Object Alignment for Satellite Imagery
* Skeleton Merger: an Unsupervised Aligned Keypoint Detector
* Skeleton-Based Action Recognition with Focusing-Diffusion Graph Convolutional Networks
* Sketch, Ground, and Refine: Top-Down Dense Video Captioning
* Sketch2Model: View-Aware 3D Modeling from Single Free-Hand Sketches
* SKFAC: Training Neural Networks with Faster Kronecker-Factored Approximate Curvature
* Skip-Convolutions for Efficient Video Processing
* SLADE: A Self-Training Framework For Distance Metric Learning
* Sliced Wasserstein Loss for Neural Texture Synthesis, A
* SliceNet: deep dense depth estimation from a single indoor panorama using a slice-based representation
* Slimmable Compressive Autoencoders for Practical Neural Image Compression
* SMD-Nets: Stereo Mixture Density Networks
* Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation
* SMPLicit: Topology-aware Generative Model for Clothed People
* SMURF: Self-Teaching Multi-Frame Unsupervised RAFT with Full-Image Warping
* SOE-Net: A Self-Attention and Orientation Encoding Network for Point Cloud based Place Recognition
* Soft-IntroVAE: Analyzing and Improving the Introspective Variational Autoencoder
* Soil Moisture Retrieval Using Reflection Coefficients: Numerical Experiments
* SOLD2: Self-supervised Occlusion-aware Line Description and Detection
* SOON: Scenario Oriented Object Navigation with Graph-based Exploration
* Soteria: Provable Defense against Privacy Leakage in Federated Learning from Representation Perspective
* Source-Free Domain Adaptation for Semantic Segmentation
* Space-Time Distillation for Video Super-Resolution
* Space-time Neural Irradiance Fields for Free-Viewpoint Video
* Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion
* Sparse Multi-Path Corrections in Fringe Projection Profilometry
* Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
* Spatial Assembly Networks for Image Representation Learning
* Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation
* Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain
* Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos
* Spatially Consistent Representation Learning
* Spatially-Adaptive Pixelwise Networks for Fast Image Translation
* Spatially-Correlative Loss for Various Image Translation Tasks, The
* Spatially-invariant Style-codes Controlled Makeup Transfer
* Spatially-Varying Outdoor Lighting Estimation from Intrinsics
* Spatio-temporal Contrastive Domain Adaptation for Action Recognition
* Spatiotemporal Contrastive Video Representation Learning
* Spatiotemporal Registration for Event-based Visual Odometry
* SPB-Net: A Deep Network for SAR Imaging and Despeckling With Downsampled Data
* Speckle noise removal based on structural convolutional neural networks with feature fusion for medical image
* Spectral-Spatial Genetic Algorithm-Based Unsupervised Band Selection for Hyperspectral Image Classification
* Spherical Confidence Learning for Face Recognition
* SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration
* Spk2ImgNet: Learning to Reconstruct Dynamic Scene from Continuous Spike Stream
* Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions
* SPSG: Self-Supervised Photometric Scene Generation from RGB-D Scans
* Spurious Signal in SMAP Fourth Stokes Parameter
* Square Root Bundle Adjustment for Large-Scale Reconstruction
* SRDAN: Scale-aware and Range-aware Domain Adaptation Network for Cross-dataset 3D Object Detection
* SRNHARB: A deep light-weight image super resolution network using hybrid activation residual blocks
* SRWarp: Generalized Image Super-Resolution under Arbitrary Transformation
* SSAN: Separable Self-Attention Network for Video Representation Learning
* SSLayout360: Semi-Supervised Indoor Layout Estimation from 360° Panorama
* SSN: Soft Shadow Network for Image Compositing
* SSSIC: Semantics-to-Signal Scalable Image Coding With Learned Structural Representations
* SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation
* ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection
* Stable View Synthesis
* StablePose: Learning 6D Object Poses from Geometrically Stable Patches
* STaR: Self-supervised Tracking and Reconstruction of Rigid Objects in Motion with Neural Rendering
* Stay Positive: Non-Negative Image Synthesis for Augmented Reality
* STC-Flow: Spatio-temporal context-aware optical flow estimation
* StEP: Style-based Encoder Pre-training for Multi-modal Image Synthesis
* Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes
* StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision
* StickyPillars: Robust and Efficient Feature Matching on Point Clouds using Graph Neural Networks
* STMTrack: Template-free Visual Tracking with Space-time Memory Networks
* Stochastic Image-to-Video Synthesis using cINNs
* Stochastic Whitening Batch Normalization
* Strengthen Learning Tolerance for Weakly Supervised Object Localization
* Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes
* Structured Multi-Level Interaction Network for Video Moment Localization via Language Query
* Structured Scene Memory for Vision-Language Navigation
* StruMonoNet: Structure-Aware Monocular 3D Prediction
* Student-Teacher Learning from Clean Inputs to Noisy Inputs
* Study on Hand Gesture Recognition with CNN-Based Deep Learning
* study on selection of optimised piping supports and E-BOM generation, A
* Style-Aware Normalized Loss for Improving Arbitrary Style Transfer
* Style-based Point Generator with Adversarial Rendering for Point Cloud Completion
* StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval
* StyleMix: Separating Content and Style for Enhanced Data Augmentation
* StylePeople: A Generative Model of Fullbody Human Avatars
* StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation
* Stylized Neural Painting
* Sub-band coding of hexagonal images
* SuperMix: Supervising the Mixing Data Augmentation
* Superpixel Guided Sample Selection Neural Network for Handling Noisy Labels in Hyperspectral Image Classification, A
* Superpixel-Guided Discriminative Low-Rank Representation of Hyperspectral Images for Classification
* SurFree: a fast surrogate-free black-box attack
* Surrogate Gradient Field for Latent Space Manipulation
* SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
* SwiftNet: Real-time Video Object Segmentation
* Synthesize-It-Classifier: Learning a Generative Classifier through Recurrent Self-analysis
* Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes
* systematic evaluation of coding strategies for sparse binary images, A
* t-vMF Similarity For Regularizing Intra-Class Feature Distribution
* T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval
* Tackling the Ill-Posedness of Super-Resolution through Adaptive Target Generation
* Taming Transformers for High-Resolution Image Synthesis
* Tangent Space Backpropagation for 3D Transformation Groups
* TAP: Text-Aware Pre-training for Text-VQA and Text-Caption
* Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation
* Task Programming: Learning Data Efficient Behavior Representations
* Task-Aware Variational Adversarial Active Learning
* Taskology: Utilizing Task Relations at Scale
* TDN: Temporal Difference Networks for Efficient Action Recognition
* Teachers Do More Than Teach: Compressing Image-to-Image Models
* TearingNet: Point Cloud Autoencoder to Learn Topology-Friendly Representations
* TediGAN: Text-Guided Diverse Face Image Generation and Manipulation
* Temporal Action Segmentation from Timestamp Supervision
* Temporal Context Aggregation Network for Temporal Action Proposal Refinement
* Temporal modelling of first-person actions using hand-centric verb and object streams
* Temporal Modulation Network for Controllable Space-Time Video Super-Resolution
* Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth, The
* Temporal Query Networks for Fine-grained Video Understanding
* Temporal-Relational CrossTransformers for Few-Shot Action Recognition
* Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation
* TesseTrack: End-to-End Learnable Multi-Person Articulated 3D Pose Tracking
* Test-Time Fast Adaptation for Dynamic Scene Deblurring via Meta-Auxiliary Learning
* Text-Based Localization of Moments in a Video Corpus
* TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text
* There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
* Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers
* Three Birds with One Stone: Multi-Task Temporal Action Detection via Recycling Temporal Annotations
* Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation
* Time Adaptive Recurrent Neural Network
* Time-Lapse Seismic Difference-and-Joint Prestack AVA Inversion
* To the Point: Efficient 3D Object Detection in the Range Image with Graph Convolution Kernels
* Topological Planning with Transformers for Vision-and-Language Navigation
* Toward Accurate and Realistic Outfits Visualization with Attention to Details
* Toward Joint Thing-and-Stuff Mining for Weakly Supervised Panoptic Segmentation
* Towards Accurate 3D Human Motion Prediction from Incomplete Observations
* Towards Accurate Text-based Image Captioning with Content Diversity Exploration
* Towards Bridging Event Captioner and Sentence Localizer for Weakly Supervised Dense Event Captioning
* Towards Compact CNNs via Collaborative Compression
* Towards Diverse Paragraph Captioning for Untrimmed Videos
* Towards Efficient Tensor Decomposition-Based DNN Model Compression with Optimization Framework
* Towards Evaluating and Training Verifiably Robust Neural Networks
* Towards Extremely Compact RNNs for Video Recognition with Fully Decomposed Hierarchical Tucker Structure
* Towards Fast and Accurate Real-World Depth Super-Resolution: Benchmark Dataset and Baseline
* Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets
* Towards High Fidelity Face Relighting with Realistic Shadows
* Towards Improving the Consistency, Efficiency, and Flexibility of Differentiable Neural Architecture Search
* Towards Long-Form Video Understanding
* Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark
* Towards Open World Object Detection
* Towards Part-Based Understanding of RGB-D Scans
* Towards Real-World Blind Face Restoration with Generative Facial Prior
* Towards Robust Classification Model by Counterfactual and Invariant Data Generation
* Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes
* Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
* Towards Unified Surgical Skill Assessment
* TPCN: Temporal Point Cloud Networks for Motion Forecasting
* Track to Detect and Segment: An Online Multi-Object Tracker
* Track, Check, Repeat: An EM Approach to Unsupervised Tracking
* Tracking Pedestrian Heads in Dense Crowd
* TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors
* Training Generative Adversarial Networks in One Stage
* Training Networks in Null Space of Feature Covariance for Continual Learning
* Trajectory Prediction with Latent Belief Energy-Based Model
* Transferable Query Selection for Active Domain Adaptation
* Transferable Semantic Augmentation for Domain Adaptation
* TransFill: Reference-guided Image Inpainting by Merging Multiple Color and Spatial Transformations
* Transformation Driven Visual Reasoning
* Transformation Invariant Few-Shot Object Detection
* Transformer Interpretability Beyond Attention Visualization
* Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking
* Transformer Tracking
* Transitional Adaptation of Pretrained Models for Visual Storytelling
* Translucent Patch: A Physical and Universal Attack on Object Detectors, The
* TransNAS-Bench-101: Improving Transferability and Generalizability of Cross-Task Neural Architecture Search
* Tree-like Decision Distillation
* Triple-cooperative Video Shadow Detection
* Troubleshooting Blind Image Quality Models in the Wild
* Truly shift-invariant convolutional neural networks
* TSGCNet: Discriminative Geometric Feature Learning with Two-Stream Graph Convolutional Network for 3D Dental Model Segmentation
* Tuning IR-cut Filter for Illumination-aware Spectral Reconstruction from RGB
* Turning Frequency to Resolution: Video Super-resolution via Event Cameras
* Two-Decade Variability of Sea Surface Temperature and Chlorophyll-a in the Northern South China Sea as Revealed by Reconstructed Cloud-Free Satellite Data
* UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles
* UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training
* UHR-DeepFMT: Ultra-High Spatial Resolution Reconstruction of Fluorescence Molecular Tomography Based on 3-D Fusion Dual-Sampling Deep Neural Network
* Ultra-High-Definition Image Dehazing via Multi-Guided Bilateral Learning
* Unbalanced Feature Transport for Exemplar-based Image Translation
* Unbiased Mean Teacher for Cross-domain Object Detection
* Uncalibrated Neural Inverse Rendering for Photometric Stereo of General Surfaces
* Uncertainty Guided Collaborative Training for Weakly Supervised Temporal Action Detection
* Uncertainty Reduction for Model Adaptation in Semantic Segmentation
* Uncertainty-Aware Camera Pose Estimation from Points and Lines
* Uncertainty-Aware Joint Salient Object and Camouflaged Object Detection
* Uncertainty-guided Model Generalization to Unseen Domains
* Understanding and Simplifying Perceptual Distances
* Understanding Failures of Deep Networks via Robust Feature Extraction
* Understanding Object Dynamics for Interactive Image-to-Video Synthesis
* Understanding the Behaviour of Contrastive Loss
* Understanding the Imaging Capability of Tundra Orbits Compared to Other Orbits
* Understanding the Robustness of Skeleton-based Action Recognition under Adversarial Attack
* Unified quality assessment of natural and screen content images via adaptive weighting on double scales
* Unified Supervised-Unsupervised (SUPER) Learning for X-Ray CT Image Reconstruction
* UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation
* Universal Spectral Adversarial Attacks for Deformable Shapes
* Unpaired Image-to-Image Translation via Latent Energy Transport
* Unpaired MR Motion Artifact Deep Learning Using Outlier-Rejecting Bootstrap Aggregation
* UnrealPerson: An Adaptive Pipeline towards Costless Person Re-identification
* Unsupervised 3D Shape Completion through GAN Inversion
* Unsupervised Degradation Representation Learning for Blind Super-Resolution
* Unsupervised Discovery of the Long-Tail in Instance Segmentation Using Hierarchical Self-Supervision
* Unsupervised Disentanglement of Linear-Encoded Facial Semantics
* Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination
* Unsupervised Human Pose Estimation through Transforming Shape Templates
* Unsupervised Hyperbolic Metric Learning
* Unsupervised Hyperbolic Representation Learning via Message Passing Auto-Encoders
* Unsupervised Learning for Robust Fitting: A Reinforcement Learning Approach
* Unsupervised Learning of 3D Object Categories from Videos in the Wild
* Unsupervised Learning of Depth and Depth-of-Field Effect from Natural Images with Aperture Rendering Generative Adversarial Networks
* Unsupervised Multi-Domain Image Translation with Domain Representation Learning
* Unsupervised Multi-Source Domain Adaptation for Person Re-Identification
* Unsupervised Multi-source Domain Adaptation Without Access to Source Data
* Unsupervised Object Detection with LiDAR Clues
* Unsupervised Part Segmentation through Disentangling Appearance and Shape
* Unsupervised Pre-training for Person Re-identification
* Unsupervised Real-World Image Super Resolution via Domain-Distance Aware Training
* Unsupervised Visual Attention and Invariance for Reinforcement Learning
* Unsupervised Visual Representation Learning by Tracking Patches in Video
* UnsupervisedR&R: Unsupervised Point Cloud Registration via Differentiable Rendering
* Unveiling the Potential of Structure Preserving for Weakly Supervised Object Localization
* UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
* UPFlow: Upsampling Pyramid for Unsupervised Optical Flow Learning
* User-Guided Line Art Flat Filling with Split Filling Mechanism
* Using Shape to Categorize: Low-Shot Learning with an Explicit Shape Bias
* UV-Net: Learning from Boundary Representations
* VaB-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning
* Variational Pedestrian Detection
* Variational Prototype Learning for Deep Face Recognition
* Variational Relational Point Completion Network
* Variational Transformer Networks for Layout Generation
* VarifocalNet: An IoU-aware Dense Object Detector
* VDSM: Unsupervised Video Disentanglement with State-Space Modeling and Deep Mixtures of Experts
* Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting
* Verifiability and Predictability: Interpreting Utilities of Network Architectures for Point Cloud Processing
* Video Object Segmentation Using Global and Instance Embedding Learning
* Video Prediction Recalling Long-term Motion Context via Memory Alignment Learning
* Video Rescaling Networks with Joint Optimization Strategies for Downscaling and Upscaling
* VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples
* View Generalization for Single Image Textured 3D Models
* View-Guided Point Cloud Completion
* VIGOR: Cross-View Image Geo-localization beyond One-to-one Retrieval
* VinVL: Revisiting Visual Representations in Vision-Language Models
* ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation
* ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search
* VirFace: Enhancing Face Recognition via Unlabeled Shallow Data
* VirTex: Learning Visual Representations from Textual Annotations
* Virtual Fully-Connected Layer: Training a Large-Scale Face Recognition Dataset with Limited Computational Resources
* Visiting the Invisible: Layer-by-Layer Completed Scene Decomposition
* Visual Navigation with Spatial Attention
* Visual Room Rearrangement
* Visual Semantic Role Labeling for Video Understanding
* Visualizing Adapted Knowledge in Domain Transfer
* Visually Informed Binaural Audio Generation without Binaural Audios
* VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
* VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization
* VLN_BERT: A Recurrent Vision-and-Language BERT for Navigation
* VoxelContext-Net: An Octree based Framework for Point Cloud Compression
* VS-Net: Voting with Segmentation for Visual Localization
* VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild
* VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs
* Wasserstein Barycenter for Multi-Source Domain Adaptation
* Wasserstein Contrastive Representation Distillation
* Watching You: Global-guided Reciprocal Learning for Video-based Person Re-identification
* We are More than Our Joints: Predicting how 3D Bodies Move
* Weakly Supervised Action Selection Learning in Video
* Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency
* Weakly Supervised Learning of Rigid 3D Scene Flow
* Weakly Supervised Neuron Reconstruction From Optical Microscopy Images With Morphological Priors
* Weakly Supervised Video Salient Object Detection
* Weakly-supervised Instance Segmentation via Class-agnostic Learning with Salient Images
* Weakly-Supervised Physically Unconstrained Gaze Estimation
* WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition
* What Can Style Transfer and Paintings Do For Model Robustness?
* What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels
* What's in the Image? Explorable Decoding of Compressed Images
* When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework
* When Human Pose Estimation Meets Robustness: Adversarial Algorithms and Benchmarks
* Where and What? Examining Interpretable Disentangled Representations
* Wide-Baseline Multi-Camera Calibration using Person Re-Identification
* Wide-Baseline Relative Camera Pose Estimation with Directional Learning
* Wide-Depth-Range 6D Object Pose Estimation in Space
* WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos
* XProtoNet: Diagnosis in Chest Radiography with Global and Local Explanations
* You Only Look One-level Feature
* You See What I Want You to See: Exploring Targeted Black-Box Transferability Attack for Hash-based Image Retrieval Systems
* Your Flamingo is My Bird: Fine-Grained, or Not
* Zero-shot Adversarial Quantization
* Zero-Shot Instance Segmentation
* Zero-shot Single Image Restoration through Controlled Perturbation of Koschmieder's Model
* ZeroScatter: Domain Transfer for Long Distance Imaging and Vision through Scattering Media
* Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts
1838 for 2111

Last update: 11-Jul-26 12:47:02
Use price@usc.edu for comments.