Pretrain23
* *Pretraining Large Vision and Multimodal Models
* Benefits of Synthetically Pre-trained Depth-Prediction Networks for Indoor/Outdoor Image Classification
* Masked Autoencoder for Self-Supervised Pre-training on Lidar Point Clouds
* RarePlanes Soar Higher: Self-Supervised Pretraining for Resource Constrained and Synthetic Datasets
* Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data
Pretrain24
* *Pretraining Large Vision and Multimodal Models
* COMEDIAN: Self-Supervised Learning and Knowledge Distillation for Action Spotting Using Transformers
* Cross-modal Contrastive Learning with Asymmetric Co-attention Network for Video Moment Retrieval
* Does the Fairness of Your Pre-Training Hold Up? Examining the Influence of Pre-Training Techniques on Skin Tone Bias in Skin Lesion Classification
* epsilon-ViLM: Efficient Video-Language Model via Masked Video Modeling with Semantic Vector-Quantized Tokenizer
* Evaluating Pretrained Models for Deployable Lifelong Learning
* Labeling Indoor Scenes with Fusion of Out-of-the-Box Perception Models
* Metric Learning for 3D Point Clouds Using Optimal Transport
* RDIR: Capturing Temporally-Invariant Representations of Multiple Objects in Videos
* Self-Supervised Pre-Training for Semantic Segmentation in an Indoor Scene
* Semi-Supervised Cross-Spectral Face Recognition with Small Datasets
* SLVP: Self-Supervised Language-Video Pre-Training for Referring Video Object Segmentation
* Source-Free Domain Adaptation for RGB-D Semantic Segmentation with Vision Transformers
* Unified Framework for Cropland Field Boundary Detection and Segmentation, A
* Zero-Shot Edge Detection with SCESAME: Spectral Clustering-based Ensemble for Segment Anything Model Estimation
15 for Pretrain24