Tung, F.[Frederick],
Mori, G.[Greg],
Deep Neural Network Compression by In-Parallel Pruning-Quantization,
PAMI(42), No. 3, March 2020, pp. 568-579.
IEEE DOI
2002
BibRef
Earlier:
CLIP-Q: Deep Network Compression Learning by In-parallel
Pruning-Quantization,
CVPR18(7873-7882)
IEEE DOI
1812
Quantization (signal), Image coding, Neural networks,
Visualization, Training, Convolution, Network architecture,
Bayesian optimization, Task analysis, Optimization
BibRef
Kwon, G.[Gihyun],
Ye, J.C.[Jong Chul],
One-Shot Adaptation of GAN in Just One CLIP,
PAMI(45), No. 10, October 2023, pp. 12179-12191.
IEEE DOI
2310
BibRef
Han, B.H.[Bai-Hong],
Jiang, X.Y.[Xiao-Yan],
Fang, Z.J.[Zhi-Jun],
Fujita, H.[Hamido],
Gao, Y.B.[Yong-Bin],
F-SCP: An automatic prompt generation method for specific classes
based on visual language pre-training models,
PR(147), 2024, pp. 110096.
Elsevier DOI
2312
Multi-modal, Vision language model, Prompt tuning,
Large-scale pre-training model
BibRef
Liu, B.[Bo],
Lu, D.H.[Dong-Huan],
Wei, D.[Dong],
Wu, X.[Xian],
Wang, Y.[Yan],
Zhang, Y.[Yu],
Zheng, Y.F.[Ye-Feng],
Improving Medical Vision-Language Contrastive Pretraining with
Semantics-Aware Triage,
MedImg(42), No. 12, December 2023, pp. 3579-3589.
IEEE DOI
2312
BibRef
Gao, P.[Peng],
Geng, S.J.[Shi-Jie],
Zhang, R.R.[Ren-Rui],
Ma, T.[Teli],
Fang, R.Y.[Rong-Yao],
Zhang, Y.F.[Yong-Feng],
Li, H.S.[Hong-Sheng],
Qiao, Y.[Yu],
CLIP-Adapter: Better Vision-Language Models with Feature Adapters,
IJCV(132), No. 2, February 2024, pp. 581-595.
Springer DOI
2402
BibRef
Liu, Y.H.[Yi-Hao],
He, J.W.[Jing-Wen],
Gu, J.J.[Jin-Jin],
Kong, X.T.[Xiang-Tao],
Qiao, Y.[Yu],
Dong, C.[Chao],
DegAE: A New Pretraining Paradigm for Low-Level Vision,
CVPR23(23292-23303)
IEEE DOI
2309
BibRef
Dong, S.[Sijun],
Wang, L.[Libo],
Du, B.[Bo],
Meng, X.L.[Xiao-Liang],
ChangeCLIP: Remote sensing change detection with multimodal
vision-language representation learning,
PandRS(208), 2024, pp. 53-69.
Elsevier DOI Code:
WWW Link.
2402
Change detection, Vision-language representation learning,
Multimodal artificial intelligence, Remote sensing
BibRef
Peng, F.[Fang],
Yang, X.S.[Xiao-Shan],
Xiao, L.H.[Lin-Hui],
Wang, Y.[Yaowei],
Xu, C.S.[Chang-Sheng],
SgVA-CLIP: Semantic-Guided Visual Adapting of Vision-Language Models
for Few-Shot Image Classification,
MultMed(26), 2024, pp. 3469-3480.
IEEE DOI
2402
Visualization, Adaptation models, Prototypes, Image classification,
Learning systems, Task analysis, Feature extraction, Few-shot,
vision-language models
BibRef
Guo, Y.F.[Yu-Fei],
Chen, Y.P.[Yuan-Pei],
Ma, Z.[Zhe],
NeuroCLIP: Neuromorphic Data Understanding by CLIP and SNN,
SPLetters(31), 2024, pp. 246-250.
IEEE DOI
2402
Neuromorphics, Adaptation models, Visualization, Training,
Neural networks, Membrane potentials, CLIP, neuromorphic data,
spiking neural network
BibRef
Xing, Y.H.[Ying-Hui],
Wu, Q.[Qirui],
Cheng, D.[De],
Zhang, S.Z.[Shi-Zhou],
Liang, G.Q.[Guo-Qiang],
Wang, P.[Peng],
Zhang, Y.N.[Yan-Ning],
Dual Modality Prompt Tuning for Vision-Language Pre-Trained Model,
MultMed(26), 2024, pp. 2056-2068.
IEEE DOI
2402
Visualization, Tuning, Task analysis, Adaptation models,
Computational modeling, Feature extraction, Training,
vision-language model
BibRef
Xiao, L.H.[Lin-Hui],
Yang, X.S.[Xiao-Shan],
Peng, F.[Fang],
Yan, M.[Ming],
Wang, Y.[Yaowei],
Xu, C.S.[Chang-Sheng],
CLIP-VG: Self-Paced Curriculum Adapting of CLIP for Visual Grounding,
MultMed(26), 2024, pp. 4334-4347.
IEEE DOI
2403
Grounding, Reliability, Adaptation models, Task analysis,
Visualization, Data models, Annotations, Visual grounding,
vision-language models
BibRef
Zhang, K.[Ke],
Yang, Y.[Yan],
Yu, J.[Jun],
Jiang, H.L.[Han-Liang],
Fan, J.P.[Jian-Ping],
Huang, Q.M.[Qing-Ming],
Han, W.D.[Wei-Dong],
Multi-Task Paired Masking With Alignment Modeling for Medical
Vision-Language Pre-Training,
MultMed(26), 2024, pp. 4706-4721.
IEEE DOI
2403
Task analysis, Image reconstruction, Visualization,
Medical diagnostic imaging, Transformers, Semantics, Multitasking,
cross-modal alignment
BibRef
Zara, G.[Giacomo],
Turrisi-da Costa, V.G.[Victor Guilherme],
Roy, S.[Subhankar],
Rota, P.[Paolo],
Ricci, E.[Elisa],
Simplifying open-set video domain adaptation with contrastive
learning,
CVIU(241), 2024, pp. 103953.
Elsevier DOI
2403
BibRef
Earlier: A1, A3, A4, A5, Only:
AutoLabel: CLIP-based framework for Open-Set Video Domain Adaptation,
CVPR23(11504-11513)
IEEE DOI
2309
Open-set video domain adaptation, Video Action Recognition, Contrastive learning
BibRef
Wang, X.H.[Xiao-Han],
Wang, W.G.[Wen-Guan],
Shao, J.Y.[Jia-Yi],
Yang, Y.[Yi],
Learning to Follow and Generate Instructions for Language-Capable
Navigation,
PAMI(46), No. 5, May 2024, pp. 3334-3350.
IEEE DOI
2404
Navigation, Task analysis, Robot kinematics, Natural languages,
Behavioral sciences, Training, Visualization, Multi-task learning,
vision-language pre-training
BibRef
Zhang, W.B.[Wen-Bo],
Zhang, Y.F.[Yi-Fan],
Deng, Y.Y.[Yu-Yang],
Zhang, W.L.[Wen-Long],
Lin, J.F.[Jian-Feng],
Huang, B.Q.[Bin-Qiang],
Zhang, J.[Jinlu],
Yu, W.H.[Wen-Hao],
Ta-Adapter: Enhancing few-shot CLIP with task-aware encoders,
PR(153), 2024, pp. 110559.
Elsevier DOI
2405
Contrastive language-image pre-training, Adapter,
Prompt learning, Transfer learning
BibRef
Wu, Z.X.[Zu-Xuan],
Weng, Z.[Zejia],
Peng, W.[Wujian],
Yang, X.T.[Xi-Tong],
Li, A.[Ang],
Davis, L.S.[Larry S.],
Jiang, Y.G.[Yu-Gang],
Building an Open-Vocabulary Video CLIP Model With Better
Architectures, Optimization and Data,
PAMI(46), No. 7, July 2024, pp. 4747-4762.
IEEE DOI
2406
Task analysis, Image recognition, Training, Adaptation models,
Streaming media, Computational modeling, Optimization, CLIP,
zero-shot recognition
BibRef
Yu, W.W.[Wen-Wen],
Liu, Y.L.[Yu-Liang],
Zhu, X.K.[Xing-Kui],
Cao, H.Y.[Hao-Yu],
Sun, X.[Xing],
Bai, X.[Xiang],
Turning a CLIP Model Into a Scene Text Spotter,
PAMI(46), No. 9, September 2024, pp. 6040-6054.
IEEE DOI
2408
Text detection, Task analysis, Visualization, Text recognition,
Image segmentation, Transformers, Magnetic heads, CLIP, few-shot,
scene text spotting
BibRef
Yu, W.W.[Wen-Wen],
Liu, Y.L.[Yu-Liang],
Hua, W.[Wei],
Jiang, D.Q.[De-Qiang],
Ren, B.[Bo],
Bai, X.[Xiang],
Turning a CLIP Model into a Scene Text Detector,
CVPR23(6978-6988)
IEEE DOI
2309
BibRef
Cheng, H.[Hang],
Ye, H.H.[He-Hui],
Zhou, X.F.[Xiao-Fei],
Liu, X.M.[Xi-Meng],
Chen, F.[Fei],
Wang, M.Q.[Mei-Qing],
Vision-language pre-training via modal interaction,
PR(156), 2024, pp. 110809.
Elsevier DOI
2408
Cross-modal, Pre-training, Partial auxiliary, Image captioning
BibRef
Kong, J.[Jun],
Wang, J.[Jin],
Yu, L.C.[Liang-Chih],
Zhang, X.J.[Xue-Jie],
Multimodality Self-distillation for Fast Inference of Vision and
Language Pretrained Models,
MultMed(26), 2024, pp. 8928-8940.
IEEE DOI
2408
Computational modeling, Transformers, Task analysis, Visualization,
Semantics, Training, Quantization (signal), Accelerating inference,
vision and language pretrained models
BibRef
Yang, Z.Q.[Zhao-Qilin],
An, G.[Gaoyun],
Zheng, Z.X.[Zhen-Xing],
Cao, S.[Shan],
Ruan, Q.Q.[Qiu-Qi],
GBC: Guided Alignment and Adaptive Boosting CLIP Bridging Vision and
Language for Robust Action Recognition,
CirSysVideo(34), No. 9, September 2024, pp. 8172-8187.
IEEE DOI
2410
Adaptation models, Visualization, Biological system modeling,
Boosting, Image recognition, Transformers, Task analysis,
CLIP
BibRef
Lin, X.[Xiao],
Zhu, M.H.[Ming-Hao],
Dang, R.H.[Rong-Hao],
Zhou, G.L.[Guang-Liang],
Shu, S.L.[Shao-Long],
Lin, F.[Feng],
Liu, C.J.[Cheng-Ju],
Chen, Q.J.[Qi-Jun],
CLIPose: Category-Level Object Pose Estimation With Pre-Trained
Vision-Language Knowledge,
CirSysVideo(34), No. 10, October 2024, pp. 9125-9138.
IEEE DOI
2411
Feature extraction, Pose estimation, Point cloud compression,
Shape, Task analysis, Training, Object pose estimation,
multi-modal learning
BibRef
Chen, Y.C.[Yi-Chia],
Li, W.H.[Wei-Hua],
Chen, C.S.[Chu-Song],
Open-Vocabulary Panoptic Segmentation Using Bert Pre-Training of
Vision-Language Multiway Transformer Model,
ICIP24(2494-2500)
IEEE DOI Code:
WWW Link.
2411
Training, Visualization, Image segmentation, Adaptation models,
Vocabulary, Training data, Linguistics, Open vocabulary, panoptic segmentation
BibRef
Yi, C.[Chao],
Ren, L.[Lu],
Zhan, D.C.[De-Chuan],
Ye, H.J.[Han-Jia],
Leveraging Cross-Modal Neighbor Representation for Improved CLIP
Classification,
CVPR24(27392-27401)
IEEE DOI Code:
WWW Link.
2410
Training, Codes, Contrastive learning, Image representation,
Feature extraction, Generators, Vision-Language Model, Multimodal Learning
BibRef
Shao, S.[Shuai],
Bai, Y.[Yu],
Wang, Y.[Yan],
Liu, B.[Baodi],
Zhou, Y.C.[Yi-Cong],
DeIL: Direct-and-Inverse CLIP for Open-World Few-Shot Learning,
CVPR24(28505-28514)
IEEE DOI Code:
WWW Link.
2410
Filtering, Noise reduction, Transforms,
Few shot learning, CLIP, open-world few-shot learning
BibRef
Martin, S.[Ségolène],
Huang, Y.[Yunshi],
Shakeri, F.[Fereshteh],
Pesquet, J.C.[Jean-Christophe],
Ayed, I.B.[Ismail Ben],
Transductive Zero-Shot and Few-Shot CLIP,
CVPR24(28816-28826)
IEEE DOI Code:
WWW Link.
2410
Image segmentation, Accuracy, Minimization,
Probability distribution, Inference algorithms, Data models,
expectation-maximization
BibRef
Xing, X.[Xin],
Xiong, Z.[Zhexiao],
Stylianou, A.[Abby],
Sastry, S.[Srikumar],
Gong, L.[Liyu],
Jacobs, N.[Nathan],
Vision-Language Pseudo-Labels for Single-Positive Multi-Label
Learning,
ZeroShot24(7799-7808)
IEEE DOI Code:
WWW Link.
2410
Costs, Annotations, Computational modeling, Training data,
Predictive models, Single-Positive Multi-label Learning, CLIP, Pseudo-labeling
BibRef
Xie, J.[Jinheng],
Deng, S.[Songhe],
Li, B.[Bing],
Liu, H.Z.[Hao-Zhe],
Huang, Y.W.[Ya-Wen],
Zheng, Y.F.[Ye-Feng],
Schmidhuber, J.[Jürgen],
Ghanem, B.[Bernard],
Shen, L.L.[Lin-Lin],
Shou, M.Z.[Mike Zheng],
Tune-An-Ellipse: CLIP Has Potential to Find What You Want,
CVPR24(13723-13732)
IEEE DOI
2410
Location awareness, Visualization, Buildings, Detectors, Proposals
BibRef
Maniparambil, M.[Mayug],
Akshulakov, R.[Raiymbek],
Djilali, Y.A.D.[Yasser Abdelaziz Dahou],
Seddik, M.E.A.[Mohamed El Amine],
Narayan, S.[Sanath],
Mangalam, K.[Karttikeya],
O'Connor, N.E.[Noel E.],
Do Vision and Language Encoders Represent the World Similarly?,
CVPR24(14334-14343)
IEEE DOI
2410
Training, Measurement, Codes, Space communications, Semantics, Kernel,
Unified Representations, Vision Language, Zero-shot, CLIP
BibRef
Pan, C.[Chenbin],
Yaman, B.[Burhaneddin],
Velipasalar, S.[Senem],
Ren, L.[Liu],
CLIP-BEVFormer: Enhancing Multi-View Image-Based BEV Detector with
Ground Truth Flow,
CVPR24(15216-15225)
IEEE DOI
2410
Solid modeling, Transportation, Detectors, Object detection,
Contrastive learning
BibRef
Yang, C.G.[Chuan-Guang],
An, Z.[Zhulin],
Huang, L.[Libo],
Bi, J.Y.[Jun-Yu],
Yu, X.[Xinqiang],
Yang, H.[Han],
Diao, B.[Boyu],
Xu, Y.J.[Yong-Jun],
CLIP-KD: An Empirical Study of CLIP Model Distillation,
CVPR24(15952-15962)
IEEE DOI Code:
WWW Link.
2410
Visualization, Codes, Accuracy, Computational modeling,
Contrastive learning, Solids, Knowledge Distillation
BibRef
Vasu, P.K.A.[Pavan Kumar Anasosalu],
Pouransari, H.[Hadi],
Faghri, F.[Fartash],
Vemulapalli, R.[Raviteja],
Tuzel, O.[Oncel],
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced
Training,
CVPR24(15963-15974)
IEEE DOI Code:
WWW Link.
2410
Training, Performance evaluation, Accuracy, Runtime, Computational modeling,
Benchmark testing, Transformers, CLIP, Multi-Modal Distillation
BibRef
Fan, L.[Lei],
Zhou, J.X.[Jian-Xiong],
Xing, X.Y.[Xiao-Ying],
Wu, Y.[Ying],
Active Open-Vocabulary Recognition: Let Intelligent Moving Mitigate
CLIP Limitations,
CVPR24(16394-16403)
IEEE DOI
2410
Training, Vocabulary, Visualization, Accuracy, Sensitivity, Navigation,
Habitats, Active vision, Embodied perception
BibRef
Stevens, S.[Samuel],
Wu, J.[Jiaman],
Thompson, M.J.[Matthew J.],
Campolongo, E.G.[Elizabeth G.],
Song, C.H.[Chan Hee],
Carlyn, D.E.[David Edward],
Dong, L.[Li],
Dahdul, W.M.[Wasila M.],
Stewart, C.[Charles],
Berger-Wolf, T.[Tanya],
Chao, W.L.[Wei-Lun],
Su, Y.[Yu],
BioCLIP: A Vision Foundation Model for the Tree of Life,
CVPR24(19412-19424)
IEEE DOI
2410
Fungi, Visualization, Biological system modeling, Plants (biology),
Vegetation, Data mining, machine learning, imageomics,
evolutionary biology & ecology
BibRef
Tang, Y.W.[Yu-Wei],
Lin, Z.[Zhenyi],
Wang, Q.L.[Qi-Long],
Zhu, P.F.[Peng-Fei],
Hu, Q.H.[Qing-Hua],
AMU-Tuning: Effective Logit Bias for CLIP-based Few-shot Learning,
CVPR24(23323-23333)
IEEE DOI
2410
Training, Learning systems, Limiting, Benchmark testing,
Few shot learning, Logit Bias
BibRef
Huang, Y.S.[Yun-Shi],
Shakeri, F.[Fereshteh],
Dolz, J.[Jose],
Boudiaf, M.[Malik],
Bahig, H.[Houda],
Ben Ayed, I.[Ismail],
LP++: A Surprisingly Strong Linear Probe for Few-Shot CLIP,
CVPR24(23773-23782)
IEEE DOI Code:
WWW Link.
2410
Deep learning, Visualization, Prototypes, Linear programming,
Computational efficiency
BibRef
Bai, J.[Jiawang],
Gao, K.[Kuofeng],
Min, S.B.[Shao-Bo],
Xia, S.T.[Shu-Tao],
Li, Z.F.[Zhi-Feng],
Liu, W.[Wei],
BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP,
CVPR24(24239-24250)
IEEE DOI Code:
WWW Link.
2410
Learning systems, Image recognition, Computational modeling,
Training data, Optimization methods, Predictive models
BibRef
Liang, S.Y.[Si-Yuan],
Zhu, M.L.[Ming-Li],
Liu, A.[Aishan],
Wu, B.Y.[Bao-Yuan],
Cao, X.C.[Xiao-Chun],
Chang, E.C.[Ee-Chien],
BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal
Contrastive Learning,
CVPR24(24645-24654)
IEEE DOI Code:
WWW Link.
2410
Resistance, Visualization, Ethics, Codes, Semantics,
Contrastive learning, Multimodal Contrastive Learning, Backdoor Attacks
BibRef
Cheng, J.[Jun],
Liang, D.[Dong],
Tan, S.[Shan],
Transfer CLIP for Generalizable Image Denoising,
CVPR24(25974-25984)
IEEE DOI
2410
Computed tomography, Noise, Noise reduction, Feature extraction,
Robustness, Decoding
BibRef
Ma, J.W.[Jia-Wei],
Huang, P.Y.[Po-Yao],
Xie, S.[Saining],
Li, S.W.[Shang-Wen],
Zettlemoyer, L.[Luke],
Chang, S.F.[Shih-Fu],
Yih, W.T.[Wen-Tau],
Xu, H.[Hu],
MoDE: CLIP Data Experts via Clustering,
CVPR24(26344-26353)
IEEE DOI
2410
Training, Adaptation models, Correlation, Costs, Computational modeling,
Noise, Semantics, Data Expert, Multi-Modal, Data Clustering
BibRef
Li, X.[Xiao],
Zhang, W.[Wei],
Liu, Y.N.[Yi-Ning],
Hu, Z.H.[Zhan-Hao],
Zhang, B.[Bo],
Hu, X.L.[Xiao-Lin],
Language-Driven Anchors for Zero-Shot Adversarial Robustness,
CVPR24(24686-24695)
IEEE DOI Code:
WWW Link.
2410
Training, Codes, Computational modeling, Semantics,
Artificial neural networks, Robustness, adversarial robustness, CLIP
BibRef
Massiceti, D.[Daniela],
Longden, C.[Camilla],
Slowik, A.[Agnieszka],
Wills, S.[Samuel],
Grayson, M.[Martin],
Morrison, C.[Cecily],
Explaining CLIP's Performance Disparities on Data from Blind/Low
Vision Users,
CVPR24(12172-12182)
IEEE DOI
2410
Image quality, Visualization, Image recognition, Sensitivity,
Text recognition, Prevention and mitigation, Lighting, CLIP, multi-modal
BibRef
Wu, S.[Sitong],
Tan, H.[Haoru],
Tian, Z.[Zhuotao],
Chen, Y.[Yukang],
Qi, X.J.[Xiao-Juan],
Jia, J.Y.[Jia-Ya],
SaCo Loss: Sample-Wise Affinity Consistency for Vision-Language
Pre-Training,
CVPR24(27348-27359)
IEEE DOI
2410
Training, Analytical models, Computational modeling,
Computational efficiency,
Consistency
BibRef
Lin, H.[Haokun],
Bai, H.[Haoli],
Liu, Z.[Zhili],
Hou, L.[Lu],
Sun, M.[Muyi],
Song, L.Q.[Lin-Qi],
Wei, Y.[Ying],
Sun, Z.A.[Zhen-An],
MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models
with Module-Wise Pruning Error Metric,
CVPR24(27360-27370)
IEEE DOI
2410
Measurement, Costs, Computational modeling,
Measurement uncertainty,
Structured Pruning
BibRef
Gao, Y.P.[Yi-Peng],
Wang, Z.[Zeyu],
Zheng, W.S.[Wei-Shi],
Xie, C.[Cihang],
Zhou, Y.[Yuyin],
Sculpting Holistic 3D Representation in Contrastive Language-Image-3D
Pre-Training,
CVPR24(22998-23008)
IEEE DOI Code:
WWW Link.
2410
Point cloud compression, Training, Solid modeling, Pipelines,
Contrastive learning, Benchmark testing, Multi-Modality Training
BibRef
Shen, S.[Sitian],
Zhu, Z.[Zilin],
Fan, L.Q.[Lin-Qian],
Zhang, H.[Harry],
Wu, X.X.[Xin-Xiao],
DiffCLIP: Leveraging Stable Diffusion for Language Grounded 3D
Classification,
WACV24(3584-3593)
IEEE DOI
2404
Point cloud compression, Training, Solid modeling, Visualization,
Computational modeling, Algorithms, 3D computer vision,
Vision + language and/or other modalities
BibRef
Wan, B.[Bo],
Tuytelaars, T.[Tinne],
Exploiting CLIP for Zero-shot HOI Detection Requires Knowledge
Distillation at Multiple Levels,
WACV24(1794-1804)
IEEE DOI Code:
WWW Link.
2404
Codes, Annotations, Computational modeling, Neural networks,
Benchmark testing, Task analysis, Algorithms,
Vision + language and/or other modalities
BibRef
Mei, J.[Jie],
Piergiovanni, A.J.,
Hwang, J.N.[Jenq-Neng],
Li, W.[Wei],
SLVP: Self-Supervised Language-Video Pre-Training for Referring Video
Object Segmentation,
Pretrain24(507-517)
IEEE DOI
2404
Bridges, Image segmentation, Visualization, Computational modeling,
Transfer learning, Object segmentation, Computer architecture
BibRef
Nicolas, J.[Julien],
Chiaroni, F.[Florent],
Ziko, I.[Imtiaz],
Ahmad, O.[Ola],
Desrosiers, C.[Christian],
Dolz, J.[Jose],
MoP-CLIP: A Mixture of Prompt-Tuned CLIP Models for Domain
Incremental Learning,
WACV24(1751-1761)
IEEE DOI
2404
Training, Adaptation models, Visualization, Scalability, Data models,
Task analysis, Tuning, Algorithms, Machine learning architectures,
Vision + language and/or other modalities
BibRef
Hess, G.[Georg],
Tonderski, A.[Adam],
Petersson, C.[Christoffer],
Åström, K.[Kalle],
Svensson, L.[Lennart],
LidarCLIP or: How I Learned to Talk to Point Clouds,
WACV24(7423-7432)
IEEE DOI
2404
Point cloud compression, Training, Visualization, Laser radar,
Semantic segmentation, Object detection, Applications,
Vision + language and/or other modalities
BibRef
Theisen, W.[William],
Scheirer, W.[Walter],
C-CLIP: Contrastive Image-Text Encoders to Close the
Descriptive-Commentative Gap,
WACV24(7226-7235)
IEEE DOI
2404
Training, Social networking (online), Disasters,
Computational modeling, Sociology, Training data, Applications,
Algorithms
BibRef
Gondal, M.W.[Muhammad Waleed],
Gast, J.[Jochen],
Ruiz, I.A.[Inigo Alonso],
Droste, R.[Richard],
Macri, T.[Tommaso],
Kumar, S.[Suren],
Staudigl, L.[Luitpold],
Domain Aligned CLIP for Few-shot Classification,
WACV24(5709-5718)
IEEE DOI
2404
Representation learning, Adaptation models, Costs,
Computational modeling, Benchmark testing, Robustness, Algorithms,
Image recognition and understanding
BibRef
Phan, T.[Thinh],
Vo, K.[Khoa],
Le, D.[Duy],
Doretto, G.[Gianfranco],
Adjeroh, D.[Donald],
Le, N.[Ngan],
ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot
End-to-End Temporal Action Detection,
WACV24(7031-7040)
IEEE DOI Code:
WWW Link.
2404
Location awareness, Adaptation models, Semantics,
Supervised learning, Training data, Transforms, Transformers,
Vision + language and/or other modalities
BibRef
Lai, Z.F.[Zheng-Feng],
Bai, H.P.[Hao-Ping],
Zhang, H.T.[Hao-Tian],
Du, X.Z.[Xian-Zhi],
Shan, J.L.[Jiu-Long],
Yang, Y.F.[Yin-Fei],
Chuah, C.N.[Chen-Nee],
Cao, M.[Meng],
Empowering Unsupervised Domain Adaptation with Large-scale
Pre-trained Vision-Language Models,
WACV24(2679-2689)
IEEE DOI
2404
Training, Bridges, Adaptation models, Computational modeling,
Semantics, Benchmark testing, Algorithms,
Vision + language and/or other modalities
BibRef
Gupta, D.[Devaansh],
Kharbanda, S.[Siddhant],
Zhou, J.W.[Jia-Wei],
Li, W.[Wanhua],
Pfister, H.[Hanspeter],
Wei, D.L.[Dong-Lai],
CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for
Multimodal Machine Translation,
ICCV23(2863-2874)
IEEE DOI
2401
BibRef
Fang, H.[Han],
Yang, Z.F.[Zhi-Fei],
Wei, Y.H.[Yu-Han],
Zang, X.H.[Xiang-Hao],
Ban, C.[Chao],
Feng, Z.[Zerun],
He, Z.J.[Zhong-Jiang],
Li, Y.X.[Yong-Xiang],
Sun, H.[Hao],
Alignment and Generation Adapter for Efficient Video-Text
Understanding,
CLVL23(2783-2789)
IEEE DOI
2401
BibRef
Zhu, B.[Beier],
Niu, Y.[Yulei],
Han, Y.C.[Yu-Cheng],
Wu, Y.[Yue],
Zhang, H.W.[Han-Wang],
Prompt-aligned Gradient for Prompt Tuning,
ICCV23(15613-15623)
IEEE DOI
2401
BibRef
Yuan, H.J.[Hang-Jie],
Zhang, S.W.[Shi-Wei],
Wang, X.[Xiang],
Albanie, S.[Samuel],
Pan, Y.[Yining],
Feng, T.[Tao],
Jiang, J.W.[Jian-Wen],
Ni, D.[Dong],
Zhang, Y.[Yingya],
Zhao, D.L.[De-Li],
RLIPv2: Fast Scaling of Relational Language-Image Pre-training,
ICCV23(21592-21604)
IEEE DOI Code:
WWW Link.
2401
BibRef
Wang, Z.[Ziyi],
Yu, X.[Xumin],
Rao, Y.M.[Yong-Ming],
Zhou, J.[Jie],
Lu, J.W.[Ji-Wen],
Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models,
ICCV23(5617-5627)
IEEE DOI Code:
WWW Link.
2401
BibRef
Li, M.[Ming],
Wu, J.[Jie],
Wang, X.[Xionghui],
Chen, C.[Chen],
Qin, J.[Jie],
Xiao, X.F.[Xue-Feng],
Wang, R.[Rui],
Zheng, M.[Min],
Pan, X.[Xin],
AlignDet: Aligning Pre-training and Fine-tuning in Object Detection,
ICCV23(6843-6853)
IEEE DOI
2401
BibRef
Lee, S.[Suhyeon],
Chung, H.J.[Hyung-Jin],
Park, M.Y.[Min-Young],
Park, J.[Jonghyuk],
Ryu, W.S.[Wi-Sun],
Ye, J.C.[Jong Chul],
Improving 3D Imaging with Pre-Trained Perpendicular 2D Diffusion
Models,
ICCV23(10676-10686)
IEEE DOI Code:
WWW Link.
2401
BibRef
Ye, Q.H.[Qing-Hao],
Xu, G.H.[Guo-Hai],
Yan, M.[Ming],
Xu, H.Y.[Hai-Yang],
Qian, Q.[Qi],
Zhang, J.[Ji],
Huang, F.[Fei],
HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training,
ICCV23(15359-15370)
IEEE DOI
2401
BibRef
Wu, C.Y.[Chao-Yi],
Zhang, X.M.[Xiao-Man],
Zhang, Y.[Ya],
Wang, Y.F.[Yan-Feng],
Xie, W.[Weidi],
MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training for
X-ray Diagnosis,
ICCV23(21315-21326)
IEEE DOI
2401
BibRef
Yang, Q.S.[Qiu-Shi],
Li, W.Y.[Wu-Yang],
Li, B.[Baopu],
Yuan, Y.X.[Yi-Xuan],
MRM: Masked Relation Modeling for Medical Image Pre-Training with
Genetics,
ICCV23(21395-21405)
IEEE DOI Code:
WWW Link.
2401
BibRef
Ma, W.X.[Wen-Xuan],
Li, S.[Shuang],
Zhang, J.[Jinming],
Liu, C.H.[Chi Harold],
Kang, J.X.[Jing-Xuan],
Wang, Y.L.[Yu-Lin],
Huang, G.[Gao],
Borrowing Knowledge From Pre-trained Language Model:
A New Data-efficient Visual Learning Paradigm,
ICCV23(18740-18751)
IEEE DOI Code:
WWW Link.
2401
BibRef
Ganugula, P.[Prajwal],
Kumar, Y.S.S.S.S.[Y.S.S.S. Santosh],
Reddy, N.K.S.[N.K. Sagar],
Chellingi, P.[Prabhath],
Thakur, A.[Avinash],
Kasera, N.[Neeraj],
Anand, C.S.[C. Shyam],
MOSAIC: Multi-Object Segmented Arbitrary Stylization Using CLIP,
NIVT23(892-903)
IEEE DOI
2401
BibRef
Luo, Z.Y.[Zi-Yang],
Zhao, P.[Pu],
Xu, C.[Can],
Geng, X.[Xiubo],
Shen, T.[Tao],
Tao, C.Y.[Chong-Yang],
Ma, J.[Jing],
Lin, Q.W.[Qing-Wei],
Jiang, D.X.[Da-Xin],
LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for
Large-Scale Image-Text Sparse Retrieval,
ICCV23(11172-11183)
IEEE DOI
2401
BibRef
Zhai, X.H.[Xiao-Hua],
Mustafa, B.[Basil],
Kolesnikov, A.[Alexander],
Beyer, L.[Lucas],
Sigmoid Loss for Language Image Pre-Training,
ICCV23(11941-11952)
IEEE DOI
2401
BibRef
Shtedritski, A.[Aleksandar],
Rupprecht, C.[Christian],
Vedaldi, A.[Andrea],
What does CLIP know about a red circle? Visual prompt engineering for
VLMs,
ICCV23(11953-11963)
IEEE DOI
2401
BibRef
Zhu, Z.Y.[Zi-Yu],
Ma, X.J.[Xiao-Jian],
Chen, Y.X.[Yi-Xin],
Deng, Z.D.[Zhi-Dong],
Huang, S.Y.[Si-Yuan],
Li, Q.[Qing],
3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment,
ICCV23(2899-2909)
IEEE DOI
2401
BibRef
Yang, K.C.[Kai-Cheng],
Deng, J.K.[Jian-Kang],
An, X.[Xiang],
Li, J.W.[Jia-Wei],
Feng, Z.[Ziyong],
Guo, J.[Jia],
Yang, J.[Jing],
Liu, T.L.[Tong-Liang],
ALIP: Adaptive Language-Image Pre-training with Synthetic Caption,
ICCV23(2910-2919)
IEEE DOI Code:
WWW Link.
2401
BibRef
Yang, Y.F.[Yi-Fan],
Huang, W.Q.[Wei-Quan],
Wei, Y.X.[Yi-Xuan],
Peng, H.[Houwen],
Jiang, X.Y.[Xin-Yang],
Jiang, H.Q.[Hui-Qiang],
Wei, F.[Fangyun],
Wang, Y.[Yin],
Hu, H.[Han],
Qiu, L.[Lili],
Yang, Y.Q.[Yu-Qing],
Attentive Mask CLIP,
ICCV23(2759-2769)
IEEE DOI Code:
WWW Link.
2401
BibRef
Vinker, Y.[Yael],
Alaluf, Y.[Yuval],
Cohen-Or, D.[Daniel],
Shamir, A.[Ariel],
CLIPascene: Scene Sketching with Different Types and Levels of
Abstraction,
ICCV23(4123-4133)
IEEE DOI Code:
WWW Link.
2401
BibRef
Wei, Y.X.[Yi-Xuan],
Hu, H.[Han],
Xie, Z.[Zhenda],
Liu, Z.[Ze],
Zhang, Z.[Zheng],
Cao, Y.[Yue],
Bao, J.M.[Jian-Min],
Chen, D.[Dong],
Guo, B.[Baining],
Improving CLIP Fine-tuning Performance,
ICCV23(5416-5426)
IEEE DOI Code:
WWW Link.
2401
BibRef
Maniparambil, M.[Mayug],
Vorster, C.[Chris],
Molloy, D.[Derek],
Murphy, N.[Noel],
McGuinness, K.[Kevin],
O'Connor, N.E.[Noel E.],
Enhancing CLIP with GPT-4: Harnessing Visual Descriptions as Prompts,
MMFM23(262-271)
IEEE DOI
2401
BibRef
Zheng, X.[Xiao],
Huang, X.S.[Xiao-Shui],
Mei, G.F.[Guo-Feng],
Hou, Y.N.[Yue-Nan],
Lyu, Z.Y.[Zhao-Yang],
Dai, B.[Bo],
Ouyang, W.L.[Wan-Li],
Gong, Y.S.[Yong-Shun],
Point Cloud Pre-Training with Diffusion Models,
CVPR24(22935-22945)
IEEE DOI Code:
WWW Link.
2410
Point cloud compression, Codes, Diffusion models,
Feature extraction, Generators
BibRef
Huang, T.Y.[Tian-Yu],
Dong, B.[Bowen],
Yang, Y.H.[Yun-Han],
Huang, X.S.[Xiao-Shui],
Lau, R.W.H.[Rynson W.H.],
Ouyang, W.L.[Wan-Li],
Zuo, W.M.[Wang-Meng],
CLIP2Point: Transfer CLIP to Point Cloud Classification with
Image-Depth Pre-Training,
ICCV23(22100-22110)
IEEE DOI Code:
WWW Link.
2401
BibRef
Wu, K.[Kan],
Peng, H.W.[Hou-Wen],
Zhou, Z.H.[Zheng-Hong],
Xiao, B.[Bin],
Liu, M.C.[Meng-Chen],
Yuan, L.[Lu],
Xuan, H.[Hong],
Valenzuela, M.[Michael],
Chen, X.S.[Xi Stephen],
Wang, X.G.[Xing-Gang],
Chao, H.Y.[Hong-Yang],
Hu, H.[Han],
TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight
Inheritance,
ICCV23(21913-21923)
IEEE DOI
2401
BibRef
Deng, X.[Xinchi],
Shi, H.[Han],
Huang, R.[Runhui],
Li, C.L.[Chang-Lin],
Xu, H.[Hang],
Han, J.H.[Jian-Hua],
Kwok, J.[James],
Zhao, S.[Shen],
Zhang, W.[Wei],
Liang, X.D.[Xiao-Dan],
GrowCLIP: Data-aware Automatic Model Growing for Large-scale
Contrastive Language-Image Pre-training,
ICCV23(22121-22132)
IEEE DOI
2401
BibRef
Ranasinghe, K.[Kanchana],
McKinzie, B.[Brandon],
Ravi, S.[Sachin],
Yang, Y.F.[Yin-Fei],
Toshev, A.[Alexander],
Shlens, J.[Jonathon],
Perceptual Grouping in Contrastive Vision-Language Models,
ICCV23(5548-5561)
IEEE DOI
2401
BibRef
Shao, B.[Bin],
Liu, J.Z.[Jian-Zhuang],
Pei, R.[Renjing],
Xu, S.[Songcen],
Dai, P.[Peng],
Lu, J.W.[Ju-Wei],
Li, W.[Weimian],
Yan, Y.[Youliang],
HiVLP: Hierarchical Interactive Video-Language Pre-Training,
ICCV23(13710-13720)
IEEE DOI
2401
BibRef
Ali, M.[Muhammad],
Khan, S.[Salman],
CLIP-Decoder: ZeroShot Multilabel Classification using Multimodal
CLIP Aligned Representations,
VLAR23(4677-4681)
IEEE DOI
2401
BibRef
Singha, M.[Mainak],
Pal, H.[Harsh],
Jha, A.[Ankit],
Banerjee, B.[Biplab],
AD-CLIP: Adapting Domains in Prompt Space Using CLIP,
OutDistri23(4357-4366)
IEEE DOI
2401
BibRef
Zhang, J.[Junbo],
Dong, R.[Runpei],
Ma, K.[Kaisheng],
CLIP-FO3D:
Learning Free Open-world 3D Scene Representations from 2D Dense CLIP,
OpenSUN3D(2040-2051)
IEEE DOI
2401
BibRef
Auty, D.[Dylan],
Mikolajczyk, K.[Krystian],
Learning to Prompt CLIP for Monocular Depth Estimation:
Exploring the Limits of Human Language,
OpenSUN3D(2031-2049)
IEEE DOI Code:
WWW Link.
2401
BibRef
Hegde, D.[Deepti],
Valanarasu, J.M.J.[Jeya Maria Jose],
Patel, V.M.[Vishal M.],
CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D
Recognition,
OpenSUN3D(2020-2030)
IEEE DOI Code:
WWW Link.
2401
BibRef
Xu, X.[Xin],
Xiong, T.Y.[Tian-Yi],
Ding, Z.[Zheng],
Tu, Z.W.[Zhuo-Wen],
MasQCLIP for Open-Vocabulary Universal Image Segmentation,
ICCV23(887-898)
IEEE DOI Code:
WWW Link.
2401
BibRef
Wang, H.L.[Hua-Liang],
Li, Y.[Yi],
Yao, H.[Huifeng],
Li, X.M.[Xiao-Meng],
CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No,
ICCV23(1802-1812)
IEEE DOI Code:
WWW Link.
2401
BibRef
Zhu, X.Y.[Xiang-Yang],
Zhang, R.R.[Ren-Rui],
He, B.[Bowei],
Zhou, A.[Aojun],
Wang, D.[Dong],
Zhao, B.[Bin],
Gao, P.[Peng],
Not All Features Matter:
Enhancing Few-shot CLIP with Adaptive Prior Refinement,
ICCV23(2605-2615)
IEEE DOI Code:
WWW Link.
2401
BibRef
Paiss, R.[Roni],
Ephrat, A.[Ariel],
Tov, O.[Omer],
Zada, S.[Shiran],
Mosseri, I.[Inbar],
Irani, M.[Michal],
Dekel, T.[Tali],
Teaching CLIP to Count to Ten,
ICCV23(3147-3157)
IEEE DOI
2401
BibRef
Zhu, X.Y.[Xiang-Yang],
Zhang, R.R.[Ren-Rui],
He, B.[Bowei],
Guo, Z.Y.[Zi-Yu],
Zeng, Z.[Ziyao],
Qin, Z.[Zipeng],
Zhang, S.H.[Shang-Hang],
Gao, P.[Peng],
PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world
Learning,
ICCV23(2639-2650)
IEEE DOI Code:
WWW Link.
2401
BibRef
Yuan, M.[Min],
Lv, N.N.[Ning-Ning],
Xie, Y.F.[Yu-Fei],
Lu, F.X.[Fu-Xiang],
Zhan, K.[Kun],
CLIP-FG: Selecting Discriminative Image Patches by Contrastive
Language-Image Pre-Training for Fine-Grained Image Classification,
ICIP23(560-564)
IEEE DOI
2312
BibRef
Zeng, Z.Y.[Zi-Yun],
Ge, Y.Y.[Yu-Ying],
Liu, X.H.[Xi-Hui],
Chen, B.[Bin],
Luo, P.[Ping],
Xia, S.T.[Shu-Tao],
Ge, Y.X.[Yi-Xiao],
Learning Transferable Spatiotemporal Representations from Natural
Script Knowledge,
CVPR23(23079-23089)
IEEE DOI
2309
WWW Link.
BibRef
Wang, J.P.[Jin-Peng],
Ge, Y.X.[Yi-Xiao],
Yan, R.[Rui],
Ge, Y.Y.[Yu-Ying],
Lin, K.Q.[Kevin Qinghong],
Tsutsui, S.[Satoshi],
Lin, X.D.[Xu-Dong],
Cai, G.[Guanyu],
Wu, J.P.[Jian-Ping],
Shan, Y.[Ying],
Qie, X.[Xiaohu],
Shou, M.Z.[Mike Zheng],
All in One: Exploring Unified Video-Language Pre-Training,
CVPR23(6598-6608)
IEEE DOI
2309
BibRef
Ramrakhya, R.[Ram],
Batra, D.[Dhruv],
Wijmans, E.[Erik],
Das, A.[Abhishek],
PIRLNav: Pretraining with Imitation and RL Finetuning for ObjectNav,
CVPR23(17896-17906)
IEEE DOI
2309
BibRef
Lin, X.D.[Xu-Dong],
Tiwari, S.[Simran],
Huang, S.Y.[Shi-Yuan],
Li, M.[Manling],
Shou, M.Z.[Mike Zheng],
Ji, H.[Heng],
Chang, S.F.[Shih-Fu],
Towards Fast Adaptation of Pretrained Contrastive Models for
Multi-channel Video-Language Retrieval,
CVPR23(14846-14855)
IEEE DOI
2309
BibRef
Wang, H.C.[Hao-Chen],
Du, X.D.[Xiao-Dan],
Li, J.H.[Jia-Hao],
Yeh, R.A.[Raymond A.],
Shakhnarovich, G.[Greg],
Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for
3D Generation,
CVPR23(12619-12629)
IEEE DOI
2309
BibRef
Jin, L.[Lei],
Luo, G.[Gen],
Zhou, Y.[Yiyi],
Sun, X.S.[Xiao-Shuai],
Jiang, G.[Guannan],
Shu, A.[Annan],
Ji, R.R.[Rong-Rong],
RefCLIP: A Universal Teacher for Weakly Supervised Referring
Expression Comprehension,
CVPR23(01-10)
IEEE DOI
2309
BibRef
Saito, K.[Kuniaki],
Sohn, K.[Kihyuk],
Zhang, X.[Xiang],
Li, C.L.[Chun-Liang],
Lee, C.Y.[Chen-Yu],
Saenko, K.[Kate],
Pfister, T.[Tomas],
Prefix Conditioning Unifies Language and Label Supervision,
CVPR23(2861-2870)
IEEE DOI
2309
BibRef
Park, J.[Jaeyoo],
Han, B.H.[Bo-Hyung],
Multi-Modal Representation Learning with Text-Driven Soft Masks,
CVPR23(2798-2807)
IEEE DOI
2309
BibRef
Jin, Z.[Zhao],
Hayat, M.[Munawar],
Yang, Y.W.[Yu-Wei],
Guo, Y.L.[Yu-Lan],
Lei, Y.J.[Yin-Jie],
Context-aware Alignment and Mutual Masking for 3D-Language
Pre-training,
CVPR23(10984-10994)
IEEE DOI
2309
BibRef
Guo, Z.X.[Zi-Xian],
Dong, B.[Bowen],
Ji, Z.L.[Zhi-Long],
Bai, J.F.[Jin-Feng],
Guo, Y.W.[Yi-Wen],
Zuo, W.M.[Wang-Meng],
Texts as Images in Prompt Tuning for Multi-Label Image Recognition,
CVPR23(2808-2817)
IEEE DOI
2309
WWW Link.
BibRef
Cherti, M.[Mehdi],
Beaumont, R.[Romain],
Wightman, R.[Ross],
Wortsman, M.[Mitchell],
Ilharco, G.[Gabriel],
Gordon, C.[Cade],
Schuhmann, C.[Christoph],
Schmidt, L.[Ludwig],
Jitsev, J.[Jenia],
Reproducible Scaling Laws for Contrastive Language-Image Learning,
CVPR23(2818-2829)
IEEE DOI
2309
BibRef
Lei, J.[Jie],
Li, L.J.[Lin-Jie],
Zhou, L.[Luowei],
Gan, Z.[Zhe],
Berg, T.L.[Tamara L.],
Bansal, M.[Mohit],
Liu, J.J.[Jing-Jing],
Less is More:
ClipBERT for Video-and-Language Learning via Sparse Sampling,
CVPR21(7327-7337)
IEEE DOI
2111
Training, Computational modeling, Feature extraction, Knowledge discovery,
Distance measurement
BibRef
Zhou, J.H.[Jing-Hao],
Dong, L.[Li],
Gan, Z.[Zhe],
Wang, L.J.[Li-Juan],
Wei, F.[Furu],
Non-Contrastive Learning Meets Language-Image Pre-Training,
CVPR23(11028-11038)
IEEE DOI
2309
BibRef
Hu, Z.[Ziniu],
Iscen, A.[Ahmet],
Sun, C.[Chen],
Wang, Z.[Zirui],
Chang, K.W.[Kai-Wei],
Sun, Y.Z.[Yi-Zhou],
Schmid, C.[Cordelia],
Ross, D.A.[David A.],
Fathi, A.[Alireza],
Reveal: Retrieval-Augmented Visual-Language Pre-Training with
Multi-Source Multimodal Knowledge Memory,
CVPR23(23369-23379)
IEEE DOI
2309
BibRef
Li, Y.H.[Yang-Hao],
Fan, H.Q.[Hao-Qi],
Hu, R.H.[Rong-Hang],
Feichtenhofer, C.[Christoph],
He, K.M.[Kai-Ming],
Scaling Language-Image Pre-Training via Masking,
CVPR23(23390-23400)
IEEE DOI
2309
BibRef
Jin, P.[Peng],
Huang, J.[Jinfa],
Xiong, P.F.[Peng-Fei],
Tian, S.X.[Shang-Xuan],
Liu, C.[Chang],
Ji, X.Y.[Xiang-Yang],
Yuan, L.[Li],
Chen, J.[Jie],
Video-Text as Game Players: Hierarchical Banzhaf Interaction for
Cross-Modal Representation Learning,
CVPR23(2472-2482)
IEEE DOI
2309
WWW Link.
BibRef
Ye, S.Q.[Shu-Quan],
Xie, Y.[Yujia],
Chen, D.D.[Dong-Dong],
Xu, Y.[Yichong],
Yuan, L.[Lu],
Zhu, C.G.[Chen-Guang],
Liao, J.[Jing],
Improving Commonsense in Vision-Language Models via Knowledge Graph
Riddles,
CVPR23(2634-2645)
IEEE DOI
2309
BibRef
Li, H.[Hao],
Zhu, J.[Jinguo],
Jiang, X.[Xiaohu],
Zhu, X.[Xizhou],
Li, H.S.[Hong-Sheng],
Yuan, C.[Chun],
Wang, X.H.[Xiao-Hua],
Qiao, Y.[Yu],
Wang, X.G.[Xiao-Gang],
Wang, W.H.[Wen-Hai],
Dai, J.F.[Ji-Feng],
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and
Vision-Language Tasks,
CVPR23(2691-2700)
IEEE DOI
2309
BibRef
Wu, W.H.[Wen-Hao],
Wang, X.H.[Xiao-Han],
Luo, H.P.[Hai-Peng],
Wang, J.D.[Jing-Dong],
Yang, Y.[Yi],
Ouyang, W.L.[Wan-Li],
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition
with Pre-trained Vision-Language Models,
CVPR23(6620-6630)
IEEE DOI
2309
BibRef
Seth, A.[Ashish],
Hemani, M.[Mayur],
Agarwal, C.[Chirag],
DeAR: Debiasing Vision-Language Models with Additive Residuals,
CVPR23(6820-6829)
IEEE DOI
2309
BibRef
Radenovic, F.[Filip],
Dubey, A.[Abhimanyu],
Kadian, A.[Abhishek],
Mihaylov, T.[Todor],
Vandenhende, S.[Simon],
Patel, Y.[Yash],
Wen, Y.[Yi],
Ramanathan, V.[Vignesh],
Mahajan, D.[Dhruv],
Filtering, Distillation, and Hard Negatives for Vision-Language
Pre-Training,
CVPR23(6967-6977)
IEEE DOI
2309
BibRef
Yu, T.[Tao],
Lu, Z.[Zhihe],
Jin, X.[Xin],
Chen, Z.B.[Zhi-Bo],
Wang, X.C.[Xin-Chao],
Task Residual for Tuning Vision-Language Models,
CVPR23(10899-10909)
IEEE DOI
2309
BibRef
Ma, Z.X.[Zi-Xian],
Hong, J.[Jerry],
Gul, M.O.[Mustafa Omer],
Gandhi, M.[Mona],
Gao, I.[Irena],
Krishna, R.[Ranjay],
CREPE: Can Vision-Language Foundation Models Reason
Compositionally?,
CVPR23(10910-10921)
IEEE DOI
2309
BibRef
Yin, D.[Da],
Gao, F.[Feng],
Thattai, G.[Govind],
Johnston, M.[Michael],
Chang, K.W.[Kai-Wei],
GIVL: Improving Geographical Inclusivity of Vision-Language Models
with Pre-Training Methods,
CVPR23(10951-10961)
IEEE DOI
2309
BibRef
Gao, C.[Chen],
Peng, X.Y.[Xing-Yu],
Yan, M.[Mi],
Wang, H.[He],
Yang, L.R.[Li-Rong],
Ren, H.B.[Hai-Bing],
Li, H.S.[Hong-Sheng],
Liu, S.[Si],
Adaptive Zone-aware Hierarchical Planner for Vision-Language
Navigation,
CVPR23(14911-14920)
IEEE DOI
2309
BibRef
Yeh, C.H.[Chun-Hsiao],
Russell, B.[Bryan],
Sivic, J.[Josef],
Heilbron, F.C.[Fabian Caba],
Jenni, S.[Simon],
Meta-Personalizing Vision-Language Models to Find Named Instances in
Video,
CVPR23(19123-19132)
IEEE DOI
2309
BibRef
Gou, Y.H.[Yun-Hao],
Ko, T.[Tom],
Yang, H.[Hansi],
Kwok, J.[James],
Zhang, Y.[Yu],
Wang, M.X.[Ming-Xuan],
Leveraging per Image-Token Consistency for Vision-Language
Pre-Training,
CVPR23(19155-19164)
IEEE DOI
2309
BibRef
Wang, S.J.[Shi-Jie],
Chang, J.L.[Jian-Long],
Li, H.J.[Hao-Jie],
Wang, Z.H.[Zhi-Hui],
Ouyang, W.L.[Wan-Li],
Tian, Q.[Qi],
Open-Set Fine-Grained Retrieval via Prompting Vision-Language
Evaluator,
CVPR23(19381-19391)
IEEE DOI
2309
BibRef
Cheng, F.[Feng],
Wang, X.[Xizi],
Lei, J.[Jie],
Crandall, D.[David],
Bansal, M.[Mohit],
Bertasius, G.[Gedas],
VindLU: A Recipe for Effective Video-and-Language Pretraining,
CVPR23(10739-10750)
IEEE DOI
2309
BibRef
Zhou, H.[Honglu],
Martín-Martín, R.[Roberto],
Kapadia, M.[Mubbasir],
Savarese, S.[Silvio],
Niebles, J.C.[Juan Carlos],
Procedure-Aware Pretraining for Instructional Video Understanding,
CVPR23(10727-10738)
IEEE DOI
2309
BibRef
Yang, A.[Antoine],
Nagrani, A.[Arsha],
Seo, P.H.[Paul Hongsuck],
Miech, A.[Antoine],
Pont-Tuset, J.[Jordi],
Laptev, I.[Ivan],
Sivic, J.[Josef],
Schmid, C.[Cordelia],
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense
Video Captioning,
CVPR23(10714-10726)
IEEE DOI
2309
BibRef
Ji, Y.[Yatai],
Tu, R.C.[Rong-Cheng],
Jiang, J.[Jie],
Kong, W.J.[Wei-Jie],
Cai, C.[Chengfei],
Zhao, W.Z.[Wen-Zhe],
Wang, H.F.[Hong-Fa],
Yang, Y.[Yujiu],
Liu, W.[Wei],
Seeing What You Miss: Vision-Language Pre-training with Semantic
Completion Learning,
CVPR23(6789-6798)
IEEE DOI
2309
BibRef
Alper, M.[Morris],
Fiman, M.[Michael],
Averbuch-Elor, H.[Hadar],
Is BERT Blind? Exploring the Effect of Vision-and-Language
Pretraining on Visual Language Understanding,
CVPR23(6778-6788)
IEEE DOI
2309
BibRef
Liu, M.Y.[Meng-Yin],
Jiang, J.[Jie],
Zhu, C.[Chao],
Yin, X.C.[Xu-Cheng],
VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic
Self-Supervision,
CVPR23(6662-6671)
IEEE DOI
2309
BibRef
Wei, Y.X.[Yi-Xuan],
Cao, Y.[Yue],
Zhang, Z.[Zheng],
Peng, H.[Houwen],
Yao, Z.L.[Zhu-Liang],
Xie, Z.[Zhenda],
Hu, H.[Han],
Guo, B.[Baining],
iCLIP: Bridging Image Classification and Contrastive Language-Image
Pre-training for Visual Recognition,
CVPR23(2776-2786)
IEEE DOI
2309
BibRef
Hyung, J.[Junha],
Hwang, S.[Sungwon],
Kim, D.[Daejin],
Lee, H.[Hyunji],
Choo, J.[Jaegul],
Local 3D Editing via 3D Distillation of CLIP Knowledge,
CVPR23(12674-12684)
IEEE DOI
2309
BibRef
Mu, N.[Norman],
Kirillov, A.[Alexander],
Wagner, D.[David],
Xie, S.[Saining],
SLIP: Self-supervision Meets Language-Image Pre-training,
ECCV22(XXVI:529-544).
Springer DOI
2211
BibRef
Crowson, K.[Katherine],
Biderman, S.[Stella],
Kornis, D.[Daniel],
Stander, D.[Dashiell],
Hallahan, E.[Eric],
Castricato, L.[Louis],
Raff, E.[Edward],
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural
Language Guidance,
ECCV22(XXXVII:88-105).
Springer DOI
2211
BibRef
Wu, X.S.[Xiao-Shi],
Zhu, F.[Feng],
Zhao, R.[Rui],
Li, H.S.[Hong-Sheng],
CORA: Adapting CLIP for Open-Vocabulary Detection with Region
Prompting and Anchor Pre-Matching,
CVPR23(7031-7040)
IEEE DOI
2309
BibRef
Dong, X.Y.[Xiao-Yi],
Bao, J.M.[Jian-Min],
Zheng, Y.[Yinglin],
Zhang, T.[Ting],
Chen, D.D.[Dong-Dong],
Yang, H.[Hao],
Zeng, M.[Ming],
Zhang, W.M.[Wei-Ming],
Yuan, L.[Lu],
Chen, D.[Dong],
Wen, F.[Fang],
Yu, N.H.[Neng-Hai],
MaskCLIP: Masked Self-Distillation Advances Contrastive
Language-Image Pretraining,
CVPR23(10995-11005)
IEEE DOI
2309
BibRef
Xie, C.W.[Chen-Wei],
Sun, S.Y.[Si-Yang],
Xiong, X.[Xiong],
Zheng, Y.[Yun],
Zhao, D.L.[De-Li],
Zhou, J.R.[Jing-Ren],
RA-CLIP: Retrieval Augmented Contrastive Language-Image Pre-Training,
CVPR23(19265-19274)
IEEE DOI
2309
BibRef
Chen, P.J.[Pei-Jie],
Li, Q.[Qi],
Biaz, S.[Saad],
Bui, T.[Trung],
Nguyen, A.[Anh],
gScoreCAM: What Objects Is CLIP Looking At?,
ACCV22(IV:588-604).
Springer DOI
2307
Analysis of OpenAI's CLIP.
BibRef
Wang, R.[Runqi],
Duan, X.Y.[Xiao-Yue],
Kang, G.L.[Guo-Liang],
Liu, J.Z.[Jian-Zhuang],
Lin, S.H.[Shao-Hui],
Xu, S.[Songcen],
Lv, J.[Jinhu],
Zhang, B.C.[Bao-Chang],
AttriCLIP: A Non-Incremental Learner for Incremental Knowledge Learning,
CVPR23(3654-3663)
IEEE DOI
2309
WWW Link.
BibRef
Rasheed, H.[Hanoona],
Khattak, M.U.[Muhammad Uzair],
Maaz, M.[Muhammad],
Khan, S.[Salman],
Khan, F.S.[Fahad Shahbaz],
Fine-tuned CLIP Models are Efficient Video Learners,
CVPR23(6545-6554)
IEEE DOI
2309
WWW Link.
BibRef
Liu, R.[Ruyang],
Huang, J.J.[Jing-Jia],
Li, G.[Ge],
Feng, J.S.[Jia-Shi],
Wu, X.L.[Xing-Long],
Li, T.H.[Thomas H.],
Revisiting Temporal Modeling for CLIP-Based Image-to-Video Knowledge
Transferring,
CVPR23(6555-6564)
IEEE DOI
2309
BibRef
Tschannen, M.[Michael],
Mustafa, B.[Basil],
Houlsby, N.[Neil],
CLIPPO: Image-and-Language Understanding from Pixels Only,
CVPR23(11006-11017)
IEEE DOI
2309
BibRef
Zhou, Z.Q.[Zi-Qin],
Lei, Y.J.[Yin-Jie],
Zhang, B.[Bowen],
Liu, L.Q.[Ling-Qiao],
Liu, Y.F.[Yi-Fan],
ZegCLIP: Towards Adapting CLIP for Zero-shot Semantic Segmentation,
CVPR23(11175-11185)
IEEE DOI
2309
BibRef
He, W.B.[Wen-Bin],
Jamonnak, S.[Suphanut],
Gou, L.[Liang],
Ren, L.[Liu],
CLIP-S4: Language-Guided Self-Supervised Semantic Segmentation,
CVPR23(11207-11216)
IEEE DOI
2309
BibRef
Huang, Z.X.[Zi-Xuan],
Jampani, V.[Varun],
Thai, A.[Anh],
Li, Y.Z.[Yuan-Zhen],
Stojanov, S.[Stefan],
Rehg, J.M.[James M.],
ShapeClipper: Scalable 3D Shape Learning from Single-View Images via
Geometric and CLIP-Based Consistency,
CVPR23(12912-12922)
IEEE DOI
2309
BibRef
Tao, M.[Ming],
Bao, B.K.[Bing-Kun],
Tang, H.[Hao],
Xu, C.S.[Chang-Sheng],
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis,
CVPR23(14214-14223)
IEEE DOI
2309
BibRef
Zeng, Y.[Yihan],
Jiang, C.[Chenhan],
Mao, J.[Jiageng],
Han, J.H.[Jian-Hua],
Ye, C.Q.[Chao-Qiang],
Huang, Q.Q.[Qing-Qiu],
Yeung, D.Y.[Dit-Yan],
Yang, Z.[Zhen],
Liang, X.D.[Xiao-Dan],
Xu, H.[Hang],
CLIP2: Contrastive Language-Image-Point Pretraining from Real-World
Point Cloud Data,
CVPR23(15244-15253)
IEEE DOI
2309
BibRef
Lin, Y.Q.[Yu-Qi],
Chen, M.H.[Ming-Hao],
Wang, W.X.[Wen-Xiao],
Wu, B.[Boxi],
Li, K.[Ke],
Lin, B.B.[Bin-Bin],
Liu, H.F.[Hai-Feng],
He, X.F.[Xiao-Fei],
CLIP is Also an Efficient Segmenter: A Text-Driven Approach for
Weakly Supervised Semantic Segmentation,
CVPR23(15305-15314)
IEEE DOI
2309
BibRef
Sanghi, A.[Aditya],
Fu, R.[Rao],
Liu, V.[Vivian],
Willis, K.D.D.[Karl D.D.],
Shayani, H.[Hooman],
Khasahmadi, A.H.[Amir H.],
Sridhar, S.[Srinath],
Ritchie, D.[Daniel],
CLIP-Sculptor: Zero-Shot Generation of High-Fidelity and Diverse
Shapes from Natural Language,
CVPR23(18339-18348)
IEEE DOI
2309
BibRef
Pei, R.J.[Ren-Jing],
Liu, J.Z.[Jian-Zhuang],
Li, W.[Weimian],
Shao, B.[Bin],
Xu, S.[Songcen],
Dai, P.[Peng],
Lu, J.W.[Ju-Wei],
Yan, Y.[Youliang],
CLIPPING: Distilling CLIP-Based Models with a Student Base for
Video-Language Retrieval,
CVPR23(18983-18992)
IEEE DOI
2309
BibRef
Jeong, J.[Jongheon],
Zou, Y.[Yang],
Kim, T.[Taewan],
Zhang, D.Q.[Dong-Qing],
Ravichandran, A.[Avinash],
Dabeer, O.[Onkar],
WinCLIP: Zero-/Few-Shot Anomaly Classification and Segmentation,
CVPR23(19606-19616)
IEEE DOI
2309
BibRef
Shamshad, F.[Fahad],
Naseer, M.[Muzammal],
Nandakumar, K.[Karthik],
CLIP2Protect: Protecting Facial Privacy Using Text-Guided Makeup via
Adversarial Latent Search,
CVPR23(20595-20605)
IEEE DOI
2309
BibRef
Chen, Y.H.[Yi-Hao],
Qi, X.[Xianbiao],
Wang, J.A.[Jian-An],
Zhang, L.[Lei],
DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP
Training,
CVPR23(22648-22657)
IEEE DOI
2309
BibRef
Wasim, S.T.[Syed Talal],
Naseer, M.[Muzammal],
Khan, S.[Salman],
Khan, F.S.[Fahad Shahbaz],
Shah, M.[Mubarak],
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting,
CVPR23(23034-23044)
IEEE DOI
2309
BibRef
Parelli, M.[Maria],
Delitzas, A.[Alexandros],
Hars, N.[Nikolas],
Vlassis, G.[Georgios],
Anagnostidis, S.[Sotirios],
Bachmann, G.[Gregor],
Hofmann, T.[Thomas],
CLIP-Guided Vision-Language Pre-training for Question Answering in 3D
Scenes,
ODRUM23(5607-5612)
IEEE DOI
2309
BibRef
Ning, S.[Shan],
Qiu, L.[Longtian],
Liu, Y.F.[Yong-Fei],
He, X.M.[Xu-Ming],
HOICLIP: Efficient Knowledge Transfer for HOI Detection with
Vision-Language Models,
CVPR23(23507-23517)
IEEE DOI
2309
BibRef
Yao, L.W.[Le-Wei],
Han, J.H.[Jian-Hua],
Liang, X.D.[Xiao-Dan],
Xu, D.[Dan],
Zhang, W.[Wei],
Li, Z.G.[Zhen-Guo],
Xu, H.[Hang],
DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via
Word-Region Alignment,
CVPR23(23497-23506)
IEEE DOI
2309
BibRef
Singha, M.[Mainak],
Jha, A.[Ankit],
Solanki, B.[Bhupendra],
Bose, S.[Shirsha],
Banerjee, B.[Biplab],
APPLeNet: Visual Attention Parameterized Prompt Learning for Few-Shot
Remote Sensing Image Generalization using CLIP,
EarthVision23(2024-2034)
IEEE DOI
2309
BibRef
Zhu, J.[Jun],
Jin, J.[Jiandong],
Yang, Z.H.[Zi-Han],
Wu, X.[Xiaohao],
Wang, X.[Xiao],
Learning CLIP Guided Visual-Text Fusion Transformer for Video-based
Pedestrian Attribute Recognition,
NFVLR23(2626-2629)
IEEE DOI
2309
BibRef
Gannamaneni, S.S.[Sujan Sai],
Sadaghiani, A.[Arwin],
Rao, R.P.[Rohil Prakash],
Mock, M.[Michael],
Akila, M.[Maram],
Investigating CLIP Performance for Meta-data Generation in AD
Datasets,
SAIAD23(3840-3850)
IEEE DOI
2309
BibRef
Chen, R.N.[Run-Nan],
Liu, Y.Q.[You-Quan],
Kong, L.D.[Ling-Dong],
Zhu, X.G.[Xin-Ge],
Ma, Y.X.[Yue-Xin],
Li, Y.[Yikang],
Hou, Y.N.[Yue-Nan],
Qiao, Y.[Yu],
Wang, W.P.[Wen-Ping],
CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP,
CVPR23(7020-7030)
IEEE DOI
2309
BibRef
Ni, B.L.[Bo-Lin],
Peng, H.W.[Hou-Wen],
Chen, M.H.[Ming-Hao],
Zhang, S.Y.[Song-Yang],
Meng, G.F.[Gao-Feng],
Fu, J.L.[Jian-Long],
Xiang, S.M.[Shi-Ming],
Ling, H.B.[Hai-Bin],
Expanding Language-Image Pretrained Models for General Video
Recognition,
ECCV22(IV:1-18).
Springer DOI
2211
WWW Link.
BibRef
Zhang, R.R.[Ren-Rui],
Zhang, W.[Wei],
Fang, R.Y.[Rong-Yao],
Gao, P.[Peng],
Li, K.C.[Kun-Chang],
Dai, J.F.[Ji-Feng],
Qiao, Y.[Yu],
Li, H.S.[Hong-Sheng],
Tip-Adapter: Training-Free Adaption of CLIP for Few-Shot Classification,
ECCV22(XXXV:493-510).
Springer DOI
2211
BibRef
Yang, J.[Jinyu],
Duan, J.L.[Jia-Li],
Tran, S.[Son],
Xu, Y.[Yi],
Chanda, S.[Sampath],
Chen, L.Q.[Li-Qun],
Zeng, B.[Belinda],
Chilimbi, T.[Trishul],
Huang, J.Z.[Jun-Zhou],
Vision-Language Pre-Training with Triple Contrastive Learning,
CVPR22(15650-15659)
IEEE DOI
2210
Representation learning, Visualization,
Question answering (information retrieval),
Self- semi- meta- unsupervised learning
BibRef
Guo, X.Y.[Xiao-Yuan],
Duan, J.L.[Jia-Li],
Kuo, C.C.J.[C.C. Jay],
Gichoya, J.W.[Judy Wawira],
Banerjee, I.[Imon],
Augmenting Vision Language Pretraining by Learning Codebook with
Visual Semantics,
ICPR22(4779-4785)
IEEE DOI
2212
Representation learning, Bridges, Visualization, Vocabulary,
Semantics, Buildings, Benchmark testing
BibRef
Zhou, C.[Chong],
Loy, C.C.[Chen Change],
Dai, B.[Bo],
Extract Free Dense Labels from CLIP,
ECCV22(XXVIII:696-712).
Springer DOI
2211
WWW Link. Contrastive Language-Image Pre-training.
BibRef
Lin, Z.[Ziyi],
Geng, S.J.[Shi-Jie],
Zhang, R.R.[Ren-Rui],
Gao, P.[Peng],
de Melo, G.[Gerard],
Wang, X.G.[Xiao-Gang],
Dai, J.F.[Ji-Feng],
Qiao, Y.[Yu],
Li, H.S.[Hong-Sheng],
Frozen CLIP Models are Efficient Video Learners,
ECCV22(XXXV:388-404).
Springer DOI
2211
BibRef
Rao, Y.M.[Yong-Ming],
Zhao, W.L.[Wen-Liang],
Chen, G.Y.[Guang-Yi],
Tang, Y.S.[Yan-Song],
Zhu, Z.[Zheng],
Huang, G.[Guan],
Zhou, J.[Jie],
Lu, J.W.[Ji-Wen],
DenseCLIP: Language-Guided Dense Prediction with Context-Aware
Prompting,
CVPR22(18061-18070)
IEEE DOI
2210
Representation learning, Image segmentation, Visualization, Shape,
Computational modeling, Semantics, Predictive models,
grouping and shape analysis
BibRef
Kwon, G.[Gihyun],
Ye, J.C.[Jong Chul],
CLIPstyler: Image Style Transfer with a Single Text Condition,
CVPR22(18041-18050)
IEEE DOI
2210
Semantics, Modulation, Vision+language,
Image and video synthesis and generation
BibRef
Khandelwal, A.[Apoorv],
Weihs, L.[Luca],
Mottaghi, R.[Roozbeh],
Kembhavi, A.[Aniruddha],
Simple but Effective: CLIP Embeddings for Embodied AI,
CVPR22(14809-14818)
IEEE DOI
2210
Contrastive Language Image Pretraining.
Training, Measurement, Visualization, Navigation, Semantics,
Robot vision systems, Robot vision, Navigation and autonomous driving
BibRef
Ma, H.Y.[Hao-Yu],
Zhao, H.[Handong],
Lin, Z.[Zhe],
Kale, A.[Ajinkya],
Wang, Z.Y.[Zhang-Yang],
Yu, T.[Tong],
Gu, J.X.[Jiu-Xiang],
Choudhary, S.[Sunav],
Xie, X.H.[Xiao-Hui],
EI-CLIP: Entity-aware Interventional Contrastive Learning for
E-commerce Cross-modal Retrieval,
CVPR22(18030-18040)
IEEE DOI
2210
Design methodology, Semantics, Clothing, Metadata, Benchmark testing,
Search problems,
BibRef
Barraco, M.[Manuele],
Cornia, M.[Marcella],
Cascianelli, S.[Silvia],
Baraldi, L.[Lorenzo],
Cucchiara, R.[Rita],
The Unreasonable Effectiveness of CLIP Features for Image Captioning:
An Experimental Analysis,
MULA22(4661-4669)
IEEE DOI
2210
Visualization, Protocols, Detectors,
Distance measurement, Data models
BibRef
Tevet, G.[Guy],
Gordon, B.[Brian],
Hertz, A.[Amir],
Bermano, A.H.[Amit H.],
Cohen-Or, D.[Daniel],
MotionCLIP: Exposing Human Motion Generation to CLIP Space,
ECCV22(XXII:358-374).
Springer DOI
2211
WWW Link. Contrastive Language-Image Pre-training (CLIP) applied to motion.
BibRef
Materzynska, J.[Joanna],
Torralba, A.[Antonio],
Bau, D.[David],
Disentangling visual and written concepts in CLIP,
CVPR22(16389-16398)
IEEE DOI
2210
Visualization, Image coding, Benchmark testing, Cognition,
Task analysis, Vision + language, Visual reasoning
BibRef
Li, M.[Manling],
Xu, R.[Ruochen],
Wang, S.[Shuohang],
Zhou, L.[Luowei],
Lin, X.D.[Xu-Dong],
Zhu, C.G.[Chen-Guang],
Zeng, M.[Michael],
Ji, H.[Heng],
Chang, S.F.[Shih-Fu],
CLIP-Event: Connecting Text and Images with Event Structures,
CVPR22(16399-16408)
IEEE DOI
2210
Codes, Computational modeling, Image retrieval, Benchmark testing,
Information retrieval, Vision + language
BibRef
Zhong, Y.[Yiwu],
Yang, J.W.[Jian-Wei],
Zhang, P.C.[Peng-Chuan],
Li, C.Y.[Chun-Yuan],
Codella, N.[Noel],
Li, L.H.[Liunian Harold],
Zhou, L.[Luowei],
Dai, X.[Xiyang],
Yuan, L.[Lu],
Li, Y.[Yin],
Gao, J.F.[Jian-Feng],
RegionCLIP: Region-based Language-Image Pretraining,
CVPR22(16772-16782)
IEEE DOI
2210
Representation learning, Visualization, Technological innovation,
Image recognition, Text recognition, Transfer learning, Vision + language
BibRef
Baldrati, A.[Alberto],
Bertini, M.[Marco],
Uricchio, T.[Tiberio],
del Bimbo, A.[Alberto],
Effective conditioned and composed image retrieval combining
CLIP-based features,
CVPR22(21434-21442)
IEEE DOI
2210
BibRef
And:
Conditioned and composed image retrieval combining and partially
fine-tuning CLIP-based features,
ODRUM22(4955-4964)
IEEE DOI
2210
Visualization, Shape, Image retrieval, Pipelines, Search engines,
Pattern recognition, Training, Solids
BibRef
Patashnik, O.[Or],
Wu, Z.Z.[Zong-Ze],
Shechtman, E.[Eli],
Cohen-Or, D.[Daniel],
Lischinski, D.[Dani],
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery,
ICCV21(2065-2074)
IEEE DOI
2203
Hair, Visualization, Image coding, Manuals, Optimization,
Image and video synthesis, Vision applications and systems
BibRef