Bazi, Y.[Yakoub],
Bashmal, L.[Laila],
Al Rahhal, M.M.[Mohamad M.],
Al Dayil, R.[Reham],
Al Ajlan, N.[Naif],
Vision Transformers for Remote Sensing Image Classification,
RS(13), No. 3, 2021, pp. xx-yy.
DOI Link
2102
BibRef
Hu, H.Q.[Hao-Qi],
Lu, X.F.[Xiao-Feng],
Zhang, X.P.[Xin-Peng],
Zhang, T.X.[Tian-Xing],
Sun, G.L.[Guang-Ling],
Inheritance Attention Matrix-Based Universal Adversarial
Perturbations on Vision Transformers,
SPLetters(28), 2021, pp. 1923-1927.
IEEE DOI
2110
Perturbation methods, Robustness, Visualization, Transformers,
Optimization, Task analysis, Head, Vision Transformers, self-attention
BibRef
Li, T.[Tao],
Zhang, Z.[Zheng],
Pei, L.[Lishen],
Gan, Y.[Yan],
HashFormer: Vision Transformer Based Deep Hashing for Image Retrieval,
SPLetters(29), 2022, pp. 827-831.
IEEE DOI
2204
Transformers, Binary codes, Task analysis, Training, Image retrieval,
Feature extraction, Databases, Binary embedding, image retrieval
BibRef
Jiang, B.[Bo],
Zhao, K.K.[Kang-Kang],
Tang, J.[Jin],
RGTransformer: Region-Graph Transformer for Image Representation and
Few-Shot Classification,
SPLetters(29), 2022, pp. 792-796.
IEEE DOI
2204
Measurement, Transformers, Image representation,
Feature extraction, Visualization, transformer
BibRef
Chen, Z.M.[Zhao-Min],
Cui, Q.[Quan],
Zhao, B.[Borui],
Song, R.J.[Ren-Jie],
Zhang, X.Q.[Xiao-Qin],
Yoshie, O.[Osamu],
SST: Spatial and Semantic Transformers for Multi-Label Image
Recognition,
IP(31), 2022, pp. 2570-2583.
IEEE DOI
2204
Correlation, Semantics, Transformers, Image recognition,
Task analysis, Training, Feature extraction, label correlation
BibRef
Xue, Z.X.[Zhi-Xiang],
Tan, X.[Xiong],
Yu, X.[Xuchu],
Liu, B.[Bing],
Yu, A.[Anzhu],
Zhang, P.Q.[Peng-Qiang],
Deep Hierarchical Vision Transformer for Hyperspectral and LiDAR Data
Classification,
IP(31), 2022, pp. 3095-3110.
IEEE DOI
2205
Feature extraction, Transformers, Hyperspectral imaging,
Laser radar, Data mining, Collaboration, Data models,
cross attention fusion
BibRef
Wang, G.H.[Guang-Hui],
Li, B.[Bin],
Zhang, T.[Tao],
Zhang, S.[Shubi],
A Network Combining a Transformer and a Convolutional Neural Network
for Remote Sensing Image Change Detection,
RS(14), No. 9, 2022, pp. xx-yy.
DOI Link
2205
BibRef
Luo, G.[Gen],
Zhou, Y.[Yiyi],
Sun, X.S.[Xiao-Shuai],
Wang, Y.[Yan],
Cao, L.J.[Liu-Juan],
Wu, Y.J.[Yong-Jian],
Huang, F.Y.[Fei-Yue],
Ji, R.R.[Rong-Rong],
Towards Lightweight Transformer Via Group-Wise Transformation for
Vision-and-Language Tasks,
IP(31), 2022, pp. 3386-3398.
IEEE DOI
2205
Transformers, Task analysis, Computational modeling,
Benchmark testing, Visualization, Convolution, Head,
reference expression comprehension
BibRef
Tu, Y.B.[Yun-Bin],
Li, L.[Liang],
Su, L.[Li],
Gao, S.X.[Sheng-Xiang],
Yan, C.G.[Cheng-Gang],
Zha, Z.J.[Zheng-Jun],
Yu, Z.T.[Zheng-Tao],
Huang, Q.M.[Qing-Ming],
I2-Transformer: Intra- and Inter-Relation Embedding Transformer for
TV Show Captioning,
IP(31), 2022, pp. 3565-3577.
IEEE DOI
2206
Transformers, Semantics, Task analysis, Visualization, TV,
Graph neural networks, TV Show captioning, transformer
BibRef
Heo, J.[Jiseong],
Wang, Y.[Yooseung],
Park, J.[Jihun],
Occlusion-aware spatial attention transformer for occluded object
recognition,
PRL(159), 2022, pp. 70-76.
Elsevier DOI
2206
Occluded object recognition, Visual transformer, Spatial attention
BibRef
Wang, J.Y.[Jia-Yun],
Chakraborty, R.[Rudrasis],
Yu, S.X.[Stella X.],
Transformer for 3D Point Clouds,
PAMI(44), No. 8, August 2022, pp. 4419-4431.
IEEE DOI
2207
Convolution, Feature extraction, Shape, Semantics, Task analysis,
Measurement, point cloud, transformation, deformable, segmentation, 3D detection
BibRef
Wang, L.[Libo],
Li, R.[Rui],
Zhang, C.[Ce],
Fang, S.H.[Sheng-Hui],
Duan, C.X.[Chen-Xi],
Meng, X.L.[Xiao-Liang],
Atkinson, P.M.[Peter M.],
UNetFormer: A UNet-like transformer for efficient semantic
segmentation of remote sensing urban scene imagery,
PandRS(190), 2022, pp. 196-214.
Elsevier DOI
2208
Semantic Segmentation, Remote Sensing, Vision Transformer,
Fully Transformer Network, Global-local Context, Urban Scene
BibRef
Kheldouni, A.[Amine],
Boumhidi, J.[Jaouad],
A Study of Bidirectional Encoder Representations from Transformers
for Sequential Recommendations,
ISCV22(1-5)
IEEE DOI
2208
Knowledge engineering, Recurrent neural networks,
Predictive models, Markov processes
BibRef
Li, Z.[Zekun],
Liu, Y.F.[Yu-Fan],
Li, B.[Bing],
Feng, B.L.[Bai-Lan],
Wu, K.[Kebin],
Peng, C.W.[Cheng-Wei],
Hu, W.M.[Wei-Ming],
SDTP: Semantic-Aware Decoupled Transformer Pyramid for Dense Image
Prediction,
CirSysVideo(32), No. 9, September 2022, pp. 6160-6173.
IEEE DOI
2209
Transformers, Semantics, Task analysis, Detectors,
Image segmentation, Head, Convolution, Transformer, dense prediction,
multi-level interaction
BibRef
Wu, J.J.[Jia-Jing],
Wei, Z.Q.[Zhi-Qiang],
Zhang, J.P.[Jin-Peng],
Zhang, Y.[Yushi],
Jia, D.N.[Dong-Ning],
Yin, B.[Bo],
Yu, Y.C.[Yun-Chao],
Full-Coupled Convolutional Transformer for Surface-Based Duct
Refractivity Inversion,
RS(14), No. 17, 2022, pp. xx-yy.
DOI Link
2209
BibRef
Dalmaz, O.[Onat],
Yurt, M.[Mahmut],
Çukur, T.[Tolga],
ResViT: Residual Vision Transformers for Multimodal Medical Image
Synthesis,
MedImg(41), No. 10, October 2022, pp. 2598-2614.
IEEE DOI
2210
Transformers, Biomedical imaging, Subspace constraints,
Task analysis, Image synthesis, Magnetic resonance imaging, unified
BibRef
Jiang, K.[Kai],
Peng, P.[Peng],
Lian, Y.[Youzao],
Xu, W.S.[Wei-Sheng],
The encoding method of position embeddings in vision transformer,
JVCIR(89), 2022, pp. 103664.
Elsevier DOI
2212
Vision transformer, Position embeddings, Gabor filters
BibRef
Han, K.[Kai],
Wang, Y.H.[Yun-He],
Chen, H.[Hanting],
Chen, X.[Xinghao],
Guo, J.[Jianyuan],
Liu, Z.H.[Zhen-Hua],
Tang, Y.[Yehui],
Xiao, A.[An],
Xu, C.J.[Chun-Jing],
Xu, Y.X.[Yi-Xing],
Yang, Z.H.[Zhao-Hui],
Zhang, Y.[Yiman],
Tao, D.C.[Da-Cheng],
A Survey on Vision Transformer,
PAMI(45), No. 1, January 2023, pp. 87-110.
IEEE DOI
2212
Survey, Vision Transformer. Transformers, Task analysis, Encoding, Computational modeling,
Visualization, Object detection, high-level vision,
video
BibRef
Hou, Q.[Qibin],
Jiang, Z.[Zihang],
Yuan, L.[Li],
Cheng, M.M.[Ming-Ming],
Yan, S.C.[Shui-Cheng],
Feng, J.S.[Jia-Shi],
Vision Permutator:
A Permutable MLP-Like Architecture for Visual Recognition,
PAMI(45), No. 1, January 2023, pp. 1328-1334.
IEEE DOI
2212
Transformers, Encoding, Visualization, Convolutional codes, Mixers,
Computer architecture, Training data, Vision permutator, deep neural network
BibRef
Yu, X.H.[Xiao-Han],
Wang, J.[Jun],
Zhao, Y.[Yang],
Gao, Y.S.[Yong-Sheng],
Mix-ViT: Mixing attentive vision transformer for ultra-fine-grained
visual categorization,
PR(135), 2023, pp. 109131.
Elsevier DOI
2212
Ultra-fine-grained visual categorization, Vision transformer,
Self-supervised learning, Attentive mixing
BibRef
Li, Y.[Yehao],
Yao, T.[Ting],
Pan, Y.[Yingwei],
Mei, T.[Tao],
Contextual Transformer Networks for Visual Recognition,
PAMI(45), No. 2, February 2023, pp. 1489-1500.
IEEE DOI
2301
Transformers, Convolution, Visualization, Task analysis,
Image recognition, Object detection, Transformer, image recognition
BibRef
Wang, H.[Hang],
Du, Y.[Youtian],
Zhang, Y.[Yabin],
Li, S.[Shuai],
Zhang, L.[Lei],
One-Stage Visual Relationship Referring With Transformers and
Adaptive Message Passing,
IP(32), 2023, pp. 190-202.
IEEE DOI
2301
Visualization, Proposals, Transformers, Task analysis, Detectors,
Message passing, Predictive models, gated message passing
BibRef
Kim, B.[Boah],
Kim, J.[Jeongsol],
Ye, J.C.[Jong Chul],
Task-Agnostic Vision Transformer for Distributed Learning of Image
Processing,
IP(32), 2023, pp. 203-218.
IEEE DOI
2301
Task analysis, Transformers, Servers, Distance learning,
Computer aided instruction, Tail, Head, Distributed learning,
task-agnostic learning
BibRef
Kiya, H.[Hitoshi],
Iijima, R.[Ryota],
Maungmaung, A.[Aprilpyone],
Kinoshita, Y.[Yuma],
Image and Model Transformation with Secret Key for Vision Transformer,
IEICE(E106-D), No. 1, January 2023, pp. 2-11.
WWW Link.
2301
BibRef
Lin, X.[Xiao],
Sun, S.Z.[Shu-Zhou],
Huang, W.[Wei],
Sheng, B.[Bin],
Li, P.[Ping],
Feng, D.D.[David Dagan],
EAPT: Efficient Attention Pyramid Transformer for Image Processing,
MultMed(25), 2023, pp. 50-61.
IEEE DOI
2301
Transformers, Encoding, Task analysis, Semantics, Feature extraction,
Costs, Convolutional neural networks, Transformer,
semantic segmentation
BibRef
Mou, C.[Chong],
Zhang, J.[Jian],
TransCL: Transformer Makes Strong and Flexible Compressive Learning,
PAMI(45), No. 4, April 2023, pp. 5236-5251.
IEEE DOI
2303
Task analysis, Transformers, Image reconstruction, Image coding,
Compressed sensing, Sensors, Cameras, Compressed sensing,
semantic segmentation
BibRef
Mo, S.T.[Shen-Tong],
Sun, Z.[Zhun],
Li, C.[Chao],
Multi-level Contrastive Learning for Self-Supervised Vision
Transformers,
WACV23(2777-2786)
IEEE DOI
2302
Training, Representation learning, Head, Semantic segmentation,
Self-supervised learning, visual reasoning
BibRef
Yun, J.[Jooyeol],
Lee, S.[Sanghyeon],
Park, M.H.[Min-Ho],
Choo, J.[Jaegul],
iColoriT: Towards Propagating Local Hints to the Right Region in
Interactive Colorization by Leveraging Vision Transformer,
WACV23(1787-1796)
IEEE DOI
2302
Convolutional codes, Image color analysis, Stacking, Gray-scale,
Transformers, Algorithms: Computational photography, image and video synthesis
BibRef
Liu, Y.[Yue],
Matsoukas, C.[Christos],
Strand, F.[Fredrik],
Azizpour, H.[Hossein],
Smith, K.[Kevin],
PatchDropout: Economizing Vision Transformers Using Patch Dropout,
WACV23(3942-3951)
IEEE DOI
2302
Training, Image resolution, Computational modeling,
Biological system modeling, Memory management, Transformers,
Biomedical/healthcare/medicine
BibRef
Chen, X.Y.[Xiang-Yu],
Hu, Q.[Qinghao],
Li, K.[Kaidong],
Zhong, C.[Cuncong],
Wang, G.H.[Guang-Hui],
Accumulated Trivial Attention Matters in Vision Transformers on Small
Datasets,
WACV23(3973-3981)
IEEE DOI
2302
Codes, Focusing, Transformers, Convolutional neural networks,
Task analysis, Algorithms: Machine learning architectures,
and algorithms (including transfer)
BibRef
Chen, C.[Chang],
Zhang, J.[JiaMing],
Yang, K.[Kailun],
Peng, K.[Kunyu],
Stiefelhagen, R.[Rainer],
Trans4Map: Revisiting Holistic Bird's-Eye-View Mapping from
Egocentric Images to Allocentric Semantics with Vision Transformers,
WACV23(4002-4011)
IEEE DOI
2302
Computational modeling, Semantic segmentation, Semantics,
Memory management, Pipelines, Transformers, Feature extraction,
segmentation
BibRef
Lan, H.[Hai],
Wang, X.[Xihao],
Shen, H.[Hao],
Liang, P.[Peidong],
Wei, X.[Xian],
Couplformer: Rethinking Vision Transformer with Coupling Attention,
WACV23(6464-6473)
IEEE DOI
2302
Couplings, Visualization, Image segmentation,
Computational modeling, Memory management, Object detection,
Visualization
BibRef
Marin, D.[Dmitrii],
Chang, J.H.R.[Jen-Hao Rick],
Ranjan, A.[Anurag],
Prabhu, A.[Anish],
Rastegari, M.[Mohammad],
Tuzel, O.[Oncel],
Token Pooling in Vision Transformers for Image Classification,
WACV23(12-21)
IEEE DOI
2302
Filtering, Semantic segmentation, Pose estimation, Transformers,
Encoding, Convolutional neural networks, and algorithms (including transfer)
BibRef
Song, C.H.[Chull Hwan],
Yoon, J.Y.[Joo-Young],
Choi, S.[Shunghyun],
Avrithis, Y.[Yannis],
Boosting vision transformers for image retrieval,
WACV23(107-117)
IEEE DOI
2302
Training, Location awareness, Image retrieval,
Self-supervised learning, Image representation, Transformers
BibRef
Yang, J.[Jinyu],
Liu, J.J.[Jing-Jing],
Xu, N.[Ning],
Huang, J.Z.[Jun-Zhou],
TVT: Transferable Vision Transformer for Unsupervised Domain
Adaptation,
WACV23(520-530)
IEEE DOI
2302
Benchmark testing, Image representation, Transformers,
Convolutional neural networks, Task analysis,
and algorithms (including transfer)
BibRef
Lin, K.E.[Kai-En],
Yen-Chen, L.[Lin],
Lai, W.S.[Wei-Sheng],
Lin, T.Y.[Tsung-Yi],
Shih, Y.C.[Yi-Chang],
Ramamoorthi, R.[Ravi],
Vision Transformer for NeRF-Based View Synthesis from a Single Input
Image,
WACV23(806-815)
IEEE DOI
2302
Shape, Pose estimation, Feature extraction, Transformers, Cameras,
Algorithms: Computational photography,
3D computer vision
BibRef
Saavedra-Ruiz, M.[Miguel],
Morin, S.[Sacha],
Paull, L.[Liam],
Monocular Robot Navigation with Self-Supervised Pretrained Vision
Transformers,
CRV22(197-204)
IEEE DOI
2301
Adaptation models, Image segmentation, Image resolution,
Navigation, Transformers, Robot sensing systems, Visual Servoing
BibRef
Debnath, B.[Biplob],
Po, O.[Oliver],
Chowdhury, F.A.[Farhan Asif],
Chakradhar, S.[Srimat],
Cosine Similarity based Few-Shot Video Classifier with
Attention-based Aggregation,
ICPR22(1273-1279)
IEEE DOI
2212
Training, Head, Pipelines, Benchmark testing, Feature extraction,
Transformers
BibRef
Patel, K.[Krushi],
Bur, A.M.[Andrés M.],
Li, F.J.[Feng-Jun],
Wang, G.H.[Guang-Hui],
Aggregating Global Features into Local Vision Transformer,
ICPR22(1141-1147)
IEEE DOI
2212
Source coding, Computational modeling,
Information processing, Performance gain, Transformers
BibRef
Shen, Z.Q.[Zhi-Qiang],
Liu, Z.[Zechun],
Xing, E.[Eric],
Sliced Recursive Transformer,
ECCV22(XXIV:727-744).
Springer DOI
2211
BibRef
Shao, Y.[Yidi],
Loy, C.C.[Chen Change],
Dai, B.[Bo],
Transformer with Implicit Edges for Particle-Based Physics Simulation,
ECCV22(XIX:549-564).
Springer DOI
2211
BibRef
Wang, W.[Wen],
Zhang, J.[Jing],
Cao, Y.[Yang],
Shen, Y.L.[Yong-Liang],
Tao, D.C.[Da-Cheng],
Towards Data-Efficient Detection Transformers,
ECCV22(IX:88-105).
Springer DOI
2211
BibRef
Mari, C.R.[Carlos Roig],
Gonzalez, D.V.[David Varas],
Bou-Balust, E.[Elisenda],
Multi-Scale Transformer-Based Feature Combination for Image Retrieval,
ICIP22(3166-3170)
IEEE DOI
2211
Visualization, Semantics, Image retrieval, Feature extraction,
Transformers, Internet, Image retrieval, Attention, Multi-scale,
Feature combination
BibRef
Lorenzana, M.B.[Marlon Bran],
Engstrom, C.[Craig],
Chandra, S.S.[Shekhar S.],
Transformer Compressed Sensing Via Global Image Tokens,
ICIP22(3011-3015)
IEEE DOI
2211
Training, Limiting, Image resolution, Neural networks,
Image representation, Transformers, MRI
BibRef
Furukawa, R.[Ryouichi],
Hotta, K.[Kazuhiro],
Local Embedding for Axial Attention,
ICIP22(2586-2590)
IEEE DOI
2211
Deep learning, Image segmentation, Visualization,
Computational modeling, Neural networks, Transformers
BibRef
Lu, X.Y.[Xiao-Yong],
Du, S.[Songlin],
NCTR: Neighborhood Consensus Transformer for Feature Matching,
ICIP22(2726-2730)
IEEE DOI
2211
Learning systems, Impedance matching, Aggregates, Pose estimation,
Neural networks, Transformers, Local feature matching,
graph neural network
BibRef
Jeny, A.A.[Afsana Ahsan],
Junayed, M.S.[Masum Shah],
Islam, M.B.[Md Baharul],
An Efficient End-To-End Image Compression Transformer,
ICIP22(1786-1790)
IEEE DOI
2211
Image coding, Correlation, Limiting, Computational modeling,
Rate-distortion, Video compression, Transformers, entropy model
BibRef
Shang, J.H.[Jing-Huan],
Kahatapitiya, K.[Kumara],
Li, X.[Xiang],
Ryoo, M.S.[Michael S.],
StARformer: Transformer with State-Action-Reward Representations for
Visual Reinforcement Learning,
ECCV22(XXIX:462-479).
Springer DOI
2211
BibRef
Kakogeorgiou, I.[Ioannis],
Gidaris, S.[Spyros],
Psomas, B.[Bill],
Avrithis, Y.[Yannis],
Bursuc, A.[Andrei],
Karantzalos, K.[Konstantinos],
Komodakis, N.[Nikos],
What to Hide from Your Students: Attention-Guided Masked Image Modeling,
ECCV22(XXX:300-318).
Springer DOI
2211
WWW Link.
BibRef
Bai, J.W.[Jia-Wang],
Yuan, L.[Li],
Xia, S.T.[Shu-Tao],
Yan, S.C.[Shui-Cheng],
Li, Z.F.[Zhi-Feng],
Liu, W.[Wei],
Improving Vision Transformers by Revisiting High-Frequency Components,
ECCV22(XXIV:1-18).
Springer DOI
2211
BibRef
Ding, M.Y.[Ming-Yu],
Xiao, B.[Bin],
Codella, N.[Noel],
Luo, P.[Ping],
Wang, J.D.[Jing-Dong],
Yuan, L.[Lu],
DaViT: Dual Attention Vision Transformers,
ECCV22(XXIV:74-92).
Springer DOI
2211
BibRef
Li, K.[Kehan],
Yu, R.[Runyi],
Wang, Z.[Zhennan],
Yuan, L.[Li],
Song, G.[Guoli],
Chen, J.[Jie],
Locality Guidance for Improving Vision Transformers on Tiny Datasets,
ECCV22(XXIV:110-127).
Springer DOI
2211
BibRef
Wang, P.[Pichao],
Wang, X.[Xue],
Wang, F.[Fan],
Lin, M.[Ming],
Chang, S.[Shuning],
Li, H.[Hao],
Jin, R.[Rong],
KVT: k-NN Attention for Boosting Vision Transformers,
ECCV22(XXIV:285-302).
Springer DOI
2211
BibRef
Tu, Z.Z.[Zheng-Zhong],
Talebi, H.[Hossein],
Zhang, H.[Han],
Yang, F.[Feng],
Milanfar, P.[Peyman],
Bovik, A.C.[Alan C.],
Li, Y.[Yinxiao],
MaxViT: Multi-axis Vision Transformer,
ECCV22(XXIV:459-479).
Springer DOI
2211
BibRef
Yang, R.[Rui],
Ma, H.L.[Hai-Long],
Wu, J.[Jie],
Tang, Y.[Yansong],
Xiao, X.F.[Xue-Feng],
Zheng, M.[Min],
Li, X.[Xiu],
ScalableViT: Rethinking the Context-Oriented Generalization of Vision
Transformer,
ECCV22(XXIV:480-496).
Springer DOI
2211
BibRef
Touvron, H.[Hugo],
Cord, M.[Matthieu],
El-Nouby, A.[Alaaeldin],
Verbeek, J.[Jakob],
Jégou, H.[Hervé],
Three Things Everyone Should Know About Vision Transformers,
ECCV22(XXIV:497-515).
Springer DOI
2211
BibRef
Touvron, H.[Hugo],
Cord, M.[Matthieu],
Jégou, H.[Hervé],
DeiT III: Revenge of the ViT,
ECCV22(XXIV:516-533).
Springer DOI
2211
BibRef
Li, Y.H.[Yang-Hao],
Mao, H.Z.[Han-Zi],
Girshick, R.[Ross],
He, K.M.[Kai-Ming],
Exploring Plain Vision Transformer Backbones for Object Detection,
ECCV22(IX:280-296).
Springer DOI
2211
BibRef
Yu, Q.H.[Qi-Hang],
Wang, H.Y.[Hui-Yu],
Qiao, S.Y.[Si-Yuan],
Collins, M.[Maxwell],
Zhu, Y.K.[Yu-Kun],
Adam, H.[Hartwig],
Yuille, A.L.[Alan L.],
Chen, L.C.[Liang-Chieh],
k-means Mask Transformer,
ECCV22(XXIX:288-307).
Springer DOI
2211
BibRef
Lezama, J.[José],
Chang, H.[Huiwen],
Jiang, L.[Lu],
Essa, I.[Irfan],
Improved Masked Image Generation with Token-Critic,
ECCV22(XXIII:70-86).
Springer DOI
2211
Generative transformer.
BibRef
Rao, Y.M.[Yong-Ming],
Zhao, W.[Wenliang],
Zhou, J.[Jie],
Lu, J.W.[Ji-Wen],
AMixer:
Adaptive Weight Mixing for Self-Attention Free Vision Transformers,
ECCV22(XXI:50-67).
Springer DOI
2211
BibRef
Pham, K.[Khoi],
Kafle, K.[Kushal],
Lin, Z.[Zhe],
Ding, Z.H.[Zhi-Hong],
Cohen, S.[Scott],
Tran, Q.[Quan],
Shrivastava, A.[Abhinav],
Improving Closed and Open-Vocabulary Attribute Prediction Using
Transformers,
ECCV22(XXV:201-219).
Springer DOI
2211
BibRef
Yu, W.X.[Wen-Xin],
Zhang, H.[Hongru],
Lan, T.X.[Tian-Xiang],
Hu, Y.C.[Yu-Cheng],
Yin, D.[Dong],
CBPT: A New Backbone for Enhancing Information Transmission of Vision
Transformers,
ICIP22(156-160)
IEEE DOI
2211
Merging, Information processing, Object detection, Transformers,
Computational complexity, Vision Transformer, Backbone
BibRef
Takeda, M.[Mana],
Yanai, K.[Keiji],
Continual Learning in Vision Transformer,
ICIP22(616-620)
IEEE DOI
2211
Learning systems, Image recognition, Transformers,
Natural language processing, Convolutional neural networks, Vision Transformer
BibRef
Zhou, W.L.[Wei-Lian],
Kamata, S.I.[Sei-Ichiro],
Luo, Z.[Zhengbo],
Xue, X.[Xi],
Rethinking Unified Spectral-Spatial-Based Hyperspectral Image
Classification Under 3D Configuration of Vision Transformer,
ICIP22(711-715)
IEEE DOI
2211
Flowcharts, Correlation, Convolution, Transformers,
Hyperspectral image classification, 3D coordinate positional embedding
BibRef
Li, A.[Ang],
Jiao, J.[Jichao],
Li, N.[Ning],
Qi, W.[Wangjing],
Xu, W.[Wei],
Pang, M.[Min],
Conmw Transformer: A General Vision Transformer Backbone With
Merged-Window Attention,
ICIP22(1551-1555)
IEEE DOI
2211
Image resolution, Convolution, Transformers, Feature extraction,
Tokenization, Computational efficiency, Vision Transformer,
hybrid architecture
BibRef
Li, J.[Junbo],
Zhang, H.[Huan],
Xie, C.[Cihang],
ViP: Unified Certified Detection and Recovery for Patch Attack with
Vision Transformers,
ECCV22(XXV:573-587).
Springer DOI
2211
BibRef
Zhang, Q.M.[Qi-Ming],
Xu, Y.[Yufei],
Zhang, J.[Jing],
Tao, D.C.[Da-Cheng],
VSA: Learning Varied-Size Window Attention in Vision Transformers,
ECCV22(XXV:466-483).
Springer DOI
2211
BibRef
Cao, Y.H.[Yun-Hao],
Yu, H.[Hao],
Wu, J.X.[Jian-Xin],
Training Vision Transformers with only 2040 Images,
ECCV22(XXV:220-237).
Springer DOI
2211
BibRef
Wang, C.[Cong],
Xu, H.M.[Hong-Min],
Zhang, X.[Xiong],
Wang, L.[Li],
Zheng, Z.[Zhitong],
Liu, H.F.[Hai-Feng],
Convolutional Embedding Makes Hierarchical Vision Transformer Stronger,
ECCV22(XX:739-756).
Springer DOI
2211
BibRef
Wu, B.[Boxi],
Gu, J.D.[Jin-Dong],
Li, Z.F.[Zhi-Feng],
Cai, D.[Deng],
He, X.F.[Xiao-Fei],
Liu, W.[Wei],
Towards Efficient Adversarial Training on Vision Transformers,
ECCV22(XIII:307-325).
Springer DOI
2211
BibRef
Gu, J.D.[Jin-Dong],
Tresp, V.[Volker],
Qin, Y.[Yao],
Are Vision Transformers Robust to Patch Perturbations?,
ECCV22(XII:404-421).
Springer DOI
2211
BibRef
Zong, Z.[Zhuofan],
Li, K.[Kunchang],
Song, G.[Guanglu],
Wang, Y.[Yali],
Qiao, Y.[Yu],
Leng, B.[Biao],
Liu, Y.[Yu],
Self-slimmed Vision Transformer,
ECCV22(XI:432-448).
Springer DOI
2211
BibRef
Fayyaz, M.[Mohsen],
Koohpayegani, S.A.[Soroush Abbasi],
Jafari, F.R.[Farnoush Rezaei],
Sengupta, S.[Sunando],
Joze, H.R.V.[Hamid Reza Vaezi],
Sommerlade, E.[Eric],
Pirsiavash, H.[Hamed],
Gall, J.[Jürgen],
Adaptive Token Sampling for Efficient Vision Transformers,
ECCV22(XI:396-414).
Springer DOI
2211
BibRef
Li, Z.K.[Zhi-Kai],
Ma, L.P.[Li-Ping],
Chen, M.J.[Meng-Juan],
Xiao, J.R.[Jun-Rui],
Gu, Q.Y.[Qing-Yi],
Patch Similarity Aware Data-Free Quantization for Vision Transformers,
ECCV22(XI:154-170).
Springer DOI
2211
BibRef
Weng, Z.J.[Ze-Jia],
Yang, X.T.[Xi-Tong],
Li, A.[Ang],
Wu, Z.X.[Zu-Xuan],
Jiang, Y.G.[Yu-Gang],
Semi-supervised Vision Transformers,
ECCV22(XXX:605-620).
Springer DOI
2211
BibRef
Mallick, R.[Rupayan],
Benois-Pineau, J.[Jenny],
Zemmari, A.[Akka],
I Saw: A Self-Attention Weighted Method for Explanation of Visual
Transformers,
ICIP22(3271-3275)
IEEE DOI
2211
Measurement, Correlation coefficient, Visualization,
Image segmentation, Databases, Object detection, Transformers,
Gaze Fixation Density Maps
BibRef
Su, T.[Tong],
Ye, S.[Shuo],
Song, C.Q.[Cheng-Qun],
Cheng, J.[Jun],
Mask-Vit: an Object Mask Embedding in Vision Transformer for
Fine-Grained Visual Classification,
ICIP22(1626-1630)
IEEE DOI
2211
Knowledge engineering, Visualization, Focusing, Interference,
Benchmark testing, Transformers, Feature extraction,
Knowledge Embedding
BibRef
Gai, L.[Lulu],
Chen, W.[Wei],
Gao, R.[Rui],
Chen, Y.W.[Yan-Wei],
Qiao, X.[Xu],
Using Vision Transformers in 3-D Medical Image Classifications,
ICIP22(696-700)
IEEE DOI
2211
Deep learning, Training, Visualization, Transfer learning,
Optimization methods, Self-supervised learning, Transformers,
3-D medical image classifications
BibRef
Wu, K.[Kan],
Zhang, J.[Jinnian],
Peng, H.[Houwen],
Liu, M.[Mengchen],
Xiao, B.[Bin],
Fu, J.L.[Jian-Long],
Yuan, L.[Lu],
TinyViT: Fast Pretraining Distillation for Small Vision Transformers,
ECCV22(XXI:68-85).
Springer DOI
2211
BibRef
Gao, L.[Li],
Nie, D.[Dong],
Li, B.[Bo],
Ren, X.F.[Xiao-Feng],
Doubly-Fused ViT: Fuse Information from Vision Transformer Doubly with
Local Representation,
ECCV22(XXIII:744-761).
Springer DOI
2211
BibRef
Yao, T.[Ting],
Pan, Y.[Yingwei],
Li, Y.[Yehao],
Ngo, C.W.[Chong-Wah],
Mei, T.[Tao],
Wave-ViT: Unifying Wavelet and Transformers for Visual Representation
Learning,
ECCV22(XXV:328-345).
Springer DOI
2211
BibRef
Yuan, Z.H.[Zhi-Hang],
Xue, C.H.[Chen-Hao],
Chen, Y.Q.[Yi-Qi],
Wu, Q.[Qiang],
Sun, G.[Guangyu],
PTQ4ViT: Post-training Quantization for Vision Transformers with Twin
Uniform Quantization,
ECCV22(XII:191-207).
Springer DOI
2211
BibRef
Kong, Z.L.[Zheng-Lun],
Dong, P.Y.[Pei-Yan],
Ma, X.L.[Xiao-Long],
Meng, X.[Xin],
Niu, W.[Wei],
Sun, M.S.[Meng-Shu],
Shen, X.[Xuan],
Yuan, G.[Geng],
Ren, B.[Bin],
Tang, H.[Hao],
Qin, M.[Minghai],
Wang, Y.Z.[Yan-Zhi],
SPViT:
Enabling Faster Vision Transformers via Latency-Aware Soft Token Pruning,
ECCV22(XI:620-640).
Springer DOI
2211
BibRef
Pan, J.[Junting],
Bulat, A.[Adrian],
Tan, F.[Fuwen],
Zhu, X.T.[Xia-Tian],
Dudziak, L.[Lukasz],
Li, H.S.[Hong-Sheng],
Tzimiropoulos, G.[Georgios],
Martinez, B.[Brais],
EdgeViTs: Competing Light-Weight CNNs on Mobile Devices with Vision
Transformers,
ECCV22(XI:294-311).
Springer DOI
2211
BibRef
Xu, R.S.[Run-Sheng],
Xiang, H.[Hao],
Tu, Z.Z.[Zheng-Zhong],
Xia, X.[Xin],
Yang, M.H.[Ming-Hsuan],
Ma, J.Q.[Jia-Qi],
V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision
Transformer,
ECCV22(XXIX:107-124).
Springer DOI
2211
BibRef
Liu, Y.[Yong],
Mai, S.Q.[Si-Qi],
Chen, X.N.[Xiang-Ning],
Hsieh, C.J.[Cho-Jui],
You, Y.[Yang],
Towards Efficient and Scalable Sharpness-Aware Minimization,
CVPR22(12350-12360)
IEEE DOI
2210
WWW Link. Training, Schedules, Scalability, Perturbation methods,
Stochastic processes, Transformers, Minimization,
Vision applications and systems
BibRef
Ren, P.Z.[Peng-Zhen],
Li, C.[Changlin],
Wang, G.[Guangrun],
Xiao, Y.[Yun],
Du, Q.[Qing],
Liang, X.D.[Xiao-Dan],
Chang, X.J.[Xiao-Jun],
Beyond Fixation: Dynamic Window Visual Transformer,
CVPR22(11977-11987)
IEEE DOI
2210
Performance evaluation, Visualization, Systematics,
Computational modeling, Scalability, Transformers,
Deep learning architectures and techniques
BibRef
Liu, Z.[Ze],
Hu, H.[Han],
Lin, Y.T.[Yu-Tong],
Yao, Z.L.[Zhu-Liang],
Xie, Z.D.[Zhen-Da],
Wei, Y.X.[Yi-Xuan],
Ning, J.[Jia],
Cao, Y.[Yue],
Zhang, Z.[Zheng],
Dong, L.[Li],
Wei, F.[Furu],
Guo, B.[Baining],
Swin Transformer V2: Scaling Up Capacity and Resolution,
CVPR22(11999-12009)
IEEE DOI
2210
Training, Representation learning, Adaptation models,
Image resolution, Computational modeling, Semantics,
Representation learning
BibRef
Bhattacharjee, D.[Deblina],
Zhang, T.[Tong],
Süsstrunk, S.[Sabine],
Salzmann, M.[Mathieu],
MulT: An End-to-End Multitask Learning Transformer,
CVPR22(12021-12031)
IEEE DOI
2210
Heart, Image segmentation, Computational modeling,
Image edge detection, Semantics, Estimation, Predictive models,
Scene analysis and understanding
BibRef
Fang, J.[Jiemin],
Xie, L.X.[Ling-Xi],
Wang, X.G.[Xing-Gang],
Zhang, X.P.[Xiao-Peng],
Liu, W.Y.[Wen-Yu],
Tian, Q.[Qi],
MSG-Transformer:
Exchanging Local Spatial Information by Manipulating Messenger Tokens,
CVPR22(12053-12062)
IEEE DOI
2210
Deep learning, Visualization, Neural networks,
Graphics processing units, retrieval
BibRef
Sandler, M.[Mark],
Zhmoginov, A.[Andrey],
Vladymyrov, M.[Max],
Jackson, A.[Andrew],
Fine-tuning Image Transformers using Learnable Memory,
CVPR22(12145-12154)
IEEE DOI
2210
Deep learning, Adaptation models, Costs, Computational modeling,
Memory management, Transformers, Transfer/low-shot/long-tail learning
BibRef
Yu, X.[Xumin],
Tang, L.[Lulu],
Rao, Y.M.[Yong-Ming],
Huang, T.J.[Tie-Jun],
Zhou, J.[Jie],
Lu, J.W.[Ji-Wen],
Point-BERT: Pre-training 3D Point Cloud Transformers with Masked
Point Modeling,
CVPR22(19291-19300)
IEEE DOI
2210
Point cloud compression, Solid modeling, Computational modeling,
Bit error rate, Transformers, Pattern recognition,
Deep learning architectures and techniques
BibRef
Park, C.[Chunghyun],
Jeong, Y.[Yoonwoo],
Cho, M.[Minsu],
Park, J.[Jaesik],
Fast Point Transformer,
CVPR22(16928-16937)
IEEE DOI
2210
Point cloud compression, Shape, Semantics, Neural networks,
Transformers, grouping and shape analysis
BibRef
Ren, S.[Sucheng],
Zhou, D.[Daquan],
He, S.F.[Sheng-Feng],
Feng, J.S.[Jia-Shi],
Wang, X.C.[Xin-Chao],
Shunted Self-Attention via Multi-Scale Token Aggregation,
CVPR22(10843-10852)
IEEE DOI
2210
Degradation, Deep learning, Costs, Computational modeling, Merging,
Efficient learning and inferences
BibRef
Zeng, W.[Wang],
Jin, S.[Sheng],
Liu, W.T.[Wen-Tao],
Qian, C.[Chen],
Luo, P.[Ping],
Ouyang, W.L.[Wan-Li],
Wang, X.G.[Xiao-Gang],
Not All Tokens Are Equal:
Human-centric Visual Analysis via Token Clustering Transformer,
CVPR22(11091-11101)
IEEE DOI
2210
Visualization, Shape, Pose estimation, Semantics,
Pose estimation and tracking,
Deep learning architectures and techniques
BibRef
Yu, W.H.[Wei-Hao],
Luo, M.[Mi],
Zhou, P.[Pan],
Si, C.Y.[Chen-Yang],
Zhou, Y.C.[Yi-Chen],
Wang, X.C.[Xin-Chao],
Feng, J.S.[Jia-Shi],
Yan, S.C.[Shui-Cheng],
MetaFormer is Actually What You Need for Vision,
CVPR22(10809-10819)
IEEE DOI
2210
Computational modeling, Focusing,
Transformers, Pattern recognition, Task analysis, retrieval
BibRef
Xie, Z.D.[Zhen-Da],
Zhang, Z.[Zheng],
Cao, Y.[Yue],
Lin, Y.T.[Yu-Tong],
Bao, J.M.[Jian-Min],
Yao, Z.L.[Zhu-Liang],
Dai, Q.[Qi],
Hu, H.[Han],
SimMIM: a Simple Framework for Masked Image Modeling,
CVPR22(9643-9653)
IEEE DOI
2210
WWW Link. Representation learning, Training, Head, Self-supervised learning,
Predictive models, Data models, Self- semi- meta- Representation learning
BibRef
Song, Z.[Zikai],
Yu, J.Q.[Jun-Qing],
Chen, Y.P.P.[Yi-Ping Phoebe],
Yang, W.[Wei],
Transformer Tracking with Cyclic Shifting Window Attention,
CVPR22(8781-8790)
IEEE DOI
2210
WWW Link. Visualization, Target tracking, Image recognition,
Optimization methods, Benchmark testing
BibRef
Tu, Z.Z.[Zheng-Zhong],
Talebi, H.[Hossein],
Zhang, H.[Han],
Yang, F.[Feng],
Milanfar, P.[Peyman],
Bovik, A.[Alan],
Li, Y.X.[Yin-Xiao],
MAXIM: Multi-Axis MLP for Image Processing,
CVPR22(5759-5770)
IEEE DOI
2210
WWW Link. Training, Photography, Adaptation models, Visualization,
Computational modeling, Transformers, Low-level vision,
Computational photography
BibRef
Chen, Z.[Zhe],
Zhang, J.[Jing],
Tao, D.C.[Da-Cheng],
Recurrent Glimpse-based Decoder for Detection with Transformer,
CVPR22(5250-5259)
IEEE DOI
2210
WWW Link. Training, Visualization, Pipelines, Detectors, Feature extraction,
Transformers, Recognition: detection, categorization, retrieval
BibRef
Yun, S.[Sukmin],
Lee, H.[Hankook],
Kim, J.[Jaehyung],
Shin, J.[Jinwoo],
Patch-level Representation Learning for Self-supervised Vision
Transformers,
CVPR22(8344-8353)
IEEE DOI
2210
Training, Representation learning, Visualization, Neural networks,
Object detection, Self-supervised learning, Transformers,
Self- semi- meta- unsupervised learning
BibRef
Hou, Z.J.[Ze-Jiang],
Kung, S.Y.[Sun-Yuan],
Multi-Dimensional Vision Transformer Compression via Dependency
Guided Gaussian Process Search,
EVW22(3668-3677)
IEEE DOI
2210
Adaptation models, Image coding, Head, Computational modeling,
Neurons, Gaussian processes, Transformers
BibRef
Zhang, G.J.[Gong-Jie],
Luo, Z.P.[Zhi-Peng],
Yu, Y.C.[Ying-Chen],
Cui, K.[Kaiwen],
Lu, S.J.[Shi-Jian],
Accelerating DETR Convergence via Semantic-Aligned Matching,
CVPR22(939-948)
IEEE DOI
2210
Code, Detection Transformer.
WWW Link. Training, Costs, Semantics, Object detection, Transformers,
Feature extraction, Recognition: detection, categorization,
Motion and tracking
BibRef
Gupta, A.[Akshita],
Narayan, S.[Sanath],
Joseph, K.J.[K J],
Khan, S.[Salman],
Khan, F.S.[Fahad Shahbaz],
Shah, M.[Mubarak],
OW-DETR: Open-world Detection Transformer,
CVPR22(9225-9234)
IEEE DOI
2210
Training, Object detection, Transformers, Pattern recognition,
Proposals, Object recognition, retrieval, categorization,
Recognition: detection
BibRef
Lou, Q.[Qian],
Hsu, Y.C.[Yen-Chang],
Uzkent, B.[Burak],
Hua, T.[Ting],
Shen, Y.[Yilin],
Jin, H.X.[Hong-Xia],
Lite-MDETR: A Lightweight Multi-Modal Detector,
CVPR22(12196-12205)
IEEE DOI
2210
Training, Performance evaluation, Visualization, Dictionaries,
Grounding, Detectors, Transformers
BibRef
Li, F.[Feng],
Zhang, H.[Hao],
Liu, S.[Shilong],
Guo, J.[Jian],
Ni, L.M.[Lionel M.],
Zhang, L.[Lei],
DN-DETR: Accelerate DETR Training by Introducing Query DeNoising,
CVPR22(13609-13617)
IEEE DOI
2210
Training, Codes, Machine vision, Noise reduction, Transformers,
Pattern recognition, Decoding, Recognition: detection,
Vision applications and systems
BibRef
La Bonte, T.[Tyler],
Song, Y.[Yale],
Wang, X.[Xin],
Vineet, V.[Vibhav],
Joshi, N.[Neel],
Scaling Novel Object Detection with Weakly Supervised Detection
Transformers,
WACV23(85-96)
IEEE DOI
2302
Training, Costs, Surveillance, Object detection, Detectors,
Transformers, Data models, and un-supervised learning)
BibRef
Bar, A.[Amir],
Wang, X.[Xin],
Kantorov, V.[Vadim],
Reed, C.J.[Colorado J.],
Herzig, R.[Roei],
Chechik, G.[Gal],
Rohrbach, A.[Anna],
Darrell, T.J.[Trevor J.],
Globerson, A.[Amir],
DETReg: Unsupervised Pretraining with Region Priors for Object
Detection,
CVPR22(14585-14595)
IEEE DOI
2210
Location awareness, Training, Object detection, Detectors,
Transformers, Generators, Representation learning
BibRef
Salman, H.[Hadi],
Jain, S.[Saachi],
Wong, E.[Eric],
Madry, A.[Aleksander],
Certified Patch Robustness via Smoothed Vision Transformers,
CVPR22(15116-15126)
IEEE DOI
2210
Visualization, Smoothing methods, Costs, Computational modeling,
Transformers, Adversarial attack and defense
BibRef
Wang, Y.K.[Yi-Kai],
Chen, X.H.[Xing-Hao],
Cao, L.[Lele],
Huang, W.B.[Wen-Bing],
Sun, F.C.[Fu-Chun],
Wang, Y.H.[Yun-He],
Multimodal Token Fusion for Vision Transformers,
CVPR22(12176-12185)
IEEE DOI
2210
Point cloud compression, Image segmentation, Shape, Semantics,
Object detection,
Vision+X
BibRef
Tang, Y.[Yehui],
Han, K.[Kai],
Wang, Y.H.[Yun-He],
Xu, C.[Chang],
Guo, J.Y.[Jian-Yuan],
Xu, C.[Chao],
Tao, D.C.[Da-Cheng],
Patch Slimming for Efficient Vision Transformers,
CVPR22(12155-12164)
IEEE DOI
2210
Visualization, Quantization (signal), Computational modeling,
Aggregates, Benchmark testing,
Representation learning
BibRef
Zhang, J.[Jinnian],
Peng, H.[Houwen],
Wu, K.[Kan],
Liu, M.[Mengchen],
Xiao, B.[Bin],
Fu, J.L.[Jian-Long],
Yuan, L.[Lu],
MiniViT: Compressing Vision Transformers with Weight Multiplexing,
CVPR22(12135-12144)
IEEE DOI
2210
Multiplexing, Performance evaluation, Image coding, Codes,
Computational modeling, Benchmark testing,
Vision applications and systems
BibRef
Chen, J.N.[Jie-Neng],
Sun, S.[Shuyang],
He, J.[Ju],
Torr, P.H.S.[Philip H.S.],
Yuille, A.L.[Alan L.],
Bai, S.[Song],
TransMix: Attend to Mix for Vision Transformers,
CVPR22(12125-12134)
IEEE DOI
2210
Training, Image segmentation, Codes, Semantics, Object detection,
Benchmark testing, Transformers,
Representation learning
BibRef
Dong, X.Y.[Xiao-Yi],
Bao, J.[Jianmin],
Chen, D.D.[Dong-Dong],
Zhang, W.M.[Wei-Ming],
Yu, N.H.[Neng-Hai],
Yuan, L.[Lu],
Chen, D.[Dong],
Guo, B.[Baining],
CSWin Transformer: A General Vision Transformer Backbone with
Cross-Shaped Windows,
CVPR22(12114-12124)
IEEE DOI
2210
Image segmentation, Costs, Mathematical analysis, Training data,
Transformer cores, Transformers,
grouping and shape analysis
BibRef
Liu, H.[Hao],
Jiang, X.H.[Xing-Hua],
Li, X.[Xin],
Bao, Z.M.[Zhi-Min],
Jiang, D.Q.[De-Qiang],
Ren, B.[Bo],
NomMer: Nominate Synergistic Context in Vision Transformer for Visual
Recognition,
CVPR22(12063-12072)
IEEE DOI
2210
Visualization, Image segmentation, Semantics, Redundancy,
Object detection, Deep learning architectures and techniques
BibRef
Chen, T.L.[Tian-Long],
Zhang, Z.Y.[Zhen-Yu],
Cheng, Y.[Yu],
Awadallah, A.[Ahmed],
Wang, Z.Y.[Zhang-Yang],
The Principle of Diversity: Training Stronger Vision Transformers
Calls for Reducing All Levels of Redundancy,
CVPR22(12010-12020)
IEEE DOI
2210
Training, Convolutional codes, Deep learning,
Computational modeling, Redundancy, Deep learning architectures and techniques
BibRef
Yang, C.[Chenglin],
Wang, Y.[Yilin],
Zhang, J.M.[Jian-Ming],
Zhang, H.[He],
Wei, Z.J.[Zi-Jun],
Lin, Z.[Zhe],
Yuille, A.L.[Alan L.],
Lite Vision Transformer with Enhanced Self-Attention,
CVPR22(11988-11998)
IEEE DOI
2210
Convolutional codes, Image segmentation, Visualization,
Convolution, Semantics, Merging, Predictive models, Deep learning architectures and techniques
BibRef
Yin, H.X.[Hong-Xu],
Vahdat, A.[Arash],
Alvarez, J.M.[Jose M.],
Mallya, A.[Arun],
Kautz, J.[Jan],
Molchanov, P.[Pavlo],
A-ViT: Adaptive Tokens for Efficient Vision Transformer,
CVPR22(10799-10808)
IEEE DOI
2210
Training, Adaptive systems, Network architecture, Transformers,
Throughput, Hardware, Complexity theory,
Efficient learning and inferences
BibRef
Lu, J.H.[Jia-Hao],
Zhang, X.S.[Xi Sheryl],
Zhao, T.L.[Tian-Li],
He, X.Y.[Xiang-Yu],
Cheng, J.[Jian],
APRIL: Finding the Achilles' Heel on Privacy for Vision Transformers,
CVPR22(10041-10050)
IEEE DOI
2210
Privacy, Data privacy, Federated learning, Computational modeling,
Training data, Transformers, Market research, Privacy and federated learning
BibRef
Hatamizadeh, A.[Ali],
Yin, H.X.[Hong-Xu],
Roth, H.[Holger],
Li, W.Q.[Wen-Qi],
Kautz, J.[Jan],
Xu, D.[Daguang],
Molchanov, P.[Pavlo],
GradViT: Gradient Inversion of Vision Transformers,
CVPR22(10011-10020)
IEEE DOI
2210
Measurement, Differential privacy, Neural networks, Transformers,
Pattern recognition, Security, Iterative methods, Privacy and federated learning
BibRef
Zhang, H.[Haofei],
Duan, J.R.[Jia-Rui],
Xue, M.Q.[Meng-Qi],
Song, J.[Jie],
Sun, L.[Li],
Song, M.L.[Ming-Li],
Bootstrapping ViTs: Towards Liberating Vision Transformers from
Pre-training,
CVPR22(8934-8943)
IEEE DOI
2210
Training, Upper bound, Neural networks, Training data,
Network architecture, Transformers, Computer vision theory,
Efficient learning and inferences
BibRef
Chavan, A.[Arnav],
Shen, Z.Q.[Zhi-Qiang],
Liu, Z.[Zhuang],
Liu, Z.[Zechun],
Cheng, K.T.[Kwang-Ting],
Xing, E.[Eric],
Vision Transformer Slimming:
Multi-Dimension Searching in Continuous Optimization Space,
CVPR22(4921-4931)
IEEE DOI
2210
Training, Performance evaluation, Image coding, Force,
Graphics processing units,
Vision applications and systems
BibRef
Xia, Z.F.[Zhuo-Fan],
Pan, X.[Xuran],
Song, S.[Shiji],
Li, L.E.[Li Erran],
Huang, G.[Gao],
Vision Transformer with Deformable Attention,
CVPR22(4784-4793)
IEEE DOI
2210
Deformable models, Adaptation models, Computational modeling,
Predictive models, Transformers, Data models,
grouping and shape analysis
BibRef
Hong, W.X.[Wei-Xiang],
Lao, J.W.[Jiang-Wei],
Ren, W.[Wang],
Wang, J.[Jian],
Chen, J.D.[Jing-Dong],
Chu, W.[Wei],
Training Object Detectors from Scratch: An Empirical Study in the Era
of Vision Transformer,
CVPR22(4652-4661)
IEEE DOI
2210
Training, Visualization, Semantics, Detectors, Object detection,
Transformers, Recognition: detection, categorization, retrieval, Deep learning architectures and techniques
BibRef
Chen, Z.Y.[Zhao-Yu],
Li, B.[Bo],
Wu, S.[Shuang],
Xu, J.H.[Jiang-He],
Ding, S.H.[Shou-Hong],
Zhang, W.Q.[Wen-Qiang],
Shape Matters: Deformable Patch Attack,
ECCV22(IV:529-548).
Springer DOI
2211
BibRef
Chen, Z.Y.[Zhao-Yu],
Li, B.[Bo],
Xu, J.H.[Jiang-He],
Wu, S.[Shuang],
Ding, S.H.[Shou-Hong],
Zhang, W.Q.[Wen-Qiang],
Towards Practical Certifiable Patch Defense with Vision Transformer,
CVPR22(15127-15137)
IEEE DOI
2210
Smoothing methods, Toy manufacturing industry, Semantics,
Network architecture, Transformers, Robustness,
Adversarial attack and defense
BibRef
Chen, R.J.[Richard J.],
Chen, C.[Chengkuan],
Li, Y.C.[Yi-Cong],
Chen, T.Y.[Tiffany Y.],
Trister, A.D.[Andrew D.],
Krishnan, R.G.[Rahul G.],
Mahmood, F.[Faisal],
Scaling Vision Transformers to Gigapixel Images via Hierarchical
Self-Supervised Learning,
CVPR22(16123-16134)
IEEE DOI
2210
Training, Visualization, Self-supervised learning,
Image representation, Transformers,
Self- semi- meta- unsupervised learning
BibRef
Yang, Z.[Zhao],
Wang, J.Q.[Jia-Qi],
Tang, Y.S.[Yan-Song],
Chen, K.[Kai],
Zhao, H.S.[Heng-Shuang],
Torr, P.H.S.[Philip H.S.],
LAVT: Language-Aware Vision Transformer for Referring Image
Segmentation,
CVPR22(18134-18144)
IEEE DOI
2210
Image segmentation, Visualization, Image coding, Shape, Linguistics,
Transformers, Feature extraction, Segmentation, grouping and shape analysis
BibRef
Scheibenreif, L.[Linus],
Hanna, J.[Joëlle],
Mommert, M.[Michael],
Borth, D.[Damian],
Self-supervised Vision Transformers for Land-cover Segmentation and
Classification,
EarthVision22(1421-1430)
IEEE DOI
2210
Training, Earth, Image segmentation, Computational modeling,
Conferences, Transformers
BibRef
Zhai, X.H.[Xiao-Hua],
Kolesnikov, A.[Alexander],
Houlsby, N.[Neil],
Beyer, L.[Lucas],
Scaling Vision Transformers,
CVPR22(1204-1213)
IEEE DOI
2210
Training, Error analysis, Computational modeling, Neural networks,
Memory management, Training data,
Transfer/low-shot/long-tail learning
BibRef
Guo, J.Y.[Jian-Yuan],
Han, K.[Kai],
Wu, H.[Han],
Tang, Y.[Yehui],
Chen, X.H.[Xing-Hao],
Wang, Y.H.[Yun-He],
Xu, C.[Chang],
CMT: Convolutional Neural Networks Meet Vision Transformers,
CVPR22(12165-12175)
IEEE DOI
2210
Visualization, Image recognition, Force,
Object detection, Transformers,
Representation learning
BibRef
Meng, L.C.[Ling-Chen],
Li, H.D.[Heng-Duo],
Chen, B.C.[Bor-Chun],
Lan, S.Y.[Shi-Yi],
Wu, Z.X.[Zu-Xuan],
Jiang, Y.G.[Yu-Gang],
Lim, S.N.[Ser-Nam],
AdaViT: Adaptive Vision Transformers for Efficient Image Recognition,
CVPR22(12299-12308)
IEEE DOI
2210
Image recognition, Head, Law enforcement, Computational modeling,
Redundancy, Transformers, Efficient learning and inferences,
retrieval
BibRef
Herrmann, C.[Charles],
Sargent, K.[Kyle],
Jiang, L.[Lu],
Zabih, R.[Ramin],
Chang, H.[Huiwen],
Liu, C.[Ce],
Krishnan, D.[Dilip],
Sun, D.Q.[De-Qing],
Pyramid Adversarial Training Improves ViT Performance,
CVPR22(13409-13419)
IEEE DOI
2210
Training, Image recognition, Stochastic processes,
Transformers, Robustness, retrieval,
Recognition: detection
BibRef
Li, C.L.[Chang-Lin],
Zhuang, B.[Bohan],
Wang, G.R.[Guang-Run],
Liang, X.D.[Xiao-Dan],
Chang, X.J.[Xiao-Jun],
Yang, Y.[Yi],
Automated Progressive Learning for Efficient Training of Vision
Transformers,
CVPR22(12476-12486)
IEEE DOI
2210
Training, Adaptation models, Schedules, Computational modeling,
Estimation, Manuals, Transformers, Representation learning
BibRef
Yu, T.[Tong],
Khalitov, R.[Ruslan],
Cheng, L.[Lei],
Yang, Z.R.[Zhi-Rong],
Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better
than Dot-Product Self-Attention,
CVPR22(681-690)
IEEE DOI
2210
Protocols, Costs, Scalability, Neural networks, Stacking, Genomics,
Transformers, Deep learning architectures and techniques,
Representation learning
BibRef
Guo, J.Y.[Jian-Yuan],
Tang, Y.H.[Ye-Hui],
Han, K.[Kai],
Chen, X.H.[Xing-Hao],
Wu, H.[Han],
Xu, C.[Chao],
Xu, C.[Chang],
Wang, Y.H.[Yun-He],
Hire-MLP: Vision MLP via Hierarchical Rearrangement,
CVPR22(816-826)
IEEE DOI
2210
Representation learning, Image segmentation, Semantics,
Object detection, Transformers,
Representation learning
BibRef
Cheng, B.[Bowen],
Misra, I.[Ishan],
Schwing, A.G.[Alexander G.],
Kirillov, A.[Alexander],
Girdhar, R.[Rohit],
Masked-attention Mask Transformer for Universal Image Segmentation,
CVPR22(1280-1289)
IEEE DOI
2210
Image segmentation, Shape, Computational modeling, Semantics,
Transformers, Feature extraction,
retrieval
BibRef
Pu, M.Y.[Meng-Yang],
Huang, Y.P.[Ya-Ping],
Liu, Y.M.[Yu-Ming],
Guan, Q.J.[Qing-Ji],
Ling, H.B.[Hai-Bin],
EDTER: Edge Detection with Transformer,
CVPR22(1392-1402)
IEEE DOI
2210
Head, Image edge detection, Semantics, Detectors, Transformers,
Feature extraction, Segmentation, grouping and shape analysis,
Scene analysis and understanding
BibRef
Rangrej, S.B.[Samrudhdhi B.],
Srinidhi, C.L.[Chetan L.],
Clark, J.J.[James J.],
Consistency driven Sequential Transformers Attention Model for
Partially Observable Scenes,
CVPR22(2508-2517)
IEEE DOI
2210
Training, Computational modeling, Imaging, Predictive models,
Transformers, Prediction algorithms, Visual reasoning
BibRef
Zhu, R.[Rui],
Li, Z.Q.[Zheng-Qin],
Matai, J.[Janarbek],
Porikli, F.M.[Fatih M.],
Chandraker, M.[Manmohan],
IRISformer: Dense Vision Transformers for Single-Image Inverse
Rendering in Indoor Scenes,
CVPR22(2812-2821)
IEEE DOI
2210
Photorealism, Shape, Computational modeling, Lighting,
Transformers,
Physics-based vision and shape-from-X
BibRef
Ermolov, A.[Aleksandr],
Mirvakhabova, L.[Leyla],
Khrulkov, V.[Valentin],
Sebe, N.[Nicu],
Oseledets, I.[Ivan],
Hyperbolic Vision Transformers: Combining Improvements in Metric
Learning,
CVPR22(7399-7409)
IEEE DOI
2210
Measurement, Geometry, Visualization, Semantics,
Self-supervised learning, Transformer cores, Transformers,
Representation learning
BibRef
Lee, Y.[Youngwan],
Kim, J.[Jonghee],
Willette, J.[Jeffrey],
Hwang, S.J.[Sung Ju],
MPViT: Multi-Path Vision Transformer for Dense Prediction,
CVPR22(7277-7286)
IEEE DOI
2210
Image segmentation, Semantics, Object detection, Transformers,
Feature extraction, Pattern recognition, Recognition: detection,
Representation learning
BibRef
Zhang, C.Z.[Chong-Zhi],
Zhang, M.Y.[Ming-Yuan],
Zhang, S.H.[Shang-Hang],
Jin, D.S.[Dai-Sheng],
Zhou, Q.[Qiang],
Cai, Z.A.[Zhong-Ang],
Zhao, H.[Haiyu],
Liu, X.L.[Xiang-Long],
Liu, Z.[Ziwei],
Delving Deep into the Generalization of Vision Transformers under
Distribution Shifts,
CVPR22(7267-7276)
IEEE DOI
2210
Training, Representation learning, Systematics, Shape, Taxonomy,
Self-supervised learning, Transformers, Recognition: detection,
Representation learning
BibRef
Hou, Z.[Zhi],
Yu, B.[Baosheng],
Tao, D.C.[Da-Cheng],
BatchFormer: Learning to Explore Sample Relationships for Robust
Representation Learning,
CVPR22(7246-7256)
IEEE DOI
2210
Training, Deep learning, Representation learning, Neural networks,
Tail, Transformers, Transfer/low-shot/long-tail learning,
Self- semi- meta- unsupervised learning
BibRef
Zamir, S.W.[Syed Waqas],
Arora, A.[Aditya],
Khan, S.[Salman],
Hayat, M.[Munawar],
Khan, F.S.[Fahad Shahbaz],
Yang, M.H.[Ming-Hsuan],
Restormer: Efficient Transformer for High-Resolution Image
Restoration,
CVPR22(5718-5729)
IEEE DOI
2210
Computational modeling, Transformer cores,
Transformers, Data models, Image restoration, Task analysis,
Deep learning architectures and techniques
BibRef
Zhao, H.S.[Heng-Shuang],
Jiang, L.[Li],
Jia, J.Y.[Jia-Ya],
Torr, P.H.S.[Philip H.S.],
Koltun, V.[Vladlen],
Point Transformer,
ICCV21(16239-16248)
IEEE DOI
2203
Point cloud compression, Measurement, Image segmentation,
Semantics, Object detection, Transformer cores,
Recognition and classification
BibRef
Lin, K.[Kevin],
Wang, L.J.[Li-Juan],
Liu, Z.C.[Zi-Cheng],
Mesh Graphormer,
ICCV21(12919-12928)
IEEE DOI
2203
Convolutional codes, Solid modeling, Network topology,
Transformers, Gestures and body pose
BibRef
Casey, E.[Evan],
Pérez, V.[Víctor],
Li, Z.[Zhuoru],
The Animation Transformer: Visual Correspondence via Segment Matching,
ICCV21(11303-11312)
IEEE DOI
2203
Visualization, Image segmentation, Image color analysis,
Production, Animation, Transformers,
grouping and shape
BibRef
Reizenstein, J.[Jeremy],
Shapovalov, R.[Roman],
Henzler, P.[Philipp],
Sbordone, L.[Luca],
Labatut, P.[Patrick],
Novotny, D.[David],
Common Objects in 3D: Large-Scale Learning and Evaluation of
Real-life 3D Category Reconstruction,
ICCV21(10881-10891)
IEEE DOI
2203
Award, Marr Prize, HM. Point cloud compression, Transformers,
Rendering (computer graphics), Cameras, Image reconstruction,
3D from multiview and other sensors
BibRef
Mariotti, O.[Octave],
Mac Aodha, O.[Oisin],
Bilen, H.[Hakan],
ViewNet: Unsupervised Viewpoint Estimation from Conditional
Generation,
ICCV21(10398-10408)
IEEE DOI
2203
Training, Annotations, Estimation, Benchmark testing, Transformers,
Representation learning, Transfer/Low-shot/Semi/Unsupervised Learning
BibRef
Feng, W.X.[Wei-Xin],
Wang, Y.J.[Yuan-Jiang],
Ma, L.H.[Li-Hua],
Yuan, Y.[Ye],
Zhang, C.[Chi],
Temporal Knowledge Consistency for Unsupervised Visual Representation
Learning,
ICCV21(10150-10160)
IEEE DOI
2203
Training, Representation learning, Visualization, Protocols,
Object detection, Semisupervised learning, Transformers,
Transfer/Low-shot/Semi/Unsupervised Learning
BibRef
Wu, H.P.[Hai-Ping],
Xiao, B.[Bin],
Codella, N.[Noel],
Liu, M.C.[Meng-Chen],
Dai, X.Y.[Xi-Yang],
Yuan, L.[Lu],
Zhang, L.[Lei],
CvT: Introducing Convolutions to Vision Transformers,
ICCV21(22-31)
IEEE DOI
2203
Code, Vision Transformer.
WWW Link. Convolutional codes, Image resolution, Image recognition,
Performance gain, Transformers, Distortion,
BibRef
Touvron, H.[Hugo],
Cord, M.[Matthieu],
Sablayrolles, A.[Alexandre],
Synnaeve, G.[Gabriel],
Jégou, H.[Hervé],
Going deeper with Image Transformers,
ICCV21(32-42)
IEEE DOI
2203
Training, Neural networks, Training data,
Data models, Circuit faults, Recognition and classification,
Optimization and learning methods
BibRef
Zhao, J.W.[Jia-Wei],
Yan, K.[Ke],
Zhao, Y.F.[Yi-Fan],
Guo, X.W.[Xiao-Wei],
Huang, F.Y.[Fei-Yue],
Li, J.[Jia],
Transformer-based Dual Relation Graph for Multi-label Image
Recognition,
ICCV21(163-172)
IEEE DOI
2203
Image recognition, Correlation, Computational modeling, Semantics,
Benchmark testing,
Representation learning
BibRef
Chen, C.F.R.[Chun-Fu Richard],
Fan, Q.F.[Quan-Fu],
Panda, R.[Rameswar],
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image
Classification,
ICCV21(347-356)
IEEE DOI
2203
Image segmentation, Image recognition, Computational modeling,
Semantics, Memory management, Object detection,
Representation learning
BibRef
Peng, Z.L.[Zhi-Liang],
Huang, W.[Wei],
Gu, S.Z.[Shan-Zhi],
Xie, L.X.[Ling-Xi],
Wang, Y.[Yaowei],
Jiao, J.B.[Jian-Bin],
Ye, Q.X.[Qi-Xiang],
Conformer: Local Features Coupling Global Representations for Visual
Recognition,
ICCV21(357-366)
IEEE DOI
2203
Couplings, Representation learning, Visualization, Fuses,
Convolution, Object detection, Transformers,
Representation learning
BibRef
Pan, Z.Z.[Zi-Zheng],
Zhuang, B.[Bohan],
Liu, J.[Jing],
He, H.Y.[Hao-Yu],
Cai, J.F.[Jian-Fei],
Scalable Vision Transformers with Hierarchical Pooling,
ICCV21(367-376)
IEEE DOI
2203
Visualization, Image recognition, Computational modeling,
Scalability, Transformers, Computational efficiency,
Efficient training and inference methods
BibRef
Yue, X.Y.[Xiao-Yu],
Sun, S.Y.[Shu-Yang],
Kuang, Z.H.[Zhang-Hui],
Wei, M.[Meng],
Torr, P.H.S.[Philip H.S.],
Zhang, W.[Wayne],
Lin, D.[Dahua],
Vision Transformer with Progressive Sampling,
ICCV21(377-386)
IEEE DOI
2203
Codes, Computational modeling, Interference,
Transformers, Feature extraction, Recognition and classification,
Representation learning
BibRef
Chefer, H.[Hila],
Gur, S.[Shir],
Wolf, L.B.[Lior B.],
Generic Attention-model Explainability for Interpreting Bi-Modal and
Encoder-Decoder Transformers,
ICCV21(387-396)
IEEE DOI
2203
Measurement, Visualization, Image segmentation,
Computational modeling, Object detection,
BibRef
Yuan, L.[Li],
Chen, Y.P.[Yun-Peng],
Wang, T.[Tao],
Yu, W.H.[Wei-Hao],
Shi, Y.J.[Yu-Jun],
Jiang, Z.H.[Zi-Hang],
Tay, F.E.H.[Francis E. H.],
Feng, J.S.[Jia-Shi],
Yan, S.C.[Shui-Cheng],
Tokens-to-Token ViT:
Training Vision Transformers from Scratch on ImageNet,
ICCV21(538-547)
IEEE DOI
2203
Training, Image resolution, Computational modeling,
Image edge detection, Transformers,
BibRef
Wu, B.[Bichen],
Xu, C.F.[Chen-Feng],
Dai, X.L.[Xiao-Liang],
Wan, A.[Alvin],
Zhang, P.Z.[Pei-Zhao],
Yan, Z.C.[Zhi-Cheng],
Tomizuka, M.[Masayoshi],
Gonzalez, J.[Joseph],
Keutzer, K.[Kurt],
Vajda, P.[Peter],
Visual Transformers: Where Do Transformers Really Belong in Vision
Models?,
ICCV21(579-589)
IEEE DOI
2203
Training, Visualization, Image segmentation, Lips,
Computational modeling, Semantics,
Vision applications and systems
BibRef
Hu, R.H.[Rong-Hang],
Singh, A.[Amanpreet],
UniT: Multimodal Multitask Learning with a Unified Transformer,
ICCV21(1419-1429)
IEEE DOI
2203
Training, Natural languages,
Object detection, Predictive models, Transformers, Multitasking,
Representation learning
BibRef
Qiu, Y.[Yue],
Yamamoto, S.[Shintaro],
Nakashima, K.[Kodai],
Suzuki, R.[Ryota],
Iwata, K.[Kenji],
Kataoka, H.[Hirokatsu],
Satoh, Y.[Yutaka],
Describing and Localizing Multiple Changes with Transformers,
ICCV21(1951-1960)
IEEE DOI
2203
Measurement, Location awareness, Codes, Natural languages,
Benchmark testing, Transformers,
Vision applications and systems
BibRef
Song, M.[Myungseo],
Choi, J.[Jinyoung],
Han, B.H.[Bo-Hyung],
Variable-Rate Deep Image Compression through Spatially-Adaptive
Feature Transform,
ICCV21(2360-2369)
IEEE DOI
2203
Training, Image coding, Neural networks, Rate-distortion, Transforms,
Network architecture, Computational photography,
Low-level and physics-based vision
BibRef
Sheng, H.[Hualian],
Cai, S.[Sijia],
Liu, Y.[Yuan],
Deng, B.[Bing],
Huang, J.Q.[Jian-Qiang],
Hua, X.S.[Xian-Sheng],
Zhao, M.J.[Min-Jian],
Improving 3D Object Detection with Channel-wise Transformer,
ICCV21(2723-2732)
IEEE DOI
2203
Point cloud compression, Object detection, Detectors, Transforms,
Transformers, Encoding, Detection and localization in 2D and 3D,
BibRef
Zhang, P.[Pengchuan],
Dai, X.[Xiyang],
Yang, J.W.[Jian-Wei],
Xiao, B.[Bin],
Yuan, L.[Lu],
Zhang, L.[Lei],
Gao, J.F.[Jian-Feng],
Multi-Scale Vision Longformer: A New Vision Transformer for
High-Resolution Image Encoding,
ICCV21(2978-2988)
IEEE DOI
2203
Image segmentation, Image coding, Computational modeling,
Memory management, Object detection, Transformers,
Representation learning
BibRef
Dong, Q.[Qi],
Tu, Z.W.[Zhuo-Wen],
Liao, H.[Haofu],
Zhang, Y.T.[Yu-Ting],
Mahadevan, V.[Vijay],
Soatto, S.[Stefano],
Visual Relationship Detection Using Part-and-Sum Transformers with
Composite Queries,
ICCV21(3530-3539)
IEEE DOI
2203
Visualization, Detectors, Transformers, Task analysis, Standards,
Detection and localization in 2D and 3D,
Representation learning
BibRef
Wang, T.[Tao],
Yuan, L.[Li],
Chen, Y.P.[Yun-Peng],
Feng, J.S.[Jia-Shi],
Yan, S.C.[Shui-Cheng],
PnP-DETR: Towards Efficient Visual Analysis with Transformers,
ICCV21(4641-4650)
IEEE DOI
2203
Adaptation models, Visualization, Image segmentation,
Image recognition, Computational modeling, Redundancy,
Detection and localization in 2D and 3D
BibRef
Fan, H.Q.[Hao-Qi],
Xiong, B.[Bo],
Mangalam, K.[Karttikeya],
Li, Y.[Yanghao],
Yan, Z.C.[Zhi-Cheng],
Malik, J.[Jitendra],
Feichtenhofer, C.[Christoph],
Multiscale Vision Transformers,
ICCV21(6804-6815)
IEEE DOI
2203
Visualization, Image recognition, Codes, Computational modeling,
Transformers, Complexity theory,
Recognition and classification
BibRef
Mahmood, K.[Kaleel],
Mahmood, R.[Rigel],
van Dijk, M.[Marten],
On the Robustness of Vision Transformers to Adversarial Examples,
ICCV21(7818-7827)
IEEE DOI
2203
Transformers, Robustness,
Adversarial machine learning, Security,
Machine learning architectures and formulations
BibRef
Chen, X.L.[Xin-Lei],
Xie, S.[Saining],
He, K.[Kaiming],
An Empirical Study of Training Self-Supervised Vision Transformers,
ICCV21(9620-9629)
IEEE DOI
2203
Training, Benchmark testing, Transformers, Standards,
Representation learning, Recognition and classification, Transfer/Low-shot/Semi/Unsupervised Learning
BibRef
Caron, M.[Mathilde],
Touvron, H.[Hugo],
Misra, I.[Ishan],
Jégou, H.[Hervé],
Mairal, J.[Julien],
Bojanowski, P.[Piotr],
Joulin, A.[Armand],
Emerging Properties in Self-Supervised Vision Transformers,
ICCV21(9630-9640)
IEEE DOI
2203
Training, Image segmentation, Semantics, Layout, Image retrieval,
Representation learning,
Transfer/Low-shot/Semi/Unsupervised Learning
BibRef
Yuan, Y.[Ye],
Weng, X.[Xinshuo],
Ou, Y.[Yanglan],
Kitani, K.[Kris],
AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent
Forecasting,
ICCV21(9793-9803)
IEEE DOI
2203
Uncertainty, Stochastic processes, Predictive models, Transformers,
Encoding, Trajectory, Motion and tracking,
Vision for robotics and autonomous vehicles
BibRef
Xu, W.J.[Wei-Jian],
Xu, Y.F.[Yi-Fan],
Chang, T.[Tyler],
Tu, Z.W.[Zhuo-Wen],
Co-Scale Conv-Attentional Image Transformers,
ICCV21(9961-9970)
IEEE DOI
2203
Image segmentation, Computational modeling, Object detection,
Transformers, Convolutional neural networks, Task analysis,
Recognition and classification
BibRef
Wu, K.[Kan],
Peng, H.W.[Hou-Wen],
Chen, M.H.[Ming-Hao],
Fu, J.L.[Jian-Long],
Chao, H.Y.[Hong-Yang],
Rethinking and Improving Relative Position Encoding for Vision
Transformer,
ICCV21(10013-10021)
IEEE DOI
2203
Image coding, Codes, Computational modeling, Transformers, Encoding,
Natural language processing, Datasets and evaluation,
Recognition and classification
BibRef
Bhojanapalli, S.[Srinadh],
Chakrabarti, A.[Ayan],
Glasner, D.[Daniel],
Li, D.[Daliang],
Unterthiner, T.[Thomas],
Veit, A.[Andreas],
Understanding Robustness of Transformers for Image Classification,
ICCV21(10211-10221)
IEEE DOI
2203
Perturbation methods, Transformers,
Robustness, Data models, Convolutional neural networks,
Recognition and classification
BibRef
Yan, B.[Bin],
Peng, H.[Houwen],
Fu, J.L.[Jian-Long],
Wang, D.[Dong],
Lu, H.C.[Hu-Chuan],
Learning Spatio-Temporal Transformer for Visual Tracking,
ICCV21(10428-10437)
IEEE DOI
2203
Visualization, Target tracking, Smoothing methods, Pipelines,
Benchmark testing, Transformers,
BibRef
Heo, B.[Byeongho],
Yun, S.[Sangdoo],
Han, D.Y.[Dong-Yoon],
Chun, S.[Sanghyuk],
Choe, J.[Junsuk],
Oh, S.J.[Seong Joon],
Rethinking Spatial Dimensions of Vision Transformers,
ICCV21(11916-11925)
IEEE DOI
2203
Dimensionality reduction, Computational modeling,
Object detection, Transformers, Robustness,
Recognition and classification
BibRef
Voskou, A.[Andreas],
Panousis, K.P.[Konstantinos P.],
Kosmopoulos, D.[Dimitrios],
Metaxas, D.N.[Dimitris N.],
Chatzis, S.[Sotirios],
Stochastic Transformer Networks with Linear Competing Units:
Application to end-to-end SL Translation,
ICCV21(11926-11935)
IEEE DOI
2203
Training, Memory management, Stochastic processes,
Gesture recognition, Benchmark testing, Assistive technologies,
BibRef
Ranftl, R.[René],
Bochkovskiy, A.[Alexey],
Koltun, V.[Vladlen],
Vision Transformers for Dense Prediction,
ICCV21(12159-12168)
IEEE DOI
2203
Image resolution, Semantics, Neural networks, Estimation,
Training data,
grouping and shape
BibRef
Chen, M.H.[Ming-Hao],
Peng, H.W.[Hou-Wen],
Fu, J.L.[Jian-Long],
Ling, H.B.[Hai-Bin],
AutoFormer: Searching Transformers for Visual Recognition,
ICCV21(12250-12260)
IEEE DOI
2203
Training, Convolutional codes, Visualization, Head, Search methods,
Manuals,
Recognition and classification
BibRef
Yang, G.L.[Guang-Lei],
Tang, H.[Hao],
Ding, M.L.[Ming-Li],
Sebe, N.[Nicu],
Ricci, E.[Elisa],
Transformer-Based Attention Networks for Continuous Pixel-Wise
Prediction,
ICCV21(16249-16259)
IEEE DOI
2203
Correlation, Estimation, Logic gates,
Transformers, Natural language processing,
Vision applications and systems
BibRef
Yuan, K.[Kun],
Guo, S.P.[Shao-Peng],
Liu, Z.[Ziwei],
Zhou, A.[Aojun],
Yu, F.W.[Feng-Wei],
Wu, W.[Wei],
Incorporating Convolution Designs into Visual Transformers,
ICCV21(559-568)
IEEE DOI
2203
Training, Visualization, Costs, Convolution, Training data,
Transformers, Feature extraction, Recognition and classification,
Efficient training and inference methods
BibRef
Chen, Z.[Zhengsu],
Xie, L.X.[Ling-Xi],
Niu, J.W.[Jian-Wei],
Liu, X.F.[Xue-Feng],
Wei, L.[Longhui],
Tian, Q.[Qi],
Visformer: The Vision-friendly Transformer,
ICCV21(569-578)
IEEE DOI
2203
Convolutional codes, Training, Visualization, Protocols,
Computational modeling, Fitting, Recognition and classification,
Representation learning
BibRef
Wang, W.[Wenhai],
Xie, E.[Enze],
Li, X.[Xiang],
Fan, D.P.[Deng-Ping],
Song, K.[Kaitao],
Liang, D.[Ding],
Lu, T.[Tong],
Luo, P.[Ping],
Shao, L.[Ling],
Pyramid Vision Transformer:
A Versatile Backbone for Dense Prediction without Convolutions,
ICCV21(548-558)
IEEE DOI
2203
Image resolution, Costs, Semantics, Object detection, Transformers,
Feature extraction, Recognition and classification,
grouping and shape
BibRef
Yao, Z.L.[Zhu-Liang],
Cao, Y.[Yue],
Lin, Y.T.[Yu-Tong],
Liu, Z.[Ze],
Zhang, Z.[Zheng],
Hu, H.[Han],
Leveraging Batch Normalization for Vision Transformers,
NeurArch21(413-422)
IEEE DOI
2112
Training, Transformers, Feeds
BibRef
Kim, K.[Kyungmin],
Wu, B.C.[Bi-Chen],
Dai, X.L.[Xiao-Liang],
Zhang, P.Z.[Pei-Zhao],
Yan, Z.C.[Zhi-Cheng],
Vajda, P.[Peter],
Kim, S.[Seon],
Rethinking the Self-Attention in Vision Transformers,
ECV21(3065-3069)
IEEE DOI
2109
Computational modeling, Pattern recognition
BibRef
Zhang, Z.X.[Zi-Xiao],
Lu, X.Q.[Xiao-Qiang],
Cao, G.J.[Guo-Jin],
Yang, Y.T.[Yu-Ting],
Jiao, L.C.[Li-Cheng],
Liu, F.[Fang],
ViT-YOLO: Transformer-Based YOLO for Object Detection,
VisDrone21(2799-2808)
IEEE DOI
2112
Semantics, Detectors, Object detection,
Feature extraction, Robustness
BibRef
Kong, D.[Daehyeon],
Kong, K.[Kyeongbo],
Kim, K.[Kyunghun],
Min, S.J.[Sung-Jun],
Kang, S.J.[Suk-Ju],
Image-Adaptive Hint Generation via Vision Transformer for Outpainting,
WACV22(4029-4038)
IEEE DOI
2202
Image synthesis, Neural networks,
Complex networks, Benchmark testing, Transformers,
Vision Systems and Applications
BibRef
Graham, B.[Ben],
El-Nouby, A.[Alaaeldin],
Touvron, H.[Hugo],
Stock, P.[Pierre],
Joulin, A.[Armand],
Jégou, H.[Hervé],
Douze, M.[Matthijs],
LeViT: a Vision Transformer in ConvNet's Clothing for Faster
Inference,
ICCV21(12239-12249)
IEEE DOI
2203
Training, Image resolution, Neural networks,
Parallel processing, Transformers, Feature extraction,
Representation learning
BibRef
Horváth, J.[János],
Baireddy, S.[Sriram],
Hao, H.X.[Han-Xiang],
Montserrat, D.M.[Daniel Mas],
Delp, E.J.[Edward J.],
Manipulation Detection in Satellite Images Using Vision Transformer,
WMF21(1032-1041)
IEEE DOI
2109
BibRef
Earlier: A1, A4, A3, A5, Only:
Manipulation Detection in Satellite Images Using Deep Belief Networks,
WMF20(2832-2840)
IEEE DOI
2008
Image sensors, Satellites, Splicing, Forestry, Tools.
Satellites, Image reconstruction, Training, Forgery,
Heating systems, Feature extraction
BibRef
Beal, J.[Josh],
Wu, H.Y.[Hao-Yu],
Park, D.H.[Dong Huk],
Zhai, A.[Andrew],
Kislyuk, D.[Dmitry],
Billion-Scale Pretraining with Vision Transformers for Multi-Task
Visual Representations,
WACV22(1431-1440)
IEEE DOI
2202
Visualization, Solid modeling, Systematics,
Computational modeling, Transformers,
Semi- and Un- supervised Learning
BibRef