11.14.3.4.2 Vision Transformers for Image Generation and Image Synthesis

Vision Transformers. Transformers. Synthesis. Image Generation. Image Synthesis.
See also Vision Transformers, ViT.
See also SWIN Transformer.
See also Detection Transformer, DETR Applications.

Naveen, S., Ram Kiran, M.S., Indupriya, M., Manikanta, T.V., Sudeep, P.V.,
Transformer models for enhancing AttnGAN based text to image generation,
IVC(115), 2021, pp. 104284.
Elsevier DOI 2110
Generative Adversarial Networks (GANs), Natural Language Processing (NLP), Text to image synthesis, Attention mechanism BibRef

Dalmaz, O.[Onat], Yurt, M.[Mahmut], Çukur, T.[Tolga],
ResViT: Residual Vision Transformers for Multimodal Medical Image Synthesis,
MedImg(41), No. 10, October 2022, pp. 2598-2614.
IEEE DOI 2210
Transformers, Biomedical imaging, Subspace constraints, Task analysis, Image synthesis, Magnetic resonance imaging, unified BibRef

Zhang, X.[Xin], Jiao, W.T.[Wen-Tao], Wang, B.[Bing], Tian, X.D.[Xue-Dong],
CT-GAN: A conditional Generative Adversarial Network of transformer architecture for text-to-image,
SP:IC(115), 2023, pp. 116959.
Elsevier DOI 2306
Generative Adversarial Networks (GANs), Text-to-image, Transformer, Filter module, Shift self-attention BibRef

Sortino, R.[Renato], Palazzo, S.[Simone], Rundo, F.[Francesco], Spampinato, C.[Concetto],
Transformer-based image generation from scene graphs,
CVIU(233), 2023, pp. 103721.
Elsevier DOI 2307
BibRef
Earlier: A1, A2, A4, Only: Sortino, R.[Renato], Palazzo, S.[Simone], Spampinato, C.[Concetto],
Transforming Image Generation from Scene Graphs,
ICPR22(4118-4124)
IEEE DOI 2212
Scene graphs, Transformers, Generative models, Conditional image generation. Visualization, Image synthesis, Semantics, Process control, Genomics, Rendering (computer graphics) BibRef

Wang, Z.C.[Zi-Cong], Ren, Q.[Qiang], Wang, J.L.[Jun-Li], Yan, C.G.[Chun-Gang], Jiang, C.J.[Chang-Jun],
Mush: Multi-scale Hierarchical Feature Extraction for Semantic Image Synthesis,
ACCV22(VII:185-201).
Springer DOI 2307
BibRef

Wei, Z.P.[Zhi-Peng], Chen, J.J.[Jing-Jing], Goldblum, M.[Micah], Wu, Z.X.[Zu-Xuan], Goldstein, T.[Tom], Jiang, Y.G.[Yu-Gang], Davis, L.S.[Larry S.],
Towards Transferable Adversarial Attacks on Image and Video Transformers,
IP(32), 2023, pp. 6346-6358.
IEEE DOI 2311
Image recognition, Codes, Perturbation methods, Computational modeling, Boosting, Data models, Adversarial attack and defense BibRef

Wei, Z.P.[Zhi-Peng], Chen, J.J.[Jing-Jing], Wu, Z.X.[Zu-Xuan], Jiang, Y.G.[Yu-Gang],
Adaptive Cross-Modal Transferable Adversarial Attacks From Images to Videos,
PAMI(46), No. 5, May 2024, pp. 3772-3783.
IEEE DOI 2404
BibRef
Earlier:
Cross-Modal Transferable Adversarial Attacks from Images to Videos,
CVPR22(15044-15053)
IEEE DOI 2210
Videos, Adaptation models, Perturbation methods, Feature extraction, Computational modeling, Glass box, Closed box, transferable attack BibRef

Bahani, M.[Mourad], El Ouaazizi, A.[Aziza], Maalmi, K.[Khalil],
The effectiveness of T5, GPT-2, and BERT on text-to-image generation task,
PRL(173), 2023, pp. 57-63.
Elsevier DOI 2310
Deep learning, Generative adversarial networks, Text-to-image generation, Natural language processing, Transformer-based models BibRef

Su, S.[Sitong], Zhu, J.[Junchen], Gao, L.[Lianli], Song, J.K.[Jing-Kuan],
Utilizing Greedy Nature for Multimodal Conditional Image Synthesis in Transformers,
MultMed(26), 2024, pp. 2354-2366.
IEEE DOI 2402
Transformers, Image synthesis, Visualization, Image segmentation, Task analysis, Image reconstruction, Computer architecture, transformer BibRef


Zheng, S.[Siming], Yuan, X.[Xin],
Unfolding Framework with Prior of Convolution-Transformer Mixture and Uncertainty Estimation for Video Snapshot Compressive Imaging,
ICCV23(12692-12703)
IEEE DOI 2401
Code: WWW Link.
BibRef

Cao, S.Y.[Shi-Yue], Yin, Y.Q.[Yue-Qin], Huang, L.H.[Liang-Hua], Liu, Y.[Yu], Zhao, X.[Xin], Zhao, D.L.[De-Li], Huang, K.Q.[Kai-Qi],
Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers,
ICCV23(7334-7343)
IEEE DOI 2401
BibRef

Yun, J.[Jooyeol], Lee, S.[Sanghyeon], Park, M.H.[Min-Ho], Choo, J.[Jaegul],
iColoriT: Towards Propagating Local Hints to the Right Region in Interactive Colorization by Leveraging Vision Transformer,
WACV23(1787-1796)
IEEE DOI 2302
Convolutional codes, Image color analysis, Stacking, Gray-scale, Transformers, Algorithms: Computational photography, image and video synthesis BibRef

Lin, K.E.[Kai-En], Yen-Chen, L.[Lin], Lai, W.S.[Wei-Sheng], Lin, T.Y.[Tsung-Yi], Shih, Y.C.[Yi-Chang], Ramamoorthi, R.[Ravi],
Vision Transformer for NeRF-Based View Synthesis from a Single Input Image,
WACV23(806-815)
IEEE DOI 2302
Shape, Pose estimation, Feature extraction, Transformers, Cameras, Algorithms: Computational photography, 3D computer vision BibRef

Lezama, J.[José], Chang, H.[Huiwen], Jiang, L.[Lu], Essa, I.[Irfan],
Improved Masked Image Generation with Token-Critic,
ECCV22(XXIII:70-86).
Springer DOI 2211
Generative transformer. BibRef

Kong, X.[Xiang], Jiang, L.[Lu], Chang, H.[Huiwen], Zhang, H.[Han], Hao, Y.[Yuan], Gong, H.F.[Hai-Feng], Essa, I.[Irfan],
BLT: Bidirectional Layout Transformer for Controllable Layout Generation,
ECCV22(XVII:474-490).
Springer DOI 2211
BibRef

Kong, D.[Daehyeon], Kong, K.[Kyeongbo], Kim, K.[Kyunghun], Min, S.J.[Sung-Jun], Kang, S.J.[Suk-Ju],
Image-Adaptive Hint Generation via Vision Transformer for Outpainting,
WACV22(4029-4038)
IEEE DOI 2202
Image synthesis, Neural networks, Complex networks, Benchmark testing, Transformers, Vision Systems and Applications BibRef

Maharana, A.[Adyasha], Hannan, D.[Darryl], Bansal, M.[Mohit],
StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation,
ECCV22(XXXVII:70-87).
Springer DOI 2211
BibRef

Kim, T.[Taehoon], Song, G.[Gwangmo], Lee, S.[Sihaeng], Kim, S.[Sangyun], Seo, Y.[Yewon], Lee, S.[Soonyoung], Kim, S.H.[Seung Hwan], Lee, H.L.[Hong-Lak], Bae, K.[Kyunghoon],
L-Verse: Bidirectional Generation Between Image and Text,
CVPR22(16505-16515)
IEEE DOI 2210
Representation learning, Training, Scalability, Transformers, Robustness, Scene analysis and understanding BibRef

Wang, J.A.[Jian-An], Lu, G.S.[Guan-Song], Xu, H.[Hang], Li, Z.G.[Zhen-Guo], Xu, C.J.[Chun-Jing], Fu, Y.W.[Yan-Wei],
ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation,
CVPR22(10697-10707)
IEEE DOI 2210
Image synthesis, Semantics, Transformers, Pattern recognition, Task analysis, Image and video synthesis and generation, Vision+language BibRef

Liu, Z.Z.[Zheng-Zhe], Wang, Y.[Yi], Qi, X.J.[Xiao-Juan], Fu, C.W.[Chi-Wing],
Towards Implicit Text-Guided 3D Shape Generation,
CVPR22(17875-17885)
IEEE DOI 2210
Codes, Shape, Image color analysis, Benchmark testing, Transformers, Vision+language, 3D from single images BibRef

Wu, F.X.[Fu-Xiang], Liu, L.[Liu], Hao, F.S.[Fu-Sheng], He, F.X.[Feng-Xiang], Cheng, J.[Jun],
Text-to-Image Synthesis based on Object-Guided Joint-Decoding Transformer,
CVPR22(18092-18101)
IEEE DOI 2210
Visualization, Image coding, Image synthesis, Layout, Semantics, Natural languages, Transformers, Vision+language, Image and video synthesis and generation BibRef

Wang, X.P.[Xin-Peng], Yeshwanth, C.[Chandan], Nießner, M.[Matthias],
SceneFormer: Indoor Scene Generation with Transformers,
3DV21(106-115)
IEEE DOI 2201
Layout, Transformers, Task analysis, scene generation, transformer, text conditioned, indoor scene, deep learning, machine learning, virtual reality BibRef

Mariotti, O.[Octave], Aodha, O.M.[Oisin Mac], Bilen, H.[Hakan],
ViewNet: Unsupervised Viewpoint Estimation from Conditional Generation,
ICCV21(10398-10408)
IEEE DOI 2203
Training, Annotations, Estimation, Benchmark testing, Transformers, Representation learning, Transfer/Low-shot/Semi/Unsupervised Learning BibRef

Ribeiro, L.S.F.[Leo Sampaio Ferraz], Bui, T.[Tu], Collomosse, J.[John], Ponti, M.[Moacir],
Scene Designer: A Unified Model for Scene Search and Synthesis from Sketch,
SHE21(2424-2433)
IEEE DOI 2112
Visualization, Correlation, Digital images, Layout, Transformer cores, Search problems, Transformers BibRef

Yang, C.F.[Cheng-Fu], Fan, W.C.[Wan-Cyuan], Yang, F.E.[Fu-En], Wang, Y.C.A.F.[Yu-Chi-Ang Frank],
LayoutTransformer: Scene Layout Generation with Conceptual and Spatial Diversity,
CVPR21(3731-3740)
IEEE DOI 2111
Visualization, Spatial diversity, Layout, Semantics, Fitting, Genomics BibRef

Lin, C., Yumer, E., Wang, O., Shechtman, E., Lucey, S.,
ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing,
CVPR18(9455-9464)
IEEE DOI 1812
Training, Generators, Image generation, Manifolds, Generative adversarial networks, Games BibRef



Last update: Apr 27, 2024 at 11:46:35