Zhang, J.[Ji],
Mei, K.Z.[Kui-Zhi],
Zheng, Y.[Yu],
Fan, J.P.[Jian-Ping],
Exploiting Mid-Level Semantics for Large-Scale Complex Video
Classification,
MultMed(21), No. 10, October 2019, pp. 2518-2530.
IEEE DOI
1910
feature extraction, image classification,
image motion analysis, image representation,
large-scale video classification
BibRef
Zhang, J.[Ji],
Mei, K.Z.[Kui-Zhi],
Wang, X.,
Zheng, Y.[Yu],
Fan, J.P.[Jian-Ping],
From Text to Video: Exploiting Mid-Level Semantics for Large-Scale
Video Classification,
ICPR18(1695-1700)
IEEE DOI
1812
Semantics, Task analysis, Visualization, Streaming media, Detectors,
Encoding, Bridges
BibRef
Yang, M.[Min],
Liu, J.H.[Jun-Hao],
Shen, Y.[Ying],
Zhao, Z.[Zhou],
Chen, X.J.[Xiao-Jun],
Wu, Q.Y.[Qing-Yao],
Li, C.M.[Cheng-Ming],
An Ensemble of Generation- and Retrieval-Based Image Captioning With
Dual Generator Generative Adversarial Network,
IP(29), 2020, pp. 9627-9640.
IEEE DOI
2011
Generators, Decoding, Generative adversarial networks, Training,
Computational modeling, Task analysis, Image captioning,
adversarial learning
BibRef
Chen, Q.[Qi],
Wu, Q.[Qi],
Chen, J.[Jian],
Wu, Q.Y.[Qing-Yao],
van den Hengel, A.J.[Anton J.],
Tan, M.K.[Ming-Kui],
Scripted Video Generation With a Bottom-Up Generative Adversarial
Network,
IP(29), 2020, pp. 7454-7467.
IEEE DOI
2007
Generative adversarial networks, video generation,
semantic alignment, temporal coherence
BibRef
Sheng, L.[Lu],
Pan, J.T.[Jun-Ting],
Guo, J.M.[Jia-Ming],
Shao, J.[Jing],
Loy, C.C.[Chen Change],
High-Quality Video Generation from Static Structural Annotations,
IJCV(128), No. 10-11, November 2020, pp. 2552-2569.
Springer DOI
2009
BibRef
Sener, F.[Fadime],
Saraf, R.[Rishabh],
Yao, A.[Angela],
Transferring Knowledge From Text to Video:
Zero-Shot Anticipation for Procedural Actions,
PAMI(45), No. 6, June 2023, pp. 7836-7852.
IEEE DOI
2305
Visualization, Robots, Data models, Task analysis, Predictive models,
Natural languages, Text recognition, Deep learning, video analysis
BibRef
Liu, S.Y.[Si-Ying],
Dragotti, P.L.[Pier Luigi],
Sensing Diversity and Sparsity Models for Event Generation and Video
Reconstruction from Events,
PAMI(45), No. 10, October 2023, pp. 12444-12458.
IEEE DOI
2310
Event to video.
BibRef
Köksal, A.[Ali],
Ak, K.E.[Kenan E.],
Sun, Y.[Ying],
Rajan, D.[Deepu],
Lim, J.H.[Joo Hwee],
Controllable Video Generation With Text-Based Instructions,
MultMed(26), 2024, pp. 190-201.
IEEE DOI
2401
BibRef
Liu, J.W.[Jia-Wei],
Wang, W.N.[Wei-Ning],
Chen, S.[Sihan],
Zhu, X.X.[Xin-Xin],
Liu, J.[Jing],
Sounding Video Generator: A Unified Framework for Text-Guided
Sounding Video Generation,
MultMed(26), 2024, pp. 141-153.
IEEE DOI
2401
BibRef
Hu, Y.[Yaosi],
Luo, C.[Chong],
Chen, Z.Z.[Zhen-Zhong],
A Benchmark for Controllable Text-Image-to-Video Generation,
MultMed(26), 2024, pp. 1706-1719.
IEEE DOI
2402
Task analysis, Measurement, Generators, Uncertainty, Visualization, Dynamics,
Benchmark testing, Video generation, text-image-to-video,
multimodal-conditioned generation
BibRef
Fang, S.[Sheng],
Dang, T.T.[Tian-Tian],
Wang, S.H.[Shu-Hui],
Huang, Q.M.[Qing-Ming],
Linguistic Hallucination for Text-Based Video Retrieval,
CirSysVideo(34), No. 10, October 2024, pp. 9692-9705.
IEEE DOI Code:
WWW Link.
2411
Linguistics, Training, Testing, Encoding, Context modeling,
Feature extraction, Task analysis, Text-video retrieval, curriculum learning
BibRef
Nadeem, M.[Mohammad],
Sohail, S.S.[Shahab Saquib],
Cambria, E.[Erik],
Schuller, B.W.[Björn W.],
Hussain, A.[Amir],
Gender Bias in Text-to-Video Generation Models: A Case Study of Sora,
IEEE_Int_Sys(40), No. 3, May 2025, pp. 10-15.
IEEE DOI
2506
Analytical models, Leadership, Ethics, Generative AI,
Prevention and mitigation, Training data, Focusing, Intelligent systems
BibRef
Kuang, Q.[Qi],
Chen, Y.[Ying],
Visual-Aware Text as Query for Referring Video Object Segmentation,
IVC(161), 2025, pp. 105608.
Elsevier DOI
2509
Referring video object segmentation, Text-to-video, CLIP
BibRef
He, H.G.[Hui-Guo],
Yang, H.[Huan],
Tuo, Z.X.[Zi-Xi],
Zhou, Y.[Yuan],
Wang, Q.Y.[Qiu-Yue],
Zhang, Y.H.[Yu-Hang],
Liu, Z.[Zeyu],
Huang, W.H.[Wen-Hao],
Chao, H.Y.[Hong-Yang],
Yin, J.[Jian],
DreamStory: Open-Domain Story Visualization by LLM-Guided
Multi-Subject Consistent Diffusion,
PAMI(47), No. 12, December 2025, pp. 11874-11891.
IEEE DOI
2511
Visualization, Diffusion models, Semantics, Faces, Videos, Training,
Large language models, Hands, Benchmark testing, Snow
BibRef
Wang, X.H.[Xing-Han],
Kang, Z.X.[Zi-Xi],
Mu, Y.D.[Ya-Dong],
Text-Controlled Motion Mamba: Text-Instructed Temporal Grounding of
Human Motion,
IP(34), 2025, pp. 7079-7092.
IEEE DOI
2511
Motion segmentation, Computational modeling, Grounding,
Annotations, Context modeling, Visualization, Topology, Semantics,
state space models
BibRef
Wang, F.M.[Fu-Ming],
Wang, W.L.[Wen-Long],
Gao, D.[Dahua],
Huang, X.[Xunliang],
Song, X.D.[Xiao-Dan],
Sun, H.Y.[Hao-Yuan],
Peng, C.[Cheng],
A LLM-guided hybrid Mamba-Transformer architecture for part-to-whole
motion synthesis,
CVIU(262), 2025, pp. 104549.
Elsevier DOI
2512
Motion synthesis, Text-to-motion, Hybrid architecture, Generative model
BibRef
Huang, Z.Q.[Zi-Qi],
Zhang, F.[Fan],
Xu, X.J.[Xiao-Jie],
He, Y.[Yinan],
Yu, J.[Jiashuo],
Dong, Z.Y.[Zi-Yue],
Ma, Q.L.[Qian-Li],
Chanpaisit, N.[Nattapol],
Si, C.Y.[Chen-Yang],
Jiang, Y.M.[Yu-Ming],
Wang, Y.H.[Yao-Hui],
Chen, X.Y.[Xin-Yuan],
Chen, Y.C.[Ying-Cong],
Wang, L.M.[Li-Min],
Lin, D.[Dahua],
Qiao, Y.[Yu],
Liu, Z.W.[Zi-Wei],
VBench++: Comprehensive and Versatile Benchmark Suite for Video
Generative Models,
PAMI(48), No. 3, March 2026, pp. 3268-3285.
IEEE DOI
2602
Text to video, Adaptation models, Benchmark testing, Annotations,
Quality assessment, Numerical models, Image synthesis, Data models,
evaluation benchmark
BibRef
Huang, Z.Q.[Zi-Qi],
He, Y.[Yinan],
Yu, J.[Jiashuo],
Zhang, F.[Fan],
Si, C.Y.[Chen-Yang],
Jiang, Y.M.[Yu-Ming],
Zhang, Y.H.[Yuan-Han],
Wu, T.X.[Tian-Xing],
Jin, Q.Y.[Qing-Yang],
Chanpaisit, N.[Nattapol],
Wang, Y.H.[Yao-Hui],
Chen, X.Y.[Xin-Yuan],
Wang, L.M.[Li-Min],
Lin, D.[Dahua],
Qiao, Y.[Yu],
Liu, Z.W.[Zi-Wei],
VBench: Comprehensive Benchmark Suite for Video Generative Models,
CVPR24(21807-21818)
IEEE DOI
2410
Measurement, Image synthesis, Annotations, Computational modeling,
Benchmark testing, evaluation, human preference
BibRef
Chen, Y.H.[Yu-Hui],
Li, H.R.[Hao-Ran],
Jiang, Z.N.[Zhen-Nan],
Wen, H.[Haowei],
Zhao, D.B.[Dong-Bin],
TeViR: Text-to-Video Reward With Diffusion Models for Efficient
Reinforcement Learning,
SMCS(56), No. 2, February 2026, pp. 893-905.
IEEE DOI
2602
Text to video, Diffusion models, Robots, Visualization, Trajectory,
Predictive models, Image sequences, Reinforcement learning,
text-to-video diffusion model
BibRef
Era, Y.[Yuki],
Togo, R.[Ren],
Maeda, K.[Keisuke],
Ogawa, T.[Takahiro],
Haseyama, M.[Miki],
Generalizing Stylized Motion Generation Method by Introducing
Metadata-Independent Learning and Unified Multiple Motion Dataset,
MultMed(28), 2026, pp. 1884-1893.
IEEE DOI
2603
Training, Metadata, Feature extraction, Adaptation models, Videos,
Motion measurement, Knowledge engineering, Humanities, Hands,
text-to-motion
BibRef
Hu, M.X.[Meng-Xian],
Zhu, M.H.[Ming-Hao],
Zhou, X.[Xun],
Yan, Q.Q.[Qing-Qing],
Li, S.[Shu],
Liu, C.J.[Cheng-Ju],
Chen, Q.J.[Qi-Jun],
Efficient Text-Driven Motion Generation via Latent Consistency
Training,
SMCS(56), No. 3, March 2026, pp. 1506-1519.
IEEE DOI
2603
Training, Trajectory, Diffusion models, Costs,
Computational modeling, Nearest neighbor methods,
neural network
BibRef
Zhang, Z.C.[Zhi-Chao],
Li, X.Y.[Xin-Yue],
Sun, W.[Wei],
Zhang, Z.C.[Zi-Cheng],
Liu, X.H.[Xiao-Hong],
Min, X.K.[Xiong-Kuo],
Zhai, G.T.[Guang-Tao],
LMVQ: Label-Free Metric-Learning for General AI-Generated Video
Quality Assessment,
CirSysVideo(36), No. 3, March 2026, pp. 3367-3381.
IEEE DOI
2603
Distortion, Text to video, Quality assessment, Visualization, Degradation,
Adaptation models, Transformers, Training, Semantics, metric-learning
BibRef
Liang, S.Y.[Si-Yuan],
Liu, J.Y.[Jia-Yang],
Zhai, J.C.[Jie-Chen],
Fang, T.M.[Tian-Meng],
Tu, R.C.[Rong-Cheng],
Liu, A.S.[Ai-Shan],
Cao, X.C.[Xiao-Chun],
Tao, D.C.[Da-Cheng],
T2VShield: Model-Agnostic Jailbreak Defense for Text-to-Video Models,
IJCV(134), No. 4, April 2026, pp. 144.
Springer DOI Code:
WWW Link.
2603
BibRef
Shi, Y.Y.[Yang-Yang],
Zou, B.[Beiji],
Deng, X.N.[Xiao-Nian],
Zhang, Y.C.[Yu-Cong],
Liu, Z.[Zehua],
Kui, X.Y.[Xiao-Yan],
Xu, G.[Gang],
Si, W.X.[Wei-Xin],
Integrating frequency-aware mamba with diffusion for 4D volumetric
image synthesis,
PR(175), 2026, pp. 113067.
Elsevier DOI
2603
Motion synthesis, Diffusion model, Frequency-aware Mamba, Radiotherapy
BibRef
Hu, Y.[Yaosi],
Chen, C.W.[Chang Wen],
MotionPrior: Exploring Efficient Learning of Motion Concepts for
Few-Shot Video Generation,
IP(35), 2026, pp. 3324-3338.
IEEE DOI Code:
WWW Link.
2604
Videos, Text to video, Diffusion models, Adaptation models, Training,
Noise, Generators, Computational modeling, Visualization, few-shot learning
BibRef
Zheng, G.C.[Guang-Cong],
Li, X.[Xi],
CamI2V-Epipolar: Epipolar-Constrained Block Sparse Attention for
Camera-Controlled Image-to-Video Diffusion Model,
SPLetters(33), 2026, pp. 1401-1405.
IEEE DOI
2604
Cameras, Videos, Diffusion models, Controllability,
Attention mechanisms, Translation, Trajectory, Noise, Measurement,
video block sparse attention
BibRef
Kwon, S.[Soonwoo],
Kim, J.Y.[Jin-Young],
Go, H.[Hyojun],
Baek, K.[Kyungjune],
Toward stable world models: Measuring and addressing world
instability in generative environments,
PR(177), 2026, pp. 113351.
Elsevier DOI
2605
When you get back to the start, the model should be the same.
World model, Metric
BibRef
Sun, C.Y.[Chun-Yi],
Han, J.L.[Jun-Lin],
Deng, W.J.[Wei-Jian],
Wang, X.L.[Xin-Long],
Qin, Z.S.[Zi-Shan],
Gould, S.[Stephen],
3D-GPT: Procedural 3D Modeling with Large Language Models,
3DV25(1253-1263)
IEEE DOI
2512
Automated content creation.
Solid modeling, Adaptation models, Visualization,
Procedural generation, Large language models, Ray tracing,
Software reliability
BibRef
Zhang, R.[Ruihan],
Yu, B.[Borou],
Min, J.J.[Jia-Jian],
Xin, Y.[Yetong],
Wei, Z.[Zheng],
Shi, J.C.N.[Jun-Cheng Nemo],
Huang, M.Z.[Ming-Zhen],
Kong, X.H.[Xiang-Hao],
Xin, N.L.[Nix Liu],
Jiang, S.S.[Shan-Shan],
Bahuguna, P.[Praagya],
Chan, M.[Mark],
Hora, K.[Khushi],
Yang, L.J.[Li-Jian],
Liang, Y.Q.[Yong-Qi],
Bian, R.[Runhe],
Liu, Y.L.[Yun-Lei],
Valencia, I.C.[Isabela Campillo],
Tredinick, P.M.[Patricia Morales],
Kozlov, I.[Ilia],
Jiang, S.[Sijia],
Huang, P.[Peiwen],
Chen, N.[Na],
Liu, X.[Xuanxuan],
Rao, A.[Anyi],
Generative AI for Film Creation: A Survey of Recent Advances,
AIConGen25(6257-6269)
IEEE DOI
2512
Surveys, Visualization, Generative AI, Navigation, Films,
Text to image, Production, Neural radiance field, Film, AI Film, GenAI
BibRef
Li, A.[Ang],
Evaluating Text-to-Video Alignment: A Hierarchical Benchmark for
Video Generation Models,
ICIVC25(583-590)
IEEE DOI
2512
Correlation, Systematics, Protocols, Human-machine systems,
Computational modeling, Benchmark testing, Diffusion models,
latent diffusion model
BibRef
Li, X.M.[Xiao-Min],
Liu, Y.X.[Yi-Xuan],
Isobe, T.[Takashi],
Jia, X.[Xu],
Cui, Q.P.[Qin-Peng],
Zhou, D.[Dong],
Li, D.[Dong],
He, Y.[You],
Lu, H.C.[Hu-Chuan],
Wang, Z.D.[Zhong-Dao],
Barsoum, E.[Emad],
ReNeg: Learning Negative Embedding with Reward Guidance,
CVPR25(23636-23645)
IEEE DOI Code:
WWW Link.
2508
Training, Codes, Computational modeling, Semantics, Buildings,
Text to image, Diffusion models, Text to video, negative embedding
BibRef
Deo, A.[Anurag],
Bhat, S.[Savita],
Karande, S.[Shirish],
VisualFusion: Enhancing Blog Content with Advanced Infographic
Pipeline,
WACV25(5591-5600)
IEEE DOI
2505
Measurement, Image quality, Visualization, Pipelines, Blogs,
Text to image, Streaming media, Data models, Standards, Videos, NLP
BibRef
Shin, C.[Chaehun],
Choi, J.Y.[Joo-Young],
Kim, H.[Heeseung],
Yoon, S.[Sungroh],
Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot
Subject-Driven Image Generator,
CVPR25(7986-7996)
IEEE DOI
2508
Visualization, Image synthesis, Semantics, Text to image, Generators,
Videos, Context modeling, zero-shot subject-driven text-to-image generation
BibRef
Liang, F.[Feng],
Ma, H.Y.[Hao-Yu],
He, Z.C.[Ze-Cheng],
Hou, T.B.[Ting-Bo],
Hou, J.[Ji],
Li, K.P.[Kun-Peng],
Dai, X.L.[Xiao-Liang],
Juefei-Xu, F.[Felix],
Azadi, S.[Samaneh],
Sinha, A.[Animesh],
Zhang, P.Z.[Pei-Zhao],
Vajda, P.[Peter],
Marculescu, D.[Diana],
Movie Weaver: Tuning-Free Multi-Concept Video Personalization with
Anchored Prompts,
CVPR25(13146-13156)
IEEE DOI
2508
Limiting, Accuracy, Animals, Face recognition, Motion pictures, Videos
BibRef
Nguyen, T.[Thao],
Singh, K.K.[Krishna Kumar],
Shi, J.[Jing],
Bui, T.[Trung],
Lee, Y.J.[Yong Jae],
Li, Y.H.[Yu-Heng],
Yo'Chameleon: Personalized Vision and Language Generation,
CVPR25(14438-14448)
IEEE DOI
2508
Training, Image quality, Adaptation models, Visualization,
Image synthesis, Statistical analysis, Tuning, Optimization,
personalization
BibRef
Wang, H.L.[Han-Lin],
Ouyang, H.[Hao],
Wang, Q.Y.[Qiu-Yu],
Wang, W.[Wen],
Cheng, K.L.[Ka Leong],
Chen, Q.F.[Qi-Feng],
Shen, Y.J.[Yu-Jun],
Wang, L.M.[Li-Min],
LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis,
CVPR25(12490-12500)
IEEE DOI Code:
WWW Link.
2508
Solid modeling, Image segmentation, Tracking,
Computational modeling, Pipelines,
3d trajectory control
BibRef
Lai, Z.H.[Zi-Hang],
Vedaldi, A.[Andrea],
Tracktention: Leveraging Point Tracking to Attend Videos Faster and
Better,
CVPR25(22809-22819)
IEEE DOI
2508
Temporal consistency in synthesis.
Tracking, Computational modeling, Dynamics, Predictive models,
Transformers, Computational efficiency, Videos
BibRef
Ji, B.[Bin],
Pan, Y.[Ye],
Liu, Z.M.[Zhi-Meng],
Tan, S.[Shuai],
Jin, X.G.[Xiao-Gang],
Yang, X.K.[Xiao-Kang],
POMP: Physics-constrainable Motion Generative Model through Phase
Manifolds,
CVPR25(22690-22701)
IEEE DOI
2508
Manifolds, Training, Target tracking, Simulation, Dynamics, Kinematics,
Encoding, Real-time systems, Topology, Physics
BibRef
Wang, Y.F.[Yi-Fan],
Yang, P.[Peishan],
Xu, Z.[Zhen],
Sun, J.M.[Jia-Ming],
Zhang, Z.[Zhanhua],
Chen, Y.[Yong],
Bao, H.J.[Hu-Jun],
Peng, S.[Sida],
Zhou, X.W.[Xiao-Wei],
FreeTimeGS: Free Gaussian Primitives at Anytime Anywhere for Dynamic
Scene Reconstruction,
CVPR25(21750-21760)
IEEE DOI
2508
Surface reconstruction, Solid modeling, Deformation, Dynamics,
Redundancy, Rendering (computer graphics)
BibRef
Wang, S.J.[Shi-Jie],
Azadi, S.[Samaneh],
Girdhar, R.[Rohit],
Rambhatla, S.[Saketh],
Sun, C.[Chen],
Yin, X.[Xi],
MotiF: Making Text Count in Image Animation with Motion Focal Loss,
CVPR25(7773-7783)
IEEE DOI
2508
Training, Optical losses, Heating systems, Protocols, Benchmark testing,
Animation, Optical flow, Videos, video generation, image animation
BibRef
Bian, W.[Weikang],
Huang, Z.Y.[Zhao-Yang],
Shi, X.Y.[Xiao-Yu],
Li, Y.J.[Yi-Jin],
Wang, F.Y.[Fu-Yun],
Li, H.S.[Hong-Sheng],
GS-DiT: Advancing Video Generation with Dynamic 3D Gaussian Fields
through Efficient Dense 3D Point Tracking,
CVPR25(21717-21727)
IEEE DOI Code:
WWW Link.
2508
Training, Tracking, Dynamics, Production, Cameras, Controllability,
Transformers, Videos, video generation, point tracking,
camera pose control
BibRef
Ren, X.[Xuanchi],
Shen, T.[Tianchang],
Huang, J.H.[Jia-Hui],
Ling, H.[Huan],
Lu, Y.F.[Yi-Fan],
Nimier-David, M.[Merlin],
Müller, T.[Thomas],
Keller, A.[Alexander],
Fidler, S.[Sanja],
Gao, J.[Jun],
Gen3C: 3D-Informed World-Consistent Video Generation with Precise
Camera Control,
CVPR25(6121-6132)
IEEE DOI
2508
Training, Point cloud compression, Solid modeling, Dynamics, Cameras,
Rendering (computer graphics), Trajectory, Videos,
camera control
BibRef
Wang, X.[Xi],
Courant, R.[Robin],
Christie, M.[Marc],
Kalogeiton, V.[Vicky],
AKiRa: Augmentation Kit on Rays for optical video generation,
CVPR25(2609-2619)
IEEE DOI
2508
Visualization, Mood, Optical distortion, Cameras, Quality assessment,
Optical films, Videos, Optical control, Lenses
BibRef
Han, H.N.[Hao-Nan],
Wu, X.Z.[Xiang-Zuo],
Liao, H.[Huan],
Xu, Z.[Zunnan],
Hu, Z.Y.[Zhong-Yuan],
Li, R.H.[Rong-Hui],
Zhang, Y.C.[Ya-Chao],
Li, X.[Xiu],
AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision
Reward,
CVPR25(22746-22755)
IEEE DOI Code:
WWW Link.
2508
Visualization, Annotations, Computational modeling,
Reinforcement learning, Generators
BibRef
Wu, Y.S.[Yu-Shu],
Zhang, Z.X.[Zhi-Xing],
Li, Y.Y.[Yan-Yu],
Xu, Y.[Yanwu],
Kag, A.[Anil],
Sui, Y.[Yang],
Coskun, H.[Huseyin],
Ma, K.[Ke],
Lebedev, A.[Aleksei],
Hu, J.[Ju],
Metaxas, D.N.[Dimitris N.],
Wang, Y.Z.[Yan-Zhi],
Tulyakov, S.[Sergey],
Ren, J.[Jian],
SnapGen-V: Generating a Five-Second Video within Five Seconds on a
Mobile Device,
CVPR25(2479-2490)
IEEE DOI Code:
WWW Link.
2508
Limiting, Image synthesis, Computational modeling,
Image edge detection, Noise reduction, Network architecture, efficiency
BibRef
Han, H.[Hui],
Li, S.Y.[Si-Yuan],
Chen, J.Q.[Jia-Qi],
Yuan, Y.W.[Yi-Wen],
Wu, Y.L.[Yu-Ling],
Deng, Y.F.[Yu-Fan],
Leong, C.T.[Chak Tou],
Du, H.[Hanwen],
Fu, J.C.[Jun-Chen],
Li, Y.[Youhua],
Zhang, J.[Jie],
Zhang, C.[Chi],
Li, L.J.[Li-Jia],
Ni, Y.X.[Yong-Xin],
Video-Bench: Human-Aligned Video Generation Benchmark,
CVPR25(18858-18868)
IEEE DOI
2508
Measurement, Visualization, Large language models,
Computational modeling, Benchmark testing, Cognition, video generation
BibRef
Shen, X.Q.[Xiao-Qian],
Elhoseiny, M.[Mohamed],
StoryGPT-V: Large Language Models as Consistent Story Visualizers,
CVPR25(13273-13283)
IEEE DOI Code:
WWW Link.
2508
Visualization, Image segmentation, Image resolution, Accuracy,
Navigation, Large language models, Semantics, Character generation
BibRef
Chen, T.S.[Tsai-Shien],
Siarohin, A.[Aliaksandr],
Menapace, W.[Willi],
Fang, Y.W.[Yu-Wei],
Lee, K.S.[Kwot Sin],
Skorokhodov, I.[Ivan],
Aberman, K.[Kfir],
Zhu, J.Y.[Jun-Yan],
Yang, M.H.[Ming-Hsuan],
Tulyakov, S.[Sergey],
Multi-subject Open-set Personalization in Video Generation,
CVPR25(6099-6110)
IEEE DOI
2508
Training, Computational modeling, Pipelines, Benchmark testing,
Transformers, Image augmentation, Optimization, Videos, Overfitting,
customization
BibRef
Huang, Z.P.[Zhi-Peng],
Zhuang, S.[Shaobin],
Fu, C.[Canmiao],
Yang, B.X.[Bin-Xin],
Zhang, Y.[Ying],
Sun, C.[Chong],
Zhang, Z.Z.[Zhi-Zheng],
Wang, Y.[Yali],
Li, C.[Chen],
Zha, Z.J.[Zheng-Jun],
WeGen: A Unified Model for Interactive Multimodal Generation as We
Chat,
CVPR25(23679-23689)
IEEE DOI Code:
WWW Link.
2508
Visualization, Foundation models, Refining, Pipelines, Internet,
Iterative methods, Creativity, Image reconstruction, Videos
BibRef
Su, T.T.[Tong-Tong],
Wang, C.Y.[Cheng-Yu],
Liu, B.Y.[Bing-Yan],
Huang, J.[Jun],
Lu, D.M.[Dong-Ming],
Encapsulated Composition of Text-to-Image and Text-to-Video Models
for High-Quality Video Synthesis,
CVPR25(18209-18218)
IEEE DOI
2508
Adaptation models, Visualization, Synthesizers, Noise reduction,
Refining, Pipelines, Text to image, Imaging, Quality assessment, Text to video
BibRef
Wang, L.Z.[Luo-Zhou],
Li, Y.J.[Yi-Jun],
Chen, Z.F.[Zhi-Fei],
Wang, J.H.[Jui-Hsien],
Zhang, Z.F.[Zhi-Fei],
Zhang, H.[He],
Lin, Z.[Zhe],
Chen, Y.C.[Ying-Cong],
TransPixeler: Advancing Text-to-Video Generation with Transparency,
CVPR25(18229-18239)
IEEE DOI Code:
WWW Link.
2508
Adaptation models, Training data, Entertainment industry,
Visual effects, Transformers, Robustness, Reflection, Generators, Text to video
BibRef
Wang, J.R.[Jia-Rui],
Duan, H.Y.[Hui-Yu],
Zhai, G.T.[Guang-Tao],
Wang, J.T.[Jun-Tong],
Min, X.K.[Xiong-Kuo],
AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of
Text-to-Video Generation with LMM,
CVPR25(18869-18880)
IEEE DOI Code:
WWW Link.
2508
Visualization, Systematics, Databases, Pressing, Benchmark testing,
Predictive models, Robustness, Quality assessment,
Text to video
BibRef
Reddy, A.[Arun],
Martin, A.[Alexander],
Yang, E.[Eugene],
Yates, A.[Andrew],
Sanders, K.[Kate],
Murray, K.[Kenton],
Kriz, R.[Reno],
de Melo, C.M.[Celso M.],
van Durme, B.[Benjamin],
Chellappa, R.[Rama],
Video-ColBERT: Contextualized Late Interaction for Text-to-Video
Retrieval,
CVPR25(19691-19701)
IEEE DOI
2508
Training, Visualization, Benchmark testing, Encoding, Text to video,
video retrieval, video-language, temporal modeling
BibRef
Zanella, L.[Luca],
Mancini, M.[Massimiliano],
Menapace, W.[Willi],
Tulyakov, S.[Sergey],
Wang, Y.M.[Yi-Ming],
Ricci, E.[Elisa],
Can Text-to-Video Generation help Video-Language Alignment?,
CVPR25(24097-24107)
IEEE DOI
2508
Visualization, Computational modeling, Large language models,
Semantics, Noise, Linguistics, Noise measurement, Text to video,
text-to-video generation
BibRef
Fan, T.[Tiehan],
Nan, K.[Kepan],
Xie, R.[Rui],
Zhou, P.H.[Peng-Hao],
Yang, Z.[Zhenheng],
Fu, C.[Chaoyou],
Li, X.[Xiang],
Yang, J.[Jian],
Tai, Y.[Ying],
InstanceCap: Improving Text-to-Video Generation via Instance-aware
Structured Caption,
CVPR25(28974-28983)
IEEE DOI
2508
Training, Computational modeling, Pipelines, Text to video,
instance-aware, structured caption, high-fidelity video generation
BibRef
Huang, Y.Y.[Yu-Yang],
Chen, Y.[Yabo],
Ding, L.[Li],
Zhang, X.P.[Xiao-Peng],
Dai, W.R.[Wen-Rui],
Zou, J.[Junni],
Xiong, H.K.[Hong-Kai],
Tian, Q.[Qi],
IM-Zero: Instance-level Motion Controllable Video Generation in a
Zero-shot Manner,
CVPR25(7265-7275)
IEEE DOI
2508
Shape, Layout, Text to image, Controllability, Trajectory,
Quality assessment, Text to video, Motion control, Videos, zero-shot
BibRef
Wang, Y.P.[Yi-Ping],
He, X.[Xuehai],
Wang, K.[Kuan],
Ma, L.[Luyao],
Yang, J.W.[Jian-Wei],
Wang, S.[Shuohang],
Du, S.S.L.[Simon Shao-Lei],
Shen, Y.[Yelong],
Is Your World Simulator a Good Story Presenter? A Consecutive
Events-Based Benchmark for Future Long Video Generation,
CVPR25(13629-13638)
IEEE DOI Code:
WWW Link.
2508
Measurement, Current measurement, Computational modeling,
Benchmark testing, Atoms, Reliability, Text to video,
multi-event
BibRef
Chen, S.[Shoufa],
Ge, C.J.[Chong-Jian],
Zhang, Y.Q.[Yu-Qi],
Zhang, Y.[Yida],
Zhu, F.[Fengda],
Yang, H.[Hao],
Hao, H.X.[Hong-Xiang],
Wu, H.[Hui],
Lai, Z.C.[Zhi-Chao],
Hu, Y.F.[Yi-Fei],
Lin, T.C.[Ting-Che],
Zhang, S.L.[Shi-Long],
Li, F.[Fu],
Li, C.[Chuan],
Wang, X.[Xing],
Peng, Y.H.[Yang-Hua],
Sun, P.[Peize],
Luo, P.[Ping],
Jiang, Y.[Yi],
Yuan, Z.H.[Ze-Huan],
Peng, B.[Bingyue],
Liu, X.B.[Xia-Bing],
Goku: Flow Based Video Generative Foundation Models,
CVPR25(23516-23527)
IEEE DOI
2508
Training, Visualization, Computational modeling, Pipelines,
Text to image, Transformers, Data models, Text to video,
flow model
BibRef
Wang, H.J.[Hong-Jie],
Ma, C.Y.[Chih-Yao],
Liu, Y.C.[Yen-Cheng],
Hou, J.[Ji],
Xu, T.[Tao],
Wang, J.L.[Jia-Liang],
Juefei-Xu, F.[Felix],
Luo, Y.Q.[Ya-Qiao],
Zhang, P.Z.[Pei-Zhao],
Hou, T.B.[Ting-Bo],
Vajda, P.[Peter],
Jha, N.K.[Niraj K.],
Dai, X.L.[Xiao-Liang],
LinGen: Towards High-Resolution Minute-Length Text-to-Video
Generation with Linear Computational Complexity,
CVPR25(2578-2588)
IEEE DOI Code:
WWW Link.
2508
Correlation, Reviews, Computational modeling,
Graphics processing units, Transformers, Motion pictures, efficiency
BibRef
Gao, B.J.[Bing-Jie],
Gao, X.Y.[Xin-Yu],
Wu, X.X.[Xiao-Xue],
Zhou, Y.J.[Yu-Jie],
Qiao, Y.[Yu],
Niu, L.[Li],
Chen, X.Y.[Xin-Yuan],
Wang, Y.H.[Yao-Hui],
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization
for Text-To-Video Generation,
CVPR25(3173-3183)
IEEE DOI
2508
Training, Vocabulary, Sensitivity, Large language models,
Instruction sets, Refining, Text to video, Optimization,
prompt optimization
BibRef
Sharan, S.P.[S P],
Choi, M.[Minkyu],
Shah, S.[Sahil],
Goel, H.[Harsh],
Omama, M.[Mohammad],
Chinchali, S.[Sandeep],
Neuro-Symbolic Evaluation of Text-to-Video Models using Formal
Verification,
CVPR25(8395-8405)
IEEE DOI
2508
Measurement, Visualization, Computational modeling, Automata,
Benchmark testing, Logic, Text to video, Autonomous vehicles, dataset
BibRef
Sun, K.Y.[Kai-Yue],
Huang, K.Y.[Kai-Yi],
Liu, X.[Xian],
Wu, Y.[Yue],
Xu, Z.[Zihan],
Li, Z.G.[Zhen-Guo],
Liu, X.H.[Xi-Hui],
T2V-CompBench: A Comprehensive Benchmark for Compositional
Text-to-video Generation,
CVPR25(8406-8416)
IEEE DOI
2508
Measurement, Analytical models, Systematics, Correlation, Tracking,
Computational modeling, Large language models, Benchmark testing,
compositional text-to-video generation
BibRef
Yuan, S.H.[Sheng-Hai],
Huang, J.[Jinfa],
He, X.[Xianyi],
Ge, Y.[Yunyang],
Shi, Y.J.[Yu-Jun],
Chen, L.[Liuhan],
Luo, J.B.[Jie-Bo],
Yuan, L.[Li],
Identity-Preserving Text-To-Video Generation by Frequency
Decomposition,
CVPR25(12978-12988)
IEEE DOI
2508
Frequency-domain analysis, Computational modeling, Pipelines,
Optimal control, Transformers, Text to video, Frequency control,
video generation
BibRef
Bu, J.[Jiazi],
Ling, P.Y.[Peng-Yang],
Zhang, P.[Pan],
Wu, T.[Tong],
Dong, X.Y.[Xiao-Yi],
Zang, Y.H.[Yu-Hang],
Cao, Y.H.[Yu-Hang],
Lin, D.[Dahua],
Wang, J.Q.[Jia-Qi],
ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality
in a Training-free Way,
CVPR25(12999-13008)
IEEE DOI Code:
WWW Link.
2508
Visualization, Costs, Correlation, Codes, Decoding, High frequency,
Text to video, generative models, text-to-video generation,
video quality enhancement
BibRef
Weng, S.[Shuchen],
Zheng, H.J.[Hao-Jie],
Zhang, P.X.[Pei-Xuan],
Hong, Y.C.[Yu-Chen],
Jiang, H.[Han],
Li, S.[Si],
Shi, B.X.[Bo-Xin],
VIRES: Video Instance Repainting via Sketch and Text Guided
Generation,
CVPR25(28416-28425)
IEEE DOI Code:
WWW Link.
2508
Training, Adaptation models, Visualization, Semantics, Layout,
Transformers, Sampling methods, Text to video, Videos
BibRef
Huang, H.P.[Hsin-Ping],
Su, Y.C.[Yu-Chuan],
Sun, D.Q.[De-Qing],
Jiang, L.[Lu],
Jia, X.H.[Xu-Hui],
Zhu, Y.K.[Yu-Kun],
Yang, M.H.[Ming-Hsuan],
Fine-grained Controllable Video Generation via Object Appearance and
Context,
WACV25(3698-3708)
IEEE DOI
2505
Visualization, Natural languages, Benchmark testing, Transformers,
Controllability, Trajectory, Text to video, Standards,
subject customization
BibRef
Wu, W.J.[Wei-Jia],
Li, Z.[Zhuang],
Gu, Y.C.[Yu-Chao],
Zhao, R.[Rui],
He, Y.F.[Ye-Fei],
Zhang, D.J.H.[David Jun-Hao],
Shou, M.Z.[Mike Zheng],
Li, Y.[Yan],
Gao, T.T.[Ting-Ting],
Zhang, D.[Di],
DragAnything: Motion Control for Anything Using Entity Representation,
ECCV24(XXII: 331-348).
Springer DOI
2412
Project:
WWW Link.
Controllable video generation.
BibRef
Chen, X.[Xi],
Liu, Z.H.[Zhi-Heng],
Chen, M.T.[Meng-Ting],
Feng, Y.T.[Yu-Tong],
Liu, Y.[Yu],
Shen, Y.J.[Yu-Jun],
Zhao, H.S.[Heng-Shuang],
LivePhoto: Real Image Animation with Text-guided Motion Control,
ECCV24(XVIII: 475-491).
Springer DOI
2412
Project:
WWW Link.
BibRef
Dai, W.X.[Wen-Xun],
Chen, L.H.[Ling-Hao],
Wang, J.B.[Jing-Bo],
Liu, J.P.[Jin-Peng],
Dai, B.[Bo],
Tang, Y.S.[Yan-Song],
MotionLCM: Real-time Controllable Motion Generation via Latent
Consistency Model,
ECCV24(XVI: 390-408).
Springer DOI
2412
BibRef
Huang, Y.M.[Yi-Ming],
Wan, W.L.[Wei-Lin],
Yang, Y.[Yue],
Callison-Burch, C.[Chris],
Yatskar, M.[Mark],
Liu, L.J.[Ling-Jie],
CoMo: Controllable Motion Generation Through Language Guided Pose Code
Editing,
ECCV24(XXIX: 180-196).
Springer DOI
2412
BibRef
Zou, Q.[Qiran],
Yuan, S.[Shangyuan],
Du, S.[Shian],
Wang, Y.[Yu],
Liu, C.[Chang],
Xu, Y.[Yi],
Chen, J.[Jie],
Ji, X.Y.[Xiang-Yang],
ParCo: Part-coordinating Text-to-motion Synthesis,
ECCV24(LVI: 126-143).
Springer DOI
2412
BibRef
Wang, Y.H.[Yan-Hui],
Bao, J.M.[Jian-Min],
Weng, W.M.[Wen-Ming],
Feng, R.[Ruoyu],
Yin, D.C.[Da-Cheng],
Yang, T.[Tao],
Zhang, J.X.[Jing-Xu],
Dai, Q.[Qi],
Zhao, Z.Y.[Zhi-Yuan],
Wang, C.Y.[Chun-Yu],
Qiu, K.[Kai],
Yuan, Y.H.[Yu-Hui],
Sun, X.Y.[Xiao-Yan],
Luo, C.[Chong],
Guo, B.[Baining],
MicroCinema: A Divide-and-Conquer Approach for Text-to-Video
Generation,
CVPR24(8414-8424)
IEEE DOI
2410
Image synthesis, Noise, Pipelines, Dynamics, Text to image,
Dynamic scheduling, Motion capture, video generation, diffusion model
BibRef
Christie, R.[Robert],
Kitchen, B.[Brian],
Tumilowicz, W.[Wiktor],
Hooper, S.[Steffan],
Wünsche, B.C.[Burkhard C.],
Procedurally Generating Large Synthetic Worlds: Chunked Hierarchical
Wave Function Collapse,
IVCNZ24(1-6)
IEEE DOI
2503
Backtracking, Video games, Runtime, Procedural generation,
Heuristic algorithms, Layout, Wave functions, multithreading
BibRef
Kwon, M.[Mingi],
Oh, S.W.[Seoung Wug],
Zhou, Y.[Yang],
Liu, D.[Difan],
Lee, J.Y.[Joon-Young],
Cai, H.R.[Hao-Ran],
Liu, B.[Baqiao],
Liu, F.[Feng],
Uh, Y.J.[Young-Jung],
HARIVO: Harnessing Text-to-image Models for Video Generation,
ECCV24(LIII: 19-36).
Springer DOI
2412
BibRef
Wu, R.Q.[Rui-Qi],
Chen, L.[Liangyu],
Yang, T.[Tong],
Guo, C.[Chunle],
Li, C.Y.[Chong-Yi],
Zhang, X.Y.[Xiang-Yu],
LAMP: Learn A Motion Pattern for Few-Shot Video Generation,
CVPR24(7089-7098)
IEEE DOI Code:
WWW Link.
2410
Training, Computational modeling, Pipelines, Text to image,
Diffusion models, Stability analysis, Quality assessment
BibRef
Yang, S.[Shuai],
Zhou, Y.F.[Yi-Fan],
Liu, Z.W.[Zi-Wei],
Loy, C.C.[Chen Change],
FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video
Translation,
CVPR24(8703-8712)
IEEE DOI
2410
Training, Visualization, Attention mechanisms, Superresolution,
Text to image, Coherence, diffusion, video-to-video translation,
intra-frame consistency
BibRef
Bahmani, S.[Sherwin],
Liu, X.[Xian],
Wang, Y.F.[Yi-Fan],
Skorokhodov, I.[Ivan],
Rong, V.[Victor],
Liu, Z.W.[Zi-Wei],
Liu, X.H.[Xi-Hui],
Park, J.J.[Jeong Joon],
Tulyakov, S.[Sergey],
Wetzstein, G.[Gordon],
Tagliasacchi, A.[Andrea],
Lindell, D.B.[David B.],
TC4D: Trajectory-Conditioned Text-to-4D Generation,
ECCV24(XLVI: 53-72).
Springer DOI
2412
BibRef
Fan, K.[Ke],
Tang, J.[Junshu],
Cao, W.J.[Wei-Jian],
Yi, R.[Ran],
Li, M.[Moran],
Gong, J.Y.[Jing-Yu],
Zhang, J.N.[Jiang-Ning],
Wang, Y.B.[Ya-Biao],
Wang, C.J.[Cheng-Jie],
Ma, L.Z.[Li-Zhuang],
FreeMotion:
A Unified Framework for Number-Free Text-to-Motion Synthesis,
ECCV24(VIII: 93-109).
Springer DOI
2412
BibRef
Oh, G.[Gyeongrok],
Jeong, J.[Jaehwan],
Kim, S.[Sieun],
Byeon, W.[Wonmin],
Kim, J.[Jinkyu],
Kim, S.[Sungwoong],
Kim, S.[Sangpil],
MEVG: Multi-event Video Generation with Text-to-video Models,
ECCV24(XLIII: 401-418).
Springer DOI
2412
BibRef
Girdhar, R.[Rohit],
Singh, M.[Mannat],
Brown, A.[Andrew],
Duval, Q.[Quentin],
Azadi, S.[Samaneh],
Rambhatla, S.S.[Sai Saketh],
Shah, A.[Akbar],
Yin, X.[Xi],
Parikh, D.[Devi],
Misra, I.[Ishan],
Factorizing Text-to-video Generation by Explicit Image Conditioning,
ECCV24(LXII: 205-224).
Springer DOI
2412
BibRef
Materzynska, J.[Joanna],
Sivic, J.[Josef],
Shechtman, E.[Eli],
Torralba, A.[Antonio],
Zhang, R.[Richard],
Russell, B.[Bryan],
NewMove: Customizing Text-to-video Models with Novel Motions,
ACCV24(V: 113-130).
Springer DOI
2412
BibRef
Menapace, W.[Willi],
Siarohin, A.[Aliaksandr],
Skorokhodov, I.[Ivan],
Deyneka, E.[Ekaterina],
Chen, T.S.[Tsai-Shien],
Kag, A.[Anil],
Fang, Y.W.[Yu-Wei],
Stoliar, A.[Aleksei],
Ricci, E.[Elisa],
Ren, J.[Jian],
Tulyakov, S.[Sergey],
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video
Synthesis,
CVPR24(7038-7048)
IEEE DOI
2410
Training, Visualization, Image coding, Computational modeling,
Scalability, Transformers, video generation,
efficiency
BibRef
Tian, K.B.[Kai-Bin],
Zhao, R.X.[Rui-Xiang],
Xin, Z.J.[Zi-Jie],
Lan, B.X.[Bang-Xiang],
Li, X.R.[Xi-Rong],
Holistic Features are Almost Sufficient for Text-to-Video Retrieval,
CVPR24(17138-17147)
IEEE DOI
2410
Computational modeling, Scalability, Ad hoc networks,
Text to video
BibRef
Qing, Z.W.[Zhi-Wu],
Zhang, S.W.[Shi-Wei],
Wang, J.Y.[Jia-Yu],
Wang, X.[Xiang],
Wei, Y.J.[Yu-Jie],
Zhang, Y.Y.[Ying-Ya],
Gao, C.X.[Chang-Xin],
Sang, N.[Nong],
Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation,
CVPR24(6635-6645)
IEEE DOI
2410
Training, Source coding, Semantics, Spatial coherence, Cognition,
Stability analysis, Complexity theory
BibRef
Wang, X.[Xiang],
Zhang, S.W.[Shi-Wei],
Yuan, H.J.[Hang-Jie],
Qing, Z.W.[Zhi-Wu],
Gong, B.[Biao],
Zhang, Y.Y.[Ying-Ya],
Shen, Y.J.[Yu-Jun],
Gao, C.X.[Chang-Xin],
Sang, N.[Nong],
A Recipe for Scaling up Text-to-Video Generation with Text-free
Videos,
CVPR24(6572-6582)
IEEE DOI
2410
Training, Video on demand, Scalability, Pipelines, Text to image,
Performance gain
BibRef
Kim, T.[Taehoon],
Kang, C.[ChanHee],
Park, J.[JaeHyuk],
Jeong, D.[Daun],
Yang, C.[ChangHee],
Kang, S.J.[Suk-Ju],
Kong, K.[Kyeongbo],
Human Motion Aware Text-to-Video Generation with Explicit Camera
Control,
WACV24(5069-5078)
IEEE DOI Code:
WWW Link.
2404
Knowledge engineering, Codes, Punching, Cameras, Algorithms,
Generative models for image, video, 3D, etc., Algorithms, Biometrics,
Vision + language and/or other modalities
BibRef
Ji, P.L.[Peng-Liang],
Xiao, C.[Chuyang],
Tai, H.L.[Hui-Lin],
Huo, M.X.[Ming-Xiao],
T2VBench: Benchmarking Temporal Dynamics for Text-to-Video Generation,
GenerativeFM24(5325-5335)
IEEE DOI
2410
Measurement, Analytical models, Computational modeling,
Encyclopedias, Benchmark testing, multimodal
BibRef
Godfrey, W.W.[W. Wilfred],
Ratna, A.[Abhinav],
Enhancing the Video Editing Capabilities of Text-to-Video Generators
Using DDPM Inversion,
ICCVMI23(1-5)
IEEE DOI
2403
Visualization, Computational modeling, Refining, Pipelines,
Noise reduction, Transformers, Probabilistic logic,
DDPM Inversion
BibRef
Lee, H.[Hanbit],
Kim, Y.[Youna],
Lee, S.G.[Sang-Goo],
Multi-scale Contrastive Learning for Complex Scene Generation,
WACV23(764-774)
IEEE DOI
2302
Semantics, Generative adversarial networks, Generators,
Data models, Task analysis, image and video synthesis
BibRef
Loeschcke, S.[Sebastian],
Belongie, S.[Serge],
Benaim, S.[Sagie],
Text-driven Stylization of Video Objects,
CVEU22(594-609).
Springer DOI
2304
BibRef
Mazaheri, A.[Amir],
Shah, M.[Mubarak],
Video Generation from Text Employing Latent Path Construction for
Temporal Modeling,
ICPR22(5010-5016)
IEEE DOI
2212
Interpolation, Visualization, Natural languages,
Stacking, Machine learning
BibRef
Lee, S.H.[Seung Hyun],
Oh, G.[Gyeongrok],
Byeon, W.[Wonmin],
Kim, C.[Chanyoung],
Ryoo, W.J.[Won Jeong],
Yoon, S.H.[Sang Ho],
Cho, H.[Hyunjun],
Bae, J.Y.[Jih-Yun],
Kim, J.[Jinkyu],
Kim, S.[Sangpil],
Sound-Guided Semantic Video Generation,
ECCV22(XVII:34-50).
Springer DOI
2211
BibRef
Zhan, F.N.[Fang-Neng],
Zhang, J.H.[Jia-Hui],
Yu, Y.C.[Ying-Chen],
Wu, R.L.[Rong-Liang],
Lu, S.J.[Shi-Jian],
Modulated Contrast for Versatile Image Synthesis,
CVPR22(18259-18269)
IEEE DOI
2210
Photography, Visualization, Codes, Image synthesis, Force,
Performance gain, Image and video synthesis and generation,
Computational photography
BibRef
Ntavelis, E.[Evangelos],
Shahbazi, M.[Mohamad],
Kastanis, I.[Iason],
Timofte, R.[Radu],
Danelljan, M.[Martin],
Van Gool, L.J.[Luc J.],
Arbitrary-Scale Image Synthesis,
CVPR22(11523-11532)
IEEE DOI
2210
Training, Image coding, Image synthesis, Pipelines,
Generative adversarial networks, Encoding,
Image and video synthesis and generation
BibRef
Yang, Z.P.[Zuo-Peng],
Liu, D.Q.[Da-Qing],
Wang, C.Y.[Chao-Yue],
Yang, J.[Jie],
Tao, D.C.[Da-Cheng],
Modeling Image Composition for Complex Scene Generation,
CVPR22(7754-7763)
IEEE DOI
2210
Training, Measurement, Visualization, Image coding, Layout, Genomics,
Predictive models, Image and video synthesis and generation
BibRef
Aldausari, N.[Nuha],
Sowmya, A.[Arcot],
Marcus, N.[Nadine],
Mohammadi, G.[Gelareh],
Cascaded Siamese Self-supervised Audio to Video GAN,
MULA22(4690-4699)
IEEE DOI
2210
Solid modeling, Correlation, Computational modeling
BibRef
Tao, M.[Ming],
Tang, H.[Hao],
Wu, F.[Fei],
Jing, X.Y.[Xiao-Yuan],
Bao, B.K.[Bing-Kun],
Xu, C.S.[Chang-Sheng],
DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis,
CVPR22(16494-16504)
IEEE DOI
2210
Visualization, Codes, Semantics,
Generative adversarial networks, Generators, Vision+language,
Image and video synthesis and generation
BibRef
Zhou, Y.F.[Yu-Fan],
Zhang, R.[Ruiyi],
Chen, C.Y.[Chang-You],
Li, C.Y.[Chun-Yuan],
Tensmeyer, C.[Chris],
Yu, T.[Tong],
Gu, J.X.[Jiu-Xiang],
Xu, J.H.[Jin-Hui],
Sun, T.[Tong],
Towards Language-Free Training for Text-to-Image Generation,
CVPR22(17886-17896)
IEEE DOI
2210
Training, Image synthesis, Semantics, Training data, Tail,
Data collection, Data models, Vision+language,
Image and video synthesis and generation
BibRef
Li, Z.H.[Zhi-Heng],
Min, M.R.[Martin Renqiang],
Li, K.[Kai],
Xu, C.L.[Chen-Liang],
StyleT2I: Toward Compositional and High-Fidelity Text-to-Image
Synthesis,
CVPR22(18176-18186)
IEEE DOI
2210
Measurement, Ethics, Image synthesis, Computational modeling,
Semantics, Robustness, Image and video synthesis and generation,
Vision+language
BibRef
Wang, Y.[Yi],
Qi, L.[Lu],
Chen, Y.C.[Ying-Cong],
Zhang, X.Y.[Xiang-Yu],
Jia, J.Y.[Jia-Ya],
Image Synthesis via Semantic Composition,
ICCV21(13729-13738)
IEEE DOI
2203
Correlation, Image synthesis, Convolution, Semantics, Layout,
Benchmark testing, Image and video synthesis,
Neural generative models
BibRef
Xiang, X.Y.[Xiao-Yu],
Liu, D.[Ding],
Yang, X.[Xiao],
Zhu, Y.H.[Yi-Heng],
Shen, X.H.[Xiao-Hui],
Allebach, J.P.[Jan P.],
Adversarial Open Domain Adaptation for Sketch-to-Photo Synthesis,
WACV22(944-954)
IEEE DOI
2202
Training, Image color analysis, Training data,
Distortion, Generators, Optimization, Image and Video Synthesis
BibRef
Dorkenwald, M.[Michael],
Milbich, T.[Timo],
Blattmann, A.[Andreas],
Rombach, R.[Robin],
Derpanis, K.G.[Konstantinos G.],
Ommer, B.[Björn],
Stochastic Image-to-Video Synthesis using cINNs,
CVPR21(3741-3752)
IEEE DOI
2111
Neural networks, Stochastic processes,
Process control, Predictive models, Probabilistic logic
BibRef
Mallya, A.[Arun],
Wang, T.C.[Ting-Chun],
Sapra, K.[Karan],
Liu, M.Y.[Ming-Yu],
World-Consistent Video-to-Video Synthesis,
ECCV20(VIII:359-378).
Springer DOI
2011
BibRef
Nawhal, M.[Megha],
Zhai, M.Y.[Meng-Yao],
Lehrmann, A.[Andreas],
Sigal, L.[Leonid],
Mori, G.[Greg],
Generating Videos of Zero-shot Compositions of Actions and Objects,
ECCV20(XII: 382-401).
Springer DOI
2010
BibRef
Chapter on 3-D Object Description and Computation Techniques, Surfaces, Deformable, View Generation, Video Conferencing continues in
Video Diffusion, Video Synthesis, Text to Video.