Liu, D.[Daizong],
Zhu, J.H.[Jia-Hao],
Fang, X.[Xiang],
Xiong, Z.[Zeyu],
Wang, H.[Huan],
Li, R.[Renfu],
Zhou, P.[Pan],
Conditional Video Diffusion Network for Fine-Grained Temporal
Sentence Grounding,
MultMed(26), 2024, pp. 5461-5476.
IEEE DOI
2404
Semantics, Grounding, Task analysis, Cognition, Visualization,
Training, Stochastic processes, diffusion models
BibRef
Xing, Z.[Zhen],
Feng, Q.J.[Qi-Jun],
Chen, H.R.[Hao-Ran],
Dai, Q.[Qi],
Hu, H.[Han],
Xu, H.[Hang],
Wu, Z.X.[Zu-Xuan],
Jiang, Y.G.[Yu-Gang],
A Survey on Video Diffusion Models,
Surveys(57), No. 2, November 2024, pp. xx-yy.
DOI Link
2501
Survey, Video Diffusion. Survey, video diffusion model, video generation, video editing, AIGC
BibRef
Cao, J.H.[Jing-Hao],
Liu, S.[Sheng],
Yang, X.[Xiong],
Li, Y.[Yang],
Du, S.[Sidan],
ARES: Text-Driven Automatic Realistic Simulator for Autonomous
Traffic,
SPLetters(31), 2024, pp. 3049-3053.
IEEE DOI
2411
Trajectory, Rendering (computer graphics), Training,
Diffusion models, Accuracy, Logic, Turning, Predictive models
BibRef
Liu, D.M.[Ding-Ming],
Li, S.W.[Shao-Wei],
Zhou, R.Y.[Ruo-Yan],
Liang, L.L.[Li-Li],
Hong, Y.G.[Yong-Guan],
Zeng, Y.Z.[Yuan-Ze],
Chang, X.[Xiang],
Li, L.J.[Li-Jiang],
Xu, T.S.[Tian-Shuo],
Chao, F.[Fei],
Shang, C.J.[Chang-Jing],
Shen, Q.[Qiang],
NADM: Noise-Aware Diffusion Model for Landscape Painting Video
Generation,
Cyber(55), No. 8, August 2025, pp. 3686-3698.
IEEE DOI Code:
WWW Link.
2508
Painting, Text to video, Diffusion models, Noise, Dynamics, Art,
Training, Noise reduction, Interpolation, Image synthesis,
text-to-video diffusion
BibRef
Zhang, R.[Rui],
Chen, Y.[Yaosen],
Liu, Y.[Yuegen],
Wang, W.[Wei],
Wen, X.M.[Xu-Ming],
Wang, H.X.[Hong-Xia],
TVG: A Training-Free Transition Video Generation Method With
Diffusion Models,
CirSysVideo(35), No. 8, August 2025, pp. 7471-7484.
IEEE DOI Code:
WWW Link.
2508
Diffusion models, Training, Artificial intelligence, Noise, Media,
Interpolation, Gaussian processes, Visualization, Reliability, training-free
BibRef
Wang, C.Y.[Chen-Yu],
Yan, S.[Shuo],
Chen, Y.X.[Yi-Xuan],
Wang, X.W.[Xian-Wei],
Wang, Y.J.[Yu-Jiang],
Dong, M.Z.[Ming-Zhi],
Yang, X.C.[Xiao-Chen],
Li, D.S.[Dong-Sheng],
Zhu, R.[Rui],
Clifton, D.A.[David A.],
Dick, R.P.[Robert P.],
Lv, Q.[Qin],
Yang, F.[Fan],
Lu, T.[Tun],
Gu, N.[Ning],
Shang, L.[Li],
Denoising Reuse: Exploiting Inter-Frame Motion Consistency for
Efficient Video Generation,
CirSysVideo(35), No. 9, September 2025, pp. 8436-8451.
IEEE DOI Code:
WWW Link.
2509
Noise reduction, Diffusion models, Visualization, Dynamics,
Computational modeling, Semantics, Optical flow,
computational efficiency
BibRef
Zhang, D.J.H.[David Jun-Hao],
Wu, J.Z.J.[Jay Zhang-Jie],
Liu, J.W.[Jia-Wei],
Zhao, R.[Rui],
Ran, L.M.[Ling-Min],
Gu, Y.C.[Yu-Chao],
Gao, D.F.[Di-Fei],
Shou, M.Z.[Mike Zheng],
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video
Generation,
IJCV(133), No. 4, April 2025, pp. 1879-1893.
Springer DOI
2504
BibRef
Wang, Y.H.[Yao-Hui],
Chen, X.Y.[Xin-Yuan],
Ma, X.[Xin],
Zhou, S.C.[Shang-Chen],
Huang, Z.Q.[Zi-Qi],
Wang, Y.[Yi],
Yang, C.Y.[Ce-Yuan],
He, Y.N.[Yi-Nan],
Yu, J.S.[Jia-Shuo],
Yang, P.Q.[Pei-Qing],
Guo, Y.W.[Yu-Wei],
Wu, T.X.[Tian-Xing],
Si, C.Y.[Chen-Yang],
Jiang, Y.M.[Yu-Ming],
Chen, C.J.[Cun-Jian],
Loy, C.C.[Chen Change],
Dai, B.[Bo],
Lin, D.H.[Da-Hua],
Qiao, Y.[Yu],
Liu, Z.W.[Zi-Wei],
LaVie: High-Quality Video Generation with Cascaded Latent Diffusion
Models,
IJCV(133), No. 5, May 2025, pp. 3059-3078.
Springer DOI
2504
BibRef
Weng, W.M.[Wen-Ming],
Feng, R.[Ruoyu],
Wang, Y.H.[Yan-Hui],
Dai, Q.[Qi],
Wang, C.Y.[Chun-Yu],
Yin, D.C.[Da-Cheng],
Zhao, Z.Y.[Zhi-Yuan],
Qiu, K.[Kai],
Bao, J.M.[Jian-Min],
Yuan, Y.H.[Yu-Hui],
Luo, C.[Chong],
Zhang, Y.Y.[Yue-Yi],
Xiong, Z.W.[Zhi-Wei],
ART•V: Auto-Regressive Text-to-Video Generation with Diffusion Models,
GCV24(7395-7405)
IEEE DOI
2410
Noise, Training data, Coherence, Predictive models, Diffusion models
BibRef
Zhang, Y.P.[Yi-Peng],
Wang, X.[Xin],
Chen, H.[Hong],
Qin, C.Y.[Chen-Yang],
Hao, Y.[Yibo],
Mei, H.[Hong],
Zhu, W.W.[Wen-Wu],
ScenarioDiff: Text-to-video Generation with Dynamic Transformations of
Scene Conditions,
IJCV(133), No. 7, July 2025, pp. 4909-4922.
Springer DOI
2506
BibRef
Wang, W.J.[Wen-Jing],
Yang, H.[Huan],
Tuo, Z.X.[Zi-Xi],
He, H.G.[Hui-Guo],
Zhu, J.C.[Jun-Chen],
Fu, J.L.[Jian-Long],
Liu, J.Y.[Jia-Ying],
Swap Attention in Spatiotemporal Diffusions for Text-to-Video
Generation,
IJCV(133), No. 7, July 2025, pp. 4177-4195.
Springer DOI
2506
BibRef
Wang, C.[Cong],
Hu, P.[Panwen],
Zhao, H.Y.[Hao-Yu],
Guo, Y.F.[Yuan-Fan],
Gu, J.X.[Jia-Xi],
Dong, X.[Xiao],
Han, J.H.[Jian-Hua],
Xu, H.[Hang],
Liang, X.D.[Xiao-Dan],
UniAdapter: All-in-One Control for Flexible Video Generation,
CirSysVideo(35), No. 6, June 2025, pp. 6059-6073.
IEEE DOI
2506
Diffusion models, Adaptation models,
Transformers, Text to video, Modeling, Training, Feature extraction
BibRef
Yang, L.[Ling],
Zhao, Y.K.[Yi-Kai],
Yu, Z.C.[Zhao-Chen],
Zeng, B.[Bohan],
Xu, M.[Minkai],
Hong, S.[Shenda],
Cui, B.[Bin],
Spatio-Temporal Energy-Guided Diffusion Model for Zero-Shot Video
Synthesis and Editing,
CirSysVideo(35), No. 6, June 2025, pp. 6034-6046.
IEEE DOI
2506
Diffusion models, Kernel, Training, Text to video,
Text to image, Faces, Dynamics, zero-shot generation
BibRef
Yuan, S.H.[Sheng-Hai],
Huang, J.[Jinfa],
Shi, Y.J.[Yu-Jun],
Xu, Y.Q.[Yong-Qi],
Zhu, R.J.[Rui-Jie],
Lin, B.[Bin],
Cheng, X.H.[Xin-Hua],
Yuan, L.[Li],
Luo, J.B.[Jie-Bo],
MagicTime: Time-Lapse Video Generation Models as Metamorphic
Simulators,
PAMI(47), No. 9, September 2025, pp. 7340-7351.
IEEE DOI
2508
Training, Text to video, Diffusion models, Physics, Flowering plants,
Adaptation models, Noise reduction, Feature extraction, Encoding,
metamorphic video
BibRef
Ma, Z.Y.[Zhi-Yuan],
Zhang, Y.Z.[Yu-Zhu],
Jia, G.[Guoli],
Zhao, L.L.[Liang-Liang],
Ma, Y.C.[Yi-Chao],
Ma, M.J.[Ming-Jie],
Liu, G.F.[Gao-Feng],
Zhang, K.[Kaiyan],
Ding, N.[Ning],
Li, J.J.[Jian-Jun],
Zhou, B.[Bowen],
Efficient Diffusion Models:
A Comprehensive Survey From Principles to Practices,
PAMI(47), No. 9, September 2025, pp. 7506-7525.
IEEE DOI
2508
Survey, Text to Video. Diffusion models, Training, Surveys, Image synthesis,
Vectors, Reviews, Noise reduction, Image coding, video generation
BibRef
Hu, W.P.[Wei-Peng],
Hoe, J.T.[Jiun Tian],
Li, J.H.[Jian-Hui],
Hu, H.F.[Hai-Feng],
Jiang, X.D.[Xu-Dong],
Tan, Y.P.[Yap-Peng],
Cascaded Dynamic Memory Refinement and Semantic Alignment for
Exo-to-Ego Cross-View Video Generation,
PAMI(47), No. 9, September 2025, pp. 7490-7505.
IEEE DOI
2508
Semantics, Pipelines, Vehicle dynamics, Automobiles, Transformers,
Training, Translation, Transforms, Sports, Robot sensing systems,
semantic alignment
BibRef
Hoe, J.T.[Jiun Tian],
Jiang, X.D.[Xu-Dong],
Chan, C.S.[Chee Seng],
Tan, Y.P.[Yap-Peng],
Hu, W.P.[Wei-Peng],
InteractDiffusion: Interaction Control in Text-to-Image Diffusion
Models,
CVPR24(6180-6189)
IEEE DOI Code:
WWW Link.
2410
Location awareness, Visualization, Computational modeling, Layout,
Text to image, Diffusion models, image generation, generative ai
BibRef
Hu, W.B.[Wen-Bo],
Gao, X.J.[Xiang-Jun],
Li, X.Y.[Xiao-Yu],
Zhao, S.[Sijie],
Cun, X.D.[Xiao-Dong],
Zhang, Y.[Yong],
Quan, L.[Long],
Shan, Y.[Ying],
DepthCrafter: Generating Consistent Long Depth Sequences for
Open-world Videos,
CVPR25(2005-2015)
IEEE DOI
2508
Training, Depth measurement, Estimation, Cameras, Visual effects,
Diffusion models, Optical flow, Videos, Synthetic data,
video diffusion model
BibRef
Lee, J.[Jaerin],
Jung, D.S.[Daniel Sungho],
Lee, K.[Kanggeon],
Lee, K.M.[Kyoung Mu],
SemanticDraw: Towards Real-Time Interactive Content Creation from
Image Diffusion Models,
CVPR25(13021-13030)
IEEE DOI Code:
WWW Link.
2508
Productivity, Pipelines, Semantics, Streaming media,
Diffusion models, Throughput, Real-time systems,
latent diffusion model
BibRef
Huang, C.P.[Chi-Pin],
Wu, Y.S.[Yen-Siang],
Chung, H.K.[Hung-Kai],
Chang, K.P.[Kai-Po],
Yang, F.E.[Fu-En],
Wang, Y.C.F.[Yu-Chiang Frank],
VideoMage: Multi-Subject and Motion Customization of Text-to-Video
Diffusion Models,
CVPR25(17603-17612)
IEEE DOI Code:
WWW Link.
2508
Visualization, Limiting, Collaboration, Diffusion models,
Controllability, Text to video, text-to-video diffusion models,
video concept customization
BibRef
Du, S.[Shian],
Xia, M.[Menghan],
Liu, C.[Chang],
Wang, X.T.[Xin-Tao],
Wang, J.[Jing],
Wan, P.F.[Peng-Fei],
Zhang, D.[Di],
Ji, X.Y.[Xiang-Yang],
PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise
Video Super-Resolution,
CVPR25(17799-17809)
IEEE DOI
2508
Adaptation models, Visualization, Superresolution, Semantics,
Modulation, Feature extraction, Diffusion models, Text to video, Videos
BibRef
Zhang, H.[Hui],
Gao, T.W.[Ting-Wei],
Shao, J.[Jie],
Wu, Z.X.[Zu-Xuan],
BlockDance: Reuse Structurally Similar Spatio-Temporal Features to
Accelerate Diffusion Transformers,
CVPR25(12891-12900)
IEEE DOI
2508
Visualization, Computational modeling, Noise reduction,
Decision making, Transformers, Diffusion models, Iterative methods,
video generation
BibRef
Wang, Y.Q.[Yu-Qing],
Ren, S.[Shuhuai],
Lin, Z.J.[Zhi-Jie],
Han, Y.J.[Yu-Jin],
Guo, H.Y.[Hao-Yuan],
Yang, Z.[Zhenheng],
Zou, D.[Difan],
Feng, J.S.[Jia-Shi],
Liu, X.H.[Xi-Hui],
Parallelized Autoregressive Visual Generation,
CVPR25(12955-12965)
IEEE DOI Code:
WWW Link.
2508
Degradation, Visualization, Predictive models, Standards, Videos,
autoregressive visual generation, image generation,
autoregressive models
BibRef
Lee, Y.C.[Yao-Chih],
Lu, E.[Erika],
Rumbley, S.[Sarah],
Geyer, M.[Michal],
Huang, J.B.[Jia-Bin],
Dekel, T.[Tali],
Cole, F.[Forrester],
Generative Omnimatte: Learning to Decompose Video into Layers,
CVPR25(12522-12532)
IEEE DOI
2508
Accuracy, Depth measurement, Computational modeling,
Diffusion models, Cameras, Reflection, Object recognition, Videos,
video diffusion model
BibRef
Wang, L.[Lei],
Li, S.M.[Sen-Mao],
Yang, F.[Fei],
Wang, J.[Jianye],
Zhang, Z.H.[Zi-Heng],
Liu, Y.H.[Yu-Han],
Wang, Y.X.[Ya-Xing],
Yang, J.[Jian],
Not All Parameters Matter: Masking Diffusion Models for Enhancing
Generation Ability,
CVPR25(12880-12890)
IEEE DOI Code:
WWW Link.
2508
Training, Computational modeling, Semantics, Noise reduction,
Deep architecture, Diffusion models, Text to video, Optimization
BibRef
Ge, Y.Y.[Yu-Ying],
Li, Y.Z.[Yi-Zhuo],
Ge, Y.X.[Yi-Xiao],
Shan, Y.[Ying],
Divot: Diffusion Powers Video Tokenizer for Comprehension and
Generation,
CVPR25(13606-13617)
IEEE DOI Code:
WWW Link.
2508
Representation learning, Large language models,
Benchmark testing, Predictive models, Diffusion models, diffusion
BibRef
Pallotta, E.[Enrico],
Azar, S.M.[Sina Mokhtarzadeh],
Li, S.[Shuai],
Zatsarynna, O.[Olga],
Gall, J.[Juergen],
SyncVP: Joint Diffusion for Synchronous Multi-Modal Video Prediction,
CVPR25(13787-13797)
IEEE DOI
2508
Training, Temperature measurement, Semantics, Predictive models,
Benchmark testing, Diffusion models, Robustness,
diffusion models
BibRef
Gu, Y.C.[Yu-Chao],
Zhou, Y.[Yipin],
Ye, Y.F.[Yun-Fan],
Nie, Y.X.[Yi-Xin],
Yu, L.C.[Li-Cheng],
Ma, P.[Pingchuan],
Lin, K.Q.[Kevin Qinghong],
Shou, M.Z.[Mike Zheng],
ROICtrl: Boosting Instance Control for Visual Generation,
CVPR25(23658-23667)
IEEE DOI
2508
Visualization, Computational modeling, Natural languages, Buildings,
Object detection, Diffusion models, Encoding, Computational efficiency
BibRef
Wu, Z.[Ziyi],
Siarohin, A.[Aliaksandr],
Menapace, W.[Willi],
Skorokhodov, I.[Ivan],
Fang, Y.W.[Yu-Wei],
Chordia, V.[Varnith],
Gilitschenski, I.[Igor],
Tulyakov, S.[Sergey],
Mind the Time: Temporally-Controlled Multi-Event Video Generation,
CVPR25(23989-24000)
IEEE DOI
2508
Transformers, Encoding, Generators, Timing, Videos, video generation,
diffusion transformer, temporal control, multi-event video generation
BibRef
Zhou, D.Y.[De-Yu],
Sun, Q.[Quan],
Peng, Y.[Yuang],
Yan, K.[Kun],
Dong, R.[Runpei],
Wang, D.[Duomin],
Ge, Z.[Zheng],
Duan, N.[Nan],
Zhang, X.Y.[Xiang-Yu],
Taming Teacher Forcing for Masked Autoregressive Video Generation,
CVPR25(7374-7384)
IEEE DOI
2508
Training, Bridges, Technological innovation,
Computational modeling, Video sequences, Benchmark testing,
generative models
BibRef
Gao, M.[Mingju],
Pan, Y.[Yike],
Gao, H.A.[Huan-Ang],
Zhang, Z.Z.[Zong-Zheng],
Li, W.[Wenyi],
Dong, H.[Hao],
Tang, H.[Hao],
Yi, L.[Li],
Zhao, H.[Hao],
PartRM: Modeling Part-Level Dynamics with Large Cross-State
Reconstruction Model,
CVPR25(7004-7014)
IEEE DOI
2508
Geometry, Training, Solid modeling, Dynamics, Predictive models,
Diffusion models, Image reconstruction, Videos, 3d vision,
drag based deformation
BibRef
Zhang, Q.H.[Qi-Hang],
Zhai, S.F.[Shuang-Fei],
Martin, M.A.B.[Miguel Angel Bautista],
Miao, K.[Kevin],
Toshev, A.[Alexander],
Susskind, J.[Josh],
Gu, J.[Jiatao],
World-consistent Video Diffusion with Explicit 3D Modeling,
CVPR25(21685-21695)
IEEE DOI Code:
WWW Link.
2508
Visualization, Image synthesis, Benchmark testing, Transformers,
Multitasking, Diffusion models, Trajectory, Videos, video generation,
diffusion model
BibRef
Ji, L.B.[Long-Bin],
Zhong, L.[Lei],
Wei, P.F.[Peng-Fei],
Li, C.J.[Chang-Jian],
PoseTraj: Pose-Aware Trajectory Control in Video Diffusion,
CVPR25(22776-22785)
IEEE DOI
2508
Solid modeling, Accuracy, Face recognition, Benchmark testing,
Trajectory, Quality assessment, Videos, Synthetic data,
trajectory-guided video generation
BibRef
Shao, J.H.[Jia-Hao],
Yang, Y.B.[Yuan-Bo],
Zhou, H.Y.[Hong-Yu],
Zhang, Y.M.[You-Min],
Shen, Y.J.[Yu-Jun],
Guizilini, V.[Vitor],
Wang, Y.[Yue],
Poggi, M.[Matteo],
Liao, Y.[Yiyi],
Learning Temporally Consistent Video Depth from Video Diffusion
Priors,
CVPR25(22841-22852)
IEEE DOI
2508
Training, Accuracy, Depth measurement, Computational modeling, Noise,
Videos, Noise level
BibRef
Li, Z.L.[Zhuo-Ling],
Rahmani, H.[Hossein],
Ke, Q.H.[Qiu-Hong],
Liu, J.[Jun],
LongDiff: Training-Free Long Video Generation in One Go,
CVPR25(17789-17798)
IEEE DOI
2508
Visualization, Diffusion models, Transformers, Videos
BibRef
Li, Z.J.[Zong-Jian],
Lin, B.[Bin],
Ye, Y.[Yang],
Chen, L.[Liuhan],
Cheng, X.H.[Xin-Hua],
Yuan, S.H.[Sheng-Hai],
Yuan, L.[Li],
WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent
Video Diffusion Model,
CVPR25(17778-17788)
IEEE DOI Code:
WWW Link.
2508
Wavelet transforms, Training, Costs, Autoencoders,
Feature extraction, Diffusion models, Throughput, Encoding,
wavelet transform
BibRef
Li, J.[Jiaman],
Liu, C.K.[C. Karen],
Wu, J.J.[Jia-Jun],
Lifting Motion to the 3D World via 2D Diffusion,
CVPR25(17518-17528)
IEEE DOI
2508
Training, Limiting, Animals, Diffusion models, Motion capture,
Trajectory, Optimization, Synthetic data
BibRef
Wang, C.Y.[Chao-Yang],
Zhuang, P.[Peiye],
Ngo, T.D.[Tuan Duc],
Menapace, W.[Willi],
Siarohin, A.[Aliaksandr],
Vasilkovsky, M.[Michael],
Skorokhodov, I.[Ivan],
Tulyakov, S.[Sergey],
Wonka, P.[Peter],
Lee, H.Y.[Hsin-Ying],
4Real-Video: Learning Generalizable Photo-Realistic 4D Video
Diffusion,
CVPR25(17723-17732)
IEEE DOI
2508
Training, Visualization, Limiting, Transformers, Robustness,
Synchronization, Videos, 4real-video, 4d video, video generation, 4d generation
BibRef
Akkerman, R.[Rick],
Feng, H.[Haiwen],
Black, M.J.[Michael J.],
Tzionas, D.[Dimitrios],
Abrevaya, V.F.[Victoria Fernández],
InterDyn: Controllable Interactive Dynamics with Video Diffusion
Models,
CVPR25(12467-12479)
IEEE DOI Code:
WWW Link.
2508
Dynamics, Toy manufacturing industry, Process control,
Predictive models, Encoding, Intelligent systems, Physics, Engines,
intuitive physics
BibRef
Hyung, J.[Junha],
Kim, K.[Kinam],
Hong, S.[Susung],
Kim, M.J.[Min-Jung],
Choo, J.[Jaegul],
Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling,
CVPR25(11006-11015)
IEEE DOI Code:
WWW Link.
2508
Training, Ethics, Diffusion models, Transformers,
Spatiotemporal phenomena, Quality assessment, Tuning, Videos, skip guidance
BibRef
Liu, Z.[Zhuoman],
Ye, W.[Weicai],
Luximon, Y.[Yan],
Wan, P.F.[Peng-Fei],
Zhang, D.[Di],
Unleashing the Potential of Multi-modal Foundation Models and Video
Diffusion for 4D Dynamic Physical Scene Simulation,
CVPR25(11016-11025)
IEEE DOI Code:
WWW Link.
2508
Optical losses, Solid modeling, Accuracy, Foundation models, Force,
Optical flow, Videos, Material properties,
video diffusion
BibRef
Liu, F.[Feng],
Zhang, S.W.[Shi-Wei],
Wang, X.F.[Xiao-Feng],
Wei, Y.J.[Yu-Jie],
Qiu, H.[Haonan],
Zhao, Y.Z.[Yu-Zhong],
Zhang, Y.Y.[Ying-Ya],
Ye, Q.X.[Qi-Xiang],
Wan, F.[Fang],
Timestep Embedding Tells: It's Time to Cache for Video Diffusion
Model,
CVPR25(7353-7363)
IEEE DOI
2508
Visualization, Analytical models, Schedules, Correlation,
Computational modeling, Fitting, Diffusion models, Polynomials,
cache mechanism
BibRef
Sun, M.Z.[Ming-Zhen],
Wang, W.[Weining],
Li, G.[Gen],
Liu, J.W.[Jia-Wei],
Sun, J.H.[Jia-Hui],
Feng, W.Q.[Wan-Quan],
Lao, S.S.[Shan-Shan],
Zhou, S.[Siyu],
He, Q.[Qian],
Liu, J.[Jing],
AR-Diffusion: Asynchronous Video Generation with Auto-Regressive
Diffusion,
CVPR25(7364-7373)
IEEE DOI
2508
Training, Computational modeling, Coherence, Benchmark testing,
Diffusion models, Videos, video generation, diffusion model,
video auto-encoder
BibRef
Jeong, H.[Hyeonho],
Huang, C.H.P.[Chun-Hao P.],
Ye, J.C.[Jong Chul],
Mitra, N.J.[Niloy J.],
Ceylan, D.[Duygu],
Track4Gen: Teaching Video Diffusion Models to Track Points Improves
Video Generation,
CVPR25(7276-7287)
IEEE DOI
2508
Visualization, Education, Coherence, Diffusion models, Generators,
Videos, video diffusion models, point tracking, video generation
BibRef
Ma, X.[Xin],
Wang, Y.H.[Yao-Hui],
Jia, G.[Gengyu],
Chen, X.Y.[Xin-Yuan],
Wong, T.T.[Tien-Tsin],
Li, Y.F.[Yuan-Fang],
Chen, C.[Cunjian],
Consistent and Controllable Image Animation with Motion Diffusion
Models,
CVPR25(7288-7298)
IEEE DOI Code:
WWW Link.
2508
Control design, Noise, Dynamics, Animation, Diffusion models,
Discrete cosine transforms, Videos, image animation,
controllable video generation
BibRef
Wu, S.[Sibo],
Xu, C.R.[Cong-Rong],
Huang, B.B.[Bin-Bin],
Geiger, A.[Andreas],
Chen, A.[Anpei],
GenFusion: Closing the Loop between Reconstruction and Generation via
Videos,
CVPR25(6078-6088)
IEEE DOI
2508
Training, Pipelines, Rendering (computer graphics),
Diffusion models, Videos
BibRef
Liu, J.X.[Jin-Xiu],
Lin, S.[Shaoheng],
Li, Y.X.[Yin-Xiao],
Yang, M.H.[Ming-Hsuan],
DynamicScaler: Seamless and Scalable Video Generation for Panoramic
Scenes,
CVPR25(6144-6153)
IEEE DOI Code:
WWW Link.
2508
Scalability, Dynamics, Noise reduction, Coherence, Diffusion models,
Spatial resolution, Videos, video generation, diffusion model, panorama
BibRef
Yan, X.[Xin],
Cai, Y.X.[Yu-Xuan],
Wang, Q.Y.[Qiu-Yue],
Zhou, Y.[Yuan],
Huang, W.H.[Wen-Hao],
Yang, H.[Huan],
Long Video Diffusion Generation with Segmented Cross-Attention and
Content-Rich Video Data Curation,
CVPR25(3184-3194)
IEEE DOI
2508
Attention mechanisms, Computational modeling, Semantics, Coherence,
Diffusion models, Videos
BibRef
Wang, H.Y.[Han-Yang],
Liu, F.[Fangfu],
Chi, J.W.[Jia-Wei],
Duan, Y.[Yueqi],
VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in
One Step,
CVPR25(16475-16485)
IEEE DOI
2508
Geometry, Visualization, Solid modeling, Noise reduction,
Diffusion models, Quality assessment, Optimization, Videos, video-to-3d
BibRef
Burgert, R.[Ryan],
Xu, Y.C.[Yuan-Cheng],
Xian, W.Q.[Wen-Qi],
Pilarski, O.[Oliver],
Clausen, P.[Pascal],
He, M.M.[Ming-Ming],
Ma, L.[Li],
Deng, Y.T.[Yi-Tong],
Li, L.X.[Ling-Xiao],
Mousavi, M.[Mohsen],
Ryoo, M.[Michael],
Debevec, P.[Paul],
Yu, N.[Ning],
Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using
Real-Time Warped Noise,
CVPR25(13-23)
IEEE DOI
2508
Training, Source coding, Noise, Pipelines, Transforms, Streaming media,
Diffusion models, Real-time systems, Motion control, noise warping
BibRef
Lu, R.J.[Rui-Jie],
Chen, Y.X.[Yi-Xin],
Ni, J.F.[Jun-Feng],
Jia, B.X.[Bao-Xiong],
Liu, Y.[Yu],
Wan, D.[Diwen],
Zeng, G.[Gang],
Huang, S.Y.[Si-Yuan],
MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes,
CVPR25(26767-26778)
IEEE DOI Code:
WWW Link.
2508
Training, Measurement, Shape, Computational modeling, Synthesizers,
Noise reduction, Predictive models, Diffusion models,
diffusion model
BibRef
Yang, L.[Lehan],
Qi, L.[Lu],
Li, X.T.[Xiang-Tai],
Li, S.[Sheng],
Jampani, V.[Varun],
Yang, M.H.[Ming-Hsuan],
Unified Dense Prediction of Video Diffusion,
CVPR25(28963-28973)
IEEE DOI
2508
Training, Motion segmentation, Computational modeling,
Depth measurement, Predictive models, Multitasking, Encoding,
depth estimation
BibRef
Wu, R.[Rundi],
Gao, R.Q.[Rui-Qi],
Poole, B.[Ben],
Trevithick, A.[Alex],
Zheng, C.X.[Chang-Xi],
Barron, J.T.[Jonathan T.],
Holynski, A.[Aleksander],
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models,
CVPR25(26057-26068)
IEEE DOI
2508
Deformable models, Solid modeling, Transforms, Benchmark testing,
Diffusion models, Cameras, Optimization, 4d generation,
4d reconstruction
BibRef
Tan, J.T.[Jiang-Tong],
Yu, H.[Hu],
Huang, J.[Jie],
Xiao, J.[Jie],
Zhao, F.[Feng],
FreePCA: Integrating Consistency Information across Long-short Frames
in Training-free Long Video Generation via Principal Component
Analysis,
CVPR25(27979-27988)
IEEE DOI Code:
WWW Link.
2508
Training, Visualization, Noise, Feature extraction, Diffusion models,
Motion measurement, Videos, Principal component analysis, long video generation
BibRef
Zhai, S.[Shangjin],
Ye, Z.C.[Zhi-Chao],
Liu, J.L.[Jia-Lin],
Xie, W.J.[Wei-Jian],
Hu, J.Q.[Jia-Qi],
Peng, Z.[Zhen],
Xue, H.[Hua],
Chen, D.[Danpeng],
Wang, X.M.[Xiao-Meng],
Yang, L.[Lei],
Wang, N.[Nan],
Liu, H.M.[Hao-Min],
Zhang, G.F.[Guo-Feng],
StarGen: A Spatiotemporal Autoregression Framework with Video
Diffusion Model for Scalable and Controllable Scene Generation,
CVPR25(26822-26833)
IEEE DOI Code:
WWW Link.
2508
Interpolation, Accuracy, Scalability, Computational modeling,
Urban areas, Diffusion models, Spatiotemporal phenomena,
diffusion models
BibRef
Soucek, T.[Tomáš],
Gatti, P.[Prajwal],
Wray, M.[Michael],
Laptev, I.[Ivan],
Damen, D.[Dima],
Sivic, J.[Josef],
ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual
Instructions,
CVPR25(27435-27445)
IEEE DOI
2508
Visualization, Accuracy, Pipelines, Training data, Manuals,
Diffusion models, Image sequences, Planning, Videos,
video diffusion model
BibRef
Cai, S.Q.[Sheng-Qu],
Chan, E.R.[Eric Ryan],
Zhang, Y.Z.[Yun-Zhi],
Guibas, L.J.[Leonidas J.],
Wu, J.J.[Jia-Jun],
Wetzstein, G.[Gordon],
Diffusion Self-Distillation for Zero-Shot Customized Image Generation,
CVPR25(18434-18443)
IEEE DOI
2508
Art, Image synthesis, Computational modeling, Text to image,
Diffusion models, Data models, Tuning, Optimization, computer vision,
image generation
BibRef
Yin, T.W.[Tian-Wei],
Zhang, Q.[Qiang],
Zhang, R.[Richard],
Freeman, W.T.[William T.],
Durand, F.[Frédo],
Shechtman, E.[Eli],
Huang, X.[Xun],
From Slow Bidirectional to Fast Autoregressive Video Diffusion Models,
CVPR25(22963-22974)
IEEE DOI
2508
Training, Translation, Graphics processing units, Streaming media,
Diffusion models, Transformers, Generators, Trajectory, Videos,
diffusion distillation
BibRef
Zhang, Z.H.[Zheng-Hao],
Liao, J.C.[Jun-Chao],
Li, M.H.[Meng-Hao],
Dai, Z.Z.[Zuo-Zhuo],
Qiu, B.X.[Bing-Xue],
Zhu, S.[Siyu],
Qin, L.[Long],
Wang, W.Z.[Wei-Zhi],
TORA: Trajectory-oriented Diffusion Transformer for Video Generation,
CVPR25(2063-2073)
IEEE DOI Code:
WWW Link.
2508
Training, Visualization, Scalability, Dynamics, Transformers,
Robustness, Trajectory, Videos
BibRef
Jin, W.[Wonjoon],
Dai, Q.[Qi],
Luo, C.[Chong],
Baek, S.H.[Seung-Hwan],
Cho, S.[Sunghyun],
FloVD: Optical Flow Meets Video Diffusion Model for Enhanced
Camera-Controlled Video Synthesis,
CVPR25(2040-2049)
IEEE DOI
2508
Training, Correlation, Accuracy, Semantic segmentation, Pipelines,
Cameras, Diffusion models, Optical flow, Videos, video synthesis,
optical flow
BibRef
Wu, X.[Xindi],
Singer, U.[Uriel],
Lin, Z.J.[Zhao-Jiang],
Madotto, A.[Andrea],
Xia, X.[Xide],
Xu, Y.F.[Yi-Fan],
Crook, P.[Paul],
Dong, X.L.[Xin Luna],
Moon, S.[Seungwhan],
Corgi: Cached Memory Guided Video Generation,
WACV25(4585-4594)
IEEE DOI
2505
Visualization, Art, Diffusion models, Text to video
BibRef
Rahman, A.[Aimon],
Perera, M.V.[Malsha V.],
Patel, V.M.[Vishal M.],
Frame by Familiar Frame: Understanding Replication in Video Diffusion
Models,
WACV25(2766-2776)
IEEE DOI
2505
Training, Measurement, Systematics, Image synthesis, Law,
Computational modeling, Training data, Diffusion models,
memorization
BibRef
Li, P.X.[Peng-Xiang],
Chen, K.[Kai],
Liu, Z.[Zhili],
Gao, R.[Ruiyuan],
Hong, L.Q.[Lan-Qing],
Yeung, D.Y.[Dit-Yan],
Lu, H.C.[Hu-Chuan],
Jia, X.[Xu],
TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion
Models,
WACV25(3539-3548)
IEEE DOI
2505
Training, Visualization, Tracking, Computational modeling,
Video sequences, Training data, Diffusion models, Data models,
Motion control
BibRef
Yang, J.W.[Jing-Wen],
Sun, J.M.[Jia-Mu],
Yang, Y.L.[Yong-Liang],
Yang, J.[Jie],
Shan, Y.[Ying],
Cao, Y.P.[Yan-Pei],
Gao, L.[Lin],
DMiT: Deformable Mipmapped Tri-plane Representation for Dynamic Scenes,
ECCV24(LV: 436-453).
Springer DOI
2412
BibRef
Zatsarynna, O.[Olga],
Bahrami, E.[Emad],
Farha, Y.A.[Yazan Abu],
Francesca, G.[Gianpiero],
Gall, J.[Juergen],
Gated Temporal Diffusion for Stochastic Long-term Dense Anticipation,
ECCV24(LV: 454-472).
Springer DOI
2412
BibRef
Hong, F.Z.[Fang-Zhou],
Guzov, V.[Vladimir],
Kim, H.J.[Hyo Jin],
Ye, Y.T.[Yu-Ting],
Newcombe, R.[Richard],
Liu, Z.W.[Zi-Wei],
Ma, L.[Lingni],
EgoLM: Multi-Modal Language Model of Egocentric Motions,
CVPR25(5344-5354)
IEEE DOI Code:
WWW Link.
2508
Computational modeling, Large language models, Natural languages,
Motion detection, Wearable devices, Motion control, Videos
BibRef
Zhang, M.Y.[Ming-Yuan],
Jin, D.[Daisheng],
Gu, C.Y.[Chen-Yang],
Hong, F.Z.[Fang-Zhou],
Cai, Z.A.[Zhong-Ang],
Huang, J.F.[Jing-Fang],
Zhang, C.Z.[Chong-Zhi],
Guo, X.Y.[Xin-Ying],
Yang, L.[Lei],
He, Y.[Ying],
Liu, Z.W.[Zi-Wei],
Large Motion Model for Unified Multi-modal Motion Generation,
ECCV24(XIII: 397-421).
Springer DOI
2412
BibRef
Lu, J.C.[Jia-Chen],
Huang, Z.[Ze],
Yang, Z.[Zeyu],
Zhang, J.H.[Jia-Hui],
Zhang, L.[Li],
Wovogen: World Volume-aware Diffusion for Controllable Multi-camera
Driving Scene Generation,
ECCV24(LXXX: 329-345).
Springer DOI
2412
BibRef
Sampieri, A.[Alessio],
Palma, A.[Alessio],
Spinelli, I.[Indro],
Galasso, F.[Fabio],
Length-aware Motion Synthesis via Latent Diffusion,
ECCV24(LIII: 107-124).
Springer DOI
2412
BibRef
Zhu, L.[Lin],
Zheng, Y.L.[Yun-Long],
Zhang, Y.J.[Yi-Jun],
Wang, X.[Xiao],
Wang, L.Z.[Li-Zhi],
Huang, H.[Hua],
Temporal Residual Guided Diffusion Framework for Event-driven Video
Reconstruction,
ECCV24(XL: 411-427).
Springer DOI
2412
BibRef
Jin, P.[Peng],
Li, H.[Hao],
Cheng, Z.[Zesen],
Li, K.[Kehan],
Yu, R.[Runyi],
Liu, C.[Chang],
Ji, X.Y.[Xiang-Yang],
Yuan, L.[Li],
Chen, J.[Jie],
Local Action-guided Motion Diffusion Model for Text-to-motion
Generation,
ECCV24(XXV: 392-409).
Springer DOI
2412
BibRef
Chi, S.G.[Seung-Geun],
Chi, H.G.[Hyung-Gun],
Ma, H.[Hengbo],
Agarwal, N.[Nakul],
Siddiqui, F.[Faizan],
Ramani, K.[Karthik],
Lee, K.[Kwonjoon],
M2d2m: Multi-Motion Generation from Text with Discrete Diffusion Models,
ECCV24(XIV: 18-36).
Springer DOI
2412
BibRef
Kara, O.[Ozgur],
Singh, K.K.[Krishna Kumar],
Liu, F.[Feng],
Ceylan, D.[Duygu],
Rehg, J.M.[James M.],
Hinz, T.[Tobias],
ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion
Models,
CVPR25(28405-28415)
IEEE DOI
2508
Computational modeling, Pipelines, Training data, Transforms, Data collection,
Diffusion models, Generators, Text to video, video generation
BibRef
Cai, S.Q.[Sheng-Qu],
Ceylan, D.[Duygu],
Gadelha, M.[Matheus],
Huang, C.H.P.[Chun-Hao Paul],
Wang, T.F.Y.[Tuan-Feng Yang],
Wetzstein, G.[Gordon],
Generative Rendering: Controllable 4D-Guided Video Generation with 2D
Diffusion Models,
CVPR24(7611-7620)
IEEE DOI
2410
Geometry, Pipelines, Text to image, Manuals, Diffusion models,
Rendering (computer graphics), Computer Graphics, Animation
BibRef
Zhao, Y.P.[Ya-Ping],
Zhang, P.[Pei],
Wang, C.[Chutian],
Lam, E.Y.[Edmund Y.],
Controllable Unsupervised Event-Based Video Generation,
ICIP24(2278-2284)
IEEE DOI Code:
WWW Link.
2411
Training, Codes, Image edge detection, Cameras,
Diffusion models, neuromorphic imaging, computational imaging
BibRef
Qazi, T.[Tayeba],
Lall, B.[Brejesh],
Thermal Videodiff (TVD):
A Diffusion Architecture for Thermal Video Synthesis,
ICIP24(2438-2444)
IEEE DOI Code:
WWW Link.
2411
Deep learning, Temperature distribution, Costs, Infrared imaging,
Thermal sensors, Diffusion models, Synthetic Video Generation,
Visible Spectrum Context
BibRef
Zhang, K.W.[Kai-Wen],
Zhou, Y.F.[Yi-Fan],
Xu, X.D.[Xu-Dong],
Dai, B.[Bo],
Pan, X.G.[Xin-Gang],
DiffMorpher: Unleashing the Capability of Diffusion Models for Image
Morphing,
CVPR24(7912-7921)
IEEE DOI
2410
Interpolation, Schedules, Image synthesis, Semantics, Image morphing,
Noise, Fitting, Diffusion models, video generation
BibRef
Chen, H.X.[Hao-Xin],
Zhang, Y.[Yong],
Cun, X.D.[Xiao-Dong],
Xia, M.H.[Meng-Han],
Wang, X.T.[Xin-Tao],
Weng, C.[Chao],
Shan, Y.[Ying],
VideoCrafter2: Overcoming Data Limitations for High-Quality Video
Diffusion Models,
CVPR24(7310-7320)
IEEE DOI
2410
Training, Couplings, Degradation, Analytical models, Noise, Diffusion models
BibRef
Ren, Y.X.[Yi-Xuan],
Zhou, Y.[Yang],
Yang, J.[Jimei],
Shi, J.[Jing],
Liu, D.[Difan],
Liu, F.[Feng],
Kwon, M.[Mingi],
Shrivastava, A.[Abhinav],
Customize-a-video: One-shot Motion Customization of Text-to-video
Diffusion Models,
ECCV24(LXXXIX: 332-349).
Springer DOI
2412
BibRef
Zhang, J.T.[Jun-Tao],
Liu, Y.H.[Yue-Huai],
Tai, Y.W.[Yu-Wing],
Tang, C.K.[Chi-Keung],
C3Net: Compound Conditioned ControlNet for Multimodal Content
Generation,
CVPR24(26876-26885)
IEEE DOI
2410
Training, Interpolation, Semantics, Training data,
Aerospace electronics, Diffusion models
BibRef
Li, Z.Q.[Zheng-Qi],
Tucker, R.[Richard],
Snavely, N.[Noah],
Holynski, A.[Aleksander],
Generative Image Dynamics,
CVPR24(24142-24153)
IEEE DOI Code:
WWW Link.
2410
Solid modeling, Dynamics, Video sequences, Predictive models,
Diffusion models, Rendering (computer graphics), Turning
BibRef
Zhuang, S.[Shaobin],
Li, K.[Kunchang],
Chen, X.Y.[Xin-Yuan],
Wang, Y.H.[Yao-Hui],
Liu, Z.W.[Zi-Wei],
Qiao, Y.[Yu],
Wang, Y.[Yali],
Vlogger: Make Your Dream A Vlog,
CVPR24(8806-8817)
IEEE DOI
2410
Training, Visualization, Spatial coherence, Coherence,
Diffusion models, Boosting, Planning,
spatial-temporal coherence
BibRef
Zeng, Y.[Yan],
Wei, G.Q.[Guo-Qiang],
Zheng, J.[Jiani],
Zou, J.X.[Jia-Xin],
Wei, Y.[Yang],
Zhang, Y.C.[Yu-Chen],
Li, H.[Hang],
Make Pixels Dance: High-Dynamic Video Generation,
CVPR24(8850-8860)
IEEE DOI
2410
Training, Humanities, Focusing, Diffusion models, Visual effects,
Video Generation, Diffusion Models
BibRef
Zhang, Z.C.[Zhi-Cheng],
Hu, J.[Junyao],
Cheng, W.T.[Wen-Tao],
Paudel, D.[Danda],
Yang, J.F.[Ju-Feng],
ExtDM: Distribution Extrapolation Diffusion Model for Video
Prediction,
CVPR24(19310-19320)
IEEE DOI
2410
Extrapolation, Solid modeling, Uncertainty, Computational modeling,
Predictive models, Diffusion models, Video Generation, Diffusion Model
BibRef
Skorokhodov, I.[Ivan],
Menapace, W.[Willi],
Siarohin, A.[Aliaksandr],
Tulyakov, S.[Sergey],
Hierarchical Patch Diffusion Models for High-Resolution Video
Generation,
CVPR24(7569-7579)
IEEE DOI Code:
WWW Link.
2410
Training, Limiting, Scalability, Computational modeling, Pipelines,
Diffusion models, Generators, video generation, diffusion models, efficiency
BibRef
Jiang, Y.M.[Yu-Ming],
Wu, T.X.[Tian-Xing],
Yang, S.[Shuai],
Si, C.Y.[Chen-Yang],
Lin, D.[Dahua],
Qiao, Y.[Yu],
Loy, C.C.[Chen Change],
Liu, Z.W.[Zi-Wei],
VideoBooth: Diffusion-based Video Generation with Image Prompts,
CVPR24(6689-6700)
IEEE DOI
2410
Visualization, Image coding, Accuracy, Computational modeling,
Video Generation, Diffusion Models
BibRef
Wei, Y.J.[Yu-Jie],
Zhang, S.W.[Shi-Wei],
Qing, Z.W.[Zhi-Wu],
Yuan, H.J.[Hang-Jie],
Liu, Z.H.[Zhi-Heng],
Liu, Y.[Yu],
Zhang, Y.Y.[Ying-Ya],
Zhou, J.R.[Jing-Ren],
Shan, H.M.[Hong-Ming],
Dream Video: Composing Your Dream Videos with Customized Subject and
Motion,
CVPR24(6537-6549)
IEEE DOI Code:
WWW Link.
2410
Adaptation models, Image synthesis, Computational modeling,
Diffusion models, Controllability, customized generation
BibRef
Gal, R.[Rinon],
Vinker, Y.[Yael],
Alaluf, Y.[Yuval],
Bermano, A.[Amit],
Cohen-Or, D.[Daniel],
Shamir, A.[Ariel],
Chechik, G.[Gal],
Breathing Life Into Sketches Using Text-to-Video Priors,
CVPR24(4325-4336)
IEEE DOI
2410
Training, Deformation, Animation, Diffusion models, Vectors,
Sketch animation, text-to-video, diffusion score distillation
BibRef
Jain, Y.[Yash],
Nasery, A.[Anshul],
Vineet, V.[Vibhav],
Behl, H.[Harkirat],
Peekaboo: Interactive Video Generation via Masked-Diffusion,
CVPR24(8079-8088)
IEEE DOI
2410
Training, Codes, Computational modeling, Benchmark testing,
Creativity, video generation, diffusion,
text to video
BibRef
Yatim, D.[Danah],
Fridman, R.[Rafail],
Bar-Tal, O.[Omer],
Kasten, Y.[Yoni],
Dekel, T.[Tali],
Space-Time Diffusion Features for Zero-Shot Text-Driven Motion
Transfer,
CVPR24(8466-8476)
IEEE DOI Code:
WWW Link.
2410
Shape, Layout, Dogs, Diffusion models,
Text to video, video editing, motion transfer, diffusion models
BibRef
Chen, S.[Shoufa],
Xu, M.M.[Meng-Meng],
Ren, J.W.[Jia-Wei],
Cong, Y.[Yuren],
He, S.[Sen],
Xie, Y.P.[Yan-Ping],
Sinha, A.[Animesh],
Luo, P.[Ping],
Xiang, T.[Tao],
Perez-Rua, J.M.[Juan-Manuel],
GenTron: Diffusion Transformers for Image and Video Generation,
CVPR24(6441-6451)
IEEE DOI Code:
WWW Link.
2410
Visualization, Adaptation models, Scalability, Transformers,
Diffusion models, Quality assessment, Diffusion Transformers,
Text-to-Video Generation
BibRef
Lee, T.[Taegyeong],
Kwon, S.[Soyeong],
Kim, T.[Taehwan],
Grid Diffusion Models for Text-to-Video Generation,
CVPR24(8734-8743)
IEEE DOI
2410
Visualization, Computational modeling, Memory management,
Text to image, Graphics processing units, Diffusion models
BibRef
Eldesokey, A.[Abdelrahman],
Wonka, P.[Peter],
LatentMan: Generating Consistent Animated Characters using Image
Diffusion Models,
GCV24(7510-7519)
IEEE DOI
2410
Bridges, Visualization, Computational modeling, Text to image,
Diffusion processes, Diffusion Models, Text-to-Video, Animation, Text-to-Image
BibRef
Yuan, X.[Xin],
Baek, J.[Jinoo],
Xu, K.[Keyang],
Tov, O.[Omer],
Fei, H.L.[Hong-Liang],
Inflation with Diffusion: Efficient Temporal Adaptation for
Text-to-Video Super-Resolution,
VAQuality24(489-496)
IEEE DOI
2404
Adaptation models, Visualization, Computational modeling,
Superresolution, Computer architecture
BibRef
Wu, J.Z.J.[Jay Zhang-Jie],
Ge, Y.X.[Yi-Xiao],
Wang, X.T.[Xin-Tao],
Lei, S.W.X.[Stan Wei-Xian],
Gu, Y.C.[Yu-Chao],
Shi, Y.F.[Yu-Fei],
Hsu, W.[Wynne],
Shan, Y.[Ying],
Qie, X.H.[Xiao-Hu],
Shou, M.Z.[Mike Zheng],
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for
Text-to-Video Generation,
ICCV23(7589-7599)
IEEE DOI
2401
BibRef
Zhu, Y.Z.[Yuan-Zhi],
Liu, X.C.[Xing-Chao],
Liu, Q.[Qiang],
Slimflow: Training Smaller One-step Diffusion Models with Rectified
Flow,
ECCV24(LXXXII: 342-359).
Springer DOI
2412
BibRef
Zhong, L.[Lei],
Xie, Y.M.[Yi-Ming],
Jampani, V.[Varun],
Sun, D.Q.[De-Qing],
Jiang, H.[Huaizu],
Smoodi: Stylized Motion Diffusion Model,
ECCV24(I: 405-421).
Springer DOI
2412
BibRef
Zhou, W.Y.[Wen-Yang],
Dou, Z.Y.[Zhi-Yang],
Cao, Z.[Zeyu],
Liao, Z.Y.C.[Zhou-Ying-Cheng],
Wang, J.B.[Jing-Bo],
Wang, W.J.[Wen-Jia],
Liu, Y.[Yuan],
Komura, T.[Taku],
Wang, W.P.[Wen-Ping],
Liu, L.J.[Ling-Jie],
EMDM: Efficient Motion Diffusion Model for Fast and High-quality Motion
Generation,
ECCV24(II: 18-38).
Springer DOI
2412
BibRef
Gupta, A.[Agrim],
Yu, L.J.[Li-Jun],
Sohn, K.[Kihyuk],
Gu, X.[Xiuye],
Hahn, M.[Meera],
Li, F.F.[Fei-Fei],
Essa, I.[Irfan],
Jiang, L.[Lu],
Lezama, J.[José],
Photorealistic Video Generation with Diffusion Models,
ECCV24(LXXIX: 393-411).
Springer DOI
2412
BibRef
Han, J.L.[Jun-Lin],
Kokkinos, F.[Filippos],
Torr, P.H.S.[Philip H.S.],
Vfusion3d: Learning Scalable 3d Generative Models from Video Diffusion
Models,
ECCV24(II: 333-350).
Springer DOI
2412
BibRef
Jagpal, D.[Diljeet],
Chen, X.[Xi],
Namboodiri, V.P.[Vinay P.],
EIDT-V: Exploiting Intersections in Diffusion Trajectories for
Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation,
CVPR25(18219-18228)
IEEE DOI Code:
WWW Link.
2508
Training, Adaptation models, Visualization, Scalability,
Video sequences, Switches, Coherence, Trajectory, Timing,
generative models
BibRef
Xue, Q.Y.[Qi-Yao],
Yin, X.Y.[Xiang-Yu],
Yang, B.[Boyuan],
Gao, W.[Wei],
PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded
Text-to-Video Generation,
CVPR25(18826-18836)
IEEE DOI
2508
Computational modeling, Transformers, Diffusion models, Cognition,
Iterative methods, Text to video, video generation,
step-back reasoning
BibRef
Qi, T.H.[Tian-Hao],
Yuan, J.L.[Jian-Long],
Feng, W.Q.[Wan-Quan],
Fang, S.C.[Shan-Cheng],
Liu, J.W.[Jia-Wei],
Zhou, S.Y.[Si-Yu],
He, Q.[Qian],
Xie, H.T.[Hong-Tao],
Zhang, Y.D.[Yong-Dong],
Mask2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long
Video Generation,
CVPR25(18837-18846)
IEEE DOI Code:
WWW Link.
2508
Visualization, Annotations, Semantics, Coherence, Transformers,
Data models, Videos, diffusion transformer, multi-scene video generation
BibRef
Shi, F.Y.[Feng-Yuan],
Gu, J.X.[Jia-Xi],
Xu, H.[Hang],
Xu, S.[Songcen],
Zhang, W.[Wei],
Wang, L.M.[Li-Min],
BIVDiff: A Training-Free Framework for General-Purpose Video
Synthesis via Bridging Image and Video Diffusion Models,
CVPR24(7393-7402)
IEEE DOI
2410
Training, Smoothing methods, Image synthesis, Memory management,
Text to image, Diffusion models, Video Synthesis, General Framework
BibRef
Ge, S.W.[Song-Wei],
Nah, S.J.[Seung-Jun],
Liu, G.L.[Gui-Lin],
Poon, T.[Tyler],
Tao, A.[Andrew],
Catanzaro, B.[Bryan],
Jacobs, D.[David],
Huang, J.B.[Jia-Bin],
Liu, M.Y.[Ming-Yu],
Balaji, Y.[Yogesh],
Preserve Your Own Correlation:
A Noise Prior for Video Diffusion Models,
ICCV23(22873-22884)
IEEE DOI Code:
WWW Link.
2401
BibRef
Zhu, Z.X.[Zi-Xin],
Feng, X.L.[Xue-Lu],
Chen, D.D.[Dong-Dong],
Yuan, J.S.[Jun-Song],
Qiao, C.M.[Chun-Ming],
Hua, G.[Gang],
Exploring Pre-trained Text-to-video Diffusion Models for Referring
Video Object Segmentation,
ECCV24(XII: 452-469).
Springer DOI
2412
BibRef
Guo, Y.W.[Yu-Wei],
Yang, C.[Ceyuan],
Rao, A.[Anyi],
Agrawala, M.[Maneesh],
Lin, D.[Dahua],
Dai, B.[Bo],
Sparsectrl: Adding Sparse Controls to Text-to-video Diffusion Models,
ECCV24(XLII: 330-348).
Springer DOI
2412
BibRef
Zhao, R.[Rui],
Gu, Y.C.[Yu-Chao],
Wu, J.Z.J.[Jay Zhang-Jie],
Zhang, D.J.H.[David Jun-Hao],
Liu, J.W.[Jia-Wei],
Wu, W.J.[Wei-Jia],
Keppo, J.[Jussi],
Shou, M.Z.[Mike Zheng],
Motiondirector: Motion Customization of Text-to-video Diffusion Models,
ECCV24(LVI: 273-290).
Springer DOI
2412
BibRef
Liang, J.Y.[Jing-Yun],
Fan, Y.C.[Yu-Chen],
Zhang, K.[Kai],
Timofte, R.[Radu],
Van Gool, L.J.[Luc J.],
Ranjan, R.[Rakesh],
Movideo: Motion-aware Video Generation with Diffusion Model,
ECCV24(XLIV: 56-74).
Springer DOI
2412
BibRef
Huang, T.[Tao],
Jiang, G.Q.[Guang-Qi],
Ze, Y.J.[Yan-Jie],
Xu, H.Z.[Hua-Zhe],
Diffusion Reward: Learning Rewards via Conditional Video Diffusion,
ECCV24(XLII: 478-495).
Springer DOI
2412
BibRef
Niu, M.[Muyao],
Cun, X.D.[Xiao-Dong],
Wang, X.T.[Xin-Tao],
Zhang, Y.[Yong],
Shan, Y.[Ying],
Zheng, Y.Q.[Yin-Qiang],
MOFA-Video: Controllable Image Animation via Generative Motion Field
Adaptions in Frozen Image-to-Video Diffusion Model,
ECCV24(XIX: 111-128).
Springer DOI
2412
BibRef
Wu, T.X.[Tian-Xing],
Si, C.Y.[Chen-Yang],
Jiang, Y.M.[Yu-Ming],
Huang, Z.Q.[Zi-Qi],
Liu, Z.W.[Zi-Wei],
Freeinit: Bridging Initialization Gap in Video Diffusion Models,
ECCV24(III: 378-394).
Springer DOI
2412
BibRef
Xing, J.[Jinbo],
Xia, M.H.[Meng-Han],
Zhang, Y.[Yong],
Chen, H.X.[Hao-Xin],
Yu, W.B.[Wang-Bo],
Liu, H.Y.[Han-Yuan],
Liu, G.[Gongye],
Wang, X.T.[Xin-Tao],
Shan, Y.[Ying],
Wong, T.T.[Tien-Tsin],
Dynamicrafter: Animating Open-domain Images with Video Diffusion Priors,
ECCV24(XLVI: 399-417).
Springer DOI
2412
BibRef
Kim, K.[Kihong],
Lee, H.[Haneol],
Park, J.[Jihye],
Kim, S.[Seyeon],
Lee, K.[Kwanghee],
Kim, S.[Seungryong],
Yoo, J.[Jaejun],
Hybrid Video Diffusion Models with 2d Triplane and 3d Wavelet
Representation,
ECCV24(LII: 148-165).
Springer DOI
2412
BibRef
Yuan, H.J.[Hang-Jie],
Zhang, S.W.[Shi-Wei],
Wang, X.[Xiang],
Wei, Y.J.[Yu-Jie],
Feng, T.[Tao],
Pan, Y.[Yining],
Zhang, Y.Y.[Ying-Ya],
Liu, Z.W.[Zi-Wei],
Albanie, S.[Samuel],
Ni, D.[Dong],
InstructVideo: Instructing Video Diffusion Models with Human Feedback,
CVPR24(6463-6474)
IEEE DOI Code:
WWW Link.
2410
Degradation, Visualization, Image segmentation, Costs,
Computational modeling, Diffusion processes, video generation,
reward fine-tuning
BibRef
Ni, H.[Haomiao],
Egger, B.[Bernhard],
Lohit, S.[Suhas],
Cherian, A.[Anoop],
Wang, Y.[Ye],
Koike-Akino, T.[Toshiaki],
Huang, S.X.[Sharon X.],
Marks, T.K.[Tim K.],
TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion
Models,
CVPR24(9015-9025)
IEEE DOI
2410
Training, Visualization, Gaussian noise, Noise reduction,
Diffusion models, Sampling methods
BibRef
Jeong, H.[Hyeonho],
Park, G.Y.[Geon Yeong],
Ye, J.C.[Jong Chul],
VMC: Video Motion Customization Using Temporal Attention Adaption for
Text-to-Video Diffusion Models,
CVPR24(9212-9221)
IEEE DOI Code:
WWW Link.
2410
Visualization, Adaptation models, Noise, Diffusion models, Vectors,
Trajectory, Diffusion Models,
Motion Customization
BibRef
Motamed, S.[Saman],
van Gansbeke, W.[Wouter],
Van Gool, L.J.[Luc J.],
Investigating the Effectiveness of Cross-Attention to Unlock
Zero-Shot Editing of Text-to-Video Diffusion Models,
GCV24(7406-7415)
IEEE DOI Code:
WWW Link.
2410
Codes, Shape, Computational modeling, Text to image,
Diffusion models, Video Diffusion models,
Video editing
BibRef
Fei, H.[Hao],
Wu, S.Q.[Sheng-Qiong],
Ji, W.[Wei],
Zhang, H.W.[Han-Wang],
Chua, T.S.[Tat-Seng],
Dysen-VDM: Empowering Dynamics-Aware Text-to-Video Diffusion with
LLMs,
CVPR24(7641-7653)
IEEE DOI Code:
WWW Link.
2410
Schedules, Codes, Dynamics, Dynamic scheduling, Diffusion models, Chatbots
BibRef
Blattmann, A.[Andreas],
Rombach, R.[Robin],
Ling, H.[Huan],
Dockhorn, T.[Tim],
Kim, S.W.[Seung Wook],
Fidler, S.[Sanja],
Kreis, K.[Karsten],
Align Your Latents: High-Resolution Video Synthesis with Latent
Diffusion Models,
CVPR23(22563-22575)
IEEE DOI
2309
BibRef
Rombach, R.[Robin],
Blattmann, A.[Andreas],
Lorenz, D.[Dominik],
Esser, P.[Patrick],
Ommer, B.[Björn],
High-Resolution Image Synthesis with Latent Diffusion Models,
CVPR22(10674-10685)
IEEE DOI
2210
Training, Visualization, Image synthesis, Computational modeling,
Noise reduction, Superresolution, Process control,
Image and video synthesis and generation
BibRef
Yeganeh, Y.[Yousef],
Farshad, A.[Azade],
Charisiadis, I.[Ioannis],
Hasny, M.[Marta],
Hartenberger, M.[Martin],
Ommer, B.[Björn],
Navab, N.[Nassir],
Adeli, E.[Ehsan],
Latent Drifting in Diffusion Models for Counterfactual Medical Image
Synthesis,
CVPR25(7685-7695)
IEEE DOI
2508
Training, Image synthesis, Computational modeling,
Magnetic resonance imaging, Text to image, Performance gain, generative models
BibRef
Khachatryan, L.[Levon],
Movsisyan, A.[Andranik],
Tadevosyan, V.[Vahram],
Henschel, R.[Roberto],
Wang, Z.Y.[Zhang-Yang],
Navasardyan, S.[Shant],
Shi, H.[Humphrey],
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video
Generators,
ICCV23(15908-15918)
IEEE DOI Code:
WWW Link.
2401
BibRef
Ni, H.[Haomiao],
Shi, C.H.[Chang-Hao],
Li, K.[Kai],
Huang, S.X.[Sharon X.],
Min, M.R.[Martin Renqiang],
Conditional Image-to-Video Generation with Latent Flow Diffusion
Models,
CVPR23(18444-18455)
IEEE DOI
2309
BibRef
Luo, Z.X.[Zheng-Xiong],
Chen, D.Y.[Da-You],
Zhang, Y.Y.[Ying-Ya],
Huang, Y.[Yan],
Wang, L.[Liang],
Shen, Y.J.[Yu-Jun],
Zhao, D.L.[De-Li],
Zhou, J.R.[Jing-Ren],
Tan, T.N.[Tie-Niu],
VideoFusion: Decomposed Diffusion Models for High-Quality Video
Generation,
CVPR23(10209-10218)
IEEE DOI
2309
BibRef
Ruan, L.[Ludan],
Ma, Y.Y.[Yi-Yang],
Yang, H.[Huan],
He, H.G.[Hui-Guo],
Liu, B.[Bei],
Fu, J.L.[Jian-Long],
Yuan, N.J.[Nicholas Jing],
Jin, Q.[Qin],
Guo, B.[Baining],
MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio
and Video Generation,
CVPR23(10219-10228)
IEEE DOI
2309
BibRef
Benny, Y.[Yaniv],
Wolf, L.B.[Lior B.],
Dynamic Dual-Output Diffusion Models,
CVPR22(11472-11481)
IEEE DOI
2210
Image quality, Image synthesis, Noise reduction,
Generative adversarial networks,
Image and video synthesis and generation
BibRef
Hu, M.H.[Ming-Hui],
Wang, Y.J.[Yu-Jie],
Cham, T.J.[Tat-Jen],
Yang, J.F.[Jian-Fei],
Suganthan, P.N.[Ponnuthurai Nagaratnam],
Global Context with Discrete Diffusion in Vector Quantised Modelling
for Image Generation,
CVPR22(11492-11501)
IEEE DOI
2210
Training, Visualization, Image resolution, Image synthesis,
Pipelines, Noise reduction, Probabilistic logic,
Image and video synthesis and generation
BibRef
Chapter on 3-D Object Description and Computation Techniques, Surfaces, Deformable, View Generation, Video Conferencing continues in
Image Matting, Video Matting.