Lavagetto, F.,
Time-Delay Neural Networks for Estimating Lip Movements from
Speech Analysis:
A Useful Tool in Audio Video Synchronization,
CirSysVideo(7), No. 5, October 1997, pp. 786-800.
IEEE Top Reference.
9710
BibRef
Gasper, E.[Elon],
Matthews, III, J.H.[Joseph H.],
Wesley, R.[Richard],
Advanced tools for speech synchronized animation,
US_Patent5,613,056, Mar 18, 1997
WWW Link.
BibRef
9703
And:
US_Patent5,630,017, May 13, 1997
WWW Link.
BibRef
And: A1, A3, Only:
US_Patent5,689,618, Nov 18, 1997
WWW Link.
BibRef
Chien, J.T.,
Lin, M.S.,
Frame-synchronous noise compensation for hands-free speech recognition
in car environments,
VISP(147), No. 6, December 2000, pp. 508-515.
0101
BibRef
Rosenfeld, M.[Maury],
Method for automatically animating lip synchronization and
facial expression of animated characters,
US_Patent6,307,576, Oct 23, 2001
WWW Link.
BibRef
0110
Haisma, N.[Nicoline],
Sinke, J.G.[Johannes Govert],
Bergevoet, B.A.J.[Bas Arnold Jan],
van Gestel, H.A.W.[Henricus Antonius Wilhelmus],
Post-synchronizing an information stream including lip
objects replacement,
US_Patent7,145,606, Dec 5, 2006
WWW Link.
BibRef
0612
Huang, Y.[Ying],
Lin, S.S.T.[Stephen Ssu-Te],
Guo, B.N.[Bai-Ning],
Shum, H.Y.[Heung-Yeung],
System and method for real time lip synchronization,
US_Patent7,133,535, Nov 7, 2006
WWW Link.
BibRef
0611
Saenko, K.[Kate],
Livescu, K.[Karen],
Glass, J.[James],
Darrell, T.J.[Trevor J.],
Multistream Articulatory Feature-Based Models for Visual Speech
Recognition,
PAMI(31), No. 9, September 2009, pp. 1700-1707.
IEEE DOI
0907
Lip opening, lip rounding features.
BibRef
Saenko, K.[Kate],
Livescu, K.[Karen],
Siracusa, M.[Michael],
Wilson, K.[Kevin],
Glass, J.[James],
Darrell, T.J.[Trevor J.],
Visual Speech Recognition with Loosely Synchronized Feature Streams,
ICCV05(II: 1424-1431).
IEEE DOI
0510
BibRef
El-Sallam, A.A.[Amar A.],
Mian, A.S.[Ajmal S.],
Correlation based speech-video synchronization,
PRL(32), No. 6, 15 April 2011, pp. 780-786.
Elsevier DOI
1103
BibRef
Earlier:
Speech-Video Synchronization Using Lips Movements and Speech Envelope
Correlation,
ICIAR09(397-407).
Springer DOI
0907
Correlation; Lip sync; Formants; Estimation; AM, FM
BibRef
Chen, Y.M.,
Huang, F.C.,
Guan, S.H.,
Chen, B.Y.,
Animating Lip-Sync Characters With Dominated Animeme Models,
CirSysVideo(22), No. 9, September 2012, pp. 1344-1353.
IEEE DOI
1209
BibRef
Srinivasan, V.,
Bethel, C.L.,
Murphy, R.R.,
Evaluation of Head Gaze Loosely Synchronized With Real-Time Synthetic
Speech for Social Robots,
HMS(44), No. 6, December 2014, pp. 767-778.
IEEE DOI
1412
human-robot interaction
BibRef
Liu, N.[Na],
Zhou, T.[Tao],
Ji, Y.F.[Yun-Feng],
Zhao, Z.Y.[Zi-Yi],
Wan, L.H.[Li-Hong],
Synthesizing Talking Faces from Text and Audio: An Autoencoder and
Sequence-to-Sequence Convolutional Neural Network,
PR(102), 2020, pp. 107231.
Elsevier DOI
2003
Convolutional neural network, Autoencoder, Regression,
Face landmark, Face tracking, Lip sync, Video, Audio
BibRef
Yu, L.,
Yu, J.,
Li, M.,
Ling, Q.,
Multimodal Inputs Driven Talking Face Generation With
Spatial-Temporal Dependency,
CirSysVideo(31), No. 1, January 2021, pp. 203-216.
IEEE DOI
2101
Face, Mouth, Lips, Generative adversarial networks, Synchronization,
Facial animation, Video synthesis, self-attention mechanism
BibRef
Liu, L.,
Feng, G.,
Beautemps, D.,
Zhang, X.P.,
Re-Synchronization Using the Hand Preceding Model for Multi-Modal
Fusion in Automatic Continuous Cued Speech Recognition,
MultMed(23), 2021, pp. 292-305.
IEEE DOI
2012
Lips, Shape, Feature extraction, Hidden Markov models,
Speech recognition, Organizations, Encoding, Cued speech, MSHMM
BibRef
Beyan, C.[Cigdem],
Shahid, M.[Muhammad],
Murino, V.[Vittorio],
RealVAD: A Real-World Dataset and A Method for Voice Activity
Detection by Body Motion Analysis,
MultMed(23), 2021, pp. 2071-2085.
IEEE DOI
2107
Feature extraction, Visualization, Lips, Voice activity detection,
Task analysis, Benchmark testing, Synchronization,
unsupervised domain adaptation
BibRef
Eskimez, S.E.[Sefik Emre],
Zhang, Y.[You],
Duan, Z.Y.[Zhi-Yao],
Speech Driven Talking Face Generation From a Single Image and an
Emotion Condition,
MultMed(24), 2022, pp. 3480-3490.
IEEE DOI
2207
Faces, Visualization, Face recognition, Emotion recognition,
Synchronization, Speech processing, Lips, Audiovisual, emotion,
talking face generation
BibRef
Ghose, S.[Sanchita],
Prevost, J.J.[John J.],
FoleyGAN: Visually Guided Generative Adversarial Network-Based
Synchronous Sound Generation in Silent Videos,
MultMed(25), 2023, pp. 4508-4519.
IEEE DOI
2310
BibRef
Zhou, J.X.[Jin-Xing],
Guo, D.[Dan],
Wang, M.[Meng],
Contrastive Positive Sample Propagation Along the Audio-Visual Event
Line,
PAMI(45), No. 6, June 2023, pp. 7239-7257.
IEEE DOI
2305
Visualization, Task analysis, Image segmentation, Synchronization,
Roads, Aggregates, Representation learning, Audio-visual event,
positive sample propagation
BibRef
Zhou, J.X.[Jin-Xing],
Zheng, L.[Liang],
Zhong, Y.R.[Yi-Ran],
Hao, S.J.[Shi-Jie],
Wang, M.[Meng],
Positive Sample Propagation along the Audio-Visual Event Line,
CVPR21(8432-8440)
IEEE DOI
2111
Location awareness, Visualization, Correlation,
Filtering, Feature extraction
BibRef
Wang, X.S.[Xin-Sheng],
Xie, Q.[Qicong],
Zhu, J.[Jihua],
Xie, L.[Lei],
Scharenborg, O.[Odette],
AnyoneNet: Synchronized Speech and Talking Head Generation for
Arbitrary Persons,
MultMed(25), 2023, pp. 6717-6728.
IEEE DOI
2311
BibRef
Yang, D.[Daowu],
Huang, S.[Sheng],
Jiang, W.[Wen],
Zou, J.[Jin],
Implicit Memory-Based Variational Motion Talking Face Generation,
SPLetters(31), 2024, pp. 431-435.
IEEE DOI
2402
Training, Generators, Decoding, Synchronization, Feature extraction,
Facial animation, Implicit memory, speech-driven facial, audio-to-motion
BibRef
Cheng, H.[Harry],
Guo, Y.Y.[Yang-Yang],
Yin, J.H.[Jian-Hua],
Chen, H.[Haonan],
Wang, J.[Jiafang],
Nie, L.Q.[Li-Qiang],
Audio-Driven Talking Video Frame Restoration,
MultMed(26), 2024, pp. 4110-4122.
IEEE DOI
2403
Streaming media, Faces, Lips, Task analysis, Image restoration,
Visualization, Synchronization, Frame Restoration,
Generative Adversarial Network
BibRef
Sheng, Z.C.[Zhi-Cheng],
Nie, L.Q.[Li-Qiang],
Zhang, M.[Min],
Chang, X.J.[Xiao-Jun],
Yan, Y.[Yan],
Stochastic Latent Talking Face Generation Toward Emotional
Expressions and Head Poses,
CirSysVideo(34), No. 4, April 2024, pp. 2734-2748.
IEEE DOI
2404
Faces, Stochastic processes, Head, Dynamics, Mixers, Synchronization,
Computational modeling, Talking face generation,
continuous normalizing flow
BibRef
Liu, M.[Meng],
Li, D.[Da],
Li, Y.Q.[Yong-Qiang],
Song, X.M.[Xue-Meng],
Nie, L.Q.[Li-Qiang],
Audio-Semantic Enhanced Pose-Driven Talking Head Generation,
CirSysVideo(34), No. 11, November 2024, pp. 11056-11069.
IEEE DOI
2412
Head, Semantics, Visualization, Synchronization, Lips,
Dentistry, One-shot, talking head generation, head pose
BibRef
Ahn, Y.[Youngdo],
Chae, J.[Jongwook],
Shin, J.W.[Jong Won],
Text-to-Speech With Lip Synchronization Based on Speech-Assisted
Text-to-Video Alignment and Masked Unit Prediction,
SPLetters(32), 2025, pp. 961-965.
IEEE DOI
2503
Lips, Training, Text to video, Text to speech, Synchronization,
Vocoders, Iodine, Speech recognition, Visualization, Vectors,
multi-modality
BibRef
Han, T.S.[Tian-Shun],
Gui, S.N.[Sheng-Nan],
Huang, Y.Q.[Yi-Qing],
Li, B.H.[Bai-Hui],
Liu, L.J.[Li-Jian],
Zhou, B.J.[Ben-Jia],
Jiang, N.[Ning],
Lu, Q.[Quan],
Zhi, R.C.[Rui-Cong],
Liang, Y.Y.[Yan-Yan],
Zhang, D.[Du],
Wan, J.[Jun],
PMMTalk: Speech-Driven 3D Facial Animation From Complementary
Pseudo Multi-Modal Features,
MultMed(27), 2025, pp. 2570-2581.
IEEE DOI
2505
Facial animation, Faces, Visualization, Feature extraction,
Face recognition, Solid modeling, Synchronization, Decoding,
3D Chinese Audio-Visual Facial Animation (3D-CAVFA) dataset
BibRef
Fan, X.[Xulin],
Gao, H.[Heting],
Chen, Z.[Ziyi],
Chang, P.[Peng],
Han, M.[Mei],
Hasegawa-Johnson, M.[Mark],
SyncDiff: Diffusion-Based Talking Head Synthesis with Bottlenecked
Temporal Visual Prior for Improved Synchronization,
WACV25(4554-4563)
IEEE DOI
2505
Training, Visualization, Tracking, Lips, Diffusion processes,
Feature extraction, Generators, Synchronization,
audiovisual
BibRef
Zhang, L.[Lin],
Mo, S.T.[Shen-Tong],
Zhang, Y.J.[Yi-Jing],
Morgado, P.[Pedro],
Audio-synchronized Visual Animation,
ECCV24(XLI: 1-18).
Springer DOI
2412
BibRef
Guan, J.Z.[Jia-Zhi],
Xu, Z.L.[Zhi-Liang],
Zhou, H.[Hang],
Wang, K.Y.[Kaisi-Yuan],
He, S.Y.[Sheng-Yi],
Zhang, Z.W.[Zhan-Wang],
Liang, B.R.[Bo-Rong],
Feng, H.C.[Hao-Cheng],
Ding, E.[Errui],
Liu, J.T.[Jing-Tuo],
Wang, J.D.[Jing-Dong],
Zhao, Y.J.[You-Jian],
Liu, Z.W.[Zi-Wei],
ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually
Synced Facial Performer,
ECCV24(XLI: 348-367).
Springer DOI
2412
BibRef
Yaman, D.[Dogucan],
Eyiokur, F.I.[Fevziye Irem],
Bärmann, L.[Leonard],
Ekenel, H.K.[Hazim Kemal],
Waibel, A.[Alexander],
Audio-driven Talking Face Generation with Stabilized Synchronization
Loss,
ECCV24(XIX: 417-435).
Springer DOI
2412
BibRef
Kim, H.K.[Hyung Kyu],
Lee, S.[Sangmin],
Kim, H.G.[Hak Gu],
Analyzing Visible Articulatory Movements in Speech Production for
Speech-Driven 3D Facial Animation,
ICIP24(3575-3579)
IEEE DOI
2411
Visualization, Solid modeling, Analytical models, Lips, Production,
Speech-driven 3D facial animation, Lip synchronization, Visible articulatory
BibRef
Fernandez-Labrador, C.[Clara],
Akçay, M.[Mertcan],
Abecassis, E.[Eitan],
Massich, J.[Joan],
Schroers, C.[Christopher],
DiVAS: Video and Audio Synchronization with Dynamic Frame Rates,
CVPR24(26836-26844)
IEEE DOI
2410
Crops, Production, Media, Predictive models, Transformers,
Motion pictures, Robustness
BibRef
Peng, Z.Q.[Zi-Qiao],
Hu, W.T.[Wen-Tao],
Shi, Y.[Yue],
Zhu, X.Y.[Xiang-Yu],
Zhang, X.M.[Xiao-Mei],
Zhao, H.[Hao],
He, J.[Jun],
Liu, H.Y.[Hong-Yan],
Fan, Z.X.[Zhao-Xin],
SyncTalk: The Devil is in the Synchronization for Talking Head
Synthesis,
CVPR24(666-676)
IEEE DOI Code:
WWW Link.
2410
Torso, Visualization, Technological innovation, Lips,
Neural radiance field, Generative adversarial networks, Audio-driven
BibRef
Zhang, C.X.[Chen-Xu],
Wang, C.[Chao],
Zhao, Y.F.[Yi-Fan],
Cheng, S.[Shuo],
Luo, L.J.[Lin-Jie],
Guo, X.H.[Xiao-Hu],
DR2: Disentangled Recurrent Representation Learning for
Data-efficient Speech Video Synthesis,
WACV24(6192-6202)
IEEE DOI
2404
Training, Representation learning, Computational modeling,
Training data, Data models, Synchronization, Algorithms, Biometrics
BibRef
Gupta, H.[Honey],
Perceptual Synchronization Scoring of Dubbed Content using
Phoneme-Viseme Agreement,
VAQuality24(392-402)
IEEE DOI
2404
Correlation, Streaming media, Synchronization, History
BibRef
Lee, D.[Dongyeun],
Kim, C.[Chaewon],
Yu, S.J.[Sang-Joon],
Yoo, J.[Jaejun],
Park, G.M.[Gyeong-Moon],
RADIO: Reference-Agnostic Dubbing Video Synthesis,
WACV24(4156-4166)
IEEE DOI
2404
Shape, Lips, Modulation, Robustness, Decoding, Synchronization, Mirrors,
Algorithms, Computational photography, image and video synthesis
BibRef
Sung-Bin, K.[Kim],
Hyun, L.[Lee],
Hong, D.H.[Da Hye],
Nam, S.[Suekyeong],
Ju, J.[Janghoon],
Oh, T.H.[Tae-Hyun],
LaughTalk: Expressive 3D Talking Head Generation with Laughter,
WACV24(6392-6401)
IEEE DOI
2404
Training, Solid modeling, Lips, Fires, Synchronization, Algorithms,
Biometrics, face, gesture, body pose, Algorithms, 3D computer vision
BibRef
Mukhopadhyay, S.[Soumik],
Suri, S.[Saksham],
Gadde, R.T.[Ravi Teja],
Shrivastava, A.[Abhinav],
Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization,
WACV24(5280-5290)
IEEE DOI Code:
WWW Link.
2404
Image quality, Measurement, Degradation, Lips,
Entertainment industry, Synchronization, Task analysis, Algorithms,
image and video synthesis
BibRef
Niu, Z.[Zhe],
Mak, B.[Brian],
On the Audio-visual Synchronization for Lip-to-Speech Synthesis,
ICCV23(7809-7818)
IEEE DOI
2401
BibRef
Ki, T.[Taekyung],
Min, D.[Dongchan],
StyleLipSync: Style-based Personalized Lip-sync Video Generation,
ICCV23(22784-22793)
IEEE DOI
2401
BibRef
Guan, J.Z.[Jia-Zhi],
Zhang, Z.W.[Zhan-Wang],
Zhou, H.[Hang],
Hu, T.S.[Tian-Shu],
Wang, K.Y.[Kaisi-Yuan],
He, D.L.[Dong-Liang],
Feng, H.C.[Hao-Cheng],
Liu, J.T.[Jing-Tuo],
Ding, E.[Errui],
Liu, Z.W.[Zi-Wei],
Wang, J.D.[Jing-Dong],
StyleSync: High-Fidelity Generalized and Personalized Lip Sync in
Style-Based Generator,
CVPR23(1505-1515)
IEEE DOI
2309
BibRef
Chatziagapi, A.[Aggelina],
Athar, S.[ShahRukh],
Jain, A.[Abhinav],
MV, R.[Rohith],
Bhat, V.[Vimal],
Samaras, D.[Dimitris],
LipNeRF: What is the right feature space to lip-sync a NeRF?,
FG23(1-8)
IEEE DOI
2303
Geometry, Solid modeling, Visualization, Lips, Face recognition, Lighting
BibRef
Gupta, A.[Anchit],
Mukhopadhyay, R.[Rudrabha],
Balachandra, S.[Sindhu],
Khan, F.F.[Faizan Farooq],
Namboodiri, V.P.[Vinay P.],
Jawahar, C.V.,
Towards Generating Ultra-High Resolution Talking-Face Videos with Lip
synchronization,
WACV23(5198-5207)
IEEE DOI
2302
Industries, Shape, Lips, Motion pictures, Generators, Synchronization,
Algorithms: Vision + language and/or other modalities, Education
BibRef
Zheng, A.[Aihua],
Zhu, F.X.[Fei-Xia],
Zhu, H.[Hao],
Luo, M.[Mandi],
He, R.[Ran],
Talking Face Generation via Learning Semantic and Temporal
Synchronous Landmarks,
ICPR21(3682-3689)
IEEE DOI
2105
Learning systems, Visualization, Face recognition, Semantics, Mouth,
Detectors, Benchmark testing
BibRef
Lee, J.Y.[Ji-Young],
Chung, S.W.[Soo-Whan],
Kim, S.[Sunok],
Kang, H.G.[Hong-Goo],
Sohn, K.H.[Kwang-Hoon],
Looking into Your Speech: Learning Cross-modal Affinity for
Audio-visual Speech Separation,
CVPR21(1336-1345)
IEEE DOI
2111
Visualization, Stability criteria, Speech recognition, Jitter,
Delays, Synchronization, Data mining
BibRef
Zheng, R.B.[Ruo-Bing],
Zhu, Z.[Zhou],
Song, B.[Bo],
Ji, C.J.[Chang-Jiang],
A Neural Lip-Sync Framework for Synthesizing Photorealistic Virtual
News Anchors,
ICPR21(5286-5293)
IEEE DOI
2105
Deep learning, Visualization, Convolution, Lips, Neural networks,
Mouth, Rendering (computer graphics), lip sync, virtual anchor,
neural rendering
BibRef
Kumar, N.,
Goel, S.,
Narang, A.,
Hasan, M.,
Robust One Shot Audio to Video Generation,
DeepVision20(3334-3343)
IEEE DOI
2008
Generators, Adaptation models, Synchronization, Lips,
Mouth
BibRef
Shalev, Y.[Yoav],
Wolf, L.B.[Lior B.],
End to End Lip Synchronization with a Temporal AutoEncoder,
WACV20(330-339)
IEEE DOI
2006
Visualization, Streaming media, Decoding, Lips, Face, Tensile stress, Mouth
BibRef
Wang, J.,
Fang, Z.,
Zhao, H.,
AlignNet: A Unifying Approach to Audio-Visual Alignment,
WACV20(3298-3306)
IEEE DOI
2006
Feature extraction, Synchronization, Visualization, Task analysis,
Training, Rhythm, Face
BibRef
Saidi, I.,
Zhang, L.,
Barriac, V.,
Déforges, O.,
Laboratory and Crowdsourcing Studies of Lip Sync Effect on the
Audio-Video Quality Assessment for Videoconferencing Application,
ICIP19(3207-3211)
IEEE DOI
1910
Subjective test, crowdsourcing, quality assessment,
audio-video synchronization, videoconferencing
BibRef
Yu, J.[Jun],
Speech Synchronized Tongue Animation by Combining Physiology Modeling
and X-ray Image Fitting,
MMMod17(I: 726-737).
Springer DOI
1701
BibRef
Chung, J.S.[Joon Son],
Zisserman, A.[Andrew],
Out of Time: Automated Lip Sync in the Wild,
LipRead16(II: 251-263).
Springer DOI
1704
BibRef
Cadavid, S.[Steven],
Abdel-Mottaleb, M.[Mohamed],
Messinger, D.S.[Daniel S.],
Mahoor, M.H.[Mohammad H.],
Bahrick, L.E.[Lorraine E.],
Detecting local audio-visual synchrony in monologues utilizing vocal
pitch and facial landmark trajectories,
BMVC09(xx-yy).
PDF File.
0909
BibRef
Kumar, K.[Kshitiz],
Navratil, J.[Jiri],
Marcheret, E.[Etienne],
Libal, V.[Vit],
Ramaswamy, G.[Ganesh],
Potamianos, G.[Gerasimos],
Audio-visual speech synchronization detection using a bimodal linear
prediction model,
Biometrics09(53-59).
IEEE DOI
0906
BibRef
Melek, Z.[Zeki],
Akarun, L.[Lale],
Automated Lip Synchronized Speech Driven Facial Animation,
ICME00(TA1).
0007
BibRef
Chen, T.H.[Tsu-Han],
Graf, H.P.,
Haskell, B.G.,
Petajan, E.,
Wang, Y.[Yao],
Chen, H.,
Chou, W.[Wu],
Speech-assisted lip synchronization in audio-visual communications,
ICIP95(II: 579-582).
IEEE DOI
9510
BibRef
Chapter on 3-D Object Description and Computation Techniques, Surfaces, Deformable, View Generation, Video Conferencing continues in
Face Animation, Video Face Synthesis.