Bloomstein, R.W.[Richard W.],
Cinematic works with altered facial displays,
US_Patent4,827,532, May 2, 1989
WWW Link. Alter lip motions.
BibRef
8905
Waters, K.[Keith],
Levergood, T.M.[Thomas M.],
Method and apparatus for producing audio-visual synthetic speech,
US_Patent5,657,426, Aug 12, 1997
WWW Link.
BibRef
9708
Gasper, E.[Elon],
Matthews, III, J.H.[Joseph H.],
Wesley, R.[Richard],
Advanced tools for speech synchronized animation,
US_Patent5,613,056, Mar 18, 1997
WWW Link.
BibRef
9703
And:
US_Patent5,630,017, May 13, 1997
WWW Link.
BibRef
And: A1, A3, Only:
US_Patent5,689,618, Nov 18, 1997
WWW Link.
BibRef
Lyberg, B.[Bertil],
Device and method for dubbing an audio-visual presentation
which generates synthesized speech and corresponding facial movements,
US_Patent5,826,234, Oct 20, 1998
WWW Link.
BibRef
9810
Henton, C.G.[Caroline G.],
Method and apparatus for synthetic speech in facial animation,
US_Patent5,878,396, Mar 2, 1999
WWW Link.
BibRef
9903
Goldenthal, W.D.[William D.],
van Thong, J.M.[Jean-Manuel],
Waters, K.[Keith],
Automated speech alignment for image synthesis,
US_Patent5,884,267, Mar 16, 1999
WWW Link.
BibRef
9903
Scott, K.C.[Kenneth C.],
Yeates, M.C.[Matthew C.],
Kagels, D.S.[David S.],
Watson, S.H.[Stephen Hilary],
Method and apparatus for synthesizing realistic animations
of a human speaking using a computer,
US_Patent6,097,381, Aug 1, 2000
WWW Link.
BibRef
0008
Danieli, D.V.[Damon Vincent],
Method for generating mouth features of an animated or
physical character,
US_Patent6,067,095, May 23, 2000
WWW Link.
BibRef
0005
Grammalidis, N.[Nikos],
Sarris, N.[Nikos],
Deligianni, F.[Fani],
Strintzis, M.G.[Michael G.],
Three-Dimensional Facial Adaptation for MPEG-4 Talking Heads,
JASP(2002), No. 10, October 2002, pp. 1005-1020.
WWW Link.
0211
BibRef
Cosatto, E.,
Ostermann, J.,
Graf, H.P.,
Schroeter, J.,
Lifelike talking faces for interactive services,
PIEEE(91), No. 9, September 2003, pp. 1406-1429.
IEEE DOI
0309
BibRef
Cosatto, E.[Eric],
Graf, H.P.[Hans Peter],
Potamianos, G.[Gerasimos],
Schroeter, J.[Juergen],
Audio-visual selection process for the synthesis of
photo-realistic talking-head animations,
US_Patent6,654,018, Nov 25, 2003
WWW Link.
BibRef
0311
Cosatto, E.[Eric],
Graf, H.P.[Hans Peter],
Huang, F.J.[Fu Jie],
System and method for triphone-based unit selection for visual
speech synthesis,
US_Patent7,209,882, Apr 24, 2007
WWW Link.
BibRef
0704
And:
US_Patent7,369,992, May 6, 2008
WWW Link.
BibRef
Cosatto, E.[Eric],
Potamianos, G.[Gerasimos],
Graf, H.P.[Hans Peter],
Audio-Visual Unit Selection for the Synthesis of Photo-Realistic
Talking-Heads,
ICME00(TA1).
0007
BibRef
Graf, H.P.[Hans Peter],
Cosatto, E.[Eric],
Ezzat, A.F.[Antoine F.],
Face Analysis for the Synthesis of Photo-Realistic Talking Heads,
AFGR00(189-194).
IEEE DOI
0003
BibRef
Graf, H.P.[Hans Peter],
Sample-based Synthesis of Talking Heads,
RATFG01(xx-yy).
0106
BibRef
Liu, K.[Kang],
Ostermann, J.[Joern],
Minimized Database of Unit Selection in Visual Speech Synthesis without
Loss of Naturalness,
CAIP09(1212-1219).
Springer DOI
0909
BibRef
Ostermann, J.,
Weissenfeld, A.,
Talking Faces: Technologies and Applications,
ICPR04(III: 826-833).
IEEE DOI
0409
BibRef
Rosenfeld, M.[Maury],
Method for automatically animating lip synchronization and
facial expression of animated characters,
US_Patent6,307,576, Oct 23, 2001
WWW Link.
BibRef
0110
Basu, S.[Sankar],
Faruquie, T.A.[Tanveer Afzal],
Neti, C.V.[Chalapathy V.],
Rajput, N.[Nitendra],
Senior, A.W.[Andrew William],
Subramaniam, L.V.[L. Venkata],
Verma, A.[Ashish],
Speech driven lip synthesis using viseme based hidden Markov models,
US_Patent6,366,885, Apr 2, 2002
WWW Link.
BibRef
0204
Dorvil, R.[Richemond],
Device and method for prosody generation at visual synthesis,
US_Patent6,389,396, May 14, 2002
WWW Link.
BibRef
0205
Theobald, B.J.,
Kruse, S.M.,
Bangham, J.A.,
Cawley, G.C.,
Towards a low bandwidth talking face using appearance models,
IVC(21), No. 12-13, December 2003, pp. 1117-1124.
Elsevier DOI
0401
BibRef
Earlier: A1, A4, A2, A3:
BMVC01(Session 6: Faces).
HTML Version.
0110
University of East Anglia.
BibRef
Theobald, B.J.,
Bangham, J.A.,
Matthews, I.,
Glauert, J.R.W.,
Cawley, G.C.,
2.5D Visual Speech Synthesis Using Appearance Models,
BMVC03(xx-yy).
HTML Version.
0409
BibRef
Devin, V.E.[Vincent E.],
Hogg, D.C.[David C.],
Reactive Memories: An Interactive Talking-Head,
IVC(21), No. 12-13, December 2003, pp. 1125-1133.
Elsevier DOI
0401
BibRef
Earlier:
BMVC01(Session 6: Faces).
HTML Version. University of Leeds.
0110
BibRef
Sutton, S.[Stephen],
Vermeulen, P.[Pieter],
Methods and devices for producing and using synthetic
visual speech based on natural coarticulation,
US_Patent6,539,354, Mar 25, 2003
WWW Link.
BibRef
0303
Cosker, D.P.,
Marshall, A.D.,
Rosin, P.L.,
Hicks, Y.A.,
Speech-driven facial animation using a hierarchical model,
VISP(151), No. 4, August 2004, pp. 314-321.
IEEE Abstract.
0411
BibRef
Earlier:
Speech driven facial animation using a hidden Markov coarticulation
model,
ICPR04(I: 128-131).
IEEE DOI
0409
BibRef
Cosker, D.P.[Darren P.],
Borkett, R.,
Marshall, A.D.[A. David],
Rosin, P.L.[Paul L.],
Towards automatic performance-driven animation between multiple types
of facial model,
IET-CV(2), No. 3, September 2008, pp. 129-141.
DOI Link
0905
BibRef
Cosker, D.P.[Darren P.],
Roy, S.[Steven],
Rosin, P.L.[Paul L.],
Marshall, A.D.[A. David],
Re-mapping Animation Parameters Between Multiple Types of Facial Model,
MIRAGE07(365-376).
Springer DOI
0703
BibRef
Ma, J.Y.[Ji-Yong],
Cole, R.[Ronald],
Animating visible speech and facial expressions,
VC(20), No. 2-3, May 2004, pp. 86-105.
Springer DOI
0405
BibRef
Muller, P.,
Kalberer, G.A.,
Proesmans, M.,
Van Gool, L.J.,
Realistic speech animation based on observed 3D face dynamics,
VISP(152), No. 4, August 2005, pp. 491-500.
DOI Link
0512
BibRef
Hilton, A.,
Kalkavouras, M.,
Collins, G.,
3D studio production of animated actor models,
VISP(152), No. 4, August 2005, pp. 481-490.
DOI Link
0512
BibRef
Hsieh, C.K.[Chao-Kuei],
Chen, Y.C.[Yung-Chang],
Partial linear regression for speech-driven talking head application,
SP:IC(21), No. 1, January 2006, pp. 1-12.
Elsevier DOI
0512
Adapt models.
BibRef
Haisma, N.[Nicoline],
Sinke, J.G.[Johannes Govert],
Bergevoet, B.A.J.[Bas Arnold Jan],
van Gestel, H.A.W.[Henricus Antonius Wilhelmus],
Post-synchronizing an information stream including lip
objects replacement,
US_Patent7,145,606, Dec 5, 2006
WWW Link.
BibRef
0612
Xie, L.[Lei],
Liu, Z.Q.[Zhi-Qiang],
A coupled HMM approach to video-realistic speech animation,
PR(40), No. 8, August 2007, pp. 2325-2340.
Elsevier DOI
0704
BibRef
Earlier:
Speech Animation Using Coupled Hidden Markov Models,
ICPR06(I: 1128-1131).
IEEE DOI
0609
Speech animation; Audio-to-visual conversion; Talking faces;
Facial animation; Coupled hidden Markov models (CHMMs)
BibRef
Sargin, M.E.[Mehmet E.],
Yemez, Y.[Yucel],
Erzin, E.[Engin],
Tekalp, A.M.[Ahmet M.],
Analysis of Head Gesture and Prosody Patterns for Prosody-Driven
Head-Gesture Animation,
PAMI(30), No. 8, August 2008, pp. 1330-1345.
IEEE DOI
0806
BibRef
Ofli, F.[Ferda],
Erzin, E.[Engin],
Yemez, Y.[Yucel],
Tekalp, A.M.[A. Murat],
Estimation and Analysis of Facial Animation Parameter Patterns,
ICIP07(IV: 293-296).
IEEE DOI
0709
BibRef
Erdem, A.T.[A. Tanju],
Method for animating a 3-D model of a face,
US_Patent6,731,287, May 4, 2004
WWW Link.
BibRef
0405
Bozkurt, E.[Elif],
Erdem, C.E.[Cigdem Eroglu],
Erzin, E.[Engin],
Erdem, T.[Tanju],
Ozkan, M.K.[Mehmet K.],
Tekalp, A.M.[A. Murat],
Speech-Driven Automatic Facial Expression Synthesis,
3DTV08(273-276).
IEEE DOI
0805
BibRef
Bozkurt, E.[Elif],
Erdem, C.E.[Cigdem Eroglu],
Erzin, E.[Engin],
Erdem, T.[Tanju],
Ozkan, M.K.[Mehmet K.],
Comparison of Phoneme and Viseme Based Acoustic Units for Speech Driven
Realistic Lip Animation,
3DTV07(1-4).
IEEE DOI
0705
BibRef
Hong, P.Y.[Peng-Yu],
Wen, Z.[Zhen],
Huang, T.S.[Thomas S.],
iface: A 3d Synthetic Talking Face,
IJIG(1), No. 1, January 2001, pp. 19-26.
0101
BibRef
Cheiky, M.[Michael],
Gately, P.[Peter],
Photo realistic talking head creation system and method,
US_Patent6,919,892, Jul 19, 2005
WWW Link.
BibRef
0507
And:
Do-it-yourself photo realistic talking head creation system and method,
US_Patent7,027,054, Apr 11, 2006
WWW Link.
BibRef
Huang, Y.[Ying],
Lin, S.S.T.[Stephen Ssu-Te],
Guo, B.N.[Bai-Ning],
Shum, H.Y.[Heung-Yeung],
System and method for real time lip synchronization,
US_Patent7,133,535, Nov 7, 2006
WWW Link.
BibRef
0611
McAlpine, P.[Paul],
Hernandez, T.[Todd],
Bateman, J.[John],
Zimmermann, R.[Remy],
Depallens, P.[Philippe],
Facial feature-localized and global real-time video morphing,
US_Patent7,209,577, Apr 24, 2007
WWW Link.
BibRef
0704
Massaro, D.W.[Dominic W.],
Cohen, M.M.[Michael M.],
Beskow, J.[Jonas],
Visual display methods for in computer-animated
speech production models,
US_Patent7,225,129, May 29, 2007
WWW Link.
BibRef
0705
Clarke, S.[Simon],
Hovhannisyan, A.[Armen],
Cutler, R.[Ross],
System and process for adding high frame-rate current speaker data
to a low frame-rate video using delta frames,
US_Patent7,355,622, Apr 8, 2008
WWW Link.
BibRef
0804
Cutler, R.[Ross],
System and process for adding high frame-rate current speaker data to
a low frame-rate video using audio watermarking techniques,
US_Patent7,355,623, Apr 8, 2008
WWW Link.
BibRef
0804
And:
US_Patent7,362,350, Apr 22, 2008
WWW Link.
BibRef
Yeung, M.[Minerva],
Du, P.[Ping],
Huang, C.[Chao],
Method and apparatus for animation of a human speaker,
US_Patent7,388,586, Jun 17, 2008
WWW Link.
BibRef
0806
Buenaposada, J.M.[José Miguel],
Muñoz, E.[Enrique],
Baumela, L.[Luis],
Efficient illumination independent appearance-based face tracking,
IVC(27), No. 5, 2 April 2009, pp. 560-578.
Elsevier DOI
0904
BibRef
Earlier:
Efficiently estimating facial expression and illumination in
appearance-based tracking,
BMVC06(I:57).
PDF File.
0609
BibRef
Earlier:
Performance Driven Facial Animation by Appearance Based Tracking,
IbPRIA05(I:476).
Springer DOI
0509
Linear models of appearance; Illumination invariance; Efficient linear subspace model fitting; Facial expression analysis
BibRef
Buenaposada, J.M.[José Miguel],
Muñoz, E.[Enrique],
Performance driven facial animation using illumination independent
appearance-based tracking,
ICPR06(I: 303-306).
IEEE DOI
0609
BibRef
Muñoz, E.[Enrique],
Buenaposada, J.M.[José M.],
Baumela, L.[Luis],
A direct approach for efficiently tracking with 3D morphable models,
ICCV09(1615-1622).
IEEE DOI
0909
BibRef
Aina, O.O.[Olusola O.],
Generating anatomical substructures for physically-based facial
animation. Part 1: A methodology for skull fitting,
VC(25), No. 5-7, May 2009, pp. xx-yy.
Springer DOI
0905
BibRef
Schreer, O.,
Englert, R.,
Eisert, P.,
Tanger, R.,
Real-Time Vision and Speech Driven Avatars for Multimedia Applications,
MultMed(10), No. 3, April 2008, pp. 352-360.
IEEE DOI
0905
BibRef
Chen, H.[Hui],
Wang, L.[Lan],
Liu, W.X.[Wen-Xi],
Heng, P.A.[Pheng-Ann],
Combined X-ray and facial videos for phoneme-level articulator dynamics,
VC(26), No. 6-8, June 2010, pp. 477-486.
WWW Link.
1101
BibRef
Kim, B.U.[Byung-Uck],
Feng, W.W.[Wei-Wei],
Yu, Y.Z.[Yi-Zhou],
Real-time data driven deformation with affine bones,
VC(26), No. 6-8, June 2010, pp. 487-495.
WWW Link.
1101
BibRef
And: Erratum:
VC(26), No. 9, September 2010, pp. 1241.
WWW Link.
1101
BibRef
Xiong, B.,
Fan, X.,
Zhu, C.,
Jing, X.,
Peng, Q.,
Face Region Based Conversational Video Coding,
CirSysVideo(21), No. 7, July 2011, pp. 917-931.
IEEE DOI
1107
BibRef
Chen, Y.M.,
Huang, F.C.,
Guan, S.H.,
Chen, B.Y.,
Animating Lip-Sync Characters With Dominated Animeme Models,
CirSysVideo(22), No. 9, September 2012, pp. 1344-1353.
IEEE DOI
1209
BibRef
Zhou, Z.H.[Zi-Heng],
Zhao, G.Y.[Guo-Ying],
Guo, Y.[Yimo],
Pietikäinen, M.[Matti],
An Image-Based Visual Speech Animation System,
CirSysVideo(22), No. 10, October 2012, pp. 1420-1432.
IEEE DOI
1210
BibRef
Earlier: A1, A2, A4, Only:
Synthesizing a talking mouth,
ICCVGIP10(211-218).
DOI Link
1111
See also Concatenated Frame Image Based CNN for Visual Speech Recognition.
BibRef
Deena, S.[Salil],
Hou, S.B.[Shao-Bo],
Galata, A.[Aphrodite],
Visual Speech Synthesis Using a Variable-Order Switching Shared
Gaussian Process Dynamical Model,
MultMed(15), No. 8, December 2013, pp. 1755-1768.
IEEE DOI
1402
BibRef
Earlier: A1, A3, Only:
Speech-Driven Facial Animation Using a Shared Gaussian Process Latent
Variable Model,
ISVC09(I: 89-100).
Springer DOI
0911
Gaussian processes
BibRef
Srinivasan, V.,
Bethel, C.L.,
Murphy, R.R.,
Evaluation of Head Gaze Loosely Synchronized With Real-Time Synthetic
Speech for Social Robots,
HMS(44), No. 6, December 2014, pp. 767-778.
IEEE DOI
1412
human-robot interaction
BibRef
Yu, J.,
Wang, Z.,
A Video, Text, and Speech-Driven Realistic 3-D Virtual Head for
Human-Machine Interface,
Cyber(45), No. 5, May 2015, pp. 977-988.
IEEE DOI
1505
Adaptation models
BibRef
Ding, Y.,
Huang, J.,
Pelachaud, C.,
Audio-Driven Laughter Behavior Controller,
AffCom(8), No. 4, October 2017, pp. 546-558.
IEEE DOI
1712
Animation, Correlation, Hidden Markov models, Lips, Mouth, Speech, Torso,
Kalman filter, Laughter, animation synthesis, audio-driven,
virtual character
BibRef
Castro-González, Á.,
Alcocer-Luna, J.,
Malfaz, M.,
Alonso-Martín, F.,
Salichs, M.A.,
Evaluation of Artificial Mouths in Social Robots,
HMS(48), No. 4, August 2018, pp. 369-379.
IEEE DOI
1808
human-robot interaction, social robot, speaking robot,
on-video robot, mouthless robot, artificial LED-based mouth,
social robot
BibRef
Jamaludin, A.[Amir],
Chung, J.S.[Joon Son],
Zisserman, A.[Andrew],
You Said That?: Synthesising Talking Faces from Audio,
IJCV(127), No. 11-12, December 2019, pp. 1767-1779.
Springer DOI
1911
BibRef
Vougioukas, K.[Konstantinos],
Petridis, S.[Stavros],
Pantic, M.[Maja],
Realistic Speech-Driven Facial Animation with GANs,
IJCV(128), No. 5, May 2020, pp. 1398-1413.
Springer DOI
2005
BibRef
Liu, N.[Na],
Zhou, T.[Tao],
Ji, Y.F.[Yun-Feng],
Zhao, Z.Y.[Zi-Yi],
Wan, L.H.[Li-Hong],
Synthesizing Talking Faces from Text and Audio: An Autoencoder and
Sequence-to-Sequence Convolutional Neural Network,
PR(102), 2020, pp. 107231.
Elsevier DOI
2003
Convolutional neural network, Autoencoder, Regression,
Face landmark, Face tracking, Lip sync, Video, Audio
BibRef
Ahuja, C.[Chaitanya],
Lee, D.W.[Dong Won],
Nakano, Y.I.[Yukiko I.],
Morency, L.P.[Louis-Philippe],
Style Transfer for Co-speech Gesture Animation: A Multi-speaker
Conditional-mixture Approach,
ECCV20(XVIII:248-265).
Springer DOI
2012
BibRef
Yu, L.,
Yu, J.,
Li, M.,
Ling, Q.,
Multimodal Inputs Driven Talking Face Generation With
Spatial-Temporal Dependency,
CirSysVideo(31), No. 1, January 2021, pp. 203-216.
IEEE DOI
2101
Face, Mouth, Lips, Generative adversarial networks, Synchronization,
Facial animation, Video synthesis, self-attention mechanism
BibRef
Sadoughi, N.[Najmeh],
Busso, C.[Carlos],
Speech-Driven Expressive Talking Lips with Conditional Sequential
Generative Adversarial Networks,
AffCom(12), No. 4, October 2021, pp. 1031-1044.
IEEE DOI
2112
Hidden Markov models, Speech recognition, Adaptation models,
Training data, Data models, Lips, Generative adversarial networks,
generative adversarial network
BibRef
Zhang, J.N.[Jiang-Ning],
Zeng, X.F.[Xian-Fang],
Xu, C.[Chao],
Liu, Y.[Yong],
Real-Time Audio-Guided Multi-Face Reenactment,
SPLetters(29), 2022, pp. 1-5.
IEEE DOI
2202
Faces, Task analysis, Head, Training, Real-time systems,
Feature extraction, Convolution, Real-time face reenactment,
generative adversarial nets
BibRef
Xia, Y.F.[Yi-Fan],
Zheng, W.B.[Wen-Bo],
Wang, Y.M.[Yi-Ming],
Yu, H.[Hui],
Dong, J.Y.[Jun-Yu],
Wang, F.Y.[Fei-Yue],
Local and Global Perception Generative Adversarial Network for Facial
Expression Synthesis,
CirSysVideo(32), No. 3, March 2022, pp. 1443-1452.
IEEE DOI
2203
Task analysis, Face recognition, Mouth,
Generative adversarial networks, Facial features, Generators, facial mask
BibRef
Websdale, D.[Danny],
Taylor, S.[Sarah],
Milner, B.[Ben],
Speaker-Independent Speech Animation Using Perceptual Loss Functions
and Synthetic Data,
MultMed(24), 2022, pp. 2539-2552.
IEEE DOI
2205
Visualization, Speech recognition, Real-time systems,
Facial animation, Hidden Markov models, Face recognition, Mouth,
recurrent neural networks
BibRef
Yu, L.Y.[Ling-Yun],
Xie, H.T.[Hong-Tao],
Zhang, Y.D.[Yong-Dong],
Multimodal Learning for Temporally Coherent Talking Face Generation
With Articulator Synergy,
MultMed(24), 2022, pp. 2950-2962.
IEEE DOI
2206
Mouth, Lips, Faces, Facial animation, Shape, Visualization,
Task analysis, Articulator synergy, multimodal learning, video synthesis
BibRef
Eskimez, S.E.[Sefik Emre],
Zhang, Y.[You],
Duan, Z.Y.[Zhi-Yao],
Speech Driven Talking Face Generation From a Single Image and an
Emotion Condition,
MultMed(24), 2022, pp. 3480-3490.
IEEE DOI
2207
Faces, Visualization, Face recognition, Emotion recognition,
Synchronization, Speech processing, Lips, Audiovisual, emotion,
talking face generation
BibRef
Pham, H.X.[Hai X.],
Wang, Y.T.[Yu-Ting],
Pavlovic, V.[Vladimir],
Learning Continuous Facial Actions From Speech for Real-Time
Animation,
AffCom(13), No. 3, July 2022, pp. 1567-1580.
IEEE DOI
2209
Acoustics, Faces, Hidden Markov models, Face recognition, Animation,
Solid modeling, Deep learning, speech, emotion, facial action unit, animation
BibRef
Sanguineti, V.[Valentina],
Morerio, P.[Pietro],
del Bue, A.[Alessio],
Murino, V.[Vittorio],
Unsupervised Synthetic Acoustic Image Generation for Audio-Visual
Scene Understanding,
IP(31), 2022, pp. 7102-7115.
IEEE DOI
2212
Acoustics, Task analysis, Visualization, Location awareness,
Image reconstruction, Generative adversarial networks, Training,
spatial audio
BibRef
Zhang, J.L.[Jia-Le],
Liu, C.X.[Cheng-Xin],
Xian, K.[Ke],
Cao, Z.G.[Zhi-Guo],
Large motion anime head animation using a cascade pose transform
network,
PR(135), 2023, pp. 109181.
Elsevier DOI
2212
Talking head animation, Generative adversarial networks,
Pose transformation, Anime image generation, Anime dataset
BibRef
Zhang, J.L.[Jia-Le],
Xian, K.[Ke],
Liu, C.X.[Cheng-Xin],
Chen, Y.P.[Yin-Peng],
Cao, Z.G.[Zhi-Guo],
Zhong, W.[Weicai],
Cptnet: Cascade Pose Transform Network for Single Image Talking Head
Animation,
ACCV20(IV:493-508).
Springer DOI
2103
BibRef
Song, L.[Linsen],
Wu, W.[Wayne],
Fu, C.[Chaoyou],
Loy, C.C.[Chen Change],
He, R.[Ran],
Audio-Driven Dubbing for User Generated Contents via Style-Aware
Semi-Parametric Synthesis,
CirSysVideo(33), No. 3, March 2023, pp. 1247-1261.
IEEE DOI
2303
Faces, Mouth, Training, Production, Training data, Shape, Visualization,
Talking face generation, video generation, GAN, thin-plate spline
BibRef
Ye, Z.P.[Zi-Peng],
Xia, M.F.[Meng-Fei],
Yi, R.[Ran],
Zhang, J.Y.[Ju-Yong],
Lai, Y.K.[Yu-Kun],
Huang, X.[Xuwei],
Zhang, G.X.[Guo-Xin],
Liu, Y.J.[Yong-Jin],
Audio-Driven Talking Face Video Generation With Dynamic Convolution
Kernels,
MultMed(25), 2023, pp. 2033-2046.
IEEE DOI
2306
Kernel, Task analysis, Convolution, Faces, Feature extraction,
Streaming media, Dynamic kernel, convolutional neural network,
audio-driven talking-face generation
BibRef
Xia, G.Y.[Gui-Yu],
Luo, D.[Dong],
Zhang, Z.[Zeyuan],
Sun, Y.[Yubao],
Liu, Q.S.[Qing-Shan],
3D Information Guided Motion Transfer via Sequential Image Based
Human Model Refinement and Face-Attention GAN,
CirSysVideo(33), No. 7, July 2023, pp. 3270-3283.
IEEE DOI
2307
Solid modeling, Image reconstruction, Faces, Image synthesis,
Deformable models, Surface reconstruction, Motion transfer,
motion image generation
BibRef
Doukas, M.C.[Michail Christos],
Ververas, E.[Evangelos],
Sharmanska, V.[Viktoriia],
Zafeiriou, S.P.[Stefanos P.],
Free-HeadGAN: Neural Talking Head Synthesis With Explicit Gaze
Control,
PAMI(45), No. 8, August 2023, pp. 9743-9756.
IEEE DOI
2307
Faces, Head, Solid modeling, Videos, Magnetic heads, Optical imaging,
Canonical 3D key-points, gaze estimation, gaze redirection,
reenactment
BibRef
Ghose, S.[Sanchita],
Prevost, J.J.[John J.],
FoleyGAN: Visually Guided Generative Adversarial Network-Based
Synchronous Sound Generation in Silent Videos,
MultMed(25), 2023, pp. 4508-4519.
IEEE DOI
2310
BibRef
Wu, R.L.[Rong-Liang],
Yu, Y.C.[Ying-Chen],
Zhan, F.N.[Fang-Neng],
Zhang, J.[Jiahui],
Zhang, X.Q.[Xiao-Qin],
Lu, S.J.[Shi-Jian],
Audio-driven talking face generation with diverse yet realistic
facial animations,
PR(144), 2023, pp. 109865.
Elsevier DOI
2310
Audio-driven talking face generation, Face, Face animation,
Audio-to-visual mapping, Image synthesis
BibRef
Wang, X.S.[Xin-Sheng],
Xie, Q.[Qicong],
Zhu, J.[Jihua],
Xie, L.[Lei],
Scharenborg, O.[Odette],
AnyoneNet: Synchronized Speech and Talking Head Generation for
Arbitrary Persons,
MultMed(25), 2023, pp. 6717-6728.
IEEE DOI
2311
BibRef
Sheng, Z.C.[Zhi-Cheng],
Nie, L.Q.[Li-Qiang],
Liu, M.[Meng],
Wei, Y.W.[Yin-Wei],
Gao, Z.[Zan],
Toward Fine-Grained Talking Face Generation,
IP(32), 2023, pp. 5794-5807.
IEEE DOI
2311
BibRef
Chen, B.[Bolin],
Wang, Z.[Zhao],
Li, B.[Binzhe],
Wang, S.Q.[Shi-Qi],
Ye, Y.[Yan],
Compact Temporal Trajectory Representation for Talking Face Video
Compression,
CirSysVideo(33), No. 11, November 2023, pp. 7009-7023.
IEEE DOI Code:
WWW Link.
2311
BibRef
Yang, D.[Daowu],
Huang, S.[Sheng],
Jiang, W.[Wen],
Zou, J.[Jin],
Implicit Memory-Based Variational Motion Talking Face Generation,
SPLetters(31), 2024, pp. 431-435.
IEEE DOI
2402
Training, Generators, Decoding, Synchronization, Feature extraction,
Facial animation, Implicit memory, speech-driven facial, audio-to-motion
BibRef
Shen, S.[Shuai],
Li, W.[Wanhua],
Huang, X.[Xiaoke],
Zhu, Z.[Zheng],
Zhou, J.[Jie],
Lu, J.W.[Ji-Wen],
SD-NeRF: Towards Lifelike Talking Head Animation via
Spatially-Adaptive Dual-Driven NeRFs,
MultMed(26), 2024, pp. 3221-3234.
IEEE DOI
2402
Head, Videos, Solid modeling, Animation, Mouth, Visualization,
Attention mechanism, neural radiance fields, talking head video synthesis
BibRef
Cheng, H.[Harry],
Guo, Y.Y.[Yang-Yang],
Yin, J.H.[Jian-Hua],
Chen, H.[Haonan],
Wang, J.[Jiafang],
Nie, L.Q.[Li-Qiang],
Audio-Driven Talking Video Frame Restoration,
MultMed(26), 2024, pp. 4110-4122.
IEEE DOI
2403
Streaming media, Faces, Lips, Task analysis, Image restoration,
Visualization, Synchronization, Frame Restoration,
Generative Adversial Network
BibRef
Hong, F.T.[Fa-Ting],
Shen, L.[Li],
Xu, D.[Dan],
DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head
Video Generation,
PAMI(46), No. 5, May 2024, pp. 2997-3012.
IEEE DOI
2404
Faces, Head, Geometry, Magnetic heads,
Estimation, Annotations, Talking head generation,
geometry-guided video generation
BibRef
Hong, F.T.[Fa-Ting],
Zhang, L.H.[Long-Hao],
Shen, L.[Li],
Xu, D.[Dan],
Depth-Aware Generative Adversarial Network for Talking Head Video
Generation,
CVPR22(3387-3396)
IEEE DOI
2210
Geometry, Learning systems, Annotations, Face recognition,
Image representation, Generative adversarial networks, Image and video synthesis and generation
BibRef
Xu, C.H.[Cheng-Hao],
Yan, J.[Jiexi],
Yang, Y.H.[Yan-Hua],
Deng, C.[Cheng],
Implicit Compositional Generative Network for Length-Variable
Co-Speech Gesture Synthesis,
MultMed(26), 2024, pp. 6325-6335.
IEEE DOI
2404
Interpolation, Encoding, Task analysis, Semantics,
Mel frequency cepstral coefficient, Transformers, Correlation,
compositional generation
BibRef
Sheng, Z.C.[Zhi-Cheng],
Nie, L.Q.[Li-Qiang],
Zhang, M.[Min],
Chang, X.J.[Xiao-Jun],
Yan, Y.[Yan],
Stochastic Latent Talking Face Generation Toward Emotional
Expressions and Head Poses,
CirSysVideo(34), No. 4, April 2024, pp. 2734-2748.
IEEE DOI
2404
Faces, Stochastic processes, Head, Dynamics, Mixers, Synchronization,
Computational modeling, Talking face generation,
continuous normalizing flow
BibRef
Zhao, M.[Minglu],
Wang, W.[Wenmin],
Chen, T.[Tongbao],
Zhang, R.[Rui],
Li, R.[Ruochen],
TA2V: Text-Audio Guided Video Generation,
MultMed(26), 2024, pp. 7250-7264.
IEEE DOI
2405
Task analysis, Semantics, Generators,
Generative adversarial networks, Computational modeling, deep learning
BibRef
Guo, J.Y.[Jia-Yi],
Manukyan, H.[Hayk],
Yang, C.Y.[Chen-Yu],
Wang, C.F.[Chao-Fei],
Khachatryan, L.[Levon],
Navasardyan, S.[Shant],
Song, S.[Shiji],
Shi, H.[Humphrey],
Huang, G.[Gao],
FaceCLIP: Facial Image-to-Video Translation via a Brief Text
Description,
CirSysVideo(34), No. 6, June 2024, pp. 4270-4284.
IEEE DOI
2406
Task analysis, Training, Faces, Solid modeling, Image synthesis,
Image reconstruction, Autoencoder, CLIP, facial video generation,
transformer
BibRef
Wang, X.P.[Xue-Ping],
Ruan, T.[Tao],
Xu, J.[Jun],
Guo, X.[Xueni],
Li, J.[Jiahe],
Yan, F.H.[Fei-Hu],
Zhao, G.Z.[Guang-Zhe],
Wang, C.Y.[Cai-Yong],
Expression-aware neural radiance fields for high-fidelity talking
portrait synthesis,
IVC(147), 2024, pp. 105075.
Elsevier DOI
2406
Talking portrait synthesis,
Expression-aware neural radiance fields, Attention mechanism
BibRef
Li, J.[Jiahe],
Zhang, J.W.[Jia-Wei],
Bai, X.[Xiao],
Zhou, J.[Jun],
Gu, L.[Lin],
Efficient Region-Aware Neural Radiance Fields for High-Fidelity
Talking Portrait Synthesis,
ICCV23(7534-7544)
IEEE DOI Code:
WWW Link.
2401
BibRef
Shin, A.H.[Ah-Hyung],
Lee, J.H.[Jae-Ho],
Hwang, J.[Jiwon],
Kim, Y.[Yoonhyung],
Park, G.M.[Gyeong-Moon],
Wav2NeRF: Audio-driven realistic talking head generation via
wavelet-based NeRF,
IVC(148), 2024, pp. 105104.
Elsevier DOI
2407
Talking head generation, Neural radiance fields,
Cross-modal generation, Audio-visual, Wavelet transform
BibRef
Qiu, H.N.[Hao-Nan],
Chen, Z.X.[Zhao-Xi],
Jiang, Y.M.[Yu-Ming],
Zhou, H.[Hang],
Fan, X.Y.[Xiang-Yu],
Yang, L.[Lei],
Wu, W.[Wayne],
Liu, Z.W.[Zi-Wei],
ReliTalk: Relightable Talking Portrait Generation from a Single Video,
IJCV(132), No. 8, August 2024, pp. 2713-2728.
Springer DOI
2408
BibRef
Gao, J.[Jian],
Shu, C.[Chang],
Zheng, X.[Ximin],
Lu, Z.[Zheng],
Bao, N.[Nengsheng],
MergeTalk: Audio-Driven Talking Head Generation From Single Image
With Feature Merge,
SPLetters(31), 2024, pp. 1850-1854.
IEEE DOI
2408
Faces, Feature extraction, Solid modeling, Lips,
Generative adversarial networks, Training, GAN
BibRef
Liu, J.[Jin],
Wang, X.[Xi],
Fu, X.M.[Xiao-Meng],
Chai, Y.S.[Ye-Sheng],
Yu, C.[Cai],
Dai, J.[Jiao],
Han, J.Z.[Ji-Zhong],
OSM-Net: One-to-Many One-Shot Talking Head Generation With
Spontaneous Head Motions,
CirSysVideo(34), No. 8, August 2024, pp. 6888-6900.
IEEE DOI
2408
Head, Mouth, Feature extraction, Faces, Circuits and systems, Shape,
Task analysis, Talking head generation, generative model,
one-to-many mapping
BibRef
Zhang, J.[Jiale],
Liu, C.X.[Cheng-Xin],
Xian, K.[Ke],
Cao, Z.G.[Zhi-Guo],
Hierarchical Feature Warping and Blending for Talking Head Animation,
CirSysVideo(34), No. 8, August 2024, pp. 7301-7314.
IEEE DOI
2408
Head, Image resolution, Task analysis, Generators, Vectors, Animation,
Iterative methods, Talking head animation, anime dataset
BibRef
Wang, S.Z.[Su-Zhen],
Ma, Y.F.[Yi-Feng],
Ding, Y.[Yu],
Hu, Z.P.[Zhi-Peng],
Fan, C.J.[Chang-Jie],
Lv, T.J.[Tang-Jie],
Deng, Z.D.[Zhi-Dong],
Yu, X.[Xin],
StyleTalk++: A Unified Framework for Controlling the Speaking Styles
of Talking Heads,
PAMI(46), No. 6, June 2024, pp. 4331-4347.
IEEE DOI
2405
Head, Videos, Magnetic heads, Codes, Faces, Decoding, Deep learning,
facial animation, head pose generation, neural network,
talking head generation
BibRef
Chu, Z.J.[Zhao-Jie],
Guo, K.L.[Kai-Ling],
Xing, X.F.[Xiao-Fen],
Lan, Y.L.[Yi-Lin],
Cai, B.[Bolun],
Xu, X.M.[Xiang-Min],
CorrTalk: Correlation Between Hierarchical Speech and Facial Activity
Variances for 3D Animation,
CirSysVideo(34), No. 9, September 2024, pp. 8953-8965.
IEEE DOI Code:
WWW Link.
2410
Facial animation, Correlation, Mouth, Decoding, Speech processing,
Visualization, 3D facial animation, hierarchical speech features,
transformer
BibRef
Wang, Z.[Zheng],
Zhang, W.[Wei],
Ye, L.[Long],
Zeng, D.[Dan],
Mei, T.[Tao],
Cross-Modal Quantization for Co-Speech Gesture Generation,
MultMed(26), 2024, pp. 10251-10263.
IEEE DOI
2410
Codes, Speech coding, Quantization (signal), Task analysis, Lips,
Speech enhancement, Co-speech gesture generation,
social robots
BibRef
Tang, A.[Anni],
He, T.Y.[Tian-Yu],
Tan, X.[Xu],
Ling, J.[Jun],
Li, R.[Runnan],
Zhao, S.[Sheng],
Bian, J.[Jiang],
Song, L.[Li],
Memories are One-to-Many Mapping Alleviators in Talking Face
Generation,
PAMI(46), No. 12, December 2024, pp. 8758-8770.
IEEE DOI
2411
Faces, Visualization, Shape, Rendering (computer graphics),
Feature extraction, Lips, Training, Memory learning,
talking face generation
BibRef
Zhang, W.X.[Wei-Xia],
Zhu, C.G.[Cheng-Guang],
Gao, J.N.[Jing-Nan],
Yan, Y.C.[Yi-Chao],
Zhai, G.T.[Guang-Tao],
Yang, X.K.[Xiao-Kang],
A Comparative Study of Perceptual Quality Metrics For Audio-Driven
Talking Head Videos,
ICIP24(1218-1224)
IEEE DOI Code:
WWW Link.
2411
Performance evaluation, Visualization, Head, Image coding,
Gaussian noise, Transform coding, Propulsion,
audio-driven talking head generation
BibRef
Lei, H.[Haobo],
Bie, Z.[Zhisong],
Jing, Z.[Zhao],
Bie, H.X.[Hong-Xia],
Talking-Head Video Compression With Motion Semantic Enhancement Model,
ICIP24(1980-1986)
IEEE DOI
2411
Visualization, Image coding, Accuracy, Semantics, Redundancy, Bit rate,
Estimation, video compression, motion semantic representation,
deformable feature estimation
BibRef
Xing, Y.Z.[Ya-Zhou],
He, Y.Q.[Ying-Qing],
Tian, Z.[Zeyue],
Wang, X.[Xintao],
Chen, Q.F.[Qi-Feng],
Seeing and Hearing: Open-domain Visual-Audio Generation with
Diffusion Latent Aligners,
CVPR24(7151-7161)
IEEE DOI Code:
WWW Link.
2410
Training, Industries, Bridges, Visualization, Noise reduction, Motion pictures
BibRef
Mughal, M.H.[Muhammad Hamza],
Dabral, R.[Rishabh],
Habibie, I.[Ikhsanul],
Donatelli, L.[Lucia],
Habermann, M.[Marc],
Theobalt, C.[Christian],
ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech
Gesture Synthesis,
CVPR24(1388-1398)
IEEE DOI
2410
Focusing, Propulsion, Diffusion models, Controllability,
gesture synthesis, multi-modal motion synthesis
BibRef
Mehta, S.[Shivam],
Deichler, A.[Anna],
O'Regan, J.[Jim],
Moëll, B.[Birger],
Beskow, J.[Jonas],
Henter, G.E.[Gustav Eje],
Alexanderson, S.[Simon],
Fake it to make it: Using synthetic data to remedy the data shortage
in joint multi-modal speech-and-gesture synthesis,
HUMOGen24(1952-1964)
IEEE DOI
2410
Training, Training data, Computer architecture, Oral communication,
motion synthesis, gesture synthesis, synthetic data,
training-on-generated-data
BibRef
Zhang, Q.[Qiang],
Xiao, T.[Tong],
Habeeb, H.[Haroun],
Laich, L.[Larissa],
Bouaziz, S.[Sofien],
Snape, P.[Patrick],
Zhang, W.J.[Wen-Jing],
Cioffi, M.[Matthew],
Zhang, P.Z.[Pei-Zhao],
Pidlypenskyi, P.[Pavel],
Lin, W.[Winnie],
Ma, L.[Luming],
Wang, M.J.[Meng-Jiao],
Li, K.[Kunpeng],
Long, C.J.[Cheng-Jiang],
Song, S.[Steven],
Prazak, M.[Martin],
Sjoholm, A.[Alexander],
Deogade, A.[Ajinkya],
Lee, J.[Jaebong],
Mangas, J.D.[Julio Delgado],
Aubel, A.[Amaury],
REFA: Real-time Egocentric Facial Animations for Virtual Reality,
ABAW24(4793-4802)
IEEE DOI
2410
Headphones, Training, Solid modeling, Virtual environments, Resists,
Cameras, Real-time systems, face tracking, virtual reality,
face alignment
BibRef
Fernandez-Labrador, C.[Clara],
Akçay, M.[Mertcan],
Abecassis, E.[Eitan],
Massich, J.[Joan],
Schroers, C.[Christopher],
DiVAS: Video and Audio Synchronization with Dynamic Frame Rates,
CVPR24(26836-26844)
IEEE DOI
2410
Crops, Production, Media, Predictive models, Transformers,
Motion pictures, Robustness
BibRef
Yang, K.D.[Karren D.],
Ranjan, A.[Anurag],
Chang, J.H.R.[Jen-Hao Rick],
Vemulapalli, R.[Raviteja],
Tuzel, O.[Oncel],
Probabilistic Speech-Driven 3D Facial Motion Synthesis: New
Benchmarks, Methods, and Applications,
CVPR24(27284-27293)
IEEE DOI
2410
Measurement, Training, Geometry, Solid modeling, Lips,
Benchmark testing, speech-driven facial animation, synthetic data generation
BibRef
Liu, Y.F.[Yi-Fei],
Cao, Q.[Qiong],
Wen, Y.D.[Yan-Dong],
Jiang, H.[Huaiguang],
Ding, C.X.[Chang-Xing],
Towards Variable and Coordinated Holistic Co-Speech Motion Generation,
CVPR24(1566-1576)
IEEE DOI Code:
WWW Link.
2410
Couplings, Codes, Quantization (signal), Avatars, Focusing,
Probabilistic logic, Human Motion Generation, Speech-Driven 3D Avatar Synthesis
BibRef
Ng, E.[Evonne],
Romero, J.[Javier],
Bagautdinov, T.[Timur],
Bai, S.J.[Shao-Jie],
Darrell, T.J.[Trevor J.],
Kanazawa, A.[Angjoo],
Richard, A.[Alexander],
From Audio to Photoreal Embodiment:
Synthesizing Humans in Conversations,
CVPR24(1001-1010)
IEEE DOI
2410
Ethics, Visualization, Photorealism, Avatars, Vector quantization,
Face recognition, gestures, generative motion, multimodal, face, body, hands
BibRef
Peng, Z.Q.[Zi-Qiao],
Hu, W.T.[Wen-Tao],
Shi, Y.[Yue],
Zhu, X.Y.[Xiang-Yu],
Zhang, X.M.[Xiao-Mei],
Zhao, H.[Hao],
He, J.[Jun],
Liu, H.Y.[Hong-Yan],
Fan, Z.X.[Zhao-Xin],
SyncTalk: The Devil is in the Synchronization for Talking Head
Synthesis,
CVPR24(666-676)
IEEE DOI Code:
WWW Link.
2410
Torso, Visualization, Technological innovation, Lips,
Neural radiance field, Generative adversarial networks, Audio-driven
BibRef
Zhang, Z.C.[Zi-Cheng],
Zheng, R.[Ruobing],
Li, B.[Bonan],
Han, C.Y.[Cong-Ying],
Li, T.Q.[Tian-Qi],
Wang, M.[Meng],
Guo, T.D.[Tian-De],
Chen, J.D.[Jing-Dong],
Liu, Z.[Ziwen],
Yang, M.[Ming],
Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis,
CVPR24(5209-5219)
IEEE DOI Code:
WWW Link.
2410
Geometry, Training, Visualization, Deformation, Dynamics, Video sequences
BibRef
Jang, Y.J.[Young-Joon],
Kim, J.H.[Ji-Hoon],
Ahn, J.[Junseok],
Kwak, D.[Doyeop],
Yang, H.S.[Hong-Sun],
Ju, Y.C.[Yoon-Cheol],
Kim, I.H.[Il-Hwan],
Kim, B.Y.[Byeong-Yeol],
Chung, J.S.[Joon Son],
Faces that Speak: Jointly Synthesising Talking Face and Speech from
Text,
CVPR24(8818-8828)
IEEE DOI
2410
Codes, Impedance matching, Noise, Text to speech,
Timbre
BibRef
Aneja, S.[Shivangi],
Thies, J.[Justus],
Dai, A.[Angela],
Niessner, M.[Matthias],
FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head
Models,
CVPR24(21263-21273)
IEEE DOI Code:
WWW Link.
2410
Hair, Solid modeling, Shape, Face recognition, Diffusion models,
Magnetic heads, Facial Animation, Digital Humans, Talking Heads
BibRef
Tan, S.[Shuai],
Ji, B.[Bin],
Pan, Y.[Ye],
FlowVQTalker: High-Quality Emotional Talking Face Generation through
Normalizing Flow and Quantization,
CVPR24(26307-26317)
IEEE DOI
2410
Quantization (signal), Codes, Computational modeling, Avatars,
Generators, Talking Face Generation, Emotional Expression
BibRef
Hogue, S.[Steven],
Zhang, C.X.[Chen-Xu],
Daruger, H.[Hamza],
Tian, Y.[Yapeng],
Guo, X.[Xiaohu],
DiffTED: One-shot Audio-driven TED Talk Video Generation with
Diffusion-based Co-speech Gestures,
HUMOGen24(1922-1931)
IEEE DOI
2410
Head, Diffusion models, Rendering (computer graphics), Animation,
Diffusion, co-speech gestures, talking video
BibRef
Yaman, D.[Dogucan],
Eyiokur, F.I.[Fevziye Irem],
Bärmann, L.[Leonard],
Akti, S.[Seymanur],
Ekenel, H.K.[Hazim Kemal],
Waibel, A.[Alexander],
Audio-Visual Speech Representation Expert for Enhanced Talking Face
Video Generation and Evaluation,
NTIRE24(6003-6013)
IEEE DOI
2410
Measurement, Training, Visualization, Lips, Face recognition,
Watermarking, Speech enhancement
BibRef
Pham, T.T.[Trong Thang],
Do, T.[Tuong],
Le, N.[Nhat],
Le, N.[Ngan],
Nguyen, H.[Hung],
Tjiputra, E.[Erman],
Tran, Q.[Quang],
Nguyen, A.[Anh],
Style Transfer for 2D Talking Head Generation,
GCV24(7500-7509)
IEEE DOI
2410
Head, Source coding, Streaming media, Animation, Generators,
Distance measurement, Feeds, Talking Head Generation
BibRef
Xu, C.[Chao],
Liu, Y.[Yang],
Xing, J.Z.[Jia-Zheng],
Wang, W.[Weida],
Sun, M.Z.[Ming-Ze],
Dan, J.[Jun],
Huang, T.X.[Tian-Xin],
Li, S.Y.[Si-Yuan],
Cheng, Z.Q.[Zhi-Qi],
Tai, Y.[Ying],
Sun, B.[Baigui],
FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking
Faces from Disentangled Audio,
CVPR24(1292-1302)
IEEE DOI
2410
Training, Geometry, Costs, Semantics, Diffusion models,
Controllability, Talking Face, FaceChain
BibRef
Zhang, C.X.[Chen-Xu],
Wang, C.[Chao],
Zhao, Y.F.[Yi-Fan],
Cheng, S.[Shuo],
Luo, L.J.[Lin-Jie],
Guo, X.[Xiaohu],
DR2: Disentangled Recurrent Representation Learning for
Data-efficient Speech Video Synthesis,
WACV24(6192-6202)
IEEE DOI
2404
Training, Representation learning, Computational modeling,
Training data, Data models, Synchronization, Algorithms, Biometrics
BibRef
Stypulkowski, M.[Michal],
Vougioukas, K.[Konstantinos],
He, S.[Sen],
Zieba, M.[Maciej],
Petridis, S.[Stavros],
Pantic, M.[Maja],
Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation,
WACV24(5089-5098)
IEEE DOI
2404
Data models, Complexity theory, Faces, Videos, Algorithms,
Generative models for image, video, 3D, etc., Algorithms, Biometrics,
Vision + language and/or other modalities
BibRef
Ni, H.M.[Hao-Miao],
Liu, J.C.[Jia-Chen],
Xue, Y.[Yuan],
Huang, S.X.[Sharon X.],
3D-Aware Talking-Head Video Motion Transfer,
WACV24(4942-4952)
IEEE DOI
2404
Geometry, Solid modeling, Head, Art, Computational modeling,
Algorithms, Generative models for image, video, 3D, etc., Algorithms,
image and video synthesis
BibRef
Gupta, H.[Honey],
Perceptual Synchronization Scoring of Dubbed Content using
Phoneme-Viseme Agreement,
VAQuality24(392-402)
IEEE DOI
2404
Correlation, Streaming media, Synchronization, History
BibRef
Lee, D.[Dongyeun],
Kim, C.[Chaewon],
Yu, S.J.[Sang-Joon],
Yoo, J.[Jaejun],
Park, G.M.[Gyeong-Moon],
RADIO: Reference-Agnostic Dubbing Video Synthesis,
WACV24(4156-4166)
IEEE DOI
2404
Shape, Lips, Modulation, Robustness, Decoding, Synchronization, Mirrors,
Algorithms, Computational photography, image and video synthesis
BibRef
Sung-Bin, K.[Kim],
Hyun, L.[Lee],
Hong, D.H.[Da Hye],
Nam, S.[Suekyeong],
Ju, J.[Janghoon],
Oh, T.H.[Tae-Hyun],
LaughTalk: Expressive 3D Talking Head Generation with Laughter,
WACV24(6392-6401)
IEEE DOI
2404
Training, Solid modeling, Lips, Fires, Synchronization, Algorithms,
Biometrics, face, gesture, body pose, Algorithms, 3D computer vision
BibRef
Zhao, L.[Lin],
Li, H.X.[Hong-Xuan],
Ning, X.F.[Xue-Fei],
Jiang, X.[Xinru],
THInImg: Cross-modal Steganography for Presenting Talking Heads in
Images,
WACV24(5541-5550)
IEEE DOI
2404
Steganography, Image resolution, Pipelines, Computer architecture,
Copyright protection, Encoding, Algorithms,
image and video synthesis
BibRef
Mukhopadhyay, S.[Soumik],
Suri, S.[Saksham],
Gadde, R.T.[Ravi Teja],
Shrivastava, A.[Abhinav],
Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization,
WACV24(5280-5290)
IEEE DOI Code:
WWW Link.
2404
Image quality, Measurement, Degradation, Lips,
Entertainment industry, Synchronization, Task analysis, Algorithms,
image and video synthesis
BibRef
Liu, Y.F.[Yun-Fei],
Lin, L.J.[Li-Jian],
Yu, F.[Fei],
Zhou, C.Y.[Chang-Yin],
Li, Y.[Yu],
MODA: Mapping-Once Audio-driven Portrait Animation with Dual
Attentions,
ICCV23(22963-22972)
IEEE DOI
2401
BibRef
Zheng, J.L.[Jia-Li],
Jang, Y.[Youngkyoon],
Papaioannou, A.[Athanasios],
Kampouris, C.[Christos],
Potamias, R.A.[Rolandos Alexandros],
Papantoniou, F.P.[Foivos Paraperas],
Galanakis, E.[Efstathios],
Leonardis, A.[Aleš],
Zafeiriou, S.[Stefanos],
ILSH: The Imperial Light-Stage Head Dataset for Human Head View
Synthesis,
RHWC23(1104-1112)
IEEE DOI
2401
BibRef
Muaz, U.[Urwa],
Jang, W.[Wondong],
Tripathi, R.[Rohun],
Mani, S.[Santhosh],
Ouyang, W.B.[Wen-Bin],
Gadde, R.T.[Ravi Teja],
Gecer, B.[Baris],
Elizondo, S.[Sergio],
Madad, R.[Reza],
Nair, N.[Naveen],
SIDGAN: High-Resolution Dubbed Video Generation via Shift-Invariant
Learning,
ICCV23(7799-7808)
IEEE DOI
2401
BibRef
Li, W.C.[Wei-Chuang],
Zhang, L.[Longhao],
Wang, D.[Dong],
Zhao, B.[Bin],
Wang, Z.G.[Zhi-Gang],
Chen, M.[Mulin],
Zhang, B.[Bang],
Wang, Z.J.[Zhong-Jian],
Bo, L.[Liefeng],
Li, X.L.[Xue-Long],
One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural
Radiance Field,
CVPR23(17969-17978)
IEEE DOI
2309
BibRef
Niu, Z.[Zhe],
Mak, B.[Brian],
On the Audio-visual Synchronization for Lip-to-Speech Synthesis,
ICCV23(7809-7818)
IEEE DOI
2401
BibRef
Jeong, Y.J.[Yu-Jin],
Ryoo, W.[Wonjeong],
Lee, S.H.[Seung-Hyun],
Seo, D.[Dabin],
Byeon, W.[Wonmin],
Kim, S.[Sangpil],
Kim, J.[Jinkyu],
The Power of Sound (TPoS): Audio Reactive Video Generation with
Stable Diffusion,
ICCV23(7788-7798)
IEEE DOI Code:
WWW Link.
2401
BibRef
Ki, T.[Taekyung],
Min, D.[Dongchan],
StyleLipSync: Style-based Personalized Lip-sync Video Generation,
ICCV23(22784-22793)
IEEE DOI
2401
BibRef
Gururani, S.[Siddharth],
Mallya, A.[Arun],
Wang, T.C.[Ting-Chun],
Valle, R.[Rafael],
Liu, M.Y.[Ming-Yu],
SPACE: Speech-driven Portrait Animation with Controllable Expression,
ICCV23(20857-20866)
IEEE DOI Code:
WWW Link.
2401
BibRef
Thambiraja, B.[Balamurugan],
Habibie, I.[Ikhsanul],
Aliakbarian, S.[Sadegh],
Cosker, D.[Darren],
Theobalt, C.[Christian],
Thies, J.[Justus],
Imitator: Personalized Speech-driven 3D Facial Animation,
ICCV23(20564-20574)
IEEE DOI Code:
WWW Link.
2401
BibRef
Yang, P.[Peiji],
Wei, H.[Huawei],
Zhong, Y.C.[Yi-Cheng],
Wang, Z.S.[Zhi-Sheng],
Semi-supervised Speech-driven 3D Facial Animation via Cross-modal
Encoding,
ICCV23(20975-20984)
IEEE DOI
2401
BibRef
Wu, X.Z.[Xiu-Zhe],
Hu, P.F.[Peng-Fei],
Wu, Y.[Yang],
Lyu, X.Y.[Xiao-Yang],
Cao, Y.P.[Yan-Pei],
Shan, Y.[Ying],
Yang, W.M.[Wen-Ming],
Sun, Z.Q.[Zhong-Qian],
Qi, X.J.[Xiao-Juan],
Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a
Short Video,
ICCV23(22111-22120)
IEEE DOI Code:
WWW Link.
2401
BibRef
He, S.[Shan],
He, H.[Haonan],
Yang, S.[Shuo],
Wu, X.Y.[Xiao-Yan],
Xia, P.C.[Peng-Cheng],
Yin, B.[Bing],
Liu, C.[Cong],
Dai, L.R.[Li-Rong],
Xu, C.[Chang],
Speech4Mesh: Speech-Assisted Monocular 3D Facial Reconstruction for
Speech-Driven 3D Facial Animation,
ICCV23(14146-14156)
IEEE DOI
2401
BibRef
Choi, J.[Jeongsoo],
Hong, J.[Joanna],
Ro, Y.M.[Yong Man],
DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided
Speaker Embedding,
ICCV23(7778-7787)
IEEE DOI
2401
BibRef
Gong, Y.[Yuan],
Zhang, Y.[Yong],
Cun, X.D.[Xiao-Dong],
Yin, F.[Fei],
Fan, Y.B.[Yan-Bo],
Wang, X.[Xuan],
Wu, B.Y.[Bao-Yuan],
Yang, Y.[Yujiu],
ToonTalker: Cross-Domain Face Reenactment,
ICCV23(7656-7666)
IEEE DOI
2401
BibRef
Yu, Z.T.[Zhen-Tao],
Yin, Z.X.[Zi-Xin],
Zhou, D.Y.[De-Yu],
Wang, D.[Duomin],
Wong, F.[Finn],
Wang, B.Y.[Bao-Yuan],
Talking Head Generation with Probabilistic Audio-to-Visual Diffusion
Priors,
ICCV23(7611-7621)
IEEE DOI
2401
BibRef
Oorloff, T.[Trevine],
Yacoob, Y.[Yaser],
Expressive Talking Head Video Encoding in StyleGAN2 Latent Space,
CVEU23(2990-2999)
IEEE DOI Code:
WWW Link.
2401
BibRef
Chen, L.Y.[Li-Yang],
Wu, Z.Y.[Zhi-Yong],
Li, R.[Runnan],
Bao, W.H.[Wei-Hong],
Ling, J.[Jun],
Tan, X.[Xu],
Zhao, S.[Sheng],
VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial
Style Transfer,
CVEU23(2969-2979)
IEEE DOI
2401
BibRef
Hong, F.T.[Fa-Ting],
Xu, D.[Dan],
Implicit Identity Representation Conditioned Memory Compensation
Network for Talking Head Video Generation,
ICCV23(23005-23015)
IEEE DOI
2401
BibRef
Gan, Y.[Yuan],
Yang, Z.X.[Zong-Xin],
Yue, X.[Xihang],
Sun, L.Y.[Ling-Yun],
Yang, Y.[Yi],
Efficient Emotional Adaptation for Audio-Driven Talking-Head
Generation,
ICCV23(22577-22588)
IEEE DOI Code:
WWW Link.
2401
BibRef
Nocentini, F.[Federico],
Ferrari, C.[Claudio],
Berretti, S.[Stefano],
Learning Landmarks Motion from Speech for Speaker-agnostic 3d Talking
Heads Generation,
CIAP23(I:340-351).
Springer DOI
2312
BibRef
Wang, Q.[Qianrui],
Li, D.[Dengshi],
Liao, L.[Liang],
Song, H.[Hao],
Li, W.[Wei],
Xiao, J.[Jing],
ASVFI: Audio-Driven Speaker Video Frame Interpolation,
ICIP23(3200-3204)
IEEE DOI
2312
BibRef
Zeng, B.[Bohan],
Liu, X.[Xuhui],
Gao, S.C.[Si-Cheng],
Liu, B.[Boyu],
Li, H.[Hong],
Liu, J.Z.[Jian-Zhuang],
Zhang, B.C.[Bao-Chang],
Face Animation with an Attribute-Guided Diffusion Model,
GCV23(628-637)
IEEE DOI
2309
BibRef
Shen, S.[Shuai],
Zhao, W.L.[Wen-Liang],
Meng, Z.[Zibin],
Li, W.[Wanhua],
Zhu, Z.[Zheng],
Zhou, J.[Jie],
Lu, J.W.[Ji-Wen],
DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven
Portraits Animation,
CVPR23(1982-1991)
IEEE DOI
2309
BibRef
Zhang, B.[Bowen],
Qi, C.Y.[Chen-Yang],
Zhang, P.[Pan],
Zhang, B.[Bo],
Wu, H.[HsiangTao],
Chen, D.[Dong],
Chen, Q.F.[Qi-Feng],
Wang, Y.[Yong],
Wen, F.[Fang],
MetaPortrait: Identity-Preserving Talking Head Generation with Fast
Personalized Adaptation,
CVPR23(22096-22105)
IEEE DOI
2309
BibRef
Vahdati, D.S.[Danial Samadi],
Nguyen, T.D.[Tai Duc],
Stamm, M.C.[Matthew C.],
Defending Low-Bandwidth Talking Head Videoconferencing Systems From
Real-Time Puppeteering Attacks,
WMF23(983-992)
IEEE DOI
2309
BibRef
Wang, D.[Duomin],
Deng, Y.[Yu],
Yin, Z.X.[Zi-Xin],
Shum, H.Y.[Heung-Yeung],
Wang, B.Y.[Bao-Yuan],
Progressive Disentangled Representation Learning for Fine-Grained
Controllable Talking Head Synthesis,
CVPR23(17979-17989)
IEEE DOI
2309
BibRef
Wang, J.[Jiayu],
Zhao, K.[Kang],
Zhang, S.W.[Shi-Wei],
Zhang, Y.Y.[Ying-Ya],
Shen, Y.J.[Yu-Jun],
Zhao, D.L.[De-Li],
Zhou, J.[Jingren],
LipFormer: High-fidelity and Generalizable Talking Face Generation
with A Pre-learned Facial Codebook,
CVPR23(13844-13853)
IEEE DOI
2309
BibRef
Zhong, W.Z.[Wei-Zhi],
Fang, C.W.[Chao-Wei],
Cai, Y.Q.[Yin-Qi],
Wei, P.X.[Peng-Xu],
Zhao, G.M.[Gang-Ming],
Lin, L.[Liang],
Li, G.B.[Guan-Bin],
Identity-Preserving Talking Face Generation with Landmark and
Appearance Priors,
CVPR23(9729-9738)
IEEE DOI
2309
BibRef
Zhang, W.X.[Wen-Xuan],
Cun, X.D.[Xiao-Dong],
Wang, X.[Xuan],
Zhang, Y.[Yong],
Shen, X.[Xi],
Guo, Y.[Yu],
Shan, Y.[Ying],
Wang, F.[Fei],
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized
Audio-Driven Single Image Talking Face Animation,
CVPR23(8652-8661)
IEEE DOI
2309
BibRef
Gao, Y.[Yue],
Zhou, Y.[Yuan],
Wang, J.L.[Jing-Lu],
Li, X.[Xiao],
Ming, X.[Xiang],
Lu, Y.[Yan],
High-Fidelity and Freely Controllable Talking Head Video Generation,
CVPR23(5609-5619)
IEEE DOI
2309
BibRef
Huang, R.C.[Ri-Cong],
Lai, P.W.[Pei-Wen],
Qin, Y.P.[Yi-Peng],
Li, G.B.[Guan-Bin],
Parametric Implicit Face Representation for Audio-Driven Facial
Reenactment,
CVPR23(12759-12768)
IEEE DOI
2309
BibRef
Xing, J.[Jinbo],
Xia, M.[Menghan],
Zhang, Y.[Yuechen],
Cun, X.D.[Xiao-Dong],
Wang, J.[Jue],
Wong, T.T.[Tien-Tsin],
CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion
Prior,
CVPR23(12780-12790)
IEEE DOI
2309
BibRef
Guan, J.Z.[Jia-Zhi],
Zhang, Z.W.[Zhan-Wang],
Zhou, H.[Hang],
Hu, T.S.[Tian-Shu],
Wang, K.Y.[Kaisi-Yuan],
He, D.L.[Dong-Liang],
Feng, H.C.[Hao-Cheng],
Liu, J.T.[Jing-Tuo],
Ding, E.[Errui],
Liu, Z.W.[Zi-Wei],
Wang, J.D.[Jing-Dong],
StyleSync: High-Fidelity Generalized and Personalized Lip Sync in
Style-Based Generator,
CVPR23(1505-1515)
IEEE DOI
2309
BibRef
Chatziagapi, A.[Aggelina],
Samaras, D.[Dimitris],
AVFace: Towards Detailed Audio-Visual 4D Face Reconstruction,
CVPR23(16878-16889)
IEEE DOI
2309
BibRef
Chatziagapi, A.[Aggelina],
Athar, S.[ShahRukh],
Jain, A.[Abhinav],
MV, R.[Rohith],
Bhat, V.[Vimal],
Samaras, D.[Dimitris],
LipNeRF: What is the right feature space to lip-sync a NeRF?,
FG23(1-8)
IEEE DOI
2303
Geometry, Solid modeling, Visualization, Lips, Face recognition, Lighting
BibRef
Gustafson, J.[Joakim],
Székely, E.[Eva],
Alexanderson, S.[Simon],
Beskow, J.[Jonas],
Casual chatter or speaking up? Adjusting articulatory effort in
generation of speech and animation for conversational characters,
FG23(1-4)
IEEE DOI
2303
Lips, Face recognition, Social robots, Speech recognition,
Gesture recognition, Animation, Behavioral sciences
BibRef
Agarwal, A.[Aditya],
Sen, B.[Bipasha],
Mukhopadhyay, R.[Rudrabha],
Namboodiri, V.[Vinay],
Jawahar, C.V.,
Towards MOOCs for Lipreading: Using Synthetic Talking Heads to Train
Humans in Lipreading at Scale,
WACV23(2216-2225)
IEEE DOI
2302
Training, Vocabulary, Statistical analysis, Veins, Pipelines,
Auditory system, Applications: Social good, Education
BibRef
Agarwal, M.[Madhav],
Mukhopadhyay, R.[Rudrabha],
Namboodiri, V.[Vinay],
Jawahar, C.V.[C V],
Audio-Visual Face Reenactment,
WACV23(5167-5176)
IEEE DOI
2302
Measurement, Visualization, Motion segmentation, Lips, Mouth,
Streaming media, Generators, Commercial/retail
BibRef
Gupta, A.[Anchit],
Mukhopadhyay, R.[Rudrabha],
Balachandra, S.[Sindhu],
Khan, F.F.[Faizan Farooq],
Namboodiri, V.P.[Vinay P.],
Jawahar, C.V.,
Towards Generating Ultra-High Resolution Talking-Face Videos with Lip
synchronization,
WACV23(5198-5207)
IEEE DOI
2302
Industries, Shape, Lips, Motion pictures, Generators, Synchronization,
Algorithms: Vision + language and/or other modalities, Education
BibRef
Kim, K.[Kangyeol],
Park, S.[Sunghyun],
Lee, J.[Jaeseong],
Chung, S.[Sunghyo],
Lee, J.[Junsoo],
Choo, J.[Jaegul],
AnimeCeleb: Large-Scale Animation CelebHeads Dataset for Head
Reenactment,
ECCV22(VIII:414-430).
Springer DOI
2211
BibRef
Ling, J.W.[Jing-Wang],
Wang, Z.B.[Zhi-Bo],
Lu, M.[Ming],
Wang, Q.[Quan],
Qian, C.[Chen],
Xu, F.[Feng],
Structure-Aware Editable Morphable Model for 3D Facial Detail Animation
and Manipulation,
ECCV22(III:249-267).
Springer DOI
2211
BibRef
Youwang, K.[Kim],
Ji-Yeon, K.[Kim],
Oh, T.H.[Tae-Hyun],
CLIP-Actor: Text-Driven Recommendation and Stylization for Animating
Human Meshes,
ECCV22(III:173-191).
Springer DOI
2211
BibRef
Zhou, M.[Mohan],
Bai, Y.[Yalong],
Zhang, W.[Wei],
Yao, T.[Ting],
Zhao, T.J.[Tie-Jun],
Mei, T.[Tao],
Responsive Listening Head Generation: A Benchmark Dataset and Baseline,
ECCV22(XXXVIII:124-142).
Springer DOI
2211
BibRef
Hong, J.[Joanna],
Kim, M.[Minsu],
Ro, Y.M.[Yong Man],
VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via
Speech-Visage Feature Selection,
ECCV22(XXXVI:452-468).
Springer DOI
2211
BibRef
Shen, S.[Shuai],
Li, W.[Wanhua],
Zhu, Z.[Zheng],
Duan, Y.[Yueqi],
Zhou, J.[Jie],
Lu, J.W.[Ji-Wen],
Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head
Synthesis,
ECCV22(XII:666-682).
Springer DOI
2211
BibRef
Liang, B.[Borong],
Pan, Y.[Yan],
Guo, Z.[Zhi-Zhi],
Zhou, H.[Hang],
Hong, Z.B.[Zhi-Bin],
Han, X.G.[Xiao-Guang],
Han, J.Y.[Jun-Yu],
Liu, J.[Jingtuo],
Ding, E.[Errui],
Wang, J.D.[Jing-Dong],
Expressive Talking Head Generation with Granular Audio-Visual Control,
CVPR22(3377-3386)
IEEE DOI
2210
Training, Visualization, Shape, Lips, Pipelines, Mouth,
Speech recognition, Image and video synthesis and generation
BibRef
Liu, X.[Xian],
Wu, Q.Y.[Qian-Yi],
Zhou, H.[Hang],
Xu, Y.H.[Ying-Hao],
Qian, R.[Rui],
Lin, X.Y.[Xin-Yi],
Zhou, X.W.[Xiao-Wei],
Wu, W.[Wayne],
Dai, B.[Bo],
Zhou, B.[Bolei],
Learning Hierarchical Cross-Modal Association for Co-Speech Gesture
Generation,
CVPR22(10452-10462)
IEEE DOI
2210
Avatars, Semantics, Pipelines, Vision + X,
Face and gestures
BibRef
Medina, S.[Salvador],
Tome, D.[Denis],
Stoll, C.[Carsten],
Tiede, M.[Mark],
Munhall, K.[Kevin],
Hauptmann, A.[Alex],
Matthews, I.[Iain],
Speech Driven Tongue Animation,
CVPR22(20374-20384)
IEEE DOI
2210
Solid modeling, Tongue, Lips, Face recognition, Mouth,
Speech recognition, Face and gestures, Vision + graphics
BibRef
Song, H.K.[Hyoung-Kyu],
Woo, S.H.[Sang Hoon],
Lee, J.[Junhyeok],
Yang, S.M.[Seung-Min],
Cho, H.[Hyunjae],
Lee, Y.[Youseong],
Choi, D.[Dongho],
Kim, K.W.[Kang-Wook],
Talking Face Generation with Multilingual TTS,
CVPR22(21393-21398)
IEEE DOI
2210
Training, Face recognition, Buildings, Speech synthesis, Faces, Videos
BibRef
Richard, A.[Alexander],
Zollhöfer, M.[Michael],
Wen, Y.D.[Yan-Dong],
de la Torre, F.[Fernando],
Sheikh, Y.[Yaser],
MeshTalk: 3D Face Animation from Speech using Cross-Modality
Disentanglement,
ICCV21(1153-1162)
IEEE DOI
2203
Tongue, Telepresence, Lips, Scalability, Cameras,
Vision + other modalities, Faces
BibRef
Zhang, C.X.[Chen-Xu],
Zhao, Y.F.[Yi-Fan],
Huang, Y.F.[Yi-Fei],
Zeng, M.[Ming],
Ni, S.F.[Sai-Feng],
Budagavi, M.[Madhukar],
Guo, X.[Xiaohu],
FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute
Learning,
ICCV21(3847-3856)
IEEE DOI
2203
Solid modeling, Correlation, Lips, Dynamics,
Rendering (computer graphics), Generative adversarial networks,
Image and video synthesis
BibRef
Guo, Y.D.[Yu-Dong],
Chen, K.Y.[Ke-Yu],
Liang, S.[Sen],
Liu, Y.J.[Yong-Jin],
Bao, H.J.[Hu-Jun],
Zhang, J.Y.[Ju-Yong],
AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head
Synthesis,
ICCV21(5764-5774)
IEEE DOI
2203
Hair, Solid modeling, Head, Neural networks, Meetings,
Neural generative models, Image and video synthesis, Vision + other modalities
BibRef
Meshry, M.[Moustafa],
Suri, S.[Saksham],
Davis, L.S.[Larry S.],
Shrivastava, A.[Abhinav],
Learned Spatial Representations for Few-shot Talking-Head Synthesis,
ICCV21(13809-13818)
IEEE DOI
2203
Solid modeling, Head, Shape, Image color analysis, Layout, Lighting,
Image and video synthesis, Neural generative models, Representation learning
BibRef
Wang, Q.L.[Qiu-Lin],
Zhang, L.[Lu],
Li, B.[Bo],
SAFA: Structure Aware Face Animation,
3DV21(679-688)
IEEE DOI
2201
Hair, Geometry, Solid modeling, Face recognition, Transforms,
Animation, Face Animation, 3DMM, GAN
BibRef
Zhou, H.[Hang],
Sun, Y.S.[Ya-Sheng],
Wu, W.[Wayne],
Loy, C.C.[Chen Change],
Wang, X.G.[Xiao-Gang],
Liu, Z.W.[Zi-Wei],
Pose-Controllable Talking Face Generation by Implicitly Modularized
Audio-Visual Representation,
CVPR21(4174-4184)
IEEE DOI
2111
Speech coding, Face recognition, Lips,
Speech recognition, Aerospace electronics, Robustness
BibRef
Ye, S.Q.[Shu-Quan],
Han, C.[Chu],
Lin, J.Y.[Jia-Ying],
Han, G.Q.[Guo-Qiang],
He, S.F.[Sheng-Feng],
Coherence and Identity Learning for Arbitrary-length Face Video
Generation,
ICPR21(915-922)
IEEE DOI
2105
Interpolation, Visualization, Face recognition,
Coherence, Generative adversarial networks, Generators
BibRef
Zheng, A.[Aihua],
Zhu, F.X.[Fei-Xia],
Zhu, H.[Hao],
Luo, M.[Mandi],
He, R.[Ran],
Talking Face Generation via Learning Semantic and Temporal
Synchronous Landmarks,
ICPR21(3682-3689)
IEEE DOI
2105
Learning systems, Visualization, Face recognition, Semantics, Mouth,
Detectors, Benchmark testing
BibRef
Zheng, R.B.[Ruo-Bing],
Zhu, Z.[Zhou],
Song, B.[Bo],
Ji, C.J.[Chang-Jiang],
A Neural Lip-Sync Framework for Synthesizing Photorealistic Virtual
News Anchors,
ICPR21(5286-5293)
IEEE DOI
2105
Deep learning, Visualization, Convolution, Lips, Neural networks,
Mouth, Rendering (computer graphics), lip sync, virtual anchor,
neural rendering
BibRef
Liang, H.[Hao],
Yu, L.[Lulan],
Xu, G.[Guikang],
Raj, B.[Bhiksha],
Singh, R.[Rita],
Controlled Autoencoders to Generate Faces from Voices,
ISVC20(I:476-487).
Springer DOI
2103
BibRef
Liao, M.[Miao],
Zhang, S.[Sibo],
Wang, P.[Peng],
Zhu, H.[Hao],
Zuo, X.[Xinxin],
Yang, R.G.[Rui-Gang],
Speech2video Synthesis with 3d Skeleton Regularization and Expressive
Body Poses,
ACCV20(V:308-323).
Springer DOI
2103
BibRef
Xue, H.,
Ling, J.,
Song, L.,
Xie, R.,
Zhang, W.,
Realistic Talking Face Synthesis With Geometry-Aware Feature
Transformation,
ICIP20(1581-1585)
IEEE DOI
2011
Talking face synthesis, GANs, geometry-aware, feature transformation
BibRef
Chen, L.[Lele],
Cui, G.F.[Guo-Feng],
Liu, C.[Celong],
Li, Z.[Zhong],
Kou, Z.[Ziyi],
Xu, Y.[Yi],
Xu, C.L.[Chen-Liang],
Talking-head Generation with Rhythmic Head Motion,
ECCV20(IX:35-51).
Springer DOI
2011
BibRef
Khakhulin, T.[Taras],
Sklyarova, V.[Vanessa],
Lempitsky, V.[Victor],
Zakharov, E.[Egor],
Realistic One-Shot Mesh-Based Head Avatars,
ECCV22(II:345-362).
Springer DOI
2211
BibRef
Zakharov, E.[Egor],
Ivakhnenko, A.[Aleksei],
Shysheya, A.[Aliaksandra],
Lempitsky, V.[Victor],
Fast Bi-layer Neural Synthesis of One-shot Realistic Head Avatars,
ECCV20(XII: 524-540).
Springer DOI
2010
BibRef
Chu, H.[Hang],
Ma, S.G.[Shu-Gao],
de la Torre, F.[Fernando],
Fidler, S.[Sanja],
Sheikh, Y.[Yaser],
Expressive Telepresence via Modular Codec Avatars,
ECCV20(XII: 330-345).
Springer DOI
2010
BibRef
Thies, J.[Justus],
Elgharib, M.[Mohamed],
Tewari, A.[Ayush],
Theobalt, C.[Christian],
Nießner, M.[Matthias],
Neural Voice Puppetry: Audio-driven Facial Reenactment,
ECCV20(XVI: 716-731).
Springer DOI
2010
BibRef
Pang, W.[Wei],
Wang, X.J.[Xiao-Jie],
Guessing State Tracking for Visual Dialogue,
ECCV20(XVI: 683-698).
Springer DOI
2010
BibRef
Das, D.[Dipanjan],
Biswas, S.[Sandika],
Sinha, S.[Sanjana],
Bhowmick, B.[Brojeshwar],
Speech-driven Facial Animation Using Cascaded GANs for Learning of
Motion and Texture,
ECCV20(XXX: 408-424).
Springer DOI
2010
BibRef
Kumar, N.,
Goel, S.,
Narang, A.,
Hasan, M.,
Robust One Shot Audio to Video Generation,
DeepVision20(3334-3343)
IEEE DOI
2008
Generators, Adaptation models, Synchronization, Lips,
Mouth
BibRef
Shalev, Y.[Yoav],
Wolf, L.B.[Lior B.],
Image Animation with Perturbed Masks,
CVPR22(3637-3646)
IEEE DOI
2210
Codes, Shape, Perturbation methods, Benchmark testing, Generators,
Encoding, Image and video synthesis and generation,
Vision applications and systems
BibRef
Shalev, Y.[Yoav],
Wolf, L.B.[Lior B.],
End to End Lip Synchronization with a Temporal AutoEncoder,
WACV20(330-339)
IEEE DOI
2006
Visualization, Streaming media, Decoding, Lips, Face, Tensile stress, Mouth
BibRef
Mittal, G.,
Wang, B.,
Animating Face using Disentangled Audio Representations,
WACV20(3279-3287)
IEEE DOI
2006
Face, Solid modeling, Phonetics, Predictive models
BibRef
Chen, Z.,
Liu, Z.,
Hu, H.,
Bai, J.,
Lian, S.,
Shi, F.,
Wang, K.,
A Realistic Face-to-Face Conversation System Based on Deep Neural
Networks,
ACVR19(2575-2583)
IEEE DOI
2004
avatars, computer animation, face recognition,
learning (artificial intelligence), neural nets,
Face Synthesizer
BibRef
Zakharov, E.,
Shysheya, A.,
Burkov, E.,
Lempitsky, V.,
Few-Shot Adversarial Learning of Realistic Neural Talking Head Models,
ICCV19(9458-9467)
IEEE DOI
2004
convolutional neural nets, image classification,
learning (artificial intelligence), video signal processing, Adaptation models
BibRef
Cudeiro, D.[Daniel],
Bolkart, T.[Timo],
Laidlaw, C.[Cassidy],
Ranjan, A.[Anurag],
Black, M.J.[Michael J.],
Capture, Learning, and Synthesis of 3D Speaking Styles,
CVPR19(10093-10103).
IEEE DOI
2002
BibRef
Chen, L.[Lele],
Maddox, R.K.[Ross K.],
Duan, Z.Y.[Zhi-Yao],
Xu, C.L.[Chen-Liang],
Hierarchical Cross-Modal Talking Face Generation With Dynamic
Pixel-Wise Loss,
CVPR19(7824-7833).
IEEE DOI
2002
BibRef
Chu, H.[Hang],
Li, D.Q.[Dai-Qing],
Fidler, S.[Sanja],
A Face-to-Face Neural Conversation Model,
CVPR18(7113-7121)
IEEE DOI
1812
Conversation.
Face, Motion pictures, Adaptation models, Hidden Markov models,
Decoding, History, Encoding
BibRef
Afouras, T.[Triantafyllos],
Asano, Y.M.[Yuki M.],
Fagan, F.[Francois],
Vedaldi, A.[Andrea],
Metze, F.[Florian],
Self-supervised object detection from audio-visual correspondence,
CVPR22(10565-10576)
IEEE DOI
2210
Training, Location awareness, Heating systems, Instruments, Music,
Detectors, Object detection, Vision + X, Recognition: detection,
Self-& semi-& meta- & unsupervised learning
BibRef
Feng, C.[Chao],
Chen, Z.Y.[Zi-Yang],
Owens, A.[Andrew],
Self-Supervised Video Forensics by Audio-Visual Anomaly Detection,
CVPR23(10491-10503)
IEEE DOI
2309
BibRef
Afouras, T.[Triantafyllos],
Owens, A.[Andrew],
Chung, J.S.[Joon Son],
Zisserman, A.[Andrew],
Self-supervised Learning of Audio-visual Objects from Video,
ECCV20(XVIII:208-224).
Springer DOI
2012
BibRef
Wiles, O.[Olivia],
Koepke, A.S.[A. Sophia],
Zisserman, A.[Andrew],
Self-Supervised Learning of Class Embeddings from Video,
CEFRL19(3019-3027)
IEEE DOI
2004
BibRef
Earlier:
X2Face: A Network for Controlling Face Generation Using Images, Audio,
and Pose Codes,
ECCV18(XIII: 690-706).
Springer DOI
1810
feature extraction, pose estimation, probability,
supervised learning, video signal processing, learned representations.
BibRef
Modrzejewski, M.[Mateusz],
Rokita, P.[Przemyslaw],
Graphical Interface Design for Chatbots for the Needs of Artificial
Intelligence Support in Web and Mobile Applications,
ICCVG18(48-56).
Springer DOI
1810
BibRef
Nakatani, S.[Shota],
Saiki, S.[Sachio],
Nakamura, M.[Masahide],
Yasuda, K.[Kiyoshi],
Generating Personalized Virtual Agent in Speech Dialogue System for
People with Dementia,
DHM18(326-337).
Springer DOI
1807
BibRef
Huber, B.,
McDuff, D.,
Facial Expression Grounded Conversational Dialogue Generation,
FG18(365-372)
IEEE DOI
1806
Face, Gesture recognition, Dialogue Generation, Facial Expression
BibRef
Luo, C.,
Yu, J.,
Li, X.,
Zhang, L.,
HMM based speech-driven 3D tongue animation,
ICIP17(4377-4381)
IEEE DOI
1803
Animation, Deformable models, Hidden Markov models, Sensors,
Solid modeling, Speech, Tongue, articulatory movements,
tongue animation
BibRef
Filntisis, P.P.,
Katsamanis, A.,
Maragos, P.,
Photorealistic adaptation and interpolation of facial expressions
using HMMs and AAMs for audio-visual speech synthesis,
ICIP17(2941-2945)
IEEE DOI
1803
Acoustics, Active appearance model, Adaptation models,
Hidden Markov models, Interpolation, Shape, Visualization,
photorealistic
BibRef
Filntisis, P.P.,
Katsamanis, A.,
Maragos, P.,
Demonstration of an HMM-based photorealistic expressive audio-visual
speech synthesis system,
ICIP17(4588-4588)
IEEE DOI
1803
Face, Hidden Markov models, Interpolation, Portable computers,
Speech, Speech synthesis, Technological innovation
BibRef
Wang, Y.B.[Yan-Bin],
Khoo, E.T.[Eng Tat],
Real-time stereoscopic rendering of realistic avatar for interactive
3D telepresence system,
ICIVC17(577-581)
IEEE DOI
1708
Avatars, Cameras, Collaboration, Rendering (computer graphics),
Virtual environments,
avatar rendering, collaborative virtual environment,
stereoscopic 3D, telepresence, virtual, reality
BibRef
Yu, J.[Jun],
A Real-Time 3D Visual Singing Synthesis:
From Appearance to Internal Articulators,
MMMod17(I: 53-64).
Springer DOI
1701
BibRef
Yu, J.[Jun],
Speech Synchronized Tongue Animation by Combining Physiology Modeling
and X-ray Image Fitting,
MMMod17(I: 726-737).
Springer DOI
1701
BibRef
Dawood, S.[Samia],
Hicks, Y.[Yulia],
Marshall, D.[David],
Speech-Driven Facial Animation Using Manifold Relevance Determination,
CVAVM16(II: 869-882).
Springer DOI
1611
BibRef
Malleson, C.[Charles],
Bazin, J.C.,
Wang, O.,
Bradley, D.,
Beeler, T.,
Hilton, A.[Adrian],
Sorkine-Hornung, A.,
FaceDirector: Continuous Control of Facial Performance in Video,
ICCV15(3979-3987)
IEEE DOI
1602
Face
BibRef
Musti, U.[Utpala],
Zhou, Z.H.[Zi-Heng],
Pietikainen, M.[Matti],
Facial 3D Shape Estimation from Images for Visual Speech Animation,
ICPR14(40-45)
IEEE DOI
1412
Estimation
BibRef
Yu, J.[Jun],
Li, A.[Aijun],
3D visual pronunciation of Mandarin Chinese for language learning,
ICIP14(2036-2040)
IEEE DOI
1502
Animation
BibRef
Anderson, R.[Robert],
Stenger, B.[Bjorn],
Wan, V.[Vincent],
Cipolla, R.[Roberto],
Expressive Visual Text-to-Speech Using Active Appearance Models,
CVPR13(3382-3389)
IEEE DOI
1309
See also Dense Active Appearance Models Using a Bounded Diameter Minimum Spanning Tree.
BibRef
Manresa-Yee, C.[Cristina],
Ponsa, P.[Pere],
Arellano, D.[Diana],
Larrea, M.[Martin],
An Avatar Acceptance Study for Home Automation Scenarios,
AMDO12(230-238).
Springer DOI
1208
BibRef
Tang, Y.Q.[Yong-Qing],
Fang, Y.[Yong],
Huang, Q.H.[Qing-Hua],
Audio personalization using head related transfer function in 3DTV,
3DTV11(1-4).
IEEE DOI
1105
BibRef
Liu, K.[Kang],
Ostermann, J.[Joern],
Realistic head motion synthesis for an image-based talking head,
FG11(125-130).
IEEE DOI
1103
BibRef
And:
FG11(221-226).
IEEE DOI
1103
BibRef
Chaloupka, J.[Josef],
Chaloupka, Z.[Zdenek],
Czech Artificial Computerized Talking Head George,
COST08(324-330).
Springer DOI
0810
BibRef
Zhao, H.[Hui],
Chen, Y.B.[Yue-Bing],
Shen, Y.M.[Ya-Min],
Tang, C.J.[Chao-Jing],
Audio-Visual Speech Synthesis Based on Chinese Visual Triphone,
CISP09(1-5).
IEEE DOI
0910
BibRef
Hu, Y.L.[Yong-Li],
Zhou, M.Q.[Ming-Quan],
Wu, Z.K.[Zhong-Ke],
An Automatic Dense Point Registration Method for 3D Face Animation,
CISP09(1-6).
IEEE DOI
0910
BibRef
Berger, M.O.[Marie-Odile],
Ponroy, J.[Jonathan],
Wrobel-Dautcourt, B.[Brigitte],
Realistic Face Animation for Audiovisual Speech Applications: A
Densification Approach Driven by Sparse Stereo Meshes,
MIRAGE09(297-307).
Springer DOI
0905
BibRef
Verdet, F.[Florian],
Hennebert, J.[Jean],
Impostures of Talking Face Systems Using Automatic Face Animation,
BTAS08(1-4).
IEEE DOI
0809
BibRef
Gaur, U.[Utkarsh],
Jain, A.[Amrita],
Goel, S.[Sanjay],
Towards Real-Time Monocular Video-Based Avatar Animation,
ISVC08(II: 949-958).
Springer DOI
0812
BibRef
Badin, P.[Pierre],
Elisei, F.[Frédéric],
Bailly, G.[Gérard],
Tarabalka, Y.[Yuliya],
An Audiovisual Talking Head for Augmented Speech Generation: Models and
Animations Based on a Real Speaker's Articulatory Data,
AMDO08(xx-yy).
Springer DOI
0807
BibRef
Fanelli, G.[Gabriele],
Fratarcangeli, M.[Marco],
A Non-Invasive Approach for Driving Virtual Talking Heads from Real
Facial Movements,
3DTV07(1-4).
IEEE DOI
0705
BibRef
Xiong, L.[Lei],
Zheng, N.N.[Nan-Ning],
You, Q.[Qubo],
Liu, J.Y.[Jian-Yi],
Facial Expression Sequence Synthesis Based on Shape and Texture Fusion
Model,
ICIP07(IV: 473-476).
IEEE DOI
0709
BibRef
Beaumesnil, B.[Brice],
Luthon, F.[Franck],
Real Time Tracking for 3D Realistic Lip Animation,
ICPR06(I: 219-222).
IEEE DOI
0609
BibRef
Ravindra de Silva, P.,
Madurapperuma, A.P.,
Marasinghe, A.,
Osano, M.,
Integrating Animated Pedagogical Agent as Motivational Supporter into
Interactive System,
CRV06(34-34).
IEEE DOI
0607
BibRef
Pei, Y.R.[Yu-Ru],
Zha, H.B.[Hong-Bin],
Vision Based Speech Animation Transferring with Underlying Anatomical
Structure,
ACCV06(I:591-600).
Springer DOI
0601
BibRef
Liu, Y.H.[Yang-Hua],
Xu, G.Y.[Guang-You],
Tao, L.M.[Lin-Mi],
An Efficient Approach for Multi-view Face Animation Based on Quasi 3D
Model,
ACCV06(II:913-922).
Springer DOI
0601
BibRef
Leszczynski, M.[Mariusz],
Skarbek, W.[Wladyslaw],
Viseme Classification for Talking Head Application,
CAIP05(773).
Springer DOI
0509
BibRef
Earlier:
Viseme recognition: A comparative study,
AVSBS05(287-292).
IEEE DOI
0602
BibRef
Leszczynski, M.[Mariusz],
Skarbek, W.[Wladyslaw],
Badura, S.[Stanislaw],
Fast Viseme Recognition for Talking Head Application,
ICIAR05(516-523).
Springer DOI
0509
BibRef
Gracia-Roche, J.J.[Juan José],
Orrite, C.[Carlos],
Bernués, E.[Emiliano],
Herrero, J.E.[José Elías],
Color Distribution Tracking for Facial Analysis,
IbPRIA05(I:484).
Springer DOI
0509
BibRef
Ypsilos, I.A.,
Hilton, A.,
Turkmani, A.,
Jackson, P.J.B.,
Speech-driven face synthesis from 3D video,
3DPVT04(58-65).
IEEE DOI
0412
BibRef
Saisan, P.[Payam],
Bissacco, A.[Alessandro],
Chiuso, A.[Alessandro],
Soatto, S.[Stefano],
Modeling and Synthesis of Facial Motion Driven by Speech,
ECCV04(Vol III: 456-467).
Springer DOI
0405
BibRef
Malcangi, M.[Mario],
de Tintis, R.[Raffaele],
Audio Based Real-Time Speech Animation of Embodied Conversational
Agents,
GW03(350-360).
Springer DOI
0405
BibRef
Aleksic, P.S.,
Katsaggelos, A.K.,
Speech-to-video synthesis using facial animation parameters,
ICIP03(III: 1-4).
IEEE DOI
0312
BibRef
Hack, C.A.,
Taylor, C.J.,
Modelling 'Talking Head' Behaviour,
BMVC03(xx-yy).
HTML Version.
0409
BibRef
Choi, K.H.[Kyoung-Ho],
Hwang, J.N.[Jenq-Neng],
Creating 3D speech-driven talking heads: a probabilistic network
approach,
ICIP02(I: 984-987).
IEEE DOI
0210
BibRef
Hong, P.Y.[Peng-Yu],
Wen, Z.[Zhen],
Huang, T.S.,
Shum, H.Y.[Heung-Yeung],
Real-time speech-driven 3D face animation,
3DPVT02(713-716).
IEEE DOI
0206
BibRef
Morishima, S.,
Yotsukura, T.,
Hypermask: Talking Head Projected Onto Moving Surface,
ICIP01(III: 947-950).
IEEE DOI
0108
BibRef
Neumann, J.[Jan],
Aloimonos, Y.[Yiannis],
Talking Heads: Introducing the tool of 3D motion fields in the study of
action,
HUMO00(25-32).
IEEE Top Reference.
0010
BibRef
Melek, Z.[Zeki],
Akarun, L.[Lale],
Automated Lip Synchronized Speech Driven Facial Animation,
ICME00(TA1).
0007
BibRef
Chen, T.H.[Tsu-Han],
Wang, Y.[Yao],
Graf, H.P.,
Swain, C.T.,
A new frame interpolation scheme for talking head sequences,
ICIP95(II: 591-594).
IEEE DOI
9510
BibRef
Shan, S.,
Individual 3D Face Synthesis Based on Orthogonal Photos and
Speech-driven Facial Animation,
ICIP00(Vol III: 238-241).
IEEE DOI
0008
BibRef
Noh, J.Y.[Jun-Yong],
Neumann, U.[Ulrich],
Talking Face,
ICME00(TA1).
0007
BibRef
Kakihara, K.[Kiyotsugu],
Nakamura, S.[Satoshi],
Shikano, K.[Kiyohiro],
Speech-To-Face Movement Synthesis Based on HMMs,
ICME00(MP7).
0007
BibRef
Van Gool, L.J.[Luc J.],
Tuytelaars, T.[Tinne],
Pollefeys, M.[Marc],
Adventurous Tourism for Couch Potatoes,
CAIP99(98-107).
Springer DOI
9909
Talking mask, display of scene, etc.
BibRef
Ishikawa, T.,
Sera, H.,
Morishima, S.,
Terzopoulos, D.,
Facial Image Reconstruction by Estimated Muscle Parameter,
AFGR98(342-347).
IEEE DOI
9800
BibRef
Bothe, H.H.[Hans H.],
A visual speech model based on fuzzy-neuro methods,
CIAP95(152-158).
Springer DOI
9509
BibRef
Chen, T.H.[Tsu-Han],
Graf, H.P.,
Haskell, B.G.,
Petajan, E.,
Wang, Y.[Yao],
Chen, H.,
Chou, W.[Wu],
Speech-assisted lip synchronization in audio-visual communications,
ICIP95(II: 579-582).
IEEE DOI
9510
BibRef
Chapter on 3-D Object Description and Computation Techniques, Surfaces, Deformable, View Generation, Video Conferencing continues in
Face Animation, Video Face Synthesis.