Wu, J.X.[Jian-Xiong],
Chan, C.[Chorkin],
Recognition of phonetic labels of the TIMIT speech corpus by means of
an artificial neural network,
PR(24), No. 11, 1991, pp. 1085-1091.
WWW Version.
0401
BibRef
Wu, J.T.[Jian-Tong],
Tamura, S.[Shinichi],
Mitsumoto, H.[Hiroshi],
Kawai, H.[Hideo],
Kurosu, K.[Kenji],
Okazaki, K.[Kozo],
Neural network vowel-recognition jointly using voice features and mouth
shape image,
PR(24), No. 10, 1991, pp. 921-927.
WWW Version.
0401
BibRef
Lavagetto, F.,
Time-Delay Neural Networks for Estimating Lip Movements from
Speech Analysis:
A Useful Tool in Audio Video Synchronization,
CirSysVideo(7), No. 5, October 1997, pp. 786-800.
IEEE Top Reference.
9710
BibRef
Movellan, J.R.,
Mineiro, P.,
Robust Sensor Fusion:
Analysis and Application to Audio-Visual Speech Recognition,
MachLearn(32), No. 2, August 1998, pp. 85-100.
9810
BibRef
Wachsmuth, S.[Sven],
Socher, G.[Gudrun],
Brandt-Pook, H.[Hans],
Kummert, F.[Franz],
Sagerer, G.[Gerhard],
Integration of Vision and Speech Understanding Using Bayesian Networks,
Videre(1), No. 4, Winter 2000, pp. xx-yy.
0005
BibRef
Earlier: A1, A3, A2, A4, A5:
Multilevel Integration of Vision and Speech Understanding Using
Bayesian Networks,
CVS99(231 ff.).
HTML Version.
0209
BibRef
Chien, J.T.,
Lin, M.S.,
Frame-synchronous noise compensation for hands-free speech recognition
in car environments,
VISP(147), No. 6, December 2000, pp. 508-515.
0101
BibRef
Patel, D.,
Turner, L.F.,
Effects of ATM network impairments on audio-visual broadcast
applications,
VISP(147), No. 5, October 2000, pp. 436-444.
0101
BibRef
Aleksic, P.S.[Petar S.],
Williams, J.J.[Jay J.],
Wu, Z.L.[Zhi-Lin],
Katsaggelos, A.K.[Aggelos K.],
Audio-Visual Speech Recognition Using MPEG-4 Compliant Visual Features,
JASP(2002), No. 11, November 2002, pp. 1213.
HTML Version.
0304
BibRef
Earlier:
Audio-visual continuous speech recognition using MPEG-4 compliant
visual features,
ICIP02(I: 960-963).
IEEE Abstract. IEEE Top Reference.
0210
BibRef
Aleksic, P.S.[Petar S.],
Katsaggelos, A.K.[Aggelos K.],
Audio-Visual Biometrics,
PIEEE(94), No. 11, November 2006, pp. 2025-2044.
IEEE DOI may work or IEEE-CS DOI may work.
0611
BibRef
Aleksic, P.S.[Petar S.],
Katsaggelos, A.K.[Aggelos K.],
Speech-to-video synthesis using MPEG-4 compliant visual features,
CirSysVideo(14), No. 5, May 2004, pp. 682-692.
IEEE Abstract. IEEE Top Reference.
0407
BibRef
Earlier:
Comparison of MPEG-4 Facial Animation Parameter Groups with Respect to
Audio-Visual Speech Recognition Performance,
ICIP05(III: 501-504).
IEEE DOI may work or IEEE-CS DOI may work.
0512
BibRef
Jiang, J.T.[Jin-Tao],
Alwan, A.[Abeer],
Keating, P.A.[Patricia A.],
Auer Jr., E.T.[Edward T.],
Bernstein, L.E.[Lynne E.],
On the Relationship between Face Movements, Tongue Movements, and
Speech Acoustics,
JASP(2002), No. 11, November 2002, pp. 1174.
HTML Version.
0304
BibRef
Sodoyer, D.[David],
Schwartz, J.L.[Jean-Luc],
Girin, L.[Laurent],
Klinkisch, J.[Jacob],
Jutten, C.[Christian],
Separation of Audio-Visual Speech Sources: A New Approach Exploiting
the Audio-Visual Coherence of Speech Stimuli,
JASP(2002), No. 11, November 2002, pp. 1165.
HTML Version.
0304
BibRef
Zotkin, D.N.[Dmitry N.],
Duraiswami, R.[Ramani],
Davis, L.S.[Larry S.],
Joint Audio-Visual Tracking Using Particle Filters,
JASP(2002), No. 11, November 2002, pp. 1154.
HTML Version.
0304
BibRef
Heckmann, M.[Martin],
Berthommier, F.[Frédéric],
Kroschel, K.[Kristian],
Noise Adaptive Stream Weighting in Audio-Visual Speech Recognition,
JASP(2002), No. 11, November 2002, pp. 1260.
HTML Version.
0304
BibRef
Nefian, A.V.[Ara V.],
Liang, L.H.[Lu-Hong],
Pi, X.B.[Xiao-Bo],
Liu, X.X.[Xiao-Xing],
Murphy, K.[Kevin],
Dynamic Bayesian Networks for Audio-Visual Speech Recognition,
JASP(2002), No. 11, November 2002, pp. 1274.
HTML Version.
0304
BibRef
Nefian, A.V.[Ara V.],
Liang, L.H.[Lu-Hong],
Fu, T.Y.[Tie-Yan],
Liu, X.X.[Xiao-Xing],
A Bayesian Approach to Audio-Visual Speaker Identification,
AVBPA03(761-769).
HTML Version.
0310
BibRef
Patterson, E.K.[Eric K.],
Gurbuz, S.[Sabri],
Tufekci, Z.[Zekeriya],
Gowdy, J.N.[John N.],
Moving-Talker, Speaker-Independent Feature Study, and Baseline Results
Using the CUAVE Multimodal Speech Corpus,
JASP(2002), No. 11, November 2002, pp. 1189.
HTML Version.
0304
BibRef
Gurbuz, S.[Sabri],
Patterson, E.K.[Eric K.],
Tufekci, Z.[Zekeriya],
Gowdy, J.N.[John N.],
Affine-Invariant Visual Features Contain Supplementary Information to
Enhance Speech Recognition,
AVBPA01(175).
HTML Version.
0310
BibRef
Garg, A.[Ashutosh],
Pavlovic, V.[Vladimir],
Rehg, J.M.[James M.],
Boosted learning in dynamic Bayesian networks for multimodal speaker
detection,
PIEEE(91), No. 9, September 2003, pp. 1355-1369.
IEEE DOI may work or IEEE-CS DOI may work.
0309
BibRef
Earlier:
Audio-visual speaker detection using dynamic Bayesian networks,
AFGR00(384-390).
IEEE DOI may work or IEEE-CS DOI may work.
0003
BibRef
Pavlovic, V.[Vladimir],
Garg, A.[Ashutosh],
Rehg, J.M.[James M.],
Huang, T.S.[Thomas S.],
Multimodal Speaker Detection using Error Feedback Dynamic Bayesian
Networks,
CVPR00(II: 34-41).
IEEE Abstract. IEEE Top Reference.
WWW Version.
0005
BibRef
Pavlovic, V.,
Berry, G.,
Huang, T.S.,
Integration of Audio/Visual Information for Use in
Human-Computer Intelligent Interaction,
ICIP97(I: 121-124).
IEEE DOI may work or IEEE-CS DOI may work.
BibRef
9700
Choudhury, T.[Tanzeem],
Rehg, J.M.,
Pavlovic, V.,
Pentland, A.P.,
Boosting and structure learning in dynamic Bayesian networks for
audio-visual speaker detection,
ICPR02(III: 789-794).
IEEE DOI may work or IEEE-CS DOI may work.
0211
BibRef
Pavlovic, V.[Vladimir],
Multimodal tracking and classification of audio-visual features,
ICIP98(I: 343-347).
IEEE DOI may work or IEEE-CS DOI may work.
9810
BibRef
Rehg, J.M.[James M.],
Murphy, K.P.[Kevin P.],
Fieguth, P.W.[Paul W.],
Vision-Based Speaker Detection Using Bayesian Networks,
CVPR99(II: 110-116).
IEEE Abstract. IEEE Top Reference.
WWW Version. More particularly the one talking.
BibRef
9900
Kalberer, G.A.[Gregor A.],
Müller, P.[Pascal],
Van Gool, L.J.[Luc J.],
Visual speech, a trajectory in viseme space,
IJIST(13), No. 1, 2003, pp. 74-84.
WWW Version.
0308
BibRef
Sharma, R.,
Yeasin, M.,
Krahnstoever, N.,
Rauschert, I.,
Cai, G.,
Brewer, I.,
MacEachren, A.M.,
Sengupta, K.,
Speech-gesture driven multimodal interfaces for crisis management,
PIEEE(91), No. 9, September 2003, pp. 1327-1354.
IEEE DOI may work or IEEE-CS DOI may work.
0309
BibRef
Potamianos, G.,
Neti, C.,
Gravier, G.,
Garg, A.,
Senior, A.W.,
Recent advances in the automatic recognition of audiovisual speech,
PIEEE(91), No. 9, September 2003, pp. 1306-1326.
IEEE DOI may work or IEEE-CS DOI may work.
0309
BibRef
Kaynak, M.N.,
Zhi, Q.,
Cheok, A.D.,
Sengupta, K.,
Jian, Z.,
Chung, K.C.,
Analysis of Lip Geometric Features for Audio-Visual Speech Recognition,
SMC-A(34), No. 4, July 2004, pp. 564-570.
IEEE Abstract. IEEE Top Reference.
0407
BibRef
Foo, S.W.[Say Wei],
Lian, Y.[Yong],
Dong, L.[Liang],
Recognition of visual speech elements using adaptively boosted hidden
Markov models,
CirSysVideo(14), No. 5, May 2004, pp. 693-705.
IEEE Abstract. IEEE Top Reference.
0407
BibRef
Albiol, A.[Alberto],
Torres, L.[Luis],
Delp, E.J.[Edward J.],
Fully automatic face recognition system using a combined audio-visual
approach,
VISP(152), No. 3, June 2005, pp. 318-326.
WWW Version.
0510
BibRef
Earlier:
A Fast Anchor Person Searching Scheme in News Sequences,
AVBPA01(366).
HTML Version.
0310
BibRef
And:
An Unsupervised Color Image Segmentation Algorithm for Face Detection
Applications,
ICIP01(II: 681-684).
IEEE Abstract. IEEE Top Reference.
0108
BibRef
Earlier:
Optimum Color Spaces for Skin Detection,
ICIP01(I: 122-124).
IEEE Abstract. IEEE Top Reference.
0108
BibRef
Kleindienst, J.[Jan],
Macek, T.[Tomáš],
Serédi, L.[Ladislav],
Šedivý, J.[Jan],
Interaction framework for home environment using speech and vision,
IVC(25), No. 12, 3 December 2007, pp. 1836-1847.
WWW Version.
0710
BibRef
Earlier:
Djinn: Interaction Framework for Home Environment Using Speech and
Vision,
CVHCI04(153-164).
WWW Version.
0505
Multi-modal; Computer-vision; Context-aware; Speech recognition
BibRef
Palanivel, S.,
Yegnanarayana, B.,
Multimodal person authentication using speech, face and visual speech,
CVIU(109), No. 1, January 2008, pp. 44-55.
WWW Version.
0801
Multimodal person authentication; Face tracking; Eye location;
Visual speech; Multiscale morphological dilation and erosion;
Autoassociative neural network
BibRef
Talantzis, F.,
Pnevmatikakis, A.,
Constantinides, A.G.,
Audio-Visual Active Speaker Tracking in Cluttered Indoors Environments,
SMC-B(37), No. 3, June 2007, pp. 799-807.
IEEE DOI may work or IEEE-CS DOI may work.
0711
BibRef
Chetty, G.[Girija],
Wagner, M.[Michael],
Robust face-voice based speaker identity verification using multilevel
fusion,
IVC(26), No. 9, 1 September 2008, pp. 1249-1260.
WWW Version.
0806
BibRef
Earlier:
Audio Visual Speaker Verification Based on Hybrid Fusion of Cross Modal
Features,
PReMI07(469-478).
WWW Version.
0712
Lip; 3D Face; Voice; Biometric; Identity verification; Robust;
Multilevel fusion
BibRef
Delakis, M.[Manolis],
Gravier, G.[Guillaume],
Gros, P.[Patrick],
Audiovisual integration with Segment Models for tennis video parsing,
CVIU(111), No. 2, August 2008, pp. 142-154.
WWW Version.
0808
Hidden Markov Models; Segment Models; Multimodal fusion;
Video indexing; Video summarization
BibRef
Horii, Y.[Yu],
Kawashima, H.[Hiroaki],
Matsuyama, T.[Takashi],
Speaker detection using the timing structure of lip motion and sound,
CVPR4HB08(1-8).
IEEE DOI may work or IEEE-CS DOI may work.
0806
BibRef
Rúa, E.A.[Enrique Argones],
Castro, J.L.A.[José Luis Alba],
Mateo, C.G.[Carmen García],
Quality-Based Score Normalization for Audiovisual Person Authentication,
ICIAR08(xx-yy).
WWW Version.
0806
BibRef
Wang, L.[Lei],
Tjondronegoro, D.[Dian],
Liu, Y.[Yuee],
Clustering and Visualizing Audio-Visual Dataset on Mobile Devices in a
Topic-Oriented Manner,
Visual07(310-321).
WWW Version.
0706
BibRef
Zajdel, W.,
Krijnders, J.D.,
Andringa, T.,
Gavrila, D.M.,
CASSANDRA: audio-video sensor fusion for aggression detection,
AVSBS07(200-205).
IEEE DOI may work or IEEE-CS DOI may work.
0709
BibRef
Stødle, D.[Daniel],
Bjørndalen, J.M.[John Markus],
Anshus, O.J.[Otto J.],
A System for Hybrid Vision- and Sound-Based Interaction with Distal and
Proximal Targets on Wall-Sized, High-Resolution Tiled Displays,
CVHCI07(59-68).
WWW Version.
0710
BibRef
van Hengel, P.W.J.,
Andringa, T.C.,
Verbal aggression detection in complex social environments,
AVSBS07(15-20).
IEEE DOI may work or IEEE-CS DOI may work.
0709
BibRef
Ikeda, O.[Osamu],
Detection of a Speaker in Video by Combined Analysis of Speech Sound
and Mouth Movement,
ISVC07(II: 602-610).
WWW Version.
0711
BibRef
Das, A.[Amitava],
Audio Visual Person Authentication by Multiple Nearest Neighbor
Classifiers,
ICB07(1114-1123).
WWW Version.
0708
BibRef
Xin, L.[Le],
Tao, J.H.[Jian-Hua],
Tan, T.N.[Tie-Niu],
Dynamic Audio-Visual Mapping using Fused Hidden Markov Model Inversion
Method,
ICIP07(III: 293-296).
IEEE DOI may work or IEEE-CS DOI may work.
0709
BibRef
Casanovas, A.L.[Anna Llagostera],
Monaci, G.[Gianluca],
Vandergheynst, P.[Pierre],
Blind Audiovisual Source Separation using Sparse Representations,
ICIP07(III: 301-304).
IEEE DOI may work or IEEE-CS DOI may work.
0709
BibRef
Barzelay, Z.[Zohar],
Schechner, Y.Y.[Yoav Y.],
Harmony in Motion,
CVPR07(1-8).
IEEE DOI may work or IEEE-CS DOI may work.
0706
Audio-visual analysis.
BibRef
O'Donovan, A.[Adam],
Duraiswami, R.[Ramani],
Neumann, J.[Jan],
Microphone Arrays as Generalized Cameras for Integrated Audio Visual
Processing,
CVPR07(1-8).
IEEE DOI may work or IEEE-CS DOI may work.
0706
BibRef
Abbas, J.[Jehanzeb],
Dagli, C.K.[Charlie K.],
Huang, T.S.[Thomas S.],
A Multimodality Framework for Creating Speaker/Non-Speaker Profile
Databases for Real-World Video,
SLAM07(1-8).
IEEE DOI may work or IEEE-CS DOI may work.
0706
BibRef
Kushal, A.[Akash],
Rahurkar, M.[Mandar],
Fei-Fei, L.[Li],
Ponce, J.[Jean],
Huang, T.[Thomas],
Audio-Visual Speaker Localization Using Graphical Models,
ICPR06(I: 291-294).
WWW Version.
0609
BibRef
Tsuji, T.[Tokuo],
Yamamoto, K.[Kenkichi],
Ishii, I.[Idaku],
Real-time Sound Source Localization Based on Audiovisual Frequency
Integration,
ICPR06(IV: 322-325).
WWW Version.
0609
BibRef
Monaci, G.[Gianluca],
Vandergheynst, P.[Pierre],
Audiovisual Gestalts,
PercOrg06(200).
IEEE DOI may work or IEEE-CS DOI may work.
0609
BibRef
Zhu, Z.G.[Zhi-Gang],
Li, W.H.[Wei-Hong],
Molina, E.[Edgardo],
Wolberg, G.[George],
LDV Sensing and Processing for Remote Hearing in a Multimodal
Surveillance System,
MSCSAS07(1-2).
IEEE DOI may work or IEEE-CS DOI may work.
0706
BibRef
Zhu, Z.G.[Zhi-Gang],
Li, W.H.[Wei-Hong],
Wolberg, G.,
Integrating LDV Audio and IR Video for Remote Multimodal Surveillance,
OTCBVS05(III: 10-10).
IEEE DOI may work or IEEE-CS DOI may work.
0507
BibRef
Chetty, G.[Girija],
Wagner, M.[Michael],
Face-Voice Authentication Based on 3D Face Models,
ACCV06(I: 559-568).
WWW Version.
0601
BibRef
Wu, Z.Y.[Zhi-Yong],
Cai, L.H.[Lian-Hong],
Meng, H.[Helen],
Multi-level Fusion of Audio and Visual Features for Speaker
Identification,
ICB06(493-499).
WWW Version.
0601
BibRef
Yang, P.[Pu],
Yang, Y.C.[Ying-Chun],
Wu, Z.H.[Zhao-Hui],
Exploiting Glottal Information in Speaker Recognition Using Parallel
GMMs,
AVBPA05(804).
WWW Version.
0509
BibRef
Lei, Z.[Zhenchun],
Yang, Y.C.[Ying-Chun],
Wu, Z.H.[Zhao-Hui],
An UBM-Based Reference Space for Speaker Recognition,
ICPR06(IV: 318-321).
WWW Version.
0609
BibRef
Li, D.D.[Dong-Dong],
Yang, Y.C.[Ying-Chun],
Wu, Z.H.[Zhao-Hui],
Dynamic Bayesian Networks for Audio-Visual Speaker Recognition,
ICB06(539-545).
WWW Version.
0601
BibRef
Megherbi, N.,
Ambellouis, S.,
Colot, O.,
Cabestaing, F.,
Data Association in Multi-Target Tracking Using Belief Theory:
Handling Target Emergence and Disappearance Issue,
AVSBS05(517-521).
IEEE DOI may work or IEEE-CS DOI may work.
0602
BibRef
Megherbi, N.,
Ambellouis, S.,
Colot, O.,
Cabestaing, F.,
Joint audio-video people tracking using belief theory,
AVSBS05(135-140).
IEEE DOI may work or IEEE-CS DOI may work.
0602
BibRef
Saenko, K.[Kate],
Livescu, K.[Karen],
Siracusa, M.[Michael],
Wilson, K.[Kevin],
Glass, J.[James],
Darrell, T.J.[Trevor J.],
Visual Speech Recognition with Loosely Synchronized Feature Streams,
ICCV05(II: 1424-1431).
IEEE DOI may work or IEEE-CS DOI may work.
0510
BibRef
Lei, Z.[Zhenchun],
Yang, Y.C.[Ying-Chun],
Wu, Z.H.[Zhao-Hui],
Constructing the Discriminative Kernels Using GMM for Text-Independent
Speaker Identification,
IWBRS05(165).
WWW Version.
0601
BibRef
And:
Speaker Identification Using the VQ-Based Discriminative Kernels,
AVBPA05(797).
WWW Version.
0509
BibRef
Fox, N.A.[Niall A.],
O'Mullane, B.A.[Brian A.],
Reilly, R.B.[Richard B.],
VALID:
A New Practical Audio-Visual Database, and Comparative Results,
AVBPA05(777).
WWW Version.
0509
WWW Version.
Dataset, Faces.
BibRef
Sharma, P.[Prag],
Reilly, R.B.[Richard B.],
The UCD Colour Face Image Database for Face Detection,
Online1998.
WWW Version.
Dataset, Faces.
BibRef
9800
Fox, N.A.[Niall A.],
O'Mullane, B.A.[Brian A.],
Reilly, R.B.[Richard B.],
Audio-Visual Speaker Identification via Adaptive Fusion Using
Reliability Estimates of Both Modalities,
AVBPA05(787).
WWW Version.
0509
BibRef
Li, X.[Xin],
Sun, L.[Luo],
Tao, L.M.[Lin-Mi],
Xu, G.Y.[Guang-You],
Jia, Y.[Ying],
A Speaker Tracking Algorithm Based on Audio and Visual Information
Fusion Using Particle Filter,
ICIAR04(II: 572-580).
WWW Version.
0409
BibRef
Zhang, D.,
Ghobakhlou, A.,
Kasabov, N.,
An adaptive model of person identification combining speech and image
information,
ICARCV04(I: 413-418).
IEEE DOI may work or IEEE-CS DOI may work.
0412
BibRef
Kratt, J.[Jan],
Metze, F.[Florian],
Stiefelhagen, R.[Rainer],
Waibel, A.[Alex],
Large Vocabulary Audio-Visual Speech Recognition Using the Janus Speech
Recognition Toolkit,
DAGM04(488-495).
WWW Version.
0505
BibRef
Hanafiah, Z.M.,
Yamazaki, C.,
Nakamura, A.,
Kuno, Y.,
Understanding inexplicit utterances using vision for helper robots,
ICPR04(IV: 925-928).
IEEE DOI may work or IEEE-CS DOI may work.
0409
BibRef
Hermann, T.[Thomas],
Henning, T.[Thomas],
Ritter, H.[Helge],
Gesture Desk: An Integrated Multi-modal Gestural Workplace
for Sonification,
GW03(369-379).
WWW Version.
0405
BibRef
Merola, G.[Giorgio],
Poggi, I.[Isabella],
Multimodality and Gestures in the Teacher's Communication,
GW03(101-111).
WWW Version.
0405
BibRef
Althoff, F.[Frank],
McGlaun, G.[Gregor],
Lang, M.[Manfred],
Rigoll, G.[Gerhard],
Evaluating Multimodal Interaction Patterns in Various Application
Scenarios,
GW03(421-435).
WWW Version.
0405
BibRef
Kranstedt, A.[Alfred],
Kühnlein, P.[Peter],
Wachsmuth, I.[Ipke],
Deixis in Multimodal Human Computer Interaction:
An Interdisciplinary Approach,
GW03(112-123).
WWW Version.
0405
BibRef
Saeed, K.[Khalid],
Kozlowski, M.[Marcin],
An Image-Based System for Spoken-Letter Recognition,
CAIP03(494-502).
WWW Version.
0311
BibRef
Ho, P.[Purdy],
Armington, J.[John],
A Dual-Factor Authentication System Featuring Speaker Verification and
Token Technology,
AVBPA03(128-136).
HTML Version.
0310
BibRef
Fox, N.A.[Niall A.],
Reilly, R.B.[Richard B.],
Audio-Visual Speaker Identification Based on the Use of Dynamic Audio
and Visual Features,
AVBPA03(743-751).
HTML Version.
0310
BibRef
Czyz, J.[Jacek],
Bengio, S.[Samy],
Marcel, C.[Christine],
Vandendorpe, L.[Luc],
Scalability Analysis of Audio-Visual Person Identity Verification,
AVBPA03(752-760).
HTML Version.
0310
BibRef
Bengio, S.[Samy],
Multimodal Authentication Using Asynchronous HMMs,
AVBPA03(770-777).
HTML Version.
0310
BibRef
Lucey, S.[Simon],
Chen, T.H.[Tsu-Han],
Improved Audio-Visual Speaker Recognition via the Use of a Hybrid
Combination Strategy,
AVBPA03(929-936).
HTML Version.
0310
BibRef
Krahnstoever, N.,
Schapira, E.,
Kettebekov, S.,
Sharma, R.,
Multimodal human-computer interaction for crisis management systems,
WACV02(203-207).
IEEE Abstract. IEEE Top Reference.
0303
BibRef
Kettebekov, S.,
Yeasin, M.,
Sharma, R.,
Improving continuous gesture recognition with spoken prosody,
CVPR03(I: 565-570).
IEEE Abstract. IEEE Top Reference.
0307
BibRef
Higgins, J.E.,
Damper, R.I.,
An HMM-Based Subband Processing Approach to Speaker Identification,
AVBPA01(169).
HTML Version.
0310
BibRef
Poh, N.[Norman],
Korczak, J.[Jerzy],
Hybrid Biometric Person Authentication Using Face and Voice Features,
AVBPA01(348).
HTML Version.
0310
BibRef
Nakamura, S.[Satoshi],
Fusion of Audio-Visual Information for Integrated Speech Processing,
AVBPA01(127).
HTML Version.
0310
BibRef
Sullivan, K.P.H.[Kirk P.H.],
Pelecanos, J.[Jason],
Revisiting Carl Bildt's Impostor: Would a Speaker Verification System
Foil Him?,
AVBPA01(144).
HTML Version.
0310
BibRef
Geiger, G.[Gadi],
Ezzat, T.[Tony],
Poggio, T.[Tomaso],
Perceptual Evaluation of Video-Realistic Speech,
MIT AIM-2003-003, February 28, 2003.
WWW Version. We describe here our scheme and its application to a new video-realistic (potentially
indistinguishable from real recorded video) visual-speech animation system, called Mary 101.
0306
BibRef
Gordan, M.,
Kotropoulos, C.,
Pitas, I.,
Application of support vector machines classifiers to visual speech
recognition,
ICIP02(III: 129-132).
IEEE Abstract. IEEE Top Reference.
0210
BibRef
Blake, A.,
Gangnet, M.,
Perez, P.,
Vermaak, J.,
Integrated tracking with vision and sound,
CIAP01(354-357).
IEEE Top Reference.
0210
BibRef
Zhang, X.Z.[Xiao-Zheng],
Mersereau, R.M.,
Clements, M.,
Bimodal fusion in audio-visual speech recognition,
ICIP02(I: 964-967).
IEEE Abstract. IEEE Top Reference.
0210
BibRef
Graf, H.P.,
Cosatto, E.,
Strom, V.,
Huang, F.J.[Fu Jie],
Visual prosody: facial movements accompanying speech,
AFGR02(381-386).
IEEE DOI may work or IEEE-CS DOI may work.
0206
BibRef
Qi, Y.[Yuan],
Learning Algorithms for Audio and Video Processing:
Independent Component Analysis and Support Vector Machine Based Approaches,
UMD--TR4174, August 2000.
WWW Version.
BibRef
0008
Nankaku, Y.,
Tokuda, K.,
Kitamura, T.,
Normalized Training for HMM-based Visual Speech Recognition,
ICIP00(Vol III: 234-237).
IEEE Abstract. IEEE Top Reference.
0008
BibRef
Zhang, Y.[You],
Levinson, S.[Stephen],
Huang, T.S.[Thomas S.],
Speaker Independent Audio-Visual Speech Recognition,
ICME00(TP8).
0007
BibRef
Pan, H.[Hao],
Huang, T.S.[Thomas S.],
A New Approach to Integrate Audio and Visual Features of Speech,
ICME00(TP8).
0007
BibRef
Potamianos, G.[Gerasimos],
Verma, A.[Ashish],
Neti, C.[Chalapathy],
Iyengar, G.[Giri],
Basu, S.[Sankar],
A Cascade Image Transform for Speaker Independent Automatic Speech
Reading,
ICME00(TP8).
0007
BibRef
Pan, H.,
Liang, Z.P.,
Huang, T.S.,
Fusing Audio and Visual Features of Speech,
ICIP00(Vol III: 214-217).
IEEE Abstract. IEEE Top Reference.
0008
BibRef
Faruquie, T.A.,
Majumdar, A.,
Rajput, N.,
Subramaniam, L.V.,
Large Vocabulary Audio-visual Speech Recognition Using Active Shape
Models,
ICPR00(Vol III: 106-109).
IEEE DOI may work or IEEE-CS DOI may work.
HTML Version.
0009
BibRef
Yu, K.,
Jiang, X.,
Bunke, H.,
Combining Acoustic and Visual Classifiers for the Recognition of Spoken
Sentences,
ICPR00(Vol II: 491-494).
IEEE DOI may work or IEEE-CS DOI may work.
HTML Version.
0009
BibRef
Nam, J.,
Alghoniemy, M.,
Tewfik, A.H.[Ahmed H.],
Audio-visual content-based violent scene characterization,
ICIP98(I: 353-357).
IEEE DOI may work or IEEE-CS DOI may work.
9810
BibRef
Luettin, J.[Juergen],
Dupont, S.[Stéphane],
Continuous Audio-Visual Speech Recognition,
ECCV98(II: 657).
WWW Version.
BibRef
9800
Kaucic, R.,
Dalton, B.,
Blake, A.,
Real-Time Lip Tracking for Audio-Visual Speech Recognition Applications,
ECCV96(II: 376-387).
WWW Version.
Snakes.
BibRef
9600
Yang, J.[Jie],
Xiao, J.[Jing],
Ritter, M.[Max],
Automatic Selection of Visemes for Image-based Visual Speech Synthesis,
ICME00(TP8).
0007
BibRef
Faruquie, T.A.[Tanveer A.],
Neti, C.[Chalapathy],
Rajput, N.[Nitendra],
Subramaniam, L.V.[L. Venkata],
Verma, A.[Ashish],
Translingual Visual Speech Synthesis,
ICME00(TP8).
0007
BibRef
Sharma, R.[Rajeev],
Cai, J.[Jiongyu],
Chakravarthy, S.[Srivatsan],
Poddar, I.[Indrajit],
Sethi, Y.[Yogesh],
Exploiting Speech/Gesture Co-occurrence for Improving Continuous
Gesture Recognition in Weather Narration,
AFGR00(422-427).
IEEE DOI may work or IEEE-CS DOI may work.
0003
BibRef
Yamamoto, E.,
Nakamura, S.,
Shikano, K.,
Lip Movement Synthesis from Speech Based on Hidden Markov Models,
AFGR98(154-159).
IEEE DOI may work or IEEE-CS DOI may work.
BibRef
9800
Roy, D.,
Pentland, A.P.,
Automatic spoken affect classification and analysis,
AFGR96(363-367).
IEEE DOI may work or IEEE-CS DOI may work.
9610
BibRef
Petajan, E.D.[Eric D.],
An Architecture for Automatic Lipreading to Enhance Speech Recognition,
CVPR85(40-47).
(AT&T Bell Labs)
Application, Lipreading. A real hardware implementation of a system that tracks the nostrils
and mouth. Improvement over use of acoustic data alone.
BibRef
8500