21.3.4.1.1 Combined Audio Visual Speaker Tracking

Real Time Vision. Audiovisual Speech. Audiovisual Tracking. Speaker Tracking.

Zotkin, D.N.[Dmitry N.], Duraiswami, R.[Ramani], Davis, L.S.[Larry S.],
Joint Audio-Visual Tracking Using Particle Filters,
JASP(2002), No. 11, November 2002, pp. 1154.
WWW Link. 0304
BibRef

Garg, A.[Ashutosh], Pavlovic, V.[Vladimir], Rehg, J.M.[James M.],
Boosted learning in dynamic Bayesian networks for multimodal speaker detection,
PIEEE(91), No. 9, September 2003, pp. 1355-1369.
IEEE DOI 0309
BibRef
Earlier:
Audio-visual speaker detection using dynamic Bayesian networks,
AFGR00(384-390).
IEEE DOI 0003
BibRef

Pavlovic, V.[Vladimir], Garg, A.[Ashutosh], Rehg, J.M.[James M.], Huang, T.S.[Thomas S.],
Multimodal Speaker Detection using Error Feedback Dynamic Bayesian Networks,
CVPR00(II: 34-41).
IEEE DOI 0005
BibRef

Pavlovic, V., Berry, G., and Huang, T.S.,
Integration of Audio/Visual Information for Use in Human-Computer Intelligent Interaction,
ICIP97(I: 121-124).
IEEE DOI BibRef 9700

Choudhury, T.[Tanzeem], Rehg, J.M., Pavlovic, V., Pentland, A.P.,
Boosting and structure learning in dynamic Bayesian networks for audio-visual speaker detection,
ICPR02(III: 789-794).
IEEE DOI 0211
BibRef

Pavlovic, V.[Vladimir],
Multimodal tracking and classification of audio-visual features,
ICIP98(I: 343-347).
IEEE DOI 9810
BibRef

Rehg, J.M.[James M.], Murphy, K.P.[Kevin P.], Fieguth, P.W.[Paul W.],
Vision-Based Speaker Detection Using Bayesian Networks,
CVPR99(II: 110-116).
IEEE DOI More particularly, the one talking. BibRef 9900

Talantzis, F., Pnevmatikakis, A., Constantinides, A.G.,
Audio-Visual Active Speaker Tracking in Cluttered Indoors Environments,
SMC-B(39), No. 1, February 2009, pp. 7-15.
IEEE DOI 0902
BibRef
Earlier: SMC-B(38), No. 3, June 2008, pp. 799-807.
IEEE DOI 0711
The top entry is the special issue; the paper appeared earlier in the other issue. BibRef

Qian, X., Brutti, A., Lanz, O., Omologo, M., Cavallaro, A.,
Multi-Speaker Tracking From an Audio-Visual Sensing Device,
MultMed(21), No. 10, October 2019, pp. 2576-2588.
IEEE DOI 1910
image colour analysis, object detection, object tracking, particle filtering (numerical methods), sensor fusion, particle filter BibRef

Ban, Y.T.[Yu-Tong], Alameda-Pineda, X.[Xavier], Girin, L.[Laurent], Horaud, R.[Radu],
Variational Bayesian Inference for Audio-Visual Tracking of Multiple Speakers,
PAMI(43), No. 5, May 2021, pp. 1761-1776.
IEEE DOI 2104
BibRef
Earlier: A1, A3, A2, A4:
Exploiting the Complementarity of Audio and Visual Data in Multi-speaker Tracking,
CVAVM17(446-454)
IEEE DOI 1802
Visualization, Target tracking, Acoustics, Bayes methods, Cameras, Object tracking, Direction-of-arrival estimation, speaker diarization. Cameras, Detectors, Kalman filters, Microphones, Robots, Tracking, Visualization BibRef

Qian, X.Y.[Xin-Yuan], Brutti, A.[Alessio], Lanz, O.[Oswald], Omologo, M.[Maurizio], Cavallaro, A.[Andrea],
Audio-Visual Tracking of Concurrent Speakers,
MultMed(24), 2022, pp. 942-954.
IEEE DOI 2202
Target tracking, Acoustics, Faces, Cameras, Visualization, Image color analysis, 3D multiple target tracking, particle filter BibRef


Alcázar, J.L.[Juan León], Heilbron, F.C.[Fabian Caba], Thabet, A.K.[Ali K.], Ghanem, B.[Bernard],
MAAS: Multi-modal Assignation for Active Speaker Detection,
ICCV21(265-274)
IEEE DOI 2203
Visualization, Benchmark testing, Feature extraction, Data structures, Task analysis, Vision + other modalities, Video analysis and understanding BibRef

Köpüklü, O.[Okan], Taseska, M.[Maja], Rigoll, G.[Gerhard],
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild,
ICCV21(1173-1183)
IEEE DOI 2203
Codes, Computational modeling, Pipelines, Computer architecture, Encoding, Task analysis, Vision + other modalities, Vision applications and systems BibRef

Wu, Y.[Yu], Yang, Y.[Yi],
Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing,
CVPR21(1326-1335)
IEEE DOI 2111
Training, Visualization, Target tracking, Annotations, Predictive models, Pattern recognition BibRef

Liu, H.[Hong], Sun, Y.H.[Yong-Heng], Li, Y.D.[Yi-Di], Yang, B.[Bing],
3D Audio-Visual Speaker Tracking with A Novel Particle Filter,
ICPR21(7343-7348)
IEEE DOI 2105
BibRef
Earlier: A1, A3, A4, Only:
3D Audio-Visual Speaker Tracking with A Two-Layer Particle Filter,
ICIP19(1955-1959)
IEEE DOI 1910
Visualization, Histograms, Head, Image color analysis, Sensor phenomena and characterization, compact platform. 3D speaker tracking, audio-visual fusion, particle filter, adaptive likelihood BibRef

Le, N.[Nam], Heili, A.[Alexandre], Wu, D.[Di], Odobez, J.M.[Jean-Marc],
Temporally subsampled detection for accurate and efficient face tracking and diarization,
ICPR16(1792-1797)
IEEE DOI 1705
Detectors, Face, Face detection, Image color analysis, Motion pictures, TV, Tracking BibRef

Saeed, A.[Anwar], Al-Hamadi, A.[Ayoub], Heuer, M.[Michael],
Speaker Tracking Using Multi-modal Fusion Framework,
ICISP12(539-546).
Springer DOI 1208
BibRef

Katsarakis, N.[Nikos], Talantzis, F.[Fotios], Pnevmatikakis, A.[Aristodemos], Polymenakos, L.[Lazaros],
The AIT 3D Audio / Visual Person Tracker for CLEAR 2007,
MTPH07(xx-yy).
Springer DOI 0705

See also AIT 2D Face Detection and Tracking System for CLEAR 2007, The.
See also AIT Multimodal Person Identification System for CLEAR 2007, The. BibRef

Megherbi, N., Ambellouis, S., Colot, O., Cabestaing, F.,
Data Association in Multi-Target Tracking Using Belief Theory: Handling Target Emergence and Disappearance Issue,
AVSBS05(517-521).
IEEE DOI 0602
BibRef

Megherbi, N., Ambellouis, S., Colot, O., Cabestaing, F.,
Joint audio-video people tracking using belief theory,
AVSBS05(135-140).
IEEE DOI 0602
BibRef

Li, X.[Xin], Sun, L.[Luo], Tao, L.M.[Lin-Mi], Xu, G.Y.[Guang-You], Jia, Y.[Ying],
A Speaker Tracking Algorithm Based on Audio and Visual Information Fusion Using Particle Filter,
ICIAR04(II: 572-580).
Springer DOI 0409
BibRef

Lange, C.[Christian], Hermann, T.[Thomas], Ritter, H.[Helge],
Holistic Body Tracking for Gestural Interfaces,
GW03(132-139).
Springer DOI 0405
BibRef

Blake, A., Gangnet, M., Perez, P., Vermaak, J.,
Integrated tracking with vision and sound,
CIAP01(354-357).
IEEE DOI 0210
BibRef

Chapter on Face Recognition, Detection, Tracking, Gesture Recognition, Fingerprints, Biometrics continues in
Mouth Location, Lip Location, Detection.


Last update: Jun 19, 2022 at 13:58:21