Index for audio

_audio_
3-D audio-Visual Corpus of Affective Communication, A
3D audio in human-computer interfaces
3D audio-Visual Speaker Tracking with A Novel Particle Filter
3D audio-Visual Speaker Tracking with A Two-Layer Particle Filter
Abnormal acoustic event localization based on selective frequency bin in high noise environment for audio surveillance
Abnormal events detection using unsupervised One-Class SVM: Application to audio surveillance and evaluation
ACAV100M: Automatic Curation of Large-Scale Datasets for audio-Visual Video Representation Learning
Accelerating Index-Based audio Identification
Accurate glottal model parametrization by integrating audio and high-speed endoscopic video data
Acoustic Signals, Sounds, audio
Acoustoseismic Method for Buried-Object Detection by Means of Surface-Acceleration Measurements and audio Facilities
Active audio-Visual Separation of Dynamic Sound Sources
Active learning of custom sound taxonomies in unstructured audio data
Active Learning Paradigm for Online audio-Visual Emotion Recognition, An
AD-NeRF: audio Driven Neural Radiance Fields for Talking Head Synthesis
Adaptive audio Steganography Based on Advanced Audio Coding and Syndrome-Trellis Coding
Adaptive context recognition based on audio signal
Adaptive Selection of Embedding Locations for Spread Spectrum Watermarking of Compressed audio
Adaptive Signal Models for Wide-Band Speech and audio Compression
Adaptive Speaker Identification with audio-Visual Cues for Movie Content Analysis
Adaptive Synthesis in Progressive Retrieval of audio-Visual Data
AdVerb: Visually Guided audio Dereverberation
Adversarial-Metric Learning for audio-Visual Cross-Modal Matching
Adversarially Training for audio Classifiers
AENet: Learning Deep audio Features for Video Analysis
Affective audio Annotation of Public Speeches with Convolutional Clustering Neural Network
Affective audio-Visual Words and Latent Topic Driving Model for Realizing Movie Affective Scene Classification
AI-Based human audio processing for COVID-19: A comprehensive overview
AIT 3D audio / Visual Person Tracker for CLEAR 2007, The
AKVSR: audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model
Algorithm of Resist Cropping Robust audio Watermark Based On Wavelet Transformation, An
Algorithms for multiplex scheduling of object-based audio-visual presentations
AlignNet: A Unifying Approach to audio-Visual Alignment
Analysis of Lip Geometric Features for audio-Visual Speech Recognition
Analysis of Optimal Search Interval for Estimation of Modified Quantization Step Size in Quantization-Based audio Watermark Detection
Animating Face using Disentangled audio Representations
Annotation-free audio-Visual Segmentation
Appearance Matters, So Does audio: Revealing the Hidden Face via Cross-Modality Transfer
Applying audio description for context understanding of surveillance videos by people with visual impairments
Applying Segment-Level Attention on Bi-Modal Transformer Encoder for audio-Visual Emotion Recognition
approach to immersive audio rendering with wave field synthesis for 3D multimedia content, An
Are Multiple Cross-Correlation Identities better than just Two? Improving the Estimate of Time Differences-of-Arrivals from Blind audio Signals
ARMA digital filter design method for audio and musical purposes
Arousal Recognition Using audio-Visual Features and FMRI-Based Brain Response
Assessment and classification of singing quality based on audio-visual features
Assisted Listening Using a Headset: Enhancing audio perception in real, augmented, and virtual environments
Associating audio-visual activity cues in a dominance estimation framework
ASVFI: audio-Driven Speaker Video Frame Interpolation
Asymmetric Contrastive Learning for audio Fingerprinting
Asymmetric Matching Method for a Robust Binary audio Fingerprinting, An
ATGNN: audio Tagging Graph Neural Network
Attention Based Speaker-independent audio-visual Deep Learning Model for Speech Enhancement, An
Attention Fusion for audio-Visual Person Verification Using Multi-Scale Features
Attention guided deep audio-face fusion for efficient speaker naming
Attention-Guided Neural Networks for Full-Reference and No-Reference audio-Visual Quality Assessment
Atypical Lyrics Completion Considering Musical audio Signals
audio and Video Coding Standard Workgroup of China
audio and Video Coding Standard, AVS Coding Issues, Standards
audio and Video-based Emotion Recognition using Multimodal Transformers
audio Assisted Robust Visual Tracking With Adaptive Particle Filtering
audio Based Real-Time Speech Animation of Embodied Conversational Agents
audio classification based on MPEG-7 spectral basis representations
audio Classification in Speech and Music: A Comparison Between a Statistical and a Neural Approach
audio Coding Using Overlap and Kernel Adaptation
audio Copyright Protection Schemes Based on SMM in Cepstrum Domain, An
audio decoding with frequency and complexity scalability
audio effects to enhance spatial information displays
audio Event-Relational Graph Representation Learning for Acoustic Scene Classification
audio Features for Music Emotion Recognition: A Survey
audio Identification by Sampling Sub-fingerprints and Counting Matches
audio Matters in Video Super-Resolution by Implicit Semantic Guidance
audio Matters in Visual Attention
audio Music Genre Classification Using Different Classifiers and Feature Selection Methods
audio Partitioning and Transcription for Broadcast Data Indexation
audio personalization using head related transfer function in 3DTV
audio Postprocessing Detection Based on Amplitude Cooccurrence Vector Feature
audio Properties of Perceived Boundaries in Music
audio Recapture Detection With Convolutional Neural Networks
audio Related Quality of Experience Evaluation in Urban Transportation Environments With Brain Inspired Graph Learning
audio Retrieval With Natural Language Queries: A Benchmark Study
audio Secret Management Scheme Using Shamir's Secret Sharing
audio Segmentation and Speaker Localization in Meeting Videos
audio signal identification via pattern capture and template matching
audio Signal-based Depression Level Prediction Combining Temporal and Spectral Features
audio Soft Declipping Based on Constrained Weighted Least Squares
audio Source Separation Using Variational Autoencoders and Weak Class Supervision
audio Source Separation, Source Localization, Direction of Arrival, DoA, Analysis
audio Surveillance Eye, The
audio Surveillance of Roads: A System for Detecting Anomalous Sounds
audio surveillance using a bag of aural words classifier
audio to Body Dynamics
audio visual isolated Hindi digits recognition using HMM
audio Visual Person Authentication by Multiple Nearest Neighbor Classifiers
audio Visual Scene-Aware Dialog
audio Visual Speaker Verification Based on Hybrid Fusion of Cross Modal Features
audio Watermarking Algorithm Robust to TSM Based on Counter Propagation Neural Network
audio Watermarking Based on Music Content Analysis: Robust against Time Scale Modification
audio Watermarking Scheme Based on Singular-Spectrum Analysis, An
audio watermarking techniques using sinusoidal patterns based on pseudorandom sequences
audio Watermarks, Speech Watermarks
audio- and Gaze-driven Facial Animation of Codec Avatars
audio- and Video-Based Biometric Person Authentication
audio-Adaptive Activity Recognition Across Video Domains
audio-Assisted Movie Dialogue Detection
audio-Based Automatic Generation of a Piano Reduction Score by Considering the Musical Structure
audio-Based Emotion Recognition Enhancement Through Progressive Gans
audio-Based Granularity-Adapted Emotion Classification
audio-Based Machine Learning Model for Traffic Congestion Detection
audio-Based Musical Version Identification: Elements and challenges
audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning
audio-Based Sports Video Segmentation and Event Detection Algorithm, An
audio-Driven Deformation Flow for Effective Lip Reading
audio-Driven Dubbing for User Generated Contents via Style-Aware Semi-Parametric Synthesis
audio-Driven Emotional Video Portraits
audio-Driven Laughter Behavior Controller
audio-driven Neural Gesture Reenactment with Video Motion Graphs
audio-Driven Robot Upper-Body Motion Synthesis
audio-Driven Stylized Gesture Generation with Flow-Based Model
audio-driven talking face generation with diverse yet realistic facial animations
audio-Driven Talking Face Video Generation With Dynamic Convolution Kernels
audio-Driven Talking Video Frame Restoration
audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment
audio-Facial Laughter Detection in Naturalistic Dyadic Conversations
audio-Guided Video-Based Face Recognition
audio-Noise Power Spectral Density Estimation Using Long Short-Term Memory
audio-Oculomotor Transformation
audio-Video Analysis for Indexing and Classification
audio-Video Analysis of Musical Expressive Intentions
audio-Video Based Emotion Recognition Using Minimum Cost Flow Algorithm
audio-Video detection of the active speaker in meetings
audio-video front-end for multimedia applications, An
audio-Video Integration for Background Modelling
audio-Video Sensor Fusion with Probabilistic Graphical Models
audio-video surveillance system for public transportation
audio-Visual Active Speaker Tracking in Cluttered Indoors Environments
audio-Visual Affect Recognition
audio-Visual Affect Recognition through Multi-Stream Fused HMM for HCI
audio-Visual Affective Expression Recognition Through Multistream Fused HMM
audio-visual attention: Eye-tracking dataset and analysis toolbox
audio-Visual Automatic Group Affect Analysis
audio-visual based emotion recognition-a new approach
audio-visual biometric recognition via joint sparse representations
audio-Visual Biometrics
audio-Visual Class-Incremental Learning
audio-Visual Classification and Fusion of Spontaneous Affective Data in Likelihood Space
audio-Visual Classification of Sports Types
audio-Visual Classification Video Browser
audio-Visual Co-Training for Vehicle Classification
audio-visual content-based violent scene characterization
audio-visual continuous speech recognition using MPEG-4 compliant visual features
audio-Visual Contrastive and Consistency Learning for Semi-Supervised Action Recognition
audio-visual data association for face expression analysis
audio-Visual Data Fusion Using a Particle Filter in the Application of Face Recognition
audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning
audio-Visual Efficient Conformer for Robust Speech Recognition
audio-Visual Emotion Analysis Using Semi-Supervised Temporal Clustering with Constraint Propagation
audio-Visual Emotion Recognition in Video Clips
audio-visual emotion recognition using Boltzmann Zippers
audio-visual emotion recognition with boosted coupled HMM
audio-Visual Emotion Recognition With Preference Learning Based on Intended and Multi-Modal Perceived Labels
audio-Visual Emotion, Audiovisual Emotion Recognition
audio-Visual Emotion-Aware Cloud Gaming Framework
audio-visual event classification via spatial-temporal-audio words
audio-Visual Event Detection using Duration Dependent Input Output Markov Models
audio-Visual Event Localization by Learning Spatial and Semantic Co-Attention
audio-Visual Event Localization in Unconstrained Videos
audio-Visual Event Localization via Recursive Fusion by Joint Co-Attention
audio-Visual Event Recognition in Surveillance Video Sequences
audio-Visual Face Reenactment
audio-Visual Feature Fusion for Vehicles Classification in a Surveillance System
audio-Visual Floorplan Reconstruction
audio-visual flow: A variational approach to multi-modal flow estimation
audio-Visual Foreground Extraction for Event Characterization
audio-Visual Gated-Sequenced Neural Networks for Affect Recognition
audio-Visual Glance Network for Efficient Video Recognition
audio-Visual Grouping Network for Sound Localization from Mixtures
audio-visual Hybrid Approach for Filling Mass Estimation
audio-Visual Identity Verification and Robustness to Imposture
audio-Visual Instance Discrimination with Cross-Modal Agreement
audio-Visual Keyword Spotting Based on Multidimensional Convolutional Neural Network
audio-visual Keyword Spotting for Mandarin Based on Discriminative Local Spatial-Temporal Descriptors
audio-Visual Kinship Verification: A New Dataset and a Unified Adaptive Adversarial Multimodal Learning Approach
audio-Visual Mismatch-Aware Video Retrieval via Association and Adjustment
audio-Visual Model Distillation Using Acoustic Images
audio-Visual Particle Flow SMC-PHD Filtering for Multi-Speaker Tracking
audio-Visual Person Authentication with Multiple Visualized-Speech Features and Multiple Face Profiles
audio-Visual Person Verification
audio-Visual Person-of-Interest DeepFake Detection
audio-Visual Predictive Coding for Self-Supervised Visual Representation Learning
audio-visual processing for scene change detection
audio-Visual Quality Assessment for User Generated Content: Database and Method
audio-Visual Recognition System in Compression Domain
audio-visual saliency prediction with multisensory perception and integration
audio-Visual Scene Analysis with Self-Supervised Multisensory Features
audio-Visual Segmentation
audio-visual selection process for the synthesis of photo-realistic talking-head animations
audio-visual sensor fusion approach for feature based vehicle identification, An
audio-visual Sensor Fusion Framework Using Person Attributes Robust to Missing Visual Modality for Person Recognition
audio-visual speaker detection using dynamic Bayesian networks
audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion
audio-Visual Speaker Identification Based on the Use of Dynamic Audio and Visual Features
audio-visual speaker identification using coupled hidden Markov models, A
audio-Visual Speaker Identification via Adaptive Fusion Using Reliability Estimates of Both Modalities
audio-visual speaker identification with multi-view distance metric learning
audio-Visual Speaker Localization Using Graphical Models
audio-visual speaker tracking with importance particle filters
audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
audio-Visual Speech Fusion Using Coupled Hidden Markov Models
audio-Visual Speech Recognition Based on AAM Parameter and Phoneme Analysis of Visual Feature
audio-Visual Speech Recognition Scheme Based on Wavelets and Random Forests Classification
audio-visual speech recognition techniques in augmented reality environments
audio-Visual Speech Recognition Using A Two-Step Feature Fusion Strategy
audio-Visual Speech Recognition Using MPEG-4 Compliant Visual Features
audio-visual speech synchronization detection using a bimodal linear prediction model
audio-Visual Speech Synthesis Based on Chinese Visual Triphone
audio-Visual System for Object-Based Audio: From Recording to Listening, An
audio-Visual Temporal Saliency Modeling Validated by fMRI Data
audio-Visual Tracking of Concurrent Speakers
audio-Visual Transformer Based Crowd Counting
audio-Visual Unit Selection for the Synthesis of Photo-Realistic Talking-Heads
audio2Gestures: Generating Diverse Gestures from Speech Audio with Conditional Variational Autoencoders
audioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation
audiovisual Spatial-Audio Analysis by Means of Sound Localization and Imaging: A Multimedia Healthcare Framework in Abdominal Sound Mapping
audiovisual Transformer with Instance Attention for Audio-visual Event Localization
Auditory Distance Rendering Based on ICPD Control for Stereophonic 3D audio System
AutoAD II: The Sequel - Who, When, and What in Movie audio Description
Automated audio-visual Activity Analysis
Automated detection of errors and quality issues in audio-visual content
Automated MPEG audio-video summarization and description
Automatic annotation of tennis games: An integration of audio, vision, and learning
Automatic audio Feature Extraction for Keyword Spotting
Automatic audio-Visual Fusion for Aggression Detection Using Meta-information
Automatic Detection and Removal of Impulsive Noise in audio Signals
Automatic Music Stretching Resistance Classification Using audio Features and Genres
Autonomous audio-Supported Learning of Visual Classifiers for Traffic Monitoring
AV-GAZE: A Study on the Effectiveness of audio Guided Visual Attention Estimation for Non-profilic Faces
AVE-CLIP: audioCLIP-based Multi-window Temporal Transformer for Audio Visual Event Localization
AVFace: Towards Detailed audio-Visual 4D Face Reconstruction
AVGZSLNet: audio-Visual Generalized Zero-Shot Learning by Reconstructing Label Features from Multi-Modal Embeddings
AVS: Scientific Research Community audio-Visual Systems
AWEAR 2.0 system: Omni-directional audio-visual data acquisition and processing
Backdoor Attacks against Deep Neural Networks by Personalized audio Steganography
Ball Hit Detection in Table Tennis Games Based on audio Analysis
Ballroom Dance Recognition from audio Recordings
BAUM-1: A Spontaneous audio-Visual Face Database of Affective and Mental States
Bayesian Approach to audio-Visual Speaker Identification, A
Bayesian Blind Identification of Nonlinear Distortion with Memory for audio Applications
Be Everywhere - Hear Everything (BEE): audio Scene Reconstruction by Sparse Audio-Visual Samples
Beat Synchronous Dance Animation Based on Visual Analysis of Human Motion and audio Analysis of Music Tempo
Beyond audio and video retrieval: Topic-oriented multimedia summarization
Beyond audio and video retrieval: Towards multimedia summarization
Beyond Mono to Binaural: Generating Binaural audio from Mono Audio with Depth and Cross Modal Attention
Bi-modal First Impressions Recognition Using Temporally Ordered Deep audio and Stochastic Visual Features
Bimodal fusion in audio-visual speech recognition
Biometric Based Unique Key Generation for Authentic audio Watermarking
biometric-based verification system for handwritten image-based signatures using audio to image matching, A
Bird Sounds, Bird Song, Birds audio, Identification
BirdSoundsDenoising: Deep Visual audio Denoising for Bird Sounds
Blind audio watermark decoding using independent component analysis
Blind audio-Visual Localization and Separation via Low-Rank and Sparsity
BLTRCNN-Based 3-D Articulatory Movement Prediction: Learning Articulatory Synchronicity From Both Text and audio Inputs
Boosted audio-visual HMM for speech reading
Boosting and structure learning in dynamic Bayesian networks for audio-visual speaker detection
Boosting audio chord estimation using multiple classifiers
Boosting Positive Segments for Weakly-Supervised audio-Visual Video Parsing
brand new application of visual-audio fingerprints: Estimating the position of the pirate in a theater-A case study, A
C-GCN: Correlation Based Graph Convolutional Network for audio-Video Emotion Recognition
Camera Pose Estimation and Localization with Active audio Sensing
Can audio-visual integration strengthen robustness under multimodal attacks?
Car crashes detection by audio analysis in crowded roads
Cartesian Genetic Programming Parameterization in the Context of audio Synthesis
Cascade classifiers trained on gammatonegrams for reliably detecting audio events
Cascaded Siamese Self-supervised audio to Video GAN
CASP-Net: Rethinking Video Saliency Prediction from an audio-Visual Consistency Perceptual Perspective
CASSANDRA: audio-video sensor fusion for aggression detection
Casual Conversations v2 Dataset: A diverse, large benchmark for measuring fairness and robustness in audio/vision/speech models, The
CATNet: Cross-modal fusion for audio-visual speech recognition
Challenges in audio Processing of Terrorist-Related Data
Channel Capacity Analysis of the Generalized Spread Spectrum Watermarking in audio Signals
Channel Capacity Analysis of the Multiple Orthogonal Sequence Spread Spectrum Watermarking in audio Signals
Class Confusability Reduction in audio-Visual Speech Recognition Using Random Forests
Class-Incremental Grouping Network for Continual audio-Visual Learning
Classification of audio Signals in All-Night Sleep Studies
Classification of audio Signals Using Fuzzy C-Means with Divergence-Based Kernel
Classification of general audio data for content-based retrieval
Classifying audio of movies by a multi-expert system
cloud infrastructure for target detection and tracking using audio and video fusion, A
Clustering and Visualizing audio-Visual Dataset on Mobile Devices in a Topic-Oriented Manner
Coding, Compression, Acoustic Signals, Sounds, audio
Coherent bag-of audio words model for efficient large-scale video copy detection
Collaborative Interface for Multimodal Ink and audio Documents, A
Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised audio-Visual Event Perception
Combined audio Visual Recognition and Analysis
Combined audio Visual Speaker Tracking
Combined Rule-Based Machine Learning audio-Visual Emotion Recognition Approach, A
Combining visual and acoustic features for audio classification tasks
Commentary Paper on Person Tracking With audio-Visual Cues Using the Iterative Decoding Framework
Comparative Error Analysis of audio-Visual Source Localization, A
Comparative Study of Different Segmentation Approaches for audio Track Indexing, A
comparative study on automatic audio-visual fusion for aggression detection using meta-information, A
comparison of extended fingerprint hashing and locality sensitive hashing for binary audio fingerprints, A
Comparison of MPEG-4 Facial Animation Parameter Groups with Respect to audio-Visual Speech Recognition Performance
Complementary Cues from audio Help Combat Noise in Weakly-Supervised Object Detection
Complementary video and audio analysis for broadcast news archives
Complex and Quaternionic Principal Component Pursuit and Its Application to audio Separation
Compositional Models for audio Processing: Uncovering the structure of sound mixtures
Comprehensive Survey on Video Saliency Detection With Auditory Information: The audio-Visual Consistency Perceptual is the Key!, A
Compressing audio Signals with Inpainting-Based Sparsification
Compression enhancement of video motion of mouth region using joint audio and video coding
Computer Vision for audio-Visual Media
Concealing Fingerprint-Biometric Data into audio Signals for Identify Authentication
Conditional Generation of audio from Video via Foley Analogies
Conditional Random Fields in Speech, audio, and Language Processing
Conducting audio Files via Computer Vision
confidence-based late fusion framework for audio-visual biometric identification, A
Consistent Wiener Filtering for audio Source Separation
consumer video search system by audio-visual concept classification, A
Content-Adaptive Analysis and Representation Framework for audio Event Discovery from Unscripted Multimedia, A
Content-Based audio Classification Using Support Vector Machines and Independent Component Analysis
Content-based audio retrieval with relevance feedback
Content-Based Movie Analysis and Indexing Based on audio-Visual Cues
Content-based video parsing and indexing based on audio-visual interaction
Continuous audio-Visual Speech Recognition
Continuous Emotion Recognition using Visual-audio-linguistic Information: A Technical Report for ABAW3
Continuous Emotion Recognition with audio-visual Leader-follower Attentive Fusion
Contrastive Positive Sample Propagation Along the audio-Visual Event Line
Cooperative Game Modeling With Weighted Token-Level Alignment for audio-Text Retrieval
Coordinated Joint Multimodal Embeddings for Generalized audio-Visual Zero-shot Classification and Retrieval of Videos
Correlation of Gestural Musical audio Cues and Perceived Expressive Qualities
Cost-effective solution to synchronised audio-visual data capture using multiple sensors
Cost-Effective Solution to Synchronized audio-Visual Capture Using Multiple Sensors
Cost-Sensitive Multi-Label Learning for audio Tag Annotation and Retrieval
Creating audio-centric, image-centric, and integrated audio-visual summaries
Cross Attentional audio-Visual Fusion for Dimensional Emotion Recognition
Cross-Domain Deep Feature Combination for Bird Species Classification with audio-Visual Data
Cross-modal Background Suppression for audio-Visual Event Localization
Cross-modal Deep Learning Applications: audio-visual Retrieval
Cross-modal Embeddings for Video and audio Retrieval
Cross-Referencing Self-Training Network for Sound Event Detection in audio Mixtures
Crossmodal Matching of Speakers Using Lip and Voice Features in Temporally Non-overlapping audio and Video Streams
Current Developments and Future Trends in audio Authentication
Data Hiding in MPEG Compressed audio Using Wet Paper Codes
Data-Driven Approach to audio Decorrelation, A
DAVD-Net: Deep audio-Aided Video Decompression of Talking Heads
DBATES: Dataset for Discerning Benefits of audio, Textual, and Facial Expression Features in Competitive Debate Speeches
DCAR: A Discriminative and Compact audio Representation for Audio Processing
Decision Level Fusion for audio-Visual Speech Recognition in Noisy Conditions
Deep audio-Visual Beamforming for Speaker Localization
Deep audio-Visual Fusion Neural Network for Saliency Estimation
Deep audio-Visual Speech Recognition
Deep Boltzmann Machines for i-Vector Based audio-Visual Person Identification
Deep emotion recognition based on audio-visual correlation
Deep Learning for audio-Based Music Classification and Tagging: Teaching Computers to Distinguish Rock from Bach
Deep Learning-based Video Retrieval Using Object Relationships and Associated audio Classes
Deep Neural Network Based 3D Articulatory Movement Prediction Using Both Text and audio Inputs
Deep Neural Networks for Full-Reference and No-Reference audio-Visual Quality Assessment
DeepComboSAD: Spectro-Temporal Correlation Based Speech Activity Detection for Naturalistic audio Streams
Deepfake Video Detection Using audio-visual Consistency
Deepfakes audio Detection Leveraging Audio Spectrogram and Convolutional Neural Networks
Demonstration of an HMM-based photorealistic expressive audio-visual speech synthesis system
Denoising of audio Data by Nonlinear Diffusion
Dense Modality Interaction Network for audio-Visual Event Localization
Dense-Localizing audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
Derivative-Based Steganographic Distortion and its Non-additive Extensions for audio
Design and implementation of MPEG audio layer III decoder using graphics processing units
Design of MP3 audio Signal System Based on DSP
Designing an Interactive audio Interface for Climate Science
Detect audio-Video Temporal Synchronization Errors in Advertisements (Ads)
Detecting and indexing moving objects for Behavior Analysis by Video and audio Interpretation
Detecting Fake-Quality WAV audio Based on Phase Differences
Detecting Group Turn Patterns in Conversations Using audio-Video Change Scale-Space
Detecting Hubs in Music audio Based on Network Analysis
Detecting information-hiding in WAV audios
Detecting local audio-visual synchrony in monologues utilizing vocal pitch and facial landmark trajectories
Detecting News Reporting Using audio/Visual Information
Detecting Replay Attacks Using Multi-Channel audio: A Neural Network-Based Method
Detecting road surface wetness from audio: A deep learning approach
Detecting Semantic Concepts Using Context and audio/Visual Features
Development of an Estimation Model for Instantaneous Presence in audio-Visual Content
Device and method for dubbing an audio-visual presentation which generates synthesized speech and corresponding facial movements
Diff2Lip: audio Conditioned Diffusion Models for Lip-Synchronization
DiffTalk: Crafting Diffusion Models for Generalized audio-Driven Portraits Animation
Digit Recognition Applied to Reconstructed audio Signals Using Deep Learning
Digital audio watermarking method based on wavelet transform
Discovering joint audio-visual codewords for video event detection
Discovering meaningful multimedia patterns with audio-visual concepts and associated text
Discriminative Collaborative Representation and Its Application to audio Signal Classification
Discriminative Cross-Modality Attention Network for Temporal Inconsistent audio-Visual Event Localization
Distilling audio-Visual Knowledge by Compositional Contrastive Learning
Distributed audio Network for Speech Enhancement in Challenging Noise Backgrounds
Domain Generalization through audio-Visual Relative Norm Alignment in First Person Action Recognition
DSP Restoration Techniques for audio
Dual Attention Matching for audio-Visual Event Localization
Dual Perspective Network for audio-Visual Event Localization
Dual-modality Talking-metrics: 3D Visual-audio Integrated Behaviometric Cues from Speakers
Dynamic 2D/3D Speaking Face Dataset with Synchronized audio
Dynamic audio-Visual Mapping using Fused Hidden Markov Model Inversion Method
Dynamic Bayesian Networks for audio-Visual Speaker Recognition
Dynamic Bayesian Networks for audio-Visual Speech Recognition
EAVA: A 3D Emotive audio-Visual Avatar
Effective News Anchorperson Shot Detection Method Based on Adaptive audio/Visual Model Generation, An
Effective Pseudonoise Sequence and Decoding Function for Imperceptibility and Robustness Enhancement in Time-Spread Echo-Based audio Watermarking
Effective Watermarking of Digital audio and Image Using Matlab Technique
Effects of ATM network impairments on audio-visual broadcast applications
Effects of audio Compression on Chord Recognition
Efficient audio Rendering Using Angular Region-Wise Source Enhancement for 360° Video
Efficient Cascaded Filtering Retrieval Method for Big audio Data, An
Efficient Emotional Adaptation for audio-Driven Talking-Head Generation
Efficient Implementation of the Forward and Inverse MDCT in MPEG audio Coding, An
Efficient Parallel audio Generation Using Group Masked Language Modeling
Efficient video coding based on audio-visual focus of attention
Egocentric audio-Visual Object Localization
Egocentric Deep Multi-Channel audio-Visual Active Speaker Localization
EM Algorithms for Weighted-Data Clustering with Application to audio-Visual Scene Analysis
EM Estimation of Scale Factor for Quantization-Based audio Watermarking
EMMN: Emotional Motion Memory Network for audio-driven Emotional Talking Face Generation
Emotion Analysis Using audio/Video, EMG and EEG: A Dataset and Comparison Study
Emotion Recognition Based on Joint Visual and audio Cues
Emotional Tone-Based audio Continuous Emotion Recognition
Empirical Study of audio-Visual Features Fusion for Gait Recognition
Empirical Study of Feature Extraction Methods for audio Classification, An
Energy and Computation Efficient audio-Visual Voice Activity Detection Driven by Event-Cameras
ENF Detection in audio Recordings via Multi-Harmonic Combining
Enhancing audio surveillance with hierarchical recurrent neural networks
Enhancing Transferability of Adversarial audio in Speaker Recognition Systems
Ensemble of Rejecting Classifiers for Anomaly Detection of audio Events, An
eNTERFACE-05 audio-Visual Emotion Database, The
Environmental Sound Classification Using Local Binary Pattern and audio Features Collaboration
EPG2S: Speech Generation and Speech Enhancement Based on Electropalatography and audio Signals Using Multimodal Learning
EPIC-Fusion: audio-Visual Temporal Binding for Egocentric Action Recognition
ERANNs: Efficient residual audio neural networks for audio pattern recognition
Error Weighted Semi-Coupled Hidden Markov Model for audio-Visual Emotion Recognition
Escape from the Dark Jungle: A 3D audio Game for Emotion Regulation
Estimating Cohesion in Small Groups Using audio-Visual Nonverbal Behavior
Estimating Rainfall from Surveillance audio Based on Parallel Network with Multi-Scale Fusion and Attention Mechanism
Evaluating audio skimming and frame rate acceleration for summarizing BBC rushes
Event Detection in Field Sports Video Using audio-Visual Features and a Support Vector Machine
Event-Specific audio-Visual Fusion Layers: A Simple and New Perspective on Video Understanding
Exploiting evidential theory in the fusion of textual, audio, and visual modalities for affective music video retrieval
Exploiting the Complementarity of audio and Visual Data in Multi-speaker Tracking
Exploiting Visual-audio-Textual Characteristics for Automatic TV Commercial Block Detection and Segmentation
Exploring audio Compression as Image Completion in Time-frequency Domain
Exploring Co-Occurence Between Speech and Body Movement for audio-Guided Video Localization
Exploring Heterogeneous Clues for Weakly-Supervised audio-Visual Video Parsing
Exploring the Resolution Limit for In-Air Synthetic-Aperture audio Imaging
Exploring the Topics of audio Words for Detecting Alzheimer's Disease From Spontaneous Speech
Exponential Hyperbolic Cosine Robust Adaptive Filters for audio Signal Processing
Expressive Talking Head Generation with Granular audio-Visual Control
Extracting High Level Semantics by Means of Speech, audio, and Image Primitives in Surveillance Applications
Extracting Semantic Information from Basketball Video Based on audio-Visual Features
Fast Conversion Algorithm for the Dolby Digital (Plus) AC-3 audio Coding Standards
Fast Mode Decision Algorithm for Intra Encoding of the 3rd Generation audio Video Coding Standard
Feature Analysis for audio Classification
Feature contours fusion for determining segment boundaries in audio data
Feature fluctuation absorption for a quick audio retrieval from long recordings
Feature space video stream consistency estimation for dynamic stream weighting in audio-visual speech recognition
Few-Shot Class-Incremental audio Classification Using Dynamically Expanded Classifier With Self-Attention Modified Prototypes
Finding Fallen Objects Via Asynchronous audio-Visual Integration
Fingerprint extraction of audio signal using wavelet transform
Flow-guided One-shot Talking Face Generation with a High-resolution audio-visual Dataset
Formant-based acoustic features for cow's estrus detection in audio surveillance system
FPGA-based real-time MFCC extraction for automatic audio indexing on FM broadcast data
Frame-Independent and Parallel Method for 3D audio Real-Time Rendering on Mobile Devices
framework for estimating geometric distortions in video copies based on visual-audio fingerprints, A
Free Viewpoint Image Generation Synchronized with Free Listening-Point audio for 3-D Real Space Navigation
Frequency Domain Long-Term Prediction for Low Delay General audio Coding
From Blind to Guided audio Source Separation: How models and side information can improve the separation of sound
From Hörspiel to audio Fiction: Sound Design Perspectives for Blind and Visually Impaired People
Fully automatic face recognition system using a combined audio-visual approach
Fusing audio and Visual Features of Speech
Fusing audio-Visual Nonverbal Cues to Detect Dominant People in Group Conversations
Fusion of audio and Video Information for Multi Modal Person Authentication
Fusion of audio and visual cues for laughter detection
Fusion of audio- and Visual Cues for Real-Life Emotional Human Robot Interaction
Fusion of audio-Visual Information for Integrated Speech Processing
Fusion of classifier predictions for audio-visual emotion recognition
Fwobble: Continuous audio-haptic feedback for balance control, The
Gammatone Cepstral Coefficients: Biologically Inspired Features for Non-Speech audio Classification
Generalizing AUC Optimization to Multiclass Classification for audio Segmentation With Limited Training Data
Generating Adversarial Examples in audio Classification with Generative Adversarial Network
Generation of sports highlights using motion activity in combination with a common audio feature extraction framework
Generative Model Driven Representation Learning in a Hybrid Framework for Environmental audio Scene and Sound Event Recognition
Generic content-based audio indexing and retrieval framework
Genre-Adaptive Semantic Computing and audio-Based Modelling for Music Mood Annotation
Geometric Invariant audio Watermarking Based on an LCM Feature
Gestural Interactions for Multi-parameter audio Control and Audification
GLAVNet: Global-Local audio-Visual Cues for Fine-Grained Material Recognition
Glitch in the matrix: A large scale benchmark for content driven audio-visual forgery detection and localization
Global Affective Video Content Regression Based on Complementary audio-visual Features
Goal detection in soccer video using audio/visual keywords
Graph Attention for Automated audio Captioning
Graph Fourier Transform Based audio Zero-Watermarking
Group Feature Selection for audio-Based Video Genre Classification
Group Masked Model Learning for General audio Representation
HAMEX: A Handwritten and audio Dataset of Mathematical Expressions
Handwritten and audio Information Fusion for Mathematical Symbol Recognition
Heterogeneous Networks for audio and Video: Using IEEE 802.1 Audio Video Bridging
Heuristic Attack Method to PRH-Based audio Copy Detectors, A
HHT-based audio coding
Hiding Video in audio via Reversible Generative Models
Hierarchical audio-visual cue integration framework for activity analysis in intelligent meeting rooms
Hierarchical audio-Visual Surveillance for Passenger Elevators
Hierarchical Model For Long-Length Video Summarization With Adversarially Enhanced audio/Visual Features
High Scrambling Degree in audio Through Imitation of an Unintelligible Signal
High-Level Feature Extraction Using SIFT GMMs and audio Models
Highly Efficient audio Coding With Blind Spectral Recovery Based on Machine Learning
Highly Transparent and Secure Scheme for Concealing Text Within audio
HMM Based Falling Person Detection Using Both audio and Video
Horror film genre typing and scene labeling via audio analysis
Hough transform-based mouth localization for audio-visual speech recognition
How to Design a Three-Stage Architecture for audio-Visual Active Speaker Detection in the Wild
Htad: A Home-tasks Activities Dataset with Wrist-accelerometer and audio Features
Human emotion recognition from videos using spatio-temporal and audio features
Human interaction categorization by using audio-visual cues
Human Perception of audio-Visual Synthetic Character Emotion Expression in the Presence of Ambiguous and Conflicting Information
hybrid visual feature extraction method for audio-visual speech recognition, A
Hyperbolic audio-visual Zero-shot Learning
Identification of Sparse audio Tampering Using Distributed Source Coding and Compressive Sensing Techniques
Identification of story units in audio-visual sequences by joint audio and video processing
Identification of Successive Correlated Camera Shots Using audio and Video Information
Identifying Colombian Bird Species from audio Recordings
Identifying dominant people in meetings from audio-visual sensors
Identifying Human Behaviors Using Synchronized audio-Visual Cues
IEEE Standards for Advanced audio and Video Coding in Emerging Applications
Image and audio Sequence Visualization and Interaction Mechanisms for Structured Video Browsing and Editing
Image/Video/audio Quality in Computer Vision and Generative AI
Image2audio: Facilitating Semi-supervised Audio Emotion Recognition with Facial Expression Image
Impact of audio on Subjective Assessment of Video Quality
Impact of audio on Subjective Assessment of Video Quality in Videoconferencing Applications
Improved audio-Visual Speaker Recognition via the Use of a Hybrid Combination Strategy
Improved High Capacity Spread Spectrum-Based audio Watermarking by Hadamard Matrices
Improved Soccer Action Spotting using both audio and Video Streams
Improving accuracy in behaviour identification for content-based retrieval by using audio and video information
Improving audio Steganalysis Using Deep Residual Networks
Improving mix-and-separate training in audio-visual sound source separation with an object prior
Improving user verification in human-robot interaction from audio or image inputs through sample quality assessment
Improving videophone subjective quality using audio information
Incorporating audio Signals into Constructing a Visual Saliency Map
Increasing Robustness of an Improved Spread Spectrum audio Watermarking Method Using Attack Characterization
Indexing audiovisual databases through joint audio and video processing
Indexing of multilingual news telecast using audio-visual keywords
Information-Geometric Approach to Real-Time audio Segmentation, An
Instant Mobile Video Search With Layered audio-Video Indexing and Progressive Transmission
Instantaneous Evaluation of the Sense of Presence in audio-Visual Content
integrated decoding framework for audio watermark extraction, An
Integrating LDV audio and IR Video for Remote Multimodal Surveillance
Integrating Visual, audio and Text Analysis for News Video
Integration of 3D audio and 3D video for FTV
Integration of audio and visual information for content-based video segmentation
Integration of audio/Visual Information for Use in Human-Computer Intelligent Interaction
Interactive 3-D audio System With Loudspeakers, An
Interactive Multi-View Video and View-Dependent audio Under MPEG-21 DIA (Digital Item Adaptation)
Introduction to the Special Issue on audio and Video Analysis for Multimedia Interactive Services
Introduction to the Special Issue: Advances on pattern recognition for speech and audio processing
Investigating Blind User Preference on Tactile Symbols for Landmarks on audio-Tactile Map
investigation on MPEG audio segmentation by evolutionary algorithms, An
Investigations into the Robustness of audio-Visual Gender Classification to Background Noise and Illumination Effects
iQuery: Instruments as Queries for audio-Visual Sound Separation
ISLA: Temporal Segmentation and Labeling for audio-Visual Emotion Recognition
ISNN: Impact Sound Neural Network for audio-Visual Object Classification
Joint audio-video Object Tracking
Joint audio-video people tracking using belief theory
Joint audio-visual bi-modal codewords for video event detection
Joint audio-Visual Deepfake Detection
Joint audio-Visual Tracking Using Particle Filters
Joint Cross-Attention Model for audio-Visual Fusion in Dimensional Emotion Recognition, A
Joint Inversion of audio-Magnetotelluric and Seismic Travel Time Data With Deep Learning Constraint
Joint Object-Material Category Segmentation from audio-Visual Cues
Joint Visual and audio Learning for Video Highlight Detection
Joint-Modal Label Denoising for Weakly-Supervised audio-Visual Video Parsing
KAN-AV dataset for audio-visual face and speech analysis in the wild
Kernel Fusion of audio and Visual Information for Emotion Recognition
Known-Artist Live Song Identification Using audio Hashprints
Laboratory and Crowdsourcing Studies of Lip Sync Effect on the audio-Video Quality Assessment for Videoconferencing Application
Language-Guided audio-Visual Source Separation via Trimodal Consistency
Large Scale audio-Visual Video Analytics Platform for Forensic Investigations of Terroristic Attacks
Large Vocabulary audio-visual Speech Recognition Using Active Shape Models
Large Vocabulary audio-Visual Speech Recognition Using the Janus Speech Recognition Toolkit
Latent topic model for audio retrieval
LAVSS: Location-Guided audio-Visual Spatial Audio Separation
Learning Affective Features With a Hybrid Deep Model for audio-Visual Emotion Recognition
Learning Algorithms for audio and Video Processing: Independent Component Analysis and Support Vector Machine Based Approaches
Learning audio and image representations with bio-inspired trainable feature extractors
Learning audio-Video Modalities from Image Captions
Learning audio-Visual Source Localization via False Negative Aware Contrastive Learning
Learning Contextually Fused audio-Visual Representations for Audio-Visual Speech Recognition
Learning Self-supervised audio-Visual Representations for Sound Recommendations
Learning to Answer Questions in Dynamic audio-Visual Scenarios
Learning to Predict Salient Faces: A Novel Visual-audio Saliency Model
Learning Visual Styles from audio-Visual Associations
Let's Play Music: audio-Driven Performance Video Generation
Level Ratio Based Inter and Intra Channel Prediction with Application to Stereo audio Frame Loss Concealment
Leveraging Acoustic Images for Effective Self-supervised audio Representation Learning
Leveraging recent advances in deep learning for audio-Visual emotion recognition
Leveraging TCN and Transformer for effective visual-audio fusion in continuous emotion recognition
Leveraging the Video-Level Semantic Consistency of Event for audio-Visual Event Localization
Lifelog Scene Change Detection Using Cascades of audio and Video Detectors
Linear Dynamic Range Reduction of Musical audio Using an Allpass Filter Chain
Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to audio Representation Mapping
Listen and Look: audio-Visual Matching Assisted Speech Source Separation
Listen to Look: Action Recognition by Previewing audio
Listen to Your Face: Inferring Facial Action Units from audio Channel
Live Version Identification with audio Scene Detection
Local AM/FM Parameters Estimation: Application to Sinusoidal Modeling and Blind audio Source Separation
Local Information Assisted Attention-Free Decoder for audio Captioning
Localize to Binauralize: audio Spatialization from Visual Sound Source Localization
Look who's talking: Speaker detection using video and audio correlation
Looking and Hearing Into Details: Dual-Enhanced Siamese Adversarial Network for audio-Visual Matching
Looking into Your Speech: Learning Cross-modal Affinity for audio-visual Speech Separation
Low bit rate audio-visual communication having improved face and lip region detection
Low Cost Force-Feedback Interaction with Haptic Digital audio Effects
Low-Complexity Linear-Phase Graphic audio Equalizer Based on IFIR Filters, A
Low-Spec Extendable GPU-Based audio Library, A
MAD: A Scalable Dataset for Language Grounding in Videos from Movie audio Descriptions
Making a scene: alignment of complete sets of clips based on pairwise audio match
MALip: Modal Amplification Lipreading based on reconstructed audio features
Mead: A Large-scale audio-visual Dataset for Emotional Talking-face Generation
Mean-Shift and Sparse Sampling-Based SMC-PHD Filtering for audio Informed Visual Speaker Tracking
Method and apparatus for enhancing and indexing video and audio signals
Method and apparatus for producing audio-visual synthetic speech
Method and apparatus for summarizing and indexing the contents of an audio-visual presentation
Method and apparatus for tracking moving objects using combined video and audio information in video conferencing and other applications
Method and system for generating facial animation values based on a combination of visual and audio information
Method of and apparatus for animation, driven by an audio signal, of a synthesized model of a human face
Method of audio Watermarking Based on Adaptive Phase Modulation
Methods and apparatuses for segmenting an audio-visual recording using image similarity searching and audio speaker recognition
Metric Learning-Based Multimodal audio-Visual Emotion Recognition
Microphone Arrays as Generalized Cameras for Integrated audio Visual Processing
Minimal test collections for low-cost evaluation of audio Music Similarity and Retrieval systems
MixSpeech: Cross-Modality Self-Learning with audio-Visual Stream Mixup for Visual Speech Translation and Recognition
MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint audio and Video Generation
MODA: Mapping-Once audio-driven Portrait Animation with Dual Attentions
Modeling Label Dependencies for audio Tagging With Graph Convolutional Network
Modelling Stochastic Context of audio-Visual Expressive Behaviour With Affective Processes
Modification of Polar Echo Kernel for Performance Improvement of audio Watermarking
Mood detection analyzing lyrics and audio signal based on deep learning architectures
Motion estimation using audio and video fusion
Move2Hear: Active audio-Visual Source Separation
Movie genre classification by exploiting audio-visual features of previews
MPEG Digital audio-Coding and Video-Coding Standards
MPEG Standards for Compressed Representation of Immersive audio
MPEG-1 audio Real-Time Encoding System
MPEG-1, Coding of Moving Pictures and Associated audio for Digital Storage Media at Up to About 1.5 Mbit/s
MPEG-4 natural audio coding
MPEG-4 Systems and Description Languages: A Way Ahead in audio Visual Information Representation, The
MPEG-4: audio/video and synthetic graphics/audio for mixed media
MUGEN: A Playground for Video-audio-Text Multimodal Understanding and GENeration
Multi Event Localization by audio-Visual Fusion with Omnidirectional Camera and Microphone Array
Multi-beam steering for 3D audio rendering in linear phased loudspeaker arrays
Multi-Granularity Aggregation Transformer for Joint Video-audio-Text Representation Learning
Multi-level Fusion of audio and Visual Features for Speaker Identification
Multi-level Particle Filter Fusion of Features and Cues for audio-Visual Person Tracking
Multi-Modal Particle Filtering Tracking using Appearance, Motion and audio Likelihoods
Multi-Speaker Tracking From an audio-Visual Sensing Device
Multi-step Coding Structure of Spatial audio Object Coding
Multi-Task Adapters for On-Device audio Inference
Multi-View Video and Multi-Channel audio Broadcasting System
Multichannel audio Coding Based on Analysis by Synthesis
Multimedia, audio-Visual Communications, Survey
Multimodal and Multi-task audio-Visual Vehicle Detection and Classification
Multimodal Approach for Percussion Music Transcription from audio and Video, A
Multimodal emotion recognition using cross modal audio-video fusion with attention and deep metric learning
Multimodal framework based on audio-visual features for summarisation of cricket videos
Multimodal fusion of audio, scene, and face features for first impression estimation
Multimodal Music Mood Classification by Fusion of audio and Lyrics
Multimodal Person Recognition Using Unconstrained audio and Video
Multimodal Processing and Interaction: audio, Video, Text
Multimodal Saliency Model for Videos With High audio-Visual Correspondence, A
Multimodal speaker identification with audio-video processing
Multimodal tracking and classification of audio-visual features
Multimodal Variational Auto-encoder based audio-Visual Segmentation
Multimodal (audio, Facial and Gesture) based Emotion Recognition challenge
Multiple Scrambling and Adaptive Synchronization for audio Watermarking
Multiple Speaker Tracking in Spatial audio via PHD Filtering and Depth-Audio Fusion
Multipurpose audio watermarking
Multistandard Digital HD/SD audio Multiplexer With Modular Ancillary Packet Substitution, A
Multivariate mutual information for audio video fusion
Music Popularity: Metrics, Characteristics, and audio-Based Prediction
NDFT-based audio Watermarking Scheme with High Security
Neural network based reinforcement learning for audio-visual gaze control in human-robot interaction
Neural Voice Puppetry: audio-driven Facial Reenactment
New algorithm for searching minimum bit rate wavelet representations with application to multiresolution-based perceptual audio coding
New Approach to Integrate audio and Visual Features of Speech, A
New audio Watermarking for Copyright Protection and Content Authentication, A
New matching pursuit based sinusoidal modelling method for audio coding
new multi-purpose audio-visual UNMC-VIER database with multiple variabilities, A
No-Reference model for Detecting audio Artifacts using Pretrained Audio Neural Networks, A
Noise Adaptive Stream Weighting in audio-Visual Speech Recognition
Noise-Free audio Signal Processing in Noisy Environment: A Hardware and Algorithm Solution
Nonnegative OPLS for Supervised Design of Filter Banks: Application to Image and audio Feature Extraction
novel 3D audio display system using radiated loudspeaker for future 3D multimodal communications, A
Novel Anti-Collusion audio Fingerprinting Scheme Based on Fourier Coefficients Reversing, A
Novel audio Feature Projection Using KDLPCCA-Based Correlation with EEG Features for Favorite Music Classification
Novel audio Features for Music Emotion Recognition
novel efficient approach for audio segmentation, A
Novel Lip Descriptor for audio-Visual Keyword Spotting Based on Adaptive Decision Fusion, A
novel perceptual feature set for audio emotion recognition, A
Novel Representation of Bioacoustic Events for Content-Based Search in Field audio Data, A
Novel Steganalysis of Steghide Focused on High-Frequency Region of audio Waveform, A
Object Category Detection Using audio-Visual Cues
Omnidirectional Information Gathering for Knowledge Transfer-based audio-Visual Navigation
On the audio-visual Synchronization for Lip-to-Speech Synthesis
On the Correlation of Automatic audio and Visual Segmentations of Music Videos
On the Effect of Observed Subject Biases in Apparent Personality Analysis From audio-Visual Signals
On the Use of Locality Sensitive Hashing for audio Following
On-line adaptive background modelling for audio surveillance
Online Cross-Modal Adaptation for audio-Visual Person Identification With Wearable Cameras
Online Spectrogram Inversion for Low-Latency audio Source Separation
Open-Set Recognition and Few-Shot Learning Dataset for audio Event Classification in Domestic Environments, An
Open-Source Practices for Music Signal Processing Research: Recommendations for Transparent, Sustainable, and Reproducible audio Research
Optimized recursive subband synthesis windowing for implementing efficient MPEG audio decoders
Optimizing a High-Order Graphic Equalizer for audio Processing
Optimum Design of Multistage Multirate FIR Filter for audio Signal Sampling Rate Conversion via a Genetic Algorithm Approach, The
Ornithologist's Guide for Including Machine Learning in a Workflow to Identify a Secretive Focal Species from Recorded audio, An
Other Related Papers, audio, Speech, Signal Processing, Pattern Recognition
Overview of MPEG-7 audio
Overview on Perceptually Motivated audio Indexing and Classification, An
Pano-AVQA: Grounded audio-Visual Question Answering on 360° Videos
Parametric Implicit Face Representation for audio-Driven Facial Reenactment
Patra: A Novel Document Architecture for Integrating Handwriting with audio-Visual Information
Perception-Aware Cross-Modal Signal Reconstruction: From audio-Haptic to Visual
Perceptual audio data concealment and watermarking scheme using direct frequency domain substitution
Perceptual audio Watermarking by Learning in Wavelet Domain
Perceptual Coding of High-Quality Digital audio
Perceptual criterion based fragile audio watermarking using adaptive wavelet packets
Perceptual Lossless Quantization of Spatial Parameter for 3D audio Signals, The
Perceptual-based quality assessment for audio-visual services: A survey
Performance-Based Interpreter Identification in Saxophone audio Recordings
Performances of low-level audio classifiers for large-scale music similarity
Person Tracking Using audio and Depth Cues
Person Tracking with audio-Visual Cues Using the Iterative Decoding Framework
Personal Sound Zones: Delivering interface-free audio to multiple listeners
Phase-Entrained Particle Filter for audio-Locomotion Synchronization, A
phone-viseme dynamic Bayesian network for audio-visual automatic speech recognition, A
Photorealistic adaptation and interpolation of facial expressions using HMMs and AAMs for audio-visual speech synthesis
Pose-Controllable Talking Face Generation by Implicitly Modularized audio-Visual Representation
Positive Sample Propagation along the audio-Visual Event Line
Power of Sound (TPoS): audio Reactive Video Generation with Stable Diffusion, The
Pre-Training audio Representations With Self-Supervision
Predicting audio-visual salient events based on visual, audio and text modalities for movie summarization
Prediction of the Leadership Style of an Emergent Leader Using audio and Visual Nonverbal Features
Primal-dual algorithms for audio decomposition using mixed norms
Probabilistic Kernels for Improved Text-to-Speech Alignment in Long audio Tracks
Proposed Integration Algorithm to Optimize the Separation of audio Signals Using the Ica and Wavelet Transform
Prosodic, Spectral and Voice Quality Feature Selection Using a Long-Term Stopping Criterion for audio-Based Emotion Recognition
Psychoacoustic Calibration of Loss Functions for Efficient End-to-End Neural audio Coding
Pyramid Based Interpolation for Face-Video Playback in audio Visual Recognition
QUALIFIER: Question-Guided Self-Attentive Multimodal Fusion Network for audio Visual Scene-Aware Dialog
RAV: Learning-Based Adaptive Streaming to Coordinate the audio and Video Bitrate Selections
Ray-Space-Based Multichannel Nonnegative Matrix Factorization for audio Source Separation
Real-Time audio-Guided Multi-Face Reenactment
Real-time Demonstration of Personal audio and 3d Audio Rendering Using Line Array Systems
Real-Time Lip Tracking for audio-Visual Speech Recognition Applications
Real-Time monophonic and polyphonic audio classification from power spectra
Real-Time MPEG-1 audio Coding and Decoding on a DSP Chip
Real-Time Perceptual Model for Distraction in Interfering audio-on-Audio Scenarios
real-time system for audio source localization with cheap sensor device, A
Real-Time User Position Estimation in Indoor Environments Using Digital Watermarking for audio Signals
Realistic Human Action Recognition with audio Context
Recalibrated Bandpass Filtering on Temporal Waveform for audio Spoof Detection
Recognizing High-level audio-visual Concepts Using Context
Recovering audio-to-video synchronization by audiovisual correlation analysis
Recovery of audio-to-video synchronization through analysis of cross-modality correlation
Reliable detection of audio events in highly noisy environments
Remote audio/video acquisition for human signature detection
Representation and linking mechanisms for audio in MPEG-7
Representations of the Complex-Valued Frequency-Domain LPC for audio Coding
Resilience Mask for Robust audio Hashing, A
Rethink Cross-Modal Fusion in Weakly-Supervised audio-Visual Video Parsing
Reverberant audio Source Separation via Sparse and Low-Rank Modeling
Reversible and Robust audio Watermarking Based on Quantization Index Modulation and Amplitude Expansion
Reversible and Robust audio Watermarking Based on Spread Spectrum and Amplitude Expansion
Reversible audio Data Hiding Based on Variable Error-Expansion of Linear Prediction for Segmental Audio and G.711 Speech
Reversible audio Information Hiding Based on Integer DCT Coefficients with Adaptive Hiding Locations
Review of Automatic Fault Diagnosis Systems Using audio and Vibration Signals
Right to Talk: An audio-Visual Transformer Approach, The
RNN-Based Speech-Music Discrimination Used for Hybrid audio Coder, An
Robot Command Interface Using an audio-Visual Speech Recognition System
Robust audio Fingerprint's Based Identification Method, A
Robust audio Patch Attacks Using Physical Sample Simulation and Adversarial Patch Noise Generation
robust audio searching method for cellular-phone-based music information retrieval, A
Robust audio Watermarking Based on Log-Polar Frequency Index
Robust audio Watermarking Based on Low-Order Zernike Moments
Robust audio watermarking based on multi-carrier modulation
Robust audio Watermarking by Using Low-Frequency Histogram
Robust audio Watermarking Scheme Based on Lifting Wavelet Transform and Singular Value Decomposition, A
Robust audio Watermarking Using Both DWT and Masking Effect
Robust audio Watermarking Using Perceptual Masking
Robust audio Zero-Watermark Based on LWT and Chaotic Modulation
Robust audio-Visual Instance Discrimination
Robust audio-Visual Mandarin Speech Recognition Based On Adaptive Decision Fusion And Tone Features
Robust audio-Visual Speech Recognition Based on Hybrid Fusion
Robust audio-Visual Speech Recognition Based on Late Integration
Robust audio-Visual Speech Recognition Under Noisy Audio-Video Conditions
Robust AVS audio Watermarking
robust digital audio watermarking based on statistics characteristics, A
Robust Estimation of Amplitude Modification for Scalar Costa Scheme Based audio Watermark Detection
Robust Frequency Domain audio Watermarking: A Tuning Analysis
Robust Hiding of Fingerprint-Biometric Data into audio Signals
Robust One Shot audio to Video Generation
Robust Sensor Fusion: Analysis and Application to audio-Visual Speech Recognition
Robust, Blindly-Detectable, and Semi-Reversible Technique of audio Watermarking Based on Cochlear Delay Characteristics
Robustness of Multiplexing Protocols for audio-Visual Services Over Wireless Networks
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized audio-Driven Single Image Talking Face Animation
Scalability Analysis of audio-Visual Person Identity Verification
Scalable audio coding for compression and loss resilient streaming
scale-free distribution of false positives for a large class of audio similarity measures, A
Scene Change Detection Based on audio-Visual Analysis and Interaction
Score-Informed Source Separation for Musical audio Recordings: An overview
Scream and gunshot detection and localization for audio-surveillance systems
Search the audio, Browse the Video: A Generic Paradigm for Video Collections
Secure spread spectrum watermarking for images, audio and video
Segmental DCT Coefficient Reversal Based Anti-Collusion audio Fingerprinting Mechanism
Selective Background Adaptation Based Abnormal Acoustic Event Recognition for audio Surveillance
Self-Supervised audio Spatialization with Correspondence Classifier
Self-Supervised Contrastive Learning for audio-Visual Action Recognition
Self-Supervised Fine-Grained Cycle-Separation Network (FSCN) for Visual-audio Separation
Self-Supervised Learning of audio Representations From Permutations With Differentiable Ranking
Self-supervised Learning of audio-visual Objects from Video
Self-supervised object detection from audio-visual correspondence
Self-Supervised Video Forensics by audio-Visual Anomaly Detection
Semantic and Relation Modulation for audio-Visual Event Localization
Semantic audio-Visual Navigation
Semantic Context Detection Using audio Event Fusion
Semantic Indexing of Multimedia Content Using Visual, audio, and Text Cues
Semantic indexing of soccer audio-visual sequences: A multimodal approach based on controlled Markov chains
Semantic indexing of sports program sequences by audio-visual analysis
Semantic Learning for audio Applications: A Computer Vision Approach
Semantic Video Retrieval Using audio Analysis
Semantic-Aware Implicit Neural audio-Driven Video Portrait Generation
Sensor and Data Systems, audio-Assisted Cameras and Acoustic Doppler Sensors
Sep-stereo: Visually Guided Stereophonic audio Generation by Associating Source Separation
Separation of audio-Visual Speech Sources: A New Approach Exploiting the Audio-Visual Coherence of Speech Stimuli
Separation of audio-Visual Speech Sources: A New Approach Exploiting the Audio-Visual Coherence of Speech Stimuli
SEWA DB: A Rich Database for audio-Visual Emotion and Sentiment Research in the Wild
Shot genre classification using compressed audio-visual features
Signal-Aware Parametric Quality Model for audio and Speech over IP Networks
Simple and Efficient method for Dubbed audio Sync Detection using Compressive Sensing, A
Simple Baseline for audio-Visual Scene-Aware Dialog, A
Single-modal Incremental Terrain Clustering from Self-Supervised audio-Visual Feature Learning
Sinusoidal modelling using perceptual matching pursuits in the bark scale for parametric audio coding
Sleep Apnea Detection via Depth Video and audio Feature Learning
SMART-I2: Spatial Multi-user audio-visual Real-time interactive interface, A broadcast application context
SNR-Constrained Heuristics for Optimizing the Scaling Parameter of Robust audio Watermarking
Sociometry Based Multiparty audio Recordings Summarization
Sonic Trampoline: How audio Feedback Impacts the User's Experience of Jumping
Sound event detection in real-life audio using joint spectral and temporal features
Sound Quality Evaluation for audio Watermarking Based on Phase Shift Keying Using BCH Code
Sound to Visual Scene Generation by audio-to-Visual Latent Alignment
Sound Transformation: Applying Image Neural Style Transfer Networks to audio Spectograms
Soundspaces: audio-visual Navigation in 3d Environments
Sparse representation of audio features for sputum detection from lung sounds
Spatial audio Object Coding With Two-Step Coding Structure for Interactive Audio Service
Spatial audio Object Coding With Two-Step Coding Structure for Interactive Audio Service
Spatial misregistration of virtual human audio: Implications of the precedence effect
Speaker and Digit Recognition by audio-Visual Lip Biometrics
Speaker dependent video indexing based on audio-visual interaction
Speaker Independent audio-Visual Speech Recognition
Speaker Tracking Algorithm Based on audio and Visual Information Fusion Using Particle Filter, A
Special Issue on audio-Based and Video-Based Person Authentication
Speech Activity Detection in Naturalistic audio Environments: Fearless Steps Apollo Corpus
Speech driven video editing via an audio-conditioned diffusion model
Speech Personality Recognition Based on Annotation Classification Using Log-Likelihood Distance and Extraction of Essential audio Features
Speech-assisted lip synchronization in audio-visual communications
Speech/Music Classification Based on Distributed Evolutionary Fuzzy Logic for Intelligent audio Coding
Spoken Moments: Learning Joint audio-Visual Representations from Video Descriptions
Spontaneous Driver Emotion Facial Expression (DEFE) Dataset for Intelligent Vehicles: Emotions Triggered by Video-audio Clips in Driving Scenarios, A
Spotting audio-Visual Inconsistencies (SAVI) in Manipulated Video
Stacked Sparse Autoencoder for audio Object Coding
Statistical Lip-Appearance Models Trained Automatically Using audio Information
Steganalysis Scheme for AAC audio Based on MDCT Difference Between Intra and Inter Frame, A
Structuring Soccer Video Based on audio Classification and Segmentation Using Hidden Markov Model
Strumming to the Beat: audio-Conditioned Contrastive Video Textures
Study of Subjective and Objective Quality Assessment of audio-Visual Signals
Subjective and Objective audio-Visual Quality Assessment for User Generated Content
Summarizing Long-Length Videos with GAN-Enhanced audio/Visual Features
Supplementary Material: AVA-ActiveSpeaker: An audio-Visual Dataset for Active Speaker Detection
Survey of Affect Recognition Methods: audio, Visual, and Spontaneous Expressions, A
Survey of audio-Based Music Classification and Annotation, A
Survey of compressed-domain features used in audio-visual indexing and analysis
survey of MPEG-1 audio, video and semantic analysis techniques, A
SVD-Based Adaptive QIM Watermarking on Stereo audio Signals
SVGC-AVA: 360-Degree Video Saliency Prediction With Spherical Vector-Based Graph Convolution and audio-Visual Attention
SVM-Based audio Classification for Content-Based Multimedia Retrieval
Synchronization of Multiple Camera Videos Using audio-Visual Features
Synchronization of Processed audio-Video Signals using Time-Stamps
Synchronized audio-Visual Frames with Fractional Positional Encoding for Transformers in Video-to-Text Translation
syntactic framework for bitstream-level representation of audio-visual objects, A
Synthesizing Photo-Realistic 3D Talking Head: Learning Lip Synchronicity and Emotion from audio and Video
Synthesizing Talking Faces from Text and audio: An Autoencoder and Sequence-to-Sequence Convolutional Neural Network
Synthetic and SNHC audio in MPEG-4
System and method for audio/video speaker detection
System and method for skimming digital audio/video data
System and process for adding high frame-rate current speaker data to a low frame-rate video using audio watermarking techniques
Tackling the Cover Source Mismatch Problem in audio Steganalysis With Unsupervised Domain Adaptation
Talking Head Generation with Probabilistic audio-to-Visual Diffusion Priors
Talking With Hands 16.2M: A Large-Scale Dataset of Synchronized Body-Finger Motion and audio for Conversational Motion Analysis and Synthesis
Taming Diffusion Models for audio-Driven Co-Speech Gesture Generation
TCD-TIMIT: An audio-Visual Corpus of Continuous Speech
Teaching Practices Analysis Through audio Signal Processing
Teleimmersive audio-Visual Communication Using Commodity Hardware
Temporal and Cross-modal Attention for audio-Visual Zero-Shot Learning
Temporal Bayesian Fusion for Affect Sensing: Combining Video, audio, and Lexical Modalities
Temporal Cue Guided Video Highlight Detection with Low-Rank audio-Visual Fusion
Temporal Envelope Fit of Transient audio Signals
Tests on MPEG-4 audio codec proposals
Texturedness decision time for audio texturedness indicator
theoretical analysis of a buffer frame size conversion algorithm for audio applications ensuring minimum latency, A
Three-Dimensional Speaker Localization: audio-Refined Visual Scaling Factor Estimation
Time-Delay Neural Networks for Estimating Lip Movements from Speech Analysis: A Useful Tool in audio Video Synchronization
Time-frequency analysis for audio event detection in real scenarios
Tolerance Evaluation of audio Watermarking Method Based on Modification of Sound Pressure Level between Channels
TOOTEKO: A Case Study of Augmented Reality for an Accessible Cultural Heritage. Digitization, 3D Printing and Sensors for an audio-Tactile Experience
Toward Automating Oral Presentation Scoring During Principal Certification Program Using audio-Video Low-Level Behavior Profiles
Towards affect-aware vehicles for increasing safety and comfort: recognising driver emotions from audio recordings in a realistic driving study
Towards an End-to-End Visual-to-Raw-audio Generation With GAN
Towards audio-Visual On-line Diarization Of Participants In Group Meetings
Towards audio-Visual Saliency Prediction for Omnidirectional Video with Spatial Audio
Towards audio-Visual Saliency Prediction for Omnidirectional Video with Spatial Audio
Towards event detection in an audio-based sensor network
Towards Intercultural Affect Recognition: audio-Visual Affect Recognition in the Wild Across Six Cultures
Tracking Multiple audio Sources With the von Mises Distribution and Variational EM
Tracking the Active Speaker Based on a Joint audio-Visual Observation Model
Transcribing broadcast news for audio and video indexing
TUM Gait from audio, Image and Depth (GAID) database: Multimodal recognition of subjects and traits, The
two level classifier process for audio segmentation, A
Two-Level Bimodal Association for audio-Visual Speech Recognition
Two-level Method for Unsupervised Speaker-based audio Segmentation, A
UAVM: Towards Unifying audio and Visual Models
Ultra wide band audio visual PHY IEEE 802.15.3c for SPIHT-compressed image transmission
Unified audio-Visual Saliency Model for Omnidirectional Videos With Spatial Audio
Unified audio-Visual Saliency Model for Omnidirectional Videos With Spatial Audio
Unified Multisensory Perception: Weakly-supervised audio-visual Video Parsing
Unifying Background Models over Complex audio using Entropy
Unsupervised audio-Visual Lecture Segmentation
Unsupervised Cross-Modal Deep-Model Adaptation for audio-Visual Re-identification with Wearable Cameras
Unsupervised Sound Source Localization From audio-Image Pairs Using Input Gradient Map
Unsupervised Synthetic Acoustic Image Generation for audio-Visual Scene Understanding
use of audio-Visual Description Profile in 3D video content description, The
Using audio-Derived Affective Offset to Enhance TV Recommendation
Using background audio change detection for segmenting video
Using mel-frequency audio features from footstep sound and spatial segmentation techniques to improve frame-based moving object detection
Using the audio Respiration Signal for Multimodal Discrimination of Expressive Movement Qualities
Using Three Reassigned Spectrogram Patches and Log-Gabor Filter for audio Surveillance Application
utility of MPEG-7 systems in audio-visual applications with multiple streams, The
Va2mass: Towards the Fluid Filling Mass Estimation via Integration of Vision and audio Learning
VALID: A New Practical audio-Visual Database, and Comparative Results
Variational Bayes Adapted GMM Based Models for audio Clip Classification
Variational Bayesian Inference for audio-Visual Tracking of Multiple Speakers
Very low bit-rate audio-visual applications
Video Augmentation for Improving audio Speech Recognition under Noise
Video clip recognition using joint audio-visual processing model
Video concept detection by audio-visual grouplets
Video Rewrite: Driving Visual Speech with audio
Video Scene Segmentation Using Video and audio Features
Video Segmentation with the Assistance of audio Content Analysis
Video Skimming for Quick Browsing based on audio and Image Characterization
Video Summarization using MPEG-7 Motion Activity and audio Descriptors
Video tracking through occlusions by fast audio source localisation
Video/audio Quality in Computer Vision
Violent Video Recognition Based on Global-Local Visual and audio Contrastive Learning
Violin Timbre Navigator: Real-Time Visual Feedback of Violin Bowing Based on audio Analysis and Machine Learning
Virtual audio system customization using visual matching of ear parameters
Virtual Talk: A Model-Based Virtual Phone Using a Layered audio-Visual Integration
Vision Transformers are Parameter-Efficient audio-Visual Learners
Vision-Infused Deep audio Inpainting
Visual Music Transcription of Clarinet Video Recordings Trained with audio-Based Labelled Data
Visual Scene Graphs for audio Source Separation
Visual Signal Reliability for Robust audio-Visual Speaker Identification, A
Visually Guided audio Source Separation with Meta Consistency Learning
Visually Informed Binaural audio Generation without Binaural Audios
Visually Informed Binaural audio Generation without Binaural Audios
Visually-Guided audio Spatialization in Video with Geometry-Aware Multi-task Learning
VisualVoice: audio-Visual Speech Separation with Cross-Modal Consistency
Voice Activity Detection Using Wavelet-Based Multiresolution Spectrum and Support Vector Machines and audio Mixing Algorithm
VoViT: Low Latency Graph-Based audio-Visual Voice Separation Transformer
Watch or Listen: Robust audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring
Watermarking for Digital audio Based on Adaptive Phase Modulation
Waveprint: Efficient wavelet-based audio fingerprinting
Weakly Supervised audio-Visual Violence Detection
Weakly-Supervised Action Detection Guided by audio Narration
Which are the factors affecting the performance of audio surveillance systems?
Wnet: audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks
X2Face: A Network for Controlling Face Generation Using Images, audio, and Pose Codes
You Said That?: Synthesising Talking Faces from audio
YouTube Movie Reviews: Sentiment Analysis in an audio-Visual Context
1012 for audio

_audio2gesture_
audio2gestures: Generating Diverse Gestures from Speech Audio with Conditional Variational Autoencoders

_audioclip_
AVE-CLIP: audioclip-based Multi-window Temporal Transformer for Audio Visual Event Localization

_audioframe_
Enhanced Eigen-audioframes for Audiovisual Scene Change Detection

_audiofrequency_
Modeling of audiofrequency Track Circuits for Validation, Tuning, and Conducted Interference Prediction

_audiogame_
audiogames: Accessibility and Inclusion in Digital Entertainment

_audiology_
Something to Talk About: Signal Processing in Speech and audiology Research: Promising Investigations Explore New Opportunities in Human Communication

_audiometries_
Automatic Eye Gesture Recognition in audiometries for Patients with Cognitive Decline

_audiometry_
Automatic Analysis of the Patient's Conscious Responses to the Emission of Auditory Stimuli during the Performance of an audiometry

_audiophonic_
Hybrid representations for audiophonic signal encoding

_audioscopev2_
audioscopev2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation

_audiotory_
audiotory Movie Summarization by Detecting Scene Changes and Sound Events

_audioviewer_
audioviewer: Learning to Visualize Sounds

_audiovisual_
ABAW5 Challenge: A Facial Affect Recognition Approach Utilizing Transformer Encoder and audiovisual Fusion
ACADI showcase: Automatic character indexing in audiovisual document
Adaptation of audiovisual Contents and Their Delivery Means
Adding Semantics to audiovisual Content: The FAETHON Project
Amazigh audiovisual speech recognition system design
APES: audiovisual Person Search in Untrimmed Video
Audio-Visual Emotion, audiovisual Emotion Recognition
audiovisual Analysis of Music Performances: Overview of an Emerging Field
audiovisual and contextual approach for categorical and continuous emotion recognition in-the-wild, An
audiovisual Annotation Procedure for Multi-view Field Recordings
audiovisual attention model for natural conversation scenes, An
audiovisual Conflict Detection in Political Debates
audiovisual Dependency Attention for Violence Detection in Videos
audiovisual Discrimination Between Speech and Laughter: Why and When Visual Information Might Help
audiovisual emotion recognition in wild
audiovisual event detection towards scene understanding
audiovisual Fusion: Challenges and New Approaches
audiovisual Generalised Zero-shot Learning with Cross-modal Attention and Language
audiovisual Gestalts
audiovisual Information Fusion in Human-Computer Interfaces and Intelligent Environments: A Survey
audiovisual Information Fusion in Human-Computer Interfaces and Intelligent Environments: A Survey
audiovisual information management system
audiovisual integration with Segment Models for tennis video parsing
audiovisual Liveness Detection
audiovisual localization of multiple speakers in a video teleconferencing setting
audiovisual Masked Autoencoders
audiovisual Quality Components
audiovisual quality fusion based on relative multimodal complexity
audiovisual Spatial-Audio Analysis by Means of Sound Localization and Imaging: A Multimedia Healthcare Framework in Abdominal Sound Mapping
audiovisual Speech Source Separation: An overview of key methodologies
audiovisual Talking Head for Augmented Speech Generation: Models and Animations Based on a Real Speaker's Articulatory Data, An
audiovisual Tracking Using STAC Sensors
audiovisual Transformer with Instance Attention for Audio-visual Event Localization
audiovisual voice activity detection using off-the-shelf cameras
Automatic audiovisual behavior descriptors for psychological disorder analysis
Automatic creation and evaluation of MPEG-7 compliant summary descriptions for generic audiovisual content
Automatic personality prediction from audiovisual data using random forest regression
Bayesian Hierarchical Regression Models for QoE Estimation and Prediction in audiovisual Communications
behaviorally inspired fusion approach for computational audiovisual saliency modeling, A
Being bored? Recognising natural interest by extensive audiovisual integration for real-life application
Blind audiovisual Source Separation Based on Sparse Redundant Representations
Blind audiovisual Source Separation using Sparse Representations
Boosting the Level of Immersion: Integrating Stereoscopic Output into Interactive audiovisual Applications
Class-Aware Sounding Objects Localization via audiovisual Correspondence
Computer-Assisted audiovisual Language Learning
Content-Based Analysis Improves audiovisual Archive Retrieval
Context-Sensitive Learning for Enhanced audiovisual Emotion Classification
Cross-task Transfer for Geotagged audiovisual Aerial Scene Recognition
Deep Impression: audiovisual Deep Residual Networks for Multimodal Apparent Personality Trait Recognition
Deep Multimodal Clustering for Unsupervised audiovisual Learning
Development of an Integrated audiovisual Coding Standard: MPEG, The
Dynamic organization of audiovisual database using a user-defined similarity measure based on low-level features
Emotional Expressions In audiovisual Human Computer Interaction
End-to-End audiovisual Speech Recognition System With Multitask Learning
Enhanced Eigen-Audioframes for audiovisual Scene Change Detection
Estimation of affective dimensions using CNN-based features of audiovisual data
Experienced audiovisual quality for mobile 3D television
Exploring Cross-Modality Affective Reactions for audiovisual Emotion Recognition
Exploring Fine-Grained audiovisual Categorization with the SSW60 Dataset
Finding Speaker Face Region by audiovisual Correlation
Funnynet: audiovisual Learning of Funny Moments in Videos
graphical model for audiovisual object tracking, A
Hemiface mirroring: a new approach to reducing bandwidth requirements of audiovisual telecommunication
Hierarchical method and system for object-based audiovisual descriptive tagging of images for information retrieval, editing, and manipulation
How Does My 3D Video Sound Like? Impact of Loudspeaker Set-Ups on audiovisual Quality on Mid-Sized Autostereoscopic Display
Impact of Packet Loss and Google Congestion Control on QoE for WebRTC-Based Mobile Multiparty audiovisual Telemeetings, The
Impairment-Factor-Based audiovisual Quality Model for IPTV: Influence of Video Resolution, Degradation Type, and Content Type
Improving Cluster Selection and Event Modeling in Unsupervised Mining for Automatic audiovisual Video Structuring
Improving Speech Related Facial Action Unit Recognition by audiovisual Information Fusion
Indexing audiovisual databases through joint audio and video processing
Interaction recognition in wide areas using audiovisual sensors
Intra- and Inter-Emotion Transformer-Based Fusion Model with Homogeneous and Diverse Constraints Using Multi-Emotional audiovisual Features for Depression Detection, An
Investigation into audiovisual Speech Correlation in Reverberant Noisy Environments, An
IP-Based Mobile and Fixed Network audiovisual Media Services
Iterative, maximally probable, batch-mode commercial detection for audiovisual content
Low-Complexity Parabolic Lip Contour Model With Speaker Normalization for High-Level Feature Extraction in Noise-Robust audiovisual Speech Recognition, A
LSTM-Modeling of continuous emotions in an audiovisual affect recognition framework
Modeling and delivering heterogeneous audiovisual content for group consumption
Movie summarization based on audiovisual saliency detection
MPEG-7: A standardised description of audiovisual content
Multimodaltrace: Deepfake Detection using audiovisual Representation Learning
Mutual Alignment between audiovisual Features for End-to-End Audiovisual Speech Recognition
Mutual Alignment between audiovisual Features for End-to-End Audiovisual Speech Recognition
new validated method for improving the audiovisual spatial congruence in the case of stereoscopic-3D video and wave field synthesis, A
OuluVS2: A multi-view audiovisual database for non-rigid mouth motion analysis
OWL (Observe, Watch, Listen): audiovisual Temporal Context for Localizing Actions in Egocentric Videos
PACS: A Dataset for Physical audiovisual CommonSense Reasoning
Prediction of asynchronous dimensional emotion ratings from audiovisual and physiological data
Prediction-Based audiovisual Fusion for Classification of Non-Linguistic Vocalisations
Prediction-based classification for audiovisual discrimination between laughter and speech
Probabilistic Models and Informative Subspaces for audiovisual Correspondence
Produced Quality is Not Perceived Quality: A Qualitative Approach to Overall audiovisual Quality
Pushing the boundaries of audiovisual word recognition using Residual Networks and LSTMs
QoE Evaluation of Multimedia Services Based on audiovisual Quality and User Interest
Quality-Based Score Normalization for audiovisual Person Authentication
Reading comprehension by content-related audiovisual feedback in children's story application (app)
Real-Time audiovisual Feature Extraction for Online Service Provision over DVB Streams
Real-time Sound Source Localization Based on audiovisual Frequency Integration
Realistic Face Animation for audiovisual Speech Applications: A Densification Approach Driven by Sparse Stereo Meshes
Realistic Transformation of Facial and Vocal Smiles in Real-Time audiovisual Streams
Recent advances in the automatic recognition of audiovisual speech
Recognizing Human Emotional State From audiovisual Signals
Recognizing Human Emotional State From audiovisual Signals
Recommender System for Sport Videos Based on User audiovisual Consumption
Recovering audio-to-video synchronization by audiovisual correlation analysis
Robust audiovisual Emotion Recognition: Aligning Modalities, Capturing Temporal Information, and Handling Missing Features
scalable multi-view audiovisual entertainment framework with content-aware distribution, A
Self-attention fusion for audiovisual emotion recognition with incomplete data
Self-Supervised Learning for Heterogeneous audiovisual Scene Analysis
Similarity-Based Approach for audiovisual Document Classification Using Temporal Relation Analysis, A
Smooth Nonnegative Matrix Factorization for Unsupervised audiovisual Document Structuring
Special issue on digital signal processing in audiovisual communication
STAViS: Spatio-Temporal audiovisual Saliency Network
Strategies for multiple feature fusion with Hierarchical HMM: Application to activity recognition from wearable audiovisual sensors
Strength modelling for real-world automatic continuous affect recognition from audiovisual signals
String-based audiovisual fusion of behavioural events for the assessment of dimensional affect
Summarizing audiovisual Contents of a Video Program
Syncing Shared Multimedia through audiovisual Bimodal Segmentation
Taylor: Impersonation of AI for audiovisual Content Documentation and Search
Temporal Multimodal Learning in audiovisual Speech Recognition
Temporal Video Segmentation to Scenes Using High-Level audiovisual Features
Time-Continuous audiovisual Fusion with Recurrence vs Attention for In-The-Wild Affect Recognition
Today's and tomorrow's retrieval practice in the audiovisual archive
Two-Level Hierarchical Alignment for Semi-Coupled HMM-Based audiovisual Emotion Recognition With Temporal Course
Uncertainty-Aware audiovisual Activity Recognition Using Deep Bayesian Variational Inference
Using electroencephalography to analyze sleepiness due to low-quality audiovisual stimuli
Video preprocessing for audiovisual indexing
Voice activity detection and speaker localization using audiovisual cues
YouTube Lens: Crowdsourced Personality Impressions and audiovisual Analysis of Vlogs, The
129 for audiovisual

Last update: 27-Apr-24 12:10:16
Use price@usc.edu for comments.