Nagrani, A.
Standard Author Listing
with: Afouras, T.: Localizing Visual Sounds the Hard Way
with: Alahari, K.: Masking Modalities for Cross-modal Video Retrieval
with: Albanie, S.: Learnable PINs: Cross-modal Embeddings for Person Identity
with: Albanie, S.: Seeing Voices and Hearing Faces: Cross-Modal Biometric Ma...
with: Arnab, A.: End-to-end Generative Pretraining for Multimodal Video Capt...
with: Arnab, A.: Uncertainty-aware Weakly Supervised Action Detection from U...
with: Arnab, A.: UnLoc: A Unified Framework for Video Localization Tasks
with: Bain, M.: AutoAD II: The Sequel - Who, When, and What in Movie Audio D...
with: Bain, M.: AutoAD: Movie Description in Context
with: Bain, M.: Condensed Movies: Story Based Retrieval with Contextual Embe...
with: Bain, M.: Count, Crop and Recognise: Fine-Grained Recognition in the W...
with: Bain, M.: Frozen in Time: A Joint Video and Image Encoder for End-to-E...
with: Brown, A.: Condensed Movies: Story Based Retrieval with Contextual Emb...
with: Caron, M.: Verbs in Action: Improving verb understanding in video-lang...
with: Chen, H.: Localizing Visual Sounds the Hard Way
with: Damen, D.: EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric A...
with: Darrell, T.J.: TL;DW? Summarizing Instructional Videos with Task Relev...
with: Gabeur, V.: Masking Modalities for Cross-modal Video Retrieval
with: Ge, W.: UnLoc: A Unified Framework for Video Localization Tasks
with: Han, T.: AutoAD: Movie Description in Context
with: Han, T.D.: AutoAD II: The Sequel - Who, When, and What in Movie Audio ...
with: Hauth, A.: Learning Audio-Video Modalities from Image Captions
with: Kazakos, E.: EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric...
with: Laptev, I.: Vid2Seq: Large-Scale Pretraining of a Visual Language Mode...
with: Manen, S.: Learning Audio-Video Modalities from Image Captions
with: Miech, A.: Vid2Seq: Large-Scale Pretraining of a Visual Language Model...
with: Momeni, L.: Verbs in Action: Improving verb understanding in video-lan...
with: Narasimhan, M.: TL;DW? Summarizing Instructional Videos with Task Rele...
with: Pont Tuset, J.: Vid2Seq: Large-Scale Pretraining of a Visual Language ...
with: Rohrbach, A.: TL;DW? Summarizing Instructional Videos with Task Releva...
with: Ross, D.: Speech2Action: Cross-Modal Supervision for Action Recognition
with: Ross, D.: UnLoc: A Unified Framework for Video Localization Tasks
with: Rubinstein, M.: TL;DW? Summarizing Instructional Videos with Task Rele...
with: Schmid, C.: AVFormer: Injecting Vision into Frozen Speech Models for Z...
with: Schmid, C.: Composable Augmentation Encoding for Video Representation ...
with: Schmid, C.: End-to-end Generative Pretraining for Multimodal Video Cap...
with: Schmid, C.: Learning Audio-Video Modalities from Image Captions
with: Schmid, C.: Look Before you Speak: Visually Contextualized Utterances
with: Schmid, C.: Masking Modalities for Cross-modal Video Retrieval
with: Schmid, C.: Speech2Action: Cross-Modal Supervision for Action Recognit...
with: Schmid, C.: TL;DW? Summarizing Instructional Videos with Task Relevanc...
with: Schmid, C.: Uncertainty-aware Weakly Supervised Action Detection from ...
with: Schmid, C.: UnLoc: A Unified Framework for Video Localization Tasks
with: Schmid, C.: Verbs in Action: Improving verb understanding in video-lan...
with: Schmid, C.: Vid2Seq: Large-Scale Pretraining of a Visual Language Mode...
with: Schofield, D.: Count, Crop and Recognise: Fine-Grained Recognition in ...
with: Seo, P.H.: AVFormer: Injecting Vision into Frozen Speech Models for Ze...
with: Seo, P.H.: End-to-end Generative Pretraining for Multimodal Video Capt...
with: Seo, P.H.: Learning Audio-Video Modalities from Image Captions
with: Seo, P.H.: Look Before you Speak: Visually Contextualized Utterances
with: Seo, P.H.: Vid2Seq: Large-Scale Pretraining of a Visual Language Model...
with: Seybold, B.: Learning Audio-Video Modalities from Image Captions
with: Sivic, J.: Vid2Seq: Large-Scale Pretraining of a Visual Language Model...
with: Sukthankar, R.: Speech2Action: Cross-Modal Supervision for Action Reco...
with: Sun, C.: Composable Augmentation Encoding for Video Representation Lea...
with: Sun, C.: Learning Audio-Video Modalities from Image Captions
with: Sun, C.: Masking Modalities for Cross-modal Video Retrieval
with: Sun, C.: Speech2Action: Cross-Modal Supervision for Action Recognition
with: Sun, C.: TL;DW? Summarizing Instructional Videos with Task Relevance a...
with: Sun, C.: Uncertainty-aware Weakly Supervised Action Detection from Unt...
with: Tian, Y.L.: Composable Augmentation Encoding for Video Representation ...
with: Varol, G.: AutoAD II: The Sequel - Who, When, and What in Movie Audio ...
with: Varol, G.: AutoAD: Movie Description in Context
with: Varol, G.: Frozen in Time: A Joint Video and Image Encoder for End-to-...
with: Vedaldi, A.: Localizing Visual Sounds the Hard Way
with: Wang, Z.H.: UnLoc: A Unified Framework for Video Localization Tasks
with: Xie, W.: AutoAD II: The Sequel - Who, When, and What in Movie Audio De...
with: Xie, W.: AutoAD: Movie Description in Context
with: Xie, W.: Localizing Visual Sounds the Hard Way
with: Xiong, X.: UnLoc: A Unified Framework for Video Localization Tasks
with: Yan, S.: UnLoc: A Unified Framework for Video Localization Tasks
with: Yang, A.: Vid2Seq: Large-Scale Pretraining of a Visual Language Model ...
with: Zisserman, A.: AutoAD II: The Sequel - Who, When, and What in Movie Au...
with: Zisserman, A.: AutoAD: Movie Description in Context
with: Zisserman, A.: Condensed Movies: Story Based Retrieval with Contextual...
with: Zisserman, A.: Count, Crop and Recognise: Fine-Grained Recognition in ...
with: Zisserman, A.: EPIC-Fusion: Audio-Visual Temporal Binding for Egocentr...
with: Zisserman, A.: Frozen in Time: A Joint Video and Image Encoder for End...
with: Zisserman, A.: Learnable PINs: Cross-modal Embeddings for Person Ident...
with: Zisserman, A.: Localizing Visual Sounds the Hard Way
with: Zisserman, A.: Seeing Voices and Hearing Faces: Cross-Modal Biometric ...
with: Zisserman, A.: Speech2Action: Cross-Modal Supervision for Action Recog...
with: Zisserman, A.: Verbs in Action: Improving verb understanding in video-...
83 for Nagrani, A.
Nagrecha, K.
Standard Author Listing
with: Vasconcelos, N.M.: Gradient-based Algorithms for Machine Teaching
with: Wang, P.: Gradient-based Algorithms for Machine Teaching