Visual7W visual question answering,
Large-scale visual question answering (QA) dataset, with object-level
groundings and multimodal answers.
WWW Link.
Dataset, Visual Question Answering.
Liang, J.W.[Jun-Wei],
Jiang, L.[Lu],
Cao, L.L.[Liang-Liang],
Kalantidis, Y.[Yannis],
Li, L.J.[Li-Jia],
Hauptmann, A.G.[Alexander G.],
Focal Visual-Text Attention for Memex Question Answering,
PAMI(41), No. 8, August 2019, pp. 1893-1908.
IEEE DOI
1907
BibRef
Earlier: A1, A2, A3, A5, A6, Only:
Focal Visual-Text Attention for Visual Question Answering,
CVPR18(6135-6143)
IEEE DOI
1812
Task analysis, Knowledge discovery, Visualization, Grounding,
Metadata, Cognition, Photo albums, question answering,
memex.
Visualization, Videos, Computational modeling, Correlation.
BibRef
Riquelme, F.[Felipe],
de Goyeneche, A.[Alfredo],
Zhang, Y.D.[Yun-Dong],
Niebles, J.C.[Juan Carlos],
Soto, A.[Alvaro],
Explaining VQA predictions using visual grounding and a knowledge
base,
IVC(101), 2020, pp. 103968.
Elsevier DOI
2009
Deep Learning, Attention, Supervision, Knowledge Base,
Interpretability, Explainability
BibRef
Zhao, L.C.[Li-Chen],
Cai, D.G.[Dai-Gang],
Zhang, J.[Jing],
Sheng, L.[Lu],
Xu, D.[Dong],
Zheng, R.[Rui],
Zhao, Y.J.[Yin-Jie],
Wang, L.P.[Li-Peng],
Fan, X.[Xibo],
Toward Explainable 3D Grounded Visual Question Answering: A New
Benchmark and Strong Baseline,
CirSysVideo(33), No. 6, June 2023, pp. 2935-2949.
IEEE DOI
2306
Task analysis, Visualization, Annotations, Point cloud compression,
Solid modeling, Question answering (information retrieval),
vision and language on 3D scenes
BibRef
Zhu, L.J.[Liang-Jun],
Peng, L.[Li],
Zhou, W.N.[Wei-Nan],
Yang, J.L.[Jie-Long],
Dual-decoder transformer network for answer grounding in visual
question answering,
PRL(171), 2023, pp. 53-60.
Elsevier DOI
2306
Visual question answering, Answer grounding, Dual-decoder transformer
BibRef
Chen, K.[Kang],
Wu, X.Q.[Xiang-Qian],
VTQA: Visual Text Question Answering via Entity Alignment and
Cross-Media Reasoning,
CVPR24(27208-27217)
IEEE DOI Code:
WWW Link.
2410
Measurement, Visualization, Grounding, Computational modeling,
Natural languages, Fitting, Object detection,
dataset
BibRef
Di, S.Z.[Shang-Zhe],
Xie, W.[Weidi],
Grounded Question-Answering in Long Egocentric Videos,
CVPR24(12934-12943)
IEEE DOI
2410
Visualization, Grounding, Large language models, Pipelines,
Training data, Benchmark testing, Data models, egocentric vision,
video grounding
BibRef
Chen, C.Y.[Chong-Yan],
Anjum, S.[Samreen],
Gurari, D.[Danna],
VQA Therapy: Exploring Answer Differences by Visually Grounding
Answers,
ICCV23(15269-15279)
IEEE DOI Code:
WWW Link.
2401
BibRef
Le, T.M.[Thao Minh],
Le, V.[Vuong],
Gupta, S.[Sunil],
Venkatesh, S.[Svetha],
Tran, T.[Truyen],
Guiding Visual Question Answering with Attention Priors,
WACV23(4370-4379)
IEEE DOI
2302
Training, Visualization, Systematics, Grounding, Semantics,
Linguistics, Cognition, visual reasoning
BibRef
Khan, A.U.[Aisha Urooj],
Kuehne, H.[Hilde],
Gan, C.[Chuang],
da Vitoria Lobo, N.[Niels],
Shah, M.[Mubarak],
Weakly Supervised Grounding for VQA in Vision-Language Transformers,
ECCV22(XXXV:652-670).
Springer DOI
2211
BibRef
Gupta, K.[Kshitij],
Gautam, D.[Devansh],
Mamidi, R.[Radhika],
cViL: Cross-Lingual Training of Vision-Language Models using
Knowledge Distillation,
ICPR22(1734-1741)
IEEE DOI
2212
Training, Visualization, Analytical models, Pipelines, Transformers,
Question answering (information retrieval), Data models
BibRef
Li, Y.C.[Yi-Cong],
Wang, X.[Xiang],
Xiao, J.B.[Jun-Bin],
Ji, W.[Wei],
Chua, T.S.[Tat-Seng],
Invariant Grounding for Video Question Answering,
CVPR22(2918-2927)
IEEE DOI
2210
Visualization, Correlation, Grounding, Semantics, Predictive models,
Linguistics, Question answering (information retrieval), Vision + language
BibRef
Lu, X.P.[Xiao-Peng],
Fan, Z.[Zhen],
Wang, Y.[Yansen],
Oh, J.[Jean],
Rosé, C.P.[Carolyn P.],
Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling,
XSAnim21(2631-2639)
IEEE DOI
2112
Integrated optics, Visualization, Grounding,
Computational modeling, Knowledge discovery
BibRef
Khan, A.U.[Aisha Urooj],
Kuehne, H.[Hilde],
Duarte, K.[Kevin],
Gan, C.[Chuang],
Lobo, N.[Niels],
Shah, M.[Mubarak],
Found a Reason for me? Weakly-supervised Grounded Visual Question
Answering using Capsules,
CVPR21(8461-8470)
IEEE DOI
2111
Training, Visualization, Vocabulary, Grounding, Focusing, Detectors,
Knowledge discovery
BibRef
Selvaraju, R.R.,
Tendulkar, P.,
Parikh, D.,
Horvitz, E.,
Tulio Ribeiro, M.,
Nushi, B.,
Kamar, E.,
SQuINTing at VQA Models: Introspecting VQA Models With Sub-Questions,
CVPR20(10000-10008)
IEEE DOI
2008
Cognition, Task analysis, Visualization, Image color analysis,
Grounding, Text recognition, Computational modeling
BibRef
Gouthaman, K.V.,
Mittal, A.[Anurag],
Reducing Language Biases in Visual Question Answering with
Visually-grounded Question Encoder,
ECCV20(XIII:18-34).
Springer DOI
2011
BibRef
Tan, H.L.,
Leong, M.C.,
Xu, Q.,
Li, L.,
Fang, F.,
Cheng, Y.,
Gauthier, N.,
Sun, Y.,
Lim, J.H.,
Task-Oriented Multi-Modal Question Answering For Collaborative
Applications,
ICIP20(1426-1430)
IEEE DOI
2011
Task analysis, Collaboration, Grounding, Visualization, Cognition,
Training, Machine learning, question answering, corpora
BibRef
Selvaraju, R.R.,
Lee, S.,
Shen, Y.,
Jin, H.,
Ghosh, S.,
Heck, L.,
Batra, D.,
Parikh, D.,
Taking a HINT: Leveraging Explanations to Make Vision and Language
Models More Grounded,
ICCV19(2591-2600)
IEEE DOI
2004
gradient methods, image retrieval, natural language processing,
neural nets, question answering (information retrieval), HINT, Correlation
BibRef
Zhang, Y.,
Niebles, J.C.,
Soto, A.,
Interpretable Visual Question Answering by Visual Grounding From
Attention Supervision Mining,
WACV19(349-357)
IEEE DOI
1904
data mining, data visualisation, image representation,
learning (artificial intelligence)
BibRef