VQA: Visual Question Answering,
Dataset containing open-ended questions about images.
WWW Link.
Dataset, Visual Question Answering.
See also VQA: Visual Question Answering.
Visual7W visual question answering,
Large-scale visual question answering (QA) dataset, with object-level
groundings and multimodal answers.
WWW Link.
Dataset, Visual Question Answering.
Visual Genome,
Visual Genome is a dataset, a knowledge base, an ongoing effort to
connect structured image concepts to language.
WWW Link.
WWW Link.
Dataset, Visual Question Answering.
Wu, Y.C.[Yu-Chieh],
Yang, J.C.[Jie-Chi],
A Robust Passage Retrieval Algorithm for Video Question Answering,
CirSysVideo(18), No. 10, October 2008, pp. 1411-1421.
IEEE DOI
0811
BibRef
Wu, Y.C.[Yu-Chieh],
Lee, Y.S.[Yue-Shi],
Yang, J.C.[Jie-Chi],
Yen, S.J.[Show-Jane],
A New Passage Ranking Algorithm for Video Question Answering,
PSIVT06(563-572).
Springer DOI
0612
BibRef
Li, G.D.[Guang-Da],
Li, H.[Haojie],
Ming, Z.Y.[Zhao-Yan],
Hong, R.[Richang],
Tang, S.[Sheng],
Chua, T.S.[Tat-Seng],
Question Answering over Community-Contributed Web Videos,
MultMedMag(17), No. 4, October-December 2010, pp. 46-57.
IEEE DOI
1011
BibRef
Song, Y.C.[Yi-Cheng],
Li, H.[Haojie],
Mash-Up Approach for Web Video Category Recommendation,
PSIVT10(197-202).
IEEE DOI
1011
BibRef
Agrawal, A.[Aishwarya],
Lu, J.[Jiasen],
Antol, S.[Stanislaw],
Mitchell, M.[Margaret],
Zitnick, C.L.[C. Lawrence],
Parikh, D.[Devi],
Batra, D.[Dhruv],
VQA: Visual Question Answering,
IJCV(123), No. 1, May 2017, pp. 4-31.
Springer DOI
1705
BibRef
Malinowski, M.[Mateusz],
Rohrbach, M.[Marcus],
Fritz, M.[Mario],
Ask Your Neurons: A Deep Learning Approach to Visual Question Answering,
IJCV(125), No. 1-3, December 2017, pp. 110-135.
Springer DOI
1711
BibRef
Earlier:
Ask Your Neurons:
A Neural-Based Approach to Answering Questions about Images,
ICCV15(1-9)
IEEE DOI
1602
Deep learning for questions about real-world images.
A Visual Turing Test.
Language output based on visual and natural language input.
BibRef
Kafle, K.[Kushal],
Kanan, C.[Christopher],
Visual question answering:
Datasets, algorithms, and future challenges,
CVIU(163), No. 1, 2017, pp. 3-20.
Elsevier DOI
1712
Image understanding
BibRef
Wu, Q.[Qi],
Teney, D.[Damien],
Wang, P.[Peng],
Shen, C.H.[Chun-Hua],
Dick, A.[Anthony],
van den Hengel, A.J.[Anton J.],
Visual question answering: A survey of methods and datasets,
CVIU(163), No. 1, 2017, pp. 21-40.
Elsevier DOI
1712
Survey, Visual Question Answering. Visual question answering
BibRef
Tamaazousti, Y.[Youssef],
Borgne, H.L.[Hervé Le],
Popescu, A.[Adrian],
Gadeski, E.[Etienne],
Ginsca, A.[Alexandru],
Hudelot, C.[Céline],
Vision-language integration using constrained local semantic features,
CVIU(163), No. 1, 2017, pp. 41-57.
Elsevier DOI
1712
Image classification
BibRef
Das, A.[Abhishek],
Agrawal, H.[Harsh],
Zitnick, L.[Larry],
Parikh, D.[Devi],
Batra, D.[Dhruv],
Human Attention in Visual Question Answering:
Do Humans and Deep Networks Look at the Same Regions?,
CVIU(163), No. 1, 2017, pp. 90-100.
Elsevier DOI
1712
Visual Question Answering
BibRef
Teney, D.[Damien],
Wu, Q.,
van den Hengel, A.J.[Anton J.],
Visual Question Answering: A Tutorial,
SPMag(34), No. 6, November 2017, pp. 63-75.
IEEE DOI
1712
Survey, Visual Question Answering. Bioinformatics, Computer vision, Genomics, Machine learning, Visualization
BibRef
Teney, D.[Damien],
Liu, L.,
van den Hengel, A.J.[Anton J.],
Graph-Structured Representations for Visual Question Answering,
CVPR17(3233-3241)
IEEE DOI
1711
Feature extraction, Knowledge discovery, Neural networks,
Syntactics, Training, Visualization
BibRef
Teney, D.[Damien],
van den Hengel, A.J.[Anton J.],
Visual Question Answering as a Meta Learning Task,
ECCV18(XV: 229-245).
Springer DOI
1810
BibRef
Wu, Q.[Qi],
Shen, C.H.[Chun-Hua],
Wang, P.[Peng],
Dick, A.[Anthony],
van den Hengel, A.J.[Anton J.],
Image Captioning and Visual Question Answering Based on Attributes
and External Knowledge,
PAMI(40), No. 6, June 2018, pp. 1367-1381.
IEEE DOI
1805
BibRef
Earlier: A1, A3, A2, A4, A5:
Ask Me Anything: Free-Form Visual Question Answering Based on
Knowledge from External Sources,
CVPR16(4622-4630)
IEEE DOI
1612
Computational modeling, Computer vision, Knowledge based systems,
Knowledge discovery, Resource description framework, Semantics,
visual question answering
BibRef
Lioutas, V.[Vasileios],
Passalis, N.[Nikolaos],
Tefas, A.[Anastasios],
Explicit ensemble attention learning for improving visual question
answering,
PRL(111), 2018, pp. 51-57.
Elsevier DOI
1808
Visual question answering, Explicit attention, Pictorial superiority effect
BibRef
Huang, Y.Z.[Yan-Zhou],
Zhong, T.[Tao],
Multitask learning for neural generative question answering,
RealTimeIP(14), No. 1, January 2018, pp. 1009-1017.
WWW Link.
1809
BibRef
Zhang, Q.S.[Quan-Shi],
Wu, Y.N.[Ying Nian],
Zhang, H.[Hao],
Zhu, S.C.[Song-Chun],
Mining deep And-Or object structures via cost-sensitive
question-answer-based active annotations,
CVIU(176-177), 2018, pp. 33-44.
Elsevier DOI
1812
Hierarchical graphical model, Part semantics
BibRef
Zhang, Q.S.[Quan-Shi],
Cao, R.,
Wu, Y.N.[Ying Nian],
Zhu, S.C.[Song-Chun],
Mining Object Parts from CNNs via Active Question-Answering,
CVPR17(3890-3899)
IEEE DOI
1711
BibRef
Earlier: A1, A3, A4, Only:
Mining And-Or Graphs for Graph Matching and Object Discovery,
ICCV15(55-63)
IEEE DOI
1602
Object detection, Object recognition, Semantics, Strain, Training,
Visualization
BibRef
Garg, S.[Shivam],
Srivastava, R.[Rajeev],
Object sequences: encoding categorical and spatial information for a
yes/no visual question answering task,
IET-CV(12), No. 8, December 2018, pp. 1141-1150.
DOI Link
1812
BibRef
Goyal, Y.[Yash],
Khot, T.[Tejas],
Agrawal, A.[Aishwarya],
Summers-Stay, D.[Douglas],
Batra, D.[Dhruv],
Parikh, D.[Devi],
Making the V in VQA Matter: Elevating the Role of Image Understanding
in Visual Question Answering,
IJCV(127), No. 4, April 2019, pp. 398-414.
Springer DOI
1903
BibRef
Earlier: A1, A2, A4, A5, A6, Only:
CVPR17(6325-6334)
IEEE DOI
1711
Benchmark testing, Data collection, Data models,
Knowledge discovery, Protocols, Visualization
BibRef
Fang, Z.W.[Zhi-Wei],
Liu, J.[Jing],
Li, Y.[Yong],
Qiao, Y.Y.[Yan-Yuan],
Lu, H.Q.[Han-Qing],
Improving visual question answering using dropout and enhanced
question encoder,
PR(90), 2019, pp. 404-414.
Elsevier DOI
1903
Visual question answering, Coherent dropout, Siamese dropout, Enhanced question encoder
BibRef
Das, A.[Abhishek],
Kottur, S.[Satwik],
Gupta, K.[Khushi],
Singh, A.[Avi],
Yadav, D.[Deshraj],
Lee, S.[Stefan],
Moura, J.M.F.[José M. F.],
Parikh, D.[Devi],
Batra, D.[Dhruv],
Visual Dialog,
PAMI(41), No. 5, May 2019, pp. 1242-1256.
IEEE DOI
1904
Hold a meaningful dialog about visual content.
Visualization, Task analysis, Artificial intelligence, History,
Protocols, Natural languages, Wheelchairs, Visual dialog,
machine learning
BibRef
Osman, A.[Ahmed],
Samek, W.[Wojciech],
DRAU: Dual Recurrent Attention Units for Visual Question Answering,
CVIU(185), 2019, pp. 24-30.
Elsevier DOI
1906
Visual Question Answering, Attention Mechanisms,
Multi-modal Learning, Machine Vision, Natural Language Processing
BibRef
Liang, J.W.[Jun-Wei],
Jiang, L.[Lu],
Cao, L.L.[Liang-Liang],
Kalantidis, Y.[Yannis],
Li, L.J.[Li-Jia],
Hauptmann, A.G.[Alexander G.],
Focal Visual-Text Attention for Memex Question Answering,
PAMI(41), No. 8, August 2019, pp. 1893-1908.
IEEE DOI
1907
BibRef
Earlier: A1, A2, A3, A5, A6, Only:
Focal Visual-Text Attention for Visual Question Answering,
CVPR18(6135-6143)
IEEE DOI
1812
Task analysis, Knowledge discovery, Visualization, Grounding,
Metadata, Cognition, Photo albums, question answering,
memex.
Visualization, Videos, Computational modeling, Correlation.
BibRef
Zhao, Z.[Zhou],
Zhang, Z.[Zhu],
Jiang, X.H.[Xing-Hua],
Cai, D.[Deng],
Multi-Turn Video Question Answering via Hierarchical Attention
Context Reinforced Networks,
IP(28), No. 8, August 2019, pp. 3860-3872.
IEEE DOI
1907
learning (artificial intelligence), natural language processing,
reinforcement learning
BibRef
Gu, M.[Mao],
Zhao, Z.[Zhou],
Jin, W.[Weike],
Cai, D.[Deng],
Wu, F.[Fei],
Video Dialog via Multi-Grained Convolutional Self-Attention Context
Multi-Modal Networks,
CirSysVideo(30), No. 12, December 2020, pp. 4453-4466.
IEEE DOI
2012
Visualization, Knowledge discovery, History, Task analysis,
Context modeling, Decoding, Computational modeling, Video dialog,
convolution
BibRef
Xue, H.Y.[Hong-Yang],
Chu, W.,
Zhao, Z.[Zhou],
Cai, D.[Deng],
A Better Way to Attend: Attention With Trees for Video Question
Answering,
IP(27), No. 11, November 2018, pp. 5563-5574.
IEEE DOI
1809
computational linguistics, feature extraction, grammars,
natural language processing,
scene understanding
BibRef
Xue, H.Y.[Hong-Yang],
Zhao, Z.[Zhou],
Cai, D.[Deng],
Unifying the Video and Question Attentions for Open-Ended Video
Question Answering,
IP(26), No. 12, December 2017, pp. 5656-5666.
IEEE DOI
1710
image retrieval, video coding,
temporal question attention, temporal structures,
Adaptation models, Coherence, Hair, Knowledge discovery
BibRef
Zhao, Z.[Zhou],
Xiao, S.W.[Shu-Wen],
Song, Z.[Zehan],
Lu, C.J.[Chu-Jie],
Xiao, J.[Jun],
Zhuang, Y.T.[Yue-Ting],
Open-Ended Video Question Answering via Multi-Modal Conditional
Adversarial Networks,
IP(29), 2020, pp. 3859-3870.
IEEE DOI
2002
Open-ended video question answering, multi-modal neural network
BibRef
Zhao, Z.[Zhou],
Zhang, Z.[Zhu],
Xiao, S.W.[Shu-Wen],
Xiao, Z.X.[Zhen-Xin],
Yan, X.H.[Xiao-Hui],
Yu, J.[Jun],
Cai, D.[Deng],
Wu, F.[Fei],
Long-Form Video Question Answering via Dynamic Hierarchical
Reinforced Networks,
IP(28), No. 12, December 2019, pp. 5939-5952.
IEEE DOI
1909
Knowledge discovery, Semantics, Visualization, Natural languages,
Road transportation, Task analysis, Decoding,
reinforcement learning
BibRef
Jang, Y.[Yunseok],
Song, Y.[Yale],
Kim, C.D.[Chris Dongjoo],
Yu, Y.[Youngjae],
Kim, Y.[Youngjin],
Kim, G.[Gunhee],
Video Question Answering with Spatio-Temporal Reasoning,
IJCV(127), No. 10, October 2019, pp. 1385-1412.
Springer DOI
1909
BibRef
Earlier: A1, A2, A4, A5, A6, Only:
TGIF-QA:
Toward Spatio-Temporal Reasoning in Visual Question Answering,
CVPR17(1359-1367)
IEEE DOI
1711
Cognition, Crowdsourcing, Image color analysis,
Knowledge discovery, Motion pictures, Visualization
BibRef
Toor, A.S.[Andeep S.],
Wechsler, H.[Harry],
Nappi, M.[Michele],
Biometric surveillance using visual question answering,
PRL(126), 2019, pp. 111-118.
Elsevier DOI
1909
Biometrics, Forensics, Visual question answering,
Question relevance, Surveillance, Deep learning, Visual turing test
BibRef
Wang, A.,
Luu, A.T.,
Foo, C.,
Zhu, H.,
Tay, Y.,
Chandrasekhar, V.,
Holistic Multi-Modal Memory Network for Movie Question Answering,
IP(29), No. 1, 2020, pp. 489-499.
IEEE DOI
1910
question answering (information retrieval),
holistic multimodal memory network, multimodal context,
MovieQA
BibRef
Yu, T.,
Yu, J.,
Yu, Z.,
Tao, D.,
Compositional Attention Networks With Two-Stream Fusion for Video
Question Answering,
IP(29), 2020, pp. 1204-1218.
IEEE DOI
1911
Visualization, Streaming media, Knowledge discovery,
Feature extraction, Proposals, Task analysis, Semantics,
action pooling stream
BibRef
Ruwa, N.[Nelson],
Mao, Q.[Qirong],
Song, H.P.[He-Ping],
Jia, H.J.[Hong-Jie],
Dong, M.[Ming],
Triple attention network for sentimental visual question answering,
CVIU(189), 2019, pp. 102829.
Elsevier DOI
1911
Visual question answering, Feature embedding, Attention model,
Sentiment analysis
BibRef
Li, W.W.[Wen-Wen],
Song, M.M.[Miao-Miao],
Tian, Y.Y.[Yuan-Yuan],
An Ontology-Driven Cyberinfrastructure for Intelligent Spatiotemporal
Question Answering and Open Knowledge Discovery,
IJGI(8), No. 11, 2019, pp. xx-yy.
DOI Link
1912
BibRef
Tommasi, T.[Tatiana],
Mallya, A.[Arun],
Plummer, B.A.[Bryan A.],
Lazebnik, S.[Svetlana],
Berg, A.C.[Alexander C.],
Berg, T.L.[Tamara L.],
Combining Multiple Cues for Visual Madlibs Question Answering,
IJCV(127), No. 1, January 2019, pp. 38-60.
Springer DOI
1901
BibRef
Earlier:
Solving Visual Madlibs with Multiple Cues,
BMVC16(xx-yy).
HTML Version.
1805
BibRef
Yu, L.C.[Li-Cheng],
Park, E.[Eunbyung],
Berg, A.C.[Alexander C.],
Berg, T.L.[Tamara L.],
Visual Madlibs:
Fill in the Blank Description Generation and Question Answering,
ICCV15(2461-2469)
IEEE DOI
1602
Dataset consisting of 360,001 focused natural language descriptions
for 10,738 images.
BibRef
Xi, Y.L.[Yu-Ling],
Zhang, Y.N.[Yan-Ning],
Ding, S.T.[Song-Tao],
Wan, S.H.[Shao-Hua],
Visual question answering model based on visual relationship
detection,
SP:IC(80), 2020, pp. 115648.
Elsevier DOI
1912
Visual question answering, Appearance features,
Relationship predicate, Word vector similarity
BibRef
Liu, F.,
Xiang, T.,
Hospedales, T.M.,
Yang, W.,
Sun, C.,
Inverse Visual Question Answering: A New Benchmark and VQA Diagnosis
Tool,
PAMI(42), No. 2, February 2020, pp. 460-474.
IEEE DOI
2001
BibRef
Earlier:
iVQA: Inverse Visual Question Answering,
CVPR18(8611-8619)
IEEE DOI
1812
Benchmark testing, Visualization, Predictive models,
Analytical models, Image color analysis, Knowledge discovery,
reinforcement learning.
Task analysis, Measurement, Decoding, Natural languages, Cognition
BibRef
Wu, Y.,
Jiang, L.,
Yang, Y.,
Revisiting EmbodiedQA: A Simple Baseline and Beyond,
IP(29), 2020, pp. 3984-3992.
IEEE DOI
2002
Embodied question answering, vision and language, visual question answering
BibRef
Huang, C.[Chaoran],
Yao, L.[Lina],
Wang, X.Z.[Xian-Zhi],
Benatallah, B.[Boualem],
Zhang, X.[Xiang],
Software expert discovery via knowledge domain embeddings in a
collaborative network,
PRL(130), 2020, pp. 46-53.
Elsevier DOI
2002
Knowledge discovery, Stack overflow, Expertise finding,
Question answering, Expert as a Service
BibRef
Wang, W.[Weining],
Huang, Y.[Yan],
Wang, L.[Liang],
Long video question answering: A Matching-guided Attention Model,
PR(102), 2020, pp. 107248.
Elsevier DOI
2003
Long video QA, Matching-guided attention
BibRef
Zhang, W.,
Tang, S.,
Cao, Y.,
Pu, S.,
Wu, F.,
Zhuang, Y.,
Frame Augmented Alternating Attention Network for Video Question
Answering,
MultMed(22), No. 4, April 2020, pp. 1032-1041.
IEEE DOI
2004
Feature extraction, Visualization, Knowledge discovery,
Task analysis, Data mining, Neural networks, Semantics, Video QA,
neural network
BibRef
Chen, J.[Jie],
Shao, J.[Jie],
He, C.[Chengkun],
Movie fill in the blank by joint learning from video and text with
adaptive temporal attention,
PRL(132), 2020, pp. 62-68.
Elsevier DOI
2005
Video question answering, Adaptive temporal attention, Text information fusion
BibRef
Li, W.[Wei],
Sun, J.H.[Jian-Hui],
Liu, G.[Ge],
Zhao, L.[Linglan],
Fang, X.Z.[Xiang-Zhong],
Visual question answering with attention transfer and a cross-modal
gating mechanism,
PRL(133), 2020, pp. 334-340.
Elsevier DOI
2005
Attention, Visual question answering, Gating
BibRef
Messina, N.[Nicola],
Amato, G.[Giuseppe],
Carrara, F.[Fabio],
Falchi, F.[Fabrizio],
Gennaro, C.[Claudio],
Learning visual features for relational CBIR,
MultInfoRetr(9), No. 2, June 2020, pp. 113-124.
Springer DOI
2005
BibRef
Earlier:
Learning Relationship-Aware Visual Features,
CEFR-LCV18(IV:486-501).
Springer DOI
1905
BibRef
Methani, N.,
Ganguly, P.,
Khapra, M.M.,
Kumar, P.,
PlotQA: Reasoning over Scientific Plots,
WACV20(1516-1525)
IEEE DOI
2006
Vocabulary, Cognition, Bars, Numerical models,
Optical character recognition software, Data mining, Image color analysis
BibRef
Guo, D.,
Wang, H.,
Wang, S.,
Wang, M.,
Textual-Visual Reference-Aware Attention Network for Visual Dialog,
IP(29), 2020, pp. 6655-6666.
IEEE DOI
2007
Visualization, Semantics, History, Correlation, Head, Cognition,
Task analysis, Visual dialog, attention network, textual reference,
multimodal semantic interaction
BibRef
Patil, C.[Charulata],
Patwardhan, M.[Manasi],
Visual Question Generation: The State of the Art,
Surveys(53), No. 3, May 2020, pp. xx-yy.
DOI Link
2007
Image understanding, question generation
BibRef
Yu, J.[Jing],
Zhu, Z.H.[Zi-Hao],
Wang, Y.J.[Yu-Jing],
Zhang, W.F.[Wei-Feng],
Hu, Y.[Yue],
Tan, J.L.[Jian-Long],
Cross-modal knowledge reasoning for knowledge-based visual question
answering,
PR(108), 2020, pp. 107563.
Elsevier DOI
2008
Cross-modal knowledge reasoning, Multimodal knowledge graphs,
Compositional reasoning module, Explainable reasoning
BibRef
Yang, Z.Q.[Zhuo-Qian],
Qin, Z.C.[Zeng-Chang],
Yu, J.[Jing],
Wan, T.[Tao],
Prior Visual Relationship Reasoning For Visual Question Answering,
ICIP20(1411-1415)
IEEE DOI
2011
Visualization, Semantics, Convolution, Cognition,
Knowledge discovery, Benchmark testing, Measurement, VQA,
GCN, Attention Mechanism
BibRef
Vu, M.H.,
Löfstedt, T.,
Nyholm, T.,
Sznitman, R.,
A Question-Centric Model for Visual Question Answering in Medical
Imaging,
MedImg(39), No. 9, September 2020, pp. 2856-2868.
IEEE DOI
2009
Feature extraction, Visualization, Predictive models,
Knowledge discovery, Task analysis, Medical diagnostic imaging,
medical questions and answers
BibRef
Riquelme, F.[Felipe],
de Goyeneche, A.[Alfredo],
Zhang, Y.D.[Yun-Dong],
Niebles, J.C.[Juan Carlos],
Soto, A.[Alvaro],
Explaining VQA predictions using visual grounding and a knowledge
base,
IVC(101), 2020, pp. 103968.
Elsevier DOI
2009
Deep Learning, Attention, Supervision, Knowledge Base,
Interpretability, Explainability
BibRef
Bai, Z.W.[Zong-Wen],
Li, Y.[Ying],
Wozniak, M.[Marcin],
Zhou, M.L.[Mei-Li],
Li, D.[Di],
DecomVQANet: Decomposing visual question answering deep network via
tensor decomposition and regression,
PR(110), 2021, pp. 107538.
Elsevier DOI
2011
Tensor decomposition, Tensor regression layer,
Tensor contraction layer, Visual question answering
BibRef
Patro, B.N.[Badri N.],
Anupriy,
Namboodiri, V.P.[Vinay P.],
Probabilistic framework for solving visual dialog,
PR(110), 2021, pp. 107586.
Elsevier DOI
2011
CNN, LSTM, Uncertainty, Aleatoric uncertainty,
Epistemic uncertainty, Vision and language, Visual dialog, VQA,
Bayesian deep learning
BibRef
Farazi, M.R.[Moshiur R.],
Khan, S.H.[Salman H.],
Barnes, N.[Nick],
From known to the unknown: Transferring knowledge to answer questions
about novel visual and semantic concepts,
IVC(103), 2020, pp. 103985.
Elsevier DOI
2011
Visual Question Answering, Computer vision, Deep learning,
Natural language processing, Dataset bias
BibRef
Terao, K.[Kento],
Tamaki, T.[Toru],
Raytchev, B.[Bisser],
Kaneda, K.[Kazufumi],
Satoh, S.[Shin'ichi],
Rephrasing Visual Questions by Specifying the Entropy of the Answer
Distribution,
IEICE(E103-D), No. 11, November 2020, pp. 2362-2370.
WWW Link.
2011
BibRef
Yu, J.[Jing],
Zhang, W.F.[Wei-Feng],
Lu, Y.H.[Yu-Hang],
Qin, Z.C.[Zeng-Chang],
Hu, Y.[Yue],
Tan, J.L.[Jian-Long],
Wu, Q.[Qi],
Reasoning on the Relation: Enhancing Visual Representation for Visual
Question Answering and Cross-Modal Retrieval,
MultMed(22), No. 12, December 2020, pp. 3196-3209.
IEEE DOI
2011
Visualization, Cognition, Task analysis, Knowledge discovery,
Semantics, Correlation, Information retrieval,
cross-modal information retrieval
BibRef
Lobry, S.,
Marcos, D.,
Murray, J.,
Tuia, D.,
RSVQA: Visual Question Answering for Remote Sensing Data,
GeoRS(58), No. 12, December 2020, pp. 8555-8566.
IEEE DOI
2012
Remote sensing, Task analysis, Visualization, Data models,
Feature extraction, Knowledge discovery,
visual question answering (VQA)
BibRef
Niu, Y.L.[Yu-Lei],
Zhang, H.W.[Han-Wang],
Lu, Z.W.[Zhi-Wu],
Chang, S.F.[Shih-Fu],
Variational Context: Exploiting Visual and Textual Context for
Grounding Referring Expressions,
PAMI(43), No. 1, January 2021, pp. 347-359.
IEEE DOI
2012
Grounding, Context modeling, Visualization, Task analysis,
Pediatrics, Bayes methods, Annotations,
referring expression generation
BibRef
Sun, B.[Bo],
Yao, Z.[Zeng],
Zhang, Y.H.[Ying-Hui],
Yu, L.J.[Le-Jun],
Local relation network with multilevel attention for visual question
answering,
JVCIR(73), 2020, pp. 102762.
Elsevier DOI
2012
Visual question answering, Relation network, Attention mechanism
BibRef
Wang, J.M.[Jian-Ming],
Cui, E.[Enjie],
Liu, K.L.[Kun-Liang],
Sun, Y.K.[Yu-Kuan],
Liang, J.Y.[Jia-Yu],
Yuan, C.M.[Chun-Miao],
Duan, X.J.[Xiao-Jie],
Jin, G.H.[Guang-Hao],
Chung, T.S.[Tae-Sun],
Referring expression comprehension model with matching detection and
linguistic feedback,
IET-CV(14), No. 8, December 2020, pp. 625-633.
DOI Link
2012
BibRef
Li, X.,
Yuan, A.,
Lu, X.,
Vision-to-Language Tasks Based on Attributes and Attention Mechanism,
Cyber(51), No. 2, February 2021, pp. 913-926.
IEEE DOI
2101
Semantics, Task analysis, Visualization, Cats, Natural languages,
Knowledge discovery, Feature extraction, Deep learning,
visual question answering (VQA)
BibRef
Murahari, V.[Vishvak],
Batra, D.[Dhruv],
Parikh, D.[Devi],
Das, A.[Abhishek],
Large-scale Pretraining for Visual Dialog:
A Simple State-of-the-art Baseline,
ECCV20(XVIII:336-352).
Springer DOI
2012
BibRef
Gupta, T.[Tanmay],
Vahdat, A.[Arash],
Chechik, G.[Gal],
Yang, X.D.[Xiao-Dong],
Kautz, J.[Jan],
Hoiem, D.[Derek],
Contrastive Learning for Weakly Supervised Phrase Grounding,
ECCV20(III:752-768).
Springer DOI
2012
BibRef
Garcia, N.[Noa],
Nakashima, Y.[Yuta],
Knowledge-based Video Question Answering with Unsupervised Scene
Descriptions,
ECCV20(XVIII:581-598).
Springer DOI
2012
BibRef
Gouthaman, K.V.,
Mittal, A.[Anurag],
Reducing Language Biases in Visual Question Answering with
Visually-grounded Question Encoder,
ECCV20(XIII:18-34).
Springer DOI
2011
BibRef
Tan, S.[Sinan],
Xiang, W.[Weilai],
Liu, H.P.[Hua-Ping],
Guo, D.[Di],
Sun, F.C.[Fu-Chun],
Multi-agent Embodied Question Answering in Interactive Environments,
ECCV20(XIII:663-678).
Springer DOI
2011
BibRef
Qiao, Y.,
Yu, Z.,
Liu, J.,
VC-VQA: Visual Calibration Mechanism For Visual Question Answering,
ICIP20(1481-1485)
IEEE DOI
2011
Visualization, Image reconstruction, Calibration, Task analysis,
Predictive models, Feature extraction, Knowledge discovery,
Feature Reconstruction
BibRef
Tan, H.L.,
Leong, M.C.,
Xu, Q.,
Li, L.,
Fang, F.,
Cheng, Y.,
Gauthier, N.,
Sun, Y.,
Lim, J.H.,
Task-Oriented Multi-Modal Question Answering For Collaborative
Applications,
ICIP20(1426-1430)
IEEE DOI
2011
Task analysis, Collaboration, Grounding, Visualization, Cognition,
Training, Machine learning, question answering,
corpora
BibRef
Jain, V.,
Lodhavia, J.,
Automatic Question Tagging using k-Nearest Neighbors and Random
Forest,
ISCV20(1-4)
IEEE DOI
2011
learning (artificial intelligence),
question answering (information retrieval),
Natural Language Processing
BibRef
Yang, S.[Sibei],
Li, G.B.[Guan-Bin],
Yu, Y.Z.[Yi-Zhou],
Propagating Over Phrase Relations for One-stage Visual Grounding,
ECCV20(XIX:589-605).
Springer DOI
2011
BibRef
Tang, R.X.[Rui-Xue],
Ma, C.[Chao],
Zhang, W.E.[Wei Emma],
Wu, Q.[Qi],
Yang, X.K.[Xiao-Kang],
Semantic Equivalent Adversarial Data Augmentation for Visual Question
Answering,
ECCV20(XIX:437-453).
Springer DOI
2011
BibRef
Zhu, Y.[Ye],
Wu, Y.[Yu],
Yang, Y.[Yi],
Yan, Y.[Yan],
Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents,
ECCV20(XXIII:153-169).
Springer DOI
2011
BibRef
Gokhale, T.[Tejas],
Banerjee, P.[Pratyay],
Baral, C.[Chitta],
Yang, Y.Z.[Ye-Zhou],
VQA-LOL: Visual Question Answering Under the Lens of Logic,
ECCV20(XXI:379-396).
Springer DOI
2011
BibRef
Yang, X.F.[Xiao-Feng],
Lin, G.S.[Guo-Sheng],
Lv, F.M.[Feng-Mao],
Liu, F.[Fayao],
TRRNET:
Tiered Relation Reasoning for Compositional Visual Question Answering,
ECCV20(XXI:414-430).
Springer DOI
2011
BibRef
Bansal, A.[Ankan],
Zhang, Y.[Yuting],
Chellappa, R.[Rama],
Visual Question Answering on Image Sets,
ECCV20(XXI:51-67).
Springer DOI
2011
BibRef
Xiao, J.B.[Jun-Bin],
Shang, X.[Xindi],
Yang, X.[Xun],
Tang, S.[Sheng],
Chua, T.S.[Tat-Seng],
Visual Relation Grounding in Videos,
ECCV20(VI:447-464).
Springer DOI
2011
Code, Relations.
WWW Link.
BibRef
Han, X.Z.[Xin-Zhe],
Wang, S.H.[Shu-Hui],
Su, C.[Chi],
Zhang, W.G.[Wei-Gang],
Huang, Q.M.[Qing-Ming],
Tian, Q.[Qi],
Interpretable Visual Reasoning via Probabilistic Formulation Under
Natural Supervision,
ECCV20(IX:553-570).
Springer DOI
2011
BibRef
Kant, Y.[Yash],
Batra, D.[Dhruv],
Anderson, P.[Peter],
Schwing, A.[Alexander],
Parikh, D.[Devi],
Lu, J.[Jiasen],
Agrawal, H.[Harsh],
Spatially Aware Multimodal Transformers for TextVQA,
ECCV20(IX:715-732).
Springer DOI
2011
BibRef
Li, Q.[Qing],
Huang, S.Y.[Si-Yuan],
Hong, Y.[Yining],
Zhu, S.C.[Song-Chun],
A Competence-aware Curriculum for Visual Concepts Learning via Question
Answering,
ECCV20(II:141-157).
Springer DOI
2011
BibRef
Zheng, W.B.[Wen-Bo],
Yan, L.[Lan],
Gou, C.[Chao],
Wang, F.Y.[Fei-Yue],
Webly Supervised Knowledge Embedding Model for Visual Reasoning,
CVPR20(12442-12451)
IEEE DOI
2008
Visual reasoning between visual image and natural language description.
Visualization, Cognition, Knowledge based systems, Task analysis,
Knowledge engineering, Modulation, Robustness
BibRef
Qi, J.,
Niu, Y.,
Huang, J.,
Zhang, H.,
Two Causal Principles for Improving Visual Dialog,
CVPR20(10857-10866)
IEEE DOI
2008
Visualization, History, Task analysis, Data models, Training, Feeds, Decoding
BibRef
Mun, J.,
Cho, M.,
Han, B.,
Local-Global Video-Text Interactions for Temporal Grounding,
CVPR20(10807-10816)
IEEE DOI
2008
Semantics, Feature extraction, Grounding, Visualization, Proposals,
Task analysis, Context modeling
BibRef
Wu, C.,
Lin, Z.,
Cohen, S.,
Bui, T.,
Maji, S.,
PhraseCut: Language-Based Image Segmentation in the Wild,
CVPR20(10213-10222)
IEEE DOI
2008
Visualization, Grounding, Image segmentation, Task analysis,
Genomics, Bioinformatics, Natural languages
BibRef
Abbasnejad, E.[Ehsan],
Teney, D.[Damien],
Parvaneh, A.[Amin],
Shi, J.[Javen],
van den Hengel, A.J.[Anton J.],
Counterfactual Vision and Language Learning,
CVPR20(10041-10051)
IEEE DOI
2008
Training, Visualization, Training data, Task analysis,
Machine learning, Knowledge discovery, Data models
BibRef
Wang, P.[Peng],
Wu, Q.[Qi],
Cao, J.[Jiewei],
Shen, C.H.[Chun-Hua],
Gao, L.L.[Lian-Li],
van den Hengel, A.J.[Anton J.],
Neighbourhood Watch: Referring Expression Comprehension via
Language-Guided Graph Attention Networks,
CVPR19(1960-1968).
IEEE DOI
2002
BibRef
Guo, D.,
Wang, H.,
Zhang, H.,
Zha, Z.,
Wang, M.,
Iterative Context-Aware Graph Inference for Visual Dialog,
CVPR20(10052-10061)
IEEE DOI
2008
Visualization, History, Task analysis, Semantics, Message passing,
Neural networks, Cognition
BibRef
Zhu, Y.,
Zhu, F.,
Zhan, Z.,
Lin, B.,
Jiao, J.,
Chang, X.,
Liang, X.,
Vision-Dialog Navigation by Exploring Cross-Modal Memory,
CVPR20(10727-10736)
IEEE DOI
2008
Navigation, Visualization, Task analysis, History, Memory modules,
Natural languages, Decision making
BibRef
Bajaj, G.,
Bandyopadhyay, B.,
Schmidt, D.,
Maneriker, P.,
Myers, C.,
Parthasarathy, S.,
Understanding Knowledge Gaps in Visual Question Answering:
Implications for Gap Identification and Testing,
MVM20(1563-1566)
IEEE DOI
2008
Cognition, Training, Task analysis, Artificial intelligence,
Global communication, Taxonomy, Semantics
BibRef
Chen, L.,
Yan, X.,
Xiao, J.,
Zhang, H.,
Pu, S.,
Zhuang, Y.,
Counterfactual Samples Synthesizing for Robust Visual Question
Answering,
CVPR20(10797-10806)
IEEE DOI
2008
Training, Cascading style sheets, Predictive models, Visualization,
Image color analysis, Linguistics, Computational modeling
BibRef
Vatashsky, B.,
Ullman, S.,
VQA With No Questions-Answers Training,
CVPR20(10373-10383)
IEEE DOI
2008
Visualization, Training, Image color analysis, Knowledge discovery,
Boats, Image analysis, Task analysis
BibRef
Jiang, H.,
Misra, I.,
Rohrbach, M.,
Learned-Miller, E.,
Chen, X.,
In Defense of Grid Features for Visual Question Answering,
CVPR20(10264-10273)
IEEE DOI
2008
Feature extraction, Visualization, Task analysis, Detectors,
Object detection, Training, Pipelines
BibRef
Wang, X.,
Liu, Y.,
Shen, C.,
Ng, C.C.,
Luo, C.,
Jin, L.,
Chan, C.S.,
van den Hengel, A.,
Wang, L.,
On the General Value of Evidence, and Bilingual Scene-Text Visual
Question Answering,
CVPR20(10123-10132)
IEEE DOI
2008
Measurement, Cognition, Knowledge discovery, Correlation,
Task analysis, Visualization, Optical character recognition software
BibRef
Kim, J.,
Ma, M.,
Pham, T.,
Kim, K.,
Yoo, C.D.,
Modality Shifting Attention Network for Multi-Modal Video Question
Answering,
CVPR20(10103-10112)
IEEE DOI
2008
Cognition, Visualization, Task analysis, Knowledge discovery,
Proposals, Modulation, Context modeling
BibRef
Xiong, P.,
Wu, Y.,
TA-Student VQA: Multi-Agents Training by Self-Questioning,
CVPR20(10062-10072)
IEEE DOI
2008
Visualization, Training, Knowledge discovery, Standards,
Task analysis, Boosting
BibRef
Selvaraju, R.R.,
Tendulkar, P.,
Parikh, D.,
Horvitz, E.,
Tulio Ribeiro, M.,
Nushi, B.,
Kamar, E.,
SQuINTing at VQA Models: Introspecting VQA Models With Sub-Questions,
CVPR20(10000-10008)
IEEE DOI
2008
Cognition, Task analysis, Visualization, Image color analysis,
Grounding, Text recognition, Computational modeling
BibRef
Le, T.M.,
Le, V.,
Venkatesh, S.,
Tran, T.,
Hierarchical Conditional Relation Networks for Video Question
Answering,
CVPR20(9969-9978)
IEEE DOI
2008
Linguistics, Cognition, Visualization, Computer architecture,
Context modeling, Encoding, Buildings
BibRef
Jiang, M.,
Chen, S.,
Yang, J.,
Zhao, Q.,
Fantastic Answers and Where to Find Them: Immersive Question-Directed
Visual Attention,
CVPR20(2977-2986)
IEEE DOI
2008
Task analysis, Videos, Visualization, Computational modeling, Head,
Resists, Computer vision
BibRef
Agarwal, V.,
Shetty, R.,
Fritz, M.,
Towards Causal VQA: Revealing and Reducing Spurious Correlations by
Invariant and Covariant Semantic Editing,
CVPR20(9687-9695)
IEEE DOI
2008
Data models, Robustness, Predictive models, Semantics, Correlation,
Vocabulary, Visualization
BibRef
Hu, R.,
Singh, A.,
Darrell, T.J.,
Rohrbach, M.,
Iterative Answer Prediction With Pointer-Augmented Multimodal
Transformers for TextVQA,
CVPR20(9989-9999)
IEEE DOI
2008
Optical character recognition software, Task analysis,
Feature extraction, Visualization, Iterative decoding, Vocabulary,
Predictive models
BibRef
Kafle, K.,
Shrestha, R.,
Price, B.,
Cohen, S.,
Kanan, C.,
Answering Questions about Data Visualizations using Efficient Bimodal
Fusion,
WACV20(1487-1496)
IEEE DOI
2006
Bars, Data visualization, Image color analysis, Visualization,
Task analysis, Optical character recognition software, Training
BibRef
Yang, Z.,
Garcia, N.,
Chu, C.,
Otani, M.,
Nakashima, Y.,
Takemura, H.,
BERT Representations for Video Question Answering,
WACV20(1545-1554)
IEEE DOI
2006
Visualization, Bit error rate, Feature extraction,
Knowledge discovery, Task analysis, Semantics, Standards
BibRef
Patro, B.N.,
Kurmi, V.K.,
Kumar, S.,
Namboodiri, V.P.,
Deep Bayesian Network for Visual Question Generation,
WACV20(1555-1565)
IEEE DOI
2006
Bayes methods, Task analysis, Visualization, Uncertainty, Decoding,
Probabilistic logic, Semantics
BibRef
Patro, B.N.,
Patel, S.,
Namboodiri, V.P.,
Robust Explanations for Visual Question Answering,
WACV20(1566-1575)
IEEE DOI
2006
Visualization, Robustness, Perturbation methods,
Knowledge discovery, Collaboration, Task analysis, Coherence
BibRef
Chou, S.,
Chao, W.,
Lai, W.,
Sun, M.,
Yang, M.,
Visual Question Answering on 360° Images,
WACV20(1596-1605)
IEEE DOI
2006
Visualization, Task analysis, Feature extraction, Distortion,
Cognition, Image color analysis, Spatial resolution
BibRef
Chaudhry, R.,
Shekhar, S.,
Gupta, U.,
Maneriker, P.,
Bansal, P.,
Joshi, A.,
LEAF-QA: Locate, Encode & Attend for Figure Question Answering,
WACV20(3501-3510)
IEEE DOI
2006
Bars, Knowledge discovery, Image color analysis, Training,
Vocabulary, Data mining, Data visualization
BibRef
Liang, Y.Z.[Yuan-Zhi],
Bai, Y.L.[Ya-Long],
Zhang, W.[Wei],
Qian, X.M.[Xue-Ming],
Zhu, L.[Li],
Mei, T.[Tao],
VrR-VG: Refocusing Visually-Relevant Relationships,
ICCV19(10402-10411)
IEEE DOI
2004
bioinformatics, data mining, data visualisation,
feature extraction, genomics, graph theory, image annotation, Cognition
BibRef
Singh, A.K.,
Mishra, A.,
Shekhar, S.,
Chakraborty, A.,
From Strings to Things: Knowledge-Enabled VQA Model That Can Read and
Reason,
ICCV19(4601-4611)
IEEE DOI
2004
document image processing, graph theory, inference mechanisms,
neural nets, text analysis, visual content proposals, Proposals
BibRef
Chen, L.[Lei],
Zhai, M.Y.[Meng-Yao],
He, J.W.[Jia-Wei],
Mori, G.[Greg],
Object Grounding via Iterative Context Reasoning,
MDALC19(1407-1415)
IEEE DOI
2004
Localize a set of queries in the image.
image classification, image representation, image segmentation,
inference mechanisms, iterative methods, query processing,
weakly supervised learning
BibRef
Bhattacharya, N.,
Li, Q.,
Gurari, D.,
Why Does a Visual Question Have Different Answers?,
ICCV19(4270-4279)
IEEE DOI
2004
Code, Visual Q-A.
WWW Link.
computer vision, question answering (information retrieval),
visual question answering, Visualization, Powders, Task analysis,
Computer vision
BibRef
Luo, H.,
Lin, G.,
Liu, Z.,
Liu, F.,
Tang, Z.,
Yao, Y.,
SegEQA: Video Segmentation Based Visual Attention for Embodied
Question Answering,
ICCV19(9666-9675)
IEEE DOI
2004
computer vision, feature extraction, image fusion,
image segmentation, question answering (information retrieval),
feature fusion
BibRef
Li, L.,
Gan, Z.,
Cheng, Y.,
Liu, J.,
Relation-Aware Graph Attention Network for Visual Question Answering,
ICCV19(10312-10321)
IEEE DOI
2004
data visualisation, graph theory,
learning (artificial intelligence), object detection, Computational modeling
BibRef
Peng, G.,
You, H.,
Zhang, Z.,
Wang, X.,
Li, H.,
Multi-Modality Latent Interaction Network for Visual Question
Answering,
ICCV19(5824-5834)
IEEE DOI
2004
data visualisation, image representation, image retrieval,
learning (artificial intelligence),
Object detection
BibRef
Fan, C.,
EgoVQA: An Egocentric Video Question Answering Benchmark Dataset,
EPIC19(4359-4366)
IEEE DOI
2004
question answering (information retrieval),
video signal processing, EgoVQA dataset, visual question,
dataset
BibRef
Selvaraju, R.R.,
Lee, S.,
Shen, Y.,
Jin, H.,
Ghosh, S.,
Heck, L.,
Batra, D.,
Parikh, D.,
Taking a HINT: Leveraging Explanations to Make Vision and Language
Models More Grounded,
ICCV19(2591-2600)
IEEE DOI
2004
gradient methods, image retrieval, natural language processing,
neural nets, question answering (information retrieval), HINT,
Correlation
BibRef
Do, T.,
Tran, H.,
Do, T.,
Tjiputra, E.,
Tran, Q.,
Compact Trilinear Interaction for Visual Question Answering,
ICCV19(392-401)
IEEE DOI
2004
computer vision, learning (artificial intelligence),
matrix decomposition,
Correlation
BibRef
Yang, T.,
Zha, Z.,
Zhang, H.,
Making History Matter:
History-Advantage Sequence Training for Visual Dialog,
ICCV19(2561-2569)
IEEE DOI
2004
image retrieval, image sequences, interactive systems, neural nets,
question answering (information retrieval),
Decoding
BibRef
Nguyen, D.K.[Duy-Kien],
Okatani, T.[Takayuki],
Multi-Task Learning of Hierarchical Vision-Language Representation,
CVPR19(10484-10493).
IEEE DOI
2002
BibRef
Schwartz, I.[Idan],
Yu, S.[Seunghak],
Hazan, T.[Tamir],
Schwing, A.G.[Alexander G.],
Factor Graph Attention,
CVPR19(2039-2048).
IEEE DOI
2002
BibRef
Kolesnikov, A.[Alexander],
Beyer, L.[Lucas],
Zhai, X.H.[Xiao-Hua],
Puigcerver, J.[Joan],
Yung, J.[Jessica],
Gelly, S.[Sylvain],
Houlsby, N.[Neil],
Big Transfer (BiT): General Visual Representation Learning,
ECCV20(V:491-507).
Springer DOI
2011
BibRef
Kolesnikov, A.[Alexander],
Zhai, X.H.[Xiao-Hua],
Beyer, L.[Lucas],
Revisiting Self-Supervised Visual Representation Learning,
CVPR19(1920-1929).
IEEE DOI
2002
BibRef
Xiong, P.[Peixi],
Zhan, H.[Huayi],
Wang, X.[Xin],
Sinha, B.[Baivab],
Wu, Y.[Ying],
Visual Query Answering by Entity-Attribute Graph Matching and Reasoning,
CVPR19(8349-8358).
IEEE DOI
2002
BibRef
Singh, A.[Amanpreet],
Natarajan, V.[Vivek],
Shah, M.[Meet],
Jiang, Y.[Yu],
Chen, X.[Xinlei],
Batra, D.[Dhruv],
Parikh, D.[Devi],
Rohrbach, M.[Marcus],
Towards VQA Models That Can Read,
CVPR19(8309-8318).
IEEE DOI
2002
BibRef
Manjunatha, V.[Varun],
Saini, N.[Nirat],
Davis, L.S.[Larry S.],
Explicit Bias Discovery in Visual Question Answering Models,
CVPR19(9554-9563).
IEEE DOI
2002
BibRef
Guo, D.[Dalu],
Xu, C.[Chang],
Tao, D.C.[Da-Cheng],
Image-Question-Answer Synergistic Network for Visual Dialog,
CVPR19(10426-10435).
IEEE DOI
2002
BibRef
Shrestha, R.[Robik],
Kafle, K.[Kushal],
Kanan, C.[Christopher],
Answer Them All! Toward Universal Visual Question Answering Models,
CVPR19(10464-10473).
IEEE DOI
2002
BibRef
Zadeh, A.[Amir],
Chan, M.[Michael],
Liang, P.P.[Paul Pu],
Tong, E.[Edmund],
Morency, L.P.[Louis-Philippe],
Social-IQ: A Question Answering Benchmark for Artificial Social
Intelligence,
CVPR19(8799-8809).
IEEE DOI
2002
BibRef
Noh, H.[Hyeonwoo],
Kim, T.[Taehoon],
Mun, J.[Jonghwan],
Han, B.H.[Bo-Hyung],
Transfer Learning via Unsupervised Task Discovery for Visual Question
Answering,
CVPR19(8377-8386).
IEEE DOI
2002
BibRef
Kim, J.Y.[Jun-Yeong],
Ma, M.[Minuk],
Kim, K.[Kyungsu],
Kim, S.[Sungjin],
Yoo, C.D.[Chang D.],
Progressive Attention Memory Network for Movie Story Question Answering,
CVPR19(8329-8338).
IEEE DOI
2002
BibRef
Hudson, D.A.[Drew A.],
Manning, C.D.[Christopher D.],
GQA: A New Dataset for Real-World Visual Reasoning and Compositional
Question Answering,
CVPR19(6693-6702).
IEEE DOI
2002
BibRef
Wijmans, E.[Erik],
Datta, S.[Samyak],
Maksymets, O.[Oleksandr],
Das, A.[Abhishek],
Gkioxari, G.[Georgia],
Lee, S.[Stefan],
Essa, I.[Irfan],
Parikh, D.[Devi],
Batra, D.[Dhruv],
Embodied Question Answering in Photorealistic Environments With Point
Cloud Perception,
CVPR19(6652-6661).
IEEE DOI
2002
BibRef
Shah, M.[Meet],
Chen, X.[Xinlei],
Rohrbach, M.[Marcus],
Parikh, D.[Devi],
Cycle-Consistency for Robust Visual Question Answering,
CVPR19(6642-6651).
IEEE DOI
2002
BibRef
Zheng, Z.L.[Zi-Long],
Wang, W.G.[Wen-Guan],
Qi, S.Y.[Si-Yuan],
Zhu, S.C.[Song-Chun],
Reasoning Visual Dialogs With Structural and Partial Observations,
CVPR19(6662-6671).
IEEE DOI
2002
BibRef
Li, H.[Hui],
Wang, P.[Peng],
Shen, C.H.[Chun-Hua],
van den Hengel, A.[Anton],
Visual Question Answering as Reading Comprehension,
CVPR19(6312-6321).
IEEE DOI
2002
BibRef
Yu, L.C.[Li-Cheng],
Chen, X.[Xinlei],
Gkioxari, G.[Georgia],
Bansal, M.[Mohit],
Berg, T.L.[Tamara L.],
Batra, D.[Dhruv],
Multi-Target Embodied Question Answering,
CVPR19(6302-6311).
IEEE DOI
2002
BibRef
Yu, Z.[Zhou],
Yu, J.[Jun],
Cui, Y.[Yuhao],
Tao, D.C.[Da-Cheng],
Tian, Q.[Qi],
Deep Modular Co-Attention Networks for Visual Question Answering,
CVPR19(6274-6283).
IEEE DOI
2002
BibRef
Abbasnejad, E.[Ehsan],
Wu, Q.[Qi],
Shi, Q.F.[Qin-Feng],
van den Hengel, A.[Anton],
What's to Know? Uncertainty as a Guide to Asking Goal-Oriented
Questions,
CVPR19(4150-4159).
IEEE DOI
2002
BibRef
Marino, K.[Kenneth],
Rastegari, M.[Mohammad],
Farhadi, A.[Ali],
Mottaghi, R.[Roozbeh],
OK-VQA: A Visual Question Answering Benchmark Requiring External
Knowledge,
CVPR19(3190-3199).
IEEE DOI
2002
BibRef
Krishna, R.[Ranjay],
Bernstein, M.[Michael],
Fei-Fei, L.[Li],
Information Maximizing Visual Question Generation,
CVPR19(2008-2018).
IEEE DOI
2002
BibRef
Fan, C.Y.[Chen-You],
Zhang, X.F.[Xiao-Fan],
Zhang, S.[Shu],
Wang, W.S.[Wen-Sheng],
Zhang, C.[Chi],
Huang, H.[Heng],
Heterogeneous Memory Enhanced Multimodal Attention Model for Video
Question Answering,
CVPR19(1999-2007).
IEEE DOI
2002
BibRef
Cadene, R.[Remi],
Ben-younes, H.[Hedi],
Cord, M.[Matthieu],
Thome, N.[Nicolas],
MUREL: Multimodal Relational Reasoning for Visual Question Answering,
CVPR19(1989-1998).
IEEE DOI
2002
BibRef
Haurilet, M.[Monica],
Roitberg, A.[Alina],
Stiefelhagen, R.[Rainer],
It's Not About the Journey; It's About the Destination: Following Soft
Paths Under Question-Guidance for Visual Reasoning,
CVPR19(1930-1939).
IEEE DOI
2002
BibRef
Qiu, Y.,
Satoh, Y.,
Suzuki, R.,
Kataoka, H.,
Incorporating 3D Information Into Visual Question Answering,
3DV19(756-765)
IEEE DOI
1911
Feature extraction, Task analysis,
Visualization, Natural language processing, Cognition,
Human computer interaction
BibRef
Haurilet, M.[Monica],
Al-Halah, Z.[Ziad],
Stiefelhagen, R.[Rainer],
DynGraph: Visual Question Answering via Dynamic Scene Graphs,
GCPR19(428-441).
Springer DOI
1911
BibRef
Earlier:
MoQA: A Multi-modal Question Answering Architecture,
VL18(IV:106-113).
Springer DOI
1905
BibRef
Liu, F.,
Liu, J.,
Fang, Z.,
Lu, H.,
Language and Visual Relations Encoding for Visual Question Answering,
ICIP19(3307-3311)
IEEE DOI
1910
Visual question answering, Relations, Attention
BibRef
Liu, C.N.[Chao-Ning],
Chen, D.J.[Ding-Jie],
Chen, H.T.[Hwann-Tzong],
Liu, T.L.[Tyng-Luh],
A2A: Attention to Attention Reasoning for Movie Question Answering,
ACCV18(VI:404-419).
Springer DOI
1906
BibRef
Fang, Z.W.[Zhi-Wei],
Liu, J.[Jing],
Tang, Q.[Qu],
Li, Y.[Yong],
Lu, H.Q.[Han-Qing],
Answer Distillation for Visual Question Answering,
ACCV18(I:72-87).
Springer DOI
1906
BibRef
Kuhnle, A.[Alexander],
Xie, H.Y.[Hui-Yuan],
Copestake, A.[Ann],
How Clever Is the FiLM Model, and How Clever Can it Be?,
VL18(IV:162-172).
Springer DOI
1905
BibRef
Li, W.[Wei],
Yuan, Z.H.[Ze-Huan],
Fang, X.Z.[Xiang-Zhong],
Wang, C.[Changhu],
Knowing Where to Look? Analysis on Attention of Visual Question
Answering System,
VL18(IV:145-152).
Springer DOI
1905
BibRef
Bani, G.[Gabriele],
Belli, D.[Davide],
Dagan, G.[Gautier],
Geenen, A.[Alexander],
Skliar, A.[Andrii],
Venkatesh, A.[Aashish],
Baumgärtner, T.[Tim],
Bruni, E.[Elia],
Fernández, R.[Raquel],
Adding Object Detection Skills to Visual Dialogue Agents,
VL18(IV:180-187).
Springer DOI
1905
BibRef
Wagner, M.[Misha],
Basevi, H.[Hector],
Shetty, R.[Rakshith],
Li, W.B.[Wen-Bin],
Malinowski, M.[Mateusz],
Fritz, M.[Mario],
Leonardis, A.[Aleš],
Answering Visual What-If Questions: From Actions to Predicted Scene
Descriptions,
VLEASE18(I:521-537).
Springer DOI
1905
BibRef
Zhang, Y.,
Niebles, J.C.,
Soto, A.,
Interpretable Visual Question Answering by Visual Grounding From
Attention Supervision Mining,
WACV19(349-357)
IEEE DOI
1904
data mining, data visualisation, image representation,
learning (artificial intelligence),
Computer architecture
BibRef
Zhuang, B.,
Wu, Q.,
Shen, C.,
Reid, I.D.,
van den Hengel, A.J.[Anton J.],
Parallel Attention: A Unified Framework for Visual Object Discovery
Through Dialogs and Queries,
CVPR18(4252-4261)
IEEE DOI
1812
Visualization, Task analysis, Cognition, Proposals, Grounding,
Computer vision, Correlation
BibRef
Yang, M.,
Yang, N.S.R.,
Zhang, K.,
Tao, J.,
Self-Talk: Responses to Users' Opinions and Challenges in Human
Computer Dialog,
ICPR18(2839-2844)
IEEE DOI
1812
History, Robots, Databases, Predictive models, Pattern recognition,
Automation, Search engines, human computer dialog,
abstract extraction
BibRef
Duke, B.,
Taylor, G.W.,
Generalized Hadamard-Product Fusion Operators for Visual Question
Answering,
CRV18(39-46)
IEEE DOI
1812
Feature extraction, Visualization, Task analysis, Data models,
Mathematical model, Natural languages, Model Selection,
Visual Question-Answering
BibRef
Das, A.,
Datta, S.,
Gkioxari, G.,
Lee, S.,
Parikh, D.,
Batra, D.,
Embodied Question Answering,
CVPR18(1-10)
IEEE DOI
1812
Navigation, Visualization, Task analysis, Automobiles,
Knowledge discovery
BibRef
Misra, I.,
Girshick, R.,
Fergus, R.,
Hebert, M.,
Gupta, A.,
van der Maaten, L.[Laurens],
Learning by Asking Questions,
CVPR18(11-20)
IEEE DOI
1812
Training, Proposals, Visualization, Knowledge discovery, Standards,
Task analysis, Data models
BibRef
Gurari, D.,
Li, Q.,
Stangl, A.J.,
Guo, A.,
Lin, C.,
Grauman, K.,
Luo, J.,
Bigham, J.P.,
VizWiz Grand Challenge: Answering Visual Questions from Blind People,
CVPR18(3608-3617)
IEEE DOI
1812
Visualization, Blindness, Prediction algorithms, Lighting,
Computer vision, Mobile handsets, Shape
BibRef
Li, J.,
Su, H.,
Zhu, J.,
Wang, S.,
Zhang, B.,
Textbook Question Answering Under Instructor Guidance with Memory
Networks,
CVPR18(3655-3663)
IEEE DOI
1812
Task analysis, Cognition, Visualization, Feature extraction,
Semantics, Knowledge discovery, Drugs
BibRef
Gordon, D.,
Kembhavi, A.,
Rastegari, M.,
Redmon, J.,
Fox, D.,
Farhadi, A.,
IQA: Visual Question Answering in Interactive Environments,
CVPR18(4089-4098)
IEEE DOI
1812
Task analysis, Navigation, Visualization, Knowledge discovery,
Semantics, Planning
BibRef
Agrawal, A.,
Batra, D.,
Parikh, D.,
Kembhavi, A.,
Don't Just Assume; Look and Answer: Overcoming Priors for Visual
Question Answering,
CVPR18(4971-4980)
IEEE DOI
1812
Image color analysis, Visualization, Data models, Training data,
Training, Knowledge discovery, Dogs
BibRef
Sha, F.,
Chao, W.,
Hu, H.,
Learning Answer Embeddings for Visual Question Answering,
CVPR18(5428-5436)
IEEE DOI
1812
Visualization, Semantics, Probabilistic logic,
Computational modeling, Task analysis, Training, Adaptation models
BibRef
Kafle, K.,
Price, B.,
Cohen, S.,
Kanan, C.,
DVQA: Understanding Data Visualizations via Question Answering,
CVPR18(5648-5656)
IEEE DOI
1812
Bars, Cognition, Image color analysis, Visualization,
Data visualization, Data mining, Knowledge discovery
BibRef
Sha, F.,
Hu, H.,
Chao, W.,
Cross-Dataset Adaptation for Visual Question Answering,
CVPR18(5716-5725)
IEEE DOI
1812
Visualization, Task analysis, Adaptation models,
Knowledge discovery, Games, Training, Target recognition
BibRef
Jain, U.,
Schwing, A.,
Lazebnik, S.,
Two Can Play This Game: Visual Dialog with Discriminative Question
Generation and Answering,
CVPR18(5754-5763)
IEEE DOI
1812
Visualization, Task analysis, History, Knowledge discovery,
Measurement, Training, Computer architecture
BibRef
Anderson, P.,
He, X.,
Buehler, C.,
Teney, D.,
Johnson, M.,
Gould, S.,
Zhang, L.,
Bottom-Up and Top-Down Attention for Image Captioning and Visual
Question Answering,
CVPR18(6077-6086)
IEEE DOI
1812
Visualization, Task analysis, Proposals, Mathematical model, Servers,
Context modeling, Object detection
BibRef
Nguyen, D.,
Okatani, T.,
Improved Fusion of Visual and Language Representations by Dense
Symmetric Co-attention for Visual Question Answering,
CVPR18(6087-6096)
IEEE DOI
1812
Feature extraction, Visualization, Computer architecture, Fuses,
Knowledge discovery, Bidirectional control
BibRef
Dokania, P.K.,
Torr, P.H.S.,
Siddharth, N.,
Massiceti, D.,
FLIPDIAL: A Generative Model for Two-Way Visual Dialogue,
CVPR18(6097-6105)
IEEE DOI
1812
Visualization, Task analysis, Computational modeling, History,
Data models, Pediatrics, Image color analysis
BibRef
Wu, Q.,
Wang, P.,
Shen, C.,
Reid, I.D.,
van den Hengel, A.J.[Anton J.],
Are You Talking to Me? Reasoned Visual Dialog Generation Through
Adversarial Learning,
CVPR18(6106-6115)
IEEE DOI
1812
Visualization, Task analysis, Generators, History,
Computational modeling, Image color analysis
BibRef
Gao, J.,
Ge, R.,
Chen, K.,
Nevatia, R.,
Motion-Appearance Co-memory Networks for Video Question Answering,
CVPR18(6576-6585)
IEEE DOI
1812
Knowledge discovery, Cognition, Task analysis, Dynamics,
Memory modules, Micromechanical devices, Logic gates
BibRef
Ma, C.,
Shen, C.,
Dick, A.,
Wu, Q.,
Wang, P.,
van den Hengel, A.J.[Anton J.],
Reid, I.D.,
Visual Question Answering with Memory-Augmented Networks,
CVPR18(6975-6984)
IEEE DOI
1812
Visualization, Neural networks, Training, Knowledge discovery,
Feature extraction, Bidirectional control, Prediction algorithms
BibRef
Cao, Q.,
Liang, X.,
Li, B.,
Li, G.,
Lin, L.,
Visual Question Reasoning on General Dependency Tree,
CVPR18(7249-7257)
IEEE DOI
1812
Cognition, Visualization, Layout, Feature extraction, Task analysis,
Collaboration, Neural networks
BibRef
Patro, B.,
Namboodiri, V.P.,
Differential Attention for Visual Question Answering,
CVPR18(7680-7688)
IEEE DOI
1812
Semantics, Task analysis, Visualization, Knowledge discovery,
Correlation, Measurement, Training
BibRef
Su, Z.,
Zhu, C.,
Dong, Y.,
Cai, D.,
Chen, Y.,
Li, J.,
Learning Visual Knowledge Memory Networks for Visual Question
Answering,
CVPR18(7736-7745)
IEEE DOI
1812
Visualization, Knowledge based systems, Task analysis,
Knowledge discovery, Cognition, Ovens
BibRef
Shin, A.,
Ushiku, Y.,
Harada, T.,
Customized Image Narrative Generation via Interactive Visual Question
Generation and Answering,
CVPR18(8925-8933)
IEEE DOI
1812
Visualization, Task analysis, Feature extraction, Proposals,
Knowledge discovery, Recurrent neural networks, Training
BibRef
Das, A.,
Datta, S.,
Gkioxari, G.,
Lee, S.,
Parikh, D.,
Batra, D.,
Embodied Question Answering,
DeepLearnRV18(2135-213509)
IEEE DOI
1812
Navigation, Visualization, Task analysis, Automobiles,
Knowledge discovery
BibRef
Cheng, W.,
Huang, Y.,
Wang, L.,
Towards Unconstrained Pointing Problem of Visual Question Answering:
A Retrieval-based Method,
ICPR18(3303-3308)
IEEE DOI
1812
Visualization, Task analysis, Feature extraction, Training,
Knowledge discovery, Proposals, Semantics
BibRef
Teney, D.,
Anderson, P.,
He, X.,
van den Hengel, A.J.[Anton J.],
Tips and Tricks for Visual Question Answering:
Learnings from the 2017 Challenge,
CVPR18(4223-4232)
IEEE DOI
1812
Training, Visualization, Task analysis, Neural networks,
Knowledge discovery, Logic gates, Computer architecture
BibRef
Zhou, B.[Bolei],
Sun, Y.[Yiyou],
Bau, D.[David],
Torralba, A.[Antonio],
Interpretable Basis Decomposition for Visual Explanation,
ECCV18(VIII: 122-138).
Springer DOI
1810
BibRef
Kottur, S.[Satwik],
Moura, J.M.F.[José M. F.],
Parikh, D.[Devi],
Batra, D.[Dhruv],
Rohrbach, M.[Marcus],
Visual Coreference Resolution in Visual Dialog Using Neural Module
Networks,
ECCV18(XV: 160-178).
Springer DOI
1810
BibRef
Strub, F.[Florian],
Seurin, M.[Mathieu],
Perez, E.[Ethan],
de Vries, H.[Harm],
Mary, J.[Jérémie],
Preux, P.[Philippe],
Courville, A.[Aaron],
Pietquin, O.[Olivier],
Visual Reasoning with Multi-hop Feature Modulation,
ECCV18(VI: 808-831).
Springer DOI
1810
BibRef
Shi, Y.[Yang],
Furlanello, T.[Tommaso],
Zha, S.[Sheng],
Anandkumar, A.[Animashree],
Question Type Guided Attention in Visual Question Answering,
ECCV18(II: 158-175).
Springer DOI
1810
BibRef
Narasimhan, M.[Medhini],
Schwing, A.G.[Alexander G.],
Straight to the Facts: Learning Knowledge Base Retrieval for Factual
Visual Question Answering,
ECCV18(VIII: 460-477).
Springer DOI
1810
BibRef
Yang, G.R.[Guangyu Robert],
Ganichev, I.[Igor],
Wang, X.J.[Xiao-Jing],
Shlens, J.[Jonathon],
Sussillo, D.[David],
A Dataset and Architecture for Visual Reasoning with a Working Memory,
ECCV18(X: 729-745).
Springer DOI
1810
BibRef
Malinowski, M.[Mateusz],
Doersch, C.[Carl],
Santoro, A.[Adam],
Battaglia, P.[Peter],
Learning Visual Question Answering by Bootstrapping Hard Attention,
ECCV18(VI: 3-20).
Springer DOI
1810
BibRef
Kim, K.M.[Kyung-Min],
Choi, S.H.[Seong-Ho],
Kim, J.H.[Jin-Hwa],
Zhang, B.T.[Byoung-Tak],
Multimodal Dual Attention Memory for Video Story Question Answering,
ECCV18(XV: 698-713).
Springer DOI
1810
BibRef
Yu, Y.J.[Young-Jae],
Kim, J.S.[Jong-Seok],
Kim, G.[Gunhee],
A Joint Sequence Fusion Model for Video Question Answering and
Retrieval,
ECCV18(VII: 487-503).
Springer DOI
1810
BibRef
Gu, J.X.[Jiu-Xiang],
Cai, J.F.[Jian-Fei],
Joty, S.[Shafiq],
Niu, L.[Li],
Wang, G.[Gang],
Look, Imagine and Match: Improving Textual-Visual Cross-Modal
Retrieval with Generative Models,
CVPR18(7181-7189)
IEEE DOI
1812
Visualization, Training, Decoding, Semantics, Measurement.
BibRef
Yang, Z.Y.[Zheng-Yuan],
Chen, T.L.[Tian-Lang],
Wang, L.[Liwei],
Luo, J.B.[Jie-Bo],
Improving One-Stage Visual Grounding by Recursive Sub-query
Construction,
ECCV20(XIV:387-404).
Springer DOI
2011
Code, Query.
WWW Link.
BibRef
Li, Q.[Qing],
Tao, Q.Y.[Qing-Yi],
Joty, S.[Shafiq],
Cai, J.F.[Jian-Fei],
Luo, J.B.[Jie-Bo],
VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual
Questions,
ECCV18(VII: 570-586).
Springer DOI
1810
BibRef
Bai, Y.L.[Ya-Long],
Fu, J.L.[Jian-Long],
Zhao, T.J.[Tie-Jun],
Mei, T.[Tao],
Deep Attention Neural Tensor Network for Visual Question Answering,
ECCV18(XII: 21-37).
Springer DOI
1810
BibRef
Sinha, A.[Abhishek],
Ayush, K.[Kumar],
Towards Mathematical Reasoning: A Multimodal Deep Learning Approach,
ICIP18(4028-4032)
IEEE DOI
1809
Mathematical model, Task analysis, Visualization, Decoding,
Computational modeling, Machine learning, Numerical models,
Mathematical Reasoning
BibRef
Yu, D.,
Gao, X.,
Xiong, H.,
Structured Semantic Representation for Visual Question Answering,
ICIP18(2286-2290)
IEEE DOI
1809
Semantics, Training, Cognition, Visualization, Task analysis,
Linguistics, Computational modeling,
Visual question answering
BibRef
Huang, L.,
Kulkarni, K.,
Jha, A.,
Lohit, S.,
Jayasuriya, S.,
Turaga, P.K.,
CS-VQA: Visual Question Answering with Compressively Sensed Images,
ICIP18(1283-1287)
IEEE DOI
1809
Visualization, Image reconstruction, Image coding, Task analysis,
Feature extraction, Training, Multiplexing, Computer vision,
image reconstruction
BibRef
Hasan Chowdhury, M.I.,
Nguyen, K.,
Sridharan, S.,
Fookes, C.,
Hierarchical Relational Attention for Video Question Answering,
ICIP18(599-603)
IEEE DOI
1809
Feature extraction, Knowledge discovery, Visualization,
Task analysis, Mathematical model, Natural languages,
scene understanding
BibRef
Desta, M.T.,
Chen, L.,
Kornuta, T.,
Object-Based Reasoning in VQA,
WACV18(1814-1823)
IEEE DOI
1806
data visualisation, inference mechanisms,
natural language processing, object detection,
Visualization
BibRef
Zhao, H.,
Fan, Q.,
Gutfreund, D.,
Fu, Y.,
Semantically Guided Visual Question Answering,
WACV18(1852-1860)
IEEE DOI
1806
data visualisation, image colour analysis, image representation,
learning (artificial intelligence),
Visualization
BibRef
Wang, Z.,
Liu, X.,
Wang, L.,
Qiao, Y.,
Xie, X.,
Fowlkes, C.C.[Charless C.],
Structured Triplet Learning with POS-Tag Guided Attention for Visual
Question Answering,
WACV18(1888-1896)
IEEE DOI
1806
convolution, data visualisation,
learning (artificial intelligence),
Visualization
BibRef
Chowdhury, I.,
Nguyen, K.,
Fookes, C.,
Sridharan, S.,
A cascaded long short-term memory (LSTM) driven generic visual
question answering (VQA),
ICIP17(1842-1846)
IEEE DOI
1803
Feature extraction, Mathematical model, Natural languages,
Principal component analysis, Task analysis, Training,
scene understanding
BibRef
Sheng, S.[Shurong],
Venkitasubramanian, A.N.[Aparna Nurani],
Moens, M.F.[Marie-Francine],
A Markov Network Based Passage Retrieval Method for Multimodal Question
Answering in the Cultural Heritage Domain,
MMMod18(I:3-15).
Springer DOI
1802
BibRef
Rosso-Mateus, A.[Andrés],
González, F.A.[Fabio A.],
Montes-y-Gómez, M.[Manuel],
A Two-Step Neural Network Approach to Passage Retrieval for Open Domain
Question Answering,
CIARP17(566-574).
Springer DOI
1802
BibRef
Gupta, T.[Tanmay],
Shih, K.J.[Kevin J.],
Singh, S.[Saurabh],
Hoiem, D.[Derek],
Aligned Image-Word Representations Improve Inductive Transfer Across
Vision-Language Tasks,
ICCV17(4223-4232)
IEEE DOI
1802
data visualisation, image recognition,
learning (artificial intelligence),
Visualization
BibRef
Das, A.,
Kottur, S.,
Moura, J.M.F.,
Lee, S.,
Batra, D.,
Learning Cooperative Visual Dialog Agents with Deep Reinforcement
Learning,
ICCV17(2970-2979)
IEEE DOI
1802
interactive systems, learning (artificial intelligence),
multi-agent systems, natural language interfaces, robot vision,
Visualization
BibRef
Mun, J.[Jonghwan],
Seo, P.H.[Paul Hongsuck],
Jung, I.[Ilchae],
Han, B.H.[Bo-Hyung],
MarioQA: Answering Questions by Watching Gameplay Videos,
ICCV17(2886-2894)
IEEE DOI
1802
computer games, inference mechanisms, neural nets,
question answering (information retrieval), VideoQA problems,
Visualization
BibRef
Gan, C.,
Li, Y.,
Li, H.,
Sun, C.,
Gong, B.,
VQS: Linking Segmentations to Questions and Answers for Supervised
Attention in VQA and Question-Focused Semantic Segmentation,
ICCV17(1829-1838)
IEEE DOI
1802
image annotation, image segmentation, multilayer perceptrons,
question answering (information retrieval), COCO, VQA dataset,
Visualization
BibRef
Yu, Z.,
Yu, J.,
Fan, J.,
Tao, D.,
Multi-modal Factorized Bilinear Pooling with Co-attention Learning
for Visual Question Answering,
ICCV17(1839-1848)
IEEE DOI
1802
computational complexity, feature extraction, image fusion,
learning (artificial intelligence),
Visualization
BibRef
Ben-younes, H.,
Cadene, R.,
Cord, M.,
Thome, N.,
MUTAN: Multimodal Tucker Fusion for Visual Question Answering,
ICCV17(2631-2639)
IEEE DOI
1802
image fusion, image representation,
question answering (information retrieval), tensors, (VQA) tasks,
Visualization
BibRef
Zhu, C.,
Zhao, Y.,
Huang, S.,
Tu, K.,
Ma, Y.,
Structured Attentions for Visual Question Answering,
ICCV17(1300-1309)
IEEE DOI
1802
belief networks, data visualisation, image retrieval,
inference mechanisms, neural nets,
Visualization
BibRef
Hu, R.,
Andreas, J.,
Rohrbach, M.,
Darrell, T.J.,
Saenko, K.,
Learning to Reason: End-to-End Module Networks for Visual Question
Answering,
ICCV17(804-813)
IEEE DOI
1802
computational linguistics, grammars, natural language processing,
neural net architecture,
Visualization
BibRef
Jain, U.[Unnat],
Zhang, Z.Y.[Zi-Yu],
Schwing, A.[Alexander],
Creativity: Generating Diverse Questions Using Variational
Autoencoders,
CVPR17(5415-5424)
IEEE DOI
1711
Artificial intelligence, Creativity, Hidden Markov models,
Training, Transforms, Visualization
BibRef
de Vries, H.[Harm],
Strub, F.[Florian],
Chandar, S.[Sarath],
Pietquin, O.[Olivier],
Larochelle, H.[Hugo],
Courville, A.[Aaron],
GuessWhat?! Visual Object Discovery through Multi-modal Dialogue,
CVPR17(4466-4475)
IEEE DOI
1711
Databases, Games, Knowledge discovery,
Natural languages, Visualization
BibRef
Nam, H.[Hyeonseob],
Ha, J.W.[Jung-Woo],
Kim, J.[Jeonghee],
Dual Attention Networks for Multimodal Reasoning and Matching,
CVPR17(2156-2164)
IEEE DOI
1711
Cognition, Knowledge discovery, Mathematical model,
Neural networks, Semantics, Visualization
BibRef
Johnson, J.[Justin],
Hariharan, B.[Bharath],
van der Maaten, L.[Laurens],
Hoffman, J.,
Fei-Fei, L.[Li],
Zitnick, C.L.[C. Lawrence],
Girshick, R.[Ross],
Inferring and Executing Programs for Visual Reasoning,
ICCV17(3008-3017)
IEEE DOI
1802
BibRef
Earlier: A1, A2, A3, A5, A6, A7, Only:
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary
Visual Reasoning,
CVPR17(1988-1997)
IEEE DOI
1711
Dataset, Visual Reasoning.
WWW Link.
backpropagation, image matching,
learning (artificial intelligence), neural nets,
Visualization.
Cognition, Image color analysis, Metals, Semantics, Shape.
BibRef
Yu, Y.,
Ko, H.,
Choi, J.,
Kim, G.,
End-to-End Concept Word Detection for Video Captioning, Retrieval,
and Question Answering,
CVPR17(3261-3269)
IEEE DOI
1711
Detectors, Knowledge discovery, Motion pictures, Semantics,
Training, Visualization
BibRef
Das, A.[Abhishek],
Kottur, S.[Satwik],
Gupta, K.[Khushi],
Singh, A.[Avi],
Yadav, D.[Deshraj],
Moura, J.M.F.[José M. F.],
Parikh, D.[Devi],
Batra, D.[Dhruv],
Visual Dialog,
CVPR17(1080-1089)
IEEE DOI
1711
Hold a dialog with humans in a natural visual context.
History, Knowledge discovery, Protocols, Visualization, Wheelchairs
BibRef
Zhu, Y.,
Lim, J.J.,
Fei-Fei, L.[Li],
Knowledge Acquisition for Visual Question Answering via Iterative
Querying,
CVPR17(6146-6155)
IEEE DOI
1711
Computational modeling, Data models, Generators,
Knowledge discovery, Standards, Visualization
BibRef
Peris, Á.[Álvaro],
Casacuberta, F.[Francisco],
Interactive-Predictive Neural Multimodal Systems,
IbPRIA19(I:16-28).
Springer DOI
1910
BibRef
Bolaños, M.[Marc],
Peris, Á.[Álvaro],
Casacuberta, F.[Francisco],
Radeva, P.[Petia],
VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question
Answering,
IbPRIA17(372-380).
Springer DOI
1706
BibRef
Gao, P.[Peng],
Li, H.S.[Hong-Sheng],
Li, S.[Shuang],
Lu, P.[Pan],
Li, Y.K.[Yi-Kang],
Hoi, S.C.H.[Steven C. H.],
Wang, X.G.[Xiao-Gang],
Question-Guided Hybrid Convolution for Visual Question Answering,
ECCV18(I: 485-501).
Springer DOI
1810
BibRef
Li, Y.K.[Yi-Kang],
Duan, N.[Nan],
Zhou, B.L.[Bo-Lei],
Chu, X.[Xiao],
Ouyang, W.L.[Wan-Li],
Wang, X.G.[Xiao-Gang],
Zhou, M.[Ming],
Visual Question Generation as Dual Task of Visual Question Answering,
CVPR18(6116-6124)
IEEE DOI
1812
Task analysis, Visualization, Knowledge discovery, Training,
Computational modeling, Computer vision
BibRef
Gao, P.[Peng],
Jiang, Z.K.[Zheng-Kai],
You, H.X.[Hao-Xuan],
Lu, P.[Pan],
Hoi, S.C.H.[Steven C. H.],
Wang, X.G.[Xiao-Gang],
Li, H.S.[Hong-Sheng],
Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual
Question Answering,
CVPR19(6632-6641).
IEEE DOI
2002
BibRef
Lin, Y.T.[Yue-Tan],
Pang, Z.Y.[Zhang-Yang],
Li, Y.[Yanan],
Wang, D.H.[Dong-Hui],
Simple and effective visual question answering in a single modality,
ICIP16(2276-2280)
IEEE DOI
1610
Benchmark testing. Not just add text to image questions.
BibRef
Kafle, K.[Kushal],
Kanan, C.[Christopher],
An Analysis of Visual Question Answering Algorithms,
ICCV17(1983-1991)
IEEE DOI
1802
BibRef
Earlier:
Answer-Type Prediction for Visual Question Answering,
CVPR16(4976-4984)
IEEE DOI
1612
case-based reasoning, computer vision, data visualisation,
image retrieval, neural nets,
Visualization
BibRef
Wang, P.,
Wu, Q.,
Shen, C.,
van den Hengel, A.J.[Anton J.],
The VQA-Machine: Learning How to Use Existing Vision Algorithms to
Answer New Questions,
CVPR17(3909-3918)
IEEE DOI
1711
Cognition, Data mining, Neural networks, Prediction algorithms,
Telescopes, Visualization
BibRef
Yu, D.,
Fu, J.,
Mei, T.,
Rui, Y.,
Multi-level Attention Networks for Visual Question Answering,
CVPR17(4187-4195)
IEEE DOI
1711
Feature extraction, Knowledge discovery, Natural languages,
Recurrent neural networks, Semantics, Visualization
BibRef
Kembhavi, A.,
Seo, M.,
Schwenk, D.,
Choi, J.,
Farhadi, A.,
Hajishirzi, H.,
Are You Smarter Than a Sixth Grader? Textbook Question Answering for
Multimodal Machine Comprehension,
CVPR17(5376-5384)
IEEE DOI
1711
Cognition, Knowledge discovery, Natural languages,
Training, Visualization
BibRef
Ganju, S.,
Russakovsky, O.,
Gupta, A.,
What's in a Question:
Using Visual Questions as a Form of Supervision,
CVPR17(6422-6431)
IEEE DOI
1711
Artificial intelligence, Computational modeling,
Dogs, Image color analysis, SPICE, Visualization
BibRef
Ramakrishnan, S.K.,
Pal, A.,
Sharma, G.,
Mittal, A.,
An Empirical Evaluation of Visual Question Answering for Novel
Objects,
CVPR17(7312-7321)
IEEE DOI
1711
Knowledge discovery, Recurrent neural networks, Training,
Training data, Visualization, Vocabulary
BibRef
Maharaj, T.,
Ballas, N.,
Rohrbach, A.,
Courville, A.,
Pal, C.,
A Dataset and Exploration of Models for Understanding Video Data
through Fill-in-the-Blank Question-Answering,
CVPR17(7359-7368)
IEEE DOI
1711
Computational modeling, Motion pictures, Natural languages,
Training, Visualization, Voltage control
BibRef
Xu, H.J.[Hui-Juan],
Saenko, K.[Kate],
Ask, Attend and Answer:
Exploring Question-Guided Spatial Attention for Visual Question Answering,
ECCV16(VII: 451-466).
Springer DOI
1611
Visual Question Answering.
BibRef
Jabri, A.[Allan],
Joulin, A.[Armand],
van der Maaten, L.[Laurens],
Revisiting Visual Question Answering Baselines,
ECCV16(VIII: 727-739).
Springer DOI
1611
BibRef
Yang, Z.C.[Zi-Chao],
He, X.D.[Xiao-Dong],
Gao, J.F.[Jian-Feng],
Deng, L.[Li],
Smola, A.[Alex],
Stacked Attention Networks for Image Question Answering,
CVPR16(21-29)
IEEE DOI
1612
BibRef
Sadeghi, F.[Fereshteh],
Divvala, S.K.[Santosh K.],
Farhadi, A.[Ali],
VisKE: Visual knowledge extraction and question answering by visual
verification of relation phrases,
CVPR15(1456-1464)
IEEE DOI
1510
Visual verification of text relationships.
BibRef
Liu, Y.[Yang],
Liu, J.[Jie],
Wang, D.[Dong],
Cheng, J.[Jian],
A robust multivariate reranking algorithm for Question Answering
enrichment,
ICIP12(1917-1920).
IEEE DOI
1302
BibRef
Varekamp, C.[Chris],
van de Walle, P.[Patrick],
de Putter, M.[Marc],
Question interface for 3D picture creation on an autostereoscopic
digital picture frame,
3DTV09(1-4).
IEEE DOI
0905
BibRef