19.4.3.2.1 Visual Question Answering, Query, VQA

Question answering. Visual Q-A. Datasets may also be listed in: See also Object Recognition, Retrieval Datasets. See also Context in Computer Vision.

VQA: Visual Question Answering,
Dataset containing open-ended questions about images. WWW Link.
Dataset, Visual Question Answering. See also VQA: Visual Question Answering.

Visual7W visual question answering,
Large-scale visual question answering (QA) dataset, with object-level groundings and multimodal answers. WWW Link.
Dataset, Visual Question Answering.

Visual Genome,
Visual Genome is a dataset, a knowledge base, an ongoing effort to connect structured image concepts to language. WWW Link.
WWW Link. Dataset, Visual Question Answering.

Wu, Y.C.[Yu-Chieh], Yang, J.C.[Jie-Chi],
A Robust Passage Retrieval Algorithm for Video Question Answering,
CirSysVideo(18), No. 10, October 2008, pp. 1411-1421.
IEEE DOI 0811
BibRef

Wu, Y.C.[Yu-Chieh], Lee, Y.S.[Yue-Shi], Yang, J.C.[Jie-Chi], Yen, S.J.[Show-Jane],
A New Passage Ranking Algorithm for Video Question Answering,
PSIVT06(563-572).
Springer DOI 0612
BibRef

Li, G.D.[Guang-Da], Li, H.[Haojie], Ming, Z.Y.[Zhao-Yan], Hong, R.[Richang], Tang, S.[Sheng], Chua, T.S.[Tat-Seng],
Question Answering over Community-Contributed Web Videos,
MultMedMag(17), No. 4, October-December 2010, pp. 46-57.
IEEE DOI 1011
BibRef

Song, Y.C.[Yi-Cheng], Li, H.[Haojie],
Mash-Up Approach for Web Video Category Recommendation,
PSIVT10(197-202).
IEEE DOI 1011
BibRef

Agrawal, A.[Aishwarya], Lu, J.[Jiasen], Antol, S.[Stanislaw], Mitchell, M.[Margaret], Zitnick, C.L.[C. Lawrence], Parikh, D.[Devi], Batra, D.[Dhruv],
VQA: Visual Question Answering,
IJCV(123), No. 1, May 2017, pp. 4-31.
Springer DOI 1705
BibRef

Malinowski, M.[Mateusz], Rohrbach, M.[Marcus], Fritz, M.[Mario],
Ask Your Neurons: A Deep Learning Approach to Visual Question Answering,
IJCV(125), No. 1-3, December 2017, pp. 110-135.
Springer DOI 1711
BibRef
Earlier:
Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images,
ICCV15(1-9)
IEEE DOI 1602
Deep learning for questions about real-world images. A Visual Turing Test. Language output based on visual and natural language input. BibRef

Kafle, K.[Kushal], Kanan, C.[Christopher],
Visual question answering: Datasets, algorithms, and future challenges,
CVIU(163), No. 1, 2017, pp. 3-20.
Elsevier DOI 1712
Image understanding BibRef

Wu, Q.[Qi], Teney, D.[Damien], Wang, P.[Peng], Shen, C.H.[Chun-Hua], Dick, A.[Anthony], van den Hengel, A.J.[Anton J.],
Visual question answering: A survey of methods and datasets,
CVIU(163), No. 1, 2017, pp. 21-40.
Elsevier DOI 1712
Survey, Visual Question Answering. Visual question answering BibRef

Tamaazousti, Y.[Youssef], Borgne, H.L.[Hervé Le], Popescu, A.[Adrian], Gadeski, E.[Etienne], Ginsca, A.[Alexandru], Hudelot, C.[Céline],
Vision-language integration using constrained local semantic features,
CVIU(163), No. 1, 2017, pp. 41-57.
Elsevier DOI 1712
Image classification BibRef

Das, A.[Abhishek], Agrawal, H.[Harsh], Zitnick, L.[Larry], Parikh, D.[Devi], Batra, D.[Dhruv],
Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?,
CVIU(163), No. 1, 2017, pp. 90-100.
Elsevier DOI 1712
Visual Question Answering BibRef

Teney, D.[Damien], Wu, Q., van den Hengel, A.J.[Anton J.],
Visual Question Answering: A Tutorial,
SPMag(34), No. 6, November 2017, pp. 63-75.
IEEE DOI 1712
Survey, Visual Question Answering. Bioinformatics, Computer vision, Genomics, Machine learning, Visualization BibRef

Teney, D.[Damien], Liu, L., van den Hengel, A.J.[Anton J.],
Graph-Structured Representations for Visual Question Answering,
CVPR17(3233-3241)
IEEE DOI 1711
Feature extraction, Knowledge discovery, Neural networks, Syntactics, Training, Visualization BibRef

Teney, D.[Damien], van den Hengel, A.J.[Anton J.],
Visual Question Answering as a Meta Learning Task,
ECCV18(XV: 229-245).
Springer DOI 1810
BibRef

Wu, Q.[Qi], Shen, C.H.[Chun-Hua], Wang, P.[Peng], Dick, A.[Anthony], van den Hengel, A.J.[Anton J.],
Image Captioning and Visual Question Answering Based on Attributes and External Knowledge,
PAMI(40), No. 6, June 2018, pp. 1367-1381.
IEEE DOI 1805
BibRef
Earlier: A1, A3, A2, A4, A5:
Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge from External Sources,
CVPR16(4622-4630)
IEEE DOI 1612
Computational modeling, Computer vision, Knowledge based systems, Knowledge discovery, Resource description framework, Semantics, visual question answering BibRef

Lioutas, V.[Vasileios], Passalis, N.[Nikolaos], Tefas, A.[Anastasios],
Explicit ensemble attention learning for improving visual question answering,
PRL(111), 2018, pp. 51-57.
Elsevier DOI 1808
Visual question answering, Explicit attention, Pictorial superiority effect BibRef

Huang, Y.Z.[Yan-Zhou], Zhong, T.[Tao],
Multitask learning for neural generative question answering,
RealTimeIP(14), No. 1, January 2018, pp. 1009-1017.
WWW Link. 1809
BibRef

Zhang, Q.S.[Quan-Shi], Wu, Y.N.[Ying Nian], Zhang, H.[Hao], Zhu, S.C.[Song-Chun],
Mining deep And-Or object structures via cost-sensitive question-answer-based active annotations,
CVIU(176-177), 2018, pp. 33-44.
Elsevier DOI 1812
Hierarchical graphical model, Part semantics BibRef

Zhang, Q.S.[Quan-Shi], Cao, R., Wu, Y.N.[Ying Nian], Zhu, S.C.[Song-Chun],
Mining Object Parts from CNNs via Active Question-Answering,
CVPR17(3890-3899)
IEEE DOI 1711
BibRef
Earlier: A1, A3, A4, Only:
Mining And-Or Graphs for Graph Matching and Object Discovery,
ICCV15(55-63)
IEEE DOI 1602
Object detection, Object recognition, Semantics, Strain, Training, Visualization BibRef

Garg, S.[Shivam], Srivastava, R.[Rajeev],
Object sequences: encoding categorical and spatial information for a yes/no visual question answering task,
IET-CV(12), No. 8, December 2018, pp. 1141-1150.
DOI Link 1812
BibRef

Goyal, Y.[Yash], Khot, T.[Tejas], Agrawal, A.[Aishwarya], Summers-Stay, D.[Douglas], Batra, D.[Dhruv], Parikh, D.[Devi],
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering,
IJCV(127), No. 4, April 2019, pp. 398-414.
Springer DOI 1903
BibRef
Earlier: A1, A2, A4, A5, A6, Only: CVPR17(6325-6334)
IEEE DOI 1711
Benchmark testing, Data collection, Data models, Knowledge discovery, Protocols, Visualization BibRef

Fang, Z.W.[Zhi-Wei], Liu, J.[Jing], Li, Y.[Yong], Qiao, Y.Y.[Yan-Yuan], Lu, H.Q.[Han-Qing],
Improving visual question answering using dropout and enhanced question encoder,
PR(90), 2019, pp. 404-414.
Elsevier DOI 1903
Visual question answering, Coherent dropout, Siamese dropout, Enhanced question encoder BibRef

Das, A.[Abhishek], Kottur, S.[Satwik], Gupta, K.[Khushi], Singh, A.[Avi], Yadav, D.[Deshraj], Lee, S.[Stefan], Moura, J.M.F.[José M. F.], Parikh, D.[Devi], Batra, D.[Dhruv],
Visual Dialog,
PAMI(41), No. 5, May 2019, pp. 1242-1256.
IEEE DOI 1904
Hold a meaningful dialog about visual content. Visualization, Task analysis, Artificial intelligence, History, Protocols, Natural languages, Wheelchairs, Visual dialog, machine learning BibRef

Osman, A.[Ahmed], Samek, W.[Wojciech],
DRAU: Dual Recurrent Attention Units for Visual Question Answering,
CVIU(185), 2019, pp. 24-30.
Elsevier DOI 1906
Visual Question Answering, Attention Mechanisms, Multi-modal Learning, Machine Vision, Natural Language Processing BibRef

Liang, J.W.[Jun-Wei], Jiang, L.[Lu], Cao, L.L.[Liang-Liang], Kalantidis, Y.[Yannis], Li, L.J.[Li-Jia], Hauptmann, A.G.[Alexander G.],
Focal Visual-Text Attention for Memex Question Answering,
PAMI(41), No. 8, August 2019, pp. 1893-1908.
IEEE DOI 1907
BibRef
Earlier: A1, A2, A3, A5, A6, Only:
Focal Visual-Text Attention for Visual Question Answering,
CVPR18(6135-6143)
IEEE DOI 1812
Task analysis, Knowledge discovery, Visualization, Grounding, Metadata, Cognition, Photo albums, question answering, memex. Visualization, Videos, Computational modeling, Correlation. BibRef

Zhao, Z.[Zhou], Zhang, Z.[Zhu], Jiang, X.H.[Xing-Hua], Cai, D.[Deng],
Multi-Turn Video Question Answering via Hierarchical Attention Context Reinforced Networks,
IP(28), No. 8, August 2019, pp. 3860-3872.
IEEE DOI 1907
learning (artificial intelligence), natural language processing, reinforcement learning BibRef

Xue, H.Y.[Hong-Yang], Chu, W., Zhao, Z.[Zhou], Cai, D.[Deng],
A Better Way to Attend: Attention With Trees for Video Question Answering,
IP(27), No. 11, November 2018, pp. 5563-5574.
IEEE DOI 1809
computational linguistics, feature extraction, grammars, natural language processing, scene understanding BibRef

Xue, H.Y.[Hong-Yang], Zhao, Z.[Zhou], Cai, D.[Deng],
Unifying the Video and Question Attentions for Open-Ended Video Question Answering,
IP(26), No. 12, December 2017, pp. 5656-5666.
IEEE DOI 1710
image retrieval, video coding, temporal question attention, temporal structures, Adaptation models, Coherence, Hair, Knowledge discovery BibRef

Zhao, Z.[Zhou], Xiao, S.W.[Shu-Wen], Song, Z.[Zehan], Lu, C.J.[Chu-Jie], Xiao, J.[Jun], Zhuang, Y.T.[Yue-Ting],
Open-Ended Video Question Answering via Multi-Modal Conditional Adversarial Networks,
IP(29), 2020, pp. 3859-3870.
IEEE DOI 2002
Open-ended video question answering, multi-modal neural network BibRef

Zhao, Z.[Zhou], Zhang, Z.[Zhu], Xiao, S.W.[Shu-Wen], Xiao, Z.X.[Zhen-Xin], Yan, X.H.[Xiao-Hui], Yu, J.[Jun], Cai, D.[Deng], Wu, F.[Fei],
Long-Form Video Question Answering via Dynamic Hierarchical Reinforced Networks,
IP(28), No. 12, December 2019, pp. 5939-5952.
IEEE DOI 1909
Knowledge discovery, Semantics, Visualization, Natural languages, Road transportation, Task analysis, Decoding, reinforcement learning BibRef

Jang, Y.[Yunseok], Song, Y.[Yale], Kim, C.D.[Chris Dongjoo], Yu, Y.[Youngjae], Kim, Y.[Youngjin], Kim, G.[Gunhee],
Video Question Answering with Spatio-Temporal Reasoning,
IJCV(127), No. 10, October 2019, pp. 1385-1412.
Springer DOI 1909
BibRef
Earlier: A1, A2, A4, A5, A6, Only:
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering,
CVPR17(1359-1367)
IEEE DOI 1711
Cognition, Crowdsourcing, Image color analysis, Knowledge discovery, Motion pictures, Visualization BibRef

Toor, A.S.[Andeep S.], Wechsler, H.[Harry], Nappi, M.[Michele],
Biometric surveillance using visual question answering,
PRL(126), 2019, pp. 111-118.
Elsevier DOI 1909
Biometrics, Forensics, Visual question answering, Question relevance, Surveillance, Deep learning, Visual turing test BibRef

Wang, A., Luu, A.T., Foo, C., Zhu, H., Tay, Y., Chandrasekhar, V.,
Holistic Multi-Modal Memory Network for Movie Question Answering,
IP(29), No. 1, 2020, pp. 489-499.
IEEE DOI 1910
question answering (information retrieval), holistic multimodal memory network, multimodal context, MovieQA BibRef

Yu, T., Yu, J., Yu, Z., Tao, D.,
Compositional Attention Networks With Two-Stream Fusion for Video Question Answering,
IP(29), 2020, pp. 1204-1218.
IEEE DOI 1911
Visualization, Streaming media, Knowledge discovery, Feature extraction, Proposals, Task analysis, Semantics, action pooling stream BibRef

Ruwa, N.[Nelson], Mao, Q.[Qirong], Song, H.P.[He-Ping], Jia, H.J.[Hong-Jie], Dong, M.[Ming],
Triple attention network for sentimental visual question answering,
CVIU(189), 2019, pp. 102829.
Elsevier DOI 1911
Visual question answering, Feature embedding, Attention model, Sentiment analysis BibRef

Li, W.W.[Wen-Wen], Song, M.M.[Miao-Miao], Tian, Y.Y.[Yuan-Yuan],
An Ontology-Driven Cyberinfrastructure for Intelligent Spatiotemporal Question Answering and Open Knowledge Discovery,
IJGI(8), No. 11, 2019, pp. xx-yy.
DOI Link 1912
BibRef

Tommasi, T.[Tatiana], Mallya, A.[Arun], Plummer, B.A.[Bryan A.], Lazebnik, S.[Svetlana], Berg, A.C.[Alexander C.], Berg, T.L.[Tamara L.],
Combining Multiple Cues for Visual Madlibs Question Answering,
IJCV(127), No. 1, January 2019, pp. 38-60.
Springer DOI 1901
BibRef
Earlier:
Solving Visual Madlibs with Multiple Cues,
BMVC16(xx-yy).
HTML Version. 1805
BibRef

Yu, L.C.[Li-Cheng], Park, E.[Eunbyung], Berg, A.C.[Alexander C.], Berg, T.L.[Tamara L.],
Visual Madlibs: Fill in the Blank Description Generation and Question Answering,
ICCV15(2461-2469)
IEEE DOI 1602
Dataset consisting of 360,001 focused natural language descriptions for 10,738 images. BibRef

Xi, Y.L.[Yu-Ling], Zhang, Y.N.[Yan-Ning], Ding, S.T.[Song-Tao], Wan, S.H.[Shao-Hua],
Visual question answering model based on visual relationship detection,
SP:IC(80), 2020, pp. 115648.
Elsevier DOI 1912
Visual question answering, Appearance features, Relationship predicate, Word vector similarity BibRef

Liu, F., Xiang, T., Hospedales, T.M., Yang, W., Sun, C.,
Inverse Visual Question Answering: A New Benchmark and VQA Diagnosis Tool,
PAMI(42), No. 2, February 2020, pp. 460-474.
IEEE DOI 2001
BibRef
Earlier:
iVQA: Inverse Visual Question Answering,
CVPR18(8611-8619)
IEEE DOI 1812
Benchmark testing, Visualization, Predictive models, Analytical models, Image color analysis, Knowledge discovery, reinforcement learning. Task analysis, Measurement, Decoding, Natural languages, Cognition BibRef

Wu, Y., Jiang, L., Yang, Y.,
Revisiting EmbodiedQA: A Simple Baseline and Beyond,
IP(29), 2020, pp. 3984-3992.
IEEE DOI 2002
Embodied question answering, vision and language, visual question answering BibRef

Huang, C.[Chaoran], Yao, L.[Lina], Wang, X.Z.[Xian-Zhi], Benatallah, B.[Boualem], Zhang, X.[Xiang],
Software expert discovery via knowledge domain embeddings in a collaborative network,
PRL(130), 2020, pp. 46-53.
Elsevier DOI 2002
Knowledge discovery, Stack overflow, Expertise finding, Question answering, Expert as a Service BibRef

Wang, W.[Weining], Huang, Y.[Yan], Wang, L.[Liang],
Long video question answering: A Matching-guided Attention Model,
PR(102), 2020, pp. 107248.
Elsevier DOI 2003
Long video QA, Matching-guided attention BibRef

Zhang, W., Tang, S., Cao, Y., Pu, S., Wu, F., Zhuang, Y.,
Frame Augmented Alternating Attention Network for Video Question Answering,
MultMed(22), No. 4, April 2020, pp. 1032-1041.
IEEE DOI 2004
Feature extraction, Visualization, Knowledge discovery, Task analysis, Data mining, Neural networks, Semantics, Video QA, neural network BibRef

Chen, J.[Jie], Shao, J.[Jie], He, C.[Chengkun],
Movie fill in the blank by joint learning from video and text with adaptive temporal attention,
PRL(132), 2020, pp. 62-68.
Elsevier DOI 2005
Video question answering, Adaptive temporal attention, Text information fusion BibRef

Li, W.[Wei], Sun, J.H.[Jian-Hui], Liu, G.[Ge], Zhao, L.[Linglan], Fang, X.Z.[Xiang-Zhong],
Visual question answering with attention transfer and a cross-modal gating mechanism,
PRL(133), 2020, pp. 334-340.
Elsevier DOI 2005
Attention, Visual question answering, Gating BibRef

Messina, N.[Nicola], Amato, G.[Giuseppe], Carrara, F.[Fabio], Falchi, F.[Fabrizio], Gennaro, C.[Claudio],
Learning visual features for relational CBIR,
MultInfoRetr(9), No. 2, June 2020, pp. 113-124.
Springer DOI 2005
BibRef
Earlier:
Learning Relationship-Aware Visual Features,
CEFR-LCV18(IV:486-501).
Springer DOI 1905
BibRef

Methani, N., Ganguly, P., Khapra, M.M., Kumar, P.,
PlotQA: Reasoning over Scientific Plots,
WACV20(1516-1525)
IEEE DOI 2006
Vocabulary, Cognition, Bars, Numerical models, Optical character recognition software, Data mining, Image color analysis BibRef

Guo, D., Wang, H., Wang, S., Wang, M.,
Textual-Visual Reference-Aware Attention Network for Visual Dialog,
IP(29), 2020, pp. 6655-6666.
IEEE DOI 2007
Visualization, Semantics, History, Correlation, Head, Cognition, Task analysis, Visual dialog, attention network, textual reference, multimodal semantic interaction BibRef

Patil, C.[Charulata], Patwardhan, M.[Manasi],
Visual Question Generation: The State of the Art,
Surveys(53), No. 3, May 2020, pp. xx-yy.
DOI Link 2007
Image understanding, question generation BibRef

Yu, J.[Jing], Zhu, Z.[Zihao], Wang, Y.[Yujing], Zhang, W.F.[Wei-Feng], Hu, Y.[Yue], Tan, J.[Jianlong],
Cross-modal knowledge reasoning for knowledge-based visual question answering,
PR(108), 2020, pp. 107563.
Elsevier DOI 2008
Cross-modal knowledge reasoning, Multimodal knowledge graphs, Compositional reasoning module, Explainable reasoning BibRef

Vu, M.H., Löfstedt, T., Nyholm, T., Sznitman, R.,
A Question-Centric Model for Visual Question Answering in Medical Imaging,
MedImg(39), No. 9, September 2020, pp. 2856-2868.
IEEE DOI 2009
Feature extraction, Visualization, Predictive models, Knowledge discovery, Task analysis, Medical diagnostic imaging, medical questions and answers BibRef

Riquelme, F.[Felipe], De Goyeneche, A.[Alfredo], Zhang, Y.D.[Yun-Dong], Niebles, J.C.[Juan Carlos], Soto, A.[Alvaro],
Explaining VQA predictions using visual grounding and a knowledge base,
IVC(101), 2020, pp. 103968.
Elsevier DOI 2009
Deep Learning, Attention, Supervision, Knowledge Base, Interpretability, Explainability BibRef

Zheng, W.B.[Wen-Bo], Yan, L.[Lan], Gou, C.[Chao], Wang, F.Y.[Fei-Yue],
Webly Supervised Knowledge Embedding Model for Visual Reasoning,
CVPR20(12442-12451)
IEEE DOI 2008
Visual reasoning between visual image and natural language description. Visualization, Cognition, Knowledge based systems, Task analysis, Knowledge engineering, Modulation, Robustness BibRef

Qi, J., Niu, Y., Huang, J., Zhang, H.,
Two Causal Principles for Improving Visual Dialog,
CVPR20(10857-10866)
IEEE DOI 2008
Visualization, History, Task analysis, Data models, Training, Feeds, Decoding BibRef

Mun, J., Cho, M., Han, B.,
Local-Global Video-Text Interactions for Temporal Grounding,
CVPR20(10807-10816)
IEEE DOI 2008
Semantics, Feature extraction, Grounding, Visualization, Proposals, Task analysis, Context modeling BibRef

Wu, C., Lin, Z., Cohen, S., Bui, T., Maji, S.,
PhraseCut: Language-Based Image Segmentation in the Wild,
CVPR20(10213-10222)
IEEE DOI 2008
Visualization, Grounding, Image segmentation, Task analysis, Genomics, Bioinformatics, Natural languages BibRef

Abbasnejad, E.[Ehsan], Teney, D.[Damien], Parvaneh, A.[Amin], Shi, J.[Javen], van den Hengel, A.J.[Anton J.],
Counterfactual Vision and Language Learning,
CVPR20(10041-10051)
IEEE DOI 2008
Training, Visualization, Training data, Task analysis, Machine learning, Knowledge discovery, Data models BibRef

Wang, P.[Peng], Wu, Q.[Qi], Cao, J.[Jiewei], Shen, C.H.[Chun-Hua], Gao, L.L.[Lian-Li], van den Hengel, A.J.[Anton J.],
Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks,
CVPR19(1960-1968).
IEEE DOI 2002
BibRef

Guo, D., Wang, H., Zhang, H., Zha, Z., Wang, M.,
Iterative Context-Aware Graph Inference for Visual Dialog,
CVPR20(10052-10061)
IEEE DOI 2008
Visualization, History, Task analysis, Semantics, Message passing, Neural networks, Cognition BibRef

Zhu, Y., Zhu, F., Zhan, Z., Lin, B., Jiao, J., Chang, X., Liang, X.,
Vision-Dialog Navigation by Exploring Cross-Modal Memory,
CVPR20(10727-10736)
IEEE DOI 2008
Navigation, Visualization, Task analysis, History, Memory modules, Natural languages, Decision making BibRef

Bajaj, G., Bandyopadhyay, B., Schmidt, D., Maneriker, P., Myers, C., Parthasarathy, S.,
Understanding Knowledge Gaps in Visual Question Answering: Implications for Gap Identification and Testing,
MVM20(1563-1566)
IEEE DOI 2008
Cognition, Training, Task analysis, Artificial intelligence, Global communication, Taxonomy, Semantics BibRef

Chen, L., Yan, X., Xiao, J., Zhang, H., Pu, S., Zhuang, Y.,
Counterfactual Samples Synthesizing for Robust Visual Question Answering,
CVPR20(10797-10806)
IEEE DOI 2008
Training, Cascading style sheets, Predictive models, Visualization, Image color analysis, Linguistics, Computational modeling BibRef

Vatashsky, B., Ullman, S.,
VQA With No Questions-Answers Training,
CVPR20(10373-10383)
IEEE DOI 2008
Visualization, Training, Image color analysis, Knowledge discovery, Boats, Image analysis, Task analysis BibRef

Jiang, H., Misra, I., Rohrbach, M., Learned-Miller, E., Chen, X.,
In Defense of Grid Features for Visual Question Answering,
CVPR20(10264-10273)
IEEE DOI 2008
Feature extraction, Visualization, Task analysis, Detectors, Object detection, Training, Pipelines BibRef

Wang, X., Liu, Y., Shen, C., Ng, C.C., Luo, C., Jin, L., Chan, C.S., van den Hengel, A., Wang, L.,
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering,
CVPR20(10123-10132)
IEEE DOI 2008
Measurement, Cognition, Knowledge discovery, Correlation, Task analysis, Visualization, Optical character recognition software BibRef

Kim, J., Ma, M., Pham, T., Kim, K., Yoo, C.D.,
Modality Shifting Attention Network for Multi-Modal Video Question Answering,
CVPR20(10103-10112)
IEEE DOI 2008
Cognition, Visualization, Task analysis, Knowledge discovery, Proposals, Modulation, Context modeling BibRef

Xiong, P., Wu, Y.,
TA-Student VQA: Multi-Agents Training by Self-Questioning,
CVPR20(10062-10072)
IEEE DOI 2008
Visualization, Training, Knowledge discovery, Standards, Task analysis, Boosting BibRef

Selvaraju, R.R., Tendulkar, P., Parikh, D., Horvitz, E., Tulio Ribeiro, M., Nushi, B., Kamar, E.,
SQuINTing at VQA Models: Introspecting VQA Models With Sub-Questions,
CVPR20(10000-10008)
IEEE DOI 2008
Cognition, Task analysis, Visualization, Image color analysis, Grounding, Text recognition, Computational modeling BibRef

Le, T.M., Le, V., Venkatesh, S., Tran, T.,
Hierarchical Conditional Relation Networks for Video Question Answering,
CVPR20(9969-9978)
IEEE DOI 2008
Linguistics, Cognition, Visualization, Computer architecture, Context modeling, Encoding, Buildings BibRef

Jiang, M., Chen, S., Yang, J., Zhao, Q.,
Fantastic Answers and Where to Find Them: Immersive Question-Directed Visual Attention,
CVPR20(2977-2986)
IEEE DOI 2008
Task analysis, Videos, Visualization, Computational modeling, Head, Resists, Computer vision BibRef

Agarwal, V., Shetty, R., Fritz, M.,
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing,
CVPR20(9687-9695)
IEEE DOI 2008
Data models, Robustness, Predictive models, Semantics, Correlation, Vocabulary, Visualization BibRef

Hu, R., Singh, A., Darrell, T., Rohrbach, M.,
Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA,
CVPR20(9989-9999)
IEEE DOI 2008
Optical character recognition software, Task analysis, Feature extraction, Visualization, Iterative decoding, Vocabulary, Predictive models BibRef

Kafle, K., Shrestha, R., Price, B., Cohen, S., Kanan, C.,
Answering Questions about Data Visualizations using Efficient Bimodal Fusion,
WACV20(1487-1496)
IEEE DOI 2006
Bars, Data visualization, Image color analysis, Visualization, Task analysis, Optical character recognition software, Training BibRef

Yang, Z., Garcia, N., Chu, C., Otani, M., Nakashima, Y., Takemura, H.,
BERT Representations for Video Question Answering,
WACV20(1545-1554)
IEEE DOI 2006
Visualization, Bit error rate, Feature extraction, Knowledge discovery, Task analysis, Semantics, Standards BibRef

Patro, B.N., Kurmi, V.K., Kumar, S., Namboodiri, V.P.,
Deep Bayesian Network for Visual Question Generation,
WACV20(1555-1565)
IEEE DOI 2006
Bayes methods, Task analysis, Visualization, Uncertainty, Decoding, Probabilistic logic, Semantics BibRef

Patro, B.N., Patel, S., Namboodiri, V.P.,
Robust Explanations for Visual Question Answering,
WACV20(1566-1575)
IEEE DOI 2006
Visualization, Robustness, Perturbation methods, Knowledge discovery, Collaboration, Task analysis, Coherence BibRef

Chou, S., Chao, W., Lai, W., Sun, M., Yang, M.,
Visual Question Answering on 360° Images,
WACV20(1596-1605)
IEEE DOI 2006
Visualization, Task analysis, Feature extraction, Distortion, Cognition, Image color analysis, Spatial resolution BibRef

Chaudhry, R., Shekhar, S., Gupta, U., Maneriker, P., Bansal, P., Joshi, A.,
LEAF-QA: Locate, Encode & Attend for Figure Question Answering,
WACV20(3501-3510)
IEEE DOI 2006
Bars, Knowledge discovery, Image color analysis, Training, Vocabulary, Data mining, Data visualization BibRef

Liang, Y.Z.[Yuan-Zhi], Bai, Y.L.[Ya-Long], Zhang, W.[Wei], Qian, X.M.[Xue-Ming], Zhu, L.[Li], Mei, T.[Tao],
VrR-VG: Refocusing Visually-Relevant Relationships,
ICCV19(10402-10411)
IEEE DOI 2004
bioinformatics, data mining, data visualisation, feature extraction, genomics, graph theory, image annotation, Cognition BibRef

Singh, A.K., Mishra, A., Shekhar, S., Chakraborty, A.,
From Strings to Things: Knowledge-Enabled VQA Model That Can Read and Reason,
ICCV19(4601-4611)
IEEE DOI 2004
document image processing, graph theory, inference mechanisms, neural nets, text analysis, visual content proposals, Proposals BibRef

Chen, L.[Lei], Zhai, M.Y.[Meng-Yao], He, J.W.[Jia-Wei], Mori, G.[Greg],
Object Grounding via Iterative Context Reasoning,
MDALC19(1407-1415)
IEEE DOI 2004
Localize set of queries in the image. image classification, image representation, image segmentation, inference mechanisms, iterative methods, query processing, weakly supervised learning BibRef

Bhattacharya, N., Li, Q., Gurari, D.,
Why Does a Visual Question Have Different Answers?,
ICCV19(4270-4279)
IEEE DOI 2004
Code, Visual Q-A.
WWW Link. computer vision, question answering (information retrieval), visual question answering, Visualization, Powders, Task analysis, Computer vision BibRef

Luo, H., Lin, G., Liu, Z., Liu, F., Tang, Z., Yao, Y.,
SegEQA: Video Segmentation Based Visual Attention for Embodied Question Answering,
ICCV19(9666-9675)
IEEE DOI 2004
computer vision, feature extraction, image fusion, image segmentation, question answering (information retrieval), feature fusion BibRef

Li, L., Gan, Z., Cheng, Y., Liu, J.,
Relation-Aware Graph Attention Network for Visual Question Answering,
ICCV19(10312-10321)
IEEE DOI 2004
data visualisation, graph theory, learning (artificial intelligence), object detection, Computational modeling BibRef

Peng, G., You, H., Zhang, Z., Wang, X., Li, H.,
Multi-Modality Latent Interaction Network for Visual Question Answering,
ICCV19(5824-5834)
IEEE DOI 2004
data visualisation, image representation, image retrieval, learning (artificial intelligence), Object detection BibRef

Fan, C.,
EgoVQA: An Egocentric Video Question Answering Benchmark Dataset,
EPIC19(4359-4366)
IEEE DOI 2004
question answering (information retrieval), video signal processing, EgoVQA dataset, visual question, dataset BibRef

Selvaraju, R.R., Lee, S., Shen, Y., Jin, H., Ghosh, S., Heck, L., Batra, D., Parikh, D.,
Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded,
ICCV19(2591-2600)
IEEE DOI 2004
gradient methods, image retrieval, natural language processing, neural nets, question answering (information retrieval), HINT, Correlation BibRef

Do, T., Tran, H., Do, T., Tjiputra, E., Tran, Q.,
Compact Trilinear Interaction for Visual Question Answering,
ICCV19(392-401)
IEEE DOI 2004
computer vision, learning (artificial intelligence), matrix decomposition, Correlation BibRef

Yang, T., Zha, Z., Zhang, H.,
Making History Matter: History-Advantage Sequence Training for Visual Dialog,
ICCV19(2561-2569)
IEEE DOI 2004
image retrieval, image sequences, interactive systems, neural nets, question answering (information retrieval), Decoding BibRef

Nguyen, D.K.[Duy-Kien], Okatani, T.[Takayuki],
Multi-Task Learning of Hierarchical Vision-Language Representation,
CVPR19(10484-10493).
IEEE DOI 2002
BibRef

Schwartz, I.[Idan], Yu, S.[Seunghak], Hazan, T.[Tamir], Schwing, A.G.[Alexander G.],
Factor Graph Attention,
CVPR19(2039-2048).
IEEE DOI 2002
BibRef

Kolesnikov, A.[Alexander], Zhai, X.H.[Xiao-Hua], Beyer, L.[Lucas],
Revisiting Self-Supervised Visual Representation Learning,
CVPR19(1920-1929).
IEEE DOI 2002
BibRef

Xiong, P.[Peixi], Zhan, H.[Huayi], Wang, X.[Xin], Sinha, B.[Baivab], Wu, Y.[Ying],
Visual Query Answering by Entity-Attribute Graph Matching and Reasoning,
CVPR19(8349-8358).
IEEE DOI 2002
BibRef

Singh, A.[Amanpreet], Natarajan, V.[Vivek], Shah, M.[Meet], Jiang, Y.[Yu], Chen, X.[Xinlei], Batra, D.[Dhruv], Parikh, D.[Devi], Rohrbach, M.[Marcus],
Towards VQA Models That Can Read,
CVPR19(8309-8318).
IEEE DOI 2002
BibRef

Manjunatha, V.[Varun], Saini, N.[Nirat], Davis, L.S.[Larry S.],
Explicit Bias Discovery in Visual Question Answering Models,
CVPR19(9554-9563).
IEEE DOI 2002
BibRef

Guo, D.[Dalu], Xu, C.[Chang], Tao, D.C.[Da-Cheng],
Image-Question-Answer Synergistic Network for Visual Dialog,
CVPR19(10426-10435).
IEEE DOI 2002
BibRef

Shrestha, R.[Robik], Kafle, K.[Kushal], Kanan, C.[Christopher],
Answer Them All! Toward Universal Visual Question Answering Models,
CVPR19(10464-10473).
IEEE DOI 2002
BibRef

Zadeh, A.[Amir], Chan, M.[Michael], Liang, P.P.[Paul Pu], Tong, E.[Edmund], Morency, L.P.[Louis-Philippe],
Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence,
CVPR19(8799-8809).
IEEE DOI 2002
BibRef

Noh, H.[Hyeonwoo], Kim, T.[Taehoon], Mun, J.[Jonghwan], Han, B.H.[Bo-Hyung],
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering,
CVPR19(8377-8386).
IEEE DOI 2002
BibRef

Kim, J.Y.[Jun-Yeong], Ma, M.[Minuk], Kim, K.[Kyungsu], Kim, S.[Sungjin], Yoo, C.D.[Chang D.],
Progressive Attention Memory Network for Movie Story Question Answering,
CVPR19(8329-8338).
IEEE DOI 2002
BibRef

Hudson, D.A.[Drew A.], Manning, C.D.[Christopher D.],
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering,
CVPR19(6693-6702).
IEEE DOI 2002
BibRef

Wijmans, E.[Erik], Datta, S.[Samyak], Maksymets, O.[Oleksandr], Das, A.[Abhishek], Gkioxari, G.[Georgia], Lee, S.[Stefan], Essa, I.[Irfan], Parikh, D.[Devi], Batra, D.[Dhruv],
Embodied Question Answering in Photorealistic Environments With Point Cloud Perception,
CVPR19(6652-6661).
IEEE DOI 2002
BibRef

Shah, M.[Meet], Chen, X.[Xinlei], Rohrbach, M.[Marcus], Parikh, D.[Devi],
Cycle-Consistency for Robust Visual Question Answering,
CVPR19(6642-6651).
IEEE DOI 2002
BibRef

Zheng, Z.L.[Zi-Long], Wang, W.G.[Wen-Guan], Qi, S.Y.[Si-Yuan], Zhu, S.C.[Song-Chun],
Reasoning Visual Dialogs With Structural and Partial Observations,
CVPR19(6662-6671).
IEEE DOI 2002
BibRef

Li, H.[Hui], Wang, P.[Peng], Shen, C.H.[Chun-Hua], van den Hengel, A.[Anton],
Visual Question Answering as Reading Comprehension,
CVPR19(6312-6321).
IEEE DOI 2002
BibRef

Yu, L.C.[Li-Cheng], Chen, X.[Xinlei], Gkioxari, G.[Georgia], Bansal, M.[Mohit], Berg, T.L.[Tamara L.], Batra, D.[Dhruv],
Multi-Target Embodied Question Answering,
CVPR19(6302-6311).
IEEE DOI 2002
BibRef

Yu, Z.[Zhou], Yu, J.[Jun], Cui, Y.[Yuhao], Tao, D.C.[Da-Cheng], Tian, Q.[Qi],
Deep Modular Co-Attention Networks for Visual Question Answering,
CVPR19(6274-6283).
IEEE DOI 2002
BibRef

Abbasnejad, E.[Ehsan], Wu, Q.[Qi], Shi, Q.F.[Qin-Feng], van den Hengel, A.[Anton],
What's to Know? Uncertainty as a Guide to Asking Goal-Oriented Questions,
CVPR19(4150-4159).
IEEE DOI 2002
BibRef

Marino, K.[Kenneth], Rastegari, M.[Mohammad], Farhadi, A.[Ali], Mottaghi, R.[Roozbeh],
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge,
CVPR19(3190-3199).
IEEE DOI 2002
BibRef

Krishna, R.[Ranjay], Bernstein, M.[Michael], Fei-Fei, L.[Li],
Information Maximizing Visual Question Generation,
CVPR19(2008-2018).
IEEE DOI 2002
BibRef

Fan, C.Y.[Chen-You], Zhang, X.F.[Xiao-Fan], Zhang, S.[Shu], Wang, W.S.[Wen-Sheng], Zhang, C.[Chi], Huang, H.[Heng],
Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering,
CVPR19(1999-2007).
IEEE DOI 2002
BibRef

Cadene, R.[Remi], Ben-younes, H.[Hedi], Cord, M.[Matthieu], Thome, N.[Nicolas],
MUREL: Multimodal Relational Reasoning for Visual Question Answering,
CVPR19(1989-1998).
IEEE DOI 2002
BibRef

Haurilet, M.[Monica], Roitberg, A.[Alina], Stiefelhagen, R.[Rainer],
It's Not About the Journey; It's About the Destination: Following Soft Paths Under Question-Guidance for Visual Reasoning,
CVPR19(1930-1939).
IEEE DOI 2002
BibRef

Qiu, Y., Satoh, Y., Suzuki, R., Kataoka, H.,
Incorporating 3D Information Into Visual Question Answering,
3DV19(756-765)
IEEE DOI 1911
Feature extraction, Task analysis, Visualization, Natural language processing, Cognition, Human computer interaction BibRef

Haurilet, M.[Monica], Al-Halah, Z.[Ziad], Stiefelhagen, R.[Rainer],
DynGraph: Visual Question Answering via Dynamic Scene Graphs,
GCPR19(428-441).
Springer DOI 1911
BibRef
Earlier:
MoQA: A Multi-modal Question Answering Architecture,
VL18(IV:106-113).
Springer DOI 1905
BibRef

Liu, F., Liu, J., Fang, Z., Lu, H.,
Language and Visual Relations Encoding for Visual Question Answering,
ICIP19(3307-3311)
IEEE DOI 1910
Visual question answering, Relations, Attention BibRef

Liu, C.N.[Chao-Ning], Chen, D.J.[Ding-Jie], Chen, H.T.[Hwann-Tzong], Liu, T.L.[Tyng-Luh],
A2A: Attention to Attention Reasoning for Movie Question Answering,
ACCV18(VI:404-419).
Springer DOI 1906
BibRef

Fang, Z.W.[Zhi-Wei], Liu, J.[Jing], Tang, Q.[Qu], Li, Y.[Yong], Lu, H.Q.[Han-Qing],
Answer Distillation for Visual Question Answering,
ACCV18(I:72-87).
Springer DOI 1906
BibRef

Kuhnle, A.[Alexander], Xie, H.Y.[Hui-Yuan], Copestake, A.[Ann],
How Clever Is the FiLM Model, and How Clever Can it Be?,
VL18(IV:162-172).
Springer DOI 1905
BibRef

Li, W.[Wei], Yuan, Z.H.[Ze-Huan], Fang, X.Z.[Xiang-Zhong], Wang, C.[Changhu],
Knowing Where to Look? Analysis on Attention of Visual Question Answering System,
VL18(IV:145-152).
Springer DOI 1905
BibRef

Bani, G.[Gabriele], Belli, D.[Davide], Dagan, G.[Gautier], Geenen, A.[Alexander], Skliar, A.[Andrii], Venkatesh, A.[Aashish], Baumgärtner, T.[Tim], Bruni, E.[Elia], Fernández, R.[Raquel],
Adding Object Detection Skills to Visual Dialogue Agents,
VL18(IV:180-187).
Springer DOI 1905
BibRef

Wagner, M.[Misha], Basevi, H.[Hector], Shetty, R.[Rakshith], Li, W.B.[Wen-Bin], Malinowski, M.[Mateusz], Fritz, M.[Mario], Leonardis, A.[Aleš],
Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions,
VLEASE18(I:521-537).
Springer DOI 1905
BibRef

Zhang, Y., Niebles, J.C., Soto, A.,
Interpretable Visual Question Answering by Visual Grounding From Attention Supervision Mining,
WACV19(349-357)
IEEE DOI 1904
data mining, data visualisation, image representation, learning (artificial intelligence), Computer architecture BibRef

Zhuang, B., Wu, Q., Shen, C., Reid, I.D., van den Hengel, A.J.[Anton J.],
Parallel Attention: A Unified Framework for Visual Object Discovery Through Dialogs and Queries,
CVPR18(4252-4261)
IEEE DOI 1812
Visualization, Task analysis, Cognition, Proposals, Grounding, Computer vision, Correlation BibRef

Yang, M., Yang, N.S.R., Zhang, K., Tao, J.,
Self-Talk: Responses to Users' Opinions and Challenges in Human Computer Dialog,
ICPR18(2839-2844)
IEEE DOI 1812
History, Robots, Databases, Predictive models, Pattern recognition, Automation, Search engines, human computer dialog, abstract extraction BibRef

Duke, B., Taylor, G.W.,
Generalized Hadamard-Product Fusion Operators for Visual Question Answering,
CRV18(39-46)
IEEE DOI 1812
Feature extraction, Visualization, Task analysis, Data models, Mathematical model, Natural languages, Model Selection, Visual Question-Answering BibRef

Das, A., Datta, S., Gkioxari, G., Lee, S., Parikh, D., Batra, D.,
Embodied Question Answering,
CVPR18(1-10)
IEEE DOI 1812
Navigation, Visualization, Task analysis, Automobiles, Knowledge discovery BibRef

Misra, I., Girshick, R., Fergus, R., Hebert, M., Gupta, A., van der Maaten, L.[Laurens],
Learning by Asking Questions,
CVPR18(11-20)
IEEE DOI 1812
Training, Proposals, Visualization, Knowledge discovery, Standards, Task analysis, Data models BibRef

Gurari, D., Li, Q., Stangl, A.J., Guo, A., Lin, C., Grauman, K., Luo, J., Bigham, J.P.,
VizWiz Grand Challenge: Answering Visual Questions from Blind People,
CVPR18(3608-3617)
IEEE DOI 1812
Visualization, Blindness, Prediction algorithms, Lighting, Computer vision, Mobile handsets, Shape BibRef

Li, J., Su, H., Zhu, J., Wang, S., Zhang, B.,
Textbook Question Answering Under Instructor Guidance with Memory Networks,
CVPR18(3655-3663)
IEEE DOI 1812
Task analysis, Cognition, Visualization, Feature extraction, Semantics, Knowledge discovery, Drugs BibRef

Gordon, D., Kembhavi, A., Rastegari, M., Redmon, J., Fox, D., Farhadi, A.,
IQA: Visual Question Answering in Interactive Environments,
CVPR18(4089-4098)
IEEE DOI 1812
Task analysis, Navigation, Visualization, Knowledge discovery, Semantics, Planning BibRef

Agrawal, A., Batra, D., Parikh, D., Kembhavi, A.,
Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering,
CVPR18(4971-4980)
IEEE DOI 1812
Image color analysis, Visualization, Data models, Training data, Training, Knowledge discovery, Dogs BibRef

Sha, F., Chao, W., Hu, H.,
Learning Answer Embeddings for Visual Question Answering,
CVPR18(5428-5436)
IEEE DOI 1812
Visualization, Semantics, Probabilistic logic, Computational modeling, Task analysis, Training, Adaptation models BibRef

Kafle, K., Price, B., Cohen, S., Kanan, C.,
DVQA: Understanding Data Visualizations via Question Answering,
CVPR18(5648-5656)
IEEE DOI 1812
Bars, Cognition, Image color analysis, Visualization, Data visualization, Data mining, Knowledge discovery BibRef

Sha, F., Hu, H., Chao, W.,
Cross-Dataset Adaptation for Visual Question Answering,
CVPR18(5716-5725)
IEEE DOI 1812
Visualization, Task analysis, Adaptation models, Knowledge discovery, Games, Training, Target recognition BibRef

Jain, U., Schwing, A., Lazebnik, S.,
Two Can Play This Game: Visual Dialog with Discriminative Question Generation and Answering,
CVPR18(5754-5763)
IEEE DOI 1812
Visualization, Task analysis, History, Knowledge discovery, Measurement, Training, Computer architecture BibRef

Anderson, P., He, X., Buehler, C., Teney, D., Johnson, M., Gould, S., Zhang, L.,
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering,
CVPR18(6077-6086)
IEEE DOI 1812
Visualization, Task analysis, Proposals, Mathematical model, Servers, Context modeling, Object detection BibRef

Nguyen, D., Okatani, T.,
Improved Fusion of Visual and Language Representations by Dense Symmetric Co-attention for Visual Question Answering,
CVPR18(6087-6096)
IEEE DOI 1812
Feature extraction, Visualization, Computer architecture, Fuses, Knowledge discovery, Bidirectional control BibRef

Dokania, P.K., Torr, P.H.S., Siddharth, N., Massiceti, D.,
FLIPDIAL: A Generative Model for Two-Way Visual Dialogue,
CVPR18(6097-6105)
IEEE DOI 1812
Visualization, Task analysis, Computational modeling, History, Data models, Pediatrics, Image color analysis BibRef

Wu, Q., Wang, P., Shen, C., Reid, I.D., van den Hengel, A.J.[Anton J.],
Are You Talking to Me? Reasoned Visual Dialog Generation Through Adversarial Learning,
CVPR18(6106-6115)
IEEE DOI 1812
Visualization, Task analysis, Generators, History, Computational modeling, Image color analysis BibRef

Gao, J., Ge, R., Chen, K., Nevatia, R.,
Motion-Appearance Co-memory Networks for Video Question Answering,
CVPR18(6576-6585)
IEEE DOI 1812
Knowledge discovery, Cognition, Task analysis, Dynamics, Memory modules, Micromechanical devices, Logic gates BibRef

Ma, C., Shen, C., Dick, A., Wu, Q., Wang, P., van den Hengel, A.J.[Anton J.], Reid, I.D.,
Visual Question Answering with Memory-Augmented Networks,
CVPR18(6975-6984)
IEEE DOI 1812
Visualization, Neural networks, Training, Knowledge discovery, Feature extraction, Bidirectional control, Prediction algorithms BibRef

Cao, Q., Liang, X., Li, B., Li, G., Lin, L.,
Visual Question Reasoning on General Dependency Tree,
CVPR18(7249-7257)
IEEE DOI 1812
Cognition, Visualization, Layout, Feature extraction, Task analysis, Collaboration, Neural networks BibRef

Patro, B., Namboodiri, V.P.,
Differential Attention for Visual Question Answering,
CVPR18(7680-7688)
IEEE DOI 1812
Semantics, Task analysis, Visualization, Knowledge discovery, Correlation, Measurement, Training BibRef

Su, Z., Zhu, C., Dong, Y., Cai, D., Chen, Y., Li, J.,
Learning Visual Knowledge Memory Networks for Visual Question Answering,
CVPR18(7736-7745)
IEEE DOI 1812
Visualization, Knowledge based systems, Task analysis, Knowledge discovery, Cognition, Ovens BibRef

Shin, A., Ushiku, Y., Harada, T.,
Customized Image Narrative Generation via Interactive Visual Question Generation and Answering,
CVPR18(8925-8933)
IEEE DOI 1812
Visualization, Task analysis, Feature extraction, Proposals, Knowledge discovery, Recurrent neural networks, Training BibRef

Das, A., Datta, S., Gkioxari, G., Lee, S., Parikh, D., Batra, D.,
Embodied Question Answering,
DeepLearnRV18(2135-213509)
IEEE DOI 1812
Navigation, Visualization, Task analysis, Automobiles, Knowledge discovery BibRef

Cheng, W., Huang, Y., Wang, L.,
Towards Unconstrained Pointing Problem of Visual Question Answering: A Retrieval-based Method,
ICPR18(3303-3308)
IEEE DOI 1812
Visualization, Task analysis, Feature extraction, Training, Knowledge discovery, Proposals, Semantics BibRef

Teney, D., Anderson, P., He, X., van den Hengel, A.J.[Anton J.],
Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge,
CVPR18(4223-4232)
IEEE DOI 1812
Training, Visualization, Task analysis, Neural networks, Knowledge discovery, Logic gates, Computer architecture BibRef

Zhou, B.[Bolei], Sun, Y.[Yiyou], Bau, D.[David], Torralba, A.[Antonio],
Interpretable Basis Decomposition for Visual Explanation,
ECCV18(VIII: 122-138).
Springer DOI 1810
BibRef

Kottur, S.[Satwik], Moura, J.M.F.[José M. F.], Parikh, D.[Devi], Batra, D.[Dhruv], Rohrbach, M.[Marcus],
Visual Coreference Resolution in Visual Dialog Using Neural Module Networks,
ECCV18(XV: 160-178).
Springer DOI 1810
BibRef

Strub, F.[Florian], Seurin, M.[Mathieu], Perez, E.[Ethan], de Vries, H.[Harm], Mary, J.[Jérémie], Preux, P.[Philippe], Courville, A.[Aaron], Pietquin, O.[Olivier],
Visual Reasoning with Multi-hop Feature Modulation,
ECCV18(VI: 808-831).
Springer DOI 1810
BibRef

Shi, Y.[Yang], Furlanello, T.[Tommaso], Zha, S.[Sheng], Anandkumar, A.[Animashree],
Question Type Guided Attention in Visual Question Answering,
ECCV18(II: 158-175).
Springer DOI 1810
BibRef

Narasimhan, M.[Medhini], Schwing, A.G.[Alexander G.],
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering,
ECCV18(VIII: 460-477).
Springer DOI 1810
BibRef

Yang, G.R.[Guangyu Robert], Ganichev, I.[Igor], Wang, X.J.[Xiao-Jing], Shlens, J.[Jonathon], Sussillo, D.[David],
A Dataset and Architecture for Visual Reasoning with a Working Memory,
ECCV18(X: 729-745).
Springer DOI 1810
BibRef

Malinowski, M.[Mateusz], Doersch, C.[Carl], Santoro, A.[Adam], Battaglia, P.[Peter],
Learning Visual Question Answering by Bootstrapping Hard Attention,
ECCV18(VI: 3-20).
Springer DOI 1810
BibRef

Kim, K.M.[Kyung-Min], Choi, S.H.[Seong-Ho], Kim, J.H.[Jin-Hwa], Zhang, B.T.[Byoung-Tak],
Multimodal Dual Attention Memory for Video Story Question Answering,
ECCV18(XV: 698-713).
Springer DOI 1810
BibRef

Yu, Y.J.[Young-Jae], Kim, J.S.[Jong-Seok], Kim, G.[Gunhee],
A Joint Sequence Fusion Model for Video Question Answering and Retrieval,
ECCV18(VII: 487-503).
Springer DOI 1810
BibRef

Gu, J.X.[Jiu-Xiang], Cai, J.F.[Jian-Fei], Joty, S.[Shafiq], Niu, L.[Li], Wang, G.[Gang],
Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models,
CVPR18(7181-7189)
IEEE DOI 1812
Visualization, Training, Decoding, Semantics, Measurement. BibRef

Li, Q.[Qing], Tao, Q.Y.[Qing-Yi], Joty, S.[Shafiq], Cai, J.F.[Jian-Fei], Luo, J.B.[Jie-Bo],
VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions,
ECCV18(VII: 570-586).
Springer DOI 1810
BibRef

Bai, Y.L.[Ya-Long], Fu, J.L.[Jian-Long], Zhao, T.J.[Tie-Jun], Mei, T.[Tao],
Deep Attention Neural Tensor Network for Visual Question Answering,
ECCV18(XII: 21-37).
Springer DOI 1810
BibRef

Sinha, A.[Abhishek], Ayush, K.[Kumar],
Towards Mathematical Reasoning: A Multimodal Deep Learning Approach,
ICIP18(4028-4032)
IEEE DOI 1809
Mathematical model, Task analysis, Visualization, Decoding, Computational modeling, Machine learning, Numerical models, Mathematical Reasoning BibRef

Yu, D., Gao, X., Xiong, H.,
Structured Semantic Representation for Visual Question Answering,
ICIP18(2286-2290)
IEEE DOI 1809
Semantics, Training, Cognition, Visualization, Task analysis, Linguistics, Computational modeling, Visual question answering BibRef

Huang, L., Kulkarni, K., Jha, A., Lohit, S., Jayasuriya, S., Turaga, P.K.,
CS-VQA: Visual Question Answering with Compressively Sensed Images,
ICIP18(1283-1287)
IEEE DOI 1809
Visualization, Image reconstruction, Image coding, Task analysis, Feature extraction, Training, Multiplexing, Computer vision, image reconstruction BibRef

Hasan Chowdhury, M.I., Nguyen, K., Sridharan, S., Fookes, C.,
Hierarchical Relational Attention for Video Question Answering,
ICIP18(599-603)
IEEE DOI 1809
Feature extraction, Knowledge discovery, Visualization, Task analysis, Mathematical model, Natural languages, scene understanding BibRef

Desta, M.T., Chen, L., Kornuta, T.,
Object-Based Reasoning in VQA,
WACV18(1814-1823)
IEEE DOI 1806
data visualisation, inference mechanisms, natural language processing, object detection, Visualization BibRef

Zhao, H., Fan, Q., Gutfreund, D., Fu, Y.,
Semantically Guided Visual Question Answering,
WACV18(1852-1860)
IEEE DOI 1806
data visualisation, image colour analysis, image representation, learning (artificial intelligence), Visualization BibRef

Wang, Z., Liu, X., Wang, L., Qiao, Y., Xie, X., Fowlkes, C.C.[Charless C.],
Structured Triplet Learning with POS-Tag Guided Attention for Visual Question Answering,
WACV18(1888-1896)
IEEE DOI 1806
convolution, data visualisation, learning (artificial intelligence), Visualization BibRef

Chowdhury, I., Nguyen, K., Fookes, C., Sridharan, S.,
A cascaded long short-term memory (LSTM) driven generic visual question answering (VQA),
ICIP17(1842-1846)
IEEE DOI 1803
Feature extraction, Mathematical model, Natural languages, Principal component analysis, Task analysis, Training, scene understanding BibRef

Sheng, S.[Shurong], Venkitasubramanian, A.N.[Aparna Nurani], Moens, M.F.[Marie-Francine],
A Markov Network Based Passage Retrieval Method for Multimodal Question Answering in the Cultural Heritage Domain,
MMMod18(I:3-15).
Springer DOI 1802
BibRef

Rosso-Mateus, A.[Andrés], González, F.A.[Fabio A.], Montes-y-Gómez, M.[Manuel],
A Two-Step Neural Network Approach to Passage Retrieval for Open Domain Question Answering,
CIARP17(566-574).
Springer DOI 1802
BibRef

Gupta, T.[Tanmay], Shih, K.J.[Kevin J.], Singh, S.[Saurabh], Hoiem, D.[Derek],
Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks,
ICCV17(4223-4232)
IEEE DOI 1802
data visualisation, image recognition, learning (artificial intelligence), Visualization BibRef

Das, A., Kottur, S., Moura, J.M.F., Lee, S., Batra, D.,
Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning,
ICCV17(2970-2979)
IEEE DOI 1802
interactive systems, learning (artificial intelligence), multi-agent systems, natural language interfaces, robot vision, Visualization BibRef

Mun, J.[Jonghwan], Seo, P.H.[Paul Hongsuck], Jung, I.[Ilchae], Han, B.[Bohyung],
MarioQA: Answering Questions by Watching Gameplay Videos,
ICCV17(2886-2894)
IEEE DOI 1802
computer games, inference mechanisms, neural nets, question answering (information retrieval), VideoQA problems, Visualization BibRef

Gan, C., Li, Y., Li, H., Sun, C., Gong, B.,
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation,
ICCV17(1829-1838)
IEEE DOI 1802
image annotation, image segmentation, multilayer perceptrons, question answering (information retrieval), COCO, VQA dataset, Visualization BibRef

Yu, Z., Yu, J., Fan, J., Tao, D.,
Multi-modal Factorized Bilinear Pooling with Co-attention Learning for Visual Question Answering,
ICCV17(1839-1848)
IEEE DOI 1802
computational complexity, feature extraction, image fusion, learning (artificial intelligence), Visualization BibRef

Ben-younes, H., Cadene, R., Cord, M., Thome, N.,
MUTAN: Multimodal Tucker Fusion for Visual Question Answering,
ICCV17(2631-2639)
IEEE DOI 1802
image fusion, image representation, question answering (information retrieval), tensors, (VQA) tasks, Visualization BibRef

Zhu, C., Zhao, Y., Huang, S., Tu, K., Ma, Y.,
Structured Attentions for Visual Question Answering,
ICCV17(1300-1309)
IEEE DOI 1802
belief networks, data visualisation, image retrieval, inference mechanisms, neural nets, Visualization BibRef

Hu, R., Andreas, J., Rohrbach, M., Darrell, T.J., Saenko, K.,
Learning to Reason: End-to-End Module Networks for Visual Question Answering,
ICCV17(804-813)
IEEE DOI 1802
computational linguistics, grammars, natural language processing, neural net architecture, Visualization BibRef

Jain, U.[Unnat], Zhang, Z.Y.[Zi-Yu], Schwing, A.[Alexander],
Creativity: Generating Diverse Questions Using Variational Autoencoders,
CVPR17(5415-5424)
IEEE DOI 1711
Artificial intelligence, Creativity, Hidden Markov models, Training, Transforms, Visualization BibRef

de Vries, H.[Harm], Strub, F.[Florian], Chandar, S.[Sarath], Pietquin, O.[Olivier], Larochelle, H.[Hugo], Courville, A.[Aaron],
GuessWhat?! Visual Object Discovery through Multi-modal Dialogue,
CVPR17(4466-4475)
IEEE DOI 1711
Databases, Games, Knowledge discovery, Natural languages, Visualization BibRef

Nam, H.[Hyeonseob], Ha, J.W.[Jung-Woo], Kim, J.[Jeonghee],
Dual Attention Networks for Multimodal Reasoning and Matching,
CVPR17(2156-2164)
IEEE DOI 1711
Cognition, Knowledge discovery, Mathematical model, Neural networks, Semantics, Visualization BibRef

Johnson, J.[Justin], Hariharan, B.[Bharath], van der Maaten, L.[Laurens], Hoffman, J., Fei-Fei, L.[Li], Zitnick, C.L.[C. Lawrence], Girshick, R.[Ross],
Inferring and Executing Programs for Visual Reasoning,
ICCV17(3008-3017)
IEEE DOI 1802
BibRef
Earlier: A1, A2, A3, A5, A6, A7, Only:
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning,
CVPR17(1988-1997)
IEEE DOI 1711
Dataset, Visual Reasoning.
WWW Link. backpropagation, image matching, learning (artificial intelligence), neural nets, Visualization. Cognition, Image color analysis, Metals, Semantics, Shape. BibRef

Yu, Y., Ko, H., Choi, J., Kim, G.,
End-to-End Concept Word Detection for Video Captioning, Retrieval, and Question Answering,
CVPR17(3261-3269)
IEEE DOI 1711
Detectors, Knowledge discovery, Motion pictures, Semantics, Training, Visualization BibRef

Das, A.[Abhishek], Kottur, S.[Satwik], Gupta, K.[Khushi], Singh, A.[Avi], Yadav, D.[Deshraj], Moura, J.M.F.[José M. F.], Parikh, D.[Devi], Batra, D.[Dhruv],
Visual Dialog,
CVPR17(1080-1089)
IEEE DOI 1711
Hold a dialog with humans in a natural visual context. History, Knowledge discovery, Protocols, Visualization, Wheelchairs BibRef

Zhu, Y., Lim, J.J., Fei-Fei, L.[Li],
Knowledge Acquisition for Visual Question Answering via Iterative Querying,
CVPR17(6146-6155)
IEEE DOI 1711
Computational modeling, Data models, Generators, Knowledge discovery, Standards, Visualization BibRef

Peris, Á.[Álvaro], Casacuberta, F.[Francisco],
Interactive-Predictive Neural Multimodal Systems,
IbPRIA19(I:16-28).
Springer DOI 1910
BibRef

Bolaños, M.[Marc], Peris, Á.[Álvaro], Casacuberta, F.[Francisco], Radeva, P.[Petia],
VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering,
IbPRIA17(372-380).
Springer DOI 1706
BibRef

Gao, P.[Peng], Li, H.S.[Hong-Sheng], Li, S.[Shuang], Lu, P.[Pan], Li, Y.K.[Yi-Kang], Hoi, S.C.H.[Steven C. H.], Wang, X.G.[Xiao-Gang],
Question-Guided Hybrid Convolution for Visual Question Answering,
ECCV18(I: 485-501).
Springer DOI 1810
BibRef

Li, Y.K.[Yi-Kang], Duan, N.[Nan], Zhou, B.L.[Bo-Lei], Chu, X.[Xiao], Ouyang, W.L.[Wan-Li], Wang, X.G.[Xiao-Gang], Zhou, M.[Ming],
Visual Question Generation as Dual Task of Visual Question Answering,
CVPR18(6116-6124)
IEEE DOI 1812
Task analysis, Visualization, Knowledge discovery, Training, Computational modeling, Computer vision BibRef

Gao, P.[Peng], Jiang, Z.K.[Zheng-Kai], You, H.X.[Hao-Xuan], Lu, P.[Pan], Hoi, S.C.H.[Steven C. H.], Wang, X.G.[Xiao-Gang], Li, H.S.[Hong-Sheng],
Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering,
CVPR19(6632-6641).
IEEE DOI 2002
BibRef

Lin, Y.T.[Yue-Tan], Pang, Z.Y.[Zhang-Yang], Li, Y.[Yanan], Wang, D.H.[Dong-Hui],
Simple and effective visual question answering in a single modality,
ICIP16(2276-2280)
IEEE DOI 1610
Benchmark testing. Not just add text to image questions. BibRef

Kafle, K.[Kushal], Kanan, C.[Christopher],
An Analysis of Visual Question Answering Algorithms,
ICCV17(1983-1991)
IEEE DOI 1802
BibRef
Earlier:
Answer-Type Prediction for Visual Question Answering,
CVPR16(4976-4984)
IEEE DOI 1612
case-based reasoning, computer vision, data visualisation, image retrieval, neural nets, Visualization BibRef

Wang, P., Wu, Q., Shen, C., van den Hengel, A.J.[Anton J.],
The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions,
CVPR17(3909-3918)
IEEE DOI 1711
Cognition, Data mining, Neural networks, Prediction algorithms, Telescopes, Visualization BibRef

Yu, D., Fu, J., Mei, T., Rui, Y.,
Multi-level Attention Networks for Visual Question Answering,
CVPR17(4187-4195)
IEEE DOI 1711
Feature extraction, Knowledge discovery, Natural languages, Recurrent neural networks, Semantics, Visualization BibRef

Kembhavi, A., Seo, M., Schwenk, D., Choi, J., Farhadi, A., Hajishirzi, H.,
Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension,
CVPR17(5376-5384)
IEEE DOI 1711
Cognition, Knowledge discovery, Natural languages, Training, Visualization BibRef

Ganju, S., Russakovsky, O., Gupta, A.,
What's in a Question: Using Visual Questions as a Form of Supervision,
CVPR17(6422-6431)
IEEE DOI 1711
Artificial intelligence, Computational modeling, Dogs, Image color analysis, SPICE, Visualization BibRef

Ramakrishnan, S.K., Pal, A., Sharma, G., Mittal, A.,
An Empirical Evaluation of Visual Question Answering for Novel Objects,
CVPR17(7312-7321)
IEEE DOI 1711
Knowledge discovery, Recurrent neural networks, Training, Training data, Visualization, Vocabulary BibRef

Maharaj, T., Ballas, N., Rohrbach, A., Courville, A., Pal, C.,
A Dataset and Exploration of Models for Understanding Video Data through Fill-in-the-Blank Question-Answering,
CVPR17(7359-7368)
IEEE DOI 1711
Computational modeling, Motion pictures, Natural languages, Training, Visualization, Voltage control BibRef

Xu, H.J.[Hui-Juan], Saenko, K.[Kate],
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering,
ECCV16(VII: 451-466).
Springer DOI 1611
Visual Question Answering. BibRef

Jabri, A.[Allan], Joulin, A.[Armand], van der Maaten, L.[Laurens],
Revisiting Visual Question Answering Baselines,
ECCV16(VIII: 727-739).
Springer DOI 1611
BibRef

Yang, Z.C.[Zi-Chao], He, X.D.[Xiao-Dong], Gao, J.F.[Jian-Feng], Deng, L.[Li], Smola, A.[Alex],
Stacked Attention Networks for Image Question Answering,
CVPR16(21-29)
IEEE DOI 1612
BibRef

Sadeghi, F.[Fereshteh], Divvala, S.K.[Santosh K.], Farhadi, A.[Ali],
VisKE: Visual knowledge extraction and question answering by visual verification of relation phrases,
CVPR15(1456-1464)
IEEE DOI 1510
Visual verification of text relationships. BibRef

Liu, Y.[Yang], Liu, J.[Jie], Wang, D.[Dong], Cheng, J.[Jian],
A robust multivariate reranking algorithm for Question Answering enrichment,
ICIP12(1917-1920).
IEEE DOI 1302
BibRef

Varekamp, C.[Chris], van de Walle, P.[Patrick], de Putter, M.[Marc],
Question interface for 3D picture creation on an autostereoscopic digital picture frame,
3DTV09(1-4).
IEEE DOI 0905
BibRef

Chapter on Implementations and Applications, Databases, QBIC, Video Analysis, Hardware and Software, Inspection continues in
Internet Label Information.


Last update: Sep 28, 2020 at 12:04:43