19.4.3.2.1 Visual Question Answering, Query

Question Answering. Visual Q-A. Datasets may also be listed elsewhere. See also Object Recognition, Retrieval Datasets.

VQA: Visual Question Answering,
Dataset containing open-ended questions about images.
Dataset, Visual Question Answering. See also VQA: Visual Question Answering.

Visual7W visual question answering,
Large-scale visual question answering (QA) dataset with object-level groundings and multimodal answers.
Dataset, Visual Question Answering.

Visual Genome,
Visual Genome is a dataset, a knowledge base, and an ongoing effort to connect structured image concepts to language.
Dataset, Visual Question Answering.

Wu, Y.C.[Yu-Chieh], Yang, J.C.[Jie-Chi],
A Robust Passage Retrieval Algorithm for Video Question Answering,
CirSysVideo(18), No. 10, October 2008, pp. 1411-1421.
IEEE DOI 0811
BibRef

Wu, Y.C.[Yu-Chieh], Lee, Y.S.[Yue-Shi], Yang, J.C.[Jie-Chi], Yen, S.J.[Show-Jane],
A New Passage Ranking Algorithm for Video Question Answering,
PSIVT06(563-572).
Springer DOI 0612
BibRef

Li, G.D.[Guang-Da], Li, H.[Haojie], Ming, Z.Y.[Zhao-Yan], Hong, R.[Richang], Tang, S.[Sheng], Chua, T.S.[Tat-Seng],
Question Answering over Community-Contributed Web Videos,
MultMedMag(17), No. 4, October-December 2010, pp. 46-57.
IEEE DOI 1011
BibRef

Song, Y.C.[Yi-Cheng], Li, H.[Haojie],
Mash-Up Approach for Web Video Category Recommendation,
PSIVT10(197-202).
IEEE DOI 1011
BibRef

Agrawal, A.[Aishwarya], Lu, J.[Jiasen], Antol, S.[Stanislaw], Mitchell, M.[Margaret], Zitnick, C.L.[C. Lawrence], Parikh, D.[Devi], Batra, D.[Dhruv],
VQA: Visual Question Answering,
IJCV(123), No. 1, May 2017, pp. 4-31.
Springer DOI 1705
BibRef

Xue, H., Zhao, Z., Cai, D.,
Unifying the Video and Question Attentions for Open-Ended Video Question Answering,
IP(26), No. 12, December 2017, pp. 5656-5666.
IEEE DOI 1710
image retrieval, video coding, temporal question attention, temporal structures, Adaptation models, Coherence, Hair, Knowledge discovery, BibRef

Malinowski, M.[Mateusz], Rohrbach, M.[Marcus], Fritz, M.[Mario],
Ask Your Neurons: A Deep Learning Approach to Visual Question Answering,
IJCV(125), No. 1-3, December 2017, pp. 110-135.
Springer DOI 1711
BibRef
Earlier:
Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images,
ICCV15(1-9)
IEEE DOI 1602
Deep learning for questions about real-world images. A Visual Turing Test. Language output based on visual and natural language input. BibRef

Kafle, K.[Kushal], Kanan, C.[Christopher],
Visual question answering: Datasets, algorithms, and future challenges,
CVIU(163), No. 1, 2017, pp. 3-20.
Elsevier DOI 1712
Image understanding BibRef

Wu, Q.[Qi], Teney, D.[Damien], Wang, P.[Peng], Shen, C.H.[Chun-Hua], Dick, A.[Anthony], van den Hengel, A.J.[Anton J.],
Visual question answering: A survey of methods and datasets,
CVIU(163), No. 1, 2017, pp. 21-40.
Elsevier DOI 1712
Survey, Visual Question Answering. Visual question answering BibRef

Tamaazousti, Y.[Youssef], Le Borgne, H.[Hervé], Popescu, A.[Adrian], Gadeski, E.[Etienne], Ginsca, A.[Alexandru], Hudelot, C.[Céline],
Vision-language integration using constrained local semantic features,
CVIU(163), No. 1, 2017, pp. 41-57.
Elsevier DOI 1712
Image classification BibRef

Das, A.[Abhishek], Agrawal, H.[Harsh], Zitnick, L.[Larry], Parikh, D.[Devi], Batra, D.[Dhruv],
Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?,
CVIU(163), No. 1, 2017, pp. 90-100.
Elsevier DOI 1712
Visual Question Answering BibRef

Teney, D., Wu, Q., van den Hengel, A.J.[Anton J.],
Visual Question Answering: A Tutorial,
SPMag(34), No. 6, November 2017, pp. 63-75.
IEEE DOI 1712
Survey, Visual Question Answering. Bioinformatics, Computer vision, Genomics, Machine learning, Visualization BibRef

Wu, Q.[Qi], Shen, C.H.[Chun-Hua], Wang, P.[Peng], Dick, A.[Anthony], van den Hengel, A.J.[Anton J.],
Image Captioning and Visual Question Answering Based on Attributes and External Knowledge,
PAMI(40), No. 6, June 2018, pp. 1367-1381.
IEEE DOI 1805
BibRef
Earlier: A1, A3, A2, A4, A5:
Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge from External Sources,
CVPR16(4622-4630)
IEEE DOI 1612
Computational modeling, Computer vision, Knowledge based systems, Knowledge discovery, Resource description framework, Semantics, visual question answering BibRef

Lioutas, V.[Vasileios], Passalis, N.[Nikolaos], Tefas, A.[Anastasios],
Explicit ensemble attention learning for improving visual question answering,
PRL(111), 2018, pp. 51-57.
Elsevier DOI 1808
Visual question answering, Explicit attention, Pictorial superiority effect BibRef

Desta, M.T., Chen, L., Kornuta, T.,
Object-Based Reasoning in VQA,
WACV18(1814-1823)
IEEE DOI 1806
data visualisation, inference mechanisms, natural language processing, object detection, Visualization BibRef

Zhao, H., Fan, Q., Gutfreund, D., Fu, Y.,
Semantically Guided Visual Question Answering,
WACV18(1852-1860)
IEEE DOI 1806
data visualisation, image colour analysis, image representation, learning (artificial intelligence), Visualization BibRef

Wang, Z., Liu, X., Wang, L., Qiao, Y., Xie, X., Fowlkes, C.,
Structured Triplet Learning with POS-Tag Guided Attention for Visual Question Answering,
WACV18(1888-1896)
IEEE DOI 1806
convolution, data visualisation, learning (artificial intelligence), Visualization BibRef

Chowdhury, I., Nguyen, K., Fookes, C., Sridharan, S.,
A cascaded long short-term memory (LSTM) driven generic visual question answering (VQA),
ICIP17(1842-1846)
IEEE DOI 1803
Feature extraction, Mathematical model, Natural languages, Principal component analysis, Task analysis, Training, scene understanding BibRef

Sheng, S.[Shurong], Venkitasubramanian, A.N.[Aparna Nurani], Moens, M.F.[Marie-Francine],
A Markov Network Based Passage Retrieval Method for Multimodal Question Answering in the Cultural Heritage Domain,
MMMod18(I:3-15).
Springer DOI 1802
BibRef

Rosso-Mateus, A.[Andrés], González, F.A.[Fabio A.], Montes-y-Gómez, M.[Manuel],
A Two-Step Neural Network Approach to Passage Retrieval for Open Domain Question Answering,
CIARP17(566-574).
Springer DOI 1802
BibRef

Zhang, Q., Cao, R., Wu, Y.N., Zhu, S.C.,
Mining Object Parts from CNNs via Active Question-Answering,
CVPR17(3890-3899)
IEEE DOI 1711
BibRef
Earlier: A1, A3, A4, Only:
Mining And-Or Graphs for Graph Matching and Object Discovery,
ICCV15(55-63)
IEEE DOI 1602
Object detection, Object recognition, Semantics, Strain, Training, Visualization BibRef

Gupta, T.[Tanmay], Shih, K.J.[Kevin J.], Singh, S.[Saurabh], Hoiem, D.[Derek],
Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks,
ICCV17(4223-4232)
IEEE DOI 1802
data visualisation, image recognition, learning (artificial intelligence), Visualization BibRef

Das, A., Kottur, S., Moura, J.M.F., Lee, S., Batra, D.,
Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning,
ICCV17(2970-2979)
IEEE DOI 1802
interactive systems, learning (artificial intelligence), multi-agent systems, natural language interfaces, robot vision, Visualization BibRef

Mun, J.[Jonghwan], Seo, P.H.[Paul Hongsuck], Jung, I.[Ilchae], Han, B.[Bohyung],
MarioQA: Answering Questions by Watching Gameplay Videos,
ICCV17(2886-2894)
IEEE DOI 1802
computer games, inference mechanisms, neural nets, question answering (information retrieval), VideoQA problems, Visualization BibRef

Gan, C., Li, Y., Li, H., Sun, C., Gong, B.,
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation,
ICCV17(1829-1838)
IEEE DOI 1802
image annotation, image segmentation, multilayer perceptrons, question answering (information retrieval), COCO, VQA dataset, Visualization BibRef

Yu, Z., Yu, J., Fan, J., Tao, D.,
Multi-modal Factorized Bilinear Pooling with Co-attention Learning for Visual Question Answering,
ICCV17(1839-1848)
IEEE DOI 1802
computational complexity, feature extraction, image fusion, learning (artificial intelligence), Visualization BibRef

Ben-younes, H., Cadene, R., Cord, M., Thome, N.,
MUTAN: Multimodal Tucker Fusion for Visual Question Answering,
ICCV17(2631-2639)
IEEE DOI 1802
image fusion, image representation, question answering (information retrieval), tensors, (VQA) tasks, Visualization BibRef

Zhu, C., Zhao, Y., Huang, S., Tu, K., Ma, Y.,
Structured Attentions for Visual Question Answering,
ICCV17(1300-1309)
IEEE DOI 1802
belief networks, data visualisation, image retrieval, inference mechanisms, neural nets, Visualization BibRef

Hu, R., Andreas, J., Rohrbach, M., Darrell, T.J., Saenko, K.,
Learning to Reason: End-to-End Module Networks for Visual Question Answering,
ICCV17(804-813)
IEEE DOI 1802
computational linguistics, grammars, natural language processing, neural net architecture, Visualization BibRef

Jain, U.[Unnat], Zhang, Z.Y.[Zi-Yu], Schwing, A.[Alexander],
Creativity: Generating Diverse Questions Using Variational Autoencoders,
CVPR17(5415-5424)
IEEE DOI 1711
Artificial intelligence, Creativity, Hidden Markov models, Training, Transforms, Visualization BibRef

de Vries, H.[Harm], Strub, F.[Florian], Chandar, S.[Sarath], Pietquin, O.[Olivier], Larochelle, H.[Hugo], Courville, A.[Aaron],
GuessWhat?! Visual Object Discovery through Multi-modal Dialogue,
CVPR17(4466-4475)
IEEE DOI 1711
Databases, Games, Knowledge discovery, Natural languages, Visualization BibRef

Nam, H.[Hyeonseob], Ha, J.W.[Jung-Woo], Kim, J.[Jeonghee],
Dual Attention Networks for Multimodal Reasoning and Matching,
CVPR17(2156-2164)
IEEE DOI 1711
Cognition, Knowledge discovery, Mathematical model, Neural networks, Semantics, Visualization BibRef

Johnson, J.[Justin], Hariharan, B.[Bharath], van der Maaten, L.[Laurens], Hoffman, J., Fei-Fei, L.[Li], Zitnick, C.L.[C. Lawrence], Girshick, R.[Ross],
Inferring and Executing Programs for Visual Reasoning,
ICCV17(3008-3017)
IEEE DOI 1802
BibRef
Earlier: A1, A2, A3, A5, A6, A7, Only:
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning,
CVPR17(1988-1997)
IEEE DOI 1711
backpropagation, image matching, learning (artificial intelligence), neural nets, Visualization. Cognition, Image color analysis, Metals, Semantics, Shape. BibRef

Yu, Y., Ko, H., Choi, J., Kim, G.,
End-to-End Concept Word Detection for Video Captioning, Retrieval, and Question Answering,
CVPR17(3261-3269)
IEEE DOI 1711
Detectors, Knowledge discovery, Motion pictures, Semantics, Training, Visualization BibRef

Teney, D., Liu, L., van den Hengel, A.J.[Anton J.],
Graph-Structured Representations for Visual Question Answering,
CVPR17(3233-3241)
IEEE DOI 1711
Feature extraction, Knowledge discovery, Neural networks, Syntactics, Training, Visualization BibRef

Das, A.[Abhishek], Kottur, S.[Satwik], Gupta, K.[Khushi], Singh, A.[Avi], Yadav, D.[Deshraj], Moura, J.M.F.[José M. F.], Parikh, D.[Devi], Batra, D.[Dhruv],
Visual Dialog,
CVPR17(1080-1089)
IEEE DOI 1711
Hold a dialog with humans in a natural visual context. History, Knowledge discovery, Protocols, Visualization, Wheelchairs BibRef

Zhu, Y., Lim, J.J., Fei-Fei, L.[Li],
Knowledge Acquisition for Visual Question Answering via Iterative Querying,
CVPR17(6146-6155)
IEEE DOI 1711
Computational modeling, Data models, Generators, Knowledge discovery, Standards, Visualization BibRef

Bolaños, M.[Marc], Peris, Á.[Álvaro], Casacuberta, F.[Francisco], Radeva, P.[Petia],
VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering,
IbPRIA17(372-380).
Springer DOI 1706
BibRef

Lin, Y.T.[Yue-Tan], Pang, Z.Y.[Zhang-Yang], Li, Y.[Yanan], Wang, D.H.[Dong-Hui],
Simple and effective visual question answering in a single modality,
ICIP16(2276-2280)
IEEE DOI 1610
Benchmark testing. Not just add text to image questions. BibRef

Kafle, K.[Kushal], Kanan, C.[Christopher],
An Analysis of Visual Question Answering Algorithms,
ICCV17(1983-1991)
IEEE DOI 1802
BibRef
Earlier:
Answer-Type Prediction for Visual Question Answering,
CVPR16(4976-4984)
IEEE DOI 1612
case-based reasoning, computer vision, data visualisation, image retrieval, neural nets, Visualization BibRef

Jang, Y., Song, Y., Yu, Y., Kim, Y., Kim, G.,
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering,
CVPR17(1359-1367)
IEEE DOI 1711
Cognition, Crowdsourcing, Image color analysis, Knowledge discovery, Motion pictures, Visualization BibRef

Wang, P., Wu, Q., Shen, C., van den Hengel, A.J.[Anton J.],
The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions,
CVPR17(3909-3918)
IEEE DOI 1711
Cognition, Data mining, Neural networks, Prediction algorithms, Telescopes, Visualization BibRef

Yu, D., Fu, J., Mei, T., Rui, Y.,
Multi-level Attention Networks for Visual Question Answering,
CVPR17(4187-4195)
IEEE DOI 1711
Feature extraction, Knowledge discovery, Natural languages, Recurrent neural networks, Semantics, Visualization BibRef

Kembhavi, A., Seo, M., Schwenk, D., Choi, J., Farhadi, A., Hajishirzi, H.,
Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension,
CVPR17(5376-5384)
IEEE DOI 1711
Cognition, Knowledge discovery, Natural languages, Training, Visualization BibRef

Goyal, Y., Khot, T., Summers-Stay, D., Batra, D., Parikh, D.,
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering,
CVPR17(6325-6334)
IEEE DOI 1711
Benchmark testing, Data collection, Data models, Knowledge discovery, Protocols, Visualization BibRef

Ganju, S., Russakovsky, O., Gupta, A.,
What's in a Question: Using Visual Questions as a Form of Supervision,
CVPR17(6422-6431)
IEEE DOI 1711
Artificial intelligence, Computational modeling, Dogs, Image color analysis, SPICE, Visualization BibRef

Ramakrishnan, S.K., Pal, A., Sharma, G., Mittal, A.,
An Empirical Evaluation of Visual Question Answering for Novel Objects,
CVPR17(7312-7321)
IEEE DOI 1711
Knowledge discovery, Recurrent neural networks, Training, Training data, Visualization, Vocabulary BibRef

Maharaj, T., Ballas, N., Rohrbach, A., Courville, A., Pal, C.,
A Dataset and Exploration of Models for Understanding Video Data through Fill-in-the-Blank Question-Answering,
CVPR17(7359-7368)
IEEE DOI 1711
Computational modeling, Motion pictures, Natural languages, Training, Visualization, Voltage control BibRef

Xu, H.J.[Hui-Juan], Saenko, K.[Kate],
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering,
ECCV16(VII: 451-466).
Springer DOI 1611
Visual Question Answering. BibRef

Jabri, A.[Allan], Joulin, A.[Armand], van der Maaten, L.[Laurens],
Revisiting Visual Question Answering Baselines,
ECCV16(VIII: 727-739).
Springer DOI 1611
BibRef

Yang, Z.C.[Zi-Chao], He, X.D.[Xiao-Dong], Gao, J.F.[Jian-Feng], Deng, L.[Li], Smola, A.[Alex],
Stacked Attention Networks for Image Question Answering,
CVPR16(21-29)
IEEE DOI 1612
BibRef

Sadeghi, F.[Fereshteh], Divvala, S.K.[Santosh K.], Farhadi, A.[Ali],
VisKE: Visual knowledge extraction and question answering by visual verification of relation phrases,
CVPR15(1456-1464)
IEEE DOI 1510
Visual verification of text relationships. BibRef

Liu, Y.[Yang], Liu, J.[Jie], Wang, D.[Dong], Cheng, J.[Jian],
A robust multivariate reranking algorithm for Question Answering enrichment,
ICIP12(1917-1920).
IEEE DOI 1302
BibRef

Varekamp, C.[Chris], van de Walle, P.[Patrick], de Putter, M.[Marc],
Question interface for 3D picture creation on an autostereoscopic digital picture frame,
3DTV09(1-4).
IEEE DOI 0905
BibRef

Last update:Aug 16, 2018 at 18:22:30