19.4.3.2.1 Visual Question Answering, Query, VQA

Question Answering. Visual Q-A. Related datasets may also be listed in: See also Object Recognition, Retrieval Datasets.

VQA: Visual Question Answering,
Dataset containing open-ended questions about images. WWW Link.
Dataset, Visual Question Answering. See also VQA: Visual Question Answering.

Visual7W visual question answering,
Large-scale visual question answering (QA) dataset, with object-level groundings and multimodal answers. WWW Link.
Dataset, Visual Question Answering.

Visual Genome,
Visual Genome is a dataset, a knowledge base, an ongoing effort to connect structured image concepts to language. WWW Link.
WWW Link. Dataset, Visual Question Answering.

Wu, Y.C.[Yu-Chieh], Yang, J.C.[Jie-Chi],
A Robust Passage Retrieval Algorithm for Video Question Answering,
CirSysVideo(18), No. 10, October 2008, pp. 1411-1421.
IEEE DOI 0811
BibRef

Wu, Y.C.[Yu-Chieh], Lee, Y.S.[Yue-Shi], Yang, J.C.[Jie-Chi], Yen, S.J.[Show-Jane],
A New Passage Ranking Algorithm for Video Question Answering,
PSIVT06(563-572).
Springer DOI 0612
BibRef

Li, G.D.[Guang-Da], Li, H.[Haojie], Ming, Z.Y.[Zhao-Yan], Hong, R.[Richang], Tang, S.[Sheng], Chua, T.S.[Tat-Seng],
Question Answering over Community-Contributed Web Videos,
MultMedMag(17), No. 4, October-December 2010, pp. 46-57.
IEEE DOI 1011
BibRef

Song, Y.C.[Yi-Cheng], Li, H.[Haojie],
Mash-Up Approach for Web Video Category Recommendation,
PSIVT10(197-202).
IEEE DOI 1011
BibRef

Agrawal, A.[Aishwarya], Lu, J.[Jiasen], Antol, S.[Stanislaw], Mitchell, M.[Margaret], Zitnick, C.L.[C. Lawrence], Parikh, D.[Devi], Batra, D.[Dhruv],
VQA: Visual Question Answering,
IJCV(123), No. 1, May 2017, pp. 4-31.
Springer DOI 1705
BibRef

Xue, H., Zhao, Z., Cai, D.,
Unifying the Video and Question Attentions for Open-Ended Video Question Answering,
IP(26), No. 12, December 2017, pp. 5656-5666.
IEEE DOI 1710
image retrieval, video coding, temporal question attention, temporal structures, Adaptation models, Coherence, Hair, Knowledge discovery BibRef

Xue, H., Chu, W., Zhao, Z., Cai, D.,
A Better Way to Attend: Attention With Trees for Video Question Answering,
IP(27), No. 11, November 2018, pp. 5563-5574.
IEEE DOI 1809
computational linguistics, feature extraction, grammars, natural language processing, scene understanding BibRef

Malinowski, M.[Mateusz], Rohrbach, M.[Marcus], Fritz, M.[Mario],
Ask Your Neurons: A Deep Learning Approach to Visual Question Answering,
IJCV(125), No. 1-3, December 2017, pp. 110-135.
Springer DOI 1711
BibRef
Earlier:
Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images,
ICCV15(1-9)
IEEE DOI 1602
Deep learning for questions about real-world images. A Visual Turing Test. Language output based on visual and natural language input. BibRef

Kafle, K.[Kushal], Kanan, C.[Christopher],
Visual question answering: Datasets, algorithms, and future challenges,
CVIU(163), No. 1, 2017, pp. 3-20.
Elsevier DOI 1712
Image understanding BibRef

Wu, Q.[Qi], Teney, D.[Damien], Wang, P.[Peng], Shen, C.H.[Chun-Hua], Dick, A.[Anthony], van den Hengel, A.J.[Anton J.],
Visual question answering: A survey of methods and datasets,
CVIU(163), No. 1, 2017, pp. 21-40.
Elsevier DOI 1712
Survey, Visual Question Answering. Visual question answering BibRef

Tamaazousti, Y.[Youssef], Borgne, H.L.[Hervé Le], Popescu, A.[Adrian], Gadeski, E.[Etienne], Ginsca, A.[Alexandru], Hudelot, C.[Céline],
Vision-language integration using constrained local semantic features,
CVIU(163), No. 1, 2017, pp. 41-57.
Elsevier DOI 1712
Image classification BibRef

Das, A.[Abhishek], Agrawal, H.[Harsh], Zitnick, L.[Larry], Parikh, D.[Devi], Batra, D.[Dhruv],
Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?,
CVIU(163), No. 1, 2017, pp. 90-100.
Elsevier DOI 1712
Visual Question Answering BibRef

Teney, D.[Damien], Wu, Q., van den Hengel, A.J.[Anton J.],
Visual Question Answering: A Tutorial,
SPMag(34), No. 6, November 2017, pp. 63-75.
IEEE DOI 1712
Survey, Visual Question Answering. Bioinformatics, Computer vision, Genomics, Machine learning, Visualization BibRef

Teney, D.[Damien], Liu, L., van den Hengel, A.J.[Anton J.],
Graph-Structured Representations for Visual Question Answering,
CVPR17(3233-3241)
IEEE DOI 1711
Feature extraction, Knowledge discovery, Neural networks, Syntactics, Training, Visualization BibRef

Teney, D.[Damien], van den Hengel, A.J.[Anton J.],
Visual Question Answering as a Meta Learning Task,
ECCV18(XV: 229-245).
Springer DOI 1810
BibRef

Wu, Q.[Qi], Shen, C.H.[Chun-Hua], Wang, P.[Peng], Dick, A.[Anthony], van den Hengel, A.J.[Anton J.],
Image Captioning and Visual Question Answering Based on Attributes and External Knowledge,
PAMI(40), No. 6, June 2018, pp. 1367-1381.
IEEE DOI 1805
BibRef
Earlier: A1, A3, A2, A4, A5:
Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge from External Sources,
CVPR16(4622-4630)
IEEE DOI 1612
Computational modeling, Computer vision, Knowledge based systems, Knowledge discovery, Resource description framework, Semantics, visual question answering BibRef

Lioutas, V.[Vasileios], Passalis, N.[Nikolaos], Tefas, A.[Anastasios],
Explicit ensemble attention learning for improving visual question answering,
PRL(111), 2018, pp. 51-57.
Elsevier DOI 1808
Visual question answering, Explicit attention, Pictorial superiority effect BibRef

Huang, Y.Z.[Yan-Zhou], Zhong, T.[Tao],
Multitask learning for neural generative question answering,
RealTimeIP(14), No. 1, January 2018, pp. 1009-1017.
WWW Link. 1809
BibRef

Zhang, Q.S.[Quan-Shi], Wu, Y.N.[Ying Nian], Zhang, H.[Hao], Zhu, S.C.[Song-Chun],
Mining deep And-Or object structures via cost-sensitive question-answer-based active annotations,
CVIU(176-177), 2018, pp. 33-44.
Elsevier DOI 1812
Hierarchical graphical model, Part semantics BibRef

Zhang, Q.S.[Quan-Shi], Cao, R., Wu, Y.N.[Ying Nian], Zhu, S.C.[Song-Chun],
Mining Object Parts from CNNs via Active Question-Answering,
CVPR17(3890-3899)
IEEE DOI 1711
BibRef
Earlier: A1, A3, A4, Only:
Mining And-Or Graphs for Graph Matching and Object Discovery,
ICCV15(55-63)
IEEE DOI 1602
Object detection, Object recognition, Semantics, Strain, Training, Visualization BibRef

Garg, S.[Shivam], Srivastava, R.[Rajeev],
Object sequences: encoding categorical and spatial information for a yes/no visual question answering task,
IET-CV(12), No. 8, December 2018, pp. 1141-1150.
DOI Link 1812
BibRef

Goyal, Y.[Yash], Khot, T.[Tejas], Agrawal, A.[Aishwarya], Summers-Stay, D.[Douglas], Batra, D.[Dhruv], Parikh, D.[Devi],
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering,
IJCV(127), No. 4, April 2019, pp. 398-414.
Springer DOI 1903
BibRef
Earlier: A1, A2, A4, A5, A6, Only: CVPR17(6325-6334)
IEEE DOI 1711
Benchmark testing, Data collection, Data models, Knowledge discovery, Protocols, Visualization BibRef

Fang, Z.W.[Zhi-Wei], Liu, J.[Jing], Li, Y.[Yong], Qiao, Y.Y.[Yan-Yuan], Lu, H.Q.[Han-Qing],
Improving visual question answering using dropout and enhanced question encoder,
PR(90), 2019, pp. 404-414.
Elsevier DOI 1903
Visual question answering, Coherent dropout, Siamese dropout, Enhanced question encoder BibRef

Das, A.[Abhishek], Kottur, S.[Satwik], Gupta, K.[Khushi], Singh, A.[Avi], Yadav, D.[Deshraj], Lee, S.[Stefan], Moura, J.M.F.[José M. F.], Parikh, D.[Devi], Batra, D.[Dhruv],
Visual Dialog,
PAMI(41), No. 5, May 2019, pp. 1242-1256.
IEEE DOI 1904
Hold a meaningful dialog about visual content. Visualization, Task analysis, Artificial intelligence, History, Protocols, Natural languages, Wheelchairs, Visual dialog, machine learning BibRef

Osman, A.[Ahmed], Samek, W.[Wojciech],
DRAU: Dual Recurrent Attention Units for Visual Question Answering,
CVIU(185), 2019, pp. 24-30.
Elsevier DOI 1906
Visual Question Answering, Attention Mechanisms, Multi-modal Learning, Machine Vision, Natural Language Processing BibRef

Liang, J.W.[Jun-Wei], Jiang, L.[Lu], Cao, L.L.[Liang-Liang], Kalantidis, Y.[Yannis], Li, L.J.[Li-Jia], Hauptmann, A.G.[Alexander G.],
Focal Visual-Text Attention for Memex Question Answering,
PAMI(41), No. 8, August 2019, pp. 1893-1908.
IEEE DOI 1907
BibRef
Earlier: A1, A2, A3, A5, A6, Only:
Focal Visual-Text Attention for Visual Question Answering,
CVPR18(6135-6143)
IEEE DOI 1812
Task analysis, Knowledge discovery, Visualization, Grounding, Metadata, Cognition, Photo albums, question answering, memex. Visualization, Videos, Computational modeling, Correlation. BibRef

Zhao, Z.[Zhou], Zhang, Z.[Zhu], Jiang, X.H.[Xing-Hua], Cai, D.[Deng],
Multi-Turn Video Question Answering via Hierarchical Attention Context Reinforced Networks,
IP(28), No. 8, August 2019, pp. 3860-3872.
IEEE DOI 1907
learning (artificial intelligence), natural language processing, reinforcement learning BibRef

Zhao, Z.[Zhou], Zhang, Z.[Zhu], Xiao, S.W.[Shu-Wen], Xiao, Z.X.[Zhen-Xin], Yan, X.H.[Xiao-Hui], Yu, J.[Jun], Cai, D.[Deng], Wu, F.[Fei],
Long-Form Video Question Answering via Dynamic Hierarchical Reinforced Networks,
IP(28), No. 12, December 2019, pp. 5939-5952.
IEEE DOI 1909
Knowledge discovery, Semantics, Visualization, Natural languages, Road transportation, Task analysis, Decoding, reinforcement learning BibRef

Jang, Y.[Yunseok], Song, Y.[Yale], Kim, C.D.[Chris Dongjoo], Yu, Y.[Youngjae], Kim, Y.[Youngjin], Kim, G.[Gunhee],
Video Question Answering with Spatio-Temporal Reasoning,
IJCV(127), No. 10, October 2019, pp. 1385-1412.
Springer DOI 1909
BibRef
Earlier: A1, A2, A4, A5, A6, Only:
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering,
CVPR17(1359-1367)
IEEE DOI 1711
Cognition, Crowdsourcing, Image color analysis, Knowledge discovery, Motion pictures, Visualization BibRef

Toor, A.S.[Andeep S.], Wechsler, H.[Harry], Nappi, M.[Michele],
Biometric surveillance using visual question answering,
PRL(126), 2019, pp. 111-118.
Elsevier DOI 1909
Biometrics, Forensics, Visual question answering, Question relevance, Surveillance, Deep learning, Visual turing test BibRef

Wang, A., Luu, A.T., Foo, C., Zhu, H., Tay, Y., Chandrasekhar, V.,
Holistic Multi-Modal Memory Network for Movie Question Answering,
IP(29), No. 1, 2020, pp. 489-499.
IEEE DOI 1910
question answering (information retrieval), holistic multimodal memory network, multimodal context, MovieQA BibRef

Yu, T., Yu, J., Yu, Z., Tao, D.,
Compositional Attention Networks With Two-Stream Fusion for Video Question Answering,
IP(29), 2020, pp. 1204-1218.
IEEE DOI 1911
Visualization, Streaming media, Knowledge discovery, Feature extraction, Proposals, Task analysis, Semantics, action pooling stream BibRef

Ruwa, N.[Nelson], Mao, Q.[Qirong], Song, H.P.[He-Ping], Jia, H.J.[Hong-Jie], Dong, M.[Ming],
Triple attention network for sentimental visual question answering,
CVIU(189), 2019, pp. 102829.
Elsevier DOI 1911
Visual question answering, Feature embedding, Attention model, Sentiment analysis BibRef


Qiu, Y., Satoh, Y., Suzuki, R., Kataoka, H.,
Incorporating 3D Information Into Visual Question Answering,
3DV19(756-765)
IEEE DOI 1911
Feature extraction, Task analysis, Visualization, Natural language processing, Cognition, Human computer interaction BibRef

Haurilet, M.[Monica], Al-Halah, Z.[Ziad], Stiefelhagen, R.[Rainer],
DynGraph: Visual Question Answering via Dynamic Scene Graphs,
GCPR19(428-441).
Springer DOI 1911
BibRef
Earlier:
MoQA: A Multi-modal Question Answering Architecture,
VL18(IV:106-113).
Springer DOI 1905
BibRef

Liu, F., Liu, J., Fang, Z., Lu, H.,
Language and Visual Relations Encoding for Visual Question Answering,
ICIP19(3307-3311)
IEEE DOI 1910
Visual question answering, Relations, Attention BibRef

Liu, C.N.[Chao-Ning], Chen, D.J.[Ding-Jie], Chen, H.T.[Hwann-Tzong], Liu, T.L.[Tyng-Luh],
A2A: Attention to Attention Reasoning for Movie Question Answering,
ACCV18(VI:404-419).
Springer DOI 1906
BibRef

Fang, Z.W.[Zhi-Wei], Liu, J.[Jing], Tang, Q.[Qu], Li, Y.[Yong], Lu, H.Q.[Han-Qing],
Answer Distillation for Visual Question Answering,
ACCV18(I:72-87).
Springer DOI 1906
BibRef

Messina, N.[Nicola], Amato, G.[Giuseppe], Carrara, F.[Fabio], Falchi, F.[Fabrizio], Gennaro, C.[Claudio],
Learning Relationship-Aware Visual Features,
CEFR-LCV18(IV:486-501).
Springer DOI 1905
BibRef

Kuhnle, A.[Alexander], Xie, H.Y.[Hui-Yuan], Copestake, A.[Ann],
How Clever Is the FiLM Model, and How Clever Can it Be?,
VL18(IV:162-172).
Springer DOI 1905
BibRef

Li, W.[Wei], Yuan, Z.H.[Ze-Huan], Fang, X.Z.[Xiang-Zhong], Wang, C.[Changhu],
Knowing Where to Look? Analysis on Attention of Visual Question Answering System,
VL18(IV:145-152).
Springer DOI 1905
BibRef

Bani, G.[Gabriele], Belli, D.[Davide], Dagan, G.[Gautier], Geenen, A.[Alexander], Skliar, A.[Andrii], Venkatesh, A.[Aashish], Baumgärtner, T.[Tim], Bruni, E.[Elia], Fernández, R.[Raquel],
Adding Object Detection Skills to Visual Dialogue Agents,
VL18(IV:180-187).
Springer DOI 1905
BibRef

Wagner, M.[Misha], Basevi, H.[Hector], Shetty, R.[Rakshith], Li, W.B.[Wen-Bin], Malinowski, M.[Mateusz], Fritz, M.[Mario], Leonardis, A.[Aleš],
Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions,
VLEASE18(I:521-537).
Springer DOI 1905
BibRef

Zhang, Y., Niebles, J.C., Soto, A.,
Interpretable Visual Question Answering by Visual Grounding From Attention Supervision Mining,
WACV19(349-357)
IEEE DOI 1904
data mining, data visualisation, image representation, learning (artificial intelligence), Computer architecture BibRef

Zhuang, B., Wu, Q., Shen, C., Reid, I.D., van den Hengel, A.J.[Anton J.],
Parallel Attention: A Unified Framework for Visual Object Discovery Through Dialogs and Queries,
CVPR18(4252-4261)
IEEE DOI 1812
Visualization, Task analysis, Cognition, Proposals, Grounding, Computer vision, Correlation BibRef

Yang, M., Yang, N.S.R., Zhang, K., Tao, J.,
Self-Talk: Responses to Users' Opinions and Challenges in Human Computer Dialog,
ICPR18(2839-2844)
IEEE DOI 1812
History, Robots, Databases, Predictive models, Pattern recognition, Automation, Search engines, human computer dialog, abstract extraction BibRef

Duke, B., Taylor, G.W.,
Generalized Hadamard-Product Fusion Operators for Visual Question Answering,
CRV18(39-46)
IEEE DOI 1812
Feature extraction, Visualization, Task analysis, Data models, Mathematical model, Natural languages, Model Selection, Visual Question-Answering BibRef

Das, A., Datta, S., Gkioxari, G., Lee, S., Parikh, D., Batra, D.,
Embodied Question Answering,
CVPR18(1-10)
IEEE DOI 1812
Navigation, Visualization, Task analysis, Automobiles, Knowledge discovery BibRef

Misra, I., Girshick, R., Fergus, R., Hebert, M., Gupta, A., van der Maaten, L.[Laurens],
Learning by Asking Questions,
CVPR18(11-20)
IEEE DOI 1812
Training, Proposals, Visualization, Knowledge discovery, Standards, Task analysis, Data models BibRef

Gurari, D., Li, Q., Stangl, A.J., Guo, A., Lin, C., Grauman, K., Luo, J., Bigham, J.P.,
VizWiz Grand Challenge: Answering Visual Questions from Blind People,
CVPR18(3608-3617)
IEEE DOI 1812
Visualization, Blindness, Prediction algorithms, Lighting, Computer vision, Mobile handsets, Shape BibRef

Li, J., Su, H., Zhu, J., Wang, S., Zhang, B.,
Textbook Question Answering Under Instructor Guidance with Memory Networks,
CVPR18(3655-3663)
IEEE DOI 1812
Task analysis, Cognition, Visualization, Feature extraction, Semantics, Knowledge discovery, Drugs BibRef

Gordon, D., Kembhavi, A., Rastegari, M., Redmon, J., Fox, D., Farhadi, A.,
IQA: Visual Question Answering in Interactive Environments,
CVPR18(4089-4098)
IEEE DOI 1812
Task analysis, Navigation, Visualization, Knowledge discovery, Semantics, Planning BibRef

Agrawal, A., Batra, D., Parikh, D., Kembhavi, A.,
Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering,
CVPR18(4971-4980)
IEEE DOI 1812
Image color analysis, Visualization, Data models, Training data, Training, Knowledge discovery, Dogs BibRef

Sha, F., Chao, W., Hu, H.,
Learning Answer Embeddings for Visual Question Answering,
CVPR18(5428-5436)
IEEE DOI 1812
Visualization, Semantics, Probabilistic logic, Computational modeling, Task analysis, Training, Adaptation models BibRef

Kafle, K., Price, B., Cohen, S., Kanan, C.,
DVQA: Understanding Data Visualizations via Question Answering,
CVPR18(5648-5656)
IEEE DOI 1812
Bars, Cognition, Image color analysis, Visualization, Data visualization, Data mining, Knowledge discovery BibRef

Sha, F., Hu, H., Chao, W.,
Cross-Dataset Adaptation for Visual Question Answering,
CVPR18(5716-5725)
IEEE DOI 1812
Visualization, Task analysis, Adaptation models, Knowledge discovery, Games, Training, Target recognition BibRef

Jain, U., Schwing, A., Lazebnik, S.,
Two Can Play This Game: Visual Dialog with Discriminative Question Generation and Answering,
CVPR18(5754-5763)
IEEE DOI 1812
Visualization, Task analysis, History, Knowledge discovery, Measurement, Training, Computer architecture BibRef

Anderson, P., He, X., Buehler, C., Teney, D., Johnson, M., Gould, S., Zhang, L.,
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering,
CVPR18(6077-6086)
IEEE DOI 1812
Visualization, Task analysis, Proposals, Mathematical model, Servers, Context modeling, Object detection BibRef

Nguyen, D., Okatani, T.,
Improved Fusion of Visual and Language Representations by Dense Symmetric Co-attention for Visual Question Answering,
CVPR18(6087-6096)
IEEE DOI 1812
Feature extraction, Visualization, Computer architecture, Fuses, Knowledge discovery, Bidirectional control BibRef

Dokania, P.K., Torr, P.H.S., Siddharth, N., Massiceti, D.,
FLIPDIAL: A Generative Model for Two-Way Visual Dialogue,
CVPR18(6097-6105)
IEEE DOI 1812
Visualization, Task analysis, Computational modeling, History, Data models, Pediatrics, Image color analysis BibRef

Wu, Q., Wang, P., Shen, C., Reid, I.D., van den Hengel, A.J.[Anton J.],
Are You Talking to Me? Reasoned Visual Dialog Generation Through Adversarial Learning,
CVPR18(6106-6115)
IEEE DOI 1812
Visualization, Task analysis, Generators, History, Computational modeling, Image color analysis BibRef

Gao, J., Ge, R., Chen, K., Nevatia, R.,
Motion-Appearance Co-memory Networks for Video Question Answering,
CVPR18(6576-6585)
IEEE DOI 1812
Knowledge discovery, Cognition, Task analysis, Dynamics, Memory modules, Micromechanical devices, Logic gates BibRef

Ma, C., Shen, C., Dick, A., Wu, Q., Wang, P., van den Hengel, A.J.[Anton J.], Reid, I.D.,
Visual Question Answering with Memory-Augmented Networks,
CVPR18(6975-6984)
IEEE DOI 1812
Visualization, Neural networks, Training, Knowledge discovery, Feature extraction, Bidirectional control, Prediction algorithms BibRef

Cao, Q., Liang, X., Li, B., Li, G., Lin, L.,
Visual Question Reasoning on General Dependency Tree,
CVPR18(7249-7257)
IEEE DOI 1812
Cognition, Visualization, Layout, Feature extraction, Task analysis, Collaboration, Neural networks BibRef

Patro, B., Namboodiri, V.P.,
Differential Attention for Visual Question Answering,
CVPR18(7680-7688)
IEEE DOI 1812
Semantics, Task analysis, Visualization, Knowledge discovery, Correlation, Measurement, Training BibRef

Su, Z., Zhu, C., Dong, Y., Cai, D., Chen, Y., Li, J.,
Learning Visual Knowledge Memory Networks for Visual Question Answering,
CVPR18(7736-7745)
IEEE DOI 1812
Visualization, Knowledge based systems, Task analysis, Knowledge discovery, Cognition, Ovens BibRef

Liu, F., Xiang, T., Hospedales, T.M., Yang, W., Sun, C.,
iVQA: Inverse Visual Question Answering,
CVPR18(8611-8619)
IEEE DOI 1812
Task analysis, Visualization, Measurement, Decoding, Benchmark testing, Natural languages, Cognition BibRef

Shin, A., Ushiku, Y., Harada, T.,
Customized Image Narrative Generation via Interactive Visual Question Generation and Answering,
CVPR18(8925-8933)
IEEE DOI 1812
Visualization, Task analysis, Feature extraction, Proposals, Knowledge discovery, Recurrent neural networks, Training BibRef

Das, A., Datta, S., Gkioxari, G., Lee, S., Parikh, D., Batra, D.,
Embodied Question Answering,
DeepLearnRV18(2135-213509)
IEEE DOI 1812
Navigation, Visualization, Task analysis, Automobiles, Knowledge discovery BibRef

Cheng, W., Huang, Y., Wang, L.,
Towards Unconstrained Pointing Problem of Visual Question Answering: A Retrieval-based Method,
ICPR18(3303-3308)
IEEE DOI 1812
Visualization, Task analysis, Feature extraction, Training, Knowledge discovery, Proposals, Semantics BibRef

Teney, D., Anderson, P., He, X., van den Hengel, A.J.[Anton J.],
Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge,
CVPR18(4223-4232)
IEEE DOI 1812
Training, Visualization, Task analysis, Neural networks, Knowledge discovery, Logic gates, Computer architecture BibRef

Zhou, B.[Bolei], Sun, Y.[Yiyou], Bau, D.[David], Torralba, A.[Antonio],
Interpretable Basis Decomposition for Visual Explanation,
ECCV18(VIII: 122-138).
Springer DOI 1810
BibRef

Kottur, S.[Satwik], Moura, J.M.F.[José M. F.], Parikh, D.[Devi], Batra, D.[Dhruv], Rohrbach, M.[Marcus],
Visual Coreference Resolution in Visual Dialog Using Neural Module Networks,
ECCV18(XV: 160-178).
Springer DOI 1810
BibRef

Strub, F.[Florian], Seurin, M.[Mathieu], Perez, E.[Ethan], de Vries, H.[Harm], Mary, J.[Jérémie], Preux, P.[Philippe], Courville, A.[Aaron], Pietquin, O.[Olivier],
Visual Reasoning with Multi-hop Feature Modulation,
ECCV18(VI: 808-831).
Springer DOI 1810
BibRef

Shi, Y.[Yang], Furlanello, T.[Tommaso], Zha, S.[Sheng], Anandkumar, A.[Animashree],
Question Type Guided Attention in Visual Question Answering,
ECCV18(II: 158-175).
Springer DOI 1810
BibRef

Narasimhan, M.[Medhini], Schwing, A.G.[Alexander G.],
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering,
ECCV18(VIII: 460-477).
Springer DOI 1810
BibRef

Yang, G.R.[Guangyu Robert], Ganichev, I.[Igor], Wang, X.J.[Xiao-Jing], Shlens, J.[Jonathon], Sussillo, D.[David],
A Dataset and Architecture for Visual Reasoning with a Working Memory,
ECCV18(X: 729-745).
Springer DOI 1810
BibRef

Malinowski, M.[Mateusz], Doersch, C.[Carl], Santoro, A.[Adam], Battaglia, P.[Peter],
Learning Visual Question Answering by Bootstrapping Hard Attention,
ECCV18(VI: 3-20).
Springer DOI 1810
BibRef

Kim, K.M.[Kyung-Min], Choi, S.H.[Seong-Ho], Kim, J.H.[Jin-Hwa], Zhang, B.T.[Byoung-Tak],
Multimodal Dual Attention Memory for Video Story Question Answering,
ECCV18(XV: 698-713).
Springer DOI 1810
BibRef

Yu, Y.J.[Young-Jae], Kim, J.S.[Jong-Seok], Kim, G.[Gunhee],
A Joint Sequence Fusion Model for Video Question Answering and Retrieval,
ECCV18(VII: 487-503).
Springer DOI 1810
BibRef

Gu, J.X.[Jiu-Xiang], Cai, J.F.[Jian-Fei], Joty, S.[Shafiq], Niu, L.[Li], Wang, G.[Gang],
Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models,
CVPR18(7181-7189)
IEEE DOI 1812
Visualization, Training, Decoding, Semantics, Measurement. BibRef

Li, Q.[Qing], Tao, Q.Y.[Qing-Yi], Joty, S.[Shafiq], Cai, J.F.[Jian-Fei], Luo, J.B.[Jie-Bo],
VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions,
ECCV18(VII: 570-586).
Springer DOI 1810
BibRef

Bai, Y.L.[Ya-Long], Fu, J.L.[Jian-Long], Zhao, T.J.[Tie-Jun], Mei, T.[Tao],
Deep Attention Neural Tensor Network for Visual Question Answering,
ECCV18(XII: 21-37).
Springer DOI 1810
BibRef

Sinha, A.[Abhishek], Ayush, K.[Kumar],
Towards Mathematical Reasoning: A Multimodal Deep Learning Approach,
ICIP18(4028-4032)
IEEE DOI 1809
Mathematical model, Task analysis, Visualization, Decoding, Computational modeling, Machine learning, Numerical models, Mathematical Reasoning BibRef

Yu, D., Gao, X., Xiong, H.,
Structured Semantic Representation for Visual Question Answering,
ICIP18(2286-2290)
IEEE DOI 1809
Semantics, Training, Cognition, Visualization, Task analysis, Linguistics, Computational modeling, Visual question answering BibRef

Huang, L., Kulkarni, K., Jha, A., Lohit, S., Jayasuriya, S., Turaga, P.,
CS-VQA: Visual Question Answering with Compressively Sensed Images,
ICIP18(1283-1287)
IEEE DOI 1809
Visualization, Image reconstruction, Image coding, Task analysis, Feature extraction, Training, Multiplexing, Computer vision, image reconstruction BibRef

Hasan Chowdhury, M.I., Nguyen, K., Sridharan, S., Fookes, C.,
Hierarchical Relational Attention for Video Question Answering,
ICIP18(599-603)
IEEE DOI 1809
Feature extraction, Knowledge discovery, Visualization, Task analysis, Mathematical model, Natural languages, scene understanding BibRef

Desta, M.T., Chen, L., Kornuta, T.,
Object-Based Reasoning in VQA,
WACV18(1814-1823)
IEEE DOI 1806
data visualisation, inference mechanisms, natural language processing, object detection, Visualization BibRef

Zhao, H., Fan, Q., Gutfreund, D., Fu, Y.,
Semantically Guided Visual Question Answering,
WACV18(1852-1860)
IEEE DOI 1806
data visualisation, image colour analysis, image representation, learning (artificial intelligence), Visualization BibRef

Wang, Z., Liu, X., Wang, L., Qiao, Y., Xie, X., Fowlkes, C.C.[Charless C.],
Structured Triplet Learning with POS-Tag Guided Attention for Visual Question Answering,
WACV18(1888-1896)
IEEE DOI 1806
convolution, data visualisation, learning (artificial intelligence), Visualization BibRef

Chowdhury, I., Nguyen, K., Fookes, C., Sridharan, S.,
A cascaded long short-term memory (LSTM) driven generic visual question answering (VQA),
ICIP17(1842-1846)
IEEE DOI 1803
Feature extraction, Mathematical model, Natural languages, Principal component analysis, Task analysis, Training, scene understanding BibRef

Sheng, S.[Shurong], Venkitasubramanian, A.N.[Aparna Nurani], Moens, M.F.[Marie-Francine],
A Markov Network Based Passage Retrieval Method for Multimodal Question Answering in the Cultural Heritage Domain,
MMMod18(I:3-15).
Springer DOI 1802
BibRef

Rosso-Mateus, A.[Andrés], González, F.A.[Fabio A.], Montes-y-Gómez, M.[Manuel],
A Two-Step Neural Network Approach to Passage Retrieval for Open Domain Question Answering,
CIARP17(566-574).
Springer DOI 1802
BibRef

Gupta, T.[Tanmay], Shih, K.J.[Kevin J.], Singh, S.[Saurabh], Hoiem, D.[Derek],
Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks,
ICCV17(4223-4232)
IEEE DOI 1802
data visualisation, image recognition, learning (artificial intelligence), Visualization BibRef

Das, A., Kottur, S., Moura, J.M.F., Lee, S., Batra, D.,
Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning,
ICCV17(2970-2979)
IEEE DOI 1802
interactive systems, learning (artificial intelligence), multi-agent systems, natural language interfaces, robot vision, Visualization BibRef

Mun, J.[Jonghwan], Seo, P.H.[Paul Hongsuck], Jung, I.[Ilchae], Han, B.[Bohyung],
MarioQA: Answering Questions by Watching Gameplay Videos,
ICCV17(2886-2894)
IEEE DOI 1802
computer games, inference mechanisms, neural nets, question answering (information retrieval), VideoQA problems, Visualization BibRef

Gan, C., Li, Y., Li, H., Sun, C., Gong, B.,
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation,
ICCV17(1829-1838)
IEEE DOI 1802
image annotation, image segmentation, multilayer perceptrons, question answering (information retrieval), COCO, VQA dataset, Visualization BibRef

Yu, Z., Yu, J., Fan, J., Tao, D.,
Multi-modal Factorized Bilinear Pooling with Co-attention Learning for Visual Question Answering,
ICCV17(1839-1848)
IEEE DOI 1802
computational complexity, feature extraction, image fusion, learning (artificial intelligence), Visualization BibRef

Ben-younes, H., Cadene, R., Cord, M., Thome, N.,
MUTAN: Multimodal Tucker Fusion for Visual Question Answering,
ICCV17(2631-2639)
IEEE DOI 1802
image fusion, image representation, question answering (information retrieval), tensors, (VQA) tasks, Visualization BibRef

Zhu, C., Zhao, Y., Huang, S., Tu, K., Ma, Y.,
Structured Attentions for Visual Question Answering,
ICCV17(1300-1309)
IEEE DOI 1802
belief networks, data visualisation, image retrieval, inference mechanisms, neural nets, Visualization BibRef

Hu, R., Andreas, J., Rohrbach, M., Darrell, T.J., Saenko, K.,
Learning to Reason: End-to-End Module Networks for Visual Question Answering,
ICCV17(804-813)
IEEE DOI 1802
computational linguistics, grammars, natural language processing, neural net architecture, Visualization BibRef

Jain, U.[Unnat], Zhang, Z.Y.[Zi-Yu], Schwing, A.[Alexander],
Creativity: Generating Diverse Questions Using Variational Autoencoders,
CVPR17(5415-5424)
IEEE DOI 1711
Artificial intelligence, Creativity, Hidden Markov models, Training, Transforms, Visualization BibRef

de Vries, H.[Harm], Strub, F.[Florian], Chandar, S.[Sarath], Pietquin, O.[Olivier], Larochelle, H.[Hugo], Courville, A.[Aaron],
GuessWhat?! Visual Object Discovery through Multi-modal Dialogue,
CVPR17(4466-4475)
IEEE DOI 1711
Databases, Games, Knowledge discovery, Natural languages, Visualization BibRef

Nam, H.[Hyeonseob], Ha, J.W.[Jung-Woo], Kim, J.[Jeonghee],
Dual Attention Networks for Multimodal Reasoning and Matching,
CVPR17(2156-2164)
IEEE DOI 1711
Cognition, Knowledge discovery, Mathematical model, Neural networks, Semantics, Visualization BibRef

Johnson, J.[Justin], Hariharan, B.[Bharath], van der Maaten, L.[Laurens], Hoffman, J., Fei-Fei, L.[Li], Zitnick, C.L.[C. Lawrence], Girshick, R.[Ross],
Inferring and Executing Programs for Visual Reasoning,
ICCV17(3008-3017)
IEEE DOI 1802
BibRef
Earlier: A1, A2, A3, A5, A6, A7, Only:
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning,
CVPR17(1988-1997)
IEEE DOI 1711
Dataset, Visual Reasoning.
WWW Link. backpropagation, image matching, learning (artificial intelligence), neural nets, Visualization. Cognition, Image color analysis, Metals, Semantics, Shape. BibRef

Yu, Y., Ko, H., Choi, J., Kim, G.,
End-to-End Concept Word Detection for Video Captioning, Retrieval, and Question Answering,
CVPR17(3261-3269)
IEEE DOI 1711
Detectors, Knowledge discovery, Motion pictures, Semantics, Training, Visualization BibRef

Das, A.[Abhishek], Kottur, S.[Satwik], Gupta, K.[Khushi], Singh, A.[Avi], Yadav, D.[Deshraj], Moura, J.M.F.[José M. F.], Parikh, D.[Devi], Batra, D.[Dhruv],
Visual Dialog,
CVPR17(1080-1089)
IEEE DOI 1711
Hold a dialog with humans in a natural visual context. History, Knowledge discovery, Protocols, Visualization, Wheelchairs BibRef

Zhu, Y., Lim, J.J., Fei-Fei, L.[Li],
Knowledge Acquisition for Visual Question Answering via Iterative Querying,
CVPR17(6146-6155)
IEEE DOI 1711
Computational modeling, Data models, Generators, Knowledge discovery, Standards, Visualization BibRef

Peris, Á.[Álvaro], Casacuberta, F.[Francisco],
Interactive-Predictive Neural Multimodal Systems,
IbPRIA(I:16-28).
Springer DOI 1910
BibRef

Bolaños, M.[Marc], Peris, Á.[Álvaro], Casacuberta, F.[Francisco], Radeva, P.[Petia],
VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering,
IbPRIA17(372-380).
Springer DOI 1706
BibRef

Gao, P.[Peng], Li, H.S.[Hong-Sheng], Li, S.[Shuang], Lu, P.[Pan], Li, Y.K.[Yi-Kang], Hoi, S.C.H.[Steven C. H.], Wang, X.G.[Xiao-Gang],
Question-Guided Hybrid Convolution for Visual Question Answering,
ECCV18(I: 485-501).
Springer DOI 1810
BibRef

Li, Y.K.[Yi-Kang], Duan, N.[Nan], Zhou, B.L.[Bo-Lei], Chu, X.[Xiao], Ouyang, W.L.[Wan-Li], Wang, X.G.[Xiao-Gang], Zhou, M.[Ming],
Visual Question Generation as Dual Task of Visual Question Answering,
CVPR18(6116-6124)
IEEE DOI 1812
Task analysis, Visualization, Knowledge discovery, Training, Computational modeling, Computer vision BibRef

Lin, Y.T.[Yue-Tan], Pang, Z.Y.[Zhang-Yang], Li, Y.[Yanan], Wang, D.H.[Dong-Hui],
Simple and effective visual question answering in a single modality,
ICIP16(2276-2280)
IEEE DOI 1610
Benchmark testing. Not just add text to image questions. BibRef

Kafle, K.[Kushal], Kanan, C.[Christopher],
An Analysis of Visual Question Answering Algorithms,
ICCV17(1983-1991)
IEEE DOI 1802
BibRef
Earlier:
Answer-Type Prediction for Visual Question Answering,
CVPR16(4976-4984)
IEEE DOI 1612
case-based reasoning, computer vision, data visualisation, image retrieval, neural nets, Visualization BibRef

Wang, P., Wu, Q., Shen, C., van den Hengel, A.J.[Anton J.],
The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions,
CVPR17(3909-3918)
IEEE DOI 1711
Cognition, Data mining, Neural networks, Prediction algorithms, Telescopes, Visualization BibRef

Yu, D., Fu, J., Mei, T., Rui, Y.,
Multi-level Attention Networks for Visual Question Answering,
CVPR17(4187-4195)
IEEE DOI 1711
Feature extraction, Knowledge discovery, Natural languages, Recurrent neural networks, Semantics, Visualization BibRef

Kembhavi, A., Seo, M., Schwenk, D., Choi, J., Farhadi, A., Hajishirzi, H.,
Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension,
CVPR17(5376-5384)
IEEE DOI 1711
Cognition, Knowledge discovery, Natural languages, Training, Visualization BibRef

Ganju, S., Russakovsky, O., Gupta, A.,
What's in a Question: Using Visual Questions as a Form of Supervision,
CVPR17(6422-6431)
IEEE DOI 1711
Artificial intelligence, Computational modeling, Dogs, Image color analysis, SPICE, Visualization BibRef

Ramakrishnan, S.K., Pal, A., Sharma, G., Mittal, A.,
An Empirical Evaluation of Visual Question Answering for Novel Objects,
CVPR17(7312-7321)
IEEE DOI 1711
Knowledge discovery, Recurrent neural networks, Training, Training data, Visualization, Vocabulary BibRef

Maharaj, T., Ballas, N., Rohrbach, A., Courville, A., Pal, C.,
A Dataset and Exploration of Models for Understanding Video Data through Fill-in-the-Blank Question-Answering,
CVPR17(7359-7368)
IEEE DOI 1711
Computational modeling, Motion pictures, Natural languages, Training, Visualization, Voltage control BibRef

Xu, H.J.[Hui-Juan], Saenko, K.[Kate],
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering,
ECCV16(VII: 451-466).
Springer DOI 1611
Visual Question Answering. BibRef

Jabri, A.[Allan], Joulin, A.[Armand], van der Maaten, L.[Laurens],
Revisiting Visual Question Answering Baselines,
ECCV16(VIII: 727-739).
Springer DOI 1611
BibRef

Yang, Z.C.[Zi-Chao], He, X.D.[Xiao-Dong], Gao, J.F.[Jian-Feng], Deng, L.[Li], Smola, A.[Alex],
Stacked Attention Networks for Image Question Answering,
CVPR16(21-29)
IEEE DOI 1612
BibRef

Sadeghi, F.[Fereshteh], Divvala, S.K.[Santosh K.], Farhadi, A.[Ali],
VisKE: Visual knowledge extraction and question answering by visual verification of relation phrases,
CVPR15(1456-1464)
IEEE DOI 1510
Visual verification of text relationships. BibRef

Liu, Y.[Yang], Liu, J.[Jie], Wang, D.[Dong], Cheng, J.[Jian],
A robust multivariate reranking algorithm for Question Answering enrichment,
ICIP12(1917-1920).
IEEE DOI 1302
BibRef

Varekamp, C.[Chris], van de Walle, P.[Patrick], de Putter, M.[Marc],
Question interface for 3D picture creation on an autostereoscopic digital picture frame,
3DTV09(1-4).
IEEE DOI 0905
BibRef



Last update: Dec 7, 2019 at 17:16:29