Feng, Y.S.[Yan-Song],
Lapata, M.,
Automatic Caption Generation for News Images,
PAMI(35), No. 4, April 2013, pp. 797-812.
IEEE DOI
1303
Use existing captions and tags, and expand to similar images.
BibRef
Nakayama, H.[Hideki],
Harada, T.[Tatsuya],
Kuniyoshi, Y.[Yasuo],
Dense Sampling Low-Level Statistics of Local Features,
IEICE(E93-D), No. 7, July 2010, pp. 1727-1736.
WWW Link.
1008
BibRef
Earlier:
CIVR09(Article No 17).
DOI Link
0907
BibRef
And:
Global Gaussian approach for scene categorization using information
geometry,
CVPR10(2336-2343).
IEEE DOI
1006
BibRef
Earlier:
AI Goggles: Real-time Description and Retrieval in the Real World with
Online Learning,
CRV09(184-191).
IEEE DOI
0905
Local features. Scalability of matching for large-scale indexing.
Boost global features with sampled statistics of local features.
BibRef
Ushiku, Y.[Yoshitaka],
Yamaguchi, M.[Masataka],
Mukuta, Y.[Yusuke],
Harada, T.[Tatsuya],
Common Subspace for Model and Similarity:
Phrase Learning for Caption Generation from Images,
ICCV15(2668-2676)
IEEE DOI
1602
Feature extraction
BibRef
Jin, J.[Jiren],
Nakayama, H.[Hideki],
Annotation order matters:
Recurrent Image Annotator for arbitrary length image tagging,
ICPR16(2452-2457)
IEEE DOI
1705
Correlation, Feature extraction, Indexes, Predictive models,
Recurrent neural networks, Training
BibRef
Harada, T.[Tatsuya],
Nakayama, H.[Hideki],
Kuniyoshi, Y.[Yasuo],
Improving Local Descriptors by Embedding Global and Local Spatial
Information,
ECCV10(IV: 736-749).
Springer DOI
1009
BibRef
Earlier: A2, A1, A3:
Evaluation of dimensionality reduction methods for image
auto-annotation,
BMVC10(xx-yy).
HTML Version.
1009
BibRef
Verma, Y.[Yashaswi],
Jawahar, C.V.,
A support vector approach for cross-modal search of images and texts,
CVIU(154), No. 1, 2017, pp. 48-63.
Elsevier DOI
1612
Image search
BibRef
Dutta, A.[Ayushi],
Verma, Y.[Yashaswi],
Jawahar, C.V.,
Recurrent Image Annotation with Explicit Inter-Label Dependencies,
ECCV20(XXIX: 191-207).
Springer DOI
2010
BibRef
Xue, J.F.[Jian-Fei],
Eguchi, K.[Koji],
Video Data Modeling Using Sequential Correspondence Hierarchical
Dirichlet Processes,
IEICE(E100-D), No. 1, January 2017, pp. 33-41.
WWW Link.
1701
Multimodal data, such as the mixture of visual words and speech words extracted from video files.
BibRef
Tariq, A.[Amara],
Foroosh, H.[Hassan],
A Context-Driven Extractive Framework for Generating Realistic Image
Descriptions,
IP(26), No. 2, February 2017, pp. 619-632.
IEEE DOI
1702
image annotation
BibRef
Vinyals, O.[Oriol],
Toshev, A.[Alexander],
Bengio, S.[Samy],
Erhan, D.[Dumitru],
Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning
Challenge,
PAMI(39), No. 4, April 2017, pp. 652-663.
IEEE DOI
1703
BibRef
Earlier:
Show and tell: A neural image caption generator,
CVPR15(3156-3164)
IEEE DOI
1510
Computational modeling
BibRef
Gao, L.L.[Lian-Li],
Guo, Z.[Zhao],
Zhang, H.W.[Han-Wang],
Xu, X.[Xing],
Shen, H.T.[Heng Tao],
Video Captioning With Attention-Based LSTM and Semantic Consistency,
MultMed(19), No. 9, September 2017, pp. 2045-2055.
IEEE DOI
1708
Computational modeling, Correlation, Feature extraction,
Neural networks, Semantics,
Visualization, Attention mechanism, embedding,
long short-term memory (LSTM), video, captioning
BibRef
Hu, M.,
Yang, Y.,
Shen, F.,
Zhang, L.,
Shen, H.T.,
Li, X.,
Robust Web Image Annotation via Exploring Multi-Facet and Structural
Knowledge,
IP(26), No. 10, October 2017, pp. 4871-4884.
IEEE DOI
1708
image annotation, image retrieval, iterative methods,
learning (artificial intelligence), multimedia systems,
optimisation, pattern classification, RMSL,
data structural information,
digital technologies,
image semantic indexing, image semantic retrieval,
robust multiview semi-supervised learning, visual features,
Manifolds, Multimedia communication, Semantics,
Semisupervised learning, Supervised learning, Image annotation,
l2,p-norm, multi-view learning, semi-supervised learning
BibRef
Bin, Y.,
Yang, Y.,
Shen, F.,
Xie, N.,
Shen, H.T.,
Li, X.,
Describing Video With Attention-Based Bidirectional LSTM,
Cyber(49), No. 7, July 2019, pp. 2631-2641.
IEEE DOI
1905
Visualization, Semantics, Decoding, Feature extraction,
Natural languages, Recurrent neural networks, Grammar,
video captioning
BibRef
Wang, J.Y.[Jing-Ya],
Zhu, X.T.[Xia-Tian],
Gong, S.G.[Shao-Gang],
Discovering visual concept structure with sparse and incomplete tags,
AI(250), No. 1, 2017, pp. 16-36.
Elsevier DOI
1708
Automatically discovering the semantic structure of tagged visual data
(e.g. web videos and images).
BibRef
Kilickaya, M.[Mert],
Akkus, B.K.[Burak Kerim],
Cakici, R.[Ruket],
Erdem, A.[Aykut],
Erdem, E.[Erkut],
Ikizler-Cinbis, N.[Nazli],
Data-driven image captioning via salient region discovery,
IET-CV(11), No. 6, September 2017, pp. 398-406.
DOI Link
1709
BibRef
Fu, K.[Kun],
Jin, J.Q.[Jun-Qi],
Cui, R.P.[Run-Peng],
Sha, F.[Fei],
Zhang, C.S.[Chang-Shui],
Aligning Where to See and What to Tell: Image Captioning with
Region-Based Attention and Scene-Specific Contexts,
PAMI(39), No. 12, December 2017, pp. 2321-2334.
IEEE DOI
1711
Adaptation models, Computational modeling, Context modeling,
Data mining, Feature extraction, Image classification,
Visualization, Image captioning, LSTM,
visual attention.
BibRef
Liu, A.A.[An-An],
Xu, N.[Ning],
Wong, Y.[Yongkang],
Li, J.[Junnan],
Su, Y.T.[Yu-Ting],
Kankanhalli, M.[Mohan],
Hierarchical & multimodal video captioning: Discovering and
transferring multimodal knowledge for vision to language,
CVIU(163), No. 1, 2017, pp. 113-125.
Elsevier DOI
1712
Video to text
BibRef
Nian, F.D.[Fu-Dong],
Li, T.[Teng],
Wang, Y.[Yan],
Wu, X.Y.[Xin-Yu],
Ni, B.B.[Bing-Bing],
Xu, C.S.[Chang-Sheng],
Learning explicit video attributes from mid-level representation for
video captioning,
CVIU(163), No. 1, 2017, pp. 126-138.
Elsevier DOI
1712
Mid-level video representation
BibRef
He, X.D.[Xiao-Dong],
Deng, L.[Li],
Deep Learning for Image-to-Text Generation: A Technical Overview,
SPMag(34), No. 6, November 2017, pp. 109-116.
IEEE DOI
1712
BibRef
And:
Errata:
SPMag(35), No. 1, January 2018, pp. 178.
IEEE DOI
Artificial intelligence, Computer vision, Image classification,
Natural language processing, Pediatrics, Semantics, Training data,
Visualization
BibRef
Li, L.H.[Ling-Hui],
Tang, S.[Sheng],
Zhang, Y.D.[Yong-Dong],
Deng, L.X.[Li-Xi],
Tian, Q.[Qi],
GLA: Global-Local Attention for Image Description,
MultMed(20), No. 3, March 2018, pp. 726-737.
IEEE DOI
1802
Computational modeling, Decoding, Feature extraction,
Image recognition, Natural language processing,
recurrent neural network
BibRef
Guan, J.N.[Jin-Ning],
Wang, E.[Eric],
Repeated review based image captioning for image evidence review,
SP:IC(63), 2018, pp. 141-148.
Elsevier DOI
1804
Repeated review, Image captioning, Encoder-decoder, Multimodal layer
BibRef
Lu, X.,
Wang, B.,
Zheng, X.,
Li, X.,
Exploring Models and Data for Remote Sensing Image Caption Generation,
GeoRS(56), No. 4, April 2018, pp. 2183-2195.
IEEE DOI
1804
Computer vision, Feature extraction, Image representation,
Recurrent neural networks, Remote sensing, Semantics,
semantic understanding
BibRef
Cheng, Q.[Qimin],
Zhang, Q.[Qian],
Fu, P.[Peng],
Tu, C.H.[Cong-Huan],
Li, S.[Sen],
A survey and analysis on automatic image annotation,
PR(79), 2018, pp. 242-259.
Elsevier DOI
1804
Automatic image annotation, Generative model,
Nearest-neighbor model, Discriminative model, Tag-completion, Deep learning
BibRef
Ben Rejeb, I.[Imen],
Ouni, S.[Sonia],
Barhoumi, W.[Walid],
Zagrouba, E.[Ezzeddine],
Fuzzy VA-Files for multi-label image annotation based on visual content
of regions,
SIViP(12), No. 5, July 2018, pp. 877-884.
Springer DOI
1806
Vector Approximation Files.
BibRef
Helmy, T.[Tarek],
A Generic Framework for Semantic Annotation of Images,
IJIG(18), No. 3, July 2018, pp. Article 1850013.
DOI Link
1807
BibRef
Wu, C.L.[Chun-Lei],
Wei, Y.[Yiwei],
Chu, X.L.[Xiao-Liang],
Su, F.[Fei],
Wang, L.[Leiquan],
Modeling visual and word-conditional semantic attention for image
captioning,
SP:IC(67), 2018, pp. 100-107.
Elsevier DOI
1808
Image captioning, Word-conditional semantic attention,
Visual attention, Attention variation
BibRef
Ye, S.,
Han, J.,
Liu, N.,
Attentive Linear Transformation for Image Captioning,
IP(27), No. 11, November 2018, pp. 5514-5524.
IEEE DOI
1809
feature extraction, image classification,
learning (artificial intelligence), matrix algebra, probability,
LSTM
BibRef
Zhang, M.,
Yang, Y.,
Zhang, H.,
Ji, Y.,
Shen, H.T.,
Chua, T.,
More is Better: Precise and Detailed Image Captioning Using Online
Positive Recall and Missing Concepts Mining,
IP(28), No. 1, January 2019, pp. 32-44.
IEEE DOI
1810
data mining, image representation, image retrieval,
image segmentation, learning (artificial intelligence),
element-wise selection
BibRef
Hu, J.[Jiwei],
Lam, K.M.[Kin-Man],
Lou, P.[Ping],
Liu, Q.[Quan],
Deng, W.P.[Wu-Peng],
Can a machine have two systems for recognition, like human beings?,
JVCIR(56), 2018, pp. 275-286.
Elsevier DOI
1811
Image annotation, Multi-labeling, Hierarchical tree structure,
Feature-pool selection
BibRef
Bhagat, P.K.,
Choudhary, P.,
Image annotation: Then and now,
IVC(80), 2018, pp. 1-23.
Elsevier DOI
1812
Image annotation, Automatic image annotation,
Multi-label classification, Image labeling, Image tagging,
Image retrieval
BibRef
Gil-Gonzalez, J.,
Alvarez-Meza, A.,
Orozco-Gutierrez, A.,
Learning from multiple annotators using kernel alignment,
PRL(116), 2018, pp. 150-156.
Elsevier DOI
1812
Multiple annotators, Kernel methods, Classification
BibRef
Bazrafkan, S.[Shabab],
Javidnia, H.[Hossein],
Corcoran, P.[Peter],
Latent space mapping for generation of object elements with
corresponding data annotation,
PRL(116), 2018, pp. 179-186.
Elsevier DOI
1812
Generative models, Latent space mapping, Deep neural networks
BibRef
Gella, S.[Spandana],
Keller, F.[Frank],
Lapata, M.[Mirella],
Disambiguating Visual Verbs,
PAMI(41), No. 2, February 2019, pp. 311-322.
IEEE DOI
1901
Given an image and a verb, assign the correct sense of the verb.
Visualization, Image recognition, Semantics,
Natural language processing, Horses, Bicycles, Computer vision,
BibRef
Xu, N.[Ning],
Liu, A.A.[An-An],
Liu, J.[Jing],
Nie, W.Z.[Wei-Zhi],
Su, Y.T.[Yu-Ting],
Scene graph captioner:
Image captioning based on structural visual representation,
JVCIR(58), 2019, pp. 477-485.
Elsevier DOI
1901
Image captioning, Scene graph, Structural representation, Attention
BibRef
Niu, Y.,
Lu, Z.,
Wen, J.,
Xiang, T.,
Chang, S.,
Multi-Modal Multi-Scale Deep Learning for Large-Scale Image
Annotation,
IP(28), No. 4, April 2019, pp. 1720-1731.
IEEE DOI
1901
feature extraction, image classification, image fusion,
image representation, learning (artificial intelligence),
label quantity prediction
BibRef
Jiu, M.Y.[Ming-Yuan],
Sahbi, H.[Hichem],
Deep representation design from deep kernel networks,
PR(88), 2019, pp. 447-457.
Elsevier DOI
1901
Multiple kernel learning, Kernel design, Deep networks,
Efficient computation, Image annotation
BibRef
He, X.W.[Xin-Wei],
Shi, B.G.[Bao-Guang],
Bai, X.[Xiang],
Xia, G.S.[Gui-Song],
Zhang, Z.X.[Zhao-Xiang],
Dong, W.S.[Wei-Sheng],
Image Caption Generation with Part of Speech Guidance,
PRL(119), 2019, pp. 229-237.
Elsevier DOI
1902
Image caption generation, Part-of-speech tags,
Long Short-Term Memory, Visual attributes
BibRef
Park, C.C.,
Kim, B.,
Kim, G.,
Towards Personalized Image Captioning via Multimodal Memory Networks,
PAMI(41), No. 4, April 2019, pp. 999-1012.
IEEE DOI
1903
BibRef
Earlier:
Attend to You: Personalized Image Captioning with Context Sequence
Memory Networks,
CVPR17(6432-6440)
IEEE DOI
1711
Tagging, Twitter, Task analysis, Computational modeling, Writing,
Vocabulary, Context modeling, Image captioning, personalization,
convolutional neural networks.
Pattern recognition
BibRef
Xiao, X.Y.[Xin-Yu],
Wang, L.F.[Ling-Feng],
Ding, K.[Kun],
Xiang, S.M.[Shi-Ming],
Pan, C.[Chunhong],
Dense semantic embedding network for image captioning,
PR(90), 2019, pp. 285-296.
Elsevier DOI
1903
Image captioning, Retrieval, High-level semantic information,
Visual concept, Densely embedding, Long short-term memory
BibRef
Yang, M.,
Zhao, W.,
Xu, W.,
Feng, Y.,
Zhao, Z.,
Chen, X.,
Lei, K.,
Multitask Learning for Cross-Domain Image Captioning,
MultMed(21), No. 4, April 2019, pp. 1047-1061.
IEEE DOI
1903
Task analysis, Image generation, Data models, Training data,
Neural networks, Training, Maximum likelihood estimation,
reinforcement learning
BibRef
Foumani, S.N.M.[Seyed Navid Mohammadi],
Nickabadi, A.[Ahmad],
A probabilistic topic model using deep visual word representation for
simultaneous image classification and annotation,
JVCIR(59), 2019, pp. 195-203.
Elsevier DOI
1903
Image classification and annotation, Topic models,
Probabilistic model, Deep learning,
LLC
BibRef
Zhang, X.R.[Xiang-Rong],
Wang, X.[Xin],
Tang, X.[Xu],
Zhou, H.Y.[Hui-Yu],
Li, C.[Chen],
Description Generation for Remote Sensing Images Using Attribute
Attention Mechanism,
RS(11), No. 6, 2019, pp. xx-yy.
DOI Link
1903
BibRef
Zheng, H.[He],
Wu, J.H.[Jia-Hong],
Liang, R.[Rui],
Li, Y.[Ye],
Li, X.Z.[Xu-Zhi],
Multi-task learning for captioning images with novel words,
IET-CV(13), No. 3, April 2019, pp. 294-301.
DOI Link
1904
BibRef
Yu, N.,
Hu, X.,
Song, B.,
Yang, J.,
Zhang, J.,
Topic-Oriented Image Captioning Based on Order-Embedding,
IP(28), No. 6, June 2019, pp. 2743-2754.
IEEE DOI
1905
image classification, image matching, image retrieval,
learning (artificial intelligence),
cross-modal retrieval
BibRef
Ding, S.T.[Song-Tao],
Qu, S.[Shiru],
Xi, Y.L.[Yu-Ling],
Sangaiah, A.K.[Arun Kumar],
Wan, S.H.[Shao-Hua],
Image caption generation with high-level image features,
PRL(123), 2019, pp. 89-95.
Elsevier DOI
1906
Image captioning, Language model,
Bottom-up attention mechanism, Faster R-CNN
BibRef
Liu, X.X.[Xiao-Xiao],
Xu, Q.Y.[Qing-Yang],
Wang, N.[Ning],
A survey on deep neural network-based image captioning,
VC(35), No. 3, March 2019, pp. 445-470.
WWW Link.
1906
BibRef
Hossain, M.Z.[Md. Zakir],
Sohel, F.[Ferdous],
Shiratuddin, M.F.[Mohd Fairuz],
Laga, H.[Hamid],
A Comprehensive Survey of Deep Learning for Image Captioning,
Surveys(51), No. 6, February 2019, pp. Article No 118.
DOI Link
1906
Survey, Captioning.
BibRef
Peng, Y.Q.[Yu-Qing],
Liu, X.[Xuan],
Wang, W.H.[Wei-Hua],
Zhao, X.S.[Xiao-Song],
Wei, M.[Ming],
Image caption model of double LSTM with scene factors,
IVC(86), 2019, pp. 38-44.
Elsevier DOI
1906
Image caption, Deep neural network, Scene recognition, Semantic information
BibRef
Zhang, J.J.[Jun-Jie],
Wu, Q.[Qi],
Zhang, J.[Jian],
Shen, C.H.[Chun-Hua],
Lu, J.F.[Jian-Feng],
Wu, Q.A.[Qi-Ang],
Heritage image annotation via collective knowledge,
PR(93), 2019, pp. 204-214.
Elsevier DOI
1906
Annotation diversity, Image annotation,
Representation learning, Collective knowledge, Heritage image collection
BibRef
Verma, Y.[Yashaswi],
Diverse image annotation with missing labels,
PR(93), 2019, pp. 470-484.
Elsevier DOI
1906
Image annotation, Diverse labels, Missing labels, Nearest neighbour
BibRef
Markatopoulou, F.,
Mezaris, V.,
Patras, I.,
Implicit and Explicit Concept Relations in Deep Neural Networks for
Multi-Label Video/Image Annotation,
CirSysVideo(29), No. 6, June 2019, pp. 1631-1644.
IEEE DOI
1906
Task analysis, Correlation, Standards, Training, Electronic mail,
Neural networks, Semantics, Video/image concept annotation,
video analysis
BibRef
Zhang, Z.J.[Zong-Jian],
Wu, Q.[Qiang],
Wang, Y.[Yang],
Chen, F.[Fang],
High-Quality Image Captioning With Fine-Grained and Semantic-Guided
Visual Attention,
MultMed(21), No. 7, July 2019, pp. 1681-1693.
IEEE DOI
1906
BibRef
Earlier:
Fine-Grained and Semantic-Guided Visual Attention for Image
Captioning,
WACV18(1709-1717)
IEEE DOI
1806
Visualization, Semantics, Feature extraction, Decoding,
Task analysis, Object oriented modeling, Image resolution,
fully convolutional network-long short term memory framework.
feedforward neural nets, image representation,
image segmentation, convolutional neural network,
Visualization
BibRef
Laib, L.[Lakhdar],
Allili, M.S.[Mohand Saïd],
Ait-Aoudia, S.[Samy],
A probabilistic topic model for event-based image classification and
multi-label annotation,
SP:IC(76), 2019, pp. 283-294.
Elsevier DOI
1906
Event recognition, Image annotation, Topic modeling, Convolutional neural nets
BibRef
Olaode, A.[Abass],
Naghdy, G.[Golshah],
Review of the application of machine learning to the automatic semantic
annotation of images,
IET-IPR(13), No. 8, 20 June 2019, pp. 1232-1245.
DOI Link
1906
BibRef
Li, X.,
Jiang, S.,
Know More Say Less: Image Captioning Based on Scene Graphs,
MultMed(21), No. 8, August 2019, pp. 2117-2130.
IEEE DOI
1908
convolutional neural nets, feature extraction, graph theory,
image representation, learning (artificial intelligence),
vision-language
BibRef
Zhang, C.J.[Chun-Jie],
Cheng, J.[Jian],
Tian, Q.[Qi],
Multiview, Few-Labeled Object Categorization by Predicting Labels
With View Consistency,
Cyber(49), No. 11, November 2019, pp. 3834-3843.
IEEE DOI
1908
image annotation, image classification,
learning (artificial intelligence), mapping function,
view consistency
BibRef
Xian, Y.,
Tian, Y.,
Self-Guiding Multimodal LSTM: When We Do Not Have a Perfect Training
Dataset for Image Captioning,
IP(28), No. 11, November 2019, pp. 5241-5252.
IEEE DOI
1909
Task analysis, Visualization, Training, Semantics, Flickr, Urban areas,
Training data, Image captioning, self-guiding, real-world dataset,
recurrent neural network
BibRef
Li, X.,
Xu, C.,
Wang, X.,
Lan, W.,
Jia, Z.,
Yang, G.,
Xu, J.,
COCO-CN for Cross-Lingual Image Tagging, Captioning, and Retrieval,
MultMed(21), No. 9, September 2019, pp. 2347-2360.
IEEE DOI
1909
Image annotation, Task analysis, Training, Image retrieval, Internet,
Streaming media, Visualization, COCO-CN, Chinese language,
image retrieval
BibRef
Tian, C.[Chunna],
Tian, M.[Ming],
Jiang, M.M.[Meng-Meng],
Liu, H.[Heng],
Deng, D.H.[Dong-Hu],
How much do cross-modal related semantics benefit image captioning by
weighting attributes and re-ranking sentences?,
PRL(125), 2019, pp. 639-645.
Elsevier DOI
1909
Semantic attributes, Attribute reweighting,
Cross-modal related semantics, Sentence re-ranking
BibRef
Sharif, N.[Naeha],
White, L.[Lyndon],
Bennamoun, M.[Mohammed],
Liu, W.[Wei],
Shah, S.A.A.[Syed Afaq Ali],
LCEval: Learned Composite Metric for Caption Evaluation,
IJCV(127), No. 10, October 2019, pp. 1586-1610.
Springer DOI
1909
Fine-grained analysis.
BibRef
Zhang, Z.Y.[Zheng-Yuan],
Diao, W.H.[Wen-Hui],
Zhang, W.K.[Wen-Kai],
Yan, M.L.[Meng-Long],
Gao, X.[Xin],
Sun, X.[Xian],
LAM: Remote Sensing Image Captioning with Label-Attention Mechanism,
RS(11), No. 20, 2019, pp. xx-yy.
DOI Link
1910
BibRef
Fu, K.[Kun],
Li, Y.[Yang],
Zhang, W.K.[Wen-Kai],
Yu, H.F.[Hong-Feng],
Sun, X.[Xian],
Boosting Memory with a Persistent Memory Mechanism for Remote Sensing
Image Captioning,
RS(12), No. 11, 2020, pp. xx-yy.
DOI Link
2006
BibRef
Tan, J.H.,
Chan, C.S.,
Chuah, J.H.,
COMIC: Toward A Compact Image Captioning Model With Attention,
MultMed(21), No. 10, October 2019, pp. 2686-2696.
IEEE DOI
1910
embedded systems, feature extraction, image retrieval, matrix algebra.
BibRef
Zhou, L.,
Zhang, Y.,
Jiang, Y.,
Zhang, T.,
Fan, W.,
Re-Caption: Saliency-Enhanced Image Captioning Through Two-Phase
Learning,
IP(29), No. 1, 2020, pp. 694-709.
IEEE DOI
1910
feature extraction, image processing,
learning (artificial intelligence),
visual attribute
BibRef
Yang, L.[Liang],
Hu, H.F.[Hai-Feng],
Visual Skeleton and Reparative Attention for Part-of-Speech image
captioning system,
CVIU(189), 2019, pp. 102819.
Elsevier DOI
1911
Neural network, Visual attention, Image captioning
BibRef
Wang, J.B.[Jun-Bo],
Wang, W.[Wei],
Wang, L.[Liang],
Wang, Z.Y.[Zhi-Yong],
Feng, D.D.[David Dagan],
Tan, T.N.[Tie-Niu],
Learning visual relationship and context-aware attention for image
captioning,
PR(98), 2020, pp. 107075.
Elsevier DOI
1911
Image captioning, Relational reasoning, Context-aware attention
BibRef
Xiao, X.,
Wang, L.,
Ding, K.,
Xiang, S.,
Pan, C.,
Deep Hierarchical Encoder-Decoder Network for Image Captioning,
MultMed(21), No. 11, November 2019, pp. 2942-2956.
IEEE DOI
1911
Visualization, Semantics, Hidden Markov models, Decoding,
Logic gates, Training, Computer architecture,
vision-sentence
BibRef
Jiang, T.[Teng],
Zhang, Z.[Zehan],
Yang, Y.[Yupu],
Modeling coverage with semantic embedding for image caption generation,
VC(35), No. 11, November 2019, pp. 1655-1665.
WWW Link.
1911
BibRef
Tang, C.,
Liu, X.,
Wang, P.,
Zhang, C.,
Li, M.,
Wang, L.,
Adaptive Hypergraph Embedded Semi-Supervised Multi-Label Image
Annotation,
MultMed(21), No. 11, November 2019, pp. 2837-2849.
IEEE DOI
1911
Image annotation, Semisupervised learning, Semantics,
Computational modeling, Task analysis, Training, Computer science,
feature projection
BibRef
Mundnich, K.[Karel],
Booth, B.M.[Brandon M.],
Girault, B.[Benjamin],
Narayanan, S.[Shrikanth],
Generating labels for regression of subjective constructs using
triplet embeddings,
PRL(128), 2019, pp. 385-392.
Elsevier DOI
1912
Continuous-time annotations, Annotation fusion,
Inter-rater agreement, Triplet embeddings, Ordinal embeddings
BibRef
Huang, Y.,
Chen, J.,
Ouyang, W.,
Wan, W.,
Xue, Y.,
Image Captioning With End-to-End Attribute Detection and Subsequent
Attributes Prediction,
IP(29), 2020, pp. 4013-4026.
IEEE DOI
2002
Image captioning, semantic attention, end-to-end training,
multimodal attribute detector, subsequent attribute predictor
BibRef
Lu, X.,
Wang, B.,
Zheng, X.,
Sound Active Attention Framework for Remote Sensing Image Captioning,
GeoRS(58), No. 3, March 2020, pp. 1985-2000.
IEEE DOI
2003
Active attention, remote sensing image captioning, semantic understanding
BibRef
Wu, L.,
Xu, M.,
Wang, J.,
Perry, S.,
Recall What You See Continually Using GridLSTM in Image Captioning,
MultMed(22), No. 3, March 2020, pp. 808-818.
IEEE DOI
2003
Visualization, Decoding, Task analysis, Neural networks, Training,
Computational modeling, Logic gates, Image captioning,
GridLSTM, recurrent neural network
BibRef
Li, Y.Y.[Yang-Yang],
Fang, S.K.[Shuang-Kang],
Jiao, L.C.[Li-Cheng],
Liu, R.[Ruijiao],
Shang, R.H.[Rong-Hua],
A Multi-Level Attention Model for Remote Sensing Image Captions,
RS(12), No. 6, 2020, pp. xx-yy.
DOI Link
2003
What are the important things in the image?
BibRef
Chaudhary, C.,
Goyal, P.,
Prasad, D.N.,
Chen, Y.P.,
Enhancing the Quality of Image Tagging Using a Visio-Textual
Knowledge Base,
MultMed(22), No. 4, April 2020, pp. 897-911.
IEEE DOI
2004
Knowledge based systems, Visualization, Image annotation,
Encyclopedias, Electronic publishing, Internet, Tagging,
knowledge based systems
BibRef
Chen, X.H.[Xing-Han],
Zhang, M.X.[Ming-Xing],
Wang, Z.[Zheng],
Zuo, L.[Lin],
Li, B.[Bo],
Yang, Y.[Yang],
Leveraging unpaired out-of-domain data for image captioning,
PRL(132), 2020, pp. 132-140.
Elsevier DOI
2005
Image captioning, Out-of-domain data, Deep learning
BibRef
Xu, N.,
Zhang, H.,
Liu, A.,
Nie, W.,
Su, Y.,
Nie, J.,
Zhang, Y.,
Multi-Level Policy and Reward-Based Deep Reinforcement Learning
Framework for Image Captioning,
MultMed(22), No. 5, May 2020, pp. 1372-1383.
IEEE DOI
2005
Visualization, Measurement, Task analysis, Reinforcement learning,
Optimization, Adaptation models, Semantics, Multi-level policy,
image captioning
BibRef
Deng, Z.R.[Zhen-Rong],
Jiang, Z.Q.[Zhou-Qin],
Lan, R.[Rushi],
Huang, W.M.[Wen-Ming],
Luo, X.N.[Xiao-Nan],
Image captioning using DenseNet network and adaptive attention,
SP:IC(85), 2020, pp. 115836.
Elsevier DOI
2005
Image captioning, DenseNet, LSTM, Adaptive attention mechanism
BibRef
Ji, J.,
Xu, C.,
Zhang, X.,
Wang, B.,
Song, X.,
Spatio-Temporal Memory Attention for Image Captioning,
IP(29), 2020, pp. 7615-7628.
IEEE DOI
2007
Image captioning, spatio-temporal relationship,
attention transmission, memory attention, LSTM
BibRef
Guo, L.,
Liu, J.,
Lu, S.,
Lu, H.,
Show, Tell, and Polish: Ruminant Decoding for Image Captioning,
MultMed(22), No. 8, August 2020, pp. 2149-2162.
IEEE DOI
2007
Decoding, Visualization, Planning, Training, Semantics,
Reinforcement learning, Task analysis, Image captioning,
rumination
BibRef
Khatchatoorian, A.G.[Artin Ghostan],
Jamzad, M.[Mansour],
Architecture to improve the accuracy of automatic image annotation
systems,
IET-CV(14), No. 5, August 2020, pp. 214-223.
DOI Link
2007
BibRef
Theodosiou, Z.[Zenonas],
Tsapatsoulis, N.[Nicolas],
Image annotation: the effects of content, lexicon and annotation method,
MultInfoRetr(9), No. 3, September 2020, pp. 191-203.
WWW Link.
2008
BibRef
Che, W.B.[Wen-Bin],
Fan, X.P.[Xiao-Peng],
Xiong, R.Q.[Rui-Qin],
Zhao, D.B.[De-Bin],
Visual Relationship Embedding Network for Image Paragraph Generation,
MultMed(22), No. 9, September 2020, pp. 2307-2320.
IEEE DOI
2008
Visualization, Semantics, Task analysis, Proposals, Automobiles,
Buildings, Paragraph generation, image caption, LSTM
BibRef
Feng, Q.,
Wu, Y.,
Fan, H.,
Yan, C.,
Xu, M.,
Yang, Y.,
Cascaded Revision Network for Novel Object Captioning,
CirSysVideo(30), No. 10, October 2020, pp. 3413-3421.
IEEE DOI
2010
Visualization, Semantics, Task analysis, Detectors, Training,
Knowledge engineering, Feature extraction, Captioning,
semantic matching
BibRef
Wei, H.Y.[Hai-Yang],
Li, Z.X.[Zhi-Xin],
Zhang, C.L.[Can-Long],
Ma, H.F.[Hui-Fang],
The synergy of double attention: Combine sentence-level and
word-level attention for image captioning,
CVIU(201), 2020, pp. 103068.
Elsevier DOI
2011
Image captioning, Sentence-level attention,
Word-level attention, Reinforcement learning
BibRef
Shilpa, M.[Mohankumar],
He, J.[Jun],
Zhao, Y.[Yijia],
Sun, B.[Bo],
Yu, L.J.[Le-Jun],
Feedback evaluations to promote image captioning,
IET-IPR(14), No. 13, November 2020, pp. 3021-3027.
DOI Link
2012
BibRef
Yu, J.,
Li, J.,
Yu, Z.,
Huang, Q.,
Multimodal Transformer With Multi-View Visual Representation for
Image Captioning,
CirSysVideo(30), No. 12, December 2020, pp. 4467-4480.
IEEE DOI
2012
Visualization, Feature extraction, Hidden Markov models,
Adaptation models, Task analysis, Decoding, Computational modeling,
deep learning
BibRef
Zhao, W.,
Wu, X.,
Luo, J.,
Cross-Domain Image Captioning via Cross-Modal Retrieval and Model
Adaptation,
IP(30), 2021, pp. 1180-1192.
IEEE DOI
2012
Adaptation models, Task analysis, Visualization,
Computational modeling, Linguistics, Semantics, Image segmentation,
model adaptation
BibRef
Zhang, J.,
Mei, K.,
Zheng, Y.,
Fan, J.,
Integrating Part of Speech Guidance for Image Captioning,
MultMed(23), 2021, pp. 92-104.
IEEE DOI
2012
Visualization, Predictive models, Semantics, Feature extraction,
Task analysis, Computer vision, Speech processing, Part of speech,
multi-task learning
BibRef
Sharif, N.[Naeha],
Jalwana, M.A.A.K.[Mohammad A.A.K.],
Bennamoun, M.[Mohammed],
Liu, W.[Wei],
Shah, S.A.A.[Syed Afaq Ali],
Leveraging Linguistically-aware Object Relations and NASNet for Image
Captioning,
IVCNZ20(1-6)
IEEE DOI
2012
Visualization, Semantics, Pipelines, Computer architecture,
Knowledge discovery, Feature extraction, Task analysis,
NASNet
BibRef
Gurari, D.[Danna],
Zhao, Y.N.[Yi-Nan],
Zhang, M.[Meng],
Bhattacharya, N.[Nilavra],
Captioning Images Taken by People Who Are Blind,
ECCV20(XVII:417-434).
Springer DOI
2011
BibRef
Jiu, M.,
Sahbi, H.,
End-to-End Deep Kernel Map Design for Image Annotation,
ICIP20(1546-1550)
IEEE DOI
2011
Kernel, Task analysis, Training, Image annotation, Neural networks,
Training data, Supervised learning, Deep kernel networks,
image annotation
BibRef
Zhong, Y.W.[Yi-Wu],
Wang, L.W.[Li-Wei],
Chen, J.S.[Jian-Shu],
Yu, D.[Dong],
Li, Y.[Yin],
Comprehensive Image Captioning via Scene Graph Decomposition,
ECCV20(XIV:211-229).
Springer DOI
2011
BibRef
Wang, Z.[Zeyu],
Feng, B.[Berthy],
Narasimhan, K.[Karthik],
Russakovsky, O.[Olga],
Towards Unique and Informative Captioning of Images,
ECCV20(VII:629-644).
Springer DOI
2011
BibRef
Wang, J.[Jiuniu],
Xu, W.[Wenjia],
Wang, Q.Z.[Qing-Zhong],
Chan, A.B.[Antoni B.],
Compare and Reweight:
Distinctive Image Captioning Using Similar Images Sets,
ECCV20(I:370-386).
Springer DOI
2011
BibRef
Sidorov, O.[Oleksii],
Hu, R.[Ronghang],
Rohrbach, M.[Marcus],
Singh, A.[Amanpreet],
TextCaps: A Dataset for Image Captioning with Reading Comprehension,
ECCV20(II:742-758).
Springer DOI
2011
BibRef
Durand, T.[Thibaut],
Learning User Representations for Open Vocabulary Image Hashtag
Prediction,
CVPR20(9766-9775)
IEEE DOI
2008
Tagging, Twitter, Computational modeling, Vocabulary,
Predictive models, History, Visualization
BibRef
Prabhudesai, M.[Mihir],
Tung, H.Y.F.[Hsiao-Yu Fish],
Javed, S.A.[Syed Ashar],
Sieb, M.[Maximilian],
Harley, A.W.[Adam W.],
Fragkiadaki, K.[Katerina],
Embodied Language Grounding With 3D Visual Feature Representations,
CVPR20(2217-2226)
IEEE DOI
2008
Associating language utterances with 3D visual abstractions.
Visualization,
Cameras, Feature extraction, Detectors, Solid modeling
BibRef
Li, Z.,
Tran, Q.,
Mai, L.,
Lin, Z.,
Yuille, A.L.,
Context-Aware Group Captioning via Self-Attention and Contrastive
Features,
CVPR20(3437-3447)
IEEE DOI
2008
Task analysis, Visualization, Computer vision, Context modeling,
Training, Natural languages, Computational modeling
BibRef
Zhou, Y.,
Wang, M.,
Liu, D.,
Hu, Z.,
Zhang, H.,
More Grounded Image Captioning by Distilling Image-Text Matching
Model,
CVPR20(4776-4785)
IEEE DOI
2008
Visualization, Grounding, Task analysis, Training, Measurement,
Computational modeling, Image edge detection
BibRef
Sammani, F.,
Melas-Kyriazi, L.,
Show, Edit and Tell: A Framework for Editing Image Captions,
CVPR20(4807-4815)
IEEE DOI
2008
Decoding, Visualization, Task analysis, Logic gates,
Natural languages, Adaptation models, Glass
BibRef
Chen, S.,
Jin, Q.,
Wang, P.,
Wu, Q.,
Say As You Wish: Fine-Grained Control of Image Caption Generation
With Abstract Scene Graphs,
CVPR20(9959-9968)
IEEE DOI
2008
Semantics, Decoding, Visualization, Feature extraction,
Controllability, Task analysis, Measurement
BibRef
Guo, L.,
Liu, J.,
Zhu, X.,
Yao, P.,
Lu, S.,
Lu, H.,
Normalized and Geometry-Aware Self-Attention Network for Image
Captioning,
CVPR20(10324-10333)
IEEE DOI
2008
Geometry, Task analysis, Visualization, Decoding, Training,
Feature extraction, Computer architecture
BibRef
Cornia, M.,
Stefanini, M.,
Baraldi, L.,
Cucchiara, R.,
Meshed-Memory Transformer for Image Captioning,
CVPR20(10575-10584)
IEEE DOI
2008
Decoding, Encoding, Visualization, Image coding,
Computer architecture, Proposals, Task analysis
BibRef
Chen, J.,
Jin, Q.,
Better Captioning With Sequence-Level Exploration,
CVPR20(10887-10896)
IEEE DOI
2008
Task analysis, Measurement, Training, Computational modeling,
Computer architecture, Portable computers, Decoding
BibRef
Pan, Y.,
Yao, T.,
Li, Y.,
Mei, T.,
X-Linear Attention Networks for Image Captioning,
CVPR20(10968-10977)
IEEE DOI
2008
Visualization, Decoding, Cognition, Knowledge discovery,
Task analysis, Aggregates, Weight measurement
BibRef
Tran, A.,
Mathews, A.,
Xie, L.,
Transform and Tell: Entity-Aware News Image Captioning,
CVPR20(13032-13042)
IEEE DOI
2008
Decoding, Vocabulary, Transforms, Linguistics, Performance gain,
Neural networks, Training
BibRef
Park, G.[Geondo],
Han, C.[Chihye],
Kim, D.[Daeshik],
Yoon, W.J.[Won-Jun],
MHSAN: Multi-Head Self-Attention Network for Visual Semantic
Embedding,
WACV20(1507-1515)
IEEE DOI
2006
Feature extraction, Visualization, Semantics, Task analysis,
Recurrent neural networks, Image representation, Image coding
BibRef
Chen, C.,
Zhang, R.,
Koh, E.,
Kim, S.,
Cohen, S.,
Rossi, R.,
Figure Captioning with Relation Maps for Reasoning,
WACV20(1526-1534)
IEEE DOI
2006
Bars, Training, Visualization, Decoding, Computational modeling,
Task analysis, Portable document format
BibRef
He, S.,
Tavakoli, H.R.,
Borji, A.,
Pugeault, N.,
Human Attention in Image Captioning: Dataset and Analysis,
ICCV19(8528-8537)
IEEE DOI
2004
Code, Captioning.
WWW Link.
convolutional neural nets, image segmentation,
natural language processing, object detection, visual perception,
Adaptation models
BibRef
Laina, I.,
Rupprecht, C.,
Navab, N.,
Towards Unsupervised Image Captioning With Shared Multimodal
Embeddings,
ICCV19(7413-7423)
IEEE DOI
2004
computer vision, natural language processing, text analysis,
multimodal embeddings, explicit supervision, computer vision,
Semantics
BibRef
Huang, L.,
Wang, W.,
Chen, J.,
Wei, X.,
Attention on Attention for Image Captioning,
ICCV19(4633-4642)
IEEE DOI
2004
Code, Captioning.
WWW Link.
decoding, encoding, image processing, natural language processing,
element-wise multiplication, image captioning, weighted average,
Testing
BibRef
Yang, X.,
Zhang, H.,
Cai, J.,
Learning to Collocate Neural Modules for Image Captioning,
ICCV19(4249-4259)
IEEE DOI
2004
image processing, learning (artificial intelligence),
natural language processing, neural nets, Neural networks
BibRef
Yao, T.,
Pan, Y.,
Li, Y.,
Mei, T.,
Hierarchy Parsing for Image Captioning,
ICCV19(2621-2629)
IEEE DOI
2004
convolutional neural nets, feature extraction, image coding,
image representation, image segmentation, Image segmentation
BibRef
Liu, L.,
Tang, J.,
Wan, X.,
Guo, Z.,
Generating Diverse and Descriptive Image Captions Using Visual
Paraphrases,
ICCV19(4239-4248)
IEEE DOI
2004
computer vision, image classification,
learning (artificial intelligence), Machine learning
BibRef
Ke, L.,
Pei, W.,
Li, R.,
Shen, X.,
Tai, Y.,
Reflective Decoding Network for Image Captioning,
ICCV19(8887-8896)
IEEE DOI
2004
decoding, encoding, feature extraction,
learning (artificial intelligence), Random access memory
BibRef
Vered, G.,
Oren, G.,
Atzmon, Y.,
Chechik, G.,
Joint Optimization for Cooperative Image Captioning,
ICCV19(8897-8906)
IEEE DOI
2004
gradient methods, image sampling, natural language processing,
stochastic programming, text analysis, Loss measurement
BibRef
Ge, H.,
Yan, Z.,
Zhang, K.,
Zhao, M.,
Sun, L.,
Exploring Overall Contextual Information for Image Captioning in
Human-Like Cognitive Style,
ICCV19(1754-1763)
IEEE DOI
2004
cognition, computational linguistics,
learning (artificial intelligence), Cognition
BibRef
Agrawal, H.,
Desai, K.,
Wang, Y.,
Chen, X.,
Jain, R.,
Johnson, M.,
Batra, D.,
Parikh, D.,
Lee, S.,
Anderson, P.,
nocaps: novel object captioning at scale,
ICCV19(8947-8956)
IEEE DOI
2004
computer vision, feature extraction,
learning (artificial intelligence), object detection, Vegetation
BibRef
Li, G.,
Zhu, L.,
Liu, P.,
Yang, Y.,
Entangled Transformer for Image Captioning,
ICCV19(8927-8936)
IEEE DOI
2004
image retrieval, learning (artificial intelligence),
natural language processing, recurrent neural nets, robot vision, Proposals
BibRef
Hu, H.,
Misra, I.,
van der Maaten, L.,
Evaluating Text-to-Image Matching using Binary Image Selection
(BISON),
CLVL19(1887-1890)
IEEE DOI
2004
computer vision, content-based retrieval, image annotation,
image matching, image retrieval, text analysis, linguistic content,
Image Captioning
BibRef
Nguyen, A.,
Tran, Q.D.,
Do, T.,
Reid, I.,
Caldwell, D.G.,
Tsagarakis, N.G.,
Object Captioning and Retrieval with Natural Language,
ACVR19(2584-2592)
IEEE DOI
2004
convolutional neural nets, image retrieval,
learning (artificial intelligence), vision and language
BibRef
Gu, J.,
Joty, S.,
Cai, J.,
Zhao, H.,
Yang, X.,
Wang, G.,
Unpaired Image Captioning via Scene Graph Alignments,
ICCV19(10322-10331)
IEEE DOI
2004
graph theory, image representation, image retrieval,
natural language processing, text analysis, Encoding
BibRef
Shen, T.,
Kar, A.,
Fidler, S.,
Learning to Caption Images Through a Lifetime by Asking Questions,
ICCV19(10392-10401)
IEEE DOI
2004
image retrieval, multi-agent systems,
natural language processing, Automobiles
BibRef
Tanaka, M.,
Itamochi, T.,
Narioka, K.,
Sato, I.,
Ushiku, Y.,
Harada, T.,
Generating Easy-to-Understand Referring Expressions for Target
Identifications,
ICCV19(5793-5802)
IEEE DOI
2004
Code, Annotation.
WWW Link.
computer games, image processing, referred objects,
salient contexts, human annotation, Grand Theft Auto V,
Task analysis
BibRef
Aneja, J.[Jyoti],
Agrawal, H.[Harsh],
Batra, D.[Dhruv],
Schwing, A.G.[Alexander G.],
Sequential Latent Spaces for Modeling the Intention During Diverse
Image Captioning,
ICCV19(4260-4269)
IEEE DOI
2004
image retrieval, image segmentation,
learning (artificial intelligence), recurrent neural nets, Controllability
BibRef
Gupta, T.,
Schwing, A.G.,
Hoiem, D.,
ViCo: Word Embeddings From Visual Co-Occurrences,
ICCV19(7424-7433)
IEEE DOI
2004
feature extraction, image annotation, image classification,
pattern clustering, supervised learning, text analysis,
Vocabulary
BibRef
Deshpande, A.[Aditya],
Aneja, J.[Jyoti],
Wang, L.[Liwei],
Schwing, A.G.[Alexander G.],
Forsyth, D.[David],
Fast, Diverse and Accurate Image Captioning Guided by Part-Of-Speech,
CVPR19(10687-10696).
IEEE DOI
2002
BibRef
Wei, H.Y.[Hai-Yang],
Li, Z.X.[Zhi-Xin],
Zhang, C.L.[Can-Long],
Image Captioning Based on Visual and Semantic Attention,
MMMod20(I:151-162).
Springer DOI
2003
BibRef
Dognin, P.[Pierre],
Melnyk, I.[Igor],
Mroueh, Y.[Youssef],
Ross, J.[Jerret],
Sercu, T.[Tom],
Adversarial Semantic Alignment for Improved Image Captions,
CVPR19(10455-10463).
IEEE DOI
2002
BibRef
Yang, X.[Xu],
Tang, K.[Kaihua],
Zhang, H.[Hanwang],
Cai, J.F.[Jian-Fei],
Auto-Encoding Scene Graphs for Image Captioning,
CVPR19(10677-10686).
IEEE DOI
2002
BibRef
Fukui, H.[Hiroshi],
Hirakawa, T.[Tsubasa],
Yamashita, T.[Takayoshi],
Fujiyoshi, H.[Hironobu],
Attention Branch Network: Learning of Attention Mechanism for Visual
Explanation,
CVPR19(10697-10706).
IEEE DOI
2002
BibRef
Biten, A.F.[Ali Furkan],
Gomez, L.[Lluis],
Rusinol, M.[Marcal],
Karatzas, D.[Dimosthenis],
Good News, Everyone! Context Driven Entity-Aware Captioning for News
Images,
CVPR19(12458-12467).
IEEE DOI
2002
BibRef
Surís, D.[Dídac],
Epstein, D.[Dave],
Ji, H.[Heng],
Chang, S.F.[Shih-Fu],
Vondrick, C.[Carl],
Learning to Learn Words from Visual Scenes,
ECCV20(XXIX: 434-452).
Springer DOI
2010
BibRef
Akbari, H.[Hassan],
Karaman, S.[Svebor],
Bhargava, S.[Surabhi],
Chen, B.[Brian],
Vondrick, C.[Carl],
Chang, S.F.[Shih-Fu],
Multi-Level Multimodal Common Semantic Space for Image-Phrase Grounding,
CVPR19(12468-12478).
IEEE DOI
2002
BibRef
Li, Y.[Yehao],
Yao, T.[Ting],
Pan, Y.[Yingwei],
Chao, H.Y.[Hong-Yang],
Mei, T.[Tao],
Pointing Novel Objects in Image Captioning,
CVPR19(12489-12498).
IEEE DOI
2002
BibRef
Bracha, L.[Lior],
Chechik, G.[Gal],
Informative Object Annotations: Tell Me Something I Don't Know,
CVPR19(12499-12507).
IEEE DOI
2002
BibRef
Shuster, K.[Kurt],
Humeau, S.[Samuel],
Hu, H.[Hexiang],
Bordes, A.[Antoine],
Weston, J.[Jason],
Engaging Image Captioning via Personality,
CVPR19(12508-12518).
IEEE DOI
2002
BibRef
Feng, Y.[Yang],
Ma, L.[Lin],
Liu, W.[Wei],
Luo, J.B.[Jie-Bo],
Unsupervised Image Captioning,
CVPR19(4120-4129).
IEEE DOI
2002
BibRef
Xu, Y.[Yan],
Wu, B.Y.[Bao-Yuan],
Shen, F.[Fumin],
Fan, Y.B.[Yan-Bo],
Zhang, Y.[Yong],
Shen, H.T.[Heng Tao],
Liu, W.[Wei],
Exact Adversarial Attack to Image Captioning via Structured Output
Learning With Latent Variables,
CVPR19(4130-4139).
IEEE DOI
2002
BibRef
Wang, Q.Z.[Qing-Zhong],
Chan, A.B.[Antoni B.],
Describing Like Humans: On Diversity in Image Captioning,
CVPR19(4190-4198).
IEEE DOI
2002
BibRef
Guo, L.T.[Long-Teng],
Liu, J.[Jing],
Yao, P.[Peng],
Li, J.W.[Jiang-Wei],
Lu, H.Q.[Han-Qing],
MSCap: Multi-Style Image Captioning With Unpaired Stylized Text,
CVPR19(4199-4208).
IEEE DOI
2002
BibRef
Zhang, L.[Lu],
Zhang, J.M.[Jian-Ming],
Lin, Z.[Zhe],
Lu, H.C.[Hu-Chuan],
He, Y.[You],
CapSal: Leveraging Captioning to Boost Semantics for Salient Object
Detection,
CVPR19(6017-6026).
IEEE DOI
2002
BibRef
Yin, G.J.[Guo-Jun],
Sheng, L.[Lu],
Liu, B.[Bin],
Yu, N.H.[Neng-Hai],
Wang, X.G.[Xiao-Gang],
Shao, J.[Jing],
Context and Attribute Grounded Dense Captioning,
CVPR19(6234-6243).
IEEE DOI
2002
BibRef
Kim, D.J.[Dong-Jin],
Choi, J.[Jinsoo],
Oh, T.H.[Tae-Hyun],
Kweon, I.S.[In So],
Dense Relational Captioning: Triple-Stream Networks for
Relationship-Based Captioning,
CVPR19(6264-6273).
IEEE DOI
2002
BibRef
Gao, J.L.[Jun-Long],
Wang, S.[Shiqi],
Wang, S.S.[Shan-She],
Ma, S.W.[Si-Wei],
Gao, W.[Wen],
Self-Critical N-Step Training for Image Captioning,
CVPR19(6293-6301).
IEEE DOI
2002
BibRef
Cornia, M.[Marcella],
Baraldi, L.[Lorenzo],
Cucchiara, R.[Rita],
Show, Control and Tell: A Framework for Generating Controllable and
Grounded Captions,
CVPR19(8299-8308).
IEEE DOI
2002
BibRef
Qin, Y.[Yu],
Du, J.J.[Jia-Jun],
Zhang, Y.[Yonghua],
Lu, H.T.[Hong-Tao],
Look Back and Predict Forward in Image Captioning,
CVPR19(8359-8367).
IEEE DOI
2002
BibRef
Zheng, Y.[Yue],
Li, Y.[Yali],
Wang, S.J.[Sheng-Jin],
Intention Oriented Image Captions With Guiding Objects,
CVPR19(8387-8396).
IEEE DOI
2002
BibRef
Huang, Y.,
Li, C.,
Li, T.,
Wan, W.,
Chen, J.,
Image Captioning with Attribute Refinement,
ICIP19(1820-1824)
IEEE DOI
1910
Image captioning, attribute recognition, Semantic attention,
Deep Neural Network, Conditional Random Field
BibRef
Lee, J.,
Lee, Y.,
Seong, S.,
Kim, K.,
Kim, S.,
Kim, J.,
Capturing Long-Range Dependencies in Video Captioning,
ICIP19(1880-1884)
IEEE DOI
1910
Video captioning, non-local block, long short-term memory,
long-range dependency, video representation
BibRef
Shi, J.,
Li, Y.,
Wang, S.,
Cascade Attention: Multiple Feature Based Learning for Image
Captioning,
ICIP19(1970-1974)
IEEE DOI
1910
Image Captioning, Attention Mechanism, Cascade Attention
BibRef
Wang, Y.,
Shen, Y.,
Xiong, H.,
Lin, W.,
Adaptive Hard Example Mining for Image Captioning,
ICIP19(3342-3346)
IEEE DOI
1910
Reinforcement Learning, Image Captioning
BibRef
Xiao, H.,
Shi, J.,
A Novel Attribute Selection Mechanism for Video Captioning,
ICIP19(619-623)
IEEE DOI
1910
Attributes, Video captioning, Attention, Reinforcement learning
BibRef
Lim, J.H.,
Chan, C.S.,
Mask Captioning Network,
ICIP19(1-5)
IEEE DOI
1910
Image captioning, Deep learning, Scene understanding
BibRef
Wang, Q.Z.[Qing-Zhong],
Chan, A.B.[Antoni B.],
Gated Hierarchical Attention for Image Captioning,
ACCV18(IV:21-37).
Springer DOI
1906
BibRef
Wang, W.[Weixuan],
Chen, Z.H.[Zhi-Hong],
Hu, H.F.[Hai-Feng],
Multivariate Attention Network for Image Captioning,
ACCV18(VI:587-602).
Springer DOI
1906
BibRef
Ghanimifard, M.[Mehdi],
Dobnik, S.[Simon],
Knowing When to Look for What and Where: Evaluating Generation of
Spatial Descriptions with Adaptive Attention,
VL18(IV:153-161).
Springer DOI
1905
See also Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning.
BibRef
Kim, B.[Boeun],
Lee, Y.H.[Young Han],
Jung, H.[Hyedong],
Cho, C.[Choongsang],
Distinctive-Attribute Extraction for Image Captioning,
VL18(IV:133-144).
Springer DOI
1905
BibRef
Tanti, M.[Marc],
Gatt, A.[Albert],
Muscat, A.[Adrian],
Pre-gen Metrics: Predicting Caption Quality Metrics Without Generating
Captions,
VL18(IV:114-123).
Springer DOI
1905
BibRef
Tanti, M.[Marc],
Gatt, A.[Albert],
Camilleri, K.P.[Kenneth P.],
Quantifying the Amount of Visual Information Used by Neural Caption
Generators,
VL18(IV:124-132).
Springer DOI
1905
BibRef
Ren, L.,
Qi, G.,
Hua, K.,
Improving Diversity of Image Captioning Through Variational
Autoencoders and Adversarial Learning,
WACV19(263-272)
IEEE DOI
1904
computer vision, image classification, image coding,
image segmentation, learning (artificial intelligence),
Maximum likelihood estimation
BibRef
Zhou, Y.,
Sun, Y.,
Honavar, V.,
Improving Image Captioning by Leveraging Knowledge Graphs,
WACV19(283-293)
IEEE DOI
1904
computer vision, graph theory, image capture, image retrieval,
performance measure, image captioning systems, knowledge graphs,
Generators
BibRef
Rapson, C.J.,
Seet, B.,
Naeem, M.A.,
Lee, J.E.,
Al-Sarayreh, M.,
Klette, R.,
Reducing the Pain: A Novel Tool for Efficient Ground-Truth Labelling
in Images,
IVCNZ18(1-9)
IEEE DOI
1902
Labeling, Tools, Image segmentation, Image color analysis, Brushes,
Head, Automobiles, image labelling, annotations, segmentation,
image dataset
BibRef
Lu, J.S.[Jia-Sen],
Yang, J.W.[Jian-Wei],
Batra, D.[Dhruv],
Parikh, D.[Devi],
Neural Baby Talk,
CVPR18(7219-7228)
IEEE DOI
1812
Detectors, Visualization, Grounding, Pediatrics, Natural languages,
Dogs, Task analysis
BibRef
Wu, B.Y.[Bao-Yuan],
Chen, W.D.[Wei-Dong],
Sun, P.[Peng],
Liu, W.[Wei],
Ghanem, B.[Bernard],
Lyu, S.W.[Si-Wei],
Tagging Like Humans: Diverse and Distinct Image Annotation,
CVPR18(7967-7975)
IEEE DOI
1812
Semantics, Image annotation, Redundancy, Training,
Task analysis, Generators
BibRef
Wu, X.J.[Xin-Jian],
Zhang, L.[Li],
Li, F.Z.[Fan-Zhang],
Wang, B.J.[Bang-Jun],
A Novel Model for Multi-label Image Annotation,
ICPR18(1953-1958)
IEEE DOI
1812
Feature extraction, Image annotation, Computational modeling,
Semantics, Measurement, Visualization, Classification algorithms,
Multi-label learning
BibRef
Jiu, M.,
Sahbi, H.,
Qi, L.,
Deep Context Networks for Image Annotation,
ICPR18(2422-2427)
IEEE DOI
1812
image annotation, image classification,
learning (artificial intelligence), deep context networks,
Standards
BibRef
Khademi, M.,
Schulte, O.,
Image Caption Generation with Hierarchical Contextual Visual Spatial
Attention,
Cognitive18(2024-20248)
IEEE DOI
1812
Feature extraction, Visualization, Logic gates,
Computer architecture, Task analysis, Context modeling, Computational modeling
BibRef
Yan, S.,
Wu, F.,
Smith, J.S.,
Lu, W.,
Zhang, B.,
Image Captioning using Adversarial Networks and Reinforcement
Learning,
ICPR18(248-253)
IEEE DOI
1812
Generators, Generative adversarial networks,
Monte Carlo methods, Maximum likelihood estimation,
Task analysis
BibRef
Wang, F.,
Gong, X.,
Huang, L.,
Time-Dependent Pre-attention Model for Image Captioning,
ICPR18(3297-3302)
IEEE DOI
1812
Decoding, Task analysis, Semantics, Visualization,
Feature extraction, Computational modeling, Computer science
BibRef
Luo, R.,
Shakhnarovich, G.,
Cohen, S.,
Price, B.,
Discriminability Objective for Training Descriptive Captions,
CVPR18(6964-6974)
IEEE DOI
1812
Training, Task analysis, Visualization, Measurement,
Computational modeling, Generators, Airplanes
BibRef
Cui, Y.,
Yang, G.,
Veit, A.,
Huang, X.,
Belongie, S.,
Learning to Evaluate Image Captioning,
CVPR18(5804-5812)
IEEE DOI
1812
Measurement, Pathology, Training, Correlation, SPICE, Robustness, Task analysis
BibRef
Aneja, J.,
Deshpande, A.,
Schwing, A.G.,
Convolutional Image Captioning,
CVPR18(5561-5570)
IEEE DOI
1812
Training, Computer architecture, Task analysis,
Hidden Markov models, Microprocessors, Computational modeling, Indexing
BibRef
Chen, F.,
Ji, R.,
Sun, X.,
Wu, Y.,
Su, J.,
GroupCap: Group-Based Image Captioning with Structured Relevance and
Diversity Constraints,
CVPR18(1345-1353)
IEEE DOI
1812
Visualization, Correlation, Semantics, Feature extraction, Training,
Adaptation models, Task analysis
BibRef
Chen, X.,
Ma, L.,
Jiang, W.,
Yao, J.,
Liu, W.,
Regularizing RNNs for Caption Generation by Reconstructing the Past
with the Present,
CVPR18(7995-8003)
IEEE DOI
1812
Computer vision, Pattern recognition
BibRef
Yao, T.[Ting],
Pan, Y.W.[Ying-Wei],
Li, Y.[Yehao],
Mei, T.[Tao],
Exploring Visual Relationship for Image Captioning,
ECCV18(XIV: 711-727).
Springer DOI
1810
BibRef
Shah, S.A.A.[Syed Afaq Ali],
NNEval: Neural Network Based Evaluation Metric for Image Captioning,
ECCV18(VIII: 39-55).
Springer DOI
1810
BibRef
Jiang, W.H.[Wen-Hao],
Ma, L.[Lin],
Jiang, Y.G.[Yu-Gang],
Liu, W.[Wei],
Zhang, T.[Tong],
Recurrent Fusion Network for Image Captioning,
ECCV18(II: 510-526).
Springer DOI
1810
BibRef
Chatterjee, M.[Moitreya],
Schwing, A.G.[Alexander G.],
Diverse and Coherent Paragraph Generation from Images,
ECCV18(II: 747-763).
Springer DOI
1810
BibRef
Chen, S.[Shi],
Zhao, Q.[Qi],
Boosted Attention: Leveraging Human Attention for Image Captioning,
ECCV18(XI: 72-88).
Springer DOI
1810
BibRef
Dai, B.[Bo],
Ye, D.[Deming],
Lin, D.[Dahua],
Rethinking the Form of Latent States in Image Captioning,
ECCV18(VI: 294-310).
Springer DOI
1810
BibRef
Liu, X.[Xihui],
Li, H.S.[Hong-Sheng],
Shao, J.[Jing],
Chen, D.P.[Da-Peng],
Wang, X.G.[Xiao-Gang],
Show, Tell and Discriminate:
Image Captioning by Self-retrieval with Partially Labeled Data,
ECCV18(XV: 353-369).
Springer DOI
1810
BibRef
Fang, F.,
Wang, H.,
Tang, P.,
Image Captioning with Word Level Attention,
ICIP18(1278-1282)
IEEE DOI
1809
Visualization, Feature extraction, Task analysis, Training,
Recurrent neural networks, Semantics, Computational modeling,
bidirectional spatial embedding
BibRef
Zhu, Z.,
Xue, Z.,
Yuan, Z.,
Topic-Guided Attention for Image Captioning,
ICIP18(2615-2619)
IEEE DOI
1809
Visualization, Semantics, Feature extraction, Training, Decoding,
Generators, Measurement, Image captioning, Attention, Topic, Attribute,
Deep Neural Network
BibRef
Gomez-Garay, A.[Alejandro],
Raducanu, B.[Bogdan],
Salas, J.[Joaquín],
Dense Captioning of Natural Scenes in Spanish,
MCPR18(145-154).
Springer DOI
1807
BibRef
Yao, L.[Li],
Ballas, N.[Nicolas],
Cho, K.[Kyunghyun],
Smith, J.[John],
Bengio, Y.[Yoshua],
Oracle Performance for Visual Captioning,
BMVC16(xx-yy).
HTML Version.
1805
BibRef
Khatchatoorian, A.G.,
Jamzad, M.,
Post Rectifying Methods to Improve the Accuracy of Image Annotation,
DICTA17(1-7)
IEEE DOI
1804
feature extraction, image annotation, image classification,
image retrieval, matrix algebra, Class-tag relation matrix,
Time division multiplexing
BibRef
Dong, H.[Hao],
Zhang, J.Q.[Jing-Qing],
McIlwraith, D.[Douglas],
Guo, Y.[Yike],
I2T2I: Learning text to image synthesis with textual data
augmentation,
ICIP17(2015-2019)
IEEE DOI
1803
Birds, Generators, Image generation,
Recurrent neural networks, Shape, Training, Deep learning, GAN, Image Synthesis
BibRef
Pellegrin, L.[Luis],
Escalante, H.J.[Hugo Jair],
Montes-y-Gómez, M.[Manuel],
Villegas, M.[Mauricio],
González, F.A.[Fabio A.],
A Flexible Framework for the Evaluation of Unsupervised Image
Annotation,
CIARP17(508-516).
Springer DOI
1802
BibRef
Jia, Y.H.[Yu-Hua],
Bai, L.[Liang],
Wang, P.[Peng],
Guo, J.L.[Jin-Lin],
Xie, Y.X.[Yu-Xiang],
Deep Convolutional Neural Network for Correlating Images and Sentences,
MMMod18(I:154-165).
Springer DOI
1802
BibRef
Liu, J.Y.[Jing-Yu],
Wang, L.[Liang],
Yang, M.H.[Ming-Hsuan],
Referring Expression Generation and Comprehension via Attributes,
ICCV17(4866-4874)
IEEE DOI
1802
Language descriptions for objects.
learning (artificial intelligence), object detection, RefCOCO,
RefCOCO+, RefCOCOg, attribute learning model, common space model,
Visualization
BibRef
Dai, B.,
Fidler, S.,
Urtasun, R.,
Lin, D.,
Towards Diverse and Natural Image Descriptions via a Conditional GAN,
ICCV17(2989-2998)
IEEE DOI
1802
image retrieval, image sequences, inference mechanisms,
learning (artificial intelligence),
Visualization
BibRef
Niu, Z.X.[Zhen-Xing],
Zhou, M.[Mo],
Wang, L.[Le],
Gao, X.[Xinbo],
Hua, G.[Gang],
Hierarchical Multimodal LSTM for Dense Visual-Semantic Embedding,
ICCV17(1899-1907)
IEEE DOI
1802
Map sentences and images.
document image processing, image representation,
recurrent neural nets, HM-LSTM, Hierarchical Multimodal LSTM,
Recurrent neural networks
BibRef
Liang, X.,
Hu, Z.,
Zhang, H.,
Gan, C.,
Xing, E.P.,
Recurrent Topic-Transition GAN for Visual Paragraph Generation,
ICCV17(3382-3391)
IEEE DOI
1802
document image processing, inference mechanisms, natural scenes,
recurrent neural nets, text analysis, RTT-GAN,
Visualization
BibRef
Shetty, R.,
Rohrbach, M.,
Hendricks, L.A.,
Fritz, M.,
Schiele, B.,
Speaking the Same Language:
Matching Machine to Human Captions by Adversarial Training,
ICCV17(4155-4164)
IEEE DOI
1802
image matching, learning (artificial intelligence),
sampling methods, vocabulary, adversarial training,
Visualization
BibRef
Liu, S.,
Zhu, Z.,
Ye, N.,
Guadarrama, S.,
Murphy, K.,
Improved Image Captioning via Policy Gradient optimization of SPIDEr,
ICCV17(873-881)
IEEE DOI
1802
Maximum likelihood estimation, Measurement, Mixers, Robustness,
SPICE, Training
BibRef
Gu, J.X.[Jiu-Xiang],
Joty, S.[Shafiq],
Cai, J.F.[Jian-Fei],
Wang, G.[Gang],
Unpaired Image Captioning by Language Pivoting,
ECCV18(I: 519-535).
Springer DOI
1810
BibRef
Gu, J.X.[Jiu-Xiang],
Wang, G.[Gang],
Cai, J.F.[Jian-Fei],
Chen, T.H.[Tsu-Han],
An Empirical Study of Language CNN for Image Captioning,
ICCV17(1231-1240)
IEEE DOI
1802
computer vision, convolution, learning (artificial intelligence),
natural language processing, recurrent neural nets,
Recurrent neural networks
BibRef
Pedersoli, M.,
Lucas, T.,
Schmid, C.,
Verbeek, J.,
Areas of Attention for Image Captioning,
ICCV17(1251-1259)
IEEE DOI
1802
image segmentation, inference mechanisms,
natural language processing, object detection,
Visualization
BibRef
Li, Y.,
Ouyang, W.,
Zhou, B.,
Wang, K.,
Wang, X.,
Scene Graph Generation from Objects, Phrases and Region Captions,
ICCV17(1270-1279)
IEEE DOI
1802
graph theory, image classification, image representation,
neural nets, object detection,
Visualization
BibRef
Zhang, Z.,
Wu, J.J.,
Li, Q.,
Huang, Z.,
Traer, J.,
McDermott, J.H.,
Tenenbaum, J.B.,
Freeman, W.T.,
Generative Modeling of Audible Shapes for Object Perception,
ICCV17(1260-1269)
IEEE DOI
1802
audio recording, audio signal processing, audio-visual systems,
feature extraction, inference mechanisms, interactive systems,
Visualization
BibRef
Liu, Z.J.[Zhi-Jian],
Freeman, W.T.[William T.],
Tenenbaum, J.B.[Joshua B.],
Wu, J.J.[Jia-Jun],
Physical Primitive Decomposition,
ECCV18(XII: 3-20).
Springer DOI
1810
BibRef
Wu, J.J.[Jia-Jun],
Lim, J.[Joseph],
Zhang, H.Y.[Hong-Yi],
Tenenbaum, J.B.[Joshua B.],
Freeman, W.T.[William T.],
Physics 101: Learning Physical Object Properties from Unlabeled Videos,
BMVC16(xx-yy).
HTML Version.
1805
BibRef
Tavakoliy, H.R.,
Shetty, R.,
Borji, A.,
Laaksonen, J.,
Paying Attention to Descriptions Generated by Image Captioning Models,
ICCV17(2506-2515)
IEEE DOI
1802
feature extraction, image processing, human descriptions,
human-written descriptions, image captioning model,
Visualization
BibRef
Chen, T.H.,
Liao, Y.H.,
Chuang, C.Y.,
Hsu, W.T.,
Fu, J.,
Sun, M.,
Show, Adapt and Tell:
Adversarial Training of Cross-Domain Image Captioner,
ICCV17(521-530)
IEEE DOI
1802
image processing, inference mechanisms, text analysis, MSCOCO,
adversarial training procedure, captioner act, critic networks,
Training data
BibRef
Tripathi, A.[Anurag],
Gupta, A.[Abhinav],
Chaudhary, S.[Santanu],
Lall, B.[Brejesh],
Image Annotation Using Latent Components and Transmedia Association,
PReMI17(493-500).
Springer DOI
1711
BibRef
Pini, S.[Stefano],
Cornia, M.[Marcella],
Baraldi, L.[Lorenzo],
Cucchiara, R.[Rita],
Towards Video Captioning with Naming:
A Novel Dataset and a Multi-modal Approach,
CIAP17(II:384-395).
Springer DOI
1711
BibRef
Wu, B.Y.[Bao-Yuan],
Jia, F.[Fan],
Liu, W.[Wei],
Ghanem, B.[Bernard],
Diverse Image Annotation,
CVPR17(6194-6202)
IEEE DOI
1711
Correlation, Feature extraction, Measurement, Redundancy, Semantics
BibRef
Krause, J.[Jonathan],
Johnson, J.[Justin],
Krishna, R.[Ranjay],
Fei-Fei, L.[Li],
A Hierarchical Approach for Generating Descriptive Image Paragraphs,
CVPR17(3337-3345)
IEEE DOI
1711
Feature extraction, Natural languages, Pragmatics,
Recurrent neural networks, Speech, Visualization
BibRef
Vedantam, R.,
Bengio, S.,
Murphy, K.,
Parikh, D.,
Chechik, G.,
Context-Aware Captions from Context-Agnostic Supervision,
CVPR17(1070-1079)
IEEE DOI
1711
Birds, Cats, Cognition, Context modeling, Pragmatics, Training
BibRef
Gan, Z.,
Gan, C.,
He, X.,
Pu, Y.,
Tran, K.,
Gao, J.,
Carin, L.,
Deng, L.,
Semantic Compositional Networks for Visual Captioning,
CVPR17(1141-1150)
IEEE DOI
1711
Feature extraction, Mouth, Pediatrics, Semantics, Tensile stress,
Training, Visualization
BibRef
Ren, Z.,
Wang, X.,
Zhang, N.,
Lv, X.,
Li, L.J.,
Deep Reinforcement Learning-Based Image Captioning with Embedding
Reward,
CVPR17(1151-1159)
IEEE DOI
1711
Decision making, Learning (artificial intelligence), Measurement,
Neural networks, Training, Visualization
BibRef
Rennie, S.J.,
Marcheret, E.,
Mroueh, Y.,
Ross, J.,
Goel, V.,
Self-Critical Sequence Training for Image Captioning,
CVPR17(1179-1195)
IEEE DOI
1711
Inference algorithms, Learning (artificial intelligence),
Logic gates, Measurement, Predictive models, Training
BibRef
Yang, L.,
Tang, K.,
Yang, J.,
Li, L.J.,
Dense Captioning with Joint Inference and Visual Context,
CVPR17(1978-1987)
IEEE DOI
1711
Bioinformatics, Genomics, Object detection, Proposals, Semantics,
Training, Visualization
BibRef
Lu, J.,
Xiong, C.,
Parikh, D.,
Socher, R.,
Knowing When to Look: Adaptive Attention via a Visual Sentinel for
Image Captioning,
CVPR17(3242-3250)
IEEE DOI
1711
Adaptation models, Computational modeling, Context modeling,
Decoding, Logic gates, Mathematical model, Visualization
BibRef
Yao, T.,
Pan, Y.,
Li, Y.,
Mei, T.,
Incorporating Copying Mechanism in Image Captioning for Learning
Novel Objects,
CVPR17(5263-5271)
IEEE DOI
1711
Decoding, Hidden Markov models, Object recognition,
Recurrent neural networks, Standards, Training, Visualization
BibRef
Chen, L.,
Zhang, H.,
Xiao, J.,
Nie, L.,
Shao, J.,
Liu, W.,
Chua, T.S.,
SCA-CNN: Spatial and Channel-Wise Attention in Convolutional Networks
for Image Captioning,
CVPR17(6298-6306)
IEEE DOI
1711
Detectors, Feature extraction, Image coding, Neural networks,
Semantics, Visualization
BibRef
Sun, Q.,
Lee, S.,
Batra, D.,
Bidirectional Beam Search: Forward-Backward Inference in Neural
Sequence Models for Fill-in-the-Blank Image Captioning,
CVPR17(7215-7223)
IEEE DOI
1711
Approximation algorithms, Computational modeling, Decoding,
History, Inference algorithms, Recurrent, neural, networks
BibRef
Wang, Y.,
Lin, Z.,
Shen, X.,
Cohen, S.,
Cottrell, G.W.,
Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition,
CVPR17(7378-7387)
IEEE DOI
1711
Measurement, Recurrent neural networks, SPICE, Semantics, Skeleton, Training
BibRef
Zanfir, M.[Mihai],
Marinoiu, E.[Elisabeta],
Sminchisescu, C.[Cristian],
Spatio-Temporal Attention Models for Grounded Video Captioning,
ACCV16(IV: 104-119).
Springer DOI
1704
BibRef
Chen, T.H.[Tseng-Hung],
Zeng, K.H.[Kuo-Hao],
Hsu, W.T.[Wan-Ting],
Sun, M.[Min],
Video Captioning via Sentence Augmentation and Spatio-Temporal
Attention,
Assist16(I: 269-286).
Springer DOI
1704
BibRef
Tan, Y.H.[Ying Hua],
Chan, C.S.[Chee Seng],
phi-LSTM: A Phrase-Based Hierarchical LSTM Model for Image Captioning,
ACCV16(V: 101-117).
Springer DOI
1704
BibRef
Weiland, L.[Lydia],
Hulpus, I.[Ioana],
Ponzetto, S.P.[Simone Paolo],
Dietz, L.[Laura],
Using Object Detection, NLP, and Knowledge Bases to Understand the
Message of Images,
MMMod17(II: 405-418).
Springer DOI
1701
BibRef
Liu, Y.[Yu],
Guo, Y.M.[Yan-Ming],
Lew, M.S.[Michael S.],
What Convnets Make for Image Captioning?,
MMMod17(I: 416-428).
Springer DOI
1701
BibRef
Tran, K.,
He, X.,
Zhang, L.,
Sun, J.,
Rich Image Captioning in the Wild,
DeepLearn-C16(434-441)
IEEE DOI
1612
BibRef
Wang, Y.L.[Yi-Lin],
Wang, S.H.[Su-Hang],
Tang, J.L.[Ji-Liang],
Liu, H.[Huan],
Li, B.X.[Bao-Xin],
PPP: Joint Pointwise and Pairwise Image Label Prediction,
CVPR16(6005-6013)
IEEE DOI
1612
BibRef
Yatskar, M.[Mark],
Ordonez, V.,
Zettlemoyer, L.[Luke],
Farhadi, A.[Ali],
Commonly Uncommon: Semantic Sparsity in Situation Recognition,
CVPR17(6335-6344)
IEEE DOI
1711
BibRef
Earlier: A1, A3, A4, Only:
Situation Recognition: Visual Semantic Role Labeling for Image
Understanding,
CVPR16(5534-5542)
IEEE DOI
1612
Image recognition, Image representation, Predictive models,
Semantics, Tensile stress, Training
BibRef
Kottur, S.[Satwik],
Vedantam, R.[Ramakrishna],
Moura, J.M.F.[José M. F.],
Parikh, D.[Devi],
VisualWord2Vec (Vis-W2V):
Learning Visually Grounded Word Embeddings Using Abstract Scenes,
CVPR16(4985-4994)
IEEE DOI
1612
BibRef
Zhu, Y.,
Groth, O.,
Bernstein, M.,
Fei-Fei, L.,
Visual7W: Grounded Question Answering in Images,
CVPR16(4995-5004)
IEEE DOI
1612
BibRef
Zhang, P.,
Goyal, Y.,
Summers-Stay, D.,
Batra, D.,
Parikh, D.,
Yin and Yang: Balancing and Answering Binary Visual Questions,
CVPR16(5014-5022)
IEEE DOI
1612
BibRef
Park, D.H.,
Darrell, T.J.,
Rohrbach, A.,
Robust Change Captioning,
ICCV19(4623-4632)
IEEE DOI
2004
feature extraction, learning (artificial intelligence),
natural language processing, object-oriented programming, Predictive models
BibRef
Venugopalan, S.[Subhashini],
Hendricks, L.A.[Lisa Anne],
Rohrbach, M.[Marcus],
Mooney, R.[Raymond],
Darrell, T.J.[Trevor J.],
Saenko, K.[Kate],
Captioning Images with Diverse Objects,
CVPR17(1170-1178)
IEEE DOI
1711
BibRef
Earlier: A2, A1, A3, A4, A6, A5:
Deep Compositional Captioning: Describing Novel Object Categories
without Paired Training Data,
CVPR16(1-10)
IEEE DOI
1612
Data models, Image recognition, Predictive models, Semantics,
Training, Visualization.
Novel objects not in training data.
BibRef
Johnson, J.[Justin],
Karpathy, A.[Andrej],
Fei-Fei, L.[Li],
DenseCap:
Fully Convolutional Localization Networks for Dense Captioning,
CVPR16(4565-4574)
IEEE DOI
1612
Both localize and describe salient regions in images in natural language.
BibRef
Wang, M.[Minsi],
Song, L.[Li],
Yang, X.K.[Xiao-Kang],
Luo, C.F.[Chuan-Fei],
A parallel-fusion RNN-LSTM architecture for image caption generation,
ICIP16(4448-4452)
IEEE DOI
1610
Computational modeling
Deep convolutional networks and recurrent neural networks.
BibRef
Lin, X.[Xiao],
Parikh, D.[Devi],
Leveraging Visual Question Answering for Image-Caption Ranking,
ECCV16(II: 261-277).
Springer DOI
1611
BibRef
Earlier:
Don't just listen, use your imagination:
Leveraging visual common sense for non-visual tasks,
CVPR15(2984-2993)
IEEE DOI
1510
BibRef
Chen, T.L.[Tian-Lang],
Zhang, Z.P.[Zhong-Ping],
You, Q.Z.[Quan-Zeng],
Fang, C.[Chen],
Wang, Z.W.[Zhao-Wen],
Jin, H.L.[Hai-Lin],
Luo, J.B.[Jie-Bo],
'Factual' or 'Emotional':
Stylized Image Captioning with Adaptive Learning and Attention,
ECCV18(X: 527-543).
Springer DOI
1810
BibRef
You, Q.Z.[Quan-Zeng],
Jin, H.L.[Hai-Lin],
Wang, Z.W.[Zhao-Wen],
Fang, C.[Chen],
Luo, J.B.[Jie-Bo],
Image Captioning with Semantic Attention,
CVPR16(4651-4659)
IEEE DOI
1612
BibRef
Jia, X.[Xu],
Gavves, E.[Efstratios],
Fernando, B.[Basura],
Tuytelaars, T.[Tinne],
Guiding the Long-Short Term Memory Model for Image Caption Generation,
ICCV15(2407-2415)
IEEE DOI
1602
Computer architecture
BibRef
Chen, X.L.[Xin-Lei],
Zitnick, C.L.[C. Lawrence],
Mind's eye:
A recurrent visual representation for image caption generation,
CVPR15(2422-2431)
IEEE DOI
1510
BibRef
Vedantam, R.[Ramakrishna],
Zitnick, C.L.[C. Lawrence],
Parikh, D.[Devi],
CIDEr: Consensus-based image description evaluation,
CVPR15(4566-4575)
IEEE DOI
1510
BibRef
Fang, H.[Hao],
Gupta, S.[Saurabh],
Iandola, F.[Forrest],
Srivastava, R.K.[Rupesh K.],
Deng, L.[Li],
Dollar, P.[Piotr],
Gao, J.F.[Jian-Feng],
He, X.D.[Xiao-Dong],
Mitchell, M.[Margaret],
Platt, J.C.[John C.],
Zitnick, C.L.[C. Lawrence],
Zweig, G.[Geoffrey],
From captions to visual concepts and back,
CVPR15(1473-1482)
IEEE DOI
1510
BibRef
Ramnath, K.[Krishnan],
Baker, S.[Simon],
Vanderwende, L.[Lucy],
El-Saban, M.[Motaz],
Sinha, S.N.[Sudipta N.],
Kannan, A.[Anitha],
Hassan, N.[Noran],
Galley, M.[Michel],
Yang, Y.[Yi],
Ramanan, D.[Deva],
Bergamo, A.[Alessandro],
Torresani, L.[Lorenzo],
AutoCaption: Automatic caption generation for personal photos,
WACV14(1050-1057)
IEEE DOI
1406
Clouds
BibRef
Pan, J.Y.[Jia-Yu],
Yang, H.J.[Hyung-Jeong],
Faloutsos, C.[Christos],
MMSS: Graph-based Multi-modal Story-oriented Video Summarization and
Retrieval,
CMU-CS-TR-04-114.
HTML Version.
0501
BibRef
Pan, J.Y.[Jia-Yu],
Yang, H.J.[Hyung-Jeong],
Faloutsos, C.[Christos],
Duygulu, P.[Pinar],
GCap: Graph-based Automatic Image Captioning,
MMDE04(146).
IEEE DOI
0406
BibRef
Pan, J.Y.[Jia-Yu],
Advanced Tools for Video and Multimedia Mining,
CMU-CS-06-126, May 2006.
BibRef
0605
Ph.D. Thesis,
HTML Version.
BibRef
Chapter on Matching and Recognition Using Volumes, High Level Vision Techniques, Invariants continues in
General References for Matching.