13.6.9 Annotation, Captioning, Image Captioning

Image Captioning. Captioning. Annotation. Fine-Grained. The most important object or concept in the image. See also Video Retrieval, Video Annotation, Video Categorization, Genre. See also LSTM: Long Short-Term Memory.

Feng, Y.S.[Yan-Song], Lapata, M.,
Automatic Caption Generation for News Images,
PAMI(35), No. 4, April 2013, pp. 797-812.
IEEE DOI 1303
Use existing captions and tags, expand to similar images. BibRef

Nakayama, H.[Hideki], Harada, T.[Tatsuya], Kuniyoshi, Y.[Yasuo],
Dense Sampling Low-Level Statistics of Local Features,
IEICE(E93-D), No. 7, July 2010, pp. 1727-1736.
WWW Link. 1008
BibRef
Earlier: CIVR09(Article No 17).
DOI Link 0907
BibRef
And:
Global Gaussian approach for scene categorization using information geometry,
CVPR10(2336-2343).
IEEE DOI 1006
BibRef
Earlier:
AI Goggles: Real-time Description and Retrieval in the Real World with Online Learning,
CRV09(184-191).
IEEE DOI 0905
Local features; scalability of matching for large-scale indexing. Boosts global features with sampled statistics of local features. BibRef

Ushiku, Y.[Yoshitaka], Yamaguchi, M.[Masataka], Mukuta, Y.[Yusuke], Harada, T.[Tatsuya],
Common Subspace for Model and Similarity: Phrase Learning for Caption Generation from Images,
ICCV15(2668-2676)
IEEE DOI 1602
Feature extraction BibRef

Jin, J.[Jiren], Nakayama, H.[Hideki],
Annotation order matters: Recurrent Image Annotator for arbitrary length image tagging,
ICPR16(2452-2457)
IEEE DOI 1705
Correlation, Feature extraction, Indexes, Predictive models, Recurrent neural networks, Training BibRef

Harada, T.[Tatsuya], Nakayama, H.[Hideki], Kuniyoshi, Y.[Yasuo],
Improving Local Descriptors by Embedding Global and Local Spatial Information,
ECCV10(IV: 736-749).
Springer DOI 1009
BibRef
Earlier: A2, A1, A3:
Evaluation of dimensionality reduction methods for image auto-annotation,
BMVC10(xx-yy).
HTML Version. 1009
BibRef

Verma, Y.[Yashaswi], Jawahar, C.V.,
A support vector approach for cross-modal search of images and texts,
CVIU(154), No. 1, 2017, pp. 48-63.
Elsevier DOI 1612
Image search BibRef

Xue, J.F.[Jian-Fei], Eguchi, K.[Koji],
Video Data Modeling Using Sequential Correspondence Hierarchical Dirichlet Processes,
IEICE(E100-D), No. 1, January 2017, pp. 33-41.
WWW Link. 1701
Multimodal data, such as the mixture of visual words and speech words extracted from video files. BibRef

Tariq, A.[Amara], Foroosh, H.[Hassan],
A Context-Driven Extractive Framework for Generating Realistic Image Descriptions,
IP(26), No. 2, February 2017, pp. 619-632.
IEEE DOI 1702
image annotation BibRef

Vinyals, O.[Oriol], Toshev, A.[Alexander], Bengio, S.[Samy], Erhan, D.[Dumitru],
Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge,
PAMI(39), No. 4, April 2017, pp. 652-663.
IEEE DOI 1703
BibRef
Earlier:
Show and tell: A neural image caption generator,
CVPR15(3156-3164)
IEEE DOI 1510
Computational modeling BibRef
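Most of the neural captioners collected in this section, starting with the Show-and-Tell line of work above, share one basic pattern: a CNN encodes the image into a feature vector that initializes (or conditions) an LSTM language model, which then emits the caption word by word. The PyTorch sketch below illustrates only that generic encoder-decoder skeleton under assumed dimensions; class and parameter names are hypothetical and it is not any author's released implementation.

import torch
import torch.nn as nn

class CaptionDecoder(nn.Module):
    # Minimal CNN-feature -> LSTM caption decoder; dimensions are illustrative defaults.
    def __init__(self, feat_dim=2048, embed_dim=512, hidden_dim=512, vocab_size=10000):
        super().__init__()
        self.init_h = nn.Linear(feat_dim, hidden_dim)  # image feature -> initial hidden state
        self.init_c = nn.Linear(feat_dim, hidden_dim)  # image feature -> initial cell state
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)   # hidden state -> next-word logits

    def forward(self, image_feats, captions):
        # image_feats: (B, feat_dim) pooled CNN features; captions: (B, T) word indices
        h0 = self.init_h(image_feats).unsqueeze(0)     # (1, B, hidden_dim)
        c0 = self.init_c(image_feats).unsqueeze(0)
        emb = self.embed(captions)                     # (B, T, embed_dim)
        hidden, _ = self.lstm(emb, (h0, c0))           # (B, T, hidden_dim)
        return self.out(hidden)                        # (B, T, vocab_size) logits over the vocabulary

# Toy usage: random tensors stand in for CNN features and tokenized captions.
decoder = CaptionDecoder()
feats = torch.randn(4, 2048)
caps = torch.randint(0, 10000, (4, 12))
print(decoder(feats, caps).shape)  # torch.Size([4, 12, 10000])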

Gao, L.L.[Lian-Li], Guo, Z.[Zhao], Zhang, H.W.[Han-Wang], Xu, X.[Xing], Shen, H.T.[Heng Tao],
Video Captioning With Attention-Based LSTM and Semantic Consistency,
MultMed(19), No. 9, September 2017, pp. 2045-2055.
IEEE DOI 1708
Computational modeling, Correlation, Feature extraction, Neural networks, Semantics, Visualization, Attention mechanism, embedding, long short-term memory (LSTM), video, captioning BibRef
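The attention-based LSTM decoders that recur throughout this section weight region or frame features v_i by scores computed from the previous decoder state h_{t-1}. Stated generically (this is the common additive soft-attention form, not the exact variant of any single paper listed here):

\[ e_{t,i} = \mathbf{w}^{\top}\tanh(W_h h_{t-1} + W_v v_i), \qquad \alpha_{t,i} = \frac{\exp(e_{t,i})}{\sum_{j}\exp(e_{t,j})}, \qquad c_t = \sum_{i}\alpha_{t,i}\, v_i, \]

where the context vector c_t is fed to the LSTM at step t alongside the previous word embedding.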

Hu, M., Yang, Y., Shen, F., Zhang, L., Shen, H.T., Li, X.,
Robust Web Image Annotation via Exploring Multi-Facet and Structural Knowledge,
IP(26), No. 10, October 2017, pp. 4871-4884.
IEEE DOI 1708
image annotation, image retrieval, iterative methods, learning (artificial intelligence), multimedia systems, optimisation, pattern classification, RMSL, data structural information, digital technologies, image semantic indexing, image semantic retrieval, robust multiview semi-supervised learning, visual features, Manifolds, Multimedia communication, Semantics, Semisupervised learning, Supervised learning, Image annotation, l2,p-norm, multi-view learning, semi-supervised learning BibRef

Bin, Y., Yang, Y., Shen, F., Xie, N., Shen, H.T., Li, X.,
Describing Video With Attention-Based Bidirectional LSTM,
Cyber(49), No. 7, July 2019, pp. 2631-2641.
IEEE DOI 1905
Visualization, Semantics, Decoding, Feature extraction, Natural languages, Recurrent neural networks, Grammar, video captioning BibRef

Wang, J.Y.[Jing-Ya], Zhu, X.T.[Xia-Tian], Gong, S.G.[Shao-Gang],
Discovering visual concept structure with sparse and incomplete tags,
AI(250), No. 1, 2017, pp. 16-36.
Elsevier DOI 1708
Automatically discovering the semantic structure of tagged visual data (e.g. web videos and images). BibRef

Kilickaya, M.[Mert], Akkus, B.K.[Burak Kerim], Cakici, R.[Ruket], Erdem, A.[Aykut], Erdem, E.[Erkut], Ikizler-Cinbis, N.[Nazli],
Data-driven image captioning via salient region discovery,
IET-CV(11), No. 6, September 2017, pp. 398-406.
DOI Link 1709
BibRef

Fu, K.[Kun], Jin, J.Q.[Jun-Qi], Cui, R.P.[Run-Peng], Sha, F.[Fei], Zhang, C.S.[Chang-Shui],
Aligning Where to See and What to Tell: Image Captioning with Region-Based Attention and Scene-Specific Contexts,
PAMI(39), No. 12, December 2017, pp. 2321-2334.
IEEE DOI 1711
Adaptation models, Computational modeling, Context modeling, Data mining, Feature extraction, Image classification, Visualization, Image captioning, LSTM, visual attention. BibRef

Liu, A.A.[An-An], Xu, N.[Ning], Wong, Y.[Yongkang], Li, J.[Junnan], Su, Y.T.[Yu-Ting], Kankanhalli, M.[Mohan],
Hierarchical & multimodal video captioning: Discovering and transferring multimodal knowledge for vision to language,
CVIU(163), No. 1, 2017, pp. 113-125.
Elsevier DOI 1712
Video to text BibRef

Nian, F.D.[Fu-Dong], Li, T.[Teng], Wang, Y.[Yan], Wu, X.Y.[Xin-Yu], Ni, B.B.[Bing-Bing], Xu, C.S.[Chang-Sheng],
Learning explicit video attributes from mid-level representation for video captioning,
CVIU(163), No. 1, 2017, pp. 126-138.
Elsevier DOI 1712
Mid-level video representation BibRef

He, X.D.[Xiao-Dong], Deng, L.[Li],
Deep Learning for Image-to-Text Generation: A Technical Overview,
SPMag(34), No. 6, November 2017, pp. 109-116.
IEEE DOI 1712
BibRef
And: Errata: SPMag(35), No. 1, January 2018, pp. 178.
IEEE DOI. Artificial intelligence, Computer vision, Image classification, Natural language processing, Pediatrics, Semantics, Training data, Visualization BibRef

Li, L.H.[Ling-Hui], Tang, S.[Sheng], Zhang, Y.D.[Yong-Dong], Deng, L.X.[Li-Xi], Tian, Q.[Qi],
GLA: Global-Local Attention for Image Description,
MultMed(20), No. 3, March 2018, pp. 726-737.
IEEE DOI 1802
Computational modeling, Decoding, Feature extraction, Image recognition, Natural language processing, recurrent neural network BibRef

Guan, J.N.[Jin-Ning], Wang, E.[Eric],
Repeated review based image captioning for image evidence review,
SP:IC(63), 2018, pp. 141-148.
Elsevier DOI 1804
Repeated review, Image captioning, Encoder-decoder, Multimodal layer BibRef

Lu, X., Wang, B., Zheng, X., Li, X.,
Exploring Models and Data for Remote Sensing Image Caption Generation,
GeoRS(56), No. 4, April 2018, pp. 2183-2195.
IEEE DOI 1804
Computer vision, Feature extraction, Image representation, Recurrent neural networks, Remote sensing, Semantics, semantic understanding BibRef

Cheng, Q.[Qimin], Zhang, Q.[Qian], Fu, P.[Peng], Tu, C.H.[Cong-Huan], Li, S.[Sen],
A survey and analysis on automatic image annotation,
PR(79), 2018, pp. 242-259.
Elsevier DOI 1804
Automatic image annotation, Generative model, Nearest-neighbor model, Discriminative model, Tag-completion, Deep learning BibRef

Ben Rejeb, I.[Imen], Ouni, S.[Sonia], Barhoumi, W.[Walid], Zagrouba, E.[Ezzeddine],
Fuzzy VA-Files for multi-label image annotation based on visual content of regions,
SIViP(12), No. 5, July 2018, pp. 877-884.
Springer DOI 1806
Vector Approximation Files. BibRef

Helmy, T.[Tarek],
A Generic Framework for Semantic Annotation of Images,
IJIG(18), No. 3, July 2018, pp. Article 1850013.
DOI Link 1807
BibRef

Wu, C.L.[Chun-Lei], Wei, Y.[Yiwei], Chu, X.L.[Xiao-Liang], Su, F.[Fei], Wang, L.[Leiquan],
Modeling visual and word-conditional semantic attention for image captioning,
SP:IC(67), 2018, pp. 100-107.
Elsevier DOI 1808
Image captioning, Word-conditional semantic attention, Visual attention, Attention variation BibRef

Ye, S., Han, J., Liu, N.,
Attentive Linear Transformation for Image Captioning,
IP(27), No. 11, November 2018, pp. 5514-5524.
IEEE DOI 1809
feature extraction, image classification, learning (artificial intelligence), matrix algebra, probability, LSTM BibRef

Zhang, M., Yang, Y., Zhang, H., Ji, Y., Shen, H.T., Chua, T.,
More is Better: Precise and Detailed Image Captioning Using Online Positive Recall and Missing Concepts Mining,
IP(28), No. 1, January 2019, pp. 32-44.
IEEE DOI 1810
data mining, image representation, image retrieval, image segmentation, learning (artificial intelligence), element-wise selection BibRef

Hu, J.[Jiwei], Lam, K.M.[Kin-Man], Lou, P.[Ping], Liu, Q.[Quan], Deng, W.P.[Wu-Peng],
Can a machine have two systems for recognition, like human beings?,
JVCIR(56), 2018, pp. 275-286.
Elsevier DOI 1811
Image annotation, Multi-labeling, Hierarchical tree structure, Feature-pool selection BibRef

Bhagat, P.K., Choudhary, P.,
Image annotation: Then and now,
IVC(80), 2018, pp. 1-23.
Elsevier DOI 1812
Image annotation, Automatic image annotation, Multi-label classification, Image labeling, Image tagging, Image retrieval BibRef

Gil-Gonzalez, J., Alvarez-Meza, A., Orozco-Gutierrez, A.,
Learning from multiple annotators using kernel alignment,
PRL(116), 2018, pp. 150-156.
Elsevier DOI 1812
Multiple annotators, Kernel methods, Classification BibRef

Bazrafkan, S.[Shabab], Javidnia, H.[Hossein], Corcoran, P.[Peter],
Latent space mapping for generation of object elements with corresponding data annotation,
PRL(116), 2018, pp. 179-186.
Elsevier DOI 1812
Generative models, Latent space mapping, Deep neural networks BibRef

Gella, S.[Spandana], Keller, F.[Frank], Lapata, M.[Mirella],
Disambiguating Visual Verbs,
PAMI(41), No. 2, February 2019, pp. 311-322.
IEEE DOI 1901
Given an image and a verb, assign the correct sense of the verb. Visualization, Image recognition, Semantics, Natural language processing, Horses, Bicycles, Computer vision BibRef

Xu, N.[Ning], Liu, A.A.[An-An], Liu, J.[Jing], Nie, W.Z.[Wei-Zhi], Su, Y.T.[Yu-Ting],
Scene graph captioner: Image captioning based on structural visual representation,
JVCIR(58), 2019, pp. 477-485.
Elsevier DOI 1901
Image captioning, Scene graph, Structural representation, Attention BibRef

Niu, Y., Lu, Z., Wen, J., Xiang, T., Chang, S.,
Multi-Modal Multi-Scale Deep Learning for Large-Scale Image Annotation,
IP(28), No. 4, April 2019, pp. 1720-1731.
IEEE DOI 1901
feature extraction, image classification, image fusion, image representation, learning (artificial intelligence), label quantity prediction BibRef

Jiu, M.Y.[Ming-Yuan], Sahbi, H.[Hichem],
Deep representation design from deep kernel networks,
PR(88), 2019, pp. 447-457.
Elsevier DOI 1901
Multiple kernel learning, Kernel design, Deep networks, Efficient computation, Image annotation BibRef

He, X.W.[Xin-Wei], Shi, B.G.[Bao-Guang], Bai, X.[Xiang], Xia, G.S.[Gui-Song], Zhang, Z.X.[Zhao-Xiang], Dong, W.S.[Wei-Sheng],
Image Caption Generation with Part of Speech Guidance,
PRL(119), 2019, pp. 229-237.
Elsevier DOI 1902
Image caption generation, Part-of-speech tags, Long Short-Term Memory, Visual attributes BibRef

Park, C.C., Kim, B., Kim, G.,
Towards Personalized Image Captioning via Multimodal Memory Networks,
PAMI(41), No. 4, April 2019, pp. 999-1012.
IEEE DOI 1903
BibRef
Earlier:
Attend to You: Personalized Image Captioning with Context Sequence Memory Networks,
CVPR17(6432-6440)
IEEE DOI 1711
Tagging, Twitter, Task analysis, Computational modeling, Writing, Vocabulary, Context modeling, Image captioning, personalization, convolutional neural networks. Pattern recognition BibRef

Xiao, X.Y.[Xin-Yu], Wang, L.F.[Ling-Feng], Ding, K.[Kun], Xiang, S.M.[Shi-Ming], Pan, C.[Chunhong],
Dense semantic embedding network for image captioning,
PR(90), 2019, pp. 285-296.
Elsevier DOI 1903
Image captioning, Retrieval, High-level semantic information, Visual concept, Densely embedding, Long short-term memory BibRef

Yang, M., Zhao, W., Xu, W., Feng, Y., Zhao, Z., Chen, X., Lei, K.,
Multitask Learning for Cross-Domain Image Captioning,
MultMed(21), No. 4, April 2019, pp. 1047-1061.
IEEE DOI 1903
Task analysis, Image generation, Data models, Training data, Neural networks, Training, Maximum likelihood estimation, reinforcement learning BibRef

Foumani, S.N.M.[Seyed Navid Mohammadi], Nickabadi, A.[Ahmad],
A probabilistic topic model using deep visual word representation for simultaneous image classification and annotation,
JVCIR(59), 2019, pp. 195-203.
Elsevier DOI 1903
Image classification and annotation, Topic models, Probabilistic model, Deep learning, LLC BibRef

Zhang, X.R.[Xiang-Rong], Wang, X.[Xin], Tang, X.[Xu], Zhou, H.Y.[Hui-Yu], Li, C.[Chen],
Description Generation for Remote Sensing Images Using Attribute Attention Mechanism,
RS(11), No. 6, 2019, pp. xx-yy.
DOI Link 1903
BibRef

Zheng, H.[He], Wu, J.[Jiahong], Liang, R.[Rui], Li, Y.[Ye], Li, X.[Xuzhi],
Multi-task learning for captioning images with novel words,
IET-CV(13), No. 3, April 2019, pp. 294-301.
DOI Link 1904
BibRef

Yu, N., Hu, X., Song, B., Yang, J., Zhang, J.,
Topic-Oriented Image Captioning Based on Order-Embedding,
IP(28), No. 6, June 2019, pp. 2743-2754.
IEEE DOI 1905
image classification, image matching, image retrieval, learning (artificial intelligence), image matching, cross-modal retrieval BibRef

Ding, S.T.[Song-Tao], Qu, S.[Shiru], Xi, Y.L.[Yu-Ling], Sangaiah, A.K.[Arun Kumar], Wan, S.H.[Shao-Hua],
Image caption generation with high-level image features,
PRL(123), 2019, pp. 89-95.
Elsevier DOI 1906
Image captioning, Language model, Bottom-up attention mechanism, Faster R-CNN BibRef

Liu, X.X.[Xiao-Xiao], Xu, Q.Y.[Qing-Yang], Wang, N.[Ning],
A survey on deep neural network-based image captioning,
VC(35), No. 3, March 2019, pp. 445-470.
WWW Link. 1906
BibRef

Hossain, M.Z.[Md. Zakir], Sohel, F.[Ferdous], Shiratuddin, M.F.[Mohd Fairuz], Laga, H.[Hamid],
A Comprehensive Survey of Deep Learning for Image Captioning,
Surveys(51), No. 6, February 2019, pp. Article No 118.
DOI Link 1906
Survey, Captioning. BibRef

Peng, Y.Q.[Yu-Qing], Liu, X.[Xuan], Wang, W.H.[Wei-Hua], Zhao, X.S.[Xiao-Song], Wei, M.[Ming],
Image caption model of double LSTM with scene factors,
IVC(86), 2019, pp. 38-44.
Elsevier DOI 1906
Image caption, Deep neural network, Scene recognition, Semantic information BibRef

Zhang, J.J.[Jun-Jie], Wu, Q.[Qi], Zhang, J.[Jian], Shen, C.H.[Chun-Hua], Lu, J.F.[Jian-Feng], Wu, Q.A.[Qi-Ang],
Heritage image annotation via collective knowledge,
PR(93), 2019, pp. 204-214.
Elsevier DOI 1906
Annotation diversity, Image annotation, Representation learning, Collective knowledge, Heritage image collection BibRef

Verma, Y.[Yashaswi],
Diverse image annotation with missing labels,
PR(93), 2019, pp. 470-484.
Elsevier DOI 1906
Image annotation, Diverse labels, Missing labels, Nearest neighbour BibRef

Markatopoulou, F., Mezaris, V., Patras, I.,
Implicit and Explicit Concept Relations in Deep Neural Networks for Multi-Label Video/Image Annotation,
CirSysVideo(29), No. 6, June 2019, pp. 1631-1644.
IEEE DOI 1906
Task analysis, Correlation, Standards, Training, Electronic mail, Neural networks, Semantics, Video/image concept annotation, video analysis BibRef

Zhang, Z.J.[Zong-Jian], Wu, Q.[Qiang], Wang, Y.[Yang], Chen, F.[Fang],
High-Quality Image Captioning With Fine-Grained and Semantic-Guided Visual Attention,
MultMed(21), No. 7, July 2019, pp. 1681-1693.
IEEE DOI 1906
BibRef
Earlier:
Fine-Grained and Semantic-Guided Visual Attention for Image Captioning,
WACV18(1709-1717)
IEEE DOI 1806
Visualization, Semantics, Feature extraction, Decoding, Task analysis, Object oriented modeling, Image resolution, fully convolutional network-long short term memory framework. feedforward neural nets, image representation, image segmentation, convolutional neural network, Visualization BibRef

Laib, L.[Lakhdar], Allili, M.S.[Mohand Saïd], Ait-Aoudia, S.[Samy],
A probabilistic topic model for event-based image classification and multi-label annotation,
SP:IC(76), 2019, pp. 283-294.
Elsevier DOI 1906
Event recognition, Image annotation, Topic modeling, Convolutional neural nets BibRef

Olaode, A.[Abass], Naghdy, G.[Golshah],
Review of the application of machine learning to the automatic semantic annotation of images,
IET-IPR(13), No. 8, 20 June 2019, pp. 1232-1245.
DOI Link 1906
BibRef

Li, X., Jiang, S.,
Know More Say Less: Image Captioning Based on Scene Graphs,
MultMed(21), No. 8, August 2019, pp. 2117-2130.
IEEE DOI 1908
convolutional neural nets, feature extraction, graph theory, image representation, learning (artificial intelligence), vision-language BibRef

Zhang, C.J.[Chun-Jie], Cheng, J.[Jian], Tian, Q.[Qi],
Multiview, Few-Labeled Object Categorization by Predicting Labels With View Consistency,
Cyber(49), No. 11, November 2019, pp. 3834-3843.
IEEE DOI 1908
image annotation, image classification, learning (artificial intelligence), mapping function, view consistency BibRef

Xian, Y., Tian, Y.,
Self-Guiding Multimodal LSTM: When We Do Not Have a Perfect Training Dataset for Image Captioning,
IP(28), No. 11, November 2019, pp. 5241-5252.
IEEE DOI 1909
Task analysis, Visualization, Training, Semantics, Flickr, Urban areas, Training data, Image captioning, self-guiding, real-world dataset, recurrent neural network BibRef

Li, X., Xu, C., Wang, X., Lan, W., Jia, Z., Yang, G., Xu, J.,
COCO-CN for Cross-Lingual Image Tagging, Captioning, and Retrieval,
MultMed(21), No. 9, September 2019, pp. 2347-2360.
IEEE DOI 1909
Image annotation, Task analysis, Training, Image retrieval, Internet, Streaming media, Visualization, COCO-CN, Chinese language, image retrieval BibRef

Tian, C.[Chunna], Tian, M.[Ming], Jiang, M.M.[Meng-Meng], Liu, H.[Heng], Deng, D.H.[Dong-Hu],
How much do cross-modal related semantics benefit image captioning by weighting attributes and re-ranking sentences?,
PRL(125), 2019, pp. 639-645.
Elsevier DOI 1909
Semantic attributes, Attribute reweighting, Cross-modal related semantics, Sentence re-ranking BibRef

Sharif, N.[Naeha], White, L.[Lyndon], Bennamoun, M.[Mohammed], Liu, W.[Wei], Shah, S.A.A.[Syed Afaq Ali],
LCEval: Learned Composite Metric for Caption Evaluation,
IJCV(127), No. 10, October 2019, pp. 1586-1610.
Springer DOI 1909
Fine-grained analysis. BibRef

Zhang, Z.Y.[Zheng-Yuan], Diao, W.H.[Wen-Hui], Zhang, W.K.[Wen-Kai], Yan, M.L.[Meng-Long], Gao, X.[Xin], Sun, X.[Xian],
LAM: Remote Sensing Image Captioning with Label-Attention Mechanism,
RS(11), No. 20, 2019, pp. xx-yy.
DOI Link 1910
BibRef

Fu, K.[Kun], Li, Y.[Yang], Zhang, W.K.[Wen-Kai], Yu, H.F.[Hong-Feng], Sun, X.[Xian],
Boosting Memory with a Persistent Memory Mechanism for Remote Sensing Image Captioning,
RS(12), No. 11, 2020, pp. xx-yy.
DOI Link 2006
BibRef

Tan, J.H., Chan, C.S., Chuah, J.H.,
COMIC: Toward A Compact Image Captioning Model With Attention,
MultMed(21), No. 10, October 2019, pp. 2686-2696.
IEEE DOI 1910
embedded systems; feature extraction; image retrieval; matrix algebra. BibRef

Zhou, L., Zhang, Y., Jiang, Y., Zhang, T., Fan, W.,
Re-Caption: Saliency-Enhanced Image Captioning Through Two-Phase Learning,
IP(29), No. 1, 2020, pp. 694-709.
IEEE DOI 1910
feature extraction, image processing, learning (artificial intelligence), visual attribute BibRef

Yang, L.[Liang], Hu, H.F.[Hai-Feng],
Visual Skeleton and Reparative Attention for Part-of-Speech image captioning system,
CVIU(189), 2019, pp. 102819.
Elsevier DOI 1911
Neural network, Visual attention, Image captioning BibRef

Wang, J.B.[Jun-Bo], Wang, W.[Wei], Wang, L.[Liang], Wang, Z.Y.[Zhi-Yong], Feng, D.D.[David Dagan], Tan, T.N.[Tie-Niu],
Learning visual relationship and context-aware attention for image captioning,
PR(98), 2020, pp. 107075.
Elsevier DOI 1911
Image captioning, Relational reasoning, Context-aware attention BibRef

Xiao, X., Wang, L., Ding, K., Xiang, S., Pan, C.,
Deep Hierarchical Encoder-Decoder Network for Image Captioning,
MultMed(21), No. 11, November 2019, pp. 2942-2956.
IEEE DOI 1911
Visualization, Semantics, Hidden Markov models, Decoding, Logic gates, Training, Computer architecture, vision-sentence BibRef

Jiang, T.[Teng], Zhang, Z.[Zehan], Yang, Y.[Yupu],
Modeling coverage with semantic embedding for image caption generation,
VC(35), No. 11, November 2018, pp. 1655-1665.
WWW Link. 1911
BibRef

Tang, C., Liu, X., Wang, P., Zhang, C., Li, M., Wang, L.,
Adaptive Hypergraph Embedded Semi-Supervised Multi-Label Image Annotation,
MultMed(21), No. 11, November 2019, pp. 2837-2849.
IEEE DOI 1911
Image annotation, Semisupervised learning, Semantics, Computational modeling, Task analysis, Training, Computer science, feature projection BibRef

Mundnich, K.[Karel], Booth, B.M.[Brandon M.], Girault, B.[Benjamin], Narayanan, S.[Shrikanth],
Generating labels for regression of subjective constructs using triplet embeddings,
PRL(128), 2019, pp. 385-392.
Elsevier DOI 1912
Continuous-time annotations, Annotation fusion, Inter-rater agreement, Triplet embeddings, Ordinal embeddings BibRef

Huang, Y., Chen, J., Ouyang, W., Wan, W., Xue, Y.,
Image Captioning With End-to-End Attribute Detection and Subsequent Attributes Prediction,
IP(29), 2020, pp. 4013-4026.
IEEE DOI 2002
Image captioning, semantic attention, end-to-end training, multimodal attribute detector, subsequent attribute predictor BibRef

Lu, X., Wang, B., Zheng, X.,
Sound Active Attention Framework for Remote Sensing Image Captioning,
GeoRS(58), No. 3, March 2020, pp. 1985-2000.
IEEE DOI 2003
Active attention, remote sensing image captioning, semantic understanding BibRef

Wu, L., Xu, M., Wang, J., Perry, S.,
Recall What You See Continually Using GridLSTM in Image Captioning,
MultMed(22), No. 3, March 2020, pp. 808-818.
IEEE DOI 2003
Visualization, Decoding, Task analysis, Neural networks, Training, Computational modeling, Logic gates, Image captioning, GridLSTM, recurrent neural network BibRef

Li, Y.Y.[Yang-Yang], Fang, S.K.[Shuang-Kang], Jiao, L.C.[Li-Cheng], Liu, R.[Ruijiao], Shang, R.H.[Rong-Hua],
A Multi-Level Attention Model for Remote Sensing Image Captions,
RS(12), No. 6, 2020, pp. xx-yy.
DOI Link 2003
What are the important things in the image? BibRef

Chaudhary, C., Goyal, P., Prasad, D.N., Chen, Y.P.,
Enhancing the Quality of Image Tagging Using a Visio-Textual Knowledge Base,
MultMed(22), No. 4, April 2020, pp. 897-911.
IEEE DOI 2004
Knowledge based systems, Visualization, Image annotation, Encyclopedias, Electronic publishing, Internet, Tagging, knowledge based systems BibRef

Chen, X.H.[Xing-Han], Zhang, M.X.[Ming-Xing], Wang, Z.[Zheng], Zuo, L.[Lin], Li, B.[Bo], Yang, Y.[Yang],
Leveraging unpaired out-of-domain data for image captioning,
PRL(132), 2020, pp. 132-140.
Elsevier DOI 2005
Image captioning, Out-of-domain data, Deep learning BibRef

Xu, N., Zhang, H., Liu, A., Nie, W., Su, Y., Nie, J., Zhang, Y.,
Multi-Level Policy and Reward-Based Deep Reinforcement Learning Framework for Image Captioning,
MultMed(22), No. 5, May 2020, pp. 1372-1383.
IEEE DOI 2005
Visualization, Measurement, Task analysis, Reinforcement learning, Optimization, Adaptation models, Semantics, Multi-level policy, image captioning BibRef

Deng, Z.R.[Zhen-Rong], Jiang, Z.Q.[Zhou-Qin], Lan, R.[Rushi], Huang, W.M.[Wen-Ming], Luo, X.N.[Xiao-Nan],
Image captioning using DenseNet network and adaptive attention,
SP:IC(85), 2020, pp. 115836.
Elsevier DOI 2005
Image captioning, DenseNet, LSTM, Adaptive attention mechanism BibRef

Ji, J., Xu, C., Zhang, X., Wang, B., Song, X.,
Spatio-Temporal Memory Attention for Image Captioning,
IP(29), 2020, pp. 7615-7628.
IEEE DOI 2007
Image captioning, spatio-temporal relationship, attention transmission, memory attention, LSTM BibRef

Guo, L., Liu, J., Lu, S., Lu, H.,
Show, Tell, and Polish: Ruminant Decoding for Image Captioning,
MultMed(22), No. 8, August 2020, pp. 2149-2162.
IEEE DOI 2007
Decoding, Visualization, Planning, Training, Semantics, Reinforcement learning, Task analysis, Image captioning, rumination BibRef

Khatchatoorian, A.G.[Artin Ghostan], Jamzad, M.[Mansour],
Architecture to improve the accuracy of automatic image annotation systems,
IET-CV(14), No. 5, August 2020, pp. 214-223.
DOI Link 2007
BibRef

Theodosiou, Z.[Zenonas], Tsapatsoulis, N.[Nicolas],
Image annotation: the effects of content, lexicon and annotation method,
MultInfoRetr(9), No. 3, September 2020, pp. 191-203.
WWW Link. 2008
BibRef

Che, W.B.[Wen-Bin], Fan, X.P.[Xiao-Peng], Xiong, R.Q.[Rui-Qin], Zhao, D.B.[De-Bin],
Visual Relationship Embedding Network for Image Paragraph Generation,
MultMed(22), No. 9, September 2020, pp. 2307-2320.
IEEE DOI 2008
Visualization, Semantics, Task analysis, Proposals, Automobiles, Buildings, Paragraph generation, image caption, LSTM BibRef


Durand, T.[Thibaut],
Learning User Representations for Open Vocabulary Image Hashtag Prediction,
CVPR20(9766-9775)
IEEE DOI 2008
Tagging, Twitter, Computational modeling, Vocabulary, Predictive models, History, Visualization BibRef

Prabhudesai, M.[Mihir], Tung, H.Y.F.[Hsiao-Yu Fish], Javed, S.A.[Syed Ashar], Sieb, M.[Maximilian], Harley, A.W.[Adam W.], Fragkiadaki, K.[Katerina],
Embodied Language Grounding With 3D Visual Feature Representations,
CVPR20(2217-2226)
IEEE DOI 2008
Associating language utterances to 3D visual abstractions. Visualization, Cameras, Feature extraction, Detectors, Solid modeling BibRef

Li, Z., Tran, Q., Mai, L., Lin, Z., Yuille, A.L.,
Context-Aware Group Captioning via Self-Attention and Contrastive Features,
CVPR20(3437-3447)
IEEE DOI 2008
Task analysis, Visualization, Computer vision, Context modeling, Training, Natural languages, Computational modeling BibRef

Zhou, Y., Wang, M., Liu, D., Hu, Z., Zhang, H.,
More Grounded Image Captioning by Distilling Image-Text Matching Model,
CVPR20(4776-4785)
IEEE DOI 2008
Visualization, Grounding, Task analysis, Training, Measurement, Computational modeling, Image edge detection BibRef

Sammani, F., Melas-Kyriazi, L.,
Show, Edit and Tell: A Framework for Editing Image Captions,
CVPR20(4807-4815)
IEEE DOI 2008
Decoding, Visualization, Task analysis, Logic gates, Natural languages, Adaptation models, Glass BibRef

Chen, S., Jin, Q., Wang, P., Wu, Q.,
Say As You Wish: Fine-Grained Control of Image Caption Generation With Abstract Scene Graphs,
CVPR20(9959-9968)
IEEE DOI 2008
Semantics, Decoding, Visualization, Feature extraction, Controllability, Task analysis, Measurement BibRef

Guo, L., Liu, J., Zhu, X., Yao, P., Lu, S., Lu, H.,
Normalized and Geometry-Aware Self-Attention Network for Image Captioning,
CVPR20(10324-10333)
IEEE DOI 2008
Geometry, Task analysis, Visualization, Decoding, Training, Feature extraction, Computer architecture BibRef

Cornia, M., Stefanini, M., Baraldi, L., Cucchiara, R.,
Meshed-Memory Transformer for Image Captioning,
CVPR20(10575-10584)
IEEE DOI 2008
Decoding, Encoding, Visualization, Image coding, Computer architecture, Proposals, Task analysis BibRef

Chen, J., Jin, Q.,
Better Captioning With Sequence-Level Exploration,
CVPR20(10887-10896)
IEEE DOI 2008
Task analysis, Measurement, Training, Computational modeling, Computer architecture, Portable computers, Decoding BibRef

Pan, Y., Yao, T., Li, Y., Mei, T.,
X-Linear Attention Networks for Image Captioning,
CVPR20(10968-10977)
IEEE DOI 2008
Visualization, Decoding, Cognition, Knowledge discovery, Task analysis, Aggregates, Weight measurement BibRef

Tran, A., Mathews, A., Xie, L.,
Transform and Tell: Entity-Aware News Image Captioning,
CVPR20(13032-13042)
IEEE DOI 2008
Decoding, Vocabulary, Transforms, Linguistics, Performance gain, Neural networks, Training BibRef

Park, G.[Geondo], Han, C.[Chihye], Kim, D.[Daeshik], Yoon, W.[Wonjun],
MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding,
WACV20(1507-1515)
IEEE DOI 2006
Feature extraction, Visualization, Semantics, Task analysis, Recurrent neural networks, Image representation, Image coding BibRef

Chen, C., Zhang, R., Koh, E., Kim, S., Cohen, S., Rossi, R.,
Figure Captioning with Relation Maps for Reasoning,
WACV20(1526-1534)
IEEE DOI 2006
Bars, Training, Visualization, Decoding, Computational modeling, Task analysis, Portable document format BibRef

He, S., Tavakoli, H.R., Borji, A., Pugeault, N.,
Human Attention in Image Captioning: Dataset and Analysis,
ICCV19(8528-8537)
IEEE DOI 2004
Code, Captioning.
WWW Link. convolutional neural nets, image segmentation, natural language processing, object detection, visual perception, Adaptation models BibRef

Laina, I., Rupprecht, C., Navab, N.,
Towards Unsupervised Image Captioning With Shared Multimodal Embeddings,
ICCV19(7413-7423)
IEEE DOI 2004
computer vision, natural language processing, text analysis, multimodal embeddings, explicit supervision, computer vision, Semantics BibRef

Huang, L., Wang, W., Chen, J., Wei, X.,
Attention on Attention for Image Captioning,
ICCV19(4633-4642)
IEEE DOI 2004
Code, Captioning.
WWW Link. decoding, encoding, image processing, natural language processing, element-wise multiplication, image captioning, weighted average, Testing BibRef

Yang, X., Zhang, H., Cai, J.,
Learning to Collocate Neural Modules for Image Captioning,
ICCV19(4249-4259)
IEEE DOI 2004
image processing, learning (artificial intelligence), natural language processing, neural nets, Neural networks BibRef

Yao, T., Pan, Y., Li, Y., Mei, T.,
Hierarchy Parsing for Image Captioning,
ICCV19(2621-2629)
IEEE DOI 2004
convolutional neural nets, feature extraction, image coding, image representation, image segmentation, Image segmentation BibRef

Liu, L., Tang, J., Wan, X., Guo, Z.,
Generating Diverse and Descriptive Image Captions Using Visual Paraphrases,
ICCV19(4239-4248)
IEEE DOI 2004
computer vision, image classification, learning (artificial intelligence), Machine learning BibRef

Ke, L., Pei, W., Li, R., Shen, X., Tai, Y.,
Reflective Decoding Network for Image Captioning,
ICCV19(8887-8896)
IEEE DOI 2004
decoding, encoding, feature extraction, learning (artificial intelligence), Random access memory BibRef

Vered, G., Oren, G., Atzmon, Y., Chechik, G.,
Joint Optimization for Cooperative Image Captioning,
ICCV19(8897-8906)
IEEE DOI 2004
gradient methods, image sampling, natural language processing, stochastic programming, text analysis, Loss measurement BibRef

Ge, H., Yan, Z., Zhang, K., Zhao, M., Sun, L.,
Exploring Overall Contextual Information for Image Captioning in Human-Like Cognitive Style,
ICCV19(1754-1763)
IEEE DOI 2004
cognition, computational linguistics, learning (artificial intelligence), Cognition BibRef

Agrawal, H., Desai, K., Wang, Y., Chen, X., Jain, R., Johnson, M., Batra, D., Parikh, D., Lee, S., Anderson, P.,
nocaps: novel object captioning at scale,
ICCV19(8947-8956)
IEEE DOI 2004
computer vision, feature extraction, learning (artificial intelligence), object detection, Vegetation BibRef

Li, G., Zhu, L., Liu, P., Yang, Y.,
Entangled Transformer for Image Captioning,
ICCV19(8927-8936)
IEEE DOI 2004
image retrieval, learning (artificial intelligence), natural language processing, recurrent neural nets, robot vision, Proposals BibRef

Hu, H., Misra, I., van der Maaten, L.,
Evaluating Text-to-Image Matching using Binary Image Selection (BISON),
CLVL19(1887-1890)
IEEE DOI 2004
computer vision, content-based retrieval, image annotation, image matching, image retrieval, text analysis, linguistic content, Image Captioning BibRef

Nguyen, A., Tran, Q.D., Do, T., Reid, I., Caldwell, D.G., Tsagarakis, N.G.,
Object Captioning and Retrieval with Natural Language,
ACVR19(2584-2592)
IEEE DOI 2004
convolutional neural nets, image retrieval, learning (artificial intelligence), vision and language BibRef

Gu, J., Joty, S., Cai, J., Zhao, H., Yang, X., Wang, G.,
Unpaired Image Captioning via Scene Graph Alignments,
ICCV19(10322-10331)
IEEE DOI 2004
graph theory, image representation, image retrieval, natural language processing, text analysis, Encoding BibRef

Shen, T., Kar, A., Fidler, S.,
Learning to Caption Images Through a Lifetime by Asking Questions,
ICCV19(10392-10401)
IEEE DOI 2004
image retrieval, multi-agent systems, natural language processing, Automobiles BibRef

Tanaka, M., Itamochi, T., Narioka, K., Sato, I., Ushiku, Y., Harada, T.,
Generating Easy-to-Understand Referring Expressions for Target Identifications,
ICCV19(5793-5802)
IEEE DOI 2004
Code, Annotation.
WWW Link. computer games, image processing, referred objects, salient contexts, human annotation, Grand Theft Auto V, Task analysis BibRef

Aneja, J.[Jyoti], Agrawal, H.[Harsh], Batra, D.[Dhruv], Schwing, A.G.[Alexander G.],
Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning,
ICCV19(4260-4269)
IEEE DOI 2004
image retrieval, image segmentation, learning (artificial intelligence), recurrent neural nets, Controllability BibRef

Gupta, T., Schwing, A.G., Hoiem, D.,
ViCo: Word Embeddings From Visual Co-Occurrences,
ICCV19(7424-7433)
IEEE DOI 2004
feature extraction, image annotation, image classification, pattern clustering, supervised learning, text analysis, Vocabulary BibRef

Deshpande, A.[Aditya], Aneja, J.[Jyoti], Wang, L.[Liwei], Schwing, A.G.[Alexander G.], Forsyth, D.[David],
Fast, Diverse and Accurate Image Captioning Guided by Part-Of-Speech,
CVPR19(10687-10696).
IEEE DOI 2002
BibRef

Wei, H.Y.[Hai-Yang], Li, Z.X.[Zhi-Xin], Zhang, C.L.[Can-Long],
Image Captioning Based on Visual and Semantic Attention,
MMMod20(I:151-162).
Springer DOI 2003
BibRef

Dognin, P.[Pierre], Melnyk, I.[Igor], Mroueh, Y.[Youssef], Ross, J.[Jerret], Sercu, T.[Tom],
Adversarial Semantic Alignment for Improved Image Captions,
CVPR19(10455-10463).
IEEE DOI 2002
BibRef

Yang, X.[Xu], Tang, K.[Kaihua], Zhang, H.[Hanwang], Cai, J.F.[Jian-Fei],
Auto-Encoding Scene Graphs for Image Captioning,
CVPR19(10677-10686).
IEEE DOI 2002
BibRef

Fukui, H.[Hiroshi], Hirakawa, T.[Tsubasa], Yamashita, T.[Takayoshi], Fujiyoshi, H.[Hironobu],
Attention Branch Network: Learning of Attention Mechanism for Visual Explanation,
CVPR19(10697-10706).
IEEE DOI 2002
BibRef

Biten, A.F.[Ali Furkan], Gomez, L.[Lluis], Rusinol, M.[Marcal], Karatzas, D.[Dimosthenis],
Good News, Everyone! Context Driven Entity-Aware Captioning for News Images,
CVPR19(12458-12467).
IEEE DOI 2002
BibRef

Akbari, H.[Hassan], Karaman, S.[Svebor], Bhargava, S.[Surabhi], Chen, B.[Brian], Vondrick, C.[Carl], Chang, S.F.[Shih-Fu],
Multi-Level Multimodal Common Semantic Space for Image-Phrase Grounding,
CVPR19(12468-12478).
IEEE DOI 2002
BibRef

Li, Y.[Yehao], Yao, T.[Ting], Pan, Y.[Yingwei], Chao, H.Y.[Hong-Yang], Mei, T.[Tao],
Pointing Novel Objects in Image Captioning,
CVPR19(12489-12498).
IEEE DOI 2002
BibRef

Bracha, L.[Lior], Chechik, G.[Gal],
Informative Object Annotations: Tell Me Something I Don't Know,
CVPR19(12499-12507).
IEEE DOI 2002
BibRef

Shuster, K.[Kurt], Humeau, S.[Samuel], Hu, H.[Hexiang], Bordes, A.[Antoine], Weston, J.[Jason],
Engaging Image Captioning via Personality,
CVPR19(12508-12518).
IEEE DOI 2002
BibRef

Feng, Y.[Yang], Ma, L.[Lin], Liu, W.[Wei], Luo, J.B.[Jie-Bo],
Unsupervised Image Captioning,
CVPR19(4120-4129).
IEEE DOI 2002
BibRef

Xu, Y.[Yan], Wu, B.Y.[Bao-Yuan], Shen, F.[Fumin], Fan, Y.B.[Yan-Bo], Zhang, Y.[Yong], Shen, H.T.[Heng Tao], Liu, W.[Wei],
Exact Adversarial Attack to Image Captioning via Structured Output Learning With Latent Variables,
CVPR19(4130-4139).
IEEE DOI 2002
BibRef

Wang, Q.Z.[Qing-Zhong], Chan, A.B.[Antoni B.],
Describing Like Humans: On Diversity in Image Captioning,
CVPR19(4190-4198).
IEEE DOI 2002
BibRef

Guo, L.T.[Long-Teng], Liu, J.[Jing], Yao, P.[Peng], Li, J.W.[Jiang-Wei], Lu, H.Q.[Han-Qing],
MSCap: Multi-Style Image Captioning With Unpaired Stylized Text,
CVPR19(4199-4208).
IEEE DOI 2002
BibRef

Zhang, L.[Lu], Zhang, J.M.[Jian-Ming], Lin, Z.[Zhe], Lu, H.C.[Hu-Chuan], He, Y.[You],
CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection,
CVPR19(6017-6026).
IEEE DOI 2002
BibRef

Yin, G.J.[Guo-Jun], Sheng, L.[Lu], Liu, B.[Bin], Yu, N.H.[Neng-Hai], Wang, X.G.[Xiao-Gang], Shao, J.[Jing],
Context and Attribute Grounded Dense Captioning,
CVPR19(6234-6243).
IEEE DOI 2002
BibRef

Kim, D.J.[Dong-Jin], Choi, J.[Jinsoo], Oh, T.H.[Tae-Hyun], Kweon, I.S.[In So],
Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning,
CVPR19(6264-6273).
IEEE DOI 2002
BibRef

Gao, J.L.[Jun-Long], Wang, S.[Shiqi], Wang, S.S.[Shan-She], Ma, S.W.[Si-Wei], Gao, W.[Wen],
Self-Critical N-Step Training for Image Captioning,
CVPR19(6293-6301).
IEEE DOI 2002
BibRef

Cornia, M.[Marcella], Baraldi, L.[Lorenzo], Cucchiara, R.[Rita],
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions,
CVPR19(8299-8308).
IEEE DOI 2002
BibRef

Qin, Y.[Yu], Du, J.J.[Jia-Jun], Zhang, Y.[Yonghua], Lu, H.T.[Hong-Tao],
Look Back and Predict Forward in Image Captioning,
CVPR19(8359-8367).
IEEE DOI 2002
BibRef

Zheng, Y.[Yue], Li, Y.[Yali], Wang, S.[Shengjin],
Intention Oriented Image Captions With Guiding Objects,
CVPR19(8387-8396).
IEEE DOI 2002
BibRef

Huang, Y., Li, C., Li, T., Wan, W., Chen, J.,
Image Captioning with Attribute Refinement,
ICIP19(1820-1824)
IEEE DOI 1910
Image captioning, attribute recognition, Semantic attention, Deep Neural Network, Conditional Random Field BibRef

Lee, J., Lee, Y., Seong, S., Kim, K., Kim, S., Kim, J.,
Capturing Long-Range Dependencies in Video Captioning,
ICIP19(1880-1884)
IEEE DOI 1910
Video captioning, non-local block, long short-term memory, long-range dependency, video representation BibRef

Shi, J., Li, Y., Wang, S.,
Cascade Attention: Multiple Feature Based Learning for Image Captioning,
ICIP19(1970-1974)
IEEE DOI 1910
Image Captioning, Attention Mechanism, Cascade Attention BibRef

Wang, Y., Shen, Y., Xiong, H., Lin, W.,
Adaptive Hard Example Mining for Image Captioning,
ICIP19(3342-3346)
IEEE DOI 1910
Reinforcement Learning, Image Captioning BibRef

Xiao, H., Shi, J.,
A Novel Attribute Selection Mechanism for Video Captioning,
ICIP19(619-623)
IEEE DOI 1910
Attributes, Video captioning, Attention, Reinforcement learning BibRef

Lim, J.H., Chan, C.S.,
Mask Captioning Network,
ICIP19(1-5)
IEEE DOI 1910
Image captioning, Deep learning, Scene understanding BibRef

Wang, Q.Z.[Qing-Zhong], Chan, A.B.[Antoni B.],
Gated Hierarchical Attention for Image Captioning,
ACCV18(IV:21-37).
Springer DOI 1906
BibRef

Wang, W.[Weixuan], Chen, Z.H.[Zhi-Hong], Hu, H.F.[Hai-Feng],
Multivariate Attention Network for Image Captioning,
ACCV18(VI:587-602).
Springer DOI 1906
BibRef

Ghanimifard, M.[Mehdi], Dobnik, S.[Simon],
Knowing When to Look for What and Where: Evaluating Generation of Spatial Descriptions with Adaptive Attention,
VL18(IV:153-161).
Springer DOI 1905
See also Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning. BibRef

Kim, B.[Boeun], Lee, Y.H.[Young Han], Jung, H.[Hyedong], Cho, C.[Choongsang],
Distinctive-Attribute Extraction for Image Captioning,
VL18(IV:133-144).
Springer DOI 1905
BibRef

Tanti, M.[Marc], Gatt, A.[Albert], Muscat, A.[Adrian],
Pre-gen Metrics: Predicting Caption Quality Metrics Without Generating Captions,
VL18(IV:114-123).
Springer DOI 1905
BibRef

Tanti, M.[Marc], Gatt, A.[Albert], Camilleri, K.P.[Kenneth P.],
Quantifying the Amount of Visual Information Used by Neural Caption Generators,
VL18(IV:124-132).
Springer DOI 1905
BibRef

Ren, L., Qi, G., Hua, K.,
Improving Diversity of Image Captioning Through Variational Autoencoders and Adversarial Learning,
WACV19(263-272)
IEEE DOI 1904
computer vision, image classification, image coding, image segmentation, learning (artificial intelligence), Maximum likelihood estimation BibRef

Zhou, Y., Sun, Y., Honavar, V.,
Improving Image Captioning by Leveraging Knowledge Graphs,
WACV19(283-293)
IEEE DOI 1904
computer vision, graph theory, image capture, image retrieval, performance measure, image captioning systems, knowledge graphs, Generators BibRef

Rapson, C.J., Seet, B., Naeem, M.A., Lee, J.E., Al-Sarayreh, M., Klette, R.,
Reducing the Pain: A Novel Tool for Efficient Ground-Truth Labelling in Images,
IVCNZ18(1-9)
IEEE DOI 1902
Labeling, Tools, Image segmentation, Image color analysis, Brushes, Head, Automobiles, image labelling, annotations, segmentation, image dataset BibRef

Lu, J.S.[Jia-Sen], Yang, J.W.[Jian-Wei], Batra, D.[Dhruv], Parikh, D.[Devi],
Neural Baby Talk,
CVPR18(7219-7228)
IEEE DOI 1812
Detectors, Visualization, Grounding, Pediatrics, Natural languages, Dogs, Task analysis BibRef

Wu, B.Y.[Bao-Yuan], Chen, W.D.[Wei-Dong], Sun, P.[Peng], Liu, W.[Wei], Ghanem, B.[Bernard], Lyu, S.W.[Si-Wei],
Tagging Like Humans: Diverse and Distinct Image Annotation,
CVPR18(7967-7975)
IEEE DOI 1812
Semantics, Image annotation, Redundancy, Training, Task analysis, Generators BibRef

Wu, X.J.[Xin-Jian], Zhang, L.[Li], Li, F.Z.[Fan-Zhang], Wang, B.J.[Bang-Jun],
A Novel Model for Multi-label Image Annotation,
ICPR18(1953-1958)
IEEE DOI 1812
Feature extraction, Image annotation, Computational modeling, Semantics, Measurement, Visualization, Classification algorithms, Multi-label learning BibRef

Jiu, M., Sahbi, H., Qi, L.,
Deep Context Networks for Image Annotation,
ICPR18(2422-2427)
IEEE DOI 1812
image annotation, image classification, learning (artificial intelligence), deep context networks, Standards BibRef

Khademi, M., Schulte, O.,
Image Caption Generation with Hierarchical Contextual Visual Spatial Attention,
Cognitive18(2024-20248)
IEEE DOI 1812
Feature extraction, Visualization, Logic gates, Computer architecture, Task analysis, Context modeling, Computational modeling BibRef

Yan, S., Wu, F., Smith, J.S., Lu, W., Zhang, B.,
Image Captioning using Adversarial Networks and Reinforcement Learning,
ICPR18(248-253)
IEEE DOI 1812
Generators, Generative adversarial networks, Monte Carlo methods, Maximum likelihood estimation, Task analysis BibRef

Wang, F., Gong, X., Huang, L.,
Time-Dependent Pre-attention Model for Image Captioning,
ICPR18(3297-3302)
IEEE DOI 1812
Decoding, Task analysis, Semantics, Visualization, Feature extraction, Computational modeling, Computer science BibRef

Luo, R., Shakhnarovich, G., Cohen, S., Price, B.,
Discriminability Objective for Training Descriptive Captions,
CVPR18(6964-6974)
IEEE DOI 1812
Training, Task analysis, Visualization, Measurement, Computational modeling, Generators, Airplanes BibRef

Cui, Y., Yang, G., Veit, A., Huang, X., Belongie, S.,
Learning to Evaluate Image Captioning,
CVPR18(5804-5812)
IEEE DOI 1812
Measurement, Pathology, Training, Correlation, SPICE, Robustness, Task analysis BibRef

Aneja, J., Deshpande, A., Schwing, A.G.,
Convolutional Image Captioning,
CVPR18(5561-5570)
IEEE DOI 1812
Training, Computer architecture, Task analysis, Hidden Markov models, Microprocessors, Computational modeling, Indexing BibRef

Chen, F., Ji, R., Sun, X., Wu, Y., Su, J.,
GroupCap: Group-Based Image Captioning with Structured Relevance and Diversity Constraints,
CVPR18(1345-1353)
IEEE DOI 1812
Visualization, Correlation, Semantics, Feature extraction, Training, Adaptation models, Task analysis BibRef

Chen, X., Ma, L., Jiang, W., Yao, J., Liu, W.,
Regularizing RNNs for Caption Generation by Reconstructing the Past with the Present,
CVPR18(7995-8003)
IEEE DOI 1812
Computer vision, Pattern recognition BibRef

Yao, T.[Ting], Pan, Y.W.[Ying-Wei], Li, Y.[Yehao], Mei, T.[Tao],
Exploring Visual Relationship for Image Captioning,
ECCV18(XIV: 711-727).
Springer DOI 1810
BibRef

Shah, S.A.A.[Syed Afaq Ali],
NNEval: Neural Network Based Evaluation Metric for Image Captioning,
ECCV18(VIII: 39-55).
Springer DOI 1810
BibRef

Jiang, W.H.[Wen-Hao], Ma, L.[Lin], Jiang, Y.G.[Yu-Gang], Liu, W.[Wei], Zhang, T.[Tong],
Recurrent Fusion Network for Image Captioning,
ECCV18(II: 510-526).
Springer DOI 1810
BibRef

Chatterjee, M.[Moitreya], Schwing, A.G.[Alexander G.],
Diverse and Coherent Paragraph Generation from Images,
ECCV18(II: 747-763).
Springer DOI 1810
BibRef

Chen, S.[Shi], Zhao, Q.[Qi],
Boosted Attention: Leveraging Human Attention for Image Captioning,
ECCV18(XI: 72-88).
Springer DOI 1810
BibRef

Dai, B.[Bo], Ye, D.[Deming], Lin, D.[Dahua],
Rethinking the Form of Latent States in Image Captioning,
ECCV18(VI: 294-310).
Springer DOI 1810
BibRef

Liu, X.[Xihui], Li, H.S.[Hong-Sheng], Shao, J.[Jing], Chen, D.P.[Da-Peng], Wang, X.G.[Xiao-Gang],
Show, Tell and Discriminate: Image Captioning by Self-retrieval with Partially Labeled Data,
ECCV18(XV: 353-369).
Springer DOI 1810
BibRef

Fang, F., Wang, H., Tang, P.,
Image Captioning with Word Level Attention,
ICIP18(1278-1282)
IEEE DOI 1809
Visualization, Feature extraction, Task analysis, Training, Recurrent neural networks, Semantics, Computational modeling, bidirectional spatial embedding BibRef

Zhu, Z., Xue, Z., Yuan, Z.,
Topic-Guided Attention for Image Captioning,
ICIP18(2615-2619)
IEEE DOI 1809
Visualization, Semantics, Feature extraction, Training, Decoding, Generators, Measurement, Image captioning, Attention, Topic, Attribute, Deep Neural Network BibRef

Gomez-Garay, A.[Alejandro], Raducanu, B.[Bogdan], Salas, J.[Joaquín],
Dense Captioning of Natural Scenes in Spanish,
MCPR18(145-154).
Springer DOI 1807
BibRef

Yao, L.[Li], Ballas, N.[Nicolas], Cho, K.[Kyunghyun], Smith, J.[John], Bengio, Y.[Yoshua],
Oracle Performance for Visual Captioning,
BMVC16(xx-yy).
HTML Version. 1805
BibRef

Khatchatoorian, A.G., Jamzad, M.,
Post Rectifying Methods to Improve the Accuracy of Image Annotation,
DICTA17(1-7)
IEEE DOI 1804
feature extraction, image annotation, image classification, image retrieval, matrix algebra, Class-tag relation matrix, Time division multiplexing BibRef

Dong, H.[Hao], Zhang, J.Q.[Jing-Qing], McIlwraith, D.[Douglas], Guo, Y.[Yike],
I2T2I: Learning text to image synthesis with textual data augmentation,
ICIP17(2015-2019)
IEEE DOI 1803
Birds, Generators, Image generation, Recurrent neural networks, Shape, Training, Deep learning, GAN, Image Synthesis BibRef

Pellegrin, L.[Luis], Escalante, H.J.[Hugo Jair], Montes-y-Gómez, M.[Manuel], Villegas, M.[Mauricio], González, F.A.[Fabio A.],
A Flexible Framework for the Evaluation of Unsupervised Image Annotation,
CIARP17(508-516).
Springer DOI 1802
BibRef

Jia, Y.H.[Yu-Hua], Bai, L.[Liang], Wang, P.[Peng], Guo, J.L.[Jin-Lin], Xie, Y.X.[Yu-Xiang],
Deep Convolutional Neural Network for Correlating Images and Sentences,
MMMod18(I:154-165).
Springer DOI 1802
BibRef

Liu, J.Y.[Jing-Yu], Wang, L.[Liang], Yang, M.H.[Ming-Hsuan],
Referring Expression Generation and Comprehension via Attributes,
ICCV17(4866-4874)
IEEE DOI 1802
Language Descriptions for objects. learning (artificial intelligence), object detection, RefCOCO, RefCOCO+, RefCOCOg, attribute learning model, common space model, Visualization BibRef

Dai, B., Fidler, S., Urtasun, R., Lin, D.,
Towards Diverse and Natural Image Descriptions via a Conditional GAN,
ICCV17(2989-2998)
IEEE DOI 1802
image retrieval, image sequences, inference mechanisms, learning (artificial intelligence), Visualization BibRef

Niu, Z.X.[Zhen-Xing], Zhou, M.[Mo], Wang, L.[Le], Gao, X.[Xinbo], Hua, G.[Gang],
Hierarchical Multimodal LSTM for Dense Visual-Semantic Embedding,
ICCV17(1899-1907)
IEEE DOI 1802
Maps sentences and images. document image processing, image representation, recurrent neural nets, HM-LSTM, Hierarchical Multimodal LSTM, Recurrent neural networks BibRef

Liang, X., Hu, Z., Zhang, H., Gan, C., Xing, E.P.,
Recurrent Topic-Transition GAN for Visual Paragraph Generation,
ICCV17(3382-3391)
IEEE DOI 1802
document image processing, inference mechanisms, natural scenes, recurrent neural nets, text analysis, RTT-GAN, Visualization BibRef

Shetty, R., Rohrbach, M., Hendricks, L.A., Fritz, M., Schiele, B.,
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training,
ICCV17(4155-4164)
IEEE DOI 1802
image matching, learning (artificial intelligence), sampling methods, vocabulary, adversarial training, Visualization BibRef

Liu, S., Zhu, Z., Ye, N., Guadarrama, S., Murphy, K.,
Improved Image Captioning via Policy Gradient optimization of SPIDEr,
ICCV17(873-881)
IEEE DOI 1802
Maximum likelihood estimation, Measurement, Mixers, Robustness, SPICE, Training BibRef

Gu, J.X.[Jiu-Xiang], Joty, S.[Shafiq], Cai, J.F.[Jian-Fei], Wang, G.[Gang],
Unpaired Image Captioning by Language Pivoting,
ECCV18(I: 519-535).
Springer DOI 1810
BibRef

Gu, J.X.[Jiu-Xiang], Wang, G.[Gang], Cai, J.F.[Jian-Fei], Chen, T.[Tsuhan],
An Empirical Study of Language CNN for Image Captioning,
ICCV17(1231-1240)
IEEE DOI 1802
computer vision, convolution, learning (artificial intelligence), natural language processing, recurrent neural nets, Recurrent neural networks BibRef

Pedersoli, M., Lucas, T., Schmid, C., Verbeek, J.,
Areas of Attention for Image Captioning,
ICCV17(1251-1259)
IEEE DOI 1802
image segmentation, inference mechanisms, natural language processing, object detection, Visualization BibRef

Li, Y., Ouyang, W., Zhou, B., Wang, K., Wang, X.,
Scene Graph Generation from Objects, Phrases and Region Captions,
ICCV17(1270-1279)
IEEE DOI 1802
graph theory, image classification, image representation, neural nets, object detection, Visualization BibRef

Zhang, Z., Wu, J.J., Li, Q., Huang, Z., Traer, J., McDermott, J.H., Tenenbaum, J.B., Freeman, W.T.,
Generative Modeling of Audible Shapes for Object Perception,
ICCV17(1260-1269)
IEEE DOI 1802
audio recording, audio signal processing, audio-visual systems, feature extraction, inference mechanisms, interactive systems, Visualization BibRef

Liu, Z.J.[Zhi-Jian], Freeman, W.T.[William T.], Tenenbaum, J.B.[Joshua B.], Wu, J.J.[Jia-Jun],
Physical Primitive Decomposition,
ECCV18(XII: 3-20).
Springer DOI 1810
BibRef

Wu, J.J.[Jia-Jun], Lim, J.[Joseph], Zhang, H.Y.[Hong-Yi], Tenenbaum, J.B.[Joshua B.], Freeman, W.T.[William T.],
Physics 101: Learning Physical Object Properties from Unlabeled Videos,
BMVC16(xx-yy).
HTML Version. 1805
BibRef

Tavakoliy, H.R., Shetty, R., Borji, A., Laaksonen, J.,
Paying Attention to Descriptions Generated by Image Captioning Models,
ICCV17(2506-2515)
IEEE DOI 1802
feature extraction, image processing, human descriptions, human-written descriptions, image captioning model, Visualization BibRef

Chen, T.H., Liao, Y.H., Chuang, C.Y., Hsu, W.T., Fu, J., Sun, M.,
Show, Adapt and Tell: Adversarial Training of Cross-Domain Image Captioner,
ICCV17(521-530)
IEEE DOI 1802
image processing, inference mechanisms, text analysis, MSCOCO, adversarial training procedure, captioner act, critic networks, Training data BibRef

Tripathi, A.[Anurag], Gupta, A.[Abhinav], Chaudhary, S.[Santanu], Lall, B.[Brejesh],
Image Annotation Using Latent Components and Transmedia Association,
PReMI17(493-500).
Springer DOI 1711
BibRef

Pini, S.[Stefano], Cornia, M.[Marcella], Baraldi, L.[Lorenzo], Cucchiara, R.[Rita],
Towards Video Captioning with Naming: A Novel Dataset and a Multi-modal Approach,
CIAP17(II:384-395).
Springer DOI 1711
BibRef

Wu, B.Y.[Bao-Yuan], Jia, F.[Fan], Liu, W.[Wei], Ghanem, B.[Bernard],
Diverse Image Annotation,
CVPR17(6194-6202)
IEEE DOI 1711
Correlation, Feature extraction, Measurement, Redundancy, Semantics BibRef

Krause, J.[Jonathan], Johnson, J.[Justin], Krishna, R.[Ranjay], Fei-Fei, L.[Li],
A Hierarchical Approach for Generating Descriptive Image Paragraphs,
CVPR17(3337-3345)
IEEE DOI 1711
Feature extraction, Natural languages, Pragmatics, Recurrent neural networks, Speech, Visualization BibRef

Vedantam, R., Bengio, S., Murphy, K., Parikh, D., Chechik, G.,
Context-Aware Captions from Context-Agnostic Supervision,
CVPR17(1070-1079)
IEEE DOI 1711
Birds, Cats, Cognition, Context modeling, Pragmatics, Training BibRef

Gan, Z., Gan, C., He, X., Pu, Y., Tran, K., Gao, J., Carin, L., Deng, L.,
Semantic Compositional Networks for Visual Captioning,
CVPR17(1141-1150)
IEEE DOI 1711
Feature extraction, Mouth, Pediatrics, Semantics, Tensile stress, Training, Visualization BibRef

Ren, Z., Wang, X., Zhang, N., Lv, X., Li, L.J.,
Deep Reinforcement Learning-Based Image Captioning with Embedding Reward,
CVPR17(1151-1159)
IEEE DOI 1711
Decision making, Learning (artificial intelligence), Measurement, Neural networks, Training, Visualization BibRef

Rennie, S.J., Marcheret, E., Mroueh, Y., Ross, J., Goel, V.,
Self-Critical Sequence Training for Image Captioning,
CVPR17(1179-1195)
IEEE DOI 1711
Inference algorithms, Learning (artificial intelligence), Logic gates, Measurement, Predictive models, Training BibRef
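For orientation, self-critical sequence training optimizes a non-differentiable sentence-level reward r (e.g., CIDEr) using the greedily decoded caption \hat{w} as its own baseline; the policy-gradient estimate is, roughly,

\[ \nabla_{\theta} L(\theta) \approx -\bigl(r(w^{s}) - r(\hat{w})\bigr)\,\nabla_{\theta} \log p_{\theta}(w^{s}), \]

where w^{s} is a caption sampled from the current model p_{\theta}. This is a sketch of the general formulation; see the paper for the exact derivation.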

Yang, L., Tang, K., Yang, J., Li, L.J.,
Dense Captioning with Joint Inference and Visual Context,
CVPR17(1978-1987)
IEEE DOI 1711
Bioinformatics, Genomics, Object detection, Proposals, Semantics, Training, Visualization BibRef

Lu, J., Xiong, C., Parikh, D., Socher, R.,
Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning,
CVPR17(3242-3250)
IEEE DOI 1711
Adaptation models, Computational modeling, Context modeling, Decoding, Logic gates, Mathematical model, Visualization BibRef

Yao, T., Pan, Y., Li, Y., Mei, T.,
Incorporating Copying Mechanism in Image Captioning for Learning Novel Objects,
CVPR17(5263-5271)
IEEE DOI 1711
Decoding, Hidden Markov models, Object recognition, Recurrent neural networks, Standards, Training, Visualization BibRef

Chen, L., Zhang, H., Xiao, J., Nie, L., Shao, J., Liu, W., Chua, T.S.,
SCA-CNN: Spatial and Channel-Wise Attention in Convolutional Networks for Image Captioning,
CVPR17(6298-6306)
IEEE DOI 1711
Detectors, Feature extraction, Image coding, Neural networks, Semantics, Visualization BibRef

Sun, Q., Lee, S., Batra, D.,
Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-in-the-Blank Image Captioning,
CVPR17(7215-7223)
IEEE DOI 1711
Approximation algorithms, Computational modeling, Decoding, History, Inference algorithms, Recurrent, neural, networks BibRef

Wang, Y., Lin, Z., Shen, X., Cohen, S., Cottrell, G.W.,
Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition,
CVPR17(7378-7387)
IEEE DOI 1711
Measurement, Recurrent neural networks, SPICE, Semantics, Skeleton, Training BibRef

Zanfir, M.[Mihai], Marinoiu, E.[Elisabeta], Sminchisescu, C.[Cristian],
Spatio-Temporal Attention Models for Grounded Video Captioning,
ACCV16(IV: 104-119).
Springer DOI 1704
BibRef

Chen, T.H.[Tseng-Hung], Zeng, K.H.[Kuo-Hao], Hsu, W.T.[Wan-Ting], Sun, M.[Min],
Video Captioning via Sentence Augmentation and Spatio-Temporal Attention,
Assist16(I: 269-286).
Springer DOI 1704
BibRef

Tan, Y.H.[Ying Hua], Chan, C.S.[Chee Seng],
phi-LSTM: A Phrase-Based Hierarchical LSTM Model for Image Captioning,
ACCV16(V: 101-117).
Springer DOI 1704
BibRef

Weiland, L.[Lydia], Hulpus, I.[Ioana], Ponzetto, S.P.[Simone Paolo], Dietz, L.[Laura],
Using Object Detection, NLP, and Knowledge Bases to Understand the Message of Images,
MMMod17(II: 405-418).
Springer DOI 1701
BibRef

Liu, Y.[Yu], Guo, Y.M.[Yan-Ming], Lew, M.S.[Michael S.],
What Convnets Make for Image Captioning?,
MMMod17(I: 416-428).
Springer DOI 1701
BibRef

Tran, K., He, X., Zhang, L., Sun, J.,
Rich Image Captioning in the Wild,
DeepLearn-C16(434-441)
IEEE DOI 1612
BibRef

Wang, Y.L.[Yi-Lin], Wang, S.H.[Su-Hang], Tang, J.L.[Ji-Liang], Liu, H.[Huan], Li, B.X.[Bao-Xin],
PPP: Joint Pointwise and Pairwise Image Label Prediction,
CVPR16(6005-6013)
IEEE DOI 1612
BibRef

Yatskar, M.[Mark], Ordonez, V., Zettlemoyer, L.[Luke], Farhadi, A.[Ali],
Commonly Uncommon: Semantic Sparsity in Situation Recognition,
CVPR17(6335-6344)
IEEE DOI 1711
BibRef
Earlier: A1, A3, A4, Only:
Situation Recognition: Visual Semantic Role Labeling for Image Understanding,
CVPR16(5534-5542)
IEEE DOI 1612
Image recognition, Image representation, Predictive models, Semantics, Tensile stress, Training BibRef

Kottur, S.[Satwik], Vedantam, R.[Ramakrishna], Moura, J.M.F.[José M. F.], Parikh, D.[Devi],
VisualWord2Vec (Vis-W2V): Learning Visually Grounded Word Embeddings Using Abstract Scenes,
CVPR16(4985-4994)
IEEE DOI 1612
BibRef

Zhu, Y., Groth, O., Bernstein, M., Fei-Fei, L.,
Visual7W: Grounded Question Answering in Images,
CVPR16(4995-5004)
IEEE DOI 1612
BibRef

Zhang, P., Goyal, Y., Summers-Stay, D., Batra, D., Parikh, D.,
Yin and Yang: Balancing and Answering Binary Visual Questions,
CVPR16(5014-5022)
IEEE DOI 1612
BibRef

Park, D.H., Darrell, T.J., Rohrbach, A.,
Robust Change Captioning,
ICCV19(4623-4632)
IEEE DOI 2004
feature extraction, learning (artificial intelligence), natural language processing, object-oriented programming, Predictive models BibRef

Venugopalan, S.[Subhashini], Hendricks, L.A.[Lisa Anne], Rohrbach, M.[Marcus], Mooney, R.[Raymond], Darrell, T.J.[Trevor J.], Saenko, K.[Kate],
Captioning Images with Diverse Objects,
CVPR17(1170-1178)
IEEE DOI 1711
BibRef
Earlier: A2, A1, A3, A4, A6, A5:
Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data,
CVPR16(1-10)
IEEE DOI 1612
Data models, Image recognition, Predictive models, Semantics, Training, Visualization. Novel objects not in training data. BibRef

Johnson, J.[Justin], Karpathy, A.[Andrej], Fei-Fei, L.[Li],
DenseCap: Fully Convolutional Localization Networks for Dense Captioning,
CVPR16(4565-4574)
IEEE DOI 1612
Jointly localizes and describes salient regions in images in natural language. BibRef

Wang, M.[Minsi], Song, L.[Li], Yang, X.K.[Xiao-Kang], Luo, C.F.[Chuan-Fei],
A parallel-fusion RNN-LSTM architecture for image caption generation,
ICIP16(4448-4452)
IEEE DOI 1610
Computational modeling. Combines deep convolutional networks and recurrent neural networks. BibRef

Lin, X.[Xiao], Parikh, D.[Devi],
Leveraging Visual Question Answering for Image-Caption Ranking,
ECCV16(II: 261-277).
Springer DOI 1611
BibRef
Earlier:
Don't just listen, use your imagination: Leveraging visual common sense for non-visual tasks,
CVPR15(2984-2993)
IEEE DOI 1510
BibRef

Chen, T.L.[Tian-Lang], Zhang, Z.P.[Zhong-Ping], You, Q.Z.[Quan-Zeng], Fang, C.[Chen], Wang, Z.W.[Zhao-Wen], Jin, H.L.[Hai-Lin], Luo, J.B.[Jie-Bo],
'Factual' or 'Emotional': Stylized Image Captioning with Adaptive Learning and Attention,
ECCV18(X: 527-543).
Springer DOI 1810
BibRef

You, Q.Z.[Quan-Zeng], Jin, H.L.[Hai-Lin], Wang, Z.W.[Zhao-Wen], Fang, C.[Chen], Luo, J.B.[Jie-Bo],
Image Captioning with Semantic Attention,
CVPR16(4651-4659)
IEEE DOI 1612
BibRef

Jia, X.[Xu], Gavves, E.[Efstratios], Fernando, B.[Basura], Tuytelaars, T.[Tinne],
Guiding the Long-Short Term Memory Model for Image Caption Generation,
ICCV15(2407-2415)
IEEE DOI 1602
Computer architecture BibRef

Chen, X.L.[Xin-Lei], Zitnick, C.L.[C. Lawrence],
Mind's eye: A recurrent visual representation for image caption generation,
CVPR15(2422-2431)
IEEE DOI 1510
BibRef

Vedantam, R.[Ramakrishna], Zitnick, C.L.[C. Lawrence], Parikh, D.[Devi],
CIDEr: Consensus-based image description evaluation,
CVPR15(4566-4575)
IEEE DOI 1510
BibRef
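As a reminder of what this metric computes (stated approximately; see the paper for the exact TF-IDF weighting and the CIDEr-D variant), CIDEr_n is the average cosine similarity between TF-IDF weighted n-gram vectors g^{n}(\cdot) of a candidate caption c_i and its m reference sentences s_{ij}, combined over n-gram lengths:

\[ \mathrm{CIDEr}_n(c_i, S_i) = \frac{1}{m}\sum_{j=1}^{m} \frac{g^{n}(c_i)\cdot g^{n}(s_{ij})}{\lVert g^{n}(c_i)\rVert\,\lVert g^{n}(s_{ij})\rVert}, \qquad \mathrm{CIDEr}(c_i, S_i) = \sum_{n=1}^{N} w_n\,\mathrm{CIDEr}_n(c_i, S_i), \]

with uniform weights w_n = 1/N and N = 4 in the original formulation.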

Fang, H.[Hao], Gupta, S.[Saurabh], Iandola, F.[Forrest], Srivastava, R.K.[Rupesh K.], Deng, L.[Li], Dollar, P.[Piotr], Gao, J.F.[Jian-Feng], He, X.D.[Xiao-Dong], Mitchell, M.[Margaret], Platt, J.C.[John C.], Zitnick, C.L.[C. Lawrence], Zweig, G.[Geoffrey],
From captions to visual concepts and back,
CVPR15(1473-1482)
IEEE DOI 1510
BibRef

Ramnath, K.[Krishnan], Baker, S.[Simon], Vanderwende, L.[Lucy], El-Saban, M.[Motaz], Sinha, S.N.[Sudipta N.], Kannan, A.[Anitha], Hassan, N.[Noran], Galley, M.[Michel], Yang, Y.[Yi], Ramanan, D.[Deva], Bergamo, A.[Alessandro], Torresani, L.[Lorenzo],
AutoCaption: Automatic caption generation for personal photos,
WACV14(1050-1057)
IEEE DOI 1406
Clouds BibRef

Pan, J.Y.[Jia-Yu], Yang, H.J.[Hyung-Jeong], Faloutsos, C.[Christos],
MMSS: Graph-based Multi-modal Story-oriented Video Summarization and Retrieval,
CMU-CS-TR-04-114.
HTML Version. 0501
BibRef

Pan, J.Y.[Jia-Yu], Yang, H.J.[Hyung-Jeong], Faloutsos, C.[Christos], Duygulu, P.[Pinar],
GCap: Graph-based Automatic Image Captioning,
MMDE04(146).
IEEE DOI 0406
BibRef

Pan, J.Y.[Jia-Yu],
Advanced Tools for Video and Multimedia Mining,
Ph.D.Thesis, CMU-CS-06-126, May 2006.
HTML Version. 0605
BibRef

Chapter on Matching and Recognition Using Volumes, High Level Vision Techniques, Invariants continues in
General References for Matching.


Last update: Sep 24, 2020 at 19:44:22