Taylor, S.L.,
Dahl, D.A.,
Lipshutz, M.,
Weir, C.,
Norton, L.M.,
Nilson, R.W.,
Linebarger, M.C.,
Integrating Natural-Language Understanding with
Document Structure-Analysis,
AIR(8), No. 2-3, 1994, pp. 255-276.
BibRef
9400
McKevitt, P.,
Integration of Natural-Language and Vision Processing: Theory,
AIR(9), No. 4-5, October 1995, pp. 247-250.
BibRef
9510
And:
Springer, Reprinted. 1995.
ISBN: 978-0-7923-3758-4
WWW Link.
BibRef
McKevitt, P.,
Integration of Natural-Language and Vision Processing:
Intelligent Multimedia,
AIR(9), No. 2-3, June 1995, pp. 77-80.
BibRef
9506
McKevitt, P.,
Integration of Natural-Language and Vision Processing:
More Computational Models and Systems,
AIR(8), No. 5-6, 1995, pp. 345-348.
BibRef
9500
And:
Integration of Natural-Language and Vision Processing:
Computational Models and Systems,
AIR(8), No. 2-3, 1994, pp. 99-104.
BibRef
And:
Kluwer, Dordrecht, The Netherlands, 1995.
BibRef
McKevitt, P.,
Integration of Natural-Language and Vision Processing:
Grounding Representations,
AIR(10), No. 1-2, April 1996, pp. 7-13.
9606
BibRef
McKevitt, P.[Paul],
Integration of Natural Language and Vision Processing,
Kluwer, September 1996, ISBN 0-7923-4114-7.
Recent Advances Volume IV.
WWW Link.
BibRef
9609
Siskind, J.M.,
Grounding Language in Perception,
AIR(8), No. 5-6, 1995, pp. 371-391.
BibRef
9500
Srihari, R.K.,
Automatic-Indexing and Content-Based Retrieval of Captioned Images,
Computer(28), No. 9, September 1995, pp. 49-56.
Database. Uses a language/vision knowledge base combined with basic image understanding
modules to interpret images.
BibRef
9509
Srihari, R.K.,
Use of Captions and Other Collateral Text in Understanding Photographs,
AIR(8), No. 5-6, 1995, pp. 409-430.
BibRef
9500
Earlier:
Use of Collateral Text in Understanding Photos in Documents,
SPIE(2368), 1994, pp. 186-199.
BibRef
Srihari, R.K.,
Computational Models for Integrating Linguistic and Visual Information:
A Survey,
AIR(8), No. 5-6, 1995, pp. 349-369.
Survey, Computational Vision.
BibRef
9500
Srihari, R.K.[Rohini K.],
Using Linguistic Context for Image Interpretation and Annotation,
Radius97(419-427).
BibRef
9700
Srihari, R.K.[Rohini K.],
Linguistic Context In Vision,
Context95(xx)
BibRef
9500
Srihari, R.K.,
Zhang, Z.,
Venkatraman, M.,
Chopra, R.,
Using Speech Input for Image Interpretation and Annotation,
ARPA96(501-510).
BibRef
9600
Burhans, D.T.[Debra T.],
Chopra, R.[Rajiv],
Srihari, R.K.[Rohini K.],
Govindaraju, V.[Venugopal], and
Venkataraman, M.[Mahesh],
Use of Collateral Text in Image Interpretation,
ARPA94(II:897-907).
Knowledge Representation.
BibRef
9400
Srihari, R.K., and
Burhans, D.T.,
Visual Semantics: Extracting Visual Information from
Text Accompanying Pictures,
AAAI-94(793-798).
BibRef
9400
Govindaraju, V.,
Srihari, S.N.,
Sher, D.B.,
Caption-Aided Face Location In Newspaper Photographs,
ICPR92(I:474-477).
IEEE DOI
BibRef
9200
And:
A Computational Model for Face Location Based on Cognitive Principles,
AAAI-92(350-355).
BibRef
Schank, R.C.,
Fano, A.,
Memory and Expectations in Learning, Language, and Visual Understanding,
AIR(9), No. 4-5, October 1995, pp. 261-271.
BibRef
9510
Wilks, Y.,
Language, Vision and Metaphor,
AIR(9), No. 4-5, October 1995, pp. 273-289.
BibRef
9510
Partridge, D.,
Language and Vision: A Single Perceptual Mechanism,
AIR(9), No. 4-5, October 1995, pp. 291-303.
BibRef
9510
Marconi, D.,
Work on the Integration of Language and Vision at the
University of Torino,
AIR(10), No. 1-2, April 1996, pp. 15-20.
9606
BibRef
Meini, C.,
Paternoster, A.,
Understanding Language Through Vision,
AIR(10), No. 1-2, April 1996, pp. 37-48.
9606
BibRef
McKevitt, P.,
Guo, C.M.,
From Chinese Rooms to Irish Rooms: New Words on Visions for Language,
AIR(10), No. 1-2, April 1996, pp. 49-63.
9606
BibRef
Grumbach, A.,
Grounding Symbols into Perceptions,
AIR(10), No. 1-2, April 1996, pp. 131-146.
9606
BibRef
Socher, G.,
Sagerer, G.F.,
Perona, P.,
Bayesian reasoning on qualitative descriptions from images and speech,
IVC(18), No. 2, January 2000, pp. 155-172.
Elsevier DOI
0001
BibRef
Mukerjee, A.,
Gupta, K.,
Nautiyal, S.,
Singh, M.P.,
Mishra, N.,
Conceptual description of visual scenes from linguistic models,
IVC(18), No. 2, January 2000, pp. 173-187.
Elsevier DOI
0001
BibRef
Arens, M.,
Gerber, R.,
Nagel, H.H.,
Conceptual representations between video signals and natural language
descriptions,
IVC(26), No. 1, 1 January 2008, pp. 53-66.
Elsevier DOI
0711
Cognitive vision; Knowledge representation
BibRef
Gerber, R.,
Nagel, H.H.,
(Mis?-) Using DRT for Generation of Natural Language Text
from Image Sequences,
ECCV98(II: 255).
Springer DOI
BibRef
9800
Lee, Y.J.[Yong Jae],
Grauman, K.[Kristen],
Object-Graphs for Context-Aware Visual Category Discovery,
PAMI(34), No. 2, February 2012, pp. 346-358.
IEEE DOI
1112
BibRef
Earlier:
Object-graphs for context-aware category discovery,
CVPR10(1-8).
IEEE DOI Video of talk:
WWW Link.
1006
BibRef
And:
Learning the easy things first: Self-paced visual category discovery,
CVPR11(1721-1728).
IEEE DOI
1106
Context for matching. Given some known categories, learn new ones in unlabeled
images. Encodes the 2D and 3D layout of object-level patterns relative to familiar
patterns.
See also Foreground Focus: Unsupervised Learning from Partially Matching Images.
BibRef
Yu, A.[Aron],
Grauman, K.[Kristen],
Densifying Supervision for Fine-Grained Visual Comparisons,
IJCV(128), No. 10-11, November 2020, pp. 2704-2730.
Springer DOI
2009
BibRef
Perona, P.[Pietro],
A taxonomy of visual recognition,
VMV04(187).
0411
BibRef
Takahashi, T.[Takuya],
Nakanishi, S.[Satoru],
Kuno, Y.[Yoshinori],
Shirai, Y.[Yoshiaki],
Helping Computer Vision by Verbal and Nonverbal Communication,
ICPR98(Vol II: 1216-1218).
IEEE DOI
9808
BibRef
Satoh, S.[Shin'ichi],
Nakamura, Y.[Yuichi],
Kanade, T.[Takeo],
Name-It: Naming and Detecting Faces in Video by the Integration
of Image and Natural Language Processing,
IJCAI97(1488-1495).
BibRef
9700
Satoh, S.[Shin'ichi],
Kanade, T.,
Name-It: Association Of Face And Name In Video,
CVPR97(368-373).
IEEE DOI
9704
Combines image understanding (IU) and language to associate names (from language) with faces.
BibRef
Socher, G.[Gudrun],
Sagerer, G.F.[Gerhard F.],
Kummert, F.[Franz],
Fuhr, T.[Thomas],
Talking About 3D Scenes: Integration of Image and Speech Understanding
in a Hybrid Distributed System,
ICIP96(II: 809-812).
IEEE DOI
BibRef
9600