13.6.11.2 Context Supplied by Text or Language

Matching, Context. Natural Language.

Taylor, S.L., Dahl, D.A., Lipshutz, M., Weir, C., Norton, L.M., Nilson, R.W., Linebarger, M.C.,
Integrating Natural-Language Understanding with Document Structure-Analysis,
AIR(8), No. 2-3, 1994, pp. 255-276. BibRef 9400

McKevitt, P.,
Integration of Natural-Language and Vision Processing: Theory,
AIR(9), No. 4-5, October 1995, pp. 247-250. BibRef 9510
And: Springer, Reprinted. 1995. ISBN: 978-0-7923-3758-4
WWW Link. BibRef

McKevitt, P.,
Integration of Natural-Language and Vision Processing: Intelligent Multimedia,
AIR(9), No. 2-3, June 1995, pp. 77-80. BibRef 9506

McKevitt, P.,
Integration of Natural-Language and Vision Processing: More Computational Models and Systems,
AIR(8), No. 5-6, 1995, pp. 345-348. BibRef 9500
And:
Integration of Natural-Language and Vision Processing: Computational Models and Systems,
AIR(8), No. 2-3, 1994, pp. 99-104. BibRef
And: Kluwer, Dordrecht, The Netherlands, 1995. BibRef

McKevitt, P.,
Integration of Natural-Language and Vision Processing: Grounding Representations,
AIR(10), No. 1-2, April 1996, pp. 7-13. BibRef 9606

McKevitt, P.[Paul],
Integration of Natural Language and Vision Processing,
Kluwer, September 1996, ISBN 0-7923-4114-7. Recent Advances Volume IV.
WWW Link. BibRef 9609

Siskind, J.M.,
Grounding Language in Perception,
AIR(8), No. 5-6, 1995, pp. 371-391. BibRef 9500

Srihari, R.K.,
Automatic-Indexing and Content-Based Retrieval of Captioned Images,
Computer(28), No. 9, September 1995, pp. 49-56. Database. Uses a language/vision knowledge base combined with basic image-understanding modules to interpret images. BibRef 9509

Srihari, R.K.,
Use of Captions and Other Collateral Text in Understanding Photographs,
AIR(8), No. 5-6, 1995, pp. 409-430. BibRef 9500
Earlier:
Use of Collateral Text in Understanding Photos in Documents,
SPIE(2368), 1994, pp. 186-199. BibRef

Srihari, R.K.,
Computational Models for Integrating Linguistic and Visual Information: A Survey,
AIR(8), No. 5-6, 1995, pp. 349-369. Survey, Computational Vision. BibRef 9500

Srihari, R.K.[Rohini K.],
Using Linguistic Context for Image Interpretation and Annotation,
Radius97(419-427). BibRef 9700

Srihari, R.K.[Rohini K.],
Linguistic Context In Vision,
Context95(xx) BibRef 9500

Srihari, R.K., Zhang, Z., Venkatraman, M., Chopra, R.,
Using Speech Input for Image Interpretation and Annotation,
ARPA96(501-510). BibRef 9600

Burhans, D.T.[Debra T.], Chopra, R.[Rajiv], Srihari, R.K.[Rohini K.], Govindaraju, V.[Venugopal], and Venkataraman, M.[Mahesh],
Use of Collateral Text in Image Interpretation,
ARPA94(II:897-907). Knowledge Representation. BibRef 9400

Srihari, R.K., and Burhans, D.T.,
Visual Semantics: Extracting Visual Information from Text Accompanying Pictures,
AAAI-94(793-798). BibRef 9400

Govindaraju, V., Srihari, S.N., Sher, D.B.,
Caption-Aided Face Location In Newspaper Photographs,
ICPR92(I:474-477).
IEEE DOI BibRef 9200
And:
A Computational Model for Face Location Based on Cognitive Principles,
AAAI-92(350-355). BibRef

Schank, R.C., Fano, A.,
Memory and Expectations in Learning, Language, and Visual Understanding,
AIR(9), No. 4-5, October 1995, pp. 261-271. BibRef 9510

Wilks, Y.,
Language, Vision and Metaphor,
AIR(9), No. 4-5, October 1995, pp. 273-289. BibRef 9510

Partridge, D.,
Language and Vision: A Single Perceptual Mechanism,
AIR(9), No. 4-5, October 1995, pp. 291-303. BibRef 9510

Marconi, D.,
Work on the Integration of Language and Vision at the University of Torino,
AIR(10), No. 1-2, April 1996, pp. 15-20. BibRef 9606

Meini, C., Paternoster, A.,
Understanding Language Through Vision,
AIR(10), No. 1-2, April 1996, pp. 37-48. BibRef 9606

McKevitt, P., Guo, C.M.,
From Chinese Rooms to Irish Rooms: New Words on Visions for Language,
AIR(10), No. 1-2, April 1996, pp. 49-63. BibRef 9606

Grumbach, A.,
Grounding Symbols into Perceptions,
AIR(10), No. 1-2, April 1996, pp. 131-146. BibRef 9606

Socher, G., Sagerer, G.F., Perona, P.,
Bayesian reasoning on qualitative descriptions from images and speech,
IVC(18), No. 2, January 2000, pp. 155-172.
Elsevier DOI 0001
BibRef

Mukerjee, A., Gupta, K., Nautiyal, S., Singh, M.P., Mishra, N.,
Conceptual description of visual scenes from linguistic models,
IVC(18), No. 2, January 2000, pp. 173-187.
Elsevier DOI 0001
BibRef

Arens, M., Gerber, R., Nagel, H.H.,
Conceptual representations between video signals and natural language descriptions,
IVC(26), No. 1, 1 January 2008, pp. 53-66.
Elsevier DOI 0711
Cognitive vision; Knowledge representation BibRef

Gerber, R., Nagel, H.H.,
(Mis?-) Using DRT for Generation of Natural Language Text from Image Sequences,
ECCV98(II: 255).
Springer DOI BibRef 9800

Lee, Y.J.[Yong Jae], Grauman, K.[Kristen],
Object-Graphs for Context-Aware Visual Category Discovery,
PAMI(34), No. 2, February 2012, pp. 346-358.
IEEE DOI 1112
BibRef
Earlier:
Object-graphs for context-aware category discovery,
CVPR10(1-8).
IEEE DOI 1006
Video of talk: WWW Link.
BibRef
And:
Learning the easy things first: Self-paced visual category discovery,
CVPR11(1721-1728).
IEEE DOI 1106
Context for matching. Given some known categories, learn new ones from unlabeled images. Encodes the 2D and 3D layout of object-level patterns in relation to familiar ones.
See also Foreground Focus: Unsupervised Learning from Partially Matching Images. BibRef

Yu, A.[Aron], Grauman, K.[Kristen],
Densifying Supervision for Fine-Grained Visual Comparisons,
IJCV(128), No. 10-11, November 2020, pp. 2704-2730.
Springer DOI 2009
BibRef

Hessel, J.[Jack], Hwang, J.D.[Jena D.], Park, J.S.[Jae Sung], Zellers, R.[Rowan], Bhagavatula, C.[Chandra], Rohrbach, A.[Anna], Saenko, K.[Kate], Choi, Y.[Yejin],
The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning,
ECCV22(XXXVI:558-575).
Springer DOI 2211
WWW Link. BibRef

Perona, P.[Pietro],
A taxonomy of visual recognition,
VMV04(187). BibRef 0411

Takahashi, T.[Takuya], Nakanishi, S.[Satoru], Kuno, Y.[Yoshinori], Shirai, Y.[Yoshiaki],
Helping Computer Vision by Verbal and Nonverbal Communication,
ICPR98(Vol II: 1216-1218).
IEEE DOI 9808
BibRef

Satoh, S.[Shin'ichi], Nakamura, Y.[Yuichi], Kanade, T.[Takeo],
Name-It: Naming and Detecting Faces in Video by the Integration of Image and Natural Language Processing,
IJCAI97(1488-1495). BibRef 9700

Satoh, S.[Shin'ichi], Kanade, T.,
Name-It: Association Of Face And Name In Video,
CVPR97(368-373).
IEEE DOI 9704
Uses image understanding (IU) and language to associate names (from language) with faces. BibRef

Socher, G.[Gudrun], Sagerer, G.F.[Gerhard F.], Kummert, F.[Franz], Fuhr, T.[Thomas],
Talking About 3D Scenes: Integration of Image and Speech Understanding in a Hybrid Distributed System,
ICIP96(II: 809-812).
IEEE DOI BibRef 9600

Chapter on Matching and Recognition Using Volumes, High Level Vision Techniques, Invariants continues in
Context from the environment.

Last update: Jul 11, 2026 at 11:55:55