23.2.2.2.8 Document Layout, Structure Analysis, Web Documents, Online Documents

Document Analysis. Application, Document Layout.

Ashraf, F., Ozyer, T., Alhajj, R.,
Employing Clustering Techniques for Automatic Information Extraction From HTML Documents,
SMC-C(38), No. 5, September 2008, pp. 660-673.
IEEE DOI may work or IEEE-CS DOI may work. 0810 BibRef


Hirano, T.[Takashi], Okano, Y.[Yuichi], Okada, Y.[Yasuhiro], Yoda, F.[Fumio],
Text and Layout Information Extraction from Document Files of Various Formats Based on the Analysis of Page Description Language,
ICDAR07(262-266).
IEEE DOI may work or IEEE-CS DOI may work. 0709 BibRef

Burget, R.,
Layout Based Information Extraction from HTML Documents,
ICDAR07(624-628).
IEEE DOI may work or IEEE-CS DOI may work. 0709 BibRef

Guo, H., Mahmud, J., Borodin, Y., Stent, A., Ramakrishnan, I.,
A General Approach for Partitioning Web Page Content Based on Geometric and Style Information,
ICDAR07(929-933).
IEEE DOI may work or IEEE-CS DOI may work. 0709 BibRef

Yoshida, M., Nakagawa, H.,
Web Document Parsing: A New Approach to Modeling Layout-Language Relations,
ICDAR07(203-207).
IEEE DOI may work or IEEE-CS DOI may work. 0709 BibRef

Esposito, F., Ferilli, S., Mauro, N.D.[N. Di], Basile, T.,
Incremental Learning of First Order Logic Theories for the Automatic Annotations of Web Documents,
ICDAR07(1093-1097).
IEEE DOI may work or IEEE-CS DOI may work. 0709 BibRef

Watai, Y.[Yasuyuki], Yamasaki, T.[Toshihiko], Aizawa, K.[Kiyoharu],
View-Based Web Page Retrieval using Interactive Sketch Query,
ICIP07(VI: 357-360).
IEEE DOI may work or IEEE-CS DOI may work. 0709 BibRef

Ma, J.C.[Jun-Chang], Gu, Z.[Zhimin],
A Shared Fragments Analysis System for Large Collections of Web Pages,
DAS06(390-401).
WWW Version. 0602 BibRef

Déjean, H.[Hervé], Meunier, J.L.[Jean-Luc],
A System for Converting PDF Documents into Structured XML Format,
DAS06(129-140).
WWW Version. 0602 BibRef

Meunier, J.L.,
Optimized XY-cut for determining a page reading order,
ICDAR05(I: 347-351).
IEEE DOI may work or IEEE-CS DOI may work. 0508 BibRef

Liu, W.Y.[Wen-Yin], Huang, G.[Guanglin], Liu, X.Y.[Xiao-Yue], Deng, X.[Xiaotie], Min, Z.[Zhang],
Phishing Web page detection,
ICDAR05(II: 560-564).
IEEE DOI may work or IEEE-CS DOI may work. 0508 BibRef

Feng, J., Haffner, P., Gilbert, M.,
A learning approach to discovering Web page semantic structures,
ICDAR05(II: 1055-1059).
IEEE DOI may work or IEEE-CS DOI may work. 0508 BibRef

Chao, H.[Hui], Lin, X.F.[Xiao Fan],
Capturing the layout of electronic documents for reuse in variable data printing,
ICDAR05(II: 940-944).
IEEE DOI may work or IEEE-CS DOI may work. 0508 BibRef

Chao, H.[Hui], Fan, J.[Jian],
Layout and Content Extraction for PDF Documents,
DAS04(213-224).
WWW Version. 0505 BibRef

Behera, A., Lalanne, D., Ingold, R.,
Enhancement of layout-based identification of low-resolution documents using geometrical color distribution,
ICDAR05(I: 468-472).
IEEE DOI may work or IEEE-CS DOI may work. 0508 BibRef

Mekhaldi, D.[Dalila], Lalanne, D.[Denis], Ingold, R.[Rolf],
From searching to browsing through multimodal documents linking,
ICDAR05(II: 924-928).
IEEE DOI may work or IEEE-CS DOI may work. 0508 BibRef
Earlier:
Unity Is Strength: Coupling Media for Thematic Segmentation,
DAS04(559-562).
WWW Version. 0505 BibRef

Rigamonti, M., Bloechle, J.L., Hadjar, K., Lalanne, D., Ingold, R.,
Towards a canonical and structured representation of PDF documents through reverse engineering,
ICDAR05(II: 1050-1054).
IEEE DOI may work or IEEE-CS DOI may work. 0508 BibRef

Hadjar, K., Rigamonti, M., Lalanne, D., Ingold, R.,
Xed: a new tool for extracting hidden structures from electronic documents,
DIAL04(212-224).
IEEE DOI may work or IEEE-CS DOI may work. 0404 BibRef

Hadjar, K., Ingold, R.,
Logical labeling of Arabic newspapers using artificial neural nets,
ICDAR05(I: 426-430).
IEEE DOI may work or IEEE-CS DOI may work. 0508 BibRef

Schenker, A.[Adam], Bunke, H.[Horst], Last, M.[Mark], Kandel, A.[Abraham],
A Graph-Based Framework for Web Document Mining,
DAS04(401-412).
WWW Version. 0505 BibRef

Schenker, A.[Adam], Last, M.[Mark], Bunke, H.[Horst], Kandel, A.[Abraham],
Classification of web documents using a graph model,
ICDAR03(240-244).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Vitali, F.[Fabio], Di Iorio, A.[Angelo], Campori, E.V.[Elisa Ventura],
Rule-Based Structural Analysis of Web Pages,
DAS04(425-437).
WWW Version. 0505 BibRef

Hu, J.Y.[Jian-Ying], Bagga, A.,
Identifying story and preview images in news web pages,
ICDAR03(640-644).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Ramachandran, S., Kashi, R.,
An architecture for ink annotations on web documents,
ICDAR03(256-260).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Gagneux, A., Emptoz, H.,
Web site: a structured document,
ICDAR03(1158-1162).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Mukherjee, S., Yang, G.[Guizhen], Tan, W.[Wenfang], Ramakrishnan, I.V.,
Automatic discovery of semantic structures in HTML documents,
ICDAR03(245-249).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Alam, H., Kumar, A., Nakamura, M., Rahman, F., Tarnikova, Y., Wilcox, C.[Che],
Structured and unstructured document summarization: Design of a commercial summarizer using Lexical chains,
ICDAR03(1147-1152).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Rahman, F., Alam, H.,
A commercial Web based digital library for sharing and distributing documents,
DIAL04(93-103).
IEEE DOI may work or IEEE-CS DOI may work. 0404 BibRef

Alam, H., Hartono, R., Kumar, A., Rahman, F., Tarnikova, Y., Wilcox, C.[Che],
Web page summarization for handheld devices: a natural language approach,
ICDAR03(1153-1158).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Rahman, A.F.R., Alam, H., Hartono, R., Ariyoshi, K.,
Automatic summarization of Web content to smaller display devices,
ICDAR01(1064-1068).
IEEE DOI may work or IEEE-CS DOI may work. 0109 BibRef

Serradura, L., Slimane, M., Vincent, N.,
Web sites thematic classification using hidden Markov models,
ICDAR01(1094-1098).
IEEE DOI may work or IEEE-CS DOI may work. 0109 BibRef

Penn, G., Hu, J.Y.[Jian-Ying], Luo, H.[Hengbin], McDonald, R.,
Flexible Web document analysis for delivery to narrow-bandwidth devices,
ICDAR01(1074-1078).
IEEE DOI may work or IEEE-CS DOI may work. 0109 BibRef

Anjewierden, A.,
AIDAS: incremental logical structure discovery in PDF documents,
ICDAR01(374-378).
IEEE DOI may work or IEEE-CS DOI may work. 0109 BibRef

Athitsos, V., Swain, M.J., Frankel, C.,
Distinguishing photographs and graphics on the World Wide Web,
CBAIVL97(10).
IEEE DOI may work or IEEE-CS DOI may work. 9706 BibRef

Chapter on OCR, Document Analysis and Character Recognition Systems continues in
Document Retrieval Systems, Databases and Issues, Libraries.


Last update: Oct 10, 2008 at 17:20:17