Ashraf, F.,
Ozyer, T.,
Alhajj, R.,
Employing Clustering Techniques for Automatic Information Extraction
From HTML Documents,
SMC-C(38), No. 5, September 2008, pp. 660-673.
IEEE DOI may work or IEEE-CS DOI may work.
0810
BibRef
Burget, R.,
Layout Based Information Extraction from HTML Documents,
ICDAR07(624-628).
IEEE DOI may work or IEEE-CS DOI may work.
0709
BibRef
Guo, H.,
Mahmud, J.,
Borodin, Y.,
Stent, A.,
Ramakrishnan, I.,
A General Approach for Partitioning Web Page Content Based on Geometric
and Style Information,
ICDAR07(929-933).
IEEE DOI may work or IEEE-CS DOI may work.
0709
BibRef
Yoshida, M.,
Nakagawa, H.,
Web Document Parsing:
A New Approach to Modeling Layout-Language Relations,
ICDAR07(203-207).
IEEE DOI may work or IEEE-CS DOI may work.
0709
BibRef
Esposito, F.,
Ferilli, S.,
Mauro, N.D.[N. Di],
Basile, T.,
Incremental Learning of First Order Logic Theories for the Automatic
Annotations of Web Documents,
ICDAR07(1093-1097).
IEEE DOI may work or IEEE-CS DOI may work.
0709
BibRef
Watai, Y.[Yasuyuki],
Yamasaki, T.[Toshihiko],
Aizawa, K.[Kiyoharu],
View-Based Web Page Retrieval using Interactive Sketch Query,
ICIP07(VI: 357-360).
IEEE DOI may work or IEEE-CS DOI may work.
0709
BibRef
Ma, J.C.[Jun-Chang],
Gu, Z.[Zhimin],
A Shared Fragments Analysis System for Large Collections of Web Pages,
DAS06(390-401).
WWW Version.
0602
BibRef
Déjean, H.[Hervé],
Meunier, J.L.[Jean-Luc],
A System for Converting PDF Documents into Structured XML Format,
DAS06(129-140).
WWW Version.
0602
BibRef
Meunier, J.L.,
Optimized XY-cut for determining a page reading order,
ICDAR05(I: 347-351).
IEEE DOI may work or IEEE-CS DOI may work.
0508
BibRef
Liu, W.Y.[Wen-Yin],
Huang, G.[Guanglin],
Liu, X.Y.[Xiao-Yue],
Deng, X.[Xiaotie],
Min, Z.[Zhang],
Phishing Web page detection,
ICDAR05(II: 560-564).
IEEE DOI may work or IEEE-CS DOI may work.
0508
BibRef
Feng, J.,
Haffner, P.,
Gilbert, M.,
A learning approach to discovering Web page semantic structures,
ICDAR05(II: 1055-1059).
IEEE DOI may work or IEEE-CS DOI may work.
0508
BibRef
Chao, H.[Hui],
Lin, X.F.[Xiao Fan],
Capturing the layout of electronic documents for reuse in variable data
printing,
ICDAR05(II: 940-944).
IEEE DOI may work or IEEE-CS DOI may work.
0508
BibRef
Chao, H.[Hui],
Fan, J.[Jian],
Layout and Content Extraction for PDF Documents,
DAS04(213-224).
WWW Version.
0505
BibRef
Behera, A.,
Lalanne, D.,
Ingold, R.,
Enhancement of layout-based identification of low-resolution documents
using geometrical color distribution,
ICDAR05(I: 468-472).
IEEE DOI may work or IEEE-CS DOI may work.
0508
BibRef
Mekhaldi, D.[Dalila],
Lalanne, D.[Denis],
Ingold, R.[Rolf],
From searching to browsing through multimodal documents linking,
ICDAR05(II: 924-928).
IEEE DOI may work or IEEE-CS DOI may work.
0508
BibRef
Earlier:
Unity Is Strength: Coupling Media for Thematic Segmentation,
DAS04(559-562).
WWW Version.
0505
BibRef
Rigamonti, M.,
Bloechle, J.L.,
Hadjar, K.,
Lalanne, D.,
Ingold, R.,
Towards a canonical and structured representation of PDF documents
through reverse engineering,
ICDAR05(II: 1050-1054).
IEEE DOI may work or IEEE-CS DOI may work.
0508
BibRef
Hadjar, K.,
Rigamonti, M.,
Lalanne, D.,
Ingold, R.,
Xed: a new tool for extracting hidden structures from electronic
documents,
DIAL04(212-224).
IEEE DOI may work or IEEE-CS DOI may work.
0404
BibRef
Hadjar, K.,
Ingold, R.,
Logical labeling of Arabic newspapers using artificial neural nets,
ICDAR05(I: 426-430).
IEEE DOI may work or IEEE-CS DOI may work.
0508
BibRef
Schenker, A.[Adam],
Bunke, H.[Horst],
Last, M.[Mark],
Kandel, A.[Abraham],
A Graph-Based Framework for Web Document Mining,
DAS04(401-412).
WWW Version.
0505
BibRef
Schenker, A.[Adam],
Last, M.[Mark],
Bunke, H.[Horst],
Kandel, A.[Abraham],
Classification of web documents using a graph model,
ICDAR03(240-244).
IEEE Abstract. IEEE Top Reference.
0311
BibRef
Vitali, F.[Fabio],
Di Iorio, A.[Angelo],
Campori, E.V.[Elisa Ventura],
Rule-Based Structural Analysis of Web Pages,
DAS04(425-437).
WWW Version.
0505
BibRef
Hu, J.Y.[Jian-Ying],
Bagga, A.,
Identifying story and preview images in news web pages,
ICDAR03(640-644).
IEEE Abstract. IEEE Top Reference.
0311
BibRef
Ramachandran, S.,
Kashi, R.,
An architecture for ink annotations on web documents,
ICDAR03(256-260).
IEEE Abstract. IEEE Top Reference.
0311
BibRef
Gagneux, A.,
Emptoz, H.,
Web site: a structured document,
ICDAR03(1158-1162).
IEEE Abstract. IEEE Top Reference.
0311
BibRef
Mukherjee, S.,
Yang, G.[Guizhen],
Tan, W.[Wenfang],
Ramakrishnan, I.V.,
Automatic discovery of semantic structures in HTML documents,
ICDAR03(245-249).
IEEE Abstract. IEEE Top Reference.
0311
BibRef
Alam, H.,
Kumar, A.,
Nakamura, M.,
Rahman, F.,
Tarnikova, Y.,
Wilcox, C.[Che],
Structured and unstructured document summarization: Design of a
commercial summarizer using Lexical chains,
ICDAR03(1147-1152).
IEEE Abstract. IEEE Top Reference.
0311
BibRef
Rahman, F.,
Alam, H.,
A commercial Web based digital library for sharing and distributing
documents,
DIAL04(93-103).
IEEE DOI may work or IEEE-CS DOI may work.
0404
BibRef
Alam, H.,
Hartono, R.,
Kumar, A.,
Rahman, F.,
Tarnikova, Y.,
Wilcox, C.[Che],
Web page summarization for handheld devices: a natural language
approach,
ICDAR03(1153-1158).
IEEE Abstract. IEEE Top Reference.
0311
BibRef
Rahman, A.F.R.,
Alam, H.,
Hartono, R.,
Ariyoshi, K.,
Automatic summarization of Web content to smaller display devices,
ICDAR01(1064-1068).
IEEE DOI may work or IEEE-CS DOI may work.
0109
BibRef
Serradura, L.,
Slimane, M.,
Vincent, N.,
Web sites thematic classification using hidden Markov models,
ICDAR01(1094-1098).
IEEE DOI may work or IEEE-CS DOI may work.
0109
BibRef
Penn, G.,
Hu, J.Y.[Jian-Ying],
Luo, H.[Hengbin],
McDonald, R.,
Flexible Web document analysis for delivery to narrow-bandwidth devices,
ICDAR01(1074-1078).
IEEE DOI may work or IEEE-CS DOI may work.
0109
BibRef
Anjewierden, A.,
AIDAS: incremental logical structure discovery in PDF documents,
ICDAR01(374-378).
IEEE DOI may work or IEEE-CS DOI may work.
0109
BibRef
Athitsos, V.,
Swain, M.J.,
Frankel, C.,
Distinguishing photographs and graphics on the World Wide Web,
CBAIVL97(10).
IEEE DOI may work or IEEE-CS DOI may work.
9706
BibRef