Nadler, M.[Morton],
Document Segmentation and Coding Techniques,
CVGIP(28), No. 2, November 1984, pp. 240-262.
Elsevier DOI
Survey, Page Segmentation.
BibRef
8411
Pavlidis, T.[Theo],
Zhou, J.Y.[Jiang-Ying],
Page Segmentation and Classification,
GMIP(54), No. 6, November 1992, pp. 484-496.
Survey, Page Segmentation.
BibRef
9211
Pavlidis, T.[Theo],
Page Segmentation by White Streams,
ICDAR91(945-953).
BibRef
9100
Zlatopolsky, A.A.,
Automated Document Segmentation,
PRL(15), No. 7, July 1994, pp. 699-704.
BibRef
9407
Leng, G.W.,
Mital, D.P.,
Yong, T.S.,
Kang, T.K.,
A Differential-Processing Extraction Approach to
Text and Image Segmentation,
EngAAI(7), No. 6, December 1994, pp. 639-651.
BibRef
9412
Jain, A.K.[Anil K.],
Zhong, Y.[Yu],
Page Segmentation Using Texture Analysis,
PR(29), No. 5, May 1996, pp. 743-770.
Elsevier DOI
9605
BibRef
Earlier:
Page segmentation using texture discrimination masks,
ICIP95(III: 308-311).
IEEE DOI
9510
BibRef
Jain, A.K.,
Bhattacharjee, S.,
Text Segmentation Using Gabor Filters for
Automatic Document Processing,
MVA(5), 1992, pp. 169-184.
BibRef
9200
Jain, A.K.,
Bhattacharjee, S.K.,
Chen, Y.,
On texture in document images,
CVPR92(677-680).
IEEE DOI
0403
BibRef
Venkateswarlu, N.B.,
Boyle, R.D.,
New segmentation techniques for document image analysis,
IVC(13), No. 7, September 1995, pp. 573-583.
Elsevier DOI
0401
BibRef
Shih, F.Y.,
Chen, S.S.,
Adaptive Document Block Segmentation and Classification,
SMC-B(26), No. 5, October 1996, pp. 797-802.
IEEE Top Reference. Segments the page using run-length smoothing, then applies a rule-based
classification into text, graphics, and picture.
BibRef
9610
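Illustration of the run-length smoothing step named in the entry above: a minimal sketch of the classic fixed-threshold run-length smoothing algorithm (RLSA), not the adaptive method of the cited paper. The function names (rlsa_row, rlsa), the threshold parameters, and the 0/1 list-of-lists image representation are assumptions made for this example. Short background runs between foreground pixels are filled along rows and along columns, and the two results are combined with a logical AND.

```python
def rlsa_row(row, threshold):
    """Fill background runs (0s) of length <= threshold that lie
    between two foreground pixels (1s) in a single row."""
    out = list(row)
    last_fg = None  # index of the most recent foreground pixel seen
    for i, value in enumerate(row):
        if value:
            gap = i - last_fg - 1 if last_fg is not None else 0
            if 0 < gap <= threshold:
                for j in range(last_fg + 1, i):
                    out[j] = 1
            last_fg = i
    return out


def rlsa(image, h_threshold, v_threshold):
    """Smooth a binary image (list of rows of 0/1 ints) horizontally
    and vertically, then AND the two smoothed images together."""
    horizontal = [rlsa_row(row, h_threshold) for row in image]
    # Transpose, smooth each column as if it were a row, transpose back.
    smoothed_cols = [rlsa_row(list(col), v_threshold) for col in zip(*image)]
    vertical = [list(row) for row in zip(*smoothed_cols)]
    return [[h & v for h, v in zip(h_row, v_row)]
            for h_row, v_row in zip(horizontal, vertical)]


if __name__ == "__main__":
    page = [
        [1, 0, 0, 1, 0, 0, 0, 0, 1],
        [1, 0, 1, 1, 0, 0, 0, 0, 1],
    ]
    for line in rlsa(page, h_threshold=2, v_threshold=1):
        print(line)
```

Connected regions of the smoothed image would then serve as candidate blocks for a later classification stage; in practice the thresholds depend on scan resolution and character size.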
Patel, D.,
Page Segmentation for Document Image-Analysis Using a Neural-Network,
OptEng(35), No. 7, July 1996, pp. 1854-1861.
9608
BibRef
Patel, D.,
Stonham, T.J.,
Texture image classification and segmentation using RANK-order
clustering,
ICPR92(III:92-95).
IEEE DOI
9208
BibRef
Payne, J.S.,
Stonham, T.J.,
Patel, D.,
Document segmentation using texture analysis,
ICPR94(B:380-382).
IEEE DOI
9410
BibRef
Etemad, K.,
Doermann, D.S.,
Chellappa, R.,
Multiscale Segmentation of Unstructured Document Pages Using
Soft Decision Integration,
PAMI(19), No. 1, January 1997, pp. 92-96.
IEEE DOI
9702
BibRef
And:
Multiscale Document Page Segmentation Using Soft Decision Integration,
UMDTR3444, 1995.
WWW Link.
BibRef
Earlier:
Page Segmentation Using Decision Integration and Wavelet Packets,
ICPR94(B:345-349).
IEEE DOI Classify regions of the page image into text or images.
BibRef
Etemad, K.[Kamran],
Multi-Scale Discriminant Analysis and Recognition of Signals and Images,
Ph.D.Thesis, April 1996.
BibRef
9604
UMDTR3629.
The goal is to find efficient multi-scale representations that yield
maximum between-class separations and minimum within-class scatters.
WWW Link. Also for Faces.
BibRef
Chen, J.L.,
A Simplified Approach to the HMM Based Texture Analysis
and Its Application to Document Segmentation,
PRL(18), No. 10, October 1997, pp. 993-1007.
9802
Markov model texture analysis.
BibRef
Kise, K.[Koichi],
Sato, A.[Akinori],
Iwata, M.[Motoi],
Segmentation of Page Images Using the Area Voronoi Diagram,
CVIU(70), No. 3, June 1998, pp. 370-382.
DOI Link
For evaluation:
See also Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms.
BibRef
9806
Hobby, J.D.[John D.],
Matching Document Images with Ground Truth,
IJDAR(1), No. 1, Spring 1998, pp. xx-yy.
BibRef
9800
Earlier:
ICDAR97(Tu-2B)
9708
In program, not in proceedings.
BibRef
Cinque, L.,
Lombardi, L.,
Manzini, G.,
A Multiresolution Approach for Page Segmentation,
PRL(19), No. 2, February 1998, pp. 217-225.
9808
See also Shape-Description and Recognition by a Multiresolution Approach.
BibRef
Cantoni, V.,
Cinque, L.,
Lombardi, L.,
Manzini, G.,
Page Segmentation Using a Pyramidal Architecture,
CAMP97(Session 6).
BibRef
9700
Cinque, L.,
Levialdi, S.,
Lombardi, L.,
Tanimoto, S.,
Segmentation of page images having artifacts of photocopying and
scanning,
PR(35), No. 5, May 2002, pp. 1167-1177.
Elsevier DOI
0202
BibRef
Cinque, L.,
Forino, L.,
Levialdi, S.,
Lombardi, L.,
Tanimoto, S.,
Understanding the page logical structure,
CIAP99(1003-1008).
IEEE DOI
9909
BibRef
Cinque, L.,
Levialdi, S.,
Malizia, A.,
de Rosa, F.,
DAN:
An Automatic Segmentation and Classification Engine for Paper Documents,
DAS02(491 ff.).
Springer DOI
0303
BibRef
Cinque, L.,
Levialdi, S.,
Malizia, A.,
A system for the automatic layout segmentation and classification of
digital documents,
CIAP03(201-206).
IEEE DOI
0310
BibRef
Liu, J.M.,
Tang, Y.Y.,
Distributed Autonomous Agents For Chinese Document Image Segmentation,
PRAI(12), No. 1, February 1998, pp. 97-118.
9806
See also Adaptive Image Segmentation With Distributed Behavior-Based Agents.
BibRef
de Queiroz, R.L.,
Processing JPEG Compressed Images and Documents,
IP(7), No. 12, December 1998, pp. 1661-1672.
IEEE DOI
9812
BibRef
de Queiroz, R.L.,
Processing JPEG-Compressed Images,
ICIP97(II: 334-337).
IEEE DOI
BibRef
9700
de Queiroz, R.L.,
Eschbach, R.,
Fast Segmentation of the JPEG Compressed Documents,
JEI(7), No. 2, April 1998, pp. 367-377.
9807
BibRef
de Queiroz, R.L.,
Eschbach, R.,
Segmentation of Compressed Documents,
ICIP97(III: 70-73).
IEEE DOI
BibRef
9700
de Queiroz, R.L.[Ricardo L.],
Compression of Compound Documents,
ICIP99(I:209-213).
IEEE DOI
BibRef
9900
Antonacopoulos, A.[Apostolos],
Page Segmentation Using the Description of the Background,
CVIU(70), No. 3, June 1998, pp. 350-369.
DOI Link
BibRef
9806
Jain, A.K.,
Yu, B.,
Document Representation and Its Application to Page Decomposition,
PAMI(20), No. 3, March 1998, pp. 294-308.
IEEE DOI
9805
Generates a structured version of the document for editing, storage,
retrieval, and analysis. Performs skew correction, segmentation, and
labeling (text, table, image, drawing, and ruler).
Some review of approaches.
BibRef
Jain, A.K.,
Yu, B.,
Page segmentation using document model,
ICDAR97(34-38).
IEEE DOI
9708
BibRef
Yang, J.C.Y.[James Ching-Yu],
Tsai, W.H.[Wen-Hsiang],
Document image segmentation and quality improvement by Moiré pattern
analysis,
SP:IC(15), No. 9, July 2000, pp. 781-797.
Elsevier DOI
0008
BibRef
Mao, S.[Song],
Kanungo, T.[Tapas],
Empirical Performance Evaluation Methodology and Its Application to
Page Segmentation Algorithms,
PAMI(23), No. 3, March 2001, pp. 242-256.
IEEE DOI
0103
Survey, Page Segmentation.
Evaluation, Page Segmentation. Creates separate test and training data and a computable performance metric,
finds optimal parameters for the different algorithms, then evaluates them.
Compare
Voronoi (Kise) (
See also Segmentation of Page Images Using the Area Voronoi Diagram. );
Docstrum (O'Gorman) (
See also Document Spectrum for Page Layout Analysis, The. );
Caere (commercial system) (
See also Caere. );
(these three have about the same performance)
and are better than
ScanSoft (commercial system) (
See also ScanSoft. );
which is better than the older X-Y cut (
See also Prototype Document Image Analysis System for Technical Journals, A. ).
Similar conclusion in later analysis:
See also Performance Evaluation and Benchmarking of Six Page Segmentation Algorithms.
BibRef
Mao, S.[Song],
Kanungo, T.[Tapas],
Software Architecture of PSET: A Page Segmentation Evaluation Toolkit,
IJDAR(4), No. 3, 2002, pp. 205-217.
Springer DOI
0205
BibRef
Earlier:
UMD--TR4190, September 2000.
WWW Link.
Evaluation, Page Segmentation.
BibRef
Mao, S.[Song],
Kanungo, T.[Tapas],
A Methodology for Empirical Performance Evaluation of
Page Segmentation Algorithms,
UMD--TR4093, December 1999.
WWW Link.
BibRef
9912
Mao, S.,
Kanungo, T.,
Automatic Training of Page Segmentation Algorithms:
An Optimization Approach,
ICPR00(Vol IV: 531-534).
IEEE DOI
0009
BibRef
Kanungo, T.,
Mao, S.[Song],
Stochastic language models for style-directed layout analysis of
document images,
IP(12), No. 5, May 2003, pp. 583-596.
IEEE DOI
0307
BibRef
Amin, A.[Adnan],
Shiu, R.[Ricky],
Page Segmentation And Classification Utilizing Bottom-up Approach,
IJIG(1), No. 2, April 2001, pp. 345-361.
0104
BibRef
Deng, S.[Shulan],
Latifi, S.[Shahram],
Regentova, E.E.[Emma E.],
Document segmentation using polynomial spline wavelets,
PR(34), No. 12, December 2001, pp. 2533-2545.
Elsevier DOI
0110
BibRef
Regentova, E.E.,
Latifi, S.,
Chen, D.,
Taghva, K.,
Yao, D.,
Document analysis by processing JBIG-encoded images,
IJDAR(7), No. 4, September 2005, pp. 260-272.
Springer DOI
0512
BibRef
Diligenti, M.[Michelangelo],
Frasconi, P.[Paolo],
Gori, M.[Marco],
Hidden Tree Markov Models for Document Image Classification,
PAMI(25), No. 4, April 2003, pp. 520-524.
IEEE Abstract.
0304
Learning. Learn the concept of a set of documents of similar structure.
BibRef
Diligenti, M.,
Gori, M.,
Maggini, M.,
Scarselli, F.,
Classification of HTML documents by Hidden Tree-Markov Models,
ICDAR01(849-853).
IEEE DOI
0109
BibRef
Haji, M.M.,
Katebi, S.D.,
An Efficient Text Segmentation Technique Based
on Naive Bayes Classifier,
GVIP(05), No. V7, 2005, pp. 21-30.
HTML Version.
BibRef
0500
Wang, Y.L.[Ya-Lin],
Phillips, I.T.[Ihsin T.],
Haralick, R.M.[Robert M.],
Document zone content classification and its performance evaluation,
PR(39), No. 1, January 2006, pp. 57-73.
Elsevier DOI
0512
Evaluation, Page Segmentation.
BibRef
Earlier:
A Study on the Document Zone Content Classification Problem,
DAS02(212 ff.).
Springer DOI
0303
BibRef
And:
A method for document zone content classification,
ICPR02(III: 196-199).
IEEE DOI
0211
BibRef
Earlier: A1, A3, A2:
Zone content classification and its performance evaluation,
ICDAR01(540-544).
IEEE DOI
0109
See also Table structure understanding and its performance evaluation.
BibRef
Leydier, Y.[Yann],
Le Bourgeois, F.[Frank],
Emptoz, H.[Hubert],
Text search for medieval manuscript images,
PR(40), No. 12, December 2007, pp. 3552-3567.
Elsevier DOI
0709
BibRef
Earlier:
Omnilingual Segmentation-Free Word Spotting for Ancient Manuscripts
Indexation,
ICDAR05(I: 533-537).
IEEE DOI
0508
BibRef
Earlier:
Serialized unsupervised classifier for adaptative color image
segmentation: application to digitized ancient manuscripts,
ICPR04(I: 494-497).
IEEE DOI
0409
Word-spotting; Medieval manuscripts
BibRef
Le Bourgeois, F.[Frank],
Kaileh, H.[Hala],
Automatic Metadata Retrieval from Ancient Manuscripts,
DAS04(75-89).
Springer DOI
0505
BibRef
Allier, B.,
Emptoz, H.,
Segmentation and typography extraction in document images using
geodesic active regions,
ICPR04(I: 409-412).
IEEE DOI
0409
BibRef
Leydier, Y.[Yann],
Ouji, A.[Asma],
Le Bourgeois, F.[Frank],
Emptoz, H.[Hubert],
Towards an omnilingual word retrieval system for ancient manuscripts,
PR(42), No. 9, September 2009, pp. 2089-2105.
Elsevier DOI
0905
Document indexing; Word-spotting; Word retrieval; Ancient documents;
Segmentation-free; Omnilingual
BibRef
Ouji, A.[Asma],
Leydier, Y.[Yann],
Le Bourgeois, F.[Frank],
Chromatic / Achromatic Separation in Noisy Document Images,
ICDAR11(167-171).
IEEE DOI
1111
BibRef
Shafait, F.[Faisal],
Keysers, D.[Daniel],
Breuel, T.M.[Thomas M.],
Performance Evaluation and Benchmarking of Six Page Segmentation
Algorithms,
PAMI(30), No. 6, June 2008, pp. 941-954.
IEEE DOI
0804
Survey, Page Segmentation.
Evaluation, Page Segmentation.
BibRef
Earlier:
Performance Comparison of Six Algorithms for Page Segmentation,
DAS06(368-379).
Springer DOI
0602
BibRef
And:
Pixel-Accurate Representation and Evaluation of Page Segmentation in
Document Images,
ICPR06(I: 872-875).
IEEE DOI
0609
Also uses a dummy program (no segmentation) to establish a minimum performance level.
X-Y Cut (
See also Prototype Document Image Analysis System for Technical Journals, A. ),
Run Length Smearing (
See also Document Analysis System. ),
Whitespace Analysis (
See also Two Geometric Algorithms for Layout Analysis. ) and
Constrained textline detection.
The last two:
Docstrum (
See also Document Spectrum for Page Layout Analysis, The. ),
Voronoi (
See also Segmentation of Page Images Using the Area Voronoi Diagram. ),
are generally the best choice.
For similar analysis also see:
See also Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms.
BibRef
Nagy, G.[George],
Seth, S.C.[Sharad C.],
Viswanathan, M.[Mahesh],
Comment: Projection Methods Require Black Border Removal,
PAMI(31), No. 4, April 2009, pp. 762-762.
IEEE DOI
0903
Flaw in page segmentation evaluation.
See also Performance Evaluation and Benchmarking of Six Page Segmentation Algorithms. Relative to evaluation of:
See also Prototype Document Image Analysis System for Technical Journals, A.
BibRef
Shafait, F.[Faisal],
Keysers, D.[Daniel],
Breuel, T.M.[Thomas M.],
Response to 'Projection Methods Require Black Border Removal',
PAMI(31), No. 4, April 2009, pp. 763-764.
IEEE DOI
0903
See also Performance Evaluation and Benchmarking of Six Page Segmentation Algorithms.
BibRef
Shafait, F.[Faisal],
Breuel, T.M.[Thomas M.],
The Effect of Border Noise on the Performance of Projection-Based Page
Segmentation Methods,
PAMI(33), No. 4, April 2011, pp. 846-851.
IEEE DOI
1103
Page segmentation is usually sensitive to border noise.
BibRef
Stamatopoulos, N.[Nikolaos],
Gatos, B.[Basilis],
Perantonis, S.J.[Stavros J.],
A method for combining complementary techniques for document image
segmentation,
PR(42), No. 12, December 2009, pp. 3158-3168.
Elsevier DOI
0909
Document image segmentation; Combination method; Document image
analysis; Segmentation
BibRef
Deryagin, D.,
Unified Performance Evaluation for OCR Zoning: Calculating Page
Segmentation's Score, That Includes Text Zones, Tables and Non-text
Objects,
ICDAR13(953-957).
IEEE DOI
1312
image segmentation
BibRef
Lebourgeois, F.,
Drira, F.,
Gaceb, D.,
Duong, J.,
Fast Integral MeanShift:
Application to Color Segmentation of Document Images,
ICDAR13(52-56).
IEEE DOI
1312
computational complexity
BibRef
Antonacopoulos, A.[Apostolos],
Pletschacher, S.[Stefan],
Bridson, D.[David],
Papadopoulos, C.[Christos],
ICDAR 2009 Page Segmentation Competition,
ICDAR09(1370-1374).
IEEE DOI
0907
BibRef
Antonacopoulos, A.,
Gatos, B.,
Bridson, D.,
Page Segmentation Competition,
ICDAR07(1279-1283).
IEEE DOI
0709
BibRef
Earlier:
ICDAR2005 page segmentation competition,
ICDAR05(I: 75-79).
IEEE DOI
0508
BibRef
Earlier:
ICDAR 2003 page segmentation competition,
ICDAR03(688-692).
IEEE DOI
0311
BibRef
Peng, L.R.[Liang-Rui],
Chen, M.[Ming],
Liu, C.S.[Chang-Song],
Ding, X.Q.[Xiao-Qing],
Zheng, J.R.[Ji-Rong],
An automatic performance evaluation method for document page
segmentation,
ICDAR01(134-137).
IEEE DOI
0109
BibRef
Fumera, G.,
Pillai, I.,
Roli, F.,
Classification with reject option in text categorisation systems,
CIAP03(582-587).
IEEE DOI
0310
BibRef
Ma, H.F.[Huan-Feng],
Doermann, D.S.,
Gabor filter based multi-class classifier for scanned document images,
ICDAR03(968-972).
IEEE DOI
0311
BibRef
Allier, B.[Bénédicte],
Emptoz, H.[Hubert],
Type extraction and character prototyping using gabor filters,
ICDAR03(799-803).
IEEE DOI
0311
BibRef
And:
Character prototyping in document images using Gabor filters,
ICIP03(I: 537-540).
IEEE Abstract.
0312
BibRef
And:
SCIA03(28-35).
Springer DOI
0310
BibRef
Laurence, D.[Duffy],
Le Bourgeois, F.[Frank],
Emptoz, H.[Hubert],
Logical structure analysis by typographic characteristics extraction,
CIAP97(II: 639-646).
Springer DOI
9709
BibRef
Allier, B.,
Duong, J.,
Gagneux, A.,
Mallet, P.,
Emptoz, H.,
Texture feature characterization for logical pre-labeling,
ICDAR03(567-571).
IEEE DOI
0311
BibRef
Liu, L.J.[Li-Jie],
Dong, Y.[Yan],
Song, X.M.[Xiao-Mu],
Fan, G.L.[Guo-Liang],
An entropy-based segmentation algorithm for computer-generated
document images,
ICIP03(I: 541-544).
IEEE Abstract.
0312
BibRef
Leedham, G.,
Yan, C.[Chen],
Takru, K.,
Tan, J.H.N.[Joie Hadi Nata],
Mian, L.[Li],
Comparison of some thresholding algorithms for text/background
segmentation in difficult document images,
ICDAR03(859-864).
IEEE DOI
0311
BibRef
Leedham, G.,
Varma, S.,
Patankar, A.,
Govindaraju, V.,
Separating text and background in degraded document images:
A comparison of global thresholding techniques for
multi-stage thresholding,
FHR02(244-249).
IEEE Top Reference.
0209
BibRef
Kise, K.,
Miki, Y.,
Matsumoto, K.,
Stippling data on backgrounds of pages-toward seamless integration of
paper and electronic documents,
ICDAR03(1213-1217).
IEEE DOI
0311
BibRef
Kise, K.,
Yanagida, O.,
Takamatsu, S.,
Page Segmentation Based on Thinning of Background,
ICPR96(III: 788-792).
IEEE DOI
9608
(Osaka Prefecture Univ., J)
BibRef
Kise, K.,
Yamaoka, M.,
Babaguchi, N.,
Tezuka, Y.,
Model based system for analyzing document images,
ICPR92(II:647-650).
IEEE DOI
9208
BibRef
Suvichakorn, A.[Aimamorn],
Watcharabusaracum, S.[Sarin],
Sinthupinyo, W.[Wasin],
Simple Layout Segmentation of Gray-Scale Document Images,
DAS02(245 ff.).
Springer DOI
0303
BibRef
Caillault, E.,
Viard-Gaudin, C.,
Ahmad, A.R.,
MS-TDNN with global discriminant trainings,
ICDAR05(II: 856-860).
IEEE DOI
0508
NN HMM.
BibRef
Golenzer, J.,
Viard-Gaudin, C.,
Lallican, P.M.,
Finding regions of interest in document images by planar HMM,
ICPR02(III: 415-418).
IEEE DOI
0211
BibRef
Sivaramakrishnan, R.,
Phillips, I.T.,
Ha, J.,
Subramanium, S.,
Haralick, R.M.,
Zone Classification in a Document Using the Method of
Feature Vector Generation,
ICDAR95(541-544).
Pixel based, multiple classes.
BibRef
9500
Cheng, H.[Hui],
Fan, Z.G.[Zhi-Gang],
Background identification based segmentation and multilayer tree
representation of document images,
ICIP02(III: 1005-1008).
IEEE DOI
0210
BibRef
Blumenstein, M.,
Verma, B.,
Analysis of segmentation performance on the CEDAR benchmark database,
ICDAR01(1142-1146).
IEEE DOI
0109
BibRef
Yang, Y.D.[Yu-Dong],
Zhang, H.J.[Hong-Jiang],
HTML page analysis based on visual cues,
ICDAR01(859-864).
IEEE DOI
0109
BibRef
Mukherjee, D.P.[Dipti Prasad],
Acton, S.T.[Scott T.],
Document Page Segmentation using Multiscale Clustering,
ICIP99(I:234-238).
IEEE DOI
BibRef
9900
He, S.,
Abe, N.,
A Clustering-Based Approach to the Separation of Text Strings from
Mixed Text/Graphics Documents,
ICPR96(III: 706-710).
IEEE DOI
9608
(National Univers. of Singapore, SGP)
BibRef
Randen, T.[Trygve],
Husøy, J.H.[John Håkon],
Segmentation of text/image documents using texture approaches,
Proc.
NOBIM-konferansen-94, Asker (Norway), June 1994, pp. 60-67.
HTML Version.
BibRef
9406
Fischer, S.,
Amin, A.,
Drivas, D.,
Segmentation of the Yellow Pages,
ICDAR95(605-609).
BibRef
9500
Randriamasy, S.,
Vincent, L.,
Benchmarking Page Segmentation Algorithms,
CVPR94(411-416).
IEEE DOI
BibRef
9400
Higashino, J.,
Fujisawa, H.,
Nakano, Y.,
Ejiri, M.,
A Knowledge-Based Segmentation Method for Document Understanding,
ICPR86(745-748).
Top-down layout analysis using FDL.
BibRef
8600
Makino, H.,
Representation and Segmentation of Document Images,
CVPR83(291-295).
BibRef
8300