14.1.4.1 Sample Size Issues, Data Analysis, Training Sets

Small Sample Size. Evaluation, Samples. 9805
See also Imbalanced Sample Sizes, Imbalanced Data.

Fukunaga, K., and Hayes, R.R.,
Effects of Sample Size in Classifier Design,
PAMI(11), No. 8, August 1989, pp. 873-885.
IEEE DOI BibRef 8908

Raudys, S.J., and Jain, A.K.,
Small Sample Size Effects in Statistical Pattern Recognition: Recommendations for Practitioners,
PAMI(13), No. 3, March 1991, pp. 252-264.
IEEE DOI BibRef 9103
Earlier:
Small sample size effects in statistical pattern recognition: recommendations for practitioners and open problems,
ICPR90(I: 417-423).
IEEE DOI 9006
BibRef

Wong, A.K.C., and Chiu, D.K.Y.,
Synthesizing Statistical Knowledge from Incomplete Mixed-Mode Data,
PAMI(9), No. 6, November 1987, pp. 796-805. BibRef 8711

Chiu, D.K.Y., Wong, A.K.C., and Chan, K.C.C.,
Synthesis of Statistical Knowledge from Time-Dependent Data,
PAMI(13), No. 3, March 1991, pp. 265-271.
IEEE DOI BibRef 9103

Chan, S.C., Wong, A.K.C.,
Synthesis and recognition of sequences,
PAMI(13), No. 12, December 1991, pp. 1245-1255.
IEEE DOI 0401
BibRef

Ibrahim, J.,
Incomplete Data in Generalized Linear Models,
ASAJ(85), 1990, pp. 765-769. BibRef 9000

Krishnan, T., Nandy, S.C.,
Efficiency of discriminant analysis when initial samples are classified stochastically,
PR(23), No. 5, 1990, pp. 529-537.
WWW Link. 0401
BibRef

Benali, H., Buvat, I., Frouin, F., Bazin, J.P., di Paola, R.,
Foundations of Factor Analysis of Medical Image Sequences: A Unified Approach and Some Practical Implications,
IVC(12), No. 6, July-August 1994, pp. 375-385.
WWW Link. 0401
BibRef

Lazo-Cortes, M., Ruiz-Shulcloper, J.,
Determining the Feature Relevance for Nonclassically Described Objects and a New Algorithm to Compute Typical Fuzzy Testors,
PRL(16), No. 12, December 1995, pp. 1259-1265. BibRef 9512

Lazo-Cortes, M.[Manuel], Ruiz-Shulcloper, J.[Jose], Alba-Cabrera, E.[Eduardo],
An Overview of the Evolution of the Concept of Testor,
PR(34), No. 4, April 2001, pp. 753-762.
WWW Link. 0101
BibRef

Sanchez-Díaz, G.[Guillermo], Lazo-Cortés, M.[Manuel],
CT-EXT: An Algorithm for Computing Typical Testor Set,
CIARP07(506-514).
Springer DOI 0711
BibRef

Castelli, V., Cover, T.M.,
The Relative Value of Labeled and Unlabeled Samples in Pattern-Recognition with an Unknown Mixing Parameter,
IT(42), No. 6, Part 2, November 1996, pp. 2102-2117. 9701
BibRef

Jain, A.K.[Anil K.], Zongker, D.[Douglas],
Feature-Selection: Evaluation, Application, and Small Sample Performance,
PAMI(19), No. 2, February 1997, pp. 153-158.
IEEE DOI 9703
Sequential Forward Floating Selection algorithm (See also Floating Search Methods in Feature-Selection.) dominates other tested algorithms. Applied with 4 texture models on SAR. BibRef

Zongker, D., Jain, A.K.,
Algorithms for Feature Selection: An Evaluation,
ICPR96(II: 18-22).
IEEE DOI 9608
(Michigan State Univ., USA) BibRef

Raudys, S.J.[Sarunas J.],
Dimensionality, Sample-Size, and Classification Error of Nonparametric Linear Classification Algorithms,
PAMI(19), No. 6, June 1997, pp. 667-671.
IEEE DOI 9708
BibRef
Earlier:
Linear Classifiers in Perceptron Design,
ICPR96(IV: 763-767).
IEEE DOI 9608
(Institute of Mathematics and Informatics, LIT) BibRef

Tseng, C.H.,
Identification of Cubically Nonlinear Systems Using Undersampled Data,
VISP(144), No. 5, October 1997, pp. 267-277. 9806
BibRef

Skurichina, M.[Marina], Duin, R.P.W.[Robert P.W.],
Regularisation of Linear Classifiers by Adding Redundant Features,
PAA(2), No. 1, 1999, pp. 44-52. BibRef 9900

Hodgson, M.E.,
What Size Window for Image Classification: A Cognitive Perspective,
PhEngRS(64), No. 8, August 1998, pp. 797-807. 9808
BibRef

Guyon, I.[Isabelle], Makhoul, J.[John], Schwartz, R.[Richard], Vapnik, V.[Vladimir],
What Size Test Set Gives Good Error Rate Estimates?,
PAMI(20), No. 1, January 1998, pp. 52-64.
IEEE DOI 9803
OCR. Applied to the character recognition problem. BibRef

Avena, G.C., Ricotta, C., Volpe, F.,
The influence of principal component analysis on the spatial structure of a multispectral dataset,
JRS(20), No. 17, November 1999, pp. 3367. BibRef 9911

Sitek, A., Gullberg, G.T., Huesman, R.H.,
Correction for ambiguous solutions in factor analysis using a penalized least squares objective,
MedImg(21), No. 3, March 2002, pp. 216-225.
IEEE Top Reference. 0205
BibRef

Ennaji, A.[Abdellatif], Ribert, A.[Arnaud], Lecourtier, Y.[Yves],
From data topology to a modular classifier,
IJDAR(6), No. 1, 2003, pp. 1-9.
Springer DOI 0308
BibRef

Ribert, A., Ennaji, A., Lecourtier, Y.,
Clustering Data: Dealing with High Density Variations,
ICPR00(Vol II: 736-739).
IEEE DOI 0009
BibRef

Stocker, E., Ribert, A., Lecourtier, Y., Ennaji, A.,
Incremental Distributed Classifier Building,
ICPR96(IV: 128-132).
IEEE DOI 9608
(Univ. de Rouen, F) BibRef

Liu, Z.Y., Chiu, K.C., Xu, L.,
Investigations on Non-Gaussian Factor Analysis,
SPLetters(11), No. 7, July 2004, pp. 597-600.
IEEE Abstract. 0407
BibRef

Wang, Y.[Ye], Huang, S.T.[Shang-Teng],
Training TSVM with the proper number of positive samples,
PRL(26), No. 14, 15 October 2005, pp. 2187-2194.
WWW Link. 0510
Transductive Support Vector Machine. BibRef

Knijnenburg, T.A.[Theo A.], Reinders, M.J.T.[Marcel J.T.], Wessels, L.F.A.[Lodewyk F.A.],
Artifacts of Markov blanket filtering based on discretized features in small sample size applications,
PRL(27), No. 7, May 2006, pp. 709-714.
WWW Link. 0604
Feature evaluation and selection. Apply to gene expression data. BibRef

Lai, C.[Carmen], Reinders, M.J.T.[Marcel J.T.], Wessels, L.F.A.[Lodewyk F.A.],
Random subspace method for multivariate feature selection,
PRL(27), No. 10, 15 July 2006, pp. 1067-1076.
WWW Link. 0606
Random subspace method; Small sample size problem BibRef

Li, Y.L.[Yun-Lei], Wessels, L.F.A.[Lodewyk F.A.], de Ridder, D.[Dick], Reinders, M.J.T.[Marcel J.T.],
Classification in the presence of class noise using a probabilistic Kernel Fisher method,
PR(40), No. 12, December 2007, pp. 3349-3357.
WWW Link. 0709
BibRef
And: Erratum: PR(41), No. 3, March 2008, pp. 1214.
WWW Link. 0711
Classification; Class noise; Labeling noise; Kernel Fisher discriminant BibRef

Liang, Y.X.[Yi-Xiong], Li, C.R.[Cheng-Rong], Gong, W.G.[Wei-Guo], Pan, Y.J.[Ying-Jun],
Uncorrelated linear discriminant analysis based on weighted pairwise Fisher criterion,
PR(40), No. 12, December 2007, pp. 3606-3615.
WWW Link. 0709
Uncorrelated LDA; Null space LDA; Weighted pairwise Fisher criterion; Decorrelation BibRef

Isaksson, A., Wallman, M., Goransson, H., Gustafsson, M.G.,
Cross-validation and bootstrapping are unreliable in small sample classification,
PRL(29), No. 14, October 2008, pp. 1960-1965.
WWW Link. 0804
Supervised classification; Performance estimation; Confidence interval BibRef

Das, K.[Koel], Nenadic, Z.[Zoran],
An efficient discriminant-based solution for small sample size problem,
PR(42), No. 5, May 2009, pp. 857-866.
Elsevier DOI 0902
Feature extraction; Principal component analysis; Classification; Linear discriminant analysis; Bayes error BibRef

Jacquemont, S.[Stephanie], Jacquenet, F.[Francois], Sebban, M.[Marc],
A lower bound on the sample size needed to perform a significant frequent pattern mining task,
PRL(30), No. 11, 1 August 2009, pp. 960-967.
Elsevier DOI 0909
Frequent pattern mining; Lower bound BibRef

Hernandez-Leal, P.[Pablo], Carrasco-Ochoa, J.A.[J. Ariel], Martínez-Trinidad, J.F., Olvera-Lopez, J.A.[J. Arturo],
InstanceRank based on borders for instance selection,
PR(46), No. 1, January 2013, pp. 365-375.
Elsevier DOI 1209
Instance selection; Instance ranking; Border instances; Supervised classification BibRef

Olvera-López, J.A.[J. Arturo], Martínez-Trinidad, J.F.[J. Francisco], Carrasco-Ochoa, J.A.[J. Ariel],
Mixed Data Object Selection Based on Clustering and Border Objects,
CIARP07(674-683).
Springer DOI 0711
Instance selection. BibRef

Hernandez-Rodriguez, S.[Selene], Martínez-Trinidad, J.F.[J. Francisco], Carrasco-Ochoa, J.A.[J. Ariel],
On the selection of base prototypes for LAESA and TLAESA classifiers,
ICPR08(1-4).
IEEE DOI 0812
BibRef

Macià, N.[Núria], Bernadó-Mansilla, E.[Ester], Orriols-Puig, A.[Albert], Ho, T.K.[Tin Kam],
Learner excellence biased by data set selection: A case for data characterisation and artificial data sets,
PR(46), No. 3, March 2013, pp. 1054-1066.
Elsevier DOI 1212
BibRef
Earlier: A1, A2, A3:
Preliminary approach on synthetic data sets generation based on class separability measure,
ICPR08(1-4).
IEEE DOI 0812
Supervised learning; Learner assessment; Data complexity BibRef

Macià, N.[Núria], Ho, T.K.[Tin Kam], Orriols-Puig, A.[Albert], Bernadó-Mansilla, E.[Ester],
The Landscape Contest at ICPR 2010,
ICPR-Contests10(29-45).
Springer DOI 1008
Evaluate robustness of supervised classifications and their limitations. BibRef

Hanczar, B.[Blaise], Dougherty, E.R.[Edward R.],
The reliability of estimated confidence intervals for classification error rates when only a single sample is available,
PR(46), No. 3, March 2013, pp. 1067-1077.
Elsevier DOI 1212
Supervised learning; Error estimation; High dimension; Small sample setting; Confidence interval BibRef

Jiang, Y.G.[Yu-Gang], Wang, J.[Jun], Xue, X., Chang, S.F.[Shih-Fu],
Query-Adaptive Image Search With Hash Codes,
MultMed(15), No. 2, 2013, pp. 442-453.
IEEE DOI 1302
BibRef

Jiang, Y.G.[Yu-Gang], Wang, J.[Jun], Chang, S.F.[Shih-Fu],
Lost in binarization: query-adaptive ranking for similar image search with compact codes,
ICMR11(16).
DOI Link 1301
BibRef
And: A2, A1, A3:
Label diagnosis through self tuning for web image search,
CVPR09(1390-1397).
IEEE DOI 0906
Are the initial labels good? BibRef

Wu, H.[Hao], Miao, Z.J.[Zhen-Jiang], Wang, Y.[Yi], Lin, M.[Manna],
Optimized recognition with few instances based on semantic distance,
VC(31), No. 4, April 2015, pp. 367-375.
Springer DOI 1503
Learning with only a few examples. BibRef

Hsiao, P.H.[Pai-Heng], Chang, F.J.[Feng-Ju], Lin, Y.Y.[Yen-Yu],
Learning Discriminatively Reconstructed Source Data for Object Recognition With Few Examples,
IP(25), No. 8, August 2016, pp. 3518-3532.
IEEE DOI 1608
learning systems BibRef

Potapov, A.[Alexey], Potapova, V.[Vita], Peterson, M.[Maxim],
A feasibility study of an autoencoder meta-model for improving generalization capabilities on training sets of small sizes,
PRL(80), No. 1, 2016, pp. 24-29.
Elsevier DOI 1609
Autoencoders BibRef

Li, D.[Dong], Liu, S.L.[Shu-Lin], Zhang, H.L.[Hong-Li],
A method of anomaly detection and fault diagnosis with online adaptive learning under small training samples,
PR(64), No. 1, 2017, pp. 374-385.
Elsevier DOI 1701
Artificial immune system BibRef


Davari, A.[Amir_Abbas], Christlein, V.[Vincent], Vesal, S.[Sulaiman], Maier, A.[Andreas], Riess, C.[Christian],
GMM Supervectors for Limited Training Data in Hyperspectral Remote Sensing Image Classification,
CAIP17(II: 296-306).
Springer DOI 1708
BibRef

Ustuner, M., Sanli, F.B., Abdikan, S.,
Balanced Vs Imbalanced Training Data: Classifying Rapideye Data With Support Vector Machines,
ISPRS16(B7: 379-384).
DOI Link 1610
BibRef

Su, H.[Hang], Yin, Z.Z.[Zhao-Zheng], Kanade, T.[Takeo], Huh, S.[Seungil],
Active sample selection and correction propagation on a gradually-augmented graph,
CVPR15(1975-1983).
IEEE DOI 1510
BibRef

Kadar, I.[Ilan], Ben-Shahar, O.[Ohad],
Small sample scene categorization from perceptual relations,
CVPR12(2711-2718).
IEEE DOI 1208
BibRef

Singh, M.[Mayank], Gupta, P.K., Mishra, S.[Shailendra],
Automated test data generation for mutation testing using AspectJ programs,
ICIIP11(1-5).
IEEE DOI 1112
BibRef

Xia, X.T.[Xin-Tao], Zhou, Q.[Qing], Zhu, J.M.[Jian-Min],
Evaluation for repeatability and reproducibility of information poor process,
IASP10(528-531).
IEEE DOI 1004
Incomplete, small sample size. BibRef

Petersen, H.[Henry], Poon, J.[Josiah],
Reworking Bridging for Use within the Image Domain,
CAIP09(832-839).
Springer DOI 0909
Bridging: from string text classification (Zelikovitz et al.) Data set issues. BibRef

Ricamato, M.T.[Maria Teresa], Marrocco, C.[Claudio], Tortorella, F.[Francesco],
MCS-based balancing techniques for skewed classes: An empirical comparison,
ICPR08(1-4).
IEEE DOI 0812
Train multiple classifiers using the minority class and part of majority class. BibRef

Na, J.H.[Jin Hee], Yun, S.M.[Seok Min], Kim, M.S.[Min-Soo], Choi, J.Y.[Jin Young],
Relevant pattern selection for subspace learning,
ICPR08(1-4).
IEEE DOI 0812
BibRef

Xu, Y.[Yong], Zhang, D.[David],
A New Solution Scheme of Unsupervised Locality Preserving Projection Method for the SSS Problem,
SSPR08(775-781).
Springer DOI 0812
Small Sample Size problem. BibRef

Morales-Manilla, L.R.[Luis Roberto], Sanchez-Diaz, G.[Guillermo],
FS-EX Plus: A New Algorithm for the Calculation of Typical FS-Testor Set,
CIARP07(380-386).
Springer DOI 0711
Feature Selection. BibRef

Zheng, Y.J.[Yu-Jie], Yang, J.Y.[Jing-Yu], Yang, J.[Jian], Wu, X.J.[Xiao-Jun],
Effective classification image space which can solve small sample size problem,
ICPR06(II: 861-864).
IEEE DOI 0609
BibRef

Pranckeviciene, E.[Erinija], Ho, T.K.[Tin Kam], Somorjai, R.[Ray],
Class Separability in Spaces Reduced By Feature Selection,
ICPR06(III: 254-257).
IEEE DOI 0609
BibRef

Zhu, X.Q.[Xing-Quan], Wu, X.D.[Xin-Dong],
Scalable Representative Instance Selection and Ranking,
ICPR06(III: 352-355).
IEEE DOI 0609
BibRef

Xuan, G.R.[Guo-Rong], Zhu, X.M.[Xiu-Ming], Chai, P.Q.[Pei-Qi], Zhang, Z.P.[Zhen-Ping], Shi, Y.Q.[Yun Q.], Fu, D.D.[Dong-Dong],
Feature Selection based on the Bhattacharyya Distance,
ICPR06(III: 1232-1235).
IEEE DOI 0609
BibRef
And: ICPR06(IV: 957).
IEEE DOI 0609
BibRef

Levi, K.[Kobi], Fink, M.[Michael], Weiss, Y.[Yair],
Learning From a Small Number of Training Examples by Exploiting Object Categories,
LCV04(96).
IEEE DOI 0406
BibRef

Ulusoy, I.[Ilkay], Bishop, C.M.[Christopher M.],
Comparison of Generative and Discriminative Techniques for Object Detection and Classification,
CLOR06(173-195).
Springer DOI 0711
BibRef
Earlier:
Generative versus Discriminative Methods for Object Recognition,
CVPR05(II: 258-265).
IEEE DOI 0507
BibRef

Salah, A.A., Alpaydin, E.,
Incremental mixtures of factor analysers,
ICPR04(I: 276-279).
IEEE DOI 0409
BibRef

Huang, R.[Rui], Liu, Q.S.[Qing-Shan], Lu, H.Q.[Han-Qing], Ma, S.D.[Song-De],
Solving the small sample size problem of LDA,
ICPR02(III: 29-32).
IEEE DOI 0211
BibRef

Duin, R.P.W.[Robert P.W.],
Relational Discriminant Analysis and its Large Sample Size Problem,
ICPR98(Vol I: 445-449).
IEEE DOI 9808
BibRef

Last update: Sep 22, 2017 at 21:00:01