Cross-validation in PCA models with the element-wise k-fold (ekf) algorithm: Practical Aspects.
- José Camacho; Alberto Ferrer
- Abstract:
- This is the second paper in a series devoted to providing
theoretical and practical results and new algorithms for the
selection of the number of Principal Components (PCs) in Principal
Component Analysis (PCA) using cross-validation. The study is
especially focused on the element-wise k-fold (ekf),
which is among the most widely used algorithms for that purpose. In this paper, a taxonomy of PCA applications is proposed, and it is argued
that cross-validatory algorithms that compute the prediction error in observable variables, like ekf, are suited to only one class of applications. A number of cross-validation methods, several of which are original, are compared in two applications of this
class: missing data imputation and compression. The results show that the ekf is especially suited for missing data applications, while other traditional cross-validation methods,
those by Wold and by Eastment and Krzanowski, are not found to
provide useful outcomes in either of the two applications. These
results are of special value considering that the methods
investigated are implemented in the main commercial software packages for chemometrics. Finally, the choice of the missing data algorithm within ekf is also investigated.
- Research areas:
- Year:
- 2014
- Type of Publication:
- Article
- Keywords:
- Principal Component Analysis, number of components, cross-validation, missing data, compression
- Journal:
- Chemometrics and Intelligent Laboratory Systems
- Volume:
- 131
- Pages:
- 37-50