NESG

On the Generation of Random Multivariate Data

José Camacho
Abstract:
The simulation of multivariate data is often necessary for assessing the performance of multivariate analysis techniques. The random generation of multivariate data when the covariance matrix is completely or partly specified is solved by different methods, from the Cholesky decomposition to some recent alternatives. However, the covariance matrix often has to be generated at random as well, so that the data simulation spans different situations, from highly correlated to uncorrelated data. This is the case when assessing a new multivariate analysis technique in Monte Carlo experiments. In this paper, we introduce a new algorithm for the generation of random data from covariance matrices of random structure, where the user only decides the data dimension and the level of correlation. We illustrate the application of this algorithm in several relevant problems in multivariate analysis, namely the selection of the number of Principal Components in Principal Component Analysis, the evaluation of the performance of sparse Partial Least Squares and the calibration of Multivariate Statistical Process Control systems. The algorithm is available as part of the MEDA Toolbox v1.1, available at https://github.com/josecamachop/MEDA-Toolbox/releases/tag/v1.1
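As a point of reference for the classical case mentioned in the abstract, where the covariance matrix is fully specified, the following is a minimal sketch of Cholesky-based generation. It is not the algorithm introduced in the paper and does not use the MEDA Toolbox; the function name `sample_with_covariance`, its parameters and the example matrix are assumptions chosen for illustration.

```python
import numpy as np


def sample_with_covariance(cov, n_obs, seed=0):
    """Draw n_obs zero-mean Gaussian observations with covariance `cov`.

    Hypothetical helper for illustration: classical Cholesky approach,
    not the random-covariance algorithm proposed in the paper.
    """
    rng = np.random.default_rng(seed)
    cov = np.asarray(cov, dtype=float)
    # Lower-triangular factor with cov = L @ L.T (cov must be positive definite).
    L = np.linalg.cholesky(cov)
    # Uncorrelated standard normal scores, one row per observation.
    Z = rng.standard_normal((n_obs, cov.shape[0]))
    # Each row z is mapped to L @ z, which has covariance L @ L.T = cov.
    return Z @ L.T


# Example: two variables with correlation 0.8 (values chosen for illustration).
cov = np.array([[1.0, 0.8],
                [0.8, 1.0]])
X = sample_with_covariance(cov, n_obs=5000)
sample_cov = np.cov(X, rowvar=False)
```

With a few thousand observations, `sample_cov` should be close to `cov`; the match is only approximate because the data are a finite random sample.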
Year:
2017
Type of Publication:
Article
Keywords:
Multivariate data
Journal:
Chemometrics and Intelligent Laboratory Systems
Volume:
160
Pages:
40 - 51
ISSN:
0169-7439
DOI:
http://dx.doi.org/10.1016/j.chemolab.2016.11.013