Group-wise ANOVA simultaneous component analysis for designed omics experiments

Abstract:

Introduction. Modern omics experiments involve not only the measurement of many variables but also complex experimental designs in which several factors are manipulated at the same time. Such data can be conveniently analyzed using multivariate tools like ANOVA-simultaneous component analysis (ASCA), which allows the variation induced by the different factors to be interpreted in a principal component analysis fashion. However, although in general only a subset of the measured variables may be related to the problem studied, all variables contribute to the final model, and this may hamper interpretation.
Objectives. We introduce here a sparse implementation of ASCA, termed group-wise ANOVA-simultaneous component analysis (GASCA), with the aim of obtaining models that are easier to interpret.
Methods. GASCA is based on the concept of group-wise sparsity introduced in group-wise principal component analysis, where the structure used to impose sparsity is defined in terms of groups of correlated variables found in the correlation matrices calculated from the effect matrices.
Results. The GASCA model, containing only selected subsets of the original variables, is easier to interpret and describes relevant biological processes.
Conclusions. GASCA is applicable to any kind of omics data obtained through designed experiments such as, but not limited to, metabolomic, proteomic and gene expression data.
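
Illustrative sketch. The idea summarized under Methods can be sketched in a few lines of Python for a single factor. This is a minimal illustration and not the authors' implementation: the function names (effect_matrix, correlated_groups, groupwise_loading), the toy data, the 0.8 correlation threshold, and the use of connected components to form the groups are all assumptions made for the example.

```python
import numpy as np

def effect_matrix(X, levels):
    """Effect matrix of one factor: each row holds the mean of its level
    (computed on column-centered data), as in the ANOVA step of ASCA."""
    Xc = X - X.mean(axis=0)
    E = np.zeros_like(Xc, dtype=float)
    for lv in np.unique(levels):
        rows = levels == lv
        E[rows] = Xc[rows].mean(axis=0)
    return E

def correlated_groups(E, threshold=0.8):
    """Groups of variables whose absolute correlation in E reaches the
    threshold (assumed rule: connected components of the thresholded
    correlation matrix; singletons are dropped)."""
    C = np.corrcoef(E, rowvar=False)
    adj = np.abs(C) >= threshold
    unseen = set(range(adj.shape[0]))
    groups = []
    while unseen:
        seed = unseen.pop()
        stack, comp = [seed], [seed]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(adj[i]):
                j = int(j)
                if j in unseen:
                    unseen.remove(j)
                    stack.append(j)
                    comp.append(j)
        if len(comp) > 1:
            groups.append(sorted(comp))
    return sorted(groups)

def groupwise_loading(E, group):
    """Sparse loading vector: first principal direction of the effect
    matrix restricted to one group, zero for every other variable."""
    p = np.zeros(E.shape[1])
    _, _, Vt = np.linalg.svd(E[:, group], full_matrices=False)
    p[group] = Vt[0]
    return p

# Toy data (hypothetical): 3 factor levels, 2 replicates each, 4 variables.
levels = np.array([0, 0, 1, 1, 2, 2])
v0 = np.array([0.9, 1.1, 1.9, 2.1, 2.9, 3.1])
v2 = np.array([3.1, 2.9, 1.1, 0.9, 2.1, 1.9])
v3 = np.array([0.9, 1.1, 3.1, 2.9, 1.9, 2.1])
X = np.column_stack([v0, 2 * v0, v2, v3])

E = effect_matrix(X, levels)
groups = correlated_groups(E, threshold=0.8)
loadings = [groupwise_loading(E, g) for g in groups]
print(groups)
```

Each loading vector is nonzero only on the variables of its group, which is what makes the resulting component easier to read than a dense ASCA loading. A full analysis would repeat this for every factor and interaction in the design.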
