Model-based analysis of latent factors
Citable Link (URL): http://resolver.sub.uni-goettingen.de/purl?gs-1/15782
The detection of community or population structure through explicit cause–effect modeling of given observations has received considerable attention. The complexity of the task is mirrored by the large number of existing approaches and methods, whose applicability depends heavily on the design of efficient data-analysis algorithms. It is occasionally difficult even to disentangle concepts from algorithms. To clarify this situation, the present paper elaborates a system-analytic framework that probably encompasses most of the common concepts and approaches by classifying them as model-based analyses of latent factors. The efficiency of algorithms is not of primary concern here. In essence, the framework suggests an input–output model system in which the inputs are provided as latent model parameters and the output is specified by the observations. Two types of model are involved: one organizes the inputs by assigning combinations of potentially interacting factor levels to each observed object, while the other specifies the mechanisms by which these combinations are processed to yield the observations. It is demonstrated briefly how some of the most popular methods (Structure, BAPS, Geneland) fit into the framework and how they differ conceptually from each other. Attention is drawn to the need to formulate and assess qualification criteria by which the validity of the model can be judged. One probably indispensable criterion concerns the cause–effect character of the model-based approach and suggests that measures of association between assignments of factor levels and observations be considered together with maximization of their likelihoods (or posterior probabilities). The likelihood criterion in particular is difficult to realize with commonly used estimates based on Markov chain Monte Carlo (MCMC) algorithms. Generally applicable MCMC-based alternatives are suggested that allow for approximate employment of the primary qualification criterion and for the implied model validation, including further descriptors of model characteristics.
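The two-model structure and the pairing of likelihood with an association measure can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's method: each object carries a single factor level, the mechanism is a simple categorical distribution per level, mutual information stands in for the unspecified association measure, and all function names are hypothetical. No MCMC is involved.

```python
import math
from collections import Counter

def estimate_freqs(assignments, observations):
    """Mechanism model (assumed categorical): relative frequency of each
    observed value within each assigned factor level."""
    joint = Counter(zip(assignments, observations))
    level_totals = Counter(assignments)
    freqs = {}
    for (level, value), count in joint.items():
        freqs.setdefault(level, {})[value] = count / level_totals[level]
    return freqs

def log_likelihood(assignments, observations, freqs):
    """Log-likelihood of the observations given the assignments and the
    mechanism parameters in freqs."""
    return sum(math.log(freqs[a][x]) for a, x in zip(assignments, observations))

def mutual_information(assignments, observations):
    """Association between assigned factor levels and observations
    (illustrative choice of measure, in nats)."""
    n = len(assignments)
    joint = Counter(zip(assignments, observations))
    p_level = Counter(assignments)
    p_value = Counter(observations)
    return sum(
        (c / n) * math.log((c * n) / (p_level[a] * p_value[x]))
        for (a, x), c in joint.items()
    )

# Hypothetical toy data: six objects, one observed categorical trait.
observations = ["A", "A", "a", "a", "A", "a"]
separating = [0, 0, 1, 1, 0, 1]   # assignment that tracks the observations
alternating = [0, 1, 0, 1, 0, 1]  # assignment only weakly related to them

for name, assignment in [("separating", separating), ("alternating", alternating)]:
    freqs = estimate_freqs(assignment, observations)
    ll = log_likelihood(assignment, observations, freqs)
    mi = mutual_information(assignment, observations)
    print(f"{name}: log-likelihood = {ll:.4f}, association = {mi:.4f}")
```

In this toy case the assignment that separates the two observed values scores higher on both criteria. The sketch only shows how the two quantities can be computed side by side for a candidate assignment; it says nothing about how such assignments would be searched for or sampled.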