Book on quality analysis


Chemometrics Group


Contact us:

- to become a member
- for more information

Postal address:

Norsk kjemisk selskaps faggruppe for kjemometri
c/o Tarja Rajalahti
UiB, Kjemisk inst.
Allegaten 41
5007 Bergen


Updated 22.07.2010
kjemometri [Q] kjemometri.org

Dear Scandinavian chemometricians and sensometricians,

We take the liberty of sharing with you some of the thoughts behind our brand new book:

Martens, H. and Martens, M. (2001): MULTIVARIATE ANALYSIS OF QUALITY An Introduction

John Wiley & Sons, ISBN: 0-471-97428-5 (see: www.wiley.co.uk/wileychi/chemometrics)

The fields of chemometrics and sensometrics have accumulated some characteristic and efficient ways to explore complex systems and to analyse multivariate data, and we consider these to be of value also to the outside world. In this introductory book we have tried to summarise some of the experiences and methods from chemometrics and sensometrics. We thereby try to fill an introductory, interdisciplinary niche between (or rather, below) all the good textbooks, handbooks and manuals already published in these fields, and in the neighbouring field of qualimetrics. The new book took many years to write, because we wanted to present a rather complete, yet simple approach to multivariate data analysis, and for this we needed to fill in some methodological holes. With that done, the book presents ONE single data-analytical approach, which can be used for MANY different types of data analysis, and for MANY different types of input data.

Copenhagen, February 8, 2001

Best regards,

Harald Martens prof. chemometrics NTNU(Norway), DTU(Denmark) Harald.Martens@mail.tele.dk

Magni Martens prof. sensory science KVL (Denmark) mma@kvl.dk

 

 

BACKGROUND:

In this book, the diverse field of QUALITY ASSESSMENT is used for explaining the chosen data analytical method, also from the perspective of ISO standards. The inter-disciplinary examples range from calibration of NIR instruments and fluorescence process-analysers, via explorative prediction of toxicity from quantum chemistry computations, sensory and chemical analysis of food quality, and analysis of various questionnaire data sets, to the confirmative analysis of effects and their reliabilities ("significances") in designed micro-biological experiments. As the title indicates, the book is introductory, and therefore written with very few mathematical and statistical hurdles (except in the technical appendices where the method is detailed). For particularly critical projects, we recommend that researchers consult a professional statistician. But for all other projects, the researcher should analyse his or her own data, because only the researcher has the necessary contextual knowledge. Moreover, there are not enough statisticians around, and many of them are too traditional and too theory-hungry.

There is no such thing as THE BEST data analytical method. However, to use a method wrongly can be worse than not using it at all; it can be misleading and it breeds scientific hubris. Researchers who are busy enough in their own fields may not have time to learn to master many different statistical methods properly, particularly if that requires close attention to mathematics and statistical distribution theory. Therefore, we have striven to develop a rather complete tool for multivariate data analysis, which efficiently replaces a long list of different, specialized methods from classical statistics. The reliability of the results is primarily assessed by What-You-See-Is-What-You-Get visualisations of model stability, with little or no need for traditional statistical distribution theory. We have striven to ensure that the chosen methodology is well rooted in contemporary statistics (linear models, latent variables and resampling statistics), because we consider scientific connectivity to be important. The methodology chosen is no foreigner to chemometricians and sensometricians: Interactive, graphically oriented multivariate soft modelling, based on one single method: Bi-linear modelling by weighted PCA/PLS Regression, with reliability assessed by cross-validation / jack-knifing. But certain details are new, and the way we explain it may be new to some.
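For readers who have not met bi-linear modelling before, the following is a minimal sketch of a weighted PCA, not the book's own implementation. It assumes that the data are mean-centred, that each variable is multiplied by an a priori weight (here the inverse standard deviation, one common choice), and that the model is obtained by a singular value decomposition. The function name `weighted_pca` and the simulated data are made up for illustration.

```python
import numpy as np

def weighted_pca(X, weights, n_components):
    """Bi-linear model Xw = T @ P.T + E of the centred, weighted data Xw."""
    X = np.asarray(X, dtype=float)
    Xw = (X - X.mean(axis=0)) * weights            # centre, then weight each variable
    U, s, Vt = np.linalg.svd(Xw, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]     # scores (one row per sample)
    P = Vt[:n_components].T                        # loadings (orthonormal columns)
    E = Xw - T @ P.T                               # unmodelled residuals
    return T, P, E

# Simulated data (an assumption): two latent phenomena seen through
# six measured variables, plus a little noise.
rng = np.random.default_rng(0)
T_true = rng.normal(size=(30, 2))
P_true = rng.normal(size=(6, 2))
X = T_true @ P_true.T + 0.05 * rng.normal(size=(30, 6))

weights = 1.0 / X.std(axis=0, ddof=1)              # standardisation as statistical weights
T, P, E = weighted_pca(X, weights, n_components=2)

total = (((X - X.mean(axis=0)) * weights) ** 2).sum()
explained = 1.0 - (E ** 2).sum() / total
print(f"fraction of weighted variance explained by 2 components: {explained:.3f}")
```

Plotting the columns of `T` against each other gives a score plot, and the columns of `P` a loading plot. PLS Regression yields the same kind of bi-linear structure, but chooses the components for their relevance to a response Y.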

We primarily focus on using this method for explorative investigations of various types: Simple unmixing of known systems, more advanced multivariate calibration of incompletely understood systems, general interdisciplinary studies by multivariate regression and by one- or two-block factor analysis to find patterns, and classification / discrimination to study heterogeneous sample sets. Confirmative analysis of designed experiments is also given attention (see the last section, below). The book is written as a simplified introduction to the more theoretical book, Martens & Naes: Multivariate Calibration (J.Wiley & Sons Ltd 1989), and uses the same notation. It is primarily geared towards researchers and students who see themselves as novices in multivariate data analysis, in particular those with a mathematics allergy. Most of the methodology is already available in updated third-party software.

The new book has a fair number of innovations and filled-in holes, of potential interest to the more method-oriented reader, e.g.:

- how to distinguish between statistical weights of input variables (to optimize parameter estimation) and their graphical weights (to simplify interpretation of loading plots etc)
- how to compute correlation loadings, to make bi-linear plots scale-invariant
- how to assess a model at different validity levels (repeatability, reproducibility, interpolation ability etc) by using different cross-validation segmentation schemes
- how to make cost-effective experimental planning, by assessing alternative experimental plans w.r.t. statistical power, without any difficult distribution theory. (Instead, for each alternative, automatically analyse many simulated data just like the future real data will be analysed, and look at the histograms of outcomes)
- how to use jack-knifed PLSR perturbations for visual and quantitative reliability assessment, instead of the traditional SS-DF-MSE-F-p tables in ANOVA
- how to use PLSR for unmixing, i.e. for curve resolution with very few calibration data

In the technical appendix section, which details the methodology, additional new stuff is outlined, e.g.:

- how to stabilize the PLSR numerically, for analysis of designed experiments (alternative to ANOVA)
- how to pre-treat data to reduce the effect of a priori known uncertainty and irrelevant covariances
- why NIR spectra are so information rich.

The various method innovations used in the book have recently been studied in more depth in various chemometrics and sensometrics publications, or are now in the process of being studied, by ourselves, our PhD students or by others. More information about the book is available at the publisher's web site listed above. This site also gives access to all the data used in the examples, in several different file formats (ASCII, The Unscrambler and Matlab).
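To make the idea of correlation loadings mentioned above concrete, here is a small sketch. It assumes the usual definition, i.e. the correlation coefficient between each input variable and each score vector; the book may define details differently. The function name `correlation_loadings` and the simulated data are illustrative assumptions.

```python
import numpy as np

def correlation_loadings(X, T):
    """Correlation between every column of X and every score vector in T."""
    Xc = X - X.mean(axis=0)
    Tc = T - T.mean(axis=0)
    cross = Xc.T @ Tc                                    # (variables x components)
    norms = np.outer(np.linalg.norm(Xc, axis=0), np.linalg.norm(Tc, axis=0))
    return cross / norms

# Simulated data (an assumption): 40 samples, 5 correlated variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 5)) @ rng.normal(size=(5, 5))

# Scores of the first two principal components of the centred data.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
T = U[:, :2] * s[:2]

R = correlation_loadings(X, T)

# Changing the units of the variables leaves the correlation loadings
# unchanged, as long as the same scores are used.
R_rescaled = correlation_loadings(X * np.array([1.0, 10.0, 100.0, 0.1, 5.0]), T)
print(np.round(R, 2))
```

Every entry lies between -1 and +1, so variables measured in very different units can share one plot. Note that the scores themselves still depend on how the variables were weighted before the model was fitted.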

ANALYSIS OF EFFECTS BY JACK-KNIFED PLS REGRESSION:

The section of the book on confirmative analysis is the part with the most new methodology. We here outline the method, and motivate it for our fellow chemometricians and sensometricians: The data from controlled experiments (e.g. factorial designs) may be analysed just like more explorative data sets, by:

1. Formulating the structure modelling from e.g. ANOVA, MANOVA or Covariance analysis as a linear regression problem (nothing new in that!)
2. Estimating the linear regression coefficients via bi-linear Principal Components by PLS Regression (PLSR), with numerical stabilisation of the PLSR (new) and, finally
3. Estimating the uncertainties and "significances" of the linear and bi-linear parameters by Procrustes-rotated jack-knifing (new, although in line with Herman Wold's early suggestions for PLS path modelling).

This approach is particularly valuable when there are MANY inter-related responses Y to be related to the design factors X, because then e.g. the traditional ANOVA output tends to become overwhelming. Instead, one can now overview the main results in plots of the first few bi-linear PCs, study detailed quantitative effects in the linear regression coefficients, study the reliability of these results by graphical inspection of the jack-knifing perturbations and the resulting reliability ranges, and even estimate their statistical "significances".
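The three steps can be sketched in a few lines for the simplest case of one response. This is a rough illustration under stated assumptions, not the method as detailed in the book: it uses a plain NIPALS PLS1 algorithm without the numerical stabilisation, the textbook leave-one-out jack-knife formula for the standard error, and it only treats the regression coefficients, so no Procrustes rotation of scores and loadings is involved. The design, the covariate, the effect sizes and all function names are made up.

```python
import itertools
import numpy as np

def pls1_coefficients(X, y, n_components):
    """Regression coefficients b of a PLS1 model (NIPALS on centred data)."""
    Xk = X - X.mean(axis=0)
    yk = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)          # loading weights
        t = Xk @ w                         # scores
        tt = t @ t
        p = Xk.T @ t / tt                  # X-loadings
        qa = (yk @ t) / tt                 # y-loading
        Xk = Xk - np.outer(t, p)           # deflate X
        yk = yk - qa * t                   # deflate y
        W.append(w); P.append(p); q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

def jackknife_pls1(X, y, n_components):
    """Full-data coefficients and leave-one-out jack-knife standard errors."""
    n = X.shape[0]
    b_full = pls1_coefficients(X, y, n_components)
    b_loo = np.array([
        pls1_coefficients(np.delete(X, i, axis=0), np.delete(y, i), n_components)
        for i in range(n)
    ])
    # Textbook jack-knife variance (an assumption, see the lead-in).
    se = np.sqrt((n - 1) / n * ((b_loo - b_loo.mean(axis=0)) ** 2).sum(axis=0))
    return b_full, se

# Step 1: the design as a regression problem. Simulated example (an
# assumption): replicated 2^3 factorial in factors A, B, C plus one covariate.
rng = np.random.default_rng(2)
design = np.array(list(itertools.product([-1.0, 1.0], repeat=3)))
factors = np.vstack([design, design])                  # 16 runs
covariate = rng.normal(size=(16, 1))
X = np.hstack([factors, covariate])
y = 2.0 * X[:, 0] - 1.0 * X[:, 2] + 0.5 * X[:, 3] + 0.3 * rng.normal(size=16)

# Steps 2 and 3: PLSR coefficients and their jack-knifed uncertainties.
b, se = jackknife_pls1(X, y, n_components=2)
for name, bi, si in zip(["A", "B", "C", "covariate"], b, se):
    print(f"{name:>9}: b = {bi:+.2f}  s.e. = {si:.2f}  b/s.e. = {bi / si:+.1f}")
```

A large ratio `b/s.e.` points to a reliable effect, and the individual rows of `b_loo` could be plotted around `b` to inspect the perturbations graphically. With several responses Y one would use a multi-response PLSR and, for scores and loadings, rotate each jack-knife sub-model towards the full model before comparing.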

The "soft/hard modelling" can reveal outliers and other surprises that in traditional analysis of variance would cause "non-significance" but otherwise tend to pass unnoticed. Of course we expect to draw some fire by taking on big brother ANOVA, who is so well established. And since our method is new, it needs critical scrutiny. But we are not asking for war, because chemometricians and sensometricians struggle with many of the same fundamental problems that classical statisticians are up against: How to obtain and interpret data from a complex world under indirect, uncertain observation, given the human tendency for wishful thinking. We just wanted to develop a simpler, less abstract analysis of effects, applicable even for designed experiments with multiple responses: one that combines the excitement of discovery with a critical assessment of reliability, that gives plots rather than significance tables, and that does not require an understanding of what a degree of freedom is. In the final appendix we illustrate that the new method gives significance estimates quite similar to those from traditional ANOVA, if so desired.

To request a review copy of the book, please contact Julia Lampam on (+44) 1243 770668 or e-mail: jlampam@wiley.co.uk

Thank you for your patience. M & H