|
Contact us:
- to become a member
- for more information
Postal address:
Norsk kjemisk selskaps
faggruppe for kjemometri
c/o Tarja Rajalahti
UiB, Kjemisk inst.
Allegaten 41
5007 Bergen
Updated 22.07.2010
kjemometri [Q] kjemometri.org
|
Dear Scandinavian chemometricians and sensometricians,
We take the liberty of sharing with you some
of the thoughts behind our brand new book:
Martens, H. and Martens, M. (2001): MULTIVARIATE
ANALYSIS OF QUALITY: An Introduction.
John Wiley & Sons, ISBN 0-471-97428-5 (see: www.wiley.co.uk/wileychi/chemometrics)
The fields of chemometrics and sensometrics
have accumulated some characteristic and efficient ways to explore
complex systems and to analyse multivariate data, and we consider
these to be of value also to the outside world. In this introductory
book we have tried to summarise some of the experiences and methods
from chemometrics and sensometrics. We thereby try to fill an introductory,
interdisciplinary niche between (or rather, below) all the good
textbooks, handbooks and manuals already published in these fields,
and in the neighbouring field of qualimetrics. The new book took
many years to write, because we wanted to present a rather complete,
yet simple approach to multivariate data analysis, and for this
we needed to fill in some methodological holes. With that done,
the book presents ONE single data-analytical approach, which can
be used for MANY different types of data analysis, and for MANY
different types of input data.
Copenhagen, February 8, 2001
Best regards,
Harald Martens, prof. chemometrics, NTNU (Norway) and DTU (Denmark),
Harald.Martens@mail.tele.dk
Magni Martens, prof. sensory science, KVL (Denmark),
mma@kvl.dk
|
BACKGROUND:
In this book, the diverse field of QUALITY
ASSESSMENT is used for explaining the chosen data analytical method,
also from the perspective of ISO standards. The inter-disciplinary
examples range from calibration of NIR instruments and fluorescence
process-analysers, via explorative prediction of toxicity from quantum
chemistry computations, sensory and chemical analysis of food quality,
and analysis of various questionnaire data sets, to the confirmative
analysis of effects and their reliabilities ("significances") in
designed micro-biological experiments. As the title indicates, the
book is introductory, and is therefore written with very few mathematical
and statistical hurdles (except in the technical appendices, where
the method is detailed). For particularly critical projects, we
recommend that researchers consult a professional statistician.
But for all other projects, the researcher should analyse his or
her own data, because only the researcher has the necessary contextual
knowledge. Moreover, there are not enough statisticians around,
and many of them are too traditional and too theory-hungry.
There is no such thing as THE BEST data analytical
method. However, to use a method wrongly can be worse than not using
it at all; it can be misleading and it breeds scientific hubris.
Researchers who are busy enough in their own fields may not have
time to learn to master many different statistical methods properly,
particularly if that requires close attention to mathematics and
statistical distribution theory. Therefore, we have strived to develop
a rather complete tool for multivariate data analysis, which efficiently
replaces a long list of different, specialized methods from classical
statistics. The reliability of the results is primarily assessed
by What-You-See-Is-What-You-Get visualisations of model stability,
with little or no need for traditional statistical distribution
theory. We have strived to ensure that the chosen methodology is
well rooted in contemporary statistics (linear models, latent variables
and resampling statistics), because we consider scientific connectivity
to be important. The methodology chosen is no stranger to chemometricians
and sensometricians: interactive, graphically oriented multivariate
soft modelling, based on one single method, namely bi-linear modelling
by weighted PCA/PLS Regression, with reliability assessed by cross-validation
/ jack-knifing. But certain details are new, and the way we explain
it may be new to some.
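As a rough illustration of the core idea (PLS regression whose reliability is judged by cross-validation), here is a minimal sketch in plain Python. It is not the book's implementation: it handles a single response only (PLS1), uses no variable weighting, and validates by simple leave-one-out. The function names and the small data set are invented for this example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pls1_fit(X, y, n_comp):
    """Fit PLS1 by sequential deflation; X is a list of rows, y a list."""
    n = len(X)
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    Xc = [[v - m for v, m in zip(row, x_mean)] for row in X]
    yc = [v - y_mean for v in y]
    comps = []
    for _ in range(n_comp):
        # Weight vector: direction in X-space with maximal covariance with y.
        w = [dot(col, yc) for col in zip(*Xc)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-10:
            break  # nothing left in X that covaries with y
        w = [v / norm for v in w]
        t = [dot(row, w) for row in Xc]             # scores
        tt = dot(t, t)
        p = [dot(col, t) / tt for col in zip(*Xc)]  # X-loadings
        q = dot(yc, t) / tt                         # y-loading
        # Remove the modelled part from X and y before the next component.
        Xc = [[x - ti * pj for x, pj in zip(row, p)] for row, ti in zip(Xc, t)]
        yc = [yi - q * ti for yi, ti in zip(yc, t)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps

def pls1_predict(model, x):
    """Predict y for one new sample x by replaying the deflation steps."""
    x_mean, y_mean, comps = model
    r = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps:
        t = dot(r, w)
        y_hat += q * t
        r = [a - t * b for a, b in zip(r, p)]
    return y_hat

def loo_rmsecv(X, y, n_comp):
    """Leave-one-out cross-validation: root mean squared prediction error."""
    sq_err = 0.0
    for i in range(len(y)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_comp)
        sq_err += (pls1_predict(model, X[i]) - y[i]) ** 2
    return math.sqrt(sq_err / len(y))

# Invented data: the third column is the sum of the first two (collinear),
# and y = 1 + 2*x1 - x2 without noise.
X = [[1.0, 2.0, 3.0], [2.0, 1.0, 3.0], [3.0, 5.0, 8.0],
     [4.0, 3.0, 7.0], [5.0, 6.0, 11.0], [6.0, 4.0, 10.0]]
y = [1.0, 4.0, 2.0, 6.0, 5.0, 9.0]

for n_comp in (0, 1, 2):
    print(n_comp, "components: RMSECV =", round(loo_rmsecv(X, y, n_comp), 4))
```

In this toy case the cross-validated error drops to essentially zero at two components, even though the three X-columns are collinear. Choosing the number of components from the cross-validated error, rather than from distribution theory, is the spirit of the approach described above.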
We primarily focus on using this method for
explorative investigations of various types: Simple unmixing of
known systems, more advanced multivariate calibration of incompletely
understood systems, general interdisciplinary studies by multivariate
regression and by one- or two-block factor analysis to find patterns,
and classification / discrimination to study heterogeneous sample
sets. Confirmative analysis of designed experiments is also given
attention (see the last section, below). The book is written as
a simplified introduction to the more theoretical book, Martens
& Naes: Multivariate Calibration (J.Wiley & Sons Ltd 1989), and
uses the same notation. It is primarily geared towards researchers
and students who see themselves as novices in multivariate data
analysis, in particular those with a mathematics allergy. Most of
the methodology is already available in updated third-party software.
The new book has a fair number of innovations
and filled-in holes, of potential interest to the more method-oriented
reader, e.g.:
- how to distinguish between statistical weights of input variables
  (to optimize parameter estimation) and their graphical weights
  (to simplify interpretation of loading plots etc)
- how to compute correlation loadings, to make bi-linear plots
  scale-invariant
- how to assess a model at different validity levels (repeatability,
  reproducibility, interpolation ability etc) by using different
  cross-validation segmentation schemes
- how to make cost-effective experimental planning, by assessing
  alternative experimental plans w.r.t. statistical power, without any
  difficult distribution theory. (Instead, for each alternative,
  automatically analyse many simulated data just like the future real
  data will be analysed, and look at the histograms of outcomes)
- how to use jack-knifed PLSR perturbations for visual and quantitative
  reliability assessment, instead of the traditional SS-DF-MSE-F-p
  tables in ANOVA
- how to use PLSR for unmixing, i.e. for curve resolution with very few
  calibration data
In the technical appendix section, which details the methodology,
additional new stuff is outlined, e.g.:
- how to stabilize the PLSR numerically, for analysis of designed
  experiments (alternative to ANOVA)
- how to pre-treat data to reduce the effect of a priori known
  uncertainty and irrelevant covariances
- why NIR spectra are so information rich.
The various method innovations
used in the book have recently been studied in more depth in various
chemometrics and sensometrics publications, or are now in the process
of being studied, by ourselves, our PhD students or by others. More
information about the book is available at the publisher's web site
listed above. This site also gives access to all the data used in
the examples, in several different file formats (ASCII, The Unscrambler
and Matlab).
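One of the items listed above, correlation loadings, is easy to illustrate. The sketch below assumes the usual definition: the correlation between each input variable and each score vector. The helper names and the data are invented for this example, and the scores come from a bare-bones power iteration with a fixed number of steps rather than from any particular software package.

```python
import math

def pearson(a, b):
    """Plain Pearson correlation between two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def first_pc_scores(X, n_iter=100):
    """Scores of the first principal component of mean-centred X,
    found by NIPALS-style power iteration (fixed iteration count;
    assumes the first column is not constant)."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    Xc = [[v - m for v, m in zip(row, means)] for row in X]
    t = [row[0] for row in Xc]  # start from the first centred column
    for _ in range(n_iter):
        tt = sum(v * v for v in t)
        p = [sum(x * ti for x, ti in zip(col, t)) / tt for col in zip(*Xc)]
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
        t = [sum(a * b for a, b in zip(row, p)) for row in Xc]
    return t

def correlation_loadings(X, scores):
    """Correlation between every column of X and one score vector."""
    return [pearson(col, scores) for col in zip(*X)]

# Invented data: three variables recorded in very different units.
X = [[1.0, 2000.0, 7.0], [2.0, 4100.0, 3.0], [3.0, 5900.0, 8.0],
     [4.0, 8200.0, 2.0], [5.0, 9900.0, 6.0], [6.0, 12100.0, 4.0]]

scores = first_pc_scores(X)
loads = correlation_loadings(X, scores)

# Express the second variable in other units; keep the scores fixed.
X_rescaled = [[row[0], row[1] / 1000.0, row[2]] for row in X]
loads_rescaled = correlation_loadings(X_rescaled, scores)

print([round(v, 3) for v in loads])
print([round(v, 3) for v in loads_rescaled])
```

The two printed lists agree, and all values lie between -1 and +1, so variables measured in different units can share one plot. Note that the scores are held fixed here to isolate that property. The scores themselves still depend on how the input variables were weighted before modelling.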
ANALYSIS OF EFFECTS BY JACK-KNIFED PLS REGRESSION.
The section of the book on confirmative
analysis is the part with the most new methodology. We here outline
the method, and motivate it for our fellow chemometricians and sensometricians:
The data from controlled experiments (e.g. factorial designs) may
be analysed just like more explorative data sets, by:
1. Formulating the structural model from e.g. ANOVA, MANOVA or
   Covariance analysis as a linear regression problem (nothing new
   in that!)
2. Estimating the linear regression coefficients via bi-linear Principal
   Components by PLS Regression (PLSR), with numerical stabilisation
   of the PLSR (new) and, finally
3. Estimating the uncertainties and "significances" of the linear and
   bi-linear parameters by Procrustes-rotated jack-knifing (new, although
   in line with Herman Wold's early suggestions for PLS path modelling).
This approach is particularly valuable when
there are MANY inter-related responses Y to be related to the design
factors X, because then e.g. the traditional ANOVA output tends
to become overwhelming. Instead, one can now overview the main results
in plots of the first few bi-linear PCs, study detailed quantitative
effects in the linear regression coefficients, study the reliability
of these results by graphical inspection of the jack-knifing perturbations
and the resulting reliability ranges, and even estimate their statistical
"significances".
The "soft/hard modelling" can reveal outliers
and other surprises that in traditional analysis of variance would
cause "non-significance" but otherwise tend to pass unnoticed. Of
course we expect to draw some fire by taking on big brother ANOVA,
who is so well established. And since our method is new, it needs
critical scrutiny. But we are not asking for war, because chemometricians
and sensometricians struggle with many of the same fundamental problems
that classical statisticians are up against: How to obtain and interpret
data from a complex world under indirect, uncertain observation,
given the human tendency for wishful thinking. We just wanted to
develop a simpler, less abstract analysis of effects, applicable
even for designed experiments with multiple responses: one that
combines the excitement of discovery with a critical assessment
of reliability, that gives plots rather than significance tables,
and that does not require an understanding of what a degree of freedom
is. In the final appendix we illustrate that the new method gives
significance estimates quite similar to those from traditional ANOVA,
if so desired.
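The three-step recipe of this section can be sketched for a very small case. The example below is a simplification under stated assumptions, not the procedure from the book: it has one response only, no numerical stabilisation, and no Procrustes rotation (only regression coefficients are jack-knifed, not scores or loadings). It uses the standard delete-one jack-knife formula, which may differ in detail from the variant used in the book. The 2x2 factorial data are invented.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pls1_coefficients(X, y, n_comp):
    """Regression coefficients (for mean-centred X) from a PLS1 fit."""
    n = len(X)
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    Xc = [[v - m for v, m in zip(row, x_mean)] for row in X]
    yc = [v - y_mean for v in y]
    comps = []
    for _ in range(n_comp):
        w = [dot(col, yc) for col in zip(*Xc)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-10:
            break  # remaining X no longer covaries with y
        w = [v / norm for v in w]
        t = [dot(row, w) for row in Xc]
        tt = dot(t, t)
        p = [dot(col, t) / tt for col in zip(*Xc)]
        q = dot(yc, t) / tt
        Xc = [[x - ti * pj for x, pj in zip(row, p)] for row, ti in zip(Xc, t)]
        yc = [yi - q * ti for yi, ti in zip(yc, t)]
        comps.append((w, p, q))
    # The model is linear, so passing each unit vector through the
    # components returns the corresponding regression coefficient.
    coefs = []
    for j in range(len(x_mean)):
        r = [1.0 if k == j else 0.0 for k in range(len(x_mean))]
        bj = 0.0
        for w, p, q in comps:
            t = dot(r, w)
            bj += q * t
            r = [a - t * pk for a, pk in zip(r, p)]
        coefs.append(bj)
    return coefs

def jackknife(X, y, n_comp):
    """Full-data coefficients plus delete-one jack-knife standard errors."""
    n = len(X)
    b_full = pls1_coefficients(X, y, n_comp)
    b_sub = [pls1_coefficients(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_comp)
             for i in range(n)]
    se = []
    for j in range(len(b_full)):
        vals = [b[j] for b in b_sub]
        mean = sum(vals) / n
        se.append(math.sqrt((n - 1) / n * sum((v - mean) ** 2 for v in vals)))
    return b_full, se

# Step 1: a replicated 2x2 factorial design coded as regression columns
# A, B and the interaction A*B (levels -1 / +1). Invented response values.
runs = [(-1, -1), (1, -1), (-1, 1), (1, 1)] * 2
design = [[float(a), float(b), float(a * b)] for a, b in runs]
y = [7.1, 12.8, 7.2, 13.0, 6.9, 13.2, 6.8, 12.9]

# Steps 2 and 3: PLS regression coefficients and their jack-knife uncertainty.
b, se = jackknife(design, y, 3)
for name, bj, sj in zip(("A", "B", "A*B"), b, se):
    print(name, "effect:", round(bj, 3), "jack-knife s.e.:", round(sj, 3))
```

In this toy data set the coefficient for factor A is many times larger than its jack-knife standard error, while those for B and A*B are well within theirs. A plot of the individual delete-one coefficients would show the same picture graphically.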
To request a review copy of the book, please
contact Julia Lampam on (+44) 1243 770668 or e-mail: jlampam@wiley.co.uk
Thank you for your patience.
M & H
|