Cambridge University Press. 2013. — 400 pages.
ISBN: 0521887933.
'Big data' poses challenges that require both classical multivariate methods and contemporary techniques from machine learning and engineering. This modern text equips you for the new world by integrating the old and the new, fusing theory and practice, and bridging the gap to statistical learning. The theoretical framework includes formal statements that set out clearly the guaranteed 'safe operating zone' for the methods and allow you to assess whether data is in the zone, or near enough. Extensive examples showcase the strengths and limitations of different methods with small classical data, data from medicine, biology, marketing and finance, high-dimensional data from bioinformatics, functional data from proteomics, and simulated data. High-dimension low-sample-size data gets special attention. Several data sets are revisited repeatedly to allow comparison of methods. Generous use of colour, algorithms, MATLAB code, and problem sets completes the package. Suitable for master's/graduate students in statistics and researchers in data-rich disciplines.
Classical Methods
Multidimensional data
Principal component analysis
Canonical correlation analysis
Discriminant analysis
Factors and Groupings
Norms, proximities, features, and dualities
Cluster analysis
Factor analysis
Multidimensional scaling
Non-Gaussian Analysis
Towards non-Gaussianity
Independent component analysis
Projection pursuit
Kernel and more independent component methods
Feature selection and principal component analysis revisited