Algorithms
NIPALS
The NIPALS Algorithm
Computational Results
Properties of the NIPALS Algorithm
SIMPLS
Optimization Criterion
Implications for the Algorithm
The SIMPLS Algorithm
More on VIPs
The Standardize X Option
Determining the Number of Factors
Cross Validation: How JMP Does It
Appendix 2: Simulation Studies
Introduction
The Bias-Variance Tradeoff in PLS
Introduction
Two Simple Examples
Motivation
The Simulation Study
Results and Discussion
Conclusion
Using PLS for Variable Selection
Introduction
Structure of the Study
The Simulation
Computation of Result Measures
Results
Conclusion
Preface
A Word to the Practitioner
Welcome to Discovering Partial Least Squares with JMP. This book introduces you to the exciting area of partial least squares. Partial least squares is a multivariate modeling technique based on the idea of projection, which is also the inspiration for the book’s cover design. You will obtain background understanding and see the technique applied in a number of examples. The book is built around the intuitive and powerful JMP statistical software, which will help you understand and internalize this new topic in a way that reading alone cannot.
Since our goal is to help you apply partial least squares in your own setting, the textual material exists only to build your understanding and confidence as you progress through the worked examples. Although we endeavor to provide the salient details, the area of partial least squares is very broad and this book is necessarily incomplete. To the extent that we cannot cover certain topics fully, we provide references for your further study.
The Organization of the Book
We open with a number of introductory chapters that describe the concepts behind partial least squares and help position it in the wider world of statistical methodology and application. The meat of the book is found in Chapters 5 through 8, which contain four examples. Working through these examples using JMP prepares you to apply partial least squares to your own data. The book also contains two appendixes that provide further statistical details and the results of some simulation studies. Depending on your level and area of interest, you might find these useful.
Required Software
Although a user of standard JMP 11 or later will find this book useful, many examples require JMP Pro 11 or later. Compared to the standard version of JMP, the Pro version is intended for those who require deeper analytical capabilities. In JMP Pro, the implementation of partial least squares is quite complete.
The book uses JMP Pro 11.0 in screenshots, instructions, and discussions. Even though JMP’s PLS capabilities will continue to be developed, the major features and design shown here will persist. However, in future versions, you may notice very slight differences from the specific instruction sequences and screenshots presented in this book.
Ideally, you will have JMP Pro 11 available as you work through this book. A fully functional version of JMP Pro 11 that runs for 30 days can be requested at http://www.jmp.com/webforms/jmp_pro_eval.shtml.
The standard version of JMP enables you to run some partial least squares analyses through a simplified interface. Using this version you will be able to work through some, but not all, of the examples, and many of the scripts linked to in the book will not function correctly. But the book should still help your understanding of partial least squares, and help you decide if you need the Pro version of JMP.
Accessing the Supplementary Content
The data tables and scripts associated with the book can be accessed at either http://support.sas.com/cox or http://support.sas.com/gaudard, either of which provides a single ZIP file. Once you have downloaded the file, unzip its contents to a convenient location on your hard disk. This process creates a master JMP journal file, Discovering Partial Least Squares with JMP.jrn, along with a folder for each chapter containing scripts. Data tables are created by running these scripts using the links in the master journal. The master journal file provides a convenient way to access all of the supplementary content, and the instructions in the text assume that you will use it.
The data tables themselves contain saved scripts that are referred to in the chapters. Often, when working through an example, we show the steps that you can follow to generate a report in JMP. In addition, either parenthetically or directly, we give the name of a script that has been saved to the data table and that generates that same analysis. This way, if you want to see the report without stepping through the selections to create it, you can simply run that script.
The scripts are used to illustrate concepts and to help you develop understanding. Because many of the scripts have an element of randomness built in, it is usually worth running the same script more than once to see the effect over various random choices. Also, be aware that the scripts have been encrypted. If you open one of these scripts directly rather than via the journal file mentioned earlier, you see what appears to be gibberish. Nevertheless, you can right-click within the script window and select Run Script.
1
Introducing Partial Least Squares
Partial Least Squares in Today’s World
Transforming, and Centering and Scaling Data
Modeling in General
Applied