Robert Carver

Practical Data Analysis with JMP, Third Edition



Several chapters include topics that some instructors might view as “advanced”—typically when the output from JMP makes it a natural extension of a more elementary topic. This is one way in which software can redefine the boundaries of introductory statistics.

      Third, nearly all the data sets in the book are real and are drawn from those disciplines whose practitioners are the primary users of JMP software. Inasmuch as most undergraduate programs now require coursework in statistics, the examples span major areas in which statistical analysis is an important path to knowledge. Those areas include engineering, life sciences, business, and economics.

      Fourth, each chapter invites students to practice the habits of thought that are essential to statistical reasoning. Long after readers forget the details of a particular procedure or the options available in a specific JMP analysis platform, this book may continue to resonate with valuable lessons about variability, uncertainty, and the logic of inference.

      Each chapter concludes with a set of “Application Scenarios,” which lay out a problem-solving or investigative context that is in turn supported by a data table. Each scenario includes a set of questions that implicitly require the application of the techniques and concepts presented in the chapter.

      New in the Third Edition

      This edition preserves much of the content and approach of the earlier editions, while updating examples and introducing new JMP features. As in the second edition, there are three review chapters (Chapters 5, 9, and 17) that pause to recap concepts and techniques. One of the perennial challenges in learning statistics is that it is easy to lose sight of major themes as a course progresses through a series of seemingly disconnected techniques and topics. Some readers should find the review chapters to be helpful in this respect. The review chapters share a single large data set of World Development Indicators, published by the World Bank.

      The scope and sequence of chapters are essentially the same as in the prior edition. There is new material about the importance of documenting one’s work with an eye toward reproducibility of analyses, as well as production of presentation-ready reporting. The second edition was based on JMP 11; since then, platforms have been added or modified, and some functionality has been relocated in the menu system. This edition captures those changes.

      Some of the updated data tables are considerably larger than their counterparts in earlier editions. This creates the opportunity to demonstrate methods for meaningful graphs when data density and overplotting become issues. I also use some of the larger data tables to introduce machine learning practices like partitioning a data set into training and validation sets.

      JMP Projects are introduced in Chapter 2 and used throughout the book. Projects are a way to organize, preserve, and document multiple analyses using multiple data tables. They naturally support a logical and reproducible workflow. Using projects is a way for newcomers to establish good habits and for JMP veterans to be more efficient.

      Other additions and amendments include:

      ● Early introduction of more data types, Header Graphs, and JMP Public.

      ● Expanded use of Subset, Global and Local Data Filters, and Animate. In the prior editions, for example, the set of data tables included some subsets of larger tables. Because data preparation is such an important part of the analytical cycle, readers now learn to perform filtering and subsetting functions on their own.

      ● The Recode command has evolved since JMP 11, as have the lessons using Recode. Readers will learn why and how to recode a column.

      ● In the Regression chapters, coverage of the Profiler has expanded, and I have added the Partition Platform to the discussion of variable selection. The Fit Curve platform also makes its first appearance, as do temporary variable transformations.

      ● For JMP Pro users, there is a brief treatment of the Formula Depot to facilitate comparison of models.

      ● In Chapter 21 on Design of Experiments, we meet Definitive Screening Designs.

      ● In Chapter 22, Variability Charts have been added.

      ● Simulators and calculators supplied as JSL scripts in earlier editions have been bundled among JMP’s teaching demonstrations in the Help system. The text now reflects this very useful change.

      Is This Book for You?

      Intended Audience

      This book is intended to supplement an introductory college-level statistics course with real investigations of some important and engaging problems. Each chapter presents a set of self-paced exercises to help students learn the skills of quantitative reasoning by performing the types of analyses that typically form the core of a first course in applied statistics. Students can learn and practice the software skills outside of class. Instructors can devote class time to statistics and statistical reasoning, rather than to rudimentary software instruction. Both students and teachers can direct their energies to the practice of data analysis in ways that inform students’ understanding of the world through investigations of problems that matter in various fields of study.

      Though the book was written with undergraduate and beginning graduate students in mind, some practitioners might find it helpful on the job and are well-advised to read it selectively to address current tasks or projects. Chapters 1 and 2 form a good starting point before reading later sections. Appendix B (online for this edition) covers several data management topics that might be helpful for readers who undertake projects involving disparate data sources.

      Prerequisites

      No prior statistical knowledge is presumed. A basic grounding in algebra and some familiarity with the Mac OS or Windows environment are all you need in advance. An open, curious mind is also helpful.

      A Message for Instructors

      I assume that most teachers view class time as a scarce resource. One of my goals in writing this book was to strive for clarity throughout so that students can be expected to work through the book on their own and learn through their encounters with the examples and exercises. This book may be especially welcome for instructors using an inverted, or flipped, classroom approach.

      Instructors might selectively use exercises as in-class demonstrations or group activities, interspersing instruction or discussion with computer work. More often, the chapters and scenarios can serve as homework exercises or assignments, either to prepare for other work, to acquire skills and understanding, or to demonstrate progress and mastery. Finally, some instructors might want to assign a chapter in connection with an independent analysis project. Several of the data tables contain additional variables that are not used within chapters. These variables might form the basis for original analyses or explorations.

      The bibliography may also aid instructors seeking additional data sources or background material for exercises and assignments. Tips for classroom use of JMP are also available at the book’s website, accessible through the author’s page at support.sas.com/carver.

      A Message for Students

      Remember that the primary goal of this book is to help you understand the concepts and techniques of statistical analysis. JMP provides an ideal software environment to do just that. Naturally, each chapter is “about” the software, and at times you will find yourself focusing on the details of a JMP analysis platform and its options. If you become entangled in the specifics of a problem, step back and try to refocus on the main statistical ideas rather than software issues.

      This book should augment, but not replace, your primary textbook or your classroom time. To get the maximum benefit from the book, work mindfully and carefully. Read through a chapter before you sit down at the computer. Each chapter will require approximately 30 minutes of computer time; work at your own pace and take your time. Remember that variability is omnipresent, so expect that the time you need to complete a chapter may be more or less than 30 minutes.

      The Application Scenarios at the end of each chapter are designed to reinforce and extend what you have learned in the chapter. The questions in this section are designed to challenge you. Sometimes, it is obvious how to proceed with your analysis; sometimes, you will need to think a