Robert J. Moffat

Planning and Executing Credible Experiments



2 https://quoteinvestigator.com/2014/03/29/sharp‐axe asserts no evidence of Lincoln writing this. Having grown up on a small farm with one chore of clearing hundreds of pines out of our pasture, I (RH) can attest to its advice. Would rail‐splitter Lincoln not agree?

      Richard Feynman

      Science, medicine, and engineering depend heavily on experiments. Business does too, where product design and marketing experiments must account for the added complexity of human nature and taste. In every case in any field, an experiment must be credible or it wastes time and resources.

      Engineering problems are solved using three tools: insight, analysis, and experiment. Usually all three are brought to bear on any given problem, each complementing the others. A sudden insight or inspiration may be enough to suggest what must be done next, whether a new analysis or a new experiment, but actually getting an answer takes work.

      This book is about the strategy and tactics of experimental work – the techniques whereby one plans and executes an experiment which insight or inspiration has suggested. Our strategies and tactics are a superset which includes the concepts of design of experiments (DoE) and extends beyond.

      Have you heard how designed experiments improve results while reducing the amount of data needed? In our experience, few engineering students have learned DoE in their undergraduate labs, so we are including DoE concepts. Many DoE techniques were pioneered by Ronald Fisher (1890–1962) for agricultural experiments in the 1920s and 1930s. As the advantages of DoE become known, we all benefit.

      The purpose of an experiment is to get provably accurate, relevant, and credible data – data that are reliable enough to serve as the basis for answering questions and making decisions. Most experiments arise out of questions which must be answered, such as “Does this device behave the way it is supposed to?” or “How much cooling do we need?” or “What is the relationship between X and Y?” or “Which of these designs performs best?” In many cases, such questions lead to experiments, and the data from those experiments lead to decisions.

      The three key words are “accurate,” “relevant,” and “credible.” One point to remember is that when the work is all done, it will be your signature on the report – and that report will be around for a long time! It is not enough for you to be personally convinced that your results are accurate. You must be able to establish the credibility of the work “beyond a reasonable doubt” or at least well enough so that a prudent engineer would be willing to accept your results as valid when you are no longer around to answer his or her questions.

      The process of establishing credibility begins with the experiment plan and only ends when the results have been presented in such form that they can easily be understood. The experiment plan must make provision for the appropriate checks and balances: baseline checking, repeatability tests, and the other diagnostics which guard against error. Showing agreement with a baseline dataset is one of the most convincing pieces of evidence that can be offered to support the credibility of an experiment. The data presentation must include a quantitative description of the residual uncertainty in the results.
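
      As a rough illustration of what a quantitative description of residual uncertainty and a baseline check might look like, the sketch below summarizes a set of repeated readings and compares their mean with a baseline value. The readings, the baseline numbers, the function names, and the coverage factor k = 2 are all assumptions made for this example; they are not taken from the text.

```python
import math
import statistics


def summarize_repeats(readings):
    """Return (mean, sample standard deviation, standard uncertainty of the mean)."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings to estimate scatter")
    mean = statistics.mean(readings)
    s = statistics.stdev(readings)   # sample standard deviation (n - 1 in the denominator)
    u_mean = s / math.sqrt(n)        # standard uncertainty of the mean
    return mean, s, u_mean


def agrees_with_baseline(mean, u_mean, baseline, u_baseline, k=2.0):
    """True if |mean - baseline| lies within k combined standard uncertainties.

    k = 2 is an assumed coverage factor, chosen only for illustration.
    """
    u_combined = math.sqrt(u_mean ** 2 + u_baseline ** 2)
    return abs(mean - baseline) <= k * u_combined


# Hypothetical repeatability test: five readings at one nominal set point.
readings = [10.1, 9.9, 10.0, 10.2, 9.8]
mean, s, u_mean = summarize_repeats(readings)
print(f"mean = {mean:.3f}, s = {s:.3f}, u(mean) = {u_mean:.3f}")

# Hypothetical baseline value and its standard uncertainty.
print(agrees_with_baseline(mean, u_mean, baseline=10.05, u_baseline=0.05))
```

      This treats only the random scatter of the readings. Fixed (bias) errors need separate treatment and are not captured by repeating the same measurement.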

      In large measure, that is what this book is all about: designing and executing experiments for credibility. Before we get to the main issue, however, there are some key points to consider about experiments in general.

      A quantitative property can be measured. A categorical property can be recorded but not measured.

      The act of measurement is an ordering in a scalar system involving a “less than, equal to, or greater than” test. We assign a value to the measurand by comparing it with a standard interval and counting the number of intervals equal to the measurand. The only attributes of any system that can be measured are those which can be put into one‐to‐one correspondence with points on the real number line. Since only the real number system has the order property, only real numbers (scalars) can be measured.
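
      The description above can be sketched in a few lines: the only primitive allowed is the "less than, equal to, or greater than" test, and the value is obtained by counting standard intervals. The function names and the use of exact fractions are choices made for this illustration, not part of the text.

```python
from fractions import Fraction


def compare(a, b):
    """The only primitive used: -1 if a < b, 0 if a == b, +1 if a > b."""
    return (a > b) - (a < b)


def measure(measurand, standard):
    """Count the whole standard intervals that fit within the measurand.

    Returns (count, exact), where exact is True when `count` intervals
    equal the measurand and False when a remainder is left over.
    """
    if compare(standard, 0) <= 0:
        raise ValueError("the standard interval must be positive")
    if compare(measurand, 0) < 0:
        raise ValueError("this sketch handles non-negative measurands only")
    count = 0
    # Lay down one more interval for as long as the total does not exceed the measurand.
    while compare((count + 1) * standard, measurand) <= 0:
        count += 1
    exact = compare(count * standard, measurand) == 0
    return count, exact


# Hypothetical lengths, expressed as exact fractions of some unit.
print(measure(Fraction(7, 2), Fraction(1, 2)))    # (7, True)
print(measure(Fraction(10, 3), Fraction(1, 2)))   # (6, False)
```

      When the count is not exact, a smaller standard interval narrows the remainder, which is one way to picture the resolution of an instrument.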

      2.2.1 Examples Not Measurable

      Only the simplest attributes of systems can be directly measured – the rest are inferred.

      Many times people seem to recognize this problem but don't know how to describe their malaise. They want information but find themselves talking about measurements. For example, a former governor of Mississippi has been quoted as saying (and I paraphrase), “When putting money in education, everyone wants to see some measurable return for the money. Yet it is the one area that has the greatest degree of intangibles.” There is no measurable attribute of education except a purely artificial one: “test scores.” Teachers know that test scores don’t measure educational achievement. But when people insist on measurements, they will get measurements.

      A great deal of the information we use in daily decision‐making is nonscalar and, therefore, intrinsically not measurable. For example, we cannot measure the appearance of a face, the sound of a voice, or the taste of tomato soup, and yet with no difficulty at all, we greet our friends, recognize their voices, and enjoy our dinners. The information transfer by sight, hearing, and taste represents very complex information handling using arrays of scalars and correlations between pairs of scalars (temporal and spatial). No instrumentation system can do as sophisticated a job of pattern recognition as the human eye/mind combination, or of frequency analysis/correlation as the ear/mind combination, or of chemical analysis as the taste‐bud/mind combination. We are not denying the improving capabilities of neural networks, wavelet transformations, or AI deep learning – we are just marveling.

      2.2.2 Shapes

      Shape cannot be measured – not even simple shapes, such as circles. Simple shapes can be described by names that we all understand by experience, but they cannot be measured. For example, a “circle” is defined as the locus of points lying in a plane and at the same distance from a common point, called “the center.” Given that definition and a value for the radius, you can draw a circle and look at it, and you know exactly what was meant – but that does not constitute measuring the shape of the circle. The shape information was conveyed using the reserved word “circle”; only the size was described by the radius and location by the center.
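
      One way to picture this point is the sketch below, in which the shape lives entirely in a named rule (a hypothetical function called circle_points) while the only numbers supplied are the center and the radius. The function name, the sample values, and the number of points are assumptions made for illustration.

```python
import math


def circle_points(center, radius, n=8):
    """Return n points on a circle.

    The rule inside this function encodes the shape ("same distance from a
    common point"); the numeric arguments carry only location and size.
    """
    cx, cy = center
    return [
        (cx + radius * math.cos(2 * math.pi * k / n),
         cy + radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


# Hypothetical circle: center (1, 2), radius 3.
points = circle_points(center=(1.0, 2.0), radius=3.0)

# Every generated point lies at the same distance from the center.
distances = [math.hypot(x - 1.0, y - 2.0) for x, y in points]
print(all(math.isclose(d, 3.0) for d in distances))
```

      Changing the radius or the center changes the numbers but never the shape; changing the shape means changing the rule, which is named rather than measured.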