of the flat earth society. He believes that the earth is flat. Obviously, he has to defend his view against the evidence from partially disappearing ships. For if the flat earth theory were true, we would expect to observe just continuous shrinking, but not partial disappearance. However, we do observe partial disappearance. It seems that this observation falsifies the flat earth theory decisively. However, the flat earther has a move left, even if it will strike you as some sort of a parlor trick.
In response to the partially disappearing ship, he might say something along these lines: “Well, this observation is quite compatible with a flat earth. For example, if light didn’t travel in straight lines, but was simply bent slightly toward the earth on its trip from the ship to the observer, then, after a certain distance, the light reflecting off the bottom of the ship would hit the water before it had a chance to reach the eyes of the observer. Thus, the bottom of the ship will disappear from sight before the masts will, even though the earth is flat.”
What this guy is saying sounds outrageous – light being bent? (As we’ll see, it is, as a general idea, not as outrageous as it may first seem, but it couldn’t explain the ship’s partial disappearance.) However, from the perspective of theory falsification, he has a point, namely this one: No hypothesis has observational consequences all by itself. There are always so-called auxiliary hypotheses that need to be in place as well. In our example, the “flat earth” hypothesis together with the claim that light travels in straight lines predicts that a ship sailing away from the observer will shrink continuously and uniformly; it will not partially disappear. If the ship does partially disappear, we can blame either of the two claims: that the earth is flat, or that light travels in straight lines. The falsity of either one would account for the partial disappearance.
You might still resist this move of blaming an auxiliary hypothesis (in our case, the claim that light travels in straight lines); it might strike you as cheating. But there are many other examples where this move is exactly the move to make. Suppose a group of physics majors gets a result in a lab exercise that contradicts some well-established theory. Clearly, we are not immediately going to overthrow the theory. Rather, we’ll blame the students: They didn’t set up the experiment correctly, they misread the measuring instrument, the instrument was broken, or what have you. The last thing we do is take them to have falsified classical mechanics! If we were to go there, all theories in physics, chemistry, and so on would continuously be falsified by legions of students doing lab exercises in colleges all over the US on a daily basis. Let’s not go there.
The general lesson here is this. Theories and hypotheses always rely on auxiliary hypotheses in order to generate observable predictions. This phenomenon is known as confirmation holism. It was first explicitly discussed by the French physicist Pierre Duhem, who said in his book, The Aim and Structure of Physical Theory: “To seek to separate each of the hypotheses of theoretical physics from the other assumptions upon which this science rests, in order to subject it in isolation to the control of observation, is to pursue a chimera.”2 And while Duhem restricted his discussion to physics, it is clear that the underlying point generalizes: If we don’t observe what the main hypothesis predicts, we can blame either the main hypothesis, or one or more of the auxiliary hypotheses. The outcome of the test does not tell us, however, which of the available hypotheses – main or any of the auxiliary ones – we should blame for observed discrepancies. In other words, hypotheses, and especially entire theories, which usually consist of many integrated hypotheses, cannot be conclusively falsified.
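The logical structure of this point can be laid out schematically. What follows is a minimal sketch, not Duhem’s own notation: the symbols H (the main hypothesis), A_1, ..., A_n (the auxiliary hypotheses), and O (the predicted observation) are labels introduced here purely for illustration, and the \therefore symbol assumes the amssymb package.

```latex
\begin{align*}
&(H \land A_1 \land \dots \land A_n) \rightarrow O
  && \text{(the hypotheses jointly predict } O\text{)} \\
&\neg O
  && \text{(the prediction fails)} \\
&\therefore\ \neg (H \land A_1 \land \dots \land A_n)
  && \text{(modus tollens)} \\
&\therefore\ \neg H \lor \neg A_1 \lor \dots \lor \neg A_n
  && \text{(De Morgan)}
\end{align*}
```

In the ship example, H would be the claim that the earth is flat, A_1 the claim that light travels in straight lines, and O the prediction that the ship shrinks uniformly without partially disappearing. The conclusion is only a disjunction: logic tells us that at least one of the claims is false, but not which one.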
3.3 The Demarcation Problem
To be honest, we have simplified Popper’s position to a considerable extent in order to introduce the notion of falsification. His actual view is much more sophisticated. To see this, consider a famous problem that motivated much of early twentieth-century work in the philosophy of science: the so-called demarcation problem, which is the problem of distinguishing empirical science from various forms of pseudoscience. For a group of philosophers known as logical positivists (a.k.a. logical empiricists), pseudoscience included metaphysics. The positivists sought to draw the distinction in terms of meaningfulness, which in turn was assessed by the verifiability of statements. For example, the statement “The absolute is beautiful” sounds meaningful at first glance; after all, it is a grammatically well-formed sentence in English. But what exactly does it say? The positivists thought that if a statement has meaning, you should be able to determine whether it is true or false. If you can’t, the statement is meaningless. Now, is it true that the absolute is beautiful? How would you verify it? If you think that this statement can’t be verified, because you don’t know, e.g., where to find the absolute to see whether it is indeed beautiful, then you should conclude, so the positivists contended, that it is meaningless. Of course, you might find the statement evocative, or it might resonate with you on an aesthetic level, and so have some sort of meaning. But what it lacks, according to the positivists, is cognitive meaning. Real science consists of statements that are cognitively meaningful, i.e., can be verified. This view has become known as the verifiability theory of cognitive meaningfulness.
Unfortunately, the verifiability criterion is too strong and rules out a lot of science. Consider again the claim that all ravens are black. We have seen earlier that to confirm this seems hopeless; verifying it is outright impossible, because you’d have to inspect all ravens past, present, and future. By the verifiability criterion for meaningfulness, the claim that all ravens are black is therefore meaningless. But that seems wrong. While “The absolute is beautiful” is rather obviously meaningless, “All ravens are black” is not. However, both are unverifiable. Thus, verifiability is the wrong criterion for distinguishing the meaningful from the meaningless.
Popper agreed with the positivists on the importance of distinguishing genuine science from pseudoscientific babble, but he didn’t think that taking a detour through a criterion for meaningfulness was the right approach. So he opted for falsification instead. “All ravens are black” can be falsified, while “The absolute is beautiful” can’t be. You can, in principle, find a raven that’s not black, but you can’t find the absolute and see that it’s ugly. Thus, the first statement belongs to science, while the second one doesn’t, whether meaningful or not.
3.3.1 Progressive Modifications
However, we have also seen that due to the role of auxiliary hypotheses in theory testing, conclusive falsification is not possible either. A hypothesis can always be protected against falsification by denying one or more of the auxiliary hypotheses; in principle, one can even deny the observational statement with which the hypothesis has been found inconsistent. In the early 1800s, English chemist William Prout proposed that the atomic weights of the various elements are whole multiples of the atomic weight of hydrogen. It was well known, though, that some elements appear to have weights that are inconsistent with Prout’s hypothesis. Chlorine, for example, was measured to have 35.5 times the weight of hydrogen. Prout remained undeterred and suggested that the chemical processes used to isolate elements were defective, and thus the chlorine sample was impure. In this case, the assumed truth of the main hypothesis was used to criticize, and consequently modify, the then-prevailing experimental techniques that produced the observational statements.
This raises the question of how one should decide what to modify in light of an inconsistency between theory and observational statements: the main hypothesis, one or more of the auxiliary hypotheses, or the observational statements which are based on the experimental techniques producing them? As an answer, Popper proposed that any modifications made should increase the falsifiability of the resulting theory. Since falsifiability is a measure of content – the more falsifiers a theory has, the greater its content – this amounts to the advice to modify in the direction of greater content, by either being broader in scope or more precise.
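Popper’s link between falsifiability and content can be sketched with a little notation. The symbol F(T), standing for the set of potential falsifiers of a theory T, is an assumption introduced here for illustration; the sketch also covers only the simple case in which the falsifier sets of two theories are nested, and the \subsetneq symbol assumes the amssymb package.

```latex
\[
F(T_2) \subseteq F(T_1)
\quad\Longrightarrow\quad
T_1 \text{ is at least as falsifiable as } T_2 ,
\]
\[
F(T_2) \subsetneq F(T_1)
\quad\Longrightarrow\quad
T_1 \text{ has strictly greater content than } T_2 .
\]
```

On this reading, a modification that replaces T_2 with T_1 moves in the direction Popper recommends, whereas a modification that shrinks the set of potential falsifiers does not.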
As an example, consider the theory C that all orbits of celestial bodies are circular. This theory is of broader scope than the theory P that all orbits of planets are circular, since planets are a particular kind of celestial body. Having broader scope, it has more falsifiers. On the other hand, it is also more precise than the theory E that all orbits of celestial