refute or falsify the hypothesis. If a safe haven strategy does not raise a portfolio's CAGR over time, then the null hypothesis—that the strategy cost‐effectively mitigates the portfolio's risk—does not hold. If it disagrees with experiment, it is wrong—it is not a cost‐effective safe haven strategy. What we cannot do, however, is prove that something is a cost‐effective safe haven strategy. Such is the scientific method.
To illustrate why you can't prove things in reverse, it's important to note that I could not have posed this syllogism instead, as the inverse of our premises: If a strategy does not cost‐effectively mitigate a portfolio's risk, then adding that strategy lowers the portfolio's CAGR over time. That would be deductively invalid; it mistakes a sufficient condition for a necessary condition. Observing that adding the strategy raises the portfolio's CAGR over time actually proves nothing about cost‐effective risk mitigation. This is because there are other ways that the strategy could have raised the portfolio's CAGR; the strategy needn't have even mitigated risk at all, and it may have even added risk. We would need to delve deeper into the source of that outperformance. (As Hemingway said, “Being against evil doesn't make you good.”)
This is equivalent to the fallacy of affirming the consequent: If “O is true,” can we then conclude that “H is true”? Thinking that we can is a mistake often committed even in the physical sciences: “If my theory is correct, then we will observe this data. We observe this data; therefore, my theory is correct.”
In my previous example with Nana, what if I don't have a groundhog problem? Can I proudly claim that my Nana is good at catching groundhogs? After all, there could be myriad other possible reasons that I don't have a groundhog problem. Maybe they were scared off by our resident fox. Or perhaps my son has been playing in our woods with his bow and arrow and ghillie suit.
We would similarly mistake sufficiency for necessity with the fallacy of denying the antecedent. In this case, we might conclude that I have a groundhog problem as a logical consequence of knowing that Nana is not good at hunting groundhogs.
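(For readers who like the logic spelled out, here is a minimal sketch of the one valid move we are relying on, modus tollens, alongside the two fallacies just described. H and O are my own shorthand for the hypothesis and the predicted observation.)

```latex
% H: the hypothesis (e.g., "the strategy cost-effectively mitigates the portfolio's risk")
% O: the predicted observation (e.g., "adding the strategy raises the portfolio's CAGR")
\begin{align*}
&\text{Modus tollens (valid):}              && H \Rightarrow O,\ \neg O\ \therefore\ \neg H\\
&\text{Affirming the consequent (invalid):} && H \Rightarrow O,\ O\ \therefore\ H\\
&\text{Denying the antecedent (invalid):}   && H \Rightarrow O,\ \neg H\ \therefore\ \neg O
\end{align*}
```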
All knowledge is a hypothesis; it is all conjectural and provisional, and it can only ever be falsified, never confirmed.
Now, a critical part of this scientific method is the way we go about choosing the very hypothesis that we are putting to the test. We specifically need to avoid ad hoc hypotheses that simply fit our observations. They are always there, ripe for the picking. We need logical explanations that are independent of and formed prior to those observations.
Scientific knowledge is not just knowing that something is so; it is just as important to know why it is so.
The deductive thinking behind my hypothesis “Nana is good at catching groundhogs” is so important because that hypothesis will hang around as our working hypothesis, of sorts, until I manage to falsify it. Did I have a sound deductive reason to think that her skill, if it exists, would really result in the disappearance of groundhogs? Skilled or not, does she prefer to spend her summer days sleeping indoors? (As a Bernese Mountain Dog, she much prefers snow and air conditioning.) Does she more often chase or get chased by the groundhogs?
DEDUCTIVE DICE
We need a sound, deductive framework for understanding why and how risk mitigation can lead to higher compound growth rates. Is it even possible? How can we expect risk mitigation to be cost‐effective? What's the mechanism behind it? And how might we recognize this ability if or when we see it?
There are so many forces swirling around in investing and markets; any attempts at explaining risk mitigation based on that swirling data will likely get us into trouble. So we will need to figure out, deductively, the mechanism that gives rise to our hypothesis. It won't be enough just to show that a strategy raises wealth and, poof!, it is cost‐effective risk mitigation. We need to understand the forces at work to make that happen.
The best deductive tool at our disposal will be the same deductive tool used from prehistory right up to today to discover and comprehend the general science of probability and the formal idea of risk.
Archaeological excavations have uncovered collections of talus bones, or ankle bones of goats and sheep, dating back to 5,000 BCE. These four‐sided bones were the oldest of all gambling devices. Our more familiar six‐sided “bones” started showing up by 3,000 BCE. Clearly, dice have been embedded in civilization's history as the generators of fate, then chance, and then even skill (in the early precursors of modern backgammon). They were ubiquitous throughout both the depth and breadth of that history and are a part of our collective unconscious.
But it took ages, and the need to better understand wagers on games, for dice to become the intuitive pedagogical tools of deductive inference that would set in motion a theory of probability. As early as the fourth century BCE, Aristotle casually pointed out that, while it is easy to make a couple of lucky rolls of a die, with 10,000 repeated trials, divine intervention be damned, the luck of the die evens out: “The probable is that which for the most part happens.” Imagine, this was revolutionary stuff! But the ancient Greeks and Romans never really got it—they never even bothered to make sure their dice faces were symmetric. If everything was fate anyway, what difference did it make? Julius Caesar's famous line “the die has been cast” was not a statement about probability. (For all the wisdom that we ascribe to the ancients, if you had the ability to go back in time, you would totally clean up gambling against them.)
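(To see Aristotle's point in today's terms, here is a minimal Python sketch; the fair six‐sided die and the 10,000 trials come from the passage above, while the particular seed and sample sizes are just illustrative.)

```python
import random

random.seed(1)  # illustrative seed, purely for reproducibility

# A fair die has expected value 3.5. A couple of lucky rolls can sit far
# from that, but over 10,000 repeated trials the luck of the die evens out.
for n in (5, 100, 10_000):
    rolls = [random.randint(1, 6) for _ in range(n)]
    print(f"average of {n:>6} rolls: {sum(rolls) / n:.3f}")
```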
It wasn't until much later, by the seventeenth century, that Galileo and then Blaise Pascal and Pierre de Fermat became gambling advisor‐mercenaries to various noblemen. For instance, the Chevalier de Méré needed advice on his costly observation that betting with even odds on the appearance of a 6 in four rolls of a single die was profitable in the long run, whereas betting with the same odds on a double‐6 in 24 rolls of two dice was not. (In 1952, the famed New York City gambler “Fat the Butch” rediscovered this same deductive fact in his own rather costly hypothesis test.)
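(The arithmetic behind the Chevalier's expensive lesson is a short worked calculation; the two bets are as described above, and only the rounding is mine.)

```latex
\begin{align*}
P(\text{at least one 6 in 4 rolls of one die}) &= 1 - \left(\tfrac{5}{6}\right)^{4} \approx 0.518 > \tfrac{1}{2}\\[2pt]
P(\text{at least one double-6 in 24 rolls of two dice}) &= 1 - \left(\tfrac{35}{36}\right)^{24} \approx 0.491 < \tfrac{1}{2}
\end{align*}
```

At even odds, the first bet wins slightly more often than it loses and the second slightly less often, which is precisely the pattern that de Méré (and, three centuries later, Fat the Butch) paid to discover.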
At that point, probability was all about deductive reasoning, starting with the known properties of the generator (a die) and then reasoning forward with expectations about its particular outcomes. Repeatability was implicitly a necessary condition for probabilistic inference. This was the frequentist perspective, where the very meaning of probability was the frequency of occurrences over many trials. It is the logic of the gambler with an edge, the logic of the casino. Probability was truly coming of age as a neat trick to allow mathematicians to pick off degenerate gamblers. These were the original quants—money on the line has a way of sparking innovation. (Heck, it took having an options position for me to ever start really thinking about math.)
Of course, we have always understood risk and its mitigation in our bones; that's how we made it this far, after all. But along with advancements in our understanding of probability grew a gradual formalization and sophistication of risk mitigation. And we can think of the growth of that formality first and foremost as the growth of innovations in insurance—which itself would facilitate an explosion in risk taking and innovations. Insurance is an ancient idea, and a key part of the very progress of our civilization. It began as solidarity, as risks were shared—spreading throughout small villages, for instance, as commitments to mutually self‐insure and share the replacement costs of homes within the community. This aggregation of individual risks created a frequentist's perspective where there otherwise was none, effectively expanding an individual's sample size from 1 to the size of their community.
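(A minimal way to see how that pooling manufactures a frequentist's perspective: suppose, purely for illustration, that each of N households faces an independent loss with the same mean and the same standard deviation. Then each household's share of the pooled losses carries the same expected cost, but its uncertainty shrinks with the size of the community.)

```latex
% L_i: household i's loss, assumed independent with mean \mu and standard deviation \sigma
\bar{L} = \frac{1}{N}\sum_{i=1}^{N} L_i,
\qquad
\mathbb{E}[\bar{L}] = \mu,
\qquad
\operatorname{sd}(\bar{L}) = \frac{\sigma}{\sqrt{N}}
```

Real villages only approximate that independence, of course, but this is the sense in which mutual insurance turns a sample of 1 into a sample the size of the community.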
Fast forward to the twentieth century, and an epic nerd skirmish would erupt over this perspective. The newly founded Bayesian school and Popper's propensity theory—for which probability instead meant “degrees of belief” and “tendencies,” respectively—went head‐to‐head with the simpleton frequentists. And it doesn't really matter who was right. All that matters is which perspective is being used. When your sample size is small, and worse yet unique and unrepeatable, no matter your subjective probabilities, there is so much noise in your sample that you can hardly know anything anyway. Your N equals 1. You are a punter,