Daniel J. Denis

Applied Univariate, Bivariate, and Multivariate Statistics Using Python



However, without a proper decision analysis of the risks and probabilities beforehand, the quality of a decision should not be judged by whether its outcome was lucky or unlucky. For instance, if experts in the area had assessed the probability of finding weapons of mass destruction (and of their being used) at 0.99, then the logic of the decision to go to war may have been a good one. The outcome of not finding such weapons, in the sense we are discussing, was simply an “unlucky” outcome. The decision, however, may have been correct. If instead the decision analysis had revealed a low probability that such weapons existed or would be used, then regardless of the outcome, the actual decision would have been a poor one.

       Consider the decision in 2020 to essentially shut down the US economy in the month of March due to the spread of COVID-19. Was it a good decision? The decision should not be evaluated based on the outcome of the spread or the degree to which it affected people’s lives. The decision should be evaluated on the principles and logic that went into the decision beforehand. Whether the outcome was lucky or unlucky is a separate matter from the decision that was made. Likewise, the decision to purchase a stock that then loses all of its value cannot be judged on the outcome of oil dropping to negative prices during the pandemic. It must instead be evaluated on the decision-making criteria that went into the decision. You may have purchased a great stock prior to the pandemic, but got an extremely unlucky and improbable outcome when the oil crash hit.

       Consider the launch of SpaceX in May of 2020, returning Americans to space. On the day of the launch, there was a slight chance of lightning in the area, but the risk was low enough to go ahead with the launch. Had lightning occurred and adversely affected the mission, it would not have meant that a poor decision was made. What it would have indicated above all else is that an unlucky outcome occurred. There is always a measure of risk tolerance in any event such as this. The goal in decision-making is generally to calibrate such risk and minimize it to an acceptable and sufficient degree.

      1.3 Quantifying Error Rates in Decision-Making: Type I and Type II Errors

      As discussed thus far, decision-making is risky business. Virtually all decisions are made with at least some degree of risk of being wrong. How that risk is distributed and calibrated, and the costs of making the wrong decision, are the components that must be considered before making the decision. For example, again with the coin, if we start out assuming the coin is fair (null hypothesis), then reject that hypothesis after obtaining a large number of heads out of 100 flips, though the decision is logical, reality itself may not agree with our decision. That is, the coin may, in reality, be fair. We may simply have observed a string of heads that is due to chance fluctuation. Now, how are we ever to know if the coin is fair or not? That’s a difficult question, since according to frequentist probabilists, we would literally need to flip the coin forever to get the true probability of heads. Since we cannot study an infinite population of coin flips, we are always restricted to betting based on the sample, and hoping our bet gets us a lucky outcome.
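To make the coin example concrete, the following minimal sketch computes how likely a fair coin is to produce at least a given number of heads in 100 flips. The function name `prob_at_least` and the choice of 60 observed heads are illustrative assumptions, not values taken from the text.

```python
from math import comb

def prob_at_least(k, n=100, p=0.5):
    """Probability of observing k or more heads in n independent flips
    when each flip lands heads with probability p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical observation: 60 heads out of 100 flips of a supposedly fair coin.
p_value = prob_at_least(60)
print(f"P(at least 60 heads | fair coin) = {p_value:.4f}")
```

A small probability here makes rejecting the null hypothesis a logical bet, but it does not rule out that the coin is fair and the sample was simply an unlikely one.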

      What may be most surprising to those unfamiliar with statistical inference is that, quite remarkably, statistical inference in science operates on the same philosophical principles as games of chance in Vegas! Science is a gamble and all decisions have error rates. Again, consider the idea of a potential treatment being advanced for COVID-19 in 2020, the year of the pandemic. Does the treatment work? We hope so, but if it does not, what are the risks of it not working? With every decision, there are error rates, and error rates also imply potential opportunity costs. Good decisions are made with an awareness of the benefits of being correct or the costs of being wrong. Beyond that, we roll the proverbial dice and see what happens.

      However, error rates go both ways. Researchers often wish to minimize the risk of a type I error (falsely rejecting a true null hypothesis), often ignoring the type II error rate. A type II error is failing to reject a false null hypothesis. For our COVID-19 example, this would essentially mean failing to detect that a treatment is effective when in fact it is effective and could potentially save lives. If in reality the null hypothesis is false, yet through our statistical test we fail to detect its falsity, then we could potentially be missing out on a treatment that is effective. So-called “experimental treatments” for a disease (i.e. the “right to try”) are often well-attuned to the risk of making type II errors. That is, the risk of not acting, even on something that has a relatively small probability of working out, may be high, because if it does work out, then the benefits could be substantial.

       Virtually all decisions involve a certain degree of risk. A classical hypothesis test involves two error rates. The first is a type I error, which is a false rejection of the null hypothesis. When the null hypothesis is true, the probability of making a type I error is equal to the significance level set for the test. The second is a type II error, which is failing to reject a false null hypothesis.
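The two error rates can be computed exactly for the coin example. The sketch below assumes a one-sided test on 100 flips at a significance level of 0.05, and an alternative in which the coin's true probability of heads is 0.6. These numbers, and the helper names `binom_pmf` and `upper_tail`, are assumptions chosen for illustration only.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k heads in n flips with heads probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(k, n, p):
    """Probability of k or more heads in n flips."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

n = 100            # number of flips
alpha_max = 0.05   # assumed significance level
p_null = 0.5       # null hypothesis: the coin is fair
p_alt = 0.6        # assumed true bias, used only to illustrate a type II error

# Smallest number of heads at which we reject the null (one-sided test).
critical = next(k for k in range(n + 1) if upper_tail(k, n, p_null) <= alpha_max)

type_1 = upper_tail(critical, n, p_null)     # reject although the coin is fair
type_2 = 1 - upper_tail(critical, n, p_alt)  # fail to reject although p = 0.6

print(f"reject if heads >= {critical}")
print(f"type I error rate  = {type_1:.3f}")
print(f"type II error rate = {type_2:.3f}")
```

Because the number of heads is discrete, the achieved type I error rate in this sketch is at most the chosen significance level rather than exactly equal to it. The type II error rate depends entirely on the assumed alternative: a coin closer to fair would be harder to detect.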

      1.4 Estimation of Parameters

      Estimation in statistics usually operates by one of two types. Point estimation involves estimating a precise value of