Training Systems Using Python Statistical Modeling

Hypothesis testing for means

We can test the null hypothesis that the population mean (often denoted by the Greek letter μ) is equal to a hypothesized number (denoted by μ₀) against an alternative hypothesis. The alternative states that the population mean is either less than, greater than, or not equal to the hypothesized mean. Again, if we assume that the data was drawn from a normal distribution, we can use t-procedures, namely the t-test. This test also works well for non-normal data when the sample size is large. Unfortunately, there is no stable function in statsmodels for this test; however, we can use the private _tstat_generic() function, available as of version 0.8.0. We may need to hack it a little bit, but it can get us the p value for this test.
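As a brief sketch of how the choice of alternative determines the p value: the symbols x̄ (sample mean), s (sample standard deviation), n (sample size), and T with n − 1 degrees of freedom (a t-distributed random variable) are notation introduced here for illustration, and the data is assumed to be approximately normal.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad
p =
\begin{cases}
P(T_{n-1} \le t) & \text{if } H_A: \mu < \mu_0 \\
P(T_{n-1} \ge t) & \text{if } H_A: \mu > \mu_0 \\
2\,P(T_{n-1} \ge |t|) & \text{if } H_A: \mu \ne \mu_0
\end{cases}
```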

So, the confidence interval that you computed earlier suggests that the resistors this manufacturer is sending your company are not being properly manufactured. In fact, you believe that their resistors have a resistance level that's less than the one specified. So, you'll be testing the following hypotheses:

H₀: μ = 1,000 Ω
Hₐ: μ < 1,000 Ω

The null hypothesis says that the company is telling the truth, so you assume it at the outset. The alternative hypothesis says that the true mean is less than 1,000 Ω. Assuming that the resistance is normally distributed, your test statistic is the one-sample t statistic, t = (x̄ − μ₀)/(s/√n), where x̄ is the sample mean, s is the sample standard deviation, and n is the sample size. We will now perform the hypothesis test using the following steps:

Our first step is to import the _tstat_generic() function, as follows:

Then, we're going to define all the parameters that will be used in the function. These include the mean of the dataset, the mean under the null hypothesis, the standard deviation of the sample mean (that is, the standard error), the degrees of freedom, and the direction of the alternative. This results in the following output:
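A rough sketch of what this step could look like is shown here. The resistance readings are made up purely for illustration (they are not the dataset from this example), and the variable names res, mu0, x_bar, and std_err are our own. The call assumes the signature _tstat_generic(value1, value2, std_diff, dof, alternative, diff=0), which returns the t statistic and the p value.

```python
import numpy as np
from statsmodels.stats.weightstats import _tstat_generic

# Hypothetical resistance readings in ohms (illustrative values only,
# not the data used in the text)
res = np.array([983.0, 991.5, 978.2, 995.1, 986.7,
                989.3, 980.4, 992.8, 984.9, 987.6])

n = len(res)
mu0 = 1000.0                             # mean under the null hypothesis
x_bar = res.mean()                       # sample mean
std_err = res.std(ddof=1) / np.sqrt(n)   # standard error of the sample mean

# alternative="smaller" requests the lower-tail p value (H_A: mu < mu0)
t_stat, p_value = _tstat_generic(
    value1=x_bar,
    value2=mu0,
    std_diff=std_err,
    dof=n - 1,
    alternative="smaller",
)

print("t statistic:", t_stat)
print("p value:", p_value)
```

Passing the standard error rather than the raw standard deviation as std_diff matters here: the function divides the difference in means by std_diff directly, so it has to be on the scale of the sample mean.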

So, we compute the p value, and this p value is minuscule. We therefore reject the null hypothesis: the data gives strong evidence that the mean resistance of the resistors the manufacturer makes is less than 1,000 Ω. In other words, your company is being fleeced by this manufacturer; they're not actually producing quality parts. We can also test whether two populations have the same mean, or whether their means are different in some way.