Training Systems Using Python Statistical Modeling

Credible intervals for means

Computing a credible interval for the mean follows the same procedure as for proportions, except that we work with the marginal distribution of the unknown mean alone, taken from the joint posterior distribution.

Let's repeat a context that we used in the Computing confidence intervals for means section of this chapter. You are employed by a company that's fabricating chips and other electronic components. The company wants you to investigate the resistors it's using to produce its components. These resistors are manufactured by an outside company and have been specified as having a particular resistance. Your employer wants you to ensure that the resistors being produced and sent to it are high-quality products; specifically, that resistors labeled with a resistance level of 1,000 Ω do in fact have a resistance of 1,000 Ω. So, let's get started, using the following steps:

We will use the same dataset as we did in the Computing confidence intervals for means section.

Now, we're going to use the normal-inverse-gamma distribution NIG(1, 1, 1/2, 0.0005) as our prior distribution. You can compute the parameters of the posterior distribution using the following code:
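The conjugate update for a normal-inverse-gamma prior can be sketched as follows. This is a minimal sketch, not necessarily the code used in this section: it assumes the four prior parameters are ordered as (mean, number of prior observations, shape, scale), and the function name `nig_posterior` and the `sample` readings are hypothetical, made up for illustration rather than taken from the chapter's dataset.

```python
from statistics import fmean


def nig_posterior(data, mu0, nu0, alpha0, beta0):
    """Conjugate update of an NIG(mu0, nu0, alpha0, beta0) prior.

    Assumed parameter order: prior mean, prior observation count,
    inverse-gamma shape, inverse-gamma scale.
    """
    n = len(data)
    xbar = fmean(data)
    ss = sum((x - xbar) ** 2 for x in data)  # sum of squared deviations

    nu_n = nu0 + n                            # prior count plus sample size
    mu_n = (nu0 * mu0 + n * xbar) / nu_n      # weighted average of prior mean and sample mean
    alpha_n = alpha0 + n / 2
    beta_n = beta0 + 0.5 * ss + (n * nu0 / nu_n) * (xbar - mu0) ** 2 / 2
    return mu_n, nu_n, alpha_n, beta_n


# Hypothetical resistance readings, NOT the chapter's dataset
sample = [0.984, 0.991, 0.988, 0.993, 0.989]
mu_n, nu_n, alpha_n, beta_n = nig_posterior(sample, 1, 1, 1 / 2, 0.0005)
print(mu_n, nu_n, alpha_n, beta_n)
```

Note that the second posterior parameter is simply the prior count plus the sample size, which is why it can be read as the number of observations being used as evidence.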

When the parameters of the distribution are computed, it results in the following output:

It looks as if the mean has moved away from its prior value, and 105 observations now serve as your evidence.

Now, let's visualize the prior and posterior distribution—specifically, their marginal distributions:

Blue represents the prior distribution, and red represents the posterior distribution. The prior distribution was largely uninformative about the true resistance, while the posterior distribution concentrates tightly around 0.99 (in the data's units, where 1 corresponds to the labeled 1,000 Ω).
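Under a normal-inverse-gamma distribution NIG(μ, ν, α, β), the marginal distribution of the mean is a Student's t distribution with 2α degrees of freedom, location μ, and scale √(β / (αν)). The following sketch builds both marginals with `scipy.stats.t` and evaluates their densities on a grid, ready to be passed to a plotting function. The helper name `nig_marginal_mu` and the posterior parameters are hypothetical placeholders, not the values produced in this section.

```python
import numpy as np
from scipy.stats import t


def nig_marginal_mu(mu, nu, alpha, beta):
    """Marginal of the mean under NIG(mu, nu, alpha, beta): a shifted, scaled Student's t."""
    return t(df=2 * alpha, loc=mu, scale=np.sqrt(beta / (alpha * nu)))


# Prior from the text: NIG(1, 1, 1/2, 0.0005)
prior = nig_marginal_mu(1, 1, 1 / 2, 0.0005)

# Hypothetical posterior parameters, for illustration only
posterior = nig_marginal_mu(0.99, 100, 50.0, 0.005)

# Densities on a grid of candidate mean resistances
grid = np.linspace(0.9, 1.1, 401)
prior_density = prior.pdf(grid)
posterior_density = posterior.pdf(grid)

# Compare how spread out the two marginals are
print(prior.interval(0.95))
print(posterior.interval(0.95))
```

With α = 1/2, the prior marginal has a single degree of freedom, so it is a heavy-tailed Cauchy distribution. That is one way to see why the prior says so little about where the true resistance lies.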

Now, let's use this to compute a 95% credible interval for the mean, μ. I have written a function that does this for you: you feed it the data and the parameters of the prior distribution, and it returns a credible interval with the specified level of credibility. Let's run this function as follows:
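A function of this kind might look like the following minimal sketch, which is not necessarily the implementation used in this section. It assumes the standard conjugate update for a normal-inverse-gamma prior and returns an equal-tailed interval from the Student's t marginal of the mean. The name `mean_credible_interval`, its argument order, and the `sample` readings are hypothetical.

```python
import numpy as np
from scipy.stats import t


def mean_credible_interval(data, mu0, nu0, alpha0, beta0, cred=0.95):
    """Equal-tailed credible interval for the mean under an NIG prior (sketch)."""
    data = np.asarray(data, dtype=float)
    n = data.size
    xbar = data.mean()

    # Conjugate update of the NIG parameters
    nu_n = nu0 + n
    mu_n = (nu0 * mu0 + n * xbar) / nu_n
    alpha_n = alpha0 + n / 2
    beta_n = (beta0 + 0.5 * ((data - xbar) ** 2).sum()
              + (n * nu0 / nu_n) * (xbar - mu0) ** 2 / 2)

    # The marginal posterior of the mean is a shifted, scaled Student's t
    scale = np.sqrt(beta_n / (alpha_n * nu_n))
    tail = (1 - cred) / 2
    lower, upper = t.ppf([tail, 1 - tail], 2 * alpha_n, loc=mu_n, scale=scale)
    return float(lower), float(upper)


# Hypothetical resistance readings, NOT the chapter's dataset
sample = [0.984, 0.991, 0.988, 0.993, 0.989]
lo, hi = mean_credible_interval(sample, 1, 1, 1 / 2, 0.0005, cred=0.95)
print(lo, hi)
```

Because the t distribution is symmetric, the equal-tailed interval is centered on the posterior mean, and raising `cred` widens it.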

Now, let's compute the credible interval:

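The 95% credible interval runs from 0.9877 to 0.9919, so, given the prior and the data, there's a 95% chance that the true mean resistance lies in this range. Notice that 1, the value corresponding to the labeled resistance of 1,000 Ω, is not in this credible interval, which suggests that the resistors fall slightly short of their labeled resistance.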