Conditional probability
Conditional probabilities are useful when the occurrence of one event affects the probability of another. If we have two events, A and B, where B has occurred and we want to find the probability of A occurring, we write this as follows:
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
Here, $P(B) > 0$.
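To make the definition concrete, here is a minimal sketch in Python that computes $P(A|B)$ by counting outcomes on a fair six-sided die; the particular events chosen are illustrative assumptions, not taken from the text:

```python
from fractions import Fraction

# Sample space for one fair six-sided die (an illustrative assumption).
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    # Probability of an event under the uniform distribution on omega.
    return Fraction(len(event & omega), len(omega))

def cond_prob(a, b):
    # P(A|B) = P(A intersect B) / P(B); only defined when P(B) > 0.
    assert prob(b) > 0
    return prob(a & b) / prob(b)

A = {2, 4, 6}  # the roll is even
B = {4, 5, 6}  # the roll is greater than 3

print(cond_prob(A, B))  # 2/3
```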
However, if the two events, A and B, are independent, then we have the following:
$$P(A|B) = P(A)$$
Additionally, if $P(A|B) > P(A)$, then B is said to attract A. However, if A attracts $B^C$, then A repels B.
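The following sketch illustrates attraction with two fair dice; the choice of events (first die shows 6, the dice sum to 8) is an assumption made for illustration:

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair dice (an illustrative assumption).
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(len(event), len(omega))

A = {w for w in omega if w[0] == 6}    # first die shows 6
B = {w for w in omega if sum(w) == 8}  # the dice sum to 8

p_a = prob(A)                        # 1/6
p_a_given_b = prob(A & B) / prob(B)  # 1/5

# Since P(A|B) > P(A), B attracts A; conditioning on B^C reverses this.
print(p_a_given_b > p_a)  # True
```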
The following are some of the axioms of conditional probability:
- $P(A|B) \geq 0$.
- $P(B|B) = 1$.
- If $A_1, A_2, \ldots$ are disjoint events, then $P\left(\bigcup_i A_i \mid B\right) = \sum_i P(A_i|B)$.
- $P(\cdot|B)$ is a probability function that works only for subsets of B.
- $P(A|B) = P(A \cap B|B)$.
- If $A \subset B$, then $P(A|B) = \frac{P(A)}{P(B)}$.
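As a quick sanity check, the sketch below verifies three of these properties numerically; the sample space (a fair die) and the events are assumptions chosen for illustration:

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # one fair die (an assumption)

def prob(e):
    return Fraction(len(e), len(omega))

def cond(a, b):
    return prob(a & b) / prob(b)

B = {1, 2, 3, 4}
A1, A2 = {1}, {2}  # disjoint subsets of B

assert cond(B, B) == 1                                # P(B|B) = 1
assert cond(A1 | A2, B) == cond(A1, B) + cond(A2, B)  # additivity
assert cond(A1, B) == prob(A1) / prob(B)              # A1 is a subset of B
print("all three checks pass")
```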
The following equation is known as Bayes' rule:
$$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$$
This can also be written as follows:
$$P(A|B) = \frac{P(B|A)P(A)}{P(B|A)P(A) + P(B|A^C)P(A^C)}$$
Here, we have the following:
- $P(A)$ is called the prior.
- $P(A|B)$ is the posterior.
- $P(B|A)$ is the likelihood.
- $P(B)$ acts as a normalizing constant.
$$\text{posterior} \propto \text{likelihood} \times \text{prior}$$
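The following sketch applies Bayes' rule to a rare-condition testing scenario; all the numbers are made up for illustration:

```python
# Made-up numbers for a rare condition A and a positive test result B.
p_a = 0.01              # prior P(A)
p_b_given_a = 0.95      # likelihood P(B|A)
p_b_given_not_a = 0.05  # false-positive rate P(B|A^C)

# The normalizing constant P(B), expanded over A and its complement.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

posterior = p_b_given_a * p_a / p_b  # P(A|B) by Bayes' rule
print(round(posterior, 3))  # 0.161
```

Even with a highly accurate test, the posterior stays small because the prior is so low; the normalizing constant is dominated by false positives from the much larger $A^C$ population.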
Often, we end up having to deal with complex events, and to effectively navigate them, we need to decompose them into simpler events.
This leads us to the concept of partitions. A partition is defined as a collection of events $B_1, B_2, \ldots, B_n$ that together make up the sample space, such that the $B_i$ are pairwise disjoint ($B_i \cap B_j = \emptyset$ whenever $i \neq j$) and $P(B_i) > 0$ for all $i$.
In the coin flipping example, the sample space is partitioned into two possible events—heads and tails.
If $A$ is an event and $\{B_i\}$ is a partition of $\Omega$, then we have the following:

$$P(A) = \sum_i P(A|B_i)P(B_i)$$
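A minimal sketch of this law of total probability, assuming a three-event partition with made-up probabilities:

```python
# A three-event partition with made-up probabilities.
p_b = [0.5, 0.3, 0.2]          # P(B_i); these must sum to 1
p_a_given_b = [0.9, 0.5, 0.1]  # P(A|B_i)

# P(A) = sum over i of P(A|B_i) * P(B_i)
p_a = sum(pa * pb for pa, pb in zip(p_a_given_b, p_b))
print(round(p_a, 2))  # 0.62
```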
We can also rewrite Bayes' formula with partitions so that we have the following:
$$P(B_i|A) = \frac{P(A|B_i)P(B_i)}{\sum_j P(A|B_j)P(B_j)}$$
Here, $P(A) = \sum_j P(A|B_j)P(B_j)$.
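Reusing the made-up numbers from the previous sketch, we can compute the posterior probability of each part of the partition given that A occurred:

```python
# Posterior over the partition, reusing the made-up numbers from above.
p_b = [0.5, 0.3, 0.2]
p_a_given_b = [0.9, 0.5, 0.1]

p_a = sum(pa * pb for pa, pb in zip(p_a_given_b, p_b))  # P(A) = 0.62
posterior = [pa * pb / p_a for pa, pb in zip(p_a_given_b, p_b)]
print([round(p, 3) for p in posterior])  # [0.726, 0.242, 0.032]
```

Note that the posterior probabilities sum to 1, since the denominator is exactly the law of total probability applied to the partition.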