- [[joint probability of dependent events]], [[joint probability of independent events]], [[Bayes theorem]]
# Idea
## Step 1
The joint probability of $A$ and $B$ equals the joint probability of $B$ and $A$, because set intersection is symmetric. This symmetry is the key first step in deriving [[Bayes theorem|Bayes's rule]].
$P(A,B) = P(B,A)$
The joint probability of events $A$ and $B$ occurring can be denoted in various ways: $P(A \ \text{and} \ B)$ or $P(A \cap B)$ or $P(A, B)$.
Venn diagram of $P(A,B)$, aka $P(A \cap B)$
![[s20220326_012031.png|300]]
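The symmetry $P(A,B) = P(B,A)$ can be checked numerically. A minimal sketch, using a hypothetical sample space (one roll of a fair six-sided die; the events $A$ and $B$ are illustrative assumptions, not from the note):

```python
from fractions import Fraction

# Toy sample space: one roll of a fair six-sided die (illustrative assumption).
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event A: the roll is even
B = {4, 5, 6}   # event B: the roll is greater than 3

def prob(event):
    """Probability of an event under the uniform distribution on omega."""
    return Fraction(len(event & omega), len(omega))

p_A_and_B = prob(A & B)   # P(A, B) = P({4, 6}) = 1/3
p_B_and_A = prob(B & A)   # P(B, A): set intersection is symmetric
assert p_A_and_B == p_B_and_A == Fraction(1, 3)
```

The two probabilities coincide because `A & B` and `B & A` are the same set, which is exactly the Venn-diagram picture above.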
## Step 2
[[Bayes theorem|Bayes's theorem]], also called Bayes's rule, can be derived from the [[joint probability of dependent events]].
$P(A,B)=P(A|B) P(B)$
That is, the probability of $A$ and $B$ jointly occurring, $P(A,B)$, is the probability of $B$ occurring, $P(B)$, multiplied by the probability of $A$ occurring given that $B$ has occurred, $P(A|B)$. See [[joint probability of dependent events]] for the full explanation.
Similarly,
$P(B,A)=P(B|A) \ P(A)$
## Step 3
Because $P(A,B) = P(B,A)$ (the **left-hand sides** of the two equations above; see Step 1 and the Venn diagram), we can also equate their **right-hand sides**:
$P(A|B) \ P(B) = P(B|A) \ P(A)$
Dividing both sides by $P(B)$ yields [[Bayes theorem]]:
$P(A|B) = \frac{P(B|A) \ P(A)}{P(B)}$
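The "flip" can be checked end to end: compute $P(A|B)$ from $P(B|A)$, $P(A)$, and $P(B)$, and compare it against the directly computed conditional. A minimal sketch on the same hypothetical fair-die events:

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}   # one roll of a fair die (assumed example)
A = {2, 4, 6}                # A: the roll is even
B = {4, 5, 6}                # B: the roll is greater than 3

def prob(event):
    return Fraction(len(event), len(omega))

def cond(x, given):
    return prob(x & given) / prob(given)

# Bayes's theorem: P(A|B) = P(B|A) P(A) / P(B)
p_A_given_B = cond(B, A) * prob(A) / prob(B)

# It matches the conditional probability computed directly.
assert p_A_given_B == cond(A, B) == Fraction(2, 3)
```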
As can be seen from the final equation, [[Bayes theorem]] allows us to "flip/reverse" conditional probabilities: $P(B|A)$ to $P(A|B)$, or vice versa.
For example, we can use [[Bayes theorem]] to obtain the probability of our theory being true given the data, $P(\theta|D)$ (left-hand side), when all we have (right-hand side) is the probability of the data given our theory, $P(D|\theta)$, the probability of the theory, $P(\theta)$, and the probability of the data, $P(D)$:
$P(\theta|D)=\frac{P(D|\theta) \ P(\theta)}{P(D)}$
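A worked instance of this theory-and-data form, with made-up numbers for a diagnostic-test scenario (all values below are illustrative assumptions, not from the note): $\theta$ = "patient has the condition", $D$ = "test is positive". $P(D)$ is computed via the law of total probability.

```python
from fractions import Fraction

# Made-up numbers for illustration only.
p_theta = Fraction(1, 100)          # prior P(theta): 1% base rate
p_D_given_theta = Fraction(9, 10)   # likelihood P(D|theta): test sensitivity
p_D_given_not = Fraction(5, 100)    # P(D|not theta): false-positive rate

# Marginal P(D) via the law of total probability
p_D = p_D_given_theta * p_theta + p_D_given_not * (1 - p_theta)

# Posterior P(theta|D) from Bayes's theorem
p_theta_given_D = p_D_given_theta * p_theta / p_D
print(p_theta_given_D)   # 2/13, roughly 0.15
```

Despite the test's 90% sensitivity, the posterior is only about 15%, because the low prior $P(\theta)$ dominates; this is exactly the kind of reversal the final equation makes explicit.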
# References