- [[joint probability of dependent events]], [[joint probability of independent events]], [[Bayes theorem]]
# Idea
## Step 1
The joint probability of $A$ and $B$ equals the joint probability of $B$ and $A$, because set intersection is symmetric. This symmetry is the key first step in deriving [[Bayes theorem|Bayes's rule]].
$P(A,B) = P(B,A)$
The joint probability of events $A$ and $B$ occurring can be denoted in various ways: $P(A \ \text{and} \ B)$ or $P(A \cap B)$ or $P(A, B)$.
Venn diagram of $P(A,B)$, aka $P(A \cap B)$
![[s20220326_012031.png|300]]
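The symmetry $P(A,B) = P(B,A)$ can be checked numerically. A minimal sketch, using a hypothetical sample space (one roll of a fair six-sided die; the events $A$ and $B$ are illustrative assumptions, not from the note):

```python
from fractions import Fraction

# Toy sample space: one roll of a fair six-sided die (illustrative assumption).
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event A: the roll is even
B = {4, 5, 6}   # event B: the roll is greater than 3

def prob(event):
    """Probability of an event under the uniform distribution on omega."""
    return Fraction(len(event & omega), len(omega))

p_A_and_B = prob(A & B)   # P(A, B) = P({4, 6}) = 1/3
p_B_and_A = prob(B & A)   # P(B, A): set intersection is symmetric
assert p_A_and_B == p_B_and_A == Fraction(1, 3)
```

The two probabilities coincide because `A & B` and `B & A` are the same set, which is exactly the Venn-diagram picture above.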
## Step 2
[[Bayes theorem|Bayes's theorem]], also called Bayes's rule, can be derived from the [[joint probability of dependent events]].
$P(A,B)=P(A|B) P(B)$
That is, the probability of $A$ and $B$ jointly occurring, $P(A,B)$, is the probability of $B$ occurring, $P(B)$, multiplied by the probability of $A$ occurring given that $B$ has occurred, $P(A|B)$. See [[joint probability of dependent events]] for the full explanation.
Similarly,
$P(B,A)=P(B|A) \ P(A)$
## Step 3
Because $P(A,B) = P(B,A)$ (the **left-hand sides** of the two equations above; see Step 1 and the Venn diagram), we can also equate their **right-hand sides**:
$P(A|B) \ P(B) = P(B|A) \ P(A)$
Dividing both sides by $P(B)$ yields [[Bayes theorem]]:
$P(A|B) = \frac{P(B|A) \ P(A)}{P(B)}$
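The "flip" can be checked end to end: compute $P(A|B)$ from $P(B|A)$, $P(A)$, and $P(B)$, and compare it against the directly computed conditional. A minimal sketch on the same hypothetical fair-die events:

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}   # one roll of a fair die (assumed example)
A = {2, 4, 6}                # A: the roll is even
B = {4, 5, 6}                # B: the roll is greater than 3

def prob(event):
    return Fraction(len(event), len(omega))

def cond(x, given):
    return prob(x & given) / prob(given)

# Bayes's theorem: P(A|B) = P(B|A) P(A) / P(B)
p_A_given_B = cond(B, A) * prob(A) / prob(B)

# It matches the conditional probability computed directly.
assert p_A_given_B == cond(A, B) == Fraction(2, 3)
```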
As can be seen from the final equation, [[Bayes theorem]] allows us to "flip/reverse" conditional probabilities: $P(B|A)$ to $P(A|B)$, or vice versa.
For example, we can use [[Bayes theorem]] to obtain the probability of our theory being true given the data, $P(\theta|D)$ (left-hand side), when all we have (right-hand side) is the probability of the data given our theory, $P(D|\theta)$, the probability of the theory, $P(\theta)$, and the probability of the data, $P(D)$:
$P(\theta|D)=\frac{P(D|\theta) \ P(\theta)}{P(D)}$
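A worked instance of this theory-and-data form, with made-up numbers for a diagnostic-test scenario (all values below are illustrative assumptions, not from the note): $\theta$ = "patient has the condition", $D$ = "test is positive". $P(D)$ is computed via the law of total probability.

```python
from fractions import Fraction

# Made-up numbers for illustration only.
p_theta = Fraction(1, 100)          # prior P(theta): 1% base rate
p_D_given_theta = Fraction(9, 10)   # likelihood P(D|theta): test sensitivity
p_D_given_not = Fraction(5, 100)    # P(D|not theta): false-positive rate

# Marginal P(D) via the law of total probability
p_D = p_D_given_theta * p_theta + p_D_given_not * (1 - p_theta)

# Posterior P(theta|D) from Bayes's theorem
p_theta_given_D = p_D_given_theta * p_theta / p_D
print(p_theta_given_D)   # 2/13, roughly 0.15
```

Despite the test's 90% sensitivity, the posterior is only about 15%, because the low prior $P(\theta)$ dominates; this is exactly the kind of reversal the final equation makes explicit.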
# References