# Idea

$k \sim \text{Binomial}(n, \theta)$

$k$, the number of successes, depends on $\theta$ (or $p$), the probability of success, and $n$ (or $N$), the total number of events or trials.

$f(k)=\binom{n}{k} p^{k}(1-p)^{n-k}$

The mean of the binomial distribution is $pN$, where $p$ is the probability of success and $N$ is the number of trials (the sample size).

At a college with 2000 students, each student wears a baseball hat with probability 0.2. Assume that their decisions to wear hats are independent. The expected number of students wearing hats on any given day is $2000 \times 0.2 = 400$.

The [[standard deviation]] is $\sqrt{p(1-p)N}$. If $p$ (or $\theta$) is 0.5, then the SD is just $\frac{\sqrt{N}}{2}$:

$SD = \sqrt{p(1-p)N}$

$SD = \sqrt{0.5 \times 0.5 \times N}$

$SD = \sqrt{0.25 \times N}$

$SD = 0.5 \sqrt{N}$

$SD = \frac{\sqrt{N}}{2}$

## Examples

A Boeing 747 has 380 seats. Since the passenger show-up rate is 90%, Boeing decides to oversell and sells 400 tickets. What are the mean and SD of the number of passengers who will show up, and what is the mean ± 3 SD interval (roughly a 99% interval)?

```python
import numpy as np
from scipy.stats import binom

n, p = 400, 0.9
mu = n * p  # 360
sd = np.sqrt(p * (1 - p) * n)  # ≈ 6
sd3 = sd * 3  # mean ± 3 SD covers about 99.7% under the normal approximation
interval99 = [mu - sd3, mu, mu + sd3]  # ≈ [342.0, 360.0, 378.0]

# Cross-check the mean and SD with scipy
mean, var = binom.stats(n, p)
sd = np.sqrt(var)
```

If 10,000 people each wear blue jeans independently with probability 0.8, what is the standard deviation of the number of people wearing blue jeans?

```python
import numpy as np
from scipy.stats import binom

n, p = 10000, 0.8
mu = n * p  # 8000
sd = np.sqrt(p * (1 - p) * n)  # ≈ 40

# Cross-check with scipy: mean, variance, skewness, excess kurtosis
mean, var, skew, kurt = binom.stats(n, p, moments='mvsk')
sd = np.sqrt(var)
```

# References

- [A Secret Weapon for Predicting Outcomes: The Binomial Distribution - YouTube](https://www.youtube.com/watch?v=6YzrVUVO9M0)
- https://www.coursera.org/learn/model-thinking/lecture/9Kknw/central-limit-theorem
- https://www.youtube.com/watch?v=8idr1WZ1A7Q
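
# Exact check of the 3 SD interval

The 3 SD interval in the Boeing example relies on the normal approximation to the binomial. As a rough sketch, the same question can be asked of the exact binomial distribution using `scipy.stats.binom`. The variable names `lo`, `hi`, `inside`, `overbooked`, and `seats` are illustrative and not part of the original note; the numbers 400, 0.9, and 380 come from the example.

```python
import numpy as np
from scipy.stats import binom

n, p = 400, 0.9   # tickets sold and show-up probability (from the example)
seats = 380       # seats on the plane (from the example)

mu = n * p
sd = np.sqrt(n * p * (1 - p))

# Round to integers so floating-point error cannot shift the endpoints
lo = int(round(mu - 3 * sd))
hi = int(round(mu + 3 * sd))

# Exact probability that the show-up count k satisfies lo <= k <= hi
inside = binom.cdf(hi, n, p) - binom.cdf(lo - 1, n, p)

# Exact probability that more passengers show up than there are seats, P(k > seats)
overbooked = binom.sf(seats, n, p)

print(f"P({lo} <= k <= {hi}) = {inside:.4f}")
print(f"P(k > {seats}) = {overbooked:.6f}")
```

Because `binom.cdf(x, n, p)` gives $P(k \le x)$, subtracting the value at `lo - 1` keeps the lower endpoint inside the interval. `binom.sf` is the survival function, $P(k > x)$, so `overbooked` is the chance that at least one ticketed passenger has no seat.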