# Idea
The sample covariance of two variables/features/columns, $x$ and $y$, is given by
$
\operatorname{cov}_{x, y}=\frac{\sum\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{N-1}
$
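where $N$ is the number of data values/rows, and $\bar{x}$ and $\bar{y}$ are the sample means.
It quantifies the extent to which two variables covary: it is positive when they tend to deviate from their means in the same direction and negative when they tend to deviate in opposite directions.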
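To make the formula concrete, here is a minimal sketch that computes the sample covariance by hand and compares it with R's built-in `cov()`. The vectors `x` and `y` are made-up toy data, not taken from any real dataset.

```r
# Hypothetical toy data, chosen only for illustration
x <- c(2, 4, 6, 8, 10)
y <- c(1, 3, 2, 5, 4)
n <- length(x)

# Sum of the products of deviations from the means, divided by N - 1
cov_manual <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

cov_manual                        # 4
all.equal(cov_manual, cov(x, y))  # TRUE: cov() also divides by N - 1
```

The deviations are (-4, -2, 0, 2, 4) and (-2, 0, -1, 2, 1), so the products sum to 16, and 16 / 4 = 4.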
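Alternatively, in terms of expectations, where $D_X = X - \mu_X$ and $D_Y = Y - \mu_Y$ are the deviations from the means,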
$
\operatorname{Cov}(X, Y)=E\left(D_X D_Y\right)=E\left[\left(X-\mu_X\right)\left(Y-\mu_Y\right)\right]
$
Covariance is an expected product: the expected product of deviations. Equivalently, it is the mean of the product minus the product of the means:
$
\begin{aligned}
\operatorname{Cov}(X, Y) &=E\left[\left(X-\mu_X\right)\left(Y-\mu_Y\right)\right] \\
&=E(X Y)-E(X) \mu_Y-\mu_X E(Y)+\mu_X \mu_Y \\
&=E(X Y)-\mu_X \mu_Y
\end{aligned}
$
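The identity above can be checked numerically. The sketch below treats a simulated sample as if it were the whole population, so every expectation becomes a plain mean (dividing by $N$, not $N - 1$). The seed and the way `x` and `y` are generated are arbitrary assumptions made for the example.

```r
set.seed(42)                 # arbitrary seed, for reproducibility only
x <- rnorm(1000)
y <- 0.5 * x + rnorm(1000)   # y is built to covary with x

# Expected product of deviations: E[(X - mu_X)(Y - mu_Y)]
lhs <- mean((x - mean(x)) * (y - mean(y)))

# Mean of the product minus the product of the means: E(XY) - mu_X mu_Y
rhs <- mean(x * y) - mean(x) * mean(y)

all.equal(lhs, rhs)          # TRUE

# cov() divides by N - 1 rather than N, so rescale before comparing
n <- length(x)
all.equal(lhs, cov(x, y) * (n - 1) / n)  # TRUE
```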
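Covariance is unstandardized [[correlation]]: dividing the covariance by the product of the two standard deviations gives the correlation coefficient.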
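A short sketch of the link between the two quantities, using simulated data (the seed, sample size, and coefficients are arbitrary assumptions): standardizing the covariance by both standard deviations reproduces `cor()`, and rescaling a variable changes the covariance but leaves the correlation untouched.

```r
set.seed(42)                        # arbitrary seed
x <- rnorm(50, mean = 10, sd = 2)
y <- 3 * x + rnorm(50, sd = 5)

# Correlation is covariance divided by the product of standard deviations
r_manual <- cov(x, y) / (sd(x) * sd(y))
all.equal(r_manual, cor(x, y))      # TRUE

# Multiplying x by 100 multiplies the covariance by 100 ...
scale_ratio <- cov(100 * x, y) / cov(x, y)
all.equal(scale_ratio, 100)         # TRUE

# ... but the correlation does not depend on the units
all.equal(cor(100 * x, y), cor(x, y))  # TRUE
```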
In a [[variance-covariance matrix]], the off-diagonal elements are the [[covariance|covariances]] between pairs of features/columns, and the diagonal elements are the variances of the individual features/columns.
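As an illustration of that layout, the sketch below builds a variance-covariance matrix from a small simulated data frame (the column names `a`, `b`, `c` and the seed are hypothetical) and checks which entries hold what.

```r
set.seed(42)  # arbitrary seed
d <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20))

S <- cov(d)   # 3x3 variance-covariance matrix

# Diagonal elements are the variances of the individual columns
all.equal(unname(diag(S)), c(var(d$a), var(d$b), var(d$c)))  # TRUE

# Off-diagonal elements are pairwise covariances
all.equal(S["a", "b"], cov(d$a, d$b))                        # TRUE

# cov(a, b) equals cov(b, a), so the matrix is symmetric
isSymmetric(S)                                               # TRUE
```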
## Matrix notation: covariance is [[dot product]]
Assuming $x$ and $y$ have already been mean-centered:
$
\operatorname{cov}_{x, y}=\frac{\sum x_i y_i}{N-1}=\frac{\mathbf{x}^{\prime} \mathbf{y}}{N-1}
$
That is, take the [[dot product]] of the two features/columns and divide by $N - 1$.
**When both variables have been centered, covariance is the [[dot product]] scaled by $1/(N-1)$. Both measure the "similarity" of two vectors.**
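The centering step matters. The following sketch, on made-up toy vectors, compares the scaled dot product with and without mean-centering; only the centered version matches `cov()`.

```r
# Hypothetical toy data, chosen only for illustration
x <- c(2, 4, 6, 8, 10)
y <- c(1, 3, 2, 5, 4)
n <- length(x)

# Mean-center both vectors, then take the dot product
xc <- x - mean(x)
yc <- y - mean(y)
centered <- sum(xc * yc) / (n - 1)   # same as drop(t(xc) %*% yc) / (n - 1)

# Dot product of the raw, uncentered vectors
uncentered <- sum(x * y) / (n - 1)

centered     # 4, equal to cov(x, y)
uncentered   # 26.5, not the covariance
```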
## Code example
```r
set.seed(1)
m <- matrix(rnorm(32, 100), nrow = 8) # 8x4 matrix
m <- apply(m, 2, function(x) x - mean(x)) # mean-center each feature/column
m
#>             [,1]        [,2]       [,3]       [,4]
#> [1,] -0.75790817  0.52377208 -0.2823272  0.6032217
#> [2,]  0.05218897 -0.35739766  0.6776993 -0.0727328
#> [3,] -0.96708297  1.45977189  0.5550843 -0.1723996
#> [4,]  1.46382645  0.33783396  0.3277644 -1.4873564
#> [5,]  0.19805342 -0.67324986  0.6528404 -0.4947541
#> [6,] -0.95192274 -2.26670916  0.5159994  0.4013375
#> [7,]  0.35597470  1.07292164 -0.1915719  1.3420755
#> [8,]  0.60687035 -0.09694289 -2.2554886 -0.1193918
(m1 <- cov(m)) # covariance via the built-in function
#>            [,1]        [,2]       [,3]        [,4]
#> [1,]  0.7279396  0.14495431 -0.2295077 -0.36374714
#> [2,]  0.1449543  1.34270437 -0.1521557  0.06609935
#> [3,] -0.2295077 -0.15215575  0.9672740 -0.12950263
#> [4,] -0.3637471  0.06609935 -0.1295026  0.69034167
(m2 <- (t(m) %*% m) / (nrow(m) - 1)) # matrix solution: dot products of centered columns
#>            [,1]        [,2]       [,3]        [,4]
#> [1,]  0.7279396  0.14495431 -0.2295077 -0.36374714
#> [2,]  0.1449543  1.34270437 -0.1521557  0.06609935
#> [3,] -0.2295077 -0.15215575  0.9672740 -0.12950263
#> [4,] -0.3637471  0.06609935 -0.1295026  0.69034167
all.equal(m1, m2) # both approaches agree
#> [1] TRUE
```
# References