# Idea

The covariance of two variables/features/columns, $x$ and $y$, is given by

$ \operatorname{cov}_{x, y}=\frac{\sum\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{N-1} $

where $N$ is the number of data values/rows. It quantifies the extent to which the two variables vary together (covary).

Alternatively, writing $D_X = X - \mu_X$ and $D_Y = Y - \mu_Y$ for the deviations from the means,

$ \operatorname{Cov}(X, Y)=E\left(D_X D_Y\right)=E\left[\left(X-\mu_X\right)\left(Y-\mu_Y\right)\right] $

Covariance is an expected product:

- It is the expected product of deviations.
- It is the mean of the product minus the product of means.

$ \begin{aligned} \operatorname{Cov}(X, Y) &=E\left[\left(X-\mu_X\right)\left(Y-\mu_Y\right)\right] \\ &=E(X Y)-E(X) \mu_Y-\mu_X E(Y)+\mu_X \mu_Y \\ &=E(X Y)-\mu_X \mu_Y \end{aligned} $

The last step uses $E(X) = \mu_X$ and $E(Y) = \mu_Y$, so the two middle terms each equal $-\mu_X \mu_Y$ and one of them cancels against $+\mu_X \mu_Y$.

Covariance is unstandardized [[correlation]]: dividing the covariance by the product of the two standard deviations gives the correlation.

In a [[variance-covariance matrix]], the off-diagonal elements are the [[covariance]] of each pair of features/columns, and the diagonal elements are the variances.

## Matrix notation: covariance is [[dot product]]

Assuming $x$ and $y$ have already been mean-centered:

$ \operatorname{cov}_{x, y}=\frac{\sum{x_i y_i}}{N-1}=\frac{\mathbf{x}^{\prime} \mathbf{y}}{N-1} $

That is, take the [[dot product]] of the two features/columns and divide by $N - 1$.

**When both variables have been centered, covariance is the [[dot product]] scaled by $1/(N-1)$. Both measure "similarity" of two vectors.**

## Code example

```r
> set.seed(1)
> m <- matrix(rnorm(32, 100), nrow = 8)  # 8x4 matrix
> m <- apply(m, 2, function(x) x - mean(x))  # mean center each feature/column
> m
            [,1]        [,2]       [,3]       [,4]
[1,] -0.75790817  0.52377208 -0.2823272  0.6032217
[2,]  0.05218897 -0.35739766  0.6776993 -0.0727328
[3,] -0.96708297  1.45977189  0.5550843 -0.1723996
[4,]  1.46382645  0.33783396  0.3277644 -1.4873564
[5,]  0.19805342 -0.67324986  0.6528404 -0.4947541
[6,] -0.95192274 -2.26670916  0.5159994  0.4013375
[7,]  0.35597470  1.07292164 -0.1915719  1.3420755
[8,]  0.60687035 -0.09694289 -2.2554886 -0.1193918
> (m1 <- cov(m))  # covariance (cov() centers the columns itself)
           [,1]        [,2]       [,3]        [,4]
[1,]  0.7279396  0.14495431 -0.2295077 -0.36374714
[2,]  0.1449543  1.34270437 -0.1521557  0.06609935
[3,] -0.2295077 -0.15215575  0.9672740 -0.12950263
[4,] -0.3637471  0.06609935 -0.1295026  0.69034167
> (m2 <- (t(m) %*% m) / (nrow(m) - 1))  # matrix solution (requires centered columns)
           [,1]        [,2]       [,3]        [,4]
[1,]  0.7279396  0.14495431 -0.2295077 -0.36374714
[2,]  0.1449543  1.34270437 -0.1521557  0.06609935
[3,] -0.2295077 -0.15215575  0.9672740 -0.12950263
[4,] -0.3637471  0.06609935 -0.1295026  0.69034167
> all.equal(m1, m2)
[1] TRUE
```
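
The three ways of writing covariance in this note (sum of products of deviations, mean of the product minus the product of means, and dot product of centered vectors) can be checked against each other on a single pair of vectors. The sketch below is only an illustration: the simulated vectors and the names `cov_def`, `cov_short`, and `cov_dot` are assumptions introduced here, not part of the original note. One caveat it makes explicit: the "mean of the product minus the product of means" form divides by $N$, so it needs a factor of $N/(N-1)$ to match the sample covariance that `cov()` returns.

```r
set.seed(1)
x <- rnorm(8, 100)  # two arbitrary simulated vectors (illustrative only)
y <- rnorm(8, 100)
n <- length(x)

# 1. Definition: sum of products of deviations, divided by N - 1
cov_def <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# 2. Mean of the product minus the product of means.
#    This form uses a divisor of N, so rescale by N / (N - 1)
#    to obtain the sample covariance.
cov_short <- (mean(x * y) - mean(x) * mean(y)) * n / (n - 1)

# 3. Dot product of the mean-centered vectors, divided by N - 1
xc <- x - mean(x)
yc <- y - mean(y)
cov_dot <- as.numeric(t(xc) %*% yc) / (n - 1)

# All three should agree with R's built-in cov() up to floating-point error;
# stopifnot() raises an error if any comparison fails.
stopifnot(
  isTRUE(all.equal(cov_def, cov(x, y))),
  isTRUE(all.equal(cov_short, cov(x, y))),
  isTRUE(all.equal(cov_dot, cov(x, y)))
)
```

Form 2 subtracts two large, nearly equal numbers when the means are far from zero (here around 100), so in floating-point arithmetic the deviation-based forms 1 and 3 are generally the safer choice.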