Multivariate Gaussian Density#
A random vector \(X \in \mathbb{R}^p\) with mean \(\mu\), covariance \(\Sigma\), and density
is said to follow a multivariate normal (Gaussian) distribution. To evaluate this density, one needs to compute:
The (square root) determinant: \((\det{\Sigma})^{-\frac{1}{2}}\)
The quadratic form \((x - \mu)^\top\Sigma^{-1}(x - \mu)\)
Without the sweep operator, both of these terms require special computational care. The sweep operator permits straightforward calculation of both terms when we construct
and sweep on the diagonal entries of \(\Sigma\). This operation will result in the following:
and in the process also accumulate \(\det(\Sigma)\). Thus, we can evaluate the MVN density in-place without any additional allocation.
[1]:
import sweepystats as sw
import numpy as np
from scipy.linalg import toeplitz
Loglikelihood#
To evaluate the likelihood of \(X = (X_1 = x_1,..., X_p = x_p)\), we initialize the Normal class and call the loglikelihood function.
[2]:
# instantiate a Normal
p = 5
mu = np.zeros(p)
sigma = toeplitz(np.array([0.5**i for i in range(p)]))
d = sw.Normal(mu, sigma)
Here we simulated
with \(\rho = 0.5\). Now we can ask what is the likelihood that \(X = (X_1 = x_1,..., X_p = x_p)\)?
[3]:
X = np.random.normal(size=p)
d.loglikelihood(X)
100%|███████████████████████████████████████████| 5/5 [00:00<00:00, 28610.53it/s]
[3]:
np.float64(-7.372282461672331)
We can check answer with scipy:
[4]:
from scipy.stats import multivariate_normal
multivariate_normal.logpdf(X, mean=mu, cov=sigma)
[4]:
np.float64(-7.372282461672331)
Conditional distributions#
Let \(X \sim N(\mu, \Sigma)\). If \(X = (Y, Z)\) and
then the following classical formulas hold
These quantities be easily computed via the sweep operation. Consider the matrix:
Sweeping on the diagonals of \(\Sigma_Y\) yields:
Thus, the conditional expection and variance becomes immediately available, again without needing to allocate anything new.
Conditional expectations#
We wish to evaluate
[23]:
# instantiate a Normal
p = 6
mu = np.zeros(p)
sigma = toeplitz(np.array([0.5**i for i in range(p)]))
d = sw.Normal(mu, sigma)
Suppose \(p=4\) and we condition on the first 2 elements. What is the conditional mean for the remaining 2 elements?
[24]:
y = np.random.normal(2, size=(2,))
yidx = [0, 1]
d.cond_mean(y, yidx)
[24]:
array([0.66050197, 0.33025098, 0.16512549, 0.08256275])
[25]:
# check answers with brute-force implementation
mu_Y, mu_Z = np.zeros(2), np.zeros(p - len(yidx))
sigma_Y = sigma[0:2, 0:2]
sigma_Z = sigma[2:, 2:]
sigma_ZY = sigma[2:, 0:2]
mu_Z + sigma_ZY @ np.linalg.inv(sigma_Y) @ (y - mu_Y)
[25]:
array([0.66050197, 0.33025098, 0.16512549, 0.08256275])
yidx does NOT have to be contiguous!
[38]:
p = 6
y = np.random.normal(2, size=(2,))
yidx = [0, 3]
d.cond_mean(y, yidx)
[38]:
array([1.72322558, 1.95738227, 1.58511505, 0.79255753])
[39]:
# check answers with brute-force implementation
zidx = [1, 2, 4, 5]
mu_Y, mu_Z = np.zeros(2), np.zeros(p - len(yidx))
sigma_Y = sigma[np.ix_(yidx, yidx)]
sigma_Z = sigma[np.ix_(zidx, zidx)]
sigma_ZY = sigma[np.ix_(zidx, yidx)]
mu_Z + sigma_ZY @ np.linalg.inv(sigma_Y) @ (y - mu_Y)
[39]:
array([1.72322558, 1.95738227, 1.58511505, 0.79255753])
Conditional Variance#
We wish to evaluate
[46]:
# instantiate a Normal
p = 6
mu = np.zeros(p)
sigma = toeplitz(np.array([0.5**i for i in range(p)]))
d = sw.Normal(mu, sigma)
To compute conditional variance:
[47]:
y = np.random.normal(2, size=(2,))
yidx = [0, 2]
d.cond_var(y, yidx)
[47]:
array([[0.6 , 0. , 0. , 0. ],
[0. , 0.75 , 0.375 , 0.1875 ],
[0. , 0.375 , 0.9375 , 0.46875 ],
[0. , 0.1875 , 0.46875 , 0.984375]])
[48]:
# check answers with brute-force implementation
zidx = [1, 3, 4, 5]
sigma_Y = sigma[np.ix_(yidx, yidx)]
sigma_Z = sigma[np.ix_(zidx, zidx)]
sigma_ZY = sigma[np.ix_(zidx, yidx)]
sigma_YZ = sigma[np.ix_(yidx, zidx)]
sigma_Z - sigma_ZY @ np.linalg.inv(sigma_Y) @ sigma_YZ
[48]:
array([[0.6 , 0. , 0. , 0. ],
[0. , 0.75 , 0.375 , 0.1875 ],
[0. , 0.375 , 0.9375 , 0.46875 ],
[0. , 0.1875 , 0.46875 , 0.984375]])
Of course, for conditional variance computation, the vector zidx need not be contiguous as well.