Lecture 16: Metropolis Algorithm
1 Metropolis Algorithm
Recall, last time we discussed the grid-based approximation to the posterior distribution. This approach is not scalable to high-dimensional problems.
Goal: Estimate \(p(\theta\mid y)\)
How: We construct a Markov Chain \(\{\theta^{(1)},\ldots,\theta^{(s)}\}\). How to get the next sample \(\theta^{(s+1)}\)?
1.1 Metropolis Algorithm
Sample a proposal value \(\theta^*\) \[ \\[1cm] \]
Compute \(r=\) \[ \\[1cm] \]
\(\theta^{(s+1)}=\)
2 Normal Model with known variance
We will illustrate the Metropolis algorithm on the normal model with known variance. We have data \(y_1,\ldots,y_n\) and we want to estimate \(\theta\) in the model
Sampling model
\[ y_i\overset{iid}{\sim} N(\theta,\sigma^2),\qquad \text{ where $\sigma^2$ is known}. \]
Prior distribution
\[ \theta\sim N(\mu,t^2). \]
Posterior distribution
\[ \theta\mid y_1,\ldots,y_n\sim N\left(\frac{\frac{n}{\sigma^2}\bar{y}+\frac{1}{\tau^2}\mu}{\frac{n}{\sigma^2}+\frac{1}{\tau^2}},\frac{1}{\frac{n}{\sigma^2}+\frac{1}{\tau^2}}\right). \]
2.1 Setting up the Metropolis algorithm
\[ \sigma^2=1, \qquad \tau^2=1, \qquad \mu=5, \qquad n=5,\qquad y=(\ldots,\ldots). \] and we get
\[ \theta\mid y_1,\ldots,y_n\sim N(10.04,0.2). \]
Metropolis algorithm pseudocode
2.2 Results from the Metropolis algorithm for the normal model.
Q: Which parts of the graph correspond to under-dispersed (small proposal SD) and over-dispersed (large proposal SD) proposal distributions?
Markov chains under three different proposal distributions.
Q: Which of the three chains correspond to \(\delta^2=1/32\), \(\delta^2=2\) and \(\delta^2=64\)?
3 Poisson regression
- \(Y_i=\) number of eggs
- \(X_i=\) Age
\[ \log(E(Y_i\mid x_i))=\beta_1+\beta_2 x_i+\beta_3 x_i^2=\underline{\beta}^T\underline{x}_i \]
Prior distribution
\[ \underline{\beta}\sim N(\underline{0},100\times\underline{I}) \]
Proposal distribution
\[ \underline{\beta}^*\sim J(\underline{\beta}^* \mid \underline{\beta})=\hspace{10cm} \]
3.1 WWaFD
#### Back to sparrow data
load("../data/sparrows.RData")
set.seed(639)
fledged<-sparrows[,1] ; age<-sparrows[,2] ; age2<-age^2
fit.mle<-glm(fledged~age+age2,family="poisson")
summary(fit.mle)
Call:
glm(formula = fledged ~ age + age2, family = "poisson")
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.27662 0.44219 0.626 0.5316
age 0.68174 0.33850 2.014 0.0440 *
age2 -0.13451 0.05786 -2.325 0.0201 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 76.081 on 51 degrees of freedom
Residual deviance: 67.837 on 49 degrees of freedom
AIC: 198.78
Number of Fisher Scoring iterations: 5
3.2 WWaBD
#### Poisson regression MCMC
load("../data/sparrows.RData")
fledged <- sparrows[, 1] ; age <- sparrows[, 2] ; age2 <- age^2
y <- fledged
X <- cbind(rep(1, length(y)), age, age^2)
yX <- cbind(y, X)
colnames(yX) <- c("fledged", "intercept", "age", "age2")
n_pois <- length(y)
p <- dim(X)[2]
pmn.beta <- rep(0, p)
psd.beta <- rep(10, p)
var.prop <- var(log(y + 1/2)) * solve(t(X) %*% X)
beta <- rep(0, p) ; S <- 10000 ; ac <- 0
BETA <- matrix(0, nrow = S, ncol = p)
set.seed(639)
rmvnorm <- function(n, mu, Sigma) {
E <- matrix(rnorm(n * length(mu)), n, length(mu))
t(t(E %*% chol(Sigma)) + c(mu))
}
for(s in 1:S) {
beta.p <- t(rmvnorm(1, beta, var.prop))
lhr <- sum(dpois(y, exp(X %*% beta.p), log = T)) -
sum(dpois(y, exp(X %*% beta), log = T)) +
sum(dnorm(beta.p, pmn.beta, psd.beta, log = T)) -
sum(dnorm(beta, pmn.beta, psd.beta, log = T))
if(log(runif(1)) < lhr) { beta <- beta.p; ac <- ac + 1 }
BETA[s,] <- beta
}
ac_rate <- ac / S
eff_size <- apply(BETA, 2, effectiveSize)Acceptance rate: 0.417
Effective sample sizes:
[1] 883.2414 854.9256 791.2289