Lecture 16: Metropolis Algorithm

1 Metropolis Algorithm

Recall, last time we discussed the grid-based approximation to the posterior distribution. This approach is not scalable to high-dimensional problems.

Goal: Estimate $p(\theta\mid y)$

How: We construct a Markov Chain $\{\theta^{(1)},\ldots,\theta^{(s)}\}$. How to get the next sample $\theta^{(s+1)}$?

1.1 Metropolis Algorithm

Sample a proposal value $\theta^*$ \[ \\[1cm] \]
Compute $r=$ \[ \\[1cm] \]
$\theta^{(s+1)}=$

2 Normal Model with known variance

We will illustrate the Metropolis algorithm on the normal model with known variance. We have data $y_1,\ldots,y_n$ and we want to estimate $\theta$ in the model

Sampling model

\[ y_i\overset{iid}{\sim} N(\theta,\sigma^2),\qquad \text{ where $\sigma^2$ is known}. \]

Prior distribution

\[ \theta\sim N(\mu,t^2). \]

Posterior distribution

\[ \theta\mid y_1,\ldots,y_n\sim N\left(\frac{\frac{n}{\sigma^2}\bar{y}+\frac{1}{\tau^2}\mu}{\frac{n}{\sigma^2}+\frac{1}{\tau^2}},\frac{1}{\frac{n}{\sigma^2}+\frac{1}{\tau^2}}\right). \]

2.1 Setting up the Metropolis algorithm

\[ \sigma^2=1, \qquad \tau^2=1, \qquad \mu=5, \qquad n=5,\qquad y=(\ldots,\ldots). \] and we get

\[ \theta\mid y_1,\ldots,y_n\sim N(10.04,0.2). \]

Metropolis algorithm pseudocode

2.2 Results from the Metropolis algorithm for the normal model.

Q: Which parts of the graph correspond to under-dispersed (small proposal SD) and over-dispersed (large proposal SD) proposal distributions?

Markov chains under three different proposal distributions.

Q: Which of the three chains correspond to $\delta^2=1/32$, $\delta^2=2$ and $\delta^2=64$?

3 Poisson regression

$Y_i=$ number of eggs
$X_i=$ Age

\[ \log(E(Y_i\mid x_i))=\beta_1+\beta_2 x_i+\beta_3 x_i^2=\underline{\beta}^T\underline{x}_i \]

Prior distribution

\[ \underline{\beta}\sim N(\underline{0},100\times\underline{I}) \]

Proposal distribution

\[ \underline{\beta}^*\sim J(\underline{\beta}^* \mid \underline{\beta})=\hspace{10cm} \]

3.1 WWaFD

#### Back to sparrow data
load("../data/sparrows.RData")  
set.seed(639)
fledged<-sparrows[,1] ; age<-sparrows[,2] ; age2<-age^2
fit.mle<-glm(fledged~age+age2,family="poisson")
summary(fit.mle)


Call:
glm(formula = fledged ~ age + age2, family = "poisson")

Coefficients:
            Estimate Std. Error z value Pr(>|z|)  
(Intercept)  0.27662    0.44219   0.626   0.5316  
age          0.68174    0.33850   2.014   0.0440 *
age2        -0.13451    0.05786  -2.325   0.0201 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 76.081  on 51  degrees of freedom
Residual deviance: 67.837  on 49  degrees of freedom
AIC: 198.78

Number of Fisher Scoring iterations: 5

3.2 WWaBD

#### Poisson regression MCMC
  load("../data/sparrows.RData")  
  fledged <- sparrows[, 1] ; age <- sparrows[, 2] ; age2 <- age^2
  
  y <- fledged
  X <- cbind(rep(1, length(y)), age, age^2)
  yX <- cbind(y, X)
  colnames(yX) <- c("fledged", "intercept", "age", "age2") 
  
  n_pois <- length(y)
  p <- dim(X)[2]
  
  pmn.beta <- rep(0, p)
  psd.beta <- rep(10, p)
  
  var.prop <- var(log(y + 1/2)) * solve(t(X) %*% X)
  beta <- rep(0, p) ; S <- 10000 ; ac <- 0
  BETA <- matrix(0, nrow = S, ncol = p)
  set.seed(639)
  
  rmvnorm <- function(n, mu, Sigma) {
    E <- matrix(rnorm(n * length(mu)), n, length(mu))
    t(t(E %*% chol(Sigma)) + c(mu))
  }
  
  for(s in 1:S) {
    beta.p <- t(rmvnorm(1, beta, var.prop))
    
    lhr <- sum(dpois(y, exp(X %*% beta.p), log = T)) -
           sum(dpois(y, exp(X %*% beta), log = T)) +
           sum(dnorm(beta.p, pmn.beta, psd.beta, log = T)) -
           sum(dnorm(beta, pmn.beta, psd.beta, log = T))
    
    if(log(runif(1)) < lhr) { beta <- beta.p; ac <- ac + 1 }
    
    BETA[s,] <- beta
  }
  
  ac_rate <- ac / S
  eff_size <- apply(BETA, 2, effectiveSize)

Acceptance rate: 0.417

Effective sample sizes:

[1] 883.2414 854.9256 791.2289

1 Metropolis Algorithm

1.1 Metropolis Algorithm

2 Normal Model with known variance

Sampling model

Prior distribution

Posterior distribution

2.1 Setting up the Metropolis algorithm

Metropolis algorithm pseudocode

2.2 Results from the Metropolis algorithm for the normal model.

Markov chains under three different proposal distributions.

3 Poisson regression

Prior distribution

Proposal distribution

3.1 WWaFD

3.2 WWaBD

3.3 MCMC approximations to the posterior marginal distributions of \(\beta_2\) and \(\beta_3\)