In this document we will perform disease mapping and spatial regression using NIMBLE
(de Valpine et al. 2017). We will fit a series of disease mapping models with unstructured and structured random effects (ICAR, BYM and BYM2). The data can be downloaded from the GeoDa Center and include lip cancer cases (CANCER) per ward in Scotland during 1975-1980 (Clayton and Kaldor 1987). The dataset also includes the district number (DISTRICT), name (NAME), code (CODE), the population per ward (POP) and the expected number of cases (CEXP) (Lawson 1999).
This practical requires the following packages to be installed and attached: sf, dplyr, tidyverse, nimble, coda, spdep, patchwork, GGally, ggmcmc and INLA.
install.packages(c("sf", "dplyr", "tidyverse", "nimble", "coda", "spdep", "patchwork",
"GGally", "ggmcmc"), dependencies = TRUE, repos = "http://cran.r-project.org")
For INLA, you can use:
install.packages("INLA", repos = c(getOption("repos"),
INLA = "https://inla.r-inla-download.org/R/stable"), dep = TRUE)
library(sf) # Simple feature for R
library(dplyr) # A package for data manipulation
library(tidyverse) # A package for data manipulation and plotting
library(nimble) # A package for performing MCMC in R
library(coda) # A package for summarizing and plotting of MCMC output
# and diagnostic tests
library(spdep) # A package to calculate neighbors
library(patchwork) # A package to combine plots
library(GGally) # A package to do pairplots
library(ggmcmc) # MCMC diagnostics with ggplot
library(INLA) # A package for quick Bayesian inference (here used to calculate
# the scale for BYM2)
library(kableExtra) # A package for formatting tables in R Markdown
ScotLip <- read_sf("ScotLip.shp")
ScotLip
## Simple feature collection with 56 features and 7 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: 95631 ymin: 530297 xmax: 454570 ymax: 1203008
## CRS: NA
## # A tibble: 56 x 8
## CODENO DISTRICT NAME CODE CANCER POP CEXP geometry
## <int> <int> <chr> <chr> <int> <int> <dbl> <POLYGON>
## 1 6126 1 Skye-~ w6126 9 28324 1.38 ((214091.9 841215.2, 218829~
## 2 6016 2 Banff~ w6016 39 231337 8.66 ((383866 865862, 398721 867~
## 3 6121 3 Caith~ w6121 11 83190 3.04 ((311487 968650, 320989 968~
## 4 5601 4 Berwi~ w5601 9 51710 2.53 ((377180 672603, 386871.7 6~
## 5 6125 5 Ross-~ w6125 15 129271 4.26 ((278680.1 882371.8, 294960~
## 6 6554 6 Okney w6554 8 53199 2.4 ((328040 1011192, 321498 10~
## 7 6019 7 Moray w6019 26 245513 8.11 ((348968 868724, 352324 867~
## 8 6655 8 Shetl~ w6655 7 62603 2.3 ((450593 1203008, 454570 12~
## 9 6123 9 Locha~ w6123 6 59183 1.98 ((192646 807178, 216426 805~
## 10 6017 10 Gordon w6017 20 165554 6.63 ((397156 840640, 399497 838~
## # ... with 46 more rows
We now calculate the crude SIRs (observed counts divided by the expected counts) and add them to the ScotLip object:
ScotLip$crudeSIR <- ScotLip$CANCER/ScotLip$CEXP
ggplot() + geom_boxplot(data = ScotLip, aes(y = crudeSIR)) + theme_bw() +
theme(text = element_text(size = 12))-> p1
ggplot() + geom_sf(data = ScotLip, aes(fill = crudeSIR), col = "NA") +
scale_fill_viridis_c() + theme_bw() +
theme(axis.text = element_blank(), text = element_text(size = 12)) -> p2
(p1|p2)
The SIRs will be smoothed using the Poisson-lognormal model. The inference is done with NIMBLE called through R. In particular, let each ward \(i\) be indexed by the integers \(1, 2, \ldots, N\). The model is as follows: \[
\begin{eqnarray}
O_{i}|\lambda_{i} & \sim & \text{Poisson}(\lambda_{i}E_{i} ) \\
\log(\lambda_{i}) & = & \alpha + \theta_{i} \\
\theta_{i} & \sim & N(0,1/\tau_{\theta})
\end{eqnarray}
\]
where \(O_i\) is the observed number of Scottish lip cancer cases, \(E_{i}\) the expected number of cases, \(\alpha\) is an intercept term denoting the average log SIR, \(\theta_{i}\) is a random intercept and \(\tau_{\theta}\) a precision (reciprocal of the variance) term that controls the magnitude of \(\theta_{i}\).
UnstrCode <- nimbleCode(
{
for (i in 1:N){
O[i] ~ dpois(mu[i]) # Poisson for observed counts
log(mu[i]) <- log(E[i]) + alpha + theta[i]
theta[i] ~ dnorm(0, tau = tau.theta) # area-specific unstr random effects
SIR[i] <- exp(alpha + theta[i]) # area-specific SIR
resSIR[i] <- exp(theta[i]) # area-specific residual SIR
e[i] <- (O[i]-mu[i])/sqrt(mu[i]) # residuals
}
# Priors:
alpha ~ dflat() # Unif(-inf, +inf)
overallSIR <- exp(alpha) # overall SIR across study region
tau.theta ~ dgamma(1, 0.01) # prior for the precision
sigma2.theta <- 1/tau.theta # variance of theta
}
)
The code above specifies this model using nimbleCode() in NIMBLE.
Here we define the data and its constants. In our case the data is the observed number of lip cancer cases, whereas the constants are the total number of wards and the expected number of cases per ward:
# Obtain the number of wards
N <- dim(ScotLip)[1]
# Format the data for NIMBLE in a list
ScotLipdata = list(
O = ScotLip$CANCER # observed nb of cases
)
ScotLipConsts <-list(
N = N, # nb of wards
E = ScotLip$CEXP # expected number of cases
)
# Initialise the unknown parameters, 2 chains
inits <- list(
list(alpha=0.01, tau.theta=10, theta = rep(0.01,times=N)), # chain 1
list(alpha=0.5, tau.theta=1, theta = rep(-0.01,times=N))) # chain 2
# Monitored parameters
params <- c("sigma2.theta", "overallSIR", "resSIR", "SIR", "e", "mu", "alpha", "theta")
Note that parameters not included in params will NOT be monitored!
# MCMC setting
ni <- 50000 # nb iterations
nt <- 10 # thinning interval
nb <- 10000 # nb iterations as burn-in
nc <- 2 # nb chains
The burn-in should be long enough to discard the initial part of the Markov chains that have not yet converged to the stationary distribution.
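As a quick check of these settings (a small sketch using only the objects defined above), we can compute how many posterior draws will be retained:
# draws kept per chain after burn-in and thinning: (50000 - 10000) / 10 = 4000
(ni - nb) / nt
# total number of retained draws across the nc = 2 chains: 8000
(ni - nb) / nt * nc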
We are now ready to run the MCMC simulation with nimbleMCMC():
UnstrCodesamples <- nimbleMCMC(code = UnstrCode,
data = ScotLipdata,
constants = ScotLipConsts,
inits = inits,
monitors = params,
niter = ni,
nburnin = nb,
thin = nt,
nchains = nc,
setSeed = 9,
progressBar = FALSE, # if true then you can monitor the progress
samplesAsCodaMCMC = TRUE,
summary = TRUE,
WAIC = TRUE
)
Note that by specifying samplesAsCodaMCMC = TRUE, we can use the functionality of the coda package to run diagnostics.
We will use the function gelman.diag() from the coda package to calculate the Gelman-Rubin diagnostic (Rhat; Gelman, Rubin, and others 1992) and get an idea of which random variables have converged:
GR.diag <- gelman.diag(UnstrCodesamples$samples, multivariate = FALSE)
all(GR.diag$psrf[,"Point est."] < 1.1)
## [1] TRUE
which(GR.diag$psrf[,"Point est."] > 1.1)
## named integer(0)
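Because we set samplesAsCodaMCMC = TRUE, other coda diagnostics are also available. As an optional extra check (a sketch, not part of the original workflow), we can look at the effective sample size of a few monitored parameters:
# effective sample sizes (summed over chains); low values indicate strong autocorrelation
effectiveSize(UnstrCodesamples$samples)[c("alpha", "overallSIR", "sigma2.theta")]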
There is some evidence that most random variables of interest have converged. Let's assess this further. We can do traceplots and autocorrelation plots using the ggmcmc package:
unstr_ggmcmc <- ggs(UnstrCodesamples$samples)
unstr_ggmcmc %>% filter(Parameter == "overallSIR") %>%
ggs_traceplot + theme_bw() + theme(text = element_text(size = 12)) -> p1
unstr_ggmcmc %>% filter(Parameter == "overallSIR") %>%
ggs_autocorrelation + theme_bw() + theme(text = element_text(size = 12)) -> p2
p1/p2
unstr_ggmcmc %>% filter(Parameter == "sigma2.theta") %>%
ggs_traceplot + theme_bw() + theme(text = element_text(size = 12)) -> p1
unstr_ggmcmc %>% filter(Parameter == "sigma2.theta") %>%
ggs_autocorrelation + theme_bw() + theme(text = element_text(size = 12)) -> p2
p1/p2
set.seed(9)
ra <- sample(1:N, size = 1)
unstr_ggmcmc %>% filter(Parameter == paste0("SIR[", ra, "]")) %>%
ggs_traceplot + theme_bw() + theme(text = element_text(size = 12)) -> p1
unstr_ggmcmc %>% filter(Parameter == paste0("SIR[", ra, "]")) %>%
ggs_autocorrelation + theme_bw() + theme(text = element_text(size = 12)) -> p2
p1/p2
Once we are more or less sure that the chains have reached convergence, we can extract the results stored in UnstrCodesamples:
head(UnstrCodesamples$summary$chain1, digits = 3)
head(UnstrCodesamples$summary$chain2, digits = 3)
head(UnstrCodesamples$summary$all.chains, digits = 3)
# also
UnstrCodesamples$summary$chain2[c(1, 2, 3, 7),]
# or
UnstrCodesamples$summary$chain2["sigma2.theta",]
We can also examine the posterior distribution of sigma2.theta:
unstr_ggmcmc %>% filter(Parameter == "sigma2.theta") %>%
ggs_histogram() + theme_bw() + theme(text = element_text(size = 12)) -> p1
unstr_ggmcmc %>% filter(Parameter == "sigma2.theta") %>%
ggs_density() + theme_bw() + theme(text = element_text(size = 12)) -> p2
p1/p2
Back in R, we extract the posterior median of the SIRs and add it to the shapefile:
ScotLip$unstr_SIR <- UnstrCodesamples$summary$all.chains[paste0("SIR[", 1:N, "]"), "Median"]
We then use ggplot2 to produce a map of the smoothed SIRs:
ggplot() + geom_sf(data = ScotLip, aes(fill = unstr_SIR), col = NA) + theme_bw() +
scale_fill_viridis_c(limits = c(0,5)) +
theme(axis.text = element_blank(),
text = element_text(size = 12))
dat.box = data.frame(SMR = c(ScotLip$crudeSIR, ScotLip$unstr_SIR),
type = c(rep("Crude", times = N), rep("SIR_unstr", times = N)))
ggplot() + geom_boxplot(data = dat.box, aes(x = type, y = SMR, fill = type), alpha = .4) +
theme_bw() + theme(text = element_text(size = 14))
We can also get the posterior probability that the spatial SIR per area is larger than 1:
postProb <- sapply(paste0("SIR[", 1:N, "]"),
function(X) mean(UnstrCodesamples$samples$chain1[,X]>1))
ScotLip$unstr_postProb <- postProb
ggplot() + geom_sf(data = ScotLip, aes(fill = unstr_postProb), col = NA) + theme_bw() +
scale_fill_viridis_c(limits = c(0,1)) +
theme(axis.text = element_blank(),
text = element_text(size = 12)) -> p1
ggplot() + geom_boxplot(data = ScotLip, aes(y = unstr_postProb)) +
scale_fill_viridis_d(alpha = .5) + theme_bw() + xlim(c(-1,1)) +
theme(axis.text.x = element_blank(),
text = element_text(size = 12)) -> p2
(p1|p2) + plot_annotation(title = 'Posterior probability Pr(SIR > 1)')
We will now fit an ICAR model (Besag, York, and Mollié 1991). Note that an ICAR prior models strong spatial dependence. The model is as follows: \[ \begin{eqnarray} O_{i}|\lambda_{i} & \sim & \text{Poisson}(\lambda_{i}E_{i} ) \\ \log(\lambda_{i}) & = & \alpha + \phi_{i} \\ \phi_{i} & \sim & \text{ICAR}({\bf W}, 1/\tau_{\phi}) \end{eqnarray} \]
where in this case the random effects are denoted with \(\phi\) and model strong spatial dependence. The elements of the adjacency matrix \({\bf W}\) are defined as: \[\begin{equation} w_{ij} = \begin{cases} 1 & \text{if } j \in \partial_i \\ 0 & \text{otherwise} \end{cases} \end{equation}\]
where \(\partial_i\) is the set of neighbors of area \(i\).
We use the spdep package to compute the neighborhood structure of the wards and convert it to the format required by NIMBLE:
Wards_nb <- poly2nb(pl = ScotLip)
nbWB_A <- nb2WB(nb = Wards_nb)
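The object returned by nb2WB() stores the adjacency structure in the sparse format expected by dcar_normal() in NIMBLE: a vector adj of concatenated neighbor indices, a vector num with the number of neighbors of each ward and a vector of weights (all equal to 1 here). A quick inspection (a sketch) of these elements:
str(nbWB_A)          # list with elements adj, weights and num
summary(nbWB_A$num)  # distribution of the number of neighbors per ward
length(nbWB_A$adj)   # total length of the adjacency vector (the constant L below)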
Remember that the purpose of this code is to illustrate NIMBLE, not to answer any research question specific to the Scottish lip cancer data.
ICARCode <- nimbleCode(
{
for (i in 1:N){
O[i] ~ dpois(mu[i]) # Poisson for observed counts
log(mu[i]) <- log(E[i]) + alpha + phi[i]
SIR[i] <- exp(alpha + phi[i]) # area-specific SIR
resSIR[i] <- exp(phi[i]) # area-specific residual SIR
e[i] <- (O[i]-mu[i])/sqrt(mu[i]) # residuals
}
# ICAR
phi[1:N] ~ dcar_normal(adj[1:L], weights[1:L], num[1:N], tau.phi, zero_mean = 1)
# zero_mean = 1 imposes a sum-to-zero constraint, needed when the model includes an intercept
# Priors:
alpha ~ dflat() # vague uniform prior
overallSIR <- exp(alpha) # overall SIR across study region
tau.phi ~ dgamma(1, 0.01) # precision of the ICAR component
sigma2.phi <- 1/tau.phi # variance of the ICAR component
}
)
Setting zero_mean = 1 imposes a sum-to-zero constraint on the ICAR random effects. This is important when you include an intercept in the model, in order to bypass identifiability issues (we check the constraint numerically after fitting the model below). If you don't have an intercept in the model, then you should remove the sum-to-zero constraint. In theory, given a flat prior on the intercept, both approaches should be equivalent.
# Obtain the number of wards
N <- dim(ScotLip)[1]
# Format the data for NIMBLE in a list
ScotLipdata = list(
O = ScotLip$CANCER # observed nb of cases
)
ScotLipConsts <-list(
N = N, # nb of wards
E = ScotLip$CEXP, # expected number of cases
# and the elements of the neighboring matrix:
L = length(nbWB_A$weights),
adj = nbWB_A$adj,
num = nbWB_A$num,
weights = nbWB_A$weights
)
# Initialise the unknown parameters, 2 chains
inits <- list(
list(alpha=0.01,
tau.phi=10, phi = rep(0.01,times=N)), # chain 1
list(alpha=0.5,
tau.phi=1, phi = rep(-0.01,times=N)) # chain 2
)
# Monitored parameters
params <- c("sigma2.phi", "overallSIR", "resSIR", "SIR", "e", "mu", "alpha", "phi")
# MCMC setting
ni <- 50000 # nb iterations
nt <- 10 # thinning interval
nb <- 10000 # nb iterations as burn-in
nc <- 2 # nb chains
ICARCodesamples <- nimbleMCMC(code = ICARCode,
data = ScotLipdata,
constants = ScotLipConsts,
inits = inits,
monitors = params,
niter = ni,
nburnin = nb,
thin = nt,
nchains = nc,
setSeed = 9,
progressBar = FALSE,
samplesAsCodaMCMC = TRUE,
summary = TRUE,
WAIC = TRUE
)
We can check convergence in a similar way as before:
GR.diag <- gelman.diag(ICARCodesamples$samples, multivariate = FALSE)
all(GR.diag$psrf[,"Point est."] < 1.1)
## [1] TRUE
which(GR.diag$psrf[,"Point est."] > 1.1)
## named integer(0)
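Recall the zero_mean = 1 constraint discussed above. As a quick optional check (a sketch, not part of the original workflow), the posterior means of the phi's should sum to approximately zero:
# sum of the posterior means of phi; close to 0 because of the sum-to-zero constraint
sum(ICARCodesamples$summary$all.chains[paste0("phi[", 1:N, "]"), "Mean"])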
Note that you can get an overall plot of the G-R diagnostic using the ggmcmc package. Let's select specific variables, say overallSIR, sigma2.phi, alpha and resSIR[1:3], because plotting everything would look messy:
ICAR_ggmcmc <- ggs(ICARCodesamples$samples)
ICAR_ggmcmc %>% filter(Parameter %in% c("overallSIR", "sigma2.phi",
"alpha", paste0("resSIR[", 1:3, "]"))) %>%
ggs_Rhat() + xlab("R_hat") + theme_bw() + theme(text = element_text(size = 16))
ICAR_ggmcmc %>% filter(Parameter == "overallSIR") %>%
ggs_traceplot + theme_bw() + theme(text = element_text(size = 12)) -> p1
ICAR_ggmcmc %>% filter(Parameter == "overallSIR") %>%
ggs_autocorrelation + theme_bw() + theme(text = element_text(size = 12)) -> p2
p1/p2
Of course more checks are needed to make sure that the random variables of interest have converged, but the initial impression is positive.
For comparison purposes, I will attach the ICAR SIRs (posterior medians) to the initial shapefile:
ScotLip$ICAR_SIR <- ICARCodesamples$summary$all.chains[paste0("SIR[", 1:N, "]"), "Median"]
The BYM model is probably the most popular model for disease mapping. It is basically a combination of the ICAR model and a model with an unstructured component, see (Besag, York, and Mollié 1991) for more details: \[ \begin{eqnarray} O_{i}|\lambda_{i} & \sim & \text{Poisson}(\lambda_{i}E_{i} ) \\ \log(\lambda_{i}) & = & \alpha + \theta_{i} + \phi_{i} \\ \theta_{i} & \sim & N(0,1/\tau_{\theta}) \\ \phi_{i} & \sim & \text{ICAR}({\bf W}, 1/\tau_{\phi}) \end{eqnarray} \]
The code is as follows:
BYMCode <- nimbleCode(
{
for (i in 1:N){
O[i] ~ dpois(mu[i]) # Poisson for observed counts
log(mu[i]) <- log(E[i]) + alpha + theta[i] + phi[i]
theta[i] ~ dnorm(0, tau = tau.theta) # area-specific RE
SIR[i] <- exp(alpha + theta[i] + phi[i]) # area-specific SIR
resSIR[i] <- exp(theta[i] + phi[i]) # area-specific residual SIR
e[i] <- (O[i]-mu[i])/sqrt(mu[i]) # residuals
}
# ICAR
phi[1:N] ~ dcar_normal(adj[1:L], weights[1:L], num[1:N], tau.phi, zero_mean = 1)
# Priors:
alpha ~ dflat() # vague uniform prior
overallSIR <- exp(alpha) # overall SIR across study region
tau.theta ~ dgamma(1, 0.01) # prior for the precision of theta
sigma2.theta <- 1/tau.theta # variance of theta
tau.phi ~ dgamma(1, 0.01) # prior for the precision of phi
sigma2.phi <- 1/tau.phi # variance of phi
}
)
In a similar way as before, you need to specify the data. The data we are using are identical to those used for the ICAR model, so we skip this step and only define the initial values and monitored parameters:
inits <- list(
list(alpha=0.01,
tau.theta=10, theta = rep(0.01,times=N),
tau.phi=10, phi = rep(0.01,times=N)), # chain 1
list(alpha=0.5,
tau.theta=1, theta = rep(-0.01,times=N),
tau.phi=1, phi = rep(-0.01,times=N)) # chain 2
)
# Monitored parameters
params <- c("sigma2.phi", "sigma2.theta", "overallSIR", "resSIR", "SIR", "e", "mu", "alpha", "theta", "phi")
BYMCodesamples <- nimbleMCMC(code = BYMCode,
data = ScotLipdata,
constants = ScotLipConsts,
inits = inits,
monitors = params,
niter = ni,
nburnin = nb,
thin = nt,
nchains = nc,
setSeed = 9,
progressBar = FALSE,
samplesAsCodaMCMC = TRUE,
summary = TRUE,
WAIC = TRUE
)
GR.diag <- gelman.diag(BYMCodesamples$samples, multivariate = FALSE)
all(GR.diag$psrf[,"Point est."] < 1.1)
## [1] TRUE
which(GR.diag$psrf[,"Point est."] > 1.1)
## named integer(0)
Let's look at the traceplot and autocorrelation plot of sigma2.theta:
BYM_ggmcmc <- ggs(BYMCodesamples$samples)
BYM_ggmcmc %>% filter(Parameter == "sigma2.theta") %>%
ggs_traceplot + theme_bw() + theme(text = element_text(size = 12)) -> p1
BYM_ggmcmc %>% filter(Parameter == "sigma2.theta") %>%
ggs_autocorrelation + theme_bw() + theme(text = element_text(size = 12)) -> p2
p1/p2
It seems that we need to tune the MCMC settings to achieve convergence for sigma2.theta (not done here; a possible adjustment is sketched below).
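One possible adjustment (a sketch with assumed values; the object names ni2, nb2, nt2 and BYMCodesamples2 are illustrative and the chosen settings would still need checking) is to run longer chains with a longer burn-in and heavier thinning:
# assumed alternative MCMC settings for the BYM model (not run in this document)
ni2 <- 200000 # more iterations
nb2 <- 50000  # longer burn-in
nt2 <- 50     # heavier thinning to reduce autocorrelation
BYMCodesamples2 <- nimbleMCMC(code = BYMCode,
                              data = ScotLipdata,
                              constants = ScotLipConsts,
                              inits = inits,
                              monitors = params,
                              niter = ni2,
                              nburnin = nb2,
                              thin = nt2,
                              nchains = nc,
                              setSeed = 9,
                              samplesAsCodaMCMC = TRUE,
                              summary = TRUE,
                              WAIC = TRUE)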
We attach the BYM SIRs (posterior medians) to the initial shapefile:
ScotLip$BYM_SIR <- BYMCodesamples$summary$all.chains[paste0("SIR[", 1:N, "]"), "Median"]
BYM2 is a reparametrization of the BYM model with the following features: 1) it resolves the identifiability issues of the BYM model; 2) the hyperparameters of the field are almost orthogonal, facilitating clear interpretation and making prior assignment intuitive; 3) the spatially structured random effect is scaled, so that the hyperprior used for one graph has the same interpretation on any other graph (for instance, the hyperpriors for the graph of Scotland have the same interpretation as the hyperpriors for the graph of Greece only if the spatially structured random effect is scaled). For more information see Riebler et al. (2016) and Freni-Sterrantino, Ventrucci, and Rue (2018). \[ \begin{eqnarray} O_{i}|\lambda_{i} & \sim & \text{Poisson}(\lambda_{i}E_{i} ) \\ \log(\lambda_{i}) & = & \alpha + b_{i} \\ b_{i} & = & \frac{1}{\sqrt{\tau_b}}\big(\theta_{i}\sqrt{1-\rho} + \phi_{i}\sqrt{\rho}\big)\\ \theta_{i} & \sim & N(0,1/\tau_{\theta}) \\ \phi_{i} & \sim & \text{ICAR}({\bf W^*}, 1) \end{eqnarray} \]
The code is as follows:
BYM2Code <- nimbleCode(
{
for (i in 1:N){
O[i] ~ dpois(mu[i]) # Poisson likelihood for observed counts
log(mu[i]) <- log(E[i]) + alpha + b[i]
b[i] <- (1/sqrt(tau.b))*(sqrt((1-rho))*theta[i] + sqrt(rho/scale)*phi[i])
theta[i] ~ dnorm(0, tau = tau.theta) # area-specific RE
SIR[i] <- exp(alpha + b[i]) # area-specific SIR
resSIR[i] <- exp(b[i]) # area-specific residual SIR
e[i] <- (O[i]-mu[i])/sqrt(mu[i]) # residuals
}
# ICAR
phi[1:N] ~ dcar_normal(adj[1:L], weights[1:L], num[1:N], tau = 1, zero_mean = 1) # its scaled so tau = 1
# Priors:
# intercept
alpha ~ dflat() # vague uniform prior
overallSIR <- exp(alpha) # overall SIR across study region
# precision parameter of the reparametrization
tau.b ~ dgamma(1, 0.01) # prior for the precision of b
sigma2.b <- 1/tau.b # the variance of b
# precision parameter of theta
tau.theta ~ dgamma(1, 0.01) # prior for the precision of theta
sigma2.theta <- 1/tau.theta # the variance of theta
# mixing parameter
rho ~ dbeta(1, 1) # prior for the mixing parameter
}
)
To use the BYM2 parametrization, we first need the scaling factor for the ICAR component, which we compute using the INLA package:
# Wards_nb is the spatial object created using the shapefile and the spdep package
W.scale <- nb2mat(Wards_nb, zero.policy = TRUE, style = "B")
W.scale <- -W.scale
diag(W.scale) <- abs(apply(W.scale, 1, sum))
# solve(W.scale) # this should not work since by definition the matrix is singular
Q = inla.scale.model(W.scale, constr=list(A=matrix(1, nrow=1, ncol=nrow(W.scale)), e=0))
scale = exp((1/nrow(W.scale))*sum(log(1/diag(Q))))
LipCancerConsts <-list(
N = N,
E = ScotLip$CEXP,
# elements of neighboring matrix
adj = nbWB_A$adj,
weights = nbWB_A$weights,
num = nbWB_A$num,
L = length(nbWB_A$weights),
# scale
scale = scale
)
inits <- list(
list(alpha = 0.01,
tau.theta = 10,
theta = rep(0.01, times = N),
phi = rep(0.01, times = N),
tau.b = 10,
rho = .2), # chain 1
list(alpha = 0.5,
tau.theta = 1,
theta = rep(-0.01 ,times = N),
phi = rep(-0.01, times = N),
tau.b = 1,
rho = .7) # chain 2
)
params <- c("sigma2.b", "sigma2.theta", "rho", "overallSIR", "resSIR", "SIR", "e", "mu", "alpha", "b", "theta")
BYM2Codesamples <- nimbleMCMC(code = BYM2Code,
data = ScotLipdata,
constants = LipCancerConsts,
inits = inits,
monitors = params,
niter = ni,
nburnin = nb,
thin = nt,
nchains = nc,
setSeed = 9,
progressBar = FALSE,
samplesAsCodaMCMC = TRUE,
summary = TRUE,
WAIC = TRUE
)
GR.diag <- gelman.diag(BYM2Codesamples$samples, multivariate = FALSE)
all(GR.diag$psrf[,"Point est."] < 1.1)
## [1] TRUE
which(GR.diag$psrf[,"Point est."] > 1.1)
## named integer(0)
We attach the BYM2 SIRs (posterior medians) to the initial shapefile:
ScotLip$BYM2_SIR <- BYM2Codesamples$summary$all.chains[paste0("SIR[", 1:N, "]"), "Median"]
In this section we will compare the output of the three different models. In this comparison we will focus on SIRs, exceedance probabilities and the different intercepts.
First, I will take some random samples of the SIRs and check convergence visually (keep in mind that the G-R diagnostic already provided evidence of convergence):
set.seed(9)
samples <- sample(x = 1:N, size = 5)
unstr_ggmcmc %>% filter(Parameter %in% paste0("SIR[", samples, "]")) %>%
ggs_traceplot + theme_bw() + theme(text = element_text(size = 12))
samples <- sample(x = 1:N, size = 5)
ICAR_ggmcmc %>% filter(Parameter %in% paste0("SIR[", samples, "]")) %>%
ggs_traceplot + theme_bw() + theme(text = element_text(size = 12))
samples <- sample(x = 1:N, size = 5)
BYM_ggmcmc %>% filter(Parameter %in% paste0("SIR[", samples, "]")) %>%
ggs_traceplot + theme_bw() + theme(text = element_text(size = 12))
BYM2_ggmcmc <- ggs(BYM2Codesamples$samples)
samples <- sample(x = 1:N, size = 5)
BYM2_ggmcmc %>% filter(Parameter %in% paste0("SIR[", samples, "]")) %>%
ggs_traceplot + theme_bw() + theme(text = element_text(size = 12))
Now we can start comparing the models, first by mapping the crude and smoothed SIRs:
ggplot() + geom_sf(data = ScotLip, aes(fill = crudeSIR), col = "NA") +
scale_fill_viridis_c(limits = c(0, 6.6)) + theme_bw() +
theme(axis.text = element_blank(), text = element_text(size = 10)) -> p1
ggplot() + geom_sf(data = ScotLip, aes(fill = unstr_SIR), col = "NA") +
scale_fill_viridis_c(limits = c(0, 6.6)) + theme_bw() +
theme(axis.text = element_blank(), text = element_text(size = 10)) -> p2
ggplot() + geom_sf(data = ScotLip, aes(fill = ICAR_SIR), col = "NA") +
scale_fill_viridis_c(limits = c(0, 6.6)) + theme_bw() +
theme(axis.text = element_blank(), text = element_text(size = 10)) -> p3
ggplot() + geom_sf(data = ScotLip, aes(fill = BYM_SIR), col = "NA") +
scale_fill_viridis_c(limits = c(0, 6.6)) + theme_bw() +
theme(axis.text = element_blank(), text = element_text(size = 10)) -> p4
ggplot() + geom_sf(data = ScotLip, aes(fill = BYM2_SIR), col = "NA") +
scale_fill_viridis_c(limits = c(0, 6.6)) + theme_bw() +
theme(axis.text = element_blank(), text = element_text(size = 10)) -> p5
(p1|p2|p3)/(p4|p5)
It is clear that there is some level of smoothness induced in the SIR using the different hierarchical Bayesian models. For the model with just the unstructured term, as expected we have a global but not local smoothing, whereas for the ICAR and the rest of the models, it is clear that the smoothing was also performed locally, using the adjacency structure. We can also assess the boxplots of the SIRs:
data.frame(model = c(rep("Crude", N), rep("Unstr", N),
rep("ICAR", N), rep("BYM", N), rep("BYM2", N)),
SIR =c(ScotLip$crudeSIR, ScotLip$unstr_SIR, ScotLip$ICAR_SIR,
ScotLip$BYM_SIR, ScotLip$BYM2_SIR)) %>%
ggplot() + geom_boxplot(aes(x = model, y = SIR)) + theme_bw() +
theme(text = element_text(size = 14))
suppressMessages(
print(
ScotLip %>% select(crudeSIR, unstr_SIR, ICAR_SIR, BYM_SIR, BYM2_SIR) %>%
as.data.frame() %>%
select(-geometry) %>%
ggpairs(lower = list(continuous = wrap("points", size=.8))) +
theme_bw() + theme(text = element_text(size = 14))
)
)
We also compute the exceedance probabilities Pr(SIR > 1) for the ICAR, BYM and BYM2 models and map them together with those of the unstructured model:
ScotLip$ICAR_postProb <- sapply(paste0("SIR[", 1:N, "]"),
function(X) mean(ICARCodesamples$samples$chain1[,X]>1))
ScotLip$BYM_postProb <- sapply(paste0("SIR[", 1:N, "]"),
function(X) mean(BYMCodesamples$samples$chain1[,X]>1))
ScotLip$BYM2_postProb <- sapply(paste0("SIR[", 1:N, "]"),
function(X) mean(BYM2Codesamples$samples$chain1[,X]>1))
ggplot() + geom_sf(data = ScotLip, aes(fill = unstr_postProb), col = "NA") +
scale_fill_viridis_c(limits = c(0, 1)) + theme_bw() +
theme(axis.text = element_blank(), text = element_text(size = 10)) -> p1
ggplot() + geom_sf(data = ScotLip, aes(fill = ICAR_postProb), col = "NA") +
scale_fill_viridis_c(limits = c(0, 1)) + theme_bw() +
theme(axis.text = element_blank(), text = element_text(size = 10)) -> p2
ggplot() + geom_sf(data = ScotLip, aes(fill = BYM_postProb), col = "NA") +
scale_fill_viridis_c(limits = c(0, 1)) + theme_bw() +
theme(axis.text = element_blank(), text = element_text(size = 10)) -> p3
ggplot() + geom_sf(data = ScotLip, aes(fill = BYM2_postProb), col = "NA") +
scale_fill_viridis_c(limits = c(0, 1)) + theme_bw() +
theme(axis.text = element_blank(), text = element_text(size = 10)) -> p4
(p1|p2)/(p3|p4)
Note that we can also calculate the exceedance probabilities within the sampler using the step() function. For example, in the BYM2 model, inside the for loop just after the calculation of e[i], we can add: proba.resSIR[i] <- step(resSIR[i]-1). Make sure you monitor proba.resSIR by adding it to params. Once the sampler has run, the posterior mean of proba.resSIR[i] gives the posterior probability that the residual SIR in ward i exceeds 1.
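A minimal sketch of this approach for the BYM2 model (proba.resSIR is an extra node introduced here for illustration; it was not monitored in the runs above):
# inside the for loop of BYM2Code, after the line defining e[i], add:
#   proba.resSIR[i] <- step(resSIR[i] - 1)   # equals 1 when resSIR[i] >= 1, 0 otherwise
# then add "proba.resSIR" to params, re-run nimbleMCMC() and summarize, e.g.:
# exceed <- colMeans(BYM2Codesamples$samples$chain1[, paste0("proba.resSIR[", 1:N, "]")])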
|  | Unstr | ICAR | BYM | BYM2 |
|---|---|---|---|---|
| \(\alpha\) | 0.08 (-0.16, 0.31) | 0.09 (-0.01, 0.19) | 0.09 (-0.02, 0.2) | 0.09 (-0.01, 0.19) |
| \(\sigma^2_{\theta}\) | 0.57 (0.35, 0.94) |  | 0.01 (0, 0.07) | 0.01 (0, 0.16) |
| \(\sigma^2_{\phi}\) |  | 0.64 (0.35, 1.15) | 0.61 (0.31, 1.13) | 1 |
| \(\sigma^2_{b}\) |  |  |  | 0.37 (0.18, 1.19) |
| \(\rho\) |  |  |  | 0.71 (0.23, 0.99) |
| WAIC | 307.5 | 293.61 | 293.69 | 295.24 |
The table above reports the posterior summaries (point estimates and credible intervals) of the intercept and the hyperparameters under each model, together with the WAIC (Watanabe 2013).
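The WAIC values in the last row come from setting WAIC = TRUE in the nimbleMCMC() calls and can be read directly from the returned objects (a sketch; the exact structure of the WAIC element may differ slightly across NIMBLE versions):
# WAIC of each model as returned by nimbleMCMC(..., WAIC = TRUE)
UnstrCodesamples$WAIC
ICARCodesamples$WAIC
BYMCodesamples$WAIC
BYM2Codesamples$WAIC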
The purpose of this tutorial is to show how to fit spatial models using NIMBLE
(de Valpine et al. 2017). Nevertheless, the tutorial does not provide an exhaustive list of the spatial models that can be fitted in NIMBLE
. Examples of other models include the proper model (Cressie and Chan 1989), the Leroux model (Leroux, Lei, and Breslow 2000), see also Best, Richardson, and Thomson (2005) and Lee (2011) for simulation studies and relevant comparisons. Additional sources for code include the tutorial by Lawson (Lawson 2020) and also chapter 9 of the NIMBLE manual.
Besag, Julian, Jeremy York, and Annie Mollié. 1991. “Bayesian Image Restoration, with Two Applications in Spatial Statistics.” Annals of the Institute of Statistical Mathematics 43 (1): 1–20.
Best, Nicky, Sylvia Richardson, and Andrew Thomson. 2005. “A Comparison of Bayesian Spatial Models for Disease Mapping.” Statistical Methods in Medical Research 14 (1): 35–59.
Clayton, David, and John Kaldor. 1987. “Empirical Bayes Estimates of Age-Standardized Relative Risks for Use in Disease Mapping.” Biometrics, 671–81.
Cressie, Noel, and Ngai H Chan. 1989. “Spatial Modeling of Regional Variables.” Journal of the American Statistical Association 84 (406): 393–401.
de Valpine, P., D. Turek, C. J. Paciorek, C. Anderson-Bergman, D. Temple Lang, and R. Bodik. 2017. “Programming with Models: Writing Statistical Algorithms for General Model Structures with NIMBLE.” Journal of Computational and Graphical Statistics 26: 403–17. https://doi.org/10.1080/10618600.2016.1172487.
Freni-Sterrantino, Anna, Massimo Ventrucci, and Håvard Rue. 2018. “A Note on Intrinsic Conditional Autoregressive Models for Disconnected Graphs.” Spatial and Spatio-Temporal Epidemiology 26: 25–34.
Gelman, Andrew, Donald B Rubin, and others. 1992. “Inference from Iterative Simulation Using Multiple Sequences.” Statistical Science 7 (4): 457–72.
Lawson, Andrew. 1999. Disease Mapping and Risk Assessment for Public Health. Wiley.
Lawson, Andrew B. 2020. “NIMBLE for Bayesian Disease Mapping.” Spatial and Spatio-Temporal Epidemiology, 100323.
Lee, Duncan. 2011. “A Comparison of Conditional Autoregressive Models Used in Bayesian Disease Mapping.” Spatial and Spatio-Temporal Epidemiology 2 (2): 79–89.
Leroux, Brian G, Xingye Lei, and Norman Breslow. 2000. “Estimation of Disease Rates in Small Areas: A New Mixed Model for Spatial Dependence.” In Statistical Models in Epidemiology, the Environment, and Clinical Trials, 179–91. Springer.
Riebler, Andrea, Sigrunn H Sørbye, Daniel Simpson, and Håvard Rue. 2016. “An Intuitive Bayesian Spatial Model for Disease Mapping That Accounts for Scaling.” Statistical Methods in Medical Research 25 (4): 1145–65.
Watanabe, Sumio. 2013. “A Widely Applicable Bayesian Information Criterion.” Journal of Machine Learning Research 14 (Mar): 867–97.