This tutorial introduces the package sbim
and explains how to use it through examples. The package implements parameter inference methods for stochastic models defined implicitly by a simulation algorithm, developed in Park (2025), "Scalable simulation-based inference for implicitly defined models using a metamodel for Monte Carlo log-likelihood estimator". First, the methodological and theoretical framework for inference is explained. Next, we show how to create an R
object that contains simulation-based log-likelihood estimates. We then explain how to carry out hypothesis tests, first for independent and identically distributed (iid) data using a toy example, and then for a class of models generating dependent observations using a stochastic volatility model.
This section provides a mathematical basis for the methods implemented in the package sbim
. Further details can be found in the article Park (2025). If you only want to learn how to use the package, you may skip to the next section.
We consider a collection of latent random variables \(X\) distributed according to \(P_\theta\), and partial observations \(Y\) whose conditional distributions have density \(g(y|x;\theta)\). The underlying process \(P_\theta\) is not assumed to have a density that can be evaluated analytically pointwise. The parameter \(\theta\) may affect both the latent process \(X\) and the conditional measurement process \(Y\) given \(X\); alternatively, \(\theta\) may consist of two components, one governing only the latent process and the other only the measurement process.
The goodness of fit to the observed data is assessed for a set of parameter values \(\theta_1,\dots, \theta_M\) by simulating the underlying process at each parameter value and obtaining a log likelihood estimate. There may be duplicates among \(\theta_{1:M}\), meaning that independent simulations were carried out at the same parameter value. The reason that log likelihood estimates are used is that the variance of the likelihood estimator (on the natural, i.e., non-log scale) often scales exponentially with the size of the data, and the distribution of the likelihood estimator is often highly skewed. The distribution of a log likelihood estimator, in contrast, is often sufficiently regular. Therefore, by modeling the distribution of the log likelihood estimator, one can use all simulation-based log likelihood estimates for inference, rather than only the small fraction of simulations whose likelihood estimates are reasonably close to the exact likelihood.
Our approach is based on a simulation metamodel for log likelihood estimates given by \[\ell^S(\theta;y_{1:n}) \sim \mathcal N\left\{a(y_{1:n}) + b(y_{1:n})^\top \theta + \theta^\top c(y_{1:n})\theta, \, \frac{\sigma^2(y_{1:n})}{w(\theta)}\right\}\] where \(\ell^S(\theta; y_{1:n})\) denotes the simulation log-likelihood at \(\theta\), that is, the estimate of the log likelihood \(\ell(\theta;y_{1:n})\) given the observations \(y_{1:n}\). The simulation log-likelihood may be obtained as simply as \(\ell^S(\theta; y_{1:n}) := \log g(y_{1:n}|X;\theta)\), where \(X\sim P_\theta\) is a simulation outcome of the underlying process at \(\theta\). It may also be obtained using a different method, e.g., the particle filter for hidden Markov models. The mean function is a quadratic polynomial in \(\theta\). This metamodel is intended to give a local description of the distribution of the simulation log-likelihood around the maximum of the mean function. The variance depends on a parameter-specific value \(w(\theta)\), which may be approximated by a constant if the variance of the simulation log-likelihood varies little in this local neighborhood. If simulations were carried out with different precisions at different parameter values, \(w(\theta)\) may be chosen to reflect the relative differences. For example, if the simulation log-likelihoods are obtained by running the particle filter, then the variance of the log-likelihood estimate scales approximately inversely proportionally to the number of particles used. In this case, \(w(\theta)\) may be chosen proportional to the number of particles.
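To make the metamodel concrete, here is a minimal, self-contained sketch (not part of the package) that draws simulation log-likelihoods from this metamodel for a one-dimensional \(\theta\), using hypothetical coefficient values and a constant weight \(w(\theta)\equiv 1\):
## hypothetical metamodel coefficients (a, b, c) and error variance sigma^2, for illustration only
a <- -10; b <- 4; cc <- -2; sigma_sq <- 1
mu <- function(theta) a + b*theta + cc*theta^2    # quadratic mean function, maximized at -b/(2*cc) = 1
w <- function(theta) 1                            # constant weight, as when all simulations use equal effort
theta_sim <- seq(0, 2, by = 0.1)                  # parameter values at which simulations are made
ll_sim <- rnorm(length(theta_sim), mean = mu(theta_sim), sd = sqrt(sigma_sq / w(theta_sim)))
plot(theta_sim, ll_sim); curve(mu, add = TRUE)    # simulated log-likelihoods scatter around the quadratic mean
In practice the coefficients \(a\), \(b\), \(c\), and \(\sigma^2\) are unknown and are estimated from the simulated pairs \((\theta_m, \ell^S(\theta_m; y_{1:n}))\), as described below.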
We assume that the simulation log-likelihood for each observation piece, \(\ell^S(\theta; y_i)\), is available for \(i\in 1:n\). For instance, it may be given by \(\ell^S(\theta; y_i) = \log g_i(y_i|X;\theta)\), where \(g_i\) is the measurement density of the \(i\)-th observation piece and \(X\) is a simulated draw from \(P_\theta\). If the observations \(y_i\), \(i\in 1:n\), are conditionally independent given \(X\), then \(g(y_{1:n}|X;\theta) = \prod_{i=1}^n g_i(y_i|X;\theta)\), and thus we have \[ \ell^S(\theta; y_{1:n}) = \sum_{i=1}^n \ell^S(\theta;y_i).\]
The maximizer of the mean function \(\mu(\theta;y_{1:n}) := a(y_{1:n}) + b(y_{1:n})^\top \theta + \theta^\top c(y_{1:n})\theta\) is called the maximum expected simulation log-likelihood estimator (MESLE) and denoted by \(\hat\theta_{\text{MESLE}}\).
Parameter inference is carried out by analyzing the distribution of the quadratic mean function \(\mu(\theta;Y_{1:n})\), where \(Y_{1:n}\) are partial observations of the underlying system realized at a given parameter value \(\theta_0\). We use a local approximation drawn from the following asymptotic result, which holds under reasonably mild regularity conditions. The maximizer of the mean function averaged over the data distribution under \(\theta_0\) will be referred to as the simulation-based proxy. It will be denoted by \(\mathcal J(\theta_0)\), or simply by \(\theta_*\) when the dependence on the true parameter value \(\theta_0\) is not stressed. \[ \mathcal J(\theta_0) = \arg\max_\theta \mathbb E_{X\sim P_{\theta_0}} \mathbb E_{Y|X \sim g(\cdot|X, \theta_0)} ~\mu(\theta; Y_{1:n}).\] In general, \(\mathcal J(\theta_0) \neq \theta_0\), but there are some simple examples where \(\mathcal J(\theta_0) = \theta_0\).
The change in the mean function \(\mu(\theta)\) on a \(O(1/\sqrt{n})\) scale can be described by \[\mu\left(\theta_* + \frac{t}{\sqrt n}; Y_{1:n}\right) - \mu(\theta_*; Y_{1:n}) \approx S_n^\top t - \frac{1}{2} t^\top K_2 (\theta_*; \theta_0) t,\] where the difference between the left and the right hand side converges to zero in probability as \(n\to\infty\). The probabilistic statement is with respect to the randomness in the observations generated under a true parameter value denoted by \(\theta_0\). The random variable \(S_n\) is given by \[S_n := \frac{1}{\sqrt n} \left.\frac{\partial \mu}{\partial \theta}\right\vert_{\theta=\theta_*}(\theta; Y_{1:n}),\] and it converges in distribution to \(\mathcal N(0, K_1(\theta_*;\theta_0))\), where \(K_1(\theta_*;\theta_0)\) is a positive definite matrix.
The matrix \(K_2(\theta_*;\theta_0)\) is defined by the in-probability limit \[-\frac{1}{n} \frac{\partial^2 \mu}{\partial \theta^2} (\theta_*; Y_{1:n}) \overset{i.p.}{\underset{n\to\infty}{\to}} K_2(\theta_*;\theta_0)\] where \(Y_{1:n}\) are generated under \(\theta_0\).
Inference is carried out by considering both the randomness in simulations and the randomness in observed data under a given parameter. The main tool is regression through the points \((\theta_m, \ell^S(\theta_m; y_{1:n}))\), \(m\in 1:M\) by a quadratic polynomial. When \(w(\theta_m)\), \(m\in 1:M\) are not all the same, we carry out weighted quadratic regression with weights \(w(\theta_m)\), \(m\in 1:M\).
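For a one-dimensional \(\theta\), this regression step is ordinary weighted least squares with a quadratic in \(\theta\). The following sketch illustrates it with synthetic, hypothetical inputs; it does not use the package internals.
## a sketch of the weighted quadratic regression step for a one-dimensional theta
set.seed(1)
theta_m <- seq(0.5, 1.5, length.out = 50)                      # hypothetical simulation points
llsim_m <- -100 + 40*theta_m - 20*theta_m^2 + rnorm(50, 0, 2)  # hypothetical simulation log-likelihoods
w_m <- rep(1, 50)                                              # weights w(theta_m)
fit <- lm(llsim_m ~ theta_m + I(theta_m^2), weights = w_m)
coef(fit)                             # estimates of a, b, and c
-coef(fit)[2] / (2 * coef(fit)[3])    # vertex of the fitted quadratic: point estimate of the MESLE
The vertex \(-\hat b/(2\hat c)\) of the fitted quadratic is the point estimate of the MESLE in this one-dimensional case.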
The package can be installed from the package author’s github repository as follows. Doing so requires having the devtools
package installed.
install.packages('devtools') # skip this line if `devtools` is already installed.
Then install the sbim
package:
devtools::install_github("joonhap/sbi")
The package source code may be downloaded in tar.gz format from here (not active yet, as of Dec 2023). The package can be loaded as usual:
library(sbim)
The simulation-based inference is carried out using the ht
function. When the parameter of interest is one-dimensional, confidence intervals may be constructed using the ci
function.
As an example, we consider a latent process \(X\) consisting of \(n\) iid copies of gamma random variates, denoted by \((X_1, \dots, X_n)\overset{iid}\sim \Gamma(1, \theta)\), where the shape parameter is unity and the rate parameter is \(\theta\). Partial observations \(Y_i\) depend only on \(X_i\) and are Poisson-distributed: \(Y_i|X_i \sim \text{Pois}(X_i)\). The observations \(y_{1:n}\) are generated under \(\theta = \theta_0 = 1\). For this model, the maximum expected simulation log-likelihood estimator (MESLE) is given by \(1/\bar y = \frac{n}{\sum_{i=1}^n y_i}\), which is equal to the maximum likelihood estimator (MLE) given the observations \(y_{1:n}\) (see Example 2 in Park (2025) for mathematical details). Furthermore, the simulation-based proxy \(\mathcal J(\theta_0)\) is equal to \(\theta_0\).
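For reference, the marginal distribution of each observation used in the data-generating code below follows from integrating out the latent gamma variable: \[ P(Y_i = y) = \int_0^\infty \frac{x^y e^{-x}}{y!}\, \theta e^{-\theta x}\, dx = \frac{\theta}{(1+\theta)^{y+1}} = \frac{\theta}{\theta+1}\left(\frac{1}{\theta+1}\right)^{y}, \qquad y = 0, 1, 2, \dots,\] which is the negative binomial distribution with size parameter 1 and success probability \(\theta/(\theta+1)\), matching the rnbinom call below.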
n <- 1000 # number of iid observations
gamma_t <- 1 # true shape parameter
theta_t <- 1 # true rate parameter
set.seed(98475)
## Marginally, observations are distributed following the negative binomial distribution.
y_o <- rnbinom(n=n, size=gamma_t, prob=theta_t/(theta_t+1)) # observed data (y_1,...,y_n).
MESLE <- gamma_t*n/sum(y_o) # MESLE for theta (= MLE)
sims_at <- seq(.8, 1.2, by=.001) # theta values at which simulations are made
llest <- sapply(sims_at, function(tht) {dpois(y_o, rgamma(n, gamma_t, rate=tht), log=TRUE)})
The llest
object is a matrix with \(n=1000\) rows and \(M=401\) columns. Here \(M\) is the number of parameter values at which simulations are carried out (i.e., the length of sims_at
). The \((i,m)\)-th entry of llest
gives the simulation log-likelihood \(\ell^S(\theta_m; y_i)\) for the \(i\)-th observation piece at \(\theta_m\).
We create a class simll
object which contains both the simulation log-likelihoods (llest
) and the corresponding parameter values (sims_at
).
s <- simll(llest, params=sims_at)
Hypothesis tests can be carried out for both the MESLE and the simulation-based proxy. Note that the MESLE is a statistic that depends on the observed data \(y_{1:n}\), like the MLE. The MESLE needs to be estimated using simulations, because the likelihood function is not accessible analytically. We can test \(H_0: \hat \theta_\text{MESLE}= \theta_{\text{MESLE},0}\), \(H_1: \hat \theta_\text{MESLE}\neq \theta_{\text{MESLE},0}\) for multiple null values \(\theta_{\text{MESLE},0}\) simultaneously. The null values can be passed to the ht
function as an argument null.value
in different forms. If we test for a single null value, it can be a vector of length \(d\), where \(d\) is the dimension of the parameter space (for the current example, \(d=1\)). If more than one null value is tested, null.value
can be either a matrix having \(d\) columns where each row gives a null value vector, or a list of vectors of length \(d\).
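For instance, with the simll object s constructed above, the following calls illustrate the three accepted forms (the null values here are purely illustrative):
ht(s, null.value=1, test="MESLE")                              # a single null value (d = 1)
ht(s, null.value=matrix(c(.9, 1, 1.1), ncol=1), test="MESLE")  # matrix with d columns, one row per null value
ht(s, null.value=list(.9, 1, 1.1), test="MESLE")               # list of vectors of length d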
nulls_theta <- c(seq(.8, 1.2, by=.05), MESLE) # null values to be tested
We test about the MESLE for a range of values between 0.8 and 1.2, as well as the exact value of the MESLE (\(= 1/\bar y\)). Since the parameter in the current example is one-dimensional, nulls_theta
is a numeric vector. If it is supplied to the ht
function as-is, ht
will think that we are testing about a single parameter value of dimension length(nulls_theta)
, which is not what we want. Thus we supply the null values by coercing nulls_theta
into a list using as.list
. The test
argument "MESLE"
indicates that we are testing about the MESLE.
ht_result <- ht(s, null.value=as.list(nulls_theta), test="MESLE")
The output is a list consisting of several components. First, the estimated regression coefficients \(\hat a\), \(\hat b\), \(\hat c\), as well as the estimated error variance \(\hat \sigma^2\) are given ($regression_estimates
). The simulation metamodel defines a metamodel likelihood for the obtained simulation log-likelihoods \(\ell^S(\theta_m; y_{1:n})\). The maximizer of this metamodel likelihood, which gives a point estimate for the MESLE, is outputted as well ($meta_model_MLE_for_MESLE
). Next, a table consisting of the null values and the corresponding p-values is outputted ($Hypothesis_Tests
). Finally, in order to assess the degree to which the weighted quadratic regression was appropriate, regression using a cubic polynomial is carried out and the p-value for the significance of the cubic terms is given ($pval_cubic
). If pval_cubic
is too small (say \(<0.01\)), then the inference using the quadratic regression may be biased, and the range of parameter values for simulation may need to be narrowed.
ht_result
#> $regression_estimates
#> $regression_estimates$a
#> [1] -2766.172
#>
#> $regression_estimates$b
#> [,1]
#> [1,] 1436.992
#>
#> $regression_estimates$c
#> [,1]
#> [1,] -687.4277
#>
#> $regression_estimates$sigma_sq
#> [1] 3917.005
#>
#>
#> $meta_model_MLE_for_MESLE
#> [1] 1.045195
#>
#> $Hypothesis_Tests
#> MESLE_null pvalue
#> 1 0.800000 0.0019596412
#> 2 0.850000 0.0013381581
#> 3 0.900000 0.0007788392
#> 4 0.950000 0.0005697934
#> 5 1.000000 0.0223814770
#> 6 1.050000 0.8609011185
#> 7 1.100000 0.2019227109
#> 8 1.150000 0.0835907749
#> 9 1.200000 0.0497693858
#> 10 1.051525 0.8200727746
#>
#> $pval_cubic
#> [1] 0.5123048
The ci
function constructs a one-dimensional confidence interval for the parameter.
cci <- ci(s, level=c(.9, .95), ci="MESLE") # constructed confidence intervals
The following figure gives the simulation log likelihoods and the constructed confidence intervals for \(\hat\theta_\text{MESLE}\). The vertical dashed line indicates the MESLE. The blue curve indicates the fitted quadratic polynomial, and the red curve the exact log-likelihood function with a vertical shift to facilitate ready comparison of the curvature between the fitted curve and the exact log likelihood.
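The figure itself is not reproduced in this document. The following sketch (not the original plotting code of this tutorial) shows one way such a plot could be drawn from the objects created above; the confidence interval lines from cci are omitted for brevity, and the vertical shift of the exact log-likelihood curve is chosen here simply to align the two maxima.
library(ggplot2)
regcoef <- ht_result$regression_estimates   # quadratic coefficients estimated above
fitted_quad <- function(x) regcoef$a + c(regcoef$b)*x + c(regcoef$c)*x^2
## exact log-likelihood from the negative binomial marginal, shifted vertically for comparison
exact_ll <- sapply(sims_at, function(tht) sum(dnbinom(y_o, size=gamma_t, prob=tht/(tht+1), log=TRUE)))
shift <- max(fitted_quad(sims_at)) - max(exact_ll)
ggplot(data.frame(theta=sims_at, simll=apply(llest, 2, sum)), aes(theta, simll)) +
    geom_point(size=.5) +
    geom_function(fun=fitted_quad, color="blue") +                                 # fitted quadratic
    geom_line(data=data.frame(theta=sims_at, simll=exact_ll+shift), color="red") + # shifted exact log-likelihood
    geom_vline(xintercept=MESLE, linetype="dashed") +                              # MESLE
    xlab("theta") + ylab("simulation log-likelihood")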
The simulation-based proxy \(\mathcal J(\theta_0)\) for this example is equal to \(\theta_0\), the true parameter value under which the observations were generated. We can carry out a hypothesis test \(H_0: \mathcal J(\theta_0) = \theta_{*,0}\), \(H_1: \mathcal J(\theta_0) \neq \theta_{*,0}\) using the ht
function with the test
argument set equal to "parameter"
(the default). We use the same set of simulation log-likelihoods stored in s
and the same collection of null values used for the test on the MESLE. For hypothesis testing for the simulation-based proxy, we need to let the ht
function know that the observations are iid so that the uncertainty may be correctly quantified. This can be done by passing the character string "iid"
to the case
argument. By default, the ht
function assumes that the observations form a serially dependent, stationary sequence, which corresponds to case="stationary"
. This case will be discussed in the next section.
ht_result <- ht(s, null.value=as.list(nulls_theta), test="parameter", case="iid")
ht_result
#> $regression_estimates
#> $regression_estimates$a
#> [1] -2766.172
#>
#> $regression_estimates$b
#> [,1]
#> [1,] 1436.992
#>
#> $regression_estimates$c
#> [,1]
#> [1,] -687.4277
#>
#> $regression_estimates$sigma_sq
#> [1] 3917.005
#>
#>
#> $meta_model_MLE_for_parameter
#> [1] 1.045195
#>
#> $K1
#> [,1]
#> [1,] 1.895297
#>
#> $K2
#> [,1]
#> [1,] 1.374855
#>
#> $error_variance
#> [1] 3926.798
#>
#> $Hypothesis_Tests
#> parameter_null pvalue
#> 1 0.800000 0.004063236
#> 2 0.850000 0.004471903
#> 3 0.900000 0.006806350
#> 4 0.950000 0.023845583
#> 5 1.000000 0.227609734
#> 6 1.050000 0.908901037
#> 7 1.100000 0.305120673
#> 8 1.150000 0.125547238
#> 9 1.200000 0.068807793
#> 10 1.051525 0.880937861
#>
#> $pval_cubic
#> [1] 0.5123048
When the observations are not iid, the hypothesis test on the MESLE is carried out in the same way as in the iid case, but the hypothesis test for the simulation-based parameter proxy is carried out differently. The reason for the difference in the non-iid case is that, when estimating the unknown value of \(K_1(\theta_*;\theta_0)\), the auto-correlation among \(\frac{\partial \mu}{\partial \theta}(\theta_*; y_{1:n})\) needs to be taken into consideration. We demonstrate a hypothesis test on the simulation-based proxy for the non-iid case using the following example.
We consider a stochastic volatility model where the distribution of the log rate of return \(r_t\) of a stock at time \(t\) is described by \[ r_t = e^{s_t} W_t, \quad W_t \overset{iid}{\sim} t_5, \] where \(s_t\) denotes the volatility at time \(t\) and \(t_5\) the \(t\) distribution with five degrees of freedom. The distribution of the stochastic volatility process \(\{s_t\}\) is described by \[ s_t = \kappa s_{t-1} + \tau \sqrt{1-\kappa^2} V_t ~~\text{for}~ t>1, \quad s_1 = \tau V_1, \quad \quad V_t \overset{iid}{\sim} \mathcal N(0,1). \] The rates of return \(r_t\) are observed for \(t\in 1:T\) where \(T=500\). The stochastic volatility \(\{s_t; t \in 1 : T\}\) is a Markov process with partial observations \(\{r_t; t\in 1:T\}\). We simulate the stochastic volatility process for \(\kappa=0.8, \tau=1\) and generate an observed data sequence \(r_{1:T}\).
Unbiased likelihood estimates for the sequence of observations may be obtained by the bootstrap particle filter, which uses an ensemble of Monte Carlo draws called particles simulated under a given set of parameters. We may use the logarithm of the obtained unbiased likelihood estimates as the simulation log-likelihoods for inference on the parameter proxy. The bootstrap particle filter can be run using the package pomp
with a reasonably small amount of effort: one only needs to specify the underlying process model and the measurement model.
library(pomp)
The stochastic volatility model can be defined in the pomp
-style as follows.
# Stochastic volatility model
SV_pomp <- function(t0, times, kappa, tau, sim_seed=NULL) {
    rinit <- pomp::Csnippet("
        s = tau*rnorm(0,1);
    ")
    rproc <- pomp::Csnippet("
        s = kappa*s + tau*sqrt(1-kappa*kappa)*rnorm(0,1);
    ")
    rmeas <- pomp::Csnippet("
        r = exp(s)*rt(5);
    ")
    dmeas <- pomp::Csnippet("
        double logdmeas = dt(r*exp(-s), 5, 1) - s;
        lik = (give_log) ? logdmeas : exp(logdmeas);
    ")
    if (!is.null(sim_seed)) {
        set.seed(sim_seed)
    }
    pomp::simulate(t0=t0, times=times,
        rinit=rinit,
        rprocess=pomp::discrete_time(rproc, delta.t=1),
        rmeasure=rmeas,
        dmeasure=dmeas,
        params=c(kappa=kappa, tau=tau),
        statenames = "s",
        obsnames = "r",
        paramnames = c("kappa", "tau")
    )
}
The observations \(r_t\) for \(t=1,\dots, T=500\) are generated under \(\kappa = 0.8\) and \(\tau = 1\). The parameter \(\kappa\) is constrained to be between 0 and 1, and \(\tau\) is constrained to be positive. In order to remove these constraints, we estimate \(\kappa\) and \(\tau\) on the logit and the log scale, respectively.
t0 <- 0
T <- 500
times <- 1:T
kappa_t <- 0.8
tau_t <- 1
theta_t <- c(kappa=kappa_t, tau=tau_t) # true parameter value
theta_Est <- c(kappa=logit(kappa_t), tau=log(tau_t)) # true parameter on the estimation scale
# simulate the process and generate data
SV_t <- SV_pomp(t0, times, kappa_t, tau_t, sim_seed=1554654)
r_o <- obs(SV_t) # generated observation sequence
dat_raw <- data.frame(time=time(SV_t), latent=c(states(SV_t)), obs=c(r_o))
dat <- pivot_longer(dat_raw, -time)
parnames <- c("kappa", "tau")
trans <- list(logit, log) # functions to use for transformations to the estimation scale
btrans <- list(plogis, exp) # functions for transformations back to the original scale
transnames <- c("logit(kappa)", "log(tau)")
The bootstrap particle filter was run at varied parameter values \(\theta = (\kappa, \tau)\) to obtain likelihood estimates using the R
package pomp
(King, Nguyen, and Ionides (2016), King et al. (2023)). The logarithms of the likelihood estimates were used as the simulation log-likelihoods \(\ell^S(\theta; r_{1:T})\).
We first consider estimation of \(\kappa\) on the logit scale. Simulations are carried out at equally spaced points within \(\pm 1\) of the true \(\text{logit}(\kappa)\) value. The width of this simulation interval is chosen to balance the bias and the variance of the test. Specifically, it is chosen roughly as large as possible while ensuring that the p-value for the test on the cubic coefficient in the regression polynomial is distributed close to the uniform distribution.
sim_widths <- c(1, .4)
inc <- c("kappa") # parameter being estimated
parid <- which(parnames%in%inc)
M <- 100 # number of simulation points
Np <- 100 # number of particles used by the bootstrap particle filter
# simulation points are uniformly spaced between the true value +-1
sims_incl_Est <- cbind(seq(theta_Est[inc] - sim_widths[parid], theta_Est[inc] + sim_widths[parid],
    length.out=M)) |> `colnames<-`(inc)
sims_at_Nat <- outer(rep(1,M), theta_t)
sims_at_Nat[,inc] <- sapply(1:length(inc), function(i) btrans[[parid[i]]](sims_incl_Est[,i]))
set.seed(729875)
llp_est <- simplify2array(lapply(1:nrow(sims_at_Nat), function(i) {
    sp <- sims_at_Nat[i,]
    ## run the particle filter for each simulation point
    pfSV <- pfilter(pomp(SV_t, params=sp), Np=Np, save.states="prediction")
    ## estimate the simulation log-likelihood for each r_t
    lme <- sapply(saved_states(pfSV)$weights, logmeanexp)
    lme
}))
ll_est <- apply(llp_est, 2, sum)
The individual simulation log-likelihood \(\ell^S(\theta; r_t)\) for each \(t\) was obtained as the log of the average of the particle weights at time \(t\) in the bootstrap particle filter (via logmeanexp above).
The simll
object is created, and hypothesis tests for the parameter \(\kappa\) is carried out as follows.
s <- simll(llp_est, params=sims_incl_Est)
null_values_Est <- cbind(theta_Est[inc]+sim_widths[parid]/5*(-10:10))
colnames(null_values_Est) <- inc
ht_result <- ht(s, null.value=null_values_Est, test="parameter", case="stationary")
#> Warning in ht.simll(s, null.value = null_values_Est, test = "parameter", : The
#> slope estimates at consecutive batches had significant correlation. Consider
#> manualy increasing the batch size for estimation of K1.
The scaled variance \(K_1(\theta_*;\theta_0) = \lim_{T\to\infty} \frac{1}{T} \text{Var}\left\{\frac{\partial \mu}{\partial \theta}(\theta_*; r_{1:T})\right\}\) can be estimated by dividing the observation sequence into consecutive batches such that the estimates of the slope of the function \(\mu\) are approximately independent between batches. The default size of each batch is round(n^{0.4})
where n
is the number of observations. The default batch size can be overridden by supplying the batch_size
argument. A different method for estimating \(K_1\) is to use a truncated sum of autocovariances. This can be done by setting K1_est_method="autocov"
.
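For example, in response to the warning above, the batch size can be increased manually, or \(K_1\) can instead be estimated from a truncated sum of autocovariances; the batch size of 50 used here is an arbitrary illustrative value.
ht(s, null.value=null_values_Est, test="parameter", case="stationary", batch_size=50)
ht(s, null.value=null_values_Est, test="parameter", case="stationary", K1_est_method="autocov")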
Confidence intervals for \(\text{logit}(\kappa)\) can be constructed as follows.
ci_result <- ci(s, level=c(.9,.95), ci="parameter", case="stationary")
#> Warning in ci.simll(s, level = c(0.9, 0.95), ci = "parameter", case =
#> "stationary"): The slope estimates at consecutive batches had significant
#> correlation. Consider manualy increasing the batch size for estimation of K1.
regcoef_k <- ci_result$regression_estimates
# estimated expected simulation log-likelihood
eesl <- function(x) { regcoef_k$a + regcoef_k$b*x + regcoef_k$c*x^2 }
vline_names <- c("true","CI90","CI95")
dat <- data.frame(simll=ll_est)
dat[inc] <- sims_incl_Est
g1 <- ggplot(dat, aes(x=!!sym(inc), y=simll)) + geom_point() +
    geom_function(fun=eesl, linetype="longdash") +
    geom_vline(data=data.frame(
        kind=factor(c("true",rep("CI90",2),rep("CI95",2)), levels=vline_names),
        value=c(theta_Est[inc], ci_result$confidence_interval[1,2:3],
            ci_result$confidence_interval[2,2:3])),
        aes(xintercept=value, linetype=kind)) +
    xlab(transnames[parid]) + ylab('simulation log-likelihood') +
    scale_linetype_manual(name="", labels=c("true", "90%CI", "95%CI"),
        values=c(true="dashed", CI90="dotdash", CI95="dotted")) +
    theme(legend.position="bottom")
g1
Similarly, parameter inference for \(\log(\tau)\) can be carried out as follows. The simulation points are equally spaced within \(\pm 0.4\) of the log of the true \(\tau\) value.
sim_widths <- c(1, .4)
inc <- c("tau") # parameter being estimated
parid <- which(parnames%in%inc)
M <- 100
Np <- 100
# simulation points are uniformly spaced between the true value +-.4
sims_incl_Est <- cbind(seq(theta_Est[inc] - sim_widths[parid], theta_Est[inc] + sim_widths[parid],
    length.out=M)) |> `colnames<-`(inc)
sims_at_Nat <- outer(rep(1,M), theta_t)
sims_at_Nat[,inc] <- sapply(1:length(inc), function(i) btrans[[parid[i]]](sims_incl_Est[,i]))
set.seed(729875)
llp_est <- simplify2array(lapply(1:nrow(sims_at_Nat), function(i) {
    sp <- sims_at_Nat[i,]
    pfSV <- pfilter(pomp(SV_t, params=sp), Np=Np, save.states="prediction")
    lme <- sapply(saved_states(pfSV)$weights, logmeanexp)
    lme
}))
ll_est <- apply(llp_est, 2, sum)
s <- simll(llp_est, params=sims_incl_Est)
null_values_Est <- cbind(theta_Est[inc]+sim_widths[parid]/5*(-10:10))
colnames(null_values_Est) <- inc
ht_result <- ht(s, null.value=null_values_Est, test="parameter", case="stationary")
dat <- data.frame(simll=ll_est)
dat[inc] <- sims_incl_Est
ci_result <- ci(s, level=c(.9,.95), ci="parameter", case="stationary")
regcoef_t <- ci_result$regression_estimates
eesl <- function(x) { regcoef_t$a + regcoef_t$b*x + regcoef_t$c*x^2 } # estimated expected simulation log-likelihood
vline_names <- c("true","CI90","CI95")
g2 <- ggplot(dat, aes(x=!!sym(inc), y=simll)) + geom_point() +
    geom_function(fun=eesl, linetype="longdash") +
    geom_vline(data=data.frame(kind=factor(c("true",rep("CI90",2),rep("CI95",2)), levels=vline_names),
        value=c(theta_Est[inc], ci_result$confidence_interval[1,2:3],
            ci_result$confidence_interval[2,2:3])),
        aes(xintercept=value, linetype=kind)) +
    scale_linetype_manual(name=NULL, labels=c("true", "90%CI", "95%CI"),
        values=c(true="dashed",CI90="dotdash",CI95="dotted")) +
    ylab('simulation log-likelihood') + xlab(transnames[parid]) + theme(legend.position="bottom")
g2
We can estimate both parameters \((\text{logit}(\kappa), \log(\tau))\) jointly. The hypothesis tests \(H_0: (\kappa_*, \tau_*) = (\kappa_{*,0}, \tau_{*,0})\) versus \(H_1: (\kappa_*, \tau_*) \neq (\kappa_{*,0}, \tau_{*,0})\) were carried out for varied null value pairs. The plot generated below shows the 80%, 90%, and 95% confidence regions, constructed by marking the null value pairs for which the p-value is greater than 20%, 10%, and 5%, respectively. The true parameter pair is marked by an \(X\).
sim_widths <- c(.5, .2)
inc <- c("kappa", "tau") # parameters being estimated
parid <- which(parnames%in%inc)
M <- 400
Np <- 100
sims_incl_Est <- expand.grid(
    seq(theta_Est[inc[1]]-sim_widths[parid[1]], theta_Est[inc[1]]+sim_widths[parid[1]], length.out=sqrt(M)),
    seq(theta_Est[inc[2]]-sim_widths[parid[2]], theta_Est[inc[2]]+sim_widths[parid[2]], length.out=sqrt(M))) |>
    as.matrix() |> `colnames<-`(inc)
sims_at_Nat <- outer(rep(1,M), theta_t)
sims_at_Nat[,inc] <- sapply(1:length(inc), function(i) btrans[[parid[i]]](sims_incl_Est[,i]))
set.seed(729875)
llp_est <- simplify2array(lapply(1:nrow(sims_at_Nat), function(i) {
    sp <- sims_at_Nat[i,]
    pfSV <- pfilter(pomp(SV_t, params=sp), Np=Np, save.states="prediction")
    lme <- sapply(saved_states(pfSV)$weights, logmeanexp)
    lme
}))
ll_est <- apply(llp_est, 2, sum)
s <- simll(llp_est, params=sims_incl_Est)
null_values_Est <- expand.grid(theta_Est[inc[1]]+sim_widths[parid[1]]/5*(-10:10),
    theta_Est[inc[2]]+sim_widths[parid[2]]/5*(-10:10)) |> as.matrix() |> `colnames<-`(inc)
ht_result <- ht(s, null.value=null_values_Est, test="parameter", case="stationary")
dat <- data.frame(null_values_Est)
g3 <- ggplot(dat %>% mutate(pval=ht_result$Hypothesis_Tests[,"pvalue"]) %>%
        mutate(conflvl=cut(pval, breaks=c(0,0.05,0.1,0.2,1), right=FALSE)),
        aes(x=!!sym(inc[1]), y=!!sym(inc[2]))) +
    geom_point(aes(color=conflvl), size=0.6) +
    xlab(transnames[parid[1]]) + ylab(transnames[parid[2]]) +
    scale_color_manual(name="", labels=c("100%", "95%", "90%", "80%"),
        values=rgb(0,0,0,c(0.05,0.25,0.5,1))) +
    geom_point(aes(x=theta_Est[inc[1]], y=theta_Est[inc[2]]), shape=4, size=2) +
    theme(legend.position="bottom")
g3
The test on the metamodel parameters \(a\), \(b\), \(c\), and \(\sigma^2\), \[ H_0: (a,b,c,\sigma^2) = (a_0,b_0,c_0, \sigma_0^2) ~~\text{vs.}~~ H_1: (a,b,c,\sigma^2) \neq (a_0,b_0,c_0, \sigma_0^2),\] can be carried out by passing the character string "moments"
to the test
argument for the ht
function. The null.value
argument can be a list of four elements in the order \((a_0, b_0, c_0, \sigma_0^2)\). The second element (\(b_0\)) is a vector and the third element (\(c_0\)) is a matrix. When testing more than one null value, null.value
should be a list of lists of length four. For tests on the MESLE or the simulation-based proxy, the test statistic follows the \(F\)-distribution. The test statistic for the metamodel parameters follows what is named the \(\textsf{SCL}\) distribution in Park (2025). The quantiles and the cdf of the \(\textsf{SCL}\) distribution are found numerically by drawing a number of random values from the distribution. This number is roughly chosen such that the Monte Carlo standard error of the obtained p-value is approximately 0.01. However, if any of the obtained p-values is less than 0.01, a greater number of random values are drawn from the \(\textsf{SCL}\) distribution so that the Monte Carlo standard error of the p-value is approximately 0.001.
null_moments <- list(a=-984, b=c(25.8,-23.7), c=matrix(c(-8.25,8,8,-93),2,2), sigsq=5)
ht(s, null.value=null_moments, test="moments")
#> $meta_model_MLE_for_moments
#> $meta_model_MLE_for_moments$a
#> [1] -982.6367
#>
#> $meta_model_MLE_for_moments$b
#> [,1]
#> [1,] 23.52540
#> [2,] -23.91113
#>
#> $meta_model_MLE_for_moments$c
#> [,1] [,2]
#> [1,] -7.310008 7.885767
#> [2,] 7.885767 -91.652205
#>
#> $meta_model_MLE_for_moments$sigma_sq
#> [1] 5.655018
#>
#>
#> $Hypothesis_Tests
#> $Hypothesis_Tests[[1]]
#> $Hypothesis_Tests[[1]]$a_null
#> [1] -984
#>
#> $Hypothesis_Tests[[1]]$b_null
#> [1] 25.8 -23.7
#>
#> $Hypothesis_Tests[[1]]$c_null
#> [,1] [,2]
#> [1,] -8.25 8
#> [2,] 8.00 -93
#>
#> $Hypothesis_Tests[[1]]$sigma_sq_null
#> [1] 5
#>
#> $Hypothesis_Tests[[1]]$pvalue
#> [1] 0.5542071
#>
#>
#>
#> $pvalue_numerical_error_size
#> [1] 0.01
#>
#> $pval_cubic
#> [1] 0.8634763
Finally, the ht
function can be used to test about the mean and the variance of the simulation log likelihood at a single parameter value. For this, the type
argument should be set to "point"
. When the type
argument is not provided, its default value is "point"
if the object passed to the s
argument has no params
attribute and "regression"
otherwise. All hypothesis tests described in earlier sections used the default type="regression"
.
As an example, suppose that the simulation log likelihood is normally distributed with mean \(\mu=0\) and variance \(\sigma^2 = 1\). We artificially generate \(M=50\) simulation log likelihoods from the standard normal distribution, and create a simll
object without the params
attribute.
set.seed(1098204)
s <- simll(rnorm(50, 0, 1))
#> Simulation log likelihoods were coerced into a matrix form.
For testing \(H_0: (\mu, \sigma^2) = (\mu_0, \sigma_0^2)\), \(H_1: (\mu, \sigma^2) \neq (\mu_0, \sigma_0^2)\), the test
argument should be "moments"
and the type
argument should be "point"
. The null.value
is a vector of length two (one entry for the mean and the other for the variance of the simulation log likelihoods), a matrix of two columns (one for the mean and the other for the variance), or a list of vectors of length two (each entry of the list gives a null value consisting of the mean and the variance.)
null_values <- rbind(c(0,1), c(0,0.8), c(0.2, 1))
ht(s, null.value=null_values, test="moments", type="point")
#> $meta_model_MLE_for_moments
#> mu sigma_sq
#> -0.2397480 0.6591244
#>
#> $Hypothesis_Tests
#> mu_null sigma_sq_null pvalue
#> 1 0.0 1.0 0.038825758
#> 2 0.0 0.8 0.112926136
#> 3 0.2 1.0 0.001392045
#>
#> $pvalue_numerical_error_size
#> [1] 0.001
Both the ht
and ci
functions have the option autoAdjust
to automatically adjust the weights of the simulation points so that the error in the quadratic approximation is small. Specifically, the weights are multiplied by discount factors given by \(\exp(-\{q_2(\hat \theta_\text{MESLE})-q_2(\theta)\}/g)\), where \(q_2\) is the quadratic polynomial fitted to the simulated log-likelihoods, \(\theta\) is the given simulation point, and \(g\) is a tuning parameter. These weight adjustments ensure that points far from \(\hat\theta_\text{MESLE}\) have small weights for estimating the parameter. The value of \(g\) is tuned iteratively by making sure that the p-value for the significance of the cubic term is between 0.01 and 0.3. If the p-value is too small (less than 0.01), \(g\) is decreased in order to reduce the bias, so that parameter estimation is effectively carried out using only points close to the estimated MESLE, where the cubic term is negligibly small compared to the quadratic approximation. Conversely, if the p-value is large enough (larger than 0.3), there is room to explore a wider region of the parameter space to improve estimation efficiency. Additionally, in order to ensure that the cubic regression can be carried out without numerical instability, \(g\) is guaranteed not to fall below a value that makes the effective sample size (ESS) less than \((d+1)(d+2)(d+3)/6\), which is the number of scalar parameters estimated in cubic regression. Here the ESS is defined as \((\sum_{m=1}^M w_m^{adj})^2/(\sum_{m=1}^M (w_m^{adj})^2)\), where \(w_m^{adj}\), \(m=1,\dots, M\), are the adjusted weights.
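An illustrative call with the automatic weight adjustment turned on, reusing the simll object and the null values from the joint estimation example above, might look as follows:
ht(s, null.value=null_values_Est, test="parameter", case="stationary", autoAdjust=TRUE)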
The next simulation point for optimal estimation efficiency can be found using the optDesign
function. This function returns a point that (approximately) minimizes the Monte Carlo variance of the estimate of the MESLE. If a point is too far from the estimated MESLE, its adjusted weight can be very small because the third order term of a cubic approximation may be significant there, so the point contributes little to decreasing the Monte Carlo uncertainty in the parameter estimate. Conversely, a point too close to the estimated MESLE does not substantially reduce the variance of the parameter estimate either, since it has low leverage in the polynomial regression. The point returned by optDesign
is a near-optimum balancing these two factors.
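A call might look like the following, where we assume that optDesign takes the simll object as its first argument; consult the function's help page for the exact set of arguments.
optDesign(s)   # propose the next simulation point (assumed call signature; see ?optDesign)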
King, Aaron A., Edward L. Ionides, Carles Martinez Bretó, Stephen P. Ellner, Matthew J. Ferrari, Sebastian Funk, Steven G. Johnson, et al. 2023. pomp: Statistical Inference for Partially Observed Markov Processes. https://kingaa.github.io/pomp/.
King, Aaron A., Dao Nguyen, and Edward L. Ionides. 2016. “Statistical Inference for Partially Observed Markov Processes via the R Package pomp.” Journal of Statistical Software 69 (12): 1–43. https://doi.org/10.18637/jss.v069.i12.
Park, Joonha. 2025. “Scalable Simulation-Based Inference for Implicitly Defined Models Using a Metamodel for Monte Carlo Log-Likelihood Estimator.” arXiv Preprint. https://doi.org/10.48550/arxiv.2311.09446.