General interface
Overview and concepts
The general interface supports the following setup for likelihood-based indirect inference, using a structural model $S$ and an auxiliary model $A$, with given data $y$.
For each set of parameters $θ$, the structural model $S$ is used to generate simulated data $x$, ie

$$x = x_M(θ, ϵ)$$

where $ϵ$ is a set of common random numbers that can be kept constant for various $θ$s. Nevertheless, the above is not necessarily a deterministic relationship, as additional randomness can be used.

An auxiliary model $A$ with parameters $ϕ$ can be estimated using the generated data $x$ with maximum likelihood, ie

$$ϕ_A(x) = \arg\max_{ϕ} p_A(x \mid ϕ)$$

is the maximum likelihood estimate.

The likelihood of $y$ at parameters $θ$ is obtained as

$$p_A(y \mid ϕ_A(x_M(θ, ϵ)))$$
This is multiplied by a prior in $θ$, specified in logs.
The user should define

- a model type, which represents both the structural and the auxiliary model,
- methods for the functions below to dispatch on this type.

| component | Julia method |
|-----------|--------------|
| $x_M$ | simulate_data |
| $p_A$ | loglikelihood |
| $ϕ_A$ | MLE |
| draw $ϵ$ | common_random |
The framework is explained in detail below.
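As a sketch of how these components fit together (the helper name `indirect_loglikelihood_sketch` is illustrative, not part of the package):

```julia
# Sketch: how the user-defined methods compose into an indirect log
# likelihood at θ. Assumes methods of simulate_data, MLE, and
# loglikelihood are defined for `model`.
function indirect_loglikelihood_sketch(rng, model, y, θ, ϵ)
    x = simulate_data(rng, model, θ, ϵ)   # x = x_M(θ, ϵ)
    x === nothing && return -Inf          # infeasible parameters
    ϕ = MLE(model, x)                     # ϕ_A(x), the auxiliary MLE
    loglikelihood(model, y, ϕ)            # p_A(y ∣ ϕ_A(x_M(θ, ϵ)))
end
```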
Models
The structural and the auxiliary model should be represented by a type (ie a struct) defined for each problem. A single type is used for both the structural and the auxiliary model; since the two aspects are implemented by different methods, this should not cause confusion.
Simulating data
The (structural) model is used to generate data from parameters $θ$ and common random numbers $ϵ$ using simulate_data
. The mapping is not necessarily deterministic even given $ϵ$, as additional randomness is allowed, but common random numbers are advantageous since they make the mapping from structural to auxiliary parameters continuous. When used for simulation, common random numbers also reduce variance.
When applicable, independent variables (eg covariates) necessary for simulation should be included in the structural model object. It is recommended that a type is defined for each problem, along with methods for simulate_data.
IndirectLikelihood.simulate_data — Function.

simulate_data(rng, model, θ, ϵ)

Simulate data from the model using the random number generator rng, with parameters θ and common random numbers ϵ.

This method should

- accept ϵ generated by common_random,
- return simulated data in a format that can be used by MLE and loglikelihood.

The user should define a method for this function for each model type with the signature

simulate_data(rng::AbstractRNG, model, θ, ϵ)

For infeasible/meaningless parameters, return nothing.
simulate_data(rng, model, θ)

Simulate data, generating ϵ using rng. See common_random.

For interactive/exploratory use. Models should define methods for simulate_data(rng::AbstractRNG, model, θ, ϵ).
simulate_data(model, θ)

Simulate data, generating ϵ with the default random number generator. See common_random.

For interactive/exploratory use. Models should define methods for simulate_data(rng::AbstractRNG, model, θ, ϵ).
simulate_data(problem, θ)

Simulate data with the given parameters θ.
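A minimal sketch of such a method (the `LocationScaleModel` type is hypothetical), where the structural model transforms standard normal draws:

```julia
using Random

struct LocationScaleModel end   # hypothetical model type

# θ = (μ, σ); ϵ is a vector of standard normal draws from common_random.
function simulate_data(rng::AbstractRNG, model::LocationScaleModel, θ, ϵ)
    μ, σ = θ
    σ > 0 || return nothing     # infeasible parameters: return nothing
    μ .+ σ .* ϵ                 # deterministic given ϵ, continuous in θ
end
```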
Common random numbers
Common random numbers are a set of random numbers that can be re-used for simulation with different parameter values. common_random should yield a random value for the random variables (usually an Array or a collection of similar structures).
IndirectLikelihood.common_random — Function.

common_random(rng, model)

Return common random numbers that can be reused by simulate_data with different parameters. The first argument is the random number generator.

When the model structure does not allow common random numbers, the convention is to return nothing.

The user should define a method for this function for each model type.

See also common_random! for further optimizations.
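A minimal sketch of a method (the `ToyModel` type and its `N` field are hypothetical):

```julia
using Random

struct ToyModel
    N::Int    # number of observations per simulated data set
end

# One standard normal draw per observation, reusable across θ values.
common_random(rng::AbstractRNG, model::ToyModel) = randn(rng, model.N)
```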
For error structures which can be overwritten in place, the user can define common_random!
as an optimization.
IndirectLikelihood.common_random! — Function.

common_random!(rng, model, ϵ)

Update the common random numbers for the model. The semantics are as follows:

- it can, but does not need to, change the contents of its argument ϵ,
- the "new" common random numbers should be returned regardless.

Two common usage patterns are

- having a mutable ϵ, updating it in place, and returning ϵ,
- generating a new ϵ, and returning that.

The default method falls back to common_random, reallocating with each call. A method for this function should be defined only when allocations can be optimized.
common_random!(problem)
Return a new problem with updated common random numbers.
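A sketch of the in-place pattern, assuming a hypothetical model type `InPlaceModel` whose common random numbers are a vector of standard normals:

```julia
using Random

struct InPlaceModel end   # hypothetical model type

# Overwrite the existing draws instead of reallocating, and return the
# updated ϵ, as required by the semantics above.
function common_random!(rng::AbstractRNG, model::InPlaceModel, ϵ::Vector{Float64})
    randn!(rng, ϵ)
    ϵ
end
```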
Data
Data can be of any type, since it is generated and used by user-defined functions. Arrays and (optionally named) tuples are recommended for simple models, as there is no need to wrap them in a type: the model type is used for dispatch.

More complex data structures may benefit from being wrapped in a struct.
Auxiliary model estimation and likelihood
These methods should be defined for model types, and accept data from simulate_data
.
IndirectLikelihood.MLE — Function.

MLE(model, data)

Maximum likelihood estimate of the parameters for data in model.

When ϕ == MLE(model, data), ϕ should maximize

ϕ -> loglikelihood(model, data, ϕ)

See loglikelihood.

Methods should be defined by the user for each model type.
IndirectLikelihood.loglikelihood — Function.

loglikelihood(model, data, ϕ)

Log likelihood of data under model with parameters $ϕ$. See MLE.
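For instance, for a normal auxiliary model with ϕ = (mean, variance), both methods have closed forms (the `NormalAuxModel` type is hypothetical):

```julia
using Statistics

struct NormalAuxModel end   # hypothetical model type

# Closed-form auxiliary MLE: sample mean and (uncorrected) sample variance.
MLE(::NormalAuxModel, data) = (mean(data), var(data; corrected = false))

# Normal log likelihood at ϕ = (m, v); MLE(model, data) maximizes this.
function loglikelihood(::NormalAuxModel, data, ϕ)
    m, v = ϕ
    -0.5 * (length(data) * log(2π * v) + sum(abs2, data .- m) / v)
end
```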
Problem framework
Estimation problems can be organized into a single structure, which contains the model and data objects, and also a (log) prior in the parameters. This simplifies the evaluation of likelihoods.
The common random numbers are saved in the object. Use common_random! to update them.
IndirectLikelihoodProblem(model, logprior, data; rng, ϵ)

A simple wrapper for an indirect likelihood problem, with the given model object, log prior, and data.

The random number generator rng is saved, and used by default to initialize the common random numbers ϵ.

The user should implement simulate_data, MLE, loglikelihood, and common_random.
You can then obtain the (log) posterior at given parameters. Note the single-argument version, which returns a callable.
IndirectLikelihood.indirect_logposterior — Function.

indirect_logposterior(problem, θ)

Evaluate the indirect log posterior of problem at parameters θ. Short-circuits for infeasible parameters.

indirect_logposterior(problem)

Return a callable that evaluates indirect_logposterior(problem, θ) at the given θ.
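A hedged usage sketch, assuming the four interface methods are already defined for `model` and `y` holds the observed data (the flat log prior is illustrative):

```julia
# Wrap everything in a problem, then evaluate the indirect log posterior.
problem = IndirectLikelihoodProblem(model, θ -> 0.0, y)  # flat log prior
ℓ = indirect_logposterior(problem)   # callable: θ ↦ indirect log posterior
ℓ(θ)                                 # evaluate at some parameters θ
```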
For testing inference with simulated data, the following function can be used to create a problem object.
IndirectLikelihood.simulate_problem — Function.

simulate_problem(model, logprior, θ; rng, ϵ)

Initialize an IndirectLikelihoodProblem with simulated data, using parameters θ.

Useful for debugging and for exploring identification with simulated data.
Utilities
IndirectLikelihood.local_jacobian — Function.

local_jacobian(problem, θ₀, ω_to_θ; vecϕ)

Calculate the local Jacobian of the estimated auxiliary parameters $ϕ$ at the structural parameters $θ = θ₀$.

ω_to_θ is a transformation that maps a vector of reals $ω ∈ ℝⁿ$ to the parameters θ in a format acceptable to simulate_data. It should support ContinuousTransformations.transform and ContinuousTransformations.inverse. See, for example, ContinuousTransformations.TransformationTuple.

vecϕ is a function used to flatten the auxiliary parameters to a vector. Defaults to vec_parameters.
IndirectLikelihood.vec_parameters — Function.

Return the values of the argument as a vector, potentially (but not necessarily) restricting to those elements that uniquely determine the argument.

For example, a symmetric matrix would be determined by the diagonal and either half.
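As a sketch, a method for symmetric matrices could keep one triangular half (illustrative, not necessarily the package's definition):

```julia
using LinearAlgebra

# Flatten a symmetric matrix to the elements that uniquely determine it:
# the upper triangle, taken column by column (length n(n+1)/2).
vec_parameters(S::Symmetric) = [S[i, j] for j in 1:size(S, 2) for i in 1:j]
```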