User API
Summary
Defining
Defining a problem is done in one of three ways:
The easiest is to use the Turing.jl or Soss.jl probabilistic programming languages to define a model, which is then automatically useable by MuseInference. See the documentation for
TuringMuseProblem
andSossMuseProblem
. Note that to use either of these, you first need runusing Turing
orusing Soss
in addition tousing MuseInference
in your session.If the problem has $\theta$ variables whose domain is $(-\infty,\infty)$, and if all necessary gradients can be computed with automatic differentiation, you can use
SimpleMuseProblem
and specify the posterior and simulation generation code by-hand.Otherwise, you can write a custom
AbstractMuseProblem
. See the Developer API
Solving
Solving a problem is then done using:
muse
andmuse!
which compute the MUSE estimate and return aMuseResult
object. Themuse!
version will store the result into an existingMuseResult
, and will resume an existing run if theMuseResult
already holds some work. This can be useful if e.g. one wishes to run the solver for more steps after an initial solution. Both functions accept a keyword argumentget_covariance
which iftrue
will also compute the covariance of the MUSE estimate. Iffalse
, the covariance can be computed later by manually calling the following two functions:get_J!
andget_H!
which compute the $H$ and $J$ matrices which together give the MUSE covariance, $\Sigma_{\rm posterior} = H^{-1}\,J\,H^{-\dagger} + \Sigma_{\rm prior}$. Calling these by hand rather than settingget_covariance=true
can be useful as they allow more configurable options.
Tuning
The main tunable parameters to these functions which the user should consider are:
nsims = 100
— The number of simulations used. MUSE is a stochastic estimate. The error on the central MUSE estimate of $\theta$ scales as $1 / \sqrt N_{\rm sims} \cdot \sigma(\theta)$. For example, a MUSE solution withnsims=100
will have roughly the same error on the mean as an HMC chain with an effective sample size of 100. The total runtime of MUSE is linear innsims
. A different number of sims can be used formuse
,get_J!
andget_H!
(note that $H$ is generally less realization dependent and can be computed with fewer sims).∇z_logLike_atol = 1e-2
— The MUSE estimate involves a maximization of the likelihood over $z$. This controls the absolute solver error tolerance for $z$ for this maximization.(θ_rtol, α) = (0.1, 0.7)
— The outermost loop of the MUSE algorithm is solving an equation like $f(θ)=0$. Theθ_rtol
parameter sets the error tolerance for this solution for $\theta$, relative to an estimate of its uncertainty. The default0.1
means the solution is good to $0.1\,\sigma_\theta$. Theα
parameter is a step-size for this solver. If a good starting guess for $\theta$ is unavailable, this may need to be reduced for the solver to converge.
The defaults given above should give reasonable results, but may need tweaking for individual problems. A good strategy is to reduce ∇z_logLike_atol
, θ_rtol
, and α
to ensure a stable solution, then experiment with which can be increased without screwing up the result. The convergence properties are not largely affected by nsims
so this can be lowered during this experimentation to offset the longer run-time.
Parallelizing
MUSE is trivially parallelizable over the nsims
simulations which are averaged over in the algorithm. The functions muse
, get_J!
, and get_H!
can all be passed a keyword argument pmap
which will be used for mapping over the simulations. For example, pmap=Distributed.pmap
can be used to use Julia's Distributed map across different processes.
Index
MuseInference.MuseResult
MuseInference.SimpleMuseProblem
MuseInference.SossMuseProblem
MuseInference.TuringMuseProblem
MuseInference.get_H!
MuseInference.get_J!
MuseInference.muse
MuseInference.muse!
Contents
MuseInference.TuringMuseProblem
— TypeTuringMuseProblem(model; params, autodiff = Turing.ADBACKEND)
Specify a MUSE problem with a Turing model.
The Turing model should be conditioned on the variables which comprise the "data", and all other variables should be unconditioned. By default, any parameter which doesn't depend on another parameter will be estimated by MUSE, but this can be overridden by passing params
as a list of symbols. All other non-conditioned and non-params
variables will be considered the latent space.
The autodiff
parameter should be either MuseInference.ForwardDiffBackend()
or MuseInference.ZygoteBackend()
, specifying which library to use for automatic differenation. The default uses whatever the global Turing.ADBACKEND
is currently set to.
Example
# toy hierarchical model
Turing.@model function toy()
σ ~ Uniform(0, 1)
θ ~ Normal(0, σ)
z ~ MvNormal(zeros(512), exp(σ/2)*I)
w ~ MvNormal(z, I)
x ~ MvNormal(w, I)
y ~ MvNormal(x, I)
(;σ, θ, z, w, x, y)
end
sim = toy()()
model = toy() | (;sim.x, sim.y)
prob = TuringMuseProblem(model, params=(:σ, :θ))
# get solution
result = muse(prob, (σ=0.5, θ=0))
Here we have chosen (σ, θ)
to be the parameters which MuseInference will estimate (note that the default would have only chosen σ
). The observed data are (x,y)
and the remaining (z,w)
are the latent space.
You can also call muse
, etc... directly on the model, e.g. muse(model, (σ=0.5, θ=0))
, in which case the parameter names params
will be read from the keys of provided the starting point.
The model function cannot have any of the random variables as arguments, although it can have other parameters as arguments. E.g.,
# not OK
@model function toy(x)
x ~ Normal()
...
end
# OK
@model function toy(σ)
x ~ Normal(σ)
...
end
MuseInference.SossMuseProblem
— TypeSossMuseProblem(model; params, autodiff = ForwardDiffBackend())
Specify a MUSE problem with a Soss model.
The Soss model should be conditioned on the variables which comprise the "data", and all other variables should be unconditioned. By default, any parameter which doesn't depend on another parameter will be estimated by MUSE, but this can be overridden by passing params
as a list of symbols. All other non-conditioned and non-params
variables will be considered the latent space.
The autodiff
parameter should be either MuseInference.ForwardDiffBackend()
or MuseInference.ZygoteBackend()
, specifying which library to use for automatic differenation.
Example
# toy hierarchical problem, using MeasureTheory distributions
funnel = Soss.@model (σ) begin
θ ~ MeasureTheory.Normal(0, σ)
z ~ MeasureTheory.Normal(0, exp(θ/2)) ^ 512
x ~ For(z) do zᵢ
MeasureTheory.Normal(zᵢ, 1)
end
end
sim = rand(funnel(3))
model = funnel(3) | (;sim.x)
prob = SossMuseProblem(model)
# get solution
result = muse(prob, (θ=0,))
You can also call muse
, etc... directly on the model, e.g. muse(model, (θ=0,))
, in which case the parameter names params
will be read from the keys of provided the starting point.
MuseInference.SimpleMuseProblem
— TypeSimpleMuseProblem(x, sample_x_z, logLike, logPriorθ=(θ->0); ad=AD.ForwardDiffBackend())
Specify a MUSE problem by providing the simulation and posterior evaluation code by-hand. The argument x
should be the observed data. The function sample_x_z
should have signature:
function sample_x_z(rng::AbstractRNG, θ)
# ...
return (;x, z)
end
and return a joint sample of data x
and latent space z
. The function logLike
should have signature:
function logLike(x, z, θ)
# return log likelihood
end
and return the likelihood $\log\mathcal{P}(x,z\,|\,\theta)$ for your problem. The optional function logPriorθ
should have signature:
function logPriorθ(θ)
# return log prior
end
and should return the prior $\log\mathcal{P}(\theta)$ for your problem. The autodiff
parameter should be either MuseInference.ForwardDiffBackend()
or MuseInference.ZygoteBackend()
, specifying which library to use for automatic differenation through logLike
.
All variables (x, z, θ)
can be any types which support basic arithmetic.
Example
# 512-dimensional noisy funnel
prob = SimpleMuseProblem(
rand(512),
function sample_x_z(rng, θ)
z = rand(rng, MvNormal(zeros(512), exp(θ)*I))
x = rand(rng, MvNormal(z, I))
(;x, z)
end,
function logLike(x, z, θ)
-(1//2) * (sum((x .- z).^2) + sum(z.^2) / exp(θ) + 512*θ)
end,
function logPrior(θ)
-θ^2/(2*3^2)
end;
autodiff = MuseInference.ZygoteBackend()
)
# get solution
muse(prob, (θ=1,))
MuseInference.muse
— Functionmuse(prob::AbstractMuseProblem, θ₀; kwargs...)
muse!(result::MuseResult, prob::AbstractMuseProblem, [θ₀=nothing]; kwargs...)
Run the MUSE estimate. The muse!
form resumes an existing result. If the muse
form is used instead, θ₀
must give a starting guess for $\theta$.
See MuseResult
for description of return value.
Keyword arguments:
rng
— Random number generator to use. Taken fromresult.rng
orRandom.default_rng()
if not passed.z₀
— Starting guess for the latent space MAP.maxsteps = 50
— Maximum number of iterations.θ_rtol = 1e-2
— Error tolerance on $\theta$ relative to its standard deviation.∇z_logLike_atol = 1e-2
— Absolute tolerance on the $z$-gradient at the MAP solution.nsims = 100
— Number of simulations.α = 0.7
— Step size for root-finder.progress = false
— Show progress bar.pool :: AbstractWorkerPool
— Worker pool for parallelization.regularize = identity
— Apply some regularization after each step.H⁻¹_like = nothing
— Initial guess for the inverse Jacobian of $s^{\rm MUSE}(\theta)$H⁻¹_update
— How to updateH⁻¹_like
. Should be:sims
,:broyden
, or:diagonal_broyden
.broyden_memory = Inf
— How many past steps to keep for Broyden updates.checkpoint_filename = nothing
— Save result to a file after each iteration.get_covariance = false
— Also callget_H
andget_J
to get the full covariance.save_MAPs = false
— Whether to store the MAP solution at each step and for each sim and data into the history (for debugging). Defaults tofalse
since these are generally high-dimensional and may consume lots of memory. Can also pass a function which preprocess the MAP before storing it (e.g.save_MAPs = x -> adapt(Array, x)
to convert from a GPU to a CPU array).
MuseInference.muse!
— FunctionSee muse
.
MuseInference.get_J!
— Functionget_J!(result::MuseResult, prob::AbstractMuseProblem, [θ₀=nothing]; kwargs...)
Compute the $J$ matrix, which is part of the MUSE covariance computation (see Millea & Seljak, 2021).
Positional arguments:
result
—MuseResult
into which to store resultprob
—AbstractMuseProblem
being solvedθ₀
— theθ
at which to evaluate $J$ (default:result.θ
if it exists, otherwiseθ₀
must be given)
Keyword arguments:
z₀
— Starting guess for the latent space MAPs. Defaults to random sample from prior.∇z_logLike_atol = 1e-2
— Absolute tolerance on the $z$-gradient at the MAP solution.rng
— Random number generator to use. Taken fromresult.rng
orRandom.default_rng()
if not passed.nsims
— How many simulations to average over (default:100
)pmap
— Parallel map function.progress
— Show progress bar (default:false
),skip_errors
— Ignore any MAP that errors (default:false
)covariance_method
— ACovarianceEstimator
used to compute $J$ (default:SimpleCovariance(corrected=true)
)
MuseInference.get_H!
— Functionget_H!(result::MuseResult, prob::AbstractMuseProblem, [θ₀=nothing]; kwargs...)
Compute the $H$ matrix, which is part of the MUSE covariance computation (see Millea & Seljak, 2021).
Positional arguments:
result
—MuseResult
into which to store resultprob
—AbstractMuseProblem
being solvedθ₀
— theθ
at which to evaluate $H$ (default:result.θ
if it exists, otherwiseθ₀
must be given)
Keyword arguments:
z₀
— Starting guess for the latent space MAPs. Defaults to random sample from prior.∇z_logLike_atol = 1e-2
— Absolute tolerance on the $z$-gradient at the MAP solution.rng
— Random number generator to use. Taken fromresult.rng
orRandom.default_rng()
if not passed.nsims
— How many simulations to average over (default:10
)pmap
— Parallel map function.progress
— Show progress bar (default:false
),skip_errors
— Ignore any MAP that errors (default:false
)fdm
— AFiniteDifferenceMethod
used to compute the finite difference Jacobian of AD gradients involved in computing $H$ (defaults to:FiniteDifferences.central_fdm(3,1)
)step
— A guess for the finite difference step-size (defaults to 0.1σ for each parameter using J to estimate σ; for this reason its recommended to runget_J!
beforeget_H!
). Is only a guess because different choices offdm
may adapt this.implicit_diff
— Whether to use experimental implicit differentiation, rather than finite differences. Will require 2nd order AD through yourlogLike
so pay close attention to yourprob.autodiff
. EitherAD.HigherOrderBackend((AD.ForwardDiffBackend(), AD.ZygoteBackend()))
orAD.ForwardDiffBackend()
are recommended (default:false
)
MuseInference.MuseResult
— TypeStores the result of a MUSE run. Can be constructed by-hand as MuseResult()
and passed to any of the inplace muse!
, get_J!
, or get_H!
.
Fields:
θ
— The estimate of the $\theta$ parameters.Σ, Σ⁻¹
— The approximate covariance of $\theta$ and its inverse.H, J
— The $H$ and $J$ matrices which form the covariance (see Millea & Seljak, 2021)gs
— The MAP gradient sims used to computeJ
.Hs
— The jacobian sims used to computeH
.dist
— ANormal
orMvNormal
built fromθ
andΣ
, for convenience.history
— Internal diagnostic info from the run.rng
— RNG used to generate sims for this run (so the same sims can be reused if resuming later).time
— TotalMillisecond
wall-time spent computing the result.