User API
Summary
Defining
Defining a problem is done in one of three ways:
1. The easiest is to use the Turing.jl or Soss.jl probabilistic programming languages to define a model, which is then automatically usable by MuseInference. See the documentation for TuringMuseProblem and SossMuseProblem. Note that to use either of these, you first need to run using Turing or using Soss in addition to using MuseInference in your session.
2. If the problem has $\theta$ variables whose domain is $(-\infty,\infty)$, and if all necessary gradients can be computed with automatic differentiation, you can use SimpleMuseProblem and specify the posterior and simulation generation code by-hand.
3. Otherwise, you can write a custom AbstractMuseProblem. See the Developer API.
Solving
Solving a problem is then done using:
- muse and muse!, which compute the MUSE estimate and return a MuseResult object. The muse! version will store the result into an existing MuseResult, and will resume an existing run if the MuseResult already holds some work. This can be useful if, e.g., one wishes to run the solver for more steps after an initial solution. Both functions accept a keyword argument get_covariance which, if true, will also compute the covariance of the MUSE estimate. If false, the covariance can be computed later by manually calling the following two functions:
- get_J! and get_H!, which compute the $J$ and $H$ matrices which together give the MUSE covariance, $\Sigma_{\rm posterior} = H^{-1}\,J\,H^{-\dagger} + \Sigma_{\rm prior}$. Calling these by hand rather than setting get_covariance=true can be useful as they allow more configurable options.
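For example, a minimal sketch of both workflows (this assumes prob is an already-constructed problem and (θ=0.5,) is a reasonable starting guess for your parameters):

# one call: MUSE estimate plus covariance
result = muse(prob, (θ=0.5,); get_covariance=true)

# equivalently, in pieces, which allows more configurable options
result = muse(prob, (θ=0.5,))
get_J!(result, prob)
get_H!(result, prob)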
Tuning
The main tunable parameters to these functions which the user should consider are:
- nsims = 100 — The number of simulations used. MUSE is a stochastic estimate: the error on the central MUSE estimate of $\theta$ scales as $\sigma(\theta) / \sqrt{N_{\rm sims}}$. For example, a MUSE solution with nsims=100 will have roughly the same error on the mean as an HMC chain with an effective sample size of 100. The total runtime of MUSE is linear in nsims. A different number of sims can be used for muse, get_J!, and get_H! (note that $H$ is generally less realization-dependent and can be computed with fewer sims).
- ∇z_logLike_atol = 1e-2 — The MUSE estimate involves a maximization of the likelihood over $z$. This controls the absolute solver error tolerance on $z$ for that maximization.
- (θ_rtol, α) = (0.1, 0.7) — The outermost loop of the MUSE algorithm solves an equation of the form $f(\theta)=0$. The θ_rtol parameter sets the error tolerance on this solution for $\theta$, relative to an estimate of its uncertainty; the default 0.1 means the solution is good to $0.1\,\sigma_\theta$. The α parameter is a step-size for this solver. If a good starting guess for $\theta$ is unavailable, α may need to be reduced for the solver to converge.
The defaults given above should give reasonable results, but may need tweaking for individual problems. A good strategy is to first reduce ∇z_logLike_atol, θ_rtol, and α to ensure a stable solution, then experiment with which of them can be increased without degrading the result. Convergence is largely unaffected by nsims, so it can be lowered during this experimentation to offset the longer runtime.
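For example, such a workflow might look like the following (the specific values here are purely illustrative, not recommendations):

# start conservative: tight tolerances, small step-size, fewer sims
result = muse(prob, (θ=0.5,); nsims=25, ∇z_logLike_atol=1e-4, θ_rtol=0.01, α=0.3)

# then relax toward the defaults, checking that the estimate stays stable
result = muse(prob, (θ=0.5,); nsims=25, ∇z_logLike_atol=1e-2, θ_rtol=0.1, α=0.7)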
Parallelizing
MUSE is trivially parallelizable over the nsims simulations which are averaged over in the algorithm. The functions muse, get_J!, and get_H! can all be passed a keyword argument pmap, which will be used for mapping over the simulations. For example, pmap=Distributed.pmap will distribute the work across different Julia processes using Julia's Distributed module.
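As a sketch, using 4 local worker processes (this assumes your problem definition, e.g. the model code, has also been loaded on all workers):

using Distributed
addprocs(4)
@everywhere using MuseInference

# the per-simulation MAP solves are mapped across the worker processes
result = muse(prob, (θ=0.5,); pmap=Distributed.pmap)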
Index
- MuseInference.MuseResult
- MuseInference.SimpleMuseProblem
- MuseInference.SossMuseProblem
- MuseInference.TuringMuseProblem
- MuseInference.get_H!
- MuseInference.get_J!
- MuseInference.muse
- MuseInference.muse!
Contents
MuseInference.TuringMuseProblem — Type

TuringMuseProblem(model; params, autodiff = Turing.ADBACKEND)

Specify a MUSE problem with a Turing model.
The Turing model should be conditioned on the variables which comprise the "data", and all other variables should be unconditioned. By default, any parameter which doesn't depend on another parameter will be estimated by MUSE, but this can be overridden by passing params as a list of symbols. All other non-conditioned and non-params variables will be considered the latent space.
The autodiff parameter should be either MuseInference.ForwardDiffBackend() or MuseInference.ZygoteBackend(), specifying which library to use for automatic differentiation. The default uses whatever the global Turing.ADBACKEND is currently set to.
Example
using Turing, MuseInference, LinearAlgebra

# toy hierarchical model
Turing.@model function toy()
σ ~ Uniform(0, 1)
θ ~ Normal(0, σ)
z ~ MvNormal(zeros(512), exp(σ/2)*I)
w ~ MvNormal(z, I)
x ~ MvNormal(w, I)
y ~ MvNormal(x, I)
(;σ, θ, z, w, x, y)
end
sim = toy()()
model = toy() | (;sim.x, sim.y)
prob = TuringMuseProblem(model, params=(:σ, :θ))
# get solution
result = muse(prob, (σ=0.5, θ=0))

Here we have chosen (σ, θ) to be the parameters which MuseInference will estimate (note that the default would have chosen only σ). The observed data are (x, y) and the remaining (z, w) are the latent space.
You can also call muse, etc., directly on the model, e.g. muse(model, (σ=0.5, θ=0)), in which case the parameter names params will be read from the keys of the provided starting point.
The model function cannot have any of the random variables as arguments, although it can have other parameters as arguments. E.g.,
# not OK
@model function toy(x)
x ~ Normal()
...
end
# OK
@model function toy(σ)
x ~ Normal(σ)
...
end

MuseInference.SossMuseProblem — Type

SossMuseProblem(model; params, autodiff = ForwardDiffBackend())

Specify a MUSE problem with a Soss model.
The Soss model should be conditioned on the variables which comprise the "data", and all other variables should be unconditioned. By default, any parameter which doesn't depend on another parameter will be estimated by MUSE, but this can be overridden by passing params as a list of symbols. All other non-conditioned and non-params variables will be considered the latent space.
The autodiff parameter should be either MuseInference.ForwardDiffBackend() or MuseInference.ZygoteBackend(), specifying which library to use for automatic differentiation.
Example
using Soss, MeasureTheory, MuseInference

# toy hierarchical problem, using MeasureTheory distributions
funnel = Soss.@model (σ) begin
θ ~ MeasureTheory.Normal(0, σ)
z ~ MeasureTheory.Normal(0, exp(θ/2)) ^ 512
x ~ For(z) do zᵢ
MeasureTheory.Normal(zᵢ, 1)
end
end
sim = rand(funnel(3))
model = funnel(3) | (;sim.x)
prob = SossMuseProblem(model)
# get solution
result = muse(prob, (θ=0,))

You can also call muse, etc., directly on the model, e.g. muse(model, (θ=0,)), in which case the parameter names params will be read from the keys of the provided starting point.
MuseInference.SimpleMuseProblem — Type

SimpleMuseProblem(x, sample_x_z, logLike, logPriorθ=(θ->0); ad=AD.ForwardDiffBackend())

Specify a MUSE problem by providing the simulation and posterior evaluation code by-hand. The argument x should be the observed data. The function sample_x_z should have signature:
function sample_x_z(rng::AbstractRNG, θ)
# ...
return (;x, z)
end

and return a joint sample of the data x and latent space z. The function logLike should have signature:
function logLike(x, z, θ)
# return log likelihood
end

and return the likelihood $\log\mathcal{P}(x,z\,|\,\theta)$ for your problem. The optional function logPriorθ should have signature:
function logPriorθ(θ)
# return log prior
end

and should return the prior $\log\mathcal{P}(\theta)$ for your problem. The ad parameter should be either MuseInference.ForwardDiffBackend() or MuseInference.ZygoteBackend(), specifying which library to use for automatic differentiation through logLike.
All variables (x, z, θ) can be any types which support basic arithmetic.
Example
using MuseInference, Distributions, LinearAlgebra

# 512-dimensional noisy funnel
prob = SimpleMuseProblem(
rand(512),
function sample_x_z(rng, θ)
z = rand(rng, MvNormal(zeros(512), exp(θ)*I))
x = rand(rng, MvNormal(z, I))
(;x, z)
end,
function logLike(x, z, θ)
-(1//2) * (sum((x .- z).^2) + sum(z.^2) / exp(θ) + 512*θ)
end,
    function logPriorθ(θ)
-θ^2/(2*3^2)
end;
    ad = MuseInference.ZygoteBackend()
)
# get solution
muse(prob, (θ=1,))

MuseInference.muse — Function

muse(prob::AbstractMuseProblem, θ₀; kwargs...)
muse!(result::MuseResult, prob::AbstractMuseProblem, [θ₀=nothing]; kwargs...)

Run the MUSE estimate. The muse! form resumes an existing result. If the muse form is used instead, θ₀ must give a starting guess for $\theta$.
See MuseResult for description of return value.
Keyword arguments:
- rng — Random number generator to use. Taken from result.rng or Random.default_rng() if not passed.
- z₀ — Starting guess for the latent space MAP.
- maxsteps = 50 — Maximum number of iterations.
- θ_rtol = 1e-2 — Error tolerance on $\theta$ relative to its standard deviation.
- ∇z_logLike_atol = 1e-2 — Absolute tolerance on the $z$-gradient at the MAP solution.
- nsims = 100 — Number of simulations.
- α = 0.7 — Step size for the root-finder.
- progress = false — Show progress bar.
- pool :: AbstractWorkerPool — Worker pool for parallelization.
- regularize = identity — Apply some regularization after each step.
- H⁻¹_like = nothing — Initial guess for the inverse Jacobian of $s^{\rm MUSE}(\theta)$.
- H⁻¹_update — How to update H⁻¹_like. Should be :sims, :broyden, or :diagonal_broyden.
- broyden_memory = Inf — How many past steps to keep for Broyden updates.
- checkpoint_filename = nothing — Save the result to a file after each iteration.
- get_covariance = false — Also call get_H! and get_J! to get the full covariance.
- save_MAPs = false — Whether to store the MAP solution at each step, for each sim and for the data, into the history (for debugging). Defaults to false since these are generally high-dimensional and may consume lots of memory. Can also pass a function which preprocesses the MAP before storing it (e.g. save_MAPs = x -> adapt(Array, x) to convert from a GPU to a CPU array).
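For example, a sketch of resuming a run for more steps (the step counts here are illustrative):

result = MuseResult()
muse!(result, prob, (θ=0.5,); maxsteps=10)
# ...inspect the preliminary result, then resume the same run...
muse!(result, prob; maxsteps=50, get_covariance=true)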
MuseInference.muse! — Function

See muse.
MuseInference.get_J! — Function

get_J!(result::MuseResult, prob::AbstractMuseProblem, [θ₀=nothing]; kwargs...)

Compute the $J$ matrix, which is part of the MUSE covariance computation (see Millea & Seljak, 2021).
Positional arguments:
- result — MuseResult into which to store the result.
- prob — AbstractMuseProblem being solved.
- θ₀ — The θ at which to evaluate $J$ (defaults to result.θ if it exists, otherwise θ₀ must be given).
Keyword arguments:
- z₀ — Starting guess for the latent space MAPs. Defaults to a random sample from the prior.
- ∇z_logLike_atol = 1e-2 — Absolute tolerance on the $z$-gradient at the MAP solution.
- rng — Random number generator to use. Taken from result.rng or Random.default_rng() if not passed.
- nsims — How many simulations to average over (default: 100).
- pmap — Parallel map function.
- progress — Show progress bar (default: false).
- skip_errors — Ignore any MAP that errors (default: false).
- covariance_method — A CovarianceEstimator used to compute $J$ (default: SimpleCovariance(corrected=true)).
MuseInference.get_H! — Function

get_H!(result::MuseResult, prob::AbstractMuseProblem, [θ₀=nothing]; kwargs...)

Compute the $H$ matrix, which is part of the MUSE covariance computation (see Millea & Seljak, 2021).
Positional arguments:
- result — MuseResult into which to store the result.
- prob — AbstractMuseProblem being solved.
- θ₀ — The θ at which to evaluate $H$ (defaults to result.θ if it exists, otherwise θ₀ must be given).
Keyword arguments:
- z₀ — Starting guess for the latent space MAPs. Defaults to a random sample from the prior.
- ∇z_logLike_atol = 1e-2 — Absolute tolerance on the $z$-gradient at the MAP solution.
- rng — Random number generator to use. Taken from result.rng or Random.default_rng() if not passed.
- nsims — How many simulations to average over (default: 10).
- pmap — Parallel map function.
- progress — Show progress bar (default: false).
- skip_errors — Ignore any MAP that errors (default: false).
- fdm — A FiniteDifferenceMethod used to compute the finite difference Jacobian of the AD gradients involved in computing $H$ (default: FiniteDifferences.central_fdm(3,1)).
- step — A guess for the finite difference step-size (defaults to 0.1σ for each parameter, using $J$ to estimate σ; for this reason it's recommended to run get_J! before get_H!). This is only a guess because different choices of fdm may adapt it.
- implicit_diff — Whether to use experimental implicit differentiation rather than finite differences. This requires 2nd-order AD through your logLike, so pay close attention to your prob.autodiff. Either AD.HigherOrderBackend((AD.ForwardDiffBackend(), AD.ZygoteBackend())) or AD.ForwardDiffBackend() is recommended (default: false).
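For example, a sketch of computing the covariance by-hand in the recommended order, with the default finite difference method written out explicitly (the sim counts are illustrative):

using FiniteDifferences

get_J!(result, prob; nsims=100, skip_errors=true)
# get_H! uses the J computed above to guess its finite difference step-sizes
get_H!(result, prob; nsims=10, fdm=FiniteDifferences.central_fdm(3, 1))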
MuseInference.MuseResult — Type

Stores the result of a MUSE run. Can be constructed by-hand as MuseResult() and passed to any of the in-place muse!, get_J!, or get_H!.
Fields:
- θ — The estimate of the $\theta$ parameters.
- Σ, Σ⁻¹ — The approximate covariance of $\theta$, and its inverse.
- H, J — The $H$ and $J$ matrices which form the covariance (see Millea & Seljak, 2021).
- gs — The MAP gradient sims used to compute J.
- Hs — The Jacobian sims used to compute H.
- dist — A Normal or MvNormal built from θ and Σ, for convenience.
- history — Internal diagnostic info from the run.
- rng — RNG used to generate sims for this run (so the same sims can be reused if resuming later).
- time — Total Millisecond wall-time spent computing the result.
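For example, after a completed run (assuming result was returned by one of the muse calls above), one might inspect:

result.θ     # central estimate of the parameters
result.Σ     # its approximate covariance
result.dist  # Normal/MvNormal built from the two
rand(result.dist, 1000)  # draws from the approximate posterior of θ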