User API

Summary

Defining

Defining a problem is done in one of three ways:

  • The easiest is to use the Turing.jl or Soss.jl probabilistic programming languages to define a model, which is then automatically usable by MuseInference. See the documentation for TuringMuseProblem and SossMuseProblem. Note that to use either of these, you first need to run using Turing or using Soss in addition to using MuseInference in your session.

  • If the problem has $\theta$ variables whose domain is $(-\infty,\infty)$, and if all necessary gradients can be computed with automatic differentiation, you can use SimpleMuseProblem and specify the posterior and simulation generation code by-hand.

  • Otherwise, you can write a custom AbstractMuseProblem. See the Developer API.

Solving

Solving a problem is then done using:

  • muse and muse! which compute the MUSE estimate and return a MuseResult object. The muse! version stores the result into an existing MuseResult, and will resume an existing run if the MuseResult already holds some work. This can be useful if, e.g., one wishes to run the solver for more steps after an initial solution. Both functions accept a keyword argument get_covariance which, if true, will also compute the covariance of the MUSE estimate. If false, the covariance can be computed later by manually calling the following two functions:

  • get_J! and get_H! which compute the $H$ and $J$ matrices which together give the MUSE covariance, $\Sigma_{\rm posterior} = H^{-1}\,J\,H^{-\dagger} + \Sigma_{\rm prior}$. Calling these by hand rather than setting get_covariance=true can be useful, as they expose more configuration options (see the sketch below).
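
A minimal sketch of computing the covariance by hand, assuming a prob and starting guess have been set up as in the examples below:

# compute the MUSE estimate without its covariance
result = muse(prob, (θ=0,); get_covariance=false)
# then fill in J and H by hand (J is more realization-dependent, so use more sims)
get_J!(result, prob; nsims=100)
get_H!(result, prob; nsims=10)
result.Σ  # the covariance is now populated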

Tuning

The main tunable parameters to these functions which the user should consider are:

  • nsims = 100 — The number of simulations used. MUSE is a stochastic estimate. The error on the central MUSE estimate of $\theta$ scales as $\sigma(\theta)/\sqrt{N_{\rm sims}}$. For example, a MUSE solution with nsims=100 will have roughly the same error on the mean as an HMC chain with an effective sample size of 100. The total runtime of MUSE is linear in nsims. A different number of sims can be used for muse, get_J!, and get_H! (note that $H$ is generally less realization-dependent and can be computed with fewer sims).

  • ∇z_logLike_atol = 1e-2 — The MUSE estimate involves a maximization of the likelihood over $z$. This sets the absolute tolerance on the $z$-gradient at the maximum.

  • (θ_rtol, α) = (0.1, 0.7) — The outermost loop of the MUSE algorithm solves an equation of the form $f(\theta)=0$. The θ_rtol parameter sets the error tolerance on the $\theta$ solution, relative to an estimate of its uncertainty. The default of 0.1 means the solution is accurate to $0.1\,\sigma_\theta$. The α parameter is the step size for this solver. If a good starting guess for $\theta$ is unavailable, α may need to be reduced for the solver to converge.

The defaults given above should give reasonable results, but may need tweaking for individual problems. A good strategy is to reduce ∇z_logLike_atol, θ_rtol, and α until the solution is stable, then experiment with which of them can be increased without degrading the result. The convergence properties are largely unaffected by nsims, so it can be lowered during this experimentation to offset the longer run-time.
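
For instance, a hypothetical tuning run (assuming a prob and starting guess set up as in the examples below):

# tighter tolerances and a smaller step size for stability, fewer sims for speed
result = muse(prob, (θ=0,); nsims=25, ∇z_logLike_atol=1e-4, θ_rtol=0.01, α=0.3)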

Parallelizing

MUSE is trivially parallelizable over the nsims simulations which are averaged over in the algorithm. The functions muse, get_J!, and get_H! can all be passed a keyword argument pmap which will be used for mapping over the simulations. For example, pmap=Distributed.pmap will use Julia's Distributed-based map across different processes.
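
A minimal sketch of a parallelized run (assuming a prob whose definition is also loaded on the workers, and a starting guess as in the examples below):

using Distributed
addprocs(4)                      # launch 4 local worker processes
@everywhere using MuseInference  # load the package on every worker

result = muse(prob, (θ=0,); pmap=Distributed.pmap)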

Contents

MuseInference.TuringMuseProblem — Type
TuringMuseProblem(model; params, autodiff = Turing.ADBACKEND)

Specify a MUSE problem with a Turing model.

The Turing model should be conditioned on the variables which comprise the "data", and all other variables should be unconditioned. By default, any parameter which doesn't depend on another parameter will be estimated by MUSE, but this can be overridden by passing params as a list of symbols. All other non-conditioned and non-params variables will be considered the latent space.

The autodiff parameter should be either MuseInference.ForwardDiffBackend() or MuseInference.ZygoteBackend(), specifying which library to use for automatic differentiation. The default uses whatever the global Turing.ADBACKEND is currently set to.

Example

using Turing, MuseInference
using LinearAlgebra  # for the identity scaling I

# toy hierarchical model
Turing.@model function toy()
    σ ~ Uniform(0, 1)
    θ ~ Normal(0, σ)
    z ~ MvNormal(zeros(512), exp(σ/2)*I)
    w ~ MvNormal(z, I)
    x ~ MvNormal(w, I)
    y ~ MvNormal(x, I)
    (;σ, θ, z, w, x, y)
end
sim = toy()()
model = toy() | (;sim.x, sim.y)
prob = TuringMuseProblem(model, params=(:σ, :θ))

# get solution
result = muse(prob, (σ=0.5, θ=0))

Here we have chosen (σ, θ) to be the parameters which MuseInference will estimate (note that the default would have only chosen σ). The observed data are (x,y) and the remaining (z,w) are the latent space.

Note

You can also call muse, etc... directly on the model, e.g. muse(model, (σ=0.5, θ=0)), in which case the parameter names params will be read from the keys of the provided starting point.

Note

The model function cannot have any of the random variables as arguments, although it can have other parameters as arguments. E.g.,

# not OK
@model function toy(x)
    x ~ Normal()
    ...
end

# OK
@model function toy(σ)
    x ~ Normal(σ)
    ...
end

MuseInference.SossMuseProblem — Type
SossMuseProblem(model; params, autodiff = ForwardDiffBackend())

Specify a MUSE problem with a Soss model.

The Soss model should be conditioned on the variables which comprise the "data", and all other variables should be unconditioned. By default, any parameter which doesn't depend on another parameter will be estimated by MUSE, but this can be overridden by passing params as a list of symbols. All other non-conditioned and non-params variables will be considered the latent space.

The autodiff parameter should be either MuseInference.ForwardDiffBackend() or MuseInference.ZygoteBackend(), specifying which library to use for automatic differentiation.

Example

using Soss, MeasureTheory, MuseInference

# toy hierarchical problem, using MeasureTheory distributions
funnel = Soss.@model (σ) begin
    θ ~ MeasureTheory.Normal(0, σ)
    z ~ MeasureTheory.Normal(0, exp(θ/2)) ^ 512
    x ~ For(z) do zᵢ
        MeasureTheory.Normal(zᵢ, 1)
    end
end

sim = rand(funnel(3))
model = funnel(3) | (;sim.x)
prob = SossMuseProblem(model)

# get solution
result = muse(prob, (θ=0,))

Note

You can also call muse, etc... directly on the model, e.g. muse(model, (θ=0,)), in which case the parameter names params will be read from the keys of the provided starting point.

MuseInference.SimpleMuseProblem — Type
SimpleMuseProblem(x, sample_x_z, logLike, logPriorθ=(θ->0); autodiff=AD.ForwardDiffBackend())

Specify a MUSE problem by providing the simulation and posterior evaluation code by-hand. The argument x should be the observed data. The function sample_x_z should have signature:

function sample_x_z(rng::AbstractRNG, θ)
    # ...
    return (;x, z)
end

and return a joint sample of data x and latent space z. The function logLike should have signature:

function logLike(x, z, θ) 
    # return log likelihood
end

and return the joint log-likelihood $\log\mathcal{P}(x,z\,|\,\theta)$ for your problem. The optional function logPriorθ should have signature:

function logPriorθ(θ)
    # return log prior
end

and should return the log-prior $\log\mathcal{P}(\theta)$ for your problem. The autodiff parameter should be either MuseInference.ForwardDiffBackend() or MuseInference.ZygoteBackend(), specifying which library to use for automatic differentiation through logLike.

All variables (x, z, θ) can be any types which support basic arithmetic.

Example

using MuseInference, Distributions, LinearAlgebra

# 512-dimensional noisy funnel
prob = SimpleMuseProblem(
    rand(512),
    function sample_x_z(rng, θ)
        z = rand(rng, MvNormal(zeros(512), exp(θ)*I))
        x = rand(rng, MvNormal(z, I))
        (;x, z)
    end,
    function logLike(x, z, θ)
        -(1//2) * (sum((x .- z).^2) + sum(z.^2) / exp(θ) + 512*θ)
    end, 
    function logPrior(θ)
        -θ^2/(2*3^2)
    end;
    autodiff = MuseInference.ZygoteBackend()
)

# get solution
muse(prob, (θ=1,))

MuseInference.muse — Function
muse(prob::AbstractMuseProblem, θ₀; kwargs...)
muse!(result::MuseResult, prob::AbstractMuseProblem, [θ₀=nothing]; kwargs...)

Run the MUSE estimate. The muse! form resumes an existing result (see the sketch following the keyword arguments below). If the muse form is used instead, θ₀ must give a starting guess for $\theta$.

See MuseResult for description of return value.

Keyword arguments:

  • rng — Random number generator to use. Taken from result.rng or Random.default_rng() if not passed.
  • z₀ — Starting guess for the latent space MAP.
  • maxsteps = 50 — Maximum number of iterations.
  • θ_rtol = 1e-1 — Error tolerance on $\theta$ relative to its standard deviation.
  • ∇z_logLike_atol = 1e-2 — Absolute tolerance on the $z$-gradient at the MAP solution.
  • nsims = 100 — Number of simulations.
  • α = 0.7 — Step size for root-finder.
  • progress = false — Show progress bar.
  • pmap — Parallel map function, used for mapping over the simulations.
  • regularize = identity — Apply some regularization after each step.
  • H⁻¹_like = nothing — Initial guess for the inverse Jacobian of $s^{\rm MUSE}(\theta)$.
  • H⁻¹_update — How to update H⁻¹_like. Should be :sims, :broyden, or :diagonal_broyden.
  • broyden_memory = Inf — How many past steps to keep for Broyden updates.
  • checkpoint_filename = nothing — Save result to a file after each iteration.
  • get_covariance = false — Also call get_H! and get_J! to get the full covariance.
  • save_MAPs = false — Whether to store the MAP solution at each step, for each sim and the data, into the history (for debugging). Defaults to false since these are generally high-dimensional and may consume lots of memory. Can also pass a function which preprocesses the MAP before storing it (e.g. save_MAPs = x -> adapt(Array, x) to convert from a GPU to a CPU array).
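
A minimal sketch of resuming a run with muse!, assuming a prob as in the examples above:

result = MuseResult()
muse!(result, prob, (θ=0,); maxsteps=10)  # initial short run
muse!(result, prob; maxsteps=30)          # picks up where the first call left off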

MuseInference.get_J! — Function
get_J!(result::MuseResult, prob::AbstractMuseProblem, [θ₀=nothing]; kwargs...)

Compute the $J$ matrix, which is part of the MUSE covariance computation (see Millea & Seljak, 2021).

Positional arguments:

  • result — MuseResult into which to store the result
  • prob — AbstractMuseProblem being solved
  • θ₀ — the θ at which to evaluate $J$ (default: result.θ if it exists, otherwise θ₀ must be given)

Keyword arguments:

  • z₀ — Starting guess for the latent space MAPs. Defaults to a random sample from the prior.
  • ∇z_logLike_atol = 1e-2 — Absolute tolerance on the $z$-gradient at the MAP solution.
  • rng — Random number generator to use. Taken from result.rng or Random.default_rng() if not passed.
  • nsims — How many simulations to average over (default: 100).
  • pmap — Parallel map function.
  • progress — Show progress bar (default: false).
  • skip_errors — Ignore any MAP that errors (default: false)
  • covariance_method — A CovarianceEstimator used to compute $J$ (default: SimpleCovariance(corrected=true))

MuseInference.get_H! — Function
get_H!(result::MuseResult, prob::AbstractMuseProblem, [θ₀=nothing]; kwargs...)

Compute the $H$ matrix, which is part of the MUSE covariance computation (see Millea & Seljak, 2021).

Positional arguments:

  • result — MuseResult into which to store the result
  • prob — AbstractMuseProblem being solved
  • θ₀ — the θ at which to evaluate $H$ (default: result.θ if it exists, otherwise θ₀ must be given)

Keyword arguments:

  • z₀ — Starting guess for the latent space MAPs. Defaults to a random sample from the prior.
  • ∇z_logLike_atol = 1e-2 — Absolute tolerance on the $z$-gradient at the MAP solution.
  • rng — Random number generator to use. Taken from result.rng or Random.default_rng() if not passed.
  • nsims — How many simulations to average over (default: 10).
  • pmap — Parallel map function.
  • progress — Show progress bar (default: false).
  • skip_errors — Ignore any MAP that errors (default: false)
  • fdm — A FiniteDifferenceMethod used to compute the finite difference Jacobian of AD gradients involved in computing $H$ (defaults to: FiniteDifferences.central_fdm(3,1))
  • step — A guess for the finite difference step-size (defaults to 0.1σ for each parameter, using J to estimate σ; for this reason it's recommended to run get_J! before get_H!). This is only a guess, because different choices of fdm may adapt it.
  • implicit_diff — Whether to use experimental implicit differentiation, rather than finite differences. This requires 2nd-order AD through your logLike, so pay close attention to your prob.autodiff. Either AD.HigherOrderBackend((AD.ForwardDiffBackend(), AD.ZygoteBackend())) or AD.ForwardDiffBackend() is recommended (default: false). See the sketch below.
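
A hypothetical sketch of enabling implicit differentiation, assuming a SimpleMuseProblem built with a second-order-capable backend:

prob = SimpleMuseProblem(x, sample_x_z, logLike;
    autodiff = AD.HigherOrderBackend((AD.ForwardDiffBackend(), AD.ZygoteBackend())))
get_H!(result, prob; implicit_diff=true)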

MuseInference.MuseResult — Type

Stores the result of a MUSE run. Can be constructed by-hand as MuseResult() and passed to any of the in-place muse!, get_J!, or get_H! (see the sketch after the field list below).

Fields:

  • θ — The estimate of the $\theta$ parameters.
  • Σ, Σ⁻¹ — The approximate covariance of $\theta$ and its inverse.
  • H, J — The $H$ and $J$ matrices which form the covariance (see Millea & Seljak, 2021)
  • gs — The MAP gradient sims used to compute J.
  • Hs — The Jacobian sims used to compute H.
  • dist — A Normal or MvNormal built from θ and Σ, for convenience.
  • history — Internal diagnostic info from the run.
  • rng — RNG used to generate sims for this run (so the same sims can be reused if resuming later).
  • time — Total wall-time (a Millisecond) spent computing the result.
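
A minimal sketch of inspecting a finished result, using the fields listed above:

result.θ                 # the MUSE estimate of the parameters
result.Σ                 # its approximate covariance
rand(result.dist, 1000)  # samples from the Gaussian approximation of the posterior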