Probabilistic Circuits

This page lists the API documentation for the ProbabilisticCircuits package.

Base.read - Method
Base.read(file::AbstractString, ::Type{C}) where C <: ProbCircuit

Reads a circuit from file; the file extension determines the format, for example ".psdd" for PSDDs.

source
Base.write - Method
Base.write(file::AbstractString, circuit::ProbCircuit)

Writes a circuit to file; the file name extension determines the file format.

source
DirectedAcyclicGraphs.foldup - Method
foldup(node::ProbCircuit, 
    f_i::Function, 
    f_m::Function, 
    f_s::Function)::T where {T}

Computes a function bottom-up on the circuit. f_i is called on input nodes, f_m is called on product nodes, and f_s is called on sum nodes. Values of type T are passed up the circuit and given to f_m and f_s through a callback from the children.
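
The callback style described above can be illustrated with a small, self-contained sketch. The node types (ToyInput, ToyProduct, ToySum) and the helpers toy_foldup and toy_likelihood are hypothetical stand-ins written for this example; they are not the package's own types or its foldup implementation. The sketch folds a tiny circuit bottom-up to compute a marginal likelihood, treating missing entries as marginalized out.

```julia
# Hypothetical toy node types for illustration only; these are NOT the
# node types defined by ProbabilisticCircuits.
abstract type ToyNode end

struct ToyInput <: ToyNode
    var::Int      # index of the variable this indicator looks at
    value::Bool   # value for which the indicator evaluates to 1
end

struct ToyProduct <: ToyNode
    children::Vector{ToyNode}
end

struct ToySum <: ToyNode
    children::Vector{ToyNode}
    weights::Vector{Float64}   # linear-space mixture weights
end

# Bottom-up fold: f_i handles input nodes; f_m and f_s receive the node and
# a callback that returns the folded value of a child. Results are cached so
# shared sub-circuits are evaluated once.
function toy_foldup(node::ToyNode, f_i, f_m, f_s, cache = IdDict{ToyNode,Any}())
    haskey(cache, node) && return cache[node]
    call(child) = toy_foldup(child, f_i, f_m, f_s, cache)
    result = if node isa ToyInput
        f_i(node)
    elseif node isa ToyProduct
        f_m(node, call)
    else
        f_s(node, call)
    end
    cache[node] = result
    return result
end

# Marginal likelihood of an assignment x; missing entries are marginalized.
function toy_likelihood(root::ToyNode, x)
    f_i(n) = ismissing(x[n.var]) ? 1.0 : Float64(x[n.var] == n.value)
    f_m(n, call) = prod(call(c) for c in n.children)
    f_s(n, call) = sum(w * call(c) for (w, c) in zip(n.weights, n.children))
    return toy_foldup(root, f_i, f_m, f_s)
end

# 0.3 * [x1 = true][x2 = true] + 0.7 * [x1 = false][x2 = true]
x2_true = ToyInput(2, true)
root = ToySum(
    [ToyProduct([ToyInput(1, true), x2_true]),
     ToyProduct([ToyInput(1, false), x2_true])],
    [0.3, 0.7],
)

println(toy_likelihood(root, [true, true]))
println(toy_likelihood(root, [missing, true]))
```

In this sketch the product and sum callbacks pull child values on demand through call, which mirrors the "callback from the children" wording; an aggregate variant would instead hand each callback a vector of already computed child values.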

source
DirectedAcyclicGraphs.foldup_aggregate - Method
foldup_aggregate(node::ProbCircuit, 
    f_i::Function, 
    f_m::Function, 
    f_s::Function, 
    ::Type{T})::T where T

Computes a function bottom-up on the circuit. f_i is called on input nodes, f_m is called on product nodes, and f_s is called on sum nodes. Values of type T are passed up the circuit and given to f_m and f_s in an aggregate vector from the children.

source
ProbabilisticCircuits.MAP - Method
MAP(bpc::CuBitsProbCircuit, data::CuArray; batch_size, mars_mem=nothing)

Returns the MAP states for a given circuit and data on GPU. Missing values should be denoted as missing.

Note that the MAP states are exact only when the circuit is both decomposable and deterministic; otherwise they are only an approximation.

  • bpc: BitCircuit on gpu
  • data: CuArray{Union{Missing, data_types...}}
  • batch_size
  • mars_mem: Not required, advanced usage. CuMatrix to reuse memory and reduce allocations. See prep_memory and cleanup_memory.
source
ProbabilisticCircuits.MAP - Method
MAP(pc::ProbCircuit, data::Matrix; batch_size, Float=Float32)

Evaluates the maximum a posteriori (MAP) state of the circuit for the given input(s) on CPU.

Note: This algorithm is only exact if the circuit is both decomposable and deterministic. If the circuit is decomposable but not deterministic, the results are inexact and come without guarantees.
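
To see why determinism matters, consider a minimal sketch of a single sum node over one binary variable X. It is not the package's MAP routine; the names weights, states, exact_prob, exact_map, and maxprod_map are hypothetical. Two of the three children support the same state, so the sum node is not deterministic, and choosing the single best child (as a max-based upward and downward pass does) disagrees with the true most likely state.

```julia
# Mixture over one binary variable X with three indicator components.
# Components 2 and 3 both support X = 0, so the sum node is NOT deterministic.
weights = [0.4, 0.3, 0.3]
states  = [1, 0, 0]   # the single state each component puts all its mass on

# Exact marginal of a state: add up the weights of all components supporting it.
exact_prob(s) = sum(w for (w, t) in zip(weights, states) if t == s)
exact_map = exact_prob(1) > exact_prob(0) ? 1 : 0

# Max-based approximation: replace the sum by a max and follow the best child.
best_child = argmax(weights)
maxprod_map = states[best_child]

println("exact MAP state:     ", exact_map)
println("max-based MAP state: ", maxprod_map)
```

Here the exact answer is X = 0 (total mass 0.6), while following the best single child yields X = 1 (weight 0.4). In a deterministic circuit at most one child is nonzero for any complete assignment, so sum and max coincide and the discrepancy disappears.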

source
ProbabilisticCircuits.RAT - Method
RAT(num_features; input_func::Function = RAT_InputFunc(Literal), num_nodes_region, num_nodes_leaf, rg_depth, rg_replicas, num_nodes_root = 1, balance_childs_parents = true)

Generates a RAT-SPN structure. First, it generates a random region graph with the given depth and number of replicas. It then uses that region graph to generate a ProbCircuit conforming to it.

  • num_features: Number of features in the dataset, assuming x1...xn
  • input_func: Function that generates a new input node for variable var when called as input_func(var).

The hyperparameters are:

  • rg_depth: how many layers to do splits in the region graph
  • rg_replicas: number of replicas or partitions (replicas are only used for the root region; other regions have 1 partition (inner nodes) or 0 partitions (leaves))
  • num_nodes_root: number of sum nodes in the root region
  • num_nodes_leaf: number of sum nodes per leaf region
  • num_nodes_region: number of sum nodes in each region except root and leaves
  • num_splits: number of splits for each partition; splits variables into random, equally sized regions
source
ProbabilisticCircuits.RAT_InputFunc - Method

Default input_func for different types. This function returns another function, input_func. Calling input_func(var) should then generate a new input node with the desired distribution.

source
ProbabilisticCircuits.balanced_fully_factorized_leaves - Method

Makes sure input nodes don't have too many parents. Makes a dummy sum node for each input per partition. Nodes corresponding to the partition then use the dummy node as their child instead of the input node. This way, instead of num_nodes_root * num_nodes_leaf, we would have num_nodes_root parent nodes.

source
ProbabilisticCircuits.bits - Method
bits(d::InputDist, heap)

Appends the required memory for this input dist to the heap.

Used internally for moving from CPU to GPU.

source
ProbabilisticCircuits.clear_memory - Method
clear_memory(d::InputDist, heap, rate)

Clears the accumulated flow values on the heap by multiplying them by rate. Setting rate == 0.0 is equivalent to resetting the values to 0.0.

source
ProbabilisticCircuits.eval_circuit! - Method
eval_circuit!(mars, linPC::AbstractVector{<:ProbCircuit}, data::Matrix, example_ids;  node2idx::Dict{ProbCircuit, UInt32}, Float=Float32)

Used internally. Evaluates the marginals of the circuit on cpu. Stores the values in mars.

  • mars: (batch_size, nodes)
  • linPC: linearized PC. (i.e. linearize(pc))
  • data: data Matrix (num_examples, features)
  • example_ids: Array or collection of ids for current batch
  • node2idx: Index of each ProbCircuit node in the linearized circuit
source
ProbabilisticCircuits.eval_circuit_max! - Method
eval_circuit_max!(mars, linPC::AbstractVector{<:ProbCircuit}, data::Matrix, example_ids;  node2idx::Dict{ProbCircuit, UInt32}, Float=Float32)

Used internally. Evaluates the MAP upward pass of the circuit on cpu. Stores the values in mars.

  • mars: (batch_size, nodes)
  • linPC: linearized PC. (i.e. linearize(pc))
  • data: data Matrix (num_examples, features)
  • example_ids: Array or collection of ids for current batch
  • node2idx: Index of each ProbCircuit node in the linearized circuit
source
ProbabilisticCircuits.full_batch_em - Method
full_batch_em(bpc::CuBitsProbCircuit, raw_data::CuArray, num_epochs; batch_size, pseudocount)

Updates the parameters of the CuBitsProbCircuit by doing EM on the full batch (i.e. the parameters are updated at the end of each epoch).

source
ProbabilisticCircuits.hclt - Method
hclt(data, num_hidden_cats; num_cats = nothing, input_type = LiteralDist)

Learns a HiddenChowLiuTree (hclt) circuit structure from data.

  • data: Matrix or CuMatrix
  • num_hidden_cats: Number of categories in hidden variables
  • input_type: Distribution type for the inputs
  • num_cats: Number of categories (in case of categorical inputs). Automatically deduced if not given explicitly.
source
ProbabilisticCircuits.loglikelihood - Method
loglikelihood(root::ProbCircuit, data::Matrix, example_id; Float=Float32)

Computes marginal loglikelihood recursively on cpu for a single instance data[example_id, :].

Note: Quite slow, only use for demonstration/educational purposes.

source
ProbabilisticCircuits.loglikelihoods - Method
loglikelihoods(bpc::CuBitsProbCircuit, data::CuArray; batch_size, mars_mem = nothing)

Returns loglikelihoods for each datapoint on gpu. Missing values should be denoted by missing.

  • bpc: BitCircuit on gpu
  • data: CuArray{Union{Missing, data_types...}}
  • batch_size
  • mars_mem: Not required, advanced usage. CuMatrix to reuse memory and reduce allocations. See prep_memory and cleanup_memory.
source
ProbabilisticCircuits.loglikelihoods - Method
loglikelihoods(pc::ProbCircuit, data::Matrix)

Computes loglikelihoods of the circuit over the data on cpu. Linearizes the circuit and computes the marginals in batches.

source
ProbabilisticCircuits.map_down_rec! - Method
map_down_rec!(mars, node::ProbCircuit, data, states::Matrix, batch_idx, example_idx; node2idx::Dict{ProbCircuit, UInt32}, Float=Float32)

Downward pass on CPU for MAP. Recursively chooses the best (max) child of each sum node according to the "MAP upward pass" values. Updates the missing values with the map_state of the corresponding input node.

source
ProbabilisticCircuits.mini_batch_em - Method
mini_batch_em(bpc::CuBitsProbCircuit, raw_data::CuArray, num_epochs; batch_size, pseudocount,
    param_inertia, param_inertia_end = param_inertia, shuffle=:each_epoch)

Updates the parameters of the CuBitsProbCircuit by doing EM, updating the parameters after each batch.

source
ProbabilisticCircuits.prep_memory - Function
prep_memory(reuse, sizes, exact = map(x -> true, sizes))

Mostly used internally. Prepares memory for the specified sizes, reusing reuse if possible to avoid memory allocation/deallocation.

source
ProbabilisticCircuits.random_region_graph - Method
random_region_graph(X::AbstractVector, depth::Int = 5, replicas::Int = 2, num_splits::Int = 2)
  • X: Vector of all variables to include; for the root region
  • depth: how many layers to do splits
  • replicas: number of replicas or partitions (replicas are only used for the root region; other regions have 1 partition (inner nodes) or 0 partitions (leaves))
  • num_splits: number of splits for each partition; splits variables into random, equally sized regions
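
The recursive splitting described by depth and num_splits can be sketched as follows. This is a simplified illustration under stated assumptions, not the package's implementation: toy_random_split and toy_region_leaves are hypothetical helpers, replicas are not modeled (every region gets exactly one partition), and only the variable sets of the leaf regions are returned rather than a region graph.

```julia
using Random

# Hypothetical helper: shuffle the variables and cut them into `num_splits`
# chunks whose sizes differ by at most one.
function toy_random_split(vars::AbstractVector, num_splits::Int; rng = Random.default_rng())
    shuffled = shuffle(rng, vars)
    base, extra = divrem(length(shuffled), num_splits)
    parts = Vector{typeof(shuffled)}()
    start = 1
    for i in 1:num_splits
        len = base + (i <= extra ? 1 : 0)
        push!(parts, shuffled[start:start + len - 1])
        start += len
    end
    return parts
end

# Hypothetical helper: split recursively for `depth` layers and collect the
# variable sets of the resulting leaf regions. A region with fewer variables
# than `num_splits` is not split further.
function toy_region_leaves(vars::AbstractVector, depth::Int, num_splits::Int;
                           rng = Random.default_rng())
    if depth == 0 || length(vars) < num_splits
        return [collect(vars)]
    end
    leaves = Vector{Vector{eltype(vars)}}()
    for part in toy_random_split(vars, num_splits; rng = rng)
        append!(leaves, toy_region_leaves(part, depth - 1, num_splits; rng = rng))
    end
    return leaves
end

leaves = toy_region_leaves(1:8, 2, 2; rng = MersenneTwister(1))
println(leaves)
```

With 8 variables, depth 2, and 2 splits per layer, the sketch produces 4 leaf regions of 2 variables each; which variables end up together depends on the random number generator.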
source
ProbabilisticCircuits.region_graph_2_pc - Method
region_graph_2_pc(node::RegionGraph; num_nodes_root, num_nodes_region, num_nodes_leaf, balance_childs_parents)
  • num_nodes_root: number of sum nodes in the root region
  • num_nodes_leaf: number of sum nodes per leaf region
  • num_nodes_region: number of sum nodes in each region except root and leaves
source
ProbabilisticCircuits.sample - Method
sample(bpc::CuBitsProbCircuit, num_samples, data::CuMatrix; rng=default_rng())

Generates num_samples samples for each datapoint in data from the joint distribution of the circuit conditioned on the data. Samples are generated on the GPU.

  • bpc: Circuit on gpu (CuBitsProbCircuit)
  • num_samples: how many samples to generate
  • rng: (Optional) Random Number Generator

The size of the returned CuArray is (num_samples, size(data, 1), size(data, 2)).

source
ProbabilisticCircuits.sample - Method
sample(bpc::CuBitsProbCircuit, num_samples::Int, num_rand_vars::Int, types; rng=default_rng())

Generates num_samples samples from the joint distribution of the circuit without any conditions. Samples are generated on the GPU.

  • bpc: Circuit on gpu (CuBitsProbCircuit)
  • num_samples: how many samples to generate
  • num_rand_vars: number of random variables in the circuit
  • types: Array of possible input types
  • rng: (Optional) Random Number Generator

The size of the returned Array is (num_samples, 1, num_rand_vars).

source
ProbabilisticCircuits.sample - Method
sample(pc::ProbCircuit, num_samples; rng = default_rng())

Generate num_samples from the joint distribution of the circuit without any conditions. Samples are generated on the CPU.

source
ProbabilisticCircuits.sample - Method
sample(pc::ProbCircuit, num_samples, data::Matrix; batch_size, rng = default_rng())

Generates num_samples samples from the joint distribution of the circuit conditioned on the data.

source
ProbabilisticCircuits.sample_state - Method
sample_state(d::InputDist, threshold::Float32, heap)

Returns a sample from InputDist. threshold is a uniform random value in the range (0, 1) given to this API by the sampling algorithm.
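
One common way to turn a uniform threshold into a sample is inverse-CDF sampling. The sketch below shows this for a categorical distribution; toy_sample_categorical is a hypothetical helper, it works on a plain probability vector rather than an InputDist and heap, and it is not claimed to be how the package's input distributions implement sample_state.

```julia
# Hypothetical helper: return the first state whose cumulative probability
# exceeds the uniform threshold (inverse-CDF sampling).
function toy_sample_categorical(probs::Vector{Float64}, threshold::Float64)
    cumulative = 0.0
    for (state, p) in enumerate(probs)
        cumulative += p
        if threshold < cumulative
            return state
        end
    end
    # Guard against rounding when the probabilities sum to slightly below 1.
    return length(probs)
end

probs = [0.2, 0.5, 0.3]
println(toy_sample_categorical(probs, 0.1))
println(toy_sample_categorical(probs, 0.6))
println(toy_sample_categorical(probs, 0.95))
```

Passing the threshold in from the outside, as the signature above does, keeps the random number generation in the sampling algorithm and makes the per-distribution step deterministic given that value.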

source
ProbabilisticCircuits.unbits - Method
unbits(d::InputDist, heap)

Returns the InputDist struct from the heap. Note that each input dist type needs to store where in the heap its parameters are in order to do this.

Used internally for moving from GPU to CPU.

source