Probabilistic Circuits

This page lists the API documentation for the ProbabilisticCircuits package.

ProbabilisticCircuits.Categorical — Type

An N-value categorical input distribution ranging over the integers [0...N-1]
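
As a rough illustration of the 0-based value range, the sketch below stores one log-probability per value and looks it up with an offset of one. `ToyCategorical` and `toy_loglikelihood` are hypothetical names invented for this example; they are not the package's `Categorical` type or its `loglikelihood` method, whose internals may differ.

```julia
# Hypothetical stand-in for a categorical input distribution over 0..N-1.
struct ToyCategorical
    logps::Vector{Float32}   # logps[v + 1] is the log-probability of value v
end

# Uniform distribution over `num_cats` values (an assumption for the example).
ToyCategorical(num_cats::Integer) =
    ToyCategorical(fill(-log(Float32(num_cats)), num_cats))

# Values are 0-based, Julia arrays are 1-based, hence the `+ 1`.
toy_loglikelihood(d::ToyCategorical, value::Integer) = d.logps[value + 1]

d = ToyCategorical(4)
ll_first = toy_loglikelihood(d, 0)   # log(1/4)
ll_last = toy_loglikelihood(d, 3)    # log(1/4)
```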

source

ProbabilisticCircuits.FlatVectors — Type

An isbits representation of an AbstractVector{<:AbstractVector}
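
One common way to flatten a vector of vectors is to concatenate the inner vectors and record where each one ends. The sketch below shows only that flattening idea, under the assumption that this is roughly the layout intended; `ToyFlatVectors` and `inner` are hypothetical names, and the sketch itself is not isbits because it holds ordinary `Vector`s.

```julia
# Hypothetical flattened layout for a vector of vectors.
struct ToyFlatVectors{T}
    values::Vector{T}   # all inner vectors, concatenated
    ends::Vector{Int}   # ends[i] is the index in `values` of the last element of vector i
end

function ToyFlatVectors(vs::Vector{Vector{T}}) where {T}
    ToyFlatVectors{T}(reduce(vcat, vs; init = T[]), cumsum(length.(vs)))
end

# Recover the i-th inner vector as a view, without copying.
function inner(fv::ToyFlatVectors, i::Int)
    start = i == 1 ? 1 : fv.ends[i - 1] + 1
    return view(fv.values, start:fv.ends[i])
end

fv = ToyFlatVectors([[1, 2], [3], [4, 5, 6]])
second = inner(fv, 2)
third = inner(fv, 3)
```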

source

ProbabilisticCircuits.Indicator — Type

An input distribution node that places all probability on a single value

source

ProbabilisticCircuits.Literal — Type

A logical literal input distribution node

source

ProbabilisticCircuits.NodeType — Type

Probabilistic circuit node types

source

ProbabilisticCircuits.NodeType — Method

Get the probabilistic circuit node type

source

ProbabilisticCircuits.PlainInnerNode — Type

A probabilistic inner node

source

ProbabilisticCircuits.PlainInputNode — Type

A probabilistic input node

source

ProbabilisticCircuits.PlainMulNode — Type

A probabilistic multiplication node

source

ProbabilisticCircuits.PlainProbCircuit — Type

Root of the plain probabilistic circuit node hierarchy

source

ProbabilisticCircuits.PlainSumNode — Type

A probabilistic summation node

source

ProbabilisticCircuits.ProbCircuit — Type

Root of the probabilistic circuit node hierarchy

source

ProbabilisticCircuits.RegionGraph — Type

Root of region graph node hierarchy

source

Base.read — Method

Base.read(file::AbstractString, ::Type{C}) where C <: ProbCircuit

Reads circuit from file; uses extension to detect format type, for example ".psdd" for PSDDs.

source

Base.write — Method

Base.write(file::AbstractString, circuit::ProbCircuit)

Writes circuit to file; uses the file name extension to detect the file format.

source

DirectedAcyclicGraphs.foldup — Method

foldup(node::ProbCircuit, 
    f_i::Function, 
    f_m::Function, 
    f_s::Function)::T where {T}

Compute a function bottom-up on the circuit. f_i is called on input nodes, f_m is called on product nodes, and f_s is called on sum nodes. Values of type T are passed up the circuit and given to f_m and f_s through a callback from the children.
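
The callback pattern described here can be sketched on a toy circuit. Everything in the block (`SketchNode`, `SketchInput`, `SketchMul`, `SketchSum`, `sketch_foldup`) is hypothetical and written only for illustration; it mimics the calling convention, not the package's implementation.

```julia
# Hypothetical toy node types; not the package's ProbCircuit hierarchy.
abstract type SketchNode end
struct SketchInput <: SketchNode
    var::Int
end
struct SketchMul <: SketchNode
    children::Vector{SketchNode}
end
struct SketchSum <: SketchNode
    children::Vector{SketchNode}
end

# Bottom-up fold. The cache ensures a shared sub-circuit is computed only once.
function sketch_foldup(node::SketchNode, f_i, f_m, f_s, result_type::Type{T},
                       cache = IdDict{SketchNode,T}()) where {T}
    haskey(cache, node) && return cache[node]
    # Callback handed to f_m and f_s to obtain the value of a child.
    call(child) = sketch_foldup(child, f_i, f_m, f_s, result_type, cache)
    value = if node isa SketchInput
        f_i(node)
    elseif node isa SketchMul
        f_m(node, call)
    else
        f_s(node, call)
    end
    cache[node] = value
    return value
end

# Example: collect the variables mentioned below the root.
x1, x2 = SketchInput(1), SketchInput(2)
mul = SketchMul([x1, x2])
root = SketchSum([mul, mul])
scope = sketch_foldup(root,
    n -> BitSet([n.var]),
    (n, call) -> mapreduce(call, union, n.children),
    (n, call) -> mapreduce(call, union, n.children),
    BitSet)
```

Here f_m and f_s decide when to invoke the callback on each child, so a fold can skip children whose values it does not need.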

source

DirectedAcyclicGraphs.foldup_aggregate — Method

foldup_aggregate(node::ProbCircuit, 
    f_i::Function, 
    f_m::Function, 
    f_s::Function, 
    ::Type{T})::T where T

source

ProbabilisticCircuits.MAP — Method

MAP(bpc::CuBitsProbCircuit, data::CuArray; batch_size, mars_mem=nothing)

Returns the MAP states for a given circuit and data on the GPU. Missing values should be denoted as missing.

Note that the MAP states are exact only when the circuit is both decomposable and deterministic; otherwise they are only an approximation.

bpc: BitCircuit on gpu
data: CuArray{Union{Missing, data_types...}}
batch_size
mars_mem: Not required, advanced usage. CuMatrix to reuse memory and reduce allocations. See prep_memory and cleanup_memory.

source

ProbabilisticCircuits.MAP — Method

MAP(pc::ProbCircuit, data::Matrix; batch_size, Float=Float32)

Evaluate the maximum a posteriori (MAP) state of the circuit for the given input(s) on cpu.

Note: This algorithm is only exact if the circuit is both decomposable and deterministic. If the circuit is only decomposable and not deterministic, this will give inexact results without guarantees.

source

ProbabilisticCircuits.RAT — Method

RAT(num_features; input_func::Function = RAT_InputFunc(Literal), num_nodes_region, num_nodes_leaf, rg_depth, rg_replicas, num_nodes_root = 1, balance_childs_parents = true)

Generate a RAT-SPN structure. First, it generates a random region graph with the given depth and number of replicas. Then it uses the random region graph to generate a ProbCircuit conforming to that region graph.

num_features: Number of features in the dataset, assuming x1...xn
input_func: Function to generate a new input node for variable when calling input_func(var).

The hyperparameters are:

rg_depth: how many layers to do splits in the region graph
rg_replicas: number of replicas or partitions (replicas only used for the root region; for other regions only 1 partition (inner nodes), or 0 partitions for leaves)
num_nodes_root: number of sum nodes in the root region
num_nodes_leaf: number of sum nodes per leaf region
num_nodes_region: number of nodes in each region except root and leaves
num_splits: number of splits for each partition; split variables into random equally sized regions

source

ProbabilisticCircuits.RAT_InputFunc — Method

Default input_func for different types. This function returns another function input_func. Then input_func(var) should generate a new input node with the desired distribution.

source

ProbabilisticCircuits.balance_sum — Method

Makes sure the sum nodes do not have too many children. Makes balanced sums of sums to reduce the children count.

source

ProbabilisticCircuits.balanced_fully_factorized_leaves — Method

Makes sure input nodes don't have too many parents. Makes a dummy sum node for each input per partition. Then nodes corresponding to the partition use the dummy node as their children instead of the input node. This way, instead of num_nodes_root * num_nodes_leaf, we would have num_nodes_root parent nodes.

source

ProbabilisticCircuits.bits — Method

bits(d::InputDist, heap)

Appends the required memory for this input dist to the heap.

Used internally for moving from CPU to GPU.

source

ProbabilisticCircuits.cleanup_memory — Method

Cleans up allocated memory. Used internally.

source

ProbabilisticCircuits.clear_memory — Method

clear_memory(d::InputDist, heap, rate)

Clears the accumulated flow values on the heap by multiplying them by rate. rate == 0.0 is equivalent to initializing the values to 0.0.

source

ProbabilisticCircuits.dist — Function

Get the distribution of a PC input node

source

ProbabilisticCircuits.eval_circuit! — Method

eval_circuit!(mars, linPC::AbstractVector{<:ProbCircuit}, data::Matrix, example_ids;  node2idx::Dict{ProbCircuit, UInt32}, Float=Float32)

Used internally. Evaluates the marginals of the circuit on cpu. Stores the values in mars.

mars: (batch_size, nodes)
linPC: linearized PC. (i.e. linearize(pc))
data: data Matrix (num_examples, features)
example_ids: Array or collection of ids for current batch
node2idx: Index of each ProbCircuit node in the linearized circuit

source

ProbabilisticCircuits.eval_circuit_max! — Method

eval_circuit_max!(mars, linPC::AbstractVector{<:ProbCircuit}, data::Matrix, example_ids;  node2idx::Dict{ProbCircuit, UInt32}, Float=Float32)

Used internally. Evaluates the MAP upward pass of the circuit on cpu. Stores the values in mars.

mars: (batch_size, nodes)
linPC: linearized PC. (i.e. linearize(pc))
data: data Matrix (num_examples, features)
example_ids: Array or collection of ids for current batch
node2idx: Index of each ProbCircuit node in the linearized circuit

source

ProbabilisticCircuits.flow — Method

flow(d::InputDist, value, node_flow, heap)

Updates the "flow" values in the heap for the input node.

source

ProbabilisticCircuits.full_batch_em — Method

full_batch_em(bpc::CuBitsProbCircuit, raw_data::CuArray, num_epochs; batch_size, pseudocount)

Update the parameters of the CuBitsProbCircuit by doing EM on the full batch (i.e. update parameters at the end of each epoch).

source

ProbabilisticCircuits.hclt — Method

hclt(data, num_hidden_cats; num_cats = nothing, input_type = LiteralDist)

Learns HiddenChowLiuTree (hclt) circuit structure from data.

data: Matrix or CuMatrix
num_hidden_cats: Number of categories in hidden variables
input_type: Distribution type for the inputs
num_cats: Number of categories (in case of categorical inputs). Automatically deduced if not given explicitly.

source

ProbabilisticCircuits.init_heap_map_loglikelihood! — Method

init_heap_map_loglikelihood!(d::InputDist, heap)

Initializes the heap for the input dist. Called before running MAP queries.

source

ProbabilisticCircuits.init_heap_map_state! — Method

init_heap_map_state!(d::InputDist, heap)

Initializes the heap for the input dist. Called before running MAP queries.

source

ProbabilisticCircuits.init_parameters — Method

init_parameters(pc::ProbCircuit; perturbation = 0.0)

Initialize parameters of ProbCircuit.

source

ProbabilisticCircuits.init_params — Method

init_params(d::InputDist, perturbation)

Returns a new distribution with same type with initialized parameters.

source

ProbabilisticCircuits.inputnodes — Method

Get all input nodes in a given circuit

source

ProbabilisticCircuits.inputs — Method

Get the inputs of a PC node

source

ProbabilisticCircuits.isinput — Method

Is the node an input node?

source

ProbabilisticCircuits.ismul — Method

Is the node a multiplication?

source

ProbabilisticCircuits.isonlysubedge — Method

Whether this sub edge is the only outgoing edge from sub

source

ProbabilisticCircuits.ispartial — Method

Whether this series of edges is partial or complete

source

ProbabilisticCircuits.issum — Method

Is the node a summation?

source

ProbabilisticCircuits.loglikelihood — Method

loglikelihood(bpc::CuBitsProbCircuit, data::CuArray; batch_size, mars_mem = nothing)

Computes the average loglikelihood of the circuit given the data, using the GPU. See loglikelihoods for more details.

source

ProbabilisticCircuits.loglikelihood — Method

loglikelihood(d::InputDist, value, heap)

Returns the log( P(input_var == value) ) according to the InputDist.

source

ProbabilisticCircuits.loglikelihood — Method

loglikelihood(root::ProbCircuit, data::Matrix, example_id; Float=Float32)

Computes marginal loglikelihood recursively on cpu for a single instance data[example_id, :].

Note: Quite slow, only use for demonstration/educational purposes.

source

ProbabilisticCircuits.loglikelihoods — Method

loglikelihoods(bpc::CuBitsProbCircuit, data::CuArray; batch_size, mars_mem = nothing)

Returns loglikelihoods for each datapoint on gpu. Missing values should be denoted by missing.

bpc: BitCircuit on gpu
data: CuArray{Union{Missing, data_types...}}
batch_size
mars_mem: Not required, advanced usage. CuMatrix to reuse memory and reduce allocations. See prep_memory and cleanup_memory.

source

ProbabilisticCircuits.loglikelihoods — Method

loglikelihoods(pc::ProbCircuit, data::Matrix)

Computes loglikelihoods of the circuit over the data on cpu. Linearizes the circuit and computes the marginals in batches.

source

ProbabilisticCircuits.loglikelihoods_vectorized — Method

Note: Experimental; will be removed or renamed later.

source

ProbabilisticCircuits.map_down_rec! — Method

map_down_rec!(mars, node::ProbCircuit, data, states::Matrix, batch_idx, example_idx; node2idx::Dict{ProbCircuit, UInt32}, Float=Float32)

Downward pass on cpu for MAP. Recursively chooses the best (max) sum node children according to the "MAP upward pass" values. Updates the missing values with map_state of that input node.
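
The two passes can be sketched on a toy circuit as follows. All names (`MaxNode`, `MaxInput`, `MaxMul`, `MaxSum`, `toy_max_up`, `toy_map_down!`) are hypothetical. Unlike the package, which stores upward values in mars, this sketch simply recomputes them, and it assumes a decomposable circuit so that sibling children of a product never share variables.

```julia
# Hypothetical toy node types for illustrating the MAP passes.
abstract type MaxNode end
struct MaxInput <: MaxNode
    var::Int
    logps::Vector{Float64}    # log-probability of each 0-based state
end
struct MaxMul <: MaxNode
    children::Vector{MaxNode}
end
struct MaxSum <: MaxNode
    children::Vector{MaxNode}
    logws::Vector{Float64}    # log-weights of the children
end

# Upward pass: sums are replaced by max; a missing entry takes its best state.
function toy_max_up(n::MaxNode, x)
    if n isa MaxInput
        ismissing(x[n.var]) ? maximum(n.logps) : n.logps[x[n.var] + 1]
    elseif n isa MaxMul
        sum(toy_max_up(c, x) for c in n.children)
    else
        maximum(w + toy_max_up(c, x) for (w, c) in zip(n.logws, n.children))
    end
end

# Downward pass: follow the best child of each sum and fill in missing states.
function toy_map_down!(n::MaxNode, x)
    if n isa MaxInput
        ismissing(x[n.var]) && (x[n.var] = argmax(n.logps) - 1)
    elseif n isa MaxMul
        foreach(c -> toy_map_down!(c, x), n.children)
    else
        scores = [w + toy_max_up(c, x) for (w, c) in zip(n.logws, n.children)]
        toy_map_down!(n.children[argmax(scores)], x)
    end
    return x
end

# Two binary variables; the first is observed as 0, the second is missing.
branch_a = MaxMul([MaxInput(1, log.([0.9, 0.1])), MaxInput(2, log.([0.2, 0.8]))])
branch_b = MaxMul([MaxInput(1, log.([0.3, 0.7])), MaxInput(2, log.([0.6, 0.4]))])
map_root = MaxSum([branch_a, branch_b], log.([0.5, 0.5]))
x = Union{Missing,Int}[0, missing]
up_value = toy_max_up(map_root, x)    # log(0.5 * 0.9 * 0.8)
map_x = toy_map_down!(map_root, x)    # second variable filled with state 1
```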

source

ProbabilisticCircuits.map_loglikelihood — Method

map_loglikelihood(d::InputDist, heap)

Returns the MAP loglikelihood, i.e. the loglikelihood of the most likely state of the InputDist d

source

ProbabilisticCircuits.map_state — Method

map_state(d::InputDist, heap)

Returns the MAP state for the InputDist d

source

ProbabilisticCircuits.mini_batch_em — Method

mini_batch_em(bpc::CuBitsProbCircuit, raw_data::CuArray, num_epochs; batch_size, pseudocount,
    param_inertia, param_inertia_end = param_inertia, shuffle=:each_epoch)

Update the parameters of the CuBitsProbCircuit by doing EM, update the parameters after each batch.

source

ProbabilisticCircuits.mulnodes — Method

Get all multiplication nodes in a given circuit

source

ProbabilisticCircuits.multiply — Function

Multiply nodes into a single circuit

source

ProbabilisticCircuits.num_inputs — Method

Number of inputs of a PC node

source

ProbabilisticCircuits.num_parameters — Function

Count the number of parameters in the circuit

source

ProbabilisticCircuits.num_parameters — Method

num_parameters(d::InputDist, independent)

Returns number of parameters for the input dist.

independent: whether to only count independent parameters

source

ProbabilisticCircuits.num_parameters_node — Method

Count the number of parameters in the node

source

ProbabilisticCircuits.num_randvars — Method

Number of variables in the data structure

source

ProbabilisticCircuits.params — Method

params(d::InputDist)

Returns parameters of the input dist.

source

ProbabilisticCircuits.params — Method

Get the parameters associated with a node

source

ProbabilisticCircuits.prep_memory — Function

prep_memory(reuse, sizes, exact = map(x -> true, sizes))

Mostly used internally. Prepares memory for the specified size, reuses reuse if possible to avoid memory allocation/deallocation.

source

ProbabilisticCircuits.psdd_num_nodes_leafs — Method

Count the number of decision and leaf nodes in the PSDD

source

ProbabilisticCircuits.random_region_graph — Method

random_region_graph(X::AbstractVector, depth::Int = 5, replicas::Int = 2, num_splits::Int = 2)

X: Vector of all variables to include; for the root region
depth: how many layers to do splits
replicas: number of replicas or partitions (replicas only used for the root region; for other regions only 1 partition (inner nodes), or 0 partitions for leaves)
num_splits: number of splits for each partition; split variables into random equally sized regions
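
A single partition step, splitting variables into random, equally sized regions, might look like the sketch below. `toy_random_split` is a hypothetical helper written for this example; how the package handles sizes that do not divide evenly is not stated here, so the sketch assumes part sizes may differ by at most one.

```julia
using Random

# Hypothetical helper: shuffle the variables, then cut them into `num_splits` parts.
function toy_random_split(vars::Vector{Int}, num_splits::Int;
                          rng = Random.default_rng())
    shuffled = shuffle(rng, vars)
    n = length(shuffled)
    # Part k covers indices bounds[k] + 1 through bounds[k + 1].
    bounds = [div(k * n, num_splits) for k in 0:num_splits]
    return [shuffled[(bounds[k] + 1):bounds[k + 1]] for k in 1:num_splits]
end

parts = toy_random_split(collect(1:7), 2; rng = MersenneTwister(0))
part_sizes = length.(parts)
```

Applying such a split recursively to each part, down to the requested depth, would give a tree of regions in the spirit of the description above.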

source

ProbabilisticCircuits.randvars — Function

variables(pc::ProbCircuit)::BitSet

Get a bitset of variables mentioned in the circuit.

source

ProbabilisticCircuits.region_graph_2_pc — Method

region_graph_2_pc(node::RegionGraph; num_nodes_root, num_nodes_region, num_nodes_leaf, balance_childs_parents)

num_nodes_root: number of sum nodes in the root region
num_nodes_leaf: number of sum nodes per leaf region
num_nodes_region: number of nodes in each region except root and leaves

source

ProbabilisticCircuits.sample — Method

sample(bpc::CuBitsProbCircuit, num_samples, data::CuMatrix; rng=default_rng())

Generate num_samples for each datapoint in data from the joint distribution of the circuit conditioned on the data. Samples are generated using GPU.

bpc: Circuit on gpu (CuBitsProbCircuit)
num_samples: how many samples to generate
rng: (Optional) Random Number Generator

The size of returned CuArray is (num_samples, size(data, 1), size(data, 2)).

source

ProbabilisticCircuits.sample — Method

sample(bpc::CuBitsProbCircuit, num_samples::Int, num_rand_vars::Int, types; rng=default_rng())

Generate num_samples from the joint distribution of the circuit without any conditions. Samples are generated on the GPU.

bpc: Circuit on gpu (CuBitsProbCircuit)
num_samples: how many samples to generate
num_rand_vars: number of random variables in the circuit
types: Array of possible input types
rng: (Optional) Random Number Generator

The size of the returned Array is (num_samples, 1, num_rand_vars).

source

ProbabilisticCircuits.sample — Method

sample(pc::ProbCircuit, num_samples; rng = default_rng())

Generate num_samples from the joint distribution of the circuit without any conditions. Samples are generated on the CPU.

source

ProbabilisticCircuits.sample — Method

sample(pc::ProbCircuit, num_samples, data::Matrix; batch_size, rng = default_rng())

Generate num_samples from the joint distribution of the circuit conditioned on the data.

source

ProbabilisticCircuits.sample_state — Method

sample_state(d::InputDist, threshold::Float32, heap)

Returns a sample from InputDist. threshold is a uniform random value in the range (0, 1), passed to this API by the sampling algorithm.
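
A uniform threshold is typically turned into a sample by walking the cumulative distribution until it exceeds the threshold. The sketch below does this for a categorical distribution with 0-based states; `toy_sample_state` is a hypothetical function, and the package's own input distributions may implement sampling differently.

```julia
# Hypothetical inverse-CDF sampler for a categorical distribution.
function toy_sample_state(probs::Vector{Float32}, threshold::Float32)
    cumulative = 0f0
    for (i, p) in enumerate(probs)
        cumulative += p
        # Return the first 0-based state whose cumulative mass exceeds the threshold.
        threshold < cumulative && return i - 1
    end
    # Guard against rounding error leaving the total slightly below the threshold.
    return length(probs) - 1
end

probs = Float32[0.2, 0.5, 0.3]
s_low = toy_sample_state(probs, 0.1f0)     # falls in the first state
s_mid = toy_sample_state(probs, 0.6f0)     # falls in the second state
s_high = toy_sample_state(probs, 0.95f0)   # falls in the third state
```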

source

ProbabilisticCircuits.soften_data — Method

Turn binary data into floating point data close to 0 and 1.
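
One plausible way to do this is to move each value a small distance toward the opposite label. The sketch assumes a keyword named `softness` with a default of 0.05; both the name and the default are assumptions for illustration, and `toy_soften` is not the package function.

```julia
# Hypothetical softening: true becomes 1 - softness, false becomes softness.
function toy_soften(data::AbstractMatrix{Bool}; softness = 0.05)
    return @. data * (1 - softness) + (1 - data) * softness
end

soft = toy_soften([true false; false true])
```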

source

ProbabilisticCircuits.summate — Function

Sum nodes into a single circuit

source

ProbabilisticCircuits.sumnodes — Method

Get all summation nodes in a given circuit

source

ProbabilisticCircuits.unbits — Method

unbits(d::InputDist, heap)

Returns the InputDist struct from the heap. Note that each input dist type needs to store where in the heap its parameters are to be able to do this.

Used internally for moving from GPU to CPU.

source

ProbabilisticCircuits.update_parameters — Method

Maps parameters from the BitsPC back to the ProbCircuit it was created from

source

ProbabilisticCircuits.update_params — Method

update_params(d::InputDist, heap, pseudocount, inertia)

Update the parameters of the InputDist using stored values on the heap and (pseudocount, inertia)
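
To make the roles of pseudocount and inertia concrete, here is a minimal sketch for a categorical distribution. It assumes the pseudocount is spread evenly over the categories and that inertia forms a convex combination of the old parameters and the new estimate; the package's actual update rule may differ, and `toy_update_params` is a hypothetical name.

```julia
# Hypothetical EM-style update from accumulated flows.
function toy_update_params(old_probs::Vector{Float64}, flows::Vector{Float64},
                           pseudocount::Float64, inertia::Float64)
    # Smooth the flows so that no category ends up with zero probability.
    smoothed = flows .+ pseudocount / length(flows)
    estimate = smoothed ./ sum(smoothed)
    # inertia = 0 uses only the new estimate; inertia = 1 keeps the old parameters.
    return inertia .* old_probs .+ (1 - inertia) .* estimate
end

old_probs = [0.5, 0.5]
flows = [3.0, 1.0]
fresh = toy_update_params(old_probs, flows, 0.0, 0.0)     # pure estimate
frozen = toy_update_params(old_probs, flows, 0.0, 1.0)    # unchanged parameters
blended = toy_update_params(old_probs, flows, 2.0, 0.5)   # smoothed and damped
```

Under these assumptions, a higher inertia makes each mini-batch move the parameters less, which is why a schedule from param_inertia to param_inertia_end can be useful.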

source