Probabilistic Circuits
This page lists the API documentation for the ProbabilisticCircuits package.
ProbabilisticCircuits.Categorical
— Type
An N-value categorical input distribution ranging over the integers [0...N-1]
ProbabilisticCircuits.FlatVectors
— Type
An isbits representation of an AbstractVector{<:AbstractVector}
ProbabilisticCircuits.Indicator
— Type
An input distribution node that places all probability on a single value
ProbabilisticCircuits.Literal
— Type
A logical literal input distribution node
ProbabilisticCircuits.NodeType
— Type
Probabilistic circuit node types
ProbabilisticCircuits.NodeType
— Method
Get the probabilistic circuit node type
ProbabilisticCircuits.PlainInnerNode
— Type
A probabilistic inner node
ProbabilisticCircuits.PlainInputNode
— Type
A probabilistic input node
ProbabilisticCircuits.PlainMulNode
— Type
A probabilistic multiplication node
ProbabilisticCircuits.PlainProbCircuit
— Type
Root of the plain probabilistic circuit node hierarchy
ProbabilisticCircuits.PlainSumNode
— Type
A probabilistic summation node
ProbabilisticCircuits.ProbCircuit
— Type
Root of the probabilistic circuit node hierarchy
ProbabilisticCircuits.RegionGraph
— Type
Root of the region graph node hierarchy
Base.read
— Method
Base.read(file::AbstractString, ::Type{C}) where C <: ProbCircuit
Reads a circuit from a file; uses the file name extension to detect the format, for example ".psdd" for PSDDs.
Base.write
— Method
Base.write(file::AbstractString, circuit::ProbCircuit)
Writes a circuit to a file; uses the file name extension to detect the file format.
DirectedAcyclicGraphs.foldup
— Method
foldup(node::ProbCircuit,
f_i::Function,
f_m::Function,
f_s::Function)::T where {T}
Compute a function bottom-up on the circuit. f_i is called on input nodes, f_m is called on product nodes, and f_s is called on sum nodes. Values of type T are passed up the circuit and given to f_m and f_s through a callback from the children.
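The callback convention can be illustrated with a minimal, self-contained sketch. It does not use the package: the `FoldNode` types and `toy_foldup` are hypothetical stand-ins, and the extra `::Type{T}` argument is an assumption made here only to type the cache. The example folds a small circuit into its scope (the set of variables it mentions).

```julia
# Hypothetical toy node types; they are not the package's own types.
abstract type FoldNode end
struct FoldInput <: FoldNode
    var::Int
end
struct FoldMul <: FoldNode
    children::Vector{FoldNode}
end
struct FoldSum <: FoldNode
    children::Vector{FoldNode}
end

# Bottom-up fold with a cache, so shared sub-circuits are computed once.
# f_m and f_s receive the node and a callback that returns a child's value.
function toy_foldup(node::FoldNode, f_i, f_m, f_s, ::Type{T},
                    cache::IdDict{FoldNode,T} = IdDict{FoldNode,T}()) where {T}
    haskey(cache, node) && return cache[node]
    call(child) = toy_foldup(child, f_i, f_m, f_s, T, cache)
    value = if node isa FoldInput
        f_i(node)
    elseif node isa FoldMul
        f_m(node, call)
    else
        f_s(node, call)
    end
    cache[node] = value
    return cache[node]
end

# Build a small circuit over variables 1, 2 and 3.
x1, x2, x3 = FoldInput(1), FoldInput(2), FoldInput(3)
mixture = FoldSum(FoldNode[FoldMul(FoldNode[x1, x2]), FoldMul(FoldNode[x1, x2])])
fold_root = FoldMul(FoldNode[mixture, x3])

# Scope: inputs contribute their variable, inner nodes take the union of their children.
scope_of_children(n, call) = mapreduce(call, union, n.children)
toy_scope = toy_foldup(fold_root, n -> BitSet([n.var]),
                       scope_of_children, scope_of_children, BitSet)
```

Because the inner-node functions pull child values through `call`, they can decide which children to visit and in what order.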
DirectedAcyclicGraphs.foldup_aggregate
— Method
foldup_aggregate(node::ProbCircuit,
f_i::Function,
f_m::Function,
f_s::Function,
::Type{T})::T where T
Compute a function bottom-up on the circuit. f_i is called on input nodes, f_m is called on product nodes, and f_s is called on sum nodes. Values of type T are passed up the circuit and given to f_m and f_s in an aggregate vector from the children.
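As a rough illustration of the aggregate convention, the sketch below computes the depth of a small circuit. It is self-contained and does not call the package; `AggNode`, its subtypes, and `toy_foldup_aggregate` are hypothetical names introduced only for this example.

```julia
# Hypothetical toy node types; they are not the package's own types.
abstract type AggNode end
struct AggInput <: AggNode
    var::Int
end
struct AggMul <: AggNode
    children::Vector{AggNode}
end
struct AggSum <: AggNode
    children::Vector{AggNode}
end

# Bottom-up fold where inner nodes receive a Vector{T} of their children's values.
function toy_foldup_aggregate(node::AggNode, f_i, f_m, f_s, ::Type{T},
                              cache::IdDict{AggNode,T} = IdDict{AggNode,T}()) where {T}
    haskey(cache, node) && return cache[node]
    value = if node isa AggInput
        f_i(node)
    else
        child_values = T[toy_foldup_aggregate(c, f_i, f_m, f_s, T, cache)
                         for c in node.children]
        node isa AggMul ? f_m(node, child_values) : f_s(node, child_values)
    end
    cache[node] = value
    return cache[node]
end

# A sum over two products of the same two inputs.
a1, a2 = AggInput(1), AggInput(2)
agg_root = AggSum(AggNode[AggMul(AggNode[a1, a2]), AggMul(AggNode[a1, a2])])

# Depth: inputs are at depth 0, every inner node adds one level.
one_more(_, vals) = 1 + maximum(vals)
toy_depth = toy_foldup_aggregate(agg_root, _ -> 0, one_more, one_more, Int)
```

Compared with the callback style, every child is always evaluated before its parent, which is simpler when the parent needs all child values anyway.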
ProbabilisticCircuits.MAP
— Method
MAP(bpc::CuBitsProbCircuit, data::CuArray; batch_size, mars_mem=nothing)
Returns the MAP states for a given circuit and data on GPU. Missing values should be denoted as missing.
Note that the MAP states are exact only when the circuit is both decomposable and deterministic; otherwise they are only an approximation.
bpc: BitCircuit on GPU
data: CuArray{Union{Missing, data_types...}}
batch_size
mars_mem: Not required, advanced usage. CuMatrix to reuse memory and reduce allocations. See prep_memory and cleanup_memory.
ProbabilisticCircuits.MAP
— Method
MAP(pc::ProbCircuit, data::Matrix; batch_size, Float=Float32)
Evaluate the maximum a posteriori (MAP) state of the circuit for the given input(s) on CPU.
Note: This algorithm is only exact if the circuit is both decomposable and deterministic. If the circuit is only decomposable and not deterministic, this will give inexact results without guarantees.
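To see why determinism matters, consider a hand-built example that does not use the package. The circuit is a single sum node over three indicator inputs on one binary variable X; two of the children support the same state, so the sum is not deterministic. All names below are hypothetical.

```julia
# Mixture weights of the sum node and the state each indicator child supports.
weights = [0.4, 0.3, 0.3]
states  = [1, 0, 0]          # child i puts all its mass on X == states[i]

# Exact marginal of a state: add the weights of all children that support it.
marginal(x) = sum(weights[states .== x])

# True MAP state: maximize the marginal over both values of X.
exact_map = argmax(marginal, (0, 1))

# Max-product pass: replace the sum by a max and follow the single best child.
maxprod_map = states[argmax(weights)]
```

Here the exact MAP state is X = 0 (total mass 0.6, split over two children), while the max-product pass follows the single heaviest child and returns X = 1 (mass 0.4). In a deterministic sum at most one child is nonzero for any complete assignment, so the max and the sum coincide and the pass is exact.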
ProbabilisticCircuits.RAT
— Method
RAT(num_features; input_func::Function = RAT_InputFunc(Literal), num_nodes_region, num_nodes_leaf, rg_depth, rg_replicas, num_nodes_root = 1, balance_childs_parents = true)
Generate a RAT-SPN structure. First, it generates a random region graph with the given depth and replicas. Then it uses the random region graph to generate a ProbCircuit conforming to that region graph.
num_features: Number of features in the dataset, assuming x1...xn
input_func: Function to generate a new input node for a variable when calling input_func(var).
The list of hyperparameters is:
rg_depth: how many layers to do splits in the region graph
rg_replicas: number of replicas or partitions (replicas are only used for the root region; other regions have only 1 partition (inner nodes) or 0 partitions (leaves))
num_nodes_root: number of sum nodes in the root region
num_nodes_leaf: number of sum nodes per leaf region
num_nodes_region: number of sum nodes in each region except root and leaves
num_splits: number of splits for each partition; split variables into random equally sized regions
ProbabilisticCircuits.RAT_InputFunc
— Method
Default input_func for different types. This function returns another function input_func. Then input_func(var) should generate a new input node with the desired distribution.
ProbabilisticCircuits.balance_sum
— Method
Makes sure that sum nodes do not have too many children. Makes balanced sums of sums to reduce the children count.
ProbabilisticCircuits.balanced_fully_factorized_leaves
— Method
Makes sure input nodes don't have too many parents. Makes a dummy sum node for each input per partition. Then the nodes corresponding to the partition use the dummy node as their child instead of the input node. This way, instead of num_nodes_root * num_nodes_leaf, we would have num_nodes_root parent nodes.
ProbabilisticCircuits.bits
— Method
bits(d::InputDist, heap)
Appends the required memory for this input dist to the heap.
Used internally for moving from CPU to GPU.
ProbabilisticCircuits.cleanup_memory
— Method
Cleans up allocated memory. Used internally.
ProbabilisticCircuits.clear_memory
— Method
clear_memory(d::InputDist, heap, rate)
Clears the accumulated flow values on the heap by multiplying them by rate. rate == 0.0 is equivalent to initializing the values to 0.0.
ProbabilisticCircuits.dist
— Function
Get the distribution of a PC input node
ProbabilisticCircuits.eval_circuit!
— Method
eval_circuit!(mars, linPC::AbstractVector{<:ProbCircuit}, data::Matrix, example_ids; node2idx::Dict{ProbCircuit, UInt32}, Float=Float32)
Used internally. Evaluates the marginals of the circuit on CPU. Stores the values in mars.
mars: (batch_size, nodes)
linPC: linearized PC (i.e. linearize(pc))
data: data Matrix (num_examples, features)
example_ids: Array or collection of ids for the current batch
node2idx: Index of each ProbCircuit node in the linearized circuit
ProbabilisticCircuits.eval_circuit_max!
— Method
eval_circuit_max!(mars, linPC::AbstractVector{<:ProbCircuit}, data::Matrix, example_ids; node2idx::Dict{ProbCircuit, UInt32}, Float=Float32)
Used internally. Evaluates the MAP upward pass of the circuit on CPU. Stores the values in mars.
mars: (batch_size, nodes)
linPC: linearized PC (i.e. linearize(pc))
data: data Matrix (num_examples, features)
example_ids: Array or collection of ids for the current batch
node2idx: Index of each ProbCircuit node in the linearized circuit
ProbabilisticCircuits.flow
— Method
flow(d::InputDist, value, node_flow, heap)
Updates the "flow" values in the heap for the input node.
ProbabilisticCircuits.full_batch_em
— Method
full_batch_em(bpc::CuBitsProbCircuit, raw_data::CuArray, num_epochs; batch_size, pseudocount)
Update the parameters of the CuBitsProbCircuit by doing EM on the full batch (i.e. update the parameters at the end of each epoch).
ProbabilisticCircuits.hclt
— Method
hclt(data, num_hidden_cats; num_cats = nothing, input_type = LiteralDist)
Learns a HiddenChowLiuTree (hclt) circuit structure from data.
data: Matrix or CuMatrix
num_hidden_cats: Number of categories in hidden variables
input_type: Distribution type for the inputs
num_cats: Number of categories (in case of categorical inputs). Automatically deduced if not given explicitly.
ProbabilisticCircuits.init_heap_map_loglikelihood!
— Method
init_heap_map_loglikelihood!(d::InputDist, heap)
Initializes the heap for the input dist. Called before running MAP queries.
ProbabilisticCircuits.init_heap_map_state!
— Method
init_heap_map_state!(d::InputDist, heap)
Initializes the heap for the input dist. Called before running MAP queries.
ProbabilisticCircuits.init_parameters
— Method
init_parameters(pc::ProbCircuit; perturbation = 0.0)
Initialize parameters of ProbCircuit.
ProbabilisticCircuits.init_params
— Method
init_params(d::InputDist, perturbation)
Returns a new distribution of the same type with initialized parameters.
ProbabilisticCircuits.inputnodes
— Method
Get all input nodes in a given circuit
ProbabilisticCircuits.inputs
— Method
Get the inputs of a PC node
ProbabilisticCircuits.isinput
— Method
Is the node an input node?
ProbabilisticCircuits.ismul
— Method
Is the node a multiplication?
ProbabilisticCircuits.isonlysubedge
— Method
Whether this sub edge is the only outgoing edge from sub
ProbabilisticCircuits.ispartial
— Method
Whether this series of edges is partial or complete
ProbabilisticCircuits.issum
— Method
Is the node a summation?
ProbabilisticCircuits.loglikelihood
— Method
loglikelihood(bpc::CuBitsProbCircuit, data::CuArray; batch_size, mars_mem = nothing)
Computes the average loglikelihood of the circuit given the data on GPU. See loglikelihoods for more details.
ProbabilisticCircuits.loglikelihood
— Method
loglikelihood(d::InputDist, value, heap)
Returns log( P(input_var == value) ) according to the InputDist.
ProbabilisticCircuits.loglikelihood
— Method
loglikelihood(root::ProbCircuit, data::Matrix, example_id; Float=Float32)
Computes the marginal loglikelihood recursively on CPU for a single instance data[example_id, :].
Note: Quite slow, only use for demonstration/educational purposes.
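The recursive evaluation can be sketched on a toy circuit, under the assumption that a missing value is marginalized out by letting its input contribute log(1) = 0. The sketch is self-contained and does not use the package; `LLNode`, its subtypes, and `toy_ll` are hypothetical names.

```julia
# Hypothetical toy node types over binary variables.
abstract type LLNode end
struct LLLiteral <: LLNode          # indicator: var == value
    var::Int
    value::Bool
end
struct LLMul <: LLNode
    children::Vector{LLNode}
end
struct LLSum <: LLNode
    children::Vector{LLNode}
    log_weights::Vector{Float64}
end

# Numerically stable log(sum(exp.(xs))); returns -Inf if every entry is -Inf.
function toy_logsumexp(xs)
    m = maximum(xs)
    return isinf(m) ? m : m + log(sum(exp.(xs .- m)))
end

# Inputs: a missing value is marginalized out (log 1), a mismatch has log-probability -Inf.
toy_ll(n::LLLiteral, x) = ismissing(x[n.var]) ? 0.0 : (x[n.var] == n.value ? 0.0 : -Inf)
# Products add the log-values of their children.
toy_ll(n::LLMul, x) = sum(toy_ll(c, x) for c in n.children)
# Sums mix their children with the log-weights.
toy_ll(n::LLSum, x) =
    toy_logsumexp([w + toy_ll(c, x) for (w, c) in zip(n.log_weights, n.children)])

# P(X1 = X2 = true) = 0.7 and P(X1 = X2 = false) = 0.3.
both_true  = LLMul(LLNode[LLLiteral(1, true),  LLLiteral(2, true)])
both_false = LLMul(LLNode[LLLiteral(1, false), LLLiteral(2, false)])
ll_root = LLSum(LLNode[both_true, both_false], log.([0.7, 0.3]))

ll_complete = toy_ll(ll_root, [true, true])        # log(0.7)
ll_partial  = toy_ll(ll_root, [true, missing])     # X2 marginalized out: log(0.7)
ll_empty    = toy_ll(ll_root, [missing, missing])  # everything marginalized out: log(1)
```

This version re-evaluates shared sub-circuits on every visit, which is one reason a recursive evaluator is slow compared with a linearized, batched pass.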
ProbabilisticCircuits.loglikelihoods
— Method
loglikelihoods(bpc::CuBitsProbCircuit, data::CuArray; batch_size, mars_mem = nothing)
Returns the loglikelihood of each datapoint on GPU. Missing values should be denoted by missing.
bpc: BitCircuit on GPU
data: CuArray{Union{Missing, data_types...}}
batch_size
mars_mem: Not required, advanced usage. CuMatrix to reuse memory and reduce allocations. See prep_memory and cleanup_memory.
ProbabilisticCircuits.loglikelihoods
— Method
loglikelihoods(pc::ProbCircuit, data::Matrix)
Computes the loglikelihoods of the circuit over the data on CPU. Linearizes the circuit and computes the marginals in batches.
ProbabilisticCircuits.loglikelihoods_vectorized
— Method
Note: Experimental; will be removed or renamed later
ProbabilisticCircuits.map_down_rec!
— Method
map_down_rec!(mars, node::ProbCircuit, data, states::Matrix, batch_idx, example_idx; node2idx::Dict{ProbCircuit, UInt32}, Float=Float32)
Downward pass on CPU for MAP. Recursively chooses the best (max) child of each sum node according to the "MAP upward pass" values. Updates the missing values with the map_state of the corresponding input node.
ProbabilisticCircuits.map_loglikelihood
— Method
map_loglikelihood(d::InputDist, heap)
Returns the loglikelihood of the most likely (MAP) state of the InputDist d
ProbabilisticCircuits.map_state
— Method
map_state(d::InputDist, heap)
Returns the MAP state for the InputDist d
ProbabilisticCircuits.mini_batch_em
— Method
mini_batch_em(bpc::CuBitsProbCircuit, raw_data::CuArray, num_epochs; batch_size, pseudocount,
param_inertia, param_inertia_end = param_inertia, shuffle=:each_epoch)
Update the parameters of the CuBitsProbCircuit by doing EM, updating the parameters after each batch.
ProbabilisticCircuits.mulnodes
— Method
Get all multiplication nodes in a given circuit
ProbabilisticCircuits.multiply
— Function
Multiply nodes into a single circuit
ProbabilisticCircuits.num_inputs
— Method
Number of inputs of a PC node
ProbabilisticCircuits.num_parameters
— Function
Count the number of parameters in the circuit
ProbabilisticCircuits.num_parameters
— Method
num_parameters(d::InputDist, independent)
Returns the number of parameters for the input dist.
independent: whether to only count independent parameters
ProbabilisticCircuits.num_parameters_node
— Method
Count the number of parameters in the node
ProbabilisticCircuits.num_randvars
— Method
Number of variables in the data structure
ProbabilisticCircuits.params
— Method
params(d::InputDist)
Returns the parameters of the input dist.
ProbabilisticCircuits.params
— Method
Get the parameters associated with a node
ProbabilisticCircuits.prep_memory
— Function
prep_memory(reuse, sizes, exact = map(x -> true, sizes))
Mostly used internally. Prepares memory for the specified size; reuses reuse if possible to avoid memory allocation/deallocation.
ProbabilisticCircuits.psdd_num_nodes_leafs
— Method
Count the number of decision and leaf nodes in the PSDD
ProbabilisticCircuits.random_region_graph
— Method
random_region_graph(X::AbstractVector, depth::Int = 5, replicas::Int = 2, num_splits::Int = 2)
X: Vector of all variables to include; for the root region
depth: how many layers to do splits
replicas: number of replicas or partitions (replicas are only used for the root region; other regions have only 1 partition (inner nodes) or 0 partitions (leaves))
num_splits: number of splits for each partition; split variables into random equally sized regions
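The parameters above suggest a recursive construction, sketched below without the package. This is only one plausible reading of the description: `random_split` and `toy_region_graph` are hypothetical helpers, regions are plain NamedTuples rather than RegionGraph nodes, and the stopping rule for small regions is an assumption.

```julia
using Random

# Split `vars` into `num_splits` random parts whose sizes differ by at most one.
function random_split(vars::AbstractVector, num_splits::Int; rng = Random.default_rng())
    shuffled = shuffle(rng, vars)
    n = length(shuffled)
    bounds = [round(Int, i * n / num_splits) for i in 0:num_splits]
    return [shuffled[bounds[i]+1:bounds[i+1]] for i in 1:num_splits]
end

# A region is (vars = ..., partitions = ...); each partition is a vector of child regions.
# Only the root uses `replicas` partitions; inner regions get 1 and leaves get 0.
function toy_region_graph(vars, depth, replicas, num_splits; rng = Random.default_rng())
    if depth == 0 || length(vars) < num_splits
        return (vars = sort(vars), partitions = [])   # leaf region
    end
    partitions = [
        [toy_region_graph(part, depth - 1, 1, num_splits; rng = rng)
         for part in random_split(vars, num_splits; rng = rng)]
        for _ in 1:replicas
    ]
    return (vars = sort(vars), partitions = partitions)
end

toy_rg = toy_region_graph(collect(1:8), 2, 2, 2; rng = MersenneTwister(1))
```

With 8 variables, depth 2, 2 replicas and 2 splits, the root has two partitions, each made of two 4-variable regions, which in turn split once more into 2-variable leaves.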
ProbabilisticCircuits.randvars
— Function
variables(pc::ProbCircuit)::BitSet
Get a bitset of variables mentioned in the circuit.
ProbabilisticCircuits.region_graph_2_pc
— Method
region_graph_2_pc(node::RegionGraph; num_nodes_root, num_nodes_region, num_nodes_leaf, balance_childs_parents)
num_nodes_root: number of sum nodes in the root region
num_nodes_leaf: number of sum nodes per leaf region
num_nodes_region: number of sum nodes in each region except root and leaves
ProbabilisticCircuits.sample
— Method
sample(bpc::CuBitsProbCircuit, num_samples, data::CuMatrix; rng=default_rng())
Generate num_samples for each datapoint in data from the joint distribution of the circuit conditioned on the data. Samples are generated on the GPU.
bpc: Circuit on GPU (CuBitsProbCircuit)
num_samples: how many samples to generate
rng: (Optional) Random Number Generator
The size of the returned CuArray is (num_samples, size(data, 1), size(data, 2)).
ProbabilisticCircuits.sample
— Method
sample(bpc::CuBitsProbCircuit, num_samples::Int, num_rand_vars::Int, types; rng=default_rng())
Generate num_samples from the joint distribution of the circuit without any conditions. Samples are generated on the GPU.
bpc: Circuit on GPU (CuBitsProbCircuit)
num_samples: how many samples to generate
num_rand_vars: number of random variables in the circuit
types: Array of possible input types
rng: (Optional) Random Number Generator
The size of the returned Array is (num_samples, 1, size(data, 2)).
ProbabilisticCircuits.sample
— Method
sample(pc::ProbCircuit, num_samples; rng = default_rng())
Generate num_samples from the joint distribution of the circuit without any conditions. Samples are generated on the CPU.
ProbabilisticCircuits.sample
— Method
sample(pc::ProbCircuit, num_samples, data::Matrix; batch_size, rng = default_rng())
Generate num_samples from the joint distribution of the circuit conditioned on the data.
ProbabilisticCircuits.sample_state
— Method
sample_state(d::InputDist, threshold::Float32, heap)
Returns a sample from the InputDist. threshold is a uniform random value in the range (0, 1) given to this API by the sampling algorithm.
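One common way to turn a uniform threshold into a sample is inverse-CDF sampling. The sketch below applies it to a categorical distribution over the integers 0..N-1; it is an illustration only, since the source does not say how any input distribution implements this, and `toy_sample_state` is a hypothetical name that works on a plain probability vector rather than the heap.

```julia
# Inverse-CDF sampling: walk the cumulative distribution until it exceeds the threshold.
function toy_sample_state(probs::Vector{Float64}, threshold::Float32)
    cumulative = 0.0
    for (i, p) in enumerate(probs)
        cumulative += p
        threshold < cumulative && return i - 1   # states are numbered from 0
    end
    return length(probs) - 1   # guard against rounding when threshold is close to 1
end

toy_probs = [0.2, 0.5, 0.3]
low_sample  = toy_sample_state(toy_probs, 0.10f0)  # falls in the first bucket
mid_sample  = toy_sample_state(toy_probs, 0.60f0)  # falls in the second bucket
high_sample = toy_sample_state(toy_probs, 0.95f0)  # falls in the third bucket
```

Passing the threshold in from the caller keeps the distribution code free of random-number state, which is convenient when the same routine has to run inside a GPU kernel.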
ProbabilisticCircuits.soften_data
— Method
Turn binary data into floating point data close to 0 and 1.
ProbabilisticCircuits.summate
— Function
Sum nodes into a single circuit
ProbabilisticCircuits.sumnodes
— Method
Get all summation nodes in a given circuit
ProbabilisticCircuits.unbits
— Method
unbits(d::InputDist, heap)
Returns the InputDist struct from the heap. Note that each input dist type needs to store where in the heap its parameters are to be able to do this.
Used internally for moving from GPU to CPU.
ProbabilisticCircuits.update_parameters
— Method
Map parameters from the BitsPC back to the ProbCircuit it was created from
ProbabilisticCircuits.update_params
— Method
update_params(d::InputDist, heap, pseudocount, inertia)
Update the parameters of the InputDist using the stored values on the heap and (pseudocount, inertia).
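The roles of pseudocount and inertia can be illustrated with a small sketch for a categorical distribution. The formula is an assumption, not taken from the source: the accumulated flows are smoothed by spreading the pseudocount evenly over the categories, normalized, and then blended with the old parameters using inertia as the weight on the old values. `toy_update_params` is a hypothetical function operating on plain vectors instead of the heap.

```julia
# Assumed update rule (not confirmed by the source):
#   estimate = (flows + pseudocount / N) / (sum(flows) + pseudocount)
#   new      = inertia * old + (1 - inertia) * estimate
function toy_update_params(old::Vector{Float64}, flows::Vector{Float64},
                           pseudocount::Float64, inertia::Float64)
    n = length(old)
    estimate = (flows .+ pseudocount / n) ./ (sum(flows) + pseudocount)
    return inertia .* old .+ (1 - inertia) .* estimate
end

old_params = [0.5, 0.5]
flows = [3.0, 1.0]

plain_em = toy_update_params(old_params, flows, 0.0, 0.0)  # normalized flows only
smoothed = toy_update_params(old_params, flows, 1.0, 0.9)  # small step towards the estimate
frozen   = toy_update_params(old_params, flows, 1.0, 1.0)  # inertia 1 keeps the old values
```

Under this reading, an inertia of 0 replaces the parameters with the smoothed estimate, which matches a full-batch style update, while values close to 1 take small steps that suit noisy mini-batches.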