Learning the parameters of a BN from sample data
This tutorial shows how to learn the parameters of a BN with known structure from a set of observations. The code for this example can be found in ./Examples/Learning.py
Create the set of observations
We first import the OpenBayes inference engines and the other modules (such as copy) that we will need:
from OpenBayes import JoinTree, MCMCEngine
from copy import deepcopy
from time import time
Then we will generate our observations by sampling the Water-Sprinkler network. Please note that this is a tutorial and that in real life you don't generate your observations this way. Usually the set of observations is already available (from an experiment for example).
# create the Water-Sprinkler Bayesian network (the import below defines the network G)
from WaterSprinkler import *
N = 1000
# sample the network N times
cases = G.Sample(N) # cases = [{'c':0,'s':1,'r':0,'w':1},{...},...]
Now cases contains 1000 dictionaries, each one of them containing one sampled value for each node in the Water-Sprinkler BN. This is a case of full observability, meaning that data is available for all the variables in the network. The case of partial observability (missing values) has not yet been implemented in OpenBayes.
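As a quick sanity check, full observability can be verified on the list of dictionaries before learning. The sketch below is plain Python and does not use OpenBayes; the three hand-written cases and the helper name is_fully_observed are illustrative assumptions that stand in for the output of G.Sample(N).

```python
# Hypothetical stand-in for the output of G.Sample(N): one dict per case.
cases = [
    {'c': 0, 's': 1, 'r': 0, 'w': 1},
    {'c': 1, 's': 0, 'r': 1, 'w': 1},
    {'c': 1, 's': 0, 'r': 0, 'w': 0},
]
node_names = ['c', 's', 'r', 'w']


def is_fully_observed(cases, node_names):
    """Return True if every case holds a value for every node.

    Illustrative helper, not part of OpenBayes.
    """
    expected = set(node_names)
    for case in cases:
        # a node is missing (or an unknown key is present)
        if set(case.keys()) != expected:
            return False
        # a node is present but carries no value
        if None in case.values():
            return False
    return True


fully_observed = is_fully_observed(cases, node_names)
```

If the check fails, the data set contains missing values and falls under partial observability.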
Create a BN with no parameters
# copy the BN
G2 = deepcopy(G)
# set all parameters to 1s
G2.InitDistributions()
Now we have a copy of the structure of the original network, but all of its parameters are set to 1.
Learn these parameters!
# create an inference Engine
# choose the one you like by commenting/uncommenting the appropriate line
ie = JoinTree(G2)
#ie = MCMCEngine(G2)
# Learn the parameters from the set of cases
t = time()
ie.LearnMLParams(cases)
print 'Learned from %d cases in %1.3f secs' %(N,(time()-t))
# print the learned parameters
for v in G2.all_v:
    print v.distribution,'\n'
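Learning works by simply counting the number of occurrences of each combination of values in the BN. For example, if we have 1000 observations and the combination S=0 and C=0 comes up 200 times, then the joint probability is approximated by Pr(S=0,C=0) = 200/1000 = 0.2. To get a conditional probability, the count is divided by the number of cases that match the parent value instead of by the total: if C=0 occurs in, say, 500 of the 1000 observations, then Pr(S=0|C=0) = 200/500 = 0.4.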
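The counting idea can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions (one child node with a single parent, fully observed cases); it is not how OpenBayes implements LearnMLParams, and the function name estimate_conditional and the toy data are hypothetical.

```python
from __future__ import division  # true division on Python 2 as well
from collections import Counter


def estimate_conditional(cases, child, parent):
    """Estimate Pr(child | parent) from fully observed cases by counting.

    Returns a dict mapping (parent_value, child_value) to the estimate.
    Illustrative helper, not part of OpenBayes.
    """
    # number of cases for each (parent value, child value) combination
    joint_counts = Counter((case[parent], case[child]) for case in cases)
    # number of cases for each parent value
    parent_counts = Counter(case[parent] for case in cases)
    table = {}
    for (p_val, c_val), n in joint_counts.items():
        table[(p_val, c_val)] = n / parent_counts[p_val]
    return table


# Toy data (made up for this example): 1000 cases over 'c' and 's' only.
toy_cases = ([{'c': 0, 's': 0}] * 200 + [{'c': 0, 's': 1}] * 300 +
             [{'c': 1, 's': 0}] * 450 + [{'c': 1, 's': 1}] * 50)

table = estimate_conditional(toy_cases, child='s', parent='c')
p_s0_given_c0 = table[(0, 0)]  # 200 cases with s=0 out of 500 with c=0
```

For each parent value, the estimates over all child values sum to 1, which is what distinguishes a conditional table from a table of joint frequencies.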