Tutorial - First Bayesian Network

by Kosta Gaitanis last modified 2006-04-05 18:09
Contributors: Kosta Gaitanis

In this tutorial we show how to create a small Bayesian network. The code for this example can be found in ./Examples/WaterSprinkler.py

Create your first Bayesian Network

Suppose you want to create your first Bayesian network, for example the 'Water Sprinkler' network.

Water Sprinkler
This is an easy example because all the nodes are discrete and boolean (only 2 possible values).

1st Step : Import the OpenBayes package

An easy way to do this is by importing everything contained in the package

from OpenBayes import *

Although this is nice for quickly testing a package or a module, it is not recommended for writing Python modules. A better way is to only import the attributes you need.

from OpenBayes import BNet, BVertex, DirEdge

Note : In order for this to work, the OpenBayes directory (usually in $Python$/site-packages/OpenBayes) must be in the Python path.
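A quick way to check the note above before going further is to test whether the package can be imported and, if it cannot, to look at the directories Python searches. The following is only a minimal sketch : the helper name can_import and the directory shown in the comment are illustrative and not part of OpenBayes.

```python
import sys

def can_import(module_name):
    """Return True if `module_name` can be imported with the current sys.path."""
    try:
        __import__(module_name)
        return True
    except ImportError:
        return False

if not can_import('OpenBayes'):
    # Hypothetical location: replace it with the directory that actually
    # contains the OpenBayes folder on your machine, then uncomment.
    # sys.path.append('/path/to/site-packages')
    print('OpenBayes not found; directories searched:')
    for directory in sys.path:
        print('  ' + directory)
```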

2nd Step : Create the network topology

The network topology contains all the information needed to reconstruct the Bayesian graph : all the nodes and all the edges.


Create an empty Bayesian network

We first create an empty Bayesian network and give it a name :

G = BNet( 'Water Sprinkler Bayesian Network' )


Add nodes to the network

We then add all the nodes in this network

c = G.add_v( BVertex('c', True, 2))

The first argument of BVertex is the name of the node (e.g. 'c'). Since this name will be used for indexing later, it is better to choose a short and convenient name, although any string is accepted.
The second argument, True, signifies that this node is discrete. Equivalently, False would signify that this node is continuous (see xxx???)
The third argument gives the size of the node, i.e. its number of states. In this case, there are only 2 possible states for the clouds : present or absent.

You can continue adding the remaining nodes this way, but I prefer the following code, which creates all four nodes at once (use it instead of the line above) and is much quicker to write and less error-prone :

c, s, r, w = [G.add_v( BVertex( nm, True, 2 ) ) for nm in 'c s r w'.split()]


Connect the nodes

Once all the nodes have been added, we have to specify how they are connected to each other. We connect nodes by adding edges :

G.add_e( DirEdge( len(G.e), c, s) )

The first argument is the index of the edge. len(G.e) gives the number of edges already in the network, which is a convenient way of indexing edges by order of creation : the first one gets index 0, the second 1, etc. The second and third arguments are the nodes the edge starts from and points to.

Again I like using the following code for adding edges :

for ep in [( c, r ), ( c, s ), ( r, w ), ( s, w )]:
    G.add_e( DirEdge( len( G.e ), *ep ) )
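
The two idioms used in this loop, len( G.e ) as the next free index and *ep to unpack a pair into two separate arguments, can be tried without OpenBayes. In the sketch below a plain list named edges stands in for G.e (whatever container OpenBayes uses internally) and the function make_edge stands in for DirEdge; both names are made up for illustration.

```python
edges = []  # stand-in for G.e: the edges created so far

def make_edge(index, tail, head):
    """Stand-in for DirEdge: just record the index and the two endpoints."""
    return (index, tail, head)

for ep in [('c', 'r'), ('c', 's'), ('r', 'w'), ('s', 'w')]:
    # len(edges) is 0 for the first edge, 1 for the second, and so on.
    # *ep expands the pair, so make_edge receives tail and head separately.
    edges.append(make_edge(len(edges), *ep))

for index, tail, head in edges:
    print('%d: %s -> %s' % (index, tail, head))
```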


View the Bayesian network

Now our Bayesian network is ready. We can display it by typing :

>>> print G
BNet Water Sprinkler Bayesian Network
Vertices:
c (discrete, 2)
r (discrete, 2)
s (discrete, 2)
w (discrete, 2)

Edges:
0: c (discrete, 2) -> r (discrete, 2)
1: c (discrete, 2) -> s (discrete, 2)
2: r (discrete, 2) -> w (discrete, 2)
3: s (discrete, 2) -> w (discrete, 2)


3rd Step : Enter the parameters

Once the network topology is constructed, we must initialize the distributions :

G.InitDistributions()

This ensures that every node (continuous or discrete) has a valid distribution and that the size of each distribution corresponds to the number of states of the node and of its parents.

Note : If we later modify the structure of our network, we must execute this command again for the changes to take effect.
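
To see which sizes to expect, the shape of each conditional probability table can be worked out from the topology alone : one axis for the node itself, followed by one axis per parent. The sketch below does this in plain Python, without OpenBayes. The dictionaries sizes and parents are written by hand from the edge list printed earlier, and the axis order (node first, then parents) is an assumption based on the names_list output shown later in this tutorial.

```python
sizes = {'c': 2, 's': 2, 'r': 2, 'w': 2}   # number of states per node
parents = {'c': [], 's': ['c'], 'r': ['c'], 'w': ['s', 'r']}

def table_shape(node):
    """Axis lengths of the table Pr(node | parents): node first, then parents."""
    return tuple(sizes[name] for name in [node] + parents[node])

def table_size(node):
    """Total number of entries in the table."""
    count = 1
    for length in table_shape(node):
        count *= length
    return count

for node in ['c', 's', 'r', 'w']:
    print('%s: shape %s, %d values' % (node, table_shape(node), table_size(node)))
```

These counts match the parameter lists used in the rest of this step : 2 values for 'c', 4 each for 's' and 'r', and 8 for 'w'.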

1st way : brute force

Now we are ready to fill in the parameters for each node in our network.
Since all our nodes are discrete, we will use the MultinomialDistribution to represent them. A MultinomialDistribution is simply a matrix containing Pr(v|Pa(v)) : the probability of each state of a node v given the states of its parents Pa(v).

'Cloudy' is the easiest because it has no parents :

c.setDistributionParameters([0.5, 0.5])

The first number is Pr(c=0) (no clouds) and the second is Pr(c=1) (cloudy).

Note : this is equivalent to either of the following lines (na is the numerical array module)

c.distribution.cpt = na.array([0.5,0.5])
c.distribution[:] = na.array([0.5,0.5])

'Sprinkler' and 'Rain' each have a single parent and are both boolean. We must therefore supply 4 values, 2 for each state of their parent :

s.setDistributionParameters([0.5, 0.9, 0.5, 0.1])
r.setDistributionParameters([0.8, 0.2, 0.2, 0.8])

Since the order in which the parameters are entered is important and can get confusing, we have introduced several other ways of putting parameters into the distributions.
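
One way to keep the order straight is to rebuild the table from the flat list by hand. The sketch below assumes a row-major layout with the node's own state as the first index and the parent's state as the second. This is an assumption rather than documented behaviour, but it is the only one of the two possible layouts under which the values given above for 's' sum to 1 for each state of the parent. The helper names are made up for illustration.

```python
def to_table(flat, n_states, n_parent_states):
    """Reshape a flat list into table[state][parent_state] (row-major)."""
    return [flat[i * n_parent_states:(i + 1) * n_parent_states]
            for i in range(n_states)]

def columns_sum_to_one(table, tolerance=1e-9):
    """Check that, for every parent state, the node's probabilities sum to 1."""
    n_parent_states = len(table[0])
    return all(abs(sum(row[j] for row in table) - 1.0) < tolerance
               for j in range(n_parent_states))

s_table = to_table([0.5, 0.9, 0.5, 0.1], 2, 2)
r_table = to_table([0.8, 0.2, 0.2, 0.8], 2, 2)

# s_table[1][0] is Pr(s=1 | c=0) and s_table[1][1] is Pr(s=1 | c=1)
print('Pr(s=1 | c=0) = %s' % s_table[1][0])
print('Pr(s=1 | c=1) = %s' % s_table[1][1])
print('s normalised: %s' % columns_sum_to_one(s_table))
print('r normalised: %s' % columns_sum_to_one(r_table))
```

Under this reading, the sprinkler is on with probability 0.5 when there are no clouds and with probability 0.1 when it is cloudy.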

2nd way : using the order of variables

Each distribution has a fixed order for the variables that appear in it (the variable it represents and its parents). To find it, simply type :

>>> w.distribution.names_list
['w','s','r']

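Now we can set the parameters using the following syntax. Each line fixes the states of 's' and 'r' (second and third index) and sets the probabilities of w=0 and w=1 (first index) :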

w.distribution[:,0,0]=[0.99, 0.01]
w.distribution[:,0,1]=[0.1, 0.9]
w.distribution[:,1,0]=[0.1, 0.9]
w.distribution[:,1,1]=[0.0, 1.0]
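
With the parameters of 'w' in place, all four tables are filled in, and together they define the joint distribution Pr(c,s,r,w) = Pr(c) Pr(s|c) Pr(r|c) Pr(w|s,r). As a sanity check, the sketch below re-enters the numbers from this tutorial as plain nested lists and verifies that the joint probabilities of the 16 possible states sum to 1. It does not use OpenBayes, and the index order (node first, then parents) is taken from names_list for 'w' and assumed for 's' and 'r'.

```python
from itertools import product

p_c = [0.5, 0.5]                        # p_c[c]
p_s = [[0.5, 0.9], [0.5, 0.1]]          # p_s[s][c]
p_r = [[0.8, 0.2], [0.2, 0.8]]          # p_r[r][c]
p_w = [[[0.99, 0.1], [0.1, 0.0]],       # p_w[0][s][r], i.e. w = 0
       [[0.01, 0.9], [0.9, 1.0]]]       # p_w[1][s][r], i.e. w = 1

def joint(c, s, r, w):
    """Pr(c, s, r, w) as the product of the four conditional tables."""
    return p_c[c] * p_s[s][c] * p_r[r][c] * p_w[w][s][r]

# Enumerate every combination of the four boolean variables.
total = sum(joint(c, s, r, w) for c, s, r, w in product([0, 1], repeat=4))
print('sum over all 16 states: %.6f' % total)
```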


3rd way : using a dictionary

This is probably the most convenient way.

w.distribution[{'s':0,'r':0}]=[0.99, 0.01]
w.distribution[{'s':0,'r':1}]=[0.1, 0.9]
w.distribution[{'s':1,'r':0}]=[0.1, 0.9]
w.distribution[{'s':1,'r':1}]=[0.0, 1.0]

Dictionaries are used throughout OpenBayes in order to make it easier to use and to avoid the common errors that happen when data is entered in the wrong place. Any time a distribution or a potential has to be indexed, you can use a dictionary and avoid having to know the exact dimension in which a variable is stored in the matrix.
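
To illustrate why dictionary indexing removes the need to remember axis positions, here is a small stand-alone sketch of the idea. The class NamedTable is hypothetical and much simpler than whatever OpenBayes does internally; it only shows how a dictionary of variable names can be translated into positions using a stored list of names.

```python
from itertools import product

class NamedTable(object):
    """Toy table whose axes are addressed by variable name."""

    def __init__(self, names, sizes):
        self.names = list(names)              # axis order, e.g. ['w', 's', 'r']
        self.sizes = dict(zip(names, sizes))  # number of states per variable
        self.values = {}                      # full index tuple -> probability

    def __setitem__(self, fixed, new_values):
        # Variables not mentioned in `fixed` are free; new_values lists one
        # number per combination of free states, in axis order.
        free = [n for n in self.names if n not in fixed]
        combos = list(product(*[range(self.sizes[n]) for n in free]))
        if len(combos) != len(new_values):
            raise ValueError('expected %d values' % len(combos))
        for combo, value in zip(combos, new_values):
            state = dict(fixed)
            state.update(zip(free, combo))
            self.values[tuple(state[n] for n in self.names)] = value

    def __getitem__(self, state):
        # `state` must give a value for every variable in the table.
        return self.values[tuple(state[n] for n in self.names)]

w = NamedTable(['w', 's', 'r'], [2, 2, 2])
w[{'s': 0, 'r': 0}] = [0.99, 0.01]
w[{'r': 1, 's': 0}] = [0.1, 0.9]   # key order inside the dictionary is irrelevant
w[{'s': 1, 'r': 0}] = [0.1, 0.9]
w[{'s': 1, 'r': 1}] = [0.0, 1.0]

print(w[{'w': 1, 's': 0, 'r': 1}])
```

Because every access goes through the variable names, swapping 's' and 'r' inside a dictionary cannot silently write a value into the wrong cell, which is the kind of mistake positional indexing makes easy.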

Visualize it ...

Posted by wr at 2007-02-28 15:39

... with http://www.openbayes.org/Members/wr/bntoimg.py/view

