Distributions: an Overview


Introduction

Before MSBN3 can do inference, it must know the conditional probability distribution of every node given its parents. This distribution is specified as the Dist property of the node. If a node does not have a distribution, its Dist property will be Nothing and no inference will be possible. This overview describes how these conditional distributions are created, read, and edited. Sample code, in Visual Basic, for this overview is available in Samples\VB\Distributions.bas.

Distributions as Tables

If all nodes are discrete (as is currently the case in MSBN3), a conditional distribution between a node and its parents can be thought of as a table. For example, suppose we have this model of a cat:

[Figure Cat.bmp: the nodes Feed, Petted, and SawBird are parents of the node Happy]

in which a cat's happiness is caused by being fed, being petted, and seeing a bird. The conditional distribution of Happy given Feed, Petted, and SawBird can be thought of as a table. Suppose it has these values:

Feed   Petted   SawBird   Happy=Yes   Happy=No
Yes    Yes      Yes       0.971909    0.0280912
Yes    Yes      No        0.875726    0.124275
Yes    No       Yes       0.877786    0.122214
Yes    No       No        0.59855     0.40145
No     Yes      Yes       0.5         0.5
No     Yes      No        0.0887789   0.911221
No     No       Yes       0.248626    0.751374
No     No       No        0.0187786   0.981221

The table has one row for each possible assignment of parent nodes. It also has one column of parameter values for each state in the Happy node.

Accessing Parameter Values

The parameter values of a distribution can be accessed via the Dist object by specifying a row and column. For example, if nodeHappy is the node in the model with name "Happy", then this expression:

nodeHappy.Dist(0, "Yes")

returns the value 0.971909. Parameters can also be set this way. For example, these statements change the parameters of the last row:

nodeHappy.Dist(7, "Yes") = 0.01
nodeHappy.Dist(7, "No") = 0.99

As the examples show, rows can be specified with an integer index. Additionally, they can be specified with a parent assignment. This code creates a parent assignment and uses it to change the parameters.

Dim anAssign As MSBN3Lib.Assignment
Set anAssign = modelCat.CreateAssignment
anAssign.Add "Feed", "No"
anAssign.Add "Petted", "Yes"
anAssign.Add "SawBird", "Yes"
' anAssign is now: Feed->No,Petted->Yes,SawBird->Yes
nodeHappy.Dist(anAssign, "Yes") = 0.55
nodeHappy.Dist(anAssign, "No") = 0.45

The parent assignment must include a setting for every parent, but the parents can be in any order and other nodes can be included, too. (Other nodes will just be ignored.)

Table Dimensions

The Dist object's Count property tells the number of rows in the table.

nodeHappy.Dist.Count

The number of columns is available from these equivalent expressions:

nodeHappy.States.Count
nodeHappy.Dist.Node.States.Count

Parent Assignments for each Row

To find the parent assignment of a given row, use the Dist object's KeyObjects property. It takes an integer index and returns the Assignment collection for the row. (Like other MSBN3 collections, Dist.KeyObjects also accepts objects, but unlike other collections, it does not accept strings.) This expression retrieves the parent assignment for the row with index 2 (the third row) and then turns that Assignment collection into a string:

nodeHappy.Dist.KeyObjects(2).Description

The returned value is the string "Feed->Yes,Petted->No,SawBird->Yes". Similarly, this expression retrieves the same parent assignment and then finds the state to which the Petted node is set:

nodeHappy.Dist.KeyObjects(2)("Petted").Name

It returns "No". The documentation for the Assignment collection has details on access and enumeration of Assignment collections.
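The relationship between a row index and its parent assignment can be sketched outside MSBN3. The following Python fragment is only an illustration, not part of the MSBN3 API: it assumes rows are ordered as in the Happy table, with the last parent varying fastest, and the helper names enumerate_rows and describe are hypothetical.

```python
from itertools import product

# Parent names and state orders copied from the Happy table in this overview.
parents = {
    "Feed": ["Yes", "No"],
    "Petted": ["Yes", "No"],
    "SawBird": ["Yes", "No"],
}

def enumerate_rows(parents):
    """Return one {parent: state} dict per row, last parent varying fastest."""
    names = list(parents)
    return [dict(zip(names, combo)) for combo in product(*parents.values())]

def describe(row):
    """Format a row in the same style as the Description strings shown above."""
    return ",".join(f"{name}->{state}" for name, state in row.items())

rows = enumerate_rows(parents)
print(len(rows))          # 8 rows: 2 * 2 * 2 parent states
print(describe(rows[2]))  # Feed->Yes,Petted->No,SawBird->Yes
```

Under that ordering assumption, index 2 lands on the same assignment that KeyObjects(2) reports above, and the row count is the product of the parents' state counts.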

Enumerating the Rows

The rows of a distribution can be enumerated with the Dist object's KeyObjects collection. For example, this code fragment prints parent assignments and parameter values.

For Each anAssign In nodeHappy.Dist.KeyObjects
    Debug.Print anAssign.Description,
    For Each aState In nodeHappy.States
        Debug.Print nodeHappy.Dist(anAssign, aState),
    Next aState
    Debug.Print
Next anAssign

Output:

Feed->Yes,Petted->Yes,SawBird->Yes 0.971908986568451 0.028091199696064 
Feed->Yes,Petted->Yes,SawBird->No 0.875725984573364 0.124274998903275
Feed->Yes,Petted->No,SawBird->Yes 0.877785980701447 0.122213996946812
Feed->Yes,Petted->No,SawBird->No 0.598550021648407 0.401450008153915
Feed->No,Petted->Yes,SawBird->Yes 0.55 0.45
Feed->No,Petted->Yes,SawBird->No 8.87788981199265E-02 0.911221027374268
Feed->No,Petted->No,SawBird->Yes 0.248625993728638 0.751374006271362
Feed->No,Petted->No,SawBird->No 0.01 0.99

Default Parameters

Suppose we want to model the happiness of a second cat with this distribution:

Feed   Petted   SawBird   HappyCat2=Yes   HappyCat2=No
Yes    Yes      Yes       0.99            0.01
Yes    Yes      No        0.01            0.99
Yes    No       Yes       0.01            0.99
Yes    No       No        0.01            0.99
No     Yes      Yes       0.01            0.99
No     Yes      No        0.01            0.99
No     No       Yes       0.01            0.99
No     No       No        0.01            0.99

This cat is only likely to be happy when everything is going its way. Do we really have to specify identical parameters for 7 of the 8 rows? The answer is no. Instead, we can specify default parameters and let 7 of the 8 rows use that default.

This code fragment shows the creation of a new node with parents and states based on nodeHappy. The AddDist method is used to add a distribution to the node. The default parameter values of the distribution are then set with Default. At first these apply to all rows, but they are then overridden for row #0. We use UsingDefault to confirm that row #0 is not using the default parameters but that row #1 is.

' Create a node for a new cat's happiness.
Dim nodeHappyCat2 As MSBN3Lib.Node
Set nodeHappyCat2 = modelCat.ModelNodes.Add("HappyCat2", "HappyCat2", _
    States:=nodeHappy.States, ParentNodes:=nodeHappy.ParentNodes)
nodeHappyCat2.AddDist

' Specify the default parameters
nodeHappyCat2.Dist.Default("Yes") = 0.01
nodeHappyCat2.Dist.Default("No") = 0.99

' Override the default on row #0
nodeHappyCat2.Dist(0, "Yes") = 0.99
nodeHappyCat2.Dist(0, "No") = 0.01

' Confirm that row #0 is not using the default and that row #1 is:
Debug.Assert Not nodeHappyCat2.Dist.UsingDefault(0)
Debug.Assert nodeHappyCat2.Dist.UsingDefault(1)

The UsingDefault property can also be set to True or False to force a row to use the defaults or not.
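The idea of a table whose rows fall back to shared defaults can be sketched independently of MSBN3. The Python class below is a hypothetical stand-in, not the MSBN3 implementation: the names SparseTable, set, get, and using_default are invented for illustration, and it assumes that writing any parameter of a row makes that row stop using the default.

```python
class SparseTable:
    """Hypothetical stand-in for a table whose rows fall back to a default."""

    def __init__(self, row_count, default):
        self.row_count = row_count
        self.default = dict(default)   # state name -> default parameter
        self.overrides = {}            # row index -> {state name: parameter}

    def set(self, row, state, value):
        # Assumption: writing to a row stops that row from using the default.
        self.overrides.setdefault(row, dict(self.default))[state] = value

    def get(self, row, state):
        # Rows without an override read straight from the default.
        return self.overrides.get(row, self.default)[state]

    def using_default(self, row):
        return row not in self.overrides

cat2 = SparseTable(8, {"Yes": 0.01, "No": 0.99})
cat2.set(0, "Yes", 0.99)
cat2.set(0, "No", 0.01)
print(cat2.using_default(0), cat2.using_default(1))  # False True
print(cat2.get(5, "No"))                             # 0.99
```

Only the overridden row is stored explicitly, which is why 7 of the 8 rows in the HappyCat2 table need no parameters of their own.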

Creating Causally Independent Distributions

Suppose we wish to model the happiness of a third cat. For this cat we want to assume that the causes of its happiness are independent. Here is what our table looks like:

Feed   Petted   SawBird   HappyCat3=Yes   HappyCat3=No
Yes    Yes      Yes       0.99            0.01
No     Yes      Yes       0.50            0.50
Yes    No       Yes       0.40            0.60
Yes    Yes      No        0.05            0.95

Notice that it has fewer rows than before. It has one row for the "all normal" case (this is called the leak case) and one row for each abnormal state of any parent.

The distribution is called "causally independent" or CI. MSBN3 requires that all parents of a CI distribution have their Normal state before their Abnormal states. For example, the Feed node's states must be ordered "Yes", "No", not "No", "Yes".

This code fragment shows the creation of a node with a causally independent distribution, setting its parameters, and then printing the result.

' Create a node for a new cat's happiness.
Dim nodeHappyCat3 As MSBN3Lib.Node
Set nodeHappyCat3 = modelCat.ModelNodes.Add("HappyCat3", "HappyCat3", _
States:=nodeHappy.States, ParentNodes:=nodeHappy.ParentNodes)
nodeHappyCat3.AddDist deCondCI

' Specify the parameters for the four rows
nodeHappyCat3.Dist(0, "Yes") = 0.99
nodeHappyCat3.Dist(0, "No") = 0.01
nodeHappyCat3.Dist(1, "Yes") = 0.5
nodeHappyCat3.Dist(1, "No") = 0.5
nodeHappyCat3.Dist(2, "Yes") = 0.4
nodeHappyCat3.Dist(2, "No") = 0.6
nodeHappyCat3.Dist(3, "Yes") = 0.05
nodeHappyCat3.Dist(3, "No") = 0.95

Debug.Print
Debug.Print "Happy Cat 3"
For Each anAssign In nodeHappyCat3.Dist.KeyObjects
    Debug.Print anAssign.Description,
    For Each aState In nodeHappyCat3.States
        Debug.Print nodeHappyCat3.Dist(anAssign, aState),
    Next aState
    Debug.Print
Next anAssign

Output:

Happy Cat 3
Feed->Yes,Petted->Yes,SawBird->Yes 0.99 0.01
Feed->No,Petted->Yes,SawBird->Yes 0.5 0.5
Feed->Yes,Petted->No,SawBird->Yes 0.4 0.6
Feed->Yes,Petted->Yes,SawBird->No 0.05 0.95

The new node has a CI distribution because deCondCI was specified when its distribution was created. The type of a distribution can also be determined (and changed) using the Dist object's Type property. The default distribution type is deCondSparse, the type used in the earlier examples.

From Parameters to Probabilities

Suppose we want to know the probability that Cat 3 will be happy when all the parent nodes are set to "No". The table contains no row for this condition. We could use general inference (see Inference: an Overview), but there is a simpler way. We can use the Dist object's Prob property. For example, this code fragment:

Set anAssign = modelCat.CreateAssignment
anAssign!Feed = "No"
anAssign!Petted = "No"
anAssign!SawBird = "No"
Debug.Print nodeHappyCat3.Dist.Prob(anAssign, "Yes")

will output 0.0099.

Indeed, even when a parent assignment is in the table, the Prob property should be used. For example:

Set anAssign = modelCat.CreateAssignment
anAssign!Feed = "No"
anAssign!Petted = "Yes"
anAssign!SawBird = "Yes"
Debug.Print nodeHappyCat3.Dist.Prob(anAssign, "Yes")

has output 0.495, which is less than the 0.5 parameter in the table. Why do the probability and the parameter differ? Both account for the cat being unhappy because it is not fed. Only the probability, however, accounts for the cat possibly being unhappy even when all is well. (This possibility is represented by row #0, the leak case.)
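Both probabilities quoted in this section are consistent with multiplying the leak row's "Yes" parameter by the "Yes" parameter of every parent that is in its abnormal state. The Python sketch below reproduces that arithmetic. It is not the MSBN3 algorithm: the product rule is inferred only from the two numbers reported here, it covers only this two-state example, and the function name prob_yes is hypothetical.

```python
# "Yes" parameters taken from the HappyCat3 output listing above.
leak_yes = 0.99
abnormal_yes = {"Feed": 0.50, "Petted": 0.40, "SawBird": 0.05}
normal_state = {"Feed": "Yes", "Petted": "Yes", "SawBird": "Yes"}

def prob_yes(assignment):
    """Multiply the leak parameter by the parameter of each abnormal parent."""
    p = leak_yes
    for parent, state in assignment.items():
        if state != normal_state[parent]:
            p *= abnormal_yes[parent]
    return p

all_no = {"Feed": "No", "Petted": "No", "SawBird": "No"}
only_unfed = {"Feed": "No", "Petted": "Yes", "SawBird": "Yes"}
print(round(prob_yes(all_no), 4))      # 0.0099
print(round(prob_yes(only_unfed), 4))  # 0.495
```

The second result shows where the gap between 0.5 and 0.495 comes from: the 0.5 parameter is scaled by the 0.99 leak parameter.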

For Details See:

Sample Code