HUGIN API
REFERENCE MANUAL
Version 8.3

This manual was prepared using the LaTeX Document Preparation System
and the pdfTeX typesetting software.
Set in 11 point Bitstream Charter, Bitstream Courier, and AMS Euler.
Version 8.3, February 2016.
Preface
The HUGIN API 8.3 Reference Manual provides a reference for the C language Application Program Interface to the HUGIN system. However, brief
descriptions of the Java and C++ versions are also provided (see Chapter 1).
The present manual assumes familiarity with the methodology of Bayesian
belief networks and (limited memory) influence diagrams (LIMIDs) as well
as knowledge of the C programming language and programming concepts.
As introductions to Bayesian belief networks and influence diagrams, the
books by Jensen and Nielsen [14] and Kjærulff and Madsen [18] are recommended. Deeper treatments of the subjects are given in the books by Cowell
et al. [8], Darwiche [9], Koller and Friedman [19], and Pearl [31].
Chapter 5 explains how to access and modify the contents of tables.
Chapter 6 describes how the contents of a conditional probability, a policy,
or a utility table can be generated from a mathematical description of the
relationship between a node and its parents.
Chapter 7 explains how to transform a domain into a secondary structure
(a junction forest), suitable for inference. This transformation is known
as compilation. It also explains how to improve performance of inference
by controlling the triangulation step and by performing approximation and
compression.
Chapter 8 explains how to access the collection of junction trees of a compiled domain and how to traverse a junction tree.
Chapter 9 shows how to handle the beliefs and the evidence that form the
core of the reasoning process in the HUGIN inference engine. This chapter
explains how to enter and retract evidence, how to determine independence
properties induced by evidence and network structure, how to retrieve beliefs and expected utilities, how to compute values of function nodes, how
to examine evidence, and how to save evidence as a case file for later use.
Chapter 10 documents the functions used to control the inference engine
itself. The chapter also explains how to perform conflict analysis, simulation,
value of information analysis, sensitivity analysis, and how to find the most
probable configurations of a set of nodes.
Chapter 11 explains how to adapt conditional probability distributions to
new evidence, and Chapter 12 describes how the network structure and the
conditional probability distributions can be extracted (learned) from data
(a set of cases).
Chapter 13 describes the NET language, a language used to specify the
nodes and the structure of a network as well as the numerical data required
to form a complete specification.
Chapter 14 describes the data set, a tool that aids in the loading of data provided as so-called CSV files (short for comma-separated-values files). After
the data has been loaded (as pure text), it can be modified (if necessary) and
used as case data for the learning algorithms.
Chapter 15 describes how to enter and modify information that is purely
descriptive. This information is not used by other parts of the HUGIN API.
It is used by the HUGIN GUI application to generate a graphical display of a
network.
Appendix A gives an example of a network using CG variables. Appendix B
provides a history of news and changes for all releases of the HUGIN API
since version 2.
Finally, an index is provided. The index contains the names of all functions,
types, and constants of enumeration types, defined in this manual.
Acknowledgements
Lars P. Fischer wrote the HUGIN API 1.1 manual, and Per Abrahamsen wrote
the HUGIN API 1.2 (Extensions) manual. The present document is partly
based on these manuals.
I would also like to thank Anders L. Madsen, Martin Karlsen, Uffe B. Kjærulff,
Marianne Bangsø, Søren L. Dittmer, Michael Lang, Lars Nielsen, Lars Bo
Nielsen, and Kristian G. Olesen for providing constructive comments and
other contributions that have improved this manual as well as the API itself.
In particular, Anders L. Madsen wrote a large part of Chapter 11.
Any errors and omissions remaining in this manual are, however, my responsibility.
The development of the functionality concerning the (real-valued) function
node type (introduced in HUGIN 7.3) has been sponsored by the Danish mortgage credit institution Nykredit Realkredit (www.nykredit.dk).
The development of the functionality concerning the discrete function node
type as well as the aggregate and probability operators (introduced in
HUGIN 7.7) has been sponsored by the research project Operational risk
in banking and finance. The project is dedicated to strengthening management of operational risk in the banking and finance sector, including the
development of Basel II compliant operational risk measurement and management
tools in accordance with the Advanced Measurement Approach (AMA). The
project is financed by the University of Stavanger and a consortium of Norwegian banks consisting of Sparebank 1 SR-Bank, Sparebank 1 SNN, Sparebank 1 SMN, Sparebanken Hedmark, and Sparebank 1 Oslo and Akershus.
Frank Jensen
Hugin Expert A/S
February, 2016
Contents

Preface
1 General Information
1.1 Introduction
1.4 Naming conventions
1.5 Types
1.6 Errors
2.1 Types
2.1.1 Node category
2.1.2 Node kind
2.9 User data
2.9.1 Arbitrary user data
2.9.2 User-defined attributes
2.10 HUGIN Knowledge Base files
6 Generating Tables
6.1 Subtyping of discrete nodes
6.2 Expressions
6.3 Syntax of expressions
6.5 State labels
6.6 State values
6.7 Statistical distributions
6.7.1 Continuous distributions
6.7.2 Discrete distributions
6.8 Generating tables
7 Compilation
7.2 Compilation
7.3 Triangulation
7.5 Uncompilation
7.6 Compression
7.7 Approximation
8.1 Types
8.3 Cliques
9.1 Evidence
9.1.1 Discrete evidence
9.1.2 Continuous evidence
9.1.3 Evidence in LIMIDs
10 Inference
Bibliography
Index
Chapter 1
General Information
This chapter explains how to use the HUGIN API within your own applications. It also gives some general information on the functions and data types
defined by the HUGIN API and explains the mechanisms for error handling.
Finally, instructions on how to take advantage of multi-processor systems to
speed up inference are given.
1.1
Introduction
The HUGIN API contains a high performance inference engine that can be
used as the core of knowledge based systems built using Bayesian belief networks or limited memory influence diagrams (LIMIDs) [24]. A knowledge
engineer can build knowledge bases that model the application domain, using probabilistic descriptions of causal relationships in the domain. Given
this description, the HUGIN inference engine can perform fast and accurate
reasoning.
The HUGIN API is provided in the form of a library that can be linked into
applications written using the C, C++, or Java programming languages. The
C version provides a traditional function-oriented interface, while the C++
and Java versions provide an object-oriented interface. The present manual
describes the C interface. The C++ and Java interfaces are described in
online documentation supplied with the respective libraries.
On Windows platforms only, C# and Visual Basic language interfaces are
also available.
Additionally, a web service version of the HUGIN API is provided. This
allows the decision engine to be exercised from any programming language
using the well-known HTTP protocol.
The HUGIN API is used just like any other library. It does not require any
special programming techniques or program structures. The HUGIN API
does not control your application. Rather, your application controls the
HUGIN API by telling it which operations to perform. The HUGIN inference
engine sits passive until you engage it.
Applications built using the HUGIN API can make use of any other library
packages such as database servers, GUI toolkits, etc. The HUGIN API itself
only depends on (in addition to the Standard C library) the presence of the
Zlib library (www.zlib.net), which is preinstalled in the Solaris, Linux,
and Mac OS X operating environments.
1.2
The first step in using the C version of the HUGIN API is to include the
definitions for the HUGIN functions and data types in the program. This is
done by inserting the following line at the top of the program source code:
# include "hugin.h"
The hugin.h header file contains all the definitions for the API.
When compiling the program, you must inform the C compiler where the
header file is stored. Assuming the HUGIN system has been installed in the
directory /usr/local/hugin, the following command is used:
cc -I/usr/local/hugin/include -c myapp.c
This will compile the source code file myapp.c and store the result in the
object code file myapp.o, without linking. The -I option adds the directory
/usr/local/hugin/include to the search path for include files.
If you have installed the HUGIN system somewhere else, the path above
must be modified as appropriate. If the environment variable HUGINHOME
has been defined to point to the location of the HUGIN installation, the
following command can be used:
cc -I$HUGINHOME/include -c myapp.c
Using the environment variable, HUGINHOME, has the advantage that if the
HUGIN system is moved, only the environment variable must be changed.
When the source code, possibly stored in several files, has been compiled,
the object files must be linked to create an executable file. At this point, it
is necessary to specify that the object files should be linked with the HUGIN
library:
cc myapp.o other.o -L$HUGINHOME/lib -lhugin -lm -lz
The -L$HUGINHOME/lib option specifies the directory to search for the
HUGIN libraries, while the -lhugin option specifies the library to link
with. The -lz option directs the compiler/linker to link with the Zlib library (www.zlib.net). This option is needed if either of the h domain
save as kb(49) or h kb load domain(49) functions is used.
If the source code for your application is a single file, you can simplify the
above to:
cc -I$HUGINHOME/include myapp.c
-L$HUGINHOME/lib -lhugin -lm -lz -o myapp
compiling the source code file myapp.c and storing the final application in
the executable file myapp. (Note that the above command should be typed
as a single line.)
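For concreteness, a minimal myapp.c that could be built with the command above might look as follows. This is only a sketch: it merely creates and deletes an empty domain, and it uses the error-reporting functions described in Section 1.6.

# include "hugin.h"
# include <stdio.h>
# include <stdlib.h>

int main (void)
{
    h_domain_t d = h_new_domain ();    /* create an empty domain */

    if (d == NULL)
    {
        fprintf (stderr, "h_new_domain failed: %s\n",
                 h_error_description (h_error_code ()));
        return EXIT_FAILURE;
    }

    /* ... build the network, compile, propagate, etc. ... */

    h_domain_delete (d);               /* release the domain again */
    return 0;
}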
Following the above instructions will result in an executable using the single-precision version of the HUGIN API library. If, instead, you want to use
the double-precision version of the HUGIN API library, you must define H
DOUBLE when you invoke the compiler, and specify -lhugin2 for the linking step:
cc -DH_DOUBLE -I$HUGINHOME/include myapp.c
-L$HUGINHOME/lib -lhugin2 -lm -lz -o myapp
(Again, all this should be typed on one line.)
The above might look daunting, but it would typically be done in a Makefile
so that you will only have to do it once for each project.
The instructions above also assume that you are using the 32-bit version of
the HUGIN API. If, instead, you are using the 64-bit version, you will need
to specify $HUGINHOME/lib64 as the directory to search for libraries.
The hugin.h header file has been designed to work with both ISO C compliant compilers and C++ compilers. For C++ compilers, the hugin.h header
file depends on the symbol __cplusplus being defined (this symbol should
be automatically defined by the compiler).
Some API functions take pointers to stdio FILE objects as arguments. Because of this, inclusion of hugin.h also implies inclusion of <stdio.h>.
Moreover, in order to provide suitable type definitions, the standard C header
<stddef.h> is also included.
The Java and C++ versions use classes for modeling domains, nodes, etc.
Each of the classes has a set of methods enabling you to manipulate objects
of the class. These methods will throw exceptions when errors occur. The
exception classes are all subclasses of the main HUGIN exception class (ExceptionHugin). In Java, this is an extension of the standard Java Exception
class.
The classes, methods, and exceptions are all specified in the online documentation distributed together with these interfaces.
C++ To use the C++ HUGIN API definitions in your code, you must include
the hugin header file (note that there is no suffix):
# include "hugin"
All entities defined by the C++ API are defined within the HAPI namespace.
To access these entities, either use the HAPI:: prefix or place the following
declaration before the first use of C++ API entities (but after the hugin
header file has been included):
using namespace HAPI;
Like the C API, the C++ API is available in two versions: a single-precision
version and a double-precision version. To use the single-precision version,
use a command like the following for compiling and linking:
g++ -I$HUGINHOME/include myapp.c
-L$HUGINHOME/lib -lhugincpp -lm -lz -o myapp
(This should be typed on one line.) To use the double-precision version,
define the H DOUBLE preprocessor symbol and specify -lhugincpp2 for
the linking step:
g++ -DH_DOUBLE -I$HUGINHOME/include myapp.c
-L$HUGINHOME/lib -lhugincpp2 -lm -lz -o myapp
(Again, this should be typed on one line.)
Also, the C++ API is available as both a 32-bit and a 64-bit version. To use
the 64-bit version, specify $HUGINHOME/lib64 as the directory to search
for libraries.
Java The Java version of the HUGIN API library is provided as two files
(the files comprising the 64-bit version have an additional -64 suffix):
hapi83.jar (hapi83-64.jar) contains the Java interface to the
underlying C library. This file must be included in the CLASSPATH
environment variable.
1.3
C, C++, Java, C# (.NET), and Visual Basic language interfaces for the HUGIN
API are available on the Windows platforms.
Five sets of library files are provided with developer versions of HUGIN
for Windows: one for Microsoft Visual Studio 6.0, one for Microsoft Visual
Studio .NET 2003, one for Microsoft Visual Studio 2005, one for Microsoft
Visual Studio 2008, and one for Microsoft Visual Studio 2010. (For 64-bit
packages, only three sets of library files are provided: for Microsoft Visual
Studio 2005, 2008, and 2010. If you need libraries for other development
environments, please contact info@hugin.com.) Each set of library files
contains libraries for all combinations of the following: the C and C++
programming languages; Debug and Release configurations; and single-precision
and double-precision.
Each of these libraries has two parts: An import library and a DLL. For example, the import library for the 32-bit, double-precision, C++ version of
the HUGIN API compiled for Microsoft Visual Studio 2008, Debug configuration, is named hugincpp2-8.3-vc9d.lib, and the corresponding DLL
file is named hugincpp2-8.3-vc9d.dll. For 64-bit versions, there is an
additional suffix: hugincpp2-8.3-vc9d-x64 plus the usual .lib and
.dll extensions.
In general, the library files have unique names, indicating language (C or
C++), API version number, compiler, configuration, and 32-bit/64-bit. This
naming scheme makes it possible for all DLLs to be in the search path simultaneously.
(4) Click Settings on the Project menu. Click the C/C++ tab, and
select C++ Language in the Category box.
Make sure Enable exception handling is selected.
Make sure Enable Run-Time Type Information (RTTI) is selected.
The above steps set up Microsoft Visual Studio 6.0 to use the single-precision
version of the HUGIN API. If you want to use the double-precision version,
modify the instructions as follows:
(1) Click Settings on the Project menu. Click the C/C++ tab, and
select Preprocessor in the Category box.
(b) Add H_DOUBLE to the Preprocessor definitions box.
(2) Click Settings on the Project menu. Click the Link tab, and select
Input in the Category box.
(b) Add the import library to the Object/library modules list:
If ⟨Configuration⟩ is Debug, add hugincpp2-8.3-vc6d.lib.
If ⟨Configuration⟩ is Release, add hugincpp2-8.3-vc6.lib.
When running the compiled program, the DLL corresponding to the import
library used in the compilation must be located in a directory mentioned in
the search path.
Microsoft Visual Studio .NET 2003
Let ⟨Path⟩ denote the main directory of the Hugin installation (for example:
C:\Program Files\Hugin Expert\Hugin Developer 8.3), and let
⟨Configuration⟩ denote the active configuration (either Debug or Release).
(1a) Click Properties on the Project menu. Click the C/C++ folder,
and select the General property page.
Add ⟨Path⟩\HDE8.3CPP\Include to the Additional Include Directories property.
(2a) Click Properties on the Project menu. Click the Linker folder, and
select the General property page.
Add ⟨Path⟩\HDE8.3CPP\Lib\VC7\⟨Configuration⟩ to the Additional Library Directories property.
(2b) Click Properties on the Project menu. Click the Linker folder, and
select the Input property page.
#if X64
using size_t = System.UInt64;
using h_index_t = System.Int64;
#else
using size_t = System.UInt32;
using h_index_t = System.Int32;
#endif
#if H_DOUBLE
using h_number_t = System.Double;
#else
using h_number_t = System.Single;
#endif
The symbol H_DOUBLE must be defined (only) when using a double-precision version of the HUGIN .NET API, and the symbol X64 must be defined
(only) when using a 64-bit version of the HUGIN .NET API. (See more
examples in the documentation accompanying the HUGIN .NET API.)
The Microsoft .NET Framework (version 2.0 or version 4.0) is required by
the HUGIN .NET API: http://msdn.microsoft.com/netframework/
Microsoft Visual Studio 2005, 2008, and 2010
The following steps set up a Microsoft Visual Studio (2005, 2008, and 2010)
C# Project to use the .NET version of the HUGIN API:
(1) Click Add Reference on the Project menu. Click the Browse tab,
browse to the location of the HUGIN .NET API DLL files and select a
file corresponding to the desired version.
(2) Click Properties on the Project menu. Click the Build tab, and
configure Platform target to either x86 or x64 (according to the
HUGIN .NET API version used).
(3) If using the type aliasing scheme described above, define the appropriate symbols in the field Conditional compilation symbols under the
General category on the Build tab.
When running the compiled program, the DLL referenced in the project
must be located in a directory mentioned in the search path.
1.4
Naming conventions
The names of the Java and C++ classes are derived from the names of the corresponding C types: the h prefix and the t suffix are removed, letters immediately following underscores are uppercased, and all remaining underscore characters are removed. So, for example, the following classes are defined in the Java and C++ APIs: Clique, Expression,
JunctionTree, Model, Node, and Table.
There are some differences between C and object-oriented languages such
as Java and C++ that made it natural to add some extra classes. These
include different Node subclasses (DiscreteChanceNode, DiscreteDecisionNode, BooleanDCNode, LabelledDDNode, etc.) and many Expression subclasses (AddExpression, ConstantExpression, BinomialDistribution, BetaDistribution, etc.). Each group forms its own class hierarchy below the corresponding superclass. Some of the most specialized Node classes use abbreviations in their names (to avoid overly long class names): e.g., BooleanDCNode
is a subclass of DiscreteChanceNode, which again is a subclass of Node. Here,
BooleanDCNode is abbreviated from BooleanDiscreteChanceNode.
The methods defined on the Java/C++ HUGIN API classes all correspond
to similar C API functions. For example, the setName method of the Node
class corresponds to h node set name(42) . The rule is: the h prefix is removed, letters immediately following all (other) underscore characters are
uppercased, and, finally, the underscore characters themselves are removed.
There are some exceptions where functions correspond to class constructors:
e.g., the h domain new node(30) function in the C version corresponds to a
number of different Node subclass constructors in the Java/C++ versions.
1.5
Types
Scalar types
Probabilistic reasoning is about numbers, so the HUGIN API will of course
need to handle numbers. The beliefs and utilities used in the inference engine are of type h number t, which is defined as a single-precision floating-point value in the standard version of the HUGIN library. The HUGIN API
also defines another floating-point type, h double t, which is defined as a
double-precision floating-point type in the standard version of the HUGIN
API. This type is used to represent quantities that are particularly sensitive to
range (e.g., the joint probability of evidence; see h domain get normalization constant(140)) and precision (e.g., the summation operations performed
as part of a marginalization operation are done with double precision).
The reason for introducing the h number t and h double t types is to make
it easier to use higher precision versions of the HUGIN API with just a simple
recompilation of the application program with some extra flags defined.
The HUGIN API uses a number of enumeration types. Some examples: The
type h triangulation method t defines the possible triangulation methods
used during compilation; the type h error t defines the various error codes
returned when errors occur during execution of API functions. Both of these
types will have new values added as extra features are added to the HUGIN
API in the future.
Many functions return integer values. However, these integer values have
different meanings for different functions.
Functions with no natural return value simply return a status result that
indicates if the function failed or succeeded. If the value is zero, the function
succeeded; if the value is nonzero, the function failed and the value will
be the error code of the error. Such functions can be easily recognized by
having the return type h status t.
Some functions have the return type h boolean t. Such functions have truth
values (i.e., true and false) as their natural return values. These functions
will return a positive integer for true, zero for false, and a negative integer
if an error occurs. The nature of the error can be revealed by using the h
error code(18) function and friends.
The HUGIN API also defines a number of other types for general use: The
type h string t is used for character strings (this type is used for node
names, file names, labels, etc.). The type h count t is an integral type used
to represent counts and sizes (e.g., the number of states of a node).
1.6
Errors
Several types of errors can occur when using a function from the HUGIN
API. These errors can be the result of errors in the application program, of
running out of memory, of corrupted data files, etc.
As a general principle, the HUGIN API will try to recover from any error as
well as possible. The API will inform the application program of the problem
and take no further action. It is then up to the application program to choose
an appropriate action.
This way of error handling is chosen to give the application programmer the
highest possible degree of freedom in dealing with errors. The HUGIN API
will never make a choice of error handling, leaving it up to the application
programmer to create as elaborate an error recovery scheme as needed.
When a HUGIN API function fails, the data structures will always be left in a
consistent state. Moreover, unless otherwise stated explicitly for a particular
function, this state can be assumed identical to the state before the failed
API call.
To communicate errors to the user of the HUGIN API, the API defines the
enumeration type h error t. This type contains constants to identify the
various types of errors. All constants for values of the h error t type have
the prefix h error .
All functions in the HUGIN API (except those described in this section) set
an error indicator. This error indicator can be inspected using the h error
code function.
All functions with no natural return value (i.e., the return type is void) have
been modified to return a value. These functions have return type h status
t (which is an alias for an integral type). A zero result from such a function
indicates success while a nonzero result indicates failure. Other functions
use an otherwise impossible value to indicate errors. For example, consider
the h node get belief(125) function, which returns the belief for a state of a
(chance) variable. This is a nonnegative number (and less than or equal
to one since it is a probability). This function returns a negative number
if an error occurred. Such a convention is not possible for the h node get
expected utility(127) function since any real number is a valid utility; in this
case, the h error code function must be used.
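As a sketch of both conventions (assuming a domain d containing a node named "A"; the state index 0 is arbitrary, and the error check for a NULL node is omitted for brevity):

h_node_t n = h_domain_get_node_by_name (d, "A");
h_number_t belief = h_node_get_belief (n, 0);

if (belief < 0)    /* a negative belief signals an error */
    fprintf (stderr, "h_node_get_belief failed: %s\n",
             h_error_description (h_error_code ()));

if (h_node_set_name (n, "B") != 0)    /* a nonzero h_status_t signals an error */
    fprintf (stderr, "h_node_set_name failed: %s\n",
             h_error_description (h_error_code ()));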
Also, most functions that return a pointer value use NULL to indicate errors.
The only exception is the group of functions that handle arbitrary user
data (see Section 2.9.1) since NULL can be a valid datum.
It is important that the application always checks for errors. Even the most
innocent-looking function might generate an error.
Note that, if an API function returns a value indicating that an error occurred, the inference engine may be in a state where normal progress of
the application is impossible. This is the case if, say, a domain could not be
loaded. For the sanity of the application it is therefore good programming
practice to always examine return values and check for errors, just like when
using ordinary Standard C library calls.
1.6.1
Handling errors
h_domain_t d;
...
if ((d = h_kb_load_domain (file_name, password)) == NULL)
{
fprintf (stderr, "h_kb_load_domain failed: %s\n",
h_error_description (h_error_code ()));
exit (EXIT_FAILURE);
}
If the domain could not be loaded, an error message is printed and the program terminates. Many things could cause the load operation to fail: the file is nonexistent
or unreadable, the HUGIN KB file was generated by an incompatible version of the
API, the HUGIN KB file was corrupted, insufficient memory, etc.
1.6.2
General errors
Here is a list of some error codes that most functions might generate.
h error usage This error code is returned when a trivial violation of the
interface for an API function has been detected. Examples of this error:
NULL pointers are usually not allowed as arguments (if they are, it will
be stated so explicitly); asking for the belief in a non-existing state of
a node; etc.
h error no memory The API function failed because there was insufficient
(virtual) memory available to perform the operation.
h error io This error code is returned by functions that involve I/O (i.e.,
reading from and writing to files on disk). Possible causes: problems
with permissions, files that do not exist, a full disk, etc.
1.7
In order to achieve faster inference through parallel execution on multiprocessor systems, many of the most time-consuming table operations have
been made threaded. Note, however, that in the current implementation
table operations for compressed domains (see Section 7.6) are not threaded.
The creation of threads (or tasks) is controlled by two parameters: the desired level of concurrency and the grain size. The first of these parameters
specifies the maximum number of threads to create when performing a specific table operation, and the second parameter specifies a lower limit on
the size of the tasks to be performed by the threads. The size of a task is
approximately equal to the number of floating-point operations needed to
perform the task (e.g., the number of elements to sum when performing a
marginalization task).
h status t h domain set concurrency level
(h domain t domain, size t level)
This function sets the level of concurrency associated with domain to level
(this must be a positive number). Setting the concurrency level parameter to 1 will cause all table operations (involving tables originating from
domain) to be performed sequentially. The initial value of this parameter
is 1.
Note that the concurrency level and the grain size parameters are specific to
each domain. Hence, the parameters must be explicitly set for all domains
for which parallel execution is desired.
h count t h domain get concurrency level (h domain t domain)
This function returns the current concurrency level associated with domain.
h status t h domain set grain size (h domain t domain, size t size)
This function sets the grain size parameter associated with domain to size
(this value must be positive). The initial value of this parameter is 10 000.
1.7.1
h_domain_t d;
...
h_domain_set_concurrency_level (d, 4);
pthread_setconcurrency (4);
...
/* do compilations, propagations, and other stuff
that involves inference */
We could use thr setconcurrency instead of pthread setconcurrency (in that case we
would include <thread.h> instead of <pthread.h>).
1.7.2
1.8
The HUGIN API can be used safely in a multithreaded application. The major obstacle to thread-safety is shared data, for example, global variables.
The only global variable in the HUGIN API is the error code variable. When
the HUGIN API is used in a multithreaded application, an error code variable is maintained for each thread. This variable is allocated the first time
it is accessed. It is recommended that the first HUGIN API function (if any)
being called in a specific thread be the h error code(18) function. If this function returns zero, it is safe to proceed (i.e., the error code variable has been
successfully allocated). If h error code returns nonzero, the thread must not
call any other HUGIN API function, since the HUGIN API functions critically
depend on being able to read and write the error code variable. (Failure to
allocate the error code variable is very unlikely, though.)
Example 1.4 This code shows the creation of a thread, where the function executed by the thread calls h error code(18) as the first HUGIN API function. If this
call returns zero, it is safe to proceed.
This example uses POSIX threads.
# include "hugin.h"
# include <pthread.h>
pthread_t thread;
void *data; /* pointer to data used by the thread */
void *thread_function (void *data)
{
if (h_error_code () != 0)
return NULL; /* it is not safe to proceed */
/* now the Hugin API is ready for use */
...
}
...
pthread_create (&thread, NULL, thread_function, data);
Note that the check for h error code(18) returning zero should also be performed
for the main (only) thread in a multithreaded (singlethreaded) application, when
using a thread-safe version of the HUGIN API (all APIs provided by Hugin Expert
A/S are thread-safe as of version 6.1).
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
...
/* In Thread A: */
if (pthread_mutex_lock (&mutex) != 0)
    /* handle error */ ...;
else
{
    /* use domain d */
    ...
    pthread_mutex_unlock (&mutex);
}
...
/* In Thread B: */
if (pthread_mutex_lock (&mutex) != 0)
/* handle error */ ...;
else
{
/* use domain d */
...
pthread_mutex_unlock (&mutex);
}
Since domain d is being used by more than one thread, it is important that while
one thread is modifying the data structures belonging to d, other threads do not
attempt to read or write the same data structures. This is achieved by requiring all
threads to lock the mutex variable while they access the data structures of d. The
thread library ensures that only one thread at a time can lock the mutex variable.
Many HUGIN API functions that operate on nodes also modify the state
of the domain or class to which the nodes belong. For example, entering
evidence to a node clearly modifies the state of the node, but it also modifies
book-keeping information relating to evidence within the domain to which
the node belongs.
On the other hand, many HUGIN API functions only read attributes of a
class, domain, or node. Such functions can be used simultaneously from
different threads on the same or related objects, as long as it has been ensured that no thread is trying to modify the objects concurrently with the
read operations. Examples of functions that only read attributes are: h
node get category(30) , h domain get attribute(46) , h node get belief (125) , etc.
In general, all functions with "get" or "is" as part of their names do not
modify data, unless their descriptions explicitly state that they do. Examples
of the latter category are:
h node get name(42) and h class get name(53) will assign names to the
node or class, if no name has previously been assigned. (If the node
or class is known to be named, then these functions will not modify
data.)
h node get table(39) , h node get experience table(160) , and h node get
fading table(161) will create a table if one doesn't already exist.
h domain get marginal(126) and h node get distribution(126) must, in
the general case, perform a propagation (which needs to modify the
junction tree).
All HUGIN API functions returning a list of nodes may have to allocate
and store the list.
Chapter 2
2.1
Types
Nodes and domains are the fundamental objects used in the construction of
belief network and LIMID models in HUGIN. The HUGIN API introduces the
opaque pointer types h node t and h domain t to represent these objects.
2.1.1
Node category
In all of these network models, so-called function nodes [26] can be created:
A function node represents either a single real value (a real-valued function
node) or a discrete marginal distribution (a discrete function node). In both
cases, this entity is a function of (the values or distributions of) the parents.
Real-valued function nodes are not (directly) involved in the inference process: evidence cannot be specified for such nodes, but the functions associated with the nodes can be evaluated using the results of inference or
simulation as input. In this case, evaluation must take place after inference
or simulation.
However, real-valued function nodes can also be used to provide input to
the table generation process: The expressions in the model of a node may
refer to the values of (real-valued) function nodes in the parent set of the
node. Since table generation takes place before inference, the values of
these parents must be available before inference.
A discrete function node behaves like a chance node, except that the parents
of the node are not affected by inference with evidence specified for the
node. Instead, the values or distributions of the parents are used to generate
a marginal distribution (which becomes the node table) for the node before
inference is performed.
These properties allow the tables of some nodes to depend on the beliefs
(computed by inference) of other nodes (see Section 10.2). In order to
make this work, the network must satisfy some constraints (see Section 2.4).
In order to distinguish between the different types of nodes, the HUGIN API
associates with each node a category, represented as a value of the enumeration type h node category t. The constants of this enumeration type are:
h category chance (for nodes representing random variables),
h category decision (for nodes representing decisions),
h category utility (for nodes representing utility functions),
h category function (for function nodes), and
h category instance (for nodes representing class instances in object-oriented models).
In addition, the special constant h category error is used for handling errors.
2.1.2
Node kind
Another grouping of nodes exists, called the kind of a node (note 2). This grouping is a characterization of the state space of the node. The HUGIN API
introduces the enumeration type h node kind t to represent it.

Note 2: The terms category and kind have been deliberately chosen so as not to conflict with the
traditional vocabulary used in programming languages. Thus, the term type was ruled out.
Chance and decision nodes are either discrete or continuous (note 3). The enumeration constants h kind discrete and h kind continuous represent those
kinds. Discrete nodes have a finite number of states. Continuous nodes
are real-valued and have a special kind of distribution, known as a Conditional Gaussian (CG) distribution, meaning that the distribution is Gaussian
(also known as normal) given values of the parents. For this reason, continuous nodes are also referred to as CG nodes. (See Appendix A for further
information on CG variables.)
As mentioned above, function nodes are either real-valued or discrete. Discrete function nodes have discrete kind, while real-valued function nodes
have other kind (represented by the enumeration constant h kind other).
This is also the kind of utility and instance nodes.
In addition, the special constant h kind error is used for handling errors.
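As a small sketch of how category and kind fit together (error handling omitted; h new domain and h domain new node are described in the following sections):

h_domain_t d = h_new_domain ();
h_node_t n = h_domain_new_node (d, h_category_chance, h_kind_discrete);

if (h_node_get_category (n) == h_category_chance
    && h_node_get_kind (n) == h_kind_discrete)
{
    /* n is a discrete chance node, i.e., an ordinary random variable */
    ...
}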
2.2

h domain t h domain clone (h domain t domain)

Create a clone of domain. The clone will be identical to domain, except that
the clone will not be compiled, even if domain is compiled (note 4).
Note 3: Currently, the HUGIN API does not support LIMIDs with continuous nodes. Thus, all
(chance and decision) nodes of a LIMID must be discrete.

Note 4: Chapter 7 provides information on compiling domains, a prerequisite for performing
inference.
2.3
The following function is used for creating nodes in a domain. Only chance,
decision, utility, and function nodes are permitted in a domain.
h node t h domain new node
(h domain t domain, h node category t category, h node kind t kind)

Create a new node of the indicated category and kind within domain. The new node has default values assigned to its attributes.

h status t h node delete (h node t node)
Delete node (and all links involving node) from the domain or class to which
node belongs. If node has children, the tables and models of those children
are adjusted (see h node remove parent(34) for a description of the adjustment procedure).
If node is not a real-valued function node, and it belongs to a domain, then
that domain is uncompiled (see Section 7.5).
OOBN: Special actions are taken if node is an interface node, an instance
node, or an output clone. See Section 3.7 and Section 3.8 for further details.
DBN: If node has a temporal clone, then that clone is also deleted.
A new node can also be created by cloning an existing node.
h node t h node clone (h node t node)
Create a clone of node. The clone belongs to the same domain or class
as node, and it has attributes that are identical to (or clones of) the corresponding attributes of node: category, kind, subtype, number of states, state
labels, state values, parents, tables (conditional probability, policy, utility,
experience, and fading), model, case data, evidence, structure learning constraints, label, and user-defined attributes. However, the user data pointer
is not copied (it is set to NULL for the clone).
The clone has no name (because there cannot be two identically named
nodes in the same domain or class). Also, the clone has no children (because
that would imply changes to the children).
OOBN/DBN: If node is an interface node or an output clone, or node is or
has a temporal clone, then the clone has none of these properties.
OOBN: If node is a class instance, then the clone is an instance of the same
class as node and has the same inputs as node.
OOBN: If node belongs to a domain derived from a class (a so-called runtime domain), then the source list of node is not copied to the clone.
If (and only if) the cloning process succeeds, node is not a real-valued function node, and it belongs to a domain, then that domain is uncompiled
(see Section 7.5).
2.4
The links of a belief network or a LIMID are directed edges between the
nodes of the network. [Undirected edges are also possible, but the API
interface to support them has not yet been defined. However, see Chapter 13
for a description of the NET language interface.]
Moreover, the SPU (Single Policy Updating) algorithm [24] used to compute policies in LIMIDs cannot handle networks with functional trails from
a node u to a node v unless u or v is a real-valued function node. An attempt to create a LIMID that violates this condition results in a functional
dependency error.
It is not possible to link nodes from different domains or classes.
The quantitative part of the relationship between a discrete, a continuous,
or a utility node and its parents (ignoring functional links) is represented as
a table. This table can either be specified directly (as a complete set of numbers), or it can be generated from a model. For real-valued function nodes,
only models can be used. When links are added, removed, or reversed, the
tables and models involved are automatically updated.
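For example, a sketch of adding a link (continuing the kind of setup shown earlier; error checks omitted):

h_node_t a = h_domain_new_node (d, h_category_chance, h_kind_discrete);
h_node_t b = h_domain_new_node (d, h_category_chance, h_kind_discrete);

h_node_set_name (a, "A");
h_node_set_name (b, "B");

h_node_add_parent (b, a);    /* add A as a parent of B;
                                the table of B is updated automatically */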
affect the table produced by the table generator. That is also the reason
for requiring the lists of state labels to be identical only for labeled nodes,
although all discrete nodes can have state labels.
If the list returned by h node get parents (or h node get children) is used to
control the iteration of a loop (such as the for-loop in the example above),
then API functions that modify the list must not be used in the body of the
loop. For example, calling h node add parent(33) modifies the list of parents
of the child and the list of children of the parent: The contents of the lists are
obviously modified, but the memory locations of the lists might also change.
Other similar functions to watch out for are h node remove parent(34) , h
node switch parent(34) , h node reverse edge(35) , and h node delete(31) .
The problem can be avoided if a copy of the list is used to control the loop.
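A sketch of such a read-only traversal (assuming, as for the other node-list functions in this manual, that the returned list is NULL-terminated):

h_node_t *parents = h_node_get_parents (n);

for (size_t i = 0; parents[i] != NULL; i++)    /* read-only traversal */
    printf ("parent: %s\n", h_node_get_name (parents[i]));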
h node t *h node get children (h node t node)
2.4.1
Not all available observations matter when a decision must be made. Intuitively, a parent of a decision node is said to be requisite if the value of the
parent may affect the optimal choice of the decision.
The performance of inference in a LIMID can be improved, if the network is
simplified by removing the nonrequisite parents of all decision nodes.
Lauritzen and Nilsson [24] present an algorithm for removing the nonrequisite parents of the decision nodes in a LIMID. The result of this algorithm is
a LIMID, where all parents of each decision node are requisite. This network
is known as the minimal reduction of the LIMID (note 8).
h node t *h node get requisite parents (h node t node)
Return a list of the requisite parents of node (which must be a decision node
belonging to a domain). If an error occurs, NULL is returned.
If the requisite parents are not already known (see below), the function
computes the minimal reduction of the underlying LIMID network. The
parents of node in this minimal reduction are returned.
Notice that the function does not remove the nonrequisite parents; it only
tells which parents are (non)requisite. In order to remove the nonrequisite
parents, h node remove parent(34) must be used.
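A sketch of this simplification is shown below. It assumes NULL-terminated node lists; the parent list is copied first, because h node remove parent(34) modifies the list returned by h node get parents (see the discussion in Section 2.4). The allocation check is omitted for brevity; <stdlib.h> and <string.h> are required.

h_node_t *requisite = h_node_get_requisite_parents (dec);
h_node_t *parents = h_node_get_parents (dec);

size_t n = 0;
while (parents[n] != NULL)
    n++;

h_node_t *copy = malloc ((n + 1) * sizeof *copy);
memcpy (copy, parents, (n + 1) * sizeof *copy);

for (size_t i = 0; copy[i] != NULL; i++)
{
    int is_requisite = 0;

    for (size_t j = 0; requisite[j] != NULL; j++)
        if (requisite[j] == copy[i])
        {
            is_requisite = 1;
            break;
        }

    if (!is_requisite)
        h_node_remove_parent (dec, copy[i]);    /* remove nonrequisite parent */
}

free (copy);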
In order to improve performance, the results of the minimal reduction algorithm are cached (that is, the requisite parents of all decisions are cached).
Note 8: Our definition of requisiteness is slightly different from the one given by Lauritzen and
Nilsson: We define a parent of a decision node to be requisite if and only if it is also a parent
of the decision node in the minimal reduction of the LIMID.
If a link is added, if a link is removed (unless that link represents a nonrequisite parent of a decision node), or if a decision node is converted to
a chance node (or vice versa), the cached results are deleted. Notice that
deletion of nodes usually removes links (but creation of nodes does not add
links). Also notice that switching parents adds a link and removes a link and
therefore also causes the cached results to be deleted.
The list of requisite parents is stored within the node data structure. The
application must not modify or deallocate this list. Also, this list is deleted
by the HUGIN API when the contents become invalid (as explained above).
In this case, an updated list must be requested using h node get requisite
parents.
In ordinary influence diagrams, decisions are made according to the no-forgetting rule (which states that past observations and decisions are taken
into account by all future decisions). To help identify the relevant part of
the past, the following function can be used.
h node t *h node get requisite ancestors (h node t node)
Return a list of the requisite ancestors of node (which must be a decision node belonging to a domain). If an error occurs, NULL is returned.

Ordinary influence diagrams also assume a total ordering of all decisions in the network.
This ordering can be specified by ensuring the existence of a directed path that contains all
decisions in the network. This must be done before the requisite ancestors are retrieved.

The list of requisite ancestors is stored within the node data structure. The
application must not modify or deallocate this list. Also, this list is deleted
by the HUGIN API when the contents become invalid (as explained above).
In this case, an updated list must be requested using h node get requisite
ancestors.
2.5
As mentioned above, discrete nodes in the HUGIN API have a finite number
of states. The enumeration of the states follows traditional C conventions: If
a node has n states, the first state has index 0, the second state has index 1,
. . . , and the last state has index n − 1.
It is possible to associate labels and values with the states of discrete nodes.
See Section 6.5 and Section 6.6.
The following function is used to specify the number of states of a discrete
node.
h status t h node set number of states (h node t node, size t count)
If count is smaller than the current number of states of node, then the labels
and the values associated with the deleted states are deleted.
OOBN: If node is an output node of its class, then the changes described
above are applied to all its output clones (recursively, if output clones are
themselves output nodes).
DBN: If node has a temporal clone, then the changes described above are
applied to that clone.
h count t h node get number of states (h node t node)
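Using these two functions (a sketch for a discrete node n; the setter signature shown above is an assumption, and error handling is omitted):

h_node_set_number_of_states (n, 3);    /* give n the states 0, 1, and 2 */

h_count_t count = h_node_get_number_of_states (n);    /* count is now 3 */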
2.6
The conditional distribution for a continuous random variable Y with discrete parents I and continuous parents Z is a (one-dimensional) Gaussian
distribution conditional on the values of the parents:
p(Y | I = i, Z = z) = N(α(i) + β(i)ᵀ z, γ(i))
[This is known as a CG distribution.] Note that the mean depends linearly
on the continuous parent variables and that the variance does not depend
on the continuous parent variables. However, both the linear function and
the variance are allowed to depend on the discrete parent variables. (These
restrictions ensure that exact inference is possible.)
The following six functions are used to set and get the individual elements
of the conditional distribution for a continuous node. In the prototypes of
these functions, node is a continuous chance node, parent is a continuous
parent of node, i is the index of a discrete parent state configuration (see
Section 5.1 for an explanation of configuration indexes), and alpha, beta,
and gamma refer to the α(i), β(i), and γ(i) components of a CG distribution
as specified above.
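The prototypes themselves do not appear on this page; as a hedged sketch, assuming the six functions follow the pattern h node set alpha (node, alpha, i), h node set beta (node, beta, parent, i), and h node set gamma (node, gamma, i):

/* y is a continuous node with continuous parent z; the index 0 refers
   to the first discrete parent state configuration. Names and argument
   orders as assumed above. */
h_node_set_alpha (y, 0.0, 0);      /* intercept of the mean */
h_node_set_beta (y, 1.5, z, 0);    /* coefficient of parent z in the mean */
h_node_set_gamma (y, 2.0, 0);      /* variance */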
2.7
h string t h node get name (h node t node)

Retrieve the name of node. If node is unnamed, a new unique name will automatically be generated and assigned to node. Note that the name returned by h node get name is not a copy. Thus, the
application must not modify or free it.
h node t h domain get node by name
(h domain t domain, h string t name)
Return the node with name name in domain, or NULL if no node with that
name exists in domain, or if an error occurs (i.e., domain or name is NULL).
2.8
An application may need to perform some action for all nodes of a domain.
To handle such situations, the HUGIN API provides a set of functions for
iterating through the nodes of a domain, using an order determined by the
age of the nodes: the first node in the order is the youngest node (i.e., the
most recently created node that hasn't been deleted), . . . , and the last node
is the oldest node.
If the application needs the nodes in some other order, it must obtain all the
nodes, using the functions described below, and sort the nodes according to
the desired order.
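A typical traversal might look as follows (a sketch, assuming the iteration pair is h domain get first node and h node get next, whose prototypes do not appear on this page):

for (h_node_t n = h_domain_get_first_node (d); n != NULL;
     n = h_node_get_next (n))
    printf ("%s\n", h_node_get_name (n));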
2.9
User data
2.9.1
The HUGIN API provides a slot within the node structure for exclusive use
by the application. This slot can be used to hold a pointer to arbitrary data,
completely controlled by the user.
h status t h node set user data (h node t node, void *data)

Store the pointer data in the user data slot of node.

void *h node get user data (h node t node)

Return the value stored in the user data slot of node.
belief_window w;
h_node_t n;
...
w = (belief_window) h_node_get_user_data (n);
update_belief_window (w, n);
where update belief window is a function defined by the application. Again, note
the cast of the pointer type.
Using the user data facility is analogous to adding an extra slot to the node
data structure. It must be noted that only one such slot is available. If more
are needed, store a list of slots or create a compound data type (e.g., a C
structure). Note also that the extra slot is not saved in HUGIN KB or in
NET files. If this is needed, the application must create the necessary files.
Alternatively, the attribute facility described below can be used.
It is also possible to associate user data with a domain as a whole. This is
done using the following functions.
h status t h domain set user data (h domain t domain, void *data)

void *h domain get user data (h domain t domain)
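A sketch of the domain-level slot (my data and make data are hypothetical application-defined names):

my_data *p = make_data ();    /* hypothetical application data */

h_domain_set_user_data (d, p);
...
p = (my_data *) h_domain_get_user_data (d);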
2.9.2
User-defined attributes
2.10
If the domain is compressed (which implies compiled), then the domain will also be compressed when loaded. This property implies that
compressed domains can be created on computers with large amounts
of (virtual) memory and then later be loaded on computers with limited amounts of (virtual) memory.
When a compressed domain is loaded, a propagation is required before beliefs can be retrieved (note 14).
If the domain is a triangulated influence diagram (i.e., the HKB file
was created by a HUGIN API older than version 7.0), then the domain
is loaded in uncompiled form. The domain is then treated as a LIMID.
OOBN/DBN: The source lists of nodes in a runtime domain are not
saved in the HKB file. This implies that the domain is not recognized
as a runtime domain when it is loaded from the HKB file (note 15).
There is no published specification of the HKB format, and since the format
is binary (and non-obvious), the only way to load an HKB file is to use the
appropriate HUGIN API function. This property makes it possible to protect
HKB files from unauthorized access: A password can be embedded in the
HKB file, when the file is created; this password must then be supplied,
when the HKB file is loaded. (The password is embedded in the HKB file in
encrypted form, so that the true password cannot easily be discovered by
inspection of the HKB file contents.)
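A sketch of saving and reloading a password-protected HKB file (the file name is arbitrary, and passing NULL as the password is assumed to mean "no protection"):

if (h_domain_save_as_kb (d, "model.hkb", "secret") != 0)
    /* handle error */ ...;

h_domain_t d2 = h_kb_load_domain ("model.hkb", "secret");

if (d2 == NULL)
    /* handle error */ ...;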
In general, the format of an HKB file is specific to the version of the HUGIN
API that was used to create it. Thus, when upgrading the HUGIN API (which
is also used by the HUGIN GUI tool, so upgrading that tool usually implies
a HUGIN API upgrade), it may be necessary to save a domain in the NET
format (see Section 13.10) using the old software before upgrading to the
new version of the software (because the new software may not be able to
load the old HKB files) (note 16).
HUGIN KB files are (as of HUGIN API version 6.2) automatically compressed
using the Zlib library (www.zlib.net). This implies that the developer
(i.e., the user of the HUGIN API) must (on some platforms) explicitly link to
the Zlib library, if the application makes use of HKB files (see Section 1.2).

Note 13: … where the table data were actually included in the HKB file, and therefore the
table data could be loaded directly from the file.

Note 14: This is also a change in HUGIN API version 6.6. See the previous note (replacing
"compilation" with "propagation").

Note 15: This prevents functions such as h domain learn class tables(179) from being used.

Note 16: The HKB formats for HUGIN API versions 3.x and 4.x were identical, but the HKB
format changed for version 5.0 and again for versions 5.1, 5.2, and 5.3. Versions 5.4, 6.0,
and 6.1 used the same format as version 5.3. Versions 6.2–6.5 also used this format for HKB
files that were not password protected, but a newer revision of the format was used for
password-protected HKB files. The HKB format changed for versions 6.6, 6.7, 7.0, and 7.1.
Version 7.2 used the same format as version 7.1. Version 7.3 also used this format for
networks without function nodes, but a newer revision of the format was used for networks
with function nodes. Versions 7.4 and 7.5 used the same format as version 7.3, unless the
file contained a compressed domain; in this case, a newer revision of the format was used.
Version 7.6 used a newer revision of the format if the network contained non-function nodes
with function parents; otherwise, it used the same format as version 7.5. Versions 7.7, 7.8,
8.0, and 8.1 use a new revision of the format if the network contains function nodes;
otherwise, they use the same format as version 7.5. Versions 8.2 and 8.3 use a new revision
of the format if the state-index operator is used within some expression in the network;
otherwise, the HKB revision used is the same as in version 8.1. HUGIN API 8.3 can load HKB
files produced by version 5.0 or any later version up to (at least) version 8.3, but future
versions of the HUGIN API might not.
Chapter 3
Object-Oriented Belief Networks and LIMIDs
This chapter provides the tools for constructing object-oriented belief network and LIMID models.
An object-oriented model is described by a class. Like a domain, a class
has an associated set of nodes, connected by links. However, a class may
also contain special nodes representing instances of other classes. A class
instance represents a network. This network receives input through input
nodes and provides output through output nodes. Input nodes of a class are
placeholders to be filled in when the class is instantiated. Output nodes
can be used as parents of other nodes within the class containing the class
instance.
Object-oriented models cannot be used directly for inference: an object-oriented model must be converted to an equivalent flat model (represented as a domain; see Chapter 2) before inference can take place.
3.1
3.2
A class always belongs to a (unique) class collection. So, before a class can
be created, a class collection must be created.
h class collection t h new class collection (void)

Create a new class collection.

h class t h cc new class (h class collection t cc)

Create a new class. The new class will belong to class collection cc.
h class t *h cc get members (h class collection t cc)

Retrieve the list of classes belonging to class collection cc. The list is a NULL-terminated list.
h class collection t h class get class collection (h class t class)

Retrieve the class collection to which class belongs.
3.3
h status t h cc delete (h class collection t cc)

Delete class collection cc. This also deletes all classes belonging to cc.
h status t h class delete (h class t class)
Delete class class and remove it from the class collection to which it belongs. If class is instantiated, then this operation will fail. (The h class get
instances(55) function can be used to test whether class is instantiated.)
3.4
Naming classes
h string t h class get name (h class t class)

Retrieve the name of class. If class is unnamed, a new unique name will
automatically be generated and assigned to class.
h class t h cc get class by name
(h class collection t cc, h string t name)
Retrieve the class with name name in class collection cc. If no such class
exists in cc, NULL is returned.
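Putting these functions together (a sketch; h cc new class as introduced above):

h_class_collection_t cc = h_new_class_collection ();
h_class_t c = h_cc_new_class (cc);

... /* add nodes, interface nodes, and class instances to c */

h_string_t name = h_class_get_name (c);    /* a unique name is generated */
h_class_t c2 = h_cc_get_class_by_name (cc, name);    /* c2 == c */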
3.5
Create a new basic node of the indicated category and kind within class.
The node will have default values assigned to its attributes: The desired
attributes of the new node should be explicitly set using the relevant API
functions.
h class t h node get home class (h node t node)
Retrieve the class to which node belongs. If node is NULL, or node does not
belong to a class (i.e., it belongs to a domain), NULL is returned.
Deletion of basic nodes is done using h node delete(31) .
3.6
Naming nodes
Nodes belonging to classes can be named, just like nodes belonging to domains. The functions to handle names of class nodes are the same as those
used for domain nodes (see Section 2.7) plus the following function.
h_node_t h_class_get_node_by_name (h_class_t class, h_string_t name)

Retrieve the node with name name in class. If no such node exists in class, NULL is returned.
3.7
A class has a set of input nodes and a set of output nodes. These nodes
represent the interface of the class and are used to link instances of the class
to other class instances and network fragments.
For the following functions, when a node appears as an argument, it must
belong to a class. If not, a usage error is generated.
This function illustrates that modifying one class may affect many other
classes. This can happen when a class is modified such that its interface,
or some attribute of a node in the interface, is changed. In that case, all
instances of the class are affected. It is most efficient to specify the interface
of a class before creating instances of it.
Deletion of an interface node (using h_node_delete(31)) implies invocation of either h_node_remove_from_inputs or h_node_remove_from_outputs, depending on whether the node to be deleted is an input or an output node, respectively.
3.8
A class can be instantiated within other classes. Each such instance is represented by a so-called instance node. Instance nodes are of category h_category_instance.
Output clones
Whenever a class instance is created, instances of all output nodes of the class are also created. These nodes are called output clones. Since several instances of some class C can exist in the same class, we need a way to distinguish different copies of some output node Y of class C corresponding to different instances of C — the output clones serve this purpose. For example, when specifying output Y of class instance I as a parent of some node, the output clone corresponding to the (I, Y) combination must be passed to h_node_add_parent(33). Output clones are retrieved using the h_node_get_output(57) function.
Many API operations are not allowed for output clones. The following restrictions apply:
Output clones can be used as parents, but cannot have parents themselves.
Output clones do not have tables or models.
For discrete output clones, the category and attributes related to states
(i.e., subtype, number of states, state labels, and state values) can be
retrieved, but not set. These attributes are identical to those of the
real output node (known as the master node) and change automatically whenever the corresponding attributes of the master are modified. For example, when the number of states of an output node is
changed, then all tables in which one or more of its clones appear will
automatically be resized as described in Section 2.5.
An output clone cannot be deleted directly. Instead, it is automatically deleted when its master is deleted or removed from the class interface (see h_node_remove_from_outputs(54)), or when the class instance to which it is associated is deleted.
Output clones are created without names, but they can be named just like
other nodes.
An output clone belongs to the same class as the instance node with which
it is associated. Hence, it appears in the node list of that class (and will be
seen when iterating over the nodes of the class).
h_node_t h_node_get_instance (h_node_t node)
Retrieve the instance node with which node is associated. If node is not an
output clone, NULL is returned.
h_node_t h_node_get_output (h_node_t instance, h_node_t output)

Retrieve the output clone that was created from output when instance was created. (This implies that output is an output node of the class from which instance was created, and that output is the master of the output clone returned.)
h_status_t h_node_substitute_class (h_node_t instance, h_class_t new)
Change the class instance instance to be an instance of class new. Let old be
the original class of instance. Then the following conditions must hold:
for each input node in old, there must exist an input node in new with
the same name, category, and kind;
for each output node in old, there must exist a compatible output node
in new with the same name.
(Note that this implies that interface nodes must be named.) The notion of
compatibility referred to in the last condition is the same as that used by
h_node_switch_parent(34) and for input bindings (see Section 3.9 below).
The input bindings for instance are updated to refer to input nodes of class
new instead of class old (using match-by-name).
Similarly, the output clones associated with instance are updated to refer to
output nodes of class new instead of class old (again using match-by-name).
This affects only the value returned by h_node_get_master(56); in all other respects, the output clones are unaffected.
Extra output clones will be created, if class new has more output nodes than
class old.
3.9 Input bindings
h_status_t h_node_set_input (h_node_t instance, h_node_t input, h_node_t node)

Bind node to the formal input node input of the class of which instance is an instance; instance and node must belong to the same class, and input and node must be of the same category and kind.
The h_node_set_input function does not prevent the same node from being bound to two or more input nodes of the same class instance. However, it is an error if a node ends up being a parent of some other node twice in the runtime domain (Section 3.10). This happens if some node A is bound to two distinct input nodes, X1 and X2, of some class instance I, and X1 and X2 have a common child in the class of which I is an instance. This will cause h_class_create_domain(58) to fail.
Note that for a given input binding to make sense, the formal and actual input nodes must be compatible. The notion of compatibility used for this purpose is the same as that used by the h_node_switch_parent(34) and h_node_substitute_class(57) functions. This means that the nodes must be of the same category and kind, and (if the nodes are discrete) have the same subtype, the same number of states, the same list of state labels, and the same list of state values (depending on the subtype). Only the category/kind restriction is checked by h_node_set_input. All restrictions (including the category/kind restriction) are checked by h_class_create_domain(58).
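As an illustration, the following sketch binds an actual node to a formal input node. The handles are hypothetical, and the argument order of h_node_set_input is inferred from the description above.

h_node_t X = h_class_get_node_by_name (C, "X");          /* formal input */
h_node_t A = h_class_get_node_by_name (enclosing, "A");  /* actual input */

if (h_node_set_input (I, X, A) != 0)
    /* error: A and X differ in category or kind */ ;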
3.10 The runtime domain
Before inference can be performed, a class must be expanded to its corresponding flat domain, known as the runtime domain.
h_domain_t h_class_create_domain (h_class_t class)

Create the runtime domain corresponding to class.
Creating a runtime domain is a recursive process: First, domains corresponding to the instance nodes within class are constructed (using h_class_create_domain recursively). These domains are then merged into a common domain, and copies of all non-instance nodes of class are added to the domain. Finally, the copies of the formal input nodes of the subdomains are identified with their corresponding actual input nodes, if any.
Note that the runtime domain contains only basic (i.e., non-instance) nodes.
The attributes of the runtime domain are copies of those of class.
Models and tables are copied to the runtime domain. In particular, if tables
are up-to-date with respect to their models in class, then this will also be the
case in the runtime domain. This can save a lot of time (especially if many
copies of a class are made), since it can be very expensive to generate a
table. Generating up-to-date tables is done using h_class_generate_tables(96).
In order to associate a node of the runtime domain with (the path of) the
node of the object-oriented model from which it was created, a list of nodes
(called the source list) is provided. This node list traces a path from the root
of the object-oriented model to a leaf of the model. Assume the source list
corresponding to a runtime node is ⟨N1, . . . , Nm⟩. All nodes except the last must be instance nodes: N1 must be a node within class, and Ni (i > 1) must be a node within the class of which Ni−1 is an instance.
The nodes of the runtime domain are assigned names based on the source lists: If the name of node Ni is ni, then the name of the runtime node is the dotted name n1.n2.⋯.nm. Because the names of the source nodes are not allowed to contain dots, this scheme will generate unique (and meaningful) names for all runtime nodes. (As a side-effect of this operation, the source nodes are also assigned names if they are not already named.)
DBN: In ordinary runtime domains, the nodes have unique source lists. This
is not the case in DBN runtime domains, where the source lists only uniquely
identify nodes in a given time slice. In order to provide unique names for
the nodes, a time slice ID is included in the names. See h_class_create_dbn_domain(65) for more information.
[Figure 3.1: an object-oriented model containing a basic node A and two class instance nodes, I1 and I2.]
3.11
Node iterator
In order to iterate over the nodes of a class, the following function is needed.
h_node_t h_class_get_first_node (h_class_t class)

Return the first node of class (or NULL if class contains no nodes, or class is NULL). The remaining nodes can be accessed using h_node_get_next.
[Figure 3.2: A runtime domain (with nodes A0, B0, C0, W1, W2, Y2, Z1, Z2) corresponding to the object-oriented model shown in Figure 3.1.]
3.12
User data
Section 2.9 describes functions for associating user-defined data with domains and nodes. Similar functions are also provided for classes.
The first two functions manage generic pointers to data structures that must
be maintained by the user application.
h_status_t h_class_set_user_data (h_class_t class, void *data)

Store the pointer data in the user data slot of class.

void *h_class_get_user_data (h_class_t class)

Return the value stored in the user data slot of class.
3.13
A class collection can be saved as a HUGIN Knowledge Base (HKB) file. This
is a portable binary file format, which is only intended to be read by the
appropriate HUGIN API functions. There are two types of HKB files: HKB
files containing domains (see Section 2.10) and HKB files containing class
collections.
h_status_t h_cc_save_as_kb (h_class_collection_t cc, h_string_t file_name, h_string_t password)

Save cc as a HUGIN KB to a file named file_name. If password is not NULL, then the HKB file will be protected by the given password (i.e., the file can only be loaded if this password is supplied to the h_kb_load_class_collection function).
Chapter 4
Dynamic Bayesian Networks
4.1
Temporal clones
The structure of a DBN at a given time instant is modeled using a class. This
class is then instantiated a specified number of times using h_class_create_dbn_domain(65). The result is called a DBN runtime domain, and it represents the time window of the system. The time window is where evidence is
entered and inference is performed.
When modeling a system that evolves over time, random variables at a given
time instant typically depend on the state of the system at past time instants.
In order to specify such temporal dependencies, temporal clones can be constructed for the regular nodes of the class. A temporal clone of a variable X
represents X at the previous time instant.1 Links can then be added from
temporal clones to the regular nodes.
Hence, the temporal clones represent the interface between two successive
time slices.
h_node_t h_node_get_temporal_clone (h_node_t node)
Retrieve the temporal clone of node. If node does not have a temporal clone,
NULL is returned.
And this is the inverse function:
h_node_t h_node_get_temporal_master (h_node_t clone)
Retrieve the temporal master of clone. If clone is not a temporal clone, NULL
is returned.
4.2
As for object-oriented models in general, the time slice class must be instantiated (one instance per time slice) to form a so-called DBN runtime domain.
This domain represents the time window, and this is where evidence is entered and inference is performed.
4.3
Inference in DBNs
Before inference can be performed, the DBN runtime domain must be compiled. A DBN runtime domain can be compiled as an ordinary domain. This
produces a compiled domain that only allows exact inference to be performed within the time window itself. The operation of moving the time
window to include future time instants (see below), and the operation of
prediction, can only be performed approximately with this domain. If the
operations must be performed exactly (i.e., without approximations), then
the triangulation must respect extra properties. This is ensured by using the
following function for triangulation.
h_status_t h_domain_triangulate_dbn (h_domain_t domain, h_triangulation_method_t tm)

This function is similar to h_domain_triangulate(107) except that the interfaces between successive time slices are made complete. This allows exact inference in the situations described above.
It is assumed that domain was created by h_class_create_dbn_domain(65), and that no changes have been made to domain since its creation (in particular, no node creations or deletions, no link changes, and no changes to the CPTs). The same is assumed for the class from which domain was created (the class must not be deleted or modified until domain is deleted).
All interface nodes of the network must be discrete chance nodes, and the
network must not contain decision or utility nodes. Also, the network must
not contain a functional trail between any pair of interface nodes.
A (compiled) DBN runtime domain that has been triangulated using h_domain_triangulate_dbn cannot be compressed. Also, such domains can only use sum-propagations for inference. And, in the current implementation, such domains cannot be saved as HKB files.
If the time window doesn't cover all time instants of interest, it can be moved using the following function.
h_status_t h_domain_move_dbn_window (h_domain_t domain, size_t delta)

Move the time window of domain delta time slices into the future.
4.4
Prediction
We can compute beliefs for discrete nodes, means and variances for continuous nodes, and values for real-valued function nodes in time slices that lie
beyond the time window. This operation is called prediction.
h_status_t h_domain_compute_dbn_predictions (h_domain_t domain, size_t number_of_time_instants)

This operation computes predictions for all nodes at time instants following the time window. Let the time of the last slice in the time window be t. Then this function computes predictions for all nodes at times t + 1, . . . , t + number_of_time_instants. These predictions are referred to using time indexes ranging from 0 to number_of_time_instants − 1.
The number of time instants must be a positive number.
If domain (which must be a compiled DBN runtime domain) was triangulated using h_domain_triangulate_dbn(65), the predictions are computed exactly. Otherwise, approximate inference is performed.
All interface nodes of the network must be discrete chance nodes, and the
network must not contain decision or utility nodes. Also, the network must
not contain a functional trail between any pair of interface nodes.
The junction tree potentials must be up-to-date with respect to the evidence,
the node tables and their models (if any). The equilibrium must be sum,
and the evidence incorporation mode must be normal. This can be ensured
using a propagation operation.
The predictions computed by h_domain_compute_dbn_predictions can be accessed by the following functions. In all of these functions, the node argument must be the representative that belongs to the first (complete) time slice of the time window (i.e., the node that is named with the prefix "T1." by h_class_create_dbn_domain(65)).
The predictions are (only) available as long as domain is compiled.6 Also, note that the predictions are not automatically updated by propagation operations — an explicit call to h_domain_compute_dbn_predictions is required.
h_number_t h_node_get_predicted_belief (h_node_t node, size_t s, size_t time)

Retrieve the belief of node for state s at time instant time as computed by the most recent successful call to h_domain_compute_dbn_predictions (provided that the predictions are still available); node must be a discrete node, and time must be less than the number of time instants specified in that call.
6 Recall that many HUGIN API functions perform an implicit uncompile operation.
h_number_t h_node_get_predicted_mean (h_node_t node, size_t time)

Retrieve the mean value of node at time instant time as computed by the most recent successful call to h_domain_compute_dbn_predictions (provided that the predictions are still available); node must be a continuous node, and time must be less than the number of time instants specified in that call.
h_number_t h_node_get_predicted_variance (h_node_t node, size_t time)

Retrieve the variance of node at time instant time as computed by the most recent successful call to h_domain_compute_dbn_predictions (provided that the predictions are still available); node must be a continuous node, and time must be less than the number of time instants specified in that call.
h_number_t h_node_get_predicted_value (h_node_t node, size_t time)

Retrieve the value of node at time instant time as computed by the most recent successful call to h_domain_compute_dbn_predictions (provided that the predictions are still available); node must be a real-valued function node, and time must be less than the number of time instants specified in that call.
4.5
Boyen and Koller ([4] and [19, 15.3.2]) have described a way to approximate the joint distribution of the interface variables between two successive
time slices such that the accumulated errors remain bounded indefinitely.
The idea is to use a factorization of the distribution so that inference in the
junction tree corresponding to a single time slice has tractable complexity.
During inference, the distribution over the interface variables is approximated according to the desired factorization, and the factors are transferred to the junction tree corresponding to the next time slice.
The Hugin inference engine uses this technique to approximate moving of
the time window and prediction when the DBN runtime domain has not
been compiled for exact inference.
The Hugin inference engine uses the links between the temporal clones as
the definition of the factorization to be used for the approximation.
Chapter 5
Tables
Tables are used within HUGIN for representing the conditional probability,
policy, and utility potentials of nodes, the probability and utility potentials
on separators and cliques of junction trees, evidence potentials, etc.
The HUGIN API makes (some of) these tables accessible to the programmer via the opaque pointer type h_table_t and associated functions.
The HUGIN API currently does not provide means for the programmer to construct their own table objects — only functions to manipulate the tables created by HUGIN.
5.1
What is a table?
A potential is a function from the state space of a set of variables into the set
of real numbers. A table is a computer representation of a potential.
The HUGIN API introduces the opaque pointer type h_table_t to represent table objects.
Consider a potential defined over a set of nodes. In general, the state space
of the potential has both a discrete part and a continuous part. Both parts
are indexed by the set I of configurations of states of the discrete nodes.
The discrete data consist of numbers x(i) (i ∈ I). If the potential is a probability potential, x(i) is a probability (i.e., a number between 0 and 1, inclusive). If the potential is a utility potential, x(i) can be any real number.
Probability potentials with continuous nodes represent so-called CG potentials (see [8, 20, 23]). They can represent either conditional or marginal probability distributions. CG potentials of the conditional type are accessed using special functions — see Section 2.6. For CG potentials of the marginal type, we have, for each i ∈ I, a number x(i) (a probability), a mean value vector μ(i), and a (symmetric) covariance matrix Σ(i); μ(i) and Σ(i) are the
mean value vector and the covariance matrix for the conditional distribution
of the continuous nodes given the configuration i of the discrete nodes.
To be able to use a table object effectively, it is necessary to know some facts
about the representation of the table.
The set of configurations of the discrete nodes (i.e., the discrete state space I)
is organized as a multi-dimensional array in row-major format. Each dimension of this array corresponds to a discrete node, and the ordered list of
dimensions defines the format as follows.1
Suppose that the list of discrete nodes is ⟨N1, . . . , Nn⟩, and suppose that node Nk has sk states. A configuration of the states of these nodes is a list ⟨i1, . . . , in⟩, with 0 ≤ ik < sk (1 ≤ k ≤ n).
The set of configurations is mapped (one-to-one) onto the index set {0, . . . , S − 1}, where

    S = s1 s2 ⋯ sn

The index of configuration ⟨i1, . . . , in⟩ is

    a1 i1 + a2 i2 + ⋯ + an in,   where   ak = sk+1 ⋯ sn   (in particular, an = 1)
Example 5.1 The mapping from state configurations to table indexes can be expressed using a simple loop. Let node_count be the number of discrete nodes in the given table, let state_count[k] be the number of states of the kth node, and let configuration[k] be the state of the kth node in the state configuration. Then the table index corresponding to the given state configuration can be computed as follows:
size_t k, index = 0;
for (k = 0; k < node_count; k++)
index = index * state_count[k] + configuration[k];
An API function is also provided to perform this computation: h_table_get_index_from_configuration(71).
Many HUGIN API functions use the index of a configuration whenever the
states of a list of discrete nodes are needed. Examples of such functions are:
h_node_set_alpha(41), h_node_get_alpha(42), h_table_get_mean(73), etc.
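For example, under the assumption that the function takes the table and an array with one state index per discrete node of the table (a sketch, not an authoritative prototype; t is a hypothetical table handle):

size_t configuration[] = { 1, 0, 2 };  /* hypothetical states */
size_t index =
    h_table_get_index_from_configuration (t, configuration);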
1 This only applies to uncompressed tables. If a table is compressed, then there is no simple way to map a configuration to a table index. (Compression is described in Section 7.6.)
5.2
must be used (we assume that table is a table returned by h_domain_get_marginal(126) or h_node_get_distribution(126)):
h_double_t h_table_get_mean (h_table_t table, size_t i, h_node_t node)

Return the mean value of the conditional distribution of the continuous node node given the discrete state configuration i.
h_double_t h_table_get_covariance (h_table_t table, size_t i, h_node_t node1, h_node_t node2)

Return the covariance of the continuous nodes node1 and node2 given the discrete state configuration i.
5.3
Deleting tables
The HUGIN API also provides a function to release the storage resources used by a table. The h_table_delete function can be used to deallocate tables returned by h_domain_get_marginal(126), h_node_get_distribution(126), h_node_get_experience_table(160), and h_node_get_fading_table(161). All other deletion requests are ignored (e.g., a table returned by h_node_get_table(39) cannot be deleted).
h_status_t h_table_delete (h_table_t table)
5.4

size_t h_table_get_size (h_table_t table)

Return the size (the number of discrete state configurations) of table. This assumes that all state configurations are represented in the table. If some state configurations have been removed (by a process known as compression — see Section 7.6), the size will be smaller.
Tables with continuous (CG) nodes also store CG data: for each discrete configuration, a mean value vector and a (symmetric) covariance matrix over the continuous nodes of the table. The total number of these data elements is called the CG size of the table. (If the table is compressed, then this quantity is reduced proportionally to the number of discrete configurations removed from the table.)
size_t h_table_get_cg_size (h_table_t table)
5.5
Now suppose we want to ensure that a appears before b in the node list of the
conditional probability table of y. We make a list containing the desired order of y
and its parents, and then we call h_table_reorder_nodes.
h_node_t list[5];
h_table_t t = h_node_get_table (y);
list[0] = a; list[1] = b;
list[2] = y; list[3] = x;
list[4] = NULL;
h_table_reorder_nodes (t, list);
Note that since y (the child node of the table) is continuous, it must be the first
node among the continuous nodes in the node list of the table. Had y been discrete,
it should have been the last node in the node list of the table (in this case, all nodes
would be discrete).
Chapter 6
Generating Tables
This chapter describes how to specify a compact description of a node table.
From this description, the contents of the table is generated.
Such a table description is called a model. A model consists of a list of discrete nodes and a set of expressions (one expression for each configuration
of states of the nodes). The expressions are built using standard statistical distributions (such as Normal, Binomial, Beta, Gamma, etc.), arithmetic
operators (such as addition, subtraction, etc.), standard functions (such as
logarithms, exponential, trigonometric, and hyperbolic functions), logical
operators (conjunction, disjunction, and conditional), and relations (such
as less-than or equals).
Models are also used to specify the functions associated with real-valued
function nodes. In this case, no tables are generated.
6.1
Recall that discrete nodes have a finite number of states. This implies that numbered nodes can only represent a finite subset of, e.g., the nonnegative integers (so special conventions are needed for discrete infinite distributions such as the Poisson — see Section 6.7.2).
The above classification of discrete nodes is represented by the enumeration type h_node_subtype_t. The constants of this enumeration type are: h_subtype_label, h_subtype_boolean, h_subtype_number, and h_subtype_interval. In addition, the constant h_subtype_error is defined for denoting errors.
h_status_t h_node_set_subtype (h_node_t node, h_node_subtype_t subtype)

Set the subtype of node (which must be a discrete node) to subtype.

h_node_subtype_t h_node_get_subtype (h_node_t node)

Return the subtype of node (which must be a discrete node). If node is NULL or not a discrete node, h_subtype_error is returned.
6.2
Expressions
Expressions are classified (typed) by what they denote. There are four different types: labeled, boolean, numeric, and distribution.1
An opaque pointer type h_expression_t is defined to represent the expressions that constitute a model. Expressions can represent constants, variables, and composite expressions (i.e., expressions comprised of an operator applied to a list of arguments). The HUGIN API defines the following set of functions to construct expressions.
All these functions return NULL on error (e.g., out-of-memory).
is of numeric type. If node is a discrete node, then the type of the expression
is either labeled, boolean, or numeric, depending on the subtype of node.
h_operator_aggregate
This operator denotes the distribution of the sum of a random number
of independent identically distributed random variables.
The number of random variables (the number of events) is modeled by
a discrete distribution, called the frequency distribution. The severity
of each event is modeled by a continuous distribution. The frequency
and severity distributions are assumed to be independent.
This is known as an aggregate distribution.
The operator takes two arguments: The first argument (representing
the frequency distribution) must be a numbered node, and the state
values for that node must form the sequence 0, 1, . . . , m for some m.
The second argument (representing the severity distribution) must be
an interval node (usually ranging from zero to infinity).2
An aggregate distribution can only be specified for a discrete function node (of interval subtype).2 The reason is that the frequency and
severity distributions must be available before the aggregate distribution can be computed. The intervals of the function node must cover
all values of (the domain of) the aggregate distribution. This usually
means all values from zero to infinity.
Example 6.2 An aggregate distribution can be used to model the total claim
amount for an insurance portfolio. The number of claims is modeled by the
frequency distribution (this might be a Poisson distribution), and the size (or
cost) of each claim is modeled by the severity distribution.
h_operator_probability
This operator takes a single argument — a boolean expression. This argument must contain exactly one discrete node (and no other nodes). Moreover, the operator cannot be nested.
The expression is evaluated by instantiating the node in the argument
to all possible states. The result is the sum of the beliefs of the instantiations for which the boolean argument evaluates to true. (This means
that inference must be performed before the expression can be evaluated.)
The operator can only be used in models for function nodes.
Example 6.3 If X is a labeled discrete node with states low, medium, and
high, then the following expressions are all valid in models for function nodes
having X as parent.
2 Aggregate distributions are currently not supported for nodes with zero-width intervals.
probability (X == "low")
probability (X != "low")
probability (or (X == "low", X == "medium"))
Such expressions can be used to transfer probabilities from one subnetwork
to another if they are connected by a functional trail.
6.3
Syntax of expressions
⟨Expression⟩ →
    ⟨Simple expression⟩ ⟨Comparison⟩ ⟨Simple expression⟩
    | ⟨Simple expression⟩
⟨Simple expression⟩ →
    ⟨Simple expression⟩ ⟨Plus or minus⟩ ⟨Term⟩
    | ⟨Plus or minus⟩ ⟨Term⟩
    | ⟨Term⟩
⟨Term⟩ →
    ⟨Term⟩ ⟨Times or divide⟩ ⟨Exp factor⟩
    | ⟨Exp factor⟩
⟨Exp factor⟩ →
    ⟨Factor⟩ ^ ⟨Exp factor⟩
    | ⟨Factor⟩
⟨Factor⟩ →
    ⟨Unsigned number⟩
    | ⟨Node name⟩
    | ⟨String⟩
    | false
    | true
    | # ⟨Node name⟩
    | ( ⟨Expression⟩ )
    | ⟨Operator name⟩ ( ⟨Expression sequence⟩ )
⟨Expression sequence⟩ →
    ⟨Empty⟩
    | ⟨Expression⟩ [ , ⟨Expression⟩ ]*
⟨Comparison⟩ → == | = | != | <> | < | <= | > | >=   (3)
⟨Plus or minus⟩ → + | -
⟨Times or divide⟩ → * | /
⟨Operator name⟩ →
    Normal | LogNormal | Beta | Gamma
    | Exponential | Weibull
    | Uniform | Triangular | PERT
    | Binomial | Poisson | NegativeBinomial
    | Geometric | Distribution | NoisyOR
    | truncate | aggregate | probability
    | min | max | log | log2 | log10 | exp
    | sin | cos | tan | sinh | cosh | tanh
    | sqrt | abs | floor | ceil | mod
    | if | and | or | not
The operator names refer to the operators of the h_operator_t(79) type: prefix the operator name with h_operator_ to get the corresponding constant of the h_operator_t type.
h_expression_t h_string_parse_expression
    (h_string_t s, h_model_t model,
     void (*error_handler) (h_location_t, h_string_t, void *),
     void *data)

Parse the expression in string s and construct an equivalent expression object. The model argument is used to resolve the node names appearing in the expression. If an error is detected, the error_handler function is called with a location (the position within s where the error was detected), a message describing the error, and data.

h_string_t h_expression_to_string (h_expression_t e)

Allocate a string and write into this string a representation of the expression e using the above described syntax. Note that it is the responsibility of the user of the HUGIN API to deallocate the returned string when it is no longer needed.
6.4
The HUGIN API introduces the opaque pointer type h_model_t to represent models. Models must be explicitly created before they can be used.

h_model_t h_node_new_model (h_node_t node, h_node_t *model_nodes)

Create and return a model for node (which must be a discrete, a utility, or a function node) using model_nodes (a NULL-terminated list of discrete nodes,
3 Note that both C and Pascal notations for equality/inequality operators are accepted.
h_expression_t h_model_get_expression (h_model_t model, size_t index)

Return the expression stored at position index within model. If model is NULL or no expression has been stored at the indicated position, NULL is returned (the first case is an error).
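Putting the pieces together, a model might be built and used as follows. This is a sketch: the node handles D and Y are hypothetical, h_model_set_expression is assumed to store an expression at a given configuration index, and passing NULL for the parse error handler is also an assumption.

h_node_t model_nodes[] = { D, NULL };
h_model_t m = h_node_new_model (Y, model_nodes);

h_expression_t e =
    h_string_parse_expression ("Normal (0, 1)", m, NULL, NULL);

h_model_set_expression (m, 0, e);  /* for state 0 of D */
h_node_generate_table (Y);         /* see Section 6.8  */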
6.5
State labels
Labels assigned to states of discrete nodes serve two purposes: (1) to identify states of labeled nodes in the table generator, and (2) to identify states in
the HUGIN GUI application (for example, when beliefs or expected utilities
are displayed).
h_index_t h_node_get_state_index_from_label (h_node_t node, h_string_t label)

Return the index of the state of node matching the specified label. If node is NULL or not a discrete node, or label is NULL (these conditions are treated as errors), or no (unique) state matching the specified label exists, −1 is returned.
The following function can be used to reorder the states of a labeled node.
h_status_t h_node_reorder_states (h_node_t node, h_string_t *order)

Reorder the states of node according to the specified order, where node is a labeled discrete node, and order is a NULL-terminated list of strings containing a permutation of the state labels of node. The states of node must be uniquely labeled.
OOBN/DBN: node must not be an output clone or a temporal clone.
In addition to reordering the state labels of node, the data in all tables containing node are reorganized according to the new ordering. The affected tables are the same as those resized by h_node_set_number_of_states(38). If node is a model node in the model of a child, then the expressions in that model are reorganized according to the new ordering.
OOBN: If node is an output node of a class, then similar updates are performed for all output clones of node. The process is repeated (recursively) if
some output clones are also output nodes.
DBN: If node has a temporal clone, then similar updates are performed for
the temporal clone.
The elements of the vector of findings and the case data4 associated with
node are updated to match the new state ordering.
Unless order is identical to the current ordering of the states of node, the domain (if node belongs to one) will be uncompiled.
Notice that states can only be reordered for labeled nodes (not numeric or
boolean nodes).
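For example, to reverse the state order of a labeled node with states "low", "medium", and "high" (the node handle is hypothetical):

h_string_t order[] = { "high", "medium", "low", NULL };

if (h_node_reorder_states (node, order) != 0)
    /* handle error */ ;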
6.6
State values
Similar to the above functions for dealing with state labels, we need functions for associating states with points or intervals on the real line. We
introduce a common set of functions for handling both of these purposes.
h_status_t h_node_set_state_value (h_node_t node, size_t s, h_double_t value)
4 If node has more than 32767 states, then state indexes ≤ 32767 in the case data that should be mapped (by order) to indexes > 32767 are instead set to "missing". The reason is that state indexes in case data are stored as signed 2-byte quantities — see Section 12.1.
When node is used with the table generator facility, the state values must
form an increasing sequence.
For numbered nodes, value indicates the specific number to be associated
with the specified state.
For interval nodes, the values specified for state i and state i + 1 are the left and right endpoints of the interval denoted by state i (the dividing point between two adjacent intervals is taken to belong to the interval to the right of the point, except when the first interval has zero width — see below). To indicate the right endpoint of the rightmost interval, specify s equal to the number of states of node.
To specify (semi-)infinite intervals, the constant h_infinity is defined. The negative of this constant may be specified for the left endpoint of the first interval, and the positive of this constant may be specified for the right endpoint of the last interval.
As a special case, the left and right endpoints can be equal. Such so-called
zero-width intervals can be used to express distributions that are combinations of discrete and continuous distributions. In this case, both the left and
right endpoints belong to the interval (so the next interval, if any, is open
at the left end).
h_double_t h_node_get_state_value (h_node_t node, size_t s)

Return the value associated with state s for the numeric node node.
The following function provides the inverse functionality.
h_index_t h_node_get_state_index_from_value (h_node_t node, h_double_t value)
Return the index of the state of node matching the specified value:
If node is an interval node, the state index of the interval containing value is returned. If an error is detected (that is, if the state values of node do not form an increasing sequence), or no interval contains value, −1 is returned.
If node is a numbered node, the index of the state having value as the associated state value is returned. If no (unique) state is found, −1 is returned.
If node is not a numeric node (this is an error condition), −1 is returned.
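For example, the intervals [0, 1), [1, 5), and [5, ∞) for a three-state interval node X (a hypothetical handle) would be specified as follows:

h_node_set_state_value (X, 0, 0);           /* left endpoint of state 0  */
h_node_set_state_value (X, 1, 1);           /* boundary between 0 and 1  */
h_node_set_state_value (X, 2, 5);           /* boundary between 1 and 2  */
h_node_set_state_value (X, 3, h_infinity);  /* right endpoint of state 2 */

h_index_t s = h_node_get_state_index_from_value (X, 3.2);  /* s == 1 */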
6.7
Statistical distributions
This section defines the distributions that can be specified using the model
feature of HUGIN.
6.7.1
Continuous distributions
Normal A normal (Gaussian) distribution is specified using its mean µ and variance σ²: Normal(µ, σ²). The density is

    f(x) = 1/√(2πσ²) · exp(−(x − µ)²/(2σ²)),   −∞ < x < ∞, σ² > 0

LogNormal A log-normal distribution with parameters µ and σ²: X ∼ LogNormal(µ, σ²) means that log X ∼ Normal(µ, σ²); here σ² > 0, and the density is positive only for x > 0.

Gamma A gamma distribution with shape parameter a and scale parameter b has density

    f(x) = (x/b)^(a−1) e^(−x/b) / (b Γ(a)),   a > 0, b > 0, x ≥ 0

Beta A (four-parameter) beta distribution has density

    f(x) = (x − a)^(α−1) (b − x)^(β−1) / (B(α, β) (b − a)^(α+β−1)),   α > 0, β > 0, a ≤ x ≤ b

For the standard form (a = 0, b = 1), this reduces to

    f(x) = x^(α−1) (1 − x)^(β−1) / B(α, β),   α > 0, β > 0, 0 ≤ x ≤ 1

The normal and log-normal distributions are often specified using the standard deviation σ instead of the variance σ². To be consistent with the convention used for CG potentials, we have chosen to use the variance.
Weibull A Weibull distribution with shape parameter a and scale parameter b has density

    f(x) = (a/b) (x/b)^(a−1) e^(−(x/b)^a),   a > 0, b > 0, x ≥ 0

Exponential An exponential distribution with scale parameter b has density

    f(x) = (1/b) e^(−x/b),   b > 0, x ≥ 0

Uniform A uniform distribution on the interval [a, b] has density

    f(x) = 1/(b − a),   a ≤ x ≤ b, a < b

Triangular A triangular distribution is specified by a minimum a, a mode m, and a maximum b (a ≤ m ≤ b, a < b). The density is

    p_X(x) = 2(x − a) / ((b − a)(m − a))   for a ≤ x ≤ m (a < m)
    p_X(x) = 2(b − x) / ((b − a)(b − m))   for m ≤ x ≤ b (m < b)

This distribution is denoted Triangular(a, m, b).
PERT A PERT distribution is a beta distribution specified using the parameters a (min), m (mode), and b (max) (a < m < b), and an optional shape parameter λ (λ > 0). The mean of the beta distribution is assumed to be (a + λm + b)/(λ + 2), and from this assumption, formulas for computing the α and β parameters of the beta distribution can be derived:

    α = 1 + λ (m − a)/(b − a)   and   β = 1 + λ (b − m)/(b − a)
6.7.2
Discrete distributions
Binomial A binomial distribution with count n and probability parameter p:

    P(X = k) = (n choose k) p^k (1 − p)^(n−k),   k = 0, 1, . . . , n

Poisson A Poisson distribution with mean λ:

    P(X = k) = e^(−λ) λ^k / k!,   λ > 0, k = 0, 1, 2, . . .

NoisyOR A noisy-OR distribution over a boolean node with boolean parents b1, . . . , bn and inhibitor probabilities q1, . . . , qn:

    P(X = false | b1, . . . , bn) = ∏_{i : bi = true} qi
6.8
Generating tables
Normally, the user doesn't need to worry about generating tables from their corresponding models. This is automatically taken care of by the compilation, propagation, and reset-inference-engine operations (by calling the functions described below).
However, it may sometimes be desirable to generate a single table from its
model (for example, when deciding how to split a continuous range into
subintervals). This is done using the following function.
h_status_t h_node_generate_table (h_node_t node)

Generate the table of node from its model.
A state label (of a labeled node), a state value, the number of states,
or the subtype of node (if node is a discrete node) or one of its discrete
parents has changed.
The value of a parent linked to node by a functional link has (or might have) changed — provided the parent is used in the model of node.6 This includes the case where the parent has been replaced by another parent using h_node_switch_parent(34). If the value can't be computed, the table won't be generated (i.e., the h_node_generate_table operation fails).
A parent of node has been removed (provided the parent was used in
the model of node).7
If the operation fails, the contents of the table will be undefined. If a log-file has been specified (see h_domain_set_log_file(109)), then information about the computations (including reasons for failures) is written to the log-file.
Experience and fading tables (see Chapter 11) are not affected by h_node_generate_table.
Generation of tables is usually a static operation. That is, tables can be generated once prior to compilation and inference. But this is not always the case:8 If there are functional trails between nodes that are not real-valued function nodes, then tables might have to be generated during inference — see Section 10.2.
h_status_t h_domain_generate_tables (h_domain_t domain)

Generate tables for all relevant nodes of domain. This is done by calling h_node_generate_table for all nodes (except real-valued function nodes) having a model. Hence, the description of that function also applies here.
The operation is aborted if table generation fails for some node. This implies
that some tables may have been successfully generated, some may not have
been generated at all, and one table has been only partially generated.
The following function is identical to the above function, except that it operates on classes instead of domains.
h_status_t h_class_generate_tables (h_class_t class)

Generate tables for all relevant nodes of class. See the above description of h_domain_generate_tables for further details.
6 Parents that only appear within probability and aggregate expressions are ignored, because their values are not needed for the purpose of table generation.
7 Adding a parent, however, will not cause the table to be generated (because the contents of the table would not change).
8 This changed in HUGIN 7.7.
h_status_t h_class_set_log_file (h_class_t class, FILE *log_file)

Set the file to be used for logging by subsequent HUGIN API operations that apply to class. (Currently, only table generation information is written to log-files for classes.)
If log_file is NULL, no log will be produced. If log_file is not NULL, it must be a stdio text file, opened for writing (or appending). Writing is done sequentially (i.e., no seeking is done).
See also Section 7.4.
6.9
For a given interval of the parent (i.e., for a specific parent state configuration), we compute many probability distributions for the child, each distribution being obtained by instantiating the parent to a value in the interval
under consideration.9 The average of these distributions is used as the conditional probability distribution for the child given the parent is in the interval state considered. (For this scheme to work well, the intervals should be
chosen such that the discretised distributions corresponding to the chosen
points in the parent interval are not too different from each other.)
By default, 25 values are taken within each bounded interval of an interval
parent: The interval is divided into 25 subintervals, and the midpoints of
these subintervals are then used in the computations. A large number of
values gives high accuracy, and a small number of values results in fast
computations. The number of values used can be changed by the following
function.
h_status_t h_model_set_number_of_samples_per_interval (h_model_t model, size_t count)

Specify that count values should be sampled from each bounded interval of an interval parent when generating a table from model.
Note that this has no effect if the owner of model is a function node (of
any kind).
h_count_t h_model_get_number_of_samples_per_interval (h_model_t model)

Retrieve the count indicating the number of samples that would be used if a table were to be generated from model now.
Deterministic relationships
If the type of the expression for the parent state configuration under consideration is not distribution, then we have a deterministic relationship.
The expression must then evaluate to something that matches one of the
states of the child node. For labeled, boolean, and numbered nodes, the
value must match exactly one of the state values or labels. For interval
nodes, the value must belong to one of the intervals represented by the
states of the child node.
If one or more of the parents are of interval subtype, then a number of
samples (25 by default) within each (bounded) interval will be generated.
Each of these samples will result in a degenerate distribution (i.e., all
9 For semi-infinite intervals, only one value is used. This value is chosen to be close to the finite endpoint. Intervals that are infinite in both directions are discouraged — the behavior is unspecified.
probability mass will be assigned to a single state) for the child node. The
final distribution assigned to the child node is the average over all generated
distributions. This amounts to counting the number of times a given child
state appears when applying the deterministic relationship to the generated
samples.
If all samples within a given parent state interval map to the same child
state, then the resulting child distribution is independent of the number of
samples generated. It is recommended that the intervals be chosen such
that this is the case.
If this is not feasible, then the number of samples generated should be large
in order to compensate for the sampling errors. Typically, some of the child
states will have a frequency count one higher (or lower) than the ideal
count.
Example 6.5 Let X be an interval node having [0, 1) as one of its states (intervals).
Let Y be a child of X having [0, 1), [1, 2), and [2, 3) as some of its states. Assume
that Y is specified through the deterministic relation Y = 3X. If 25 samples for X
are taken within the interval [0, 1), then 8, 9, and 8 of the computed values will
fall in the intervals [0, 1), [1, 2), and [2, 3) of Y, respectively. Ideally, the frequency
counts should be the same, resulting in a uniform distribution over the three interval states.
Chapter 7
Compilation
Before a belief network or a LIMID can be used for inference, it must be
compiled.
This chapter describes functions to compile domains, control triangulations,
and to perform other related tasks, such as approximation and compression.
7.1
What is compilation?
(7) Finally, potentials are associated with the cliques and the links (the
separators) of each junction tree. These potentials are initialized from
the evidence and the conditional probability tables (and the policies
and the utility tables in the case of LIMIDs), using a sum-propagation
(see Section 10.1).
All steps, except the triangulation step, are quite straightforward. The triangulation problem is known to be NP-hard for all reasonable criteria of
optimality, so (especially for large networks) finding the optimal solution
is not always feasible. The HUGIN API provides several methods for finding triangulations: five heuristic methods based on local cost measures, and
a combined exact/heuristic method capable of minimizing the storage requirements (i.e., the sum of state space sizes) of the cliques of the triangulated graph, if sufficient computational resources are available.
Alternatively, a triangulation can be specified in the form of an elimination
sequence.
An elimination sequence is an ordered list containing each node of the graph exactly once. An elimination sequence ⟨v1, . . . , vn⟩ generates a triangulated graph from an undirected graph as follows: Complete the set of neighbors of v1 in the graph (i.e., for each pair of unconnected neighbors, add a fill-in edge between them). Then eliminate v1 from the graph (i.e., delete v1 and the edges incident to v1). Repeat this process for all nodes of the graph in the specified order. The input graph with the set of generated fill-in edges included is a triangulated graph.
The elimination sequence can be chosen arbitrarily, except for belief networks with both discrete and continuous nodes.
In order to ensure correct inference, the theory of CG belief networks (see
[8, 20, 23]) requires the continuous nodes to be eliminated before the discrete nodes.
Let ∆ denote the set of discrete nodes, and let Γ denote the set of continuous nodes. A valid elimination sequence must contain the nodes of Γ (in any order) followed by the nodes of ∆ (in any order).
Let x, y ∈ ∆. It is well-known that, for any valid elimination sequence, the following must hold for the corresponding triangulated graph: If between x and y there exists a path lying entirely in Γ (except for the end-points), then x and y are connected. If x and y are not connected in the moral graph, we say that x and y are connected by a necessary fill-in edge. Conversely, it can be shown that a triangulated graph with this property has a valid elimination sequence.
Let G be the moral graph extended with all necessary fill-in edges. The neighbors of a connected component of G[Γ] form a complete separator of G (unless there is exactly one connected component having all nodes of ∆ as neighbors). A maximal subgraph that does not have a complete separator
7.2
Compilation
h_status_t h_domain_compile (h_domain_t domain)
7.3
Triangulation
This is not always proportional to the actual storage requirements (measured in bytes) of the table. This is because the data elements can be of different types: The data elements associated with state configurations of discrete nodes are of type h_number_t, while the data elements associated with continuous nodes (such as mean and covariance values) are of type h_double_t. In a single-precision version of the HUGIN API, h_number_t is a 4-byte quantity, but in a double-precision version, it is an 8-byte quantity. In both versions, h_double_t is an 8-byte quantity.
h_status_t h_domain_set_initial_triangulation (h_domain_t domain, h_node_t *order)

Specify an initial triangulation to be used by the h_tm_total_weight triangulation method. The triangulation is specified in the form of the elimination sequence order (a NULL-terminated list of nodes containing each discrete and each continuous node of domain exactly once, and continuous nodes must precede discrete nodes — see Section 7.1).
If NULL is specified as the initial triangulation, then any previously specified
initial triangulation is removed. The initial triangulation is also removed if a
new (discrete or continuous) node is created within domain, or an existing
(discrete or continuous) node is deleted.
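A sketch of supplying a hand-crafted elimination sequence (the node handles are hypothetical, and the function name is taken from the description above; continuous nodes, if any, come first):

h_node_t order[] = { c1, c2, d1, d2, d3, NULL };  /* continuous first */

h_domain_set_initial_triangulation (d, order);
h_domain_triangulate (d, h_tm_total_weight);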
The number of separators. Some prime components have more minimal separators than the memory of a typical computer can hold. In order to handle such components, an upper bound on the number of separators can be
specified: If the search for minimal separators determines that more than
the specified maximum number of separators exist, then the component is
split using one of the separators already found.3 The fragments obtained
are then recursively triangulated.
Experience suggests that 100 000 is a good number to use as an upper bound
on the number of minimal separators.
h_status_t h_domain_set_max_number_of_separators (h_domain_t domain, size_t count)

Specify count as the maximum number of minimal separators to generate for the h_tm_total_weight triangulation method.

h_count_t h_domain_get_max_number_of_separators (h_domain_t domain)

Retrieve the current setting for domain of the maximum number of separators to generate for the h_tm_total_weight triangulation method. If an error occurs, a negative number is returned.
The size of separators. The algorithm by Berry et al [2] for finding minimal separators generates new separators from separators already found. Large separators are only useful to ensure that all separators are eventually found. If large separators are ignored (discarded), less storage is consumed, and
3 The separator is selected using a heuristic method that considers the cost of the separator and the size of the largest fragment generated when the component is split using the separator. The heuristic method used for this selection may change in a future version of the HUGIN API.
larger networks can be handled (without the need for splitting prime components into smaller graphs). The drawback is that finding all relevant separators (i.e., the small separators that are useful for generating triangulations) is no longer assured.
This storage (which is the largest part of the storage consumed by a typical compilation) is allocated by the h_domain_compile(103) function. However, the total size of the junction tree tables can be queried before this storage is allocated — see h_jt_get_total_size(117) and h_jt_get_total_cg_size(117).
7.4
It is possible to get a log of the actions taken by the compilation process (the
elimination order chosen, the fill-in edges created, the cliques, the junction
trees, etc.). Such a log is useful for debugging purposes (e.g., to find out
why the compiled version of the domain became so big).
h_status_t h_domain_set_log_file (h_domain_t domain, FILE *log_file)

Set the file to be used for logging by subsequent calls to HUGIN API functions.
If log_file is NULL, no log will be produced. If log_file is not NULL, it must be a stdio text file, opened for writing (or appending). Writing is done sequentially (i.e., no seeking is done).
Note that if a log is wanted, and (some of) the nodes (that are mentioned in the log) have not been assigned names, then names will automatically be assigned (through calls to the h_node_get_name(42) function).
Example 7.1 The following code fragment illustrates a typical compilation process.
h_domain_t d;
FILE *log;
...
log = fopen ("mydomain.log", "w");
h_domain_set_log_file (d, log);
h_domain_triangulate (d, h_tm_clique_weight);
h_domain_compile (d);
h_domain_set_log_file (d, NULL);
fclose (log);
A file (log) is opened for writing and assigned as log file to domain d. Next, triangulation, using the h_tm_clique_weight heuristic, is performed. Then the domain is compiled. When the compilation process has completed, the log file is closed. Note that further writing to the log file (by HUGIN API functions) is prevented by setting the log file of domain d to NULL.
In addition to the compilation and triangulation functions, the h_node_generate_table(95), h_domain_learn_structure(173), h_domain_learn_tables(176), and h_domain_learn_class_tables(179) functions also use the log file to report errors, warnings, and other information. HUGIN API functions that use h_node_generate_table internally (for example, the propagation operations call this function when tables need to be regenerated from their models) also write to the log file (if it is non-NULL).
7.5
Uncompilation
h_status_t h_domain_uncompile (h_domain_t domain)
7.6
Compression
Most of the memory consumed by a compiled domain is used for storing the
data of the clique and separator tables. Many of the entries of these tables
might be zero, reflecting the fact that these state combinations in the model
are impossible. Zeros in the junction tree tables arise from logical relations
within the model. Logical relations can be caused by deterministic nodes,
approximation, or propagation of evidence. To conserve memory, the data
elements with a value of zero can be removed, thereby making the tables
smaller. This process is called compression.
7.7
Approximation
The discrete part of a clique potential consists of a joint probability distribution over the set of state configurations of the discrete nodes of the clique.
4 Prior to HUGIN API 7.4, a 16-bit integer type was used for table indexes within the data structures of compressed domains. In version 7.4, this type was changed to a 32-bit integer type. This allows construction of compressed domains with much larger tables — at the cost of a larger overhead of the data structures needed to support table operations on compressed tables.
Example 7.2 Example 7.1 can be extended with approximation and compression
as follows.
h_domain_t d;
FILE *log;
...
log = fopen ("mydomain.log", "w");
h_domain_set_log_file (d, log);
h_domain_triangulate (d, h_tm_clique_weight);
h_domain_compile (d);
h_domain_set_log_file (d, NULL);
fclose (log);
h_domain_approximate (d, 1E-8);
h_domain_compress (d);
h_domain_save_as_kb (d, "mydomain.hkb", NULL);
Probability mass of weight up to 10⁻⁸ is removed from each clique of the compiled domain using approximation. Then the zero elements are removed from the clique potentials using compression. Finally, the domain is saved as an HKB file (this is necessary in order to use the compressed domain on another computer with insufficient memory to create the uncompressed version of the domain).
Chapter 8
Junction Trees
1 Actually, since functional links are ignored by the compilation process, the network can be connected and generate more than one junction tree.
8.1
Types
8.2
Junction trees
The HUGIN API provides a pair of functions to access the junction trees of a
triangulated domain.
h_clique_t h_jt_get_root (h_junction_tree_t jt)
Return the root of junction tree jt. If the junction tree is undirected (which
it is unless there are continuous nodes involved), this is just an arbitrarily
selected clique. If the junction tree is directed, a strong root (see [8, 20, 23])
is returned (there may be more than one of those). If an error is detected,
NULL is returned.
size_t h_jt_get_total_size (h_junction_tree_t jt)
Return the total size (i.e., the total number of discrete configurations) of all
clique and separator tables associated with junction tree jt.
Each discrete table configuration has a numeric quantity of type h_number_t associated with it. In a single-precision version of the HUGIN API, this is a 4-byte quantity. In a double-precision version, this is an 8-byte quantity.
Note that both probability and utility tables are counted (that is, the discrete
clique and separator configurations are counted twice if there are utility
potentials in the junction tree).
If an error occurs (e.g., the total size of all tables exceeds the maximum value of the size_t type), (size_t) −1 is returned.
size_t h_jt_get_total_cg_size (h_junction_tree_t jt)
Return the total CG size of all clique and separator tables associated with
junction tree jt. This counts the total number of CG data elements of all
tables. Each such data element occupies 8 bytes.
If the junction tree contains no CG nodes, the tables contain no CG data. In
this case (only), the function returns 0.
If an error occurs, (size_t) −1 is returned.
8.3
Cliques
Each clique corresponds to a maximal complete set of nodes in the triangulated graph. The members of such a set can be retrieved from the corresponding clique object by the following function.
h_node_t *h_clique_get_members (h_clique_t clique)
8.4
The h_jt_get_root(117) and h_clique_get_neighbors(118) functions can be used to traverse a junction tree in a recursive fashion.
Example 8.1 The following code outlines the structure of the DistributeEvidence
function used by the propagation algorithm (see [13] for further details).
void distribute_evidence (h_clique_t self, h_clique_t parent)
{
h_clique_t *neighbors = h_clique_get_neighbors (self);
if (parent != 0)
/* absorb from parent */ ;
for (h_clique_t *n = neighbors; *n != 0; n++)
if (*n != parent)
distribute_evidence (*n, self);
}
...
{
h_junction_tree_t jt;
...
distribute_evidence (h_jt_get_root (jt), 0);
...
}
The parent argument of distribute evidence indicates the origin of the invocation;
this is used to avoid calling the caller.
Chapter 9
Evidence and Beliefs
9.1 Evidence

9.1.1 Discrete evidence
Initially, before any evidence has been entered, all finding vectors consist of
1-elements only. Such evidence is termed vacuous.
If the finding vector for a node has exactly one positive element, the node is said to be instantiated. The function h_node_select_state(121) instantiates a node to a specific state (using 1 as the finding value of the specified state).
In general, specifying 0 as the finding value of a state declares the state to be impossible. All finding vectors must have at least one positive element. If not, inference will fail with an impossible evidence error code: h_error_inconsistency_or_underflow(136).
If a finding vector has two (or more) 1-elements, and the remaining elements are 0, we call the finding vector a multi-state finding.
If a finding vector has at least one element that is different from 0 and from 1, the finding vector is called a likelihood. The following examples illustrate the use of likelihoods.
Example 9.1 Let A be the node that we wish to enter likelihood evidence for. Now,
suppose we add a new node B as a child of A and specify the conditional probability
table P(B|A) as follows:
          a1     a2
    b1    0.3    0.4
    b2    0.7    0.6

          a1     a2
    b1    0.9    0.1
    b2    0.1    0.9
An instantiated node is said to have hard evidence. All other types of (nonvacuous) evidence are called soft evidence.
9.1.2
Continuous evidence
Evidence for continuous nodes always takes the form of a statement that a node is known to have a specific value. Such evidence is entered using the h_node_enter_value(122) function.
This type of evidence is an example of hard evidence.
9.1.3
Evidence in LIMIDs
Decision nodes can only have hard evidence (and the finding value must
be 1). In addition, a chance node with evidence must not have a decision node without evidence as an ancestor in the network obtained by ignoring information links (i.e., links pointing at decision nodes). Such an evidence scenario would amount to observing the consequences of a decision
before the decision is made, and an attempt to perform inference given
such evidence fails with an invalid evidence error code: h error invalid
evidence(137) .
9.2
Entering evidence
The functions described in this section can be used to enter evidence for
a given set of nodes (one node at a time). It is also possible to load the
evidence for all nodes at once, when the evidence is stored in a case file
(see h domain parse case(132) ) or as a case in main memory (see h domain
enter case(168) ).
The following function handles evidence taking the form of instantiations of
discrete variables.
h status t h node select state (h node t node, size t state)
Select state of node (which must be a discrete node). This is equivalent to specifying 1 as the finding value for state and 0 as the finding value for all other states of node.
If the evidence is not a simple instantiation, then the function h node enter
finding should be called, once for each state of the node, specifying the
finding value for the state.
h status t h node enter finding
(h node t node, size t state, h number t value)
Specify value as the finding value for state of node (which can be any discrete
node1 ). The finding value must be nonnegative, and state must specify a
valid state of node.
If you have several independent observations to be presented as likelihoods
to HUGIN for the same node, you have to multiply them yourself; each call
to h node enter finding overrides the previous finding value stored for the
indicated state. The h node get entered finding(129) function can be conveniently used for the accumulation of a set of likelihoods.
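This accumulation scheme can be sketched as follows (a sketch only; the likelihood array is a hypothetical vector holding one nonnegative value per state):

/* Multiply a new, independent likelihood vector into the findings
   already entered for a discrete node. Since finding vectors are
   initially all-1 (vacuous), this also works for the first
   observation. */
h_status_t multiply_likelihood
    (h_node_t node, const h_number_t *likelihood, size_t state_count)
{
    size_t i;

    for (i = 0; i < state_count; i++)
    {
        h_number_t old = h_node_get_entered_finding (node, i);

        if (h_node_enter_finding (node, i, old * likelihood[i]) != 0)
            return 1; /* inspect h_error_code () for details */
    }

    return 0;
}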
To specify evidence for a continuous node, the following function must be
used.
h status t h node enter value (h node t node, h double t value)
Specify that the continuous node node has the value value.
Note that inference is not automatically performed when evidence is entered
(not even when the domain is compiled). To get the updated beliefs, you
must explicitly call a propagation function (see Section 10.2).
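For example (a minimal sketch, with d a compiled domain and g a continuous node; the propagation arguments are described in Chapter 10):

h_node_enter_value (g, -1.4); /* evidence only; beliefs unchanged */
h_domain_propagate (d, h_equilibrium_sum, h_mode_normal);
/* updated beliefs can now be retrieved */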
9.3
Retracting evidence
Note that the h node set number of states(38) function deletes evidence when the number of states
of a (discrete) node is changed.
Example 9.4 The code
...
d = h_kb_load_domain ("mydomain.hkb", NULL);
n = h_domain_get_node_by_name (d, "input");
h_node_select_state (n, 0);
...
h_node_retract_findings (n);
enters the observation that the discrete node input is in state 0; later, that observation is retracted, returning the node input to its initial status.
9.4
Determining independence properties
This criterion forms the basis of the algorithm in the HUGIN API for determining independence properties.
The following functions determine the (maximal) sets of nodes that are
d-connected to (respectively d-separated from) the specified source nodes
given evidence nodes.
9.5
Retrieving beliefs
When the domain has been compiled (see Chapter 7) and the evidence propagated (see Chapter 10), the calculated beliefs can be retrieved using the
functions described below.
For continuous nodes, the beliefs computed take the form of the mean and
variance of the distribution of the node given the evidence.
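The following sketch prints both kinds of beliefs (it assumes the accessor names h node get belief, h node get mean, h node get variance, and h node get number of states, and a compiled, propagated domain):

#include <stdio.h>

void print_beliefs (h_node_t discrete, h_node_t cg)
{
    size_t i, n = h_node_get_number_of_states (discrete);

    /* discrete node: one belief per state */
    for (i = 0; i < n; i++)
        printf ("P(state %lu) = %g\n",
                (unsigned long) i,
                (double) h_node_get_belief (discrete, i));

    /* continuous (CG) node: mean and variance given the evidence */
    printf ("mean = %g, variance = %g\n",
            (double) h_node_get_mean (cg),
            (double) h_node_get_variance (cg));
}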
Compute the marginal table for the specified list nodes of nodes with respect
to the (imaginary) joint potential, determined by the current potentials on
the junction tree(s) of domain. The nodes must be distinct discrete or continuous nodes, and they must belong to domain.2 If the nodes list contains
continuous nodes, they must be last in the list. This operation is not allowed
on compressed domains. If an error occurs, a NULL pointer is returned.
The fact that the marginal is computed based on the current junction tree
potentials implies that the equilibrium and evidence incorporation mode
(see Section 10.1) for the marginal will be as specified in the propagation
that produced the current junction tree potentials.
If the nodes list contains continuous nodes, the marginal will in general be
a so-called weak marginal [8, 20, 23]. This means that only the means and
the (co)variances are computed, not the full distribution. In other words,
the marginal is not necessarily a multi-variate normal distribution with the
indicated means and (co)variances (in general, it is a mixture of such distributions). Also note that if the discrete probability is zero, then the mean and
(co)variances are essentially random numbers (the inference engine doesn't bother computing zero components of a distribution).
The table returned by h domain get marginal is owned by the application,
and it is the responsibility of the application to deallocate it (using h table
delete(73) ) after use.
See Chapter 5 for information on how to manipulate h table t objects.
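As a usage sketch (a and b are hypothetical discrete nodes of domain d):

h_node_t pair[3] = { a, b, NULL }; /* NULL-terminated node list */
h_table_t t = h_domain_get_marginal (d, pair);

if (t != NULL)
{
    h_number_t *data = h_table_get_data (t);
    size_t size = h_table_get_size (t);

    /* ... inspect the size entries of data ... */

    h_table_delete (t); /* the application owns the table */
}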
h table t h node get distribution (h node t node)
This function computes the distribution for the CG node node. No value
must be propagated for node. If an error occurs, a NULL pointer is returned.
The distribution for a CG node is in general a mixture of several Gaussian
distributions. What h node get distribution really computes is a joint distribution for node and a set of discrete nodes. The set of discrete nodes is
chosen such that the computed marginal is a strong marginal [8, 20, 23],
but the set is not necessarily minimal.
As is the case for the h domain get marginal function, the means and variances corresponding to zero probability components are arbitrary numbers.
2 In the current implementation, all nodes must belong to the same junction tree.
9.6
Retrieving expected utilities
In LIMIDs, we will want to retrieve the expected utilities associated with the
states of a node (usually a decision node). We might also be interested in
the overall expected utility of a decision problem.
First, evidence must be entered and propagated. Then, the functions below
can be used to retrieve expected utilities.
For both functions, a negative value is returned if an error occurs. But this
is of little use for error detection, since any real value is (in general) a valid
utility. Thus, errors must be checked for using the h error code(18) function.
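A sketch of that pattern (h node get expected utility is an assumed accessor name, since the prototypes are not reproduced here):

h_number_t eu = h_node_get_expected_utility (decision, 0);

if (h_error_code () != 0)
{
    /* handle the error; eu is not a valid expected utility */
}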
9.7
Computing values of function nodes
The results of inference can be used as input to the functions associated with
real-valued function nodes.
9.8
Examining the evidence
The HUGIN API provides functions to access the evidence currently entered
to the nodes of a domain. Functions to determine the type of evidence (nonvacuous or likelihood) are also provided.
The node argument of the functions described below must be a discrete or a
continuous node. The functions having propagated in their names require
the underlying domain to be compiled. The functions having entered in
their names do not.
Retrieve the value that has been propagated for the continuous node node.
If no value has been propagated, a usage error code is set and a negative
number is returned. However, since a negative number is a valid value for
a continuous node, checking for errors must be done using h error code(18)
and friends.
h boolean t h node evidence is entered (h node t node)
Is the evidence potential currently entered for node (which must be a discrete or a continuous node) nonvacuous?
h boolean t h node evidence is propagated (h node t node)
Is the evidence potential for node (which must be a discrete or a continuous node), incorporated within the current junction tree potentials, nonvacuous?
h boolean t h node likelihood is propagated (h node t node)
Is the evidence potential for node (which must be a discrete or a continuous node), incorporated within the current junction tree potentials, a likelihood?
Note: If node is a continuous node, false (zero) is returned.
9.9
Case files
When evidence has been entered to a set of nodes, it can be saved to a file.
Such a file is known as a case file. The HUGIN API provides functions for
reading and writing case files.
A case file is a text file. The format (i.e., syntax) of a case file can be described by the following grammar.
⟨Case file⟩ → ⟨Node finding⟩*
⟨Node finding⟩ → ⟨Node name⟩ : ⟨Value⟩
⟨Value⟩ → ⟨State index⟩ | ⟨Label⟩ | ⟨Real number⟩ | ⟨Likelihood⟩ | true | false
⟨State index⟩ → # ⟨Integer⟩
⟨Likelihood⟩ → ( ⟨Nonnegative real number⟩* )
where:
⟨State index⟩ is a valid specification for all discrete nodes. The state index is interpreted as if specified as the last argument to h node select state(121) for the named node.
⟨Likelihood⟩ is also a valid specification for all discrete nodes. A nonnegative real number must be specified for each state of the named node (and at least one of the numbers must be positive).
⟨Real number⟩ is a valid specification for CG, numbered, and interval nodes. For numbered and interval nodes, the acceptable values are defined by the state values of the named node.
⟨Label⟩ is a valid specification for labeled nodes. The label (a double-quoted string) must match a unique state label of the named node. If the contents of the string conform to the definition of a name (see Section 13.8), the quotes can be omitted.
true and false are valid specifications for boolean nodes.
Comments can be included in the file. Comments are specified using the % character and extend to the end of the line. Comments are ignored by the
case file parser.
Example 9.7 The following case file demonstrates the different ways to specify
evidence: A, B, and C are labeled nodes with states yes and no; D is a boolean
node; E is a numbered node; F is an interval node; and G is a CG node.
A: "yes"
B: #1          % equivalent to "no"
C: (.3 1.2)    % likelihood
D: true
E: 2
F: 3.5
G: -1.4
Because yes is a valid name, the finding for A can instead be specified simply as:
A: yes
Chapter 10
Inference
When evidence has been entered and the domain has been compiled, we
want to compute revised beliefs for the nodes of the domain. This process
is called inference. In HUGIN, this is done by a two-pass propagation operation on the junction tree(s). The two passes are known as CollectEvidence
and DistributeEvidence, respectively. The CollectEvidence operation proceeds
inwards from the leaves of the junction tree to a root clique, which has been
selected in advance. The DistributeEvidence operation proceeds outwards
from the root to the leaves.
This inference scheme is described in many places. See, for example, [8, 10,
13, 14, 15, 20].
10.1
Propagation methods
10.1.1
It turns out that both kinds of marginals can be computed by the collect/distribute propagation method by a simple parametrization of the marginalization method.
When a propagation has been successfully completed, we have a situation
where the potentials on the cliques and the separators of the junction tree
are consistent, meaning that the marginal on a set S of variables can be
computed from the potential of any clique or separator containing S. We
also say that we have established equilibrium on the junction tree. The
equilibria, discussed above, are called sum-equilibrium and max-equilibrium,
respectively.
The HUGIN API introduces an enumeration type to represent the equilibrium. This type is called h equilibrium t. The values of this type are denoted by h equilibrium sum and h equilibrium max.
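The equilibrium value is simply passed to the propagation functions (a sketch; see Section 10.2 for h domain propagate and the evidence incorporation modes):

h_domain_propagate (d, h_equilibrium_sum, h_mode_normal); /* sum-equilibrium */
h_domain_propagate (d, h_equilibrium_max, h_mode_normal); /* max-equilibrium */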
10.1.2
10.2
Propagation
the conditional probabilities corresponding to a given parent state configuration sum to 1). Note that the initial propagation (i.e., the one performed
by the compilation operation) considers all tables to be new.
Inference is performed separately for each junction tree. However, if there
is a functional trail from a node in one junction tree T1 to a node in another
junction tree T2 , then inference in T1 is performed before inference in T2 .
Since the purpose of such trails is usually to let the probability potentials
of some nodes in T1 depend on the values of some nodes in T2 , tables must
usually be regenerated between the individual inference operations.
In general, h domain propagate does not perform any unnecessary work. For
example, if the current evidence is a superset of the evidence propagated in
the most recent propagation (and the equilibrium and evidence incorporation mode of that propagation was equal to equilibrium and evidence mode,
and the set of node tables is unchanged), then an incremental propagation
is performed using d-separation analysis (see Section 9.4) to reduce the
extent of the DistributeEvidence pass.
However, if some currently propagated evidence has been retracted (and the
current evidence incorporation mode is normal), then the new equilibrium
must be computed either from scratch using the node tables or using a memory backup of the junction tree tables (see h domain save to memory(142) ).
Using a memory backup can significantly speed up the propagation in this
case.
If an error is detected during the propagation, the initial (that is, with no
evidence incorporated) distribution is established. This might fail, and if it
does, subsequent requests for beliefs and other quantities computed by a
propagation will fail.
The set of evidence is never changed by any of the propagation functions.
If the propagation fails, the error code (in addition to the general error
conditions such as usage and out-of-memory) can be:
h error inconsistency or underflow Some probability potential has degenerated into a zero-potential (i.e., with all values equal to zero). This
is almost always due to evidence that is considered impossible (i.e.,
its probability is zero) by the domain model. (In theory, it can also
be caused by underflow in the propagation process, but this rarely
happens in practice.)
h error overflow Overflow (caused by operations the purpose of which was
to avoid underflow) has occurred in the propagation process. This is
an unlikely error (but it is not entirely impossible: it can occur in very large naive Bayes models, for example). The error never occurs
if only one item of evidence is propagated at a time.
h error fast retraction A fast-retraction propagation has failed due to logical relations within the domain model.
h error invalid evidence (This only applies to LIMIDs.) The evidence scenario is invalid; see Section 9.1.3.
It is also possible to perform inference on individual junction trees:
h status t h jt propagate
(h junction tree t jt, h equilibrium t equilibrium,
h evidence mode t evidence mode)
The meanings (including constraints) of the arguments and return value are
similar to the meanings and constraints of the arguments and return value
of the h domain propagate(135) function.
This is because the computations take all existing observations (in addition
to future observations specified as parents of decision nodes) into account
when policies are computed.
10.4
Conflict of evidence
10.5
The normalization constant
The final step of the collection phase of a propagation operation is to normalize the root clique potential.1 For a sum-propagation, the normalization constant of this potential equals the probability P(E) of the propagated evidence E. For a max-propagation, the normalization constant equals the probability P(x_E) of the most probable configuration x_E consistent with the propagated evidence E.
This information is useful in many applications, so the HUGIN API provides
functions to access the normalization constant and its logarithm:
1 In order to avoid underflow, local normalizations are performed in the separators as part of CollectEvidence. The normalization constant also includes the constants used in the local normalizations.
2 After a failed propagation operation, h domain get normalization constant returns 1.
constant can be used (this function returns the correct value for all successful propagations).
If approximation is used, the normalization constant should be compared to
the error introduced by the approximation process see h domain get approximation constant(113) . If the probability of the evidence is smaller than
the approximation error, propagation within the original (unapproximated)
model should be considered (in order to get more accurate answers). See
also Section 7.7.
If likelihood evidence has been propagated, the normalization constant cannot, in general, be interpreted as a probability. As an example, consider
a binary variable: The likelihoods ⟨1/2, 1⟩ and ⟨1, 2⟩ yield the same beliefs,
but not the same normalization constant. However, see Example 9.1 and
Example 9.2 for cases where it makes sense to interpret the normalization
constant as a probability.
If CG evidence has been propagated, then the normalization constant is proportional to the density at the observed values of the continuous nodes (the
proportionality constant is the conditional probability of the discrete evidence given the CG evidence). The density depends directly on the scale
chosen for the continuous variables: Suppose that the scale of some continuous variable is changed from centimeter [cm] to millimeter [mm]. This
causes the density values for that variable to be reduced by a factor of 10.
Hence, the normalization constant should only be used to compare different
sets of findings.
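For example (a sketch; the name of the logarithm accessor is assumed):

h_domain_propagate (d, h_equilibrium_sum, h_mode_normal);

/* valid only after a successful sum-propagation */
double p = (double) h_domain_get_normalization_constant (d);
double log_p = (double) h_domain_get_log_normalization_constant (d); /* assumed name */

printf ("P(evidence) = %g (log: %g)\n", p, log_p);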
10.6
{
    h_domain_initialize (d);
    done = perform_experiment (d);
}
...
10.7
Querying the state of the inference engine
The HUGIN API provides several functions that enable the application to
determine the exact state of the inference engine. The following queries can
be answered:
Which type of equilibrium is the junction tree(s) currently in?
Which evidence incorporation mode was used to obtain the equilibrium?
Has evidence been propagated?
Has any likelihood evidence been propagated?
Is there any unpropagated evidence?
DBN: The functions described below for testing propagated evidence only
consider evidence for nodes in the time window. Evidence that has been
moved out of the time window using h domain move dbn window(66) is ignored!
h boolean t h jt equilibrium is
(h junction tree t jt, h equilibrium t equilibrium)
Test if the equilibrium of junction tree jt is equilibrium.
Similar to h domain tables to propagate, but specific to the junction tree jt.
10.8
Simulation
Often, we are interested in generating (sampling) configurations (i.e., vectors of values over the set of variables in the network) with respect to the
conditional distribution given the evidence.
h status t h domain simulate (h domain t domain)
Generate (sample) a configuration with respect to the conditional distribution given the evidence.
h index t h node get sampled state (h node t node)
Retrieve the state index of node (which must be a discrete node) within the
configuration generated by the most recent call to h domain simulate. If an
error occurs, a negative number is returned.
h double t h node get sampled value (h node t node)
If node is a continuous node, then the value of node within the configuration
generated by the most recent call to h domain simulate is returned.
If node is a real-valued function node, the function associated with node is evaluated using the configuration generated by h domain simulate as input4 (that is, if the function refers to a parent in an expression, then the sampled value of that parent is used in the evaluation), and the result is returned.
If an error occurs, a negative number is returned. However, since negative numbers can be valid in both cases, error conditions must be checked for using h error code(18) and related functions.
4 In order to avoid returning invalid values, simulation results are automatically invalidated when an uncompile operation is performed. It is an error to request values derived from invalid simulation results.
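The typical usage pattern is (a sketch, with n a discrete and c a continuous node of the compiled domain d):

if (h_domain_simulate (d) == 0)
{
    h_index_t s = h_node_get_sampled_state (n);
    h_double_t v = h_node_get_sampled_value (c);

    /* check h_error_code () before trusting s and v */
}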
10.9
Value of information analysis
Consider the situation where a decision maker has to make a decision based on the probability distribution of a hypothesis variable. It could, for instance, be a physician deciding on a treatment of a patient given the probability distribution of a disease variable: if the probability of the patient suffering from the disease is above a certain threshold, then the patient should be treated immediately. Prior to deciding on a treatment, the
physician may have the option to gather additional information about the
patient such as performing a test or asking a certain question. Given a range
of options, which option should the physician choose next? That is, which
of the given options will produce the most information? These questions
can be answered by a value of information analysis.
Given a Bayesian network model and a hypothesis variable, the task is to
identify the variable which is most informative with respect to the hypothesis variable.
Entropy and Mutual Information
The main reason for acquiring additional information is to reduce the uncertainty about the hypothesis under consideration. The selection of the variable to observe next (for example, the question to ask next) can be based
on the notion of entropy. Entropy is a measure of how much the probability
mass is scattered around on the states of a variable (the degree of chaos in
the distribution of the variable). In other words, entropy is a measure of
randomness. The more random a variable is, the higher its entropy will be.
The entropy H(X) of a discrete random variable X is defined as follows:

H(X) = − ∑_x p(x) log p(x)

The mutual information I(X; Y) of X and Y measures the expected reduction in the entropy of X caused by observing Y; it is defined as follows:

I(X; Y) = ∑_{x,y} p(x, y) log ( p(x, y) / ( p(x) p(y) ) )
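The entropy of a node's posterior distribution can be computed directly from its beliefs, as in this sketch (it assumes a compiled, propagated domain and the accessor names h node get belief and h node get number of states):

#include <math.h>

double node_entropy (h_node_t node)
{
    double h = 0.0;
    size_t i, n = h_node_get_number_of_states (node);

    for (i = 0; i < n; i++)
    {
        double p = (double) h_node_get_belief (node, i);

        if (p > 0.0)
            h -= p * log (p); /* zero-probability states contribute 0 */
    }

    return h;
}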
10.10
Sensitivity analysis
Often there are one or more nodes (and associated beliefs) in a belief network that are regarded as outputs of the model. For example, the probability (belief) of a disease in a medical domain model.
In order to improve the robustness of the belief network model, it should
be investigated how sensitive the output probabilities are to variations in the
numerical parameters of the model (because these numerical parameters
are often imprecisely specified). The most influential parameters should be identified, and effort should be directed towards reducing the imprecision of those parameters.
The process of identifying the most influential parameters of a belief network model and analyzing their effects on the output probabilities of the
model is known as sensitivity analysis [5, 6].
Let A be a (discrete chance) node in a belief network model being subjected
to sensitivity analysis, and let a be a state of A. For given evidence E, the
probability P(A = a|E) can be considered as a function of the conditional
probabilities in the CPTs of the chance nodes in the network. If the network
is a LIMID, P(A = a|E) can also be considered as a function of the conditional probabilities in the policies of the decision nodes.
Let B be a (discrete) node in the belief network, and let x be a conditional probability parameter associated with B; we shall refer to x as an input parameter. This parameter corresponds to some state b of B and some configuration π of the parents of B.
We wish to express P(A = a|E) as a function of x. When x varies, the other conditional probabilities associated with B for parent configuration π must also vary (in order to satisfy the sum-to-1 constraint). The most common assumption is that the other conditional probabilities vary according to a proportional scheme:

P(b′|π) = x   if b′ = b
P(b′|π) = (1 − x) θ_{b′|π} / (1 − θ_{b|π})   if b′ ≠ b

Here, θ_{b|π} is the initial (user-specified) value of x (and similarly for the other input parameters).
Under this assumption (and also assuming that there are no functional trails from B to A), it can be shown that the probability of the evidence E is a linear function of x:

P(E)(x) = γx + δ    (10.1)

Likewise, the joint probability of the hypothesis and the evidence is a linear function of x:

P(A = a, E)(x) = αx + β    (10.2)

We then conclude:

P(A = a|E)(x) = (αx + β) / (γx + δ)    (10.3)
Note that the constants are scaled to avoid floating-point underflow: The scaling factor is P(E)^−1 (assuming the user-specified values for all input parameters). Also, if α = pγ and β = pδ for some constant p, then the function might return α = γ = 0, β = p, and δ = 1. This is the case if the specified input parameter is not associated with a node in the sensitivity set (see below).
Example 10.3 This example uses the chest clinic network [25]. We shall determine the probability of lung cancer (L) given shortness of breath (D) as a function
of the prior probability of the patient being a smoker (S).
h_domain_t d = h_kb_load_domain ("asia.hkb", NULL);
h_number_t alpha, beta, gamma, delta;
h_node_t L, D, S;
... /* code to find the nodes */
h_domain_compile (d);
h_node_select_state (D, 0); /* set D to "yes" */
h_domain_propagate (d, h_equilibrium_sum, h_mode_normal);
h_node_compute_sensitivity_data (L, 0);
h_node_get_sensitivity_constants
(S, 0, &alpha, &beta, &gamma, &delta);
printf ("P(L=yes|D=yes)(x)"
" = (%g * x + %g) / (%g * x + %g)\n",
alpha, beta, gamma, delta);
This code produces the following output (reformatted to fit the width of the page):
P(L=yes|D=yes)(x) = (0.170654 * x + 0.0174324)
/ (0.535988 * x + 0.732006)
In this output, x refers to the prior probability of the patient being a smoker.
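The four constants can be plugged directly into equation (10.3); for example:

/* the sensitivity function computed in Example 10.3 */
double sensitivity (double x)
{
    return (0.170654 * x + 0.0174324) / (0.535988 * x + 0.732006);
}

/* sensitivity (0.5) is P(L=yes|D=yes) when the prior probability
   of the patient being a smoker is set to 0.5 */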
In many cases, only a small subset of the input parameters can influence the
output probability P(A = a|E). The set of nodes to which these input parameters are associated is known as the sensitivity set for A given E [6]. This set
of nodes can be identified using d-separation analyses (Section 9.4): Suppose that we add an extra parent X to node B. If X is d-connected to A
given E, then B belongs to the sensitivity set.
The h node compute sensitivity data(150) function identifies the sensitivity
set for the specified output probability given the available evidence. The
following function can be used to retrieve the sensitivity set.
Parameter tuning
In order to tune the behavior of a belief network, we often need to impose
constraints on the output probabilities of the network.
We recall the equation for the output probability P(A = a|E) as a function of
some input parameter x:
P(A = a|E)(x) =
x +
.
x +
Using this equation, we can find the values of x that satisfy a given constraint. In [5], the following types of constraints are considered:

P(A = a|E) − P(A′ = a′|E) ≥ ε    (DIFFERENCE)
P(A = a|E) / P(A′ = a′|E) ≥ ε    (RATIO)
P(A = a|E) ≥ ε    (SIMPLE)

where P(A = a|E) and P(A′ = a′|E) are output probabilities (referring to the same variable or to two different variables), and ε is a given bound. The relation can also be ≤.
If x is associated with a node belonging to the sensitivity sets of both output probabilities, the sensitivity functions (as computed by h node compute sensitivity data(150) or h domain compute sensitivity data(153)) are:

P(A = a|E)(x) = (α1 x + β1) / (γx + δ)

and

P(A′ = a′|E)(x) = (α2 x + β2) / (γx + δ)

so the DIFFERENCE constraint becomes:

( (α1 − α2)x + β1 − β2 ) / (γx + δ) ≥ ε

If x is not associated with a node belonging to the sensitivity sets of both output probabilities, then at least one of the probabilities is a constant. If both are constants, the solution is trivial. If only one is a constant, the DIFFERENCE constraint reduces to a SIMPLE constraint.

For the RATIO constraint, we get:

P(A = a|E) / P(A′ = a′|E) ≥ ε  ⟺  ( (α1 − εα2)x + β1 − εβ2 ) / ( α2 x + β2 ) ≥ 0
See h node get sensitivity constants(150) for the remaining usage conditions
and the semantics of this function.
Note: h node get sensitivity constants is implemented in terms of this function.
10.11
Finding the most probable configurations
Steffen Lauritzen has proposed a Monte Carlo algorithm for finding the most
probable configurations of a set of discrete nodes given evidence on some of
the remaining nodes.
Let A be the set of nodes for which we seek the most probable configurations. The goal is to identify all configurations with probability at least pmin .
The algorithm can be outlined as follows.
A sequence of configurations of A is sampled from the junction tree.
[Configurations of A are obtained by ignoring the sampled values of
the remaining nodes.] Repetitions are discarded.
The probability of each (unique) configuration in the sequence is computed. Let ptotal be the total probability of all (unique) configurations.
If 1 − ptotal < pmin, the algorithm terminates.
The most probable configuration is also known as the maximum a posteriori
(MAP) configuration. The above algorithm thus solves a more general form
of the MAP configuration problem.
The basic assumption of the algorithm is that most of the probability mass
is represented by a small set of configurations. If this is not the case, the
algorithm could run for a long time.
If the size of A is small, it is probably more efficient to compute the joint
probability table of A using h domain get marginal(126) . From this table, it
is easy to identify the most probable configurations.
h status t h domain find map configurations
(h domain t domain, h node t *nodes, h double t pmin)
Find the configurations of nodes with probability at least pmin, given the evidence and the current junction tree potentials of
domain. The domain must be compiled, and the distribution on the junction
tree(s) must be in sum-equilibrium with evidence incorporated in normal
mode (see Section 10.1).
The current implementation imposes the following additional restrictions.5
LIMIDs are not supported (that is, domain must not contain decisions).
Evidence must not be specified for any of the nodes in nodes.
The junction tree potentials must be up-to-date with respect to the
evidence, the CPTs and their models (if any).
The network must not contain functional trails between nodes that are
not real-valued function nodes.
The probability pmin should be reasonable (like 0.01 or higher). Otherwise,
the algorithm could run for a long time.
The results of a call to h domain find map configurations are retrieved using
the functions described below. The results remain available until domain is
uncompiled.
h count t h domain get number of map configurations
(h domain t domain)
This function returns the number of configurations found by the most recent
successful call to h domain find map configurations (with domain as argument). If no such call has been made (or the results of the call are no longer
available), a negative number is returned.
Let n be the number of configurations with probability at least pmin as specified in the call to h domain find map configurations. The configurations are identified by integer indexes 0, . . . , n − 1, where index 0 identifies the most probable configuration, index 1 identifies the second-most probable configuration, . . . , and index n − 1 identifies the least probable of the n configurations.
size t *h domain get map configuration
(h domain t domain, size t index)
This function returns the configuration identified by index among the configurations with probability at least pmin as specified in the most recent
successful call to h domain find map configurations (as explained above). If
an error occurs, NULL is returned.
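A retrieval loop might look as follows (a sketch; d is a domain on which h domain find map configurations has completed successfully):

h_count_t i, n = h_domain_get_number_of_map_configurations (d);

for (i = 0; i < n; i++)
{
    size_t *conf = h_domain_get_map_configuration (d, (size_t) i);

    /* conf[j] is the state index of the j-th node of the nodes
       list passed to h_domain_find_map_configurations */
}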
5 An alternative implementation is under consideration. This implementation might impose a different set of restrictions (e.g., requiring that domain not be compressed, but removing the other additional restrictions).
10.12
Explanation
The explanation facilities in the HUGIN API consist of computing the impact
of subsets of the given evidence on a single hypothesis [18, section 10.1.3]
or on one hypothesis versus an alternative hypothesis [18, section 10.1.4].
Impact of evidence subsets on a single hypothesis. We wish to investigate how different subsets of the evidence support (or do not support) a given hypothesis. Let E be the set of evidence (a single piece of evidence e ∈ E consists of evidence on a unique variable), and let X = x be the hypothesis (where X is a discrete variable, and x is a state of X).
We compute the impact score of a subset E′ of the evidence on the hypothesis X = x as the normalized likelihood of the hypothesis given the evidence E′:

P(E′|X = x) / P(E′) = P(X = x|E′) / P(X = x)

We assume P(E′) > 0 and P(X = x) > 0.
The normalized likelihood is a measure of the impact of the evidence on the
hypothesis. By comparing the normalized likelihoods of different subsets of
the evidence, we compare the impacts of the subsets on the hypothesis.
Discrimination of competing hypotheses. We wish to investigate how different subsets of the evidence support (or do not support) different (competing) hypotheses. Let E be the set of evidence (a single piece of evidence e ∈ E consists of evidence on a unique variable), and let X = x and Y = y be the competing hypotheses (where X and Y are discrete variables, and x and y are states of X and Y, respectively).
We shall use the Bayes factor (or Bayesian likelihood ratio) B as the impact score of a subset E′ of the evidence on the (primary) hypothesis X = x versus
Chapter 11
Sequential Updating of
Conditional Probability Tables
This chapter describes the facilities for using data to sequentially update the
conditional probability tables for a domain when the graphical structure and
an initial specification of the conditional probability distributions have been
given in advance.
Sequential updating makes it possible to update and improve these conditional probability distributions as observations are made. This is especially
important if the model is incomplete, the modeled domain is drifting over
time, or the model quite simply does not reflect the modeled domain properly.
The sequential learning method implemented (also referred to as adaptation) was developed by Spiegelhalter and Lauritzen [35]. See also Cowell et
al [8] and Olesen et al [29].
Spiegelhalter and Lauritzen introduced the notion of experience. The experience is the quantitative memory which can be based both on quantitative
expert judgment and past cases. Dissemination of experience refers to the
process of calculating prior conditional distributions for the variables in the
belief network. Retrieval of experience refers to the process of calculating
updated distributions for the parameters that determine the conditional distributions for the variables in the belief network.
11.1
Experience tables
h table t h node get experience table (h node t node)
This function returns the experience table of node (which must be a chance node).1 If node doesn't have an experience table, then one will be created.
The nodes of the experience table are the discrete parents of node, and the
order of the nodes in the experience table is the same as the order in the
conditional probability table of node.
OOBN: node must not be an output clone.
When an experience table is created, it is filled with zeros. Zero is an invalid
experience count for the adaptation algorithm, so positive values must be
stored in the table before adaptation can take place. If a database of cases
is available, the EM algorithm can be used to get initial experience counts.
The adaptation algorithm only adapts conditional distributions corresponding to parent configurations with positive experience counts. All other configurations (including all configurations for nodes with no experience table)
are ignored. This convention can be used to turn on/off adaptation at the
level of individual parent configurations: Setting an experience count to a
positive number turns on adaptation for the associated parent configuration;
setting the experience count to zero or a negative number turns it off.
Note that the table returned by h node get experience table is the table stored
within node (and not a copy of that table). This implies that the experience
counts for node can be modified using functions that provide access to the
internal data structures of tables (see Chapter 5).
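For example, adaptation can be enabled for all parent configurations of a node by storing a uniform positive experience count (a sketch; the count of 1 acts as one "virtual case" per configuration):

h_table_t t = h_node_get_experience_table (n);
h_number_t *data = h_table_get_data (t);
size_t i, k = h_table_get_size (t);

for (i = 0; i < k; i++)
    data[i] = 1.0; /* positive count: adaptation is turned on */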
Experience tables can be deleted using the h table delete(73) function. Deleting an experience table turns off adaptation for the node associated with the
table.
h boolean t h node has experience table (h node t node)
Test whether node has an experience table.
1 Although adaptation is only possible for discrete chance nodes, experience tables are also used to control the EM algorithm, and the EM algorithm applies to both discrete and continuous chance nodes.
11.2
Updating tables
When experience tables (and optionally fading tables) have been created
and their contents specified, then the model is ready for adaptation.
An adaptation step consists of entering evidence, propagating it, and, finally,
updating (adapting) the conditional probability and experience tables. The
last substep is performed using the following function.
h status t h domain adapt (h domain t domain)
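A complete adaptation step might thus look as follows (a minimal sketch; n is a discrete node being observed in the compiled domain d):

h_node_select_state (n, 0); /* enter evidence */
h_domain_propagate (d, h_equilibrium_sum, h_mode_normal);
h_domain_adapt (d); /* update CPTs and experience tables */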
If the experience count for some parent configuration is (or can be expected to be) very large (10^4 or more) or the fading factor is very close to 1 (1 − 10^−4 or closer), then it is recommended that a double-precision version of the HUGIN API is used.
Chapter 12
Learning Network Structure and Conditional Probability Tables
12.1
Data
Note that the mechanism for entering cases described in this section is intended for case sets that fit in main memory. The learning algorithms currently provided by the HUGIN API assume that the data is stored in main
memory. Also note that case data is not saved as part of the HUGIN KB file
produced by h domain save as kb(49) .
The cases are numbered sequentially from 0 to N − 1, where N is the total number of cases. The first case gets the number 0, the second case gets the number 1, etc.
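For example, a tiny data set can be entered as follows (a sketch; it assumes the setter counterparts h domain set number of cases, h node set case state, and h node set case value of the access functions described below):

h_domain_set_number_of_cases (d, 2);          /* assumed setter names */

h_node_set_case_state (a, 0, 1);    /* discrete node a: state 1 in case 0 */
h_node_set_case_state (a, 1, 0);    /* state 0 in case 1 */
h_node_set_case_value (g, 0, -1.4); /* continuous node g: value in case 0 */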
h index t h node get case state (h node t node, size t case index)
Retrieve the state value of the discrete node node associated with case case
index. If an error occurs or no state value (or an invalid state value) has
been specified, a negative number is returned.
Case data for continuous nodes is specified using the following functions.
h double t h node get case value (h node t node, size t case index)
Retrieve the value of the continuous node node associated with case case
index. If an error occurs, a negative number is returned, but this cannot be
used for error detection, since any real value is a valid value. Instead, the
h error code(18) function must be used.
The next two functions apply to both discrete and continuous nodes.
12.2
Data files
When a set of cases has been entered as described in the previous section,
it can be saved to a file. Such a file is known as a data file. The HUGIN API
provides functions for reading and writing data files.
A data file is a text file. The format (i.e., syntax) of a data file can be
described by the following grammar.
⟨Data file⟩ → ⟨Header⟩ ⟨Case⟩*
⟨Header⟩ → [ # ⟨Separator⟩ ] ⟨Node list⟩
⟨Case⟩ → [ ⟨Case count⟩ ⟨Separator⟩ ] ⟨Data list⟩
⟨Data⟩ → ⟨Value⟩ | * | ? | ⟨Empty⟩
⟨Separator⟩ → , | ⟨Empty⟩
where:
The ⟨Header⟩ must occupy a single line in the file. Likewise, each ⟨Case⟩ must occupy a single line.
If # is the first element of ⟨Header⟩, then each ⟨Case⟩ must include a ⟨Case count⟩.
Each ⟨Case⟩ must contain a ⟨Data⟩ item for each node specified in the ⟨Header⟩. The ith ⟨Data⟩ item (if it is a ⟨Value⟩) in the ⟨Data list⟩ must be valid (as explained in Section 9.9) for the ith node in the ⟨Node list⟩ of the ⟨Header⟩.
If ⟨Data⟩ is *, ?, or ⟨Empty⟩, then the data is taken as missing.
If ⟨Separator⟩ is ⟨Empty⟩, then none of the separated items is allowed to be ⟨Empty⟩.
⟨Value⟩ is as defined in Section 9.9, except that the ⟨Likelihood⟩ alternative is not allowed.
Comments can be included in the file. Comments are specified using the % character and extend to the end of the line. Comments behave like newline
characters. Empty lines (after removal of blanks, tabs, and comments) are
ignored by the data file parser (i.e., they do not represent empty cases).
Example 12.1 Here is a small set of cases for the Asia domain [25].
# A S D X
1 "yes" "no" "no" "no"
1 "no" "yes" "yes" "no"
1 "no" "yes" "yes" "yes"
1 "no" "no" "yes" "yes"
2 "yes" "yes" * "no"
1 "yes" "no" "no" *
1 "yes" "yes" "yes" "yes"
1 "no" "no" "no" *
The first line lists the nodes, and the remaining lines each describe a case. The first case corresponds to a non-smoking patient who has been to Asia recently, does not have shortness of breath, and whose X-ray doesn't show anything. The last case corresponds to a non-smoking patient who has not (recently) been to Asia, does not have shortness of breath, and whose X-ray is not available. Similarly for the other
cases.
Note the extra (optional) initial column of numbers: These numbers are case
counts. The number 2 for the fifth case indicates that this case has been observed
twice; the other cases have only been observed once. The presence of case counts
is indicated by the # character in the header line.
Note the distinction between case files (Section 9.9) and data files: A case
file contains exactly one case, it may contain likelihood data, and reading a
case file means loading the case data as evidence. A data file, on the other
hand, can contain arbitrarily many cases, but likelihood data is not allowed,
and reading a data file (using h domain parse cases described below) loads
the case data using the facilities described in Section 12.1.
The h domain save cases function saves case data stored in a domain.
h status t h domain save cases
(h domain t domain, h string t file name, h node t nodes,
h index t cases, h boolean t include case counts,
h string t separator, h string t missing data)
Save (some of) the case data stored in domain as a data file named file name.
(Note: If a file named file name already exists and is not write-protected, it
is overwritten.) The format and contents of the file are controlled by several
arguments:
nodes is a non-empty NULL-terminated list of (distinct) nodes. Moreover, all nodes must be discrete or continuous nodes belonging to domain. The list specifies which nodes are saved (and their order).
Note: If (some of) the nodes do not have names, they will be assigned
names (through calls to the h node get name(42) function).
cases is a list of case indexes (which must all be valid), terminated by −1. The list specifies which cases are included (and their order in the file). Duplicates are allowed (the case will be output for each
occurrence of its index in the list).
When a case is output, the associated case count is output unmodified:
If the case has case count n, then it is also n in the generated file (not
n/2 or something like that).
NULL can be passed for cases. This will cause all cases to be output (in the same order as stored in domain).
If the data file is to be read by other applications, it can be useful to use a different
separator and/or a different missing data indicator. Therefore, these restrictions are not
enforced.
h_node_t nodes[4];
h_index_t cases[] = { 0, 2, 4, 6, -1 };

nodes[0] = h_domain_get_node_by_name (d, "S");
nodes[1] = h_domain_get_node_by_name (d, "D");
nodes[2] = h_domain_get_node_by_name (d, "X");
nodes[3] = NULL;

h_domain_parse_cases
    (d, "Asia.dat", error_handler, "Asia.dat");
h_domain_save_cases
    (d, "New.dat", nodes, cases, 0, ",\t", "");
When this code is executed, a new data file (New.dat) is generated. It has the
following contents:
S,	D,	X
"no",	"no",	"no"
"yes",	"yes",	"yes"
"yes",	,	"no"
"yes",	,	"no"
"yes",	"yes",	"yes"
Note that the case with index 4 (the fifth case) from the input data file is repeated
twice in the output data file. This is because that case has case count 2 in the input
data.
12.3
When we learn a graphical model from a set of cases, we want the model
that best describes the data. We want to express this goodness using a
single number, so that we can easily compare different models. We call such
a number a score.
Several different scoring measures have been proposed. The HUGIN API
provides the following scores:
The log-likelihood of the data given the model. This is simply the sum
of the log-likelihoods of the individual cases.
Akaike's Information Criterion (AIC): This is computed as l − κ, where l is the log-likelihood and κ is the number of free parameters in the model.
The Jeffreys-Schwarz criterion, also called the Bayesian Information Criterion (BIC): This is computed as l − ½ κ log n, where l and κ are defined as above, and n is the number of cases.
The log-likelihood score doesn't take model complexity into account, whereas the AIC and BIC scores do.
The following functions assume that case data has been specified, that domain is compiled (because inference will be performed), and that the junction tree potentials are up-to-date with respect to the node tables and their
models (if any). If domain is a LIMID, then each case must specify a valid
evidence scenario (see Section 9.1.3). Moreover, the network must not contain functional trails between nodes that are not real-valued function nodes.
h double t h domain get log likelihood (h domain t domain)
Get the log-likelihood of the case data with respect to the graphical model of
domain.2 This is computed using the current conditional probability tables.
If this function is called immediately after the EM algorithm has been run
(for example, using h domain learn tables(176) ), the log-likelihood will be
computed with respect to the final tables computed by the EM algorithm.
But the function can also be used without the EM algorithm.
h double t h domain get AIC (h domain t domain)
Get the AIC score of the case data with respect to the graphical model of
domain.
h double t h domain get BIC (h domain t domain)
Get the BIC score of the case data with respect to the graphical model of
domain.
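For example (a sketch; the domain is compiled, and case data has been specified):

double ll = (double) h_domain_get_log_likelihood (d);
double aic = (double) h_domain_get_AIC (d);
double bic = (double) h_domain_get_BIC (d);

printf ("log-likelihood %g, AIC %g, BIC %g\n", ll, aic, bic);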
2 Prior to version 6.5 of the HUGIN API, this function could only be used after the EM algorithm had run, and the function returned the log-likelihood computed with respect to the conditional probability tables before the final iteration of the EM algorithm, i.e., not the final tables.
12.4
Learning network structure
The algorithm used by HUGIN for learning the network structure is known
as the PC algorithm [36]. Domain knowledge (i.e., knowledge of which
edges to include or exclude, directions of the edges, or both) is taken into
account. Such knowledge is specified as a set of edge constraints (see Section 12.5).
An outline of the algorithm is as follows:
Case data is specified using the functions described in Section 12.1.
Alternatively, the cases can be loaded from a data file using h domain
parse cases(169) .
Statistical tests for conditional independence of pairs of nodes (X, Y)
given sets of other nodes SXY (with the size of SXY varying from 0 to 3)
are performed.
An undirected graph (called the skeleton) is constructed: X and Y are
connected with an edge if and only if (1) the edge is required by the
edge constraints, or (2) the edge is permitted by the edge constraints
and no conditional independence relation for (X, Y) given a set SXY
was found in the previous step.
Edges for which directions are specified by the edge constraints are
directed according to the constraints (unless the constraints impose
directed cycles or invalid directions).
Colliders (also known as v-structures) (i.e., edges directed at a common node) and derived directions are identified. Edges are directed
such that no directed cycles are created.
The previous step results in a partially directed graph. The remaining
edges are arbitrarily directed (one at a time, each edge directed is
followed by a step identifying derived directions).
h status t h domain learn structure (h domain t domain)
If a log-file has been specified (using h domain set log file(109) ), then a log
of the actions taken by the PC algorithm is produced. Such a log is useful for debugging and validation purposes (e.g., to determine which edge
directions were determined from data and which were selected at random).
The dependency tests calculate a test statistic which is asymptotically χ²-distributed assuming (conditional) independence. If the test statistic is large, the independence hypothesis is rejected; otherwise, it is accepted.
The probability of rejecting a true independence hypothesis is set using the
following function.
h status t h domain set significance level
(h domain t domain, h double t probability)
Set the significance level (i.e., the probability of rejecting a true independence hypothesis) to probability (a value between 0 and 1) for domain. The
default value is 0.05.
In general, increasing the significance level results in more edges, whereas
reducing it results in fewer edges. With fewer edges, the number of arbitrarily directed edges generally decreases.
Reducing the significance level also reduces the run time of h domain learn
structure.
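For example, a sparser structure (with fewer arbitrarily directed edges) can be requested before learning (a sketch):

h_domain_set_significance_level (d, 0.01);
h_domain_learn_structure (d);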
h double t h domain get significance level (h domain t domain)
12.5
Domain knowledge
Background knowledge about the domain can be used to constrain the set of
networks that can be learned. Such knowledge can be used by the learning
algorithm to resolve ambiguities (e.g., deciding the direction of an edge).
Domain knowledge can be knowledge of the direction of an edge, the presence or absence of an edge, or both.
The enumeration type h edge constraint t is introduced to represent the set of possible items of knowledge about a particular edge between a and b. The possibilities are:
3 This is useful if the computer has multiple processors, a processor with multiple cores, or a processor that supports Intel Hyper-Threading Technology (such as an Intel Core or an Intel Xeon processor). The concurrency level should be less than or equal to the number of simultaneous threads supported by the computer.
12.6
Learning conditional probability tables
Before learning of conditional probability tables can take place, the data set
and the set of nodes for which conditional probability distributions should
be learned must be specified. This set of nodes is specified as the nodes
having experience tables. Experience tables are created by the h node get
experience table(160) function, and they are deleted by the h table delete(73)
function.
If the experience count is negative, then learning is disabled for the corresponding parent state configuration.
For the discrete nodes, the starting point of the EM algorithm consists of the
pre-specified conditional probability tables. If no tables have been specified,
uniform distributions are assumed. Sometimes, it is desirable to enforce zeros in the joint probability distribution. This is done by specifying zeros in
the conditional probability tables for the configurations that should be impossible (i.e., have zero probability). However, note that the presence of cases in the data set that are impossible according to the initial joint distribution will cause the learning operation to fail.
For the continuous nodes, a suitable starting point is computed from the
case data.
Example 12.3 The following code loads the Asia domain [25] and makes sure
that all nodes except E have experience tables. All entries of these experience tables
are then set to 0 (because we want to compute maximum likelihood estimates of
the conditional probability tables). Note that newly created experience tables are
already filled with zeros.
h_domain_t d = h_kb_load_domain ("Asia.hkb", NULL);
h_node_t E = h_domain_get_node_by_name (d, "E");
h_node_t n = h_domain_get_first_node (d);
for (; n != NULL; n = h_node_get_next (n))
if (n != E)
{
h_boolean_t b = h_node_has_experience_table (n);
h_table_t t = h_node_get_experience_table (n);
if (b)
{
h_number_t *data = h_table_get_data (t);
size_t k = h_table_get_size (t);
for (; k > 0; k--, data++)
*data = 0.0;
}
}
if (h_node_has_experience_table (E))
h_table_delete (h_node_get_experience_table (E));
Now we read and enter into the domain a file of cases (data file). This is done
using the h domain parse cases(169) function (see Example 13.25 for an appropriate definition of error handler). After having ensured that the domain is compiled,
we call h domain learn tables in order to learn conditional probability tables for all
nodes except E. [We assume that the correct conditional probability table has already been specified for E, and that the other conditional probability tables contain
nonzero values.]
h_domain_parse_cases
(d, data_file, error_handler, data_file);
if (!h_domain_is_compiled (d))
h_domain_compile (d);
h_domain_learn_tables (d);
The h domain learn tables operation will also update the experience tables with
the counts derived from the file of cases. These experience counts can then form
the basis for the sequential learning feature. (But note that if some parent state
configurations are absent from the data set, then the corresponding experience
counts will be zero.)
Chapter 13
The NET Language
The first revision of the NET language (used by versions 1.x of the HUGIN API) had a fixed format (i.e., the semantics of the different elements were determined by their positions within the specification). This format could not (easily) be extended to support new features, so a completely different format had to be developed.
13.1
A domain or a class specification in the NET language is conceptually comprised of the following parts:
Information pertaining to the domain or class as a whole.
Specification of basic nodes (category, kind, states, label, etc.).
Specification of the relationships between the nodes (i.e., the network
structure, and the potentials and functions associated with the links).
[Classes only] Specification of class instances, including bindings of
interface nodes.
[DBN classes only] Specification of special nodes, known as temporal
clones, to express temporal dependencies.
The first part (i.e., the part providing global information about a domain or
a class) must appear first, but the other parts can be overlapping, except that
nodes must be defined before they can be used in specifications of structure
or quantitative data.
A specification of a domain in the NET language has the following form:
⟨domain definition⟩ → ⟨domain header⟩ ⟨domain element⟩*
⟨domain header⟩ → net { ⟨attribute⟩* }
A NET file can contain several class definitions. The only restriction is that
classes must be defined before they are instantiated.
Names (⟨class name⟩, ⟨attribute name⟩, etc.) are specified in Section 13.8.
The following sections describe the syntax and semantics of the remaining elements of the grammar: ⟨basic node⟩ (Section 13.2), ⟨class instance⟩ (Section 13.3), ⟨temporal clone⟩ (Section 13.4), and ⟨potential⟩ (Section 13.5 and Section 13.6).
13.2
Basic nodes
A specification of a basic node begins with the specification of the node type:
[⟨prefix⟩] node (for specifying a chance node). The optional ⟨prefix⟩ must be either discrete or continuous (omitting the ⟨prefix⟩ causes a discrete chance node to be specified).
decision (for specifying a decision node).
utility (for specifying a utility node).
[discrete] function (for specifying a function node). If the optional prefix discrete is included, the specification defines a discrete function node. Otherwise, a real-valued function node is defined.
The node type specification is followed by a name that must be unique within the model. See Section 13.8 for the rules of forming valid node names.
A LIMID model (i.e., a model containing decision or utility nodes) must not
contain continuous (chance) nodes.
Example 13.1 shows five of the attributes currently defined in the NET language for nodes: states, label, position, subtype, and state values. All of these attributes are optional: If an attribute is absent, a default value is used.
states specifies the states of the node: This must be a non-empty list
of strings, comprising the labels of the states. If the node is used as
a labeled node with the table generator facility, then the labels must
be unique; otherwise, the labels need not be unique (and can even be
empty strings). The length of the list defines the number of states of
the node, which is the only quantity needed by the HUGIN inference
engine.
The default value is a list of length one, containing an empty string
(i.e., the node has only one state).
The states attribute is only allowed for discrete nodes.
label is a string that is used by the HUGIN GUI tool when displaying
the nodes. The label is not used by the inference engine. The default
value is the empty string.
position is a list of integers (the list must have length two). It indicates
the position within the graphical display of the network by the HUGIN
GUI tool. The position is not used by the inference engine. The default
position is at (0, 0).
subtype specifies the subtype of a discrete node. The value must be
one of the following name tokens: label, boolean, number, or interval.
See Section 6.1 for more information.
The default value is label.
state values is a list of numbers, defining the state values of the node.
These values are used by the table generator facility. This attribute
must only appear for nodes of subtypes number or interval (and must
appear after the subtype and states attributes). If the subtype is number, the list must have the same length as the states list; if the subtype
is interval, the list must have one more element than the states list.
The list of numbers must form an increasing sequence.
If the subtype is interval, the first element can be −∞, and the last element can be ∞.
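As a sketch of these rules, consider an interval node with three states and four state values (the attribute spellings follow the NET examples in this chapter; the numeric values are illustrative):

node F
{
    states = ("low" "medium" "high");
    subtype = interval;
    state_values = (0 5 10 20);
}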
In addition to the standard attributes, an application can introduce its own
attributes.
Example 13.2 Here, the node T has been given the application-specific attribute MY_APPL_my_attr.
node T
{
states = ("yes" "no");
label = "Has tuberculosis?";
position = (25 275);
MY_APPL_my_attr = "1000";
}
13.3
Class instances
A class instance is specified using the following syntax:

instance ⟨instance name⟩ : ⟨class name⟩
    ( ⟨input bindings⟩ ; ⟨output bindings⟩ ) { ⟨node attributes⟩ }

This defines an instance (with name ⟨instance name⟩) of the class with name ⟨class name⟩. Currently, the ⟨node attributes⟩ for a class instance can only contain label, position, and user-defined attributes.
The ⟨input bindings⟩ specify how formal input nodes of the class instance are associated with actual input nodes. The syntax is as follows:

⟨input bindings⟩ → ⟨input binding⟩ | ⟨input binding⟩ , ⟨input bindings⟩
⟨input binding⟩ → ⟨formal input name⟩ = ⟨actual input name⟩

The ⟨formal input name⟩ must refer to a node listed in the inputs attribute (see Section 13.7) of the class with name ⟨class name⟩. The node referred to by the ⟨actual input name⟩ must be defined somewhere in the class containing the class instance.
The ⟨input bindings⟩ need not specify bindings for all the formal input nodes of the class (but at most one binding can be specified for each input node).
The ⟨output bindings⟩ are used to give names to output clones. The syntax is similar to that of the input bindings:

⟨output bindings⟩ → ⟨output binding⟩ | ⟨output binding⟩ , ⟨output bindings⟩
⟨output binding⟩ → ⟨actual output name⟩ = ⟨formal output name⟩

The ⟨actual output name⟩ is the name assigned to the output clone that corresponds to the output node with name ⟨formal output name⟩ for this particular class instance. An ⟨actual output name⟩ may appear in the outputs attribute (see Section 13.7) of a class definition and as a parent in ⟨potential⟩ specifications.
Example 13.4 The following fragment of a NET specification defines an instance I1
of class C.
instance I1 : C (X=X1, Y=Y1; Z1=Z) {...}
Class C must have (at least) two input nodes: X and Y. For instance I1, X corresponds to node X1, and Y corresponds to node Y1. Class C must also have (at least) one output node: Z. The output clone corresponding to Z for instance I1 is given the name Z1.
A NET file can contain several class definitions, but the classes must be
ordered such that instantiations of a class follow its definition. Often, a
NET file will be self-contained (i.e., no class instances refer to classes not
defined in the file), but it is also possible to store the classes in individual
files. When a NET file is parsed, classes will be looked up whenever they
are instantiated. If the class is already loaded, the loaded class will be used.
If no such class is known, it must be created (for example, by calling the
parser recursively). See Section 13.9 for further details.
13.4 Temporal clones

13.5 The structure of the model

The structure (i.e., the links of the underlying graph) is specified as part of the ⟨potential⟩ specifications. We have two kinds of links: directed and undirected links. We denote a directed link from A to B as A → B, and we denote an undirected link between A and B as A – B. If there is a directed link from A to B, we say that A is a parent of B and that B is a child of A.
A network model containing undirected links is called a chain graph model.
A ⟨potential⟩ specification is introduced by the potential keyword. In the following, we explain how to specify links through a series of examples.
Example 13.5 This is a typical specification of directed links:
potential ( A | B C ) { }
This specifies that node A has two parents: B and C. That is, there is a directed
link from B to A, and there is also a directed link from C to A.
Example 13.6 This is a typical specification of undirected links:
potential ( A B ) { }
This specifies that there is an undirected link between A and B. (If there are no
parents, the vertical bar may be omitted.)
Example 13.7 Directed and undirected links can also be specified together:
potential ( A B | C D ) { }
This specifies that there is an undirected link between A and B, and it also specifies
that A and B both have C and D as parents.
Example 13.9 This specification is also invalid, since there is a directed cycle A → B → C – A.
potential ( B | A ) { }
potential ( C | B ) { }
potential ( A C ) { }
However, the following specification is valid.
potential ( A | B ) { }
potential ( C | B ) { }
potential ( A C ) { }
If there is only one child node in the ⟨potential⟩ specification, then this specifies the node table associated with the child node (Section 13.6 explains how to specify the numeric parts of the table). Only one such specification can be provided for each node.
Chance, decision, and utility nodes can only have chance, decision, and
function nodes as parents. Discrete (chance and decision) nodes cannot
have continuous nodes as parents (but discrete function nodes can).
In dynamic models, temporal clones can only have temporal clones as parents (that is, links must follow the natural flow of time).
Links entering function nodes and links leaving real-valued function nodes
are called functional links. Additional constraints are imposed on network
models containing functional links. See Section 2.4.
Links entering decision nodes are called information links. The parents of a
decision node are exactly those nodes that are assumed to be known when
the decision is made.
Example 13.10 Assume that we want to specify a LIMID with two decisions, D1 and D2, and with three discrete chance variables, A, B, and C. First, A is observed; then, decision D1 is made; then, B is observed; finally, decision D2 is made. This sequence of events can be specified as follows:
potential ( D1 | A ) { }
potential ( D2 | D1 B ) { }
The last line specifies that the decision maker forgets the observation of A before he makes decision D2. If this is not desired, then A should be included in the ⟨potential⟩ specification for D2:
potential ( D2 | D1 A B ) { }
However, this makes the policy of D2 more complex. Reducing the complexity of decision making by ignoring less important observations can often be an acceptable (or even necessary) trade-off.
13.6 Potentials
We also need to specify the quantitative part of the model. This part consists of conditional probability potentials for chance and discrete function
nodes, policies for decision nodes, and utility functions for utility nodes. We
distinguish between discrete probability, continuous probability, and utility
potentials. Discrete probability potentials are used for all discrete nodes
(including decision nodes).
There are two ways to specify the discrete probability and the utility potentials: (1) by listing the numbers comprising the potentials (Section 13.6.1),
and (2) by using the table generation facility (Section 13.6.2).
Real-valued function nodes do not have potentials, but it is convenient to use ⟨potential⟩ specifications to define the functions represented by the nodes. In this case, we use expressions to specify the functions (see Section 13.6.2).
13.6.1 Direct specification of the numbers

Example 13.11 The conditional probability potential for the node T (tuberculosis) given its parent A (visit to Asia) from the Chest Clinic example can be specified as follows:
potential ( T | A )
{
    data = (( 0.05 0.95 )    % A=yes
            ( 0.01 0.99 ));  % A=no
}
This specifies that the probability of tuberculosis given a trip to Asia is 5%, whereas it is only 1% if the patient has not been to Asia.
The data attribute may also be specified as an unstructured list of numbers:
potential ( T | A )
{
    data = ( 0.05 0.95    % A=yes
             0.01 0.99 ); % A=no
}
Example 13.12
potential ( D E F | A B C ) { }
The data attribute of this ⟨potential⟩ specification corresponds to a multi-dimensional table with dimension list ⟨A, B, C, D, E, F⟩.
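The entries of such a table are stored in row-major order, that is, the state index of the last node in the dimension list varies fastest. The HUGIN API provides h_table_get_index_from_configuration(71) for this computation; the fragment below is only a sketch of the underlying rule, not an API function:

/* Row-major linear index of a configuration; n_states[i] is the
   number of states of the i-th node in the dimension list. */
static size_t linear_index (const size_t *configuration,
                            const size_t *n_states, size_t n)
{
    size_t index = 0;

    for (size_t i = 0; i < n; i++)
        index = index * n_states[i] + configuration[i];

    return index;
}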
Example 13.13 A utility potential: U is a utility node whose parents are Drill (a decision node with two states) and Oil (a chance node with states dry, wet, and soaking).
potential ( U | Drill Oil )
{
    data = (( -70     % Oil=dry
               50     % Oil=wet
              200 )   % Oil=soaking
            (   0     % Oil=dry
                0     % Oil=wet
                0 )); % Oil=soaking
}
The data attribute of this ⟨potential⟩ specification corresponds to a multi-dimensional table with dimension list ⟨Drill, Oil⟩.
The (multi-dimensional) table corresponding to the data attribute of a continuous probability potential has dimension list comprised of the discrete parents of the ⟨potential⟩ specification (in the given order). These nodes must be listed first on the right hand side of the vertical bar, followed by the continuous parents. However, the items in the table are no longer numbers but instead continuous distribution functions; only normal (i.e., Gaussian) distributions can be used. A normal distribution is specified by its mean and variance. In the following example, a continuous probability potential is specified.
Example 13.14 Suppose A is a continuous node with parents B and C, which are both discrete. Also, both B and C have two states: B has states b1 and b2, while C has states c1 and c2.
potential (A | B C)
{
    data = (( normal ( 0, 1 )          % B=b1  C=c1
              normal ( -1, 1 ) )       % B=b1  C=c2
            ( normal ( 1, 1 )          % B=b2  C=c1
              normal ( 2.5, 1.5 ) ));  % B=b2  C=c2
}
The data attribute of this ⟨potential⟩ specification corresponds to a table with dimension list ⟨B, C⟩. Each entry contains a probability distribution for the continuous node A.
All entries in the above example contain a specification of a normal distribution. A normal distribution is specified using the keyword normal followed by a list of two parameters. The first parameter is the mean, and the second parameter is the variance of the normal distribution.
Example 13.15 Let A be a continuous node with one discrete parent B (with states b1 and b2) and one continuous parent C.
potential (A | B C)
{
    data = ( normal ( 1 + C, 1 )            % B=b1
             normal ( 1 + 1.5 * C, 2.5 ) ); % B=b2
}
The data attribute of this ⟨potential⟩ specification corresponds to a table with dimension list ⟨B⟩ (B is the only discrete parent, and it must therefore be listed first on the right hand side of the vertical bar). Each entry again contains a continuous distribution function for A. The influence of C on A now comes from the use of C in the expressions specifying the mean parameters of the normal distributions.
13.6.2 Using the table generation facility

For potentials that do not involve CG variables, a different method for specifying the quantitative part of the relationship for a single node and its parents is provided. This method can be used for all discrete nodes as well as utility nodes.
Models are also used to define the functions associated with real-valued
function nodes. This is explained below.
Example 13.16 Let A denote the number of 1s in a throw with B (possibly biased)
dice, where the probability of getting a 1 in a throw with a single die is C. The
specification of the conditional probability potential for A given B and C can be
given using the table generation facility described in Chapter 6 as follows:
potential (A | B C)
{
model_nodes = ();
samples_per_interval = 50;
model_data = ( Binomial (B, C) );
}
First, we list the model_nodes attribute: This defines the set of configurations for the model_data attribute. In this case, the list is empty, meaning that there is just one configuration. The expression for that configuration is the binomial distribution expression shown in the model_data attribute.
C will typically be an interval node (i.e., its states represent intervals). However, when computing the binomial distribution, a specific value for C is needed. This is handled by choosing 50 distinct values within the given interval and computing the distributions corresponding to those values. The average of these distributions is then taken as the conditional distribution for A given the value of B and the interval (i.e., state) of C. The number 50 is specified by the samples_per_interval attribute. See Section 6.9 for further details.
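The following fragment sketches this averaging scheme for the binomial case; it is not HUGIN code, and the placement of the sample points within the interval is an assumption (Section 6.9 gives the authoritative description):

#include <math.h>

/* Sketch: average the Binomial(n, p) distribution over n_samples
   values of p taken from the interval [p_lo, p_hi); dist must hold
   n + 1 entries.  The placement of the sample points (interval
   midpoints) is an assumption, not the documented HUGIN scheme. */
static void averaged_binomial (int n, double p_lo, double p_hi,
                               int n_samples, double *dist)
{
    for (int k = 0; k <= n; k++)
        dist[k] = 0.0;

    for (int s = 0; s < n_samples; s++)
    {
        double p = p_lo + (s + 0.5) * (p_hi - p_lo) / n_samples;
        double prob = pow (1.0 - p, n);   /* P(0) */

        for (int k = 0; k <= n; k++)
        {
            dist[k] += prob / n_samples;
            /* recurrence: P(k+1) = P(k) * (n-k)/(k+1) * p/(1-p) */
            prob *= (double) (n - k) / (k + 1) * p / (1.0 - p);
        }
    }
}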
Example 13.17 In the Chest Clinic example [25], the node E is specified as a
logical OR of its parents, T and L. Assuming that all three nodes are of labeled
subtype with states yes and no (in that order), the potential for E can be specified
as follows:
potential (E | T L)
{
model_nodes = (T L);
model_data = ( "yes", "yes", "yes", "no" );
}
An equivalent specification can be given in terms of the OR operator:
potential (E | T L)
{
model_nodes = ();
model_data
= ( if (or (T="yes", L="yes"), "yes", "no") );
}
If all three nodes are given a boolean subtype, the specification can be simplified to
the following:
potential (E | T L)
{
model_nodes = ();
model_data = ( or (T, L) );
}
In general, the model_nodes attribute is a list containing a subset of the discrete parents listed to the right of the vertical bar in the ⟨potential⟩ specification. The order of the nodes in the model_nodes list defines the interpretation of the model_data attribute: The model_data attribute is a comma-separated list of expressions, one for each configuration of the nodes in the model_nodes list. As usual, the ordering of these configurations is row-major.
A non-empty model_nodes list is a convenient way to specify a model with distinct expressions for distinct parent state configurations. An alternative is nested if-expressions. See Example 13.17.
The model_nodes attribute must appear before the samples_per_interval and model_data attributes.
The complete definition of the syntax of expressions is given in Section 6.3.
If both a specification using the model attributes and a specification using the data attribute are provided, then the specification in the data attribute is assumed to be correct (regardless of whether it was generated from the model). The functions that generate NET files (Section 13.10) will output both, if HUGIN believes that the table is up-to-date with respect to the model (see the description of h_node_generate_table(95) for precise details). Since generating a table from its model can be a very expensive operation, having a (redundant) specification in the data attribute can be considered a cache for h_node_generate_table.
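For example, an application that has modified a model programmatically could force the table to be regenerated and report any failure; a minimal sketch:

#include <stdio.h>
#include "hugin.h"

/* Regenerate the table of 'node' from its model; on failure,
   print the HUGIN error description. */
void regenerate_table (h_node_t node)
{
    if (h_node_generate_table (node) != 0)
        fprintf (stderr, "could not generate table: %s\n",
                 h_error_description (h_error_code ()));
}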
Real-valued function nodes

Real-valued function nodes do not have potentials, but it is convenient to use expressions to specify the functional relationship between a (real-valued) function node and its parents.
This is achieved by using a ⟨potential⟩ specification containing only the model_nodes and model_data attributes (the data attribute is not allowed). In this case, all expressions must be of numeric type.
Example 13.18 Real-valued function nodes can be used to provide important constants in expressions. This is done by introducing a (real-valued) function node (named appropriately) and defining the constant using an expression.
If we would like to fix the value of C in Example 13.16 to 0.2, we could define C as a (real-valued) function node and define its value using a ⟨potential⟩ specification:
potential (C)
{
model_nodes = ();
model_data = ( 0.2 );
}
Defining the probability parameter of the binomial distribution this way makes it
easy to change it later.
13.6.3 Parameter learning
Information for use by the adaptation facility (see Chapter 11) is specified through the experience and fading attributes of a ⟨potential⟩ specification. These attributes have the same syntax as the data attribute.
The experience data is also used to control the EM algorithm (Section 12.6).
The experience attribute is only allowed in ⟨potential⟩ specifications for single chance nodes (that is, there must be exactly one child node in the specification). The fading attribute is only allowed in ⟨potential⟩ specifications for single discrete chance nodes.
For the adaptation algorithm, valid experience counts must be positive numbers, while the EM algorithm only requires nonnegative numbers. Specifying an invalid value for some parent state configuration turns off parameter learning for that configuration. If the ⟨potential⟩ specification doesn't contain an experience attribute, parameter learning is turned off completely for the child node.
A fading factor λ is valid if 0 < λ ≤ 1. Specifying an invalid fading factor for some parent state configuration turns off adaptation for that configuration. If the ⟨potential⟩ specification doesn't contain a fading attribute, then all fading factors are considered to be equal to 1 (which implies no fading).
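The effect of a fading factor can be sketched in code (this is only the idea; the authoritative update rule is given in Chapter 11):

/* Sketch of the idea behind fading: past experience is discounted
   by the fading factor before a new case is absorbed, so a factor
   of 1 keeps all past experience at full weight. */
double faded_experience (double count, double fading_factor)
{
    return count * fading_factor + 1.0;   /* +1 for the new case */
}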
See Chapter 11 and Section 12.6 for further details.
Example 13.19 The following shows a specification of experience and fading information for the node D (Dyspnoea) from the Chest Clinic example in [25]. This node has two parents, E and B. We specify an experience count and a fading factor for each configuration of states of ⟨E, B⟩.
potential (D | E B)
{
    data = ((( 0.9 0.1 )     % E=yes  B=yes
             ( 0.7 0.3 ))    % E=yes  B=no
            (( 0.8 0.2 )     % E=no   B=yes
             ( 0.1 0.9 )));  % E=no   B=no
    experience = (( 10       % E=yes  B=yes
                    12 )     % E=yes  B=no
                  (  0       % E=no   B=yes
                    14 ));   % E=no   B=no
    fading = (( 1.0          % E=yes  B=yes
                0.9 )        % E=yes  B=no
              ( 1.0          % E=no   B=yes
                1.0 ));      % E=no   B=no
}
Note that the experience count for the E=no/B=yes state configuration is 0. This value is an invalid experience count for the adaptation algorithm (but not for the EM algorithm), so adaptation is turned off for that particular state configuration. Also, note that only the experience count for the E=yes/B=no state configuration will be faded during adaptation (since the other parent state configurations have fading factors equal to 1).
13.7 Global information
Currently, only the node_size attribute is recognized as a special global attribute. However, as with nodes, extra attributes can be specified. These extra attributes must take strings as values. The attributes are accessed using the HUGIN API functions h_domain_get_attribute(46), h_domain_set_attribute(46), h_class_get_attribute(62), and h_class_set_attribute(61).
The HUGIN GUI tool uses the UTF-8 encoding for storing arbitrary text in attributes. If you need to have your networks loaded within that tool, you should use the UTF-8 encoding for non-ASCII text.
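For example, a value containing the Danish letter ø would be passed as UTF-8 encoded bytes (the attribute name below is invented for the illustration):

/* "Søndergaard" with 'ø' as the UTF-8 byte sequence 0xC3 0xB8 */
h_domain_set_attribute (domain, "MY_APPL_author", "S\xC3\xB8ndergaard");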
Example 13.21
net
{
node_size = (100 40);
MY_APPL_my_attr = "1000";
}
This specification has an application-specific attribute named MY_APPL_my_attr.
Example 13.22 Recent versions of the HUGIN GUI tool use several application-specific attributes. Some of them are shown here:
net
{
node_size = (80 40);
HR_Grid_X = "10";
HR_Grid_Y = "10";
HR_Grid_GridSnap = "1";
HR_Grid_GridShow = "0";
HR_Font_Name = "Arial";
HR_Font_Size = "-12";
HR_Font_Weight = "400";
HR_Font_Italic = "0";
HR_Propagate_Auto = "0";
}
HUGIN GUI uses the prefix HR_ on all of its application-specific attributes (a predecessor of the HUGIN GUI tool was named HUGIN Runtime).
13.8 Lexical matters

In general, a name has the same structure as an identifier in the C programming language. That is, a name is a non-empty sequence of letters, digits, and underscores, not beginning with a digit.
13.9 Parsing NET files

The HUGIN API provides two functions for parsing models specified in the NET language: one for non-object-oriented models (domains), and one for object-oriented models (classes).

h_domain_t h_net_parse_domain
    (h_string_t file_name,
     void (*error_handler) (h_location_t, h_string_t, void *),
     void *data)

This function parses non-object-oriented specifications (i.e., NET files starting with the net keyword) and creates a corresponding h_domain_t object.
If no error reports are desired (in this case, only the error indicator returned by h_error_code(18) will be available), then the error_handler argument may be NULL. (In this case, warnings will be completely ignored.)
If the NET specification is successfully parsed, an opaque reference to the created domain structure is returned; otherwise, NULL is returned. The domain is not compiled; use a compilation function to get a compiled version.
Example 13.24 The error handler function could be written as follows.
void my_error_handler
    (h_location_t line_no, h_string_t message, void *data)
{
    fprintf (stderr, "Error at line %lu: %s\n",
             (unsigned long) line_no, message);
}
This error handler simply writes all messages to stderr. See Example 13.25 for a
different error handler.
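A typical call sequence, sketched under the assumption that the prototype given above is used (the file name is hypothetical, and compilation is described in Chapter 7):

h_domain_t d = h_net_parse_domain ("asia.net", my_error_handler, NULL);

if (d != NULL && h_domain_compile (d) == 0)
{
    /* the domain is now ready for inference */
}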
The following function must be used when parsing NET files containing class specifications (i.e., NET files starting with the class keyword).

h_status_t h_net_parse_classes
    (h_string_t file_name, h_class_collection_t cc,
     void (*get_class) (h_string_t, h_class_collection_t, void *),
     void (*error_handler) (h_location_t, h_string_t, void *),
     void *data)

If a NET file contains several class definitions, classes must be defined before they are instantiated.
If an error is detected, the error handler function is called with a line number, indicating the location within the source file currently being parsed, and a string that describes the error. The storage used to hold this string is reclaimed by h_net_parse_classes when error_handler returns (so if the error message will be needed later, a copy must be made).
If parsing fails, then h_net_parse_classes will try to preserve the initial contents of cc by deleting the new (and possibly incomplete) classes before it returns. If get_class has modified any of the classes initially in cc, then this may not be possible. Also, if the changes are sufficiently vicious, then removing the new classes might not even be possible. However, if get_class only does things it is supposed to do, there will be no problems.
As described above, the get_class function must insert a class with the specified name into the given class collection. This can be done by whatever means are convenient, such as calling the parser recursively, or through explicit construction of the class.
Example 13.25 Suppose we have classes stored in separate files in a common directory, and that the name of each file is the name of the class stored in the file with .net appended. Then the get_class function could be written as follows:
void get_class
    (h_string_t name, h_class_collection_t cc, void *data)
{
    /* allocate space for name + ".net" + terminating NUL */
    h_string_t file_name = malloc (strlen (name) + 5);

    if (file_name == NULL)
        return;

    (void) strcat (strcpy (file_name, name), ".net");
    (void) h_net_parse_classes
               (file_name, cc, get_class, error_handler, file_name);
    free (file_name);
}

void error_handler
    (h_location_t line_no, h_string_t err_msg, void *data)
{
    fprintf (stderr, "Error in file %s at line %lu: %s\n",
             (h_string_t) data, (unsigned long) line_no, err_msg);
}
Note that we pass the file name as the data argument to h_net_parse_classes. This means that the error handler receives the name of the file as its third argument.
If more data is needed by either get_class or the error handler, then the data argument can be specified as a pointer to a structure containing the needed data items.
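For instance, a hypothetical context structure bundling a file name and a log stream could be passed through the data pointer:

#include <stdio.h>

struct parse_context
{
    h_string_t file_name;   /* for use in error messages */
    FILE      *log;         /* where the messages should go */
};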
13.10 Generating NET files
Chapter 14

14.1 Data sets
14.2 CSV files

Parse a CSV file that uses delimiter to separate the fields of a record (line). The delimiter should be a normal character (that is, it should be visible and not a so-called control character). However, a blank or a tab (but not a double-quote) character can be used as delimiter.
The error_handler and data arguments are used for error handling. This is similar to the error handling done by the other parse functions. See Section 13.9 for further information.
h_status_t h_ds_save
    (h_data_set_t data_set, h_string_t file_name, int delimiter)

Save data_set in the format of a comma-separated-values (CSV) file.² However, another delimiter than a comma may be used. The delimiter should be a normal character (that is, it should be visible and not a so-called control character). However, a blank or a tab (but not a double-quote) character can be used as delimiter.
If necessary (so that the resulting file can be loaded by h_csv_parse_data_set), fields will be quoted.

² We shall also use the term CSV for such files, although they are not comma-separated.
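The precise quoting rule is not spelled out here; a plausible test (an assumption mirroring common CSV practice) for whether a field must be quoted is the following:

#include <string.h>

/* Assumption: a field needs quoting if it contains the delimiter,
   a double quote, or a line break. */
static int needs_quoting (const char *field, int delimiter)
{
    return strchr (field, delimiter) != NULL
        || strchr (field, '"') != NULL
        || strchr (field, '\n') != NULL;
}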
Chapter 15
Display Information
The HUGIN API was developed partly to satisfy the needs of the HUGIN
GUI application. This application can present an arbitrary belief network or
LIMID model. To do this, it was necessary to associate a certain amount of
graphical information with each node of the network. The functions to
support this are hereby provided for the benefit of the general API user.
Please note that not all items of graphical information have a special interface (such as the one provided for the label of a node; see Section 15.1 below). Many more items of graphical information have been added using the attribute interface described in Section 2.9.2. To find the names of these extra attributes, take a look at the NET files generated by the HUGIN GUI application.
15.1 The label of a node

15.2 The position of a node

In order to display a network graphically, the HUGIN GUI application associates with each node a position in a two-dimensional coordinate system. The coordinates used by HUGIN are integral values; their type is h_coordinate_t.
15.3
Appendix A
Figure A.1: The structural aspects of the waste incinerator model described
in Example A.1: B, F, and W are discrete variables, while the remaining
variables are continuous.
The result of inference within a belief network model containing Conditional Gaussian variables is the beliefs (i.e., marginal distributions) of the
individual variables given evidence. For a discrete variable this (as usual)
amounts to a probability distribution over the states of the variable. For a
Conditional Gaussian variable two measures are provided:
(1) the mean and variance of the distribution;
(2) since the distribution is in general not a simple Gaussian distribution,
but a mixture (i.e., a weighted sum) of Gaussians, a list of the parameters (weight, mean, and variance) for each of the Gaussians is
available.
The algorithms necessary for computing these results are described in [23].
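In terms of the API, the first measure can be read directly after propagation. A minimal sketch (the node handle is assumed to refer to a CG variable; h_node_get_mean appears in the index, and a matching h_node_get_variance is assumed here):

/* Summary of the marginal distribution of the CG node 'node';
   h_node_get_variance is assumed to mirror h_node_get_mean. */
h_double_t mean     = h_node_get_mean (node);
h_double_t variance = h_node_get_variance (node);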
Example A.2 From the network shown in Figure A.1 (and given that the discrete variables B, F, and W are all binary), we see that
• the distribution for C can be comprised of up to two Gaussians (one if B is instantiated);
• initially (i.e., with no evidence incorporated), the distribution for E is comprised of up to four Gaussians;
• if L is instantiated (and none of B, F, or W is instantiated), then the distribution for E is comprised of up to eight Gaussians.
(In each case, the bound is the number of configurations of the uninstantiated discrete variables involved: 2 for B, 2 × 2 = 4 for B and F, and 2 × 2 × 2 = 8 for B, F, and W.)
Appendix B
• A new version of the HKB format is used for networks containing function nodes. If the network does not contain function nodes, the format is the same as that used by HUGIN API version 7.5.
• The NET language has been extended in order to support discrete function nodes and the new expression operators. See Chapter 13.
• The EM algorithm now checks that the equilibrium is sum with no evidence incorporated. As a sanity check, it also verifies that learning is enabled (that is, there must exist at least one node with a nonnegative experience count), and that case data has been specified.
• Because the EM algorithm is controlled by experience tables, continuous nodes may now also be given experience tables.
• The HKB format has been updated (in order to support parameter learning for continuous nodes).
• The model scoring functions (Section 12.3) now check that the junction tree potentials are up-to-date with respect to the node tables and their models (if any).
• A new naming scheme has been introduced for the HUGIN API libraries on the Windows platforms: The libraries are now uniquely named, making it possible to have all DLLs in the search path simultaneously.
• d-Separation analysis is now used to improve the performance of inference. This is particularly useful for incremental propagation of evidence in large networks.
• The performance of the total-weight triangulation method has been greatly improved.
• The triangulation functions now construct junction trees, but do not allocate storage for the data arrays of the clique and separator tables. This permits the application to see the junction trees before attempting the final (and most expensive) part of the compilation process.
• It is now possible to query the size of a junction tree (even before storage is allocated for the junction tree tables). See h_jt_get_total_size(117) and h_jt_get_total_cg_size(117).
• A function h_domain_is_triangulated(108) for testing whether a domain is triangulated is now provided.
• The HUGIN KB file format has changed (in order to handle HKB files produced by 64-bit versions of the HUGIN API, among other things). There are a few user-visible changes: If a compiled (but not compressed) domain is saved as an HKB file, it will only be triangulated when loaded. A compilation is required before inference can be performed (see Section 2.10). Compressed domains are still loaded as compressed (which implies compiled), but a propagation is required before beliefs can be retrieved.
• Functions for converting between table indexes and state configurations are now provided: h_table_get_index_from_configuration(71) and h_table_get_configuration_from_index(71).
• A function to retrieve the CG size of a table is provided: h_table_get_cg_size(74).
• h_domain_learn_tables(176) and h_domain_learn_class_tables(179) now report the log-likelihood to the log-file (if it is non-NULL) after each iteration of the EM algorithm.
• Direct access to the pseudorandom number generator implemented in HUGIN is now provided through the functions h_domain_get_uniform_deviate(146) and h_domain_get_normal_deviate(146).
• HUGIN API libraries for Windows platforms are now provided for Visual Studio .NET 2003 (in addition to Visual Studio 6.0).
• The HUGIN KB file format has changed (again), but version 5.2 of the HUGIN API will load HKB files produced by versions 3 or later (up to version 5.2). But note that support for older formats may be dropped in future versions of the HUGIN API.
• These changes affect all functions that enter, retract, or query (entered) evidence, as well as h_domain_uncompile(110) and the functions that perform implicit uncompile operations, with the exception of h_node_set_number_of_states(38), which still removes the entered evidence.
Bibliography
[1] S. K. Andersen, K. G. Olesen, F. V. Jensen, and F. Jensen. HUGIN – a shell for building Bayesian belief universes for expert systems. In Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, pages 1080–1085, Detroit, Michigan, Aug. 20–25, 1989. Reprinted in [33].
[2] A. Berry, J.-P. Bordat, and O. Cogis. Generating all the minimal separators of a graph. International Journal of Foundations of Computer Science, 11(3):397–403, Sept. 2000.
[3] V. Bouchitté and I. Todinca. Treewidth and minimum fill-in: Grouping the minimal separators. SIAM Journal on Computing, 31(1):212–232, July 2001.
[4] X. Boyen and D. Koller. Tractable inference for complex stochastic processes. In G. F. Cooper and S. Moral, editors, Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, pages 33–42, Madison, Wisconsin, July 24–26, 1998. Morgan Kaufmann, San Francisco, California.
[5] H. Chan and A. Darwiche. When do numbers really matter? Journal of Artificial Intelligence Research, 17:265–287, 2002.
[6] V. M. H. Coupé and L. C. van der Gaag. Properties of sensitivity analysis of Bayesian belief networks. Annals of Mathematics and Artificial Intelligence, 36(4):323–356, Dec. 2002.
[7] R. G. Cowell and A. P. Dawid. Fast retraction of evidence in a probabilistic expert system. Statistics and Computing, 2(1):37–40, Mar. 1992.
[8] R. G. Cowell, A. P. Dawid, S. L. Lauritzen, and D. J. Spiegelhalter. Probabilistic Networks and Expert Systems. Statistics for Engineering and Information Science. Springer-Verlag, New York, 1999.
[9] A. Darwiche. Modeling and Reasoning with Bayesian Networks. Cambridge University Press, Cambridge, UK, 2009.
[10] A. P. Dawid. Applications of a general propagation algorithm for probabilistic expert systems. Statistics and Computing, 2(1):25–36, Mar. 1992.
[11] F. Jensen and S. K. Andersen. Approximations in Bayesian belief universes for knowledge-based systems. In Proceedings of the Sixth Conference on Uncertainty in Artificial Intelligence, pages 162–169, Cambridge, Massachusetts, July 27–29, 1990.
[12] F. V. Jensen, B. Chamberlain, T. Nordahl, and F. Jensen. Analysis in HUGIN of data conflict. In P. P. Bonissone, M. Henrion, L. N. Kanal, and J. F. Lemmer, editors, Uncertainty in Artificial Intelligence, volume 6, pages 519–528. Elsevier Science Publishers, Amsterdam, The Netherlands, 1991.
[13] F. V. Jensen, S. L. Lauritzen, and K. G. Olesen. Bayesian updating in causal probabilistic networks by local computations. Computational Statistics Quarterly, 5(4):269–282, 1990.
[14] F. V. Jensen and T. D. Nielsen. Bayesian Networks and Decision Graphs. Information Science and Statistics. Springer-Verlag, New York, second edition, 2007.
[15] F. V. Jensen, K. G. Olesen, and S. K. Andersen. An algebra of Bayesian belief universes for knowledge-based systems. Networks, 20(5):637–659, Aug. 1990. Special Issue on Influence Diagrams.
[16] U. Kjærulff. Triangulation of graphs – algorithms giving small total state space. Research Report R-90-09, Department of Mathematics and Computer Science, Aalborg University, Denmark, Mar. 1990.
[17] U. Kjærulff. dHugin: a computational system for dynamic time-sliced Bayesian networks. International Journal of Forecasting, 11(1):89–111, Mar. 1995. Special issue on Probability Forecasting.
[18] U. B. Kjærulff and A. L. Madsen. Bayesian Networks and Influence Diagrams: A Guide to Construction and Analysis. Information Science and Statistics. Springer-Verlag, New York, second edition, 2013.
[19] D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques. Adaptive Computation and Machine Learning. MIT Press, Cambridge, Massachusetts, 2009.
[20] S. L. Lauritzen. Propagation of probabilities, means, and variances in mixed graphical association models. Journal of the American Statistical Association (Theory and Methods), 87(420):1098–1108, Dec. 1992.
[21] S. L. Lauritzen. The EM algorithm for graphical association models with missing data. Computational Statistics & Data Analysis, 19(2):191–201, Feb. 1995.
[22] S. L. Lauritzen, A. P. Dawid, B. N. Larsen, and H.-G. Leimer. Independence properties of directed Markov fields. Networks, 20(5):491–505, Aug. 1990. Special Issue on Influence Diagrams.
[23] S. L. Lauritzen and F. Jensen. Stable local computation with conditional Gaussian distributions. Statistics and Computing, 11(2):191–203, Apr. 2001.
[24] S. L. Lauritzen and D. Nilsson. Representing and solving decision problems with limited information. Management Science, 47(9):1235–1251, Sept. 2001.
[25] S. L. Lauritzen and D. J. Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society, Series B (Methodological), 50(2):157–224, 1988. Reprinted in [33].
[26] A. L. Madsen, F. Jensen, M. Karlsen, and N. Søndberg-Jeppesen. Bayesian networks with function nodes. In L. C. van der Gaag and A. J. Feelders, editors, PGM 2014, volume 8754 of LNAI, pages 286–301, Utrecht, The Netherlands, Sept. 17–19, 2014. Springer-Verlag.
[27] A. L. Madsen, F. Jensen, U. B. Kjærulff, and M. Lang. The HUGIN tool for probabilistic graphical models. International Journal on Artificial Intelligence Tools, 14(3):507–543, June 2005.
[28] A. L. Madsen, M. Lang, U. B. Kjærulff, and F. Jensen. The HUGIN tool for learning Bayesian networks. In T. D. Nielsen and N. L. Zhang, editors, ECSQARU 2003, volume 2711 of LNAI, pages 594–605, Aalborg, Denmark, July 2–5, 2003. Springer-Verlag.
[29] K. G. Olesen, S. L. Lauritzen, and F. V. Jensen. aHUGIN: A system creating adaptive causal probabilistic networks. In D. Dubois, M. P. Wellman, B. D'Ambrosio, and P. Smets, editors, Proceedings of the Eighth Conference on Uncertainty in Artificial Intelligence, pages 223–229, Stanford, California, July 17–19, 1992. Morgan Kaufmann, San Mateo, California.
[30] K. G. Olesen and A. L. Madsen. Maximal prime subgraph decomposition of Bayesian networks. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 32(1):21–31, Feb. 2002.
[31] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo, California, 1988.
[32] J. Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, Cambridge, UK, 2000.
[33] G. Shafer and J. Pearl, editors. Readings in Uncertain Reasoning. Morgan Kaufmann, San Mateo, California, 1990.
[34] K. Shoikhet and D. Geiger. A practical algorithm for finding optimal triangulations. In Proceedings of the Fourteenth National Conference on Artificial Intelligence, pages 185–190, Providence, Rhode Island, July 27–31, 1997. AAAI Press, Menlo Park, California.
[35] D. J. Spiegelhalter and S. L. Lauritzen. Sequential updating of conditional probabilities on directed graphical structures. Networks, 20(5):579–605, Aug. 1990. Special Issue on Influence Diagrams.
[36] P. Spirtes, C. Glymour, and R. Scheines. Causation, Prediction, and Search. Adaptive Computation and Machine Learning. MIT Press, Cambridge, Massachusetts, second edition, 2000.
[37] D. Vose. Risk Analysis: A Quantitative Guide. Wiley, Chichester, UK, second edition, 2000.
Index
h_equilibrium_t, 134
h_error_code, 18
h_error_compressed, 111
h_error_description, 19
h_error_fast_retraction, 136
h_error_inconsistency_or_underflow, 136
h_error_invalid_evidence, 137
h_error_io, 21
h_error_name, 19
h_error_no_memory, 21
h_error_none, 18
h_error_overflow, 136
h_error_t, 18
h_error_usage, 20
h_evidence_mode_t, 135
h_expression_clone, 84
h_expression_delete, 84
h_expression_get_boolean, 84
h_expression_get_label, 84
h_expression_get_node, 84
h_expression_get_number, 84
h_expression_get_operands, 84
h_expression_get_operator, 83
h_expression_is_composite, 83
h_expression_t, 78
h_expression_to_string, 86
h_index_t, 18
h_infinity, 90
h_jt_cg_evidence_is_propagated, 144
h_jt_equilibrium_is, 143
h_jt_evidence_is_propagated, 144
h_jt_evidence_mode_is, 144
h_jt_evidence_to_propagate, 144
h_jt_get_cliques, 116
h_jt_get_conflict, 139
h_jt_get_next, 116
h_jt_get_root, 117
h_jt_get_total_cg_size, 117
h_jt_get_total_size, 117
h_jt_likelihood_is_propagated, 144
h_jt_propagate, 137
h_jt_tables_to_propagate, 145
h_junction_tree_t, 116
h_node_get_belief, 125
h_node_get_beta, 42
h_node_get_case_state, 166
h_node_get_case_value, 167
h_node_get_category, 30
h_node_get_children, 36
h_node_get_distribution, 126
h_node_get_domain, 30
h_node_get_edge_constraint, 175
h_node_get_entered_finding, 129
h_node_get_entered_value, 129
h_node_get_entropy, 148
h_node_get_expected_utility, 127
h_node_get_experience_table, 160
h_node_get_fading_table, 161
h_node_get_first_attribute, 46
h_node_get_gamma, 42
h_node_get_home_class, 53
h_node_get_input, 58
h_node_get_instance, 57
h_node_get_instance_class, 55
h_node_get_junction_tree, 116
h_node_get_kind, 30
h_node_get_label, 209
h_node_get_master, 56
h_node_get_mean, 125
h_node_get_model, 87
h_node_get_mutual_information, 148
h_node_get_name, 42
h_node_get_next, 43
h_node_get_number_of_states, 39
h_node_get_output, 57
h_node_get_parents, 35
h_node_get_position, 210
h_node_get_predicted_belief, 67
h_node_get_predicted_mean, 68
h_node_get_predicted_value, 68
h_node_get_predicted_variance, 68
h_node_get_propagated_finding, 129
h_node_get_propagated_value, 130
h_node_get_requisite_ancestors, 37
h_node_get_requisite_parents, 36
h_node_get_sampled_state, 145
h_node_get_sampled_utility, 146
h_operator_multiply, 79
h_operator_negate, 79
h_operator_NegativeBinomial, 80
h_operator_node, 83
h_operator_NoisyOR, 80
h_operator_Normal, 80
h_operator_not, 83
h_operator_not_equals, 79
h_operator_number, 83
h_operator_or, 83
h_operator_PERT, 80
h_operator_Poisson, 80
h_operator_power, 79
h_operator_probability, 81
h_operator_sin, 82
h_operator_sinh, 82
h_operator_sqrt, 82
h_operator_state_index, 80
h_operator_subtract, 79
h_operator_t, 79
h_operator_tan, 82
h_operator_tanh, 82
h_operator_Triangular, 80
h_operator_truncate, 80
h_operator_Uniform, 80
h_operator_Weibull, 80
h_status_t, 17
h_string_parse_expression, 86
h_string_t, 17
h_subtype_boolean, 78
h_subtype_error, 78
h_subtype_interval, 78
h_subtype_label, 78
h_subtype_number, 78
h_table_delete, 73
h_table_get_cg_size, 74
h_table_get_configuration_from_index, 71
h_table_get_covariance, 73
h_table_get_data, 72
h_table_get_index_from_configuration, 71
h_table_get_mean, 73
h_table_get_nodes, 72
Linux, 2
Mac OS X, 2
Solaris, 2, 22–23
UTF-8, 45, 185, 196
Windows, 5–15, 23
Zlib, 2, 3