Probabilistic DMN

Probabilistic DMN, or pDMN for short, is another DMN extension that we have been working on. It extends DMN with probabilistic reasoning, to better express uncertainty in decision models. This page details pDMN’s syntax, shows some examples of probabilistic decision models and elaborates on how to use our pDMN solver.

1. pDMN Notation

There are three kinds of tables in pDMN: glossary tables, decision tables, and query tables.

1.1 Glossary

In DMN, all variables are either constants (e.g., “Age” and “Name”) or booleans (e.g., “Adult” or “Eligible”). pDMN extends these with n-ary functions and predicates: you can think of them as constants and booleans that range over one or more types, where a type is a domain of elements. Simply put, they allow you to express relations and functions over domains of values. These variables are declared in the type, predicate and function glossary tables. Like in cDMN, there is no specific syntax for these symbols: you simply write them as you like, and the pDMN solver automatically detects which types are used.

For example, if you want to create a decision model over two coinflips, you’ll want to create two booleans (which are just predicates without any types).

Predicate
Name
coinflip1
coinflip2

If you want to express knowledge on whether or not a person is vaccinated, you’d do so by declaring a type Person together with a 1-ary predicate denoting their vaccination status. Note the occurrence of Person in vaccine of Person – this ensures the solver knows that Person is an argument of the predicate.

Type
Name Elements
Person ann, bob

Predicate
Name
vaccine of Person

If we also want to know the specific type of vaccination (instead of true/false), we can introduce a type for the vaccines and use a function instead. You can think of a function as a mapping from the input argument(s) to the output argument. For example, the vaccine of Person function below maps each person (ann, bob) to a vaccine (a, b or n).

Type
Name Elements
Person ann, bob
Vaccine a, b, n

Function
Name Type
vaccine of Person Vaccine
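
To make the difference between a predicate and a function concrete, here is a minimal Python sketch of the glossary above. It is only an illustration of the idea, not how the pDMN solver stores symbols. The concrete assignments (`vaccinated`, `vaccine_of`) and the helper `is_total_function` are hypothetical names chosen for this example.

```python
# Types: each type is a finite domain of elements.
PERSON = {"ann", "bob"}
VACCINE = {"a", "b", "n"}

# A 1-ary predicate: true or false for every Person (hypothetical values).
vaccinated = {"ann": True, "bob": False}

# A 1-ary function: every Person is mapped to exactly one Vaccine
# (hypothetical values).
vaccine_of = {"ann": "a", "bob": "n"}


def is_total_function(mapping, domain, codomain):
    """Check that `mapping` assigns one codomain value to every domain element."""
    covers_domain = set(mapping) == set(domain)
    values_in_codomain = all(value in codomain for value in mapping.values())
    return covers_domain and values_in_codomain


print(is_total_function(vaccine_of, PERSON, VACCINE))
```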

1.2 Decision Table

Decision tables in pDMN behave mostly the same as in standard DMN. There are, however, three new additions:

  • Probabilities
  • new Ch(oice) hit policy
  • Quantification

1.2.1 Probabilities

One of pDMN’s main changes is allowing modelers to express probabilities in decision tables. Writing a probability in a decision table requires a change to the way the table is formatted: instead of writing the output values in the output cells, we write the values in a separate row and their probabilities in the cells. For example, if you want to express that there’s a 50% chance for a coin to land on its head, you can do so as follows:

h1
U heads
Yes
1 0.5

Or, maybe you want to express that there’s an 80% chance that your neighbour will call you when your house alarm goes off, and a 10% chance that they will call you if it doesn’t:

Calls
U alarm Calls
Yes
1 Yes 0.8
2 No 0.1
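
As a rough illustration of how the two rows of the Calls table combine, the Python sketch below applies the law of total probability. The table itself does not give a probability for the alarm, so `p_alarm` is a made-up input, and the function name `prob_calls` is hypothetical.

```python
def prob_calls(p_alarm: float) -> float:
    """Marginal probability that the neighbour calls.

    Row 1: alarm = Yes -> Calls with probability 0.8.
    Row 2: alarm = No  -> Calls with probability 0.1.
    """
    return 0.8 * p_alarm + 0.1 * (1.0 - p_alarm)


# 0.25 is an invented alarm probability, used only for illustration.
print(round(prob_calls(0.25), 3))
```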

1.2.2 Ch(oice) hit policy

Another new concept is the Ch(oice) hit policy. This hit policy denotes that the values of the output variable are mutually exclusive, so that only one value can be assigned at a time. Each of these values is given its own probability. For example, a die that is loaded to roll a specific value more often has different odds than a fair die. We can express this as follows:

Throwing Die
Ch loaded die value
one two three four five six
1 No 1/6 1/6 1/6 1/6 1/6 1/6
2 Yes 0.1 0.1 0.1 0.1 0.1 0.5

Here, die value is a constant representing either one, two, three, four, five or six.
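
Because the values in a Choice table are mutually exclusive, each row can be read as a probability distribution over the output values. The Python sketch below parses the two rows of the die table with exact fractions and checks their totals. Both rows happen to sum to exactly 1; whether the solver requires this is not stated here. The row labels `uniform` and `skewed` and the helper `row_distribution` are hypothetical.

```python
from fractions import Fraction

VALUES = ["one", "two", "three", "four", "five", "six"]

# Cell contents of the two rows, written as they appear in the table.
ROWS = {
    "uniform": ["1/6", "1/6", "1/6", "1/6", "1/6", "1/6"],
    "skewed": ["0.1", "0.1", "0.1", "0.1", "0.1", "0.5"],
}


def row_distribution(cells):
    """Turn one table row into a mapping from die value to exact probability."""
    return dict(zip(VALUES, (Fraction(cell) for cell in cells)))


for label, cells in ROWS.items():
    distribution = row_distribution(cells)
    # Print the row label and the total probability mass of the row.
    print(label, sum(distribution.values()))
```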

1.2.3 Quantification

Quantification allows us to express a rule for all elements of a specific type. Whenever you write a predicate or function that contains an argument, the table automatically quantifies over this argument. For example, the below table states that “Every person X has a 36% chance to have received vaccine a, a 63% chance to have received vaccine b and a 1% chance to have no vaccine.”

Vaccine
Ch vaccine of X
a b n
1 0.36 0.63 0.01

Basically, whenever you write a non-specific type (so not belonging to a specific domain element) in the column header, the pDMN solver will instantiate it as a “for all” expression.
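
The “for all” reading can be pictured as copying the column header once per element of the type. The Python sketch below does this for the header vaccine of X and the type Person. It only illustrates the idea and is not the solver's actual implementation; the helper `instantiate` is a hypothetical name.

```python
PERSON = ["ann", "bob"]


def instantiate(header, variable, elements):
    """Replace the quantification variable in a header by each domain element."""
    instances = []
    for element in elements:
        # Only replace whole words, so other words containing the letter are untouched.
        words = [element if word == variable else word for word in header.split()]
        instances.append(" ".join(words))
    return instances


print(instantiate("vaccine of X", "X", PERSON))
```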

1.3 Query

Typically, we will want our model to calculate the probability of a specific variable. To denote this in our models, we can use the Query table. Querying a predicate is done by adding it to the query table, either with specific type elements or with a quantification variable. To query a function, the cell should contain a string of the form “func_name(arg) = val”.

For example:

Query
heads

Query
vaccine of bob
X is infected

Query
die value = six

2. pDMN Examples

This section shows some pDMN examples. These are based on standard examples used in the world of probabilistic logic, and should exemplify the pDMN notation well.

2.1 Coinflips

Imagine you are given two coins, one with a 50% probability to land on heads and one with a 60% probability to land on heads. You are now asked to calculate the odds of at least one coin landing on heads, and the odds of both landing on heads.

Predicate
Name
heads1
heads2
twoHeads
someHeads

h1
U heads1
Yes
1 0.5

h2
U heads2
Yes
1 0.6

First, we declare four boolean variables in our glossary to respectively denote heads on coin1, heads on coin2, both coins flipping heads and at least one coin flipping heads. Next, we introduce two simple decision tables to express the odds of the coins landing on heads.

heads
U heads1 heads2 twoHeads someHeads
1 Yes Yes Yes Yes
2 Yes No No Yes
3 No Yes No Yes
4 No No No No

Query
twoHeads
someHeads

We can then add a simple, standard DMN table to define when twoHeads and someHeads are true, together with a query table to denote which probabilities we are interested in knowing.

By giving this model to the pDMN solver, we find the following odds: 80% chance of at least one coin flipping head and a 30% chance of both coins flipping heads.
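
These numbers can be checked by hand by enumerating the four possible outcomes of the two flips. The Python sketch below does exactly that, assuming the two coins are independent. It is a sanity check of the arithmetic, not the method used by the pDMN solver, and `query_probabilities` is a hypothetical helper name.

```python
from itertools import product

P_HEADS1 = 0.5
P_HEADS2 = 0.6


def query_probabilities():
    """Return (P(twoHeads), P(someHeads)) by summing over all four outcomes."""
    p_two_heads = 0.0
    p_some_heads = 0.0
    for heads1, heads2 in product([True, False], repeat=2):
        # Probability of this particular combination of outcomes.
        weight = (P_HEADS1 if heads1 else 1 - P_HEADS1) * (
            P_HEADS2 if heads2 else 1 - P_HEADS2
        )
        if heads1 and heads2:
            p_two_heads += weight
        if heads1 or heads2:
            p_some_heads += weight
    return p_two_heads, p_some_heads


p_two, p_some = query_probabilities()
print(round(p_two, 3), round(p_some, 3))
```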

2.2 Infections

In this example, we are tasked with modeling infections of a virus between people, based on which vaccine they were given. Each person has a 36% chance to have received vaccine a, a 63% chance to have received vaccine b, and a 1% chance to have received no vaccine at all. Depending on which vaccine a person has, contact with an infected person becomes riskier: 80% chance of infection if no vaccine, 10% for vaccine a and 20% for vaccine b. Given that ann, who is infected, contacted bob, what are the odds that bob is infected now too?

Type
Name Elements
Person ann, bob
Vaccine a, b, n

Predicate
Name
Person is infected
Person contacted Person

Function
Name Type
vaccine of Person Vaccine

First, we flesh out our glossary. In this case, it’s pretty easy: we have two “domains of values”, people and vaccines, so we create a type for both. Next, we want a way to describe that a person is infected, and whether two people had contact. Both of these concepts are either true or false, so we can use predicates. We also need a way to map each person to the vaccine that they have received, so we introduce a 1-ary function vaccine of Person for exactly that.

ann
U ann is infected
1 Yes

contact
U bob contacted ann
1 Yes

Vaccine
Ch vaccine of X
a b n
1 0.36 0.63 0.01

To start things off, we define that ann is infected, and that bob contacted ann. We also add a Choice table to denote the probabilities of having received the specific vaccines.

Infection
U X contacted Y Y is infected vaccine of X X is infected
Yes
1 Yes Yes n 0.8
2 Yes Yes a 0.1
3 Yes Yes b 0.2

Query
bob is infected

The above decision table expresses that “Every person X that came into contact with an infected person Y has a probability to also be infected, based on their vaccine”. If we then query the probability of bob being infected, we find a probability of 17%.
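
The 17% follows from weighting each infection probability by the probability of the corresponding vaccine. Since ann is infected and bob contacted ann with certainty, only bob's vaccine matters. The Python sketch below reproduces this arithmetic; it is a hand check, not the solver's own computation, and the dictionary names are hypothetical.

```python
# Probability that a person received each vaccine (n = no vaccine).
P_VACCINE = {"a": 0.36, "b": 0.63, "n": 0.01}

# Probability of infection after contact with an infected person, per vaccine.
P_INFECTED_GIVEN_VACCINE = {"a": 0.1, "b": 0.2, "n": 0.8}

# Law of total probability over bob's possible vaccines.
p_bob_infected = sum(
    P_VACCINE[v] * P_INFECTED_GIVEN_VACCINE[v] for v in P_VACCINE
)

print(round(p_bob_infected, 2))
```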

2.3 Smokers

In the Smokers example, we want to express that a person can start smoking due to two causes: they either smoke if they have stress, or smoke when they are influenced by another smoker. In turn, each person has a 30% chance of being under stress, and a 20% chance of being influenced by another person. Given two people, alice and bob, what are the odds of them smoking?

Type
Name Elements
Person alice, bob

Predicate
Name
Person has stress
Person smokes
Person influences Person

Like always, we start by declaring our variables in a glossary. In this case, we only have one type, namely Person. We then declare three predicates: to denote whether a person smokes, whether a person has stress, and whether a person influences another.

Influence
U X influences Y
Yes
1 0.2

Stress
U Person has stress
Yes
1 0.3

To denote the chances of people influencing each other and having stress, we use straightforward tables.

Smokes 1
U X has stress X smokes
1 Yes Yes

Smokes 2
U Y smokes Y influences X X smokes
1 Yes Yes Yes

Query
Person smokes

Using simple decision tables we can then express when people start smoking, and query the result. Based on this model, each person has a 34.2% chance to start smoking.
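
The 34.2% can be reproduced by brute force. The Python sketch below enumerates every combination of the random facts (stress for each person, influence for each ordered pair of people) and, in each combination, derives who smokes by applying the two Smokes tables until nothing changes. It rests on a few assumptions that are not spelled out above: the facts are independent, a person only smokes if one of the two rules derives it, and X and Y may refer to the same person. All function names are hypothetical.

```python
from itertools import product

PEOPLE = ["alice", "bob"]
P_STRESS = 0.3
P_INFLUENCES = 0.2

# Ordered pairs (y, x) meaning "y influences x"; includes y == x (assumption).
PAIRS = [(y, x) for y in PEOPLE for x in PEOPLE]


def smokers_in_world(stress, influences):
    """Apply both Smokes tables repeatedly until no new smoker is derived."""
    smokes = {p for p in PEOPLE if stress[p]}  # table "Smokes 1"
    changed = True
    while changed:
        changed = False
        for x in PEOPLE:
            if x in smokes:
                continue
            # Table "Smokes 2": some smoker y influences x.
            if any(y in smokes and influences[(y, x)] for y in PEOPLE):
                smokes.add(x)
                changed = True
    return smokes


def prob_smokes(person):
    """Sum the probability of every combination of facts in which `person` smokes."""
    total = 0.0
    for stress_bits in product([True, False], repeat=len(PEOPLE)):
        for influence_bits in product([True, False], repeat=len(PAIRS)):
            weight = 1.0
            for bit in stress_bits:
                weight *= P_STRESS if bit else 1 - P_STRESS
            for bit in influence_bits:
                weight *= P_INFLUENCES if bit else 1 - P_INFLUENCES
            stress = dict(zip(PEOPLE, stress_bits))
            influences = dict(zip(PAIRS, influence_bits))
            if person in smokers_in_world(stress, influences):
                total += weight
    return total


for person in PEOPLE:
    print(person, round(prob_smokes(person), 3))
```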

3. pDMN Solver

3.1 Installation

pDMN is available as a Python package and can be installed via pip:

$ pip3 install pDMN

3.2 Usage

The pDMN solver can currently only execute pDMN models that are modeled in Excel sheets. You can do this as follows:

$ pdmn name_of_file.xlsx -n name_of_sheet -x
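
If you prefer to call the solver from a script, the command above can be wrapped with Python's standard subprocess module. The sketch below only mirrors the documented invocation and adds no extra flags. The helper names `build_pdmn_command` and `run_pdmn` are hypothetical, and the assumption that the solver writes its results to standard output is ours.

```python
import shutil
import subprocess


def build_pdmn_command(xlsx_path, sheet_name):
    """Build the argument list for: pdmn <file> -n <sheet> -x"""
    return ["pdmn", xlsx_path, "-n", sheet_name, "-x"]


def run_pdmn(xlsx_path, sheet_name):
    """Run the solver and return whatever it printed (assumed to be on stdout)."""
    if shutil.which("pdmn") is None:
        raise RuntimeError("pdmn executable not found; install the package first")
    result = subprocess.run(
        build_pdmn_command(xlsx_path, sheet_name),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


print(build_pdmn_command("name_of_file.xlsx", "name_of_sheet"))
```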

Some example pDMN implementations are available in our GitLab repo.

Note on performance

Internally, the pDMN solver uses ProbLog to calculate probabilities. While this is a very powerful system, it can be slow on large problems. However, if you limit pDMN to just constants and booleans, it should be straightforward to write a much more efficient algorithm.