
Machine Learning

BBN: Bayesian Belief Networks — How to Build Them Effectively in Python

A detailed explanation of Bayesian Belief Networks using real-life data to build a model in Python

Image by Gerd Altmann from Pixabay

Intro

Most of you may already be familiar with the Naive Bayes algorithm, a fast and simple modeling technique used in classification problems. While it is widely used due to its speed and relatively good performance, Naive Bayes is built on the assumption that all model features are independent of one another given the class, which is often not true in reality.

In some cases, you may want to build a model where you can specify which variables are dependent, independent, or conditionally independent (this is explained in the next section). You may also want to track in real time how event probabilities change as new evidence is introduced to the model.

This is where Bayesian Belief Networks come in handy: they let you construct a model out of nodes and directed edges that clearly outline the relationships between variables.

Contents

The category of algorithms Bayesian Belief Networks (BBN) belong to
Introduction to Bayesian Belief Networks (BBN) and Directed Acyclic Graphs (DAG)
Bayesian Belief Network Python example using real-life data
– Directed Acyclic Graph for weather prediction
– Data and Python library setup
– BBN setup
– Using BBN for predictions
Conclusions

What category of algorithms do Bayesian Belief Networks (BBN) belong to?

Technically, there is no training happening within a BBN. We simply define how the different nodes in the network are linked together and then observe how the probabilities change after passing evidence into specific nodes. Hence, I have put Probabilistic Graphical Models into their own category (see below).

As a side note, I have put Neural Networks in a category of their own due to their unique approach to Machine Learning. However, they can be used to solve a wide range of problems, including but not limited to classification and regression.

Machine Learning algorithm classification. Interactive chart created by the author.


Bayesian Belief Networks (BBN) and Directed Acyclic Graphs (DAG)

A Bayesian Belief Network (BBN) is a Probabilistic Graphical Model (PGM) that represents a set of variables and their conditional dependencies via a Directed Acyclic Graph (DAG).

To understand what this means, let’s draw a DAG and analyze the relationship between different nodes.

Directed Acyclic Graph (DAG). Image by author.

Using the above, we can state the relationship between variables (nodes):

Independence: A and C are independent of each other. So are B and C. This is because knowing whether C has happened does not change our knowledge about A or B and vice versa.
Dependence: B is dependent on A since A is the parent of B. This relationship can be written as a conditional probability: P(B|A). D is also dependent on other variables; in this case, it depends on two of them, B and C. Again, this can be written as a conditional probability: P(D|B,C).
Conditional Independence: D is conditionally independent of A given B. This is because as soon as we know whether event B has happened, A becomes irrelevant from the perspective of D. In other words, the following is true: P(D|B,A) = P(D|B).
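
The three statements above can be checked numerically by brute-force enumeration. The following is only a minimal sketch: it assumes the DAG has the edges A → B, B → D and C → D, and every probability in it is made up purely for illustration (the names `joint`, `prob` and `cond` are hypothetical helpers, not part of any library).

```python
from itertools import product

# Hypothetical CPTs for the DAG A -> B -> D <- C (illustrative numbers only).
P_A = {True: 0.3, False: 0.7}
P_C = {True: 0.6, False: 0.4}
P_B_GIVEN_A = {True: 0.8, False: 0.1}                      # P(B=True | A)
P_D_GIVEN_BC = {(True, True): 0.9, (True, False): 0.7,
                (False, True): 0.4, (False, False): 0.05}  # P(D=True | B, C)


def bernoulli(p_true, value):
    """Probability of `value` for a binary variable with P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(a, b, c, d):
    """Joint probability from the DAG factorisation P(A) P(C) P(B|A) P(D|B,C)."""
    return (P_A[a] * P_C[c]
            * bernoulli(P_B_GIVEN_A[a], b)
            * bernoulli(P_D_GIVEN_BC[(b, c)], d))


def prob(**fixed):
    """Sum the joint over every assignment consistent with the fixed values."""
    total = 0.0
    for a, b, c, d in product([True, False], repeat=4):
        world = {"a": a, "b": b, "c": c, "d": d}
        if all(world[name] == value for name, value in fixed.items()):
            total += joint(a, b, c, d)
    return total


def cond(target, given):
    """Conditional probability P(target | given)."""
    return prob(**target, **given) / prob(**given)


p_d_given_b = cond({"d": True}, {"b": True})
p_d_given_b_a = cond({"d": True}, {"b": True, "a": True})  # same as above
p_a_given_c = cond({"a": True}, {"c": True})               # same as P(A)
p_d_given_a = cond({"d": True}, {"a": True})               # differs from P(D)
p_d = prob(d=True)
```

With these numbers, P(D|B,A) matches P(D|B) and P(A|C) matches P(A), while P(D|A) differs from P(D): A still influences D, but only through B.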

Bayesian Belief Network Python example using real-life data

Directed Acyclic Graph for weather prediction

Let’s use Australian weather data to build a BBN. This will enable us to predict if it will rain tomorrow based on a few weather observations from today.

First, let’s take a look at a DAG before we go through the details of how to build it. Note, I have displayed probabilities for all the different event combinations. You will see how we calculate these using our weather data in the following few sections.

Directed Acyclic Graph (DAG) for a Bayesian Belief Network (BBN) to forecast whether it will rain tomorrow. Image by author.

Data and Python library setup

We will use the following data and libraries:

Let’s import all the libraries:

Then we get the Australian weather data from Kaggle, which you can download following this link: https://www.kaggle.com/jsphyg/weather-dataset-rattle-package.

We ingest the data and derive a few new variables for usage in the model.
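
One of the derived variables is a banded version of humidity. As a rough sketch of the idea only (the helper name `humidity_band` and the sample readings are made up, and the real preparation code may well differ), a numeric reading can be mapped to the two categories used later like this:

```python
def humidity_band(value, threshold=60):
    """Map a numeric humidity reading to the '<=60' / '>60' categories."""
    return f"<={threshold}" if value <= threshold else f">{threshold}"


readings = [45, 60, 72]  # made-up sample values
bands = [humidity_band(r) for r in readings]
print(bands)  # ['<=60', '<=60', '>60']
```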

Here is a snapshot of the data:

A snippet of Kaggle’s Australian weather data with some modifications. Image by the author.

Setting up Bayesian Belief Network

Now that we have all the libraries and data ready, it is time to set up a BBN. The first stage requires us to define nodes.

A few things to note:

Probabilities here are normalized frequencies of the variable categories from the data. E.g., the “H9am” variable has 43,594 observations where the value is ≤60 and 98,599 observations where the value is >60.

Variable value counts. Image by author.

While I have used normalized frequencies (probabilities), it also works if you put actual frequencies instead. In that case, your code would look like this: H9am = BbnNode(Variable(0, 'H9am', ['<=60', '>60']), [43594, 98599]).
For child nodes, like “Humidity3pmCat”, which has a parent “Humidity9amCat”, we need to provide probabilities (or frequencies) for each combination as shown in the DAG (note each row adds up to 1):

“Humidity3pmCat” normalized frequencies (probabilities). Image by author.

You can do this by calculating probabilities/frequencies of “H3pm” twice: the first time on the subset of data where “H9am”≤60 and the second time on the subset where “H9am”>60.
Since calculating frequencies one at a time is time-consuming, I have written a short function that gives us what we need.
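
To make the calculation concrete, here is a minimal standard-library sketch of such a helper. It is not the function used in this article: the names `normalise` and `cpt_from_records` are hypothetical, and the six-row sample is made up. Only the two “H9am” counts are taken from the data described above.

```python
from collections import Counter
from itertools import product


def normalise(counts):
    """Turn raw counts into probabilities that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def cpt_from_records(records, child, parents, states):
    """Flattened conditional probabilities of `child` for every parent combination.

    Rows follow the order of the parent states listed in `states`.
    A combination with no observations raises ZeroDivisionError.
    """
    counts = Counter(
        (tuple(r[p] for p in parents), r[child]) for r in records
    )
    table = []
    for combo in product(*(states[p] for p in parents)):
        row = [counts[(combo, s)] for s in states[child]]
        table.extend(normalise(row))
    return table


# Root node: the "H9am" counts quoted above.
h9am_probs = normalise([43594, 98599])

# Child node: a tiny made-up sample, just to show the shape of the result.
states = {"H9am": ["<=60", ">60"], "H3pm": ["<=60", ">60"]}
sample = [
    {"H9am": "<=60", "H3pm": "<=60"},
    {"H9am": "<=60", "H3pm": "<=60"},
    {"H9am": "<=60", "H3pm": "<=60"},
    {"H9am": "<=60", "H3pm": ">60"},
    {"H9am": ">60", "H3pm": "<=60"},
    {"H9am": ">60", "H3pm": ">60"},
]
h3pm_probs = cpt_from_records(sample, "H3pm", ["H9am"], states)
print(h3pm_probs)  # [0.75, 0.25, 0.5, 0.5]
```

Each pair of values in the result is one row of the table (one parent state), which is why every pair adds up to 1.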

So, instead of manually typing in all the probabilities, let’s use the above function. At the same time, we will create an actual network:

Note, if you are working with a small data sample, there is a risk of some event combinations not being present. In such a scenario, you would get a “list index out of range” error. A solution could be to expand your data to include all possible event combinations, or to identify the missing combinations and add them in.
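
Identifying the missing combinations is straightforward. The sketch below uses a hypothetical helper, `missing_combinations`, and a made-up three-row sample; how you then fill the gaps (extra data, a small pseudo-count, and so on) is a modeling decision that it deliberately leaves open.

```python
from itertools import product


def missing_combinations(records, variables, states):
    """Return the state combinations of `variables` that never occur in `records`."""
    seen = {tuple(r[v] for v in variables) for r in records}
    expected = product(*(states[v] for v in variables))
    return [combo for combo in expected if combo not in seen]


states = {"H9am": ["<=60", ">60"], "H3pm": ["<=60", ">60"]}
small_sample = [
    {"H9am": "<=60", "H3pm": "<=60"},
    {"H9am": ">60", "H3pm": ">60"},
    {"H9am": ">60", "H3pm": "<=60"},
]
gaps = missing_combinations(small_sample, ["H9am", "H3pm"], states)
print(gaps)  # [('<=60', '>60')]
```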

Now, we want to draw the graph to check that we have set it up as intended:

Here is the resulting graph, which matches our intended design:

Directed Acyclic Graph (DAG) for weather prediction BBN. Image by author.
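
If you prefer a check that does not rely on a plot, the structure can also be verified in text form. This is a sketch under assumptions: the edge from “H9am” to “H3pm” is stated earlier in the article, while the two edges into “RT” (from “H3pm” and “W”) are assumed here. It uses `graphlib` from the Python standard library (3.9+), and `topological_order` is a hypothetical helper name.

```python
from graphlib import CycleError, TopologicalSorter

# Maps each node to its set of parents. H9am -> H3pm is stated in the text;
# the two edges into RT are an assumption about the network's layout.
parents = {
    "H9am": set(),
    "W": set(),
    "H3pm": {"H9am"},
    "RT": {"H3pm", "W"},
}


def topological_order(parent_map):
    """Return nodes with parents before children, or raise ValueError on a cycle."""
    try:
        return list(TopologicalSorter(parent_map).static_order())
    except CycleError as exc:
        raise ValueError(f"graph is not acyclic: {exc.args[1]}") from exc


order = topological_order(parents)
print(order)  # every parent appears before its children
```

If an edge were accidentally added in the wrong direction and created a loop, the function would raise an error instead of returning an ordering.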

Using BBN for predictions

With our model ready, we can use it to predict whether it will rain tomorrow.

First, let’s plot probabilities for each node without passing any additional information to the graph. Note, I have set up a simple function so we don’t have to retype the same code later on, as we will want to regenerate the results multiple times.

The above code prints the following:

Original BBN probabilities. Image by author.

As you can see, this gives us the likelihood of each event occurring with a “Rain Tomorrow (RT)” probability of 22%. While this is cool, we could have got the same 22% probability by looking at the frequency of the “RainTomorrow” variable in our original dataset.

That said, the following step is where we get a lot of value out of our BBN. We can pass evidence into the BBN and see how that affects the probabilities for every node in the network.

Let’s say it is 9 am right now, and we have measured the humidity outside. The reading is 72, which falls into the “>60” band. Hence, let’s pass this evidence into the BBN and see what happens. Note, I have created another small function to help us with that.

This gives us the following results:

BBN probabilities with “H9am” evidence. Image by author.

As you can see, “Humidity9am>60” is now equal to 100%, and the likelihood of “Humidity3pm>60” has increased from 32.8% to 44.2%. At the same time, the chance of “RainTomorrow” has gone up to 26.1%.

Also, note how probabilities for “WindGustSpeed” did not change since “W” and “H9am” are independent of each other.

You can run the same evidence code one more time to remove the evidence from the network. After that, let’s pass two pieces of evidence for “H3pm” and “W.”

And here are the results:

BBN probabilities with “H3pm” and “W” evidence. Image by author.

Unsurprisingly, this tells us that the chance of rain tomorrow has gone up to 67.8%. Note how “H9am” probabilities also changed, which tells us that despite us only measuring humidity at 3 pm, we are 93% certain that humidity was also above 60 at 9 am this morning.
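
The 93% figure can be sanity-checked by hand with Bayes’ rule, using only numbers already quoted in this article. The sketch assumes that “W” is connected to the humidity nodes only through the unobserved “RT” node, so the wind evidence does not affect “H9am”. Because the inputs are rounded, the result is approximate.

```python
# Figures quoted in the article (rounded, so the result is approximate).
p_h9am_high = 98599 / (43594 + 98599)  # P(H9am > 60), from the value counts
p_h3pm_high = 0.328                    # P(H3pm > 60) before any evidence
p_h3pm_high_given_h9am_high = 0.442    # P(H3pm > 60 | H9am > 60)

# Bayes' rule: P(H9am > 60 | H3pm > 60)
#   = P(H3pm > 60 | H9am > 60) * P(H9am > 60) / P(H3pm > 60)
posterior = p_h3pm_high_given_h9am_high * p_h9am_high / p_h3pm_high
print(f"{posterior:.1%}")  # roughly 93%
```

This is the same reasoning the network applies automatically: evidence on a child node flows back up the edge and updates the belief about its parent.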

Conclusions

There are many use cases for Bayesian Belief Networks, from helping to diagnose diseases to real-time predictions of a race outcome.

You can also build BBNs to help you with marketing decisions. Say, I may want to know how likely this article is to reach 10K views. Hence, I can build a BBN to tell me the probability of certain events occurring, such as posting a link to this article on Twitter and then evaluating how this probability changes as I get ten retweets.

At the end of the day, the possibilities are almost limitless, with the ability to generate real-time predictions that automatically update the entire network as soon as new evidence is introduced.

I hope you found Bayesian Belief Networks just as exciting as I did. Feel free to reach out if you have any questions or suggestions. Thanks for reading!

Cheers! 👏
Saul Dobilas