Bayesian Network – The Decision Lab

Controversies

One critique of Bayesian networks is that, because they are directed acyclic graphs, they cannot represent feedback loops. This limitation matters when the model is used to represent biological systems, because many of our bodies’ processes are regulated by feedback loops.

Homeostasis, our bodies’ regulation of their internal state, is one example of a biological feedback loop in which descendant nodes would influence parent nodes. For example, goosebumps are an effect of being cold. In a Bayesian network, feeling cold would be the parent node and goosebumps would be the descendant node. However, goosebumps in turn affect the likelihood that you are cold, since they help warm you up. A Bayesian network cannot represent this two-way relationship between cause and effect.13
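The acyclicity constraint can be made concrete with a short sketch. The Python below is illustrative only: the node names and the `has_cycle` helper are hypothetical and do not come from any Bayesian network library. It checks whether a set of directed edges contains a cycle, and shows that adding the feedback edge from goosebumps back to feeling cold violates the requirement.

```python
def has_cycle(edges):
    """Return True if the directed graph given as (parent, child) pairs has a cycle."""
    graph = {}
    for parent, child in edges:
        graph.setdefault(parent, []).append(child)
        graph.setdefault(child, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    state = {node: WHITE for node in graph}

    def visit(node):
        state[node] = GREY
        for nxt in graph[node]:
            if state[nxt] == GREY:
                return True  # reached a node already on the current path
            if state[nxt] == WHITE and visit(nxt):
                return True
        state[node] = BLACK
        return False

    return any(state[n] == WHITE and visit(n) for n in graph)


one_way = [("cold", "goosebumps")]
with_feedback = one_way + [("goosebumps", "cold")]

print(has_cycle(one_way))        # False: a valid structure for a Bayesian network
print(has_cycle(with_feedback))  # True: the feedback loop is not allowed
```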

There are other models that work differently from Bayesian networks, such as neural networks. Neural networks make no independence assumptions about their input variables, whereas a Bayesian network encodes specific conditional independence assumptions in its graph structure.14 Rather than relying on an explicitly specified set of probabilities, neural networks work by training the system to tell different inputs apart.

For example, if you wanted to create a program that can tell images of squares from images of circles, you would feed it many examples of circles and squares, each labeled as such. The machine would then, hopefully, learn by itself which properties to examine in order to categorize incoming shapes. Essentially, neural networks work from inputs to outputs, whereas Bayesian networks are often used to work from observed outcomes back to their likely causes.

Predicting Election Results

American statistician Nate Silver rose to fame after he correctly predicted not only that Barack Obama would win the 2012 U.S. presidential election, but also the outcome in every single state.15 How did this previously unknown blogger make such accurate predictions, even when the media claimed the race was about even? He did it with the help of Bayesian networks.

The U.S. presidential election is hierarchical, which makes it a good fit for a Bayesian network, in which parent nodes influence descendant nodes but not vice versa. To win the election, a candidate must win a majority of electoral votes, which are awarded state by state. State results are therefore the parent nodes that influence the descendant node: the outcome of the election.
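As a rough sketch of this parent-to-descendant structure, the Python below combines per-state win probabilities into a probability for the national outcome. Everything in it is hypothetical: the three states, their electoral votes, and their probabilities are made up, and the states are assumed to be independent of one another, which a real forecasting model would not assume. It is not a reconstruction of Silver's method.

```python
from itertools import product

# Hypothetical toy contest: three states with made-up electoral votes and
# made-up probabilities that candidate A carries each one.
states = {
    "State X": {"votes": 10, "p_win": 0.9},
    "State Y": {"votes": 6, "p_win": 0.5},
    "State Z": {"votes": 5, "p_win": 0.2},
}
total_votes = sum(s["votes"] for s in states.values())
needed = total_votes // 2 + 1  # simple majority of electoral votes

p_election = 0.0
names = list(states)
# Enumerate every combination of state results (parent nodes) and add up the
# probability of those in which candidate A reaches a majority (descendant node).
for outcome in product([True, False], repeat=len(names)):
    prob = 1.0
    votes = 0
    for name, won in zip(names, outcome):
        p = states[name]["p_win"]
        prob *= p if won else 1 - p
        if won:
            votes += states[name]["votes"]
    if votes >= needed:
        p_election += prob

print(f"P(candidate A wins) = {p_election:.3f}")
```

Information flows in one direction only: changing a state's probability changes the national figure, but nothing in the structure lets the national outcome feed back into a state.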

Silver collected polling data for months before the vote on how people intended to vote. Of course, there can always be discrepancies between how people think they will vote and how they actually vote. That did not pose a problem for Silver, because Bayes’ theorem allows a hypothesis to be revised as new information comes in.

Silver started off with a ‘nowcast’, which estimated the probability of each state's outcome if voting were to happen on any given day. Several variables fed into this estimate: the socioeconomic status of each state's population, its racial makeup, and its voting history, among others. These variables gave Silver an initial prediction of who would win each state. Then, as time went on, Silver incorporated new incoming data. For example, if unemployment rates changed in a state, he treated that as a factor and updated his predictions.15
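The updating step can be illustrated with Bayes' theorem applied to a single yes/no hypothesis. The sketch below is a simplified illustration, not Silver's actual model: the starting belief, the two likelihoods, and the `update` helper are all hypothetical, and the polls are assumed to be independent of one another.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' theorem for a yes/no hypothesis: return P(hypothesis | evidence)."""
    numerator = prior * p_evidence_if_true
    evidence = numerator + (1 - prior) * p_evidence_if_false
    return numerator / evidence


# Hypothetical numbers, chosen only for illustration.
belief = 0.60  # initial estimate that the candidate wins the state

# Each new poll showing the candidate ahead is assumed to be more likely
# if the candidate really will win (0.8) than if the candidate will lose (0.3).
for poll in range(1, 4):
    belief = update(belief, 0.8, 0.3)
    print(f"after poll {poll}: {belief:.3f}")
```

With these made-up numbers, the belief rises from 0.60 to about 0.80 after the first favorable poll and keeps climbing with each further one, which mirrors how accumulating polling data firms up a forecast.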

Silver estimated the probability that Obama would win at different points throughout the election period. As election day grew nearer, more and more polling data emerged, increasing Silver's confidence in his predictions. By mapping all of these variables onto Bayesian networks, Silver was able to correctly predict the outcome of the 2012 election.15

Medical Diagnosis Uncertainty

Unfortunately, diagnostic tests are never 100% accurate. Thankfully, Bayesian networks can account for this uncertainty. In a Bayesian network, the test result is not the only variable that matters for a diagnosis: the frequency of false positives and false negatives also influences how likely the diagnosis is to be correct.
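A small worked example shows how much false positives can matter. The sketch below applies Bayes' theorem to a single test result; the function name and all of the numbers (prevalence, sensitivity, specificity) are hypothetical and chosen only for illustration.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """P(condition | positive test) from Bayes' theorem."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical test: 1% of people have the condition, the test detects 95% of
# real cases (sensitivity) and correctly clears 95% of healthy people (specificity).
p = posterior_given_positive(prevalence=0.01, sensitivity=0.95, specificity=0.95)
print(f"P(condition | positive) = {p:.3f}")
```

Under these assumptions, a positive result implies only about a 16% chance of actually having the condition, because false positives among the many healthy people outnumber true positives among the few who are ill.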

Bayesian networks can also help produce more accurate estimates of COVID-19 infection and mortality rates. A group of researchers conducted a study suggesting that globally reported COVID-19 statistics do not take the uncertainty of the data into account.17 These statistics simply use the number of people who tested positive as the infection rate figure.

Using a Bayesian network, the researchers estimated how often positive and negative test results were actually false and adjusted the infection rate accordingly. Different tests have different accuracy rates, which means that the variable for whether or not someone has COVID-19 does not depend solely on the test result.
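To give a feel for this kind of adjustment, the sketch below uses a standard textbook correction, the Rogan-Gladen estimator, which is a simpler technique than the researchers' Bayesian network and is not taken from their study. The function names and figures are hypothetical, and the correction only addresses test accuracy among the people tested, not who gets tested in the first place.

```python
def apparent_rate(true_rate, sensitivity, specificity):
    """Share of tests expected to come back positive, true and false combined."""
    return true_rate * sensitivity + (1 - true_rate) * (1 - specificity)


def corrected_rate(apparent, sensitivity, specificity):
    """Invert apparent_rate to estimate the true infection rate (Rogan-Gladen)."""
    denominator = sensitivity + specificity - 1
    if denominator <= 0:
        raise ValueError("the test must perform better than chance")
    estimate = (apparent + specificity - 1) / denominator
    return min(1.0, max(0.0, estimate))  # keep the estimate a valid proportion


# Hypothetical figures: 10% of tests are positive, the test has 70% sensitivity
# (it misses 30% of real infections) and 99% specificity.
estimate = corrected_rate(0.10, sensitivity=0.70, specificity=0.99)
print(f"estimated true infection rate: {estimate:.3f}")
```

With these made-up figures, the corrected rate (about 13%) is higher than the raw 10% of positive tests, because missed infections outweigh false alarms.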

Figuring out the rates of false positives and false negatives is also important for determining fatality rates. If someone died and had previously tested positive for COVID-19, the likelihood that the test result was accurate increases. By employing a Bayesian network model, the researchers concluded that infection rates are actually higher than popular statistics suggest, but mortality rates are lower than reported.17
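The effect of the additional evidence can be sketched with a tiny three-node network in which infection influences both the test result and the risk of death. The structure, the probability tables, and the `p_infected_given` helper below are hypothetical simplifications for illustration, not the researchers' model or data.

```python
# Hypothetical conditional probability tables for a three-node network:
# Infected -> TestPositive and Infected -> Died.
P_INFECTED = 0.05
P_POSITIVE = {True: 0.80, False: 0.02}  # P(test positive | infected?)
P_DIED = {True: 0.02, False: 0.005}     # P(died | infected?)


def p_infected_given(positive, died=None):
    """P(infected | evidence), found by enumerating both values of the hidden node."""
    weights = {}
    for infected in (True, False):
        w = P_INFECTED if infected else 1 - P_INFECTED
        w *= P_POSITIVE[infected] if positive else 1 - P_POSITIVE[infected]
        if died is not None:  # only condition on death when it is observed
            w *= P_DIED[infected] if died else 1 - P_DIED[infected]
        weights[infected] = w
    return weights[True] / (weights[True] + weights[False])


test_only = p_infected_given(positive=True)
test_and_death = p_infected_given(positive=True, died=True)
print(f"P(infected | positive test)        = {test_only:.3f}")
print(f"P(infected | positive test, death) = {test_and_death:.3f}")
```

With these made-up tables, observing the death raises the probability that the positive test was a true positive from roughly 0.68 to roughly 0.89.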