The Bayesia Portfolio of Research Software
BayesiaLab 5.1
Bayesia Market Simulator 1.6
BEKEE 2.0
Bayesia Engine API
www.bayesia.us
Table of Contents
Introduction
Framework: The Bayesian Network Paradigm
Acyclic Graphs & Bayes’s Rule
Compact Representation of the Joint Probability Distribution
BayesiaLab 5.1
Executive Summary
Select Client List
Conceptual Highlights
Expert Knowledge Modeling
Knowledge Discovery with Machine Learning
Knowledge Unification
Reasoning Under Uncertainty
Discrete, Nonlinear and Nonparametric Modeling
Key Functions
Unsupervised Structural Learning
Supervised Learning
Clustering
Observational Inference
Causal Inference
Diagnosis, Prediction and Simulation
Effects Analysis
Analyzing Observational Studies
Optimization
Bayesia Market Simulator 1.6
Motivation
Bayesian Networks for Choice Modeling
Bayesia Market Simulator
 www.bayesia.us | www.bayesia.sg
BEKEE 2.0
Motivation
Bayesia Expert Knowledge Elicitation Environment (BEKEE)
Bayesia Engines
Bayesia Engine API
References
Contact Information
Bayesia USA
Bayesia Singapore Pte. Ltd.
Bayesia S.A.S.
Copyright
Introduction
The Bayesia portfolio of research software is the result of over 20 years of continuous research and development by two French professors in the field of artificial intelligence, Dr. Lionel Jouffe and Dr. Paul Munteanu. Their team of computer scientists and software developers at Bayesia S.A.S. has embraced the Bayesian networks paradigm and built tools for making it accessible to a broad audience and practical for a wide range of research tasks.
The idea of Bayesian networks dates back to the mid-1980s, when Professor Judea Pearl of UCLA began to formalize their semantics in a series of seminal works. The study of Bayesian networks has since grown into a large body of work with dozens of books and countless scientific papers exploring all their properties.
However, thanks to Bayesia’s software tools, and the ever-increasing power of computers, Bayesian networks have become powerful and practical tools well beyond the world of academia. For applied research in all domains, Bayesian networks can facilitate deep understanding of very complex, high-dimensional problem domains. Their computational efficiency and inherently visual structure make Bayesian networks attractive for exploring and explaining complex domains. Most importantly, Bayesian networks allow reasoning about such domains in a formally correct yet highly intuitive way.
[Figure: diagram with the labels Expert Knowledge, Data, Knowledge Modeling, Knowledge Discovery, Bayesian Network, Decision Support, Analytics, Simulation, Risk Management, Optimization, Diagnosis]
Framework: The Bayesian Network Paradigm¹
Acyclic Graphs & Bayes’s Rule
Probabilistic models based on directed acyclic graphs have a long and rich tradition, beginning with the
work of geneticist Sewall Wright in the 1920s. Variants have appeared in many fields. Within statistics, such
models are known as directed graphical models; within cognitive science and artificial intelligence, such
models are known as Bayesian networks. The name honors the Rev. Thomas Bayes (1702-1761), whose
rule for updating probabilities in the light of new evidence is the foundation of the approach.
Rev. Bayes addressed both the case of discrete probability distributions of data and the more complicated
case of continuous probability distributions. In the discrete case, Bayes’ theorem relates the conditional and
marginal probabilities of events A and B, provided that the probability of B does not equal zero:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
In Bayes’ theorem, each probability has a conventional name:
• P(A) is the prior probability (or “unconditional” or “marginal” probability) of A. It is “prior” in the
sense that it does not take into account any information about B; however, the event B need not occur
after event A. In the nineteenth century, the unconditional probability P(A) in Bayes’s rule was called the
“antecedent” probability; in deductive logic, the antecedent set of propositions and the inference rule
imply consequences. The unconditional probability P(A) was called “a priori” by Ronald A. Fisher.
• P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is
derived from or depends upon the specified value of B.
• P(B|A) is the conditional probability of B given A. It is also called the likelihood.
• P(B) is the prior or marginal probability of B, and acts as a normalizing constant.
Bayes’ theorem in this form gives a mathematical representation of how the conditional probability of event A given B is related to the converse conditional probability of B given A.
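As a worked illustration of the rule above, the following Python sketch computes a posterior for a hypothetical diagnostic test. The function name `posterior` and all numbers (a 1% prior, a 95% true-positive rate, a 10% false-positive rate) are assumptions chosen for illustration; none of them come from the source.

```python
def posterior(prior_a: float, p_b_given_a: float, p_b_given_not_a: float) -> float:
    """Return P(A|B) from P(A), P(B|A) and P(B|not A) using Bayes' theorem."""
    # P(B) follows from the law of total probability; it is the normalizing constant.
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    if p_b == 0.0:
        raise ValueError("P(B) must not be zero")
    return p_b_given_a * prior_a / p_b


# Hypothetical numbers: A = "disease present", B = "test positive".
p = posterior(prior_a=0.01, p_b_given_a=0.95, p_b_given_not_a=0.10)
print(f"P(A|B) = {p:.4f}")
```

With these assumed inputs the posterior stays below 9%, because the low prior P(A) dominates the fairly strong likelihood P(B|A).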
The initial development of Bayesian networks in the late 1970s was motivated by the need to model the top-down (semantic) and bottom-up (perceptual) combination of evidence in reading. The capability for bidirectional inferences, combined with a rigorous probabilistic foundation, led to the rapid emergence of Bayesian networks as the method of choice for uncertain reasoning in AI and expert systems, replacing earlier, ad hoc rule-based schemes.
¹ Adapted from Pearl (2000), used with permission.
The nodes in a Bayesian network represent variables of interest (e.g. the temperature of a device, the gender of a patient, a feature of an object, the occurrence of an event) and the links represent statistical (informational) or causal dependencies among the variables. The dependencies are quantified by conditional probabilities for each node given its parents in the network. The network supports the computation of the posterior probabilities of any subset of variables given evidence about any other subset.
Compact Representation of the Joint Probability Distribution
“The central paradigm of probabilistic reasoning is
to identify all relevant variables x1, . . . , xN in the
environment [i.e. the domain under study], and
make a probabilistic model p(x1, . . . , xN) of their interaction [i.e. represent the variables’ joint probability
distribution].”
Bayesian networks are very attractive for this purpose as they can, by means of factorization, compactly
represent the joint probability distribution of all variables.
“Reasoning (inference) is then performed by introducing evidence that sets variables in known states, and
subsequently computing probabilities of interest, conditioned on this evidence. The rules of probability,
combined with Bayes’ rule make for a complete reasoning system, one which includes traditional deductive
logic as a special case.” (Barber, 2012)
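The factorization and the reasoning step described above can be sketched in a few lines of Python. The three-node network (Rain, Sprinkler, WetGrass), its probabilities and the function names are hypothetical and serve only to illustrate the idea; inference is done by brute-force enumeration, which is not necessarily how BayesiaLab computes posteriors.

```python
from itertools import product

# Hypothetical network: Rain -> WetGrass <- Sprinkler (all variables binary).
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {True: 0.1, False: 0.9}
# P(WetGrass=True | Rain, Sprinkler)
P_WET = {(True, True): 0.99, (True, False): 0.90,
         (False, True): 0.80, (False, False): 0.05}


def joint(rain: bool, sprinkler: bool, wet: bool) -> float:
    """p(rain, sprinkler, wet) as the product of one CPT entry per node."""
    p_wet = P_WET[(rain, sprinkler)]
    return P_RAIN[rain] * P_SPRINKLER[sprinkler] * (p_wet if wet else 1.0 - p_wet)


def posterior_rain(wet: bool) -> float:
    """P(Rain=True | WetGrass=wet), summing the joint over Sprinkler."""
    numerator = sum(joint(True, s, wet) for s in (True, False))
    evidence = sum(joint(r, s, wet) for r, s in product((True, False), repeat=2))
    return numerator / evidence


total = sum(joint(r, s, w) for r, s, w in product((True, False), repeat=3))
print(f"joint sums to {total:.6f}")
print(f"P(Rain | wet grass) = {posterior_rain(True):.4f}")
```

The compactness comes from the local tables: N binary nodes with at most k parents each need at most N·2^k probabilities, whereas an unfactorized joint distribution needs 2^N − 1.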
BayesiaLab 5.1
Executive Summary
BayesiaLab is a powerful desktop application (Windows/Mac/Unix) for knowledge management, data mining, analytics, predictive modeling and simulation, all based on the paradigm of Bayesian networks. Bayesian networks have become a very powerful tool for deep understanding of very complex, high-dimensional problem domains, ranging from bioinformatics to marketing science.
BayesiaLab is the world’s only comprehensive software package for generating, manipulating and analyzing Bayesian networks.
Analysts and researchers around the world, including Bayesia’s strategic partner P&G, have embraced BayesiaLab to gain unprecedented insights into problems which had previously not been tractable with traditional analysis methods.
The latest version of BayesiaLab, 5.1, is the result of nearly twenty years of development by a team of researchers, led by Dr. Lionel Jouffe and Dr. Paul Munteanu, who are widely recognized as world leaders in their field of study.
While cutting-edge research tools are often of no practical use outside the laboratory, BayesiaLab is a major exception. Its performance is like a Formula One race car; its everyday practicality resembles an SUV.
As such, BayesiaLab provides an extremely user-friendly interface that allows novices and experts alike to easily and quickly navigate all the functions available in the program. Intuitive menu structures and step-by-step wizards allow end-users to focus on their principal analysis task without having to worry about idiosyncratic syntax or arcane commands.
Select Client List
• Acxiom
• AGC Glass
• Airbus
• Ales Market Research
• American Diabetes Association
• Arcelor Mittal
• BBDO
• Booz Allen Hamilton
• BP
• BVA
• Cancer Care Ontario
• Cap Gemini
• Cargill
• Center for Disease Control
• CFI Group
• Crédit Agricole
• Dassault Aviation
• Dell
• Direction Générale de l'Armement (DGA)
• EADS Telecom
• Électricité de France (EDF)
• ENI
• Firmenich
• Fractal Analytics
• France Telecom
• Georgetown University
• GfK
• GlaxoSmithKline
• GnResearch
• GroupM
• Hilton Hotels & Resorts
• Hyatt
• Iceology
• IMRB International
• InterContinental Hotels Group
• Ipsos
• Klinikum der Universität München
• L'Oreal
• La Poste
• Lancaster University
• Lilly
• Lockheed Martin
• Louisiana State University
• Marketing Analysts (MAi)
• McGill University
• MedSolutions
• Millward Brown
• Mu Sigma
• NASA
• National Analysts
• National Central University, Taiwan
• Neiman Marcus
• Nestlé
• Nissan
• NTT
• Opinion Way
• Orange
• Pennsylvania State University
• Procter & Gamble
• PSA Peugeot Citroën
• Renault
• Repères
• Rhodia
• Rutgers, The State University of New Jersey
• Saint-Gobain
• Samsung
• Sanofi
• Servier
• Singapore Telecom
• Smucker
• SNCF
• Société Générale
• Sony
• Soredab
• Synovate
• Team Detroit
• The Pert Group
• TNS
• Total
• Turbomeca
• UCLA
• Unilever
• University of Toronto
• University of Virginia
• Vanderbilt University
• Veterans Administration
• Virginia Tech
Testimonials
“BayesiaLab provides exceptional capability in probabilistic inference. This Bayesian network software allows model building based on data, expert knowledge or any combination of the two. It polishes off modeling with a suite of advanced analysis methods unavailable in other such tools. The results are clear, interpretable solutions of the problem at hand. With BayesiaLab, Bayesia has set new standards of usability, productivity and value for Bayesian network software.”
Michael L. Thompson
Procter & Gamble (USA), CF-RD/Modeling & Simulation
“BayesiaLab has been able to accelerate our consumer modeling in Family Care by cutting costs and enabling model creation in minutes – not months. Beyond that, it has allowed us to take our traditional approaches into new territories: virtual product design & testing, influencing copy development and even volume forecasting. It has been the single biggest enabler of deeper consumer insights & more actionable modeling across our business.”
Prabhath Nanisetty
Procter & Gamble (USA), Family Care CMK
Conceptual Highlights
Expert Knowledge Modeling
In today’s business environment that strives to be “data-driven”, expert knowledge seems to be perceived more and more as qualitative or is perhaps even seen as “soft” knowledge. With billions of “hard” data points being accumulated every second, what cannot be counted may not count for much these days. A lifetime of experience in any particular domain may appear insignificant in comparison to the huge quantities of newly generated data.
This mindset has a critical flaw, which is that causal relationships cannot be machine-learned from data alone. Rather, causal reasoning always requires some form of assumption, i.e. assumptions coming from the human mind.
Experts often express causal paths in the form of graphs. This visual representation of causes and effects has a direct analogue in the network graph in BayesiaLab’s graph panel. Nodes (representing variables) can be added and positioned with a mouse-click, and arcs (representing relationships) can be “drawn” between nodes. The causal direction is simply encoded in the direction of the arc.
The quantitative nature of dependencies, plus many other attributes, can be managed in the Node Editor, which is available by right-clicking any node.
BayesiaLab thus facilitates intuitively encoding one’s own understanding of a domain with a minimum of effort. Simultaneously, it enforces internal consistency, so that no impossible conditions are accidentally encoded.
Besides letting users directly encode their explicit knowledge by drawing a network in the graph panel, BayesiaLab can be extended with the Bayesia Expert Knowledge Elicitation Environment (BEKEE), which allows the systematic elicitation of both the explicit and the tacit knowledge of experts (see chapter on BEKEE).
Knowledge Discovery with Machine Learning
Despite our emphasis on the relevance of human expert knowledge, especially for identifying causal relations, there is no doubt that there is a lot to learn from data, regardless of whether the data is sparse or “big”. BayesiaLab features a very comprehensive array of highly optimized learning algorithms that can quickly uncover so-far-unknown structures in datasets. This is powerful whether you have a handful of variables or thousands of variables with millions of potentially relevant relationships.
Knowledge Unification
Ultimately, “deep understanding” of a domain requires knowing the parameters of the relationships between the variables plus the knowledge of their causal directions. Machines are ideally suited for estimating quantities, such as the parameters, while human knowledge is still required to determine causality.
So, if there were one central tenet in Bayesia’s philosophy, it would have to be “the mission of unifying
machine learning and human knowledge for better reasoning.” Although the expression “the best of both
worlds” may sound like a cliché, it is what Bayesian networks and BayesiaLab can indeed offer.
Reasoning Under Uncertainty
[Figure: Unified Knowledge Representation. Labels: “Art” (Expert Knowledge, Qualitative), “Science” (Mathematical Representation, Quantitative), Bayesian Network, Domain]
Based on a Bayesian network, BayesiaLab can reliably carry out inference with multiple pieces of uncertain and even conflicting evidence. The inherent ability of Bayesian networks to facilitate computations under uncertainty makes them highly suitable for a wide range of real-world applications.
Reasoning under uncertainty applies in two ways:
• Diagnosis (inference from effect to cause)
• Simulation (inference from cause to effect)
Maintaining uncertainty during inference automatically prevents potentially misleading point estimates.
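To make the idea of combining uncertain, conflicting evidence concrete, here is a minimal Python sketch. It treats each piece of evidence as a likelihood vector over the states of one hidden variable and assumes that the reports are conditionally independent given that state. The function `update`, the two sensors and all numbers are hypothetical and do not represent BayesiaLab's interface.

```python
def update(prior: dict, likelihoods: list) -> dict:
    """Multiply a prior by several likelihood vectors, then normalize."""
    unnormalized = dict(prior)
    for lik in likelihoods:
        for state in unnormalized:
            unnormalized[state] *= lik[state]
    z = sum(unnormalized.values())
    return {state: value / z for state, value in unnormalized.items()}


prior = {"fault": 0.10, "ok": 0.90}
# Each vector gives P(report | state) for one hypothetical sensor report.
sensor_a_alarm = {"fault": 0.90, "ok": 0.20}   # points towards a fault
sensor_b_quiet = {"fault": 0.30, "ok": 0.80}   # points away from a fault

post = update(prior, [sensor_a_alarm, sensor_b_quiet])
print(post)
```

The result is a full distribution over the states rather than a single "fault" or "ok" verdict, which is the point made above about avoiding misleading point estimates.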
Discrete, Nonlinear and Nonparametric Modeling
BayesiaLab processes all data on a discretized basis. As part of BayesiaLab’s Data Import Wizard, a number of methods are available to discretize any continuous variables.
In BayesiaLab, all “parameters” describing probabilistic relationships between variables are contained in conditional probability tables (or cubes/hypercubes when two dimensions are exceeded), which means that no functional forms are utilized. Given this nonparametric, discrete approach, BayesiaLab can implicitly handle highly nonlinear relationships between variables.
All the optimization criteria of BayesiaLab’s learning algorithms are based on information theory (e.g. the Minimum Description Length). With that, no assumptions of linearity are made at any point.
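The following Python sketch illustrates why a discretized, table-based representation needs no functional form. It uses equal-frequency binning, which is only one possible discretization method and is an assumption here, as the source does not name the methods offered by the Data Import Wizard. The data and function names are likewise hypothetical.

```python
from collections import Counter, defaultdict


def equal_frequency_bins(values, n_bins):
    """Assign each value a bin index so that bins hold roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = min(rank * n_bins // len(values), n_bins - 1)
    return bins


def conditional_table(parent_bins, child_bins):
    """Estimate P(child bin | parent bin) from co-occurrence counts."""
    counts = defaultdict(Counter)
    for p, c in zip(parent_bins, child_bins):
        counts[p][c] += 1
    return {p: {c: n / sum(row.values()) for c, n in row.items()}
            for p, row in counts.items()}


# Hypothetical data: y depends on x through a U-shaped (nonlinear) relationship.
xs = [i / 10 for i in range(-30, 31)]        # -3.0 ... 3.0
ys = [x * x for x in xs]
cpt = conditional_table(equal_frequency_bins(xs, 3), equal_frequency_bins(ys, 2))
for x_bin in sorted(cpt):
    print(x_bin, cpt[x_bin])
```

In the printed table, the low and the high x bins both favor the high y bin while the middle x bin does not. A linear correlation coefficient would be close to zero for the same data.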
Key Functions
Unsupervised Structural Learning
In statistics, unsupervised learning is typically understood to be a
classification or clustering task. To make a very clear distinction, we
put emphasis on “structural” in “Unsupervised Structural Learning”,
which covers a number of important algorithms in BayesiaLab.
Unsupervised Structural Learning means that BayesiaLab can discover probabilistic relationships between a large number of variables, without the need to define inputs or outputs. One might say that this is the quintessential form of knowledge discovery, as no assumptions whatsoever are required to run these algorithms on unknown datasets.²
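One simple, well-known way to discover structure without designating inputs or outputs is to link variables by their pairwise mutual information and keep a maximum-weight spanning tree (the Chow-Liu idea). The Python sketch below illustrates that idea only; it is not one of BayesiaLab's algorithms, and the dataset and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete columns."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())


def strongest_tree(columns):
    """Greedy maximum-weight spanning tree over pairwise mutual information."""
    edges = sorted(((mutual_information(columns[a], columns[b]), a, b)
                    for a, b in combinations(sorted(columns), 2)), reverse=True)
    component = {name: name for name in columns}   # simple component labels
    tree = []
    for weight, a, b in edges:
        root_a, root_b = component[a], component[b]
        if root_a != root_b:                       # the edge closes no cycle
            tree.append((a, b, round(weight, 3)))
            for name, root in component.items():
                if root == root_b:
                    component[name] = root_a
    return tree


# Hypothetical records: "cough" tracks "smoker"; "lefthanded" is unrelated.
data = {
    "smoker":     [1, 1, 1, 1, 0, 0, 0, 0],
    "cough":      [1, 1, 1, 0, 0, 0, 0, 1],
    "lefthanded": [1, 0, 1, 0, 1, 0, 1, 0],
}
tree = strongest_tree(data)
print(tree)
```

A spanning tree always connects every variable, so the unrelated column is attached with a weight of zero. A score that penalizes complexity, such as the Minimum Description Length mentioned earlier, would leave such an arc out.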
Supervised Learning
Supervised Learning in BayesiaLab has the same objective as many traditional modeling techniques, i.e. to develop a model for predicting a target variable. Some other data mining packages also offer “Bayesian Networks” as an option in their array of available techniques. However, in most cases, these packages are restricted in their capabilities to a very limited type of network, i.e. the Naïve Bayesian Network.
Within BayesiaLab, a vastly greater number of algorithms is available to search for a Bayesian network that best describes the target variable, while taking into account the complexity of the resulting network. The Markov Blanket algorithm should be highlighted here, as its speed is particularly helpful whenever dealing with a larger number of variables. In this context, the Markov Blanket also serves as an exceptionally powerful variable selection algorithm.
Finally, structural coefficient analysis, cross-validation and data perturbation functions are available for thoroughly testing and validating the robustness of candidate networks, helping the analyst to make a trade-off between precision and parsimony. These validation methods are applicable to both Supervised and Unsupervised Learning.
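The reason a Markov blanket works as a variable selector is that, given its blanket (parents, children and the children's other parents), a node is independent of every other node in the network. The Python sketch below only reads the blanket off a known, hypothetical graph; BayesiaLab's Markov Blanket algorithm learns it from data, which is not shown here.

```python
def markov_blanket(parents: dict, target: str) -> set:
    """Parents, children and the children's other parents of `target` in a DAG."""
    children = {node for node, pa in parents.items() if target in pa}
    spouses = {p for child in children for p in parents[child]}
    return (set(parents[target]) | children | spouses) - {target}


# Hypothetical DAG, given as node -> list of parents.
dag = {
    "age": [],
    "income": ["age"],
    "ad_exposure": [],
    "purchase": ["income", "ad_exposure"],
    "service": [],
    "satisfaction": ["purchase", "service"],
    "referral": ["satisfaction"],
}
print(sorted(markov_blanket(dag, "purchase")))
```

In this example "age" and "referral" fall outside the blanket of "purchase", so they add no predictive information once the four blanket variables are known.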
² However, the analyst can still use any available domain knowledge to define structural constraints.
Clustering
Clustering in BayesiaLab covers both data clustering (e.g. by observations) and
variable clustering, which, as the name implies, allows the grouping of variables
according to the strength of their mutual relationships.
A third variation of this concept is of particular importance in BayesiaLab: the semi-automatic Multiple
Clustering workflow can be described as a kind of nonlinear, nonparametric and nonorthogonal factor
analysis.
In practice, Multiple Clustering often serves as the basis for developing Probabilistic Structural Equation
Models with BayesiaLab.
Observational Inference
One of the basic properties of Bayesian networks is that they are “omnidirectional observational inference engines”. Given an observation on any of the network’s nodes (or a subset of nodes), one can compute the posterior probabilities of all other nodes in the network. Both exact and approximate observational inference algorithms are implemented in BayesiaLab.
Causal Inference
Besides observational inference, BayesiaLab also offers causal inference for computing the impact of intervening on a subset of variables instead of merely observing their states. Both Pearl’s Do-Operator and Jouffe’s Likelihood Matching are available for this purpose.
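The difference between observing and intervening can be shown on a small confounded network, sketched below in Python. The network (Z influences both X and Y, and X influences Y), its probabilities and the function names are hypothetical. Only the do-operator is illustrated, as an adjustment over the confounder; Likelihood Matching is not sketched because the source does not describe its mechanics.

```python
P_Z = {0: 0.5, 1: 0.5}                       # confounder Z
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}              # P(X=1 | Z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (0, 1): 0.5,   # P(Y=1 | X, Z)
                 (1, 0): 0.3, (1, 1): 0.7}


def p_y1_observe(x: int) -> float:
    """P(Y=1 | X=x): seeing X also changes our belief about Z."""
    def p_x(z: int) -> float:
        return P_X1_GIVEN_Z[z] if x == 1 else 1.0 - P_X1_GIVEN_Z[z]
    num = sum(P_Z[z] * p_x(z) * P_Y1_GIVEN_XZ[(x, z)] for z in P_Z)
    den = sum(P_Z[z] * p_x(z) for z in P_Z)
    return num / den


def p_y1_do(x: int) -> float:
    """P(Y=1 | do(X=x)): the arc Z -> X is cut, so Z keeps its prior."""
    return sum(P_Z[z] * P_Y1_GIVEN_XZ[(x, z)] for z in P_Z)


observed_diff = p_y1_observe(1) - p_y1_observe(0)
causal_effect = p_y1_do(1) - p_y1_do(0)
print(f"observational difference: {observed_diff:.2f}")
print(f"effect of intervention:   {causal_effect:.2f}")
```

With these assumed numbers the observational difference is more than twice the interventional effect, because part of the observed association between X and Y is due to Z.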
Missing Values Processing
Missing values are encountered in virtually all real-world data collection processes. Missing values could be
the result of nonresponses in surveys, poor recordkeeping, server outages, attrition in longitudinal surveys
or the faulty sensors of a measuring device, etc.
Traditionally, missing values processing (beyond the naïve ad-hoc approaches) has been a demanding task,
both methodologically and computationally. What is often overlooked is that not properly handling missing
observations can lead to misleading interpretations or create a false sense of confidence in one’s findings,
regardless of how many more complete observations might be available.
BayesiaLab offers a range of sophisticated methods for missing values processing from which the analyst can choose. During network learning, BayesiaLab performs missing values processing automatically “behind the scenes”. More specifically, the Structural Expectation-Maximization algorithm and the Dynamic Completion algorithm are automatically applied after each modification of the network during learning, i.e. after every single arc addition, deletion and reversal.
Bayesian networks actually provide several advantages for dealing with missing values, which makes it attractive to use BayesiaLab solely for that purpose.
• Bayesian networks offer a unified framework for representing the joint distribution of the overall domain and simultaneously encoding the dependencies with the missing values (Heckerman, 2008). This implicitly addresses the requirement that Schafer and Olsen (1998) stipulate for missing values imputation, namely “any association that may prove important in subsequent analysis should be present in the imputation model.... A rich imputation model that preserves a large number of associations is desirable because it may be used for a variety of post-imputation analyses.” Also, by using a Bayesian network, the “functional form” used for missing values imputation and the one used for representing the overall model are automatically identical and thus compatible.
• The inherently probabilistic nature of Bayesian networks allows missing values and their imputation to be handled nondeterministically. That means that the (needed) variance in the imputed data does not need to be generated artificially, but is inherently present.
Diagnosis, Prediction and Simulation
In the Bayesian network framework, diagnosis, prediction and simulation are identical computations. They all consist of inference conditional upon evidence. The distinction only exists from the perspective of the researcher, who would presumably see the symptom of a disease as an effect and the disease itself as the cause. Hence, carrying out inference based on observed symptoms is interpreted as “diagnosis”.
BayesiaLab offers a considerable number of functions relating to inference. For instance, inference can be
performed by setting evidence, i.e. clicking on any one of the Monitors, and results are returned instantly
for all the other Monitors.
Batch Inference is available when inference needs to be computed for a large number of records. For instance, this can be used for applying a predictive score to all customers in a database.
The Adaptive Questionnaire function provides guidance in terms of the optimum sequence for seeking evidence. With every piece of evidence set, BayesiaLab determines which is the next best piece of evidence to obtain for a maximum information gain with respect to the target variable. In a medical context, this allows diagnostic procedures to be optimally “escalated”, from “low-cost & small-gain” evidence (e.g. measuring the patient’s blood pressure) to “high-cost & large-gain” evidence (e.g. performing an MRI scan).
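The selection criterion described above, maximum information gain with respect to the target, can be sketched as the mutual information between the target and each candidate observation. The Python example below is a simplified illustration with hypothetical tests and probabilities; it ignores the cost of obtaining evidence and is not BayesiaLab's implementation.

```python
from math import log2


def entropy(dist):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def expected_information_gain(prior, likelihood):
    """Mutual information between the target and one candidate observation.

    prior:      {target_state: P(state)}
    likelihood: {target_state: {outcome: P(outcome | state)}}
    """
    outcomes = next(iter(likelihood.values())).keys()
    gain = entropy(prior)
    for o in outcomes:
        p_o = sum(prior[s] * likelihood[s][o] for s in prior)
        if p_o == 0:
            continue
        post = {s: prior[s] * likelihood[s][o] / p_o for s in prior}
        gain -= p_o * entropy(post)      # subtract the expected remaining uncertainty
    return gain


prior = {"sick": 0.2, "healthy": 0.8}
candidates = {
    "blood_pressure": {"sick": {"high": 0.6, "normal": 0.4},
                       "healthy": {"high": 0.3, "normal": 0.7}},
    "mri": {"sick": {"pos": 0.95, "neg": 0.05},
            "healthy": {"pos": 0.05, "neg": 0.95}},
}
gains = {name: expected_information_gain(prior, lik) for name, lik in candidates.items()}
best = max(gains, key=gains.get)
print(gains, "-> ask for:", best)
```

A cost-aware variant could rank candidates by gain per unit cost, which would reproduce the escalation from inexpensive to expensive evidence described above.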
Effects Analysis
Many research activities focus on estimating the size of an effect, for instance establishing the treatment effect of a new drug or determining the sales impact of a new advertising campaign. Other studies are about attribution, i.e. they attempt to decompose observed effects into their causes and thus allocate contributions.
All of the above questions can be answered if the domain is fully understood, which is a priori never the case. However, if we are able to build an adequate model of the domain that captures all of its dynamics, BayesiaLab will be able to extract the effects.
BayesiaLab employs simulation to derive effects, as parameters per se do not exist in this nonparametric framework. As all the dynamics of the domain are encoded in discrete conditional probability tables, effect sizes only manifest themselves when different conditions are simulated.
Total Effects Analysis, Target Mean Analysis and many more of BayesiaLab’s functions offer the analyst ways to study effects, especially nonlinear and interactive effects.
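A minimal Python sketch of what deriving effects by simulation means: the expected value of the target is computed under each simulated condition, and effects are read off as differences between those means. The conditional probability table, the rating scale and the names are hypothetical, and the sketch does not reproduce any specific BayesiaLab function.

```python
# Hypothetical CPT: P(rating | price tier), with ratings on a 1-5 scale.
P_RATING_GIVEN_PRICE = {
    "low":    {1: 0.05, 3: 0.35, 5: 0.60},
    "medium": {1: 0.10, 3: 0.40, 5: 0.50},
    "high":   {1: 0.50, 3: 0.40, 5: 0.10},
}


def target_mean(cpt_row):
    """Expected target value under one simulated condition."""
    return sum(value * prob for value, prob in cpt_row.items())


means = {tier: target_mean(row) for tier, row in P_RATING_GIVEN_PRICE.items()}
effect_low_to_medium = means["medium"] - means["low"]
effect_medium_to_high = means["high"] - means["medium"]
print(means)
print(f"{effect_low_to_medium:+.2f} {effect_medium_to_high:+.2f}")
```

The two steps have very different effects in this example, so no single slope coefficient could summarize the table. That is why the effect has to be simulated per condition.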
Analyzing Observational Studies
This simulation approach also offers special opportunities for evaluating observational studies. More specifically, it can help overcome the problem of systematic differences between treatment and control groups. BayesiaLab’s Likelihood Matching performs on-the-fly matching of pretreatment covariates as part of the Direct Effects Analysis, thus yielding the “exclusive” effect of a particular variable on the target, everything else being equal. This also obviates the need for separately performing matching techniques, such as propensity score matching.
Optimization
The ability to perform inference across all possible states of all nodes of the network also facilitates searching for optimum values. BayesiaLab’s Target Dynamic Profile and the Resource Allocation Optimization provide the toolsets for this purpose.
Using these functions in combination with Direct Effects is of particular interest when searching for the optimum combination of variables that have a nonlinear relationship with the target (and correlations between the drivers). A typical example would be searching for the optimum mix of an array of marketing instruments.
BayesiaLab’s Resource Allocation Optimization with Direct Effects will search, within the constraints set by the analyst, for those scenarios that optimize the target criterion.
Bayesia Market Simulator 1.6
Motivation
For the vast majority of businesses, market share is a key performance indicator. Market share is used as a metric that allows comparing competitive performance independently from overall market size and its fluctuations.
In the product planning process, the expected market share is critical, along with the overall market forecast, as together they define the sales volume expectation, which, for obvious reasons, is a key element in most business cases.
As a result, it is critical for decision makers to correctly predict the future market shares of products not yet developed. The task of such market share forecasts typically falls to marketing and market research departments, who are most closely involved with understanding consumer behavior and, more specifically, the product choices consumers make.
If we fully understood the consumer’s decision making process and observed all components of it, we could
simply generate a deterministic model for predicting future consumer choices. However, we do not and it is
obvious that many elements contributing to a consumer’s purchase decision are inherently unobservable.
Despite our limited comprehension of the true human choice process, there are a number of tools that still allow modeling consumer choice with what is observable, and accounting for what will remain unknowable. In this context, and based on the seminal works of Nobel laureate Daniel McFadden, choice modeling has emerged as an important tool in understanding and simulating consumer choice.
Bayesian Networks for Choice Modeling
Beyond the convenience and speed of estimating Bayesian networks with BayesiaLab, there are several noteworthy differences in modeling consumer choice with Bayesian networks compared to traditional discrete choice models.
• Whereas utility-based choice models, such as multinomial logit models (MNL), will “flatten” the vector of attribute utilities into a single scalar value, Bayesian networks do not inherently restrict all the dimensions relating to choice. For example, learning a Bayesian network on observed vehicle choices might reveal that fuel economy and vehicle price are subject to tradeoff, while safety is a nonnegotiable basic requirement for the consumer. Correctly recognizing such dynamics is obviously critical for making predictions about future consumer choices.
• Bayesian networks are nonparametric and thus they do not require the specification of a functional form. No assumptions need to be made regarding the form of links between variables. Potentially nonlinear patterns are therefore not an issue for model estimation or simulation.
• Bayesian networks are inherently probabilistic, and, as such, there is no need to specify an error term. A traditional choice model would require an error term to make it nondeterministic.
• In BayesiaLab all computations are natively discrete and therefore no transformation functions, such as
logit or probit, are needed. Given that we are dealing with discrete consumer choices, this all-discrete
approach is an advantage.
Bayesia Market Simulator
BayesiaLab and the Bayesia Market Simulator are unique in their ability to utilize Bayesian networks for choice modeling, for instance for market share simulation of new products and services.
The principal idea is that a Bayesian network represents a generalization of a domain, such as the interactions between products and consumers (both stated preference and revealed preference data can be used). This means that all of the product attributes may interact with all of the consumer attributes, which can amount to hundreds of variables. Unsupervised Learning of a sufficient number of such interactions (in all their dimensions) will then generate a network that generalizes all these relationships, i.e. the network becomes a function that maps consumer attributes to product attributes.
The Bayesia Market Simulator can subsequently utilize this generalization and simulate hypothetical product scenarios, such as a different combination of product features. Given the network, a new choice probability can then be computed for every single consumer across all hypothetical and real product scenarios. In summary, this provides new market shares for an alternative state of the world.
With the ability to leverage revealed preference data, BayesiaLab and Bayesia Market Simulator allow using a vast range of existing research for choice predictions.
BayesiaLab can learn a Bayesian network from consumer choices recorded in the form of stated preference (SP) or revealed preference (RP) data. The learned Bayesian network allows computing the posterior probability distribution in each choice situation, including hypothetical product alternatives (and even hypothetical consumers). As a result, we obtain a choice probability as a function of product and consumer attributes.
In order to obtain a product’s projected market share, we can then simply simulate choice probabilities across all product scenarios and across all individuals in the population under study.
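The aggregation step described above can be sketched in Python as follows. The function `choice_probabilities` is a made-up stand-in for the learned Bayesian network, with arbitrary scoring rules; the consumers, products and attribute names are hypothetical too. Only the averaging of per-consumer choice probabilities into market shares reflects the text.

```python
def choice_probabilities(consumer, products):
    """Stand-in for the learned network: P(choice = product | consumer, products)."""
    scores = {}
    for name, attrs in products.items():
        fit = 1.0 if attrs["segment"] == consumer["segment"] else 0.25
        affordable = 1.0 if attrs["price"] <= consumer["budget"] else 0.1
        scores[name] = fit * affordable
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}


def market_shares(population, products):
    """Average each product's choice probability over all consumers."""
    shares = {name: 0.0 for name in products}
    for consumer in population:
        for name, p in choice_probabilities(consumer, products).items():
            shares[name] += p / len(population)
    return shares


population = [
    {"segment": "family", "budget": 30000},
    {"segment": "family", "budget": 45000},
    {"segment": "sport", "budget": 50000},
    {"segment": "sport", "budget": 28000},
]
current = {
    "Minivan A": {"segment": "family", "price": 32000},
    "Coupe B": {"segment": "sport", "price": 40000},
}
# Hypothetical scenario: a new product enters the market.
scenario = {**current, "Wagon C": {"segment": "family", "price": 29000}}

before = market_shares(population, current)
after = market_shares(population, scenario)
print(before)
print(after)
```

Comparing `before` and `after` shows where the hypothetical product draws its share from, which corresponds to the "alternative state of the world" mentioned earlier.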
BEKEE 2.0
Motivation
Everybody is talking about “Big Data” and all the opportunities that are associated with it. Very often
though, we hear almost as much about the challenges that come with this flood of data.
However, much more serious problems exist on the opposite end of the spectrum, where there is not enough
data. Unfortunately, all the advanced knowledge discovery algorithms fail in the absence of data.
In over ten years of continuous development, and in increasingly sophisticated ways, BayesiaLab has permitted deriving knowledge from data through its machine learning algorithms, very much in the spirit of understanding “Big Data”. However, BayesiaLab has maintained an equal focus on managing knowledge that exists beyond measurable and countable data points, such as the knowledge contained in the human mind.
BayesiaLab’s graphical user interface has made it highly intuitive for individual subject matter experts to encode their own domain understanding into a Bayesian network, thus capturing what they explicitly or implicitly know. What is especially valuable is that one can very easily and formally capture causal relations in a Bayesian network graph, which is something that few other frameworks can do.
However, when it comes to consolidating the collective knowledge from a group of experts, rather than from an individual, the process is no longer that straightforward. Traditionally, one would perhaps bring the experts together in a brainstorming session and let them form a common understanding. Subsequently, such a consensus could be encoded manually. However, brainstorming sessions are prone to introducing a wide range of biases, which can be disastrously counterproductive in studying complex domains.
Bayesia Expert Knowledge Elicitation Environment (BEKEE)
Bayesia Expert Knowledge Elicitation Environment, or BEKEE for short, is a new web application that is designed to minimize detrimental group biases. The central idea is not to coerce consensus, but rather to elicit everyone’s individual views regarding the domain under study. In order to ensure the independent elicitation of probabilities, BEKEE queries stakeholders individually via an interactive or batch questionnaire linked to the core BayesiaLab application. Retrieving expert views in such a fashion generates many “parallel universes” in terms of domain understanding. These different perspectives can be formally compared by the facilitator and potentially returned to the group for a formal debate in the case of seriously conflicting assessments.
In most cases, this is an iterative process and, even if stakeholder opinions do not converge, BayesiaLab will compile all views and produce a unifying Bayesian network. This graph is now a probabilistic summary of all the available expert opinions. As such, it can be utilized as a formal representation of the underlying domain. Most importantly, this graph is not merely a visual representation. Rather, a Bayesian network is a fully computable model of the domain, which immediately facilitates the simulation of what-if scenarios.
In fact, we can evaluate this Bayesian network model the same way as a statistical model estimated from
“Big Data”. One might still prefer a data-based model, if data were indeed available, but in the absence
thereof, the formally-encoded collective expert knowledge best represents what is known at the time.
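The text does not say how the individual assessments are compiled into one network. As a rough illustration only, one common way to summarize several experts' probabilities for the same variable is a linear opinion pool (a weighted average), with a simple spread measure to flag assessments that may deserve a group debate. Whether BEKEE or BayesiaLab works this way is an assumption; the function names and numbers below are hypothetical.

```python
from typing import List, Optional, Sequence


def pool_assessments(
    assessments: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Linear opinion pool: weighted average of the experts' distributions.

    Each assessment is a probability distribution over the same states.
    With no weights given, every expert counts equally.
    """
    n_experts = len(assessments)
    if weights is None:
        weights = [1.0 / n_experts] * n_experts
    n_states = len(assessments[0])
    return [
        sum(w * a[s] for w, a in zip(weights, assessments))
        for s in range(n_states)
    ]


def disagreement(assessments: Sequence[Sequence[float]]) -> float:
    """Largest gap between any two experts' probabilities for a single state."""
    n_states = len(assessments[0])
    return max(
        max(a[s] for a in assessments) - min(a[s] for a in assessments)
        for s in range(n_states)
    )


# Three hypothetical experts assessing a binary variable (state A, state B).
expert_views = [[0.2, 0.8], [0.4, 0.6], [0.9, 0.1]]
pooled = pool_assessments(expert_views)
spread = disagreement(expert_views)
print(pooled, spread)
```

A facilitator could use a threshold on such a spread to decide which questions go back to the group, while the pooled distribution serves as the summary when opinions do not converge.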
Bayesia Engines
Developers can also access many of BayesiaLab’s functions outside the graphical user interface by using
Bayesia’s Modeling and Inference Engines. You can thus directly leverage Bayesian networks in your own
applications and workflows and deploy them for client use, without requiring clients to install BayesiaLab.
Bayesia Engine API
The Bayesia Engines are application programming interfaces (APIs) delivered as a pure Java class library (JAR file) that can be integrated into any software project.
With the Bayesia Modeling Engine you can create your own Bayesian networks from within your own code and subsequently perform inference with the Bayesia Inference Engine.
The Bayesia Inference Engine allows you to perform inference on Bayesian networks from within your own application. Networks created with BayesiaLab or with the Modeling Engine can be used for computing inference with the Bayesia Inference Engine.
A typical implementation scenario would be developing a Bayesian network offline with BayesiaLab and then deploying this network for real-time prediction on streaming data with the Bayesia Inference Engine.
The Bayesia Inference Engine can, for instance, also serve as the backend of a web-based simulator, which can interactively perform inference on the user’s input.
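The Engine's Java classes and method names are not given in this document, so the following sketch does not use them. It is a language-neutral illustration, written in Python, of the deployment pattern described above: a network whose probabilities were fixed offline is applied to each incoming record to produce a posterior. The two-node network, its state names and its probabilities are assumptions made up for the example.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical two-node network Machine -> Sensor, parameters fixed offline.
PRIOR: Dict[str, float] = {"fault": 0.1, "ok": 0.9}
LIKELIHOOD: Dict[str, Dict[str, float]] = {
    "fault": {"alarm": 0.8, "quiet": 0.2},
    "ok": {"alarm": 0.05, "quiet": 0.95},
}


def posterior(observation: str) -> Dict[str, float]:
    """Posterior over the Machine node given one Sensor observation (Bayes' rule)."""
    joint = {state: PRIOR[state] * LIKELIHOOD[state][observation] for state in PRIOR}
    evidence = sum(joint.values())  # normalizing constant P(observation)
    return {state: value / evidence for state, value in joint.items()}


def score_stream(observations: Iterable[str]) -> Iterator[Tuple[str, float]]:
    """Yield (observation, P(fault | observation)) for each incoming record."""
    for obs in observations:
        yield obs, posterior(obs)["fault"]


for obs, p_fault in score_stream(["quiet", "alarm", "quiet"]):
    print(f"{obs}: P(fault) = {p_fault:.3f}")
```

In a web-based simulator the same `posterior` call would be triggered by user input rather than by a data stream. A real network with many nodes needs a proper inference algorithm instead of this direct application of Bayes' rule.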
References
Barber, David. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2012.
Darwiche, Adnan. Modeling and Reasoning with Bayesian Networks. 1st ed. Cambridge University Press,
2009.
Heckerman, D. “A Tutorial on Learning with Bayesian Networks.” Innovations in Bayesian Networks
(2008): 33–82.
Holmes, Dawn E., ed. Innovations in Bayesian Networks: Theory and Applications. Softcover reprint of
hardcover 1st ed. 2008. Springer, 2010.
Kjaerulff, Uffe B., and Anders L. Madsen. Bayesian Networks and Influence Diagrams: A Guide to Construction and Analysis. Softcover reprint of hardcover 1st ed. 2008. Springer, 2010.
Koller, Daphne, and Nir Friedman. Probabilistic Graphical Models: Principles and Techniques. 1st ed. The
MIT Press, 2009.
Koski, Timo, and John Noble. Bayesian Networks: An Introduction. 1st ed. Wiley, 2009.
Mittal, Ankush. Bayesian Network Technologies: Applications and Graphical Models. Edited by Ankush
Mittal and Ashraf Kassim. 1st ed. IGI Publishing, 2007.
Neapolitan, Richard E. Learning Bayesian Networks. Prentice Hall, 2003.
Pearl, Judea. Causality: Models, Reasoning and Inference. 2nd ed. Cambridge University Press, 2009.
———. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. 1st ed. Morgan
Kaufmann, 1988.
Pearl, Judea, and Stuart Russell. Bayesian Networks. UCLA Cognitive Systems Laboratory, November 2000. http://bayes.cs.ucla.edu/csl_papers.html.
Pourret, Olivier, Patrick Naïm, and Bruce Marcot, eds. Bayesian Networks: A Practical Guide to Applications. 1st ed. Wiley, 2008.
Schafer, J.L., and M.K. Olsen. “Multiple Imputation for Multivariate Missing-data Problems: A Data Analyst’s Perspective.” Multivariate Behavioral Research 33, no. 4 (1998): 545–571.
Spirtes, Peter, and Clark Glymour. Causation, Prediction and Search. The MIT Press, 2001.
Contact Information
Bayesia USA
312 Hamlet’s End Way
Franklin, TN 37067
USA
Phone: +1 888-386-8383
info@bayesia.us
www.bayesia.us
Bayesia Singapore Pte. Ltd.
20 Cecil Street
#14-01, Equity Plaza
Singapore 049705
Phone: +65 3158 2690
info@bayesia.sg
www.bayesia.sg
Bayesia S.A.S.
6, rue Léonard de Vinci
BP 119
53001 Laval Cedex
France
Phone: +33(0)2 43 49 75 69
info@bayesia.com
www.bayesia.com
Copyright
© 2013 Bayesia S.A.S., Bayesia USA and Bayesia Singapore. All rights reserved.

The Bayesia Portfolio of Research Software

  • 1. The Bayesia Portfolio of Research Software BayesiaLab 5.1 Bayesia Market Simulator 1.6 BEKEE 2.0 Bayesia Engine API www.bayesia.us
  • 2. Table of Contents Introduction Framework: The Bayesian Network Paradigm Acyclic Graphs & Bayes’s Rule 5 Compact Representation of the Joint Probability Distribution 6 BayesiaLab 5.1 Executive Summary 7 Select Client List 8 Conceptual Highlights 9 Expert Knowledge Modeling 9 Knowledge Discovery with Machine Learning 9 Knowledge Unification 10 Reasoning Under Uncertainty 10 Discrete, Nonlinear and Nonparametric Modeling 11 Key Functions 11 Unsupervised Structural Learning 11 Supervised Learning 12 Clustering 13 Observational Inference 13 Causal Inference 14 Diagnosis, Prediction and Simulation 14 Effects Analysis 15 Analyzing Observational Studies 16 Optimization 16 Bayesia Market Simulator 1.6 Motivation 17 Bayesian Networks for Choice Modeling 17 Bayesia Market Simulator 18 The Bayesia Portfolio of Research Software ii www.bayesia.us | www.bayesia.sg
  • 3. BEKEE 2.0 Motivation 20 Bayesia Expert Knowledge Elicitation Environment (BEKEE) 21 Bayesia Engines Bayesia Engine API 23 References Contact Information Bayesia USA 26 Bayesia Singapore Pte. Ltd. 26 Bayesia S.A.S. 26 Copyright 26 The Bayesia Portfolio of Research Software www.bayesia.us | www.bayesia.sg iii
  • 4. Introduction The Bayesia portfolio of research software is the result of over 20 years of continuous research and devel- opment by two French professors in the field of artificial intelligence, Dr. Lionel Jouffe and Dr. Paul Mun- teanu. Their team of computer scientists and software developers at Bayesia S.A.S. has embraced the Bayes- ian networks paradigm and built tools for making it accessible to a broad audience, and practical for a wide range of research tasks. The idea of Bayesian networks dates back to the mid-1980s, when Professor Judea Pearl of UCLA began to formalize their semantics in a series of seminal works. The study of Bayesian networks has since grown into a large body of work with dozens of books and countless scientific papers exploring all their properties. However, thanks to Bayesia’s software tools, and the ever-increasing power of computers, Bayesian net- works have become powerful and practical tools well beyond the world of academia. For applied research in all domains, Bayesian networks can facilitate deep understanding of very complex, high-dimensional problem domains. Their computational efficiency and inherently visual structure make Bayesian networks attractive for exploring and explaining complex domains. Most importantly, Bayesian networks allow rea- soning about such domains in a formally correct yet highly intuitive way. The Bayesia Portfolio of Research Software 4 www.bayesia.us | www.bayesia.sg EXPERT KNOWLEDGE BAYESIAN NETWORK ANALYTICS SIMULATION RISK MANAGEMENT OPTIMIZATIONDIAGNOSIS DATA KNOWLEDGE MODELING KNOWLEDGE DISCOVERY DECISION SUPPORT
  • 5. Framework: The Bayesian Network Paradigm1 Acyclic Graphs & Bayes’s Rule Probabilistic models based on directed acyclic graphs have a long and rich tradition, beginning with the work of geneticist Sewall Wright in the 1920s. Variants have appeared in many fields. Within statistics, such models are known as directed graphical models; within cognitive science and artificial intelligence, such models are known as Bayesian networks. The name honors the Rev. Thomas Bayes (1702-1761), whose rule for updating probabilities in the light of new evidence is the foundation of the approach. Rev. Bayes addressed both the case of discrete probability distributions of data and the more complicated case of continuous probability distributions. In the discrete case, Bayes’ theorem relates the conditional and marginal probabilities of events A and B, provided that the probability of B does not equal zero: P(A∣B) = P(B∣A)P(A) P(B) In Bayes’ theorem, each probability has a conventional name: • P(A) is the prior probability (or “unconditional” or “marginal” probability) of A. It is “prior” in the sense that it does not take into account any information about B; however, the event B need not occur after event A. In the nineteenth century, the unconditional probability P(A) in Bayes’s rule was called the “antecedent” probability; in deductive logic, the antecedent set of propositions and the inference rule imply consequences. The unconditional probability P(A) was called “a priori” by Ronald A. Fisher. • P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is derived from or depends upon the specified value of B. • P(B|A) is the conditional probability of B given A. It is also called the likelihood. • P(B) is the prior or marginal probability of B, and acts as a normalizing constant. 
Bayes theorem in this form gives a mathematical representation of how the conditional probability of event A given B is related to the converse conditional probability of B given A. The initial development of Bayesian networks in the late 1970s was motivated by the need to model the top- down (semantic) and bottom-up (perceptual) combination of evidence in reading. The capability for bidirec- tional inferences, combined with a rigorous probabilistic foundation, led to the rapid emergence of Bayesian networks as the method of choice for uncertain reasoning in AI and expert systems replacing earlier, ad hoc rule-based schemes. The Bayesia Portfolio of Research Software www.bayesia.us | www.bayesia.sg 5 1 Adapted from Pearl (2000), used with permission.
  • 6. The nodes in a Bayesian network represent variables of interest (e.g. the temperature of a device, the gen- der of a patient, a feature of an object, the occur- rence of an event) and the links represent statistical (informational) or causal dependencies among the variables. The dependencies are quantified by condi- tional probabilities for each node given its parents in the network. The network supports the computation of the posterior probabilities of any subset of vari- ables given evidence about any other subset. Compact Representation of the Joint Probability Distribution “The central paradigm of probabilistic reasoning is to identify all relevant variables x1, . . . , xN in the environment [i.e. the domain under study], and make a probabilistic model p(x1, . . . , xN) of their interaction [i.e. represent the variables’ joint probability distribution].” Bayesian networks are very attractive for this purpose as they can, by means of factorization, compactly represent the joint probability distribution of all variables. “Reasoning (inference) is then performed by introducing evidence that sets variables in known states, and subsequently computing probabilities of interest, conditioned on this evidence. The rules of probability, combined with Bayes’ rule make for a complete reasoning system, one which includes traditional deductive logic as a special case.” (Barber, 2012) The Bayesia Portfolio of Research Software 6 www.bayesia.us | www.bayesia.sg
  • 7. BayesiaLab 5.1 Executive Summary BayesiaLab is a powerful desktop application (Windows/Mac/Unix) for knowledge manage- ment, data mining, analytics, predictive model- ing and simulation — all based on the para- digm of Bayesian networks. Bayesian networks have become a very powerful tool for deep understanding of very complex, high- dimensional problem domains, ranging from bioinformatics to marketing science. BayesiaLab is the world’s only comprehensive software package for generating, manipulating and analyzing Bayesian networks. Analysts and researchers around the world, including Bayesia’s strategic partner P&G, have embraced BayesiaLab to gain unprecedented insights into problems which had previously not been tractable with tra- ditional analysis methods. The latest version of BayesiaLab, 5.1, is the result of nearly twenty years of development by a team of re- searchers, led by Dr. Lionel Jouffe and Dr. Paul Munteanu, who are widely recognized as world leaders in their field of study. While cutting-edge research tools are often of no practical use outside the laboratory, BayesiaLab is a major exception. Its performance is like a Formula One race car; its everyday practicality resembles an SUV. As such, BayesiaLab provides an extremely user-friendly interface that allows novices and experts alike to easily and quickly navigate all the functions available in the program. Intuitive menu structures and step-by- step wizards allow end-users to focus on their principal analysis task without having to worry about idio- syncratic syntax or arcane commands. The Bayesia Portfolio of Research Software www.bayesia.us | www.bayesia.sg 7
  • 8. Select Client List The Bayesia Portfolio of Research Software 8 www.bayesia.us | www.bayesia.sg • Acxiom • AGC Glass • Airbus • Ales Market Research • American Diabetes Association • Arcelor Mittal • BBDO • Booz Allen Hamilton • BP • BVA • Cancer Care Ontario • Cap Gemini • Cargill • Center for Disease Control • CFI Group • Crédit Agricole • Dassault Aviation • Dell • Direction Générale de l'Arme- ment (DGA) • EADS Telecom • Électricité de France (EDF) • ENI • Firmenich • Fractal Analytics • France Telecom • Georgetown University • GfK • GlaxoSmithKline • GnResearch • GroupM • Hilton Hotels & Resorts • Hyatt • Iceology • IMRB International • InterContinental Hotels Group • Ipsos • Klinikum der Universität München • L'Oreal • La Poste • Lancaster University • Lilly • Lockheed Martin • Louisiana State University • Marketing Analysts (MAi) • McGill University • MedSolutions • Millward Brown • Mu Sigma • NASA • National Analysts • National Central University, Taiwan • Neiman Marcus • Nestlé • Nissan • NTT • Opinion Way • Orange • Pennsylvania State University • Procter & Gamble • PSA Peugeot Citroën • Renault • Repères • Rhodia • Rutgers, The State University of New Jersey • Saint-Gobain • Samsung • Sanofi • Servier • Singapore Telecom • Smucker • SNCF • Société Générale • Sony • Soredab • Synovate • Team Detroit • The Pert Group • TNS • Total • Turbomeca • UCLA • Unilever • University of Toronto • University of Virginia • Vanderbilt University • Veterans Administration • Virginia Tech Testimonials “BayesiaLab provides exceptional capability in probabilistic inference. This Bayesian network software allows model building based on data, expert knowledge or any combination of the two. It polishes off modeling with a suite of advanced analysis methods unavailable in other such tools. The results are clear, interpretable solutions of the prob- lem at hand. 
With BayesiaLab, Bayesia has set new standards of usability, productivity and value for Bayesian net- work software.” Michael L. Thompson Procter & Gamble (USA), CF-RD/Modeling & Simulation “BayesiaLab has been able to accelerate our consumer modeling in Family Care by cutting costs and enabling model creation in minutes – not months. Beyond that, it has allowed us to take our traditional approaches into new territo- ries: virtual product design & testing, influencing copy development and even volume forecasting. It has been the single biggest enabler of deeper consumer insights & more actionable modeling across our business.” Prabhath Nanisetty Procter & Gamble (USA), Family Care CMK
  • 9. Conceptual Highlights Expert Knowledge Modeling In today’s business environment that strives to be “data-driven”, expert knowledge seems to be perceived more and more as qualitative or is perhaps even seen as “soft” knowledge. With billions of “hard” data points being accumulated every second, what cannot be counted may not count for much these days. A life- time of experience in any particular domain may appear insignificant in comparison to the huge quantities of newly generated data. This mindset has a critical flaw, which is that causal relationships cannot be machine-learned from data. Rather, causal reasoning always requires some form of assumptions, i.e. assumptions coming from the hu- man mind. Experts often express causal paths in the form of graphs. This visual representation of causes and ef- fects has a direct analogue in the network graph in BayesiaLab’s graph panel. Nodes (representing vari- ables) can be added and positioned with a mouse- click, arcs (representing relationships) can be “drawn” between nodes. The causal direction is simply encoded in the direction of the arc. The quantitative nature of dependencies, plus many other attributes can be managed in the Node Editor, which is available by right-clicking any node. BayesiaLab thus facilitates intuitively encoding one’s own understanding of a domain with a minimum of effort. Simultaneously it enforces internal consis- tency, so that no impossible conditions are acciden- tally encoded. In addition to allowing users to directly encode their explicit knowledge by drawing a network in the graph panel, the Bayesia Expert Knowledge Elicitation Environment (BEKEE) is available as an extension to BayesiaLab. It allows to systematically elicit both explicit and tacit knowledge of experts (see chapter on BEKEE). 
Knowledge Discovery with Machine Learning Despite our emphasis on the relevance of human expert knowledge, especially for identifying causal rela- tions, there is no doubt that there is a lot to learn from data, regardless of whether the data is sparse or “big”. BayesiaLab features a very comprehensive array of highly optimized learning algorithms that can quickly uncover so-far-unknown structures in datasets. This proves to be particularly powerful regardless of whether you have a handful of variables or thousands of variables, with millions of potentially relevant rela- tionships. The Bayesia Portfolio of Research Software www.bayesia.us | www.bayesia.sg 9
  • 10. Knowledge Unification Ultimately, “deep understanding” of a domain requires knowing the parameters of the relationships be- tween the variables plus the knowledge of their causal directions. Machines are ideally suited for estimating quantities, such as the parameters, while human knowledge is still required to determine causality. So, if there were one central tenet in Bayesia’s philosophy, it would have to be “the mission of unifying machine learning and human knowledge for better reasoning.” Although the expression “the best of both worlds” may sound like a cliché, it is what Bayesian networks and BayesiaLab can indeed offer. Reasoning Under Uncertainty Based on a Bayesian network, BayesiaLab can re- liably carry out inference with multiple pieces of uncertain and even conflicting evidence. The inher- ent ability of Bayesian networks to facilitate com- putations under uncertainty makes them highly suitable for a wide range of real-world applica- tions. Reasoning under uncertainty applies in two ways: “Art” “Science” Expert Knowledge Qualitative Mathematical Representation Quantitative Bayesian Network Unified Knowledge Representation Domain The Bayesia Portfolio of Research Software 10 www.bayesia.us | www.bayesia.sg
  • 11. • Diagnosis (inference from effect to cause) • Simulation (inference from cause to effect) Maintaining uncertainty during inference automatically prevents potentially misleading point estimates. Discrete, Nonlinear and Nonparametric Modeling BayesiaLab processes all data on a discre- tized basis. As part of BayesiaLab’s Data Import Wizard, a number of methods are available to discretize any continuous vari- ables. In BayesiaLab, all “parameters” describing probabilistic relationships between variables are contained in conditional probability tables (or cubes/hypercubes when two di- mensions are exceeded), which means that no functional forms are utilized. Given this nonparametric, discrete approach, Bayesia- Lab can implicitly handle highly nonlinear relationships between variables. All the optimization criteria of BayesiaLab’s learning algorithms are based on informa- tion theory (e.g.the Minimum Description Length). With that, no assumptions of linearity are made at any point. Key Functions Unsupervised Structural Learning In statistics, unsupervised learning is typically understood to be a classification or clustering task. To make a very clear distinction, we put emphasis on “structural” in “Unsupervised Structural Learning”, which covers a number of important algorithms in BayesiaLab. The Bayesia Portfolio of Research Software www.bayesia.us | www.bayesia.sg 11
  • 12. Unsupervised Structural Learning means that BayesiaLab can discover probabilistic relationships between a large number of variables, without the need to define inputs or outputs. One might say that this is the quintessential form of knowledge discovery, as no assumptions whatsoever are required to perform these algorithms on unknown datasets (although the analyst can still use any available domain knowledge to define structural constraints). Supervised Learning Supervised Learning in BayesiaLab has the same objective as many traditional modeling techniques, i.e. to develop a model for predicting a target variable. Some other data mining packages also offer “Bayesian Networks” as an option in their array of available techniques. However, in most cases, these packages are restricted in their capabilities to a very limited type of network, i.e. the Naïve Bayesian Network. Within BayesiaLab, a vastly greater number of algorithms is available to search for a Bayesian network that best describes the target variable, while taking into account the complexity of the resulting network. The Markov Blanket algorithm should be highlighted here as its speed is particularly helpful whenever dealing with a larger number of variables. In this context, the Markov Blanket also serves as an exceptionally powerful variable selection algorithm. Finally, structural coefficient analysis, cross-validation and data perturbation functions are available for thoroughly testing and validating the robustness of candidate networks, helping the analyst to make a trade-off between precision and parsimony. These validation methods are applicable to both Supervised and Unsupervised Learning.
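The following sketch illustrates why a Markov blanket acts as a variable selector. It uses only the graph-theoretic definition (parents, children, and the children's other parents) on an already known graph. It is not BayesiaLab's Markov Blanket learning algorithm, which has to discover the structure from data, and the example network and names are hypothetical.

```python
def markov_blanket(parents, node):
    """Return the Markov blanket of `node` in a DAG given as {node: [parents]}:
    its parents, its children, and the other parents of its children."""
    children = {child for child, ps in parents.items() if node in ps}
    co_parents = {p for child in children for p in parents[child]}
    return (set(parents[node]) | children | co_parents) - {node}

# Hypothetical network, written as node -> list of parent nodes.
dag = {
    "Age": [],
    "Income": ["Age"],
    "Region": [],
    "Purchase": ["Income", "Region"],
    "Satisfaction": ["Purchase"],
    "Weather": [],
}

print(sorted(markov_blanket(dag, "Income")))
```

Once the blanket nodes are known, every other variable in the network (here "Satisfaction" and "Weather") is conditionally independent of the target and adds nothing to its prediction.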
  • 13. Clustering Clustering in BayesiaLab covers both data clustering (e.g. by observations) and variable clustering, which, as the name implies, allows the grouping of variables according to the strength of their mutual relationships. A third variation of this concept is of particular importance in BayesiaLab: the semi-automatic Multiple Clustering workflow can be described as a kind of nonlinear, nonparametric and nonorthogonal factor analysis. In practice, Multiple Clustering often serves as the basis for developing Probabilistic Structural Equation Models with BayesiaLab. Observational Inference One of the basic properties of Bayesian networks is that they are “omnidirectional observational inference engines”. Given an observation on any of the network’s nodes (or a subset of nodes), one can compute the posterior probabilities of all other nodes in the network. Both exact and approximate observational inference algorithms are implemented in BayesiaLab.
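As an illustration of “omnidirectional” inference, the sketch below computes posteriors in both directions on a two-node network by brute-force enumeration of the joint distribution. The network, the probabilities and the function names are assumptions chosen for the example. Enumeration is only practical for tiny networks and is not a description of the exact or approximate algorithms implemented in BayesiaLab.

```python
from itertools import product

# Hypothetical network: Disease -> Symptom.
p_disease = {True: 0.01, False: 0.99}
p_symptom_given_disease = {
    True: {True: 0.9, False: 0.1},   # P(Symptom | Disease = True)
    False: {True: 0.2, False: 0.8},  # P(Symptom | Disease = False)
}

def joint(disease, symptom):
    """P(Disease, Symptom) via the chain rule of the network."""
    return p_disease[disease] * p_symptom_given_disease[disease][symptom]

def posterior(query, evidence):
    """P(query = True | evidence), summing the joint over all consistent worlds."""
    numerator = denominator = 0.0
    for disease, symptom in product([True, False], repeat=2):
        world = {"Disease": disease, "Symptom": symptom}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # world contradicts the evidence
        p = joint(disease, symptom)
        denominator += p
        if world[query]:
            numerator += p
    return numerator / denominator

# Effect to cause (diagnosis) and cause to effect (simulation) use the same routine.
print(posterior("Disease", {"Symptom": True}))
print(posterior("Symptom", {"Disease": True}))
```

The same function answers both questions; only the choice of evidence and query node differs.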
  • 14. Causal Inference Besides observational inference, BayesiaLab also offers causal inference for computing the impact of intervening on a subset of variables instead of merely observing their states. Both Pearl’s Do-Operator and Jouffe’s Likelihood Matching are available for this purpose. Missing Values Processing Missing values are encountered in virtually all real-world data collection processes. Missing values could be the result of nonresponses in surveys, poor recordkeeping, server outages, attrition in longitudinal surveys or the faulty sensors of a measuring device, etc. Traditionally, missing values processing (beyond the naïve ad-hoc approaches) has been a demanding task, both methodologically and computationally. What is often overlooked is that not properly handling missing observations can lead to misleading interpretations or create a false sense of confidence in one’s findings, regardless of how many more complete observations might be available. BayesiaLab offers a range of sophisticated methods for missing values processing from which the analyst can choose. During network learning, BayesiaLab performs missing values processing automatically “behind the scenes”. More specifically, the Structural Expectation-Maximization algorithm and the Dynamic Completion algorithm are automatically applied after each modification of the network during learning, i.e. after every single arc addition, suppression and inversion. Bayesian networks actually provide several advantages for dealing with missing values, which makes it attractive to use BayesiaLab solely for that purpose. • Bayesian networks offer a unified framework for representing the joint distribution of the overall domain and simultaneously encoding the dependencies with the missing values (Heckerman, 2008). 
This implicitly addresses the requirement that Schafer and Olsen (1998) stipulate for missing values imputation, namely “any association that may prove important in subsequent analysis should be present in the imputation model.... A rich imputation model that preserves a large number of associations is desirable because it may be used for a variety of post-imputation analyses.” Also, by using a Bayesian network, the “functional form” for missing values imputation and for representing the overall model are automatically identical and thus compatible. • The inherently probabilistic nature of Bayesian networks allows missing values and their imputation to be handled nondeterministically. That means that the (needed) variance in the imputed data does not need to be generated artificially, but is inherently present. Diagnosis, Prediction and Simulation In the Bayesian network framework, diagnosis, prediction and simulation are identical computations. They all consist of inference conditional upon evidence. The distinction only exists from the perspective of the researcher, who would presumably see the symptom of a disease as an effect and the disease itself as the cause. Hence, carrying out inference based on observed symptoms is interpreted as “diagnosis”.
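The difference between observing and intervening, described under Causal Inference above, can be shown with a small numerical sketch. The three-variable network (a confounder Z influencing both a treatment X and an outcome Y) and all probabilities are hypothetical. The intervention is computed with the standard back-door adjustment that corresponds to Pearl's do-operator for this graph; Jouffe's Likelihood Matching is not shown.

```python
# Hypothetical network: Z -> X, Z -> Y, X -> Y (Z confounds the effect of X on Y).
p_z = {0: 0.5, 1: 0.5}                      # P(Z = z)
p_x1_given_z = {0: 0.2, 1: 0.8}             # P(X = 1 | Z = z)
p_y1_given_xz = {(0, 0): 0.1, (1, 0): 0.3,  # P(Y = 1 | X = x, Z = z), keyed by (x, z)
                 (0, 1): 0.5, (1, 1): 0.7}

def p_x_given_z(x, z):
    return p_x1_given_z[z] if x == 1 else 1.0 - p_x1_given_z[z]

def observe(x):
    """P(Y = 1 | X = x): seeing X = x also changes our belief about Z."""
    numerator = sum(p_z[z] * p_x_given_z(x, z) * p_y1_given_xz[(x, z)] for z in p_z)
    denominator = sum(p_z[z] * p_x_given_z(x, z) for z in p_z)
    return numerator / denominator

def intervene(x):
    """P(Y = 1 | do(X = x)): setting X cuts the arc Z -> X, so Z keeps its prior."""
    return sum(p_z[z] * p_y1_given_xz[(x, z)] for z in p_z)

print("observational difference:", observe(1) - observe(0))
print("causal effect:           ", intervene(1) - intervene(0))
```

With these numbers the observed difference in outcome rates is more than twice the causal effect, because part of the association between X and Y is carried by the confounder.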
  • 15. BayesiaLab offers a considerable number of functions relating to inference. For instance, inference can be performed by setting evidence, i.e. clicking on any one of the Monitors, and results are returned instantly for all the other Monitors. Batch Inference is available when inference needs to be computed for a large number of records. For instance, this can be used for applying a predictive score for all customers in a database. The Adaptive Questionnaire function provides guidance in terms of the optimum sequence for seeking evidence. With every piece of evidence set, BayesiaLab determines which is the next best piece of evidence to obtain for a maximum information gain with respect to the target variable. In a medical context, this allows diagnostic procedures to be optimally “escalated”, from “low-cost & small-gain” evidence (e.g. measuring the patient’s blood pressure) to “high-cost & large-gain” evidence (e.g. performing an MRI scan). Effects Analysis Many research activities focus on estimating the size of an effect, for instance establishing the treatment effect of a new drug or determining the sales impact of a new advertising campaign. Other studies are about attribution, i.e. they attempt to decompose observed effects into their causes and thus allocate contributions. All of the above questions can be answered if the domain is fully understood, which is a priori never the case. However, if we are able to build an adequate model of the domain that captures all of its dynamics, BayesiaLab will be able to extract the effects. BayesiaLab employs simulation to derive effects, as parameters per se do not exist in this nonparametric framework. As all the dynamics of the domain are encoded in discrete conditional probability tables, effect sizes only manifest themselves when different conditions are simulated. 
Total Effects Analysis, Target Mean Analysis and many more of BayesiaLab’s functions offer the analyst ways to study effects, especially nonlinear and interactive effects.
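The idea behind the Adaptive Questionnaire, choosing the next piece of evidence by its expected information gain about the target, can be sketched as follows. This is an illustrative calculation of mutual information for hypothetical tests and probabilities, not BayesiaLab's implementation. It also ignores the cost of each test, which the medical example above would trade off against the gain.

```python
from math import log2

def entropy(dist):
    """Shannon entropy in bits of a distribution given as {state: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def expected_information_gain(prior, likelihood):
    """Mutual information between the target and a test outcome, in bits.

    prior:      {target_state: P(target_state)}
    likelihood: {target_state: {outcome: P(outcome | target_state)}}
    """
    outcomes = {o for state in likelihood for o in likelihood[state]}
    gain = entropy(prior)
    for outcome in outcomes:
        p_outcome = sum(prior[t] * likelihood[t].get(outcome, 0.0) for t in prior)
        if p_outcome == 0:
            continue
        post = {t: prior[t] * likelihood[t].get(outcome, 0.0) / p_outcome for t in prior}
        gain -= p_outcome * entropy(post)  # subtract expected remaining uncertainty
    return gain

# Hypothetical target and candidate tests.
prior = {"disease": 0.5, "healthy": 0.5}
tests = {
    "blood_pressure": {"disease": {"high": 0.6, "normal": 0.4},
                       "healthy": {"high": 0.4, "normal": 0.6}},
    "mri": {"disease": {"positive": 0.95, "negative": 0.05},
            "healthy": {"positive": 0.05, "negative": 0.95}},
}

gains = {name: expected_information_gain(prior, lik) for name, lik in tests.items()}
best = max(gains, key=gains.get)
print(gains, best)
```

After each answer, the prior would be replaced by the resulting posterior and the ranking recomputed, which is what makes the sequence adaptive.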
  • 16. Analyzing Observational Studies This simulation approach also offers special opportunities for evaluating observational studies. More specifically, it can help overcome the problem of systematic differences between treatment and control groups. BayesiaLab’s Likelihood Matching performs on-the-fly matching of pretreatment covariates as part of the Direct Effects Analysis, thus yielding the “exclusive” effect of a particular variable on the target, everything else being equal. This also obviates the need for separately performing matching techniques, such as propensity score matching. Optimization The ability to perform inference across all possible states of all nodes of the network also facilitates searching for optimum values. BayesiaLab’s Target Dynamic Profile and the Resource Allocation Optimization provide the toolsets for this purpose. Using this function in combination with Direct Effects is of particular interest when searching for the optimum combination of variables that have a nonlinear relationship with the target (and co-relations between the drivers). A typical example would be searching for the optimum mix of an array of marketing instruments. BayesiaLab’s Resource Allocation Optimization with Direct Effects will search, within the constraints set by the analyst, for those scenarios that optimize the target criterion.
  • 17. Bayesia Market Simulator 1.6 Motivation For the vast majority of businesses, market share is a key performance indicator. Market share is used as a metric that allows comparing competitive performance independently from overall market size and its fluctuations. In the product planning process, the expected market share is critical, along with the overall market forecast, as together they define the sales volume expectation, which, for obvious reasons, is a key element in most business cases. As a result, it is critical for decision makers to correctly predict the future market shares of products not yet developed. The task of such market share forecasts typically falls to marketing and market research departments, which are most closely involved with understanding consumer behavior and, more specifically, the product choices consumers make. If we fully understood the consumer’s decision making process and observed all components of it, we could simply generate a deterministic model for predicting future consumer choices. However, we do not, and it is obvious that many elements contributing to a consumer’s purchase decision are inherently unobservable. Despite our limited comprehension of the true human choice process, there are a number of tools that still allow modeling consumer choice with what is observable, and accounting for what will remain unknowable. In this context, and based on the seminal works of Nobel laureate Daniel McFadden, choice modeling has emerged as an important tool in understanding and simulating consumer choice. Bayesian Networks for Choice Modeling Beyond the convenience and speed of estimating Bayesian networks with BayesiaLab, there are several noteworthy differences in modeling consumer choice with Bayesian networks compared to traditional discrete choice models. 
• Whereas utility-based choice models, such as multinomial logit models (MNL), will “flatten” the vector of attribute utilities into a single scalar value, Bayesian networks do not inherently restrict all the dimensions relating to choice. For example, learning a Bayesian network on observed vehicle choices might reveal that fuel economy and vehicle price are subject to tradeoff, while safety is a nonnegotiable basic requirement for the consumer. Correctly recognizing such dynamics is obviously critical for making predictions about future consumer choices. • Bayesian networks are nonparametric and thus they do not require the specification of a functional form. No assumptions need to be made regarding the form of links between variables. Potentially nonlinear patterns are therefore not an issue for model estimation or simulation. • Bayesian networks are inherently probabilistic, and, as such, there is no need to specify an error term. A traditional choice model would require an error term to make it nondeterministic.
  • 18. • In BayesiaLab all computations are natively discrete and therefore no transformation functions, such as logit or probit, are needed. Given that we are dealing with discrete consumer choices, this all-discrete approach is an advantage. Bayesia Market Simulator BayesiaLab and the Bayesia Market Simulator are unique in their ability to utilize Bayesian networks for choice modeling, for instance for market share simulation of new products and services. The principal idea is that a Bayesian network represents a generalization of a domain, such as the interactions between products and consumers (both stated preference and revealed preference data can be used). This means that all of the product attributes may interact with all of the consumer attributes, which can amount to hundreds of variables. Unsupervised Learning of a sufficient number of such interactions (in all their dimensions) will then generate a network that generalizes all these relationships, i.e. the network becomes a function that maps consumer attributes to product attributes. The Bayesia Market Simulator can subsequently utilize this generalization and simulate hypothetical product scenarios, such as a different combination of product features. Given the network, a new choice probability can then be computed for every single consumer across all hypothetical and real product scenarios. In summary, this provides new market shares for an alternative state of the world. With the ability to leverage revealed preference data, BayesiaLab and Bayesia Market Simulator allow using a vast range of existing research for choice predictions. BayesiaLab can learn a Bayesian network from consumer choices recorded in the form of stated preference (SP) or revealed
  • 19. preference (RP) data. The learned Bayesian network allows computing the posterior probability distribution in each choice situation, including hypothetical product alternatives (and even hypothetical consumers). As a result, we obtain a choice probability as a function of product and consumer attributes. In order to obtain a product’s projected market share, we can then simply simulate choice probabilities across all product scenarios and across all individuals in the population under study.
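The last step described above, going from individual choice probabilities to a projected market share, amounts to averaging over the population. The sketch below assumes the per-consumer choice probabilities have already been obtained from a learned network; the numbers, product names and function name are hypothetical and do not reflect the Bayesia Market Simulator's interface.

```python
def market_shares(choice_probabilities):
    """Average per-consumer choice probabilities into projected market shares.

    choice_probabilities: one dict per consumer, mapping product -> P(choice).
    """
    n_consumers = len(choice_probabilities)
    products = choice_probabilities[0].keys()
    return {
        product: sum(cp[product] for cp in choice_probabilities) / n_consumers
        for product in products
    }

# Hypothetical posterior choice probabilities for three consumers in one scenario.
scenario = [
    {"Product A": 0.7, "Product B": 0.3},
    {"Product A": 0.2, "Product B": 0.8},
    {"Product A": 0.6, "Product B": 0.4},
]

shares = market_shares(scenario)
print(shares)
```

Repeating the calculation with the probabilities obtained under a different product scenario gives the market shares for that alternative state of the world.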
  • 20. BEKEE 2.0 Motivation Everybody is talking about “Big Data” and all the opportunities that are associated with it. Very often though, we hear almost as much about the challenges that come with this flood of data. However, much more serious problems exist on the opposite end of the spectrum, where there is not enough data. Unfortunately, all the advanced knowledge discovery algorithms fail in the absence of data. In over ten years of continuous development, and in increasingly sophisticated ways, BayesiaLab has permitted deriving knowledge from data through its machine learning algorithms, very much in the spirit of understanding “Big Data”. However, BayesiaLab has maintained an equal focus on managing knowledge that exists beyond measurable and countable data points, such as the knowledge contained in the human mind. BayesiaLab’s graphical user interface has made it highly intuitive for individual subject matter experts to encode their own domain understanding into a Bayesian network, thus capturing what they explicitly or implicitly know. What is especially valuable is that one can very easily and formally capture causal relations in a Bayesian network graph, which is something that few other frameworks can do. However, when it comes to consolidating the collective knowledge from a group of experts, rather than from an individual, the process is no longer that straightforward. Traditionally, one would perhaps bring the experts together in a brainstorming session and let them form a common understanding. Subsequently, such a consensus could be encoded manually. However, brainstorming sessions are prone to introducing a wide range of biases, which can be disastrously counterproductive in studying complex domains.
  • 21. Bayesia Expert Knowledge Elicitation Environment (BEKEE) Bayesia Expert Knowledge Elicitation Environment, or BEKEE for short, is a new web application that is designed to minimize detrimental group biases. The central idea is not to coerce consensus, but rather to elicit everyone’s individual views regarding the domain under study. In order to ensure the independent elicitation of probabilities, BEKEE queries stakeholders individually via an interactive or batch questionnaire linked to the core BayesiaLab application. Retrieving expert views in such a fashion generates many “parallel universes” in terms of domain understanding. These different perspectives can be formally compared by the facilitator and potentially returned to the group for a formal debate in the case of seriously conflicting assessments. In most cases, this is an iterative process and, even if stakeholder opinions do not converge, BayesiaLab will compile all views and produce a unifying Bayesian network. This graph is now a probabilistic summary of all the available expert opinions. As such, it can be utilized as a formal representation of the underlying domain. Most importantly, this graph is not merely a visual representation. Rather, a Bayesian network is a fully computable model of the domain, which immediately facilitates the simulation of what-if scenarios.
  • 22. In fact, we can evaluate this Bayesian network model the same way as a statistical model estimated from “Big Data”. One might still prefer a data-based model, if data were indeed available, but in the absence thereof, the formally-encoded collective expert knowledge best represents what is known at the time.
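The text does not say how individually elicited probabilities are compiled into one network. Purely as an illustration of what a “probabilistic summary” of several expert views can look like, the sketch below applies a linear opinion pool (a weighted average of the experts' distributions) to a single probability assessment. The pooling rule, the weights and all names are assumptions, not a description of BEKEE or BayesiaLab.

```python
def linear_opinion_pool(assessments, weights=None):
    """Combine expert distributions over the same states by weighted averaging.

    assessments: one dict per expert, mapping state -> elicited probability.
    weights:     optional expert weights summing to 1 (equal weights by default).
    """
    n_experts = len(assessments)
    weights = weights or [1.0 / n_experts] * n_experts
    states = assessments[0].keys()
    return {
        state: sum(w * a[state] for w, a in zip(weights, assessments))
        for state in states
    }

# Hypothetical answers of three experts to the same elicitation question.
experts = [
    {"yes": 0.9, "no": 0.1},
    {"yes": 0.6, "no": 0.4},
    {"yes": 0.3, "no": 0.7},
]

pooled = linear_opinion_pool(experts)
print(pooled)
```

The spread between the individual answers (0.3 to 0.9 here) is the kind of disagreement a facilitator might return to the group for debate before pooling.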
  • 23. Bayesia Engines Developers can also access many of BayesiaLab’s functions outside the graphical user interface by using Bayesia’s Modeling and Inference Engines. You can thus directly leverage Bayesian networks in your own applications and workflows and deploy them for client use, without requiring clients to install BayesiaLab. Bayesia Engine API The Bayesia Engines are Application Programming Interfaces (APIs) delivered as a pure Java class library (JAR file) that can be integrated into any software project. With the Bayesia Modeling Engine you can create your own Bayesian networks from within your own code and subsequently perform inference with the Bayesia Inference Engine. The Bayesia Inference Engine allows you to perform inference on Bayesian networks from within your own application. Networks created with BayesiaLab or with the Modeling Engine can be used for computing inference with the Bayesia Inference Engine. A typical implementation scenario would be developing a Bayesian network offline with BayesiaLab and then deploying this network for real-time prediction on streaming data with the Bayesia Inference Engine.
  • 24. The Bayesia Inference Engine can, for instance, also serve as the back end of a web-based simulator, which can interactively perform inference on the user’s input.
  • 25. References
Barber, David. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2012.
Darwiche, Adnan. Modeling and Reasoning with Bayesian Networks. 1st ed. Cambridge University Press, 2009.
Heckerman, D. “A Tutorial on Learning with Bayesian Networks.” Innovations in Bayesian Networks (2008): 33–82.
Holmes, Dawn E., ed. Innovations in Bayesian Networks: Theory and Applications. Softcover reprint of hardcover 1st ed. 2008. Springer, 2010.
Kjaerulff, Uffe B., and Anders L. Madsen. Bayesian Networks and Influence Diagrams: A Guide to Construction and Analysis. Softcover reprint of hardcover 1st ed. 2008. Springer, 2010.
Koller, Daphne, and Nir Friedman. Probabilistic Graphical Models: Principles and Techniques. 1st ed. The MIT Press, 2009.
Koski, Timo, and John Noble. Bayesian Networks: An Introduction. 1st ed. Wiley, 2009.
Mittal, Ankush. Bayesian Network Technologies: Applications and Graphical Models. Edited by Ankush Mittal and Ashraf Kassim. 1st ed. IGI Publishing, 2007.
Neapolitan, Richard E. Learning Bayesian Networks. Prentice Hall, 2003.
Pearl, Judea. Causality: Models, Reasoning and Inference. 2nd ed. Cambridge University Press, 2009.
Pearl, Judea. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. 1st ed. Morgan Kaufmann, 1988.
Pearl, Judea, and Stuart Russell. Bayesian Networks. UCLA Cognitive Systems Laboratory, November 2000. http://bayes.cs.ucla.edu/csl_papers.html.
Pourret, Olivier, Patrick Naïm, and Bruce Marcot, eds. Bayesian Networks: A Practical Guide to Applications. 1st ed. Wiley, 2008.
Schafer, J.L., and M.K. Olsen. “Multiple Imputation for Multivariate Missing-data Problems: A Data Analyst’s Perspective.” Multivariate Behavioral Research 33, no. 4 (1998): 545–571.
Spirtes, Peter, and Clark Glymour. Causation, Prediction and Search. The MIT Press, 2001.
  • 26. Contact Information
Bayesia USA, 312 Hamlet’s End Way, Franklin, TN 37067, USA. Phone: +1 888-386-8383. info@bayesia.us, www.bayesia.us
Bayesia Singapore Pte. Ltd., 20 Cecil Street, #14-01, Equity Plaza, Singapore 049705. Phone: +65 3158 2690. info@bayesia.sg, www.bayesia.sg
Bayesia S.A.S., 6, rue Léonard de Vinci, BP 119, 53001 Laval Cedex, France. Phone: +33(0)2 43 49 75 69. info@bayesia.com, www.bayesia.com
Copyright © 2013 Bayesia S.A.S., Bayesia USA and Bayesia Singapore. All rights reserved.