Tutorial on Driver Analysis and
Product Optimization with BayesiaLab
Stefan Conrady, stefan.conrady@bayesia.us
Dr. Lionel Jouffe, jouffe@bayesia.com
December 11, 2010
Revised: March 13, 2013
Table of Contents
Introduction
BayesiaLab
Acknowledgements
Abstract
Bayesian Networks
Structural Equation Models
Probabilistic Structural Equation Models
Tutorial
Notation
Model Development
Dataset
Consumer Research
Data Import
Unsupervised Learning
Preliminary Analysis
Variable Clustering
Multiple Clustering
Analysis of Factors
Completing the PSEM
Market Driver Analysis
Product Driver Analysis
Product Optimization
Conclusion
Appendix: The Bayesian Network Paradigm
Acyclic Graphs & Bayes’s Rule
Compact Representation of the Joint Probability Distribution
References
Contact Information
Bayesia USA
Bayesia S.A.S.
Bayesia Singapore Pte. Ltd.
Copyright
Introduction
This tutorial is intended for new or prospective users of BayesiaLab. The example in this tutorial is taken
from the field of marketing science and is meant to illustrate the capabilities of BayesiaLab with a real-world
case study and actual consumer data. Beyond market researchers, analysts and researchers in many other fields will hopefully find the proposed methodology valuable and intuitive. In this context, many of the technical steps, such as data preparation and network learning, are outlined in great detail, as they apply to research with BayesiaLab in general, regardless of the domain.1
BayesiaLab
Bayesia S.A.S., based in Laval, France, has been developing BayesiaLab since 1999, and it has emerged as the leading software package for knowledge discovery, data mining and knowledge modeling using Bayesian networks. BayesiaLab enjoys broad acceptance in academic communities as well as in business and industry. The relevance of Bayesian networks, especially in the context of market research, is highlighted by Bayesia’s strategic partnership with Procter & Gamble, which has deployed BayesiaLab globally since 2007.
Acknowledgements
We would like to express our gratitude to Ares Research (www.ares-etudes.com) for generously providing
data from their consumer research for our case study.
Abstract
Market driver analysis and product optimization are among the central tasks in Product Marketing and thus relevant to virtually all types of businesses. BayesiaLab provides a unified software platform, which can, based on consumer data,
1. provide a deep understanding of the market preference structure,
2. directly generate recommendations for prioritized product actions.
The proposed approach utilizes Probabilistic Structural Equation Models (PSEM), based on machine-
learned Bayesian networks. PSEMs provide an efficient alternative to Structural Equation Models (SEM),
which have been used traditionally in market research.
1 This tutorial is based on version 5.0 of BayesiaLab.
Bayesian Networks
A Bayesian network or belief network is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph (DAG), and thereby encodes their joint probability distribution. For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases.2
Structural Equation Models
Structural Equation Modeling (SEM) is a statistical technique for testing and estimating causal relations
using a combination of statistical data and qualitative causal assumptions. This definition of SEM was ar-
ticulated by the geneticist Sewall Wright (1921), the economist Trygve Haavelmo (1943) and the cognitive
scientist Herbert Simon (1953), and formally defined by Judea Pearl (2000).
Structural Equation Models (SEM) allow both confirmatory and exploratory modeling, meaning they are
suited to both theory testing and theory development.
Probabilistic Structural Equation Models
Traditionally, specifying and estimating an SEM required a multitude of manual steps, which are typically
very time consuming, often requiring weeks or even months of an analyst’s time. PSEMs are based on the
idea of leveraging machine learning for automatically generating a structural model. As a result, creating
PSEMs with BayesiaLab is extremely fast and can thus form an immediate basis for much deeper analysis
and optimization.
2 See appendix for a brief introduction to Bayesian networks.
Tutorial
At the beginning of this tutorial, we want to emphasize the overarching objectives of this case study, so we
do not lose sight of the “big picture” as we immerse ourselves into the technicalities of BayesiaLab and
Bayesian networks.
In this study, we want to examine how product attributes perceived by consumers relate to purchase intention for specific products. Put simply, we want to understand the key drivers of purchase intent. Given the large number of attributes in our study, we also want to identify common concepts among these attributes in order to make interpretation easier and communication with managerial decision makers more effective.
Secondly, we want to utilize the generated understanding of consumer dynamics so product developers can optimize the characteristics of the products under study in order to increase purchase intent among consumers, which is our ultimate business objective.
Notation
In order to clearly distinguish between natural language, BayesiaLab-specific functions and study-specific
variable names, the following notation is used:
• BayesiaLab functions, keywords, commands, etc., are shown in bold type.
• Variable names are capitalized and italicized.
Model Development
Dataset
Consumer Research
This study is based on a monadic3 consumer survey about perfumes, which was conducted in France. In this example, we use survey responses from 1,320 women, who have evaluated a total of 11 fragrances on a wide range of attributes:
• 27 ratings on fragrance-related attributes, such as “sweet”, “flowery”, “feminine”, etc., measured on a 1-to-10 scale.
• 12 ratings on projected imagery related to someone who would be wearing the respective fragrance, e.g. “is sexy”, “is modern”, measured on a 1-to-10 scale.
• 1 variable for Intensity, a measure reflecting the level of intensity, measured on a 1-to-5 scale.4
• 1 variable for Purchase Intent, measured on a 1-to-6 scale.
• 1 nominal variable, Product, for product identification purposes.
3 A product test involving only one product, i.e. in our study each respondent evaluated only one perfume.
4 The variable Intensity is listed separately due to the a-priori knowledge of its non-linearity and the existence of a “just-
about-right” level.
Data Import
To start the analysis with BayesiaLab, we first import the data set, which is formatted as a CSV file.5 With
Data | Open Data Source | Text File, we start the Data Import wizard, which immediately provides a pre-
view of the data file.
The table displayed in the Data Import wizard shows the individual variables as columns and the responses as rows. There are a number of options available, e.g. for sampling. However, this is not necessary in our example given the relatively small size of the database.
Clicking the Next button prompts a data type analysis, which provides BayesiaLab’s best guess regarding the data type of each variable. Furthermore, the Information box provides a brief summary regarding the number of records, the number of missing values,6 filtered states, etc.
For this example, we will need to override the default data type for the Product variable, as each value is a nominal product identifier rather than a numerical scale value. We can change the data type by highlighting the Product variable and clicking the Discrete check box, which changes the color of the Product column to red.
5 CSV stands for “comma-separated values”, a common format for text-based data files.
6 There are no missing values in our database and filtered states are not applicable in this survey.
We will also define Purchase Intent and Intensity as discrete variables, as the default number of states of these variables is already adequate for our purposes.7
The next screen provides options as to how to treat any missing values. In our case, there are no missing values, so the corresponding panel is grayed-out.
Clicking the small upside-down triangle next to the variable names brings up a window with key statistics of the selected variable, in this case Fresh.
The next step is the Discretization and Aggregation dialogue, which allows the analyst to determine the type of discretization to be performed on all continuous variables.8 For this survey, and given the number of observations, it is appropriate to reduce the number of states from the original 10 states (1 through 10) to a smaller number. One could, for instance, bin the 1-10 rating into low, mid and high, or apply any other method deemed appropriate by the analyst.
7 The desired number of variable states is largely a function of the analyst’s judgment.
8 BayesiaLab requires discrete distributions for all variables.
The screenshot shows the dialogue for the Manual selection of discretization steps, which permits the analyst to select binning thresholds by point-and-click.
For this particular example, we select Equal Distances with 5 intervals for all continuous variables. This was the analyst’s choice in order to be consistent with prior research.
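For readers who want to see what this binning does outside of BayesiaLab, here is a minimal Python sketch of equal-width discretization on a simulated 1-to-10 ratings column (the column name Fresh follows the survey, but the data below is made up for illustration). Cutting the 1-10 range into five equal-width intervals yields the bin edges 2.8, 4.6, 6.4 and 8.2, which is why states such as <=2.8 and >8.2 appear in the Monitors later on.

```python
import numpy as np
import pandas as pd

# Simulated 1-to-10 ratings for one attribute; the real input is the survey file.
rng = np.random.default_rng(0)
ratings = pd.Series(rng.integers(1, 11, size=1320), name="Fresh")

# "Equal Distances" with 5 intervals: five equal-width bins over the 1-10 range.
edges = np.linspace(1, 10, 6)          # [1.0, 2.8, 4.6, 6.4, 8.2, 10.0]
binned = pd.cut(ratings, bins=edges, include_lowest=True)

print(binned.value_counts().sort_index())
```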
Clicking Select All Continuous followed by
Finish completes the import process and the 49
variables (columns) from our database are
now shown as blue nodes in the Graph Panel,
which is the main window for network editing.
Note
For choosing discretization algorithms beyond this
example, the following rule of thumb may be helpful:
• For supervised learning, choose Decision Tree.
• For unsupervised learning, choose, in the order of
priority, K-Means, Equal Distances or Equal
Frequencies.
This initial view represents a fully unconnected Bayesian network.
For reasons that will become clear later, we will initially exclude two variables, Product and Purchase Intent. We can do so by right-clicking the nodes and selecting Properties | Exclusion. Alternatively, holding “x” while double-clicking the nodes performs the same exclusion function.
Unsupervised Learning
As the next step, we will perform the first unsupervised learning of a network by selecting Learning | Association Discovering | EQ.
The resulting view shows the learned network with all the nodes in their original position.
Needless to say, this view of the network is not very intuitive. BayesiaLab has numerous built-in layout al-
gorithms, of which the Force Directed Layout is perhaps the most commonly used.
It can be invoked by View | Automatic Layout | Force Directed Layout or, alternatively, through the keyboard shortcut “p”. This shortcut is worth remembering, as it is one of the most commonly used functions.
The resulting network will look similar to the following screenshot.
To optimize the use of the available screen, clicking the Best Fit button in the toolbar “zooms to fit” the graph to the screen. In addition, rotating the graph with the Rotate Left and Rotate Right buttons helps to create a suitable view.
The final graph should closely resemble the following screenshot and, in this view, the properties of this first learned Bayesian network become immediately apparent. This network is now a compact representation of the 47 dimensions of the joint probability distribution of the underlying database.
It is very important to note that, although this learned graph happens to have a tree structure, this is not the
result of an imposed constraint.
Preliminary Analysis
The analyst can further examine this graph by switching into the Validation Mode, which immediately
opens up the Monitor Panel on the right side of the screen.
This panel is initially empty, but by clicking on any node (or multiple nodes) in the network, Monitors ap-
pear inside the Monitor Panel. The corresponding nodes are highlighted in yellow.
By default, the Monitors show the marginal distributions of all selected variables. This shows, for instance, that 9.7% of respondents rated their perfume at <=2.8 in terms of the Fresh attribute.
On this basis, one can start to experiment with the properties of this particular Bayesian network and query it. With BayesiaLab this can be done in an extremely intuitive way, i.e. by setting evidence (or observations) directly on the Monitors. For instance, we can compute the conditional probability distribution of Flowery, given that we have observed a specific value, i.e. a specific state, of Fresh. In formal notation, this would be
P(Flowery | Fresh)
We will now set Flowery to the state that represents the highest rating (>8.2), and we can immediately observe the conditional probability distribution of Fresh, i.e.
P(Fresh | Flowery = ">8.2")
The gray arrows inside the bars indicate how the distributions have changed compared to the previous distributions. This means that respondents who have rated the Flowery attribute of a perfume at the top level have a 67% probability of also assigning a top rating to the Fresh attribute.
P(Fresh = ">8.2" | Flowery = ">8.2") = 66.9%
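This conditional distribution can also be checked directly against the raw data. A minimal pandas sketch, assuming the survey responses have already been discretized into the same five states (the file name and state labels are placeholders):

```python
import pandas as pd

# Placeholder file name; the actual survey data is not distributed with this tutorial.
df = pd.read_csv("perfume_survey_discretized.csv")

# P(Fresh | Flowery = ">8.2"): restrict the data to respondents who gave Flowery the top
# rating, then look at the distribution of Fresh within that subset.
top_flowery = df[df["Flowery"] == ">8.2"]
print(top_flowery["Fresh"].value_counts(normalize=True))  # the ">8.2" share should be close to 66.9%
```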
Switching briefly back into the Modeling Mode and clicking on the Flowery node, one can see the probabilistic relationship between Flowery and Fresh in detail. By learning the network, BayesiaLab has automatically created a contingency table for every single direct relationship between nodes.
All contingency tables, together with the graph structure, thus encode the joint probability distribution of our original database.
Returning to the Validation Mode, we can
further examine the properties of our network.
Of great interest is the strength of the prob-
abilistic relationships between the variables. In
BayesiaLab this can be shown by selecting
Analysis | Graphic | Arcs’ Mutual Information.
Note
The structure of our Bayesian network may be
directed, but the directions of the arcs do not
necessarily have to be meaningful.
For observational inference, it is only necessary that
the Bayesian network correctly represents the joint
probability distribution of the underlying database.
The thickness of the arcs is now proportional to the Mutual Information, i.e. the strength of the relationship
between the nodes.
Intuitively, Mutual Information measures the information that X and Y share: it measures how much know-
ing one of these variables reduces our uncertainty about the other. For example, if X and Y are independent,
then knowing X does not provide any information about Y and vice versa, so their mutual information is
zero. At the other extreme, if X and Y are identical then all information conveyed by X is shared with Y:
knowing X determines the value of Y and vice versa.
Formal Definition of Mutual Information:
I(X;Y) = \sum_{y \in Y} \sum_{x \in X} p(x,y) \log \frac{p(x,y)}{p(x)\, p(y)}
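To make this definition concrete, the following NumPy/pandas sketch computes the Mutual Information of two discrete variables from their empirical joint distribution (a hand-rolled illustration in bits, not BayesiaLab’s implementation):

```python
import numpy as np
import pandas as pd

def mutual_information(x, y):
    """Mutual Information (in bits) of two discrete series, from their empirical joint distribution."""
    joint = pd.crosstab(x, y, normalize=True).to_numpy()   # p(x, y)
    px = joint.sum(axis=1, keepdims=True)                  # p(x)
    py = joint.sum(axis=0, keepdims=True)                  # p(y)
    nonzero = joint > 0                                    # skip empty cells to avoid log(0)
    return float(np.sum(joint[nonzero] * np.log2(joint[nonzero] / (px @ py)[nonzero])))

# Two correlated discrete variables share information; independent ones would yield roughly 0.
rng = np.random.default_rng(1)
x = pd.Series(rng.integers(1, 6, size=1000))
y = pd.Series(np.clip(x + rng.integers(-1, 2, size=1000), 1, 5))
print(mutual_information(x, y))
```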
We can also show the values of the Mutual Information on the graph by clicking on Display Arc Comments.
In the top part of the comment box attached to each arc, the Mutual Information of the arc is shown. Below, expressed as a percentage and highlighted in blue, we see the relative Mutual Information in the direction of the arc (parent node ➔ child node). And, at the bottom, we have the relative Mutual Information in the opposite direction of the arc (child node ➔ parent node).
Variable Clustering
The information about the strength of the relationships between the manifest variables can also be utilized for purposes of Variable Clustering. More specifically, a concept closely related to the Mutual Information, namely the Kullback-Leibler Divergence (K-L Divergence), is utilized for clustering.
For probability distributions P and Q of a discrete random variable, their K-L divergence is defined as
D_{KL}(P \| Q) = \sum_i P(i) \log \frac{P(i)}{Q(i)}
In words, it is the average of the logarithmic difference between the probabilities P(i) and Q(i), where the average is taken using the probabilities P(i).
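A direct transcription of this formula in Python (using the natural logarithm; the base only changes the unit):

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(P || Q) for two discrete distributions given as arrays of probabilities."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0                      # terms with P(i) = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# The divergence is asymmetric: swapping P and Q generally changes the value.
print(kl_divergence([0.5, 0.3, 0.2], [0.4, 0.4, 0.2]))
```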
Such variable clusters will allow us to induce new latent variables, each of which represents a common concept among the manifest variables.9 From here on, we will make a very clear distinction between manifest variables, which are directly observed, such as the survey responses, and latent variables, which are derived. In traditional statistics, deriving such latent variables or factors is typically performed by means of Factor Analysis, e.g. Principal Components Analysis (PCA).
In BayesiaLab, this “factor extraction” can be done very easily via the Analysis | Graphics | Variable Clustering function, which is also accessible through the keyboard shortcut “s”.
The speed with which this is performed is one of the strengths of BayesiaLab, as the resulting variable clusters are presented instantly.
9 An alternative approach is to interpret the derived concept or factor as a hidden common cause.
In this case, BayesiaLab has identified 15 variable clusters and each node is color-coded according to its
cluster membership. To interpret these newly-found clusters, we can zoom in and visually examine the
structure in the Graph Panel.
To support the interpretation process, BayesiaLab can also display a Dendrogram, which allows the analyst
to review the linkage of nodes into variable clusters.
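Conceptually, this step resembles hierarchical clustering of the variables, with a similarity derived from Mutual Information (or, in the same spirit, K-L Divergence). The sketch below illustrates the idea with SciPy’s hierarchical clustering; it is an analogy of the approach, not BayesiaLab’s actual algorithm, and it reuses the mutual_information() helper defined above.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def cluster_variables(df, n_clusters=15):
    """Group the columns of df into variable clusters, using pairwise Mutual Information
    as the similarity between variables (an analogy to BayesiaLab's Variable Clustering)."""
    cols = list(df.columns)
    n = len(cols)
    mi = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            mi[i, j] = mi[j, i] = mutual_information(df[cols[i]], df[cols[j]])

    dist = mi.max() - mi                                   # turn similarity into a distance
    np.fill_diagonal(dist, 0.0)
    links = linkage(squareform(dist, checks=False), method="average")
    labels = fcluster(links, t=n_clusters, criterion="maxclust")
    return dict(zip(cols, labels))                         # column name -> cluster id
```

Passing links to scipy.cluster.hierarchy.dendrogram would produce a linkage diagram comparable to the Dendrogram shown by BayesiaLab.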
The analyst may also choose a different number of clusters, based on his own judgment relating to the domain. A slider in the toolbar allows the analyst to choose various numbers of clusters, and the color association of the nodes will be updated instantly.
By clicking the Validate Clustering button in the toolbar, the clusters are saved and the color codes will
be formally associated with the nodes. A clustering report provides us with a formal summary of the new
factors and their associated manifest variables.10
10 Variable cluster = derived concept = unobserved latent variable = hidden cause = extracted factor.
The analyst also has the option to use his do-
main knowledge to modify which manifest
variables belong to specific factors. This can be
done by right-clicking on the Graph Panel and
selecting Class Editor.
Multiple Clustering
As our next step towards building the PSEM, we will introduce these newly-generated latent factors into our existing network and also estimate their probabilistic relationships with the manifest variables. This means we will create a new node for each latent factor, creating 15 new dimensions in our network. For this step, we will need to return to the Modeling Mode, because the introduction of the factor nodes into the network requires the learning algorithms.
More specifically, we select Learning | Multiple Clustering, which brings up the Multiple Clustering dialogue. There is a range of settings, but we will focus only on a subset. Firstly, we need to specify an output directory for the to-be-learned networks. Secondly, we need to set some parameters for the clustering process, such as the minimum and maximum number of states which can be created during the learning process.
In our example, we select Automatic Selection of the Number of Classes, which will allow the learning algorithm to find the optimum number of factor states up to a maximum of five states. This means that each new factor will need to represent the corresponding manifest variables with up to five states.
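Conceptually, each factor is induced by clustering the respondents on the manifest variables of one variable cluster, and the resulting cluster label becomes the state of the new latent node. The following rough sketch expresses that idea with scikit-learn’s KMeans; note that this is only an analogy (KMeans needs a fixed number of clusters, whereas BayesiaLab selects the number of states automatically, up to five), and the column list is taken from the Factor_0 example discussed below.

```python
import pandas as pd
from sklearn.cluster import KMeans

def induce_factor(df, manifest_columns, n_states=5, seed=0):
    """Derive a discrete latent factor by clustering respondents on a group of manifest variables.

    df is assumed to hold the numeric ratings, one row per respondent."""
    km = KMeans(n_clusters=n_states, n_init=10, random_state=seed)
    return pd.Series(km.fit_predict(df[manifest_columns]), index=df.index, name="Factor")

# Example: the manifest variables that were grouped into Factor_0 in this case study.
# factor_0 = induce_factor(survey_df, ["Trust", "Bold", "Fulfilled", "Active", "Character"])
```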
The Multiple Clustering process concludes with a report, which shows details regarding the generated clustering. The top portion of the report is shown in the following screenshot.
The detail section of Factor_0, as it relates to the manifest variables, is worth highlighting. Here, we can see the strength of the relationship between the manifest variables, such as Trust, Bold, etc., and Factor_0. In a traditional Factor Analysis, this would be the equivalent of the factor loadings.
After closing the report, we will now see a new (unconnected) network with 15 additional nodes, one for each factor, i.e. Factor_0 through Factor_14, highlighted in yellow in the screenshot.
Analysis of Factors
We can also further examine how the new factors relate to the manifest variables and how well they represent them. In the case of Factor_0, we want to understand how it can summarize our five manifest variables.
By going into our previously-specified output directory, using the Windows Explorer or the Mac Finder, we can see that 15 new networks (in BayesiaLab’s xbl format for networks) were generated. We open the specific network for Factor_0, either by directly double-clicking the xbl file or by selecting Network | Open. The factor-specific networks are identified by a suffix/extension of the format “_[Factor_#].xbl”, where “#” stands for the factor number. We then see a network consisting of the manifest variables and the factor, with arcs going from the factor to the manifest variables.
Returning to the Validation Mode, we can see five states for Factor_0, labeled C1 through C5, as well as
their marginal distribution. As Factor_0 is a target node by default, it automatically appears highlighted in
red in the Monitor Panel.
Here, we can also study how the states of the manifest variables relate to the states of Factor_0. This can be done easily by setting observations in the Monitors, e.g. setting C1 to 100%.
We now see that, given that Factor_0 is in state C1, the variable Active has a probability of approximately 75% of being in state <=2.8. Expressed more formally, we would state P(Active = “<=2.8” | Factor_0 = C1) = 74.57%. This means that respondents who have been assigned to C1 are likely to rate the Active attribute very low as well.
In the Monitor for Factor_0, in parentheses behind the cluster name, we find the expected mean value of the numeric equivalents of the states of the manifest variables, e.g. “C1 (2.08)”. That means that, given the state C1 of Factor_0, we expect the mean value of Trust, Bold, Fulfilled, Active and Character to be 2.08.
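Both quantities shown in the Monitors have simple data-level counterparts. The sketch below checks them on a synthetic stand-in for the survey data (the column names follow the case study, but the numbers and the factor assignment are simulated):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in: one row per respondent, numeric 1-10 ratings plus an assigned Factor_0 state.
rng = np.random.default_rng(2)
manifest = ["Trust", "Bold", "Fulfilled", "Active", "Character"]
df = pd.DataFrame(rng.uniform(1, 10, size=(1320, len(manifest))), columns=manifest)
df["Factor_0"] = pd.qcut(df[manifest].mean(axis=1), 5, labels=["C1", "C2", "C3", "C4", "C5"])

# P(Active <= 2.8 | Factor_0 = C1): the share of C1 respondents with a very low Active rating.
c1 = df[df["Factor_0"] == "C1"]
print((c1["Active"] <= 2.8).mean())

# Expected mean of the five manifest ratings per factor state (the number in parentheses
# behind each cluster name, e.g. "C1 (2.08)" in the actual study).
print(df.groupby("Factor_0")[manifest].mean().mean(axis=1))
```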
To go into even greater detail, we can actually look at every single respondent, i.e. every record in the database, and see which cluster they were assigned to. We select Inference | Interactive Inference, which will bring up a record selector in the toolbar.
With this record selector, we can now scroll through the entire database, review the actual ratings of the respondents, and then see the estimate of which cluster each respondent belongs to.
In our first case, record 0, we see the ratings of this respondent indicated by the manifest Monitors. In the
highlighted Monitor for Factor_0, we read that this respondent, given her responses, has an 82% probabil-
ity of belonging to Cluster 5 (C5) in Factor_0.
Moving to our second case, record 1, we see that the respondent belongs to Cluster 3 (C3) with a 96%
probability.
We can also evaluate the performance of our new network based on Factor_0 by selecting Analysis | Network Performance | Global.
This will return the log-likelihood density function as shown in the following screenshot.
Completing the PSEM
We are now returning to our main task and our principal network, which has been augmented by the 15
new factors.
Before we re-learn our network with the new factors, we need to include Purchase Intent as a variable and
also impose a number of constraints in the form of Forbidden Arcs.
In the Modeling Mode, we can include Purchase Intent by right-clicking the node and unchecking Exclusion.
This makes the Purchase Intent variable available in the next stage of learning, which is reflected visually as
well in the node color and the icon.
Our desired SEM-type network structure stipulates that manifest variables be connected exclusively to the
factors and that all the connections with Purchase Intent must also go through the factors. We achieve such
a structure by imposing the following sets of forbidden arcs:
1. No arcs between manifest variables
2. No arcs from manifest variables to factors
3. No arcs between manifest variables and Purchase Intent
We can define these forbidden arcs by right-clicking anywhere on the Graph Panel, which brings up the fol-
lowing menu.
In BayesiaLab, all manifest variables and all factors are conveniently grouped into classes, so we can easily
define which arcs are forbidden in the Forbidden Arc Editor.
Upon completing this step, we can proceed to learning our network again: Learning | Association Discovering | EQ.
The initial result will resemble the following screenshot.
Using the Force Directed Layout algorithm (shortcut “p”), as before, we can quickly transform this network
into a much more interpretable format.
Now we see the manifest variables “laddering up” to the factors, and we also see how the factors are related
to each other. Most importantly, we can observe where the Purchase Intent node was attached to the net-
work during the learning process. The structure conveys that Purchase Intent has the strongest link with
Factor_2.
Now that we can see the big picture, it is perhaps appropriate to give the factors more descriptive names.
For obvious reasons, this task is the responsibility of the analyst. In this case study, Factor_0 was given the
name “Self-Confident”. We add this name into the node comments by double-clicking Factor_0 and scroll-
ing to the right inside the Node Editor until we see the Comments tab.
We repeat this for all other nodes, and we can subsequently display the node comments for all factors by
clicking the Display Node Comment icon in the toolbar or by selecting View | Display Node Comments
from the menu.
Market Driver Analysis
Our Probabilistic Structural Equation Model is now complete, and we can use it to perform the actual analysis part of this exercise, namely to find out what “drives” Purchase Intent.
We return to the Validation Mode, right-click on Purchase Intent, and then check Set As Target Node. Double-clicking the node while pressing “t” is a helpful shortcut.
This will also change the appearance of the node and literally give it the look of a target.
In order to understand the relationship between the factors and Purchase Intent, we want to tune out all the
manifest variables for the time being. We can do so by right-clicking the Use of Classes icon in the bottom
right corner of the screen. This will bring up a list of all classes. By default, all are checked and thus visible.
For our purposes, we want to deselect All and then only check the Factor class.
The resulting view has all the manifest variables grayed-out, so the relationship between the factors becomes
more prominent. By deselecting the manifest variables, we also exclude them from subsequent analysis.
We will now right-click inside the (currently empty) Monitor Panel and select Monitors Sorted wrt Target
Variable Correlations. The keyboard shortcut “x” will do the same.
This brings up the monitor for the target node, Purchase Intent, plus all the monitors for the factors, in the
order of the strength of relationship with the Target Node.
This immediately highlights the order of importance of the factors relative to the Target Node, Purchase Intent. Another way of comprehensively displaying the importance is by selecting Reports | Target Analysis | Correlations With the Target Node.
“Correlations” is more of a metaphor here, as BayesiaLab actually orders the factors by their Mutual Information relative to the target node, Purchase Intent.
By clicking Quadrants, we can obtain a type of opportunity graph, which shows the mean value of each factor on the x-axis and the relative Mutual Information with Purchase Intent on the y-axis. Mutual Information can be interpreted as importance in this context.
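A chart of this kind can be reproduced with a few lines of matplotlib once the mean value and the relative Mutual Information of each factor are available; the factor names below come from the case study, but the numbers are placeholders rather than the study’s results.

```python
import matplotlib.pyplot as plt

# Placeholder values: factor name -> (mean value, relative Mutual Information with Purchase Intent).
factors = {"Adequacy": (6.1, 0.32), "Seduction": (5.8, 0.24), "Self-Confident": (6.5, 0.10)}

x = [v[0] for v in factors.values()]
y = [v[1] for v in factors.values()]
fig, ax = plt.subplots()
ax.scatter(x, y)
for name, (xi, yi) in factors.items():
    ax.annotate(name, (xi, yi))
ax.axvline(sum(x) / len(x), linestyle="--")   # quadrant dividers at the mean of each axis
ax.axhline(sum(y) / len(y), linestyle="--")
ax.set_xlabel("Mean value")
ax.set_ylabel("Relative Mutual Information (importance)")
plt.show()
```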
By right-clicking on the graph, we can switch between the display of the formal factor names, e.g. Factor_0, Factor_1, etc., and the factor comments, such as Adequacy and Seduction, which are much easier to interpret.
As in the previous views, it becomes very obvious that the factor Adequacy is most important with regard to Purchase Intent, followed by the factor Seduction. This is very helpful for understanding the overall market dynamics and for communicating the key drivers to managerial decision makers.
The lines dividing the graph into quadrants reflect the mean values for each axis. The upper-left quadrant
highlights opportunities as these particular factors are “above average” in importance, but “below average”
in terms of their rating.
Product Driver Analysis
Although this insight is relevant for the whole market, it does not yet allow us to work on improving specific products. For this, we need to look at product-specific graphs. In addition, we may need to introduce constraints wherever we do not have the ability to impact certain attributes. Such information must come from the domain expert, in our case from the perfumer, who will determine if and how odoriferous compounds can affect the consumers’ perception of the product attributes.
These constraints can be entered into BayesiaLab’s Cost Editor, which is accessible by right-clicking any-
where in the Graph Panel. Those attributes, which cannot be changed (as determined by the expert), will be
set to “Not Observable”. As we proceed with our analysis, these constraints will be extremely important
when searching for realistic product scenarios.
On a side note, an example from the presumably more tangible auto industry may better illustrate such
kinds of constraints. For instance, a vehicle platform may have an inherent wheelbase limitation, which thus
sets a hard limit regarding the maximum amount of rear passenger legroom. Even if consumers perceived a
need for improvement on this attribute, making such a recommendation to the engineers would be futile. As
we search for optimum product solutions with our Bayesian network, this is very important to bear in mind
and thus we must formally encode these constraints of our domain through the Cost Editor.
Product Optimization
We now return briefly to the Modeling Mode to include the Product variable, which has been excluded
from our analysis thus far. Right-clicking the node and then unchecking Properties | Exclusion will achieve
this.
At this time, we will also move beyond the analysis of factors and actually look at the individual product
attributes, so we select Manifest from the Display Classes menu.
Back in the Validation Mode, we can perform a Multi Quadrant Analysis: Tools | Multi Quadrant Analysis.
This tool allows us to look at the attribute ratings of each product and their respective importance, as expressed by the Mutual Information. Thus, we pick Product as the Selector Node and choose Mutual Information for Analysis. In this case, we also want to check Linearize Nodes’ Values and Regenerate Values, and specify an Output Directory, where the product-specific networks will be saved. In the process of generating the Multi Quadrant Analysis, BayesiaLab produces one Bayesian network for each Product. For all Products, the network structure will be identical to the network for the entire market; however, the parameters, i.e. the contingency tables, will be specific to each Product.
However, before we proceed to the product-specific networks, we will first see a Multi Quadrant Analysis by Product, and we can select each product’s graph simply by right-clicking and choosing the appropriate product identification number.
Please note that only the observable variables are visible on the chart, i.e. those variables which were not previously defined as “Not Observable” in the Cost Editor.
For Product No. 5, Personality is at the very top of the importance scale. But how does the Personality attribute compare in the competitive context? If we select Display Scales by right-clicking on the graph, it appears that Personality is already at the best level among the competitors, i.e. at the far right of the horizontal scale. On the other hand, on the Fresh attribute, Product No. 5 marks the bottom end of the competitive range.11
11 Any similarities of identifiers with actual product names are purely coincidental.
For a perfumer it would thus be reasonable to assume that there is limited room for improvement with re-
gard to Personality, and that Fresh perhaps offers a significant opportunity for Product No. 5.
To highlight the differences between products, we will also show Product No. 1 in comparison.
For Product No. 1, it becomes apparent that Intensity is highly important, but that its rating is towards the bottom end of the scale. The perfumer may thus conclude that a bolder version of the same fragrance will improve Purchase Intent.
Finally, by hovering over any data point in the opportunity chart, BayesiaLab can also display the position
of competitors compared to the reference product for any attribute. The screenshot shows Product No. 5 as
the reference and the position of competitors on the Personality attribute.
BayesiaLab also allows us to measure and save the “gap to best level” (=variations) for each product and
each variable through the Export Variations function. This formally captures our opportunity for improve-
ment.
Please note that these variations need to be saved individually by Product.
By now we have all the components necessary for a comprehensive optimization of product attributes:
1. Constraints on “non-actionable” attributes, i.e. excluding those variables which cannot be affected through product changes.
2. A Bayesian network for each Product.
3. The current attribute rating of each Product and each attribute’s importance relative to Purchase Intent.
4. The “gap to best level” (variation) for each attribute and Product.
With the above, we are now in a position to search for product configurations, based on the existing product, which would realistically optimize Purchase Intent.
We proceed individually by Product, and for illustration purposes we use Product No. 5 again. We load the product-specific network, which was previously saved when the Multi Quadrant Analysis was performed.
One of the powerful features of BayesiaLab is Target Dynamic Profile, which we will apply here on this
network to optimize Purchase Intent: Analysis | Report | Target Analysis | Target Dynamic Profile
The Target Dynamic Profile provides a number of important options:
• Profile Search Criterion: we intend to optimize the mean of the Purchase Intent.
• Criterion Optimization: maximization is the objective.
• Search Method: We select Mean and also click on Edit Variations, which allows us to manually stipulate
the range of possible variations of each attribute. In our case, however, we had saved the actual variations
of Product No. 5 versus the competition, so we load that data set, which subsequently displays the values
in the Variation Editor. For example, Fresh could be improved by 10.7% before catching up to the
highest-rated product in this attribute.
• Search Stop Criterion: We check Maximum Number of Evidence Reached and set this parameter to 4.
This means that no more than the top-four attributes will be suggested for improvement.
Upon completion of all computations, we will obtain a list of product action priorities: Fresh, Fruity, Flowery and Wooded.
The highlighted Value/Mean column shows the successive improvement upon implementation of each action. From initially 3.76, the Purchase Intent improves to 3.92, which may seem like a fairly small step. However, the importance lies in the fact that this improvement is not based on utopian thinking, but rather on attainable product improvements within the range of competitive performance.
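Conceptually, this search can be thought of as a greedy, constrained optimization: at each step, among the attributes that are observable and still have room to improve within their recorded variation, pick the change that raises the expected Purchase Intent the most, and stop once the maximum number of evidence is reached. The sketch below captures that logic in generic Python; predict_purchase_intent() is a placeholder for inference on the product-specific Bayesian network, so this illustrates the search strategy rather than BayesiaLab’s implementation.

```python
def greedy_profile(baseline, variations, predict_purchase_intent, max_evidence=4):
    """Greedily pick attribute improvements (within their allowed variation) that raise
    the predicted mean Purchase Intent; stop after max_evidence changes."""
    profile = dict(baseline)                       # attribute -> current (mean) value
    chosen = []                                    # [(attribute, resulting Purchase Intent), ...]
    for _ in range(max_evidence):
        best = None
        for attr, max_gain in variations.items():  # only attributes with a "gap to best level"
            if attr in dict(chosen) or max_gain <= 0:
                continue
            candidate = dict(profile, **{attr: profile[attr] + max_gain})
            score = predict_purchase_intent(candidate)
            if best is None or score > best[2]:
                best = (attr, candidate, score)
        if best is None:
            break
        attr, profile, score = best
        chosen.append((attr, score))
    return chosen
```

With the variations of Product No. 5 and its product-specific network plugged in, such a search would return a priority list analogous to the one above (Fresh, Fruity, Flowery, Wooded), together with the successively improved Purchase Intent values.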
Initially, we have the marginal distribution of the attributes and the original mean value for Purchase Intent,
i.e. 3.77.
To further illustrate the impact of our product actions, we will simulate their implementation step-by-step,
which is available through Inference | Interactive Inference.
With the selector in the toolbar, we can go through each product action step-by-step in the order in which
they were recommended.
Upon implementation of the first product action, we obtain the following picture and Purchase Intent grows
to 3.9. Please note that this is not a sea change in terms of Purchase Intent, but rather a realistic consumer
response to a product change.
The second change results in further subtle improvement to Purchase Intent:
The third and fourth steps are analogous and bring us to the final value for Purchase Intent of 3.92.
Although BayesiaLab generates these recommendations effortlessly, they represent a major innovation in the field of marketing science. This particular optimization task has not been tractable with traditional methods.
Conclusion
The presented case study demonstrates how BayesiaLab can transform simple survey data into a deep understanding of consumers’ thinking and quickly provide previously-inconceivable product recommendations. As such, BayesiaLab is a revolutionary tool, especially as the workflow shown here may take no more than a few hours for an analyst to implement. This kind of rapid and “actionable”12 insight is clearly a breakthrough and creates an entirely new level of relevance of research for business applications.
12 The authors cringe at the inflationary use of “actionable”, but here, for once, it actually seems appropriate.
Appendix: The Bayesian Network Paradigm13
Acyclic Graphs & Bayes’s Rule
Probabilistic models based on directed acyclic graphs have a long and rich tradition, beginning with the
work of geneticist Sewall Wright in the 1920s. Variants have appeared in many fields. Within statistics, such
models are known as directed graphical models; within cognitive science and artificial intelligence, such
models are known as Bayesian networks. The name honors the Rev. Thomas Bayes (1702-1761), whose
rule for updating probabilities in the light of new evidence is the foundation of the approach.
Rev. Bayes addressed both the case of discrete probability distributions of data and the more complicated
case of continuous probability distributions. In the discrete case, Bayes’ theorem relates the conditional and
marginal probabilities of events A and B, provided that the probability of B does not equal zero:
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
In Bayes’ theorem, each probability has a conventional name:
• P(A) is the prior probability (or “unconditional” or “marginal” probability) of A. It is “prior” in the
sense that it does not take into account any information about B; however, the event B need not occur
after event A. In the nineteenth century, the unconditional probability P(A) in Bayes’s rule was called the
“antecedent” probability; in deductive logic, the antecedent set of propositions and the inference rule
imply consequences. The unconditional probability P(A) was called “a priori” by Ronald A. Fisher.
• P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is
derived from or depends upon the specified value of B.
• P(B|A) is the conditional probability of B given A. It is also called the likelihood.
• P(B) is the prior or marginal probability of B, and acts as a normalizing constant.
Bayes’ theorem in this form gives a mathematical representation of how the conditional probability of event A given B is related to the converse conditional probability of B given A.
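A tiny numeric illustration of the rule, using the disease-and-symptom setting mentioned earlier (the probabilities are made up for the example):

```python
# Hypothetical numbers: P(disease), P(symptom | disease), P(symptom | no disease).
p_a = 0.01            # prior P(A): the disease is rare
p_b_given_a = 0.90    # likelihood P(B | A): the symptom is common when the disease is present
p_b_given_not_a = 0.05

# Normalizing constant P(B), then the posterior via Bayes' theorem.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 3))  # ~0.154: observing the symptom raises P(disease) from 1% to about 15%
```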
The initial development of Bayesian networks in the late 1970s was motivated by the need to model the top-down (semantic) and bottom-up (perceptual) combination of evidence in reading. The capability for bidirectional inferences, combined with a rigorous probabilistic foundation, led to the rapid emergence of Bayesian networks as the method of choice for uncertain reasoning in AI and expert systems, replacing earlier, ad hoc rule-based schemes.
13 Adapted from Pearl (2000), used with permission.
The nodes in a Bayesian network represent variables of interest (e.g. the temperature of a device, the gender of a patient, a feature of an object, the occurrence of an event), and the links represent statistical (informational) or causal dependencies among the variables. The dependencies are quantified by conditional probabilities for each node given its parents in the network. The network supports the computation of the posterior probabilities of any subset of variables given evidence about any other subset.
Compact Representation of the Joint Probability Distribution
“The central paradigm of probabilistic reasoning is to identify all relevant variables x1, . . . , xN in the environment [i.e. the domain under study], and make a probabilistic model p(x1, . . . , xN) of their interaction [i.e. represent the variables’ joint probability distribution].”
Bayesian networks are very attractive for this purpose as they can, by means of factorization, compactly represent the joint probability distribution of all variables.
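Concretely, the factorization implied by the directed acyclic graph is the chain rule of Bayesian networks, where each variable only requires a conditional probability table given its parents, which is exactly what the contingency tables learned earlier represent:

p(x_1, \ldots, x_N) = \prod_{i=1}^{N} p\left(x_i \mid \mathrm{pa}(x_i)\right)

Here, pa(x_i) denotes the set of parent nodes of x_i in the graph; for nodes without parents, the factor is simply the marginal distribution p(x_i).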
“Reasoning (inference) is then performed by introducing evidence that sets variables in known states, and
subsequently computing probabilities of interest, conditioned on this evidence. The rules of probability,
combined with Bayes’ rule make for a complete reasoning system, one which includes traditional deductive
logic as a special case.” (Barber, 2012)
References
Barber, David. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2012.
Darwiche, Adnan. Modeling and Reasoning with Bayesian Networks. 1st ed. Cambridge University Press,
2009.
Heckerman, D. “A Tutorial on Learning with Bayesian Networks.” Innovations in Bayesian Networks
(2008): 33–82.
Holmes, Dawn E., ed. Innovations in Bayesian Networks: Theory and Applications. Softcover reprint of
hardcover 1st ed. 2008. Springer, 2010.
Kjaerulff, Uffe B., and Anders L. Madsen. Bayesian Networks and Influence Diagrams: A Guide to Construction and Analysis. Softcover reprint of hardcover 1st ed. 2008. Springer, 2010.
Koller, Daphne, and Nir Friedman. Probabilistic Graphical Models: Principles and Techniques. 1st ed. The
MIT Press, 2009.
Koski, Timo, and John Noble. Bayesian Networks: An Introduction. 1st ed. Wiley, 2009.
Mittal, Ankush. Bayesian Network Technologies: Applications and Graphical Models. Edited by Ankush
Mittal and Ashraf Kassim. 1st ed. IGI Publishing, 2007.
Neapolitan, Richard E. Learning Bayesian Networks. Prentice Hall, 2003.
Pearl, Judea. Causality: Models, Reasoning and Inference. 2nd ed. Cambridge University Press, 2009.
———. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. 1st ed. Morgan
Kaufmann, 1988.
Pearl, Judea, and Stuart Russell. Bayesian Networks. UCLA Cognitive Systems Laboratory, November 2000. http://bayes.cs.ucla.edu/csl_papers.html.
Pourret, Olivier, Patrick Naïm, and Bruce Marcot, eds. Bayesian Networks: A Practical Guide to Applications. 1st ed. Wiley, 2008.
Schafer, J.L., and M.K. Olsen. “Multiple Imputation for Multivariate Missing-data Problems: A Data Analyst’s Perspective.” Multivariate Behavioral Research 33, no. 4 (1998): 545–571.
Spirtes, Peter, and Clark Glymour. Causation, Prediction and Search. The MIT Press, 2001.
Contact Information
Bayesia USA
312 Hamlet’s End Way
Franklin, TN 37067
USA
Phone: +1 888-386-8383
info@bayesia.us
www.bayesia.us
Bayesia S.A.S.
6, rue Léonard de Vinci
BP 119
53001 Laval Cedex
France
Phone: +33(0)2 43 49 75 69
info@bayesia.com
www.bayesia.com
Bayesia Singapore Pte. Ltd.
20 Cecil Street
#14-01, Equity Plaza
Singapore 049705
Phone: +65 3158 2690
info@bayesia.sg
www.bayesia.sg
Copyright
© 2013 Bayesia USA, Bayesia S.A.S. and Bayesia Singapore. All rights reserved.
BayesiaLab 5.0 IntroductionBayesiaLab 5.0 Introduction
BayesiaLab 5.0 Introduction
 
Car And Driver Hk Interview
Car And Driver Hk InterviewCar And Driver Hk Interview
Car And Driver Hk Interview
 

Último

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Último (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

Driver Analysis and Product Optimization with Bayesian Networks

Bayesian Networks

A Bayesian network or belief network is a directed acyclic graphical model that represents the joint probability distribution over a set of random variables and their conditional dependencies via a directed acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases.2

Structural Equation Models

Structural Equation Modeling (SEM) is a statistical technique for testing and estimating causal relations using a combination of statistical data and qualitative causal assumptions. This definition of SEM was articulated by the geneticist Sewall Wright (1921), the economist Trygve Haavelmo (1943) and the cognitive scientist Herbert Simon (1953), and formally defined by Judea Pearl (2000). Structural Equation Models (SEM) allow both confirmatory and exploratory modeling, meaning they are suited to both theory testing and theory development.

Probabilistic Structural Equation Models

Traditionally, specifying and estimating an SEM required a multitude of manual steps, which are typically very time consuming, often requiring weeks or even months of an analyst's time. PSEMs are based on the idea of leveraging machine learning for automatically generating a structural model. As a result, creating PSEMs with BayesiaLab is extremely fast and can thus form an immediate basis for much deeper analysis and optimization.

2 See appendix for a brief introduction to Bayesian networks.

Tutorial

At the beginning of this tutorial, we want to emphasize the overarching objectives of this case study, so we do not lose sight of the "big picture" as we immerse ourselves into the technicalities of BayesiaLab and Bayesian networks.

In this study, we want to examine how product attributes perceived by consumers relate to purchase intention for specific products. Put simply, we want to understand the key drivers for purchase intent. Given the large number of attributes in our study, we also want to identify common concepts among these attributes in order to make interpretation easier and communication with managerial decision makers more effective. Secondly, we want to utilize the generated understanding of consumer dynamics so product developers can optimize the characteristics of the products under study in order to increase purchase intent among consumers, which is our ultimate business objective.

Notation

In order to clearly distinguish between natural language, BayesiaLab-specific functions and study-specific variable names, the following notation is used:
• BayesiaLab functions, keywords, commands, etc., are shown in bold type.
• Variable names are capitalized and italicized.

Model Development

Dataset

Consumer Research

This study is based on a monadic3 consumer survey about perfumes, which was conducted in France. In this example, we use survey responses from 1,320 women, who have evaluated a total of 11 fragrances on a wide range of attributes:
• 27 ratings on fragrance-related attributes, such as "sweet", "flowery", "feminine", etc., measured on a 1-to-10 scale.
• 12 ratings on projected imagery related to someone who would be wearing the respective fragrance, e.g. "is sexy", "is modern", measured on a 1-to-10 scale.
• 1 variable for Intensity, a measure reflecting the level of intensity, measured on a 1-to-5 scale.4
• 1 variable for Purchase Intent, measured on a 1-to-6 scale.
• 1 nominal variable, Product, for product identification purposes.

3 A product test only involving one product, i.e. in our study each respondent evaluated only one perfume.
4 The variable Intensity is listed separately due to the a priori knowledge of its non-linearity and the existence of a "just-about-right" level.
Data Import

To start the analysis with BayesiaLab, we first import the data set, which is formatted as a CSV file.5 With Data | Open Data Source | Text File, we start the Data Import wizard, which immediately provides a preview of the data file. The table displayed in the Data Import wizard shows the individual variables as columns and the responses as rows. There are a number of options available, e.g. for sampling. However, this is not necessary in our example given the relatively small size of the database.

Clicking the Next button prompts a data type analysis, which provides BayesiaLab's best guess regarding the data type of each variable. Furthermore, the Information box provides a brief summary regarding the number of records, the number of missing values6, filtered states, etc.

For this example, we will need to override the default data type for the Product variable, as each value is a nominal product identifier rather than a numerical scale value. We can change the data type by highlighting the Product variable and clicking the Discrete check box, which changes the color of the Product column to red.

5 CSV stands for "comma-separated values", a common format for text-based data files.
6 There are no missing values in our database and filtered states are not applicable in this survey.
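As a side note, the same kind of import and type override can be sketched outside BayesiaLab, for readers who want to inspect the raw survey file first. The snippet below is a minimal illustration with pandas; the file name "perfume_survey.csv" and the exact column labels are assumptions for illustration only.

```python
# Minimal sketch of loading a survey file like the one described above.
import pandas as pd

df = pd.read_csv("perfume_survey.csv")   # hypothetical file name

# Treat the product identifier as a nominal label rather than a numeric scale,
# mirroring the "Discrete" override in the Data Import wizard.
df["Product"] = df["Product"].astype("category")

print(df.shape)                          # one row per respondent, one column per variable
print(df["Product"].value_counts())      # counts per fragrance
```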
We will also define Purchase Intent and Intensity as discrete variables, as the default number of states of these variables is already adequate for our purposes.7

The next screen provides options as to how to treat any missing values. In our case, there are no missing values, so the corresponding panel is grayed-out. Clicking the small upside-down triangle next to the variable names brings up a window with key statistics of the selected variable, in this case Fresh.

The next step is the Discretization and Aggregation dialogue, which allows the analyst to determine the type of discretization to be performed on all continuous variables.8 For this survey, and given the number of observations, it is appropriate to reduce the number of states from the original 10 states (1 through 10) to a smaller number. One could, for instance, bin the 1-10 rating into low, mid and high, or apply any other arbitrary method deemed appropriate by the analyst.

The screenshot shows the dialogue for the Manual selection of discretization steps, which permits selecting binning thresholds by point-and-click. For this particular example, we select Equal Distances with 5 intervals for all continuous variables. This was the analyst's choice in order to be consistent with prior research. Clicking Select All Continuous followed by Finish completes the import process, and the 49 variables (columns) from our database are now shown as blue nodes in the Graph Panel, which is the main window for network editing.

Note
For choosing discretization algorithms beyond this example, the following rule of thumb may be helpful:
• For supervised learning, choose Decision Tree.
• For unsupervised learning, choose, in order of priority, K-Means, Equal Distances or Equal Frequencies.

7 The desired number of variable states is largely a function of the analyst's judgment.
8 BayesiaLab requires discrete distributions for all variables.
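For readers who want to see what "Equal Distances with 5 intervals" amounts to, the sketch below reproduces the idea with pandas on a single toy rating column; BayesiaLab performs its own binning internally, so this is only an illustration of the principle.

```python
# Equal-width binning into five intervals on a toy rating series.
import pandas as pd

ratings = pd.Series([1, 3, 4, 6, 7, 8, 9, 10, 2, 5], name="Fresh")  # toy data

# pd.cut with bins=5 splits the observed range into five equal-width intervals.
fresh_binned = pd.cut(ratings, bins=5)
print(fresh_binned.value_counts().sort_index())
```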
This initial view represents a fully unconnected Bayesian network. For reasons that will become clear later, we will initially exclude two variables, Product and Purchase Intent. We can do so by right-clicking the nodes and selecting Properties | Exclusion. Alternatively, holding "x" while double-clicking the nodes performs the same exclusion function.

Unsupervised Learning

As the next step, we will perform the first unsupervised learning of a network by selecting Learning | Association Discovering | EQ. The resulting view shows the learned network with all the nodes in their original position.

Needless to say, this view of the network is not very intuitive. BayesiaLab has numerous built-in layout algorithms, of which the Force Directed Layout is perhaps the most commonly used. It can be invoked by View | Automatic Layout | Force Directed Layout or, alternatively, through the keyboard shortcut "p". This shortcut is worth remembering, as it is one of the most commonly used functions.

The resulting network will look similar to the following screenshot. To optimize the use of the available screen, clicking the Best Fit button in the toolbar "zooms to fit" the graph to the screen. In addition, rotating the graph with the Rotate Left and Rotate Right buttons helps to create a suitable view. The final graph should closely resemble the following screenshot and, in this view, the properties of this first learned Bayesian network become immediately apparent. This network is now a compact representation of the 47 dimensions of the joint probability distribution of the underlying database.
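Force-directed layouts of this kind follow the same general principle as classic graph-drawing algorithms such as Fruchterman-Reingold: edges pull connected nodes together while all nodes repel each other. As a loose illustration entirely outside BayesiaLab, the sketch below lays out a tiny toy graph with networkx; the node names and edges are invented.

```python
# Toy force-directed layout, illustrating the principle only.
import networkx as nx

G = nx.Graph()
G.add_edges_from([("Fresh", "Flowery"), ("Flowery", "Sweet"), ("Sweet", "Fruity")])

# spring_layout simulates attractive forces along edges and repulsive forces
# between all node pairs, returning 2-D coordinates for each node.
positions = nx.spring_layout(G, seed=42)
for node, (x, y) in positions.items():
    print(f"{node}: ({x:.2f}, {y:.2f})")
```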
It is very important to note that, although this learned graph happens to have a tree structure, this is not the result of an imposed constraint.

Preliminary Analysis

The analyst can further examine this graph by switching into the Validation Mode, which immediately opens up the Monitor Panel on the right side of the screen. This panel is initially empty, but by clicking on any node (or multiple nodes) in the network, Monitors appear inside the Monitor Panel. The corresponding nodes are highlighted in yellow.

By default, the Monitors show the marginal distributions of all selected variables. This shows, for instance, that 9.7% of respondents rated their perfume at <=2.8 in terms of the Fresh attribute. On this basis, one can start to experiment with the properties of this particular Bayesian network and query it. With BayesiaLab this can be done in an extremely intuitive way, i.e. by setting evidence (or observations) directly on the Monitors.

For instance, we can compute the conditional probability distribution of Flowery, given that we have observed a specific value, i.e. a specific state of Fresh. In formal notation, this would be P(Flowery | Fresh). We will now set Flowery to the state that represents the highest rating (>8.2), and we can immediately observe the conditional probability distribution of Fresh, i.e. P(Fresh | Flowery = ">8.2").
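Outside BayesiaLab, the same kind of conditional distribution can be approximated directly from the discretized survey data. The sketch below uses a normalized pandas cross-tabulation on a tiny hypothetical stand-in for the survey table; the state labels mirror the ones above.

```python
# Approximating P(Fresh | Flowery) from raw (toy) data with a normalized crosstab.
import pandas as pd

df = pd.DataFrame({
    "Flowery": ["<=2.8", ">8.2", ">8.2", "<=2.8", ">8.2"],
    "Fresh":   ["<=2.8", ">8.2", ">8.2", "<=2.8", "<=2.8"],
})

# Each row of the normalized crosstab sums to 1, i.e. it is P(Fresh | Flowery = row).
conditional = pd.crosstab(df["Flowery"], df["Fresh"], normalize="index")
print(conditional.loc[">8.2"])   # distribution of Fresh given Flowery = ">8.2"
```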
The gray arrows inside the bars indicate how the distributions have changed compared to the previous distributions. This means that respondents who have rated the Flowery attribute of a perfume at the top level have a 67% probability of also assigning a top rating to the Fresh attribute:

P(Fresh = ">8.2" | Flowery = ">8.2") = 66.9%

Switching briefly back into the Modeling Mode and clicking on the Flowery node, one can see the probabilistic relationship between Flowery and Fresh in detail. By learning the network, BayesiaLab has automatically created a contingency table for every single direct relationship between nodes. All contingency tables, together with the graph structure, thus encode the joint probability distribution of our original database.

Returning to the Validation Mode, we can further examine the properties of our network. Of great interest is the strength of the probabilistic relationships between the variables. In BayesiaLab this can be shown by selecting Analysis | Graphic | Arcs' Mutual Information.

Note
The structure of our Bayesian network may be directed, but the directions of the arcs do not necessarily have to be meaningful. For observational inference, it is only necessary that the Bayesian network correctly represents the joint probability distribution of the underlying database.
The thickness of the arcs is now proportional to the Mutual Information, i.e. the strength of the relationship between the nodes. Intuitively, Mutual Information measures the information that X and Y share: it measures how much knowing one of these variables reduces our uncertainty about the other. For example, if X and Y are independent, then knowing X does not provide any information about Y and vice versa, so their mutual information is zero. At the other extreme, if X and Y are identical then all information conveyed by X is shared with Y: knowing X determines the value of Y and vice versa.

Formal Definition of Mutual Information

I(X;Y) = \sum_{y \in Y} \sum_{x \in X} p(x,y)\,\log\!\left(\frac{p(x,y)}{p(x)\,p(y)}\right)

We can also show the values of the Mutual Information on the graph by clicking on Display Arc Comments.
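As a minimal sketch (not BayesiaLab code), this formula can be evaluated for any two discretized survey columns from their empirical joint distribution; the two toy series below are placeholders for real variables such as Fresh and Flowery.

```python
# Mutual information of two discrete variables, computed from observed data.
import numpy as np
import pandas as pd

def mutual_information(x: pd.Series, y: pd.Series) -> float:
    joint = pd.crosstab(x, y, normalize=True).to_numpy()   # p(x, y)
    px = joint.sum(axis=1, keepdims=True)                  # p(x)
    py = joint.sum(axis=0, keepdims=True)                  # p(y)
    nonzero = joint > 0                                    # 0 * log(0) terms drop out
    return float(np.sum(joint[nonzero] * np.log2((joint / (px * py))[nonzero])))

# Toy example (result in bits, because of log base 2).
x = pd.Series(["lo", "lo", "hi", "hi"])
y = pd.Series(["lo", "lo", "hi", "lo"])
print(mutual_information(x, y))
```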
In the top part of the comment box attached to each arc, the Mutual Information of the arc is shown. Below, expressed as a percentage and highlighted in blue, we see the relative Mutual Information in the direction of the arc (parent node ➔ child node). And, at the bottom, we have the relative Mutual Information in the opposite direction of the arc (child node ➔ parent node).

Variable Clustering

The information about the strength of the relationships between the manifest variables can also be utilized for purposes of Variable Clustering. More specifically, a concept closely related to the Mutual Information, namely the Kullback-Leibler Divergence (K-L Divergence), is utilized for clustering. For probability distributions P and Q of a discrete random variable, their K-L divergence is defined to be

D_{KL}(P \,\|\, Q) = \sum_i P(i)\,\log\frac{P(i)}{Q(i)}

In words, it is the average of the logarithmic difference between the probabilities P(i) and Q(i), where the average is taken using the probabilities P(i).
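A direct, generic implementation of this definition (not BayesiaLab's internal clustering code) might look as follows; the two example distributions are made up.

```python
# Kullback-Leibler divergence between two discrete distributions given as
# aligned arrays of probabilities.
import numpy as np

def kl_divergence(p, q) -> float:
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0                     # terms with P(i) = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

print(kl_divergence([0.5, 0.3, 0.2], [0.4, 0.4, 0.2]))   # > 0
print(kl_divergence([0.5, 0.3, 0.2], [0.5, 0.3, 0.2]))   # 0 for identical distributions
```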
Such variable clusters will allow us to induce new latent variables, each of which represents a common concept among the manifest variables.9 From here on, we will make a very clear distinction between manifest variables, which are directly observed, such as the survey responses, and latent variables, which are derived. In traditional statistics, deriving such latent variables or factors is typically performed by means of Factor Analysis, e.g. Principal Components Analysis (PCA). In BayesiaLab, this "factor extraction" can be done very easily via the Analysis | Graphics | Variable Clustering function, which is also accessible through the keyboard shortcut "s". The speed with which this is performed is one of the strengths of BayesiaLab, as the resulting variable clusters are presented instantly.

9 An alternative approach is to interpret the derived concept or factor as a hidden common cause.

In this case, BayesiaLab has identified 15 variable clusters, and each node is color-coded according to its cluster membership. To interpret these newly-found clusters, we can zoom in and visually examine the structure in the Graph Panel. To support the interpretation process, BayesiaLab can also display a Dendrogram, which allows the analyst to review the linkage of nodes into variable clusters.

The analyst may also choose a different number of clusters, based on his own judgment relating to the domain. A slider in the toolbar allows the analyst to choose various numbers of clusters, and the color association of the nodes is updated instantly. By clicking the Validate Clustering button in the toolbar, the clusters are saved and the color codes will be formally associated with the nodes. A clustering report provides us with a formal summary of the new factors and their associated manifest variables.10

10 Variable cluster = derived concept = unobserved latent variable = hidden cause = extracted factor.
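BayesiaLab's Variable Clustering algorithm is its own; as a loose analogy only, the idea of grouping variables by the strength of their pairwise association can be sketched with hierarchical clustering on a distance derived from mutual information. The small matrix below is invented for illustration.

```python
# Illustrative hierarchical clustering of variables on a (1 - normalized MI) distance.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Assumed symmetric matrix of pairwise normalized mutual information values
# (1.0 on the diagonal) for three manifest variables.
mi = np.array([
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.1],
    [0.2, 0.1, 1.0],
])
distance = 1.0 - mi

# Condensed upper-triangle form, as expected by scipy's linkage function.
condensed = distance[np.triu_indices_from(distance, k=1)]
tree = linkage(condensed, method="average")
print(fcluster(tree, t=2, criterion="maxclust"))   # cluster label per variable
```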
The analyst also has the option to use his domain knowledge to modify which manifest variables belong to specific factors. This can be done by right-clicking on the Graph Panel and selecting Class Editor.

Multiple Clustering

As our next step towards building the PSEM, we will introduce these newly-generated latent factors into our existing network and also estimate their probabilistic relationships with the manifest variables. This means we will create a new node for each latent factor, creating 15 new dimensions in our network. For this step, we will need to return to the Modeling Mode, because the introduction of the factor nodes into the network requires the learning algorithms.

More specifically, we select Learning | Multiple Clustering, which brings up the Multiple Clustering dialogue. There is a range of settings, but we will focus only on a subset. Firstly, we need to specify an output directory for the to-be-learned networks. Secondly, we need to set some parameters for the clustering process, such as the minimum and maximum number of states that can be created during the learning process. In our example, we select Automatic Selection of the Number of Classes, which will allow the learning algorithm to find the optimum number of factor states up to a maximum of five states. This means that each new factor will need to represent the corresponding manifest variables with up to five states.

The Multiple Clustering process concludes with a report, which shows details regarding the generated clustering. The top portion of the report is shown in the following screenshot. The detail section of Factor_0, as it relates to the manifest variables, is worth highlighting. Here, we can see the strength of the relationship between the manifest variables, such as Trust, Bold, etc., and Factor_0. In a traditional Factor Analysis, this would be the equivalent of the factor loadings.

After closing the report, we will now see a new (unconnected) network with 15 additional nodes, one for each factor, i.e. Factor_0 through Factor_14, highlighted in yellow in the screenshot.
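Conceptually, each induced factor assigns every respondent to one of a small number of states based on her ratings on that factor's manifest variables. A loose analogy, which is not BayesiaLab's algorithm, is k-means clustering on those columns; the column names below are taken from the Factor_0 example in the text, and the four data rows are invented stand-ins for survey records.

```python
# Illustrative "factor state" assignment via k-means on a factor's manifest variables.
import pandas as pd
from sklearn.cluster import KMeans

manifest_cols = ["Trust", "Bold", "Fulfilled", "Active", "Character"]
df = pd.DataFrame(
    [[2, 1, 2, 2, 3], [8, 9, 7, 8, 8], [5, 5, 6, 5, 4], [9, 8, 9, 9, 9]],  # toy records
    columns=manifest_cols,
)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
df["Factor_0_state"] = kmeans.fit_predict(df[manifest_cols])
print(df)
```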
Analysis of Factors

We can also further examine how the new factors relate to the manifest variables and how well they represent them. In the case of Factor_0, we want to understand how it can summarize our five manifest variables. By going into our previously-specified output directory, using the Windows Explorer or the Mac Finder, we can see that 15 new networks (in BayesiaLab's xbl format for networks) were generated. We open the specific network for Factor_0, either by directly double-clicking the xbl file or by selecting Network | Open. The factor-specific networks are identified by a suffix/extension of the format "_[Factor_#].xbl", where "#" stands for the factor number. We then see a network in which the factor is linked to the manifest variables by arcs going from the factor to each manifest variable.

Returning to the Validation Mode, we can see five states for Factor_0, labeled C1 through C5, as well as their marginal distribution. As Factor_0 is a target node by default, it automatically appears highlighted in red in the Monitor Panel.

Here, we can also study how the states of the manifest variables relate to the states of Factor_0. This can be done easily by setting observations on the Monitors, e.g. setting C1 to 100%. We now see that, given that Factor_0 is in state C1, the variable Active has a probability of approximately 75% of being in state <=2.8. Expressed more formally, we would state P(Active = "<=2.8" | Factor_0 = C1) = 74.57%. This means that respondents who have been assigned to C1 are likely to rate the Active attribute very low as well.

In the Monitor for Factor_0, in parentheses behind the cluster name, we find the expected mean value of the numeric equivalents of the states of the manifest variables, e.g. "C1 (2.08)". That means that, given the state C1 of Factor_0, we expect the mean value of Trust, Bold, Fulfilled, Active and Character to be 2.08.
To go into even greater detail, we can actually look at every single respondent, i.e. every record in the database, and see what cluster they were assigned to. We select Inference | Interactive Inference, which will bring up a record selector in the toolbar. With this record selector, we can now scroll through the entire database, review the actual ratings of the respondents and then see an estimate of the cluster to which each respondent belongs.

In our first case, record 0, we see the ratings of this respondent indicated by the manifest Monitors. In the highlighted Monitor for Factor_0, we read that this respondent, given her responses, has an 82% probability of belonging to Cluster 5 (C5) in Factor_0. Moving to our second case, record 1, we see that the respondent belongs to Cluster 3 (C3) with a 96% probability.

We can also evaluate the performance of our new network based on Factor_0 by selecting Analysis | Network Performance | Global. This will return the log-likelihood density function, as shown in the following screenshot.
Completing the PSEM

We now return to our main task and our principal network, which has been augmented by the 15 new factors. Before we re-learn our network with the new factors, we need to include Purchase Intent as a variable and also impose a number of constraints in the form of Forbidden Arcs.

In the Modeling Mode, we can include Purchase Intent by right-clicking the node and unchecking Exclusion. This makes the Purchase Intent variable available in the next stage of learning, which is also reflected visually in the node color and icon.

Our desired SEM-type network structure stipulates that manifest variables be connected exclusively to the factors and that all connections with Purchase Intent must also go through the factors. We achieve such a structure by imposing the following sets of forbidden arcs:
1. No arcs between manifest variables
2. No arcs from manifest variables to factors
3. No arcs between manifest variables and Purchase Intent

We can define these forbidden arcs by right-clicking anywhere on the Graph Panel, which brings up the following menu. In BayesiaLab, all manifest variables and all factors are conveniently grouped into classes, so we can easily define which arcs are forbidden in the Forbidden Arc Editor.
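The forbidden-arc constraints can be thought of as a structure-learning blacklist: every arc in the set below is excluded from the learned graph. The sketch is only conceptual; the variable names are placeholders, and the three rules mirror the numbered list above.

```python
# Building a blacklist of forbidden arcs from node classes.
from itertools import permutations, product

manifest = ["Fresh", "Flowery", "Sweet"]      # ... placeholder for all manifest variables
factors = ["Factor_0", "Factor_1"]            # ... placeholder for all factor nodes
target = "Purchase Intent"

blacklist = set()
blacklist.update(permutations(manifest, 2))             # rule 1: no arcs between manifests
blacklist.update(product(manifest, factors))            # rule 2: no arcs from manifests to factors
blacklist.update((m, target) for m in manifest)         # rule 3: no arcs manifest -> target
blacklist.update((target, m) for m in manifest)         # rule 3: no arcs target -> manifest

print(len(blacklist), "forbidden arcs")
```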
Upon completing this step, we can proceed to learning our network again: Learning | Association Discovering | EQ. The initial result will resemble the following screenshot.

Using the Force Directed Layout algorithm (shortcut "p"), as before, we can quickly transform this network into a much more interpretable format. Now we see the manifest variables "laddering up" to the factors, and we also see how the factors are related to each other. Most importantly, we can observe where the Purchase Intent node was attached to the network during the learning process. The structure conveys that Purchase Intent has the strongest link with Factor_2.

Now that we can see the big picture, it is perhaps appropriate to give the factors more descriptive names. For obvious reasons, this task is the responsibility of the analyst. In this case study, Factor_0 was given the name "Self-Confident". We add this name into the node comments by double-clicking Factor_0 and scrolling to the right inside the Node Editor until we see the Comments tab. We repeat this for all other nodes, and we can subsequently display the node comments for all factors by clicking the Display Node Comment icon in the toolbar or by selecting View | Display Node Comments from the menu.

Market Driver Analysis

Our Probabilistic Structural Equation Model is now complete, and we can use it to perform the actual analysis part of this exercise, namely to find out what "drives" Purchase Intent. We return to the Validation Mode, right-click on Purchase Intent and then check Set As Target Node. Double-clicking the node while pressing "t" is a helpful shortcut.
This will also change the appearance of the node and literally give it the look of a target.

In order to understand the relationship between the factors and Purchase Intent, we want to tune out all the manifest variables for the time being. We can do so by right-clicking the Use of Classes icon in the bottom right corner of the screen. This will bring up a list of all classes. By default, all are checked and thus visible. For our purposes, we want to deselect All and then only check the Factor class. The resulting view has all the manifest variables grayed-out, so the relationship between the factors becomes more prominent. By deselecting the manifest variables, we also exclude them from subsequent analysis.

We will now right-click inside the (currently empty) Monitor Panel and select Monitors Sorted wrt Target Variable Correlations. The keyboard shortcut "x" will do the same. This brings up the monitor for the target node, Purchase Intent, plus all the monitors for the factors, in the order of the strength of their relationship with the Target Node. This immediately highlights the order of importance of the factors relative to the Target Node, Purchase Intent.

Another way of comprehensively displaying the importance is by selecting Reports | Target Analysis | Correlations With the Target Node. "Correlations" is more of a metaphor here, as BayesiaLab actually orders the factors by their Mutual Information relative to the target node, Purchase Intent.

By clicking Quadrants, we can obtain a type of opportunity graph, which shows the mean value of each factor on the x-axis and the relative Mutual Information with Purchase Intent on the y-axis. Mutual Information can be interpreted as importance in this context.
By right-clicking on the graph, we can switch between the display of the formal factor names, e.g. Factor_0, Factor_1, etc., and the factor comments, such as Adequacy or Seduction, which are much easier to interpret. As in the previous views, it becomes very obvious that the factor Adequacy is most important with regard to Purchase Intent, followed by the factor Seduction. This is very helpful for understanding the overall market dynamics and for communicating the key drivers to managerial decision makers.

The lines dividing the graph into quadrants reflect the mean values for each axis. The upper-left quadrant highlights opportunities, as these particular factors are "above average" in importance, but "below average" in terms of their rating.
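A simple way to see how such a quadrant view is constructed is to plot each factor's mean rating against its relative mutual information with Purchase Intent and draw the dividing lines at the means of each axis. The sketch below does this with matplotlib; aside from Adequacy, Seduction and Self-Confident, which are named in the text, the labels and all numeric values are invented.

```python
# Illustrative quadrant ("opportunity") chart for the factors.
import matplotlib.pyplot as plt

factors = {                    # name: (mean rating, relative mutual information)
    "Adequacy":       (6.1, 0.30),
    "Seduction":      (6.8, 0.22),
    "Self-Confident": (5.9, 0.08),
    "Factor_3":       (7.2, 0.05),
}

x = [v[0] for v in factors.values()]
y = [v[1] for v in factors.values()]

fig, ax = plt.subplots()
ax.scatter(x, y)
for name, (xi, yi) in factors.items():
    ax.annotate(name, (xi, yi))
ax.axvline(sum(x) / len(x), linestyle="--")   # vertical divider at mean rating
ax.axhline(sum(y) / len(y), linestyle="--")   # horizontal divider at mean importance
ax.set_xlabel("Mean value")
ax.set_ylabel("Relative mutual information with Purchase Intent")
plt.show()
```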
Product Driver Analysis

Although this insight is relevant for the whole market, it does not yet allow us to work on improving specific products. For this we need to look at product-specific graphs. In addition, we may need to introduce constraints for attributes we do not have the ability to change. Such information must come from the domain expert, in our case from the perfumer, who will determine if and how odoriferous compounds can affect the consumers' perception of the product attributes.

These constraints can be entered into BayesiaLab's Cost Editor, which is accessible by right-clicking anywhere in the Graph Panel. Those attributes that cannot be changed (as determined by the expert) will be set to "Not Observable". As we proceed with our analysis, these constraints will be extremely important when searching for realistic product scenarios.

On a side note, an example from the presumably more tangible auto industry may better illustrate such constraints. For instance, a vehicle platform may have an inherent wheelbase limitation, which sets a hard limit on the maximum amount of rear passenger legroom. Even if consumers perceived a need for improvement on this attribute, making such a recommendation to the engineers would be futile. As we search for optimum product solutions with our Bayesian network, this is very important to bear in mind, and thus we must formally encode these constraints of our domain through the Cost Editor.

Product Optimization

We now return briefly to the Modeling Mode to include the Product variable, which has been excluded from our analysis thus far. Right-clicking the node and then unchecking Properties | Exclusion will achieve this. At this time, we will also move beyond the analysis of factors and actually look at the individual product attributes, so we select Manifest from the Display Classes menu.

Back in the Validation Mode, we can perform a Multi Quadrant Analysis: Tools | Multi Quadrant Analysis. This tool allows us to look at the attribute ratings of each product and their respective importance, as expressed with the Mutual Information. Thus, we pick Product as the Selector Node and choose Mutual Information for Analysis. In this case, we also want to check Linearize Nodes' Values and Regenerate Values, and specify an Output Directory, where the product-specific networks will be saved. In the process of generating the Multi Quadrant Analysis, BayesiaLab produces one Bayesian network for each Product. For all Products the network structure will be identical to the network for the entire market; however, the parameters, i.e. the contingency tables, will be specific to each Product.

However, before we proceed to the product-specific networks, we will first see a Multi Quadrant Analysis by Product, and we can select each product's graph simply by right-clicking and choosing the appropriate product identification number. Please note that only the observable variables are visible on the chart, i.e. those variables which were not previously defined as "Not Observable" in the Cost Editor.

For Product No. 5, Personality is at the very top of the importance scale. But how will the Personality attribute compare in the competitive context? If we Display Scales by right-clicking on the graph, it appears that Personality is already at the best level among the competitors, i.e. to the far right of the horizontal scale. On the other hand, on the Fresh attribute, Product No. 5 marks the bottom end of the competitive range.11

11 Any similarities of identifiers with actual product names are purely coincidental.
For a perfumer it would thus be reasonable to assume that there is limited room for improvement with regard to Personality, and that Fresh perhaps offers a significant opportunity for Product No. 5.

To highlight the differences between products, we will also show Product No. 1 in comparison. For Product No. 1 it becomes apparent that Intensity is highly important, but that its rating is towards the bottom end of the scale. The perfumer may thus conclude that a bolder version of the same fragrance will improve Purchase Intent.

Finally, by hovering over any data point in the opportunity chart, BayesiaLab can also display the position of competitors compared to the reference product for any attribute. The screenshot shows Product No. 5 as the reference and the position of competitors on the Personality attribute.

BayesiaLab also allows us to measure and save the "gap to best level" (= variations) for each product and each variable through the Export Variations function. This formally captures our opportunity for improvement.
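The "gap to best level" has a straightforward interpretation: for each attribute, it is the difference between the best-rated competitor and the product under consideration. The sketch below reproduces that idea on a hypothetical product-by-attribute table of mean ratings; all numbers and product labels are invented.

```python
# Gap to best level per product and attribute, from a toy table of mean ratings.
import pandas as pd

means = pd.DataFrame(
    {"Fresh": [5.2, 6.4, 7.1], "Personality": [6.9, 7.8, 7.0]},
    index=["Product 1", "Product 5", "Product 7"],
)

gap_to_best = means.max(axis=0) - means   # 0 for the best product on an attribute
print(gap_to_best.loc["Product 5"])       # Product 5: gap on Fresh, none on Personality
```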
Please note that these variations need to be saved individually by Product.

By now we have all the components necessary for a comprehensive optimization of product attributes:
1. Constraints on "non-actionable" attributes, i.e. excluding those variables that cannot be affected through product changes.
2. A Bayesian network for each Product.
3. The current attribute rating of each Product and each attribute's importance relative to Purchase Intent.
4. The "gap to best level" (variation) for each attribute and Product.

With the above, we are now in a position to search for realistic product configurations, based on the existing product, that would optimize Purchase Intent. We proceed individually by Product, and for illustration purposes we use Product No. 5 again. We load the product-specific network, which was previously saved when the Multi Quadrant Analysis was performed.

One of the powerful features of BayesiaLab is Target Dynamic Profile, which we will apply here on this network to optimize Purchase Intent: Analysis | Report | Target Analysis | Target Dynamic Profile.

The Target Dynamic Profile provides a number of important options:
• Profile Search Criterion: we intend to optimize the mean of Purchase Intent.
• Criterion Optimization: maximization is the objective.
• Search Method: We select Mean and also click on Edit Variations, which allows us to manually stipulate the range of possible variations of each attribute. In our case, however, we had saved the actual variations of Product No. 5 versus the competition, so we load that data set, which subsequently displays the values in the Variation Editor. For example, Fresh could be improved by 10.7% before catching up to the highest-rated product in this attribute.
• Search Stop Criterion: We check Maximum Number of Evidence Reached and set this parameter to 4. This means that no more than the top four attributes will be suggested for improvement.

Upon completion of all computations, we will obtain a list of product action priorities: Fresh, Fruity, Flowery and Wooded. The highlighted Value/Mean column shows the successive improvement upon implementation of each action. From initially 3.76, the Purchase Intent improves to 3.92, which may seem like a fairly small step. However, the importance lies in the fact that this improvement is not based on utopian thinking, but rather on attainable product improvements within the range of competitive performance.
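Conceptually, this kind of search can be pictured as a greedy procedure: at each step, pick the attribute change (within its allowed variation) that raises the expected target mean the most, and stop after four pieces of evidence. The sketch below is only a toy illustration of that idea, not BayesiaLab's algorithm; expected_target_mean is a made-up stand-in for inference in the product-specific network, and all weights and variations other than the 10.7% figure are invented.

```python
# Conceptual greedy search over attribute improvements.
from typing import Dict

def expected_target_mean(evidence: Dict[str, float]) -> float:
    # Toy stand-in for computing the mean of Purchase Intent given evidence.
    base = 3.76
    weights = {"Fresh": 0.5, "Fruity": 0.4, "Flowery": 0.3, "Wooded": 0.2}  # invented
    return base + sum(weights.get(attr, 0.0) * delta for attr, delta in evidence.items())

# Allowed improvement per attribute ("gap to best level"); 0.107 echoes the 10.7% above.
variations = {"Fresh": 0.107, "Fruity": 0.09, "Flowery": 0.08, "Wooded": 0.05}

evidence: Dict[str, float] = {}
for _ in range(4):                                   # Maximum Number of Evidence = 4
    remaining = {a: d for a, d in variations.items() if a not in evidence}
    if not remaining:
        break
    best = max(remaining, key=lambda a: expected_target_mean({**evidence, a: remaining[a]}))
    evidence[best] = remaining[best]
    print(best, round(expected_target_mean(evidence), 3))
```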
Initially, we have the marginal distribution of the attributes and the original mean value for Purchase Intent, i.e. 3.77. To further illustrate the impact of our product actions, we will simulate their implementation step by step, which is available through Inference | Interactive Inference. With the selector in the toolbar, we can go through each product action in the order in which it was recommended.

Upon implementation of the first product action, we obtain the following picture, and Purchase Intent grows to 3.9. Please note that this is not a sea change in terms of Purchase Intent, but rather a realistic consumer response to a product change. The second change results in a further subtle improvement to Purchase Intent. The third and fourth steps are analogous and bring us to the final value for Purchase Intent of 3.92.

Although BayesiaLab generates these recommendations effortlessly, they represent a major innovation in the field of marketing science. This particular optimization task has not been tractable with traditional methods.

Conclusion

The presented case study demonstrates how BayesiaLab can transform simple survey data into a deep understanding of consumers' thinking and quickly provide previously inconceivable product recommendations. As such, BayesiaLab is a revolutionary tool, especially as the workflow shown here may take no more than a few hours for an analyst to implement. This kind of rapid and "actionable"12 insight is clearly a breakthrough and creates an entirely new level of relevance of research for business applications.

12 The authors cringe at the inflationary use of "actionable", but here, for once, it actually seems appropriate.
Appendix: The Bayesian Network Paradigm13

Acyclic Graphs & Bayes's Rule

Probabilistic models based on directed acyclic graphs have a long and rich tradition, beginning with the work of geneticist Sewall Wright in the 1920s. Variants have appeared in many fields. Within statistics, such models are known as directed graphical models; within cognitive science and artificial intelligence, such models are known as Bayesian networks. The name honors the Rev. Thomas Bayes (1702-1761), whose rule for updating probabilities in the light of new evidence is the foundation of the approach.

Rev. Bayes addressed both the case of discrete probability distributions of data and the more complicated case of continuous probability distributions. In the discrete case, Bayes' theorem relates the conditional and marginal probabilities of events A and B, provided that the probability of B does not equal zero:

P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}

In Bayes' theorem, each probability has a conventional name:
• P(A) is the prior probability (or "unconditional" or "marginal" probability) of A. It is "prior" in the sense that it does not take into account any information about B; however, the event B need not occur after event A. In the nineteenth century, the unconditional probability P(A) in Bayes's rule was called the "antecedent" probability; in deductive logic, the antecedent set of propositions and the inference rule imply consequences. The unconditional probability P(A) was called "a priori" by Ronald A. Fisher.
• P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is derived from or depends upon the specified value of B.
• P(B|A) is the conditional probability of B given A. It is also called the likelihood.
• P(B) is the prior or marginal probability of B, and acts as a normalizing constant.

Bayes' theorem in this form gives a mathematical representation of how the conditional probability of event A given B is related to the converse conditional probability of B given A.

The initial development of Bayesian networks in the late 1970s was motivated by the need to model the top-down (semantic) and bottom-up (perceptual) combination of evidence in reading. The capability for bidirectional inferences, combined with a rigorous probabilistic foundation, led to the rapid emergence of Bayesian networks as the method of choice for uncertain reasoning in AI and expert systems, replacing earlier ad hoc rule-based schemes.

13 Adapted from Pearl (2000), used with permission.
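As a small worked instance of the rule, with entirely made-up numbers for a disease A and a symptom B:

```python
# Numeric illustration of Bayes' rule with hypothetical probabilities.
p_a = 0.01                       # prior P(A)
p_b_given_a = 0.9                # likelihood P(B | A)
p_b_given_not_a = 0.05           # P(B | not A)

p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)    # normalizing constant P(B)
p_a_given_b = p_b_given_a * p_a / p_b                     # posterior P(A | B)
print(round(p_a_given_b, 3))     # about 0.154
```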
The nodes in a Bayesian network represent variables of interest (e.g. the temperature of a device, the gender of a patient, a feature of an object, the occurrence of an event) and the links represent statistical (informational) or causal dependencies among the variables. The dependencies are quantified by conditional probabilities for each node given its parents in the network. The network supports the computation of the posterior probabilities of any subset of variables given evidence about any other subset.

Compact Representation of the Joint Probability Distribution

"The central paradigm of probabilistic reasoning is to identify all relevant variables x1, . . . , xN in the environment [i.e. the domain under study], and make a probabilistic model p(x1, . . . , xN) of their interaction [i.e. represent the variables' joint probability distribution]." Bayesian networks are very attractive for this purpose as they can, by means of factorization, compactly represent the joint probability distribution of all variables. "Reasoning (inference) is then performed by introducing evidence that sets variables in known states, and subsequently computing probabilities of interest, conditioned on this evidence. The rules of probability, combined with Bayes' rule make for a complete reasoning system, one which includes traditional deductive logic as a special case." (Barber, 2012)
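The factorization mentioned above takes the familiar chain-rule form, in which each variable is conditioned only on its parents in the DAG:

P(x_1, \ldots, x_N) = \prod_{i=1}^{N} P\bigl(x_i \mid \mathrm{Pa}(x_i)\bigr)

where Pa(x_i) denotes the set of parent nodes of x_i. With few parents per node, the required conditional probability tables are far smaller than the full joint probability table.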
References

Barber, David. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2012.
Darwiche, Adnan. Modeling and Reasoning with Bayesian Networks. 1st ed. Cambridge University Press, 2009.
Heckerman, D. "A Tutorial on Learning with Bayesian Networks." Innovations in Bayesian Networks (2008): 33–82.
Holmes, Dawn E., ed. Innovations in Bayesian Networks: Theory and Applications. Softcover reprint of hardcover 1st ed. 2008. Springer, 2010.
Kjaerulff, Uffe B., and Anders L. Madsen. Bayesian Networks and Influence Diagrams: A Guide to Construction and Analysis. Softcover reprint of hardcover 1st ed. 2008. Springer, 2010.
Koller, Daphne, and Nir Friedman. Probabilistic Graphical Models: Principles and Techniques. 1st ed. The MIT Press, 2009.
Koski, Timo, and John Noble. Bayesian Networks: An Introduction. 1st ed. Wiley, 2009.
Mittal, Ankush. Bayesian Network Technologies: Applications and Graphical Models. Edited by Ankush Mittal and Ashraf Kassim. 1st ed. IGI Publishing, 2007.
Neapolitan, Richard E. Learning Bayesian Networks. Prentice Hall, 2003.
Pearl, Judea. Causality: Models, Reasoning and Inference. 2nd ed. Cambridge University Press, 2009.
———. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. 1st ed. Morgan Kaufmann, 1988.
Pearl, Judea, and Stuart Russell. Bayesian Networks. UCLA Cognitive Systems Laboratory, November 2000. http://bayes.cs.ucla.edu/csl_papers.html.
Pourret, Olivier, Patrick Naïm, and Bruce Marcot, eds. Bayesian Networks: A Practical Guide to Applications. 1st ed. Wiley, 2008.
Schafer, J.L., and M.K. Olsen. "Multiple Imputation for Multivariate Missing-data Problems: A Data Analyst's Perspective." Multivariate Behavioral Research 33, no. 4 (1998): 545–571.
Spirtes, Peter, and Clark Glymour. Causation, Prediction and Search. The MIT Press, 2001.
Contact Information

Bayesia USA
312 Hamlet's End Way
Franklin, TN 37067
USA
Phone: +1 888-386-8383
info@bayesia.us
www.bayesia.us

Bayesia S.A.S.
6, rue Léonard de Vinci
BP 119
53001 Laval Cedex
France
Phone: +33(0)2 43 49 75 69
info@bayesia.com
www.bayesia.com

Bayesia Singapore Pte. Ltd.
20 Cecil Street
#14-01, Equity Plaza
Singapore 049705
Phone: +65 3158 2690
info@bayesia.sg
www.bayesia.sg

Copyright
© 2013 Bayesia USA, Bayesia S.A.S. and Bayesia Singapore. All rights reserved.