jurisdiction with respect to classes of problems, who has resources, who has expert knowledge.
When an agent or agency is tasked to solve a problem, or to develop a plan to solve it, the next
step is to establish what is known about the problem, including what would constitute a
successful outcome of the effort. An outcome may be a plan for action, an action, or simply
answers to questions, e.g., find the missing aircraft or develop a plan to combat global warming.
The ability, and necessity, to offload cognitive processing to computers fundamentally changes
the engineering tool set. Cognition is the process by which the sensory input is transformed,
reduced, elaborated, stored, recovered, and used. In science, cognition is the mental processing
that includes attention, working memory, comprehending and producing language,
calculating, reasoning, problem solving, and decision making. Engineers, as humans, have
always used cognitive processing to solve engineering problems. What is new for engineering is
that wicked problems may require the processing of immense volumes of data whose meaning is
not known a priori. To get meaning from the data, cognitive processing is needed. Because of
the volume of data, some of this processing needs to be offloaded to computers. Much of the
decision making for action will still be performed by humans. Engineers need to understand how
software tools that use cognitive processing work, how they can make problem solving easier, as
well as the pitfalls of their use. To employ machine cognition successfully requires some
understanding of how they get their results, and how or whether the results are justified. If some
action is taken which has serious consequences such as closing an airspace, the justification for
the action will almost certainly have to be given. Engineers will need computational systems that
can explain how their conclusions were reached.
Engineering process as a cognitive loop
A wicked problem occurs and plays out through the dynamic interactions of systems within
an environment. The environment itself is inherently complex and uncertain and may be a broad
physical environment with its population of flora, fauna, and human organizations and structures.
When a cognitive process is employed to solve engineering problems it means that the agency
applying the process starts with background knowledge, makes observations regarding the
problem context as circumstances unfold. Observation leads to the development of hypotheses.
Hypotheses are tested within the context of the environment against data, and are modified, or
discarded and replaced, until a solution is obtained or the process fails. The cognitive processes
used in engineering are a speeded-up version of the scientific method: develop hypotheses, test
them, and discard or refine them until a solution is found. Cognitive processes have been identified and
described in multiple fields. They have been employed as the architecture for computer problem
solving systems (Kulkarni and Simon 1988).
Figure 1. OODA Loop
Figure 1, taken from (Boyd 1995) illustrates a cognitive process. The OODA Loop, developed
by USAF Colonel John Boyd is a cycle of observe, orient, decide, and act. Boyd applied the
concept to the combat operations process, often at the strategic level in military operations.
Several authors have suggested applying the OODA Loop to solving engineering problems
(Garrett 2014). The OODA Loop is a good description of the engineering process
architecture. The process operates on an environment context by making observations; the orient
task converts the observations to knowledge based on previous experience; decisions involve the
ability to choose hypotheses based on the information, and test them. Within such an OODA
loop, the results of observations are continually used to update the knowledge base of the actors.
Actions result from hypothesis testing. An action may be simply to revise a hypothesis and collect
more data to find patterns that lead to solutions.
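The cycle just described can be sketched as a simple program loop. The sketch below is purely illustrative: the function names and the termination convention are our own invention, not part of Boyd's formulation.

```python
# Minimal illustrative sketch of an OODA-style loop; all names are invented.
def ooda_loop(observe, orient, decide, act, knowledge, max_cycles=100):
    """Cycle observe -> orient -> decide -> act until a solution is found."""
    for _ in range(max_cycles):
        observation = observe()                      # observe the environment
        knowledge = orient(knowledge, observation)   # fold observation into knowledge
        hypothesis = decide(knowledge)               # choose a hypothesis to test
        outcome = act(hypothesis)                    # test it, possibly acting on the world
        if outcome.get("solved"):
            return outcome
    return None                                      # the process failed
```

Note that the knowledge base is threaded through every cycle, reflecting the point above that observations continually update the knowledge of the actors.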
An engineering loop, like the OODA, always starts with initial knowledge about the world, i.e.,
some characterization of individuals which we recognize and group into classes or types.
Hopefully, no problem solving effort starts with a completely blank sheet of paper. The initial
state consists of the metadata about what is relevant and the general assumptions about how the
world works. This background knowledge is the context in which the loop of observing,
developing hypotheses, discarding or refining them to achieve a solution operates. The
background knowledge is what we generally assume to be true and don’t worry about unless
some radical anomaly contradicts our basic theories. We collect facts in accordance with this
model of the world. For example, when the problem is to detect bank fraud, the background
knowledge includes what constitutes banks, accounts in banks, transactions, etc. More likely, we
find that the background knowledge is incomplete. What constitutes a bank may turn out to be
ambiguous.
Problem solving, particularly for wicked engineering problems, almost always involves
generating hypotheses, testing them, and refining or discarding them, i.e., the application of
the scientific method. When searching for a missing aircraft one might make assumptions
regarding the bounds of where it could fly with its load of fuel. Additional information may be
sought that allows one to further restrict the possible location of the aircraft. We keep theories
only as long as data confirms them. If data contradicts a theory, the theory is inconsistent with
the data and we discard it. A theory may also be internally inconsistent: if our theory implies
that a missing aircraft is in two distinct places at once, then the theory is inconsistent. Cognitive
science is now beginning to understand how a cognitive process can generate and recognize
concepts (Minzer 2014), which is a precursor to enabling computers to generate hypotheses.
This means that the generation of hypotheses may itself begin to be offloaded to computers.
Problem Solving Scenario
The following scenario is intended to illustrate the challenges involved in using an OODA
loop that employs computers for cognitive processing, in particular the challenges of
representing problem context knowledge within computers. The kind of knowledge needed for
computation is the same as that which has always been used by experts tasked with solving
problems “manually.” The only difference is that keeping track of the information for large-scale
complex problems without the help of computational processing is more than an individual or
even a team can manage. Further, one may want to task computational systems to collect and
interpret data, and to take action to collect more data, without human intervention.
The problem context can be decomposed into goals, a physical environment and an
organizational environment. Suppose a hazardous cargo goes missing, or a system which uses
hazardous materials fails catastrophically. An agency is given overall responsibility to assess the
problem, find and secure any hazardous material, protect the population and environment, as
well as prepare the public and any potential legal response. There often isn’t one agency which has
clear overall responsibility, but we ignore how this is resolved and assume a single agency is
in charge. The execution of any action plan will involve agencies, and require resources, not
directly under the agency responsible for a solution. For example, the resources needed may
belong to another nation. A plan must be developed and executed; the OODA Loop integrates
observation of the physical environment using resources of the organizational environment. The
plan used or produced by the OODA loop must identify the resources needed and the authority
signoffs necessary to perform any actions.
Figure 2. Problem Context Knowledge
The detail and kind of knowledge required for problem solving depends on the breadth and
depth of the problem. As the problem solving task evolves, the initial knowledge representation
may need to change significantly. The initial knowledge base represents what is known or
assumed relative to the problem. Ideally, the change would be refinement of the initial
knowledge representation. Figure 2 is a schematic illustration of some of the kinds of knowledge
that might be needed for a wicked problem. At the top of the diagram is a physical representation
of the physical terrain, overlaid with geographic information, then with mechanical information,
at least for a localized region. On the bottom are categories of organizations that might play a
role in the problem solving process. On the right side is a trace of the OODA Loop problem
solving process.
To decide what level of computer knowledge representation is needed for a problem, one
starts with the kinds of questions that need to be answered. For example:
• what is the location of a toxic substance?
• what would happen if an explosion occurred at a specific location in a structure?
• how far and how fast would toxic material disperse?
• what evidence is there that certain precursor events have occurred?
• is there any relationship between certain kinds of events?
Of course, when one attempts to answer these questions, external information in the possession
of outside experts will almost certainly be needed.
Big Data as Game Changer
The OODA loop is routinely employed by individuals and organizations in engineering and
business, as well as in daily life, for a variety of situations. The OODA Loop is also used by software
applications which deliver real-time monitoring and tracking to produce operational intelligence.
A typical application may consist of a hierarchy with multiple agents, each using an OODA
Loop. In the realm of commercial applications, OODA Loop architectures are now mainstream
where large data stores are accessed and billions of events are processed. These software
applications use inference from historical patterns, reason to gather further information and form
conclusions that are then acted upon: alerts can be sent, processes can be triggered, and decisions
can be made and acted upon. An application may track what an individual or organization is
doing, determine if there is a change in behavior, attempt to discover why they are doing it, and
decide if action is warranted, and act.
Cell phone usage within the United States can provide an example of the scale of big data. The
Pew Research Center (Smith, 2013) estimates that 83% of American adults have a cell phone.
The survey then found that the average adult sends 42 text messages and makes 12 voice calls
each day. That corresponds to about 3 trillion text messages and almost 1 trillion voice calls a
year just in the United States by adults in 2013.
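These totals can be checked with back-of-the-envelope arithmetic. The figure of roughly 240 million U.S. adults used below is our assumption for the check, not a number from the survey.

```python
# Rough scale check of the cell-phone figures; 240 million U.S. adults is an assumption.
adults = 240e6
cell_owners = 0.83 * adults                 # roughly 200 million adult cell-phone users

texts_per_year = cell_owners * 42 * 365     # on the order of 3 trillion text messages
calls_per_year = cell_owners * 12 * 365     # just under 1 trillion voice calls

print(f"texts/year: {texts_per_year:.2e}")
print(f"calls/year: {calls_per_year:.2e}")
```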
As used here data are the bit strings, arrays, streams that our sensors collect and process.
Information is data that has been tagged with metadata that tells where it came from and what it
is about. Only when data has been tagged and classified does it become knowledge. For example,
only when signal data has been classified as a telephone call is it knowledge. Knowledge is
always relative to the classification schemes we use to understand the world. People have been
building information systems for banking and other domains for a long time. These systems
represent knowledge, knowledge based on the classification of entities as banks, accounts, and
transactions. Modern programming languages have concepts such as classes that can readily be
used to represent concepts. However, the architectures of these information systems are generally
too static to be used for cognitive processing. Cognitive processing requires that new
classifications be added dynamically. If the knowledge base of categories used for classifying
data is encoded directly in the program, then revising the knowledge base requires recoding.
This can be time consuming and expensive. Thus, the switch from hard-coded
information systems to knowledge representation systems is essential. In knowledge
representation systems the knowledge is represented as data, so the classification
structure can be changed at execution time.
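One way to make this concrete is to hold the classification scheme itself as data, so that new categories can be registered while the system runs. The sketch below is a minimal illustration under that assumption; the class name and the example categories are invented.

```python
# Classification scheme held as data: new categories can be added while running.
class KnowledgeBase:
    def __init__(self):
        self.categories = {}                    # name -> predicate over a data record

    def add_category(self, name, predicate):
        """Register a new classification without recompiling any program code."""
        self.categories[name] = predicate

    def classify(self, record):
        """Return every category the record belongs to."""
        return {name for name, pred in self.categories.items() if pred(record)}

kb = KnowledgeBase()
kb.add_category("transaction", lambda r: "amount" in r)
kb.add_category("large_transaction", lambda r: r.get("amount", 0) > 10_000)
# Later, at execution time, the scheme is revised simply by adding more data:
kb.add_category("international", lambda r: r.get("country") not in (None, "US"))
```

Contrast this with encoding each category as a hard-coded program class: there, adding "international" would require a code change and redeployment.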
Computer Representation of Knowledge
Cognitive processing is knowledge-based processing. Knowledge is used to convert data to
information, and then to facts as expressed in terms of our knowledge of the world. What does it
mean for computers to do knowledge-based processing and what is the role of this kind of
processing in engineering? From the engineering viewpoint it means that knowledge is used to
convert data to knowledge, test hypotheses, and possibly generate new hypotheses. We are
already familiar with limited forms of computer based knowledge processing. Cognitive
processing by computer requires that the knowledge is represented within the computer. For
example, employee information management systems use knowledge encoded in programs to
produce knowledge in the form of query answers. While computer representation of knowledge
is an immense field, examples can be used to illustrate how this works in the present context.
Such a computer system starts with the base of its background knowledge. As the OODA Loop
executes, this knowledge base may expand with other general knowledge as needed. As
hypotheses are formed, additional assumptions are added, but they may have to be revised or
retracted.
Ontology and Interpretation
The concepts and relationships which are used by a cognitive process to classify data are its
models of the world. The concepts and relations used to build a model are referred to here as an
ontology. The concept of an ontology traces back to Aristotle. For Aristotle, the main task of
philosophy was to experience the empirical world and acquire knowledge about it (Metaphysics,
Chapter 9). He created the first system of ontology in the form of an ontology of substances. For
Aristotle the general properties of things, that is, those properties of things which constitute their
invariant form, have to be found through a cognitive process. The knowledge consists of general
building blocks in the form of classes and roles, to use Artificial Intelligence, AI, terminology.
These building blocks are used to construct models of the problem domain. The building blocks
are the language; a model is a sentence in the language. The purpose of the model is to provide
structure for organizing data collected when the system operates. This data is used to monitor
and assess operation of the system, to provide diagnostics, and to predict future behavior.
The full semantic framework used by a cognitive process includes not only the ontology
(language) but the sensor processes which recognize individuals, classify them, and recognize
relationships between individuals. These topics are often not included in discussions of ontology,
but they are crucial when a cognitive process uses a model as the basis for action. Actions have
consequences. When action is taken, for example to eliminate a threat, one needs to be sure that
the action is justified. The justification includes how the conclusion that something is a threat is
made on the basis of the recognition procedures used in the cognitive process. Both humans and
artificial cognitive processes may need to justify their actions after the fact.
To use a language of individuals, categories, and relations, the first issue is what constitutes
an individual thing. Most ontology languages start with a language of individuals, but in
engineering or applied science that is not always a simple problem. Establishing a common basis
and consistency in abstraction in both categories and relationships across a model can be
challenging. For example, a radar sends out energy. The radar sensor collects the data reflected
back. When anomalies are detected, the radar converts this data into blips and takes
“snapshots” of the blips; these are called track records. When the radar gets enough blip
snapshots whose space-time proximity is consistent with something moving through space, the
radar decides to identify it as a thing. It then tries to figure out whether it is one thing or several
things and to identify what kind of thing it is. However, for simplicity, this discussion presumes
that we can recognize individuals and tell whether two individuals are the same.
The building blocks (ontology) for the hazardous cargo scenario context include individuals,
classes, and relationships used to represent both the physical and organizational context. For this
example, the base ontology classes and relations are relatively invariant (static). The membership
relationship of an individual to a class may change as a function of both knowledge change and
environment change. Also new classes and relationships may be added to the base ontology.
While we build on the basic ontological concepts as developed by Aristotle and elaborated by
the Artificial Intelligence and Knowledge Representation communities in computer science,
these concepts are extended by modeling concepts developed by the engineering community. These
modeling concepts are routinely used to build models of problem contexts, such as aircraft
operating in a physical environment (Graves and Bijan 2010).
Modeling the Problem Context
With the problem scenario description and the brief guide to computer based knowledge
representation, the next task is to build a computer implementable model of the Hazardous
Material problem context. The computer implementation of the model is to be executable, as it
will host the computer portion of the OODA loop which manages data collection and processing.
The model also has to dynamically evolve as the OODA loop interacts with the problem
environment. This presents a challenge for current computer modeling technology, but not an
unsurmountable one.
Currently no single modeling language has all of the features needed for the dynamic
modeling scenario such as the Hazardous Material problem. However, as engineers we are
accustomed to working with less than perfect tools. Two modeling languages which represent the
current best of breed are OWL 2 (2012) and SysML (2013). OWL has several features which are
needed, and which SysML does not possess. The OWL language constructions, such as the class
operations ⊓, ⊔, ¬ (intersection, union, and negation) and the universal class, Thing, are
very useful for engineering modeling. Any class is a subclass of Thing. OWL uses individuals
to represent the things whose attributes are the data collected in the problem solving process. The use
of the class, Thing, is a convenience used to tag an individual that we are unable to classify
further. Possibly the most distinguishing feature of OWL is that classes and relations can be
added at run time when the system is operating. This represents the ability to dynamically
modify the problem model. These features will be used in the Hazardous Material model.
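Treating a class as the set of its member individuals, the OWL class operations can be imitated with ordinary sets. This is only a finite toy analogue of the operations named above, not OWL's actual semantics, and the individuals are invented for illustration.

```python
# Toy finite analogue of the OWL class operations over a universal class Thing.
Thing = {"truck1", "truck2", "depot1", "cargo1"}

Vehicle = {"truck1", "truck2"}
Hazardous = {"truck2", "cargo1"}

HazardousVehicle = Vehicle & Hazardous   # intersection (⊓)
VehicleOrHazard = Vehicle | Hazardous    # union        (⊔)
NonVehicle = Thing - Vehicle             # negation     (¬), taken relative to Thing

# Every class is a subclass of Thing:
assert Vehicle <= Thing and Hazardous <= Thing
```

An individual that cannot be classified further is simply a member of Thing, matching the use of Thing as a tag for unclassified individuals described above.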
SysML may not be recognized as an ontological language, but it has very rich
ontological concepts. These features include classes and relations, but not individuals. Most
importantly, SysML has well-developed constructions for component relations. The component
constructions enable state vectors, procedure calls, and state machines to be associated with
specific components. This ontological richness results from the experience modeling complex
systems. These features are needed for modeling the Hazardous Material problem and will be
introduced as they are used in constructing the model throughout this paper.
The Problem Context diagram in Figure 2 above serves as the starting point for a hazardous
material model that uses many of the SysML features. The advantages of using SysML greatly
overshadow its deficiencies. The features missing from SysML can be worked around in an
application. SysML does not fully support instances, only classes. For much product
development modeling this can be worked around. However, modeling dynamic systems that
collect large volumes of data generally involves large numbers of individuals and relationships
between these individuals. For example, if the data consists of telephone calls between
individuals, then the number of individuals and the relationships which connect them can be
enormous.
Figure 3 is a highly abstracted model of the invariant (static) structure of the Problem Context
shown in Figure 2. The graphical conventions follow SysML. The classes and relations represent
the assumed invariant (static) structure of the Problem Context. In general these connections can
represent actual physical connections such as pipes of various kinds, transportation channels, as
well as organizational connections. These connections are appropriately shown as a static
description of some enterprise or system, essentially representing what is assumed about the
problem environment. A model may contain expectations as to how long it takes to get a contract
authorized and placed into effect. The abstracted diagrams in Figure 3 suppress the detail that
makes them executable. Lower level detail supplies the rules which govern the behavior of the
system when it operates in the real world, or when it is operated in a simulation mode.
Figure 3. Problem Context Model
The left side diagram in Figure 3 is a Block Definition Diagram (BDD); the right side is an
Internal Block Diagram (IBD). The BDD shows the classes, blocks in SysML terminology, and
part relations between these classes. ProblemContext is a top level block for the problem context.
Figure 2 contains an illustration of the physical environment, the organizational environment,
and the OODA loop, which is the cognitive process for the problem context. Each of these
classes is represented as a SysML block in the diagram on the left. The lines in the BDD
represent part relations. The IBD on the right of Figure 3 shows a portion of the internal
connection structure between the OODA loop process and the resources that it interacts with for
the Hazardous Material problem context. The rectangles in the two diagrams are blocks (classes).
The arrows in the diagrams are associations (relations). In contrast to some ontology languages,
the associations have domain and range classes.
The PhysicalEnvironment component of the Problem Context Model has two subcomponents,
PhysicalLaws, and Views. The views represent the model of the physical world as it evolves
during the course of the problem solving operation. For the model of the physical environment,
additional ontologies are needed, such as those for geography, the classification of people, events,
and time. Evolution of the physical environment is captured by sensors. In general sensors
provide overlapping views of the physical environment. The PhysicalLaws component keeps any
overlap of the views synchronized. The SysML constraint block provides an effective way to
represent the mechanisms that keep the information synchronized. Equations in the constraint
block represent the synchronization information.
The associations in the BDD are of a special kind, called part associations. For the Hazardous
Material model they provide a tree structured part decomposition for the ProblemContext, as
shown in the BDD of Figure 3. In a good model development environment, when a user
introduces a part association the software enforces properties which imply that the part
associations yield a tree decomposition. In the IBD, the rectangles have labels such as
p1.o1:Actor. The rectangle, as a class, describes the sensor intelligence resources which are used
in the application.
p1:OODAL is the cognitive process employed at the top level for this problem. As a process it
interacts with resources. The resources interact with the physical environment parts. The OODA
Loop instance in an implementation will have a state machine which executes to perform the
process behavior. The state machine can send and receive information via the communication
channels shown in the IBD diagram. The Hazardous Material model will likely be populated
with individual organizational agents which are known to the problem solving agency. Large
collections of individuals are created when the system is set in operation. An individual is
something that the sensory part of the OODA loop recognizes as a thing. This means that it is
represented as a member of Thing, or a subclass of Thing if more is known about it.
Execution of the Hazardous Material model creates an instance of the template described by
the model. The execution instance is a graph. The nodes and edges of this graph are described in
Table 1. This graph is the static structure of the model. The graph evolves as new/updated data is
collected and processed according to the OODA loop which controls the execution. The behavior
is modeled by the change of state associated with the components of an instance of the model. In
execution of the model, state change is accomplished by three mechanisms. First, some states
represent the values of sensor attributes: physical change is modeled by change in the
physical-world sensor state variables. In this respect the system is not closed, as these changes
may appear random from the viewpoint of the OODA loop. Second, as the system operates one
can instrument it to check that it is operating as expected. Third, behavior is represented by an
explicit process. The
OODA loop component is a process whose behavior is described by a state chart. The state chart
is an explicit component of OODAL.
Model Execution
The Hazardous Material model is executed by creating an instance of the model and
connecting sensors to the external environment or a simulation of this environment. An instance,
at least for the static part described in Figure 3, is a graph, as described in Table 1. During
execution data collection and actions taken by the OODA loop expand the graph and update the
context model. Data collection is performed by the sensor components of the Sensor-Intelligence
component of the Hazardous Material model, as directed by the OODA loop. The base ontology
and the model constructed from the ontology provide the framework for data collection and
interpretation, in the sense that the results of data collection are new individual nodes and arrows
added to the initial invariant graph in Table 1. For example, telephone call data between people
reporting an accident, or clusters of people discussing observations about the accident scene,
becomes, after it is processed in light of an ontology that has a call relation, an arrow between the
individual caller and the individual called. In that case, execution of the Hazardous Material
model would offload to the computer the conversion of signal data into facts expressed in the
ontology of telephony.
The execution of the model creates an instance of the model. The model is actually a template
for instances. Multiple instances might be created in the course of designing and debugging the
model. For simplicity, let’s assume the model only contains the 9 nodes in the Figure 3 BDD and
the 3 connections in the IBD. By giving the name hm1 to the model instance and using
“dot” notation, the naming conventions are as shown in Table 1.
Table 1. The model for Notional Hazardous Material Problem
Nodes                          Connections
hm1:ProblemContext             c1: p1 → p2.r1
hm1.p1:PhysicalEnvironment     c2: p1 → p2.r2
hm1.p2:OrgEnv                  c3: p1 → p2.r3
hm1.p3:OODAL
hm1.p1.q1:PhysicalLaws
hm1.p1.q2:Views
hm1.p2.r1:Authority
hm1.p2.r2:SensorIntel
hm1.p2.r3:Actor
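The names in Table 1 can be realized directly as a small in-memory graph. The dictionary encoding below is one plausible implementation choice, not the only one; the third connection is written c3 on the assumption that the repeated c2 in the printed table is a typo, and the connection endpoints follow the table's p1 labeling literally.

```python
# Static instance graph for the notional Hazardous Material model (Table 1).
nodes = {
    "hm1": "ProblemContext",
    "hm1.p1": "PhysicalEnvironment",
    "hm1.p2": "OrgEnv",
    "hm1.p3": "OODAL",
    "hm1.p1.q1": "PhysicalLaws",
    "hm1.p1.q2": "Views",
    "hm1.p2.r1": "Authority",
    "hm1.p2.r2": "SensorIntel",
    "hm1.p2.r3": "Actor",
}

# Connections from the IBD, written with fully qualified "dot" names.
connections = [
    ("c1", "hm1.p1", "hm1.p2.r1"),
    ("c2", "hm1.p1", "hm1.p2.r2"),
    ("c3", "hm1.p1", "hm1.p2.r3"),
]

# Basic well-formedness check: every connection endpoint is a declared node.
assert all(src in nodes and dst in nodes for _, src, dst in connections)
```

At execution time, data collection would extend this structure with new individual nodes and arrows, as described above.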
When the problem solving system is operational, additional instances are generated when one
individual contacts or interacts with another. These connections include telephone calls and email.
Suppose the sensors record bank or credit card transactions. Each transaction consists of a
sender, receiver, and message. An ontology with people as individuals and relationships such as
making a telephone call from one person to another may be used in telephone signal processing
to convert a digital string into the statement that “John called Tom”, provided the telephony data
representing a call enables the recognition that the call was made by John and received by Tom.
Potentially any node can contact any other node. These dynamic connections occur in real
operation or in simulation. First, let’s simplify the problem to collecting metadata, not the
content of the message. One is collecting events such as “a receives x from b at time t” or “a
sends y to c at time t”. These events can be stored as records or strings associated directly with
“a”. If so, it is relatively easy to determine who “a” talks to and compute chains of calls through
the event space. There are other storage approaches as well. From a transaction the sensor
processing might be able to identify a person or organization which initiated the transaction, as
well as the recipient. Then in addition to the invariant graph, the runtime system will populate a
graph database which contains this information.
By constructing this runtime extension of the initial invariant graph, anomalous behavior can
be detected, and the result used to add classes to the model and refine the meaning of the instances
and relationships that have been constructed. Anomalous behavior is often represented by path
relationships constructed from the collection of relation instances. For example, if a calls b and b
calls c, then there is a path of calls from a to c. One may track calls in terms of the time elapsed
between a call from a to b and the start time of the call from b to c, using this to form more
precise hypotheses regarding the data.
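The path construction just described — a chain a → b → c assembled from individual call records and filtered by elapsed time — can be sketched over (caller, callee, start_time) tuples. The data, field layout, and time window below are invented for illustration.

```python
# Find two-step call chains a -> b -> c where b's outgoing call starts soon
# after the call from a to b.
calls = [
    ("a", "b", 10.0),
    ("b", "c", 12.5),
    ("b", "d", 40.0),   # too long after a's call to count as part of a chain
]

def call_chains(calls, max_gap=5.0):
    """Return (a, b, c, elapsed) for each chain within the time window."""
    chains = []
    for a, b, t1 in calls:
        for b2, c, t2 in calls:
            if b2 == b and 0 <= t2 - t1 <= max_gap:
                chains.append((a, b, c, t2 - t1))
    return chains
```

For the large relation instances discussed earlier, this naive nested scan would be replaced by queries against a graph database, but the path relationship being computed is the same.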
Problem and Knowledge Evolution
Wicked problems characteristically have change at their core, usually non-linear change. The
problem itself may evolve while one is trying to solve it. For example, as a hazardous material
spreads, the problem of containment changes. The material may start as the spread of a liquid,
but through reactions with water and/or air a toxic gas cloud could emerge. The problem solution
will certainly include an action plan. The plan actions may be to halt contamination of soil,
water, and atmosphere by the diffusion and phase change of the hazardous material. The plan
will require establishing the rates and area of diffusion of the contamination, the dangers posed if it
is not halted, as well as impact analysis of actions taken to halt the diffusion. Traditional
problem-solving strategies, such as dividing the problem into subproblems, solving them, and
combining the solutions, or narrowing the solution space by ruling out regions, may
be applicable. However, as the problem context is dynamic, they may be insufficient. The base
ontology may be refined by defining new classes and relations. For example, if we have a
general telephone call relation, Call, one may define a restricted relation whose domain and
range have a particular property.
The base ontology model may be refined by defining new classes and relations, as well as by
introducing new individuals. The traditional divide and conquer strategy may be implemented
by partitioning a class A into two disjoint subclasses, A1 and A2. This means that A = A1 ⊔ A2 and
A1 ⊓ A2 =∅. Often one has a relation such as telephone call, Call, where initially we do not
know much about its domain and range. If we get further information we may refine the call
relation as Dom(Call) = A, and Range(Call) = B, for some classes A and B.
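Over finite extensions these refinements can be checked directly. The following toy sketch, with invented individuals, checks both the disjoint partition and the domain/range restriction of Call.

```python
# Toy check of a disjoint partition A = A1 ⊔ A2 and a domain/range refinement.
A = {"p1", "p2", "p3", "p4"}
A1 = {"p1", "p2"}        # e.g. domestic callers (illustrative labels)
A2 = {"p3", "p4"}        # e.g. foreign callers

# Partition conditions: A = A1 ⊔ A2 and A1 ⊓ A2 = ∅.
assert A == A1 | A2
assert not (A1 & A2)

# A call relation as a set of (caller, callee) pairs.
Call = {("p1", "p3"), ("p2", "p4")}
dom = {a for a, _ in Call}
rng = {b for _, b in Call}

# Refinement once more is known: Dom(Call) ⊆ A1 and Range(Call) ⊆ A2.
assert dom <= A1 and rng <= A2
```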
Problem context knowledge evolves by virtue of external processes acting on the problem
environment. This knowledge evolution may be reflected by data collection established at the
onset of the problem solving project. Cognitive processing by either man or machine may
suggest that additional kinds of observations are needed. Executing an OODA loop will result in
the development and testing of new hypotheses. Currently the generation of hypotheses is at the edge
of computer technology, so we assume that hypothesis generation is done manually. However,
we can describe how a cognitive process can add hypotheses to its knowledge base, as
represented by its problem context model, and how confirmation or falsification by data
collection or reasoning takes place.
A simple hypothesis may assert a statement about an individual: for example, that the missing aircraft is located in a specific area, or that an individual has a specific disease. We have
mentioned that the Problem Context model contains physical laws in the form of equations
relating state variables of the environment model. The problem context model may contain other
rules which represent hypotheses regarding, for example, causal connections between symptoms of an individual piece of equipment and its malfunction. This kind of hypothesis can be
expressed in various forms. For example, if a and b are nodes in the runtime graph, and p and q
are properties which can be evaluated for these nodes, then
p(a) ⇒ q(b)
or there may be a probability statement such as
p(a) ⇒ Probability(q(b)) < k
When a hypothesis is added to a model, the addition may make the model inconsistent. This can happen either through data that contradicts the hypothesis or through a hypothesis that is logically incompatible with the model to which it was added. In either case, the hypothesis has to be retracted from the model. The adding and retracting of hypotheses is a dynamic process. This
kind of process is more familiar to the Artificial Intelligence and automated reasoning
community than the model-based engineering community.
Pitfalls
We have seen that knowledge about a problem domain is usually incomplete, and that this
incompleteness is supplemented by adding and retracting hypotheses. Knowledge-based
information systems can lead us off track, unless sufficient care is taken when information is
combined from multiple sources. Human oversight can provide for valuable checks on the
conclusions of these machine-based cognitive analyses, because an ontology only mirrors our knowledge of the world at any given time. These mirrors may reflect incommensurable conceptual views, depending on the language and understanding of those using the ontology. Languages vary considerably not only in the basic distinctions they recognize, but also in the assemblage of the ontological categories into a coherent system of reference. Thus the ontology a language provides to its users is not a common, universal system, but one peculiar to the individual language, one which makes possible a particular `fashion of speaking'. The
ontology depends upon the level of knowledge existing at any given time.
The following is an example of improper combining of information leading to spurious
conclusions which happened well before the advent of big data, but will serve to illustrate
problems encountered with integrating information from multiple sources. During the oil crisis
of 1979 two government agencies reported results on oil imports into the US on a monthly basis.
The two agencies reported results for the same statement, “oil imports to the US in May 1978”, with the figures differing by a significant amount. Was it because the terms were ill defined, were the data collection methods bad, or what? The problem was to understand why the answers were so
different and how to build computer based information systems that would not suffer from this
problem. The investigation showed that each agency actually had good data collection methods,
and in each agency the criteria for interpreting the results were carefully documented. The
problem was that when the information went into computer information systems, all information
about the meaning of the terms was lost – the semantics, or metadata, was missing from the
information systems used. All that remained was column headings such as US, May, Oil, etc.
So the agency information systems executed on the data, not the information.
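The remedy the example suggests, carrying the semantics along with the figures, can be sketched as follows. This is a hypothetical illustration: the criteria strings and the numeric values are invented, not the agencies' actual data.

```python
# Sketch: carrying semantics (metadata) alongside data so that figures
# collected under different criteria are never silently compared.
# Criteria labels and values below are invented for illustration.

from dataclasses import dataclass

@dataclass(frozen=True)
class Measurement:
    quantity: str      # what was measured
    period: str        # reporting period
    criteria: str      # documented interpretation criteria
    value: float

a = Measurement("US oil imports", "1978-05", "customs entry date", 250.0)
b = Measurement("US oil imports", "1978-05", "tanker arrival date", 310.0)

def comparable(m1, m2):
    # Same quantity and period are not enough: the documented criteria
    # under which the figures were produced must match too.
    return ((m1.quantity, m1.period, m1.criteria)
            == (m2.quantity, m2.period, m2.criteria))

print(comparable(a, b))   # False -- the semantics differ
```

With only the column headings (US, May, Oil), the two figures above would look identical; keeping the criteria field in the record is what lets the system flag them as non-comparable.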
Another way that error enters the knowledge base, as represented by the problem context model, is that the validity of a computational result depends upon the lineage and uncertainty of the input and on whether the input data (with associated semantics) is appropriate as input not only to the processing algorithms, but also to the problem to be solved.
Problem solving information in big data contexts is model-based. The model represents the
ground truth of knowledge about the problem solving enterprise. These models include the
physical environment, geographic data, organization data, and transportation, as well as
manufactured products such as buildings and equipment. Keeping the relationships between
these entities consistent is an essential component of model management necessary to achieve
meaningful simulations. If the data is not kept synchronized, the model may become inconsistent (Figure 4).
Each data processing step transforms input data to output data. Input data often comes from
multiple sources. For example, the processing to compute the surface temperature of a physical
structure must use the correct inputs for the specific environment of interest, e.g., physical model
of the structure and its materials, heat sources within the structure, properties of the atmosphere,
and the correct approximations of the physical laws of heat dissipation. Incorrect input to this
computation can propagate through further calculations to yield grossly incorrect and/or
misleading results.
In a model-based system each data collection step and each processing step adds to the model.
The validity of the result depends on the lineage of the input and whether it is appropriate as
input. In a big data context humans cannot keep track of the data and whether it has been appropriately combined; for the same reason, traditional configuration management, which depends on manually keeping track of the meaning of the data, is too error prone to work.
Cognitive processing is required to check for data consistency.
Figure 4, taken from (Scrudder et al 2004), shows an example of a data item lineage tree. In this example, we can see that a change in data item DI1 has an impact that can be easily determined, and the appropriate actions can be taken: process executions PE1 and PE3 potentially need to be rerun to reproduce data items DI5 and DI8.

[Figure 4: a data item lineage tree relating data items DI1–DI8 through process executions PE1–PE4]

This example also illustrates how a metamodel is key to addressing data coherency issues. In Figure 4 we see that data item DI3 is
used in processes PE1 and PE2. Thus, we must make sure that the correct data item is used in
both processes. If inconsistent versions of DI3 are used for the two processes, the result is data
items produced higher in the production tree that are not coherent. In this example, the validity of
item DI8 is at stake. A concrete example would be to consider where CAD information is used as
an input to two analysis chains that eventually produce radar cross section and aerodynamic
performance data. If different Computer Aided Design information were used to produce these
two data items, a metamodel-based query would raise a coherency warning flag. Absent further
examination (which might show the differences between the two CAD file versions are not
significant), these two data items should not be used together to represent the aircraft.
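The impact computation that a lineage tree supports can be sketched as a graph traversal. The dependency graph below is a hypothetical reconstruction consistent with the behavior described in the text (a change to DI1 forces PE1 and PE3 to rerun, invalidating DI5 and DI8, and DI3 feeds both PE1 and PE2); it is not taken from Figure 4 itself.

```python
# Sketch of impact analysis over a data item lineage tree: given a
# changed data item, find the process executions that must be rerun and
# the downstream data items they must reproduce. The edges are a
# hypothetical reconstruction, not the exact tree of Figure 4.

from collections import deque

# process execution -> (input data items, output data items)
processes = {
    "PE1": ({"DI1", "DI2", "DI3"}, {"DI5"}),
    "PE2": ({"DI3", "DI4"}, {"DI7"}),
    "PE3": ({"DI5", "DI6"}, {"DI8"}),
}

def impact(changed_item):
    """Propagate a change through the lineage tree."""
    stale, reruns = {changed_item}, set()
    frontier = deque([changed_item])
    while frontier:
        item = frontier.popleft()
        for pe, (inputs, outputs) in processes.items():
            if item in inputs and pe not in reruns:
                reruns.add(pe)
                for out in outputs - stale:
                    stale.add(out)
                    frontier.append(out)
    return reruns, stale - {changed_item}

reruns, stale = impact("DI1")
print(sorted(reruns), sorted(stale))   # ['PE1', 'PE3'] ['DI5', 'DI8']
```

A metamodel-based coherency check is the same traversal run in reverse: from a produced item such as DI8, walk back through the consuming processes and verify that every shared input (here DI3) is the same version along each path.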
Conclusion
The Hazardous Material model is the template of the title. It is a template in the sense that it
provides a common organizational structure for a wide class of wicked problems. This structure
can be applied to many problems which involve big data. The organizational entities of the model constitute the authorities, each with its own goals and operating procedures, that control and allocate resources. These resources are employed to observe and interact with the
environment. The OODA loop controls the problem solving task, using the resources allocated to
it to observe, build hypotheses, and take action. While all problems are different, many can use
specializations of the template outlined here.
Wicked problems characteristically require large amounts of data to be processed to produce
actionable results. They involve a physical environment with complex interdependencies.
Cognitive processing by computer will become increasingly necessary as the volume and
complexity of data become too large for even teams of people to comprehend. The Hazardous
Material model is designed to be executable and to represent sufficient domain knowledge that
the OODA loop can use cognitive processing to manage the consistency of the data, monitor the
state of the problem solving effort, and provide justification for the actions taken.
The discussion of computer representation of knowledge is only intended to introduce the
subject and take a bit of the mystery away from it. Ontology simply refers to the vocabulary used
to represent the knowledge, as well as language constructions such as the SysML part
associations. The models, such as the Hazardous Material model, represent knowledge expressed using an ontology. As computer systems transition to being knowledge based, the rules and laws needed to keep information synchronized and detect inconsistencies have to be represented in the computer knowledge base and used by the computer inference system. We have not really illustrated how processing of data collected at runtime can yield information, nor have we illustrated how the knowledge base, as represented by the models, can be reasoned about to determine the invalidity of a hypothesis. For example, if the knowledge base contains both a ∈ A and a ∈ B, and A ⊓ B = ∅, then by a simple logical inference one can conclude that the model is inconsistent.
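The inference just described can be sketched directly. This is a hypothetical illustration: the reasoner below checks only disjointness violations, a tiny fragment of what an OWL reasoner actually does.

```python
# Sketch of detecting an inconsistency from a disjointness axiom:
# if A ⊓ B = ∅ is asserted but some individual a belongs to both A
# and B, the knowledge base is inconsistent.

membership = {                    # class -> individuals asserted in it
    "A": {"a"},
    "B": {"a", "b"},
}
disjoint_axioms = [("A", "B")]    # A ⊓ B = ∅

def find_inconsistency(membership, disjoint_axioms):
    """Return the first violated disjointness axiom and the offending
    individuals, or None if no violation is found."""
    for c1, c2 in disjoint_axioms:
        overlap = membership.get(c1, set()) & membership.get(c2, set())
        if overlap:
            return (c1, c2, overlap)
    return None

print(find_inconsistency(membership, disjoint_axioms))
# ('A', 'B', {'a'}) -- individual a violates the disjointness axiom
```

Returning which axiom was violated and by which individuals, rather than a bare yes/no, is what lets the result serve as part of an explanation of the problem solving state.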
The discussion of the ontology language did not make a point of it, but by using an ontology
language which is embedded within the language of a logic, the models we construct serve as
axiom sets within the logic. These axiom sets can be used by automated reasoning engines to
detect the inconsistency of a hypothesis. The results of the reasoning together with how the
results were obtained become part of the explanation of the state of the problem solving process.
This kind of cognitive processing by computer systems will assume an increasingly important role in engineering, as it has in other areas where big data is prevalent.
References
Boyd, John, R., The Essence of Winning and Losing, 28 June 1995 a five slide set by Boyd.
Kulkarni, D., Simon, H. A. 1988. “The process of scientific discovery: The strategy of
experimentation.” Cognitive Science 12: 139-175.
Garrett, R.K., Jr., 2014. A Graph-Based Metamodel for Enabling System of Systems
Engineering. Submitted for consideration to Systems Engineering.
Graves, Henson, and Yvonne Bijan. "Using formal methods with SysML in aerospace design
and engineering." Annals of Mathematics and Artificial Intelligence 63.1 (2011): 53-102.
Graves, Henson, and Ian Horrocks. "Application of OWL 1.1 to Systems Engineering." OWL
Experiences and Directions April Workshop, 2008.
Mainzer, K. 2014. “The effectiveness of Complex Systems. A Mathematical, Computational, and
Philosophical Approach.” Proceedings of Philosophy, Mathematics, Linguistics: Aspects of
Interaction 2014. St Petersburg, Russia.
OWL 2 Web Ontology Language Document Overview (Second Edition), 2012. Retrieved from
http://www.w3.org/TR/owl2-overview/
Smith, A., 2011. Americans and text messaging, Retrieved from
http://pewinternet.org/Reports/2011/Cell-Phone-Texting-2011.aspx.
Scrudder, Roy, et al. "Improving Information Quality and Consistency for Modeling and
Simulation Activities." The Interservice/Industry Training, Simulation & Education
Conference (I/ITSEC). Vol. 2004. No. 1. National Training Systems Association, 2004.
Sheard, S. A. 2006. “Definition of the Sciences of Complex Systems.” INSIGHT 9 (1): 25.
SysML (2013). OMG Systems Modeling Language. Retrieved from
http://www.omgsysml.org/
Biography
Dr. Henson Graves is a Lockheed Martin Senior Technical Fellow Emeritus and a San Jose State
University Emeritus Professor in Mathematics and Computer Science. He has a PhD in
mathematics from McMaster University. Dr. Graves is the principal of Algos Associates, a
technology consulting firm.
Mr. Robert K. Garrett, Jr. earned a B.S. in Materials Engineering from Purdue University in
1981 and an M.S. in Materials Engineering from Purdue University in 1983. He worked in
research and development with the Naval Surface Warfare Center for 27 years. His expertise is
in systems of systems engineering, integration of diverse technologies into weapon systems, and
the application of materials science and continuum mechanics to weapon systems development.
In 2010 Mr. Garrett joined the Missile Defense Agency as the agency’s first Chief Engineer for
Modeling and Simulation.