October 31, 2007: “Managing and Benefiting from Multi-Million Rule Systems”. Presented at the 2007 Conference of the New England Complex Systems Institute.
Managing and benefiting from multi million rule systems
1. Cover Page
Managing and Benefitting from Multi-Million Rule Systems
Author: Jeffrey G. Long (jefflong@aol.com)
Date: October 31, 2007
Forum: Poster session presented at the 2007 Conference of the New England
Complex Systems Institute.
Contents
Page 1: Abstract
Pages 2‐26: Slides (but no text) for presentation
License
This work is licensed under the Creative Commons Attribution‐NonCommercial
3.0 Unported License. To view a copy of this license, visit
http://creativecommons.org/licenses/by‐nc/3.0/ or send a letter to Creative
Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.
Uploaded June 24, 2011
2. Managing and Benefitting From Multi-Million Rule Systems
Abstract
Jeffrey G. Long
October 31, 2007
This talk will discuss the idea that better representation and understanding of complex
systems will require new abstractions and new uses of existing abstractions. One
approach I have been exploring is taking system rules out of software and representing
them as data. I will discuss several abstractions I have found useful in representing
various kinds of complex business, linguistic, and biological systems as data. These
include (1) the notion of tens of thousands of complex, contingent "Competency Rules"
that define or describe the behavior of a system; (2) the implementation of those rules
partly in software (like an inference engine) and primarily in data (like an expert system);
(3) the notion of contingent rules having multiple "factors" or primary drivers and zero or
more "considerations" that the system must review before deciding what to do next; and
(4) the notion of the form of a rule, as contrasted with its content (like algebra).
Reducing complexity cannot mean ignoring details, but must include seeing the larger
picture presented by ruleforms. Several specific examples will be given from current and
past projects.
3. Managing & Benefiting
from Multi-Million
Rule Systems
International Conference on Complex Systems
ICCS2007 – Boston, MA
Jeffrey G. Long
October 31, 2007
jefflong@aol.com
4. Studying a Variety of Notational Systems
speech & writing
cartography
arithmetic & algebra
geometry
chemical notation
dance/movement notation
music notation
logic notation
money

What makes them powerful?
What is their nature & structure?
Can their design be facilitated?
How and why did they evolve?
Who created them?
What accelerated or impeded their general usage and acceptance?
What effects did they have on society? on cognition?
How do we know if we’re at the limits of usefulness of a notational system?
5. Key Points
Modern society is critically dependent upon a number of different kinds of rule systems. Yet we
have (increasingly) enormous problems creating and managing large rule systems.
These problems arise from how we currently represent rules and data. We cannot solve them by means of
faster computers or other extensions of current representations. Reducing complexity
cannot mean ignoring details, but must include seeing an even larger picture.
We can look to the past for guidance. Many times in the past, society has overcome “complexity
barriers” by means of new notational systems. These events are what I call “notational
revolutions”, and they affect how we see the world, how we think about the world, and
what we can communicate with others, and how readily.
My experience is that representing rules and data as an integrated whole, and using a place-
value representation, does make large rule systems much more comprehensible,
therefore more manageable, and therefore more able to safely grow and change as
needed (i.e. evolve). My name for this approach is “Ultra-Structure”.
I hope other proposed Rule Calculi will consider the issues and approaches I’m suggesting here.
6. Rule Systems are Ubiquitous
Subject:      Business Rules | Scientific Rules | Legal Rules | etc.
# Rules:
Small         < 1,000
Medium        < 100,000
Large         < 10,000,000
Very Large    > 10,000,000
7. Many Types of Rules
Ontological Rules (what exists, how entities relate)
Operating Rules (how a system nominally works)
Strategy Rules (how to optimize a process; win; be artful)
Ethical Rules (additional guidelines for a clear conscience)
Evaluation Rules (how to tell if making progress/“winning”, or
detecting that rules are not working well)
Learning Rules (rules for changing rules)
Historical Rules (past events; custom)
Rules are multi-notational: largely qualitative but may include
quantities or other kinds of abstractions (e.g. musical notes)
Rules are probabilistic but can be treated as deterministic
8. Characteristics of Notational Revolutions
Some involve looking at the world from a different viewpoint, e.g. a bird's-eye rather than a
ground-truth viewpoint, or indirect rather than direct reference to the world.
Some involve moving from a relative-value representation to a place-value representation.
Some involve the introduction of new abstractions such as zero, musical notes or map
coordinates.
Physics has benefited from and might be said to have even co-evolved with improved
notational systems such as calculus, Feynman diagrams, Riemannian geometry, and tensors.
They all greatly expand the sphere of what can be readily said; the notation is the limitation.
They are examples of “notational engineering” occurring without the benefit of systematic
guidelines from the experience of others, or of a general theory of notation derived from a
longitudinal and comparative study of humanity’s notational systems.
9. 1. Separation of Algorithms from Data
Traditional separation contributes to and is caused by an object-centered view of the world.
In a process-centered worldview, everything is a process and every process is only
describable in terms of rules.
11. Conventional Data are Rule Fragments
Bin   Part   QOH   QOO   etc.
A     X      5     4
B     B      15    7
Rule Fragments (the rows above)
Satisfies TNF requirements, but is still not flexible enough.
12. Data-Inclusive Rules Include Conventional
Data as Part of Larger Rules
Universals Provide Context (the column headings):
Loc’n   Part   Qty Type   Qty   etc.
A       X      QOH        5
A       X      QOO        4
B       B      QOH        15
B       B      QOO        7
Simple Sourcing Rules (the rows above)
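The move from the conventional table on the previous slide to the data-inclusive table here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `SourcingRule`, the function name `to_data_inclusive_rules`, and the dictionary layout are hypothetical and do not come from the slides; only the sample values do.

```python
from typing import NamedTuple


class SourcingRule(NamedTuple):
    """One data-inclusive rule: the quantity type is itself a column value."""
    location: str
    part: str
    qty_type: str
    qty: int


def to_data_inclusive_rules(rows, qty_columns=("QOH", "QOO")):
    """Turn each quantity column of a conventional row into its own rule."""
    rules = []
    for row in rows:
        for qty_type in qty_columns:
            rules.append(
                SourcingRule(row["Bin"], row["Part"], qty_type, row[qty_type])
            )
    return rules


# The two conventional rows ("rule fragments") from the slide.
conventional = [
    {"Bin": "A", "Part": "X", "QOH": 5, "QOO": 4},
    {"Bin": "B", "Part": "B", "QOH": 15, "QOO": 7},
]

rules = to_data_inclusive_rules(conventional)
for rule in rules:
    print(rule)
```

In this sketch a new kind of quantity becomes a new row value rather than a new column, so the table structure would not need to change.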
13. 2. Examples of Relative to Place Value
Roman to Hindu-Arabic Numerals
500 BCE, 200 CE, 875 CE, 1200 CE, 1600 CE
Neumatic to Staff Notation
500 CE, 800 CE, 1025 CE, 1300 CE, 1600 CE
Peripli to Coordinate-System maps
500 BCE, 100 CE, 1600 CE
14. Place-Value by Quantity
Roman Numerals    Hindu-Arabic Numerals
                  10^3   10^2   10^1   10^0
IV                                     4
CXII                     1      1      2
MCMIX             1      9      0      9
Without a placeholder, you can’t reliably have columns
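The table above can be reproduced with a short Python sketch that reads a Roman numeral and lays its value out in place-value columns. The function names `roman_to_int` and `place_value_columns` are illustrative choices, not part of the talk, and the sketch assumes well-formed numerals.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Evaluate a well-formed Roman numeral (a relative-value notation)."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        # A smaller symbol written before a larger one is subtracted (IV, CM).
        if i + 1 < len(numeral) and value < ROMAN_VALUES[numeral[i + 1]]:
            total -= value
        else:
            total += value
    return total


def place_value_columns(number, width=4):
    """Return the digits for 10^(width-1) down to 10^0; zero is the placeholder."""
    return [(number // 10 ** power) % 10 for power in range(width - 1, -1, -1)]


for numeral in ("IV", "CXII", "MCMIX"):
    print(numeral, place_value_columns(roman_to_int(numeral)))
# IV [0, 0, 0, 4]
# CXII [0, 1, 1, 2]
# MCMIX [1, 9, 0, 9]
```

The zero in the tens column of MCMIX is what keeps the 9 in the hundreds column; the sketch also prints leading zeros where the slide leaves the columns blank.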
15. Place Value by Pitch
Neume direction indicated voice interval
[Figure: musical staff with each vertical position labeled by pitch letter, top to bottom: F, E, D, C, B, A, G, F, E]
17. RuleML Adds More Complexity
“A car is available for rental if it is physically present, is not assigned to
any rental order, is not scheduled for service, and does not require service.”

<imp>
  <_head>
    <atom>
      <_opr><rel>isAvailable</rel></_opr>
      <var>Car</var>
    </atom>
  </_head>
  <_body>
    <and>
      <atom>
        <_opr><rel>isPresent</rel></_opr>
        <var>Car</var>
      </atom>
      <not>
        <atom>
          <_opr><rel>isAssignedToRentalOrder</rel></_opr>
          <var>Car</var>
        </atom>
      </not>
      <not>
        <atom>
          <_opr><rel>isScheduledForService</rel></_opr>
          <var>Car</var>
        </atom>
      </not>
      <not>
        <atom>
          <_opr><rel>requiresService</rel></_opr>
          <var>Car</var>
        </atom>
      </not>
    </and>
  </_body>
</imp>

H. Boley, S. Tabet, G. Wagner, “Design Rationale for RuleML: A Markup Language for Semantic Web Rules”
18. Place-Value of Rules
Conventional rules are semantically informal and multiplex (many parts)
Exceptions to rules are themselves rules
Any conventional rule can be converted into >1 “simple” rules
Each “simple” rule has the form:
“If a and b and c…Then Consider x and y and z”, where
>= 1 Ifs
>= 0 Then Considers
Rules converted into simple form are grouped based on their format (# Ifs, #
Then-Considers) and meaning (= function) e.g. agencies versus locations
versus products
Result is a small (< 100) set of tables each having different structure and/or
function (syntax and semantics)
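A minimal Python sketch of the grouping described above, assuming a rule's form is identified by its function together with its number of Ifs and Then-Considers. The names `SimpleRule`, `ruleform`, and `group_by_ruleform`, and the sample rules, are hypothetical and are not taken from any CoRE implementation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleRule:
    """'If a and b and c ... Then Consider x and y and z'."""
    function: str               # meaning of the rule, e.g. "locations"
    ifs: tuple                  # >= 1 Ifs
    then_considers: tuple = ()  # >= 0 Then Considers

    def __post_init__(self):
        if len(self.ifs) < 1:
            raise ValueError("a simple rule needs at least one If")

    @property
    def ruleform(self):
        # Format (# Ifs, # Then-Considers) plus meaning identifies the table.
        return (self.function, len(self.ifs), len(self.then_considers))


def group_by_ruleform(rules):
    """Collect rules that share a form into the same table."""
    tables = defaultdict(list)
    for rule in rules:
        tables[rule.ruleform].append(rule)
    return dict(tables)


# Hypothetical sample rules, for illustration only.
sample_rules = [
    SimpleRule("locations", ("A", "X"), ("QOH", 5)),
    SimpleRule("locations", ("B", "B"), ("QOO", 7)),
    SimpleRule("products", ("X",)),
]
tables = group_by_ruleform(sample_rules)
for form, members in tables.items():
    print(form, len(members))
```

However many rules are added, the number of tables in this sketch grows only when a genuinely new form appears, which is the point of the small (< 100) set of tables.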
19. Each simple rule is represented as one record in one table (out of
n tables)
Each column of each table has a general meaning that is used to
assign context to that part of each rule in that table
Initial rule selection for inspection (the If component) constitutes
the primary key column(s)
Subsequent rule evaluation and possible execution (the Then
Consider component) constitutes most other columns
There are usually several columns of rule metadata at the end
Software implements a Competency Rule Engine that (ideally)
doesn’t know anything about the world, only about how to read the
rules for a broad application area (e.g. business, games, law)
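The table layout described on this slide can be sketched as follows. This is a toy illustration, not the Competency Rule Engine itself: the class `RuleTable`, its `consider` method, and the pricing rules are all hypothetical. The reader code knows only which leading columns form the If part (the primary key); everything about the domain lives in the records.

```python
class RuleTable:
    """One ruleform: leading columns are the Ifs, the rest are Then-Considers
    and rule metadata."""

    def __init__(self, columns, key_width, records):
        self.columns = tuple(columns)
        self.key_width = key_width
        self.index = {}
        for record in records:
            key = tuple(record[:key_width])
            if key in self.index:
                raise ValueError(f"duplicate primary key {key}")
            self.index[key] = tuple(record)

    def consider(self, *factors):
        """Select the rule whose If columns match and return its other
        columns by name, or None when no rule applies."""
        record = self.index.get(tuple(factors))
        if record is None:
            return None
        names = self.columns[self.key_width:]
        return dict(zip(names, record[self.key_width:]))


# Hypothetical business rules: two If columns, one Then-Consider column,
# and one metadata column at the end.
pricing = RuleTable(
    columns=("customer_type", "product", "discount_pct", "last_reviewed"),
    key_width=2,
    records=[
        ("wholesale", "widget", 15, "2007-10-01"),
        ("retail", "widget", 0, "2007-10-01"),
    ],
)

print(pricing.consider("wholesale", "widget"))
print(pricing.consider("wholesale", "gadget"))
```

Changing a discount in this sketch means editing a record, not the `RuleTable` code.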
20. Rule systems have several kinds of Existential Ruleforms as a
foundation
agencies
products/services
locations
time periods
Existential rules are referenced by foreign key constraints to form
Compound Ruleforms
network ruleforms define relations among same kind of entities
attribute ruleforms define characteristics of entities
authorization ruleforms define relations among different kinds of entities
protocol ruleforms define processes
Most columns are foreign keys to a particular existential table (this can
cause problems with some RDBMS)
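How existential ruleforms anchor a compound ruleform through foreign keys can be sketched with Python's built-in `sqlite3` module. The table and column names below are assumptions made for illustration; the slides do not give a schema.

```python
import sqlite3


def build_ruleforms():
    """Create two existential tables and one authorization ruleform."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE product  (name TEXT PRIMARY KEY);
        CREATE TABLE location (name TEXT PRIMARY KEY);
        -- Compound (authorization) ruleform: relates different kinds of entities.
        CREATE TABLE product_location_authorization (
            product  TEXT NOT NULL REFERENCES product(name),
            location TEXT NOT NULL REFERENCES location(name),
            PRIMARY KEY (product, location)
        );
        """
    )
    conn.execute("INSERT INTO product VALUES ('X')")
    conn.execute("INSERT INTO location VALUES ('A')")
    return conn


def add_authorization(conn, product, location):
    """Insert a compound rule; return False if it names an unknown entity."""
    try:
        conn.execute(
            "INSERT INTO product_location_authorization VALUES (?, ?)",
            (product, location),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_ruleforms()
print(add_authorization(conn, "X", "A"))  # both entities exist
print(add_authorization(conn, "Y", "A"))  # product 'Y' is not an existential rule
```

A compound rule can only mention entities that already exist as existential rules, so the database itself rejects rules about unknown products or locations.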
21. 3. Competency Rule Engine (CoRE)
Very small amount of code in engine (~100K LOC)
Conventional data is absorbed into rules; everything is a rule!
[Diagram labels: Stimulus, Control Logic, Response, Competency Rules]
22. Benefits
Representing rules as “data” rather than software decreases
required amount of software by 1-2 orders of magnitude:
reduced amount of software may reduce initial development cost
reduced amount of software definitely reduces chances for bugs, thus
reducing development and maintenance costs
Rules as “data” can be directly accessed and managed by
subject experts, without reliance on programmers:
changes in rules normally do not require changes in software, reducing
maintenance costs
reduces/eliminates communication requirements from subject expert to
programmer
As rules are externalized, corporate knowledge can be seen,
studied, and improved by many
with added metadata regarding each rule, and hyperlinks, this can become
a true knowledgebase
23. Exploratory CoREs
CoRE650 – Business (wholesaler with 10,000 orders/day)
CoRE415 – Language (search documents for concepts)
CoRE576 – Biology (various toy models in proteomics lab)
I’m always eager to try this theory on new kinds of rule systems.
24. Rule Counts Over Time
[Chart: # Existential Rules (vertical axis, 0 to 450,000) by date (9/22/2005 to 9/22/2007), with series for Agency, Location, Product or Service, and Master Protocol]
25. Summary
Rules are a type of abstraction, and should be studied as such. There are
higher-level abstractions than individual rules, and many rule types; we need a
discipline whose object of study is rules/laws.
Putting more rules into software is not the solution, nor is building new layers on
top of existing layers. Software is the problem. It substitutes for a formalized,
place-value representation of rules, enforces a divide between algorithms and
data, and obscures the rules with significant ancillary syntax.
Rule systems must be conceived at a higher level of abstraction to be
manageable while still maintaining all necessary detail
< 100 ruleforms and their interactions are comprehensible
1+ million individual rules are not comprehensible
The resulting system must be able to perform both deductive inference and
computations, and be managed directly by subject experts (not programmers)
26. Questions
Might the problems of large rule systems arise from the way we
represent them?
What is the optimal representation of large numbers (millions) of
complex, contingent rules?
What might a place-value system for representing rules look
like?
What is the relationship of algorithms and data? Is there benefit
in conceiving and representing both as rules?
27. Ultra-Structure References
Long, J., and Denning, D., “Ultra-Structure: A design theory for complex systems and processes.” In
Communications of the ACM (January 1995)
Long, J., “A new notation for representing business and other rules.” In Long, J. (guest editor), Semiotica
Special Issue: Notational Engineering, Volume 125-1/3 (1999)
Shostko, A., “Design of an automatic course-scheduling system using Ultra-Structure.” In Long, J. (guest
editor), Semiotica Special Issue: Notational Engineering, Volume 125-1/3 (1999)
Long, J., “Automated Identification of Sensitive Information in Documents Using Ultra-Structure.”
Proceedings of the 20th Annual ASEM Conference, American Society for Engineering Management (1999)
Oh, Y., and Scotti, R., “Analysis and Design of a Database using Ultra-Structure Theory (UST) –
Conversion of a Traditional Software System to One Based on UST,” Proceedings of the 20th Annual
Conference, American Society for Engineering Management (1999)
Parmelee, M., “Design For Change: Ontology-Driven Knowledgebase Applications For Dynamic Biological
Domains.” Master’s Paper for the M.S. in I.S. degree, University of North Carolina, Chapel Hill (November
2002)
Maier, C., CoRE576 : An Exploration of the Ultra-Structure Notational System for Systems Biology
Research. Master’s Paper for the M.S. in I.S. degree, University of North Carolina, Chapel Hill (April 2006)