1. Improving Decision-Making Support
by Linking Database results to Simulations
Gio Wiederhold
Stanford University
July 2011
Gio Wiederhold SimQL 1
2. Problem: Mismatch
Database Technology should support Decision-Making
• What does database technology do?
o Databases provide information about past events
» Consistent
» Reliable
» Fast
• What does a decision-maker do?
o Guess how decisions will affect the future
» Multiple possibilities
» Uncertainty
» Slow, manual, multiple tools
8/17/2012 Gio: SimQL
3. Information Systems should also
Project into the Future
past now future
time
Support of decision-making requires dealing with the future,
as well as the past
• Databases deal well with the past
• Sensors can provide current status
• Spreadsheets, simulations deal with the likely futures
Information systems should be able to combine all three
4. Decision-making (DM)
Analyze Alternatives
• Current Capabilities
• Future Expectations
• Planning for them
now future
Process tasks:
• List resources
• Enumerate alternatives
• Prune alternatives
• Compare alternatives
5. Current Processes
• Data collection
• Data validation
• Data integration
• Information selection
• Data reduction & summarization
• File generation for analysts
• Data conversion to files for spreadsheets
• Model building and testing by analysts
• Planning for likely future scenarios
• Recording expected results
• Comparing many scenarios
• Finding the best plans
• Advising the actual decision maker
6. Progress in Data Integration
Information Integration has progressed in supporting
Decision Making
1. Integrate data from distributed sources
o Issues: inconsistency of scope and timing
2. Capture new relationships
o Often requires expert inter-domain knowledge
3. Include current sensor data
o Select streaming data
4. Include predictions about future courses
******* A new, potentially major topic *******
7. DM support is disjoint
does not interoperate
Planning Science
extensions to move
Distribution to networked support
are also disjoint
8. Current state of DM Support
[Figure: timeline from past through now to future]
Past: organized support
• Databases: distributed, heterogeneous
• Data integration
Future: disjointed support
• Intuition +
o Spreadsheets
o Resource allocations
o Explicit simulations
• Various point assessments
9. Prediction Requires Tools
Alfred Knopf, 1997
10. Requirements for DM
• Ubiquitous access to simulations
of a wide variety of types
• Rapid response to parameter changes
o Access to up-to-date facts
o May need High-Performance recomputation
• Model, scenario, and choice retention
o Analysts’ planning to be reused
» But updatable
11. How to merge 2 disciplines
• Databases
o High-level languages
» Data descriptions
o Drive detailed processes
o Intensional
• Simulations & spreadsheets
o High-level languages
» Model descriptions
o Parameter driven
o Extensional
12. Integration concept
• Enable intensional simulation access
o Follow database model
» Similar to data description
o Provide interfaces
» To support needed processes
Create SimQL similar to SQL
schema & links to access procedures
13. Transform Data to Information
[Figure: two paths from data to the user.
Database path: Design, Schema, Collection, Data, SQL, Reports; user: middle-management.
Model path: Design, Modeling, Plans, value-added services; user: data-driven decision-makers.]
14. Language implementation
The Stanford experiment uses an existing SQL parser:
1. Replace the SELECT verb with ESTIMATE;
2. Remove the UPDATE statement, since nothing persists;
3. Replace CREATE DATABASE with CREATE MODEL;
4. Add the attributes IN, OUT, and INOUT to CREATE;
5. Add a REGISTER statement to identify resources;
6. Replace SQL's code generators, which access stored data,
with functions that
a. Deliver the query's IN parameters to the various simulations
b. Collect the data specified as OUT parameters
c. Return the result.
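Steps 5 and 6 can be illustrated with a minimal Python sketch. It is not the Stanford implementation: the `Model` class, the `REGISTRY` dictionary, the `register` and `estimate` functions, and the toy weather function are all hypothetical names chosen for illustration, and the returned numbers are made up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical model description; the structure is illustrative only.
@dataclass
class Model:
    name: str
    in_params: List[str]    # attributes declared IN
    out_params: List[str]   # attributes declared OUT
    run: Callable[[Dict[str, object]], Dict[str, object]]  # wrapper around the simulation

REGISTRY: Dict[str, Model] = {}

def register(model: Model) -> None:
    """Stand-in for a REGISTER statement: make a simulation known by name."""
    REGISTRY[model.name] = model

def estimate(model_name: str, wanted: List[str], where: Dict[str, object]) -> Dict[str, object]:
    """Stand-in for ESTIMATE <wanted> FROM <model_name> WHERE <where>."""
    model = REGISTRY[model_name]
    unknown_in = set(where) - set(model.in_params)
    unknown_out = set(wanted) - set(model.out_params)
    if unknown_in or unknown_out:
        raise ValueError(f"not in model: {sorted(unknown_in | unknown_out)}")
    raw = model.run(dict(where))                  # (a) deliver IN parameters
    return {name: raw[name] for name in wanted}   # (b) collect OUT, (c) return

def toy_weather(params: Dict[str, object]) -> Dict[str, object]:
    # Toy stand-in for a real simulation; the values are invented.
    return {"Temperature": 21.0, "Windspeed": 12.0}

register(Model("WeatherSimulation", ["Date", "Location"],
               ["Temperature", "Windspeed"], toy_weather))
result = estimate("WeatherSimulation", ["Temperature"],
                  {"Date": "tomorrow", "Location": "ORD"})
print(result)
```

Nothing is stored between calls, which mirrors the removal of UPDATE: each ESTIMATE is answered by running, or asking, the simulation behind the wrapper.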
15. Examples
SQL:
SELECT Temperature, Cloudcover, Windspeed, Winddirection
FROM WeatherDB
WHERE Date = 'yesterday' AND Location = 'ORD';
SimQL:
ESTIMATE Temperature, Cloudcover, Windspeed, Winddirection
FROM WeatherSimulation
WHERE Date = 'tomorrow' AND Location = 'ORD';
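The two statements share one shape and differ only in the verb and the kind of source. A small Python sketch can make that concrete by parsing both with a single pattern. The grammar below is a deliberately simplified assumption (one source, equality conditions joined by AND), not the grammar of the SQL parser used in the experiment, and `parse` is a hypothetical helper.

```python
import re

# Simplified, assumed grammar:
#   (SELECT | ESTIMATE) cols FROM source WHERE a = 'x' AND b = 'y' [;]
PATTERN = re.compile(
    r"^\s*(SELECT|ESTIMATE)\s+(.+?)\s+FROM\s+(\w+)\s+WHERE\s+(.+?)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)

def parse(statement: str) -> dict:
    """Split a statement into verb, column list, source name, and conditions."""
    m = PATTERN.match(statement)
    if m is None:
        raise ValueError("unsupported statement")
    verb, cols, source, cond = m.groups()
    columns = [c.strip() for c in cols.split(",")]
    where = {}
    for clause in re.split(r"\s+AND\s+", cond, flags=re.IGNORECASE):
        key, value = clause.split("=", 1)
        where[key.strip()] = value.strip().strip("'")
    return {"verb": verb.upper(), "columns": columns,
            "source": source, "where": where}

sql = parse("""SELECT Temperature, Cloudcover, Windspeed, Winddirection
               FROM WeatherDB
               WHERE Date = 'yesterday' AND Location = 'ORD';""")
simql = parse("""ESTIMATE Temperature, Cloudcover, Windspeed, Winddirection
                 FROM WeatherSimulation
                 WHERE Date = 'tomorrow' AND Location = 'ORD';""")
print(sql["verb"], sql["source"], sql["where"])
print(simql["verb"], simql["source"], simql["where"])
```

A client that already builds SELECT statements could, under these assumptions, issue ESTIMATE statements with the same column list and conditions.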
16. Available Functions
1. Continuously executing: weather prediction
o SimQL result reports best-match samples
2. Execution specific to query: spreadsheet what-if assessment
o May require HPC power for adequate response
3. Past simulations collect results in a base: materials
o Performs inter- or extrapolations to match query parameters
4. Combinations, e.g., 2. + 3.: top-layer simulation using stored
partial lower-level results: weapon performance in a new setting
5. Human-in-the-loop: wrapper for Amazon's Mechanical Turk
Note
• A simulation service program can be written in any language
• A simulation service must comply with the interface specification
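For the third kind of function, a wrapper answers a query from stored past results rather than running a new simulation. The Python sketch below shows one simple way to do that, linear inter- or extrapolation over a single parameter. The function name, the choice of a linear method, and the sample data are assumptions made for illustration; a real materials base would likely have several parameters and a more careful method.

```python
from bisect import bisect_left

def interpolate(stored, x):
    """Linearly inter- or extrapolate y at x from stored (x, y) results.

    Assumes at least two stored results with distinct x values.
    Outside the stored range, the nearest segment is extended.
    """
    points = sorted(stored)
    if len(points) < 2:
        raise ValueError("need at least two stored results")
    xs = [p[0] for p in points]
    i = bisect_left(xs, x)
    i = min(max(i, 1), len(points) - 1)   # clamp to a valid segment
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Invented sample: (temperature, strength) pairs from earlier simulation runs.
stored_results = [(100.0, 50.0), (200.0, 40.0), (300.0, 25.0)]
print(interpolate(stored_results, 150.0))   # between two stored runs
print(interpolate(stored_results, 350.0))   # beyond the last stored run
```

Extrapolated answers are less trustworthy than interpolated ones, which is one reason a wrapper might also report how far the query lies from the stored results.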
18. Interfaces enable integration:
SimQL to access Simulations
[Figure: timeline from past through now to future]
• Past: databases, accessed via SQL or XML, CORBA-compliant wrappers
• Now: message systems, sensors
• Future: simulations, accessed via SimQL and compliant wrappers
19. Current State of SimQL research
[Figure: a GUI and a test application, used to collect language requirements, reach three simulations (Spreadsheets, Weather, Engineering), each through its own wrapper]
21. More to be done
• Stanford experiment only produced point results.
• A decision maker would estimate multiple scenarios
1. Collect results identified with parameters
2. Provide search functions to compare results
a. Consider time lines for result synchronization
3. Support pruning of low-value results
4. Deliver only high-value results to the decision maker
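The four steps above can be sketched in a few lines of Python. The `ScenarioResult` class, the `prune_and_rank` function, the threshold, and the sample values are all hypothetical; the slides do not say how value is measured or where the cut-off lies.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScenarioResult:
    params: tuple   # parameter settings that identify the scenario
    value: float    # assessed value of the outcome

def prune_and_rank(results, min_value, top_k):
    """Drop low-value results, then return the top_k highest-value ones."""
    kept = [r for r in results if r.value >= min_value]
    return sorted(kept, key=lambda r: r.value, reverse=True)[:top_k]

# Invented scenario results, each identified by its parameters.
collected = [
    ScenarioResult((("route", "north"), ("crew", 4)), 1200.0),
    ScenarioResult((("route", "north"), ("crew", 2)), 66.0),
    ScenarioResult((("route", "south"), ("crew", 4)), 600.0),
    ScenarioResult((("route", "south"), ("crew", 2)), -400.0),
]
best = prune_and_rank(collected, min_value=100.0, top_k=2)
for r in best:
    print(dict(r.params), r.value)
```

Keeping the parameters with each result is what lets a later search compare scenarios or rerun one when the facts change.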
22. Use of Simulation Results
[Figure: tree of alternatives branching over time; each branch is labeled with its probability (prob)]
Simulation results can be composed for
alternative courses of action
Composition should include computation
and recomputation of likelihoods
Likelihoods change as "now" moves forward
and eliminates earlier alternatives.
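One simple reading of "recomputation of likelihoods" is renormalization: once an alternative has been eliminated, the remaining likelihoods are rescaled so that they again sum to one. The Python sketch below assumes exactly that, and also assumes that the elimination carries no other information about the remaining alternatives. The function name and the numbers are illustrative.

```python
def renormalize(likelihoods, eliminated):
    """Rescale the likelihoods of the remaining alternatives to sum to 1."""
    remaining = {k: p for k, p in likelihoods.items() if k not in eliminated}
    total = sum(remaining.values())
    if total <= 0:
        raise ValueError("no remaining alternative has positive likelihood")
    return {k: p / total for k, p in remaining.items()}

# Invented likelihoods for three alternative courses of action.
before = {"A": 0.5, "B": 0.3, "C": 0.2}
after = renormalize(before, eliminated={"C"})
print(after)   # A and B keep their 5:3 ratio and now sum to 1
```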
23. Estimates have probabilities
• p=30% chance of rain
• Flight p=91% likely to arrive within 15 min of ETA
• Interest rate p=50% same, p=25% 1% higher, … .
• Employee p=50% returns to work in a week, … .
• Project p=10% completed in time, …
• Spreadsheets can compute alternative values
with such data provided by the model builder,
not the SimQL user.
24. The branches can be labeled with probabilities,
then assessed using the outcome with values
[Figure: probability tree over time (past, now, future). Branches for the next period's alternatives and for subsequent periods are labeled with probabilities (prob); nodes and outcomes are labeled with values.]
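A common way to assess such a tree is to roll it back to an expected value: each outcome's value is weighted by the probabilities on the path leading to it. The slide does not state which assessment it uses, so the Python sketch below is an assumption, and the small tree in it is invented rather than transcribed from the figure.

```python
def expected_value(node):
    """Roll back a probability tree.

    A leaf is a number (the value of an outcome). An interior node is a
    list of (probability, child) branches.
    """
    if isinstance(node, (int, float)):
        return float(node)
    return sum(p * expected_value(child) for p, child in node)

# Invented two-period tree: the first branch leads to a second choice point.
tree = [
    (0.6, [(0.5, 1000), (0.5, 200)]),   # next period, then subsequent period
    (0.4, -400),                        # next period ends in a loss
]
print(expected_value(tree))   # 0.6 * (0.5*1000 + 0.5*200) + 0.4 * (-400)
```

Expected value ignores risk preferences; a decision maker who must avoid the worst outcomes would need a different assessment over the same tree.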
25. Integrating data & planning support will make
our data reusable and much more valuable
A Pruned Bush
Re-assess as time marches forward!
[Figure: pruned tree of alternatives with values on the remaining branches, over a timeline of past, now, future. Sources along the timeline: databases, . . ., messages, sensors, spreadsheets, other simulations.]
26. Even the present needs SimQL
[Figure: timeline (past, now, future). The last recorded observations lie in the past; simple simulations extrapolate the data to the point-in-time for situational assessment.]
Not all data are current:
• Is the delivery truck in X?
• Is the right stuff on the truck?
• Will the crew be at X?
• Will the forces be ready to accept delivery?
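The truck question shows what a "simple simulation" can mean here. The Python sketch below extrapolates the truck's position from its last recorded observation under a constant-speed assumption and checks whether it should now be at X. The `Observation` class, the function names, the tolerance, and the numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    time_h: float        # time of the last recorded observation, in hours
    position_km: float   # distance along the route at that time
    speed_kmh: float     # speed reported at that time

def extrapolate_position(obs, now_h):
    """Estimate the current position, assuming constant speed since obs."""
    if now_h < obs.time_h:
        raise ValueError("now precedes the observation")
    return obs.position_km + obs.speed_kmh * (now_h - obs.time_h)

def is_at(obs, now_h, target_km, tolerance_km=5.0):
    """Is the estimated position within tolerance_km of the target?"""
    return abs(extrapolate_position(obs, now_h) - target_km) <= tolerance_km

# Invented data: last report at 10:00, km 120, moving at 60 km/h.
last_seen = Observation(time_h=10.0, position_km=120.0, speed_kmh=60.0)
print(extrapolate_position(last_seen, now_h=10.5))
print(is_at(last_seen, now_h=10.5, target_km=150.0))
```

The answer is an estimate, not a fact: the older the observation, the wider the uncertainty that should accompany it.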
27. Use of Simulation Results
Simulation results can be composed for
alternative courses of action
Composition should be seamless, elegant,
with computation and recomputation of
likelihoods
Results change as "now" moves forward and
eliminates earlier alternatives.
28. Summary
Databases
• serve clients via SQL by
o Sharing a model (the schema)
o A query language over the model
• the SQL interface enables independence of
o application development
o DBMS technology development
o reuse of infrastructure
Today
• most new systems use a DBMS for data storage,
even with less performance,
inability to handle all problems,
but enough of them well enough.
Simulations should
• serve clients via SimQL by
o Sharing a model (research q.)
o A query language over the model
• a SimQL interface enables independence of
o application development
o simulation technology development
o reuse of infrastructure
Objective
• build information systems combining DBMS, Simulations,
even with less performance,
inability to handle all problems,
but enough of them . . .
29. Further research questions
• How to move seamlessly from the past to the future?
• How can multiple futures be managed (indexed)?
• How can multiple futures be compared, selected?
• How should joint uncertainty be computed?
• How can the NOW point be moved automatically?
30. Future information systems
Combine data from the past, with current data,
knowledge, and predictions into the future
Assessment of the
values of alternative
possible outcomes
31. SimQL research questions
• How little of the model needs to be exposed?
• How can defaults be set rationally?
• How should expected execution cost be reported?
• How should uncertainty be reported?
• Are there differences among application areas that
require different language structures?
• Are there differences among application areas that
require different language features?
• How will the language interface support effective
partitioning and distribution?
32. Moving to a Service Paradigm
• Server is an independent contractor, defines service
• Client selects service, and specifies parameters
• Server’s success depends on value provided
• Some form of payment received for services
Databases are a current example.
Simulations have the same potential.
33. Summary of SimQL
A new service for Decision Making:
• follows database paradigm
– (by about 25 years)
• coherence in prediction
– displacement of ad-hoc practices
• seamless information integration
– single paradigm for decision makers
• simulation industry infrastructure
– investment has a potential market
– should follow database industry model:
Interfaces promote new industries
34. Publications
Gio Wiederhold: "Information Systems that Really
Support Decision-making"; 11th International
Symposium on Methodologies for Intelligent Systems
(ISMIS), Warsaw, Poland, June 1999; in Ras & Skowron:
Foundations for Intelligent Systems, Springer LNAI
1609, pages 56-66.
Gio Wiederhold and Rushan Jiang: "Augmenting
Information Systems with Access to Predictive Tools";
http://infolab.stanford.edu/pub/gio/2000/VLDB2000-1.htm
The specifics of the language as implemented are at
http://www-db.stanford.edu/LIC/SimQL.html