1. SE and AI: A Two-way Street
26th May 2013
John A Clark
Dept. of Computer Science
University of York
York, UK
2. Why am I here? Why are you here?
Aim to inspire progress in software engineering, in artificial
intelligence, and at the interface between the two disciplines.
Some aspirations/stretch goals for the next 20 years.
Some are clear.
Some are rather loose.
Some may be met fully.
Others are harder, but in trying to meet them….
Something Good Will Happen.
Target the SE and AI communities, both individually and to inspire
collaboration.
Many have roots in practical applications but satisfying them may
need theoretical advances in both disciplines.
3. Deviation!
Inputs from Simon M Poulding, Mark Harman, Edmund Burke and Xin
Yao (my DAASE colleagues).
It is better to travel than to arrive.
The map is not the terrain.
…..
The talk is not the paper!
I’ve kept many of the challenges but have thrown in more
observational material and some questions.
KEYNOTE NIRVANA:
Get to say things that would never get past referees!
This is a health warning.
4. The challenge of proof automation:
striking at the heart of rigorous
software engineering
5. Moore’s Law for Formal Reasoning for
Software
Rigorous software development has a long history in CS
Turing’s 1949 paper has a proof of correctness
of a program with two nested loops and outlines
a more general proof approach.
1960s saw classic works by Floyd and Hoare: {P} Q {R}
1970s and 1980s: VDM, Z, and B, CSP and CCS.
Tool support has matured, including automated reasoning.
Tools generate “verification conditions” that must be proved for the software to be
demonstrably correct.
Most are dispatched automatically but the remaining cases require manual
and expensive proof effort.
Technological improvements here would greatly facilitate take-up. This brings us to
our first challenge.
Challenge. Demonstrate for 20 years a 20% year-on-year reduction in the
residual manual proof effort that must be expended to produce formally
verified systems.
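To get a feel for what this challenge demands, the compounding can be worked through directly. The following is a minimal sketch, assuming the manual effort is normalised to 1.0 at year zero; the function name and the normalisation are illustrative assumptions, not part of the talk.

```python
def residual_effort(years: int, yearly_reduction: float = 0.20, start: float = 1.0) -> float:
    """Manual proof effort left after compounding a fixed yearly reduction.

    Assumption: effort is normalised so that today's effort is `start`.
    """
    return start * (1.0 - yearly_reduction) ** years


if __name__ == "__main__":
    for year in (1, 5, 10, 20):
        print(f"year {year:2d}: {residual_effort(year):.4f} of the original effort")
```

Under these assumptions, 20 years of sustained success leaves a little over 1% of today's manual effort, roughly an 87-fold reduction.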
6. Moore’s Law for Formal Reasoning for
Software
We seek a software engineering proof equivalent of Moore’s Law. This is
exponentially (geometrically) ambitious and persistently feasible.
“Classic AI” and undoubtedly “Classic CS” in the service of software
engineering.
NOTA BENE:
In many successful toolsets developers build powerful domain specific theories for
their provers.
An obvious example here is the ASTREE toolset (supporting AIRBUS C program
analysis).
This has an important place in any way forward, but we observe also that current
theorem proving technology seems to draw little on the technologies and ideas that
AI has to offer (e.g. a raft of adaptive learning techniques).
Provision for better tools:
Automated proof tactic development via large scale HPC experimentation?
Computer crowd sourcing?
Data mining of proof strategies and understanding their applicability?
What do people find difficult?
7. Impact Pragmatics
Take a REAL SYSTEM that has been subject to formal
development and use that as a case study.
Tokeneer: a security system developed using Z and SPARK
Ada.
Part of the motivation is that EVERY TOOL has its own
idiosyncrasies/weaknesses and patterns of them.
Perfectly reasonable to expect AI to detect them and for tool
environmental improvements to be gained.
There are real opportunities to do some work that will grab
attention.
8. Learning from Software Engineering
Software engineering knows the power of domain specificity
or focus more generally. There is an age-old tension
between generality/expressiveness and tractability.
Witness tools that resolve issues of:
Freedom from deadlock (e.g. via model checking)
Exception-freeness (e.g. via abstract interpretation).
Latter may still suffer false positive issues.
THEY HAVE WELL DEFINED NARROW TARGETS AND
THEY AIM TO DELIVER ON THEM.
YOU DON’T HAVE TO DO FULL FORMAL DEVELOPMENT.
SOLVING SMALL PROBLEMS WELL IS GOOD
9. Software Engineering Helping AI to Help
Software Engineering: Ratcliff, White and Clark.
Here we use evolutionary
computation to evolve candidate
invariants from trace data – i.e.
what Daikon does.
Generates thousands!!!!!!!
Not all are interesting.
But see which invariants are
broken by mutant programs.
Some are very special to the
original program. Those are
INTERESTING
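The mutant-based filtering idea can be sketched in a few lines. This is a toy illustration only, not the authors' tooling: the program, the two mutants and the candidate invariants are all hypothetical and hand-written here, whereas in the work described the candidates would be evolved from trace data.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def original(x: int) -> int:
    return x * x  # hypothetical program under study


def mutant_sign(x: int) -> int:
    return -(x * x)  # hypothetical mutant: result negated


def mutant_plus(x: int) -> int:
    return x + x  # hypothetical mutant: '*' replaced by '+'


# Candidate invariants over (input, output) pairs. Hand-written for this sketch.
CANDIDATES: Dict[str, Callable[[int, int], bool]] = {
    "out >= 0": lambda x, out: out >= 0,
    "out >= x": lambda x, out: out >= x,
    "out is an integer": lambda x, out: isinstance(out, int),
    "out == x * x": lambda x, out: out == x * x,
}


def traces(program: Callable[[int], int], inputs: Iterable[int]) -> List[Tuple[int, int]]:
    """Run the program on each input and record (input, output) pairs."""
    return [(x, program(x)) for x in inputs]


def holds(invariant: Callable[[int, int], bool], ts: List[Tuple[int, int]]) -> bool:
    return all(invariant(x, out) for x, out in ts)


def rank_invariants(inputs: Iterable[int]) -> Dict[str, int]:
    """Score each invariant that holds on the original by how many mutants break it."""
    inputs = list(inputs)
    base = traces(original, inputs)
    mutants = [mutant_sign, mutant_plus]
    scores = {}
    for name, inv in CANDIDATES.items():
        if holds(inv, base):
            scores[name] = sum(not holds(inv, traces(m, inputs)) for m in mutants)
    return scores


if __name__ == "__main__":
    for name, score in sorted(rank_invariants(range(0, 4)).items(), key=lambda kv: -kv[1]):
        print(f"{score} mutant(s) break: {name}")
```

An invariant that no mutant breaks (here, "out is an integer") says little about the program; one that every mutant breaks is special to the original.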
11. Trustable real-time AI
Modern critical systems (e.g. those with safety implications) will increasingly
use AI to deliver the required services.
Prototype driverless cars use image
processing techniques to detect humans and avoid collisions.
A common view: “the non-determinism of many AI
algorithms makes their application in critical systems
difficult”.
It is the inability to provide guaranteed envelopes of system behavior that is
problematic; stochastic algorithms, for example, would be fine, provided you could
rigorously argue that their behavior is satisfactorily bounded.
Functional and non-functional correctness are both relevant here. (Dealing with
non-functional performance and resource trade-offs forms an important component
of the DAASE project, most typically as part of the collaborative work on adaptive
automated software engineering.)
Challenge. Demonstrate across a range of fundamental AI algorithms formal
machine assisted proofs of correctness and scientifically justified
predictive models of functionality-timing-memory-power-other tradeoffs.
12. Principle
people and computers are not the
same
Understand it, live with it, and
then embrace it
Or…
Vive la (les) Difference(s)!
13. Hard for humans
Humans do some things well, and some things badly, and some
things not at all.
Forget the doom and gloom. A lot of software works and works
acceptably well. This is largely due to human efforts. We humans
actually can do a fair job….But….
Ask a human to write an image classifier that distinguishes pictures
containing cats from pictures containing dogs. Rather hard and
standard specification and refinement techniques are not much use.
Furthermore, ask them to write such a binary classifier that works only
80% of the time. Che?
It’s just not the sort of thing we do well at all.
But why should we care?
14. Vive la (les) Difference(s)
Making n-version programming work!
Critical systems hardware: majority voting of a 2-out-of-3 architecture allows
continued operation in the presence of a single hardware fault.
The software variant of this idea is more controversial. It is questionable
whether independence holds between development teams, and teams work from
the same, possibly flawed, specification.
[Figure: two 2-out-of-3 architectures. In each, input goes to all processors and a majority vote is taken on the output. In the first, all three processors run Prog 1; in the second, they run Prog 1, Prog 2 and Prog 3.]
15. Vive la (les) Difference(s)
Making n-version programming work!
Automated programming may actually make the idea more
palatable.
Programming teams may make the same assumptions and the
same mistakes
But automated program discovery techniques can give sets of
programs with complementary weaknesses.
Only need the majority to be correct on any training and
subsequent example.
Ensemble based approaches…
Challenge. Demonstrate a credible approach to N-version automated
programming that is scientifically grounded and capable of satisfying
assurance requirements of appropriate regulatory authorities.
Suitable applications will have to be identified. This challenge has
roots in both SE and AI.
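The voting idea can be illustrated with a toy sketch. The three versions below are hypothetical stand-ins for automatically discovered programs: each is deliberately wrong on a different input, so the weaknesses are complementary and a strict majority is always right. Real discovered programs would offer no such guarantee without further argument.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(versions: Sequence[Callable[[int], int]], x: int) -> int:
    """Run every version on x and return the output backed by a strict majority."""
    outputs = [version(x) for version in versions]
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError(f"no strict majority among outputs {outputs}")
    return value


# Hypothetical versions of a squaring routine, each faulty on one distinct input.
def version_a(x: int) -> int:
    return 0 if x == 1 else x * x  # wrong only at x == 1


def version_b(x: int) -> int:
    return 0 if x == 2 else x * x  # wrong only at x == 2


def version_c(x: int) -> int:
    return 0 if x == 3 else x * x  # wrong only at x == 3


if __name__ == "__main__":
    ensemble = [version_a, version_b, version_c]
    for x in range(-2, 5):
        print(x, majority_vote(ensemble, x))
```

No single version is correct everywhere, yet the ensemble is correct on every input because at most one version fails at a time.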
16. Vive la (les) similitude(s)
Making computers like humans!
Humans are expensive and get bored very quickly.
Great need for human testers in many systems.
Long-standing quest. To create a system that cannot be
distinguished from humans.
For most people the challenge was for humans to create such a
system
Challenge. Bring AI to bear to create high performing proxies for
humans for specific purposes?
What we really need is automated human compilation. (Humans are
programs too by the way.)
But this should also allow for the “dumb user” – the one who does
something that screws the system up in an unanticipated way.
Abstractly, this reduces to modeling of traces/sequences – what is the
range of approaches for this?
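One of the simplest approaches to modelling traces is a first-order Markov chain learned from recorded user sessions. The sketch below is only one point in the range of approaches; the action names and the example sessions are invented for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence


def learn_transitions(traces: Sequence[Sequence[str]]) -> Dict[str, Counter]:
    """Count how often each action is immediately followed by each other action."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for trace in traces:
        for current, nxt in zip(trace, trace[1:]):
            counts[current][nxt] += 1
    return counts


def transition_probability(counts: Dict[str, Counter], current: str, nxt: str) -> float:
    followers = counts.get(current, Counter())
    total = sum(followers.values())
    return followers[nxt] / total if total else 0.0


def trace_likelihood(counts: Dict[str, Counter], trace: Sequence[str]) -> float:
    """Probability of the trace's transitions under the learned first-order model."""
    probability = 1.0
    for current, nxt in zip(trace, trace[1:]):
        probability *= transition_probability(counts, current, nxt)
    return probability


# Hypothetical recorded sessions.
SESSIONS: List[List[str]] = [
    ["login", "browse", "buy", "logout"],
    ["login", "browse", "logout"],
    ["login", "buy", "logout"],
]

if __name__ == "__main__":
    model = learn_transitions(SESSIONS)
    print(trace_likelihood(model, ["login", "browse", "buy", "logout"]))
    print(trace_likelihood(model, ["login", "logout"]))
```

A trace with zero or very low likelihood under such a model is exactly the unanticipated behaviour a proxy would need to generate deliberately, which a model trained only on typical sessions will not do without help.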
17. Trustable real-time AI
Finally, as part of our drive for engineering trusted systems, we envisage
significant exchange of ideas on software testing.
Challenge. Draw on the testing expertise of SE and AI to develop a
credible and scientific basis for the testing of complex systems.
We envisage further uses of AI for stress testing of systems (but the
scientific justification for such testing may be some way off).
In addition, enhanced fault based (e.g. mutation) testing will likely play an
important part in testing systems of systems based on AI technology (e.g.
agent based systems).
18. The Challenge of coming to terms
with resource abundance and
resource constraints
or
The sky’s (cloud’s) the limit
And
Just how low can you go?
19. Resource availability: Aunt Ada’s
Dividend Challenge
Extraordinary computational power. The ‘cloud’ (however
constituted) is very much the topic of the day.
One of our most computationally-minded august relatives, Aunt
Ada, has now retired from programming and has invested
deeply and successfully in cloud technology.
Each year she receives a dividend of 1 billion processor
hours (with each processor clocking at approximately 1
GHz).
She is free to use them as she sees fit, but they have to be
spent within one year. She wishes to support speculative
research in SE and AI.
Challenge. Identify the problem from SE, from AI, or their
fusion that provides the best use of such resources.
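The size of the dividend is easier to reason about once converted into other units. The following back-of-envelope sketch assumes one useful cycle per clock tick and a 365-day year; both are simplifying assumptions not stated in the challenge.

```python
PROCESSOR_HOURS = 1_000_000_000  # yearly dividend, from the challenge
CLOCK_HZ = 1_000_000_000         # approximately 1 GHz per processor
SECONDS_PER_HOUR = 3600
HOURS_PER_YEAR = 365 * 24        # assumption: non-leap year


def total_cycles(hours: int = PROCESSOR_HOURS, clock_hz: int = CLOCK_HZ) -> int:
    """Clock cycles available per year, assuming one useful cycle per tick."""
    return hours * SECONDS_PER_HOUR * clock_hz


def processors_running_all_year(hours: int = PROCESSOR_HOURS) -> float:
    """How many processors the dividend keeps busy around the clock for a year."""
    return hours / HOURS_PER_YEAR


if __name__ == "__main__":
    print(f"cycles per year: {total_cycles():.2e}")
    print(f"processors running continuously: {processors_running_all_year():,.0f}")
```

Under these assumptions the dividend is about 3.6 x 10^21 cycles a year, or roughly 114,000 processors running continuously.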
20. The guilt trip power diet challenge
Ever-increasing power consumption is a serious concern: some highly
developed societies now live in fear of power outages.
GUILT TRIPPING: We aim to encourage and celebrate power-frugality with a
series of challenges.
Challenge. For your favourite application or algorithm from SE, from AI, or their
fusion, demonstrate a year on year power consumption reduction of 20% for the
next 20 years.
Challenge. (The 1 J Diet Challenge.) You are given 1 Joule of energy. Identify the
most ambitious task that can be completed using no more than 1 J.
For the benefit of the less militantly frugal, we offer the corresponding 10 J
and 100 J Challenges. The above is intended as a playful attempt to spark
efforts in the area of low power functionality.
The challenge may morph from real power to a more abstract model of power,
e.g. to a virtual mapping from instructions to power consumed.
Yes, you’ll have to make hardware assumptions/have some common
computational model.
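A virtual mapping from instructions to power consumed could look like the sketch below. The per-instruction energy costs are invented placeholders, not measurements of any real hardware; an agreed common model would replace them.

```python
from typing import Dict

# Hypothetical energy cost per instruction class, in nanojoules.
ENERGY_PER_OP_NJ: Dict[str, int] = {"add": 1, "mul": 3, "load": 10, "store": 10}

ONE_JOULE_NJ = 1_000_000_000  # 1 J expressed in nanojoules


def energy_nj(op_counts: Dict[str, int]) -> int:
    """Energy of a computation under the abstract model, given its instruction mix."""
    return sum(ENERGY_PER_OP_NJ[op] * count for op, count in op_counts.items())


def fits_budget(op_counts: Dict[str, int], budget_nj: int = ONE_JOULE_NJ) -> bool:
    return energy_nj(op_counts) <= budget_nj


def max_repetitions(op_counts: Dict[str, int], budget_nj: int = ONE_JOULE_NJ) -> int:
    """How many times a loop body with this instruction mix fits in the budget."""
    return budget_nj // energy_nj(op_counts)


if __name__ == "__main__":
    loop_body = {"load": 2, "mul": 1, "add": 1, "store": 1}  # hypothetical inner loop
    print(energy_nj(loop_body), "nJ per iteration")
    print(max_repetitions(loop_body), "iterations within 1 J")
```

With a model like this, entries to the 1 J, 10 J and 100 J challenges can be compared without requiring everyone to measure the same physical device.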
21. The dark side
Principle: embracing the dark side
of AI can be more fun and highly
productive
Challenge: to fully realise the
destructive capabilities of AI
22. Thinking within the box
It is habitual to recommend THINK OUTSIDE THE BOX. In a
sense this is nonsense. The aim is not to impose unnecessary
constraints due to the mental baggage (assumptions,
favourite techniques etc.). THINK WITHIN THE BOX but
do so in a way that gives a result we actually find
appealing.
[Figure: the self-constructed box versus the real box]
23. Thinking within the box: stressing
systems
Best is best but very personal. Lots of AI (and indeed OR)
related research in the optimisation arena.
Provided we can engineer feedback we can optimise for what
are usually regarded as NEGATIVE performance or other
criteria – STRESS TEST THE SYSTEM.
Already see examples – e.g. growing worst case execution
times (various)
But “stress” and “systems” are flexible beasts:
Search based software testing – grow test data that breaks
predicates (module pre-conditions, post conditions)
Push non-functional properties to their limits
Push predicates (a la Daikon etc.) to their limits by aggressive test
data generation. Attacking products of AI with AI in order to
strengthen them!
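Optimising for a negative criterion can be sketched with a simple hill climber that searches for inputs maximising the work done by a routine. This is an illustration only: insertion sort, the comparison count used as a stand-in for execution time, and the swap-based neighbourhood are all choices made for this sketch.

```python
import random
from typing import List, Sequence, Tuple


def insertion_sort_cost(data: Sequence[int]) -> int:
    """Sort a copy with insertion sort and return the number of comparisons made."""
    items = list(data)
    comparisons = 0
    for i in range(1, len(items)):
        j = i
        while j > 0:
            comparisons += 1
            if items[j - 1] <= items[j]:
                break
            items[j - 1], items[j] = items[j], items[j - 1]
            j -= 1
    return comparisons


def stress_search(size: int = 8, steps: int = 2000, seed: int = 0) -> Tuple[List[int], int]:
    """Hill-climb over permutations, keeping any swap that does not reduce the cost."""
    rng = random.Random(seed)
    best = list(range(size))  # start from the cheapest case: already sorted
    best_cost = insertion_sort_cost(best)
    for _ in range(steps):
        candidate = best[:]
        i, j = rng.sample(range(size), 2)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        cost = insertion_sort_cost(candidate)
        if cost >= best_cost:  # accept sideways moves to cross plateaus
            best, best_cost = candidate, cost
    return best, best_cost


if __name__ == "__main__":
    worst_input, cost = stress_search()
    print(worst_input, cost)
```

The only requirement is engineered feedback: any measurable quantity (time, memory, power, a violated predicate) can replace the comparison count as the fitness being maximised.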
25. Teleportation
Quantum Information Processing (QIP) and Quantum Computing (QC) offer
us a radically different computational model and capability.
Star Trek is with us now (sort of)! Here’s Brassard’s TELEPORTATION
Here’s an evolved one from Yabuki & Iba
26. Grover’s Algorithm
Quantum Information Processing (QIP) and Quantum
Computing (QC) offer us a radically different computational
model and capability.
Here is Grover’s Search (a fundamental building block of QC) –
allows you to search a database of size $2^N$ in order of $2^{N/2}$ steps.
Spector et al. evolved a GS circuit for two qubits
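For the two-qubit case the effect of Grover's Search can be checked with a classical simulation of the amplitudes. The sketch below is not a quantum circuit and is not the evolved circuit of Spector et al.; it simply applies the textbook oracle and inversion-about-the-mean steps to a list of real amplitudes.

```python
import math
from typing import List


def grover_iteration(amplitudes: List[float], marked: int) -> List[float]:
    """One Grover step: oracle sign flip, then reflection about the mean amplitude."""
    flipped = [-a if index == marked else a for index, a in enumerate(amplitudes)]
    mean = sum(flipped) / len(flipped)
    return [2 * mean - a for a in flipped]


def grover_search(num_qubits: int, marked: int) -> List[float]:
    """Return the measurement probabilities after about (pi/4) * sqrt(size) iterations."""
    size = 2 ** num_qubits
    state = [1 / math.sqrt(size)] * size  # uniform superposition
    iterations = max(1, math.floor(math.pi / 4 * math.sqrt(size)))
    for _ in range(iterations):
        state = grover_iteration(state, marked)
    return [a * a for a in state]


if __name__ == "__main__":
    probabilities = grover_search(num_qubits=2, marked=2)
    print([round(p, 3) for p in probabilities])
```

With two qubits (four items) a single iteration concentrates all the probability on the marked item; for larger searches the success probability is high but generally not exactly one.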
27. Shor’s Quantum Fourier Transform
Here is a Quantum Fourier Transform (Shor’s fundamental
building block) – this is what enables you to perform
factorization in polynomial time.
28. Massey et al Quantum Fourier
Transform Generation
29. New Computing
If we were being cruel we could say that the AI community
(including me!) is capable of re-discovering what the physicists
have already discovered.
But there are actually very few genuinely different quantum
algorithms around. A real opportunity….
Challenge. Using AI techniques generate new quantum
algorithms to solve problems of acknowledged importance.
Question: what does the new computing offer current
software engineering in terms of verification capability?
30. Dynamic Adaptive Automated Software
Engineering
A major EPSRC “programme grant” (around £6.7m) over four sites:
UCL (Mark Harman PI), Birmingham (Xin Yao), Stirling (Edmund
Burke) and York (John Clark - or me).
Major focus on automation and adaptivity (off-line and on-line)
Will have to face many of the problems discussed at this workshop:
Squishiness (in many forms)!
Uncertainty and its resolution.
Pareto and other MO approaches and their applicability.
Ascertaining the limits of machine learning and what can be
justified/reasoned about.
31. Acknowledgements
Sponsors
DAASE: Dynamic Adaptive Automated Software
Engineering. EPSRC grant EP/J017515
The Birth, Life and Death of Semantic Mutants
EPSRC grant EP/G043604/1
Many thanks to: Simon M Poulding, Mark Harman,
Edmund Burke and Xin Yao