1. Institute of Computer Science
Chair of Cognitive Modeling
Demolition Derby 2012
Based on TORCS: The Open Racing Car Simulator
07/2012, Martin V. Butz
2. Demolition Derby 2012
http://cm.inf.uni-tuebingen.de/competitions
• Organized by:
- Martin V. Butz, University of Tübingen, Germany
• Supported by:
- Andreas Alin, University of Tübingen, Germany
- Dennis Schwartz, University of Tübingen, Germany
• And with previous help by:
- Matthias J. Linhardt, University of Bamberg, Germany
- Daniele Loiacono, Politecnico di Milano, Italy
- Luigi Cardamone, Politecnico di Milano, Italy
- Pier Luca Lanzi, Politecnico di Milano, Italy
3. Demolition Derby: Purpose
• Optimize opponent interactions
- Avoid being hit – run away when necessary
- Try to hit others at the right moment.
• Enables (co-)optimization of interaction behavior.
- Fitness may be based on damage caused to other cars.
- Co-development of two or more competitors is possible
(possibly with different approaches).
- Policy-gradient-based optimization is possible.
• Various strategy components are relevant
- Avoidance optimization
- Chasing others optimization
- Forwards & backwards steering control
- Opponent monitoring
- Meta-strategies
4. Goal & Setup
• Goal:
Wreck all opponent cars by
crashing into them without getting
wrecked yourself.
• Setup:
Local sensor information as in the Simulated Car Racing Competition.
• Sensors:
- Simulated distance sensors (noiseless).
36 surrounding opponent sensors with a range of 300 m.
19 track sensors with a range of 200 m.
- Other sensors
Current damage of own car.
Damage produced on other cars.
Status of car (speed, wheels, gear…).
Relative position on track.
- Damage model:
Cars do not take any damage when colliding with walls.
Cars do not take any damage in the front when colliding with each other.
Cars only take damage when their rear is hit by another car.
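The damage model above can be summarized in a few lines. This is a minimal sketch, not the TORCS implementation: the function name `collision_damage`, the string labels for the collision partner and hit zone, and the `impact` magnitude are all hypothetical and chosen only for illustration.

```python
def collision_damage(hit_by: str, zone: str, impact: float) -> float:
    """Damage taken by one car in one collision under the derby damage model.

    hit_by: "wall" or "car" (hypothetical labels).
    zone:   part of this car that was hit, e.g. "front" or "rear".
    impact: assumed non-negative impact magnitude; the slides do not say
            how the simulator scales damage.
    """
    if hit_by == "wall":
        return 0.0      # collisions with walls never cause damage
    if zone == "rear":
        return impact   # only a rear hit by another car counts
    return 0.0          # front (and any other) contact causes no damage
```

Under this model a controller gains nothing from head-on contact, which is why steering backwards and protecting the rear both matter.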
5. Winner Determination
• Arena: Large circular track (surface: asphalt; length: 640m, width: 90m)
• Qualifying
- 1-vs-1 matches, all against all; the car that takes less damage wins the match and earns 1 point.
- Eight best controllers qualify for the final showdown.
• Final demolition derby matches:
- The best eight controllers fight each other.
- Ten matches are played.
- The car that wins most often wins the competition.
- Alternative scoring with rank-based points is also considered.
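The qualifying round can be sketched as a round-robin over all controller pairs. This is an illustrative sketch only: `damage_in_match` is a hypothetical callback that runs one 1-vs-1 match and returns the damage each car received, and the tie handling is an assumption because the slides do not specify it.

```python
from itertools import combinations

def qualify(damage_in_match, controllers, n_finalists=8):
    """Rank controllers by points from all-against-all 1-vs-1 matches.

    damage_in_match(a, b) -> (damage_a, damage_b) is a hypothetical callback.
    Returns the top n_finalists controllers and the full points table.
    """
    points = {c: 0 for c in controllers}
    for a, b in combinations(controllers, 2):
        dmg_a, dmg_b = damage_in_match(a, b)
        if dmg_a < dmg_b:
            points[a] += 1      # a took less damage and wins the match
        elif dmg_b < dmg_a:
            points[b] += 1
        # Assumption: a tie awards no point to either car.
    ranked = sorted(controllers, key=lambda c: points[c], reverse=True)
    return ranked[:n_finalists], points
```

Because `sorted` is stable, controllers with equal points keep their input order; a real tournament would need an explicit tie-break.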
6. Additional Goodies for a Quick Start
• Basic controller clients for Java and C++, to easily add additional
functionality.
• COBOSTAR client in Java
- With opponent monitor that tracks opponents over time.
- With simple crashing strategy that targets closest car in range.
• Evolvable client setup that
- receives caused damage signal,
- applies CMA evolution strategy-based optimization,
- runs continuously with or (even faster) without visualization for
as many generations as desired.
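The evolvable setup follows the usual evaluate-and-mutate loop. The sketch below is not CMA-ES and not the provided client: it swaps in a much simpler (1+λ) evolution strategy with a fixed step size to show the loop structure. The `fitness` callback stands in for the caused-damage signal of one match, and all names and default values are assumptions.

```python
import random

def evolve(fitness, x0, sigma=0.3, offspring=8, generations=50, seed=0):
    """(1+lambda) evolution strategy with a fixed mutation step size.

    fitness(params) -> float is a hypothetical stand-in for the damage a
    controller with these parameters causes in a match (higher is better).
    """
    rng = random.Random(seed)
    parent, parent_fit = list(x0), fitness(x0)
    for _ in range(generations):
        # Sample all offspring from the current parent.
        children = [[x + rng.gauss(0.0, sigma) for x in parent]
                    for _ in range(offspring)]
        scored = [(fitness(c), c) for c in children]
        top_fit, top = max(scored, key=lambda fc: fc[0])
        # Elitist selection: keep the parent unless a child is strictly better.
        if top_fit > parent_fit:
            parent, parent_fit = top, top_fit
    return parent, parent_fit
```

CMA-ES additionally adapts the step size and a full covariance matrix of the mutation distribution, which this sketch deliberately leaves out.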
7. Last Year's Entries
• Base Client
- Dep. of Computer Science
- University of Würzburg, Germany
• DemoStar
- Thies Lönneker, Dep. of Computer Science
- University of Würzburg, Germany
• Spartiat
- Zygmunt Horodyski, Piłsudskiego 39/1
- 66-530 Drezdenko, Poland
12. USM Rule-Based Agents
Agent behaviors are determined using a
rule-based approach
No learning for the entries here, but this approach
is designed for EC learning
Conditions and actions are drawn from a discrete
“vocabulary” of pre-designed options
Each rule is a condition-action vector
Conflict resolution:
Rule order here (but specificity is usually helpful)
Gagne, Knowlton, Tellier, and Congdon, GECCO
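The two conflict-resolution options mentioned above, rule order and specificity, can be contrasted in a short sketch. The encoding is an assumption made for illustration: each rule is a string over `T`, `F` and `*` (don't care) plus an action name, and the agent state is a string of `T`/`F` values in the same condition order.

```python
def matches(pattern: str, state: str) -> bool:
    """True if every non-wildcard cell of the pattern equals the state."""
    return all(p == "*" or p == s for p, s in zip(pattern, state))

def first_match(rules, state):
    """Conflict resolution by rule order: the first matching rule fires."""
    for pattern, action in rules:
        if matches(pattern, state):
            return action
    return None

def most_specific(rules, state):
    """Conflict resolution by specificity: the matching rule with the
    fewest wildcards fires; ties fall back to rule order."""
    candidates = [r for r in rules if matches(r[0], state)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: sum(c != "*" for c in r[0]))[1]
```

With the hypothetical rules `[("T*", "Ram"), ("TT", "Run")]` and state `"TT"`, rule order selects Ram while specificity selects Run.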
13. USM Rule-Based Agents
Abstractions interface between game and rules
Low-level game sensors are abstracted to high-level rule
conditions
High-level rule actions are translated to low-level game
controls
[Diagram: the Game passes low-level sensor details to the Agent as high-level input abstractions (conditions); the Agent's high-level output abstractions (actions) are translated back into low-level control details for the Game.]
14. Conditions – Input Abstractions
Condition    TRUE when…
Ahead        An enemy is ahead and within 100 m
Close Ahead  Any enemy is ahead and within 20 m
Behind       An enemy is behind and within 100 m
Advantage    Opponent has 2,000 more damage than agent
Edge         Agent is near the track edge
Wounded      Agent has more than 7,000 damage
No Change    Agent has been doing the same thing for a long time
Duel         There is only one other opponent
Turning      Agent has started a U-turn, but has not finished
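Several of these conditions reduce to simple threshold tests. The sketch below assumes a hypothetical pre-processed `Snapshot` of the sensor readings; its field names are invented for illustration, and whether each bound is inclusive is also an assumption. Edge, No Change and Turning are omitted because the slides give no numeric thresholds for them.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Snapshot:
    """Hypothetical summary of the low-level sensors (names are assumptions)."""
    nearest_ahead_m: Optional[float]   # distance to nearest enemy ahead, None if none
    nearest_behind_m: Optional[float]  # distance to nearest enemy behind, None if none
    own_damage: float
    opponent_damage: float
    opponents_alive: int

def within(distance: Optional[float], limit_m: float) -> bool:
    """True if an enemy was detected at or inside the given distance."""
    return distance is not None and distance <= limit_m

def conditions(s: Snapshot) -> dict:
    """Map a sensor snapshot to the high-level Boolean conditions."""
    return {
        "Ahead": within(s.nearest_ahead_m, 100.0),
        "Close Ahead": within(s.nearest_ahead_m, 20.0),
        "Behind": within(s.nearest_behind_m, 100.0),
        "Advantage": s.opponent_damage - s.own_damage >= 2000,
        "Wounded": s.own_damage > 7000,
        "Duel": s.opponents_alive == 1,
    }
```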
15. Actions – Output Abstractions as Pictures
Action Pictorial Description
Ram Agent SMASH!
Run
Bait
Get Clear
U-Turn
Circle Track
16. Actions – Output Abstractions in English
Action Description
Ram • Steer toward opponent
• Slow down if necessary.
• Otherwise, full acceleration.
Run • Steer away from opponent
• Circle the track
• Full acceleration
Bait • Circle the track
• Speed limit 110 kph
• When opponent is close, swerve
Get Clear • Turn away from track edge
• If very close to edge, back up
U-Turn • Cut wheel hard left or right (coin flip)
• Keep wheel cut for 100 steps
Circle Track • Stay centered and in line with track axis
• Speed limit 110 kph
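Two of the action descriptions above translate directly into low-level controls. The following is a minimal sketch under stated assumptions: steering is taken to range from -1.0 to 1.0 and throttle from 0.0 to 1.0, and the class and function names are hypothetical rather than part of any client API.

```python
import random

class UTurn:
    """Holds full steering lock in a randomly chosen direction for 100 steps."""
    HOLD_STEPS = 100

    def __init__(self, rng=None):
        rng = rng or random.Random()
        self.steer = rng.choice([-1.0, 1.0])  # coin flip; sign convention assumed
        self.steps_left = self.HOLD_STEPS

    def step(self):
        """Return the steering command for this step, or None when finished."""
        if self.steps_left == 0:
            return None
        self.steps_left -= 1
        return self.steer

def throttle_with_limit(speed_kph: float, limit_kph: float = 110.0) -> float:
    """Full acceleration below the speed limit, none at or above it."""
    return 1.0 if speed_kph < limit_kph else 0.0
```

While `step()` still returns a value the Turning condition would be true; it becomes false once `None` is returned.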
17. Sloppy Jalopy Entry – Rule Set
Conditions Action
Ahead  Close Ahead  Behind  Advantage  Edge  Wounded  No Change  Duel  Turning
F * * * T * * * Get Clear
* * * F * F T * U-Turn
* * * * F T * F Run
* * * T F * * T Run
* * T * F F * F Bait
T * F * * * * F Ram
T * * * * F F T Ram
* * T F F * F T Bait
* * * * * * F * Circle Track
‘*’ Means any state satisfies this condition (Don’t Care).
Grayed-out conditions are ignored by this agent.
18. Sloppy Jalopy Behavior
SJ runs away when it’s wounded or winning a
duel by a margin.
Rams only when there is nobody behind it.
Tries to score mainly by baiting opponents.
Is more timid in multiplayer, more aggressive in a
duel.
Does a U-Turn if it’s been doing one thing for a
while.
Circles the track by default.
19. Crash and Segfault Entry – Rule Set
Ahead  Close Ahead  Behind  Advantage  Edge  Wounded  No Change  Duel  Turning  Action
F      F            *       *          T     *        *          *     *        Get Clear
*      F            T       *          F     *        F          *     F        Run
T      F            F       T          F     *        *          *     *        Run
T      F            F       F          *     *        *          *     F        Ram
F      F            F       T          F     *        *          *     *        Circle Track
*      T            *       *          *     *        *          *     *        Ram
*      F            T       *          F     *        T          *     *        U-Turn
F      F            F       F          F     *        F          *     F        Circle Track
F      F            F       F          F     *        T          *     *        U-Turn
F      F            *       F          F     *        *          *     T        U-Turn
‘*’ Means any state satisfies this condition (Don’t Care).
Grayed-out conditions are ignored by this agent.
20. Crash and Segfault Behavior
Will always attempt to ram if an opponent is close
ahead.
Runs away if an opponent is behind it.
Runs away if it has a damage advantage in a
duel.
Attempts to ram if there is an opponent ahead
and it isn't running.
Will make a U-turn if it has gone a complete lap
around the track while either running or circling.
If no other action is called for, will circle the track
to try to find opponents.
21. JustDetermined Entry – Rule Set
Ahead  Close Ahead  Behind  Advantage  Edge  Wounded  No Change  Duel  Turning  Action
*      *            *       *          T     *        *          *     *        Get Clear
*      *            T       *          *     *        *          *     *        U-Turn
T      *            F       *          F     *        *          *     *        Ram
*      *            F       T          F     *        *          *     *        Run
F      *            F       *          F     *        *          *     *        Circle Track
‘*’ Means any state satisfies this condition (Don’t Care).
Grayed-out conditions are ignored by this agent.
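The JustDetermined rule set is small enough to encode and evaluate directly. This sketch assumes that the nine table columns follow the header order and that conflicts are resolved by rule order; `select_action` and the set-of-true-conditions input are hypothetical conventions, not the authors' code.

```python
COLUMNS = ["Ahead", "Close Ahead", "Behind", "Advantage", "Edge",
           "Wounded", "No Change", "Duel", "Turning"]

# One (pattern, action) pair per table row, in table order.
JUST_DETERMINED = [
    ("* * * * T * * * *", "Get Clear"),
    ("* * T * * * * * *", "U-Turn"),
    ("T * F * F * * * *", "Ram"),
    ("* * F T F * * * *", "Run"),
    ("F * F * F * * * *", "Circle Track"),
]

def select_action(rules, true_conditions):
    """Return the action of the first rule whose pattern matches.

    true_conditions is the set of condition names that currently hold;
    every condition not in the set is treated as false.
    """
    for pattern, action in rules:
        cells = pattern.split()
        if all(cell == "*" or (cell == "T") == (name in true_conditions)
               for name, cell in zip(COLUMNS, cells)):
            return action
    return None
```

For example, `select_action(JUST_DETERMINED, {"Behind"})` yields U-Turn. Under rule order, Run can only fire when no opponent is ahead, because the Ram rule precedes it.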
22. JustDetermined Behavior
Basic wall detection to avoid walls
Opponent is its main focus when opponent is in
front of the controller
If an opponent is behind the controller, it will turn
around as fast as possible to hit the opponent
When no opponent is near, the controller circles
the track to maintain speed
23. Future Work
Further evaluation of agents against a wider
variety of drivers.
This basic approach is designed as a step
towards using EC on the rule sets.
In addition to evolving the rule sets, parameters
such as “close” can benefit from EC to refine
these values.
32. And the Winner is....
SEALbot
Anderson Rocha Tavares & Gabriel de Oliveira Ramos & Renato
de Pontes Pereira & Sérgio Montazzolli Silva & Ana L. C. Bazzan
Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre
Brazil
34. Institute of Computer Science
Chair of Cognitive Modeling
Thank you for your attention!
Editor's notes
Optimization of avoidance is as important as the optimization of hitting. Co-optimization is possible. Point out that various optimization methods can be applied; policy-gradient-style optimization can be done with CMA-ES, for example. Strategy components can be optimized individually or in parallel.
Last year, neither competitor considered opponent AVOIDANCE; both did, however, attempt to optimize chasing and crashing. The result was rather unclear outcomes in matches with 8 competitors.