Detection of Typical Progress Patterns of Industrial Incidents by Text Mining Technique
1. Detection of Typical Progress Patterns of Industrial Incidents by Text Mining Technique
Toru Nakata, Mohammad Sohrab
National Institute of Advanced Industrial Science and Technology, Japan.
3. Progress of Accident
[Figure: two example progress flows.]
An incident: Initial Situation and Environment → State 1 → State 2 → Damage
Another incident: Initial Situation and Environment → State 1 → State 2 → Protection Measure → Safe Result
5. Causality Patterns of Accidents
[Figure: Merged Many Progress Flows]
Motivation: By merging many progress flows, we can find typical patterns of accidents and the key factors for preventing them.
7. State = Sentence
[Figure: a particular incident (State 1 → State 2 → State 3) shown beside its report (Sentence 1, Sentence 2, Sentence 3).]
• Each state is strongly reflected in the corresponding sentence.
• Sentences are hard to analyze numerically.
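8. State = Sentence (= Bag of Words)
[Figure: a particular incident (State 1 → State 2 → State 3), its report (Sentence 1, Sentence 2, Sentence 3), and the bag of words of each sentence.]
• Each state is strongly reflected in the corresponding sentence, but sentences are hard to analyze numerically.
• Each sentence is loosely connected to its bag of words, and bags of words are easy to analyze numerically.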
9. ASRS's Incident Reports
• Aviation Safety Reporting System (ASRS) of NASA
• Authors: pilots and other crew members who were involved in incidents.
• Volume of reports in 2013
– 4,469 reports; 1,365,260 words; 110,963 sentences.
• On average, a report has 305 words in 25 sentences.
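The averages follow directly from the corpus totals. A minimal arithmetic check in Python, using only the three totals quoted for 2013 (the variable names are illustrative):

```python
# Corpus totals quoted for the 2013 ASRS reports.
reports = 4_469
words = 1_365_260
sentences = 110_963

# Per-report averages derived from the totals.
words_per_report = words / reports
sentences_per_report = sentences / reports

print(f"{words_per_report:.1f} words per report")
print(f"{sentences_per_report:.1f} sentences per report")
```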
10. An example of an ASRS report
I told her that it was 11 minutes until scheduled departure. I had also informed the gate agents twice during boarding that oversized carry-on baggage was entering the aircraft. The agent supervisor stated that the cargo bin had already closed.
(This is a fragment. The original report is much longer.)
11. Split into sentences
1. <<Beginning of Story>>
2. I told her that it was 11 minutes until scheduled departure.
3. I had also informed the gate agents twice during boarding that oversized carry-on baggage was entering the aircraft.
4. The agent supervisor stated that the cargo bin had already closed.
5. <<End of Story>>
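The splitting step can be sketched as follows. This is a minimal illustration, not the authors' tool: it assumes a naive rule (a sentence ends at '.', '!' or '?' followed by whitespace), and the function name split_report is hypothetical.

```python
import re

BOS = "<<Beginning of Story>>"
EOS = "<<End of Story>>"

def split_report(text):
    """Split a report into sentences and wrap them with story markers."""
    # Naive rule (an assumption): split after '.', '!' or '?' plus whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    sentences = [p for p in parts if p]
    return [BOS] + sentences + [EOS]

report = (
    "I told her that it was 11 minutes until scheduled departure. "
    "I had also informed the gate agents twice during boarding that "
    "oversized carry-on baggage was entering the aircraft. "
    "The agent supervisor stated that the cargo bin had already closed."
)

states = split_report(report)
for number, state in enumerate(states, start=1):
    print(f"{number}. {state}")
```

A real report would need a more careful splitter, since abbreviations and decimal numbers also contain periods.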
12. Each sentence stands for a state
(BoS = Beginning of Story, EoS = End of Story)
BoS
I told her that it was 11 minutes until scheduled departure.
I had also informed the gate agents twice during boarding that oversized carry-on baggage was entering the aircraft.
The agent supervisor stated that the cargo bin had already closed.
EoS
13. Drop 'stop words' (= less specific words) and syntactic information
BoS
{tell, minute, schedule (verb), departure}
{inform, gate, agent, twice, board (verb), oversize, carry-on, baggage, enter, aircraft}
{agent, supervisor, state, cargo, bin, close}
EoS
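The reduction from sentence to word set can be sketched as below. The stop-word list and the lemma table are hypothetical, hand-made resources that cover only the three example sentences; a real pipeline would use a proper lemmatizer and stop-word list.

```python
import re

# Hypothetical stop words, sufficient only for the example sentences.
STOP_WORDS = {"i", "her", "that", "it", "was", "until", "had", "also",
              "the", "during", "already", "11"}

# Hypothetical lemma table mapping inflected forms to base forms.
LEMMAS = {"told": "tell", "minutes": "minute", "scheduled": "schedule",
          "informed": "inform", "agents": "agent", "boarding": "board",
          "oversized": "oversize", "entering": "enter",
          "stated": "state", "closed": "close"}

def to_bag(sentence):
    """Lowercase, tokenize, drop stop words, and map words to base forms."""
    # Keep hyphenated words such as "carry-on" as a single token.
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", sentence.lower())
    return {LEMMAS.get(t, t) for t in tokens if t not in STOP_WORDS}

bag1 = to_bag("I told her that it was 11 minutes until scheduled departure.")
bag3 = to_bag("The agent supervisor stated that the cargo bin had already closed.")
print(sorted(bag1))
print(sorted(bag3))
```

Word order inside the sentence is discarded on purpose: each state is represented only by which words it contains.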
14. Transition between a pair of words
BoS
{tell, minute, schedule (verb), departure}
{inform, gate, agent, twice, board (verb), oversize, carry-on, baggage, enter, aircraft}
{agent, supervisor, state, cargo, bin, close}
EoS
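One way to count such transitions is sketched below. It assumes that a transition from word i to word j is counted once for every pair in which i occurs in one sentence and j occurs in the next, with BoS and EoS treated as ordinary words; the slides do not spell out the counting rule, so this is only one plausible reading.

```python
from collections import Counter
from itertools import product

def count_transitions(states):
    """Count word-to-word transitions between consecutive word sets."""
    T = Counter()
    for current, following in zip(states, states[1:]):
        # Assumption: every word of one state links to every word of the next.
        for i, j in product(current, following):
            T[(i, j)] += 1
    return T

story = [
    {"BoS"},
    {"tell", "minute", "schedule", "departure"},
    {"inform", "gate", "agent", "twice", "board", "oversize",
     "carry-on", "baggage", "enter", "aircraft"},
    {"agent", "supervisor", "state", "cargo", "bin", "close"},
    {"EoS"},
]

T = count_transitions(story)
print(T[("BoS", "tell")], T[("tell", "agent")], T[("agent", "tell")])
```

Summing these counters over all reports would give the corpus-wide transition frequencies.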
16. Word Feature 1: Word Order Ranking
• Tij: frequency of transition from word i to word j.
• ai = sum(Tij for all j) - sum(Tji for all j)
– Balance of departures versus arrivals around word i.
• This score is unstable.
• bi = sum(aj for all j with Tij > 0) - sum(aj for all j with Tij < 0)
– Improved word order score of word i, based on its neighbors' balance.
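The balance score ai can be sketched directly from its definition. The transition table below is a hypothetical toy example, not ASRS data. The refined score bi is left out because it depends on how the sign condition on Tij is interpreted.

```python
from collections import defaultdict

def balance_scores(T):
    """ai = (transitions leaving word i) - (transitions arriving at word i)."""
    a = defaultdict(int)
    for (i, j), freq in T.items():
        a[i] += freq   # departure from i
        a[j] -= freq   # arrival at j
    return dict(a)

# Hypothetical toy transition frequencies, T[(i, j)] = count of i -> j.
T = {
    ("start", "check"): 3,
    ("start", "land"): 1,
    ("check", "land"): 2,
    ("land", "check"): 1,
}

a = balance_scores(T)
for word, score in sorted(a.items(), key=lambda item: -item[1]):
    print(word, score)
```

Every transition adds one departure and one arrival, so the scores always sum to zero across the vocabulary.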
17. Word Feature 2: Commonality
• cij = Tij + Tji: commonality between word i and word j.
• vi: vector representation of word i.
– Words with similar meanings should have similar vectors.
– Calculated by iteration of
– Number of dimensions is set to 2.
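The commonality cij is a symmetric version of the transition frequency. A minimal sketch follows, using a hypothetical toy transition table. The iterative computation of the vectors vi is not sketched, since the update rule is not given in the text.

```python
def commonality(T, i, j):
    """cij = Tij + Tji, so the direction of the transition is ignored."""
    return T.get((i, j), 0) + T.get((j, i), 0)

# Hypothetical toy transition frequencies, T[(i, j)] = count of i -> j.
T = {
    ("start", "check"): 3,
    ("check", "land"): 2,
    ("land", "check"): 1,
}

vocabulary = sorted({word for pair in T for word in pair})
matrix = [[commonality(T, i, j) for j in vocabulary] for i in vocabulary]

print(vocabulary)
for word, row in zip(vocabulary, matrix):
    print(word, row)
```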
19. Vi versus Tij
[Figure: scatter plot of frequent words. Horizontal axis: Topic Position (X). Vertical axis: Word Order Index. Plotted labels include nouns (e.g., aircraft, runway, flight), roles (e.g., ATC, Tower, Captain), verbs (e.g., fly, make, get), and the markers StoryStart and StoryEnd.]
20. Vi versus Tij
[Figure: the same scatter plot of frequent words. Horizontal axis: Topic Position (X). Vertical axis: Word Order Index. Annotations mark one end of the Word Order Index axis as "Words tend to appear earlier (= cause)" and the other end as "Words tend to appear later (= result)".]
21. Vi versus Tij (zoomed-up)
[Figure: zoomed-in view of the scatter plot of frequent words. Horizontal axis: Topic Position (X). Vertical axis: Word Order Index. Annotations: "Words tend to appear earlier (= cause)" and "Words tend to appear later (= result)".]
22. View from Another Dimension
[Figure: scatter plot of frequent words. Horizontal axis: Topic Position (primary component). Vertical axis: Word Order.]
23. View from Another Dimension
[Figure: two scatter plots of the same frequent words. First panel: Word Order versus Topic Position (primary component). Second panel: Topic Position (second component) versus Topic Position (primary component).]
25. [Figure: scatter plot of frequent words. Horizontal axis: Topic Position (X). Vertical axis: Topic Position (Y). Labeled clusters: Pose Control, Communication with controller, Avoiding other traffic, General Verbs, Scene at Airport, Trouble on Aircraft, Reporting.]
26. Detected Topic Clusters
No. | Contents and meaning | Typical words
1 | general verbs | work, stop, call, feel, hear, run, declare
2 | control of airplane posture and direction | head (verb), descend, climb, cross
3 | communication | controller, tower, call, radio, frequency
4 | avoiding near-miss | traffic, clearance, turn, mile
5 | events at an airport | runway, takeoff, departure, landing
6 | mechanical troubles | engine, gear, system, maintenance
7 | incident reporting | incident, day, company, control (noun)
27. Visualization of Causality Graph
[Figure: causality graph drawn over the zoomed-in scatter plot of frequent words. Horizontal axis: Topic Position (X). Vertical axis: Word Order Index.]
28. Conclusion
• We have a huge amount of accident reports, so it is difficult to read all of them and to find typical progress patterns of accidents.
• By using NLP (word order and word commonality analysis), common progress patterns of aviation incidents are visualized.
• Future work: How can common accident patterns be generated in sentence format? (What we showed is a word-wise network style.)
30. Why text data?
• We have a great deal of information about accidents and incidents.
• Most of it is in text format.
– Accident reports, daily operation logbooks, etc.
• To use the data, natural language processing (NLP) is required.
31. Result of conventional method
(Classification of reports with respect to the words they contain)
[Figure: Distribution of Incident Reports (PCA, Normalized Data). Scatter plot with labeled regions: Landing, Airport, Tower, Attitude, Runway, Controller, Engine, Maintenance, Crew, Traffic.]