Disaster Scene Description and
Indexing (DSDI) Task Overview
TRECVID 2022
Asad Anwar Butt, National Institute of Standards and Technology; Johns Hopkins University
George Awad, National Institute of Standards and Technology
Jeffrey Liu, MIT Lincoln Laboratory
William Drew, Office of Homeland Security and Preparedness
• Video and imagery data can be extremely helpful for public
safety operations.
• Natural Disasters, e.g.,
• Wildfires
• Hurricanes
• Earthquakes
• Floods
• Man-made Disasters, e.g.,
• Hazardous material spills
• Mining accidents
• Explosions
Introduction - DSDI
* All images are under Creative Commons licenses
• Prior knowledge about affected areas can be very useful for first responders.
• Communication systems often go down in major disasters, making it very difficult to get any information about the damage.
• Automated systems to gather information before rescue
workers enter the area can be very helpful.
Introduction - DSDI
• Computer vision capabilities have advanced rapidly with the popularity of deep learning.
• Research groups have access to large image and video datasets for various tasks.
• However, the capabilities do not meet public safety needs.
• Lack of relevant training data.
• Most current image and video datasets have no public safety
hazard labels.
• State-of-the-art systems trained on such datasets fail to provide helpful labels.
Introduction - DSDI
• In response, MIT Lincoln Laboratory developed a dataset of images of various natural disasters collected by the Civil Air Patrol.
• The Low Altitude Disaster Imagery (LADI) dataset was developed as part of a larger NIST Public Safety Innovation Accelerator Program (PSIAP) grant.
• Two key properties of the dataset are:
• Low altitude
• Oblique perspective of the imagery and disaster-related features.
• The DSDI test data and ground truth from 2020 & 2021 are also
available for teams to use as training data.
Training Dataset
• LADI Dataset:
• Hosted as part of the AWS Public Dataset program.
• Consists of 20,000+ annotated images.
• The images are from locations with FEMA (Federal Emergency Management
Agency) major disaster declarations for a hurricane or flooding.
• A lower-altitude criterion distinguishes the LADI dataset from satellite datasets, supporting the development of computer vision capabilities for small drones operating at low altitudes.
• A minimum image size was selected to maximize the efficiency of the crowdsourced workers; lower-resolution images are harder to annotate.
• 2020 - 2021 DSDI Test Set:
• ~11 hours of video.
• Segmented into small video clips (shots) with a maximum length of 20 sec.
• Videos are from earthquake, hurricane, and flood affected areas.
• Total number of shots: 4626
Training Dataset
• A test dataset of about 6 hours of video was distributed for this task.
• Collected by FEMA as individual images after disaster events.
• The individual images were stitched together into videos at a reasonable playback speed (a stitching sketch follows this slide).
• The test dataset was segmented into small video clips (shots) with a maximum length of 16 sec and a mean length of 10 sec.
• A subset was selected by NIST, taking diversity into account.
• In addition, a small set of videos was collected from Defense Visual
Information Distribution Service (DVIDS): https://www.dvidshub.net/
• Total number of shots: 2157
Test Dataset
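The stitching step mentioned above is not specified in detail; the following minimal Python sketch shows one way to turn a folder of still images into a short clip using OpenCV. The paths, frame rate, and clip length are illustrative assumptions, not the organizers' actual settings.

# A sketch only: turn a folder of still images into one short video clip with
# OpenCV. Paths, frame rate, and clip length are illustrative assumptions.
import glob
import cv2

def images_to_clip(image_dir, out_path, fps=2, max_seconds=16):
    """Write the images in image_dir into a single MP4 clip of at most max_seconds."""
    paths = sorted(glob.glob(f"{image_dir}/*.jpg"))[: fps * max_seconds]
    if not paths:
        raise ValueError("no images found in " + image_dir)
    first = cv2.imread(paths[0])
    height, width = first.shape[:2]
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (width, height))
    for path in paths:
        frame = cv2.imread(path)
        # Resize in case the source images have mixed resolutions.
        writer.write(cv2.resize(frame, (width, height)))
    writer.release()

images_to_clip("fema_event_0001", "shot_0001.mp4")  # hypothetical folder / output name

With fps=2 and max_seconds=16 each clip holds at most 32 images; in the actual task the stitched videos were then segmented into shots of at most 16 sec with a mean length of 10 sec.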
Testing Data: Example Videos
* All videos are under public domain licenses
• Hierarchical labeling scheme: 5 coarse categories, each with 4 to 9
more specific annotations.
Testing Dataset - Categories
• Damage: Misc. Damage, Flooding/Water Damage, Landslide, Road Washout, Rubble/Debris, Smoke/Fire
• Environment: Dirt, Grass, Lava, Rocks, Sand, Shrubs, Snow/Ice, Trees
• Infrastructure: Bridge, Building, Dam/Levee, Pipes, Utility or Power Lines/Electric Towers, Railway, Wireless/Radio Communication Towers, Water Tower, Road
• Vehicles: Aircraft, Boat, Car, Truck
• Water: Flooding, Lake/Pond, Ocean, Puddle, River/Stream
• We had two full-time annotators (the same annotators since 2020) instead of crowdsourcing.
• For each category, a practice web page was created with multiple
examples and sample test videos.
• This allowed the annotators to become familiar with the task and labels before starting a category.
• The annotators worked independently on each category.
• For each coarse category, they marked all the specific labels that
were present in the video.
• To create the final ground truth for each shot, the union of the two annotators' labels was used (see the sketch after this slide).
Annotation
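A minimal Python sketch of the ground-truth merge described above (the union of the two annotators' labels per shot); the shot IDs and labels are made up for illustration.

# A minimal sketch of the merge described above; shot IDs and labels are made up.
annotator_a = {
    "shot_0001": {"flooding", "building", "road"},
    "shot_0002": {"rubble/debris"},
}
annotator_b = {
    "shot_0001": {"flooding", "building", "trees"},
    "shot_0002": set(),
}

# Final ground truth per shot = union of both annotators' label sets.
ground_truth = {
    shot: annotator_a.get(shot, set()) | annotator_b.get(shot, set())
    for shot in annotator_a.keys() | annotator_b.keys()
}
print(ground_truth["shot_0001"])  # union: flooding, building, road, trees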
The annotators watch each video and mark the categories that are visible in it.
Annotation Tool
• Systems are required to return a ranked list of up to 1000 shots for each of the 32 features (a minimal example follows this slide).
• Each submitted run specified its training type:
• LADI-based (L): The run only used the supplied LADI dataset for development of its system.
• Non-LADI (N): The run did not use the LADI dataset, but only trained using other dataset(s).
• LADI + Others (O): The run used the LADI dataset in addition to any other dataset(s) for
training purposes.
DSDI System Task
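A minimal Python sketch of the run structure implied above: for each feature, up to 1000 shot IDs ranked by a system's confidence scores. The scores and shot IDs are placeholders, and the official TRECVID submission file format is not reproduced here.

# A sketch only: build a ranked list of up to 1000 shots per feature from per-shot confidence scores.
def rank_shots(scores, max_returned=1000):
    """scores: {feature: {shot_id: confidence}} -> {feature: ranked list of shot IDs}."""
    run = {}
    for feature, per_shot in scores.items():
        ranked = sorted(per_shot, key=per_shot.get, reverse=True)
        run[feature] = ranked[:max_returned]
    return run

scores = {
    "flooding": {"shot_0003": 0.91, "shot_0001": 0.42, "shot_0002": 0.07},
    "building": {"shot_0001": 0.88, "shot_0002": 0.80, "shot_0003": 0.11},
}
print(rank_shots(scores)["flooding"])  # ['shot_0003', 'shot_0001', 'shot_0002']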
• The following evaluation metrics were used to compare the
submissions:
Evaluation Metrics
• Speed: Clock time per inference (reported by participants).
• Mean Average Precision (MAP): Average precision is calculated for each feature, and the mean over all features is reported for a submission.
• Recall: True positive, true negative, false positive, and false negative rates.
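A minimal Python sketch of how MAP can be computed from a ranked list per feature, using the standard ranked-retrieval definition of average precision; NIST's official scorer may differ in implementation details.

# A sketch of the scoring idea: average precision per feature, then the mean over features.
def average_precision(ranked_shots, relevant):
    """ranked_shots: shot IDs in system order; relevant: set of shots truly containing the feature."""
    hits, precision_sum = 0, 0.0
    for rank, shot in enumerate(ranked_shots, start=1):
        if shot in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(run, ground_truth):
    """run: {feature: ranked shot list}; ground_truth: {feature: set of relevant shots}."""
    aps = [average_precision(run.get(f, []), rel) for f, rel in ground_truth.items()]
    return sum(aps) / len(aps)

gt = {"flooding": {"shot_0001", "shot_0003"}, "building": {"shot_0002"}}
run = {"flooding": ["shot_0003", "shot_0002", "shot_0001"],
       "building": ["shot_0002", "shot_0001"]}
print(round(mean_average_precision(run, gt), 3))  # 0.917 for this toy example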
Submissions
Run Type Run Id
O PKU_WICT_7
O PKU_WICT_2
O PKU_WICT_4
O PKU_WICT_5
L PKU_WICT_6
L PKU_WICT_3
L PKU_WICT_1
L PKU_WICT_8
L UMKC_1
O UMKC_1
PKU_WICT : Peking University
UMKC : University of Missouri-Kansas City
Frequency of Features
• The graph shows the number of shots containing each feature.
• Some features (e.g., grass, trees, buildings, roads) occur much more frequently than others.
• 4 features were dropped due to their rare occurrence in the ground truth (lava, snow/ice, landslide, road washout).
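A minimal Python counting sketch for these frequency statistics, assuming a ground truth that maps each shot to its label set; the minimum-count threshold is hypothetical, since the slide does not state the cutoff used to drop the four rare features.

# A counting sketch; the min_positives threshold is illustrative only.
from collections import Counter

def feature_frequencies(ground_truth):
    """ground_truth: {shot_id: set of feature labels} -> shots-per-feature counts."""
    counts = Counter()
    for labels in ground_truth.values():
        counts.update(labels)
    return counts

def evaluated_features(ground_truth, min_positives=10):
    """Keep only features present in at least min_positives shots (illustrative threshold)."""
    counts = feature_frequencies(ground_truth)
    return {feature for feature, n in counts.items() if n >= min_positives}

gt = {"shot_0001": {"flooding", "trees"}, "shot_0002": {"trees"}}
print(feature_frequencies(gt))  # Counter({'trees': 2, 'flooding': 1})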
[Bar chart: true positives per feature. Number of shots (0 to 2200) containing each of the 28 evaluated features: damage (misc), flooding, rubble/debris, smoke/fire, dirt, grass, rocks, sand, shrubs, trees, bridge, building, dam/levee, pipes, utility/power lines, railway, wireless communication towers, water tower, aircraft, boat, car, truck, flooding, lake/pond, ocean, puddle, river/stream, road.]
Results by Teams
• LADI+Others-based systems (MAP across all features): PKU_WICT_7 0.501, PKU_WICT_2 0.500, PKU_WICT_4 0.482, PKU_WICT_5 0.429, UMKC_1 0.354
• LADI-based systems (MAP across all features): PKU_WICT_6 0.468, PKU_WICT_3 0.468, PKU_WICT_1 0.465, PKU_WICT_8 0.423, UMKC_1 0.354
Results by Categories
[Bar chart: performance by category. Mean average precision (0 to 0.8) for the coarse categories Damage, Environment, Infrastructure, Vehicles, and Water, comparing LADI+Others and LADI runs.]
• Average precision values for each feature categorized by training type.
• 5 LADI-based runs; 5 LADI+Others-based runs
Results by Features
[Charts: min, median, and max average precision per feature for LADI-based runs and for LADI+Others-based runs, across the 28 evaluated features (damage, flooding, rubble, smoke, dirt, grass, rocks, sand, shrubs, trees, bridge, building, dam, pipes, utility line, railway, wireless tower, water tower, aircraft, boat, car, truck, flooding, lake, ocean, puddle, river, road).]
Efficiency
[Scatter plots: processing time (sec, 0 to 800) vs. average precision for LADI-based runs and for LADI+Others-based runs.]
• LADI-based systems reported lower processing times.
• Most systems consumed more time without a corresponding gain in performance.
• The lowest processing time at maximum performance was ~30 sec (labeled runs in the plots: O_UMKC_1, L_UMKC_1, L_PKU_WICT_8).
F-Measure per feature
• A new test dataset from various event sources was employed, providing more diversity.
• Performance varies by feature.
• LADI+Others (L+O) runs performed better than LADI-only (L) runs.
• A few runs/features achieve both good accuracy and efficiency.
• Challenges include:
• Small dataset and limited resources for annotation.
• Training and testing data should come from the same distribution, which is hard to achieve given the different nature of the calamities involved.
• The task had less participation than in the last 2 years.
• Open question to teams about whether to continue the task.
Conclusion and Future Work