Project Matsu: Large Scale On-Demand
Image Processing for Disaster Relief
Collin Bennett, Robert Grossman,
Yunhong Gu, and Andrew Levine
Open Cloud Consortium
June 21, 2010
www.opencloudconsortium.org
Project Matsu Goals
• Provide persistent data resources and elastic
computing to assist in disasters:
– Make imagery available for disaster relief workers
– Elastic computing for large scale image processing
– Change detection for temporally different and
geospatially identical image sets
• Provide a resource for testing standards and for
interoperability studies of large data clouds
Part 1:
Open Cloud Consortium
• 501(c)(3) not-for-profit corporation
• Supports the development of standards,
interoperability frameworks, and reference
implementations.
• Manages testbeds: Open Cloud Testbed and
Intercloud Testbed.
• Manages cloud computing infrastructure to support
scientific research: Open Science Data Cloud.
• Develops benchmarks.
OCC Members
• Companies: Aerospace, Booz Allen Hamilton,
Cisco, InfoBlox, Open Data Group, Raytheon,
Yahoo
• Universities: CalIT2, Johns Hopkins,
Northwestern Univ., University of Illinois at
Chicago, University of Chicago
• Government agencies: NASA
• Open Source Projects: Sector Project
Operates Clouds
• 500 nodes
• 3000 cores
• 1.5+ PB
• Four data centers
• 10 Gbps
• Target to refresh 1/3
each year.
• Open Cloud Testbed
• Open Science Data Cloud
• Intercloud Testbed
• Project Matsu: Cloud-based Disaster Relief Services
Open Science Data Cloud
Astronomical data
Biological data
(Bionimbus)
Networking data
Image processing for disaster relief
Focus of OCC Large Data Cloud Working Group
[Diagram: a stack of Cloud Storage Services, Cloud Compute Services
(MapReduce, UDF, and other programming frameworks), Table-based Data
Services, and Relational-like Data Services, with applications (App)
running on these services.]
• Developing APIs for this framework.
Tools and Standards
• Apache Hadoop/MapReduce
• Sector/Sphere large data cloud
• Open Geospatial Consortium
– Web Map Service (WMS)
• OCC tools are open source (matsu-project)
– http://code.google.com/p/matsu-project/
Part 2: Technical Approach
• Hadoop – Lead Andrew Levine
• Hadoop with Python Streams – Lead Collin
Bennett
• Sector/Sphere – Lead Yunhong Gu
Implementation 1:
Hadoop & MapReduce
Andrew Levine
Image Processing in the Cloud - Mapper
Step 1: Input to Mapper
Mapper Input Key: Bounding Box
Mapper Input Value: [image] + Timestamp
Step 2: Processing in Mapper
Mapper resizes and/or cuts up the original
image into pieces to output Bounding Boxes
Step 3: Mapper Output (one pair per piece; the slide shows eight)
Mapper Output Key: Bounding Box
Mapper Output Value: [image piece] + Timestamp
Bounding Box shown on the slide:
(minx = -135.0 miny = 45.0 maxx = -112.5 maxy = 67.5)
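The mapper step above can be sketched in Python. This is a minimal illustration, not the project's code: it assumes the mapper cuts the input bounding box into an n-by-n grid of equal sub-boxes and emits one (bounding box, timestamp) pair per piece. The names `BBox`, `split_bbox`, and `map_tiles` are hypothetical, and the actual cropping and resizing of pixels is left out.

```python
from typing import Iterator, NamedTuple, Tuple


class BBox(NamedTuple):
    """Geographic bounding box (hypothetical helper type for this sketch)."""
    minx: float
    miny: float
    maxx: float
    maxy: float


def split_bbox(box: BBox, n: int) -> Iterator[BBox]:
    """Yield the n*n equal sub-boxes of `box`, row by row from the bottom-left."""
    dx = (box.maxx - box.minx) / n
    dy = (box.maxy - box.miny) / n
    for row in range(n):
        for col in range(n):
            x0 = box.minx + col * dx
            y0 = box.miny + row * dy
            yield BBox(x0, y0, x0 + dx, y0 + dy)


def map_tiles(box: BBox, timestamp: str, n: int) -> Iterator[Tuple[BBox, str]]:
    """Emit one (key, value) pair per sub-box.

    In this sketch the value carries only the timestamp; a real mapper
    would also attach the cropped image piece.
    """
    for tile in split_bbox(box, n):
        yield tile, timestamp
```

For example, `list(map_tiles(BBox(-135.0, 45.0, -112.5, 67.5), "2010-01-13", 2))` yields four pairs, each keyed by a quarter of the box shown on the slide; the timestamp string is made up for illustration.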
Image Processing in the Cloud - Reducer
Step 1: Input to Reducer
Reducer Key Input: Bounding Box
(minx = -45.0 miny = -2.8125 maxx = -43.59375 maxy = -2.109375)
Reducer Value Input: [images] … …
Step 2: Process difference in Reducer
Assemble images based on timestamps and compare. The result is a delta of the two images.
Step 3: Reducer Output
All images go to different map layers: a set of images for display in WMS
• Timestamp 1 Set
• Timestamp 2 Set
• Delta Set
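The reducer's compare step could look roughly like the following sketch. It rests on several assumptions not stated on the slide: each image is a list of rows of grayscale integers, exactly two timestamps arrive per bounding box, timestamps sort correctly as strings, and the "delta" is a per-pixel absolute difference. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # rows of grayscale pixel values (assumed layout)


def delta(before: Image, after: Image) -> Image:
    """Per-pixel absolute difference of two equally sized images."""
    same_rows = len(before) == len(after)
    if not same_rows or any(len(a) != len(b) for a, b in zip(before, after)):
        raise ValueError("images must have the same dimensions")
    return [
        [abs(p_after - p_before) for p_before, p_after in zip(row_before, row_after)]
        for row_before, row_after in zip(before, after)
    ]


def reduce_bbox(values: List[Tuple[str, Image]]) -> Dict[str, Image]:
    """Turn the (timestamp, image) values for one bounding box into three layers."""
    if len(values) != 2:
        raise ValueError("this sketch expects exactly two timestamps per bounding box")
    # Order by timestamp so that layer 1 is always the earlier image.
    (_, earlier), (_, later) = sorted(values, key=lambda v: v[0])
    return {
        "timestamp_1": earlier,
        "timestamp_2": later,
        "delta": delta(earlier, later),
    }
```

The three dictionary entries correspond to the Timestamp 1 Set, Timestamp 2 Set, and Delta Set that the slide sends to separate map layers.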
Implementation 2:
Hadoop & Python Streams
Collin Bennett
Preprocessing Step
• All images (in a batch to be processed) are
combined into a single file.
• Each line contains the image’s byte array
transformed to pixels (raw bytes don’t seem
to work well with the one-line-at-a-time
Hadoop streaming paradigm).
geolocation \t timestamp | tuple size ; image width ; image height ;
comma-separated list of pixels
(The fields shown in red on the original slide are metadata needed to
process the image in the reducer.)
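A minimal sketch of this one-line-per-image record follows. It assumes the key and value are separated by a tab (Hadoop streaming's default key separator), that "tuple size" means the number of values per pixel, and that the separators are written without surrounding spaces. `encode_record` and `decode_record` are hypothetical names.

```python
from typing import List, Tuple

Record = Tuple[str, str, int, int, int, List[int]]


def encode_record(geolocation: str, timestamp: str, tuple_size: int,
                  width: int, height: int, pixels: List[int]) -> str:
    """Serialize one image as a single text line."""
    if len(pixels) != tuple_size * width * height:
        raise ValueError("pixel count does not match tuple_size * width * height")
    body = ",".join(str(p) for p in pixels)
    # Layout: key <TAB> timestamp | tuple size ; width ; height ; pixels
    return f"{geolocation}\t{timestamp}|{tuple_size};{width};{height};{body}"


def decode_record(line: str) -> Record:
    """Parse a line produced by encode_record back into its fields."""
    geolocation, value = line.rstrip("\n").split("\t", 1)
    timestamp, payload = value.split("|", 1)
    tuple_size, width, height, body = payload.split(";", 3)
    pixels = [int(p) for p in body.split(",")] if body else []
    return geolocation, timestamp, int(tuple_size), int(width), int(height), pixels
```

Splitting on the tab first means the geolocation key may itself contain commas without confusing the parser.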
Map and Shuffle
• We can use the identity mapper
• All of the work for mapping was
done in the pre-process step
• Map / Shuffle key is the geolocation
• In the reducer, the timestamp will be
1st field of each record when
splitting on ‘|’
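The reducer side of this arrangement can be sketched as below. It assumes tab-separated key and value lines, input that is already sorted by key (as Hadoop streaming delivers it to a reducer), and timestamps that sort correctly as strings. The function names are hypothetical, and the image payload is passed through untouched.

```python
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple


def parse(line: str) -> Tuple[str, str, str]:
    """Split a line into (geolocation key, timestamp, remaining payload)."""
    key, value = line.rstrip("\n").split("\t", 1)
    timestamp, payload = value.split("|", 1)  # timestamp is the first field
    return key, timestamp, payload


def group_by_geolocation(
    lines: Iterable[str],
) -> Iterator[Tuple[str, List[Tuple[str, str]]]]:
    """Group key-sorted lines by geolocation, ordering each group by timestamp."""
    records = (parse(line) for line in lines if line.strip())
    for key, group in groupby(records, key=lambda r: r[0]):
        yield key, sorted((ts, payload) for _, ts, payload in group)


def run(stdin: Iterable[str], stdout) -> None:
    """Write one line per geolocation listing its timestamps in order."""
    for key, items in group_by_geolocation(stdin):
        stdout.write(key + "\t" + ",".join(ts for ts, _ in items) + "\n")
```

In a streaming job the script would call `run(sys.stdin, sys.stdout)`; a real reducer would compare the images in each group rather than only listing their timestamps.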
Implementation 3:
Sector/Sphere
Yunhong Gu
Sector Distributed File System
• Sector aggregates hard disk storage across
commodity computers
– With single namespace, file system level reliability
(using replication), high availability
• Sector does not split files
– A single image will not be split, therefore when it
is being processed, the application does not need
to read the data from other nodes via network
– A directory can be kept together on a single node
as well, as an option
Sphere UDF
• Sphere allows a User Defined Function to be
applied to each file (whether it holds a single image
or multiple images)
• Existing applications can be wrapped up in a
Sphere UDF
• In many situations, the Sphere streaming utility
accepts a data directory and an application
binary as inputs
• ./stream -i haiti -c ossim_foo -o results
For More Information
info@opencloudconsortium.org
www.opencloudconsortium.org
