www.hdfgroup.org
The HDF Group
A Brief Introduction to HDF5
Quincey Koziol
Director of Core Software and HPC
The HDF Group
koziol@hdfgroup.org
March 5, 2015 HPC Oil & Gas Workshop
http://bit.ly/HDF5-HPCOGW-2015
Why use HDF5?
• Challenging Data:
• Application data that pushes the limits of traditional
solutions.
• Software Solutions:
• For very large and/or complex data
• With very fast access requirements
• Easily share data across platforms
• Use different programming languages and OSs.
• Take advantage of the tools that understand HDF5.
• Enable long-term preservation of data.
HDF5 is like …
What is HDF5?
• HDF5 == Hierarchical Data Format, v5
• A flexible data model
• Structures for data organization and specification
• Open source software
• Implements the data model
• Portable file format
• Designed for high volume or complex data
HDF5 Data Model
• Groups – provide structure among objects
• Datasets – where the primary data goes
• Data arrays
• Rich set of datatype options
• Flexible, efficient storage and I/O
• Attributes - for metadata
Everything else is built essentially from
these parts.
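As a rough illustration of the three building blocks above, the sketch below uses the h5py Python bindings. That choice is an assumption, since the slides do not name a language binding. The group name `survey_001`, the dataset name `traces`, and the attribute values are hypothetical.

```python
import h5py
import numpy as np


def build_example():
    """Create a group, a dataset, and attributes, then summarize what was stored."""
    # driver="core" with backing_store=False keeps the file in memory only.
    with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
        # Group: gives structure to the objects in the file.
        survey = f.create_group("survey_001")
        survey.attrs["acquired"] = "2015-03-05"  # attribute = metadata

        # Dataset: the primary data, here a 3 x 4 array of 32-bit floats.
        values = np.arange(12, dtype="f4").reshape(3, 4)
        traces = survey.create_dataset("traces", data=values)
        traces.attrs["units"] = "mV"

        # Walk the hierarchy and collect every object path.
        names = []
        f.visit(names.append)

        return {
            "objects": names,
            "shape": traces.shape,
            "units": traces.attrs["units"],
            "total": float(traces[...].sum()),
        }


if __name__ == "__main__":
    print(build_example())
```

The same layout could be inspected from the command line with h5ls or h5dump if the file were written to disk instead of kept in memory.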
HDF5 Software
HDF5 home page:
http://hdfgroup.org/HDF5/
Useful Tools For New Users
h5dump, h5ls:
Tools to “dump” or list the contents of an HDF5 file
HDFView:
Java browser for HDF5 files
http://www.hdfgroup.org/hdf-java-html/hdfview/
HDF5 Examples (C, Fortran, Java, Python, Matlab)
http://www.hdfgroup.org/ftp/HDF5/examples/
h5cc, h5c++, h5fc:
Scripts to compile applications
Recent HPC Success Story
• Performance results on Blue Waters @ NCSA
• I/O Kernel of a DOE Plasma Physics
application
• Running on 298,048 cores
• ~10 Trillion particles
• Single 291TB HDF5 file
• Achieved 52 GB/s
• ~50% of the peak performance
• Using 1 GB stripe size and 160 Lustre OSTs
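To put these figures in perspective, here is a back-of-the-envelope calculation. It assumes decimal units (1 TB = 1000 GB) and that 52 GB/s was sustained over the whole 291 TB file. Neither assumption is stated on the slide, so the derived numbers are only rough estimates.

```python
# Figures quoted on the slide.
FILE_SIZE_TB = 291
WRITE_RATE_GB_S = 52
FRACTION_OF_PEAK = 0.5   # "~50% of the peak performance"
PARTICLES = 10e12        # "~10 Trillion particles"
CORES = 298_048
LUSTRE_OSTS = 160


def blue_waters_estimates():
    """Derive rough per-run quantities from the slide's headline numbers."""
    file_size_gb = FILE_SIZE_TB * 1000          # assumption: decimal units
    seconds = file_size_gb / WRITE_RATE_GB_S    # assumption: sustained rate
    return {
        "write_minutes": seconds / 60,
        "implied_peak_gb_s": WRITE_RATE_GB_S / FRACTION_OF_PEAK,
        "gb_s_per_ost": WRITE_RATE_GB_S / LUSTRE_OSTS,
        "bytes_per_particle": FILE_SIZE_TB * 1e12 / PARTICLES,
        "particles_per_core": PARTICLES / CORES,
    }


if __name__ == "__main__":
    for name, value in blue_waters_estimates().items():
        print(f"{name}: {value:,.3f}")
```

Under those assumptions the file takes roughly 93 minutes to write, the implied peak is about 104 GB/s, and each particle accounts for about 29 bytes.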
HDF5 in Oil & Gas
• RESQML: Standard for reservoir data (Energistics)
• http://www.energistics.org/reservoir/resqml-standards/current-standards
• H5EM-TS: Exchange standard for field EM data (EMGS, Statoil, Interaction)
• ftp://fileformats.emgs.com/H5EM-TS_1.0/documentation/H5EM-TS_information_sheet.pdf
• TEMHDF: Exchange standard for MetalMapper and other EMI data
• ftp://geom.geometrics.com/pub/Data/TEM2H5_Deliverables/TEM2HDF_RefManual.pdf
• PH5: Archival format for active source seismic data (moving away from SEG-Y, to HDF5)
• http://www.passcal.nmt.edu/content/ph5-what-it
• Petrel: E&P Workflow and Visualization
• http://www.software.slb.com/products/platform/Pages/petrel.aspx
• Globe Claritas: HDF5 is the format for their seismic processing software
• SEG-Y vs. HDF5 Whitepaper: http://www.globeclaritas.com/content/download/10303/55223/file/HDF5%20For%20Seismic%20Reflection%20Datasets.pdf
• News release: http://www.globeclaritas.com/Claritas/Overview/Latest-Release
• PDF data sheet: http://www.globeclaritas.com/content/download/8839/47774/file/Claritas%20HDF5.pdf
• Powerpoint: http://www.slideshare.net/guy_maslen/a-quick-start-guide-to-using-hdf5-in-globe-claritas
Where We’ll Be Soon: HDF5 1.10
• Beta release: Fall 2015
• Major Features:
• Single-Writer/Multiple-Reader (SWMR)
• Virtual Datasets
• Improved scalability of chunked datasets
• Parallel I/O performance and capabilities
Other Items of Interest
• We’re not planning to change current
multi-threaded concurrency behavior
• HDF5 Excel Add-in: HEXAD
• REST-based service for HDF5 data
• HDF Compass visualization package
The HDF Group
Thank You!
Questions & Comments?
The HDF Group Services
• Helpdesk and Mailing Lists
• Available to all users as a first level of support:
help@hdfgroup.org, hdf-forum@lists.hdfgroup.org
• Priority Support
• Rapid issue resolution and advice
• Consulting
• Needs assessment, troubleshooting, design reviews, etc.
• Training
• Tutorials and hands-on practical experience
• Enterprise Support
• Coordinate HDF activities across departments
• Special Projects
• Adapting customer applications to HDF
• New features and tools
• Research and Development
HDF5 1.10 Planned Features: SWMR
• Improves HDF5 for Data Acquisition:
• Allows simultaneous data gathering and
monitoring/analysis
• Focused on storing data sequences for
high-speed data sources
• Supports ‘Ordered Updates’ to file:
• Crash-proofs accessing HDF5 file
• Possibly uses small amount of extra space
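For readers who want to see what this access pattern might look like in practice, here is a minimal sketch using h5py's SWMR support. It assumes an HDF5 1.10 or later build, which postdates these slides, so treat it as an illustration rather than the API as planned at the time. The dataset name `samples` and the chunk size are hypothetical.

```python
import os
import tempfile

import h5py
import numpy as np


def write_samples(path, batches):
    """Writer side: append batches to a 1-D dataset that readers can follow."""
    with h5py.File(path, "w", libver="latest") as f:
        dset = f.create_dataset("samples", shape=(0,), maxshape=(None,),
                                chunks=(1024,), dtype="f8")
        # All objects must exist before SWMR mode is switched on.
        f.swmr_mode = True
        for batch in batches:
            start = dset.shape[0]
            dset.resize((start + len(batch),))
            dset[start:] = batch
            dset.flush()  # make the appended data visible to readers


def read_samples(path):
    """Reader side: intended to run in a separate process from the writer."""
    with h5py.File(path, "r", libver="latest", swmr=True) as f:
        dset = f["samples"]
        dset.refresh()  # pick up whatever the writer has flushed so far
        return dset[...]


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "swmr_demo.h5")
    write_samples(demo_path, [np.arange(3.0), np.arange(3.0, 5.0)])
    print(read_samples(demo_path))
```

For simplicity the demo runs the reader after the writer has finished. In a data-acquisition setting the reader would be a monitoring process that calls `refresh()` periodically while the writer is still appending.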
HDF5 1.10 Planned Features
• Virtual Object Layer (VOL)
• Provides the HDF5 data model and API, but
allows different underlying storage
mechanisms
• Intercepts all HDF5 API calls that can touch
the data on disk and routes them to a VOL
plugin
• Possibly SEG-Y VOL plugin?
HDF5 1.10 Planned Features
• ‘Virtual’ Datasets
• Can “stitch together” multiple ‘source’
datasets into a single ‘virtual’ dataset
• Supports unlimited dimensions in both source
and virtual datasets
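The "stitching" idea can be sketched with h5py's `VirtualLayout` and `VirtualSource` classes. These require an HDF5 1.10 or later build and a recent h5py, both of which postdate these slides. The file names, the dataset names, and the fixed (not unlimited) dimensions are assumptions chosen to keep the example small.

```python
import os
import tempfile

import h5py
import numpy as np


def build_virtual_dataset(workdir, n_sources=4, length=100):
    """Write n_sources 1-D source files and one file that maps them into a 2-D array."""
    layout = h5py.VirtualLayout(shape=(n_sources, length), dtype="i4")
    for n in range(n_sources):
        source_path = os.path.join(workdir, f"source_{n}.h5")
        with h5py.File(source_path, "w") as f:
            # Each source file holds one row, filled with its own index.
            f.create_dataset("data", data=np.full(length, n, dtype="i4"))
        # Map row n of the virtual dataset onto the whole source dataset.
        layout[n] = h5py.VirtualSource(source_path, "data", shape=(length,))

    vds_path = os.path.join(workdir, "stitched.h5")
    with h5py.File(vds_path, "w", libver="latest") as f:
        # fillvalue is returned for regions whose source file is missing.
        f.create_virtual_dataset("stitched", layout, fillvalue=-1)
    return vds_path


if __name__ == "__main__":
    path = build_virtual_dataset(tempfile.mkdtemp())
    with h5py.File(path, "r") as f:
        print(f["stitched"][:, :3])
```

Reading `stitched` looks the same as reading an ordinary dataset. The library resolves each row to its source file behind the scenes.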
HDF5 1.10 Planned Features: Chunk Imp.
Dataset type | Index type | Space improvements | Speed improvements
No unlimited dimensions, no I/O filters, no missing chunks | “implicit” (no actual chunk index) | Same storage space as contiguous dataset storage (no index) | Constant time lookups; faster parallel I/O
No unlimited dimensions | “fixed sized” (smaller chunk index) | Smaller index overhead | Constant time lookups
1 unlimited dimension | “extensible array” | Smaller index overhead | Constant time lookups and appends
2+ unlimited dimensions | Improved B-tree* | Smaller index overhead | Faster
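One way to see why an “implicit” index can offer constant time lookups: if every chunk exists, has the same size, and is stored back to back, a chunk's location follows from arithmetic alone. The sketch below illustrates that idea only. It is not the HDF5 file format's actual layout, and the function name and parameters are hypothetical.

```python
from math import ceil, prod


def implicit_chunk_offset(coords, dataset_shape, chunk_shape, itemsize, base=0):
    """Byte offset of the chunk that holds element `coords`.

    Assumes fixed dimensions, no filters (all chunks the same size), and no
    missing chunks, with chunks laid out contiguously in row-major order
    starting at `base`.
    """
    # Number of chunks along each dimension (edge chunks count as full chunks).
    chunks_per_dim = [ceil(n / c) for n, c in zip(dataset_shape, chunk_shape)]
    # Which chunk the element falls into along each dimension.
    chunk_coords = [i // c for i, c in zip(coords, chunk_shape)]
    # Row-major linearization of the chunk coordinates.
    linear = 0
    for idx, count in zip(chunk_coords, chunks_per_dim):
        linear = linear * count + idx
    chunk_bytes = prod(chunk_shape) * itemsize
    return base + linear * chunk_bytes


if __name__ == "__main__":
    # 100 x 100 dataset of 4-byte values stored in 10 x 25 chunks.
    print(implicit_chunk_offset((35, 60), (100, 100), (10, 25), itemsize=4))
```

No index structure is read or searched, so the lookup cost does not grow with the number of chunks.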
HDF5 1.10 Planned Features: HPC
• Continue to improve our use of MPI and
parallel file system features
• Remove ‘truncate’ operation on file close, etc.
• Reduce # of I/O accesses for metadata access
• Collective Read/Write of metadata
• Multi-dataset Collective I/O
• Support for compression in parallel
• Collective access mode only
• Possibly Support Single-Writer/Multiple-Reader
(SWMR) access in parallel
HDF5 Roadmap
• Concurrency
• Single-Writer/Multiple-Reader (SWMR)
• Internal threading
• Virtual Object Layer (VOL)
• Data Analysis
• Query / View / Index APIs
• Native HDF5 client/server
• Performance
• Scalable chunk indices
• Metadata aggregation and Page buffering
• Asynchronous I/O
• Variable-length records
• Fault tolerance
• Parallel I/O
• I/O Autotuning
“The best way to predict the
future is to invent it.”
– Alan Kay
Where We’re Not Going
• We’re not changing multi-threaded
concurrency support
• Keep “global lock” on library
• Will focus on asynchronous I/O instead
• Will be using threads internally though
Codename “HEXAD”
• HDF5 Excel Add-in: HEXAD
• Lets you do the usual things including:
• Display content (file structure, detailed object info)
• Create/read/write datasets
• Create/read/update attributes
• Plenty of ideas for bells & whistles
• HDF5 Image & PyTables support, etc.
• Send in your Must Have/Nice To Have list!*
• Stay tuned for the beta program
* help@hdfgroup.org
HDF Server
• REST-based service for HDF5 data
• Reference Implementation for REST API
• Developed in Python using Tornado Framework
• Supports Read/Write operations
• Clients can be Python/C/Fortran or Web Page
• Let us know what specific features you’d like to
see.
HDF Compass
• “Simple” Python HDF5 Viewer application
• Cross platform (Windows/Mac/Linux)
• Native look and feel
• Can display extremely large HDF5 files
• View HDF5 files and OPeNDAP resources
• Plugin model enables different file
formats/remote resources to be supported
• Community-based development model
Brief History of HDF
1987 NCSA (University of Illinois) forms a task force to create an architecture-independent file format and library, which becomes HDF
Early 1990’s NASA adopts HDF for Earth Observing System project
1996 DOE collaborates with the HDF group (at NCSA) to create “Big HDF”, which becomes HDF5
1998 HDF5 released, with support from DOE, NASA & NCSA
2006 The HDF Group spins out of University of Illinois as non-profit corporation
The HDF Group
• Established in 1988
• 18 years at University of Illinois’ National Center
for Supercomputing Applications
• 8 years as independent non-profit company:
“The HDF Group”
• The HDF Group owns HDF4 and HDF5
• HDF4 & HDF5 formats, libraries, and tools are
open source and freely available with BSD-style
license

Más contenido relacionado

La actualidad más candente

Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsTimothy Spann
 
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...DataWorks Summit
 
Productionizing Spark ML pipelines with the portable format for analytics
Productionizing Spark ML pipelines with the portable format for analyticsProductionizing Spark ML pipelines with the portable format for analytics
Productionizing Spark ML pipelines with the portable format for analyticsDataWorks Summit
 
NiFi Best Practices for the Enterprise
NiFi Best Practices for the EnterpriseNiFi Best Practices for the Enterprise
NiFi Best Practices for the EnterpriseGregory Keys
 
Pivotal Strata NYC 2015 Apache HAWQ Launch
Pivotal Strata NYC 2015 Apache HAWQ LaunchPivotal Strata NYC 2015 Apache HAWQ Launch
Pivotal Strata NYC 2015 Apache HAWQ LaunchVMware Tanzu
 
Apache NiFi: Ingesting Enterprise Data At Scale
Apache NiFi:   Ingesting Enterprise Data At Scale Apache NiFi:   Ingesting Enterprise Data At Scale
Apache NiFi: Ingesting Enterprise Data At Scale Timothy Spann
 
Present and future of unified, portable and efficient data processing with Ap...
Present and future of unified, portable and efficient data processing with Ap...Present and future of unified, portable and efficient data processing with Ap...
Present and future of unified, portable and efficient data processing with Ap...DataWorks Summit
 
Introduction to data flow management using apache nifi
Introduction to data flow management using apache nifiIntroduction to data flow management using apache nifi
Introduction to data flow management using apache nifiAnshuman Ghosh
 
Data processing at the speed of 100 Gbps@Apache Crail (Incubating)
Data processing at the speed of 100 Gbps@Apache Crail (Incubating)Data processing at the speed of 100 Gbps@Apache Crail (Incubating)
Data processing at the speed of 100 Gbps@Apache Crail (Incubating)DataWorks Summit
 
Manage democratization of the data - Data Replication in Hadoop
Manage democratization of the data - Data Replication in HadoopManage democratization of the data - Data Replication in Hadoop
Manage democratization of the data - Data Replication in HadoopDataWorks Summit
 

La actualidad más candente (20)

Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration Options
 
HDF Project Status and Plans
HDF Project Status and PlansHDF Project Status and Plans
HDF Project Status and Plans
 
HDF Project Update
HDF Project UpdateHDF Project Update
HDF Project Update
 
HDF Studio
HDF StudioHDF Studio
HDF Studio
 
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...
 
Productionizing Spark ML pipelines with the portable format for analytics
Productionizing Spark ML pipelines with the portable format for analyticsProductionizing Spark ML pipelines with the portable format for analytics
Productionizing Spark ML pipelines with the portable format for analytics
 
2011 ACSI Survey Summary
2011 ACSI Survey Summary2011 ACSI Survey Summary
2011 ACSI Survey Summary
 
Apache deep learning 101
Apache deep learning 101Apache deep learning 101
Apache deep learning 101
 
Support for NPP/NPOESS by The HDF Group
Support for NPP/NPOESS by The HDF GroupSupport for NPP/NPOESS by The HDF Group
Support for NPP/NPOESS by The HDF Group
 
NiFi Best Practices for the Enterprise
NiFi Best Practices for the EnterpriseNiFi Best Practices for the Enterprise
NiFi Best Practices for the Enterprise
 
Pivotal Strata NYC 2015 Apache HAWQ Launch
Pivotal Strata NYC 2015 Apache HAWQ LaunchPivotal Strata NYC 2015 Apache HAWQ Launch
Pivotal Strata NYC 2015 Apache HAWQ Launch
 
Apache NiFi: Ingesting Enterprise Data At Scale
Apache NiFi:   Ingesting Enterprise Data At Scale Apache NiFi:   Ingesting Enterprise Data At Scale
Apache NiFi: Ingesting Enterprise Data At Scale
 
Present and future of unified, portable and efficient data processing with Ap...
Present and future of unified, portable and efficient data processing with Ap...Present and future of unified, portable and efficient data processing with Ap...
Present and future of unified, portable and efficient data processing with Ap...
 
HDF OPeNDAP update
HDF OPeNDAP updateHDF OPeNDAP update
HDF OPeNDAP update
 
Introduction to data flow management using apache nifi
Introduction to data flow management using apache nifiIntroduction to data flow management using apache nifi
Introduction to data flow management using apache nifi
 
HDF Updae
HDF UpdaeHDF Updae
HDF Updae
 
Data processing at the speed of 100 Gbps@Apache Crail (Incubating)
Data processing at the speed of 100 Gbps@Apache Crail (Incubating)Data processing at the speed of 100 Gbps@Apache Crail (Incubating)
Data processing at the speed of 100 Gbps@Apache Crail (Incubating)
 
HDF and ENVI Services Engine
HDF and ENVI Services EngineHDF and ENVI Services Engine
HDF and ENVI Services Engine
 
Manage democratization of the data - Data Replication in Hadoop
Manage democratization of the data - Data Replication in HadoopManage democratization of the data - Data Replication in Hadoop
Manage democratization of the data - Data Replication in Hadoop
 
The Elephant in the Clouds
The Elephant in the CloudsThe Elephant in the Clouds
The Elephant in the Clouds
 

Destacado

Destacado (20)

HDF Update 2016
HDF Update 2016HDF Update 2016
HDF Update 2016
 
NEON HDF5
NEON HDF5NEON HDF5
NEON HDF5
 
ICESat-2 Metadata and Status
ICESat-2 Metadata and StatusICESat-2 Metadata and Status
ICESat-2 Metadata and Status
 
Breakthrough Listen
Breakthrough ListenBreakthrough Listen
Breakthrough Listen
 
Pilot Project for HDF5 Metadata Structures for SWOT
Pilot Project for HDF5 Metadata Structures for SWOTPilot Project for HDF5 Metadata Structures for SWOT
Pilot Project for HDF5 Metadata Structures for SWOT
 
HDF Cloud Services
HDF Cloud ServicesHDF Cloud Services
HDF Cloud Services
 
Utilizing HDF4 File Content Maps for the Cloud Computing
Utilizing HDF4 File Content Maps for the Cloud ComputingUtilizing HDF4 File Content Maps for the Cloud Computing
Utilizing HDF4 File Content Maps for the Cloud Computing
 
Using visualization tools to access HDF data via OPeNDAP
Using visualization tools to access HDF data via OPeNDAP Using visualization tools to access HDF data via OPeNDAP
Using visualization tools to access HDF data via OPeNDAP
 
Scientific Computing and Visualization using HDF
Scientific Computing and Visualization using HDFScientific Computing and Visualization using HDF
Scientific Computing and Visualization using HDF
 
SPD and KEA: HDF5 based file formats for Earth Observation
SPD and KEA: HDF5 based file formats for Earth ObservationSPD and KEA: HDF5 based file formats for Earth Observation
SPD and KEA: HDF5 based file formats for Earth Observation
 
EOSDIS Status
EOSDIS StatusEOSDIS Status
EOSDIS Status
 
HDF Update
HDF UpdateHDF Update
HDF Update
 
The CFD General Notation System transition to HDF5
The CFD General Notation System transition to HDF5The CFD General Notation System transition to HDF5
The CFD General Notation System transition to HDF5
 
Status of HDF-EOS, Related Software, and Tools
Status of HDF-EOS, Related Software, and ToolsStatus of HDF-EOS, Related Software, and Tools
Status of HDF-EOS, Related Software, and Tools
 
Migrating from HDF5 1.6 to 1.8
Migrating from HDF5 1.6 to 1.8Migrating from HDF5 1.6 to 1.8
Migrating from HDF5 1.6 to 1.8
 
What will be new in HDF5?
What will be new in HDF5?What will be new in HDF5?
What will be new in HDF5?
 
HDF and HDF-EOS Experiences and Applications
HDF and HDF-EOS Experiences and ApplicationsHDF and HDF-EOS Experiences and Applications
HDF and HDF-EOS Experiences and Applications
 
Profile of HDF-EOS5 Files
Profile of HDF-EOS5 FilesProfile of HDF-EOS5 Files
Profile of HDF-EOS5 Files
 
Workshop Discussion: HDF & HDF-EOS Future Direction
Workshop Discussion: HDF & HDF-EOS Future DirectionWorkshop Discussion: HDF & HDF-EOS Future Direction
Workshop Discussion: HDF & HDF-EOS Future Direction
 
ENVI/IDL for HDF
ENVI/IDL for HDFENVI/IDL for HDF
ENVI/IDL for HDF
 

Similar a Hdf5 current future

Stinger.Next by Alan Gates of Hortonworks
Stinger.Next by Alan Gates of HortonworksStinger.Next by Alan Gates of Hortonworks
Stinger.Next by Alan Gates of HortonworksData Con LA
 
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...Global Business Events
 

Similar a Hdf5 current future (20)

Parallel HDF5 Developments
Parallel HDF5 DevelopmentsParallel HDF5 Developments
Parallel HDF5 Developments
 
HDF Update
HDF UpdateHDF Update
HDF Update
 
HDF Update
HDF UpdateHDF Update
HDF Update
 
HDF Update for DAAC Managers (2017-02-27)
HDF Update for DAAC Managers (2017-02-27)HDF Update for DAAC Managers (2017-02-27)
HDF Update for DAAC Managers (2017-02-27)
 
HDF Product Designer
HDF Product DesignerHDF Product Designer
HDF Product Designer
 
Support for NPP/NPOESS/JPSS by The HDF Group
 Support for NPP/NPOESS/JPSS by The HDF Group Support for NPP/NPOESS/JPSS by The HDF Group
Support for NPP/NPOESS/JPSS by The HDF Group
 
HDF Project Update
HDF Project UpdateHDF Project Update
HDF Project Update
 
HDF Product Designer
HDF Product DesignerHDF Product Designer
HDF Product Designer
 
HDF Update
HDF UpdateHDF Update
HDF Update
 
HDF Update
HDF UpdateHDF Update
HDF Update
 
Update on HDF5 1.8
Update on HDF5 1.8Update on HDF5 1.8
Update on HDF5 1.8
 
HDF Update
HDF UpdateHDF Update
HDF Update
 
HDF5 and Ecosystem: What Is New?
HDF5 and Ecosystem: What Is New?HDF5 and Ecosystem: What Is New?
HDF5 and Ecosystem: What Is New?
 
HDF OPeNDAP project update and demo
HDF OPeNDAP project update and demoHDF OPeNDAP project update and demo
HDF OPeNDAP project update and demo
 
Introduction to HDF5 Data and Programming Models
Introduction to HDF5 Data and Programming ModelsIntroduction to HDF5 Data and Programming Models
Introduction to HDF5 Data and Programming Models
 
Stinger.Next by Alan Gates of Hortonworks
Stinger.Next by Alan Gates of HortonworksStinger.Next by Alan Gates of Hortonworks
Stinger.Next by Alan Gates of Hortonworks
 
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
 
HDF Cloud: HDF5 at Scale
HDF Cloud: HDF5 at ScaleHDF Cloud: HDF5 at Scale
HDF Cloud: HDF5 at Scale
 
HDF5 OPeNDAP project update and demo
HDF5 OPeNDAP project update and demoHDF5 OPeNDAP project update and demo
HDF5 OPeNDAP project update and demo
 
Transition from HDF4 to HDF5
Transition from HDF4 to HDF5 Transition from HDF4 to HDF5
Transition from HDF4 to HDF5
 

Último

Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 

Último (20)

Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 

Hdf5 current future

  • 1. www.hdfgroup.org The HDF Group A Brief Introduction to HDF5 Quincey Koziol Director of Core Software and HPC The HDF Group koziol@hdfgroup.org March 5, 2015 1HPC Oil & Gas Workshop http://bit.ly/HDF5-HPCOGW-2015
  • 2. Why use HDF5? • Challenging Data: • Application data that pushes the limits of traditional solutions. • Software Solutions: • For very large and/or complex data • With very fast access requirements • Easily share data across platforms • Use different programming languages and OSs. • Take advantage of the tools that understand HDF5. • Enable long-term preservation of data.
  • 3. HDF5 is like …
  • 4. What is HDF5? • HDF5 == Hierarchical Data Format, v5 • A flexible data model • Structures for data organization and specification • Open source software • Implements the data model • Portable file format • Designed for high volume or complex data
  • 5. HDF5 Data Model • Groups – provide structure among objects • Datasets – where the primary data goes • Data arrays • Rich set of datatype options • Flexible, efficient storage and I/O • Attributes - for metadata Everything else is built essentially from these parts.
  • 6. HDF5 Software • HDF5 home page: http://hdfgroup.org/HDF5/
  • 7. Useful Tools For New Users • h5dump, h5ls: Tools to “dump” or list contents of HDF5 file • HDFView: Java browser for HDF5 files http://www.hdfgroup.org/hdf-java-html/hdfview/ • HDF5 Examples (C, Fortran, Java, Python, Matlab) http://www.hdfgroup.org/ftp/HDF5/examples/ • h5cc, h5c++, h5fc: Scripts to compile applications
  • 8. Recent HPC Success Story • Performance results on Blue Waters @ NCSA • I/O Kernel of a DOE Plasma Physics application • Running on 298,048 cores • ~10 Trillion particles • Single 291TB HDF5 file • Achieved 52 GB/s • ~50% of the peak performance • Using 1 GB stripe size and 160 Lustre OSTs
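The figures on the success-story slide can be put in perspective with a little arithmetic. The sketch below is only a back-of-the-envelope calculation from the quoted numbers. It assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes), assumes the whole file moved at the sustained 52 GB/s rate, and assumes the load was spread evenly over cores and OSTs. None of these assumptions is stated on the slide.

```python
# Back-of-the-envelope arithmetic from the slide's figures.
# Assumptions (not from the slide): decimal units, a sustained rate for the
# whole file, and an even spread of traffic across cores and OSTs.
FILE_SIZE_BYTES = 291e12     # single 291 TB HDF5 file
PARTICLES = 10e12            # ~10 trillion particles
BANDWIDTH_BYTES_S = 52e9     # achieved 52 GB/s
FRACTION_OF_PEAK = 0.50      # ~50% of the peak performance
LUSTRE_OSTS = 160            # Lustre OSTs used
CORES = 298_048              # cores running the I/O kernel

bytes_per_particle = FILE_SIZE_BYTES / PARTICLES
write_time_minutes = FILE_SIZE_BYTES / BANDWIDTH_BYTES_S / 60
implied_peak_gb_s = BANDWIDTH_BYTES_S / FRACTION_OF_PEAK / 1e9
per_ost_mb_s = BANDWIDTH_BYTES_S / LUSTRE_OSTS / 1e6
per_core_kb_s = BANDWIDTH_BYTES_S / CORES / 1e3

print(f"bytes per particle : {bytes_per_particle:.1f}")
print(f"time for whole file: {write_time_minutes:.0f} min")
print(f"implied peak       : {implied_peak_gb_s:.0f} GB/s")
print(f"per-OST rate       : {per_ost_mb_s:.0f} MB/s")
print(f"per-core rate      : {per_core_kb_s:.1f} kB/s")
```

Under these assumptions each particle occupies roughly 29 bytes, the file takes about an hour and a half to move, and each OST sustains a few hundred MB/s.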
  • 9. HDF5 in Oil & Gas • RESQML: Standard for reservoir data (Energistics) • http://www.energistics.org/reservoir/resqml-standards/current-standards • H5EM-TS: Exchange standard for field EM data (EMGS, Statoil, Interaction) • ftp://fileformats.emgs.com/H5EM-TS_1.0/documentation/H5EM-TS_information_sheet.pdf
  • 10. HDF5 in Oil & Gas • TEMHDF: Exchange standard for MetalMapper and other EMI data • ftp://geom.geometrics.com/pub/Data/TEM2H5_Deliverables/TEM2HDF_RefManual.pdf • PH5: Archival format for active source seismic data (moving away from SEG-Y, to HDF5) • http://www.passcal.nmt.edu/content/ph5-what-it • Petrel: E&P Workflow and Visualization • http://www.software.slb.com/products/platform/Pages/petrel.aspx
  • 11. HDF5 in Oil & Gas • Globe Claritas: HDF5 is the format for their seismic processing software • SEG-Y vs. HDF5 Whitepaper: http://www.globeclaritas.com/content/download/10303/55223/file/HDF5%20For%20Seismic%20Reflection%20Datasets.pdf • News release: http://www.globeclaritas.com/Claritas/Overview/Latest-Release • PDF data sheet: http://www.globeclaritas.com/content/download/8839/47774/file/Claritas%20HDF5.pdf • Powerpoint: http://www.slideshare.net/guy_maslen/a-quick-start-guide-to-using-hdf5-in-globe-claritas
  • 12. Where We’ll Be Soon: HDF5 1.10 • Beta release: Fall 2015 • Major Features: • Single-Writer/Multiple-Reader (SWMR) • Virtual Datasets • Improved scalability of chunked datasets • Parallel I/O performance and capabilities
  • 13. Other Items of Interest • We’re not planning to change current multi-threaded concurrency behavior • HDF5 Excel Add-in: HEXAD • REST-based service for HDF5 data • HDF Compass visualization package
  • 14. The HDF Group Thank You! Questions & Comments?
  • 15. The HDF Group Services • Helpdesk and Mailing Lists • Available to all users as a first level of support: help@hdfgroup.org, hdf-forum@lists.hdfgroup.org • Priority Support • Rapid issue resolution and advice • Consulting • Needs assessment, troubleshooting, design reviews, etc. • Training • Tutorials and hands-on practical experience • Enterprise Support • Coordinate HDF activities across departments • Special Projects • Adapting customer applications to HDF • New features and tools • Research and Development
  • 16. HDF5 1.10 Planned Features: SWMR • Improves HDF5 for Data Acquisition: • Allows simultaneous data gathering and monitoring/analysis • Focused on storing data sequences for high-speed data sources • Supports ‘Ordered Updates’ to file: • Crash-proofs accessing HDF5 file • Possibly uses small amount of extra space
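One way to read ‘ordered updates’ is that the writer stores new data first and only afterwards updates the metadata that makes it visible, so a reader (or a crash) never encounters metadata pointing at data that is not there yet. The toy class below sketches that idea in memory. It is an assumption-laden illustration rather than the HDF5 implementation, and the names `AppendOnlySequence`, `append`, `append_but_crash` and `read_all` are hypothetical.

```python
class AppendOnlySequence:
    """Toy single-writer/multiple-reader sequence (NOT the HDF5 implementation)."""

    def __init__(self):
        self._storage = []     # stands in for raw data blocks in the file
        self._published = 0    # stands in for the metadata readers consult

    def append(self, value):
        # Step 1: write the data itself.
        self._storage.append(value)
        # Step 2: only then publish the new length, making the data visible.
        self._published = len(self._storage)

    def append_but_crash(self, value):
        # Simulates a writer that dies between step 1 and step 2:
        # the data was written but never published.
        self._storage.append(value)

    def read_all(self):
        # A reader takes the published length first, then reads only that much.
        count = self._published
        return self._storage[:count]


seq = AppendOnlySequence()
seq.append(1.5)
seq.append(2.5)
seq.append_but_crash(99.0)   # unpublished element, costs a little extra space
visible = seq.read_all()
print(visible)
```

The unpublished element occupies storage without being reachable, which is one plausible source of the "small amount of extra space" the slide mentions.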
  • 17. HDF5 1.10 Planned Features • Virtual Object Layer (VOL) • Provides the HDF5 data model and API, but allows different underlying storage mechanisms • Intercepts all HDF5 API calls that can touch the data on disk and routes them to a VOL plugin • Possibly SEG-Y VOL plugin?
  • 18. HDF5 1.10 Planned Features • ‘Virtual’ Datasets • Can “stitch together” multiple ‘source’ datasets into a single ‘virtual’ dataset • Supports unlimited dimensions in both source and virtual datasets
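The “stitching” idea can be sketched without any HDF5 calls: a virtual dataset holds no data of its own, only a mapping from each virtual index to a (source, local index) pair. The class below is a minimal one-dimensional illustration built on plain Python lists. `VirtualSequence` is a hypothetical name and this is not the HDF5 virtual dataset API. It also fixes source lengths at construction, so it does not model unlimited dimensions.

```python
from bisect import bisect_right


class VirtualSequence:
    """Read-only 1-D view that concatenates several source sequences."""

    def __init__(self, sources):
        self._sources = list(sources)
        self._offsets = []   # virtual index of the first element of each source
        total = 0
        for source in self._sources:
            self._offsets.append(total)
            total += len(source)
        self._length = total

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if not 0 <= index < self._length:
            raise IndexError(index)
        # Find the last source whose starting offset is <= index.
        k = bisect_right(self._offsets, index) - 1
        return self._sources[k][index - self._offsets[k]]


# Three "source datasets", e.g. one per acquisition file (illustrative data).
shots_a = [10, 11]
shots_b = [20]
shots_c = [30, 31, 32]
virtual = VirtualSequence([shots_a, shots_b, shots_c])
print(len(virtual), [virtual[i] for i in range(len(virtual))])
```

No data is copied when the view is built, so each read is redirected to the owning source at access time.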
  • 19. HDF5 1.10 Planned Features: Chunk Improvements (columns: Dataset type / Index type / Space improvements / Speed improvements)
      • No unlimited dimensions, no I/O filters, no missing chunks / “implicit” (no actual chunk index) / Same storage space as contiguous dataset storage (no index) / Constant time lookups, faster parallel I/O
      • No unlimited dimensions / “fixed sized” (smaller chunk index) / Smaller index overhead / Constant time lookups
      • 1 unlimited dimension / “extensible array” / Smaller index overhead / Constant time lookups and appends
      • 2+ unlimited dimensions / Improved B-tree* / Smaller index overhead / Faster
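The “implicit” case gets constant time lookups with no index because, when every chunk exists, all chunks have the same size (no filters) and the dimensions are fixed, a chunk's address can be computed rather than looked up. The sketch below shows the arithmetic under assumed conventions: chunks stored back to back in row-major order starting at a base address. The function name and parameters are hypothetical, and the actual on-disk layout may differ.

```python
from math import prod


def implicit_chunk_address(base_addr, dataset_shape, chunk_shape,
                           element_size, chunk_coords):
    """Byte address of a chunk, assuming contiguous row-major chunk storage."""
    # Number of chunks along each dimension (integer ceiling division).
    chunks_per_dim = [-(-d // c) for d, c in zip(dataset_shape, chunk_shape)]

    # Row-major linear chunk number from the chunk's grid coordinates.
    linear = 0
    for count, coord in zip(chunks_per_dim, chunk_coords):
        if not 0 <= coord < count:
            raise IndexError(chunk_coords)
        linear = linear * count + coord

    # Every chunk has the same size, so a multiply replaces an index lookup.
    chunk_bytes = prod(chunk_shape) * element_size
    return base_addr + linear * chunk_bytes


# Illustrative numbers: a 100 x 60 dataset of 8-byte elements in 25 x 20 chunks.
addr = implicit_chunk_address(
    base_addr=4096, dataset_shape=(100, 60), chunk_shape=(25, 20),
    element_size=8, chunk_coords=(2, 1))
print(addr)
```

The cost is a handful of integer operations regardless of how many chunks the dataset has. That is the property the other index types approximate once unlimited dimensions, filters or missing chunks rule out a closed-form address.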
  • 20. HDF5 1.10 Planned Features: HPC • Continue to improve our use of MPI and parallel file system features • Remove ‘truncate’ operation on file close, etc. • Reduce # of I/O accesses for metadata access • Collective Read/Write of metadata • Multi-dataset Collective I/O • Support for compression in parallel • Collective access mode only • Possibly Support Single-Writer/Multiple-Reader (SWMR) access in parallel
  • 21. HDF5 Roadmap • Concurrency • Single-Writer/Multiple-Reader (SWMR) • Internal threading • Virtual Object Layer (VOL) • Data Analysis • Query / View / Index APIs • Native HDF5 client/server • Performance • Scalable chunk indices • Metadata aggregation and Page buffering • Asynchronous I/O • Variable-length records • Fault tolerance • Parallel I/O • I/O Autotuning “The best way to predict the future is to invent it.” – Alan Kay
  • 22. Where We’re Not Going • We’re not changing multi-threaded concurrency support • Keep “global lock” on library • Will focus on asynchronous I/O instead • Will be using threads internally though
  • 23. Codename “HEXAD” • HDF5 Excel Add-in: HEXAD • Lets you do the usual things including: • Display content (file structure, detailed object info) • Create/read/write datasets • Create/read/update attributes • Plenty of ideas for bells & whistles • HDF5 Image & PyTables support, etc. • Send in your Must Have/Nice To Have list!* • Stay tuned for the beta program * help@hdfgroup.org
  • 24. HDF Server • REST-based service for HDF5 data • Reference Implementation for REST API • Developed in Python using Tornado Framework • Supports Read/Write operations • Clients can be Python/C/Fortran or Web Page • Let us know what specific features you’d like to see.
  • 25. HDF Compass • “Simple” Python HDF5 Viewer application • Cross platform (Windows/Mac/Linux) • Native look and feel • Can display extremely large HDF5 files • View HDF5 files and OpenDAP resources • Plugin model enables different file formats/remote resources to be supported • Community-based development model
  • 26. Brief History of HDF • 1987: NCSA (University of Illinois) forms a task force to create an architecture-independent file format and library, which becomes HDF • Early 1990’s: NASA adopts HDF for Earth Observing System project • 1996: DOE collaborates with the HDF group (at NCSA) to create “Big HDF”, which becomes HDF5 • 1998: HDF5 released, with support from DOE, NASA & NCSA • 2006: The HDF Group spins out of University of Illinois as non-profit corporation
  • 27. The HDF Group • Established in 1988 • 18 years at University of Illinois’ National Center for Supercomputing Applications • 8 years as independent non-profit company: “The HDF Group” • The HDF Group owns HDF4 and HDF5 • HDF4 & HDF5 formats, libraries, and tools are open source and freely available with BSD-style license