Using HDF5 tools for
performance tuning and
troubleshooting

2/18/2014

HDF and HDF-EOS Workshop X, Landover, MD
Introduction
• HDF5 tools may be very useful for performance tuning
and troubleshooting
• Discover objects and their properties in HDF5 files
h5dump -p

• Get file size overhead information
h5stat

• Get locations of the objects in a file
h5ls

• Discover differences
h5diff, h5ls

• Location of raw data
h5ls -vra

h5stat
• Prints various statistics about an HDF5 file
• Helps
• To troubleshoot size overhead in HDF5 files
• To choose object properties and storage
strategies

• To use
h5stat --help
h5stat file.h5

• The specification can be found at
http://www.hdfgroup.org/RFC/h5stat/
• Let us know if you need some “special” type of statistics
h5stat
• Reports two types of statistics:
• High-level information about objects (examples):
• Number of different objects (groups, datasets, datatypes) in
a file
• Number of unique datatypes
• Size of raw data in a file

• Information about objects’ structural metadata
• Sizes of structural metadata (total/free)
• Object headers, local and global heaps
• Sizes of B-trees

• Object header fragmentation

h5stat
• Examples of high-level information:
File information
# of unique groups: 10008
# of unique datasets: 30
# of unique named datatypes: 0
……………………
Max. # of links to object: 1
Max. depth of hierarchy: 4
Max. # of objects in group: 19
……………………
Group bins:
# of groups of size 0: 10000
# of groups of size 1 - 9: 7
# of groups of size 10 - 99: 1
……………………

Max. dimension size of 1-D datasets: 1643
……………………
Dataset filters information:
Number of datasets with

………………
SZIP filter: 2
………………
NBIT filter: 10
USER-DEFINED filter: 1
h5stat
• Conclusions:
• There are a lot of empty groups in the file; it is a good candidate for
the compact group feature
• Some datasets use “user-defined” filters and may not be readable by
the HDF5 library
• SZIP compression is needed to read some datasets

Oh… the largest 1-D dataset has 1643 elements, but my application uses buffers of size 1024 to read data…
No wonder it crashes on reading…
Do I have all filters needed to read the data?
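
The filter question above can be checked mechanically once the h5stat report has been saved as text. The following is a minimal sketch, assuming the report uses the "NAME filter: COUNT" line format shown in the example output; the helper names filter_counts and filters_in_use are hypothetical, and real h5stat output may differ between versions.

```python
import re

# Excerpt of the h5stat output shown above (elided lines omitted).
SAMPLE = """\
Dataset filters information:
    Number of datasets with
        SZIP filter: 2
        NBIT filter: 10
        USER-DEFINED filter: 1
"""


def filter_counts(h5stat_text):
    """Return {filter name: number of datasets} parsed from h5stat-style text."""
    counts = {}
    for line in h5stat_text.splitlines():
        # Matches lines such as "SZIP filter: 2"; the header line
        # "Dataset filters information:" does not match.
        m = re.match(r"\s*(.+?) filter:\s*(\d+)\s*$", line)
        if m:
            counts[m.group(1)] = int(m.group(2))
    return counts


def filters_in_use(h5stat_text):
    """Return the sorted names of filters used by at least one dataset."""
    return sorted(name for name, n in filter_counts(h5stat_text).items() if n > 0)


if __name__ == "__main__":
    # Every name printed here must be available to the reading application.
    print(filters_in_use(SAMPLE))
```

Any filter listed with a non-zero count, in particular SZIP and USER-DEFINED, has to be available to the library used by the reading application.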

h5stat
• Examples of structural metadata information:
Object header size: (total/unused)
Groups: 1808/72
Datasets: 15792/832
………
Dataset storage information:
Total raw data size: 6140688
………
Dataset datatype #3:
Count (total/named) = (2/0)
Size (desc./elmt) = (10/65535)
Dataset datatype #4:
Count (total/named) = (1/0)
Size (desc./elmt) = (10/32000)
h5stat
• Conclusions
• File size: 6228197
• 1.5% overhead (not bad at all!)
• Some elements are of size 65535 and 32000

Oh… Is it really what I want?
Should I use another datatype and take advantage of compression?
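
The overhead figure can be reproduced from the two sizes quoted in this example. This is a small sketch, assuming overhead means the share of the file that is not raw data (file size minus total raw data size, divided by file size); the function name overhead_fraction is hypothetical.

```python
def overhead_fraction(file_size, raw_data_size):
    """Fraction of the file that is not raw dataset data."""
    return (file_size - raw_data_size) / file_size


FILE_SIZE = 6228197      # file size quoted in the conclusions
RAW_DATA_SIZE = 6140688  # "Total raw data size" from the h5stat output

if __name__ == "__main__":
    extra = FILE_SIZE - RAW_DATA_SIZE
    print(f"{extra} bytes outside raw data, "
          f"{overhead_fraction(FILE_SIZE, RAW_DATA_SIZE):.1%} of the file")
```

With these inputs the result is roughly 1.4%, close to the rounded figure quoted above.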

Case study: Using HDF5 tools to debug a problem
• My application creates files on Windows with VS2005 and VS2003. I can
read the VS2003 file but not the VS2005 one. h5dump reads both files
OK and there are no differences. What am I doing wrong?
• h5diff good.h5 bad.h5
Datatype: </Definitions/timespec> and </Definitions/timespec>
1 differences found

• h5ls -vr good.h5
/Definitions/timespec    Type
    Location: 0:1:0:900

• h5debug good.h5 900
Message Information:
    Type class: compound
    Size: 8 bytes

• h5debug bad.h5 900
Message Information:
    Type class: compound
    Size: 16 bytes

Case study: Using HDF5 tools to debug a problem

• Conclusions
• The compound datatype “timespec” requires a different
number of bytes on VS2005 (16 bytes; 2 x 8 bytes) and
on VS2003 (8 bytes; 2 x 4 bytes)

Oh… How do I read my data back?
I assumed that my struct would need only 8 bytes for each element, but
it needs 16 bytes on VS2005. I need the H5Tget_native_type function
to find the type of my data in memory.
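
The size mismatch itself can be illustrated without HDF5. The sketch below uses Python's struct module as a stand-in for the two compiler layouts; it is not the HDF5 API and does not replace H5Tget_native_type. The field types (two signed integers, little-endian) and the sample values are assumptions, since the slides only state 2 x 4 bytes versus 2 x 8 bytes.

```python
import struct

# Assumed layouts: two signed integers per record, little-endian, no padding.
LAYOUT_2X4 = struct.Struct("<2i")  # 2 x 4 bytes = 8 bytes
LAYOUT_2X8 = struct.Struct("<2q")  # 2 x 8 bytes = 16 bytes


def decode(record):
    """Decode one record, choosing the layout from the stored size
    instead of assuming a fixed 8-byte struct."""
    for layout in (LAYOUT_2X4, LAYOUT_2X8):
        if len(record) == layout.size:
            return layout.unpack(record)
    raise ValueError(f"unexpected record size: {len(record)} bytes")


if __name__ == "__main__":
    print(LAYOUT_2X4.size, LAYOUT_2X8.size)

    # A record written with the 16-byte layout (hypothetical values).
    record16 = LAYOUT_2X8.pack(1, 500)

    # Reading only 8 bytes with the 8-byte layout silently loses the
    # second field: the result is not (1, 500).
    print(LAYOUT_2X4.unpack(record16[:8]))

    # Choosing the layout from the actual size recovers both fields.
    print(decode(record16))
```

The point is the same as in the case study: the reader has to take the in-memory layout from what is actually stored, not from an assumed struct size.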

Where is my data?
• h5ls -var be_data.h5:
Opened "be_data.h5" with sec2 driver.
/Array                   Dataset {5/5, 6/6}
    Location:  0:1:0:792
    Links:     1
    Modified:  2006-04-07 15:08:39 CDT
    Storage:   240 logical bytes, 240 allocated bytes, 100.00% utilization
    Type:      IEEE 64-bit big-endian float
    Address:   2048

• The 30 8-byte elements can be read starting at address 2048 by a non-HDF5 application
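
A non-HDF5 reader for this dataset could look like the sketch below. It assumes the raw data is stored contiguously, unfiltered and in row-major order, so that the 240 bytes at address 2048 are exactly the 5 x 6 big-endian doubles. The helper names are hypothetical, and because be_data.h5 is not available here, the demo writes a stand-in file with the same layout.

```python
import os
import struct
import tempfile

ADDRESS = 2048        # "Address" reported by h5ls
ROWS, COLS = 5, 6     # dataset dimensions reported by h5ls
ELEMENT_SIZE = 8      # IEEE 64-bit float


def read_raw_doubles(path, address=ADDRESS, count=ROWS * COLS):
    """Read `count` big-endian doubles starting at byte offset `address`."""
    with open(path, "rb") as f:
        f.seek(address)
        raw = f.read(count * ELEMENT_SIZE)
    if len(raw) != count * ELEMENT_SIZE:
        raise ValueError("file is too short for the requested data")
    return struct.unpack(f">{count}d", raw)  # ">" selects big-endian


def as_rows(values, cols=COLS):
    """Split a flat sequence into rows of `cols` elements (row-major)."""
    return [list(values[i:i + cols]) for i in range(0, len(values), cols)]


def make_stand_in_file(path):
    """Write a stand-in file: 2048 padding bytes, then 30 big-endian doubles."""
    values = [float(i) for i in range(ROWS * COLS)]
    with open(path, "wb") as f:
        f.write(b"\0" * ADDRESS)
        f.write(struct.pack(f">{len(values)}d", *values))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stand_in.bin")
        make_stand_in_file(path)
        for row in as_rows(read_raw_doubles(path)):
            print(row)
```

This only works for the simple storage case shown in the h5ls output; chunked or compressed datasets cannot be read this way.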

Questions? Comments?

?

Thank you!
