FAIR Principle – Data Accessibility Practice at NCI
Jingbo Wang
ANDS webinar FAIR principles
6 Sep 2017
Canberra
nci.org.au© NCI Australia 2017
Overview of datasets at NCI
• climate and weather models
• satellite images
• bathymetry and elevation
• hydrology
• geophysics
• Also: optical astronomy, genomics and social sciences
NCI makes available national reference datasets, especially those produced
by government agencies. They are brought together at NCI and organised for
both high performance computation and high performance data analysis, and
are also made available more broadly to the research community.
NCI provides users with Data-as-a-Service
[Diagram of the data workflow at NCI. Labelled elements:]
• Users generate/transfer data
• Fast data storage
• Data Management Portal
• Web-time analytics software
• HPC
• Data curation, publishing, citation
• Data catalogue
• Supercomputer users
• Paper and data are published
• Data visualisation, visualisation tools
• Data sharing and re-use
What do we do?
• Control Data Access - license, data access controls
• Persistently Access Data
• Access large Data through data services
• Scalable and distributed Data Access (advanced data services)
• Provenance implementation to Access versioned Data
• Access quality Data
1. Control Data Access
• License: CC BY 4.0, CC BY-NC-ND, CC BY-NC-SA, ECMWF, others
• Access Control List (ACL): set to manage read and write permissions
• Embargo period, dev-ops time delay, publication dependency
FYI: Poblet et al. 2016. Assigning Creative Commons Licenses to Research Metadata: Issues and Cases.
https://arxiv.org/abs/1609.05700
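As an illustration only, the license, ACL and embargo controls on this slide can be read as a single access check. The sketch below is hypothetical: the `Dataset` class, its field names and the `can_read`/`can_write` functions are invented for this example and do not describe how NCI actually implements access control.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional, Set


@dataclass
class Dataset:
    """Hypothetical dataset record; the field names are assumptions."""
    name: str
    license: str                                     # e.g. "CC BY 4.0"
    readers: Set[str] = field(default_factory=set)   # ACL: read permission
    writers: Set[str] = field(default_factory=set)   # ACL: write permission
    embargo_until: Optional[date] = None             # None means no embargo


def can_read(user: str, ds: Dataset, today: date) -> bool:
    """Writers can always read; readers can read once any embargo has ended."""
    if user in ds.writers:
        return True
    if ds.embargo_until is not None and today < ds.embargo_until:
        return False
    return user in ds.readers


def can_write(user: str, ds: Dataset) -> bool:
    """Only users on the write list may modify the dataset."""
    return user in ds.writers
```

In this sketch the embargo only restricts readers, on the assumption that data producers still need access to their own data before publication.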
2. Persistently Access Data
Reference: Wang et al. 2017. Persistent Identifier Practice for Big Data Management at NCI.
Data Science Journal. https://datascience.codata.org/articles/10.5334/dsj-2017-020/
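A persistent identifier lets a reader reach a resource through a resolver rather than through a storage location that may change. As a minimal sketch, the DOI embedded in the reference URL above (10.5334/dsj-2017-020) can be turned into a resolver link via the public https://doi.org/ service. The helper name `doi_to_url` is made up for this example.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Return a resolver URL for a DOI, accepting a few common prefixes."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not doi.startswith("10."):
        raise ValueError("not a DOI: %r" % doi)
    # Keep "/" between prefix and suffix; percent-encode anything unsafe.
    return DOI_RESOLVER + quote(doi, safe="/")
```

For example, `doi_to_url("doi:10.5334/dsj-2017-020")` gives a link that keeps working even if the publisher moves the article.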
3. Access Data through NCI’s data services
Open Geospatial Consortium (OGC) services
• Web Map Service (WMS)
• Web Coverage Service (WCS)
• Web Feature Service (WFS)
• Web Processing Service (WPS)
• Web Coverage Processing Service (WCPS)
Other NCI data services
• THREDDS data subsetting
• GSKY
• ESGF (services used for international Climate Model Intercomparison Projects (MIPs))
• ASVO
• ERDDAP, GeoServer, Rasdaman
The THREDDS Data Server (TDS) has become the most broad-scale, general-purpose server because of the
range of data services and protocols it supports.
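To make the OGC services listed for this slide more concrete, here is a minimal sketch of how a client might assemble a WMS 1.3.0 GetMap request. The parameter names are those of the WMS standard; the endpoint and layer name in the usage note are placeholders, not real NCI addresses, and `wms_getmap_url` is a helper invented for this example.

```python
from urllib.parse import urlencode


def wms_getmap_url(endpoint, layer, bbox, width=512, height=512,
                   crs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.3.0 GetMap URL.

    bbox is (min1, min2, max1, max2) in the axis order of the CRS.
    For EPSG:4326 in WMS 1.3.0 that order is latitude, then longitude.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",            # empty string requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return endpoint + "?" + urlencode(params)
```

A call such as `wms_getmap_url("https://example.org/wms", "demo_layer", (-44.0, 112.0, -10.0, 154.0))` yields a URL that any WMS client or browser could fetch, assuming the server publishes a layer with that name.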
(courtesy of Adam Steer)
3. Access Data through NCI’s data services - examples
When searching and browsing data,
published collections will have a direct
link to NCI’s Data Services.
https://datacatalogue.nci.org.au
https://geonetwork.nci.org.au
THREDDS data services
What is THREDDS?
The THREDDS (Thematic Real-time Environmental Distributed Data Services) Data Server (TDS), developed by
Unidata (UCAR), allows browsing and accessing data as well as metadata.
Name                          Description
OPeNDAP (DAP2)                Protocol enabling data access and subsetting through the web
NetCDF Subset Service (NCSS)  Web service for subsetting files that can be read by the netCDF-Java library
Web Map Service (WMS)         OGC web service for requesting static images of data
Web Coverage Service (WCS)    OGC web service for requesting data in some output format
Godiva Data Viewer            Tool for simple visualisation of data
HTTP File Download            Direct downloading
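Two of the THREDDS services above, NCSS and OPeNDAP, subset data through the request URL itself. The following is a rough sketch of how such requests are typically formed. The server base URL, dataset path and variable name used in the usage note are placeholders, and both helper functions are invented for this example. The NCSS parameter names and the DAP2 hyperslab syntax follow the Unidata and OPeNDAP documentation as far as I am aware; check them against the server you use.

```python
from urllib.parse import urlencode


def ncss_subset_url(base, dataset_path, var, north, south, east, west,
                    time_start, time_end, accept="netcdf"):
    """Build a NetCDF Subset Service request for a lat/lon box and time range."""
    params = {
        "var": var,
        "north": north, "south": south, "east": east, "west": west,
        "time_start": time_start, "time_end": time_end,
        "accept": accept,       # requested output format
    }
    return "%s/ncss/%s?%s" % (base, dataset_path, urlencode(params))


def opendap_constraint(var, slices):
    """Build a DAP2 hyperslab such as tas[0:1:0][10:2:20].

    Each slice is (start, stride, stop) for one dimension; stop is inclusive.
    """
    return var + "".join("[%d:%d:%d]" % s for s in slices)
```

For instance, `opendap_constraint("tas", [(0, 1, 0), (10, 2, 20)])` selects the first index of the first dimension and every second index from 10 to 20 of the next; the result is appended after `?` to the dataset's OPeNDAP URL.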
4. Scalable and Distributed Data Access - GSKY
[Diagram labels: WMS request, WCS request, WPS request; INPUT: OGC request; OUTPUT: OGC output; user’s browser; WMS client]
FEATURES
• Distributed
• Scalable
• Concurrent
• Multi Cloud
GSKY http://ceur-ws.org/Vol-1913/RL17_paper_14.pdf
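GSKY's internals are described in the linked paper rather than on this slide. The sketch below only illustrates the general idea behind "distributed" and "concurrent" processing from a client's point of view: one large bounding box is split into tiles that are handled in parallel. It is not GSKY code. `split_bbox`, `fetch_tile` and `fetch_all` are hypothetical names, and `fetch_tile` is a stand-in that does not contact any server.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def split_bbox(bbox: BBox, nx: int, ny: int) -> List[BBox]:
    """Split a bounding box into nx by ny equal tiles, row by row."""
    min_x, min_y, max_x, max_y = bbox
    dx = (max_x - min_x) / nx
    dy = (max_y - min_y) / ny
    return [
        (min_x + i * dx, min_y + j * dy,
         min_x + (i + 1) * dx, min_y + (j + 1) * dy)
        for j in range(ny)
        for i in range(nx)
    ]


def fetch_tile(tile: BBox) -> str:
    """Placeholder for one request (e.g. a WMS GetMap call for this tile)."""
    return "tile %.1f,%.1f,%.1f,%.1f" % tile


def fetch_all(bbox: BBox, nx: int, ny: int, workers: int = 4) -> List[str]:
    """Process all tiles concurrently; results keep the tile order."""
    tiles = split_bbox(bbox, nx, ny)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_tile, tiles))
```

Threads suit this pattern because each tile request spends most of its time waiting on the network, so a scalable server can work on many tiles at once.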
5. Provenance implementation to Access versioned Data
[Diagram: a chain of URIs leading from a publication back to the source data]
• Publication with data extract reference
• URI points to data extract
• URI points to an earlier version x of data extract
• URI points to an even earlier version 1 of data extract
• URI points to the original source used to generate a series of data extracts
[Diagram labels: Data, Metadata; final version, raw data]
Reference: Wang et al. 2015. Enabling dynamic access to dynamic petascale Earth Systems and Environmental data collections is easy:
citing and reproducing the actual data extracts used in research publications is NOT. American Geophysical Union Fall Meeting.
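The chain of URIs in this slide's diagram can be sketched as a simple linked structure in which every data extract records the URI it was derived from. This is an illustrative model only: the URIs, the `DERIVED_FROM` mapping and the `provenance_chain` function are all hypothetical and are not taken from NCI's provenance implementation.

```python
from typing import Dict, List, Optional

# Hypothetical URIs. Each key maps to the URI it was derived from;
# None marks the original source.
DERIVED_FROM: Dict[str, Optional[str]] = {
    "https://example.org/extract/final": "https://example.org/extract/vx",
    "https://example.org/extract/vx": "https://example.org/extract/v1",
    "https://example.org/extract/v1": "https://example.org/source/raw",
    "https://example.org/source/raw": None,
}


def provenance_chain(uri: str,
                     derived_from: Dict[str, Optional[str]]) -> List[str]:
    """Follow 'derived from' links from a cited extract back to its source."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = uri
    while current is not None:
        if current in seen:
            raise ValueError("cycle in provenance records at %s" % current)
        if current not in derived_from:
            raise KeyError("no provenance record for %s" % current)
        seen.add(current)
        chain.append(current)
        current = derived_from[current]
    return chain
```

Starting from the URI cited in a publication, the function returns every intermediate version in order, ending at the original source, which is the information a reader needs to reproduce the extract.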
6. Access quality Data
NCI Data Quality Strategy
Reference: Evans et al. 2017 (invited). A data quality strategy for programmatic access to large
collections of diverse datasets on an integrated high performance platform. MDPI Informatics.
• Data Quality Control (QC) and Quality Assurance (QA) reports and
benchmarking use cases should be available to the community.
• When users access the data, they can also access the data quality
report.
