The Euro-Mediterranean Center on Climate Change (CMCC) Foundation is a non-profit research institution that manages and promotes scientific and applied activities in international climate change research. This talk presents CLIMA, the climate information management platform recently developed at the CMCC Supercomputing Center. The platform is based on iRODS, open-source data management software that provides data discovery, automated data workflows, secure collaboration, and data virtualization. The main goal of CLIMA is to provide climate data services, such as data portals, data delivery, and data analytics, provisioned through an OpenNebula private cloud using features such as OneFlow and OneGate. Moreover, CLIMA can provision highly available climate data services through a hybrid cloud approach based on the federation of OpenNebula and iRODS zones defined on-premises and on Amazon Web Services (AWS).
2. About CMCC
• CMCC is a non-profit research institution (a Foundation since 10 December 2015)
• Established in 2005, with the financial support of the Ministry of Education,
University and Research (MIUR), the Ministry of the Environment and
Protection of Land and Sea (MATT), the Ministry of Agricultural and Forestry
Policies (MIPAF) and the Ministry of Finance (MEF)
• CMCC’s mission is to investigate and model our climate system and its
interactions with society and the environment, in order to guarantee reliable,
rigorous, and timely scientific results that stimulate sustainable growth,
protect the environment, and support science-driven adaptation and mitigation
policies in a changing climate.
• 6 Consortium Members: National Institute of Geophysics and Volcanology
(INGV); University of Salento; Italian Aerospace Research Center (CIRA S.c.p.a);
Ca’ Foscari University of Venice; University of Tuscia; University of Sassari.
• 8 Research Divisions: ASC, CSP, ECIP, IAFES, ODA, OPA, RAAS, REMHI
• 1 Supercomputing Center with HPC and Storage facilities
3. The big challenge is to model this complex system
• Several complex processes to be simulated
• Several interacting processes
• Great range of time scales to be analyzed
• Great range of spatial scales to be considered
• Need interdisciplinary sciences (physics, chemistry, biology, geology, …)
• Inherently non-linear governing equations
• Need sophisticated numerics
• Need huge computational resources
• …and large volumes of data can be produced
Warren M. Washington – NCAR
Scientific Grand Challenges Workshop Series:
Challenges in Climate Change Science and the Role of Computing at the Extreme Scale
DOE Workshop (ASCR-BER)
November 6-7, 2008
4. CMCC information LIfecycle Management plAtform
CLIMA
• High Performance Computing
• Analysis and Visualization
• Sharing and Publication
• Archiving and Retrieval
Objectives
• Enforcing Data Policies
• Optimizing Storage Cost
• Improving Data High Availability
• Robust Implementation of Operational Chains
• Easing Search & Discovery, Data Sharing and Collaboration
Federation of Data Services
5. CLIMA Data Service
• Ingestion
• Operational Chains
• Data Access
• Portal
• Gateway
• Search & Discovery
• Data Management
iRODS is an open-source
data management software:
• Virtualization
• Data Discovery
• Workflow Automation
• Data Sharing
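As a rough illustration of what data virtualization and metadata-based discovery mean in practice, the sketch below models a catalog that maps logical paths to physical replicas and attaches searchable attributes. It is a toy in plain Python and not the iRODS API; the class names, paths, resource names, and attributes are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """A logical data object; users only ever see the logical path."""
    logical_path: str
    replicas: dict = field(default_factory=dict)  # resource name -> physical path
    metadata: dict = field(default_factory=dict)  # attribute -> value


class Catalog:
    """Toy catalog, a hypothetical stand-in for a data management layer."""

    def __init__(self):
        self._objects = {}

    def register(self, logical_path, resource, physical_path, **metadata):
        """Record a replica of a logical object on a storage resource."""
        obj = self._objects.setdefault(logical_path, DataObject(logical_path))
        obj.replicas[resource] = physical_path
        obj.metadata.update(metadata)
        return obj

    def resolve(self, logical_path, preferred=None):
        """Return a physical path, preferring a resource if it holds a replica."""
        obj = self._objects[logical_path]
        if preferred in obj.replicas:
            return obj.replicas[preferred]
        # Fall back to the first replica that was registered.
        return next(iter(obj.replicas.values()))

    def search(self, **criteria):
        """Return logical paths whose metadata matches every criterion."""
        return sorted(
            path for path, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        )


# Hypothetical paths, resources, and attributes, for illustration only.
catalog = Catalog()
catalog.register("/zone/home/clima/tas_2015.nc", "disk", "/gpfs/a/tas_2015.nc",
                 variable="tas", year="2015")
catalog.register("/zone/home/clima/tas_2015.nc", "tape", "/tape/0042/tas_2015.nc")
catalog.register("/zone/home/clima/pr_2015.nc", "disk", "/gpfs/b/pr_2015.nc",
                 variable="pr", year="2015")

print(catalog.search(variable="tas"))
print(catalog.resolve("/zone/home/clima/tas_2015.nc", preferred="tape"))
```

The point of the indirection is that a file can move between disk and tape, or gain a replica elsewhere, without the logical path seen by users changing.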
Solr is an open-source enterprise search server that provides faceted navigation, clustering, grouping, and other search features
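To make faceted navigation concrete, here is a minimal sketch that builds a Solr select URL asking for matching documents plus per-field facet counts. The query parameters (`q`, `rows`, `wt`, `facet`, `facet.field`, `facet.mincount`) are standard Solr request parameters; the host, the core name `climate`, and the field names are assumptions made for the example and are not taken from CLIMA.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, text, facet_fields, rows=10):
    """Build a Solr select URL that returns matches plus facet counts."""
    params = [
        ("q", text),
        ("rows", str(rows)),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", "1"),  # hide facet values with zero matches
    ]
    # Solr expects one facet.field parameter per faceted field.
    params += [("facet.field", name) for name in facet_fields]
    return f"{base_url}/select?{urlencode(params)}"


url = build_facet_query(
    "http://localhost:8983/solr/climate",  # hypothetical core name
    "precipitation",
    ["variable", "model", "frequency"],    # hypothetical schema fields
)
print(url)
```

A portal would typically render the returned facet counts as clickable filters and add an `fq` parameter for each filter the user selects.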
THREDDS is a data access server that provides bulk file transfer, remote access, subsetting, and web map services
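Subsetting can be illustrated with a request to the THREDDS NetCDF Subset Service, which takes the variables, bounding box, and time range as query parameters. The sketch below only builds the URL. The parameter names (`var`, `north`, `south`, `east`, `west`, `time_start`, `time_end`, `accept`) follow the NetCDF Subset Service conventions, while the host, dataset path, and variable names are hypothetical, and the exact endpoint prefix depends on the server version.

```python
from urllib.parse import urlencode


def build_subset_url(endpoint, variables, north, south, east, west,
                     time_start, time_end, accept="netcdf"):
    """Build a subset request URL for a gridded dataset."""
    params = {
        "var": ",".join(variables),
        "north": north,
        "south": south,
        "east": east,
        "west": west,
        "time_start": time_start,
        "time_end": time_end,
        "accept": accept,
    }
    # Keep commas and colons readable instead of percent-encoding them.
    return f"{endpoint}?{urlencode(params, safe=',:')}"


# Hypothetical server and dataset path; a box roughly covering Italy.
endpoint = "https://data.example.org/thredds/ncss/clima/monthly/tas_pr.nc"
url = build_subset_url(
    endpoint,
    ["tas", "pr"],
    north=47.5, south=36.0, east=19.0, west=6.0,
    time_start="2015-01-01T00:00:00Z",
    time_end="2015-12-31T00:00:00Z",
)
print(url)
```

Only the requested region and period are transferred, which matters when the full dataset is many gigabytes.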
10. Highly Available Data Services
[Architecture diagram. Labels: Amazon EC2, Amazon S3, OneFlow (two instances, each with a group of VMs), VPN, Federation + File Replication, Cross Data Center Replication, Federation, Master Zone, Slave Zone]
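One way a client could benefit from data replicated across two federated zones is to try the master zone first and fall back to a slave zone when the master is unreachable. The sketch below shows that selection logic only. The zone names, endpoints, role assignment, and the injected health check are hypothetical, and the slides do not say that CLIMA implements failover this way.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Zone:
    name: str
    endpoint: str
    role: str  # "master" or "slave"


def pick_zone(zones: Iterable[Zone],
              is_healthy: Callable[[Zone], bool]) -> Optional[Zone]:
    """Return the first healthy zone, trying the master before any slave."""
    # False sorts before True, so zones with role "master" come first.
    ordered = sorted(zones, key=lambda z: z.role != "master")
    for zone in ordered:
        if is_healthy(zone):
            return zone
    return None


# Hypothetical deployment: which site acts as master is an assumption.
zones = [
    Zone("aws", "https://aws.example.org", "slave"),
    Zone("onprem", "https://onprem.example.org", "master"),
]

down = {"onprem"}  # simulate an outage of the master site
chosen = pick_zone(zones, lambda z: z.name not in down)
print(chosen.name if chosen else "no zone available")
```

Passing the health check in as a function keeps the selection logic testable without any network access; a real check might probe the service endpoint with a timeout.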