Distributed Computing Environments Team
Marian Bubak
bubak@agh.edu.pl
Department of Computer Science and Cyfronet
AGH University of Science and Technology
Krakow, Poland
dice.cyfronet.pl
DICE Team
Academic Computer Centre
CYFRONET AGH (1973)
120 employees
http://www.cyfronet.pl/en/
Department of Computer Science AGH (1980)
800 students, 70 employees
http://www.ki.agh.edu.pl/uk/index.htm
Faculty of Computer Science, Electronics and
Telecommunication (2012)
2000 students, 200 employees
http://www.iet.agh.edu.pl/
AGH University of Science and Technology (1919)
16 faculties, 36000 students; 4000 employees
http://www.agh.edu.pl/en
Other 15
faculties
Distributed Computing
Environments (DICE) Team
http://dice.cyfronet.pl
• Investigation of methods for building complex scientific collaborative applications
• Elaboration of environments and tools for e-Science
• Integration of large-scale distributed computing infrastructures
• Knowledge-based approach to services, components, and their semantic composition
• Investigating applicability of cloud computing model for complex
scientific applications
• Optimization of resource allocation for applications on clouds
• Resource management for services on heterogeneous resources
• Urgent computing scenarios on distributed infrastructures
• Billing and accounting models
• Procedural and technical aspects of ensuring efficient yet secure
data storage, transfer and processing
• Methods for component dependency management, composition
and deployment
• Information representation model for cloud federating platform, its
components and operating procedures
Current research objectives
• Optimization of service
deployment on clouds
– Constraint satisfaction and
optimization of multiple
criteria (cost, performance)
– Static deployment planning
and dynamic auto-scaling
• Billing and accounting
model
– Adapted for the federated
cloud infrastructure
– Handle multiple billing
models
• Supporting system-level
(e)Science
– tools for effective scientific
research and collaboration
– advanced scientific analyses
using HPC/HTC resources
• Cloud security
– security of data transfer
– reliable storage and removal
of the data
• Cross-cloud service
deployment based on
container model
Topics for collaboration
[Figure: job dynamics in grids. Plot labels: seconds, 3 hours, 1 job, 100 jobs, ~95%, <10%. Annotations: asynchronous and frequent failures and hardware/software upgrades; long and unpredictable job waiting times.]
J. T. Moscicki: Understanding and mastering dynamics in Computing Grids, UvA PhD thesis, promoter: M. Bubak, co-promoter: P. Sloot;
12.04.2011
Spatial and temporal dynamics in grids
• Grids increase research capabilities for science
• Large-scale federation of computing and storage resources
– 300 sites, 60 countries, 200 Virtual Organizations
– 10^5 CPUs, 20 PB data storage, 10^5 jobs daily
• However, operational and runtime dynamics have a negative
impact on reliability and efficiency
[Figure: completion time with late binding compared with completion time with early binding. Value labels: 40 hours, 1.5 hours.]
J. T. Moscicki, M. Lamanna, M. Bubak, P. M. A. Sloot: Processing moldable tasks on the Grid: late job binding with lightweight user-level
overlay, FGCS 27(6) pp 725-736, 2011
User-level overlay with late binding scheduling
• Improved job execution characteristics
• HTC-HPC Interoperability
• Heuristic resource selection
• Application aware task scheduling
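The difference between early and late binding can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the scheduler from the cited paper: the function names, the round-robin pre-assignment and the worker start times are all hypothetical. Each worker becomes usable only after its own queue waiting time; early binding fixes the task-to-worker mapping up front, while late binding hands each task to whichever worker is free first.

```python
import heapq


def makespan_early_binding(worker_ready_at, task_hours):
    """Tasks are pre-assigned round-robin before worker start times are known."""
    if not task_hours:
        return 0.0
    finish = list(worker_ready_at)
    for i, hours in enumerate(task_hours):
        finish[i % len(finish)] += hours
    # Only workers that actually received a task count towards completion time.
    used = min(len(finish), len(task_hours))
    return max(finish[:used])


def makespan_late_binding(worker_ready_at, task_hours):
    """Each task is bound to whichever worker becomes free first."""
    free_at = list(worker_ready_at)
    heapq.heapify(free_at)
    end = 0.0
    for hours in task_hours:
        start = heapq.heappop(free_at)   # earliest available worker
        done = start + hours
        end = max(end, done)
        heapq.heappush(free_at, done)
    return end
```

With three workers ready at hours 0, 0 and 10 (one stuck in a long queue) and six one-hour tasks, the early-binding variant finishes at hour 12 and the late-binding variant at hour 3, because the late worker never receives work it would delay.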
Columns (in order): # | IaaS Provider | EEA Zoning | jClouds API Support | BLOB storage support | Per-hour instance billing | API Access | Published price | VM Image Import / Export | Relational DB support | Score
Weight 20 20 10 5 5 5 3 2
1 Amazon AWS 1 1 1 1 1 1 0 1 27
2 Rackspace 1 1 1 1 1 1 0 1 27
3 SoftLayer 1 1 1 1 1 1 0 0 25
4 CloudSigma 1 1 0 1 1 1 1 0 18
5 ElasticHosts 1 1 0 1 1 1 1 0 18
6 Serverlove 1 1 0 1 1 1 1 0 18
7 GoGrid 1 1 0 1 1 1 0 0 15
8 Terremark ecloud 1 1 0 1 1 0 1 0 13
9 RimuHosting 1 1 0 0 1 1 0 1 12
10 Stratogen 1 1 0 0 1 0 1 0 8
11 Bluelock 1 1 0 0 1 0 0 0 5
12 Fujitsu GCP 1 1 0 0 1 0 0 0 5
13 BitRefinery 0 0 0 0 0 1 0 1 0
14 BrightBox 1 0 0 1 1 1 1 0 0
15 BT Global Services 1 0 0 0 1 0 1 0 0
16 Carpathia Hosting 1 0 0 0 0 0 1 0 0
17 City Cloud 1 0 0 1 1 1 0 0 0
18 Claris Networks 0 0 0 1 0 0 0 0 0
19 Codero 0 0 0 1 1 1 0 0 0
20 CSC 1 0 0 0 0 0 1 0 0
21 Datapipe 1 0 0 1 1 0 0 0 0
22 e24cloud 1 0 0 1 0 1 0 0 0
23 eApps 0 0 0 0 0 1 0 0 0
24 FlexiScale 1 0 0 1 1 1 1 0 0
25 Google GCE 1 0 1 1 1 1 0 1 0
26 Green House Data 0 0 0 0 1 0 1 0 0
27 Hosting.com 0 0 0 0 0 1 1 1 0
28 HP Cloud 0 1 1 1 1 1 1 1 0
29 IBM SmartCloud 0 0 1 1 1 1 0 1 0
30 IIJ GIO 0 0 0 0 0 0 0 0 0
31 iland cloud 1 0 0 1 0 1 1 0 0
32 Internap 0 0 1 1 1 1 0 0 0
33 Joyent 0 0 0 1 1 1 0 0 0
34 LunaCloud 1 0 1 1 1 1 0 0 0
35 Oktawave 1 0 1 1 1 1 0 1 0
36 Openhosting.co.uk 1 0 0 0 0 1 0 0 0
37 Openhosting.com 0 1 0 1 1 1 1 0 0
38 OpSource 1 0 1 1 1 1 1 0 0
39 ProfitBricks 1 0 0 1 1 1 0 0 0
40 Qube 1 0 0 0 0 1 0 0 0
41 ReliaCloud 0 0 0 0 0 0 0 0 0
42 SaavisDirect 0 0 1 1 0 1 0 0 0
43 SkaliCloud 0 1 0 1 1 1 1 0 0
44 Teklinks 0 0 0 0 0 0 0 0 0
45 Terremark vcloud 0 1 0 1 1 1 1 0 0
46 Tier 3 0 0 0 0 1 0 0 0 0
47 Umbee 1 0 0 1 1 1 1 0 0
48 VPS.net 1 0 0 0 1 1 0 0 0
49 Windows Azure 1 0 1 1 1 1 0 1 0
• Performance of VM deployment times
• Virtualization overhead
• Evaluation of open source cloud
stacks (Eucalyptus, OpenNebula, OpenStack)
• Survey of European public cloud providers
• Performance evaluation of top cloud providers
(EC2, RackSpace, SoftLayer)
• A grant from Amazon has been obtained
M. Bubak, M. Kasztelnik, M. Malawski, J. Meizner, P. Nowakowski and S. Varma: Evaluation of Cloud Providers for VPH Applications, poster
at CCGrid2013 - 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Delft, the Netherlands, May 13-16, 2013
Cloud performance evaluation
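The Score column of the provider table appears to follow a simple rule that can be checked against its rows. The rule below is inferred from the table values and is not stated in the slides: the first two criteria (EEA zoning and jClouds API support) seem to act as mandatory gates, and a provider passing both scores the sum of the weights of the remaining criteria it satisfies. The function name and the `MANDATORY` constant are illustrative.

```python
# Weights as listed in the table's "Weight" row, in column order.
WEIGHTS = [20, 20, 10, 5, 5, 5, 3, 2]
# Assumption inferred from the data: the first two criteria are gates.
MANDATORY = 2


def provider_score(flags):
    """Return the score for one table row of eight 0/1 criterion flags."""
    if len(flags) != len(WEIGHTS):
        raise ValueError("expected one flag per criterion")
    if not all(flags[:MANDATORY]):
        return 0  # a provider failing a mandatory criterion is ranked at zero
    return sum(w * f for w, f in zip(WEIGHTS[MANDATORY:], flags[MANDATORY:]))
```

Applied to the Amazon AWS row (1 1 1 1 1 1 0 1) this gives 27, and applied to the Google GCE row (1 0 1 1 1 1 0 1) it gives 0, matching the table.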
• Infrastructure model
– Multiple compute and
storage clouds
– Heterogeneous instance
types
• Application model
– Bag of tasks
– Layered workflows
• Modeling with AMPL (A
Modeling Language for
Mathematical
Programming)
• Cost optimization under
deadline constraints
• Mixed integer
programming
• Bonmin, Cplex solvers
[Figure: cost ($) versus time limit (hours) for 20000 tasks, 512 MiB input and 512 MiB output, task execution time 0.1 h on a 1 CCU machine. Legend entries: Rackspace instances; Rackspace and private instances; Amazon's and private instances; multiple providers; Amazon S3; Rackspace Cloud Files; optimal.]
[Figure: layered workflow with Layers 1 to 5 and task groups A to F. Duration labels: 1 h, 2.5 h, 0.5 h, 0.3 h, 2 h, 6 h.]
M. Malawski, K. Figiela, J. Nabrzyski: Cost minimization for computational applications on hybrid cloud infrastructures, Future Generation
Computer Systems, Volume 29, Issue 7, September 2013, Pages 1786-1794, ISSN 0167-739X, http://dx.doi.org/10.1016/j.future.2013.01.004
[Figure: infrastructure and application model. Private cloud: compute (private). Amazon: storage; compute (m1.small, m1.large, t1.micro, m2.xlarge). Rackspace: storage; compute (rs.1gb, rs.2gb, rs.4gb, rs.16gb). Application: task, input, output.]
Cost optimization of applications on clouds
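The idea of minimizing cost under a deadline can be sketched for the simplest case, a bag of identical tasks run on a single instance type. This is not the AMPL mixed-integer model from the cited paper; it is a plain exhaustive search over instance types, assuming per-hour billing, no data transfer cost and evenly spread tasks. The class, function and the prices used in the usage note are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    price_per_hour: float  # hypothetical price, not taken from the slides
    speed_ccu: float       # relative speed; 1.0 is the 1 CCU reference machine


def cheapest_plan(types, n_tasks, task_hours_at_1ccu, deadline_hours):
    """Return (cost, type name, instance count) or None if no type meets the deadline."""
    eps = 1e-9  # guards against floating-point noise at exact boundaries
    best = None
    for t in types:
        task_hours = task_hours_at_1ccu / t.speed_ccu
        tasks_per_instance = math.floor(deadline_hours / task_hours + eps)
        if tasks_per_instance < 1:
            continue  # a single task does not fit within the deadline
        instances = math.ceil(n_tasks / tasks_per_instance)
        busiest = math.ceil(n_tasks / instances)          # tasks on the busiest instance
        billed_hours = math.ceil(busiest * task_hours - eps)  # per-hour billing rounds up
        cost = instances * billed_hours * t.price_per_hour
        if best is None or cost < best[0]:
            best = (cost, t.name, instances)
    return best
```

For example, with a "small" type at 0.10 per hour and 1 CCU, and a "large" type at 0.40 per hour and 4 CCU, 100 tasks of 0.5 h each and a 5 h deadline are cheapest on ten small instances. A full model would also mix instance types and account for storage and transfer, which is where mixed-integer programming becomes necessary.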
[Figure: Atmosphere architecture diagram. Labels: VPH-Share Master Int.; Admin, Developer, Scientist; Development Mode; VPH-Share Core Services Host; OpenStack/Nova Computational Cloud Site (Head Node, Worker Nodes, Image store (Glance)); Cloud Facade (secure RESTful API); Other CS; Amazon EC2; Atmosphere Management Service (AMS); Cloud stack plugins (Fog); Atmosphere Internal Registry (AIR); Cloud Manager; Generic Invoker; Workflow management; External application; Cloud Facade client.]
Customized applications may directly
interface with Atmosphere via its RESTful
API, called the Cloud Facade
The Atmosphere Cloud Platform is a one-stop management service for
hybrid cloud resources, ensuring optimal deployment of application
services on the underlying hardware.
P. Nowakowski, T. Bartynski, T. Gubala, D. Harezlak, M. Kasztelnik, M. Malawski, J. Meizner, M. Bubak: Cloud Platform for Medical
Applications, eScience 2012 (2012)
Resource allocation management
DRI is a tool which can keep track of binary data stored in a cloud infrastructure, monitor
data availability and facilitate optimal deployment of application services in a hybrid cloud
(bringing computations to data or the other way around).
[Figure: DRI architecture diagram. Labels: Binary data registry (register files, get metadata, migrate LOBs, get usage stats, etc.); LOBCDER; Distributed Cloud storage (Amazon S3, OpenStack Swift, Cumulus); Store and marshal data; End-user features (browsing, querying, direct access to data, checksumming); VPH Master Int.; Data management portlet (with DRI management extensions); Validation policy; Configurable validation runtime (registry-driven); Runtime layer; Extensible resource client layer; Metadata extensions for DRI.]
DRI Service
A standalone application service, capable of autonomous operation. It periodically
verifies access to any datasets submitted for validation and is capable of issuing alerts
to dataset owners and system administrators in case of irregularities.
Data reliability and integrity
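One validation pass of the kind described for the DRI Service (verify access to each registered dataset and check its integrity) might look like the sketch below. It is not the DRI implementation: the registry layout, the `fetch` callback and the use of SHA-256 checksums are assumptions made for illustration.

```python
import hashlib


def validate_datasets(registry, fetch):
    """Check every registered dataset once.

    registry: dict mapping dataset name -> expected SHA-256 hex digest (hypothetical layout)
    fetch:    callable returning the dataset bytes, raising LookupError or OSError on failure
    """
    report = {}
    for name, expected in registry.items():
        try:
            data = fetch(name)
        except (LookupError, OSError):
            report[name] = "unavailable"   # access check failed
            continue
        actual = hashlib.sha256(data).hexdigest()
        report[name] = "ok" if actual == expected else "corrupted"
    return report
```

A periodic service would run such a pass on a schedule and turn every entry other than "ok" into an alert for the dataset owner.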
Data security in clouds
Jan Meizner, Marian Bubak, Maciej Malawski, and Piotr Nowakowski: Secure storage and processing of confidential data on public clouds.
In: Proceedings of the International Conference On Parallel Processing and Applied Mathematics (PPAM) 2013
• To ensure security of data in transit
• Modern applications use secure transport
protocols (e.g. TLS)
• For legacy unencrypted protocols, if absolutely
needed, or as an additional security measure:
– Site-to-Site VPN, e.g. between cloud sites is
outside of the instance, might use
– Remote access – for individual users accessing
e.g. from their laptops
• Data should be securely stored and reliably
deleted when no longer needed
• Clouds are not secure enough; data optimisations
prevent ensuring that data were deleted
• A solution:
– end-to-end encryption (decryption key stays in
protected/private zone)
– data dispersal (portions of data dispersed
between nodes so it’s non-trivial/impossible to
recover the whole message)
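Data dispersal can be illustrated with XOR-based secret splitting, where every share is needed to recover the message and any incomplete set of shares looks like random bytes. This is one well-known dispersal technique chosen for brevity; the slides do not say which scheme the authors use, and the function names are illustrative.

```python
import secrets


def split(data: bytes, n: int) -> list:
    """Split data into n shares; all n are required to reconstruct it."""
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are pure random bytes of the same length as the data.
    shares = [secrets.token_bytes(len(data)) for _ in range(n - 1)]
    last = data
    for share in shares:
        last = bytes(a ^ b for a, b in zip(last, share))
    # The final share is the data XOR-ed with all random shares.
    return shares + [last]


def combine(shares) -> bytes:
    """XOR all shares together to recover the original data."""
    out = bytes(len(shares[0]))
    for share in shares:
        out = bytes(a ^ b for a, b in zip(out, share))
    return out
```

Each share would be stored on a different node or cloud, so that a provider holding a single share learns nothing about the content. Schemes that tolerate lost shares (threshold schemes) trade this simplicity for availability.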
• GworkflowDL language (with A.
Hoheisel)
• Dynamic, ad-hoc refinement of
workflows based on semantic
description in ontologies
• Novelty
– Abstract, functional blocks translated
automatically into computation unit
candidates (services)
– Expansion of a single block into a
subworkflow with proper concurrency
and parallelism constructs (based on
Petri Nets)
– Runtime refinement: unknown or failed
branches are re-constructed with
different computation unit candidates
T. Gubala, D. Harezlak, M. Bubak, M. Malawski: Semantic Composition of Scientific Workflows Based on the Petri Nets Formalism. In: "The
2nd IEEE International Conference on e-Science and Grid Computing", IEEE Computer Society
Press, http://doi.ieeecomputersociety.org/10.1109/E-SCIENCE.2006.127, 2006
Semantic workflow composition
• Design of a laboratory for virologists, epidemiologists and clinicians
investigating the HIV virus and the possibilities of treating HIV-positive
patients
• Based on notion of in-silico experiments built and refined by cooperating
teams of programmers, scientists and clinicians
• Novelty
– Employed full concept-prototype-refinement-production
cycle for virology tools
– Set of dedicated yet interoperable
tools binds together programmers
and scientists for a single task
– Support for system-level science
with concept of result reuse
between different experiments
T. Gubala, M. Bubak, P. M. A. Sloot: Semantic Integration of Collaborative Research Environments, chapter XXVI in “Handbook of Research
on Computational Grid Technologies for Life Sciences, Biomedicine and Healthcare”, Information Science Reference IGI Global 2009, ISBN:
978-1-60566-374-6, pages 514-530
Cooperative virtual laboratory for e-Science
T. Gubala, K. Prymula, P. Nowakowski, M. Bubak: Semantic Integration for Model-based Life Science Applications. In: SIMULTECH 2013
Proceedings of the 3rd International Conference on Simulation and Modeling Methodologies, Technologies and Applications, Reykjavik, Iceland
29 - 31 July, 2013, pp. 74-81
• Concept of describing scientific domains for in-silico
experimentation and collaboration within laboratories
• Based on separation of the domain model, containing
concepts of the subject of experimentation from the
integration model, regarding the method of (virtual)
experimentation (tools, processes, computations)
• Facets defined in the integration model are automatically
mixed into concepts from the domain model: any piece of
data may show any desired behavior
• Proposed, designed and deployed the
method for 3 domains of science:
– Computational chemistry inside InSilicoLab
chemistry portal
– Sensor processing for early warning and crisis
simulation in UrbanFlood EWS
– Processing of results of massive bioinformatic
computations for protein folding method
comparison
– Composition and execution of multiscale
simulations
– Setup and management of VPH applications
Semantic integration for science domains
GridSpace - platform for e-Science applications
• Experiment: an e-science application
composed of code fragments
(snippets), expressed in either general-
purpose scripting programming
languages, domain-specific languages or
purpose-specific notations. Each snippet
is evaluated by a corresponding
interpreter.
• GridSpace2 Experiment Workbench: a
web application - an entry point to
GridSpace2. It facilitates exploratory
development, execution and
management of e-science experiments.
• Embedded Experiment: a published
experiment embedded in a web site.
• GridSpace2 Core: a Java library providing
an API for
development, storage, management and
execution of experiments. Records all
available interpreters and their
installations on the underlying
computational resources.
• Computational Resources:
servers, clusters, grids, clouds and e-
infrastructures where the experiments
are computed.
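The notion of an experiment as a sequence of snippets, each evaluated by its corresponding interpreter, can be sketched as a small dispatcher. This is not the GridSpace2 Core API (which is a Java library); the function, the interpreter registry and the two toy interpreters are hypothetical and stand in for real language runtimes.

```python
from typing import Callable, Dict, List, Tuple

Interpreter = Callable[[str], str]


def run_experiment(snippets: List[Tuple[str, str]],
                   interpreters: Dict[str, Interpreter]) -> List[str]:
    """Evaluate each (language, code) snippet with the interpreter registered for it."""
    results = []
    for language, code in snippets:
        if language not in interpreters:
            raise KeyError(f"no interpreter registered for {language!r}")
        results.append(interpreters[language](code))
    return results


# Toy interpreters used only for illustration.
TOY_INTERPRETERS: Dict[str, Interpreter] = {
    "upper": lambda code: code.upper(),
    "sum": lambda code: str(sum(int(x) for x in code.split())),
}
```

Running `run_experiment([("upper", "abc"), ("sum", "1 2 3")], TOY_INTERPRETERS)` returns `["ABC", "6"]`; in a real platform the registry would also record where each interpreter is installed.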
E. Ciepiela, D. Harezlak, J. Kocot, T. Bartynski, M. Kasztelnik, P. Nowakowski, T. Gubała, M. Malawski, M. Bubak: Exploratory Programming
in the Virtual Laboratory. In: Proceedings of the International Multiconference on Computer Science and Information Technology, pp. 621-
628, October 2010, the best paper award.
Goal:
Extending the traditional
scientific publishing model with
computational access and
interactivity mechanisms;
enabling readers (including
reviewers) to replicate and
verify experimentation results
and browse large-scale result
spaces.
Challenges:
Scientific: A common description schema for primary data (experimental data, algorithms, software,
workflows, scripts) as part of publications; deployment mechanisms for on-demand reenactment of
experiments in e-Science.
Technological: An integrated architecture for storing, annotating, publishing, referencing and reusing
primary data sources.
Organizational: Provisioning of executable paper services to a large community of users representing
various branches of computational science; fostering further uptake through involvement of major
players in the field of scientific publishing.
P. Nowakowski, E. Ciepiela, D. Harężlak, J. Kocot, M. Kasztelnik, T. Bartyński, J. Meizner, G. Dyk, M. Malawski: The Collage Authoring
Environment. In: Proceedings of the International Conference on Computational Science, ICCS 2011 (2011), Winner of the Elsevier/ICCS
Executable Paper Grand Challenge
E. Ciepiela, D. Harężlak, M. Kasztelnik, J. Meizner, G. Dyk, P. Nowakowski, M. Bubak: The Collage Authoring Environment: From
Proof-of-Concept Prototype to Pilot Service. In: Procedia Computer Science, vol. 18, 2013
Collage - executable e-Science publications
[Figure label: Jun 2012]
• Goal: Extend the traditional way of authoring and
publishing scientific methods with computational
access and interactivity mechanisms thus bringing
reproducibility to scientific computational
workflows and publications
• Scientific challenge: Conceive a model and
methodology to embrace reproducibility in
scientific workflows and publications
• Technological challenge: support these by modern
Internet technologies and available computing
infrastructures
• Solution proposed:
• GridSpace2 – web-oriented distributed
computing platform
• Collage – authoring environment for
executable publications
[Figure labels: Dec 2011, Jun 2011]
GridSpace2 / Collage - Executable
e-Science Publications
Results:
• GridSpace2/Collage won Executable
Paper Grand Challenge in 2011
• Collage was integrated with Elsevier
ScienceDirect portal so papers can be
linked and presented with
corresponding computational
experiments
• Special Issue of Computers &
Graphics journal featuring Collage-
based executable papers was
released in May 2013
• GridSpace2/Collage has been applied
to multiple computational workflows
in the scope of PL-Grid, PL-Grid Plus
and Mapper projects
E. Ciepiela, P. Nowakowski, J. Kocot, D. Harężlak, T. Gubała, J. Meizner, M. Kasztelnik, T. Bartyński, M. Malawski, M. Bubak: Managing
entire lifecycles of e-science applications in the GridSpace2 virtual laboratory–from motivation through idea to operable web-accessible
environment built on top of PL-grid e-infrastructure. In: Building a National Distributed e-Infrastructure–PL-Grid, 2012
P. Nowakowski, E. Ciepiela, D. Harężlak, J. Kocot, M. Kasztelnik, T. Bartyński, J. Meizner, G. Dyk, M. Malawski: The Collage Authoring
Environment. In: Procedia Computer Science, vol. 4, 2011
GridSpace2 / Collage - Executable e-Science
Publications
E. Ciepiela, D. Harężlak, M. Kasztelnik, J. Meizner, G. Dyk, P.
Nowakowski, M. Bubak: The Collage Authoring Environment:
From Proof-of-Concept Prototype to Pilot Service. In: Procedia
Computer Science, vol. 18, 2013
Common Information Space (CIS)
• Facilitate creation, deployment and robust operation of Early Warning
Systems in virtualized cloud environment
• Early Warning System (EWS): any system
working according to four steps:
monitoring, analysis, judgment,
action (e.g. environmental
monitoring)
B. Balis, M. Kasztelnik, M. Bubak, T. Bartynski, T. Gubala, P. Nowakowski, J. Broekhuijsen: The UrbanFlood Common Information Space for
Early Warning Systems. In: Elsevier Procedia Computer Science, vol 4, pp 96-105, ICCS 2011.
Common Information Space
• connects distributed components
into an EWS and deploys it on the cloud
• optimizes resource usage taking into
account EWS importance level
• provides EWS and self monitoring
• equipped with autohealing
• Simple yet expressive model for complex scientific apps
• App = set of processes performing well-defined functions and
exchanging signals
HyperFlow model JSON serialization:
{
"name": "...",  name of the app
"processes": [ ... ],  processes of the app
"functions": [ ... ],  functions used by processes
"signals": [ ... ],  exchanged signals info
"ins": [ ... ],  inputs of the app
"outs": [ ... ]  outputs of the app
}
• Supports a rich set
of workflow patterns
• Suitable for various
application classes
• Abstracts from other distributed app aspects (service model,
data exchange model, communication protocols, etc.)
HyperFlow: model & execution engine
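The top-level layout of the JSON serialization shown on this slide can be checked with a few lines of Python. This is a sketch only: the six key names come from the slide, but the loader function is hypothetical and the internal structure of processes, functions and signals is not shown in the slides, so only the top level is validated.

```python
import json

# Top-level keys listed in the HyperFlow model serialization on the slide.
REQUIRED_KEYS = ("name", "processes", "functions", "signals", "ins", "outs")


def load_app(text: str) -> dict:
    """Parse an app description and verify its top-level layout."""
    app = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in app]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")
    # Everything except "name" is shown as a JSON array on the slide.
    for key in REQUIRED_KEYS[1:]:
        if not isinstance(app[key], list):
            raise TypeError(f"{key!r} must be a list")
    return app
```

For instance, `load_app('{"name": "demo", "processes": [], "functions": [], "signals": [], "ins": [], "outs": []}')` returns the parsed dictionary, while a document lacking `"signals"` raises `ValueError`.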
• HyperFlow model
& engine for
distributed apps
• App optimization
& scheduling
• Autoscaling and
dynamic app
reconfiguration
• Multi-cloud
resource
provisioning
[Figure: platform architecture diagram. Labels: Execution Platform; Provisioning platform; Cloud with VMs; Executor; Input data; Trigger app execution; Monitoring; Provisioner; Start/Stop/Reconfigure VM; Autoscaler; Optimizer & Scheduler; Reconfigure app; Scaling rules; measurements; HyperFlow Enactment Engine; Enact; Execute; App model; App state; Composite App; Initial deployment.]
Platform for distributed applications
Objectives
• Provide means for ad-hoc metadata model
creation and deployment of corresponding
storage facilities
• Create a research space for metadata model
exchange and discovery with associated data
repositories with access restrictions in place
• Support different types of storage sites and
data transfer protocols
• Support the exploratory paradigm by making
the models evolve together with data
Architecture
• Web Interface is used by users to
create, extend and discover metadata models
• Model repositories are deployed in the PaaS
Cloud layer for scalable and reliable access
from computing nodes through REST
interfaces
• Data items from Storage Sites are linked from
the model repositories
Collaborative metadata management
• MAPPER Memory (MaMe): a semantics-aware
persistence store to record metadata
about models and scales
• Multiscale Application Designer (MAD): a
visual composition tool transforming a high-level
description into an executable experiment
• GridSpace Experiment Workbench
(GridSpace): execution and result
management of experiments
[Figure: multiscale tools diagram. Labels: choose/add/delete; Mapper A; Mapper B; Submodule A; Submodule B; MAD; GridSpace; MaMe.]
K. Rycerz, E. Ciepiela, G. Dyk, D. Groen, T. Gubala, D. Harezlak, M. Pawlik, J. Suter, S. Zasada, P. Coveney, M. Bubak: Support for Multiscale
Simulations with Molecular Dynamics, Procedia Computer Science, Volume 18, 2013, pp. 1116-1125, ISSN 1877-0509
K. Rycerz, M. Bubak, E. Ciepiela, D. Harezlak, T. Gubala, J. Meizner, M. Pawlik, B. Wilk: Composing, Execution and Sharing of Multiscale
Applications, submitted to Future Generation Computer Systems, after 1st review (2013)
K. Rycerz, M. Bubak, E. Ciepiela, M. Pawlik, O. Hoenen, D. Harezlak, B. Wilk, T. Gubala, J. Meizner, and D. Coster: Enabling Multiscale Fusion
Simulations on Distributed Computing Resources, submitted to PLGrid PLUS book 2014
• A method and an environment for composing multiscale
applications from single-scale models
• Validation of the method against real applications
structured using tools
• Extension of application composition techniques to
multiscale simulations
• Support for multisite execution of multiscale simulations
• Proof-of-concept transformation of high-level formal
descriptions into actual execution using e-infrastructures
Multiscale programming and execution tools
Research on Feature Modeling:
• modelling eScience applications family
component hierarchy
• modelling requirements
• methods of mapping Feature Models to
Software Product Line architectures
Research on adapting Software Product Line
principles in scientific software projects:
• automatic composition of distributed
eScience applications based on Feature
Model configuration
• architectural design of Software Product
Line engine framework
B. Wilk, M. Bubak, M. Kasztelnik: Software for eScience: from feature modeling to automatic setup of environments, Advances in Software
Development, Scientific Papers of the Polish Information Processing Society Scientific Council, 2013, pp. 83-96
Building scientific software based on Feature Model
CrossGrid 2002-2005 Interactive compute- and data-intensive applications
K-Wf Grid 2004-2007 Knowledge-based composition of grid workflow applications
CoreGRID 2004-2008 Problem solving environments, programming models for grid applications
GREDIA 2006-2009 Grid platform for media and banking applications
ViroLab 2006-2009 Script based composition of applications, GridSpace virtual laboratory
PL-Grid, PL-Grid Plus 2009-2015 Advanced virtual laboratory, DataNet – metadata models (2 large Polish projects)
gSLM 2009-2012 Service level management for grid and clouds
UrbanFlood 2009-2012 Common Information Space for Early Warning Systems
MAPPER 2010-2013 Computational strategies, software and services for distributed multiscale simulations
VPH-Share 2011-2015 Federating cloud resources for VPH compute- and data intensive applications
Collage 2011-2013 Executable Papers; 1st award of Elsevier Competition at ICCS2011 (Elsevier project)
ISMOP 2013-2016 Management of cloud resources, workflows, big data storage and access, analysis tools (MCBiR)
PaaSage 2013-2016 Optimization of workflow applications on cloud resources
DICE team in EU projects
dice.cyfronet.pl

More Related Content

What's hot

Data Science und Machine Learning im Kubernetes-Ökosystem
Data Science und Machine Learning im Kubernetes-ÖkosystemData Science und Machine Learning im Kubernetes-Ökosystem
Data Science und Machine Learning im Kubernetes-Ökosysteminovex GmbH
 
Dsd int 2014 - data science symposium - application 1 - point clouds, prof. p...
Dsd int 2014 - data science symposium - application 1 - point clouds, prof. p...Dsd int 2014 - data science symposium - application 1 - point clouds, prof. p...
Dsd int 2014 - data science symposium - application 1 - point clouds, prof. p...Deltares
 
Bioclouds CAMDA (Robert Grossman) 09-v9p
Bioclouds CAMDA (Robert Grossman) 09-v9pBioclouds CAMDA (Robert Grossman) 09-v9p
Bioclouds CAMDA (Robert Grossman) 09-v9pRobert Grossman
 
Pathways for EOSC-hub and MaX collaboration
Pathways for EOSC-hub and MaX collaborationPathways for EOSC-hub and MaX collaboration
Pathways for EOSC-hub and MaX collaborationEOSC-hub project
 
OCC Overview OMG Clouds Meeting 07-13-09 v3
OCC Overview OMG Clouds Meeting 07-13-09 v3OCC Overview OMG Clouds Meeting 07-13-09 v3
OCC Overview OMG Clouds Meeting 07-13-09 v3Robert Grossman
 
Big Linked Data Federation - ExtremeEarth Open Workshop
Big Linked Data Federation - ExtremeEarth Open WorkshopBig Linked Data Federation - ExtremeEarth Open Workshop
Big Linked Data Federation - ExtremeEarth Open WorkshopExtremeEarth
 
Materials Data Facility: Streamlined and automated data sharing, discovery, ...
Materials Data Facility: Streamlined and automated data sharing,  discovery, ...Materials Data Facility: Streamlined and automated data sharing,  discovery, ...
Materials Data Facility: Streamlined and automated data sharing, discovery, ...Ian Foster
 
Project Matsu: Elastic Clouds for Disaster Relief
Project Matsu: Elastic Clouds for Disaster ReliefProject Matsu: Elastic Clouds for Disaster Relief
Project Matsu: Elastic Clouds for Disaster ReliefRobert Grossman
 
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...miyurud
 
Grid'5000: Running a Large Instrument for Parallel and Distributed Computing ...
Grid'5000: Running a Large Instrument for Parallel and Distributed Computing ...Grid'5000: Running a Large Instrument for Parallel and Distributed Computing ...
Grid'5000: Running a Large Instrument for Parallel and Distributed Computing ...Frederic Desprez
 
Distance oracle - Truy vấn nhanh khoảng cách giữa hai điểm bất kỳ trên đồ thị
Distance oracle - Truy vấn nhanh khoảng cách giữa hai điểm bất kỳ trên đồ thịDistance oracle - Truy vấn nhanh khoảng cách giữa hai điểm bất kỳ trên đồ thị
Distance oracle - Truy vấn nhanh khoảng cách giữa hai điểm bất kỳ trên đồ thịHong Ong
 
袁晓如:大数据时代可视化和可视分析的机遇与挑战
袁晓如:大数据时代可视化和可视分析的机遇与挑战袁晓如:大数据时代可视化和可视分析的机遇与挑战
袁晓如:大数据时代可视化和可视分析的机遇与挑战hdhappy001
 
k-means algorithm implementation on Hadoop
k-means algorithm implementation on Hadoopk-means algorithm implementation on Hadoop
k-means algorithm implementation on HadoopStratos Gounidellis
 
Improving computer vision models at scale presentation
Improving computer vision models at scale presentationImproving computer vision models at scale presentation
Improving computer vision models at scale presentationDr. Mirko Kämpf
 
SURFsara Cloud - OpenNebula Tech Day 2014.03.26
SURFsara Cloud - OpenNebula Tech Day 2014.03.26 SURFsara Cloud - OpenNebula Tech Day 2014.03.26
SURFsara Cloud - OpenNebula Tech Day 2014.03.26 OpenNebula Project
 
Azure Stream Analytics Project : On-demand real-time analytics
Azure Stream Analytics Project : On-demand real-time analyticsAzure Stream Analytics Project : On-demand real-time analytics
Azure Stream Analytics Project : On-demand real-time analyticsLamprini Koutsokera
 
High Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC ClustersHigh Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC ClustersSaliya Ekanayake
 
Simplified Research Data Management with the Globus Platform
Simplified Research Data Management with the Globus PlatformSimplified Research Data Management with the Globus Platform
Simplified Research Data Management with the Globus PlatformGlobus
 
Big Linked Data Querying - ExtremeEarth Open Workshop
Big Linked Data Querying - ExtremeEarth Open WorkshopBig Linked Data Querying - ExtremeEarth Open Workshop
Big Linked Data Querying - ExtremeEarth Open WorkshopExtremeEarth
 
rasdaman: from barebone Arrays to DataCubes
rasdaman: from barebone Arrays to DataCubesrasdaman: from barebone Arrays to DataCubes
rasdaman: from barebone Arrays to DataCubesEUDAT
 

What's hot (20)

Data Science und Machine Learning im Kubernetes-Ökosystem
Data Science und Machine Learning im Kubernetes-ÖkosystemData Science und Machine Learning im Kubernetes-Ökosystem
Data Science und Machine Learning im Kubernetes-Ökosystem
 
Dsd int 2014 - data science symposium - application 1 - point clouds, prof. p...
Dsd int 2014 - data science symposium - application 1 - point clouds, prof. p...Dsd int 2014 - data science symposium - application 1 - point clouds, prof. p...
Dsd int 2014 - data science symposium - application 1 - point clouds, prof. p...
 
Bioclouds CAMDA (Robert Grossman) 09-v9p
Bioclouds CAMDA (Robert Grossman) 09-v9pBioclouds CAMDA (Robert Grossman) 09-v9p
Bioclouds CAMDA (Robert Grossman) 09-v9p
 
Pathways for EOSC-hub and MaX collaboration
Dice presents-feb2014

  • 1. Distributed Computing Environments Team Marian Bubak bubak@agh.edu.pl Department of Computer Science and Cyfronet AGH University of Science and Technology Krakow, Poland dice.cyfronet.pl
  • 2. DICE Team
      Academic Computer Centre CYFRONET AGH (1973): 120 employees, http://www.cyfronet.pl/en/
      Department of Computer Science AGH (1980): 800 students, 70 employees, http://www.ki.agh.edu.pl/uk/index.htm
      Faculty of Computer Science, Electronics and Telecommunication (2012): 2000 students, 200 employees, http://www.iet.agh.edu.pl/
      AGH University of Science and Technology (1919): 16 faculties, 36000 students, 4000 employees, http://www.agh.edu.pl/en
      Other 15 faculties
      Distributed Computing Environments (DICE) Team, http://dice.cyfronet.pl
      • Investigation of methods for building complex scientific collaborative applications
      • Elaboration of environments and tools for e-Science
      • Integration of large-scale distributed computing infrastructures
      • Knowledge-based approach to services, components, and their semantic composition
  • 3. Current research objectives
      • Investigating applicability of cloud computing model for complex scientific applications
      • Optimization of resource allocation for applications on clouds
      • Resource management for services on heterogeneous resources
      • Urgent computing scenarios on distributed infrastructures
      • Billing and accounting models
      • Procedural and technical aspects of ensuring efficient yet secure data storage, transfer and processing
      • Methods for component dependency management, composition and deployment
      • Information representation model for cloud federating platform, its components and operating procedures
  • 4. Topics for collaboration
      • Optimization of service deployment on clouds
        – Constraint satisfaction and optimization of multiple criteria (cost, performance)
        – Static deployment planning and dynamic auto-scaling
      • Billing and accounting model
        – Adapted for the federated cloud infrastructure
        – Handle multiple billing models
      • Supporting system-level (e)Science
        – tools for effective scientific research and collaboration
        – advanced scientific analyses using HPC/HTC resources
      • Cloud security
        – security of data transfer
        – reliable storage and removal of the data
      • Cross-cloud service deployment based on container model
  • 5. Spatial and temporal dynamics in grids
      • Grids increase research capabilities for science
      • Large-scale federation of computing and storage resources
        – 300 sites, 60 countries, 200 Virtual Organizations
        – 10^5 CPUs, 20 PB data storage, 10^5 jobs daily
      • However, operational and runtime dynamics have a negative impact on reliability and efficiency
      [Figure annotations: seconds; ~95%; 3 hours; 100 jobs; 1 job; <10%; asynchronous and frequent failures and hardware/software upgrades; long and unpredictable job waiting times]
      J. T. Moscicki: Understanding and mastering dynamics in Computing Grids, UvA PhD thesis, promoter: M. Bubak, co-promoter: P. Sloot; 12.04.2011
  • 6. User-level overlay with late binding scheduling
      • Improved job execution characteristics
      • HTC-HPC interoperability
      • Heuristic resource selection
      • Application-aware task scheduling
      [Figure: completion time with late binding and completion time with early binding; values shown: 40 hours, 1.5 hours]
      J. T. Moscicki, M. Lamanna, M. Bubak, P. M. A. Sloot: Processing moldable tasks on the Grid: late job binding with lightweight user-level overlay, FGCS 27(6) pp 725-736, 2011
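The contrast between early and late binding on slide 6 can be illustrated with a minimal sketch. It is not the authors' overlay: the function names, the task durations and the worker start-up delays are hypothetical, chosen only to show why pulling work at run time helps when some workers become available late.

```python
import heapq

def early_binding(tasks, worker_delays):
    """Assign tasks round-robin up front; each worker must finish its fixed share."""
    finish = list(worker_delays)  # a worker is usable only after its queue delay
    for i, duration in enumerate(tasks):
        finish[i % len(finish)] += duration
    return max(finish)

def late_binding(tasks, worker_delays):
    """Workers pull the next task only when they are actually free."""
    free_at = list(worker_delays)
    heapq.heapify(free_at)
    makespan = 0.0
    for duration in tasks:
        start = heapq.heappop(free_at)      # earliest available worker takes the task
        end = start + duration
        heapq.heappush(free_at, end)
        makespan = max(makespan, end)
    return makespan

# Hypothetical workload: 12 one-hour tasks, one worker stuck in a queue for 10 hours.
tasks = [1.0] * 12
worker_delays = [0.0, 0.0, 10.0]
early = early_binding(tasks, worker_delays)
late = late_binding(tasks, worker_delays)
print(f"early binding: {early} h, late binding: {late} h")
```

In this toy setup the delayed worker holds a third of the tasks under early binding, while under late binding the two available workers simply drain the queue.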
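  • 7. Cloud performance evaluation
      • Performance of VM deployment times
      • Virtualization overhead
      • Evaluation of open source cloud stacks (Eucalyptus, OpenNebula, OpenStack)
      • Survey of European public cloud providers
      • Performance evaluation of top cloud providers (EC2, RackSpace, SoftLayer)
      • A grant from Amazon has been obtained
      M. Bubak, M. Kasztelnik, M. Malawski, J. Meizner, P. Nowakowski and S. Varma: Evaluation of Cloud Providers for VPH Applications, poster at CCGrid2013 - 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Delft, the Netherlands, May 13-16, 2013
      Table: survey of IaaS providers (1 = criterion met, 0 = not met)
      Criteria and weights: EEA Zoning (20); jClouds API Support (20); BLOB storage support (10); Per-hour instance billing (5); API Access (5); Published price (5); VM Image Import / Export (3); Relational DB support (2)
      # | IaaS Provider | EEA Zoning | jClouds API Support | BLOB storage support | Per-hour instance billing | API Access | Published price | VM Image Import / Export | Relational DB support | Score
      1 | Amazon AWS | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 27
      2 | Rackspace | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 27
      3 | SoftLayer | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 25
      4 | CloudSigma | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 18
      5 | ElasticHosts | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 18
      6 | Serverlove | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 18
      7 | GoGrid | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 15
      8 | Terremark ecloud | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 13
      9 | RimuHosting | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 12
      10 | Stratogen | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 8
      11 | Bluelock | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 5
      12 | Fujitsu GCP | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 5
      13 | BitRefinery | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0
      14 | BrightBox | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0
      15 | BT Global Services | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0
      16 | Carpathia Hosting | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0
      17 | City Cloud | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0
      18 | Claris Networks | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0
      19 | Codero | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0
      20 | CSC | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0
      21 | Datapipe | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0
      22 | e24cloud | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
      23 | eApps | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
      24 | FlexiScale | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0
      25 | Google GCE | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 0
      26 | Green House Data | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0
      27 | Hosting.com | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0
      28 | HP Cloud | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0
      29 | IBM SmartCloud | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 0
      30 | IIJ GIO | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
      31 | iland cloud | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0
      32 | Internap | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0
      33 | Joyent | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0
      34 | LunaCloud | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0
      35 | Oktawave | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 0
      36 | Openhosting.co.uk | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
      37 | Openhosting.com | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0
      38 | OpSource | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0
      39 | ProfitBricks | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0
      40 | Qube | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
      41 | ReliaCloud | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
      42 | SaavisDirect | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0
      43 | SkaliCloud | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0
      44 | Teklinks | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
      45 | Terremark vcloud | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0
      46 | Tier 3 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
      47 | Umbee | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0
      48 | VPS.net | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0
      49 | Windows Azure | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 0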
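The slide does not state how the Score column of the provider survey is computed. The published numbers are consistent with one simple rule: the two weight-20 criteria (EEA Zoning and jClouds API Support) act as mandatory gates, and the score is the sum of the remaining weights for the criteria that are met. The sketch below encodes that inferred rule; the rule itself and all identifier names are assumptions, not something the authors confirm.

```python
# Weights as listed on the slide; the key names are hypothetical identifiers.
WEIGHTS = {
    "eea_zoning": 20,
    "jclouds_api": 20,
    "blob_storage": 10,
    "hourly_billing": 5,
    "api_access": 5,
    "published_price": 5,
    "vm_image_import_export": 3,
    "relational_db": 2,
}
# Assumption inferred from the table: both weight-20 criteria are mandatory.
MANDATORY = ("eea_zoning", "jclouds_api")

def score(flags):
    """flags: eight 0/1 values in the column order of the table."""
    criteria = dict(zip(WEIGHTS, flags))
    if not all(criteria[name] for name in MANDATORY):
        return 0  # a provider failing a mandatory criterion is scored 0
    return sum(
        weight
        for name, weight in WEIGHTS.items()
        if name not in MANDATORY and criteria[name]
    )

# Three rows copied from the table.
rows = {
    "Amazon AWS": (1, 1, 1, 1, 1, 1, 0, 1),
    "CloudSigma": (1, 1, 0, 1, 1, 1, 1, 0),
    "HP Cloud": (0, 1, 1, 1, 1, 1, 1, 1),
}
scores = {name: score(flags) for name, flags in rows.items()}
print(scores)
```

Under this rule HP Cloud scores 0 despite meeting seven of eight criteria, because it fails the EEA Zoning gate, which matches the table.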
  • 8. Cost optimization of applications on clouds
      • Infrastructure model
        – Multiple compute and storage clouds
        – Heterogeneous instance types
      • Application model
        – Bag of tasks
        – Layered workflows
      • Modeling with AMPL (A Modeling Language for Mathematical Programming)
      • Cost optimization under deadline constraints
      • Mixed integer programming
      • Bonmin, Cplex solvers
      [Plot: cost ($), 0 to 3000, versus time limit (hours), 0 to 100, for 20000 tasks, 512 MiB input and 512 MiB output, task execution time 0.1 h on a 1 CCU machine; legend entries: Rackspace instances, Rackspace and private instances, Amazon's and private instances, Multiple providers, Amazon S3, Rackspace Cloud Files, Optimal]
      [Diagram: layered workflow; Layer 1: A; Layer 2: B, B, B, C; Layer 3: D; Layer 4: E; Layer 5: F; durations shown: 1 h, 2.5 h, 0.5 h, 0.3 h, 2 h, 6 h]
      [Diagram: application tasks with input and output; private cloud (compute: private); Amazon (storage; compute: m1.small, m1.large, t1.micro, m2.xlarge); Rackspace (storage; compute: rs.1gb, rs.2gb, rs.4gb, rs.16gb)]
      M. Malawski, K. Figiela, J. Nabrzyski: Cost minimization for computational applications on hybrid cloud infrastructures, Future Generation Computer Systems, Volume 29, Issue 7, September 2013, Pages 1786-1794, ISSN 0167-739X, http://dx.doi.org/10.1016/j.future.2013.01.004
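The idea of minimizing cost under a deadline for a bag of tasks can be sketched with a small exhaustive search. This is a toy stand-in, not the AMPL mixed integer programming model from the cited paper: the instance names, prices, throughputs and the `max_per_type` limit are all hypothetical, and billing is assumed to be per started hour for every instance over the whole run.

```python
from itertools import product

def cheapest_plan(n_tasks, deadline_hours, instance_types, max_per_type=3):
    """Return (cost_cents, hours, counts) of the cheapest feasible plan, or None."""
    names = list(instance_types)
    best = None
    for hours in range(1, deadline_hours + 1):          # candidate run lengths
        for counts in product(range(max_per_type + 1), repeat=len(names)):
            capacity = sum(
                n * hours * instance_types[name]["tasks_per_hour"]
                for n, name in zip(counts, names)
            )
            if capacity < n_tasks:
                continue                                 # deadline cannot be met
            cost = sum(
                n * hours * instance_types[name]["price_cents"]
                for n, name in zip(counts, names)
            )
            if best is None or cost < best[0]:
                best = (cost, hours, dict(zip(names, counts)))
    return best

# Hypothetical instance catalogue (hourly price in cents, tasks finished per hour).
instance_types = {
    "small": {"price_cents": 6, "tasks_per_hour": 10},
    "large": {"price_cents": 24, "tasks_per_hour": 45},
}
plan = cheapest_plan(n_tasks=200, deadline_hours=4, instance_types=instance_types)
infeasible = cheapest_plan(n_tasks=200, deadline_hours=1, instance_types=instance_types)
print(plan, infeasible)
```

Exhaustive search only works for tiny catalogues; a solver-based formulation such as the one named on the slide is the practical route once storage choices, data transfer costs and many instance types enter the model.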
  • 9. Resource allocation management
      The Atmosphere Cloud Platform is a one-stop management service for hybrid cloud resources, ensuring optimal deployment of application services on the underlying hardware.
      Customized applications may directly interface with Atmosphere via its RESTful API, called the Cloud Facade.
      [Diagram: users (Admin, Developer, Scientist) access the VPH-Share Master Int. and Development Mode; the VPH-Share Core Services Host runs the Atmosphere Management Service (AMS) with the Cloud Manager, Generic Invoker, Workflow management, Cloud stack plugins (Fog), the Atmosphere Internal Registry (AIR) and the Cloud Facade (secure RESTful API); an external application uses a Cloud Facade client; managed resources: an OpenStack/Nova computational cloud site (head node, worker nodes, image store (Glance)), Amazon EC2 and other CS]
      P. Nowakowski, T. Bartynski, T. Gubala, D. Harezlak, M. Kasztelnik, M. Malawski, J. Meizner, M. Bubak: Cloud Platform for Medical Applications, eScience 2012 (2012)
  • 10. Data reliability and integrity
      DRI is a tool which keeps track of binary data stored in a cloud infrastructure, monitors data availability and facilitates optimal deployment of application services in a hybrid cloud (bringing computations to data or the other way around).
      DRI Service: a standalone application service, capable of autonomous operation. It periodically verifies access to any datasets submitted for validation and is capable of issuing alerts to dataset owners and system administrators in case of irregularities.
      [Diagram: the DRI Service comprises a validation policy, a configurable validation runtime (registry-driven), a runtime layer and an extensible resource client layer; it works with the binary data registry and metadata extensions for DRI (register files, get metadata, migrate LOBs, get usage stats, etc.), with LOBCDER, and with distributed cloud storage (Amazon S3, OpenStack Swift, Cumulus) to store and marshal data; the VPH Master Int. offers a data management portlet (with DRI management extensions) and end-user features (browsing, querying, direct access to data, checksumming)]
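A validation pass of the kind described for the DRI Service (check that registered datasets are still reachable and unchanged, and raise alerts otherwise) might look like the following minimal sketch. It is not the DRI implementation: the in-memory `storage` dictionary stands in for a cloud storage backend, and the function names, dataset names and alert strings are hypothetical.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Checksum used to detect silent changes to stored binary data."""
    return hashlib.sha256(data).hexdigest()

def validate(registry, storage):
    """Compare every registered dataset against storage; return a list of alerts."""
    alerts = []
    for name, expected_checksum in registry.items():
        blob = storage.get(name)
        if blob is None:
            alerts.append((name, "unavailable"))
        elif sha256_hex(blob) != expected_checksum:
            alerts.append((name, "checksum mismatch"))
    return alerts

# Hypothetical datasets; the registry records checksums at registration time.
storage = {"scan-001": b"voxels", "scan-002": b"mesh", "scan-003": b"labels"}
registry = {name: sha256_hex(blob) for name, blob in storage.items()}

# Simulate irregularities before the next validation run.
del storage["scan-001"]
storage["scan-002"] = b"mesh (corrupted)"

alerts = validate(registry, storage)
print(alerts)
```

A periodic service would run `validate` on a schedule and route each alert to the dataset owner or an administrator, as the slide describes.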
  • 11. Data security in clouds Jan Meizner, Marian Bubak, Maciej Malawski, and Piotr Nowakowski: Secure storage and processing of confidential data on public clouds. In: Proceedings of the International Conference on Parallel Processing and Applied Mathematics (PPAM) 2013 • To ensure security of data in transit • Modern applications use secure transport protocols (e.g. TLS) • For legacy unencrypted protocols, if absolutely needed, or as an additional security measure: – Site-to-Site VPN, e.g. between cloud sites (set up outside of the instance) – Remote access – for individual users accessing e.g. from their laptops • Data should be securely stored and reliably deleted when no longer needed • Clouds alone are not secure enough: storage optimisations make it hard to ensure that data were actually deleted • A solution: – end-to-end encryption (decryption key stays in protected/private zone) – data dispersal (portions of data are dispersed between nodes, so it is non-trivial or impossible to recover the whole message)
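As an illustration of the data dispersal bullet, the following is a minimal sketch of one simple dispersal scheme: XOR-based splitting, in which every share is needed to rebuild the message and any smaller subset looks like random bytes. The slide does not say which scheme the authors use, so this is an assumed example rather than their method; the function names `split` and `combine` are hypothetical.

```python
import secrets
from functools import reduce
from typing import List


def _xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split(data: bytes, n: int) -> List[bytes]:
    """Split data into n shares; all n are required to recover it."""
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are pure random bytes ...
    shares = [secrets.token_bytes(len(data)) for _ in range(n - 1)]
    # ... and the last share is the data XORed with all of them.
    shares.append(reduce(_xor, shares, data))
    return shares


def combine(shares: List[bytes]) -> bytes:
    """Recover the original data by XORing all shares together."""
    return reduce(_xor, shares)
```

Each share could then be stored on a different node or cloud provider, for example `shares = split(b"confidential record", 3)`. Threshold schemes that tolerate the loss of some shares exist as well, but they are beyond this sketch.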
  • 12. • GworkflowDL language (with A. Hoheisel) • Dynamic, ad-hoc refinement of workflows based on semantic description in ontologies • Novelty – Abstract, functional blocks translated automatically into computation unit candidates (services) – Expansion of a single block into a subworkflow with proper concurrency and parallelism constructs (based on Petri Nets) – Runtime refinement: unknown or failed branches are re-constructed with different computation unit candidates T. Gubala, D. Harezlak, M. Bubak, M. Malawski: Semantic Composition of Scientific Workflows Based on the Petri Nets Formalism. In: "The 2nd IEEE International Conference on e-Science and Grid Computing", IEEE Computer Society Press, http://doi.ieeecomputersociety.org/10.1109/E-SCIENCE.2006.127, 2006 Semantic workflow composition
  • 13. • Design of a laboratory for virologists, epidemiologists and clinicians investigating the HIV virus and the possibilities of treating HIV-positive patients • Based on the notion of in-silico experiments built and refined by cooperating teams of programmers, scientists and clinicians • Novelty – Employed full concept-prototype-refinement-production cycle for virology tools – Set of dedicated yet interoperable tools binds together programmers and scientists for a single task – Support for system-level science with concept of result reuse between different experiments T. Gubala, M. Bubak, P. M. A. Sloot: Semantic Integration of Collaborative Research Environments, chapter XXVI in “Handbook of Research on Computational Grid Technologies for Life Sciences, Biomedicine and Healthcare”, Information Science Reference IGI Global 2009, ISBN: 978-1-60566-374-6, pages 514-530 Cooperative virtual laboratory for e-Science
  • 14. T. Gubala, K. Prymula, P. Nowakowski, M. Bubak: Semantic Integration for Model-based Life Science Applications. In: SIMULTECH 2013 Proceedings of the 3rd International Conference on Simulation and Modeling Methodologies, Technologies and Applications, Reykjavik, Iceland 29 - 31 July, 2013, pp. 74-81 • Concept of describing scientific domains for in-silico experimentation and collaboration within laboratories • Based on separation of the domain model, containing concepts of the subject of experimentation, from the integration model, regarding the method of (virtual) experimentation (tools, processes, computations) • Facets defined in the integration model are automatically mixed into concepts from the domain model: any piece of data may show any desired behavior • Proposed, designed and deployed the method for 3 domains of science: – Computational chemistry inside InSilicoLab chemistry portal – Sensor processing for early warning and crisis simulation in UrbanFlood EWS – Processing of results of massive bioinformatic computations for protein folding method comparison – Composition and execution of multiscale simulations – Setup and management of VPH applications Semantic integration for science domains
  • 15. GridSpace - platform for e-Science applications • Experiment: an e-science application composed of code fragments (snippets), expressed in either general-purpose scripting programming languages, domain-specific languages or purpose-specific notations. Each snippet is evaluated by a corresponding interpreter. • GridSpace2 Experiment Workbench: a web application - an entry point to GridSpace2. It facilitates exploratory development, execution and management of e-science experiments. • Embedded Experiment: a published experiment embedded in a web site. • GridSpace2 Core: a Java library providing an API for development, storage, management and execution of experiments. Records all available interpreters and their installations on the underlying computational resources. • Computational Resources: servers, clusters, grids, clouds and e-infrastructures where the experiments are computed. E. Ciepiela, D. Harezlak, J. Kocot, T. Bartynski, M. Kasztelnik, P. Nowakowski, T. Gubała, M. Malawski, M. Bubak: Exploratory Programming in the Virtual Laboratory. In: Proceedings of the International Multiconference on Computer Science and Information Technology, pp. 621-628, October 2010, the best paper award.
  • 16. Goal: Extending the traditional scientific publishing model with computational access and interactivity mechanisms; enabling readers (including reviewers) to replicate and verify experimentation results and browse large-scale result spaces. Challenges: Scientific: A common description schema for primary data (experimental data, algorithms, software, workflows, scripts) as part of publications; deployment mechanisms for on-demand reenactment of experiments in e-Science. Technological: An integrated architecture for storing, annotating, publishing, referencing and reusing primary data sources. Organizational: Provisioning of executable paper services to a large community of users representing various branches of computational science; fostering further uptake through involvement of major players in the field of scientific publishing. P. Nowakowski, E. Ciepiela, D. Harężlak, J. Kocot, M. Kasztelnik, T. Bartyński, J. Meizner, G. Dyk, M. Malawski: The Collage Authoring Environment. In: Proceedings of the International Conference on Computational Science, ICCS 2011 (2011), Winner of the Elsevier/ICCS Executable Paper Grand Challenge E. Ciepiela, D. Harężlak, M. Kasztelnik, J. Meizner, G. Dyk, P. Nowakowski, M. Bubak: The Collage Authoring Environment: From Proof-of-Concept Prototype to Pilot Service. In: Procedia Computer Science, vol. 18, 2013 Collage - executable e-Science publications
  • 17. • Goal: Extend the traditional way of authoring and publishing scientific methods with computational access and interactivity mechanisms, thus bringing reproducibility to scientific computational workflows and publications • Scientific challenge: Conceive a model and methodology to embrace reproducibility in scientific workflows and publications • Technological challenge: Support these with modern Internet technologies and available computing infrastructures • Solution proposed: • GridSpace2 – web-oriented distributed computing platform • Collage – authoring environment for executable publications [Timeline: Jun 2011, Dec 2011, Jun 2012] GridSpace2 / Collage - Executable e-Science Publications
  • 18. Results: • GridSpace2/Collage won the Executable Paper Grand Challenge in 2011 • Collage was integrated with the Elsevier ScienceDirect portal so papers can be linked and presented with corresponding computational experiments • Special Issue of Computers & Graphics journal featuring Collage-based executable papers was released in May 2013 • GridSpace2/Collage has been applied to multiple computational workflows in the scope of PL-Grid, PL-Grid Plus and Mapper projects E. Ciepiela, P. Nowakowski, J. Kocot, D. Harężlak, T. Gubała, J. Meizner, M. Kasztelnik, T. Bartyński, M. Malawski, M. Bubak: Managing entire lifecycles of e-science applications in the GridSpace2 virtual laboratory – from motivation through idea to operable web-accessible environment built on top of PL-Grid e-infrastructure. In: Building a National Distributed e-Infrastructure – PL-Grid, 2012 P. Nowakowski, E. Ciepiela, D. Harężlak, J. Kocot, M. Kasztelnik, T. Bartyński, J. Meizner, G. Dyk, M. Malawski: The Collage Authoring Environment. In: Procedia Computer Science, vol. 4, 2011 E. Ciepiela, D. Harężlak, M. Kasztelnik, J. Meizner, G. Dyk, P. Nowakowski, M. Bubak: The Collage Authoring Environment: From Proof-of-Concept Prototype to Pilot Service. In: Procedia Computer Science, vol. 18, 2013 GridSpace2 / Collage - Executable e-Science Publications
  • 19. Common Information Space (CIS) • Facilitates creation, deployment and robust operation of Early Warning Systems in a virtualized cloud environment • Early Warning System (EWS): any system working according to four steps: monitoring, analysis, judgment, action (e.g. environmental monitoring) B. Balis, M. Kasztelnik, M. Bubak, T. Bartynski, T. Gubala, P. Nowakowski, J. Broekhuijsen: The UrbanFlood Common Information Space for Early Warning Systems. In: Elsevier Procedia Computer Science, vol. 4, pp. 96-105, ICCS 2011. Common Information Space • connects distributed components into an EWS and deploys it on a cloud • optimizes resource usage taking into account the EWS importance level • provides EWS monitoring and self-monitoring • equipped with autohealing
  • 20. • Simple yet expressive model for complex scientific apps • App = set of processes performing well-defined functions and exchanging signals HyperFlow model JSON serialization { "name": "...", ← name of the app "processes": [ ... ], ← processes of the app "functions": [ ... ], ← functions used by processes "signals": [ ... ], ← exchanged signals info "ins": [ ... ], ← inputs of the app "outs": [ ... ] ← outputs of the app } • Supports a rich set of workflow patterns • Suitable for various application classes • Abstracts from other distributed app aspects (service model, data exchange model, communication protocols, etc.) HyperFlow: model & execution engine
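To make the JSON serialization above more concrete, here is a small sketch that builds an app description with the six top-level keys named on the slide and checks that none is missing. Only those six keys come from the slide; the contents of each entry (the process, function and signal fields, and the names `square`, `sqr`, `number`, `squared`) are assumptions chosen for illustration and may not match the actual HyperFlow schema.

```python
import json

# Top-level keys listed on the slide.
REQUIRED_KEYS = ("name", "processes", "functions", "signals", "ins", "outs")


def missing_keys(app: dict) -> list:
    """Return the required top-level keys that are absent from an app description."""
    return [key for key in REQUIRED_KEYS if key not in app]


# Hypothetical one-process app: inner field names are illustrative assumptions.
app = {
    "name": "demo",
    "processes": [
        {"name": "square", "function": "sqr", "ins": ["number"], "outs": ["squared"]}
    ],
    "functions": [{"name": "sqr"}],
    "signals": [{"name": "number"}, {"name": "squared"}],
    "ins": ["number"],
    "outs": ["squared"],
}

# JSON text form of the app description.
serialized = json.dumps(app, indent=2)
```

A check such as `missing_keys(json.loads(serialized))` returning an empty list is only a structural sanity test; it says nothing about whether an execution engine would accept the document.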
  • 21. • HyperFlow model & engine for distributed apps • App optimization & scheduling • Autoscaling and dynamic app reconfiguration • Multi-cloud resource provisioning [Diagram: execution platform and provisioning platform; elements shown: composite app (app model, app state, initial deployment), HyperFlow Enactment Engine (enact, execute), Executor on cloud VMs, input data, trigger app execution, Monitoring (measurements), Autoscaler (scaling rules), Optimizer & Scheduler (reconfigure app), Provisioner (start/stop/reconfigure VM)] Platform for distributed applications
  • 22. Objectives • Provide means for ad-hoc metadata model creation and deployment of corresponding storage facilities • Create a research space for metadata model exchange and discovery with associated data repositories with access restrictions in place • Support different types of storage sites and data transfer protocols • Support the exploratory paradigm by making the models evolve together with data Architecture • Web Interface is used by users to create, extend and discover metadata models • Model repositories are deployed in the PaaS Cloud layer for scalable and reliable access from computing nodes through REST interfaces • Data items from Storage Sites are linked from the model repositories Collaborative metadata management
  • 23. • MAPPER Memory (MaMe): a semantics-aware persistence store to record metadata about models and scales • Multiscale Application Designer (MAD): visual composition tool transforming high-level description into executable experiment • GridSpace Experiment Workbench (GridSpace): execution and result management of experiments [Diagram: MaMe, MAD and GridSpace; choose/add/delete; Mapper A, Mapper B, Submodule A, Submodule B] K. Rycerz, E. Ciepiela, G. Dyk, D. Groen, T. Gubala, D. Harezlak, M. Pawlik, J. Suter, S. Zasada, P. Coveney, M. Bubak: Support for Multiscale Simulations with Molecular Dynamics, Procedia Computer Science, Volume 18, 2013, pp. 1116-1125, ISSN 1877-0509 K. Rycerz, M. Bubak, E. Ciepiela, D. Harezlak, T. Gubala, J. Meizner, M. Pawlik, B. Wilk: Composing, Execution and Sharing of Multiscale Applications, submitted to Future Generation Computer Systems, after 1st review (2013) K. Rycerz, M. Bubak, E. Ciepiela, M. Pawlik, O. Hoenen, D. Harezlak, B. Wilk, T. Gubala, J. Meizner, and D. Coster: Enabling Multiscale Fusion Simulations on Distributed Computing Resources, submitted to PLGrid PLUS book 2014 • A method and an environment for composing multiscale applications from single-scale models • Validation of the method against real applications structured using tools • Extension of application composition techniques to multiscale simulations • Support for multisite execution of multiscale simulations • Proof-of-concept transformation of high-level formal descriptions into actual execution using e-infrastructures Multiscale programming and execution tools
  • 24. Research on Feature Modeling: • modelling the component hierarchy of eScience application families • modelling requirements • methods of mapping Feature Models to Software Product Line architectures Research on adapting Software Product Line principles in scientific software projects: • automatic composition of distributed eScience applications based on Feature Model configuration • architectural design of Software Product Line engine framework B. Wilk, M. Bubak, M. Kasztelnik: Software for eScience: from feature modeling to automatic setup of environments, Advances in Software Development, Scientific Papers of the Polish Information Processing Society Scientific Council, 2013, pp. 83-96 Building scientific software based on Feature Model
  • 25. • CrossGrid (2002-2005): Interactive compute- and data-intensive applications • K-Wf Grid (2004-2007): Knowledge-based composition of grid workflow applications • CoreGRID (2004-2008): Problem solving environments, programming models for grid applications • GREDIA (2006-2009): Grid platform for media and banking applications • ViroLab (2006-2009): Script-based composition of applications, GridSpace virtual laboratory • PL-Grid, PL-Grid Plus (2009-2015): Advanced virtual laboratory, DataNet – metadata models (2 large Polish projects) • gSLM (2009-2012): Service level management for grid and clouds • UrbanFlood (2009-2012): Common Information Space for Early Warning Systems • MAPPER (2010-2013): Computational strategies, software and services for distributed multiscale simulations • VPH-Share (2011-2015): Federating cloud resources for VPH compute- and data-intensive applications • Collage (2011-2013): Executable Papers; 1st award of Elsevier Competition at ICCS2011 (Elsevier project) • ISMOP (2013-2016): Management of cloud resources, workflows, big data storage and access, analysis tools (MCBiR) • PaaSage (2013-2016): Optimization of workflow applications on cloud resources DICE team in EU projects
  • 26. • Optimization of service deployment on clouds – Constraint satisfaction and optimization of multiple criteria (cost, performance) – Static deployment planning and dynamic auto-scaling • Billing and accounting model – Adapted for the federated cloud infrastructure – Handle multiple billing models • Supporting system-level (e)Science – tools for effective scientific research and collaboration – advanced scientific analyses using HPC/HTC resources • Cloud security – security of data transfer – reliable storage and removal of the data • Cross-cloud service deployment based on container model Topics for collaboration dice.cyfronet.pl