Model-Driven Cloud Data Storage
Juan Castrejón, Genoveva Vargas-Solar, Christine Collet, Rafael Lozano
Université de Grenoble, CNRS, Grenoble INP, Tecnológico de Monterrey




CloudMDE 2012




Background
•  Cloud computing (NIST-2011)
   •  Utility computing model for enabling ubiquitous, convenient, on-
      demand network access to a shared pool of configurable resources

•  Cloud data storage (Ruiz-2011, Armbrust-2009)
   •  Store, retrieve and manage large amounts of data, using highly
      scalable distributed infrastructures


•  Polyglot persistence (Fowler-2011)
   •  Different data storage technologies for different kinds of data
   •  Each storage mechanism introduces a new interface to be learned
   •  To get decent performance, you have to understand a lot about
      how the technology works




Background
•  Variety of data storage models and implementations
 (Cattell-2011, Edlich-2012)
  •  Models: key-value, document, extensible record, graph, blob,
     object, queue, XML, relational
  •  Implementations: Redis, Voldemort, MongoDB, CouchDB,
     Cassandra, Neo4j, db4o, eXist-db, etc. (As of today, over 120 options)


•  Cloud deployment environments (Ruiz-2011)
   •  Different combinations of pricing, support, service level
      agreements, and management APIs
   •  Public providers (Amazon, Windows Azure, Xeround, etc.)
   •  Private providers (Eucalyptus, OpenNebula, etc.)

Use the right tool for the right job…




How do I know which is the right tool for the right job?
(Katsov-2012)




Problem
•  How to specify data requirements for cloud environments?


•  For a set of data requirements, how to choose an
 appropriate combination of cloud storage system
 implementation and deployment provider?

•  How to generate/manage everything that’s required to
 work with the selection that I make?




Existing solutions
•  Integration of cloud storage platforms (Livenson-2011)
    •  Cloud Data Management Interface (CDMI) (SNIA-2011) proxy to
       integrate blob and queue data stores
•  Data integration over NoSQL stores (Curé-2011)
   •  Integration of relational and NoSQL databases (document, column)
   •  Focus on efficient answering of queries
•  Storage provider selection (Ruiz-2011, Ruiz-2012)
   •  Characterize storage provider features (e.g., performance, cost)
   •  Specify requirements for application datasets (e.g., expected size,
      access latency, concurrent clients)
   •  Based on the previous information, an assignment of datasets to
      different storage systems is proposed
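
The selection idea above can be sketched as a simple matching step. This is only an illustration, not the decision procedure of the cited work: the provider names, feature fields, numbers and the "cheapest feasible provider" rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    # Hypothetical provider features (all values are made up).
    name: str
    latency_ms: float    # typical access latency
    max_size_gb: float   # largest dataset the offer accepts
    max_clients: int     # concurrent clients supported
    cost_per_gb: float   # monthly storage price


@dataclass(frozen=True)
class Dataset:
    # Hypothetical dataset requirements.
    name: str
    size_gb: float
    max_latency_ms: float
    clients: int


def assign(datasets, providers):
    """Map each dataset to the cheapest provider that meets its requirements.

    A dataset that no provider can serve is mapped to None.
    """
    plan = {}
    for ds in datasets:
        feasible = [
            p for p in providers
            if p.latency_ms <= ds.max_latency_ms
            and p.max_size_gb >= ds.size_gb
            and p.max_clients >= ds.clients
        ]
        best = min(feasible, key=lambda p: p.cost_per_gb, default=None)
        plan[ds.name] = best.name if best else None
    return plan


PROVIDERS = [
    Provider("blob-store", latency_ms=120, max_size_gb=5000, max_clients=1000, cost_per_gb=0.03),
    Provider("kv-store", latency_ms=10, max_size_gb=100, max_clients=5000, cost_per_gb=0.25),
]
DATASETS = [
    Dataset("session-cache", size_gb=5, max_latency_ms=20, clients=2000),
    Dataset("image-archive", size_gb=2000, max_latency_ms=500, clients=50),
    Dataset("hot-archive", size_gb=2000, max_latency_ms=20, clients=50),
]

print(assign(DATASETS, PROVIDERS))
```

Here "hot-archive" gets no assignment because no single provider is both large enough and fast enough, which is the kind of case where a combination of stores would have to be considered.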




Existing solutions
•  Modeling as a Service (Bruneliere-2010)
   •  Deploy and execute model-driven services over the Internet (SaaS)


•  Design and deploy applications in the cloud (Peidro-2011)
   •  Promotes graphical models to capture cloud requirements
   •  Models automatically deployed to PaaS and IaaS environments


•  Application design/execution in multiple clouds (Ardagna-2012)
  •  MDE quality-driven method for design, development and operation
  •  Monitoring and feedback system




Limitations of existing solutions
•  Support for a limited set of cloud storage interfaces


•  Data integration often relies heavily on the relational
 model

•  Limited information for the selection of data storage
 systems

•  Consideration for high-level cloud models (SaaS) but
 limited support for low-level models (PaaS and IaaS)




Objectives
1.  Provide adequate notations and environments to
   characterize cloud data storage requirements

2.  Select cloud data storage implementations and
   deployment providers

3.  Manage the artifacts required to work with
   different combinations of cloud storage implementations
   and providers




  Objectives
[Figure: cloud requirements are captured in conceptual models (high level of abstraction: conceptual models and environments). The selection process and artifacts management derive logical models and then physical models (low level of abstraction: storage implementations and providers).]




Proposed solution
•  Rely on Model-Driven Engineering (MDE) (Kent-2002) to:
   •  Characterize cloud storage requirements
   •  Encapsulate selection, administration and use of cloud data
      storage implementations


•  Why MDE?
   •  Avoid dependencies between high-level (data models) and low-
      level abstractions (storage implementations and providers)
   •  Emphasizes the use of modeling notations at different levels of abstraction
   •  Generation of low-level abstractions by using automatic
      transformation procedures




Objective 1: Data requirements for the cloud
•  Do traditional modeling notations (ER and UML diagrams)
 make sense for data storage in the cloud?
  •  Define or extend notations and environments for cloud data modeling
•  What requirements should a cloud data storage notation
 consider?
  •  Rely on quality standards (ISO/IEC SQuaRE, S-Cube) to guide this
    analysis. Example: performance, efficiency, portability, etc.
•  How to characterize the proposed requirements?
   •  Associate quality metrics relevant to (cloud) scenarios, based on
      the characteristics of the reference standard (Jureta-2010)
   •  Validate currently proposed metrics. For example: throughput, cost,
      access latency, etc.
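
One way to picture such a characterization is a requirements record that attaches metric targets to a modeled entity. The sketch below is hypothetical: the class names, metric names and threshold values are assumptions for illustration, and are not the notation proposed in the talk.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetricTarget:
    # A single quality target, e.g. "access latency at most 50 ms".
    metric: str
    limit: float
    higher_is_better: bool = False

    def satisfied_by(self, observed: float) -> bool:
        if self.higher_is_better:
            return observed >= self.limit
        return observed <= self.limit


@dataclass
class EntityRequirements:
    # Quality targets attached to one entity of the data model.
    entity: str
    targets: list = field(default_factory=list)

    def unmet(self, observed: dict) -> list:
        """Return the metrics that are missing or violate their target."""
        return [
            t.metric for t in self.targets
            if t.metric not in observed or not t.satisfied_by(observed[t.metric])
        ]


order = EntityRequirements("Order", [
    MetricTarget("throughput_ops_s", 500, higher_is_better=True),
    MetricTarget("access_latency_ms", 50),
    MetricTarget("cost_usd_month", 200),
])

print(order.unmet({"throughput_ops_s": 650, "access_latency_ms": 72}))
```

Treating a missing measurement as unmet is a design choice of this sketch; a real notation might instead distinguish "unknown" from "violated".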




Objective 2: Data storage selection
•  Based on the analysis of historical data and usage patterns
   •  Both in test applications and within systems generated in our modeling
      environment
•  Monitoring data is gathered in a non-intrusive manner
   •  AOP monitoring
   •  Monitor the behaviour of the selected implementation/providers, based
      on the metrics specified in the modeling environment
   •  Compare expected values and actual performance
•  Monitoring data is shared in an open, collaborative manner
   •  Used by our decision process
   •  Available for external users
•  Users could work, at the same time, with multiple combinations
 of storage implementations and providers
  •  Test the performance of the different combinations
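
The non-intrusive monitoring idea can be approximated in a few lines. The sketch below uses a Python decorator as a stand-in for an AOP aspect (the talk targets Java and Spring, so this is an analogy, not the authors' implementation); the operation names, the in-memory store and the expected latencies are assumptions.

```python
import functools
import time
from collections import defaultdict

# Operation name -> list of observed latencies in milliseconds.
OBSERVED = defaultdict(list)


def monitored(operation):
    """Record the wall-clock latency of each call without editing the wrapped code."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                OBSERVED[operation].append(elapsed_ms)
        return wrapper
    return decorate


def violations(expected_ms):
    """Return operations whose mean observed latency exceeds the expected value."""
    report = {}
    for op, limit in expected_ms.items():
        samples = OBSERVED.get(op)
        if samples:
            mean = sum(samples) / len(samples)
            if mean > limit:
                report[op] = mean
    return report


STORE = {}  # stand-in for a real storage implementation


@monitored("save")
def save(key, value):
    STORE[key] = value


@monitored("load")
def load(key):
    time.sleep(0.02)  # simulated slow read
    return STORE[key]


save("a", 1)
load("a")
print(sorted(violations({"save": 1000.0, "load": 5.0})))
```

The data-access functions stay unaware of the measurement, which mirrors the "compare expected values and actual performance" step: expectations come from the model, observations from the running system.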




Objective 3: Cloud artifacts management
•  Generate the low-level artifacts to work with data storage
 implementations and deployment providers
  •  Configuration files for deployment providers
  •  Data management interfaces (CDMI, Spring Data, etc.)


•  Different levels of transformation procedures
   •  From the high-level data model to an intermediate Domain Specific
      Language (DSL) (Liu-2011, SpringRoo-2012)
   •  From the intermediate DSL to configuration files, AOP monitoring
      aspects and data management interfaces (SpringData-2012)


•  MDE transformation techniques
   •  Model-to-Model (M2M), Model-to-Text (M2T)
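
The two transformation steps can be illustrated with a toy pipeline: a model-to-model step that maps a conceptual entity to a logical model, followed by a model-to-text step that prints it. Everything here is an assumption for illustration: the type mapping, the pluralization rule and the output syntax are made up and are not Spring Roo commands or any DSL named in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    type: str


@dataclass(frozen=True)
class Entity:
    name: str
    attributes: tuple


# Hypothetical mapping from conceptual types to document-store types.
TYPE_MAP = {"String": "string", "Integer": "int", "Date": "timestamp"}


def to_logical(entity):
    """M2M: conceptual entity -> logical 'collection' model (a plain dict)."""
    return {
        "collection": entity.name.lower() + "s",
        "fields": [(a.name, TYPE_MAP[a.type]) for a in entity.attributes],
    }


def to_text(logical):
    """M2T: logical model -> text in an illustrative, made-up DSL."""
    lines = ["collection " + logical["collection"] + " {"]
    lines += ["  field " + name + ": " + typ for name, typ in logical["fields"]]
    lines.append("}")
    return "\n".join(lines)


customer = Entity("Customer", (Attribute("name", "String"), Attribute("since", "Date")))
print(to_text(to_logical(customer)))
```

Keeping the two steps separate means a different target (for example a relational schema) would only need another M2M mapping and its own text template, while the conceptual model stays unchanged.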




Proof of concept (work in progress)
•  Extension - Model2Roo (http://code.google.com/p/model2roo/)

[Figure: (1) High-level abstractions: a UML class diagram is processed with Spring Roo into a Java web app that uses Spring Data. (2) Low-level abstractions: a graph database and a relational database.]




Preliminary results
•  Castrejón, J., Vargas-Solar, G., Collet, C., Lozano, R.:
 “Model-Driven Cloud Data Storage”. In: First International
 Workshop on Model-Driven Engineering on and for the
 Cloud (CloudMDE 2012). Co-located with ECMFA ’12.
 July 2012

•  Castrejón, J., Vargas-Solar, G., Lozano, R.: “Model2Roo:
 Web Application Development based on the Eclipse
 Modeling Framework and Spring Roo”. In: First Workshop
 on Academics Modeling with Eclipse (ACME 2012). Co-
 located with ECMFA ’12. July 2012




Demonstration / Questions



  Contact: Juan.Castrejon@imag.fr




References
•  Ardagna, D., Di Nitto, E., Casale, G., et al. MODACLOUDS, A Model-Driven Approach for the
     Design and Execution of Applications on Multiple Clouds. Models in Software Engineering
     Workshop (MiSE 2012). Co-located with ICSE ’12. (2012)
•    Armbrust, M., Fox, A., Griffith, R., Joseph, A.D., et al.: Above the Clouds: A Berkeley View of
     Cloud Computing (2009)
•    Bruneliere, H., Cabot, J., Jouault, F.: Combining model-driven engineering and cloud
     computing. In: Modeling, Design, and Analysis for the Service Cloud Workshop.
     MDA4ServiceCloud ’10 (2010)
•    Cattell, R.: Scalable SQL and NoSQL data stores. SIGMOD Rec. 39, 12–27 (May 2011)
•    Curé, O., Hecht, R., Le Duc, C., Lamolle, M.: Data Integration over NoSQL Stores Using
     Access Path Based Mappings. In: Proceedings of the 22nd International Conference on
     Database and Expert Systems Applications (DEXA 2011). Hameurlain et al. (Eds.), Part I,
     LNCS 6860, pp. 481–495 (2011)
•    Edlich, S.: List of NoSQL databases. http://nosqldatabase.org/ (March 2012)
•    Fowler, M.: Polyglot persistence. http://martinfowler.com/bliki/PolyglotPersistence.html
     (November 2011)
•    Jureta, I., Borgida, A., Ernst, N., Mylopoulos, J.: Techne: Towards a New Generation of
     Requirements Modeling Languages with Goals, Preferences, and Inconsistency Handling. In:
     Proceedings of the 18th IEEE International Requirements Engineering Conference. pp.
     115-124. RE 2010. IEEE Computer Society (2010)
•    Katsov, I.: NoSQL data modeling techniques.
     http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/ (March 2012)




References
•  Kent, S.: Model driven engineering. In: Butler, M., Petre, L., Sere, K. (eds.) Integrated Formal Methods,
     LNCS, vol. 2335, pp. 286–298. Springer Berlin (2002)
•    Lenzerini, M.: Data integration is harder than you thought. In: Proceedings of the 9th International
     Conference on Cooperative Information Systems. pp. 22-26. CoopIS ’01, Springer-Verlag, London, UK
     (2001)
•    Livenson, I., Laure, E.: Towards Transparent Integration of Heterogeneous Cloud Storage Platforms. In:
     Fourth International Workshop on Data Intensive Distributed Computing. DIDC ’11. Co-located with HPDC
     ’11 (2011)
•    Liu, D., Zic, J.: Cloud#: A specification language for modeling cloud. In: Proceedings of the 2011 IEEE 4th
     International Conference on Cloud Computing. pp. 533–540. CLOUD ’11, IEEE Computer Society,
     Washington, DC, USA (2011)
•    Peidro, J.E., Muñoz-Escoí, F.D.: Towards the next generation of model driven cloud platforms. In: 1st
     International Conference on Cloud Computing and Services Science. pp. 494–500. CLOSER ’11 (2011)
•    Ruiz-Alvarez, A., Humphrey, M.: An automated approach to cloud storage service selection. In: Proceedings
     of the 2nd international workshop on Scientific cloud computing. pp. 39–48. ScienceCloud ’11, ACM, New
     York, NY, USA (2011)
•    Ruiz-Alvarez, A., Humphrey, M.: A model and decision procedure for data storage in cloud computing. In:
     Proceedings of the IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing. CCGrid ’12
     (2012)
•    Storage Networking Industry Association (SNIA): Cloud data management interface (CDMI).
     http://www.snia.org/cdmi (September 2011)
•    SpringSource: Spring data projects. http://www.springsource.org/spring-data (March 2012)
•    SpringSource: Spring roo. http://www.springsource.org/spring-roo (March 2012)

Model-Driven Cloud Data Storage

  • 1. Model-Driven Cloud Data Storage Juan Castrejón, Genoveva Vargas-Solar, Christine Collet, Rafael Lozano Université de Grenoble, CNRS, Grenoble INP, Tecnológico de Monterrey CloudMDE 2012
  • 2. 2 Background •  Cloud computing (NIST-2011) •  Utility computing model for enabling ubiquitous, convenient, on- demand network access to a shared pool of configurable resources •  Cloud data storage (Ruiz-2011, Armbrust-2009) •  Store, retrieve and manage large amounts of data, using highly scalable distributed infrastructures •  Polyglot persistence (Fowler-2011) •  Different data storage technologies for different kinds of data •  Each storage mechanism introduces a new interface to be learned •  To get decent performance, you have to understand a lot about how the technology works
  • 3. 3 Background •  Variety of data storage models and implementations (Cattell-2011, Edlich-2012) •  Models: key-value, document, extensible record, graph, blob, object, queue, xml, relational •  Implementations: Redis, Voldemort, MongoDB, CouchDB, Cassandra, Neo4J, db4o, eXist-db, etc. (As of today, over 120 options) •  Cloud deployment environments (Ruiz-2011) •  Different combinations of pricing, support, service level agreements, and management APIs •  Public providers (Amazon, Windows Azure, Xeround, etc.) •  Private providers (Eucalyptus, OpenNebula, etc.)
  • 4. 4 Use the right tool for the right job… How do I know which is the right tool for the right job? (Katsov-2012)
  • 5. 5 Problem •  How to specify data requirements for cloud environments? •  For a set of data requirements, how to choose an appropriate combination of cloud storage system implementation and deployment provider? •  How to generate/manage everything that’s required to work with the selection that I make?
  • 6. 6 Existing solutions •  Integration of cloud storage platforms (Livenson-2011) •  Cloud Data Management Interface (CDMI) (SNIA-2011) proxy to integrate blob and queue data stores •  Data integration over NoSQL stores (Curé-2011) •  Integration of relational and NoSQL databases (Document, column) •  Focus on efficient answering of queries •  Storage provider selection (Ruiz-2011, Ruiz-2012) •  Characterize storage providers features (Ex: performance, cost) •  Specify requirements for application datasets (Ex: expected size, access latency, concurrent clients) •  Based on the previous information, an assignment of datasets to different storage systems is proposed
  • 7. 7 Existing solutions •  Modeling as a Service (Bruneliere-2010) •  Deploy and execute model-driven services over the Internet (SaaS) •  Design and deploy applications in the cloud (Peidro-2011) •  Promotes graphical models to capture cloud requirements •  Models automatically deployed to PaaS and IaaS environments •  Application design/execution in multiple clouds (Ardagna-2012) •  MDE quality-driven method for design, development and operation •  Monitoring and feedback system
  • 8. 8 Limitations of existing solutions •  Support for a limited set of cloud storage interfaces •  Data integration can be highly based on the relational model •  Limited information for the selection of data storage systems •  Consideration for high-level cloud models (SaaS) but limited support for low-level models (PaaS and IaaS)
  • 9. 9 Objectives 1.  Provide adequate notations and environments to characterize cloud data storage requirements 2.  Selection of cloud data storage implementations and deployment providers 3.  Management of the required artifacts to work with different combinations of cloud storage implementations and providers
  • 10. 10 Objectives Cloud requirements Conceptual High-level of abstraction models (Conceptual models and environments) Selection process Logical Logical Logical Artifacts management model model model Physical Physical Physical Low-level of abstraction model model model (Storage implementations and providers)
  • 11. 11 Proposed solution •  Rely on Model-Driven Engineering (MDE) (Kent-2002) to: •  Characterize cloud storage requirements •  Encapsulate selection, administration and use of cloud data storage implementations •  Why MDE? •  Avoid dependencies between high-level (data models) and low- level abstractions (storage implementations and providers) •  Emphasis on relying on different levels of modeling notations •  Generation of low-level abstractions by using automatic transformation procedures
  • 12. 12 Objective 1: Data requirements for the cloud •  Do traditional modeling notations (ER and UML diagrams) make sense for data storage in the cloud? •  Define-extend notations and environments for cloud data modeling •  What requirements should a cloud data storage notation consider? •  Rely on quality standards (ISO/IEC SQuaRE, S-Cube) to guide this analysis. Example: performance, efficiency, portability, etc. •  How to characterize the proposed requirements? •  Associate quality metrics relevant to (cloud) scenarios, based on the characteristics of the reference standard (Jureta-2010) •  Validate currently proposed metrics. For example: throughput, cost, access latency, etc.
  • 13. 13 Objective 2: Data storage selection •  Based on the analysis of historic data and usage patterns •  Both in test applications and within systems generated in our modeling environment •  Monitoring data is gathered in a non-intrusive manner •  AOP monitoring •  Monitor the behaviour of the selected implementation/providers, based on the metrics specified in the modeling environment •  Compare expected values and actual performance •  Monitoring data is shared in open/collaborative manner •  Used by our decision process •  Available for external users •  Users could work, at the same time, with multiple combinations of storage implementations and providers •  Test the performance of the different combinations
  • 14. 14 Objective 3: Cloud artifacts management •  Generate the low-level artifacts to work with data storage implementations and deployment providers •  Configuration files for deployment providers •  Data management interfaces (CDMI, Spring Data, etc.) •  Different levels of transformation procedures •  From the high-level data model to an intermediate Domain Specific Language (DSL) (Liu-2010, SpringRoo-2012) •  From the intermediate DSL to configuration files, AOP monitoring aspects and data management interfaces (SpringData-2012) •  MDE transformation techniques •  Model-to-Model (M2M), Model-to-Text (M2T)
  • 15. Proof of concept (work in progress)
      •  Extension of Model2Roo (http://code.google.com/p/model2roo/)
      •  [Figure labels: (1) High-level abstractions: UML class diagram, Spring Roo, Spring Data, Java web app. (2) Low-level abstractions: graph database, relational database.]
  • 16. Preliminary results
      •  Castrejón, J., Vargas-Solar, G., Collet, C., Lozano, R.: “Model-Driven Cloud Data Storage”. In: First International Workshop on Model-Driven Engineering on and for the Cloud (CloudMDE 2012). Co-located with ECMFA ’12. July 2012
      •  Castrejón, J., Vargas-Solar, G., Lozano, R.: “Model2Roo: Web Application Development based on the Eclipse Modeling Framework and Spring Roo”. In: First Workshop on Academics Modeling with Eclipse (ACME 2012). Co-located with ECMFA ’12. July 2012
  • 17. Demonstration / Questions
      •  Contact: Juan.Castrejon@imag.fr
  • 18. References
      •  Ardagna, D., Di Nitto, E., Casale, G., et al.: MODACLOUDS, A Model-Driven Approach for the Design and Execution of Applications on Multiple Clouds. In: Models in Software Engineering Workshop (MiSE 2012). Co-located with ICSE ’12 (2012)
      •  Armbrust, M., Fox, A., Griffith, R., Joseph, A.D., et al.: Above the Clouds: A Berkeley View of Cloud Computing (2009)
      •  Bruneliere, H., Cabot, J., Jouault, F.: Combining model-driven engineering and cloud computing. In: Modeling, Design, and Analysis for the Service Cloud Workshop. MDA4ServiceCloud ’10 (2010)
      •  Cattell, R.: Scalable SQL and NoSQL data stores. SIGMOD Rec. 39, 12–27 (May 2011)
      •  Curé, O., Hecht, R., Le Duc, C., Lamolle, M.: Data Integration over NoSQL Stores Using Access Path Based Mappings. In: Hameurlain et al. (eds.) Proceedings of the 22nd International Conference on Database and Expert Systems Applications (DEXA 2011), Part I, LNCS 6860, pp. 481–495 (2011)
      •  Edlich, S.: List of NoSQL databases. http://nosqldatabase.org/ (March 2012)
      •  Fowler, M.: Polyglot persistence. http://martinfowler.com/bliki/PolyglotPersistence.html (November 2011)
      •  Jureta, I., Borgida, A., Ernst, N., Mylopoulos, J.: Techne: Towards a New Generation of Requirements Modeling Languages with Goals, Preferences, and Inconsistency Handling. In: Proceedings of the 18th IEEE International Requirements Engineering Conference. pp. 115–124. RE 2010. IEEE Computer Society (2010)
      •  Katsov, I.: NoSQL data modeling techniques. http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/ (March 2012)
  • 19. References
      •  Kent, S.: Model driven engineering. In: Butler, M., Petre, L., Sere, K. (eds.) Integrated Formal Methods, LNCS, vol. 2335, pp. 286–298. Springer Berlin (2002)
      •  Lenzerini, M.: Data integration is harder than you thought. In: Proceedings of the 9th International Conference on Cooperative Information Systems. pp. 22–26. CoopIS ’01, Springer-Verlag, London, UK (2001)
      •  Livenson, I., Laure, E.: Towards Transparent Integration of Heterogeneous Cloud Storage Platforms. In: Fourth International Workshop on Data Intensive Distributed Computing. DIDC ’11. Co-located with HPDC ’11 (2011)
      •  Liu, D., Zic, J.: Cloud#: A specification language for modeling cloud. In: Proceedings of the 2011 IEEE 4th International Conference on Cloud Computing. pp. 533–540. CLOUD ’11, IEEE Computer Society, Washington, DC, USA (2011)
      •  Peidro, J.E., Muñoz-Escoí, F.D.: Towards the next generation of model driven cloud platforms. In: 1st International Conference on Cloud Computing and Services Science. pp. 494–500. CLOSER ’11 (2011)
      •  Ruiz-Alvarez, A., Humphrey, M.: An automated approach to cloud storage service selection. In: Proceedings of the 2nd International Workshop on Scientific Cloud Computing. pp. 39–48. ScienceCloud ’11, ACM, New York, NY, USA (2011)
      •  Ruiz-Alvarez, A., Humphrey, M.: A model and decision procedure for data storage in cloud computing. In: Proceedings of the IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing. CCGrid ’12 (2012)
      •  Storage Networking Industry Association (SNIA): Cloud data management interface (CDMI). http://www.snia.org/cdmi (September 2011)
      •  SpringSource: Spring Data projects. http://www.springsource.org/spring-data (March 2012)
      •  SpringSource: Spring Roo. http://www.springsource.org/spring-roo (March 2012)