1. Project COLA: Use Case to create a scalable application in the cloud based on the MICADO concept
Peter Kacsuk
SZTAKI
kacsuk@sztaki.hu
2. Introduction to the application
The application is Data Avenue:
• This application can:
• transfer data from one data storage to another
• upload data from your computer to a data storage
• download data from a data storage to your computer
• The data storage can be any of the following types:
• HTTP, HTTPS, SFTP, GSIFTP, SRM, iRODS and S3
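Conceptually, each transfer streams data between two storage endpoints through the Data Avenue service. Below is a minimal Python sketch of such a chunked copy; the file-like adaptor interface and the chunk size are assumptions for illustration, not the actual Data Avenue code:

```python
import io

def transfer(source, destination, chunk_size=64 * 1024):
    """Stream data from one storage endpoint to another in fixed-size chunks.

    `source` and `destination` are binary file-like objects; in a real
    service they would be protocol adaptors (HTTP, SFTP, S3, ...) behind
    a common read/write interface (an assumption for this sketch).
    """
    copied = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        destination.write(chunk)
        copied += len(chunk)
    return copied  # bytes transferred, usable e.g. for a progress bar

# Example: "transfer" 1 MiB between two in-memory storages.
src = io.BytesIO(b"x" * 1024 * 1024)
dst = io.BytesIO()
assert transfer(src, dst) == 1024 * 1024
assert dst.getvalue() == b"x" * 1024 * 1024
```

Because the copy is streamed chunk by chunk, the service never has to hold a whole file in memory, which matters once many transfers run in parallel.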
3. Data Avenue services
[Architecture diagram: Java or other application code (Java API or WS API), the Data Avenue portlet, and gateways call the Data Avenue Blacktop service at SZTAKI through a web service interface; the service reaches storages (FS1, FS2, ..., FSn) hosted on OpenStack, OpenNebula and Amazon via storage-related protocols.]
4. Introduction to the application
Detailed objectives:
• We want to create a scalable service out of this application, so that users from all over the world can access the service to transfer their data from one data storage to another, provided that they have access rights to the storages involved.
• The current service can be accessed at:
• https://data-avenue.eu/en_GB/
10. The Data Avenue Concept
[Screenshot of the Data Avenue user interface: copying between different storages, a progress bar, and directory contents (files and directories).]
11. Problems with the current service
• It uses a single Data Avenue application instance, which becomes a bottleneck in the following cases:
• One user initiates many transfers in parallel
• Many users initiate single transfers in parallel
• Many users initiate many transfers in parallel
12. Solution
• The Data Avenue application can run in several instances in parallel in the cloud
• If a single Data Avenue (DA) instance becomes a bottleneck, a new DA is instantiated in the cloud and some of the required data transfers are automatically redirected to the new DA instance
• In order to organize all the services as a set of coordinated services, we need an information service => Consul service
• In order to observe whether a DA instance becomes overloaded, we need a monitoring tool => Prometheus service
• In order to deploy and manage all the services as a set of coordinated services, we need a cloud orchestrator service => Occopus service
• In order to evenly exploit the DA instances, we need a load balancer service that directs the users' DA service calls evenly to the available DA instances => HAProxy service
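To make the Consul role concrete: each newly started DA instance registers itself with the local Consul agent so that the other services can discover it. A sketch of building the registration payload for Consul's `/v1/agent/service/register` HTTP API follows; the service name, port and health-check URL are illustrative assumptions, not values taken from the slides:

```python
import json

def consul_registration(instance_id, address, port):
    """Build the JSON body for Consul's /v1/agent/service/register endpoint
    so that a new DA instance becomes discoverable.

    The service name "data-avenue" and the /health check URL are
    assumptions for this sketch.
    """
    return {
        "ID": instance_id,
        "Name": "data-avenue",  # all DA instances share one logical service name
        "Address": address,
        "Port": port,
        "Check": {
            # Consul polls this URL; failing instances drop out of discovery.
            "HTTP": "http://{}:{}/health".format(address, port),
            "Interval": "10s",
        },
    }

payload = consul_registration("da-2", "10.0.0.12", 8080)
print(json.dumps(payload, indent=2))
```

In the running system this payload would be PUT to the agent's registration endpoint; here only the payload construction is shown.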
14. More details for monitoring
[Diagram: the Alert Manager and Executor components of the monitoring pipeline.]
Available as a tutorial on the Occopus web page: http://occopus.lpds.sztaki.hu/tutorials
It will soon be available as a tutorial on the COLA web page as well.
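At its core, the monitoring loop is a threshold decision: when Prometheus reports sustained overload, an alert fires and the executor asks the orchestrator to add a DA node. A hedged Python sketch of that decision follows; the thresholds and the averaging rule are illustrative assumptions, since in MICADO they live in Prometheus alert rules and the action is carried out by Occopus:

```python
def scaling_decision(cpu_loads, scale_up_at=0.8, scale_down_at=0.2, min_nodes=1):
    """Decide whether to add or remove a DA node from per-node CPU loads.

    Thresholds are illustrative assumptions, not MICADO's actual rules.
    """
    avg = sum(cpu_loads) / len(cpu_loads)
    if avg > scale_up_at:
        return "scale-up"    # overloaded: instantiate a new DA node
    if avg < scale_down_at and len(cpu_loads) > min_nodes:
        return "scale-down"  # underused: destroy one DA node
    return "keep"

print(scaling_decision([0.9, 0.95]))       # -> scale-up (both nodes overloaded)
print(scaling_decision([0.05, 0.1, 0.1]))  # -> scale-down (cluster mostly idle)
```

The `min_nodes` floor prevents the last DA instance from being scaled away, so the service always stays reachable.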
17. Limitations
• It uses VMs for every service, and hence the deployment time of the whole infrastructure is 8-10 minutes on CloudSigma
• Good scalability results can be achieved only if the storage used has a scalable access mechanism. For example, S3 cloud storage enables scalability, but HTTPS does not.
18. Docker container based architecture
• In order to reduce deployment time, we placed every service in a separate Docker container.
• The modifications required:
• A Dockerized version of all the services (HAProxy, Prometheus, Consul)
• A Dockerized version of the data node application (DA)
• Modification of the cloud-init file that contains the description of the cloud-related parameters (the cloud-init file became much simpler)
19. General architecture adaptable for other applications -> MICADO v0
• If the DA description in the cloud-init file is replaced by another service-oriented application, the architecture can be used for other applications, too.
• Therefore this architecture can be considered the first version of the MICADO architecture -> MICADO v0
20. Towards a more generic MICADO architecture -> MICADO v1
• Goal:
• create a MICADO architecture in which the application can be accommodated without modifying cloud-init
• To achieve this, we introduce a Docker cluster, and the applications will run in Docker containers within this cluster
• Therefore this architecture can be considered the improved version of the MICADO v0 architecture -> MICADO v1
• MICADO v1 has two alternatives:
• A. 2-layered MICADO architecture without a load balancer layer
• B. 3-layered MICADO architecture with a load balancer layer
22. Assessment of MICADO v1/a
Advantage:
• Adaptable for different applications without modifying the cloud-init file
• Scales up and down the number of data nodes according to the actual
load of the data nodes
• Guarantees the balanced usage of data nodes (responsibility of the
Docker Master)
Disadvantage:
• No guarantee that the data nodes are used in a balanced way to serve
user requests
• All the data nodes should run the same applications
24. Assessment of MICADO v1/b
Advantage:
• Adaptable for different applications without modifying the cloud-init file
• Scales up and down the number of data nodes according to the actual
load of the data nodes
• Guarantees the balanced usage of data nodes (responsibility of the
Docker Master)
• Guarantees that the data nodes are used in a balanced way to serve
user requests (responsibility of the HAProxy services)
Disadvantage:
• All the data nodes should run the same applications
Future work:
• MICADO v2 to enable the balanced and scalable usage of different
applications
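The balanced usage that the HAProxy layer guarantees in MICADO v1/b can be illustrated with a minimal round-robin dispatcher. This is a simplified sketch of the balancing idea only, not HAProxy's actual implementation, and the instance names are made up:

```python
import itertools

class RoundRobinBalancer:
    """Dispatch incoming requests evenly across DA instances,
    mimicking (in simplified form) what the HAProxy layer does
    for user requests in MICADO v1/b."""

    def __init__(self, instances):
        # Cycle endlessly over the available DA instances.
        self._cycle = itertools.cycle(instances)

    def dispatch(self):
        """Return the DA instance that should serve the next request."""
        return next(self._cycle)

balancer = RoundRobinBalancer(["da-1", "da-2", "da-3"])
assignments = [balancer.dispatch() for _ in range(6)]
print(assignments)  # -> ['da-1', 'da-2', 'da-3', 'da-1', 'da-2', 'da-3']
```

Each instance receives the same share of requests, which is exactly the per-request balancing guarantee that v1/a lacks and the HAProxy layer of v1/b adds.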
25. Acknowledgement
• MICADO is being developed in strong collaboration between the SZTAKI and UoW teams.
• SZTAKI team: Prof. Peter Kacsuk
• Dr. Jozsef Kovacs
• Enikő Nagy
• Attila Farkas
• Dr. Robert Lovas
• Dr. Zoltan Farkas
• UoW team: Dr. Tamas Kiss
• Prof. Gabor Terstyanszky
• Botond Rákoczi
• Gregoire Gesmier
• Gabriele Pierantoni
26. For more information please visit
www.project-cola.eu
twitter.com/projectCOLA
facebook.com/projectCOLA
Thank you!