Over the past decade, the field of cloud computing has been the focus of intensive research. In this paper, we propose a framework that simulates the architectural setup of a cloud environment and examine how it can leverage Apriori and sequential pattern based recommendation algorithms implemented in R. Furthermore, we present a multi-layered application, covering its backend architecture, its user interface built using the responsive web design technique, and its development workflow. The proposed system was also exhaustively load tested using Apache JMeter to ensure its reliability at scale, and the experimental results are presented.
A Public Cloud Based SOA Workflow for Machine Learning Based Recommendation Algorithms
1. A Public Cloud Based SOA Workflow for Machine Learning Based Recommendation Algorithms
Presented By
Srinivasan Thanukrishnan, Founder & CEO
Ram G Athreya, Research Intern
Glosys Technology Solutions Pvt. Ltd.
Chennai, India
5th IEEE International Conference on Cloud & Service Computing – SC2 2015
3. Challenges
• Existing workflows for web based (SOA)
applications elaborate only on certain aspects
such as
– Cloud Computing
– Backend
– Frontend
– Development & Testing
– Machine Learning
• How do all these work together at a big picture
level?
4. Motivation
• Our aim is to combine these vast and disparate fields and
provide a cohesive framework for building such applications
• We propose a multi layered architecture that will simulate
the structure of a cloud environment in terms of frontend and
backend, and examine how it can leverage Machine Learning
• For completeness we created an actual Retail Application
which integrates the above technologies
5. Proposed System
• It comprises three major modules which are
– Product Information System (PIS)
– Analytics Based Inventory Management (ABIM)
– Transaction Based Analytics (TBA)
• To build a framework that will simulate the architectural
setup of an e-commerce site
• To examine how it can improve its sales by employing
machine intelligence
• To derive a general workflow on how such systems can be
built end-to-end starting from the user interface up to the
machine learning algorithm that powers it in the backend
6. Design Components
• Cloud Architecture
• Back End Application Stack
• User Interface Design
• Development Environment
• Load Testing
• Machine Learning Based Recommendation
Algorithms
7. Cloud Architecture
• Core components of the Cloud Architecture
are
– Content Delivery Network (CDN)
– Load Balancer
– Server Instances
– Storage Services
8. Content Delivery Network (CDN)
• It is a large distributed
network of servers
across geographies
• It serves static assets such as
images, CSS and JS
• The CDN caches the responses
to these requests
• Thus the load on the origin server
is reduced
9. Load Balancers
• A load balancer distributes traffic to optimize resource use, maximize
throughput and minimize response time
• It employs algorithms such as round-robin or least connections to route
internet traffic
• It also provides the ability to auto-scale
• During a traffic spike it automatically increases the number of
application server instances
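The routing and scaling behaviour described above can be sketched in a few lines. This is a minimal illustration in Python, not the HAProxy configuration used by the system; the class name, the backend names and the capacity-based scaling rule are all assumptions made for the example.

```python
import math
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in a fixed rotating order (hypothetical helper)."""

    def __init__(self, backends):
        self._rotation = cycle(list(backends))

    def pick(self):
        # Each call returns the next backend, wrapping around at the end.
        return next(self._rotation)


def desired_instances(requests_per_second, capacity_per_instance, minimum=2):
    """Assumed scaling rule: enough instances to absorb the current load,
    but never fewer than `minimum`."""
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(minimum, needed)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
order = [balancer.pick() for _ in range(4)]

normal_load = desired_instances(100, capacity_per_instance=200)
traffic_spike = desired_instances(950, capacity_per_instance=200)
```

Here the fourth request wraps back to the first backend, and the spike raises the instance count from the minimum of 2 to 5.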
10. Server Instances
• Each instance generates responses with the help of backing services
• It contains only application code, which is version
controlled
• Creating a new instance is as simple as checking out the
latest version of the codebase and deploying it within an
instance
• They are commodity servers which can be scaled on
demand
12. Storage Services
• To protect the database and ensure its availability, a
Master-Slave setup is required
• All database write operations happen at the master
and are replicated to the slaves, while the read
operations are carried out on the slave instances
• If the master fails, one of the slave nodes is promoted to
become the new master
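The read/write split and the failover step can be sketched with a toy in-memory model. This is only an illustration in Python under simplifying assumptions: the class and method names are hypothetical, dictionaries stand in for MySQL instances, and replication is done synchronously, whereas MySQL replication is asynchronous by default.

```python
import random


class ReplicatedDatabase:
    """Toy model of one master and several slaves (hypothetical helper)."""

    def __init__(self, slave_count=2):
        self.master = {}
        self.slaves = [{} for _ in range(slave_count)]

    def write(self, key, value):
        # All writes happen at the master first...
        self.master[key] = value
        # ...and are then replicated to every slave.
        for slave in self.slaves:
            slave[key] = value

    def read(self, key):
        # Reads are served by a slave, never by the master.
        return random.choice(self.slaves).get(key)

    def fail_over(self):
        # If the master is lost, promote one slave to be the new master.
        self.master = self.slaves.pop(0)


db = ReplicatedDatabase(slave_count=2)
db.write("sku-1", 10)
before_failure = db.read("sku-1")

db.fail_over()
db.write("sku-2", 5)
after_failure = db.read("sku-2")
```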
13. Storage Services
• The cache lies between the application servers and
the database
• It uses in-memory (RAM) storage
• This speeds up requests since fewer
queries hit the database
• AWS S3 was used to store static assets such as
images, CSS and JS
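One common way to place a cache between the application and the database is the cache-aside pattern, sketched below in Python. Whether the system used exactly this pattern is an assumption; the class name is hypothetical, and plain dictionaries stand in for Redis and MySQL.

```python
class CachedStore:
    """Cache-aside lookup: check the cache first, fall back to the database."""

    def __init__(self, database):
        self.database = database   # stand-in for the MySQL database
        self.cache = {}            # stand-in for the in-memory Redis cache
        self.database_hits = 0

    def get(self, key):
        if key in self.cache:
            # Cache hit: the database is not touched at all.
            return self.cache[key]
        # Cache miss: query the database and remember the answer.
        self.database_hits += 1
        value = self.database.get(key)
        self.cache[key] = value
        return value

    def update(self, key, value):
        # Write to the database and drop the stale cache entry.
        self.database[key] = value
        self.cache.pop(key, None)


store = CachedStore({"p1": "Phone"})
first = store.get("p1")    # miss, goes to the database
second = store.get("p1")   # hit, served from the cache
hits_after_two_reads = store.database_hits

store.update("p1", "Tablet")
third = store.get("p1")    # miss again, because the entry was invalidated
```

Two reads of the same key cost only one database query, which is the effect the slide describes.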
15. Back End Application Stack
• Model-View-Controller (MVC)
– Promotes the principle of ‘separation of concerns’
– The Model is responsible for managing the data required
by the application
– The View is responsible for presentation of data triggered
by a Controller action
– Template systems are used to embed dynamic data within
the HTML structure of the View
– The Controller is responsible for responding to user
requests
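The division of responsibilities can be sketched as follows. The system itself used Express.js with Jade templates; this Python version is only an illustration, and every class, function and product shown is hypothetical.

```python
from string import Template


class ProductModel:
    """Model: owns the data the application needs."""

    def __init__(self):
        self._products = {1: {"name": "Smartphone", "price": 199}}

    def find(self, product_id):
        return self._products.get(product_id)


# View: a template that embeds dynamic data in an HTML structure.
# A real template engine would also escape the values it inserts.
PRODUCT_VIEW = Template("<h1>$name</h1><p>Price: $price</p>")


def render_product(product):
    return PRODUCT_VIEW.substitute(name=product["name"], price=product["price"])


class ProductController:
    """Controller: responds to a user request by combining model and view."""

    def __init__(self, model):
        self.model = model

    def show(self, product_id):
        product = self.model.find(product_id)
        if product is None:
            return 404, "<h1>Not found</h1>"
        return 200, render_product(product)


controller = ProductController(ProductModel())
found = controller.show(1)
missing = controller.show(2)
```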
18. HTML
• HTML is essentially a set of tags within which content is placed
• A document starts with the <html> tag
• It has two major sections, <head> and
<body>
• <head> contains metadata
• <body> contains all the content
19. CSS
• CSS controls presentation through rules that
are applied to HTML elements via selectors
• Additionally, LESS, a CSS pre-processor, is
used
• LESS provides additional features such as
variables, functions and mixins
• This makes CSS more maintainable,
themeable and extensible
20. JS
• For dependency management, Bower is
used, which is a package manager for
browser development
• Require.js is a library for
asynchronously loading JavaScript
dependencies within a web page
• jQuery is used for DOM navigation,
event handling and AJAX calls to the
server
21. Responsive Web Design
• Designing for a large variety of devices with varying screen
sizes and resolutions is difficult
• To support multiple devices, a web design methodology
called responsive web design (RWD) was used to provide
an optimal viewing experience across a wide range of
devices
• RWD achieves this capability with the help of CSS3
Media Queries, which are a W3C Recommendation
23. Development Environment
• Ideally, the development
environment must be similar to
the production environment
• Vagrant is a free and open
source software (FOSS) tool for
creating and configuring virtual
development environments
• This setup ensures that
environment related bugs are
kept to a minimum
24. Development Environment
• Most software projects involve multiple
developers working together
• This creates the need for a version control
system (VCS), which makes tracking changes
easy
• The Git Distributed Version Control System (DVCS)
was used to commit and track code changes, and the
code was hosted in a GitHub repository
25. Load Testing
• Developers typically measure a Web application’s quality
of service in terms of response time, throughput, and
availability
• Load testing measures an application’s QoS performance
based on actual customer behavior
• When customers access the site, a script recorder uses
their requests to create interaction scripts
• A load generator then replays the scripts, possibly
modified by test parameters, against the website
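Once a load generator has replayed the scripts, the three QoS measures named above can be derived from the recorded samples. The sketch below is a Python illustration and not JMeter's own reporting code; the function name, the sample format and the choice of a nearest-rank 95th percentile are assumptions.

```python
import math


def summarize(samples, duration_seconds):
    """Summarize one test run.

    `samples` is a list of (elapsed_ms, succeeded) tuples, one per request
    replayed by the load generator during `duration_seconds`.
    """
    elapsed = sorted(ms for ms, _ in samples)
    failures = sum(1 for _, succeeded in samples if not succeeded)
    # Nearest-rank 95th percentile of the response times.
    rank = max(1, math.ceil(0.95 * len(elapsed)))
    return {
        "average_ms": sum(elapsed) / len(elapsed),      # response time
        "p95_ms": elapsed[rank - 1],
        "throughput_rps": len(samples) / duration_seconds,
        "availability": 1 - failures / len(samples),
    }


# Hypothetical results: four requests replayed over two seconds, one failed.
report = summarize(
    [(100, True), (200, True), (300, True), (400, False)],
    duration_seconds=2,
)
```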
27. Machine Learning Based Recommendation Algorithms
• To illustrate the intelligence portion of the system, Apriori
and sequential pattern based machine learning algorithms
were employed
• Both algorithms take the transaction data of user purchases
as input, and each algorithm independently predicts what the
user might buy next
• Although both algorithms try to find frequently occurring
patterns in the dataset, they employ different methodologies
and hence come up with slightly different results
• The algorithms were implemented using the R programming
language
28. Apriori Based Algorithm
• The Apriori algorithm takes the historical transaction data of users
(stored in the database) so that it can identify frequently occurring
itemsets that can then be formulated into association rules.
• For example, a rule might state that a user who buys a smartphone
is also likely to buy earphones, that is, {smartphone} => {earphone}
• The algorithm finds such a rule only if enough transactions
support it, that is, if it meets a minimum support threshold
• These rules ultimately become insights on what the user might do
next and can be given as product recommendations within the
application
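The two steps, finding frequent itemsets and turning them into rules, can be sketched as follows. The system implemented this in R; the Python version below is only a small illustration of the level-wise Apriori idea, with hypothetical function names, thresholds and transactions, and with rules restricted to single-item consequents for brevity.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori) search; returns {itemset: support}."""
    transactions = [frozenset(t) for t in transactions]
    total = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / total

    def keep_frequent(candidates):
        scored = {c: support(c) for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    items = {item for t in transactions for item in t}
    current = keep_frequent(frozenset([item]) for item in items)
    frequent, size = {}, 1
    while current:
        frequent.update(current)
        size += 1
        # Join step: combine frequent itemsets into candidates one item larger.
        joined = {a | b for a in current for b in current if len(a | b) == size}
        # Prune step: every smaller subset must itself be frequent.
        pruned = {
            c for c in joined
            if all(frozenset(s) in current for s in combinations(c, size - 1))
        }
        current = keep_frequent(pruned)
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    rules = []
    for itemset, itemset_support in frequent.items():
        for item in itemset if len(itemset) > 1 else ():
            antecedent = itemset - {item}
            confidence = itemset_support / frequent[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, frozenset([item]), confidence))
    return rules


purchases = [
    {"smartphone", "earphone"},
    {"smartphone", "earphone", "case"},
    {"smartphone", "case"},
    {"earphone"},
    {"smartphone", "earphone"},
]
frequent = frequent_itemsets(purchases, min_support=0.4)
rules = association_rules(frequent, min_confidence=0.7)
```

With this data, {smartphone} => {earphone} is among the rules returned, while {earphone, case} is dropped because only one of the five transactions supports it.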
30. Sequential Pattern Mining Based Algorithm
• The sequential pattern mining algorithm also attempts to
mine relevant patterns from available data, but it additionally
takes the order of the pattern into account
• The algorithm tries to find patterns based on the order in
which transactions take place
• There are many variations of the sequential pattern mining
algorithm; the one used by the program is called SPADE
(Sequential PAttern Discovery using Equivalence classes)
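The effect of taking order into account can be illustrated with a simple support count over purchase histories. This Python sketch is not SPADE, which uses a more efficient search over equivalence classes; it only shows why order changes the result. The function names and the purchase histories are hypothetical, and each purchase is assumed to be a single item.

```python
def contains_in_order(sequence, pattern):
    """True if the items of `pattern` occur in `sequence` in the same order
    (not necessarily next to each other)."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator until the item is found,
    # so later items can only match later positions.
    return all(item in remaining for item in pattern)


def sequence_support(sequences, pattern):
    """Fraction of sequences that contain `pattern` in order."""
    matches = sum(1 for s in sequences if contains_in_order(s, pattern))
    return matches / len(sequences)


histories = [
    ["smartphone", "earphone", "watch"],
    ["smartphone", "earphone"],
    ["earphone", "smartphone", "watch"],
    ["smartphone", "case", "earphone"],
]
phone_then_earphone = sequence_support(histories, ["smartphone", "earphone"])
earphone_then_phone = sequence_support(histories, ["earphone", "smartphone"])
```

The same two items give a support of 0.75 in one order and 0.25 in the other, a distinction that an order-agnostic itemset count cannot make.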
32. Cloud Based Development Environment
Technology | Software/Tool
CDN | CloudFront
Load Balancer | HAProxy
Server Instances | Ubuntu 14.04
Distributed Cache | Redis
RDBMS | MySQL
Assets Storage | AWS S3
Orchestration & Provisioning | Chef
33. Backend Development Environment
Technology | Software/Tool
Server Language | Node.js
Server Package Management | NPM
MVC Framework | Express.js
Template Engine | Jade
ORM | Node-ORM
Redis Library | node_redis
Authentication | Passport.js
34. Front End Development Environment
Technology | Software/Tool
Content | HTML
Presentation | CSS
Interactivity | JS
RWD Framework | Bootstrap
CSS Pre-processor | LESS
Frontend Framework | jQuery
Frontend Package Management | Bower
Dynamic Script Injection | Require.js
40. Conclusion
• We presented a workflow for creating online applications
deployed in a cloud environment
• We looked at its cloud architecture in detail and at
how different cloud appliances, such as virtual machines and load
balancers, interact with each other
• We also focused on how such an application is built from the
ground up, including its backend architecture, user interface
built using the responsive web design technique as well as its
development workflow
41. Conclusion
• For completeness, we examined how such a cloud application
should be tested to ensure its reliability at scale
• Finally, we explored how such a system could leverage the
vast amounts of data it collects and employed Apriori and
sequential pattern based machine learning algorithms to
generate insights about its users
• Using these insights the application can better assist its
customers by providing relevant and timely
recommendations based on their behavior
42. Future Work
• In future work, we plan to explore the performance of the
algorithms used in such a cloud application, the
interoperability between two or more algorithms, and the
use of a more distributed architecture, such as Hadoop, for
the machine learning setup
The goal of the CDN is to serve content with high availability and performance.
Using such a setup increases the reliability and efficiency of the overall architecture
In a cloud computing setup, middleware or more popularly ‘cloud middleware’ is software that connects an application/service to another application/service. The most common example of a middleware would be a message queue or a distributed cache.
That is, there are two (or more) identical MySQL instances with the same schema and data; a single instance is designated as the master, and all the others are slaves of that master.
The critical requirement is to ensure that information is presented in a readable format.
With CSS alone, however, it is not possible to declare variables or functions, so no form of abstraction is available; this is where LESS comes in. As a dynamic stylesheet language, LESS adds features on top of CSS such as variables, functions and mixins, which make stylesheets more maintainable, themeable and extensible.
CSS can be defined in three ways: inline, internal and external.
Bower's sole purpose is to manage the frontend assets or libraries that the application relies on. These may include JavaScript files, CSS, images and fonts.
Bower uses a configuration file usually called ‘bower.json’ in which the frontend dependencies along with their version numbers are defined
Require.js loads all dependencies of a web page in an asynchronous manner so that they do not block page load
The overall JavaScript interactivity of the application is facilitated with the help of jQuery.
The benefit of this setup is that a collaborator only needs to take the configuration file from version control and run bower install; after that, they will have all the dependencies required to run the application.
Let's say the production environment runs Ubuntu while the developer works on a Windows machine. The development environment should match production so that environment- and operating-system-related bugs can be prevented.
When a change breaks the application, it becomes essential to move back to a previous stable state.
The standard Apriori algorithm is sequence agnostic; that is, the association rules {smartphone, earphone} => {watch} and {earphone, smartphone} => {watch} are considered equivalent.