Business Intelligence Solution on Windows Azure
White Paper
Business Intelligence Solutions on Windows® Azure™
- Sidharth Subhash Ghag
Abstract
Enterprise Business Intelligence (BI) solutions today are analyzing growing amounts of data. More often, the data is
historical in nature, coming from within the enterprise and also from external channels such as the Web, mobile, and
devices. This has led to the growth of data volume to alarming levels. In traditional BI implementations, this
information explosion, along with increasing demands on computational power to process high volumes of data,
has been managed through expensive hardware and software upgrades. This is a highly inefficient approach to meet the
demands of a growing business, one that the enterprise considers economically unfavorable.
With the global scale of operation of large enterprises, the need of the hour is to make information available to
partners, remotely-located analysts, and managers who are on the move. This in turn results in additional demands on
infrastructure and IT.
This white paper discusses how Cloud Computing might help address these challenges with its round-the-clock availability
and its dynamic and scalable nature. Cloud infrastructure would be beneficial for offloading BI storage and long-running
processes and for handling erratic load behaviors. The solution proposed in this paper is an alternative BI
architecture that extends existing BI infrastructure.
www.infosys.com
BI Process
Overview
Primarily, a BI solution has two parts: data storage and analysis. The stored raw data is an asset that needs to be cleansed and processed to
derive information for making decisions. The information has to be presented to the decision makers in an intuitive and highly interactive
manner, so that key strategic decisions can be made in the least possible time. BI relies on data warehousing (a data repository designed
to support an organization’s decision making). Ineffectively managed data warehouses make it difficult for organizations to quickly extract
necessary data for information analysis to facilitate practical decision-making.
The BI process can be represented using the following diagram:
Figure 1: BI Process
Online Transaction Data
Online transactional data (operational data) from multiple systems (finance, sales, and CRM) is extracted and processed to eliminate data
redundancy or is optimized to be stored in a data warehouse. The purpose of creating a data warehouse is to bring information from
heterogeneous systems to a common data storage platform.
Data Warehouse
A data warehouse is an independent master store of all the historical transactional data for any enterprise. Extracting transactional data from
multiple systems and then cleansing the data for using it for further analysis is the most important activity of establishing a data warehouse.
The process of accumulating data largely depends on the source systems from where the data is retrieved. Mostly, this process of accumulation
is customized enough to handle the multiple data sources and data rules, easing the transformation of data from multiple disparate systems,
which needs to be stored in a single platform.
Data Marts
Although a data warehouse is a storehouse for voluminous data, it is difficult to process complex analytical queries or jobs directly off the data
warehouse. Thus, the data warehouse is broken down logically or sometimes physically into smaller analysis units called data marts. Data marts
can be conceptualized as units of data storage used for dedicated analysis, which is generated using specific filters and queries. Data marts
contain specialized multi-dimensional data structures called data cubes. Unlike relational database tables, which have only two dimensions
(row and column), a data cube has multiple dimensions.
Typical data mart queries ask, for example, how grocery products sold in the last six months or how a promotion performed in the last six
months in the southern region. Data marts are useful for such focused analysis.
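The idea of a data cube can be illustrated with a small sketch. The fact rows, dimension names, and helper functions below are hypothetical and chosen only for illustration; a real data mart would rely on a database or OLAP engine rather than in-memory dictionaries.

```python
from collections import defaultdict

# Hypothetical fact rows: three dimensions (region, category, month) and one measure (sales).
FACTS = [
    {"region": "South", "category": "Grocery", "month": "2011-07", "sales": 120.0},
    {"region": "South", "category": "Grocery", "month": "2011-08", "sales": 80.0},
    {"region": "North", "category": "Grocery", "month": "2011-07", "sales": 50.0},
    {"region": "South", "category": "Apparel", "month": "2011-07", "sales": 30.0},
]

DIMENSIONS = ("region", "category", "month")


def build_cube(facts, dimensions, measure):
    """Aggregate a measure over every combination of the given dimensions."""
    cube = defaultdict(float)
    for row in facts:
        key = tuple(row[d] for d in dimensions)
        cube[key] += row[measure]
    return dict(cube)


def slice_cube(cube, dimensions, **fixed):
    """Sum the cells whose coordinates match the fixed dimension values."""
    total = 0.0
    for key, value in cube.items():
        coords = dict(zip(dimensions, key))
        if all(coords[d] == v for d, v in fixed.items()):
            total += value
    return total


cube = build_cube(FACTS, DIMENSIONS, "sales")
south_grocery = slice_cube(cube, DIMENSIONS, region="South", category="Grocery")
print(south_grocery)  # 200.0
```

Fixing more dimensions narrows the slice, while fixing fewer rolls the measure up across the remaining ones.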
Since the data warehouse is responsible for storing high volumes of historical and ever-growing data, a data warehouse solution should be
cost-effective and reliable and should always be available to other components for analysis and reporting.
Reports, Dashboards and Key Performance Matrix
Analysis is the process of slicing and dicing a set of information to interpret a pattern that can be used to justify certain impact or for further
planning. The analytics engine works on data marts. The purpose of the analytics engine is to execute complex queries and present data with
multiple dimensions and measures. Dimensions and measures are key parameters in BI that help slice and dice information to make it more
precise for decision makers.
2 | Infosys – White Paper
Data presentation is a crucial component in analysis. The richer the presentation of the data to be analyzed, the better it is for decision makers
to examine the information. This presentation layer helps in presenting reports, KPI matrix, and dashboards to the end user for slicing and
dicing information. These rich reports also support ‘what-if’ scenario analyses.
A BI system is an aggregation of multiple systems and sub-systems. Data storage, information slicing and dicing tools, and reporting or rich
visualization interfaces are some of the multiple sub-systems of any typical BI system. This peculiarity of structure and integration creates
inherent challenges. Let us look at the typical challenges faced by enterprises in implementing and using BI solutions.
BI Implementation Challenges
• Intermittent demands for storage
Since a data warehouse is the backbone of the entire BI solution, it becomes important to manage this data warehouse properly to keep
it running all the time. The data warehouse is a storehouse for large datasets, and it is not possible to keep the entire data active so that it
may be used for on-demand analysis. In certain scenarios, historical data that has otherwise been inactive for some time may need to be
activated. Activation of historical data involves obtaining the backed up tapes, retrieving the data, and loading and fitting it into the current
activated data warehouse or data marts, all of which are by no means simple. Even if such a situation arises only once a month, it would
still consume a considerable amount of IT operational resources. Storage demand increases with every such request because activation of
inactive data adds to rather than taking away from the currently activated data. The need for extra storage capacity adds to the investment
of hardware and the pressure of managing the same.
• Sub-optimal utilization of resources
As the BI solutions have been in place for many years, it is highly likely that the number of users, size of the storage, and complexity of the
systems have increased. Increase in users adds pressure on the scalability of the solution, which might have been provisioned long ago.
There is yet another possibility where an organization may have considered the rapid growth in the number of users, where the storage
and other infrastructure capacities are planned upfront. In such cases, it is highly likely that the system may remain underutilized causing
the loss of opportunity of using the same investment elsewhere. The scalability challenge is crucial in deciding utilization as well as smooth
running of the system.
• Lacking external dimension
On-premise BI solutions are mostly oriented around the transactional data of the enterprises. They lack the external dimensions and
measures of analysis, that are important for strategic analysis. A combination of internal data such as sales data and external data such as
government collected data and industry trends can be used to get better insight and plan effective strategies.
External environmental data is available through different data marketplaces, which can help enhance the quality of analytics. Increasing
demand to factor external entities into the analysis is adding pressure on the design and flexibility of the BI solutions. Many a time,
enterprises end up developing their own components or smaller, independent BI solutions to factor in these external entities.
• Lacking multi-channel delivery capabilities
Most enterprises have a workforce spread all over the world. These geographically distributed stakeholders demand round-the-clock
availability and accessibility from any place. Enterprises that had not factored in this demand have ended up spending huge amounts of
money and resources to address it. The need to make data warehouses and BI solutions available over the Internet with multiple delivery
channels such as RIA, services, mobile and browsers is increasing. This quick, easy, perennial accessibility gives enterprises an edge,
helping them collaborate better and make decisions quickly. Thus, it becomes essential for enterprises to make their BI platform
available over the Internet. This requirement not only demands additional investment for infrastructure, but also adds to the additional
integration touch points to address such requirements.
Businesses today operate in highly dynamic environments influenced by factors such as changing business scenarios, change in
compliances and governance processes, new integration requirements adding to the complexities of the systems, and increasing pressure
on the system to be responsive. These challenges multiply with the increasing demand for dynamism in the business, processes, and
technologies. It is important for every enterprise to address these challenges and make use of their BI investment to get the best results.
BI Solution Based on Cloud Computing
With more and more devices getting meshed and inter-connected on the information highway, demand for data and everything related to it
will grow manifold. This information explosion will lead to a need for systems that can:
• Process large amounts of data efficiently and in near real-time
• Handle storage for data flowing in from the various systems and devices into storage units that can store large amounts of data
The figure shown below depicts a typical information flow landscape of any large enterprise in the future. Thus, a BI solution has to meet the
high volume requirements of an enterprise, which constantly exchanges information with multiple stakeholders, systems, and devices as part
of its day-to-day operations.
[Figure 2 diagram labels: Regulatory Agencies; Content Providers; Field Devices/Appliances; Enterprise – Geo1 and Enterprise – Geo2 (each with Sales, SCM, CRM); DW; Analytical Engine; Transformation Engine; Portal & Reporting; Delivery Channels; Customers; Suppliers; Partners]
Figure 2: Typical Azure™ Business Intelligence Eco-System
Cloud computing, a new generation technology platform of deploying and delivering software services, addresses the growth requirements
of an enterprise and the commonly faced BI challenges. The value proposition delivered by cloud computing, which can address the needs of
the BI platform for the future, includes:
• Capability to process voluminous and rapidly-growing data over the Internet
• Replication of machines, applications, and data storage at multiple instances to provide high availability
• Dynamic, elastic capability to support scaling up and down of infrastructure within minutes
Improved Cost Efficiency
In terms of managing complexity and Total Cost of Ownership (TCO), cloud storage solutions are more appealing than traditional
RDBMS data solutions, especially in a data warehouse scenario that deals with handling historic or inactive data. With cloud storage, data
can be kept active at all times, without needing the aid of IT management to activate any historical data. Thus, cloud storage addresses
the challenges of intermittent data storage access, particularly when there is an urgent need to reload historical data, say to meet
compliance-related queries.
Elastic and Scalable
A cloud-based solution offers users the capability to provision cloud resources such as computing services, storage services, and cache services
instantaneously. This infrastructure-level flexibility allows one to handle workload fluctuations, both planned and unplanned, in an elastic
manner without having to plan for any investments upfront. The elastic and scalable nature of the cloud, along with the pay-as-you-go model,
aligns well with the enterprise needs such that the business gets a more transparent and assured view of its IT resource consumption.
Interoperable
Since the cloud is available over the Internet and can easily provide interoperable endpoints such as REST and SOAP, the architecture supports
easy integration with external services. Relatively easy and quick integration with externally available interface endpoints lets enterprises
add external dimensions to their analysis. These rich sets of external dimensions provide a platform for the enterprise to logically
consider factors for their analysis, be it competitor data, national/international growth data, neighborhood safety, climate effect, or new stores
or services in the neighborhood.
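As a minimal sketch of what adding an external dimension could look like, the snippet below combines an internal sales measure with an externally sourced population figure per region. All names and values are hypothetical; in practice the external figures would be retrieved from a provider's REST or SOAP endpoint rather than hard-coded.

```python
# Hypothetical internal measure: sales per region from the enterprise's own data mart.
internal_sales = {"South": 200.0, "North": 50.0}

# Hypothetical external dimension: population per region, as it might be
# returned by an external data provider's service endpoint.
external_population = {"South": 4000, "North": 500, "East": 1200}


def sales_per_capita(sales, population):
    """Combine an internal measure with an external dimension, region by region."""
    result = {}
    for region, amount in sales.items():
        people = population.get(region)
        if people:  # skip regions with missing or zero external data
            result[region] = amount / people
    return result


ratios = sales_per_capita(internal_sales, external_population)
for region, ratio in sorted(ratios.items()):
    print(region, round(ratio, 3))
```

Regions that exist only in the external dataset are ignored here, since the analysis is driven by the enterprise's own measures.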
Available Anytime Anywhere
The cloud is available ubiquitously and can be accessed through standard HTTP protocols. Enterprises do not have to spend extra money or
resources to make the solution available over the Internet. Concerns such as provisioning and hardening are inconsequential with the cloud.
The cloud helps enterprises support multiple delivery channels that allow information to reach stakeholders including employees, mobile field
agents, and external partners easily.
Even as the cloud computing platform is growing, different vendors are adding to the rich set of building blocks required to develop enterprise
applications on the cloud. The basic principle in developing these building blocks is to be able to integrate easily and quickly. All the vendors
are striving for open and interoperable standards of integration, making it easier to use these enterprise application services on any cloud
platform. It also delivers the advantage of making the system agile to handle system changes required to address dynamic business and
technical needs.
These characteristics of the cloud computing platform make the implementation of large BI solutions possible in an easy and relatively
inexpensive manner. Cloud computing platforms are maturing and cloud vendors are trying hard to increase the functional and technical
richness of their offerings to drive innovations. These innovations would help enterprises in better management, easy decision making, and being
more competitive.
We will explore Microsoft Azure™, a public cloud platform that offers Platform as a Service (PaaS), for developing the next generation cloud-
based BI solution. PaaS offers hosted scalable application servers with necessary supporting services such as storage, security, and integration
infrastructure. PaaS platform also provides development tools and application building blocks to develop custom solutions on the cloud.
Though we have selected PaaS for our proposed solution, there are two other cloud delivery models, Software as a Service (SaaS) and
Infrastructure as a Service (IaaS), which we will discuss briefly in this paper.
Azure™ Based BI Solution
We will now attempt to explain a high-level design for a custom-built BI solution on Windows® Azure™.
Let us first get acquainted with the Azure™ terminologies given in the following table:
Windows® Azure™: A cloud operating system platform that provides the computing capability on a cloud
Azure™ Table Storage: Entity/Key value or tuple store-based service capabilities provided by Microsoft Azure™ to address large, structured, and scalable data storage
Azure™ Blob Storage: Large and scalable data storage made available by Microsoft Azure™ for unstructured data such as documents and media files
Azure™ Queue: Queue service offered by Microsoft Azure™ for message orchestration and asynchronous request processing
SQL Azure™: Relational database capability similar to SQL Server made available by Microsoft Azure™ to address relational database capabilities on the cloud
Web Role: A web server instance to run web applications readily available at http/https endpoints for access. A web role is simply a web server provided by Microsoft Azure™
Worker Role: A computing instance for executing long running processes on Microsoft Azure™
VM Role: A role used to run a virtual hard disk image, store that image in the cloud, and load and run it on demand. The role is highly suited for moving legacy applications to the cloud with minimal effort
AppFabric Service Bus: A service-bus-like messaging platform on the cloud that allows on-premise applications to be available externally and to seamlessly connect with other systems
AppFabric Access Control Service (ACS): A claim-based authorization service that supports federated access to enterprise systems and services on the cloud. All authorization rules can be abstracted and managed from ACS independently of the application in a standards-oriented way
Windows® Azure™ Data Marketplace: An information marketplace that acts as an external dataset provider, which would be consumed by the BI stack to leverage external dimensioning metrics such as demographics, location, and other publicly available information to enrich the analytical reporting capabilities
Windows® Identity Foundation (WIF): An identity management framework that externalizes identity-related logic from an application. Federated single sign-on scenarios involving multiple stakeholders can be built on this framework. For the enterprise, this will also help integrate on-premise Active Directory-based authentication with the Azure™ deployed application
High-Level Design for Custom-Built BI Solution on Azure™
Owing to concerns around data privacy, security, and data ownership, enterprises have been cautious in adopting cloud computing. However,
at the same time, they have also shown a keen interest in leveraging the value proposition offered by the cloud and the potential opportunity
it presents in growing their businesses.
Keeping these key aspects in mind, a hybrid BI solution is proposed to alleviate enterprise challenges. As shown in the figure below, the
proposed solution divides the architecture into two distinct facets – On-premise component and Cloud component.
Figure 3: High-Level Design for Custom-Built BI Solution on Azure™
On-Premise Components
Data Cleansing and Profiling Agent
This agent would be responsible for collating transactional and unstructured data from on-premise systems, cleansing the data, and uploading
it to a data warehouse developed on Azure™ table storage. This component can be extended to consider disparate data sources such as Oracle,
SQL Server, mainframes, and Excel data. Cleansing and profiling would also be configurable according to business needs to handle
business-specific rules; for example, soft-deleted data and transactional data not in the published state should not be uploaded.
The data transfer from agent to the cloud would happen over a secured channel. This agent is usually a part of the Extract Transform Load
(ETL) component.
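The business rules mentioned above can be sketched as a simple filter applied before upload. The field names `is_deleted` and `state`, and the whitespace normalization step, are assumptions made for illustration; the actual rules would come from the agent's configuration.

```python
def is_uploadable(record):
    """Apply the two example rules: skip soft-deleted and unpublished records."""
    return not record.get("is_deleted", False) and record.get("state") == "published"


def cleanse(records):
    """Return cleaned copies of the records that pass the upload rules."""
    cleaned = []
    for record in records:
        if not is_uploadable(record):
            continue
        # Illustrative cleansing step: trim stray whitespace from text fields.
        normalized = {
            key: value.strip() if isinstance(value, str) else value
            for key, value in record.items()
        }
        cleaned.append(normalized)
    return cleaned


# Hypothetical transactional records extracted from an on-premise system.
SOURCE_ROWS = [
    {"id": 1, "customer": " Acme ", "state": "published", "is_deleted": False},
    {"id": 2, "customer": "Globex", "state": "draft", "is_deleted": False},
    {"id": 3, "customer": "Initech", "state": "published", "is_deleted": True},
]

to_upload = cleanse(SOURCE_ROWS)
print([row["id"] for row in to_upload])  # [1]
```

Keeping the rule check in one small function makes it straightforward to swap in a configurable rule set later.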
Data Integration Layer
Based on the criticality of information, an enterprise may have structured data categorized into different levels. We will discuss the different data
integration approaches to cover mission critical and non-mission critical data.
Exposing master data on the cloud without having to upload the master data on the cloud storage helps in maintaining data privacy
and ownership in the hands of the enterprise. This would avoid the need to physically store confidential data such as credit card details,
address information of customers, and salary information of employees on the cloud. It would instead be fetched from the enterprise as and
when required.
An on-premise component that forms a part of the integration layer would help in exposing the master data to the cloud. Technically, this can
be achieved by leveraging the capabilities of the Azure™ AppFabric service bus. Azure™ AppFabric service bus, with its service virtualization
capabilities, allows exposing on-premise components or services on the cloud without having to physically move the data outside the enterprise.
The AppFabric service bus provides a publicly accessible virtual endpoint on the cloud to any on-premise service endpoint it manages. This
channel of communication between the Azure™ AppFabric service bus and the on-premise service can be secured at the transport level, which
would be achieved by using SSL, and at the message level, which would be achieved by using standard encryption techniques.
To avoid latency issues, which could be a cause of concern arising due to the external network hop between the on-premise and cloud
environments, a distributed caching functionality can be implemented on the cloud. The analytical engine deployed on the cloud can be
embedded with a caching component such as Azure™ AppFabric Cache to cache regularly-used master data and in turn reduce the effects
of latency.
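The caching approach described above follows the familiar cache-aside pattern. The sketch below uses an in-process dictionary with a time-to-live as a stand-in for a distributed cache; the class name, the `fetch` callback, and the TTL value are hypothetical and do not represent the Azure™ AppFabric Cache API.

```python
import time


class MasterDataCache:
    """Cache-aside lookup with a time-to-live, fronting a slow on-premise call."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch        # callable that retrieves data from on-premise
        self._ttl = ttl_seconds    # how long a cached value stays fresh
        self._clock = clock        # injectable clock, useful for testing
        self._entries = {}         # key -> (stored_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]        # fresh hit: no network hop needed
        value = self._fetch(key)   # miss or stale: go back to the source
        self._entries[key] = (now, value)
        return value


calls = []


def fetch_customer(customer_id):
    """Hypothetical stand-in for a call routed to an on-premise service."""
    calls.append(customer_id)
    return {"id": customer_id, "segment": "retail"}


cache = MasterDataCache(fetch_customer, ttl_seconds=60.0)
cache.get("C1")
cache.get("C1")
print(len(calls))  # 1
```

A short TTL keeps cached master data reasonably current, at the cost of more frequent round trips to the enterprise.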
Data integration achieved using service virtualization addresses data security concerns, but this comes at the cost of performance. It is, thus,
advisable that for non-critical data, the data be transported and made to reside physically on the cloud, closer to the hosted application. This
can be achieved by leveraging existing data integration techniques such as ETL, Change Data Capture (CDC), and Enterprise Information
Integration (EII) implemented using a tool such as Microsoft’s SQL Server Integration Services (SSIS).
Power Pivot
‘Power Pivot for Excel’ is a data analysis tool that delivers substantial computational power directly within MS Excel, an application
that users are already well acquainted with. Power Pivot is a user-friendly way to perform data analysis using familiar Excel features
such as the common MS Office User Interface shell, PivotTable, PivotChart views, and slicers. Power Pivot helps users analyze data marts offline
without being connected to the online data marts. Power Pivot enables focused analysis on the data marts for on-premise and on-the-move
analysts to access at their own convenience.
ADFS 2.0
ADFS 2.0 is an identity provider service that enables an enterprise-level identity federation solution. It is developed on Windows® Identity
Foundation (WIF) and makes it very easy to integrate with web applications for authentication/authorization from on-premise Active Directory
user stores. The BI portal solution proposed here would implement claims-based authentication using WIF and ADFS 2.0 for allowing enterprise
users to log in to the system with their existing Active Directory credentials.
Azure™ Components
Cloud Data Warehouse
All the collated data uploaded by the cleansing and profiling agent would be stored in Azure™ table storage. Azure™ table storage is highly
scalable and is an appropriate fit for persisting de-normalized data due to its Entity-Attribute-Value (tuple store) style of storage. No analytical
processing or advanced queries would be run on the data warehouse. Hence, the economically cheaper Azure™ table storage is a relatively
better option compared to relational data stores such as SQL Azure. The Azure™ storage, through blobs, can also persist metadata of the data
warehouse along with unstructured data such as files, documents, scanned images, and video files.
The inexpensive storage capability delivered by table storage frees data warehouse administrators from having to deactivate historical data, a
practice often followed in the earlier BI systems due to storage capacity limitations of on-premise storage facilities. CAPEX spending, normally
involved in expanding storage to meet enterprise growth, is also eliminated. However, due to the Pay-As-You-Use pricing model of Windows
Azure services, there would be a rise in the OPEX spending, but it would tend to align more closely with the demands of the growing business.
A detailed assessment of the existing system along with a Y-O-Y ROI analysis of the Azure™ platform can help provide a clear picture in terms
of overall savings and business value that can be realized in the future.
Analytical Engine
The analytical engine is the most important component in the BI solution. The analytical engine:
• Prepares data required for focused analysis
• Applies algorithms for processing data based on different facts, measures, and dimensions
• Analyzes structured and unstructured information to provide patterns and predicts trends that are usually difficult to spot with the naked
eye or traditional reporting
• Identifies cases or exceptions in the data to isolate or identify anomalies
As of now, the SQL Server Analysis Services are not provided as part of the SQL Azure™ services. Hence, it is imperative to build this custom
component, which would achieve analysis services, cube formation, and querying cube-related functionalities on SQL Azure™.
In the proposed solution, the analytical engine has the following parts:
• Batch Process (Azure™ worker role): This Azure™ worker role would be responsible for the creation of data marts and offline reports.
• Data-Mart Processor: Responsible for creating new data marts (SQL Azure™ tables) from the data warehouse (Azure™ table storage)
for focused analysis. The multiple requests submitted by analysts from the BI portal to create data marts would be handled
asynchronously by batch-processing requests, implemented using Azure™ queues.
• Offline Report Generator: Responsible for generating standard reports periodically and storing them in the Azure™ blobs to make them
readily available for the BI portal. This component would generate standard reports as per the configuration stored in the Azure™
table storage.
• Real Time Analytics (Azure™ web role): This Azure™ web role is one of the most important components used for analysis. It would be
responsible for fetching data from data marts and presenting it on the BI portal for analysis. BI portal presentation of dynamic reports and
KPI matrix and generation of ad-hoc reports on existing data marts are achieved through this component. It services analysis requests
synchronously on the existing data marts, making real-time analysis possible on the data marts.
Note: As of the Windows® Azure™ version 1.6 release (November 2011), running SSAS off Azure VM roles is not supported by Microsoft. Hence, until
Microsoft recognizes SSAS as a first class citizen of the cloud, we suggest using the data-mart processor approach.
• Data Marts: Since the proposed data warehouse is created using Azure™ table storage, which is entity-value schema-based and
non-relational, we propose to create data marts in the SQL Azure™ tables. This is primarily because existing analytical engines can also
leverage the premium RDBMS capabilities offered by SQL Azure™ on the cloud without any changes. SQL Azure™ is a relational database
and makes it easy to fetch data using complicated analytical queries. Power Pivot provides a quick and powerful analysis tool to be used
with SQL Azure™. Moreover, the BI portal would be able to generate the desired reports and analyses out of SQL Azure™.
• Application Data: Application data comprises configuration and customization data required as a part of the BI solution.
• SQL Azure Reporting Services Reports: As part of the BI solutions, standard reports can be configured using SQL Azure™ Reporting
Services (SARS) and can be made available from the BI portal.
• Standard Reports: As part of the BI solution, there are standard reports needed to be generated on the data using the specific dimensions
and measures. These standard reports can be generated in a batch process to reduce the latency and can be made available all the time. As
explained previously, the batch analytics component running on the Azure™ worker role generates these reports periodically.
• BI Portal: This is the web portal ported on Azure™ web role. It interacts with the analytical engine to generate dashboards, ad-hoc reports,
and visual analyses of data from multiple dimensions and measures. This BI portal would be accessible everywhere over the Internet and
would be made available over multiple delivery channels including desktop, mobile, and PDAs.
• Windows Azure Data Marketplace Dataset External Measures: The analytics engine can be configured to use specific datasets exposed
from Windows Azure™ data marketplace. These datasets would be used as an external measure, along with the data mart measures, for
analysis. Examples of such datasets that can be used as external measures could be demographic information of customers, upcoming
business/stores in nearby locations, and weather conditions impacting sales for specific locations.
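The asynchronous handling of data-mart requests described for the data-mart processor can be sketched as a simple producer and worker around a queue. Python's standard `queue.Queue` is used here purely as a stand-in for an Azure™ queue, and the request fields and function names are hypothetical.

```python
import queue

# Stand-in for an Azure queue holding data-mart creation requests.
mart_requests = queue.Queue()


def submit_request(mart_name, region):
    """Called from the BI portal side: enqueue a request and return immediately."""
    mart_requests.put({"mart": mart_name, "region": region})


def build_data_mart(warehouse_rows, request):
    """Filter the warehouse rows down to the subset needed for focused analysis."""
    return [row for row in warehouse_rows if row["region"] == request["region"]]


def process_pending(warehouse_rows):
    """Worker-side loop: drain the queue and build one data mart per request."""
    marts = {}
    while True:
        try:
            request = mart_requests.get_nowait()
        except queue.Empty:
            break
        marts[request["mart"]] = build_data_mart(warehouse_rows, request)
        mart_requests.task_done()
    return marts


# Hypothetical de-normalized warehouse rows.
WAREHOUSE = [
    {"region": "South", "sales": 120.0},
    {"region": "North", "sales": 50.0},
    {"region": "South", "sales": 80.0},
]

submit_request("south_sales", "South")
submit_request("north_sales", "North")
marts = process_pending(WAREHOUSE)
print(sorted(marts))  # ['north_sales', 'south_sales']
```

Because the portal only enqueues requests, analysts are not blocked while a long-running data-mart build completes in the background.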
Design Considerations
• Geo-location and affinity group: Applications developed on Windows® Azure™ can be deployed across multiple data centers located
around the world – South Central US, North Central US, West Europe, North Europe, East Asia, and South East Asia. The Windows Azure global
footprint is rapidly growing as Microsoft continues to build new global data centers for Azure™ deployment. Selection of appropriate data
centers and creating an affinity group for deployment should be considered for the following reasons:
• Regional Legislations/Regulations — These are to address regulatory requirements of deploying the application and its data within a
specific geographical location. There are a few compliance requirements that organizations have to abide by, to keep their data
geographically close to the region of business operations. These requirements can be addressed by deploying the Azure™
application in an appropriate data center.
• Performance — Data center proximity to end users would help in reducing network latency and improving overall application
performance. Creating an affinity group for application and data instances would deploy these components within the same data
center and would bring them closer. Inter-process communication within the same affinity group is faster and helps in improving
application performance, especially when there would be a large amount of data transfers involved during activities such as
reporting and data mart creation.
• Caching: Caching frequently used data such as reference data and infrequently modified data would help reduce data access calls and
latency in serving requests. Moreover, since there would be multiple roles running in the Azure™ load-balanced environment, we need to
consider using distributed caching systems such as Windows Azure AppFabric Caching services or Distributed Memcached.
• Partition keys for table storage: Partition keys used for the data warehouse should not create partitions so large that queries on
Azure™ can no longer run efficiently. We need to consider using partition keys in all queries for better performance.
• Communication security for data in transit: We need to ensure transport level security using SSL. For highly confidential data, we need to
consider using messaging-level security, such as encryption and signatures.
• Processing Model: We could analyze business use-cases and choose the appropriate processing model between online and batch. Long
running processes can be effectively scaled using the worker role approach for computation tasks. Message queue based asynchronous
processing also provides data and processing reliability.
• SQL Azure™ Partition: Where the data-mart size grows beyond the single database instance limit of 150 GB for SQL Azure™, consider
horizontal partitioning of a few tables. We could consider high-growth tables for partitioning, and use range-based keys or a stored hash
of keys to identify a specific partition.
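Two of the considerations above, partition keys for table storage and hash-based horizontal partitioning, can be illustrated together. The key layout (region plus month) and the shard count are assumptions chosen for the example; `PartitionKey` and `RowKey` are the standard key properties of an Azure™ table entity, but no Azure™ SDK calls are made here.

```python
import hashlib


def table_keys(region, sale_date, transaction_id):
    """Derive table storage keys for a sale record.

    The PartitionKey groups rows that are usually queried together
    (one region and month), which bounds partition size; the RowKey
    is unique within that partition.
    """
    partition_key = f"{region}_{sale_date[:7]}"   # e.g. South_2011-11
    row_key = f"{sale_date}_{transaction_id}"
    return partition_key, row_key


def shard_for(key, shard_count):
    """Map a key to one of several databases using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


keys = table_keys("South", "2011-11-05", "T42")
print(keys)  # ('South_2011-11', '2011-11-05_T42')

# Hypothetical setup with four SQL Azure databases holding one partitioned table.
shard = shard_for("customer-1001", 4)
print(0 <= shard < 4)  # True
```

A cryptographic hash is used instead of Python's built-in `hash` because the latter is randomized per process for strings, which would route the same key to different partitions across runs.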
Other Cloud-Based BI Implementation Models
According to the US National Institute of Standards and Technology (NIST), the cloud is composed of three service models, namely, SaaS, PaaS, and
IaaS. The design of the cloud-hosted BI solution explained in this paper was made by considering the boundaries of a PaaS cloud service model,
realized using Microsoft Azure™. The other cloud models available for implementing BI solutions are as follows:
• SaaS: This is the highest abstraction of the cloud. In this model, a finished application or solution is offered as a service. It is akin to
a packaged product with support for limited customization offered through the cloud. Since it is a standard packaged solution,
there may be limitations for enterprises to map their unique customizations and heterogeneous data stores to avail this solution.
SaaS might be a good offering for smaller organizations to address their limited BI needs.
• IaaS: This is the lowest abstraction of the cloud. In this model, vendors provide basic hardware and software infrastructure as a
service. Customers need to deploy their software, ranging from the operating system to the end application. Using this model,
enterprises will have to address the need of software licensing and deployment themselves, which limits the benefits of the cloud
computing platform.
Enterprises can select their cloud platform based on the criteria described in the figure below, driven by factors that make business sense in their
respective domains.
Business Intelligence Platform Evaluation Model
Platforms compared (from private to public): On-Premise, IaaS, PaaS, SaaS
Selection criteria: Flexibility; Ease of Management (Hardware, Software & Infrastructure); Control; Functional Richness; Application Building Blocks; Security & Compliance; Time to Market; QoS (Scalability, Availability, Reliability & Performance)
Preferred Procurement Choice: On-Premise: Buy; IaaS: Buy; PaaS: Build; SaaS: Subscribe
[Per-criterion ratings are shown graphically in the figure.]
Figure 4: Business Intelligence Platform Evaluation Model
The above evaluation model summarizes the business value realized in implementing a cloud-based BI solution on different cloud service
models. A model of this nature can help guide enterprises in selecting the most appropriate cloud service by mapping the expected outcome
of their BI initiatives to the business value realized from the different cloud service options available.
Concerns About BI in Cloud/Azure™
The cloud platform addresses most of the challenges faced by enterprises in implementing and managing a traditional on-premise BI solution.
However, there are a few concerns around cloud usage for implementing BI solutions. These concerns are common to any cloud implementation
and are not specific to BI. Let us briefly discuss these concerns from a BI cloud adoption point of view. The most talked about concern is around
data security and compliance.
Enterprises have concerns about placing their confidential data on the cloud where it would get replicated onto multiple servers. Technically,
the cloud technology treats all data in a similar fashion and that raises concerns around information security. To address this problem
practically, there is a need to amend the compliance rules to cater to the technology evolution. At the same time, cloud vendors need to
provide mechanisms that can handle the need to meet compliance requirements more effectively. Until then, a hybrid solution as proposed
in the high-level design in this paper, wherein critical data is stored on-premise but is exposed as a service for integration and aggregation
purpose and transactional data is stored in the cloud, is an option that can be explored.