White Paper




                     Business Intelligence Solutions on Windows® Azure™
                   - Sidharth Subhash Ghag


                   Abstract
                   Enterprise Business Intelligence (BI) solutions today are analyzing growing amounts of data. More often, the data is
                   historical in nature, coming from within the enterprise and also from external channels such as the Web, mobile, and
                   devices. This has led to the growth of data volume to alarming levels. In traditional BI implementations, this
                   information explosion, along with increasing demands on computational power to process high volumes of data,
                   has been managed through expensive hardware and software upgrades. This is a highly inefficient approach to meet the
                   demands of a growing business, one that the enterprise considers economically unfavorable.
                   With the global scale of operation of large enterprises, the need of the hour is to make information available to
                   partners, remotely-located analysts, and managers who are on the move. This in turn results in additional demands on
                   infrastructure and IT.
                   This white paper discusses how cloud computing might help address these challenges through its round-the-clock availability
                   and its dynamic, scalable nature. Cloud infrastructure can take on BI storage, long-running processes, and erratic
                   load patterns. The solution proposed in this paper is an alternative BI architecture that extends the existing BI
                   infrastructure.


              www.infosys.com
BI Process
Overview
Primarily, a BI solution has two parts: data storage and analysis. The stored raw data is an asset that needs to be cleansed and processed to
derive information for making decisions. The information has to be presented to the decision makers in an intuitive and highly interactive
manner, so that key strategic decisions can be made in the least possible time. BI relies on data warehousing (a data repository designed
to support an organization’s decision making). Ineffectively managed data warehouses make it difficult for organizations to quickly extract
necessary data for information analysis to facilitate practical decision-making.

The BI process can be represented using the following diagram:




                                                              Figure 1: BI Process

Online Transaction Data
Online transactional data (operational data) from multiple systems (finance, sales, and CRM) is extracted and processed to eliminate data
redundancy or is optimized to be stored in a data warehouse. The purpose of creating a data warehouse is to bring information from
heterogeneous systems to a common data storage platform.

Data Warehouse
A data warehouse is an independent master store of all the historical transactional data for any enterprise. Extracting transactional data from
multiple systems and then cleansing the data for using it for further analysis is the most important activity of establishing a data warehouse.
The process of accumulating data largely depends on the source systems from where the data is retrieved. Mostly, this process of accumulation
is customized enough to handle the multiple data sources and data rules, easing the transformation of data from multiple disparate systems,
which needs to be stored in a single platform.

Data Marts
Although a data warehouse is a storehouse for voluminous data, it is difficult to process complex analytical queries or jobs directly off the data
warehouse. Thus, the data warehouse is broken down logically or sometimes physically into smaller analysis units called data marts. Data marts
can be conceptualized as units of data storage used for dedicated analysis, which is generated using specific filters and queries. Data marts
contain specialized multi-dimensional data structures called data cubes. Unlike relational database tables, which have only two dimensions
(row and column), a data cube has multiple dimensions.

Typical data mart queries include how the sale of grocery products was in the last six months and how a promotion performed in the last six
months in the southern region. Data marts are useful for such focused analysis.
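
As a rough illustration of the multi-dimensional idea above, the following Python sketch builds a tiny in-memory cube and answers a focused query of the "grocery sales in the southern region" kind. The fact rows, the dimension names (category, region, month), and the helper functions are all hypothetical; a real data mart would hold this data in a database rather than in a dictionary.

```python
from collections import defaultdict

# Hypothetical fact rows: three dimensions (category, region, month) and one measure (sales).
FACTS = [
    {"category": "grocery", "region": "south", "month": "2011-07", "sales": 120.0},
    {"category": "grocery", "region": "north", "month": "2011-07", "sales": 80.0},
    {"category": "grocery", "region": "south", "month": "2011-08", "sales": 150.0},
    {"category": "apparel", "region": "south", "month": "2011-08", "sales": 200.0},
]

DIMENSIONS = ("category", "region", "month")


def build_cube(facts, dimensions, measure):
    """Aggregate the measure for every combination of dimension values."""
    cube = defaultdict(float)
    for row in facts:
        key = tuple(row[d] for d in dimensions)
        cube[key] += row[measure]
    return dict(cube)


def slice_cube(cube, dimensions, **fixed):
    """Sum all cells whose coordinates match the fixed dimension values."""
    total = 0.0
    for key, value in cube.items():
        coords = dict(zip(dimensions, key))
        if all(coords[d] == v for d, v in fixed.items()):
            total += value
    return total


cube = build_cube(FACTS, DIMENSIONS, "sales")
grocery_south = slice_cube(cube, DIMENSIONS, category="grocery", region="south")
august_total = slice_cube(cube, DIMENSIONS, month="2011-08")
print(grocery_south, august_total)
```

Fixing one or more dimensions and summing the remaining cells is the "slice" operation; leaving a dimension unfixed rolls the measure up across it.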

Since the data warehouse is responsible for storing high volumes of historical and ever-growing data, a data warehouse solution should be
cost-effective and reliable and should always be available to other components for analysis and reporting.

Reports, Dashboards and Key Performance Matrix
Analysis is the process of slicing and dicing a set of information to find patterns that explain an observed impact or inform further
planning. The analytics engine works on data marts. Its purpose is to execute complex queries and present data with
multiple dimensions and measures. Dimensions and measures are key parameters in BI that help slice and dice information to make it more
precise for decision makers.


2 | Infosys – White Paper
Data presentation is a crucial component in analysis. The richer the presentation of the data to be analyzed, the better it is for decision makers
to examine the information. This presentation layer helps in presenting reports, KPI matrix, and dashboards to the end user for slicing and
dicing information. These rich reports also support ‘what-if’ scenario analyses.

A BI system is an aggregation of multiple systems and sub-systems. Data storage, information slicing and dicing tools, and reporting or rich
visualization interfaces are some of the multiple sub-systems of any typical BI system. This peculiarity of structure and integration creates
inherent challenges. Let us look at the typical challenges faced by enterprises in implementing and using BI solutions.


BI Implementation Challenges
• 	 Intermittent demands for storage

	   Since a data warehouse is the backbone of the entire BI solution, it becomes important to manage this data warehouse properly to keep
    it running all the time. The data warehouse is a storehouse for large datasets, and it is not possible to keep the entire data active so that it
    may be used for on-demand analysis. In certain scenarios, historical data that has otherwise been inactive for some time may need to be
    activated. Activation of historical data involves obtaining the backed up tapes, retrieving the data, and loading and fitting it into the current
    activated data warehouse or data marts, all of which are by no means simple. Even if such a situation arises only once a month, it would
    still consume a considerable amount of IT operational resources. Storage demand increases with every such request because activation of
    inactive data adds to rather than taking away from the currently activated data. The need for extra storage capacity adds to the investment
    of hardware and the pressure of managing the same.

• 	 Sub-optimal utilization of resources

	   As the BI solutions have been in place for many years, it is highly likely that the number of users, size of the storage, and complexity of the
    systems have increased. Increase in users adds pressure on the scalability of the solution, which might have been provisioned long ago.

	   There is yet another possibility where an organization may have considered the rapid growth in the number of users, where the storage
    and other infrastructure capacities are planned upfront. In such cases, it is highly likely that the system may remain underutilized causing
    the loss of opportunity of using the same investment elsewhere. The scalability challenge is crucial in deciding utilization as well as smooth
    running of the system.

• 	 Lacking external dimension

    On-premise BI solutions are mostly oriented around the transactional data of the enterprise. They lack the external dimensions and
    measures that are important for strategic analysis. A combination of internal data such as sales data and external data such as
    government-collected data and industry trends can be used to gain better insight and plan effective strategies.

	   External environmental data is available through different data marketplaces, which can help enhance the quality of analytics. Increasing
    demand to factor external entities into the analysis is adding pressure on the design and flexibility of the BI solutions. Many a time,
    enterprises end up developing their own components or smaller, independent BI solutions to factor these external entities.

• 	 Lacking multi-channel delivery capabilities

    Most enterprises have workforces spread all over the world. These geographically distributed stakeholders demand round-the-clock
    availability and access from any place. Enterprises that had not factored in this demand have ended up spending large amounts of
    money and resources to address it. The need to make data warehouses and BI solutions available over the Internet through multiple delivery
    channels such as RIA, services, mobile, and browsers is increasing. This quick, easy, constant accessibility gives enterprises an edge,
    helping them collaborate better and make decisions quickly. Thus, it becomes essential for enterprises to make their BI platform
    available over the Internet. This requirement not only demands additional investment in infrastructure, but also adds
    integration touch points.

	   Present businesses operate in highly dynamic environments influenced by factors such as changing business scenarios, change in
    compliances and governance processes, new integration requirements adding to the complexities of the systems, and increasing pressure
    on the system to be responsive. These challenges multiply with the increasing demand for dynamism in the business, processes, and
    technologies. It is important for every enterprise to address these challenges and make use of their BI investment to get the best results.




BI Solution Based on Cloud Computing
With more and more devices getting meshed and inter-connected on the information highway, demand for data and everything related to it
will grow manifold. This information explosion will lead to the need for systems that can:

• 	 Process large amounts of data efficiently and in near real-time

•   Handle storage for data flowing in from the various systems and devices into storage units that can store large amounts of data

The figure shown below depicts a typical information flow landscape of any large enterprise in the future. Thus, a BI solution has to meet the
high volume requirements of an enterprise, which constantly exchanges information with multiple stakeholders, systems, and devices as part
of its day-to-day operations.




[Figure 2 labels: Regulatory Agencies; Content Providers; Field Devices/Appliances; Customers; Delivery Channels; Suppliers; Partners; Enterprise – Geo1 and Enterprise – Geo2 (each with Sales, SCM, CRM); DW; Analytical Engine; Transformation Engine; Portal & Reporting]




                                          Figure 2: Typical Azure™ Business Intelligence Eco-System


Cloud computing, a new-generation technology platform for deploying and delivering software services, addresses the growth requirements
of an enterprise and the commonly faced BI challenges. The value proposition delivered by cloud computing, which can address the needs of
the BI platform for the future, includes:

•   Capability to process voluminous and rapidly-growing data over the Internet

•   Replication of machines, applications, and data storage at multiple instances to provide high availability

•   Dynamic, elastic capability to support scaling up and down of infrastructure within minutes


Improved Cost Efficiency
Cloud storage solutions are more appealing than traditional RDBMS data solutions in terms of managing complexity and Total Cost of
Ownership (TCO), especially in a data warehouse scenario that deals with historical or inactive data. With cloud storage, data
can be kept active at all times, without needing IT management to activate historical data. Thus, cloud storage addresses
the challenge of intermittent demands for storage, particularly when there is an urgent need to reload historical data, say to answer
compliance-related queries.



Elastic and Scalable
A cloud-based solution lets users provision cloud resources such as computing services, storage services, and cache services
on demand. This infrastructure-level flexibility allows one to handle workload fluctuations, both planned and unplanned, in an elastic
manner without having to plan for any investments upfront. The elastic and scalable nature of the cloud, along with the pay-as-you-go model,
aligns well with enterprise needs, giving the business a more transparent and predictable view of its IT resource consumption.

Interoperable
Since the cloud is available over the Internet and can easily provide interoperable endpoints such as REST and SOAP, the architecture supports
easy integration with external services. Relatively easy and quick integration with externally available interface endpoints lets enterprises
add external dimensions to their analysis. These rich sets of external dimensions let the enterprise
factor in competitor data, national/international growth data, neighborhood safety, climate effects, or new stores
or services in the neighborhood.
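
To make the idea of an external dimension concrete, here is a minimal Python sketch that attaches an external attribute to internal sales rows. The data, the `(region, week)` join key, and the `rainy_days` attribute are assumptions made up for illustration; in practice the external dataset would be fetched from a REST or SOAP endpoint rather than defined inline.

```python
# Hypothetical internal measure: weekly sales per region.
internal_sales = [
    {"region": "south", "week": 1, "sales": 1000.0},
    {"region": "south", "week": 2, "sales": 700.0},
    {"region": "north", "week": 1, "sales": 900.0},
    {"region": "north", "week": 2, "sales": 950.0},
]

# Hypothetical external dataset keyed by (region, week).
# Note that there is no entry for ("north", 2).
external_weather = {
    ("south", 1): {"rainy_days": 1},
    ("south", 2): {"rainy_days": 5},
    ("north", 1): {"rainy_days": 2},
}


def enrich(sales_rows, external, defaults):
    """Attach external attributes to each internal row; use defaults when no match exists."""
    enriched = []
    for row in sales_rows:
        extra = external.get((row["region"], row["week"]), defaults)
        enriched.append({**row, **extra})
    return enriched


rows = enrich(internal_sales, external_weather, {"rainy_days": None})

# Sales in weeks with at least three rainy days, skipping rows with no external data.
rainy_week_sales = [
    r["sales"] for r in rows if r["rainy_days"] is not None and r["rainy_days"] >= 3
]
print(rainy_week_sales)
```

Keeping the missing external value explicit (`None`) rather than defaulting it to zero avoids silently treating "no data" as "no rain" in later analysis.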

Available Anytime Anywhere
The cloud is available ubiquitously and can be accessed through standard HTTP protocols. Enterprises do not have to spend extra money or
resources to make the solution available over the Internet. Concerns such as provisioning and hardening of the underlying infrastructure are largely handled by the cloud platform.
The cloud helps enterprises support multiple delivery channels that allow information to reach stakeholders including employees, mobile field
agents, and external partners easily.

Even as the cloud computing platform is growing, different vendors are adding to the rich set of building blocks required to develop enterprise
applications on the cloud. The basic principle in developing these building blocks is to be able to integrate easily and quickly. All the vendors
are striving for open and interoperable standards of integration, making it easier to use these enterprise application services on any cloud
platform. It also delivers the advantage of making the system agile to handle system changes required to address dynamic business and
technical needs.

These characteristics of the cloud computing platform make the implementation of large BI solutions possible in an easy and relatively
inexpensive manner. Cloud computing platforms are maturing, and cloud vendors are working to increase the functional and technical
richness of their offerings to drive innovation. These innovations would help enterprises manage better, make decisions more easily, and be
more competitive.

We will explore Microsoft Azure™, a public cloud platform that offers Platform as a Service (PaaS), for developing the next-generation
cloud-based BI solution. PaaS offers hosted, scalable application servers with necessary supporting services such as storage, security, and integration
infrastructure. A PaaS platform also provides development tools and application building blocks to develop custom solutions on the cloud.
Though we have selected PaaS for our proposed solution, there are two other cloud delivery models, Software as a Service (SaaS) and
Infrastructure as a Service (IaaS), which we will discuss briefly in this paper.




Azure™ Based BI Solution
We will now attempt to explain a high-level design for a custom-built BI solution on Windows® Azure™.
Let us first get acquainted with the Azure™ terminologies given in the following table:


  Windows® Azure™                           A cloud operating system platform that provides the computing capability on a cloud

  Azure™ Table Storage                      Entity/Key value or tuple store-based service capabilities provided by Microsoft Azure™ to address large,
                                            structured, and scalable data storage

  Azure™ Blob Storage                       Large and scalable data storage made available by Microsoft Azure™ for unstructured data such as
                                            documents and media files

  Azure™ Queue                              Queue service offered by Microsoft Azure™ for message orchestration and asynchronous
                                            request processing

  SQL Azure™                                Relational database capability similar to SQL Server made available by Microsoft Azure™ to address
                                            relational database capabilities on the cloud

  Web Role                                  A web server instance to run web applications readily available at http/https endpoints for access.
                                            A web role is simply a web server provided by Microsoft Azure™

  Worker Role                               A computing instance for executing long running processes on Microsoft Azure™

  VM Role                                   A role used to run a virtual hard disk image, store that image in the cloud, and load and run it on demand.
                                            The role is highly suited for moving legacy applications to the cloud with minimal effort

  AppFabric Service Bus                     A service-bus-like messaging platform on the cloud that allows on-premise applications to be available
                                            externally and to seamlessly connect with other systems

  AppFabric Access Control Service (ACS)    A claim-based authorization service that supports federated access to enterprise systems and services
                                            on the cloud. All authorization rules can be abstracted and managed from ACS independently out of
                                            the application in a standard oriented way

  Windows® Azure™ Data Marketplace          An information marketplace that acts as an external dataset provider, which would be consumed by the
                                            BI stack to leverage external dimensioning metrics such as demographics, location, and other publically
                                            available information to enrich the analytical reporting capabilities

  Windows® Identity Foundation (WIF)        An identity management framework that externalizes identity-related logic from an application.
                                            Federated single sign-on scenarios involving multiple stakeholders can be built on this framework. For
                                            the enterprise, this will also help integrate on-premise Active Directory-based authentication with the
                                            Azure™ deployed application



High-Level Design for Custom-Built BI Solution on Azure™
Owing to concerns around data privacy, security, and data ownership, enterprises have been cautious in adopting cloud computing. However,
at the same time, they have also shown a keen interest in leveraging the value proposition offered by the cloud and the potential opportunity
it presents in growing their businesses.

Keeping these key aspects in mind, a hybrid BI solution is proposed to alleviate enterprise challenges. As shown in the figure below, the
proposed solution divides the architecture into two distinct facets – On-premise component and Cloud component.




Figure 3: High-Level Design for Custom-Built BI Solution on Azure™


On-Premise Components
Data Cleansing and Profiling Agent

This agent would be responsible for collating transactional and unstructured data from on-premise systems, cleansing the data, and uploading
it on a data warehouse developed on Azure™ table storage. This component can be extended to consider disparate data sources such as Oracle,
SQL Server, mainframes, and Excel files. Cleansing and profiling would also be configurable according to business needs to handle
business-specific rules, for example that soft-deleted data and transactional data not in the published state should not be uploaded.
The data transfer from agent to the cloud would happen over a secured channel. This agent is usually a part of the Extract Transform Load
(ETL) component.
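
The two example rules above can be sketched as a small filter that runs before upload. This is only an illustration: the field names `is_deleted` and `state`, the `"published"` value, and the whitespace-trimming cleanse step are assumptions, not part of any agent described in this paper.

```python
def cleanse(record):
    """Trim surrounding whitespace from string fields (a stand-in for real cleansing logic)."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def is_uploadable(record):
    """Business rules: skip soft-deleted rows and rows that are not in the published state."""
    if record.get("is_deleted", False):
        return False
    return record.get("state") == "published"


def prepare_batch(records):
    """Cleanse every record first, then keep only those that pass the business rules."""
    cleansed = (cleanse(r) for r in records)
    return [r for r in cleansed if is_uploadable(r)]


# Hypothetical transactional rows pulled from an on-premise system.
source_rows = [
    {"order_id": "A1", "state": "published ", "is_deleted": False, "amount": 120.0},
    {"order_id": "A2", "state": "draft", "is_deleted": False, "amount": 75.0},
    {"order_id": "A3", "state": "published", "is_deleted": True, "amount": 60.0},
]

batch = prepare_batch(source_rows)
print([r["order_id"] for r in batch])
```

Cleansing before filtering matters here: the first row only passes the state check once the trailing space has been removed.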

Data Integration Layer

Based on the criticality of information, an enterprise may have structured data categorized into different levels. We will discuss the different data
integration approaches to cover mission critical and non-mission critical data.

Exposing master data on the cloud without having to upload the master data on the cloud storage helps in maintaining data privacy
and ownership in the hands of the enterprise. This would avoid the need to physically store confidential data such as credit card details,
address information of customers, and salary information of employees on the cloud. It would instead be fetched from the enterprise as and
when required.




An on-premise component that forms a part of the integration layer would help in exposing the master data to the cloud. Technically, this can
be achieved by leveraging the capabilities of the Azure™ AppFabric service bus. Azure™ AppFabric service bus, with its service virtualization
capabilities, allows exposing on-premise components or services on the cloud without having to physically move the data outside the enterprise.
The AppFabric service bus provides a publically accessible virtual endpoint on the cloud to any on-premise service endpoint it manages. This
channel of communication between the Azure™ AppFabric service bus and the on-premise service can be secured at the transport level, which
would be achieved by using SSL, and at the message level, which would be achieved by using standard encryption techniques.

To avoid latency issues arising from the external network hop between the on-premise and cloud
environments, a distributed caching functionality can be implemented on the cloud. The analytical engine deployed on the cloud can be
embedded with a caching component such as Azure™ AppFabric Cache to cache regularly used master data and in turn reduce the effects
of latency.
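
The caching idea can be sketched with a simple cache-aside wrapper around the remote lookup. This is a single-process stand-in, assuming a time-to-live policy and a hypothetical `fetch_from_enterprise` function; a distributed cache service would replace the local dictionary in a multi-instance deployment.

```python
import time


class MasterDataCache:
    """Cache-aside wrapper: serve from the cache when fresh, otherwise fetch and store."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch        # function that performs the slow remote call
        self._ttl = ttl_seconds    # how long an entry stays valid
        self._clock = clock        # injectable clock, useful for testing
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # cache hit: no network hop
        value = self._fetch(key)   # cache miss or expired entry
        self._entries[key] = (now + self._ttl, value)
        return value


# Hypothetical remote lookup; the list records how often the on-premise call is made.
calls = []


def fetch_from_enterprise(key):
    calls.append(key)
    return {"customer_id": key, "segment": "retail"}


fake_now = [0.0]
cache = MasterDataCache(fetch_from_enterprise, ttl_seconds=60.0, clock=lambda: fake_now[0])

first = cache.get("C001")    # miss: goes to the enterprise
second = cache.get("C001")   # hit: served locally
fake_now[0] = 61.0
third = cache.get("C001")    # entry expired: fetched again
print(len(calls))
```

The time-to-live bounds how stale cached master data can become, so it should be chosen according to how often the underlying data changes.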

Data integration achieved using service virtualization addresses data security concerns, but this comes at the cost of performance. It is, thus,
advisable that for non-critical data, the data be transported and made to reside physically on the cloud, closer to the hosted application. This
can be achieved by leveraging existing data integration techniques such as ETL, Change Data Capture (CDC), and Enterprise Information
Integration (EII) implemented using a tool such as Microsoft’s SQL Server Integration Services (SSIS).

Power Pivot

'Power Pivot for Excel' is a data analysis tool that brings substantial computational power into MS Excel, an application that
users are already familiar with. Power Pivot is a user-friendly way to perform data analysis using familiar Excel features
such as the common MS Office user interface shell, PivotTable and PivotChart views, and slicers. Power Pivot helps users analyze data marts offline
without being connected to the online data marts. This enables on-premise and on-the-move analysts to run focused analysis on the data marts
at their own convenience.

ADFS 2.0

ADFS 2.0 is an identity provider service that enables an enterprise-level identity federation solution. It is developed on Windows® Identity
Foundation (WIF) and makes it easy to integrate web applications with on-premise Active Directory user stores for
authentication/authorization. The BI portal solution proposed here would implement claims-based authentication using WIF and ADFS 2.0, allowing enterprise
users to log in to the system with their existing Active Directory credentials.


Azure™ Components
Cloud Data Warehouse

All the collated data uploaded by the cleansing and profiling agent would be stored in Azure™ table storage. Azure™ table storage is highly
scalable and is an appropriate fit for persisting de-normalized data due to its Entity Value Attribute (tuple store) style of storage. No analytical
processing or advanced queries would be run on the data warehouse. Hence, the economically cheaper Azure™ table storage is a relatively
better option compared to relational data stores such as SQL Azure. The Azure™ storage, through blobs, can also persist metadata of the data
warehouse along with unstructured data such as files, documents, scanned images, and video files.
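
Entities in table storage are addressed by a partition key and a row key, so the choice of keys shapes how the warehouse can be queried. The sketch below shows one possible keying scheme for sales facts. The region-plus-month partitioning, the field names other than `PartitionKey` and `RowKey`, and the in-memory grouping are assumptions for illustration; no storage service is called.

```python
from collections import defaultdict


def make_entity(region, sale_date, order_id, amount):
    """Build a table-storage-style entity for one sales fact (sale_date is 'YYYY-MM-DD')."""
    return {
        "PartitionKey": f"{region}_{sale_date[:7]}",  # e.g. "south_2011-08"
        "RowKey": f"{sale_date}_{order_id}",          # unique within the partition
        "Amount": amount,
    }


def partition_filter(region, year_month):
    """OData-style filter string that restricts a query to a single partition."""
    return f"PartitionKey eq '{region}_{year_month}'"


entities = [
    make_entity("south", "2011-08-15", "A1", 120.0),
    make_entity("south", "2011-08-16", "A2", 150.0),
    make_entity("north", "2011-08-15", "A3", 80.0),
]

# Local stand-in for the service: group entities by partition key.
partitions = defaultdict(list)
for entity in entities:
    partitions[entity["PartitionKey"]].append(entity)

south_august = partitions["south_2011-08"]
print(partition_filter("south", "2011-08"), sum(e["Amount"] for e in south_august))
```

A key such as region plus month keeps each partition bounded in size while still letting common queries name the partition directly; the right granularity depends on data volume and query patterns.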

The inexpensive storage capability delivered by table storage frees data warehouse administrators from having to deactivate historical data, a
practice often followed in the earlier BI systems due to storage capacity limitations of on-premise storage facilities. CAPEX spending, normally
involved in expanding storage to meet enterprise growth, is also eliminated. However, due to the Pay-As-You-Use pricing model of Windows
Azure services, there would be a rise in the OPEX spending, but it would tend to align more closely with the demands of the growing business.
A detailed assessment of the existing system along with a Y-O-Y ROI analysis of the Azure™ platform can help provide a clear picture in terms
of overall savings and business value that can be realized in the future.

Analytical Engine

The analytical engine is the most important component in the BI solution. The analytical engine:
•   Prepares data required for focused analysis
•   Applies algorithms for processing data based on different facts, measures, and dimensions
•   Analyzes structured and unstructured information to provide patterns and predicts trends that are usually difficult to spot with the naked 	
    eye or traditional reporting
•   Identifies cases or exceptions in the data to isolate or identify anomalies



As of now, SQL Server Analysis Services (SSAS) is not provided as part of the SQL Azure™ services. Hence, a custom
component must be built to provide analysis services, cube formation, and cube querying functionality on SQL Azure™.

In the proposed solution, the analytical engine has the following parts:

•   Batch Process (Azure™ worker role): This Azure™ worker role would be responsible for the creation of data marts and offline reports.

	         • Data-Mart Processor: Responsible for creating new data marts (SQL Azure™ tables) from the data warehouse (Azure™ table storage)
	           for focused analysis. The multiple requests submitted by analysts from the BI portal to create data marts would be handled
	           asynchronously by batch-processing requests, implemented using Azure™ queues.

          • Offline Report Generator: Responsible for generating standard reports periodically and storing them in Azure™ blobs to make them
            readily available for the BI portal. This component would generate standard reports as per the configuration stored in
            Azure™ table storage.

•   Real Time Analytics (Azure™ web role): This Azure™ web role is one of the most important components used for analysis. It would be
    responsible for fetching data from data marts and presenting it on the BI portal for analysis. BI portal presentation of dynamic reports and
    KPI matrix and generation of ad-hoc reports on existing data marts are achieved through this component. It services analysis requests
    synchronously on the existing data marts, making real-time analysis possible on the data marts.


    Note: With Windows® Azure™ version 1.6 release (November 2011), running SSAS off Azure VM roles is not supported by Microsoft. Hence, until
    Microsoft recognizes SSAS as a first class citizen of the cloud, we suggest using the data-mart processor approach.



• Data Marts: Since the proposed data warehouse is created using Azure™ table storage, which is entity-value schema-based and
  non-relational, we propose to create data marts in the SQL Azure™ tables. This is primarily because existing analytical engines can also
  leverage the premium RDBMS capabilities offered by SQL Azure™ on the cloud without any changes. SQL Azure™ is a relational database
  and makes it easy to fetch data using complicated analytical queries. Power Pivot provides a quick and powerful analysis tool to be used
  with SQL Azure™. Moreover, the BI portal would be able to generate the desired reports and analyses out of SQL Azure™.

•   Application Data: Application data comprises configuration and customization data required as a part of the BI solution.

•   SQL Azure Reporting Services Reports: As part of the BI solutions, standard reports can be configured using SQL Azure™ Reporting
    Services (SARS) and can be made available from the BI portal.

•   Standard Reports: As part of the BI solution, standard reports need to be generated on the data using specific dimensions
    and measures. These standard reports can be generated in a batch process to reduce latency and can be kept available at all times. As
    explained previously, the batch process running on the Azure™ worker role generates these reports periodically.

•   BI Portal: This is the web portal ported on Azure™ web role. It interacts with the analytical engine to generate dashboards, ad-hoc reports,
    and visual analyses of data from multiple dimensions and measures. This BI portal would be accessible everywhere over the Internet and
    would be made available over multiple delivery channels including desktop, mobile, and PDAs.

•    Windows Azure Data Marketplace Dataset External Measures: The analytics engine can be configured to use specific datasets exposed
    from Windows Azure™ data marketplace. These datasets would be used as an external measure, along with the data mart measures, for
    analysis. Examples of such datasets that can be used as external measures could be demographic information of customers, upcoming
    business/stores in nearby locations, and weather conditions impacting sales for specific locations.
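
The asynchronous handling of data-mart requests described for the batch process can be sketched with a queue and a background worker. Python's in-process `queue.Queue` and a thread stand in here for the cloud queue and the worker role; the request format, the `build_data_mart` filter logic, and the sample rows are hypothetical.

```python
import queue
import threading

# Hypothetical warehouse rows; in the proposed design these would live in table storage.
WAREHOUSE_ROWS = [
    {"category": "grocery", "region": "south", "sales": 120.0},
    {"category": "grocery", "region": "north", "sales": 80.0},
    {"category": "apparel", "region": "south", "sales": 200.0},
]


def build_data_mart(rows, filters):
    """Select the warehouse rows that match every filter in the request."""
    return [row for row in rows if all(row.get(k) == v for k, v in filters.items())]


def batch_worker(request_queue, rows, marts):
    """Process requests one at a time until the None shutdown signal arrives."""
    while True:
        request = request_queue.get()
        try:
            if request is None:
                return
            marts[request["mart_name"]] = build_data_mart(rows, request["filters"])
        finally:
            request_queue.task_done()


request_queue = queue.Queue()
marts = {}
worker = threading.Thread(target=batch_worker, args=(request_queue, WAREHOUSE_ROWS, marts))
worker.start()

# The portal side only enqueues requests and returns immediately.
request_queue.put({"mart_name": "grocery_south",
                   "filters": {"category": "grocery", "region": "south"}})
request_queue.put({"mart_name": "south_all", "filters": {"region": "south"}})
request_queue.put(None)  # shutdown signal for this demonstration
worker.join()

print(sorted(marts))
```

Decoupling submission from processing in this way lets the portal stay responsive while long-running mart creation proceeds in the background, and more workers can be added if the queue grows.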


Design Considerations
•   Geo-location and affinity group: Applications developed on Windows® Azure™ can be deployed across multiple data centers located
    around the world – South Central US, North Central US, West Europe, North Europe, East Asia, and South East Asia. The Windows Azure global
    footprint is rapidly growing as Microsoft continues to build new global data centers for Azure™ deployment. Selection of appropriate data
    centers and creating an affinity group for deployment should be considered for the following reasons:




        •   Regional legislation and regulations: Some organizations must, for compliance reasons, keep the application and its data
            within a specific geographical location, typically close to the region of business operations. These requirements can be
            addressed by deploying the Azure™ application in an appropriate data center.

        •   Performance: Data center proximity to end users helps reduce network latency and improve overall application
            performance. Creating an affinity group for application and data instances deploys these components within the same data
            center, bringing them closer together. Communication within the same affinity group is faster and helps improve
            application performance, especially when large data transfers are involved in activities such as
            reporting and data mart creation.

• Caching: Caching frequently used data such as reference data and infrequently modified data would help reduce data access calls and
  latency in serving requests. Moreover, since there would be multiple roles running in the Azure™ load-balanced environment, we need to
  consider using distributed caching systems such as Windows Azure AppFabric Caching services or Distributed Memcached.

• Partition keys for table storage: Partition keys used for the data warehouse should not create partitions so large that queries
  against them run inefficiently on Azure™. We need to consider using partition keys in all queries for better performance.

• Communication security for data in transit: We need to ensure transport-level security using SSL. For highly confidential data, we need to
  consider using message-level security, such as encryption and signatures.

• Processing Model: We should analyze the business use cases and choose the appropriate processing model, online or batch, for each.
  Long-running processes can be scaled effectively by assigning computation tasks to worker roles. Asynchronous processing based on
  message queues also provides data and processing reliability.

• SQL Azure™ Partition: Where the data mart grows beyond the 150 GB size limit of a single SQL Azure™ database, consider
  horizontal partitioning of a few tables. High-growth tables are good candidates for partitioning, and range-based keys or a stored hash of
  the keys can be used to identify a specific partition.
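
The partition lookup described in the last consideration can be sketched as follows. This is a minimal illustration, assuming a fixed number of data mart databases. The names SHARD_COUNT, hash_partition, RANGE_BOUNDARIES, and range_partition, as well as the date boundaries, are hypothetical and are not part of the original design.

```python
import bisect
import hashlib
from datetime import date

# Hypothetical number of SQL Azure databases holding slices of a high-growth table.
SHARD_COUNT = 4


def hash_partition(key: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a key to a partition index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


# Hypothetical range boundaries: each date is the first day of the next partition.
RANGE_BOUNDARIES = [date(2011, 1, 1), date(2012, 1, 1)]


def range_partition(order_date: date) -> int:
    """Map a date to a partition index using range-based keys."""
    return bisect.bisect_right(RANGE_BOUNDARIES, order_date)
```

A hash spreads rows evenly but scatters range queries across databases, whereas range-based keys keep a reporting period in one database at the risk of uneven growth. Which trade-off is acceptable depends on the query patterns of the data mart.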


Other Cloud-Based BI Implementation Models
According to the US National Institute of Standards and Technology (NIST), cloud computing comprises three service models: SaaS, PaaS, and
IaaS. The cloud-hosted BI solution explained in this paper was designed within the boundaries of the PaaS service model,
realized using Microsoft Azure™. The other cloud models available for implementing BI solutions are as follows:

        •   SaaS: This is the highest abstraction of the cloud. In this model, a finished application or solution is offered as a service. It is akin to
            a packaged product, offered through the cloud, with support for limited customization. Because it is a standard packaged solution,
            enterprises may find it difficult to map their unique customizations and heterogeneous data stores onto it.
            SaaS might be a good offering for smaller organizations to address their limited BI needs.

        •   IaaS: This is the lowest abstraction of the cloud. In this model, vendors provide basic hardware and software infrastructure as a
            service. Customers need to deploy their own software, ranging from the operating system to the end application. With this model,
            enterprises have to address software licensing and deployment themselves, which limits the benefits of the cloud
            computing platform.

Enterprises can select their cloud platform based on the criteria described in the figure below, driven by factors that make business sense in their
respective domains.




Figure 4: Business Intelligence Platform Evaluation Model

Deployment options compared, from private to public: On-Premise, IaaS, PaaS, SaaS

Selection criteria rated for each option (the ratings are graphical in the original figure and are not reproduced here):
•   Flexibility
•   Ease of Management (Hardware, Software & Infrastructure)
•   Control
•   Functional Richness
•   Application Building Blocks
•   Security & Compliance
•   Time to Market
•   QoS (Scalability, Availability, Reliability & Performance)

Preferred procurement choice:
    On-Premise        IaaS        PaaS         SaaS
    Buy               Buy         Build        Subscribe

The above evaluation model summarizes the business value realized in implementing a cloud-based BI solution on different cloud service
models. A model of this nature can help guide enterprises in selecting the most appropriate cloud service by mapping the expected outcome
of their BI initiatives to the business value realized from the different cloud service options available.
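
One way to apply an evaluation model of this kind is a weighted score per option. The sketch below is illustrative only: the criteria names come from Figure 4, but every weight and rating is a hypothetical placeholder rather than a value from the figure, and the function name rank_options is an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical weights expressing how much each criterion matters to one enterprise.
WEIGHTS: Dict[str, float] = {
    "Flexibility": 0.2,
    "Ease of Management": 0.3,
    "Control": 0.1,
    "Time to Market": 0.4,
}

# Hypothetical ratings on a 1-5 scale. Placeholders only, NOT the ratings in Figure 4.
RATINGS: Dict[str, Dict[str, int]] = {
    "On-Premise": {"Flexibility": 5, "Ease of Management": 1, "Control": 5, "Time to Market": 1},
    "IaaS": {"Flexibility": 3, "Ease of Management": 2, "Control": 4, "Time to Market": 2},
    "PaaS": {"Flexibility": 3, "Ease of Management": 4, "Control": 3, "Time to Market": 4},
    "SaaS": {"Flexibility": 1, "Ease of Management": 5, "Control": 1, "Time to Market": 5},
}


def rank_options(
    weights: Dict[str, float], ratings: Dict[str, Dict[str, int]]
) -> List[Tuple[str, float]]:
    """Return (option, weighted score) pairs, highest score first."""
    scores = {
        option: sum(weights[criterion] * rated[criterion] for criterion in weights)
        for option, rated in ratings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


for option, score in rank_options(WEIGHTS, RATINGS):
    print(f"{option}: {score:.2f}")
```

The ranking changes with the weights, which is the point of the exercise: each enterprise would substitute its own priorities and its own assessment of each service model.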


Concerns About BI in Cloud/Azure™
The cloud platform addresses most of the challenges that enterprises face in implementing and managing a traditional on-premise BI solution.
However, there are a few concerns around using the cloud for BI solutions. These concerns are common to any cloud implementation
and are not specific to BI. Let us briefly discuss them from a BI cloud adoption point of view. The most talked-about concern is
data security and compliance.

Enterprises have concerns about placing their confidential data on the cloud, where it would be replicated onto multiple servers. Technically,
cloud technology treats all data in a similar fashion, and that raises concerns around information security. To address this problem
practically, compliance rules need to be amended to keep pace with the evolution of technology. At the same time, cloud vendors need to
provide mechanisms that meet compliance requirements more effectively. Until then, a hybrid solution as proposed
in the high-level design in this paper is an option that can be explored: critical data is stored on-premise but exposed as a service for
integration and aggregation purposes, while transactional data is stored in the cloud.
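
The hybrid arrangement described above can be sketched as follows. This is a simplified illustration, assuming the on-premise system exposes a lookup service. The function fetch_customer_segment stands in for that service call, and all names and sample records are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

# Stand-in for critical data that stays on-premise (hypothetical sample records).
_ON_PREMISE_SEGMENTS: Dict[str, str] = {"C1": "Gold", "C2": "Silver"}


def fetch_customer_segment(customer_id: str) -> str:
    """Hypothetical on-premise service call; a real one would use a secured channel."""
    return _ON_PREMISE_SEGMENTS.get(customer_id, "Unknown")


def sales_by_segment(
    transactions: Iterable[Tuple[str, float]],
    lookup: Callable[[str], str] = fetch_customer_segment,
) -> Dict[str, float]:
    """Aggregate cloud-held transactions by a segment resolved on-premise at query time."""
    totals: Dict[str, float] = defaultdict(float)
    for customer_id, amount in transactions:
        totals[lookup(customer_id)] += amount
    return dict(totals)


# Transactional data held in the cloud (hypothetical sample records).
cloud_transactions = [("C1", 120.0), ("C2", 80.0), ("C1", 30.0), ("C9", 10.0)]
print(sales_by_segment(cloud_transactions))
```

In this sketch the critical attribute is resolved at aggregation time and only the aggregated totals leave the function, so the customer-level mapping is never written to the cloud store.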


Conclusion
As cloud computing evolves and grows, it will bring several distinct changes. We foresee changes in compliance
requirements and a shift in mindset toward making optimized use of cloud technology for decision support systems. BI, as
discussed, has particular characteristics and needs a customized solution approach. An integrated BI solution that combines
on-premise and cloud-based deployments is the most suitable option available, not only to realize the benefits of the cloud but also
to address enterprise concerns around it.

This paper has discussed in detail how Microsoft Azure™ can be a good fit for an enterprise looking to optimize and enrich its BI
solution. This paper also envisages an integration pattern for hybrid in-cloud and on-premise solutions developed using Windows® Azure™.
This pattern is not limited to BI solutions; it can also be used in other problem domains such as disaster recovery, data backup, seasonal
campaigns, and collaboration solutions. We hope to see interest in developing a greenfield BI solution, migrating an
existing BI solution, or using the proposed aggregation design to implement solutions on Windows® Azure™.



  About the Author
  Sidharth Subhash Ghag (Sidharth_ghag@infosys.com) is a Senior Technology Architect with the Microsoft Technology Center (MTC) in Infosys.
  With several years of software industry experience, he currently leads solutions in Microsoft Technologies in the area of Cloud computing.
  He has also worked in the areas of SOA and service-enabling mainframe systems and on domains such as Finance, Utilities, and Transportation.
  He has been instrumental in helping Infosys clients with service orientation of their legacy systems. Currently, he helps customers adopt
  Cloud computing within their Enterprise. He has authored papers on Cloud computing and service-enabling mainframe systems. Sidharth
  blogs at http://www.infosysblogs.com/cloudcomputing



Acknowledgement
Sachin Kumar Sancheti, Technical Architect, for his immense contribution in preparing the initial draft and for technical input provided during
his tenure in the organization.

Yogesh Bhatt, Principal Architect, Infosys Labs and Sudhanshu Hate, Senior Technology Architect, Infosys Labs for paper review.




About Infosys
Many of the world's most successful organizations rely on Infosys to
deliver measurable business value. Infosys provides business consulting,
technology, engineering and outsourcing services to help clients in over
30 countries build tomorrow's enterprise.

For more information, contact askus@infosys.com
www.infosys.com
© 2012 Infosys Limited, Bangalore, India. Infosys believes the information in this publication is accurate as of its publication date; such information is subject to change without notice. Infosys acknowledges
the proprietary rights of the trademarks and product names of other companies mentioned in this document.

 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 

Último (20)

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 

Business Intelligence Solution on Windows Azure

  • 1. White Paper Business Intelligence Solutions on Windows® Azure™ - Sidharth Subhash Ghag Abstract Enterprise Business Intelligence (BI) solutions today are analyzing growing amounts of data. More often, the data is historical in nature, coming from within the enterprise and also from external channels such as the Web, mobile, and devices. This has led to the growth of data volume to alarming levels. In traditional BI implementations, this information explosion, along with increasing demands on computational power to process high volumes of data, has been managed through expensive hardware and software upgrades. This is a highly inefficient approach to meet the demands of a growing business, one that the enterprise considers economically unfavorable. With the global scale of operation of large enterprises, the need of the hour is to make information available to partners, remotely-located analysts, and managers who are on the move. This in turn results in additional demands on infrastructure and IT. This white paper discusses how Cloud Computing might help address these challenges with its round-the-clock availability and its dynamic and scalable nature. Cloud infrastructure would be beneficial in terms of offloading - BI storage, long running processes, and handling erratic load behaviors. The proposed solution discussed in this paper is an alternative BI architecture providing an optimal solution that extends existing BI infrastructure. www.infosys.com
  • 2. BI Process Overview
Primarily, a BI solution has two parts: data storage and analysis. The stored raw data is an asset that needs to be cleansed and processed to derive information for making decisions. The information has to be presented to the decision makers in an intuitive and highly interactive manner, so that key strategic decisions can be made in the least possible time. BI relies on data warehousing (a data repository designed to support an organization’s decision making). Ineffectively managed data warehouses make it difficult for organizations to quickly extract the data needed for analysis and practical decision-making. The BI process can be represented using the following diagram:
Figure 1: BI Process

Online Transaction Data
Online transactional data (operational data) from multiple systems (finance, sales, and CRM) is extracted and processed to eliminate data redundancy, or is optimized for storage in a data warehouse. The purpose of creating a data warehouse is to bring information from heterogeneous systems to a common data storage platform.

Data Warehouse
A data warehouse is an independent master store of all the historical transactional data for any enterprise. Extracting transactional data from multiple systems and then cleansing it for further analysis is the most important activity in establishing a data warehouse. The process of accumulating data largely depends on the source systems from which the data is retrieved. In most cases, this accumulation process is customized to handle the multiple data sources and data rules, easing the transformation of data from disparate systems so that it can be stored on a single platform.

Data Marts
Although a data warehouse is a storehouse for voluminous data, it is difficult to process complex analytical queries or jobs directly off the data warehouse. Thus, the data warehouse is broken down logically, or sometimes physically, into smaller analysis units called data marts. Data marts can be conceptualized as units of data storage used for dedicated analysis, which are generated using specific filters and queries. Data marts contain specialized multi-dimensional data structures called data cubes. Unlike relational database tables, which have only two dimensions (row and column), a data cube has multiple dimensions. Typical data mart queries include how grocery products sold in the last six months and how a promotion performed in the southern region in the last six months. Data marts are useful for such focused analysis. Since the data warehouse is responsible for storing high volumes of historical and ever-growing data, a data warehouse solution should be cost-effective and reliable and should always be available to other components for analysis and reporting.

Reports, Dashboards and Key Performance Matrix
Analysis is the process of slicing and dicing a set of information to interpret a pattern that can be used to justify a certain impact or to support further planning. The analytics engine works on data marts. The purpose of the analytics engine is to execute complex queries and present data with multiple dimensions and measures. Dimensions and measures are key parameters in BI that help slice and dice information to make it more precise for decision makers.
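To make the idea of dimensions and measures concrete, the following is a minimal Python sketch of slicing and rolling up a tiny set of fact rows, in the spirit of the data mart queries mentioned above. The row layout, the field names (`region`, `category`, `month`, `sales`), the helper names, and the figures are all assumptions made for illustration; they do not come from the paper and do not represent any particular analytics engine.

```python
from collections import defaultdict

# Hypothetical fact rows: three dimensions (region, category, month)
# and one measure (sales). All values are invented for illustration.
FACTS = [
    {"region": "South", "category": "Grocery", "month": "2011-01", "sales": 120.0},
    {"region": "South", "category": "Grocery", "month": "2011-02", "sales": 150.0},
    {"region": "North", "category": "Grocery", "month": "2011-01", "sales": 90.0},
    {"region": "South", "category": "Apparel", "month": "2011-01", "sales": 60.0},
]


def slice_facts(facts, **fixed):
    """Slice: keep only the rows whose dimensions match the fixed values."""
    return [row for row in facts if all(row[d] == v for d, v in fixed.items())]


def roll_up(facts, dimension, measure):
    """Roll up: total one measure for each distinct value of one dimension."""
    totals = defaultdict(float)
    for row in facts:
        totals[row[dimension]] += row[measure]
    return dict(totals)


# "How did grocery products sell in the southern region, month by month?"
southern_grocery = slice_facts(FACTS, region="South", category="Grocery")
sales_by_month = roll_up(southern_grocery, "month", "sales")

# The same measure viewed along a different dimension.
sales_by_region = roll_up(FACTS, "region", "sales")
```

A real data cube would precompute such aggregates across many dimension combinations; this sketch only shows the relationship between fixing dimensions (slicing) and aggregating a measure.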
  • 3. Data presentation is a crucial component in analysis. The richer the presentation of the data to be analyzed, the better it is for decision makers to examine the information. This presentation layer helps in presenting reports, KPI matrix, and dashboards to the end user for slicing and dicing information. These rich reports also support ‘what-if’ scenario analyses. A BI system is an aggregation of multiple systems and sub-systems. Data storage, information slicing and dicing tools, and reporting or rich visualization interfaces are some of the sub-systems of a typical BI system. This peculiarity of structure and integration creates inherent challenges. Let us look at the typical challenges faced by enterprises in implementing and using BI solutions.

BI Implementation Challenges
• Intermittent demands for storage
Since a data warehouse is the backbone of the entire BI solution, it becomes important to manage this data warehouse properly to keep it running all the time. The data warehouse is a storehouse for large datasets, and it is not possible to keep all of the data active so that it may be used for on-demand analysis. In certain scenarios, historical data that has otherwise been inactive for some time may need to be activated. Activation of historical data involves obtaining the backed-up tapes, retrieving the data, and loading and fitting it into the currently active data warehouse or data marts, none of which is simple. Even if such a situation arises only once a month, it would still consume a considerable amount of IT operational resources. Storage demand increases with every such request because activating inactive data adds to the currently active data rather than taking away from it. The need for extra storage capacity adds to the investment in hardware and the pressure of managing it.
• Sub-optimal utilization of resources
As BI solutions have been in place for many years, it is highly likely that the number of users, size of the storage, and complexity of the systems have increased. An increase in users adds pressure on the scalability of the solution, which might have been provisioned long ago. There is also the opposite possibility, where an organization anticipated rapid growth in the number of users and planned storage and other infrastructure capacities upfront. In such cases, it is highly likely that the system remains underutilized, losing the opportunity to use the same investment elsewhere. The scalability challenge is crucial in deciding utilization as well as smooth running of the system.
• Lacking external dimension
On-premise BI solutions are mostly oriented around the transactional data of the enterprise. They lack the external dimensions and measures of analysis that are important for strategic analysis. A combination of internal data such as sales data and external data such as government-collected data and industry trends can be used to get better insight and plan effective strategies. External environmental data is available through different data marketplaces, which can help enhance the quality of analytics. Increasing demand to factor external entities into the analysis is adding pressure on the design and flexibility of BI solutions. Often, enterprises end up developing their own components or smaller, independent BI solutions to factor in these external entities.
• Lacking multi-channel delivery capabilities
Most enterprises have a workforce spread all over the world. These geographically distributed stakeholders demand round-the-clock availability and accessibility from any place. Enterprises that had not factored in this demand have ended up spending huge amounts of money and resources to address it. The need to make data warehouses and BI solutions available over the Internet with multiple delivery channels such as RIA, services, mobile, and browsers is increasing. This quick, easy, perennial accessibility gives enterprises an edge, helping them collaborate better and make decisions quickly. Thus, it becomes essential for enterprises to make their BI platform available over the Internet. This requirement not only demands additional investment in infrastructure, but also adds integration touch points.

Present businesses operate in highly dynamic environments influenced by factors such as changing business scenarios, changes in compliance and governance processes, new integration requirements adding to the complexities of the systems, and increasing pressure on the system to be responsive. These challenges multiply with the increasing demand for dynamism in the business, processes, and technologies. It is important for every enterprise to address these challenges and make use of its BI investment to get the best results.
  • 4. BI Solution Based on Cloud Computing
With more and more devices getting meshed and inter-connected on the information highway, demand for data and everything related to it will grow manifold. This information explosion will lead to the need for systems that can:
• Process large amounts of data efficiently and in near real-time
• Handle storage for data flowing in from the various systems and devices into storage units that can store large amounts of data
The figure shown below depicts a typical information flow landscape of any large enterprise in the future. Thus, a BI solution has to meet the high volume requirements of an enterprise, which constantly exchanges information with multiple stakeholders, systems, and devices as part of its day-to-day operations.
Figure 2: Typical Azure™ Business Intelligence Eco-System (diagram labels: Regulatory, Content Providers, Field Agencies, Devices/Appliances, Customers, Suppliers, Partners; Enterprise – Geo1 and Enterprise – Geo2, each with Sales, SCM, and CRM; DW, Transformation Engine, Analytical Engine, Portal & Reporting, Delivery Channels)

Cloud computing, a new generation technology platform for deploying and delivering software services, addresses the growth requirements of an enterprise and the commonly faced BI challenges. The value proposition delivered by cloud computing, which can address the needs of the BI platform for the future, includes:
• Capability to process voluminous and rapidly-growing data over the Internet
• Replication of machines, applications, and data storage at multiple instances to provide high availability
• Dynamic, elastic capability to support scaling up and down of infrastructure within minutes

Improved Cost Efficiency
Managing complexity and Total Cost of Ownership (TCO) with cloud storage solutions is more appealing than with traditional RDBMS data solutions, especially in a data warehouse scenario that deals with handling historic or inactive data. With cloud storage, data can be kept active at all times without needing the help of IT management to activate any historical data. Thus, cloud storage addresses the challenges of intermittent data storage access, particularly when there is an urgent need to reload historical data, say to meet compliance-related queries.
  • 5. Elastic and Scalable
A cloud-based solution offers users the capability to provision cloud resources such as computing services, storage services, and cache services almost instantaneously. This infrastructure-level flexibility allows one to handle workload fluctuations, both planned and unplanned, in an elastic manner without having to plan for any investments upfront. The elastic and scalable nature of the cloud, along with the pay-as-you-go model, aligns well with enterprise needs, so that the business gets a more transparent and assured view of its IT resource consumption.

Interoperable
Since the cloud is available over the Internet and can easily provide interoperable endpoints such as REST and SOAP, the architecture supports easy integration with external services. Relatively easy and quick integration with externally available interface endpoints lets enterprises add external dimensions to their analysis. These rich sets of external dimensions provide a platform for the enterprise to logically consider factors for their analysis, be it competitor data, national/international growth data, neighborhood safety, climate effect, or new stores or services in the neighborhood.

Available Anytime Anywhere
The cloud is available ubiquitously and can be accessed through standard HTTP protocols. Enterprises do not have to spend extra money or resources to make the solution available over the Internet. Concerns such as provisioning and hardening are inconsequential with the cloud. The cloud helps enterprises support multiple delivery channels that allow information to reach stakeholders, including employees, mobile field agents, and external partners, easily.

Even as the cloud computing platform is growing, different vendors are adding to the rich set of building blocks required to develop enterprise applications on the cloud. The basic principle in developing these building blocks is to be able to integrate easily and quickly. All the vendors are striving for open and interoperable standards of integration, making it easier to use these enterprise application services on any cloud platform. It also delivers the advantage of making the system agile enough to handle the changes required to address dynamic business and technical needs. These characteristics of the cloud computing platform make the implementation of large BI solutions possible in an easy and relatively inexpensive manner. Cloud computing platforms are maturing, and cloud vendors are trying hard to increase the functional and technical richness of their offerings to drive innovations. These innovations would help enterprises in better management, easy decision making, and being more competitive.

We will explore Microsoft Azure™, a public cloud platform that offers Platform as a Service (PaaS), for developing the next generation cloud-based BI solution. PaaS offers hosted scalable application servers with necessary supporting services such as storage, security, and integration infrastructure. A PaaS platform also provides development tools and application building blocks to develop custom solutions on the cloud. Though we have selected PaaS for our proposed solution, there are two other cloud delivery models: Software as a Service (SaaS) and Infrastructure as a Service (IaaS), which we will discuss briefly in this paper.
  • 6. Azure™ Based BI Solution
We will now attempt to explain a high-level design for a custom-built BI solution on Windows® Azure™. Let us first get acquainted with the Azure™ terminologies given in the following table:

Windows® Azure™: A cloud operating system platform that provides the computing capability on a cloud
Azure™ Table Storage: Entity/Key value or tuple store-based service capabilities provided by Microsoft Azure™ to address large, structured, and scalable data storage
Azure™ Blob Storage: Large and scalable data storage made available by Microsoft Azure™ for unstructured data such as documents and media files
Azure™ Queue: Queue service offered by Microsoft Azure™ for message orchestration and asynchronous request processing
SQL Azure™: Relational database capability similar to SQL Server made available by Microsoft Azure™ to address relational database capabilities on the cloud
Web Role: A web server instance to run web applications readily available at http/https endpoints for access. A web role is simply a web server provided by Microsoft Azure™
Worker Role: A computing instance for executing long running processes on Microsoft Azure™
VM Role: A role used to run a virtual hard disk image, store that image in the cloud, and load and run it on demand. The role is highly suited for moving legacy applications to the cloud with minimal effort
AppFabric Service Bus: A service-bus-like messaging platform on the cloud that allows on-premise applications to be available externally and to seamlessly connect with other systems
AppFabric Access Control Service (ACS): A claim-based authorization service that supports federated access to enterprise systems and services on the cloud. All authorization rules can be abstracted and managed from ACS independently of the application in a standards-oriented way
Windows® Azure™ Data Marketplace: An information marketplace that acts as an external dataset provider, which would be consumed by the BI stack to leverage external dimensioning metrics such as demographics, location, and other publicly available information to enrich the analytical reporting capabilities
Windows® Identity Foundation (WIF): An identity management framework that externalizes identity-related logic from an application. Federated single sign-on scenarios involving multiple stakeholders can be built on this framework. For the enterprise, this will also help integrate on-premise Active Directory-based authentication with the Azure™ deployed application

High-Level Design for Custom-Built BI Solution on Azure™
Owing to concerns around data privacy, security, and data ownership, enterprises have been cautious in adopting cloud computing. However, at the same time, they have also shown a keen interest in leveraging the value proposition offered by the cloud and the potential opportunity it presents in growing their businesses. Keeping these key aspects in mind, a hybrid BI solution is proposed to alleviate enterprise challenges. As shown in the figure below, the proposed solution divides the architecture into two distinct facets: an on-premise component and a cloud component.
  • 7. Figure 3: High-Level Design for Custom-Built BI Solution on Azure™

On-Premise Components

Data Cleansing and Profiling Agent
This agent would be responsible for collating transactional and unstructured data from on-premise systems, cleansing the data, and uploading it to a data warehouse developed on Azure™ table storage. This component can be extended to consider disparate data sources such as Oracle, SQL Server, mainframes, and Excel data. Cleansing and profiling would also be configurable according to business needs to handle business-specific rules, for example, that soft-deleted data and transactional data not in the published state should not be uploaded. The data transfer from the agent to the cloud would happen over a secured channel. This agent is usually a part of the Extract Transform Load (ETL) component.

Data Integration Layer
Based on the criticality of information, an enterprise may have structured data categorized into different levels. We will discuss the different data integration approaches to cover mission critical and non-mission critical data. Exposing master data on the cloud without having to upload the master data to cloud storage helps in maintaining data privacy and ownership in the hands of the enterprise. This would avoid the need to physically store confidential data such as credit card details, address information of customers, and salary information of employees on the cloud. It would instead be fetched from the enterprise as and when required.
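The configurable cleansing rules described for the data cleansing and profiling agent can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not specify how rules are expressed, so the rule functions, the record field names (`is_deleted`, `status`), and the whitespace trimming step are hypothetical, and the upload to Azure™ table storage is deliberately left out.

```python
# Each rule is a predicate; a record is uploaded only if every rule accepts it.
# Field names below are assumptions, not taken from the paper.

def not_soft_deleted(record):
    """Reject records that the source system has marked as soft-deleted."""
    return not record.get("is_deleted", False)


def is_published(record):
    """Reject transactional records that are not in the published state."""
    return record.get("status") == "published"


DEFAULT_RULES = [not_soft_deleted, is_published]


def cleanse(records, rules=DEFAULT_RULES):
    """Return cleaned copies of the records that pass every configured rule."""
    cleaned = []
    for record in records:
        if all(rule(record) for rule in rules):
            # Simple cleansing step: trim stray whitespace from string values.
            cleaned.append(
                {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
            )
    return cleaned


rows = [
    {"id": 1, "customer": "  Acme ", "status": "published", "is_deleted": False},
    {"id": 2, "customer": "Globex", "status": "draft", "is_deleted": False},
    {"id": 3, "customer": "Initech", "status": "published", "is_deleted": True},
]
uploadable = cleanse(rows)
```

Keeping the rules as a plain list of predicates is one way to make the agent configurable per business need: a deployment could pass a different `rules` list without changing the cleansing loop.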
  • 8. An on-premise component that forms a part of the integration layer would help in exposing the master data to the cloud. Technically, this can be achieved by leveraging the capabilities of the Azure™ AppFabric service bus. Azure™ AppFabric service bus, with its service virtualization capabilities, allows exposing on-premise components or services on the cloud without having to physically move the data outside the enterprise. The AppFabric service bus provides a publically accessible virtual endpoint on the cloud to any on-premise service endpoint it manages. This channel of communication between the Azure™ AppFabric service bus and the on-premise service can be secured at the transport level, which would be achieved by using SSL, and at the message level, which would be achieved by using standard encryption techniques. To avoid latency issues, which could be a cause of concern arising due to the external network hop between an on-premise and cloud environments, a distributed caching functionality can be implemented on the cloud. The analytical engine deployed on the cloud can be embedded with a caching component such as Azure™ AppFabric Cache to cache regularly-used master data and in turn reduce the effects of latency. Data integration achieved using service virtualization addresses data security concerns, but this comes at the cost of performance. It is, thus, advisable that for non-critical data, the data be transported and made to reside physically on the cloud, closer to the hosted application. This can be achieved by leveraging existing data integration techniques such as ETL, Change Data Capture (CDC), and Enterprise Information Integration (EII) implemented using a tool such as Microsoft’s SQL Server Integration Services (SSIS). Power Pivot ’Power Pivot for Excel’ is a data analysis tool that delivers unmatched computational power directly within the application and with a tool such as MS Excel, which users are fairly acquainted with. 
Power Pivot offers a user-friendly way to perform data analysis using familiar Excel features such as the common MS Office user interface shell, PivotTable and PivotChart views, and slicers. Power Pivot lets users analyze data marts offline, without being connected to the online data marts. It enables focused analysis on the data marts, which on-premise and on-the-move analysts can access at their own convenience.

ADFS 2.0
ADFS 2.0 is an identity provider service that enables an enterprise-level identity federation solution. It is developed on Windows® Identity Foundation (WIF) and makes it easy to integrate web applications with on-premise Active Directory user stores for authentication and authorization. The BI portal solution proposed here would implement claims-based authentication using WIF and ADFS 2.0, allowing enterprise users to log in to the system with their existing Active Directory credentials.

Azure™ Components

Cloud Data Warehouse
All the collated data uploaded by the cleansing and profiling agent would be stored in Azure™ table storage. Azure™ table storage is highly scalable and is an appropriate fit for persisting de-normalized data because of its Entity-Attribute-Value (tuple store) style of storage. No analytical processing or advanced queries would be run on the data warehouse. Hence, the cheaper Azure™ table storage is a better option than relational data stores such as SQL Azure™. Azure™ storage, through blobs, can also persist metadata of the data warehouse along with unstructured data such as files, documents, scanned images, and video files. The inexpensive storage delivered by table storage frees data warehouse administrators from having to deactivate historical data, a practice often followed in earlier BI systems because of the capacity limitations of on-premise storage facilities. CAPEX spending, normally involved in expanding storage to meet enterprise growth, is also eliminated.
However, because of the Pay-As-You-Use pricing model of Windows Azure services, OPEX spending would rise, but it would tend to align more closely with the demands of the growing business. A detailed assessment of the existing system, along with a year-on-year (Y-O-Y) ROI analysis of the Azure™ platform, can help provide a clear picture of the overall savings and business value that can be realized in the future.

Analytical Engine
The analytical engine is the most important component in the BI solution. The analytical engine:
• Prepares data required for focused analysis
• Applies algorithms for processing data based on different facts, measures, and dimensions
• Analyzes structured and unstructured information to surface patterns and predict trends that are usually difficult to spot with the naked eye or traditional reporting
• Identifies cases or exceptions in the data to isolate anomalies
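Two of the responsibilities listed above, processing data by facts, measures, and dimensions and identifying exceptions, can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the fact rows, the `region` and `month` dimensions, the `sales` measure, and the z-score rule with a threshold of 2.0 are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Fact = Tuple[str, str, float]  # (region dimension, month dimension, sales measure)

# Hypothetical fact rows used only for this sketch.
FACTS: List[Fact] = [
    ("North", "2011-10", 120.0),
    ("North", "2011-11", 130.0),
    ("South", "2011-10", 110.0),
    ("South", "2011-11", 125.0),
    ("East", "2011-10", 115.0),
    ("East", "2011-11", 480.0),
]


def total_by_region(facts: List[Fact]) -> Dict[str, float]:
    """Roll the sales measure up along the region dimension."""
    totals: Dict[str, float] = defaultdict(float)
    for region, _month, sales in facts:
        totals[region] += sales
    return dict(totals)


def find_exceptions(facts: List[Fact], threshold: float = 2.0) -> List[Fact]:
    """Return rows whose measure lies more than `threshold` population
    standard deviations away from the mean (a simple z-score rule)."""
    values = [sales for _region, _month, sales in facts]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [f for f in facts if abs(f[2] - mu) / sigma > threshold]


print(total_by_region(FACTS))
print(find_exceptions(FACTS))
```

A production engine would apply far richer algorithms; the sketch only shows the shape of a roll-up and an exception check.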
  • 9. As of now, SQL Server Analysis Services (SSAS) is not provided as part of the SQL Azure™ services. Hence, a custom component has to be built to provide analysis services, cube formation, and cube querying on SQL Azure™. In the proposed solution, the analytical engine has the following parts:
• Batch Process (Azure™ worker role): This Azure™ worker role would be responsible for the creation of data marts and offline reports.
• Data-Mart Processor: Responsible for creating new data marts (SQL Azure™ tables) from the data warehouse (Azure™ table storage) for focused analysis. The multiple requests submitted by analysts from the BI portal to create data marts would be handled asynchronously as batch-processing requests, implemented using Azure™ queues.
• Offline Report Generator: Responsible for generating standard reports periodically and storing them in Azure™ blobs so that they are readily available to the BI portal. This component would generate standard reports according to the configuration stored in Azure™ table storage.
• Real-Time Analytics (Azure™ web role): This Azure™ web role is one of the most important components used for analysis. It would be responsible for fetching data from data marts and presenting it on the BI portal for analysis. Presentation of dynamic reports and the KPI matrix on the BI portal, and generation of ad-hoc reports on existing data marts, are achieved through this component. It services analysis requests synchronously on the existing data marts, making real-time analysis possible.
Note: As of the Windows® Azure™ version 1.6 release (November 2011), running SSAS on Azure VM roles is not supported by Microsoft. Hence, until Microsoft recognizes SSAS as a first-class citizen of the cloud, we suggest using the data-mart processor approach.
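The asynchronous handling of data-mart requests described for the Data-Mart Processor can be sketched as a producer and a worker sharing a queue. The Python sketch below uses the standard library's in-process `queue.Queue` and a thread purely as stand-ins for an Azure™ queue and a worker role; it does not call any Azure API, and the request fields, warehouse rows, and function names are hypothetical.

```python
import queue
import threading
from typing import Dict, List

# In-process stand-ins (assumptions): `request_queue` plays the role of an
# Azure queue, `data_marts` the role of SQL Azure tables.
request_queue: queue.Queue = queue.Queue()
data_marts: Dict[str, List[dict]] = {}

WAREHOUSE = [  # hypothetical de-normalized warehouse rows
    {"region": "North", "product": "A", "sales": 10},
    {"region": "South", "product": "A", "sales": 7},
    {"region": "North", "product": "B", "sales": 4},
]


def build_data_mart(request: Dict[str, str]) -> None:
    """Create a focused data mart by filtering the warehouse on one column."""
    column, value = request["column"], request["value"]
    data_marts[request["name"]] = [row for row in WAREHOUSE if row[column] == value]


def worker() -> None:
    """Worker loop: process one request at a time until a None sentinel arrives."""
    while True:
        request = request_queue.get()
        if request is None:
            request_queue.task_done()
            break
        build_data_mart(request)
        request_queue.task_done()


thread = threading.Thread(target=worker)
thread.start()

# The portal side only enqueues requests and returns immediately.
request_queue.put({"name": "north_sales", "column": "region", "value": "North"})
request_queue.put({"name": "product_a", "column": "product", "value": "A"})
request_queue.put(None)  # sentinel that stops this sketch's worker

request_queue.join()  # wait until every queued item has been processed
thread.join()
print(sorted(data_marts))
```

Because the portal only enqueues work, a burst of requests lengthens the queue rather than blocking analysts, and more workers can be added to drain it faster.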
• Data Marts: Since the proposed data warehouse is created using Azure™ table storage, which is non-relational and based on an entity-value schema, we propose to create the data marts in SQL Azure™ tables. This is primarily because existing analytical engines can then leverage the RDBMS capabilities offered by SQL Azure™ on the cloud without any changes. SQL Azure™ is a relational database and makes it easy to fetch data using complex analytical queries. Power Pivot provides a quick and powerful analysis tool to be used with SQL Azure™. Moreover, the BI portal would be able to generate the desired reports and analyses from SQL Azure™.
• Application Data: Application data comprises the configuration and customization data required by the BI solution.
• SQL Azure Reporting Services Reports: As part of the BI solution, standard reports can be configured using SQL Azure™ Reporting Services (SARS) and made available from the BI portal.
• Standard Reports: The BI solution needs standard reports to be generated on the data using specific dimensions and measures. These reports can be generated in a batch process to reduce latency and can be kept available at all times. As explained previously, the batch analytics component running on the Azure™ worker role generates these reports periodically.
• BI Portal: This is the web portal hosted on an Azure™ web role. It interacts with the analytical engine to generate dashboards, ad-hoc reports, and visual analyses of data across multiple dimensions and measures. The BI portal would be accessible anywhere over the Internet and would be made available over multiple delivery channels, including desktops, mobile phones, and PDAs.
• Windows Azure Data Marketplace Dataset External Measures: The analytics engine can be configured to use specific datasets exposed by the Windows Azure™ Data Marketplace.
These datasets would be used as external measures, along with the data mart measures, for analysis. Examples of such datasets include demographic information about customers, upcoming businesses or stores in nearby locations, and weather conditions affecting sales at a specific location.

Design Considerations
• Geo-location and affinity group: Applications developed on Windows® Azure™ can be deployed across multiple data centers located around the world: South Central US, North Central US, West Europe, North Europe, East Asia, and South East Asia. The Windows Azure global footprint is growing as Microsoft continues to build new data centers for Azure™ deployment. Selecting appropriate data centers and creating an affinity group for deployment should be considered for the following reasons:
  • 10.
    • Regional Legislations/Regulations: Regulatory requirements may dictate that the application and its data be deployed within a specific geographical location. Some compliance requirements oblige organizations to keep their data geographically close to the region of business operations. These requirements can be addressed by deploying the Azure™ application in an appropriate data center.
    • Performance: Data center proximity to end users helps reduce network latency and improve overall application performance. Creating an affinity group for application and data instances deploys these components within the same data center and brings them closer together. Inter-process communication within the same affinity group is faster and helps improve application performance, especially when large data transfers are involved during activities such as reporting and data mart creation.
• Caching: Caching frequently used data, such as reference data and infrequently modified data, helps reduce data access calls and latency in serving requests. Moreover, since multiple roles would be running in the Azure™ load-balanced environment, we need to consider distributed caching systems such as the Windows Azure AppFabric Caching service or distributed Memcached.
• Partition keys for table storage: The partition keys used for the data warehouse should not create partitions so large that queries on Azure™ can no longer run efficiently. We need to consider using partition keys in all queries for better performance.
• Communication security for data in transit: We need to ensure transport-level security using SSL. For highly confidential data, we need to consider message-level security, such as encryption and signatures.
• Processing Model: We could analyze business use cases and choose the appropriate processing model, online or batch.
Long-running processes can be scaled effectively by using the worker role approach for computation tasks. Message-queue-based asynchronous processing also provides data and processing reliability.
• SQL Azure™ Partition: If the data-mart size grows beyond the single-database limit of 150 GB for SQL Azure™, consider horizontal partitioning of a few tables. High-growth tables are candidates for partitioning, using range-based keys or a stored hash of the keys to identify a specific partition.

Other Cloud-Based BI Implementation Models
According to the US-based National Institute of Standards and Technology (NIST), the cloud is composed of three service models: SaaS, PaaS, and IaaS. The design of the cloud-hosted BI solution explained in this paper was made within the boundaries of the PaaS cloud service model, realized using Microsoft Azure™. The other cloud models available for implementing BI solutions are as follows:
• SaaS: This is the highest abstraction of the cloud. In this model, a finished application or solution is offered as a service. It is akin to a packaged product, offered through the cloud, with support for limited customization. Since it is a standard packaged solution, enterprises may find it difficult to map their unique customizations and heterogeneous data stores onto it. SaaS might be a good offering for smaller organizations to address their limited BI needs.
• IaaS: This is the lowest abstraction of the cloud. In this model, vendors provide basic hardware and software infrastructure as a service. Customers need to deploy their own software, ranging from the operating system to the end application. With this model, the enterprise has to handle software licensing and deployment itself, which limits the benefits of the cloud computing platform.
Enterprises can select their cloud platform based on the criteria described in the figure below, driven by factors that make business sense in their respective domains.
  • 11. [Figure 4: Business Intelligence Platform Evaluation Model. The figure compares four deployment options (On-Premise, IaaS, PaaS, and SaaS), ranging from private to public, against these selection criteria: Flexibility; Ease of Management (Hardware, Software & Infrastructure); Control; Functional Richness; Application Building Blocks; Security & Compliance; Time to Market; and QoS (Scalability, Availability, Reliability & Performance). Preferred procurement choice: On-Premise: Buy; IaaS: Buy; PaaS: Build; SaaS: Subscribe.]

The above evaluation model summarizes the business value realized in implementing a cloud-based BI solution on different cloud service models. A model of this nature can help guide enterprises in selecting the most appropriate cloud service by mapping the expected outcome of their BI initiatives to the business value realized from the different cloud service options available.

Concerns About BI in Cloud/Azure™
The cloud platform addresses most of the challenges faced by enterprises in implementing and managing a traditional on-premise BI solution. However, there are a few concerns around using the cloud to implement BI solutions. These concerns are common to any cloud implementation and are not specific to BI. Let us briefly discuss them from a BI cloud adoption point of view.
The most talked-about concern is data security and compliance. Enterprises are wary of placing their confidential data on the cloud, where it would be replicated onto multiple servers. Technically, the cloud treats all data in a similar fashion, which raises concerns around information security. To address this problem practically, compliance rules need to be amended to keep pace with the evolution of the technology. At the same time, cloud vendors need to provide mechanisms that meet compliance requirements more effectively.
Until then, a hybrid solution as proposed in the high-level design in this paper is an option that can be explored: critical data is stored on-premise but exposed as a service for integration and aggregation purposes, while transactional data is stored in the cloud.
  • 12. Conclusion
As cloud computing evolves and grows, it will bring several distinct changes. We foresee changes in compliance requirements and a shift in mindset toward making optimized use of cloud technology from the decision support system perspective. BI, as discussed, has particular needs and calls for a customized solution approach. An integrated BI solution that combines on-premise and cloud-based deployments is the most suitable option available, not only to realize the benefits of the cloud but also to address enterprise concerns around it. This paper has discussed in detail how Microsoft Azure™ can be a good fit for an enterprise that wants to optimize its BI solution while enriching it for the future.
This paper also envisages an integration pattern for hybrid in-cloud and on-premise solutions developed using Windows® Azure™. This pattern is not limited to BI solutions; it can also be used in other problem domains such as disaster recovery, data backup, seasonal campaigning, and collaboration solutions. We hope to see a lot of interest generated in developing a greenfield BI solution, migrating an existing BI solution, or using the proposed aggregation design for implementing solutions on Windows® Azure™.

References
http://www.powerpivot.com/
http://msdn.microsoft.com/en-us/security/aa570351.aspx

About the Author
Sidharth Subhash Ghag (Sidharth_ghag@infosys.com) is a Senior Technology Architect with the Microsoft Technology Center (MTC) in Infosys. With several years of software industry experience, he currently leads solutions in Microsoft Technologies in the area of Cloud computing. He has also worked in the areas of SOA and service-enabling mainframe systems and on domains such as Finance, Utilities, and Transportation. He has been instrumental in helping Infosys clients with service orientation of their legacy systems. Currently, he helps customers adopt Cloud computing within their Enterprise.
He has authored papers on Cloud computing and service-enabling mainframe systems. Sidharth blogs at http://www.infosysblogs.com/cloudcomputing

Acknowledgement
Sachin Kumar Sancheti, Technical Architect, for his immense contribution in preparing the initial draft and for technical input provided during his tenure in the organization. Yogesh Bhatt, Principal Architect, Infosys Labs, and Sudhanshu Hate, Senior Technology Architect, Infosys Labs, for paper review.

About Infosys
Many of the world's most successful organizations rely on Infosys to deliver measurable business value. Infosys provides business consulting, technology, engineering and outsourcing services to help clients in over 30 countries build tomorrow's enterprise.
For more information, contact askus@infosys.com
www.infosys.com
© 2012 Infosys Limited, Bangalore, India. Infosys believes the information in this publication is accurate as of its publication date; such information is subject to change without notice. Infosys acknowledges the proprietary rights of the trademarks and product names of other companies mentioned in this document.