1. Introduction to Pentaho - BI
View Pentaho Course at www.edureka.co/pentaho-business-intelligence-training
2. How it Works?
LIVE Online Class
Class Recording in LMS
24/7 Post Class Support
Module Wise Quiz
Project Work on Large Database
Verifiable Certificate
Slide 2 www.edureka.co/pentaho-business-intelligence-training
Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions
3. Objectives of this Session
Understand Pentaho Business Intelligence
A very successful case study
Understand key product offerings: BA, PDI and PRD
Pentaho Demo
Job trends in Pentaho
For queries during the session and class recording:
Post on Twitter @edurekaIN: #askEdureka
Post on Facebook /edurekaIN
4. What is Business Intelligence?
Business Intelligence is the process of transforming business data into information and knowledge using
computer-based techniques, enabling users to make effective, fact-based decisions
Business Data → Business Intelligence → Information/Knowledge → Decision
5. Business Intelligence: Concept
Business Intelligence has been in place for the last few decades
Its primary focus was structured data, integrated in a Data Warehouse (DWH)
With time and evolving technologies, BI has evolved to cater to disruptive data growth
BI can crunch enormous volumes of structured and unstructured data to help identify or create strategic business
opportunities
BI is essentially intended for the following three things:
1. Precise and concise interpretation of (voluminous) data
2. Identifying new opportunities
3. Implementing an effective strategy to gain a competitive edge
7. Business Intelligence: Need of the time
In today’s increasingly competitive global market, business leaders face the following questions.
How can I retain or improve the ROI of my business?
Each question below is followed by the BI capability that addresses it:
Q What insightful decision can I make from an ocean of data, and how quickly can I make it? End to End BI Solution
Q How can I integrate heterogeneous data feeds into a common platform for further analysis? ETL
Q How can I interpret raw data in the best possible manner? Data Discovery, Analytics
Q Can I predict the future trajectory of my business? Predictive Analytics, Machine Learning
Q What is the best way to share the data? Visualization, Reporting
Q How can I monitor the dynamics of changing trends? Dashboard
8. Business Intelligence: Existing Solutions
Various BI vendors offer end-to-end solutions in the market. They are traditionally classified as:
» Large BI Vendors
» New Breed
9. Business Intelligence: Existing Challenges
Existing solutions can address these problems, but they bring several challenges of their own:
» Integration time
» Infrastructure cost
» Maintenance cost
» Licensing cost
» Single platform
10. Business Intelligence: Existing Challenges
How will I solve all these challenges?
Peter, the Java developer, is trying to address those challenges.
13. Pentaho : Solution for Existing Challenges of BI
Pentaho
14. Mapping of Pentaho Stack to BI Stack
Pentaho Tool                    BI Stack
Data Integration (PDI)          ETL
Metadata Editor (PME)           Metadata Management
Pentaho BA                      Analytics
Report Designer (PRD)           Operational Reporting
Saiku                           Ad hoc Reporting
CDE                             Dashboards
Pentaho User Console (PUC)      Governance/Monitoring
15. Why Pentaho?
One-stop solution for all business analytics needs
Low integration time and infrastructure cost
Huge community support
Easily scalable
Virtually unlimited visualization options and data sources
Easily integrated on top of existing infrastructure, including any big data ecosystem
All core engines are open, standalone projects with their own community and development plan
A very good toolset with applicability far beyond the base product
16. Pentaho BA Server: EE vs. CE
Features                          Open Source            Professional Standard  Professional Premium  Enterprise
Enterprise Reporting              YES                    YES                    YES                   YES
Interactive Reporting             NO                     YES                    YES                   YES
Interactive Analysis & Reporting  NO                     YES                    YES                   YES
Dashboard                         NO                     YES                    YES                   YES
Data Integration                  YES (w/o scheduling)   YES                    YES                   YES
Secured Data Integration          NO                     YES                    YES                   YES
Connectivity                      Basic                  Expanded               Expanded              Expanded+
SSO                               NO                     YES                    YES                   YES
Mobile                            NO                     YES                    YES                   YES
Release Patches                   NO                     YES                    YES                   YES
Portal Support                    NO                     YES                    YES                   YES
Phone Support                     NO                     NO                     YES                   YES
24x7x365                          NO                     NO                     YES                   YES
SLA                               NO                     4hr                    1hr                   1hr
Named support contacts            NO                     2                      3                     3
Training Credits                  NO                     8                      8                     12
17. Pentaho Stack: Abstract Component View
Pentaho Open BI Suite (layers, top to bottom)
Presentation Layer: Browser, Portal, Office, Web Services, E-mail
Reporting: Production, Operational, Ad-hoc
Analysis: Data Mining, OLAP, Drill & Explore
Dashboards: Metrics, KPIs, Alerts
Process Management: Integration, Definition, Execution
Business Intelligence Platform: Security, Administration, Business Logic, Repository
Data and Application Integration: ETL, Metadata, EII
3rd Party Applications: ERP/CRM, Legacy Data, OLAP, Other Applications, Local Data
18. Pentaho Stack: Architecture View
Data Layer
» RDBMS
» Hadoop
» NoSQL
» High Performance Analytic Database
» Flat Files
Server Layer
» BA Server (Data Models: Pentaho Analysis, Pentaho Metadata)
» DI Server
» Enterprise Console Server
» RDBMS Repository
Client Layer
» Thin Client: User Console, Interactive Reporting, Analyzer, Dashboard Designer, Enterprise Console
» Thick Client: Schema Workbench, Aggregation Designer, Metadata Editor, Report Designer, Data Integration, Data Studio
Pentaho has a presence in all three layers, with the respective products listed above
It supports a hosted solution
19. Pentaho Stack: Diving Deeper Continues…
Having gone through the different layers and Pentaho's offerings in each of them, let's look at the individual
Pentaho products in the next slides...
20. Pentaho BA Server (Server Layer)
Runs on localhost:8080
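A quick way to confirm that a locally installed BA Server is up is to probe that address. The sketch below is only an illustration: it assumes the web application is served under the context path /pentaho (an assumption, not stated on the slide), and ba_server_url and is_reachable are hypothetical helper names, not part of any Pentaho API.

```python
import urllib.error
import urllib.request


def ba_server_url(host="localhost", port=8080, context="pentaho"):
    """Build the base URL of a BA Server (the context path is an assumption)."""
    return f"http://{host}:{port}/{context}"


def is_reachable(url, timeout=3.0):
    """Return True if something answers HTTP at the URL, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status such as 401 or 404.
        return True
    except (urllib.error.URLError, OSError):
        # Nothing is listening, or the host cannot be resolved.
        return False
```

Calling is_reachable(ba_server_url()) after starting the server returns True once the port accepts HTTP requests.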
21. Pentaho BA Server (Server Layer)
This platform, recently renamed from BI to BA, is the core software component on the server
side, with the following features:
Hosts content created on the server itself through plug-ins, or files published to the server
from desktop applications
Manages security
Runs reports
Displays and designs dashboards
Report bursting
OLAP (Online Analytical Processing) analysis
Mondrian with MDX support
Pentaho Dashboard Designer (CDE for CE)
22. Pentaho BA Server (Server Layer)
Pentaho Analyzer (Saiku)
Scheduling and Monitoring
Can be extended with commercial as well as open source (CTools) plug-ins
Runs on Apache Tomcat by default, but can be embedded in any Java-based application server
Available in two flavors: Community Edition (CE) and Enterprise Edition (EE)
23. What is Reporting?
Regular provision of information (Visualization of filtered data)
Derived from historical data (Base data)
For decision makers (Decision)
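The three points above can be sketched in a few lines: take historical base data, filter it, and present a summary a decision maker can read. This is a minimal illustration in plain Python, not Pentaho Reporting; the HISTORY records and the monthly_report function are hypothetical.

```python
from collections import defaultdict

# Hypothetical historical sales records (the base data).
HISTORY = [
    {"month": "2014-01", "region": "North", "revenue": 1200.0},
    {"month": "2014-01", "region": "South", "revenue": 800.0},
    {"month": "2014-02", "region": "North", "revenue": 1500.0},
    {"month": "2014-02", "region": "South", "revenue": 700.0},
]


def monthly_report(records, region):
    """Filter the base data by region and total the revenue per month."""
    totals = defaultdict(float)
    for row in records:
        if row["region"] == region:
            totals[row["month"]] += row["revenue"]
    lines = [f"Revenue report for region {region}"]
    for month in sorted(totals):
        lines.append(f"{month}  {totals[month]:>10.2f}")
    return "\n".join(lines)
```

Running monthly_report(HISTORY, "North") on a schedule and distributing the text is the "regular provision of information" part.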
24. Pentaho Report Designer (Design Tool)
Pentaho Report Designer (PRD) is a pixel-accurate reporting tool. It is open source and offers a rich set of features,
as listed below:
Rich GUI
Creates relational and analytical reports from a wide range of data sources
Supports sub-reports, charts and graphs
Pentaho Reporting Engine (PRE) executes the reports designed in PRD
Supports API-based reporting, which can be easily embedded in any Java application
26. Extract, Transform and Load (ETL) - BI Concept
Source
» Relational Database
» Files
» NoSQL
» External Applications (SAP, Salesforce, etc.)
» Hadoop
» Cassandra
» MongoDB
» Netezza
ETL Process
» Extract: Data Cleansing, Data Profiling
» Transformation: Aggregation, Joining, Filtering, Sorting
» Loading: Create and execute a workflow to load data from source to target
Target
Data Warehouse (DWH)
» Data Mart 1
» Data Mart 2
» Data Mart 3
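The flow from source through the ETL process to the target can be sketched as follows. This is a minimal illustration in plain Python, assuming a small CSV extract as the source and an SQLite table standing in for the warehouse; the data, the table name sales_by_region and all function names are hypothetical, and none of this is PDI itself.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would be a file or a database.
SOURCE_CSV = """order_id,region,amount
1,North,100.50
2,South,
3,north,49.50
4,South,200.00
"""


def extract(text):
    """Extract: read raw rows from the CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: cleanse, filter, aggregate and sort the rows."""
    totals = {}
    for row in rows:
        amount = row["amount"].strip()
        if not amount:  # filtering: drop rows with a missing amount
            continue
        region = row["region"].strip().title()  # cleansing: normalise case
        totals[region] = totals.get(region, 0.0) + float(amount)  # aggregation
    return sorted(totals.items())  # sorting by region name


def load(records, connection):
    """Load: write the aggregated rows into the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_region (region TEXT, total REAL)"
    )
    connection.executemany("INSERT INTO sales_by_region VALUES (?, ?)", records)
    connection.commit()


def run_etl(connection):
    """Run the three stages in order and return what landed in the target."""
    load(transform(extract(SOURCE_CSV)), connection)
    return connection.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    ).fetchall()
```

Calling run_etl(sqlite3.connect(":memory:")) runs the whole workflow against an in-memory target, which keeps the example free of side effects.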
27. Pentaho Data Integration
Referred to as PDI, and still popularly known as Kettle, its original project name. It is intended mainly for
Extract, Transform, Load (ETL).
It consists of:
DI Server (Server Application)
» Core component
» Executes jobs and transformations using the PDI engine
» Has default user- and role-based security. It can also be integrated with an
existing LDAP/Active Directory security provider
Design Tools (Standalone)
» Spoon (graphical designer for transformations and jobs)
» Kitchen (command-line runner for jobs)
» Pan (command-line runner for transformations)
» Carte (lightweight web server for remote execution)
Data Connections
Transformation
Job
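Kitchen runs jobs and Pan runs transformations from the command line, which is how PDI work is usually scheduled outside Spoon. The sketch below only builds such a command line; it assumes the scripts are invoked as ./kitchen.sh and ./pan.sh from the PDI installation directory and uses the -file, -level and -param options. The pdi_command helper and the file paths are hypothetical, so check the options against the documentation of your PDI version before relying on them.

```python
def pdi_command(tool, path, log_level="Basic", params=None):
    """Build a command line for Kitchen (jobs) or Pan (transformations)."""
    if tool not in ("kitchen", "pan"):
        raise ValueError("tool must be 'kitchen' or 'pan'")
    # Script location relative to the PDI install directory is an assumption.
    command = [f"./{tool}.sh", f"-file={path}", f"-level={log_level}"]
    # Named parameters are passed one per option, in a stable order.
    for name, value in sorted((params or {}).items()):
        command.append(f"-param:{name}={value}")
    return command
```

For example, pdi_command("kitchen", "/opt/etl/nightly_load.kjb", params={"RUN_DATE": "2014-06-30"}) returns a list that can be handed to subprocess.run once the paths match a real installation.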
28. Pentaho Data Integration: Layout Screenshot
29. Mondrian (Analysis Services)
ROLAP server (Relational OnLine Analytical Processing)
Part of Pentaho Analysis Services and codenamed Mondrian; very useful for slicing and dicing data
Open source OLAP (Online Analytical Processing) server written in pure Java
Supports the MDX (MultiDimensional eXpressions) query language
The Mondrian schema is a universal metadata descriptor supported by almost any OLAP client tool
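"Slice and dice" means restricting a cube on one or more dimensions and aggregating a measure over what remains. The sketch below shows the idea on a tiny in-memory fact table in plain Python; it is not Mondrian or MDX, and the FACTS rows, the dimension names and the aggregate function are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales).
FACTS = [
    (2013, "North", "Laptop", 100),
    (2013, "South", "Laptop", 80),
    (2014, "North", "Laptop", 120),
    (2014, "North", "Phone", 60),
    (2014, "South", "Phone", 90),
]
DIMENSIONS = {"year": 0, "region": 1, "product": 2}


def aggregate(facts, by, where=None):
    """Sum sales grouped by one dimension, restricted by optional filters.

    A filter on a single dimension is a "slice"; filters on several
    dimensions at once form a "dice".
    """
    where = where or {}
    totals = defaultdict(int)
    for row in facts:
        if all(row[DIMENSIONS[dim]] == value for dim, value in where.items()):
            totals[row[DIMENSIONS[by]]] += row[3]
    return dict(totals)
```

Here aggregate(FACTS, "region", {"year": 2014}) is a slice on year, and adding "region": "North" to the filter turns it into a dice. In Mondrian the same question would be written as an MDX query against a cube defined in the schema, and the ROLAP engine would translate it into SQL on the underlying relational tables.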
30. Dashboard Capability
You can create charts containing KPIs and publish them in a dashboard on the BA Server to share with a broader audience
There are several ways to build a dashboard:
» Pentaho Dashboard Designer, available only in EE
» Community Dashboard Editor (CDE), the community option, which is part of the CTools plug-ins
» Saiku plugin (EE)
We will be using CDE, which came to Pentaho through its acquisition of Webdetails
CDE can be integrated with the Pentaho BA Server (CE/EE) from the Marketplace
It is based on the common framework CDF and simplifies the creation, editing and rendering of CTools dashboards
You can also use custom CSS to get your own look and feel
33. Pentaho Case Study: Lufthansa
Business Challenge:
This global aviation company wanted to build a solution to make its core process, Inter Airline Through Check-In
(IATCI), measurable, accessible and available in real time for operational monitoring through a dashboard
Pentaho Solution:
Pentaho Business Analytics: real-time data analysis and report generation
Pentaho Data Integration: data extraction, transformation and loading
Pentaho is installed on a standalone 16-core machine with a MySQL backend
34. Pentaho Case Study: Lufthansa
Value Added:
Lufthansa will be able to aggregate and feed data into a management cockpit to analyze collected data for
key decision-making purposes
Management now receives instant notification of transmission errors, enabling the company to detect
patterns in large amounts of data in real time
Partner airlines also receive alerts in the case of transmission errors between different IT systems for
passenger data
Lufthansa is able to comprehensively monitor one of its most important core processes in real time for
quality management: the transfer of passenger data between different airlines
35. Pentaho in Production
36. Job Opportunity and Trends on Pentaho
37. Course Curriculum
Module 1
» Introduction to Pentaho BI Suite
Module 2
» Pentaho Report Designer - Basic
Module 3
» Pentaho Report Designer - Advanced
Module 4
» Pentaho Data Integration – Introduction
Module 5
» Pentaho Data Integration - Transformation
Module 6
» Pentaho Data Integration - Job and More
Module 7
» Pentaho BA Server and User Console
Module 8
» Project
TODO: instead of vendor names, can you please include their respective logos?
Need to explain each stack
Taken from http://www.pentaho.com/sites/default/files/uploads/resources/pentaho-edition-comparison.pdf
BA EE has additional features on top of CE:
- interactive reporting (Saiku)
- analyzer
- dashboard designer (CDE/CDF)
- Instaview, data discovery
TODO: Image taken from http://www.bizcubed.com.au/pentaho-pillars
Please recreate it.
TODO: Image taken from http://www.systopia.co.uk/wp-content/uploads/2012/07/software-flexible1.jpg
Please recreate it.