Here is a case study I developed to explain the different sets of functionality in the Pentaho Suite. I focused on functionality, features, illustrative tools and key strengths, with the aim of helping readers evaluate BI tools when selecting vendors. Enjoy!
2. About Pentaho
Founded in 2004 in Orlando, Florida, Pentaho, Inc. provides a
comprehensive, end-to-end open-source BI solution.
Pentaho is well known for its unified, embeddable platform
that tightly couples data integration (DI) and business
intelligence (BI), including OLAP services, reporting, dashboarding, data
mining and ETL capabilities.
“It is our mission to not only provide our customers with a business analytics
platform that fuels growth and innovation, but also one that is fast to deploy,
easy to use and extremely cost-effective.” – Quentin Gallivan, CEO, Pentaho
3. Pentaho Capabilities
Pentaho offers a suite of products that stems from the involvement of
the open source community. Pentaho allows its users to extract value
from all business data, including Big Data, and to improve
operational efficiencies, customer experience and compliance while
maximizing revenue (Russom 2013).
The products address each of the business needs listed below:
Data Integration
Business Analytics
Big Data Analytics
Embedded Analytics
Mobile BI
5. Pentaho Data Mining is an open source
desktop application. Pentaho uses the Waikato
Environment for Knowledge Analysis (Weka) to
search data for patterns. Weka consists
of machine learning algorithms for a
broad set of data mining tasks. It contains
functions for data processing, regression
analysis, classification methods, cluster
analysis, and visualization. Based on the
discovered patterns, users can predict
future trends.
Weka Data Mining
A comprehensive set of tools for machine learning and data mining to enhance
organizational insights through predictive analytics.
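As a rough illustration of predicting a trend from a discovered pattern, the sketch below fits a least-squares line to a few data points and extrapolates one step ahead. It is plain Python, not Weka or any Pentaho API; the function names and the monthly sales figures are made up for the example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data: sales for months 1 to 4.
months = [1, 2, 3, 4]
sales = [10.0, 12.0, 14.0, 16.0]

model = fit_line(months, sales)
forecast = predict(model, 5)  # extrapolate to month 5
```

A real data mining tool offers many more algorithms (classification, clustering and so on); the point here is only the pattern of fitting on known data and then predicting an unseen value.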
6. Data Integration
Pentaho’s data integration prepares and
blends data to create a complete picture
for an organization that will drive actionable
insights. The complete data integration
platform delivers accurate, “analytics
ready” data to end users from any
source. With visual tools to eliminate
coding and complexity, Pentaho puts big
data and all data sources at the fingertips
of business and IT users alike.
Pentaho Data Integration (PDI), also known as Kettle, is a
desktop application that consists of a core data
integration (ETL) engine and GUI applications
that allow the user to define data integration
jobs and transformations. It supports
deployment on single-node computers as well
as in a cloud or on a cluster.
7. Migrating data between applications or databases
Exporting data from databases to flat files
Loading data massively into databases
Data cleansing
Integrating applications
PDI can be used as a standalone application, or it can be used as part of the larger Pentaho Suite. As an
ETL tool, it is the most popular open source tool available. PDI supports a vast array of input and output
formats, including text files, data sheets, and commercial and free database engines. Moreover, the
transformation capabilities of PDI allow you to manipulate data with very few limitations.
PDI (Kettle) Additional Uses
(Dataprix.com)
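Two of the uses listed on this slide, exporting data from a database to a flat file and data cleansing, can be sketched in a few lines of code. The example below is a hand-written illustration in Python using only the standard library; it does not use PDI, and the `customers` table, its columns and the sample rows are hypothetical. In PDI the same steps would be defined visually rather than coded.

```python
import csv
import io
import sqlite3


def extract(conn):
    """Read raw customer rows from the source database."""
    return conn.execute(
        "SELECT id, name, email FROM customers ORDER BY id"
    ).fetchall()


def cleanse(rows):
    """Trim whitespace, lower-case e-mails, and drop rows without an e-mail."""
    cleaned = []
    for row_id, name, email in rows:
        email = (email or "").strip().lower()
        if not email:
            continue  # discard incomplete records
        cleaned.append((row_id, name.strip(), email))
    return cleaned


def load_to_csv(rows, out):
    """Write cleansed rows to a flat file (any text file-like object)."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "name", "email"])
    writer.writerows(rows)


def run_demo():
    # Hypothetical in-memory database standing in for a real source system.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            (1, " Ada ", "ADA@Example.com "),
            (2, "Bob", None),
            (3, "Cy", "cy@example.com"),
        ],
    )
    out = io.StringIO()
    load_to_csv(cleanse(extract(conn)), out)
    conn.close()
    return out.getvalue()
```

Calling `run_demo()` returns the CSV text for the two rows that survive cleansing; the row with a missing e-mail is dropped.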
8. Pentaho 5.0 provides an open, unified platform to access, integrate and blend any data, in
any environment, across a full spectrum of analytics. A new, modern interface simplifies the
analytics experience for all users, turning data into a competitive advantage.
New Discoveries 5.0
10. Pentaho for Big Data is a data integration tool
based on Pentaho Data Integration. It allows
executing ETL jobs in and out of big data
environments such as Apache Hadoop or
Hadoop distributions such as Amazon,
Cloudera, EMC Greenplum, MapR, and
Hortonworks. It also supports NoSQL data
sources such as MongoDB and HBase.
Big Data Analytics
Within a single platform, the Pentaho solution
provides visual big data analytics tools to
extract, prepare and blend data, plus
visualizations and analytics. Regardless of the
data source, analytic requirement or
deployment environment, Pentaho allows
businesses to turn big data into big insights.
11. One of the biggest advantages of Pentaho is
that it integrates with multiple data sources
seamlessly. Pentaho Data Integration 4.4
Community Edition (referred to as CE hereafter)
supports 44 open source and proprietary
databases, flat files, spreadsheets, and more
out-of-the-box third-party software. Pentaho
introduced Adaptive Big Data Layer as part of
the Pentaho Data Integration engine to support
the evolution of the Big Data stores. This layer
accelerates access and integration to the latest
version and capabilities of the Big Data stores.
It natively supports third-party Hadoop distributions from MapR, Cloudera, Hortonworks, as
well as popular NoSQL databases such as Cassandra and MongoDB. These new Pentaho Big
Data initiatives bring greater adaptability, abstraction from change, and increased
competitive advantage to companies facing the never-ceasing evolution of the Big Data
ecosystem. Pentaho also supports analytic databases such as Greenplum and Vertica
(packtpub.com).
12. Pentaho’s flexible cloud-ready platform is purpose-built for embedding into and
integrating with your applications. Its powerful embedded analytics, combined
with flexible licensing and a strong partner program, ensure that users can
get to market quickly, drive new revenue streams, and delight their customers.
Pentaho Embedded Analytics
Embedded analytics is the use of reporting and
analytic capabilities in business applications. These
capabilities can be outside the application, but must
be easily accessible from inside the application,
without forcing users to switch between systems.
The integration of a Business Intelligence (BI)
platform with the application architecture will enable
users to choose where in the business process the
analytics should be embedded.
13. Mobile BI
Pentaho Mobile offers a complete business
analytics platform, including interactive
analysis, rich visualizations, executive
dashboards, and enterprise reporting on the
iPad. With Pentaho, you receive a consistent
business intelligence and analytics experience
across your desktop, laptop and iPad.
Pentaho Mobile, a new addition to the suite since
4.5-GA, is a user interface adapted for use
with the Apple iPad. It exposes all of the major
functionality of OLAP analysis and the running of
reports and dashboards, allowing greater
interaction on a small touch screen. Mobile
also adds features for bookmarking favorite
content for easy access and for
opening several pieces of content in tabs (etlapps.com).
14. Pentaho Business Analytics
Pentaho's modern, simplified and interactive approach empowers business
users to access, discover and blend all types and sizes of data. With a
spectrum of increasingly advanced analytics, from basic reports to predictive
modeling, users can analyze and visualize data across multiple dimensions,
all while minimizing dependence on IT. At the same time, a true designed-
for-mobile experience ensures users are productive no matter where they
are (pentaho.com).
16. Pentaho’s in-memory caching capability enables ad hoc analysis of millions of rows of data
in seconds. Pentaho’s pluggable, in-memory architecture is integrated with popular open
source caching platforms such as Infinispan and Memcached and is used by many of the
world’s most popular social, ecommerce and multi-media websites. In addition, Pentaho
allows in-memory aggregation of data – where granular data can be rolled-up to higher-
level summaries entirely in-memory, reducing the need to send new queries to the
database. This will result in even faster performance for more complex analytic queries.
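The idea of in-memory aggregation can be sketched as follows: granular rows are held in memory once, and higher-level summaries are computed and cached from them instead of being requested from the database again. This is a simplified illustration in Python, not Pentaho's caching architecture; the `RollupCache` class, the `sales` measure and the sample rows are all hypothetical.

```python
from collections import defaultdict


class RollupCache:
    """Hypothetical cache that answers summary queries from rows held in memory."""

    def __init__(self, rows):
        self.rows = list(rows)  # granular data, fetched once from the database
        self.cache = {}         # tuple of dimension names -> aggregated totals
        self.hits = 0           # number of queries answered from the cache

    def total_by(self, *dims):
        """Sum the 'sales' measure grouped by the given dimension names."""
        if dims in self.cache:
            self.hits += 1      # served from memory, nothing recomputed
            return self.cache[dims]
        totals = defaultdict(float)
        for row in self.rows:
            key = tuple(row[d] for d in dims)
            totals[key] += row["sales"]
        self.cache[dims] = dict(totals)
        return self.cache[dims]


# Hypothetical granular data at city level.
ROWS = [
    {"region": "East", "city": "Boston", "sales": 10.0},
    {"region": "East", "city": "New York", "sales": 5.0},
    {"region": "West", "city": "Los Angeles", "sales": 7.0},
]

cache = RollupCache(ROWS)
by_region = cache.total_by("region")        # rolled up in memory
by_region_again = cache.total_by("region")  # answered from the cache
```

Here the city-level rows are rolled up to region-level totals without a second trip to the data source, and the repeated query is answered from the cached summary.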
17. PRESENTATION CREATION
“Capability provides inputs to presentation
capability, the ability to use appropriate
reporting and balanced scorecards tools, and
thereby make BI more valuable to its users...”
18. Pentaho Dashboard Designer (PDD)
Pentaho Dashboard Designer (PDD) is a
commercial plug-in provided to enterprise
edition (EE) subscribers. It allows users to create
dashboards, which are collections of other
content components displayed together with
the goal of providing a centralized view of key
performance indicators (KPIs) and other
business data movements, letting users monitor
them and make decisions. Content components
are usually individual information graphics,
tables, OLAP views or reports. The plug-in
simplifies dashboard creation through the use
of layout templates, drag-and-drop interaction
and a GUI for providing parameters and inputs
to dashboard components.
19. Dashboard Designer has dynamic filter
controls, which enable dashboard
viewers to change a dashboard's details
by choosing different values from a
drop-down list, and to control the
content in one dashboard panel by
changing the options in another. This is
known as content linking. Dashboard
Designer includes the following content
types:
Charts: simple bar, line, area, pie, and dial charts created with Chart Designer
Data Tables: tabular data
URLs: Web sites that users want to display in a dashboard panel
Dashboard Designer
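Content linking, as described on this slide, amounts to one control pushing its selected value to the panels it is linked to. The sketch below models that idea in plain Python; it is not the Dashboard Designer API, and the `Panel` and `FilterControl` classes and the sample data are hypothetical.

```python
class Panel:
    """One dashboard panel showing a filtered view of its rows."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = rows
        self.visible = list(rows)  # what the viewer currently sees

    def apply_filter(self, field, value):
        """Show only the rows whose field matches the selected value."""
        self.visible = [r for r in self.rows if r.get(field) == value]


class FilterControl:
    """Drop-down style control that pushes its selection to linked panels."""

    def __init__(self, field, options):
        self.field = field
        self.options = list(options)
        self.linked = []

    def link(self, panel):
        self.linked.append(panel)

    def select(self, value):
        if value not in self.options:
            raise ValueError(f"unknown option: {value!r}")
        for panel in self.linked:
            panel.apply_filter(self.field, value)


# Hypothetical data shared by a chart panel and a table panel.
SALES = [
    {"region": "East", "product": "A", "sales": 10},
    {"region": "West", "product": "A", "sales": 7},
    {"region": "East", "product": "B", "sales": 5},
]

chart = Panel("Sales chart", SALES)
table = Panel("Sales table", SALES)
region_filter = FilterControl("region", ["East", "West"])
region_filter.link(chart)
region_filter.link(table)
region_filter.select("East")  # both linked panels now show East only
```

Choosing a different value in the control changes every linked panel at once, which is the behavior the slide calls content linking.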
21. Pentaho is a young, dynamic organization that offers modern,
integrated, embeddable business analytics and a data integration
platform built for the future of analytics (g2crowd.com).
Users rave that the client tools are intuitive and easy to use,
making Pentaho popular in Gartner and Forrester reports.
Pentaho ranked second highest in the Gartner Magic Quadrant for
its data access and integration.
According to Butler Analytics, Pentaho is broad enough to meet
most needs, and is best summarized as a ‘good all-rounder’ –
something that will be attractive to business managers who simply
want to get a job done.
22. References
Nikos Mastorakis, Valeria Mladenov and Vassiliki Kontargyri. "Proceedings of the European Computing
Conference." Heidelberg, Germany: Springer Science and Business Media, 2009. ISBN 978-0387848136. p. 789.
Retrieved July 11, 2012.
Patil, M. (2013, November 1). Pentaho for Big Data Analytics. Retrieved October 19, 2014, from
https://www.packtpub.com/big-data-and-business-intelligence/pentaho-big-data-analytics
Pashuk, A. (2014, October 15). Pentaho as embedded analytics solution. Retrieved October 19, 2014, from
https://xpansa.com/business-intelligence/pentaho-as-embedded-analytics-solution/
Pentaho Data Integration (Kettle) Tutorial. (n.d.). Retrieved October 20, 2014, from
http://wiki.pentaho.com/display/EAI/Pentaho Data Integration (Kettle) Tutorial
Pentaho InfoCenter. (n.d.). Retrieved October 19, 2014, from
http://infocenter.pentaho.com/help/index.jsp?topic=/puc_user_guide/content_displaying_data_in_dshbrds_intro.html
Performance and Scalability Overview. (n.d.). Retrieved October 19, 2014, from
http://www.pentaho.com/assets/pdf/4GINEpCGdB4e9E5OysTi.pdf
Russom, P. (2013). INTEGRATING HADOOP INTO BUSINESS INTELLIGENCE AND DATA WAREHOUSING. TDWI Best
Practices Report, 1-38. Retrieved October 18, 2014, from
http://www.pentaho.com/sites/default/files/uploads/resources/tdwi_best_practices_report_-_hadoop_for_bi_and_dw.pdf
Wikipedia. (n.d.). Retrieved October 20, 2014, from
http://en.wikipedia.org/wiki/Pentaho#Accolades_and_awards