1. Sybase IQ
Issue 1, 2012

IN THIS ISSUE
Introduction
SAP Sybase IQ - Turning Big Data into a Big Advantage
Gartner Research: Magic Quadrant for Data Warehouse Database Management Systems
About Sybase

Introduction
“Big Data” is the new hot topic for IT managers and is causing quite a panic in some organizations. There is no need to panic: Big Data can be looked upon as Big Opportunity. With the data explosion, companies now have access to more information than ever before, and if the data can be exploited properly it can lead to a big competitive advantage.
Companies are acquiring massive amounts of data in different forms from different sources, ranging from traditional channels with structured formats to social media channels with unstructured formats, and this has changed the focus of analytics in the “real world”. Throughout organizations there are changes in the way data is being analyzed. In marketing, the focus has shifted to digital channels (click streams and social media) to understand buying patterns and to target marketing activities for maximum impact. In sales, the focus is on what we call “deal DNA”: correlating emails, meeting notes and chatter to assess the probability that a sales deal will close. On the financial side, simulation is being used to predict margins and portfolio values, while on the operational side, machine data from sensors and other kinds of digital data are being analyzed to track down operational inefficiencies. It is no wonder that companies are suffering from information overload and are at a loss as to how to manage the information, let alone how to use it intelligently.
The key to Big Data is the ability to access and connect all the data, no matter what type it is or where it came from. To achieve this you have to break the information silos that trap data, turning massive amounts of data into actionable insight while providing complete access to decision makers, and so creating an environment that offers “intelligence for everyone”.
Featuring research from
3. Figure 1: SAP Sybase IQ - A complete and comprehensive big data analytics platform
Source: Sybase
query execution is only distributed to member nodes of the logical server, and member nodes can be dynamically added or dropped as necessary.

Specialized Tools & Techniques
Sybase IQ has partnered with a number of key advanced analytics partners to provide in-database analytics techniques. Using in-database analytics, enterprises and application vendors can answer complex questions without having to move mountains of data to 3rd party tools. With hundreds of statistical and data mining techniques, advanced text analytics capabilities, and APIs to execute proprietary algorithms safely inside Sybase IQ, companies can gain insights in unparalleled time.
For statistics and data mining, Sybase IQ supports the DBLytix library from Fuzzy Logix, containing hundreds of advanced analytic, statistical and data mining algorithms that can run inside Sybase IQ.
For text analytics, Sybase IQ provides comprehensive in-database text search capabilities. With Sybase IQ’s key analytics partnerships, both internal and external, such as SAP BusinessObjects, ISYS and KAPOW, hundreds of document formats and Web content can be ingested and/or extracted into Sybase IQ for analysis.
Sybase IQ provides a native MapReduce API that can leverage massively parallel processing across a PlexQ™ grid. Using MapReduce allows you to move beyond the limitations of SQL queries, enabling you to more easily execute alternative techniques such as network analysis, or to search large amounts of unstructured data that is not indexed.
In addition to a native MapReduce API, Sybase IQ offers four ways to integrate results from 3rd party Hadoop frameworks into Sybase IQ queries, giving a tiered approach to analyzing massive data sets. In essence, massive volumes of data can be searched from distributed file systems. The data returned from a Hadoop analysis can then be integrated into a Sybase IQ database in any of the four ways:
• ETL Processing, which bulk loads data from Hadoop data stores into Sybase IQ using the open source utility Sqoop from Sybase’s partner Cloudera.
• Data Federation, which exposes HDFS files as tables in a Sybase IQ database that participate in SQL
5. Gartner Research: Magic Quadrant for Data Warehouse Database Management Systems
The data warehouse DBMS market is undergoing a transformation with the introduction of “big data” and the logical data warehouse demand for new techniques in practices and technology. The integration of professional services with product offerings also increased in importance in 2011.

Market Definition/Description
This document was revised on 05 March 2012. The document you are viewing is the corrected version. For more information, see the Corrections page on gartner.com.
The supplier side of the data warehouse database management system (DBMS) market consists of those vendors supplying DBMS products for the database infrastructure of a data warehouse and the required operational management controls.
For the purposes of this Magic Quadrant analysis, a DBMS is defined as a complete software system that supports and manages a logical database or databases in storage. Data warehouse DBMSs are systems that, in addition to supporting the relational data model (extended to support new structures and data types such as materialized views, XML and metadata-enabled access to content), support data availability to independent front-end application software and include mechanisms to isolate workload requirements (see Note 2) and control various parameters of end-user access within a single instance of the data.
This market is specific to DBMSs used as a platform for a data warehouse. It is important to note that a DBMS cannot be used as a data warehouse – rather, a data warehouse (solution/data architecture) is deployed on a DBMS platform. A data warehouse solution architecture can, and often does, use many different data constructs and repositories. Importantly, the definition of this market is changing and a DBMS will become only part of the overall market definition as the logical data warehouse (LDW) continues to grow in acceptance and deployment.
A data warehouse is a database in which two or more disparate data sources can be brought together in an integrated, time-variant information management strategy. Its logical design includes the flexibility to introduce additional disparate data without significant modification of any existing entity design. A data warehouse DBMS is now expected to coordinate virtualization strategies, as well as distributed and/or processing approaches such as MapReduce, to handle one aspect of big or extreme data situations.
A data warehouse can be of any size. The sizing definitions of traditional warehouses remain as:
• Small data warehouses are less than 5 TB.
• Midsize data warehouses are 5 TB to 20 TB.
• Large data warehouses are greater than 20 TB.
Importantly, none of these categories qualify a warehouse as a “big data” warehouse. Volume alone is not “Big data.” For the purpose of measuring the size of a data warehouse database, we define data as source-system-extracted data (SSED), excluding all data warehouse design-specific structures (such as indexes, cubes, stars and summary tables). SSED is the actual row/byte count of data extracted from all sources.
From 2012 onwards, defining the size of a warehouse will become less important and information asset access will become more important. Within SSED it is important to separate the actual data size in a data warehouse from the database total size. Gartner clients report that many 100-terabyte warehouses often hold less than 30 terabytes of actual data. Throughout 2012 and 2013, the size of a warehouse will evolve toward a combined metric, relative to the repositories under direct management of the warehouse and complemented by the volume of available information accessed by the warehouse, as well as its performance in doing so (see Note 3).
In addition, for the purposes of this analysis, we treat all of a vendor’s products as a set. If a vendor markets more than one DBMS that can be used as a data warehouse DBMS, we note this fact in the section related to the specific vendor, but evaluate its products together as a single entity. Further, a DBMS product must be part of a vendor’s product set for the majority of the calendar year in question. If a product or vendor is acquired mid-year, it will be labeled appropriately but placed separately on the Magic Quadrant until the following year (see Figure 1).
There are many different delivery models, such as stand-alone DBMS software, certified configurations, data warehouse appliances (see Note 1) and cloud (public and private) offerings. These are also evaluated together within the analysis of each vendor.
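The traditional sizing tiers and the SSED measure described in this section can be illustrated with a minimal Python sketch. The function names `classify_warehouse` and `ssed_share` are hypothetical, and the treatment of the exact 5 TB and 20 TB boundaries (both counted as midsize) is an assumption, since the text only gives the ranges.

```python
def classify_warehouse(ssed_tb: float) -> str:
    """Return the traditional size tier for a warehouse.

    ssed_tb is the source-system-extracted data (SSED) volume in terabytes,
    i.e. excluding indexes, cubes, stars and summary tables.
    Assumption: exactly 5 TB and exactly 20 TB both count as "midsize".
    """
    if ssed_tb < 0:
        raise ValueError("SSED volume cannot be negative")
    if ssed_tb < 5:
        return "small"
    if ssed_tb <= 20:
        return "midsize"
    return "large"


def ssed_share(ssed_tb: float, total_db_tb: float) -> float:
    """Fraction of the total database size that is actual extracted data."""
    if total_db_tb <= 0:
        raise ValueError("total database size must be positive")
    return ssed_tb / total_db_tb


if __name__ == "__main__":
    # A database occupying 100 TB on disk but holding 18 TB of SSED
    # is sized by its SSED, not by its total footprint.
    print(classify_warehouse(18), ssed_share(18, 100))
```

Measuring by SSED rather than by total footprint matters here: a database that occupies 100 TB but holds only 18 TB of extracted data would fall into the midsize tier under this sketch, not the large one.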
6. Figure 1. Magic Quadrant for Data Warehouse Database Management Systems
[Figure: quadrant chart. Vertical axis: ability to execute. Horizontal axis: completeness of vision. Quadrants: challengers, leaders, niche players, visionaries. Vendors plotted: Teradata, Oracle, IBM, EMC/Greenplum, Sybase (an SAP Company), 1010data, Microsoft, ParAccel, Vertica, Kognitio, SAND Technology, Infobright, Actian, Exasol. As of February 2012.]
Source: Gartner (February 2012)

Magic Quadrant
Vendor Strengths and Cautions

1010data
1010data (www.1010data.com) was established 11 years ago as a managed service data warehouse provider with an integrated DBMS and business intelligence (BI) solution primarily for the financial sector and more recently, the retail/consumer packaged goods (CPG) sector. 1010data can host its solution using traditional software as a service (SaaS) model or support a managed solution at the customer’s site. 1010data has approximately 200 customers.

Strengths
• Since 1010data offers a complete SaaS solution, the customer’s business unit and IT organization need little experience of data warehousing or BI. The SaaS model also allows multiple organizations to share large amounts of data without needing to manage it locally – for example, large quantities of CPG data can be shared by multiple retail companies.
As a managed service solution vendor, 1010data can complement the customer’s internal IT department with fast-to-market solutions for business units, so reducing resource consumption within the IT department. More importantly, the managed service model enables 1010data to leverage software solutions across multiple customers. As new applications are created, they become available to all clients, increasing the availability of these applications to businesses.
With more than 200 customers, 1010data has reached a position to break out of its former niche status. The problem is that the company is either a visionary with cloud and data warehouse as a service, but does not execute against the rest of the market, or it is good at execution against two of the many use cases in the market with little vision for the remainder.
The 1010data position is almost perpendicular to our combined evaluation criteria. Therefore, we have placed it with high execution against a sub-section of the market we evaluate. From a visionary perspective, 1010data is difficult to evaluate under current criteria. Its approach in using a cloud-based and “as a service” DBMS/analytics solution is the primary business model and technology approach. Cloud-based analytics as a service and the ability to deliver under a managed on-premises model, leaves 1010data short of the much broader vision desired by the greatest portion of the data warehouse market, but in these few delivery segments of the market 1010data is a formidable performance competitor.
• 1010data is expected to add probabilistic matching in 2012. The company has exhibited significantly more reduced load times than some of its significant big data competitors, as well as orders of magnitude and faster performance in extremely large datasets. 1010data products read SQL, but also utilize their own, non-SQL language that performs high-speed joins with unplanned data rationalization built into the queries without the performance disadvantages of using interim return datasets.
• Perhaps the most important point raised by those customers referenced is that 1010data is utilized by both IT and the business with fast response times on queries running against hundreds of billions of row tables (with a combined number of rows throughout
7. databases exceeding a trillion rows in the entire database in some instances). The company also serves as a data aggregator and data marketplace providing datasets for rapid enhancement and enrichment of analytics normally bound to internal datasets only.
Our reference checks and discussions with Gartner clients also show that 1010data is price-competitive with non-SaaS alternatives, especially by reducing the management overheads needed to support a data warehouse environment. 1010data has expanded from the financial sector (where it began) into a broader market, including the retail sector. 1010data now claims more than 200 customers and its customer references support our belief that it is one of the stronger small data warehouse DBMS vendors. In addition, the company has a small number of customers that install its system on-premises as a managed solution, with several using 1010data as an enterprise data warehouse solution vendor. Therefore, from an execution standpoint, 1010data matches performance, pricing and delivery model for two specific needs in the market quite well and it is expanding both its scope of delivery and its vertical customer base.

Cautions
• The market continues to resist fully-managed data warehouse services in many verticals and horizontal use cases. 1010data is susceptible to resistance from IT departments requiring all its data warehouses to be located in-house, along with in-house governance of the organization’s data assets. The IT market is not fickle and persists in its use of better name-branded vendors and not simply because they are name-branded.
As the demand for hybrid analytics mixing structured data with content increases, 1010data will need to introduce unstructured data analysis as well as operational technology or machine-generated data analysis. 1010data’s competitors have greater financial resources and already are in the process of building out this part of the data warehouse vision.
• One of 1010data’s strengths also acts as a caution. While the business prefers a solution that is a complete, deployment-ready stack, IT departments and purchasing offices do not. 1010data’s offering is sold as a fully integrated DBMS and BI solution, which limits potential customers to those wanting a full solution (primarily because of 1010data’s pricing model). 1010data’s product is a compliant, relational DBMS (RDBMS) that customers can use as a stand-alone system if desired – but fees are charged as if the entire solution is managed. Customers are advised to check the total cost of ownership in such cases, as it may not be advantageous to use 1010data in this way.
• As a solution vendor, 1010data has a different competitive model from vendors of pure-play DBMS offerings. In addition to competing in the data warehouse DBMS market, it competes with system integration vendors that offer outsourced solutions, such as Cognizant and HP (via EDS). Additionally, IBM, Oracle and other large vendors with professional service organizations compete with 1010data in two markets, data warehouse DBMSs and services. It remains to be seen if this is a bias to be overcome or if the cloud and on-premises mix will ultimately exclude a vendor like 1010data. However, based on its extremely positive customer references, it is very unlikely 1010data will be excluded from such a mix.

Actian
Actian (www.actian.com) offers two products, the general-purpose Ingres DBMS and Vectorwise, a new offering introduced in June 2010 and targeted at analytic data warehouses. Open-source Ingres, one of the original RDBMS engines, has a 30-year history and claims more than 10,000 customers running mission-critical applications, including data warehouses.

Strengths
• The Actian database contains most of the features necessary for data warehousing, such as partitioning, compression, parallel querying and multidimensional structures. Release 10 added bulk load, scalar subqueries, long identifiers and a geospatial offering that was community driven with hundreds of committers contributing code. The performance of Vectorwise, especially in analytic applications, was cited by customers interviewed by Gartner. With the emergence of new server platforms with storage-class memory (of 1 TB and more), Vectorwise will prove a valuable asset for data warehousing and analytics as more of the data warehouse moves to memory.
• Actian has aggressively pursued partners, including independent software vendors (ISVs) in the BI market, the primary driver of new installations in data warehousing. Both new and existing customers are looking for an open-source BI stack with partners such as Jaspersoft and commercial BI vendors such as MicroStrategy have also engaged with Actian. Ingres and Vectorwise are gaining attention from vertical application vendors, system integrators and resellers. Vectorwise uses some Ingres software atop a column store from the MonetDB project and uses hardware assists, turning columns into vectors and processing them in x86 chip registers to leverage
8. instruction parallelism and on-chip caching. Vectorwise has delivered several top non-clustered TPC-H benchmark results at 1 TB and below. The company was renamed in late 2011 and introduced another new product offering, the Cloud Action Platform, to support the delivery of “Action Apps” that will act on the analytic capabilities Actian supports.
• Previous reference checks have shown Ingres customers to be very loyal. Most have online transaction processing (OLTP) applications, but Ingres has also been used for smaller data warehouses (historically up to about 2 TB, the company is targeting warehouses smaller than 10 TB). Among open-source DBMS, only Oracle’s MySQL compares with proven maturity for mission-critical applications, including data warehousing. Vectorwise has begun to gain new customers and software partners, targeting another set of use cases. Now in its version 2.0, it has added Windows as a platform and has a clear road map for several future releases.

Cautions
• Although Vectorwise enhances Actian’s ability to support analytic data marts, the company must continue to address enhanced data warehouse functionality, storage management and mixed workload management if it is to compete with larger, equally mature vendors and meet the needs of the broader data warehouse DBMS market.
Vectorwise needs to support more analytic SQL constructs than it does now and add stored procedures and user-defined functions and data types to move closer to competitors. Its new product and restructuring around Action Apps can be synergistic – but could also prove distracting.
• Actian offers professional services in data warehousing and has a go-to-market strategy with a growing stable of partners – it claims half of its 2011 Vectorwise sales have come through channels. However, it lacks data models and must continue to add marketing and sales expertise for data warehousing. Additionally, Actian has strength in open-source, but the overall adoption of open-source for data warehousing remains weak. While Actian has professional services, it tends to lack some of the tools and methodology support that other organizations have readily available.
• Actian’s new brand and name, as well as its portfolio expansions, can help overcome Ingres’s reputation as an older product that has not regained much market traction. Importantly, Actian has taken a bold stance in attempting to re-establish itself with a new vision and new plans for execution. Initial response to Vectorwise is significant with the addition of more than 20 customers in its first year offering and users should consider Actian’s Vectorwise to be a new and innovative solution in that respect. However, market perception is difficult to change. Both offerings have gained new customers and third-party relationships, but to become a serious competitor in this market Actian must continue to show increased growth in both revenue and numbers of new customers at a higher rate than it has thus far. Effective marketing execution is a must-have for Actian to compete.

EMC/Greenplum
Greenplum (www.greenplum.com) is part of the Data Products division of EMC with a massively parallel processing (MPP) data warehouse DBMS running on Linux and Unix. It can be sold as an appliance or as a stand-alone DBMS and has more than 400 customers worldwide.

Strengths
• Greenplum’s understanding and vision of the data warehouse market was ahead of the market as it was one of the first to work with MapReduce, manage external files from within the DBMS and optimize for very large database sizes. As big data is now important in the market and the LDW is emerging as a necessary functionality to support today’s mix of volume, velocity, variety and complexity, Greenplum has a base to support this that was launched several years ago, which translates into the high ability to execute.
Greenplum announced the first unified analytics appliance addressing big data (a modular solution for structured and unstructured data), in May 2011 that was released in September 2011. The EMC Greenplum Data Computing Appliance (DCA) uses the Greenplum Database, Greenplum HD (Hadoop), and Greenplum Data Integration Accelerator (DIA) modules that can be configured within one single appliance cluster. In addition, Greenplum has Chorus, its analytics productivity software, leveraging VMware’s technology, to support automated, self-service data services and collaborative analytics.
In a recent announcement, EMC announced the first Hadoop NAS attached HDFS system – HDFS running native on EMC Isilon connected to the Greenplum HD or Greenplum Data Computing Appliance (DCA). Finally, through the external file mechanisms and user defined functions (UDF), Greenplum has started along the path to support LDW. Greenplum even supports an iOS, Linux and Windows single-user development system downloadable as free (not open-source) software.
• As Greenplum has settled into the EMC organization, we have
9. seen an increase in hiring directly related to development. This, coupled with the EMC development organization has led Greenplum to offer its DCA supporting big data for both structured and unstructured data and integrated MapReduce processing. The DCA is now assembled by EMC and sold by its sales force. In an interesting manufacturing cost management model, EMC is assembling its appliances in different countries around the world, affording EMC Greenplum a tax advantage in many countries where others (such as Oracle and Teradata) are subject to stiff import duties. This positions the company for easier entry into global markets. Due to the acquisition, Greenplum has been able to work more closely with VMware, for example rearchitecting the Chorus private cloud offering.
• Our customer references support the claims of high performance as well as advantageous price/performance ratios. These references also support the Greenplum claim of scalability to very large database sizes. Reported sizes range from 10 terabytes to more than 500 terabytes. When this combination of performance and scalability are joined to an appliance, the potential of EMC/Greenplum to compete in the data warehouse market is increased.

Cautions
• Although acquired by EMC 18 months ago and despite doubling the install-base, Greenplum’s market position is sixth or seventh worldwide. To really increase velocity and gain market share, Greenplum must continue to develop the EMC sales force so that it has the necessary skills in the DBMS software market. Greenplum must also continue to leverage the EMC worldwide presence to compete with all the incumbent, large DBMS vendors. Importantly, EMC’s customer base is primarily within the IT unit of the organization. Data warehousing is the technical infrastructure for an intensely business-oriented use-case. EMC will need to learn from its Greenplum acquired knowledgebase, specifically how to solution sell a data warehouse and analytics solution.
• Interestingly, this year our customer references have raised several issues around support. In these cases it was not related to the attention to rapid support and fixes (with all customers stating fixes were available in an expected, timely manner), but more with the bugs in the first place. We would classify these as “growing pains” especially for a small organization (as Greenplum was pre-acquisition) being integrated into a large organization such as EMC. We should also note that in our inquiries with Gartner clients, we have seen this issue diminish, coupled with consistently high marks for personalized customer support.
• As Greenplum leverages EMC more, it will find itself competing at a higher level with the mature, incumbent vendors. The major vendors (such as IBM, Oracle, SAP and Teradata), have a much larger customer base allowing them, as the incumbent, a stronger position. EMC/Greenplum must continue to demonstrate differentiation as it addresses the data warehouse market and big data is one specific area, as is cloud. The company must continue to support customers accustomed to the type of service provided by a small company with focused, customer-specific professional services solutions, issue-focused support and leveraging key customer inputs for product enhancements.

Exasol
Exasol (www.exasol.com) is a small DBMS vendor in Nuremberg, Germany. Exasol has been in business since 2000 with the first in-memory column-store DBMS, EXASolution, available since 2004 and primarily used as a data mart for analytic applications.

Strengths
• Exasol offers an in-memory column-store DBMS for data warehousing. As we have stated, this technology is one of the critical capabilities of the future for the data warehouse DBMS market. Exasol runs in a clustered environment offering scalability across multiple servers. Not only does this allow for high-availability in the case of a server failure using EXACluster OS, but also scaling for larger memory sizes. EXASolution maintains redundant copies of the data in memory to reduce the downtime associated with server failures.
Exasol also includes the use of disk for persistence and overflow (if all the data does not fit in memory). However, when data is loaded into Exasol, it is loaded into memory first and then written to the disk, allowing for the applications to begin before the slower activity of disk input/output (I/O) is completed. This separation of the data access and data persistence model is a visionary change for the market. Additionally, as a column-store, Exasol has excellent data compression (reported to be on average, four times faster), thus reducing the amount of memory necessary. EXASolution is sold by the amount of memory used for the data.
• Another advantage of Exasol, as with other in-memory DBMSs, is the high speed of the DBMS. In published benchmarks, Exasol has attained data warehouse transaction speeds up to 20 times the closest
10. competitor. Server memory Exasol lacked a marketing vision vendors such as Quest are less
is expensive, but these same to grow beyond the borders of its likely to support the DBMS,
benchmarks demonstrated costs European base. The company began requiring Exasol to create their own
of approximately one-third of the an expansion plan in 2011 and management software.
standard DBMS. Our reference will begin to grow offices in other
checks also validate the claims of locations, including North America.
IBM
cost reduction and speed. Another • Another issue is the increasing IBM (www.ibm.com) offers stand-
strength of the in-memory nature competition, both in column-store alone DBMS solutions as well as data
of Exasol is removing the necessity and in-memory. Exasol has a clear warehouse appliances, currently marketed
of optimization and calculation advantage being the first with an as the IBM Smart Analytics System family
structures within the database. in-memory column-store DBMS. (ISAS) and the Netezza brand. IBM’s
There is no need to build Now, most of the DBMS vendors data warehouse software, InfoSphere
summaries, aggregates and cubes offer some form of column-store Warehouse, is available on Unix, Linux,
for use in business intelligence capabilities. Further, when Exasol Windows and z/OS. IBM has also
and analytics. This reduces the began, there were only a handful of continued research and development and
overhead in the DBMS by as much in-memory DBMS, mostly used for market execution for the Netezza brand
as 10 times, as well as reducing streaming data applications. There and product line following its acquisition.
the database administrator (DBA) are now many in-memory DBMSs IBM has thousands of database customers
resources used to maintain such available in both the column and worldwide and more than 500 appliance
structures. In addition, this also row-store variety. Finally, SAP has customers (Netezza and ISAS combined).
leads to very fast load times, released its SAP HANA appliance
as there are no complicated with an in-memory column-store Strengths
structures to build during loading. DBMS for an analytics data mart
• The breadth of IBM technology
• Customer references clearly and now available under the SAP
offerings is complementary to
espouse the abilities of NetWeaver Business Warehouse.
and part of its solution delivery
EXASolution for both pure As with many technologies,
capability. InfoSphere Warehouse,
performance and cost/performance. The references (although few in number) also state that customer support is excellent. Finally, references corroborate the results of the benchmarks mentioned here, with better than 20 times performance at half to a third of the cost. They also support the claims of 4 times (or more) compression.
Cautions
• The primary challenge Exasol faces is the small size of the company and previous lack of expansion beyond Germany. Exasol was primarily engaged in product development for its first five years of operations and with changes in management two years ago has now obtained the vast majority of its 30 or more customer base in the past two years. These customers are mostly located in Germany, with several in Italy and Japan. Until very recently,
being first is not sufficient unless capitalized in growth of market share. Exasol has missed the window of opportunity of being first and now faces increased competition.
• Customer references report that there is one major issue with the use of EXASolution – the lack of interfaces to common BI tools. Exasol offers the standard ODBC and JDBC interfaces, but this can be a performance drawback with tools such as BusinessObjects, Cognos and SAS. As Exasol has a small installed base, it is difficult to engage the tools vendors to assist in creating native interfaces to the DBMS. We do expect to see this remedied over the next few years as the size of the installed base grows. Similarly, there is a reported lack of software to manage the Exasol environment (EXASolution). Again, with a small installed base, third-party management software
a data warehouse offering based on IBM DB2, is a software-only solution. IBM’s data warehouse appliance solution, the IBM Smart Analytics System (ISAS) is a combined server and storage hardware solution (using the IBM Power Systems server with AIX, the System x server with Linux or Windows and the IBM InfoSphere Warehouse and a robust System z ISAS data warehouse solution), complete with service and support.
IBM’s introduction of InfoSphere BigInsights includes offerings to aid the design, installation, integration and monitoring of the use of Hadoop technologies within an IBM-supported environment. In IBM’s case, it is important to note that it has embraced the vision for the LDW – which Gartner describes as the emerging new best practices in analytics management. By tying together relational data, data streams and Hadoop files,
IBM’s stack builds confidence among managers of existing warehouse implementations that the product is evolving as new demands for these two components of the logical data warehouse emerge.
Additionally, for Smart Consolidation – rather than developing tooling in isolation, IBM focused on tooling that existed in its Information Integration portfolio (InfoSphere BluePrint Director). This resulted in improvements in the area of integration, including but not limited to the common Data Warehouse Packs and Models now supported on DB2 and Netezza platforms alike.
• IBM combines product sales with solution services. This market demands a widely varied level of sophistication and knowledge depending on each client organization’s maturity in analytics and information management. As noted in the overview, the data warehouse market in 2011 has multiple visions for the future. IBM has embraced the logical data warehouse (via “Smart Consolidation”) approach while continuing to advance its technology solutions and implementation practices supporting traditional data warehousing architectures.
Professional services available from IBM range from expert education through turnkey solutions to managed services for data warehousing. Importantly, where IBM leverages its services organization most is in feeding field experiences into the overall data warehouse vision. In 2010, clients reported that IBM’s support appears disconnected from its product strategy – this improved in 2011 with an even larger reference base reporting. This does not mean the issue has been resolved, but it appears that IBM’s focus on solution services is paying off (for example, IBM specifically assigns technical account managers to support accounts). Additionally, IBM’s focus on prospect qualification resulted in a higher growth in 2011 vs. 2009 to 2010 for all of its products.
• The overall effect is that referenced customers are confident regarding release dates and the road map. Customers list concurrency, scalability, performance optimization and support as positives; these were the most often repeated phrases in the reference survey in 2011. References elaborated by indicating that partitioning, compression and reduced administrative hours all contribute to their experience to support optimized performance. At the same time, some references reported that optimization of queries should be targeted rather than being forced to optimize every single query because the system is able to engage a solid query plan for execution. This evaluation considers the LDW concept to be innovative, but has yet to see a wider embrace in the market. IBM’s early adoption of the LDW concept in both its messaging and its product road map has established this vendor as an early resource for the market. However, the majority of the market for data warehousing will remain significantly focused on traditional solutions for a minimum of the next three years.
Cautions
• IBM has embraced the logical data warehouse vision as the likely successor to current best practices in traditional data warehousing. The market has not yet determined if it is ready to adopt this approach as the new vision for the data warehouse and abandon 20 years of traditional best practices.
IBM’s professional services have experience in delivering various aspects of the LDW under its own methodology and highlights that the traditional enterprise data warehouse [EDW] is vital to all data warehouse strategies including as a base component for the LDW. This was IBM’s first incarnation of the LDW approach. The market is acknowledging that the EDW does not have to be the center of the strategy but will be significant. However, the justification for the LDW and evolving existing warehouses or replacing them will be difficult at first because it appears to supporters of traditional data warehouses to be a radical departure from their beloved traditional data warehouse practices. Gartner’s own research indicates that the LDW approach is quickly emerging as the newest data warehouse best practice. Gartner anticipates the LDW will become a best practices approach during 2013-2015. With market leadership there is risk commensurate with the anticipated rewards. IBM will need to continue its careful education message regarding its leadership approach in LDW practices. When engaging in an LDW approach with IBM, clients should ensure they completely understand IBM’s positioning for implementing this solution.
• Gartner inquiries indicate that IBM data warehouse solutions are also marketed and delivered in isolation from each other. There are strategic reasons to continue such an approach with any acquisition, but Netezza products tend to have their own niche in customers’ minds that is viewed as being separate and distinct from IBM (but Netezza’s growth was more than 30% in 2011, which is faster than its previous growth rate as an independent company).
As a result, IBM customers often engage only part of the organization for solutions and at least in the
customers’ minds, eliminate the others. This creates both marketing and sales process challenges. This is not an issue with shortlisted solutions (IBM should recommend one solution or another), but does carry over into the solution delivery team and IBM is missing some opportunities for the different parts of the sales organization to leverage each other. IBM has implemented organizational changes intended to address these issues.
Netezza and IBM personnel do interact and coordinate with each other behind the scenes. A marketing solution would simply begin branding software and hardware combinations for limited purposes. However, IBM will choose the more difficult (and more appropriate) solution of creating an educational sales and implementation process which will demonstrate how software and hardware capabilities can be leveraged effectively to support each use case.
• IBM customers report (via inquiry and reference survey results) a scattering of intermittent and irregular issues with product performance or their implementation experience. Some of these are possibly attributed to the implementation process and not the products. However, these same customers report that IBM support addresses these issues with efficiency. Nonetheless, as with any IT products, an assumption that appliances or certified configurations alleviate all issues is incorrect. Most issues are irregular in nature and IBM support is intimately involved in the resolution process.
Infobright
Infobright (www.infobright.com) has offices in Canada, Europe and the U.S. and offers a combination of a column-vectored DBMS and a fully compressed DBMS. The company provides both an open-source version (Infobright Community Edition [ICE]) and a commercial version (Infobright Enterprise Edition [IEE]). Infobright has approximately 200 customers worldwide.
Strengths
• Infobright remains one of the only column-store DBMSs in the open-source software environment. Its revenue is generated from the Enterprise Edition (using a commercial license, rather than a General Public License [GPL]) with a subscription support model based on the amount of SSED stored in the system. As we stated in 2011, Infobright decided in mid-2010 to focus on operational technology data (which it calls machine-generated data). This encompasses data from sources such as smart meter data (in the utilities space), customer data records (in the telco space) and clickstream data from Internet interactions.
This focus has helped Infobright during 2011, when its customer base grew to more than 200 direct and OEM channel customers. Not only has this focus increased customers, but it has also attracted a number of additional OEMs (now accounting for approximately 40% of customers). This, along with partnerships with Pentaho, Jaspersoft, Talend and others, will help the company grow substantially faster than direct sales only.
• Infobright has several unique technologies in the DBMS. In addition to the column-store file system for MySQL, the Knowledge Grid in-memory metadata store is a major differentiator for Infobright, as this product analyzes queries to minimize the number of “data packs” that have to be decompressed to give a result (data packs are the compressed domains/regions of data in Infobright’s offering).
Infobright also released an option for the Enterprise Edition called the Distributed Load Processor (DLP) which allows for the parallel loading of data into the system at very high speeds. Infobright has also added connectivity to Hadoop MapReduce for the processing of “Big data.” This is extremely important to the machine-generated data world as much of this data is stored in Hadoop or other such file systems and needs to be extracted into a DBMS for processing.
• Our customer references are clear on several points. Infobright is extremely fast compared to other systems, including MySQL. Performance increases averaging up to 500% over MySQL deployments have been reported. We believe this is not only from the column-store design, but also the Knowledge Grid. References suggest that Infobright is replacing an existing MySQL environment with great gains in stability, compression and performance. Some cases report a year or more without an outage.
Finally, many references state that simplicity is a factor in their choice to use Infobright. We also believe this will interest OEMs that want to build Infobright into their existing systems for resale. The simplicity of management, scalability and compression all interest the OEM looking for a DBMS to embed that requires little support on their part. The focus on machine-generated data has been important to Infobright, but we believe that the future will greatly depend on the company’s ability to leverage these OEM partners.
Cautions
• One of the biggest challenges for a small vendor is to focus on what they do well. Infobright has done this with machine-generated data.
However, as a small, relatively young vendor, Infobright must continue to differentiate its offerings and open-source model from mature column-store DBMSs. Sometimes, these two statements are contradictory, not least because the focus on machine-generated data cannot be an excuse for ignoring its existing customers addressing other data management use cases, reported in several customer references as an issue. An example is workload management software, where the managed workloads are basically for machine-generated data and may lack the robustness needed for management of overall workload.
• There are other issues raised by our reference checks. As with most small startup vendors, stability from one release to another can suffer. Customer references reveal that there have been issues with new releases, but they are quick to point out that the problems are quickly resolved. The lack of management software (also an issue for smaller vendors) was raised. Third-party software vendors are not quick to pick up new, young software companies, as the potential market is small, so this puts more pressure on Infobright to produce its own management software.
• Finally, Infobright is open-source and makes use of portions of MySQL, under a Commercial OEM License with Oracle. We always question the open-source model for revenue generation. First, Infobright has a community version with less functionality than the Enterprise Edition. This has proven useful as a trial system to attract new customers, but some may opt for the ICE version in lieu of the Enterprise Edition.
The other issue is specifically the use of MySQL, as it is owned by Oracle. This implies risks remain due to the uncertain future of MySQL. To date, Oracle has not done anything other than enhance the product. However, in the future when the contract is done with the EU, we cannot guarantee that Oracle will not change the agreements, especially those with OEMs. This is an issue customers of Infobright should monitor in the future.
Kognitio
Kognitio (www.kognitio.com) started by offering data warehouse appliances and warehousing as a hosted service. Today, it has a mixture of fewer than 50 customers using its DBMS (WX2) separately as an appliance, a data warehouse DBMS engine, or data warehousing as a managed service (hosted on hardware located at Kognitio’s sites or those of its partners).
Strengths
• Kognitio pioneered the data warehousing database as a service (dbSaaS) model, where a data warehouse DBMS is delivered as a managed service from the DBMS vendor. Clients buy data warehousing services from Kognitio, while Kognitio hosts the database. Data warehousing dbSaaS permits clients to expand their warehouses incrementally and clients note that this model provides for low upfront costs with virtually no capital expenditure required to get started. This is a growing segment of the data warehouse DBMS market. Kognitio also works with deployment partners such as Capgemini (and contributes to Capgemini’s Immediate cloud computing offering).
Additionally, in line with existing market demands, Kognitio has an appliance to install on-site for customers requiring their own infrastructures. Kognitio opened offices in the U.S. three years ago in addition to its U.K. headquarters and has continued to expand its presence in the U.S. by hiring additional resources. This has started to produce results, with several new customers. Kognitio has also added several hosting partners in the U.S. and the U.K. offering managed services on WX2. Its sales model as dbSaaS makes up almost half of its revenue and has supported much of the company’s growth this year.
• Kognitio continues to invest in in-memory capabilities. Gartner considers that in-memory DBMSs can play a major role in enterprises’ information infrastructure and as such Kognitio’s technology has an opportunity to meet customer demand, given the maturity of its offering, compared to other more recent offerings. Kognitio’s DBMS, WX2 version 7, already includes in-memory analytics, and customer references continue to report that the speed of query and load performance is excellent. In 2011, Kognitio added Pablo in-memory online analytical processing (OLAP) capabilities to further strengthen its analytical capabilities. The DBMS is already an in-memory DBMS, with hot data held in-memory and cold data on disk, managed automatically by the DBMS.
• Those customers referenced reported significant concurrency capabilities, as well as excellent support and product management.
Kognitio is gaining visibility thanks to the current market interest in in-memory technologies. Kognitio’s customers report that deployment of large-scale data warehouse efforts takes as little as 10 weeks using this model. References also report predictable, linear scaling of performance and under the “as a service” model, customers report scale up and scale down needs as part of a solid account management approach. Finally and possibly most importantly, references indicate that new queries and new variations on existing analytics can be deployed rapidly.
Cautions
• Kognitio has a very substantial opportunity in the small or midsize business data warehouse and BI market thanks to its dbSaaS model. However, over the past year, managed services offerings from IBM and HP/Vertica have experienced growing acceptance and penetration in the market. These offerings are not direct competitors to Kognitio’s solution, but the customer base views them as an equal alternative from more established vendors.
Kognitio has not yet addressed some of the very large volume or variety of data support issues – more specifically support for content and complexity aspects of extreme information. However, Kognitio’s in-memory analytical capabilities can be of value in low latency, high volume analytics.
The market shifted dramatically during 2011 toward a new position. Kognitio did not stand still, but market demand regarding new functionality expanded more rapidly than Kognitio’s product feature sets. This appears to only be a temporary condition while Kognitio addresses these new expectations.
• While Kognitio continues to grow its installed base (with an additional seven clients in 2011) the company remains a small vendor with fewer than 50 customers worldwide. This makes it increasingly difficult to sell to organizations that have incumbent vendors, and to compete with some of the lower-priced appliance offerings. Additionally, as a data warehouse outsourcing solution, organizations should be aware that they are still responsible for contracting and auditing data security procedures.
• Clients report interoperability with third-party popular BI tools, such as those of IBM (Cognos) and SAP (BusinessObjects), is difficult to manage. This problem is compounded by Kognitio’s small market penetration and the resulting scarcity of tool expertise in the market. References also report that the absence of any form of developers’ forum or marketplace, scarcity of skills in the market and an extremely lean global presence make commitment to the product and consistent delivery difficult.
Microsoft
Microsoft (www.microsoft.com) continues to market its SQL Server 2008 DBMS (Release 2) Business Data Warehouse and Fast Track Data Warehouse for data warehousing customers not requiring an MPP DBMS. Microsoft released its own MPP data warehouse appliance, the SQL Server 2008 R2 Parallel Data Warehouse (Microsoft) (PDW), in November 2010.
Strengths
• Microsoft spent 2011 revitalizing its vision for the data warehouse market. Additionally, it announced two Apache/Hadoop connectors for SQL Server, SMP and Parallel Data Warehouse (PDW) in support of the market’s big data issues. Many would be surprised to learn that Microsoft already provided combined structured and unstructured analysis in SQL Server 2008/R2. A third quarter appliance update included support and enhancements for integration with SAP/Business Objects, MicroStrategy and Informatica.
In addition, Microsoft offers the SQL Server Fast Track Data Warehouse, which includes validated reference architectures for building a balanced data warehouse infrastructure. This road map contributes significantly to the company’s vision for the market and its customers. Microsoft can also leverage SharePoint and PowerPivot and the ability to include an unstructured information type in analytics is the result of its technology blend and this is a strength that should definitely not be ignored.
• References report that Microsoft exhibits one of the best value propositions on the market with a low cost and a highly favorable price/performance ratio. Skills are widely available in the marketplace to operate a Microsoft data warehouse and there is an easy learning curve to acquire those same skills, as needed. As an added bonus, customers report that the integration and continuity of a complete Microsoft data warehouse and business intelligence stack is highly advantageous to time-to-value in delivery. Noticeably absent are any fears regarding vendor lock-in.
According to our reference checks and discussions with our clients, worldwide support from Microsoft is extensive, encompassing partners, value-added re-sellers, vendors of third-party software and tools and widely available SQL Server skills.
• Microsoft references indicate a dominant presence in midsize data warehouses, especially those end-user organizations reporting that their companies and their data management needs are growing. According to customer references, Microsoft assures its customers of a solid data warehouse platform including features and functions that run the gamut of traditional warehouse functionality.
For connectivity in a multi-vendor environment Microsoft offers a SAP/BW, Teradata and Oracle connector. The DBMS supports compression and backup compression, partitioned table parallelism, policy-based