IBM is a market leader in big data and analytics solutions. This session explains the basics of Big Data, with actual use cases of clients who have benefited from IBM solutions in this space, followed by architectures with IBM BigInsights, BigSQL, Platform Symphony and Spectrum Scale.
23. Agenda
• What is Big Data?
• Big Data Use Cases
• IBM Analytics Platform
• IBM Spectrum Scale
24. The IBM big data platform advantage
[Diagram: IBM big data platform]
Analytic Applications: BI / Reporting, Exploration / Visualization, Functional App, Industry App, Predictive Analytics, Content Analytics
IBM big data platform: Systems Management, Application Development, Visualization & Discovery, Accelerators, Information Integration & Governance, Hadoop System, Stream Computing, Data Warehouse
• The platform provides benefit as you move from an entry point to a second and third project
• Shared components and integration between systems lower deployment costs
• Key points of leverage
  • Reuse text analytics across Streams and BigInsights
  • Hadoop connectors between Streams and Information Integration
  • Common integration, metadata and governance across all engines
  • Accelerators built across multiple engines: common analytics, models, and visualization
28. Dominant Players vs. Contender platforms
Category | OS | Tape | Cloud Management | Big Data & Analytics
Dominant Player | Microsoft Windows | Quantum DLT | Amazon Web Services | Cloudera
Contender platform | Linux | Linear Tape Open (LTO) | OpenStack | Open Data Platform
Supporters of Contender platform | IBM, Red Hat, SUSE, Oracle and others | IBM, HP, Certance and others | IBM, HP, Rackspace, Red Hat, Dell, Cisco, VMware and others | IBM, Pivotal, Hortonworks and others
37. HDFS versus IBM Spectrum Scale™
Hadoop HDFS
• NameNode HA added in version 2.0, in active/passive configuration
• Difficult to ingest data: special tools required
• Single-purpose, Hadoop MapReduce only
• Lacking enterprise readiness
• Large block sizes: poor support for small files
• Compute and storage tightly coupled, leading to very low CPU utilization
• Non-POSIX file system: obscure commands, no support for in-place updates
IBM Spectrum Scale
• No single point of failure: distributed metadata in active/active configuration since 1998
• Ingest data using policies for data placement
• Versatile, multi-purpose, hybrid storage (locality and shared)
• Enterprise ready, with support for advanced storage features (encryption, DR, replication, software RAID, etc.)
• Variable block sizes: suited to multiple types of data and metadata access patterns
• Scale compute and storage independently (policy-based ILM)
• POSIX file system: easy to use and manage
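The POSIX point in the comparison above can be illustrated with a minimal sketch. It assumes only that the file system is mounted as an ordinary POSIX path, so standard file calls work without a special client. The file name, record layout, and helper names below are hypothetical, and a temporary directory stands in for a real mount point.

```python
import os
import tempfile


def update_in_place(path: str, offset: int, new_bytes: bytes) -> None:
    """Overwrite len(new_bytes) bytes at `offset`, leaving the rest of the file untouched."""
    # "r+b" opens an existing file for reading and writing without truncating it.
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(new_bytes)


def demo() -> bytes:
    """Create a small record, patch one field in place, and return the file contents."""
    # Assumption: a temporary directory stands in for a POSIX mount point.
    with tempfile.TemporaryDirectory() as mount_point:
        path = os.path.join(mount_point, "record.dat")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"status=PENDING;id=42")
        # Replace the 7-byte value that starts at byte 7 with a same-length value.
        update_in_place(path, 7, b"DONE   ")
        with open(path, "rb") as f:
            return f.read()


if __name__ == "__main__":
    print(demo())
```

Only the seven bytes at the given offset are rewritten. On a file system without in-place updates, the same change would typically mean writing a new copy of the file.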
41. Session summary
• Big data is being generated by everything around us
• Every digital process and social media exchange produces it
• Systems, sensors and mobile devices transmit it
• Big data is arriving from multiple sources at amazing velocities, volumes and varieties
• To extract meaningful value from big data, you need optimal processing power, storage, analytics capabilities, and skills
Sources: The Economist, and special thanks to
Dr. Bob Sutor, IBM VP, Business Solutions & Mathematical Sciences
48. About the Speaker
Tony Pearson is a Master Inventor and Senior Managing Consultant for the IBM System Storage™ product line. Tony joined
IBM Corporation in 1986 in Tucson, Arizona, USA, and has lived there ever since. In his current role, Tony presents briefings
on storage topics covering the entire System Storage product line, Tivoli storage software products, and topics related to Cloud
Computing. He interacts with clients, speaks at conferences and events, and leads client workshops to help clients with
strategic planning for IBM’s integrated set of storage management software, hardware, and virtualization products.
Tony writes the “Inside System Storage” blog, which is read by hundreds of clients, IBM sales reps and IBM Business Partners
every week. This blog was rated one of the top 10 blogs for the IT storage industry by “Networking World” magazine, and #1
most read IBM blog on IBM’s developerWorks. The blog has been published in a series of books, Inside System Storage:
Volume I through V.
Over the past years, Tony has worked in development, marketing and customer care positions for various storage hardware
and software products. Tony has a Bachelor of Science degree in Software Engineering, and a Master of Science degree in
Electrical Engineering, both from the University of Arizona. Tony holds 19 IBM patents for inventions on storage hardware and
software products.
9000 S. Rita Road
Bldg 9032 Floor 1
Tucson, AZ 85744
+1 520-799-4309 (Office)
tpearson@us.ibm.com
Tony Pearson
Master Inventor,
Senior IT Specialist
IBM System Storage™