Scaling Data Hadoop and Big Data Solutions
1.
2. Confidential and Proprietary of Scaling Data All Rights Reserved
Scaling Data introduction
What is “Big Data”
Hadoop Capabilities and Uses
Hadoop and its use in Analytics
SSN overview and analytics direction
Next steps
3.
• Partnership composed of seasoned Big Data, Hadoop, financial services, and security entrepreneurs
• Focused on extracting value from ALL your data
• Services include:
− Data Discovery Assessments
− Strategy Development
− Hadoop Implementation
− Hosted Hadoop Environment
− Advanced Analytics Development
4.
FLEXIBILITY
Commoditization of Distributed Computing
SCALABILITY
Distributed Data Processing
Competitive Advantage
SECURITY
Hardened Servers
World-Class Encryption
5.
• Scaling Data focuses on Big Data problems in the financial services arena.
• We provide data discovery, capture, analysis, and strategies that allow organizations to better leverage ALL current and historical data beyond traditional relational and BI limitations.
• Hadoop Hosting
6.
Scaling Data solutions focus on the following Big Data industries:
• Financial Services
− Security/AML/Fraud
− Payments Analysis
• Retail
− Spend Analysis
− Pricing Optimization
• Telecom and Utilities
− Smart Grid Analysis
− Pricing Optimization
8.
Relational Databases:
• ACID system
• Stores tables (schema)
• Stores single-digit terabytes
• Processes GBs per query
• SQL
• Interactive response
• Low latency
Hadoop:
• A distributed operating system for data analysis
• Stores files (structured and unstructured)
• Stores dozens of petabytes
• Queries and data processing
• Batch response (>30 sec)
• HBase allows for low-latency queries, but you lose SQL
Hadoop is good for storing and processing large amounts of unstructured or structured data in batch form.
HBase is the tool to use for petabyte-size, low-latency applications.
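The batch style of processing described on this slide can be pictured with a small, single-machine sketch of the map, shuffle, and reduce steps that Hadoop's MapReduce model distributes across a cluster. This is plain Python, not the Hadoop API; the function names and the sample records are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (token, 1) pair for every whitespace-separated token in one line."""
    for token in record.lower().split():
        yield token, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)


def run_batch(records):
    """Run the whole job over a complete batch of input records."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical unstructured input: free-text log lines.
lines = ["payment approved", "payment declined", "Payment approved"]
counts = run_batch(lines)
```

The job reads the entire input before producing any answer, which is why the response is batch rather than interactive.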
9.
Companies that use Hadoop report the following:
• 70% are more confident in their ability to manage large data
• 88% can perform more analysis on large data
• 88% can keep more historical records
• 94% can analyze data in greater detail
• 82% can capture and use all source data
Source: Ventana Research
12.
• Efficiently execute sophisticated analytics
Support real-time transaction processing; handle thousands of transactions a second. Leverage the platform’s comprehensive range of analytic capabilities.
• Leverage packaged capabilities and open analytics
Balance the need for proven, off-the-shelf analytics with the capability to develop new rules and models with easy-to-use graphical tools.
• Drive process efficiencies
Automate and streamline investigations with alert generation and comprehensive workflow and investigation management.
• Adapt to changing organizational needs
Adapt logic, processing, and policies with user-friendly controls and tools, with or without IT support. New solutions can easily be deployed on the common platform to meet changing business needs.
• Yield faster returns
Proven, out-of-the-box analytics detect and prevent issues immediately. Speed implementation with flexible data mapping to legacy environments and a data-source-agnostic architecture.
Difference between SAN storage and commodity disk: a gigabyte of storage in Hadoop costs 0.25 per month, compared with 1.00 per month in other databases.
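To make the storage comparison concrete, the following is a minimal arithmetic sketch using the two per-gigabyte monthly figures from the note above. The currency is not stated in the source, so the results are in the same unspecified unit; the 10 TB data size and the decimal terabyte (1,000 GB) are assumptions made only for this example.

```python
# Per-gigabyte monthly storage figures quoted in the note (currency unspecified).
HADOOP_COST_PER_GB_MONTH = 0.25
OTHER_DB_COST_PER_GB_MONTH = 1.00

# Assumption for the example: decimal terabytes (1 TB = 1,000 GB).
GB_PER_TB = 1000


def monthly_storage_cost(gigabytes, cost_per_gb_month):
    """Return the monthly cost of storing the given number of gigabytes."""
    return gigabytes * cost_per_gb_month


def monthly_savings(gigabytes):
    """Return the monthly difference between the other database and Hadoop."""
    other = monthly_storage_cost(gigabytes, OTHER_DB_COST_PER_GB_MONTH)
    hadoop = monthly_storage_cost(gigabytes, HADOOP_COST_PER_GB_MONTH)
    return other - hadoop


# Hypothetical data set of 10 TB.
data_gb = 10 * GB_PER_TB
hadoop_cost = monthly_storage_cost(data_gb, HADOOP_COST_PER_GB_MONTH)
other_cost = monthly_storage_cost(data_gb, OTHER_DB_COST_PER_GB_MONTH)
savings = monthly_savings(data_gb)
```

At these figures the Hadoop cost is one quarter of the alternative regardless of data size, so the absolute saving grows linearly with the volume stored.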
Hadoop is not a replacement for Oracle and MySQL; you offload the tasks that they do not do well.
• Gathers data from multiple sites
• Industry-customized algorithms
• Flexible/scalable platform
• Ability to see highly unique trends
• Ability to store and analyze petabytes of data