Big Data Management
•Big Data refers to data sets whose size, structure, or complexity makes it difficult for conventional software tools to capture, process, and present the data cost-effectively within a tolerable period of time.
Big Data
Data Volume
•Petabytes / Exabytes of data
•Billions of customer records
•Billions / Trillions of records
Data Structure
•Loosely structured and/or distributed data
•Flat schema with complex inter-relationships
•Varying formats and often incomplete data
Data Source
•Transactional
•Social media applications
•Analytics
Hadoop Framework
What is Hadoop?
Hadoop is a “framework” for running applications on large clusters built of commodity hardware.
Hadoop is an open source project sponsored by the Apache Software Foundation.
Why use Hadoop?
A reliable, massively scalable framework for distributed processing of large, complex, or unstructured data.
Eliminates or reduces the cost of traditional RDBMS licenses and high-speed SAN storage.
An ecosystem of vendors supports the Hadoop framework, e.g. Cloudera, Hortonworks, IBM, VMware, MapR Technologies, and others.
Major database software vendors (IBM, SAP, Microsoft, Oracle, Teradata, and others) have completed plans to integrate with Hadoop.
Who uses Hadoop?
Corporations such as Skype, eBay, Google, IBM, Facebook, LinkedIn, Twitter, and Rackspace are among the high-profile Hadoop users.
Adoption of the technology is now past the “early adopters” stage.
Comprehensive Solution
Diagram components:
•Legacy and Next Generation Network
•OSS / Mediation
•Billing Mediation application
•Big Data Solution
•Billing Domain
•Rating, Charging, Billing
•Customer Management
•Retail Billing
•Wholesale Billing
•Wholesale Billing application
•Provisioning
•Provisioning application
Diagram data flows:
•Collect network events including CDR, IPDR, SNMP traps, NetFlow, etc.
•Distribute formatted records to wholesale system
•Upload wholesale charging records
•Upload customer profile, billing information, etc.
•Initiate customer provisioning queries (lines, features, etc.)
•Upload provisioning information
•Query NEs for provisioning details
Exploiting strategic OSS / BSS portfolio to offer data analytics.
Solution Components
Usage, Provisioning, Retail & Wholesale Billing, Network data
Data Ingestion
Data Management
Reporting
Big Data Management (BDM) Functions
Oracle
Hadoop cluster
Incremental updates
Business Objects
Optional data flow
Primary data flow
Big Data Solution Architecture
Big Data Management through the Hadoop framework.
Solution Scope
Data Warehousing
•A reliable, massively scalable, low-cost data warehousing solution built on Hadoop.
Reporting
•Operations dashboard
•Enhanced audit of workflow and records / transactions processed
•Granular insight into the product through subsystem-level monitoring
•Capacity utilization of platform resources
•SLA Management
Flume
Scribe
FTP
HDFS
Oozie
MapReduce
Hive
Pig
Commodity Servers
Hadoop Framework
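The MapReduce layer in the stack above can be illustrated with a minimal, in-memory sketch of the map, shuffle, and reduce phases. It does not use Hadoop itself. The CDR layout (subscriber_id,service,duration_seconds), the sample records, and all function names are assumptions made for illustration, not part of the solution described here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical CDR layout: subscriber_id,service,duration_seconds
SAMPLE_CDRS = [
    "1001,voice,120",
    "1002,voice,30",
    "1001,voice,45",
    "1002,data,300",
]


def map_cdr(line: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit one (subscriber_id, duration) pair per CDR line."""
    subscriber_id, _service, duration = line.split(",")
    yield subscriber_id, int(duration)


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle phase: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_usage(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce phase: total the durations for one subscriber."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on a single machine."""
    pairs = (pair for line in lines for pair in map_cdr(line))
    return dict(reduce_usage(k, v) for k, v in shuffle(pairs).items())


usage_by_subscriber = run_job(SAMPLE_CDRS)
print(usage_by_subscriber)
```

On a real cluster the framework would distribute the map and reduce calls across the commodity servers and read the input from HDFS; the per-record logic would keep roughly this shape.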
Solution Scope
Revenue Assurance – Usage
•Reconciling Usage to Network event records
•CDR to Diameter records
•CDR to RADIUS records
•CDR to SS7 records
•Reconciling Usage to Trunk records
•Reconciling AMA to EMI records
•Error Record Management
Revenue Assurance – Service
•Order accuracy / order management
•Inventory analysis
•Fulfillment analysis
Revenue Assurance – Billing
•Reconciling CDR to Billing Records
•Rating and Billing Verification
•Retail Billing plan analysis
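Reconciling CDRs to billing records, as listed above, amounts to comparing two record sets by a shared key. The following is a minimal sketch under stated assumptions: the UsageRecord fields, the record_id join key, and the three result categories are hypothetical and not taken from the solution itself.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class UsageRecord:
    # Hypothetical fields; a real CDR or billing record carries many more.
    record_id: str
    subscriber_id: str
    duration_seconds: int


def reconcile(
    cdrs: Iterable[UsageRecord], billed: Iterable[UsageRecord]
) -> Dict[str, List[str]]:
    """Compare network CDRs with billing records, keyed by record_id."""
    cdr_by_id = {r.record_id: r for r in cdrs}
    billed_by_id = {r.record_id: r for r in billed}
    common = cdr_by_id.keys() & billed_by_id.keys()
    return {
        # Usage seen on the network but never billed (revenue leakage).
        "unbilled": sorted(cdr_by_id.keys() - billed_by_id.keys()),
        # Billed records with no matching network event.
        "orphaned": sorted(billed_by_id.keys() - cdr_by_id.keys()),
        # Present on both sides but with differing content.
        "mismatched": sorted(
            rid for rid in common if cdr_by_id[rid] != billed_by_id[rid]
        ),
    }


cdrs = [
    UsageRecord("a", "1001", 60),
    UsageRecord("b", "1001", 120),
    UsageRecord("c", "1002", 30),
]
billed = [
    UsageRecord("a", "1001", 60),
    UsageRecord("b", "1001", 100),
    UsageRecord("d", "1003", 10),
]
report = reconcile(cdrs, billed)
print(report)
```

At Big Data volumes the same comparison would more likely be expressed as a join in Hive or Pig over HDFS data; the in-memory dictionaries here only illustrate the logic.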
Solution Scope
•Monitoring network elements to acquire a detailed, time-based view of application usage in order to gauge service acceptance and to measure the allocation / availability of appropriate resources.
•Mining service / feature monitoring data to launch proactive marketing and customer service initiatives, e.g. roaming, content download, and geographic correlation.
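A time-based view of application usage can be sketched as a count of monitoring events per service per hour. The event layout (ISO 8601 timestamp plus service name), the sample events, and the function name are assumptions made for illustration only.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, Tuple

# Hypothetical monitoring events: (ISO 8601 timestamp, service name)
EVENTS = [
    ("2024-01-01T09:05:00", "roaming"),
    ("2024-01-01T09:40:00", "roaming"),
    ("2024-01-01T09:59:59", "content_download"),
    ("2024-01-01T10:01:00", "roaming"),
]


def hourly_usage(events: Iterable[Tuple[str, str]]) -> Counter:
    """Count events per (hour bucket, service)."""
    counts: Counter = Counter()
    for timestamp, service in events:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(timestamp).replace(
            minute=0, second=0, microsecond=0
        )
        counts[(hour.isoformat(), service)] += 1
    return counts


usage = hourly_usage(EVENTS)
for (hour, service), count in sorted(usage.items()):
    print(hour, service, count)
```

Hourly counts like these could feed service-acceptance trends or flag subscribers for a proactive roaming offer; the bucket size is a design choice and could equally be per day or per cell site.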