2. Contents
Introduction
Big Data Market
Business Domains
Solutions
Cloudify
Cluster Modelling
Data Governance
Q & A
3. Data!
According to Gartner, there will be nearly 26 billion devices on the Internet of Things by 2020.
Big data is when the data itself becomes part of the problem.
Big data is a term for data sets that are so large or complex that traditional data-processing applications are inadequate to deal with them.
4. Big Data - Change from Web Technology.
Traditional software stacks have been application-oriented, and applications were built in more or less the same way.
The data is represented to suit the application.
With big data: data first, application second.
Applications and solutions have to be tailored to data type and workload.
Given the differences in data types and workload contexts, there are going to be a number of different solutions and applications.
5. Money from Big Data
The main point of using big data analytics should be to grow your revenue.
7. Big Data Trends in 2017
Big data becomes fast and approachable
In memory computing
Kudu, Impala, CarbonData
Purpose-built tools for Hadoop become obsolete
Analytics on all data
Tools that are purpose-built for Hadoop and fail to deploy across use cases will fall by the wayside
Architectures mature to reject one-size-fits-all frameworks
Use-case-specific architecture design
combine data-prep tools, Hadoop Core and analytics platforms
Convergence of IoT, cloud, and big data
An increasing share of this data is being deployed on cloud services
Demand is growing for analytical tools that connect to and combine cloud-hosted data sources
8. Big Data solutions
I have data and I want xyz result! Can someone tell me what to use?
Here come the solutions
The era of solution architects arrives
Enterprises look for a predefined solution for a particular use case.
9. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 9
This is a ‘one-size-fits-all’ category that could include anything
- Bernard Marr, leading business and data expert
10. Web Analytics
Web analytics is the measurement, collection, analysis and reporting of web data for purposes of
understanding and optimizing web usage.
It can be used as a tool for business and market research, and to assess and improve the effectiveness of a website.
It helps estimate how traffic to a website changes after the launch of a new advertising campaign.
It provides information about the number of visitors to a website and the number of page views.
It helps gauge traffic and popularity trends, which is useful for market research.
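The visitor, page-view and campaign measurements above can be illustrated with a minimal Python sketch. The record layout, the sample data and the function names `daily_stats` and `campaign_lift` are assumptions made for illustration only; a real web analytics tool would work from server logs or tracking events and use more careful statistics.

```python
from collections import defaultdict
from datetime import date

# Hypothetical page-view records: (day, visitor_id, page).
PAGE_VIEWS = [
    (date(2017, 3, 1), "v1", "/home"),
    (date(2017, 3, 1), "v1", "/products"),
    (date(2017, 3, 1), "v2", "/home"),
    (date(2017, 3, 2), "v3", "/home"),
    (date(2017, 3, 3), "v1", "/home"),
    (date(2017, 3, 3), "v4", "/home"),
    (date(2017, 3, 3), "v4", "/products"),
    (date(2017, 3, 4), "v5", "/home"),
    (date(2017, 3, 4), "v6", "/home"),
    (date(2017, 3, 4), "v6", "/cart"),
]


def daily_stats(records):
    """Return {day: (page_views, unique_visitors)}."""
    views = defaultdict(int)
    visitors = defaultdict(set)
    for day, visitor, _page in records:
        views[day] += 1
        visitors[day].add(visitor)
    return {day: (views[day], len(visitors[day])) for day in views}


def campaign_lift(records, launch_day):
    """Ratio of mean daily page views on/after launch to the mean before it."""
    stats = daily_stats(records)
    before = [pv for day, (pv, _) in stats.items() if day < launch_day]
    after = [pv for day, (pv, _) in stats.items() if day >= launch_day]
    if not before or not after:
        raise ValueError("need data on both sides of the launch day")
    return (sum(after) / len(after)) / (sum(before) / len(before))


if __name__ == "__main__":
    for day, (views, uniques) in sorted(daily_stats(PAGE_VIEWS).items()):
        print(day, "page views:", views, "visitors:", uniques)
    # Assumes the (hypothetical) campaign launched on 3 March 2017.
    print("lift:", campaign_lift(PAGE_VIEWS, date(2017, 3, 3)))
```

A ratio above 1 suggests traffic rose after the launch, but it does not by itself show that the campaign caused the change.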
11. Clickstream Analysis
What is the most efficient path for a site visitor to research a product, and then buy it?
What products do visitors tend to buy together, and what are they most likely to buy in the future?
Where should I spend resources on fixing or enhancing the user experience on my website?
Stream feeds into HDFS with Flume
Use Spark to build a relational view of the data
Use Spark to query and refine the data
Visualize data with Tableau
Tableau does not have an official HBase connector.
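One of the questions above, which products visitors tend to buy together, can be sketched in plain Python. This is a stand-in for the Spark query step, written without Spark so it runs without a cluster; the event layout, the sample data and the function names `baskets` and `co_purchases` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical click events: (session_id, action, product).
EVENTS = [
    ("s1", "view", "laptop"),
    ("s1", "buy", "laptop"),
    ("s1", "buy", "mouse"),
    ("s2", "buy", "laptop"),
    ("s2", "buy", "mouse"),
    ("s2", "buy", "bag"),
    ("s3", "view", "bag"),
    ("s3", "buy", "bag"),
]


def baskets(events):
    """Group purchased products by session, ignoring non-purchase events."""
    result = {}
    for session, action, product in events:
        if action == "buy":
            result.setdefault(session, set()).add(product)
    return result


def co_purchases(events):
    """Count how often each unordered product pair is bought in one session."""
    pairs = Counter()
    for products in baskets(events).values():
        # Sorting makes ("laptop", "mouse") and ("mouse", "laptop") the same key.
        pairs.update(combinations(sorted(products), 2))
    return pairs


if __name__ == "__main__":
    for pair, count in co_purchases(EVENTS).most_common():
        print(pair, count)
```

On a real clickstream the same grouping and pair counting would be expressed as distributed operations, since the number of sessions is far too large for a single in-memory dictionary.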
12. Log analysis -> ELK stack
Flume efficiently ingests log data from multiple web servers into a centralized store (HDFS, HBase).
Along with log files, Flume is also used to import huge volumes of event data produced by social networking sites like Facebook and Twitter, and e-commerce websites like Amazon and Flipkart.
Flume supports a large set of source and destination types.
Flume can be scaled horizontally.
2016-11-09 03:45:49,168 DEBUG [Thread-1] o.e.jetty.server.handler.ContextHandler contextDestroyed: ServletContextEvent[source=ServletContext ]
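Before a line like the one above can be searched or aggregated, it has to be split into fields. The following Python sketch shows one way to do that with a regular expression. The pattern and the field names (`timestamp`, `level`, `thread`, `logger`, `message`) are assumptions based only on this single sample line, not the configuration of any particular log pipeline.

```python
import re
from datetime import datetime

# Sample line taken from the slide.
SAMPLE = (
    "2016-11-09 03:45:49,168 DEBUG [Thread-1] "
    "o.e.jetty.server.handler.ContextHandler "
    "contextDestroyed: ServletContextEvent[source=ServletContext ]"
)

# Assumed layout: timestamp, level, [thread], logger name, free-text message.
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) "
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<logger>\S+) "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match the layout."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # %f accepts the three-digit millisecond part and stores it as microseconds.
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"], "%Y-%m-%d %H:%M:%S,%f"
    )
    return fields


if __name__ == "__main__":
    for name, value in parse_log_line(SAMPLE).items():
        print(f"{name}: {value}")
```

Returning `None` for non-matching lines lets a caller count or route unparsed input separately instead of failing on the first malformed line.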
13. Big Data In Cloud
Why?
Auto Scaling
Low TCO
Zero maintenance
Cloud Services
Low Storage Cost
14. Big Data In Cloud
Companies standardized on Big Data tools, including Spark, Hadoop/MapReduce, Hive, and Pig, will find a natural transition to the cloud.
The major motive is to use available cloud services such as OBS, RDS, etc. along with Big Data technologies.
Many solutions will come boxed up as a combination of the services available in the cloud.
Customers will demand that applications be cloudified and use cloud services.
15. Data Governance
1) Enable the lake
Managed ingestion
Metadata management
2) Govern the data
Data lineage
Data privacy and security
Data quality
Data lifecycle management
3) Engage the business
Data catalog
Self-service data preparation
4) Pipelining and ETL
Data movement across stores
Periodic aggregation
16. Cluster modelling and Maintenance
Facilitating accurate migration of data from
traditional data storage systems to Hadoop.
17. Analytics & Visualization
Help organizations to
process large amounts of complex data
visualize it in a meaningful, user-readable form
easily analyse business data and perform trend analysis.