Exploratory Webcast for the Big Data Information Architecture Research Project
Live Webcast Jan. 22, 2014
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=32304b307fc5359a2f97b173166ea07b
Big Data is everywhere -- that's for sure. But the big question for today's savvy enterprise is where, exactly, should it fit within the Information Architecture? Making that decision correctly can save a lot of money while adding significant value to any number of enterprise operations. Business processes can be improved with critical new data sets; marketing can excel at hitting the right targets quickly; sales can hit home runs by having a much deeper understanding of key prospects; and senior executives can see the big picture more clearly than ever before.
Register for this Exploratory Webcast to hear veteran Analyst Dr. Robin Bloor outline the current landscape of Big Data and offer guidance to help today's organizations determine how, when and where to deploy this powerful, if unwieldy, information asset. This event will kick off The Bloor Group's Interactive Research Report for 2014, which will focus on illuminating optimal Big Data Information Architectures. The series will include a dozen interviews with today's Big Data visionaries, plus three interactive Webcasts and a detailed findings report.
Visit InsideAnalysis.com for more information.
Think Big - How to Design a Big Data Information Architecture
1. Grab some coffee and enjoy the pre-show banter before the top of the hour!
2. “Think Big: How to Design a Big Data Information Architecture”
Exploratory Webcast | January 22, 2014
3. Guests
Robin Bloor
Chief Analyst, The Bloor Group
@robinbloor robin.bloor@bloorgroup.com
Eric Kavanagh
CEO, The Bloor Group
@eric_kavanagh eric.kavanagh@bloorgroup.com
4. Big Data Information Architecture
Exploratory Webcast
January 22, 2014
Roundtable Webcast
April 9, 2014
Findings Webcast
June 25, 2014
#BigDataArch
9. The Visible “Big Data” Trend
• Corporate data volumes grow at about 55% per annum (exponentially)
• Data has been growing at this rate for, maybe, 40 years
• There is nothing new about big data. It clings to an established exponential trend
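As a rough illustration of what the 55% figure implies, the sketch below compounds it year over year. It assumes growth compounds once a year at a constant rate, which the slide does not state; the function names are hypothetical.

```python
import math

# Assumption: the slide's "about 55% per annum" compounds once per year
# at a constant rate. Real growth will vary from year to year.
ANNUAL_GROWTH = 0.55


def growth_factor(years: float, rate: float = ANNUAL_GROWTH) -> float:
    """Total multiplication of data volume after `years` of compound growth."""
    return (1.0 + rate) ** years


def years_to_multiply(target: float, rate: float = ANNUAL_GROWTH) -> float:
    """Years of compound growth needed to multiply the volume by `target`."""
    return math.log(target) / math.log(1.0 + rate)


if __name__ == "__main__":
    print(f"Doubling time: {years_to_multiply(2):.2f} years")
    print(f"Growth over 40 years: {growth_factor(40):.3g}x")
```

Under these assumptions the volume doubles roughly every 1.6 years, and 40 years of such growth multiplies it by tens of millions.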
10. The Invisible Trend: Moore’s Law Cubed
• The biggest databases are new databases
• They grow at the cube of Moore’s Law
• Moore’s Law = 10x every 6 years
• VLDB: 1000x every 6 years
  – 1991/2 megabytes
  – 1997/8 gigabytes
  – 2003/4 terabytes
  – 2009/10 petabytes
  – 2015/16 exabytes
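The arithmetic behind this slide can be sketched as follows. The code takes the slide's own figures (10x and 1000x per six years) as given and converts them to annual rates; the helper names and the 1991 base year used for the unit ladder are illustrative assumptions.

```python
# Figures taken from the slide; everything else is an illustrative assumption.
MOORE_FACTOR = 10.0               # "Moore's Law = 10x every 6 years"
PERIOD_YEARS = 6.0
VLDB_FACTOR = MOORE_FACTOR ** 3   # the cube: 1000x every 6 years

UNITS = ["megabytes", "gigabytes", "terabytes", "petabytes", "exabytes"]


def annual_rate(factor: float, years: float) -> float:
    """Constant yearly growth rate that yields `factor` after `years`."""
    return factor ** (1.0 / years) - 1.0


def vldb_unit(year: int, base_year: int = 1991) -> str:
    """Unit of the largest databases in `year`, stepping 1000x every 6 years."""
    steps = (year - base_year) // int(PERIOD_YEARS)
    steps = max(0, min(steps, len(UNITS) - 1))  # clamp to the slide's range
    return UNITS[steps]


if __name__ == "__main__":
    print(f"Moore's Law per year: {annual_rate(MOORE_FACTOR, PERIOD_YEARS):.1%}")
    print(f"VLDB per year:        {annual_rate(VLDB_FACTOR, PERIOD_YEARS):.1%}")
    print(f"Largest databases in 2009: {vldb_unit(2009)}")
```

On these figures, 10x in six years is about 47% a year, while 1000x in six years is about 216% a year.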
11. Technology Evolution (Bloor Curve)
[Figure: the Bloor Curve, with regions labeled "Application Migration" and "The Area Of As-Yet-Unrealized Applications". Source: The Bloor Group]
12. The Traditional Force of Disruption
• Software architectures change: centralized, C/S, 3 tier/web, SOA, etc.
• Applications migrate according to latencies
• Dominant applications and software brands can die via “The innovator’s dilemma”
• Wholly new applications appear because of lower latencies, e.g., VMs, CEP
13. This Curve is Compromised
Two DISRUPTIVE forces have changed the curve: PARALLELISM and The CLOUD
16. It’s Over for Spinning Disk
• SSD is now on the Moore’s Law curve
• Disk is not and never was (in respect of seek time)
• All traditional databases were engineered for spinning disk and not for scale-out
• This explains the new DBMS products…
17. In-Memory Disruption
• Memory may gradually become the primary store for data (this impacts data flows)
• Almost all applications are poorly built for this
• Memory is an accelerator, as is CPU cache. This is becoming a factor
18. The Memory Cascade
• On-chip speed v RAM
  – L1 (32K) = 100x
  – L2 (256K) = 30x
  – L3 (8-20MB) = 8.6x
• RAM v SSD
  – RAM = 300x
• SSD v Disk
  – SSD = 10x
Note: Vector instructions and data compression
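To see how the tiers stack up, the sketch below chains the slide's ratios into a speed-up relative to spinning disk. Multiplying the ratios this way is an assumption (the slide gives only the pairwise figures), and the names are hypothetical.

```python
# Pairwise ratios exactly as given on the slide.
ON_CHIP_VS_RAM = {"L1": 100.0, "L2": 30.0, "L3": 8.6}  # cache level v RAM
RAM_VS_SSD = 300.0
SSD_VS_DISK = 10.0


def speedup_vs_disk() -> dict[str, float]:
    """Speed-up of each tier relative to spinning disk.

    Assumption: the pairwise ratios can simply be multiplied together.
    """
    ram = RAM_VS_SSD * SSD_VS_DISK
    result = {"SSD": SSD_VS_DISK, "RAM": ram}
    for level, vs_ram in ON_CHIP_VS_RAM.items():
        result[level] = vs_ram * ram
    return result


if __name__ == "__main__":
    # Print the tiers from slowest to fastest.
    for tier, factor in sorted(speedup_vs_disk().items(), key=lambda kv: kv[1]):
        print(f"{tier:>3}: {factor:,.0f}x disk")
```

On these figures RAM works out at 3,000x disk and L1 cache at 300,000x disk, which is the scale of difference behind the in-memory argument on the previous slide.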
19. Tech Revolutions
TECH REVOLUTION        ARCHITECTURE
Computer               Batch
On-line                Centralized
PC                     Client/server
Internet               Multi-tier
Mobile                 Service Orientation
Internet of things     Event Driven/Big Data
21. The Open Source Picture
• The R Language
  – Over 1 million users
• Hadoop and its Ecosystem
  – Reduced latency for analytics
• Machine Learning Algorithms
  – Raw power
None of these are engineered for performance
23. What Is A Data Scientist?
• Project manager
• Qualified statistician
• Domain Business expert
• Experienced data architect
• Software engineer
(IT’S A TEAM)
24. A Process, Not an Activity
• Data Analytics is a multi-disciplinary end-to-end process
• Until recently it was a walled garden, but the walls have now been torn down by…
  – Data availability
  – Scalable technology
  – Open source tools
25. The CRITICAL Workload Issue
• Previously, we viewed database workloads as an I/O optimization problem
• With analytics the workload is a very variable mix of I/O and calculation
• No databases were built precisely for this, not even Big Data databases
26. Take Note
You can know more about a BUSINESS from its data than by any other means
27. The Biological System
• Our human control system works at different speeds:
  – Almost instant reflex
  – Swift response
  – Considered response
• Organizations will gradually implement similar control systems
• This suggests a data-flow-based architecture
28. The Corporate Biological System
• Right now this division into two different data flows is already occurring
• Currently we can distinguish between:
  – Real-time/Business time applications
  – Analytical applications
• We should build specific architectures for this
29. Some Architectural Principles
• The new atom of data is the event
• SUSO: scale up before scale out
• Take the processing to the data, if you can
• Hadoop is a component, not a solution
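One way to picture "the event as the atom of data" feeding two separate data flows is the minimal sketch below. It is not taken from the webcast: the `Event` and `EventRouter` names, the fields, and the in-memory list standing in for an analytical store are all hypothetical, chosen only to show each event going both to a real-time handler and to an analytical log.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Event:
    """Hypothetical event record: the smallest unit of data in the flow."""
    source: str
    kind: str
    timestamp: float
    payload: Dict[str, Any] = field(default_factory=dict)


class EventRouter:
    """Sends every event down two flows: real-time and analytical."""

    def __init__(self) -> None:
        self._realtime_handlers: Dict[str, List[Callable[[Event], None]]] = {}
        # Stand-in for an analytical store; a real system would persist this.
        self.analytical_log: List[Event] = []

    def on(self, kind: str, handler: Callable[[Event], None]) -> None:
        """Register a real-time handler for one kind of event."""
        self._realtime_handlers.setdefault(kind, []).append(handler)

    def publish(self, event: Event) -> int:
        """Record the event for analytics, then run its real-time handlers.

        Returns the number of real-time handlers that were invoked.
        """
        self.analytical_log.append(event)
        handlers = self._realtime_handlers.get(event.kind, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


if __name__ == "__main__":
    router = EventRouter()
    router.on("payment_failed", lambda e: print("ALERT:", e.payload))
    router.publish(Event("web", "payment_failed", 1.0, {"order": 1}))
    router.publish(Event("web", "page_view", 2.0))
    print(len(router.analytical_log), "events kept for analysis")
```

The design choice here is that every event reaches the analytical flow, while only the kinds that need a fast reaction have real-time handlers attached.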
30. In Conclusion
PART ONE: The Big Data Curve?
PART TWO: Technology Disruption
PART THREE: Data Flow