John P. Girard, Ph.D.'s talk at Sales & Marketing Middle East. Everyone is talking about big data. Lots of people are selling big data. Many leaders are wondering about big data. An honest, hype-free overview of where we are in the big data space.
4. www.johngirard.net john@johngirard.net4
Is Big Data New?
www.google.com/trends/
Teradata, 1991 (Osco Drug)
www.tinyurl.com/GirardBD
Prairie Business Magazine, 7(1), 2008
Is data mining synonymous with Big Data?
No. Big Data is the data set (or asset). Data mining is the process (or handler).
5.
The History of Big Data
Information Overload
Information overload occurs when the amount of input to a system exceeds its processing capacity. (Speier et al., 1999)
Information Overload
Information overload is that state in which available, and potentially useful, information is a hindrance rather than a help. (Bawden, 2001)
Personal Information Overload
A perception on the part of the individual (or observers of that person) that the flow of information associated with work tasks is greater than can be managed effectively. (Wilson, 2001)
Organizational Information Overload
A situation in which the extent of perceived information overload is sufficiently widespread within an organization as to reduce the overall effectiveness of management operations. (Wilson, 2001)
Overload is not new!
The Roman Philosopher Seneca worried about information overload nearly 2,000 years before it was cool. “What is the point of having countless books and libraries whose titles the owner could scarcely read through in a whole lifetime?” he wondered.
Michael Grunwald @MikeGrunwald Aug. 28, 2014
6.
The History of Big Data
2/3 of managers complained of information overload (KPMG, 2000)
38% of the surveyed managers waste a substantial amount of time locating information (Wilson, 2001)
Managers “dwell on information that is entertaining but not informative, or easily available but not of high quality” (Linden, 2001)
43% of the managers delayed decisions because of too much information. (Wilson, 2001)
The total accumulated codified database of the world, which includes all books and all electronic files, doubles every seven years and some predict this will double twice a day by 2010 (Bontis, 2000).
What we knew a decade ago:
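To put the two doubling claims attributed to Bontis (2000) side by side, the arithmetic can be sketched as below. This is only an illustration: the function names are hypothetical and not from the talk, and the calculation assumes smooth exponential growth, which the slide does not state.

```python
def annual_growth_rate(doubling_time_years: float) -> float:
    """Yearly growth rate implied by a doubling time (assumes exponential growth)."""
    return 2 ** (1 / doubling_time_years) - 1


def growth_factor(doubling_time_years: float, years: float) -> float:
    """How many times larger the data set is after `years`."""
    return 2 ** (years / doubling_time_years)


# Doubling every seven years is about 10% growth per year (about 0.104).
seven_year_rate = annual_growth_rate(7)

# Doubling twice a day would mean 730 doublings in a single year.
doublings_per_year_if_twice_daily = 2 * 365

print(f"Annual growth at a 7-year doubling time: {seven_year_rate:.1%}")
print(f"Doublings per year at twice a day: {doublings_per_year_if_twice_daily}")
```

The gap between one doubling in seven years and 730 doublings in one year shows how dramatic the predicted acceleration was.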
KM 1.0 (According to John)
Knowledge, Information, Data
Data to Information: Context, Categorize, Calculate, Correct, Condense
Information to Knowledge: Compare, Consequences, Connects, Conversation
7.
KM 2.0
Ikujiro Nonaka
Socialization, Externalization, Internalization, Combination (axes: Tacit, Explicit)
What do decision-makers want?
http://www.youtube.com/watch?v=lH39xjXaLW8
8.
Seek Wisdom
Seek wisdom, not knowledge. Knowledge is of the past, wisdom is of the future. ~ Lumbee Proverb
The Lumbee Tribe of North Carolina is a state-recognized tribe of approximately 55,000 enrolled members, most of them living in Robeson and the adjacent counties in southeastern North Carolina.
The Cognitive Hierarchy
10 Years
Knowledge, Information, Data
Ackoff’s Apex
Wisdom, Understanding, Knowledge
Seek Wisdom not Knowledge (KM 2.5?)
9.
Big Data – Some Definitions
A term coined to reflect very large and very complex data sets. (Sultanow & Chircu, 2015)
Big data is a term for any collection of large and complex data sets that it becomes difficult to process. (Gordon, 2015)
Data set that is beyond the capacity of relational database applications. (Joseph, 2015)
Term for a collection of large and complex data sets that it becomes difficult to process with traditional tools. (Klepac & Berg, 2015)
Large, Complex, Difficult
Strategic Data-based Wisdom in the Big Data Era
Complex: A Definition
Large, Complex, Difficult
“a group of obviously related units of which the degree and nature of the relationship is imperfectly known”
10.
Knowledge Application = KM 3.0
Knowledge, Information, Data
Wisdom, Understanding, Knowledge
“With 3,600 stores in the United States and
roughly 100 million customers walking
through the doors each week, Wal-Mart has
access to information about a broad slice of
America . . . The data are gathered item by
item at the checkout aisle, then recorded,
mapped and updated by store, by state, by
region . . . By its own account Wal-Mart has
460 terabytes of data.”
14 November 2004
Hurricane
An Example
13.
Focus on the desired business end state …
The right technology
Branson’s secret weapon is carrying an old-fashioned notebook with him everywhere he goes.
16.
The Size of Big Data
http://www.youtube.com/watch?v=B27SpLOOhWw
CEO: How much data do we need?
http://www.computerworlduk.com/news/infrastructure/3433595/boeing-787s-create-half-terabyte-of-data-per-flight-says-virgin-atlantic/
17.
Decide later …
18.
Michael Jordan on the “Delusions” of Big Data
http://spectrum.ieee.org/robotics/artificial-intelligence/machinelearning-maestro-michael-jordan-on-the-delusions-of-big-data-and-other-huge-engineering-efforts
When you have large amounts of data, your appetite for hypotheses tends to get even larger. And if it’s growing faster than the statistical strength of the data, then many of your inferences are likely to be false. They are likely to be white noise.
http://www.tylervigen.com/
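Jordan's warning can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not anything shown in the talk: every variable is pure random noise, the function names are hypothetical, and the cutoff of 0.361 is the approximate correlation needed for two-tailed significance at the 5% level with 30 observations. Testing more hypotheses against the same small sample should yield more "significant" correlations, all of them spurious.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_spurious_hits(n_hypotheses, n_obs=30, threshold=0.361, seed=0):
    """Count noise variables that look 'significantly' correlated with a noise target.

    The threshold is an assumed approximate 5% critical value for n_obs=30.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    hits = 0
    for _ in range(n_hypotheses):
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        if abs(pearson(target, candidate)) > threshold:
            hits += 1
    return hits


# The sample size stays fixed while the number of hypotheses grows.
for k in (10, 100, 1000):
    print(f"{k} hypotheses tested -> {count_spurious_hits(k)} spurious 'findings'")
```

Because nothing here is truly related to anything else, every hit is a false inference. Roughly 5% of the tested hypotheses are expected to pass the cutoff by chance alone.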