2. A Brief Comparison of Associative Information Systems with other NoSQL solutions for Managing Big Data Problems
Introduction
http://www.virtue-desk.com
Wednesday, August 28, 13
3. The NEW WORLD
The “FLAT” Relational DB World VS. the “ROUND” Associative World
2 Dimensional – Un-Natural
(N) Dimensional - Natural
4. (The BIG LIE)
Big IT says you need Big Data solutions to help you find value hidden in your data.
The most important question to ask is about the Total Cost of Ownership (including all the design, consulting, set-up, development, implementation, evolution and maintenance services) vs. the Real ($) Benefit to be attained.
“Will it Deliver more $ value to my organization than it will Cost me?”
If you don’t get a guarantee (or your money cheerfully refunded), or at least an answer, perhaps you shouldn’t buy in.
5. Atomic DB vs NoSQL
Big Data? Big Issues? Big Bucks!!!
Once upon a time, customers were complaining about not getting enough value for their money spent on IT. Sure, they needed it to run their business, but any good businessman will eventually ask, “Where is my return on this investment?”
Apparently Big IT listened. The Big Systems they’d delivered weren’t performing up to spec. Too much data, too fast, too complex. So ... Big Deal to the rescue!
When the customer is unhappy, confuse them with a vast array of new stuff for which they have no in-house expertise, and promise them the mythical keys to that hidden treasure chest of magical insight, concealed by circumstance in the many haystacks of data, just waiting to be found by complicated new technology, filled to the brim with the latest buzzwords.
6. Atomic DB vs NoSQL
Big Promises? Big Projects? Big Disappointments!!!
Just like the Big Promises of the past (Knowledge Management, Business Intelligence, Data Warehouses, Data Fusion, System Federation, Y2K, Asset Management, and every expensive generation of Big IT Systems ever produced), the promise of “EVERYTHING You Need and Want” in the next completely new and better collection of buzzword-filled products has always been a Big IT sales strategy.
Unfortunately, the Big Promises did not and do not get delivered!!!!
Every new technology comes like a puppy, wrapped up in some irresistible features, but laden with a lifetime of care, feeding, training, cleanup and support.
Big IT always stands to gain billions with each new wave of puppies.
Customers each stand to lose millions with each Big Failure.
“Big Data” is the new Big Buzzword. And NoSQL systems are the new puppies. And Customers are once again being ‘encouraged’ to Buy-in.
7. Atomic DB vs NoSQL
Big Problems? Big Decisions? Big Responsibility!!!
So get ready for the next Big Wave of Big IT hype and promotion: Your problems are Big, so Big, so count on the Big Experts, who now have a new game: “Free Software!” (open source) to accompany their license-laden Enterprise systems, all requiring extensive Big IT services and support in order to make everything work together, … eventually, … we hope ...
Since the existing RDBMS-based Enterprise systems are performance-shy, and hold only a subset of the Big Data required to drive the modern organization, new and better Big Data solutions are required to augment those expensive silos and get results better and faster than they ever could deliver as stand-alone monuments to inefficiency.
A new breed of data warehouse has hit town and it looks like the next Big Thing.
Now every manager is being conditioned to think in terms of Big Data, and to see NoSQL as the wonder-filled solution to the problems of running a business in the digital age of Information Overload.
Now if only it would work as promised… And not cost a fortune.
So, What to Choose? There are so many options…
8. Atomic DB vs NoSQL
Difference 1
Complexity of Querying
9. SQL / NoSQL – let’s suppose
• 100,000 organizations globally
• 1,000,000 databases
• 10,000,000 tables
• 100,000,000 queries
[Figure labels: all the companies in the world (100,000); all the databases in the world (1,000,000); all the tables, triple, KV and document stores in the world (10,000,000); all the queries in the world (100,000,000)]
• Assuming only 100,000,000 queries globally (one can estimate many more), and ‘x’ hours per query, that’s a lot of person-hours
• Each query can work only with the table(s) it was designed for
• Every database is incompatible with every other database
• For each and every query, a database specialist needs to write it.
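The "‘x’ hours per query" estimate above can be made concrete with a minimal Python sketch. Only the query count comes from the slide; the sample values of x and the 2,000 working hours per person-year are illustrative assumptions, not figures from the presentation.

```python
# Figure taken from the slide.
QUERIES_GLOBALLY = 100_000_000

# Assumption (not from the slide): working hours in one person-year.
HOURS_PER_PERSON_YEAR = 2_000


def person_hours(queries: int, hours_per_query: float) -> float:
    """Total effort if every query is hand-written by a specialist."""
    return queries * hours_per_query


def person_years(queries: int, hours_per_query: float) -> float:
    """The same effort expressed in person-years."""
    return person_hours(queries, hours_per_query) / HOURS_PER_PERSON_YEAR


if __name__ == "__main__":
    # Hypothetical values of 'x'; the slide leaves x unspecified.
    for x in (0.5, 1, 4):
        hours = person_hours(QUERIES_GLOBALLY, x)
        years = person_years(QUERIES_GLOBALLY, x)
        print(f"x = {x} h/query: {hours:,.0f} person-hours, {years:,.0f} person-years")
```

Whatever value is chosen for x, the total scales linearly with the number of hand-written queries, which is the point the slide is making.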
10. Atomic DB
• Each Atomic DB Query is compatible with every Atomic DB Information store
• Every Item in an Atomic DB Information store can reference and be referenced by any Item in its own and any other Atomic DB Information store
• Multi-store mapping is an inherent capability of every Atomic DB system
• No IT professionals required to query any Atomic DB Information store
[Figure labels: all the organizations in the world (100,000 of significance); all the Associative systems in the world (only 1 Atomic DB system required per organization); all the Atomic DB queries in the world (5 universal queries, generic to all data sets)]
11. Atomic DB vs NoSQL
Difference 2
Complexity of Implementation
12. NoSQL
Number of disparate tools, systems and expertise needed to set up and operate:
NoSQL requires: Schema Layouts, Spec Production, RDF Specialists, Special Data Stores, DB Administrators and other DB-specific specialists, SQL, OWL, & SPARQL programmers, Ontology and Taxonomy Specialists, Extraction Tools, Data Scientists, ETL, Data Modelers, Integration Tools, Migration Tools, Data Cleansing Tools, Modeling Tools, Object, Class and Hierarchy (UML) Managers, Data Universe Builders, Open Source system managers, version control, migration and release managers, installation specialists, application specialists, and MORE…
13. ATOMIC
Number of disparate tools, systems and expertise needed to set up and operate:
Atomic DB requires: IAMCore, ManageIT, Business Analyst, Customer
14. Atomic DB vs. NoSQL
Difference 3
Capacity for Complexity
15. NoSQL
• K-V Stores … Amazon Dynamo, …
• Column-oriented … Google Big Table, Hadoop, …
• Document DB … Mark Logic, Mongo DB, …
• Graph DB … Neo4J, Titan, …
• RDBMS … SQL Server, MySQL, …
All available ‘Big Data’ solutions are Name-Space and storage structure bound.
Only graph databases can handle high complexity of relationships in the data, because they are open (often indexed) triple stores, but all contextualization has to be handled at run-time and extracted / derived from the data.
Relational systems can handle moderate complexity but need many columns and many tables, with FK links abounding, to represent even a moderate degree of complexity.
The other ‘Big Data’ solutions are extremely limited in the complexity they handle. They are usually dedicated to a single purpose or application.
16. ATOMIC
Relavance Associative Information Systems have no Name-Space or storage structure binding; each data element is just an attribute of its Token-Space identity.
Relavance Associative Information Systems are multi-dimensional, multi-data information stores, designed from inception to manage relationship complexity of any degree. Their storage model is a 4-D, 128-bit vector space.
There are no restrictive limitations on the number of associative dimensions or levels. Each system can scale to reference (super-index) / hold (aggregate) 10^18 items, each with ‘n’ relationships in any of ‘m’ relationship dimensions.
All data elements and their relationships are fully contextualized upon ingestion so that everything is always grouped and reference-able in as many ways as there are contexts.
18. Atomic DB vs. NoSQL
Difference 4
Cost of Implementation
19. Atomic DB vs. NoSQL
Moderately Complex ‘Big Data’ System implementation involving multi-data (RDBMS, Structured and Unstructured Text) requires:
Atomic DB: Days to Weeks. Small team of: Business Analysts, UI Specialists. One technology base.
NoSQL: Months to Years. Large team(s) of: Technology and Domain Experts, Implementation Specialists, Project Managers, Component Specialists, UI Specialists, Consultants… Many technologies and components.
20. Atomic DB vs. NoSQL
Difference 5
• Maintenance
• Support and
• System Evolution Requirements
21. Atomic DB vs. NoSQL
Moderately Complex ‘Big Data’ System maintenance, support and evolution:
Maintenance and Support
Atomic DB: 1 administrator. Small team of: Business Analysts.
NoSQL: Many administrators and experts. Large team(s) of: Technology and Domain Experts.
System Evolution to meet New Requirements
Atomic DB: Hours to Days: Requirements Gathering, Map and Add new Data Sets, Add new Workflow models, UI Adaptation and Validation. System stays up and usable.
NoSQL: Weeks to Months: Requirements Gathering, Planning, Data Extraction, Specification Production, Implementation Project Management, Regression testing, Validation, Deployment, Training, Change Management, … Version Migration downtime.
22. THE “UPGRADE” CYCLE “$”
Oracle, Microsoft, IBM DB2
Atomic-DB: “Because we are ATOMIC in Nature... There is no Upgrade Cycle...”
23. Development Comparison
Cost Comparison                                     Relational (SQL)   Associative
Schema Development / Database Design                X                  X
Schema Mapping / Table Layout / Query Development   X
Data Integration and Development                    X                  X
Application Class Libraries                         X                  X
Data Encapsulation                                  X
Materialized Views                                  X
Performance Organization                            X
Table Segmentation                                  X
Meta-Data Management                                X
Referential Integrity Checks                        X
Query Evolution                                     X
Configuration Management                            X
Application User Interface Development              X                  X
* Cost of custom research service depends on project scope
24. Atomic DB vs. NoSQL
Difference 6
Our API
26. Atomic DB vs. NoSQL
Difference 7
Our Capacity
27. Associative Capacity Reference
• An exabyte is 10^18 or 1,000,000,000,000,000,000 bytes.
• One exabyte (abbreviated "EB") is equal to 1,000 petabytes and precedes the zettabyte unit of measurement.
• The exabyte unit of measurement is so large that it is not used to measure the capacity of data storage devices. Even the storage capacity of the largest cloud storage centers is measured in petabytes, which is a fraction of one exabyte. Instead, exabytes are used to measure the sum of multiple storage networks or the amount of data transferred over the Internet in a certain amount of time. For example, several hundred exabytes of data are transferred over the Internet
[Figure labels: 1 gigabyte, 1 terabyte, 1 petabyte, 1 exabyte]
When we consider the Environment & System, Actual capacity is 10^36
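The unit relationships quoted on this slide can be checked with a short Python sketch. It assumes decimal (SI) units throughout, as the slide does; the names `UNITS` and `convert` are illustrative only and are not part of any Atomic DB API.

```python
# Decimal (SI) storage units expressed in bytes.
UNITS = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}


def convert(amount: float, from_unit: str, to_unit: str) -> float:
    """Convert an amount between two of the units defined above."""
    return amount * UNITS[from_unit] / UNITS[to_unit]


if __name__ == "__main__":
    print("1 exabyte in bytes:", f"{UNITS['exabyte']:,}")
    print("1 exabyte in petabytes:", convert(1, "exabyte", "petabyte"))
    print("1 zettabyte in exabytes:", convert(1, "zettabyte", "exabyte"))
    print("1 petabyte as a fraction of an exabyte:", convert(1, "petabyte", "exabyte"))
```

The slide gives no unit for the 10^36 figure, so it is left out of the conversion table rather than guessed at.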
28. INTRODUCING ATOMIC-DB
The only Completely “Associative” Database in the World…
29. Atomic DB vs. NoSQL
Difference 8
Our Business Advantages
30. Key Benefits of Atomic DB
• Summarizing our technology is a complex task, as we are discussing a PARADIGM shift in the way data is both Stored and Retrieved.
• A few Key points:
• 100X faster than SQL on READS - CASE SENSITIVE (if required)
• 10X faster on WRITES - LITTLE or NO SUPPORT STAFF
• 1/3 the DISK SPACE usage - OBJECT ORIENTED DESIGN
• NO QUERIES to WRITE - 80% reduction in DEVELOPMENT TIME
• NO TABLES - 50-75% reduction in Development costs
• NO INDEXES - only 6 INSTRUCTIONS in the API
• NO VIEWS - one line of code access to your data
• NO WHITESPACE - Associate Anything to Anything
• NO DUPLICATES - DOD verified Security Model
• 1 to 100+ concurrent SOURCES of disparate DATA (ORACLE, MSSQL, ACCESS, DB2, EXCEL, Flat FILES (csv))
31. Atomic DB vs. NoSQL
Difference 9
Our Performance Advantages
32. Some Metrics: Let’s set the Stage
SYSTEM: (1) 4-core Intel processor, 4 GB RAM, (1) 5400 RPM 500 GB drive
Here are some calculations to set the stage:
A record with 50 columns of data represents 2500 triples, if you include both directions (which we do). Because every possible associative path is maintained, discovery of all associations is implicit from every data point.
We assimilate 1 million records of 50 columns of data in typically < 30 minutes (best case 10 minutes, avg 20 minutes).
That's the equivalent of 1,000,000 * 2500 triples, or 2.5 billion triples, in 30 minutes, worst case performance.
2.5 billion triples in 1800 seconds (30 minutes * 60 seconds per minute) is 1.389 million triples per second.
Because of the proprietary way we reference and store information as composite multi-dimensional information atoms, we are able to produce the functional equivalent of 2.5 billion triples in less than 30 minutes, operating with a sustained throughput of 30,000 composite 'atomic' transactions per second (world record = 18,000).
Since we don't store the triples as triples, yet maintain the equivalent 'associative' capability triples have, we can get a huge assimilation performance equivalent benefit over triple stores, with a better, faster and more efficient retrieval and storage.
www.tpc.org
33. MORE ATOMIC-DB
“Unlike other systems where a Structure is built to STORE data, here the “Data” is the Structure….”
36. Contact Information
• Jean Michel LeTennier jm@virtue-desk.com – 917-751-3131
• James Murphy james@virtue-desk.com – 646-408-4385
• Andre De Castro andre@virtue-desk.com – 917-548-9810
– http://www.virtue-desk.com