2. NoSQL
● umbrella term
● non-relational data storage
● no fixed table schemas
● a fresh take on database technology
3. Relational databases have issues handling big volumes of data
Some companies and the size of their databases:
● Digg.com - 3 TB for green badges
● Facebook - 50 TB for inbox search
● eBay - 2 PB (petabytes) in total
4. Issues
● horizontal scalability
● server performance
● rigid schemas
● distribution across servers
5. Characteristics of NoSQL
● no ACID guarantees (Atomicity, Consistency, Isolation, Durability)
● highly distributed
● scalable
● better performance - they don't have to handle relations
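A minimal sketch of what "highly distributed" can mean in practice: each key is hashed and the hash decides which server stores it. The node names and the helper functions `node_for` and `partition` are hypothetical, chosen only for illustration. This is plain modulo partitioning, not the scheme of any particular database.

```python
import hashlib

# Hypothetical node names; a real cluster would use actual host addresses.
NODES = ["node-a", "node-b", "node-c"]


def node_for(key, nodes=NODES):
    """Pick a node for a key by hashing the key (simple modulo partitioning)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def partition(keys, nodes=NODES):
    """Group keys by the node that would store them."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for(key, nodes)].append(key)
    return placement


placement = partition([f"user:{i}" for i in range(10)])
for node, keys in placement.items():
    print(node, keys)
```

With modulo partitioning most keys move to a different node whenever the number of nodes changes. Systems such as Dynamo use consistent hashing to limit that reshuffling.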
6. Examples of NoSQL databases:
● Google Bigtable (used heavily by many Google products)
● Amazon Dynamo (used by Amazon S3)
● Facebook Cassandra
● Apache HBase
● LinkedIn Voldemort
7. Some types of databases:
● Document-oriented databases
  ● JSON format, XML databases
  ● examples: CouchDB, BaseX
● Key-value databases
  ● values can be more than strings (e.g. sets of strings)
  ● examples: Redis, Cassandra
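The two data models above can be sketched with plain Python structures. This is only an illustration of the shapes involved. The field names, keys and values are made up, and no real database is contacted.

```python
import json

# Document-oriented: each record is a self-describing JSON document.
# Two documents in the same collection need not share the same fields.
documents = [
    {"_id": "post-1", "title": "Hello", "tags": ["intro"]},
    {"_id": "post-2", "title": "NoSQL", "author": {"name": "Ann"}, "views": 42},
]

# Key-value: a key maps to a value, which may be a plain string
# or something richer, such as a set of strings.
key_value = {
    "foo": "bar",
    "post-1:tags": {"intro", "welcome"},
}


def fields_of(doc):
    """Return the sorted field names of a document."""
    return sorted(doc.keys())


# A document survives a round trip through its JSON text form unchanged.
serialized = json.dumps(documents[1], sort_keys=True)
restored = json.loads(serialized)
```

Note that the two documents have different fields, which is what "no fixed table schemas" allows.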
8. CouchDB
● developed under the Apache Software Foundation
● written in Erlang
● open source
● document-oriented database
● stores data as a collection of JSON documents
9. ● queried via a REST API
● JavaScript is the default language for writing queries (views)
● also supported: PHP, Ruby, Python and Erlang
● built-in replication features
● used by Ubuntu One
11. Operations with these documents
● HTTP requests:
● GET (select), POST (create), PUT (update), DELETE (delete)
● HTTP AUTH
● Applications: curl, Futon
● JavaScript
● any application that can send HTTP requests
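The HTTP operations listed above can be sketched with Python's standard library. The code only builds the requests and does not send them, so it runs without a server. Assumptions: CouchDB listening on its default port 5984, a hypothetical database called `notes`, a hypothetical document ID `note-1`, and a placeholder revision `1-abc`. The helper `build_request` is also made up for this example.

```python
import json
import urllib.request

# Assumption: local CouchDB on the default port, hypothetical database "notes".
BASE_URL = "http://localhost:5984/notes"


def build_request(method, doc_id=None, body=None, rev=None):
    """Build (but do not send) an HTTP request for one document operation."""
    url = BASE_URL if doc_id is None else f"{BASE_URL}/{doc_id}"
    if rev is not None:
        url += f"?rev={rev}"
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(url, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


# POST to the database creates a document with a server-generated ID.
create = build_request("POST", body={"title": "first note"})
# GET on a document URL reads it.
read = build_request("GET", doc_id="note-1")
# PUT stores a new version; updates must name the current revision.
update = build_request("PUT", doc_id="note-1",
                       body={"_rev": "1-abc", "title": "edited"})
# DELETE also needs the current revision, passed here as a query parameter.
delete = build_request("DELETE", doc_id="note-1", rev="1-abc")

for req in (create, read, update, delete):
    print(req.get_method(), req.full_url)
```

To actually send one of these, pass it to `urllib.request.urlopen`, which requires a running server. The same requests can be issued with curl.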
13. Redis
● key-value database
● written in C
● open source
● networked
● in-memory
● persistent database
● similar to memcached
● data is non-volatile
14. ● atomic operations
● very high performance: ~100,000 operations/second with 50 parallel clients
● all data is kept in memory - blazing fast
● periodic synchronization to hard drive
● powerful replication
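The idea of keeping all data in memory and periodically synchronizing it to disk can be sketched as follows. This is a toy illustration, not how Redis is implemented: the class `SnapshotStore`, the JSON file format and the file name `dump.json` are assumptions made for the example.

```python
import json
import os
import tempfile


class SnapshotStore:
    """Toy in-memory store that can dump its contents to a snapshot file."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        self.dirty = 0  # number of writes since the last snapshot

    def set(self, key, value):
        self.data[key] = value
        self.dirty += 1

    def get(self, key):
        return self.data.get(key)

    def save(self):
        # Write to a temporary file first, then rename it into place, so a
        # crash in the middle of a write never leaves a half-written snapshot.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(self.data, fh)
        os.replace(tmp, self.path)
        self.dirty = 0

    @classmethod
    def load(cls, path):
        """Rebuild a store from the last snapshot, if one exists."""
        store = cls(path)
        if os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                store.data = json.load(fh)
        return store


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "dump.json")
    store = SnapshotStore(path)
    store.set("foo", "bar")
    store.save()  # a real server would trigger this on a timer
    restored = SnapshotStore.load(path)
    restored_value = restored.get("foo")
```

In this sketch, anything written after the last `save()` is lost if the process dies. That is the price of serving reads and writes from memory.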
15. ● bindings for a lot of languages: PHP, Ruby, Python, C, Java, etc.
SET foo bar
GET foo => bar
SET - insert
GET - select
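The SET and GET commands above can be mimicked with a few lines of Python. This is only a sketch of their semantics using a dictionary. The class `TinyKV` and its `execute` method are hypothetical, and no Redis server or client library is involved.

```python
class TinyKV:
    """Toy interpreter for the two commands shown above: SET and GET."""

    def __init__(self):
        self._data = {}

    def execute(self, line):
        parts = line.split()
        if not parts:
            raise ValueError("empty command")
        command, args = parts[0].upper(), parts[1:]
        if command == "SET" and len(args) == 2:
            # SET stores (or overwrites) the value under the key.
            self._data[args[0]] = args[1]
            return "OK"
        if command == "GET" and len(args) == 1:
            # GET returns the stored value, or None for a missing key.
            return self._data.get(args[0])
        raise ValueError(f"unsupported command: {line!r}")


kv = TinyKV()
set_reply = kv.execute("SET foo bar")
get_reply = kv.execute("GET foo")
print(set_reply, get_reply)
```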
16. Key-value databases have become very popular lately
Other key-value databases:
● Facebook's Cassandra (now also used by Digg)
● GT.M
● MemcacheDB (a persistence-enabled variant of memcached)
● LinkedIn Voldemort
17. Conclusion
● relational databases are not the holy grail of data storage
● scalability issues drove large corporations to look for other solutions
● don't believe the FUD: give NoSQL databases a try