5. What is Yahoo! JAPAN?
Many Strong Services
Media
US
Search Video Answer Mail
JP
US
JP
Membership C2C Payment C2C EC B2C EC Local
Search Knowledge search Mail News
YAHUOKU! Premium Wallet Loco
6. What is the NoSQL Team?
NoSQL Team: 300+ systems, 100+ services
7. Cassandra @ Yahoo! JAPAN
[Timeline, 2010 to 2018: Cassandra versions in use. Service Departments: 0.5, 0.8, 1.x. NoSQL Team: 0.8, 1.x, 2.x, 3.x.]
8. Cassandra @ Yahoo! JAPAN
As of 2017:
• 50 clusters
• 50 TB in use
• 2000+ nodes
• 500,000 reads/sec
• 500,000 writes/sec
• Cluster sizes: 10 nodes/cluster, 200 nodes/cluster, …
• 1 shared cluster: 50 systems
• 50 special clusters: 50 systems
• 3 DCs
10. Before Cassandra
• Problems: Inappropriate usage of internal platforms
and new demands for storing more big data.
2012
Key Value Store Team: "Don’t store any big key data!!"
Search Engine Team: "Don’t use it like a key value store!!"
Services: "We store data of our services on your platforms."
Services: "We want to store more big data of our new services easily."
11. NoSQL Team
• Launched NoSQL Team in 2012
[Diagram: members of the Key Value Store Team, the Search Engine Team, and the Service Departments join the new NoSQL Team.]
NoSQL Team: "We should build a new centralized platform for more big data!!"
However, many open source NoSQL databases had already been released,
so we had to evaluate them.
12. NoSQL Team
• The NoSQL Team selected Cassandra as our
first centralized NoSQL database (2012).
Function Point Analysis: Cassandra ranked No. 1 across the following criteria.
• High Availability
• Performance
• Persistence
• Scalability
• …..
• Maintainability
• Appropriate Open Source License
• …..
15. NewSQL Trends
• NewSQL
= NoSQL (Scalability) + RDBMS (SQL, ACID)
= Scalable RDBMS like NoSQL
16. Requests for NewSQL
Services: requests for NewSQL
NoSQL Team: "We have big on-premises data centers, and we can’t use the NewSQL platforms in our private cloud."
NoSQL Team: "We want to make use of our knowledge and experience with Cassandra."
Keywords: Public Cloud, OSS, Private Cloud, Knowledge, Experience
17. NewSQL with Cassandra
Products compared: Google Spanner, Amazon Aurora, MariaDB, CockroachDB
Functions compared: Logging, Query Engine, Transaction, Schema, Store, Storage
NoSQL Team: "Could we use Cassandra for the storage layer of NewSQL databases?"
19. Trial Concept
• OSS SQL Engine + Distributed Storage
= PostgreSQL + Cassandra or
SQLite + Cassandra
Functions (trial): Logging, Query Engine, Transaction, Schema, Store, Storage
NoSQL Team: "Could we replace the storage layer of SQL databases with Cassandra?"
20. Study Implementation
NoSQL Team: "PostgreSQL’s storage is abstracted as the storage manager (Storage Manager), but ….."
NoSQL Team: "SQLite’s storage is abstracted as the virtual file system (Virtual File System) too, but ….."
NoSQL Team: "Implementing the abstract functions directly is hard to debug …."
21. POSIX Emulation
#define open(path, flags, mode) posix_vfs_cassandra_open(path, flags, mode)
#define close(fd) posix_vfs_cassandra_close(fd)
#define read(fd, buf, nbytes) posix_vfs_cassandra_read(fd, buf, nbytes)
#define write(fd, buf, nbytes) posix_vfs_cassandra_write(fd, buf, nbytes)
#define access(path, mode) posix_vfs_cassandra_access(path, mode)
#define unlink(path) posix_vfs_cassandra_unlink(path)
#define fstat(fd, buf) posix_vfs_cassandra_fstat(fd, buf)
#define fsync(fd) posix_vfs_cassandra_fsync(fd)
#define lseek(fd, offset, whence) posix_vfs_cassandra_lseek(fd, offset, whence)
The storage layers of PostgreSQL and SQLite are
implemented using POSIX file I/O functions.
NoSQL Team: "Develop a Cassandra-backed library that is compliant with the POSIX file I/O functions,
and replace the POSIX functions with the Cassandra-compliant functions."
[Diagram: Storage Manager and Virtual File System call POSIX file I/O functions; those calls are replaced by the compliant file I/O functions.]
NoSQL Team: "This implementation method makes it easy to write unit tests,
and it is easy to debug too."
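To illustrate why redirecting the POSIX names makes unit testing easy, here is a minimal sketch. It does not reproduce the team's library: the `fake_vfs_*` functions, `FAKE_FD`, and `store_and_reload` are hypothetical names, and an in-memory buffer stands in for Cassandra. The point is only that engine-side code written against `write`, `lseek`, and `read` can be exercised without a real file or cluster once the macros are in place.

```c
#include <stddef.h>
#include <stdio.h>      /* SEEK_SET */
#include <string.h>
#include <sys/types.h>  /* ssize_t, off_t */

/* Hypothetical in-memory stand-in for the Cassandra-backed functions:
   a single fixed-capacity "file" reachable through one descriptor. */
enum { FAKE_FD = 3, FAKE_CAPACITY = 1024 };

static char   fake_data[FAKE_CAPACITY];
static size_t fake_size = 0;  /* bytes stored so far */
static size_t fake_pos  = 0;  /* current file offset */

static ssize_t fake_vfs_write(int fd, const void *buf, size_t nbytes)
{
    if (fd != FAKE_FD || fake_pos + nbytes > FAKE_CAPACITY) return -1;
    memcpy(fake_data + fake_pos, buf, nbytes);
    fake_pos += nbytes;
    if (fake_pos > fake_size) fake_size = fake_pos;
    return (ssize_t)nbytes;
}

static ssize_t fake_vfs_read(int fd, void *buf, size_t nbytes)
{
    if (fd != FAKE_FD) return -1;
    size_t left = fake_size - fake_pos;
    if (nbytes > left) nbytes = left;  /* short read at end of file */
    memcpy(buf, fake_data + fake_pos, nbytes);
    fake_pos += nbytes;
    return (ssize_t)nbytes;
}

/* Only SEEK_SET is supported in this sketch. */
static off_t fake_vfs_lseek(int fd, off_t offset, int whence)
{
    if (fd != FAKE_FD || whence != SEEK_SET ||
        offset < 0 || (size_t)offset > fake_size) return (off_t)-1;
    fake_pos = (size_t)offset;
    return offset;
}

/* Same redirection idea as the slide's macros, pointed at the stand-in. */
#define write(fd, buf, nbytes)    fake_vfs_write(fd, buf, nbytes)
#define read(fd, buf, nbytes)     fake_vfs_read(fd, buf, nbytes)
#define lseek(fd, offset, whence) fake_vfs_lseek(fd, offset, whence)

/* Engine-style code: uses only the POSIX names, unaware of the backend. */
static int store_and_reload(const char *msg, char *out, size_t outlen)
{
    size_t n = strlen(msg);
    if (write(FAKE_FD, msg, n) != (ssize_t)n) return -1;
    if (lseek(FAKE_FD, 0, SEEK_SET) != 0) return -1;
    ssize_t got = read(FAKE_FD, out, outlen - 1);
    if (got < 0) return -1;
    out[got] = '\0';
    return (int)got;
}
```

Because the engine-side function never names the backend, the same test could later be pointed at the Cassandra-backed functions by changing only the macro targets.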
22. Cassandra File Management
CREATE TABLE IF NOT EXISTS posix.storage (
path varchar,
block_no bigint,
block blob,
PRIMARY KEY (path, block_no));
[Diagram: the SQL engines' Storage Manager and Virtual File System see files (File A, File B, ....., File N); each file is stored as a sequence of blocks (Block 0, Block 1, Block 2, …..).]
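The file-to-block layout implied by the `(path, block_no)` primary key can be sketched as a small piece of arithmetic: a byte range in a file maps to a contiguous range of `block_no` values. This is an illustration under assumptions, not the team's code. The 4096-byte block size is one of the sizes named in the benchmark, and `block_range` and `blocks_for` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed block size; the benchmark slide lists 1KB, 4KB, and 8KB. */
enum { BLOCK_SIZE = 4096 };

/* Which rows of posix.storage a byte range of one file touches. */
typedef struct {
    int64_t first_block;   /* block_no of the first row touched */
    int64_t last_block;    /* block_no of the last row touched */
    size_t  first_offset;  /* byte offset inside the first block */
} block_range;

/* Map a (file offset, length) pair to the block_no range it covers. */
static block_range blocks_for(int64_t offset, size_t nbytes)
{
    assert(offset >= 0 && nbytes > 0);
    block_range r;
    r.first_block  = offset / BLOCK_SIZE;
    r.last_block   = (offset + (int64_t)nbytes - 1) / BLOCK_SIZE;
    r.first_offset = (size_t)(offset % BLOCK_SIZE);
    return r;
}
```

For example, a 200-byte access at offset 4000 starts in block 0 and ends in block 1, so a read or write of that range would need both rows for the file's `path`.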
23. Benchmark
[Bar chart: INSERT, SELECT, and UPDATE results for SQLite (Disk), SQLite+C* (1KB), SQLite+C* (4KB), and SQLite+C* (8KB); vertical axis from 0 to 35, units not shown.]
• Naive Implementation
X : Multi-threads
X : Async Requests
• Don’t care
X : Only Storage Layer
X : Access Conflict
(v3.20.1 + speedtest.tcl)
NoSQL Team: "This is a very naive and rough implementation of a
distributed database for now, but …."