CyberAgent is a leading Internet company in Japan focused on smartphone social communities and a game platform known as Ameba, which has 40M users. In this presentation, we will introduce how we use HBase for storing social graph data and as a basis for ad systems, user monitoring, log analysis, and recommendation systems.
2. Who We Are
● Hirotaka Kakishima
o Database Engineer, CyberAgent, Inc.
● Toshihiro Suzuki
o Software Engineer, CyberAgent, Inc.
o Worked on HBase since 2012
o @brfrn169
3. Who We Are
We authored
Beginner’s Guide to HBase
in Japanese
10. Ranking of Domestic Internet Services
Desktop / Smartphone rankings, by Nielsen, 2014
http://www.nielsen.com/jp/ja/insights/newswire-j/press-release-chart/nielsen-news-release-20141216.html
[Table: Rank | Website Name | Monthly Unique Visitors, listed separately for Desktop and Smartphone]
15. We Use HBase for
Log Analysis
Social Graph
Recommendations
Advertising Tech
16. Our HBase History (1st Gen.)
● For Log Analysis
● HBase 0.90 (CDH3)
[Diagram: logs → transfer via Flume or SCP → HDFS sink → M/R & store results → our web application]
17. Our HBase History (2nd Gen.)
● For Social Graph Database, 24/7
● HBase 0.92 (CDH4b1), HDFS CDH3u3
● NameNode using Fault Tolerant Server
http://www.nec.com/en/global/prod/express/fault_tolerant/technology.html
18. Our HBase History (2nd Gen.)
● Replication using our own WAL-apply method
● 10TB (not considering HDFS Replicas)
● 6 million requests per minute
● Average Latency < 20ms
19. Our HBase History (3rd Gen.)
● For other social graph, recommendations
● HBase 0.94 (CDH4.2 to CDH4.7)
● NameNode HA
● Provisioning with Chef
● Master-slave replication (some clusters have the HBASE-8207 patch applied)
21. Currently
● 10 Clusters in Production
● 10 ~ 50 RegionServers / Cluster
● Uptime:
o 16 months (0.92): Social Graph
o 24 months (0.94): Other Social Graph
o 2 months (0.98): Advertising Tech
22. We Cherish the Basics
● Learning architecture
● Considering Table Schema (very important)
● Having enough RAM, DISKs, Network Bandwidth
● Splitting large regions and running major compaction at off-peak
● Monitoring metrics & tuning configuration parameters
● Catching up on bug reports in JIRA
31. Data Model
● Property Graph
[Diagram: three nodes connected by "follow" relationships. node1 {name: Taro, age: 24}, node2 {name: Ichiro, age: 31}, node3 {name: Jiro, age: 54}; the relationships carry date properties 5/7, 4/1, and 3/31.]
75. Coprocessor
● Endpoints
o like a stored procedure in an RDBMS
o push your business logic into the RegionServer
● Observers
o like a trigger in an RDBMS
o insert user code by overriding upcall methods
76. Using Observers
● We use 2 observers
o WALObserver#postWALWrite
o RegionObserver#postWALRestore
● Both run the same logic
o writing the INCOMING row
● This ensures eventual consistency
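The observer mechanism above can be modelled as: whenever an OUTGOING cell reaches the WAL (or is replayed from it), the hook derives and writes the mirror INCOMING cell on the other node's row. This is an illustrative Python toy, not the actual Java coprocessor code; the "direction:type:nodeId" qualifier format is an assumption.

```python
# Toy model of the observer-based consistency fix: the same hook runs
# after a WAL write (WALObserver#postWALWrite) and after a WAL replay
# (RegionObserver#postWALRestore), so the INCOMING row is eventually
# written even if the RegionServer dies mid-request.

store = {}  # (node_id, qualifier) -> value; stands in for the HBase table

def mirror(node_id, qualifier):
    # "OUTGOING:follow:node2" written on node1's row
    #   -> "INCOMING:follow:node1" written on node2's row
    direction, rel_type, other = qualifier.split(":")
    flipped = "INCOMING" if direction == "OUTGOING" else "OUTGOING"
    return other, "%s:%s:%s" % (flipped, rel_type, node_id)

def post_wal_hook(node_id, qualifier, value):
    # Shared logic for postWALWrite and postWALRestore.
    store[mirror(node_id, qualifier)] = value

def put_outgoing(node_id, qualifier, value):
    store[(node_id, qualifier)] = value       # client writes only this row
    post_wal_hook(node_id, qualifier, value)  # observer adds the mirror row

put_outgoing("node1", "OUTGOING:follow:node2", b"date=5/7")
```

Because the same hook also fires during WAL replay after a crash, the INCOMING row is written eventually even when the original RegionServer dies between the WAL write and the mirror write.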
83-85. Using Observers (Abnormal Case)
[Diagram: Client → RegionServer (Memstore, WALObserver#postWALWrite) → WALs in HDFS]
1, write only an OUTGOING row
2, write to Memstore
3, write the WAL to HDFS
86-88. Using Observers (Abnormal Case)
[Diagram: another RegionServer (Memstore, RegionObserver#postWALRestore) ← WALs in HDFS]
1, replay the WAL of the OUTGOING row
2, write the INCOMING row
89. Summary
● We have used HBase in several projects
o Log Analysis, Social Graph, Recommendations,
Advertising tech
● We developed a graph database built on HBase
o HBase is good for storing social graphs
o We use coprocessors to resolve consistency problems
90. If you have any questions,
please tweet @brfrn169.
Questions
Editor's Notes
Hi, thank you for coming to this session.
Today, we are going to talk to you about HBase @ CyberAgent.
I am Hirotaka Kakishima. I work for CyberAgent as a Database Engineer, and I will present the first part of this talk.
And the second part of this talk will be done by Toshihiro Suzuki. He is a Software Engineer at CyberAgent.
We authored a Beginner’s Guide to HBase in Japanese this year.
Our office is located in Akihabara, Japan.
This is today’s agenda.
We are going to introduce our company and services.
And we will talk about our hbase history as well as our use case of HBase.
About CyberAgent
CyberAgent is an internet service company in Japan.
Our businesses are Advertising, Games, and Ameba.
We have more than 30% of the smartphone advertising market in Japan.
We provide smartphone games for iOS, Android, and Web Browsers.
Another big business is Ameba.
What’s Ameba?
Ameba is a Blog, Social Networking and Game service platform.
We have 40 million Ameba users.
Here’s the ranking of domestic internet services by the number of visitors in Japan announced by Nielsen last year.
We ranked 10th in desktop visitors ranking and 9th in smartphone visitor ranking.
To give you a better idea about Ameba, we will introduce Ameba Blog and Ameba Pigg.
This is “Ameba Blog”.
It is used by tens of thousands of Japanese celebrities, such as TV personalities, athletes, and politicians.
We have more than 1.9 billion blog articles as of September 2014.
This is “Ameba Pigg”. It is a 2D virtual world.
You can create your avatar, chat, go fishing and much more in this virtual world.
And we have more services on our platform.
Now we will explain how we use HBase @ CyberAgent.
We use HBase for Social Graph , Recommendations, Advertising technology, and Log Analysis.
Toshihiro will talk about how we use HBase as a Social Graph Database later.
I will talk about our HBase history.
We have used HBase since 2011.
Originally, we used HDFS and HBase for Log analysis.
We transferred logs using Flume and stored them in HDFS.
Then we ran M/R jobs through Hive and stored the results in HBase.
Finally, our analysts and managers obtained the results through our web application.
We deployed HBase 0.90 with CDH3 on physical servers.
This is how we got our first know-how of HDFS and HBase.
Next, we tried HBase for 24/7 Online Social Graph Database.
This time we used HBase 0.92, but because of performance problems, we switched to a different CDH version for HDFS.
In this version, NameNode didn’t have HA functionality.
So we used a Fault Tolerant Server from NEC.
Because of bugs in HBase replication, we copied WALs to backup clusters using our own method.
We are still using this method on one cluster.
We have 10TB of social graph data (not counting HDFS replicas) and serve 6 million requests per minute.
Average latency is less than 20ms.
Next is the 3rd Generation.
Here we upgraded our log analysis system and deployed more clusters for recommendations, trend detection, and another social graph.
We used HBase 0.94 with Namenode HA.
And we provisioned clusters with Chef.
Next, we replicated data between HBase clusters using master-slave replication.
But because many of our hostnames include hyphens, some clusters had the HBASE-8207 patch applied.
Recently, we started using HBase 0.98 for Advertising technology.
We deployed clusters with Master-Master replication in Amazon EC2.
And we started using Cloudera Manager to install, configure and keep the cluster up and running.
Currently we have 10 Clusters in Production.
And each cluster has between 10 and 50 Region Servers.
Almost all clusters have been stable for over a year.
To keep HBase running stably, we cherish the basics:
Learning architecture
Considering Table Schema (very important)
Having enough RAM, DISKs, Network Bandwidth
Splitting large regions and running major compaction at off-peak
Monitoring metrics & tuning configuration parameters
Catching up on bug reports in JIRA
Next, we are going to migrate our clusters from 0.92 to 1.0 this year.
From now, Toshihiro will continue this presentation.
He will talk about how we use HBase as a Social Graph Database.
Thank you.
Hello, everyone.
My name is Toshihiro Suzuki.
I'm going to talk about the Ameba’s social graph, one of the systems where we extensively use HBase.
We provide a platform for smartphone applications where a lot of services are running.
For example, games, social networking and message board services.
There is a lot of graph data such as users and connections between users like friends and followers.
So we needed a large scale graph database when we began the development of the platform.
Our requirements for the graph database are scalability, high availability and low latency.
First, the graph database has to be scalable because web services can grow rapidly and unpredictably.
Second, our services are used 24/7. So the graph database needs to be highly available.
If a service goes down, it not only reduces our sales but also discourages our users.
In addition, our applications have strict response time requirements because they are user-facing applications for online access.
So the graph database has to have low latency.
So we considered using HBase.
HBase is designed for distributed environments and has auto-sharding and auto-failover, so administering HBase is relatively easy.
And HBase can scale by adding more RegionServers to the cluster as needed.
With auto failover, HBase can recover quickly if any RegionServer goes down.
Also, HBase provides low latency access.
After considerable research and experimentation, we decided to use HBase and developed a graph database built on it.
Next I'll talk about how we use HBase as a Graph Database.
Here is the system overview of our graph database.
When accessing graph data, clients don’t communicate with HBase directly, but via Gateways.
Gateways talk to HBase when storing or retrieving graph data.
Next I will explain about Data Model.
The graph database provides Property Graph Model.
In this model, there are nodes and relationships that are the connection between nodes.
A relationship has a type and a direction.
In this picture, there are 3 nodes -- "node1", "node2" and "node3", and 3 relationships.
This relationship has a "follow" type and a direction from "node1" to "node2".
This relationship has a "follow" type and a direction from "node2" to "node3".
Nodes and relationships also have properties in key-value format.
In this picture, "node1" has 2 properties, name:Taro and age:24, and this relationship has a property, date:May 7th.
Here is the graph database’s API.
It’s very simple.
First, you create a graph object.
Next, you call addNode method to create a Node, and set a property “name” and its value “Taro”.
After that, you create another node and set a property “name” with the value “Ichiro”.
Then, you add a relationship from “node1” to “node2” with the type “follow”, and set a property “date” and its value.
Next, you can get the outgoing relationships from “node1”.
Finally, you can get the incoming relationships to “node2”.
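The API walkthrough above can be modelled with a small in-memory sketch. This is illustrative Python, not the actual Java API of the graph database; the method names follow the talk, but the signatures and return values are assumptions.

```python
# Toy in-memory model of the property-graph API described in the talk.
# Method names mirror the walkthrough; signatures are assumptions.

class Graph:
    def __init__(self):
        self.nodes = {}   # node_id -> properties dict
        self.rels = []    # (src, dst, rel_type, properties)
        self._next_id = 0

    def add_node(self, **properties):
        # Create a node with key-value properties; return its id.
        self._next_id += 1
        node_id = "node%d" % self._next_id
        self.nodes[node_id] = dict(properties)
        return node_id

    def add_relationship(self, src, dst, rel_type, **properties):
        # A relationship has a type, a direction (src -> dst), and properties.
        self.rels.append((src, dst, rel_type, dict(properties)))

    def get_outgoing_relationships(self, node_id):
        return [r for r in self.rels if r[0] == node_id]

    def get_incoming_relationships(self, node_id):
        return [r for r in self.rels if r[1] == node_id]

graph = Graph()
node1 = graph.add_node(name="Taro", age=24)
node2 = graph.add_node(name="Ichiro", age=31)
graph.add_relationship(node1, node2, "follow", date="5/7")
```

With this model, `get_outgoing_relationships(node1)` and `get_incoming_relationships(node2)` both return the single "follow" relationship, matching the last two steps of the walkthrough.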
Here is the graph database schema design.
A row key consists of a hash value of a node id and the node id.
There are 2 Column Families "n" and "r".
All nodes are stored with ColumnFamily "n" and empty Qualifier.
All relationships are stored with ColumnFamily "r" and a Qualifier that consists of a direction, a type, and a node id.
Properties are serialized and stored as Value.
For example, you create 3 nodes -- “node1”, “node2” and “node3” -- and set “name” properties on them.
And in HBase,
As you can see, the node data are stored in HBase like this.
As mentioned before, the row key consists of a hash value of a node id and the node id.
The Node’s Column Family is “n” and the Qualifier is empty.
Properties are serialized and stored as Value.
Then, you create 3 relationships and set “date” properties on them.
And this is how they are reflected in HBase.
As you can see, the relationship’s row key is the same as the node’s.
The Column Family is “r” and the Qualifier consists of a direction (“OUTGOING” or “INCOMING”), the type (“follow”), and a node id.
Similar to nodes, properties are serialized and stored as Value.
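The key layout described above can be sketched as follows. The talk only says the row key is "a hash value of a node id and the node id"; the specific hash (MD5), prefix width, and the ":" delimiter in the qualifier are assumptions for illustration, not the production format.

```python
import hashlib

def row_key(node_id):
    # Row key = hash of the node id + the node id itself, so rows spread
    # evenly across regions while the id stays recoverable from the key.
    digest = hashlib.md5(node_id.encode()).hexdigest()[:8]  # assumed hash/width
    return digest + node_id

def node_cell(node_id):
    # Nodes: column family "n", empty qualifier.
    return (row_key(node_id), "n", "")

def relationship_cell(src, direction, rel_type, other):
    # Relationships: column family "r",
    # qualifier = direction + type + other node id (":" delimiter assumed).
    assert direction in ("OUTGOING", "INCOMING")
    return (row_key(src), "r", "%s:%s:%s" % (direction, rel_type, other))

# A follow from nodeId1 to nodeId2 is stored as two cells, one per endpoint:
out_cell = relationship_cell("nodeId1", "OUTGOING", "follow", "nodeId2")
in_cell = relationship_cell("nodeId2", "INCOMING", "follow", "nodeId1")
```

Storing both directions on their own rows is what makes single-row scans for either direction possible, and is also what creates the cross-row consistency problem discussed later.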
The next example is how to get “OUTGOING” relationships.
When you want to get the “OUTGOING” relationships from “node1”, you can scan with the row key (the hash value of “nodeId1” plus “nodeId1”), the column family “r”, and a qualifier prefix of “OUTGOING” and “follow”.
Then you can get these relationships.
Next,
When you want to get the “INCOMING” relationships to “node2”, you can scan with the row key (the hash value of “nodeId2” plus “nodeId2”), the column family “r”, and a qualifier prefix of “INCOMING” and “follow”.
Then you can get these relationships.
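Because HBase keeps qualifiers sorted within a row, fetching one direction is a single-row scan over a contiguous qualifier range. A minimal Python simulation of that prefix scan (the "direction:type:nodeId" qualifier layout is the assumed format from the schema sketch, not the production encoding):

```python
# Simulate a single-row qualifier-prefix scan, as used to fetch all
# "follow" relationships in one direction from a node's row.

row = {  # qualifier -> serialized properties, for one node's "r" family
    "OUTGOING:follow:nodeId2": b"props-a",
    "OUTGOING:follow:nodeId3": b"props-b",
    "INCOMING:follow:nodeId4": b"props-c",
}

def prefix_scan(row, prefix):
    # HBase stores qualifiers in sorted order, so a prefix scan returns
    # the contiguous run of qualifiers starting with the prefix.
    return {q: v for q, v in sorted(row.items()) if q.startswith(prefix)}

outgoing = prefix_scan(row, "OUTGOING:follow:")
incoming = prefix_scan(row, "INCOMING:follow:")
```

In the real Java client this corresponds to a `Scan` restricted to the node's row key with a column-prefix condition on family "r"; the toy dict above only illustrates why the layout makes the read a single cheap row access.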
There is a potential consistency problem.
As you know, HBase has no native cross-row transactional support.
So there is a possibility of inconsistency between outgoing and incoming rows.
For instance, if the system goes down while you are adding a relationship, a data inconsistency between the outgoing and incoming rows may occur.
To resolve this kind of problem, we use Coprocessors.
Coprocessors come in two flavors: Endpoints and Observers.
Endpoints are like stored procedures in an RDBMS: you can push your business logic into the RegionServer.
Observers are like triggers in an RDBMS: you can insert user code by overriding upcall methods.
We use observers to resolve inconsistency problems.
We use two observers, postWALWrite method of WALObserver and postWALRestore method of RegionObserver.
The postWALWrite method is hooked after writing to WAL.
And the postWALRestore method is hooked after restoring a WAL during failover.
We implement these observers to insert the same logic for writing an INCOMING row.
Thus we ensure eventual consistency between incoming and outgoing rows.
Next I’ll show you how we use observers to resolve inconsistency problems with this animation.
First, let’s look at the normal case.
The client sends a put request to RegionServer to write only an outgoing row.
Then, the RegionServer writes the data to the Memstore and then to the WAL in HDFS.
Then, RegionServer executes our logic in postWALWrite method of WALObserver and it writes the incoming row.
Finally, RegionServer responds to the client.
Normally, we ensure consistency like this.
Next, let’s consider a failure.
First of all, the client sends a put request to RegionServer to write only an outgoing row.
Then, the RegionServer writes the data to the Memstore and then to the WAL in HDFS.
If the RegionServer goes down at that point, our logic in the postWALWrite method isn’t executed and the incoming row isn’t written.
So a data inconsistency would occur.
Our logic in the postWALRestore method of RegionObserver resolves this problem.
In HBase, when a RegionServer goes down, another RegionServer restores data from the WALs.
If that RegionServer replays the WAL of an outgoing row, our logic in the postWALRestore method is executed and it writes the incoming row.
As a result, the data inconsistency doesn’t occur even if a RegionServer goes down.
Summary,
We have used HBase in several projects: Log Analysis, Social Graph, Recommendations, and Advertising technology.
And I talked about Social Graph, which is one of our use cases.
In our experience, HBase is good for storing social graphs.
And we use coprocessors to resolve consistency problems.
Thank you for listening.
If you have any questions, please tweet @brfrn169.
Thank you.