3. Data Analytics System
●
KPI reporting system for Cloud System
●
GMO Apps Cloud
●
Over 500 Titles
mobage, gree, mixi, Hangame, facebook, nikoniko … etc
●
Data Center
Japan, US(west coast)
4. Analytics Specification
●
Social Game Data KPI
DAU/PV, Play Time, Sales
A/B Testing, Conversion … etc
●
Hourly, Daily, Weekly, Monthly
●
Since 2010/06 ~
5. System Architecture
SNS Game
User SNS Platform Master
Cloud System Management Monitoring
System System
Cloud Server
(Game Server)
Logging
Scheduler ・・・・・・・・
Server
MySQL
Hadoop/Hive
(for Hive)
Data Center A Data Center N
6. Specification, Statistics
●
Multiple NameNode per Data Center
●
Hardware Spacification
CPU : 8~16CPU(HT)
MEM: 12~64Gbyte
HD : RAID 1, 5, 1+0
●
Statistics
6,000,000 blocks/44,000 jobs/day
1,000 over AP servers logging
7. Data Flow
load data local inpath 'hogehoge-access_log.*.log.gz'
overwrite into table original_logs
partition (log_date='2012-07-26', log_number=13);
host string from deserializer
identity string from deserializer
user string from deserializer Cloud Server
time string from deserializer (Game Server)
method string from deserializer
request string from deserializer
status string from deserializer Logging
size string from deserializer Management
Server System
referer string from deserializer
agent string from deserializer
log_date string
log_number tinyint
Hadoop/Hive Scheduler
host string
time string
method string HiveDriver
request string
userid string
log_date string Filter → Hourly, Daily, Weekly, Monthly Report
log_number tinyint (AB Testing, Conversion, DAU..etc)
8. Conversion Count HQL
INSERT OVERWRITE TABLE conversion_click
PARTITION (log_date= :logDate, log_number=:logNumber)
SELECT regexp_extract(request, 'convid=([a-zA-Z0-9%])', 1),
regexp_extract(request, 'convflg=(A|B){1}', 1),
count(1),
:logMonth,
:logWeek
FROM parsed_log
WHERE request RLIKE 'convid=[a-zA-Z0-9%]'
AND request RLIKE 'convflg=(A|B){1}'
AND log_date = :logDate
AND log_number = :logNumber
GROUP BY regexp_extract(request, 'convid=([a-zA-Z0-9%])', 1),
regexp_extract(request, 'convflg=(A|B){1}', 1)