How to write a service whose support does not turn into hell / Anton Reznikov, Igor Munkin (Mail.Ru Group)
4. Why do we need logs?
✱ Investigation
✱ Debugging
✱ Statistics
✱ Monitoring
27. Who framed Roger Rabbit
2017-10-31 17:34:04 Jessica Rabbit <SAVE>
2017-10-31 17:34:04 Judge Doom <SAVE>
2017-10-31 17:34:04 Roger Rabbit <SAVE>
29. Who framed Roger Rabbit
2017-10-31 17:34:04.010 Jessica Rabbit <SAVE> #1
2017-10-31 17:34:04.990 Judge Doom <SAVE> #3
2017-10-31 17:34:04.550 Roger Rabbit <SAVE> #2
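With one-second resolution the three SAVE events are indistinguishable; with milliseconds their real order can be recovered even if the file stores them out of order. A minimal sketch, assuming the fixed-width timestamp layout shown on the slide (the helper names `parse_timestamp` and `who` are illustrative, not from the talk):

```python
from datetime import datetime

# Log lines from the slide, in the order they appear in the file.
lines = [
    "2017-10-31 17:34:04.010 Jessica Rabbit <SAVE>",
    "2017-10-31 17:34:04.990 Judge Doom <SAVE>",
    "2017-10-31 17:34:04.550 Roger Rabbit <SAVE>",
]


def parse_timestamp(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.mmm' prefix (23 characters)."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")


def who(line: str) -> str:
    """Return the actor name between the timestamp and the trailing tag."""
    return line[24:].rsplit(" ", 1)[0]


# Sorting by the parsed timestamp restores the actual order of the saves.
ordered = sorted(lines, key=parse_timestamp)
names_in_order = [who(line) for line in ordered]
print(names_in_order)
```

The last writer is the one whose data survives, which is exactly what second-level timestamps hide.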
33. Race
***
2017-10-31 17:34:04.540 Duplicate key exists in unique index 0
2017-10-31 17:34:04.545 [END=500] user/create time: 0.300s ...
2017-10-31 17:34:04.595 [END=200] user/create time: 0.400s ...
***
37. Race
2017-10-31 17:34:04.195 START user/create
***
2017-10-31 17:34:04.245 START user/create
***
***
***
***
***
***
***
***
***
***
***
***
2017-10-31 17:34:04.545 [END=500] user/create time: 0.300s ...
***
2017-10-31 17:34:04.595 [END=200] user/create time: 0.400s ...
38. Race
2017-10-31 17:34:04.195 START user/create
2017-10-31 17:34:04.217 Auth
2017-10-31 17:34:04.245 START user/create
2017-10-31 17:34:04.247 Cache lookup
2017-10-31 17:34:04.256 Auth
2017-10-31 17:34:04.256 Select
2017-10-31 17:34:04.256 Cache lookup
2017-10-31 17:34:04.256 Get profile
2017-10-31 17:34:04.256 Select
2017-10-31 17:34:04.256 Lock
2017-10-31 17:34:04.256 Insert
2017-10-31 17:34:04.256 Get profile
2017-10-31 17:34:04.256 Unlock
2017-10-31 17:34:04.256 Lock
2017-10-31 17:34:04.256 Cache update
2017-10-31 17:34:04.545 [END=500] user/create time: 0.300s ...
2017-10-31 17:34:04.569 Init tree
2017-10-31 17:34:04.595 [END=200] user/create time: 0.400s ...
39. Race
2017-10-31 17:34:04.195 [Jgz36] START user/create
2017-10-31 17:34:04.217 [Jgz36] Auth
2017-10-31 17:34:04.245 [W4IL6] START user/create
2017-10-31 17:34:04.247 [Jgz36] Cache lookup
2017-10-31 17:34:04.256 [W4IL6] Auth
2017-10-31 17:34:04.256 [Jgz36] Select
2017-10-31 17:34:04.256 [W4IL6] Cache lookup
2017-10-31 17:34:04.256 [Jgz36] Get profile
2017-10-31 17:34:04.256 [W4IL6] Select
2017-10-31 17:34:04.256 [Jgz36] Lock
2017-10-31 17:34:04.256 [Jgz36] Insert
2017-10-31 17:34:04.256 [W4IL6] Get profile
2017-10-31 17:34:04.256 [Jgz36] Unlock
2017-10-31 17:34:04.256 [W4IL6] Lock
2017-10-31 17:34:04.256 [Jgz36] Cache update
2017-10-31 17:34:04.545 [W4IL6] [END=500] user/create time: 0.300s ...
2017-10-31 17:34:04.569 [Jgz36] Init tree
2017-10-31 17:34:04.595 [Jgz36] [END=200] user/create time: 0.400s ...
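Tagging every line with a per-request ID is what makes the interleaved log above readable. One possible way to get this in Python is a `contextvars` variable plus a `logging.Filter`; this is a sketch, not the speakers' implementation. The logger name, `handle_user_create`, and the ID generation scheme are assumptions made for illustration.

```python
import contextvars
import io
import logging
import secrets

# Holds the ID of the request currently being processed ("-" outside a request).
request_id = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every record so the formatter can use it."""

    def filter(self, record):
        record.request_id = request_id.get()
        return True


def make_logger(stream):
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s.%(msecs)03d [%(request_id)s] %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    ))
    handler.addFilter(RequestIdFilter())
    logger = logging.getLogger("user_create_demo")  # hypothetical logger name
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_user_create(logger, rid=None):
    """Hypothetical handler: every line it logs carries the same request ID."""
    token = request_id.set(rid or secrets.token_urlsafe(4)[:5])
    try:
        logger.info("START user/create")
        logger.info("Auth")
        logger.info("[END=200] user/create")
    finally:
        request_id.reset(token)


buf = io.StringIO()
log = make_logger(buf)
handle_user_create(log, "Jgz36")  # IDs fixed here only to mirror the slide
handle_user_create(log, "W4IL6")
output_lines = buf.getvalue().splitlines()
print("\n".join(output_lines))
```

With the ID in place, `grep Jgz36` is enough to pull one request's full story out of a busy log.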
63. Long connection story
17:30:05.010 +0.000 GET /HLJ2017.hief
17:30:05.054 +0.044 Auth
***
18:00:00.010 +1795.000 done:45%, wb:100%, r:860kB/s
***
18:20:00.010 +2995.000 done:85%, wb:50%, r:460kB/s
***
18:37:39.990 +4054.980 END=200
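The `+seconds` column is the time elapsed since the request started, which lets a reader judge a long-lived connection without subtracting wall-clock timestamps by hand. A minimal sketch of producing such a column, assuming a hypothetical `ConnectionLog` helper with an injectable clock (the fake clock below exists only to make the example deterministic):

```python
import time


class ConnectionLog:
    """Collects log lines prefixed with the seconds elapsed since creation."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()  # moment the connection was accepted
        self.lines = []

    def event(self, message):
        elapsed = self._clock() - self._start
        line = f"+{elapsed:.3f} {message}"
        self.lines.append(line)
        return line


# Fake clock: the first value is consumed by __init__, the rest by event().
ticks = iter([0.0, 0.0, 0.044, 1795.0])
conn = ConnectionLog(clock=lambda: next(ticks))
conn.event("GET /HLJ2017.hief")
conn.event("Auth")
conn.event("done:45%, wb:100%, r:860kB/s")
print("\n".join(conn.lines))
```

A monotonic clock is used by default so that system clock adjustments during a multi-hour download cannot produce negative or misleading offsets.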
78. grep | grep | sort | sed | xargs | awk | sort | uniq | sort | head | xargs | grep | cut | awk
Video
Options
Storage degradation
Video conversion problems
Network problems
Cache node degradation
80. Video
Let's log it!
... ffmpeg t: 3.3s, q: 1080p, p: 27 ...
... upload t: 3.5s, sz: 5Mb, r: 1.4kB/s n: 3 ...
Let's graph it!
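Once each stage logs its timings as `key: value` pairs, turning the lines into numbers for a graph is a small parsing job rather than a long shell pipeline. A rough sketch, assuming the field layout shown on the slide; `parse_stage` and `seconds` are illustrative helpers, and the metrics backend that would receive the values is left out:

```python
import re

# One "key: value" pair; values end at whitespace or a comma.
FIELD = re.compile(r"(\w+):\s*([^\s,]+)")


def parse_stage(line):
    """Split 'stage k: v, k: v ...' into the stage name and a dict of fields."""
    stage, _, rest = line.strip().partition(" ")
    return stage, dict(FIELD.findall(rest))


def seconds(value):
    """Convert a duration such as '3.3s' to a float number of seconds."""
    return float(value.rstrip("s"))


samples = [
    "ffmpeg t: 3.3s, q: 1080p, p: 27",
    "upload t: 3.5s, sz: 5Mb, r: 1.4kB/s n: 3",
]

# Stage name -> duration in seconds, ready to be sent to a graphing system.
durations = {}
for sample in samples:
    stage, fields = parse_stage(sample)
    durations[stage] = seconds(fields["t"])
print(durations)
```

The regular expression tolerates the missing comma before `n:` in the upload line, since it keys on the `name:` prefix rather than on separators.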
86. And video again
Host: cloud.mail.ru
Referer: https://cloud.mail.ru/
X-Real-Ip: 8.9.8.9
Content-Length: 575
Cookie: session_id
Connection: close
Accept: application/octet-stream
95. Query of death
SMS: 2017-11-04T19:19:10 App worker ended with SIGABRT
Email: http://store.local/lightning.core
What's going on?
12 servers * 10 workers * 10 rps = 1200 requests per second
97. Query of death
SMS: 2017-11-04T19:19:10 <websrv9> Lightning worker[29453] ended with SIGABRT
Email: http://store.local/websrv9.lightning.29453.core
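Putting the host name and worker PID into both the alert and the core file name narrows the search from every worker in the fleet to a single process. A minimal sketch of building such strings; `crash_alert` and `core_name` are hypothetical helpers, and the naming pattern is simply copied from the slide:

```python
import os
import signal
import socket


def crash_alert(app, sig, host=None, pid=None):
    """Format an alert that names the host, the worker PID, and the signal."""
    host = host or socket.gethostname()
    pid = pid if pid is not None else os.getpid()
    sig_name = signal.Signals(sig).name
    return f"<{host}> {app} worker[{pid}] ended with {sig_name}"


def core_name(app, host, pid):
    """Build a core file name that can be matched to the alert above."""
    return f"{host}.{app.lower()}.{pid}.core"


# Values taken from the slide; a real worker would use its own host and PID.
alert = crash_alert("Lightning", signal.SIGABRT, host="websrv9", pid=29453)
core = core_name("Lightning", "websrv9", 29453)
print(alert)
print(core)
```

With these two identifiers, the relevant log lines can be found by filtering on host and PID around the crash timestamp.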