7. How to inspect logs
Retrospection (reactive search)
Store data, and search
Prospection (proactive search)
Define what should be processed, and store data
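The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the sample records and the names `retrospective_search` and `ProspectiveCounter` are hypothetical and do not come from the slides.

```python
from collections import Counter

# Hypothetical sample records, for illustration only.
LOGS = [
    {"path": "/top", "status": 200},
    {"path": "/login", "status": 500},
    {"path": "/top", "status": 500},
]


def is_error(record):
    return record["status"] >= 500


def retrospective_search(store, predicate):
    """Reactive: everything is already stored; scan it when a question comes up."""
    return [record for record in store if predicate(record)]


class ProspectiveCounter:
    """Proactive: the question is fixed up front; only its answer is kept."""

    def __init__(self, predicate, key):
        self.predicate = predicate
        self.key = key
        self.counts = Counter()

    def feed(self, record):
        # Records that do not match the predefined question are dropped.
        if self.predicate(record):
            self.counts[record[self.key]] += 1


# Retrospection: store data first, search later.
store = list(LOGS)
errors = retrospective_search(store, is_error)

# Prospection: define what to process first, then feed records as they arrive.
counter = ProspectiveCounter(is_error, "path")
for record in LOGS:
    counter.feed(record)
```

In the first case every record is retained and any question can be asked later; in the second only the per-path error counts survive.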
Friday, January 31, 2014
8. What logs are inspected
Schema-full data:
strict schema: predefined fields w/ types (or reject)
schema on read: try to read known fields (or ignore)
Schema-less data:
any fields (or ignore), any types (implicit/explicit conversion)
fits services under development (all internet services!)
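The three ways of handling a record could look roughly like this in Python. It is a sketch under assumptions: the `SCHEMA` mapping, the function names, and the sample record are made up for illustration.

```python
# Hypothetical schema: field name -> required type.
SCHEMA = {"user_id": int, "path": str}


def strict_schema(record):
    """Strict schema: reject records whose fields or types differ from the schema."""
    if set(record) != set(SCHEMA):
        raise ValueError("unexpected or missing fields")
    for name, expected in SCHEMA.items():
        if not isinstance(record[name], expected):
            raise ValueError(f"{name} is not {expected.__name__}")
    return record


def schema_on_read(record):
    """Schema on read: pick up the known fields and ignore everything else."""
    return {name: record[name] for name in SCHEMA if name in record}


def schema_less(record):
    """Schema-less: accept any fields with any types, as they come."""
    return dict(record)


# A record from a service that just gained a new field.
record = {"user_id": "42", "path": "/top", "beta_flag": True}

try:
    strict_schema(record)
    accepted = True
except ValueError:
    accepted = False

known_only = schema_on_read(record)
everything = schema_less(record)
```

The strict variant rejects the record because of the new `beta_flag` field, which is why it is awkward for services whose log format is still changing.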
10. Data size: schema & index
Logs: size is always important (xTB - xPB)
Schema:
size optimization
access optimization on memory/disk
Index:
access optimization on memory/disk
more memory/disk required
hard to distribute
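The index trade-off can be shown with a toy in-memory index. The rows and helper names below are hypothetical; a real storage engine is far more involved.

```python
from collections import defaultdict

# Hypothetical rows, for illustration only.
rows = [
    {"host": "web1", "status": 200},
    {"host": "web2", "status": 500},
    {"host": "web1", "status": 500},
]


def scan(rows, field, value):
    """No index: every row is touched on every query."""
    return [i for i, row in enumerate(rows) if row[field] == value]


def build_index(rows, field):
    """Index: an extra structure mapping each value to the row positions holding it."""
    index = defaultdict(list)
    for i, row in enumerate(rows):
        index[row[field]].append(i)
    return index


host_index = build_index(rows, "host")

# Same answer either way, but the index avoids touching unrelated rows.
by_scan = scan(rows, "host", "web1")
by_index = host_index["web1"]

# The price: one extra entry per row, kept in memory or on disk beside the data.
index_entries = sum(len(positions) for positions in host_index.values())
```

The index also has to be updated on every write and kept consistent with the data it points to, which is one reason it becomes harder to handle once the data is spread over many nodes.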
11. Query response improvements of retrospection
Schema-full + indexed (RDBMS)
Query plan optimization
Schema on read
I/O and Task size optimization & scale out
Schema-less + indexed (Mongo)
mmap-ed index & data (!)
13. Stream processing and data size
No disks: reduction of failure points
Less memory:
only the processing and I/O buffers
aggregation results
Easy to distribute:
stream duplication
stream splitting by aggregation key
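Splitting a stream by aggregation key might be sketched as follows. The `route` function, the worker count, and the sample stream are assumptions for illustration; CRC32 is used only because it is a stable hash in the Python standard library.

```python
from collections import Counter
from zlib import crc32

WORKERS = 2  # hypothetical number of processing nodes


def route(record, key, workers):
    """Pick a worker from a stable hash of the aggregation key."""
    return crc32(str(record[key]).encode("utf-8")) % workers


# Hypothetical access-log stream.
stream = [{"path": p} for p in ["/top", "/login", "/top", "/item", "/login", "/top"]]

# Each worker keeps only its aggregation result, never the raw records.
workers = [Counter() for _ in range(WORKERS)]
for record in stream:
    workers[route(record, "path", WORKERS)][record["path"]] += 1

# Every key lives on exactly one worker, so merging is a plain sum.
merged = sum(workers, Counter())
```

Because all records with the same key reach the same worker, no coordination between workers is needed while counting.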
14. Stream processing and schema
Stream processing: query -> data
Prospective schema by queries:
Queries know the required fields and their types
Unused fields can be ignored
Implicit type conversion available
Schema-less data + schema-full queries
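A minimal sketch of a schema-full query over schema-less records, in Python. The `run_query` helper, its `fields` mapping, and the sample records are hypothetical and not taken from any particular stream processor.

```python
def run_query(stream, fields, where):
    """Apply a query that declares its own schema to schema-less records.

    fields: mapping of required field name -> type used for implicit conversion.
    where:  predicate over the converted row.
    """
    for record in stream:
        try:
            # Only the declared fields are read; unused fields are ignored.
            row = {name: cast(record[name]) for name, cast in fields.items()}
        except (KeyError, ValueError, TypeError):
            continue  # a required field is missing or cannot be converted
        if where(row):
            yield row


# Schema-less input: mixed types, extra fields, missing fields.
stream = [
    {"status": "500", "reqtime": "0.25", "agent": "curl"},
    {"status": 200, "reqtime": 1.5},
    {"message": "no status here"},
    {"status": "503", "reqtime": "2.0", "extra": [1, 2]},
]

# The query, not the data, says which fields matter and what types they have.
slow_errors = list(
    run_query(
        stream,
        fields={"status": int, "reqtime": float},
        where=lambda row: row["status"] >= 500 and row["reqtime"] > 1.0,
    )
)
```

Records that lack a required field are skipped here; a real system might count or report them instead.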