Solr for Indexing and Searching Logs
1. Using Solr to Search and Analyze Logs
Radu Gheorghe
@sematext
@radu0gheorghe
5. defining and handling logs in general
4 sets of tools to send logs to Solr
Performance tuning and SolrCloud
14. Facets. Logging in JSON
2013-11-06… mickey mouse
{
"date": "2013-11-06",
"message": "mickey mouse"
}
15. Facets. Logging in JSON
2013-11-06… mickey mouse
2013-11-06… @cee:{"user": "mickey"}
{
"date": "2013-11-06",
"message": "mickey mouse"
}
{
"date": "2013-11-06",
"user": "mickey"
}
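As a rough illustration of the two cases above, the Python sketch below builds a JSON-ready document from a log date and message. If the message starts with the `@cee:` cookie, the JSON payload becomes fields of the document; otherwise the text lands in a single `message` field. The helper name `to_doc` and its two-argument shape are assumptions made for this example, not part of rsyslog or Solr.

```python
import json

CEE_COOKIE = "@cee:"

def to_doc(date, message):
    """Build a JSON-ready dict from a log date and message (hypothetical helper)."""
    doc = {"date": date}
    text = message.strip()
    if text.startswith(CEE_COOKIE):
        # Structured log: everything after the cookie is a JSON object.
        doc.update(json.loads(text[len(CEE_COOKIE):]))
    else:
        # Unstructured log: keep the whole text in one field.
        doc["message"] = text
    return doc

print(json.dumps(to_doc("2013-11-06", "mickey mouse")))
print(json.dumps(to_doc("2013-11-06", '@cee:{"user": "mickey"}')))
```

With the structured form, `user` becomes its own field, which is what makes faceting on it possible.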
18. 4 Ways of Sending Logs to Solr
logger
Logstash
files
20. Automatic ID generation
solrconfig.xml
<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
……..
<processor class="solr.UUIDUpdateProcessorFactory">
<str name="fieldName">id</str>
</processor>
<processor class="solr.LogUpdateProcessorFactory"/>
<processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
http://solr.pl/en/2013/07/08/automatically-generate-document-identifiers-solr-4-x/
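To make the effect of the processor more concrete, here is a client-side Python analogy: a document without an `id` gets a random UUID, while an existing `id` is left alone. This is only a sketch of the idea; `with_id` is a hypothetical helper, and in the setup above the work happens inside Solr's update chain, not in the client.

```python
import uuid

def with_id(doc, field="id"):
    """Return a copy of doc with a random UUID in `field` if it is missing."""
    out = dict(doc)
    if field not in out:
        out[field] = str(uuid.uuid4())
    return out

print(with_id({"hello": "world"}))
```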
22. /dev/log -> parse -> format -> send to Solr
% logger '@cee: {"hello": "world"}'
rsyslog.conf
module(load="imuxsock") # version 7+
23. /dev/log -> parse -> format -> send to Solr
...
module(load="mmjsonparse")
action(type="mmjsonparse")
24. /dev/log -> parse -> format -> send to Solr
...
template(name="CEE"
type="list") {
property(name="$!all-json")
constant(value="\n")
}
25. /dev/log -> parse -> format -> send to Solr
...
action(type="mmjsonparse")
template(name="CEE"
…
module(load="omprog")
if $parsesuccess == "OK" then action(type="omprog"
binary="/opt/json-to-solr.py"
template="CEE")
26. /dev/log -> parse -> format -> send to Solr
import json, pysolr, sys

solr = pysolr.Solr('http://localhost:8983/solr/')
while True:
    line = sys.stdin.readline()  # one JSON document per line from rsyslog
    if not line:                 # EOF: rsyslog closed the pipe
        break
    doc = json.loads(line)
    solr.add([doc])
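A script fed by omprog may occasionally see empty or malformed lines. The sketch below shows one possible way to read newline-delimited JSON defensively, skipping anything that is not a JSON object. The generator name `docs_from` is an assumption for illustration, and the sample input is made up; in the real script the stream would be `sys.stdin` and each yielded document would go to `solr.add`.

```python
import io
import json

def docs_from(stream):
    """Yield one dict per line of newline-delimited JSON, skipping bad lines."""
    for line in stream:
        line = line.strip()
        if not line:
            continue
        try:
            doc = json.loads(line)
        except ValueError:
            continue  # not valid JSON, ignore it
        if isinstance(doc, dict):
            yield doc

# Made-up sample input standing in for sys.stdin.
sample = io.StringIO('{"hello": "world"}\nnot json\n\n{"a": 1}\n')
print(list(docs_from(sample)))
```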
28. Avro -> buffer -> parse -> send to Solr
https://github.com/mpercy/flume-log4j-example
flume.conf
agent.sources = avroSrc
agent.sources.avroSrc.type = avro
agent.sources.avroSrc.bind = 0.0.0.0
agent.sources.avroSrc.port = 41414
29. Avro -> buffer -> parse -> send to Solr
flume.conf
agent.channels = solrMemoryChannel
agent.channels.solrMemoryChannel.type = memory
agent.sources.avroSrc.channels = solrMemoryChannel
30. Avro -> buffer -> parse -> send to Solr
flume.conf
agent.sinks = solrSink
agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.solrSink.morphlineFile = conf/morphline.conf
agent.sinks.solrSink.channel = solrMemoryChannel
31. Avro -> buffer -> parse -> send to Solr
morphline.conf
...
commands : [
{ readLine { charset : UTF-8 }}
{ grok {
dictionaryFiles : [conf/grok-patterns]
expressions : {
message : """%{INT:pid} %{DATA:message}"""
...
https://github.com/cloudera/search/tree/master/samples/solr-nrt/grok-dictionaries
32. Avro -> buffer -> parse -> send to Solr
morphline.conf
SOLR_LOCATOR : {
collection : collection1
#zkHost : "127.0.0.1:2181"
solrUrl : "http://localhost:8983/solr/"
}
...
commands : [
...
{ loadSolr {
solrLocator : ${SOLR_LOCATOR}
...
34. fluent-logger -> fluentd -> fluent-plugin-solr
% pip install fluent-logger
from fluent import sender,event
sender.setup('solr.test')
event.Event('forward', {'hello': 'world'})
35. fluent-logger -> fluentd -> fluent-plugin-solr
<source>
type forward
</source>
<match solr.**>
type solr
host localhost
port 8983
core collection1
</match>
36. fluent-logger -> fluentd -> fluent-plugin-solr
% gem install fluent-plugin-solr
https://github.com/btigit/fluent-plugin-solr
out_solr.rb
doc = Solr::Document.new(:hello => record["hello"])
38. file input -> grok filter -> solr_http output
% echo '2 world' >> /tmp/testlog
logstash.conf:
input {
file { path => "/tmp/testlog" }
}
39. file input -> grok filter -> solr_http output
logstash.conf:
filter {
grok {
match => ["message", "%{NUMBER:pid} %{GREEDYDATA:hello}"]
}
}
{"pid": "2", "hello":"world"}
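For readers unfamiliar with grok, the match above behaves roughly like a regular expression with named groups. The Python sketch below uses simplified stand-ins for the `NUMBER` and `GREEDYDATA` patterns (assumptions, not the exact grok definitions), and `grok_like` is a hypothetical helper name.

```python
import re

# Simplified stand-ins for the grok patterns (assumptions):
# NUMBER -> optional sign, digits, optional decimals; GREEDYDATA -> ".*"
LINE = re.compile(r"(?P<pid>[+-]?\d+(?:\.\d+)?) (?P<hello>.*)")

def grok_like(message):
    """Return the named captures as a dict, or None if the line does not match."""
    m = LINE.match(message)
    return m.groupdict() if m else None

print(grok_like("2 world"))
```

As with grok's default behavior, the captured values are strings, so `pid` comes out as `"2"` rather than the number 2.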
40. file input -> grok filter -> solr_http output
logstash.conf:
output {
solr_http { # master or v1.2.3+
solr_url => "http://localhost:8983/solr"
}
}
43. |>>>>|Single Core: # of docs/update
http://static.memrise.com.s3.amazonaws.com/uploads/blog-pictures/Simpsons_Updates.bmp
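One way to control the number of documents per update from the client side is to group them into batches before each request. The sketch below is a minimal batching helper; the name `batches` and the batch size used in the demo are arbitrary assumptions, and a suitable size would need to be measured for a given setup.

```python
from itertools import islice

def batches(docs, size):
    """Yield lists of up to `size` documents from any iterable."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

docs = ({"n": i} for i in range(5))
for batch in batches(docs, 2):
    # With pysolr, this is where solr.add(batch) would go.
    print(len(batch))
```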
45. |>>>>|Single Core: Size and Merges
omitNorms="true"
omitTermFreqAndPositions="true"
<mergeFactor>??
http://sweetclipart.com/multisite/sweetclipart/files/scissors_blue_silver.png
http://mergewords.com/gfx/logo-big.png
46. |>>>>|Single Core: Caches
facets
<fieldValueCache ...
size="???"
autowarmCount="0"
changing data
to sort&facet
docValues="true"
http://vector-magz.com/wp-content/uploads/2013/06/diamond-clip-art4.png
http://www.clker.com/cliparts/1/f/6/3/11971228961330048838SaraSara_Ice_cube_2.svg.med.png
http://clipartist.info/RSS/openclipart.org/2011/May/02-Monday/migrating_penguin_penguinmigrating-555px.png
47. SolrCloud: ZooKeeper
bin/zkServer.sh start
OR
java -DzkRun … -jar start.jar
http://www.clker.com/cliparts/c/a/8/d/1331060720387485902Roaring%20Tiger.svg.hi.png
http://fc03.deviantart.net/fs71/f/2012/196/6/a/piggy_back_rides_are_the_best_rides__by_yipped-d57b3sh.png
48. SolrCloud: ZooKeeper
zkcli.sh -cmd upconfig
-zkhost SERVER:2181
-confdir solr/collection1/conf/
-confname start
-Dbootstrap_confdir=solr/collection1/conf -Dcollection.configName=start