3. ELK
Kibana 4.0.1
Open source data visualization platform that allows you to interact
with your data through stunning, powerful graphics.
Elasticsearch 1.4.4
Distributed, open source search and analytics engine, designed for
horizontal scalability, reliability, and easy management.
Logstash 1.4.2
Flexible, open source data collection, parsing, and enrichment pipeline.
Shield 1.0.1
Shield brings enterprise-grade security to Elasticsearch, protecting the entire ELK
stack with encrypted communications, authentication, role-based access control
and auditing.
Marvel
Comprehensive tool that provides you with complete transparency into the
status of your Elasticsearch deployment.
4. SHIELD
Security as a Plugin
Security features for Elasticsearch are implemented in a
plugin that you install on each node in your cluster.
5. ARCHITECTURE NOTES
• The plugin intercepts inbound API calls in order to
enforce authentication and authorization.
• The plugin provides encryption using Secure Sockets
Layer/Transport Layer Security (SSL/TLS) for the
network traffic to and from the Elasticsearch node.
• The plugin uses the API interception layer that
enables authentication and authorization to provide
audit logging capability.
6. MAIN FEATURES
• User Authentication
Shield defines a known set of users, called a realm, in order to authenticate users
who make requests. The supported realms are esusers and LDAP.
• Authorization
Shield’s data model for action authorization includes: Secured Resource, Privilege,
Permissions, Role, Users
• Node Authentication and Channel Encryption
Shield uses SSL/TLS to wrap the usual node communication over port 9300. When SSL/TLS
is enabled, the nodes validate each other’s certificates, establishing trust between the
nodes.
• IP Filtering
Shield provides IP-based access control for Elasticsearch nodes that lets you restrict
which other servers, identified by IP address, can connect to Elasticsearch nodes and
make requests.
• Auditing
The audit functionality in a secure Elasticsearch cluster logs particular events and
activity on that cluster. The events logged include authentication attempts, including
granted and denied access.
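The authorization and auditing features above can be illustrated with a small conceptual sketch. It is not Shield's implementation or API: the class names, the `authorize` function, and the `audit_log` list are all hypothetical, chosen only to show how users, roles, permissions, privileges, and secured resources relate, and how each decision could be recorded as an audit event.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Permission:
    privilege: str  # hypothetical privilege name, e.g. "read" or "write"
    resource: str   # hypothetical secured resource, e.g. an index name


@dataclass
class Role:
    name: str
    permissions: set = field(default_factory=set)


@dataclass
class User:
    name: str
    roles: list = field(default_factory=list)


# Every decision is appended here, mimicking an audit trail.
audit_log = []


def authorize(user, privilege, resource):
    """Grant access if any of the user's roles holds the requested permission."""
    wanted = Permission(privilege, resource)
    granted = any(wanted in role.permissions for role in user.roles)
    outcome = "granted" if granted else "denied"
    audit_log.append((user.name, privilege, resource, outcome))
    return granted


log_reader = Role("log_reader", {Permission("read", "logs")})
alice = User("alice", [log_reader])

can_read = authorize(alice, "read", "logs")
can_write = authorize(alice, "write", "logs")
```

In this sketch `alice` may read the `logs` resource but not write to it, and both attempts end up in `audit_log`.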
7. KIBANA
Kibana 4 provides dozens of new features that enable you
to compose questions, get answers, and solve problems like
never before.
8. WHAT’S NEW?
• New interface with D3, drag-and-drop dashboard builder
• New diagrams: Area Chart, Data Table, Markdown Text Widget, Pie Chart,
Raw Document Widget, Single Metric Widget, Tile Map, Vertical Bar Chart
• Advanced aggregation-based analytics capabilities: Unique counts
(cardinality), Non-date histograms, Ranges, Significant terms, Percentiles etc.
• Expressions-based scripted fields enable you to perform ad-hoc analysis by
performing computations on the fly
• Search result highlighting
• Ability to save searches and visualizations
• Faster dashboard loading due to a reduction in the number of HTTP calls
needed to load the page
• SSL encryption for client requests as well as requests to and from
Elasticsearch
11. WHAT’S NEW? SINCE 1.2.0
• Upgraded to Lucene 4.10.1 release
• New aggregations: percentile_ranks, top_hits, cardinality,
scripted_metric, …
• Added sum of the doc counts of other buckets in terms aggs
• Added support for bounding box aggregation on geo_shape/geo_point
data types
• Parent/child optimization
• Added support for scripted upserts
• Fielddata and cache optimisation
• Removed deprecated gateway functionality
• …
12. PERCENTILE RANKS AGGREGATION
A multi-value metrics aggregation that calculates one or more percentile
ranks over numeric values extracted from the aggregated documents.
Request:
{
  "aggs" : {
    "load_time_outlier" : {
      "percentile_ranks" : {
        "field" : "load_time",
        "values" : [15, 30]
      }
    }
  }
}
Response:
{
  "aggregations" : {
    "load_time_outlier" : {
      "values" : {
        "15": 92,
        "30": 100
      }
    }
  }
}
The example above shows that 92% of pages loaded within 15 seconds, and
100% within 30 seconds.
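To make the meaning of a percentile rank concrete, here is a minimal sketch that computes it exactly over an in-memory list. The `percentile_rank` helper and the `load_times` sample are invented for illustration; Elasticsearch computes its result over indexed documents and may approximate it, which this sketch does not attempt to reproduce.

```python
def percentile_rank(values, threshold):
    """Return the percentage of values that are <= threshold."""
    if not values:
        raise ValueError("values must not be empty")
    at_or_below = sum(1 for v in values if v <= threshold)
    return 100.0 * at_or_below / len(values)


# Hypothetical page load times in seconds: 23 of the 25 are <= 15.
load_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
              3, 5, 7, 9, 11, 13, 14, 15, 22, 28]

rank_15 = percentile_rank(load_times, 15)
rank_30 = percentile_rank(load_times, 30)
print(rank_15, rank_30)
```

The sample data was chosen so that the ranks for 15 and 30 match the figures in the slide's response.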
13. TOP HITS AGGREGATION
A top_hits metric aggregator keeps track of the most relevant documents
being aggregated. This aggregator is intended to be used as a sub-aggregator,
so that the top matching documents can be aggregated per bucket.
Request:
{
  "aggs": {
    "top_logs": {
      "top_hits": {
        "sort": [
          {
            "created_at": {
              "order": "desc"
            }
          }
        ],
        "_source": {
          "include": [
            "path"
          ]
        }
      }
    }
  }
}
Response:
{
  "aggregations": {
    "top_logs": {
      "hits": {
        "total": 180,
        "hits": [
          {
            "_index": "logs",
            "_type": "log",
            "_id": "an893d30mlss",
            "_source": {
              "path": "/home/user/"
            },
            "sort": [ 1422388801000 ]
          },
          …
        ]
      }
    }
  }
}
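The per-bucket behaviour described for top_hits can be sketched in plain Python. Everything here is hypothetical: the `top_hits_per_bucket` helper, the `host` bucketing field, and the sample log documents. It only mirrors the idea of the aggregation (group documents into buckets, sort each bucket, keep the first few hits, and filter the returned fields) and is not how Elasticsearch implements it.

```python
from collections import defaultdict


def top_hits_per_bucket(docs, bucket_field, sort_field, size=1, include=None):
    """Group docs by bucket_field and keep the `size` newest docs per group."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[bucket_field]].append(doc)

    result = {}
    for key, members in buckets.items():
        # Descending sort, like "order": "desc" in the request.
        ranked = sorted(members, key=lambda d: d[sort_field], reverse=True)
        result[key] = {
            "total": len(members),
            # Keep only the requested fields, like "_source.include".
            "hits": [{k: d[k] for k in (include or d)} for d in ranked[:size]],
        }
    return result


# Hypothetical log documents.
logs = [
    {"host": "web-1", "path": "/home/user/", "created_at": 1422388801000},
    {"host": "web-1", "path": "/var/log/", "created_at": 1422388700000},
    {"host": "web-2", "path": "/tmp/", "created_at": 1422388600000},
]

result = top_hits_per_bucket(logs, "host", "created_at", size=1, include=["path"])
```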
14. CARDINALITY AGGREGATION
A single-value metrics aggregation that calculates an approximate count of
distinct values. It is based on the HyperLogLog++ algorithm, which counts
based on the hashes of the values with some interesting properties:
• configurable precision, which decides on how to trade memory for accuracy,
• excellent accuracy on low-cardinality sets,
• fixed memory usage: no matter if there are tens or billions of unique values,
memory usage only depends on the configured precision.
Request:
{
  "aggs" : {
    "tags_count" : {
      "cardinality" : {
        "field" : "tags",
        "precision_threshold": 100
      }
    }
  }
}
Response:
{
  "aggregations" : {
    "tags_count" : {
      "value": 120002
    }
  }
}
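The fixed-memory property listed above can be seen in a simplified sketch of the classic HyperLogLog estimator. This is an assumption-laden illustration: it is plain HyperLogLog with a linear-counting correction for small counts, not the HyperLogLog++ variant used by Elasticsearch, and the class name, the SHA-1 hash choice, and the `p` parameter are picked here for the example only.

```python
import hashlib
import math


class HyperLogLog:
    def __init__(self, p=12):
        self.p = p                   # precision: number of index bits
        self.m = 1 << p              # number of registers (fixed memory)
        self.registers = [0] * self.m

    def add(self, value):
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")       # 64-bit hash
        rest_bits = 64 - self.p
        idx = h >> rest_bits                        # first p bits pick a register
        rest = h & ((1 << rest_bits) - 1)
        # Position of the leftmost 1-bit in the remaining bits (1-based).
        rank = rest_bits - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)            # bias constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)          # linear counting for small sets
        return raw


hll = HyperLogLog(p=12)
for i in range(10000):
    hll.add("tag-%d" % i)
for i in range(10000):
    hll.add("tag-%d" % i)   # duplicates do not change the registers

estimate = hll.count()
print(round(estimate))
```

Whatever number of values is added, the sketch keeps exactly 4096 registers, and the estimate stays close to the true count of 10000 distinct values.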
15. SCRIPTED METRIC AGGREGATION
A metric aggregation that executes using scripts to provide a metric output.
{
  "aggs" : {
    "profit": {
      "scripted_metric": {
        "init_script" : "_agg['transactions'] = []",
        "map_script" : "if (doc['type'].value == \"sale\") { _agg.transactions.add(doc['amount'].value) } else { _agg.transactions.add(-1 * doc['amount'].value) }",
        "combine_script" : "profit = 0; for (t in _agg.transactions) { profit += t }; return profit",
        "reduce_script" : "profit = 0; for (a in _aggs) { profit += a }; return profit"
      }
    }
  }
}
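The four script phases in the request above can be emulated in Python to show how they fit together. This is only a sketch of the data flow, assuming that init, map, and combine run once per shard and reduce runs once over the per-shard results; the function names and the two sample shards are hypothetical.

```python
def init():
    # init_script: create the per-shard state.
    return {"transactions": []}


def map_doc(agg, doc):
    # map_script: sales count as positive amounts, everything else as negative.
    amount = doc["amount"]
    agg["transactions"].append(amount if doc["type"] == "sale" else -amount)


def combine(agg):
    # combine_script: collapse one shard's state into a single number.
    return sum(agg["transactions"])


def reduce_results(shard_results):
    # reduce_script: merge the per-shard numbers into the final metric.
    return sum(shard_results)


def scripted_metric(shards):
    shard_results = []
    for docs in shards:
        agg = init()
        for doc in docs:
            map_doc(agg, doc)
        shard_results.append(combine(agg))
    return reduce_results(shard_results)


# Two hypothetical shards of transaction documents.
shards = [
    [{"type": "sale", "amount": 80}, {"type": "cost", "amount": 10}],
    [{"type": "sale", "amount": 130}, {"type": "cost", "amount": 30}],
]

profit = scripted_metric(shards)
print(profit)
```

With this sample data the first shard contributes 70 and the second 100.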
17. PROBLEM I
{
  "location": {
    "type": "geo_point"
  },
  "tags": {
    "type": "string",
    "index": "not_analyzed"
  },
  "text": {
    "type": "string",
    "index": "not_analyzed"
  }
}
Find the most popular tags per location (e.g. grouping by
geohash with precision 10 km x 10 km).
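One possible way to approach this problem, offered as a sketch rather than the deck's own answer, is a geohash_grid aggregation on `location` with a terms sub-aggregation on `tags`. The helper name, the aggregation names `cells` and `popular_tags`, and the defaults are assumptions. Geohash precision 5 (cells of roughly 4.9 km x 4.9 km) is used because no geohash length corresponds exactly to 10 km x 10 km.

```python
import json


def popular_tags_per_cell(precision=5, top_n=5):
    """Build a search body: bucket by geohash cell, then top tags per cell."""
    return {
        "size": 0,  # only aggregation results are needed, not search hits
        "aggs": {
            "cells": {
                "geohash_grid": {"field": "location", "precision": precision},
                "aggs": {
                    "popular_tags": {"terms": {"field": "tags", "size": top_n}}
                },
            }
        },
    }


body = popular_tags_per_cell()
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a search request against an index that uses the mapping above.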