The Elastic blog [1] recently featured webLyzard’s Visual Exploration of Sustainability Communication with Elasticsearch, a project [2] to track global information flows. Customized for the United Nations Environment Programme, the resulting platform identifies opinion leaders and analyzes the public debate surrounding the UN’s Sustainable Development Goals (SDGs). Its custom-built dashboard [3] synchronizes multiple views in real time and uses aggregations to convey context information through a portfolio of visual tools.
Two of the webLyzard [4] co-founders will present a live demo of the platform and similar applications in other domains. They will discuss some of the underlying aggregations and their experience of recently migrating to Elasticsearch 6.5. The concluding outlook will show how predictive capabilities might help to anticipate mobility bottlenecks, support digital newsrooms, or maximize the impact of published content across social media channels.
[1] https://www.elastic.co/blog/weblyzards-visual-exploration-of-sustainability-communication-with-elasticsearch
[2] https://www.weblyzard.com/unep-live
[3] https://unep.ecoresearch.net
[4] https://www.weblyzard.com
Elasticsearch | Integration
History
- Started with PostgreSQL/Tsearch2
- Switched to Apache Lucene in 2009
- Adopted Elasticsearch 1.0 in early 2014
- Migrated to Elasticsearch 6 earlier this year (currently 6.5.1)
Current Cluster
- 5 physical machines (Xeon E5, 40 cores, 256 GB RAM)
- 3 x 2TB M.2 NVMe Samsung 960 Pro in striped LVM
- Multiple Elasticsearch nodes per machine:
  4 data nodes, separate master nodes, one coordinating-only node per machine
- Docker containers, overlay network, discovery using DNS
Indexing
– Custom component based on Vert.x
– Data read from PostgreSQL
– Indexer applies transformations/de-normalizations
– Enriches with additional metadata; e.g., translations
– One index per source and language (e.g. German news media) per month (e.g. de.1.media.2018-12)
– Balance between index/shard size and number of indexes, affected by queries and aggregations
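The naming scheme above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the meaning of the numeric component in `de.1.media.2018-12` is not stated on the slide, so it is passed through as an opaque `partition` value.

```python
from datetime import date


def monthly_index_name(language: str, partition: int, source: str, day: date) -> str:
    """Build a dot-separated index name that ends in the year and month.

    `partition` stands in for the numeric part of the example name;
    its actual meaning is an assumption, not taken from the slides.
    """
    return f"{language}.{partition}.{source}.{day:%Y-%m}"


def index_pattern(language: str, source: str) -> str:
    """Wildcard pattern matching every monthly index of one language and source."""
    return f"{language}.*.{source}.*"


if __name__ == "__main__":
    print(monthly_index_name("de", 1, "media", date(2018, 12, 5)))
    print(index_pattern("de", "media"))
```

Monthly indexes keep each index bounded in size, and a wildcard pattern lets a query address all months of one source at once. The trade-off named above remains: more, smaller indexes mean more shards to touch per query.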
Migration 1.7 > 6.x
– No in-place upgrade path from Elasticsearch 1.x to 6.x
– Skipped Elasticsearch 2.x because of field names containing dots in the existing mapping
– Complete re-index needed, adapted document mapping
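Elasticsearch 2.0 rejected field names containing dots, while later major versions interpret a dotted name as a path into nested objects. The sketch below shows one way a re-indexing step could normalize such documents up front. It is not the procedure used in the migration described here; the function name and the sample fields are hypothetical.

```python
from typing import Any, Dict


def expand_dotted_fields(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Rewrite keys such as "geo.lat" into nested objects {"geo": {"lat": ...}}."""
    result: Dict[str, Any] = {}
    for key, value in doc.items():
        # Recurse first so dotted keys inside sub-objects are expanded too.
        if isinstance(value, dict):
            value = expand_dotted_fields(value)
        parts = key.split(".")
        target = result
        # Walk (and create) the intermediate objects named by the path.
        for part in parts[:-1]:
            target = target.setdefault(part, {})
            if not isinstance(target, dict):
                raise ValueError(f"field {key!r} conflicts with a non-object value")
        leaf = parts[-1]
        if isinstance(value, dict) and isinstance(target.get(leaf), dict):
            # Shallow merge when the object already exists from an earlier key.
            target[leaf].update(value)
        else:
            target[leaf] = value
    return result


if __name__ == "__main__":
    print(expand_dotted_fields({"title": "x", "geo.lat": 48.2, "geo.lon": 16.4}))
```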
Performance Improvements
– Single request performance improved by about 50%
– Concurrent requests with 100 simulated users improved by almost 85%
Example 1: Geographic Map
– Geohash grid aggregation for extracted target locations
– User-selectable precision
– Aggregate average document sentiment
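A request body for this kind of map view might look like the following sketch. It combines Elasticsearch's `geohash_grid` bucket aggregation with an `avg` sub-aggregation. The field names `target_location` and `sentiment` and the aggregation names are assumptions for illustration; the platform's actual mapping is not shown in the slides.

```python
import json
from typing import Any, Dict


def geo_sentiment_aggregation(
    precision: int,
    location_field: str = "target_location",  # hypothetical geo_point field
    sentiment_field: str = "sentiment",       # hypothetical numeric field
) -> Dict[str, Any]:
    """Build a search body that buckets documents by geohash cell
    and averages the sentiment value within each cell."""
    if not 1 <= precision <= 12:
        raise ValueError("geohash precision must be between 1 and 12")
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {
            "cells": {
                "geohash_grid": {"field": location_field, "precision": precision},
                "aggs": {
                    "avg_sentiment": {"avg": {"field": sentiment_field}},
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(geo_sentiment_aggregation(precision=4), indent=2))
```

A lower precision yields fewer, larger cells, which suits a zoomed-out map; a user-selectable precision lets the same query serve different zoom levels.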
Example 2: Word Tree
– Inner hits on search query
– Filtered for sentences matching the query
– Maintaining document-level sorting
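One way to obtain matching sentences while keeping document-level ordering is a `nested` query with `inner_hits`, sketched below. It assumes sentences are stored as nested objects under a hypothetical `sentences` path with a `sentences.text` field, and that documents carry a sortable `date` field; none of these names come from the slides, and the platform may model sentences differently.

```python
import json
from typing import Any, Dict


def word_tree_query(
    phrase: str,
    sentences_path: str = "sentences",   # hypothetical nested path
    text_field: str = "sentences.text",  # hypothetical sentence text field
    sort_field: str = "date",            # hypothetical document-level sort field
    sentences_per_doc: int = 5,
) -> Dict[str, Any]:
    """Build a search body that returns, per matching document,
    only the nested sentences that match the phrase."""
    return {
        "query": {
            "nested": {
                "path": sentences_path,
                "query": {"match_phrase": {text_field: phrase}},
                # inner_hits returns the matching nested sentences per document
                "inner_hits": {"size": sentences_per_doc},
            }
        },
        # The top-level sort orders whole documents, not individual sentences.
        "sort": [{sort_field: {"order": "desc"}}],
        "_source": False,  # the sentences arrive via inner_hits
    }


if __name__ == "__main__":
    print(json.dumps(word_tree_query("climate change"), indent=2))
```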
InVID | www.weblyzard.com/invid
In Video Veritas – Video Verification
Media Partners: APA, AFP, Deutsche Welle
EU Horizon 2020, 3.65 million EUR
SONAR | www.weblyzard.com/sonar
Semantic Repository for News Analytics
Media Partner: ProSiebenSat.1 PULS 4
Google News Initiative, 225,000 EUR
ReTV | www.weblyzard.com/retv
Enhancing and Repurposing TV Content
Media Partners: Zattoo, rbb|24, Sound & Vision
EU Horizon 2020, 3.5 million EUR
EPOCH | www.weblyzard.com/epoch
Extracting and Predicting Events from Online Communication and Hybrid Datasets
Partners: Ketchum Publico, KPMG
FFG ICT of the Future; 500,000 EUR
EPOCH will assess the real-world impact of events reported
in news and social media channels. It will use extracted
knowledge from the public debate across these channels to
predict future trends, offering a visual dashboard to explore
and analyze these trends.
Organizations will be able to identify and thus better
prepare for anticipated changes, adapting their
decision-making and resource allocation strategies
accordingly. New methods developed within EPOCH will be
applied to the domains of purchase price forecasting and
public relations.
@weblyzard
We Are Definitely Hiring
Our team spans 2+ countries and counting.
See if you or someone you know is a fit.
careers@weblyzard.com