Presented by Mark Davis, CTO Kitenga - See conference video - http://www.lucidimagination.com/devzone/events/conferences/lucene-revolution-2012
Kitenga's Analyst system uses the LucidWorks Enterprise REST API in a variety of ways, including configuring collections and managing the Solr schema. As part of the Kitenga platform, the ZettaSearch Designer lets the end user drag and drop search widgets to create a specialized search interface. To design search UIs that meet their needs, users must understand the schema fields available in a given collection. ZettaSearch Designer interrogates the Solr infrastructure using the Lucid REST API to provide an overview of the available metadata. It is then easy for the user to build rich, facetted search experiences around the metadata library indexed into the collection. In this implementation overview, I will describe the design of ZettaSearch Designer, how it interacts with big data technologies like Hadoop as part of the indexing pipeline, and how it uses the LucidWorks API to enable user discovery of the metadata needed to create novel search user interfaces on the fly.
18. ZettaSearch
GOAL: Build “second screen experiences”
SOURCES: Wikipedia, IMDB, blogs
ANALYSIS: Extract named entities and relationships, preserve existing structural metadata
ACTION: Enable new media experiences
[Diagram labels: Sources; ZettaVox; entities, relationships, metadata, data; Facetted Search and Analytics]
19.
• Crawlers on Hadoop
• Document format crackers on Hadoop
• Extractors on Hadoop
• Filters on Hadoop
• HTTP documents to Solr sharded cluster
• Intermediary files remain on HDFS for reprocessing
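The last hop of this pipeline, sending processed documents over HTTP to a sharded Solr cluster, might look roughly like the sketch below. It is an illustration only: the shard URLs, the hash-based routing rule, and the document fields are assumptions, not details from the talk. The only real endpoint used is Solr's JSON update handler, whose path differs between Solr versions.

```python
import hashlib
import json
import urllib.request

# Hypothetical shard endpoints; the real cluster layout is not described in the talk.
SHARD_URLS = [
    "http://solr1.example.com:8983/solr/collection1",
    "http://solr2.example.com:8983/solr/collection1",
]


def shard_for(doc_id, shard_urls=SHARD_URLS):
    """Pick a shard by hashing the document id (an assumed routing rule)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return shard_urls[int(digest, 16) % len(shard_urls)]


def build_update_request(doc, shard_urls=SHARD_URLS):
    """Build (but do not send) an HTTP POST that adds one document to its shard.

    Solr 4 and later accept JSON on /update; older releases use /update/json.
    """
    body = json.dumps([doc]).encode("utf-8")
    return urllib.request.Request(
        shard_for(doc["id"], shard_urls) + "/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_document(doc):
    """Send the document; this needs a reachable Solr instance."""
    with urllib.request.urlopen(build_update_request(doc)) as resp:
        return resp.status
```

In a Hadoop job, a function like `index_document` would typically be called from the final map or reduce step, while the intermediary files stay on HDFS so the step can be rerun.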
20.
• Missing piece of the puzzle
• Addresses the impedance mismatch between Big Data technologies and Solr search
• Manage collections
• Manage schema
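Managing collections and schema through a REST API could be sketched as follows. Everything specific here is an assumption made for illustration: the base URL, the `/collections` and `/fields` paths, and the payload keys (`name`, `field_type`, `indexed`, `stored`, `facet`) are hypothetical stand-ins, so consult the LucidWorks documentation for the actual resource names. The helpers only build requests; nothing is sent.

```python
import json
import urllib.request

# Assumed base URL for a LucidWorks Enterprise-style REST API (hypothetical).
API_BASE = "http://localhost:8888/api"


def _json_post(url, payload):
    """Build a JSON POST request without sending it."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_collection_request(name):
    """Request that would create a collection (path and payload are assumed)."""
    return _json_post(API_BASE + "/collections", {"name": name})


def add_field_request(collection, field_name, field_type, facet=False):
    """Request that would add a schema field (path and payload keys are assumed)."""
    payload = {
        "name": field_name,
        "field_type": field_type,
        "indexed": True,
        "stored": True,
        "facet": facet,
    }
    return _json_post(f"{API_BASE}/collections/{collection}/fields", payload)
```

Passing either request to `urllib.request.urlopen` would perform the call against a running server.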
24.
• Schema interrogation
• Schema binding to user experience
• Facetted search
• Embedded analytics
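Schema interrogation and binding fields to a facet widget can be sketched with plain Solr endpoints. This is not ZettaSearch Designer's implementation, and the talk uses the Lucid REST API rather than the calls shown here: the sketch assumes Solr's Luke request handler (`/admin/luke?show=schema`) for field discovery and the standard `facet` and `facet.field` query parameters. The core URL and the response shape passed to `field_names` are assumptions.

```python
import json
import urllib.parse
import urllib.request


def luke_schema_url(core_url):
    """URL that asks Solr's Luke handler for the schema as JSON."""
    query = urllib.parse.urlencode({"show": "schema", "wt": "json"})
    return core_url + "/admin/luke?" + query


def field_names(luke_response):
    """List field names from a response shaped like {"schema": {"fields": {...}}}."""
    return sorted(luke_response.get("schema", {}).get("fields", {}))


def facet_query_url(core_url, query, facet_fields):
    """URL for a facet-only query (rows=0) over the chosen fields."""
    params = [("q", query), ("wt", "json"), ("rows", "0"), ("facet", "true")]
    params += [("facet.field", name) for name in facet_fields]
    return core_url + "/select?" + urllib.parse.urlencode(params)


def facet_counts(select_response, field):
    """Turn Solr's flat [value, count, value, count, ...] list into a dict."""
    flat = select_response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


def fetch_json(url):
    """Fetch and decode a JSON response; needs a reachable Solr instance."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

A designer tool could call `field_names(fetch_json(luke_schema_url(core)))` to populate its field picker, then use `facet_query_url` and `facet_counts` to fill each facet widget the user drops onto the page.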
25.
26.
27.
• Big Data search and analytics has many challenges:
  ▪ Volume of data
  ▪ Variety of data
  ▪ Velocity of data
  ▪ Extracting structure from unstructured information
• Hadoop processing enables each of these aspects
• Controlling indexing and search is enabled by the Lucid Imagination search API
• We can enable complex user interactions with Big Data on a self-serve basis