RedisDay London 2018 - RedisSearch Deep Dive

•

0 recomendaciones•617 vistas

Redis Labs

RedisDay London 2018

Tecnología

RediSearch Deep Dive
KYLEDAVIS,REDISLABS

RediSearch can be used for two [primary] things:
Full Text
Search
Secondary
Index

Search Aggregate
RediSearch can do three things.
Autocomplete
Query

• Create a schema using four types
Text
Numeric
Tag
Geospatial
• Add Documents in Real Time
Directly
From Hash
Index only
• Search & Aggregate
• Delete documents as needed
• Drop the whole index
Data Lifecycle in RediSearch / Search and Aggregate

• Goals
Intentionally not SQL
But familiar
Exposable to end-users
• Simple
No knowledge of data/structure needed
• Powerful
With knowledge, zero in on data
Query Language

AND / OR / NOT / Exact Phrase / Geospatial /
Tags / Prefix / Number Ranges / Optional Terms
& more
Query Syntax – more advanced
And combine them all into one query:
(chev*|ford) -explorer ~truck @year:[2001
2011] @location:[74 40 100 km] @condition:{
good | verygood }

• Stop words:
”a fox in the woods” -> “fox woods”
• Stemming:
Query “going” -> find “going” ”go” “gone”
• Slop:
Query: “glass pitcher”, slop 2 -> “glass gallon beer pitcher”
• With or without content:
Query: “To be or not to be” -> Hamlet (without the whole play)
Matched text highlight/summary:
Query – “To be or not to be” -> Hamlet. <b>To be, or not to be</b>- that
is the question
Full-text Search

•Synonyms
Query “Bob” -> Find documents with “Robert”
•Query Spell Check
”a fxo in the woods” -> Did you mean “a fox in the
woods”
•Phonetic Search
“John Smith” -> “Jon Smyth”
Full-text Search

• Each field can have a weight which influences the rank in the returned result
• Each document can have a score to influence rank
• Built-in Scoring Functions
Default: TF-IDF / term frequency–inverse document frequency
• Variant: DOCNORM
• Variant: BM25
DISMAX (Solr’s default)
DOCSCORE
HAMMING for binary payloads
• Fields can be independently sortable, which trumps any in-built scorcer
Scoring, Weights, and Sorting

Aggregations
• Processes and transforms
• Same query language as search
• Can group, sort and apply transformations
• Follows pipeline of composable actions:
Filter Group Apply Sort Apply
Reduce
Reduce

Grouping & Applications
• Reducers:
COUNT
COUNT_DISTINCT
COUNT_DISTINCTISH
SUM
MIN
MAX
AVG
STDDEV
QUANTILE
TOLIST
FIRST_VALUE
RANDOM_SAMPLE
• Manipulate
Strings
• substr(upper('hello
world'),0,3) -> “HEL”
Numbers w/ Arithmetic
• sqrt(log(foo) *
floor(@bar/baz)) + (3^@qaz %
6)
Timestamp to Calendar
• timefmt(@mytimestamp, "%b %d
%Y %H:%M:%S”) -> Feb 24 2018
00:05:48

FT.AGGREGATE shipments "@box_area:[300 +inf]”
APPLY "year(@shipment_timestamp / 1000)" AS shipment_year
GROUPBY 1 @shipment_year REDUCE COUNT 0 AS shipment_count
SORTBY 2 @shipment_count DESC
LIMIT 0 3
APPLY "format("%sk+ Shipments",floor(@shipment_count /
1000))" AS shipment_count
RediSearch in Action: FT.AGGREGATE

Legacy, Dataset-Based Autocomplete
Add data
Prefix
Search
Your general dataset is
the source for the
autocomplete
Each keystroke you do
a prefix search on your
whole dataset.

•Frequency != Good Search Query
•Frequency != Quality
•Expensive & Slow
•Ignores context
Here be dragons

• Radix Trie
• Store a collection of suggestion at a single Key
• Each suggestion has a score
• You can use it with anything that has:
Search box
User interacts with results
• API
FT.SUGGET
FT.SUGADD
FT.SUGDEL
FT.SUGLEN
RediSearch Suggestion Engine

Search
box
Results
Search
hit/miss
User-Directed Autocomplete
FT.SUGGET
mySug
{user input}
Instead of going
directly to results
go through
RediSearch-based
forwarder
FT.SUGADD
mySug
{user input}
1
INCR

Más contenido relacionado

La actualidad más candente

Why is postgis awesome?Kasper Van Lombeek

Type Checking Scala Spark Datasets: Dataset TransformsJohn Nestor

The NoSQL Geospatial LandscapeRaj Singh

The evolution of Netflix's S3 data warehouse (Strata NY 2018)Ryan Blue

Making data storage more efficientCentre of Geographic Sciences (COGS)

Reading HDF family of formats via NetCDF-Java / CDMThe HDF-EOS Tools and Information Center

Solr Payloads for Ranking Data BloomReach

La actualidad más candente (7)

Why is postgis awesome?

Type Checking Scala Spark Datasets: Dataset Transforms

The NoSQL Geospatial Landscape

The evolution of Netflix's S3 data warehouse (Strata NY 2018)

Making data storage more efficient

Reading HDF family of formats via NetCDF-Java / CDM

Solr Payloads for Ranking Data

Similar a RedisDay London 2018 - RedisSearch Deep Dive

RedisSearch / CRDT: Kyle Davis, Meir ShpilraienRedis Labs

Redshift Chartio Event PresentationChartio

Azure Redis Cache - Cache on Steroids!François Boucher

Why MongoDB over other Databases - HabilelabsHabilelabs

Apache Hadoop 1.1Sperasoft

Web analytics at scale with Druid at naver.comJungsu Heo

Redshift overviewAmazon Web Services LATAM

Extending Solr: Behind CareerBuilder’s Cloud-like Knowledge Discovery Platfor...lucenerevolution

Extending Solr: Building a Cloud-like Knowledge Discovery PlatformLucidworks (Archived)

Efficient Query Processing in Geographic Web Search EnginesYen-Yu Chen

AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...Amazon Web Services

MongoDB's New Aggregation frameworkChris Westin

No sql Databasemymail2ashok

Azure CosmosDb - Where we areMarco Parenzan

ACS DataMart_pptJeremy Searls

Oracle by Muhammad IqbalYOUTH MEDIA AGENCY

Test driving Azure Search and DocumentDBAndrew Siemer

SolrCloud on HadoopAlex Moundalexis

Apache Solr crash courseTommaso Teofili

Similar a RedisDay London 2018 - RedisSearch Deep Dive (20)

RedisSearch / CRDT: Kyle Davis, Meir Shpilraien

Redshift Chartio Event Presentation

Azure Redis Cache - Cache on Steroids!

Why MongoDB over other Databases - Habilelabs

Apache Hadoop 1.1

Web analytics at scale with Druid at naver.com

Redshift overview

Extending Solr: Behind CareerBuilder’s Cloud-like Knowledge Discovery Platfor...

Extending Solr: Building a Cloud-like Knowledge Discovery Platform

Efficient Query Processing in Geographic Web Search Engines

AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...

MongoDB's New Aggregation framework

No sql Database

Azure CosmosDb - Where we are

ACS DataMart_ppt

Oracle by Muhammad Iqbal

Test driving Azure Search and DocumentDB

SolrCloud on Hadoop

Apache Solr crash course

Más de Redis Labs

Redis Day Bangalore 2020 - Session state caching with redisRedis Labs

Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020Redis Labs

The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...Redis Labs

SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020Redis Labs

Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...Redis Labs

Redis for Data Science and Engineering by Dmitry Polyakovsky of OracleRedis Labs

Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020Redis Labs

Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020Redis Labs

Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...Redis Labs

JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...Redis Labs

Highly Available Persistent Session Management Service by Mohamed Elmergawi o...Redis Labs

Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...Redis Labs

Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...Redis Labs

RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020Redis Labs

RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020Redis Labs

RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020Redis Labs

RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020Redis Labs

Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...Redis Labs

Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...Redis Labs

Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...Redis Labs

Más de Redis Labs (20)

Redis Day Bangalore 2020 - Session state caching with redis

Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020

The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...

SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020

Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...

Redis for Data Science and Engineering by Dmitry Polyakovsky of Oracle

Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020

Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020

Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...

JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...

Highly Available Persistent Session Management Service by Mohamed Elmergawi o...

Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...

Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...

RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020

RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020

RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020

RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020

Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...

Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...

Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...

Último

Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge

What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco

08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls

08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls

GenCyber Cyber Security Day PresentationMichael W. Hawkins

A Domino Admins Adventures (Engage 2024)Gabriella Davis

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j

Artificial Intelligence: Facts and MythsJoaquim Jorge

Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun

Finology Group – Insurtech Innovation Award 2024The Digital Insurer

IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo

A Year of the Servo Reboot: Where Are We Now?Igalia

🐬 The future of MySQL is Postgres 🐘RTylerCroy

Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge

08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls

TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc

Histor y of HAM Radio presentation slidevu2urc

Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1

RedisDay London 2018 - RedisSearch Deep Dive

1. RediSearch Deep Dive KYLEDAVIS,REDISLABS

2. ➔ ➔

3. RediSearch can be used for two [primary] things: Full Text Search Secondary Index

4. Search Aggregate RediSearch can do three things. Autocomplete Query

5. • Create a schema using four types Text Numeric Tag Geospatial • Add Documents in Real Time Directly From Hash Index only • Search & Aggregate • Delete documents as needed • Drop the whole index Data Lifecycle in RediSearch / Search and Aggregate

6. Search & Query

7. • Goals Intentionally not SQL But familiar Exposable to end-users • Simple No knowledge of data/structure needed • Powerful With knowledge, zero in on data Query Language

8. ford truck Query Syntax

9. AND / OR / NOT / Exact Phrase / Geospatial / Tags / Prefix / Number Ranges / Optional Terms & more Query Syntax – more advanced And combine them all into one query: (chev*|ford) -explorer ~truck @year:[2001 2011] @location:[74 40 100 km] @condition:{ good | verygood }

10. • Stop words: ”a fox in the woods” -> “fox woods” • Stemming: Query “going” -> find “going” ”go” “gone” • Slop: Query: “glass pitcher”, slop 2 -> “glass gallon beer pitcher” • With or without content: Query: “To be or not to be” -> Hamlet (without the whole play) Matched text highlight/summary: Query – “To be or not to be” -> Hamlet. <b>To be, or not to be</b>- that is the question Full-text Search

11. •Synonyms Query “Bob” -> Find documents with “Robert” •Query Spell Check ”a fxo in the woods” -> Did you mean “a fox in the woods” •Phonetic Search “John Smith” -> “Jon Smyth” Full-text Search

12. • Each field can have a weight which influences the rank in the returned result • Each document can have a score to influence rank • Built-in Scoring Functions Default: TF-IDF / term frequency–inverse document frequency • Variant: DOCNORM • Variant: BM25 DISMAX (Solr’s default) DOCSCORE HAMMING for binary payloads • Fields can be independently sortable, which trumps any in-built scorcer Scoring, Weights, and Sorting

13. Aggregations

14. Aggregations • Processes and transforms • Same query language as search • Can group, sort and apply transformations • Follows pipeline of composable actions: Filter Group Apply Sort Apply Reduce Reduce

15. Grouping & Applications • Reducers: COUNT COUNT_DISTINCT COUNT_DISTINCTISH SUM MIN MAX AVG STDDEV QUANTILE TOLIST FIRST_VALUE RANDOM_SAMPLE • Manipulate Strings • substr(upper('hello world'),0,3) -> “HEL” Numbers w/ Arithmetic • sqrt(log(foo) * floor(@bar/baz)) + (3^@qaz % 6) Timestamp to Calendar • timefmt(@mytimestamp, "%b %d %Y %H:%M:%S”) -> Feb 24 2018 00:05:48

16. FT.AGGREGATE shipments "@box_area:[300 +inf]” APPLY "year(@shipment_timestamp / 1000)" AS shipment_year GROUPBY 1 @shipment_year REDUCE COUNT 0 AS shipment_count SORTBY 2 @shipment_count DESC LIMIT 0 3 APPLY "format("%sk+ Shipments",floor(@shipment_count / 1000))" AS shipment_count RediSearch in Action: FT.AGGREGATE

17. Autocomplete

18. Legacy, Dataset-Based Autocomplete Add data Prefix Search Your general dataset is the source for the autocomplete Each keystroke you do a prefix search on your whole dataset.

19. •Frequency != Good Search Query •Frequency != Quality •Expensive & Slow •Ignores context Here be dragons

20. • Radix Trie • Store a collection of suggestion at a single Key • Each suggestion has a score • You can use it with anything that has: Search box User interacts with results • API FT.SUGGET FT.SUGADD FT.SUGDEL FT.SUGLEN RediSearch Suggestion Engine

21. Search box Results Search hit/miss User-Directed Autocomplete FT.SUGGET mySug {user input} Instead of going directly to results go through RediSearch-based forwarder FT.SUGADD mySug {user input} 1 INCR

RedisDay London 2018 - RedisSearch Deep Dive

Recomendados

Recomendados

Más contenido relacionado

La actualidad más candente

La actualidad más candente (7)

Similar a RedisDay London 2018 - RedisSearch Deep Dive

Similar a RedisDay London 2018 - RedisSearch Deep Dive (20)

Más de Redis Labs

Más de Redis Labs (20)

Último

Último (20)

RedisDay London 2018 - RedisSearch Deep Dive