Enrich Search User Experience For Different Parts
of Your Application Using Amazon CloudSearch
Jon Handler, CloudSearch Solution Architect
November 15, 2013

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Agenda
•  Sourcing your documents
•  Retrieval and ranking
•  Search user interface
•  Performance and Scale
•  Developer example: Peter Simpkin, Solution Architect, Elsevier
Architecting with CloudSearch
Hands-Off Operation
[Diagram: a grid of search instances. Index partitions 1 to n scale with "Document Quantity and Size"; copies 1 to n of each partition scale with "Search Request Volume and Complexity".]
MovieMate Application
•  Multiple Sources
•  Multiple Functions
•  Mobile Experience
[Screenshot: mobile app search for "Iron Man", with Cancel and Done buttons and a tab bar: Movies, Search, Social, Nearby, Account]

Iron Man 3 (2013)
When Tony Stark's world is torn apart by a formidable terrorist called the Mandarin, he starts an odyssey of rebuilding and retribution.

Iron Man 2 (2010)
Tony Stark has declared himself Iron Man and installed world peace... or so he thinks. He soon realizes that not only is there a mad man...

Iron Man (2008)
When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil.

The Man With The Iron Fists (2012)
On the hunt for a fabled treasure of gold, a band of warriors, assassins, and a rogue British soldier descend upon a village in feudal China, where a humble blacksmith...
Agenda
•  Sourcing your documents
•  Retrieval and ranking
•  Search user interface
•  Performance and Scale
•  Developer example: Peter Simpkin, Elsevier Oxford
CloudSearch Documents
•  Unique identifier
•  Version
•  Fields
–  Indexed according to configuration
–  Source of matches

Application Content
Stores: Amazon RDS, DynamoDB, Amazon S3
•  User actions
•  Help files
•  Movie data
•  Theater data
•  Media (clips, images)
•  User reviews, lists etc.
•  Articles

Bootstrap Strategy
[Diagram components: Source System, Processing Script, Amazon EC2 (two instances), Amazon SQS, Amazon CloudSearch; stages labeled Queuing and Batching]
Document Construction
•  One source will be the master

for each record
    determine doc id and version
    create fields
    for each auxiliary source
        gather additional data
    send or queue the document
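The loop above can be sketched in Python. This is a minimal sketch, assuming the 2011-02-01 document batch layout (`type`, `id`, `version`, `lang`, `fields`); the record shape, the auxiliary-source callables, and the helper names `build_document` and `build_batch` are hypothetical.

```python
import json
import time


def build_document(record, aux_sources, version=None):
    """Build one 'add' operation for a document batch.

    record: dict from the master source; assumed to carry an 'id' key.
    aux_sources: callables that take the record and return extra fields.
    """
    # Default to a timestamp so later updates to the same id carry a
    # higher version than earlier ones.
    if version is None:
        version = int(time.time())
    fields = {k: v for k, v in record.items() if k != "id"}
    for source in aux_sources:
        fields.update(source(record))  # gather additional data
    return {
        "type": "add",
        "id": str(record["id"]),
        "version": version,
        "lang": "en",
        "fields": fields,
    }


def build_batch(records, aux_sources, version=None):
    """Serialize records into one JSON batch, ready to send or queue."""
    return json.dumps([build_document(r, aux_sources, version) for r in records])


# Hypothetical master record plus one auxiliary source (ratings).
movie = {"id": "tt0371746", "title": "Iron Man", "year": 2008}
ratings = lambda rec: {"user_rating": 79}
batch = build_batch([movie], [ratings], version=1)
```

Sending the batch is left out on purpose; in the bootstrap and update designs shown in this deck, the serialized batch would be queued or posted by a separate worker.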
  

Relational DB
[Diagram: relational schema with Movie, Theater, Showtimes, and Addresses tables. Column labels include Title, Description, Name, Street, City, State, Date, and Time, plus the keys AddressesID, TheaterID, and ShowtimesID.]
S3
•  Clips, images, reviews
•  Apache Tika to extract content
•  S3 Metadata for additional fields


DynamoDB
DynamoDB      CloudSearch
Table         Domain
Item          Document
Attribute     Field
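The item-to-document mapping can be illustrated with a small converter. This is a sketch only: it assumes items arrive in DynamoDB's typed attribute format (`{"S": ...}`, `{"N": ...}`, `{"SS": [...]}`), and the function name `item_to_fields` and the sample item are hypothetical.

```python
def item_to_fields(item):
    """Convert a DynamoDB item in typed wire format to a flat dict of fields."""
    fields = {}
    for name, typed in item.items():
        # Each attribute is a one-entry dict: {type_code: value}.
        (kind, value), = typed.items()
        if kind == "S":
            fields[name] = value
        elif kind == "N":
            # Numbers arrive as strings; keep integers as integers.
            fields[name] = int(value) if value.lstrip("-").isdigit() else float(value)
        elif kind == "SS":
            # A string set maps naturally to a multi-valued field.
            fields[name] = sorted(value)
        # Other attribute types are skipped in this sketch.
    return fields


item = {
    "title": {"S": "Iron Man"},
    "year": {"N": "2008"},
    "genre": {"SS": ["Sci-Fi", "Action"]},
}
fields = item_to_fields(item)
```

Each converted item would become the `fields` of one document, with the table corresponding to the search domain.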

Searching Show Times
id  title     description  t_name  t_street  date   time
1   Iron Man  ...          Galaxy  Main      11/11  12:30pm
2   Iron Man  ...          Galaxy  Main      11/11  1:15pm
3   Iron Man  ...          Galaxy  Main      11/11  2:45pm
4   Iron Man  ...          Galaxy  Main      11/11  6:00pm
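The show-times table is a denormalized view: one searchable document per showing, with the movie and theater columns repeated. A minimal sketch of producing those rows, assuming simple dict inputs (the function name and input shapes are hypothetical):

```python
def showtime_documents(movie, theater, showtimes):
    """Yield one flat document per (movie, theater, showtime) combination."""
    for n, (date, time_of_day) in enumerate(showtimes, start=1):
        yield {
            "id": n,
            "title": movie["title"],
            "description": movie["description"],
            "t_name": theater["name"],
            "t_street": theater["street"],
            "date": date,
            "time": time_of_day,
        }


movie = {"title": "Iron Man", "description": "..."}
theater = {"name": "Galaxy", "street": "Main"}
times = [("11/11", "12:30pm"), ("11/11", "1:15pm"),
         ("11/11", "2:45pm"), ("11/11", "6:00pm")]
docs = list(showtime_documents(movie, theater, times))
```

The repetition costs index space but lets a single query match on title, theater, and time together.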
Heterogeneous Data

Multi Domain

Updating CloudSearch
[Diagram components: Users, Web Server (Amazon EC2), Amazon SQS, Update Processor (Amazon EC2), DynamoDB, Amazon RDS, Amazon S3, Amazon CloudSearch]
Section Summary
•  Multiple sources
•  Bootstrap / Update
•  Heterogeneous data

[Screenshot: the "Iron Man" search results in the mobile app (Iron Man 3, Iron Man 2, Iron Man, The Man With The Iron Fists), annotated "Good Matches"]
The Search Algorithm
•  Locate documents that satisfy Boolean
constraints
–  Usually intersection

•  Relevance rank those documents
–  Differentiates from databases by relevance

Document Structure
Movie
title
description
user_rating
likes
release_date
latitude
longitude

Configuring for Search


•  Text fields for individual word search
–  User-generated and external text – titles, descriptions

•  Literal fields for exact matches
–  Application-generated text like facets

•  Integer fields for range searching and ranking
Searching Text


http(s)://<endpoint>/2011-02-01/search?
•  Simple searches
–  q=<text>

•  Filtering
–  bq= (or title:'iron' (and description:'iron' description:'man'))

•  Filtering with integer ranges
–  bq=(and 'iron man' year:..2010)

•  Geo filtering
–  bq=(and 'iron man' latitude:12700..12900 longitude:5700..5800)
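Queries like these need URL encoding before they go on the wire. The following sketch builds search URLs for the 2011-02-01 API with the standard library; the endpoint host and the helper name `search_url` are placeholders, not real values.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint; substitute your own domain's search endpoint.
ENDPOINT = "https://search-example.us-east-1.cloudsearch.amazonaws.com"


def search_url(q=None, bq=None, **extra):
    """Build a search URL from a simple query and/or a Boolean query."""
    params = {}
    if q is not None:
        params["q"] = q
    if bq is not None:
        params["bq"] = bq
    params.update(extra)
    # urlencode percent-encodes quotes, parentheses, and colons in bq.
    return ENDPOINT + "/2011-02-01/search?" + urlencode(params)


simple = search_url(q="iron man")
ranged = search_url(bq="(and 'iron man' year:..2010)")

# Decoding the query string recovers the original expression.
decoded_bq = parse_qs(urlparse(ranged).query)["bq"][0]
```

Keeping the Boolean expression as a plain string until the last step makes it easier to log and compare queries while tuning.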
Search Results


{"rank": "-text_relevance",
 "match-expr": "(label 'iron man')",
 "hits": { "found": 204, "start": 0,
           "hit": [ { "id": "sontsst12cf5f88b42" },
                    { "id": "sopvopr12ab017f082" },
                    { "id": "sorzrpw12ac468a13b" },
                  ] },
 ...
}
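Reading such a response comes down to pulling `found` and the hit ids out of the `hits` object. A minimal sketch, using a trimmed copy of the response above as sample input (the helper name `hit_ids` is hypothetical):

```python
import json

response_text = """
{"rank": "-text_relevance",
 "match-expr": "(label 'iron man')",
 "hits": {"found": 204, "start": 0,
          "hit": [{"id": "sontsst12cf5f88b42"},
                  {"id": "sopvopr12ab017f082"},
                  {"id": "sorzrpw12ac468a13b"}]}}
"""


def hit_ids(text):
    """Return (total match count, document ids on this page of results)."""
    hits = json.loads(text)["hits"]
    return hits["found"], [h["id"] for h in hits["hit"]]


found, ids = hit_ids(response_text)
```

Note that `found` counts all matches, while `hit` holds only the current page, so the two differ for any query with more matches than the page size.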
  
[Screenshot: the "Iron Man" search results in the mobile app (Iron Man 3, Iron Man 2, Iron Man, The Man With The Iron Fists), annotated "Relevant Results"]
Customizing Ranking
•  text_relevance and cs.text_relevance
•  Rank expressions
–  Compute a score for each document
–  &rank=<function>

•  Defined in the console
•  Defined at query-time
–  &q='iron-man'&rank-recency=text_relevance + year&rank=recency

Field Weighting
•  Adjust relative importance of fields
•  &rank-title=cs.text_relevance({"weights":{"title":4.0},"default_weight":1})
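Building the weighting expression by hand is error-prone because of the nested JSON. The sketch below generates it from a dict; the helper name `weighted_rank_params` is hypothetical, and pairing the definition with `rank=-title` (descending by the new expression) is an assumption about how the expression would be applied, not something the slide states.

```python
import json
from urllib.parse import urlencode


def weighted_rank_params(name, weights, default_weight=1):
    """Return query parameters that define and apply a weighted rank expression."""
    options = {"weights": weights, "default_weight": default_weight}
    # Compact separators reproduce the form used on the slide.
    expression = "cs.text_relevance(" + json.dumps(options, separators=(",", ":")) + ")"
    # Assumption: sort descending on the new expression.
    return {"rank-" + name: expression, "rank": "-" + name}


params = weighted_rank_params("title", {"title": 4.0})
query_string = urlencode({"q": "iron man", **params})
```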

Popularity
•  Convert floating point to integer
•  Weight by the number of ranks
•  rank-pop=text_relevance + log10(user-rating * number-user-ranks) * 10 + metascore * 3
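To get a feel for how the terms in the popularity expression trade off, the same arithmetic can be run client-side. This is an illustrative sketch with made-up input values; the guard for unrated documents is an added assumption, since log10 is undefined at zero.

```python
import math


def popularity_score(text_relevance, user_rating, number_user_ranks, metascore):
    """Client-side mirror of the rank-pop expression, for inspection only."""
    product = user_rating * number_user_ranks
    # Assumption: unrated documents get no rating boost instead of an error.
    rating_boost = math.log10(product) * 10 if product > 0 else 0.0
    return text_relevance + rating_boost + metascore * 3


# A rating of 8.0 stored as the integer 80, with 125 user ratings.
score = popularity_score(text_relevance=500, user_rating=80,
                         number_user_ranks=125, metascore=70)
unrated = popularity_score(text_relevance=500, user_rating=0,
                           number_user_ranks=0, metascore=70)
```

The logarithm keeps heavily rated titles from swamping text relevance: a tenfold increase in ratings adds only 10 points.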
Freshness
•  Exponential decay function
$r = c\,e^{-\lambda t}$

•  &rank-decay=text_relevance + 200*Math.exp(-0.1*days_ago)
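The decay term is easy to evaluate offline to see how quickly the boost fades. A minimal sketch using the constants from the rank expression above (c = 200, decay rate 0.1 per day); the function name is hypothetical.

```python
import math


def freshness_boost(days_ago, c=200.0, decay=0.1):
    """Exponential decay r = c * e^(-decay * t), with t measured in days."""
    return c * math.exp(-decay * days_ago)


boosts = {d: freshness_boost(d) for d in (0, 7, 30)}
half_life_days = math.log(2) / 0.1  # time for the boost to halve
```

With these constants the boost halves roughly every seven days, so a month-old document retains only about a twentieth of the boost a new one gets.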
Location Sort
[Screenshot: mobile app search for "Iron Man", with a Done button and a tab bar: Movies, Search, Social, Nearby, Account]
Location Sort
Movie
title
description
user_rating
likes
release_date
latitude
longitude

•  Latitude and longitude expressed as integers
•  Denormalized for particular theaters with locations
Location Sort
•  Cartesian distance function
$\sqrt{(lat - lat_{user})^2 + (lon - lon_{user})^2}$

•  &rank-geo=sqrt(pow(latitude - lat, 2) + pow(longitude - lon, 2))
•  &rank=-geo
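The distance function can be checked outside the search service. The sketch below applies the same Cartesian distance to a few theaters and sorts nearest first; the theater names and coordinates are made up, and coordinates are assumed to be integer-scaled as described for the indexed fields.

```python
import math


def distance(lat, lon, user_lat, user_lon):
    """Cartesian distance in the same integer units as the indexed fields."""
    return math.sqrt((lat - user_lat) ** 2 + (lon - user_lon) ** 2)


# Hypothetical theaters with integer-scaled coordinates.
theaters = [
    {"name": "Galaxy", "latitude": 12750, "longitude": 5730},
    {"name": "Orion", "latitude": 12800, "longitude": 5760},
    {"name": "Nova", "latitude": 12703, "longitude": 5704},
]
user_lat, user_lon = 12700, 5700
nearest_first = sorted(
    theaters,
    key=lambda t: distance(t["latitude"], t["longitude"], user_lat, user_lon),
)
```

Because only the ordering matters for sorting, the square root could be dropped without changing the result; it is kept here to match the rank expression.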

Rank Expressions: Combined


•  &rank-combined=text_relevance + 2.0 * geo + 0.5 * popularity + 0.3 * freshness
•  &rank=combined
Section Summary
•  Search API basics
•  Customizing ranking
–  Field weighting, popularity, freshness, GEO, combined

•  Rank expression comparison tool

Facets
Simple Faceting: Document

Movie
title
description
genre

Simple Faceting: Configuration

Simple Faceting: Query
q=iron+man&facet=genre
{"rank": "-text_relevance",
 "match-expr": "(label 'star wars')",
 "hits": {"found": 7, "start": 0, "hit": []
         },
 "facets": {
   "genre": {
     "constraints": [
       {"value": "Family", "count": 62},
       {"value": "Action/Adventure", "count": 21},
       {"value": "Drama", "count": 5 },
Simple Faceting: UI
<div class='facet'>
    <ul class='facet_list'>
        <?php
            // Facet constraints for the genre field, taken from the decoded search response
            $genres = $resultsObj->facets->genre->constraints;
            for ($i = 0; $i < count($genres); $i++) {
                // Each constraint carries the facet value and its document count
                $curGenre = $genres[$i]->value;
                $curCount = $genres[$i]->count;
        ?>
        <li class='facet_item'>
            <div class='facet_name'><?=$curGenre?></div>
            <div class='facet_count'><?=$curCount?></div>
        </li>
        <?php } ?>
    </ul>
</div>
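The same extraction step, independent of the templating language, amounts to turning the `constraints` array into (value, count) rows. A minimal Python sketch, assuming the response has already been decoded into a dict; the helper name `facet_rows` and the choice to sort by count are illustrative.

```python
def facet_rows(results, field):
    """Return (value, count) pairs for one facet field, largest count first."""
    # Missing facets or fields yield an empty list rather than an error.
    constraints = results.get("facets", {}).get(field, {}).get("constraints", [])
    rows = [(c["value"], c["count"]) for c in constraints]
    return sorted(rows, key=lambda row: row[1], reverse=True)


results = {
    "facets": {
        "genre": {
            "constraints": [
                {"value": "Drama", "count": 5},
                {"value": "Family", "count": 62},
                {"value": "Action/Adventure", "count": 21},
            ]
        }
    }
}
rows = facet_rows(results, "genre")
```

Each row then maps to one list item in the facet UI, with the value as the label and the count beside it.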
  

Facets

Document
Movie fields: title, description, oscar1, oscar2, oscar3

•  title: Lincoln
•  description: ...
•  oscar1: Awards
•  oscar2: Awards/Best Actor
•  oscar3: Awards/Best Actor/Daniel Day Lewis
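The per-level fields follow a simple pattern: level n holds the first n segments of the path. A minimal sketch that derives them from a full path, assuming "/" as the separator as in the example; the helper name `hierarchy_fields` is hypothetical.

```python
def hierarchy_fields(path, prefix="oscar", separator="/"):
    """Expand 'A/B/C' into {prefix1: 'A', prefix2: 'A/B', prefix3: 'A/B/C'}."""
    parts = path.split(separator)
    return {
        f"{prefix}{depth}": separator.join(parts[:depth])
        for depth in range(1, len(parts) + 1)
    }


fields = hierarchy_fields("Awards/Best Actor/Daniel Day Lewis")
```

Storing the full prefix at every level, rather than just the last segment, keeps values unambiguous when the same name appears under different parents.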
Query
&q=lincoln&facet=oscar1,oscar2,oscar3
{"rank": "-text_relevance", "hits":{...},
 "facets": {
   "oscar1": {
     "constraints": [
       {"value": "Awards", "count": 23},
       {"value": "Nominations", "count": 124}]},
   "oscar2": {
     "constraints": [
       {"value": "Awards/Best Actor", "count": 6},
       {"value": "Awards/Best Actress", "count": 3}...]},
   "oscar3": {
     "constraints": [
       {"value": "Awards/Best Actor/Daniel Day Lewis", "count": 1},
       {"value": "Awards/Best Actor/Denzel Washington", "count": 2}...]},
Drilldown
•  bq=oscar1:'Awards'
•  bq=oscar2:'Awards/Best Actor'
•  bq=oscar3:'Awards/Best Actor/Daniel Day Lewis'
•  bq=(and 'star' oscar2:'Awards/Best Actor')
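When a user clicks a facet value, the drilldown query can be derived from the value itself, since its depth picks the field. A minimal sketch under that assumption; the helper name `drilldown_bq` is hypothetical, and values containing single quotes would need escaping, which is not handled here.

```python
def drilldown_bq(path, text=None, prefix="oscar", separator="/"):
    """Build a bq that filters on the facet level matching the depth of `path`."""
    depth = len(path.split(separator))
    facet_clause = f"{prefix}{depth}:'{path}'"
    if text is None:
        return facet_clause
    # Combine the user's text query with the facet filter.
    return f"(and '{text}' {facet_clause})"


level1 = drilldown_bq("Awards")
level2_with_text = drilldown_bq("Awards/Best Actor", text="star")
```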
Section Summary
•  Simple faceting
•  Hierarchical faceting
•  Hierarchical data handling

Performance Best Practices


•  Match set size
•  Text queries perform better than integer queries
•  Complex relevance functions
Optimizing Index Size


•  Trade off literal and uint for cost/performance
•  Result fields matter most
•  Enabling faceting increases size
Wrap Up
•  Sourcing documents from various locations
•  Building queries and ranking
•  UI Components for faceting
•  Getting the most out of your index
Agenda
•  Elsevier Intro
•  Search Problem Statement
•  Enterprise Content Search
•  Hints and Tips
•  CloudSearch Observations
•  7,000+ employees in 26 countries
•  2,200 journals / article market share 25%
•  $3B revenue
•  Scientific, Technical & Medical
Customers
•  Academic Research Institutions
•  Government & Health
•  Corporate Research Labs
•  Individual Researchers

Products
Content Challenges:
•  No central place for consumers to discover content
•  It is not currently possible to search and retrieve atomic assets
•  Assets are not reusable across products

Content Systems

Consumer Platforms
Empower our product development partners
Search Opportunities:
•  Create a comprehensive inventory to easily discover content Elsevier owns
•  Provide access to the granular / modular content they want at will
•  Assets must be uniquely addressable

Enterprise Content Search Engine
Enterprise Content Search eco-system

[Diagram components: Amazon SWF, SDF metadata, Amazon S3, Amazon CloudSearch, DynamoDB, Federated Content Warehouse, E.U. Corporate Data Center, U.S. Corporate Data Center, Product Platform Data Center]
Simple Search UI
Elsevier Technical Drivers & Approach
•  Fully-managed, full featured search service in
the cloud
•  Automatically scales for data & traffic
•  Easy to set up and use
•  PoC created in days
•  Search Engine as a Service
•  Pay-as-you-go pricing model
Hints & Tips
(and issn:'0022-1694'
  (and type:'1.2'
    (and (not action:'D')
      (or (and pubstartdate:..2013176 pubenddate:2005002..)
        (or (and pubstartdate:2005001
                 (and pubstarttime:0.. pubstarttime:..235959))
          (or (and pubstartdate:2013177 pubstarttime:..235959)
            (or (and pubenddate:2005001 pubendtime:0..)
              (and pubenddate:2013177
                   (and pubendtime:..235959 pubendtime:0..)))))))))

•  Query Response Time = 5 seconds
Optimising Nested Queries
(and issn:'0022-1694' type:'1.2' 
(not action:'D')
(or (and pubstartdate:..2013176 pubenddate:2005002..)
         (and pubstartdate:2005001 pubstarttime:0..235959)
         (and pubstartdate:2013177 pubstarttime:0..235959)
         (and pubenddate:2005001 pubendtime:0..)
         (and pubenddate:2013177 pubendtime:0..235959)))

•  Response Time = 2.5 seconds
Optimised Nested Query
(and (not action:'D')
     (or (and issn:'0022-1694' type:'1.2'
              pubstartdate:..2013176 pubenddate:2005002..)
         (and issn:'0022-1694' type:'1.2'
              pubstartdate:2005001 pubstarttime:0..235959)
         (and issn:'0022-1694' type:'1.2'
              pubstartdate:2013177 pubstarttime:0..235959)
         (and issn:'0022-1694' type:'1.2'
              pubenddate:2005001 pubendtime:0..)
         (and issn:'0022-1694' type:'1.2'
              pubenddate:2013177 pubendtime:0..235959)))

•  Response Time = 0.17ms
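The rewrite in the optimised query is mechanical: the terms shared by every alternative are copied into each branch of the `or`. A minimal sketch of generating such a query string, so the distributed form does not have to be maintained by hand; the helper name `distribute` is hypothetical and only two of the five branches are shown.

```python
def distribute(common, branches, excluded=None):
    """Push the shared terms into every OR branch and return the bq string."""
    flattened = [f"(and {' '.join(common + branch)})" for branch in branches]
    query = f"(or {' '.join(flattened)})"
    if excluded:
        # Wrap the alternatives with the exclusion clause.
        query = f"(and (not {excluded}) {query})"
    return query


common = ["issn:'0022-1694'", "type:'1.2'"]
branches = [
    ["pubstartdate:..2013176", "pubenddate:2005002.."],
    ["pubstartdate:2005001", "pubstarttime:0..235959"],
]
bq = distribute(common, branches, excluded="action:'D'")
```

Generating the query from lists also makes it straightforward to time the nested and distributed forms against each other, as was done for the response times quoted in this section.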
CloudSearch Observations
•  Facilitates knowledge sharing on content matters across Elsevier’s product platforms
•  Ability to leverage content infrastructure and capabilities across Elsevier’s divisions
•  Easy to integrate with existing on-premise Content Systems
•  Speed to market; allows developers to focus on building other core Content Strategy components
•  Need to spend time optimising queries to maximise performance
Please give us your feedback on this
presentation

SVC302
As a thank you, we will select prize
winners daily for completed surveys!

Thank You

Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 

Amazon Cloudsearch Session With Elsevier: re:Invent 2013

  • 1. Enrich Search User Experience For Different Parts of Your Application Using Amazon CloudSearch Jon Handler, CloudSearch Solution Architect November 15, 2013 © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  • 2. Agenda
      •  Sourcing your documents
      •  Retrieval and ranking
      •  Search user interface
      •  Performance and Scale
      •  Developer example: Peter Simpkin, Solution Architect, Elsevier
  • 6. Hands-Off Operation [Diagram: a grid of search instances. Index partitions 1 to n are added as document quantity and size grow; copies 1 to n of each partition are added as search request volume and complexity grow.]
  • 9. Mobile Experience [Screenshot: mobile search for "Iron Man" returning Iron Man 3 (2013), Iron Man 2 (2010), Iron Man (2008), and The Man With The Iron Fists (2012), each with a short synopsis. Tab bar: Movies, Search, Social, Nearby, Account.]
  • 10. Agenda
      •  Sourcing your documents
      •  Retrieval and ranking
      •  Search user interface
      •  Performance and Scale
      •  Developer example: Peter Simpkin, Elsevier Oxford
  • 11. CloudSearch Documents
      •  Unique identifier
      •  Version
      •  Fields
         –  Indexed according to configuration
         –  Source of matches
  • 12. [Diagram: data sources Amazon RDS, DynamoDB, and Amazon S3, holding application content, user actions, help files, movie data, theater data, media (clips, images), user reviews, lists etc., and articles.]
  • 13. Bootstrap Strategy [Diagram labels: Source System, Processing Script, Queuing, Batching, Amazon EC2 (twice), Amazon SQS, Amazon CloudSearch.]
  • 14. Document Construction
      •  One source will be the master
      for each record
        determine doc id and version
        create fields
        for each auxiliary source
          gather additional data
        send or queue the document
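The loop on slide 14 can be sketched in Python as follows. This is a minimal sketch, not the presenter's code: it assumes the JSON batch format of the 2011-02-01 document service (one `add` operation per document with `id`, `version`, `lang`, and `fields`), uses a Unix timestamp as the version, and assumes a 5 MB batch size limit. The names `build_document`, `batch_documents`, and the auxiliary lookup callables are hypothetical.

```python
import json
import time


def build_document(master, auxiliary_sources, version=None):
    """Build one "add" operation from a master record.

    `master` is a dict with an "id" key plus field values. Each entry in
    `auxiliary_sources` is a hypothetical callable that takes the document
    id and returns a dict of additional fields.
    """
    doc_id = str(master["id"])
    # Assumption: a Unix timestamp serves as the document version.
    doc_version = int(time.time()) if version is None else version
    fields = {k: v for k, v in master.items() if k != "id"}
    for lookup in auxiliary_sources:
        fields.update(lookup(doc_id))  # gather additional data
    return {"type": "add", "id": doc_id, "version": doc_version,
            "lang": "en", "fields": fields}


def batch_documents(docs, max_bytes=5 * 1024 * 1024):
    """Group documents into JSON array batches that stay under max_bytes."""
    batches, current, size = [], [], 2  # 2 bytes for the enclosing "[]"
    for doc in docs:
        # +2 accounts for the ", " separator json.dumps puts between items.
        encoded = len(json.dumps(doc).encode("utf-8")) + 2
        if current and size + encoded > max_bytes:
            batches.append(json.dumps(current))
            current, size = [], 2
        current.append(doc)
        size += encoded
    if current:
        batches.append(json.dumps(current))
    return batches
```

Each string returned by `batch_documents` could then be sent to the document endpoint directly or placed on a queue, as in the bootstrap diagram on slide 13.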
  • 16. S3
      •  Clips, images, reviews
      •  Apache Tika to extract content
      •  S3 Metadata for additional fields
  • 17. DynamoDB [Diagram: mapping from DynamoDB to CloudSearch. Table maps to Domain, Item maps to Document, and each Attribute maps to a Field.]
  • 18. [Screenshot: the "Iron Man" mobile search results shown on slide 9.]
  • 19. Searching Show Times
      id  title     description  t_name  t_street  date   time
      1   Iron Man  ...          Galaxy  Main      11/11  12:30pm
      2   Iron Man  ...          Galaxy  Main      11/11  1:15pm
      3   Iron Man  ...          Galaxy  Main      11/11  2:45pm
      4   Iron Man  ...          Galaxy  Main      11/11  6:00pm
  • 20. Heterogeneous Data
  • 21. Multi Domain
  • 22. Updating CloudSearch [Diagram labels: Users, Web Server, Update Processor, Amazon EC2 (twice), Amazon SQS, DynamoDB, Amazon RDS, Amazon S3, Amazon CloudSearch.]
  • 23. Section Summary
      •  Multiple sources
      •  Bootstrap / Update
      •  Heterogeneous data
  • 25. Good Matches [Screenshot: the "Iron Man" search results (Iron Man 3, Iron Man 2, Iron Man, The Man With The Iron Fists), labeled "Good Matches".]
  • 26. The Search Algorithm
      •  Locate documents that satisfy Boolean constraints
         –  Usually intersection
      •  Relevance rank those documents
         –  Differentiates from databases by relevance
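The two steps on slide 26 (match by Boolean intersection, then rank) can be illustrated with a toy inverted index. This is only a sketch of the general idea: the term-frequency score below is a stand-in chosen for brevity and is not CloudSearch's text_relevance, and the function names are made up for the example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, docs, query):
    """Return ids of documents containing every query term, best first."""
    terms = query.lower().split()
    if not terms:
        return []
    # Step 1: Boolean constraint, here the intersection of posting sets.
    matches = set.intersection(*(index.get(t, set()) for t in terms))

    # Step 2: relevance ranking with a toy score, the fraction of the
    # document's words that are query terms.
    def score(doc_id):
        words = docs[doc_id].lower().split()
        return sum(words.count(t) for t in terms) / len(words)

    return sorted(matches, key=lambda d: (-score(d), d))


movies = {
    "a": "Iron Man",
    "b": "The Man With The Iron Fists",
    "c": "The Iron Giant",
}
print(search(build_index(movies), movies, "iron man"))
```

A database-style lookup would stop after step 1; the ordering in step 2 is what the slide means by differentiating from databases by relevance.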
  • 27. Document Structure
      Movie: title, description, user_rating, likes, release_date, latitude, longitude
  • 28. Configuring for Search
      •  Text fields for individual word search
         –  User-generated and external text – titles, descriptions
      •  Literal fields for exact matches
         –  Application-generated text like facets
      •  Integer fields for range searching and ranking
  • 29. Searching Text
      http(s)://<endpoint>/2011-02-01/search?
      •  Simple searches
         –  q=<text>
      •  Filtering
         –  bq=(or title:'iron' (and description:'iron' description:'man'))
      •  Filtering with integer ranges
         –  bq=(and 'iron man' year:..2010)
      •  Geo filtering
         –  bq=(and 'iron man' latitude:12700..12900 longitude:5700..5800)
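Because bq expressions contain spaces, quotes, colons, and parentheses, they need URL encoding before being sent. A minimal sketch using only the Python standard library, assuming the 2011-02-01 path shown on the slide; the endpoint name and the `search_url` helper are hypothetical.

```python
from urllib.parse import urlencode

API_VERSION = "2011-02-01"  # API version shown on slide 29


def search_url(endpoint, q=None, bq=None, extra=None):
    """Build a search URL with properly encoded q/bq parameters.

    `extra` is an optional dict for other parameters, such as rank or facet.
    """
    params = {}
    if q is not None:
        params["q"] = q
    if bq is not None:
        params["bq"] = bq
    if extra:
        params.update(extra)
    return "https://{}/{}/search?{}".format(
        endpoint, API_VERSION, urlencode(params))


# "search-example.invalid" is a placeholder, not a real search endpoint.
print(search_url("search-example.invalid", q="iron man"))
print(search_url("search-example.invalid",
                 bq="(and 'iron man' year:..2010)"))
```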
  • 30. Search Results
      {"rank": "-text_relevance",
       "match-expr": "(label 'iron man')",
       "hits": {"found": 204, "start": 0,
                "hit": [{"id": "sontsst12cf5f88b42"},
                        {"id": "sopvopr12ab017f082"},
                        {"id": "sorzrpw12ac468a13b"}]},
       ...
      }
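A response shaped like the one on slide 30 returns only document ids by default, so the application typically pulls the ids out and looks up display data elsewhere. A small sketch, assuming the response body is valid JSON with the `hits.found` and `hits.hit` keys shown above; `hit_ids` is a hypothetical helper.

```python
import json


def hit_ids(response_text):
    """Return (total matches, ids on this page) from a search response."""
    body = json.loads(response_text)
    hits = body.get("hits", {})
    return hits.get("found", 0), [h["id"] for h in hits.get("hit", [])]


sample = """
{"rank": "-text_relevance",
 "match-expr": "(label 'iron man')",
 "hits": {"found": 204, "start": 0,
          "hit": [{"id": "sontsst12cf5f88b42"},
                  {"id": "sopvopr12ab017f082"},
                  {"id": "sorzrpw12ac468a13b"}]}}
"""
found, ids = hit_ids(sample)
print(found, ids)
```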
  • 31. Relevant Results [Screenshot: the "Iron Man" search results (Iron Man 3, Iron Man 2, Iron Man, The Man With The Iron Fists), labeled "Relevant Results".]
  • 32. Customizing Ranking
      •  text_relevance and cs.text_relevance
      •  Rank expressions
         –  Compute a score for each document
         –  &rank=<function>
      •  Defined in the console
      •  Defined at query-time
         –  &q='iron-man'&rank-recency=text_relevance + year &rank=recency
  • 34. Field Weighting
      •  Adjust relative importance of fields
      •  &rank-title=cs.text_relevance({"weights":{"title":4.0}, "default_weight":1})
  • 36. Popularity
      •  Convert floating point to integer
      •  Weight by the number of ranks
      •  rank-pop=text_relevance + log10(user-rating * number-user-ranks) * 10 + metascore * 3
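To get a feel for how the popularity expression behaves, the same arithmetic can be evaluated offline. This sketch simply mirrors the formula on slide 36 in Python; the input values are invented for illustration, and the underscore parameter names are Python stand-ins for the slide's field names.

```python
import math


def popularity_score(text_relevance, user_rating, number_user_ranks,
                     metascore):
    """Mirror of: text_relevance + log10(rating * ranks) * 10 + metascore * 3.

    user_rating is assumed to be stored as an integer (for example 79 for
    a rating of 7.9), following the "convert floating point to integer"
    bullet. The product must be positive, or log10 raises ValueError.
    """
    return (text_relevance
            + math.log10(user_rating * number_user_ranks) * 10
            + metascore * 3)


# Invented inputs: the logarithm compresses the rating volume, so going
# from 125 to 12,500 ratings adds only 20 points to the score.
few = popularity_score(200, 8, 125, 70)
many = popularity_score(200, 8, 12500, 70)
print(few, many)
```

The logarithm is the design choice worth noting: without it, a title with a very large number of ratings would swamp text relevance entirely.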
  • 38. Freshness
      •  Exponential decay function: r = c * e^(-λt)
      •  &rank-decay=text_relevance + 200*Math.exp(-0.1*days_ago)
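With c = 200 and λ = 0.1 as in the expression on slide 38, the freshness boost halves roughly every ln(2)/0.1 ≈ 6.9 days. A short sketch that evaluates only the decay term (the function name and default arguments are illustrative):

```python
import math


def freshness_boost(days_ago, scale=200.0, decay=0.1):
    """Decay term from the slide: scale * exp(-decay * days_ago)."""
    return scale * math.exp(-decay * days_ago)


half_life = math.log(2) / 0.1  # days until the boost drops to half
for days in (0, 7, 30, 90):
    print(days, round(freshness_boost(days), 2))
```

Changing λ is the main tuning knob: a smaller value keeps older titles competitive for longer, while a larger one makes the boost matter only for very recent releases.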
  • 40. Location Sort
      Movie: title, description, user_rating, likes, release_date, latitude, longitude
      •  Latitude and longitude expressed as integers
      •  Denormalized for particular theaters with locations
  • 41. Location Sort
      •  Cartesian distance function: sqrt((lat - lat_user)^2 + (lon - lon_user)^2)
      •  &rank-geo=sqrt(pow(latitude - lat, 2) + pow(longitude - lon, 2))
      •  &rank=-geo
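The distance computation can be checked offline with a few lines of Python. This is a sketch under stated assumptions: the slides say only that latitude and longitude are stored as integers, so the encoding below (shift by 90 or 180 degrees, then multiply by 100) is a guess that happens to be consistent with the ranges on slide 29, and the theater data is invented. Cartesian distance on degrees is also only an approximation, reasonable for nearby points.

```python
import math


def to_uint(degrees, offset, scale=100):
    """Assumed encoding: shift to a non-negative value, keep 2 decimals."""
    return round((degrees + offset) * scale)


def geo_distance(latitude, longitude, lat, lon):
    """Cartesian distance between a document's position and the user's."""
    return math.sqrt((latitude - lat) ** 2 + (longitude - lon) ** 2)


# Invented theaters, already encoded as (latitude, longitude) integers.
theaters = {"Galaxy": (12777, 5758), "Plaza": (12780, 5790)}
user = (to_uint(37.77, 90), to_uint(-122.42, 180))

nearest_first = sorted(
    theaters, key=lambda name: geo_distance(*theaters[name], *user))
print(user, nearest_first)
```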
  • 43. Rank Expressions: Combined
      •  &rank-combined=text_relevance + 2.0 * geo + 0.5 * popularity + 0.3 * freshness
      •  &rank=combined
  • 44. Section Summary
      •  Search API basics
      •  Customizing ranking
         –  Field weighting, popularity, freshness, geo, combined
      •  Rank expression comparison tool
  • 48. Simple Faceting: Document
      Movie: title, description, genre
  • 49. Simple Faceting: Configuration
  • 50. Simple Faceting: Query
      q=iron+man&facet=genre
      {"rank": "-text_relevance",
       "match-expr": "(label 'star wars')",
       "hits": {"found": 7, "start": 0, "hit": []},
       "facets": {
         "genre": {
           "constraints": [
             {"value": "Family", "count": 62},
             {"value": "Action/Adventure", "count": 21},
             {"value": "Drama", "count": 5},
             ...
  • 51. Simple Faceting: UI
      <div class='facet'>
        <ul class='facet_list'>
          <?php
            // Each constraint carries a facet value (the genre) and its count.
            $genres = $resultsObj->facets->genre->constraints;
            for ($i = 0; $i < count($genres); $i++) {
              $curGenre = $genres[$i];
              $curCount = $curGenre->count;
          ?>
          <li class='facet_item'>
            <div class='facet_name'><?=$curGenre->value?></div>
            <div class='facet_count'><?=$curCount?></div>
          </li>
          <?php } ?>
        </ul>
      </div>
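The same facet list can be rendered outside PHP. The sketch below is a Python equivalent of the template on slide 51, assuming the `constraints` array from the slide 50 response; `render_facet` is a hypothetical name, and HTML escaping is added because facet values come from indexed data.

```python
from html import escape


def render_facet(constraints):
    """Render facet constraints as the nested div/ul markup from slide 51."""
    items = []
    for constraint in constraints:
        items.append(
            "<li class='facet_item'>"
            "<div class='facet_name'>{}</div>"
            "<div class='facet_count'>{}</div>"
            "</li>".format(escape(str(constraint["value"])),
                           int(constraint["count"])))
    return "<div class='facet'><ul class='facet_list'>{}</ul></div>".format(
        "".join(items))


genres = [
    {"value": "Family", "count": 62},
    {"value": "Action/Adventure", "count": 21},
    {"value": "Drama", "count": 5},
]
print(render_facet(genres))
```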
  • 53. Document
      Movie: title, description, oscar1, oscar2, oscar3
      •  title: Lincoln
      •  description: ...
      •  oscar1: Awards
      •  oscar2: Awards/Best Actor
      •  oscar3: Awards/Best Actor/Daniel Day Lewis
  • 54. Query
      &q=lincoln&facet=oscar1,oscar2,oscar3
      {"rank": "-text_relevance", "hits": {...},
       "facets": {
         "oscar1": {
           "constraints": [
             {"value": "Awards", "count": 23},
             {"value": "Nominations", "count": 124}]},
         "oscar2": {
           "constraints": [
             {"value": "Awards/Best Actor", "count": 6},
             {"value": "Awards/Best Actress", "count": 3}...]},
         "oscar3": {
           "constraints": [
             {"value": "Awards/Best Actor/Daniel Day Lewis", "count": 1},
             {"value": "Awards/Best Actor/Denzel Washington", "count": 2}...]},
  • 55. Drilldown
      •  bq=oscar1:'Awards'
      •  bq=oscar2:'Awards/Best Actor'
      •  bq=oscar3:'Awards/Best Actor/Daniel Day Lewis'
      •  bq=(and 'star' oscar2:'Awards/Best Actor')
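The hierarchical scheme on slides 53 to 55 stores one field per level, each holding the full path down to that level. A small sketch of how those field values and the matching drilldown filter could be generated; the helper names are hypothetical, and values containing a single quote would need escaping that this sketch does not attempt.

```python
def hierarchy_fields(path, prefix="oscar"):
    """Split 'A/B/C' into {prefix1: 'A', prefix2: 'A/B', prefix3: 'A/B/C'}."""
    parts = path.split("/")
    return {
        "{}{}".format(prefix, level + 1): "/".join(parts[:level + 1])
        for level in range(len(parts))
    }


def drilldown_bq(field, value, text=None):
    """Build a bq filter for one facet value, optionally with a text term."""
    term = "{}:'{}'".format(field, value)
    if text is None:
        return term
    return "(and '{}' {})".format(text, term)


print(hierarchy_fields("Awards/Best Actor/Daniel Day Lewis"))
print(drilldown_bq("oscar2", "Awards/Best Actor", text="star"))
```

Storing the full path at every level is what lets a single literal match on `oscar2` select a whole subtree without any prefix matching.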
  • 56. Section Summary
      •  Simple faceting
      •  Hierarchical faceting
      •  Hierarchical data handling
  • 58. The Search Algorithm
      •  Locate documents that satisfy Boolean constraints
         –  Usually intersection
      •  Relevance rank those documents
         –  Differentiates from databases by relevance
  • 59. Performance Best Practices
      •  Match set size
      •  Text queries perform better than integer queries
      •  Complex relevance functions
  • 60. Optimizing Index Size
      •  Trade off literal and uint for cost/performance
      •  Result fields matter most
      •  Enabling faceting increases size
  • 61. Wrap Up
      •  Sourcing documents from various locations
      •  Building queries and ranking
      •  UI Components for faceting
      •  Getting the most out of your index
  • 63. Agenda
      •  Elsevier Intro
      •  Search Problem Statement
      •  Enterprise Content Search
      •  Hints and Tips
      •  CloudSearch Observations
  • 64.
      •  7,000+ employees in 26 countries
      •  2,200 journals / article market share 25%
      •  $3B revenue
      •  Scientific, Technical & Medical
  • 66. Content Challenges:
      •  No central place for consumers to discover content
      •  It is not currently possible to search and retrieve atomic assets
      •  Assets are not reusable across products
      [Diagram labels: Content Systems, Consumer Platforms]
  • 67. Empower our product development partners
      Search Opportunities:
      •  Create a comprehensive inventory to easily discover content Elsevier owns
      •  Provide access to the granular / modular content they want, at will
      •  Assets must be uniquely addressable
      [Diagram label: Enterprise Content Search Engine]
  • 68. Enterprise Content Search eco-system [Diagram labels: Amazon SWF, SDF metadata, Amazon S3, Amazon CloudSearch, DynamoDB, Federated Content Warehouse, E.U. Corporate Data center, U.S. Corporate Data center, Product Platform Data center.]
  • 70. Elsevier Technical Drivers & Approach
      •  Fully-managed, full featured search service in the cloud
      •  Automatically scales for data & traffic
      •  Easy to set up and use
      •  PoC created in days
      •  Search Engine as a Service
      •  Pay-as-you-go pricing model
  • 71. Hints & Tips
      (and issn:'0022-1694'
        (and type:'1.2'
          (and (not action:'D')
            (or (and pubstartdate:..2013176 pubenddate:2005002..)
              (or (and pubstartdate:2005001 (and pubstarttime:0.. pubstarttime:..235959))
                (or (and pubstartdate:2013177 pubstarttime:..235959)
                  (or (and pubenddate:2005001 pubendtime:0..)
                      (and pubenddate:2013177 (and pubendtime:..235959 pubendtime:0..)))))))))
      •  Query Response Time = 5 seconds
  • 72. Optimising Nested Queries
      (and issn:'0022-1694' type:'1.2' (not action:'D')
        (or (and pubstartdate:..2013176 pubenddate:2005002..)
            (and pubstartdate:2005001 pubstarttime:0..235959)
            (and pubstartdate:2013177 pubstarttime:0..235959)
            (and pubenddate:2005001 pubendtime:0..)
            (and pubenddate:2013177 pubendtime:0..235959)))
      •  Response Time = 2.5 seconds
  • 73. Optimised Nested Query
      (and (not action:'D')
        (or (and issn:'0022-1694' type:'1.2' pubstartdate:..2013176 pubenddate:2005002..)
            (and issn:'0022-1694' type:'1.2' pubstartdate:2005001 pubstarttime:0..235959)
            (and issn:'0022-1694' type:'1.2' pubstartdate:2013177 pubstarttime:0..235959)
            (and issn:'0022-1694' type:'1.2' pubenddate:2005001 pubendtime:0..)
            (and issn:'0022-1694' type:'1.2' pubenddate:2013177 pubendtime:0..235959)))
      •  Response Time = 0.17ms
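The rewrite across slides 71 to 73 has two parts: flatten nested two-argument and/or expressions, then repeat the selective literal terms (issn, type) inside every branch of the or. Building such queries by hand is error-prone, so a few helpers can generate them. This is a sketch of the string construction only; the helper names are hypothetical, and whether the distributed form is actually faster will depend on the data and should be measured.

```python
def and_(*terms):
    """Join terms into a single flat (and ...) expression."""
    return "(and " + " ".join(terms) + ")"


def or_(*terms):
    """Join terms into a single flat (or ...) expression."""
    return "(or " + " ".join(terms) + ")"


def not_(term):
    return "(not " + term + ")"


def distribute(common, branches):
    """Repeat the common terms inside every branch of an or-expression."""
    return or_(*[and_(*(list(common) + list(branch)))
                 for branch in branches])


common = ["issn:'0022-1694'", "type:'1.2'"]
branches = [
    ["pubstartdate:..2013176", "pubenddate:2005002.."],
    ["pubstartdate:2005001", "pubstarttime:0..235959"],
]
print(and_(not_("action:'D'"), distribute(common, branches)))
```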
  • 74. CloudSearch Observations
      •  Facilitates knowledge sharing on content matters across Elsevier’s product platforms
      •  Ability to leverage content infrastructure and capabilities across Elsevier’s divisions
      •  Easy to integrate with existing on-premise Content Systems
      •  Speed to market; allows developers to focus on building other core Content Strategy components
      •  Need to spend time optimising queries to maximise performance
  • 75. Please give us your feedback on this presentation (SVC302). As a thank you, we will select prize winners daily for completed surveys! Thank You