Lucene
BUILDING INDEX
Introduction
 Lucene Index
 Lucene stores index data as posting lists, organized in an inverted index format.
 How does it look?
 Lucene keeps index data in files grouped into segments.
 Unlike a database, Lucene has no notion of a fixed global schema.
 Lucene’s flexible schema also means a single index can hold documents that
represent different entities.
 Lucene requires you to flatten, or de-normalize, your content when you index it.
 A document is Lucene’s atomic unit of indexing and searching. It’s a
container that holds one or more fields, which in turn contain the “real”
content.
 To index your raw content sources, you must first translate them into Lucene’s
documents and fields. Then, at search time, it’s the field values that are
searched.
 Three things Lucene can do with each field:
 The value may be indexed.
 If it’s indexed, the field may also optionally store term vectors.
 The field’s value may be stored.
Inverted Index
Indexing Process
 Enriching and Creating the Document
 To index any data, we first need to extract its text, i.e. the form in which Lucene
can ingest the data.
 Building documents is not always simple: when you are indexing from a database,
PDF files, or website HTML, considerable preprocessing is often needed before a proper
Document can be built.
 Analysis
 The addDocument and addDocuments methods of the IndexWriter class hand our data off to
Lucene to index.
 As a first step, Lucene analyzes the text: it creates tokens out of it and performs analysis
operations on them. For instance, tokens can be lowercased before indexing, which
makes searches case-insensitive.
 Stemming, synonym expansion, and stop-word removal are other examples of analysis.
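The analysis step described above can be illustrated with a small standalone sketch. It does not use Lucene's Analyzer API; the class name `ToyAnalyzer` and its stop-word list are hypothetical, chosen only to show tokenizing, lowercasing, and stop-word removal in sequence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy analysis chain (NOT Lucene's Analyzer API): tokenize, lowercase, drop stop words.
class ToyAnalyzer {
    // Hypothetical stop-word list, for illustration only.
    private static final Set<String> STOP_WORDS = Set.of("a", "an", "the", "of", "to", "is");

    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        // Split on any run of characters that are neither letters nor digits.
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // a leading delimiter yields an empty first piece
            }
            String token = raw.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Quick Brown Fox jumps to the LOG"));
    }
}
```

Because both indexed text and query text would pass through the same chain, a query for "FOX" and a document containing "fox" end up as the same token.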
 Adding to the index
 After analysis is done, the data is ready to be added to the index.
 Lucene uses an inverted index as its underlying data structure.
 Let’s see how it works.
 Rather than answering the question
“What words are contained in this document?”
it is optimized for providing quick answers to
“Which documents contain word X?”
 Lucene stores index data in segments.
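As a rough illustration of the inverted index idea, the sketch below maps each term to the sorted set of document ids that contain it. It is a simplified in-memory model, not Lucene's on-disk format; `ToyInvertedIndex` and its methods are hypothetical names.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy inverted index: term -> posting list (sorted ids of documents containing the term).
class ToyInvertedIndex {
    private final Map<String, SortedSet<Integer>> postings = new TreeMap<>();
    private int nextDocId = 0;

    // Tokenizes the text naively and records the new document id under every term.
    int addDocument(String text) {
        int docId = nextDocId++;
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, t -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    // Answers "which documents contain word X?" with a single map lookup.
    SortedSet<Integer> docsContaining(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT),
                Collections.<Integer>emptySortedSet());
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument("Lucene builds an inverted index");
        index.addDocument("An index maps terms to documents");
        System.out.println(index.docsContaining("index"));
    }
}
```

The lookup cost depends on the term dictionary rather than on scanning every document, which is why this layout suits search.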
Indexing
Process
 INDEX SEGMENTS
 Each segment is a standalone index, holding a subset of all indexed documents.
 Index time: A new segment is created whenever the writer flushes buffered
documents and pending deletions into the directory.
 Search time: Each segment is visited separately and the results are combined.
 Each segment consists of various types of files:
 _X.<ext>, where X is the segment’s name and ext is the extension
 There are separate files to hold the different parts of the index.
 You can use the compound file format so that most of these index files are collapsed into a
single compound file with the extension .cfs
 The segments file, named segments_<N>, contains references to all live
segments.
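The flush and search behaviour of segments can be modelled with a toy class. This is only a sketch of the idea under simplifying assumptions (segments held in memory, no deletions, no files on disk); `ToySegmentedIndex` is a hypothetical name and not part of Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Toy model: every flush creates a new segment; a search visits each segment in turn.
class ToySegmentedIndex {
    // Each segment is a small standalone inverted index: term -> document ids.
    private final List<Map<String, List<Integer>>> segments = new ArrayList<>();
    private final List<String> buffer = new ArrayList<>(); // documents not yet flushed
    private int nextDocId = 0;

    void addDocument(String text) {
        buffer.add(text);
    }

    // Flushing the buffered documents writes them out as one brand-new segment.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        Map<String, List<Integer>> segment = new HashMap<>();
        for (String text : buffer) {
            int docId = nextDocId++;
            // TreeSet removes duplicate terms within a single document.
            for (String term : new TreeSet<>(Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\W+")))) {
                if (!term.isEmpty()) {
                    segment.computeIfAbsent(term, t -> new ArrayList<>()).add(docId);
                }
            }
        }
        segments.add(segment);
        buffer.clear();
    }

    int segmentCount() {
        return segments.size();
    }

    // Each segment is searched separately and the per-segment hits are combined.
    List<Integer> search(String term) {
        List<Integer> hits = new ArrayList<>();
        for (Map<String, List<Integer>> segment : segments) {
            hits.addAll(segment.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptyList()));
        }
        return hits;
    }
}
```

In this model, documents that are still in the buffer are invisible to `search` until `flush` is called, and the work done per query grows with the number of segments.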
 Types of index files and formats:
Name                   Extension                  Brief description
Segments File          segments.gen, segments_N   Stores information about segments.
Lock File              write.lock                 The write lock prevents multiple IndexWriters from writing to the same index.
Compound File          .cfs                       An optional "virtual" file consisting of all the other index files, for systems that frequently run out of file handles.
Fields                 .fnm                       Stores information about the fields.
Field Index            .fdx                       Contains pointers to field data.
Field Data             .fdt                       The stored fields for documents.
Term Infos             .tis                       Part of the term dictionary; stores term info.
Term Info Index        .tii                       The index into the Term Infos file.
Frequencies            .frq                       Contains the list of docs which contain each term, along with the frequency.
Positions              .prx                       Stores position information about where a term occurs in the index.
Norms                  .nrm                       Encodes length and boost factors for docs and fields.
Term Vector Index      .tvx                       Stores offsets into the document data file.
Term Vector Documents  .tvd                       Contains information about each document that has term vectors.
Term Vector Fields     .tvf                       The field-level info about term vectors.
Deleted Documents      .del                       Info about which documents are deleted.
Indexing Utils
 Indexing Operations
 Adding documents
 addDocument(Document): Adds the document using the default analyzer.
 addDocuments(List<Document>): Adds the documents as a block, using the default analyzer.
 Deleting documents
 IndexWriter provides various methods to remove documents from an index:
 deleteDocuments(Term)
 deleteDocuments(Term[])
 deleteDocuments(Query)
 deleteDocuments(Query[])
 As with added documents, you must call commit() or close() on your writer to commit the changes to the index.
 Use the hasDeletions() method to check whether an index contains any documents marked for deletion.
 After the index is optimized, the deleted documents are removed from the index.
 Indexing Operations
 Updating documents
 updateDocument(Term, Document) first deletes all documents containing the
provided term and then adds the new document using the writer’s default analyzer.
 updateDocument(Term, Document, Analyzer) does the same but uses the provided
analyzer instead of the writer’s default analyzer.
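The delete-then-add semantics of an update can be sketched without Lucene. In the toy model below a document is a plain map of field names to values, and a (field, value) pair stands in for Lucene's Term; the class `ToyUpdatableIndex` and all its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of update-as-delete-plus-add. A document is a map of field name -> value.
class ToyUpdatableIndex {
    private final List<Map<String, String>> docs = new ArrayList<>();

    void addDocument(Map<String, String> doc) {
        docs.add(new HashMap<>(doc)); // defensive copy
    }

    // Removes every document whose field holds exactly this value; returns how many were removed.
    int deleteDocuments(String field, String value) {
        int before = docs.size();
        docs.removeIf(doc -> value.equals(doc.get(field)));
        return before - docs.size();
    }

    // An update first deletes all matching documents, then adds the replacement.
    void updateDocument(String field, String value, Map<String, String> newDoc) {
        deleteDocuments(field, value);
        addDocument(newDoc);
    }

    int size() {
        return docs.size();
    }

    // Collects the values of one field across all documents, in insertion order.
    List<String> values(String field) {
        List<String> result = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            if (doc.containsKey(field)) {
                result.add(doc.get(field));
            }
        }
        return result;
    }
}
```

This is why updates are usually keyed on a unique identifier field: if the term matches several documents, all of them are replaced by the single new one.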
 Optimize Index
 When you index documents, especially many documents or using multiple
sessions with IndexWriter, you’ll invariably create an index that has many
separate segments.
 When you search the index, Lucene must search each segment separately
then combine the results.
 This is a tradeoff: the more segments there are, the more separate searches
must be run and the more results must be combined.
 An optimized index also consumes fewer file descriptors during searching.
 Optimizing only improves searching speed, not indexing speed.
 Optimize Index
 IndexWriter exposes several methods to optimize, including:
 forceMerge(int maxNumSegments): Forces the merge policy to merge segments until
there are <= maxNumSegments.
 forceMerge(int maxNumSegments, boolean doWait): Just like forceMerge(int),
except you can specify whether the call should block until all merging completes.
 forceMergeDeletes(): Forces merging of all segments that have deleted documents.
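To make the effect of merging down to a maximum segment count concrete, here is a toy sketch. It assumes a segment is just a list of document ids and always merges the two smallest segments first; that selection rule is an assumption made for illustration and is not how Lucene's merge policies choose segments. `ToyMerger` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

// Toy merge: combine segments until at most maxNumSegments remain.
// Deleted documents are physically dropped only from segments that get merged.
class ToyMerger {
    static List<List<Integer>> forceMerge(List<List<Integer>> segments,
                                          Set<Integer> deleted,
                                          int maxNumSegments) {
        if (maxNumSegments < 1) {
            throw new IllegalArgumentException("maxNumSegments must be >= 1");
        }
        // Work on copies so the caller's lists are left untouched.
        List<List<Integer>> result = new ArrayList<>();
        for (List<Integer> segment : segments) {
            result.add(new ArrayList<>(segment));
        }
        while (result.size() > maxNumSegments) {
            // Assumption for this sketch: always merge the two smallest segments.
            result.sort(Comparator.comparingInt(List::size));
            List<Integer> first = result.remove(0);
            List<Integer> second = result.remove(0);
            List<Integer> merged = new ArrayList<>();
            for (int docId : first) {
                if (!deleted.contains(docId)) merged.add(docId);
            }
            for (int docId : second) {
                if (!deleted.contains(docId)) merged.add(docId);
            }
            result.add(merged);
        }
        return result;
    }
}
```

Calling `forceMerge(segments, deleted, 1)` in this model yields a single segment with no deleted documents left in it, which mirrors the idea of a fully optimized index.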
 Index Commits
 A new index commit is created whenever you invoke one of IndexWriter’s
commit methods.
 A commit writes all pending changes (added and deleted documents, segment
merges, added indexes, etc.) to the index and syncs all referenced index files,
so that a reader will see the changes and the index updates will survive an
OS or machine crash or power loss.
 The steps IndexWriter takes during commit:
 Flush any buffered documents and deletions.
 Sync all newly created files, including newly flushed files.
 Write and sync the next segments_N file.
 Remove old commits by calling on IndexDeletionPolicy.
 Index Merging
 When an index has too many segments, IndexWriter selects some of the segments
and merges them into a single, large segment.
 There are various merge policies, such as LogMergePolicy and LogDocMergePolicy.
 Concurrency, thread safety, and locking issues
 Any number of read-only IndexReaders may be open at once on a single index.
 Only a single writer may be open on an index at once. Lucene uses a write lock
to enforce this
 IndexReaders may be open even while an IndexWriter is making changes to the
index. Each IndexReader will always show the index as of the point in time that it
was opened. It won’t see any changes made by the IndexWriter until the
writer commits and the reader is reopened.
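The point-in-time view of a reader can be sketched as an immutable snapshot that is swapped on commit. This is a simplified single-threaded model, not Lucene's implementation; `ToyWriter` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model of point-in-time readers: a reader is the committed snapshot at open time.
class ToyWriter {
    private final List<String> pending = new ArrayList<>();
    private List<String> committed = Collections.emptyList();

    void addDocument(String doc) {
        pending.add(doc); // buffered, not yet visible to any reader
    }

    // Commit publishes a NEW immutable snapshot; previously opened readers keep the old one.
    void commit() {
        List<String> next = new ArrayList<>(committed);
        next.addAll(pending);
        pending.clear();
        committed = Collections.unmodifiableList(next);
    }

    // "Opening a reader" returns the snapshot as of the latest commit.
    List<String> openReader() {
        return committed;
    }
}
```

A reader obtained before a commit keeps returning the old contents; only a reader opened afterwards sees the newly committed documents.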
 Concurrency, thread safety, and locking issues
 The Lucene index only blocks concurrent write operations on the index.
 Various LockFactory implementations are:
 NoLockFactory
 SimpleFSLockFactory
 SingleInstanceLockFactory
 VerifyingLockFactory
 Boosting documents and fields
 Index-time boosts are no longer supported. As a replacement, index-time
scoring factors should be indexed into a doc values field and combined with the
score at query time, e.g. using FunctionScoreQuery.
