Tools for Image Retrieval in Large Multimedia Databases
1. Tools for Image Retrieval in Large Multimedia Databases
by Carles Ventura Royo
Directors:
Verónica Vilaplana
Xavier Giró
Tutor:
Ferran Marqués
Barcelona, September 2011
2. Index
Identifying the problem
State of the art: indexing techniques
State of the art: Hierarchical Cellular Tree (HCT)
Modifications to the original HCT
Experimental results
Implemented tools
Conclusions and future work lines
5. Identifying the problem (III)
K nearest neighbor problem
Solution: Sequential scan
Drawback: Computational time for large
databases (10 s for a 200,000-element database)
Approximate K nearest neighbor problem
Indexing techniques
6. Requirements of the solution
Dynamic approach
Multimedia databases are not static
Insertions and deletions
High dimensional feature spaces
“curse of dimensionality” problem
MPEG-7 visual descriptors are high-dimensional
feature vectors
8. Indexing techniques (I)
Hierarchical data structures
Spatial Access Methods (SAMs)
K-d tree, R-tree, R*-tree, TV-tree, etc.
Drawbacks:
Items have to be represented in an N-dimensional feature
space
Dissimilarity measure based on an Lp metric
SAMs do not scale up well to high dimensional spaces
9. Indexing techniques (II)
Hierarchical data structures
Metric Access Methods (MAMs)
VP-tree, MVP-tree, GNAT, M-tree, etc.
More general approach than SAMs
Assuming only a similarity distance function
MAMs scale up well to high dimensional spaces
Drawbacks:
Static MAMs do not support dynamic changes
Dependence on pre-fixed parameters
10. Indexing techniques (III)
Locality Sensitive Hashing
It uses hash functions
Nearby data points are hashed into the same
bucket with a high probability
Faraway points are hashed into the same bucket
with a low probability
Drawback:
It does not solve the K nearest neighbor problem, but
the ε-near neighbor problem.
12. Solution adopted
Hierarchical Cellular Tree (HCT)
MAM-based indexing scheme
Hierarchical structure
Self-organized tree
Incremental construction
in a bottom-up fashion
Unbalanced tree
No dependence on a maximum capacity
Preemptive cell search algorithm for insertion
Dynamic approach
[KG07] S. Kiranyaz and M. Gabbouj, "Hierarchical Cellular Tree: An Efficient
Indexing Scheme for Content-Based Retrieval on Multimedia Databases"
15. HCT: Level Structure
Representatives for each cell from the lower
level
Responsible for maximizing the compactness
of its cells
Compactness threshold
16. HCT Operations (I)
Item insertion
Find the most
suitable cell
Most Similar Nucleus
vs Preemptive Cell Search
17. HCT Operations (II)
Item insertion
Find the most
suitable cell
Most Similar Nucleus
vs Preemptive Cell Search
C2: d(O,ON2) - r(ON2) < dmin  (C2 cannot be discarded)
C3: d(O,ON3) - r(ON3) > dmin  (C3 is discarded)
18. HCT Operations (III)
Item insertion
Find the most suitable cell
Append the element
Generic post-processing check
Mitosis operation
Nucleus change
19. HCT Operations (IV)
Item removal
Cell search algorithm not required
Remove the element
Generic post-processing check
Mitosis operation
Nucleus change
20. HCT: Retrieval scheme
Progressive Query
Periodic subqueries over database subsets
Query Path formation
Based on Most Similar Nucleus
Ranking aggregation
22. Modifications to the HCT (I)
Covering radius
Original definition gives an approximation by
defect
Consider all the elements belonging to the
subtree
High computational cost
Approximation by excess
24. Modifications to the HCT (III)
Covering radius
rC(C9) = max { d(E,B) + rC(C1), rC(C2), d(E,H) + rC(C3) }
25. Modifications to the HCT (III)
HCT construction
Preemptive Cell Search over all the levels
A method for updating the covering radius
To reduce the searching time
It can be performed after the HCT construction or
periodically
26. Modifications to the HCT (IV)
Searching techniques
PQ fails to solve the KNN problem efficiently
New searching techniques
Most Similar Nucleus
Preemptive Cell Search
Hybrid
Number of cells to be considered
Minimum number of cells
Cells hosting 2·K elements
Cellular structure is not kept
28. Experimental results
CCMA image database of 216,317 elements
HCT building evaluation
Construction time
Retrieval system evaluation
Retrieval time
Elements retrieved
33. Retrieval system evaluation (II)
Evaluation with respect to exhaustive search
Mean Competitive Recall
Elements in common
Mean Normalized Aggregate Goodness
Kendall distance
Number of exchanges needed in a bubble sort
Query set of 1,082 images
37. Implemented tools (I)
database_indexing tool
Tool for indexing an image database
HCT is stored on disk
hct_query tool
Tool for carrying out a search over an indexed
database
HCT is read from disk and loaded into main memory
40. Conclusions
Hierarchical Cellular Tree implementation
To improve the retrieval times
Generic implementation for any kind of data
Modifications proposed
HCT evaluation
Measures extracted from literature
Preemptive Cell Search technique gives the
best performance
It is essential not to use an underestimated value
for the covering radius
41. Future work lines
Very large databases
Not using only main memory
Region-based CBIR system
Each image can be represented by a set of
regions
Browser application based on HCT
Take advantage of the hierarchical structure
Alternative way to retrieve elements
45. A tool for image DB indexing
MPEG-7 visual descriptors
Indexing a database
Inserting new elements into an indexed
database
46. A tool for image retrieval
It requires an indexed database
Image query
Image file
XML file with the visual descriptors
TXT file with several image queries
Interactive mode
Performing a search
Good morning. I'm going to defend my master's final project dissertation, titled Tools for Image Retrieval in Large Multimedia Databases, which has been directed by Verónica Vilaplana and Xavier Giró and tutored by Ferran Marqués.
First of all, we're going to identify the problem to be solved.
When we are looking for images in an image database, there are basically two ways to tell the retrieval system what we're looking for: through words or by giving an image, which is called a query.
Our retrieval system is based on the latter: we give a query image to the system in order to retrieve the K most similar elements from a given database. This problem is known as the K nearest neighbor (KNN) problem. In order to compute how similar the images are, our retrieval system represents the images by visual descriptors defined in MPEG-7, which are compared using dissimilarity functions also defined in this standard.
The way to obtain these most similar elements is to perform an exhaustive search over the whole database. However, this sequential scan is not feasible on a large database: to get an idea, an exhaustive search over a database of about 200,000 elements takes 10 seconds, which is not an acceptable time for the user.
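As a sketch, this sequential scan can be written in a few lines; the scalar descriptors and absolute-difference dissimilarity below are toy stand-ins for the MPEG-7 feature vectors and distance functions, and the function name is illustrative:

```python
import heapq

def knn_sequential(query, database, dist, k):
    """Exhaustive K nearest neighbor search: the query is compared
    against every element, so cost grows linearly with database size
    (hence the ~10 s figure for 200,000 elements)."""
    return heapq.nsmallest(k, database, key=lambda x: dist(query, x))

# Toy example: scalar descriptors, absolute difference as dissimilarity.
db = [3.0, 10.0, 1.5, 7.2, 2.9]
print(knn_sequential(3.1, db, lambda a, b: abs(a - b), k=2))  # [3.0, 2.9]
```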
Therefore, CBIR systems need to incorporate indexing techniques in order to scale up well to large databases and achieve retrieval times acceptable to the user. On the other hand, the exact K nearest neighbors may not be retrieved, so we may obtain an approximate solution to the KNN problem.
In particular, we need an indexing technique which fulfills the following requirements:
A dynamic approach. Since multimedia databases are not static, we want an index structure which allows insertions and deletions of the indexed elements without having to reindex the whole database from scratch.
Some indexing techniques suffer from the so-called "curse of dimensionality" problem when they are used in high dimensional feature spaces and, therefore, become even less efficient than an exhaustive search. Since we work with the MPEG-7 visual descriptors, which are high-dimensional feature vectors, we need an indexing technique appropriate for high dimensional feature spaces.
Now that the problem to be solved is clear, we're going to analyze which indexing techniques are available in the literature.
The most traditional indexing techniques are mostly built as hierarchical tree structures. These indexing techniques can be grouped into two main categories: spatial access methods (SAMs) and metric access methods (MAMs).
The applicability of SAMs is limited by the fact that the items have to be represented by points in an N-dimensional feature space and the dissimilarity measure between two points has to be based on a distance function in an Lp metric, such as the Euclidean distance. Furthermore, the main disadvantage of SAMs is that they do not scale up well to high dimensional spaces.
The metric access methods give a more general approach than SAMs and do not suffer from the “curse of dimensionality” problem. Therefore, they scale up well to high dimensional spaces. However, some existing MAMs present several drawbacks. For instance, the static MAMs do not support dynamic changes such as new insertions or deletions. Even though some MAM-based indexing techniques provide a dynamic approach, their dependence on pre-fixed parameters can lead to significantly varying performances.
Another very popular indexing technique is Locality Sensitive Hashing (LSH), which is not based on a hierarchical tree structure. Its main disadvantage is that LSH, by its nature, does not solve the KNN problem but the ε-near neighbor problem, which consists in gathering the elements within a predefined ε-distance from the query point.
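The idea behind LSH can be sketched with sign-pattern (hyperplane) hashing. Real LSH draws the hyperplanes at random and uses several hash tables; the fixed hyperplanes and helper name below are assumptions made only to keep the toy example deterministic:

```python
def hyperplane_hash(v, planes):
    """Bucket key: the sign pattern of v against a set of hyperplanes.
    Nearby points tend to fall on the same side of every hyperplane
    (same bucket); faraway points rarely do."""
    return tuple(sum(p_i * v_i for p_i, v_i in zip(p, v)) >= 0 for p in planes)

# Fixed hyperplanes for determinism; real LSH samples them randomly.
planes = [
    [1.0, -1.0, 0.0],
    [0.0, 1.0, -1.0],
    [1.0, 1.0, 1.0],
]

a = [2.0, 1.0, 0.5]
b = [2.1, 1.1, 0.4]    # near a  -> same sign pattern as a
c = [-2.0, -1.0, 3.0]  # far away -> different sign pattern
```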
After analyzing several indexing techniques from the literature, we have decided to implement an indexing technique called Hierarchical Cellular Tree, which was designed to bring an effective solution especially for indexing large multimedia databases.
The HCT is a MAM-based indexing technique with a hierarchical structure, which consists of one or more levels, and each level in turn holds one or more cells. It is a self-organized tree, which basically means that the operations (item insertion, removal, etc.) are not externally controlled but carried out according to internal rules. The HCT is built dynamically in a bottom-up fashion and is an unbalanced tree optimized for achieving highly focused cells. The HCT does not depend on a maximum capacity and, therefore, places no limit on the cell size as long as the cell keeps a definite compactness measure. In order to perform an optimum search for the insertion process, the HCT includes a preemptive cell search algorithm. Finally, the HCT is also characterized by its totally dynamic approach.
Now, we’re going to detail the two main components of the HCT: the cell structure and the level structure. A cell is a basic container structure where similar database elements are stored. The distances between each pair of elements are assigned to each edge of a connected, undirected graph. We also have the Minimum Spanning Tree, which is the subgraph that connects all the vertices together with the minimum cumulative total weight. The cell nucleus is the element which represents the cell on the upper level. It is essential to promote the best item for this representation since these elements are used during the top-down search for item insertion or for query requests. Therefore, it was proposed to choose the element having the maximum number of branches in the MST. Then, the covering radius is defined as the distance from the nucleus to the furthest element in the cell.
Another cell parameter is the cell compactness, which quantifies how focused or compact the clustering of the items within the cell is. It is computed as a function of the following cell parameters: mean and standard deviation of the MST branch weights, covering radius, maximum MST branch weight and number of elements. The cell compactness plays a key role in deciding whether or not to perform mitosis within the cell at any instant, and so does the maturity size. The maturity size is a prefixed parameter: when a cell holds a number of elements greater than or equal to the maturity size value, we consider it mature. Therefore, a cell can undergo a mitosis operation only if it is mature and not compact enough. When mitosis is granted, the MST is again used to decide how to split the cell, by breaking the MST branch with the largest weight.
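A minimal sketch of these two MST-based rules, assuming the MST has already been computed (the helper name and toy edge list are illustrative; the real implementation also evaluates the compactness criteria described above):

```python
from collections import defaultdict

def nucleus_and_mitosis(mst_edges):
    """mst_edges: (u, v, weight) triples of a cell's Minimum Spanning Tree.

    Nucleus rule: promote the element with the most MST branches.
    Mitosis rule: break the heaviest MST branch; the two connected
    components that remain become the child cells."""
    degree = defaultdict(int)
    for u, v, _ in mst_edges:
        degree[u] += 1
        degree[v] += 1
    nucleus = max(degree, key=degree.get)

    heaviest = max(mst_edges, key=lambda e: e[2])
    adj = defaultdict(set)
    for u, v, _ in mst_edges:
        if (u, v) != (heaviest[0], heaviest[1]):
            adj[u].add(v)
            adj[v].add(u)
    # Flood-fill from one endpoint of the broken branch.
    component, stack = {heaviest[0]}, [heaviest[0]]
    while stack:
        for nb in adj[stack.pop()]:
            if nb not in component:
                component.add(nb)
                stack.append(nb)
    return nucleus, component, set(degree) - component

# Star around C with one long branch D-E: C becomes the nucleus,
# and mitosis splits off {E} by breaking the weight-5.0 branch.
edges = [("A", "C", 1.0), ("B", "C", 1.2), ("C", "D", 1.1), ("D", "E", 5.0)]
```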
As mentioned before, the HCT has a hierarchical structure formed by one or more levels. The elements belonging to a particular level are the representatives of the cells from the lower level, except, obviously, at the ground level, which contains the entire database. Each level is responsible for maximizing the compactness of its cells. With this purpose, each level updates its own compactness threshold, which is computed by applying the median operator over the compactness values of its mature cells. This parameter is used to evaluate whether a cell is compact enough.
Now, we're going to detail the two external operations that can be carried out over an HCT: item insertion and item removal.
Whenever an item insertion is performed, the first thing to do is to find the most suitable cell in the ground level to which the element must be appended. In this figure, element C is the closest nucleus from the ground level. In order to perform this operation in an efficient way, a search algorithm called Preemptive Cell Search was proposed instead of the traditional Most Similar Nucleus. MS-Nucleus assumes that the closest nucleus yields the best subtree during descent. In the figure, the element being inserted is first compared to the nucleus objects from the upper level. Since the nucleus of cell C1 is the closest one, the two other cells would be discarded. Therefore, element C would be wrongly discarded.
On the other hand, the Preemptive Cell Search algorithm performs a preemptive analysis on the upper level to find out all possible nucleus objects which might yield the closest objects on the lower level. In this figure, since d(O,ON3) - r(ON3) > dmin, cell C3 cannot yield the closest nucleus and, therefore, is discarded. On the other hand, since d(O,ON2) - r(ON2) < dmin, cell C2 is not discarded. Therefore, element C will be considered and the element being inserted will be appended to the most suitable cell.
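This pruning test can be sketched as follows, with cells given as (nucleus, covering radius, name) triples and scalar descriptors for readability (the helper name and values are illustrative):

```python
def preemptive_candidates(query, cells, dist):
    """One level of Preemptive Cell Search: a cell may be discarded
    only if even its closest possible element, at distance
    d(query, nucleus) - radius, lies beyond d_min, the distance to
    the closest nucleus on this level."""
    d_min = min(dist(query, nucleus) for nucleus, radius, name in cells)
    return [name for nucleus, radius, name in cells
            if dist(query, nucleus) - radius <= d_min]

# Mirrors the figure: C1 has the closest nucleus (MS-Nucleus would keep
# only C1), C2 survives the preemptive test, C3 is discarded.
cells = [(1.0, 0.5, "C1"), (3.0, 2.8, "C2"), (6.0, 1.0, "C3")]
print(preemptive_candidates(0.0, cells, lambda a, b: abs(a - b)))  # ['C1', 'C2']
```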
Once the most suitable cell has been found, the element is appended. Then, the cell becomes subject to a generic post-processing check. First, the cell is examined for a mitosis operation. In case mitosis is not performed, it is necessary to check if the nucleus has changed due to the insertion.
The item removal is an operation that does not require any cell search algorithm. The element to be removed is taken out of the cell and the cell becomes subject to a generic post-processing check as in an item insertion operation.
In the original article, a retrieval scheme called Progressive Query was proposed. This searching technique consists in performing periodic sub-queries over subsets of database items. The order in which the comparisons are done is given by the Query Path, which is formed based on the Most Similar Nucleus searching technique. The rankings obtained over each subset are aggregated after each subquery.
So far, we have presented the Hierarchical Cellular Tree, the solution adopted for indexing large multimedia databases. Therefore, the HCT has been implemented in our development platform. Since we have detected some weak points, now we are going to present some proposed modifications to the original HCT.
The covering radius was originally defined as the distance from the nucleus to the furthest element in the cell. However, we consider that we should take into account not only the elements belonging to the corresponding cell, but also all the elements belonging to its subtree. Therefore, the original HCT works with a poor approximation by defect, i.e. an underestimate. Since computing the exact covering radius can have a high computational cost, we have proposed an approximation by excess, which guarantees that no subtree will be wrongly discarded. Therefore, the most suitable cell will always be found.
For instance, in order to compute the covering radius of cell C9, all the elements in cells C1, C2 and C3 should be considered, instead of only the elements belonging to that cell. Using the approximation by excess that we have proposed, the covering radius of cell C9 is computed as follows:
The covering radius for the cell C9 is computed as the maximum among the following values:
Covering radius of cell C1 plus the distance from element B (its nucleus) to element E, the cell C9 nucleus
Covering radius of cell C2
Covering radius of cell C3 plus the distance from element H (its nucleus) to element E, the cell C9 nucleus
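Under toy scalar positions (hypothetical values), the proposed bound can be computed directly; note that the term for C2 reduces to its covering radius alone because its nucleus coincides with E, matching the formula above:

```python
def covering_radius_excess(nucleus, children, dist):
    """Approximation by excess: by the triangle inequality, every element
    in a child's subtree lies within d(nucleus, child_nucleus) +
    child_radius of this nucleus, so the bound never underestimates
    the true covering radius."""
    return max(dist(nucleus, child_nucleus) + child_radius
               for child_nucleus, child_radius in children)

# Toy scalar positions: E = 0 (nucleus of C9), B = 2 (nucleus of C1),
# H = 5 (nucleus of C3); C2's nucleus is E itself.
children = [(2.0, 1.0), (0.0, 3.0), (5.0, 0.5)]  # (nucleus, radius) of C1, C2, C3
print(covering_radius_excess(0.0, children, lambda a, b: abs(a - b)))  # 5.5
```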
Originally, it was proposed to adopt a hybrid approach to build the HCT, using the Preemptive Cell Search algorithm only in a certain number of the uppermost levels and the Most Similar Nucleus technique in the lowest ones. When the Most Similar Nucleus is used, the element being inserted may not be appended to the most suitable cell. That's the reason why we have decided to build the HCT using the Preemptive Cell Search algorithm over all the levels.
Furthermore, we have implemented a method for updating the covering radius to its actual value in order to reduce the searching time of both the insertion operations and query requests. In this way, we will visit only the cells which are strictly necessary. This method can be applied after the HCT construction or periodically.
As we will see in the results, the Progressive Query fails to solve the KNN problem efficiently. That's the reason why we have decided to implement new searching techniques. Therefore, we have adapted the cell search algorithms for insertion operations into searching techniques for retrieval. In particular, we are interested in the Preemptive Cell Search algorithm, since it takes advantage of the covering radius and guarantees that the most suitable cell is retrieved.
Since the cell whose nucleus is the nearest element to the query may not contain the most similar elements, we have proposed considering a minimum number of cells. We have also proposed that the number of cells being considered depends on the number of expected results. Therefore, we have decided that the cells considered must host at least twice the number of elements expected. These elements are sorted individually without keeping the cellular structure.
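This retrieval rule can be sketched as follows (illustrative helper; cells are (nucleus, elements) pairs with scalar toy descriptors): cells are visited in order of nucleus distance until a minimum number of cells hosting at least 2·K candidates has been gathered, and the candidates are then ranked individually:

```python
def retrieve_approximate(query, cells, dist, k, min_cells=2):
    """Visit cells by nucleus distance until at least min_cells cells
    hosting at least 2*k candidate elements have been gathered, then
    rank the candidates individually (the cellular structure is not
    kept in the final ranking)."""
    ordered = sorted(cells, key=lambda c: dist(query, c[0]))
    candidates = []
    for i, (_, elements) in enumerate(ordered):
        candidates.extend(elements)
        if i + 1 >= min_cells and len(candidates) >= 2 * k:
            break
    return sorted(candidates, key=lambda x: dist(query, x))[:k]

# Three cells as (nucleus, elements) pairs with scalar descriptors.
cells = [(1.0, [0.9, 1.1, 1.3]), (4.0, [3.5, 4.2]), (10.0, [9.0, 11.0])]
print(retrieve_approximate(0.0, cells, lambda a, b: abs(a - b), k=2))  # [0.9, 1.1]
```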
Once the original HCT and some proposed modifications have been presented, we’re going to proceed to their evaluation.
We have selected a database of approximately 200,000 elements provided by the Corporació Catalana de Mitjans Audiovisuals (CCMA). Although the main objective of these experiments is to evaluate how fast the retrieval system is and how good the obtained results are depending on the searching technique and the HCT used, we also want to analyze the time required to build the HCT.
First of all, we are going to analyze the building time. We can see that the time required for building the HCT when the original covering radius is used is by far the best one. However, as we will see in the retrieval system evaluation, this approach gives the worst performance. This is because the original covering radius is an approximation by defect and, therefore, fewer cells than strictly necessary are visited during the insertion operations. Although the building time required by the proposed covering radius is worse, this approach gives a much better performance. Furthermore, this time can be reduced by periodically applying the update method without affecting its performance. Thus, we have reduced the building time by 23% by applying the update method every 5,000 insertions.
Now, we're going to evaluate the retrieval system. We want to compare the approximate ranking, obtained by using a searching technique over the HCT, with the exact ranking given by the exhaustive search. Here we have an example of a search in which an image of an anchorman has been used as the query. All the images illustrated represent the results of an exhaustive search over the whole image database. The images marked with a green rectangle have been retrieved by the Preemptive Cell Search algorithm, whereas the ones marked with a red rectangle are missing from the approximate ranking.
Here we have an example in which three images have not been retrieved by the Preemptive Cell Search technique.
And another one in which all the images have been retrieved.
In order to compare the approximate ranking, obtained by using a searching technique over the HCT, with the exact ranking given by the exhaustive search, we have selected three different measures from the literature:
The Mean Competitive Recall, which is defined as the number of elements in common.
The Mean Normalized Aggregate Goodness, which is 1 only when the K nearest elements are retrieved and 0 only when the K farthest elements are retrieved, the worst case.
The Kendall distance, which turns out to be equal to the number of exchanges needed in a bubble sort.
These measures have been computed over a query set of 1082 images.
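As a sketch, the Kendall distance can be computed directly as an inversion count, which equals the number of bubble-sort exchanges; the function name and toy rankings are illustrative, and both rankings are assumed to contain the same items:

```python
def kendall_distance(approx_ranking, exact_ranking):
    """Pairwise disagreements between two rankings of the same items,
    i.e. the number of exchanges bubble sort needs to turn one
    ordering into the other."""
    position = {item: i for i, item in enumerate(exact_ranking)}
    seq = [position[item] for item in approx_ranking]
    # Count inversions: pairs of items appearing in opposite relative order.
    return sum(1
               for i in range(len(seq))
               for j in range(i + 1, len(seq))
               if seq[i] > seq[j])

exact = ["a", "b", "c", "d"]
print(kendall_distance(["a", "c", "b", "d"], exact))  # 1
print(kendall_distance(["d", "c", "b", "a"], exact))  # 6
```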
Now, we want to present the results obtained by the different searching techniques when the number of results expected by the user is 40. First of all, we're going to analyze the Preemptive Cell Search algorithm depending on the covering radius used and on whether the update method for the covering radius has been applied or not. We can see that the retrieval system behaves by far the worst when the original covering radius is used without the update method, since there are only 12.35 out of 40 elements in common between the rankings on average. Furthermore, only 49.26% of the query images have been retrieved. However, the quality of the results for the original covering radius is remarkably improved when the proposed update method for the covering radius is used. Therefore, we can see how important it is not to use an underestimated value for the covering radius. That's the reason why the performance of the retrieval system is also so good when the proposed covering radius is used, regardless of whether the update method is applied or not. Thus, in a retrieval time of about 1 second, 99% of the image queries are retrieved and there are about 28 out of 40 elements in common between the rankings on average. These results are also corroborated by the other two quality measures.
Here we have a table which will allow us to compare the implemented searching techniques. We can see that the retrieval time increases with the number of levels over which the Preemptive Cell Search algorithm is applied. On the other hand, the greater the number of levels over which the Preemptive technique is applied, the better the quality measures. Since we consider that a retrieval time of about 1 second is absolutely acceptable for the user, we can conclude that the Preemptive Cell Search technique gives the best performance. Finally, we want to compare it to the Progressive Query, which is the retrieval scheme originally proposed. If we consider a larger number of elements instead of only 80 to obtain the approximate ranking, the Most Similar Nucleus technique becomes equivalent to the first subqueries performed by the Progressive Query. From the results shown in the table, we can see that even when we consider 20,000 elements, which means a retrieval time of 1.37 seconds, the quality of the results is worse than when the Preemptive Cell Search algorithm is used. In such an interval of time, there are only 12.33 out of 40 elements in common between the rankings on average and only 31.61% of the query images have been retrieved.
Finally, we’re going to present some tools which have been implemented to make it easier for the user to use the HCT.
We have implemented a tool called database_indexing which is used for indexing an image database. The resulting HCT is stored on hard disk. Then, in order to perform searches over this HCT, the user should use the hct_query tool.
Furthermore, the image retrieval system based on the HCT has also been implemented over a server/client architecture. On the one hand, we have the hct_building tool, a daemon program which loads an indexed database and is constantly running, waiting for new query requests. On the other hand, there is the query_request tool, which is used to send the query request. Since we also need a means of communication between these tools, we have selected a messaging system called KSC.
Finally, let’s draw the conclusions and present the future work lines.
As seen, in this project we have implemented an indexing technique called Hierarchical Cellular Tree in order to improve the retrieval times of the CBIR system over large databases. Although the HCT has been chosen for a particular scenario, our implementation allows indexing any type of data and, therefore, the HCT is useful for indexing a generic database as long as there is a function to measure the dissimilarity between any pair of elements. Furthermore, some modifications have been proposed which have resulted in an improvement of the quality of the retrieved elements. In order to evaluate its performance, we have used some contrasted measures extracted from the literature. Finally, we conclude that the Preemptive Cell Search gives the best performance when it is used for retrieval. Furthermore, it is essential not to use an underestimated value for the covering radius, either by building the HCT with the proposed covering radius or by applying the update method over an HCT built with the original covering radius.
This project opens the door to new future work lines. The main one consists in adapting the HCT implementation to work with very large databases which cannot be indexed using only main memory. The implementation of this indexing technique will also make it possible to improve the retrieval time of any region-based CBIR system. For instance, if each image is represented by a set of 100 regions, a small image database of 1,000 images results in a large database of 100,000 elements. Finally, another future work line is the implementation of a browser for an image database, as proposed in the original article, since the hierarchical structure of the HCT is quite appropriate for giving the user an overview of what lies under the current level. Thus, the browser can also be used as an alternative way to retrieve elements from an image database.
Here we have an illustrative example of HCT construction where the elements are represented by colored circles and the similarity between two elements is given by how similar their colors are. First, all the elements are inserted into the unique cell of the HCT, which is created at the ground level, until it becomes mature and not compact enough. In this example, this happens when the yellow ball labeled as 1 is inserted.
As a result, the cell is split into two new cells and a new top level is created. The nuclei of the two child cells are computed (the yellow ball labeled as 1 and the red ball labeled as f) and inserted into the new top cell. This process continues until all the elements of the database have been inserted.
Although the HCT has been implemented for indexing any element type, we have implemented a tool which indexes image database elements represented by MPEG-7 visual descriptors and compared using the dissimilarity measures also defined in MPEG-7. The user is asked to precompute these descriptors for each image before using this tool. We specify the elements to be indexed through a TXT file including the paths of the XML files containing the visual descriptors, as well as the directory where the HCT will be stored on disk. We also specify the visual descriptor in which we are interested and some HCT parameters, such as the maturity size.
Furthermore, this tool also allows the user to insert new elements into a previously indexed database by giving the path of the HCT stored on disk.
Another tool, which allows the user to carry out a search on an indexed database, has also been implemented. The image query is specified by giving either its filename or the XML file including the visual descriptors. This tool also allows the user to perform several retrievals by giving a TXT file including the different image queries. We have also implemented an interactive mode in which the user is asked whether he wants to perform a new retrieval, without waiting for the HCT to be loaded again. Here we have an example of the use of this tool.
The image retrieval system based on the HCT has also been implemented over a server/client architecture. On the one hand, we have the hct_building tool, a daemon program which loads an indexed database and is constantly running, waiting for new query requests. On the other hand, there is the query_request tool, which is used to send the query request. Since we also need a means of communication between these tools, we have selected a messaging system called KSC. The objective of the KSC is to control the registration of the clients, to find out to which message types the clients subscribe and to forward the messages to the subscribed clients. Thus, we have defined a message type named SearchKSCMessage, which defines the parameters of the search, and a message type named SearchCompletedKSCMessage, which informs that the searching process has been properly completed. The query_request tool subscribes as a writer of the SearchKSCMessage and as a reader of the SearchCompletedKSCMessage, whereas the hct_building tool subscribes as a reader of the SearchKSCMessage and as a writer of the SearchCompletedKSCMessage.
Now, we are going to analyze the same experiment using the Most Similar Nucleus technique instead. For this technique, the covering radius is not used during the search, so it does not matter whether the update method is used or not. Analyzing the results, we come to the conclusion that the Most Similar Nucleus technique gives a bad performance, since the query image is retrieved in only 5% of the query requests and there are only 1.35 out of 40 elements in common between the rankings on average. These results are even worse when the original covering radius is used.
If we consider a larger number of elements, the Most Similar Nucleus technique becomes equivalent to the first subqueries performed by the Progressive Query. From the results shown in the table, we can see that even when we consider 20,000 elements, which means a retrieval time of 1.37 seconds, the quality of the results is worse than when the Preemptive Cell Search algorithm is used. In such an interval of time, there are only 12.33 out of 40 elements in common between the rankings on average.
Finally, we consider the Hybrid technique, which combines the two searching techniques analyzed before. The results shown in the table have been obtained by applying the Preemptive Cell Search algorithm over the 8 uppermost levels and the Most Similar Nucleus technique in the lowest ones. For the Hybrid technique, the retrieval time is always shorter than the Preemptive Cell Search retrieval time and longer than the Most Similar Nucleus retrieval time. Conversely, the quality of the elements retrieved by the Hybrid technique is always better than with the Most Similar Nucleus and worse than with the Preemptive Cell Search.