Thesis writing - week9
1. Information Retrieved
- Image based search
s1160123 Tomoyuki Soeta
Supervised by Prof. Qiangfu Zhao
System Intelligence Lab
2. Outline
Introduction
Information Retrieval
VQ (Vector Quantization)
Divide into the 8x8 block
Making of Code book
K-means algorithm
Extract each image’s feature vector
Result
Image and feature vector
Distance of feature vector
Conclusion
Future work
3. Introduction
My aim is to improve information retrieval systems so that
they can search whether the input data are documents
or images.
I am in charge of the research on image-based
information retrieval.
To search images using a search engine, we may use the
index attached to the image, the file name, etc. as
keywords. We may also use "the contents of an image
themselves."
I study a new image search technique based on code
book information.
4. Information Retrieved
[Diagram: two parallel retrieval pipelines]
Image: Divide into blocks (1 block = 8x8) -> Code book -> Code of each block -> Feature Vector
Text: Morphological Analysis -> Word Filtering -> Feature Vector
Both pipelines end in: NNTree or SVM
5. VQ (Vector Quantization)
VQ is a compression coding technique for images (image compression technology).
In my study, I use VQ to translate an image into a bag-of-blocks (BOB),
in the same way as document search.
[Diagram: image -> Vector Quantization (VQ) -> feature vector]
6. Divide into the 8x8 block (1)
I used 10 facial images of size 256x256.
The images are converted to grayscale.
Each image is divided into blocks of 8x8 pixels.
Each image therefore yields 32×32 = 1024 blocks.
[Figure: a 256x256 image divided into 32 x 32 blocks; 1 block = 8x8 pixels]
7. Divide into the 8x8 block (2)
The pixel values of each block are read.
The values read are stored in a 1x64 array.
One image is divided into 1024 blocks,
so an array of 1024 rows (1024x64) is obtained.
[Figure: each 8x8 block is flattened into a 1x64 row of pixel values; the rows are stacked into 1024 rows per image]
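The block division described above can be sketched as follows. This is a minimal illustration in plain Python, not the code used in the study: the function name `image_to_blocks`, the list-of-rows image representation, and the synthetic test image are all assumptions made for the example.

```python
def image_to_blocks(image, block=8):
    """Split a grayscale image (a list of pixel rows) into flattened blocks.

    Returns one row of block*block pixel values per block, scanning the
    image block by block from left to right and top to bottom.
    """
    height = len(image)
    width = len(image[0])
    if height % block or width % block:
        raise ValueError("image size must be a multiple of the block size")
    rows = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Read the block row by row into one flat list (1x64 for 8x8).
            flat = [image[y][x]
                    for y in range(top, top + block)
                    for x in range(left, left + block)]
            rows.append(flat)
    return rows


# Synthetic 256x256 "image" used only for illustration (not real data).
image = [[(x + y) % 256 for x in range(256)] for y in range(256)]
blocks = image_to_blocks(image)
print(len(blocks), len(blocks[0]))  # 1024 64
```

With a 256x256 input and 8x8 blocks, the result has 1024 rows of 64 values each, matching the array described on the slide.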
9. Making of Code book
Reading all 10 images gives an array of 10240 rows
(10 images x 1024 blocks), each of size 1x64.
The code book is made from these rows by using the
k-means method.
The code book size is 256.
10. K-means algorithm
Step 1) k initial "means" are randomly selected from
the data set.
Step 2) k clusters are created by associating every
observation with the nearest mean.
Step 3) The centroid of each of the k clusters
becomes the new mean.
Step 4) Steps 2 and 3 are repeated until
convergence has been reached.
[Figure: illustration of Steps 1 to 4]
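The four steps can be sketched in plain Python as below. This is an illustrative sketch only: the function names, the `max_iter` and `seed` parameters, the handling of empty clusters, and the tiny 2-D data set are assumptions for the example. In the study the data would be the 10240 rows of length 64, with k = 256.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vector, means):
    """Index of the mean closest to the given vector."""
    return min(range(len(means)),
               key=lambda i: squared_distance(vector, means[i]))


def kmeans(data, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: select k initial means at random from the data set.
    means = [list(v) for v in rng.sample(data, k)]
    for _ in range(max_iter):
        # Step 2: associate every observation with its nearest mean.
        clusters = [[] for _ in range(k)]
        for v in data:
            clusters[nearest(v, means)].append(v)
        # Step 3: the centroid of each cluster becomes the new mean.
        new_means = []
        for old, members in zip(means, clusters):
            if members:
                new_means.append([sum(m[d] for m in members) / len(members)
                                  for d in range(len(old))])
            else:
                # Assumption: an empty cluster keeps its previous mean.
                new_means.append(old)
        # Step 4: repeat until the means stop changing (convergence).
        if new_means == means:
            break
        means = new_means
    return means


# Toy data with two obvious groups (illustration only).
data = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0], [11.0, 11.0]]
codebook = kmeans(data, k=2)
print(sorted(codebook))  # [[0.5, 0.5], [10.5, 10.5]]
```

Because the initial means are chosen at random, k-means can in general converge to different local optima for different seeds; the toy data here is simple enough that every initialization reaches the same two centroids.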
11. Extract each image’s feature vector (1)
The feature vectors are extracted by using the code book.
Each image has an array of 1024 rows.
The distance between each row and every code in the
code book is measured.
The number of the nearest code is returned.
How many times each code appeared is stored as
an array.
[Figure: each 1x64 row is matched to its nearest code in the code book (example codes 7, 38, 72, 200); the result is a histogram over codes 1 to 256 counting how often each code occurs among the 1024 rows]
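The extraction step amounts to building a histogram of nearest-code indices. The sketch below illustrates this under assumptions: the function names and the tiny 3-entry code book are hypothetical, and codes are numbered from 0 here, whereas the slide numbers them from 1 to 256.

```python
def nearest_code(block, codebook):
    """Return the index of the code closest to the block (Euclidean)."""
    return min(range(len(codebook)),
               key=lambda i: sum((b - c) ** 2
                                 for b, c in zip(block, codebook[i])))


def feature_vector(blocks, codebook):
    """Count how many blocks map to each code in the code book."""
    counts = [0] * len(codebook)
    for block in blocks:
        counts[nearest_code(block, codebook)] += 1
    return counts


# Toy code book with 3 codes of length 2 (illustration only;
# the study uses 256 codes of length 64).
toy_codebook = [[0, 0], [10, 10], [20, 20]]
toy_blocks = [[1, 0], [9, 11], [0, 1], [19, 21], [11, 9]]
print(feature_vector(toy_blocks, toy_codebook))  # [2, 2, 1]
```

The counts always sum to the number of blocks, so for one 256x256 image the feature vector has 256 entries that sum to 1024.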
17. Result - Distance of feature vector(1)
The Euclidean distance between feature vectors
is measured to check the accuracy of the code
book.
P and Q are assumed to be two feature vectors.
Data : P = (x1, x2, ..., xn) and Q = (y1, y2, ..., yn)
n : size of the feature vector
The distance between P and Q is:
d(P, Q) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
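As a small sketch of this comparison, the Euclidean distance between two feature vectors can be computed as follows. The function name and the example vectors are made up for illustration and are not results from the study.

```python
import math


def euclidean_distance(p, q):
    """Euclidean distance between two feature vectors of equal size n."""
    if len(p) != len(q):
        raise ValueError("feature vectors must have the same size")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


# Made-up 3-entry feature vectors (real ones would have 256 entries).
feature_a = [2, 2, 1]
feature_b = [2, 0, 3]
distance = euclidean_distance(feature_a, feature_b)
print(distance)  # sqrt(0 + 4 + 4) = sqrt(8), about 2.83
```

A smaller distance means the two images use the codes of the code book in more similar proportions.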
19. Conclusion
In my research, I study a new image search
technique based on code book
information. The code book is obtained using
the VQ method.
The distance between Feature 5 and Feature 6
was short, which suggests that accurate
feature vectors were extracted.
Information retrieval based on
"the contents of an image themselves."
20. Future work
Remove (nullify) the background.
Extract the feature vector with a different
block size, for example 16x16 instead of 8x8.
Multimedia retrieval that uses SVM.