2. 2022/9/6
LAB 2 ImageNet
ImageNet is a large-scale image database built for research on visual object
recognition software. The project has manually annotated more than 14 million images
to indicate which objects they contain, and provides bounding boxes for at least 1 million of them.
• ImageNet contains more than 20,000 categories, such
as "balloon" or "strawberry", and a typical category contains
several hundred images.
• The annotations of the third-party image URLs can be obtained
directly from ImageNet for free.
• ImageNet Large Scale Visual Recognition Challenge
(ILSVRC). The challenge uses a trimmed list of 1,000 non-overlapping
categories, and competing programs are scored on how
correctly they classify and detect objects and scenes.
ILSVRC winning models
(tensorflow) C:> python classify_image.py --image_file=./model/bbb.jpg
(tensorflow) C:> Check: which class has the highest probability? Is the result correct?
(tensorflow) C:> python classify_image.py --image_file=./model/ccc.jpg
(tensorflow) C:> Check: which class has the highest probability? Is the result correct?
(tensorflow) C:> python classify_image.py --image_file=./model/ddd.jpg
(tensorflow) C:> Check: which class has the highest probability? Is the result correct?
(tensorflow) C:> python classify_image.py --image_file=./model/iii.jpg
(tensorflow) C:> Check: which class has the highest probability? Is the result correct? You can also try JPG images of your own.
Run Lab02
Lab 02 - Image classification - ImageNet
Report
: 109368532 陳明曉
109368516 于善任
Content
What is in the script - classify_image.py
Run the script with image
Retrain for the model
Reference (1: retrain python code)
What is in the script – Input variables
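model_dir: directory that contains classify_image_graph_def.pb
image_file: path of the image to classify
num_top_predictions: how many of the highest-scoring predictions to display (top 5)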
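The effect of num_top_predictions can be illustrated without TensorFlow. The sketch below is not the script's own code: `top_k` is a hypothetical helper, and the labels and scores are made-up values standing in for a softmax output.

```python
def top_k(scores, labels, k=5):
    """Return the k (label, score) pairs with the highest scores, best first."""
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical softmax output for six classes (the values sum to 1.0).
labels = ["panda", "balloon", "strawberry", "cat", "dog", "bus"]
scores = [0.70, 0.05, 0.10, 0.08, 0.04, 0.03]

for label, score in top_k(scores, labels, k=5):
    print(f"{label} (score = {score:.5f})")
```

With k=5 the lowest-scoring class ("bus" in this made-up data) is simply not reported.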
What is in the script – function
• Main function
What is in the script – function
• run_inference_on_image - 1
What is in the script – function
• run_inference_on_image - 2
10 categories
Out of >20000
categories in ImageNet
What is in the script – function
• run_inference_on_image - 2
Run the script with image - default
Retrain for the model – prepare
[Reference 1]
Download the training data.
In this case, the flower images in flower_photos.tgz.
Extract them to a folder, e.g. '/tmp/flower_photos'.
Download retrain.py, the Python code for retraining the model, from the TensorFlow repository.
(I copied this file into the 'models/tutorials/image/imagenet' directory.)
Now run retrain.py as follows:
python retrain.py --image_dir=c:\tmp\flower_photos --output_labels=retrained_labels.txt --output_graph=retrained_graph.pb
Retrain for the model – parameter
Training steps
How many training steps to run before ending.
GradientDescentOptimizer (learning_rate)
How large a learning rate to use when training.
Retrain for the model – Training steps
4000 steps vs. 100 steps
Test with ordinary pictures that are not in the training data set
4000 steps (0.99566)
100 steps (0.77682)
Accuracy increases with more training steps
learning rate
Retrain for the model – Running!
Retrain for the model – Error!
Function names or arguments have changed
tf.nn.softmax_cross_entropy_with_logits
Original: tf.nn.softmax_cross_entropy_with_logits(y, y_)
Changed to: tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_)
'tensorflow.python.training.training' has no attribute 'SummaryWriter'
Original: tf.train.SummaryWriter("output", sess.graph)
Changed to: tf.summary.FileWriter("output", sess.graph)
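Passing logits and labels by keyword matters because the two arguments are not interchangeable. The dependency-free sketch below shows the quantity this loss computes for a single example. `softmax_cross_entropy` is a hypothetical helper written for illustration, not TensorFlow code, and the input values are made up.

```python
import math


def softmax_cross_entropy(logits, labels):
    """Cross entropy between the label distribution and softmax(logits)."""
    # Subtract the largest logit before exponentiating, for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(y * lp for y, lp in zip(labels, log_probs))


logits = [2.0, 1.0, 0.1]   # raw, unnormalised network outputs
labels = [1.0, 0.0, 0.0]   # one-hot target: class 0

loss = softmax_cross_entropy(logits, labels)
print(f"loss = {loss:.4f}")
```

Swapping the two arguments would apply the softmax to the labels instead of the network outputs, which is why the newer API insists on named arguments.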
Retrain for the model –
Visualizing the retraining with TensorBoard
tensorboard --logdir /tmp/retrain_logs
TensorBoard shows numerical statistics for each training step: accuracy and loss
References
Retrain model
https://desiganp1.wordpress.com/2017/09/04/tensorflow-for-image-classification/
Netron
https://www.electronjs.org/apps/netron
GradientDescentOptimizer
https://ithelp.ithome.com.tw/articles/10221856
Retrain model parameter
https://lufor129.medium.com/%E6%95%B4%E7%90%86-%E7%B6%B2%E8%B7%AF%E4%B8%8A%E5%B8%B8%E8%A6%8B%E8%BC%83%E7%82%BA%E7%B0%A1%E5%96%AE%E7%9A%84transfer-learning-%E4%BD%9C%E6%B3%95-tensorflow-e7cb32031d1
Error 1
https://blog.csdn.net/caimouse/article/details/61208940
Error 2
https://blog.csdn.net/caimouse/article/details/56936042
(tensorflow) C:> cd Lab05_Vikramank_MNIST
(tensorflow) C:> ipython notebook MNISTClassification.ipynb
(tensorflow) C:> Wait for the browser to open…
(tensorflow) C:> As in Lab06, run the notebook cells one by one from top to bottom
(tensorflow) C:> Check: how much time does training take in total, and what is the estimated accuracy?
Run Lab05
LAB5 Vikramank MNIST
• MNIST is a well-known data set of handwritten digits, often described as
the "Hello World" of machine learning.
• It contains 60,000 training images and 10,000 test images.
• Each image is 28 x 28 pixels, and each pixel is represented by a grayscale
value.
0 : [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
1 : [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
…
8 : [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
9 : [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
One-hot encoding (common in classification)
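A minimal sketch of one-hot encoding for the ten digit classes (0 to 9). `one_hot` and `decode` are hypothetical helper names chosen for illustration.

```python
def one_hot(label, num_classes=10):
    """Return a list of zeros with a single 1 at position `label`."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
    vector = [0] * num_classes
    vector[label] = 1
    return vector


def decode(vector):
    """Recover the class index from a one-hot vector."""
    return vector.index(1)


for digit in (0, 1, 9):
    print(digit, ":", one_hot(digit))
```

Because the digits run from 0 to 9, the vector has exactly ten positions and the digit 9 sets the last one.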
mnist.train.images is a tensor of shape [55000, 784] (5,000 of the 60,000
training images are held out as a validation set). The first dimension indexes
the image, and the second dimension indexes a pixel within that image.
Each pixel intensity is a value between 0 and 1.
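The relation between the flat 784-value vector and the 28 x 28 picture can be sketched as follows. This assumes row-major order (pixels stored row by row), and `pixel_at` and `to_rows` are hypothetical helpers, not part of the lab code.

```python
def pixel_at(flat_image, row, col, width=28):
    """Look up pixel (row, col) in a row-major flattened image."""
    return flat_image[row * width + col]


def to_rows(flat_image, width=28):
    """Split a flattened image back into a list of rows."""
    return [flat_image[i:i + width] for i in range(0, len(flat_image), width)]


# Hypothetical all-black 28x28 image with one bright pixel at row 3, column 5.
image = [0.0] * 784
image[3 * 28 + 5] = 1.0

rows = to_rows(image)
print(len(rows), len(rows[0]))
print(pixel_at(image, 3, 5), rows[3][5])
```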
Lab05_Vikramank_MNIST Report
Graduate Institute of Automation Technology
109618506 王騰立
# importing data
from tensorflow.examples.tutorials.mnist import input_data
# one hot encoding returns an array of zeros and a single one. One corresponds to the class
data = input_data.read_data_sets("data/MNIST/", one_hot=True)
# %%
print("Shape of images in training dataset {}".format(data.train.images.shape))
print("Shape of classes in training dataset {}".format(data.train.labels.shape))
print("Shape of images in testing dataset {}".format(data.test.images.shape))
print("Shape of classes in testing dataset {}".format(data.test.labels.shape))
print("Shape of images in validation dataset {}".format(data.validation.images.shape))
print("Shape of classes in validation dataset {}".format(data.validation.labels.shape))
Shape of images in training dataset (55000, 784)
Shape of classes in training dataset (55000, 10)
Shape of images in testing dataset (10000, 784)
Shape of classes in testing dataset (10000, 10)
Shape of images in validation dataset (5000, 784)
Shape of classes in validation dataset (5000, 10)
code
result
code
result
import numpy as np
import matplotlib.pyplot as plt
from skimage.util import montage  # assumption: montage() comes from scikit-image
# sample image
sample = data.train.images[5].reshape(28, 28)
plt.imshow(sample, cmap='gray')
plt.title('Sample image')
plt.axis('off')
plt.show()
# display a montage of the first 100 training images
imgs = data.train.images[0:100]
montage_img = np.zeros([100, 28, 28])
for i in range(len(imgs)):
    montage_img[i] = imgs[i].reshape(28, 28)
plt.imshow(montage(montage_img), cmap='gray')
plt.title('Sample of input data')
plt.axis('off')
plt.show()
code
result
import tensorflow as tf  # TensorFlow 1.x API
# evaluate the three activation functions on 50 points in [-3, 3]
with tf.Session() as sess:
    x = np.linspace(-3, 3)
    tanh = tf.nn.tanh(x).eval()
    sigmoid = tf.nn.sigmoid(x).eval()
    relu = tf.nn.relu(x).eval()
plt.plot(x, tanh, 'g', x, sigmoid, 'b', x, relu, 'r')
plt.legend(('tanh', 'sigmoid', 'relu'))
plt.show()
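For reference, the three activation functions plotted above can be written out directly. This is a dependency-free sketch for single values, not the TensorFlow implementation.

```python
import math


def tanh(x):
    """Hyperbolic tangent: output in (-1, 1)."""
    return math.tanh(x)


def sigmoid(x):
    """Logistic sigmoid: output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


for x in (-3.0, 0.0, 3.0):
    print(f"x={x:+.1f}  tanh={tanh(x):+.4f}  sigmoid={sigmoid(x):.4f}  relu={relu(x):.1f}")
```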
code
import tensorflow as tf  # TensorFlow 1.x API
import tflearn
# input - shape 'None' means the batch size is not fixed, i.e. we can feed in any number of images
# input image
x = tf.placeholder(tf.float32, shape=[None, 784])
# input class
y_ = tf.placeholder(tf.float32, shape=[None, 10])
# %%
# Input Layer
# reshaping input for convolutional operation in tensorflow
# '-1' states that there is no fixed batch dimension, the 784 pixels are reshaped to 28x28, and '1' is for a single
# channel, i.e. a gray scale image
x_input = tf.reshape(x, [-1, 28, 28, 1], name='input')
# first convolutional layer with 32 output filters, filter size 5x5, stride of 1, same padding, and RELU activation.
# no bias is configured explicitly here; a max pooling layer follows
conv_layer1 = tflearn.layers.conv.conv_2d(x_input, nb_filter=32, filter_size=5, strides=[1, 1, 1, 1],
                                          padding='same', activation='relu', regularizer="L2", name='conv_layer_1')
# 2x2 max pooling layer
out_layer1 = tflearn.layers.conv.max_pool_2d(conv_layer1, 2)  # 32 feature maps, downsampled to a resolution of 14x14
# second convolutional layer
conv_layer2 = tflearn.layers.conv.conv_2d(out_layer1, nb_filter=32, filter_size=5, strides=[1, 1, 1, 1],
                                          padding='same', activation='relu', regularizer="L2", name='conv_layer_2')
out_layer2 = tflearn.layers.conv.max_pool_2d(conv_layer2, 2)  # 32 feature maps, downsampled to a resolution of 7x7
# fully connected layer
fcl = tflearn.layers.core.fully_connected(out_layer2, 1024, activation='relu')
fcl_dropout = tflearn.layers.core.dropout(fcl, 0.7)  # apply dropout before the output layer to reduce overfitting
y_predicted = tflearn.layers.core.fully_connected(fcl_dropout, 10, activation='softmax', name='output')
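The 14x14 and 7x7 resolutions noted in the comments can be checked with a little arithmetic. The sketch below assumes 'same' padding throughout and a pooling stride equal to the kernel size; `same_conv` and `max_pool` are hypothetical helpers that only track tensor sizes.

```python
import math


def same_conv(size, stride=1):
    """Spatial output size of a 'same'-padded convolution."""
    return math.ceil(size / stride)


def max_pool(size, kernel=2, stride=None):
    """Spatial output size of 'same'-padded max pooling (stride defaults to the kernel size)."""
    stride = stride or kernel
    return math.ceil(size / stride)


size, channels = 28, 1
size, channels = same_conv(size), 32      # conv_layer_1 -> 28 x 28 x 32
size = max_pool(size)                     # out_layer1   -> 14 x 14 x 32
size, channels = same_conv(size), 32      # conv_layer_2 -> 14 x 14 x 32
size = max_pool(size)                     # out_layer2   ->  7 x  7 x 32
flattened = size * size * channels        # inputs seen by the fully connected layer

print(size, channels, flattened)
```

Under these assumptions the fully connected layer receives 7 * 7 * 32 = 1568 values per image.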
code
# loss function
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_predicted), reduction_indices=[1]))
# optimiser
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
# calculating accuracy of our model
correct_prediction = tf.equal(tf.argmax(y_predicted, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# session setup is not shown on the slide; assumed here so that the loop below can run
sess = tf.Session()
sess.run(tf.global_variables_initializer())
epoch = 15000  # number of training steps (mini-batches), not passes over the whole data set
batch_size = 100
for i in range(epoch):
    # batch wise training
    x_batch, y_batch = data.train.next_batch(batch_size)
    _, loss = sess.run([train_step, cross_entropy], feed_dict={x: x_batch, y_: y_batch})
    if i % 500 == 0:
        # every 500 steps, also report accuracy on the test set
        Accuracy = sess.run(accuracy,
                            feed_dict={
                                x: data.test.images,
                                y_: data.test.labels
                            })
        Accuracy = round(Accuracy * 100, 2)
        print("step : {} , Loss : {} , Accuracy on test set : {} %".format(i, loss, Accuracy))
    elif i % 100 == 0:
        print("step : {} , Loss : {}".format(i, loss))
# after training, evaluate once on the validation set
validation_accuracy = round((sess.run(accuracy,
                                      feed_dict={
                                          x: data.validation.images,
                                          y_: data.validation.labels
                                      })) * 100, 2)
print("Accuracy in the validation dataset: {}%".format(validation_accuracy))
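The accuracy expression (argmax of the prediction compared with argmax of the one-hot label, then averaged) can be sketched without TensorFlow. `argmax` and `accuracy` below are hypothetical helpers, and the predictions are made-up values.

```python
def argmax(values):
    """Index of the largest element (the first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def accuracy(predictions, one_hot_labels):
    """Fraction of rows whose predicted class matches the labelled class."""
    hits = sum(argmax(p) == argmax(y) for p, y in zip(predictions, one_hot_labels))
    return hits / len(predictions)


# Hypothetical softmax outputs for three images and three classes.
predictions = [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]
labels = [[0, 1, 0], [0, 0, 1], [0, 0, 1]]

print(f"accuracy = {accuracy(predictions, labels):.2%}")
```

In this made-up batch the second image is misclassified, so two of three predictions count as correct.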
result-1: parameters (batch_size=50, dropout=0.8)
result-2: parameters (batch_size=100, dropout=0.7)
Keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Conv2D, MaxPooling2D, Flatten
from keras.optimizers import SGD, Adam
from keras.utils import np_utils
from keras.datasets import mnist
import tensorflow as tf
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
def load_data():  # prepares data for categorical_crossentropy
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    number = 60000
    x_train = x_train[0:number]
    y_train = y_train[0:number]
    # flatten each 28x28 image into a vector of 784 values
    x_train = x_train.reshape(number, 28 * 28)
    x_test = x_test.reshape(x_test.shape[0], 28 * 28)
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    # convert class vectors to binary class matrices
    y_train = np_utils.to_categorical(y_train, 10)
    y_test = np_utils.to_categorical(y_test, 10)
    # x_test = np.random.normal(x_test)  # add noise
    # scale the grayscale values to the range 0..1
    x_train = x_train / 255
    x_test = x_test / 255
    return (x_train, y_train), (x_test, y_test)
if __name__ == '__main__':
    # load training data and testing data
    (x_train, y_train), (x_test, y_test) = load_data()
    # define network structure
    model = Sequential()
    model.add(Dense(input_dim=28 * 28, units=500, activation='relu'))
    model.add(Dropout(0.6))
    model.add(Dense(units=500, activation='relu'))
    model.add(Dropout(0.6))
    model.add(Dense(units=10, activation='softmax'))
    # set configurations
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam', metrics=['accuracy'])
    # train model
    model.fit(x_train, y_train, batch_size=100, epochs=20)
    # evaluate the model and output the accuracy
    result_train = model.evaluate(x_train, y_train)
    result_test = model.evaluate(x_test, y_test)
    print('Train Acc:', result_train[1])
    print('Test Acc:', result_test[1])
Hyperparameter Tuning
• Do not use too many hidden layers, otherwise the gradient may vanish; 2~3
layers are enough for a simple task such as MNIST.
• Choose ReLU or another piecewise-linear activation function such as Maxout.
• 500~600 neurons in each hidden layer is appropriate.
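Why depth and the choice of activation interact can be sketched numerically. The derivative of the sigmoid is at most 0.25, so multiplying one such factor per layer shrinks the gradient quickly, while the ReLU derivative is 1 for positive inputs. The sketch ignores the weights entirely, which is a simplifying assumption, and `chained_gradient` is a hypothetical helper.

```python
import math


def sigmoid_grad(x):
    """Derivative of the logistic sigmoid; peaks at 0.25 when x == 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def chained_gradient(grad_fn, pre_activation, layers):
    """Product of per-layer activation derivatives (weights ignored for simplicity)."""
    return grad_fn(pre_activation) ** layers


for layers in (2, 3, 10):
    print(layers,
          chained_gradient(sigmoid_grad, 0.0, layers),
          chained_gradient(relu_grad, 1.0, layers))
```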
• For classification problems, use cross entropy (categorical_crossentropy) as the loss function
instead of mean squared error (mse).
• Adam is the usual choice of optimizer. It combines RMSProp and Momentum, so each update
takes into account the current gradient together with a running history of past gradients.
• If the accuracy on the testing data is very low while the accuracy on the training data is
relatively high, consider using dropout. In Keras, add
model.add(Dropout(0.5)) after each hidden layer, where 0.5 is a tunable parameter. Note that
after adding dropout the accuracy on the training set drops while the accuracy on
the testing set rises, which is normal.
• If the input is image pixels, normalize the grayscale values, that is, divide by 255 so that
they lie between 0 and 1.
• It is best to output the accuracy on the training set and on the testing set together, for easy
comparison.
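How Adam combines a Momentum-style average with an RMSProp-style average can be sketched for a single scalar parameter. This is the commonly published form of the update with its usual default constants, not Keras's internal code. `adam_step` is a hypothetical helper, and the quadratic objective is made up for the demonstration.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; returns (param, m, v)."""
    m = beta1 * m + (1 - beta1) * grad            # Momentum-style running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad     # RMSProp-style running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                  # bias correction for the zero initialisation
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    w, m, v = adam_step(w, 2 * (w - 3), m, v, t, lr=0.05)

print(f"w = {w:.3f}")
```

Dividing by the root of the squared-gradient average makes the step size roughly independent of the gradient's scale, which is one reason Adam tends to need little tuning.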
(tensorflow) C:> cd ..
(tensorflow) C:> cd Lab06_SSD-Tensorflow-master/notebooks
(tensorflow) C:> ipython notebook ssd_notebook.ipynb
(tensorflow) C:> Wait for the browser to open…
Run Lab06
Run the cells one by one from top to bottom (Shift+Enter)
You can place an image of your own in the
Lab06_SSD-Tensorflow-master/demo directory, or use one of the prepared images
by changing the index (0~13) in the prediction cell below, e.g. image_name[8].
Run Lab06
0~13
Classes
none: (0, 'Background')
aeroplane: (1, 'Vehicle')
bicycle: (2, 'Vehicle')
bird: (3, 'Animal')
boat: (4, 'Vehicle')
bottle: (5, 'Indoor')
bus: (6, 'Vehicle')
car: (7, 'Vehicle')
cat: (8, 'Animal')
chair: (9, 'Indoor')
cow: (10, 'Animal')
dining table: (11, 'Indoor')
dog: (12, 'Animal')
horse: (13, 'Animal')
motorbike: (14, 'Vehicle')
person: (15, 'Person')
pottedplant: (16, 'Indoor')
sheep: (17, 'Animal')
sofa: (18, 'Indoor')
train: (19, 'Vehicle')
tvmonitor: (20, 'Indoor')
Run Lab06 -21 Classes
Outline
1. Modify the category
2. Modify the SSD network definition based on VGG 300
3. Modify the parameters of the visualization program
4. Modify the parameters of the preprocessing image program
5. Instructions for running the image test program
6. Display the test results of the trained model
1. Modify the category
filename: pascalvoc_common.py
path: ...\AI_Labs\Lab06_SSD-Tensorflow-master\datasets
Note: Modify the number of categories as appropriate
2. Modify the SSD network definition based on VGG 300
ssd_vgg_300.py
...\AI_Labs\Lab06_SSD-Tensorflow-master\nets
Note: Modify dropout_keep_prob, weight_decay, match_threshold,
negative_ratio
3. Modify the parameters of the visualization program
visualization.py
...\AI_Labs\Lab06_SSD-Tensorflow-master
Note: Modify fontsize, color
4. Modify the parameters of the preprocessing image program
train_ssd_network.py
AI_Labs\Lab06_SSD-Tensorflow-master
Note: Modify weight_decay, optimizer, learning_rate, learning_rate_decay_factor
5. Instructions for running the image test program
ssd_notebook.ipynb
...\AI_Labs\Lab06_SSD-Tensorflow-master\notebooks
Note: Modify select_threshold and nms_threshold, and change the image
6. Display the test results of the trained model(1)
6. Display the test results of the trained model(2)
6. Display the test results of the trained model(3)
ML Lab06 – SSD Report
110C51509 陳清雲 Sam
110C51518 黃彥騰yan
110C51513 蘇小淇anGIE
1. ADDING A PREDICTION IMAGE TO THE DATASET
"demo" directory
1. ADDING A PHOTO OF PEOPLE: image name 00021.jpg
ADJUSTING PARAMETER
1. Default value: select_threshold=0.5
(only detections scoring at or above this value are displayed)
Only a couple of the people in the image
were framed and recognized
2. Changing select_threshold from 0.5 to 0.3
(more boxes)
All the people were framed and predicted
However, accuracy did not improve
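The effect of lowering select_threshold can be sketched as a simple score filter. `select_detections` is a hypothetical helper, and the detections are made-up values, not output from the SSD notebook.

```python
def select_detections(detections, select_threshold=0.5):
    """Keep only detections whose confidence score reaches the threshold."""
    return [d for d in detections if d["score"] >= select_threshold]


# Hypothetical detector output: class name and confidence score per box.
detections = [
    {"label": "person", "score": 0.92},
    {"label": "person", "score": 0.41},
    {"label": "person", "score": 0.33},
    {"label": "chair", "score": 0.12},
]

print(len(select_detections(detections, 0.5)))
print(len(select_detections(detections, 0.3)))
```

Lowering the threshold only changes which boxes are shown. The scores themselves stay the same, which is consistent with the observation that accuracy did not improve.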
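ADJUSTING PARAMETER
1. Default value: nms_threshold=0.45
2. Changing nms_threshold from 0.45 to 0.9
(This parameter suppresses neighbouring, similar detections. The larger the value, the more neighbouring
boxes are kept, which adds noise and tends to lower precision: more overlapping boxes.)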
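A minimal sketch of greedy non-maximum suppression shows why a larger nms_threshold leaves more overlapping boxes. It assumes boxes are given as (ymin, xmin, ymax, xmax); `iou` and `nms` are hypothetical helpers rather than the notebook's implementation, and the boxes are made up.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (ymin, xmin, ymax, xmax)."""
    ih = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iw = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ih * iw
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(boxes, scores, nms_threshold=0.45):
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any box already kept.
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical boxes: the first two overlap heavily, the third is separate.
boxes = [(0.0, 0.0, 1.0, 1.0), (0.0, 0.1, 1.0, 1.1), (2.0, 2.0, 3.0, 3.0)]
scores = [0.9, 0.8, 0.7]

print(nms(boxes, scores, 0.45))
print(nms(boxes, scores, 0.90))
```

The first two boxes have an IoU of about 0.82, so the second one is suppressed at 0.45 but survives at 0.9.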