
Classification of Satellite Images


A Python-based transfer learning project built on MongoDB, Flask and TensorFlow, presented at GeoPython 2018.


  1. Classification of Satellite Images: a Python-based transfer learning approach. Johannes Oos, oosjoh@gmail.com
  2. Project Overview - Purpose: “S. […] had a farm north of the railway about 160 miles from mine. In the middle of it was a great tooth of granite, which soared up for about 200 feet […]” (Snow on the Equator, Tilman, 1937)
  3. Project Overview - Purpose
  4. Project Overview - Data Flow: Single pics via request → PIL → BW → MongoDB → Manually Label (“0” / “1”) and Store → Retraining Script (Pre-trained Classifier → Customized Classifier) → Collect, Classify and Store Result (e.g. “0.91”)
  5. Technology Overview: 1. Collection – Google Maps API; 2. Storing – MongoDB; 3. Labelling – Flask; 4. Classifier – TensorFlow
  6. Getting the data [data flow diagram repeated]
  7. Getting the data - source
     No data was available on this topic ⇒ creation of a dataset.
     Google Maps API:
     • Very good documentation
     • Picture size chosen: 320 x 320 pixels
     • Zoom: about 1 meter/pixel
     PRO:
     • Covers the entire planet
     • Flexible zoom and picture sizes
     • High resolution
     CON:
     • The free quota is only enough for about 50 x 50 km per day
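The relation between zoom level and "about 1 meter/pixel" can be sketched with the usual Web Mercator ground-resolution formula. The slides only say "Gmaps API", so the use of the Maps Static API endpoint, the zoom level 17 and the function names below are assumptions for illustration, not the author's code.

```python
import math
from urllib.parse import urlencode

EARTH_CIRCUMFERENCE_M = 40_075_016.686  # equatorial circumference (WGS84)
TILE_SIZE_PX = 256                      # base tile size of the Web Mercator pyramid


def meters_per_pixel(lat_deg: float, zoom: int) -> float:
    """Approximate ground resolution of a Web Mercator map at a given latitude."""
    return (EARTH_CIRCUMFERENCE_M * math.cos(math.radians(lat_deg))
            / (TILE_SIZE_PX * 2 ** zoom))


def static_map_url(lat: float, lng: float, zoom: int = 17,
                   size_px: int = 320, api_key: str = "YOUR_API_KEY") -> str:
    """Build the request URL for one satellite picture (endpoint is an assumption)."""
    params = {
        "center": f"{lat},{lng}",
        "zoom": zoom,
        "size": f"{size_px}x{size_px}",
        "maptype": "satellite",
        "key": api_key,
    }
    return "https://maps.googleapis.com/maps/api/staticmap?" + urlencode(params)


print(round(meters_per_pixel(0.0, 17), 3))   # resolution at the equator
print(static_map_url(38.7126511, -109.3357178))
```

At zoom 17 the formula gives roughly 1.2 m per pixel at the equator and a little under 1 m at 38.7° latitude, which matches the order of magnitude stated on the slide.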
  8. Getting the data - subsetting
     Climbing areas:
     • The Kenyan ones give too few samples
     • Filled up with other (American) ones
     Non-climbing areas:
     • Make sure no climbing area is included by coincidence
     • Selected mostly from that region
     Problem: positive samples are rare but appear in areas suspected to be completely negative ⇒ careful labelling
  9. Getting the data - storing
     Storing geodata in MongoDB is generally straightforward:
     • Very flexible
     • Geo index for quick retrieval
     • Geo queries for clear separation with multiple agents
     Storing pics directly in MongoDB:
     Advantages:
     • Easy to scale (sharding)
     • Easy to back up (replica sets)
     • File and other information in the same place (less complex)
     Disadvantages:
     • Pictures have to be saved as binary
     • Performance loss compared to the file system (?)
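A geo index and a bounding-box query of the kind mentioned above might look like the following sketch. It only builds the document and the filter as plain dictionaries; the helper names are hypothetical, and the GeoJSON layout with longitude first is an assumption (the example document in the slides stores `loc` as a plain `[lat, lng]` pair).

```python
def make_tile_document(lat: float, lng: float, pic_b64: str) -> dict:
    """Document for one unlabelled picture, located by a GeoJSON point."""
    return {
        "loc": {"type": "Point", "coordinates": [lng, lat]},  # GeoJSON: longitude first
        "pic": pic_b64,
        "labelled": False,
    }


def box_query(south: float, west: float, north: float, east: float,
              labelled: bool = False) -> dict:
    """Filter for documents inside a lat/lng box, e.g. the area owned by one agent."""
    ring = [[west, south], [east, south], [east, north], [west, north], [west, south]]
    return {
        "labelled": labelled,
        "loc": {"$geoWithin": {"$geometry": {"type": "Polygon", "coordinates": [ring]}}},
    }


# With pymongo and a running server (not executed here):
#   coll.create_index([("loc", "2dsphere")])
#   coll.insert_one(make_tile_document(38.7126511, -109.3357178, "..."))
#   docs = coll.find(box_query(38.0, -110.0, 39.0, -109.0))
```

Giving each collecting agent its own non-overlapping box is one way to read "clear separation with multiple agents".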
  10. Getting the data - storing: saving pics as binary: file → PIL → binary (and back to .jpg for the classifier)
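The round trip file → PIL → binary → .jpg could be sketched as below. The `pic` value of the example document in the slides starts with `iVBORw0KGgo`, which is the base64 form of the PNG file signature, so this sketch assumes base64-encoded PNG bytes; the function names are hypothetical and the two PIL helpers need the Pillow package.

```python
import base64
import io


def to_grayscale_png_bytes(path: str) -> bytes:
    """Load an image file, convert it to one grayscale channel, return PNG bytes."""
    from PIL import Image  # Pillow, imported lazily so the base64 helpers work without it
    with Image.open(path) as img:
        buffer = io.BytesIO()
        img.convert("L").save(buffer, format="PNG")
    return buffer.getvalue()


def encode_pic(raw: bytes) -> str:
    """Binary picture -> text that fits into a MongoDB string field."""
    return base64.b64encode(raw).decode("ascii")


def decode_pic(pic_b64: str) -> bytes:
    """Stored text -> original binary picture."""
    return base64.b64decode(pic_b64)


def png_to_jpeg_bytes(png_bytes: bytes) -> bytes:
    """Re-encode stored PNG bytes as JPEG for the classifier."""
    from PIL import Image
    with Image.open(io.BytesIO(png_bytes)) as img:
        out = io.BytesIO()
        img.convert("L").save(out, format="JPEG")
    return out.getvalue()
```

Storing raw bytes in a BSON binary field would avoid the base64 overhead of roughly one third; the slides do not say why the text form was chosen.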
  11. Getting the data – tile system
  12. Labelling [data flow diagram repeated]
  13. Labelling
      1. The unlabelled data in the MongoDB collection:
         • {"loc": (80.0, 70.1), "pic": the binary-PIL-file, "labelled": False}
         • A MongoDB document is very flexible and can be used without too much preparation
      2. Show 5 pictures at a time in a Flask view with the labelling option; navigate with Tab and select the label with Enter
      3. Store the pictures in a different collection
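The three labelling steps can be sketched as a small Flask app. This is an illustration under assumptions, not the author's view: two in-memory lists stand in for the two MongoDB collections, document ids are plain strings, `pic` is assumed to hold a base64 PNG, and all names (`next_batch`, `apply_labels`, `create_app`, `PAGE`) are hypothetical.

```python
def next_batch(unlabelled: list, size: int = 5) -> list:
    """Return up to `size` documents that still need a label."""
    return [doc for doc in unlabelled if not doc["labelled"]][:size]


def apply_labels(unlabelled: list, labelled: list, labels: dict) -> None:
    """Move every document whose id appears in `labels` to the labelled list."""
    for doc in list(unlabelled):
        if doc["_id"] in labels:
            doc.update(label=labels[doc["_id"]], labelled=True)
            unlabelled.remove(doc)
            labelled.append(doc)


PAGE = """
<form method="post">
  {% for doc in batch %}
    <img src="data:image/png;base64,{{ doc['pic'] }}" width="320">
    <select name="{{ doc['_id'] }}"><option>0</option><option>1</option></select>
  {% endfor %}
  <button type="submit">Save labels</button>
</form>
"""


def create_app(unlabelled: list, labelled: list):
    """Build the Flask app; Flask is imported lazily so the helpers run without it."""
    from flask import Flask, render_template_string, request

    app = Flask(__name__)

    @app.route("/", methods=["GET", "POST"])
    def label_view():
        if request.method == "POST":
            # Form field names are document ids, values are the chosen labels.
            apply_labels(unlabelled, labelled, request.form.to_dict())
        return render_template_string(PAGE, batch=next_batch(unlabelled))

    return app
```

Plain form controls already support the Tab and Enter navigation described on the slide, so no JavaScript is needed for that part.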
  14. Flask to show and label the pictures
  15. Flask to show and label the pictures
  16. MongoDB Analytics: statsPreOut[7]: [8620, (8605, 0, 0), 15]
  17. Labelling – Sample “CLIMB”
  18. Labelling – Sample “CLIMB”
  19. Labelling – Sample “CLIMB”
  20. Labelling – Sample “NO”
  21. Labelling – Sample “NO”
  22. Labelling – Sample “NO”
  23. Labelling – Sample “NO”
  24. Labelling – Result
      1. 8605 entries in the MongoDB collection
      2. Pictures of size 320 x 320
      3. Single channel (BW: 0…255)
      4. 809 labelled “1” (CLIMB) and 7796 labelled “0” (NO CLIMB)
      5. Example document:
         {u'_id': ObjectId('5a901c579f0fea0418f3006e'), u'label': u'0', u'labelled': True, u'length': 298.21614707161154, u'loc': [38.7126511, -109.33571780038419], u'pic': u'iVBORw0KGgoAAAANSUhEUgAAAUAAAAFACAQAAABnmW 0hAAEAAElEQVR4nGT92Y9kZ5om+P3c1mO7uZn5GptHkAxum cwsVmVlVXV110z1TPd0YyA0IAjQhQboW0kX+k8ECAPNjQBpB N0MMBcaYDBST3ejqnqr7qwlF2aSDAbJC …….. }
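The label counts above imply a strongly imbalanced dataset, which is worth keeping in mind when reading accuracy figures. A small worked calculation, using only the numbers from the slide (the function name is hypothetical):

```python
def class_balance(n_positive: int, n_negative: int) -> dict:
    """Share of positives and the accuracy of always predicting the majority class."""
    total = n_positive + n_negative
    return {
        "total": total,
        "positive_share": n_positive / total,
        "majority_baseline_accuracy": max(n_positive, n_negative) / total,
    }


stats = class_balance(n_positive=809, n_negative=7796)
print(f"{stats['total']} pictures, {stats['positive_share']:.1%} CLIMB, "
      f"always-NO baseline accuracy {stats['majority_baseline_accuracy']:.1%}")
# prints: 8605 pictures, 9.4% CLIMB, always-NO baseline accuracy 90.6%
```

The slides do not state whether the classes were rebalanced before retraining, so the baseline is context only, not a statement about the author's evaluation.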
  25. Choice of Classifier [data flow diagram repeated]
  26. Choice of Classifier
      Build your own:
      • Capsule network
      • Customized CNN architecture
      • Feature engineering
      Pre-trained object detection: GitHub - detection_model_zoo
      Pre-trained classifier:
      • Inception v4
      • Inception v3
      • MobileNets
      • VGG16
      • AlexNet
  27. Choice of Classifier: An Analysis of Deep Neural Network Models for Practical Applications, Canziani et al. 2017
  28. Choice of Classifier
  29. Choice of Classifier
  30. Classifier Architecture
  31. Classifier Architecture – mixed_1
  32. Classifier Architecture – tower_1
  33. Classifier Architecture: Rethinking the Inception Architecture for Computer Vision, Szegedy et al. 2015
  34. Classifier Architecture
  35. Retraining [data flow diagram repeated]
  36. Retraining: 1. General Idea; 2. How To; 3. Input Format; 4. Size of Graph; 5. Duration of Training; 6. Evaluation
  37. Retraining - general idea
      Original: Output = f(g(h(Input)))
      Retrained: Output = f*(g(h(Input*)))
  38. Retraining - general idea: original vs. retrained network, with the retrained outputs “Climb” and “No Climb” (image: https://commons.wikimedia.org/wiki/File:Typical_cnn.png)
  39. Retraining - general idea: adjusted input, constant processing, adjusted output
  40. Retraining - general idea
      “[…] That means it [the penultimate layer] has to be a meaningful and compact summary of the images, since it has to contain enough information for the classifier to make a good choice in a very small set of values. The reason our final layer retraining can work on new classes is that it turns out the kind of information needed to distinguish between all the 1,000 classes in ImageNet is often also useful to distinguish between new kinds of objects.” (www.tensorflow.org/)
      ⇒ Classification of top-view landscape images is actually out of scope, but works reasonably well
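The idea Output = f*(g(h(Input*))) can be illustrated with a toy example in plain Python: the inner functions stay fixed and only a new final layer is fitted. This is a didactic stand-in, not the Inception retraining itself; `frozen_features` plays the role of g(h(.)), the logistic regression plays the role of f*, and the data is synthetic.

```python
import math
import random


def frozen_features(x):
    """Stand-in for the fixed layers g(h(.)); never updated during retraining."""
    return [x[0] + x[1], x[0] - x[1]]


def predict(weights, bias, x):
    """New final layer f*: logistic regression on top of the frozen features."""
    z = sum(w * f for w, f in zip(weights, frozen_features(x))) + bias
    return 1.0 / (1.0 + math.exp(-z))


def retrain_final_layer(samples, labels, steps=200, lr=0.5):
    """Fit only the final layer with full-batch gradient descent on cross entropy."""
    weights, bias = [0.0, 0.0], 0.0
    feats = [frozen_features(x) for x in samples]  # computed once, then reused
    n = len(samples)
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for f, y in zip(feats, labels):
            z = sum(w * v for w, v in zip(weights, f)) + bias
            err = 1.0 / (1.0 + math.exp(-z)) - y
            grad_w = [g + err * v for g, v in zip(grad_w, f)]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


rng = random.Random(0)
samples = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
labels = [1 if x0 + x1 > 0 else 0 for x0, x1 in samples]
weights, bias = retrain_final_layer(samples, labels)
accuracy = sum((predict(weights, bias, x) > 0.5) == bool(y)
               for x, y in zip(samples, labels)) / len(samples)
print(f"training accuracy of the retrained final layer: {accuracy:.2f}")
```

Because the frozen features are computed once and reused in every step, training touches only a handful of parameters, which is why retraining a final layer is fast compared with training a full network. The sketch covers f* only; the adjusted input (Input*) is not modelled.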
  41. Retraining – HowTo
      https://github.com/googlecodelabs/tensorflow-for-poets-2
      python scripts/retrain.py \
        --output_graph=tf_files/new_graph.pb \
        --output_labels=tf_files/climb_labels.txt \
        --image_dir=tf_files/climb \
        --how_many_training_steps=5000
  42. Retraining - Input Format
  43. Retraining - Graph Size: 90 MB
  44. Retraining - Duration
      INFO:tensorflow:2018-05-06 21:08:48.270479: Step 0: Train accuracy = 92.0%
      INFO:tensorflow:2018-05-06 21:08:48.273313: Step 0: Cross entropy = 0.601665
      INFO:tensorflow:2018-05-06 21:08:51.722497: Step 0: Validation accuracy = 93.0% (N=100)
      INFO:tensorflow:2018-05-06 21:08:55.491133: Step 10: Train accuracy = 90.0%
      INFO:tensorflow:2018-05-06 21:08:55.491296: Step 10: Cross entropy = 0.304652
      INFO:tensorflow:2018-05-06 21:08:55.826968: Step 10: Validation accuracy = 85.0% (N=100)
      .
      INFO:tensorflow:2018-05-06 21:11:00.927218: Step 990: Train accuracy = 96.0%
      INFO:tensorflow:2018-05-06 21:11:00.927409: Step 990: Cross entropy = 0.111929
      INFO:tensorflow:2018-05-06 21:11:01.018710: Step 990: Validation accuracy = 98.0% (N=100)
      INFO:tensorflow:2018-05-06 21:11:01.830915: Step 999: Train accuracy = 97.0%
      INFO:tensorflow:2018-05-06 21:11:01.831077: Step 999: Cross entropy = 0.132192
      INFO:tensorflow:2018-05-06 21:11:01.927725: Step 999: Validation accuracy = 97.0% (N=100)
      INFO:tensorflow:Final test accuracy = 93.6% (N=776)
      INFO:tensorflow:Froze 2 variables.
      Converted 2 variables to const ops.
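The training duration can be read off the log timestamps. The following sketch parses the first and last step lines shown above; the regular expression and function name are illustrative and only cover the line format visible in this excerpt.

```python
import re
from datetime import datetime

LOG_LINE = re.compile(
    r"INFO:tensorflow:(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"Step (\d+): (.+?) = ([\d.]+)"
)


def parse_log(text):
    """Return (timestamp, step, metric name, value) for every step line."""
    rows = []
    for stamp, step, metric, value in LOG_LINE.findall(text):
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
        rows.append((when, int(step), metric, float(value)))
    return rows


log = """INFO:tensorflow:2018-05-06 21:08:48.270479: Step 0: Train accuracy = 92.0%
INFO:tensorflow:2018-05-06 21:11:01.927725: Step 999: Validation accuracy = 97.0% (N=100)"""

rows = parse_log(log)
elapsed = (rows[-1][0] - rows[0][0]).total_seconds()
steps = rows[-1][1] - rows[0][1] + 1
print(f"{steps} steps in {elapsed:.1f} s ({steps / elapsed:.1f} steps/s)")
# prints: 1000 steps in 133.7 s (7.5 steps/s)
```

So the logged part of the run, from step 0 to step 999, took a little over two minutes on the author's machine.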
  45. Retraining - Log
      https://github.com/googlecodelabs/tensorflow-for-poets-2
      python scripts/retrain.py \
        --output_graph=tf_files/new_graph.pb \
        --output_labels=tf_files/climb_labels.txt \
        --image_dir=tf_files/climb \
        --how_many_training_steps=5000 \
        --summaries_dir=tf_files/training_summaries/
  46. Classifier Evaluation [data flow diagram repeated]
  47. Evaluation - Tensorboard files
  48. Evaluation - Tensorboard
      (tFInc) bash-3.2$ cd training_summaries
      (tFInc) bash-3.2$ ls
      train validation
      (tFInc) bash-3.2$ tensorboard --logdir=train
      ==> TensorBoard 1.7.0 at http://Johannes-MacBook-Air.local:6006 (Press CTRL+C to quit)
  49. Evaluation Tensorboard Graph
  50. Evaluation Tensorboard Graph
  51. Evaluation Tensorboard Graph
  52. Evaluation – Scalars / Histograms
  53. Usage of Classifier [data flow diagram repeated]
  54. Usage of Classifier
      Timings at the start:
      • Collecting from gmaps: 0.172 sec
      • Label: 2.698 sec
      • Store label to MongoDB: 0.005 sec
      • Total: 2.875 sec
      Timings after 10,000 pictures:
      • Collecting from gmaps: 0.207 sec
      • Label: 7.052 sec
      • Store label to MongoDB: 0.005 sec
      • Total: 7.265 sec
  55. Search Process
      For picture in area:
        Get picture from gmaps
        Label with classifier
        Store result in collection
        Calculate next center
      320 x 320 pixels @ ca. 1 pixel/meter ==> 12 images per sq km
      25,000 images/day ==> 20 km x 100 km searchable per day
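The "calculate next center" step and the coverage arithmetic can be sketched as follows. The degree-to-meter conversion is a rough spherical approximation, and the function names and the row-by-row scan order are assumptions; the slides do not show the author's tiling code.

```python
import math

METERS_PER_DEG_LAT = 111_320.0  # rough spherical approximation


def images_per_sqkm(stride_m: float) -> float:
    """Number of pictures needed per square kilometer for a given center spacing."""
    return (1000.0 / stride_m) ** 2


def tile_centers(south, west, north, east, stride_m=320.0):
    """Yield (lat, lng) picture centers covering the box row by row."""
    lat_step = stride_m / METERS_PER_DEG_LAT
    lat = south + lat_step / 2
    while lat < north:
        # A degree of longitude shrinks with the cosine of the latitude.
        lng_step = stride_m / (METERS_PER_DEG_LAT * math.cos(math.radians(lat)))
        lng = west + lng_step / 2
        while lng < east:
            yield (lat, lng)
            lng += lng_step
        lat += lat_step


print(images_per_sqkm(320.0))          # pictures per sq km without overlap
print(25_000 / 12)                     # sq km per day at 12 pictures per sq km
print(len(list(tile_centers(0.0, 0.0, 0.01, 0.01))))
```

Without overlap, 320 m tiles need about 9.8 pictures per sq km; the 12 per sq km quoted on the slide would correspond to a center spacing of roughly 289 m, i.e. slightly overlapping tiles, though the slides do not state the stride. At 12 pictures per sq km, 25,000 pictures cover about 2,083 sq km, in line with the 20 km x 100 km figure.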
  56. Search Process
  57. Search Process
  58. Search Process
  59. Search Process
  60. Places Found
      In[2]: getRes()
      Out[2]: ([18504, 7174, 4306, 1093, 572, 119, 59, 17], 75463)
  61. Places Found
  62. Places Found: Flask output with links to the highest rated locations
  63. Places Found (< 1%)
  64. Places Found (50%)
  65. Places Found (> 98%)
  66. Places Found (> 98%)
  67. Places Found (> 98%)
  68. Improvement
      1. Relabelling
      2. More flexible collection (zoom, stride, data source)
      3. Result visualization on a map
      4. Bigger search area
