Background
Land use maps help authorities and planners create spatial
plans to manage land and natural resources sustainably.
The Open Land Use map database is a seamless, harmonized vector
dataset that covers the whole of Europe. It is based on open data
and provides a land use map at the most detailed scale possible
for each country.
Because there is a lot of open vector data available in Europe, it
is possible to create such a map from this data alone.
Background
In Africa, on the other hand, there is almost no open vector data
related to land use.
Research done in 2017 during the INSPIRE hackathon in Kehl
identified just a few data sources:
➔OpenStreetMap
➔Africover dataset
These datasets cover just a small part of Africa, and the Africover
dataset was collected in the late 1990s – early 2000s, so it is outdated.
Another important motivation for this work is the many negative
processes under way in Africa (deforestation, soil erosion, uncontrolled
urban growth, etc.) that could be addressed with land use plans and
measures supporting sustainable development.
Background
●
The absence of open data, the rapid pace of the negative
processes mentioned above (which therefore need to be
monitored), and the easy availability of satellite data
nowadays (e.g. Sentinel-2, Landsat-8) led to the decision to
derive land use from satellite imagery
●
The map was to be created with the help of the TensorFlow
library (one of the most powerful libraries for machine
learning)
Background
With the help of this library it is possible to perform the following
operations:
➔
Classify an image (i.e. tell that an object X is present in the image)
➔
Detect objects in the image (draw a bounding box around each
detected object)
➔
Perform semantic segmentation of an image (segment the image into
the different object classes that appear in it)
➔
Segment the image into individual object instances
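The four tasks differ mainly in the shape of their outputs. A schematic illustration with dummy NumPy arrays (the sizes here are illustrative placeholders, not from any specific model):

```python
import numpy as np

# Illustrative output shapes for the four vision tasks above
# (dummy data; real models would produce these from an input image).
H, W, NUM_CLASSES, NUM_DETECTIONS = 64, 64, 5, 3

# 1. Image classification: one probability per class for the whole image
class_probs = np.full(NUM_CLASSES, 1.0 / NUM_CLASSES)

# 2. Object detection: a box (ymin, xmin, ymax, xmax) plus a score per object
boxes = np.zeros((NUM_DETECTIONS, 4))
scores = np.zeros(NUM_DETECTIONS)

# 3. Semantic segmentation: one class label per pixel
semantic_mask = np.zeros((H, W), dtype=np.int32)

# 4. Instance segmentation: one binary mask per detected object
instance_masks = np.zeros((NUM_DETECTIONS, H, W), dtype=bool)
```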
Background
In general, of all AI algorithms, convolutional neural networks
(CNNs) seem to be the most suitable for the abovementioned tasks.
Here is the diagram that explains how these CNN models for
image classification are created:
TensorFlow library
From the user's perspective, the process of creating the model
consists of the following steps:
➔
Collecting a big collection (thousands of images) of samples of each
land use class
➔
Dividing the collected samples into two almost equal parts: training
and evaluation
➔
Selecting the base type of CNN model that will be trained
➔
Running training
➔
Running evaluation
➔
Using the model
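The steps above can be sketched end to end with tf.keras. The tiny CNN, the random stand-in data and the 50/50 split are illustrative assumptions, not the configuration actually used in this work:

```python
import numpy as np
import tensorflow as tf

# Stand-in for the collected samples: random 32x32 RGB chips, 4 land use classes
rng = np.random.default_rng(0)
images = rng.random((200, 32, 32, 3)).astype("float32")
labels = rng.integers(0, 4, size=200)

# Divide the samples into two almost equal parts: training and evaluation
split = len(images) // 2
x_train, y_train = images[:split], labels[:split]
x_eval, y_eval = images[split:], labels[split:]

# Select a (toy) base CNN model
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Run training, then evaluation
model.fit(x_train, y_train, epochs=1, verbose=0)
loss, accuracy = model.evaluate(x_eval, y_eval, verbose=0)

# Use the model: predict class probabilities for new images
probs = model.predict(x_eval[:5], verbose=0)
```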
Image sample collection
●
There needs to be a lot of sample imagery (a few thousand samples for
each land use class)
●
The images need to be pre-processed (contrast adjustment, central
alignment)
●
Images for each class need to be divided into two groups (one for
training the model, another for model evaluation)
●
A TensorFlow record file needs to be created for each of these two
groups. The file can be generated automatically from a CSV file that
has one row per object in the images and looks like:
filename,width,height,class,xmin,ymin,xmax,ymax
airport1.jpg,200,300,airport,1,1,200,300
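Such a labels CSV can be produced with plain Python. The filenames and box coordinates below are made-up placeholders for whatever the sample extraction step recorded:

```python
import csv

# One row per labelled object in the sample images (placeholder values)
rows = [
    {"filename": "airport1.jpg", "width": 200, "height": 300,
     "class": "airport", "xmin": 1, "ymin": 1, "xmax": 200, "ymax": 300},
    {"filename": "airport2.jpg", "width": 256, "height": 256,
     "class": "airport", "xmin": 10, "ymin": 20, "xmax": 250, "ymax": 240},
]

fieldnames = ["filename", "width", "height", "class",
              "xmin", "ymin", "xmax", "ymax"]

# Write the CSV that generate_tfrecord.py-style scripts consume
with open("train_labels.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
```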
Image sample collection
●
The TF record file is then easily generated from such a CSV file
using the generate_tfrecord.py script provided by the TensorFlow
developers:
●
python generate_tfrecord.py --csv_input=data/train_labels.csv
--output_path=data/train.record --image_dir=images
●
The images of the different land use classes can be downloaded
from Sentinel-2, and the class of each image can be derived from
the OpenStreetMap database
●
So, to begin with, OSM data for Eastern Africa (Kenya,
Tanzania, South Sudan, Rwanda, Burundi and Uganda) was
downloaded and loaded into a PostgreSQL database
Image sample collection
●
After that, a script was run for each land use class. Taking
airports as an example, the script would:
➔
initialize the database connection
➔
query all airports
➔
for each airport, find the Sentinel-2 image with the least cloud
coverage (using the sentinelsat API)
➔
download and save the sample image for that date from the Mundi WCS service
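A sketch of that per-class script. The database and download parts need real credentials and services, so they are only outlined in comments with hypothetical names; the least-cloud-cover selection is shown as a pure helper whose input mimics the product dictionary that sentinelsat's query returns:

```python
def least_cloudy(products):
    """Pick the product with the lowest cloud cover percentage.

    `products` mimics a sentinelsat query result: a dict mapping
    product id -> metadata dict with a 'cloudcoverpercentage' key.
    """
    return min(products.items(),
               key=lambda item: item[1]["cloudcoverpercentage"])

# Outline of the full loop (hypothetical table/column names; requires
# psycopg2, sentinelsat and a WCS client plus real credentials, so it
# is not executed here):
#
#   conn = psycopg2.connect(dbname="osm_africa", ...)  # 1. init DB connection
#   cur = conn.cursor()
#   cur.execute("SELECT osm_id, ST_AsText(way) FROM planet_osm_polygon "
#               "WHERE aeroway = 'aerodrome'")          # 2. query all airports
#   for osm_id, footprint in cur:
#       products = api.query(footprint, platformname="Sentinel-2")
#       product_id, meta = least_cloudy(products)       # 3. least cloud cover
#       # 4. download the chip for that date from the Mundi WCS service

# Demo with dummy query results
products = {
    "a": {"cloudcoverpercentage": 35.2},
    "b": {"cloudcoverpercentage": 4.1},
}
best_id, best = least_cloudy(products)
```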
Model training and evaluation
●
After the samples were collected, an object detection model
based on the MobileNet base model was trained:
python train.py --logtostderr --train_dir=training/
--pipeline_config_path=training/ssd_mobilenet_v1_coco.config
●
Afterwards, evaluation was run on the other part of the images:
python eval.py --logtostderr
--pipeline_config_path=training/ssd_mobilenet_v1_coco.config
--checkpoint_dir=training/ --eval_dir=eval/
Using the model
●
Eventually the model was exported to an inference graph:
python export_inference_graph.py --input_type image_tensor
--pipeline_config_path training/ssd_mobilenet_v1_coco.config
--trained_checkpoint_prefix training/model.ckpt-1454
--output_directory airport_inference_graph2
●
Afterwards, the models (specifically those for residential land use
and airports) were tested on real cloudless Sentinel-2 images of
different areas
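Running the exported graph yields parallel arrays of boxes, classes and confidence scores per image. A small helper, independent of TensorFlow and shown with dummy values, keeps only detections above a score threshold; both the helper and the 0.5 threshold are illustrative assumptions, not part of the original workflow:

```python
def filter_detections(boxes, scores, classes, min_score=0.5):
    """Keep only detections whose confidence reaches min_score.

    boxes: list of (ymin, xmin, ymax, xmax) in normalised coordinates;
    scores/classes: parallel lists, as an object detection graph outputs.
    """
    return [(b, s, c) for b, s, c in zip(boxes, scores, classes)
            if s >= min_score]

# Dummy output of one detection run: two confident hits, one weak one
boxes = [(0.1, 0.1, 0.4, 0.4), (0.5, 0.5, 0.9, 0.9), (0.0, 0.0, 0.1, 0.1)]
scores = [0.92, 0.60, 0.12]
classes = [1, 1, 2]

detections = filter_detections(boxes, scores, classes)
```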
Real results
In our case, unfortunately, the model for airport identification
wasn't able to find any airports, and the model that tried to
identify residential land use marked most of the image as
residential.
The likely reason, in the case of airports in Eastern Africa, is that
there were not enough samples to train the model: the whole of
Eastern Africa has just 70 airports in the OpenStreetMap
database, and training a model requires many more samples.
Real results
In the case of residential land use, most samples were small
and looked like this:
Real results
In the case of residential land use there were several problems.
Most residential areas in the OpenStreetMap database are very
small features that, at Sentinel-2's 10-metre spatial resolution,
become just a couple of brownish-grey pixels. On top of that, we
forgot to improve the contrast of the samples before training the
model: almost all Sentinel-2 images without improved contrast
look very greyish and brownish, so most of the images we tested
this model on were evaluated as residential land use.
Things that need to be done
●
Improve the sample data collection process:
➔
Collect more samples
➔
Set minimum and maximum sample sizes
➔
Enhance the contrast of the samples
●
Select the base CNN model and configure it with more care:
➔
This time the MobileNet CNN was selected since it is lightweight
and can be trained even on a relatively simple notebook; possibly,
other models could be more useful for this purpose
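The contrast enhancement called for above can be a simple percentile stretch applied to each band before the samples are written out. A minimal sketch with NumPy; the 2–98 % cut-offs are a common default, not a value from the original workflow:

```python
import numpy as np

def stretch_contrast(band, low_pct=2, high_pct=98):
    """Linearly stretch one image band to 0..1 between two percentiles.

    Values below/above the cut-offs are clipped, which brightens the
    uniformly greyish/brownish Sentinel-2 chips described earlier.
    """
    low, high = np.percentile(band, [low_pct, high_pct])
    if high <= low:  # flat band: nothing to stretch
        return np.zeros_like(band, dtype=float)
    stretched = (band.astype(float) - low) / (high - low)
    return np.clip(stretched, 0.0, 1.0)

# Example: a dull band with values squeezed into a narrow range
band = np.array([[100, 110], [120, 130]], dtype=np.uint16)
out = stretch_contrast(band)
```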