Extended Robust Feature-based Visual Navigation
System for UAVs
Anastasiia Volkova
School of Aerospace, Mechanical and Mechatronic
Engineering, The University of Sydney
NSW 2006, Australia
Email: avol9544@uni.sydney.edu.au
Peter W. Gibbens
School of Aerospace, Mechanical and Mechatronic
Engineering The University of Sydney
NSW 2006, Australia
Email: peter.gibbens@sydney.edu.au
Abstract—The current study seeks to advance in the direction
of building a robust feature-based passive visual navigation
system which utilises unique characteristics of the features present
in an image to obtain the position of the aircraft. This is done
by extracting, prioritising and associating features such as road
centrelines and road intersections, and using other natural landmarks
as context. It is shown that extending the system with complementary
feature matching blocks and choosing the features
prevalent in the scene improves the performance of the navigation
system. The work also discusses the constraints, cost and
impact of the introduced optical flow component. The algorithm
is evaluated on simulated datasets constructed from satellite imagery.
I. INTRODUCTION
Modern aerial robotic systems allow the conduct of surveil-
lance and exploration operations in remote and dangerous
environments without endangering human lives. To satisfy
the requirements of the missions in question, autonomous
systems need to be adaptive and robust as well as operate
reliably in GPS-denied environments. Navigational systems
designed nowadays to solve this problem benefit from the
introduction of cheap cameras and an exponential increase in
computation capabilities of the onboard computers. To provide
a position estimate of the vehicle without using satellite data,
the navigation system must include a passive operating mode
which relies only on the information from the onboard sensors.
Within the framework of visual navigation the questions of
feature extraction and feature association should be solved
in one of a number of possible ways. In our work we give
preference to semantic features, which are comprehensible
to a human operator and can therefore be associated with
information from other sources, such as feature maps. Extraction
of semantic features from an image not only allows the
computation of frame-to-frame relative motion but is also used
to obtain an absolute estimate of the position by registering
the features detected in the image against a database. The paper
presents an extended feature-based algorithm for visual navigation,
studies the cost of introducing an optical flow module and
evaluates its performance on three simulated datasets.
The rest of the paper is structured as follows. Section II looks
at the recent work done in the area of visual navigation and
feature detection. In Section III an overview of the proposed
visual navigation system is given. Sections IV and V discuss
the components of the system in detail. Experimental results
and conclusions can be found in Sections VI and VII of the
paper.
II. RELATED WORK
Vision systems using the optical flow technique have become
a popular solution for indoor simultaneous localisation and
mapping (SLAM) of micro aerial vehicles (MAVs). With
computational capabilities limited by the take-off weight, the
incorporation of visual systems for MAVs is mostly limited to
the use of optical flow [1] and SIFT [2] or SURF [3] corner-based
feature matching techniques. The combination of these
techniques has a number of successful examples [4]–[9], with
a benchmark in real-time monocular SLAM-based navigation
for autonomous micro helicopters recently achieved by Weiss
[10]. The listed methods, however, do not aim to extract or
analyse a specific object in the scene; rather, they derive
relative motion from the changes detected in the scene.
Analysis of aerial imagery has found application in
photogrammetry and remote sensing as well as in engineering.
Road, building and natural feature extraction are commonly
formulated problems in the field. Approaches designed
to solve the problem of road extraction include, but are not
limited to, mathematical morphology [11]–[13], snakes [14],
[15], support vector machines [16]–[20], neural networks [21]–
[23] and tensor voting frameworks [24], [25]. A detailed
comparison of the requirements and applications of these methods
to the visual navigation problem is given in [26]. A road
detection strategy based on neural network classifiers or
support vector machines can result in high completeness of
the classified data, but the cost of such classification can be
extensive supervised training. Less computationally expensive
classifiers, e.g. intensity-based ones combined with hierarchical
filtering such as the one used in this work, require only 3-4
labelled images per class in order to initialise real-time
operation. Section III provides an overview of the structure of
the proposed navigation system with details on the choice and
usage of features and the classifier.
III. OVERVIEW OF THE VISUAL NAVIGATION SYSTEM
To obtain a reliable position estimate of the UAV, the navigation
system presented in this paper utilises data acquired by onboard
inertial sensors and onboard cameras.

Fig. 1: Block diagram of the optical flow-aided feature-based
visual navigation system.

The main modules of the navigation system responsible for these
operations are a Kalman filter, an image processing block and an
optical flow block, as shown in Fig. 1. The system operates as follows.
The measurements of vehicle accelerations provided by inertial
sensors are integrated and serve as the basis for the position estimate
calculated by the Kalman filter. The integration operation
amplifies inherent errors and noise found in measurements,
which results in fast drift of the position estimate. The inertial
drift should be constrained using other sources of information
not subject to error accumulation, such as aerial imagery taken
with onboard cameras. Aerial imagery is in turn a passive
source of positional information which is available under
most flight conditions. Imagery is analysed both temporally
and structurally. The shift in sequential images describes how
much the vehicle has moved between the frames and therefore
can be used to determine the velocity of the aircraft. Features
extracted from the imagery, e.g. roads, lakes, dams, forests
etc, are used to obtain the absolute localisation information by
associating the features present in the image to the features
existing in a database (Fig. 2). This process is described in
more detail in the next section. This paper seeks to evaluate
the computation cost and performance improvement associated
with the introduction of two additional capabilities: optical
flow [20] and road spline matching (done within the image
processing block), which are shown in yellow in Fig. 1.
Sections IV and V present the details of the additional capabilities.
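To make the interaction of the blocks in Fig. 1 concrete, the sketch below is a minimal Python/NumPy illustration of the fusion scheme, not the authors' implementation: a position-velocity Kalman filter is propagated with integrated accelerations and corrected by the two kinds of visual measurements discussed in this paper, absolute position fixes from feature association and velocity updates from optical flow. The state layout, frame rate and noise values are assumptions made for the sketch.

```python
import numpy as np

# Minimal 2D position-velocity Kalman filter illustrating the fusion scheme of Fig. 1.
# State x = [px, py, vx, vy]; frame rate, noise levels and measurement models are
# illustrative assumptions, not values from the paper.

dt = 0.04                                                   # assumed frame interval, s
F = np.block([[np.eye(2), dt * np.eye(2)],
              [np.zeros((2, 2)), np.eye(2)]])               # position-velocity kinematics
B = np.vstack([0.5 * dt**2 * np.eye(2), dt * np.eye(2)])    # accelerations as control input
Q = 1e-2 * np.eye(4)                                        # process noise (inertial errors)

H_pos = np.hstack([np.eye(2), np.zeros((2, 2))])            # feature association -> position fix
H_vel = np.hstack([np.zeros((2, 2)), np.eye(2)])            # optical flow -> velocity fix
R_pos, R_vel = 25.0 * np.eye(2), 0.5 * np.eye(2)            # assumed measurement noise

def predict(x, P, accel):
    """Propagate the state with integrated accelerometer measurements."""
    x = F @ x + B @ accel
    P = F @ P @ F.T + Q
    return x, P

def update(x, P, z, H, R):
    """Generic Kalman update used for both position and velocity measurements."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# One cycle: always predict; correct with whichever visual measurements are available.
x, P = np.zeros(4), np.eye(4)
x, P = predict(x, P, accel=np.array([0.1, -0.05]))
x, P = update(x, P, z=np.array([1.9, -0.1]), H=H_vel, R=R_vel)    # optical flow velocity
x, P = update(x, P, z=np.array([120.0, 45.0]), H=H_pos, R=R_pos)  # road feature position fix
```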
IV. IMAGE PROCESSING
A. Framework and Available Capabilities
This work is a part of a larger research programme on
feature-based navigation. The original work by Dumble [27]
introduced road-based visual navigation which utilised asso-
ciation between road junctions to calculate the position of the
aerial vehicle. Further study by Williams [28] evaluated the
performance of a feature-based terrain-aided visual navigation
system on a rural case scenario, taking the edges between the
waterbodies and land as features for association with the map.
Our latest study presented in [29] showed that the robustness
of the intersection-based matching can be improved using
hierarchical filtering to ensure no false positives are fused in
the filter. The results of intersection feature matching are also
used for comparison in this paper.

Fig. 2: Feature classification within the image processing block.
The current study builds upon the above-mentioned works
and utilises additional techniques, such as optical flow, aiming
to advance towards a hybrid system that achieves the best
performance in a variety of scenarios. The hybrid system
in question (Fig. 2) allocates individual threads to each of
the feature types. Some features, such as roads and water
edges, are used for feature association; others (greenery and
urban structures) serve as context to improve the general
classification of the image.
B. Image Classification
Image processing as a technique seeks to transform regions
found in the image into meaningful objects which can be
used to characterise the environment. This can be achieved
by various techniques: from mathematical morphology [13] to
artificial neural networks [23]. Real-time systems favour clas-
sifiers that yield the fastest frame rate. Our system is intended
to be used onboard and must therefore be compliant with the
real-time requirement. Thus, a relatively simple but reliable
intensity-based classifier in conjunction with connected com-
ponent analysis (CCA) [30] was chosen. The operations that
the classifier performs on an aerial image to output a road
graph are described in detail in previous work [29]. Here, for
completeness of the system presentation, a brief overview of
the classification routine is included.
A classifier learns intensity and texture information specific
to each of the feature classes present in the image, e.g.
road, greenery, water, from a training image set (3-4 images per
class). An example of the training image for the road and
water classes is given in Fig. 3. The output of the classifier is
then subjected to CCA, which, based on shape and area
information about the segments in the image, assigns them to
a particular class. After each class is completed, its structure
is revisited and optimised. Since the road class is the focus of
the current work, only road feature detection and association
are considered herein. During optimisation of the road class,
road candidate regions are transformed into a road graph
that represents the road with a collection of road segments
modelled by splines and road junctions connecting them (see
Fig.3, bottom).
Fig. 3: Top: the original image; centre: a training image with
the ground-truth mask superimposed; bottom: typical output of
the algorithm with the water region and road graph.
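As an illustration only (the actual classifier, its texture features and its thresholds are those described in [29]), the sketch below shows the general idea in Python with NumPy/SciPy: learn simple per-class intensity statistics from a handful of labelled images, classify pixels against those statistics, and use connected component analysis to discard segments whose area is implausible. The class model, thresholds and the synthetic example image are assumptions made for the sketch.

```python
import numpy as np
from scipy import ndimage

# Simplified illustration of intensity-based classification followed by connected
# component analysis (CCA); not the paper's classifier [29].

def fit_class_stats(training_images, training_masks):
    """Learn a mean/std intensity model for one class from a few labelled images."""
    samples = np.concatenate([img[mask > 0].ravel()
                              for img, mask in zip(training_images, training_masks)])
    return samples.mean(), samples.std()

def classify_and_filter(gray, class_stats, n_sigma=2.0, min_area=400):
    """Flag pixels whose intensity fits the class model, then keep only connected
    components whose area exceeds min_area (the CCA step). Thresholds are assumptions."""
    mean, std = class_stats
    candidate = np.abs(gray.astype(float) - mean) < n_sigma * std
    labels, n = ndimage.label(candidate)
    areas = ndimage.sum(candidate, labels, index=np.arange(1, n + 1))
    keep_labels = np.flatnonzero(areas >= min_area) + 1
    return np.isin(labels, keep_labels)

# Synthetic example: a bright "road" stripe on a darker, noisy background.
rng = np.random.default_rng(0)
img = rng.normal(60, 5, (200, 200))
img[:, 95:105] = rng.normal(200, 5, (200, 10))
road_stats = fit_class_stats([img], [img > 150])
road_mask = classify_and_filter(img, road_stats)
```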
C. Road feature modelling and association
Feature modelling and association presented here extend
and complement the corresponding section of the preliminary
study [29]. Previously, only intersection association was
used as a source of feature-based position updates. The current
work extends the feature matching block by introducing a
complementary spline-matching capability. Feature association
within the block is divided into parallel threads processing
road intersections and road centrelines separately. An adaptive
approach based on the presence and quality of the features
in the scene is used to choose which thread to prioritise.
A complementary capability based on road centrelines is of
particular interest during long flights in rural environments
or flight missions over mixed terrain where road junction
information is absent or insufficient. In rural environments,
if no positional update is derived from the visual
data, the divergence of the position estimate is proportional to
the flight time (assuming a velocity measurement such as
optical flow is available). Road stretches without junctions,
however, present another possibility to constrain the drift of
the inertial system. Depending on the shape of the feature
(straight line, s-bend, curve), it can be used to constrain the
drift in one or two directions in the image plane, much as a
road junction (effectively, a point) does.
Let R represent a database entry corresponding to the road
centreline:

R = \{ [X, Y, Z]^{g}_{s},\; w_{r},\; n_{i},\; i_{i} \}   (1)

The parameters associated with it are 1) the locations of the s spline
nodes [X, Y, Z]^{g}_{s} representing the road centreline with respect to
the geodetic reference frame, 2) the average width of the road
region w_{r} calculated perpendicular to the road centreline, 3)
the number of intersections n_{i} the road segment connects to, and 4)
the indices i_{i} of these intersections.
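A possible container for such an entry is sketched below; the field names and example values are hypothetical, since the paper does not specify the database format.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class RoadCentrelineEntry:
    """Hypothetical container mirroring the database entry R of equation (1)."""
    nodes_geodetic: np.ndarray   # s x 3 array of spline node locations [X, Y, Z], geodetic frame
    road_width: float            # w_r, average width perpendicular to the centreline, m
    num_intersections: int       # n_i, number of intersections the segment connects to
    intersection_ids: list       # i_i, indices of those intersections in the junction database

entry = RoadCentrelineEntry(
    nodes_geodetic=np.array([[151.20, -33.87, 40.0],
                             [151.21, -33.87, 41.0],
                             [151.22, -33.88, 42.0]]),
    road_width=7.5,
    num_intersections=2,
    intersection_ids=[104, 311],
)
```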
1) Road centreline matching: The spline nodes (shown
with dots in Fig. 4) accurately capture the location of the
road centreline and information about the shape of the road
component.
Spline matching relies both on the locations of individual
nodes of the image splines and on the characteristic
shape of the spline. Curvature-based spline matching uses
an algebraic curvature description of the spline to search for
correspondences in the database. Each ith node of the kth
image spline is compared to the closest jth node of the mth
database spline, and the cumulative shift between them, δPk−m,
is determined as
\delta P_{k-m} = \sum \left( [X_i, Y_i, Z_i]^{g} - [X_j, Y_j, Z_j]^{g}_{DB} \right) < \delta T_{thr}   (2)

\delta K_{i-j} = \sum \left( K_i - [K_j]_{DB} \right) < \delta K_{thr}   (3)
where δTthr is a position error threshold which is tied to the
scale of the image.

Fig. 4: A frame showing detected splines (in yellow) and
corresponding database splines (in green). Dots schematically
show the locations of the spline nodes between which curvature
is compared to establish a match.

The matches that satisfy (2) are passed for
curvature analysis. The accumulated difference in curvature
between the ith and jth nodes of the matched splines, δKi−j,
should satisfy the matching criterion δKi−j < δKthr, where
δKthr is a curvature error threshold. Segments of the image
splines which satisfy both (2) and (3) are considered to be the
correct matches to their database counterparts.
Once the correspondences are found, the difference in position
of the matched splines δPk−m is transformed into an update
for the Kalman filter.
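A brute-force sketch of criteria (2) and (3) is given below. It is illustrative only: the nearest-node association strategy, the use of absolute curvature differences and the threshold values are assumptions, not the paper's implementation.

```python
import numpy as np

def match_spline(image_nodes, image_curv, db_nodes, db_curv,
                 pos_thr=15.0, curv_thr=0.05):
    """Associate every image-spline node with its closest database node, then accept
    the spline match only if both the cumulative position shift (eq. 2) and the
    cumulative curvature difference (eq. 3) stay below their thresholds."""
    d = np.linalg.norm(image_nodes[:, None, :] - db_nodes[None, :, :], axis=2)
    nearest = d.argmin(axis=1)                       # closest database node per image node

    delta_p = (image_nodes - db_nodes[nearest]).sum(axis=0)    # cumulative shift, eq. (2)
    delta_k = np.abs(image_curv - db_curv[nearest]).sum()      # curvature mismatch, eq. (3)

    accepted = np.linalg.norm(delta_p) < pos_thr and delta_k < curv_thr
    # When accepted, delta_p becomes the position update passed to the Kalman filter.
    return accepted, delta_p, delta_k

# Tiny example: an image spline offset by 0.5 m from its database counterpart.
nodes = np.array([[0.0, 0.0, 0.0], [10.0, 1.0, 0.0], [20.0, 2.5, 0.0]])
curv = np.array([0.0, 0.01, 0.02])
ok, dp, dk = match_spline(nodes + 0.5, curv, nodes, curv)
```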
2) The use of road splines shape information: The spline
matching procedure, depending on the shape of the road, can
constrain the drift of dead reckoning in one or both directions.
This leads to prioritisation of splines based on the richness of their
shape information. Spline sections capturing greater curvature
of a spline have higher priority in the matching sequence
since they constrain the drift in both directions in a 2D plane
compared to relatively straight sections of the splines which
only limit the drift of the vehicle in a direction perpendicular
to the spline. The reasoning is as follows. Take two relatively
straight (or slightly curved) spline sections: if one attempts to find
a correspondence by sliding one along the other, there is no fixed
point at which the match is uniquely determined, unlike for highly
curved splines or s-bends (Fig. 5). Curves and s-bends are
especially rich in shape information, and can be identified by
searching for considerable change in curvature, the change in
the direction of the spline or the change in the sign of the
curvature. Once identified, these spline sections, like the seg-
ment marked in Fig. 4, are isolated as separate contenders and
prioritised during the curvature matching procedure. Spline
detection accounts for a peculiarity of spline features extracted
from aerial video: each subsequent piece of information about
the feature “enters” the frame at the top and is added to the
feature representation available from the previous frame. It is
worth noting that the repeatability of the
spline extraction across the frames allows reliable operation
of both SLAM and database matching threads.
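The prioritisation described above can be illustrated with a simple score that rewards overall curvature and curvature sign changes; the score and its example values are hypothetical and do not reproduce the paper's actual selection logic.

```python
import numpy as np

def shape_information_score(curvature):
    """Score a spline by its shape information: large total curvature and sign changes
    (curves, s-bends) constrain drift in two directions, straight sections in only one."""
    curvature = np.asarray(curvature, dtype=float)
    total_turn = np.abs(curvature).sum()                                # overall "bendiness"
    sign_changes = np.count_nonzero(np.diff(np.sign(curvature)) != 0)  # s-bend indicator
    return total_turn + sign_changes

straight = shape_information_score(np.full(50, 1e-4))
s_bend = shape_information_score(np.concatenate([np.full(25, 0.02), np.full(25, -0.02)]))
assert s_bend > straight   # the s-bend would be matched first
```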
V. INTEGRATING OPTICAL FLOW
A. Motivation and benefits
Feature-based navigation operates under the assumption that
the position estimate of the aircraft is reasonably accurate
since the match for the features appearing in the image
taken onboard is sought within some area from the vehicle’s
estimated position.

Fig. 5: Effect of spline shape information on the quality of the
match. In the first case (left), there is a unique transformation
to align the candidate spline in grey with the true spline in
green. In the second case (right), multiple shifts are possible,
all of which result in overlap of the candidate and true splines.

When the system resorts to dead reckoning
and information about the position is obtained purely from
integration of the readings of the onboard sensors, the position
estimate is corrupted by errors accumulated during the flight,
thus making the true position unrecoverable. To avoid such a
case and improve fusion of the position estimate, a relatively
simple frame-to-frame motion tracking can be used to provide
a velocity update for the filter.
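One simple way to turn such frame-to-frame tracking into a velocity update is sketched below. It assumes a nadir-looking camera over locally flat terrain, so that one pixel corresponds to altitude/focal-length metres on the ground; the geometry and the numbers are assumptions for the sketch, not the paper's camera configuration.

```python
import numpy as np

def flow_to_velocity(mean_flow_px, altitude_m, focal_px, dt):
    """Ground velocity in image axes from a mean pixel shift between frames,
    assuming a nadir-looking camera over flat terrain (illustrative only)."""
    metres_per_pixel = altitude_m / focal_px
    return np.asarray(mean_flow_px) * metres_per_pixel / dt

# e.g. a mean shift of (12, -3) px/frame at 400 m altitude, 1000 px focal length, 25 Hz
v = flow_to_velocity([12.0, -3.0], altitude_m=400.0, focal_px=1000.0, dt=0.04)
```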
The implementation of optical flow [31] used in this work
combines two popular approaches: a numerical scheme
implementing a coarse-to-fine warping strategy [32], which provides
better convergence to the global minimum, and a combined
local-global approach [33], which uses a simple confidence
measure to assess the flow reliability and allows
for sparsity of the dense optical flow. Although optical flow
has become a standard algorithm for estimating motion in
general scenes, as far as a navigation system is concerned, the
velocity update provided by optical flow needs to be reliable.
To simultaneously control the quality of the information being
fused and avoid fusing erroneous measurements, constraints,
masking and filtering of outliers are used.
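The exact variational implementation of [31]-[33] is not reproduced here; as a readily available stand-in, the sketch below uses OpenCV's pyramidal (coarse-to-fine) Farneback method to obtain a dense flow field. The parameter values are common defaults, not tuned ones, and the synthetic frames are an assumption for the example.

```python
import cv2
import numpy as np

def dense_flow(prev_gray, next_gray):
    """Dense optical flow between two consecutive greyscale frames: an H x W x 2 array
    of per-pixel displacements in pixels, estimated coarse-to-fine over an image pyramid."""
    return cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

# Example with synthetic frames: a shifted copy produces a roughly uniform flow field.
prev = np.random.randint(0, 255, (240, 320), dtype=np.uint8)
nxt = np.roll(prev, shift=4, axis=1)          # simulate 4 px of horizontal motion
flow = dense_flow(prev, nxt)
u, v = flow[..., 0], flow[..., 1]
```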
B. Constraints on the optical flow measurements
There are two constraints imposed on the nature of
the measurements obtained from the estimation of the motion
between two consecutive frames. The first is based on the smoothness
constraint defined by Horn and Schunck [1]: neighbouring
points on objects have similar velocities and the
velocity field varies smoothly across the frame. Combining this
assumption with the convexity of the lens, which causes objects
closer to the centre of the scene to appear to move faster than
objects on the periphery of the image, a typical structure of the
flow can be predicted. In other words, the flow values describing
the motion of the objects in the scene should be homogeneous,
without high fluctuations, with values higher towards the centre
of the image and smaller towards its edges. Practically, this is
applied by taking the gradient of the flow Φ at pixel (x, y),
comparing its value to the mean ∇Φm,p over a p × p neighbourhood
of the pixel, and thresholding the difference at δgrad to mask out
the areas where the flow values differ considerably from the
expected ones.
\nabla\Phi_{(x,y)} = \left( \frac{du}{dx} \right)^{2} + \left( \frac{du}{dy} \right)^{2} + \left( \frac{dv}{dx} \right)^{2} + \left( \frac{dv}{dy} \right)^{2}   (4)

\nabla\Phi_{(x,y)} - \nabla\Phi_{m,p} \leq \delta_{grad}   (5)
where u and v are the optical flow velocity components and x and y
denote the image dimensions. Corrupted values typically result
from associations between uniform areas or areas with a
repetitive pattern (water, glare on the water, etc.). The second
assumption can be described as temporal consistency:

u_{OF} \approx u_{KF} \pm e_{u}, \quad v_{OF} \approx v_{KF} \pm e_{v}   (6)

where the average flow values uOF and vOF, describing the motion
between consecutive frames, should be consistent with each other
and should lie within set margins eu and ev of the values uKF and
vKF predicted by the filter. In practice, onboard velocity sensors
mounted on the aircraft can be used as a benchmark.
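A sketch of how constraints (4)-(6) could be applied to a dense flow field is shown below; the neighbourhood size, thresholds and margins are illustrative assumptions rather than the values used in the paper.

```python
import numpy as np
from scipy import ndimage

def mask_flow(u, v, p=15, grad_thr=0.5):
    """Smoothness constraint of eqs (4)-(5): compute the flow gradient magnitude per
    pixel, compare it to the local mean over a p x p neighbourhood and mask pixels
    where the difference exceeds grad_thr (values illustrative)."""
    du_dy, du_dx = np.gradient(u)
    dv_dy, dv_dx = np.gradient(v)
    grad = du_dx**2 + du_dy**2 + dv_dx**2 + dv_dy**2          # eq. (4)
    local_mean = ndimage.uniform_filter(grad, size=p)          # mean over p x p window
    return (grad - local_mean) <= grad_thr                     # eq. (5), True = keep

def temporally_consistent(u, v, mask, u_kf, v_kf, e_u=1.0, e_v=1.0):
    """Temporal consistency check of eq. (6): the mean masked flow must lie within the
    set margins of the filter-predicted (or sensor-measured) velocities."""
    u_of, v_of = u[mask].mean(), v[mask].mean()
    return abs(u_of - u_kf) <= e_u and abs(v_of - v_kf) <= e_v

# Only flow measurements that pass both checks would be converted into a velocity update.
```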
The comparison of the algorithm performance with and
without integrated optical flow and spline matching modules
is given in Section VI.
VI. EXPERIMENTAL RESULTS
A. Test set up
Three datasets were constructed using Google Earth im-
agery taken in different years (2007, 2009, 2014) to evaluate
the performance and demonstrate the robustness of the algo-
rithm against changes in lighting as well as structural changes
in the scene. Each dataset has 235 images which simulate
a round-trip flight over urban and suburban areas in NSW,
Australia. The resolution of the images is 1024×768 px. The
algorithm was implemented and tested in Matlab on a
3.6 GHz Intel i7 4-core processor. The system performance in the
following modes is compared: using junction matching only,
using junction matching in conjunction with optical flow and,
finally, using both junction and spline matching along with
an optical flow update.
B. Algorithm performance using spline matching and optical
flow
There are a number of frames in the video sequence which
contain no road intersections, as shown in Fig. 6. Every frame
which contains intersections also contains splines, from which
the intersections are derived. The spline information, however,
becomes useful for constraining the drift of the system only in the
absence of intersections in the frame. The algorithm with
additional spline matching and optical flow visual odometry
was run on the same three sequences to assess the improvement
in the performance of the system. The change in the position
error for the three corresponding videos using three different modes of
the algorithm is shown in Fig. 7. Each vertical line symbolises the
registration and fusion of one spline feature. A feature is fused
over several consecutive frames; to keep the graph readable, it is
shown with a line only in the first frame in which it appears.

Fig. 6: Distribution of road intersection updates in the datasets.

TABLE I: Comparison of accumulated error in [m] for different
operation modes. "Junctions only" mode is used as the baseline.

Operation Mode Used         | X Error     | Y Error     | Z Error¹
Dataset 2007
Junctions only              | 570         | 527         | 1194
Junctions+OptFlow           | 481 (84.4%) | 451 (85.6%) | 1132 (94.9%)
Junctions+OptFlow+Splines   | 470 (82.3%) | 450 (85.3%) | 1210 (101.4%)
Dataset 2009
Junctions only              | 1116        | 2393        | 1224
Junctions+OptFlow           | 543 (48.8%) | 661 (27.6%) | 1467 (119.9%)
Junctions+OptFlow+Splines   | 485 (43.6%) | 359 (14.9%) | 1441 (117.7%)
Dataset 2014
Junctions only              | 703         | 832         | 936
Junctions+OptFlow           | 478 (68.0%) | 362 (43.5%) | 1449 (154.8%)
Junctions+OptFlow+Splines   | 453 (64.4%) | 361 (43.4%) | 1382 (147.7%)
The test results show that incorporation of the optical flow
allowed the drift to change from a quadratic to a linear function
of time, resulting in slower accumulation of the error
during the phases when the feature navigation threads do not
provide a positional update. The spline matching procedure
generally exhibits behaviour similar to the junction matching
thread. As shown in Fig. 7, it provides positional updates with
a varying degree of uncertainty in different directions. Overall
error estimation shows that incorporation of the optical flow
and of the spline matching as a parallel thread providing the
position update results in a bounded positional error in the X and
Y directions (see Table I).
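This behaviour is consistent with the standard dead-reckoning error model (a first-order illustration, ignoring gyroscope and scale-factor errors): an uncompensated accelerometer bias b, when double-integrated, produces a position error that grows quadratically with time, whereas bounding the velocity error to e_v with optical flow updates limits the growth to a linear function of time:

e_{p}(t) \approx \tfrac{1}{2}\, b\, t^{2} \quad \text{(dead reckoning)}, \qquad e_{p}(t) \lesssim e_{v}\, t \quad \text{(velocity-aided)}.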
The breakdown of execution time, showing the share of each
of the functions in the overall processing time of a typical
1024×768 px frame from an aerial sequence, is presented
in Table II. Optical flow in its current implementation takes
almost 2 seconds which is nearly twice as much as the
rest of the image processing block. The benefit gained from
incorporation of optical flow, however, makes it worthwhile
to include the module in the system. It should be noted that
the algorithms could be applied only when necessary; for
instance, the optical flow algorithm could be invoked only
when there are no intersection or spline features present in
the image, thus limiting drift when necessary without
slowing down the process when it is not needed. In the
case when no features are present in the image, optical
flow is the only component providing velocity updates,
thus ensuring the robustness of the system. The remainder of the
image processing block takes about 1 second with the feature
extraction process consuming only a small proportion of the
time (<0.1%) compared to feature detection and association.
The road detection algorithm constitutes about 61% of the
remaining processing time due to recursive optimisation of
the road graph.
¹ Z estimates are currently not corrected for ground altitude. In a future
implementation of the algorithm, digital elevation maps will be incorporated
to provide Z error correction.
Fig. 7: Comparison of position errors in metres for different
operation modes of the algorithm on the datasets from 2007, 2009
and 2014.
TABLE II: Breakdown of the algorithm execution time for a
typical 1024×768 px video frame.

Algorithm module | Time, [s] | Ratio
OptFlow          | 1.8765    | 62.9%
Extraction       | 0.0820    | 2.7%
Detection        | 0.6791    | 22.7%
Association      | 0.3444    | 11.5%
Other            | 0.0067    | 0.2%
Total            | 2.9820    | 100%
VII. CONCLUSION
Feature extraction and matching techniques using combined
complementary sources of information and minimal feature
description vectors constitute an effective basis for a feature-based
visual navigation system. Maximisation of feature uniqueness
is key to avoiding false positives. Broader coverage
of the feature-based algorithm was achieved through a multi-pronged
architecture which benefits from various characteristics
depending on the kind of features present in
the registered aerial image. An extensive study of the spline-matching
capability in rural environments with few intersection features
available is part of ongoing work on the algorithm and is
expected to demonstrate the benefits of the additional spline-based
road matching stage.
Testing of the algorithm performance in the presence of
varying illumination conditions has shown its reliability.
Comparison of the different modes of operation and their
performance has shown that incorporation of information
from complementary modules is an effective way to constrain
the drift of the navigation system. Use of direct updates
of the velocity estimates derived from optical flow changes the
drift rate of the inertial navigation error to be linear rather than
the quadratic form typical of double integration of acceleration
information from the inertial sensors. From the presented
graphs and the comparison table it is evident that the adopted
fusion strategy successfully reduces the accumulated
position error.
The contributions of this paper are the extension of the
multi-pronged approach to road centrelines and the introduction
of the optical flow module, with the corresponding constraints,
to the framework. Development of multi-temporal operation
modes for the feature extraction and matching modules,
to improve the reliability of feature association, is part of
future work on the algorithm. The focus of this future work is on
further development of the image processing components and
extensive testing of the algorithm performance in rural
environments.
ACKNOWLEDGMENT
The authors would like to thank David G. Williams and Dr.
Steve J. Dumble for their assistance and contribution to the
current work.
REFERENCES
[1] Horn, B.K. and Schunck, B.G., 1981, November. Determining optical
flow. In 1981 Technical symposium east (pp. 319-331). International
Society for Optics and Photonics.
[2] Lowe, D.G., 1999. Object recognition from local scale-invariant features.
In The proceedings of the seventh IEEE international conference on
Computer vision (Vol. 2, pp. 1150-1157). IEEE.
[3] Bay, H., Ess, A., Tuytelaars, T. and Van Gool, L., 2008. Speeded-up
robust features (SURF). In Computer vision and image understanding,
110(3), pp.346-359.
[4] Ruffier, F. and Franceschini, N., 2005. Optic flow regulation: the key
to aircraft automatic guidance. In Robotics and Autonomous Systems,
50(4), pp.177-194.
[5] Wang, J., Garratt, M., Lambert, A., Wang, J.J., Han, S. and Sinclair, D.,
2008. Integration of GPS/INS/vision sensors to navigate unmanned aerial
vehicles. In the International Archives of the Photogrammetry, Remote
Sensing and Spatial Information Sciences, 37, pp.963-970.
[6] Kendoul, F., Fantoni, I. and Nonami, K., 2009. Optic flow-based vision
system for autonomous 3D localization and control of small aerial
vehicles. In Robotics and Autonomous Systems, 57(6-7), pp.591-602.
[7] Lippiello, V., Loianno, G. and Siciliano, B., 2011, December. MAV indoor
navigation based on a closed-form solution for absolute scale velocity
estimation using optical flow and inertial data.. 50th IEEE Conference
on Decision and Control and European Control Conference (CDC-ECC)
(pp. 3566-3571). IEEE.
[8] Fraundorfer, F., Heng, L., Honegger, D., Lee, G.H., Meier, L., Tanskanen,
P. and Pollefeys, M., 2012, October. Vision-based autonomous mapping
and exploration using a quadrotor MAV.. IEEE/RSJ International Confer-
ence on Intelligent Robots and Systems (IROS), (pp. 4557-4564). IEEE.
[9] Weiss, S., Achtelik, M.W., Lynen, S., Chli, M. and Siegwart, R.,
2012, May. Real-time onboard visual-inertial state estimation and self-
calibration of mavs in unknown environments. IEEE International Confer-
ence on Robotics and Automation (ICRA), 2012 (pp. 957 - 964). IEEE.
[10] Weiss, S., Scaramuzza, D. and Siegwart, R, 2011. Monocular-
SLAM - based navigation for autonomous micro helicopters in GPS-
denied environments. In Journal of Field Robotics, 28: 854-874. doi:
10.1002/rob.20412
[11] Valero, S., Chanussot, J., Benediktsson, J.A., Talbot, H. and Waske, B.,
2010. Advanced directional mathematical morphology for the detection of
the road network in very high resolution remote sensing images. Pattern
Recognition Letters, 31(10), pp.1120-1127.
[12] Kiani, A., et al. (2014). Design and implementation of an expert
interpreter system for intelligent acquisition of spatial data from aerial
or remotely sensed images. Measurement 47: 676-685.
[13] Mohammadzadeh, A., Tavakoli, A., Zoej, V. and Mohammad, J., 2006.
Road extraction based on fuzzy logic and mathematical morphology from
pan-sharpened ikonos images. The photogrammetric record, 21(113),
pp.44-60.
[14] Gruen, A. and Li, H., 1997. Semi-automatic linear feature extraction
by dynamic programming and LSB-snakes. Photogrammetric engineering
and remote sensing, 63(8), pp.985-994.
[15] Laptev, I., Mayer, H., Lindeberg, T., Eckstein, W., Steger, C. and
Baumgartner, A., 2000. Automatic extraction of roads from aerial images
based on scale space and snakes. In Machine Vision and Applications,
12(1), pp.23-31.
[16] Song, M. and Civco, D., 2004. Road extraction using SVM and image
segmentation. In Photogrammetric Engineering and Remote Sensing,
70(12), pp.1365-1371.
[17] Huang, X. and Zhang, L., 2009. Road centreline extraction from high-
resolution imagery based on multiscale structural features and support
vector machines. In International Journal of Remote Sensing, Vol. 30(Iss.
8).
[18] Zhou, S., Gong, J., Xiong, G., Chen, H. and Iagnemma, K., 2010, June.
Road detection using support vector machine based on online learning
and evaluation. In Intelligent Vehicles Symposium (IV), 2010 IEEE (pp.
256-261). IEEE.
[19] Guo, X., Denman, S., Fookes, C., Mejias, L. and Sridharan, S., 2014,
November. Automatic UAV Forced Landing Site Detection Using Ma-
chine Learning. Digital Image Computing: Techniques and Applications
(DICTA), 2014 International Conference on (pp. 1-7). IEEE.
[20] Lin Y, Shen M, Liu B, Ye Q. Remote Sensing Classification Method of
Wetland Based on AN Improved SVM. In ISPRS-International Archives of
the Photogrammetry, Remote Sensing and Spatial Information Sciences.
2013 Aug;1(1):179-83.
[21] Barsi, A., Heipke, C. and Willrich, F., 2002. Junction extraction by
artificial neural network system-JEANS. International Archives of Pho-
togrammetry Remote Sensing and Spatial Information Sciences, 34(3/B),
pp.18-21.
[22] Mokhtarzade, M., Zoej, M.V. and Ebadi, H., 2008. Automatic road
extraction from high resolution satellite images using neural networks,
texture analysis, fuzzy clustering and genetic algorithms. The Interna-
tional Archives of the Photogrammetry, Remote Sensing and Spatial
Information Sciences, 37, pp.549-556.
[23] Mnih, V., 2013. Machine Learning for Aerial Image Labelling. Doctoral
Thesis. University of Toronto.
[24] Zheng, S., Jian, L., Shi, W.-Z. and Zhu, G.-X., 2006. Road Central
Contour Extraction from High Resolution Satellite Image using Tensor
Voting Framework. International Conference on Machine Learning and
Cybernetics, pp. 3248-3253.
[25] Poullis, C. and You, S., 2010. Delineation and geometric modelling
of road networks. In ISPRS Journal of Photogrammetry and Remote
Sensing, 65(2), pp. 165-181.
[26] Volkova, A. and P. W. Gibbens (2015). A Comparative Study of Road
Extraction Techniques from Aerial Imagery — A Navigational Perspec-
tive. Asia-Pacific International Symposium on Aerospace Technology
(APISAT), Cairns, Australia, 25 - 27 November 2015.
[27] Dumble, S.J. and Gibbens, P.W., 2015. Airborne Vision-Aided Navi-
gation Using Road Intersection Features. In Journal of Intelligent and
Robotic Systems, 78(2), pp.185-204.
[28] Williams, D.G. and Gibbens, P.W., Google Earth Imagery Assisted B-
Spline SLAM for Monocular Computer Vision Airborne Navigation.
[29] Volkova, A. and Gibbens, P.W., 2016. Satellite Imagery Assisted Road-
based Visual Navigation System. XXIII Congress of the International
Society for Photogrammetry and Remote Sensing (ISPRS), Prague, Czech
Republic, July 12-19 2016.
[30] Gonzalez, R.C., Woods R.E, 2011. Digital Image Processing. Publisher
Pearson Education. Edition 3. ISBN 0133002322, 9780133002324 (976
pages)
[31] C. Liu. Beyond Pixels: Exploring New Representations and Applications
for Motion Analysis. Doctoral Thesis. Massachusetts Institute of Technol-
ogy. May 2009.
[32] T. Brox, A. Bruhn, N. Papenberg, and J. Weickert, 2004. High accuracy
optical flow estimation based on a theory for warping. European
Conference on Computer Vision (ECCV), pages 25-36.
[33] A. Bruhn, J. Weickert and C. Schnorr, 2005. Lucas/Kanade meets
Horn/Schunck: combining local and global optical flow methods. In
International Journal of Computer Vision (IJCV), 61(3):211-231.

More Related Content

What's hot

Forecasting electricity usage in industrial applications with gpu acceleratio...
Forecasting electricity usage in industrial applications with gpu acceleratio...Forecasting electricity usage in industrial applications with gpu acceleratio...
Forecasting electricity usage in industrial applications with gpu acceleratio...
Conference Papers
 
Density of route frequency for enforcement
Density of route frequency for enforcement Density of route frequency for enforcement
Density of route frequency for enforcement
Conference Papers
 
Engineering survey in the construction industry
Engineering survey in the construction industryEngineering survey in the construction industry
Engineering survey in the construction industry
noah kertich
 
Vision Based Traffic Surveillance System
Vision Based Traffic Surveillance SystemVision Based Traffic Surveillance System
Vision Based Traffic Surveillance System
maheshwaraneee
 

What's hot (19)

Optimizing GIS based Systems
Optimizing GIS based SystemsOptimizing GIS based Systems
Optimizing GIS based Systems
 
Driving Behavior for ADAS and Autonomous Driving VIII
Driving Behavior for ADAS and Autonomous Driving VIIIDriving Behavior for ADAS and Autonomous Driving VIII
Driving Behavior for ADAS and Autonomous Driving VIII
 
Driving behaviors for adas and autonomous driving XIII
Driving behaviors for adas and autonomous driving XIIIDriving behaviors for adas and autonomous driving XIII
Driving behaviors for adas and autonomous driving XIII
 
Airport Runway Detection Based On ANN Algorithm
Airport Runway Detection Based On ANN AlgorithmAirport Runway Detection Based On ANN Algorithm
Airport Runway Detection Based On ANN Algorithm
 
Scenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous DrivingScenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous Driving
 
Forecasting electricity usage in industrial applications with gpu acceleratio...
Forecasting electricity usage in industrial applications with gpu acceleratio...Forecasting electricity usage in industrial applications with gpu acceleratio...
Forecasting electricity usage in industrial applications with gpu acceleratio...
 
IRJET- Road Recognition from Remote Sensing Imagery using Machine Learning
IRJET- Road Recognition from Remote Sensing Imagery using Machine LearningIRJET- Road Recognition from Remote Sensing Imagery using Machine Learning
IRJET- Road Recognition from Remote Sensing Imagery using Machine Learning
 
Density of route frequency for enforcement
Density of route frequency for enforcement Density of route frequency for enforcement
Density of route frequency for enforcement
 
Automated Motion Detection from space in sea surveillance
Automated Motion Detection from space in sea surveillanceAutomated Motion Detection from space in sea surveillance
Automated Motion Detection from space in sea surveillance
 
A METHOD OF TARGET TRACKING AND PREDICTION BASED ON GEOMAGNETIC SENSOR TECHNO...
A METHOD OF TARGET TRACKING AND PREDICTION BASED ON GEOMAGNETIC SENSOR TECHNO...A METHOD OF TARGET TRACKING AND PREDICTION BASED ON GEOMAGNETIC SENSOR TECHNO...
A METHOD OF TARGET TRACKING AND PREDICTION BASED ON GEOMAGNETIC SENSOR TECHNO...
 
Parcel-based Damage Detection using SAR Data
Parcel-based Damage Detection using SAR DataParcel-based Damage Detection using SAR Data
Parcel-based Damage Detection using SAR Data
 
Visual tracking using particle swarm optimization
Visual tracking using particle swarm optimizationVisual tracking using particle swarm optimization
Visual tracking using particle swarm optimization
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
Engineering survey in the construction industry
Engineering survey in the construction industryEngineering survey in the construction industry
Engineering survey in the construction industry
 
Performance assessment of antenna array for an unmanned air vehicle
Performance assessment of antenna array for an unmanned  air vehicle Performance assessment of antenna array for an unmanned  air vehicle
Performance assessment of antenna array for an unmanned air vehicle
 
Arc Geographic Information System (GIS) Digital Elevation Models (DEM).
Arc Geographic Information System (GIS) Digital Elevation Models (DEM).Arc Geographic Information System (GIS) Digital Elevation Models (DEM).
Arc Geographic Information System (GIS) Digital Elevation Models (DEM).
 
Session 38 Xiaoliang Ma
Session 38 Xiaoliang MaSession 38 Xiaoliang Ma
Session 38 Xiaoliang Ma
 
Vision Based Traffic Surveillance System
Vision Based Traffic Surveillance SystemVision Based Traffic Surveillance System
Vision Based Traffic Surveillance System
 
[Paper research] GOSELO: for Robot navigation using Reactive neural networks
[Paper research] GOSELO: for Robot navigation using Reactive neural networks[Paper research] GOSELO: for Robot navigation using Reactive neural networks
[Paper research] GOSELO: for Robot navigation using Reactive neural networks
 

Viewers also liked (15)

Acordaorb
AcordaorbAcordaorb
Acordaorb
 
Sindaco_FUTURI1
Sindaco_FUTURI1Sindaco_FUTURI1
Sindaco_FUTURI1
 
Sindaco-FUTURI4
Sindaco-FUTURI4Sindaco-FUTURI4
Sindaco-FUTURI4
 
Painel 15
Painel 15Painel 15
Painel 15
 
Arte Renascentista
Arte RenascentistaArte Renascentista
Arte Renascentista
 
recom.
recom.recom.
recom.
 
Planilha De Horarios Dia 26 07 2009
Planilha De Horarios Dia 26 07 2009Planilha De Horarios Dia 26 07 2009
Planilha De Horarios Dia 26 07 2009
 
Nelson Mandela
Nelson MandelaNelson Mandela
Nelson Mandela
 
10.05.02.07_índice evidências
10.05.02.07_índice evidências10.05.02.07_índice evidências
10.05.02.07_índice evidências
 
Segunda Presentacion
Segunda PresentacionSegunda Presentacion
Segunda Presentacion
 
Vidas Secas
Vidas SecasVidas Secas
Vidas Secas
 
Cruces 1 jun 2015
Cruces 1 jun 2015Cruces 1 jun 2015
Cruces 1 jun 2015
 
Sindaco_FUTURI2
Sindaco_FUTURI2Sindaco_FUTURI2
Sindaco_FUTURI2
 
Unitat 3. la hidrosfera
Unitat 3. la hidrosferaUnitat 3. la hidrosfera
Unitat 3. la hidrosfera
 
Prevenção ao Câncer de Mama
Prevenção ao Câncer de MamaPrevenção ao Câncer de Mama
Prevenção ao Câncer de Mama
 

Similar to Volkova_DICTA_robust_feature_based_visual_navigation

Feature selection for sky image classification based on self adaptive ant col...
Feature selection for sky image classification based on self adaptive ant col...Feature selection for sky image classification based on self adaptive ant col...
Feature selection for sky image classification based on self adaptive ant col...
IJECEIAES
 
Autonomous_UAV_Path_Planning_Using_Modified_PSO_for_UAV-Assisted_Wireless_Net...
Autonomous_UAV_Path_Planning_Using_Modified_PSO_for_UAV-Assisted_Wireless_Net...Autonomous_UAV_Path_Planning_Using_Modified_PSO_for_UAV-Assisted_Wireless_Net...
Autonomous_UAV_Path_Planning_Using_Modified_PSO_for_UAV-Assisted_Wireless_Net...
MaryumHina1
 
Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...
Conference Papers
 
Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...
Journal Papers
 
ShawnQuinnCSS565FinalResearchProject
ShawnQuinnCSS565FinalResearchProjectShawnQuinnCSS565FinalResearchProject
ShawnQuinnCSS565FinalResearchProject
Shawn Quinn
 
Real Time Object Identification for Intelligent Video Surveillance Applications
Real Time Object Identification for Intelligent Video Surveillance ApplicationsReal Time Object Identification for Intelligent Video Surveillance Applications
Real Time Object Identification for Intelligent Video Surveillance Applications
Editor IJCATR
 
Inter vehicular communication using packet network theory
Inter vehicular communication using packet network theoryInter vehicular communication using packet network theory
Inter vehicular communication using packet network theory
eSAT Publishing House
 

Similar to Volkova_DICTA_robust_feature_based_visual_navigation (20)

Feature selection for sky image classification based on self adaptive ant col...
Feature selection for sky image classification based on self adaptive ant col...Feature selection for sky image classification based on self adaptive ant col...
Feature selection for sky image classification based on self adaptive ant col...
 
Autonomous_UAV_Path_Planning_Using_Modified_PSO_for_UAV-Assisted_Wireless_Net...
Autonomous_UAV_Path_Planning_Using_Modified_PSO_for_UAV-Assisted_Wireless_Net...Autonomous_UAV_Path_Planning_Using_Modified_PSO_for_UAV-Assisted_Wireless_Net...
Autonomous_UAV_Path_Planning_Using_Modified_PSO_for_UAV-Assisted_Wireless_Net...
 
Identification and classification of moving vehicles on road
Identification and classification of moving vehicles on roadIdentification and classification of moving vehicles on road
Identification and classification of moving vehicles on road
 
Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...
 
Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...
 
LANE CHANGE DETECTION AND TRACKING FOR A SAFE-LANE APPROACH IN REAL TIME VISI...
LANE CHANGE DETECTION AND TRACKING FOR A SAFE-LANE APPROACH IN REAL TIME VISI...LANE CHANGE DETECTION AND TRACKING FOR A SAFE-LANE APPROACH IN REAL TIME VISI...
LANE CHANGE DETECTION AND TRACKING FOR A SAFE-LANE APPROACH IN REAL TIME VISI...
 
Application of Vision based Techniques for Position Estimation
Application of Vision based Techniques for Position EstimationApplication of Vision based Techniques for Position Estimation
Application of Vision based Techniques for Position Estimation
 
ShawnQuinnCSS565FinalResearchProject
ShawnQuinnCSS565FinalResearchProjectShawnQuinnCSS565FinalResearchProject
ShawnQuinnCSS565FinalResearchProject
 
Speed Determination of Moving Vehicles using Lucas- Kanade Algorithm
Speed Determination of Moving Vehicles using Lucas- Kanade AlgorithmSpeed Determination of Moving Vehicles using Lucas- Kanade Algorithm
Speed Determination of Moving Vehicles using Lucas- Kanade Algorithm
 
Driving behaviors for adas and autonomous driving XII
Driving behaviors for adas and autonomous driving XIIDriving behaviors for adas and autonomous driving XII
Driving behaviors for adas and autonomous driving XII
 
COMPARATIVE STUDY ON VEHICLE DETECTION TECHNIQUES IN AERIAL SURVEILLANCE
COMPARATIVE STUDY ON VEHICLE DETECTION TECHNIQUES IN AERIAL SURVEILLANCECOMPARATIVE STUDY ON VEHICLE DETECTION TECHNIQUES IN AERIAL SURVEILLANCE
COMPARATIVE STUDY ON VEHICLE DETECTION TECHNIQUES IN AERIAL SURVEILLANCE
 
An Analysis of Various Deep Learning Algorithms for Image Processing
An Analysis of Various Deep Learning Algorithms for Image ProcessingAn Analysis of Various Deep Learning Algorithms for Image Processing
An Analysis of Various Deep Learning Algorithms for Image Processing
 
Fusion_Conference_final.pdf
Fusion_Conference_final.pdfFusion_Conference_final.pdf
Fusion_Conference_final.pdf
 
proceedings of PSG NCIICT
proceedings of PSG NCIICTproceedings of PSG NCIICT
proceedings of PSG NCIICT
 
Real Time Object Identification for Intelligent Video Surveillance Applications
Real Time Object Identification for Intelligent Video Surveillance ApplicationsReal Time Object Identification for Intelligent Video Surveillance Applications
Real Time Object Identification for Intelligent Video Surveillance Applications
 
Leader follower formation control of ground vehicles using camshift based gui...
Leader follower formation control of ground vehicles using camshift based gui...Leader follower formation control of ground vehicles using camshift based gui...
Leader follower formation control of ground vehicles using camshift based gui...
 
IRJET- Monitoring and Analysing Real Time Traffic Images and Information Via ...
IRJET- Monitoring and Analysing Real Time Traffic Images and Information Via ...IRJET- Monitoring and Analysing Real Time Traffic Images and Information Via ...
IRJET- Monitoring and Analysing Real Time Traffic Images and Information Via ...
 
Vehicle detection and tracking techniques a concise review
Vehicle detection and tracking techniques  a concise reviewVehicle detection and tracking techniques  a concise review
Vehicle detection and tracking techniques a concise review
 
F124144
F124144F124144
F124144
 
Inter vehicular communication using packet network theory
Inter vehicular communication using packet network theoryInter vehicular communication using packet network theory
Inter vehicular communication using packet network theory
 

Volkova_DICTA_robust_feature_based_visual_navigation

  • 1. Extended Robust Feature-based Visual Navigation System for UAVs Anastasiia Volkova School of Aerospace, Mechanical and Mechatronic Engineering, The University of Sydney NSW 2006, Australia Email: avol9544@uni.sydney.edu.au Peter W. Gibbens School of Aerospace, Mechanical and Mechatronic Engineering The University of Sydney NSW 2006, Australia Email: peter.gibbens@sydney.edu.au Abstract—The current study seeks to advance in the direction of building a robust feature-based passive visual navigation system utilising unique characteristics of the features present in an image to obtain position of the aircraft. This is done by extracting, prioritising and associating such features as road centrelines, road intersections and using other natural landmarks as a context. It is shown that extending the system with com- plimentary feature matching blocks and choice of the features prevalent in the scene improves the performance of the navigation system. The work also discusses the constraints, cost and the impact of the introduced optical flow component. The algorithm is evaluated on a simulated dataset containing satellite imagery. I. INTRODUCTION Modern aerial robotic systems allow the conduct of surveil- lance and exploration operations in remote and dangerous environments without endangering human lives. To satisfy the requirements of the missions in question, autonomous systems need to be adaptive and robust as well as operate reliably in GPS-denied environments. Navigational systems designed nowadays to solve this problem benefit from the introduction of cheap cameras and an exponential increase in computation capabilities of the onboard computers. To provide a position estimate of the vehicle without using satellite data, the navigation system must include a passive operating mode which relies only on the information from the onboard sensors. Within the framework of visual navigation the questions of feature extraction and feature association should be solved in one of a number of possible ways. In our work we give preference to the semantic features, which are comprehensible for a human operator, and therefore can be associated with information from other sources, such as feature maps. Extrac- tion of semantic features from an image not only allows the computation of frame-to-frame relative motion but also is used to obtain an absolute estimate of the position by registering the features detected in the image within a database. The paper present extended feature-based algorithm for visual navigation, studies cost of introduction of an optical flow module and evaluates its performance on three simulated datasets. The rest of the paper is structured as follows. Section II looks at the recent work done in the area of visual navigation and feature detection. In section III an overview of the proposed visual navigation system is given. Sections IV and V discuss the components of the system in detail. Experimental results and conclusions can be found in sections VI and VII of the paper. II. RELATED WORK Vision systems using Optic flow technique have become a popular solution for indoor simultaneous localisation and mapping (SLAM) of micro aerial vehicles (MAVs). With computational capabilities limited by the take off weight, the incorporation of visual systems for MAVs is mostly limited to the use of optical flow [1] and SIFT [2] or SURF [3] corner- based feature matching techniques. 
The combination of these techniques has a number of successful examples [4]- [9] with a benchmark in real-time monocular SLAM-based navigation for autonomous micro helicopters recently achieved by Weiss [10]. Listed methods, however, do not have an objective to extract a specific object from the scene or analyse it rather derive relative motion from the changes detected in the scene. Analysis of aerial imagery found its application both in photogrammetry and remote sensing as well as in engineering. Road, building and natural feature extraction are the com- monly formulated problems in the field. Approaches designed to solve the problem of road extraction include and are not limited to mathematical morphology [11]–[13], snakes [14], [15], support vector machines [16]–[20], neural networks [21]– [23] and tensor voting frameworks [24], [25]. A detailed comparison of requirements and applications of the methods to the visual navigation problem is given in [26]. A road detection strategy based on the neural network classifiers or support vector machines can result in high completeness of the classified data but the cost of such classification can be extensive supervised training. Less computationally expensive classifiers, e.g. intensity-based, combined with hierarchical filtering such as the one used in this work require only 3- 4 labelled images for each class in order to initialise real-time operation. Section III provides an overview of the structure of the proposed navigation system with details on the choice and usage of features and the classifier. III. OVERVIEW OF THE VISUAL NAVIGATION SYSTEM To obtain a reliable position estimate of the UAV, navigation system presented in the paper utilises data acquired by onboard inertial sensors and onboard cameras. The main modules of
  • 2. Fig. 1: Block diagram of the optical flow-aided feature-based visual navigation system. the navigation system responsible for this operations are a Kalman filter, an image processing block and an optical flow block as shown on Fig.1. The system operates as follows. The measurements of vehicle accelerations provided by inertial sensors are integrated and serve as a basis for position estimate calculated by the Kalman filter. The integration operation amplifies inherent errors and noise found in measurements, which results in fast drift of the position estimate. The inertial drift should be constrained using other sources of information not subject to error accumulation, such as aerial imagery taken with onboard cameras. Aerial imagery in turn is a passive source of the positional information which is available under most flight conditions. Imagery is analysed both temporally and structurally. The shift in sequential images describes how much the vehicle has moved between the frames and therefore can be used to determine the velocity of the aircraft. Features extracted from the imagery, e.g. roads, lakes, dams, forests etc, are used to obtain the absolute localisation information by associating the features present in the image to the features existing in a database (Fig. 2). This process is described in more details in the next section. This paper seeks to evaluate the computation cost and performance improvement associated with the introduction of two additional capabilities: optical flow [20] and road spline matching (done within the image processing block), which are shown in yellow in Fig.1). Sec- tions IV and V present the details of the additional capabilities. IV. IMAGE PROCESSING A. Framework and Available Capabilities This work is a part of a larger research programme on feature-based navigation. The original work by Dumble [27] introduced road-based visual navigation which utilised asso- ciation between road junctions to calculate position of the aerial vehicle. Further study by Williams [28] evaluated the performance of a feature-based terrain-aided visual navigation system on a rural case scenario, taking the edges between the waterbodies and land as features for association with the map. Our latest study presented in [29] showed that the robustness of the intersection-based matching can be improved using hierarchal filtering to ensure no false positives are fused in the filter. The results of intersection feature matching are also Fig. 2: Feature classification within the image processing block. used for comparison in this paper. The current study builds upon the above mentioned works and utilises additional techniques, such as optical flow, and aims to advance in the direction of a hybrid system to achieve best performance in a variety of scenarios. The hybrid system in question (Fig.2) allocates individual threads to each of the feature types. Some features, such as roads and water edges, are used for feature association, others (greenery and urban structures) serve as a context to improve the general classification of the image. B. Image Classification Image processing as a technique seeks to transform regions found in the image into meaningful objects which can be used to characterise the environment. This can be achieved by various techniques: from mathematical morphology [13] to artificial neural networks [23]. Real-time systems favour clas- sifiers that yield the fastest frame rate. 
Our system is intended to be used onboard and must therefore be compliant with the real-time requirement. Thus, a relatively simple but reliable intensity-based classifier in conjunction with connected com- ponent analysis (CCA) [30] was chosen. The operations that the classifier performs on an aerial image to output a road graph are described in detail in previous work [29]. Here, for completeness of the system presentation, a brief overview of the classification routine is included. A classifier learns intensity and texture information specific to each of the feature classes present in the image, e.g. road, greenery, water, from training image set (3-4 images for each class). An example of the training image for road and water classes is given in Fig.3 The output of the classifier is then subjected to CCA, which based on the shape and area information about the segments in the image assigns them to a particular class. After each class is completed, its structure is revisited and optimised. Since the road class is the focus of the current work, only road feature detection and association are considered herein. During optimisation of a road class, road candidate regions are transformed into a road graph that represents the road with a collection of road segments modelled by splines and road junctions connecting them (see Fig.3, bottom).
Fig. 3: Top: original image; centre: training image with the ground-truth mask superimposed; bottom: typical output of the algorithm with the water region and road graph.

C. Road feature modelling and association

The feature modelling and association presented here extend and complement the corresponding section of the preliminary study [29]. Previously, only intersection association was used as a source of feature-based position updates. The current work extends the feature matching block by introducing a complementary spline-matching capability. Feature association within the block is divided into parallel threads processing road intersections and road centrelines separately. An adaptive approach based on the presence and quality of the features in the scene is used to choose which thread to prioritise.

A complementary capability based on road centrelines is of particular interest during long flights in rural environments or flight missions over mixed terrain where road junction information is not present or is insufficient. In rural environments, if no positional update is derived from the visual data, the divergence of the position estimate is proportional to the flight time (assuming a velocity measurement such as optical flow is available). Road stretches without junctions, however, present another possibility to constrain the drift of the inertial system. Depending on the shape of the feature (straight line, s-bend, curve), it can be used to constrain the drift in one or two directions in the image plane, just like a road junction (effectively, a point).

Let R represent a database entry corresponding to a road centreline,

R = \{ [X, Y, Z]^g_s,\; w_r,\; n_i,\; i_i \}.    (1)

The parameters associated with it are 1) the locations of the s spline nodes [X, Y, Z]^g_s representing the road centreline with respect to the geodetic reference frame, 2) the average width of the road region w_r calculated perpendicular to the road centreline, 3) the number of intersections n_i the road segment connects to, and 4) the indices i_i of these intersections.

1) Road centreline matching: The spline nodes (shown with dots in Fig. 4) accurately capture the location of the road centreline and information about the shape of the road component. Spline matching relies both on the locations of individual nodes of the image splines and on the characteristic shape of the spline. Curvature-based spline matching uses an algebraic curvature description of the spline to search for correspondences in the database. Each ith node of the kth image spline is compared to the closest jth node of the mth database spline, and the cumulative shift between them, δP_{k−m}, is determined as

\delta P_{k-m} = \sum \left( [X_i, Y_i, Z_i]^g - [X_j, Y_j, Z_j]^g_{DB} \right) < \delta T_{thr},    (2)

\delta K_{i-j} = \sum \left( K_i - [K_j]_{DB} \right) < \delta K_{thr},    (3)

where δT_{thr} is a position error threshold tied to the scale of the image. The matches that satisfy (2) are passed on for curvature analysis: the accumulated difference in curvature between the ith and jth nodes of the matched splines, δK_{i−j}, must satisfy the matching criterion in (3), where δK_{thr} is a curvature error threshold.

Fig. 4: A frame showing detected splines (in yellow) and corresponding database splines (in green). Dots schematically mark the spline nodes between which curvature is compared to establish a match.
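The node-wise comparison in (2) and (3) can be summarised with the following sketch. It is a simplified illustration rather than the authors' Matlab implementation; the array names and the numerical thresholds (which in practice are tied to the image scale) are assumptions.

```python
# Illustrative sketch of the node-wise spline comparison in (2) and (3).
import numpy as np

def match_spline(nodes_img, curv_img, nodes_db, curv_db,
                 pos_thr=30.0, curv_thr=0.05):
    """nodes_*: (N, 3) geodetic node positions; curv_*: (N,) nodal curvatures."""
    shift = np.zeros(3)
    curv_err = 0.0
    for p, k in zip(nodes_img, curv_img):
        j = int(np.argmin(np.linalg.norm(nodes_db - p, axis=1)))  # closest DB node
        shift += p - nodes_db[j]           # cumulative positional shift, eq. (2)
        curv_err += abs(k - curv_db[j])    # cumulative curvature difference, eq. (3)
    if np.linalg.norm(shift) < pos_thr and curv_err < curv_thr:
        return shift / len(nodes_img)      # mean offset used as a position update
    return None                            # reject: no reliable correspondence
```

Spline sections with pronounced curvature (curves and s-bends) would be passed to such a comparison first since, as discussed below, they constrain the position in both directions of the image plane.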
Segments of the image splines which satisfy both (2) and (3) are considered to be correct matches to their database counterparts. Once the correspondences are found, the difference in position of the matched splines, δP_{k−m}, is transformed into an update for the Kalman filter.

2) The use of road spline shape information: The spline matching procedure, depending on the shape of the road, can constrain the drift of dead reckoning in one or both directions. This leads to a prioritisation of splines based on the richness of their shape information. Spline sections capturing greater curvature have higher priority in the matching sequence, since they constrain the drift in both directions of the 2D plane, whereas relatively straight sections only limit the drift of the vehicle in the direction perpendicular to the spline. The reasoning can be explained as follows. Take two relatively straight (or slightly curved) spline sections: if one attempts to find a correspondence between them by shifting one along the other, there is no fixed point at which the match can be uniquely determined, unlike with highly curved splines or s-bends (Fig. 5). Curves and s-bends are especially rich in shape information and can be identified by searching for a considerable change in curvature, a change in the direction of the spline or a change in the sign of the curvature. Once identified, these spline sections, like the segment marked in Fig. 4, are isolated as separate contenders and prioritised during the curvature matching procedure. Spline detection also accounts for a peculiarity of spline features extracted from aerial video: each new piece of information about a feature "enters" the frame at the top and is added to the feature representation available from the previous frame. It is worth noting that the repeatability of the spline extraction across frames allows reliable operation of both the SLAM and database matching threads.

Fig. 5: Effect of spline shape information on the quality of the match. In the first case (left), there is a unique transformation that aligns the candidate spline in grey with the true spline in green. In the second case (right), multiple shifts are possible, all of which result in overlap of the candidate and true splines.

V. INTEGRATING OPTICAL FLOW

A. Motivation and benefits

Feature-based navigation operates under the assumption that the position estimate of the aircraft is reasonably accurate, since the match for the features appearing in the image taken onboard is sought within some area around the vehicle's estimated position. When the system resorts to dead reckoning and information about the position is obtained purely from integration of the readings of the onboard sensors, the position estimate is corrupted by errors accumulated during the flight, eventually making the true position unrecoverable. To avoid such a case and improve fusion of the position estimate, relatively simple frame-to-frame motion tracking can be used to provide a velocity update for the filter.
The implementation of optical flow [31] used in this work combines two popular approaches: a numerical scheme implementing a coarse-to-fine warping strategy, which provides better convergence to the global minimum [32], and a combined local-global approach [33] which uses a simple confidence measure to assess flow reliability and allows for sparsity of the dense optical flow. Although optical flow has become a standard algorithm for motion estimation in general scenes, as far as a navigation system is concerned the velocity update provided by optical flow needs to be reliable. To control the quality of the information being fused and to avoid fusing erroneous measurements, constraints, masking and outlier filtering are used.

B. Constraints on the optical flow measurements

Two constraints are imposed on the measurements obtained from the estimation of the motion between two consecutive frames. The first is based on the smoothness constraint defined by Horn and Schunck [1]: neighbouring points on an object have similar velocities, and the velocity field varies smoothly across the frame. Combining this assumption with the convexity of the lens, which causes objects closer to the centre of the scene to appear to move faster than objects on the periphery of the image, a typical structure of the flow can be predicted. In other words, the flow values describing the motion of the objects in the scene should be homogeneous, without high fluctuations, with values higher towards the centre of the image and smaller towards its edges. In practice, this is applied by taking the gradient of the flow Φ at pixel (x, y), comparing it to the mean ∇Φ_{m,p} over a p × p neighbourhood of the pixel, and thresholding the difference at δ_{grad} to mask out areas where the flow values differ considerably from the expected ones:

\nabla\Phi_{(x,y)} = \left(\frac{du}{dx}\right)^2 + \left(\frac{du}{dy}\right)^2 + \left(\frac{dv}{dx}\right)^2 + \left(\frac{dv}{dy}\right)^2,    (4)

\nabla\Phi_{(x,y)} - \nabla\Phi_{m,p} \le \delta_{grad},    (5)

where u and v are the optical flow velocities and x and y denote the image dimensions. Corrupted values typically result from associations between uniform areas or areas with a repetitive pattern (water, glare on the water, etc.). The second assumption can be described as temporal consistency:

u_{OF} \approx u_{KF} \pm e_u, \qquad v_{OF} \approx v_{KF} \pm e_v,    (6)

where the average flow values u_{OF} and v_{OF}, describing the motion between consecutive frames, should be consistent with each other and lie within set margins e_u and e_v, respectively, of the values u_{KF} and v_{KF} predicted by the filter.
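As a rough illustration (not the paper's implementation), the two checks in (4)-(6) might be applied to a dense flow field as follows; the neighbourhood size and all thresholds are placeholder values.

```python
# Sketch of the flow-gradient mask (4)-(5) and temporal-consistency check (6).
import numpy as np
from scipy import ndimage

def constrain_flow(u, v, u_kf, v_kf, p=15, grad_thr=0.5, e_u=2.0, e_v=2.0):
    """u, v: dense flow fields [px/frame]; u_kf, v_kf: filter-predicted velocities."""
    du_dy, du_dx = np.gradient(u)
    dv_dy, dv_dx = np.gradient(v)
    grad = du_dx**2 + du_dy**2 + dv_dx**2 + dv_dy**2      # eq. (4)
    local_mean = ndimage.uniform_filter(grad, size=p)     # mean over p x p window
    mask = np.abs(grad - local_mean) <= grad_thr          # eq. (5): keep smooth flow
    if not mask.any():
        return None                                       # whole frame rejected
    u_mean, v_mean = u[mask].mean(), v[mask].mean()
    # eq. (6): fuse the measurement only if it agrees with the filter prediction.
    if abs(u_mean - u_kf) <= e_u and abs(v_mean - v_kf) <= e_v:
        return u_mean, v_mean
    return None
```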
In practice, onboard velocity sensors mounted on the aircraft can be used as a benchmark. A comparison of the algorithm's performance with and without the integrated optical flow and spline matching modules is given in Section VI.

VI. EXPERIMENTAL RESULTS

A. Test set-up

Three datasets were constructed using Google Earth imagery taken in different years (2007, 2009, 2014) to evaluate the performance and demonstrate the robustness of the algorithm against changes in lighting as well as structural changes in the scene. Each dataset has 235 images which simulate a round trip above urban and suburban areas in NSW, Australia. The resolution of the images is 1024x768 px. The algorithm was implemented and tested in Matlab on a 3.6 GHz Intel i7 4-core processor. The system performance in the following modes is compared: using junction matching only; using junction matching in conjunction with optical flow; and, finally, using both junction and spline matching along with an optical flow update.

B. Algorithm performance using spline matching and optical flow

There are a number of frames in the video sequence which contain no road intersections, as shown in Fig. 6. Every frame which has intersections in it also contains splines, from which the intersections are derived. The spline information, however, becomes useful for constraining the drift of the system only in the absence of intersections in the frame. The algorithm with the additional spline matching and optical flow visual odometry was run on the same three sequences to assess the improvement in the performance of the system. The change in the position error for the three corresponding videos using the three different modes of the algorithm is shown in Fig. 7. Each vertical line symbolises the registration and fusion of one spline feature. The feature is fused over several consecutive frames; it is shown with a line only in the first frame in which it appears, to keep the graph readable.

Fig. 6: Distribution of road intersection updates in the datasets.

The test results show that incorporation of the optical flow changed the drift from a quadratic to a linear function, resulting in slower accumulation of the error during the phases when the feature navigation threads do not provide a positional update. The spline matching procedure generally exhibits similar behaviour to the junction matching thread. As shown in Fig. 7, it provides positional updates with a varying degree of uncertainty in different directions. Overall error estimation shows that incorporation of the optical flow and the spline matching as a parallel thread providing the position update results in a bounded positional error in the X and Y directions (see Table I).

Operation Mode               X Error        Y Error        Z Error¹
Dataset 2007
  Junctions only             570            527            1194
  Junctions+OptFlow          481 (84.4%)    451 (85.6%)    1132 (94.9%)
  Junctions+OptFlow+Splines  470 (82.3%)    450 (85.3%)    1210 (101.4%)
Dataset 2009
  Junctions only             1116           2393           1224
  Junctions+OptFlow          543 (48.8%)    661 (27.6%)    1467 (119.9%)
  Junctions+OptFlow+Splines  485 (43.6%)    359 (14.9%)    1441 (117.7%)
Dataset 2014
  Junctions only             703            832            936
  Junctions+OptFlow          478 (68.0%)    362 (43.5%)    1449 (154.8%)
  Junctions+OptFlow+Splines  453 (64.4%)    361 (43.4%)    1382 (147.7%)

TABLE I: Comparison of accumulated error in [m] for different operation modes. "Junctions only" mode is used as the baseline.

¹ Z estimates are currently not corrected for ground altitude. In a future implementation of the algorithm, digital elevation maps will be incorporated to provide Z error correction.
The breakdown of execution time, showing the share of each function in the overall processing time of a typical 1024x768 px frame from an aerial sequence, is presented in Table II. Optical flow in its current implementation takes almost 2 seconds, which is nearly twice as long as the rest of the image processing block. The benefit gained from incorporating optical flow, however, makes it worthwhile to include the module in the system. It should be noted that the algorithms could be applied only when necessary; for instance, the optical flow algorithm could be invoked only when there are no intersection or spline features present in the image, thus limiting drift when necessary but without slowing down the process when it is not needed (a minimal scheduling sketch is given below). In the case where no features are present in the image, optical flow is the only component providing velocity updates, thus ensuring the robustness of the system. The remainder of the image processing block takes about 1 second, with the feature extraction process consuming only a small proportion of the time (<0.1%) compared to feature detection and association. The road detection algorithm constitutes about 61% of the remaining processing time due to the recursive optimisation of the road graph.
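Such conditional scheduling could take a form similar to the outline below. This is a hypothetical sketch, not the authors' code; all function and method names are placeholders.

```python
# Hypothetical per-frame scheduling: invoke the costly optical-flow module only
# when the frame yields no intersection or spline features to match.
def schedule_updates(intersections, splines, match_junctions, match_splines,
                     flow_velocity, kalman_filter):
    if intersections:                  # preferred: absolute fix from junctions
        kalman_filter.fuse_position(match_junctions(intersections))
    elif splines:                      # next: absolute fix from road centrelines
        kalman_filter.fuse_position(match_splines(splines))
    else:                              # last resort: velocity-only update
        kalman_filter.fuse_velocity(flow_velocity())
```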
Fig. 7: Comparison of position errors in metres for the different operation modes of the algorithm on the datasets from 2007, 2009 and 2014.

Algorithm module    Time, [s]    Ratio
OptFlow             1.8765       62.9%
Extraction          0.0820       2.7%
Detection           0.6791       22.7%
Association         0.3444       11.5%
Other               0.0067       0.2%
Total               2.9820       100%

TABLE II: Breakdown of the algorithm execution time for a typical 1024x768 px video frame.

VII. CONCLUSION

Feature extraction and matching techniques using combined complementary sources of information and minimal feature description vectors constitute an effective basis for a feature-based visual navigation system. Maximisation of feature uniqueness is key to avoiding false positives. Broader coverage of the feature-based algorithm was achieved through its multi-pronged architecture, which benefits from different feature characteristics depending on the kind of features present in the registered aerial image. An extensive study of the spline-matching capability in rural environments with few available intersection features is part of the ongoing work on the algorithm and is expected to demonstrate the benefits of the additional spline-based road matching stage.

Testing of the algorithm in the presence of varying illumination conditions has shown its reliability. Comparison of the different modes of operation and their performance has proven that incorporating information from complementary modules is an effective way to constrain the drift of the navigation system. Using direct updates of the velocity estimates derived from optical flow changes the drift of the inertial navigation error to a linear rather than the quadratic form typical of double integration of acceleration information from the inertial sensors. From the presented graphs and comparison table it is evident that the adopted fusion strategy successfully reduces the accumulated position error.

The contributions of this paper are the extension of the multi-pronged approach to road centrelines and the introduction of the optical flow module to the framework with the corresponding constraints. Development of multi-temporal operation modes for the feature extraction and matching modules to improve the reliability of feature association is part of future work on the algorithm. The focus of this future work is on further development of the image processing components and extensive testing of the algorithm's performance in rural environments.

ACKNOWLEDGMENT

The authors would like to thank David G. Williams and Dr. Steve J. Dumble for their assistance and contribution to the current work.

REFERENCES

[1] Horn, B.K. and Schunck, B.G., 1981, November. Determining optical flow. In 1981 Technical Symposium East (pp. 319-331). International Society for Optics and Photonics.
[2] Lowe, D.G., 1999. Object recognition from local scale-invariant features. In The Proceedings of the Seventh IEEE International Conference on Computer Vision (Vol. 2, pp. 1150-1157). IEEE.
[3] Bay, H., Ess, A., Tuytelaars, T. and Van Gool, L., 2008. Speeded-up robust features (SURF). In Computer Vision and Image Understanding, 110(3), pp.346-359.
[4] Ruffier, F. and Franceschini, N., 2005. Optic flow regulation: the key to aircraft automatic guidance. In Robotics and Autonomous Systems, 50(4), pp.177-194.
[5] Wang, J., Garratt, M., Lambert, A., Wang, J.J., Han, S. and Sinclair, D., 2008. Integration of GPS/INS/vision sensors to navigate unmanned aerial vehicles. In The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 37, pp.963-970.
[6] Kendoul, F., Fantoni, I. and Nonami, K., 2009. Optic flow-based vision system for autonomous 3D localization and control of small aerial vehicles. In Robotics and Autonomous Systems, 57(6-7), pp.591-602.
[7] Lippiello, V., Loianno, G. and Siciliano, B., 2011, December. MAV indoor navigation based on a closed-form solution for absolute scale velocity estimation using optical flow and inertial data. 50th IEEE Conference on Decision and Control and European Control Conference (CDC-ECC) (pp. 3566-3571). IEEE.
[8] Fraundorfer, F., Heng, L., Honegger, D., Lee, G.H., Meier, L., Tanskanen, P. and Pollefeys, M., 2012, October. Vision-based autonomous mapping and exploration using a quadrotor MAV. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 4557-4564). IEEE.
[9] Weiss, S., Achtelik, M.W., Lynen, S., Chli, M. and Siegwart, R., 2012, May. Real-time onboard visual-inertial state estimation and self-calibration of MAVs in unknown environments. IEEE International Conference on Robotics and Automation (ICRA), 2012 (pp. 957-964). IEEE.
[10] Weiss, S., Scaramuzza, D. and Siegwart, R., 2011. Monocular-SLAM-based navigation for autonomous micro helicopters in GPS-denied environments. In Journal of Field Robotics, 28, pp.854-874. doi: 10.1002/rob.20412
[11] Valero, S., Chanussot, J., Benediktsson, J.A., Talbot, H. and Waske, B., 2010. Advanced directional mathematical morphology for the detection of the road network in very high resolution remote sensing images. Pattern Recognition Letters, 31(10), pp.1120-1127.
[12] Kiani, A., et al., 2014. Design and implementation of an expert interpreter system for intelligent acquisition of spatial data from aerial or remotely sensed images. Measurement, 47, pp.676-685.
[13] Mohammadzadeh, A., Tavakoli, A., Zoej, V. and Mohammad, J., 2006. Road extraction based on fuzzy logic and mathematical morphology from pan-sharpened IKONOS images. The Photogrammetric Record, 21(113), pp.44-60.
[14] Gruen, A. and Li, H., 1997. Semi-automatic linear feature extraction by dynamic programming and LSB-snakes. Photogrammetric Engineering and Remote Sensing, 63(8), pp.985-994.
[15] Laptev, I., Mayer, H., Lindeberg, T., Eckstein, W., Steger, C. and Baumgartner, A., 2000. Automatic extraction of roads from aerial images based on scale space and snakes. In Machine Vision and Applications, 12(1), pp.23-31.
[16] Song, M. and Civco, D., 2004. Road extraction using SVM and image segmentation. In Photogrammetric Engineering and Remote Sensing, 70(12), pp.1365-1371.
[17] Huang, X. and Zhang, L., 2009. Road centreline extraction from high-resolution imagery based on multiscale structural features and support vector machines. In International Journal of Remote Sensing, 30(8).
[18] Zhou, S., Gong, J., Xiong, G., Chen, H. and Iagnemma, K., 2010, June. Road detection using support vector machine based on online learning and evaluation. In Intelligent Vehicles Symposium (IV), 2010 IEEE (pp. 256-261). IEEE.
[19] Guo, X., Denman, S., Fookes, C., Mejias, L. and Sridharan, S., 2014, November. Automatic UAV Forced Landing Site Detection Using Machine Learning. Digital Image Computing: Techniques and Applications (DICTA), 2014 International Conference on (pp. 1-7). IEEE.
[20] Lin, Y., Shen, M., Liu, B. and Ye, Q., 2013. Remote Sensing Classification Method of Wetland Based on an Improved SVM. In ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 1(1), pp.179-183.
[21] Barsi, A., Heipke, C. and Willrich, F., 2002. Junction extraction by artificial neural network system - JEANS. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 34(3/B), pp.18-21.
[22] Mokhtarzade, M., Zoej, M.V. and Ebadi, H., 2008. Automatic road extraction from high resolution satellite images using neural networks, texture analysis, fuzzy clustering and genetic algorithms. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 37, pp.549-556.
[23] Mnih, V., 2013. Machine Learning for Aerial Image Labelling. Doctoral thesis, University of Toronto.
[24] Zheng, S., Jian, L., Shi, W.-Z. and Zhu, G.-X., 2006. Road Central Contour Extraction from High Resolution Satellite Image using Tensor Voting Framework. International Conference on Machine Learning and Cybernetics, pp. 3248-3253.
[25] Poullis, C. and You, S., 2010. Delineation and geometric modelling of road networks. In ISPRS Journal of Photogrammetry and Remote Sensing, 65(2), pp.165-181.
[26] Volkova, A. and Gibbens, P.W., 2015. A Comparative Study of Road Extraction Techniques from Aerial Imagery — A Navigational Perspective. Asia-Pacific International Symposium on Aerospace Technology (APISAT), Cairns, Australia, 25-27 November 2015.
[27] Dumble, S.J. and Gibbens, P.W., 2015. Airborne Vision-Aided Navigation Using Road Intersection Features. In Journal of Intelligent and Robotic Systems, 78(2), pp.185-204.
[28] Williams, D.G. and Gibbens, P.W. Google Earth Imagery Assisted B-Spline SLAM for Monocular Computer Vision Airborne Navigation.
[29] Volkova, A. and Gibbens, P.W., 2016. Satellite Imagery Assisted Road-based Visual Navigation System. XXIII Congress of the International Society for Photogrammetry and Remote Sensing (ISPRS), Prague, Czech Republic, July 12-19 2016.
[30] Gonzalez, R.C. and Woods, R.E., 2011. Digital Image Processing, 3rd edition. Pearson Education. ISBN 0133002322, 9780133002324.
[31] Liu, C., 2009. Beyond Pixels: Exploring New Representations and Applications for Motion Analysis. Doctoral thesis, Massachusetts Institute of Technology.
[32] Brox, T., Bruhn, A., Papenberg, N. and Weickert, J., 2004. High accuracy optical flow estimation based on a theory for warping. European Conference on Computer Vision (ECCV), pp. 25-36.
[33] Bruhn, A., Weickert, J. and Schnorr, C., 2005. Lucas/Kanade meets Horn/Schunck: combining local and global optical flow methods. In International Journal of Computer Vision (IJCV), 61(3), pp.211-231.