Table of Contents
I. Detailed Proposal Information ... 2
A. Innovative Claims ... 2
B. Proposal Summary ... 3
C. Research Objectives ... 4
D. Technical Approach and Evaluation ... 6
E. Statement of Work ... 22
F. Schedule Graphic ... 25
G. Teaming and Tasking ... 26
H. Project Management and Interaction Plan ... 27
I. Deliveries Description ... 28
J. Technology Transition and Technology Transfer Targets Plans ... 29
K. Personnel and Qualifications ... 30
L. Facilities ... 35
M. Cost Summaries ... 36
N. Organizational Conflict of Interest Affirmations and Disclosure ... 38
O. Intellectual Property ... 38
P. Human Use ... 38
Q. Movie and Slides ... 38
I. Detailed Proposal Information
A. Innovative Claims
Mobile wireless networks operating indoors face a bewildering, shifting radio environment to
which they must adapt in order to provide robust communications. Heretofore, network choices
have been limited to radio and network parameters. Small robots open the possibility of network
nodes that proactively move and sense in the real environment to improve the radio environment.
Robot motion is energy-intensive and so must be used parsimoniously in order to conserve energy
for network communication. The fundamental questions become: 1) How do we operate in real
space to optimize in radio space? and 2) How can we exploit networking, mobility, and sensing
to maximize network longevity? Machine learning and distributed control techniques offer
powerful tools to connect the actions, sensing, and goal space.
Dynamically redeploying nodes to meet network needs offers a powerful new dimension to
optimize network performance. In the LANdroids scenario small autonomous robots operate in
unknown indoor environments in order to self configure a network that provides long-lived con-
nectivity between mobile end users and a gateway. Our approach has two main elements: 1)
combined learning and control strategies to manage network connectivity in new and complex
environments; 2) identifying the critical energy tradeoffs at both design and operational stages.
The proposal addresses four important innovations in controlled mobility for wireless networks:
Network Radio Optimization: The strength of indoor radio links can be increased through
relatively small movements. However, the LANdroid must relay and so signals must be opti-
mized as a network and not for individual links. Our approach to network radio optimization
identifies spatial features, such as hallway corners, that yield robust improvements to network
performance. These features will be identified through detailed radio surveys and analysis. Exist-
ing radio technologies (e.g. 802.11n MIMO and directional antennas) will be assessed for their
system benefits.
Model Free Multivariate Extremum Seeking: Control theoretic approaches applied direct-
ly in signal space can optimize signals as a network. Our model-free approaches are robust to en-
vironmental changes, simple to apply, and provide basic optimization behaviors.
Learning Spatial Features and Optimal Control Strategies: Learning provides a means
to identify useful spatial features in new environments. We apply our experience in reinforcement
learning to optimally combine feature-based and gradient-based optimization.
System Energy Management Optimization: Network longevity is a function of energy
spent on mobility, sensors, processing, transmission, and network overhead. We apply optimiza-
tion over all network resources to exploit these tradeoffs in a unified framework. System design
issues such as which sensors are most effective, where to do computing, and implicit sharing of
battery energy are addressed. Special attention will be paid to resolving the role of robot vision.
The University of Colorado (CU) has had a focused research effort over the past several
years to address these problems. Our interdisciplinary team has produced theoretical foundations
and practical algorithm implementations, outlined in numerous publications, that place us in a
unique position to implement viable solutions through each of the above innovations.
B. Proposal Summary
Main goals: The goal of LANdroids is to establish and maintain network connectivity while
maximizing network longevity. Our proposal has four technical goals: to identify exploitable fea-
tures of the radio environment (Page 8); to develop robust self-forming, self-healing, self-opti-
mizing, and energy aware network protocols (Page 9); to learn combined optimization tech-
niques that operate in real and signal space (Page 10); and to understand the system architecture
that will maximize performance at low cost (Page 16).
Tangible benefits: In rich, complex radio environments, the radio structure provides an op-
portunity to improve connectivity and increase network longevity. This research will enable
small robots to autonomously establish and maintain mesh coverage, extend network reach, dy-
namically adapt to network changes, and maximize network longevity (Page 6).
Critical technical barriers: Current radio models are insufficient for predicting indoor radio
performance. Radio analysis focuses on link connectivity rather than network connectivity.
Battery-constrained devices cannot simply increase power indefinitely to maintain connectivity.
Current mesh networks cannot exploit controlled node mobility. Research in controlled mobility
optimizes in real space and requires expensive localization techniques such as GPS. Simple robot
controllers do not work robustly in dynamic or cluttered environments. The LANdroid robot
must be low cost.
Main elements: We will experimentally evaluate the indoor radio environment to identify
exploitable radio and key spatial features (Page 8). We will design gradient-based methods for
connecting mobility in real space to network goals (Page 10). Robots will learn to find the key
spatial features (Page 12). Learning will optimally combine gradient and feature based control
(Page 13). Robot vision will be examined for its overall benefit to the system (Page 14). Robust
network coordination, resource management, and routing protocols will enable a system ap-
proach to sharing network energy resources (Page 9). System-level analysis will identify the ben-
efits of specific architecture and hardware designs (Page 16).
Summary of approach: Our network approach to radio analysis will yield better tools for
predicting network performance. The control approach will draw on our extensive robotics con-
trol experience to combine goals defined in spaces other than real space. Unlike previous work,
we select features and control strategies based on successful experience guided by human opera-
tors. The system approach to architecture and hardware design will better explore the perfor-
mance tradeoffs within the robot cost constraints.
Expected results: The results will be four-fold: 1) a better characterization of the ex-
ploitable features of the indoor environment including collected measurement data sets; 2) a
LANdroid system architecture that defines the most useful sensors, types of antennas, and pro-
cessing model; 3) a modular software library that will implement the proposed learning and con-
trol algorithms; and 4) communication protocol implementations for the mesh networking and
interface between robot hardware and control software.
Evaluation plan: The LANdroid program, as designed, has a rigorous performance-based
evaluation conducted four times per year throughout the program. This is a uniquely objective
and appropriate evaluation scheme in which we look forward to participating. It will be
augmented by an internal CU evaluation test bed based on our extensive wireless network test bed
and field experience.
Cost: Year 1 (12 mo.): $999,749; Year 2 (12 mo.): $999,489; Year 3 (12 mo.): $999,979
C. Research Objectives
C.1 Problem Description:
The research objectives of this proposal are 1) to identify the system architecture, hardware,
and strategies which can best exploit the indoor and urban radio environment and 2) to develop a
robot learning and control framework that meets the network goals of connectivity and longevity
in a complex dynamic environment.
C.2 Research Goals
Radio Environment Characterization: It is not well understood what radio optimizations
are possible in a dynamic, cluttered indoor environment. The LANdroid antennas are inefficient
because their close proximity to the ground electromagnetically shorts them. Small movements
may be able to improve signal strength, but these improvements may be short-lived, hard to
track, and better captured by other means such as diversity or MIMO. A key insight is that the
primary purpose of the LANdroid is to relay. It is not enough for the signal strength to be
better for a single LANdroid neighbor. Therefore, the LANdroid must jointly optimize the signal
strengths of all its neighbors. We will perform a measurement survey in an indoor environment
to evaluate the nature of signals near the ground and the stability of small-scale improvements.
We will identify and characterize the spatial features that are likely points of good signal.
We will study the role of different antenna types in exploiting the radio environment for the
benefit of the LANdroid system.
LANdroid System Hardware: The LANdroid System consists of the gateways, LANdroid
robots, and edge nodes. The radios on the warfighter and the gateway have unique capabilities
that can be exploited for improved system performance. For instance, the gateway is expected to
be vehicle mounted and well powered, while the warfighter can add intelligence to LANdroid
drop decisions and provide the necessary mobility. These radios are generally better positioned at
higher elevation than the LANdroid on the ground. Even among LANdroids, some may have better
positions or more information about the system state. The different node types can have a
variety of radio, sensor, or processor hardware that can improve their performance. We will
perform system analysis to assess the cost, performance, and energy tradeoffs for these components
within the LANdroid system. Special attention will be paid to the role of antennas and video
cameras since they may be critical to good performance. This analysis will provide guidance to
the LANdroid robot development.
System Energy Model: The control software requires a detailed power model in order to
make rational decisions between different actions. The model needs to consider both power con-
sumption and stored energy of individual nodes and the system as a whole. This model will be
the basis of analysis, planning, and optimization described below. This model will also enable
specific analysis of hardware components and their role in the LANdroid system.
Decentralized, Model-Free, Gradient-based, Motion Optimization in Signal Space: The
robot needs basic behaviors to connect actions in real space to network goals in signal space.
Control theoretic approaches applied directly in signal space can optimize signals as a network.
In a complex environment, fewer a priori assumptions are generally better. Our model-free
approaches are robust to environmental changes, simple to apply, and provide basic optimization
behaviors. The control techniques will be modified to operate directly on performance gradients
since, ultimately, the network cares about performance measures such as throughput and latency
rather than signal strength.
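As a rough illustration of model-free, gradient-based optimization, the sketch below implements a generic textbook perturbation-based extremum seeking loop in two dimensions. It is not the PIs' algorithm. The function measured_performance is a hypothetical stand-in for a measured network metric, and the dither amplitude, dither frequencies, filter constant, and gain are assumed values chosen only for this example.

```python
import math


def measured_performance(x, y):
    """Hypothetical performance map, unknown to the controller; peak at (2, -1)."""
    return 10.0 - (x - 2.0) ** 2 - (y + 1.0) ** 2


def extremum_seek(measure, x0, y0, steps=4000, dt=0.01,
                  amp=0.1, gain=2.0, w1=20.0, w2=27.0, hp=5.0):
    """Climb toward a maximum of `measure` using only point measurements."""
    xh, yh = x0, y0            # current estimate of the best position
    low = measure(x0, y0)      # slow running average of the measurement
    for k in range(steps):
        t = k * dt
        s1, s2 = math.sin(w1 * t), math.sin(w2 * t)
        # Probe the metric at a slightly perturbed (dithered) position.
        j = measure(xh + amp * s1, yh + amp * s2)
        # Remove the slowly varying part so only the dither response remains.
        low += hp * dt * (j - low)
        high = j - low
        # Demodulate: correlation with each dither approximates the gradient.
        xh += gain * dt * high * s1
        yh += gain * dt * high * s2
    return xh, yh


if __name__ == "__main__":
    print(extremum_seek(measured_performance, 0.0, 0.0))
```

The controller never evaluates a model of the environment; it only correlates small probing motions with the resulting change in the measured metric, which is why the same loop could in principle be driven by throughput or latency instead of signal strength.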
Learning Spatial Features: To take advantage of spatial features (such as hallway corners),
the robot must be able to efficiently identify them in the environment. However, these features
are generally complex concepts and can appear in many forms. Human operators will present de-
sirable spatial features to the robot so that it can learn to identify these features. Offline learning
will be performed in the laboratory, while online learning will be used in the field to adapt the
representation of the features to the local environment. Initially this identification will be on a
single robot basis, but distributed methods will be developed to enable enhanced identification at
lower energy costs.
Learning Optimal Control Strategies: The robot has a large number of options it can perform
at any given time, ranging from sensing and communicating to moving. The CU team has already
developed successful learning-based control strategies for the DARPA LAGR project.
These will be adapted to the LANdroids environment. Initially the learning will be guided by a
human operator. The goal will be to optimally combine gradient-based and spatial-feature-based
control techniques.
Vision-Based Cost Estimation (VCE): Video is potentially very useful in the LANdroid
system. It will enable more energy-efficient path planning, since the robot can avoid obstacles
and dead ends. Video can identify spatial features at a distance without movement. Video provides
a more stable reference frame. However, high frame rate, high resolution video is expensive in
power. Therefore we will modify existing algorithms so that they can use reduced-capability video.
Video will also be used to enhance network situational awareness, whereby robots can visually
estimate relative positions and track future paths by watching other robots or the warfighter.
Network Protocols: Ad hoc networks enable the LANdroid system to provide extended
connectivity. While many ad hoc protocols exist, none of them explicitly include controlled mo-
bility as a network primitive. Further, the routing must consider other network resources such as
energy at each node. The network must also efficiently and robustly enable the cooperation and
sharing of network resources. A key problem will be protocols for finding other nodes when the
network is disconnected. The research will initially focus on connectivity as the primary network
goal. However, other quality of service measures will also be considered.
C.3 Expected Impact
This proposed project will improve our insights into how to tie radio space goals tightly with
robot mobility. If successful, it will solve open problems in how to use mobility to enable net-
work self configuration, dynamic tethering, intelligent relaying around obstacles, self healing and
self optimization while extending network longevity. We will deliver a set of modular software
libraries which implement the technology described above. These libraries will constitute the
first integrated controlled mobility wireless network implementation.
D. Technical Approach and Evaluation
Our approach is based on optimizing the overall LANdroid system. The LANdroid system
consists of three major components which each contribute to the overall goals of network con-
nectivity and longevity as shown below.
         | Gateway                                  | LANdroid                                                    | Edge node (Warfighter)
Features | Comm end point and relay                 | Comm relay                                                  | Comm end point and relay
         | Permanent asset                          | Reusable/disposable                                         | Permanent asset
         | Cost $10k                                | Cost $0.1k                                                  | Cost $1k
         | Energy rich                              | Energy constrained                                          | Energy constrained
         | Computing rich                           | Computing constrained                                       | Computing constrained
         | Mobility exogenous to comm               | Comm controls mobility                                      | Mobility exogenous to comm
         | Can have multiple radios                 | Can have multiple radios                                    | Can have multiple radios
         | Can have multiple antennas               | Can have multiple antennas                                  | Can have multiple antennas
Roles    | • Participate in LANdroid system protocols | • Participate in LANdroid system protocols                  | • Participate in LANdroid system protocols
         | • Bridge to other networks               | • Can self-sacrifice for connectivity and network longevity | • Initialize LANdroid
         | • Support network functions              |                                                             |
The main evaluation tasks center on self-configuration, tethering, intelligent relaying, self-
healing, self-optimization, and intelligent power management. We describe at a high-level our
vision for each of these and provide details in the following section.
Self-configuration: There are three separate problems here. The first is, given a set of
connected nodes, how best to connect edge nodes to the gateway. This is the basic problem of ad
hoc networking. Standard MANET protocols will form the basis of our solution. However, these
protocols will be augmented to include power, mobility, learning, and sensing resource management
as noted below.
The second problem is, given a disconnected set of nodes, how they find each other to
form a connected network. We will investigate a number of strategies, such as high-power
beacons and coding that extend range and help nodes detect the presence of other nodes, as well
as systematic search algorithms. Note that the gateway can support higher power radios and more
sophisticated antennas that can potentially provide deeper (but only one-way) signal penetration.
In addition to these radio solutions, the LANdroid will search and reason over physical
locations that are likely to offer better connectivity (e.g. hallway intersections or doorways).
The third problem is how to signal to the warfighter when to drop a LANdroid. Here a coop-
erative solution will form between the warfighter radio and the carried LANdroid. The warfight-
er radio will monitor its network connectivity to the gateway. As the connection degrades, an es-
timate will be made of the path back to recent points of good connectivity. When the signal has
degraded sufficiently, the LANdroid will signal to be dropped. The dropped LANdroid is closer
to the ground and has worse signal and may be disconnected. It proceeds back toward the last
good connectivity point using the protocol for disconnected nodes but with a prior distribution on
the best search path. Note that the drop decision is not simply a signal strength threshold. A teth-
ered LANdroid following a warfighter may signal it is turning a corner and about to solve the
weak link, removing the need to drop. Or, a warfighter in a room to room search may have a pat-
tern of fluctuating signals that warrants weighing signals over a longer period.
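The point that the drop decision is more than a signal strength threshold can be illustrated with a minimal sketch that weighs signal over a longer period. It smooths the measurements with an exponentially weighted moving average and signals a drop only after the smoothed value stays below a threshold for several consecutive samples. The threshold, smoothing factor, and hold count are assumed values for illustration, and the sketch covers only the temporal weighing, not the path estimate or the cooperation with a tethered LANdroid described above.

```python
def drop_index(rssi_dbm_samples, threshold_dbm=-85.0, alpha=0.2, hold=5):
    """Return the sample index at which a drop would be signalled, or None.

    All parameter defaults are illustrative assumptions.
    """
    smoothed = None
    below = 0
    for i, rssi in enumerate(rssi_dbm_samples):
        # Exponentially weighted moving average of the signal strength.
        if smoothed is None:
            smoothed = rssi
        else:
            smoothed = alpha * rssi + (1.0 - alpha) * smoothed
        # Count consecutive samples whose smoothed value is below threshold.
        below = below + 1 if smoothed < threshold_dbm else 0
        if below >= hold:
            return i
    return None


if __name__ == "__main__":
    room_search = [-70.0, -95.0] * 10               # fluctuating signal
    walking_away = [-70.0 - 2.0 * i for i in range(20)]  # steady degradation
    print(drop_index(room_search), drop_index(walking_away))
```

With these assumed settings, the fluctuating room-to-room pattern never triggers a drop even though individual samples fall below the threshold, while the steadily degrading signal does.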
Tethering: Tethering encompasses all of the network morphing and stretching needed to
provide connectivity to edge nodes. We envision tethering that is proactive and reactive. In reac-
tive tethering, when an edge node signal starts to degrade, the network might first choose to in-
crease transmitter power. If the signal continues to degrade, LANdroids will perform a gradient-
based distributed optimization to improve the edge node connectivity.
Proactive tethering considers the problem of multiple LANdroids in close proximity. Based
on traffic, remaining energy, and other factors, the LANdroids will spread out in order to in-
crease the physical area encompassed by the connected network. The spreading will be guided
by signal gradients and the physical environment. This will help to connect disconnected nodes
and to avoid network disruptions as warfighters move through an area.
Intelligent relaying: Intelligent relaying requires decisions on how traffic is routed and how
the LANdroid relays choose to position themselves. Traffic routing will use energy-aware
algorithms that can load balance rather than use simple shortest path. Since LANdroid movement is
relatively expensive in terms of battery energy compared to computing, communication, and
sensing, the LANdroid must be parsimonious in its movement choices and consider carefully
where it stops. We will use learning mechanisms to identify key spatial features such as hallway
corners which are likely to serve as good relay points now and in the future. The same mecha-
nisms will be used to make predictions whether the expected energy cost of a move now will
likely be rewarded with less future energy cost or better connectivity.
Self-healing: The network must adapt to changes that are external or internal to the network.
Internal changes include new sources of traffic that cause congestion and network elements that
fail. External changes include new sources of interference or changes to the environment such as
a door that closes. These problems are addressed first by adaptive networking protocols to route
around points of congestion, interference, or failed nodes. Then learning techniques identify
when such changes warrant repositioning of the network. The actual movement can be guided by
gradient techniques. Some changes are predictable (e.g. a node running out of power) and if
needed the network can proactively respond to the upcoming change.
Self-optimization: The LANdroid network operates in an environment that varies over
space and time. The problem is determining which optimizations over space are stable enough
over time to warrant spending energy to seek out. Edge nodes are mobile and the signal strengths
are highly dynamic. However, many LANdroid nodes will be stationary and so become stable
points to optimize around. Cooperative protocols will share information about the environment.
For instance, reinforced concrete has high penetration loss, while wood frame construction has
significantly less. Nodes that can cooperatively identify the construction can adjust their gradient
and node placement algorithms. In reinforced concrete, positioning at doorways and corners is
critical, while in wood frame, distributing nodes more uniformly is optimal. Antenna technologies may
provide significant gains in this environment. For instance, a high-gain antenna can reduce multi-
path and increase effective signal strength. We will survey the current state of available antenna
types from a system perspective to understand which are most useful for the LANdroids sce-
nario.
Intelligent power management: The LANdroid system has many opportunities to trade be-
tween sensing, communication, processing, and movement in order to conserve network energy.
We will study these tradeoffs within the cost constraints of the gateway, LANdroid, and edge
nodes. For instance, our initial hypothesis is that a video camera will be a key energy saving
component. It will enable more energy efficient path planning since it can avoid obstacles and
dead ends. Video can identify spatial features at a distance without movement. Video provides a
more stable reference frame. For instance, if a node suddenly becomes isolated, it can use video
references of past locations to backtrack even when no radio signal is present or after dead
reckoning has been lost because the robot was kicked. These benefits must be weighed against
video’s cost and energy drain. Similarly, other radio, antenna, sensor, and processing tradeoffs
will be studied.
Energy will be managed as a system. LANdroids can altruistically spend power on commu-
nication, movement, processing, and sensing in order to preserve critical nodes’ energy. Routing
can avoid using critical nodes. Nodes can move to make the critical node’s communication easi-
er. Computing can be offloaded to other nodes or sent to the processing and energy rich gateway.
D.1 Technical Approach
The technical details outlined below reflect our experiences in radio frequency environ-
ments, wireless network implementation, controlled mobility in networking, distributed coopera-
tive control, and robotic navigation. These technical approaches represent years of research, im-
plementation, and testing in real environments by the PIs. As such, our proposed solution does
not represent fundamental research, but rather the application of sound and tested approaches to
the LANdroid system.
Radio Environment Characterization: It is not well understood what radio optimizations
are possible in a dynamic cluttered indoor environment. Radio signals are well known to vary by
tens of dB over both small (~ one wavelength) and large distances. Small movements may be
able to improve signal strength but these may be short-lived, hard to track, and better captured by
other means such as diversity or MIMO. Large scale movements are known to provide signifi-
cant improvements. Some of these may be more critical to network connectivity (e.g. a relay
point around a corner) while others may be more ephemeral (e.g. a better location within a
room). This characterization is needed for reliable planning and optimization. A key insight is
that the primary purpose of the LANdroid is to relay. It is not enough for the signal strength to be
better for a single LANdroid neighbor. Therefore, the LANdroid must make a joint optimization
of the signal strengths of all its neighbors.
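A minimal sketch of why a relay must optimize jointly: if the relay is judged by its weakest link, one possible joint objective is to maximize the minimum signal strength over all neighbors. The max-min objective and the candidate positions and dBm values below are assumptions for illustration only; the proposal does not commit to this particular objective.

```python
def best_relay_position(candidates):
    """Pick the position whose weakest neighbor link is strongest.

    candidates maps a position label to the list of link strengths (dBm)
    measured to each neighbor from that position.
    """
    return max(candidates, key=lambda pos: min(candidates[pos]))


def best_single_link_position(candidates):
    """For contrast: pick the position with the single strongest link."""
    return max(candidates, key=lambda pos: max(candidates[pos]))


if __name__ == "__main__":
    # Hypothetical measurements to two neighbors from three positions.
    survey = {
        "in_room": [-50.0, -90.0],
        "hallway_corner": [-70.0, -72.0],
        "doorway": [-60.0, -80.0],
    }
    print(best_relay_position(survey), best_single_link_position(survey))
```

In this made-up survey the single-link rule favors the position with one excellent link and one nearly unusable link, while the joint rule favors the hallway corner, where both links are adequate.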
There are a number of challenges for a small robot located close to the ground. At 2.45 GHz, the
wavelength (about 12 cm) is large relative to the robot, so the robot is electrically small as
well as electrically close to the perfect or imperfect ground, resulting in a low-gain antenna
with possibly low impedance. Unconventional antennas will need to be investigated from a system
standpoint, along with ways to mitigate heavy multipath effects. An example is a combination of
four corner-cube loaded monopole antennas (used often in the millimeter-wave region due to their
simplicity, e.g., [Gro89]), with four beams of the radiation pattern pointing roughly 40 deg from
the horizontal plane, and with a local common ground which makes the antenna relatively
insensitive to the surface properties. The four elements can enable spatial and polarization
diversity, and a combined monopole-loop or frame antenna feeding the corner-cube reflector can in
addition enable field diversity [Jak94]. A relevant figure of merit is the level of independence
of the different diversity branches, which will affect the diversity combining, as shown, e.g.,
in [PoP02, ZRP05]. Several such antennas which satisfy the robot space constraints will be
investigated in terms of the influence of their combined radiation patterns on the network
system optimization.
To characterize the environment we will rely on empirical measurements combined
with analysis. We will place dense grids of radio nodes in a space and simultaneously measure
variations in the environment between different node pairs over time to capture the network
value of different locations and their temporal stability. These data will feed
measurement-based simulations of network performance.
Figure 1: Throughput vs. locations along a hallway between an 802.11n MIMO AP and laptop.
[Plot: throughput (Mbps, axis from 1 to 100) vs. location (1 to 7) for three antenna placements: both at 1 meter, one on the floor, both on the floor. Two hallway corners and the 1 Mbps level are marked.]
One challenge with small robots is that they are close to the ground. Figure 1 shows that
radios at one meter above the floor have significantly better reach and can maintain greater than
10 Mbps around one hallway corner and greater than 1 Mbps around two hallway corners. However,
placing either or both on the floor causes the rate to drop below 1 Mbps at the first corner, and
communication ceased around the second corner. One goal will be to understand
whether simple modifications to antennas will improve this situation. However, a more critical
question is the role of the antenna in the larger network optimization problem:
• Multiple directional antennas pointing in different directions indicate the angle of arrival. This
aids navigation and gradient following. In the indoor multipath environment, following the
strongest signal may lead to dead ends, as when a tethered LANdroid follows the signal gradient
into a closet. The second strongest direction can indicate alternative paths that would assist
navigation.
• Directional antennas improve connectivity without expending more power. They reduce
interference and increase SNR, potentially avoiding the need to move the robot.
• Small-scale radio effects can be explored without movement. Multiple antennas provide
diversity against fading and react at electronic switching speeds rather than robot motion speeds.
[Figure: Tethered LANdroid follows signal gradient into closet.]
We emphasize that the goal of this work is to ensure that system considerations are incorpo-
rated into the antenna design. The antenna technologies discussed here are all COTS and are not
in themselves a focus of the research.
Along the same lines we consider other radio enhancements that can be leveraged to save
power. Performance can be improved by changing channels. Different channels will observe dif-
ferent multipath fading and can avoid interference and jamming. Simple RF measurement de-
vices can be built into the robot that allow it to efficiently survey the spectrum and find channels
that have less noise, interference, or jamming.
Energy Aware Ad Hoc Routing: The energy cost to deliver a packet across a network
depends on the route the packet follows and the power at which it is transmitted along the way. For typical
IEEE 802.11 interfaces, the dissipated energy is not a strong function of transmit power and is in
fact dominated by the majority of the time that the interface is idle awaiting reception. This sug-
gests two directions for research. First, saving energy in the transmitter will best be achieved
through shutting down the interface as often as possible. We will explore simple protocols such
as shutting down interfaces when the channel is idle (say for 10msec) but waking up at synchro-
nized periodic intervals (say every 100msec) to send, relay, and receive traffic as long as there is
network activity. This will enable the delay target of less than 500msec to be met while provid-
ing a significant potential for energy savings.
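To make the potential saving concrete, the following sketch computes the average power of a duty-cycled interface in an idle network, where the interface stays awake for the 10 msec idle timeout out of every 100 msec wake interval. The awake and sleep power figures are assumed, illustrative values, not measurements from our test bed, and the model ignores wake-up transition costs.

```python
def average_interface_power_w(awake_s, interval_s, awake_power_w, sleep_power_w):
    """Average power of an interface awake for awake_s out of every interval_s."""
    duty = awake_s / interval_s
    return duty * awake_power_w + (1.0 - duty) * sleep_power_w


# Assumed, illustrative power figures (hypothetical, not measured).
AWAKE_W = 0.8    # interface idle but listening
SLEEP_W = 0.05   # interface shut down

always_on = average_interface_power_w(0.100, 0.100, AWAKE_W, SLEEP_W)
duty_cycled = average_interface_power_w(0.010, 0.100, AWAKE_W, SLEEP_W)
savings = 1.0 - duty_cycled / always_on

if __name__ == "__main__":
    print(f"always on: {always_on:.3f} W, duty cycled: {duty_cycled:.3f} W, "
          f"savings: {savings:.0%}")
```

Under these assumptions the idle-network saving is dominated by the ratio of the idle timeout to the wake interval; the actual saving depends on traffic, since active periods keep the interface awake longer.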
Second, we observe that the transmit power can be greatly increased at little energy cost.
This added power can facilitate range and connectivity. In our indoor experiments (using the
same setup as in Figure 1) we compared many recent 802.11 COTS antenna technologies (e.g.
MIMO, beam steering, etc.). The longest range was always an ordinary 802.11b card combined
with a 1 W amplifier. In other words, raw transmit power is a useful dimension to explore.
The goal here is to create a radio with variable transmit powers that might be able to go in
steps from a few mW to up to more than 1W. The high end would be useful initially since an
unattached node could beacon at high power to find other nodes. This is orders of magnitude
more efficient than the robot trying to search through movement to find other nodes. For instance
sending one 3W beacon packet every second would require less than 10mW power on average
and is likely to find other nodes in seconds. Mobile searches would require much higher power
to drive the motors and take much longer. However, high power transmission interferes over a large area,
overloads the front end of nearby nodes, and quickly drains the battery. Thus the transmitters
should send at lower power whenever possible. We have designed ad hoc routing protocols that
include mechanisms for nodes to estimate the minimum power needed to close a link that we will
apply to the LANdroid scenario [DBB02].
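The beacon figure quoted above can be checked with simple duty-cycle arithmetic. The sketch assumes a beacon airtime of about 1.49 msec (the low-rate packet time reported later in this section) and counts radiated power only, ignoring amplifier inefficiency; both are assumptions made for illustration.

```python
def average_power_mw(tx_power_w: float, airtime_ms: float, period_s: float) -> float:
    """Average radiated power in mW for one packet of airtime_ms every period_s."""
    duty_cycle = (airtime_ms / 1000.0) / period_s
    return tx_power_w * duty_cycle * 1000.0


# One 3 W beacon per second, assumed airtime 1.49 msec.
beacon_mw = average_power_mw(tx_power_w=3.0, airtime_ms=1.49, period_s=1.0)
print(f"average beacon power: {beacon_mw:.2f} mW")
```

With these assumed values the average is a few mW, consistent with the stated bound of less than 10mW.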
Maximizing network lifetime can be facilitated
through network wide routing decisions. Simply using
transmitter power as a metric and choosing minimum
power paths does not maximize network lifetime. Traffic
can get funneled through nodes with the best connectivity
causing their batteries to quickly drain. We use a concept
of maximum flow life curve [BrG01,BGZ01] to balance
energy drains and effectively treat battery energy as a
network resource. The figure at the right shows the traf-
fic flow carried in a network over time (averaged over 100 network instances). The maximum
flow life curve approach (MFLC) is able to increase network longevity by 50% (at 90% remain-
ing flow) compared to minimizing the power cost of each route (MC).
Decentralized, Model-Free, Gradient-based, Motion Optimization in Signal Space: De-
centralized model-free, gradient-based, motion optimization in signal space will be implemented
using modifications of multivariable extremum seeking algorithms developed by the PIs for lin-
ear communication networks of unmanned aircraft [DiF07]. This approach brings a number of
features that are ideally suited for LANDroid mobility control including:
• Decentralization: Decentralized control schemes are characterized by local decision making in
which a given agent selects its action based only on information it has gathered from its own
sensors and data shared by its “neighbors”. The agent has no knowledge of the global state of
the network or explicit global goals. However, through these local interactions group behavior
emerges that achieves desirable global performance objectives. The main advantages of de-
centralized control schemes are their scalability and robustness to node or network failures.
• Model-Free: Multivariable extremum seeking (MES) [ArK02] controllers are adaptive, model
free controllers designed to drive the set point of a dynamic system to an optimal, but unpre-
dictable location defined by a performance function that is only known to have an extremum.
Thus, the system can adapt to robot mobility limitations and the radio propagation environ-
ment without explicitly modeling them and the mapping from signal space to physical space
is implicitly considered.
• Gradient-Based: In order to find an extremum point in the unmodeled system, MES algorithms
follow gradients in order to improve performance. This local control approach eliminates
the need to search large regions of the environment, reducing costly power loss due to mobility.
• Spatially Distributed: Network coverage tasks such as self-configuration, self-optimization,
self-healing, and tethering can all be achieved through local control only. That is, local
interaction rules can be designed that lead to optimal global behavior. This ability results from
the spatially distributed structure of the problem in which gradients of the global objective
can be determined locally as functions of the state of an agent and its neighbors only (i.e. with
no global information).
Figure 2. a) Self-configuration of a single LANDroid between source and destination.
b) Self-optimization of LANDroid chain.
Multivariable extremum seeking (MES) [ArK02] controllers are adaptive, model free con-
trollers designed to drive the set point of a dynamic system to an optimal, but unpredictable loca-
tion defined by a performance function that is only known to have an extremum point. The MES
algorithm developed by the PIs [DiF07] differs from standard MES algorithms in that the re-
quired external dither signal is provided by periodic motion of the robot about some center point
and we add an external ‘virtual plant’. The MES approach developed by the PIs is a variation of
the algorithm given in [ArK02] and therefore stability and performance results can be taken and
applied. Recent work by [KZA07] has extended the ES framework to nonlinear models that can
capture the guidance level behavior of the LANDroid robots, reduce the required excitation (i.e.
mobility), and eliminate the need for any positioning information.
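To make the dither-based idea concrete, the following is a minimal sketch of the textbook perturbation-based extremum seeking loop in two dimensions. It is not the PIs' algorithm from [DiF07] (it has no virtual plant), and the signal-strength field, gains, and function names are hypothetical. The robot circles a center point, the measured signal is high-pass filtered, and the product of that signal with the dither direction nudges the center uphill.

```python
import math


def signal_strength(x: float, y: float) -> float:
    """Hypothetical field, unknown to the controller; its peak is at (3, -2)."""
    return -((x - 3.0) ** 2 + (y + 2.0) ** 2)


def extremum_seek(measure, center, radius=0.3, omega=5.0, gain=0.2,
                  lowpass=1.0, dt=0.01, steps=30000):
    """Move the center of a circular dither motion toward a maximum of measure."""
    cx, cy = center
    avg = measure(cx + radius, cy)            # running mean of the measurement
    for n in range(steps):
        t = n * dt
        ux, uy = math.cos(omega * t), math.sin(omega * t)
        j = measure(cx + radius * ux, cy + radius * uy)
        avg += lowpass * dt * (j - avg)       # low-pass filter tracks the mean
        hp = j - avg                          # high-pass part carries gradient info
        cx += gain * dt * hp * ux             # demodulate and integrate
        cy += gain * dt * hp * uy
    return cx, cy


if __name__ == "__main__":
    print(extremum_seek(signal_strength, (0.0, 0.0)))
```

The controller only ever calls `measure`; it never sees the field's formula or its gradient, which is the sense in which the approach is model free.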
A key MES framework strength is that a model of the environment and dynamical system is
not needed. Thus, the approach can be applied directly in signal space in order to optimize the re-
ceived signal strength. The PIs have developed this framework specifically to optimize capacity
in linear communication networks of unmanned aircraft using received signal strength only
[DiF07]. Figure 2 shows simulation results using the MES framework to self-optimize a linear
relay network. Since the MES control law is adaptive and model free, the self-configuration,
self-healing, self-optimization, and tethering task behaviors occur as needed. In fact, the network
has no explicit knowledge of which of these tasks
is being performed. The decentralized control
laws continually seek to improve the network in
response to the (unmodeled) environment. Figure
3 shows example data collected using a similar
approach indoors.
The MES approach developed by the PIs can
be applied to any gradient-based decentralized
control scheme for which a local function can be
measured that has the same (local) gradient as the
global objective. This includes the large body of
work currently devoted to the synthesis of simple
interaction rules that result in desired group-wide,
global behaviors such as distributed macrosensors, coverage control, and robot swarming. For
example, robot swarming algorithms are based
on potential energy functions of relative range and the gradient of the potential leads to velocity
control inputs. This gradient information is not available when the robots only have relative
range sensors (e.g. can only measure signal strength). The MES framework estimates the gradient
information while ascending (or descending) it.
Figure 3. Signal strength versus time using the gradient ascent approach indoors.
In addition to applying the MES framework to existing coverage control and robot swarming
algorithms, we will develop new variations that specifically address the LANDroid environment.
For example, swarming and coverage control algorithms are designed to react instantaneously to
changes in the network. This process uses considerable power as transient responses must settle
out of the network. In the LANDroid scenario we need to consider the nature of the warfighter’s
decision-making and movement processes. For example, a warfighter may briefly explore a new
room before proceeding onward. A LANDroid network tethered to the warfighter should not en-
ter that room only to vacate it moments later. Thus, adding nonlinear elements such as hysteresis
to the virtual force fields that drive the LANDroid will improve the overall performance (e.g.
power usage) of the system.
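One simple way to picture the hysteresis element is a two-threshold switch on the tethered link's signal strength: the robot starts moving only when the link is clearly poor and stops only when it is clearly good again. The sketch below is illustrative only; the class name and the dBm thresholds are hypothetical, not measured values.

```python
class HysteresisTether:
    """Two-threshold (Schmitt-trigger style) decision on whether to move."""

    def __init__(self, move_below_dbm: float = -80.0, stop_above_dbm: float = -70.0):
        self.move_below_dbm = move_below_dbm
        self.stop_above_dbm = stop_above_dbm
        self.moving = False

    def update(self, rssi_dbm: float) -> bool:
        """Feed one signal-strength reading; return True if the robot should move."""
        if self.moving:
            if rssi_dbm >= self.stop_above_dbm:
                self.moving = False
        elif rssi_dbm <= self.move_below_dbm:
            self.moving = True
        return self.moving


tether = HysteresisTether()
readings = [-65, -75, -79, -82, -76, -72, -69, -75]
decisions = [tether.update(r) for r in readings]
print(decisions)
```

Readings between the two thresholds leave the current decision unchanged, so a brief dip (such as a warfighter stepping into a room and back out) does not by itself trigger motion.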
Finally, the model-free, gradient-based approach complements the feature-based approach to
motion planning. In particular, the MES approach only finds local extrema and can get caught
behind obstacles or dead-zones in the radio propagation environment. Key research questions
pursued during this project will be the appropriate balance between the two approaches and de-
veloping the ability to recognize when a switch from one method to the other must occur.
Learning: Machine Learning and Statistical algorithms will play two roles in the proposed
work. First, they will be used to learn Spatial Features, online as the LANdroids are deployed,
and offline during test deployments designed to emulate a real deployment. The online learning
is required because no constructed test environment can account for all environment types an ac-
tual deployment of LANdroids will encounter. Second, they will be used to learn optimal control
strategies in sensor space, Spatial Feature Space and Signal Gradient Space. Both of these roles
of Machine Learning have their foundation in actual deployments under the DARPA LAGR pro-
gram and the NSF “Human-to-Robot Skill Transfer” grant.
Learning Spatial Features: The Colorado Team has a significant history of using sensor
data to learn such concepts as traversability and non-traversability in unstructured outdoor
environments [GMO07, PMG07]. We propose to use these same techniques to learn relevant Spatial
Features about the environment. These techniques are density based classifiers that have the fol-
lowing properties: 1) No assumption is made on the number of classes (Spatial Features) that
will need to be learned for successful wireless communication; 2) Learning data only becomes
available in small subsets (i.e. the unrealistic assumption that all necessary learning data is avail-
able at once is not made); 3) The features used for each Spatial Class may differ, and a formal
framework for feature selection is used [Str06]; finally, 4) The learned models can predict when
they are applicable to a particular LANdroid deployment, and therefore should be used. This
implies that the LANdroid robot will know when it doesn’t know, and can be directed to act
appropriately to learn what is necessary. Learning these models involves the use of dot products,
Singular Value Decomposition (in low dimensional state space), and histogram building, while the
application of these models requires passing the results of dot products through histograms.
These operations are computationally efficient, allowing online learning with limited CPU pow-
er, making the learning framework ideal for the LANdroid project.
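The flavor of such a density-based classifier can be sketched as follows. This is a simplified stand-in, not the method of [GMO07, PMG07]: the projection direction is supplied by hand (in the approach described above it would come from a low-dimensional SVD), and the two-dimensional features, bin width, and density threshold are hypothetical.

```python
import math
from collections import Counter


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class HistogramModel:
    """One-class density model: histogram of samples projected onto a direction."""

    def __init__(self, direction, bin_width=0.5):
        self.direction = direction
        self.bin_width = bin_width
        self.counts = Counter()
        self.total = 0

    def _bin(self, x):
        return math.floor(dot(self.direction, x) / self.bin_width)

    def learn(self, samples):
        """Incremental update; may be called as small batches of data arrive."""
        for x in samples:
            self.counts[self._bin(x)] += 1
            self.total += 1

    def density(self, x):
        """Pass the dot product through the histogram."""
        return self.counts[self._bin(x)] / self.total if self.total else 0.0


def classify(models, x, min_density=0.05):
    """Return the best-matching label, or None when no model claims the sample."""
    best_label, best_density = None, min_density
    for label, model in models.items():
        d = model.density(x)
        if d >= best_density:
            best_label, best_density = label, d
    return best_label
```

A `None` result is the "knows when it doesn't know" case: the sample falls where none of the learned histograms has support, which can be used to trigger further learning.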
For offline learning, the mapping of sensor readings to Spatial Features will be learned as
follows. Examples of such Spatial Features as doorways, corners and walls, will be “shown” to
the robot by placing it near them. A classifier (of the type discussed above) will then be built for
each Spatial Feature. This learning will take place in environments that mimic those that the
LANdroids will eventually be deployed in. Note that the Spatial Features that will actually be
useful, and therefore be learned, is an open question that this proposal is intended to address. In
essence we will learn only about the Spatial Features that help the robot optimize its ability to
find optimal locations for wireless transmission (see Learning Optimal Control Strategies dis-
cussed below).
For online learning during an actual LANdroids deployment, the Spatial Features will be
learned as necessary. For example, if a soldier runs through something that the robot thinks is a
wall, the robot can take a sensor reading of the area, and classify it as a doorway. Thus it learns
the concept of doorways in the current deployment environment. Similarly, if the robot runs into
a wall (or corner) when its models “believe” the path is clear, the robot can backup and build a
classifier of the wall (or corner), allowing it to better optimize paths towards areas of better wire-
less conditions, using less battery life. Once again, the concept of wall or corner can be learned
with respect to the current environment. Similarly, other relevant Spatial Feature concepts can be
learned during a deployment.
Finally, the Spatial Features models are small (about 1 KB each), allowing them to be shared
by all LANdroids during a deployment, with minimal load to wireless communication.
Learning Optimal Control Strategies: Two types of learning paradigms will be explored
to learn optimal control strategies from Sensor Space, Spatial Feature Space and Signal Gradient
Space readings. The first is a Reinforcement Learning approach (based on the Markov Decision
Process framework) that members of the Colorado Team have developed in the past [GKU03,
GrU01a, GrU01b, GrU00, GrU04]. The second is a new framework for learning fast, intelligent
motion planning using available sensors [ORG07]. This second framework, referred to as Cost
Function Learning, combines domain specific learning of cost functions from available data with
fast A* search [RBB07].
The Reinforcement Learning approach involves probabilistic reasoning on whether the robot
should follow the Signal Gradient, choose paths based on Spatial Features, or combine both of
these inputs. We will develop a set of standard behaviors such as “go towards wall”, “go towards
room center”, “find room corner”, “follow soldier”, “move in the direction the soldier came
from”, “follow signal gradient”, “backtrack current motion”, “randomized motion”, as well as
combinations of these behaviors. A policy gradient Reinforcement Learning framework will be
used to switch between these behaviors based on sensor observations [GrU04]. The key to this
approach is that learning is both fast and efficient, requiring relatively few test deployments to
achieve locally optimal policies. In addition, we will investigate the possibility of making the
state space sufficiently compact, allowing us to investigate the use of standard Value Function
Reinforcement Learning and Dynamic Programming solutions [SuB98]. These algorithms all require
a reinforcement signal from the environment to learn optimal behavior transition policies,
the choice of which can greatly influence the quality of the final control policies learned. We
propose to investigate a variety of different reinforcement signals [GrU01b], including combina-
tions of battery life length and average wireless signal strength.
The Cost Function Learning approach involves learning to combine all sensor information
into a single cost map that, when A* is used, produces optimal behavior with respect to maintaining
wireless signal strength and prolonging battery life. The cost function mappings will be
learned in test settings by having a human operator demonstrate “optimal” control strategies
[RBB07]. These strategies will be determined by experimentation, having the human operator
move the robot in various ways and learning only the cost function mappings that produce the best
results. In this framework, the human becomes key in influencing which types of Sensor Space,
Spatial Feature Space and Signal Gradient Space readings are relevant, not by choosing them di-
rectly, but by executing robot actions that are near optimal and allowing a learning algorithm to
determine how to combine these readings into a cost function that will achieve near optimal
robot behavior autonomously.
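Once a cost map exists, the planning step is ordinary A* search over it. The sketch below shows that step only, on a small hand-written grid; in the proposed framework the per-cell costs would come from the learned cost function, and the grid values, 4-connected motion model, and function name here are assumptions for illustration.

```python
import heapq


def astar(cost, start, goal):
    """A* on a 4-connected grid; cost[r][c] is the price of entering cell (r, c)."""
    rows, cols = len(cost), len(cost[0])
    min_cost = min(min(row) for row in cost)

    def heuristic(cell):
        # Manhattan distance scaled by the cheapest cell keeps the estimate admissible.
        return min_cost * (abs(cell[0] - goal[0]) + abs(cell[1] - goal[1]))

    frontier = [(heuristic(start), 0.0, start)]
    best = {start: 0.0}
    parent = {start: None}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], g
        if g > best[cell]:
            continue  # stale queue entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols:
                new_g = g + cost[nxt[0]][nxt[1]]
                if new_g < best.get(nxt, float("inf")):
                    best[nxt] = new_g
                    parent[nxt] = cell
                    heapq.heappush(frontier, (new_g + heuristic(nxt), new_g, nxt))
    return None, float("inf")


# Hypothetical cost map: the two 9s stand for cells the learned cost penalizes.
grid = [
    [1, 1, 1],
    [9, 9, 1],
    [1, 1, 1],
]
path, total = astar(grid, (0, 0), (2, 0))
print(path, total)
```

On this grid the planner prefers the longer detour around the expensive cells, which is the kind of behavior a learned cost map is meant to induce.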
Vision-based Cost Estimation: We propose adding one or more camera devices to the
robot platform to facilitate safe and efficient navigation through the environment. We argue that
for the proposed robot configuration, sensing and associated calculation has a much lower draw
on battery resources than robot motion. Using Computer Vision to identify the constraints of the
physical environment and restrict planned motions to those which are safe and have high potential
payoff makes the cost of added devices worthwhile. Sophisticated camera modules designed
for the cell phone market provide high sensitivity and resolution and operate at 150-250mW
while capturing 30 fps. Since we expect to work at relatively low image resolutions, cameras
which satisfy our requirements are available for $5-10 per module.
In the simplest case adding one camera will allow the robot to choose motions which avoid
obstacles and allow identification of environment features (doorways, corridor junctions) which
potentially improve the LANdroid’s ability to relay signals. More cameras provide views in mul-
tiple directions which could allow the robot to track the warfighter as s/he moves on after drop,
as well as analyzing more of the environment without the need to rotate the robot. Depending on
the mounting configuration multiple cameras allow us to reconstruct the shape of the environ-
ment without any robot motion, using sparse or dense features. As part of our work we plan to
explore the space of camera configurations and Vision-based measurements to identify the costs
and benefits of each. Essentially we want to identify which measurements of the space will best
allow us to achieve the LANdroid objectives of maintaining coverage and preserving battery life.
Cameras will be mounted on the robot platform in a variety of configurations for evaluation
of cost/performance tradeoffs. There will always be at least one forward facing camera; additional
cameras may form stereo pairs or panoramic rigs. In all cases cameras will be strongly calibrated,
providing both intrinsic parameters (focal length, principal point, skew) and the extrinsic
transformation from camera to robot coordinate frame.
In general the goal for a LANdroid is to find its optimal pose with respect to signal strength
and then carry on with the real mission of providing reliable communications. For the most part
we expect the sensory processing to be in a sleep mode. The question then is when do we need to
activate the vision system? Naturally at drop the LANdroid must seek out the local optimum for
signal coverage, and if a neighboring node in the mesh is added or fails the robot must adjust to
the changes in the signal profile. Any time robot motion may be required we anticipate sensing
first. Another possibility is for the vision system to provide some level of situation awareness, by
periodically capturing frames to identify changes or motion in the environment. Although not
specifically part of the LANdroid mandate, the availability of distributed cameras with commu-
nication capacity could provide information about other warfighters in the area, fire, falling de-
bris or other hazards.
The first task of the vision system is to provide information about the immediate environ-
ment to guide robot motion. Work on localization for indoor navigation has frequently exploited
depth calculation through stereo [MaS87,MuL00] or structure and motion calculations [BoP95]
to identify free space and obstacles for mapping and navigation. For our evaluations, single cam-
eras or panoramic configurations with limited overlap allow computing sparse or dense scene
structure through tracking scene motion as the robot moves. Stereo configurations can compute
sparse or dense depth information through correspondence and triangulation, before and during
robot motion. The purpose of reconstructing the shape of the environment is to provide informa-
tion about the hard physical constraints present, so we can combine them with signal strength in-
formation to determine the optimal feasible pose for the robot.
We hypothesize several vision-based calculations which will provide information on robot
poses which optimize signal strength thus reducing the search space. The first is to track the
warfighter after robot drop. Since we are propagating the signal to the warfighter, the direction of
departure provides information about the probable location of the next LANdroid in the mesh. Another
possibility is that certain features in the environment can be exploited with the expectation
of improving mesh coverage. First we want to locate doorways and corridor junctions along the
trajectory of the warfighter where positioning the robot may improve transmission. Another po-
tential improvement in signal strength can be obtained by moving to an interior corner.
We assume that the robot rapidly achieves a ready state when dropped. After drop we esti-
mate gross heading of the warfighter in order to approximate the direction of next node/LAN-
droid in the chain. Observing the warfighter turn a corner or pass through a doorway gives infor-
mation about the space as well as potential information for optimizing pose. Extensive work on
identifying and tracking people exists in the literature [SBF03, LuT01], particularly in the con-
text of identifying pedestrians for automated driving [Lom01]. In the simplest scenario differ-
encing techniques allow us to identify changing regions as the warfighter moves, and simple
blob tracking should suffice to approximate his gross heading. Reasoning about where and when
the person tracker fails will help identify important doorways and turns in corridors.
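The simplest scenario described above, differencing followed by blob tracking, can be sketched in a few lines. The version below treats all changed pixels as a single blob and reports the net image-plane displacement of its centroid; frames are plain nested lists of grayscale values, and the threshold and function names are hypothetical choices for illustration.

```python
def changed_centroid(prev, curr, threshold=30):
    """Centroid (row, col) of pixels whose intensity changed by more than threshold."""
    row_sum = col_sum = count = 0
    for r, (prev_row, curr_row) in enumerate(zip(prev, curr)):
        for c, (p, q) in enumerate(zip(prev_row, curr_row)):
            if abs(q - p) > threshold:
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        return None
    return (row_sum / count, col_sum / count)


def gross_heading(frames, threshold=30):
    """Net (d_row, d_col) displacement of the changing region over a frame sequence."""
    centroids = []
    for prev, curr in zip(frames, frames[1:]):
        centroid = changed_centroid(prev, curr, threshold)
        if centroid is not None:
            centroids.append(centroid)
    if len(centroids) < 2:
        return None  # too little motion observed to estimate a heading
    (r0, c0), (r1, c1) = centroids[0], centroids[-1]
    return (r1 - r0, c1 - c0)
```

A `None` result (no changing region found) is itself informative in the sense discussed above: the point where tracking fails may mark a doorway or a turn in a corridor.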
Historically many indoor navigation systems have used doorways and corridor geometry to
localize and map the environment [DeK02, KoP95, KrB88]. Typically these systems use edge-
based representations where particular configurations of linked edge segments are identified as
doorways based on viewing constraints. We will provide edge structures as well as color and
texture information from the hypothesized region to learn a door detector. Identifying junctions
in corridors is somewhat less well defined, but we will use the same process for learning a junc-
tion detector.
Identifying internal corners where walls meet requires examining dense depth information
and determining plane intersections consistent with wall intersections. The structured nature of
indoor environments allows us to also test a sparse edge-based approach which exploits classic
blocks world edge configuration constraints [Rob65] along with edge-based structure and motion
calculations to hypothesize corners. Again we pass our image-based evidence to the Machine
Learning framework to generate a more robust room corner detector.
Finally there are a number of techniques that could be used to help determine the relative
position of LANdroids in the mesh. Cameras could be used to visually identify signals from oth-
er robots, for example an LED flashing, assuming the robots have lines of sight between them. If
the robots share a common visual space they can locate and communicate common visual land-
marks and compute their position based on the relative viewpoint, for example by exchanging
SIFT features [SLL02] and computing relative pose based on triangulation.
Another space for evaluating cost tradeoffs for vision is the computational load based on the
resolution and complexity of detection and extraction calculations performed. In general we an-
ticipate using low resolution (120x160 pixel) images to reduce computational load. Another pos-
sibility is to apply calculations to smaller regions of interest at higher resolution. For example we
might only reconstruct dense stereo for image regions on or near the ground plane rather than the
entire image. The calculations we anticipate include edge and corner detection, image differencing,
blob finding and tracking, and image correspondence.
Energy Reduction Through Integrated System Design: One of the primary challenges
facing the LANdroids project is making the most efficient use of the available energy while delivering
reliable data forwarding. This entails conserving energy at every stage of movement, detection
and data processing. During motion phases, the majority of the system power will be used to
power the drive train and the majority of the system energy can be saved by efficient planning
and suppressing robot motion that is not useful, as described in other portions of the proposal.
During non-motion phases, the majority of energy will be used for transmissions and relay-
ing communication or, during idle periods, in determining when to actively engage in communi-
cation. Despite these different “phases” an integrated design avoids micro-optimizing the power
budget of a single component at the cost of overall performance or power budgets.
Reducing Energy During Planning Phases: During the planning phase, the LANdroid
needs to make the most use of computation and sensors to reduce movement. Computation costs
can be controlled by using dynamic voltage scaling (DVS) to reduce CPU costs. Depending on
the particular component, voltage scaling can reduce the power needed by a processor such as
the Intel PXA 255 by a factor of 6 [SRH05] by switching the CPU speed from 400MHz to
100MHz and reducing the voltage from 1.3V to 1V. Most work on DVS has demonstrated that it
is difficult to extract energy savings from DVS. Our experience [GLF00] has been that automat-
ed voltage scaling methods perform poorly for applications considered in isolation; others
[FlM02] have shown that capturing application interactions improves automated power control if
deadlines can be inferred or provided. However, sensing applications, such as trying to locate an
adjacent access point, are more concerned with reducing power since the mission time cannot
usually be decreased by faster computation. In this application domain, we think the individual
applications can provide sufficient information to allow automated control without attempting to
estimate the computing demands of the system.
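The quoted factor of 6 is in line with the standard first-order CMOS model, in which dynamic power scales as frequency times voltage squared. The sketch below applies that model to the operating points given above; it ignores static (leakage) power and any platform overheads, so it is an idealized estimate rather than a measurement.

```python
def dynamic_power_ratio(f_hi_mhz: float, v_hi: float, f_lo_mhz: float, v_lo: float) -> float:
    """Ratio of dynamic power at the high operating point to the low one (P ~ f * V^2)."""
    return (f_hi_mhz * v_hi ** 2) / (f_lo_mhz * v_lo ** 2)


def energy_per_cycle_ratio(v_hi: float, v_lo: float) -> float:
    """Ratio of dynamic energy per clock cycle (E ~ V^2); frequency cancels out."""
    return (v_hi / v_lo) ** 2


power_ratio = dynamic_power_ratio(400.0, 1.3, 100.0, 1.0)
cycle_ratio = energy_per_cycle_ratio(1.3, 1.0)
print(f"power ratio: {power_ratio:.2f}, energy-per-cycle ratio: {cycle_ratio:.2f}")
```

Under this model the power drops by roughly a factor of 6 to 7, but the energy for a fixed amount of computation drops only by the voltage-squared term, which is one reason DVS pays off mainly when the task cannot finish sooner anyway.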
Sensors can reduce system energy by reducing the need to move. As described, vision based
sensing can be used for semantic analysis and gradient control methods can follow localized energy.
RF sensing can be improved by a combination of protocols, directional antennas and
broad-spectrum RF sensing. We plan on using our experience with the SoftMAC [NFD05] flexible
MAC layer control to provide more accurate protocol layers for direct control of the MAC
layer. For example, using SoftMAC, we can selectively enable enhanced forward error correc-
tion (FEC) to improve the channel condition; we can also use the SoftMAC drivers to retrieve
partially corrupted frames, improving the channel measurement accuracy. The directional anten-
nas provide gross angle of arrival (AOA) information, allowing gradient based methods to im-
prove accuracy – more importantly, directional antennas also increase throughput and decrease
communication costs. While “higher level” tools such as SoftMAC allow us to extract useful sig-
nal strength information from an existing 802.11 radio, those radios still face limitations – they
are power hungry and have limited ability to find “idle” channels. Packets that have significant
corruption (e.g. in the Physical Layer Convergence Protocol, or PLCP), cannot be interpreted by the
radio; most existing radios have poor interfaces for signal measurements, limiting the ability of
radios to assess the “true channel”. We plan on using spectrum measurement systems to provide
an alternate, lower power sensing method for the LANdroid platforms; an example of a COTS
solution is the WiSPY2.4 device, which can scan the entire 2.4GHz band in 383kHz “steps” to
measure noise and signal strength.
Figure 4 - Sample RF Energy From Inexpensive Signal Detector (axes: freq, time)
For example, Figure 4 shows an experiment conducted in a concrete walled courtyard; a
transmitter (blue square, upper left) was transmitting an 802.11g stream to the receiver (red box,
bottom right). The diagram to the right shows a “waterfall” plot of the received signal strength at
the receiver as it switches between four antenna states (performed by manually rotating the antenna).
The spectrum analyzer is clearly able to provide information for gradient-based RF finding,
and it can examine a broader band more rapidly than the 802.11g radio. However, it cannot
identify what signals mean – i.e. which signals are actually packets from adjacent LANdroids.
Conversely, the 802.11g radio is poor at rapidly scanning the band to find “idle channels”, so
the radio and spectrum analyzer complement each other when used cooperatively. We have
made extensive use of combined spectrum sensing and radio use in developing cognitive radio
control algorithms and mesh networking systems with directional antennas that have been evalu-
ated using SoftMAC and the WiSPY devices [WSD07, SDH06, BAY07].
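As an illustration of how a spectrum sweep could help find an "idle" channel, the sketch below averages the swept power (in linear units) across each candidate 802.11 channel and picks the quietest. The sweep format, the 22 MHz channel width, and the function names are assumptions made here; the code does not use any actual WiSPY interface.

```python
def channel_center_mhz(channel: int) -> float:
    """Center frequency of a 2.4 GHz 802.11 channel (valid for channels 1-13)."""
    return 2407.0 + 5.0 * channel


def mean_power_mw(sweep, center_mhz, half_width_mhz=11.0):
    """Mean power in mW over sweep samples within half_width_mhz of the center."""
    in_band = [10 ** (dbm / 10.0) for freq, dbm in sweep
               if abs(freq - center_mhz) <= half_width_mhz]
    return sum(in_band) / len(in_band) if in_band else None


def quietest_channel(sweep, channels=(1, 6, 11)):
    """sweep is a list of (freq_mhz, power_dbm); return the least busy channel."""
    scored = []
    for ch in channels:
        power = mean_power_mw(sweep, channel_center_mhz(ch))
        if power is not None:
            scored.append((power, ch))
    return min(scored)[1] if scored else None
```

The spectrum sweep says only how much energy is present, not whose packets it is, so a result like this would still be combined with what the 802.11 radio can decode.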
Reducing Energy During Communication Phases: The single largest energy consumer during
communication phases is the use of the underlying radio and the MAC and PHY processing for that
radio; considerable variation occurs between different RF chipsets, but many factors are independent
of the chipset. The graph in Figure 5 shows the time needed to transmit a data frame of varying
sizes at different data rates. This measured data shows two things – there’s considerable overhead
for small packets and when a high quality link is in place, the amount of
transmitted data is not important. The time to transmit any packet is approximately 1.49ms at the
lowest data rate, 1.19ms at the highest 802.11b data rate and 0.227ms at the 54Mb/s data rate.
During transmission, both the sender and receiver radios are operating, consuming considerable
power. If we assume that the primary task of the LANdroid relays will be to relay voice traffic,
we can see that system level decisions should guide power optimization – at 11Mb/s or 54Mb/s,
the transmission time for a VoIP packet is largely independent of the underlying codec; at
1Mb/s, transmission time can vary from 1.9ms for an 8kb/s G.729 codec to 3.0ms for a 64kb/s
G.711 codec. At times, it may be useful to trade the costs of transcoding streams against increased
radio use.
Figure 5 - Transmission Time For Different 802.11g Data Rates
This data also shows the necessity of increasing the link quality using directional antennas,
which improve gain and reduce air time, or through mobility and transmission power increases.
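The codec comparison above can be reproduced with simple arithmetic on payload size. The sketch assumes 20 msec of audio per packet, a common default that is not stated in the measurements, and computes only the extra airtime caused by the larger G.711 payload, leaving the fixed per-packet overhead aside.

```python
def payload_bytes(codec_kbps: float, packet_ms: float = 20.0) -> float:
    """Voice payload per packet: kb/s times msec gives bits, divided by 8 for bytes."""
    return codec_kbps * packet_ms / 8.0


def extra_airtime_ms(small_bytes: float, large_bytes: float, rate_mbps: float) -> float:
    """Additional airtime in msec for the larger payload at the given data rate."""
    return (large_bytes - small_bytes) * 8.0 / (rate_mbps * 1000.0)


g729 = payload_bytes(8.0)    # 8kb/s G.729
g711 = payload_bytes(64.0)   # 64kb/s G.711
for rate in (1.0, 11.0, 54.0):
    print(f"{rate:>4} Mb/s: +{extra_airtime_ms(g729, g711, rate):.3f} msec for G.711")
```

Under the 20 msec assumption the difference at 1Mb/s is about 1.1 msec, matching the gap between the 1.9ms and 3.0ms figures, while at 54Mb/s it is on the order of 0.02 msec and therefore negligible next to the fixed overhead.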
D.2 Comparison with Current Technology
The decentralized model-free, gradient-based, motion optimization technique developed by
the PIs overlaps with several state-of-the-art control techniques. Multivariable extremum seeking
control has been applied to a variety of applications [ArK02], but never to teams of autonomous
robots. Current work on MES has applied the approach to the control of a single nonholonomic
vehicle with no position information [ZAG07]. The approach proposed by the PIs uses local
position information obtained by odometry and enables coordination between vehicles. Likewise,
the concepts of swarming, distributed sensing, and coverage control have been studied extensive-
ly in the context of physical space (see [IEE07] for an overview of recent results). These results
all assume relative position and models of the interaction environment are known. Our approach
extends all of these results to cases when positions (in any space) and models are not known.
Controlling node mobility for communication has been considered by other researchers. The
value of controlling mobility to increase connectivity was shown in [LiR00, BRS03, BaR04,
CNS01]. These generally consider outdoor networks or idealized communication models that do not
capture the tight interaction between robot navigation and network optimization. In [SBR04] a
protocol was analyzed that guided end users to better locations that improve their communication
quality. This requires significant end-user interaction that is not desired here. Using network in-
formation to update node locations was specifically considered in [GLM04]. The nodes move on
long time scales (1-5 min) and are not sufficiently dynamic for LANdroids. All of these refer-
ences fail to properly consider the energy cost of mobility relative to increasing transmitter pow-
er or other means of improving network performance. They also ignore the mobility planning
and assume (usually implicitly) some localization technique such as GPS.
Data ferrying for delay tolerant networks (DTN) was considered in [MAZ04,SRJ03,ZhA03].
Ferrying exploits or controls mobility in order to deliver messages in disconnected networks and
is similar to epidemic routing and similar models [VaB00,Win00] that rely on mobility to diffuse
messages across nodes eventually reaching the intended recipient. While this does not satisfy the
LANdroid low-latency connectivity goal, several concepts are relevant. Throwboxes described in
[ZCA06] are network relays that are placed at important traffic intersections to act as store and
forward devices that increase network throughput. The approach to identifying these intersec-
tions may be useful in positioning LANdroid relays. In [ZAZ04] the authors consider the broader
problem of how DTN nodes signal their communication needs in a disconnected network. Such a
protocol may inform the need to initially self-organize LANdroid nodes.
The Learning Spatial Features approach proposed here is related to concept drift, where the
characteristics of a class change over time [WiK96, KoM05, HeL92]. The approach requires the
use of multiple models, and the goal is to make future predictions by choosing between models
based on the latest labeled data. However, these methods do not offer a formal approach to ques-
tions such as which models to apply to the unlabeled current sample, or when to add new models
or discard old ones. Thrun’s work [Thr96] also directly addresses a similar problem, as well as
online learning. However, none of these methods can directly predict
when an unlabeled input is beyond the scope of the current model set, and more learning must be
done. Furthermore, there has been a recent issue of the Journal of Field Robotics (co-edited by
the co-PIs) that has addressed similar learning problems [MuG06]. Related reinforcement
learning work can be found in [SBP04].
D.3 Evaluation/Experimentation Plans and Metrics
University of Colorado’s operator interfaces for the DARPA LAGR program (left; robotics)
and for the ad hoc UAV and ground node network, AUGNet (right; WiFi networking).
D.3.1 Operator Control System
Our past experience with test bed environments has shown that it is important to have a test
management plane [JBD05,BDJ05]. The management plane provides the ability to issue commands
(e.g. to start an experiment), monitor ongoing progress (e.g. node location and network
activity summaries), and collect detailed network statistics. The figure above shows a sampling
of interfaces that we have developed. We have been able to design “in-band” systems that share
the network under test for network monitoring with minimal impact. For detailed monitoring the
data can be held at nodes and collected after a test is finished using monitoring protocols that we
have developed. Our previous work used GPS to monitor node locations, but GPS cannot be used
in the indoor environment. We will test both out-of-band video systems located in the ceiling and
an existing UWB indoor localization system acquired under a previous contract. It is important
to note that we are only using localization information to determine when and why the developed
systems run into problems; it is not used to guide the decisions of the LANdroid system.
D.3.2 Test Bed Environment
Our combined research groups have considerable experience with a variety of robots, as indicated
in the Facilities section. We have an existing collection of 12 Roomba systems outfitted with a
custom radio and control system that can be used in this project; we have also budgeted equipment
for additional Roomba Create components and embedded computers. Robots will be instrumented
to monitor power usage. This will enable us to test the effect of different robot hardware and
software strategies in short experiments without having to test until the entire battery is drained.
We will use various testbed environments. Day-to-day testing and evaluation will occur in
research labs in the Department of Electrical and Computer Engineering and in Computer Science.
These large labs have already been used for robotics experiments, and have a variety of construction
types. In addition, we will use the Center for Innovation and Creativity (CINC); this large
building is a former manufacturing facility converted to University use. The floor plan is shown in
the picture above. That diagram shows the location of existing directional phased array antennas
in the large building and the variety of room shapes and sizes available. Each of these facilities
has existing 802.11 wireless networks, providing the benefit of background interference; we also
have access to an interference-free location at a local warehouse.
D.3.3 Simulation Environment
For rapid algorithm development we will modify an existing network simulator that incorporates
detailed radio models and controlled mobility. Open source programs such as ns2 only
provide predefined mobility paths that cannot be modified during the simulation. Further, the
standard propagation models are weak. We have modified OpNet to incorporate controlled mobility
and more accurate propagation models. The radio performance will incorporate measurement
data between location pairs to provide a simple but accurate radio model. The simulator
will require further development to incorporate our system power models and sensor interactions.
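As a rough illustration of a radio model driven by measurements between location pairs, the sketch below returns the measured signal strength when a pair has been surveyed and otherwise falls back to a log-distance path-loss estimate. The class name, the fallback model, and every parameter value are illustrative assumptions and are not taken from the modified simulator.

```python
import math
from typing import Dict, Tuple

Location = Tuple[float, float]  # (x, y) in meters; hypothetical coordinate frame


class MeasuredRadioModel:
    """Hypothetical pairwise radio model backed by measured signal strengths."""

    def __init__(self, tx_power_dbm: float = 15.0, ref_loss_db: float = 40.0,
                 path_loss_exponent: float = 3.0) -> None:
        # Fallback parameters are assumed values for an indoor log-distance model.
        self.tx_power_dbm = tx_power_dbm
        self.ref_loss_db = ref_loss_db          # loss at the 1 m reference distance
        self.path_loss_exponent = path_loss_exponent
        self._measured: Dict[frozenset, float] = {}

    def add_measurement(self, a: Location, b: Location, rssi_dbm: float) -> None:
        # Links are treated as symmetric, so the pair is stored unordered.
        self._measured[frozenset((a, b))] = rssi_dbm

    def rssi_dbm(self, a: Location, b: Location) -> float:
        key = frozenset((a, b))
        if key in self._measured:
            return self._measured[key]
        # No measurement for this pair: use the assumed log-distance model.
        distance_m = max(math.dist(a, b), 1.0)
        loss_db = self.ref_loss_db + 10.0 * self.path_loss_exponent * math.log10(distance_m)
        return self.tx_power_dbm - loss_db


model = MeasuredRadioModel()
model.add_measurement((0.0, 0.0), (10.0, 0.0), -62.0)
measured = model.rssi_dbm((10.0, 0.0), (0.0, 0.0))    # surveyed pair
estimated = model.rssi_dbm((0.0, 0.0), (100.0, 0.0))  # unsurveyed pair
```

Storing pairs unordered keeps the table small at the cost of assuming symmetric links; an asymmetric variant would key on the ordered pair instead.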
We believe that relatively simple power models will be sufficient for the majority of the
work: the radio components have non-varying power levels, the power demands of the RAM and
FLASH components are constant, and power models for statically scheduled microprocessors,
such as the PXA255, are accurate and have been well studied [GiGr00].
One challenge will be to provide a simplified or abstracted power model that can be used in
on-line decision making; various learning algorithms will need to be able to assess the benefit of
trading one resource (e.g. radio) against another (e.g. specific computation). Again, the project
team has considerable experience in these tasks.
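One possible shape for such an abstracted power model is sketched below: constant draws for radio, memory, processor, and drive, weighted by duty cycles, from which a longevity estimate follows. The class, its field names, and all wattages are assumed placeholder values for illustration, not measurements of any LANdroid hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePowerModel:
    """Hypothetical abstracted power model; all wattages are assumed values."""
    radio_w: float = 1.0        # radio draw while active
    memory_w: float = 0.2       # RAM and flash draw, treated as constant
    cpu_active_w: float = 0.5   # processor draw while computing
    cpu_idle_w: float = 0.05    # processor draw while idle
    drive_w: float = 3.0        # motor draw while moving

    def average_power_w(self, radio_duty: float, cpu_duty: float,
                        drive_duty: float) -> float:
        for duty in (radio_duty, cpu_duty, drive_duty):
            if not 0.0 <= duty <= 1.0:
                raise ValueError("duty cycles must lie in [0, 1]")
        return (self.memory_w
                + self.radio_w * radio_duty
                + self.cpu_active_w * cpu_duty
                + self.cpu_idle_w * (1.0 - cpu_duty)
                + self.drive_w * drive_duty)

    def longevity_hours(self, battery_wh: float, radio_duty: float,
                        cpu_duty: float, drive_duty: float) -> float:
        return battery_wh / self.average_power_w(radio_duty, cpu_duty, drive_duty)


node = NodePowerModel()
# Compare spending more on the radio against spending more on computation.
radio_heavy_h = node.longevity_hours(42.0, radio_duty=0.5, cpu_duty=0.2, drive_duty=0.0)
compute_heavy_h = node.longevity_hours(42.0, radio_duty=0.25, cpu_duty=0.6, drive_duty=0.0)
```

A model this coarse is cheap enough to evaluate inside an on-line decision loop, which is the point of the abstraction; a learning algorithm could call it to rank candidate strategies by expected longevity.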
D.3.4 Metrics
In addition to the five program-wide metrics (Coverage, Longevity, Throughput/Latency,
Convergence Time, and Message Overhead) we also consider three additional metrics. The first
is Dynamic Antenna Gain. This measures the ability of the robot to find positions that provide
gain to a specific test point. It is measured as a dB gain in received signal strength relative to the
local median signal strength in the vicinity of the robot. The second is Plan Energy. This mea-
sures the total energy (movement, processing, and communication) to achieve a specific goal.
Goals can be to move to a specific position in a cluttered environment or to find a local minimum
of signal gradient. This will be used to compare different hardware and software strategies for
robot operation and is at a finer grain than Longevity. The third is Reach. This measures the
maximum distance a test point can communicate with the gateway for a given number of robots
within a given environment. This is designed to be a better measure of tethering performance.
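The three additional metrics can be computed from logged test data roughly as follows. This is a minimal sketch: the function names, argument layouts, and units are assumptions made for illustration, not a prescribed evaluation procedure.

```python
from statistics import median
from typing import Iterable, Sequence, Tuple


def dynamic_antenna_gain_db(chosen_rssi_dbm: float,
                            nearby_rssi_dbm: Sequence[float]) -> float:
    """Gain of the chosen position over the local median signal strength."""
    return chosen_rssi_dbm - median(nearby_rssi_dbm)


def plan_energy_j(movement_j: float, processing_j: float,
                  communication_j: float) -> float:
    """Total energy spent to achieve one goal."""
    return movement_j + processing_j + communication_j


def reach_m(trials: Iterable[Tuple[float, bool]]) -> float:
    """Largest gateway distance at which a test point could still communicate.

    Each trial is (distance_to_gateway_m, communicated); returns 0.0 if no
    trial succeeded.
    """
    return max((dist for dist, ok in trials if ok), default=0.0)


gain = dynamic_antenna_gain_db(-60.0, [-70.0, -68.0, -72.0, -66.0, -71.0])
energy = plan_energy_j(movement_j=12.0, processing_j=3.0, communication_j=1.5)
reach = reach_m([(10.0, True), (25.0, True), (40.0, False)])
```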
Convergence Time appears to require further clarification. We propose that it is the time
after an event until a minimum percentage of test points can communicate (as defined in Coverage)
with the gateway. Since this may never happen, Convergence Time is augmented by Convergence
Probability, which is the fraction of attempts in which the system converges to a minimum
Coverage percentage.
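The proposed definitions can be made concrete with a short sketch. It assumes each attempt is logged as a time-ordered trace of (seconds since the event, fraction of test points covered); that trace format and the function names are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Trace = Sequence[Tuple[float, float]]  # (seconds_since_event, coverage_fraction)


def convergence_time_s(trace: Trace, min_coverage: float) -> Optional[float]:
    """Time at which coverage first reaches min_coverage, or None if it never does."""
    for seconds, coverage in trace:
        if coverage >= min_coverage:
            return seconds
    return None


def convergence_probability(traces: Sequence[Trace], min_coverage: float) -> float:
    """Fraction of attempts in which the system reached min_coverage."""
    if not traces:
        raise ValueError("at least one attempt is required")
    converged = sum(1 for trace in traces
                    if convergence_time_s(trace, min_coverage) is not None)
    return converged / len(traces)


attempts = [
    [(0.0, 0.2), (30.0, 0.7), (60.0, 0.95)],  # converges at 60 s
    [(0.0, 0.1), (30.0, 0.5), (60.0, 0.6)],   # never reaches 90%
]
first_time = convergence_time_s(attempts[0], 0.9)
probability = convergence_probability(attempts, 0.9)
```

Reporting the two numbers together avoids averaging over attempts that never converge, which would otherwise have no well-defined time.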
E. Statement of Work
The overall goals of the University of Colorado LANdroid Team have been outlined in Sec-
tion C. Broadly, we plan to integrate multiple sensor, communication, and system measurements
in order to control the network and robot so that connectivity and network longevity are maxi-
mized. We have divided the problem into nine components: RF Analysis; LANdroid System
Hardware; System Power Models; Decentralized, Model-Free, Gradient-based, Motion Opti-
mization in Signal Space; Learning Spatial Features; Learning Optimal Control; Vision Based
Cost Estimation; Network Protocols; and System Software Development and Evaluation. A de-
tailed breakdown of these components into their constituent tasks is provided below. For each
task, we indicate which PI (and their associated graduate students) is responsible for the task and
in which phase the task will be completed. At a high level, the three phases of this work are to 1)
identify the technologies that will best constitute the LANdroid system and provide design
guidelines to the LANdroid hardware teams; 2) integrate these technologies into a coherent con-
trol system; and 3) integrate the control software with the hardware platform. The tasks are tied
to the Milestones which are described at the end of this section and deliverables in Section I.
However, the tasks will generally encompass the entire phase(s) indicated. Variations will be in-
dicated in the schedule graphic.
RF Analysis (RFA): Each task will support Deliverable 1 (RF Analysis document).
Task 1: (P1, Grunwald) Survey signal strengths using dense measurement networks, compar-
ing different signal strength measurement techniques (WiFi card vs. WiSpy).
Task 2: (P1, Popovic) Characterize the role of the antenna, including form-factor, directional-
ity, and multiple antennas in near-to-ground indoor/urban environments.
Task 3: (P1, Brown) Identify and characterize spatial features that are desirable in signal
space.
LANdroid System Hardware (LSH): Each task will support Deliverable 2 (LANdroid System
Hardware Recommendation document).
Task 4: (P1, Grunwald) Analyze tradeoffs in costs, energy requirements, and system performance
for sensors, 802.11 chipset, and processor on gateway, LANdroid, and edge node.
Task 5: (P1, Mulligan) Analyze tradeoffs in number and capability of video cameras.
Task 6: (P1, Popovic) Analyze tradeoffs in number and capability of antennas.
System Energy Models (SEM):
Task 7: (P1, Grunwald) Develop generic energy models for communication, sensor, robot, and
processing sub-systems. Supports Deliverable 2 and Milestone 6.
Task 8: (P1, Grunwald) Characterize the added network longevity vs. cost of different subsystem
choices. Supports Deliverable 2.
Task 9: (P3, Grunwald) Integrate final hardware design into energy model. Supports Milestone 9.
Gradient-Based Motion Optimization in Signal Space (GMO):
Task 10: (P1, Frew) Determine most efficient signal dithering method for determining gradient.
Provides input to Deliverable 1.
Task 11: (P1, Frew) Design and implement basic gradient-based robot deployment algorithm.
Supports Milestone 1.
Task 12: (P2, Frew) Design and implement gradient-based robot tethering algorithm. Supports
Milestone 5.
Task 13: (P3, Frew) Design and implement gradient-based methods that search in performance
space (e.g. power or throughput). Supports Milestone 8.
Learning Spatial Features (LSF):
Task 14: (P1, Grudic) Investigate many types of spatial features rated by humans for their network
utility. The resulting features will be evaluated in an offline machine learning
paradigm. Supports Milestone 4.
Task 15: (P2, P3, Grudic) Design and implement online machine learning to optimize the
learned spatial feature concepts. Supports Milestone 7.
Task 16: (P3, Grudic) Develop distributed spatial feature detection between LANdroid nodes.
Supports Milestone 7.
Learning Optimal Control (LOC):
Task 17: (P1, Grudic) Adapt LAGR path planning code to the LANdroid environment. Supports
Milestone 2.
Task 18: (P2, Grudic) Investigate methods for the LANdroid to learn from control strategies
demonstrated by a human operator. Supports Milestone 4.
Task 19: (P3, Grudic) Design and implement a reinforcement learning controller that dynamically
determines standard control behaviors and combines signal and spatial optimization.
Supports Milestone 7.
Vision-Based Cost Estimation (VCE):
Task 20: (P1, Mulligan) Implement and refine algorithms for compute- and power-limited vision
to support path planning and spatial feature identification. Supports Milestone 2.
Task 21: (P2, Mulligan) Design and implement methods for estimating relative pose among
LANdroids and situational awareness. Identify environmental precepts that facilitate LANdroid
function. Supports Milestone 5.
Task 22: (P3, Mulligan) Design and implement Warfighter tracking algorithm. Supports Milestone 7.
Network Protocols (NP):
Task 23: (P1, Brown) Select and adapt MANET protocol implementation to LANdroid robot.
Supports Milestone 1.
Task 24: (P1, Brown) Implement protocol for connecting disconnected nodes. Supports Milestone 2.
Task 25: (P2, Grunwald) Implement resource management protocol that enables LANdroids
to share resource state information, manage resources, and interface with their own resources.
Supports Milestone 6.
Task 26: (P2, Brown) Implement energy aware routing protocol that considers the larger set
of LANdroid resources and node states. Supports Milestone 6.
Task 27: (P3, Grunwald) Implement quality of service aware routing protocols for meeting
throughput, latency, or longevity targets of end nodes. Supports Milestone 8.
System Software Development and Evaluation (SDE):
Task 28: (P1, Mulligan) Design and implement operator control system (OCS) for managing
LANdroid testing. Supports Milestone 1.
Task 29: (P2, Brown) Develop evaluation test bed for understanding LANdroid system behavior
in complex environments. Includes simulation, hardware-in-the-loop, and full
hardware environments. Supports Milestone 4.
Task 30: (P2, Frew) Design control software to robot interface. Supports Milestone 9.
Task 31: (P3, Mulligan) Implement software modules on LANdroid robot hardware. Supports
Milestone 9.
Milestone 1: 10 LANdroids can self-configure when initially connected to provide connectivity
to static test points in an open space on a single floor.
Milestone 2: 10 LANdroids can self-configure when initially disconnected to provide connectiv-
ity to static test points in an open space on a single floor. The network self heals after node
death.
Milestone 3: Operator control software can operate multiple LANdroid robots and collect situa-
tional awareness and monitoring data.
Milestone 4: 15 LANdroids can self-configure (whether initially connected or not) to provide
connectivity to test points across two floors with static obstacles.
Milestone 5: The network maintains connectivity to mobile test points (tethering).
Milestone 6: Network energy is managed as a network resource.
Milestone 7: 50 LANdroids can self-configure (whether initially connected or not) to provide
connectivity to heterogeneous test points across 3+ floors of dynamic obstacles and RF inter-
ference. The network maintains connectivity to mobile test points (tethering). Energy is man-
aged as a network resource.
Milestone 8: Edge nodes can specify network objectives (e.g. longevity or throughput).
Milestone 9: All control software ported to LANdroid robot hardware.
F. Schedule Graphic
[Schedule graphic: LANdroids timeline by month from December 2007 through November 2010, divided into Phase 1, Phase 2, and Phase 3. Rows:]
T1 RFA: RF survey
T2 RFA: Characterize Antenna Role
T3 RFA: Identify Spatial features
T4 LSH: Hardware tradeoffs
T5 LSH: Video tradeoffs
T6 LSH: Antenna tradeoffs
T7 SEM: Generic Energy Model
T8 SEM: Network Longevity vs. Cost Model
T9 SEM: LANdroid specific energy model
T10 GMO: Signal Dithering
T11 GMO: Signal Gradient Following
T12 GMO: Gradient-based tethering
T13 GMO: Performance Gradient Following
T14 LSF: Offline Spatial Feature Learning
T15 LSF: Online Spatial Feature Learning
T16 LSF: Distributed Spatial Feature Identification
T17 LOC: Adapt LAGR Code
T18 LOC: Learning Strategies from humans
T19 LOC: Reinforcement Learning Controller
T20 VCE: Power-limited robot vision
T21 VCE: Network pose and situation
T22 VCE: Warfighter tracking
T23 NP: MANET selection
T24 NP: Connecting disconnected nodes
T25 NP: Resource Management Protocol
T26 NP: Energy aware routing
T27 NP: Quality of service aware routing
T28 SDE: Operator Control System
T29 SDE: Evaluation Testbed
T30 SDE: Control Software to Robot interface Spec
T31 SDE: Software on LANdroid robot hardware
Evaluations
Milestones M1 M2 M3 M4 M5 M6 M7 M8 M9
G. Teaming and Tasking
To be successful the proposal requires a confluence of knowledge and capabilities in RF
propagation, wireless protocols, robotics, and machine learning. The team has these capabilities
along with key cross-disciplinary capabilities that ensure a smooth and productive interaction:
Tim Brown (PI): 18 years experience in wireless systems, machine learning, and networking.
Eric Frew: 11 years experience in design and implementation of control systems for
autonomous robotic networks.
Greg Grudic: 13 years experience in learning algorithm development and robotics research.
Dirk Grunwald: 20 years experience in computer systems design, evaluation, and development.
Jane Mulligan: 12 years experience in implementation of robotic control systems and
real-time stereo and vision algorithms.
Zoya Popovic: 22 years experience in microwave circuit and antenna modeling, design, and
characterization.
In addition the proposal will be supported by a dedicated System Integrator whose task will
be to lead the LANdroid system evaluation. In summary, the team has a depth of capabilities in
each of the key areas as shown below.
Team Capabilities
RF propagation   Wireless protocols   Robotics   Machine Learning
Popovic          Grunwald             Mulligan   Grudic
Brown            Brown                Frew       Brown
Grunwald                              Grudic
Each of the six faculty will be responsible for the tasks assigned to them in the Statement of
Work section. In addition, the PI is the technical point-of-contact and responsible for the overall
coordination of the project, periodic reporting, and financial management.
This project will be placed in and managed through the Department of Electrical and Computer
Engineering. The ECE Department is headed by Prof. Mike Lightner who is a full-time faculty
member. Therefore, the PI, Prof. Brown will report to Prof. Lightner for program support, report-
ing, and oversight. The departments report to the Dean of the College of Engineering and
Applied Sciences, Prof. Robert H. Davis.
H. Project Management and Interaction Plan
The entire team is collocated in the Engineering Center building at the University of Col-
orado. Weekly team meetings will track progress and provide forward planning. Meetings will
include faculty and students who will discuss project management as well as research findings.
We will develop an internal CU LANdroid Wiki to further facilitate technical discussion and
document progress. This Wiki will also be integrated with version control for all robot and net-
work control software. All software, test bed monitoring, data measurements, technical reports,
and papers will be archived on a server that will be backed up nightly.
Monthly full-scale exercises will evaluate LANdroid progress and program needs.
I. Deliveries Description
There are no Proprietary Claims. Technical data and computer software will be furnished to
the Government with Unlimited Rights.
Deliverable 1: RF Analysis document. Based on our measurements and analysis, this describes
the potential for exploiting the RF environment and cost-effective techniques that are best suited
for this exploitation.
Deliverable 2: LANdroid System Hardware Recommendation document. Based on our systems-level
analysis of choices in sensors, radios, and computation, we will provide a cost-benefit analysis
of the potential hardware components in the LANdroid System.
Deliverable 3: Documentation of all software modules.
Deliverable 4: Documentation of all LANdroid system evaluations.
Deliverable 5: Documentation of the final complete software system.
Deliverable 6: Control software written for the final LANdroid hardware platform.
J. Technology Transition and Technology Transfer Targets Plans
The development of self-organizing robotic network nodes has immediate and obvious ap-
plications in a number of fields. These include:
1. Public Safety. A network that can dynamically adapt and optimize performance with minimal
intervention would improve current police, fire, and medical communications in urban indoor
environments.
2. LANcraft. In outdoor environments, wireless nodes mounted in unmanned aircraft can pro-
vide optimized mobile networks over large areas. LANcraft can operate independently or can
interoperate with LANdroid networks, adding another dimension.
3. Consumer Deployments. Automatically identifying wireless access point locations to provide
coverage can simplify deploying building networks. The converged locations of a LANdroid
network would indicate where access points should be installed.
With no additional investment, the algorithms developed for this application could be appli-
cable to many other uses for the DOD and public safety markets. The algorithms would have ob-
vious potential for performing Future Combat Systems (FCS) or Joint Robotics Program (JRP)
missions. The development of learning behaviors would directly benefit and potentially feed the
FCS Autonomous Navigation System (ANS) program which will be applicable to all unmanned
FCS operations. The network optimization algorithms can also support the DARPA next genera-
tion (XG), wireless network after next (WNaN), wireless adaptable network node (WANN) and
WNaN adaptive network development (WAND) programs which focus on expanding low-cost
radio capabilities. The team is working on several of these projects. This research has potential
spin-offs in local companies such as Cardinal Peak, Louisville CO, which is currently funded by
the NSF to study mobile robot relays to optimize video backhaul for public safety.
Using existing sensors and added payloads, the LANdroid could assume additional roles
such as distributed surveillance (audio, video, vibration) to support the urban indoor war fighter.
With some adaptation, the technology can be inverted to disrupt communication. Using
many of the LANdroid techniques, mobile indoor electronic countermeasure networks could
jam, eavesdrop, or localize emitters.
Results will be published in public forums in order to disseminate and promote the transfer
of this technology.
K. Personnel and Qualifications
Dr. Timothy X Brown, Associate Professor
• 18 years experience in Wireless Systems, Machine Learning, and Networking.
• Technical Program Committee for ACM International Symposium on Mobile Ad Hoc Net-
working (MobiHoc) 2004, 2007.
• Member National Research Council Committee on Using Information Technology to Enhance
Disaster Management, 2005–2007.
• Sub-contractor on Phase 3 of DARPA XG program.
• Ph.D., Electrical Engineering, California Institute of Technology.
• Dr. Brown’s Ph.D. thesis topic was a neural network framework for solving switching net-
work design problems. Between 1990 and 1992 he worked in an advanced computing archi-
tectures group at the Jet Propulsion Laboratory where he developed novel neural network
ASIC designs. Between 1992 and 1995 he was a member of technical staff at Bell Communi-
cations Research where he developed machine learning techniques for network control. Since
1995 he has been a Professor in Electrical and Computer Engineering at the University of
Colorado, Boulder. He has published research papers in machine learning, wireless systems,
and networking. Dr. Brown’s work in machine learning includes statistical function approxi-
mation of rare events; and adaptation in network and wireless communication systems using
Markov Decision Process formulations solved with reinforcement learning. His work in wire-
less systems and networking includes wireless user mobility models; analysis of random cel-
lular deployments; energy aware ad hoc routing protocols; delay tolerant network routing; and
adaptive network resource allocation. To support the wireless research he has developed a
large-scale outdoor wireless test bed that incorporates real-time monitoring and visualization
of network performance down to the packet level. His current research is on controlled mobil-
ity in ad hoc networks, especially on small unmanned aircraft. His published ad hoc network-
ing protocols (including variants of DSR, AODV, and DTN) have all been implemented on
handheld or small single board computers deployed in indoor, outdoor, vehicular, and/or aeri-
al networks.
Dr. Eric W. Frew, Assistant Professor
• 11 Years Experience in Design and Implementation of Control Systems for Autonomous
Robotic Networks.
• PI of AFOSR Project titled “An Integrated Framework for Controlled Mobility in Ad Hoc
Networks”
• Member of the Research and Engineering Center for Unmanned Vehicles (RECUV) at the
University of Colorado.
• PhD, Department of Aeronautics and Astronautics, Stanford University
• Dr. Frew’s research efforts focus on the exploitation of controlled mobility for integrating
communication into multi-objective control, optimal distributed sensing by teams of autonomous
vehicles, and self-directed collaborative navigation of unmanned aircraft. Prior to
joining the CU Boulder faculty, he was a postdoctoral researcher at the UC Berkeley Center
for Collaborative Control of Unmanned Vehicles (C3UV) from June 2003 through July 2004
where he oversaw the development and flight demonstrations of a fleet of three intelligent
aerial platforms. Prior to that, he worked with unmanned ground vehicles and the Humming-
bird autonomous helicopter at the Stanford University Aerospace Robotics Lab. Dr. Frew has
been involved with successful Air Force STTR/SBIRs and current funding comes from the
Air Force Office of Scientific Research, USAF Materiel Command, and Raytheon IIS.
Dr. Greg Grudic, Assistant Professor
• 13 Years Experience in Learning Algorithm Development and Robotics Research.
• Co-organizer with Mulligan of the 2005 NIPS workshop on Machine Learning Based
Robotics in Unstructured Environments
(http://www.cs.colorado.edu/janem/NipsMLR.html).
• Co-editor with Mulligan of the 2006 Journal of Field Robotics Special Issue on Machine
Learning Based Robotics (http://www.journalfieldrobotics.org/index.html).
• PI in Phase 2 of the DARPA LAGR program.
• PhD, Electrical and Computer Engineering, University of British Columbia
• Dr. Grudic’s Ph.D. thesis topic was on nonparametric learning from examples in very high di-
mensional state spaces, which produced a machine learning framework for end-to-end learn-
ing of robot navigation tasks. Between 1998 and 2001 he was a Post Doctoral Fellow at the
GRASP lab at the University of Pennsylvania. Since 2001 he has been an assistant professor in
the Computer Science department at the University of Colorado at Boulder. He has published
research papers in both Machine Learning and Robotics. As part of his ongoing research in
human-to-robot skill transfer and end-to-end learning of robot tasks, his current research focus
is on probabilistic regression and classification, clustering, semi-supervised learning, outlier
detection, and low dimensional nonlinear manifold representations of robot sensory space. Dr.
Grudic’s research in machine learning includes papers on classification, regression, semi-su-
pervised classification, clustering, outlier detection and reinforcement learning. His research
in robotics includes published papers on end-to-end learning of task driven mobile robot navi-
gation tasks, reinforcement learning for mobile robot navigation, and inverse kinematics for
high degree of freedom robot manipulators.
Dr. Dirk Grunwald, Associate Professor
• More than 20 years experience in computer systems design, evaluation and development
• Contributor to DARPA-sponsored book on power aware computing
• Participant in winning DARPA WANN hardware design team with M/A-COM
• Broad experience in computer systems, including all aspects of the LANdroid platform
• Ph.D. Computer Science, University of Illinois Urbana-Champaign
• Dirk Grunwald is an Associate Professor at the University of Colorado, Boulder. Dr. Grun-