Ergodic Flocking of Swarm Robots

by Conan Veitch
B.Sc., University of Northern British Columbia, 2016

THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER SCIENCE

UNIVERSITY OF NORTHERN BRITISH COLUMBIA

July 2019

© Conan Veitch, 2019

Abstract

The design of efficient control strategies is a well-studied problem. Due to recent technological advancements and applications in the field of robotics, exploring novel ways to design optimal control for multi-robot systems has gained interest. In this respect, the concept of ergodicity has successfully been applied as an effective control technique for tracking and area coverage. The generation of flocking behaviour is a problem that involves both tracking and coverage, and as such is also suited to the use of ergodicity. The main contribution of this thesis is the application of ergodicity to emulate flocking behaviour. This approach is appealing because control and communication are assumed to be local and self-organized, and the approach does not require separate algorithms to generate different behaviours. Simulation results show that the proposed approach is effective, and a prototype provides evidence that flocking behaviour is possible using ergodicity in a real-life setting.

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
List of Acronyms
Acknowledgements
1 Introduction
2 Swarm Behaviours
  2.1 Natural Swarms
  2.2 Swarm Advantages
    2.2.1 Swarm Behaviours
  2.3 Flocking
3 Ergodic Flocking
  3.1 Terminology
  3.2 Discrete Dynamical System Example
  3.3 Ergodic Systems - Swarm Robots as Example
    3.3.1 Ergodicity of a Rectangular Domain in R2
    3.3.2 Time-Elapsed or Time-Varying Distributions
  3.4 Ergodic Coverage and Ergodic Trajectories
    3.4.1 Trajectory Generation
    3.4.2 Multi-Trajectory Generation
  3.5 Ergodic Flocking
    3.5.1 Localized Control and Communication
    3.5.2 Local Coefficient Generation
    3.5.3 Algorithm
  3.6 Simulation
    3.6.1 Simulator
    3.6.2 Time-Elapsed Aggregation
    3.6.3 Flocking by Distribution Tracking
      3.6.3.1 Linear Flocking
      3.6.3.2 Non-Linear Flocking
4 Prototype Implementation
  4.1 Commercially Available Platforms
  4.2 Prototype
    4.2.1 Hardware
    4.2.2 Calibration
    4.2.3 Navigation
      4.2.3.1 Received Signal Strength Indication (RSSI)
      4.2.3.2 Data Noise
      4.2.3.3 Filtering
    4.2.4 Coordination
    4.2.5 Coordinate Finding
    4.2.6 Control
  4.3 Operating System
  4.4 Ergodic Flocking Implementation
    4.4.1 Implementation Parameters
    4.4.2 Results
      4.4.2.1 Trial One
      4.4.2.2 Trial Two
      4.4.2.3 Trial Three
5 Conclusions and Future Work
Bibliography

LIST OF TABLES

4.1 Cost Comparison of Commercial Swarm Platforms
4.2 RPERT State Information by LED Colour
4.3 Implementation Parameters

LIST OF FIGURES

2.1 Weaver Ants Forming a Bridge [1]
2.2 A Murmuration of Starlings. Source: STARFLAG Project, ISC-CNR
3.1 State Transition in Conway's Game of Life
3.2 Time-Averaged Behaviour vs Spatial Average Behaviour
3.3 Left: System with perfect ergodicity, Right: System with imperfect ergodicity
3.4 Non-ergodic Coverage vs Ergodic Coverage
3.5 Centralized vs distributed ergodicity with 3 agents and step 0.5
3.6 Aggregation step, 7 agents, T=100, each time step is 0.1, agent speed is 0.1
3.7 Time-Elapsed Aggregation step, 7 agents, T=100, each time step is 0.1, agent speed is 0.1, distribution speed is 0.11
3.8 Local ergodic trajectories with respect to a time-evolving uniform distribution.
    7 agents, T=100, each time step is 0.15, agent speed is 0.1, distribution speed is 0.11
4.1 Robotic Prototype for Ergodic RSSI-informed Trajectories (RPERT)
4.2 CAD Drawings of RPERT Chassis
4.3 Relationship Between RSSI and Distance
4.4 Actual Distance Heat Map vs Raw RSSI Heat Map
4.5 Equivalent Distance of Robots on Unit Circle
4.6 Differential Drive Dynamics
4.7 Experimental Domain for Trial One and Two
4.8 Trial One: Aggregation Simulation vs Application
4.9 Trial Two: Flocking Simulation vs Application

LIST OF ACRONYMS

API Application Program Interface
ASF Advanced Software Framework
EEDI Ergodic Exploration of Distributed Information
EPFL École Polytechnique Fédérale de Lausanne
NDOF Nine Degrees of Freedom
IMU Inertial Measurement Unit
IRS Infrared Sensor
LSB Least Significant Bits
LTC Local Trajectory Generation
PDF Probability Density Function
MAC Media Access Control address
RA RSSI Anchor
RPERT Robot Prototype for Ergodic RSSI-Informed Trajectories
RSSI Received Signal Strength Indicator
SE(2) Special Euclidean Group with translational and rotational symmetries
SMC Spectral Multiscale Coverage

Acknowledgements

It's been a long haul for a guy that didn't finish high school. Thank you all for your incredible patience and support. Dr. Alex Aravind, for your insight, your rigour, and your relentless optimism – you've been far more than a supervisor to me, and I can never repay your kindness. Rafael Roman Otero, mi amigo, my spiritual guide to hardware and robotics. Darshik Shirodaria, a scholar and a gentleman – but mostly a scholar. Brett Kelly, Brian Duffels, and James McCurdy, the coffee chats have kept me sane.
Dr. Edward Dobrowolski, Dr. Mark Dale, and Dr. George Jones, for their inspiration and mentorship. Rene Francis, Damen DeLeenheer, Desmond Ng, Al Green, and Erin Beveridge, for their friendship. Kyle Pollard: Sweet-Jams made me both better... and worse. Louis Dressel, for his graciousness. To my mom, dad, grandma, and grandpa, for their unshakeable faith in me. And finally my partner in crime, Cecilia Castro, without whom I would be lost. This work would not have been possible without her. Thank you, Cecilia, I can't wait to meet our Dean.

Chapter 1

Introduction

Multi-robot systems have gained prominence in recent years as a consequence of rapid advances in embedded computing technologies. A task that would otherwise be difficult for a single agent can instead be accomplished collaboratively by a distributed system of robots. These systems are flexible, adaptable, and more robust than a system consisting of a single complex robot. In the context of autonomous mobile multi-robot systems, it is essential that the implementation of tasks be robust, reliable, and above all, scalable. This requires coordination between robot agents with some notion of individual and group agent position that acknowledges task objectives and domain boundaries.

Swarm robotics is one solution to the problem of coordinating many robots that is based on the success of biological systems – animals like weaver ants, bees, and many migratory birds and fish. In this thesis, a swarm is defined as a large number of agents capable of jointly accomplishing tasks without external or centralized control. Similarly, a flock (natural or otherwise) is defined by Reynolds in [2] as: "a group of objects that exhibits a general class of polarized, non-colliding, aggregate motion." The behaviours of a swarm, also known as emergent behaviours, are the aggregate motions of the agents that make up the swarm.
A typical swarm behaviour is the ability to explore a given domain; this is known as area coverage. In the swarm robotic context, area coverage is a well-explored problem [3–7]. The ability of a swarm of robots to search an area in order to map it or to find an item is beneficial. In addition to coverage, the ability of a swarm to track a target is another fundamental task for robot swarms. Target tracking is useful for applications such as search and rescue [8], mine sweeping [9], and even educational tasks [5].

The generic behaviour that emerges from the combination of tracking and coverage is known as flocking. Flocking can be described as the aggregation and group movement of a large collection of agents, generally accomplished without those agents having global system information [10]. Without this knowledge, each member of the system instead relies on local interactions with neighbouring agents to collectively produce global behaviours such as flocking. The direction and target of flocking behaviour can be emergent, selected by the swarm itself through some internal method, or predetermined [11].

Several approaches have been studied for area coverage behaviours [10–13]. Among these approaches is the innovative technique of ergodic coverage introduced by Mathew and Mezić in [14]. In this technique a robot's trajectory is given constant feedback to ensure it is covering the area as intended. The work presented in this thesis builds on Mathew and Mezić's methods to calculate ergodicity and generate ergodic coverage in order to

1. Emulate two common swarm behaviours, aggregation and flocking, without requiring two separate behavioural algorithms; and to,

2. Refine ergodicity calculation so that robots only require local knowledge, allowing for completely decentralized behaviour after initialization.

Chapter 2 of this thesis provides background on swarm behaviours, both natural and artificial.
A brief discussion with concrete examples of social animals is presented, followed by the advantages that such organization provides to artificial swarms. Terminology for common swarm behaviours is provided, with careful consideration given to flocking.

In Chapter 3 we present ergodic flocking, a behavioural method for robotic swarms. Ergodic systems and ergodicity are discussed in depth prior to showing the construction of the ergodic flocking algorithm. Simulation results of the algorithm in a swarm setting show that it is effective at generating the desired behaviour.

An experimental prototype is shown in Chapter 4 that implements the ergodic flocking algorithm. The development of the prototype is catalogued for hardware and calibration, and simple robot navigation is explored. The coordination and control of the prototype as a multi-robot system is provided and followed by physical experimentation in which the prototype system is given the same parameters as the Chapter 3 simulation. The results of these experiments show that the ergodic flocking algorithm is capable of emulating swarm flocking behaviours in physical systems.

Chapter 2

Swarm Behaviours

It is common in robotics for engineers to take inspiration from biological systems [4]. The dynamics of an industrial welding robot, for example, are informed by the structure of human limbs. It is not enough to design artificial joints, bones, and ligaments for a robotic arm, however, as the range of motion and predictable behaviours must be captured in the robot's design as well. Stable, predictable behaviour is important in single-robot systems, but is difficult to achieve in distributed multi-robot systems [11]. The coordination of large numbers of robots is a non-trivial task [6].

2.1 Natural Swarms

Some birds, fish, and insects are capable of complex group movements and behaviours, most of which are entirely unchoreographed [15].
This lack of choreography stems from the decentralized nature of natural swarms of animals; each member of a swarm is an individual creature with a separate "agenda". There appears to be no centralized command structure wherein each swarm member is informed of its next move or step within the swarm.

Honey bees use highly organized group movements to form parallel lines of honeycomb cells. A circular wax deposit is secreted by the bees, which is then heated by a living chain of honey bees [16]. When a bee recognises there is enough wax in an area, it moves to an unoccupied position around the circle, ensuring it is as close to its neighbours as possible. The heat generated by their combined thoraxes slowly softens and then eventually melts the wax, allowing it to form the cell. The chain of bees forms an array of closely packed cylinders through which the now-heated wax can flow. The entire cell is heated evenly in this way, which allows the circular deposit of wax to take on the familiar hexagonal honeycomb shape. This behaviour is entirely undirected, and is generated by the initial deposit of wax [11].

Figure 2.1: Weaver Ants Forming a Bridge [1].

Weaver ants form living chains from their own bodies, as shown in Figure 2.1. The chains are intended to create bridges between branches, and paths through otherwise hazardous terrain for the rest of the swarm to move across [11]. Chain formation is initiated by ants walking over the hazardous terrain. Each ant will then move as close to the end of the chain as it can, without moving too far away from all of its closest neighbours. These two simple rules allow sturdy chain formation, as each ant will always attempt to have as many other ants as possible nearby, while still trying to be as far forward as it can [1].

Starlings are songbirds that collectively flock in enormous numbers, creating elaborate patterns.
Biologists currently believe the reason that starlings flock together in such complex movements is to watch for predators like the peregrine falcon [17]. Starling flocking behaviour, known as murmuration, is a complex group movement initiated by simple local interactions of each starling. A starling only considers its seven closest neighbours at all times, regardless of the density of the flock around it. This is because very high starling densities lead to sensing uncertainty, making it difficult for the flock to agree on a consensus heading.

Natural swarms have been studied for years by artificial intelligence and robotics researchers [18–21]. We will discuss the inherent benefits of having a swarm of robots accomplish a task, as opposed to a more centralized multi-robot approach where multiple robots are controlled by a central system.

2.2 Swarm Advantages

Centralized systems carry inherent flaws in each task they approach; it is difficult to find one that is robust, flexible, and scalable. In order to be desirable in real-world applications, it is necessary for a system to have all of these qualities.

Robustness is the insensitivity of a robotic system to modifications to the system itself, or to the environment in which it operates. This does not necessarily imply robustness against run-time error or erroneous input, but rather the ability of a system to function correctly under changing environmental factors. Typically, robot failure negatively impacts centralized robot systems [7], whereas a swarm of decentralized robots is typically very resilient to individual robot failure. No single member of a swarm has a key role within the swarm, and as such the failure of some agents is not considered mission critical.

A flexible system is capable of accomplishing tasks within diverse environments without changing the behaviours of the members of that system [4].
For example, if robots tasked with searching a house for an object are moved to a different house or an apartment complex, they should be equally capable of accomplishing their task. A robot swarm is flexible due to a concept called self-organization. This refers to the ability of a swarm to create global patterns or behaviours strictly from many local interactions between component swarm members [5]. If a swarm is searching a home, it is not because each member of the swarm has been told exactly where to search. Each member of a swarm typically only knows its immediate surroundings and neighbours. The interactions with these neighbours and the environment inform the decisions of each individual robot. This is in contrast with centralized robot systems, in which it is typical for the robot(s) involved to have access to more complex environmental information. In such a system, this information would need to be updated with each new task [22].

Scalability is the insensitivity of a group of robots to modifications to the number of robots in the system [23]. Since each member of the swarm communicates only locally and does not rely on global or centralized information, the addition of any number of robots does not complicate communication within the swarm.

A common term for the organization and problem-solving of natural and artificial swarms is swarm intelligence. The generally accepted [10, 11] definition of swarm intelligence is: "the emergent collective intelligence of groups of simple agents". This emergent collective intelligence is desirable, and even though swarm robot systems are robust, flexible, and scalable, they are only useful if they are capable of generating behaviours that lead to predictable, reliable actions.

2.2.1 Swarm Behaviours

In the field of swarm robotics, a wide variety of algorithmic solutions have been proposed and investigated in order to perform numerous swarm-related tasks.
Often these solutions are designed around the use of specific hardware or technologies, without concern for scalability with regards to the number of robots, or for the limitations and costs of the hardware required in order to implement them [23]. There is a distinct lack of "top-down" design methods for swarm tasks where the methodology is derived from and regulated by a set of mathematical principles [12]. Design methods often show limited application and extensibility to tasks other than the specific ones they were proposed for [4]. Hence there is a lack of formal mathematical analysis with regards to swarm tasks.

Any proposed swarm algorithm must produce consistent, but flexible, behaviours. Most popular approaches to swarm tasks involve random elements or the implementation of a probabilistic finite state machine [6, 23], which can lead to varying, but still predictable, performance between similar groups of robots on collective tasks. The following are common coordination sub-tasks in swarm robotics:

• Aggregation: the ability of a swarm to form clusters of swarm members;

• Self-assembly: the physical connectivity of swarm members, allowing complex problems to be handled in a distributed fashion;

• Coordinated movement: the collective, purposeful movements of swarm members in restricted environments – that is, environments that are bounded in some way;

• Flocking: a specific variation of coordinated movement and restricted self-assembly in which swarm members move with one another, allowing the swarm to act as a whole.

This thesis focuses on the problem of swarm aggregation and flocking, while non-flocking forms of coordinated movement and self-assembly are left as future work.

2.3 Flocking

Murmurations of starlings, as shown in Fig. 2.2, are an example of natural swarm flocking.

Figure 2.2: A Murmuration of Starlings.
Source: STARFLAG Project, ISC-CNR

Schools of anchovies, locust swarms, and other similarly collective animal group movements show flocking behaviours that are vital to their survival. In natural swarms, flocking often increases the safety of each member of the swarm, since the number of eyes checking for predators and food sources is increased [4]. Depending on the animal, flocking can have other purposes as well. Flocking birds, for example, expend less energy, since the updraft created by each neighbour creates wing lift [11].

Flocking can be thought of as an emergent behaviour that occurs as a result of swarm intelligence. This is because each member of the flock interacts only locally with its neighbours and its environment. Each member of a flock follows a set of simple rules that may be biologically hard-coded in order to produce patterns as complex as starling murmurations [17]. Reynolds provides a set of simple rules that allows an artificial swarm to emulate natural flocking behaviour [2]. These rules rely only on local interactions and sensing, and do not require any global information. They are described as:

• Collision avoidance: each member of the swarm must avoid collision with fellow swarm members;

• Velocity matching: each member of the swarm should match the average velocity of its neighbouring flock members; and,

• Flock centring: a flock member should attempt to move towards the centre of mass of all neighbouring swarm members.

Collision avoidance is fundamental, as it ensures that the swarm does not trip over itself and cause harm to the individuals creating the flock. Velocity matching is necessary to ensure the swarm is not broken into separate pieces during transit. As the swarm accelerates and decelerates, the individuals attempt to match their neighbours' current velocity, which forces the swarm to move as a collective.
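Reynolds' three rules can be expressed as local steering updates. The following is a minimal sketch assuming simple point agents in the plane; the sensing radii, rule weights, and time step are illustrative tuning parameters, not values from Reynolds or from this thesis:

```python
import numpy as np

def boids_step(pos, vel, r_neigh=1.0, r_sep=0.3, dt=0.1,
               w_sep=1.5, w_align=1.0, w_coh=1.0):
    """One synchronous update of Reynolds' three flocking rules.

    pos, vel: (N, 2) arrays of agent positions and velocities.
    All radii and weights are illustrative, not from the thesis.
    """
    n = len(pos)
    new_vel = vel.copy()
    for i in range(n):
        offsets = pos - pos[i]
        dists = np.linalg.norm(offsets, axis=1)
        neigh = (dists < r_neigh) & (dists > 0)  # local sensing only
        if not neigh.any():
            continue
        # 1. Collision avoidance: steer away from very close neighbours.
        close = neigh & (dists < r_sep)
        sep = -offsets[close].sum(axis=0) if close.any() else 0.0
        # 2. Velocity matching: steer toward the neighbours' mean velocity.
        align = vel[neigh].mean(axis=0) - vel[i]
        # 3. Flock centring: steer toward the neighbours' centre of mass.
        coh = pos[neigh].mean(axis=0) - pos[i]
        new_vel[i] = vel[i] + dt * (w_sep * sep + w_align * align + w_coh * coh)
    return pos + dt * new_vel, new_vel
```

Note that each agent reads only the positions and velocities of neighbours inside its sensing radius, so the update requires no global information, mirroring the locality property emphasized above.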
In order to retain swarm cohesion, flock centring is used to ensure that individuals within the flock do not travel out on their own if they are on the edge of the flock. Flock centring is also capable of splitting the swarm to avoid obstacles. The swarm bifurcates easily, since individuals only care about staying near their neighbours rather than the rest of the flock.

Artificial flocking can be accomplished using Reynolds' three rules, but it is not the only method that can create this behaviour. In the next chapter, we present a method for swarm robot flocking that does not require individual robots to understand the concept of flocking. Instead, we provide a method for area coverage to each robot that emulates flocking behaviour.

Chapter 3

Ergodic Flocking

This chapter describes an ergodic flocking algorithm for swarm-robot systems which relies only on local information. We show that this algorithm can emulate the flocking behaviours found in many biological and artificial swarm systems. Flocking is achieved in this algorithm by taking advantage of the time-averaged behavioural properties of ergodic coverage. Due to its reliance on time-averaged behaviour, ergodic coverage does not necessitate precision movements on the part of the robot, making it suited to swarms of simple robots.

We begin with definitions of common terminologies, followed by a concrete example of a dynamical system. In reality, most dynamical systems are continuous in time. However, in the computational domain, such systems are suitably approximated by discrete systems. The chapter then outlines what discrete dynamical systems are, and by extension ergodic dynamical systems. A metric for ergodicity is presented, which is used in the ergodic flocking algorithm. A simulator is then used to show supporting evidence that the ergodic flocking algorithm allows a multi-robot system to achieve desired flocking behaviours.
3.1 Terminology

The following are common terminologies used throughout this chapter.

• System: A collection of one or more interrelated components organized to accomplish a common task;

• State: The value of a component defined on its domain;

• System state: The union of all component states, or the collection of all state variables;

• Control law: The method by which a system moves between states, also commonly known as a transition function;

• Discrete dynamical system: A system that evolves between states with time;

• Phase space: The set of all possible states of a discrete dynamical system;

• Trajectory: The set of states reached by a dynamical system's control law given an initial state;

• Ergodic dynamical system: A dynamical system that is capable of reaching all possible states in its phase space given enough time;

• Ergodicity: A metric for how ergodic a given trajectory is. In this thesis, all references to ergodicity imply the metric provided in [14];

• Distribution over a domain: A probability density function defined over a set of points in a domain;

• Time-elapsed distribution over a domain: A distribution that modifies its own behaviour over a set of points in a domain as a function of time;

• Basis function: One of a set of building-block functions which spans a function space. Each continuous function in the function space can be constructed using a linear combination of basis functions;

• Fourier basis function: A basis function of a T-periodic function space (a periodic function in time with period T is called T-periodic). The Fourier basis functions are formed of sines and cosines.

3.2 Discrete Dynamical System Example

John Conway's Game of Life is a cellular automaton: a collection of cells on an n×n grid that evolves through time in discrete steps. Each cell on the grid is assigned a state of alive or dead. A transition function called an evolution assigns a new state to each cell based on the current system state.
The transition function is:

• Each cell on the grid is either alive or dead – for this chapter we depict alive cells as shaded blue, and dead cells as white;

• Any alive cell with fewer than two alive neighbours dies from solitude;

• Any alive cell with more than three alive neighbours dies from starvation;

• Any alive cell with two or three alive neighbours continues to be alive;

• Any dead cell with exactly three alive neighbours becomes alive.

Conway's Game of Life is an example of a discrete dynamical system. The set of all possible states that the grid can assume is the dynamical system's phase space [24]. Correspondingly, the evolution steps are collectively called the system's transition function (control law), and the discrete series of new system states is the system's trajectory.

(a) State A (b) State B
Figure 3.1: State Transition in Conway's Game of Life

Though Conway's Game of Life cannot reach every possible state in its phase space from a given initial state, it is possible for it to reach every possible state in some local area [25].

3.3 Ergodic Systems - Swarm Robots as Example

There are many applications for ergodic theory in the multi-robot systems field, since it deals with the time-averaged behaviour of dynamical systems. For example, a collection of robots covers an area ergodically if that area can be closely approximated by the system's trajectory formed by those robots over time. Such a system is called ergodic if the time-average of the agents' trajectory equals the average of all system states in the system's phase space. In other words, the fraction of time a trajectory spends in a subset of a domain is equal to the measure of that subset [26]. In the case of systems such as multi-robot and swarm robotic systems, we can expect to approximate the phase space, rather than equal exact time averages.

A swarm trajectory consists of agents and a bounded domain. As the agents move (in discrete steps) through a domain, a system trajectory is generated.
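This construction (agents moving in discrete steps through a bounded domain, with the collection of all agent coordinates at each step forming a system state) can be sketched directly. The unit-square domain, the random-heading control law, and the speed below are illustrative assumptions, not the control law developed in this thesis:

```python
import math
import random

def step(pos, speed=0.1):
    """Move one agent a fixed distance in a random heading, clamped to
    the unit square (a bounded domain). The control law is illustrative."""
    heading = random.uniform(0.0, 2.0 * math.pi)
    x = min(1.0, max(0.0, pos[0] + speed * math.cos(heading)))
    y = min(1.0, max(0.0, pos[1] + speed * math.sin(heading)))
    return (x, y)

def system_trajectory(initial_state, n_steps):
    """The trajectory is the series of system states s_0 .. s_n, where each
    state is the collection of all agent coordinates at one time step."""
    states = [list(initial_state)]
    for _ in range(n_steps):
        states.append([step(p) for p in states[-1]])
    return states

random.seed(0)
traj = system_trajectory([(0.5, 0.5), (0.2, 0.8)], n_steps=100)
# traj[0] is the initial state; the trajectory holds 101 system states
```

The spatial statistics of a trajectory generated this way are what the ergodic metric of the next section compares against a target distribution.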
In order to quantify how ergodic a trajectory is, one approach is to formulate a metric that defines ergodicity [14]. The metric provides a comparison of how well a trajectory is sampling a given probability density function (PDF). The metric is then used to inform the control of each agent in the system, forcing each agent to move in such a way that the metric is minimized. Over a duration of time t, the goal of agent control is to ensure the metric ε vanishes, so that ε → 0 as t → ∞. In this way, a multi-robot system can cover any area based upon some PDF over some bounded domain. Each agent's control inherently takes a provided distribution into account, without the need for any secondary coverage algorithms. A collection of robots covers an area ergodically if that area can be closely approximated by the states of the system's trajectory. The work in this thesis assumes that each new system state is generated in identical time intervals.

The Mathew and Mezić metric is used to correct a multi-agent system's trajectory if it is not searching ergodically. In doing so, the ergodic metric provides a feedback law which informs a multi-robot system to cover an area ergodically. Consider Figure 3.2, where the rectangular domain has a unitless area of 1 and a probability measure φ(x) has been provided (the darker the region, the higher the density of the PDF), along with two regions D1 and D2, a trajectory over a period T, and an initial state x(0). In the time period T, the trajectory is within the region D1 twice, in the time intervals [t3, t4] and [t7, t8]. In the same period, the trajectory is within the region D2 twice, in the time intervals [t1, t2] and [t5, t6]. Here, $\int_D \phi(x)\,dx$ is the measure of the region D.
The system shown in Figure 3.2 is ergodic iff

$$\int_{D_1} \phi(x)\,dx = \frac{(t_4 - t_3) + (t_8 - t_7)}{T} \qquad (3.1)$$

and

$$\int_{D_2} \phi(x)\,dx = \frac{(t_2 - t_1) + (t_6 - t_5)}{T} \qquad (3.2)$$

Figure 3.2: Time-Averaged Behaviour vs Spatial Average Behaviour

Ergodic theory has been applied to the problem of autonomous domain exploration, allowing an agent to explore an arbitrary space while retaining explicit knowledge of boundary restrictions, objectives, and agent orientations. Since each state in an ergodic system will eventually occur, we can use a measure of how ergodic an agent's trajectory is in order to uniformly cover an entire domain. One can then give a higher weight to areas in a domain that require exploration, so that the agent's trajectory spends most of its time in these areas, as shown in Figure 3.2.

Consider a swarm robotic system's trajectory that is perfectly ergodic, represented by the left-hand system in Figure 3.3. In the figure, each white circle represents the spatial average of the system, and each blue circle represents the time-average (up to that moment) of the system's trajectory. If a sample of the system's state is taken at seven time instances between 11:00 and 14:00, and ergodicity is zero, we can say that the system is perfectly ergodic, and that there is no difference between the spatial and time-average of the system as shown.

Figure 3.3: Left: System with perfect ergodicity, Right: System with imperfect ergodicity

In implementation, the right-hand system in Figure 3.3 is more likely. This system has non-zero ergodicity, and at each time step will attempt to correct its control in order to ensure that it is acting as ergodically as possible. With each new state, the deviation from an ergodic trajectory is computed. The system can feed this information into the next decision cycle via its control law, creating a feedback loop.
In this way the system can continuously correct its trajectory to ensure, regardless of physical barriers, that it is creating as ergodic a trajectory as possible for each robot. Ideally, the measure of each region must match the time the trajectory has spent within that region. However, in reality, this level of accuracy in behaviour (trajectory) is not easy to achieve, as it would require consistent effort at every time step to correct the swarm trajectory to be closer to ergodic. Specifically, such correction requires knowledge, at every time instant, of how far the trajectory is from being ergodic.

3.3.1 Ergodicity of a Rectangular Domain in R2

Suppose robot agents are situated within a rectangular domain d defined as d = L1 × L2, where L1 = [0, L1], L2 = [0, L2], and L1, L2 ∈ R+ are positive real values. The agents are not allowed to leave the domain; in other words, their movement is assumed to be continuous, connected, and bounded. A distribution φ(x) over d is assumed to be given. This distribution is generally application specific; the distribution will change from task to task. φ can be thought of as an information density over the domain, and it is assumed that φ is zero at all points outside d. A robot in this system has position coordinates x ∈ d situated within R2, and the set of all agent coordinates in the system at a given time step i is the system's state si, where i = 0, 1, 2, ..., n. A system trajectory x is the set of all si up to the current time step n.

The metric provided by Mathew and Mezić compares the distribution φ with the spatial statistics of x. This metric decomposes both φ and x into Fourier coefficients. It is important to note that this is one of many methods to achieve ergodic trajectories, and as such is not strictly necessary in order to compute ergodicity. This method significantly reduces computational complexity, however [14, 27].
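The objects just defined, the domain d, a state si, and a trajectory as the set of states so far, can be sketched as simple data structures. This is an illustrative sketch with names of our own choosing, not an implementation from [14]:

```python
from dataclasses import dataclass, field

# Illustrative sketch: a rectangular domain d = [0, L1] x [0, L2], and a
# trajectory holding the states s_0 .. s_n (each state is the list of all
# agent positions at one time step).
@dataclass
class Domain:
    L1: float
    L2: float

    def contains(self, x):
        # True iff position x = (x1, x2) lies inside d
        return 0.0 <= x[0] <= self.L1 and 0.0 <= x[1] <= self.L2

@dataclass
class Trajectory:
    states: list = field(default_factory=list)

    def append(self, state):
        self.states.append(state)

d = Domain(1.0, 1.0)
traj = Trajectory()
traj.append([(0.2, 0.2), (0.8, 0.4)])  # state s_0 for two agents
assert d.contains((0.2, 0.2))
```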
In order to numerically approximate values for physical application, the number of basis functions must be reasonable. A higher number of basis functions will provide a more accurate representation of the supplied PDF. Similarly, a cut-off must be chosen for the number of Fourier coefficients. With few coefficients, the gaps between agent trajectories will be larger than with a larger number of coefficients.

To implement this metric, φ and x are decomposed into Fourier coefficients. The decomposition of φ into $\prod_{i}(K_i + 1)$ coefficients φk, where the Ki are the cut-off numbers along each coordinate, is

$$\phi_k = \int_d \phi(x) F_k(x)\,dx, \quad (3.3)$$

where the multi-index k = [k1, k2] [27], and where Fk is the Fourier basis function for vector k. As d is in R2, our orthogonal basis functions are taken as [14]

$$F_k(x) = \frac{1}{h_k}\cos(k_1 x_1)\cos(k_2 x_2) \quad (3.4)$$

where

$$k_1 = \frac{K_1 \pi}{L_1}, \qquad k_2 = \frac{K_2 \pi}{L_2} \quad (3.5)$$

and

$$h_k = \sqrt{\int_0^{L_1}\!\!\int_0^{L_2} \cos^2(k_1 x_1)\cos^2(k_2 x_2)\,dx_1\,dx_2}. \quad (3.6)$$

hk ensures that each Fk is normalized, so each entry will have unit length. In [14] Neumann boundary conditions are needed (the derivative at the boundary of d is 0), and as such a cosine series rather than a full Fourier series is used. After computing our spatial distribution's Fourier coefficients, we must do the same for the system trajectory. To do so, we create a probability density function c(x) from trajectory x:

$$c(x) = \frac{1}{T}\int_0^T \delta(x - x(t))\,dt, \quad (3.7)$$

where x(t) is the system state at time t, δ is the Dirac delta function as described by Mathew and Mezić [14], and T ∈ Z+ is a finite-time horizon. The Fourier coefficients of c(x) are given in [27] as

$$c_k(x) = \frac{1}{T}\int_0^T F_k(x(t))\,dt. \quad (3.8)$$

After decomposing our distributions into their respective coefficients, we can then compare them in a metric of ergodicity ε:

$$\varepsilon(x) = \sum_k \Lambda_k \,|c_k(x) - \phi_k|^2, \quad (3.9)$$

where Λk is a scaling factor chosen so that the infinite sum converges,

$$\Lambda_k = \frac{1}{(1 + \|k\|^2)^{3/2}}, \quad (3.10)$$

provided that

$$|c_k(x) - \phi_k|^2 \sim |k|^{2-\delta}, \quad \delta > 0. \quad (3.11)$$

Here Mathew and Mezić omit φk from the notation for ε, which initially caused us confusion. To avoid such confusion, we include φk in the computation of ε as follows:

$$\varepsilon(c_k, \phi_k) = \sum_k \Lambda_k \,|c_k - \phi_k|^2 \quad (3.12)$$

Now it is clear that we are comparing distribution Fourier coefficients with trajectory Fourier coefficients. In this way, ε quantifies the difference between the actual trajectory and the ideal ergodic trajectory that could be generated by the same system given the same initial state. In order for the spatial average of the domain to approximate the time average of the trajectory as t → T, the control of each robot must minimize ε. The above metric, however, only takes into account a single agent's trajectory. We need a way to measure multi-robot trajectories, introduced in Section 3.4.2.

R2 is not the only domain for ergodicity, and work has been done on a variety of different domains. Of special note, Miller and Murphey [28] have successfully used the domain SE(2), the special Euclidean group with translational and rotational symmetries. This allows their work to take into account the facing of robotic sensors while still implementing ergodic coverage. Fourier basis functions must be found for the chosen domain, but otherwise ergodicity is independent of domain. For this work, R2 is utilized rather than SE(2) in order to reduce computational complexity. The algorithm proposed later in this chapter allows for the ergodic flocking of a multi-robot swarm. With that context in mind, we assume the domain to be rectangular in R2.

3.3.2 Time-Elapsed or Time-Varying Distributions

To guide trajectory calculations for applications like tracking, the concept of a time-elapsed distribution was introduced in [8]. Time-elapsed distributions are spatiotemporal distributions φ(x, t) [29] that provide a spatial density φ(x) at a given time t. This is accomplished by including a time parameter in the basis function Fk. In R2, the basis function then becomes

$$F_k(x, t) = \frac{1}{h_k}\cos(k_1 x_1)\cos(k_2 x_2)\cos\!\left(\frac{k_3 \pi t}{T}\right). \quad (3.13)$$

The distribution decomposition similarly needs to be updated as per [29].

3.4 Ergodic Coverage and Ergodic Trajectories

In the context of multi-robot systems, the coverage problem is well studied, and most algorithms accomplish their task either deterministically or using greedy algorithms. Greedy algorithms maximize information such that each new state generated by the agent will be closer to a point of highest probability density within the PDF over the search domain. Due to the nature of a greedy algorithm's control input, it will only ever attempt to visit the region of highest information density [30]. This may cause problems for domains with multiple regions of interest, or with probability distributions that a greedy algorithm cannot plan for [14].

Figure 3.4: Non-ergodic Coverage vs Ergodic Coverage. (a) Greedy Algorithm, (b) Ergodic Trajectory

Figure 3.4 shows how a greedy algorithm can easily find a region of high density. The shaded regions are regions of higher probability density. The algorithm then becomes caught in the region, since any movement outside would cause the control input to not be maximized; the greedy algorithm can never explore the second region.

Ergodicity was first applied to multi-agent systems by Mathew and Mezić to solve the coverage problem, and from this application they developed an algorithm for multi-agent coverage in 2010 [14]. This algorithm uses prior knowledge of a domain by considering the agents searching the domain as an ergodic system. They developed a metric for ergodicity: a function to measure how far a multi-agent system's trajectory is from being optimally ergodic. The Mathew and Mezić metric is widely used by researchers in single- and multi-robot research due to being relatively computationally inexpensive [29, 31–34]. [14] also provides an algorithm for creating trajectories for multi-agent systems called Spectral Multiscale Coverage (SMC).
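Before turning to trajectory generation, the Fourier decomposition and metric of equations (3.3)–(3.12) can be made concrete with a minimal sketch of our own (not code from [14] or [27]). The grid resolution, the cutoffs K1 = K2 = 4, and the sample trajectory below are arbitrary illustrative choices:

```python
import math

# Sketch of the Mathew-Mezic decomposition on the unit square [0,1]x[0,1].
L1 = L2 = 1.0
K = 4    # cutoff along each coordinate
G = 50   # grid resolution for numerically integrating phi

def h(k1, k2):
    # eq. (3.6): sqrt of the integral of cos^2(k1 x1) cos^2(k2 x2) over d;
    # for k = K*pi/L this integral is exactly L/2 (or L when K = 0)
    return math.sqrt((L1 / 2 if k1 else L1) * (L2 / 2 if k2 else L2))

def F(k1, k2, x):
    # eq. (3.4): normalized cosine basis function
    return math.cos(k1 * x[0]) * math.cos(k2 * x[1]) / h(k1, k2)

def phi_coeffs(phi):
    # eq. (3.3): phi_k as a midpoint Riemann sum over the domain
    coeffs, dA = {}, (L1 / G) * (L2 / G)
    for K1 in range(K + 1):
        for K2 in range(K + 1):
            k1, k2 = K1 * math.pi / L1, K2 * math.pi / L2
            s = 0.0
            for i in range(G):
                for j in range(G):
                    x = ((i + 0.5) * L1 / G, (j + 0.5) * L2 / G)
                    s += phi(x) * F(k1, k2, x) * dA
            coeffs[(K1, K2)] = s
    return coeffs

def traj_coeffs(xs):
    # eq. (3.8): c_k as the time-average of F_k along the trajectory
    coeffs = {}
    for K1 in range(K + 1):
        for K2 in range(K + 1):
            k1, k2 = K1 * math.pi / L1, K2 * math.pi / L2
            coeffs[(K1, K2)] = sum(F(k1, k2, x) for x in xs) / len(xs)
    return coeffs

def ergodic_metric(ck, pk):
    # eqs. (3.10) and (3.12), with the integer multi-index used in Lambda_k
    return sum((1.0 + K1 * K1 + K2 * K2) ** -1.5 * (c - pk[(K1, K2)]) ** 2
               for (K1, K2), c in ck.items())

pk = phi_coeffs(lambda x: 1.0)   # uniform PDF on the unit square
xs = [(0.1 * i % 1.0, 0.37 * i % 1.0) for i in range(200)]  # sample trajectory
eps = ergodic_metric(traj_coeffs(xs), pk)
assert eps >= 0.0
```

For the uniform density, only the constant mode φ(0,0) is non-zero, so ε measures how far the trajectory's time-averaged statistics are from uniform.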
An ergodic trajectory formed with SMC will coarsely explore both regions of interest before covering them more finely as t → T. In this way, ergodic trajectories prove to be more robust and flexible than their greedy counterparts.

The intent of ergodic coverage is to provide a method of search that overcomes the deficiencies of a lawnmower strategy. In a lawnmower algorithm, an agent (or agents) will begin at an initial fixed position and each be given an equally sized rectangular slice of the domain to search. Each agent then searches its partition in parallel lines until the entire partition is finished [35]. Lawnmowing is a common, accurate search strategy, but suffers the following three disadvantages in real-world application [36].

1. Uncertainty: If the domain is irregularly shaped, it is not obvious how to slice it into partitions for each agent.

2. Fault Tolerance: If an agent fails, it is not obvious how the other agents should adapt. The agents must either begin the process of repartitioning, or adapt in some other way.

3. Myopia: Lawnmowing assumes that all of the domain must be searched equally. This is not the case in many applications, such as search and rescue.

Mobile multi-robot systems are increasingly common in exploration tasks as the ubiquity of embedded distributed systems rises. Robot sensors are often unidirectionally fixed to the chassis, which requires that robot orientation be adjusted in order to increase sensor efficacy. The search for and decommissioning of mines by autonomous agents requires recognition of both the position and bearing of detection sensors in order to plan movements accordingly [9]. Similarly, accurate surveillance by autonomous unmanned aerial vehicles often requires a camera to be oriented in such a way as to minimize disturbance of the photogrammetric survey [22]. It is clear that in many related systems, position without bearing is not enough to accurately and efficiently plan a movement trajectory.
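For reference, the lawnmower strategy described above can be sketched in a few lines. This is an illustrative sketch of our own (strip partitioning on the unit square; the number of sweep rows is an arbitrary choice):

```python
# Lawnmower sketch: slice the unit square into Na equal vertical strips
# and sweep each strip in parallel lines (a boustrophedon path).
def lawnmower_path(strip, Na, rows=5):
    x0, x1 = strip / Na, (strip + 1) / Na        # strip boundaries
    path = []
    for r in range(rows):
        y = (r + 0.5) / rows                      # centre line of row r
        xs = [x0, x1] if r % 2 == 0 else [x1, x0] # alternate sweep direction
        path += [(x, y) for x in xs]
    return path

paths = [lawnmower_path(i, 4) for i in range(4)]  # 4 agents, 4 strips
assert len(paths) == 4 and paths[0][0] == (0.0, 0.1)
```

The sketch makes the first two disadvantages visible: the partitioning assumes a rectangular domain, and an agent's strip is covered by nobody else if that agent fails.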
Lauren Miller and Todd Murphey have considered the sensor orientation problem, and provided the basis functions necessary to use the Euclidean motion group SE(2) as the domain of search [32]. SE(2) can be considered as the group generated by rotational and translational symmetries on R2. Miller and Murphey go on to describe a method for generating trajectories called Ergodic Exploration of Distributed Information (EEDI) [28]. Their contribution is an optimization on continuous trajectories, rather than discrete ones. This method of trajectory generation will ergodically cover a domain very well in comparison to many other methods, but is computationally expensive.

There are many other methods for generating trajectories using the Mathew and Mezić metric. Projection-based gradient descent is one of the most popular [27, 28, 37, 38]. Gradient descent is an optimization technique that iteratively finds the minimum of a function; here it minimizes an objective function constructed in terms of ergodicity and control costs. There are many extensions to this algorithm, but all surveyed were computationally expensive [27].

3.4.1 Trajectory Generation

A great deal of work has been done on trajectory generation [14, 27, 28, 32, 34, 37], but most of these methods are computationally expensive, and as such are not suitable for the low-power microcontrollers common to multi-robot systems. A simple algorithm for trajectory generation, called Spectral Multiscale Coverage (SMC), is provided in [14]. This algorithm generates trajectory steps quickly in real time, though at a trade-off in precision. SMC covers a distribution very coarsely initially, and then covers it more finely as time elapses due to its multiscale nature; important features are covered very quickly initially, and as time progresses these features are explored exhaustively.
The coarse initial coverage by SMC aids in the aggregation step, because the agents will immediately attempt to move directly towards the uniform distribution before hovering in or near it. It is also leveraged in the time-elapsed distribution flocking step for similar reasons. The trade-off of precision versus computational speed is acceptable in multi-robot flocking applications, where the number of agents will make up for any initial loss in coverage.

SMC will only work for robot systems with single or double integrator dynamics; we describe the single integrator case below. Recall that feedback laws are intended to drive the metric ε to zero as quickly as possible by maximizing the rate of decay of the metric at each time step. For single integrators [14], the feedback law for robot j at time t with maximum velocity umax is

$$u_j(t) = -u_{max}\frac{B_j(t)}{\|B_j(t)\|_2} \quad (3.14)$$

where ||x||2 is the Euclidean norm, and

$$B_j(t) = t N_a \sum_k \Lambda_k \big(c_k(t) - \phi_k\big)\nabla F_k(x_j(t)) \quad (3.15)$$

where ∇Fk(x) is the gradient vector field of the basis functions Fk of our domain. The sum in (3.15) without the gradient is the difference between the two PDFs ck and φk. To correct this difference, we move in the opposite direction of the gradient at maximum velocity, as in (3.14). It is important to note again that the choice of cosine basis functions makes the gradient of Fk equal 0 on the boundary of the domain, so that the robots cannot move outside the boundary. uj(t) can be thought of as the next state the robot will move to. Since our domain is R2,

$$\nabla F_k(x) = \frac{1}{h_k}\begin{bmatrix} -k_1 \sin(k_1 x_1)\cos(k_2 x_2) \\ -k_2 \cos(k_1 x_1)\sin(k_2 x_2) \end{bmatrix}. \quad (3.16)$$

The restrictions on agent dynamics are acceptable in this work, as the robots described in Chapter 4: Prototype Implementation are single integrators (or are modelled as single integrators).
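The single-integrator feedback law (3.14)–(3.16) can be sketched as follows. This is our own minimal illustration on the unit square, assuming coefficient dictionaries keyed by the integer multi-index (K1, K2); the constant prefactor in (3.15) is omitted in the code since it cancels in the normalization of (3.14):

```python
import math

# Sketch of the single-integrator SMC feedback law on [0,1]x[0,1].
L1 = L2 = 1.0
K = 4

def h(k1, k2):
    # normalization constant, eq. (3.6) evaluated in closed form
    return math.sqrt((L1 / 2 if k1 else L1) * (L2 / 2 if k2 else L2))

def gradF(K1, K2, x):
    # eq. (3.16): gradient of the cosine basis function
    k1, k2 = K1 * math.pi / L1, K2 * math.pi / L2
    hk = h(k1, k2)
    return (-k1 * math.sin(k1 * x[0]) * math.cos(k2 * x[1]) / hk,
            -k2 * math.cos(k1 * x[0]) * math.sin(k2 * x[1]) / hk)

def smc_velocity(x, ck, phik, umax=0.1):
    # B_j(t), eq. (3.15): weighted sum of basis gradients (prefactor dropped,
    # as it cancels when normalized in eq. (3.14))
    bx = by = 0.0
    for (K1, K2), c in ck.items():
        lam = (1.0 + K1 * K1 + K2 * K2) ** -1.5
        gx, gy = gradF(K1, K2, x)
        bx += lam * (c - phik[(K1, K2)]) * gx
        by += lam * (c - phik[(K1, K2)]) * gy
    norm = math.hypot(bx, by)
    if norm == 0.0:
        return (0.0, 0.0)
    return (-umax * bx / norm, -umax * by / norm)  # eq. (3.14)

# Example: a uniform target whose coverage over-weights mode (1, 0)
phik = {(a, b): (1.0 if (a, b) == (0, 0) else 0.0)
        for a in range(K + 1) for b in range(K + 1)}
ck = dict(phik)
ck[(1, 0)] = 0.3
vx, vy = smc_velocity((0.25, 0.5), ck, phik)
assert abs(math.hypot(vx, vy) - 0.1) < 1e-9   # always moves at umax
```

Note how the returned velocity always has magnitude umax: SMC chooses only the direction of motion, moving against the gradient of the coefficient mismatch.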
For more complicated swarm robots, a method of trajectory generation like gradient descent [27] would be ideal, though typically swarm robots have at best double integrator dynamics and can still utilize SMC [39]. Before describing an algorithm for generating ergodic trajectories, it is necessary to discuss the benefits of local coefficient generation using neighbourhood communication.

3.4.2 Multi-Trajectory Generation

Ergodicity is extended to multi-robot systems by averaging the coefficients of trajectories. For Na robots, each agent j has an individual trajectory, and x̃jn is the corresponding state of agent j at time n. We can then average the trajectory coefficients as [27]

$$c_k(\tilde{x}) = \frac{1}{N_a(1 + N)}\sum_{j=1}^{N_a}\sum_{n=0}^{N} F_k(\tilde{x}_n^j). \quad (3.17)$$

When these coefficients are considered in the metric, the corresponding dynamics of each individual robot are inconsequential [14]. Each robot can have different dynamics if necessary, though each robot will attempt to explore the closest area of high information density within the distribution as long as its control law is informed by the metric. This applies to any trajectory generation method using this metric, though trajectory generation itself may put restrictions on the agents' dynamics [8, 14, 33]. The system will attempt to match the spatial average of the domain's distribution with the time average of all of the robots.

Concern lies with having Na agents attempting to compute coefficients globally, which is necessary for this style of multi-robot ergodicity. Each agent must communicate its trajectory coefficients to all other robots. This allows the robots to average every other robot's coefficients in order to generate a unified trajectory coefficient for the system as a whole.
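The averaging in equation (3.17) can be sketched directly. This is an illustrative implementation of our own on the unit square, with two short sample trajectories:

```python
import math

# Sketch of eq. (3.17): average the basis evaluations over all Na agents
# and all N+1 states to obtain system-wide trajectory coefficients.
L = 1.0

def F(K1, K2, x):
    hk = math.sqrt((L / 2 if K1 else L) * (L / 2 if K2 else L))
    return (math.cos(K1 * math.pi * x[0] / L)
            * math.cos(K2 * math.pi * x[1] / L) / hk)

def system_ck(trajectories, K=4):
    # trajectories[j][n] is agent j's position at step n (all equal length)
    Na = len(trajectories)
    N = len(trajectories[0]) - 1
    return {(K1, K2): sum(F(K1, K2, x) for traj in trajectories for x in traj)
                      / (Na * (1 + N))
            for K1 in range(K + 1) for K2 in range(K + 1)}

trajs = [[(0.1, 0.1), (0.2, 0.2)],   # agent 1
         [(0.9, 0.9), (0.8, 0.8)]]   # agent 2
ck = system_ck(trajs)
assert abs(ck[(0, 0)] - 1.0) < 1e-9  # constant mode: F_00 = 1 everywhere
```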
In order to find the average coefficients of the system, each robot is required to broadcast its own trajectory coefficients, to receive the coefficients of all other robots, and then to compute the system coefficients c̃k while ensuring that the computed value is the same as every other agent's. Having each agent receive all other agents' trajectory coefficients before generating the average locally can have unintended consequences. Coefficient loss due to obstacles, multipath and weather fade, shadow-induced fading, and connectivity between potentially wireless nodes [13] are all significant challenges to a fully distributed multi-robot model implementing this style of ergodicity.

3.5 Ergodic Flocking

We consider strictly local communication between agent neighbours as a basic characteristic of flocking. This characteristic also makes flocking behaviour scalable. Therefore, to emulate scalable flocking using ergodicity, we need: (i) an aggregation of robots to one location; and then (ii) tracking of a given distribution φ as a group using only local communication. We achieve the two steps required for scalable flocking with three rules: (i) aggregate robots to one location using ergodic coverage; (ii) use a time-elapsed distribution and ergodic tracking to simulate flocking; and (iii) use only local coefficient generation, informed by neighbourhood communication, to compute ergodicity. This is described in Section 3.5.2: Local Coefficient Generation.

Using these ideas, flocking behaviour has been implemented in two stages: (i) apply ergodic coverage by positioning a spatially "small" circular uniform distribution φ at given coordinates (the size of the distribution should be large enough to include all agents, but small enough to aggregate them together). Each robot will cover the domain ergodically, clustering in and around φ.
Then, (ii) define and deploy a time-elapsed distribution to specify a flocking trajectory, and let the multi-robot system exhibit ergodic tracking behaviour that follows the time-elapsed distribution. Two key aspects in achieving scalable ergodic flocking are effective trajectory generation, and effective control and communication.

3.5.1 Localized Control and Communication

Multi-robot ergodicity allows the sharing of global trajectory distribution information to guide each robot's next control step in minimizing the metric. In a fully distributed multi-robot ergodic system, each robot does not have access to global coefficients, and as such will be less efficient with its time in exploring a domain than a robot that is part of a centralized system. This could be due to robots being outside the communication range of other robots within the system, wireless multipath fade, or obstacles. This loss of efficiency appears to be negligible over time in standard area-coverage problems using SMC, as shown in Figure 3.5.

Figure 3.5: Centralized vs distributed ergodicity with 3 agents and step 0.5. (a) Centralized Ergodicity, (b) Distributed Ergodicity

Note that the coverage in a centralized ergodic trajectory is typically more composite than that of a fully distributed trajectory. This is due to the nature of a centralized ergodic system: as each agent contributes its trajectory coefficients to the combined system, each agent will predictably cover areas as different as possible from those the rest of the system is covering. This is why, in Figure 3.5, the blue agent remains predominantly in the south of the uniform distribution, the purple around the equator, and the green agent in the north-east. In fully distributed ergodicity, each agent is its own system until another agent approaches within its local neighbourhood threshold distance.
When two agents come into threshold range, they form a single system until they leave this threshold again, allowing them to share trajectory information and inform each other's next state. The end result is a less composite, but still very thorough, coverage of the distribution.

3.5.2 Local Coefficient Generation

In order to create a robust system that can leverage ergodicity in a scalable, intuitive way, this work modifies the standard averaging of trajectory coefficients. Each robot within a group communicates with only a local subset of all group members. This is both a useful and a functional limitation placed on each member, as each robot is typically very restricted in sensor capacity and size, and, as an indirect result, power. Constant communication with all members of a group would require non-direct transfer of information: an expensive prospect.

This new averaging of robot trajectory coefficients (hereafter Local Trajectory Coefficients (LTC)) follows simple, emergent rules: rather than taking an average over every member of a global collection, each robot receives the coefficients of all members within its local neighbourhood threshold. This neighbourhood can be defined in any way natural to the system in question, whether by communication hops, physical distance, or some other measure. Each robot is then free from being forced to synchronize with all other members of the group at each time step, and does not need to communicate with a potentially unmanageable number of other robots. This is avoided in decentralized trajectory generation since the threshold distance can be set to a value within which a manageable number of robots can physically fit. Each robot essentially coordinates only with its neighbours.

3.5.3 Algorithm

The algorithm that each robot has to execute for flocking behaviour in a subregion D of R2 is given as Algorithm 1: Ergodic Flocking on R2 with LTC.
Here: φ(x) is a distribution with density over D ⊂ R2; φk is the Fourier decomposition of the spatial distribution φ; K1, K2 are the coefficient indices along each coordinate; B is the set of all robots β in the swarm; c(x) is the distribution derived from the trajectory x; ck is the Fourier decomposition of the trajectory distribution c; Λ is the scaling factor; ε(x) is the sum of squared differences between ck(x) and φk(x); ckβ are the trajectory coefficients received from robot β; ψ ∈ R+ is a threshold value quantifying the distance of the local neighbourhood; δβ is the distance of robot β from the current robot α; and εa is the average ergodicity value for all robots in robot α's local system. Note: for a time-elapsed distribution, the algorithm must be modified to take time into account, and so an additional loop, for K3 do, is included at each iteration of calculating φk and ck.

Algorithm 1: Ergodic Flocking on R2 with LTC

 1  for K1 do
 2      for K2 do
 3          Compute φk
 4      end
 5  end
 6  while true do
 7      for K1 do
 8          for K2 do
 9              Compute ck
10          end
11      end
12      if Num Agents > 1 then
13          for β in B do
14              if δβ ≤ ψ then
15                  Receive ckβ
16              end
17          end
18          Compute Λ
19          Compute εa
20      else
21          Compute Λ
22          Compute ε
23      end
24      Get next state using SMC given agent dynamics and ε (or εa)
25  end

3.6 Simulation

It is possible to approximate flocking behaviour with multi-robot ergodic trajectories. Typically a collection of algorithms or built-in behaviours would be required to accomplish flocking behaviour in a multi-robot system. Beneficially, ergodicity provides a stable framework within which collective group navigation can be performed without having to introduce additional algorithms. Rather than having a large library of algorithms available to each robot, it is preferable for each member of a multi-robot system to take advantage of a navigation framework that can adapt depending on the situation or required task.
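Before turning to the simulation results, the per-robot cycle of Algorithm 1 can be sketched at a high level. This is a runnable illustration of our own, with stand-in data: the real φk, ck, and SMC computations are the ones described earlier in the chapter.

```python
import math

# High-level sketch of one iteration of Algorithm 1: gather coefficients
# from neighbours within threshold psi, average them (LTC), and compute
# the local ergodicity value fed to SMC.
class Robot:
    def __init__(self, pos, ck):
        self.pos, self.ck = pos, ck

    def distance_to(self, other):
        return math.hypot(self.pos[0] - other.pos[0],
                          self.pos[1] - other.pos[1])

def step(robot, swarm, phik, psi):
    # neighbour-gathering: receive ck from robots with distance <= psi
    local = [robot.ck] + [o.ck for o in swarm
                          if o is not robot and robot.distance_to(o) <= psi]
    avg = {k: sum(c[k] for c in local) / len(local) for k in robot.ck}
    # average ergodicity for the local system (Lambda-weighted differences)
    eps_a = sum((1.0 + k[0] ** 2 + k[1] ** 2) ** -1.5
                * (avg[k] - phik[k]) ** 2 for k in avg)
    return eps_a  # would be fed to SMC to choose the next state

phik = {(0, 0): 1.0, (1, 0): 0.0}
swarm = [Robot((0.1, 0.1), {(0, 0): 1.0, (1, 0): 0.2}),
         Robot((0.2, 0.1), {(0, 0): 1.0, (1, 0): 0.4}),
         Robot((0.9, 0.9), {(0, 0): 1.0, (1, 0): 0.0})]
eps = step(swarm[0], swarm, phik, psi=0.3)   # third robot is out of range
assert eps > 0.0
```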
It is important to note that the flocking behaviours provided by a system of robots generating local ergodic trajectories are not accomplished in the same manner as a more traditional multi-robot task algorithm. The end results approximate the desired behaviours well, however, as shown in Figure 3.8.

3.6.1 Simulator

For our work on ergodic flocking, we extend the ergodic trajectory simulator written by Louis Dressel in the Julia language, named ErgodicControl.jl [31]. Specifically, the ability to create local ergodic trajectories using both stationary and time-elapsed spatial distributions has been added, as well as some new functionality allowing the automation of image and GIF generation. The extended simulator can be found at: https://github.com/Cveitch/ErgodicControl.jl

For all task simulations, seven agents were used. Different time-elapsed distributions were used for each simulation, and a new simulation was loaded for each trial. The same domain d = [0, 1] × [0, 1] was used for all tasks to ensure comparability of results. The total time for each trial was T = 100, but the time step between states can differ between trials. This was done strictly to avoid outpacing the time-elapsed distribution, whose time steps need to keep pace with the agents. It should be noted that the generation of distributions for flocking is a non-trivial task, and will be explored in future work. The distribution generation depends on the domain, coverage region, obstacles, number of agents, and intended subtask (sensing, moving, etc.). We simulate flocking in two steps: (i) aggregation to a fixed location, shown in Section 3.6.2; and (ii) moving as a flock from there following a trajectory, shown in Section 3.6.3.2.
3.6.2 Time-Elapsed Aggregation

Figure 3.6: Aggregation step, 7 agents, T=100, each time step is 0.1, agent speed is 0.1. (a) Time Step t = 0, (b) Time Step t = 20, (c) Time Step t = 50, (d) Time Step t = 70

Aggregation is the ability of a multi-robot system to collect itself into a common region. As per Algorithm 1 (Ergodic Flocking), all agents in the system must aggregate before moving towards a common target. A uniform circular distribution with radius 0.15, centred at the point (0.2, 0.2), was provided to the system. As shown in Figure 3.6, the robots immediately move to cluster within the distribution without exception. Once all robots have aggregated, the uniform distribution is removed and a new time-evolving distribution is created to guide flocking behaviour.

Area coverage of ergodically informed trajectories can be halted at any time by providing the group with an empty distribution over the domain. In application this could be accomplished by giving the distribution and an alert to one robot, which would then broadcast the alert and the new distribution to its neighbours. This trickle-style network would eventually transmit the necessary information to all robots within the distribution. If the desired effect is to aggregate and then stop moving, providing a clustering uniform distribution followed by an empty distribution would be satisfactory.

3.6.3 Flocking by Distribution Tracking

First, flocking in a line (linear flocking) is simulated, and then flocking in a curve (non-linear flocking).

3.6.3.1 Linear Flocking

Figure 3.7 shows a new, time-elapsed distribution overlaid on the domain. Another uniform circular distribution with radius 0.15 was centred at the point (0.2, 0.2). This distribution then evolved with time, linearly increasing its x and y coordinates by 0.01 at each time step. The aggregation and time-elapsed aggregation steps are shown to be possible in Figures 3.6 and 3.7, but these trials only show linear flocking progression.
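The time-elapsed distributions driving these trials can be sketched as a circular uniform density whose centre is moved at each time step. This is an illustrative sketch with parameter names of our own; the radius, start point, and step size match the linear-flocking trial described above, and the same construction generalizes to curved centre paths:

```python
import math

# A circular uniform density of given radius, centred at `center`.
def circular_uniform(center, radius):
    area = math.pi * radius ** 2
    def phi(x):
        inside = math.hypot(x[0] - center[0], x[1] - center[1]) <= radius
        return 1.0 / area if inside else 0.0
    return phi

# Linear flocking: centre starts at (0.2, 0.2) and gains 0.01 in x and y
# at each time step, as in the trial above.
def linear_center(t, start=(0.2, 0.2), v=0.01):
    return (start[0] + v * t, start[1] + v * t)

phi_t = circular_uniform(linear_center(10), 0.15)  # distribution at t = 10
assert phi_t((0.3, 0.3)) > 0.0 and phi_t((0.9, 0.9)) == 0.0
```

At each simulation step, the agents' ergodic tracking is re-informed by the distribution for the current t, which is what pulls the flock along the moving centre.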
In order to be useful for swarm tasks, it needs to be shown that non-linear movement is also possible with this method.

Figure 3.7: Time-Elapsed Aggregation step, 7 agents, T=100, each time step is 0.1, agent speed is 0.1, distribution speed is 0.11. (a) Time Step t = 0, (b) Time Step t = 15, (c) Time Step t = 30, (d) Time Step t = 50

3.6.3.2 Non-Linear Flocking

Figure 3.8 shows how non-linear flocking behaviour provided by local ergodic trajectories can be influenced with a time-evolving distribution. Agents were first aggregated to a static uniform distribution with mean (0.4, 0.4) and a radius of 0.18. A time-evolving distribution with the same radius and initial mean was then provided to the agents, which then moved counter-clockwise around the domain. Each robot manages to remain in approximate alignment with all other agents.

Figure 3.8: Local ergodic trajectories with respect to a time-evolving uniform distribution. 7 agents, T=100, each time step is 0.15, agent speed is 0.1, distribution speed is 0.11. (a) Time Step t = 15, (b) Time Step t = 35, (c) Time Step t = 65, (d) Time Step t = 85

Ergodic Flocking can be accomplished in the ErgodicControl.jl simulator with agents taking advantage of LTC. Figure 3.8 clearly shows that non-linear flocking is possible, and a physical implementation of this algorithm is presented in the next chapter.

Chapter 4

Prototype Implementation

This chapter provides evidence, via a prototype, that ergodic flocking is possible in a physical setting. To the best of our knowledge, distributed swarm ergodicity has previously been shown only in simulated environments. In order to provide supporting evidence for real-world ergodic flocking, a prototype was designed, and four robots were used in an experiment intended to closely approximate the simulation results. A description of the hardware and dynamics of the robots involved is presented.
We begin with an investigation of the advantages and disadvantages of commercially available robotic platforms, followed by a discussion of why we ultimately built the prototype in-house. The hardware components of the prototype are discussed next, with explanations of why each was selected and how they are calibrated. The method of distance-only navigation used by the prototype is described in the Navigation section, followed by an explanation of the received signal strength indication of the wireless packets used to extract distance. An algorithm for how multiple robots implementing distance-only navigation coordinate with one another is presented in the Coordination section, as are the control dynamics of the prototype. The Ergodic Flocking Implementation section describes how the prototype system is used in a series of ergodic flocking experiments with parameters similar to the simulations in the Ergodic Flocking chapter.

4.1 Commercially Available Platforms

Commercially available robots fall into one of three categories [40]: industrial, academic, or entertainment. There are no suitable industrial platforms available, since this work focuses on the flocking of robotic swarms and there are no commercially available industrial swarm robots yet. A collection of academic and entertainment-focused robots was investigated, but none was found suitable for ergodic flocking, as each platform ultimately failed to meet all of the necessary criteria. The criteria for a platform to be usable in the following ergodic flocking implementation are:

• Affordability: since multiple robots were needed in the experiment, each robot had to cost less than $150.00 CAD;

• Communication: each robot must transmit and receive 80-byte packets (trajectory coefficients consist of a 4 × 4 × 4 matrix of bytes).
This 80-byte restriction is to avoid network complexity;

• Mobility: robots must be mobile and capable of navigating an environment;

• Minimally sensored: equipped with as few sensors as necessary to achieve a specific task. This is required, as the intent of this work is to be extended to a swarm of simple, low-cost, low-power robots; and,

• Reprogrammable: the robot must be easily reprogrammed. The amount of debugging required in computer hardware work meant that the robots should take as little time as possible to reprogram.

Table 4.1: Cost Comparison of Commercial Swarm Platforms

Platform  | Sensors                                            | Cost ($ CAD)
AERobot   | Distance, Light                                    | 20
Alice     | Camera, Distance                                   | 120
AMiR      | Bump, Infrared (IR) Communication                  | 50
Cellulo   | Pattern-sensing camera, capacitive touch           | NA
Colias-4  | Bump, IR Communication, IR Proximity               | 360
E-puck    | Distance, Camera, Inertial Measurement Unit (IMU)  | 1070
Kilobot   | Distance, Light                                    | 110
R-One     | Bearing, Bump, IMU, Light                          | 250
S-Bot     | Humidity, IMU, IR, Light, Temperature              | NA

Table 4.1 lists the robotic platforms investigated for this work. Of the swarm robots surveyed, these nine platforms come closest to meeting the system requirements. All prices are given in Canadian dollars, converted at the time of platform choice (Winter 2017).

Though inexpensive, the Harvard Kilobot and AERobot platforms [21, 41] are incapable of transmitting the 80-byte packets required to communicate trajectory coefficients. Their IR sensors are strictly for detecting ambient light and distance to other swarm robots and environmental obstacles. In comparison to the Kilobot and AERobot, the S-Bot by École Polytechnique Fédérale de Lausanne (EPFL) [42] meets all implementation criteria, but all attempts to price the robot failed. Given the complexity and engineering of the S-Bot, it is very likely outside the price restrictions of this work.
EPFL's Alice platform [43] met all criteria (its cost was over budget, but within acceptable limits), but it is no longer in production, and at the time only two could be found via secondary resale. Similarly, the Cellulo [44] meets almost all requirements but was not commercially available at the time; since Cellulo is not open source, it also fails the reprogrammability requirement.

Fully open-sourced and reprogrammable, the Colias-4 [45] by the University of Lincoln is above the affordability limit if bought commercially. The Colias-4 could still have been used in this work, as the PCB design is provided on the developers' website to allow researchers to cut their own board and build their own robot; however, the time and cost involved in outsourcing the PCB manufacturing proved prohibitive. The AMiR, by Islamic Azad University, is capable of IR communication and is open source, with free PCB designs available [46], but its documentation is unclear on whether the IR transmitters could send 80-byte bursts without significant modification, and attempts to contact the researchers were unsuccessful. The R-One by Rice University [47] and the E-Puck, also developed by EPFL [40], are cost prohibitive for this thesis and are not minimally sensored.

Given the lack of suitable commercially available robots, the decision was made to build a new platform.

4.2 Prototype

The Robotic Prototype for Ergodic RSSI-informed Trajectories (RPERT), shown in Fig. 4.1 and described in this section, is a low-cost, low-power, minimally sensored prototype designed and developed by the Distributed and Embedded Systems Lab at UNBC. The RPERT is used to show that it is possible for a minimally sensored collection of robots to perform ergodic flocking behaviour in a real-world setting.
The minimally sensored restriction is important for two reasons:

• Fewer sensor peripherals allow the power requirement for each RPERT to be low. This extends battery life, and allows the size of the robots to be reduced, since the power supply can be smaller than would normally be required for a platform with a full sensor suite [23]; and,

• Each minimally sensored robot has fewer points of failure. In a swarm of dozens or hundreds of robots, fewer sensors overall implies less downtime for each robot. Additionally, many linked peripherals may stop working if one sensor goes down [15].

Figure 4.1: Robotic Prototype for Ergodic RSSI-informed Trajectories (RPERT): (a) external view; (b) internal view

These points informed all design decisions on the RPERT. If ergodic flocking can be achieved with a minimally sensored robot swarm, then the same problem can be overcome by a collection of robots with a more complex sensor suite. Swarms are predominantly very simple at the member level, and showing that a minimally sensored homogeneous collection of robots can ergodically flock is evidence that any sufficiently complex collection of robots can [6]. The aim of this work is to provide an algorithm for the general case.

4.2.1 Hardware

Hardware selection for the RPERT is intended to keep each robot minimally sensored. As such, only the following components are necessary:

• A microcontroller to control robot behaviour and all peripheral hardware;

• Motors for mobility;

• A mechanism to detect other robots to avoid collision;

• An IMU, to allow the RPERT to know how it is situated inside its environment; and,

• A mechanism to communicate trajectory coefficients to neighbouring RPERTs.

The Atmel SAM4S Xplained Pro development board is the computer running each RPERT. The development board contains the SAM4SD32C, a microcontroller built around the ARM Cortex-M4 core, a 32-bit RISC ARM processor [48].
Though more powerful than necessary, this microcontroller was chosen due to our familiarity with it. This choice was made in order to speed prototype construction, as programming the RPERT required extensive assembly code. The choice of microcontroller does not change the scope of this project, since any code written for the prototype can be run on a smaller Atmel board with only minor cosmetic code modification. This is because Atmel Studio, the IDE supplied by Microchip, provides a collection of embedded libraries for use with all Microchip/Atmel controllers. These libraries are collectively called the Advanced Software Framework (ASF). The ASF offers a hardware abstraction and a series of simple drivers for pin access, in an attempt to reduce the amount of assembly code required.

Using the pin drivers supplied by the ASF, two DG01D 4.5 V DC motor gearboxes provide mobility to the robot. Each motor runs at 140 RPM, and both are connected to an RV8833 DC motor driver. This allows the DC motors to be controlled by the microcontroller pins, and to move bi-directionally. The RV8833 was chosen for its low voltage capacity: since each motor is 4.5 V, it can be powered by the SAM4S vcc pin. Because the motors are powered directly from the board, care is taken to mitigate motor current draw that could potentially damage the SAM4S. The RV8833 has built-in current limiting, but resistors were added to the circuit as an additional safeguard. As the motors and all other peripherals are not powered by a secondary power source, low-power sensors are necessary to ensure the SAM4S does not draw too much current and reset.

The Bosch BNO055 IMU Xplained Pro daughter board [49] runs on 2.4–3.6 V and has a low-power mode. In this mode, if the robot has not moved along an axis for a default of 5 seconds, the sensor is put to sleep; this value has been modified to 1 second to further conserve power.
The BNO055 is inserted into EXT1 of the SAM4S Xplained Pro board. It contains a 16-bit gyroscope, a 14-bit accelerometer, and a magnetometer, which are collectively used to determine the heading of a RPERT.

Each RPERT communicates ergodic trajectory coefficients with a wireless module: the Digi XBee Pro 802.15.4 [50]. This module is a low-cost, low-power device running on the IEEE 802.15.4 technical standard, though this protocol specifies only the physical and MAC layers. These layers are sufficient, as each RPERT communicates coefficients by broadcasting to all other modules in range. The module operates in the 2.4 GHz frequency band, which WiFi can also operate on. Given this shared frequency, channels have been selected to mitigate collision: channels 0x13 (2.415 GHz) and 0x19 (2.475 GHz) are used for this reason. Each RPERT is given a unique Media Access Control (MAC) address identifier defined at startup. This address can be altered later through the operating system if required. This wireless module was chosen for its very low power requirements and the relative simplicity of driver-writing. This simplicity stems from the Application Program Interface (API) mode that the XBee Pro can be configured for: API commands can be sent by the drivers directly to the XBee controller, allowing the specific implementation details of the XBee to be abstracted.

With the minimal sensors covered, two sensors were added to the RPERT to make experimentation and debugging easier. First, each RPERT is equipped with a single infrared optical sensor, used strictly to detect physical collisions during experimentation. Without the sensor, RPERTs would require human intervention if they collided with a physical obstacle or another RPERT; this intervention adds a degree of uncertainty and would be difficult to analyze appropriately.
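The 80-byte coefficient packets described above can be illustrated with a short sketch. The thesis fixes only the total packet size and the 4 × 4 × 4 byte coefficient matrix; the field order, the float encoding of the appended (x, y) coordinates, and the zero padding below are illustrative assumptions, not the prototype's actual layout:

```python
import struct

def pack_payload(coeffs, x, y):
    """Flatten a 4x4x4 byte coefficient matrix and append (x, y) coordinates.

    Hypothetical layout: 64 coefficient bytes, two little-endian 4-byte
    floats for position, zero-padded to the 80-byte packet size.
    """
    flat = bytes(c for plane in coeffs for row in plane for c in row)
    assert len(flat) == 64          # 4 * 4 * 4 coefficient bytes
    payload = flat + struct.pack("<ff", x, y)
    return payload.ljust(80, b"\x00")

def unpack_payload(payload):
    """Recover the coefficient matrix and coordinates from a payload."""
    coeffs = [[[payload[16 * i + 4 * j + k] for k in range(4)]
               for j in range(4)] for i in range(4)]
    x, y = struct.unpack("<ff", payload[64:72])
    return coeffs, x, y
```

In the actual prototype, a payload like this would be handed to the XBee Pro via its API-mode driver for broadcast; the framing added by the module itself is outside this sketch.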
The second addition is a red-green-blue LED bulb on either side of the top of the robot. These are for user communication only, supplying debugging and error information as well as robot-state information. Table 4.2 lists all error and status codes.

COLOUR    Left LED (Error)       Right LED (Status)
WHITE     Initializing           Initializing
YELLOW    Cannot read SD Card    Transmitting
RED       Kernel Panic           Orientation: Q1
CYAN      No instructions        Orientation: Q2
PURPLE    Battery Low            Orientation: Q3
BLUE      Null                   Orientation: Q4
GREEN     Null                   Achieved Goal

Table 4.2: RPERT State Information by LED Colour

• Each LED is white on power-on, as the robot initializes;

• Q1–Q4 directly correspond to quadrants one through four of the unit circle, as measured by the IMU; and,

• When the SD card program is inserted into the SAM4S Xplained Pro, the program must toggle the "has instructions" flag in the OS. If this flag is not toggled, the left-hand LED will be cyan. If no SD card is inserted when the robot is powered on, the LED will be yellow.

Figure 4.2: CAD Drawings of RPERT Chassis: (a) bottom plate; (b) top plate

Each RPERT has a 3D-printed chassis that was designed in OpenSCAD, as shown in Fig. 4.2. OpenSCAD is open-source 3D rendering software that allows for precise manipulation and creation of 3D objects. Each robot was printed on a Tinkerine Pro 3D printer, and each chassis is 17 cm in diameter by 12 cm tall. One 2200 mAh cellphone charger is inserted into a canal in the bottom of each RPERT as a power supply, allowing the robot to be charged by micro USB. A total of four RPERTs were constructed for this work.

4.2.2 Calibration

Due to the nature of the sensors that make up IMUs, they require an environmental calibration in order to measure accurate values [49]. To calibrate the accelerometer, the Bosch BNO055 requires that it be placed in six different stable positions for two seconds apiece, with slow movement between each position.
During this movement the robot must be perpendicular to the X and Y axes at least once each. Similarly, to calibrate the magnetometer the robot must move in a figure-eight pattern until the CALIB STAT register registers calibration [49]. Experimentally, the time it takes to calibrate the magnetometer is 10 seconds on average at the RPERT's lowest speed (5 cm/s). To calibrate the gyroscope, the IMU must be held in a single stable position for 2 seconds.

When powered on, each RPERT goes through initialization and calibration steps to ensure all sensors are working properly. The steps are:

1. Plug the USB into the power supply;

2. The SAM4S, LEDs, motors, and XBee Pro are initialized, in that order;

3. The BNO055 is set to CONFIG MODE by default; each sensor in the IMU is configured separately;

4. Calibrate magnetometer: the RPERT moves in an hourglass pattern at medium speed, polling bits 0/1 of the CALIB STAT register on the IMU. If 3 is returned, the magnetometer is calibrated. If 3 is not returned, repeat;

5. Calibrate gyroscope: the RPERT remains still and polls bits 2/3 of the CALIB STAT register on the IMU. If 3 is returned, the gyroscope is calibrated. If 3 is not returned, repeat;

6. Calibrate accelerometer: the RPERT moves in a square pattern, pausing for 2 seconds at each vertex. At the end of the square, the RPERT turns 90 degrees, moves forward for 1 second with a 2-second pause, then moves backward for 1 second with a 2-second pause. Poll bits 2 and 3 of the CALIB STAT register on the IMU; if 3 is returned, the accelerometer is calibrated. If 3 is not returned, repeat; and,

7. The BNO055 is set to NDOF MODE, setting the IMU to nine-degrees-of-freedom mode. This gives access to absolute orientation data from the accelerometer, gyroscope, and magnetometer.

4.2.3 Navigation

Since the RPERT is minimally sensored, it is difficult to have it accurately navigate its environment [51].
In order to adequately search a domain without a lawnmower algorithm, each robot in the swarm must be capable of navigating its domain. To navigate accurately, the following minimum information is required [51–53]: a notion of proximity to neighbouring robots, where proximity can be absolute or relative to neighbours, and a bearing.

Each RPERT makes a call to its IMU to retrieve the robot's heading relative to magnetic north. The IMU is set to Nine Degrees of Freedom (NDOF) mode in the configuration step, which handles much of the computation onboard. Values are returned in Euler angles, which describe a 3D angle as a three-tuple (φ, θ, ω), where φ is the rotation around the x axis, θ is the rotation around the y axis, and ω is the rotation around the z axis. As each RPERT is not oriented perfectly parallel to the xy plane, a tilt compensation is made by the onboard microcontroller prior to returning the heading. A conversion is then used to turn these values into conventional compass radians for navigational purposes. These angles are represented internally as 1 radian = 900 LSB (least significant bits). The 16-bit unsigned integer value returned is in the range 0–5759, so to obtain a radian value we divide the returned value by 900 (for example, a raw reading of 1800 LSB corresponds to 1800/900 = 2 radians).

4.2.3.1 Received Signal Strength Indication (RSSI)

Each RPERT has a heading, but this alone is not enough to know its position in the domain. If each robot always started in the same location, the robots could navigate a room as long as they were aware of its dimensions: they would move forward knowing exactly how far their wheels are capable of taking them in a single quantum. Navigation in this way relies on hard-coded motor timings, which is inherently inaccurate since it does not consider the different surfaces the wheels move along, wear on mechanical parts, motor slippage, etc. [51].
If a robot navigating in this way ends its movement in a position that is not exactly where it believes itself to be, it cannot correct its course. The error thus propagates through the system and becomes worse with each successive quantum. It is possible to mitigate this if the robot is capable of measuring itself against a known landmark. As each RPERT has only a wireless device and an IMU, this is difficult, but possible. The IMU cannot reliably be used in the long term to correct propagating error, but the onboard wireless module can be leveraged to extract an approximation of the distance to another external wireless module [52, 54].

Each time a wireless module broadcasts a packet of information, there is an inherent signal strength, or power level, upon being received by an accepting wireless module. This RSSI ranges from -100 dBm to 0 dBm, where -100 is a very weak signal and 0 a very strong one. The RSSI values received by an XBee Pro are reported in hexadecimal, and so range from -0x64 to 0. Figure 4.3 shows the data from an experiment we conducted at UNBC whose results suggest the relationship between raw RSSI values and distance. Peaks and valleys in RSSI value typically correspond to some form of signal agitation, collision, bounce, or loss.

Figure 4.3: Relationship Between RSSI and Distance

RSSI is used as an indication of link quality between two nodes but, beneficially to this work, it shares a nonlinear relationship with physical distance [55]. This relationship (known as the log-distance path-loss model) can be modelled as

RSSI_m = RSSI_d0 − 10ν log10(d/d0),    (4.1)

where RSSI_m is the measured RSSI value, d is the distance in cm between the transmitter and the receiver, and RSSI_d0 is a calibration signal strength measured when a transmitter and receiver are positioned d0 apart. A common choice for d0 is 100 cm [52].
Here ν is a propagation constant, described by

ν = (RSSI_dc − RSSI_d0) / (10 log10(d0/dc)),    (4.2)

where RSSI_dc is an RSSI value measured dc cm away from a calibration target. This value is typically chosen as 50 cm, and is the default for this work. ν is called a path-loss exponent, and accounts for the medium the signal is moving through. Signal strength is directly affected by the medium, and different domains may give wildly different values if calibration does not take place. Once the calibration values are measured and recorded, distance can be found by rearranging Eq. 4.1 so that

d = d0 · 10^((RSSI_d0 − RSSI_m) / (10ν)).    (4.3)

4.2.3.2 Data Noise

This distance approximation, even when calibrated, is often very inaccurate when taken with raw RSSI data. RSSI measurements are imprecise and prone to data noise; Figures 4.3 and 4.4 give an example of this instability. Raw RSSI data was fed into the distance approximation equation to create an RSSI-to-distance heat map of room 10-2088 (MACE) at UNBC. To generate these maps, distance was measured from the furthest corner of each box to the wireless module at the centre of the room. The experiments were done in late evening during the summer, to mitigate collision with student WiFi traffic. The results clearly show that converting raw RSSI to distance is very inaccurate.

Figure 4.4: Actual Distance Heat Map (a) vs Raw RSSI Heat Map (b)

In order to mitigate this data noise for the RPERTs, all RSSI values are processed first through a low-pass filter, then through a Kalman filter as in [54].

4.2.3.3 Filtering

Initially, each piece of data is run through a low-pass software filter designed to impede high-frequency signals. This removes spikes from incoming data, making it easier for a Kalman filter to predict values afterwards. A Kalman filter is an iterative filter that, based upon prior knowledge and current values, estimates what the next value will be.
As RSSI is the only variable of interest, a scalar (linear) Kalman filter is used. It takes an initial sensor value RSSI_I, and at each step processes a measured sensor value RSSI_M. An initial variance VAR is required; it does not have to be accurate, since it only expresses an initial confidence in how far RSSI_I may lie from the true RSSI value. We often choose an artificially high initial variance, as the Kalman filter will modify it at each step. Also required is a sensor variance SENSVAR: the inherent measurement variance of the XBee Pro itself. Finally, the filter requires the process variance RSSIVAR, the variance inherent to the RSSI signal itself, which was found experimentally.

Once the initial values are found, the Kalman filter begins a cycle of prediction and update. In the prediction step the filter makes a new estimate; in the update step, the prior estimate and a new measured value are scaled by weights that depend on their variances, then combined to create a new estimate. Each iteration, the Kalman filter will:

1. Receive a new RSSI measurement RSSI_M;

2. Begin the prediction step:

(a) Set RSSI_I = RSSI_I + u, where u is the expected movement in the measurement (often 0);

(b) Set VAR = VAR + RSSIVAR;

3. Begin the update step:

(a) Set RSSI_I = (VAR · RSSI_M + SENSVAR · RSSI_I) / (VAR + SENSVAR);

(b) Set VAR = 1 / (1/VAR + 1/SENSVAR).

For this work, RSSI_I was set to the calibration value RSSI_dc. VAR was initially set to 25^2 = 625, SENSVAR was found to be approximately 5 [50], and RSSIVAR was experimentally found to be approximately 8.

A Kalman filter is a linear filter, and RSSI is not a linear system. The extended Kalman filter used in this work linearizes the prediction and the measured value about their means; other than this initial preparation of the data, the extended filter performs as explained above.
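The conversion and filtering pipeline of Sections 4.2.3.1–4.2.3.3 can be sketched in a few lines. The calibration RSSI values and the low-pass smoothing constant below are illustrative assumptions; the variances (VAR = 625, SENSVAR ≈ 5, RSSIVAR ≈ 8) are the values reported above, and the filter follows the scalar predict/update cycle rather than the extended linearization:

```python
import math

# Illustrative calibration values; real values come from the calibration step.
RSSI_D0, D0 = -55.0, 100.0   # RSSI measured at d0 = 100 cm
RSSI_DC, DC = -49.0, 50.0    # RSSI measured at dc = 50 cm

# Path-loss exponent, Eq. (4.2)
NU = (RSSI_DC - RSSI_D0) / (10 * math.log10(D0 / DC))

def rssi_to_distance(rssi):
    """Invert the log-distance model, Eq. (4.3); returns distance in cm."""
    return D0 * 10 ** ((RSSI_D0 - rssi) / (10 * NU))

def low_pass(prev, new, alpha=0.25):
    """Simple exponential smoothing; alpha is an assumed tuning value."""
    return prev + alpha * (new - prev)

class ScalarKalman:
    """The predict/update cycle described in Section 4.2.3.3."""
    def __init__(self, rssi_i, var=625.0, sens_var=5.0, rssi_var=8.0):
        self.est, self.var = rssi_i, var
        self.sens_var, self.rssi_var = sens_var, rssi_var

    def step(self, measured, u=0.0):
        # Prediction: drift by expected movement u, inflate variance.
        self.est += u
        self.var += self.rssi_var
        # Update: variance-weighted combination of estimate and measurement.
        self.est = ((self.var * measured + self.sens_var * self.est)
                    / (self.var + self.sens_var))
        self.var = 1.0 / (1.0 / self.var + 1.0 / self.sens_var)
        return self.est
```

With these calibration values, `rssi_to_distance(RSSI_D0)` returns 100 cm and `rssi_to_distance(RSSI_DC)` returns 50 cm, as the model requires.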
The benefit of the extended Kalman filter lies not only in its ability to more accurately predict the nonlinear RSSI values, but also in being able to better adapt to noise generated by sensor movement [56].

4.2.4 Coordination

With the low-pass and Kalman filters smoothing RSSI values, the distance values the RPERTs receive are usable. The values do not perfectly correspond to a robot's true distance from a transmitter, but they are close enough to allow navigation: the RPERTs have a notion of distance from one another. This allows each RPERT to avoid collisions, but it does not fully solve the problem of navigating a domain; formation control using only inter-agent distance is not trivial [57]. Beneficially, as each RPERT is already communicating its trajectory coefficients wirelessly, each can append its current position coordinates to the end of each packet. To this end, the RPERTs elect one from amongst themselves to act as an RSSI Anchor (RA): a robot that does not move, and broadcasts its position coordinates (but not trajectory coefficients) to all neighbouring RPERTs.

An election is held in order to choose an RA. The algorithm to elect an RA from a group of RPERTs follows:

• Broadcast personal MAC address;

• Compare each incoming MAC address;

• If an incoming MAC address is lower than the personal MAC address, set RA_anchor = false;

• If an incoming MAC address is higher than the personal MAC address, idle; and,

• Stop broadcasting after n seconds.

RA election happens immediately after calibration. If an RA dies due to power loss or some other system failure, it would be prudent to immediately elect another RA; this functionality is not currently written, and is intended for future work. In addition, the problem of RPERTs being outside of one another's local neighbourhood is not covered in this work. In that case, leader election becomes more difficult, and work has been done on this problem.
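The election procedure above can be sketched as follows. It follows the bulleted rules (a robot withdraws as soon as it hears a MAC address lower than its own, so the lowest address heard during the broadcast window wins); the broadcast and timing mechanics are abstracted away, and the function name is ours:

```python
def elect_anchor(my_mac, heard_macs):
    """Return True if this RPERT should act as the RSSI Anchor (RA).

    my_mac:      this robot's MAC address (comparable integer)
    heard_macs:  MAC addresses received during the broadcast window
    """
    for mac in heard_macs:
        if mac < my_mac:
            return False   # a lower address exists; that robot takes the role
    return True            # no lower address heard: act as the anchor
```

Run on every robot over the same broadcast window, exactly one robot (the one with the lowest MAC address) returns True, becomes stationary, and begins broadcasting its coordinates.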
Multiple leader interactions are covered in [58, 59].

4.2.5 Coordinate Finding

Each robot receives a positional coordinate packet from the RA, from which it can also extract an RSSI value. This information lets the RPERT know both the coordinates of the RA and an estimate of its distance from it. This introduces a new problem, illustrated in Fig. 4.5: distance to a point does not inform a robot of its location on the circle centred on the RA. A robot measuring distance d can be located at any coordinates along the circle of radius d centred on the RA.

Figure 4.5: Equivalent Distance of Robots on Unit Circle

Each RPERT knows its absolute initial fixed coordinates. It receives its heading from its IMU, and is assumed to travel at constant speed. This would be enough to navigate naively by dead reckoning. This method of navigation involves motor timings: for example, move both motors for one second, then turn the right motor for one and a half seconds, to get forward movement followed by a turn measurable in radians according to the robot dynamics. Dead reckoning introduces an error that quickly propagates, as discussed above. To offset propagated error, the Euclidean distance Ed is computed between the RPERT's current coordinates and the RA's.
A second linear Kalman filter is used, with the Euclidean distance from the RPERT's current (x, y) coordinates to the RA as the initial sensor value, and the distance Rd generated from RSSI to the RA as the measured value. As before, this provides an updated value Δd that takes both Ed and Rd into account. We then add or subtract Δd to each coordinate as shown in Algorithm 2.

Algorithm 2: RPERT location-finding and navigation algorithm

    Broadcast M
    ANCHOR = true
    for n in N do
        receive Mn
        if Mn < M then
            ANCHOR = false
        end
    end
    Compute (x0, y0)
    if ANCHOR = false then
        while true do
            Compute (x1, y1)    // ergodic flocking algorithm
            Move to (x1, y1)
            Receive (xa, ya)
            Compute RSSIa
            Low-pass filter RSSIa
            Extended Kalman filter RSSIa
            Compute Rd
            Compute Ed
            Compute Δd
            if Ed > Rd then
                (x1, y1) = (x1 − Δd, y1 − Δd)
            else
                (x1, y1) = (x1 + Δd, y1 + Δd)
            end
        end
    else
        Broadcast (x0, y0)
    end

In Algorithm 2: M is the MAC address of the RPERT; Mn is the MAC address of another RPERT within ergodicity threshold range; ANCHOR is a flag toggled when a RPERT is elected as the RSSI anchor; (x0, y0) are the initial coordinates; (x1, y1) is the next state, computed using the ergodic flocking algorithm in Chapter 3; (xa, ya) is the packet sent by the anchor; RSSIa is the RSSI value extracted from that packet; Rd is the distance value converted from RSSIa; Ed is the Euclidean distance between (xa, ya) and (x1, y1); and Δd is the Kalman-filtered difference between Ed and Rd.

This method is a general approximation: it trades precision for simplicity and ease of calculation. This is necessary, as swarm robots do not tend to have powerful onboard microcontrollers. Precision is not required for the ergodic flocking implemented in this work, so this trade is acceptable.

4.2.6 Control

Each RPERT has two wheels that are driven independently. Each wheel can be driven forward or backward, allowing the RPERT to pivot in place.
Each wheel has an orientation of π/2 with respect to its axle. The mathematical models used to control the robot's navigation are purposely uncomplicated, and are described below [60].

Figure 4.6: Differential Drive Dynamics

S = (sr + sl) / 2.    (4.4)

This is the average speed of the robot, where sr is the speed of the right wheel and sl the speed of the left. The robot's intended angle of rotation is described as

θ = (sr − sl) / b + θ0,    (4.5)

where b = 14 cm is the line segment connecting the centres of the wheels (the axle), and θ0 is the initial heading of the robot. The coordinates of a robot can then be described as

x = S cos(θ) + x0,    (4.6)

y = S sin(θ) + y0,    (4.7)

where (x, y) are the intended x and y coordinates of the robot, and x0 and y0 are the robot's initial x and y coordinates.

Each RPERT weighs approximately 270 g, and as such its inertia is small enough that changes in speed are treated as instantaneous in this work. It is for this reason that the RPERTs are treated as having first-order dynamics, rather than second-order: acceleration is not taken into account when generating an ergodic trajectory.

4.3 Operating System

MINIOS [61] is the operating system installed on each RPERT. MINIOS is an operating system designed for academic use with the SAM4S Xplained Pro. It allows a program to be loaded onto a micro-SD card rather than having all RPERT behaviours hard-coded and sitting in kernel space. For this work, MINIOS loads the rpert logic program into memory on startup. This program gives the robot its domain specifications, navigational and dynamics logic, and the distributions to be followed for ergodicity.
MINIOS has been extended for this work; the following modifications were made to the kernel:

• Drivers and corresponding system calls: motor, IMU, RGB LED, IR optical sensor, XBee Pro;

• Low-pass filtering, Kalman filtering, IMU, RGB LED, IR optical sensor, XBee Pro, RSSI, and ergodicity functionality; and,

• Minor scheduler modifications regarding sleeping processes.

4.4 Ergodic Flocking Implementation

The RPERT platform was used in a series of three trials intended to replicate the ergodic flocking simulation results of Chapter 3. The three trials were:

1. The aggregation of RPERTs using ergodicity;

2. Linear ergodic flocking of RPERTs from one corner of a rectangular domain to the opposite corner; and,

3. Non-linear ergodic flocking of RPERTs in a square pattern around the domain, per the third simulation in Chapter 3.

The first two trials were successful, and approximate the simulation results. The third trial proved to be impossible, though the limiting factor was the RPERT hardware rather than the ergodic flocking algorithm.

4.4.1 Implementation Parameters

For all three trials, the same domain and ergodic parameters were used, with the exception of the initial RPERT starting coordinates and the distribution over the domain. The parameter values are shown in Table 4.3. The domain shown in Figure 4.7 was 160 cm × 160 cm.
Figure 4.7: Experimental Domain for Trials One and Two

            T-1                              T-2                           T-3
d           R(160)                           R(160)                        R(160)
φ           C(15, 15)                        Ci(15, 15)                    Ci,j(15, 15)
Ip(x, y)    (65, 30), (50, 135), (115, 95)   (5, 35), (25, 25), (30, 5)    NA
ψ           60                               60                            60
k           5                                5                             3

Table 4.3: Implementation Parameters

The parameters and variables for the trials are defined as follows:

• T-1, T-2, T-3 correspond to the trials;

• d is the domain boundary of the implementation;

• R(n) is an n × n rectangle (in this implementation, measured in cm);

• φ is the distribution over d;

• C(x, y) is the uniform circular distribution centred at (x, y);

• Ci(x, y) is a set of uniform circular distributions centred at (x + i, y + i), where i = 0, 1, 2, ...;

• Ci,j(x, y) is a set of uniform circular distributions centred at (x + i, y + j), where i, j = 0, 1, 2, ...;

• Ip(x, y) is the initial coordinates of RPERT p on d;

• ψ is the threshold for local coefficient generation: the maximum distance at which neighbours are considered for coefficient sharing (in this implementation, measured in cm); and,

• k is a multi-index along the x, y coordinates correlating to the number of Fourier coefficients; the higher k is, the better the approximation.

Ergodicity and trajectory generation in Chapter 3 rely on continuity of the domain and of trajectories over the domain. The domain and its distribution φ were discretized into 5 cm × 5 cm squares in order to facilitate quicker, more efficient computation of trajectories [27].

The collision-detecting IR sensors on the RPERTs were turned off during all three trials. The combined current draw of the motors, wireless device, IR sensors, and IMU was too much for the single battery powering each RPERT, which caused the RPERTs to reset intermittently. This had unintended consequences for Trial Three, discussed below.

4.4.2 Results

Trials one and two were successful in emulating ergodic aggregation and flocking behaviours. For each trial, the ideal trajectories generated by the simulator are supplied in contrast to the implemented robot trials.
These comparisons show that, though the robots do manage to achieve the required behaviours, the trajectories formed are not ideal in most cases. The trajectory discrepancies are due to the imprecise nature of dead-reckoning navigation and RSSI correction: ergodic trajectories at each time step are informed by the RPERT's current (often very approximate) coordinates.

4.4.2.1 Trial One

In trial one, the RPERTs aggregated into the lower left corner of the domain, as shown in Fig. 4.8f. Rather than have the RPERTs continue to cover the distribution once they had arrived, they were instructed to stop once they were aware that they were fully within the distribution. The green and yellow RPERTs (identifiers Marianna and Jean-Baptiste, respectively) turn and quickly find the distribution as intended. The red RPERT (identifier Horatio) appears to have received wildly inaccurate RSSI correction data, pulling it first towards the Y axis, then in the opposite direction. Eventually the Kalman filter manages to correct its coordinates, and the RPERT is able to find the distribution, as shown in Figure 4.8. A video of trial one can be found at: https://youtu.be/tyY35RD3ZxQ.

4.4.2.2 Trial Two

In trial two, the three RPERTs track a time-elapsed uniform circle distribution. Marianna and Horatio both physically end up outside of the domain due to the propagated error from dead-reckoning navigation. This is a limitation of the hardware. Even having moved outside the domain, both RPERTs still end up flocking with Jean-Baptiste from the lower left corner of the domain to the top right, as required. A video of trial two can be found at: https://youtu.be/YaW7KD5PBws.

4.4.2.3 Trial Three

In trial three, the RPERTs were to flock in a rectangular formation around the domain. This kind of nonlinear movement caused collision scenarios between robots, as they had to ensure they were covering the time-elapsed distribution as it changed directions.
Ergodic flocking does not replicate Reynolds' three rules of flocking [2], but rather emulates the desired behaviour: due to ergodicity-informed trajectory generation, the swarm moves toward a shared heading without dispersing. As long as the distance traveled by a robot in a single time interval is strictly less than the local neighbour threshold distance used in local coefficient generation, the Spectral Multiscale Coverage [14] control law ensures that no two robots generate the same coordinates at any one time step in their trajectories. In the same vein, each time interval is equal to the amount of time it takes a robot to move fully out of the location it occupied at the end of the last time interval. In this way, ergodic flocking avoids collision.

Though the ergodic flocking algorithm does not allow the agents involved to collide, this assumes that the agents are precisely where they believe themselves to be. To compensate, the IR sensor on each RPERT was intended as a safety backup, so that if agents did end up colliding due to faulty coordinates, they would still recognise that they had collided. Since the IR collision detector was disabled due to the board constantly resetting, rectangular flocking predictably caused the robots to collide with one another. This removed any hope of dead reckoning working as a navigation scheme: even with RSSI correction, in each flocking attempt the RPERTs' internal domain representation was too inaccurate after a collision to be usable. Ultimately the trial was abandoned.
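The no-collision precondition stated above, that per-interval travel must be strictly less than the local-neighbour threshold used in coefficient generation, can be expressed as a simple check. The function name and the sample values below are illustrative:

```python
def step_is_collision_safe(speed_cm_s, dt_s, psi_cm):
    """Check the ergodic-flocking no-collision precondition: the distance
    covered in one time interval must be strictly less than the
    local-neighbour threshold psi used in local coefficient generation."""
    return speed_cm_s * dt_s < psi_cm
```

For example, at the RPERT's lowest speed of 5 cm/s with the trial threshold ψ = 60 cm, any time interval shorter than 12 s satisfies the condition.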
Figure 4.8: Trial One: Aggregation Simulation vs Application. (Panels: simulation at time steps t = 25 and t = 45; implementation graphs at t = 25 and t = 70; implementation at t ≈ 25 and t ≈ 70.)

Figure 4.9: Trial Two: Flocking Simulation vs Application. (Panels: simulation at time steps t = 25 and t = 45; implementation at t = 10, t = 60, t ≈ 25, and t ≈ 70.)

Chapter 5

Conclusions and Future Work

Flocking tasks involve large groups of individuals aggregating and moving to a common target. This behaviour occurs as a consequence of local interactions between agents; schooling fish swim towards the centre of mass of their neighbours and towards the average heading of all neighbouring fish, while avoiding collisions with one another. These local interactions, when induced in a robotic swarm, typically require dedicated algorithmic behaviours.

We have shown that it is possible to emulate some swarm behaviours, specifically aggregation and flocking, with local ergodic trajectories informed by time-evolving uniform distributions. Robots can be aggregated into subsets of a bounded domain by having each robot create an ergodic trajectory that is informed by a static uniform distribution. This thesis provides a working simulation of swarm trajectory generation using only local coefficients, allowing for completely decentralized trajectory generation. In addition, a physical prototype was developed in order to show the viability of ergodic flocking in a real-life application.

In the future, we will explore inducing more swarm behaviours with ergodicity. We believe it is possible to emulate self-assembly and other behaviours with careful choice of distribution.
This would allow any robot swarm implementing local-coefficient ergodicity to emulate arbitrary swarm behaviours simply by changing the domain distribution. A different robot platform will be explored in the future: dead reckoning with RSSI correction was not suitable for non-linear flocking movements, so while we were able to simulate ergodic flocking behaviour, we could not verify it experimentally.

Bibliography

[1] A. Lioni, C. Sauwens, G. Theraulaz, and J.-L. Deneubourg, “Chain formation in Œcophylla longinoda,” Journal of Insect Behavior, vol. 14, no. 5, pp. 679–696, 2001.
[2] C. Reynolds, “Flocks, herds and schools: A distributed behavioral model,” SIGGRAPH: Proc. of the 14th Annual Conference on Computer Graphics and Interactive Techniques, vol. 21, no. 4, pp. 25–34, 1987.
[3] P. Walker, S. A. Amraii, M. Lewis, N. Chakraborty, and K. Sycara, “Control of swarms with multiple leader agents,” 2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 3567–3572, 2014.
[4] L. Bayindir, “A review of swarm robotics tasks,” Neurocomputing, vol. 172, no. 8, pp. 292–321, 2016.
[5] J. McLurkin, J. Smith, J. Frankel, D. Sotkowitz, D. Blau, and B. Schmidt, “Speaking swarmish: Human-robot interface design for large swarms of autonomous robots,” 2006 AAAI Spring Symposium, pp. 72–75, 2006.
[6] M. Brambilla, E. Ferrante, M. Birattari, and M. Dorigo, “Swarm robotics: A review from the swarm engineering perspective,” Swarm Intelligence, vol. 7, no. 3, pp. 1–41, 2013.
[7] S. Miah and J. Knoll, “Area coverage optimization using heterogeneous robots: Algorithm and implementation,” IEEE Transactions on Instrumentation and Measurement, vol. 67, no. 6, pp. 1380–1388, 2018.
[8] L. M. Miller, “Optimal ergodic control for active search and information acquisition,” PhD dissertation, Northwestern University, 2014.
[9] E. Gelenbe and Y. Cao, “Autonomous search for mines,” European Journal of Operational Research, vol. 108, no. 2, pp. 319–333, 1998.
[10] J. Kennedy and R. C. Eberhart, Swarm Intelligence. Morgan Kaufmann, 2001.
[11] E. Bonabeau, M. Dorigo, and G. Theraulaz, Swarm Intelligence: From Natural to Artificial Systems. Oxford University Press, 1999.
[12] E. Ferrante, A. E. Turgut, C. Huepe, A. Stranieri, C. Pinciroli, and M. Dorigo, “Self-organized flocking with a mobile robot swarm: a novel motion control method,” Adaptive Behavior, vol. 20, no. 6, pp. 460–477, 2012.
[13] L. A. V. Reyes and H. G. Tanner, “Flocking, formation control, and path following for a group of mobile robots,” IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1268–1282, 2015.
[14] G. Mathew and I. Mezić, “Metrics for ergodicity and design of ergodic dynamics for multi-agent systems,” Physica D: Nonlinear Phenomena, vol. 240, no. 5, pp. 432–442, 2011.
[15] A. Winfield and J. Nembrini, “Safety in numbers: Fault tolerance in robot swarms,” International Journal of Modelling, Identification and Control, vol. 1, no. 1, pp. 30–37, 2006.
[16] B. A. Karihaloo, K. Zhang, and J. Wang, “Honeybee combs: how the circular cells transform into rounded hexagons,” Journal of the Royal Society Interface, vol. 10, no. 86, pp. 201–302, 2013.
[17] G. F. Young, L. Scardovi, A. Cavagna, I. Giardina, and N. E. Leonard, “Starling flock networks manage uncertainty in consensus at low cost,” PLOS Computational Biology, vol. 9, no. 1, pp. 1–7, 2013.
[18] D. Graff, D. Röhrig, R. Jasper, H. Parzyjegla, G. Mühl, and J. Rabaey, “Operating system support for mobile robot swarms,” Proceedings of the Second International Workshop on the Swarm at the Edge of the Cloud (SWEC), pp. 7–12, 2015.
[19] A. Becker, C. Onyuksel, T. Bretl, and J. McLurkin, “Controlling many differential drive robots with uniform control inputs,” The International Journal of Robotics Research, vol. 33, no. 13, pp. 1626–1644, 2014.
[20] D. Maftuleac, S. K. Lee, S. P. Fekete, A. K. Akash, A. López-Ortiz, and J. McLurkin, “Local policies for efficiently patrolling a triangulated region by a robot swarm,” 2015 IEEE International Conference on Robotics and Automation (ICRA), pp. 1809–1815, 2015.
[21] M. Rubenstein, A. Cornejo, and R. Nagpal, “Programmable self-assembly in a thousand-robot swarm,” Science, vol. 345, no. 8, pp. 795–799, 2015.
[22] S. Nuske, S. Choudhury, S. Jain, A. Chambers, L. Yoder, S. Scherer, L. Chamberlain, H. Cover, and S. Singh, “Autonomous exploration and motion planning for an unmanned aerial vehicle navigating rivers,” Journal of Field Robotics, vol. 32, no. 8, pp. 1141–1162, 2015.
[23] L. E. Parker, “Reliability and fault tolerance in collective robot systems,” ch. 6, pp. 612–644, Stanford, USA: Jenny Stanford Publishing, 2011.
[24] L. S. Schulman and P. E. Seiden, “Statistical mechanics of a dynamical system based on Conway’s game of life,” Journal of Statistical Physics, vol. 19, no. 3, pp. 293–314, 1978.
[25] A. Ilachinski, Cellular Automata: A Discrete Universe. World Scientific, 2001.
[26] K. E. Petersen, Ergodic Theory. Cambridge University Press, 1990.
[27] L. Dressel and M. J. Kochenderfer, “Tutorial on the generation of ergodic trajectories with projection-based gradient descent,” IET Cyber-Physical Systems: Theory & Applications, 2018.
[28] L. M. Miller, Y. Silverman, M. A. MacIver, and T. D. Murphey, “Ergodic exploration of distributed information,” IEEE Transactions on Robotics, vol. 32, no. 1, pp. 36–52, 2016.
[29] L. M. Miller and T. D. Murphey, “Trajectory optimization for continuous ergodic exploration,” 2013 American Control Conference, pp. 4196–4201, 2013.
[30] M. Vecchio and R. López-Valcarce, “Improving area coverage of wireless sensor networks via controllable mobile nodes: A greedy approach,” Journal of Network and Computer Applications, vol. 48, no. 1, pp. 1–13, 2015.
[31] L. Dressel, “ErgodicControl.jl,” https://github.com/dressel/ErgodicControl.jl, last accessed: 12.03.2019.
[32] L. M. Miller and T. D. Murphey, “Trajectory optimization for continuous ergodic exploration on the motion group SE(2),” IEEE Int. Conf. on Decision and Control, pp. 4517–4522, 2013.
[33] I. Abraham, A. Prabhakar, M. J. Z. Hartmann, and T. D. Murphey, “Ergodic exploration using binary sensing for non-parametric shape estimation,” IEEE Robotics and Automation Letters, vol. 2, no. 2, pp. 827–834, 2017.
[34] H. Salman, E. Ayvali, and H. Choset, “Multi-agent ergodic coverage with obstacle avoidance,” Twenty-Seventh International Conference on Automated Planning and Scheduling, pp. 242–249, 2017.
[35] G. Antonello, S. Chiaverini, and A. Marino, “A coordination strategy for multi-robot sampling of dynamic fields,” 2012 IEEE International Conference on Robotics and Automation, pp. 1113–1118, 2012.
[36] A. Hubenko, V. A. Fonoberov, G. Mathew, and I. Mezić, “Multiscale adaptive search,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 41, no. 4, pp. 1076–1087, 2011.
[37] E. Ayvali, H. Salman, and H. Choset, “Ergodic coverage in constrained environments using stochastic trajectory optimization,” IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5204–5210, 2017.
[38] G. De La Torre, K. Flaßkamp, A. Prabhakar, and T. D. Murphey, “Ergodic exploration with stochastic sensor dynamics,” 2016 American Control Conference (ACC), pp. 2971–2976, 2016.
[39] V. Gazi, B. Fidan, L. Marques, and R. Ordonez, “Robot swarms: Dynamics and control,” in Mobile Robots for Dynamic Environments, ch. 9, pp. 1–32, New York, USA: ASME Publishing, 2015.
[40] P. J. S. Goncalves, P. J. D. Torres, C. M. O. Alves, and F. Mondada, “The e-puck, a robot designed for education in engineering,” Proceedings of the 9th Conference on Autonomous Robot Systems and Competitions, vol. 1, no. 1, pp. 59–65, 2009.
[41] M. Rubenstein, B. Cimino, R. Nagpal, and J. Werfel, “AERobot: An affordable one-robot-per-student system for early robotics education,” 2015 IEEE International Conference on Robotics and Automation (ICRA), pp. 6107–6113, 2015.
[42] F. Mondada, A. Guignard, M. Bonani, M. Lauria, and D. Floreano, “Swarm-bot: from concept to implementation,” Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003), pp. 1626–1631, 2003.
[43] G. Caprari, P. Palmer, R. Piguet, and R. Siegwart, “The autonomous micro robot “Alice”: a platform for scientific and commercial applications,” MHA’98. Proceedings of the 1998 International Symposium on Micromechatronics and Human Science, pp. 231–235, 1998.
[44] A. Özgür, S. Lemaignan, W. Johal, M. Beltran, M. Briod, L. Pereyre, F. Mondada, and P. Dillenbourg, “Cellulo: Versatile handheld robots for education,” HRI ’17 Proceedings of the 2017 ACM/IEEE International Conference on Human-Robot Interaction, pp. 119–127, 2017.
[45] F. Arvin, J. Murray, C. Zhang, and S. Yue, “Colias: An autonomous micro robot for swarm robotic applications,” International Journal of Advanced Robotic Systems, vol. 11, no. 7, pp. 635–645, 2014.
[46] F. Arvin, K. Samsudin, and A. R. Ramli, “Development of a miniature robot for swarm robotic application,” International Journal of Computer and Electrical Engineering, vol. 1, no. 4, pp. 1793–8163, 2009.
[47] J. McLurkin, A. J. Lynch, S. Rixner, T. W. Barr, A. Chou, K. Foster, and S. Bilstein, “A low-cost multi-robot system for research, teaching, and outreach,” in Distributed Autonomous Robotic Systems, ch. 4, pp. 597–609, Berlin, Heidelberg: Springer, 2013.
[48] ARM, Cortex-M4 - Revision r0p0: Technical Reference Manual, 2010. Rev. 2.
[49] Bosch Sensortec, BNO055 Intelligent 9-axis Absolute Orientation Sensor Datasheet, 2016. Rev. 1.4.
[50] Digi International Inc, XBee®/XBee-PRO® RF Modules: Product Manual, 2009. Rev. v1.xEx - 802.15.4.
[51] H. Teimoori and A. V. Savkin, “Distance-only based navigation of wheeled mobile robots with obstacle avoidance,” 2008 IEEE International Conference on Robotics and Biomimetics, pp. 1956–1961, 2008.
[52] J. Svečko, M. Malajner, and D. Gleich, “Distance estimation using RSSI and particle filter,” ISA Transactions, vol. 55, no. 3, pp. 275–285, 2015.
[53] A. Cornejo and R. Nagpal, “Distributed range-based relative localization of robot swarms,” in Algorithmic Foundations of Robotics XI. Springer Tracts in Advanced Robotics, pp. 91–107, Heidelberg, New York, London: Springer, 2015.
[54] L. Oliveira, H. Li, L. Almeida, and T. E. Abrudan, “RSSI-based relative localisation for mobile robots,” Ad Hoc Networks, vol. 13, no. 2, pp. 321–335, 2014.
[55] O. S. Oguejiofor, V. N. Okorogu, A. Adewale, and B. O. Osuesu, “Outdoor localization system using RSSI measurement of wireless sensor network,” International Journal of Innovative Technology and Exploring Engineering (IJITEE), vol. 2, no. 2, pp. 1–6, 2013.
[56] M. B. Kilani, A. J. Raymond, F. Gagnon, and G. Gagnon, “RSSI-based indoor tracking using the extended Kalman filter and circularly polarized antennas,” 2014 11th Workshop on Positioning, Navigation and Communication (WPNC), pp. 1–6, 2014.
[57] M. Cao, C. Yu, and B. D. Anderson, “Formation control using range-only measurements,” Automatica, vol. 47, no. 4, pp. 776–781, 2011.
[58] W. Ni and D. Cheng, “Leader-following consensus of multi-agent systems under fixed and switching topologies,” Systems & Control Letters, vol. 59, no. 4, pp. 209–217, 2010.
[59] L. Ding, Q.-L. Han, and G. Guo, “Network-based leader-following consensus for distributed multi-agent systems,” Automatica, vol. 49, no. 7, pp. 2281–2286, 2013.
[60] M. W. Spong, S. Hutchinson, and M. Vidyasagar, Robot Modeling and Control. Wiley, 2005.
[61] R. R. Otero and A. A. Aravind, “MiniOS: An instructional platform for teaching operating systems projects,” SIGCSE ’15 Proceedings of the 46th ACM Technical Symposium on Computer Science Education, pp. 430–435, 2015.
[62] D. Krupke, M. Hemmer, J. McLurkin, Y. Zhou, and S. P. Fekete, “A parallel distributed strategy for arraying a scattered robot swarm,” 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2795–2802, 2015.
[63] J. McLurkin et al., “A robot system design for low-cost multi-robot manipulation,” 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 912–918, 2014.
[64] M. W. Lo, “The coverage of elliptical orbits using ergodic theory,” 2004 IEEE Aerospace Conference Proceedings, pp. 541–546, 2004.
[65] S. Ivić, B. Crnković, and I. Mezić, “Ergodicity-based cooperative multiagent area coverage via a potential field,” IEEE Transactions on Cybernetics, vol. 47, no. 8, pp. 1983–1993, 2017.
[66] M. Gerla, T. J. Kwon, and G. Pei, “On-demand routing in large ad hoc wireless networks with passive clustering,” 2000 IEEE Wireless Communications and Networking Conference, pp. 100–105, 2000.
[67] D. A. Shell, C. V. Jones, and M. J. Matarić, “Ergodic dynamics by design: A route to predictable multi-robot systems,” Multi-Robot Systems: From Swarms to Intelligent Automata, vol. 3, pp. 291–297, 2005.