Abstract
This paper introduces the problem of modeling video sequences
of dynamic swarms (DSs). We define a DS as
a large layout of stochastically repetitive spatial configurations of dynamic
objects (swarm elements) whose motions
exhibit local spatiotemporal interdependency and stationarity, i.e., the
motions are similar in any small
spatiotemporal neighborhood. Examples of DSs abound in nature, e.g., herds of
animals and flocks of birds. To capture
the local spatiotemporal properties of the DS, we present a probabilistic model that learns both the spatial layout of
swarm elements (based on low-level image segmentation) and their joint dynamics, which are modeled as
linear transformations. To this end, a spatiotemporal neighborhood is associated with each swarm element, in
which local stationarity is enforced both spatially and temporally. We place a Markov random field (MRF) prior on the
swarm dynamics in both space and time. Embedding this model in a maximum a posteriori (MAP)
framework, we iterate between learning the spatial layout of the swarm and its dynamics. We learn the
swarm transformations using iterated conditional modes (ICM), which iterates
between estimating these transformations and updating their distribution
in the spatiotemporal neighborhoods. We
demonstrate the validity of our method by conducting experiments on real and
synthetic video sequences. Real
sequences of birds, geese, robot swarms, and pedestrians are used to evaluate the
applicability of our model to real-world
data.
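
To illustrate the ICM step described above, the following is a minimal sketch under simplifying assumptions and not the model used in this paper. It assumes that each swarm element has a translation-only motion parameter, that a noisy per-element estimate `y` is available, and that the MRF prior is a quadratic smoothness penalty between neighboring elements. The names `energy`, `icm`, `lam`, and `neighbors` are hypothetical and chosen for illustration only.

```python
import numpy as np

def energy(a, y, neighbors, lam):
    """Assumed energy: data term plus quadratic smoothness over neighbor pairs."""
    data = np.sum((a - y) ** 2)
    # Count each undirected pair (i, j) once; neighbor lists are symmetric.
    smooth = sum(np.sum((a[i] - a[j]) ** 2)
                 for i, nbrs in enumerate(neighbors) for j in nbrs if i < j)
    return data + lam * smooth

def icm(y, neighbors, lam=1.0, n_sweeps=20):
    """Iterated conditional modes: update one element at a time,
    holding the parameters of its neighbors fixed."""
    a = y.copy()
    for _ in range(n_sweeps):
        for i, nbrs in enumerate(neighbors):
            # Closed-form conditional mode for the assumed quadratic energy.
            nbr_sum = sum(a[j] for j in nbrs)
            a[i] = (y[i] + lam * nbr_sum) / (1.0 + lam * len(nbrs))
    return a

# Toy example: five elements on a chain, each with a noisy 2-D translation.
rng = np.random.default_rng(0)
y = np.array([1.0, 0.0]) + 0.3 * rng.standard_normal((5, 2))
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
a_hat = icm(y, neighbors, lam=2.0)
```

Because each update minimizes the assumed energy with respect to a single element, the energy cannot increase from one sweep to the next. ICM converges to a local optimum in general, so the result depends on the initialization.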