The Action-Reaction Learning system functions as a server which receives real-time multi-dimensional data from the vision systems and redistributes it to the graphical systems for rendering. Typically, during training, two vision systems and two graphics systems are connected to the ARL server. It is therefore natural to consider the signals and their propagation as multiple temporal data streams. Within the ARL server, perceptual data (tracked motions from the vision systems) are accumulated into a finite-length time series of measurements.
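A minimal sketch of such an accumulator, assuming Python; the class name, window bound, and method names are illustrative stand-ins, not part of the ARL implementation:

```python
from collections import deque

class PerceptualBuffer:
    """Accumulates incoming measurement vectors into a
    finite-length time series (oldest samples are discarded)."""

    def __init__(self, maxlen=120):
        # maxlen bounds the stored history; a bounded deque drops
        # the oldest entry automatically once the bound is reached
        self.series = deque(maxlen=maxlen)

    def push(self, vector):
        # one multi-dimensional measurement arrives per time step
        self.series.append(vector)

    def history(self):
        # snapshot of the stored time series, oldest first
        return list(self.series)

buf = PerceptualBuffer(maxlen=3)
for t in range(5):
    buf.push([float(t)] * 4)  # toy 4-dimensional measurements
# only the 3 most recent vectors remain in the buffer
```

The bounded deque gives constant-time appends while automatically enforcing the finite history length, which matches the server's role of retaining only a recent window of measurements.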
For the head and hand tracking case, the vision systems generate two triples of Gaussian blobs (one triple for each human), which together form 30 continuous scalar parameters that evolve as a multi-dimensional time series. Each set of 30 scalar parameters can be considered a 30-dimensional vector y(t) arriving into the ARL engine from the vision systems at a given time t. The ARL system preprocesses and then trains from this temporal series of vectors. However, certain issues arise when processing time series data and dealing with the temporal evolution of multi-dimensional parameters. The representation of this data is critical for reliably predicting and forecasting the evolution of the time series or, equivalently, estimating the parameters of the 6 blobs in the near future.
of the 6 blobs in near future. The future value of
will be
referred to as
and it must be forecasted from several
measurements of the previous vectors
which will form a window of perceptual history.
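Under these definitions, learning a forecaster reduces to mapping each window of past vectors to the vector that follows it. The assembly of such (history, target) pairs might be sketched as follows; the function name and the window length of 3 are illustrative assumptions, and toy 2-dimensional vectors stand in for the 30-dimensional blob parameters:

```python
def make_windows(series, window):
    """Split a time series of vectors into (history, target) pairs:
    each run of `window` consecutive vectors is paired with the
    vector that immediately follows it, i.e. the value to forecast."""
    pairs = []
    for t in range(window, len(series)):
        past = series[t - window:t]   # the window of perceptual history
        future = series[t]            # the vector to be predicted
        pairs.append((past, future))
    return pairs

# toy series of 2-dimensional vectors standing in for the
# 30-dimensional blob parameter vectors y(t)
series = [[float(t), float(-t)] for t in range(6)]
pairs = make_windows(series, window=3)
# yields 3 training pairs: windows ending at t=2, 3, 4
# predict the vectors at t=3, 4, 5 respectively
```

Each pair is one supervised training example: the flattened history window is the input, and the next measurement vector is the regression target.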