Recall the formalism established in the previous chapters. In the discussion of time series processing, a vector notation was employed in which one vector describes the past short-term memory and another describes the (future) measurements of both users. We re-use this notation and assume that two users (user A and user B) are connected to the ARL system. The two corresponding vision systems independently recover the perceptual measurements of user A and of user B, which are fed through the learning system to two graphics systems in real time. We enumerate the functionality of each component of the system in terms of this notation.
Each vision system generates, for its user, a vector of instantaneous perceptual measurements: the vision system on user A generates user A's measurement vector and the vision system on user B generates user B's.
Each graphics system graphically synthesizes a vector of perceptual measurements, for example those of either user A or user B.
The temporal representation module accumulates the actions of both users by concatenating user A's and user B's measurement vectors into a single joint vector. It also stores many of these joint vectors in a short-term memory. The module then pre-processes the short-term memory with decay and dimensionality reduction to form a compact vector.
The learning system learns from the past short-term memory and the immediately subsequent vector of measurements from both users (generated by users A and B). Using the CEM machinery, many such past-future pairs from a few minutes of interaction form a conditional probability density of the future given the past. We can use this model to compute a predicted measurement vector, the immediate future, for any observed short-term past.
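As an illustration of the accumulation step described above, the following is a minimal sketch in Python, not the implementation used in the ARL system. The class name `ShortTermMemory`, the parameters `window`, `decay` and `basis`, and the choice of an exponential decay are all assumptions made for the example; the projection rows in `basis` are assumed to have been computed elsewhere (for instance by a principal components analysis).

```python
from collections import deque


class ShortTermMemory:
    """Keeps the most recent joint measurement vectors and compresses them.

    All names and parameters here are hypothetical, chosen for illustration.
    """

    def __init__(self, window, decay, basis=None):
        # A bounded deque drops the oldest frame once `window` frames are held.
        self.frames = deque(maxlen=window)
        self.decay = decay    # assumed per-step attenuation factor in (0, 1]
        self.basis = basis    # assumed precomputed projection rows, or None

    def push(self, y_a, y_b):
        # Concatenate user A's and user B's measurements into one joint vector.
        self.frames.append(list(y_a) + list(y_b))

    def compact(self):
        # Walk from newest to oldest; a frame k steps old is scaled by decay**k.
        flat = []
        for age, frame in enumerate(reversed(self.frames)):
            weight = self.decay ** age
            flat.extend(weight * value for value in frame)
        if self.basis is None:
            return flat
        # Linear dimensionality reduction: one dot product per basis row.
        # Each row is assumed to match the length of a full window.
        return [sum(b * v for b, v in zip(row, flat)) for row in self.basis]
```

Calling `push` once per frame and `compact` whenever a prediction is needed yields the compact description of the past that the learning system consumes.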
In summary, the vision systems (one per user) produce the measurement components of user A and user B, which are concatenated into a single joint vector. This forms a stream which is fed into a temporal representation. Recall that an accumulation of the past was denoted Y(t). It is then processed with decay and dimensionality reduction to generate a compact vector. A learning system then learns the conditional density of the future given the past from many example pairs of compact past and subsequent measurement. This allows it to later compute an estimate of the immediate future given the current compact past. Finally, for synthesis, the estimate is broken down into a user A component and a user B component, which are predictions for each user. It is critical to note, however, that this estimate is instantaneous and, on its own, does not generate any finite-length, meaningful action or gesture.
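The prediction and splitting steps summarized above can be sketched as follows. This sketch deliberately does not implement CEM: as a simple stand-in it uses a Gaussian-kernel weighted average of the stored futures (a Nadaraya-Watson style estimate of the conditional mean). The function names `predict_future` and `split_users` and the parameters `bandwidth` and `dim_a` are hypothetical.

```python
import math


def predict_future(x_now, pairs, bandwidth=1.0):
    """Estimate the expected future given the compact past `x_now`.

    `pairs` is a list of (past, future) example vectors. This is a kernel
    regression stand-in for the learned conditional density, not CEM itself.
    """
    weights = []
    for past, _ in pairs:
        dist2 = sum((a - b) ** 2 for a, b in zip(past, x_now))
        weights.append(math.exp(-0.5 * dist2 / bandwidth ** 2))
    total = sum(weights)  # assumed non-zero: x_now lies near some example
    dim = len(pairs[0][1])
    return [
        sum(w * future[i] for w, (_, future) in zip(weights, pairs)) / total
        for i in range(dim)
    ]


def split_users(y_hat, dim_a):
    # The first dim_a entries are taken as user A's prediction, the rest as B's.
    return y_hat[:dim_a], y_hat[dim_a:]


# Two training pairs; the query lies midway between the two stored pasts.
examples = [([0.0], [1.0, 10.0]), ([1.0], [3.0, 30.0])]
y_hat = predict_future([0.5], examples)
y_hat_a, y_hat_b = split_users(y_hat, dim_a=1)
```

As in the text, each call returns only a single instantaneous estimate for the two users; producing an extended gesture requires more than one such call in isolation.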