The system's building blocks are depicted in Figure 1. The following describes the audio-visual association module and the object recognition algorithm used, and gives a short overview of the hardware.