Structure from Motion (SfM) research is motivated not only by the long-term goals of computer vision, AI and 3D visual understanding but also by many practical applications, which presently drive much of the work in the field. Below, we illustrate example applications. Some of these are still in their early stages of development, while others are quickly becoming commercially viable techniques in industry.
Many techniques exist for scanning real-world objects to form 3D computer graphics models. These range from 3D laser scanning to depth-from-defocus estimation. Structure from Motion is an important alternative and has been used to flexibly construct 3D coordinates and 3D models from 2D imagery of real objects. One demonstration is Debevec, Taylor and Malik's [13] reconstruction of Berkeley's Campanile clock tower and the surrounding campus via photogrammetric techniques.
The recovery of 3D motion parameters in the SfM framework can also be used to drive 3D models for animation purposes. Virtual objects can be affixed to real ones in the scene [6] (Figure 2), or computer graphics animations can be visually controlled [5] (Figure 1). Such techniques are currently being integrated into standard computer graphics software for use in film, video, games, interactive media, industrial design and visualization. In addition, motion matching can be used in virtual and augmented reality environments. For example, Kutulakos [35] describes a system with a see-through head-mounted display in which 3D objects are superimposed on the user's scene in real time.
Recovering a camera's external and internal parameters is another practical application. The external parameters describe the camera's position and orientation in 3D real-world coordinates, while the internal parameters include variables such as focal length. In the field of Active Vision, where cameras are expected to move around and zoom in autonomously, automatic re-calibration is crucial [8].
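To make the distinction between the two parameter sets concrete, the following is a minimal sketch of an idealized pinhole projection. It is an illustration under stated assumptions, not the calibration method of [8]: the function name `project_point`, its arguments, and the simplified internal model (a single focal length plus a principal point, with no skew or lens distortion) are all hypothetical choices made for this example.

```python
from typing import Sequence, Tuple


def project_point(
    point_world: Sequence[float],
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
    focal_length: float,
    principal_point: Tuple[float, float] = (0.0, 0.0),
) -> Tuple[float, float]:
    """Project a 3D world point to 2D image coordinates (hypothetical helper).

    External parameters: `rotation` (3x3) and `translation` (3-vector),
    mapping world coordinates into the camera frame.
    Internal parameters: `focal_length` and `principal_point`
    (assumed model: no skew, no lens distortion).
    """
    # External step: X_cam = R * X_world + t
    x_cam = [
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if x_cam[2] <= 0:
        raise ValueError("point lies behind the camera")

    # Internal step: perspective division scaled by the focal length
    u = focal_length * x_cam[0] / x_cam[2] + principal_point[0]
    v = focal_length * x_cam[1] / x_cam[2] + principal_point[1]
    return (u, v)


# Example: camera at the world origin, looking along +Z
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_point = project_point((1.0, 2.0, 4.0), IDENTITY, (0.0, 0.0, 0.0), 100.0)
```

In this simplified model, doubling `focal_length` (zooming in) doubles the distance of the projected point from the principal point, while changing `rotation` or `translation` corresponds to moving the camera. This is why a camera that both moves and zooms needs both parameter sets re-estimated.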
In many computer vision applications, the SfM paradigm is a useful computational sub-component. The recovered 3D reconstruction need not be the final goal of a vision system; it can be an important intermediate step whose output is fed back and fed forward to other vision modules. Thus, it can help in tracking, recognition and modeling (for example, see [32]).
Vision-based interfaces for human-computer interaction are another potential application: 3D recovery techniques can be used to identify user gestures that complement the traditional keyboard and mouse [57].
In robotics, which includes hand-eye coordination tasks, navigation and obstacle detection, recovering 3D scene structure is an important intermediate step. Related work has been done by Wells [59] and Beardsley et al. [7].
Estimating 3D parameters that describe an image sequence is an important way to compactly encode information about the scene. This representation can then be used for low bit-rate communication, compression and noise reduction (see ``A Review of Object-based Coding of 3D sequences'', this issue).
In photogrammetry, multiple images of a landscape or scene are taken and must be aligned into a large composite image. SfM estimates the displacements between the images and aligns them so that they can be re-projected into a single large image. Similarly, imagery can be mosaicked into a larger scene, as in [53].