Space Partitioning Trees

Overview:

Spatial Trees are a recursive space partitioning datastructure that can help organize high-dimensional data. They can assist in analyzing the underlying data density, perform fast nearest-neighbor searches, and do high quality vector-quantization.

There are several instantiations of spatial trees. The most popular include KD trees, RP trees (random projection trees), PD trees (principal direction trees). Each have their strengths and weaknesses, which have been studied in details in several publications (see below).

Code:

   Python: A Python implementation of the most popular types of spatial trees is provided by Brian McFee. Click here.

   C: A C implementation of RP-trees is available (that is no longer supported). Click here.

   Matlab: A Matlab implementation of the most popular types of spatial trees is available here.

Relevant Publications:

Learning the structure of manifolds using random projections.
Y. Freund, S. Dasgupta, M. Kabra and N. Verma.
In Neural Information Processing Systems (NIPS), 2007. [pdf]

Random projection trees and low dimensional manifolds.
S. Dasgupta and Y. Freund.
In ACM Symposium on Theory of Computing (STOC), 2008. [pdf]

Which spatial partition trees are adaptive to intrinsic dimension?
N. Verma, S. Kpotufe and S. Dasgupta.
In Uncertainty in Artificial Intelligence (UAI), 2009. [pdf]

Large-scale music similarity search with spatial trees.
B. McFee and G. Lanckriet.
In 12th International Society for Music Information Retrieval (ISMIR) conference, 2011. [pdf]

An investigation of practical approximate nearest neighbor algorithms.
T. Liu, A. Moore, A. Gray and K. Yang
In Neural Information Processing Systems, 2005. [pdf]

Last Modified on: Nov 30, 2011.