The Columbia Fast Query Project
Principal Investigator: Kenneth A. Ross
The aim of this project is to make queries on databases fast, in the
face of:
- Huge data sets, typical of data warehouse applications
- Complex queries
- Sophisticated machines (with complex and sometimes delicate
performance parameters) running the database code.
Our approach to the problem has focused on the following technologies:
- Databases on Multi-Threaded, Multi-Core Machines.
Power consumption issues are forcing CPU chip
designers to increase performance in new ways. Rather than simply
increasing clock speeds, they are placing multiple CPU cores on a
single chip, and are providing hardware to run multiple logical threads
on each CPU. Running a database system on such an architecture can lead
to interference, because the CPUs on a chip and the threads within each
CPU share (and thus compete for) resources. We are examining ways to
reduce or eliminate such interference for query-intensive workloads.
- Architecture-Sensitive Databases.
Now that very large databases can fit in the
RAM of cheap machines, one can develop database techniques that perform
well in that context. For example, one might try to design query
processing algorithms that yield good data reference locality, and
hence low CPU cache miss rates. Other relevant architectural features
include branch-misprediction effects and the availability of SIMD
instructions (such as SSE on x86 platforms).
- Materialized Views.
One can potentially answer queries faster by
using stored answers to commonly used query subexpressions.
These stored answers are called materialized views.
- Query Processing Algorithms and Query Optimization Techniques.
By developing new algorithms for common and/or critical database
operations, one can improve query performance. Choosing a good
evaluation plan for a complex query is a difficult problem. New
optimization strategies could help find better plans.
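
To make the materialized-view idea in the list above concrete, here is a
minimal sketch in Python using the standard sqlite3 module. It is an
illustration only and is not taken from the project's software. The
sales table, the region_totals table, and all function names are
hypothetical. SQLite has no built-in materialized views, so an ordinary
table created from a query stands in for one here.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, made-up sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("east", "widget", 10),
            ("east", "gadget", 20),
            ("west", "widget", 5),
            ("west", "gadget", 15),
            ("west", "widget", 25),
        ],
    )
    return conn


def materialize_region_totals(conn):
    """Store the answer to a commonly used subexpression as a table.

    Assumption: a plain table plays the role of a materialized view,
    because SQLite does not provide one natively.
    """
    conn.execute(
        "CREATE TABLE region_totals AS "
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
    )


def total_from_base(conn, region):
    """Answer the query by scanning and aggregating the base table."""
    row = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE region = ?", (region,)
    ).fetchone()
    return row[0]


def total_from_view(conn, region):
    """Answer the same query with a lookup in the stored result."""
    row = conn.execute(
        "SELECT total FROM region_totals WHERE region = ?", (region,)
    ).fetchone()
    return row[0] if row else None


conn = build_demo_db()
materialize_region_totals(conn)
# Both paths return the same total; the second avoids re-aggregating sales.
print(total_from_base(conn, "west"), total_from_view(conn, "west"))
```

Both functions return the same answer, but the second reads one stored
row instead of aggregating the base table. The stored table is only
correct as long as it is refreshed when sales changes, which this sketch
does not attempt.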
Available Software.
Related Projects
More information can be found in our publications.
Database Research Group
This material is based in part upon work supported by the National
Science Foundation under Grants IRI-9457613, IIS-9812014, IIS-0120939,
and IIS-0534389.
Any opinions, findings, and conclusions or recommendations expressed
in this material are those of the author(s) and do not necessarily
reflect the views of the National Science Foundation.