We study the fundamental problem of learning an unknown large-margin halfspace in the context of parallel computation.
Our main positive result is a parallel algorithm for learning a large-margin halfspace that is based on interior point methods from convex optimization and fast parallel algorithms for matrix computations. We show that this algorithm learns an unknown $\gamma$-margin halfspace over $n$ dimensions using $\mathrm{poly}(n,1/\gamma)$ processors and runs in time $\tilde{O}(1/\gamma)+O(\log n)$. In contrast, naive parallel algorithms that learn a $\gamma$-margin halfspace in time polylogarithmic in $n$ have an $\Omega(1/\gamma^2)$ runtime dependence on $\gamma$.
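For intuition about the $\Omega(1/\gamma^2)$ baseline, here is a minimal sketch (illustrative only, not the paper's interior-point algorithm) of such a naive parallel algorithm: the classic perceptron. Its mistake bound guarantees at most $1/\gamma^2$ updates on unit-norm $\gamma$-margin data; each update parallelizes to $O(\log n)$ depth, but the updates themselves form a sequential chain. All function and variable names below are assumptions chosen for illustration.

```python
import numpy as np

def perceptron(X, y, max_updates):
    """Classic perceptron on unit-norm examples X (m x n), labels y in {-1,+1}.

    On data separable with margin gamma, at most 1/gamma^2 updates occur --
    the source of the Omega(1/gamma^2) sequential dependence. The work inside
    each iteration (computing all m margins) is embarrassingly parallel, but
    iteration t+1 cannot begin until the update of iteration t is applied.
    """
    w = np.zeros(X.shape[1])
    for _ in range(max_updates):
        margins = y * (X @ w)              # all m margins in parallel
        mistakes = np.flatnonzero(margins <= 0)
        if mistakes.size == 0:
            return w                       # every example correctly classified
        i = mistakes[0]
        w += y[i] * X[i]                   # one sequential update per iteration
    return w
```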
Our main negative result deals with boosting, a standard approach to learning large-margin halfspaces. We give an information-theoretic proof that in the original PAC framework, in which a weak learning algorithm is provided as an oracle that is called by the booster, boosting cannot be parallelized: the ability to call the weak learner multiple times in parallel within a single boosting stage does not reduce the overall number of successive boosting stages required.
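To make the sequential dependence concrete, here is a schematic AdaBoost-style loop (a sketch of standard boosting, not the paper's lower-bound construction; the `weak_learner` oracle is a placeholder). The distribution fed to the weak learner at stage $t+1$ is computed from the hypothesis returned at stage $t$, which is exactly the chain of dependencies the lower bound shows cannot be shortcut by parallel oracle calls within a stage.

```python
import numpy as np

def boost(X, y, weak_learner, T):
    """Schematic AdaBoost: T successive stages, each consulting the oracle.

    D is the distribution over examples. Because D at stage t+1 is reweighted
    using the hypothesis h returned at stage t, the stages form a sequential
    chain regardless of how many oracle calls are made inside one stage.
    """
    m = len(y)
    D = np.full(m, 1.0 / m)                # initial uniform distribution
    hypotheses, alphas = [], []
    for t in range(T):
        h = weak_learner(X, y, D)          # oracle call under current D
        preds = h(X)                       # predictions in {-1, +1}
        eps = D[preds != y].sum()          # weighted error of h under D
        alpha = 0.5 * np.log((1 - eps) / eps)
        D *= np.exp(-alpha * y * preds)    # reweight: depends on this stage's h
        D /= D.sum()                       # renormalize to a distribution
        hypotheses.append(h)
        alphas.append(alpha)
    return lambda Z: np.sign(sum(a * h(Z) for a, h in zip(alphas, hypotheses)))
```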