For $S \subseteq \{0,1\}^n$, a Boolean function $f: S \to \{-1,1\}$ is a \emph{halfspace over $S$} if there exist $w \in \R^n$ and $\theta \in \R$ such that $f(x)=\sign(w \cdot x - \theta)$ for all $x \in S$. We give bounds on the size of integer weights $w_1,\dots,w_n \in \Z$ that are required to represent halfspaces over Hamming balls centered at $0^n$, i.e.\ halfspaces over $S =\{0,1\}^n_{\leq k} \eqdef \{x \in \{0,1\}^n : x_1 + \cdots + x_n \leq k\}.$ Such weight bounds for halfspaces over Hamming balls have immediate consequences for the performance of learning algorithms in the increasingly common scenario of learning from very high-dimensional categorical examples in which only a small number of features are active in each example.
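As a small example to make the definition concrete (chosen purely for illustration), take $n=3$ and $k=1$, so that $\{0,1\}^3_{\leq 1} = \{000, 100, 010, 001\}$. The function $f$ with $f(000)=f(100)=-1$ and $f(010)=f(001)=1$ is a halfspace over this domain with integer weights $w=(-1,1,1)$ and threshold $\theta = 1/2$:
\[
\sign(w \cdot 000 - \tfrac{1}{2}) = \sign(w \cdot 100 - \tfrac{1}{2}) = -1,
\qquad
\sign(w \cdot 010 - \tfrac{1}{2}) = \sign(w \cdot 001 - \tfrac{1}{2}) = 1.
\]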
We give upper and lower bounds on the weights required both for exact representation (when $\sign(w \cdot x -\theta)$ must equal $f(x)$ for every $x \in S$) and for $\eps$-approximate representation (when $\sign(w \cdot x-\theta)$ may disagree with $f(x)$ on up to an $\eps$ fraction of the points $x \in S$). Our results show that the extremal bounds for exact representation are qualitatively rather similar whether the domain is all of $\{0,1\}^n$ or the Hamming ball $\{0,1\}^n_{\leq k}$, but that the extremal bounds for approximate representation are qualitatively very different between the two domains.
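To make the approximation notion quantitative (an illustrative remark rather than a result of this work): since the Hamming ball has
\[
\bigl|\{0,1\}^n_{\leq k}\bigr| \;=\; \sum_{j=0}^{k} \binom{n}{j}
\]
points, an $\eps$-approximate representation may disagree with $f$ on at most $\eps \sum_{j=0}^{k} \binom{n}{j}$ of them; for instance, with $n=3$, $k=1$, and $\eps = 1/4$, a single disagreement is allowed.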