# Carathéodory's theorem (convex hull)

In convex geometry Carathéodory's theorem states that if a point x of Rd lies in the convex hull of a set P, there is a subset Template:Italics correction′ of P consisting of d + 1 or fewer points such that x lies in the convex hull of Template:Italics correction′. Equivalently, x lies in an r-simplex with vertices in P, where $r\leq d$ . The result is named for Constantin Carathéodory, who proved the theorem in 1911 for the case when P is compact. In 1914 Ernst Steinitz expanded Carathéodory's theorem for any sets P in Rd.

For example, consider a set P = {(0,0), (0,1), (1,0), (1,1)} which is a subset of R2. The convex hull of this set is a square. Consider now a point x = (1/4, 1/4), which is in the convex hull of P. We can then construct a set {(0,0),(0,1),(1,0)} = Template:Italics correction′, the convex hull of which is a triangle and encloses x, and thus the theorem works for this instance, since |Template:Italics correction′| = 3. It may help to visualise Carathéodory's theorem in 2 dimensions, as saying that we can construct a triangle consisting of points from P that encloses any point in P.

## Proof

Let x be a point in the convex hull of P. Then, x is a convex combination of a finite number of points in P :

$\mathbf {x} =\sum _{j=1}^{k}\lambda _{j}\mathbf {x} _{j}$ where every xj is in P, every λj is non-negative, and $\sum _{j=1}^{k}\lambda _{j}=1$ .

Suppose k > d + 1 (otherwise, there is nothing to prove). Then, the points x2 − x1, ..., xk − x1 are linearly dependent,

so there are real scalars μ2, ..., μk, not all zero, such that

$\sum _{j=2}^{k}\mu _{j}(\mathbf {x} _{j}-\mathbf {x} _{1})=\mathbf {0} .$ If μ1 is defined as

$\mu _{1}:=-\sum _{j=2}^{k}\mu _{j}$ then

$\sum _{j=1}^{k}\mu _{j}\mathbf {x} _{j}=\mathbf {0}$ $\sum _{j=1}^{k}\mu _{j}=0$ and not all of the μj are equal to zero. Therefore, at least one μj > 0. Then,

$\mathbf {x} =\sum _{j=1}^{k}\lambda _{j}\mathbf {x} _{j}-\alpha \sum _{j=1}^{k}\mu _{j}\mathbf {x} _{j}=\sum _{j=1}^{k}(\lambda _{j}-\alpha \mu _{j})\mathbf {x} _{j}$ for any real α. In particular, the equality will hold if α is defined as

$\alpha :=\min _{1\leq j\leq k}\left\{{\tfrac {\lambda _{j}}{\mu _{j}}}:\mu _{j}>0\right\}={\tfrac {\lambda _{i}}{\mu _{i}}}.$ Note that α>0, and for every j between 1 and k,

$\lambda _{j}-\alpha \mu _{j}\geq 0.$ In particular, λi − αμi = 0 by definition of α. Therefore,

${\mathbf {x} }=\sum _{j=1}^{k}(\lambda _{j}-\alpha \mu _{j}){\mathbf {x} }_{j}$ where every $\lambda _{j}-\alpha \mu _{j}$ is nonnegative, their sum is one , and furthermore, $\lambda _{i}-\alpha \mu _{i}=0$ . In other words, x is represented as a convex combination of at most k-1 points of P. This process can be repeated until x is represented as a convex combination of at most d + 1 points in P.

An alternative proof uses Helly's theorem.