# Carathéodory's theorem (convex hull)

*An illustration of Carathéodory's theorem for a square in R2*

In convex geometry, Carathéodory's theorem states that if a point x of Rd lies in the convex hull of a set P, then there is a subset P′ of P consisting of d + 1 or fewer points such that x lies in the convex hull of P′. Equivalently, x lies in an r-simplex with vertices in P, where ${\displaystyle r\leq d}$. The result is named for Constantin Carathéodory, who proved the theorem in 1911 for the case when P is compact. In 1914 Ernst Steinitz extended Carathéodory's theorem to arbitrary sets P in Rd.

For example, consider the set P = {(0,0), (0,1), (1,0), (1,1)}, a subset of R2. The convex hull of this set is a square. Now consider the point x = (1/4, 1/4), which lies in the convex hull of P. We can construct a subset P′ = {(0,0), (0,1), (1,0)}, whose convex hull is a triangle that encloses x, so the theorem holds for this instance with |P′| = 3 = d + 1. It may help to visualise Carathéodory's theorem in 2 dimensions as saying that any point in the convex hull of P is enclosed by some triangle whose vertices belong to P.
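The example above can be checked numerically. A sketch using NumPy: writing x as a convex combination of the three points of P′ amounts to solving a small linear system for the barycentric coordinates (the two coordinate equations plus the constraint that the weights sum to one) and verifying that all weights are non-negative.

```python
import numpy as np

# Vertices of the subset P' = {(0,0), (0,1), (1,0)} and the query point x.
P_prime = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
x = np.array([0.25, 0.25])

# Solve for barycentric coordinates lam: sum(lam_j * p_j) = x, sum(lam_j) = 1.
# Stack the two coordinate equations with the affine (sum-to-one) constraint.
A = np.vstack([P_prime.T, np.ones(3)])   # 3x3 system
b = np.append(x, 1.0)
lam = np.linalg.solve(A, b)

print(lam)               # -> [0.5, 0.25, 0.25]
print(np.all(lam >= 0))  # all non-negative, so x lies in the triangle
```

Since the weights (1/2, 1/4, 1/4) are non-negative and sum to one, x is indeed a convex combination of the three chosen points.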

## Proof

Let x be a point in the convex hull of P. Then x is a convex combination of a finite number of points in P:

${\displaystyle \mathbf {x} =\sum _{j=1}^{k}\lambda _{j}\mathbf {x} _{j}}$

where every xj is in P, every λj is non-negative, and ${\displaystyle \sum _{j=1}^{k}\lambda _{j}=1}$.

Suppose k > d + 1 (otherwise, there is nothing to prove). Then the k − 1 > d vectors x2 − x1, ..., xk − x1 in Rd are linearly dependent, so there are real scalars μ2, ..., μk, not all zero, such that

${\displaystyle \sum _{j=2}^{k}\mu _{j}(\mathbf {x} _{j}-\mathbf {x} _{1})=\mathbf {0} .}$

If μ1 is defined as

${\displaystyle \mu _{1}:=-\sum _{j=2}^{k}\mu _{j}}$

then

${\displaystyle \sum _{j=1}^{k}\mu _{j}\mathbf {x} _{j}=\mathbf {0} }$
${\displaystyle \sum _{j=1}^{k}\mu _{j}=0}$

and not all of the μj are equal to zero. Since the μj sum to zero but are not all zero, at least one μj > 0. Then,

${\displaystyle \mathbf {x} =\sum _{j=1}^{k}\lambda _{j}\mathbf {x} _{j}-\alpha \sum _{j=1}^{k}\mu _{j}\mathbf {x} _{j}=\sum _{j=1}^{k}(\lambda _{j}-\alpha \mu _{j})\mathbf {x} _{j}}$

for any real α, since the subtracted sum is zero. In particular, the equality holds if α is defined as

${\displaystyle \alpha :=\min _{1\leq j\leq k}\left\{{\tfrac {\lambda _{j}}{\mu _{j}}}:\mu _{j}>0\right\}={\tfrac {\lambda _{i}}{\mu _{i}}}.}$

Note that α ≥ 0 (every λj is non-negative), and for every j between 1 and k,

${\displaystyle \lambda _{j}-\alpha \mu _{j}\geq 0.}$

In particular, λi − αμi = 0 by definition of α. Therefore,

${\displaystyle {\mathbf {x} }=\sum _{j=1}^{k}(\lambda _{j}-\alpha \mu _{j}){\mathbf {x} }_{j}}$

where every ${\displaystyle \lambda _{j}-\alpha \mu _{j}}$ is non-negative, their sum is one, and furthermore, ${\displaystyle \lambda _{i}-\alpha \mu _{i}=0}$. In other words, x is represented as a convex combination of at most k − 1 points of P. This process can be repeated until x is represented as a convex combination of at most d + 1 points in P.

An alternative proof uses Helly's theorem.
