SciTech.blog

What is entropy?

23 Sep 2017, 07:43 • theory, entropy

Intuitively, entropy is the measure of in how many states a system can be. But how can we arrive from this intuition at $$H(X)= -\sum_i p_i \log p_i$$?

Consider an alphabet $$\{\alpha_1,\dots,\alpha_m\}$$ and imagine we want to create a sequence with $$k_i$$ occurrences of $$\alpha_i$$. The length of the sequence is $$n=\sum_{i=1}^m k_i$$. The number of all possible sequences formed this way is given by the multinomial coefficient:

${{n} \choose {k_1,\dots,k_m}} = \frac{n!}{k_1! \dots k_m!}$

However, we want our measure not to depend on $$n$$ so we'll take the nth root of the multinomial coefficient. You can think of it as something like the geometric mean:

$\sqrt[\leftroot{4}n]{\frac{n!}{k_1! \dots k_m!}}$

Note that this expression is multiplicative, but entropy is defined as an additive quantity. To convert a multiplicative quantity into an additive one, we can take its logarithm:

$\log \left(\frac{n!}{k_1! \dots k_m!}\right)^{\frac{1}{n}} = \frac{\log n! - \log k_1! - \dots - \log k_m!}{n} = \frac{\log n! - \sum_i \log k_i!}{n}$

At this point we can use Stirling’s formula ($$\log n! \approx n \log n - n$$) to simplify the fraction:

$\frac{\log n! - \sum_i \log k_i!}{n} \approx \frac{n \log n - n - \sum_i k_i \log k_i + \sum_i k_i}{n}$

whence

$\begin{array}{ll} H(X) & = \log n - \sum_i \frac{k_i \log k_i}{n} = -\left( \sum_i \frac{k_i}{n} \log k_i - \sum_i \frac{k_i}{n} \log n \right) = -\sum_i \frac{k_i}{n} \log \frac{k_i}{n} \\ & = -\sum_i p_i \log p_i \\ \end{array}$

Note that $$H(X)$$ depends merely on $$p_i=\frac{k_i}{n}$$.