# What is entropy?

**23 Sep 2017, 07:43 • theory, entropy**

*Intuitively, entropy is the measure of in how many states a system can be.
But how can we arrive from this intuition at \( H(X)= -\sum_i p_i \log p_i \)?*

Consider an alphabet \(\{\alpha_1,\dots,\alpha_m\}\) and imagine we want to create a sequence with \(k_i\) occurrences of \(\alpha_i\). The length of the sequence is \(n=\sum_{i=1}^m k_i\). The number of all possible sequences formed this way is given by the multinomial coefficient:

\[ {{n} \choose {k_1,\dots,k_m}} = \frac{n!}{k_1! \dots k_m!} \]However, we want our measure not to depend on \(n\) so we'll take the *n*th root of the multinomial coefficient.
You can think of it as something like the geometric mean:

Note that this expression is multiplicative, but entropy is defined as an additive quantity. To convert a multiplicative quantity into an additive one, we can take its logarithm:

\[ \log \left(\frac{n!}{k_1! \dots k_m!}\right)^{\frac{1}{n}} = \frac{\log n! - \log k_1! - \dots - \log k_m!}{n} = \frac{\log n! - \sum_i \log k_i!}{n} \]At this point we can use Stirling’s formula (\(\log n! \approx n \log n - n\)) to simplify the fraction:

\[ \frac{\log n! - \sum_i \log k_i!}{n} \approx \frac{n \log n - n - \sum_i k_i \log k_i + \sum_i k_i}{n} \]whence

\[ \begin{array}{ll} H(X) & = \log n - \sum_i \frac{k_i \log k_i}{n} = -\left( \sum_i \frac{k_i}{n} \log k_i - \sum_i \frac{k_i}{n} \log n \right) = -\sum_i \frac{k_i}{n} \log \frac{k_i}{n} \\ & = -\sum_i p_i \log p_i \\ \end{array} \]Note that \(H(X)\) depends merely on \(p_i=\frac{k_i}{n}\).