How can I generate a Huffman code tree

Huffman coding

Would you like to know what exactly Huffman coding is all about? In the following, we will use a simple example to explain how you can use this coding to find the optimal code.

Huffman coding

The Huffman coding is a coding method which leads to an optimal code with the smallest possible average code word length. When transmitting messages with optimal codes, the transmission times are reduced. The recipient of your message will find out what you want to communicate much faster.

The formula for the mean code word length is:

In general, the mean code word length in Huffman coding is smaller than that in Shannon-Fano coding. It follows that Huffman coding is generally better.


But we don't want to just leave it as it is, we want to check it out ourselves! So we will now code an example using the Huffman method. You want to go to a good bar with your girlfriend again. You have noted the number of previous visits to your favorite bars in a tally sheet. There is a tally sheet that shows the number of visits to bars. The more lines a bar already has, the more likely you will visit it again. Most of the time you end up in the Soju Bar, because you were there on 8 out of 34 evenings. So the code word for Soju Bar should be short in order to get the message across quickly.

First of all, we determine the relative frequency of the visits so that we can sort the bars according to increasing probability. If there are two equally frequent events, it does not matter which comes first. Since the names are a little too long, we will simply abbreviate them below so that Barbie Deinhoff’s becomes BD and the Weinerei becomes DW.

Create Huffman Tree

The Huffman Code has a unique procedure that follows a four-step roadmap. Now that we have already sorted the events, we can start with the first step. You have to find the two characters with the lowest probability of occurrence in the sorted list. In our list there are three characters that are the least likely to occur . We can therefore freely decide which two to choose.

We just take the two characters on the far left of the sorted list, i.e. PG and DH. In the second step, the two characters found are merged into a new element, the probability of which is the sum of the two individual probabilities.

We draw a node that connects the characters PG and DH to a new element, the probability of its occurrence is. So we can start with step three, sorting the new element. Here it is necessary to re-sort the list if the new element is not the least likely to appear in the list. That is mostly the case.

New arrangement of the elements

Our list is no longer sorted according to ascending probability, because the new element - consisting of PG and DH - has the probability of occurrence . However, the probability of the element BD is smaller and must therefore be at the very beginning. We will therefore re-sort the list according to increasing probability.

Now let's continue with the first steps recursively. We are again looking for the elements with the lowest probability of occurrence in our new list. Then we come back to step two by combining them and merging them into a new element.

The new element has the probability of occurrence . We now have to reorder the list so that all elements are sorted according to increasing probability.

After we have re-sorted the list, we summarize the two smallest probabilities again. The event BT with the probability let us grasp the knot with the probability together. This gives a new node with the probability . Now we need to rearrange the list. The two elements with the lowest probability of occurrence are now B3 and KAM, which we merge into a new element. So we draw a knot with the probability and re-sort.

We now combine DW with the node next door, since these have the lowest probabilities. plus results . Now we have to re-sort again because the list is no longer ordered. We combine L with the knot of B3 and KAM. This gives a new node with the probability . We sort again according to ascending probability.

Now we summarize the elements BS and SB, which form a node with the probability to lead. In the sorted list we can now see the nodes and summarize what to do with a new node leads.

Now we only have two nodes that we can create a new node with the probability is equal to 1 summarize. This root node encompasses all possibilities, because it has probability 1. If this is not the case, then you either forgot an event or miscalculated. We are now at step 4, the termination of the algorithm as soon as there is only one object left in the list.

Code words

However, this representation variant is somewhat unusual. Therefore we are still mirroring the tree horizontally so that the roots are on top and the leaves are on the bottom. We now have a tree, but the codewords are still missing. We get this by assigning a 0 to all branches on the left of a node and a 1 to those on the right. When constructing the code word, we always start with the root, which then represents the most significant bit, i.e. the bit on the far left in the code word.

We can read off the finished code words of the individual bars from the tree. They are:

The individual code words differ from the code words of the Shannon-Fano coding. Logical, because Huffman coding is a completely different process with different steps. Nevertheless, both coding methods deliver an optimal code.

Average code word length

Now we can also calculate the mean code word length of our coding. In general, this should be better, i.e. smaller, than with the Shannon-Fano coding. For the mean code word length we get the value

In the Shannon-Fano coding, the mean code word length had exactly the same value. Have we miscalculated? No, because the mean code word length is often a little better, but it can also be just as good, as is the case with us now.

Other properties of the Huffman coding

Finally, two important properties of the Huffman code: A code generated with Huffman, like a Shannon-Fano code, is always prefix-free, which means that no code word may represent the beginning of another code word.

In addition, the code tree is created starting from the leaves and ends in the root. This construction is also called “buttom-up”, because in the classic picture of a tree graph, the leaves are below and the roots are above. This is the big difference to the Shannon-Fano code, which is constructed "top-down", ie from the roots to the leaves. As a result, the Huffman code is superior in terms of the mean code word length, but very cumbersome or almost impossible to construct with many sheets, because you have to reorder very often and therefore need a lot of memory.