Huffman coding
Huffman encoding is a way to assign binary codes to symbols that
reduces the overall number of bits used to encode a typical string of
those symbols.
For example, if you use letters as symbols and know the frequency of
occurrence of those letters in typical strings, you could simply encode
each letter with a fixed number of bits, as in ASCII codes. You can do
better than this by encoding more frequently occurring letters, such as
e and a, with shorter bit strings, and less frequently occurring
letters, such as q and x, with longer bit strings. Any string of
letters is then encoded as a string of bits that are no longer of the
same length per letter. To successfully decode such a string, the
shorter codes assigned to letters such as 'e' must not occur as a
prefix of the longer codes, such as that for 'x'.
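As a rough illustration of the prefix property, here is a minimal sketch in the same Scilab-style syntax as the program further below. It encodes a short message with a made-up prefix-free code table and then decodes it by matching one codeword at a time; the symbols, codewords, and message are illustrative examples, not taken from any particular source.

// Hypothetical prefix-free code table: frequent letters get shorter codewords.
symbols = ['e' 'a' 'x' 'q'];
codes   = ['0' '10' '110' '111'];   // no codeword is a prefix of another
msg = ['e' 'a' 'e' 'x' 'e'];        // message to encode
bits = '';
for s = msg
    bits = bits + codes(find(symbols == s));
end
disp(bits);                         // 01001100: 8 bits instead of 10 with a 2-bit fixed code
// Decode by scanning the bits and matching the unique codeword that fits next.
decoded = '';
pos = 1;
while pos <= length(bits)
    for i = 1:size(codes, '*')
        n = length(codes(i));
        if part(bits, pos:pos+n-1) == codes(i) then
            decoded = decoded + symbols(i);
            pos = pos + n;
            break
        end
    end
end
disp(decoded);                      // eaexe

Because no codeword is a prefix of another, the decoder never has to look ahead or backtrack.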
If you assign fewer bits, that is shorter code words, to the most
frequently used symbols, you save a considerable amount of storage
space. Suppose you want to assign 26 unique codes to the letters of the
English alphabet and store an English novel (letters only) in terms of
these codes: you will require less memory if you assign short codes to
the most frequently occurring characters.
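How are such codeword lengths chosen? Huffman's procedure repeatedly merges the two least probable nodes into a new node until a single node remains; each symbol's codeword length is then the depth of its leaf in the resulting tree. The following is a minimal sketch of that idea in the same Scilab-style syntax, using the same example probabilities as the program below; the variable names and structure are illustrative, not the program's.

p = [0.4 0.2 0.2 0.1 0.1];                    // symbol probabilities
n = length(p);
w = [p, zeros(1, n-1)];                       // node weights: leaves 1..n, internal nodes n+1..2n-1
parent = zeros(1, 2*n-1);                     // parent(j) = node that j was merged into (0 = root)
active = 1:n;                                 // nodes not yet merged
for k = n+1 : 2*n-1
    [ws, idx] = gsort(w(active), 'g', 'i');   // sort active weights in increasing order
    a = active(idx(1));                       // the two least probable active nodes
    b = active(idx(2));
    w(k) = w(a) + w(b);                       // merge them into the new node k
    parent(a) = k;
    parent(b) = k;
    active = [active(idx(3:$)), k];           // drop a and b, keep the rest, add node k
end
len = zeros(1, n);                            // codeword length = depth of each leaf
for i = 1:n
    node = i;
    while parent(node) <> 0
        len(i) = len(i) + 1;
        node = parent(node);
    end
end
disp(len);   // one optimal set of lengths; ties in the merges can give another, equally optimal set

For these probabilities any tie-breaking yields an average codeword length of 2.2 bits, which is what the program below computes from the lengths 1, 2, 3, 4, 4.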
Program for Huffman coding
clc;
// Symbol probabilities p0..p4 and their Huffman codeword lengths l0..l4 (in bits)
p0=0.4; l0=1;
p1=0.2; l1=2;
p2=0.2; l2=3;
p3=0.1; l3=4;
p4=0.1; l4=4;
// Average codeword length L = sum(pi*li)
L=p0*l0+p1*l1+p2*l2+p3*l3+p4*l4;
// Source entropy H = sum(pi*log2(1/pi))
H_Ru0=p0*(log2(1/p0))+p1*(log2(1/p1))+p2*(log2(1/p2))+p3*(log2(1/p3))+p4*(log2(1/p4));
// disp shows its arguments in reverse order, so each label prints before its value
disp('bits',L,'Avg codeword length L');
disp('bits',H_Ru0,'Entropy of Huffman code H');
Output:
Avg codeword length L
2.2 bits
Entropy of Huffman code H
2.1219281 bits