Huffman Coding | Greedy Method | ADA

 Huffman Coding


It is a Greedy Approach.It is an compression technique used to encode compress data. It is used for reducing the size of data of message. If we store the data on the file than we need compression to reduce the size of file. 
When the the data is send over a network than the data is compress and send to reduce the cost of transmission.

In this post I will explain you if we send normal message than what will be the cost ? What is fixed size of coding ? What is variable size of coding and than finally What is Huffman coding ? with the help of example.


Let we first understand the problem

suppose I have any message such as

A B B C C C A D D B B B A A E D D C C C

the length of message is 20 which is saved in file and than send it. 

We all knows if we send any data by using any electronic device such as computer than following technique can be used such as:

1. General Method .(Fixed Size Code)

2. ASCII Code Method

3. Huffman Code Method (Variable Size Code)


GENERAL METHOD (FIXED SIZE CODE) :

Let we understand general method by using above example:

Message : A B B C C C A D D B B B A A E D D C C C

Characters used in message : A B C D E

Frequency : It how many number of times the character is repeated in complete message. such as:

A : 4 , B : 5 , C : 6 , D : 4 , E : 1

we know that computer use 0/1 for encoding any character.In this method let we understand the encoding technique such as

if we have 2 character than we represent each character in 1 bits

    Note in 2-bit we have 2 combinations (0 & 1)

if we have 3 character than we represent each character in 2 bits

    Note in 2-bit we have 4 combinations (00 to 11)

if we have 4 character than we represent each character in 2 bits

    Note in 2-bit we have 4 combinations (00 to 11)

if we have 5 character than we represent each character in 3 bits

    Note in 2-bit we have 8 combinations (000 to 111)

And so on......

we can say that 

For 2-character        code: 0,1

For 3-character        code:(00,01,10,11)

For 4-character        code:(00,01,10,11)

For 5/6/7/8-character        code:(000,001,010,011,100,101,110,111)

For 9-character        code:(0000,0001,0010,0011,.......,1111)

and so on

So here we have 5 characters so we can represent the code in following way such as:

Character          Frequency      Binary  Code

    A                        4                000

    B                        5                001

    C                        6                010

    D                        4                011

    E                        1                111


here total no of bits are

A = 4 x 3 (code-bit) = 12

B = 5 x 3 (code-bit) = 15

C = 6 x 3 (code-bit) = 18

D = 4 x 3 (code-bit) = 12

E = 1 x 3 (code-bit) = 3

Total no. of bits = 60 bits taken by a message.


ASCII CODE METHOD:

Let we understand ASCII code method. In this method ASCII code scheme is used and the scheme is that:

ASCII code contain 8-bit (it is fixed)

That means if we use this scheme than all characters must be encoded in 8-bit individually. such as

Character          Frequency         ASCII Code

    A                        4                01000001

    B                        5                01000010

    C                        6                01000011

    D                        4                01000100

    E                        1                01000101


here total no of bits are

A = 4 x 8 (code-bit) = 32

B = 5 x 8 (code-bit) = 40

C = 6 x 8 (code-bit) = 48

D = 4 x 8 (code-bit) = 32

E = 1 x 8 (code-bit) = 8

Total no. of bits = 160 bits taken by a message.


HUFFMAN CODE METHOD (VARIABLE SIZE CODE) :

Huffman code use Optimal Merge Pattern method such as.

From the above problem we have

A : 4 , B : 5 , C : 6 , D : 4 , E : 1

first arrange in ascending order such as

E:1 , A:4  D:4 , B:5 , C:6

Than create a tree in following manner







Now we will collect code from the above tree such as:

For E : 000

For A : 001

For D : 01

For B : 10

For C : 11


Character          Frequency       Huffman Code

    A                        4                    001

    B                        5                    10

    C                        6                    11

    D                        4                    01

    E                        1                    000


 Here total no of bits are

A = 4 x 3 (code-bit) = 12

B = 5 x 2 (code-bit) = 10

C = 6 x 2 (code-bit) = 12

D = 4 x 2 (code-bit) = 8

E = 1 x 3 (code-bit) = 3

Total no. of bits = 45 bits taken by a message.


ANALYSIS FROM ALL ABOVE CASES

        Methods                                Total Bits 

Form Fixed Code  Method    :        60 bits

Form ASCII Code  Method    :      160 bits

Form Variable Code  Method :       45 bits


So conclusion from above data is that , if we are using Huffman Coding Method than total number bits will be reduced.


No comments:

Post a Comment