This project implements Huffman Coding, a greedy algorithm used for
lossless data compression. The algorithm assigns variable-length
prefix codes to characters based on their frequencies, reducing the
overall size of the encoded data.
Calculates character frequencies from the input text
Constructs a Huffman tree using a priority queue (min-heap)
Generates optimal prefix codes for each character
Encodes the input text into compressed binary form
Decodes the compressed data back to the original text
Displays compression statistics
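The first feature above, frequency counting, can be sketched with the standard library (the function name here is illustrative, not necessarily the project's own):

```python
from collections import Counter

def char_frequencies(text: str) -> dict:
    """Count how often each character appears in the input text."""
    return dict(Counter(text))

# Example: 'a' occurs 5 times, 'b' and 'r' twice each, 'c' and 'd' once.
freqs = char_frequencies("abracadabra")
```

These counts are exactly what the min-heap is seeded with in the tree-building step.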
Python
Data Structures: Tree, Heap (Priority Queue)
Algorithms: Greedy Algorithm
File Handling
Count character frequencies from the input file
Build the Huffman Tree using a min-heap
Generate binary codes by traversing the tree
Encode the input text using generated codes
Decode the encoded data to verify correctness
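The five steps above could be sketched as a minimal in-memory version like the following (class and function names are illustrative; the project's actual structure may differ, and this skips the file I/O and bit packing):

```python
import heapq
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    freq: int
    char: Optional[str] = None   # None for internal nodes
    left: "Optional[Node]" = None
    right: "Optional[Node]" = None

def build_tree(text: str) -> Node:
    """Build the Huffman tree by repeatedly merging the two
    lowest-frequency nodes from a min-heap."""
    heap = [(freq, i, Node(freq, ch))
            for i, (ch, freq) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    tie = len(heap)  # tie-breaker so Nodes are never compared directly
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, tie, Node(f1 + f2, None, a, b)))
        tie += 1
    return heap[0][2]

def make_codes(node: Node, prefix: str = "", codes=None) -> dict:
    """Traverse the tree: left edge appends '0', right edge '1'."""
    if codes is None:
        codes = {}
    if node.char is not None:            # leaf: record its code
        codes[node.char] = prefix or "0"
        return codes
    make_codes(node.left, prefix + "0", codes)
    make_codes(node.right, prefix + "1", codes)
    return codes

def encode(text: str, codes: dict) -> str:
    return "".join(codes[ch] for ch in text)

def decode(bits: str, root: Node) -> str:
    """Walk the tree bit by bit; emit a character at each leaf."""
    out, node = [], root
    for b in bits:
        node = node.left if b == "0" else node.right
        if node.char is not None:
            out.append(node.char)
            node = root
    return "".join(out)
```

Decoding the encoded data and comparing it with the original text is how step 5 verifies correctness.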
Run it with command-line arguments:

python filename.py encode input.txt output.huff
python filename.py decode output.huff out.txt

The encode command generates the compressed bit stream for the input characters and stores it in output.huff, which is created automatically. To decompress, run the decode command on output.huff; the recovered text is written to out.txt, which is also created automatically.
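The argument handling behind those two commands might look roughly like this (a hypothetical helper; the script's actual argument parsing may differ):

```python
import sys

def parse_args(argv):
    """Validate arguments of the form:
       <script> encode <infile> <outfile>
       <script> decode <infile> <outfile>
    Returns (mode, infile, outfile) or exits with a usage message."""
    if len(argv) != 4 or argv[1] not in ("encode", "decode"):
        raise SystemExit(f"usage: {argv[0]} encode|decode <infile> <outfile>")
    return argv[1], argv[2], argv[3]

if __name__ == "__main__":
    mode, infile, outfile = parse_args(sys.argv)
    # dispatch to the encode/decode routines here
```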
Input: a plain text file (e.g. file.txt)
Output: a compressed file, produced after running the encode command
The input file should contain at least 5 to 6 paragraphs for compression to pay off. For anything smaller, the output file will typically be larger than the original, because the fixed overhead of storing the Huffman code table (plus bit padding) outweighs the savings. This is a known drawback of the approach.
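This small-input overhead can be demonstrated without writing any files. The sketch below computes the encoded body length (the sum of merge costs in the min-heap, which equals the total Huffman-coded bit count) and adds an assumed per-symbol header cost for the stored code table; the 2-bytes-per-symbol figure is an illustrative assumption, not the project's actual header format:

```python
import heapq
from collections import Counter

def huffman_bits(text: str) -> int:
    """Total bits in the Huffman-encoded body: each heap merge of
    weights a and b adds a + b bits to the encoded length."""
    freqs = list(Counter(text).values())
    if len(freqs) < 2:
        return len(text)  # degenerate single-symbol case, ~1 bit per char
    heapq.heapify(freqs)
    total = 0
    while len(freqs) > 1:
        a, b = heapq.heappop(freqs), heapq.heappop(freqs)
        total += a + b
        heapq.heappush(freqs, a + b)
    return total

def compressed_size(text: str, table_bytes_per_symbol: int = 2) -> int:
    """Rough on-disk size in bytes: encoded body rounded up to whole
    bytes, plus a header storing the code table (assumed cost)."""
    header = table_bytes_per_symbol * len(set(text))
    body = (huffman_bits(text) + 7) // 8
    return header + body

short = "hello"
long_text = "hello world, " * 200
# For the tiny input the header dominates; for the long one, real savings appear.
```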