Week 4 Muddiest Point
Gamma Code
Is gamma code stored as one long string, like the index-as-string compression methods? The unary code would then act as a delimiter for the next few characters, telling the program that processes the postings where each posting begins. Storing all the postings as one long string means that no fixed number of bytes has to be used for each posting, which saves space in the postings list. The binary string (assembled from the unary marker and the characters that follow it) is then processed in memory according to the conventions of binary.
If gamma code is not stored as a string, then I do not understand how gamma code works.
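To make the question concrete, here is a minimal sketch of gamma coding as it is usually described: each number is written as its binary form without the leading 1 (the offset), preceded by the offset's length in unary (that many 1s followed by a 0). The function names `gamma_encode` and `gamma_decode_stream` are hypothetical, and the bits are kept in a Python string of '0' and '1' characters only for readability; a real index would presumably pack them into bytes.

```python
from typing import List


def gamma_encode(n: int) -> str:
    """Return the gamma code of a positive integer as a '0'/'1' string."""
    if n < 1:
        raise ValueError("gamma code is only defined for integers >= 1")
    offset = bin(n)[3:]               # binary digits without '0b' and the leading 1
    length = "1" * len(offset) + "0"  # unary code for the offset's length
    return length + offset


def gamma_decode_stream(bits: str) -> List[int]:
    """Decode a concatenation of gamma codes back into integers."""
    numbers = []
    i = 0
    while i < len(bits):
        # Count the 1s of the unary part to learn how long the offset is.
        length = 0
        while bits[i] == "1":
            length += 1
            i += 1
        i += 1  # skip the 0 that terminates the unary part
        offset = bits[i:i + length]
        i += length
        # Restore the leading 1 that encoding dropped.
        numbers.append(int("1" + offset, 2))
    return numbers


# Three codes concatenated with no separator between them.
stream = gamma_encode(9) + gamma_encode(6) + gamma_encode(3)
print(stream)
print(gamma_decode_stream(stream))
```

In this sketch the unary part plays the delimiter role asked about above: it tells the decoder how many of the following bits belong to the current number, so the codes can be concatenated without separators or fixed widths.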
General Compression Question
Memory and storage capacity are always increasing. Does compression also mean faster access? Otherwise, why does open-source software like Indri and Lucene compress index terms and postings automatically?