How to measure Information: Shannon says: the amount of information contained in a message from a source is the entropy H:
H = log2 (M)
M : number of possible messages.
Comments:
- Entropy formula uses log base 2, it means the measurement unit is "bit".
- H is interpreted as the amount of information we lack before we receive a message.
- H of a source equals the minimum number of bits required to represent (encode) a message.
- The formula above assumes that each possible message is equally likely (same probability).
Notes on probability:
- Frequency and probability are very likely to be nearly equal if we make many experiments.
- p(x|y) = p(x & y) / p(y) ; Notice that p(y) = sigma( p(x & y) ) for all x
- F is a numeric function of x. Average(F) = sigma(p(x) * F(x)) for all x (if probabilities are equal, then the average is exactly the arithmetic mean).
Generalisation:
- Given a message x, the surprise of x is : s(x) = log2(1/p(x))
- The more surprising message tells us more (information)
- General formula of entropy: S is the surprise function of x, then
H = Average(S) = sigma(p(x) * log2(1/p(x)))
Communication:
- The codeword is the format that a message (letter, word ...) is represented (encoded) by.
- Efficient communication: the more common a message is, the shorter should be the corresponding codeword.
- Encoding-Decoding bijection: the encoding should satisfy the prefix-free condition = No codeword can be a prefix of an other codeword.
- An encoding is free-prefix => the Kraft inequality is satisfied.
- Kraft inequality: sigma(1/2^l)<= 1 over all codewords (l is codeword's length). The inequality states that they should not be so many short codewords in an encoding.
- Given a free-prefix encoding of X (messages from a source); and L is length's function of the corresponding codewords, then: Average(L) >= H(X).
This inequality states that:
- The length of codewords, on average, can not be shorter than the entropy of information source.
- For encoding efficiency, the average of lengths should be closest (or equal) to the entropy (e.g Huffman code is the best efficient free-prefix encoding)
- Application: data compression is the process of re-encoding data in a more efficient way (closest to H). Most of the time, this is achieved by analysing data regularities (text pattern ....)
Data compression: run length encoding (for simple image, e.g black and white).
No comments:
Post a Comment