L1 - Markov Information Sources
In this paper we study Markov sources with first order memory and two, three and four states. We also discuss their evolution starting from fixed initial conditions.

2. Theoretical introduction
A discrete information source generates messages at discrete time intervals; each message contains a finite number of symbols. A finite set of impulse-type signals corresponds to the set of symbols, so the emission rate of a discrete source is finite. The discrete source is characterized by: the symbols x1, x2, …, xn that it can generate; the words, finite in number, that can be made out of the symbols; the alphabet {x1, …, xn} as the total set of symbols; and the source language, consisting of all the words that can be generated using its alphabet. A discrete information source with memory generates a symbol whose probability depends on the previous symbol or on a set of previously generated symbols, their number determining the order of the memory. A stationary or homogeneous discrete information source generates symbols whose probabilities don't depend on the time origin, but only on their relative positions; these probabilities are invariant to any translation along the string. In a state transition diagram, the nodes represent the states and the links between the nodes represent the transitions. The links are labeled with the conditional probabilities associated with the transitions, p(xi/xj). If a symbol is conditioned by more than one previous symbol, the number of states increases. Thus, a binary source that generates symbols conditioned by two previous symbols has four states corresponding to the four groups of two symbols: S1 = 00, S2 = 01, S3 = 10 and S4 = 11. A discrete information source with m-order memory is a source with probabilistic constraints, the probability distribution having the following form:

p(xn / xn-1, xn-2, …) = p(xn / xn-1, xn-2, …, xn-m)
That means a symbol's appearance is conditioned by the m previous symbols. If the alphabet has D symbols, a source with m-order memory has r = D^m states, each sequence of m symbols being assigned to one state si.
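The state count r = D^m can be checked by enumerating all groups of m symbols directly; the sketch below uses the binary source with m = 2 from the example above:

```python
from itertools import product

# Each state of an m-order source corresponds to one group of m symbols.
alphabet = "01"   # D = 2 symbols
m = 2             # memory order

states = ["".join(group) for group in product(alphabet, repeat=m)]
print(states)        # the four states S1..S4: 00, 01, 10, 11
print(len(states))   # r = D**m = 4
```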
If the appearance of a symbol is conditioned only by the previous state, the source can be modeled with a finite Markov chain and is called a Markov source. A finite state Markov chain is a sequence {…, S-2, S-1, S0, S1, S2, …} of discrete chance variables from a finite alphabet S with the Markov property:

p(Sn / Sn-1, Sn-2, Sn-3, …) = p(Sn / Sn-1)
This means that the probability that a random variable takes the value of a state, conditioned by an infinite number of previous states, is equal to the probability conditioned by the most recent state alone. If the probability of crossing from one state to another does not depend on time, the Markov chain is homogeneous. A Markov information source is a Markov chain, together with a function f that maps the states S of the Markov chain to the letters of the source's alphabet. A Markov source is thus a sequence of discrete chance variables {…, X0, X1, X2, X3, …} that take values from the finite alphabet of the source, X = {x1, x2, …, xD}, through successive transitions through the state sequence {S0, S1, S2, S3, …}. The state sequence makes up a finite Markov chain, with values from the state alphabet S = {s1, s2, …, sr}, with a probability distribution of the initial state S0: P0 = {p01, p02, …, p0r}, where p0i = p(S0 = si). For the mapping function f we have:

Xn = f(Sn)
When the internal Markov chain enters a new state, the source generates the corresponding symbol specified by the f function. In general, the number of states r can be larger than the number of symbols D, in which case a symbol can correspond to more than one state. The probabilities of changing from one state si to another state sj form the transition matrix T:

T = [tij], with tij = p(sj / si), i, j = 1, …, r
meaning the sum of the elements on each line is equal to 1:

ti1 + ti2 + … + tir = 1, for every i

The probability distribution of the states at moment n is given by the row matrix:

Pn = [pn(1) pn(2) … pn(r)]

where pn(i) = p(Sn = si) is the probability of the si state at moment n. One step of the evolution satisfies:

Pn = Pn-1 · T

or:

Pn = Pn-2 · T · T = Pn-2 · T^2

Finally, we have:

Pn = P0 · T^n
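The evolution of the state probabilities described above can be sketched in a few lines of Python; the matrix T and the vector P0 below are illustrative values, not data from the program:

```python
def step(p, T):
    """One evolution step: P(n) = P(n-1) * T (row vector times matrix)."""
    r = len(T)
    return [sum(p[i] * T[i][j] for i in range(r)) for j in range(r)]

# Illustrative 2-state transition matrix (each line sums to 1)
# and an initial distribution concentrated in state s1.
T = [[0.9, 0.1],
     [0.2, 0.8]]
p = [1.0, 0.0]  # P0

for n in range(100):
    p = step(p, T)

print(p)  # approaches the equilibrium distribution (2/3, 1/3)
```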
In the case of a stationary source, Pn = Pn-1: the probabilities of the states are independent of n, and so are the transition probabilities. If we assign only one symbol to each state, meaning the number of states is equal to the number of symbols, we have a first order Markov source, a symbol being conditioned by the previous symbol. If we assign two symbols to each state, we have a second order Markov source; in this case, the latest pair of symbols corresponds to a state. An ergodic Markov source is a source for which the sequence of probability distributions Pn, n = 1, 2, …, over the whole set of states approaches a probability distribution w that is independent of the initial probability distribution. This distribution w is called the asymptotic distribution, meaning:

lim (n→∞) (T^n)ij = wj
where wj is independent of i. The number wj is called the asymptotic probability of the sj state, and the matrix

W = lim (n→∞) T^n

has all its lines equal to the vector w = [w1 w2 … wr].
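The independence of wj from the starting state i can be illustrated numerically: raising a transition matrix to a high power makes all its lines equal. The 3-state matrix below is an illustrative example, not data from the program:

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of lines."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Illustrative ergodic 3-state transition matrix (each line sums to 1).
T = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.2, 0.2, 0.6]]

P = T
for _ in range(99):   # P = T**100
    P = matmul(P, T)

for line in P:
    print([round(x, 6) for x in line])  # every line approximates w
```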
The necessary and sufficient condition for w to be the equilibrium distribution of a Markov source is:

w = w · T, with wj ≥ 0 and w1 + w2 + … + wr = 1
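For a two state source this condition can be solved by hand: writing t12 = a and t21 = b, the equilibrium distribution is w = (b/(a+b), a/(a+b)). A quick numerical check, with illustrative values for a and b:

```python
# Illustrative off-diagonal transition probabilities of a 2-state source.
a, b = 0.1, 0.2              # t12 = a, t21 = b
T = [[1 - a, a],
     [b, 1 - b]]

# Closed-form equilibrium distribution for the 2-state case.
w = [b / (a + b), a / (a + b)]

# Check the condition w = w * T component by component.
wT = [w[0] * T[0][j] + w[1] * T[1][j] for j in range(2)]
print(w)   # approximately [0.666..., 0.333...]
print(wT)  # the same vector, confirming w * T = w
```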
The entropy of a source with memory is defined as the entropy of a random symbol of the source after observing all the previous symbols. The mean quantity of information when generating a symbol for a source with memory is:

H(X) = - Σi wi Σj p(xj / si) log2 p(xj / si)

and

H(X) ≤ log2 |X|
(|X| is the number D of symbols in the X alphabet.) In most cases, Markov sources are not stationary. If the source can't remain in one state (or can remain in it only for a small number of steps), the state is called a transitory state. If a state, once reached, can't be left (or can be left only after a large number of steps), it is called an absorbent state. If the sequence of successive probability values is convergent, then its limit is the vector of the equilibrium distribution. This convergence can be monotonic or oscillating.

3. Program description
The program contains a theoretical presentation, computations and examples. Using the program requires a few explanations regarding access to its different parts.
The first screen is a generic screen containing the title of the laboratory, immediately followed by the screen containing the menu. The menu offers seven study options, each with a number of screens for study. To exit an option, press the Q (Quit) key. To choose an option from the menu, press the number key corresponding to that option. After choosing an option we can navigate back and forward by pressing the N (Next) key or the P (Previous) key; this can be done at any time. In the computations part there are further options, detailed below:

C (Change): change the data
Enter: trigger the reading of the data / perform one computation step
V (Vector): calculate the equilibrium distribution vector
M (Matrix): calculate a matrix with a specified equilibrium distribution vector
The input data must satisfy all the stated conditions; otherwise it will be rejected.

4. Laboratory tasks

4.1. Read the theoretical introduction

4.2. Two state Markov source
To illustrate a few aspects of a two state source, go through the following examples and draw the conclusions.

4.2.1. Enter the following data:
After 5 steps the stationary vector is reached through a monotonic convergence. Notice the rapidity of the convergence.

4.2.2. Enter the following data:
The convergence is oscillating. After 30 steps the oscillation amplitude is smaller than 0.001. What is the cause of this attenuation?

4.2.4. Enter the following data:
After 450 steps, the oscillation amplitude is still relatively high. What is the cause of this slow convergence? Modify the P0 vector by changing its values. What is the difference? Draw the variation of the probabilities over time.

4.3. The Markov source with 3 states
Go over the following examples and draw the conclusions.

4.3.1. Enter:
After 150 steps we reach a stationary state. The convergence is oscillating with oscillation period 2. Why? Notice that Pn(3) has a monotonic convergence.

4.3.2. Enter:
After 150 steps we reach a stationary state. The convergence is oscillating with oscillation period 3. Why?

4.3.3. Enter:
The convergence is monotonic and slow. Why?

4.4. The Markov source with 4 states
Go over the following examples and draw the conclusions.

4.4.1. Enter:
The convergence is oscillating with oscillation period 4. All the values have an oscillating convergence. This convergence is faster than for the 3 state source. Explain.

4.4.2. Enter:
While Pn(1), Pn(2) and Pn(3) have an oscillating convergence with oscillation period equal to 3, Pn(4) converges monotonically. Why? The stationary vector is reached after about 100 steps.

4.4.3. Enter:
Notice that Pn(1) and Pn(2) have an oscillating convergence with the period equal to 2, while Pn(3) and Pn(4) converge monotonically. Explain. Enter a matrix T and a vector P0 for which all the components of the Pn vector converge monotonically.

4.5. Stationarity of a 2 state Markov source
The emission process can be made stationary either by modifying the matrix T or the initial vector P0.

4.5.1. Determining the equilibrium distribution vector

4.5.1.1. Enter:
Verify this result by returning to the second option in the menu. The verification can be done in two ways. The first is to check that, with P0 = Pst, we have Pn = Pst at any moment. The second is to enter an arbitrary initial vector and let the program determine the limit vector of the convergence by reading Pn for large values of n. The second method is preferred when the convergence is fast or when the components of the stationary vector have many zero decimals.

4.5.1.2. Enter:
Check these values. If you use the second method you will notice an oscillating convergence.

4.5.1.3. Enter:
Explain this result. Remember that finding the equilibrium distribution vector can be done using the second option in the menu by reading the Pn vector for a very large value of n.

4.5.2. Determining the transition matrix

4.5.2.1. We have:
Build two transition matrices that have this vector as the equilibrium distribution vector. If we choose l = 0.5 we get:
4.5.2.2. Build two transition matrices that have this stationary vector. If we choose l = 0.5 we get:
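The construction behind this step can be sketched numerically. One standard parametrization (an assumption here, since the program's actual formulas are not reproduced) fixes t12 = l and sets t21 = l·p1/p2, which forces Pst·T = Pst; the stationary vector below is also only an assumed illustration:

```python
# Assumed stationary vector (the one from the program screen is not
# reproduced here) and the free parameter l from the text.
p1, p2 = 0.4, 0.6
lam = 0.5   # l; must satisfy lam * p1 / p2 <= 1 for T to be valid

# Choosing t12 = lam forces t21 = lam * p1 / p2 so that Pst * T = Pst.
t12 = lam
t21 = lam * p1 / p2
T = [[1 - t12, t12],
     [t21, 1 - t21]]

# Verify: Pst * T should give back Pst.
check = [p1 * T[0][j] + p2 * T[1][j] for j in range(2)]
print(T)
print(check)  # approximately [0.4, 0.6]
```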
4.5.2.3. We have:
Why can we choose the components specified in the message?

4.6. Stationarity of a 3 state Markov source

4.6.1. Determining the equilibrium distribution vector
Enter:
Verify this result. The slow oscillating convergence delays the reading of the equilibrium distribution vector if we use the second method for verification; in this case the first method is preferred.

4.6.2. Determining the transition matrix
Enter the stationary vector determined in the previous step. Build 5 transition matrices that have this vector as the equilibrium distribution vector. This can be done by choosing: k1 = 0, k2 = 0, k3 = 0.1, k4 = 0.9
We will obtain the matrix from which we started. We can modify the values of the parameters k. For example, enter:
k1 = 0, k2 = 0, k3 = 0.1, k4 = 0.95
k1 = 0, k2 = 0, k3 = 0.1, k4 = 0.8
k1 = 0, k2 = 0, k3 = 0.1, k4 = 0.3
k1 = 0, k2 = 0, k3 = 0.1, k4 = 0
What happens if we enter k3 = 0 in the sets above? Verify the results. Try to modify the parameters k1 and k2. Remember that when the number of states increases, there are more and more possibilities to build a transition matrix for a given Pst. This freedom of construction is, however, greatly reduced by the difficulty of choosing the variable parameters so that the T matrix is indeed a transition matrix.
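Checking that a candidate matrix is "indeed a transition matrix" and keeps the prescribed stationary vector can be automated. The stationary vector and matrix below are assumed illustrations, not the laboratory data; the matrix is built as T = a·I + (1 - a)·R, where every line of R equals Pst, a construction that always leaves Pst stationary:

```python
def is_transition_matrix(T, tol=1e-9):
    """Every line must be nonnegative and sum to 1."""
    return all(all(x >= -tol for x in line) and abs(sum(line) - 1) < tol
               for line in T)

def is_stationary(p, T, tol=1e-9):
    """Check that p * T == p component by component."""
    r = len(p)
    pT = [sum(p[i] * T[i][j] for i in range(r)) for j in range(r)]
    return all(abs(pT[j] - p[j]) < tol for j in range(r))

# Assumed stationary vector (illustration only) and mixing parameter a.
p = [0.2, 0.3, 0.5]
a = 0.5

# T = a*I + (1 - a)*R, with every line of R equal to p; then p*T = p.
T = [[a * (1 if i == j else 0) + (1 - a) * p[j] for j in range(3)]
     for i in range(3)]

print(T)
print(is_transition_matrix(T))  # True
print(is_stationary(p, T))      # True
```

Varying the parameter a (and perturbing entries while keeping the checks green) gives a whole family of transition matrices with the same Pst, as the text describes.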
5. Questions

5.1. What is a first order Markov source?
5.2. What is the difference between the states and the symbols of a first order Markov source? What about for a higher order Markov source?
5.3. How does the emission process of such a source work?
5.4. When does monotonic and when does oscillating convergence appear?
5.5. When is a convergence slow?
5.6. Is it possible to have an oscillating convergence with oscillation period equal to 4 for a 2 state source? What about for a 3 state source? Why?
5.7. How can we achieve a stationary emission process?
5.8. Why are there several transition matrices for the same equilibrium distribution vector?
5.9. What's the difference between a transitory state and an absorbent state?
5.10. When can we calculate the entropy of a Markov source?