Markov Chain

[Figure: Hidden Markov model]

In mathematics, a Markov chain (pronounced ‘mahr-kahv’), named after Russian mathematician Andrey Markov (1856–1922), is a discrete (finite or countable) random process with the Markov property (the memoryless property of a stochastic [random] process). A discrete random process is a system that can be in various states and that changes between them randomly in discrete steps. It can be helpful to think of the system as evolving through discrete steps in time, although strictly speaking the ‘step’ may have nothing to do with time.

A stochastic process has the Markov property if the conditional probability distribution of future states of the process depends only upon the present state, not on the sequence of events that preceded it. (Given two jointly distributed random variables X and Y, the conditional probability distribution of Y given X is the probability distribution of Y when X is known to be a particular value.) The term ‘Markov assumption’ is used to describe a model where the Markov property is assumed to hold, such as a hidden Markov model (in which the system being modeled is assumed to be a Markov process with unobserved [hidden] states).

An example of a Markov chain is the dietary habits of a creature that eats only grapes, cheese or lettuce, and whose dietary habits conform to the following (artificial) rules: It eats exactly once a day. If it ate cheese yesterday, it will eat lettuce or grapes today with equal probability for each, and zero chance of eating cheese. If it ate grapes yesterday, it will eat grapes today with probability 1/10, cheese with probability 4/10 and lettuce with probability 5/10. Finally, if it ate lettuce yesterday, it won’t eat it again today, but will eat grapes with probability 4/10 or cheese with probability 6/10. This creature’s eating habits can be modeled with a Markov chain since its choice today depends only on what it ate yesterday, not additionally on what it ate 2 or 3 (or 4, etc.) days ago. One statistical property that can be calculated is the expected percentage of the time the creature will eat cheese over a long period.
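
As a concrete illustration, the rules above can be written as a transition matrix, and the long-run cheese percentage can be read off from the chain’s stationary distribution. A minimal sketch in Python using numpy (the state ordering grapes, cheese, lettuce is an arbitrary choice):

```python
import numpy as np

# Transition matrix for the creature's diet, with states ordered as
# (grapes, cheese, lettuce); row i gives the probabilities of
# tomorrow's meal given that today's meal is state i.
P = np.array([
    [0.1, 0.4, 0.5],   # after grapes
    [0.5, 0.0, 0.5],   # after cheese: lettuce or grapes, never cheese
    [0.4, 0.6, 0.0],   # after lettuce: never lettuce again
])

# Power iteration: push an initial distribution through the chain
# repeatedly until it settles on the stationary distribution.
pi = np.array([1.0, 0.0, 0.0])   # day 0: the creature ate grapes
for _ in range(1000):
    pi = pi @ P

print(dict(zip(["grapes", "cheese", "lettuce"], pi.round(4))))
```

For these particular rules the stationary distribution works out to one third for each food, so the creature eats cheese on about a third of days in the long run.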

A discrete-time random process involves a system which is in a certain state at each step, with the state changing randomly between steps. The steps are often thought of as moments in time, but they can equally well refer to physical distance or any other discrete measurement; formally, the steps are the integers or natural numbers, and the random process is a mapping of these to states. The Markov property states that the conditional probability distribution for the system at the next step (and in fact at all future steps) depends only on the current state of the system, and not additionally on the state of the system at previous steps. Since the system changes randomly, it is generally impossible to predict with certainty the state of a Markov chain at a given point in the future. However, the statistical properties of the system’s future can be predicted. In many applications, it is these statistical properties that are important.
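
This predictability of statistical properties can be made concrete: the distribution over states n steps from now is simply the current distribution multiplied by the n-th power of the transition matrix. A short sketch, reusing the diet chain from above:

```python
import numpy as np

# Diet-chain transition matrix from the earlier sketch,
# states ordered (grapes, cheese, lettuce).
P = np.array([[0.1, 0.4, 0.5],
              [0.5, 0.0, 0.5],
              [0.4, 0.6, 0.0]])

# Exact distribution over meals n days ahead, given cheese today.
# No simulation is needed: just the n-th power of the matrix.
n = 7
pi_0 = np.array([0.0, 1.0, 0.0])              # certain: cheese today
pi_n = pi_0 @ np.linalg.matrix_power(P, n)
print(pi_n.round(4))                          # forecast for day n
```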

The changes of state of the system are called transitions, and the probabilities associated with various state changes are called transition probabilities. The set of all states and transition probabilities completely characterizes a Markov chain. By convention, we assume all possible states and transitions have been included in the definition of the process, so there is always a next state and the process goes on forever. A famous Markov chain is the so-called ‘drunkard’s walk,’ a random walk on the number line where, at each step, the position may change by +1 or −1 with equal probability. From any position there are two possible transitions, to the next or previous integer. The transition probabilities depend only on the current position, not on the manner in which the position was reached (i.e. the walk is memoryless). For example, the transition probabilities from 5 to 4 and from 5 to 6 are both 0.5, and all other transition probabilities from 5 are 0. These probabilities are independent of whether the system was previously in 4 or 6.
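
A sketch of the drunkard’s walk using only Python’s standard library (the step count and the number of walks shown are arbitrary choices):

```python
import random

def drunkards_walk(steps, start=0):
    """Random walk on the integers: each step is +1 or -1 with equal
    probability, depending on nothing but the current position."""
    position = start
    for _ in range(steps):
        position += random.choice((-1, 1))
    return position

# Five independent 100-step walks, all started at 0.
print([drunkards_walk(100) for _ in range(5)])
```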

A series of independent events (for example, a series of coin flips) satisfies the formal definition of a Markov chain. However, the theory is usually applied only when the probability distribution of the next step depends non-trivially on the current state. Many other examples of Markov chains exist: they have found application in a wide range of fields, including physics, chemistry, medicine, music, game theory and sports. Markovian systems appear extensively in thermodynamics and statistical mechanics, whenever probabilities are used to represent unknown or unmodeled details of the system, provided it can be assumed that the dynamics are time-invariant and that no relevant history need be considered which is not already included in the state description.

Markov chains are used throughout information processing. American mathematician Claude Shannon’s famous 1948 paper ‘A Mathematical Theory of Communication,’ which in a single step created the field of information theory, opens by introducing the concept of entropy through Markov modeling of the English language. Such idealized models can capture many of the statistical regularities of systems. Even without describing the full structure of the system perfectly, such signal models can make possible very effective data compression through entropy encoding techniques such as arithmetic coding. They also allow effective state estimation and pattern recognition, and Markov chains play an important role in reinforcement learning.
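
For a Markov source, the entropy per symbol has a simple closed form: weight each state’s row entropy by how often the chain visits that state. A sketch applied to the diet chain above (rather than to Shannon’s actual letter-level models of English):

```python
import numpy as np

# Entropy rate of a Markov source: the stationary probability of each
# state times the Shannon entropy of that state's outgoing row.
P = np.array([[0.1, 0.4, 0.5],
              [0.5, 0.0, 0.5],
              [0.4, 0.6, 0.0]])
pi = np.array([1/3, 1/3, 1/3])   # stationary distribution (computed earlier)

row_entropies = [-sum(p * np.log2(p) for p in row if p > 0) for row in P]
H = float(pi @ row_entropies)
print(f"entropy rate: {H:.3f} bits per symbol")
```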

Markov chains are also the basis for hidden Markov models, an important tool in fields as diverse as telephone networks, speech recognition and bioinformatics. Markov chains likewise underlie the analytical treatment of queues (queueing theory), a subject initiated by Danish engineer Agner Krarup Erlang in 1917. This makes them critical for optimizing the performance of telecommunications networks, where messages must often compete for limited resources (such as bandwidth). The PageRank of a webpage as used by Google is defined by a Markov chain. Markov models have also been used to analyze the web navigation behavior of users: a user’s link transitions on a particular website can be modeled using first- or second-order Markov models and used to make predictions about future navigation and to personalize the web page for an individual user.
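
In essence, PageRank is the stationary distribution of a ‘random surfer’ Markov chain that usually follows a random outgoing link and occasionally jumps to a random page. A minimal sketch; the four-page link graph and the damping factor of 0.85 are illustrative assumptions, not Google’s actual data:

```python
import numpy as np

# A tiny, invented link graph: page -> pages it links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
n, d = 4, 0.85   # number of pages, damping factor

# Transition matrix of the link-following chain.
P = np.zeros((n, n))
for page, targets in links.items():
    for t in targets:
        P[page, t] = 1.0 / len(targets)

# 'Google matrix': follow a link with probability d, otherwise
# jump to a page chosen uniformly at random.
G = d * P + (1 - d) / n

rank = np.full(n, 1.0 / n)   # start from the uniform distribution
for _ in range(100):         # power iteration to convergence
    rank = rank @ G
print(rank.round(4))         # PageRank of each page
```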

Markov chains are also used in describing path-dependent arguments, where current structural configurations condition future outcomes. An example is the reformulation of the idea, originally due to Karl Marx’s ‘Das Kapital,’ tying economic development to the rise of capitalism. In current research, it is common to use a Markov chain to model how, once a country reaches a specific level of economic development, the configuration of structural factors, such as the size of the commercial bourgeoisie, the ratio of urban to rural residence and the rate of political mobilization, will generate a higher probability of transitioning from an authoritarian to a democratic regime.

Markov chains can be used to model many games of chance. The children’s games ‘Snakes and Ladders’ and ‘Hi Ho! Cherry-O,’ for example, are represented exactly by Markov chains: at each turn, the player starts in a given state (on a given square) and from there has fixed odds of moving to certain other states (squares). Markov chain models have been used in advanced baseball analysis since 1960, although their use is still rare. Each half-inning of a baseball game fits the Markov chain model when the number of base runners and outs is considered: during any at-bat, there are 24 possible combinations of the number of outs and the positions of the runners. Such models can be used to evaluate the runs created by individual players as well as by a team.
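
A quick Monte Carlo sketch of the board-game idea, on a deliberately tiny, made-up Snakes and Ladders board (the board size and jump positions are invented for illustration):

```python
import random

# The state is just the current square, so the game is exactly a
# Markov chain; the dict maps a landed-on square to where a snake
# or ladder sends the player.
JUMPS = {3: 11, 6: 2, 9: 13, 14: 4}
GOAL = 16

def play():
    square, turns = 0, 0
    while square < GOAL:
        # Simplification: overshooting the last square still wins.
        square = min(square + random.randint(1, 6), GOAL)
        square = JUMPS.get(square, square)
        turns += 1
    return turns

games = [play() for _ in range(100_000)]
print(f"estimated expected turns: {sum(games) / len(games):.2f}")
```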

Markov processes can also be used to generate superficially ‘real-looking’ text given a sample document: they are used in a variety of recreational ‘parody generator’ programs. These processes are also used by spammers to inject real-looking hidden paragraphs into unsolicited email and into posted comments, in an attempt to get these messages past spam filters.
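
A minimal first-order, word-level generator of this kind, with a made-up sample sentence standing in for the sample document:

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words observed to follow it."""
    chain = defaultdict(list)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        chain[current].append(nxt)
    return chain

def generate(chain, start, length=12):
    """Walk the chain: each word depends only on the previous word."""
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:            # dead end: no observed successor
            break
        out.append(random.choice(followers))
    return " ".join(out)

sample = ("the cat sat on the mat and the dog sat on the rug "
          "and the cat saw the dog")
print(generate(build_chain(sample), "the"))
```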
