Licensed under a Creative Commons BY-NC-SA 4.0 License.

1 Discrete-time Markov chains

1.1 Introduction

We plan to use the markovchain library, by Giorgio Alfredo Spedicato, to deal with Markov chains in R. This first section is devoted to introduce its classes and methods using a simple weather prediction model (W: warm, C: cold):

\[\xymatrix{ *=<5mm,5mm>[o][F]{W} \ar@(ul,dl)[]_{0.6} \ar@/^/[r]^{0.4} & *=<5mm,5mm>[o][F]{C} \ar@(dr,ur)[]_{0.7} \ar@/^/[l]^{0.3} }\qquad\qquad P = \begin{pmatrix} 0.6 & 0.4 \\ 0.3 & 0.7 \end{pmatrix}\]

This intro was based on the official documentation.

First of all, we need to import the new library to get access to it. The following line is equivalent to C’s #include or Python’s import:

## Warning: package 'markovchain' was built under R version 3.6.3

1.2 Transition matrix definition

To start with, we define the transition matrix of the example above using the function matrix (see ?matrix). The components are arranged in a plain vector, so we must specify one dimension at least (number of rows or columns) and tell how to read this vector (by rows or columns). In our example, we define a 2x2 matrix (ncol = 2) by rows (byrow = TRUE):

P <- matrix(data = c(0.6, 0.4,
                     0.3, 0.7), 
            ncol = 2, byrow = TRUE)

1.3 Markov chains in R

A Markov chain is described by a “markovchain” object. We’ll use the function new to create this type of objects by feeding it with the states’ name and the transition matrix. Optionally, we can set a name for the chain.

weather <- new("markovchain", name = "Simple weather model", 
               states = c("W", "C"), transitionMatrix = P)
## Simple weather model 
##  A  2 - dimensional discrete Markov Chain defined by the following states: 
##  W, C 
##  The transition matrix  (by rows)  is defined as follows: 
##     W   C
## W 0.6 0.4
## C 0.3 0.7
## Simple weather model  Markov chain that is composed by: 
## Closed classes: 
## W C 
## Recurrent classes: 
## {W,C}
## Transient classes: 
## NONE 
## The Markov chain is irreducible 
## The absorbing states are: NONE

We cannot use the common multiplication * and exponentiation ^ operators with R matrices. The product of two matrices can be done as follows:

P %*% P
##      [,1] [,2]
## [1,] 0.48 0.52
## [2,] 0.39 0.61

And the matrix exponentiation is defined in the non-standard library expm:

## Warning: package 'expm' was built under R version 3.6.3
## Loading required package: Matrix
## Attaching package: 'expm'
## The following object is masked from 'package:Matrix':
##     expm
##        [,1]   [,2]
## [1,] 0.4332 0.5668
## [2,] 0.4251 0.5749

However, a markovchain behaves like a transition matrix and accepts the common arithmetic operators. The following table lists the main operations defined over this object.

Method Result
mcA * mcB Transition matrix multiplication
mcA^4 Transition matrix exponentiation
mcA[1,] Accessing row 1 of the ransition matrix
mcA[, 3] Accessing column 3 of the ransition matrix
mcA[3, 2] Accessing element (3, 2) of the ransition matrix
t(mcA) Transition matrix transposition
mcA == mcB Comparing two objects (states and matrix)
dim(mcA) Transition matrix dimension
states(mcA) Extracting the states

Therefore, we can use this object directly to operate with the Chapman-Kolmogorov equation:

\[\pi^{(n)} = \pi^{(0)}P^n\]

We start from the initial state “C” and obtain the probability distribution vector for \(n = 6\):

initialState <- c(0, 1)
newState <- initialState * weather^6
##             W        C
## [1,] 0.428259 0.571741

Moreover, the markovchain object gives us a lot of useful information about the chain: irreducibility, accessibility, classes, types of states and so on and so forth.

## [1] TRUE
is.accessible(weather, from = "W", to = "C")
## [1] TRUE
## [1] 1
## character(0)
## character(0)
##              W         C
## [1,] 0.4285714 0.5714286

The more interesting part lies on the simulation and estimation capabilities of this library. Given a Markov chain, simulation is performed in the same way as conventional random variables (rnorm, rexp, etc.) using the function rmarkovchain.

Generation of 1000 random samples from the “weather” chain with random initial state:

x <- rmarkovchain(1000, weather)

Generation of 1000 random samples with initial state “W”:

y <- rmarkovchain(1000, weather, t0 = "W")

Estimation consists of fitting the transition probabilities from a vector of random samples. Fitting from the previous x vector is close to the original chain:

weatherFit <- markovchainFit(x)

1.4 Convergence

Given an irreducible and aperiodic Markov chain, it can be shown that there exists a single steady state to which the chain converges when \(n\to\infty\) with independence of the initial state. Using our weather prediction model, let’s illustrate this property by raising the transition matrix to a high enough power and observing that all rows of the resulting matrix tend to the steady state:

## [1] TRUE
## Simple weather model^100 
##  A  2 - dimensional discrete Markov Chain defined by the following states: 
##  W, C 
##  The transition matrix  (by rows)  is defined as follows: 
##           W         C
## W 0.4285714 0.5714286
## C 0.4285714 0.5714286
##              W         C
## [1,] 0.4285714 0.5714286

What is that but to say, when simulating a Markov chain (obtaining random samples), that the states’ ratio will be steady with a high enough number of samples.

samples <- rmarkovchain(1000, weather)

# Cumulative ratios
yW <- cumsum(samples=="W") / seq_along(samples)
yC <- cumsum(samples=="C") / seq_along(samples)
# Evolution of the distribution of "W"
plot(yW, type="l", col="red", ylim=c(0,1))
# Evolution of the distribution of "C"
lines(yC, col="green")
# Theoretical values
abline(steadyStates(weather)[1], 0, lty=2, col="red")
abline(steadyStates(weather)[2], 0, lty=2, col="green")

The above figure shows that the simulation converge. But, which power is high enough?, how many samples are sufficient to ensure a good approximation to the steady state? The key point here is the convergence time, which tells us the speed of error decay.

The error can be defined as the sum of the absolute differences between elements of two consecutive powers of the transition matrix, that is,

\[\epsilon(n) = \sum_i\sum_j|\left(P^n\right)_{ij} - \left(P^{n-1}\right)_{ij}|\]

Let’s implement an R function to obtain this error.

# Error function
# @param n is the time (one value or a vector of values)
# @param mc is the Markov chain
err <- function(n, mc) {
  # Check that the input values are positive integers
  if (!isTRUE(all(n > 0 && n == floor(n)))) 
    stop("'n' must only contain positive integer values")
  # Reserve some memory for the result
  res <- numeric(length(n))
  # For each n, calculate err(n)
  for (i in 1:length(n)) {
    Pn <- (mc^n[i])@transitionMatrix
    Pn1 <- (mc^(n[i]-1))@transitionMatrix
    res[i] <- sum(abs(Pn - Pn1))

# Create an array of values
x <- 1:10
# Call the previously created function to calculate the error
y <- err(x, weather)

plot(x, y, type="o")

As can be seen, the error decays exponentially.

2 Continuous-time Markov Chains

Example 1: A gas station has a single pump and no space for vehicles to wait (if a vehicle arrives and the pump is not available, it leaves). Vehicles arrive to the gas station following a Poisson process with a rate \(\lambda\) of 3 every 20 minutes, of which \(prob(c)=\) 75% are cars and \(prob(m)=\) 25% are motorcycles. The refuelling time can be modelled with an exponential random variable with mean \(1/\mu_c = 8\) minutes for cars and \(1/\mu_m = 3\) minutes for motorcycles.

This problem is described by the following continuous-time Markov chain:

\[\xymatrix{ *=<15mm,8mm>[o][F]{car} \ar@/^/[r]^{\mu_c} & *=<15mm,8mm>[o][F]{empty} \ar@/^/[l]^{\lambda \cdot prob(c)} \ar@/^/[r]^{\lambda \cdot prob(m)} & *=<15mm,8mm>[o][F]{m/cycle} \ar@/^/[l]^{\mu_m} }\qquad\qquad Q = \begin{pmatrix} -1/8 & 1/8 & 0 \\ 0.75\cdot 3/20 & -3/20 & 0.25\cdot 3/20 \\ 0 & 1/3 & -1/3 \end{pmatrix}\]

The chain is irreducible and recurrent. To theoretically find the steady state distribution, we have to solve the balance equations

\[pQ = 0\]

with the constraint

\[\sum_i p_i = 1\]

There are \(\operatorname{dim}(Q)-1\) independent columns, so the latter constraint is equivalent to substitute any column by ones and match it to one at the other side of the equation, that is:

\[p\begin{pmatrix} 1 & 1/8 & 0 \\ 1 & -3/20 & 0.25\cdot 3/20 \\ 1 & 1/3 & -1/3 \end{pmatrix} = (1, 0, 0)\] The solution \(p\) represents the probability of being at each state in the long-term. Therefore, we can calculate the average number of customers in the system by summing these probabilities multiplied by the number of customers at each state. In our case, \(1\cdot p_1 + 0\cdot p_2 + 1\cdot p_3\).

lambda <- 3/20    # Arrival rate
mu <- c(1/8, 1/3) # Service rate (cars, motorcycles)
p <- 0.75         # Probability of car

A <- matrix(c(1,   mu[1],            0,
              1, -lambda, (1-p)*lambda,
              1,   mu[2],       -mu[2]), byrow=T, ncol=3)

P <- solve(t(A), c(1, 0, 0))
N_average_theor <- sum(P * c(1, 0, 1))
## [1] 0.5031056

In the previous chunk, the t function trasposes the matrix to solve the system of equations using solve. See the help for more details.

2.1 Simmer: a simulation tool for R

Discrete Event Simulation is a method for simulating the behaviour and performance of real-life processes or systems.For example:

  • A supermarket has 4 check-outs. Customers arrive at the less crowded queue at rate \(\lambda\). Each customer is served within a mean time of \(1/\mu\).

Basically, using DES we will be able to model problems with the following characteristics:

  • The problem follow a continuous timeline which contains multiple discrete events.

  • All events can be classified as arrivals or departures from the system, having some processing in between.

This description defines a limited subset of all the problems that can be addressed following a Discrete-Event Simulation (DES) methodology but includes all those problems we want to address during this course.

In R, we have the package simmer. Besides the simulation and monitorization framework itself, the simmer.plot package offers convenience plotting functions based on the popular ggplot2 package.

## Warning: package 'simmer' was built under R version 3.6.3
## Warning: package 'simmer.plot' was built under R version 3.6.3

2.1.1 Simmer: Trajectory

Simulating with simmer leverages the concept of trajectory. A trajectory is the path that a certain type of arrivals follows during their lifetime. Let’s take the previous supermarket example. The customer’s trajectory could be modelled as follows:

  1. Arrive at a check-out.
  2. Seize it (waiting in the queue if it is occupied).
  3. Wait an exponential service time \(\mu=4\).
  4. Release the check-out and leave.
# %>% operator concatenates operations

customer <- trajectory() %>%
  seize("check-out", amount=1) %>%
  timeout(function()rexp(1, 4)) %>%
  release("check-out", amount=1)

2.1.2 Simmer: Simulation

We can create a simulation environment with the necessary resources (4 check-outs, infinite queue) and a customer generator with rate \(\lambda=10\) following the previous trajectory:

supermarket <- simmer() %>%
  add_resource("check-out", capacity=4, queue_size=Inf) %>%
  add_generator("customer", customer, function() rexp(100, 10)) %>%
  run(until=20) # Simulation until the instant t=20

2.1.3 Simmer: Results visualization

We can visualise the average number of customers in the system over time using the following function:

# Store monitored data about arrivals, attributes and resources
df.res <- get_mon_resources(supermarket)
##    resource       time server queue capacity queue_size system limit
## 1 check-out 0.04516067      1     0        4        Inf      1   Inf
## 2 check-out 0.08772235      2     0        4        Inf      2   Inf
## 3 check-out 0.14039323      3     0        4        Inf      3   Inf
## 4 check-out 0.14761076      2     0        4        Inf      2   Inf
## 5 check-out 0.15066620      1     0        4        Inf      1   Inf
## 6 check-out 0.18132978      2     0        4        Inf      2   Inf
##   replication
## 1           1
## 2           1
## 3           1
## 4           1
## 5           1
## 6           1
# Plot results taking system column as item where the resource is called ckeck-out
## Adding steps = TRUE we will show the number of busy resources we have
plot(df.res, metric="usage", "check-out", items="system", steps=TRUE)