2 Discrete Probability Spaces

2.1 Bernoulli trials

Generating a sequence of Bernoulli trials of a given length and with a given probability parameter

n = 10 # length of the sequence
p = 0.5 # probability parameter
X = sample(c("H","T"), n, replace = TRUE, prob = c(p,1-p))
noquote(X)
 [1] H T H H H T H H H H

The number of heads in that sequence

sum(X == "H")
[1] 8

2.2 Sampling without replacement

An urn with \(r\) red ball and \(b\) blue balls represented as a vector

r = 7
b = 10
urn = c(rep("R", r), rep("B", b))
noquote(urn)
 [1] R R R R R R R B B B B B B B B B B

Sampling uniformly at random without replacement from that urn a given number of times

n = 5
X = sample(urn, n)
noquote(X)
[1] B B B R B

2.3 Pólya’s urn model

A direct (and naive) implementation of Pólya’s urn model

polyaUrn = function(n, r, b){
 # n: number of draws
 # r: number of red balls
 # b: number of blue balls
 urn = c(rep("R", r), rep("B", b))
 X = character(n) # stores the successive draws
 for (i in 1:n){
  draw = sample(urn, 1)
  X[i] = draw
  if (draw == "R"){
   urn = c("R", urn)
  } else {
   urn = c(urn, "B")
  }
 }
 return(X)
}
X = polyaUrn(10, 2, 3)
noquote(X)
 [1] B R R B B B B B B B

2.4 Factorials and binomials coefficients

Factorials are quickly very large. To see this, we plot the first few in logarithmic scale (in base 10).

n = 200
plot(0:n, log10(factorial(0:n)), type = "l", lwd = 2, ylab = "factorials (log in base 10)", xlab = "integers")

The factorials quickly exceed the maximum possible number in R, which is given by the following.

.Machine$double.xmax
[1] 1.797693e+308

Binomial coefficients can also be very large even for moderate values of the integers defining them.

par(mfrow = c(2, 2), mai = c(0.8, 0.5, 0.2, 0.2))

n = 10
plot(0:n, log10(choose(n, 0:n)), type = "l", lwd = 2, ylab = "binomials (log in base 10)", xlab = "k (out of n = 10)")

n = 20
plot(0:n, log10(choose(n, 0:n)), type = "l", lwd = 2, ylab = "binomials (log in base 10)", xlab = "k (out of n = 10)")

n = 50
plot(0:n, log10(choose(n, 0:n)), type = "l", lwd = 2, ylab = "binomials (log in base 10)", xlab = "k (out of n = 50)")

n = 100
plot(0:n, log10(choose(n, 0:n)), type = "l", lwd = 2, ylab = "binomials (log in base 10)", xlab = "k (out of n = 100)")