• Introduction
  • Examples
  • RNA-seq
  • Calculating confidence intervals
    • Webtool

A Poisson distribution is the probability distribution that results from a Poisson experiment. A probability distribution assigns a probability to possible outcomes of a random experiment. A Poisson experiment has the following properties:

  1. The outcomes of the experiment can be classified as either successes or failures.
  2. The average number of successes that occurs in a specified region is known.
  3. The probability that a success will occur is proportional to the size of the region.
  4. The probability that a success will occur in an extremely small region is virtually zero.

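A Poisson random variable is the number of successes that result from a Poisson experiment. Given the mean number of successes $\mu$ that occur in a specified region, we can compute the Poisson probability of observing exactly $x$ successes using the following formula:

$ P(x; \mu) = \frac{e^{-\mu} \mu^{x}}{x!} $

which is also written, with $\lambda$ for the mean and $k$ for the number of successes, as:

$ \Pr(X = k) = e^{-\lambda} \frac{\lambda^{k}}{k!}, \quad k = 0, 1, 2, \ldots $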
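As a quick sanity check, the formula can be typed out in R and compared with the built-in dpois(). This is only a minimal sketch; poisson_pmf is a name I made up for illustration, and lambda = 2 is an arbitrary choice.

```r
# Hypothetical helper: the Poisson formula written out by hand
poisson_pmf <- function(k, lambda) {
  exp(-lambda) * lambda^k / factorial(k)
}

k <- 0:10
manual  <- poisson_pmf(k, lambda = 2)
builtin <- dpois(k, lambda = 2)

# TRUE if the hand-written formula agrees with dpois()
all.equal(manual, builtin)

# The probabilities over all possible k should sum to 1;
# k = 0..100 covers essentially all of the mass when lambda = 2
total <- sum(poisson_pmf(0:100, lambda = 2))
total
```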


Examples

Suppose homes are sold at an average rate of 2 per day. What is the probability that exactly 3 homes will be sold tomorrow?

$ P(3; 2) = \frac{e^{-2} \, 2^{3}}{3!} $

Calculating this manually in R:

e <- exp(1)
(e^-2 * 2^3) / factorial(3)
[1] 0.180447

Using dpois():

dpois(x = 3, lambda = 2)
[1] 0.180447
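The same answer can be approximated by simulation, which I find a useful way to convince myself that the formula means what I think it means. The sketch below assumes 100,000 simulated days and an arbitrary seed; neither comes from the example itself.

```r
set.seed(1)                  # arbitrary seed, only for reproducibility
days <- 100000               # number of simulated days (my choice)
homes_sold <- rpois(days, lambda = 2)

# Fraction of simulated days on which exactly 3 homes were sold
simulated <- mean(homes_sold == 3)
expected  <- dpois(x = 3, lambda = 2)

c(simulated = simulated, expected = expected)
```

The simulated fraction should land close to 0.18, with the gap shrinking as the number of days grows.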


RNA-seq

The Poisson distribution can be used to estimate the technical variance in high-throughput sequencing experiments. My basic understanding is that the variance between technical replicates can be modelled using the Poisson distribution. Check out "Why Does Rna-Seq Read Count Fit Poisson Distribution?" on Biostars.

From Chris Miller:

Picture a process whereby you take the genome and choose a location at random to produce a read. This is a Poisson process. If you plot the depth of sequence along this theoretical genome, it will be a poisson distribution.
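The process described in the quote can be sketched in R. This is a simplification under my own assumptions: the genome length and number of reads are made up, every read is reduced to its start position, and "depth" here means the number of reads starting at each position.

```r
set.seed(1)                  # arbitrary seed
genome_length <- 100000      # made-up genome size
n_reads       <- 2000000     # made-up number of reads

# Pick a start position uniformly at random for every read
starts <- sample.int(genome_length, n_reads, replace = TRUE)

# Number of reads starting at each position
depth <- tabulate(starts, nbins = genome_length)

mean(depth)   # n_reads / genome_length = 20
var(depth)    # should be close to the mean for Poisson-like counts

# Fraction of positions with depth exactly 20, next to the Poisson value
c(observed = mean(depth == 20), poisson = dpois(20, lambda = 20))
```

For a Poisson distribution the mean and variance are equal, so a variance close to the mean is what I would expect to see here.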

Calculating confidence intervals

Calculate the confidence intervals using R. Create data with 1,000,000 values that follow a Poisson distribution with lambda = 20.

n <- 1000000
data <- rpois(n, 20)

Functions for calculating the lower and upper tails.

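# Lower limit of the 95% interval for a count n
poisson_lower_tail <- function(n) {
   qchisq(0.025, 2*n)/2
}
# Upper limit of the 95% interval for a count n
poisson_upper_tail <- function(n) {
   qchisq(0.975, 2*(n+1))/2
}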

Lower limit for lambda = 20.

poisson_lower_tail(20)
[1] 12.21652

Upper limit for lambda = 20.

poisson_upper_tail(20)
[1] 30.88838
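As a cross-check, the exact interval reported by poisson.test() from the stats package should agree with the chi-square formula. The sketch below writes the formula inline so that it runs on its own; treating 20 as a single observed count is my reading of the calculation.

```r
# Chi-square form of the interval, for an observed count of 20
chisq_ci <- c(qchisq(0.025, 2 * 20) / 2,
              qchisq(0.975, 2 * (20 + 1)) / 2)

# Exact interval reported by poisson.test() for the same count
exact_ci <- as.numeric(poisson.test(20, conf.level = 0.95)$conf.int)

chisq_ci
exact_ci

# TRUE if the two approaches give the same limits
all.equal(chisq_ci, exact_ci)
```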

How many values in data are lower than the lower limit?

table(data < poisson_lower_tail(20))

 FALSE   TRUE 
961213  38787 

How many values in data are higher than the upper limit?

table(data > poisson_upper_tail(20))

 FALSE   TRUE 
986239  13761 

What percentage of values were outside of the 95% CI?

(sum(data<poisson_lower_tail(20)) + sum(data>poisson_upper_tail(20))) * 100 / n
[1] 5.2548
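The result is a little over 5% rather than exactly 5%, which I believe comes down to the counts being whole numbers: the limits fall between integers, so the tail probabilities cannot be exactly 2.5% each. A minimal sketch of the expected percentage, computed with ppois() instead of simulation:

```r
lower <- qchisq(0.025, 2 * 20) / 2        # 12.21652
upper <- qchisq(0.975, 2 * (20 + 1)) / 2  # 30.88838

# Counts are integers, so "below lower" means X <= 12
# and "above upper" means X >= 31
p_below <- ppois(floor(lower), lambda = 20)
p_above <- ppois(floor(upper), lambda = 20, lower.tail = FALSE)

# Expected percentage of Poisson(20) values outside the interval
pct_outside <- (p_below + p_above) * 100
pct_outside
```

This comes out at about 5.25%, in line with the simulated value.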




Webtool

Entering lambda = 20 into the Poisson Confidence Interval Calculator returns:

  • 99% confidence interval: 10.35327 - 34.66800
  • 95% confidence interval: 12.21652 - 30.88838
  • 90% confidence interval: 13.25465 - 29.06202

which matches our 95% CI values.
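The other two intervals can be reproduced in R by making the confidence level a parameter. This is a sketch that assumes the calculator uses the same chi-square formula as the functions earlier in this post; poisson_ci is a hypothetical helper name of my own.

```r
# Hypothetical helper: chi-square interval for a count x
poisson_ci <- function(x, conf_level = 0.95) {
  alpha <- 1 - conf_level
  c(lower = qchisq(alpha / 2, 2 * x) / 2,
    upper = qchisq(1 - alpha / 2, 2 * (x + 1)) / 2)
}

# One row per confidence level: 99%, 95%, 90%
levels <- c(0.99, 0.95, 0.90)
cis <- t(sapply(levels, function(level) poisson_ci(20, level)))
rownames(cis) <- paste0(levels * 100, "%")
cis
```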

