Probabilities

Why Probability in Biology?

Biological systems are fundamentally noisy. A single molecule diffusing through a cell follows a random path. The exact timing of a chemical reaction is unpredictable. When protein copy numbers are small, the same gene regulatory network can produce different outcomes in genetically identical cells. This intrinsic randomness means that deterministic models, while useful, cannot capture the full richness of biological behavior. Probability theory provides the mathematical language for describing and analyzing this stochasticity.

Unlike differential equations, which predict exact trajectories, probabilistic models describe distributions of possible outcomes. Instead of asking "what is the protein concentration at time $t$?", we ask "what is the probability that there are exactly $n$ protein molecules at time $t$?" This shift from deterministic to probabilistic thinking is essential for understanding cellular processes where molecule numbers are small and fluctuations are significant.

Probability Basics: Events and Random Variables

At its core, probability assigns numbers between zero and one to events. If we flip a fair coin, the probability of heads is $1/2$, meaning that in many repeated trials, roughly half will show heads.

A random variable is a quantity whose value is uncertain. We might denote the number of protein molecules in a cell at time $t$ by the random variable $N$. Unlike a deterministic variable that has a definite value, a random variable is characterized by its probability distribution.

For discrete random variables, which take on countable values like $0, 1, 2, \ldots$, we describe the distribution using a probability mass function $\mathbb{P}(N = n)$, giving the probability that $N$ equals exactly $n$. The sum of all these probabilities must equal one, since the random variable must take on some value. For continuous random variables, which can take any value in an interval, we instead use a probability density function $p(x)$, where the probability that $X$ lies between $a$ and $b$ is given by the integral $\int_a^b p(x) dx$.

The expectation or mean of a random variable represents its average value over many realizations. For a discrete random variable, the expectation is $\mathbb{E}[N] = \sum_n n \cdot \mathbb{P}(N = n)$. The variance measures spread around the mean: $\text{Var}(N) = \mathbb{E}[(N - \mathbb{E}[N])^2]$. These two quantities often suffice to characterize a distribution's essential features.
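
As a concrete illustration, the following sketch computes the expectation and variance of a simple discrete random variable, a fair six-sided die, directly from these formulas (assuming NumPy is available):

```python
import numpy as np

# A fair six-sided die: values 1..6, each with probability 1/6.
values = np.arange(1, 7)
probs = np.full(6, 1 / 6)

mean = np.sum(values * probs)                  # E[N] = sum_n n * P(N = n)
var = np.sum((values - mean) ** 2 * probs)     # Var(N) = E[(N - E[N])^2]
print(f"mean = {mean:.3f}, variance = {var:.3f}")   # 3.500, 2.917
```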

The Exponential Distribution: Waiting Times

The exponential distribution describes the time between events. If events occur at constant rate $\lambda$, the waiting time $T$ until the next event follows an exponential distribution with probability density:

$$p(t) = \lambda e^{-\lambda t}$$

for $t \geq 0$, which means that

$$\mathbb{P}(T > t) = e^{-\lambda t}.$$

The mean waiting time is $\mathbb{E}[T] = 1/\lambda$, the inverse of the rate. The standard deviation also equals $1/\lambda$, making the coefficient of variation (standard deviation divided by mean) equal to one, indicating substantial variability in waiting times.

The exponential distribution has a crucial property called memorylessness: the probability that we must wait an additional time $s$ does not depend on how long we have already waited. Mathematically, $\mathbb{P}(T > t + s \mid T > t) = \mathbb{P}(T > s)$. This reflects the assumption that the process has no memory—the probability of an event in the next instant depends only on the rate $\lambda$, not on the system's history.
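
These properties are easy to verify by sampling. The sketch below draws exponential waiting times (the rate $\lambda = 2$ is an arbitrary choice) and checks the mean, the coefficient of variation, and memorylessness:

```python
import numpy as np

rng = np.random.default_rng(42)
lam = 2.0
samples = rng.exponential(scale=1 / lam, size=1_000_000)

# Mean and standard deviation should both be close to 1/lam = 0.5.
print("mean:", samples.mean(), " std:", samples.std())
print("CV:  ", samples.std() / samples.mean())   # close to 1

# Memorylessness: P(T > t + s | T > t) should match P(T > s).
t, s = 0.4, 0.3
conditional = np.mean(samples[samples > t] > t + s)
unconditional = np.mean(samples > s)
print("P(T > t+s | T > t):", conditional, " P(T > s):", unconditional)
```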

In cellular processes, exponential distributions describe the lifetimes of mRNA and protein molecules. If degradation occurs at constant rate $\gamma$, each molecule's lifetime is exponentially distributed with mean $1/\gamma$. This is why first-order degradation kinetics lead to exponential decay in deterministic models: averaged over many molecules with independent, exponentially distributed lifetimes, the fraction still surviving at time $t$ is $e^{-\gamma t}$.
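
A short simulation makes this bridge concrete: drawing one exponential lifetime per molecule and counting survivors reproduces the deterministic decay curve (the rate $\gamma = 0.5$ is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
gamma = 0.5                        # degradation rate (illustrative)
lifetimes = rng.exponential(scale=1 / gamma, size=100_000)

# Fraction of molecules surviving past time t vs the deterministic curve.
for t in (0.0, 1.0, 2.0, 4.0):
    print(f"t = {t}: simulated {np.mean(lifetimes > t):.4f}, "
          f"exp(-gamma*t) = {np.exp(-gamma * t):.4f}")
```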

The connection between exponential waiting times and Poisson counting is deep. If waiting times between successive events are independent and exponentially distributed with rate $\lambda$, then the number of events in any fixed time interval is Poisson distributed. This duality between continuous waiting times and discrete event counts is fundamental to stochastic modeling in biology.
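
The sketch below illustrates the duality numerically: summing independent exponential gaps gives event times, and the resulting counts per unit interval match the Poisson probabilities. The rate $\lambda = 4$ and the cap of 40 gaps per trial are conveniences chosen for this example:

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(2)
lam, n_trials = 4.0, 200_000

# 40 exponential gaps per trial are essentially always enough to pass t = 1.
gaps = rng.exponential(scale=1 / lam, size=(n_trials, 40))
event_times = np.cumsum(gaps, axis=1)
counts = (event_times < 1.0).sum(axis=1)    # events in the unit interval

for n in range(9):
    empirical = np.mean(counts == n)
    poisson = lam**n * exp(-lam) / factorial(n)
    print(f"n = {n}: empirical {empirical:.4f}, Poisson {poisson:.4f}")
```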

The Poisson Distribution: Counting Rare Events

Suppose chemical reactions occur randomly in time, at an average rate of $\lambda$ events per unit time. If we observe the system over an interval of duration $t$ and count how many reactions occurred, that count follows a Poisson distribution with mean $\lambda t$. For a unit time interval, the mean count is simply $\lambda$.

The probability of observing exactly $n$ events is:

$$\mathbb{P}(N = n) = \frac{\lambda^n e^{-\lambda}}{n!}$$

Here, $\lambda$ is both the mean and the variance of the distribution. This is a distinctive feature of the Poisson: the standard deviation equals $\sqrt{\lambda}$, so fluctuations scale as the square root of the mean. When $\lambda$ is small, fluctuations are relatively large; when $\lambda$ is large, the distribution becomes narrow relative to its mean.
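
A quick sampling experiment demonstrates this square-root scaling (the values of $\lambda$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
for lam in (1.0, 10.0, 100.0, 1000.0):
    x = rng.poisson(lam, size=200_000)
    print(f"lam = {lam:6.0f}: mean {x.mean():8.2f}, var {x.var():8.2f}, "
          f"std/mean = {x.std() / x.mean():.4f} vs 1/sqrt(lam) = {lam ** -0.5:.4f}")
```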

The Normal Distribution: Large Numbers and Fluctuations

The normal or Gaussian distribution is perhaps the most famous probability distribution, characterized by its bell-shaped curve. Its probability density is:

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

Here, $\mu$ is the mean and $\sigma^2$ is the variance. The normal distribution is symmetric around its mean, and about 68% of the probability lies within one standard deviation of the mean.
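
The 68% figure is easy to check by sampling, and it holds for any choice of $\mu$ and $\sigma$; the values below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma = 5.0, 2.0               # arbitrary mean and standard deviation
x = rng.normal(mu, sigma, size=1_000_000)
print(np.mean(np.abs(x - mu) < sigma))   # close to 0.6827
```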

The normal distribution's importance stems from the Central Limit Theorem, which states that the sum of many independent random variables with finite variance, once centered and scaled, tends toward a normal distribution, largely regardless of their individual distributions. If we measure a quantity influenced by many small, independent sources of variability, we expect that quantity to be approximately normally distributed.
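
The sketch below illustrates the theorem with uniform summands, which individually look nothing like a bell curve (the number of summands, $k = 30$, is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(5)
k = 30                                      # number of summands (arbitrary)
sums = rng.uniform(0.0, 1.0, size=(500_000, k)).sum(axis=1)

# Standardize: a uniform(0, 1) variable has mean 1/2 and variance 1/12.
z = (sums - k / 2) / np.sqrt(k / 12)
print(np.mean(np.abs(z) > 1.0))             # close to P(|Z| > 1) ~ 0.3173
```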

In biological applications, the normal distribution often describes experimental measurement error or biological variability across a population. When measuring protein expression in many cells, if differences arise from numerous independent factors, the distribution of expression levels across cells may be approximately normal.

The normal distribution also emerges as an approximation to the Poisson distribution when $\lambda$ is large. A Poisson random variable with large mean $\lambda$ is approximately normal with mean $\mu = \lambda$ and variance $\sigma^2 = \lambda$. This is why concentration-based ODE models work well when molecule numbers are large: fluctuations become relatively small, and the system behaves nearly deterministically.
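
The following sketch compares Poisson probabilities with mean $\lambda = 100$ to the matching normal density at a few (arbitrary) evaluation points:

```python
from math import exp, factorial, pi, sqrt

lam = 100.0
for n in (80, 90, 100, 110, 120):
    poisson = lam**n * exp(-lam) / factorial(n)
    normal = exp(-((n - lam) ** 2) / (2 * lam)) / sqrt(2 * pi * lam)
    print(f"n = {n}: Poisson {poisson:.5f}, normal {normal:.5f}")
```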

Parameter Variability and Cell-to-Cell Differences

Beyond intrinsic stochasticity from small molecule numbers, biological systems exhibit extrinsic variability where parameters differ between cells or over time. Two cells might have different protein production rates due to variations in chromatin state or cellular environment. This extrinsic noise adds another layer of randomness.

We can model extrinsic variability by treating rate constants as random variables drawn from some distribution. If the production rate $\beta$ is normally distributed across a population with mean $\mu_\beta$ and variance $\sigma_\beta^2$, and each cell has Poisson-distributed protein numbers given its $\beta$, the population distribution is a mixture of Poisson distributions. Such hierarchical models capture how molecular noise and population variability combine.
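
A minimal sketch of such a hierarchical model follows. The parameter values are illustrative, and the normal draw is clipped at zero so that every cell has a valid Poisson mean; by the law of total variance, $\text{Var}(N) \approx \mathbb{E}[\beta] + \text{Var}(\beta)$, so the population counts are overdispersed relative to a pure Poisson:

```python
import numpy as np

rng = np.random.default_rng(6)
n_cells = 500_000
mu_beta, sigma_beta = 50.0, 10.0   # illustrative population parameters

# Each cell draws its own rate beta, clipped at zero to keep it a valid
# Poisson mean, then draws a Poisson protein count given that beta.
beta = np.clip(rng.normal(mu_beta, sigma_beta, size=n_cells), 0.0, None)
counts = rng.poisson(beta)

# Law of total variance: Var(N) ~ E[beta] + Var(beta), so variance > mean.
print("mean:", counts.mean())      # close to 50
print("var: ", counts.var())       # close to 50 + 100 = 150
```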

Understanding the relative contributions of intrinsic and extrinsic noise is crucial for interpreting experiments. Time-lapse imaging of individual cells can separate these sources: intrinsic noise appears as fluctuations within a single cell's trajectory, while extrinsic noise appears as differences between cells' trajectories. Mathematical models with appropriate probability distributions help quantify these contributions.

Conditional Probability and Bayesian Reasoning

Conditional probability describes how knowledge of one event affects the probability of another. The probability of event $A$ given event $B$ is written $\mathbb{P}(A \mid B)$ and equals $\mathbb{P}(A \cap B)/\mathbb{P}(B)$ when $\mathbb{P}(B) > 0$. This simple definition underlies Bayes' theorem, which is fundamental to statistical inference.

Bayes' theorem states:

$$\mathbb{P}(H \mid D) = \frac{\mathbb{P}(D \mid H) \mathbb{P}(H)}{\mathbb{P}(D)}$$

Here, $H$ represents a hypothesis (such as a particular parameter value) and $D$ represents observed data. The prior $\mathbb{P}(H)$ represents our belief before seeing data, the likelihood $\mathbb{P}(D \mid H)$ represents the probability of observing the data given the hypothesis, and the posterior $\mathbb{P}(H \mid D)$ represents our updated belief after seeing data.

In computational biology, Bayesian inference is used to estimate parameters from noisy measurements. Given a model with unknown rate constants and experimental data showing molecule numbers over time, we can compute the posterior distribution of the parameters. This distribution quantifies both our best estimates and our uncertainty, guiding further experiments.
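
As a minimal example of this workflow, the sketch below infers a Poisson mean from simulated counts using a grid of hypotheses and a flat prior; both the prior and the grid bounds are assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)
true_lam = 4.0
data = rng.poisson(true_lam, size=25)        # simulated "observed" counts

# Hypotheses H: a grid of candidate values for the unknown Poisson mean.
lam_grid = np.linspace(0.1, 12.0, 500)

# Log-likelihood of the data under each candidate; the log(n_i!) terms are
# constant in lam and cancel when the posterior is normalized.
log_lik = data.sum() * np.log(lam_grid) - len(data) * lam_grid

posterior = np.exp(log_lik - log_lik.max())  # flat prior; shift for stability
posterior /= posterior.sum()

print("posterior mode:", lam_grid[np.argmax(posterior)])
print("posterior mean:", np.sum(lam_grid * posterior), "(true value 4.0)")
```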

Correlation and Independence

Two random variables are independent if knowing one gives no information about the other: $\mathbb{P}(X = x, Y = y) = \mathbb{P}(X = x)\mathbb{P}(Y = y)$ for all values $x$ and $y$. In biological systems, independence is often an approximation. Molecules in the same cell are subject to common environmental fluctuations, creating correlations.

The covariance $\text{Cov}(X, Y) = \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])]$ measures how two random variables vary together. Positive covariance indicates that $X$ and $Y$ tend to be large or small together; negative covariance indicates they tend to move in opposite directions. The correlation coefficient normalizes covariance by the product of standard deviations, giving a dimensionless measure between $-1$ and $1$.
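
The sketch below constructs two quantities that share a common "extrinsic" fluctuation and computes their covariance and correlation directly from the definitions; the construction is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 200_000
E = rng.normal(0, 1, size=n)                 # shared fluctuation
X = E + rng.normal(0, 1, size=n)             # two quantities influenced by E
Y = E + rng.normal(0, 1, size=n)

cov = np.mean((X - X.mean()) * (Y - Y.mean()))
corr = cov / (X.std() * Y.std())
print(f"Cov(X, Y) = {cov:.3f}, correlation = {corr:.3f}")   # ~ 1.0 and ~ 0.5
```

Here each of $X$ and $Y$ is the sum of the shared factor and an independent one, so the theoretical correlation is $1/2$.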

In gene regulatory networks, correlations between mRNA and protein levels inform us about the timescales of transcription, translation, and degradation. If mRNA is short-lived, mRNA and protein fluctuations are weakly correlated. If mRNA is long-lived, protein levels track mRNA levels, creating strong correlation. These patterns are signatures of the underlying kinetic rates.
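
The sketch below illustrates this effect with a simple tau-leaping simulation of a two-stage gene expression model (transcription, translation, and first-order degradation of both species). The rate constants are illustrative rather than measured values, and the mean mRNA number is held fixed so that only the mRNA lifetime differs between the two runs:

```python
import numpy as np

rng = np.random.default_rng(9)

def mrna_protein_corr(gamma_m, k_p=1.0, gamma_p=0.1, dt=0.01, T=1000.0):
    """Tau-leaping simulation of mRNA -> protein; returns their correlation."""
    k_m = 10.0 * gamma_m          # keep the mean mRNA number k_m / gamma_m = 10
    n = int(T / dt)
    m, p = 10, 100                # start near steady state
    ms, ps = np.empty(n), np.empty(n)
    for i in range(n):
        m += rng.poisson(k_m * dt) - rng.poisson(gamma_m * m * dt)
        p += rng.poisson(k_p * m * dt) - rng.poisson(gamma_p * p * dt)
        m, p = max(m, 0), max(p, 0)
        ms[i], ps[i] = m, p
    burn = n // 5                 # discard the initial transient
    return np.corrcoef(ms[burn:], ps[burn:])[0, 1]

print("short-lived mRNA:", mrna_protein_corr(gamma_m=2.0))    # weak correlation
print("long-lived mRNA: ", mrna_protein_corr(gamma_m=0.02))   # strong correlation
```

With these illustrative parameters, the short-lived-mRNA run yields a weak mRNA-protein correlation, while the long-lived-mRNA run yields a much stronger one.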

Key Takeaways

Probability theory transforms how we think about biological systems, replacing deterministic predictions with distributions of possible outcomes. The Poisson distribution describes molecule copy numbers when production and degradation are random. The exponential distribution describes waiting times between chemical reactions. The normal distribution emerges when many random factors combine and provides approximations when molecule numbers are large.

These distributions are not merely mathematical abstractions; they are observable in single-cell measurements and simulations. Modern experiments increasingly reveal the probabilistic nature of cellular processes, from gene expression to cell fate decisions.

The bridge from microscopic randomness to macroscopic behavior passes through probability theory. Small systems require full stochastic descriptions; large systems behave deterministically but with fluctuations; intermediate systems require careful analysis of both mean behavior and variability. Mastering these concepts enables us to model biological systems at the appropriate level of detail and to extract meaning from noisy experimental data.


License: © 2025 Matthias Függer and Thomas Nowak. Licensed under CC BY-NC-SA 4.0.