Chemical Reaction Networks¶
This lecture covers:
Concepts
- Chemical reaction networks (CRNs) to describe species and reactions among them
- Deterministic ODE kinetics of CRNs
- Stochastic Markov chain kinetics of CRNs
- The link between the two via the chemical Langevin equation
Chemical Reaction Networks¶
A chemical reaction network is described by a set $\mathcal{S}$ of species and a set $\mathcal{R}$ of reactions. A reaction is a triple $(\mathbf{r}, \mathbf{p}, \alpha)$ where $\mathbf{r}, \mathbf{p}\in \mathbb{N}_0^{\mathcal{S}}$ and $\alpha\in\mathbb{R}_{\geq0}$. The species with positive count in $\mathbf{r}$ are called the reaction's reactants and those with positive count in $\mathbf{p}$ are called its products. The parameter $\alpha$ is called the reaction's rate constant.
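For example, over the species set $\mathcal{S} = \{A, B, C\}$, the familiar reaction $A + B \to C$ with rate constant $\alpha$ is the triple $(\mathbf{r}, \mathbf{p}, \alpha)$ with $\mathbf{r} = (1, 1, 0)$ and $\mathbf{p} = (0, 0, 1)$: $A$ and $B$ are its reactants, and $C$ is its product.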
From Molecular Collisions to Mass-Action Kinetics¶
We often write down chemical reactions like A + B → C and then say that the rate is proportional to the product of the concentrations, $A\cdot B$. But why should this be true? Why should the rate depend on the product of the concentrations? Why not the sum, or some other combination?
The answer, as it turns out, comes from a very simple physical picture: molecules bouncing around and occasionally bumping into each other. Let's see if we can understand this from the ground up.
The Essential Physics: Collisions with Activation¶
When we say that A and B react to form C, what we really mean is something quite specific. Two molecules, one of type A and one of type B, have to get close enough to each other that they can rearrange their chemical bonds. But it's not enough for them just to touch gently — they need to collide with enough energy to get over what we call the "activation energy barrier."
Think of it like this: imagine you're trying to push a boulder over a hill. You can't just nudge it gently; you need to hit it hard enough to get it to the top. Once it's over the top, it rolls down the other side (that's your product C forming). But if you don't hit it hard enough, it just rolls back to where it started.
So a chemical reaction requires two things: a collision, and enough energy in that collision. For now, let's ignore the energy part — we'll assume that if molecules collide, they react. The question becomes: how often do molecules collide?
Two Balls in a Box: The Simplest Possible Case¶
Let's start with the simplest case we can imagine. Take two balls, let's call them A and B, and put them in a square box. They're bouncing around, hitting the walls, ricocheting all over the place. The question is: when will they hit each other?
Now, you might think this is easy to answer, but it's not! It depends completely on where they start and how fast they're going in which directions.
If they start right next to each other and they're moving toward each other, they'll collide almost immediately. If they start at opposite corners and they're both moving in the same direction at the same speed, they might never collide at all! One will always be chasing the other around the box.
So the collision time could be anywhere from nearly zero to infinity. That doesn't seem very helpful for chemistry, does it?
The Thermal Equilibrium¶
Chemical systems are often considered to be in "thermal equilibrium." What does this mean? It means that the system has been bouncing around for so long that it has "forgotten" how it started. The positions are random, the directions are random, and the speeds follow a particular distribution (Maxwell-Boltzmann) that depends only on temperature.
This is like taking our two balls and shaking the box for a very long time before we start watching. We don't know where they are or which way they're going, but we know something about the statistics of their motion.
In this thermal state:
- The positions are uniformly distributed (equally likely to be anywhere in the box)
- The directions are uniformly distributed (equally likely to be going any direction)
- The speeds follow Maxwell-Boltzmann (most likely to have speeds near some typical thermal speed)
This is the key insight: we can't predict exactly when our two particular balls will collide, but we can predict the distribution of the collision time, provided we average over many such boxes, or wait long enough and average over many collisions.
Simplifying Even Further: The Point and the Target¶
To make progress, let's simplify even more. Instead of two moving balls of different sizes, let's consider one ball that's just a point (no size at all) and another ball that sits perfectly still in the center of the box and has radius $r$.
You might object: "But that's not realistic! Both molecules are moving!" Don't worry, we'll fix that later. There's a beautiful trick where the motion of two objects can be converted into the motion of one object relative to the other.
Also, let's assume our point moves at unit speed (speed $= 1$) for simplicity. Again, we'll fix this later.
So now we have a very clean problem: a point bouncing around randomly in a box, and we want to know when it will hit a target of radius $r$ sitting in the center.
The Scaling Laws: How Big, How Fast?¶
How does the expected collision time depend on the parameters of our problem?
First, let's think about the radius $r$. If we make the target bigger, it should be easier to hit, right? So the collision time should go down. In fact, if you double the radius, you double the cross-section that the point can hit (in 2D the cross-section is a length, the disk's diameter $2r$), so you should halve the collision time. The collision time is inversely proportional to $r$.
Second, what about the area $A$ of the box? If we make the box bigger, the point has to search through more space to find the target. If you double the area, you should roughly double the collision time. The collision time is proportional to $A$.
Third, what about the speed? If the point moves twice as fast, it should find the target in half the time. The collision time is inversely proportional to the speed.
Putting it together: $$\langle t \rangle \propto \frac{A}{r \cdot v}$$
where $\langle t \rangle$ is the expected collision time, $A$ is the area, $r$ is the radius, and $v$ is the speed.
Why does this work? It comes from a piece of mathematics called "equidistribution theory." The point's trajectory, if it bounces around long enough, visits every part of the box with equal probability. It's like the trajectory is "filling up" the area uniformly. The bigger the area to fill, the longer it takes. The bigger the target, the easier it is to hit.
What About the Probability Distribution?¶
Now we know the expected collision time, but what's the full probability distribution? Is it Gaussian? Uniform? Something else?
Here's where things get really interesting. Let's think about the following question: suppose our point has been bouncing around for time $s$ and hasn't hit the target yet. What's the probability it will hit in the next time interval $t$?
If the motion is truly random and "memoryless," then this probability should be the same as the probability of hitting in the first time interval $t$. The past doesn't matter — only the present state.
In mathematical terms, this means: $$P(\text{hit in next } t | \text{no hit in last } s) = P(\text{hit in first } t)$$
This is called the "Markov property" or "memorylessness."
The Shadow Problem¶
But wait — is our system really memoryless?
Suppose the point hasn't hit the target in the last $s$ time units. Does this tell us anything about where the point is now?
Yes, it does! There's a "shadow" region around the target where the point is very unlikely to be: to be there now, it would typically have had to pass through the target sometime during the last $s$ time units.
The shadow region lies within a circle of radius roughly $r + v \cdot s$ around the target: a point inside this circle whose velocity points away from the target must have been even closer to the target, close enough to collide, within the last $s$ time units.
This means the position distribution is not uniform anymore — it's uniform everywhere except in the shadow. So the system is not perfectly memoryless.
However, if the target is very small ($r \to 0$), then the shadow region becomes very small compared to the total area. In this limit, the distortion of the uniform distribution is negligible, and the system becomes approximately memoryless.
The Exponential Distribution¶
If a system is truly memoryless, then its waiting times must follow an exponential distribution. This is a theorem, not just a guess.
The exponential distribution with rate $\lambda$ is characterized by:
$$P(\text{wait time} > t) = e^{-\lambda t}$$
This is the "survival function" of the exponential distribution. The probability density function is $\lambda e^{-\lambda t}$, and the expected value is $1/\lambda$.
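To see why memorylessness forces this form, write $S(t) = P(\text{wait time} > t)$ for the survival function. The memorylessness condition above translates into the functional equation
$$S(s + t) = S(s) \cdot S(t) \quad \text{for all } s, t \geq 0,$$
and the only right-continuous, non-trivial solutions of this equation are $S(t) = e^{-\lambda t}$ for some $\lambda > 0$.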
Since we already know the expected collision time is proportional to $A/(r v)$, we can identify: $$\lambda \propto \frac{r v}{A}$$
So our collision times are approximately exponentially distributed with this rate parameter.
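Here is a minimal simulation sketch of the point-and-target billiard to check both claims: the scaling $\langle t \rangle \propto A/(r v)$ and the approximately exponential shape. The time-discretized collision check and all parameter values are simplifying choices of ours.

```python
import numpy as np

rng = np.random.default_rng(1)

def collision_time(L=1.0, r=0.05, v=1.0, dt=1e-3, t_max=1e3):
    """One trial: a point bounces around an L-by-L box centered at the
    origin; a stationary disk of radius r sits at the center. Return the
    first time the point enters the disk (np.nan if it never does)."""
    # Thermal initial condition: uniform position (outside the target),
    # uniform direction, fixed speed v.
    while True:
        x = rng.uniform(-L / 2, L / 2, size=2)
        if np.hypot(*x) > r:
            break
    theta = rng.uniform(0.0, 2.0 * np.pi)
    vel = v * np.array([np.cos(theta), np.sin(theta)])
    t = 0.0
    while t < t_max:
        x += vel * dt
        for k in range(2):  # specular reflection off the walls
            if abs(x[k]) > L / 2:
                x[k] = np.sign(x[k]) * L - x[k]
                vel[k] = -vel[k]
        if np.hypot(*x) <= r:
            return t
        t += dt
    return np.nan

times = np.array([collision_time() for _ in range(200)])
times = times[~np.isnan(times)]
print("mean collision time:", times.mean())  # should scale like A/(r*v)
print("std/mean (≈ 1 for exponential):", times.std() / times.mean())
```

Halving $r$ (or doubling the box area) should roughly double the mean, and the ratio of standard deviation to mean should be close to one, as it is for an exponential distribution.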
Bringing Back the Moving Target¶
Now let's be more realistic. Both molecules are moving, not just one.
Here's the beautiful trick we mentioned earlier: instead of tracking two moving objects, we can work in the "relative coordinate system." We ask: what's the motion of molecule A relative to molecule B?
In this relative system, molecule B appears stationary (sitting at the origin), and molecule A moves with the relative velocity $\vec{v}_{\text{rel}} = \vec{v}_A - \vec{v}_B$.
The wonderful thing is that the geometry is exactly the same as our point-and-target problem! The collision occurs when the relative distance equals $r_A + r_B$.
For molecules in thermal equilibrium, the relative velocity also has a Maxwell-Boltzmann distribution, but with a different parameter. If each molecule has typical speed $\langle v \rangle$, then the relative motion has typical speed $\sqrt{2} \langle v \rangle$. (This comes from adding two random velocity vectors.)
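A quick numerical check of the $\sqrt{2}$ factor, assuming (as Maxwell-Boltzmann says) that each velocity component is an independent Gaussian:

```python
import numpy as np

rng = np.random.default_rng(2)

# Maxwell-Boltzmann in 2D: each velocity component is an independent
# Gaussian. Sample many independent pairs of velocities.
vA = rng.normal(size=(100_000, 2))
vB = rng.normal(size=(100_000, 2))

mean_speed = np.linalg.norm(vA, axis=1).mean()
mean_rel_speed = np.linalg.norm(vA - vB, axis=1).mean()
print(mean_rel_speed / mean_speed)  # ≈ sqrt(2) ≈ 1.414
```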
So all our previous results carry over, just with $v$ replaced by $\langle v_{\text{rel}} \rangle$ and $r$ replaced by $r_A + r_B$.
Multiple Targets: The Magic of the Exponential¶
Now for the final step: what if we have many B molecules that our A molecule could hit?
This is where the exponential distribution shows its magic. Suppose there are $n$ different B molecules, and the collision time with the $i$-th one is exponentially distributed with rate $\lambda_i$. What's the distribution of the time until the first collision (with any of the B molecules)?
It turns out that this is also exponentially distributed, with rate $\lambda_{\text{total}} = \lambda_1 + \lambda_2 + \cdots + \lambda_n$.
This is a beautiful property of exponential distributions: the minimum of independent exponentials is exponential with the sum of the rates.
Why is this true? Think of it this way: the probability of avoiding all $n$ targets for time $t$ is the product of the probabilities of avoiding each one: $$P(\text{avoid all for time } t) = e^{-\lambda_1 t} \cdot e^{-\lambda_2 t} \cdots e^{-\lambda_n t} = e^{-(\lambda_1 + \cdots + \lambda_n)t}$$
This is exactly the survival function of an exponential with rate $\lambda_1 + \cdots + \lambda_n$.
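This is easy to verify numerically; a quick sketch:

```python
import numpy as np

rng = np.random.default_rng(3)

# Three independent exponential "collision clocks" with different rates.
rates = np.array([0.5, 1.0, 2.5])
waits = rng.exponential(scale=1.0 / rates, size=(100_000, 3))

first = waits.min(axis=1)  # time of the first collision
print("empirical mean of minimum:", first.mean())
print("predicted 1/sum(rates):  ", 1.0 / rates.sum())
```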
The Rate Addition Rule¶
If each individual B molecule gives a collision rate of $\lambda_1$ (which, from our earlier calculation, is proportional to the cross-section times the speed and inversely proportional to the size of the container), then $n_B$ identical B molecules give a total collision rate of: $$\lambda_{\text{total}} = n_B \cdot \lambda_1 \propto \frac{n_B}{V}$$ where $V$ is the volume (in three dimensions, the role of the area $A$ is played by the volume $V$).
But by the same reasoning, if we have $n_A$ different A molecules that could each hit the B molecules, the total rate becomes: $$\lambda_{\text{total}} \propto \frac{n_A \cdot n_B}{V}$$
The Bigger Picture¶
This derivation shows us something profound: the macroscopic law of mass-action kinetics emerges naturally from the microscopic physics of molecular motion. We don't need to postulate it; we can derive it from first principles.
Of course, real chemistry is more complicated. Molecules aren't hard spheres, they have complex shapes, reactions need activation energy, there might be spatial correlations... But the basic scaling law — the product rule — survives all these complications for elementary reactions.
This is the beauty of statistical mechanics: simple microscopic rules lead to universal macroscopic behavior. The same ideas we've used here apply to everything from chemical reactions to stellar collisions to radioactive decay.
The dynamical equations¶
For simple protein production, we have the following species
- DNA. We assume there is a single promoter followed by a protein-coding gene in the cell
- mRNA, where $m$ is the current number of mRNA corresponding to the above gene
- protein, where $p$ is the current number of proteins corresponding to the above gene
as well as the following reactions among them: \begin{align} \text{DNA} &\rightarrow \text{mRNA} + \text{DNA} &\text{(transcription)}\\ \text{mRNA} &\rightarrow \emptyset &\text{(mRNA degradation and dilution)}\\ \text{mRNA} &\rightarrow \text{protein} + \text{mRNA} &\text{(translation)}\\ \text{protein} &\rightarrow \emptyset &\text{(protein degradation and dilution)} \end{align}
Macroscale equations (deterministic, ODE semantics)¶
As we've seen before, the deterministic dynamics, which capture mean concentrations over a large population of cells, are given by the ODEs
\begin{align} \frac{\mathrm{d}m}{\mathrm{d}t} &= \beta_m - \gamma_m m, \\[1em] \frac{\mathrm{d}p}{\mathrm{d}t} &= \beta_p m - \gamma_p p. \end{align}
CRNs are conventionally equipped with so-called mass-action kinetics. Under mass-action kinetics, the rate of a reaction $(\mathbf{r}, \mathbf{p}, \alpha)$ at time $t$ is $\alpha \prod_{s\in \mathcal{S}} s(t)^{\mathbf{r}(s)}$, where $s(t)$ is the concentration of species $s$ at time $t$. The differential equation for a species $s \in \mathcal{S}$ is: $$ \frac{ds}{dt} = \sum_{(\mathbf{r}, \mathbf{p}, \alpha) \in \mathcal{R}} (\mathbf{p}(s) - \mathbf{r}(s)) \cdot \alpha \prod_{\tilde{s}\in \mathcal{S}} \tilde{s}(t)^{\mathbf{r}(\tilde{s})} $$
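To make the formula concrete, here is a minimal sketch that encodes the protein-production network above as $(\mathbf{r}, \mathbf{p}, \alpha)$ triples and integrates the resulting mass-action ODEs with SciPy. The rate constant values are arbitrary illustrative choices, not taken from the lecture.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Species order: (DNA, mRNA, protein). Each reaction is a triple
# (r, p, alpha); the rate constants are arbitrary illustrative values.
reactions = [
    (np.array([1, 0, 0]), np.array([1, 1, 0]), 10.0),  # transcription
    (np.array([0, 1, 0]), np.array([0, 0, 0]), 1.0),   # mRNA degradation
    (np.array([0, 1, 0]), np.array([0, 1, 1]), 5.0),   # translation
    (np.array([0, 0, 1]), np.array([0, 0, 0]), 0.1),   # protein degradation
]

def mass_action_rhs(t, s):
    """ds/dt = sum over reactions of (p - r) * alpha * prod_i s_i**r_i."""
    ds = np.zeros_like(s)
    for r, p, alpha in reactions:
        ds += (p - r) * alpha * np.prod(s ** r)
    return ds

s0 = np.array([1.0, 0.0, 0.0])  # one promoter, no mRNA, no protein
sol = solve_ivp(mass_action_rhs, (0.0, 100.0), s0)
print(sol.y[:, -1])  # approaches (1, beta_m/gamma_m, beta_p*m*/gamma_p)
```

The right-hand side function is a direct transcription of the sum-over-reactions formula above.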
Question: In which cases would you expect mass-action kinetics to hold? Where do you think it breaks down?
The Chemical Master equation (stochastic, Markov chain semantics)¶
We can write a master equation for these dynamics. In this case, each state is defined by an mRNA copy number $m$ and a protein copy number $p$. States transition to other states with certain rates. We assume that the pair $(m,p)$ fully describes the state of the system. That is, the probability to transition from a state $(m,p)$ to a state $(m',p')$ within an infinitesimal time $\Delta t$ is independent of how long the system has already been in state $(m,p)$. It is approximately $\Delta t \cdot \gamma_i(m,p)$, where $\gamma_i(m,p)$ is the rate at which reaction $i$ happens in state $(m,p)$.
The following image shows state transitions and their corresponding reactions for large enough $m$ and $p$. Care has to be taken at the boundaries, e.g., if $m = 1$ or $m = 0$.
Denote by $P(m, p, t)$ the probability that the system is in state $(m,p)$ at time $t$. Then, letting $\Delta t \to 0$, we obtain
\begin{align} \frac{\mathrm{d}P(m,p,t)}{\mathrm{d}t} &= \beta_m P(m-1,p,t) & \text{(from left)}\\ &+ \gamma_m (m+1)P(m+1,p,t) & \text{(from right)}\\ &+\beta_p mP(m,p-1,t) & \text{(from bottom)}\\ &+ \gamma_p (p+1)P(m,p+1,t) &\text{(from top)}\\ &- \gamma_m mP(m,p,t) & \text{(to left)}\\ &- \beta_m P(m,p,t) & \text{(to right)}\\ &- \gamma_p p P(m,p,t) &\text{(to bottom)}\\ &- \beta_p mP(m,p,t)\enspace. & \text{(to top)} \end{align}
We implicitly define $P(m, p, t) = 0$ if $m < 0$ or $p < 0$. This is the master equation we will sample from using the stochastic simulation algorithm (SSA), also called the Gillespie algorithm.
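A minimal sketch of the Gillespie algorithm for this system (rate constants again arbitrary illustrative values): repeatedly draw an exponential waiting time with the total propensity as its rate, then pick which reaction fired with probability proportional to its propensity.

```python
import numpy as np

rng = np.random.default_rng(4)

# Arbitrary illustrative rate constants.
beta_m, gamma_m, beta_p, gamma_p = 10.0, 1.0, 5.0, 0.1

def gillespie(m, p, t_end):
    """Sample one trajectory of the master equation above with the SSA."""
    # Net change of (m, p) for each of the four reactions, in the same
    # order as the propensities computed below.
    changes = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    t, ts, states = 0.0, [0.0], [(m, p)]
    while True:
        props = np.array([beta_m, gamma_m * m, beta_p * m, gamma_p * p])
        total = props.sum()
        # Waiting time to the next reaction: exponential, rate = total.
        t += rng.exponential(1.0 / total)
        if t > t_end:
            break
        # Which reaction fires: probability proportional to its propensity.
        j = rng.choice(4, p=props / total)
        m, p = m + changes[j][0], p + changes[j][1]
        ts.append(t)
        states.append((m, p))
    return ts, states

ts, states = gillespie(m=0, p=0, t_end=100.0)
print(states[-1])  # fluctuates around the deterministic fixed point
```

Averaging many such sample paths recovers the mean behavior described by the ODEs above.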
The Chemical Langevin Equation¶
The Chemical Master Equation can be difficult to solve, which is why several approximations have been developed. One of these approximations is the Chemical Langevin Equation, which we will now derive. It is perfectly useful on its own, in particular from a computational perspective. We present it here for another reason as well: to show that the ODE kinetics of a CRN are an approximation to the expected stochastic kinetics, at least for linear CRNs.
The derivation starts from the exact update equation
$$ X_i(t + \tau) = X_i(t) + \sum_{j=1}^M \nu_{j,i} K_j(X(t), \tau) $$ where $\nu_{j,i}$ is the net change in the count of species $i$ in reaction $j$ and $K_j(X(t), \tau)$ is the random variable that specifies how many times reaction $j$ occurs in the time interval $[t, t+\tau]$.
Assumption 1 of the chemical Langevin equation is that the propensity functions do not change significantly during the time interval $[t, t+\tau]$, i.e., $a_j(X(t')) \approx a_j(X(t))$ for all $t' \in [t,t+\tau]$.
Then, we can rewrite the above equation as $$ X_i(t + \tau) = X_i(t) + \sum_{j=1}^M \nu_{j,i} \mathcal{P}_j(a_j(X(t)) \cdot \tau) $$ where the $\mathcal{P}_j(a_j(X(t)) \cdot \tau)$ are independent Poisson variables with parameter $a_j(X(t)) \cdot \tau$.
Assumption 2 of the chemical Langevin equation requires the expected number of occurrences of each reaction during the interval to be large, i.e., $a_j(X(t))\cdot \tau \gg 1$.
Note that the two assumptions pull in opposite directions: assumption 1 wants $\tau$ to be small, whereas assumption 2 wants $\tau$ to be big. It is entirely possible that no choice of $\tau$ satisfies both assumptions for a given system.
It is well-known that the Poisson distribution with parameter $\lambda$ is well approximated by the normal distribution $\mathcal{N}(\lambda,\lambda)$ with expected value $\mu = \lambda$ and variance $\sigma^2 = \lambda$ for large values of $\lambda$.
Assumption 2 thus suggests using this approximation to get $$ \mathcal{P}_j(a_j(X(t)) \tau) \approx a_j(X(t))\tau + \sqrt{a_j(X(t))\tau} \cdot \mathcal{N}(0,1) $$ where $\mathcal{N}(0,1)$ is a standard normally distributed random variable.
Now, switching to $X(t)$ being real-valued and setting $\tau = dt$, in the limit $dt \to 0$, we finally get the chemical Langevin equation: $$ \frac{dX_i(t)}{dt} = \sum_{j=1}^M \nu_{j,i} a_j(X(t)) + \sum_{j=1}^M \nu_{j,i} \sqrt{a_j(X(t))} \Gamma_j(t) $$ where the $\Gamma_j$ are independent standard Gaussian white noise processes.
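A minimal Euler-Maruyama sketch of this equation for the protein-production network; the step size, parameter values, and the clipping of propensities at zero are our own pragmatic choices.

```python
import numpy as np

rng = np.random.default_rng(5)

# Arbitrary illustrative rate constants, as before.
beta_m, gamma_m, beta_p, gamma_p = 10.0, 1.0, 5.0, 0.1

def cle_path(x0, t_end, dt=1e-3):
    """Euler-Maruyama integration of the chemical Langevin equation for
    the mRNA/protein network."""
    # nu[j] = net change of (m, p) caused by reaction j.
    nu = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]], dtype=float)
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(int(t_end / dt)):
        m, p = x
        a = np.array([beta_m, gamma_m * m, beta_p * m, gamma_p * p])
        a = np.clip(a, 0.0, None)  # pragmatic guard: no negative propensities
        dW = rng.normal(size=4) * np.sqrt(dt)  # Gaussian white-noise increments
        x = x + nu.T @ (a * dt) + nu.T @ (np.sqrt(a) * dW)
        path.append(x.copy())
    return np.array(path)

path = cle_path((0.0, 0.0), t_end=100.0)
print(path[-1])  # drift matches the ODEs; noise scales like sqrt(propensity)
```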
From Stochastic to Deterministic Kinetics¶
If the assumptions of the chemical Langevin equation hold sufficiently well, it is easy to make the connection between the stochastic and the deterministic kinetics of CRNs. Taking the expected value (over an ensemble of sample paths) on both sides of the equation, the noise terms drop out because each $\Gamma_j$ has zero mean, and we get: $$ \frac{d \mathbb{E} X_i(t)}{dt} = \sum_{j=1}^M \nu_{j,i} \mathbb{E}\, a_j(X(t)) $$
If the propensities follow a mass-action law, the expected value $\mathbb{E}\, a_j(X(t))$ is in general only an approximation to $a_j(\mathbb{E} X(t))$. The two coincide exactly when all propensities are affine in the state, for instance when there are only zeroth- and first-order (unary) reactions.
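For instance, for a bimolecular mass-action propensity $a_j(X) = \alpha X_A X_B$,
$$\mathbb{E}\, a_j(X(t)) = \alpha\, \mathbb{E}[X_A]\, \mathbb{E}[X_B] + \alpha \operatorname{Cov}(X_A, X_B) = a_j(\mathbb{E} X(t)) + \alpha \operatorname{Cov}(X_A, X_B),$$
so the deterministic ODE implicitly neglects the covariance term.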
The detour via the chemical Langevin equation is not the only way to show the above formula. For a large-volume (or large-species-count) limit, it can be derived directly from the Chemical Master Equation.
License & Attribution: This page is adapted from material by Michael Elowitz and Justin Bois (© 2021–2025), licensed under CC BY-NC-SA 4.0.