Problem 1: Understanding noise sensitivity

Generate synthetic logistic growth data with $\mu = 0.3~\mathrm{h}^{-1}$, $K = 0.8~\mathrm{OD}$, $X_0 = 0.02~\mathrm{OD}$ using 20 time points from $t = 0$ to $t = 15$ hours. Create four datasets with different noise levels: $\sigma = 0.01$ (low noise), $\sigma = 0.05$ (medium noise), $\sigma = 0.2$ (high noise), and $\sigma = 0.5$ (very high noise).
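As a starting point, the data generation could be sketched as follows. This is a minimal sketch, not a prescribed solution: it assumes the closed-form logistic solution $X(t) = K / \left(1 + \frac{K - X_0}{X_0} e^{-\mu t}\right)$ and additive Gaussian noise with standard deviation $\sigma$ (the problem does not state the noise model). The names `logistic` and `datasets` and the seed value are illustrative choices.

```python
import numpy as np

def logistic(t, mu, K, X0):
    """Closed-form logistic growth curve X(t) (assumed model form)."""
    return K / (1.0 + (K - X0) / X0 * np.exp(-mu * t))

# True parameters and time grid from the problem statement
MU_TRUE, K_TRUE, X0_TRUE = 0.3, 0.8, 0.02
t = np.linspace(0.0, 15.0, 20)          # 20 points from 0 to 15 h
x_true = logistic(t, MU_TRUE, K_TRUE, X0_TRUE)

# ASSUMPTION: additive Gaussian noise; the seed is arbitrary, fixed for reproducibility
rng = np.random.default_rng(0)
sigmas = [0.01, 0.05, 0.2, 0.5]
datasets = {s: x_true + rng.normal(0.0, s, size=t.size) for s in sigmas}
```

Note that additive noise at the higher levels can push some OD values below zero, which is worth keeping in mind when plotting or fitting.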

For each noise level:

(a) Plot the noisy data along with the true curve.

(b) Fit the model using the combined strategy (differential evolution with 20 iterations followed by gradient descent refinement).

(d) Report the fitted parameters and their percent error relative to true values.

(e) How does increasing noise affect parameter accuracy? At what noise level does the fit become unreliable? Which parameter is most sensitive to noise?
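Parts (b) and (d) could be approached along these lines. This is a hedged sketch using SciPy's `differential_evolution` and `minimize` with `method="L-BFGS-B"`; the helper names `sse`, `fit_combined`, and `percent_error` are hypothetical, the sum-of-squares objective is an assumption, and the parameter bounds are borrowed from Problem 2 because Problem 1 does not specify any.

```python
import numpy as np
from scipy.optimize import differential_evolution, minimize

def logistic(t, mu, K, X0):
    """Closed-form logistic growth curve (assumed model form)."""
    return K / (1.0 + (K - X0) / X0 * np.exp(-mu * t))

def sse(params, t, x_obs):
    """Sum of squared errors between model and data (assumed objective)."""
    mu, K, X0 = params
    return float(np.sum((logistic(t, mu, K, X0) - x_obs) ** 2))

def fit_combined(t, x_obs, bounds, seed=0):
    """Global search (20 generations) followed by local L-BFGS-B refinement."""
    # polish=False so that the refinement step is explicit in stage 2
    de = differential_evolution(sse, bounds, args=(t, x_obs),
                                maxiter=20, seed=seed, polish=False)
    local = minimize(sse, de.x, args=(t, x_obs),
                     method="L-BFGS-B", bounds=bounds)
    return local.x

def percent_error(fitted, true):
    """Signed percent error of each fitted parameter relative to its true value."""
    fitted, true = np.asarray(fitted, float), np.asarray(true, float)
    return 100.0 * (fitted - true) / true

TRUE = (0.3, 0.8, 0.02)
BOUNDS = [(0.1, 2.0), (0.3, 3.0), (0.001, 0.1)]  # ASSUMPTION: taken from Problem 2
t = np.linspace(0.0, 15.0, 20)
rng = np.random.default_rng(0)
x_obs = logistic(t, *TRUE) + rng.normal(0.0, 0.01, t.size)  # low-noise case

fitted = fit_combined(t, x_obs, BOUNDS)
for name, err in zip(("mu", "K", "X0"), percent_error(fitted, TRUE)):
    print(f"{name}: {err:+.1f}% error")
```

Repeating the last few lines for each value of $\sigma$ gives the table of percent errors needed for part (e).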

Problem 2: Comparing optimization methods

Generate 20 synthetic logistic datasets with the same true parameters ($\mu = 0.5~\mathrm{h}^{-1}$, $K = 1.0~\mathrm{OD}$, $X_0 = 0.01~\mathrm{OD}$) but different random noise realizations ($\sigma = 0.02$).

For each dataset, fit the model using three approaches:

(a) Gradient descent (L-BFGS-B) with good initial guesses near the true parameters (e.g., $\mu = 0.45$, $K = 0.95$, $X_0 = 0.015$).

(b) Gradient descent with random initial guesses sampled uniformly from the bounds: $\mu \in [0.1, 2]$, $K \in [0.3, 3]$, $X_0 \in [0.001, 0.1]$.

For each method, compute the success rate (percentage of fits where all three parameters are within 10% of true values) and report the median fitting time. Which method is most reliable? Which is fastest? When would you use each approach? Compare your results to the 30% success rate we observed in the lecture for random initial guesses.
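The success-rate and timing bookkeeping might be organized as in the sketch below, shown for approaches (a) and (b). Several things here are assumptions rather than part of the problem statement: the time grid (reused from Problem 1, since none is given here), the additive Gaussian noise model, the sum-of-squares objective, and the helper names `is_success` and `benchmark`.

```python
import time
import numpy as np
from scipy.optimize import minimize

def logistic(t, mu, K, X0):
    """Closed-form logistic growth curve (assumed model form)."""
    return K / (1.0 + (K - X0) / X0 * np.exp(-mu * t))

def sse(params, t, x_obs):
    """Sum of squared errors between model and data (assumed objective)."""
    mu, K, X0 = params
    return float(np.sum((logistic(t, mu, K, X0) - x_obs) ** 2))

TRUE = np.array([0.5, 1.0, 0.01])
BOUNDS = [(0.1, 2.0), (0.3, 3.0), (0.001, 0.1)]
LOW, HIGH = np.array(BOUNDS).T  # lower and upper bounds as arrays

def is_success(fitted, true=TRUE, tol=0.10):
    """True if every parameter is within `tol` (relative) of its true value."""
    return bool(np.all(np.abs(np.asarray(fitted) - true) / true <= tol))

def benchmark(start_fn, datasets, t):
    """Fit each dataset from start_fn(); return (success rate in %, median seconds)."""
    times, hits = [], 0
    for x_obs in datasets:
        tic = time.perf_counter()
        res = minimize(sse, start_fn(), args=(t, x_obs),
                       method="L-BFGS-B", bounds=BOUNDS)
        times.append(time.perf_counter() - tic)
        hits += is_success(res.x)
    return 100.0 * hits / len(datasets), float(np.median(times))

rng = np.random.default_rng(1)
t = np.linspace(0.0, 15.0, 20)  # ASSUMPTION: same grid as Problem 1
datasets = [logistic(t, *TRUE) + rng.normal(0.0, 0.02, t.size) for _ in range(20)]

good_start = lambda: np.array([0.45, 0.95, 0.015])   # approach (a)
random_start = lambda: rng.uniform(LOW, HIGH)        # approach (b)

for name, fn in [("good guess", good_start), ("random guess", random_start)]:
    rate, med = benchmark(fn, datasets, t)
    print(f"{name}: success rate {rate:.0f}%, median time {med * 1e3:.2f} ms")
```

Passing the starting point as a function keeps `benchmark` agnostic to how initial guesses are produced, so further strategies can be timed with the same loop.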

Problem 3: Real experimental data

Your colleague has provided three bacterial growth curves from different experimental conditions. The data files are available as growth_curve_1.csv, growth_curve_2.csv, and growth_curve_3.csv in the data subdirectory, each with columns time (hours) and OD (optical density).

For each dataset:

(a) Plot the data and visually inspect it. Does it look like clean logistic growth, or are there complications (lag phase, death phase, noise)?

(b) Fit the logistic model using the combined strategy.

(d) Calculate the residual sum of squares and comment on model adequacy.

(e) If the model fits poorly (systematic residuals like we saw with Well B1), try extending the model.
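For loading the files and computing the quantity in part (d), something like the following could serve as a starting point. It is a sketch that assumes each CSV has a header row with exactly the column names `time` and `OD`, and that the logistic model takes the closed form used below; `load_curve` and `residuals_and_rss` are hypothetical helper names.

```python
import numpy as np

def logistic(t, mu, K, X0):
    """Closed-form logistic growth curve (assumed model form)."""
    return K / (1.0 + (K - X0) / X0 * np.exp(-mu * t))

def load_curve(path):
    """Read a CSV with header columns `time` and `OD`; return (t, od) arrays."""
    data = np.genfromtxt(path, delimiter=",", names=True)
    return data["time"], data["OD"]

def residuals_and_rss(t, od, params):
    """Residuals (data minus model) and their sum of squares for fitted params."""
    mu, K, X0 = params
    resid = od - logistic(t, mu, K, X0)
    return resid, float(np.sum(resid ** 2))
```

A call such as `load_curve("data/growth_curve_1.csv")` returns the arrays to fit; after fitting, plotting the residuals from `residuals_and_rss` against time makes systematic patterns (runs of same-sign residuals) easier to spot than the residual sum of squares alone.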