Modeling Dynamical Systems

Lecture 06

February 10, 2025

Review

Last Class

Generalized probability models (focused on generalized linear models)
Goal: predict parameters/expectations of distributions (e.g. expectation of Poisson or probability of Binomial).
Often requires use of a link function to convert linear regressions of predictors to appropriate parameter range.

One Note

May need to standardize predictors when they are much larger than the response \[\hat{x} = \frac{x - \mu(x)}{\sigma(x)}\]
This doesn’t really matter for linear regression, but for non-linear models (GLMs or ML) can be important to detect the relevant effects.

Modeling Global Warming

Historical Warming

Code

temps = CSV.read("data/climate/HadCRUT.5.0.1.0.analysis.summary_series.global.annual.csv", DataFrame, delim=",")

time_obs = temps[:, 1]
temp_obs = temps[:, 2]
temp_lo = temps[:, 3]
temp_hi = temps[:, 4]

temp_lo = temp_lo .- mean(temp_obs[1:20])
temp_hi = temp_hi .- mean(temp_obs[1:20])
temp_obs = temp_obs .- mean(temp_obs[1:20]) # compute anomalies relative to first 20 years of data
temp_sd = (temp_hi - temp_lo) / 1.96 # estimate standard deviation using 95% CI

# generate simulations
hind_years = 1850:2022 # model years to simulate for fitting
sim_years = 1850:2100 # model years for projections
forcing_years = 1750:2500 # years for which we have forcing data/simulations
hind_idx = indexin(hind_years, forcing_years) # find indices in t vector of simulation years
sim_idx = indexin(sim_years, forcing_years)

plot(time_obs, temp_obs, ribbon=(temp_obs-temp_lo,temp_hi-temp_obs), color="blue", linewidth=2, fillalpha=0.2, legend=false, xlabel="Year", ylabel="Temperature anomaly (°C)", labelfontsize=18, tickfontsize=16, bottom_margin=10mm, left_margin=10mm)
plot!(size=(550, 450))

Figure 1: Global temperature anomalies

Data Source: HadCRUT 5.0.1.0

Source: Weekly Humorist

Planetary Energy Balance

Representation of Planetary Energy Balance

Source: Reprinted from A Climate Modeling Primer, A. Henderson-Sellers and K. McGuffie, Wiley, pg. 58, (1987) via https://www.e-education.psu.edu/meteo469/node/137.

Radiative Forcing

Climate changes result from changes to the energy balance of the planet (or radiative forcings), due to e.g.:

greenhouse gas emissions (which trap radiation, warming the planet);
aerosol emissions from air pollution or volcanic eruptions (which block incoming radiation, cooling the planet);
changes to the solar cycle (which can increase or decrease the incoming solar radiation).

Historical Radiative Forcing

Code

# Dataset from https://zenodo.org/record/3973015
# The CSV is read into a DataFrame object, and we specify that it is comma delimited
forcings_all_85 = CSV.read("data/climate/ERF_ssp585_1750-2500.csv", DataFrame, delim=",")

# Separate out the individual components
forcing_co2_85 = forcings_all_85[!,"co2"]
# Get total aerosol forcings
forcing_aerosol_rad_85 = forcings_all_85[!,"aerosol-radiation_interactions"]
forcing_aerosol_cloud_85 = forcings_all_85[!,"aerosol-cloud_interactions"]
forcing_aerosol_85 = forcing_aerosol_rad_85 + forcing_aerosol_cloud_85
forcing_total_85 = forcings_all_85[!,"total"]
forcing_non_aerosol_85 = forcing_total_85 - forcing_aerosol_85
forcing_other_85 = forcing_total_85 - (forcing_co2_85 + forcing_aerosol_85)

t = time_forcing = Int64.(forcings_all_85[!,"year"]) # Ensure that years are interpreted as integers

plot(xlabel="Year", ylabel="Radiative Forcing (W/m²)", tickfontsize=16, guidefontsize=18, legendfontsize=16, leftmargin=10mm, bottommargin=5mm, right_margin=5mm)
plot!(time_forcing, forcing_total_85, label="Total", color=:black, linewidth=3)
plot!(time_forcing, forcing_co2_85, label="CO₂", color=:orange, linewidth=2)
plot!(time_forcing, forcing_aerosol_85, label="Aerosol", color=:blue, linewidth=2)
plot!(time_forcing, forcing_other_85, label="Other", color=:purple, linewidth=2)
plot!(size=(800, 450))
xlims!((1750, 2020))
ylims!(-4.5, 5)

Figure 2: Historical and projected radiative forcings.

Data Source: https://zenodo.org/records/3973015

What Are Some Sources of Relevant Uncertainty in Understanding Past and Future Climate Changes and Impacts?

One key question: what is the sensitivity of warming to continued CO₂ emissions?

The Energy Balance Model (EBM)

\[\begin{align*} \overbrace{\frac{dH}{dt}}^{\text{change in heat}} &= \overbrace{F}^{\text{RF}} - \overbrace{\lambda T}^{\substack{\text{change in} \\ \text{temperature}}} \\[1em] \Rightarrow C\frac{dT}{dt} &= F - \lambda T - \gamma(T-T_D)\\ C_D\frac{dT_D}{dt} &= \gamma(T-T_D) \end{align*}\]

Source: Palmer et al. (2018)

The EBM (cont’d)

\(c = 4.184\times 10^6 \\ \text{J/K/m}^2\) is the specific heat of water per area.
Total RF: \[F = F_\text{non-aerosol} + \alpha F_\text{aerosol}.\]
The climate feedback factor \(\lambda\) controls how much the Earth warms in response to radiative forcing.

EBM Discretization

Use Euler discretization:

\[\begin{align*} T(t+1) &= T(t) + \frac{F(t) - \lambda T(t) - \gamma(T(t) - T_D(d))}{C} \Delta t \\[0.5em] T_D(t+1) &= T_D(t) + \frac{\gamma (T(t) - T_D(t))}{C_D} \Delta t \end{align*}\]

Equilibrium Climate Sensitivity (ECS)

Under steady-state conditions (constant \(F\) and \(dT/dt = 0\)), \[T = \frac{F}{\lambda}.\]

When we double atmospheric CO₂, we refer to the equilibrium temperature \(S\) as the equilibrium climate sensitivity:

\[S = \underbrace{F_{2\times \text{CO}_2}}_{\approx 4 \text{W/m}^2}/\lambda\]

Model Fitting

Degree of Freedom / Free Parameters

There are a few uncertain parameters:

\(\lambda\) or \(S\)
\(\gamma\) (deep ocean temp diffusion)
\(\alpha\) (aerosol scaling factor)
\(d\) (upper ocean mixing depth)
\(D\) (deep ocean mixing depth)
\(T_0\) (initial temperature in 1850)

Programming Implementation

function ebm(rf_nonaerosol, rf_aerosol; p=(3.2, 1.0, 1.3, 100.0, 800.0, -0.1))
    # set up model parameters
    S, γ, α, d, D, T₀ = p # this unpacks the parameter tuple into variables
    F2xCO₂ = 4.0 # radiative forcing [W/m²] for a doubling of CO₂
    λ = F2xCO₂ / S

    c = 4.184e6 # heat capacity/area [J/K/m²]
    C = c*d # heat capacity of mixed layer (per area)
    CD = c*D # heat capacity of deep layer (per area)
    F = rf_nonaerosol + α*rf_aerosol # radiative forcing
    Δt = 31558152. # annual timestep [s]

    T = zero(F)
    T[1] = T₀
    TD = zero(F)
    for i in 1:length(F)-1
        T[i+1] = T[i] + (F[i] - λ*T[i] - γ*(T[i]-TD[i]))/C * Δt
        TD[i+1] = TD[i] + γ*(T[i]-TD[i])/CD * Δt
    end
    # return after normalizing to reference period
    return T
end

ebm_wrap(params) = ebm(forcing_non_aerosol_85[hind_idx], forcing_aerosol_85[hind_idx], p = params)

Probability Models for Simulations

Computer Model: \[\eta(\underbrace{\theta}_{\substack{\text{calibration}\\\text{variables}}}; \underbrace{x}_{\substack{\text{control}\\\text{variables}}})\]

Observations: \[\mathbf{y} \sim p(\underbrace{\zeta(\mathbf{x})}_{\substack{\text{expected}\\\text{state}}})\]

Model-Data Discrepancy

Write \[\zeta(\mathbf{x}) = \delta(\eta(\theta; x))\] where \(\delta\) represents the discrepancy between the model output and the expected state.

Then the probability model is: \[\mathbf{y} \sim p(\delta(\eta(\theta; x))).\]

Model-Data Discrepancy

For example, \(\delta\) might capture:

Bias (e.g.: model consistently over/underpredicts);
Accumulations of error (e.g.: persistent model underestimates);
Partial observations (e.g.: do you count every animal?)

Probability Models for Simulation Models

Most common setting (e.g. Brynjarsdóttir & O’Hagan (2014)):

\[\mathbf{y} = \underbrace{\eta(\mathbf{x}; \theta)}_{\text{model}} + \underbrace{\delta(x)}_{\text{discrepancy}} + \underbrace{\varepsilon}_{\text{error}}\]

Discrepancy for the EBM Calibration

Assuming Gaussian Discrepancy

\[ \begin{align*} \mathbf{y} &= \eta(\mathbf{x}; \theta) + \delta(x) + \varepsilon \\ \delta &\sim N(0, \sigma^2) \\ \varepsilon &\sim N(0, \omega^2) \end{align*} \]

where \(\omega\) is the standard error of the data.

MLE for Gaussian Discrepancy

function gaussian_iid_homosked(params, temp_dat, temp_err, m)
    S, γ, α, d, D, T₀, σ = params 
    ebm_sim = m((S, γ, α, d, D, T₀))
    ll = sum(logpdf.(Normal.(ebm_sim, sqrt.(σ^2 .+ temp_err.^2)), temp_dat))
    return ll
end

lower = [1.0, 0.5, 0.0, 50.0, 200.0, temp_lo[1], 0.0]
upper = [5.0, 1.5, 2.0, 200.0, 1000.0, temp_hi[1], 10.0]
p0 = [3.0, 1.0, 1.0, 100.0, 800.0,temp_obs[1], 5.0]

result = Optim.optimize(params -> -gaussian_iid_homosked(params, temp_obs, temp_sd, ebm_wrap), lower, upper, p0)
θ_iid = result.minimizer

MLE for Gaussian Discrepancy

Parameters	MLE
S	3.4
γ	1.5
α	0.9
d	77.1
D	627.6
T₀	-0.1
σ	0.1

Code

n_samples = 10_000
temp_iid = ebm_wrap(θ_iid) # simulate IID best fit
# simulate projections with discrepancy and errors
temp_iid_proj = zeros(n_samples, length(temp_sd))
for i = 1:length(temp_sd)
    temp_iid_err = rand(Normal(0, sqrt.(θ_iid[end]^2 .+ temp_sd[i]^2)), n_samples)
    temp_iid_proj[:, i] = temp_iid[i] .+ temp_iid_err
end
# calculate quantiles
temp_iid_q = mapslices(col -> quantile(col, [0.05, 0.5, 0.95]), temp_iid_proj; dims=1)

p = scatter(time_obs, temp_obs, color=:black, label="Observations", ylabel="(°C)", xlabel="Year", title="Temperature Anomaly")
plot!(p, time_obs, temp_iid_q[2, :], ribbon=(temp_iid_q[2, :] - temp_iid_q[1, :], temp_iid_q[3, :] - temp_iid_q[2, :]), color=:red, fillalpha=0.3, label="Gaussian Discrepancy")
plot!(size=(600, 500))

Figure 3: MLE Fit for Gaussian discrepancy

Diagnosing Residuals

Code

resids_homogauss = temp_obs - temp_iid
p1 = plot(time_obs, resids_homogauss, linewidth=3, label=false, size=(600, 500), xlabel="Year", ylabel="Residual (°C)")
p2 = histogram(resids_homogauss, size=(600, 500), xlabel="Residual (°C)", ylabel="Count")

display(p1)
display(p2)

(a) Residuals

(b)

Figure 4

Analyzing Residual Assumptions

Code

p1 = qqnorm(resids_homogauss, xlabel="Theoretical Values", ylabel="Empirical Values", title="Normal Q-Q Plot", size=(600, 500))
pacf_homogauss = pacf(resids_homogauss, 1:5)
p2 = plot(1:5, pacf_homogauss, marker=:circle, line=:stem, linewidth=3, markersize=8, tickfontsize=16, guidefontsize=18, legend=false, ylabel="Partial Autocorrelation", xlabel="Time Lag", title="Partial Autocorrelation Plot", size=(600, 500))
hline!(p2, [0], color=:black, linestyle=:dash)

display(p1)
display(p2)

(a) Residual diagnostics

(b)

Figure 5

Key Points

Probability models for simulation models involve
- Discrepancy
- Observation error
Most common: additive
“System state”: Model output + discrepancy
Check residual assumptions!

Key Points

Hindcast: compare observations with full output (discrepancy + error)
Projections: no error, just discrepancy (state estimates)

Discussion of Shmueli (2010)

Questions To Seed Discussion

What do you think are the differences between predictive and explanatory modeling?
What can go wrong when we conflate the two?
Can you think of approaches or workflows which bridge the two paradigms?

Upcoming Schedule

Next Classes

Wednesday: Correlates residuals and more general structures

Assessments

Homework 2 available; due next Friday (2/21).

References

References (Scroll for Full List)

Brynjarsdóttir, J., & O’Hagan, A. (2014). Learning about physical parameters: the importance of model discrepancy. Inverse Problems, 30, 114007. https://doi.org/10.1088/0266-5611/30/11/114007

Palmer, M. D., Harris, G. R., & Gregory, J. M. (2018). Extending CMIP5 projections of global mean temperature change and sea level rise due to thermal expansion using a physically-based emulator. Environ. Res. Lett., 13, 084003. https://doi.org/10.1088/1748-9326/aad2e4