Lecture 11
March 3, 2024
Goal: Estimate \(\mathbb{E}_f\left[h(x)\right]\), \(x \sim f(x)\)
Monte Carlo principle: approximate an expectation by the average of function values at sampled points, \[\mathbb{E}_f\left[h(x)\right] \approx \frac{1}{n}\sum_{i=1}^n h(x_i), \qquad x_i \sim f.\]
In other words: replace calculus with data summaries!
Monte Carlo is a broad method, which can be used to:
estimate deterministic quantities (integrals, expectations);
estimate probabilities of outcomes;
propagate uncertainty from inputs through a model to outputs.
How can we use MC to estimate \(\pi\)?
Hint: Think of \(\pi\) as an expected value…
Finding \(\pi\) by sampling random points uniformly from the square \([-1, 1]^2\) and computing the fraction that falls inside the unit circle. This is an example of Monte Carlo integration.
\[\frac{\text{Area of Circle}}{\text{Area of Square}} = \frac{\pi}{4}\]
using Logging, Plots, Distributions, Statistics, LaTeXStrings, Measures
Logging.disable_logging(Logging.Info)
# points on a circle of radius r, for plotting
function circleShape(r)
    θ = LinRange(0, 2 * π, 500)
    r * sin.(θ), r * cos.(θ)
end
nsamp = 3000
unif = Uniform(-1, 1)
x = rand(unif, (nsamp, 2))                          # uniform samples from [-1, 1]²
l = mapslices(v -> sum(v.^2), x, dims=2)            # squared distance from the origin
in_circ = l .< 1                                    # inside the unit circle?
pi_est = [4 * mean(in_circ[1:i]) for i in 1:nsamp]  # running estimate of π
plt1 = plot(
1,
xlim = (-1, 1),
ylim = (-1, 1),
legend = false,
markersize = 4,
framestyle = :origin,
tickfontsize=16,
grid=:false
)
plt2 = plot(
1,
xlim = (1, nsamp),
ylim = (3, 3.5),
legend = :false,
linewidth=3,
color=:black,
tickfontsize=16,
guidefontsize=16,
xlabel="Iteration",
ylabel="Estimate",
right_margin=5mm
)
hline!(plt2, [π], color=:red, linestyle=:dash)
plt = plot(plt1, plt2, layout=Plots.grid(2, 1, heights=[2/3, 1/3]), size=(600, 500))
plot!(plt, circleShape(1), linecolor=:blue, lw=1, aspectratio=1, subplot=1)
mc_anim = @animate for i = 1:nsamp
    if l[i] < 1
        scatter!(plt[1], Tuple(x[i, :]), color=:blue, markershape=:x, subplot=1)
    else
        scatter!(plt[1], Tuple(x[i, :]), color=:red, markershape=:x, subplot=1)
    end
    push!(plt, 2, i, pi_est[i])
end every 100
gif(mc_anim, "figures/mc_pi.gif", fps=3)What is the probability of rolling 4 dice for a total of 19?
Can simulate dice rolls and find the frequency of 19s among the samples.
# simulate n_trials rolls of n_dice fair six-sided dice; return the totals
function dice_roll_repeated(n_trials, n_dice)
    dice_dist = DiscreteUniform(1, 6)
    roll_results = zeros(n_trials)
    for i = 1:n_trials
        roll_results[i] = sum(rand(dice_dist, n_dice))
    end
    return roll_results
end
nsamp = 10000
# roll four dice 10000 times
rolls = dice_roll_repeated(nsamp, 4)
# calculate probability of 19
sum(rolls .== 19) / length(rolls)
# initialize storage for running estimates by sample length
avg_freq = zeros(length(rolls))
std_freq = zeros(length(rolls))
# compute the running frequency of 19s and its standard error
avg_freq[1] = (rolls[1] == 19)
for i = 2:length(rolls)
    avg_freq[i] = (avg_freq[i-1] * (i-1) + (rolls[i] == 19)) / i
    std_freq[i] = 1/sqrt(i) * std(rolls[1:i] .== 19)
end
plt = plot(
1,
xlim = (1, nsamp),
ylim = (0, 0.1),
legend = :false,
tickfontsize=16,
guidefontsize=16,
xlabel="Iteration",
ylabel="Estimate",
right_margin=8mm,
color=:black,
linewidth=3,
size=(600, 400)
)
hline!(plt, [0.0432], color=:red, linestyle=:dash)  # analytical probability: 56/1296 ≈ 0.0432
mc_anim = @animate for i = 1:nsamp
    push!(plt, 1, i, avg_freq[i])
end every 100
gif(mc_anim, "figures/mc_dice.gif", fps=10)Monte Carlo simulation: propagate uncertainties from inputs through a model to outputs.
This is an example of uncertainty propagation: draw samples from some distribution, and run them through one or more models to find the (conditional) probability of outcomes of interest (for good or bad).
Formally: Monte Carlo estimation is the computation of the expected value of a random quantity \(Y = f(X)\), \(\mu = \mathbb{E}[Y]\).
To do this, generate \(n\) independent and identically distributed values \(Y_1, \ldots, Y_n\). Then the sample estimate is
\[\tilde{\mu}_n = \frac{1}{n}\sum_{i=1}^n Y_i\]
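A minimal sketch of this estimator in Julia (the integrand here is an assumption for illustration, not from the lecture): estimate \(\mu = \mathbb{E}[X^2] = 1\) for \(X \sim \mathcal{N}(0, 1)\).
# minimal sketch: estimate E[X²] for X ~ N(0, 1); the true value is 1
using Distributions, Random, Statistics
Random.seed!(1)                 # for reproducibility
n = 10_000
Y = rand(Normal(0, 1), n).^2    # Yᵢ = f(Xᵢ) with f(x) = x²
mu_tilde = mean(Y)              # sample estimate; ≈ 1 for large n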
Statistical estimators are random, which means we can’t ever guarantee that we get back the “true” value.
If
\(Y\) is a random variable and its expectation exists, and
\(Y_1, \ldots, Y_n\) are independent and identically distributed,
then by the weak law of large numbers, for any \(\varepsilon > 0\):
\[\lim_{n \to \infty} \mathbb{P}\left(\left|\tilde{\mu}_n - \mu\right| \leq \varepsilon \right) = 1\]
In other words, with enough samples, the Monte Carlo estimate will eventually get within any desired error of the true expectation (with probability approaching one).
But how large is large enough?
The sample mean \(\tilde{\mu}_n = \frac{1}{n}\sum_{i=1}^n Y_i\) is itself a random variable.
With some assumptions (the mean of \(Y\) exists and \(Y\) has finite variance), the expected Monte Carlo sample mean \(\mathbb{E}[\tilde{\mu}_n]\) is
\[\frac{1}{n}\sum_{i=1}^n \mathbb{E}[Y_i] = \frac{1}{n} n \mu = \mu\]
We’d like to know more about the error of this estimate for a given sample size. The variance of this estimator is
\[\tilde{\sigma}_n^2 = \text{Var}\left(\tilde{\mu}_n\right) = \mathbb{E}\left((\tilde{\mu}_n - \mu)^2\right) = \frac{\sigma_Y^2}{n}\]
So as \(n\) increases, the standard error decreases:
\[\tilde{\sigma}_n = \frac{\sigma_Y}{\sqrt{n}}\]
In other words, if we want to decrease the Monte Carlo error by a factor of 10, we need 100 times as many samples. This makes Monte Carlo a poor method when very high accuracy is required.
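A quick sketch of this scaling, reusing the assumed \(\mathbb{E}[X^2]\) example from above:
# sketch: the Monte Carlo standard error shrinks like 1/√n
using Distributions, Random, Statistics
Random.seed!(1)
for n in (100, 10_000, 1_000_000)   # each step uses 100x more samples
    Y = rand(Normal(0, 1), n).^2
    se = std(Y) / sqrt(n)           # estimated standard error
    println("n = $n: estimate ≈ $(round(mean(Y), digits=3)), std. error ≈ $(round(se, digits=4))")
end
Each 100x increase in \(n\) cuts the standard error by roughly 10x.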
Monte Carlo is an extremely bad method. It should only be used when all alternative methods are worse.
— Sokal, Monte Carlo Methods in Statistical Mechanics, 1996
But…often most alternatives are worse!
If you can compute your integrals analytically or through quadrature, you probably should.
But for many “real” problems, this is either analytically intractable or computationally impractical (for example, when the integrand involves a simulation model or the problem is high-dimensional).
Basic Idea: The Central Limit Theorem says that with enough samples, the estimation error is approximately normally distributed:
\[\tilde{\mu}_n - \mu \overset{\text{approx.}}{\sim} \mathcal{N}\left(0, \frac{\sigma_Y^2}{n}\right)\]
The \((1 - \alpha)\)-confidence interval is: \[\tilde{\mu}_n \pm \Phi^{-1}\left(1 - \frac{\alpha}{2}\right) \frac{\sigma_Y}{\sqrt{n}}\]
For example, the 95% confidence interval (\(\alpha = 0.05\)) is \[\tilde{\mu}_n \pm 1.96 \frac{\sigma_Y}{\sqrt{n}}.\]
We don’t know the standard deviation \(\sigma_Y\).
But we can estimate it using the simulation standard deviation: \[\hat{\sigma}_Y = \sqrt{\frac{1}{n-1}\sum_{i=1}^n \left(Y_i - \tilde{\mu}_n\right)^2}.\]
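A minimal sketch of this interval in code (again with the assumed \(\mathbb{E}[X^2]\) example):
# sketch: 95% confidence interval from the simulation standard deviation
using Distributions, Random, Statistics
Random.seed!(1)
Y = rand(Normal(0, 1), 10_000).^2
mu_hat = mean(Y)
se_hat = std(Y) / sqrt(length(Y))   # std() uses the n - 1 normalization by default
ci = (mu_hat - 1.96 * se_hat, mu_hat + 1.96 * se_hat)   # should cover μ = 1 about 95% of the time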
Converging at a rate of \(1/\sqrt{n}\) is not great. But: unlike quadrature, this rate does not depend on the dimension of the problem, so Monte Carlo is often the only practical option for high-dimensional integrals.
Goal: Find the probability of exceeding the 1-hour average SO2 exposure standard of 0.14 ppm.
\[\mathbb{P}[\text{SO}_2(\theta) > 0.14] = \int \mathbb{I}(\text{SO}_2(\theta) > 0.14) p(\theta) d\theta\]
Figure 4: Illustration of the airshed, including notation.
\[\frac{dC}{dt} = \frac{u}{L} C_\text{in}(t) + \frac{S-D}{WHL} - \left(\frac{u}{L} + k\right)C\]
Writing \(R = S - D\) and applying a forward Euler discretization with step \(\Delta t\): \[\Rightarrow \frac{C(t+1) - C(t)}{\Delta t} = \frac{u}{L} C_\text{in}(t) + \frac{R}{WHL} - \left(\frac{u}{L} + k\right)C(t)\]
\[\bbox[yellow, 10px, border:5px solid red]{C(t+1) = \left(1 - \Delta t\left(\frac{u}{L} + k\right)\right)C(t) + \Delta t \left(\frac{u}{L} C_\text{in}(t) + \frac{R}{WHL}\right)} \]
nsamp = 1000
u = rand(LogNormal(log(2), 1), nsamp)
Cin = rand(LogNormal(log(0.16), 0.12), nsamp)
R = rand(Normal(0.5, 0.5), nsamp)
p1 = histogram(u, ylabel="count", xlabel=L"$u$ (m/s)", label=false, tickfontsize=16, guidefontsize=18, size=(400, 450))
p2 = histogram(Cin, ylabel="count", xlabel=L"$C_{in}$ (ppm)", label=false, tickfontsize=16, guidefontsize=18, size=(400, 450))
p3 = histogram(R, ylabel="count", xlabel=L"$R$ (ppm/hr)", label=false, tickfontsize=16, guidefontsize=18, size=(400, 450))
display(p1)
display(p2)
display(p3)
# other parameters
C₀ = 0.07
T = 60
k = 0.3
W = 4
H = 5
L = 4
# conduct simulation
P = u / L .* Cin                  # inflow term u/L * C_in for each sample
l = u / L .+ k                    # total loss rate u/L + k for each sample
C2 = zeros(T*100 + 1, nsamp)
S = 0:0.01:T                      # time grid with Δt = 0.01
for (i, t) in pairs(S)
    if i == 1
        C2[i, :] .= C₀
    else
        C2[i, :] = (1 .- 0.01*l) .* C2[i-1, :] .+ 0.01 * P .+ 0.01 * R / (H * W * L)
    end
end
mean_SO2 = map(mean, eachcol(C2)) # calculate means
# plot histogram
p1 = histogram(mean_SO2, xlabel="1-Hour Average Exposure (ppm)", ylabel="Count", legend=false, tickfontsize=16, guidefontsize=18)
vline!(p1, [0.14], color=:red, linestyle=:dash, linewidth=3)
xticks!(p1, 0:0.04:0.3)
xaxis!(p1, xminorticks=2)
plot!(p1, size=(600, 450))
# plot cdf
p2 = plot(sort(mean_SO2), (1:nsamp) ./ nsamp, xlabel="1-Hour Average Exposure (ppm)", ylabel="Cumulative Probability", legend=false, tickfontsize=17, guidefontsize=18, linewidth=3)
vline!(p2, [0.14], linestyle=:dash, color=:red, linewidth=3, minorgrid=true)
xticks!(p2, 0:0.04:0.3)
xaxis!(p2, xminorticks=2)
yaxis!(p2, yminorticks=5)
plot!(p2, size=(600, 450))
display(p1)
display(p2)
\[\hat{\mu}_n = \frac{1}{n}\sum_{i=1}^n \mathbb{I}[x_i > 0.14]\]
\[\hat{\sigma}_n = \sqrt{\frac{\text{Var}(\mathbb{I}[x_{1:n} > 0.14])}{n}}\]
# show Monte Carlo estimate stabilization
avg_mc_out = zeros(nsamp)
avg_mc_out[1] = mean_SO2[1] > 0.14
std_mc_out = zeros(nsamp)
std_mc_out[1] = 0
for i = 2:nsamp
    avg_mc_out[i] = (avg_mc_out[i-1] * (i-1) + (mean_SO2[i] > 0.14)) / i
    std_mc_out[i] = 1/sqrt(i) * std(mean_SO2[1:i] .> 0.14)
end
p = plot(avg_mc_out, xlabel="Monte Carlo Iteration", ylabel="Probability", left_margin=3mm, legend=:false, ribbon=1.96*std_mc_out, fillalpha=0.3, linewidth=2, tickfontsize=16, guidefontsize=18, fillcolor=:red, right_margin=5mm, minorgrid=true)
ylims!(p, (0, 0.3))
yaxis!(p, yminorticks=5)
plot!(p, size=(600, 450))
display(p)
Can also use Monte Carlo to estimate expected values for optimization problems.
For example: in the previous problem, we might want to optimize a control strategy with the objective of minimizing violations. But there is no closed-form representation of the exposure distribution, so we use Monte Carlo estimates inside the optimization (see the sketch below).
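A hedged sketch of this idea (the cost function and names here are hypothetical, not the airshed model): fix a set of Monte Carlo samples, replace the expectation in the objective with the sample average, and optimize.
# sketch of a sample average approximation; cost(θ, x) is hypothetical
using Distributions, Random, Statistics
Random.seed!(1)
X = rand(LogNormal(log(2), 1), 1_000)    # fixed samples of the uncertain input
cost(θ, x) = θ^2 + 10 * (x > θ)          # hypothetical control cost plus violation penalty
saa_objective(θ) = mean(cost.(θ, X))     # Monte Carlo estimate of E[cost(θ, X)]
θ_grid = 0:0.1:10
θ_best = θ_grid[argmin([saa_objective(θ) for θ in θ_grid])]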
We would like to estimate the CDF \(F\) with some approximation \(\hat{F}_n\), then compute \(\hat{z}^\alpha_n = \hat{F}_n^{-1}(\alpha)\) as an estimator of the \(\alpha\)-quantile \(z^\alpha\).
Given samples \(y_1, \ldots, y_n \sim F\), define the empirical CDF \[\hat{F}_n(y) = \frac{1}{n} \sum_{i=1}^n \mathbb{I}(y_i \leq y).\]
\[ \begin{align*} \mathbb{E}[\hat{F}_n(y)] &= \frac{1}{n} \sum_{i=1}^n \mathbb{E}\left[\mathbb{I}(y_i \leq y)\right] \\ &= \frac{1}{n} \sum_{i=1}^n \mathbb{P}(y_i \leq y) \\ &= F(y), \end{align*} \] so the empirical CDF is an unbiased estimator of \(F\).
From the CLT and some (not super important) theory about order statistics: \[\text{Var}(\hat{z}^\alpha_n) \to \frac{1}{n}\frac{\alpha (1 - \alpha)}{f^2(z^\alpha)}\]
In other words, the smaller the density at the “true” quantile \(z^\alpha\), the greater the error.
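A minimal sketch of the plug-in quantile estimator (the target distribution is an assumed example; quantile on a vector computes the empirical quantile):
# sketch: Monte Carlo (plug-in) estimate of the α-quantile via the empirical CDF
using Distributions, Random, Statistics
Random.seed!(1)
y = rand(LogNormal(0, 0.5), 10_000)   # draws from an assumed target F
α = 0.95
z_hat = quantile(y, α)                # ẑₙ^α = F̂ₙ⁻¹(α); compare quantile(LogNormal(0, 0.5), α)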
To estimate a tail (exceedance) probability with simple Monte Carlo: \[P(X > k) \approx \frac{1}{M} \sum_{i=1}^M \mathbb{I}(X_i > k)\] For rare events, almost all samples contribute zero, so the relative error of this estimator is large.
Importance sampling extends rejection sampling without requiring “rejection”: draw samples from a proposal \(g\) and reweight them by the ratio \(f(x_i)/g(x_i)\).
Technically, this works with any proposal \(g\), but it is more efficient if \(g\) “covers” \(f\) (as with rejection sampling):
\[f(x)/g(x) < M < \infty\]
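A minimal sketch (the target, proposal, and threshold are assumptions for illustration): estimate \(\mathbb{P}(X > 4)\) for \(X \sim \mathcal{N}(0, 1)\), a rare event for which plain Monte Carlo almost always returns 0.
# sketch: importance sampling for the rare event P(X > 4), X ~ N(0, 1)
using Distributions, Random, Statistics
Random.seed!(1)
f_target = Normal(0, 1)
g_prop = Normal(4, 1)                       # proposal that covers the tail of interest
M = 10_000
naive = mean(rand(f_target, M) .> 4)        # plain MC: almost always exactly 0
xs = rand(g_prop, M)
w = pdf.(f_target, xs) ./ pdf.(g_prop, xs)  # importance weights f(xᵢ)/g(xᵢ)
p_tail = mean((xs .> 4) .* w)               # ≈ 1 - Φ(4) ≈ 3.2e-5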
If the target values of pairs of samples \(h(X_i)\) and \(h(Y_i)\) are negatively correlated (antithetic sampling), the estimator \[\frac{1}{2M} \sum_{i=1}^M [h(X_i) + h(Y_i)]\] has lower variance than \(\frac{1}{M}\sum_{i=1}^M h(X_i)\) alone for the same total number of evaluations.
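A minimal sketch of this idea (the integrand is an assumed example): for \(U \sim \text{Uniform}(0, 1)\), the pair \((U, 1 - U)\) gives negatively correlated values of any monotone \(h\).
# sketch: antithetic variates for E[h(U)] with h(u) = exp(u); true value e - 1 ≈ 1.718
using Random, Statistics
Random.seed!(1)
M = 10_000
h(u) = exp(u)
U = rand(M)
plain = mean(h.(rand(2M)))                # 2M independent samples
anti = mean((h.(U) .+ h.(1 .- U)) ./ 2)   # M antithetic pairs, same sampling budget
# anti has lower variance because cov(h(U), h(1 - U)) < 0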
Always report Monte Carlo error!
Wednesday: The Bootstrap