Download the notebook here! Interactive online version: binder

Integration

[1]:
import numpy as np

from integration_algorithms import monte_carlo_quasi_two_dimensions
from integration_algorithms import quadrature_newton_trapezoid_one
from integration_algorithms import quadrature_newton_simpson_one
from integration_algorithms import quadrature_gauss_legendre_one
from integration_algorithms import quadrature_gauss_legendre_two
from integration_algorithms import monte_carlo_naive_one

from integration_plots import plot_naive_monte_carlo_randomness
from integration_plots import plot_naive_monte_carlo_error
from integration_plots import plot_gauss_legendre_weights
from integration_plots import plot_benchmarking_exercise
from integration_plots import plot_naive_monte_carlo
from integration_plots import plot_quasi_monte_carlo
from integration_plots import plot_starting_illustration
from integration_plots import plot_trapezoid_rule_illustration
from integration_plots import plot_simpsons_rule_illustration

from integration_problems import problem_kinked
from integration_problems import problem_smooth

Outline

  1. Setup

  2. Newton-Cotes rules

  3. Gaussian formulas

  4. Monte Carlo integration

  5. Resources

Setup

Consider finding the area under a continuous real-valued function f over a bounded interval [a, b]:

\begin{aligned}
A = \int_a^b f(x) dx
\end{aligned}

Let’s look a visual representation of our problem.

[2]:
plot_starting_illustration()
../../_images/labs_integration_notebook_4_0.png

Most numerical methods for computing this integral split up the original integral into a sum of several integrals, each covering a smaller part of the original integration interval [a, b]. This re-writing of the integral is based on a selection of integration points x_i , i = 0, 1, . . . , n that are distributed on the interval [a, b]. Integration points may, or may not, be evenly distributed.

Given the integration points, the original integral is re-written as a sum of integrals, each integral being computed over the sub-interval between two consecutive integration points. The integral from the beginning is thus expressed as:

\begin{aligned}
\int_a^b f(x) dx = \int_{x_0}^{x_1} f(x) dx + \int_{x_1}^{x_2} f(x) dx +  ...  + \int_{x_{n-1}}^{x_n} f(x) dx ,
\end{aligned}

where x_0 = a and x_n = b.

The different integration methods will differ in the way they approximate each integral on the right hand side. The fundamental idea is that each term is an integral over a small interval [x_i , x_{i+1}], and over this small interval, it makes sense to approximate f by a simple shape

Newton-Cotes rules

Newton–Cotes quadrature rules are a group of formulas for numerical integration (also called quadrature) based on evaluating the integrand at equally spaced points. The integration points are then computed as

\begin{aligned}
x_i = a + ih, i = 0, 1, . . . , n,
\end{aligned}

where h = (b - a) / n

The closed Newton-Cotes formula of degree n is stated as

\begin{aligned}
\int_a^b f(x) dx \approx \sum^n_{i=0} \omega_i f(x_i).
\end{aligned}

We will consider the first two degrees in more detail:

  • Trapezoid rule

  • Simpson’s rule

Trapezoid rule

The Trapezoid rule approximates the area under the function f with the area under a piecewise linear approximation to f.

[3]:
plot_trapezoid_rule_illustration()
../../_images/labs_integration_notebook_8_0.png
../../_images/labs_integration_notebook_8_1.png
../../_images/labs_integration_notebook_8_2.png

More formally:

\begin{aligned}
\int_a^b f(x) dx & = \int_{x_0}^{x_1} f(x) dx + \int_{x_1}^{x_2} f(x) dx  +  ... + \int_{x_{n-1}}^{x_n} f(x) dx \\ &\approx h \frac{f(x_0) - f(x_1)}{2} +  h \frac{f(x_1) - f(x_2)}{2} + ... + h \frac{f(x_{n-1}) - f(x_n)}{2},
\end{aligned}

which we can further simplify to:

\begin{aligned}
\int_a^b f(x) dx \approx 0.5h f(x_0) + h\sum_{i = 1}^{n -1}f(x_i) + 0.5 h f(x_n).
\end{aligned}

The weights are then the following:

\begin{aligned}
\omega_i = \begin{cases}
h / 2 & \text{if $i \in \{0, n\}$}   \\
h & \text{otherwise}
\end{cases}
\end{aligned}

[4]:
??quadrature_newton_trapezoid_one
Signature: quadrature_newton_trapezoid_one(f, a, b, n)
Source:
def quadrature_newton_trapezoid_one(f, a, b, n):
    """Return quadrature newton trapezoid example."""
    xvals = np.linspace(a, b, n + 1)
    fvals = np.tile(np.nan, n + 1)
    h = xvals[1] - xvals[0]

    weights = np.tile(h, n + 1)
    weights[0] = weights[-1] = 0.5 * h

    for i, xval in enumerate(xvals):
        fvals[i] = f(xval)

    return np.sum(weights * fvals)
File:      ~/external-storage/sciebo/office/OpenSourceEconomics/teaching/scientific-computing/course/lectures/integration/integration_algorithms.py
Type:      function

[5]:
integrand = quadrature_newton_trapezoid_one(np.exp, 0, 1, 1000)
np.testing.assert_almost_equal(integrand, np.exp(1) - 1)

Simpson’s rule

Simpson’s rule approximates the area under a function f with the area under a piecewise quadratic approximation to f.

[6]:
plot_simpsons_rule_illustration()
../../_images/labs_integration_notebook_13_0.png
../../_images/labs_integration_notebook_13_1.png
../../_images/labs_integration_notebook_13_2.png

The weights are the following:

\begin{aligned}
w_i  = \begin{cases}
h / 3 & \text{for $i \in [0, n]$} \\
4 h / 3& \text{for $0 < i < n, i$ even} \\
2 h / 3& \text{for $0 < i < n, i$ odd} . \\
\end{cases}
\end{aligned}

[7]:
??quadrature_newton_simpson_one
Signature: quadrature_newton_simpson_one(f, a, b, n)
Source:
def quadrature_newton_simpson_one(f, a, b, n):
    """Return quadrature newton simpson example."""
    if n % 2 == 0:
        raise Warning("n must be an odd integer. Increasing by 1")
        n += 1

    xvals = np.linspace(a, b, n)
    fvals = np.tile(np.nan, n)

    h = xvals[1] - xvals[0]

    weights = np.tile(np.nan, n)
    weights[0::2] = 2 * h / 3
    weights[1::2] = 4 * h / 3
    weights[0] = weights[-1] = h / 3

    for i, xval in enumerate(xvals):
        fvals[i] = f(xval)

    return np.sum(weights * fvals)
File:      ~/external-storage/sciebo/office/OpenSourceEconomics/teaching/scientific-computing/course/lectures/integration/integration_algorithms.py
Type:      function

[8]:
integrand = quadrature_newton_simpson_one(np.exp, 0, 1, 1001)
np.testing.assert_almost_equal(integrand, np.exp(1) - 1)

Gaussian formulas

Gaussian quadrature formulas employ a very different logic to compute the area under a curve f over a bounded interval [a, b]. Specifically, the n quadrature nodes x_i and n quadrature weights \omega_i are chosen to exactly integrate polynomials of degree 2n - 1 or less.

Gaussian quadrature approximations of the form:

\begin{aligned}
\int_a^b f(x) w(x) dx \approx \sum_{i = 1}^{n}\omega _i f(x_i)
\end{aligned}

for any nonnegative weighing function w(x). Versions of w(x) lead to include Gauss-Legendre, Gauss-Chebyshev, Gauss-Hermite and Gauss-Laguerre.

We start from the special domain between [-1, 1], again the formula for our approximate solution looks very similar.

\begin{aligned}
\int_{-1}^1 f(x) dx\approx \sum^n_{i=1} w_i f(x_i)
\end{aligned}

So, more generally using a change of variables:

\begin{aligned}
\int_a^b f(x) = \approx \frac{b -a }{2} \sum^n_{i=1} w_i f\left(\frac{(x_i + 1)(b - a)}{2} + a\right)
\end{aligned}

Unlike in the case of the Newton-Cotes approach, Gaussian nodes are not uniformly spaced and do not include the integration limits.

[9]:
[plot_gauss_legendre_weights(deg) for deg in [5, 11, 21]]
../../_images/labs_integration_notebook_18_0.png
../../_images/labs_integration_notebook_18_1.png
../../_images/labs_integration_notebook_18_2.png
[9]:
[None, None, None]

Question

  • What are the properties of the weights?

[10]:
??quadrature_gauss_legendre_one
Signature: quadrature_gauss_legendre_one(f, a, b, n)
Source:
def quadrature_gauss_legendre_one(f, a, b, n):
    """Return quadrature gauss legendre example."""
    xvals, weights = np.polynomial.legendre.leggauss(n)
    xval_trans = (b - a) * (xvals + 1.0) / 2.0 + a

    fvals = np.tile(np.nan, n)
    for i, xval in enumerate(xval_trans):
        fvals[i] = ((b - a) / 2.0) * f(xval)

    return np.sum(weights * fvals)
File:      ~/external-storage/sciebo/office/OpenSourceEconomics/teaching/scientific-computing/course/lectures/integration/integration_algorithms.py
Type:      function

[11]:
integrand = quadrature_gauss_legendre_one(np.exp, 0, 1, 1000)
np.testing.assert_almost_equal(integrand, np.exp(1) - 1)

Benchmarking

Exercise

  • Compare the performance of our integration routines on the following two integrals.

We study the following two cases:

\begin{aligned}
f(x) & = \int_{-1}^1 e^{-x} dx \\
f(x) & = \int_{-1}^1 \sqrt{|x|} dx .
\end{aligned}

[12]:
plot_benchmarking_exercise()
../../_images/labs_integration_notebook_23_0.png

How can we extend the ideas so far to multidimensional integration?

\begin{aligned}
\int_{x}\int_{y} f(x, y) dx dy .
\end{aligned}

The product rule approximates the integral with the following sum:

\begin{aligned}
\sum_{i_x = 1}^m \sum_{i_y = 1}^m \omega_{i_x} \omega_{i_y} f(x_{1} y_{1}) .
\end{aligned}

A difficulty of this approach is the curse of dimensionality as the number of nodes inreases exponentially with the number of dimensions.

[13]:
??quadrature_gauss_legendre_two
Signature: quadrature_gauss_legendre_two(f, a=-1, b=1, n=10)
Source:
def quadrature_gauss_legendre_two(f, a=-1, b=1, n=10):
    """Return quadrature gauss legendre example."""
    n_dim = int(np.sqrt(n))

    xvals, weight_uni = np.polynomial.legendre.leggauss(n_dim)
    xvals_transformed = (b - a) * (xvals + 1.0) / 2.0 + a

    weights = np.tile(np.nan, n_dim ** 2)
    fvals = np.tile(np.nan, n_dim ** 2)

    counter = 0
    for i, x in enumerate(xvals_transformed):
        for j, y in enumerate(xvals_transformed):
            weights[counter] = weight_uni[i] * weight_uni[j]
            fvals[counter] = f([x, y])
            counter += 1

    return ((b - a) / 2) ** 2 * np.sum(weights * np.array(fvals))
File:      ~/external-storage/sciebo/office/OpenSourceEconomics/teaching/scientific-computing/course/lectures/integration/integration_algorithms.py
Type:      function

Monte Carlo integration

Naive Monte Carlo integration uses random sampling of a function to numerically compute an estimate of its integral. We can approximate this integral by averaging N samples of the function f drawn from a uniform distribution between a and b.

\begin{aligned}
\int_a^b f(x) dx \approx (b - a) \frac{1}{N} \sum_{i = 1}^N f(x_i)
\end{aligned}

Why?

\begin{aligned}
E\left[ (b - a)  \frac{1}{N} \sum_{i = 1}^N f(x_i)\right] & = (b - a)  \frac{1}{N}  \sum_{i = 1}^N E\left[f(x_i)\right] \\
& =  (b - a)  \frac{1}{N}  \sum_{i = 1}^N\left[\int_a^b f(x) \frac{1}{b - a} dx\right] \\
& =  \frac{1}{N}  \sum_{i = 1}^N \int_a^b f(x) dx  \\
& = \int_a^b f(x) dx.
\end{aligned}

[14]:
??monte_carlo_naive_one
Signature: monte_carlo_naive_one(f, a=0, b=1, n=10, seed=123)
Source:
def monte_carlo_naive_one(f, a=0, b=1, n=10, seed=123):
    """Return naive monte carlo example."""
    np.random.seed(seed)
    xvals = np.random.uniform(size=n)
    fvals = np.tile(np.nan, n)
    weights = np.tile(1 / n, n)

    scale = b - a
    for i, xval in enumerate(xvals):
        fvals[i] = f(a + xval * (b - a))

    return scale * np.sum(weights * fvals)
File:      ~/external-storage/sciebo/office/OpenSourceEconomics/teaching/scientific-computing/course/lectures/integration/integration_algorithms.py
Type:      function

Monte Carlo integration only converges very slowly. Let’s consider our smooth benchmarking example from earlier and compare the naive Monte Carlo approach to the quadrature methods.

[15]:
plot_naive_monte_carlo_error(max_nodes=200)
../../_images/labs_integration_notebook_29_0.png

Naive Monte Carlo integration differs from our earlier methods as the approximation itself is a random variable. How about the level of randomness at a given number (n = 50) of integration nodes?

[16]:
plot_naive_monte_carlo_randomness()
../../_images/labs_integration_notebook_31_0.png

However, Monte Carlo integration is particular useful when tackling multidimensional integrals. Let’s consider the computation of a two dimensional integral going forward.

[17]:
plot_naive_monte_carlo(100)
../../_images/labs_integration_notebook_33_0.png

Quasi-Monte Carlo employ deterministic sequences of nodes x_i with the property that

\begin{aligned}
\lim_{n \rightarrow \infty} \frac{b - a}{n} \sum_{i=1}^n f(x_i)= \int_{a}^b f(x) dx.
\end{aligned}

Deterministic sequences of nodes chosen to fill space in a regular manner typically provide more accurate integration approximations than pseudo-random sequences. These sequences are readily available in the chaospy package. The documentation of the package also includes a tutorial on Monte Carlo integration.

[18]:
plot_quasi_monte_carlo(3)
../../_images/labs_integration_notebook_35_0.png

Otherwise, all looks pretty similar to our earlier implementation. Note that the default rule actually takes us back to the naive Monte Carlo method.

[19]:
??monte_carlo_quasi_two_dimensions
Signature: monte_carlo_quasi_two_dimensions(f, a=0, b=1, n=10, rule='random')
Source:
def monte_carlo_quasi_two_dimensions(f, a=0, b=1, n=10, rule="random"):
    """Return Monte Carlo example (two-dimensional).

    Corresponds to naive Monthe Carlo for `rule='random'`. Restricted to same
    integration domain for both variables.
    """
    distribution = cp.J(cp.Uniform(a, b), cp.Uniform(a, b))
    samples = distribution.sample(n, rule=rule).T
    volume = (b - a) ** 2

    fvals = np.tile(np.nan, n)
    weights = np.tile(1 / n, n)

    for i, xval in enumerate(samples):
        fvals[i] = f(xval)

    return volume * np.sum(weights * fvals)
File:      ~/external-storage/sciebo/office/OpenSourceEconomics/teaching/scientific-computing/course/lectures/integration/integration_algorithms.py
Type:      function

Let’s consider the following test function from Genz (1984).

d5df61e6b02444d8aa3d1d6eb5418194

Exercises

  1. What are the properties of the function?

  2. Compare the performance naive and quasi Monte Carlo integration routines for u = (0.5, 0.5) and a = (5, 5) over the two-dimensional unit cube.

  3. How does Gauss-Legendre quadrature perform?

Applying the integration routines to functions with other input arguments can be achieved with some small modifications. Here we pass in our benchmarking function without any default arguments to a modified naive Monte Carlo integration routine.

[20]:
def genz_discontinuous_no_defaults(x, u, a):
    if x[0] > u[0] or x[1] > u[1]:
        return 0
    else:
        return np.exp((a * x).sum())


def monte_carlo_naive_with_args(f, args=(), a=0, b=1, n=10, seed=128):
    np.random.seed(seed)
    xvals = np.random.uniform(low=a, high=b, size=2 * n).reshape(n, 2)
    volume = (b - a) ** 2

    fvals = np.tile(np.nan, n)
    weights = np.tile(1 / n, n)

    for i, xval in enumerate(xvals):
        # Here is the main modification that passes in the arguments by position now.
        fvals[i] = f(xval, *args)

    return volume * np.sum(weights * fvals)


u, a = (0.5, 0.5), (5, 5)
rslt = monte_carlo_naive_with_args(genz_discontinuous_no_defaults, args=(u, a))

Resources

Research

Other resources

References

  • Genz, A. (1984, September). Testing multidimensional integration routines. In Proc. of international conference on Tools, methods and languages for scientific and engineering computation (pp. 81-94). Elsevier North-Holland, Inc..