{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "# Numerical solution of nonlinear equations\n", "\n", " \n", "**Anne Kværnø**\n", "\n", "Date: **Oct 23, 2018**\n", "\n", "# Introduction\n", "We will start by considering some numerical techniques for solving nonlinear\n", "scalar equations (one equation, one unknown), such as, for example" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "x^3+x^2-3x=3.\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or systems of equations, such as, for example" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\\begin{align*}\n", "xe^y &= 1, \\\\ \n", "-x^2+y &= 1.\n", "\\end{align*}\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**NB!**\n", "* Refer to section 3.1 in *Preliminaries* for some general comments on convergence. \n", "\n", "* There are no examples of numerical calculations done by hand in this note. If you would like some, you can easily generate them yourself. Take one of the computer exercises, do your calculations by hand, if needed modify the code so that you get the output you want, run the code, and compare with the results you got by your pencil and paper calculation.\n", "\n", "# Scalar equations\n", "In this section we discuss the solution of scalar equations. The techniques we\n", "will use are known from previous courses. When they are repeated here, it is\n", "because the techniques used to develope and analyse these methods can, at least \n", "in principle be extended to systems of equations. We will also emphasise the error analysis of the methods. \n", "\n", "A scalar equation is given by" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "
\n", "\n", "$$\n", "\\begin{equation}\n", "f(x) = 0,\n", "\\label{_auto1} \\tag{1}\n", "\\end{equation}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "where $f$ is a continuous function defined on some interval $[a,b]$. A solution $r$ of the equation is called a *zero* or a *root* of $f$. A nonlinear equation may have one, more than one or none roots. \n", "\n", "\n", "**Example 1:**\n", "Given the function $f(x)=x^3+x^2-3x-3$ on the interval $[-2,2]$.\n", "Plot the function on the interval, and locate possible roots." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "from numpy import *\n", "from numpy.linalg import solve, norm # Solve linear systems and compute norms\n", "from matplotlib.pyplot import *\n", "newparams = {'figure.figsize': (8.0, 4.0), 'axes.grid': True,\n", " 'lines.markersize': 8, 'lines.linewidth': 2,\n", " 'font.size': 14}\n", "rcParams.update(newparams)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Example 1\n", "def f(x): # Define the function\n", " return x**3+x**2-3*x-3\n", "\n", "# Plot the function on an interval \n", "x = linspace(-2, 2, 101) # x-values (evaluation points)\n", "plot(x, f(x)) # Plot the function\n", "plot(x, 0*x, 'r') # Plot the x-axis\n", "xlabel('x')\n", "grid(True)\n", "ylabel('f(x)');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "According to the plot, the function $f$ has three real roots in the interval\n", "$[-2, 2]$. \n", "\n", "The function can be rewritten as" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "f(x) = x^3+x^2-3x-3 = (x+1)(x^2-3).\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Thus the roots of $f$ are $-1$ and $\\pm \\sqrt{3}$. We will use this example as\n", "a test problem later on. \n", "\n", "## Existence and uniqueness of solutions\n", "The following theorem is well known:\n", "\n", "**Theorem 1: Existence and uniqueness of a solution.**\n", "\n", "* If $f\\in C[a,b]$ with $f(a)$ and $f(b)$ of opposite sign, there exist at least one $r\\in (a,b)$ such that $f(r)=0$. \n", "\n", "* The solution is unique if $f\\in C^1[a,b]$ and $f'(x)>0$ or $f'(x)<0$ for all $x\\in (a,b)$.\n", "\n", "\n", "\n", "The first condition guarantees that the graph of $f$ will pass the $x$-axis at some point $r$, the second guarantees that the function is either strictly increasing or strictly decreasing. \n", "\n", "## Bisection method\n", "The first part of the theorem can be used to construct the first, quite intuitive algorithm for solving scalar equations. Given an interval, check if $f$ has a root in the interval, divide it in two, check in which of the two halfs the root is, and continue until a root is located with sufficient accuracy. \n", "\n", "**Bisection method.**\n", "\n", " * Given the function $f$ and the interval $[a,b]$, such that $f(a)\\cdot f(b)<0$.\n", "\n", " * Set $a_0=a$, $b_0=b$.\n", "\n", " * For $k=0,1,2,3,.....$\n", "\n", " * $\\displaystyle c_k =\\frac{a_k+b_k}{2}$\n", "\n", " * $\\displaystyle [a_{k+1},b_{k+1}] = \\begin{cases} [a_k,c_k] & \\text{if } f(a_k)\\cdot f(c_k) \\leq 0 \\\\ [c_k,b_k] & \\text{if } f(b_k)\\cdot f(c_k) \\leq 0 \\end{cases}$\n", "\n", "\n", "\n", "Use $c_k$ as the approximation to the root. Since $c_k$ is the midpoint of the interval $[a_k,b_k]$, the error satisfies $|c_k-r|\\leq (b_k-a_k)/2$. The loop is terminated when $(b_k-a_k)/2$ is smaller than some user specified tolerance. We will of course also terminate the loop if $f(c_k)$ is very close to 0. \n", "\n", "### Implementation\n", "\n", "The algorithm is implemented in the function bisection()." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def bisection(f, a, b, tol=1.e-6, max_iter = 100):\n", " # Solve the scalar equation f(x)=0 by bisection \n", " # The result of each iteration is printed\n", " # Input:\n", " # f: The function. \n", " # a, b: Interval: \n", " # tol : Tolerance\n", " # max_iter: Maximum number of iterations\n", " # Output:\n", " # the root and the number of iterations.\n", "\n", " fa = f(a)\n", " fb = f(b)\n", " if fa*fb > 0:\n", " print('Error: f(a)*f(b)>0, there may be no root in the interval.')\n", " return 0, 0\n", " for k in range(max_iter):\n", " c = 0.5*(a+b) # The midpoint\n", " fc = f(c) \n", " print('k ={:3d}, a = {:.6f}, b = {:.6f}, c = {:10.6f}, f(c) = {:10.3e}'\n", " .format(k, a, b, c, fc)) \n", " if abs(f(c)) < 1.e-14 or (b-a) < 2*tol: # The zero is found!\n", " break \n", " elif fa*fc < 0: \n", " b = c # There is a root in [a, c]\n", " else:\n", " a = c # There is a root in [c, b] \n", " return c, k" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Example 2:**\n", "Use the code above to find the root of $f(x)=x^3+x^2-3x-3$ in the interval $[1.5,2]$. Use $10^{-6}$ as the tolerance." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Example 2\n", "def f(x): # Define the function\n", " return x**3+x**2-3*x-3\n", "a, b = 1.5, 2 # The interval\n", "c, nit = bisection(f, a, b, tol=1.e-6) # Apply the bisecetion method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Control that the numerical result is within the error tolerance:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "r = sqrt(3) \n", "print('The error: |c_k-r|={:.2e}'.format(abs(r-c)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exercises:**\n", "1. Choose some appropriate intervals and find the two other roots of $f$. \n", "\n", "2. Compute the solution(s) of $x^2+\\sin(x)-0.5=0$ by the bisection method.\n", "\n", "3. Given a root in the interval $[1.5, 2]$. How many iterations are required to guarantee that the error is less than $10^{-4}$. \n", "\n", "## Fixed point iterations\n", "The bisection method is very robust, but not particular fast. We will now discuss a major class of iteration schemes, e.g. the so-called fixed point iterations. The idea is: \n", "\n", "* Given a scalar equation $f(x)=0$ with a root $r$. \n", "\n", "* Rewrite the equation in the *fixed point form* $x=g(x)$ such that the root $r$ of $f$ is a *fixed point* of $g$, that is, $r$ satisfies $r=g(r)$. \n", "\n", "The fixed point iterations are then given by\n", "\n", "**Fixed point iterations.**\n", "\n", " * Given $g$ and a starting value $x_0$. \n", "\n", " * For $k=0,1,2,3,\\dotsc$\n", "\n", " * $x_{k+1}=g(x_k)$\n", "\n", "\n", "\n", "\n", "### Implementation\n", "\n", "The fixed point scheme is implemented in the function fixpoint. \n", "The iterations are terminated when either the error estimate $|x_{k+1}-x_k|$ is less than a given tolerance tol, or the number of iterations reaches some maximum number max_iter." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def fixpoint(g, x0, tol=1.e-8, max_iter=30):\n", " # Solve x=g(x) by fixed point iterations\n", " # The output of each iteration is printed\n", " # Input:\n", " # g: The function g(x)\n", " # x0: Initial values\n", " # tol: The tolerance\n", " # Output:\n", " # The root and the number of iterations\n", " x = x0\n", " print('k ={:3d}, x = {:14.10f}'.format(0, x)) \n", " for k in range(max_iter): \n", " x_old = x # Store old values for error estimation \n", " x = g(x) # The iteration\n", " err = abs(x-x_old) # Error estimate\n", " print('k ={:3d}, x = {:14.10f}'.format(k+1, x))\n", " if err < tol: # The solution is accepted \n", " break\n", " return x, k+1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Example 3:**\n", "The equation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "x^3+x^2-3x-3=0 \\qquad \\text{can be rewritten as} \\qquad x=\\frac{x^3+x^2-3}{3}.\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The fixed points are the intersections of the two graphs $y=x$ and $y=\\frac{x^3+x^2-3}{3}$, as can be demonstrated by the following script:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Example 3\n", "x = linspace(-2, 2, 101)\n", "plot(x, (x**3+x**2-3)/3,'b', x, x, '--r' )\n", "axis([-2, 3, -2, 2])\n", "xlabel('x')\n", "grid('True')\n", "legend(['g(x)','x']);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We observe that the fixed points of $g$ are the same as the zeros of $f$. \n", "\n", "Apply fixed point iterations on $g(x)$. Aim for the fixed point between 1 and 2, so choose $x_0=1.5$.\n", "Do the iterations converge to the root $r=\\sqrt{3}$?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Solve the equation from example 3 by fixed point iterations.\n", "\n", "# Define the function\n", "def g(x): \n", " return (x**3+x**2-3)/3\n", "\n", "# Apply the fixed point scheme\n", "x, nit = fixpoint(g, x0=1.5, tol=1.e-6, max_iter=30)\n", "\n", "print('\\n\\nResultat:\\nx = {}, antall iterasjoner={}'.format(x, nit))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exercises:**\n", "Repeat the experiment with the following reformulations of $f(x)=0$:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\\begin{align*}\n", " x &= g_2(x) = \\frac{-x^2+3x+3}{x^2}, \\\\ \n", " x &= g_3(x) = \\sqrt{3+3x-x^2}, \\\\ \n", " x &= g_4(x) = \\sqrt{\\frac{3+3x-x^2}{x}}\n", " \\end{align*}\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use $x_0=1.5$ in your experiments, you may well experiment with other values as well. \n", "\n", "## Theory\n", "Let us first state some existence and uniqueness results. Apply Theorem 1 in this note on the equation $f(x) = x-g(x)=0$. The following is then given (the details are left to the reader):\n", "* If $g\\in C[a,b]$ and $a\n", " \n", "\n", "$$\n", "\\begin{equation}\n", "|e_{k+1}| = |r-x_{k+1}| = |g(r)-g(x_k)| = |g'(\\xi_k)|\\cdot|r-x_k| = |g'(\\xi_k)|\\cdot |e_k|. \n", "\\label{_auto2} \\tag{2}\n", "\\end{equation}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here$\\xi_k$is some unknown value between$x_k$(known) and$r$(unknown). We can now use the assumptions from the existence and uniqueness result. \n", "* The condition$g([a,b])\\subset(a,b)$guarantees that if$x_0\\in [a,b]$then$x_k \\in (a,b)$for$k=1,2,3,\\dotsc$. \n", "\n", "* The condition$g'(x)\\leq L < 1$guarantees convergence towards the unique fixed point$r$, since" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "|e_{k+1}| \\leq L \\,|e_k| \\qquad \\Rightarrow \\qquad |e_k| \\leq L^k \\, |e_0| \\rightarrow 0\\quad \\text{as k\\rightarrow \\infty},\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and$L^k \\rightarrow 0$as$k\\rightarrow\\infty$when$L<1$. \n", "\n", "In summary we have\n", "\n", "**The fixed point theorem.**\n", "\n", "If there is an interval$[a,b]$such that$g\\in C^1[a,b]$,$g([a,b])\\subset (a,b)$and there exist a positive constant$L<1$such that$|g'(x)|\\leq L <1$for all$x\\in[a,b]$then\n", "*$g$has a unique fixed point$r$in$(a,b)$. \n", "\n", "* The fixed point iterations$x_{k+1}=g(x_k)$converges towards$r$for all starting values$x_0 \\in [a,b]$.\n", "\n", "\n", "\n", "From the discussion above, we can draw some conclusions: \n", "* The smaller the constant$L$, the faster the convergence. \n", "\n", "* If$|g'(r)|<1$then there will always be a neighbourhood around$r$,$(r-\\delta, r+\\delta)$for some$\\delta$on which all the conditions are satisfied. Meaning that the iterations will always converge if$x_0$is sufficiently close to$r$. \n", "\n", "* If$|g'(r)|>1$the fixed point iterations will never converge towards$r$.\n", "\n", "**Example 3 revisited:**\n", "Use the theory above to analyse the iteration scheme from Example 3, where" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "g(x)=\\frac{x^3+x^2-3}{3}, \\qquad g'(x) = \\frac{3x^2+2x}{3}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is clear that$g$is differentiable. We already know that$g$has three\n", "fixed points,$r=\\pm\\sqrt{3}$and$r=-1$. For the first two, we have that\n", "$g'(\\sqrt{3}) = 3+\\frac{2}{3}\\sqrt{3} = 4.15$and\n", "$g'(-\\sqrt{3})=3-\\frac{2}{3}\\sqrt{3}=1.85$, so the fixed point iterations will\n", "never converges towards those roots. But$g'(-1)=1/3$, so we get convergence towards\n", "this root, given sufficiently good starting values. The figure below demonstrates\n", "that the assumptions of the theorem are satisfied in some region around$x=-1$, for example\n", "$[-1.3, -0.7]$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rcParams['figure.figsize'] = 10, 5 # Resize the figure for a nicer plot" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Demonstrate the assumptions of the fixed point theorem\n", "\n", "def g(x): # The function g(x)\n", " return (x**3+x**2-3)/3\n", "\n", "def dg(x): # The derivative g'(x)\n", " return (3*x**2+2*x)/3\n", "\n", "a, b = -1.3, -0.7 # The interval [a,b] around r.\n", "x = linspace(a, b, 101) # x-values on [a,b]\n", "y_is_1 = ones(101) # For plotting the for g' \n", "\n", "\n", "rcParams['figure.figsize'] = 10, 5 # Resize the figure\n", "# Plot x and g(x) around the$r=-1$.\n", "subplot(1,2,1)\n", "plot(x, g(x), x, x, 'r--')\n", "xlabel('x')\n", "axis([a,b,a,b])\n", "grid(True)\n", "legend(['g(x)','x'])\n", "\n", "# Plot g'(x), and the limits -1 and +1\n", "subplot(1,2,2)\n", "plot(x, dg(x), x, y_is_1, 'r--', x, -y_is_1, 'r--');\n", "xlabel('x')\n", "grid(True)\n", "title('dg/dx');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The plot to the left demonstrates the assumption$g([a,b])\\subset(a,b)$, as the graph$y=g(x)$\n", "enters at the left boundary and exits at the right and does not leave the region$[a,b]$anywhere in between. \n", "The plot to the right shows that$|g'(x)|<= g'(a) = L < 1$is satisfied in the interval. \n", "\n", "**Exercises:**\n", "1. See how far the interval$[a,b]can be stretched, still with convergence guaranteed. \n", "\n", "2. Do a similar analysis on the three other iteration schemes suggested above, and confirm the results numerically. The schemes are:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\\begin{align*}\n", " x = g_2(x) = \\frac{-x^2+3x+3}{x^2}, \\\\ \n", " x = g_3(x) = \\sqrt{3+3x-x^2}, \\\\ \n", " x = g_4(x) = \\sqrt{\\frac{3+3x-x^2}{x}}\n", "\\end{align*}\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Insert your code here." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Newton's method\n", "Based on the previous discussion, it is clear that fast convergence can be achieved ifg'(r)$is as small as possible, preferable$g'(r)=0$. Can this be achieved for a general problem? \n", "\n", "We have that$x_k=r-e_k$where$e_k$is the error in iteration$k$. Do a Taylor expansion (*Preliminaries*, section 4) of$g(x_k)$around$r$:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", " \n", "\n", "$$\n", "\\begin{equation}\n", " e_{k+1} = r-x_{k+1} = g(r)-g(x_k) = g(r)-g(r-e_k) = -g'(r)e_k + \\frac{1}{2}g''(\\xi_k)e_k^2\n", "\\label{_auto3} \\tag{3}\n", "\\end{equation}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If$g'(r)=0$and there is a constant$M$such that$|g''(x)|/2 \\leq M$,\n", "then" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "|e_{k+1}| \\leq M |e_k|^2\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and *quadratic convergence* has been achieved, see *Preliminaries*, section 3.1 on \"Convergence of an iterative process\". \n", "\n", "*Question*: Given the equation$f(x)=0$with an unknown solution$r$. Can we find a$g$with$r$as a fixed point, satisfiying$g'(r)=0$?\n", "\n", "*Idea*: Let" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "g(x)=x-h(x)f(x).\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "for some function$h(x)$. Independent of the choice of$h(x)$,$r$will be a fixed point of$g$. \n", "Choose$h(x)$such that$g'(r)=0$, that is" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "g'(x) = 1 - h'(x)f(x) - h(x)f'(x) \\qquad \\Rightarrow \\qquad\n", "g'(r) = 1 - h(r)f'(r)\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By choosing$h(x)=1/f'(x)$the aim$g'(r)=0$is achieved. The result is \n", "\n", "**Newton's method.**\n", "\n", " * Given$f$and a starting value$x_0$. \n", "\n", " * For$k=0,1,2,3,\\dotsc$\n", "\n", " *$\\displaystyle x_{k+1} = x_k - \\frac{f(x_k)}{f'(x_k)}$\n", "\n", "\n", "\n", "### Implementation\n", "\n", "The method is implemented in the function newton(). The iterations are terminated when \n", "$|f(x_k)|$is less than a given tolerance." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def newton(f, df, x0, tol=1.e-8, max_iter=30):\n", " # Solve f(x)=0 by Newtons method\n", " # The output of each iteration is printed\n", " # Input:\n", " # f, df: The function f and its derivate f'.\n", " # x0: Initial values\n", " # tol: The tolerance\n", " # Output:\n", " # The root and the number of iterations\n", " x = x0\n", " print('k ={:3d}, x = {:18.15f}, f(x) = {:10.3e}'.format(0, x, f(x)))\n", " for k in range(max_iter):\n", " fx = f(x)\n", " if abs(fx) < tol: # Accept the solution \n", " break \n", " x = x - fx/df(x) # Newton-iteration\n", " print('k ={:3d}, x = {:18.15f}, f(x) = {:10.3e}'.format(k+1, x, f(x)))\n", " return x, k+1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Example 4:**\n", "Solve$f(x)=x^3+x^2-3x-3=0$by Newton's method. Choose$x_0=1.5$. \n", "The derivative of$f$is$f'(x)=3x^2+2x-3$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Example 4\n", "def f(x): # The function f\n", " return x**3+x**2-3*x-3\n", "\n", "def df(x): # The derivative f'\n", " return 3*x**2+2*x-3\n", "\n", "x0 = 1 # Starting value\n", "x, nit = newton(f, df, x0, tol=1.e-14, max_iter=30) # Apply Newton\n", "print('\\n\\nResult:\\nx={}, number of iterations={}'.format(x, nit))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Error analysis\n", "\n", "The method was constructed to give quadratic convergence, that is" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "|e_{k+1}| \\leq M |e_k|^2,\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "where$e_k=r-x_k$is the error after$k$iterations. But under which conditions is this true, and can we say something about the size of the constant$M$?\n", "\n", "By a Taylor expansion of$f(r)$around$x_k, we get:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\\begin{align*}\n", " 0 = f(r) &= f(x_k) + f'(x_k)(r-x_k) + \\frac{1}{2}f''(\\xi_k)(r-x_k)^2\n", "&& \\text{(Taylor series)} \\\\ \n", " 0 &= f(x_k) + f'(x_k)(x_{k+1}-x_k) && \\text{(Newton's method)}\n", "\\end{align*}\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "where\\xi_k$is between$r$and$x_k$.\n", "\n", "Subtract the two equations to get" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "f'(x_k)(r-x_{k+1}) + \\frac{1}{2}f''(\\xi_{k})(r-x_k)^2 = 0\n", "\\qquad \\Rightarrow \\qquad\n", "e_{k+1} = - \\frac{1}{2} \\frac{f''(\\xi_k)}{f'(x_k)} e_k^2\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So we obtain quadratic convergence if$f$is twice differentiable around$r$,$f'(x_k)\\not=0$, and$x_0$is chosen sufficiently close to$r$. More precise:\n", "\n", "**Theorem: Convergence of Newton iterations.**\n", "\n", "Assume that the function$f$has a root$r$, and let$I_\\delta=[r-\\delta, r+\\delta]$for some$\\delta > 0$. Assume further that\n", "*$f\\in C^2(I_\\delta)$. \n", "\n", "* There is a positive constant$M$such that \n", "\n", " *$\\displaystyle \\left| \\frac{f''(y)}{f'(x)} \\right| \\leq 2M, \\qquad \\text{for all $x,y \\in I_{\\delta}$}$. \n", "\n", "\n", "In this case, Newton's iterations converge quadratically," ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "|e_{k+1}| \\leq M |e_k|^2\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "for all starting values satisfying$|x_0-r| \\leq \\min\\{1/M, \\delta\\}$.\n", "\n", "\n", "\n", "**Exercises:**\n", "1. Repeat Example 4 using different starting values$x_0$. Find the two other roots. \n", "\n", "2. Verify quadratic convergence numerically. How to do so is explained in *Preliminaries*, section 3.1.\n", "\n", "3. Solve the equation$x(1-\\cos(x))=0$, both by the bisection method and by Newton's method with$x_0=1. Comment on the result. \n", "\n", "# System of nonlinear equations\n", "In this section we will discuss how to solve systems of non-linear equations, given by" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\\begin{align*}\n", " f_1(x_1, x_2, \\dotsc, x_n) &= 0 \\\\ \n", " f_2(x_1, x_2, \\dotsc, x_n) &= 0 \\\\ \n", " & \\vdots \\\\ \n", " f_n(x_1, x_2, \\dotsc, x_n) &= 0 \\\\ \n", "\\end{align*}\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or short by" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "\\mathbf{f}(\\mathbf{x}) = \\mathbf{0}.\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "where\\mathbf{f}:\\mathbb{R}^n \\rightarrow \\mathbb{R}^n.\n", "\n", "\n", "**Example 5:**\n", "Given the two equations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\\begin{align*}\n", "x_1^3-x_2 + \\frac{1}{4} &= 0 \\\\ \n", "x_1^2+x_2^2 - 1 &= 0 \\nonumber\n", "\\end{align*}\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This can be illustrated by\n", "rcParams['figure.figsize'] = 6, 6 # Resize the figure for a nicer plot" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Example 5 with graphs\n", "x = linspace(-1.0, 1.0, 101)\n", "plot(x, x**3+0.25);\n", "t = linspace(0,2*pi,101)\n", "plot(cos(t), sin(t)); \n", "axis('equal');\n", "xlabel('x1')\n", "ylabel('x2')\n", "grid(True)\n", "legend(['x1^3-x2+1/4=0','x1^2+x2^2-1=0']);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The solutions of the two equations are the intersections of the two graphs. So there are two solutions, one in the first and one in the third quadrant. \n", "\n", "We will use this as a test example in the sequel.\n", "\n", "## Newton's method for systems of equations\n", "The idea of fixed point iterations can be extended to systems of equations. But it is in general hard to find convergent schemes. So we will concentrate on the extension of Newton's method to systems of equations. And for the sake of illustration, we only discuss systems of two equations and two unknowns written as" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\\begin{align*}\n", "f(x, y) &= 0 \\\\ \n", "g(x, y) &= 0\n", "\\end{align*}\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "to avoid getting completely lost in indices. \n", "\n", "Let\\mathbf{r}= [r_x, r_y]^T$be a solution to these equations and some$\\hat{\\mathbf{x}}=[\\hat{x},\\hat{y}]^T$a known approximation to$\\mathbf{r}$. We search for a better approximation. This can be done by replacing the nonlinear equation$\\mathbf{f}(\\mathbf{x})=\\mathbf{0}$by its linear approximation. Which can be found by a multidimensional Taylor expansion around$\\hat{\\mathbf{x}}." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\\begin{align*}\n", " f(x, y) &= f(\\hat{x}, \\hat{y})\n", " + \\frac{\\partial f}{\\partial x}(\\hat{x}, \\hat{y})(x - \\hat{x})\n", " + \\frac{\\partial f}{\\partial y}(\\hat{x}, \\hat{y})(y - \\hat{y}) + \\dotsc \\\\ \n", " g(x,y) &= g(\\hat{x}, \\hat{y})\n", " + \\frac{\\partial g}{\\partial x}(\\hat{x}, \\hat{y})(x - \\hat{x})\n", " + \\frac{\\partial g}{\\partial y}(\\hat{x}, \\hat{y})(y - \\hat{y}) + \\dotsc \n", "\\end{align*}\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "where the\\dotsc$represent higher order terms, which are small if$\\hat{\\mathbf{x}}\\approx \\mathbf{x}$. \n", "By ignoring these terms we get a *linear approximation* to$\\mathbf{f}(\\mathbf{x}), and rather than solving the nonlinear original system, we can solve its linear approximation:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\\begin{align*}\n", " f(\\hat{x}, \\hat{y})\n", " + \\frac{\\partial f}{\\partial x}(\\hat{x}, \\hat{y})(x - \\hat{x})\n", " + \\frac{\\partial f}{\\partial y}(\\hat{x}, \\hat{y})(y - \\hat{y}) &= 0\\\\ \n", " g(\\hat{x}, \\hat{y})\n", " + \\frac{\\partial g}{\\partial x}(\\hat{x}, \\hat{y})(x - \\hat{x})\n", " + \\frac{\\partial g}{\\partial y}(\\hat{x}, \\hat{y})(y - \\hat{y}) &= 0\n", "\\end{align*}\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or more compact" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "\\mathbf{f}(\\hat{\\mathbf{x}}) + J(\\hat{\\mathbf{x}})(\\mathbf{x}-\\hat{\\mathbf{x}}) = 0,\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "where the JacobianJ(\\mathbf{x})$is given by" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "J(\\mathbf{x}) =\\left(\\begin{array}{cc}\n", "\\frac{\\partial f}{\\partial x}(x, y) & \\frac{\\partial f}{\\partial y}(x, y) \\\\ \n", "\\frac{\\partial g}{\\partial x}(x, y) & \\frac{\\partial g}{\\partial y}(x, y)\n", "\\end{array} \\right)\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is to be hoped that the solution of the linear equation$\\mathbf{x}$provides a better approximation to$\\mathbf{r}$than our initial guess$\\hat{\\mathbf{x}}$, so the process can be repeated, resulting in\n", "\n", "**Newton's method for system of equations.**\n", "\n", "* Given a function$\\mathbf{f}(\\mathbf{x})$, its Jacobian$J(\\mathbf{x})$and a starting value$\\mathbf{x}_0$. \n", "\n", "* For$k=0,1,2,3,\\dotsc$\n", "\n", " * Solve the system$J(\\mathbf{x}_k)\\Delta_k = - \\mathbf{f}(\\mathbf{x}_k)$. \n", "\n", " * Let$\\mathbf{x}_{k+1} = \\mathbf{x}_k + \\Delta_k$.\n", "\n", " \n", "\n", "The strategy can be generalized to systems of$n$equations in$n$unknowns, in which case the Jacobian is given by:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "J(\\mathbf{x}) = \\left(\\begin{array}{cccc}\n", "\\frac{\\partial f_1}{\\partial x_1}(\\mathbf{x}) &\n", "\\frac{\\partial f_1}{\\partial x_2}(\\mathbf{x}) & \\dotsm &\n", "\\frac{\\partial f_1}{\\partial x_n}(\\mathbf{x}) \\\\ \n", "\\frac{\\partial f_2}{\\partial x_1}(\\mathbf{x}) &\n", "\\frac{\\partial f_2}{\\partial x_2}(\\mathbf{x}) & \\dotsm &\n", "\\frac{\\partial f_2}{\\partial x_n}(\\mathbf{x}) \\\\ \n", "\\vdots & \\vdots & & \\vdots \\\\ \n", "\\frac{\\partial f_n}{\\partial x_1}(\\mathbf{x}) &\n", "\\frac{\\partial f_n}{\\partial x_2}(\\mathbf{x}) & \\dotsm &\n", "\\frac{\\partial f_n}{\\partial x_n}(\\mathbf{x})\n", "\\end{array} \\right)\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Implementation\n", "\n", "Newton's method for system of equations is implemented in the function newton_system. The numerical solution is accepted when all components of$\\mathbf{f}(\\mathbf{x}_k)$are smaller than a tolerance in absolute value, that means when$\\|\\mathbf{f}(\\mathbf{x}_k)\\|_{\\infty} < $tol. See *Preliminaries*, section 1 for a description of norms." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "set_printoptions(precision=15) # Output with high accuracy" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def newton_system(f, jac, x0, tol = 1.e-10, max_iter=20):\n", " x = x0\n", " print('k ={:3d}, x = '.format(0), x)\n", " for k in range(max_iter):\n", " fx = f(x)\n", " if norm(fx, inf) < tol: # The solution is accepted. \n", " break\n", " Jx = jac(x)\n", " delta = solve(Jx, -fx) \n", " x = x + delta \n", " print('k ={:3d}, x = '.format(k+1), x)\n", " return x, k" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Example 6:**\n", "Solve the equations from Example 5\n", "by Newton's method. The vector valued function$\\mathbf{f}$and the Jacobian$J$are in this case" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "\\mathbf{f}(\\mathbf{x}) =\n", "\\left(\\begin{array}{c}\n", "x_1^3-x_2 + \\frac{1}{4} \\\\ \n", "x_1^2+x_2^2 - 1\n", "\\end{array} \\right) \\qquad \\text{and} \\qquad\n", "J(\\mathbf{x}) =\n", "\\left( \\begin{array}{cc}\n", "3x_1^2 & -1 \\\\ 2x_1 & 2x_2\n", "\\end{array} \\right)\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We already know that the system has 2 solutions, one in the first and one in the third quadrant. To find the first one, choose$\\mathbf{x}_0=[1,1]^T$as starting value." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Example 6\n", "\n", "# The vector valued function. Notice the indexing. \n", "def f(x): \n", " y = array([x**3-x+0.25, \n", " x**2+x**2-1])\n", " return y\n", "\n", "# The Jacobian\n", "def jac(x):\n", " J = array([[3*x**2, -1],\n", " [2*x, 2*x]])\n", " return J\n", "\n", "x0 = array([1.0, 1.0]) # Starting values\n", "max_iter = 20\n", "x, nit = newton_system(f, jac, x0, tol = 1.e-12, max_iter = max_iter) # Apply Newton's method\n", " \n", "print('\\nTest: f(x)={}'.format(f(x)))\n", "if nit == max_iter:\n", " printf('Warning: Convergence har not been achieved')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exercises:**\n", "1. Search for the solution of Example 5 in the third quadrant by changing the initial values. \n", "\n", "2. Apply Newton's method to the system" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "\\begin{array}{rl} x e^y &= 1 \\\\ -x^2 +y &= 1 \\end{array}, \\qquad \\text{using x_0=y_0=0}.\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Final remarks\n", "\n", "A complete error and convergence analysis of Newton's method for systems is far from trivial, and outside the scope of this course. But in summary: \n", "If$\\mathbf{f}$is sufficiently differentiable, and there is a solution$\\mathbf{r}$of the system$\\mathbf{f}(\\mathbf{x})=0$and with$J(\\mathbf{r})$nonsingular, then the Newton iterations will converge quadratically towards$\\mathbf{r}$for all$\\mathbf{x}_0$sufficiently close to$\\mathbf{r}$. \n", "\n", "Finding solutions of nonlinear equations is difficult. Even if the Newton iterations in principle will converge, it can be very hard to find sufficient good starting values. Nonlinear equations can have none or many solutions. Even when there are solutions, there is no guarantee that the solution you find is the one you want. \n", "\n", "If$n\$ is large, each iteration is computationally expensive since the Jacobian is evaluated in each iteration. In practice, some modified and more efficient version of Newton's method will be used, maybe together with more robust but slow algorithms for finding sufficiently good starting values." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 2 }