diff --git a/documentation/source/science_guide/tangent_linear_section/index.rst b/documentation/source/science_guide/tangent_linear_section/index.rst
new file mode 100644
index 000000000..94bbf6bcd
--- /dev/null
+++ b/documentation/source/science_guide/tangent_linear_section/index.rst
@@ -0,0 +1,23 @@
+.. -----------------------------------------------------------------------------
+    (c) Crown copyright Met Office. All rights reserved.
+    The file LICENCE, distributed with this code, contains details of the terms
+    under which the code may be used.
+   -----------------------------------------------------------------------------
+
+.. _tangent_linear_section_index:
+
+Linear Model Section
+=======================
+
+This describes the scientific aspects of the tangent-linear model code for GungHo.
+
+.. toctree::
+    :maxdepth: 2
+      
+    tangent_linear_overview
+    tangent_linear_coding
+    tangent_linear_timestepping
+    tangent_linear_components
+    tangent_linear_mol_transport
+    tangent_linear_ffsl_transport
+    tangent_linear_tests
diff --git a/documentation/source/science_guide/tangent_linear_section/tangent_linear_coding.rst b/documentation/source/science_guide/tangent_linear_section/tangent_linear_coding.rst
new file mode 100644
index 000000000..0b81cc5ed
--- /dev/null
+++ b/documentation/source/science_guide/tangent_linear_section/tangent_linear_coding.rst
@@ -0,0 +1,102 @@
+.. _tangent_linear_coding:
+
+General principles
+=====================
+
+To obtain the tangent linear model, we need to differentiate the
+nonlinear model. There are two main ways to achieve this: differentiate the nonlinear continuous equations and then discretize, or differentiate the nonlinear discretized equations.
+
+Mainly the approach of differentiating the discretized equations is used here - the so-called line-by-line approach, as each line of code can be treated individually. However, some parts such as the semi-implicit solve take the approach of differentiating the nonlinear equations and then discretizing (although in fact as the solver is already linear, in practice this just means reusing the nonlinear code).
+
+The following examples demonstrate how the line-by-line approach is implemented, to differentiate the discretized equations.
+
+First derivative
+----------------
+
+For the most simple example, if the nonlinear model is :math:`y = f(x)`,
+then the tangent linear model is
+
+.. math:: \delta y = \frac{df}{dx} \delta x
+
+For example, if the nonlinear model code is
+
+::
+
+     y = x**2
+
+then :math:`f(x)=x^2`, and the first derivative is :math:`f'(x) = 2x`.
+So the linear model code is
+
+::
+
+     y = 2 * ls_x * x
+
+where now ``y``, and ``x`` are the perturbations, and ``ls_x`` is the
+linearisation state.
+
+Note that we could rewrite this in other variable names as e.g.
+``delta_y = 2 * x * delta_x``, but in the LFRic code we have chosen to
+keep the variable names for the perturbations identical to the nonlinear
+model variable names, and to introduce ``ls_`` to indicate the
+linearisation state. This linearisation state is produced from a
+separate nonlinear model run.
+
+The chain rule
+--------------
+
+For a more complicated expression, then the chain rule of
+differentiation is required
+
+.. math::
+
+   \begin{aligned} 
+     y & = f(g(x)) \\
+     \delta y & = \frac{df}{dx} \delta x \\
+              & = \frac{df}{dg} \frac{dg}{dx} \delta x
+     \end{aligned}
+
+For example, if the nonlinear model code is
+
+::
+
+     y = (sin v)**2
+
+so
+
+.. math::
+
+   \begin{aligned}
+   f(g) & = g^2 \\
+   g(x) & = sin(x)\\
+   f'(g) & = 2 g = 2 sin(x) \\ 
+   g'(x) & = cos(x)
+   \end{aligned}
+
+then the linear model code is
+
+::
+
+     y = 2 * sin(ls_v) * cos(ls_v) * v
+
+Multivariable calculus
+----------------------
+
+When dealing with more than one variable, then the rules need to be
+extended to multi-variable calculus.
+
+If the nonlinear model is
+
+.. math:: y = f(x,z)
+
+the linear model is
+
+.. math:: \delta y = \frac{\partial f}{\partial x} \delta x + \frac{\partial f}{\partial z} \delta z
+
+and the multivariate chain rule gives
+
+.. math::
+
+   \begin{aligned} 
+     y & = f(g_1(x_1), ..., g_n(x_n)) \\
+     \delta y & = \sum_i \frac{\partial f}{\partial g_i} \frac{d g_i}{d x_i} \delta x_i
+     \end{aligned}
diff --git a/documentation/source/science_guide/tangent_linear_section/tangent_linear_components.rst b/documentation/source/science_guide/tangent_linear_section/tangent_linear_components.rst
new file mode 100644
index 000000000..0d2c924a7
--- /dev/null
+++ b/documentation/source/science_guide/tangent_linear_section/tangent_linear_components.rst
@@ -0,0 +1,165 @@
+.. _tangent_linear_components:
+
+Right-hand-side Components
+=====================
+
+Kinetic energy gradient
+-----------------------
+
+The nonlinear model is
+
+.. math::
+
+   \begin{aligned}
+     f & = \nabla (\frac{1}{2} u^2) \\
+       & = g(z(u)) \\
+     \text{where} \quad z(u) & = \frac{1}{2} u^2 \\
+                        g(z) & = \nabla(z)
+     \end{aligned}
+
+The linear model is,
+
+.. math::
+
+   \begin{aligned}
+     \delta f & = \frac{d g}{d z} \frac{ dz}{du} \delta u\\
+              & = \nabla (u \delta u)
+     \end{aligned}
+
+Note that an alternative way to derive this is to form the difference
+between the perturbed and unperturbed, and remove the small terms
+
+.. math::
+
+   \begin{aligned}
+    \delta f & = f(\bar{u}+ \delta u) - f(\bar{u}) \\
+       & = \frac{1}{2} \nabla (\bar{u} + \delta u)^2 - \frac{1}{2} \nabla \bar{u}^2 \\
+       & = \frac{1}{2} \nabla (\bar{u}^2 + 2\bar{u} \delta u + \delta u^2) - \frac{1}{2} \nabla \bar{u}^2 \\
+       & \approx \nabla (\bar{u} \delta u)
+     \end{aligned}
+
+Potential temperature
+---------------------
+
+The nonlinear model is
+
+.. math:: \theta_e  = \theta (m_g/m_t)
+
+The linear model is,
+
+.. math::
+
+   \begin{aligned}
+     \delta \theta_e & = \frac{\partial \theta_e}{\partial \theta} \delta \theta
+                       + \frac{\partial \theta_e}{\partial m_g} \delta m_g
+                       + \frac{\partial \theta_e}{\partial m_t} \delta m_t \\
+                     & = \frac{m_g}{m_t} \delta \theta
+                       + \frac{\theta}{m_t} \delta m_g
+                       -  \frac{\theta m_g}{m_t^2} \delta m_t
+      \end{aligned}
+
+using
+
+.. math:: \frac{d (1/m_t) }{dx} = \frac{d (m_t^{-1}) }{dx} = -1 (m_t^{-2}) = - \frac{1}{m_t^2}
+
+This can be simplified, by substituting in the equation for the
+nonlinear model :math:`\theta_e  = \theta (m_g/m_t)`
+
+.. math:: \delta \theta_e =  \theta_e \left( \frac{ \delta \theta}{\theta} + \frac{\delta m_g}{m_g} - \frac{\delta m_t}{m_t} \right)
+
+Logspace
+--------
+
+The nonlinear model is
+
+.. math::
+
+   \begin{aligned}
+     p   & = \Pi_i |\rho_i|^{c_i} \\
+         & = \Pi_i u_i(y_i(z_i(\rho_i))) \\
+         & = \Pi_i u_i \\
+    \text{where} \quad u_i & = y_i ^{c_i} \\
+     y_i & = | \rho_i | \\
+         & = (z_i) ^{1/2} \\
+     z_i & = \rho_i^2
+     \end{aligned}
+
+The derivatives are
+
+.. math::
+
+   \begin{aligned}
+     \frac{ \partial p}{\partial u_i} & =  \Pi_{j \neq i} u_j =  \Pi_{j \neq i} | \rho_j |^{c_j} \\
+     \frac{ \partial u}{\partial y_i} & = c_i y_i ^{c_i -1} = c_i | \rho_i | ^{c_i -1} \\
+     \frac{ \partial y}{\partial z_i} & = \frac{1}{2} z_i^{-1/2} = \frac{1}{2} \frac{1}{| \rho_i | } \\
+     \frac{ \partial z}{\partial \rho_i} & = 2 \rho_i
+     \end{aligned}
+
+Therefore the linear model is,
+
+.. math::
+
+   \begin{aligned}
+     \delta p & =  \sum_i \frac{ \partial p}{\partial u_i}  \frac{ \partial u_i}{\partial y_i}\frac{ \partial y_i}{\partial z_i} \frac{ \partial z_i}{\partial \rho_i} \delta \rho_i \\
+     & = \sum_i \left( \left( \Pi_{j \neq i} | \rho_j |^{c_j} \right) c_i  | \rho_i | ^{c_i -1} \frac{\rho_i}{| \rho_i |} \delta \rho_i \right)
+     \end{aligned}
+
+This can be simplified, by using a rearrangement of the nonlinear model
+
+.. math::
+
+   \begin{aligned}
+     p   & = \Pi_i |\rho_i|^{c_i} \\
+         & = |\rho_i|^{c_i}  \Pi_{j \neq i} | \rho_j|^{c_j} \\
+   \text{so } \quad \Pi_{j \neq i} | \rho_j|^{c_j} &  = \frac{p}{|\rho_i|^{c_i}} \\
+     \end{aligned}
+
+Substituting this into the linear model, gives
+
+.. math::
+
+   \begin{aligned}
+     \delta p & = p \sum_i c_i \frac{\rho_i }{ | \rho_i |^2} \delta \rho_i \\
+              & = p \sum_i \frac{c_i }{ \rho_i} \delta \rho_i \\
+     \end{aligned}
+
+Newton iteration of the Equation of State
+-----------------------------------------
+
+The equation of state is applied iteratively in the semi-implicit
+timestepping. The nonlinear model rhs is
+
+.. math::
+
+   \begin{aligned}
+     N & = 1/E -1
+     \end{aligned}
+
+where
+
+.. math::
+
+   \begin{aligned}
+   E & = \frac{p_0}{R} \frac{\Pi^\gamma}{\theta \rho} \\ 
+   \gamma & = \frac{1-\kappa}{\kappa}
+   \end{aligned}
+
+The derivatives are:
+
+.. math::
+
+   \begin{aligned}
+     \frac{\partial N}{\partial E} & = - E^{-2} \\
+     \frac{\partial E}{\partial \Pi} & = \frac{p_0}{R} \gamma \Pi^{\gamma-1}\theta^{-1} \rho^{-1} = E \gamma \Pi^{-1} \\
+   \frac{\partial E}{\partial \rho} & = - \frac{p_0}{R} \Pi^\gamma \rho^{-2} \theta^{-1} = - E \rho^{-1} \\
+   \frac{\partial E}{\partial \theta} & = - \frac{p_0}{R} \Pi^\gamma \rho^{-1} \theta^{-2} = - E \theta^{-1}
+   \end{aligned}
+
+The linear model is
+
+.. math::
+
+   \begin{aligned}
+   \delta N & = \frac{\partial N}{\partial E} \left( \frac{ \partial E}{\partial \Pi} \delta \Pi + \frac{ \partial E}{\partial \rho} \delta \rho + \frac{ \partial E}{\partial \theta} \delta \theta \right) \\
+    &  = - \frac{1}{E} \left( \frac{1-\kappa}{\kappa} \frac{\delta \Pi}{\Pi} - \frac{\delta \rho}{\rho} - \frac{\delta \theta}{\theta} \right)
+   \end{aligned}
diff --git a/documentation/source/science_guide/tangent_linear_section/tangent_linear_ffsl_transport.rst b/documentation/source/science_guide/tangent_linear_section/tangent_linear_ffsl_transport.rst
new file mode 100644
index 000000000..2589af727
--- /dev/null
+++ b/documentation/source/science_guide/tangent_linear_section/tangent_linear_ffsl_transport.rst
@@ -0,0 +1,166 @@
+.. _tl_ffsl_transport:
+
+FFSL Transport
+================
+
+| The non-linear flux-form semi-Lagrangian (FFSL) transport scheme solves the conservative form transport
+  equation in the form
+
+  .. math:: q^{n+1} = q^{n} - \Delta{t} F(u,q^n)
+
+  where :math:`q` is the variable we are transporting, :math:`u` is the
+  advecting wind, :math:`\Delta{t}` is the time step, and :math:`F` is
+  the divergence of the mass flux
+
+  .. math:: F(u,q) = \frac{1}{V} \nabla \cdot \tilde{f}(u,q)
+
+  where the flux :math:`\tilde{f}`, is computed using the chosen FFSL
+  scheme.
+
+The non-linear scheme solves the advective form transport equation in
+the form
+
+.. math::
+
+       q^{n+1}_{adv} = \frac{q^{n} - \Delta{t} F(u,q^n)}{\sigma - \Delta{t} F(u,\sigma)}
+
+where :math:`\sigma=1` is the unity field, and therefore
+:math:`F(u,\sigma)` is the wind divergence. The unity field plays the
+role of a psuedo-density.
+
+Tangent Linear Transport
+-------------------------
+
+For the linear model we split the variables into a linear state (LS)
+denoted with a bar, and a perturbation denoted with a prime
+
+.. math:: q = \bar{q} + q' \hspace{1cm} u = \bar{u}+u'
+
+The conservative transport equation becomes
+
+.. math:: \frac{\partial \bar{q}}{\partial t} + \frac{\partial q'}{\partial t} + \nabla \cdot( (\bar{u}+u')(\bar{q}+q') ) = 0
+
+The product of perturbations is assumed to be zero, and so multiplying
+out the divergence term gives
+
+.. math:: \frac{\partial \bar{q}}{\partial t} + \frac{\partial q'}{\partial t} + \nabla \cdot( \bar{u}\bar{q} ) + \nabla \cdot( \bar{u}q' ) + \nabla \cdot( u'\bar{q} ) = 0
+
+The linear state conservative transport equation is
+
+.. math:: \frac{\partial \bar{q}}{\partial t} + \nabla \cdot( \bar{u}\bar{q} ) = 0
+
+Leaving the conservative perturbation transport equation as
+
+.. math::
+
+       \frac{\partial q'}{\partial t} + \nabla \cdot( \bar{u}q' ) + \nabla \cdot( u'\bar{q} ) = 0
+
+Flux Computation
+----------------------
+
+The flux computation can be split into two parts.
+
+The first term, corresponding to :math:`\bar{u}q'`, can be computed
+using standard FFSL operators, provided the scheme is linear (i.e. no
+flux limiting). This is the reconstruction of the perturbation field
+with the ls wind, and we denote this :math:`\tilde{f}_1`.
+
+The second term, corresponding to :math:`u'\bar{q}`, uses the
+perturbation departure distance from the ls departure point. This is
+computed as, where the subscript :math:`d` signifies the ls wind
+departure cell,
+
+.. math:: \tilde{f}_2(u',\bar{q}) = u' \bar{q}_d
+
+The constant reconstruction is to ensure that this term is linear in the
+perturbation wind.
+
+We can now use the notation
+
+.. math::
+
+   \begin{aligned}
+       F_1(u,q) & = \frac{1}{V} \nabla \cdot \tilde{f}_1(u,q) \\
+       F_2(u,q) & = \frac{1}{V} \nabla \cdot \tilde{f}_2(u,q)
+   \end{aligned}
+
+Advective Update
+--------------------
+
+If we expand the variables
+into linearisation state and perturbation parts we find the flux
+divergences can be written as
+
+.. math:: F(u,q) = F(\bar{u},\bar{q}) + F_1(\bar{u},q') + F_2(u',\bar{q})
+
+and, as :math:`\sigma = \bar{\sigma}`,
+
+.. math:: F(u,\sigma) = F(\bar{u},\sigma) + F_2(u',\sigma)
+
+and the advective form becomes
+
+.. math::
+
+       q^{n+1}_{adv} = \frac{\bar{q} + q' - \Delta{t} F(\bar{u},\bar{q}) - \Delta{t}F_1(\bar{u},q') - \Delta{t}F_2(u',\bar{q}) }{\sigma - \Delta{t} F(\bar{u},\sigma)- \Delta{t} F_2(u',\sigma)}
+
+Let
+
+.. math::
+
+   \begin{aligned}
+       \bar{N} &= \bar{q} - \Delta{t} F(\bar{u},\bar{q}) \\
+       N' &= q' - \Delta{t}F_1(\bar{u},q') - \Delta{t}F_2(u',\bar{q}) \\
+       \bar{D} &= \sigma - \Delta{t} F(\bar{u},\sigma) \\
+       D' = - \Delta{t} F_2(u',\sigma)
+   \end{aligned}
+
+and then this can be written as
+
+.. math:: q^{n+1}_{adv} = \frac{\bar{N}+N'}{\bar{D}+D'}
+
+Noting that
+
+.. math::
+
+       \bar{q}^{n+1}_{adv} = \frac{\bar{N}}{\bar{D}}
+
+then the advective perturbation field can be found from
+:math:`q^{n+1}_{adv} - \bar{q}^{n+1}_{adv}`, i.e.
+
+.. math:: q'^{n+1}_{adv} = \frac{\bar{N}+N'}{\bar{D}+D'} - \frac{\bar{N}}{\bar{D}}
+
+Using the first terms of the expansion
+:math:`(1+D'/\bar{D})^{-1} \approx 1 - D'/\bar{D}` we get
+
+.. math::
+
+   \begin{aligned}
+       q'^{n+1}_{adv}=& \frac{\bar{N}+N'}{\bar{D}}(1-D'/\bar{D}) - \frac{\bar{N}}{\bar{D}} \\
+                     =& \frac{N'}{\bar{D}} - \frac{\bar{N}D'}{\bar{D}\bar{D}} - \frac{N'D'}{\bar{D}\bar{D}}
+   \end{aligned}
+
+Using the fact products of prime
+variables equals zero, this simplifies to
+
+.. math:: q'^{n+1}_{adv} = \frac{N' - D' \bar{q}^{n+1}_{adv}}{\bar{D}}
+
+Horizontal Splitting
+-----------------------
+
+The horizontal FFSL scheme uses COSMIC/Lin-Rood splitting which has, for
+the non-linear scheme, the form
+
+.. math::
+
+   \begin{aligned}
+       q^x &= q^{n} - \frac{\Delta}{2}F^x_{adv}(u,q^n) \\
+       q^x &= q^{n} - \frac{\Delta}{2}F^y_{adv}(v,q^n) \\
+       q^{n+1} &= q^{n} - \Delta{t} F^x(u,q^y)- \Delta{t} F^y(v,q^x)
+   \end{aligned}
+
+Transport Efficiency
+------------------------
+
+To improve efficiency, we can make the approximation that
+:math:`\bar{q}^{n+1}_{adv} \approx \bar{q}^{n}`. This removes the need
+to recompute :math:`\bar{q}^{n+1}_{adv}` at every step.
diff --git a/documentation/source/science_guide/tangent_linear_section/tangent_linear_mol_transport.rst b/documentation/source/science_guide/tangent_linear_section/tangent_linear_mol_transport.rst
new file mode 100644
index 000000000..0a24fe56a
--- /dev/null
+++ b/documentation/source/science_guide/tangent_linear_section/tangent_linear_mol_transport.rst
@@ -0,0 +1,84 @@
+.. _tangent_linear_mol_transport:
+
+MoL  Transport
+====================
+
+The Method of Lines (MoL) transport scheme is a Finite Volume Scheme that uses Runge-Kutta timestepping. For the linear model, we use the notation that :math:`\bar{u}` is the linearisation state and :math:`u'` is the perturbation.
+
+Nonlinear MoL
+----------
+
+The nonlinear continuous equation is
+
+.. math:: \frac{\partial \theta}{\partial t} = - u. \nabla \theta
+
+which is discretised using an upwind scheme. On each face, 
+
+.. math::  \begin{aligned} \frac{\partial \theta}{\partial t}
+	                  & = - u. \nabla_L \theta & \text{   if } u>0 \\ 
+                          & = - u. \nabla_R \theta & \text{   if } u<0 \\
+			  & =   0                  & \text{   if } u=0 \\
+	  \end{aligned}
+
+where :math:`\nabla_L` uses the upwind reconstruction calculated from the left and :math:`\nabla_R` uses the upwind reconstruction calculated from the right.
+As required, :math:`\frac{\partial \theta}{\partial t}` tends to 0 as u tends to 0.
+			    
+This is implemented by initialising
+:math:`\frac{\partial \theta}{\partial t}` as :math:`0` and then adding
+on an increment if :math:`u>0` or :math:`u<0`.
+
+Linearised MoL
+----------
+
+The linearised continuous equation is
+
+.. math:: \frac{\partial \theta'}{\partial t} = - \bar{u}. \nabla \theta' - u'. \nabla \bar{\theta}
+
+This is discretised as
+
+.. math::  \begin{aligned} \frac{\partial \theta'}{\partial t}
+	                  & = - \bar{u}. \nabla_L \theta' - u'. \frac{1}{2} (\nabla_L + \nabla_R) \bar{\theta}
+			  & \text{   if } \bar{u}>0 \\ 
+                          & = - \bar{u}. \nabla_R \theta' - u'. \frac{1}{2} (\nabla_L + \nabla_R) \bar{\theta}
+			  & \text{   if } \bar{u}<0 \\
+			  & = - u'. \frac{1}{2} (\nabla_L + \nabla_R) \bar{\theta}
+			  & \text{   if } \bar{u}=0 \\
+	  \end{aligned}
+
+which, as required, tends to
+:math:`- u'. \frac{1}{2} (\nabla_L + \nabla_R) \bar{\theta}` as
+:math:`\bar{u}` tends to :math:`0`.
+
+This is implemented by initialising
+:math:`\frac{\partial \theta'}{\partial t}` as
+:math:`- u'. \frac{1}{2} (\nabla_L + \nabla_R) \bar{\theta}` and then
+adding an increment if :math:`\bar{u}>0` or :math:`\bar{u}<0`.
+
+Runge-Kutta
+-----------
+
+The nonlinear Runge Kutta scheme is:
+
+.. math::
+
+   \begin{aligned}
+     u^{(i)} & = u^n + \Delta t \sum_{k=0}^{i-1} \beta_{i,k} f(u^{(k)}) \quad i=1,\ldots,m \\
+     u^{n+1} & = u^{(m)}
+     \end{aligned}
+
+The tangent linear Runge Kutta scheme comprises two steps. First the linearisation
+state is computed:
+
+.. math:: \bar{u}^{(i)} = \bar{u}^n + \Delta t \sum_{k=0}^{i-1} \beta_{i,k} f(\bar{u}^{(k)}) \quad i=1,\ldots,m
+
+then the perturbations are computed:
+
+.. math::
+
+   \begin{aligned}
+     u'^{(i)} & = u'^n + \Delta t \sum_{k=0}^{i-1} \beta_{i,k} f'(u'^{(k)},\bar{u}^{(k)}) \quad i=1,\ldots,m \\
+     u'^{n+1} & = u'^{(m)}
+     \end{aligned}
+
+
+If ``transport_efficiency`` is set to true, then the ``ls_state`` is not updated, so that the same linearisation state is used for all iterations.
diff --git a/documentation/source/science_guide/tangent_linear_section/tangent_linear_overview.rst b/documentation/source/science_guide/tangent_linear_section/tangent_linear_overview.rst
new file mode 100644
index 000000000..c3f631af9
--- /dev/null
+++ b/documentation/source/science_guide/tangent_linear_section/tangent_linear_overview.rst
@@ -0,0 +1,39 @@
+.. _tangent_linear_overview:
+
+Overview
+=======================
+
+In variational data assimilation, the aim is to compute the optimal
+value of the initial state :math:`\mathbf{x}_0` based on the
+minimization of a cost function:
+
+.. math:: J(\mathbf{x}_0) = (\mathbf{x}_0 - \mathbf{x}_b)^T \mathbf{B}^{-1} (\mathbf{x}_0 - \mathbf{x}_b) + \sum_t ( \mathbf{y}_t - \mathbf{H}_t N_t(\mathbf{x}_0))^T \mathbf{R}^{-1} ( \mathbf{y}_t - \mathbf{H}_t N_t(\mathbf{x}_0))
+
+where :math:`t` is the time, :math:`\mathbf{x}_b` is the background
+state, :math:`\mathbf{y}_t` are the observations, :math:`\mathbf{H}_t`
+is the observation operator (which we can assume is linear here), and
+:math:`N_t` is the nonlinear model that evolves the state
+:math:`\mathbf{x}_0` from time :math:`0` to time :math:`t`.
+:math:`\mathbf{B}` and :math:`\mathbf{R}` are the error covariance
+matrices.
+
+As :math:`N` is nonlinear, there may not be a unique solution and it is difficult to solve. Therefore, the
+problem is rewritten in incremental form, by rewriting the initial state
+in terms of a linearisation state plus a perturbation,
+:math:`\mathbf{x}_0 = \tilde{\mathbf{x}} + \mathbf{\delta x}`, so that
+on the first iteration, when :math:`\tilde{\mathbf{x}} = \mathbf{x}_b`,
+
+.. math:: J(\mathbf{\delta x}_0) = (\mathbf{\delta x}_0)^T B^{-1} (\mathbf{\delta x}_0) + \sum_t ( \mathbf{\delta y}_t - \mathbf{H}_t \mathbf{L}_t\mathbf{\delta x}_0)^T \mathbf{R}^{-1} ( \mathbf{\delta y}_t - \mathbf{H}_t \mathbf{L}_t \mathbf{\delta x}_0)
+
+where :math:`\mathbf{\delta x}_0 = \mathbf{x}_0 - \mathbf{x}_b`, and
+:math:`\mathbf{\delta y}_t = \mathbf{y}_t - \mathbf{H}_t N_t (x_b)`, and
+we have used the Taylor series expansion to approximate the nonlinearly
+evolved perturbation,
+
+.. math:: N( \mathbf{x}_b + \mathbf{\delta x} ) - N(\mathbf{x}_b) \approx \mathbf{L\delta x}
+
+where the linear model
+:math:`\mathbf{L}= \mathbf{L}(\tilde{\mathbf{x}})`, known as the tangent
+linear model, is a function of the linearisation state
+:math:`\tilde{\mathbf{x}}`, and is the Jacobian or first derivative of
+the nonlinear model.
diff --git a/documentation/source/science_guide/tangent_linear_section/tangent_linear_tests.rst b/documentation/source/science_guide/tangent_linear_section/tangent_linear_tests.rst
new file mode 100644
index 000000000..37517311e
--- /dev/null
+++ b/documentation/source/science_guide/tangent_linear_section/tangent_linear_tests.rst
@@ -0,0 +1,64 @@
+.. _tangent_linear_tests:
+
+
+Tangent Linear Tests
+====================
+
+Taylor remainder convergence test
+---------------------------------
+
+By definition, the tangent linear model is the Jacobian of the nonlinear
+model, and therefore the Taylor series expansion can be used to form a
+test for the code.
+
+The Taylor series expansion of :math:`N(\mathbf{x})` is
+
+.. math:: N(\mathbf{x} + \gamma \mathbf{\delta x} ) = N(\mathbf{x}) + \gamma \mathbf{L \delta x} + \mathcal{O}(\gamma^2)
+
+This is used to calculate the norm of the difference between the
+nonlinearly evolved perturbation
+:math:`N(\mathbf{x} + \gamma \mathbf{\delta x} ) - N(\mathbf{x})` and
+the linearly evolved perturbation :math:`\gamma \mathbf{L \delta x}`
+
+.. math:: E_n(\gamma) = || N(\mathbf{x} + \gamma \mathbf{\delta x}) - N(\mathbf{x}) - \gamma \mathbf{L \delta x} || = \mathcal{O}(\gamma^2)
+
+where the norm could be the L2-norm,
+:math:`||\mathbf{x}|| = (\mathbf{x}^T\mathbf{x})^{1/2}`.
+
+The norm :math:`E_n(\gamma)` is known as the linearization error, and it
+tends to zero as the size of the perturbation :math:`\gamma` tends to
+zero, at order :math:`\mathcal(O)(\gamma^2)`.
+
+If we examine the series of :math:`E_n` for values of :math:`\gamma`
+that are successively halved, we can evaluate the convergence rate. For
+example, with the values of :math:`E_n(\gamma)` and :math:`E_n(2\gamma)`
+we have
+
+.. math:: E_n(2\gamma) / E_n(\gamma) = \mathcal{O}(4 \gamma^2) / \mathcal{O}(\gamma^2) = 4
+
+To create a test which gives a simple pass or fail, two values of the
+sequence :math:`E_n` are chosen, and the ratio is compared to the value
+:math:`4`. If the value of the ratio is :math:`4`, to within a specified
+tolerance, then the code passes the test.
+
+Implementation
+--------------
+
+The Taylor remainder convergence test is implemented in the LFRic code
+through the use of integration tests. In the linear app integration
+tests, the convergence test is applied to small sections of code. The
+process for each test is
+
+- Initialise the nonlinear model, linear model and linearisation state.
+
+- Run the nonlinear model using the linearisation state as initial
+  conditiosn.
+
+- Run the nonlinear model with a perturbed linearisation state, and run
+  the linear model with the perturbation, and calculate
+  :math:`E_n(\gamma)`.
+
+- Halve the size of the perturbation, and repeat.
+
+- Compare the ratio of the values of :math:`E_n` with :math:`4`, to a
+  specified tolerance.
diff --git a/documentation/source/science_guide/tangent_linear_section/tangent_linear_timestepping.rst b/documentation/source/science_guide/tangent_linear_section/tangent_linear_timestepping.rst
new file mode 100644
index 000000000..574002812
--- /dev/null
+++ b/documentation/source/science_guide/tangent_linear_section/tangent_linear_timestepping.rst
@@ -0,0 +1,128 @@
+.. _tangent_linear_timestepping:
+
+Iterated (Semi-) Implicit Timestepping
+====================
+
+Nonlinear Equations
+-------------------
+
+The continous nonlinear equations are
+
+.. math::
+
+   \begin{aligned}
+     \frac{\partial u}{\partial t} &= - u . \nabla u - 2 \Omega \times u - \nabla \Phi - c_p \theta \nabla \Pi \\
+     \frac{\partial \rho}{\partial t} & = - \nabla . (\rho u) \\
+     \frac{\partial \theta}{\partial t} & = - u. \nabla \theta
+     \end{aligned}
+
+together with the nonlinear equation of state:
+
+.. math:: \Pi ^{\left( \frac{1-\kappa}{\kappa} \right)} = \frac{R}{p_0} \rho \theta
+
+where the prognostic variables are :math:`u` the velocity vector, :math:`\rho` the density, :math:`\theta` the potential temperature, and :math:`\Pi` the Exner pressure.  The constants are :math:`\Omega` the rotation vector, :math:`\Phi` the geopotential and :math:`c_p` the specific heat capacity.
+
+Discretisation
+--------------------
+In the iterated-implicit timestepping, the equations are solved iteratively
+using a Newton method. If the full nonlinear discrete equations are
+written as
+
+.. math:: \mathcal{R} (\mathbf{x}^{n+1}) =0
+
+then this can be solved iteratively as
+
+.. math::
+
+   \begin{aligned}
+     \mathbf{x}^{n+1}_{(k+1)} & =  \mathbf{x}^{n+1}_{(k)} +  \mathbf{x}'_{(k)} \\
+     \mathcal{L} (\mathbf{x}^*) \mathbf{x}'_{(k)} & = - \mathcal{R} ( \mathbf{x}^{n+1}_{(k)})
+     \end{aligned}
+
+where :math:`\mathcal{L}` is an approximation to the Jacobian and
+:math:`\mathbf{x}^*` is the reference state.
+
+The reference state :math:`u^*` is zero whilst the other reference state
+variables are set to :math:`\mathbf{x}^n`. Relaxation parameters
+:math:`\tau <1` are also introduced to give extra weight to the diagonal
+terms (and hence similar to solving the resulting linear equation using
+the Jacobi method or a damped Newton method).
+
+This gives the approximation to the Jacobian as
+
+.. math::
+
+   \mathcal{L}(\mathbf{x}^*) \mathbf{x}' = \left\{
+     \begin{aligned}
+       u' & + \tau_u \Delta t c_P (\theta' \nabla \Pi^* + \theta^* \nabla \Pi ') \\
+       \rho' & + \tau_\rho \Delta t \nabla.(\rho^* u') \\
+       \theta' & + \tau_\theta \Delta t u' \nabla. \nabla \theta^* \\
+   & \frac{1-\kappa}{\kappa}  \frac{\Pi'}{\Pi^*}  - \frac{1}{E} \frac{\rho'}{\rho^*} - \frac{1}{E} \frac{\theta'}{\theta^*}
+     \end{aligned}
+   \right.
+
+where
+
+.. math:: E = \frac{p_0}{R} \frac{\Pi^{*(1-\kappa)/\kappa}}{\theta^* \rho^*}
+
+The procedure is:
+
+- | Let :math:`\mathbf{x}^{n+1}_{(0)} = \mathbf{x}^n`
+  | The first iteration is given by the state at the previous timestep.
+
+- | Set
+    :math:`\mathbf{x}* \equiv (\rho*, \theta*, \Pi* ) = (\rho^n, \theta^n,\Pi^n)`
+  | The reference state is given by the state at the previous timestep
+    (for density, potential temperature and exner pressure only).
+
+- Loop k=1, K
+
+  - | Compute :math:`\mathcal{R}(\mathbf{x}^n, \mathbf{x}^{n+1}_{(k)})`
+
+  - Solve the linear equation for :math:`\mathbf{x}'_{(k+1)}` and
+    hence for :math:`\mathbf{x}^{n+1}_{(k+1)}`.
+
+Linear Equations
+-----------------------
+
+We can proceed in a similar way with the linear model. Using the notation of :math:`\delta \mathbf{x}` is the perturbation and :math:`\mathbf{x}` is the linearisation state,
+
+.. math::
+
+   \begin{aligned}
+     \delta \mathbf{x}^{n+1}_{(k+1)} & =  \delta \mathbf{x}^{n+1}_{(k)} +  \delta \mathbf{x}'_{(k)} \\
+     \mathcal{L} (\mathbf{x}^*) \delta \mathbf{x}'_{(k)} & = - R ( \mathbf{x}_{(k)}) \delta \mathbf{x}_{(k)}
+     \end{aligned}
+
+where :math:`R` is the linearized version of :math:`\mathcal{R}`. As the operator :math:`\mathcal{L}` is already linear, this does not need to be linearized.
+
+The procedure is:
+
+- | Let :math:`\delta \mathbf{x}^{n+1}_{(0)} = \delta \mathbf{x}^n`
+  | The first iteration is given by the perturbation at the previous
+    timestep.
+
+- | Set
+    :math:`\mathbf{x}* \equiv (\rho*,\theta*, \Pi* ) = (\rho^n, \theta^n, \Pi^n)`
+  | The reference state is given by the linearisation state at the
+    previous timestep - so that the linear operator is identical to the
+    that in the nonlinear model.
+
+- Loop k=1, K
+
+  - Compute
+    :math:`R ( \mathbf{x}_{(k)}) \delta \mathbf{x}_{(k)}`
+
+  - Solve for :math:`\delta \mathbf{x}'_{(k)}` using :math:`\mathcal{L} (\mathbf{x}^*) \delta \mathbf{x}'_{(k)} = R ( \mathbf{x}_{(k)}) \delta \mathbf{x}_{(k)}` and hence for
+    :math:`\delta \mathbf{x}^{n+1}_{(k+1)}`.
+
+The linear operator :math:`\mathcal{L} ( \bar{\mathbf{x}}*)` is exactly
+the same as for the nonlinear model. The solution of the equation (often known as the solver) uses the same method as the nonlinear model. This could be multi-grid or Krylov iterative methods, or a combination of the both. A very exact line-by=line procedure would linearise every iteration of the solver procedure, using the nonlinear solution at every step. However, this is not necessary and the more linearise then discretize approach is taken, to form the linear equation and then solve.
+
+The right hand side term :math:`R ( \mathbf{x}_{(k)}) \delta \mathbf{x}_{(k)}` require both the perturbation :math:`\delta \mathbf{x}_{(k)}` and the linearisation state :math:`\mathbf{x}_{(k)}` at the same iteration. Therefore, for an exact tangent linear semi-implicit algorithm, the
+nonlinear semi-implicit step needs to be run first, and the intermediate
+``ls_state`` values need to be stored and used in the following tangent
+linear semi-implicit step.
+
+In practice the impact of the variation of the linearisation state through the timestep is very small. Furthermore, the evolution of the linearisation state in the linear model timestepping does not include any physics and will be at a different resolution to the outer loop nonlinear model, and hence will not be very accurate. Therefore, it is advised for both efficiency and accuracy reasons to use the ``fixed_ls`` option. This option applies the ``ls_state`` values set at the beginning of the timestep to all iterations, without recalculation.
+