4 Linear DAEs with Time-varying Coefficients

In this section, we consider linear DAEs with variable or time-dependent coefficients. This means, for matrix-valued functions \[ E \in \mathcal C(\mathcal I, \mathbb C^{m,n}), \quad A\in \mathcal C(\mathcal I, \mathbb C^{m,n}) \] and \(f\in \mathcal C(\mathcal I, \mathbb C^{m})\), we consider the DAE \[\begin{equation} E(t)\dot x(t) = A(t)x(t) + f(t) \tag{4.1} \end{equation}\] with, possibly, an initial condition \[\begin{equation} x(t_0) = x_0 \in \mathbb C^{n}. \tag{4.2} \end{equation}\]

The same general solution concept applies: \(x\) should be differentiable, fulfill the DAE pointwise, and, if stated, satisfy the initial condition too.

In the constant coefficient case, regularity played a decisive role for the existence and uniqueness of solutions; see, e.g., Section 3.4. Thus it seems natural to extend this concept to the time-varying case, e.g., by requiring that \((E(t), A(t))\) is a regular matrix pair for every \(t\). However, the following two examples show that this will not work out of the box.

Example 4.1 Let \(E\), \(A\) be given as \[ E(t) = \begin{bmatrix} -t & t^2 \\ -1 & t \end{bmatrix}, \quad A(t) = \begin{bmatrix} -1 & 0 \\ 0& -1 \end{bmatrix}. \] Then \[ \det ( \lambda E(t) - A(t)) = (1-\lambda t)(1+\lambda t) + \lambda ^2 t^2 \equiv 1, \] for all \(t\in \mathcal I\). Still, for every \(c \in \mathcal C^1(\mathcal I, \mathbb C)\) with \(c(t_0)=0\), the function \[ x\colon t \mapsto c(t)\begin{bmatrix} t\\1 \end{bmatrix} \] solves the homogeneous initial value problem (4.1), (4.2) with \(f=0\) and \(x_0=0\).

This was an example where the pair \((E,A)\) is regular uniformly with respect to \(t\) but still allows for infinitely many solutions of the associated DAE. Note that the initial condition does not restore uniqueness here: every admissible \(c\) satisfies \(c(t_0)=0\), so all of these solutions fulfill the very same initial condition \(x(t_0)=0\).
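Both claims can be verified symbolically; a minimal sympy sketch (the variable names are ours):

```python
import sympy as sp

t, lam = sp.symbols("t lambda")
E = sp.Matrix([[-t, t**2], [-1, t]])
A = -sp.eye(2)

# the pencil is regular for every t: det(lam*E(t) - A(t)) is identically 1
print(sp.simplify(sp.det(lam * E - A)))  # -> 1

# yet for any c with c(t0) = 0, x(t) = c(t)*(t, 1) solves the homogeneous IVP
c = sp.Function("c")
x = c(t) * sp.Matrix([t, 1])
print(sp.simplify(E * x.diff(t) - A * x))  # -> Matrix([[0], [0]])
```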

Next we see the contrary – a matrix pair that is singular for every \(t\) but whose DAE has a unique solution.

Example 4.2 For \[ E(t) = \begin{bmatrix} 0 & 0 \\ 1 & -t \end{bmatrix}, \quad A(t) = \begin{bmatrix} -1 & t \\ 0&0 \end{bmatrix}, \quad f(t) = \begin{bmatrix} f_1(t) \\ f_2(t) \end{bmatrix}, \] one has \[ \det ( \lambda E(t) - A(t)) = 0 \] for all \(t\in \mathcal I\). Still, if \(x=(x_1, x_2)\) denotes a solution, then writing the DAE line by line as \[\begin{align*} 0 &= -x_1(t) + tx_2(t) + f_1(t) \\ \dot x_1(t) - t\dot x_2(t) &= \phantom{-x_1(t) + tx_2(t) +}f_2(t), \end{align*}\] one can differentiate the first equation (provided \(f_1\) is differentiable) to obtain \[ \dot x_1(t) = t\dot x_2(t) +x_2(t) + \dot{f_1}(t), \] i.e. \[ \dot x_1(t) - t\dot x_2(t) = x_2(t) + \dot{f_1}(t), \] so that the second line becomes \[ x_2(t) + \dot{f_1}(t) = f_2(t), \] which uniquely defines \[ x_2(t) = - \dot{f_1}(t) + f_2(t) \] and, via the first line, also \[ x_1(t) = t(- \dot{f_1}(t) + f_2(t))+f_1(t). \]

For both examples, one can then choose \(x(t_0)\) in accordance with the right-hand side to decide whether and how a solution of the initial value problem exists.
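The solution formula derived in Example 4.2 can be checked the same way (again a minimal sympy sketch):

```python
import sympy as sp

t = sp.symbols("t")
f1, f2 = sp.Function("f1")(t), sp.Function("f2")(t)
E = sp.Matrix([[0, 0], [1, -t]])
A = sp.Matrix([[-1, t], [0, 0]])
f = sp.Matrix([f1, f2])

# the unique solution derived in Example 4.2
x2 = -f1.diff(t) + f2
x = sp.Matrix([t * x2 + f1, x2])

print(sp.simplify(E * x.diff(t) - A * x - f))  # -> Matrix([[0], [0]])
```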

Recall that for the constant coefficient case, we were using invertible scaling and state transformation matrices \(P\) and \(Q\) for the equivalence transformations \[ E \dot x(t) = Ax(t) +f(t) \quad \sim \quad \tilde E \dot{\tilde x}(t) = \tilde A\tilde x(t) +\tilde f(t) \]

with

\[ x=Q\tilde x, \quad \tilde E = PEQ, \quad \tilde A = PAQ, \quad \tilde f = Pf. \]

For the time-varying case, we will use time-varying transformations and require that they are invertible at every point \(t\) in time.

Definition 4.1 Two pairs \((E_i,A_i)\), \(E_i\), \(A_i \in \mathcal C(\mathcal I, \mathbb C^{m,n})\), \(i=1,2\), of matrix functions are called (globally) equivalent, if there exist pointwise nonsingular matrix functions \(P\in \mathcal C(\mathcal I, \mathbb C^{m,m})\) and \(Q\in \mathcal C^1(\mathcal I, \mathbb C^{n,n})\) such that \[\begin{equation} E_2=PE_1Q, \quad A_2 = PA_1Q-PE_1\dot Q \tag{4.3} \end{equation}\] for all \(t\in \mathcal I\). Again, we write \((E_1,A_1) \sim (E_2, A_2)\).

The need for \(Q\) to be differentiable and the appearance of \(E_1\dot Q\) in the definition of \(A_2\) come from the relation \[ E\dot x(t) = E\frac{d}{dt} (Q\tilde x)(t) = E\bigl(Q(t)\dot {\tilde x}(t) + \dot Q(t)\tilde x(t) \bigr) \] relating the transformed state \(\tilde x\) to the actual state \(x = Q\tilde x\).
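To see the definition in action, the following sympy sketch transforms Example 4.2 by some arbitrarily chosen, pointwise nonsingular \(P\) and \(Q\) (our choices, both with determinant one) and verifies that \(\tilde x = Q^{-1}x\) solves the transformed DAE with right-hand side \(Pf\):

```python
import sympy as sp

t = sp.symbols("t")
f1, f2 = sp.Function("f1")(t), sp.Function("f2")(t)
E1 = sp.Matrix([[0, 0], [1, -t]])
A1 = sp.Matrix([[-1, t], [0, 0]])
f = sp.Matrix([f1, f2])

# arbitrary pointwise nonsingular transformations (our choice, det = 1)
P = sp.Matrix([[1, t], [0, 1]])
Q = sp.Matrix([[1, 0], [sp.sin(t), 1]])

# the transformed pair according to (4.3)
E2 = P * E1 * Q
A2 = P * A1 * Q - P * E1 * Q.diff(t)

# the known solution of Example 4.2, pulled back to the transformed state
x2 = -f1.diff(t) + f2
x = sp.Matrix([t * x2 + f1, x2])
xt = Q.inv() * x

print(sp.simplify(E2 * xt.diff(t) - A2 * xt - P * f))  # -> Matrix([[0], [0]])
```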

Lemma 4.1 The relation on pairs of matrix functions as defined in Definition 4.1 is an equivalence relation.
Proof. Exercise!

Next we will define local equivalence of matrix pairs.

Definition 4.2 Two pairs \((E_i,A_i)\), \(E_i\), \(A_i \in \mathbb C^{m,n}\), \(i=1,2\), of matrices are called locally equivalent, if there exist nonsingular matrices \(P \in \mathbb C^{m,m}\) and \(Q\in \mathbb C^{n,n}\) as well as a matrix \(R\in \mathbb C^{n,n}\) such that \[\begin{equation} E_2=PE_1Q, \quad A_2 = PA_1Q-PE_1R. \tag{4.4} \end{equation}\] Again, we write \((E_1,A_1) \sim (E_2, A_2)\); which notion of equivalence is meant will be clear from the context.
Lemma 4.2 The local equivalence as defined in Definition 4.2 is an equivalence relation on pairs of matrices.
Proof. Exercise!
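As a plausibility check for the exercise (not a proof), transitivity can be tested numerically: composing two transformations of the form (4.4) yields again a transformation of that form, with \(P = P_2P_1\), \(Q = Q_1Q_2\), and \(R = R_1Q_2 + Q_1R_2\). A numpy sketch with random (hence generically nonsingular) matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 3
E1, A1 = rng.standard_normal((m, n)), rng.standard_normal((m, n))
# random square matrices are generically nonsingular
P1, P2 = rng.standard_normal((m, m)), rng.standard_normal((m, m))
Q1, Q2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
R1, R2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))

# (E1, A1) ~ (E2, A2) ~ (E3, A3), both steps as in (4.4)
E2, A2 = P1 @ E1 @ Q1, P1 @ A1 @ Q1 - P1 @ E1 @ R1
E3, A3 = P2 @ E2 @ Q2, P2 @ A2 @ Q2 - P2 @ E2 @ R2

# the composed transformation is again of the form (4.4)
P, Q, R = P2 @ P1, Q1 @ Q2, R1 @ Q2 + Q1 @ R2
print(np.allclose(E3, P @ E1 @ Q))               # True
print(np.allclose(A3, P @ A1 @ Q - P @ E1 @ R))  # True
```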

We state a few observations:

  • Global equivalence implies local equivalence at all points of time \(t\).
  • Vice versa, pointwise local equivalence, e.g. at some time instances \(t_i\) with suitable matrices \(P_i\), \(Q_i\), \(R_i\), can be interpolated to a continuous matrix function \(P\) and a differentiable matrix function \(Q\) by Hermite interpolation, i.e. via \[ P(t_i) = P_i, \quad Q(t_i) = Q_i, \quad \dot Q(t_i) = R_i. \]
  • Local equivalence is more powerful than the simple equivalence of matrix pairs (cp. Definition 3.1) for which \(R=0\). This means we can expect more structure in a normal form.

4.1 A Local Canonical Form

For easier explanations, we introduce the slightly incorrect wording that a matrix \(M\) spans a vector space \(V\) to express that \(V\) is the span of the columns of \(M\). Similarly, we will say that \(M\) is a basis of \(V\), if the columns of \(M\) form a basis for \(V\).

Some more notation:

| Notation | Explanation |
|---|---|
| \(V^H\in \mathbb C^{n,m}\) | the conjugate transpose or Hermitian transpose of a matrix \(V\in \mathbb C^{m,n}\) |
| \(T' \in \mathbb C^{n,n-k}\) | the complementary space as a matrix: if \(T\in \mathbb C^{n,k}\) is a basis of \(V\), then \(T'\) contains a basis of \(V'\) so that \(V\oplus V' = \mathbb C^{n}\); in particular, the matrix \(\begin{bmatrix}T&T'\end{bmatrix}\) is square and invertible |

Theorem 4.1 Let \(E, A \in \mathbb C^{m,n}\) and let \[\begin{equation} T,~Z,~T',~V \tag{4.5} \end{equation}\] be

| Matrix | as the basis of |
|---|---|
| \(T\) | \(\operatorname{kernel}E\) |
| \(Z\) | \(\operatorname{corange}E = \operatorname{kernel}E^H\) |
| \(T'\) | \(\operatorname{cokernel}E = \operatorname{range}E^H\) |
| \(V\) | \(\operatorname{corange}(Z^HAT)\) |

then the quantities \[\begin{equation} r,~a,~s,~d,~u,~v \tag{4.6} \end{equation}\] defined as

| Quantity | Definition | Name |
|---|---|---|
| \(r\) | \(\operatorname{rank}E\) | rank |
| \(a\) | \(\operatorname{rank}(Z^HAT)\) | algebraic part |
| \(s\) | \(\operatorname{rank}(V^HZ^HAT')\) | strangeness |
| \(d\) | \(r-s\) | differential part |
| \(u\) | \(n-r-a\) | undetermined variables |
| \(v\) | \(m-r-a-s\) | vanishing equations |

are invariant under local equivalence transformations and \((E, A)\) is locally equivalent to the canonical form

\[\begin{equation} \left(\begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & I_d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\right), \tag{4.7} \end{equation}\] where all diagonal blocks are square, except maybe the last one.
Proof. To be provided. Until then, see Theorem 3.7 in Kunkel/Mehrmann.
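The construction in (4.5) and (4.6) is directly computable. The following numpy sketch (the helper names `null_basis` and `local_char_values` are ours, and a fixed absolute tolerance stands in for a proper rank decision) computes the local characteristic values of a pair of constant matrices:

```python
import numpy as np

def null_basis(M, tol=1e-10):
    """Orthonormal basis of kernel(M), computed via the SVD."""
    M = np.atleast_2d(M)
    _, s, vh = np.linalg.svd(M)
    rank = int((s > tol).sum())
    return vh[rank:].conj().T  # columns span kernel(M)

def local_char_values(E, A, tol=1e-10):
    """Sketch: the local characteristic values (r, a, s, d, u, v)
    of Theorem 4.1 for a pair (E, A) of constant matrices."""
    m, n = E.shape
    r = int(np.linalg.matrix_rank(E, tol=tol))
    T = null_basis(E)             # basis of kernel(E)
    Z = null_basis(E.conj().T)    # corange(E) = kernel(E^H)
    Tp = null_basis(T.conj().T)   # cokernel(E) = (kernel E)^perp = range(E^H)
    a = int(np.linalg.matrix_rank(Z.conj().T @ A @ T, tol=tol))
    V = null_basis((Z.conj().T @ A @ T).conj().T)  # corange(Z^H A T)
    s = int(np.linalg.matrix_rank(V.conj().T @ Z.conj().T @ A @ Tp, tol=tol))
    return r, a, s, r - s, n - r - a, m - r - a - s
```

For matrix functions, the same computation is applied pointwise at any fixed \(t\).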

Some remarks on the spaces and on how the names are derived, for the case \(E\dot x = Ax +f\) with constant coefficients; the ideas transfer readily to the case with time-varying coefficients.

Let \[x(t) = Ty(t) + T'y'(t),\]

where \(y\) denotes the components of \(x\) that evolve in the range of \(T\) and \(y'\) the respective complement. (Since \([T|T']\) is a basis of \(\mathbb C^{n}\), there exist such \(y\) and \(y'\) that uniquely define \(x\) and vice versa). With \(T\) spanning \(\ker E\) we find that

\[ E \dot x(t) = ET\dot y(t) + ET'\dot y'(t) = ET'\dot y'(t) \]

so that the DAE basically reads

\[ET'\dot y'(t) = ATy(t) + AT'y'(t)+f,\]

i.e. the components of \(x\) defined through \(y\) are, effectively, not differentiated. With the columns of \(Z\) spanning exactly the left null space of \(E\), i.e. \(Z^HE=0\), it follows that

\[Z^HET'\dot y'(t) = 0 = Z^HATy(t) + Z^HAT'y'(t)+Z^Hf,\]

or

\[Z^HATy(t) = -Z^HAT'y'(t)-Z^Hf,\]

so that \(\operatorname{rank}Z^HAT\) indeed describes the number of purely algebraic equations and variables: these equations define parts of \(y\) (which is never differentiated) through relations in which no time derivatives appear.

With the same arguments and with \(V=\operatorname{corange}Z^HAT\), it follows that

\[V^HZ^HAT'y'(t) = -V^HZ^HATy(t) -V^HZ^Hf=-V^HZ^Hf,\]

is the part of \(E\dot x = Ax + f\) in which those components \(y'\) that are also differentiated are algebraically equated to a right-hand side. This is the strangeness (rather in the sense of skewness) of DAEs: variables can be both differential and algebraic at the same time. Accordingly, \(\operatorname{rank}V^HZ^HAT'\) describes the size of this strange component.

Finally, those variables that are neither strange nor purely algebraic, i.e. those that are differentiated but not defined algebraically, are the differential variables. There is no direct characterization of them, but one can calculate their number as \(d = r-s\), i.e. the number of differentiated variables minus the number of strange ones.

Outlook: If there is no strangeness, the DAE is called strangeness-free. Strangeness can be eliminated through iterated differentiation and substitution. The needed number of such iterations (which is independent of the size \(s\) of the strange block here) will define the strangeness index.

Example 4.3 With a basic state transformation \[ \begin{bmatrix} \tilde x_1 \\ \tilde x_2 \\ \tilde x_3 \end{bmatrix} = \begin{bmatrix} x_3 - x_2 \\ x_2-x_1 \\ x_3 \end{bmatrix}, \] one finds for the coefficients of Example 1.2 that: \[ (E, A) \sim \left( \begin{bmatrix} C & 0 & 0 \\ 0 & 0 &0 \\ 0 & 0 &0 \end{bmatrix} , \begin{bmatrix} 0 & \frac{1}{R} & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right). \]

We compute the subspaces as defined in (4.5):

| Matrix | as the basis of / computed as |
|---|---|
| \(T=\begin{bmatrix} 0 \\ I_2 \end{bmatrix}\) | \(\operatorname{kernel}\begin{bmatrix} C & 0 \\ 0 & 0_2 \end{bmatrix}\) |
| \(Z=\begin{bmatrix} 0 \\ I_2 \end{bmatrix}\) | \(\operatorname{corange}\begin{bmatrix} C & 0 \\ 0 & 0_2 \end{bmatrix}=\operatorname{kernel}\begin{bmatrix} C & 0 \\ 0 & 0_2 \end{bmatrix}^H\) |
| \(T'=\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\) | \(\operatorname{cokernel}\begin{bmatrix} C & 0 \\ 0 & 0_2 \end{bmatrix}=\operatorname{range}\begin{bmatrix} C & 0 \\ 0 & 0_2 \end{bmatrix}^H\) |
| \(Z^HAT=I_2\) | \(\begin{bmatrix} 0 \\ I_2 \end{bmatrix}^H\begin{bmatrix} 0 & \frac{1}{R} & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 0 \\ I_2 \end{bmatrix}\) |
| \(V=\begin{bmatrix} 0 \\ 0 \end{bmatrix}\) | \(\operatorname{corange}(Z^HAT) = \operatorname{kernel}I_2^H\) |
| \(V^HZ^HAT'=\begin{bmatrix} 0 \end{bmatrix}\) | \(\begin{bmatrix} 0 \\ 0 \end{bmatrix}^H\begin{bmatrix} 0 \\ I_2 \end{bmatrix}^H\begin{bmatrix} 0 & \frac{1}{R} & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 \\0 \\ 0 \end{bmatrix}\) |

and derive the quantities as defined in (4.6):

| Name | Value | Derived from |
|---|---|---|
| rank | \(r=1\) | \(\operatorname{rank}E = \operatorname{rank}\begin{bmatrix} C & 0 \\ 0 & 0_2 \end{bmatrix}\) |
| algebraic part | \(a=2\) | \(\operatorname{rank}Z^HAT = \operatorname{rank}I_2\) |
| strangeness | \(s=0\) | \(\operatorname{rank}V^HZ^HAT' = \operatorname{rank}\begin{bmatrix} 0\end{bmatrix}\) |
| differential part | \(d=1\) | \(d=r-s=1-0\) |
| undetermined variables | \(u=0\) | \(u=n-r-a=3-1-2\) |
| vanishing equations | \(v=0\) | \(v=m-r-a-s=3-1-2-0\) |
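The table can be reproduced numerically with the hypothetical `local_char_values` helper sketched after Theorem 4.1, here with the assumed parameter values \(C = R = 1\):

```python
import numpy as np

# reusing local_char_values from the sketch after Theorem 4.1
C, R = 1.0, 1.0
E = np.array([[C, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
A = np.array([[0.0, 1.0 / R, 0.0], [-1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])

# (r, a, s, d, u, v) as in the table above
print(local_char_values(E, A))  # -> (1, 2, 0, 1, 0, 0)
```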
Example 4.4 For the semi-discrete linearized Navier-Stokes equations, the derivation of the local characteristic quantities is laid out in the Example Section.


4.2 A Global Canonical Form

A few observations:

  • For a pair \((E, A)\) of matrix functions, we can compute the characteristic values \(r\), \(a\), \(s\), \(d\) as in (4.6) for any given \(t\in \mathcal I\).
  • Thus, \(r\), \(a\), \(s\), \(d\colon \mathcal I \to \mathbb N_0\) are functions of time \(t\).
  • We will assume that \(r\), \(a\), \(s\), \(d\) are constant in time:
    • Analysis will be enabled through a so-called smooth singular value decomposition (SVD – see the following theorem) that applies to matrices of constant rank.
    • Smooth matrix functions have countably many jumps in the rank; the analysis can then be performed on the subintervals where the rank of the matrices is constant.
    • In practice, there are typically but a few jumps in the rank, at somewhat particular but known time instances or circumstances; whether the rank is constant on a given interval can at least be probed by sampling, as sketched below.
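A minimal numpy sketch of such a sampling check, here for the \(E\) of Example 4.1 on the interval \([0, 1]\):

```python
import numpy as np

E = lambda t: np.array([[-t, t**2], [-1.0, t]])

# sample the rank of E(t) on a grid; a single value indicates no detected jump
ranks = {int(np.linalg.matrix_rank(E(t))) for t in np.linspace(0.0, 1.0, 101)}
print(ranks)  # -> {1}
```

Of course, sampling can only ever falsify the constant-rank assumption, not prove it.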

About the few and known jumps in the rank: a change in the ranks means an instantaneous change in the model itself. In fact, the characteristic values, like the number of purely algebraic equations, would change suddenly. An example is the activation of a switch in an electrical circuit, or switched systems in general.

Theorem 4.2 (see Kunkel/Mehrmann, Thm. 3.9) Let \(E\in \mathcal C^\ell(\mathcal I, \mathbb C^{m,n})\) with \(\operatorname{rank}E(t)=r\) for all \(t\in \mathcal I\). Then there exist pointwise unitary (and, thus, nonsingular) matrix functions \(U\in \mathcal C^\ell(\mathcal I, \mathbb C^{m,m})\) and \(V\in \mathcal C^\ell(\mathcal I, \mathbb C^{n,n})\), such that

\[ U^HEV = \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} \] with pointwise nonsingular \(\Sigma \in \mathcal C^\ell(\mathcal I, \mathbb C^{r,r})\).

Theorem 4.3 Let \(E, A \in \mathcal C^\ell(\mathcal I, \mathbb C^{m,n})\) be sufficiently smooth and suppose that \[\begin{equation} r(t) = r, \quad a(t)=a, \quad s(t)=s \tag{4.8} \end{equation}\]

holds for the local characteristic values of \((E(t), A(t))\) for all \(t\in \mathcal I\). Then \((E, A)\) is globally equivalent to the canonical form \[\begin{equation} \left( \begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & I_d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & A_{12}& 0 & A_{14} \\ 0 & 0 & 0 & A_{24} \\ 0 & 0 & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \right ). \tag{4.9} \end{equation}\]

All entries are again matrix functions on \(\mathcal I\) and the last block column in both matrix functions of (4.9) has size \(u=n-s-d-a\).

Proof. In what follows, we tacitly redefine the block matrix entries that appear after each global equivalence transformation. The first step is the continuous SVD of \(E\); see Theorem 4.2. Throughout, the two basic operations of

  • condensing blocks by the continuous SVD, e.g. \(U_2^HA_{31}V_2=\begin{bmatrix} I_s & 0 \\ 0 & 0 \end{bmatrix}\)
  • and eliminating blocks by adding multiples of columns or rows

are applied repeatedly:

\[\begin{align*} (E,A) & \sim \left(\begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\right) \\ %%%%%%%%%%%%%%%%%%%%%%%%%%
& \sim \left(\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\right) \\ %%%%%%%%%%%%%%%%%%%%%%
& \sim \left(\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} A_{11} & A_{12}V_1 \\ U_1^HA_{21} & U_1^HA_{22}V_1 \end{bmatrix} - \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 0 & \dot V_1 \end{bmatrix} \right) \\ %%%%%%%%%%%%%%%%%%%%%%
& \sim \left(\begin{bmatrix} I_r & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} A_{11} & A_{12} & A_{13}\\ A_{21} & I_a & 0 \\ A_{31} & 0 & 0 \end{bmatrix}\right) \\ %%%%%%%%%%%%%%%%%%%%%%
& \sim \left(\begin{bmatrix} V_2 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} A_{11}V_2 & A_{12} & A_{13}\\ A_{21}V_2 & I_a & 0 \\ U_2^HA_{31}V_2 & 0 & 0 \end{bmatrix} - \begin{bmatrix} I_r & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} \dot V_2 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \right)\\ %%%%%%%%%%%%%%%%%%%%%%
& \sim \left(\begin{bmatrix} V_{11} & V_{12} & 0 & 0 \\ V_{21} & V_{22} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} A_{11} & A_{12} & A_{13} & A_{14} \\ A_{21} & A_{22} & A_{23} & A_{24} \\ A_{31} & A_{32} & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \right)\\ %%%%%%%%%%%%%%%%%%%%%%
& \sim \left(\begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & I_d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & A_{12} & A_{13} & A_{14} \\ 0 & A_{22} & A_{23} & A_{24} \\ 0 & A_{32} & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & I_d & 0 & 0 \\ 0 & -A_{32} & I_a & 0 \\ 0 & 0 & 0 & I_u \end{bmatrix} \right) \\ %%%%%%%%%%%%%%%%%
& \sim \left(\begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & I_d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & A_{12} & A_{13} & A_{14} \\ 0 & A_{22} & A_{23} & A_{24} \\ 0 & 0 & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\right)\\ %%%%%%%%%%%%%%%%%
& \sim \left(\begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & I_d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & A_{12} & 0 & A_{14} \\ 0 & A_{22} & 0 & A_{24} \\ 0 & 0 & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\right)\\ %%%%%%%%%%%%%%%%%
& \sim \left(\begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & Q_2 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & A_{12}Q_2 & 0 & A_{14} \\ 0 & A_{22}Q_2-\dot Q_2 & 0 & A_{24} \\ 0 & 0 & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\right) \\ %%%%%%%%%%%%%%%%%
& \sim \left(\begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & I_d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & A_{12} & 0 & A_{14} \\ 0 & 0 & 0 & A_{24} \\ 0 & 0 & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\right), \end{align*}\] where the final equivalence holds if \(Q_2\) is chosen as the (unique and pointwise invertible) solution of the linear matrix-valued ODE \[ \dot Q_2 = A_{22}(t)Q_2 , \quad Q_2 (t_0 ) = I_d. \] Then the prefinal \(A_{22}\)-block vanishes because of the special choice of \(Q_2\), and \(E_{22}\) becomes \(I_d\) after scaling the second block row by \(Q_2^{-1}\).
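The last step is constructive: \(Q_2\) is obtained by integrating a linear matrix-valued ODE. A scipy sketch with a hypothetical coefficient \(A_{22}(t)\) of size \(d=2\); Liouville's formula explains why \(Q_2\) stays pointwise invertible:

```python
import numpy as np
from scipy.integrate import solve_ivp

d = 2
A22 = lambda t: np.array([[0.0, 1.0], [-1.0, 0.1 * t]])  # hypothetical A22(t)

def rhs(t, q):
    # matrix ODE dQ2/dt = A22(t) Q2, flattened for solve_ivp
    return (A22(t) @ q.reshape(d, d)).ravel()

sol = solve_ivp(rhs, (0.0, 1.0), np.eye(d).ravel(), rtol=1e-10, atol=1e-12)
Q2 = sol.y[:, -1].reshape(d, d)

# by Liouville's formula, det Q2(t) = exp(integral of trace A22(s) ds) != 0,
# so Q2 is pointwise invertible
print(np.linalg.det(Q2))
```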

If we write down the transformed DAE that corresponds to the canonical form (4.9), i.e. \[\begin{align} \dot x_1 &= A_{12}(t)x_2 + A_{14}(t)x_4 + f_1(t) \tag{4.10} \\ \dot x_2 &= A_{24}(t)x_4 + f_2(t) \tag{4.11}\\ 0 &= x_3 + f_3(t) \\ 0 &= x_1 + f_4(t) \tag{4.12} \\ 0 &= f_5(t) \end{align}\] we can read off a few properties:

  1. the part \(x_4\) is free to choose, i.e. the undetermined part
  2. the equation \(f_5=0\) does not define any variable, i.e. it is the vanishing or redundant part
  3. the part \(x_2\) is defined through an ODE (in this representation)
  4. however, the part \(x_1\) is strange (both differential and algebraic) and still linked to \(x_2\).
Corollary 4.1 In fact, one may differentiate (4.12) and eliminate \(\dot x_1\) in (4.10) to obtain \[ -\dot f_4(t) = A_{12}(t)x_2(t) + A_{14}(t)x_4(t) + f_1(t), \] which together with (4.11) becomes a new DAE for \(x_2\): \[ \bar E(t) \dot x_2(t) = \bar A(t) x_2(t) + \bar f(t) \] with \[\begin{equation} \bar E(t) = \begin{bmatrix} I_{d_0} \\ 0_{s_0, d_0} \end{bmatrix}\in \mathbb C^{d_0+s_0, d_0}, \quad \bar A(t) = \begin{bmatrix} 0_{d_0} \\ A_{12}(t) \end{bmatrix} \in \mathbb C^{d_0+s_0, d_0}, \tag{4.13} \end{equation}\] and \[ \bar f(t) = \begin{bmatrix} A_{24}(t)x_4(t) + f_2(t) \\ A_{14}(t)x_4(t)+f_1(t)+\dot f_4(t) \end{bmatrix} \in \mathbb C^{d_0+s_0}. \] Here, the subscript \(0\) indicates that these \(d\) and \(s\) quantities are the characteristic values of the initial matrix pair \((E, A)\). After this differentiation step, one can calculate the characteristic values \(d_1\), \(a_1\), \(s_1\) again for the pair \((E_1, A_1)\), which is obtained from the canonical form (4.9) by replacing equations (4.10), (4.11) by the DAE with \((\bar E, \bar A)\) as in (4.13).

The following theorem states that this differentiation-elimination step (which is not a global equivalence operation on matrix pairs) is well-defined in the sense that the characteristic values of the next iterate are invariant under global equivalence transformations.

Theorem 4.4 Assume that the pairs \((E, A)\) and \((\tilde E, \tilde A)\) are globally equivalent and in the global canonical form (4.9). Then the pairs \((E_1, A_1)\) and \((\tilde E_1, \tilde A_1)\) that are obtained by differentiation and elimination as described in Corollary 4.1 are globally equivalent too.
Proof. See Kunkel/Mehrmann: Theorem 3.14.

4.3 The Strangeness Index

Theorem 4.4 comes with a number of implications:

  • Starting with \((E, A):=(E_0, A_0)\), we can define \((E_i, A_i)\), \(i \in \mathbb N \cup \{0\}\), as follows:
    1. bring \((E_i, A_i)\) into the global canonical form (4.9),
    2. differentiate and eliminate as in Corollary 4.1 and bring the obtained pair into global canonical form to obtain \((E_{i+1}, A_{i+1})\).
  • This gives a sequence of triples \((r_i, a_i, s_i)\) that are invariant under global equivalence transformations.
  • Since \(r_{i+1} = r_i - s_i\) and \(r_i \geq 0\) (the rank of a matrix is always nonnegative), there exists a \(\mu \in \mathbb N \cup \{0\}\) for which \(s_\mu = 0\).
  • This \(\mu\) is also characteristic for a matrix pair (because \(r_i\) and \(s_i\) are).

With these observations, the following definition is well-posed.

Definition 4.3 Let \((E,A)\) be a pair of sufficiently smooth matrix functions and let the sequence \((r_i, a_i, s_i)\), \(i=0,1,2,\dotsc\), of global characteristic values for the pairs \((E_i, A_i)\) that are generated as

  • \((E_0, A_0):= (E, A)\)
  • \((E_{i+1}, A_{i+1})\) is derived from bringing \((E_i, A_i)\) into the global canonical form as in Theorem 4.3 and removing the \(I_s\) block in \(E_{i+1}\) through differentiation and elimination as in Corollary 4.1
be well-defined. Then, the quantity \[ \mu := \min\{i\in \mathbb N_0 | s_i=0\} \] is called the strangeness index of the DAE (4.1). If \(\mu=0\), then the DAE is called strangeness-free.

The practical implications of the strangeness index and the procedure of its derivation are laid out in the following theorem.

Theorem 4.5 Let the strangeness index \(\mu\) of \((E, A)\) be well defined and let \(f\in \mathcal C^\mu(\mathcal I, \mathbb C^{m})\). Then the DAE (4.1) is equivalent (in the sense that the solution sets are in a one-to-one correspondence via a pointwise nonsingular matrix function) to a DAE of the form \[\begin{align} \dot x_1(t) &= A_{13}(t)x_3(t) + f_1(t) \tag{4.14}\\ 0 &= x_2(t) + f_2(t) \tag{4.15}\\ 0 &= f_3(t), \tag{4.16} \end{align}\] where \(A_{13} \in \mathcal C(\mathcal I, \mathbb C^{d_\mu, u_\mu})\) and where \(f_1\), \(f_2\), \(f_3\) are defined through \(f, \dot f, \dotsc, f^{(\mu)}\).
System (4.14)–(4.16) is in the form of (4.9) with the \(I_s\) blocks not present and the remaining parts of the variables, coefficients, and right-hand side renumbered accordingly.

Corollary 4.2 Let the strangeness index \(\mu\) of \((E, A)\) be well defined and let \(f\in \mathcal C^\mu(\mathcal I, \mathbb C^{m})\). Then

  1. The DAE (4.1) is solvable if and only if the \(v_\mu\) consistency conditions (4.16) \[ 0=f_3(t) \] are fulfilled.

  2. An initial condition (4.2) is consistent if, and only if, the \(a_\mu\) conditions \[ 0=x_2(t_0) + f_2(t_0) \] related to (4.15) hold.

  3. The corresponding initial value problem (4.1)(4.2) is uniquely solvable if, and only if, in addition \(u_\mu=0\), i.e., \(x_3\) is not present in (4.14).

Just by comparing the solvability conditions with those for the constant coefficient case, e.g. Theorem 3.3, we can observe that

  1. \(\mu \sim \nu -1\) (unless \(\nu =0\)), and
  2. a matrix pair is regular if \(u_\mu = v_\mu = 0\).

4.4 Derivative Arrays

The concept and the derivation of the strangeness index give a complete characterization of the solvability of linear DAEs with time-varying coefficients (provided sufficient smoothness of the coefficients and the right-hand side). However, for practical purposes there are two shortcomings:

  1. The formulation through the canonical form is very implicit and requires the derivatives of computed quantities (like the \(\dot V_2\) in the proof of Theorem 4.3).

  2. There is no direct generalization to nonlinear systems.

Both issues are better addressed by deriving a strangeness-free form through so-called derivative arrays.

For that, we consider the DAE \[ E(t) \dot x(t) = A(t) x(t)+f(t), \] differentiate it, \[ E(t) \ddot x(t) + \dot E(t) \dot x(t) = \dot A(t) x(t) + A(t) \dot x(t)+\dot f(t), \] and add these equations to the system to obtain the inflated DAE \[ \begin{bmatrix} E & 0\\ \dot E - A & E \end{bmatrix} \frac{d}{dt} \begin{bmatrix} x \\ \dot x \end{bmatrix} = \begin{bmatrix} A & 0\\ \dot A & 0 \end{bmatrix} \begin{bmatrix} x \\ \dot x \end{bmatrix} + \begin{bmatrix} f \\ \dot f \end{bmatrix}. \]

If we add also the second derivative of the equations, we obtain \[ \begin{bmatrix} E & 0 & 0\\ \dot E - A & E &0 \\ \ddot E - 2\dot A & 2\dot E -A &E \end{bmatrix} \frac{d}{dt} \begin{bmatrix} x \\ \dot x \\ \ddot x \end{bmatrix} = \begin{bmatrix} A & 0 & 0\\ \dot A & 0 & 0\\ \ddot A & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ \dot x \\ \ddot x \end{bmatrix} + \begin{bmatrix} f \\ \dot f \\ \ddot f \end{bmatrix}. \]

If we do this \(\ell\) times, we arrive at the so-called derivative array

Definition 4.4 Consider the DAE (4.1) and let \(E\), \(A\), \(f\) be \(\ell\)-times differentiable for some integer \(\ell\geq 0\). Then the derivative array of order \(\ell\) is given as \[\begin{equation} M_\ell (t) \dot z_\ell(t) = N_\ell (t) z_\ell(t) + g_\ell(t), \end{equation}\] where \[ (M_\ell)_{i,j} = \binom{i}{j}E^{(i-j)} - \binom{i}{j+1}A^{(i-j-1)}, \quad i,j=0,\dotsc,\ell, \]

where \[ (N_\ell)_{i,j} = \begin{cases} A^{(i)}, \quad &\text{for } i=0,\dotsc, \ell, \, j=0 \\ 0, \quad & \text{else} \end{cases}, \]

and where

\[ z_\ell = \begin{bmatrix} x \\ \dot x \\ \vdots \\ x^{(\ell)} \end{bmatrix}, \quad g_\ell = \begin{bmatrix} f \\ \dot f \\ \vdots \\ f^{(\ell)} \end{bmatrix}. \]
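The inflated pairs follow a fixed pattern, so \(M_\ell\) and \(N_\ell\) can be assembled mechanically. A sympy sketch of Definition 4.4 (the function names are ours):

```python
import sympy as sp

def dmat(M, t, k):
    """k-th elementwise time derivative of a sympy Matrix (k = 0: M itself)."""
    return M if k == 0 else M.diff(t, k)

def derivative_array(E, A, t, ell):
    """Sketch: assemble (M_ell, N_ell) of Definition 4.4 for symbolic E(t), A(t)."""
    m, n = E.shape
    M = sp.zeros((ell + 1) * m, (ell + 1) * n)
    N = sp.zeros((ell + 1) * m, (ell + 1) * n)
    for i in range(ell + 1):
        # only the first block column of N_ell is nonzero: A^(i)
        N[i * m:(i + 1) * m, 0:n] = dmat(A, t, i)
        for j in range(i + 1):
            blk = sp.binomial(i, j) * dmat(E, t, i - j)
            if i - j - 1 >= 0:  # the A-term only enters for j < i
                blk = blk - sp.binomial(i, j + 1) * dmat(A, t, i - j - 1)
            M[i * m:(i + 1) * m, j * n:(j + 1) * n] = blk
    return M, N

# reproduces the inflated pair of order 1 for Example 4.2
t = sp.symbols("t")
E = sp.Matrix([[0, 0], [1, -t]])
A = sp.Matrix([[-1, t], [0, 0]])
M1, N1 = derivative_array(E, A, t, 1)
print(M1)  # blocks [[E, 0], [E' - A, E]]
```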
  • By construction, if \(x\) solves the DAE (4.1) and is \(\ell\)-times differentiable, then \(z_\ell=(x, \dot x, \dotsc, x^{(\ell)})\) solves the derivative array; conversely, the first \(n\) components of any solution of the derivative array solve the DAE.
  • The strangeness-free form from above is an equivalent system too, with \(d_\mu\) differential relations, \(a_\mu\) algebraic equations, and \(v_\mu\) redundant (or consistency) relations.
  • Next, we will show that from the derivative array we can extract \(d_\mu\) differential and \(a_\mu\) well-separated algebraic relations for \(x\), i.e. an equivalent strangeness-free form.

The following theorem connects the derivative array to the strangeness index and provides a strangeness-free reformulation of the DAE (4.1).

Theorem 4.6 Let the strangeness index of the pair \((E, A)\) of matrix-valued functions be well defined according to Definition 4.3 with the global invariants \(d_\mu\), \(a_\mu\), \(v_\mu\). Then for the derivative array as defined in Definition 4.4 it holds that

  1. \(\operatorname{corank}M_{\mu+1}- \operatorname{corank}M_\mu = v_\mu\)

  2. \(\operatorname{rank}M_\mu(t) = (\mu+1)m-a_\mu - v_\mu\) on \(\mathcal I\), and there exists a smooth matrix function \(Z_{2,3}\) (that spans the left null space of \(M_\mu\)) with \[ Z_{2,3}^H M_\mu(t) =0. \]

  3. The matrix function \(Z_{2,3}\) can be partitioned into two parts:
    • \(Z_3\), spanning the left null space of the pair \((M_\mu, N_\mu)\), so that \(Z_3^HN_\mu(t) =0\);
    • \(Z_2\), such that \[ Z_2^HN_\mu \begin{bmatrix} I_n \\ 0 \\ \vdots \\ 0 \end{bmatrix} = Z_2^H \begin{bmatrix} A \\ \dot A \\ \vdots \\ A^{(\mu)} \end{bmatrix} =:\hat A_2 \] has full row rank \(a_\mu\).
  4. Furthermore, let \(T_2\) be a smooth matrix function such that \(\hat A_2 T_2=0\) (a basis of the right null space of \(\hat A_2\)). Then \[ \operatorname{rank}E(t) T_2 = d_\mu \] and there exists a smooth matrix function \(Z_1\colon \mathcal I \to \mathbb C^{m,d_\mu}\) with \[\operatorname{rank}(Z_1^HE) = d_\mu.\] We define \(Z_1^HE=\hat E_1\).
Proof. Kunkel/Mehrmann: Thm. 3.29, Thm. 3.30, Thm. 3.32.

A few observations and implications:

  • \(Z_{2,3}^H\) operates on the derivative array \[ M_\ell (t) \dot z_\ell(t) = N_\ell (t) z_\ell(t) + g_\ell(t), \]

  • Specifically, it picks out the redundant equations (via \(Z_3^H\)) and all explicit and hidden constraints (via \(Z_2^H\)).
  • \(Z_1^H\) operates on the original system \[ E(t)\dot x = A(t)x + f(t) \] and picks out the dynamic part.

By means of the projections defined in Theorem 4.6, one can define (Kunkel/Mehrmann: Theorem 3.32) a strangeness-free condensed form

\[\begin{align} \hat E_1(t) \dot x(t) &= \hat A_{1}(t)x(t) + \hat f_1(t) \tag{4.17} \\ 0 &= \hat A_2(t) x(t) + \hat f_2(t) \tag{4.18}\\ 0 &= \hat f_3(t), \tag{4.19} \end{align}\]

where \[ \hat E_1(t) := Z_1(t)^HE(t) \in \mathbb C^{d_\mu, n}, \quad \hat A_1(t) := Z_1(t)^HA(t)\in \mathbb C^{d_\mu, n}, \quad \hat f_1(t) = Z_1(t)^Hf(t) \in \mathbb C^{d_\mu}, \] where \[ \hat A_2(t) = Z_2(t)^H \begin{bmatrix} A(t) \\ \dot A(t) \\ \vdots \\ A^{(\mu)}(t) \end{bmatrix} \in \mathbb C^{a_\mu, n}, \quad \hat f_2(t) = Z_2(t)^Hg_\mu(t) \in \mathbb C^{a_\mu}, \] and where \[ \hat f_3(t) = Z_3(t)^Hg_\mu(t) \in \mathbb C^{v_\mu}. \]
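At a fixed time \(t\), the projections of Theorem 4.6 can be extracted from a numerical evaluation of the derivative array by a handful of SVDs. A numpy sketch under the same tolerance caveats as before (helper and variable names are ours):

```python
import numpy as np

def svd_bases(M, tol=1e-10):
    """Orthonormal bases of range(M), left null space of M, and kernel(M)."""
    u, s, vh = np.linalg.svd(M)
    r = int((s > tol).sum())
    return u[:, :r], u[:, r:], vh[r:].conj().T

def strangeness_free_projections(M_mu, N_mu, E):
    """Sketch of Theorem 4.6 at a fixed t: extract Z1, Z2, Z3 and the
    coefficients E1hat, A2hat of (4.17)-(4.18) from M_mu, N_mu, and E."""
    m, n = E.shape
    _, Z23, _ = svd_bases(M_mu)         # Z23^H M_mu = 0
    W = Z23.conj().T @ N_mu[:, :n]      # = Z23^H [A; A'; ...; A^(mu)]
    Wr, Wl, _ = svd_bases(W)
    Z3 = Z23 @ Wl                       # Z3^H N_mu = 0: redundant equations
    Z2 = Z23 @ Wr                       # Z2^H N_mu[:, :n] has full row rank
    A2hat = Z2.conj().T @ N_mu[:, :n]   # algebraic part, cf. (4.18)
    _, _, T2 = svd_bases(A2hat)         # A2hat @ T2 = 0
    Z1, _, _ = svd_bases(E @ T2)        # d_mu columns since rank(E T2) = d_mu
    E1hat = Z1.conj().T @ E             # differential part, cf. (4.17)
    return E1hat, A2hat, Z1, Z2, Z3
```

The right-hand sides then follow as \(\hat f_1 = Z_1^Hf\), \(\hat f_2 = Z_2^Hg_\mu\), and \(\hat f_3 = Z_3^Hg_\mu\), as in the display above.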

Remark: This condensed equivalent strangeness-free form (4.17)–(4.19) comes with a number of advantages over the one defined in Theorem 4.5.

  1. It can be derived by differentiating only the given functions \(E\), \(A\), and \(f\). This is much preferable since a given function can be evaluated with high accuracy (which is needed for computing derivatives numerically) or can even be differentiated analytically.

  2. It is formulated in the original variable \(x\) – no state transformation involved.

  3. It can be generalized to nonlinear systems.

Example 1 Cf. Examples 3.33 and 3.34 in Kunkel/Mehrmann, which provide the strangeness-free forms for the motivating examples of this section.

Example 2 Compute the forms for the incompressible Navier-Stokes equations (to be provided).