3 Linear DAEs with Constant Coefficients

3.1 Basic Notions and Definitions

Consider the DAE in the form \[\begin{equation} E \dot x (t) = Ax(t) + f(t), \tag{3.1} \end{equation}\] where \(E\), \(A\in \mathbb R^{m,n}\) and \(f\in \mathcal C(\mathcal I, \mathbb R^m)\) with, possibly, an initial condition \[\begin{equation} x(t_0) = x_0 \in \mathbb R^{n}. \tag{3.2} \end{equation}\]

For utmost generality, we consider the case that \(m\neq n\), i.e. the number of equations does not meet the number of unknowns, but we will turn to the square case of \(m=n\) soon.

3.1.1 Scalings and State Transformations

One can confirm that if \(x\) is a solution to (3.1) and \(P\in \mathbb R^{m,m}\) is invertible, then \(x\) is a solution to \[\begin{equation*} PE \dot x (t) = PAx(t) + Pf(t). \end{equation*}\]

This is a scaling of the equations.

Similarly, if \(Q\in \mathbb R^{n,n}\) is invertible, then \(\tilde x := Q^{-1}x\) solves

\[\begin{equation*} E Q \dot {\tilde x} (t) = AQ \tilde x(t) + f(t). \end{equation*}\]

This is a state transformation of the system.

Thus, when talking of solvability of (3.1), one may equivalently consider any regular \(Q\in \mathbb R^{n,n}\), \(P\in \mathbb R^{m,m}\) and the scaled and transformed system \[\begin{equation} \tilde E \dot{\tilde x}(t) = \tilde A \tilde x (t) + \tilde f(t), \quad \tilde x(0) = Q^{-1}x_0, \end{equation}\] where \(\tilde E = PEQ\), \(\tilde A = PAQ\), \(\tilde f = Pf\), and \(x=Q\tilde x\).

To characterize all scalings and state transformations, we define these operations as relations of matrix pairs:

3.1.2 Strong Equivalence and Canonical Forms

Definition 3.1 Two pairs of matrices \((E_1, A_1)\) and \((E_2, A_2)\), with \(E_1\), \(A_1\), \(E_2\), \(A_2 \in \mathbb R^{m,n}\), are called strongly equivalent, if there exist regular matrices \(P\in \mathbb R^{m,m}\), \(Q\in \mathbb R^{n,n}\) such that \[\begin{equation*} E_2 = PE_1Q, \quad A_2 = PA_1Q. \end{equation*}\] In this case, we write \[(E_1, A_1) \sim (E_2, A_2).\]
Lemma 3.1 The relation \(\sim\) defined in Definition 3.1 defines an equivalence relation1.
Proof. Exercise.

For a given equivalence relation on a set, one can define equivalence classes by considering all members that are equivalent to each other as basically the same. And for each class one may choose a representative, preferably in canonical form, i.e. a form that, e.g.,

  1. comes with an simple or characteristic representation and
  2. that allows for easy determination or analysis of quantities of interest.

There can be infinitely many canonical forms. For our purposes and for the strong equivalence of matrix pairs, we will use the Kronecker Canonical Form.

Theorem 3.1 Let \(E\), \(A \in \mathbb C^{m,n}\). Then there exist nonsingular matrices \(P\in \mathbb C^{m,m}\), \(Q\in \mathbb C^{n,n}\) such that for all \(\lambda \in \mathbb C\) \[\begin{equation*} P(\lambda E -A)Q = \begin{bmatrix} \mathcal L_{\epsilon_1} \\ & \ddots \\ && \mathcal L_{\epsilon_p} \\ &&& \mathcal M_{\eta_1} \\ &&&& \ddots \\ &&&&& \mathcal M_{\eta_q} \\ &&&&&& \mathcal J_{\rho_1} \\ &&&&&&& \ddots \\ &&&&&&&& \mathcal J_{\rho_r} \\ &&&&&&&&& \mathcal N_{\sigma_1} \\ &&&&&&&&&& \ddots \\ &&&&&&&&&&& \mathcal N_{\sigma_s} \end{bmatrix} \end{equation*}\] Where the block entries are as follows:

  1. Every entry \(\mathcal L_{\epsilon_j}\) is bidiagonal of size \(\epsilon_j\times (\epsilon_j +1)\), \(\epsilon_j \in \mathbb N \cup \{0\}\) of the form \[\begin{equation*} \lambda \begin{bmatrix} 0 & 1 \\ & \ddots & \ddots \\ && 0 & 1 \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ & \ddots & \ddots \\ && 1 & 0 \end{bmatrix} \end{equation*}\]
  2. Every entry \(\mathcal M_{\eta_j}\) is bidiagonal of size \((\eta_j+1)\times \eta_j\), \(\eta_j \in \mathbb N \cup \{0\}\) of the form \[\begin{equation*} \lambda \begin{bmatrix} 1 \\ 0 & \ddots \\ & \ddots & 1 \\ && 0 \end{bmatrix} - \begin{bmatrix} 0 \\ 1 & \ddots \\ & \ddots & 0 \\ && 1 \end{bmatrix} \end{equation*}\]
  3. Every entry \(\mathcal J_{\rho_j}\) is a Jordan block of size \(\rho_j\times \rho_j\), \(\rho_j \in \mathbb N \setminus \{0\}\), \(\lambda_j \in \mathbb C\) of the form \[\begin{equation*} \lambda \begin{bmatrix} 1 \\ & \ddots \\ &&\ddots \\ &&& 1 \end{bmatrix} - \begin{bmatrix} \lambda_j & 1 \\ & \ddots & \ddots \\ &&& 1 \\ &&& \lambda_j \end{bmatrix} \end{equation*}\]
  4. Every entry \(\mathcal N_{\sigma_j}\) is a nilpotent block of size \(\sigma_j\times \sigma_j\), \(\sigma_j \in \mathbb N \setminus \{0\}\), of the form \[\begin{equation*} \lambda \begin{bmatrix} 0 &1\\ & \ddots & \ddots \\ &&& 1 \\ &&& 0 \end{bmatrix} - \begin{bmatrix} 1 \\ & \ddots \\ &&& \\ &&&& 1 \end{bmatrix} \end{equation*}\]
The Kronecker Canonical Form is uniquely defined up to permutations of the blocks.
Proof. Very technical. Can be found, e.g., in the book: Gantmacher (1959) The Theory of Matrices II.

Algorithm for computations exist2 but the computation is notoriously ill posed.

3.2 Regularity and Solvability

In what follows we will consider the regular case with, among others, \(E\) and \(A\) as square matrices.

Like for solving general equation systems, one can expect well posedness for the case that the numbers of equations equals the number of unknowns. Here squareness of the system is a necessary condition. For sufficiency one further needs that there are no redundant equations or incompatible equations. This is encoded in the regularity.

Definition 3.2 Let \(E\), \(A \in \mathbb C^{n,n}\). The matrix pair \((E, A)\) is called regular, if the characteristic polynomial \(p\), defined via \[ p(\lambda) = \det (\lambda E- A), \] is not identically zero. If such a matrix pair is not regular, it is called singular.

Is not identically zero means that there exists a \(\lambda_0\) such that \(p(\lambda_0)=\det (\lambda_0 E - A) \neq 0\), i.e. \(\lambda_0 E - A\) is invertible.

Since a characteristic polynomial has but a finite numbers of roots (unless it is the zero polynomial), is not identically zero means that \(p(\lambda) \neq 0\) for all but a few \(\lambda\).

Next we show, that regularity is invariant under strong equivalence. This is needed, since we will translate regularity of \((E,A)\) to regularity of the associated DAE and we need to ensure that regular scalings and state transformations do not affect regularity.

Lemma 3.2 Let \(E\), \(A \in \mathbb C^{n,n}\). If \((E, A)\) is strongly equivalent to a regular matrix pair, then \((E, A)\) is regular.
Proof. Let \(E_1\), \(A_1 \in \mathbb C^{n, n}\) be similar to \((E, A)\). By definition, there exist invertible \(P\), \(Q\in \mathbb C^{n,n}\) such that \(\lambda E - A = P(\lambda E_1 -A_1)Q\) for all \(\lambda\). Thus, \[ \det (\lambda E - A) = \det P \det (\lambda E_1 - A_1) \det Q \] is not identically zero, since \(\det Q\) and \(\det P\) are not zero and \(\det (\lambda E_1 - A_1)\) is not the zero polynomial.

With that we can derive a canonical form for strongly equivalent matrix pairs.

Theorem 3.2 (Weierstrass canonical form) Let \(E\), \(A \in \mathbb C^{n,n}\) and \((E,A)\) be regular. Then \[\begin{equation} (E, A) \sim \left ( \begin{bmatrix} I_{n_1} \\ & N \end{bmatrix} , \begin{bmatrix} J \\ & I_{n_2} \end{bmatrix} \right ), \tag{3.3} \end{equation}\]

  • where \(n_1\), \(n_2 \in \mathbb N\) such that \(n=n_1+n_2\),
  • where \(I_{n_1}\) and \(I_{n_2}\) denote the identity matrices of size \(n_1\times n_1\) and \(n_2\times n_2\), respectively,
  • where \(J\in \mathbb C^{n_1,n_1}\) is in Jordan canonical form,
  • and where \(N \in \mathbb C^{n_2,n_2}\) is a nilpotent matrix.
  • Moreover, it is allowed that the one or the other block is not present, i.e., \(n_1\) or \(n_2\) can be zero.
Proof. To be provided.

Recall that the Jordan canonical form can be achieved for any square matrix \(M\in \mathbb C\) by a similarity transformation.

Definition 3.3 (Nilpotent Matrix) A matrix \(M\in \mathbb C^{n,n}\) is called nilpotent, if there is an integer \(k\) such that \(M^k=0\). The smallest such integer index, i.e. that \(\nu \in \mathbb N\) such that \(N^\nu=0\) whereas \(N^{\nu-1} \neq 0\) is called the index of nilpotency of \(M\).

With the convention that \(\mathbf 0^0=I\), the zero matrix \(\mathbf 0 \in \mathbb R^{n,n}\) is of (nilpotency) index 1.

The relation of solvability and regularity of DAEs becomes evident in the canonical form of Theorem 3.2. In fact, it states that through regular scalings and state transforms, any DAE with

\[ E\dot x(t) = A x(t) + f(t) \]

with \((E,A)\) regular can be transformed and split into \[\begin{equation} \dot x_1(t) = J x_1(t) + f_1(t) \tag{3.4} \end{equation}\] and \[\begin{equation} N \dot x_2(t) = x_2(t) + f_2(t) \tag{3.5}, \end{equation}\]


  • into an ODE (3.4) that already is in Jordan canonical form
  • and a separated(!) DAE (3.5) of a particular type.

Since linear ODEs always have a unique solution for any initial value, solvability of a general linear DAE with constant, regular coefficients will be completely defined by solvability of the special DAE part (3.5).

3.3 Solution to the N-DAE, Regularity, and Index of a Matrix Pair

In what follows we will consider the special DAE \[\begin{equation} N \dot x(t) = x(t) + f(t) \tag{3.6} \end{equation}\] with \(N \in \mathbb R^{n,n}\) nilpotent with \(\nu\) being the index of nilpotency. For this DAE there is an explicit solution formula:

Lemma 3.3 Consider (3.6). If \(f \in \mathcal C^\nu(\mathcal I, \mathbb R^{n})\), \(n\geq 1\), where \(\nu\) is the index of nilpotency of \(N\), then (3.6) has a unique solution given as \[\begin{equation} x(t) = - \sum_{i=0}^{\nu-1}N^if^{(i)}(t), \tag{3.7} \end{equation}\] where \(f^{(i)}\) denotes the \(i\)-th derivative of \(f\).

Proof. There are a few ways to prove the explicit form (3.7)

  1. Bring \(N\) into Jordan canonical form and prove the formula for the Jordan blocks of arbitrary size by induction.

  2. Write (3.6) as \[(N\frac{d}{dt}-I)x = f\] and show that3 \[(N\frac{d}{dt}-I)^{-1} = - \sum_{i=0}^{\nu-1}N^i\frac{d^i}{dt^i}\].

  3. We take the direct approach as it can be found in the book by Dai4:

Firstly, we observe that \[ x = N\dot x - f. \] Secondly, that (having multiplied by \(N\) and differentiated once) \[ N\dot x = N^2 \ddot x - N \dot f. \] And, finally, that (having multiplied \(k\)-times by \(N\) and differentiated \(k\)-times) \[ N^k x^{(k)} = N^{k+1} x^{(k+1)} - N^k f^{(k)}. \] If one successively replaces \(N^kx^{(k)}=N^{k+1}x^{(k+1)}-N^kf^{(k)}\), \(k=1,2,...,\nu-1\) in \[ x=N\dot x -f = N^2\ddot x - N\dot f -f = \dots = N^\nu x^{(\nu)} - \sum_{i=0}^{\nu-1} N^if^{(i)}, \] with \(N^\nu=0\), one arrives at formula (3.7).

Since this construction holds for any solution, uniqueness is guaranteed too.

We make three important observations here.

  1. The solution \(x\) to (3.6) is uniquely defined without specifying a value at \(t_0\). Vice versa, an initial value \(x_0\) is consistent if, and only if, \[ x_0 = - \sum_{i=0}^{\nu-1}N^if^{(i)}(t_0) \]

  2. The definition of the solution \(x\) requires \(f\) to be \(\nu-1\)-times differentiable. In order to be a solution according to Definition 2.1, the function \(x\) itself has to be differentiable too. Hence the requirement \(f \in \mathcal C^\nu(\mathcal I, \mathbb R^{n})\).

  3. The index of nilpotency of \(N\) defines the necessary smoothness of the right hand side.

The index of nilpotency in the \(N\) part of the Weierstrass canonical form is characteristic for a matrix pair and defines an index of the associated DAE.

Definition 3.4 Consider a regular matrix pair \((E, A)\) and its Weierstrass canonical form, i.e. \[ (E, A) \sim \left ( \begin{bmatrix} I_{n_1} \\ & N \end{bmatrix} , \begin{bmatrix} J \\ & I_{n_2} \end{bmatrix} \right ). \] The index \(\nu\) of nilpotency of \(N\) is called the index of the matrix pair \((E, A)\) and we write \(\nu = \operatorname{ind}(E, A)\). If \(N\) is not present, then we set \(\operatorname{ind}(E,A)=0\).

Furthermore, for a nilpotent matrix \(N\), we will occasionally use the notion \(\nu=\operatorname{ind}(N,I)\) to refer to its index of nilpotency. X: Is this OK, i.e. consistent?

To be on the safe side and to learn how to handle block structured matrices in the analysis, we confirm that this definition of the index is well-posed, i.e. for any regular matrix \((E,A)\) there is a unique index \(\nu\).

If one considers the Weierstrass canonical form as a special case of the Kronecker canonical form, then the statement that the canonical form is well-posed up to permutations in the order of the blocks already implies that the index is well defined (since it only depends on the size of the largest nilpotent block but not in which order it appears).

Lemma 3.4 Suppose that the regular matrix pair \((E, A)\), \(E\), \(A\in \mathbb R^{n,n}\) has the two canonical forms \[ (E, A) \sim \left ( \begin{bmatrix} I_{d_1} \\ & N_1 \end{bmatrix} , \begin{bmatrix} J_1 \\ & I_{n-d_1} \end{bmatrix} \right ) \sim \left ( \begin{bmatrix} I_{d_2} \\ & N_2 \end{bmatrix} , \begin{bmatrix} J_2 \\ & I_{n-d_2} \end{bmatrix} \right ), \] where \(d_1\), \(d_2\) are the size of the Jordan blocks \(J_1\), \(J_2\), respectively. Then \(d_1=d_2=:d\) and the indices of nilpotency of the nilpotent blocks coincide, i.e. \(\operatorname{ind}(N_1, I_{n-d}) = \operatorname{ind}(N_2, I_{n-d})\).

Proof. To show that \(d_1=d_2\), without loss of generality, we can assume that \(N_i\) are in Jordan form too. This, in particular, means that they are upper triangular with zeros on the diagonal. (If this was not the case, we can use \(N_i = T^{-1}\tilde N_i T\) in the arguments below).

Then, we have that the characteristic polynomials \[ \det (\lambda E - A) = \det (\lambda N_i-I_{n-d_i})\det (\lambda I_{d_i} - J_i) = (-1)^{n-d_i}\det (\lambda I_{d_i} - J_i) \] are polynomials of degree \(d_i\), \(i=1,2\).

Since the characteristic polynomials of strongly equivalent matrix pairs only differ by a constant factor (see the proof of Lemma 3.2), this implies that \(d_1=d_2\).

Now we show, that the indices of nilpotency of \(N_1\) and \(N_2\) coincide. If the \(N\)-blocks weren’t present, there would be nothing to show. So let’s assume that they are there.

Let now be \(P\), \(Q \in \mathbb R^{n,n}\) that invertible matrices that realize the strong equivalence of the canonical forms, i.e. \[\begin{equation} \tag{3.8} \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} I \\ & N_2 \end{bmatrix} = \begin{bmatrix} I \\ & N_1 \end{bmatrix} \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix} \end{equation}\] and \[\begin{equation} \tag{3.9} \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} J_2 \\ & I \end{bmatrix} = \begin{bmatrix} J_1 \\ & I \end{bmatrix} \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix}, \end{equation}\] where \(P\) and \(Q\) have been partitioned in line with the canonical forms.

Taking the blockwise matrix product and equating the blocks separately, the following relations are obtained:

Block: (1,1) (1,2) (2,1) (2,2)
(3.8): \(P_{11}=Q_{11}\) \(P_{12}N_2=Q_{12}\) \(P_{21}=N_1Q_{21}\) \(P_{22}N_2=N_1Q_{22}\)
(3.9): \(P_{11}J_2=J_1Q_{11}\) \(P_{12}=J_1Q_{12}\) \(P_{21}J_2=Q_{21}\) \(P_{22}=Q_{22}\)

If we combine the (2,1) blocks, we obtain that \[ P_{21}=N_1Q_{21}=N_1P_{21}J_2=N_1^2P_{21}J_2^2=N_1^3P_{21}J_2^3=\dotsc=0 \] since \(N_1\) is nilpotent.

Since \(P\) is invertible and, because of \(P_{21}=0\), block upper triangular, the blocks on the diagonals \(P_{11}\) and \(P_{22}\) must be invertible. With \(Q_{22}=P_{22}\) the (2,2) block of (3.8) implies that \[ N_2 = P_{22}^{-1}N_1P_{22} \] and, further, \[ N_2^k = P_{22}^{-1}N_1^kP_{22} \] for all \(k\in\mathbb N\). So that a power \(N_2^k=0\) if, and only if, \(N_1^k=0\). Consequently, the indices of nilpotency of \(N_1\) and \(N_2\) coincide.

3.4 Existence of Solutions

We can now summarize all results and considerations in a theorem.

Theorem 3.3 Consider the DAE (3.1) with initial condition (3.2), \[ E \dot x (t) = Ax(t) + f(t), \quad x(t_0) = x_0 \in \mathbb R^{n}. \] Let the pair \((E, A)\) be regular and consider the strongly equivalent DAE \[ \tilde E \dot {\tilde x} (t) = \tilde A\tilde x(t) + \tilde f(t), \quad \tilde x(t_0) = \tilde x_0 \in \mathbb R^{n}. \] with \((\tilde E,\tilde A)\) in Weierstrass canonical form, i.e. \[ \tilde E = \begin{bmatrix} I & 0 \\ 0 &N \end{bmatrix}, \quad \tilde A = \begin{bmatrix} J & 0 \\ 0 &I \end{bmatrix}, \] and consider the conforming splitting of the transformed variables \[ \tilde x = \begin{bmatrix} \tilde x_1 \\ \tilde x_2 \end{bmatrix}, \quad \tilde x_0 = \begin{bmatrix} \tilde x_{0,1} \\ \tilde x_{0,2} \end{bmatrix}. \] Furthermore, let \(\nu\) be the index of the matrix pair \((E,A)\).

If \(f\in \mathcal C^\nu(\mathcal I,\mathbb C^{n})\), then

  1. The differential algebraic equation (3.1) is solvable.

  2. The initial condition \(x_0\) in (3.2) is consistent if, and only if, for the transformed initial condition \(\tilde x_0\) it holds that \[ \tilde x_{2,0} = - \sum_{i=0}^{\nu-1}N^i\tilde f^{(i)}(t_0) \] In particular, a consistent initial condition to (3.1) exists.

  3. Every initial value problem (3.1)(3.2) with a consistent initial condition is uniquely solvable.
Proof. A summary of the preceding results.

From Theorem 3.3 it follows that the regularity of the matrix pair \((E, A)\) implies the existence of a unique solution to the DAE (3.1) with an initial condition (3.2) provided that the initial condition is consistent.

The negation of this statement is a bit diffuse because there are several things that can go wrong if the DAE is not regular. Depending on the irregularity there might be infinite many solutions to the initial value problem or no solutions at all to the DAE (even without the initial condition).

The following theorem covers the case of singular or non-square matrix pairs.

Theorem 3.4 (Thm. 2.14 in Kunkel/Mehrmann) Let \(E\), \(A\in \mathbb C^{m,n}\).

  1. If \(\operatorname{rank}(\lambda E -A) < n\) for all \(\lambda \in \mathbb C\), then the homogeneous initial value problem \[ E\dot x(t) = Ax(t), \quad x(t_0) = 0, \] has a nontrivial solution.

  2. If \(\operatorname{rank}(\lambda E -A) < m\) for all \(\lambda\in \mathbb C\), then there exist arbitrarily smooth inhomogeneities \(f\) for which the DAE (3.1) \[ E\dot x(t) = Ax(t)+f(t), \] is not solvable.
Proof. The proof is given for Theorem 2.14 in Kunkel/Mehrmann with, however, the second claim being formulated slightly differently. To reduce our formulation to theirs, one may identify that columns of \(\lambda E - A\) that achieve the maximal rank and split off the redundant columns, e.g., the parts of \(x\) associated with them.

Note that a nontrivial solution to the homogeneous problem, means that existence of a solution implies infinitely many solutions to the initial value problem. In fact, if \(x_h\) is the solution to the homogeneous problem and \(x_p\) a solution5 to the initial value problem, then \(x = x_p + \alpha x_h\) solves the initial value problem for any \(\alpha \in \mathbb R\).

To illustrate the difficulties with singular or non-square DAEs, consider the example \[ \begin{bmatrix} 0 & 1 & \\ && 1 \\ && 0 \\ \end{bmatrix} \frac{d}{dt} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & \\ && 0 \\ && 1 \\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} 0 \\ f_2 \\ f_3 \end{bmatrix} \] Here, the first part reads \[ x_1 = \dot x_2 \] which defines a solution \(x_2=g\) for any \(g \in \mathcal C^2\) and a nontrivial solution to the associated homogeneous initial value problem if only \(g(t)\neq 0\) for some \(t\) but \(g(t_0)=\dot g(t_0) = 0\).

The second part reads \[\begin{align*} \dot x_3 &= f_2 \\ -x_3 &= f_3 \end{align*}\] which does not permit a solution, if \({-\dot{f}}_3 \neq f_2\).

X: find such a \(g\) for the first part.

3.5 A Variation of Constant Formula

In this section, we want to derive an explicit solution formula for linear DAEs with constant coefficients \(E\dot x(t) = Ax(t)+f(t)\) similar to the formula that exist for linear time-invariant (i.e. with constant coefficients) ODEs.

Certainly, a solution formula is given through the transformation to the Weierstrass canonical form and through (???)(eq:iii-ndae-exp-solution). This however is not an explicit solution representation in so far as both the coefficients and the solution \(x\) itself had to be undertaken a transformation first.

The outline for deriving the explicit formula

The following derivations and arguments base on two major components

  1. Representation of solutions as \(x=x_h+x_p\), where
    • \(x_h\) describes all solutions to the homogeneous problem \(E\dot x = Ax\) and
    • \(x_p\) denotes a (so-called particular) solution to \(E\dot x = Ax + f\).
  2. Additive splitting of DAEs into nilpotent and almost ODE parts, in the way that \(x\) solves \(E\dot x = Ax+f\), if, and only if, \(x=x_1+x_2\) where \(x_1\) and \(x_2\) fulfill
    • \(\tilde C \dot x_1 = A x_1 + f_1\), with \(\tilde C\) almost invertible
    • \(\tilde N \dot x_2 = A x_2 + f_2\), with \(\tilde N\) nilpotent.

We start with defining what was meant by almost invertible. For that consider the differential equation \[ E\dot x(t) = x(t), \quad x(t_0) = x_0 \tag{3.10} \] with \(E\in \mathbb C^{n,n}\).

If \(E\) was invertible, then the unique solution to (3.10) would be given as \[ x(t) = e^{(t-t_0)E^{-1}}x_0. \] If \(E\) is not invertible, then there exists a matrix \(T\in \mathbb R^{n,n}\), invertible, that brings \(E\) into Jordan canonical form \[ J=TET^{-1}= \begin{bmatrix} C \\ & N \end{bmatrix} \] with \(C\) invertible and \(N\) nilpotent so that (3.10) can be written \[ \begin{bmatrix} C \\ & N \end{bmatrix} \begin{bmatrix} \dot x_1(t) \\ \dot x_2(t) \end{bmatrix} = \begin{bmatrix} I \\ & I \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}, \quad \begin{bmatrix} x_1(t_0) \\ x_2(t_0) \end{bmatrix} = \begin{bmatrix} x_{1,0} \\ x_{2,0} \end{bmatrix} =Tx_0. \]

Since there is no right hand side, by the formula (3.7) for the special DAE \(N\dot x=x+f\), we conclude that \(x_2=0\), so that only the ODE part \(C\dot x_1 = x_1\) remains and the overall solution writes \[ \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}= \begin{bmatrix} e^{(t-t_0)C^{-1}} \\ & 0 \end{bmatrix} \begin{bmatrix} x_{1,0} \\ x_{2,0} \end{bmatrix}. \] If we define \[ J^D = \begin{bmatrix} C \\ & N \end{bmatrix}^D := \begin{bmatrix} C^{-1} \\ & 0 \end{bmatrix}, \] the solution to (3.10) can be expressed as \[ Tx(t) = e^{(t-t_0)J^D}Tx_0 \] or with \[ E^D := TJ^{D}T^{-1}, \quad e^{(t-t_0)TJ^{D}T^{-1}} = Te^{(t-t_0)J^{D}}T^{-1} \] as \[ x(t) = e^{(t-t_0)E^D}x_0. \]

A few observations

  1. The formula looks like the solution formula for the ODE case.
  2. In fact, if \(E\) is invertible, then \(E^{D}=E^{-1}\) and the formulas coincide.
  3. Thus, \(E^D\) is a generalized inverse – the so-called Drazin inverse.

Definition 3.5 Let \(E\in \mathbb C^{n,n}\) and \(\nu = \operatorname{ind}(E)\). A matrix \(X\in \mathbb C^{n,n}\) that fulfills

\[\begin{align} EX & = XE, \tag{3.11} \\ XEX & = X, \tag{3.12} \\ XE^{\nu+1} & = E^{\nu}, \tag{3.13} \end{align}\]

is called a Drazin inverse of \(E\).

With the following theorem we confirm that a Drazin inverse to a matrix \(E\) is unique so that we can write \(E^D\) for it.

Theorem 3.5 Every matrix \(E\in\mathbb{C}^{n,n}\) has one, and only one, Drazin inverse.

Proof. Uniqueness: Let \(X_1\) and \(X_2\) be two Drazin inverses of \(E\). Then by repeated application of the identities in (3.11)(3.13) one derives that

\[\begin{align*} X_1 EX_1 E X_2 = &X_1EX_2 = X_1EX_2EX_2 \\ X_1^2 E^2 X_2 = \dotsm= &X_1EX_2 = \dotsm= X_1E^2X_2^2 \\ X_1^{\nu+1}E^{\nu+1} X_2 =\dotsm=\dotsm= &X_1EX_2 =\dotsm=\dotsm= X_1E^{\nu+1}X_2^{\nu+1} \\ X_1^{\nu+1}E^{\nu+1} X_1 =\dotsm=\dotsm=\dotsm= &X_1EX_2 =\dotsm=\dotsm=\dotsm= X_2E^{\nu+1}X_2^{\nu+1} \\ X_1 =\dotsm=\dotsm=\dotsm=\dotsm= &X_1EX_2 =\dotsm=\dotsm=\dotsm=\dotsm= X_2, \\ \end{align*}\]

where in the second last step we used the identities

\[ E^{\nu+1}X_1=X_1E^{\nu+1}=E^{\nu}=X_2E^{\nu+1}=E^{\nu+1}X_2. \]
Lemma 3.5 Let \(E\), \(A \in \mathbb C^{n,n}\) commuting, i.e. \(EA=AE\). Then also \[\begin{equation} EA^D = A^DE. \tag{3.14} \end{equation}\]

Theorem 3.6 Let \(E\in \mathbb C^{n,n}\) with \(\nu = \operatorname{ind}E\). Then there exists a unique decomposition \[ E=\tilde C + \tilde N \] with the properties \[\begin{align} \tilde C \tilde N = \tilde N \tilde C = 0, \tag{3.15} \\ \tilde N^\nu = 0, \quad \tilde N^{\nu-1} \neq 0, \tag{3.16}\\ \operatorname{ind}\tilde C \leq 1, \tag{3.17} \end{align}\] and, in particular \[\begin{align} \tilde C = EE^DE, \quad \tilde N = E(I - E^DE). \tag{3.18} \end{align}\]

Proof. 1. Show that such a decomposition with the properties (3.15)-(3.17) also fulfills (3.18), i.e. uniqueness.

  1. Show that \(\tilde C\), \(\tilde N\) as in (3.18) are such a decomposition, i.e. existence.

Now we can define, how the general DAE can be split additively into an almost ODE and a particular nilpotent DAE.

Lemma 3.6 Let \(E\), \(A \in \mathbb C^{n,n}\) and \(f\colon \mathcal I \to \mathbb C^{n}\). If \(E\) and \(A\) commute, then the system \[ E\dot x(t) = Ax(t) + f(t) \] is equivalent – in the sense that solutions correspond one-to-one via \(x=x_1+x_2\) – to the system \[\begin{align*} \tilde C \dot x_1(t) &= Ax_1(t) + f_1(t), \\ \tilde N \dot x_2(t) &= Ax_2(t) + f_2(t), \end{align*}\] where \[\begin{equation} x_1 = E^{D}Ex, \quad x_2 = (I-E^{D}E)x, \end{equation}\] where \[\begin{equation} f_1 = E^{D}Ef, \quad f_2 = (I-E^{D}E)f, \end{equation}\] and where \[ \tilde N + \tilde C = E \] are a decomposition as in Theorem 3.6.

This decomposition is used to characterize all solutions to the homogeneous problem \(E\dot x = Ax\).

Theorem 3.7 Let \(E\), \(A \in \mathbb C^{n,n}\) commuting, i.e. \(EA=AE\), and let \((E,A)\) be regular. Then every solution \(x\in \mathcal C^1(\mathcal I, \mathbb C^n)\) of \(E\dot x=Ax\) has the form \[\begin{equation} x(t) = e^{E^DAt}E^DEv \end{equation}\] for some \(v\in \mathbb C^n\).

Proof. 1. Confirm directly that \(x\colon t \mapsto e^{E^DAt}E^DEv\) satisfies \(E\dot x - Ax=0\). Note that regularity of \((E,A)\) is not needed here.

  1. Use regularity and the splitting to show that any solution has this form.

It remains to find a particular solution.

Theorem 3.8 Let \(E\), \(A \in \mathbb C^{n,n}\) commuting, i.e. \(EA=AE\), and let \((E,A)\) be regular. Let \(f\in \mathcal C^\nu(\mathcal I, \mathbb C^n)\) with \(\nu = \operatorname{ind}E\). Then \(x\in \mathcal C^1(\mathcal I, \mathbb C^n)\) defined by \[ x(t) = \int_{t_0}^t e^{E^DA(t-s)}E^Df(s)ds - (I-E^DE)\sum_{i=0}^{\nu-1}(EA^D)^iA^Df^{(i)}(t) \] is a particular solution of \(E\dot x(t) = Ax(t)+f(t)\).
Theorem 3.9 Let \(E\), \(A \in \mathbb C^{n,n}\) commuting, i.e. \(EA=AE\), and let \((E,A)\) be regular. Let \(f\in \mathcal C^\nu(\mathcal I, \mathbb C^n)\) with \(\nu = \operatorname{ind}E\). Then every solution \(x\in \mathcal C^1(\mathcal I, \mathbb C^n)\) to \(E\dot x(t) = Ax(t)+f(t)\) has the representation \[ x(t) = e^{(t-t_0)E^DA}E^DEv +\int_{t_0}^t e^{E^DA(t-s)}E^Df(s)ds - (I-E^DE)\sum_{i=0}^{\nu-1}(EA^D)^iA^Df^{(i)}(t) \] for some \(v\in \mathbb C^n\).

This theorem also defines what is a consistent initial value.

Corollary 3.1 Let the assumptions of Theorem 3.9 hold. The initial value problem (3.1)(3.2), \[ E\dot x (t) = Ax(t) + f(t), \quad x(t_0)=x_0, \] possesses a solution if, and only if, there exists a \(v\in \mathbb C^n\) such that \[ x_0 = E^{D}Ev - (I-E^DE)\sum_{i=0}^{\nu-1}(EA^D)^iA^Df^{(i)}(t_0). \] If this is the case, then the solution is unique.

We have derived the solution formula under the assumption that \(EA=AE\) which is hardly ever the case.

The following lemma states that the assumption of commutativity is not a restriction for regular matrix pairs.

Lemma 3.7 Let \((E,A)\) be regular and let \(\tilde \lambda\) be such that \(\tilde \lambda E - A\) is invertible. Then \[\bigl (\tilde E,\tilde A\bigr ):=\bigl ( (\tilde \lambda E-A)^{-1}E, (\tilde \lambda E-A)^{-1}A \bigr ) \sim \bigl (E,A \bigr )\] and \(\tilde E\) and \(\tilde A\) commute.

  1. Equivalence relation – RST. Reflexive: \(A\sim A\). Symmetric: \(A\sim B\), then \(B\sim A\). Transitive: \(A\sim B\) and \(B\sim C\), then \(A\sim C\).

  2. Paul v. Dooren The Computation of Kronecker’s Canonical Form of a Singular Pencil

  3. See the proof of Lemma 2.8 in Kunkel/Mehrmann

  4. Dai (1989): Singular Control Systems

  5. A so-called particular solution. Btw., all solutions = a particular plus all solutions to the homogeneous problem is the superposition principle for general linear problems.