
Controllability and Observability

Controllability and observability of linear systems. Feedback and pole placement, observer design, canonical forms, and Riccati equations for stabilizing controllers.

Controllability

Consider

$$\frac{dx}{dt} = Ax(t) + Bu(t),$$

where $A$ is $n \times n$ and $B$ is $n \times p$. For simplicity, in the derivations throughout this chapter we will assume that $p = 1$.

Definition (Controllability).

The pair $(A, B)$ is said to be controllable if for any $x(0) = x_0 \in \mathbb{R}^n$ and $x_f \in \mathbb{R}^n$, there exist $T < \infty$ and a control input $\{u_s, \; 0 \leq s \leq T\}$ so that $x_T = x_f$.

Remark.

Intuition: Controllability asks whether the input $u$ has enough "reach" to steer the state from any starting point to any target point in finite time. If some part of the state space is untouched by the input -- no matter what $u$ you apply -- the system is not controllable. Think of it as checking whether you have full authority over the system's state through your control input.

Consider the following:

$$\frac{dx}{dt} = \begin{bmatrix} 0 & 1 \\ -4 & -5 \end{bmatrix} x + \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} u$$

In this case, if $b_1, b_2$ are both 0, it is evident that the system is not controllable: for every given $x(0)$, the future paths are uniquely determined.

Consider now the more interesting case with $b_1 = 1 = -b_2$. In this case, if the initial condition $x(0) = \begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix}$ takes values from the subspace determined by the line $x_1(0) + x_2(0) = 0$, then, for all $t > 0$, the state remains in this subspace. To see this, let $z(t) = x_1(t) + x_2(t)$ and note that $\frac{dz}{dt} = x_2 + (-4x_1 - 5x_2) + (b_1 + b_2)u = -4(x_1 + x_2) = -4z$, so that $z(0) = 0$ implies $z(t) = 0$ for all $t$. Thus, this subspace, which is a strict subset of $\mathbb{R}^2$, is invariant no matter what control is applied: this system is not controllable.
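This lack of controllability can also be confirmed numerically with the rank test stated in the theorem below; a minimal sketch (assuming numpy is available):

```python
import numpy as np

# System from the example above, with b1 = 1 = -b2.
A = np.array([[0.0, 1.0],
              [-4.0, -5.0]])
B = np.array([[1.0],
              [-1.0]])

# Controllability matrix [B  AB] for n = 2.
ctrb = np.hstack([B, A @ B])
rank = np.linalg.matrix_rank(ctrb)
print(rank)  # 1 < n = 2, so (A, B) is not controllable
```

The rank-deficient controllability matrix has range spanned by $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$, exactly the invariant line $x_1 + x_2 = 0$.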

Theorem (Equivalent Conditions for Controllability).

Conditions (i), (ii), (iii), and (iv) below are equivalent:

(i) $(A, B)$ is controllable.

(ii) The $n \times n$ matrix

$$W_c(t) = \int_0^t e^{As}BB^T e^{A^T s}\, ds = \int_0^t e^{A(t-s)}BB^T e^{A^T(t-s)}\, ds$$

is full-rank for every $t > 0$.

(iii) The controllability matrix

$$\mathcal{C} := \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix}$$

is full-rank.

(iv) The matrix

$$\begin{bmatrix} A - \lambda I & B \end{bmatrix}$$

has full rank (i.e., rank $n$) at every eigenvalue $\lambda$ of $A$.

Remark.

Intuition: This theorem gives four different lenses on the same property. Condition (ii) uses the controllability Grammian $W_c(t)$ -- a matrix that accumulates the "energy" the input can inject into each state direction. Condition (iii) is the most computationally practical: just stack $B, AB, \ldots, A^{n-1}B$ and check rank. Condition (iv), the PBH (Popov-Belevitch-Hautus) test, checks that at every eigenvalue of $A$, the input $B$ can still influence the corresponding eigenspace -- no mode is "hidden" from the input.

The matrix $W_c$ above is called the controllability Grammian of $(A, B)$.
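The agreement between conditions (ii) and (iii) can be checked numerically. A minimal sketch, assuming scipy is available and using the example system with the input entering only through the second component (a choice for which the pair is controllable):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-4.0, -5.0]])
B = np.array([[0.0], [1.0]])  # b1 = 0, b2 = 1: this pair is controllable

# Riemann-sum approximation of W_c(t) = int_0^t e^{As} B B^T e^{A^T s} ds.
t, N = 1.0, 1000
ds = t / N
Wc = sum(expm(A * (k * ds)) @ B @ B.T @ expm(A.T * (k * ds))
         for k in range(N)) * ds

print(np.linalg.matrix_rank(Wc))  # 2: full rank, matching condition (iii)
```

Each summand is positive semi-definite, and the full-rank (hence positive-definite) sum reflects that the input injects energy into every state direction.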

Reachable Set (from origin) and the Controllable Subspace. An implication of the proof of the equivalence between (ii) and (iii) is that the range space of $\mathcal{C}$ coincides with the set of states reachable from the origin using integrable control inputs $\mathcal{U} = \{u : \mathbb{R}_+ \to \mathbb{R}, \; \|u\|_1 < \infty\}$, namely

$$\bigcup_{t \geq 0} \left\{ \int_0^t e^{A(t-s)} Bu(s)\, ds : u \in \mathcal{U} \right\}.$$

This set is called the reachable (from the origin) set. This set is also called the controllable subspace.

Exercise. The controllability property is invariant under an algebraically equivalent transformation of the coordinates: $\tilde{x} = Px$ for some invertible $P$.

Hint: Use the rank condition and show that with $\frac{d\tilde{x}}{dt} = \tilde{A}\tilde{x}(t) + \tilde{B}u(t)$ and $\tilde{A} = PAP^{-1}$, $\tilde{B} = PB$, and with $\mathcal{C} = \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix}$, the transformed controllability matrix writes as $\tilde{\mathcal{C}} = \begin{bmatrix} \tilde{B} & \tilde{A}\tilde{B} & \cdots & \tilde{A}^{n-1}\tilde{B} \end{bmatrix} = P\mathcal{C}$.


Observability

In many problems a controller has access to only the inputs applied and outputs measured. A very important question is whether the controller can recover the state of the system through this information.

Consider

$$\frac{dx}{dt} = Ax(t) + Bu(t), \qquad y(t) = Cx(t) + Du(t)$$

Definition (Observability).

The pair $(A, C)$ is said to be observable if for any $x(0) = x_0 \in \mathbb{R}^n$, there exists $T < \infty$ such that the knowledge of $\{(y_s, u_s), \; 0 \leq s \leq T\}$ is sufficient to uniquely determine $x(0)$.

Remark.

Intuition: Observability asks whether we can reconstruct the internal state of a system purely from what we can measure (the output $y$) and what we know we applied (the input $u$). If some state components have zero effect on the output -- they are invisible to the sensor -- then the system is not observable. It is the "dual" question to controllability: instead of "can we push the state anywhere?", it asks "can we see where the state is?"

In the above, we could consider without any loss that $u(t) = 0$ for all $t$, since the control terms appear in additive forms whose effects can be cancelled from the measurements.

Consider then

$$\frac{dx}{dt} = Ax(t), \qquad y(t) = Cx(t)$$

The measurement at time $t$ writes as:

$$y(t) = Ce^{At}x(0)$$

Taking the derivative:

$$\frac{dy(t)}{dt} = CAe^{At}x(0)$$

and taking the derivatives up to order $n - 1$, we obtain for $1 \leq k \leq n$

$$\frac{d^{k-1}y(t)}{dt^{k-1}} = CA^{k-1}e^{At}x(0)$$

In matrix form, we can write the above as

$$\begin{bmatrix} y(t) \\ \frac{dy(t)}{dt} \\ \vdots \\ \frac{d^{n-1}y(t)}{dt^{n-1}} \end{bmatrix} = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} e^{At} x(0)$$

Thus, the question of being able to recover $x(0)$ from the measurements becomes that of whether the observability matrix

$$\mathcal{O} := \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix}$$

is full-rank or not. Note that adding further rows to this matrix does not increase the rank, by the Cayley-Hamilton theorem. Thus, we can recover the initial state if the observability matrix is full-rank.

Furthermore, each row of $Ce^{At}$ is a linear combination of the rows $\{CA^k, \; k = 0, 1, \cdots, n-1\}$. Therefore, if $x_0$ is orthogonal to every row $CA^k$, that is, if $x_0$ lies in the null space of $\mathcal{O}$, then $Ce^{At}x_0 = 0$ for all $t$. In particular, if the observability matrix is not full-rank, then there exists a non-zero $x_0$ so that $Ce^{At}x_0 = 0$. Thus, we cannot distinguish between $x_0$ and the $0$ vector in $\mathbb{R}^n$, and the system is not observable.
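The argument above can be illustrated numerically. A small sketch (the diagonal system below is an illustrative choice, not from the text): with a sensor that reads only the first state of a decoupled system, the second state lies in the null space of $\mathcal{O}$ and never appears in the output.

```python
import numpy as np
from scipy.linalg import expm, null_space

A = np.array([[1.0, 0.0], [0.0, 2.0]])  # decoupled modes
C = np.array([[1.0, 0.0]])              # the sensor only sees x1

# Observability matrix O = [C; CA] for n = 2.
O = np.vstack([C, C @ A])
print(np.linalg.matrix_rank(O))  # 1: not observable

# A nonzero x0 in the null space of O is invisible: C e^{At} x0 = 0.
x0 = null_space(O)[:, [0]]
y = C @ expm(A * 0.7) @ x0
print(np.allclose(y, 0.0))  # True
```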

Theorem (Observability Rank Condition).

The system

$$\frac{dx}{dt} = Ax(t) + Bu(t), \qquad y(t) = Cx(t) + Du(t)$$

is observable if and only if

$$\mathcal{O} = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix}$$

is full-rank.

Remark.

Intuition: Just as the controllability matrix checks whether the input can reach every state direction, the observability matrix checks whether every state direction eventually shows up in the output. Each row $CA^k$ represents the information gained from the $k$-th derivative of the output. If $n$ such measurements span all of $\mathbb{R}^n$, no state can hide.

The null-space of $\mathcal{O}$, that is, $\{v \in \mathbb{R}^n : \mathcal{O}v = 0\}$, is called the unobservable subspace.

The structure of the observability matrix $\mathcal{O}$ and the controllability matrix $\mathcal{C}$ leads to the following very important and useful duality result.

Theorem (Controllability-Observability Duality).

$(A, C)$ is observable if and only if $(A^T, C^T)$ is controllable.

Remark.

Intuition: This duality is one of the most elegant results in linear systems theory. It says that observability and controllability are two sides of the same coin -- any theorem about controllability immediately gives a theorem about observability by transposing the matrices. The observability matrix $\mathcal{O}$ of $(A, C)$ is exactly $\mathcal{C}^T$ for the pair $(A^T, C^T)$.
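The identity $\mathcal{O}(A, C) = \mathcal{C}(A^T, C^T)^T$ is easy to verify directly; a minimal check on the running example:

```python
import numpy as np

A = np.array([[0.0, 1.0], [-4.0, -5.0]])
C = np.array([[1.0, 0.0]])

# Observability matrix of (A, C) ...
O = np.vstack([C, C @ A])
# ... equals the transpose of the controllability matrix of (A^T, C^T).
ctrb_dual = np.hstack([C.T, A.T @ C.T])
print(np.array_equal(O, ctrb_dual.T))  # True
```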

Remark.

By the duality theorem and the PBH test (condition (iv) of the theorem above), $(A, C)$ is observable if and only if the matrix

$$\begin{bmatrix} A - \lambda I \\ C \end{bmatrix}$$

has full column rank (i.e., rank $n$) at every eigenvalue $\lambda$ of $A$. This is the PBH test for observability: no eigenmode of $A$ can be invisible to the output $C$.

In view of the duality theorem (and in particular, now that we have related observability to the controllability of the dual pair), we have the following immediate result:

Theorem (Observability Grammian).

$(A, C)$ is observable if and only if

$$W_o(t) = \int_0^t e^{A^T s} C^T C e^{As}\, ds$$

is invertible for all $t > 0$.

Remark.

Intuition: The observability Grammian $W_o(t)$ accumulates how much "information" the output reveals about each state direction over time. If it is invertible, every state direction has been sufficiently excited in the output for reconstruction. This is the direct dual of the controllability Grammian $W_c(t)$.


Feedback and Pole Placement

Consider $u = -Kx$. Then,

$$\frac{dx}{dt} = Ax(t) + Bu(t) = (A - BK)x(t)$$

Theorem (Pole Placement via State Feedback).

The eigenvalues of $A - BK$ can be placed arbitrarily if and only if $(A, B)$ is controllable.

Remark.

Intuition: If we have full control authority (controllability), we can shape the system's dynamics however we like by choosing the right feedback gain $K$. This is the core promise of state feedback design: controllability guarantees that every mode of the system can be moved to any desired location in the complex plane, enabling arbitrary stability and performance specifications.

To see this result, first consider a system in the controllable canonical realization form (see the relevant section) with

$$\frac{d}{dt}x(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t)$$

$$A_c = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & -a_2 & \cdots & -a_{N-1} \end{bmatrix}$$

$$B_c = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}$$

Note that the eigenvalues of $A_c$ are the roots of the characteristic polynomial whose coefficients are located in the bottom row of $A_c$ (see the proof of Theorem).

Now, apply $u = -Kx$ so that $u = \sum_{i=1}^{N} -k_i x_i$, leading to

$$A - BK = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -(a_0 + k_1) & -(a_1 + k_2) & -(a_2 + k_3) & \cdots & -(a_{N-1} + k_N) \end{bmatrix}$$

Once again, since the eigenvalues of this matrix are the roots of the characteristic polynomial whose coefficients are located in the bottom row (by the proof of Theorem), and these coefficients can be placed by selecting the scalars $k_i$, we can arbitrarily place the eigenvalues of the closed-loop matrix by feedback.

Through a coordinate transformation $\tilde{x} = Px$, every controllable system $x' = Ax + Bu$ can be transformed to an algebraically equivalent linear system in the controllable canonical realization form $(A_c, B_c)$ above. As we saw, for a system in this form, a control can be found so that all the eigenvalues of the closed-loop system are in the open left half-plane. Finally, the system can be moved back to the original coordinates.

We now see how this (transformation into a controllable canonical realization form) is possible. With $\tilde{x} = Px$, we have that

$$\frac{d\tilde{x}}{dt} = \tilde{A}\tilde{x}(t) + \tilde{B}u(t)$$

with $\tilde{A} = PAP^{-1}$, $\tilde{B} = PB$. Now, if $(A, B)$ is controllable, we know that $\mathcal{C} = \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix}$ is full-rank. The transformed controllability matrix writes as $\tilde{\mathcal{C}} = \begin{bmatrix} \tilde{B} & \tilde{A}\tilde{B} & \cdots & \tilde{A}^{n-1}\tilde{B} \end{bmatrix} = P\mathcal{C}$. As a result,

$$P = \tilde{\mathcal{C}} \mathcal{C}^{-1},$$

whose validity follows from the fact that $\mathcal{C}$ is invertible. This leads us to the following conclusion.

Theorem (Transformation to Controllable Canonical Form).

Consider $x' = Ax + Bu$ where $u \in \mathbb{R}$. Every such system, provided that $(A, B)$ is controllable, can be transformed into a system $z' = \tilde{A}z + \tilde{B}u$ with the transformation $z = Px$ so that $(\tilde{A}, \tilde{B})$ is in the controllable canonical realization form.

Remark.

Intuition: This theorem says that controllability guarantees we can always find a change of coordinates that puts the system into a standard "canonical" structure where pole placement becomes trivial. The transformation $P = \tilde{\mathcal{C}}\mathcal{C}^{-1}$ is constructive -- it gives you a concrete recipe for computing the coordinate change.

The above then suggests a method to achieve stabilization through feedback: first transform into a controllable canonical realization form, place the eigenvalues through feedback, and transform the system back to the original coordinates.
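In practice this pole-placement computation is routinely delegated to a numerical routine; a minimal sketch, assuming scipy is available, using the earlier example system with desired closed-loop eigenvalues at $-2$ and $-3$:

```python
import numpy as np
from scipy.signal import place_poles

A = np.array([[0.0, 1.0], [-4.0, -5.0]])
B = np.array([[0.0], [1.0]])  # controllable pair

# Place the eigenvalues of A - BK at -3 and -2.
K = place_poles(A, B, [-3.0, -2.0]).gain_matrix
eigs = np.linalg.eigvals(A - B @ K)
print(np.sort(eigs.real))  # approximately [-3., -2.]
```

Since this $A$ is already in companion form with bottom row $(-4, -5)$, the gain can also be read off by hand: matching $s^2 + 5s + 6$ gives $K = [2, 0]$.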


Observers and Observer Feedback

Consider

$$\frac{dx}{dt} = Ax + Bu, \qquad y = Cx$$

Suppose that the controller intends to track the state. A candidate for such a purpose is to write an observer system of the form

$$\frac{d\hat{x}}{dt} = A\hat{x} + Bu + L(y - C\hat{x})$$

Subtracting the two equations above from one another and writing $e = x - \hat{x}$, we obtain

$$\frac{de}{dt} = Ae - LCe = (A - LC)e$$

Then, the question whether $e(t) \to 0$ is determined by whether the eigenvalues of $A - LC$ can be pushed to the left half-plane with some appropriate $L$. If the system is observable, then this is possible, by the same arguments as in the pole placement analysis of the previous section (note that controllability and observability are related to each other through the simple duality property presented in Theorem: that is, $L^T$ can be selected so that $A^T - C^T L^T$ has all eigenvalues in the left half-plane, which also implies that $A - LC$ has the same property).

Now that, under observability, the controller can track the state with asymptotically vanishing error, suppose that we consider

$$\frac{dx}{dt} = Ax + Bu, \qquad y = Cx$$

with the goal of stabilizing the actual system state $x(t)$.

Suppose that we run an observer, and that we consider the following feedback control policy

$$u(t) = -K\hat{x}(t)$$

where $K$ is what we used for pole placement, and $\hat{x}$ is what we used in our observer. In this case, we obtain the following relation:

$$\begin{bmatrix} \frac{dx}{dt} \\ \frac{de}{dt} \end{bmatrix} = \begin{bmatrix} A - BK & BK \\ 0 & A - LC \end{bmatrix} \begin{bmatrix} x \\ e \end{bmatrix}$$

Due to the upper triangular form, we conclude that $\begin{bmatrix} x(t) \\ e(t) \end{bmatrix} \to 0$ if both $A - BK$ and $A - LC$ are stable matrices; two conditions that we have already established under controllability and observability properties. Such a design leads to the separation principle for linear control systems: run an observer and apply the control as if the observer state is the actual state. This design is stabilizing.

Remark.

Intuition: The separation principle is remarkably powerful: it says you can design the controller ($K$) and the observer ($L$) independently, and when you combine them, the closed-loop system is stable. The upper triangular block structure of the combined dynamics is the key -- the observer error evolves independently of the state, so the two designs do not interfere with each other.
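The block-triangular structure can be verified numerically: the spectrum of the combined matrix is the union of the spectra of $A - BK$ and $A - LC$. A small sketch on the running example (the gains below are illustrative hand-picked choices, not prescribed by the text):

```python
import numpy as np

A = np.array([[0.0, 1.0], [-4.0, -5.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
K = np.array([[2.0, 0.0]])    # hand-computed: eig(A - BK) = {-2, -3}
L = np.array([[1.0], [2.0]])  # A - LC = [[-1, 1], [-6, -5]] is stable

# Combined closed-loop matrix in (x, e) coordinates.
M = np.block([[A - B @ K, B @ K],
              [np.zeros((2, 2)), A - L @ C]])

eigM = np.linalg.eigvals(M)
sep = np.concatenate([np.linalg.eigvals(A - B @ K),
                      np.linalg.eigvals(A - L @ C)])

# Spectrum of M = union of the separately designed spectra.
print(np.allclose(np.sort(eigM.real), np.sort(sep.real)))  # True
print(eigM.real.max() < 0)  # True: the combined system is stable
```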


Canonical Forms

Theorem (A-Invariance of Controllable and Unobservable Subspaces).

(i) If $v \in \mathbb{R}^n$ is in the controllable subspace, then so is $Av$.

(ii) If $v \in \mathbb{R}^n$ is in the unobservable subspace, then so is $Av$.

That is, the controllable and unobservable subspaces are $A$-invariant.

Remark.

Intuition: $A$-invariance means that the system dynamics cannot push states out of these subspaces. If a state direction is reachable by the input, applying the system dynamics $A$ keeps it reachable. If a state direction is invisible to the output, it stays invisible after the dynamics act. This structural property is what allows the block-triangular decompositions that follow.

Controllable canonical form

If a model is not controllable, then we can construct a state transformation $\tilde{x} = Px$ of the form $\tilde{x} = \begin{bmatrix} \tilde{x}_c \\ \tilde{x}_{uc} \end{bmatrix}$ with

$$\frac{d\tilde{x}}{dt} = \begin{bmatrix} A_c & A_{12} \\ 0 & A_{uc} \end{bmatrix} \begin{bmatrix} \tilde{x}_c \\ \tilde{x}_{uc} \end{bmatrix} + \begin{bmatrix} B_c \\ 0 \end{bmatrix} u$$

$$y = \begin{bmatrix} C_c & C_{uc} \end{bmatrix} \begin{bmatrix} \tilde{x}_c \\ \tilde{x}_{uc} \end{bmatrix}$$

In the above, $(A_c, B_c)$ is controllable and $A_{12}$ is some submatrix. The form above is called a controllable canonical form.

The matrix $P$ can be obtained by constructing $P^{-1}$ as follows. Let $n_1$ be the rank of the controllability matrix $\mathcal{C}$. Take the first $n_1$ columns of $P^{-1}$ to be $n_1$ linearly independent columns of $\mathcal{C}$, and the remaining $n - n_1$ columns to be arbitrary vectors which make $P^{-1}$ invertible. If we write

$$A_c = PAP^{-1}, \qquad B_c = PB,$$

we have that

$$P^{-1}A_c = AP^{-1}, \qquad P^{-1}B_c = B$$

Using the fact that the controllable subspace is $A$-invariant, it follows that the transformed matrices have the given block structure.

An implication of the above analysis is that

$$C(sI - A)^{-1}B + D = C_c(sI - A_c)^{-1}B_c + D$$

Observable canonical form

A similar construction applies for observable canonical forms.

$$\frac{d\tilde{x}}{dt} = \begin{bmatrix} A_o & 0 \\ A_{21} & A_{uo} \end{bmatrix} \begin{bmatrix} \tilde{x}_o \\ \tilde{x}_{uo} \end{bmatrix} + \begin{bmatrix} B_o \\ B_{uo} \end{bmatrix} u$$

$$y = \begin{bmatrix} C_o & 0 \end{bmatrix} \begin{bmatrix} \tilde{x}_o \\ \tilde{x}_{uo} \end{bmatrix}$$

with the property that $(A_o, C_o)$ is observable.

An implication of the above analysis is that

$$C(sI - A)^{-1}B + D = C_o(sI - A_o)^{-1}B_o + D$$

Kalman decomposition

One can apply a joint construction, known as Kalman's decomposition: there exists a coordinate transformation

$$z = Px, \qquad z = \begin{bmatrix} x_{c/o} \\ x_{c/uo} \\ x_{uc/o} \\ x_{uc/uo} \end{bmatrix}, \qquad \bar{A} = PAP^{-1}$$

that leads to

$$\frac{dz}{dt} = \begin{bmatrix} A_{c/o} & 0 & A_{13} & 0 \\ A_{21} & A_{c/uo} & A_{23} & A_{24} \\ 0 & 0 & A_{uc/o} & 0 \\ 0 & 0 & A_{43} & A_{uc/uo} \end{bmatrix} z + \begin{bmatrix} B_{c/o} \\ B_{c/uo} \\ 0 \\ 0 \end{bmatrix} u$$

$$y = \begin{bmatrix} C_{c/o} & 0 & C_{uc/o} & 0 \end{bmatrix} z + Du,$$

where $(A_{c/o}, B_{c/o}, C_{c/o})$ is both controllable and observable. Furthermore,

$$C(sI - A)^{-1}B + D = C_{c/o}(sI - A_{c/o})^{-1}B_{c/o} + D$$

A corollary of the above discussion is that the minimal realization, that is, the state-space realization of smallest state dimension, is attained when the system is both controllable and observable, as there are then no redundant state variables.
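The loss of dimension for a non-minimal realization shows up as pole-zero cancellation in the transfer function. A small sketch, assuming scipy is available (the system below is an illustrative choice, not from the text):

```python
import numpy as np
from scipy.signal import ss2tf

# An uncontrollable 2-state realization: the x2 mode gets no input.
A = np.array([[-1.0, 0.0], [0.0, -2.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[1.0, 1.0]])
D = np.array([[0.0]])

num, den = ss2tf(A, B, C, D)
print(num, den)
# num ~ [[0, 1, 2]], den ~ [1, 3, 2]: the factor (s + 2) cancels,
# leaving 1/(s + 1), so a 1-state (minimal) realization suffices.
```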

Stabilizability and detectability. From the controllable canonical form, we can also establish the following result.

A linear system is stabilizable (in the sense of local or global asymptotic stability) by control if and only if $A_{uc}$, whenever it exists, is a stable matrix (i.e., with eigenvalues strictly in the left half-plane).

Define a control-free system to be detectable if whenever $y(t) \to 0$ then $x(t) \to 0$. A consequence of the observable canonical form is that a system is detectable if and only if $A_{uo}$, whenever it exists, is a stable matrix.


Using Riccati Equations to Find Stabilizing Linear Controllers [Optional]

While controllability and observability properties reveal what is possible or impossible with regard to stabilization, they do not directly provide an easy-to-compute or constructive method for arriving at a design.

One effective method is through Riccati equations. We will present the discussion for discrete-time, but the approach is essentially identical for continuous-time (with the stability conditions of linear systems, as noted earlier, being different).

Controller design via Riccati equations

Consider the following linear system

$$x_{t+1} = Ax_t + Bu_t,$$

where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$.

Suppose that we would like to minimize, over all control laws, the expression

$$\sum_{t=0}^{\infty} x_t^T Q x_t + u_t^T R u_t$$

with $R > 0$, $Q \geq 0$.

Theorem (Riccati Equation for Optimal Control).

Consider the system $x_{t+1} = Ax_t + Bu_t$.

(i) If $(A, B)$ is controllable, there exists a solution to the Riccati equation

$$P = Q + A^T P A - A^T P B (B^T P B + R)^{-1} B^T P A.$$

(ii) If $(A, B)$ is controllable and, with $Q = C^T C$, $(A, C)$ is observable, then, as $t \to -\infty$, the sequence of Riccati recursions initialized at $P_0 = \bar{P}$ with $\bar{P}$ arbitrary,

$$P_t = Q + A^T P_{t+1} A - A^T P_{t+1} B (B^T P_{t+1} B + R)^{-1} B^T P_{t+1} A,$$

converges to some limit $P$ that satisfies the algebraic Riccati equation above. That is, convergence takes place for any initial condition $\bar{P}$. Furthermore, such a $P$ is unique and positive definite. Finally, under the control policy

$$u_t = -(B^T P B + R)^{-1} B^T P A x_t,$$

$\{x_t\}$ is stable.

(iii) Under the conditions of part (ii), the control minimizes $\sum_{t=0}^{\infty} x_t^T Q x_t + u_t^T R u_t$.

Remark.

Intuition: The Riccati equation provides a constructive, computational recipe for optimal control design. Instead of just knowing that pole placement is possible (from controllability), the Riccati approach tells you exactly which feedback gain to use -- one that minimizes a quadratic cost balancing state regulation ($Q$) against control effort ($R$). The iterative Riccati recursion converges from any starting point, making it numerically robust. This is the foundation of LQR (Linear Quadratic Regulator) design.

In the above, we established a method to find $K$ so that $A - BK$ is stable: run the recursions from any arbitrary initial condition, find the limit $P$, and select $u_t = -Kx_t$ with

$$K = (B^T P B + R)^{-1} B^T P A$$

This controller will be stabilizing.
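The recursion is straightforward to code. A minimal sketch for an illustrative unstable discrete-time system (the matrices and iteration count are assumptions, not from the text):

```python
import numpy as np

A = np.array([[1.2, 1.0], [0.0, 0.8]])  # unstable open loop (eigenvalue 1.2)
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Riccati recursion from an arbitrary initial condition (here P = 0).
P = np.zeros((2, 2))
for _ in range(500):
    APB = A.T @ P @ B
    P = Q + A.T @ P @ A - APB @ np.linalg.inv(B.T @ P @ B + R) @ APB.T

# Resulting gain and closed-loop spectral radius.
K = np.linalg.inv(B.T @ P @ B + R) @ B.T @ P @ A
rho = max(abs(np.linalg.eigvals(A - B @ K)))
print(rho < 1.0)  # True: the closed loop is stable (discrete-time sense)
```

Note that the middle term uses $B^T P A = (A^T P B)^T$, valid because $P$ stays symmetric along the recursion.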

Observer design via Riccati equations

A similar approach applies for observer design. In fact, with the above discussion, using the duality analysis presented earlier, we can directly design an observer so that the matrix $A - LC$ is stable. By writing the condition as the stability of $A^T - C^T L^T$, the question becomes that of finding $L^T$ for which $A^T - C^T L^T$ is a stable matrix.

Let $(A, C)$ be observable. In Theorem, if we replace $A$ with $A^T$, $B$ with $C^T$, and define $W = BB^T$ for any $B$ with $(A, B)$ controllable, we obtain the algebraic Riccati equation

$$S = W + ASA^T - ASC^T(CSC^T + R)^{-1}CSA^T,$$

or the Riccati recursions

$$S_{t+1} = W + AS_tA^T - AS_tC^T(CS_tC^T + R)^{-1}CS_tA^T,$$

which, for any initial condition $S_0$, converge to a unique limit $S$ as $t \to \infty$. Finally, taking

$$L^T = (CSC^T + R)^{-1}CSA^T$$

leads to the conclusion that $A - LC$ is stable.

Putting controller and observer design together

Accordingly, all we need for the system

$$x_{t+1} = Ax_t + Bu_t, \qquad y_t = Cx_t$$

is that $(A, B)$ be controllable and $(A, C)$ be observable. With the controller gain $K$ and observer gain $L$ obtained above, the system

$$x_{k+1} = Ax_k - BK\hat{x}_k$$

$$\hat{x}_{k+1} = A\hat{x}_k + Bu_k + L(Cx_k - C\hat{x}_k), \qquad u_k = -K\hat{x}_k$$

or, equivalently, with $e_k = x_k - \hat{x}_k$, the system defined by

$$\begin{bmatrix} x_{k+1} \\ e_{k+1} \end{bmatrix} = \begin{bmatrix} A - BK & BK \\ 0 & A - LC \end{bmatrix} \begin{bmatrix} x_k \\ e_k \end{bmatrix}$$

is stable.

In the above, the conditions that $(A, B)$ be controllable and $(A, C)$ be observable can be relaxed: controllability can be replaced with stabilizability, and observability can be relaxed to detectability. While stability will be maintained, the only difference is that $P$ or $S$ would no longer be guaranteed to be positive-definite.

Continuous-time case

A similar discussion as above applies for the continuous-time setup. We only discuss the control design, as the observer design follows from duality, as shown above.

Consider

$$\frac{dx}{dt} = Ax + Bu$$

Let $Q \geq 0$, $R > 0$. The only difference in continuous time is that the discrete-time Riccati equations above are replaced by a corresponding Riccati differential equation:

$$-\frac{dP}{dt} = Q + A^T P + PA - PBR^{-1}B^T P.$$

If $(A, B)$ is controllable and, with $Q = C^T C$, $(A, C)$ is observable, then there exists a unique positive-definite matrix $P$ such that the following algebraic Riccati equation is satisfied:

$$Q + A^T P + PA - PBR^{-1}B^T P = 0$$

With this $P$, the control given by

$$u = -Kx = -R^{-1}B^T Px$$

is such that $A - BK$ is stable.
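The continuous-time algebraic Riccati equation also has a standard numerical solver; a minimal sketch, assuming scipy is available (the unstable example system is an illustrative choice):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [2.0, -1.0]])  # open-loop eigenvalues 1 and -2
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Solve Q + A^T P + P A - P B R^{-1} B^T P = 0, then K = R^{-1} B^T P.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ P
eigs = np.linalg.eigvals(A - B @ K)
print(np.all(eigs.real < 0))  # True: A - BK is Hurwitz
```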


Applications and Exercises

Exercise. Recall that we had studied the controlled pendulum on a cart (see Figure 12.1). The non-linear mechanical/rotational dynamics equations were found to be

$$M\frac{d^2 y}{dt^2} = u - m\frac{d^2}{dt^2}\big(y + l\sin(\theta)\big) = u - m\frac{d^2 y}{dt^2} - ml\frac{d^2\theta}{dt^2}\cos(\theta) + ml\left(\frac{d\theta}{dt}\right)^2 \sin(\theta)$$

$$m\frac{d^2\theta}{dt^2} = \frac{mg}{l}\sin(\theta) - \frac{m}{l}\frac{d^2 y}{dt^2}\cos(\theta)$$

Around $\theta = 0$, $\frac{d\theta}{dt} = 0$, we apply the linear approximations $\sin(\theta) \approx \theta$, $\cos(\theta) \approx 1$, and $\left(\frac{d\theta}{dt}\right)^2 \approx 0$ to arrive at

$$M\frac{d^2 y}{dt^2} = u - \left(m\frac{d^2 y}{dt^2} + ml\frac{d^2\theta}{dt^2}\right)$$

$$l\frac{d^2\theta}{dt^2} = g\theta - \frac{d^2 y}{dt^2}$$

Finally, writing $x_1 = y$, $x_2 = \frac{dy}{dt}$, $x_3 = \theta$, $x_4 = \frac{d\theta}{dt}$, we arrive at the linear model in state space form

$$\frac{dx}{dt} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & \frac{-mg}{M} & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & \frac{(M+m)g}{Ml} & 0 \end{bmatrix} x + \begin{bmatrix} 0 \\ \frac{1}{M} \\ 0 \\ \frac{-1}{Ml} \end{bmatrix} u,$$

where $x = \begin{bmatrix} x_1 & x_2 & x_3 & x_4 \end{bmatrix}^T$.

a) When is the linearized model controllable?

b) Does there exist a control policy $u = -Kx$ that makes the closed-loop linearized system stable? Select specific values for $M, m, l$ so that controllability holds, and accordingly find an explicit $K$.

c) With the controller in part b), can you conclude, through the arguments presented in the previous chapter (e.g. Theorem), that your (original non-linear) system is locally asymptotically stable?

Hint: a) With

$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & \frac{-mg}{M} & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & \frac{(M+m)g}{Ml} & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ \frac{1}{M} \\ 0 \\ \frac{-1}{Ml} \end{bmatrix}$$

we have that

$$\begin{bmatrix} B & AB & A^2B & A^3B \end{bmatrix} = \begin{bmatrix} 0 & \frac{1}{M} & 0 & \frac{mg}{M^2 l} \\ \frac{1}{M} & 0 & \frac{mg}{M^2 l} & 0 \\ 0 & \frac{-1}{Ml} & 0 & \frac{-(M+m)g}{M^2 l^2} \\ \frac{-1}{Ml} & 0 & \frac{-(M+m)g}{M^2 l^2} & 0 \end{bmatrix}$$

You will be asked to find the condition for this matrix to be invertible in your homework assignment.

b) By controllability, we can place the eigenvalues of the closed-loop matrix arbitrarily. Find an explicit $K$. You can use the method presented earlier in the chapter, or try to explicitly arrive at a stabilizing control matrix.

c) Then, by Theorem, the system is locally stable around the equilibrium point. Precisely explain why this is the case.
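As a numerical sanity check for parts a) and b), one possible route (the parameter values below are assumptions chosen for illustration, not prescribed by the exercise; scipy is assumed available):

```python
import numpy as np
from scipy.signal import place_poles

# Illustrative parameter values (assumed, not from the exercise).
M, m, l, g = 1.0, 0.1, 0.5, 9.81

A = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, -m * g / M, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, (M + m) * g / (M * l), 0.0]])
B = np.array([[0.0], [1.0 / M], [0.0], [-1.0 / (M * l)]])

# a) rank of the controllability matrix [B AB A^2B A^3B]
ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(4)])
rank = np.linalg.matrix_rank(ctrb)
print(rank)  # 4: controllable for these values

# b) one stabilizing gain via pole placement (poles chosen arbitrarily)
K = place_poles(A, B, [-1.0, -2.0, -3.0, -4.0]).gain_matrix
print(np.all(np.linalg.eigvals(A - B @ K).real < 0))  # True
```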

Exercise. Consider the linear system

$$\frac{dx}{dt} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 1 & 1 \end{bmatrix} x + \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} u$$

Is this system controllable? Does there exist a matrix $K$ so that, with $u = Kx$, the eigenvalues of the closed-loop matrix

$$\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 1 & 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} K$$

can be arbitrarily assigned?

Exercise. Consider

$$\frac{dx}{dt} = Ax + Bu, \qquad y = Cx$$

with

$$\frac{dx}{dt} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} x + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u$$

$$y = \begin{bmatrix} 1 & 0 \end{bmatrix} x$$

a) Is this system observable? b) Is this system controllable? c) Provide a stabilizing feedback control policy by running an observer.

Hint: a) and b) Yes. c) The system is both controllable and observable. If the system state were available, we could apply $u = -Kx$ and select $K$ so that $A - BK$ is stable. Find such a $K$. Now, we can run an observer as explained in the relevant section:

$$\frac{d\hat{x}}{dt} = A\hat{x} + Bu + L(y - C\hat{x})$$

with the property that $A - LC$ is stable. Find such an $L$. Then, the control to be applied would be $u_t = -K\hat{x}_t$. Find explicit values.
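One possible set of explicit values (an illustrative solution, obtained by matching the characteristic polynomial $s^2 + 2s + 1$ by hand, with the observer gain following by duality) can be verified numerically:

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

# Hand-computed candidate gains: both closed loops get char. poly (s + 1)^2.
K = np.array([[2.0, 2.0]])
L = np.array([[2.0], [2.0]])

print(np.linalg.eigvals(A - B @ K))  # both eigenvalues near -1
print(np.linalg.eigvals(A - L @ C))  # both eigenvalues near -1
```

The symmetry $L = K^T$ here reflects the duality between the two design problems for this symmetric $A$.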

Exercise. a) Show that controllability is invariant under an algebraically equivalent transformation of the coordinates: $\tilde{x} = Px$ for some invertible $P$.

Hint: With $\frac{d\tilde{x}}{dt} = \tilde{A}\tilde{x}(t) + \tilde{B}u(t)$ and $\tilde{A} = PAP^{-1}$, $\tilde{B} = PB$, and with $\mathcal{C} = \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix}$, the transformed controllability matrix writes as $\tilde{\mathcal{C}} = \begin{bmatrix} \tilde{B} & \tilde{A}\tilde{B} & \cdots & \tilde{A}^{n-1}\tilde{B} \end{bmatrix} = P\mathcal{C}$. Since $P$ is invertible, $\text{rank}(\tilde{\mathcal{C}}) = \text{rank}(\mathcal{C})$.

b) Consider

$$\frac{dx}{dt} = \begin{bmatrix} a & b \\ -b & a \end{bmatrix} x + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u$$

Express, through a transformation, this system in a controllable canonical realization form.

Hint: The characteristic polynomial of $A = \begin{bmatrix} a & b \\ -b & a \end{bmatrix}$ is $s^2 - 2as + (a^2 + b^2)$. The controllability matrix is $\mathcal{C} = \begin{bmatrix} 1 & a \\ 0 & -b \end{bmatrix}$, which has full rank when $b \neq 0$. The controllable canonical form has $A_c = \begin{bmatrix} 0 & 1 \\ -(a^2+b^2) & 2a \end{bmatrix}$ and $B_c = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. The transformation is $P = \tilde{\mathcal{C}}\mathcal{C}^{-1}$, where $\tilde{\mathcal{C}} = \begin{bmatrix} 0 & 1 \\ 1 & 2a \end{bmatrix}$ is the controllability matrix of $(A_c, B_c)$.
