LQG control

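LQG control (linear–quadratic–Gaussian control) is one of the fundamental optimal control problems in control theory. It concerns linear systems driven by additive white Gaussian noise. The problem is to find the optimal output feedback law that minimizes the expected value of a quadratic cost function. The output measurements are assumed to be corrupted by Gaussian noise, and the initial state is likewise assumed to be a Gaussian random vector.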

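Under the assumption that the optimal control is sought among linear control laws, the solution can be derived by a completion-of-squares argument^[1]. The resulting control law, known as the LQG controller, is the combination of a Kalman filter (a linear–quadratic state estimator, LQE) with a linear–quadratic regulator (LQR). The separation principle states that the state estimator and the state feedback can be designed independently. LQG control applies to both linear time-invariant and linear time-varying systems, and yields a linear dynamic feedback controller that is easy to compute and implement. The LQG controller is itself a dynamic system like the system it controls, and the two have the same state dimension.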

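A deeper statement of the separation principle is that the LQG controller remains optimal within a wider class of possibly nonlinear controllers. In other words, using a nonlinear control scheme will not improve the expected value of the cost functional. This version of the separation principle is an instance of the separation principle in stochastic control, which states that even when the process and output noise sources are possibly non-Gaussian martingales, as long as the system dynamics are linear, the optimal control separates into an optimal state estimator (which may no longer be a Kalman filter) and an LQR regulator^[2]^[3]. LQG control is also used to control perturbed nonlinear systems^[4].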

Mathematical description of the problem and solution

Continuous time

Consider the continuous-time linear dynamic system

{\dot {\mathbf {x} }}(t)=A(t)\mathbf {x} (t)+B(t)\mathbf {u} (t)+\mathbf {v} (t),

\mathbf {y} (t)=C(t)\mathbf {x} (t)+\mathbf {w} (t),

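where $\mathbf{x}$ is the vector of state variables of the system, $\mathbf{u}$ is the vector of control inputs, and $\mathbf{y}$ is the vector of measured outputs available for feedback. The system is affected by additive white Gaussian system noise $\mathbf{v}(t)$ and additive white Gaussian measurement noise $\mathbf{w}(t)$. Given this system, the objective is to find a control input $\mathbf{u}(t)$ that at every time $t$ depends linearly, and only, on the past measurements $\mathbf{y}(t'),0\leq t'<t$, and that minimizes the following cost function: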

J=\mathbb {E} \left[{\mathbf {x} ^{\mathrm {T} }}(T)F{\mathbf {x} }(T)+\int _{0}^{T}{\mathbf {x} ^{\mathrm {T} }}(t)Q(t){\mathbf {x} }(t)+{\mathbf {u} ^{\mathrm {T} }}(t)R(t){\mathbf {u} }(t)\,dt\right],

F\geq 0,\quad Q(t)\geq 0,\quad R(t)>0,

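where $\mathbb{E}$ denotes the expected value. The final time (horizon) $T$ may be finite or infinite. If the horizon tends to infinity, the first term $\mathbf{x}^{\mathrm{T}}(T)F\mathbf{x}(T)$ of the cost function becomes negligible and irrelevant to the problem. In that case, to keep the cost finite, the cost function is taken to be $J/T$.

The LQG controller that solves the LQG control problem above is specified by the following equations: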

{\dot {\hat {\mathbf {x} }}}(t)=A(t){\hat {\mathbf {x} }}(t)+B(t){\mathbf {u} }(t)+L(t)\left({\mathbf {y} }(t)-C(t){\hat {\mathbf {x} }}(t)\right),\quad {\hat {\mathbf {x} }}(0)=\mathbb {E} \left[{\mathbf {x} }(0)\right],

{\mathbf {u} }(t)=-K(t){\hat {\mathbf {x} }}(t).

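The matrix $L(t)$ is called the Kalman gain of the Kalman filter represented by the first equation. At each time $t$, this filter generates an estimate $\hat{\mathbf{x}}(t)$ of the state $\mathbf{x}(t)$ from the past measurements and inputs. The Kalman gain $L(t)$ is computed from the matrices $A(t),C(t)$, the two intensity matrices $V(t)$, $W(t)$ associated with the white Gaussian noises $\mathbf{v}(t)$ and $\mathbf{w}(t)$, and finally $\mathbb{E}\left[\mathbf{x}(0)\mathbf{x}^{\mathrm{T}}(0)\right]$. These five matrices determine the Kalman gain through the following matrix Riccati differential equation: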

{\dot {P}}(t)=A(t)P(t)+P(t)A^{\mathrm {T} }(t)-P(t)C^{\mathrm {T} }(t){\mathbf {} }W^{-1}(t)C(t)P(t)+V(t),

P(0)=\mathbb {E} \left[{\mathbf {x} }(0){\mathbf {x} }^{\mathrm {T} }(0)\right].

Given the solution $P(t),0\leq t\leq T$, the Kalman gain equals

L(t)=P(t)C^{\mathrm {T} }(t)W^{-1}(t).
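The forward integration described above can be sketched numerically. The following is a minimal illustration, assuming a fixed-step classical Runge–Kutta (RK4) scheme and symmetric $P$ and $W$; the function names `filter_riccati_rhs`, `kalman_gain_schedule` and `const`, and the scalar test system, are hypothetical choices made for this example and are not part of the original text.

```python
import numpy as np


def filter_riccati_rhs(P, A, C, V, W):
    """dP/dt = A P + P A^T - P C^T W^{-1} C P + V (P and W symmetric)."""
    PCt = P @ C.T
    return A @ P + P @ A.T - PCt @ np.linalg.solve(W, PCt.T) + V


def kalman_gain_schedule(A, C, V, W, P0, T, steps):
    """Integrate the filter Riccati equation forward from P(0) = P0 with RK4.

    A, C, V, W are callables t -> matrix, so time-varying systems are allowed.
    Returns the time grid, the list of P(t_j) and the list of L(t_j).
    """
    ts = np.linspace(0.0, T, steps + 1)
    h = T / steps

    def f(t, P):
        return filter_riccati_rhs(P, A(t), C(t), V(t), W(t))

    Ps = [np.array(P0, dtype=float)]
    for t in ts[:-1]:
        P = Ps[-1]
        k1 = f(t, P)
        k2 = f(t + h / 2, P + h / 2 * k1)
        k3 = f(t + h / 2, P + h / 2 * k2)
        k4 = f(t + h, P + h * k3)
        Ps.append(P + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4))
    # L(t) = P(t) C^T(t) W^{-1}(t); solve() avoids forming W^{-1} explicitly.
    Ls = [np.linalg.solve(W(t), C(t) @ P).T for t, P in zip(ts, Ps)]
    return ts, Ps, Ls


def const(M):
    """Wrap a constant matrix as a function of time (illustrative helper)."""
    M = np.array(M, dtype=float)
    return lambda t: M


# Scalar check (assumed example): A = 0, C = V = W = 1, P(0) = 0 gives
# dP/dt = 1 - P^2, whose exact solution is P(t) = tanh(t).
ts, Ps, Ls = kalman_gain_schedule(
    const([[0.0]]), const([[1.0]]), const([[1.0]]), const([[1.0]]),
    P0=[[0.0]], T=2.0, steps=200)
print(Ps[-1][0, 0], np.tanh(2.0))  # the two values should agree closely
```

For stiff or ill-conditioned problems an adaptive or structure-preserving integrator would likely be a better choice than fixed-step RK4.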

The matrix $K(t)$ is called the feedback gain matrix. It is determined by the matrices $A(t),B(t),Q(t),R(t)$ and $F$ through the following matrix Riccati differential equation:

-{\dot {S}}(t)=A^{\mathrm {T} }(t)S(t)+S(t)A(t)-S(t)B(t)R^{-1}(t)B^{\mathrm {T} }(t)S(t)+Q(t),

S(T)=F.

Given the solution $S(t),0\leq t\leq T$, the feedback gain equals

K(t)=R^{-1}(t)B^{\mathrm {T} }(t)S(t).
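The regulator equation is integrated from the terminal condition $S(T)=F$ back to $t=0$. The sketch below illustrates this for the time-invariant case, again assuming fixed-step RK4 and symmetric $S$ and $R$; `feedback_gain_schedule` and the scalar test system are hypothetical names and values chosen for the example.

```python
import numpy as np


def feedback_gain_schedule(A, B, Q, R, F, T, steps):
    """Integrate -dS/dt = A^T S + S A - S B R^{-1} B^T S + Q backward from S(T) = F.

    Constant matrices are assumed for brevity. Returns the time grid and the
    lists S(t_j), K(t_j) ordered by increasing time.
    """
    def rhs(S):
        # This is -dS/dt, i.e. the derivative with respect to tau = T - t.
        SB = S @ B
        return A.T @ S + S @ A - SB @ np.linalg.solve(R, SB.T) + Q

    h = T / steps
    S = np.array(F, dtype=float)
    Ss = [S]
    for _ in range(steps):          # march from t = T down to t = 0
        k1 = rhs(S)
        k2 = rhs(S + h / 2 * k1)
        k3 = rhs(S + h / 2 * k2)
        k4 = rhs(S + h * k3)
        S = S + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        Ss.append(S)
    Ss.reverse()                    # Ss[j] now corresponds to t_j = j * h
    ts = np.linspace(0.0, T, steps + 1)
    Ks = [np.linalg.solve(R, B.T @ S) for S in Ss]   # K = R^{-1} B^T S
    return ts, Ss, Ks


# Scalar check (assumed example): A = 0, B = Q = R = 1, F = 0 gives
# S(t) = tanh(T - t), so K(0) = tanh(T) and K(T) = 0.
one = np.array([[1.0]])
ts, Ss, Ks = feedback_gain_schedule(
    np.array([[0.0]]), one, one, one, F=[[0.0]], T=2.0, steps=200)
print(Ks[0][0, 0], np.tanh(2.0))
```

Because the gain schedule depends only on the model and the cost weights, it can be computed offline and stored before the controller runs.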

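Observe the similarity of the two matrix Riccati differential equations: the first runs forward in time, the second backward in time. This similarity is called duality. The first matrix Riccati differential equation solves the linear–quadratic estimation problem (LQE), and the second solves the linear–quadratic regulator problem (LQR). The two problems are dual, and together they solve the linear–quadratic–Gaussian control problem (LQG). The LQG problem therefore separates into the LQE and LQR problems, which can be solved independently; this is why the LQG problem is called separable.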
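The duality can be made explicit by reversing time in the regulator equation. The short derivation below is a sketch under the assumption that $S$, $P$, $R$ and $W$ are symmetric; the symbols $\tau$ and $\tilde{S}$ are introduced here for illustration only.

```latex
% Reverse time in the regulator equation: \tau = T - t, \tilde{S}(\tau) = S(T - \tau),
% with all coefficient matrices evaluated at t = T - \tau.
\begin{aligned}
\frac{d\tilde{S}}{d\tau} &= A^{\mathrm{T}}\tilde{S} + \tilde{S}A
    - \tilde{S}BR^{-1}B^{\mathrm{T}}\tilde{S} + Q, & \tilde{S}(0) &= F, \\
\dot{P} &= AP + PA^{\mathrm{T}} - PC^{\mathrm{T}}W^{-1}CP + V,
    & P(0) &= \mathbb{E}\left[\mathbf{x}(0)\mathbf{x}^{\mathrm{T}}(0)\right].
\end{aligned}

% The filter equation turns into the time-reversed regulator equation under
A \to A^{\mathrm{T}},\quad C \to B^{\mathrm{T}},\quad V \to Q,\quad W \to R,\quad P(0) \to F,

% and the gains correspond through
L = PC^{\mathrm{T}}W^{-1} \;\longleftrightarrow\; K^{\mathrm{T}} = SBR^{-1}.
```

One practical consequence is that a single Riccati solver can serve both the estimator design and the regulator design.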

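When $A(t),B(t),C(t),Q(t),R(t)$ and the noise intensity matrices $V(t)$, $W(t)$ do not depend on $t$, and $T$ tends to infinity, the LQG controller becomes a time-invariant dynamic system. In that case the two matrix Riccati differential equations above can be replaced by the two associated algebraic Riccati equations.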
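For the time-invariant, infinite-horizon case, the two algebraic Riccati equations can be solved directly. The sketch below uses the textbook Hamiltonian-eigenvector construction with NumPy only; the helper `care`, the double-integrator plant and the unit weights are assumptions made for this example. A Schur-based library routine such as `scipy.linalg.solve_continuous_are` is generally more robust than this eigenvector approach.

```python
import numpy as np


def care(A, B, Q, R):
    """Stabilizing solution S of A^T S + S A - S B R^{-1} B^T S + Q = 0.

    Built from the stable invariant subspace of the Hamiltonian matrix
    (illustrative helper; assumes a stabilizing solution exists).
    """
    n = A.shape[0]
    G = B @ np.linalg.solve(R, B.T)
    H = np.block([[A, -G], [-Q, -A.T]])
    eigvals, eigvecs = np.linalg.eig(H)
    stable = eigvecs[:, eigvals.real < 0]   # n columns span the stable subspace
    X1, X2 = stable[:n, :], stable[n:, :]
    S = np.real(X2 @ np.linalg.inv(X1))
    return (S + S.T) / 2                    # remove round-off asymmetry


# Assumed example: double integrator with position measurement.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q, R = np.eye(2), np.array([[1.0]])         # cost weights
V, W = np.eye(2), np.array([[1.0]])         # noise intensities

S = care(A, B, Q, R)
K = np.linalg.solve(R, B.T @ S)             # K = R^{-1} B^T S
P = care(A.T, C.T, V, W)                    # filter equation via duality
L = np.linalg.solve(W, C @ P).T             # L = P C^T W^{-1}

# Time-invariant LQG controller: d(x_hat)/dt = A_c x_hat + L y, u = -K x_hat.
A_c = A - B @ K - L @ C

# Closed loop of plant and controller in the coordinates (x, x_hat).
A_cl = np.block([[A, -B @ K], [L @ C, A_c]])
print(K, L.ravel(), np.linalg.eigvals(A_cl).real)
```

For this example the equations can also be solved by hand, giving $K=[1,\sqrt{3}]$ and $L=[\sqrt{3},1]^{\mathrm{T}}$, which provides a check on the numerical result.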

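Discrete time

Since the discrete-time LQG control problem is similar to the continuous-time one, the description below focuses on the mathematical equations.

The discrete-time linear system equations are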

{\mathbf {x} }_{i+1}=A_{i}\mathbf {x} _{i}+B_{i}\mathbf {u} _{i}+\mathbf {v} _{i},

\mathbf {y} _{i}=C_{i}\mathbf {x} _{i}+\mathbf {w} _{i}.

Here $i$ is the discrete time index and $\mathbf{v}_{i},\mathbf{w}_{i}$ are discrete-time Gaussian white noise processes with covariance matrices $V_{i},W_{i}$, respectively.

The quadratic cost function to be minimized is

J=\mathbb {E} \left[{\mathbf {x} }_{N}^{\mathrm {T} }F{\mathbf {x} }_{N}+\sum _{i=0}^{N-1}(\mathbf {x} _{i}^{\mathrm {T} }Q_{i}\mathbf {x} _{i}+\mathbf {u} _{i}^{\mathrm {T} }R_{i}\mathbf {u} _{i})\right],

F\geq 0,\quad Q_{i}\geq 0,\quad R_{i}>0.

The discrete-time LQG controller is

{\hat {\mathbf {x} }}_{i+1}=A_{i}{\hat {\mathbf {x} }}_{i}+B_{i}{\mathbf {u} }_{i}+L_{i+1}\left({\mathbf {y} }_{i+1}-C_{i+1}\left\{A_{i}{\hat {\mathbf {x} }}_{i}+B_{i}{\mathbf {u} }_{i}\right\}\right),\quad {\hat {\mathbf {x} }}_{0}=\mathbb {E} [{\mathbf {x} }_{0}],

\mathbf {u} _{i}=-K_{i}{\hat {\mathbf {x} }}_{i}.\,
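The two controller equations translate into a predict-and-correct update that runs once per sample. The sketch below assumes the gains are already available; the function name `lqg_step` and the scalar numbers are hypothetical and serve only to illustrate the data flow.

```python
import numpy as np


def lqg_step(x_hat, y_next, A, B, C_next, K, L_next):
    """One discrete-time LQG update.

    Applies u_i = -K_i x_hat_i, predicts with the model, then corrects the
    prediction with the new measurement y_{i+1}.
    """
    u = -K @ x_hat                                   # control law
    x_pred = A @ x_hat + B @ u                       # model prediction
    x_hat_next = x_pred + L_next @ (y_next - C_next @ x_pred)
    return u, x_hat_next


# Scalar illustration with assumed gains K = 0.5 and L = 0.6.
A = B = C = np.array([[1.0]])
K = np.array([[0.5]])
L = np.array([[0.6]])
u, x_hat_next = lqg_step(np.array([2.0]), np.array([1.5]), A, B, C, K, L)
print(u, x_hat_next)
```

Here the prediction is $1.0$ and the innovation is $0.5$, so the corrected estimate is $1.0+0.6\cdot 0.5=1.3$.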

The Kalman gain equals

L_{i}=P_{i}C_{i}^{\mathrm {T} }(C_{i}P_{i}C_{i}^{\mathrm {T} }+W_{i})^{-1},

where $P_{i}$ is determined by the following matrix Riccati difference equation, which runs forward in time:

P_{i+1}=A_{i}\left(P_{i}-P_{i}C_{i}^{\mathrm {T} }\left(C_{i}P_{i}C_{i}^{\mathrm {T} }+W_{i}\right)^{-1}C_{i}P_{i}\right)A_{i}^{\mathrm {T} }+V_{i},\quad P_{0}=\mathbb {E} \left[\left({\mathbf {x} }_{0}-{\hat {\mathbf {x} }}_{0}\right)\left({\mathbf {x} }_{0}-{\hat {\mathbf {x} }}_{0}\right)^{\mathrm {T} }\right].
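The forward recursion can be written as a short loop. This is a minimal sketch, assuming the time-varying matrices are supplied as Python lists indexed by $i$ and that $P_{i}$ is symmetric; `kalman_gains` and the scalar data are hypothetical names and values for illustration.

```python
import numpy as np


def kalman_gains(A, C, V, W, P0):
    """Forward Riccati difference recursion.

    A, C, V, W are lists of matrices indexed by i. Returns the lists
    [P_0, P_1, ...] and [L_0, L_1, ...] for the supplied steps.
    """
    P = np.array(P0, dtype=float)
    Ps, Ls = [], []
    for Ai, Ci, Vi, Wi in zip(A, C, V, W):
        innovation_cov = Ci @ P @ Ci.T + Wi
        # L_i = P_i C_i^T (C_i P_i C_i^T + W_i)^{-1}, using symmetry of P_i.
        Li = np.linalg.solve(innovation_cov, Ci @ P).T
        Ps.append(P)
        Ls.append(Li)
        P = Ai @ (P - Li @ Ci @ P) @ Ai.T + Vi       # P_{i+1}
    return Ps, Ls


# Scalar illustration: A = C = V = W = 1, P_0 = 1, three steps.
one = np.array([[1.0]])
Ps, Ls = kalman_gains([one] * 3, [one] * 3, [one] * 3, [one] * 3, P0=one)
print([float(P[0, 0]) for P in Ps])   # P_0, P_1, P_2
print([float(L[0, 0]) for L in Ls])   # L_0, L_1, L_2
```

By hand, $P_{0}=1$ gives $L_{0}=0.5$, $P_{1}=1.5$, $L_{1}=0.6$ and $P_{2}=1.6$, which the loop should reproduce.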

The feedback gain matrix equals

K_{i}=(B_{i}^{\mathrm {T} }S_{i+1}B_{i}+R_{i})^{-1}B_{i}^{\mathrm {T} }S_{i+1}A_{i},

where $S_{i}$ is determined by the following matrix Riccati difference equation, which runs backward in time:

S_{i}=A_{i}^{\mathrm {T} }\left(S_{i+1}-S_{i+1}B_{i}\left(B_{i}^{\mathrm {T} }S_{i+1}B_{i}+R_{i}\right)^{-1}B_{i}^{\mathrm {T} }S_{i+1}\right)A_{i}+Q_{i},\quad S_{N}=F.
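The backward recursion starts from $S_{N}=F$ and produces the gains in reverse order. The following sketch assumes list-valued time-varying matrices and symmetric $S_{i}$; `feedback_gains` and the scalar data are hypothetical names and values for illustration.

```python
import numpy as np


def feedback_gains(A, B, Q, R, F):
    """Backward Riccati difference recursion.

    A, B, Q, R are lists of length N indexed by i. Returns the list
    [K_0, ..., K_{N-1}] and the cost-to-go matrix S_0.
    """
    N = len(A)
    S = np.array(F, dtype=float)                     # S_N = F
    Ks = [None] * N
    for i in reversed(range(N)):
        BtS = B[i].T @ S                             # B_i^T S_{i+1}
        Ks[i] = np.linalg.solve(BtS @ B[i] + R[i], BtS @ A[i])
        # S_i = A_i^T (S_{i+1} A_i - S_{i+1} B_i K_i) + Q_i
        S = A[i].T @ (S @ A[i] - BtS.T @ Ks[i]) + Q[i]
    return Ks, S


# Scalar illustration: A = B = Q = R = 1, F = 0, horizon N = 3.
one = np.array([[1.0]])
Ks, S0 = feedback_gains([one] * 3, [one] * 3, [one] * 3, [one] * 3,
                        F=np.array([[0.0]]))
print([float(K[0, 0]) for K in Ks], float(S0[0, 0]))
```

By hand, $S_{3}=0$ gives $K_{2}=0$, $S_{2}=1$, $K_{1}=0.5$, $S_{1}=1.5$, $K_{0}=0.6$ and $S_{0}=1.6$. The last gain is zero because no cost is charged on the state after the final step when $F=0$.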

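If all the matrices in the problem formulation are time-invariant and the horizon $N$ tends to infinity, the discrete-time LQG controller becomes time-invariant. In that case the matrix Riccati difference equations can be replaced by their associated discrete-time algebraic Riccati equations. These determine the time-invariant linear–quadratic estimator and the time-invariant linear–quadratic regulator in discrete time. To keep the cost finite, $J/N$ is used in place of $J$.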
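One simple way to approximate the solution of the discrete-time algebraic Riccati equation is to iterate the difference equation until it stops changing. The sketch below assumes this fixed-point iteration converges for the system at hand (it does for the scalar example shown); `dare_fixed_point` is a hypothetical helper, and a dedicated solver such as `scipy.linalg.solve_discrete_are` is usually preferable in practice.

```python
import numpy as np


def dare_fixed_point(A, B, Q, R, tol=1e-12, max_iter=10_000):
    """Iterate S <- A^T (S - S B (B^T S B + R)^{-1} B^T S) A + Q to a fixed point."""
    S = np.array(Q, dtype=float)
    for _ in range(max_iter):
        BtS = B.T @ S
        K = np.linalg.solve(BtS @ B + R, BtS @ A)
        S_next = A.T @ (S @ A - BtS.T @ K) + Q
        if np.max(np.abs(S_next - S)) < tol:
            return S_next
        S = S_next
    raise RuntimeError("Riccati iteration did not converge")


# Scalar illustration (assumed data): A = B = C = 1 with unit weights and covariances.
A = B = C = np.array([[1.0]])
Q = R = V = W = np.array([[1.0]])

S = dare_fixed_point(A, B, Q, R)
K = np.linalg.solve(B.T @ S @ B + R, B.T @ S @ A)    # time-invariant feedback gain

P = dare_fixed_point(A.T, C.T, V, W)                 # estimator equation by duality
L = np.linalg.solve(C @ P @ C.T + W, C @ P).T        # time-invariant Kalman gain

print(float(S[0, 0]), float(K[0, 0]), float(L[0, 0]))
```

For this scalar case the algebraic equation reduces to $s^{2}-s-1=0$, so $S=P=(1+\sqrt{5})/2$ and $K=L=(\sqrt{5}-1)/2$.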

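Reduced-order LQG problem

In the classical LQG setting, implementing the LQG controller can be difficult when the dimension of the system state is large. The reduced-order LQG problem, also called the fixed-order LQG problem, addresses this by fixing the number of states of the LQG controller in advance. Because the separation principle no longer applies, this problem is harder to solve, and its solution is no longer unique. Even so, a number of numerical algorithms^[5]^[6]^[7]^[8] are available to solve the associated optimal projection equations^[9]^[10], which constitute necessary and sufficient conditions for a locally optimal reduced-order LQG controller^[5].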

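Robustness of LQG control

LQG optimality does not by itself ensure good robustness properties^[11]. The robust stability of the closed-loop system must be checked separately after the LQG controller has been designed. To improve robustness, some of the system parameters may be modeled as stochastic rather than deterministic. The associated control problem is more difficult, and leads to a similar optimal controller in which only the controller parameters differ^[6].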

References

[1] Karl Johan Astrom. Introduction to Stochastic Control Theory. Vol. 58. Academic Press. 1970. ISBN 0-486-44531-3.
[2] Anders Lindquist. On Feedback Control of Linear Stochastic Systems. SIAM Journal on Control. 1973, 11: 323–343.
[3] Tryphon T. Georgiou and Anders Lindquist. The Separation Principle in Stochastic Control, Redux. IEEE Transactions on Automatic Control. 2013, 58 (10): 2481–2494. doi:10.1109/TAC.2013.2259207.
[4] Athans M. The role and use of the stochastic Linear-Quadratic-Gaussian problem in control system design. IEEE Transactions on Automatic Control. 1971, AC–16 (6): 529–552. doi:10.1109/TAC.1971.1099818.
[5] Van Willigenburg L.G.; De Koning W.L. Numerical algorithms and issues concerning the discrete-time optimal projection equations. European Journal of Control. 2000, 6 (1): 93–100. doi:10.1016/s0947-3580(00)70917-4. Associated software download from Matlab Central.
[6] Van Willigenburg L.G.; De Koning W.L. Optimal reduced-order compensators for time-varying discrete-time systems with deterministic and white parameters. Automatica. 1999, 35: 129–138. doi:10.1016/S0005-1098(98)00138-1. Associated software download from Matlab Central.
[7] Zigic D.; Watson L.T.; Collins E.G.; Haddad W.M.; Ying S. Homotopy methods for solving the optimal projection equations for the H2 reduced order model problem. International Journal of Control. 1996, 56 (1): 173–191. doi:10.1080/00207179208934308.
[8] Collins Jr. E.G.; Haddad W.M.; Ying S. A homotopy algorithm for reduced-order dynamic compensation using the Hyland-Bernstein optimal projection equations. Journal of Guidance, Control, and Dynamics. 1996, 19 (2): 407–417. doi:10.2514/3.21633.
[9] Hyland D.C.; Bernstein D.S. The optimal projection equations for fixed order dynamic compensation. IEEE Transactions on Automatic Control. 1984, AC–29 (11): 1034–1037. doi:10.1109/TAC.1984.1103418.
[10] Bernstein D.S.; Davis L.D.; Hyland D.C. The optimal projection equations for reduced-order discrete-time modeling estimation and control. Journal of Guidance, Control, and Dynamics. 1986, 9 (3): 288–293. doi:10.2514/3.20105.
[11] Green, Michael; Limebeer, David J. N. Linear Robust Control. Englewood Cliffs: Prentice Hall. 1995: 27. ISBN 0-13-102278-4.

Further reading

Stengel, Robert F. Optimal Control and Estimation. New York: Dover. 1994. ISBN 0-486-68200-5.
