Neural-network-based decentralized control of continuous-time nonlinear interconnected systems with unknown dynamics

 





Problem formulation

Consider a continuous-time nonlinear large-scale system ∑ composed of N interconnected subsystems described by

\begin{align*} \sum : {\color{Red} \dot{{\color{Blue} x}}}_i(t)&=f_i[{\color{Blue} x}_i(t)]+{\color{Magenta} g}_i[{\color{Blue} x}_i(t)]\left \{ {\color{Red} u}_i[x_i(t)] +Z_i[{\color{Blue} x}(t)] \right \} \\ i&=1,2,...,N \end{align*}(1)

where

x_i(t) \in \mathbb{R}^{n_i} : state vector of the ith subsystem.

The overall state of the large-scale system ∑ is denoted by x=[{\color{Blue} x}_1^T x_2^T ... x_N^T]^T \in \mathbb{R}^n, where\ n=\sum_{i=1}^N n_i

u_i[x_i(t)] \in \mathbb{R}^{m_i} : control input vector of the ith subsystem.

f_i : \mathbb{R}^{{\color{Red} n}_i} \to \mathbb{R}^{{\color{Red} n}_i} : continuous nonlinear internal dynamics function, with f_i(0)=0.

g_i[x_i(t)] : \mathbb{R}^{{\color{Red} n}_i} \to \mathbb{R}^{{\color{Red} n}_i\times {\color{Magenta} m}_i} : input gain function.

Z_i[x(t)] : interconnected term for the ith subsystem.

The ith isolated subsystem is obtained from Eq. (1) by dropping the interconnected term Z_i[x(t)]:

\begin{align*} \sum_{\color{Red} i} : {\color{Red} \dot{{\color{Blue} x}}}_i(t)&=f_i[{\color{Blue} x}_i(t)]+{\color{Magenta} g}_i[{\color{Blue} x}_i(t)]\left \{ {\color{Red} u}_i[{\color{Blue} x}_i(t)] \right \} \\ i&=1,2,...,N \end{align*}(2)
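The difference between Eq. (1) and Eq. (2) can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it takes N = 2 scalar subsystems, and the functions `f`, `g`, `u`, `coupled` and the forward-Euler integrator are hypothetical choices made for this example, not the dynamics or controller of the paper.

```python
import math

def f(x_i):
    """Assumed internal dynamics f_i with f_i(0) = 0 (same for both subsystems)."""
    return -x_i

def g(x_i):
    """Assumed input gain g_i (constant here)."""
    return 1.0

def u(x_i):
    """Assumed decentralized feedback: it uses only the local state x_i."""
    return -x_i

def coupled(i, x):
    """Assumed interconnected term Z_i[x]: depends on the other subsystem's state."""
    return 0.2 * x[1 - i]

def isolated(i, x):
    """Z_i = 0 recovers the isolated subsystem of Eq. (2)."""
    return 0.0

def simulate(x0, Z, dt=1e-3, steps=5000):
    """Forward-Euler integration of x_i' = f_i(x_i) + g_i(x_i) * (u_i(x_i) + Z_i(x))."""
    x = list(x0)
    for _ in range(steps):
        dx = [f(xi) + g(xi) * (u(xi) + Z(i, x)) for i, xi in enumerate(x)]
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x

x0 = [1.0, -0.5]
x_coupled = simulate(x0, coupled)    # Eq. (1) with N = 2
x_isolated = simulate(x0, isolated)  # Eq. (2)
print(x_coupled, x_isolated)
```

With these assumed values both trajectories decay toward the origin, but the coupled states differ from the isolated ones because each subsystem is also driven by its neighbour through Z_i.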


Decentralized control law

Optimal control

———————-

Reinforcement Learning and Optimal Control Methods for Uncertain Nonlinear Systems

Pages 27-29, Section 2.3 (Infinite Horizon Optimal Control Problem). This problem is the same as Definition 1.

Notation:

x(t) \in \chi \subseteq \mathbb{R}^n : state.

u(t) \in U \subseteq \mathbb{R}^m : control input.

 

\dot{x}=F(x,u)     (2-5)

 

Cost function for the system in Eq. 2-5:

J(x(t),u(\tau)_{t\leq \tau < {\infty}} )=\int_{t}^{\infty}r(x(s),u(s))ds     (2-6)

where t : initial time.

r(x,u)   ∈ R : immediate or local cost for the state and control.

 

{\color{Blue} r}(x,u)=Q(x)+u^TRu    (2-7)

where Q(x) \in \mathbb{R} : continuously differentiable and positive definite.

R \in \mathbb{R}^{m \times m} : positive-definite symmetric matrix.
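To make Eqs. 2-6 and 2-7 concrete, the following minimal sketch approximates the infinite-horizon cost numerically for a scalar system and compares it with a closed form. The dynamics x' = -x + u, the feedback u = -k x, the weights `q` and `R`, and the truncation horizon `T` are all assumptions introduced for this example; they do not come from the book.

```python
import math

def running_cost(x, u, q=1.0, R=1.0):
    """Local cost r(x, u) = Q(x) + u^T R u (Eq. 2-7) with the assumed Q(x) = q * x**2."""
    return q * x * x + R * u * u

def truncated_cost(x0, k, T=20.0, dt=1e-3):
    """Approximate J (Eq. 2-6) on [0, T] for the assumed system x' = -x + u, u = -k * x."""
    x, J = x0, 0.0
    for _ in range(int(T / dt)):
        u = -k * x
        J += running_cost(x, u) * dt   # left Riemann sum of the integrand
        x += (-x + u) * dt             # forward-Euler step of the dynamics
    return J

def exact_cost(x0, k, q=1.0, R=1.0):
    """Closed form for this linear example, using x(t) = x0 * exp(-(1 + k) * t)."""
    return (q + R * k * k) * x0 * x0 / (2.0 * (1.0 + k))

print(truncated_cost(1.0, 0.5), exact_cost(1.0, 0.5))
```

The two numbers should agree up to the discretization error of the Euler scheme. Truncating the integral at T is reasonable here only because the assumed closed loop is exponentially stable.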

 

Optimal value function:

V^*(x(t))=min_{{u(\tau)\in \Psi (\chi)},\ {t\leq \tau < \infty}}\int_{t}^{\infty}r \{ x(s),u[x(s)] \}ds     (2-8)

where

\Psi(\cdot) : set of admissible controls.

Bellman’s principle of optimality can be used to derive the following optimality condition:

{\color{Blue} 0=min_{u(t)\in \Psi (\chi)} \left [ r(x,u) + \frac{\partial V^*(x)}{\partial x} F(x,u) \right ]}    (2-9)

which is a nonlinear partial differential equation (PDE), also called the HJB equation.

 

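Optimal control (obtained by substituting the convex local cost of Eq. 2-7 into Eq. 2-9 and setting the derivative of the bracketed term with respect to u to zero):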

u^*(x)=-\frac{1}{2}R^{-1} {\color{Magenta} \frac{\partial F(x,u)^T}{\partial u}}\frac{\partial V^*(x)^T}{\partial x}    (2-10)

 

For the control-affine dynamics of the form

\dot x={\color{Golden} f(x)+g(x)u}=F(x,u)    (2-11)

 

Since \partial F(x,u)/\partial u = g(x) for the dynamics in Eq. 2-11, Eq. 2-10 can be written in terms of the system state as

{\color{Red} u}^*(x)=-\frac{1}{2}R^{-1} {\color{Magenta} g^T(x)}\frac{\partial V^*(x)^T}{\partial x}     (2-12)
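As a quick sanity check of Eq. 2-12, the sketch below evaluates the bracketed term of Eq. 2-9 for a scalar state and a scalar control and confirms numerically that the closed-form control minimizes it. The functions `f`, `g`, `Q`, the weight `R`, and the value `Vx` standing in for the gradient of V^* at x are hypothetical; the check works for any value of `Vx` because the bracket is a convex quadratic in u when R > 0.

```python
def f(x):
    return -x              # assumed drift with f(0) = 0

def g(x):
    return 1.0 + x * x     # assumed input gain

def Q(x):
    return x * x           # assumed positive-definite state penalty

R = 2.0                    # assumed positive control weight (scalar R)

def hamiltonian(u, x, Vx):
    """Bracketed term of Eq. 2-9: r(x, u) + Vx * F(x, u), with F = f + g * u."""
    return Q(x) + R * u * u + Vx * (f(x) + g(x) * u)

def u_star(x, Vx):
    """Eq. 2-12 in the scalar case: u* = -(1/2) * R^{-1} * g(x) * Vx."""
    return -0.5 * g(x) * Vx / R

x, Vx = 0.7, 1.3           # arbitrary test point and stand-in gradient value
u_opt = u_star(x, Vx)
H_opt = hamiltonian(u_opt, x, Vx)

# Any perturbation of u_opt should increase the bracketed term by R * du**2.
for du in (-1.0, -0.1, 0.1, 1.0):
    print(du, hamiltonian(u_opt + du, x, Vx) - H_opt)
```

Each printed difference should be positive and equal to R * du**2 up to rounding, which is what the first-order condition behind Eq. 2-10 predicts.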

 

The HJB in Eq. 2-9 can be rewritten in terms of the optimal value function by substituting for the local cost in Eq. 2-7, the system in Eq. 2-11 and the optimal control in Eq. 2-12, as

\begin{align*} 0 &=min_{u(t)\in \Psi (\chi)} \left [ {\color{Blue} r(x,u)} + \frac{\partial V^*(x)}{\partial x} {\color{Golden} F(x,u)} \right ]\\ &=min_{u(t)\in \Psi (\chi)} \left [ {\color{Blue} Q(x)+u^TRu} + \frac{\partial V^*(x)}{\partial x} \left [{\color{Golden} f(x)+g(x)u }\right ] \right ]\\ &=Q(x)+{\color{Red} u^*}^TR{\color{Red} u^*} + \frac{\partial V^*(x)}{\partial x} \left [ f(x) +g(x){\color{Red} u^*} \right ]\\ &=Q(x)+ \left [ {\color{Red} -\frac{1}{2}R^{-1} g^T(x)\frac{\partial V^*(x)^T}{\partial x}} \right ]^TR \left[ {\color{Red} -\frac{1}{2}R^{-1} g^T(x)\frac{\partial V^*(x)^T}{\partial x}} \right ] +\frac{\partial V^*(x)}{\partial x} \left \{ f(x)+g(x) \left[ {\color{Red} -\frac{1}{2}R^{-1} g^T(x)\frac{\partial V^*(x)^T}{\partial x}} \right ]\right \} \\ \end{align*}

Applying the transpose identity (ABC)^T=C^TB^TA^T to the first bracket:

\begin{align*} 0&=Q(x)+ \left \{ -\frac{1}{2} \left [ \frac{\partial V^*(x)^T}{\partial x} \right ]^T [g^T(x)] ^T \left [ R^{-1} \right]^T \right \} R \left[ {\color{Red} -\frac{1}{2}R^{-1} g^T(x)\frac{\partial V^*(x)^T}{\partial x}} \right ] +\frac{\partial V^*(x)}{\partial x} \left \{ f(x)+g(x) \left[ {\color{Red} -\frac{1}{2}R^{-1} g^T(x)\frac{\partial V^*(x)^T}{\partial x}} \right ]\right \} \\ &=Q(x) + \frac{1}{4}\frac{\partial V^*(x)}{\partial x}g(x){R^{-1}}^T g^T(x)\frac{\partial V^*(x)^T}{\partial x} + \frac{\partial V^*(x)}{\partial x}f(x)-\frac{1}{2}\frac{\partial V^*(x)}{\partial x}g(x) R^{-1}g^T(x)\frac{\partial V^*(x)^T}{\partial x} \end{align*}

Because R is symmetric, R^T=R, and therefore {\color{Blue} {R^{-1}}^T=R^{-1}}, so the derivation becomes

\begin{align*} 0 &=min_{u(t)\in \Psi (\chi)} \left [ r(x,u) + \frac{\partial V^*(x)}{\partial x} F(x,u) \right ]\\ &=min_{u(t)\in \Psi (\chi)} \left [ Q(x)+u^TRu + \frac{\partial V^*(x)}{\partial x} \left [f(x)+g(x)u \right ] \right ]\\ &=Q(x)+{\color{Red} u^*}^TR{\color{Red} u^*} + \frac{\partial V^*(x)}{\partial x} \left [ f(x) +g(x){\color{Red} u^*} \right ]\\ &=Q(x) + \frac{1}{4}\frac{\partial V^*(x)}{\partial x}g(x){\color{Blue} {R^{-1}}^T} g^T(x)\frac{\partial V^*(x)^T}{\partial x} + \frac{\partial V^*(x)}{\partial x}f(x)-\frac{1}{2}\frac{\partial V^*(x)}{\partial x}g(x) R^{-1}g^T(x)\frac{\partial V^*(x)^T}{\partial x} \\ &=Q(x) + \frac{1}{4}\frac{\partial V^*(x)}{\partial x}g(x){\color{Blue} R^{-1}} g^T(x)\frac{\partial V^*(x)^T}{\partial x} + \frac{\partial V^*(x)}{\partial x}f(x)-\frac{1}{2}\frac{\partial V^*(x)}{\partial x}g(x) R^{-1}g^T(x)\frac{\partial V^*(x)^T}{\partial x} \\ \end{align*}

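Combining the two quadratic terms, \frac{1}{4}-\frac{1}{2}=-\frac{1}{4}, gives the HJB equation together with the boundary condition V^*(0)=0:

\begin{align*} {\color{Blue} 0}&{\color{Blue} = }{\color{Blue} Q(x)+\frac{\partial V^*(x)}{\partial x}f(x)-\frac{1}{4}\frac{\partial V^*(x)}{\partial x}g(x) R^{-1} g^T(x)\frac{\partial V^*(x)^T}{\partial x}}\\ 0&=V^*(0) \end{align*}   (2-13)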
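Eq. 2-13 can be checked on a case where the optimal value function is known in closed form. The sketch below assumes a scalar linear-quadratic problem, f(x) = a x, g(x) = b, Q(x) = q x^2 and a scalar weight r, for which the candidate V^*(x) = p x^2 reduces Eq. 2-13 to the scalar Riccati equation q + 2 a p - (b^2 / r) p^2 = 0. The numbers `a`, `b`, `q`, `r` are illustrative assumptions only.

```python
import math

a, b, q, r = 1.0, 1.0, 1.0, 1.0   # assumed scalar problem data

# Positive root of q + 2*a*p - (b**2 / r) * p**2 = 0, so V*(x) = p * x**2 is positive definite.
p = r * (a + math.sqrt(a * a + q * b * b / r)) / (b * b)

def dV(x):
    """Gradient of the candidate value function V*(x) = p * x**2."""
    return 2.0 * p * x

def hjb_residual(x):
    """Right-hand side of Eq. 2-13 with f(x) = a*x, g(x) = b, Q(x) = q*x**2."""
    return q * x * x + dV(x) * a * x - 0.25 * dV(x) * b * (1.0 / r) * b * dV(x)

def u_star(x):
    """Eq. 2-12 for this example; it reduces to the linear feedback -(b*p/r) * x."""
    return -0.5 * (1.0 / r) * b * dV(x)

for x in (-2.0, 0.3, 1.5):
    print(x, hjb_residual(x), u_star(x))
```

The residual should vanish up to rounding at every test point, and V^*(0) = 0 holds trivially for the quadratic candidate. For general nonlinear f and g no such closed form is available, which is why the HJB equation is usually solved approximately.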

———————-

 

 

 


 
