RL Math
Global Value vs. Sub-goals by Policy Gradient
Neuro-Dynamic Programming Gradient Methods
Policy Gradient Method for Hierarchical RL
Policy Gradient HRL and Neuro-Dynamic Programming
Policy Gradient Method for HRL
The scanned drafts listed above contain handwritten mathematical formulas and tools, including analyses of certain academic papers and books, as well as some hypotheses and models of my own. They are provided solely for reference and as a source of ideas; their academic rigor is not guaranteed to the standard required for publication.