Actor-Critic Algorithms for Hierarchical Markov Decision Processes
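Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation
When the environment's rewards are sparse and delayed, the paper offers a solution: there is only one agent throughout, but it works in two stages. (1) The meta-controller selects a goal. (2) The controller outputs an action given the current state and goal, and the critic judges whether the goal has been achieved or a terminal state has been reached. Stages 1 and 2 then repeat: the meta-controller selects a new goal, the controller outputs actions again, and so on. My understanding is that this "splits" the environment into N smaller, temporally ordered sub-environments, each corresponding to one goal. In such an environment the agent itself can be treated as a point. The key is that the agent chooses the policy over goals πg that maximizes the expected discounted Q-value; i.e., if the goal sequence g1-g3-g2-… has the highest Q-value among all possible goal sequences, the agent should…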
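The two-stage loop described in this excerpt can be sketched in a few lines of C++. This is only a toy illustration under my own assumptions, not the paper's method: the state is a position on a line, each goal is a target position, and the names EpisodeLog, select_goal, select_action, goal_reached, and run_episode are all hypothetical. Both "policies" are fixed placeholder rules; in the paper they would be learned Q-functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy setting: the state is a position on a line and each goal
// is a target position. None of these names come from the paper.
struct EpisodeLog {
    std::vector<int> goals_reached;  // goals the internal critic marked complete
    int steps = 0;                   // primitive actions taken
    int final_state = 0;             // state when the episode stopped
};

// Stage 1 (meta-controller): pick the next goal. Placeholder rule: take the
// goals in the given order; a learned policy over goals would go here instead.
inline int select_goal(const std::vector<int>& goals, std::size_t index) {
    return goals[index];
}

// Stage 2 (controller): map (state, goal) to a primitive action.
// Placeholder rule: step one unit toward the goal.
inline int select_action(int state, int goal) {
    if (state < goal) return +1;
    if (state > goal) return -1;
    return 0;
}

// Internal critic: reports whether the current goal has been achieved.
inline bool goal_reached(int state, int goal) { return state == goal; }

// Alternate the two stages until the goals run out, the terminal state is
// reached, or the step budget is exhausted.
EpisodeLog run_episode(int start, int terminal, const std::vector<int>& goals,
                       int max_steps) {
    assert(max_steps >= 0);
    EpisodeLog log;
    int state = start;
    for (std::size_t g = 0; g < goals.size(); ++g) {
        const int goal = select_goal(goals, g);
        // The controller acts until the critic says the goal is done,
        // the episode terminates, or the budget runs out.
        while (!goal_reached(state, goal) && state != terminal &&
               log.steps < max_steps) {
            state += select_action(state, goal);
            ++log.steps;
        }
        if (goal_reached(state, goal)) log.goals_reached.push_back(goal);
        if (state == terminal || log.steps >= max_steps) break;
    }
    log.final_state = state;
    return log;
}
```

For example, run_episode(0, 10, {3, 1, 10}, 100) visits the goals 3, 1, and 10 in that order. Each inner while loop plays the role of one of the "small environments" mentioned above, with exactly one goal attached to it.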
Meta Learning Shared Hierarchies
Notation
S: state space.
A: action space.
MDP: transition function P(s', r | s, a), where (s', r) is the next state and reward and (s, a) is the current state and action.
PM: distribution over MDPs M with the same state-action space (S, A).
Agent: a function mapping from a multi-episode history (s0, a0, r0, s1, a1, r1, …
Hierarchical Actor-Critic
Download Hierarchical_Actor-Critic
Flowchart
Terminology
Each line gives the artificial intelligence term first, then its counterpart in optimization/decision/control.
a. Agent: Controller or decision maker
b. Action: Control
c. Environment: System
d. Reward of a stage: (Opposite of) Cost of a stage
e. State value: (Opposite of) Cost of a state
f. Value (or state-value) function: (Opposite of) Cost function
g. Maximizing the value function…
Consider two features x1, x2 for a single training set:
Download Radix Sort
//
//  main.cpp
//  Radix_Sort
//
//  Created by Zhenlin Pei on 12/24/18.
//  Copyright © 2018 Zhenlin Pei. All rights reserved.
//
// C++ implementation of Radix Sort
#include <iostream>
using namespace std;

// A utility function to get maximum value in arr[]
int getMax(int arr[], int n)
{
    int…
Download Bucket Sort
//
//  main.cpp
//  Bucket_Sort
//
//  Created by Zhenlin Pei on 12/24/18.
//  Copyright © 2018 Zhenlin Pei. All rights reserved.
//
// C++ program to sort an array using bucket sort
#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

// Function to sort arr[] of size n using bucket…
Download Counting Sort
//
//  main.cpp
//  Counting_Sort
//
//  Created by Zhenlin Pei on 12/24/18.
//  Copyright © 2018 Zhenlin Pei. All rights reserved.
//
// C program for counting sort
#include <stdio.h>
#include <string.h>
#define RANGE 255

// The main function that sorts the given string arr[] in
// alphabetical order
void countSort(char…
Geeks for Geeks for Heap Sort. Visualization for Heap Sort. Animation for Heap Sort.
Download Heap Sort
//
//  main.cpp
//  Heap_Sort
//
//  Created by Zhenlin Pei on 12/24/18.
//  Copyright © 2018 Zhenlin Pei. All rights reserved.
//
// C++ program for implementation of Heap Sort
#include <iostream>
using namespace std;…
Download Quick Sort
//
//  main.cpp
//  Quick_Sort
//
//  Created by Zhenlin Pei on 12/24/18.
//  Copyright © 2018 Zhenlin Pei. All rights reserved.
//
/* C implementation QuickSort */
#include <stdio.h>

// A utility function to swap two elements
void swap(int* a, int* b)
{
    int t = *a;
    *a…