Algorithm Archives

Protected: Resume

March 18, 2026

There is no excerpt because this is a protected post.

Software and Hardware vs Time by Grok

February 24, 2025

Sure! Your request is to analyze the comprehensive performance of Hideo Kojima’s Metal Gear series on PlayStation platforms over time, calculate a weighted performance index (emphasizing fluidity, with stability, SF precision, and dynamic SF included), and plot it using MATLAB with the Y-axis as the index and the X-axis as time. Below, I’ll…

Club Elo

February 18, 2025

Matlab Code:

    club0 = {'barcelona','bayern','realmadrid','manunited','liverpool','mancity','inter','juventus'};
    ll = {'Barcelona','Bayern','Real Madrid','Man United','Liverpool','Man City','Inter','Juventus'};
    start = '01-Jan-2020';
    D = './';
    for i_club = 1:length(club0)
        club = club0{i_club};
        url = sprintf('http://api.clubelo.com/%s',club);
        % filename = sprintf('%sdata%d.csv',D,i_club);
        % websave(filename, url);
    end
    S = dir(fullfile(D,'data*.csv'));
    datatotal = cell(1,length(club0));
    for k = 1:numel(S)
        F = fullfile(D,S(k).name);
        datatotal{k} = readtable(F);
    end
    …

Technology Node vs Year

February 9, 2025

As semiconductor technology advances, gaming platform hardware accelerates to meet the increasing demands of software, enhancing user interaction and enriching entertainment experiences for the general public. For example, popular video games like “Metal Gear” showcase the improvements in graphics and gameplay made possible by these advancements, allowing players to immerse themselves in more dynamic and…

Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms

April 22, 2026

Solving H-horizon, Stationary Markov Decision Problems In Time Proportional To Log(H)

April 22, 2026

Paul Tseng, Operations Research Letters 9 (1990) 287-297.

Randomized Linear Programming Solves the Discounted Markov Decision Problem In Nearly-Linear (Sometimes Sublinear) Run Time

April 22, 2026

The nonlinear Bellman equation is equivalent to a linear programming problem. Primal-Dual LP: Primal LP (1), Dual LP (2), Minmax Problem (3).
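The equations labeled (1) to (3) are not preserved in this excerpt. As a reference point only, the LP view of the discounted MDP is commonly written as below. This is a sketch in standard notation, not a reproduction of the post's own equations: the symbols $v$ (value vector), $\mu_a$ (dual variables per action), $P_a$ (transition matrix under action $a$), $r_a$ (expected reward vector under action $a$), $\xi$ (initial state distribution), and $\gamma$ (discount factor) are all assumptions introduced here, and rewards are assumed to be maximized.

```latex
% Primal LP (assumed standard form)
\min_{v}\; (1-\gamma)\,\xi^{\top} v
\quad \text{s.t.} \quad (I - \gamma P_a)\, v - r_a \ge 0 \quad \forall a \in \mathcal{A}

% Dual LP
\max_{\mu \ge 0}\; \sum_{a \in \mathcal{A}} \mu_a^{\top} r_a
\quad \text{s.t.} \quad \sum_{a \in \mathcal{A}} (I - \gamma P_a^{\top})\, \mu_a = (1-\gamma)\,\xi

% Minimax (saddle-point) problem obtained from the Lagrangian of the primal
\min_{v}\; \max_{\mu \ge 0}\;
(1-\gamma)\,\xi^{\top} v + \sum_{a \in \mathcal{A}} \mu_a^{\top}\bigl(r_a - (I - \gamma P_a)\, v\bigr)
```

Setting the gradient of the saddle-point objective with respect to $v$ to zero recovers the dual equality constraint, which is how the three forms connect.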

KL Divergence

July 14, 2019

In mathematical statistics, the Kullback–Leibler divergence (also called relative entropy) is a measure of how one probability distribution differs from a second, reference probability distribution. https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence

Information entropy; KL Divergence
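The definition above can be illustrated for discrete distributions with a minimal Python sketch. The function names `entropy` and `kl_divergence` are hypothetical, chosen for this example; natural logarithms are assumed (so results are in nats), and the usual convention 0 · ln 0 = 0 is applied.

```python
import math


def entropy(p):
    """Shannon entropy H(p) = -sum_i p_i * ln(p_i), in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * ln(p_i / q_i) for discrete distributions.

    p is the distribution being measured, q is the reference distribution.
    Returns math.inf when q assigns zero probability to an outcome that p allows.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of outcomes")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # convention: 0 * ln(0 / q) = 0
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


if __name__ == "__main__":
    fair = [0.5, 0.5]
    biased = [0.9, 0.1]
    print("H(fair)          =", entropy(fair))
    print("KL(fair||biased) =", kl_divergence(fair, biased))
    print("KL(biased||fair) =", kl_divergence(biased, fair))
```

The two printed divergences differ, which shows that KL divergence is not symmetric and is therefore not a distance metric in the strict sense.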

The Asymptotic Convergence-Rate of Q-learning

April 22, 2026

The asymptotic rate of convergence of Q-learning is $O(1/t^{R(1-\gamma)})$ if $R(1-\gamma) < 0.5$, where $R = P_{\min}/P_{\max}$ and $P$ is the state-action occupation frequency. $|Q_t(x,a) - Q^*(x,a)| < B/t^{R(1-\gamma)}$ The convergence rate bounds the difference between the current estimate $Q_t$ and the optimal value $Q^*$; the faster this bound shrinks, the faster Q-learning converges. We hope the $O(1/t^{R(1-\gamma)})$ should…
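To get a feel for the bound B / t^(R(1-γ)), the following Python sketch evaluates the exponent and the bound for a few discount factors. The function names and the sample occupation frequencies are hypothetical, and the constant B is left unspecified by the excerpt, so B = 1 is an assumption made purely for illustration.

```python
def rate_exponent(p_min, p_max, gamma):
    """Exponent R * (1 - gamma), with R = p_min / p_max."""
    if not 0 < p_min <= p_max:
        raise ValueError("need 0 < p_min <= p_max")
    if not 0 <= gamma < 1:
        raise ValueError("need 0 <= gamma < 1")
    return (p_min / p_max) * (1.0 - gamma)


def bound_applies(exponent):
    """The O(1 / t**exponent) rate is stated only for exponent < 0.5."""
    return exponent < 0.5


def error_bound(t, exponent, B=1.0):
    """Upper bound B / t**exponent on |Q_t(x,a) - Q*(x,a)| (B = 1 is assumed)."""
    return B / (t ** exponent)


if __name__ == "__main__":
    # Hypothetical occupation frequencies: the least-visited pair is seen
    # half as often as the most-visited pair, so R = 0.5.
    for gamma in (0.5, 0.9, 0.99):
        e = rate_exponent(0.1, 0.2, gamma)
        print(f"gamma={gamma}: exponent={e:.4f}, "
              f"applies={bound_applies(e)}, bound at t=1e6: {error_bound(1e6, e):.4f}")
```

As γ approaches 1 or as the visitation becomes more uneven (smaller R), the exponent shrinks toward 0 and the bound decays very slowly in t.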

Policy Gradient Methods

May 10, 2019

In summary, I guess the reason is: 1. the policy (the probability of an action) has the form: , 2. we obtain (via a ‘math trick’) in the gradient of the objective function (i.e., the value function) in order to get an ‘expectation’ form for : , so ‘ln’ is applied to the policy before taking the gradient, for convenience of analysis. Notation J(θ):…
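The formulas in this excerpt are not preserved. For orientation, the "math trick" mentioned is commonly known as the log-derivative (likelihood-ratio) identity, sketched below in standard notation. The symbols $\pi_\theta(a \mid s)$ and $Q^{\pi_\theta}(s,a)$ are assumptions introduced here and may not match the post's own notation.

```latex
% Log-derivative identity: follows from \nabla \ln f = \nabla f / f
\nabla_\theta \pi_\theta(a \mid s)
  = \pi_\theta(a \mid s)\, \nabla_\theta \ln \pi_\theta(a \mid s)

% Substituting it into the gradient of the objective turns a sum weighted by
% \nabla_\theta \pi_\theta into an expectation under the policy itself
\nabla_\theta J(\theta)
  \;\propto\; \mathbb{E}_{\pi_\theta}\!\left[
      \nabla_\theta \ln \pi_\theta(a \mid s)\; Q^{\pi_\theta}(s,a)
  \right]
```

The expectation form matters in practice because it can be estimated from sampled trajectories, without enumerating all states and actions.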

Dr. Pei

