AI Archives - Dr. Pei

Protected: Resume

March 18, 2026

There is no excerpt because this is a protected post.

Software and Hardware vs Time by Grok

February 24, 2025

Sure! Your request is to analyze the comprehensive performance of Hideo Kojima’s Metal Gear series on PlayStation platforms over time, calculate a weighted performance index (emphasizing fluidity, with stability, SF precision, and dynamic SF included), and plot it using MATLAB with the Y-axis as the index and the X-axis as time. Below, I’ll… read more »

Club Elo

February 18, 2025

Matlab Code:

    club0 = {'barcelona','bayern','realmadrid','manunited','liverpool','mancity','inter','juventus'};
    ll = {'Barcelona','Bayern','Real Madrid','Man United','Liverpool','Man City','Inter','Juventus'};
    start = '01-Jan-2020';
    D = './';
    for i_club = 1:length(club0)
        club = club0{i_club};
        url = sprintf('http://api.clubelo.com/%s', club);
        % filename = sprintf('%sdata%d.csv', D, i_club);
        % websave(filename, url);
    end
    S = dir(fullfile(D, 'data*.csv'));
    datatotal = cell(1, length(club0));
    for k = 1:numel(S)
        F = fullfile(D, S(k).name);
        datatotal{k} = readtable(F);
    end

… read more »

Technology Node vs Year

February 9, 2025

As semiconductor technology advances, gaming platform hardware accelerates to meet the increasing demands of software, enhancing user interaction and enriching entertainment experiences for the general public. For example, popular video games like “Metal Gear” showcase the improvements in graphics and gameplay made possible by these advancements, allowing players to immerse themselves in more dynamic and… read more »

Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms

April 22, 2026


Solving H-horizon, Stationary Markov Decision Problems In Time Proportional To Log(H)

April 22, 2026

Paul Tseng, Operations Research Letters 9 (1990) 287-297.

Randomized Linear Programming Solves the Discounted Markov Decision Problem In Nearly-Linear (Sometimes Sublinear) Run Time

April 22, 2026

The nonlinear Bellman equation is equivalent to a linear programming problem (primal-dual LP): Primal LP (1), Dual LP (2), Minmax Problem (3).
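The equations numbered (1) to (3) did not survive in this excerpt. As a hedged reference point only, the textbook LP formulation of a discounted MDP is sketched below. The notation is an assumption and may differ from the post's in scaling or symbols: P_a is the transition matrix under action a, r_a the reward vector, γ the discount factor, ξ any strictly positive weight vector over states, v the value vector, and μ_a the dual variables.

```latex
% Bellman equation (nonlinear because of the max):
%   v^*(i) = \max_a \Big\{ r_a(i) + \gamma \sum_j P_a(i,j)\, v^*(j) \Big\}

% Primal LP
\begin{aligned}
\min_{v} \quad & \xi^\top v \\
\text{s.t.} \quad & (I - \gamma P_a)\, v \ge r_a \quad \forall a
\end{aligned}

% Dual LP
\begin{aligned}
\max_{\mu} \quad & \sum_a \mu_a^\top r_a \\
\text{s.t.} \quad & \sum_a (I - \gamma P_a)^\top \mu_a = \xi, \qquad \mu_a \ge 0 \quad \forall a
\end{aligned}

% Minimax (Lagrangian saddle-point) problem
\min_{v} \; \max_{\mu \ge 0} \;\; \xi^\top v + \sum_a \mu_a^\top \big( r_a - (I - \gamma P_a)\, v \big)
```

Maximizing the Lagrangian over μ ≥ 0 recovers the primal constraints, and minimizing it over v recovers the dual equality constraint, which is why the three forms share the same optimal value.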

KL Divergence

July 14, 2019

In mathematical statistics, the Kullback–Leibler divergence (also called relative entropy) is a measure of how one probability distribution differs from a second, reference probability distribution. https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence Sections: Information entropy; KL Divergence.
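A minimal sketch of the two quantities named above, for discrete distributions given as lists of probabilities. The function names `entropy` and `kl_divergence` and the example distributions are illustrative assumptions, not taken from the post; values are in nats.

```python
import math


def entropy(p):
    """Shannon entropy H(p) = -sum_i p_i * ln(p_i), in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i * ln(p_i / q_i), in nats.

    Terms with p_i == 0 contribute nothing; if q_i == 0 while p_i > 0
    the divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


if __name__ == "__main__":
    p = [0.5, 0.5]  # hypothetical "true" distribution
    q = [0.9, 0.1]  # hypothetical reference distribution
    print("H(p)       =", entropy(p))
    print("D_KL(p||q) =", kl_divergence(p, q))
    print("D_KL(q||p) =", kl_divergence(q, p))
```

The two printed divergences differ, which illustrates that KL divergence is not symmetric and so is not a distance in the metric sense.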

The Asymptotic Convergence-Rate of Q-learning

April 22, 2026

The asymptotic rate of convergence of Q-learning is O(1/t^{R(1-γ)}) if R(1-γ) < 0.5, where R = P_min/P_max and P is the state-action occupation frequency: |Q_t(x,a) − Q*(x,a)| < B/t^{R(1-γ)} for some constant B. The convergence rate bounds the gap between the current estimate Q_t and the optimal value Q*; the faster this bound shrinks with t, the faster Q-learning converges. We hope the O(1/t^{R(1-γ)}) should… read more »
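To get a feel for the bound quoted above, the sketch below evaluates the exponent R(1-γ) and the envelope B/t^{R(1-γ)} for a few discount factors. The function names, the default B = 1, and the example occupation frequencies are assumptions made for illustration; the source only states that some constant B exists.

```python
def rate_exponent(p_min, p_max, gamma):
    """Exponent R * (1 - gamma), with R = p_min / p_max."""
    if not (0 < p_min <= p_max):
        raise ValueError("need 0 < p_min <= p_max")
    if not (0 <= gamma < 1):
        raise ValueError("need 0 <= gamma < 1")
    return (p_min / p_max) * (1 - gamma)


def bound(t, p_min, p_max, gamma, B=1.0):
    """Envelope B / t**(R * (1 - gamma)); B = 1.0 is an assumed placeholder.

    Only the regime R * (1 - gamma) < 0.5 stated in the text is handled.
    """
    if t <= 0:
        raise ValueError("t must be positive")
    e = rate_exponent(p_min, p_max, gamma)
    if e >= 0.5:
        raise ValueError("bound is stated only for R * (1 - gamma) < 0.5")
    return B / t ** e


if __name__ == "__main__":
    # Hypothetical occupation frequencies giving R = 0.5.
    for gamma in (0.5, 0.9, 0.99):
        e = rate_exponent(0.1, 0.2, gamma)
        print(f"gamma={gamma}: exponent={e:.4f}, bound at t=10^4: {bound(10**4, 0.1, 0.2, gamma):.4f}")
```

The exponent shrinks as γ approaches 1 or as the occupation frequencies become more unbalanced (smaller R), so the envelope decays more slowly in both cases.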

Hierarchical Apprenticeship Learning, with Application to Quadruped Locomotion

June 25, 2019

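The key point of this paper is that the specific task, a robot dog walking over rough terrain to reach a goal, makes the choice of levels fairly natural. Low-level: the four legs and their contact with the ground. High-level: the overall center of gravity and the straight-line distance to the goal (this is where the expert's advice comes in). Analysis follows below. Figure 5 shows the robot dog's footsteps, which differ greatly before and after learning. Using only the footstep constraints (the four legs) makes the robot dog take detours. My understanding is that the four legs care more about how rough the terrain is, stepping wherever the robot is less likely to get stuck or fall, whereas the body path planner plans an approximate trajectory for the robot dog's center of gravity (above the terrain) to the goal, so the path can be understood as caring more about the straight-line distance to the goal. On the test terrain the robot dog cannot get across using path-level demonstrations alone. In other words, if we only care about the straight-line distance from the center of gravity to the goal and ignore the contact of the four legs with the ground, the robot cannot reach the goal, because it will fall or get stuck on the terrain.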
