Experiential Reinforcement Learning
Treat past episodes as a retrievable experience bank rather than only as a source of gradient updates, so the policy can reason about what worked last time before deciding what to try next.
Taiwei Shi, Sihao Chen, Bowen Jiang, Linxin Song, Longqi Yang, Jieyu Zhao
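
To make the "retrievable experience bank" idea concrete, here is a minimal sketch in Python. It is not the authors' implementation, which this excerpt does not describe. The names `Episode`, `ExperienceBank`, and `build_context` are hypothetical, and token-overlap (Jaccard) similarity is an assumed stand-in for whatever retrieval method the paper actually uses. The sketch only shows the general pattern: store past attempts with their outcomes, retrieve the most similar ones for a new task, and put them in front of the policy before it acts.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Episode:
    """One past attempt: what the task was, what was tried, how it went."""
    task: str
    action: str
    reward: float


def _tokens(text: str) -> set[str]:
    # Assumption: crude whitespace tokenization, for illustration only.
    return set(text.lower().split())


@dataclass
class ExperienceBank:
    """Hypothetical store of past episodes with similarity-based retrieval."""
    episodes: list[Episode] = field(default_factory=list)

    def add(self, episode: Episode) -> None:
        self.episodes.append(episode)

    def retrieve(self, task: str, k: int = 2) -> list[Episode]:
        """Return up to k episodes whose task text overlaps most with `task`."""
        query = _tokens(task)

        def similarity(ep: Episode) -> float:
            other = _tokens(ep.task)
            union = query | other
            return len(query & other) / len(union) if union else 0.0

        scored = [(similarity(ep), ep) for ep in self.episodes]
        # Drop episodes with no overlap; rank by similarity, then by reward.
        relevant = [(s, ep) for s, ep in scored if s > 0]
        relevant.sort(key=lambda pair: (pair[0], pair[1].reward), reverse=True)
        return [ep for _, ep in relevant[:k]]


def build_context(bank: ExperienceBank, task: str, k: int = 2) -> str:
    """Format retrieved experience as text a policy could condition on."""
    lines = [f"Task: {task}", "Relevant past attempts:"]
    for ep in bank.retrieve(task, k):
        outcome = "worked" if ep.reward > 0 else "failed"
        lines.append(f"- {ep.action} ({outcome}, reward={ep.reward})")
    return "\n".join(lines)


if __name__ == "__main__":
    bank = ExperienceBank()
    bank.add(Episode("sort a list of numbers", "use sorted()", 1.0))
    bank.add(Episode("parse a json file", "use json.load", 1.0))
    bank.add(Episode("sort a list of strings", "bubble sort by hand", -1.0))
    print(build_context(bank, "sort a list of numbers"))
```

In this sketch the retrieved context includes failed attempts as well as successful ones, since knowing what did not work is also useful to the policy. A gradient update could still be applied alongside retrieval; the bank adds a second channel for reusing experience and does not replace training.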