BY571

230 followers · 0 following

BY571/README.md

Reinforcement Learning

Soft-Actor-Critic-and-Extensions: SAC with PER, ERE, Munchausen, D2RL, parallel envs
CQL: Conservative Q-Learning for offline RL (DQN-CQL & SAC-CQL)
DQN-Atari-Agents: Modular DDQN, Dueling, Noisy, C51, Rainbow, DRQN
IQN-and-Extensions: Implicit Quantile Networks with PER, Noisy, N-step, Dueling
Deep-Reinforcement-Learning-Algorithm-Collection: Reference implementations across deep RL
Upside-Down-Reinforcement-Learning: Schmidhuber's ⅂ꓤ in PyTorch

RL + Robotics

bricksrl: LEGO-based platform for democratizing robotics and RL research · project page
Autonomous-Robocar: Self-driving RC-car: Raspberry Pi + CNN predicting steering and throttle from camera

RL + Trading

torchtrade: Modular RL framework for algorithmic trading · project page

RL + LLM

DistRL-LLM: Distributed RL for LLM fine-tuning across multiple GPUs
SCoRe: Training language models to self-correct via RL
artificial-agent-lab: Autonomous research lab: PI and PhD agents run experiments and write papers
sft-kl-lora-trainer: trl.SFTTrainer with a KL divergence loss between LoRA adapter and base model
Agent-Tool-RL: Teaching small language models to use tools with RL
CoT-Decoding: Chain-of-Thought reasoning without prompting
nanoDiff: Minimal, hackable diffusion language model — nanoGPT for the LLaDA recipe

Pinned

Soft-Actor-Critic-and-Extensions · Public

PyTorch implementation of Soft Actor-Critic (SAC) with Prioritized Experience Replay (PER), Emphasizing Recent Experience (ERE), Munchausen RL, D2RL, and parallel environments.

Python · 296 stars · 35 forks

Upside-Down-Reinforcement-Learning · Public

PyTorch implementation of Upside-Down Reinforcement Learning (⅂ꓤ), based on the paper by Jürgen Schmidhuber.

Jupyter Notebook · 79 stars · 12 forks

DQN-Atari-Agents · Public

Modular, parallel PyTorch implementation of several DQN agents, including DDQN, Dueling DQN, Noisy DQN, C51, Rainbow, and DRQN.

Jupyter Notebook · 122 stars · 15 forks

CQL · Public

PyTorch implementation of Conservative Q-Learning (CQL), an offline reinforcement learning algorithm. Includes DQN-CQL and SAC-CQL variants for discrete and continuous action spaces.

Python · 146 stars · 24 forks

IQN-and-Extensions · Public

PyTorch Implementation of Implicit Quantile Networks (IQN) for Distributional Reinforcement Learning with additional extensions like PER, Noisy layer, N-step bootstrapping, Dueling architecture and…

Jupyter Notebook · 94 stars · 18 forks

Deep-Reinforcement-Learning-Algorithm-Collection · Public

Collection of Deep Reinforcement Learning Algorithms implemented in PyTorch.

Jupyter Notebook · 82 stars · 13 forks