Baochun’s Notes

February 2025

February 15

Saturday

Andriy Burkov’s minimalist implementation of GRPO from scratch: Rather than relying only on a library such as Hugging Face’s TRL, it is worth reading a minimalist, back-to-square-one implementation of the GRPO reinforcement learning algorithm.
