Blog

Short write-ups on VLMs, training, kernels, and building research tooling.

From ~0.15 to ~0.60 Reward: Fast RL Gains on Low-Resource Translation with Small Tweaks
How curriculum, reward shaping, and decoding stability raised reward quickly in a low-resource translation setup.