Examples and Guides
This section provides a high-level overview of the Colab notebooks, scripts, and example directories.
All examples are located in this directory.
| Category | Name/Path | Description |
|---|---|---|
| Colab Notebook | qlora_gemma.ipynb | End-to-end tutorial on supervised fine-tuning (SFT) of the Gemma 270M model for English-French translation using the parameter-efficient LoRA and QLoRA techniques. |
| | grpo_gemma.ipynb | Reinforcement learning tutorial that uses Group Relative Policy Optimization (GRPO) to train the Gemma 3 1B-IT model for math reasoning on the GSM8K benchmark. |
| | dpo_gemma.ipynb | Preference tuning of the Gemma 3 1B-IT model on the GSM8K dataset using Direct Preference Optimization (DPO). |
| | logit_distillation.ipynb | Demonstrates knowledge distillation from a Gemma 7B-IT teacher to a Gemma 2B-IT student on a translation task. |
| Script | rl/grpo/gsm8k/ | Bash scripts for fine-tuning different models and presets (Gemma, Llama, etc.) on the GSM8K mathematical reasoning task using GRPO. |
| | rl/grpo/gsm8k/verl_compatible/ | Bash scripts for GRPO training on the GSM8K dataset with a verl-compatible setup. |
| | deepscaler/ | Scripts and notebooks for reproducing the DeepScaleR experiment (train_deepscaler_nb.py) and for math evaluation. |
| | sft/mtnt/ | Bash scripts with SFT examples on the MTNT translation task for Gemma, Llama, and Qwen models. |
| | model_load/ | Examples of loading Gemma 2 and Gemma 3 models from the safetensors format. |
| | agentic/ | Examples and scripts for agentic workflows with asynchronous rollout. |
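
Several of the entries above train with GRPO. As a rough illustration of the "group relative" idea only, the sketch below normalizes the rewards of several completions sampled for the same prompt against that group's mean and standard deviation. It is not taken from the listed notebooks or scripts; the function name `group_relative_advantages`, the `eps` parameter, and the use of the population standard deviation are assumptions made for this example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for a single prompt.

    Hypothetical helper for illustration: each advantage is the reward's
    offset from the group mean, scaled by the group standard deviation.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations may differ
    # eps avoids division by zero when every reward in the group is equal
    return [(r - mu) / (sigma + eps) for r in rewards]


# Example: four sampled answers to one math problem, reward 1.0 if correct.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above the group average receive a positive advantage and the rest a negative one, so no separate value model is needed to provide a baseline.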