Quick Start
Design Overview
Agentic RL
Performance Considerations
Reliability
Launching Jobs
Rollout
Algorithms
Models
Metrics
Examples and Guides
Talks and Announcements
Contributing
Code of Conduct
🖼️ Example gallery
Tuning
DPO Demo with math (gsm8k)
GRPO Demo
Knowledge Distillation with Tunix: Gemma 7B to Gemma 2B
LoRA & QLoRA Demo
Parameter-Efficient Fine-Tuning of Llama 3.1-8B with LoRA/QLoRA on NVIDIA GPUs using JAX and Tunix
📖 Reference
Supervised fine-tuning (SFT)
Reinforcement learning (RL)
Distillation
Generation