AiderTeach a small open model your task on a single GPU

Fine-Tune an Open Model on One GPU (QLoRA)

84.0Overall score

A budget fine-tuning pipeline that adapts an open model to your domain with QLoRA, fitting training onto a single consumer GPU and exporting straight to a GGUF you can run in Ollama. For builders who have a few thousand labeled examples and want a specialized model without renting a cluster.

84.0Score

980Votes

5Components

Install this build

Export

terminal

pip install unsloth && python train.py --model llama-3.1-8b --4bit

Components

Model

Llama 3.1 8B
Qwen3 8B
Gemma 3 12B

Stack

Unsloth
TRL
PEFT
bitsandbytes
Weights & Biases

Hardware

1x RTX 4090 24GB
16GB works for 8B with 4-bit

Export

llama.cpp GGUF convert
Ollama Modelfile

How it works

Format your examples as instruction or chat JSONL
Unsloth loads the base model in 4-bit and trains LoRA adapters
Track loss and eval samples in Weights & Biases as it runs
Merge, convert to GGUF, and serve the tuned model in Ollama

Summary

84.0 score 980 votes

0 Reviews

Your rating

Loading discussion...

← All builds