Fine-Tuning Articles

Use Gradient Accumulation to Simulate Larger Batches on Small GPUs

Gradient accumulation lets you train models with effectively larger batch sizes than your GPU's memory can hold by splitting a large batch into smaller...

4 min read

Create High-Quality Instruction Datasets for Fine-Tuning

Creating instruction datasets for fine-tuning LLMs feels like teaching a super-intelligent toddler: they can grasp complex concepts but need ver...

2 min read

Tune Learning Rate Schedules for Stable LLM Fine-Tuning

A practical guide to tuning learning rate schedules, covering fine-tuning setup, configuration, and troubleshooting with real-world...

2 min read

Fine-Tune Llama 3 for Instruction Following on Custom Data

Llama 3, when fine-tuned for instruction following, actually learns to predict the next token in a way that aligns with the style and content of your cu...

LoRA Explained: Configure Low-Rank Adaptation for Fine-Tuning

Merge Fine-Tuned Models with mergekit for Ensemble Capabilities

Fine-Tune Mistral 7B on a Custom Dataset with QLoRA

Monitor Fine-Tuning Runs with Weights & Biases

Fine-Tune Multilingual LLMs for Cross-Language Tasks

Fine-Tune GPT-4o Mini with the OpenAI API

Fine-Tune LLMs with ORPO: Odds Ratio Preference Optimization

Detect and Fix Overfitting During LLM Fine-Tuning

Fine-Tune Phi-3 Mini for Edge Deployment on Small GPUs

Automate LLM Fine-Tuning Pipelines for Continuous Retraining

Fine-Tune LLMs in 4-Bit with QLoRA on Consumer GPUs

Deploy Quantized Fine-Tuned Models for Efficient Inference

Train Reward Models from Human Feedback for RLHF

Implement RLHF: Train Reward Models and Run PPO Fine-Tuning

Build the Business Case for Fine-Tuning vs Prompting vs RAG

Secure Your Fine-Tuning Pipeline: Data Privacy and Compliance

Run Supervised Fine-Tuning with TRL on Any Hugging Face Model

Generate Synthetic Training Data with LLMs for Fine-Tuning

Set Up TRL Trainer for Supervised and Preference Fine-Tuning

Speed Up Fine-Tuning 2x with Unsloth on Llama and Mistral

Fine-Tune Vision-Language Models for Multimodal Tasks

Deploy Fine-Tuned Models at Scale with vLLM

Fine-Tuning vs RAG vs Prompting: Choose the Right Approach

A/B Test Base Models Against Fine-Tuned Versions in Production

Merge LoRA Adapters into a Base Model for Deployment

Configure Axolotl for Multi-Format Fine-Tuning Runs

Prevent Catastrophic Forgetting During LLM Fine-Tuning

Format Chat Templates Correctly for Instruction Fine-Tuning

Choose the Best Checkpoint After Fine-Tuning

Fine-Tune LLMs for Classification and Information Extraction

Fine-Tune an LLM for Code Generation on Your Codebase

Estimate GPU Compute Budget Before Starting a Fine-Tuning Run

Fine-Tune LLMs Continually on New Data Without Full Retraining

Format Conversation Datasets for Supervised Fine-Tuning

Compare Cloud GPU Costs for Fine-Tuning: A100 vs H100 vs L40

Deduplicate and Clean Training Data for Better Fine-Tuning Results

Prepare High-Quality Datasets for LLM Fine-Tuning

Configure DeepSpeed ZeRO for Fine-Tuning Large Language Models

Adapt Pre-Trained LLMs to Domain-Specific Tasks with Fine-Tuning

Fine-Tune LLMs with Direct Preference Optimization

Fine-Tune Embedding Models for Domain-Specific Semantic Search

Evaluate Fine-Tuned LLMs with Task-Specific Benchmarks

Configure FSDP for Distributed Fine-Tuning Across Multiple GPUs

Full Fine-Tuning vs PEFT: Choose the Right Approach

Fine-Tune Gemma 2 on a Custom Dataset Step by Step

Export Fine-Tuned Models to GGUF for Local Deployment with Ollama