RAG vs Finetuning LLMs - What to use, when, and why.
RAG (Retrieval Augmented Generation) and finetuning are two popular methods for using Large Language Models (LLMs) with “custom” data.
But after speaking with many customers, I noticed that it is often unclear which method to use, when, and why.
In this post, we will:
Clarify that RAG and finetuning are fundamentally different tools for different problems.
List the appropriate use cases for RAG and for finetuning.
Present a set of heuristics for choosing which method to use, and when.
The heuristics will help AI developers navigate between the two methods and avoid analysis paralysis and premature optimization. Spoiler alert: you may need to use both methods for optimal performance.