Turn Llama 3 into an Embedding Model with LLM2Vec
How to get and train a Llama 3 embedding model for RAG applications
Embedding models are a critical component of retrieval-augmented generation (RAG) systems for large language models (LLMs). They encode the knowledge base and the query written by the user into vectors that can be compared for retrieval. I explained RAG in this article:
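To make the embedding model's role concrete, here is a minimal sketch (plain NumPy, with made-up vectors) of how a RAG system scores knowledge-base chunks against a query by cosine similarity. The toy embeddings below are random stand-ins for whatever embedding model you actually use.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_emb: np.ndarray, chunk_embs: list, top_k: int = 2) -> list:
    # Rank knowledge-base chunks by similarity to the query embedding
    # and return the indices of the top_k best matches.
    scores = [cosine_similarity(query_emb, e) for e in chunk_embs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]

# Toy 4-dimensional embeddings standing in for a real model's output.
rng = np.random.default_rng(0)
chunk_embs = [rng.normal(size=4) for _ in range(5)]
query_emb = chunk_embs[3] + 0.01 * rng.normal(size=4)  # query close to chunk 3

print(retrieve(query_emb, chunk_embs))  # chunk 3 should rank first
```

In a real pipeline, the retrieved chunks are then inserted into the LLM's prompt as context for answering the query.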
Using an embedding model trained or fine-tuned for the same domain as the LLM can significantly improve a RAG system. However, finding or training such an embedding model is often difficult, as in-domain data are usually scarce.
In this article, I show how to turn an LLM into a text embedding model using LLM2Vec. We will see how to do it with Llama 3 to create a RAG system that doesn't need any model other than Llama 3. The same method can be applied to Llama 3.1.
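As background, LLM2Vec converts a decoder-only LLM into a text encoder in three steps: enabling bidirectional attention, training with masked next-token prediction, and optionally applying unsupervised contrastive learning. The sentence embedding is then obtained by pooling the token-level hidden states, typically mean pooling over the non-padding tokens. Below is a minimal sketch of that pooling step, with random NumPy arrays standing in for the LLM's real hidden states:

```python
import numpy as np

def mean_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    # hidden_states: (seq_len, hidden_dim) token representations from the LLM.
    # attention_mask: (seq_len,) with 1 for real tokens, 0 for padding.
    mask = attention_mask[:, None].astype(hidden_states.dtype)
    # Sum only the real tokens, then divide by how many there are.
    return (hidden_states * mask).sum(axis=0) / mask.sum()

rng = np.random.default_rng(42)
hidden = rng.normal(size=(6, 8))     # 6 tokens, hidden size 8
mask = np.array([1, 1, 1, 1, 0, 0])  # last 2 positions are padding
emb = mean_pool(hidden, mask)
print(emb.shape)  # (8,)
```

The resulting fixed-size vector is what gets indexed for the knowledge base and computed for each user query; the llm2vec library performs this pooling internally when you call its encoding method.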
I also wrote a follow-up article to further improve a Llama 3 embedding model with contrastive learning.
The notebook showing how to convert Llama 3 into an embedding model is available here: