Improve a simple semantic search with Contextual Retrieval
Edu Depetris
- Jun 28, 2025
- AI
- Artificial Intelligence
- Contextual Retrieval
- RubyLLM
Let's explore a simple yet powerful technique to improve our text-chunk embeddings by adding context for better semantic retrieval.
We're following this article (Read me) for inspiration, and I'll use a Rails app to keep the setup simple.
Background
We have a large plain-text document and want to perform "semantic search" over it. The basic workflow is:
- Split the document into smaller text chunks.
- Embed each chunk for future semantic search.
- Persist each chunk and its embedding in the database.
Dependencies
We'll use these gems:
gem "neighbor" # Nearest neighbor search for Rails gem "sqlite-vec" # SQLite vector extension for nearest neighbor search gem "baran" # Text Splitter for Large Language Model datasets. gem "ruby_llm" # Ruby wrapper for Large Language Models
Now we read the document, split it into chunks, embed each one, and persist it:

document = File.read("app/lib/document.txt")
splitter = Baran::CharacterTextSplitter.new(chunk_size: 1000, chunk_overlap: 50)
chunks = splitter.chunks(document)

chunks.each do |chunk|
  embedding = RubyLLM.embed(chunk[:text])
  Chunk.create!(content: chunk[:text], embedding: embedding.vectors)
end

puts Chunk.count
Now you can query for nearest neighbors:
embedding = RubyLLM.embed("is it possible to use chain_type='stuff'")
Chunk.nearest_neighbors(:embedding, embedding.vectors, distance: "cosine").limit(3)
Contextual Retrieval
Now we're ready to enrich each chunk: we'll prepend a brief context generated by an LLM.
We'll use the two previous chunks as context (you can also include following chunks):
# index is 1-based, so drop the current chunk before taking the last two
previous_chunks = chunks[0...(index - 1)].last(2).map { it[:text] }.join("\n\n")
Then we create a prompt that asks the LLM to situate the current chunk, using the previous ones as context:
contextual_retrieval_prompt = %(
  The document is an article.

  <document>
  #{doc_content}
  </document>

  Here is the chunk we want to situate within the whole document.
  The chunk number is #{text_index}.

  <chunk>
  #{chunk}
  </chunk>

  These are the two previous chunks of the document to help you understand
  the context of the current chunk.

  <previous_chunks>
  #{previous_chunks}
  </previous_chunks>

  Please give a short succinct context to situate this <chunk> within the overall
  document for the purposes of improving search retrieval of the chunk.
  Answer only with the succinct context and nothing else.
)
Finally, with the answer from the LLM, we prepend the context to the chunk and create the embedding:
# context is the LLM's answer
contextualized_chunk = "#{context}\n\n#{chunk[:text]}"
embedding = RubyLLM.embed(contextualized_chunk)
Chunk.create!(content: contextualized_chunk, embedding: embedding.vectors)
Let's put it all together:
def contextual_retrieval(doc_content:, chunk:, previous_chunks:, text_index:)
  contextual_retrieval_prompt = %(
    The document is an article.

    <document>
    #{doc_content}
    </document>

    Here is the chunk we want to situate within the whole document.
    The chunk number is #{text_index}.

    <chunk>
    #{chunk}
    </chunk>

    These are the two previous chunks of the document to help you understand
    the context of the current chunk.

    <previous_chunks>
    #{previous_chunks}
    </previous_chunks>

    Please give a short succinct context to situate this <chunk> within the overall
    document for the purposes of improving search retrieval of the chunk.
    Answer only with the succinct context and nothing else.
  )

  chat.ask(contextual_retrieval_prompt).content
end

def chat
  @chat ||= RubyLLM.chat
end

# == Start here
document = File.read("app/lib/document.txt")
splitter = Baran::CharacterTextSplitter.new(chunk_size: 500, chunk_overlap: 50)
chunks = splitter.chunks(document)

chunks.each.with_index(1) do |chunk, index|
  # index is 1-based, so drop the current chunk before taking the last two
  previous_chunks = chunks[0...(index - 1)].last(2).map { it[:text] }.join("\n\n")
  chunk_index = "#{index} out of #{chunks.size}"

  context = contextual_retrieval(
    doc_content: document,
    chunk: chunk[:text],
    previous_chunks: previous_chunks,
    text_index: chunk_index
  )

  contextualized_chunk = "#{context}\n\n#{chunk[:text]}"
  embedding = RubyLLM.embed(contextualized_chunk)
  Chunk.create!(content: contextualized_chunk, embedding: embedding.vectors)
end
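To sanity-check the result, we can rerun the earlier nearest-neighbor query against the re-embedded chunks (this snippet is just a quick way to eyeball the top matches, not part of the original script):

embedding = RubyLLM.embed("is it possible to use chain_type='stuff'")
Chunk.nearest_neighbors(:embedding, embedding.vectors, distance: "cosine")
     .limit(3)
     .each { |chunk| puts chunk.content[0, 200] } # preview each match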
That's it! Now our chunks carry a bit more context, which helps when running semantic searches.
Source code is here
Happy Coding!