Improve a simple semantic search with Contextual Retrieval
Edu Depetris
- Jun 28, 2025
- Contextual Retrieval
- Artificial Intelligence
- RubyLLM
- AI
Let's explore a simple yet powerful technique to improve our text-chunk embeddings by adding context for better semantic retrieval.
We're following this article (Read me) for inspiration. I'll use a Rails app to simplify the setup.
Background
We have a large plain-text document and want to perform "semantic search" over it. The basic workflow is:
- Split the document into smaller text chunks.
- Embed each chunk for future semantic search.
- Persist each chunk and its embedding in the database.
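To make step 1 concrete, here's a minimal pure-Ruby sketch of fixed-size character splitting with overlap. It only illustrates the idea behind the `chunk_size` and `chunk_overlap` options we'll pass to Baran below; it is not Baran's actual implementation:

```ruby
# Illustrative fixed-size splitter with overlap (not Baran's real code).
# Each chunk starts chunk_size - chunk_overlap characters after the
# previous one, so consecutive chunks share chunk_overlap characters.
def split_with_overlap(text, chunk_size:, chunk_overlap:)
  step = chunk_size - chunk_overlap
  (0...text.length).step(step).map do |offset|
    { text: text[offset, chunk_size] }
  end
end

chunks = split_with_overlap("abcdefghij", chunk_size: 4, chunk_overlap: 2)
chunks.each { |c| puts c[:text] } # abcd, cdef, efgh, ghij, ij
```

Notice how each chunk repeats the tail of the previous one; that overlap is what keeps sentences from being cut cleanly in half at every boundary.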
Dependencies
We'll use these gems:
gem "neighbor"   # Nearest neighbor search for Rails
gem "sqlite-vec" # SQLite vector extension for nearest neighbor search
gem "baran"      # Text splitter for Large Language Model datasets
gem "ruby_llm"   # Ruby wrapper for Large Language Models
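The snippets below assume a `Chunk` model already wired up for vector search. Something along these lines should work; this is a sketch based on the neighbor gem's SQLite support, and the column names and dimensions are my assumptions, so check the gem's README for your setup. RubyLLM also needs a provider API key configured (e.g. via `RubyLLM.configure`).

```ruby
# db/migrate/xxx_create_chunks.rb (sketch, not from the article)
class CreateChunks < ActiveRecord::Migration[8.0]
  def change
    create_table :chunks do |t|
      t.text :content
      t.binary :embedding # sqlite-vec stores vectors in a binary column
      t.timestamps
    end
  end
end

# app/models/chunk.rb
class Chunk < ApplicationRecord
  # dimensions must match your embedding model's output size
  has_neighbors :embedding, dimensions: 1536
end
```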
document = File.read("app/lib/document.txt")
splitter = Baran::CharacterTextSplitter.new(chunk_size: 1000, chunk_overlap: 50)
chunks = splitter.chunks(document)
chunks.each.with_index(1) do |chunk, index|
embedding = RubyLLM.embed(chunk[:text])
Chunk.create!(content: chunk[:text], embedding: embedding.vectors)
end
puts Chunk.count

Now you can query for nearest neighbors:
embedding = RubyLLM.embed("is it possible to use chain_type='stuff'")
Chunk.nearest_neighbors(:embedding, embedding.vectors, distance: "cosine").limit(3)

Contextual Retrieval
Now we're ready to enrich each chunk: we'll prepend a brief context generated by an LLM.
We'll use the two previous chunks as context (you can also include following chunks):
previous_chunks = chunks[0...(index - 1)].last(2).map { it[:text] }.join("\n\n")

Then, we create a prompt to extract the context of the current chunk from the previous ones:
contextual_retrieval_prompt = %(
The document is an article.
<document>
#{doc_content}
</document>
Here is the chunk we want to situate within the whole document. The chunk number is #{text_index}.
<chunk>
#{chunk}
</chunk>
These are the two previous chunks of the document to help you understand the context of the current chunk.
<previous_chunks>
#{previous_chunks}
</previous_chunks>
Please give a short succinct context to situate this <chunk> within the overall document for the purposes of improving search retrieval of the chunk.
Answer only with the succinct context and nothing else.
)

Finally, with the answer from the LLM, we add the context to the chunk and create the embedding:
# context is the llm's answer
contextualized_chunk = "#{context}\n\n#{chunk[:text]}"
embedding = RubyLLM.embed(contextualized_chunk)
Chunk.create!(content: contextualized_chunk, embedding: embedding.vectors)

Let's put it all together:
def contextual_retrieval(doc_content:, chunk:, previous_chunks:, text_index:)
contextual_retrieval_prompt = %(
The document is an article.
<document>
#{doc_content}
</document>
Here is the chunk we want to situate within the whole document. The chunk number is #{text_index}.
<chunk>
#{chunk}
</chunk>
These are the two previous chunks of the document to help you understand the context of the current chunk.
<previous_chunks>
#{previous_chunks}
</previous_chunks>
Please give a short succinct context to situate this <chunk> within the overall document for the purposes of improving search retrieval of the chunk.
Answer only with the succinct context and nothing else.
)
chat.ask(contextual_retrieval_prompt).content
end
def chat
@chat ||= RubyLLM.chat
end
# == Start here
document = File.read("app/lib/document.txt")
splitter = Baran::CharacterTextSplitter.new(chunk_size: 500, chunk_overlap: 50)
chunks = splitter.chunks(document)
chunks.each.with_index(1) do |chunk, index|
previous_chunks = chunks[0...(index - 1)].last(2).map { it[:text] }.join("\n\n")
chunk_index = "#{index} out of #{chunks.size}"
context = contextual_retrieval(doc_content: document, chunk: chunk[:text],
previous_chunks: previous_chunks, text_index: chunk_index)
contextualized_chunk = "#{context}\n\n#{chunk[:text]}"
embedding = RubyLLM.embed(contextualized_chunk)
Chunk.create!(content: contextualized_chunk, embedding: embedding.vectors)
end

That's it! Now our chunks carry a bit of extra context, which helps semantic search find them.
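One detail worth double-checking is the "two previous chunks" window: with `with_index(1)` the current chunk lives at `chunks[index - 1]`, so the slice's exclusive range must stop just before it, or the current chunk leaks into its own context. Here's a quick plain-Ruby sanity check of that logic, no Rails or LLM required:

```ruby
# Sanity check for the "two previous chunks" window. With a 1-based
# index the current chunk is chunks[index - 1], so the exclusive
# range below stops just before it.
def previous_window(chunks, index)
  chunks[0...(index - 1)].last(2)
end

%w[A B C D].each.with_index(1) do |chunk, index|
  puts "#{chunk}: #{previous_window(%w[A B C D], index).inspect}"
end
# A: [], B: ["A"], C: ["A", "B"], D: ["B", "C"]
```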
Source code is here
Happy Coding!