Improve a simple semantic search with Contextual Retrieval

Edu Depetris

- Jun 28, 2025
  • AI
  • Artificial Intelligence
  • Contextual Retrieval
  • RubyLLM
Let’s explore a simple yet powerful technique to improve our text‐chunk embeddings by adding context for better semantic retrieval.

We're following this article (Read me) for inspiration, and I'll use a Rails app to keep the setup simple.

Background

We have a large plain‐text document and want to perform “semantic search” over it. The basic workflow is:

  1. Split the document into smaller text chunks.
  2. Embed each chunk for future semantic search.
  3. Persist each chunk and its embedding in the database.

Dependencies

We’ll use these gems:

gem "neighbor" # Nearest neighbor search for Rails
gem "sqlite-vec" # SQLite vector extension for nearest neighbor search
gem "baran" # Text Splitter for Large Language Model datasets.
gem "ruby_llm" # Ruby wrapper for Large Language Models
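The script below assumes a `Chunk` model backed by a vector column. Here's a minimal setup sketch following the neighbor gem's SQLite instructions; the 1536 dimension and the migration version are assumptions, so adjust them to your embedding model and Rails version:

```ruby
# Migration: a chunks table with a binary column for the embedding
# (sqlite-vec stores vectors as binary blobs).
class CreateChunks < ActiveRecord::Migration[8.0]
  def change
    create_table :chunks do |t|
      t.text :content
      t.binary :embedding
      t.timestamps
    end
  end
end

# app/models/chunk.rb
class Chunk < ApplicationRecord
  # dimensions must match your embedding model's output size (assumed 1536 here)
  has_neighbors :embedding, dimensions: 1536
end
```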

Here's a simple script that performs these steps (you can read more about chunking strategies in this article):

document = File.read("app/lib/document.txt")

splitter = Baran::CharacterTextSplitter.new(chunk_size: 1000, chunk_overlap: 50)
chunks = splitter.chunks(document)

chunks.each do |chunk|
  embedding = RubyLLM.embed(chunk[:text])
  Chunk.create!(content: chunk[:text], embedding: embedding.vectors)
end

puts Chunk.count

Now you can query for nearest neighbors:

embedding = RubyLLM.embed("is it possible to use chain_type='stuff'")
Chunk.nearest_neighbors(:embedding, embedding.vectors, distance: "cosine").limit(3)
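Here `distance: "cosine"` ranks chunks by cosine distance, i.e. one minus the cosine similarity of the two vectors. A minimal pure-Ruby sketch of the metric (for illustration only; the actual computation happens inside sqlite-vec):

```ruby
# Cosine distance between two equal-length vectors: 1 - (a.b / (|a| * |b|)).
# 0.0 means identical direction, 1.0 means orthogonal.
def cosine_distance(a, b)
  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  1.0 - dot / (norm_a * norm_b)
end

cosine_distance([1.0, 0.0], [1.0, 0.0]) # => 0.0 (same direction)
cosine_distance([1.0, 0.0], [0.0, 1.0]) # => 1.0 (orthogonal)
```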

Contextual Retrieval

Now we're ready to enrich each chunk: we'll prepend a brief context generated by an LLM.

We’ll use the two previous chunks as context (you could also include the following chunks):

previous_chunks = chunks[0...index - 1].last(2).map { it[:text] }.join("\n\n")
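Since `with_index(1)` makes `index` 1-based, the slice must end at `index - 1` to exclude the current chunk itself. A quick sketch of the windowing on plain data:

```ruby
# Demonstrate the previous-chunk window with toy chunks.
chunks = [{ text: "A" }, { text: "B" }, { text: "C" }, { text: "D" }]

chunks.each.with_index(1) do |chunk, index|
  # index is 1-based, so 0...(index - 1) covers everything before this chunk
  previous = chunks[0...index - 1].last(2).map { |c| c[:text] }
  puts "#{chunk[:text]} <- [#{previous.join(', ')}]"
end
# A <- []
# B <- [A]
# C <- [A, B]
# D <- [B, C]
```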

Then we build a prompt that asks the LLM to situate the current chunk within the document, using the previous chunks as context:

contextual_retrieval_prompt = %(
  The document is an article.
  
  <document>
  #{doc_content}
  </document>

  Here is the chunk we want to situate within the whole document. The chunk number is #{text_index}.
  
  <chunk>
  #{chunk}
  </chunk>

  These are the two previous chunks of the document to help you understand the context of the current chunk.
  
  <previous_chunks>
  #{previous_chunks}
  </previous_chunks>
  
  Please give a short succinct context to situate this <chunk> within the overall document for the purposes of improving search retrieval of the chunk.
  Answer only with the succinct context and nothing else.
)

Finally, with the answer from the LLM, we prepend the context to the chunk and create the embedding:

# context is the LLM's answer
contextualized_chunk = "#{context}\n\n#{chunk[:text]}"

embedding = RubyLLM.embed(contextualized_chunk)
Chunk.create!(content: contextualized_chunk, embedding: embedding.vectors)

Let's put it all together:

def contextual_retrieval(doc_content:, chunk:, previous_chunks:, text_index:)
  contextual_retrieval_prompt = %(
    The document is an article.

    <document>
    #{doc_content}
    </document>

    Here is the chunk we want to situate within the whole document. The chunk number is #{text_index}.

    <chunk>
    #{chunk}
    </chunk>

    These are the two previous chunks of the document to help you understand the context of the current chunk.

    <previous_chunks>
    #{previous_chunks}
    </previous_chunks>

    Please give a short succinct context to situate this <chunk> within the overall document for the purposes of improving search retrieval of the chunk.
    Answer only with the succinct context and nothing else.
  )

  chat.ask(contextual_retrieval_prompt).content
end

def chat
  @chat ||= RubyLLM.chat
end

# == Start here

document = File.read("app/lib/document.txt")

splitter = Baran::CharacterTextSplitter.new(chunk_size: 500, chunk_overlap: 50)
chunks = splitter.chunks(document)

chunks.each.with_index(1) do |chunk, index|
  previous_chunks = chunks[0...index - 1].last(2).map { it[:text] }.join("\n\n")
  chunk_index = "#{index} out of #{chunks.size}"

  context = contextual_retrieval(doc_content: document, chunk: chunk[:text],
                                 previous_chunks: previous_chunks, text_index: chunk_index)

  contextualized_chunk = "#{context}\n\n#{chunk[:text]}"

  embedding = RubyLLM.embed(contextualized_chunk)
  Chunk.create!(content: contextualized_chunk, embedding: embedding.vectors)
end

That's it! Now our chunks carry a bit more context, which helps semantic search retrieve more relevant results.

Source code is here

Happy Coding 🎉