Innovating Search Experience with Amazon OpenSearch and Amazon Bedrock

Source: https://bigdataboutique.com/blog/innovating-search-experience-with-amazon-opensearch-and-amazon-bedrock-d045bc
Author: Big Data Boutique

Summary

A practical guide to building production semantic search and RAG using Amazon OpenSearch Service for vector storage and Amazon Bedrock (Claude, Titan) for embeddings and generation.

Architecture Overview

User Query
    ↓
Amazon Bedrock (Titan Embeddings) → query_vector
    ↓
Amazon OpenSearch Service (kNN index) → top-k chunks
    ↓
Amazon Bedrock (Claude) → generated answer
    ↓
User Response

This is a fully AWS-native RAG stack.

OpenSearch Vector Index Setup

import boto3
from opensearchpy import OpenSearch
 
# Create index with vector field
index_body = {
    "settings": {
        "index.knn": True
    },
    "mappings": {
        "properties": {
            "embedding": {
                "type": "knn_vector",
                "dimension": 1536,
                "method": {
                    "name": "hnsw",
                    "space_type": "l2",
                    "engine": "nmslib"
                }
            },
            "content": {"type": "text"},
            "source": {"type": "keyword"}
        }
    }
}

Bedrock Titan Embeddings

bedrock = boto3.client("bedrock-runtime")
 
def embed(text):
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text})
    )
    return json.loads(response["body"].read())["embedding"]

Hybrid Search in OpenSearch

OpenSearch supports BM25 + kNN hybrid:

{
  "query": {
    "hybrid": {
      "queries": [
        {"match": {"content": "query text"}},
        {"knn": {"embedding": {"vector": [...], "k": 10}}}
      ]
    }
  },
  "search_pipeline": {
    "phase_results_processors": [{
      "normalization-processor": {
        "combination": {"technique": "rrf"}
      }
    }]
  }
}

RAG Generation with Claude

def rag_answer(query, retrieved_chunks):
    context = "\n\n".join([c["content"] for c in retrieved_chunks])
    
    response = bedrock.invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",
        body=json.dumps({
            "messages": [{
                "role": "user",
                "content": f"Context:\n{context}\n\nQuestion: {query}"
            }]
        })
    )
    return json.loads(response["body"].read())["content"][0]["text"]

Production Considerations

Chunking strategy: split documents before embedding (see Text Chunking)
Embedding cost: Titan at $0.0001/1K tokens — budget for large corpora
Index warm-up: OpenSearch kNN needs warm-up after restart
Metadata filtering: OpenSearch supports pre-filter on non-vector fields

Migrating to Elasticsearch with Dense Vector for Carousell Spotlight — similar migration case study
Chunking Strategies for LLM Applications — chunking for RAG
Nearest Neighbor Indexes for Similarity Search — HNSW infrastructure

RAG — primary pattern
Dense Vector Retrieval — storage and retrieval
Semantic Search — enabled capability
Hybrid Search — BM25 + kNN
Text Chunking — document preparation
Agentic Search — potential extension with Bedrock agents

Awesome Search KG

Explorer

Innovating Search Experience with Amazon OpenSearch and Amazon Bedrock

Innovating Search Experience with Amazon OpenSearch and Amazon Bedrock

Summary

Architecture Overview

OpenSearch Vector Index Setup

Bedrock Titan Embeddings

Hybrid Search in OpenSearch

RAG Generation with Claude

Production Considerations

Graph View

Table of Contents

Backlinks

Awesome Search KG

Explorer

Innovating Search Experience with Amazon OpenSearch and Amazon Bedrock

Innovating Search Experience with Amazon OpenSearch and Amazon Bedrock

Summary

Architecture Overview

OpenSearch Vector Index Setup

Bedrock Titan Embeddings

Hybrid Search in OpenSearch

RAG Generation with Claude

Production Considerations

Related Articles

Related Concepts

Graph View

Table of Contents

Backlinks