Vector Databases: The Backbone of Modern AI Applications

An expert guide to vector databases — comparing Pinecone, Weaviate, Qdrant, Chroma, and pgvector. Learn when to use each, how to design schemas, and how to scale to billions of vectors.

Why Traditional Databases Can't Handle Embeddings

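A 1536-dimensional embedding cannot be queried efficiently with a B-tree index or a SQL WHERE clause. Those structures rely on ordering or equality of scalar values, while similarity search needs the stored vectors closest to a query under a distance metric such as cosine. Comparing the query against every stored vector is exact, but its cost grows linearly with the size of the collection. Vector databases (VectorDBs) are purpose-built for Approximate Nearest Neighbor (ANN) search, which trades a small amount of recall for much lower latency on high-dimensional vectors.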
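
To make the baseline concrete, here is a minimal sketch of the exact (brute-force) search that ANN indexes approximate. The three-dimensional toy vectors and the names `cosine_similarity`, `exact_top_k`, and `toy_vectors` are illustrative assumptions, not part of any database API.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def exact_top_k(query: list[float], vectors: dict[str, list[float]], k: int = 2):
    """Score every stored vector against the query: O(N * d) work per query."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(doc_id, score) for score, doc_id in scored[:k]]

# Hypothetical toy data standing in for real 1536-dimensional embeddings
toy_vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
top = exact_top_k([1.0, 0.0, 0.0], toy_vectors, k=2)
print(top)
```

Every query touches every vector, which is why this approach stops being practical as collections grow into the millions.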

┌──────────────────────────────────────────────────────────────┐
│               Vector Database Landscape 2025                 │
├────────────────┬────────────┬───────────┬────────────────────┤
│  Database      │  Category  │  ANN Algo │  Best For          │
├────────────────┼────────────┼───────────┼────────────────────┤
│  Pinecone      │  Cloud     │  Custom   │  Managed, scale    │
│  Weaviate      │  OSS+Cloud │  HNSW     │  Hybrid search     │
│  Qdrant        │  OSS+Cloud │  HNSW     │  High performance  │
│  Chroma        │  OSS       │  HNSW     │  Local/prototype   │
│  Milvus        │  OSS+Cloud │  HNSW/IVF │  Enterprise scale  │
│  pgvector      │  PostgreSQL│  HNSW/IVF │  Existing Postgres │
│  Redis VSS     │  OSS+Cloud │  HNSW/Flat│  Low-latency cache │
└────────────────┴────────────┴───────────┴────────────────────┘

Core ANN Algorithms

HNSW (Hierarchical Navigable Small World)

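HNSW is the most widely used ANN index among the databases above. It builds a multi-layer proximity graph: the upper layers are sparse and provide long-range links for fast navigation, while the bottom layer contains every vector and is dense enough for precise search.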

HNSW Layer Structure:

Layer 2:  [A] ──────────────────────── [D]
                                         │
Layer 1:  [A] ──── [B] ──── [C] ──── [D]
                    │         │
Layer 0:  [A]─[B]─[B2]─[C]─[C2]─[D]─[D2]─[E]
          (most nodes are only in layer 0)

Search: Start at the top layer and greedily move to whichever neighbor is
        closest to the query. When no neighbor is closer, descend one layer
        and repeat until layer 0, where the final candidates are collected.
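
The search procedure can be sketched on a hand-built toy graph that mirrors the diagram above. This is a simplified illustration, not a full HNSW implementation: the coordinates in `points` are made up, the graph is wired by hand instead of being built by insertion, and layer 0 uses plain greedy descent where real implementations keep a candidate list sized by `ef_search`.

```python
import math

# Hypothetical 2-D coordinates for the nodes in the diagram
points = {
    "A": (0.0, 0.0), "B": (2.0, 0.0), "B2": (3.0, 0.0), "C": (4.0, 0.0),
    "C2": (5.0, 0.0), "D": (6.0, 0.0), "D2": (7.0, 0.0), "E": (8.0, 0.0),
}

# Adjacency lists per layer: layer 2 is the sparsest, layer 0 holds every node
layers = {
    2: {"A": ["D"], "D": ["A"]},
    1: {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]},
    0: {
        "A": ["B"], "B": ["A", "B2"], "B2": ["B", "C"], "C": ["B2", "C2"],
        "C2": ["C", "D"], "D": ["C2", "D2"], "D2": ["D", "E"], "E": ["D2"],
    },
}

def greedy_search(query: tuple[float, float], entry: str = "A") -> str:
    """Walk from the top layer down, moving to a closer neighbor until stuck."""
    current = entry
    for level in sorted(layers, reverse=True):
        while True:
            best = min(layers[level][current], key=lambda n: math.dist(points[n], query))
            if math.dist(points[best], query) >= math.dist(points[current], query):
                break  # no neighbor is closer: descend to the next layer
            current = best
    return current

print(greedy_search((4.6, 0.0)))
```

For the query `(4.6, 0.0)` the walk goes A → D on layer 2, D → C on layer 1, and C → C2 on layer 0, so only a handful of the eight nodes are compared against the query.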

Working with Pinecone

python
from pinecone import Pinecone, ServerlessSpec
from openai import OpenAI
import os

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Create the index if it does not exist yet
# (1536 is the default output size of text-embedding-3-small)
if "knowledge-base" not in pc.list_indexes().names():
    pc.create_index(
        name="knowledge-base",
        dimension=1536,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )

index = pc.Index("knowledge-base")

# Upsert vectors with rich metadata
def upsert_documents(documents: list[dict]):
    """
    documents: [{"id": str, "text": str, "category": str, "date": str}]
    """
    texts = [d["text"] for d in documents]
    response = openai_client.embeddings.create(
        model="text-embedding-3-small", input=texts
    )

    # response.data is returned in the same order as the input texts
    vectors = [
        {
            "id": doc["id"],
            "values": resp.embedding,
            "metadata": {
                "text": doc["text"],
                "category": doc["category"],
                "date": doc["date"],
            },
        }
        for doc, resp in zip(documents, response.data)
    ]
    index.upsert(vectors=vectors, namespace="production")

# Query with an optional metadata filter
def search(query: str, category_filter: str | None = None, top_k: int = 5):
    query_embedding = openai_client.embeddings.create(
        model="text-embedding-3-small", input=[query]
    ).data[0].embedding

    filter_dict = {"category": {"$eq": category_filter}} if category_filter else None

    results = index.query(
        vector=query_embedding,
        top_k=top_k,
        filter=filter_dict,
        include_metadata=True,
        namespace="production",
    )
    return results.matches
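
`upsert_documents` above embeds and upserts everything in a single request, which can run into request size limits on large inputs. One option is to split the input into batches first. The helper below is a generic sketch; the name `chunked` and the batch size are assumptions, so check the current limits in the provider documentation.

```python
from typing import Iterator

def chunked(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Demonstration with plain integers standing in for documents
batches = list(chunked(list(range(7)), size=3))
print(batches)
```

With the function defined earlier, usage would look like `for batch in chunked(documents, 100): upsert_documents(batch)`, where 100 is an arbitrary starting point and not a documented limit.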

Working with Qdrant (High Performance)

python
from qdrant_client import QdrantClient
from qdrant_client.models import (
    VectorParams, Distance, PointStruct,
    Filter, FieldCondition, MatchValue,
)

client = QdrantClient(url="http://localhost:6333")

# Create the collection only if it is missing
# (recreate_collection would drop any existing data on every run)
if not client.collection_exists("articles"):
    client.create_collection(
        collection_name="articles",
        vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
    )

# Enable payload indexing for fast filtering
client.create_payload_index(
    collection_name="articles",
    field_name="category",
    field_schema="keyword",
)
client.create_payload_index(
    collection_name="articles",
    field_name="published_date",
    field_schema="float",  # expects a numeric timestamp
)

# Batch upsert
def batch_upsert(docs: list[dict], batch_size: int = 256):
    for i in range(0, len(docs), batch_size):
        batch = docs[i:i + batch_size]
        points = [
            PointStruct(
                id=doc["id"],  # must be an unsigned integer or a UUID string
                vector=doc["embedding"],
                payload={
                    "text": doc["text"],
                    "category": doc["category"],
                    "published_date": doc["timestamp"],
                },
            )
            for doc in batch
        ]
        client.upsert(collection_name="articles", points=points)

# Search with a payload filter
def search_articles(query_embedding: list[float], category: str = "technology"):
    # query_points replaces the deprecated search(); older clients
    # take the vector as search(query_vector=...) instead
    response = client.query_points(
        collection_name="articles",
        query=query_embedding,
        query_filter=Filter(
            must=[
                FieldCondition(key="category", match=MatchValue(value=category)),
            ]
        ),
        limit=10,
        score_threshold=0.75,  # drop hits with cosine similarity below 0.75
    )
    return response.points

pgvector — Vector Search Inside PostgreSQL

If you already run PostgreSQL, pgvector adds native vector support with no new infrastructure:

sql
-- Install extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create table with vector column
CREATE TABLE documents (
    id BIGSERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    category VARCHAR(100),
    embedding vector(1536),  -- 1536-dimensional vector
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Create HNSW index (fast queries, higher build cost)
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Or IVFFlat index (faster builds and less memory, usually lower recall
-- at the same speed). pgvector suggests lists = rows / 1000 up to 1M rows
-- and sqrt(rows) beyond that; build it after the table has data.
-- CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
-- WITH (lists = 100);

-- Semantic search query
SELECT
    id,
    content,
    category,
    1 - (embedding <=> $1::vector) AS similarity
FROM documents
WHERE category = 'AI'
  AND 1 - (embedding <=> $1::vector) > 0.7
ORDER BY embedding <=> $1::vector
LIMIT 10;
-- <=> is the cosine distance operator
-- <#> is negative inner product
-- <-> is L2 distance

python
# Python integration with asyncpg
import os
import asyncpg

async def search_documents(query_embedding: list[float], category: str | None = None):
    conn = await asyncpg.connect(os.environ["DATABASE_URL"])
    try:
        # SET LOCAL only takes effect inside a transaction block
        async with conn.transaction():
            await conn.execute("SET LOCAL hnsw.ef_search = 128")

            query = """
                SELECT id, content, category,
                       1 - (embedding <=> $1::vector) AS similarity
                FROM documents
                WHERE ($2::text IS NULL OR category = $2)
                  AND 1 - (embedding <=> $1::vector) > 0.7
                ORDER BY embedding <=> $1::vector
                LIMIT 10
            """

            # Convert to pgvector's text format: [x1,x2,...]
            vector_str = "[" + ",".join(map(str, query_embedding)) + "]"

            rows = await conn.fetch(query, vector_str, category)
        return [dict(row) for row in rows]
    finally:
        await conn.close()
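
The three operators named in the SQL comments compute different quantities, which is why the query converts cosine distance into a similarity with `1 - distance`. The pure-Python sketch below spells out the arithmetic; the function names and sample vectors are illustrative and are not part of pgvector.

```python
import math

def inner_product(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a: list[float], b: list[float]) -> float:
    """What <=> returns: 1 minus cosine similarity (0 means same direction)."""
    norms = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / norms

def negative_inner_product(a: list[float], b: list[float]) -> float:
    """What <#> returns: negated so that smaller still means more similar."""
    return -inner_product(a, b)

def l2_distance(a: list[float], b: list[float]) -> float:
    """What <-> returns: straight-line Euclidean distance."""
    return math.dist(a, b)

query = [1.0, 0.0]
same_direction = [2.0, 0.0]  # same direction, different length
orthogonal = [0.0, 1.0]

for name, vec in [("same_direction", same_direction), ("orthogonal", orthogonal)]:
    print(name, cosine_distance(query, vec), negative_inner_product(query, vec), l2_distance(query, vec))
```

Cosine distance ignores vector length, so `same_direction` scores a distance of 0 even though its L2 distance from the query is 1. When all vectors are normalized to unit length, the three operators produce the same ranking.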

Chroma — For Local Development and Prototyping

python
import os
import chromadb
from chromadb.utils import embedding_functions

client = chromadb.PersistentClient(path="./chroma_db")

# Use OpenAI embeddings automatically
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key=os.environ["OPENAI_API_KEY"],
    model_name="text-embedding-3-small",
)

collection = client.get_or_create_collection(
    name="knowledge_base",
    embedding_function=openai_ef,
    metadata={"hnsw:space": "cosine"},  # use cosine distance for the HNSW index
)

# Add documents: Chroma calls the embedding function for you
collection.add(
    documents=["RAG combines retrieval with generation",
               "Vector databases store embeddings"],
    metadatas=[{"source": "blog"}, {"source": "docs"}],
    ids=["doc1", "doc2"],
)

# Query
results = collection.query(
    query_texts=["What is retrieval augmented generation?"],
    n_results=3,
    where={"source": "blog"},  # metadata filter
)
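
`collection.query` returns a dictionary of parallel lists with one inner list per query text, which can be awkward to consume directly. The helper below flattens one query's hits into records. `flatten_query_result` is a hypothetical name, and `sample` is a hand-written stand-in with a placeholder distance, not real output.

```python
def flatten_query_result(results: dict, query_index: int = 0) -> list[dict]:
    """Turn Chroma's parallel lists into one record per hit for a single query."""
    return [
        {"id": doc_id, "document": document, "metadata": metadata, "distance": distance}
        for doc_id, document, metadata, distance in zip(
            results["ids"][query_index],
            results["documents"][query_index],
            results["metadatas"][query_index],
            results["distances"][query_index],
        )
    ]

# Hand-written stand-in shaped like a query result (values are placeholders)
sample = {
    "ids": [["doc1"]],
    "documents": [["RAG combines retrieval with generation"]],
    "metadatas": [[{"source": "blog"}]],
    "distances": [[0.21]],
}
hits = flatten_query_result(sample)
print(hits)
```

With `"hnsw:space": "cosine"` the `distance` field is a cosine distance, so lower values mean closer matches.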

Choosing the Right Vector Database

Decision Tree:

 Already on PostgreSQL?
    YES → Use pgvector (zero new infra)
    NO → Continue...

 Need managed cloud service?
    YES → Pinecone (simplest) or Weaviate Cloud
    NO → Self-hosted Qdrant or Milvus

 Need hybrid search (BM25 + vector)?
    YES → Weaviate (built-in BM25 module)
    NO → Qdrant or Pinecone

 Prototyping/local dev?
    YES → Chroma (pip install chromadb, done)
    NO → Qdrant (Docker: docker run -p 6333:6333 qdrant/qdrant)

 Scale >100M vectors?
    YES → Milvus or Pinecone (enterprise)
    NO → Qdrant handles 10M+ comfortably

Performance Benchmarks

Approximate Nearest Neighbor Search (1M vectors, 1536 dims, recall@10).
Indicative figures only; actual latency and recall depend on hardware,
dataset, and index parameters:

  HNSW (ef_search=128): ~2ms, recall=0.97
  HNSW (ef_search=64):  ~1ms, recall=0.94
  IVFFlat (nprobe=10):  ~4ms, recall=0.90
  Exact search (FLAT):  ~200ms, recall=1.00
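
The recall figures above compare an index's answers against exact search. A minimal sketch of how recall@k is typically computed for one query follows; the function name and the ID lists are made up for illustration.

```python
def recall_at_k(approx_ids: list, exact_ids: list, k: int) -> float:
    """Fraction of the true top-k neighbors that the ANN index also returned."""
    exact_top = set(exact_ids[:k])
    hits = sum(1 for item in approx_ids[:k] if item in exact_top)
    return hits / len(exact_top)

# Ground truth from exact search vs. an ANN result that missed one neighbor
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 42]
print(recall_at_k(approx, exact, k=10))
```

A reported recall such as 0.97 is this value averaged over many queries.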

Memory usage per vector (raw vector data, excluding index overhead):
  float32 (1536 dims): 6KB per vector (1536 x 4 bytes)
  Quantized int8:      1.5KB per vector (~4x compression, ~1% quality loss)
  1M float32 vectors:  ~6GB RAM
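
The memory figures follow from simple arithmetic, and the 4x saving comes from storing one byte per dimension instead of four. The sketch below reproduces the numbers and shows symmetric per-vector scalar quantization as one simple int8 scheme. Individual databases may use different schemes, so treat `quantize_int8` and `dequantize` as illustrative names and not any product's implementation.

```python
DIMS = 1536
FLOAT32_BYTES = DIMS * 4  # 6144 bytes, about 6 KB per vector
INT8_BYTES = DIMS * 1     # 1536 bytes, about 1.5 KB per vector
RAW_GB_PER_MILLION = FLOAT32_BYTES * 1_000_000 / 1e9  # about 6.1 GB

def quantize_int8(vector: list[float]) -> tuple[list[int], float]:
    """Map floats to integers in [-127, 127] using one scale per vector."""
    max_abs = max(abs(x) for x in vector)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(x / scale) for x in vector], scale

def dequantize(codes: list[int], scale: float) -> list[float]:
    """Recover approximate floats; the rounding error is at most scale / 2."""
    return [c * scale for c in codes]

original = [0.5, -1.0, 0.25]
codes, scale = quantize_int8(original)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(original, restored))
print(FLOAT32_BYTES, INT8_BYTES, RAW_GB_PER_MILLION, max_error)
```

The rounding error per dimension is bounded by half the scale, which is the source of the small quality loss quoted above.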

Sumit Kumar Pandey

Full-Stack Developer

Full-Stack Developer with 5+ years of experience building scalable web applications. Passionate about clean code, performance optimization, and modern web technologies.
