postgres-semantic-search

@laguagu/postgres-semantic-search

7 forks

Updated 4/6/2026

PostgreSQL-based semantic and hybrid search with pgvector and ParadeDB. Use when implementing vector search, semantic search, hybrid search, or full-text search in PostgreSQL. Covers pgvector setup, indexing (HNSW, IVFFlat), hybrid search (FTS + BM25 + RRF), ParadeDB as Elasticsearch alternative, and re-ranking with Cohere/cross-encoders. Supports vector(1536) and halfvec(3072) types for OpenAI embeddings. Triggers: pgvector, vector search, semantic search, hybrid search, embedding search, PostgreSQL RAG, BM25, RRF, HNSW index, similarity search, ParadeDB, pg_search, reranking, Cohere rerank, pg_trgm, trigram, fuzzy search, LIKE, ILIKE, autocomplete, typo tolerance, fuzzystrmatch

Installation

$npx agent-skills-cli install @laguagu/postgres-semantic-search

Claude Code

Cursor

Copilot

Codex

Antigravity

Details

Repositorylaguagu/claude-code-nextjs-skills

Pathskills/postgres-semantic-search/SKILL.md

Branchmain

Scoped Name@laguagu/postgres-semantic-search

Usage

After installing, this skill will be available to your AI coding assistant.

Verify installation:

npx agent-skills-cli list

Skill Instructions

name: postgres-semantic-search description: | PostgreSQL-based semantic and hybrid search with pgvector and ParadeDB. Use when implementing vector search, semantic search, hybrid search, or full-text search in PostgreSQL. Covers pgvector setup, indexing (HNSW, IVFFlat), hybrid search (FTS + BM25 + RRF), ParadeDB as Elasticsearch alternative, and re-ranking with Cohere/cross-encoders. Supports vector(1536) and halfvec(3072) types for OpenAI embeddings.

Triggers: pgvector, vector search, semantic search, hybrid search, embedding search, PostgreSQL RAG, BM25, RRF, HNSW index, similarity search, ParadeDB, pg_search, reranking, Cohere rerank, pg_trgm, trigram, fuzzy search, LIKE, ILIKE, autocomplete, typo tolerance, fuzzystrmatch argument-hint: "[question or use case]"

PostgreSQL Semantic Search

Quick Start

1. Setup

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    embedding vector(1536)  -- text-embedding-3-small
    -- Or: embedding halfvec(3072)  -- text-embedding-3-large (50% memory)
);

2. Basic Semantic Search

SELECT id, content, 1 - (embedding <=> query_vec) AS similarity
FROM documents
ORDER BY embedding <=> query_vec
LIMIT 10;

3. Add Index (> 10k documents)

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

Docker Quick Start

# pgvector with PostgreSQL 17
docker run -d --name pgvector-db \
  -e POSTGRES_PASSWORD=postgres \
  -p 5432:5432 \
  pgvector/pgvector:pg17

# Or PostgreSQL 18 (latest)
docker run -d --name pgvector-db \
  -e POSTGRES_PASSWORD=postgres \
  -p 5432:5432 \
  pgvector/pgvector:pg18

# ParadeDB (includes pgvector + pg_search + BM25)
docker run -d --name paradedb \
  -e POSTGRES_PASSWORD=postgres \
  -p 5432:5432 \
  paradedb/paradedb:latest

Connect: psql postgresql://postgres:postgres@localhost:5432/postgres

Cheat Sheet

Distance Operators

embedding <=> query  -- Cosine distance (1 - similarity)
embedding <-> query  -- L2/Euclidean distance
embedding <#> query  -- Negative inner product

Common Queries

-- Top 10 similar (cosine)
SELECT * FROM docs ORDER BY embedding <=> $1 LIMIT 10;

-- With similarity score
SELECT *, 1 - (embedding <=> $1) AS similarity FROM docs ORDER BY 2 DESC LIMIT 10;

-- With threshold
SELECT * FROM docs WHERE embedding <=> $1 < 0.3 ORDER BY 1 LIMIT 10;

-- Preload index (run on startup)
SELECT 1 FROM docs ORDER BY embedding <=> $1 LIMIT 1;

Index Quick Reference

-- HNSW (recommended)
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);

-- With tuning
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops)
WITH (m = 24, ef_construction = 200);

-- Query-time recall
SET hnsw.ef_search = 100;

-- Iterative scan for filtered queries (pgvector 0.8+)
SET hnsw.iterative_scan = relaxed_order;
SET ivfflat.iterative_scan = on;

Decision Trees

Choose Search Method

Query type?
├─ Conceptual/meaning-based → Pure vector search
├─ Exact terms/names → Pure keyword search (FTS)
├─ Fuzzy/typo-tolerant → pg_trgm trigram similarity
├─ Autocomplete/prefix → pg_trgm + prefix index
├─ Substring (LIKE/ILIKE) → pg_trgm GIN index
└─ Mixed/unknown → Hybrid search
    ├─ Simple setup → FTS + RRF (no extra extensions)
    ├─ Better ranking → BM25 + RRF (pg_search extension)
    └─ Full-featured → ParadeDB (Elasticsearch alternative)

Choose Index Type

Document count?
├─ < 10,000 → No index needed
├─ 10k - 1M → HNSW (best recall)
└─ > 1M → IVFFlat (less memory) or HNSW

Choose Vector Type

Embedding model?
├─ text-embedding-3-small (1536) → vector(1536)
├─ text-embedding-3-large (3072) → halfvec(3072) (50% memory savings)
└─ Other models → vector(dimensions)

Operators

Operator	Distance	Use Case
`<=>`	Cosine	Text embeddings (default)
`<->`	L2/Euclidean	Image embeddings
`<#>`	Inner product	Normalized vectors

SQL Functions

Semantic Search

match_documents(query_vec, threshold, limit) - Basic search
match_documents_filtered(query_vec, metadata_filter, threshold, limit) - With JSONB filter
match_chunks(query_vec, threshold, limit) - Search document chunks

Fuzzy Search (pg_trgm)

fuzzy_search_trigram(query_text, threshold, limit) - Trigram similarity search
autocomplete_search(prefix, limit) - Prefix + fuzzy autocomplete
hybrid_search_fuzzy_semantic(query_text, query_vec, limit, rrf_k) - Fuzzy + vector RRF
weighted_fts_search(query_text, language, limit) - FTS with title/content weighting

Hybrid Search (FTS)

hybrid_search_fts(query_vec, query_text, limit, rrf_k, language) - FTS + RRF
hybrid_search_weighted(query_vec, query_text, limit, sem_weight, kw_weight) - Linear combination
hybrid_search_fallback(query_vec, query_text, limit) - Graceful degradation

Hybrid Search (BM25)

hybrid_search_bm25(query_vec, query_text, limit, rrf_k) - BM25 + RRF
hybrid_search_bm25_highlighted(...) - With snippet highlighting
hybrid_search_chunks_bm25(...) - For RAG with chunks

Re-ranking (Optional)

Two-stage retrieval improves precision: fast recall → precise rerank.

When to Use

Results need higher precision
Using < 50 candidates after initial search
Have budget for API calls (Cohere) or compute (local models)

Options

Method	Latency	Quality
Cohere Rerank v4.0	~150-300ms	Best
Zerank 2	~100ms	Best
Voyage Rerank 2.5	~100ms	Excellent
Cross-encoder (local)	~500ms	Very Good

Check provider docs for current pricing. Cohere has a free tier (1000 searches/month).

TypeScript Example (Cohere)

import { CohereClient } from 'cohere-ai';

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY });

async function rerankResults(query: string, documents: string[]) {
  const response = await cohere.rerank({
    model: 'rerank-v4.0-fast',  // or 'rerank-v4.0-pro' for best quality
    query,
    documents,
    topN: 10,
  });
  return response.results;
}

reranking.md - Detailed guide

References

fuzzy-search.md - pg_trgm, fuzzy matching, LIKE/ILIKE, autocomplete, advanced FTS
paradedb.md - ParadeDB full-text search (Elasticsearch alternative)
vector-types.md - vector vs halfvec, dimensions, storage
indexing.md - HNSW, IVFFlat, GIN parameters
hybrid-search.md - FTS, BM25, RRF algorithms
performance.md - Cold-start, memory, HNSW vs IVFFlat

Scripts

setup.sql - Extension and table setup
semantic_search.sql - Semantic search functions
hybrid_search_fts.sql - FTS hybrid functions
hybrid_search_bm25.sql - BM25 hybrid functions
fuzzy_search.sql - pg_trgm fuzzy search, autocomplete, weighted FTS
indexes.sql - Index creation scripts

Common Patterns

TypeScript Integration (Supabase)

// Semantic search
const { data } = await supabase.rpc('match_documents', {
  query_embedding: embedding,
  match_threshold: 0.7,
  match_count: 10
});

// Hybrid search
const { data } = await supabase.rpc('hybrid_search_fts', {
  query_embedding: embedding,
  query_text: userQuery,
  match_count: 10,
  rrf_k: 60,
  fts_language: 'simple'
});

Drizzle ORM

import { sql } from 'drizzle-orm';

const results = await db.execute(sql`
  SELECT * FROM match_documents(
    ${embedding}::vector(1536),
    0.7,
    10
  )
`);

Troubleshooting

Symptom	Cause	Solution
Index not used	< 10k rows or planner choice	Normal for small tables, check with EXPLAIN
Slow first query (30-60s)	HNSW cold-start	`SELECT pg_prewarm('idx_name')` or preload query
Poor recall	Low ef_search	`SET hnsw.ef_search = 100` or higher
FTS returns nothing	Wrong language config	Use `'simple'` for mixed/unknown languages
Memory error on index build	maintenance_work_mem too low	Increase to 2GB+
Cosine similarity > 1	Vectors not normalized	Normalize before insert or use L2
Slow inserts	Index overhead	Batch inserts, consider IVFFlat
Fuzzy search slow	Missing trigram index	`CREATE INDEX USING gin (col gin_trgm_ops)`
ILIKE '%x%' slow	No pg_trgm GIN index	Enable pg_trgm + create GIN trigram index
`%` operator error	pg_trgm not installed	`CREATE EXTENSION IF NOT EXISTS pg_trgm`

Compatibility

pgvector: 0.8+ recommended (iterative scans, halfvec). Check pgvector releases.
pg_search: Check ParadeDB releases for latest.
PostgreSQL: 17+ recommended. pgvector supports 13-18.

External Documentation

pgvector GitHub - Official extension, latest features
OpenAI Embeddings Guide - Embedding models and best practices
Supabase Vector Guide - Supabase-specific integration
ParadeDB pg_search - BM25 extension documentation
ParadeDB AI Docs - Fetch for latest ParadeDB API (always current)
PostgreSQL FTS - Built-in full-text search

More by laguagu

View all

nextjs-shadcn

Creates Next.js 16 frontends with shadcn/ui. Use when building React UIs, components, pages, or applications with shadcn, Tailwind, or modern frontend patterns. Also use when the user asks to create a new Next.js project, add UI components, style pages, or build any web interface — even if they don't mention shadcn explicitly.

ai-elements

Create new AI chat interface components for the ai-elements library following established composable patterns, shadcn/ui integration, and Vercel AI SDK conventions. Use when creating new components in packages/elements/src or when the user asks to add a new component to ai-elements.

ai-sdk-6

Vercel AI SDK v6 development. Use when building AI agents, chatbots, tool integrations, streaming apps, or structured output with the ai package. Covers ToolLoopAgent, useChat, generateText, streamText, tool approval, smoothStream, provider tools, MCP integration, and Output patterns.

cache-components

Expert guidance for Next.js Cache Components and Partial Prerendering (PPR). Use when implementing 'use cache' directive, configuring cache lifetimes with cacheLife(), tagging cached data with cacheTag(), invalidating caches with updateTag()/revalidateTag(), optimizing static vs dynamic content boundaries, managing 'use cache: private' for compliance scenarios, pass-through/interleaving patterns, GET Route Handler caching, debugging cache issues, and reviewing Cache Component implementations.