cloudflare-vectorize

@jezweb/cloudflare-vectorize

jezweb

162

21 forks

Updated 1/18/2026

View on GitHub

Build semantic search with Cloudflare Vectorize V2. Covers async mutations, 5M vectors/index, 31ms latency, returnMetadata enum changes, and V1 deprecation. Prevents 8 errors including dimension mismatches. Use when: building RAG or semantic search, troubleshooting returnMetadata, V2 timing, metadata index, or dimension errors.

Installation

$skills install @jezweb/cloudflare-vectorize

Claude Code

Cursor

Copilot

Codex

Antigravity

Details

Repositoryjezweb/claude-skills

Pathskills/cloudflare-vectorize/SKILL.md

Branchmain

Scoped Name@jezweb/cloudflare-vectorize

Usage

After installing, this skill will be available to your AI coding assistant.

Verify installation:

skills list

Skill Instructions

name: cloudflare-vectorize description: | Build semantic search with Cloudflare Vectorize V2 (Sept 2024 GA). Covers V2 breaking changes: async mutations, 5M vectors/index (was 200K), 31ms latency (was 549ms), returnMetadata enum, and V1 deprecation (Dec 2024).

Use when: migrating V1→V2, handling async mutations with mutationId, creating metadata indexes before insert, or troubleshooting "returnMetadata must be 'all'", V2 timing issues, metadata index errors, dimension mismatches.

Cloudflare Vectorize

Complete implementation guide for Cloudflare Vectorize - a globally distributed vector database for building semantic search, RAG (Retrieval Augmented Generation), and AI-powered applications with Cloudflare Workers.

Status: Production Ready ✅ Last Updated: 2026-01-06 Dependencies: cloudflare-worker-base (for Worker setup), cloudflare-workers-ai (for embeddings) Latest Versions: wrangler@4.54.0, @cloudflare/workers-types@4.20260103.0 Token Savings: ~65% Errors Prevented: 8 Dev Time Saved: ~3 hours

What This Skill Provides

Core Capabilities

✅ Index Management: Create, configure, and manage vector indexes
✅ Vector Operations: Insert, upsert, query, delete, and list vectors
✅ Metadata Filtering: Advanced filtering with 10 metadata indexes per index
✅ Semantic Search: Find similar vectors using cosine, euclidean, or dot-product metrics
✅ RAG Patterns: Complete retrieval-augmented generation workflows
✅ Workers AI Integration: Native embedding generation with @cf/baai/bge-base-en-v1.5
✅ OpenAI Integration: Support for text-embedding-3-small/large models
✅ Document Processing: Text chunking and batch ingestion pipelines

Templates Included

basic-search.ts - Simple vector search with Workers AI
rag-chat.ts - Full RAG chatbot with context retrieval
document-ingestion.ts - Document chunking and embedding pipeline
metadata-filtering.ts - Advanced filtering patterns

⚠️ Vectorize V2 Breaking Changes (September 2024)

IMPORTANT: Vectorize V2 became GA in September 2024 with significant breaking changes.

What Changed in V2

Performance Improvements:

Index capacity: 200,000 → 5 million vectors per index
Query latency: 549ms → 31ms median (18× faster)
TopK limit: 20 → 100 results per query
Scale limits: 100 → 50,000 indexes per account
Namespace limits: 100 → 50,000 namespaces per index

Breaking API Changes:

Async Mutations - All mutations now asynchronous:

// V2: Returns mutationId
const result = await env.VECTORIZE_INDEX.insert(vectors);
console.log(result.mutationId); // "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

// Vector inserts/deletes may take a few seconds to be reflected

returnMetadata Parameter - Boolean → String enum:

// ❌ V1 (deprecated)
{ returnMetadata: true }

// ✅ V2 (required)
{ returnMetadata: 'all' | 'indexed' | 'none' }

Metadata Indexes Required Before Insert:
- V2 requires metadata indexes created BEFORE vectors inserted
- Vectors added before metadata index won't be indexed
- Must re-upsert vectors after creating metadata index

V1 Deprecation Timeline:

December 2024: Can no longer create V1 indexes
Existing V1 indexes: Continue to work (other operations unaffected)
Migration: Use wrangler vectorize --deprecated-v1 flag for V1 operations

Wrangler Version Required:

Minimum: wrangler@3.71.0 for V2 commands
Recommended: wrangler@4.54.0+ (latest)

Check Mutation Status

// Get index info to check last mutation processed
const info = await env.VECTORIZE_INDEX.describe();
console.log(info.mutationId); // Last mutation ID
console.log(info.processedUpToMutation); // Last processed timestamp

Critical Setup Rules

⚠️ MUST DO BEFORE INSERTING VECTORS

# 1. Create the index with FIXED dimensions and metric
npx wrangler vectorize create my-index \
  --dimensions=768 \
  --metric=cosine

# 2. Create metadata indexes IMMEDIATELY (before inserting vectors!)
npx wrangler vectorize create-metadata-index my-index \
  --property-name=category \
  --type=string

npx wrangler vectorize create-metadata-index my-index \
  --property-name=timestamp \
  --type=number

Why: Metadata indexes MUST exist before vectors are inserted. Vectors added before a metadata index was created won't be filterable on that property.

Index Configuration (Cannot Be Changed Later)

# Dimensions MUST match your embedding model output:
# - Workers AI @cf/baai/bge-base-en-v1.5: 768 dimensions
# - OpenAI text-embedding-3-small: 1536 dimensions
# - OpenAI text-embedding-3-large: 3072 dimensions

# Metrics determine similarity calculation:
# - cosine: Best for normalized embeddings (most common)
# - euclidean: Absolute distance between vectors
# - dot-product: For non-normalized vectors

Wrangler Configuration

wrangler.jsonc:

{
  "name": "my-vectorize-worker",
  "main": "src/index.ts",
  "compatibility_date": "2025-10-21",
  "vectorize": [
    {
      "binding": "VECTORIZE_INDEX",
      "index_name": "my-index"
    }
  ],
  "ai": {
    "binding": "AI"
  }
}

TypeScript Types

export interface Env {
  VECTORIZE_INDEX: VectorizeIndex;
  AI: Ai;
}

interface VectorizeVector {
  id: string;
  values: number[] | Float32Array | Float64Array;
  namespace?: string;
  metadata?: Record<string, string | number | boolean | string[]>;
}

interface VectorizeMatches {
  matches: Array<{
    id: string;
    score: number;
    values?: number[];
    metadata?: Record<string, any>;
    namespace?: string;
  }>;
  count: number;
}

Metadata Filter Operators (V2)

Vectorize V2 supports advanced metadata filtering with range queries:

// Equality (implicit $eq)
{ category: "docs" }

// Not equals
{ status: { $ne: "archived" } }

// In/Not in arrays
{ category: { $in: ["docs", "tutorials"] } }
{ category: { $nin: ["deprecated", "draft"] } }

// Range queries (numbers) - NEW in V2
{ timestamp: { $gte: 1704067200, $lt: 1735689600 } }

// Range queries (strings) - prefix searching
{ url: { $gte: "/docs/workers", $lt: "/docs/workersz" } }

// Nested metadata with dot notation
{ "author.id": "user123" }

// Multiple conditions (implicit AND)
{ category: "docs", language: "en", "metadata.published": true }

Metadata Best Practices

1. Cardinality Considerations

Low Cardinality (Good for $eq filters):

// Few unique values - efficient filtering
metadata: {
  category: "docs",        // ~10 categories
  language: "en",          // ~5 languages
  published: true          // 2 values (boolean)
}

High Cardinality (Avoid in range queries):

// Many unique values - avoid large range scans
metadata: {
  user_id: "uuid-v4...",         // Millions of unique values
  timestamp_ms: 1704067200123    // Use seconds instead
}

2. Metadata Limits

Max 10 metadata indexes per Vectorize index
Max 10 KiB metadata per vector
String indexes: First 64 bytes (UTF-8)
Number indexes: Float64 precision
Filter size: Max 2048 bytes (compact JSON)

3. Key Restrictions

// ❌ INVALID metadata keys
metadata: {
  "": "value",              // Empty key
  "user.name": "John",      // Contains dot (reserved for nesting)
  "$admin": true,           // Starts with $
  "key\"with\"quotes": 1    // Contains quotes
}

// ✅ VALID metadata keys
metadata: {
  "user_name": "John",
  "isAdmin": true,
  "nested": { "allowed": true }  // Access as "nested.allowed" in filters
}

Common Errors & Solutions

Error 1: Metadata Index Created After Vectors Inserted

Problem: Filtering doesn't work on existing vectors
Solution: Delete and re-insert vectors OR create metadata indexes BEFORE inserting

Error 2: Dimension Mismatch

Problem: "Vector dimensions do not match index configuration"
Solution: Ensure embedding model output matches index dimensions:
  - Workers AI bge-base: 768
  - OpenAI small: 1536
  - OpenAI large: 3072

Error 3: Invalid Metadata Keys

Problem: "Invalid metadata key"
Solution: Keys cannot:
  - Be empty
  - Contain . (dot)
  - Contain " (quote)
  - Start with $ (dollar sign)

Error 4: Filter Too Large

Problem: "Filter exceeds 2048 bytes"
Solution: Simplify filter or split into multiple queries

Error 5: Range Query on High Cardinality

Problem: Slow queries or reduced accuracy
Solution: Use lower cardinality fields for range queries, or use seconds instead of milliseconds for timestamps

Error 6: Insert vs Upsert Confusion

Problem: Updates not reflecting in index
Solution: Use upsert() to overwrite existing vectors, not insert()

Error 7: Missing Bindings

Problem: "VECTORIZE_INDEX is not defined"
Solution: Add [[vectorize]] binding to wrangler.jsonc

Error 8: Namespace vs Metadata Confusion

Problem: Unclear when to use namespace vs metadata filtering
Solution:
  - Namespace: Partition key, applied BEFORE metadata filters
  - Metadata: Flexible key-value filtering within namespace

Error 9: V2 Async Mutation Timing (NEW in V2)

Problem: Inserted vectors not immediately queryable
Solution: V2 mutations are asynchronous - vectors may take a few seconds to be reflected
  - Use mutationId to track mutation status
  - Check env.VECTORIZE_INDEX.describe() for processedUpToMutation timestamp

Error 10: V1 returnMetadata Boolean (BREAKING in V2)

Problem: "returnMetadata must be 'all', 'indexed', or 'none'"
Solution: V2 changed returnMetadata from boolean to string enum:
  - ❌ V1: { returnMetadata: true }
  - ✅ V2: { returnMetadata: 'all' }

V2 Migration Checklist

If migrating from V1 to V2:

✅ Update wrangler to 3.71.0+ (npm install -g wrangler@latest)
✅ Create new V2 index (can't upgrade V1 → V2)
✅ Create metadata indexes BEFORE inserting vectors
✅ Update returnMetadata boolean → string enum ('all', 'indexed', 'none')
✅ Handle async mutations (expect mutationId in responses)
✅ Test with V2 limits (topK up to 100, 5M vectors per index)
✅ Update error handling for async behavior

V1 Deprecation:

After December 2024: Cannot create new V1 indexes
Existing V1 indexes: Continue to work
Use wrangler vectorize --deprecated-v1 for V1 operations

Official Documentation

Vectorize V2 Docs: https://developers.cloudflare.com/vectorize/
V2 Changelog: https://developers.cloudflare.com/vectorize/platform/changelog/
V1 to V2 Migration: https://developers.cloudflare.com/vectorize/reference/transition-vectorize-legacy/
Metadata Filtering: https://developers.cloudflare.com/vectorize/reference/metadata-filtering/
Workers AI Models: https://developers.cloudflare.com/workers-ai/models/

Status: Production Ready ✅ (Vectorize V2 GA - September 2024) Last Updated: 2025-11-22 Token Savings: ~70% Errors Prevented: 10 (includes V2 breaking changes)

More by jezweb

View all

better-auth

162

Self-hosted auth for TypeScript/Cloudflare Workers with social auth, 2FA, passkeys, organizations, RBAC, and 15+ plugins. Requires Drizzle ORM or Kysely for D1 (no direct adapter). Self-hosted alternative to Clerk/Auth.js. Use when: self-hosting auth on D1, building OAuth provider, multi-tenant SaaS, or troubleshooting D1 adapter errors, session caching, rate limits.

commands

162

/review-skill - Skill Audit Command: Comprehensive skill documentation audit with automated checks and manual review phases.

tiptap

162

Build rich text editors with Tiptap - headless editor framework with React and Tailwind v4. Covers SSR-safe setup, image uploads, prose styling, and collaborative editing. Use when creating blog editors, comment systems, or Notion-like apps, or troubleshooting SSR hydration errors, typography issues, or image upload problems.

cloudflare-workers-ai

162

Run LLMs and AI models on Cloudflare's GPU network with Workers AI. Includes Llama 4, Gemma 3, Mistral 3.1, Flux images, BGE embeddings, streaming, and AI Gateway. Handles 2025 breaking changes. Use when: implementing LLM inference, images, RAG, or troubleshooting AI_ERROR, rate limits, max_tokens, BGE pooling.