Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.crewform.tech/llms.txt

Use this file to discover all available pages before exploring further.

Overview

CrewForm’s Knowledge Base enables Retrieval-Augmented Generation (RAG) — upload documents, and agents automatically search relevant content when answering questions or completing tasks.

How It Works

Upload Document  →  Text Extraction  →  Chunking  →  Embedding (1536-dim)

Agent Task  →  knowledge_search tool  →  Hybrid Search  →  Reranked Results
  1. Upload documents to the Knowledge Base
  2. CrewForm automatically chunks the text and generates vector embeddings
  3. Enable the knowledge_search tool on your agents
  4. During task execution, agents semantically search the knowledge base for relevant context

Supported File Types

FormatExtensionDescription
Plain Text.txtRaw text files
Markdown.mdMarkdown documents
CSV.csvTabular data (rows become chunks)
JSON.jsonStructured data

Uploading Documents

  1. Navigate to Knowledge Base from the sidebar
  2. Click Upload Document
  3. Select your file — upload begins automatically
  4. The document status progresses: pendingprocessingready
During processing, CrewForm:
  • Extracts text content from the file
  • Splits into chunks (optimized for retrieval quality)
  • Generates vector embeddings using OpenAI’s text-embedding-3-small model (1536 dimensions)
  • Builds full-text search vectors (tsvector) for hybrid retrieval
  • Stores chunks with embeddings in pgvector for fast similarity search

Metadata Tags

Organize your documents with tags to improve retrieval precision.

Adding Tags

  1. On the Knowledge Base page, click the tag icon next to any document
  2. Type a tag name and press Enter (e.g., FAQ, Technical, Policy)
  3. Tags are saved immediately

Filtering by Tags

When searching, you can filter results to only include chunks from documents with specific tags:
  • In the Retrieval Tester, select tags from the dropdown
  • Via the API, pass tags: ["FAQ", "Technical"] to the search endpoint
  • In the knowledge_search agent tool, tags are passed via the agent’s configuration
Tags are indexed with GIN for fast filtering even with large document collections.

Search Modes

CrewForm supports two search modes:

Vector Search (Default)

Standard cosine similarity search against chunk embeddings:
  • Embedding model: OpenAI text-embedding-3-small (1536 dimensions)
  • Index type: IVFFlat (lists = 100) for fast approximate nearest-neighbor search
  • Default top-K: 5 results
  • Scope: Workspace-level, optionally filtered by document IDs or tags
Combines vector similarity with PostgreSQL full-text search for better recall:
Final Score = (vector_weight × cosine_similarity) + (text_weight × ts_rank)
  • Default weights: 70% vector / 30% full-text
  • Over-fetch strategy: Retrieves 2× the requested results from each method, then reranks and deduplicates
  • Full-text search: Uses PostgreSQL tsvector with ts_rank_cd for keyword matching
  • Best for: Queries mixing semantic meaning with specific keywords, technical terms, or entity names
Toggle between search modes in the Retrieval Tester or via the API.

Retrieval Tester

The Retrieval Tester is an interactive playground for evaluating search quality before deploying to agents.

How to Use

  1. Navigate to Knowledge Base and open the Test Retrieval panel
  2. Type a query in the search box
  3. Configure:
    • Search Mode — Toggle between vector and hybrid
    • Top-K — Number of results (1–20)
    • Filter by Document — Restrict to specific documents
    • Filter by Tags — Restrict to documents with specific tags
  4. Click Search to see results

Reading Results

Each result displays:
  • Similarity Score — Color-coded bar (green = high, yellow = medium, red = low)
  • Source Document — Which document the chunk came from
  • Chunk Preview — The matched text content
  • Response Time — How long the search took
Use the tester to:
  • Verify that the right documents surface for expected queries
  • Compare vector vs hybrid search quality
  • Tune top-K and tag filters before enabling on agents

Enabling Knowledge Search on Agents

  1. Open the agent’s configuration
  2. In the Tools section, enable knowledge_search
  3. Optionally restrict to specific documents via Knowledge Base IDs in the agent config
  4. Save — the agent can now search your documents during task execution

How Agents Use It

When an agent has knowledge_search enabled, it can call:
knowledge_search(query: "What is our refund policy?")
This returns the top-K most semantically similar chunks from your uploaded documents, which the agent uses as context for its response.

API Endpoint

You can query the knowledge base directly without creating agent tasks:
POST /kb/search
Authorization: Bearer <supabase-jwt>

{
  "workspace_id": "your-workspace-id",
  "query": "What is our refund policy?",
  "mode": "hybrid",
  "top_k": 5,
  "tags": ["FAQ"],
  "document_ids": []
}
Response:
{
  "results": [
    {
      "id": "chunk-uuid",
      "content": "Our refund policy allows returns within 30 days...",
      "similarity": 0.89,
      "document_name": "refund-policy.md"
    }
  ],
  "mode": "hybrid",
  "response_time_ms": 42
}

Managing Documents

From the Knowledge Base page you can:
  • View — See all uploaded documents with status, file size, and chunk count
  • Tag — Add metadata tags for filtering
  • Delete — Remove a document and all its chunks (cascading delete)
  • Monitor — Real-time status updates during processing
  • Test — Use the Retrieval Tester to evaluate search quality

Database

The Knowledge Base uses two tables:
TableDescription
knowledge_documentsUploaded file metadata (name, size, status, chunk count, tags)
knowledge_chunksEmbedded text chunks with 1536-dim vectors and tsvector for full-text search
Both tables have workspace-scoped RLS. Two RPC functions handle search:
  • match_knowledge_chunks — Vector-only cosine similarity search
  • hybrid_search_knowledge — Combined vector + full-text search with reranking

Tier Limits

PlanMax Documents
Free3
Pro25
Team+Unlimited