Play 11: Knowledge Base Q&A

Pinecone Setup Guide for n8n

Alternative vector DB setup.


Pinecone offers serverless vector storage with sub-50ms query latency at scale. For law firms and professional services managing 10,000+ documents, it's a stronger fit than self-hosted options like Qdrant or Chroma when you want zero infrastructure overhead.

This guide walks you through connecting Pinecone to n8n for production knowledge base systems. You'll configure authentication, structure your index for legal/financial documents, and build a working Q&A retrieval workflow.

What You Need Before Starting

Pinecone Account (Free Tier Works) Sign up at pinecone.io. Free tier includes 1 index with 100K vectors and 1GB storage. Sufficient for testing with 5,000-10,000 document chunks.

n8n Instance (Cloud or Self-Hosted) Cloud version at n8n.io works immediately. Self-hosted requires Docker or npm installation. Version 0.220.0+ required for native Pinecone nodes.

OpenAI API Key You'll need this to generate embeddings. The text-embedding-ada-002 embedding model costs $0.0001 per 1K tokens. Budget $5-10 for initial testing with 1,000 documents.

Step 1: Create and Configure Your Pinecone Index

1. Log into Pinecone and click "Create Index"

2. Configure index settings:

  • Index Name: firm-knowledge-base (use lowercase, hyphens only)
  • Dimensions: 1536 (matches OpenAI text-embedding-ada-002 output)
  • Metric: cosine (standard for semantic search)
  • Pod Type: s1.x1 for starter (handles 100K vectors)
  • Replicas: 1 (increase to 2-3 for production high availability)

3. Click "Create Index" and wait 60-90 seconds for provisioning

4. Copy your credentials from the dashboard:

  • API Key (starts with pcsk_)
  • Environment (format: us-east1-gcp or similar)
  • Index Host URL (format: firm-knowledge-base-abc123.svc.us-east1-gcp.pinecone.io)

Store these in your password manager. You'll need them for n8n authentication.
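
If you'd rather script Step 1 than click through the dashboard, here's a minimal sketch using the official Pinecone Node.js client (@pinecone-database/pinecone). The values mirror the settings above; the environment is illustrative, so substitute your own:

import { Pinecone } from '@pinecone-database/pinecone';

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });

await pc.createIndex({
  name: 'firm-knowledge-base',
  dimension: 1536, // matches text-embedding-ada-002 output
  metric: 'cosine',
  spec: {
    pod: {
      environment: 'us-east1-gcp', // replace with your environment
      podType: 's1.x1',
      pods: 1,
      replicas: 1,
    },
  },
});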

Step 2: Connect Pinecone to n8n

1. Open n8n and create a new workflow

2. Add a Pinecone node to the canvas. Search for "Pinecone Vector Store" in the node panel. If it's missing, update n8n to version 0.220.0+.

3. Click "Create New Credential" in the node settings

4. Enter your Pinecone credentials:

  • API Key: Paste your pcsk_ key
  • Environment: Enter your region (example: us-east1-gcp)

5. Test the connection. Click "Test Credential". You should see "Connection successful". If it fails, verify that your API key hasn't expired and that your IP isn't blocked by Pinecone's firewall.

6. Save the credential as "Pinecone Production"

Step 3: Structure Your Document Ingestion Pipeline

This workflow converts PDFs/Word docs into searchable vectors. Use this pattern for client files, case law, or internal knowledge bases.

1. Add an "HTTP Request" node (or "Google Drive" node for cloud files)

Configure it to fetch your source documents. Example for a hosted PDF:

  • Method: GET
  • URL: https://yourdomain.com/documents/client-agreement.pdf
  • Response Format: Binary

2. Add a "Extract from File" node

Connect it after HTTP Request:

  • Operation: Extract Text
  • Binary Property: data
  • Output Format: Plain Text

3. Add a "Code" node to chunk the text

Large documents must be split into 500-1000 token chunks. Paste this function:

// Run this Code node in "Run Once for All Items" mode so it can
// emit one output item per chunk.
const chunkSize = 600; // words per chunk (roughly 800 tokens)
const overlap = 100;   // words shared between chunks to prevent context loss at boundaries

function chunkText(text, fileName, size, overlap) {
  const words = text.split(/\s+/);
  const chunks = [];

  for (let i = 0; i < words.length; i += size - overlap) {
    chunks.push({
      text: words.slice(i, i + size).join(' '),
      chunkIndex: Math.floor(i / (size - overlap)),
      sourceFile: fileName,
    });
  }

  return chunks;
}

// Emit one n8n item per chunk, across all input documents
return $input.all().flatMap((item) =>
  chunkText(item.json.text, item.json.fileName, chunkSize, overlap)
    .map((chunk) => ({ json: chunk }))
);

4. Add an "OpenAI" node for embeddings

  • Resource: Embeddings
  • Model: text-embedding-ada-002
  • Input: {{ $json.text }}

This converts each text chunk into a 1536-dimension vector.
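
Under the hood this is the standard OpenAI embeddings call. A minimal sketch in plain Node 18+ (you don't need this inside n8n; it's here only to make the data flow concrete, and OPENAI_API_KEY is assumed to be set in the environment):

async function embed(text) {
  const res = await fetch('https://api.openai.com/v1/embeddings', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ model: 'text-embedding-ada-002', input: text }),
  });
  const { data } = await res.json();
  return data[0].embedding; // array of 1,536 floats
}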

5. Add the "Pinecone Vector Store" node

  • Operation: Insert
  • Index Name: firm-knowledge-base
  • Vector: {{ $json.embedding }} (from OpenAI node)
  • ID: {{ $json.sourceFile }}-chunk-{{ $json.chunkIndex }}
  • Metadata: Add these fields:
    • text: {{ $json.text }}
    • source: {{ $json.sourceFile }}
    • chunkIndex: {{ $json.chunkIndex }}
    • uploadDate: {{ $now.toSeconds() }} (a Unix timestamp; Pinecone's range filters, used later, don't work on ISO strings)
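
For reference, each chunk insert in step 5 boils down to a single upsert call. A minimal sketch with the Pinecone Node.js client (the chunk object follows the shape produced by the chunking code above; all values are illustrative):

import { Pinecone } from '@pinecone-database/pinecone';

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });
const index = pc.index('firm-knowledge-base');

// chunk: { text, chunkIndex, sourceFile }; embedding: 1536-float array
async function insertChunk(chunk, embedding) {
  await index.upsert([
    {
      id: `${chunk.sourceFile}-chunk-${chunk.chunkIndex}`,
      values: embedding,
      metadata: {
        text: chunk.text,
        source: chunk.sourceFile,
        chunkIndex: chunk.chunkIndex,
        uploadDate: Math.floor(Date.now() / 1000), // Unix timestamp
      },
    },
  ]);
}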

6. Execute the workflow

Start with 5-10 test documents. Monitor the execution panel for errors. Each document should produce 10-50 chunks depending on length.

Step 4: Build the Q&A Retrieval Workflow

This workflow takes a user question and returns the 3 most relevant document excerpts.

1. Create a new workflow with a "Webhook" trigger

  • Method: POST
  • Path: knowledge-base-query
  • Response Mode: Last Node

2. Add an "OpenAI" node to embed the question

  • Resource: Embeddings
  • Model: text-embedding-ada-002
  • Input: {{ $json.body.question }}

3. Add a "Pinecone Vector Store" node for search

  • Operation: Query
  • Index Name: firm-knowledge-base
  • Query Vector: {{ $json.embedding }}
  • Top K: 3 (returns 3 best matches)
  • Include Metadata: true
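
For reference, the node's Query operation maps to a single client call. A minimal sketch, assuming the same @pinecone-database/pinecone client as in the earlier sketches:

import { Pinecone } from '@pinecone-database/pinecone';

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });
const index = pc.index('firm-knowledge-base');

async function search(questionEmbedding) {
  const { matches } = await index.query({
    vector: questionEmbedding, // 1536-float embedding of the question
    topK: 3,
    includeMetadata: true,
  });
  return matches; // each match: { id, score, metadata }
}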

4. Add a "Code" node to format results

const matches = $input.item.json.matches;

const formattedResults = matches.map((match, index) => ({
  rank: index + 1,
  relevanceScore: match.score.toFixed(3),
  excerpt: match.metadata.text,
  source: match.metadata.source,
  chunkIndex: match.metadata.chunkIndex
}));

return [{ json: { results: formattedResults } }];

5. Add a "Respond to Webhook

" node

  • Response Body: {{ $json }}

6. Test with a sample question

Send a POST request to your webhook URL:

{
  "question": "What are the termination clauses in client agreements?"
}
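
If you don't have a REST client handy, a quick Node 18+ script works too (the host is a placeholder; use the URL n8n shows on the Webhook node):

const res = await fetch('https://your-n8n-host/webhook/knowledge-base-query', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    question: 'What are the termination clauses in client agreements?',
  }),
});
console.log(JSON.stringify(await res.json(), null, 2));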

You should receive 3 ranked excerpts with relevance scores above 0.75 for good matches.

Step 5: Add GPT-Powered Answer Generation

Raw excerpts are useful, but a synthesized answer improves user experience.

1. Insert an "OpenAI" node after the Pinecone query

  • Resource: Chat
  • Model: gpt-4o-mini (faster and cheaper than GPT-4)
  • Messages: System + User message

System Message:

You are a legal knowledge assistant for [Firm Name]. Answer questions using only the provided document excerpts. If the excerpts don't contain the answer, say "I don't have enough information in the knowledge base to answer that."

Cite sources using this format: [Source: filename.pdf, Section X]

User Message:

Question: {{ $('Webhook').item.json.body.question }}

Relevant excerpts:
{{ $json.results.map(r => `[${r.rank}] ${r.excerpt} (Source: ${r.source})`).join('\n\n') }}

Provide a clear, concise answer with source citations.
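
For reference, the node issues a standard chat completions request. A minimal sketch in plain Node 18+ (prompt strings abbreviated; OPENAI_API_KEY assumed to be set):

async function answerQuestion(question, excerpts) {
  const res = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'gpt-4o-mini',
      messages: [
        { role: 'system', content: 'You are a legal knowledge assistant...' },
        { role: 'user', content: `Question: ${question}\n\nRelevant excerpts:\n${excerpts}` },
      ],
    }),
  });
  const json = await res.json();
  return json.choices[0].message.content;
}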

2. Update the "Respond to Webhook" node

Return both the GPT answer and the raw excerpts. Leave sources unquoted and wrap it in JSON.stringify so the array renders as JSON rather than a string:

{
  "answer": "{{ $json.choices[0].message.content }}",
  "sources": {{ JSON.stringify($('Code').item.json.results) }}
}

Performance Optimization for Production

Batch Upserts for Large Document Sets

Instead of inserting vectors one at a time, batch them in groups of 100:

// Run in "Run Once for All Items" mode so every chunk is visible at once
const vectors = $input.all().map((item) => ({
  id: `${item.json.sourceFile}-${item.json.chunkIndex}`,
  values: item.json.embedding,
  metadata: {
    text: item.json.text,
    source: item.json.sourceFile,
    chunkIndex: item.json.chunkIndex,
  },
}));

// Split into batches of 100 vectors per upsert request
const batches = [];
for (let i = 0; i < vectors.length; i += 100) {
  batches.push(vectors.slice(i, i + 100));
}

// One output item per batch
return batches.map((batch) => ({ json: { vectors: batch } }));

Set the Pinecone node to "Upsert" operation and pass {{ $json.vectors }}.

Namespace Strategy for Multi-Client Firms

Use namespaces to isolate client data:

  • Namespace: client-{{ $json.clientId }}

This prevents cross-client data leakage and enables per-client access controls.
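
In the Node.js client, the same isolation looks like this (client-acme-corp is a hypothetical client ID):

import { Pinecone } from '@pinecone-database/pinecone';

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });
const index = pc.index('firm-knowledge-base');

// All reads and writes through this handle stay inside one namespace
const clientIndex = index.namespace('client-acme-corp');

// e.g. await clientIndex.upsert([...]) or
// await clientIndex.query({ vector, topK: 3, includeMetadata: true })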

Metadata Filtering for Precise Searches

Add filters to the Pinecone query node. Note that Pinecone's range operators ($gte, $lte) only compare numbers, which is why uploadDate is stored as a Unix timestamp at ingestion:

{
  "filter": {
    "source": { "$eq": "employment-agreements" },
    "uploadDate": { "$gte": 1704067200 }
  }
}

This restricts searches to specific document types or date ranges (1704067200 is 2024-01-01 UTC).

Troubleshooting Common Issues

"Index not found" error Verify the index name matches exactly (case-sensitive). Check the Pinecone dashboard to confirm the index exists and is active.

Low relevance scores (below 0.6) Your embeddings may not match your query style. Try rephrasing questions to match document language, or fine-tune your chunking strategy to preserve more context.

Rate limit errors during bulk uploads The free tier caps you at 100 requests per minute. Add a "Wait" node with a 1-second delay between batches, or upgrade to a paid plan.
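
If you script bulk uploads directly instead of using a Wait node, the same throttling looks like this (index and batches follow the earlier sketches):

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function uploadWithThrottle(index, batches) {
  for (const batch of batches) {
    await index.upsert(batch); // each batch holds up to 100 vectors
    await sleep(1000);         // stay under 100 requests/minute
  }
}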

Missing metadata in query results Ensure "Include Metadata" is set to true in the Query operation. Metadata isn't returned by default.

Cost Estimation for Production Use

10,000 documents (average 5 pages each):

  • Embedding cost: ~$15 (one-time)
  • Pinecone storage: Free tier sufficient
  • Query cost: $0.0001 per query (negligible)

Monthly operating cost: $0-5 for most small-to-midsize firms.

Upgrade to paid Pinecone ($70/month) when you exceed 100K vectors or need multiple indexes.


Reviewed by Revenue Institute

This guide is actively maintained and reviewed by the implementation experts at Revenue Institute. As the creators of The AI Workforce Playbook, we test and deploy these exact frameworks for professional services firms scaling without new headcount.


Need help turning this guide into reality? Revenue Institute builds and implements the AI workforce for professional services firms.

RevenueInstitute.com