Pinecone Setup Guide for n8n
Alternative vector DB setup.
Pinecone offers serverless vector storage with sub-50ms query latency at scale. For law firms and professional services managing 10,000+ documents, it avoids the operational work of self-hosted solutions like Qdrant or Chroma, which matters when you need zero infrastructure overhead.
This guide walks you through connecting Pinecone to n8n for production knowledge base systems. You'll configure authentication, structure your index for legal/financial documents, and build a working Q&A retrieval workflow.
What You Need Before Starting
Pinecone Account (Free Tier Works)
Sign up at pinecone.io. The free tier includes 1 index with 100K vectors and 1GB storage, which is sufficient for testing with 5,000-10,000 document chunks.
n8n Instance (Cloud or Self-Hosted)
The cloud version at n8n.io works immediately. Self-hosted requires Docker or npm installation. Version 0.220.0+ is required for native Pinecone nodes.
OpenAI API Key
You'll need this to generate embeddings. The text-embedding-ada-002 model used in this guide costs $0.0001 per 1K tokens. Budget $5-10 for initial testing with 1,000 documents.
Step 1: Create and Configure Your Pinecone Index
1. Log into Pinecone and click "Create Index"
2. Configure index settings:
- Index Name: firm-knowledge-base (use lowercase, hyphens only)
- Dimensions: 1536 (matches OpenAI text-embedding-ada-002 output)
- Metric: cosine (standard for semantic search)
- Pod Type: s1.x1 for starter (handles 100K vectors)
- Replicas: 1 (increase to 2-3 for production high availability)
3. Click "Create Index" and wait 60-90 seconds for provisioning
4. Copy your credentials from the dashboard:
- API Key (starts with pcsk_)
- Environment (format: us-east1-gcp or similar)
- Index Host URL (format: firm-knowledge-base-abc123.svc.us-east1-gcp.pinecone.io)
Store these in your password manager. You'll need them for n8n authentication.
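Before creating the index in the dashboard, you may want to sanity-check the settings you plan to enter. The following is a minimal sketch, not part of Pinecone's tooling: validateIndexConfig is a hypothetical helper, the name rule follows the "lowercase, hyphens only" guidance above (digits are assumed to be allowed as well), and the metric list assumes the three metrics Pinecone offers.

```javascript
// Hypothetical helper: returns a list of problems with the planned index settings.
// An empty list means the settings pass these basic checks.
function validateIndexConfig({ name, dimension, metric }) {
  const errors = [];

  // Lowercase letters, digits, and single hyphens between segments.
  if (typeof name !== 'string' || !/^[a-z0-9]+(-[a-z0-9]+)*$/.test(name)) {
    errors.push('name must use lowercase letters, digits, and hyphens only');
  }

  // The dimension must match the embedding model (1536 for text-embedding-ada-002).
  if (!Number.isInteger(dimension) || dimension <= 0) {
    errors.push('dimension must be a positive integer');
  }

  if (!['cosine', 'euclidean', 'dotproduct'].includes(metric)) {
    errors.push('metric must be cosine, euclidean, or dotproduct');
  }

  return errors;
}

const problems = validateIndexConfig({
  name: 'firm-knowledge-base',
  dimension: 1536,
  metric: 'cosine',
});
console.log(problems.length === 0 ? 'settings look valid' : problems.join('; '));
```

Catching a dimension mismatch here is cheaper than discovering it later, since an index's dimension cannot be changed after creation.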
Step 2: Connect Pinecone to n8n
1. Open n8n and create a new workflow
2. Add a Pinecone node to the canvas
Search for "Pinecone Vector Store" in the node panel. If it is missing, update n8n to version 0.220.0+.
3. Click "Create New Credential" in the node settings
4. Enter your Pinecone credentials:
- API Key: Paste your pcsk_ key
- Environment: Enter your region (example: us-east1-gcp)
5. Test the connection
Click "Test Credential". You should see "Connection successful". If it fails, verify your API key hasn't expired and your IP isn't blocked by Pinecone's firewall.
6. Save the credential as "Pinecone Production"
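If "Test Credential" fails and you want to check the key outside n8n, one option is to call the index directly. The sketch below only builds the request. It assumes Pinecone's describe_index_stats REST endpoint on your Index Host URL, with the key passed in an Api-Key header, so confirm both against the current Pinecone API reference. buildStatsRequest is a hypothetical helper name.

```javascript
// Hypothetical helper: builds a request that asks the index for its stats.
// Assumption: POST https://<index host>/describe_index_stats with an Api-Key header.
function buildStatsRequest(indexHost, apiKey) {
  // Accept the host with or without a scheme or trailing slash.
  const host = indexHost.replace(/^https?:\/\//, '').replace(/\/+$/, '');
  return {
    url: `https://${host}/describe_index_stats`,
    options: {
      method: 'POST',
      headers: {
        'Api-Key': apiKey,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({}),
    },
  };
}

const request = buildStatsRequest(
  'firm-knowledge-base-abc123.svc.us-east1-gcp.pinecone.io', // example host format from Step 1
  'pcsk_your_key_here' // placeholder, not a real key
);
console.log(request.url);
```

On Node 18+ you can pass the result to fetch(request.url, request.options). A 200 response suggests the key and host are valid; a 401 or 403 points to the key.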
Step 3: Structure Your Document Ingestion Pipeline
This workflow converts PDFs/Word docs into searchable vectors. Use this pattern for client files, case law, or internal knowledge bases.
1. Add an "HTTP Request" node (or "Google Drive" node for cloud files)
Configure it to fetch your source documents. Example for a file served over HTTPS:
- Method: GET
- URL: https://yourdomain.com/documents/client-agreement.pdf
- Response Format: Binary
2. Add an "Extract from File" node
Connect it after HTTP Request:
- Operation: Extract Text
- Binary Property: data
- Output Format: Plain Text
3. Add a "Code" node to chunk the text
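Large documents must be split into chunks of roughly 500-1000 tokens. The function below splits on whitespace, so its sizes are counted in words rather than tokens. Set the Code node to "Run Once for All Items" and paste this function:
// Sizes are in words because the splitter works on whitespace.
const chunkSize = 800; // words per chunk (roughly 1,000 tokens of English text)
const overlap = 100; // words shared between neighboring chunks; prevents context loss at boundaries

// Split text into overlapping word-based chunks.
function chunkText(text, fileName, size, overlap) {
  const words = (text || '').split(/\s+/).filter(Boolean); // drop empty strings
  const step = size - overlap;
  const chunks = [];
  for (let i = 0; i < words.length; i += step) {
    chunks.push({
      text: words.slice(i, i + size).join(' '),
      chunkIndex: Math.floor(i / step),
      sourceFile: fileName
    });
  }
  return chunks;
}

// Emit one n8n item per chunk, for every incoming document.
return $input.all().flatMap(item =>
  chunkText(item.json.text, item.json.fileName, chunkSize, overlap)
    .map(chunk => ({ json: chunk }))
);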
4. Add an "OpenAI" node for embeddings
- Resource: Embeddings
- Model: text-embedding-ada-002
- Input: {{ $json.text }}
This converts each text chunk into a 1536-dimension vector.
5. Add the "Pinecone Vector Store" node
- Operation: Insert
- Index Name: firm-knowledge-base
- Vector: {{ $json.embedding }} (from OpenAI node)
- ID: {{ $json.sourceFile }}-chunk-{{ $json.chunkIndex }}
- Metadata: Add these fields:
  - text: {{ $json.text }}
  - source: {{ $json.sourceFile }}
  - chunkIndex: {{ $json.chunkIndex }}
  - uploadDate: {{ $now.toISO() }}
6. Execute the workflow
Start with 5-10 test documents. Monitor the execution panel for errors. Each document should produce 10-50 chunks depending on length.
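To predict how many chunks (and therefore vectors) a document will produce, you can estimate from its word count. This sketch assumes the word-based chunker from step 3 with its settings of 800 words per chunk and 100 words of overlap; estimateChunks is a hypothetical helper, not an n8n or Pinecone function.

```javascript
// Hypothetical helper: estimates chunks for a document of a given word count.
// The chunker advances by (size - overlap) words per chunk, so the count is
// the number of steps needed to cover all words.
function estimateChunks(wordCount, size = 800, overlap = 100) {
  if (overlap >= size) {
    throw new Error('overlap must be smaller than chunk size');
  }
  return Math.ceil(wordCount / (size - overlap));
}

// Example: a short agreement versus a long brief (word counts are illustrative).
for (const words of [2500, 7000, 35000]) {
  console.log(`${words} words -> ${estimateChunks(words)} chunks`);
}
```

Multiplying the per-document estimate by your document count gives a rough vector total to compare against the free tier's 100K-vector limit before you run a bulk import.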
Step 4: Build the Q&A Retrieval Workflow
This workflow takes a user question and returns the 3 most relevant document excerpts.
1. Create a new workflow with a "Webhook" trigger
- Method: POST
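- Path: knowledge-base-query
- Response Mode: Using 'Respond to Webhook' Node (required because step 5 returns the response from a "Respond to Webhook" node)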
2. Add an "OpenAI" node to embed the question
- Resource: Embeddings
- Model: text-embedding-ada-002
- Input: {{ $json.body.question }}
3. Add a "Pinecone Vector Store" node for search
- Operation: Query
- Index Name: firm-knowledge-base
- Query Vector: {{ $json.embedding }}
- Top K: 3 (returns 3 best matches)
- Include Metadata: true
4. Add a "Code" node to format results
// Run in "Run Once for All Items" mode; the Pinecone query returns a single item.
const matches = $input.first().json.matches || [];

// Shape each match into a ranked, readable result.
const formattedResults = matches.map((match, index) => ({
  rank: index + 1,
  relevanceScore: match.score.toFixed(3),
  excerpt: match.metadata.text,
  source: match.metadata.source,
  chunkIndex: match.metadata.chunkIndex
}));

return [{ json: { results: formattedResults } }];
5. Add a "Respond to Webhook" node
- Response Body: {{ $json }}
6. Test with a sample question
Send a POST request to your webhook URL:
{
  "question": "What are the termination clauses in client agreements?"
}
You should receive 3 ranked excerpts with relevance scores above 0.75 for good matches.
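The relevance score comes from the cosine metric chosen in Step 1. As a rough illustration of what that number measures (this is not Pinecone's internal implementation), cosine similarity between two vectors can be computed as below. cosineSimilarity is an illustrative helper, and the short vectors stand in for real 1536-dimension embeddings.

```javascript
// Illustrative helper: cosine similarity = dot(a, b) / (|a| * |b|).
// 1 means the vectors point in the same direction; 0 means they are unrelated.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // a zero vector has no direction to compare
  }
  return dot / Math.sqrt(normA * normB);
}

console.log(cosineSimilarity([1, 0], [1, 0])); // same direction
console.log(cosineSimilarity([1, 0], [0, 1])); // orthogonal
```

Because the score depends only on direction, not vector length, a question and a passage that use similar language score high even when the passage is much longer.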
Step 5: Add GPT-Powered Answer Generation
Raw excerpts are useful, but a synthesized answer improves user experience.
1. Insert an "OpenAI" node after the Code node that formats the Pinecone results
- Resource: Chat
- Model: gpt-4o-mini (faster and cheaper than GPT-4)
- Messages: System + User message
System Message:
You are a legal knowledge assistant for [Firm Name]. Answer questions using only the provided document excerpts. If the excerpts don't contain the answer, say "I don't have enough information in the knowledge base to answer that."
Cite sources using this format: [Source: filename.pdf, Section X]
User Message:
Question: {{ $('Webhook').item.json.body.question }}
Relevant excerpts:
{{ $json.results.map(r => `[${r.rank}] ${r.excerpt} (Source: ${r.source})`).join('\n\n') }}
Provide a clear, concise answer with source citations.
2. Update the "Respond to Webhook" node
Return both the GPT answer and the raw excerpts:
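{
  "answer": {{ JSON.stringify($json.choices[0].message.content) }},
  "sources": {{ JSON.stringify($('Code').item.json.results) }}
}
JSON.stringify keeps quotes and line breaks in the answer from breaking the response JSON, and returns the sources as an array rather than a string.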
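If you want to test the prompt assembly outside n8n, the User Message expression can be reproduced in plain JavaScript. This is a sketch under the assumption that results has the shape produced by the formatting Code node in Step 4 (rank, excerpt, source); buildUserMessage is a hypothetical helper and the sample data is invented for illustration.

```javascript
// Hypothetical helper: mirrors the User Message template from Step 5.
function buildUserMessage(question, results) {
  const excerpts = results
    .map(r => `[${r.rank}] ${r.excerpt} (Source: ${r.source})`)
    .join('\n\n');

  return [
    `Question: ${question}`,
    'Relevant excerpts:',
    excerpts,
    'Provide a clear, concise answer with source citations.',
  ].join('\n');
}

// Invented sample data, only to show the output shape.
const sample = [
  { rank: 1, excerpt: 'Either party may terminate with 30 days notice.', source: 'client-agreement.pdf' },
  { rank: 2, excerpt: 'Termination for cause takes effect immediately.', source: 'client-agreement.pdf' },
];
console.log(buildUserMessage('What are the termination clauses?', sample));
```

Running this against a few saved query results lets you review exactly what the model will see before spending tokens on chat completions.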
Performance Optimization for Production
Batch Upserts for Large Document Sets
Instead of inserting vectors one at a time, batch them in groups of 100:
// Build one Pinecone record per embedded chunk.
const vectors = $input.all().map(item => ({
  id: `${item.json.sourceFile}-chunk-${item.json.chunkIndex}`, // same ID format as the Insert step
  values: item.json.embedding,
  metadata: {
    text: item.json.text,
    source: item.json.sourceFile,
    chunkIndex: item.json.chunkIndex
  }
}));

// Split into batches of 100
const batchSize = 100;
const batches = [];
for (let i = 0; i < vectors.length; i += batchSize) {
  batches.push(vectors.slice(i, i + batchSize));
}

// Emit one n8n item per batch.
return batches.map(batch => ({ json: { vectors: batch } }));
Set the Pinecone node to "Upsert" operation and pass {{ $json.vectors }}.
Namespace Strategy for Multi-Client Firms
Use namespaces to isolate client data:
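- Namespace: client-{{ $json.clientId }}
A query scoped to one namespace only searches vectors in that namespace, which helps prevent cross-client data leakage. Namespaces do not authenticate callers on their own, so enforce who may query each namespace in your workflow or application layer.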
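One risk with per-client namespaces is a missing or malformed client ID silently sending data to the wrong place. A minimal guard, assuming client IDs are made of letters, digits, hyphens, and underscores (adjust the pattern to your own ID scheme), might look like this. namespaceFor is a hypothetical helper.

```javascript
// Hypothetical helper: builds the namespace name and refuses unusable IDs.
// Failing loudly is safer than writing one client's documents into a shared
// or default namespace by accident.
function namespaceFor(clientId) {
  const id = String(clientId ?? '').trim();
  if (!/^[A-Za-z0-9_-]+$/.test(id)) {
    throw new Error(`invalid clientId: ${JSON.stringify(clientId)}`);
  }
  return `client-${id}`;
}

console.log(namespaceFor('acme-42'));
```

In n8n you could run the same check in a Code node before the Pinecone node, so that a bad ID stops the execution instead of reaching the index.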
Metadata Filtering for Precise Searches
Add filters to the Pinecone query node:
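{
  "filter": {
    "source": { "$eq": "employment-agreements" },
    "uploadDate": { "$gte": 1704067200 }
  }
}
This restricts searches to specific sources or date ranges. The source value must match the stored metadata exactly. Pinecone's range operators ($gt, $gte, $lt, $lte) compare numbers only, so the date filter works only if uploadDate is stored as a number, such as a Unix timestamp in seconds (1704067200 is 2024-01-01 UTC), rather than the ISO string produced by $now.toISO().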
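Filters like this can be assembled in a Code node rather than typed by hand. The sketch below assumes uploadDate is stored as a Unix timestamp in seconds, because Pinecone's range operators compare numbers rather than date strings. buildFilter and its parameter names are hypothetical.

```javascript
// Hypothetical helper: builds a Pinecone metadata filter object.
// Both parameters are optional; omitted ones are left out of the filter.
function buildFilter({ source, uploadedSince } = {}) {
  const filter = {};

  if (source) {
    filter.source = { $eq: source };
  }

  if (uploadedSince) {
    // Date-only ISO strings such as '2024-01-01' are parsed as UTC midnight.
    const millis = Date.parse(uploadedSince);
    if (Number.isNaN(millis)) {
      throw new Error(`unparseable date: ${uploadedSince}`);
    }
    filter.uploadDate = { $gte: Math.floor(millis / 1000) };
  }

  return filter;
}

console.log(JSON.stringify(buildFilter({
  source: 'employment-agreements',
  uploadedSince: '2024-01-01',
})));
```

Keeping the date conversion in one place means the ingestion and query workflows agree on the unit, which is the usual source of filters that silently match nothing.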
Troubleshooting Common Issues
"Index not found" error
Verify the index name matches exactly (case-sensitive). Check the Pinecone dashboard to confirm the index exists and is active.
Low relevance scores (below 0.6)
Your embeddings may not match your query style. Try rephrasing questions to match document language, or fine-tune your chunking strategy to preserve more context.
Rate limit errors during bulk uploads
The free tier limits you to 100 requests/minute. Add a "Wait" node with a 1-second delay between batches, or upgrade to a paid plan.
Missing metadata in query results
Ensure "Include Metadata" is set to true in the Query operation. Metadata isn't returned by default.
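For the rate limit case, a little arithmetic shows how much delay is actually needed and how long a bulk upload will take. This sketch takes the 100 requests/minute figure quoted above as given (check your plan's current limit); minDelayMs and uploadWaitSeconds are hypothetical helpers.

```javascript
// Hypothetical helper: smallest delay between requests that stays within a
// per-minute request limit.
function minDelayMs(requestsPerMinute) {
  if (requestsPerMinute <= 0) {
    throw new Error('requestsPerMinute must be positive');
  }
  return Math.ceil(60000 / requestsPerMinute);
}

// Hypothetical helper: total time spent waiting between batches, in seconds.
// There is one wait between each pair of consecutive batches.
function uploadWaitSeconds(batchCount, delayMs) {
  return Math.max(0, batchCount - 1) * delayMs / 1000;
}

const delay = minDelayMs(100);
console.log(`minimum delay: ${delay} ms`);
console.log(`wait for 400 batches at 1000 ms: ${uploadWaitSeconds(400, 1000)} s`);
```

Under that limit the minimum spacing is 600 ms, so the 1-second Wait node leaves some headroom for retries and other workflows sharing the same key.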
Cost Estimation for Production Use
10,000 documents (average 5 pages each):
- Embedding cost: ~$15 (one-time)
- Pinecone storage: Free tier sufficient
- Query cost: $0.0001 per query (negligible)
Monthly operating cost: $0-5 for most small-to-midsize firms.
Upgrade to paid Pinecone ($70/month) when you exceed 100K vectors or need multiple indexes.
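The embedding figure depends heavily on how much text each page holds, so it is worth recomputing for your own corpus. This sketch uses the $0.0001 per 1K tokens price quoted in this guide; tokensPerPage is an assumption you should measure on a sample of your documents, and estimateEmbeddingCost is a hypothetical helper.

```javascript
// Hypothetical helper: one-time embedding cost in dollars.
function estimateEmbeddingCost({
  documents,
  pagesPerDocument,
  tokensPerPage, // assumption: measure this on a sample of real documents
  pricePer1kTokens = 0.0001, // price quoted in this guide
}) {
  const totalTokens = documents * pagesPerDocument * tokensPerPage;
  return (totalTokens / 1000) * pricePer1kTokens;
}

for (const tokensPerPage of [500, 3000]) {
  const cost = estimateEmbeddingCost({
    documents: 10000,
    pagesPerDocument: 5,
    tokensPerPage,
  });
  console.log(`${tokensPerPage} tokens/page -> $${cost.toFixed(2)}`);
}
```

Under this formula the ~$15 estimate above corresponds to roughly 3,000 tokens per page, and sparser pages scale the cost down linearly. Chunk overlap re-embeds some text, so expect the real bill to run somewhat higher than the raw token count suggests.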
Related Resources
Document Audit Worksheet
Template for cataloging, versioning, and consolidating your firm's reference materials before indexing.
Knowledge Base Q&A Prompt Library
Prompts for answering questions with citations, handling unanswerable questions, flagging gaps.
Neo4j Knowledge Graph Setup Guide (Optional Advanced)
For firms wanting relational understanding on top of vector search.