Play 11Advanced~20 min read

Knowledge Base and Internal Q&A

Build an internal Q&A bot that answers questions from your firm's actual policy documents in seconds, with citations.

Key Takeaways

What it does: Once a clean document library is organized and indexed, n8n connects it to a internal knowledge portal. When a team member asks a question in the designated channel, the bot runs a semantic search against the indexed library, retrieves the most relevant document sections, and passes them to an AI with instructions to synthesize a clear answer with explicit citations. The answer includes the specific source documents and sections it drew from. If the question isn't covered, the system says so - it doesn't hallucinate an answer - and flags the gap for the document owner to address.
Why it matters: If senior partners field 5 knowledge questions per day that this system could handle, and the average interruption costs 10 minutes of context-switching time, that's 50 minutes per day of high-value time returned per partner. Across 5 partners at $300/hour, that's approximately $300,000 annually in recovered capacity. The error rate reduction matters too: answers drawn from authoritative, cited documents are more reliable than answers given from memory.
Difficulty: Advanced - typical build in ~20 min read.

The business case

At some point in every professional services firm, institutional knowledge becomes a structural liability. The things that live in senior partners' heads - how to handle a specific regulatory edge case, what the firm's position is on a nuanced conflict of interest situation, which version of the methodology document is actually current - can't be accessed by anyone else without finding that partner and interrupting them. Junior staff spend real time hunting for answers that exist somewhere in a document they can't locate. Senior people field the same questions repeatedly. Onboarding takes longer than it should because the knowledge new team members need is distributed across inboxes, shared drives, and people's memories. This Play builds an internal Q&A system on top of your existing documents that anyone can query through email.

What this play does

Once a clean document library is organized and indexed, n8n connects it to a internal knowledge portal. When a team member asks a question in the designated channel, the bot runs a semantic search against the indexed library, retrieves the most relevant document sections, and passes them to an AI with instructions to synthesize a clear answer with explicit citations. The answer includes the specific source documents and sections it drew from. If the question isn't covered, the system says so - it doesn't hallucinate an answer - and flags the gap for the document owner to address.

Before and after

Before

A junior employee has a question about the firm's policy on billing for out-of-scope work. They search the shared drive and find three documents with different guidance, one from three years ago. They send a email to a senior associate. The senior associate answers from memory, which may or may not reflect the current written policy.

After

The junior employee types the question into the designated channel. Within 15 seconds: "Based on the Engagement Management Policy (updated March 2024, Section 4.2), out-of-scope billing requires partner approval before the work is performed." The associate has the right answer, from the right document, immediately, without interrupting anyone.

Business impact

If senior partners field 5 knowledge questions per day that this system could handle, and the average interruption costs 10 minutes of context-switching time, that's 50 minutes per day of high-value time returned per partner. Across 5 partners at $300/hour, that's approximately $300,000 annually in recovered capacity. The error rate reduction matters too: answers drawn from authoritative, cited documents are more reliable than answers given from memory.

Prerequisites

Complete these before opening n8n. Skipping prerequisites is how you end up rebuilding workflows.

Complete a document audit before building

This Play has a hard prerequisite that most firms want to skip: a document audit. If your policy documents are scattered across three different shared drives, half are out of date, and no one is sure which versions are authoritative, the Q&A system will return answers based on whatever it finds - including the outdated ones. Identify and consolidate your key reference materials before building.

Establish version control and ownership

For every document in the library, identify the authoritative version and the person responsible for keeping it current. Without maintenance ownership, the library goes stale within six months. Assign document owners before building - not after.

Identify the 20 most frequently asked internal questions

Survey junior and mid-level staff. These are your first test cases. Build the library to answer those 20 questions well before launch. If the system can't answer the most common questions from day one, adoption will be low.

Choose a vector database

Pinecone and Supabase's pgvector are both solid options for most firms. Supabase is easier to set up if you don't have existing infrastructure. Pinecone has better scaling characteristics for larger document libraries. The full setup guide for both is at workforceplaybook.ai.

Step-by-step implementation

The steps below are the full build guide. Each step includes configuration notes and exact AI prompts where applicable.

Build and organize the document library

The document library is the foundation. Quality here determines quality everywhere else. Spend 2 - 4 weeks on this before touching n8n. Identify the categories of reference material that generate the most internal questions: engagement management policies, methodology documentation, conflict of interest procedures, billing guidelines, template library, and precedent files for common situations. For each category, identify the single authoritative current version of each document. Archive or delete outdated versions - having multiple versions of a document in the indexed library is worse than having none, because the system might return the outdated version. Format documents consistently. Long PDF documents with complex formatting are harder for the indexing system to process accurately than clean text or Markdown files. For critical policy documents, create a clean text version even if the formatted PDF exists for external use. Build a document registry: a simple spreadsheet listing every document in the library, its category, the date it was last reviewed, and who owns it.

Index the library into a vector database

Install the n8n vector database node for your chosen platform. The indexing workflow reads each document from your library, breaks it into chunks (typically 500 - 1000 word segments with some overlap), generates embeddings for each chunk, and stores the embeddings with metadata (document name, section, date, URL/file path) in the vector database. Run the indexing workflow once initially to index the entire library. Set up a trigger to re-index any document that's been updated - this keeps the knowledge base current as documents are revised. After initial indexing, test the retrieval by querying for your 20 most common questions. For each question, review which document sections were retrieved. Are they the right sections? If not, adjust the chunking strategy or the indexing approach before moving to the answer generation step.

Build the internal knowledge portal workflow

The Q&A workflow triggers when a user submits a question through the internal knowledge portal or sends a message to a dedicated inbox like kb@yourfirm.com. Stand up the portal behind SSO so only employees can query, and route inbound emails into the same n8n workflow that powers the portal. When a question arrives, the workflow: 1. Sends an immediate acknowledgment (typing indicator or "Looking that up...") 2. Generates an embedding for the question using the same embedding model used during indexing 3. Queries the vector database for the most relevant document chunks (top 3 - 5 results) 4. Passes the question and retrieved chunks to the AI for answer synthesis 5. Returns the answer with explicit citations to the source documents For questions where the retrieval doesn't find relevant content (similarity score below threshold), return: "I don't have a clear answer to that in the current knowledge base. I've flagged this as a gap for the document team - in the meantime, [suggested escalation path]."

AI Prompt

You are an internal knowledge assistant for a professional services firm. Your job is to answer questions from team members using only the document excerpts provided - you must not use information from outside these excerpts.

Question: {{user_question}}

Relevant document excerpts retrieved:
{{retrieved_chunks}}

(Each excerpt includes: document name, section, and content)

Instructions:
1. Answer the question directly and specifically using the provided excerpts
2. If the excerpts clearly and fully answer the question: provide the answer, then cite the source(s) in this format: "Source: [Document Name], [Section], [Date if available]"
3. If the excerpts partially answer the question: provide what you can, note the limitation, and cite what you used
4. If the excerpts don't adequately answer the question: say so clearly - "The current knowledge base doesn't have a clear answer to this question. The closest relevant content is [brief description] in [document]. I've flagged this as a knowledge gap."
5. Never fabricate information. If you're uncertain, say so.
6. Keep answers concise - aim for under 200 words unless the question requires more detail

Format your response as:
**Answer:** [Direct answer]
**Source(s):** [Citations]
**Note:** [Any important caveats or limitations, if applicable]

Build gap tracking and library maintenance

Every question the system can't answer well is a signal that the knowledge base needs a new document or a clearer policy. Build a gap tracking workflow: when the system returns a "not found" or low-confidence answer, log the original question, the date, and the requester to a gap tracking sheet. Assign document owners to review the gap log weekly and determine which gaps represent genuine policy gaps (need a new document written), unclear existing policies (existing document needs clarification), or out-of-scope questions (appropriately not in the knowledge base). Set a monthly review cadence: document owners review their sections, update content that's changed, and archive anything that's outdated. When a document is updated, the re-indexing workflow triggers automatically.

Week-by-week rollout plan

Weeks 1 - 3Document Library Build

Week 1: Audit existing documents. Identify authoritative versions.
Week 2: Consolidate, clean, and organize the library. Assign document owners.
Week 3: Build document registry. Verify completeness against your 20 most common questions.

Week 4Indexing

Set up vector database.
Build and run indexing workflow.
Test retrieval against 20 most common questions.

Week 5Bot and Answer Generation

Build internal knowledge portal and answer generation workflow.
Test against all 20 benchmark questions. Review every answer for accuracy.
Build gap tracking workflow.

Week 6Launch

Launch to a small pilot group (5 - 10 users).
Collect feedback after first 100 questions.
Refine based on feedback. Expand to full team.

Success benchmarks

These are the specific, measurable signals that confirm the play is working. Check against each benchmark at the 30-, 60-, and 90-day mark.

Answers rated accurate by users on 85%+ of questions within the knowledge base scope

Senior partner time spent answering routine knowledge questions reduced by 60%+

Gap log resulting in at least 2 - 3 new or clarified documents per month in the first quarter

System successfully returns 'not found' (rather than a wrong answer) for questions outside the knowledge base scope

Document library review cadence maintained - no document goes more than 6 months without review

Common mistakes

Skipping the document audit

If your policy documents are inconsistent, outdated, or scattered, the Q&A system will return inconsistent, outdated answers with the appearance of authority. The document audit is not optional - it's the foundation everything else stands on.

Not assigning document ownership

The Q&A system is only as good as the library underneath it. Without named owners who are responsible for keeping documents current, the library goes stale within months and the system starts returning outdated guidance.

Deploying without testing against known questions

Before launch, run the system against your 20 most common questions. Review every answer for accuracy and citation quality. If the hit rate is below 80%, the library needs more work.

Exception rule

Read before going live

The Q&A system will occasionally answer questions it doesn't have authoritative information for, and it will do so with apparent confidence. Always surface source citations with answers so users can verify before acting. An answer without provenance is dangerous. Build citation displays into every response and train your team to check sources before acting on high-stakes answers.

Ready-to-use AI skills for this play

Skip the blank page. These free AI skills do the drafting work inside this play. Install one in Claude, or paste it into ChatGPT, then hand it your raw input.

operations skill

Knowledge Base Article Writer

Write clear knowledge base and help-center articles that let users solve problems without asking a human.

Get the skill

operations skill

SOP Writer

Write a clear standard operating procedure that anyone can follow to do a task the same way every time.

Get the skill

operations skill

Policy Document Drafter

Draft a clear internal policy document that sets expectations without drowning people in legalese.

Get the skill

Browse the full AI Skills Library

Related plays

RFP First Draft Generator

AI-Assisted Hiring Screening

Frequently asked questions

What is a knowledge base Q&A system and how does it differ from a regular chatbot?⌄

A knowledge base Q&A system uses Retrieval-Augmented Generation (RAG) to answer questions from your actual firm documents - SOPs, policies, case studies, proposals, and playbooks. Unlike a standard chatbot that generates answers from a general model, this system cites your specific content and only answers from what exists in your library. It's accurate to your firm's unique knowledge.

What documents can be included in the knowledge base?⌄

Any document in text form: PDF policies, Word SOPs, Google Docs playbooks, past proposals, case studies, compliance guidelines, email templates, and meeting notes. The ingestion pipeline in n8n reads, chunks, embeds, and stores the content in a vector database (Pinecone, Supabase pgvector, or Qdrant).

How do we keep the knowledge base current when documents change?⌄

The weekly update workflow monitors a designated folder (Google Drive, SharePoint, or Notion) for new or modified documents, processes any changes, and updates the vector database automatically. Documents added or edited roll into the knowledge base the next time the update workflow runs.

Can the system answer questions it doesn't have source documents for?⌄

It can be configured either way. The safest configuration is to have the system respond: 'I don't have a source document for that - here's who to ask.' This prevents hallucination and builds trust in the system's answers. Allowing it to answer from general knowledge is possible but should be flagged differently in the output.

Want someone to build this play for your firm?

Revenue Institute implements the full AI Workforce Playbook system as part of every engagement.

Work with Revenue Institute