Play 11: Knowledge Base Q&A

Play 11 Complete Implementation Guide

Full walkthrough: document audit, vector DB setup, Slack/Teams bot, semantic search, citation system.

You need a knowledge base that actually works. Not a document graveyard where information goes to die, but a system that surfaces the right answer in under 30 seconds. This guide walks you through building a production-grade Q&A system with semantic search, bot integration, and proper citation tracking.

Timeline: 3-4 weeks for initial deployment. Budget: $500-2,000/month for tooling at 50-person firm scale.

1. Document Audit and Preparation

Start with an honest inventory. Most firms have knowledge scattered across SharePoint, Google Drive, Confluence, email threads, and partner hard drives. You need it all in one place.

Week 1: Discovery and Collection

  1. Map every knowledge repository in your firm. Create a spreadsheet with columns: Location, Owner, Document Count, Last Updated, Access Level.
  2. Prioritize by usage frequency. Start with client deliverables, training materials, and process documentation. Skip marketing collateral and expired proposals.
  3. Export everything to a staging folder. Use native export tools (SharePoint migration API, Google Takeout, Confluence export). Maintain original folder structure for now.

Document Cleaning Checklist

Run every document through this filter:

  • Remove headers, footers, page numbers, and watermarks using Adobe Acrobat batch processing or Docparser.
  • Strip out boilerplate sections that appear in multiple documents (standard disclaimers, signature blocks).
  • Redact client names, financial figures, and confidential data. Use regex patterns: \$[\d,]+ for dollar amounts, [A-Z][a-z]+ (LLC|Inc\.|Corporation) for company names.
  • Convert all files to plain text or Markdown. PDFs go through OCR if needed (use Tesseract or Adobe's built-in OCR).
  • Normalize filenames: YYYY-MM-DD_DocumentType_Topic.txt (example: 2024-01-15_Memo_Section1031Exchange.txt).
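
The two regex patterns above can be scripted for batch redaction. A minimal sketch with Python's `re` module (the replacement tokens `[AMOUNT]` and `[REDACTED]` are our choice, and these patterns are a starting point, not a complete redaction pass):

```python
import re

def redact(text):
    """Apply the redaction patterns from the checklist above."""
    # Dollar amounts, e.g. "$1,200,000"
    text = re.sub(r"\$[\d,]+", "[AMOUNT]", text)
    # Company names followed by a legal suffix, e.g. "Acme LLC"
    text = re.sub(r"[A-Z][a-z]+ (LLC|Inc\.|Corporation)", "[REDACTED]", text)
    return text

print(redact("Fee of $1,200,000 invoiced to Acme LLC."))
# → Fee of [AMOUNT] invoiced to [REDACTED].
```

Spot-check a sample of redacted output manually; regex alone will miss multi-word company names and unusual formats.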

Metadata Extraction

Your vector database needs rich metadata for filtering. Extract and standardize:

  • Document type (memo, brief, training guide, process doc)
  • Practice area or department
  • Author and reviewer names
  • Creation and last-modified dates
  • Client matter number (if applicable)
  • Confidence level (draft, reviewed, approved)

Store metadata in a CSV with columns: filename, doc_type, practice_area, author, date_created, date_modified, matter_id, status. This becomes your source of truth.
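
The upload script in section 2 calls a `get_metadata_from_csv` helper; a minimal sketch against the CSV layout above (the `metadata.csv` default path is an assumption):

```python
import csv

def get_metadata_from_csv(filename, csv_path="metadata.csv"):
    """Return the metadata row for a cleaned document, keyed on the
    filename column of the metadata CSV described above."""
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["filename"] == filename:
                return row
    raise KeyError(f"No metadata row for {filename}")
```

For thousands of documents, load the CSV into a dict once instead of re-reading it per file.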

2. Vector Database Configuration

Skip the analysis paralysis. For professional services firms under 100 people, use Pinecone (easiest) or Qdrant (self-hosted option). Over 100 people or handling 50,000+ documents, evaluate Weaviate.

Pinecone Setup (Recommended Path)

  1. Create account at pinecone.io. Start with Starter plan ($70/month, 100K vectors).
  2. Create index named firm-knowledge with dimensions=1536 (matches OpenAI ada-002 embeddings), metric=cosine, pod-type=p1.
  3. Install Python client: pip install pinecone-client openai.
  4. Generate embeddings and upload:
import pinecone
import openai
from pathlib import Path

pinecone.init(api_key="your-key", environment="us-west1-gcp")
index = pinecone.Index("firm-knowledge")

def embed_and_upload(text_file, metadata):
    with open(text_file) as f:
        content = f.read()
    
    # Chunk into 500-word segments with 50-word overlap
    # (chunk_text is the splitter described in the Chunking Strategy section)
    chunks = chunk_text(content, chunk_size=500, overlap=50)
    
    for i, chunk in enumerate(chunks):
        embedding = openai.Embedding.create(
            input=chunk,
            model="text-embedding-ada-002"
        )['data'][0]['embedding']
        
        index.upsert(vectors=[(
            f"{text_file.stem}_chunk{i}",
            embedding,
            {**metadata, "text": chunk, "chunk_id": i}
        )])

# Process all documents; get_metadata_from_csv returns the row
# for this filename from the section 1 metadata CSV
for doc in Path("cleaned_docs").glob("*.txt"):
    metadata = get_metadata_from_csv(doc.name)
    embed_and_upload(doc, metadata)

Chunking Strategy

Don't embed entire documents. Break into logical segments:

  • 500 words per chunk for general knowledge
  • 200 words per chunk for dense technical content
  • 50-word overlap between chunks to preserve context
  • Store chunk position in metadata for reassembly
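
A sketch of the `chunk_text` helper the upload script relies on, implementing the word counts and overlap above:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into word-based chunks of chunk_size, with `overlap`
    words shared between consecutive chunks (overlap < chunk_size)."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks
```

Word-based splitting is the simplest approach; splitting on paragraph or heading boundaries instead usually improves retrieval for well-structured documents.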

Index Optimization

Create metadata filters for common query patterns:

  • practice_area filter: Allows "show me only tax documents"
  • date_created range filter: "documents from last 6 months"
  • doc_type filter: "only show process guides"

Test retrieval with 20 sample queries spanning your practice areas. Relevant results should appear in top 3 for 80% of queries. If not, adjust chunk size or re-evaluate document cleaning.
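
These filters use Pinecone's metadata filter syntax. A sketch of a helper (the name `build_filter` is ours) that turns UI-style selections into a filter dict, assuming dates are stored as integer YYYYMMDD values so the numeric `$gte` operator applies:

```python
def build_filter(practice_area=None, doc_type=None, date_from=None):
    """Build a Pinecone metadata filter from optional selections.
    date_from is assumed to be an integer like 20240101, since
    range operators apply to numeric metadata."""
    clauses = {}
    if practice_area:
        clauses["practice_area"] = {"$eq": practice_area}
    if doc_type:
        clauses["doc_type"] = {"$eq": doc_type}
    if date_from:
        clauses["date_created"] = {"$gte": date_from}
    return clauses

# Passed straight through: index.query(..., filter=build_filter(...))
print(build_filter(practice_area="tax", date_from=20240101))
```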

3. Slack Bot Deployment

Build a bot that lives where your team already works. Slack adoption is 10x higher than standalone portals.

Bot Framework Setup

Use Slack Bolt framework (simpler than Microsoft Bot Framework for this use case).

  1. Create Slack app at api.slack.com/apps. Enable Socket Mode for development.
  2. Add bot scopes: app_mentions:read, chat:write, im:history, im:write.
  3. Install app to workspace and save bot token.

Core Bot Logic

from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler
import re
import pinecone
import openai

app = App(token="xoxb-your-bot-token")
pinecone.init(api_key="your-key", environment="us-west1-gcp")
index = pinecone.Index("firm-knowledge")

@app.event("app_mention")
def handle_question(event, say):
    # Strip the <@BOT_ID> mention from the message text
    question = re.sub(r"<@[^>]+>", "", event['text']).strip()
    
    # Generate query embedding
    query_embedding = openai.Embedding.create(
        input=question,
        model="text-embedding-ada-002"
    )['data'][0]['embedding']
    
    # Search vector DB
    results = index.query(
        vector=query_embedding,
        top_k=3,
        include_metadata=True
    )
    
    # Format response with citations
    answer_blocks = []
    for match in results['matches']:
        answer_blocks.append({
            "type": "section",
            "text": {
                "type": "mrkdwn",
                "text": f"*Source:* {match['metadata']['filename']}\n{match['metadata']['text'][:300]}..."
            }
        })
        answer_blocks.append({
            "type": "context",
            "elements": [{
                "type": "mrkdwn",
                "text": f"Relevance: {match['score']:.2f} | {match['metadata']['practice_area']} | {match['metadata']['date_created']}"
            }]
        })
    
    say(blocks=answer_blocks, text="Here's what I found:")

if __name__ == "__main__":
    SocketModeHandler(app, "xapp-your-app-token").start()

User Experience Enhancements

Add these features in week 2:

  • Slash command /kb search [query] for direct search without @mention
  • Reaction-based feedback: thumbs up/down on answers logs to analytics
  • Follow-up prompt: "Was this helpful? React with ✅ or ❌"
  • Daily digest: Post "Top 3 unanswered questions this week" to #general
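
The reaction-based feedback item needs a logging sink. A minimal sketch that appends votes to a CSV file (the file name and column layout are ours), wired to Bolt's `reaction_added` event:

```python
import csv
from datetime import datetime, timezone

def record_feedback(message_ts, user_id, reaction, log_path="feedback.csv"):
    """Append one feedback vote; thumbs and check/cross reactions
    map to +1/-1, anything else logs as 0."""
    vote = {"+1": 1, "white_check_mark": 1, "-1": -1, "x": -1}.get(reaction, 0)
    with open(log_path, "a", newline="") as f:
        csv.writer(f).writerow(
            [datetime.now(timezone.utc).isoformat(),
             message_ts, user_id, reaction, vote]
        )
    return vote

# Wiring sketch, inside the bot from above:
# @app.event("reaction_added")
# def on_reaction(event):
#     record_feedback(event["item"]["ts"], event["user"], event["reaction"])
```

Point your analytics dashboard at this file (or swap in the searches database) to compute answer helpfulness rates.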

Microsoft Teams Alternative

If your firm uses Teams, use Bot Framework Composer instead. Core logic stays identical, but deployment goes through Azure Bot Service ($0.50 per 1,000 messages).

4. Semantic Search Interface

The bot handles 70% of queries. Build a web interface for complex research sessions.

Search UI Stack

  • Frontend: Next.js with Tailwind CSS
  • Backend: FastAPI Python service
  • Hosting: Vercel (frontend) + Railway (backend)

Search Endpoint

from fastapi import FastAPI
from pydantic import BaseModel
import pinecone
import openai

pinecone.init(api_key="your-key", environment="us-west1-gcp")
index = pinecone.Index("firm-knowledge")

app = FastAPI()

class SearchRequest(BaseModel):
    query: str
    filters: dict = {}
    top_k: int = 10

@app.post("/search")
async def search(request: SearchRequest):
    query_embedding = openai.Embedding.create(
        input=request.query,
        model="text-embedding-ada-002"
    )['data'][0]['embedding']
    
    results = index.query(
        vector=query_embedding,
        top_k=request.top_k,
        filter=request.filters,
        include_metadata=True
    )
    
    # Group chunks from same document
    grouped = {}
    for match in results['matches']:
        doc_id = match['metadata']['filename']
        if doc_id not in grouped:
            grouped[doc_id] = {
                'filename': doc_id,
                'chunks': [],
                'max_score': 0,
                'metadata': match['metadata']
            }
        grouped[doc_id]['chunks'].append({
            'text': match['metadata']['text'],
            'score': match['score']
        })
        grouped[doc_id]['max_score'] = max(
            grouped[doc_id]['max_score'],
            match['score']
        )
    
    # Sort by best chunk score
    ranked = sorted(
        grouped.values(),
        key=lambda x: x['max_score'],
        reverse=True
    )
    
    return {"results": ranked}

Advanced Search Features

Implement these filters in your UI:

  • Date range picker: "Documents created between [start] and [end]"
  • Practice area dropdown: Multi-select with your firm's practice areas
  • Document type checkboxes: Memo, Brief, Guide, Process Doc
  • Author search: Autocomplete from your staff directory
  • Confidence filter: Show only "Approved" status documents

Query Suggestions

Track all searches in a PostgreSQL table: searches(id, query, user_id, timestamp, clicked_result). Generate suggestions:

SELECT query, COUNT(*) as frequency
FROM searches
WHERE clicked_result IS NOT NULL
GROUP BY query
ORDER BY frequency DESC
LIMIT 10;

Display these as "Popular searches" on the homepage.
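
The searches table and suggestion query translate directly to SQLite if you'd rather not run PostgreSQL on day one; a sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path in production
conn.execute("""CREATE TABLE searches (
    id INTEGER PRIMARY KEY,
    query TEXT, user_id TEXT, timestamp TEXT, clicked_result TEXT)""")

def log_search(query, user_id, clicked_result=None):
    """Record one search; clicked_result stays NULL until the user clicks."""
    conn.execute(
        "INSERT INTO searches (query, user_id, timestamp, clicked_result) "
        "VALUES (?, ?, datetime('now'), ?)",
        (query, user_id, clicked_result),
    )

def popular_searches(limit=10):
    """The same suggestion query as above: frequent searches with clicks."""
    return conn.execute(
        "SELECT query, COUNT(*) AS frequency FROM searches "
        "WHERE clicked_result IS NOT NULL "
        "GROUP BY query ORDER BY frequency DESC LIMIT ?",
        (limit,),
    ).fetchall()
```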

5. Citation System

Every answer needs a source. Build citation tracking into the retrieval flow.

Citation Format Standards

Use this format for internal documents:

[Author Last Name], [Document Title], [Practice Area], [Date Created]. Internal Doc ID: [filename]

Example: Chen, Section 1031 Exchange Memo, Tax, 2024-01-15. Internal Doc ID: 2024-01-15_Memo_Section1031Exchange

Automatic Citation Generation

Add citation button to every search result:

def generate_citation(metadata):
    filename = metadata.get('filename', '')
    doc_id = filename.rsplit('.', 1)[0]  # drop the file extension
    author = metadata.get('author', 'Unknown')
    title = doc_id.replace('_', ' ')
    practice = metadata.get('practice_area', 'General')
    date = metadata.get('date_created', 'n.d.')

    return f"{author}, {title}, {practice}, {date}. Internal Doc ID: {doc_id}"

Display with one-click copy button in your UI.

Citation Analytics Dashboard

Track which documents drive the most value:

  • Most cited documents (monthly leaderboard)
  • Citation count by practice area
  • Authors with highest citation rates
  • Documents with zero citations in 90 days (candidates for archival)

Build a simple dashboard with Metabase or Grafana connected to your search logs database.

Usage Metrics to Monitor

Week 1: Track baseline query volume and response time.

Week 4: Measure these KPIs:

  • Average time to first result: Target under 2 seconds
  • Click-through rate on top result: Target above 60%
  • Queries with zero results: Target under 10%
  • Daily active users: Target 40% of firm within 30 days
  • Repeat usage rate: Target 70% of users return within 7 days

Set up weekly review meetings. Adjust chunking strategy, add missing documents, and refine metadata based on actual usage patterns.

Bottom Line

This system replaces the "email the senior associate" workflow with instant, cited answers. Expect 15-20 hours per week saved across a 50-person firm once adoption hits 60%. The ROI shows up in faster client response times and reduced duplicate work.

Reviewed by Revenue Institute

This guide is actively maintained and reviewed by the implementation experts at Revenue Institute. As the creators of The AI Workforce Playbook, we test and deploy these exact frameworks for professional services firms scaling without new headcount.

Need help turning this guide into reality? Revenue Institute builds and implements the AI workforce for professional services firms.

RevenueInstitute.com