Play 11 Complete Implementation Guide
Full walkthrough: document audit, vector DB setup, Slack/Teams bot, semantic search, citation system.
You need a knowledge base that actually works. Not a document graveyard where information goes to die, but a system that surfaces the right answer in under 30 seconds. This guide walks you through building a production-grade Q&A system with semantic search, bot integration, and proper citation tracking.
Timeline: 3-4 weeks for initial deployment. Budget: $500-2,000/month for tooling at 50-person firm scale.
1. Document Audit and Preparation
Start with an honest inventory. Most firms have knowledge scattered across SharePoint, Google Drive, Confluence, email threads, and partner hard drives. You need it all in one place.
Week 1: Discovery and Collection
- Map every knowledge repository in your firm. Create a spreadsheet with columns: Location, Owner, Document Count, Last Updated, Access Level.
- Prioritize by usage frequency. Start with client deliverables, training materials, and process documentation. Skip marketing collateral and expired proposals.
- Export everything to a staging folder. Use native export tools (SharePoint migration API, Google Takeout, Confluence export). Maintain original folder structure for now.
Document Cleaning Checklist
Run every document through this filter:
- Remove headers, footers, page numbers, and watermarks using Adobe Acrobat batch processing or Docparser.
- Strip out boilerplate sections that appear in multiple documents (standard disclaimers, signature blocks).
- Redact client names, financial figures, and confidential data. Use regex patterns: `\$[\d,]+` for dollar amounts, `[A-Z][a-z]+ (LLC|Inc\.|Corporation)` for company names.
- Convert all files to plain text or Markdown. PDFs go through OCR if needed (use Tesseract or Adobe's built-in OCR).
- Normalize filenames: `YYYY-MM-DD_DocumentType_Topic.txt` (example: `2024-01-15_Memo_Section1031Exchange.txt`).
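As a rough sketch, the redaction patterns above can be applied in one batch pass before conversion. The placeholder tokens (`[AMOUNT]`, `[CLIENT]`) are our convention here, not part of any standard tool:

```python
import re

# The two patterns from the checklist: dollar amounts and common company-name forms.
DOLLAR_RE = re.compile(r"\$[\d,]+")
COMPANY_RE = re.compile(r"[A-Z][a-z]+ (LLC|Inc\.|Corporation)")

def redact(text: str) -> str:
    """Replace sensitive spans with placeholder tokens before indexing."""
    text = DOLLAR_RE.sub("[AMOUNT]", text)
    text = COMPANY_RE.sub("[CLIENT]", text)
    return text
```

Run this over every staged file; expect to hand-review a sample, since regex redaction misses unusual name forms.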
Metadata Extraction
Your vector database needs rich metadata for filtering. Extract and standardize:
- Document type (memo, brief, training guide, process doc)
- Practice area or department
- Author and reviewer names
- Creation and last-modified dates
- Client matter number (if applicable)
- Confidence level (draft, reviewed, approved)
Store metadata in a CSV with columns: filename, doc_type, practice_area, author, date_created, date_modified, matter_id, status. This becomes your source of truth.
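The upload script in the next section needs to look up this CSV by filename. A minimal sketch of that lookup (the sample row is invented for illustration):

```python
import csv
import io

# One sample row in the CSV layout described above (filename, doc_type,
# practice_area, author, date_created, date_modified, matter_id, status).
SAMPLE_CSV = """filename,doc_type,practice_area,author,date_created,date_modified,matter_id,status
2024-01-15_Memo_Section1031Exchange.txt,memo,Tax,Chen,2024-01-15,2024-02-01,M-1042,approved
"""

def load_metadata(csv_file) -> dict:
    """Index metadata rows by filename for O(1) lookup during upload."""
    reader = csv.DictReader(csv_file)
    return {row["filename"]: row for row in reader}

metadata_by_file = load_metadata(io.StringIO(SAMPLE_CSV))
```

In production you would pass `open("metadata.csv")` instead of the in-memory sample.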
2. Vector Database Configuration
Skip the analysis paralysis. For professional services firms under 100 people, use Pinecone (easiest) or Qdrant (self-hosted option). If you're over 100 people or handling 50,000+ documents, evaluate Weaviate.
Pinecone Setup (Recommended Path)
- Create account at pinecone.io. Start with Starter plan ($70/month, 100K vectors).
- Create index named `firm-knowledge` with dimensions=1536 (matches OpenAI ada-002 embeddings), metric=cosine, pod-type=p1.
- Install Python client: `pip install pinecone-client openai`.
- Generate embeddings and upload:
```python
# Uses the pinecone-client 2.x and openai 0.x client APIs.
import pinecone
import openai
from pathlib import Path

openai.api_key = "your-openai-key"
pinecone.init(api_key="your-key", environment="us-west1-gcp")
index = pinecone.Index("firm-knowledge")

def embed_and_upload(text_file, metadata):
    with open(text_file) as f:
        content = f.read()
    # Chunk into 500-word segments with 50-word overlap.
    # chunk_text is a small helper that splits text into overlapping
    # word segments (see Chunking Strategy below).
    chunks = chunk_text(content, chunk_size=500, overlap=50)
    for i, chunk in enumerate(chunks):
        embedding = openai.Embedding.create(
            input=chunk,
            model="text-embedding-ada-002"
        )['data'][0]['embedding']
        index.upsert(vectors=[(
            f"{text_file.stem}_chunk{i}",
            embedding,
            {**metadata, "text": chunk, "chunk_id": i}
        )])

# Process all documents. get_metadata_from_csv looks up the row for
# this file in the metadata CSV built in section 1.
for doc in Path("cleaned_docs").glob("*.txt"):
    metadata = get_metadata_from_csv(doc.name)
    embed_and_upload(doc, metadata)
```
Chunking Strategy
Don't embed entire documents. Break into logical segments:
- 500 words per chunk for general knowledge
- 200 words per chunk for dense technical content
- 50-word overlap between chunks to preserve context
- Store chunk position in metadata for reassembly
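The rules above can be collapsed into the `chunk_text` helper the upload script calls. A minimal sketch; word-based splitting is a simplification, and token-based splitters work just as well:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks, overlapping to preserve context."""
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap  # advance less than a full chunk each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last chunk reached the end of the document
    return chunks
```

For the 200-word technical-content setting, call `chunk_text(text, chunk_size=200, overlap=50)`.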
Index Optimization
Create metadata filters for common query patterns:
- `practice_area` filter: allows "show me only tax documents"
- `date_created` range filter: "documents from last 6 months"
- `doc_type` filter: "only show process guides"
Test retrieval with 20 sample queries spanning your practice areas. Relevant results should appear in top 3 for 80% of queries. If not, adjust chunk size or re-evaluate document cleaning.
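These filters map onto Pinecone's MongoDB-style metadata operators (`$in`, `$gte`). A hedged sketch of building the filter dict; note that Pinecone's range operators apply to numeric metadata, so this assumes dates are also stored as numeric timestamps:

```python
def build_metadata_filter(practice_areas=None, created_after=None, doc_types=None):
    """Translate the common query patterns into a Pinecone metadata filter."""
    conditions = {}
    if practice_areas:
        conditions["practice_area"] = {"$in": practice_areas}
    if created_after is not None:
        # created_after as an epoch timestamp; $gte needs numeric metadata
        conditions["date_created_ts"] = {"$gte": created_after}
    if doc_types:
        conditions["doc_type"] = {"$in": doc_types}
    return conditions

# Passed through as: index.query(vector=..., filter=build_metadata_filter(...), ...)
f = build_metadata_filter(practice_areas=["Tax"], doc_types=["process doc"])
```

The `date_created_ts` field name is an assumption; add it alongside the human-readable date during upload if you want range filtering.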
3. Slack Bot Deployment
Build a bot that lives where your team already works. Slack adoption is 10x higher than standalone portals.
Bot Framework Setup
Use Slack Bolt framework (simpler than Microsoft Bot Framework for this use case).
- Create Slack app at api.slack.com/apps. Enable Socket Mode for development.
- Add bot scopes:
app_mentions:read,chat:write,im:history,im:write. - Install app to workspace and save bot token.
Core Bot Logic
```python
import re

from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler
import pinecone
import openai

app = App(token="xoxb-your-bot-token")
openai.api_key = "your-openai-key"
pinecone.init(api_key="your-key", environment="us-west1-gcp")
index = pinecone.Index("firm-knowledge")

@app.event("app_mention")
def handle_question(event, say):
    # Strip the mention tag (e.g. "<@U12345>") from the message text
    question = re.sub(r"<@[^>]+>", "", event['text']).strip()
    # Generate query embedding
    query_embedding = openai.Embedding.create(
        input=question,
        model="text-embedding-ada-002"
    )['data'][0]['embedding']
    # Search vector DB
    results = index.query(
        vector=query_embedding,
        top_k=3,
        include_metadata=True
    )
    # Format response with citations
    answer_blocks = []
    for match in results['matches']:
        answer_blocks.append({
            "type": "section",
            "text": {
                "type": "mrkdwn",
                "text": f"*Source:* {match['metadata']['filename']}\n{match['metadata']['text'][:300]}..."
            }
        })
        answer_blocks.append({
            "type": "context",
            "elements": [{
                "type": "mrkdwn",
                "text": f"Relevance: {match['score']:.2f} | {match['metadata']['practice_area']} | {match['metadata']['date_created']}"
            }]
        })
    say(blocks=answer_blocks, text="Here's what I found:")

if __name__ == "__main__":
    SocketModeHandler(app, "xapp-your-app-token").start()
```
User Experience Enhancements
Add these features in week 2:
- Slash command
/kb search [query]for direct search without @mention - Reaction-based feedback: thumbs up/down on answers logs to analytics
- Follow-up prompt: "Was this helpful? React with ✅ or ❌"
- Daily digest: Post "Top 3 unanswered questions this week" to #general
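For the slash command, Slack delivers everything after `/kb` as a single text string, so the handler mostly needs a small parser before reusing the same search flow. A hedged sketch (the Bolt wiring is shown in comments; `run_search` is a hypothetical wrapper around the mention handler's search logic):

```python
def parse_kb_command(command_text: str):
    """Parse '/kb search <query>' text into (subcommand, query).

    Slack sends command["text"] == "search vesting schedules" for
    the message "/kb search vesting schedules".
    """
    parts = command_text.strip().split(maxsplit=1)
    if not parts:
        return ("help", "")
    subcommand = parts[0].lower()
    query = parts[1] if len(parts) > 1 else ""
    return (subcommand, query)

# Wired into Bolt roughly as:
# @app.command("/kb")
# def handle_kb(ack, command, respond):
#     ack()  # Slack requires acknowledgment within 3 seconds
#     sub, query = parse_kb_command(command["text"])
#     if sub == "search" and query:
#         respond(run_search(query))  # reuse the app_mention search flow
```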
Microsoft Teams Alternative
If your firm uses Teams, use Bot Framework Composer instead. Core logic stays identical, but deployment goes through Azure Bot Service ($0.50 per 1,000 messages).
4. Semantic Search Interface
The bot handles 70% of queries. Build a web interface for complex research sessions.
Search UI Stack
- Frontend: Next.js with Tailwind CSS
- Backend: FastAPI Python service
- Hosting: Vercel (frontend) + Railway (backend)
Search Endpoint
```python
from fastapi import FastAPI
from pydantic import BaseModel
import pinecone
import openai

openai.api_key = "your-openai-key"
pinecone.init(api_key="your-key", environment="us-west1-gcp")
index = pinecone.Index("firm-knowledge")

app = FastAPI()

class SearchRequest(BaseModel):
    query: str
    filters: dict = {}
    top_k: int = 10

@app.post("/search")
async def search(request: SearchRequest):
    query_embedding = openai.Embedding.create(
        input=request.query,
        model="text-embedding-ada-002"
    )['data'][0]['embedding']
    results = index.query(
        vector=query_embedding,
        top_k=request.top_k,
        filter=request.filters,
        include_metadata=True
    )
    # Group chunks from the same document
    grouped = {}
    for match in results['matches']:
        doc_id = match['metadata']['filename']
        if doc_id not in grouped:
            grouped[doc_id] = {
                'filename': doc_id,
                'chunks': [],
                'max_score': 0,
                'metadata': match['metadata']
            }
        grouped[doc_id]['chunks'].append({
            'text': match['metadata']['text'],
            'score': match['score']
        })
        grouped[doc_id]['max_score'] = max(
            grouped[doc_id]['max_score'],
            match['score']
        )
    # Sort documents by their best chunk score
    ranked = sorted(
        grouped.values(),
        key=lambda x: x['max_score'],
        reverse=True
    )
    return {"results": ranked}
```
Advanced Search Features
Implement these filters in your UI:
- Date range picker: "Documents created between [start] and [end]"
- Practice area dropdown: Multi-select with your firm's practice areas
- Document type checkboxes: Memo, Brief, Guide, Process Doc
- Author search: Autocomplete from your staff directory
- Confidence filter: Show only "Approved" status documents
Query Suggestions
Track all searches in a PostgreSQL table: searches(id, query, user_id, timestamp, clicked_result). Generate suggestions:
```sql
SELECT query, COUNT(*) AS frequency
FROM searches
WHERE clicked_result IS NOT NULL
GROUP BY query
ORDER BY frequency DESC
LIMIT 10;
```
Display these as "Popular searches" on the homepage.
5. Citation System
Every answer needs a source. Build citation tracking into the retrieval flow.
Citation Format Standards
Use this format for internal documents:
[Author Last Name], [Document Title], [Practice Area], [Date Created]. Internal Doc ID: [filename]
Example: Chen, Section 1031 Exchange Memo, Tax, 2024-01-15. Internal Doc ID: 2024-01-15_Memo_Section1031Exchange
Automatic Citation Generation
Add citation button to every search result:
```python
def generate_citation(metadata):
    author = metadata.get('author', 'Unknown')
    title = metadata.get('filename', '').replace('_', ' ')
    practice = metadata.get('practice_area', 'General')
    date = metadata.get('date_created', 'n.d.')
    doc_id = metadata.get('filename', '')
    return f"{author}, {title}, {practice}, {date}. Internal Doc ID: {doc_id}"
```
Display with one-click copy button in your UI.
Citation Analytics Dashboard
Track which documents drive the most value:
- Most cited documents (monthly leaderboard)
- Citation count by practice area
- Authors with highest citation rates
- Documents with zero citations in 90 days (candidates for archival)
Build a simple dashboard with Metabase or Grafana connected to your search logs database.
Usage Metrics to Monitor
Week 1: Track baseline query volume and response time.
Week 4: Measure these KPIs:
- Average time to first result: Target under 2 seconds
- Click-through rate on top result: Target above 60%
- Queries with zero results: Target under 10%
- Daily active users: Target 40% of firm within 30 days
- Repeat usage rate: Target 70% of users return within 7 days
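Two of these KPIs fall straight out of the search log. A hedged sketch of the computation; the row fields, including `result_count`, are assumptions about how logging is wired, not the exact `searches` schema above:

```python
def compute_kpis(rows):
    """Compute click-through rate and zero-result rate from search log rows.

    Each row is assumed to look like:
    {"query": ..., "result_count": int, "clicked_result": str | None}
    """
    total = len(rows)
    if total == 0:
        return {"ctr": 0.0, "zero_result_rate": 0.0}
    clicks = sum(1 for r in rows if r["clicked_result"] is not None)
    zero = sum(1 for r in rows if r["result_count"] == 0)
    return {"ctr": clicks / total, "zero_result_rate": zero / total}
```

Compare the output against the targets above in your weekly review: CTR below 0.6 or zero-result rate above 0.1 flags a chunking or coverage problem.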
Set up weekly review meetings. Adjust chunking strategy, add missing documents, and refine metadata based on actual usage patterns.
Bottom Line
This system replaces the "email the senior associate" workflow with instant, cited answers. Expect 15-20 hours per week saved across a 50-person firm once adoption hits 60%. The ROI shows up in faster client response times and reduced duplicate work.

Reviewed by Revenue Institute
This guide is actively maintained and reviewed by the implementation experts at Revenue Institute. As the creators of The AI Workforce Playbook, we test and deploy these exact frameworks for professional services firms scaling without new headcount.