All Case Studies
Legal Services · RAG + Fine-Tuned Retrieval

18 Years of Case Files.
Every Answer in
Under 4 Minutes.

A 45-attorney Chicago firm had 2.3M documents across iManage and a network file share — 60% scanned PDFs with no text layer. We built a fully on-premise RAG system that cut search time by 95%.

95%
Search time reduction
45 min → 2–4 min
89%
Precision@10 — up from 15% keyword baseline
2.3M
Documents indexed across all systems
45 min
Avg. daily time saved per attorney
Industry
Legal Services
Firm Size
45 attorneys, 80+ staff
Location
Chicago, Illinois
Engagement
5 weeks
The Problem

Three Pain Points.
Eighteen Years of Debt.

🔍

Keyword Search Was Useless

Searching "indemnification" across a 2.3M-document corpus returned hundreds of results; attorneys spent 45+ minutes per matter just finding relevant precedent.

🧠

Knowledge Lived in Heads

Senior paralegals held informal knowledge maps. When they left, that institutional knowledge walked out the door with them.

🚫

Commercial Tools Blocked

Every commercial legal AI tool required cloud uploads — a clear ethics violation under their professional responsibility obligations.

DOCUMENT CORPUS · 2.3M FILES · ~14TB · 18 YEARS
Each category → different extraction strategy
60%
Scanned PDFs
No text layer — images of printed pages from the early 2000s. Deskewing + contrast normalization required.
Tesseract OCR pipeline
25%
Native PDFs + Word
Extractable text but inconsistent formatting — headers bleeding into body, tracked changes, duplicate versions.
pymupdf + python-docx
15%
Mixed Documents
Partially scanned: some pages carry an extractable text layer, others are pure images. Required page-level analysis and independent routing.
Page-level classifier
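The page-level routing above reduces to a simple heuristic: a scanned page has no text layer, so native extraction returns an empty (or near-empty) string, and that page goes to OCR instead. A minimal sketch of the idea; the 20-character threshold and function names are illustrative, not the production values:

```python
def route_page(page_text: str, min_chars: int = 20) -> str:
    """Decide per page whether native extraction is usable.

    A scanned page has no text layer, so a pymupdf-style extractor
    yields an empty string; such pages are sent to the OCR pipeline.
    """
    return "native" if len(page_text.strip()) >= min_chars else "ocr"


def route_document(page_texts: list[str]) -> str:
    """Classify a whole document from its per-page routes."""
    routes = {route_page(t) for t in page_texts}
    if routes == {"native"}:
        return "native"   # pymupdf / python-docx extraction
    if routes == {"ocr"}:
        return "scanned"  # Tesseract with deskew + contrast normalization
    return "mixed"        # page-level routing, page by page
```

In production the per-page text would come from the actual extractor; the routing logic itself stays this small.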
⚠️
The firm evaluated and rejected all commercial legal AI tools
No tool could connect to on-premise iManage without cloud sync, and sending client documents to third-party AI services raised clear ethics concerns. The only viable path: a fully on-premise, custom-built system.
System Architecture

Two Pipelines.
Built for How Lawyers Work.

Attorney Interface
⚖️ Attorney Query Interface
Conversational search · Citation-grounded answers · Ethical wall enforcement · Matter-scoped access
FastAPI + OAuth2 + RBAC
↓  query authenticated against matter-level access matrix
Retrieval Pipeline — 5-Stage Hybrid Search
🔢
Semantic Search
Top 40 via Qdrant embedding similarity
🔤
BM25 Keyword
Top 40 via Elasticsearch term frequency
🔀
RRF Merge
Reciprocal Rank Fusion on both result sets
🎯
Cross-Encoder
Fine-tuned bge-reranker-v2-m3 re-ranks
⚖️
Metadata Boost
Practice area, client, attorney weighting
↓  top results → Claude with citation-enforcement prompt
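The RRF merge in stage 3 is only a few lines: each document scores 1/(k + rank) in every list that returns it, so documents ranked well by both semantic and keyword search float to the top. A sketch using the conventional k = 60 constant (the production weighting and candidate counts may differ):

```python
def rrf_merge(semantic: list[str], keyword: list[str], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in (semantic, keyword):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; these candidates then go to the cross-encoder.
    return sorted(scores, key=scores.get, reverse=True)
```

A document appearing in both lists outranks one appearing high in only one, which is exactly the behavior a hybrid pipeline wants before re-ranking.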
Ingestion Pipeline — 3 Stages
STAGE 01
Classify & Extract
Native PDF → pymupdf, Word → python-docx, Scanned → Tesseract with deskew. Mixed docs get page-level routing.
STAGE 02
Legal-Aware Chunking
Contracts by clause, briefs by argument, discovery by exhibit — each with full section hierarchy and metadata.
STAGE 03
Embed & Index
Fine-tuned bge-large-en-v1.5 (15K legal pairs). Stored in Qdrant (3-node cluster) with payload filters.
↓  2.3M documents ingested · incremental iManage sync daily
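Stage 2's clause-level chunking can be illustrated with a heading-based splitter: numbered clause headings become chunk boundaries, and each chunk carries its clause number, title, and document metadata into the index. The regex here is a deliberate simplification; real contracts need a richer heading grammar:

```python
import re

# Matches headings like "1. Indemnification" or "4.2. Limitation of Liability"
CLAUSE_RE = re.compile(r"^(\d+(?:\.\d+)*)\.\s+([A-Z][^\n]*)$", re.MULTILINE)

def chunk_by_clause(text: str, doc_meta: dict) -> list[dict]:
    """Split a contract into clause-level chunks with section metadata."""
    matches = list(CLAUSE_RE.finditer(text))
    chunks = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chunks.append({
            "clause": m.group(1),
            "title": m.group(2).strip(),
            "text": text[m.end():end].strip(),
            **doc_meta,  # matter no., client, practice area, ...
        })
    return chunks
```

Briefs and discovery documents use the same pattern with different boundary rules (argument headings, exhibit markers).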
Data Layer — Fully On-Premise
🗄️ iManage + Network File Share → Unified Index
Qdrant (vector) · Elasticsearch (BM25) · iManage permissions API (ethical walls)
On-premise · No cloud sync
Measured Outcomes

Before vs After.
Validated on 200 Queries.

Metric
⚠ Before
✓ After
Precedent search time
45+ min / matter
Manual keyword search
2–4 min
~95% reduction
Search precision@10
~15%
Hundreds of irrelevant results
~89%
6× improvement
Institutional knowledge
Paralegal-dependent
Lost with staff turnover
Searchable corpus
2.3M docs unified
Citation quality
None
Files, not answers
Grounded answers
Matter no. + page ref
Ethical wall compliance
Manual / process-based
Relied on staff awareness
System-enforced
iManage sync · query filter
45 min
Daily time saved per attorney
2.3M
Documents searchable, unified
5 wks
Ingestion + tuning + deployment
0
Client docs sent to any cloud
Technical Proof

Citation-Grounded Answers.
Not Just Search Results.

legal-rag · commercial-lit · abstrabit
What indemnification language have we used in SaaS vendor agreements for financial services clients?
▸ hybrid search · 80 candidates · re-ranked · 3 docs retrieved · 1.8s
Two distinct indemnification structures found. In M-2019-0847, p.12 (Northern Trust), mutual indemnification with willful misconduct carve-out, capped at 12 months' fees. In M-2021-1203, p.8 (First Midwest Bank), unilateral vendor indemnification for data breaches tied to GLBA — preferred approach post-2021.

A tiered cap structure appears in M-2022-0391, p.15: general liability capped at fees, data liability at $5M, no cap on third-party IP claims.
📄 M-2019-0847 · SaaS Agreement · Northern Trust · 2019-03-14 · p.12
📄 M-2021-1203 · Vendor Contract · First Midwest Bank · 2021-07-22 · p.8
📄 M-2022-0391 · Tech Services Agreement · Heartland Financial · 2022-11-09 · p.15
⚠ 2 additional documents matched with score < 0.60 — flagged for manual review.
📊 Retrieval Performance (200 Query Eval Set)
Keyword only
15%
Semantic alone
64%
Hybrid (semantic+BM25)
72%
+ Fine-tuned re-ranker
89%
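Precision@10 is the fraction of the top 10 results judged relevant, averaged over the 200-query evaluation set (the relevance judgments came from attorney review). A sketch of the metric as it would be computed:

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int = 10) -> float:
    """Standard precision@k: relevant documents in the top k, divided by k."""
    return sum(1 for doc in retrieved[:k] if doc in relevant) / k


def mean_precision_at_k(eval_set: list[tuple[list[str], set[str]]], k: int = 10) -> float:
    """Average precision@k over (retrieved, relevant) pairs in an eval set."""
    return sum(precision_at_k(r, rel, k) for r, rel in eval_set) / len(eval_set)
```

Running the same eval set against each pipeline variant (keyword only, semantic only, hybrid, hybrid + re-ranker) produces the ladder shown above.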
🔒 Ethical Wall Enforcement
Access controls sync from iManage — attorneys see only what iManage grants
Restricted docs filtered at vector store query level — never returned
Every query logged: user, timestamp, matter context, docs retrieved
Audit log exportable for ethics review — full traceability
Zero client documents sent to any external API
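Conceptually, the enforcement is a filter applied before results ever leave the retrieval layer, plus an append-only audit record per query. A minimal sketch with an in-memory access matrix; in the deployed system the matrix is synced from the iManage permissions API and the filter is pushed down into the Qdrant and Elasticsearch queries themselves:

```python
from datetime import datetime, timezone

def enforce_ethical_wall(user: str, hits: list[dict],
                         access_matrix: dict[str, set[str]],
                         audit_log: list[dict]) -> list[dict]:
    """Drop hits from matters the user cannot see, and log the query."""
    allowed = access_matrix.get(user, set())
    visible = [h for h in hits if h["matter_id"] in allowed]
    audit_log.append({
        "user": user,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "docs_retrieved": [h["doc_id"] for h in visible],
    })
    return visible
```

Because restricted documents are filtered before ranking output is assembled, they can never appear in an answer, and the audit trail records exactly what each user retrieved.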
Full Stack
bge-large-en-v1.5 bge-reranker-v2-m3 Qdrant Elasticsearch Tesseract OCR pymupdf python-docx FastAPI Claude API iManage API Docker Nginx
Why Generic RAG Fails in Legal
"Consideration" means completely different things in contract law vs. administrative law. Generic semantic search conflates them. Our domain-specific fine-tuning on 15K legal query-document pairs solves this.