US Healthcare SaaS · HIPAA-Compliant AI

Clinician-Grade AI. Zero Patient Data Ever Leaves the Building.

An on-premise, fine-tuned LLM + RAG pipeline for clinical documentation. No PHI is sent to any third-party API. The working MVP supported the client's $2M seed raise.

🌱 $2M Seed Raised · on the working MVP we delivered
94.7% · ICD-10 top-3 accuracy
18s · Note generation (was 8–15 min manual)
0 bytes · PHI to third-party APIs
8 wks · MVP to HIPAA audit pass
Industry: US Healthcare SaaS
Stage: MVP → Seed
Users: Physicians, Clinical Staff
Engagement: 8 weeks

The Problem
Three Hard Requirements.
No Existing Solution Met All Three.
🔒
Zero PHI Off-Premise
No patient data may be sent to any external API, which rules out OpenAI, Anthropic, and every other cloud-hosted LLM vendor.
🩺
Clinician-Grade Output
Physicians must sign generated notes. Hallucinated codes or drug interactions are a liability.
Real-Time During Visits
Notes generated fast enough to be part of the active patient encounter workflow.
⚠️
Client Had Already Tried Raw Open-Source Models — and Failed
Off-the-shelf Llama produced hallucinated CPT codes, invented drug interactions, and unusable SOAP notes. No physician would sign them. Domain-specific fine-tuning was required.
23,200 · Training examples: de-identified notes, synthetic pairs, ICD-10/CPT examples with chain-of-thought
70,000+ · ICD-10-CM codes in RAG pipeline, updated quarterly without model retraining
~8% · Pre-existing claim denial rate from manual coding errors

System Architecture
Five Layers.
100% On-Premise.
Clinician Interface
🩺
Physician Interface — EHR-Connected
Visit input · Note generation · ICD-10/CPT suggestions · Gap detection
FastAPI + OAuth2 + RBAC
↓ authenticated API · client VPC · TLS 1.3
AI Core — 3 Modules In-Sequence
🧠
Fine-Tuned LLM + RAG + Gap Detection
Three coordinated inference stages per request
Note Generation: Llama 3 70B fine-tuned · SOAP / DAP / H&P
Coding Intelligence: RAG over ICD-10 + CPT · Hybrid search + RRF
Gap Detection: 2nd LLM pass validation · < 500ms added latency
↓ pgvector semantic + keyword search · RRF re-ranking
Knowledge Base
Vector Store (pgvector)
70K+ ICD-10 codes · CPT descriptors · MIPS specs
Hybrid search
Fine-Tune Dataset
12K notes · 8K synthetic pairs · 3.2K CoT traces
QLoRA 4-bit
↓ vLLM PagedAttention · continuous batching
Model Serving
⚙️
vLLM Inference — Llama 3 70B
PagedAttention · Continuous batching · Concurrent sessions
2× A100 80GB
↓ all components inside client infrastructure · zero cloud API calls
Compliance Infrastructure
Audit Logging
Input/output hashes · traceability without raw PHI
No PHI in logs
Encryption + Isolation
AES-256 at rest · TLS 1.3 · Client VPC · Docker
Zero findings
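The Coding Intelligence module above combines semantic and keyword retrieval with RRF (Reciprocal Rank Fusion). As a rough illustration of how that fusion step works, here is a minimal sketch in plain Python. The function name, the sample result lists, and the constant `k=60` (a commonly used default for RRF) are assumptions for illustration and are not taken from the delivered system.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of codes into one ranking using RRF.

    Each code scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, code in enumerate(ranking, start=1):
            scores[code] += 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda code: (-scores[code], code))


# Hypothetical results for one query from the two retrievers.
semantic_hits = ["I20.9", "I25.10", "R07.9"]   # e.g. vector similarity order
keyword_hits = ["I25.10", "Z82.49", "I20.9"]   # e.g. keyword match order

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

Because RRF only uses rank positions, it does not need the two retrievers' raw scores to be on a comparable scale, which is one plausible reason to pair it with a hybrid search.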

Measured Outcomes
Before vs After.
Verified by Physicians & Compliance.
Metric | ⚠ Before | ✓ After
Note generation time | 8–15 minutes (manual per visit) | 12–18 sec (50× faster)
Documentation backlog | 90+ notes / week (carried over) | Near zero (eliminated)
ICD-10 accuracy | Manual lookup (error-prone) | 94.7% (physician-validated)
Blind note review | N/A (all human-authored) | Passed (2 MDs couldn't distinguish AI vs human)
PHI exposure | Undefined risk (no AI tooling) | Zero (100% on-premise)
HIPAA audit | Not applicable (no AI to audit) | Passed (zero findings)
🔒
Zero PHI Egress
No patient data leaves client infrastructure
📋
HIPAA Passed
Compliance review — zero audit findings
🩺
Physician-Approved
2 board-certified MDs blind-reviewed
🌱
$2M Seed Raised
Investors funded on the working MVP
Investors Funded the Working MVP
The system became a core asset in the startup's fundraise. Investors cited the on-premise architecture and physician-validated accuracy as key differentiators.

Technical Proof
What a Generated Note
Actually Looks Like.
SOAP NOTE · CARDIOLOGY · GENERATED IN 14s
Subjective
62 y/o male presents with exertional chest tightness × 3 weeks. Onset with moderate activity, relieved at rest. No radiation to jaw or arm. Denies diaphoresis, syncope. PMHx: HTN (controlled), hyperlipidemia. Meds: lisinopril 10mg, atorvastatin 40mg.
Objective
BP 138/84, HR 76 reg, RR 16, SpO2 98% RA. Regular rate, no murmurs/rubs/gallops. Lungs CTA bilaterally. No JVD. Peripheral pulses 2+ symmetrical. ECG: NSR, no ST changes.
Assessment
Stable angina, likely effort-induced. DDx includes GERD and musculoskeletal. Cardiac workup warranted given risk factors.
Plan
Stress test ordered. Cardiology referral placed. NTG PRN prescribed. Follow-up 2 weeks. Activity modification counseled.
ICD-10 Suggestions
I20.9 (0.93) · I25.10 (0.88) · R07.9 (0.71) · Z82.49 (0.67)
⚠ Gap detected: Smoking history not documented. Required for cardiovascular risk stratification.
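To make the gap flag above concrete, here is a deliberately simplified sketch of a documentation gap check. The case study describes gap detection as a second LLM pass; the version below swaps in a plain rule-based checklist purely for illustration. The checklist entries, regex patterns, and function name are all hypothetical.

```python
import re

# Hypothetical checklist: element name -> pattern that counts as "documented".
# The real system is described as a second LLM pass, not regex rules.
REQUIRED_ELEMENTS = {
    "Smoking history": re.compile(r"\b(smok\w*|tobacco)\b", re.IGNORECASE),
    "Blood pressure": re.compile(r"\bBP\s*\d{2,3}/\d{2,3}\b"),
    "Current medications": re.compile(r"\bMeds?:", re.IGNORECASE),
}


def find_gaps(note_text):
    """Return the checklist elements that the note does not mention."""
    return [
        name
        for name, pattern in REQUIRED_ELEMENTS.items()
        if not pattern.search(note_text)
    ]


note = (
    "62 y/o male presents with exertional chest tightness. "
    "PMHx: HTN (controlled), hyperlipidemia. "
    "Meds: lisinopril 10mg, atorvastatin 40mg. BP 138/84, HR 76 reg."
)
print(find_gaps(note))
```

A rule list like this is easy to audit but brittle, which may be why a model-based validation pass was chosen for the production system.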
📊 Fine-Tune Performance
94.7% · ICD-10 top-3 accuracy
23.2K · Training examples
4-bit · QLoRA quantization
Passed · Blind physician review
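For readers unfamiliar with the metric, top-3 accuracy counts a case as correct when the reference code appears anywhere in the first three suggestions. The sketch below shows one way to compute it. The evaluation data is made up for illustration and is unrelated to the 94.7% figure; the function name is an assumption.

```python
def top_k_accuracy(predictions, gold_codes, k=3):
    """Fraction of cases whose reference code is among the top-k suggestions."""
    if len(predictions) != len(gold_codes):
        raise ValueError("predictions and gold_codes must have the same length")
    if not gold_codes:
        raise ValueError("need at least one case to score")
    hits = sum(gold in ranked[:k] for ranked, gold in zip(predictions, gold_codes))
    return hits / len(gold_codes)


# Made-up evaluation set: ranked suggestions per note, plus the reference code.
predictions = [
    ["I20.9", "I25.10", "R07.9", "Z82.49"],
    ["E11.9", "E11.65", "R73.09"],
    ["J06.9", "J02.9", "J20.9", "R05.9"],
    ["M54.50", "M54.2", "M54.9", "M51.26"],
]
gold_codes = ["I25.10", "E11.9", "R05.9", "M54.9"]

accuracy = top_k_accuracy(predictions, gold_codes, k=3)
print(f"top-3 accuracy: {accuracy:.1%}")
```

In this toy set the third case misses because the reference code is ranked fourth, so three of four cases count as hits.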
Infrastructure Stack
Model: Llama 3 70B + QLoRA
Inference: vLLM + PagedAttention
Hardware: 2× NVIDIA A100 80GB
Fine-tuning: Unsloth · rank-16
Vector store: pgvector (PostgreSQL)
Search: Hybrid + RRF
API: FastAPI + OAuth2 + RBAC
Encryption: AES-256 · TLS 1.3
🔒 HIPAA Compliance Proof
Zero PHI sent to any external API
Input/output hashes logged — traceability without raw PHI
AES-256 at rest, TLS 1.3 in transit
RBAC with per-physician session isolation
Passed HIPAA compliance review — zero findings
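The hash-based audit logging listed above can be pictured with a short sketch. It assumes a keyed digest (HMAC-SHA256) rather than a bare hash, on the reasoning that short clinical strings could otherwise be guessed by brute force; the source only says "hashes", so the keyed variant, the field names, and the placeholder key are all assumptions.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def fingerprint(text: str, key: bytes) -> str:
    """Keyed SHA-256 digest (HMAC) of the text, hex-encoded."""
    return hmac.new(key, text.encode("utf-8"), hashlib.sha256).hexdigest()


def audit_record(user_id: str, note_input: str, note_output: str, key: bytes) -> str:
    """Build one JSON log line that holds digests, never the raw text."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "input_hmac": fingerprint(note_input, key),
        "output_hmac": fingerprint(note_output, key),
    }
    return json.dumps(record, sort_keys=True)


# Placeholder key for illustration; a real key would come from a secret store.
key = b"example-key-load-from-a-secret-store"
line = audit_record(
    "dr-0042",  # hypothetical user identifier
    "62 y/o male presents with exertional chest tightness",
    "Stable angina, likely effort-induced.",
    key,
)
print(line)
```

With a record like this, an auditor holding the original text and the key can recompute the digest and confirm which request produced which note, while the log itself contains no readable PHI.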