HomeSafetyLicensingTechnologyModelsPartnershipCIRIS.ai

Medical LLM Options

100% open source. Your data never leaves your infrastructure.

Open Source ModelsAir-Gapped CapableZero Data ExfiltrationHIPAA/GDPR Ready

Your Data. Your Infrastructure. Full Stop.

CIRISMedical runs entirely on your hardware. Patient data never leaves your facility, never touches external APIs, and never trains third-party models. This isn't a promise—it's architecture.

  • 100% open source model weights (Apache 2.0 / Llama license)
  • No external API calls—runs fully offline
  • Audit every inference—full transparency
  • No vendor lock-in—you own everything
🏥
Hospital Networks
Multi-facility deployment
🏛️
Government Health
State & local agencies
🌍
NGOs & Humanitarian
Remote & underserved areas
🔒
Private Practice
Clinics & specialists

CIRIS Requirements

Medical LLMs must meet strict requirements for clinical deployment under the CIRIS framework.

32K+
Context Window
JSON
Structured Output
12-70
Tool Calls/Request
2+
Provider Fallback
HIPAA
Compliance Ready

Viable Medical Models

Meditron3-70B

Recommended

Llama 3.1 medical fine-tune (EPFL/Yale)

Context
128K tokens
Parameters
70B
Structured Output
vLLM
Est. Cost
~$0.20/M

Performance

  • Outperforms GPT-4, MedPaLM-2, Meditron 1/2
  • 128K context (full CIRIS compliance)
  • Humanitarian & low-resource focus
  • Released March 2025 (latest)

Model ID

OpenMeditron/Meditron3-70B

OpenBioLLM-70B

Open Source

Llama 3 biomedical fine-tune (Saama)

Context
32K tokens
Parameters
70B
Structured Output
vLLM
Est. Cost
~$0.20/M

Performance

  • 86% avg across 9 medical datasets
  • Strong biomedical NER/QA
  • Available on Azure AI Foundry

Limitation

32K context (95% of requests)

GPT-5.2 Healthcare

Proprietary

OpenAI HIPAA-compliant API

Context
128K+ tokens
BAA
Available
Structured Output
Native
Est. Cost
~$15/M

Performance

  • Best on HealthBench
  • Native JSON Schema
  • Zero-retention PHI endpoints

Limitation

~75x more expensive

Why Other Medical LLMs Don't Qualify

ModelContext32K ReqStructuredMulti-ProviderIssue
Me-LLaMA 70B4KX~*Context too small (avg request 19K)
Meditron 1/2 (Llama 2)4KX~*Superseded by Meditron3 (Llama 3.1)
Google MedLM??XXVertex-only, no structured output
Hippocratic Polaris--XXProprietary, voice-only focus
BioGPT~1KXXXText mining focus, tiny context

* Self-hostable provides multi-provider via different cloud deployments

Deployment Options

Option 1: Self-Hosted (OpenBioLLM-70B)

Maximum control, lowest cost per token, HIPAA compliance under your infrastructure.

Hardware Requirements

Recommended: 2x H100 80GB SXM
  • Output: ~800-1000 tok/s
  • Prefill: ~25-30K tok/s
  • VRAM: 160GB total
  • Est. cost: ~$6-8/hr cloud

Budget Alternative

4x A100 80GB or 4x A6000 48GB
  • Output: ~450-1100 tok/s
  • Est. cost: ~$4-8/hr cloud

vLLM Configuration (Meditron3)

vllm serve OpenMeditron/Meditron3-70B \
  --tensor-parallel-size 2 \
  --max-model-len 131072 \
  --gpu-memory-utilization 0.95 \
  --enable-chunked-prefill \
  --quantization fp8 \
  --guided-decoding-backend outlines

Cloud GPU Providers

Lambda Labs ~$5.50/hr
CoreWeave ~$5.00/hr
RunPod ~$6.00/hr
AWS p5 ~$8.00/hr
Az

Azure AI Foundry

OpenBioLLM-70B hosted

  • *OpenBioLLM-70B in model catalog
  • *HIPAA BAA available
  • *Managed infrastructure
  • ~Pay-per-token pricing
View in Azure AI Catalog
OAI

OpenAI Healthcare API

GPT-5.2 with BAA

  • *128K+ context window
  • *Native structured outputs
  • *BAA for HIPAA compliance
  • !~$15.75/M tokens (75x open source)
OpenAI for Healthcare

Cost Comparison

Based on typical CIRISMedical usage: ~19K input tokens, ~330 output tokens per request

DeploymentModelPer 1M TokensMonthly (100K req)
Self-hosted (2x H100)Meditron3-70B~$0.20~$400 + infra
Self-hosted (2x H100)OpenBioLLM-70B~$0.20~$400 + infra
Azure AI FoundryOpenBioLLM-70B~$1-2~$2,000
OpenAI Healthcare APIGPT-5.2~$15.75~$30,000

Self-hosted infrastructure cost (~$4,500/mo for 2x H100) included in monthly estimate. Meditron3 recommended for 128K context; OpenBioLLM for 32K fallback.

Deployment Investment

Full CIRISMedical deployment requires infrastructure, integration, and ongoing support investment.

$130-190K
Hardware (Redundant)
  • 2x server nodes (failover)
  • 4x H100 80GB SXM total
  • Networking & storage
  • UPS & cooling
Non-redundant option
$65-95K
$100-190K
Setup & Integration
  • EHR integration (OpenEMR, Epic)
  • HIPAA/compliance audit
  • Medical workflow customization
  • Staff training program
  • First year support included
$10K
Sensors & Interfaces
  • Vitals monitoring integration
  • Medical imaging adapters
  • Voice interface hardware
  • Multimodal input devices
Deployment TierHardwareIntegrationSensorsTotal (Year 1)
Minimum (Non-redundant)$65K$100K$10K~$175K
Recommended (Redundant)$130K$150K$10K~$290K
Enterprise (Full HA)$190K$190K$10K~$390K
Cloud-Based Alternative$0 (opex)$100K$10K~$110K + $50-70K/yr

Annual support after year 1: $25-50K depending on tier. Cloud option includes ~$50-70K/year GPU compute costs.

Our Recommendation

For CIRISMedical deployments, we recommend self-hosted Meditron3-70B as the primary model with OpenBioLLM-70B on Azure as a fallback provider. This provides full 128K context compliance, state-of-the-art medical performance, and multi-provider redundancy at ~$0.20/M tokens.