100% open source. Your data never leaves your infrastructure.
CIRISMedical runs entirely on your hardware. Patient data never leaves your facility, never touches external APIs, and never trains third-party models. This isn't a promise—it's architecture.
Medical LLMs must meet strict requirements for clinical deployment under the CIRIS framework.
Llama 3.1 medical fine-tune (EPFL/Yale)
OpenMeditron/Meditron3-70BLlama 3 biomedical fine-tune (Saama)
32K context (95% of requests)
OpenAI HIPAA-compliant API
~75x more expensive
| Model | Context | 32K Req | Structured | Multi-Provider | Issue |
|---|---|---|---|---|---|
| Me-LLaMA 70B | 4K | X | ~ | * | Context too small (avg request 19K) |
| Meditron 1/2 (Llama 2) | 4K | X | ~ | * | Superseded by Meditron3 (Llama 3.1) |
| Google MedLM | ? | ? | X | X | Vertex-only, no structured output |
| Hippocratic Polaris | - | - | X | X | Proprietary, voice-only focus |
| BioGPT | ~1K | X | X | X | Text mining focus, tiny context |
* Self-hostable provides multi-provider via different cloud deployments
Maximum control, lowest cost per token, HIPAA compliance under your infrastructure.
vllm serve OpenMeditron/Meditron3-70B \ --tensor-parallel-size 2 \ --max-model-len 131072 \ --gpu-memory-utilization 0.95 \ --enable-chunked-prefill \ --quantization fp8 \ --guided-decoding-backend outlines
OpenBioLLM-70B hosted
GPT-5.2 with BAA
Based on typical CIRISMedical usage: ~19K input tokens, ~330 output tokens per request
| Deployment | Model | Per 1M Tokens | Monthly (100K req) |
|---|---|---|---|
| Self-hosted (2x H100) | Meditron3-70B | ~$0.20 | ~$400 + infra |
| Self-hosted (2x H100) | OpenBioLLM-70B | ~$0.20 | ~$400 + infra |
| Azure AI Foundry | OpenBioLLM-70B | ~$1-2 | ~$2,000 |
| OpenAI Healthcare API | GPT-5.2 | ~$15.75 | ~$30,000 |
Self-hosted infrastructure cost (~$4,500/mo for 2x H100) included in monthly estimate. Meditron3 recommended for 128K context; OpenBioLLM for 32K fallback.
Full CIRISMedical deployment requires infrastructure, integration, and ongoing support investment.
| Deployment Tier | Hardware | Integration | Sensors | Total (Year 1) |
|---|---|---|---|---|
| Minimum (Non-redundant) | $65K | $100K | $10K | ~$175K |
| Recommended (Redundant) | $130K | $150K | $10K | ~$290K |
| Enterprise (Full HA) | $190K | $190K | $10K | ~$390K |
| Cloud-Based Alternative | $0 (opex) | $100K | $10K | ~$110K + $50-70K/yr |
Annual support after year 1: $25-50K depending on tier. Cloud option includes ~$50-70K/year GPU compute costs.
For CIRISMedical deployments, we recommend self-hosted Meditron3-70B as the primary model with OpenBioLLM-70B on Azure as a fallback provider. This provides full 128K context compliance, state-of-the-art medical performance, and multi-provider redundancy at ~$0.20/M tokens.