Medical RAG — JTMDAI Development Group

What it is

A retrieval engine over public medical literature and policy

Medical RAG is an early-stage, MVP-status RAG system. It uses hybrid retrieval — dense semantic search over text embeddings alongside keyword (lexical) matching — to surface relevant passages from public medical sources, cites every passage explicitly, and hands the synthesis back to a licensed clinician to decide what to do.
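One common way to combine a dense ranking with a lexical ranking is reciprocal rank fusion (RRF). The sketch below is illustrative only: this page does not say which fusion method Medical RAG uses, and the function name, the document IDs, and the constant k=60 are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical document IDs, for illustration only.
dense_hits = ["pubmed-A", "ncd-B", "pubmed-C"]    # from embedding search
lexical_hits = ["ncd-B", "pubmed-D", "pubmed-A"]  # from keyword search
fused = reciprocal_rank_fusion([dense_hits, lexical_hits])
```

A document that ranks well in both lists rises above one that appears in only a single list, which is the behavior hybrid retrieval is after.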

It is not a clinical decision system. It is a literature and policy lookup tool with a physician-informed retrieval layer — built by someone who has stood at the bedside and knows exactly where the line between “organizing information” and “giving medical advice” sits.

The physician firewall

It organizes, flags, and cites. A clinician decides.

Every output carries an explicit citation to the source document. The system retrieves; it does not diagnose. It flags relevant literature and policy; it does not prescribe. It surfaces Medicare coverage determinations; it does not determine coverage for a specific patient. The licensed clinician — who sees the patient, holds the license, and bears the responsibility — is the decision-maker in every case.

“Based on public sources — not medical advice; a clinician decides.”
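The "every output carries a citation" rule can be enforced in the data model rather than left to convention. The following is a minimal sketch under assumed names: `CitedPassage` and its fields are hypothetical and not taken from the Medical RAG codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CitedPassage:
    """A retrieved passage that cannot exist without a source citation."""

    text: str
    source: str     # hypothetical label, e.g. "PubMed" or "CMS NCD"
    source_id: str  # identifier of the source document

    def __post_init__(self):
        # Reject any passage whose citation fields are missing or blank.
        if not self.source.strip() or not self.source_id.strip():
            raise ValueError("every retrieved passage must carry a citation")

    def render(self) -> str:
        # The citation is always printed alongside the passage text.
        return f"{self.text} [{self.source}: {self.source_id}]"
```

Because the check runs at construction time, an uncited passage fails loudly before it can reach a clinician-facing view.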

Data sources

Public sources only

The corpus is built exclusively from public-domain and openly licensed sources. No PHI, no EHR data, no FHIR endpoints, no patient records — not now, not planned.

PubMed / MEDLINE

Peer-reviewed biomedical literature from the National Library of Medicine. The backbone of evidence-based retrieval: abstracts and indexed citations, publicly available via the NCBI E-utilities API.
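For readers unfamiliar with E-utilities (operated by NCBI, part of the NLM), a lookup is typically two HTTP requests: `esearch` returns matching PubMed IDs, and `efetch` returns the records for those IDs. The sketch below only builds the request URLs and makes no network call. The helper names are hypothetical, and this is not a description of how Medical RAG's ingestion is implemented.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term: str, retmax: int = 20) -> str:
    """URL for an esearch query that returns matching PubMed IDs as JSON."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def build_efetch_url(pmids: list[str]) -> str:
    """URL for an efetch query that returns abstracts for the given IDs as XML."""
    params = {
        "db": "pubmed",
        "id": ",".join(pmids),
        "rettype": "abstract",
        "retmode": "xml",
    }
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"
```

A real ingestion job would also need rate limiting and, for heavier use, an NCBI API key; both are left out of this sketch.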

CMS Coverage Policy

National Coverage Determinations (NCDs) from the Centers for Medicare & Medicaid Services: publicly available policy documents that set nationwide Medicare coverage for specific items and services.

Published case studies & notes

Jeremy’s own published case notes and write-ups — de-identified by construction. Clinical observations encoded as citable documents, contributed to the corpus by the author.

Local Coverage Determinations (LCDs), the NPPES/NPI provider registry, and ClinicalTrials.gov are planned extensions — not live in the current build.

What it is not

Hard limits by design

No PHI, no EHR, no FHIR

The system does not ingest, process, or store protected health information of any kind. There is no EHR connector, no FHIR endpoint, and no patient record in the pipeline. This is not an accident — it is a design constraint.

No diagnosis, no treatment

The system makes no clinical determination. It retrieves and cites. It does not diagnose conditions, recommend treatments, or substitute for a clinical evaluation. The product is on the “flag-don’t-diagnose” side of the line, built by a clinician who knows where that line is.

MVP status — honest scope

This is an early-stage, internal build. The retrieval pipeline works over the indexed public corpus; it is not a production-grade search engine. Ranking quality is measured the standard way — recall@k and nDCG over a held-out set — so gaps in coverage, latency, and ranking are visible and iterated on, not guessed at.
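For reference, the two ranking metrics named above can be computed as follows. This is a minimal sketch with assumed conventions: linear gain for nDCG, a log2 position discount, and ranked lists without duplicate IDs. The function names and the shape of the relevance labels are illustrative, not taken from the project's evaluation code.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def dcg_at_k(gains, k):
    """Discounted cumulative gain: gain at position i is divided by log2(i + 1)."""
    # enumerate is 0-based, so position i + 1 gives a discount of log2(i + 2).
    return sum(gain / math.log2(i + 2) for i, gain in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids, relevance, k):
    """DCG of the actual ranking divided by the DCG of the ideal ranking."""
    gains = [relevance.get(doc_id, 0) for doc_id in ranked_ids]
    ideal_gains = sorted(relevance.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal_gains, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical query: three relevant documents, two of them retrieved in the top 3.
ranked = ["a", "b", "c", "d"]
relevance = {"a": 1, "c": 1, "e": 1}
recall = recall_at_k(ranked, set(relevance), k=3)
ndcg = ndcg_at_k(ranked, relevance, k=3)
```

Recall@k shows whether the relevant documents were found at all, while nDCG also rewards placing them near the top, so the two are usually read together.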

Why a physician builds this

The translation problem is the hard part

Most medical RAG systems fail on the same problem: the retrieval is technically competent but clinically naive. A system that surfaces a Phase 1 trial result in response to a question about standard-of-care treatment has no idea what it got wrong. A physician who builds retrieval systems does. The retrieval layer in Medical RAG is designed by someone who can read a forest plot, interpret an NCD, and explain why “referral sent” is not the same as “patient scheduled.”

The goal is not to replace clinical judgment — it is to make the literature and policy lookup that should happen before clinical judgment faster, more traceable, and always cited. See the physician + AI consulting page for the fuller advisory engagement that sits around this work.

Interested in the Medical RAG build?

The system is early-stage and not yet available for external use. If you are building in the clinical-AI or health-tech space and want to talk about the retrieval architecture, the physician-firewall design, or a potential pilot, reach out.