The Vibe
The gap between AI's benchmark scores and clinical reality keeps widening. New research shows models struggling with a fundamental skill: knowing when they have enough information to make a diagnosis, a critical safety judgment that separates competent clinicians from dangerous ones [1]. Meanwhile, the commercial AI health ecosystem races ahead with specialized models for everything from women's health to clinical development, betting big on narrow applications where the stakes feel lower [2][3].
Research
•ClinDet-Bench reveals a critical flaw: current LLMs can't reliably determine when clinical information is sufficient for diagnosis, often concluding prematurely or abstaining unnecessarily [1]. These are the kind of judgment failures that harm patients, and they matter more than any USMLE score.
•MM-NeuroOnco benchmark shows brain tumor diagnosis models failing at the interpretable reasoning clinicians actually need, despite decent detection rates [4]. Pattern recognition without clinical logic isn't diagnosis.
•MediX-R1 introduces reinforcement learning for medical AI that goes beyond multiple-choice answers to generate free-form clinical responses [5]. The approach could address the artificial constraints of current medical AI evaluation, if it holds up under real uncertainty.
•Data-efficient chest X-ray foundation model challenges the "scale-at-all-costs" approach, achieving competitive performance with strategic data curation instead of massive datasets [6]. Smart selection beats brute force — finally.
One to Watch
NEJM's latest video discussion takes up whether AI could replace family doctors [7]. When the medical establishment's flagship journal starts asking whether AI could substitute for "trustworthy, caring, broad-spectrum" physicians, the conversation has shifted.
Clinical Practice & Ops
•OpenEvidence launches AI dialer integration alongside clinical scribing, going head-to-head with Doximity in the physician workflow space [8]. The feature creep from AI scribe to full practice management platform is accelerating.
•Elevance Health deploys AI for streamlining approvals and claims processing but keeps denial decisions under human review [9]. Smart boundary-setting — automation where it helps workflow, humans where it affects coverage.
Industry & Products
•Oura's first proprietary LLM delivers personalized women's health guidance using biometric data from wearables [2]. Narrow health AI applications built on continuous monitoring data could be where consumer health AI actually works.
•Evinova lands clinical development AI partnerships with Astellas and AstraZeneca, which join Bristol Myers Squibb on its partner list [3]. Pharma is betting on AI to accelerate trials, so the ROI calculations must be compelling.
The Conversation
•NEJM Clinician covers etripamil, the newly FDA-approved nasal spray for SVT treatment [10]. It is not an AI story, but it is an adjacent kind of innovation: a self-administered cardiac treatment that gives patients direct control over acute episodes rather than routing everything through healthcare systems.