Clinical Trial Protocol

REGAIN-ADVOCATE-TA3-RCT

Scalable Agentic AI for Heart Failure & Post-MI Management: A Pragmatic Non-Inferiority Randomized Controlled Trial

(The ADVOCATE Scalability Study)

Protocol Author
Alexey Revtovich
Scientific Lead, Regain Inc. | formerly Rice University
h-index: 12 | Citations: 720 | Publications: 19
A Note to Our TA3 Clinical Partners

We have prepared this protocol draft as a technical feasibility template to support your proposal process.

Please view this document as a collaborative starting point: while the technical integration specifications reflect the fixed capabilities of our system, we defer entirely to your clinical expertise on the final study design, population, and endpoints. We are ready to support your scientific leadership with regulatory-grade technology that works.

Protocol Sections
1. Project Summary
2. Specific Aims
3. Research Strategy
4. Study Overview
5. Study Design
6. Study Population
7. Interventions & Workflow
8. Outcomes & Endpoints
9. Schedule of Assessments
10. Statistical Analysis Plan
11. Safety Monitoring
12. Technology & Integration
13. Ethics & Privacy
14. Timeline & Milestones
15. Budget Justification
16. Data Management
17. TA3 Management
18. References
Appendices
A. Traceability Matrix
B. Integration Checklist
C. Workflow Diagrams

1. Project Summary

REGAIN-ADVOCATE-TA3-RCT is a multi-site pragmatic randomized controlled trial designed to evaluate a dual-agent "Clinical AI" system for cardiovascular disease management in the United States, explicitly structured as an ADVOCATE Scalability Study to generate evidence for:

  • Clinical non-inferiority
  • Operational efficiency
  • Technical robustness across EHR vendors/workflows
  • Payer-facing economic endpoints

Investigational System

TA1 (Clinical Agent / SaMD): proposes guideline-concordant medication optimization and monitoring plans for heart failure (HFrEF/HFmrEF/HFpEF) and post-myocardial infarction (post-MI) patients, generating actionable orders and follow-up plans.

TA2 (Supervisory Agent / Safety Control): independently monitors TA1 outputs in real time, blocks or escalates unsafe recommendations, and enforces a fail-safe state when safety or system performance degrades.

Primary Objective

The primary objective is to demonstrate non-inferiority of the investigational system to usual care on GDMT adherence (operationalized as the GCTS - Guideline-Concordant Therapy Score) at Month 12, with key supportive endpoints including:

  • Re-hospitalization or all-cause mortality
  • CV death / HF hospitalization
  • Patient-reported quality of life (KCCQ)
  • Operational efficiency (clinician time per patient)
  • Technical integration reliability (read/write success and uptime)
  • Economic outcomes (total cost of care per patient per month)
  • Adjudicated safety outcomes (agent decision-related SAEs and unsafe recommendation rate)
Shadow Mode Clarification (Phase 1B)
Shadow Mode: run the system prospectively in a non-interventional manner to generate adjudicated safety labels and operational readiness evidence (no hidden intervention; research dashboard only).

2. Specific Aims

Aim 1: Clinical Effectiveness (Non-Inferiority)

Demonstrate non-inferiority of the investigational system versus usual care in GDMT adherence (operationalized as GCTS - Guideline-Concordant Therapy Score) at Month 12.

Regulatory intent: Generate evidence aligned with FDA SaMD clinical evaluation expectations for effectiveness in the intended use population.

Aim 2: Safety Control Performance (TA2 Focus)

Validate TA2's ability to detect, block, and escalate unsafe TA1 outputs and system hazards in a clinically meaningful taxonomy of error classes.

Regulatory intent: Generate evidence consistent with MDDT-style analytical validation as a parallel evidence stream, while treating TA2 as an internal safety control within the investigational system during Phase 2.

Aim 3: Scalability & Operational Efficiency

Quantify the change in clinician workload and workflow efficiency (e.g., minutes/patient/month; order review burden; escalation rates) while maintaining non-inferior clinical outcomes.

Key Metrics

  • Clinician time per patient-month
  • Order review burden
  • Escalation rates
  • Specialist Extension Factor (SEF)
Aim 4: Safety

Quantify adjudicated device-related serious harms and unsafe recommendation rate, and evaluate the impact of the safety control layer and clinician sign-off on risk mitigation.

Safety Metrics

  • Device-related serious adverse events (adjudicated)
  • Unsafe recommendation rate
  • TA2 critical miss rate
  • Agent decision-related SAE rate (<3% target)

3. Research Strategy

3a. Significance

Heart failure and post-MI care require longitudinal, guideline-concordant optimization of medications and monitoring, yet specialty capacity is constrained and outcomes remain heterogeneous across settings. A scalable, auditable, safety-controlled agentic AI system could extend specialist-quality management across diverse US health systems, including resource-limited and rural settings, while maintaining patient safety and regulatory-grade traceability.

3b. Innovation

  • Dual-agent architecture with independent safety control (TA2) monitoring TA1 in real time
  • Auditability ("glass box"): complete trace logs of inputs, model versions, outputs, TA2 decisions, and clinician actions
  • Pragmatic EHR-integrated workflow with a "pending order → clinician sign" mechanism supporting deployment realism while preserving clinician accountability

3c. Approach (High-Level Design)

  • Phase 1A: retrospective data access + pre-production sandbox integration for read/write validation (orders, In-Basket drafts, note drafts) and IV&V Study 1 support.
  • Phase 1B: IRB approval with FWA; beta patients for UI/UX; prospective non-interventional Shadow Mode evidence; IDE activities; IV&V Study 2 support; go/no-go readiness.
  • Phase 2 (Live Pragmatic RCT): randomized comparison of investigational system-enabled care versus usual care with blinded endpoint and safety adjudication.
  • Safety governance: DSMB + Medical Monitor, pre-specified stopping rules, and a fail-safe state when TA2 or data quality degrades.

4. Study Overview

4.1 Investigational System Definition

The investigational device is the combined TA1+TA2 system integrated into the clinical workflow and EHR. TA2 is treated as an internal safety control. TA2's potential MDDT qualification evidence is developed in parallel, but Phase 2 evaluates the combined system's safety and effectiveness in situ.

4.2 ADVOCATE Schedule (39 Months)

Phase 1A: Discovery & Foundation
Months 0-12
Retrospective de-identified longitudinal EHR data access + pre-production sandbox integration to validate read/write workflows (pending orders, In-Basket drafts, encounter note drafts) and participate in IV&V Study 1.
Phase 1B: Preparation & Regulatory
Months 12-24
IRB approval with FWA; beta patients for UI/UX testing; prospective Shadow Mode readiness evidence; IDE activities; IV&V Study 2 support; go/no-go thresholds finalized.
Phase 2: Scalability Study
Months 24-39
Live pragmatic RCT where eligible patients are randomized to investigational system–enabled care versus usual care. Clinical workflow uses "pending order → clinician sign" mechanism.

4.3 Change Control / Model Freeze

To preserve interpretability of the RCT and maintain a stable investigational device definition:

Model Freeze Before Phase 2
TA1 and TA2 are frozen prior to first Phase 2 enrollment. Freeze scope includes: model weights, TA2 safety rules/detectors, prompting/policy logic, knowledge sources, and drug knowledge/database versions.

PCCP-Style Controlled Updates (IDE-Governed)

Update Type | Description | Requirements
Non-clinical updates | UI, logging, performance, and reliability improvements that do not change clinical behavior | Versioning and validation
Safety-driven rule updates | Deterministic TA2 safety rules/data-guardrails | CAPA with DSMB notification and IDE amendment/notification
Clinical behavior update ("Version 2") | Only under pre-specified bridge process | (1) shadow-to-live bridge evaluation, (2) adjudicated safety challenge set, (3) IDE/IRB approvals, (4) SAP version-strata handling
No Online Learning During Phase 2
Phase 2 interaction data (including clinician rationales) are not used to update the deployed model during the trial; they may be used for post-trial analysis and subsequent releases under IDE-controlled change processes.

Versioning and traceability: Every recommendation and TA2 decision is tied to a unique system version identifier in the audit log.
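To illustrate the traceability requirement, the following is a minimal sketch of an audit-log record that cannot be created without a version identifier. The field names, decision labels, and the version string format are assumptions made for illustration; they are not the system's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    recommendation_id: str
    system_version: str    # hypothetical identifier for the frozen TA1+TA2 build
    ta2_decision: str      # assumed labels: "approve", "block", "escalate"
    clinician_action: str  # assumed labels: "pending", "accepted", "modified", "rejected"
    timestamp_utc: str


def make_record(recommendation_id, system_version, ta2_decision, clinician_action):
    """Build an audit record, refusing any entry that lacks a version identifier."""
    if not system_version:
        raise ValueError("audit records must carry a system version identifier")
    timestamp = datetime.now(timezone.utc).isoformat()
    return AuditRecord(recommendation_id, system_version, ta2_decision,
                       clinician_action, timestamp)


# Illustrative values only.
record = make_record("rec-0001", "frozen-build-1.0.0", "approve", "pending")
print(json.dumps(asdict(record), indent=2))
```

Making the record immutable (`frozen=True`) reflects the intent that audit entries are appended, never edited in place.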

5. Study Design (Phase 2)

5.1 Trial Type

Multi-site pragmatic RCT, open-label at point of care, with blinded endpoint and safety adjudication.

Regulatory framing (ADVOCATE TA3 requirement): This trial is conducted as an IDE study supporting FDA SaMD authorization for the investigational TA1+TA2 system.

Technical robustness requirement (TA3): The site network will include at least two major EHR vendors (e.g., Epic and Cerner) and demonstrate stable operation across vendor-specific workflows; vendor- and site-level integration metrics are reported.

5.2 Randomization

Individual patient randomization (1:1) within each site with centralized allocation concealment.

Justification: Preserves patient-level causal inference while enabling pragmatic deployment; contamination risk is mitigated via access controls and audit logs.

Contingency: If operationally unavoidable or contamination is excessive, a cluster design at clinician/team level may be adopted with corresponding ICC-driven sample size adjustments and analysis.

Stratification Variables

  • Site
  • HF phenotype (HFrEF/HFmrEF/HFpEF) vs post-MI cohort
  • Baseline GCTS (guideline-concordant therapy score)
  • Age (>65 vs ≤65)
  • Rural/urban indicator (based on ZIP RUCA)
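
One common way to implement 1:1 allocation within strata is permuted-block randomization. The sketch below assumes a block size of 4 and hypothetical stratum labels; the protocol does not specify the block scheme, and a production system would run this centrally to preserve allocation concealment.

```python
import random
from collections import defaultdict


class StratifiedBlockRandomizer:
    """1:1 permuted-block allocation kept separately for each stratum."""

    def __init__(self, block_size=4, seed=None):
        if block_size % 2:
            raise ValueError("block size must be even for 1:1 allocation")
        self.block_size = block_size
        self.rng = random.Random(seed)
        self.queues = defaultdict(list)  # stratum -> remaining assignments in block

    def assign(self, stratum):
        queue = self.queues[stratum]
        if not queue:
            # Start a new balanced block and shuffle it.
            block = ["intervention", "usual_care"] * (self.block_size // 2)
            self.rng.shuffle(block)
            queue.extend(block)
        return queue.pop()


randomizer = StratifiedBlockRandomizer(block_size=4, seed=2024)
# Hypothetical stratum: site, cohort, baseline GCTS band, age band, rurality.
stratum = ("site_A", "HFrEF", "gcts_low", "age_le_65", "rural")
arms = [randomizer.assign(stratum) for _ in range(8)]
print(arms)
```

With five stratification variables the number of strata grows quickly, so sparse strata may end with incomplete blocks; minimization is an alternative the statistical team may prefer.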

5.3 Blinding & Adjudication

  • Open-label care is expected due to workflow integration.
  • Blinded adjudication is required for:
    • Primary endpoint scoring (where subjective elements exist)
    • Device-related serious harms attribution
    • Classification of "unsafe recommendations" and "critical misses"

5.4 Contamination Control & Spillover Analysis

To prevent "spillover" learning from intervention to control:

  • Restrict TA1/TA2 UI access to intervention participants (role-based access; EHR flags)
  • Segregate order queues and dashboards
  • Maintain audit logs of UI access and recommendation viewing
  • Train staff on separation and documentation requirements

Contamination Exposure Index (Pre-Specified)

Derive an exposure index per clinician/team and per patient from audit logs (e.g., # of AI-case views, # of intervention pending orders reviewed, time-in-AI UI). Use it for (1) monitoring separation fidelity, and (2) sensitivity analyses.
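
A minimal sketch of how the exposure components could be tallied per clinician from audit-log events. The event types and field names (`clinician_id`, `type`, `minutes`) are hypothetical, and how the components are combined into a single index is left open, as it is in the text above.

```python
from collections import defaultdict


def _blank_row():
    return {"ai_case_views": 0, "pending_orders_reviewed": 0, "ui_minutes": 0.0}


def exposure_components(events):
    """Tally per-clinician exposure components from audit-log events."""
    totals = defaultdict(_blank_row)
    for event in events:
        row = totals[event["clinician_id"]]
        if event["type"] == "ai_case_view":
            row["ai_case_views"] += 1
        elif event["type"] == "pending_order_review":
            row["pending_orders_reviewed"] += 1
        elif event["type"] == "ui_session":
            row["ui_minutes"] += event["minutes"]
    return dict(totals)


# Illustrative audit-log extract (hypothetical schema).
events = [
    {"clinician_id": "c1", "type": "ai_case_view"},
    {"clinician_id": "c1", "type": "ui_session", "minutes": 12.5},
    {"clinician_id": "c2", "type": "pending_order_review"},
    {"clinician_id": "c1", "type": "ai_case_view"},
]
exposure = exposure_components(events)
print(exposure)
```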

Operational Separation (Recommended Default)

  • Use a dedicated intervention review pool (NP/PA/MD adjudicator queue) where feasible
  • Keep intervention dashboards separate from control workflows by role and patient flags

5.5 IDE Sponsor Accountability, Reporting, and Change Control

This study is conducted under an IDE for SaMD. The IDE sponsor (Regain, Inc.) holds device accountability and is responsible for FDA communications, regulatory reporting, and software release control.

Key Principles

  • Software change control: the investigational system is frozen per Section 4.3. Any permitted safety-driven changes follow an IDE amendment process and documented CAPA.
  • Safety reporting: sites report events rapidly to the IDE sponsor; sponsor performs required FDA/IRB reporting and maintains the Device Master Record and audit trail.
  • Regulatory reporting and monitoring: the IDE sponsor fulfills applicable IDE obligations (including 21 CFR 812 reporting expectations).

RACI Matrix

Activity | IDE Sponsor (Regain) | Coordinating Center | Site PI/Team | DSMB/Medical Monitor
FDA IDE submission/maintenance | R/A | C | I | I
Software release control, CAPA | R/A | C | C | I
Device accountability and audit-log custody | R/A | R | C | I
Site IRB submissions/consent | C | I | R/A | I
AE/SAE identification and initial reporting | I | I | R/A | I
UADE / device-related serious harm adjudication | R | C | C | R/A
DSMB reviews and pause/resume recommendations | I | C | C | R/A
Data monitoring, quality checks, database lock | C | R/A | R | I

R = Responsible, A = Accountable, C = Consulted, I = Informed

6. Study Population

6.1 Inclusion Criteria (Pragmatic)

  • Age ≥18 years receiving longitudinal care at participating US health systems
  • Heart failure diagnosis (HFrEF, HFmrEF, or HFpEF) with NYHA class II–IV, OR myocardial infarction within the prior 12 months (post-MI)
  • Ability to provide informed consent
  • English or Spanish literacy
  • Access to smartphone/tablet OR caregiver-assisted support
  • Ability to use required wearable/RPM devices (digital scale, BP cuff, SpO2 pulse oximeter)
  • EHR data availability sufficient to support safe medication management (problem list, meds, allergies, labs, vitals)

HF Phenotype Definitions

  • HFrEF: typically LVEF ≤40%
  • HFmrEF: typically LVEF 41–49%
  • HFpEF: typically LVEF ≥50%

LVEF used when available; diagnosis may also be confirmed by clinician problem list / encounter documentation

6.2 Exclusion Criteria (Minimal, Safety-Focused)

Exclude only if there is no safe path to participate even with device provisioning/training and caregiver assistance, or if participation would be clinically inappropriate:

Exclusion | Rationale
Inability to provide informed consent with available supports | Ethical requirement
Severe cognitive impairment preventing interaction with the AI (despite available supports) | Safety
Inability to use required wearable/RPM devices due to physical limitations and no safe caregiver-assisted alternative | Data collection requirement
Life expectancy <12 months due to non-cardiovascular disease | Endpoint interpretability
Enrollment in another interventional study that materially conflicts with GDMT management | Confounding
Any site-defined condition where medication management cannot be safely supported due to missing essential data | Safety

6.3 Recruitment & Enrollment Sources (TA3 Requirement)

Participants may be enrolled through:

  • Outpatient clinics: cardiology, primary care, HF programs
  • Inpatient settings: prior to discharge after HF hospitalization or acute MI, with longitudinal follow-up arranged within the TA3 health system

6.4 Demographic Representation Targets

Category | Minimum Target
Older adults (65+) | ≥40%
Black/African American | ≥13%
Hispanic/Latino | ≥18%
Rural/Underserved | ≥25%

6.5 Equity & Representation Operational Plan

To make recruitment targets achievable under real constraints, TA3 execution uses a monitored, adaptive approach:

Site Selection Criteria

  • Include at least one safety-net/underserved urban site
  • Include at least one rural-serving site
  • Prioritize systems with demonstrated Black/African American and Hispanic/Latino HF/Post-MI volume

Adaptive Recruitment Triggers

If any target stratum falls >5 percentage points below plan for two consecutive months:

  • Open additional clinics/sites
  • Add community outreach
  • Increase device provisioning and engagement staffing
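
The trigger rule above can be sketched as a small check over monthly enrollment shares. This is an illustration under the assumption that "plan" is a single planned percentage per stratum; the function and parameter names are hypothetical.

```python
def recruitment_trigger(planned_pct, observed_pct_by_month, gap_pp=5.0, months=2):
    """Return True if the observed share is more than `gap_pp` percentage points
    below plan for `months` consecutive months."""
    run = 0
    for observed in observed_pct_by_month:
        if planned_pct - observed > gap_pp:
            run += 1
        else:
            run = 0  # the shortfall must be consecutive
        if run >= months:
            return True
    return False


# Rural/underserved stratum with a 25% plan (illustrative monthly shares).
print(recruitment_trigger(25.0, [22.0, 19.0, 18.0]))
print(recruitment_trigger(25.0, [19.0, 21.0, 19.0]))
```

In the first series the shortfall exceeds 5 points in months 2 and 3, so the trigger fires; in the second the shortfall is interrupted, so it does not.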

Resourcing Tied to Equity KPIs

  • Budget and staffing explicitly support:
    • Translation (EN/ES)
    • Device setup/training resources
    • Caregiver onboarding
    • Navigation support to prevent technology access from becoming an exclusion

7. Interventions & Clinical Workflow

7.1 Control Arm: Usual Care

Standard clinician-led management consistent with local workflows and current AHA/ACC guidelines. Data collection is passive via EHR + PROs.

To support interpretability:

  • Document site-level baseline practice patterns
  • Pre-specify minimum documentation for GDMT status (med list, doses, contraindications/intolerance)
  • Define a minimum measurement-only chart abstraction standard for GDMT eligibility/contraindications at each assessment window (baseline, Month 3, Month 12) to reduce differential misclassification and documentation bias across arms

7.2 Intervention Arm: Investigational System

Core workflow:

  1. TA1 ingests EHR data and produces a guideline-concordant optimization plan
  2. TA2 independently evaluates TA1 outputs against safety constraints in real time
  3. Approved actions are converted into pending orders or structured recommendations
  4. A licensed clinician reviews and signs (or rejects/modifies) the pending orders
No Autonomous Execution
The investigational system does not execute medication orders without clinician sign-off. Clinical decision responsibility remains with the licensed clinician.

Rationale Capture (Designed to be Scalable)

Disposition | Required Action | Notes
Accept as-is | One-click disposition with default reason code "Accepted as recommended" | No free-text required
Reject or modify | Required structured reason codes + optional free text | Reason codes: contraindication, patient preference, plan already in progress, data incorrect, safety concern, out-of-scope

Order Review SLAs (Protocol Defaults)

Order Type | Review Requirement | Expiration
Routine pending orders | Reviewed within 3 business days | Auto-expire after 7 days if unsigned
Safety-critical escalations (TA2 high-severity) | Immediate clinician notification; review within 24 hours | Documented disposition required

7.3 Training and Credentialing

Mandatory Clinician Training Before Phase 2 Start

  • How to review pending orders
  • When to override
  • Documentation requirements
  • Escalation pathways and fail-safe behavior

Ongoing: Periodic refreshers and change-control notifications (without changing frozen model behavior)

7.4 Order Classes Matrix

Category | Examples | Allowed? | Review
GREEN | Routine GDMT titration, standard labs, refills | Yes | Single sign-off
YELLOW | Diuretic escalation, initiation at borderline SBP, complex diuretic combinations | Conditional | Double-sign or specialist consult
RED | Anticoagulant initiation/change, dual antiplatelet decisions, antiarrhythmics | No | Escalate only
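
A minimal sketch of how the order-class matrix could drive routing after TA2 evaluation. The route labels and the `route` function are hypothetical names chosen for illustration; the only behavior taken from the protocol is that GREEN orders get single sign-off, YELLOW orders get double sign or specialist consult, RED items are escalated without a pending order, and nothing proceeds without TA2 approval.

```python
from enum import Enum


class OrderClass(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


# Hypothetical route labels mirroring the Review column of the matrix.
REVIEW_PATH = {
    OrderClass.GREEN: "pend_for_single_sign_off",
    OrderClass.YELLOW: "pend_for_double_sign_or_specialist_consult",
    OrderClass.RED: "escalate_only_no_pending_order",
}


def route(order_class, ta2_approved):
    """Return the review path for a TA1 output; TA2 rejection always wins."""
    if not ta2_approved:
        return "blocked_by_ta2"
    return REVIEW_PATH[order_class]


print(route(OrderClass.GREEN, ta2_approved=True))
print(route(OrderClass.RED, ta2_approved=True))
print(route(OrderClass.GREEN, ta2_approved=False))
```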

7.5 Fail-Safe Behavior (System-Level)

If TA2 is unavailable, degraded, or outside performance thresholds, the system must enter a fail-safe state:

Trigger | System Behavior
TA2 unavailable | TA1 cannot generate or submit pending orders
TA2 degraded performance | Clinicians revert to usual care
Data quality guardrails fail | All events logged and reported per incident workflow

Fail-safe exit: System remains in fail-safe until TA2 availability and performance thresholds are restored and verified.

7.6 Staged Autonomy Pathway (Phase 2)

ADVOCATE's goal is to demonstrate safe autonomy at scale, not merely decision support. This protocol pre-specifies a staged autonomy ladder during Phase 2.

Scope Guardrail (Non-Negotiable)
Medication orders are never executed without clinician signature; autonomy staging applies to GREEN non-order actions (messaging, scheduling, patient education) and workflow automation.

Stage A — Run-in (First 4 Weeks)

Requirement | Purpose
Clinician review required for all AI-generated outputs | Stabilize workflow
TA2 hard-stops and escalation active | Calibrate adjudication
Full audit logging and response-time capture | Validate systems

Stage B — Review-Exception for GREEN Non-Order Actions

Action Type | Behavior
GREEN non-order actions | Auto-executed (e.g., sending templated patient education, scheduling requests, routing low-risk FYI In-Basket messages)
Medication/lab orders | Remain pended for sign-off, but GREEN orders routed for batched review (daily queue)
Exceptions | Clinicians review only TA2 escalations, YELLOW/RED, or sampled audits

Stage C — Optional Limited Pilot (Site- and IDE-Approved)

  • Expand review-exception coverage
  • Allow limited protocolized actions under explicit standing protocols
  • Post-hoc clinician audit sampling and DSMB oversight

Advancement Criteria (Evaluated Per Site Monthly)

Criterion | Threshold
Post-TA2 high-severity unsafe actions | 0
High-severity TA2 critical misses | 0
Agent decision-related SAE rate | Below TA3 target trajectory
Pause triggers | None
Integration reliability | Thresholds met (read/write success, uptime/latency)
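
The monthly per-site evaluation can be sketched as a conjunction of the five criteria. The `SiteMonth` structure and its field names are hypothetical; the SAE-trajectory and integration checks are represented as booleans because their numeric thresholds are defined elsewhere.

```python
from dataclasses import dataclass


@dataclass
class SiteMonth:
    post_ta2_high_severity_unsafe: int
    high_severity_critical_misses: int
    sae_rate_below_target_trajectory: bool
    pause_triggers: int
    integration_thresholds_met: bool


def may_advance(month):
    """A site may advance a stage only if every criterion is met for the month."""
    return (
        month.post_ta2_high_severity_unsafe == 0
        and month.high_severity_critical_misses == 0
        and month.sae_rate_below_target_trajectory
        and month.pause_triggers == 0
        and month.integration_thresholds_met
    )


clean_month = SiteMonth(0, 0, True, 0, True)
month_with_miss = SiteMonth(0, 1, True, 0, True)
print(may_advance(clean_month), may_advance(month_with_miss))
```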

Scalability KPIs (Reported Monthly)

KPI | Definition
AAR (Autonomous Action Rate) | % of GREEN non-order actions executed without synchronous clinician review
BRR (Batched Review Rate) | % of GREEN pending orders handled via batched review sessions (vs interruptive review)
Clinician minutes per patient-month | Median and p90, with burden drivers
TA2 hard-stop rate | Per 1,000 recommendations
Escalation rate | Per 100 patient-months

8. Outcomes & Endpoints

8.1 Primary Endpoint (Non-Inferiority)

GDMT adherence, operationalized as the Guideline-Concordant Therapy Score (GCTS; 0-4 points) at Month 12.

Primary analysis population: HFrEF/HFmrEF and post-MI participants (HFpEF included in the trial but analyzed as a pre-specified supportive subgroup due to less uniformly defined medication optimization targets).

HFrEF GCTS (4 Pillars)

  • RAASi/ARNI (ACEi/ARB/ARNI)
  • Evidence-based β-blocker
  • MRA
  • SGLT2i

GCTS Scoring Framework

Score | Criteria
1.0 | On guideline-recommended agent at ≥50% target dose or documented maximally tolerated dose
0.5 | On agent but <50% target dose (titration in progress) with no contraindication to further titration documented
0.0 | Not on agent despite eligibility and no documented contraindication/intolerance

Eligibility adjustment: Contraindicated/intolerant pillars are excluded from the denominator.

Post-MI GCTS (4 Elements)

  • High-intensity statin (or maximally tolerated)
  • Antiplatelet therapy appropriate to time-from-MI and bleeding risk
  • β-blocker if indicated
  • ACEi/ARB/ARNI if indicated

HFpEF Supportive Therapy Score (HFpEF-STS; 0–2 Points)

Element | Scoring
SGLT2i element (0–1) | 1.0 / 0.5 / 0.0 scoring analogous to other cohorts (eligibility-adjusted)
Congestion management element (0–1) | Objective evidence of active loop/thiazide diuretic plan when congestion is documented plus monitoring plan (weight + labs) → 1.0; partial plan → 0.5; absent plan when eligible → 0.0

Final Score Calculation

GCTS = 4 × (Σ element scores / # eligible elements)
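
A minimal sketch of the score calculation, with the eligibility adjustment applied by dropping contraindicated or intolerant elements from the denominator. Representing an ineligible element as `None` and returning `None` when no element is eligible are assumptions; the protocol does not state how that edge case is handled.

```python
VALID_SCORES = (0.0, 0.5, 1.0)


def gcts(element_scores):
    """Compute GCTS on the 0-4 scale.

    element_scores maps each element to 1.0, 0.5, or 0.0, or to None when
    the element is contraindicated or not tolerated (excluded from the
    denominator).
    """
    eligible = [s for s in element_scores.values() if s is not None]
    for score in eligible:
        if score not in VALID_SCORES:
            raise ValueError(f"invalid element score: {score}")
    if not eligible:
        return None  # assumption: undefined when no element is eligible
    return 4 * sum(eligible) / len(eligible)


# HFrEF example: MRA contraindicated, beta-blocker still being titrated.
patient = {"RAASi/ARNI": 1.0, "beta_blocker": 0.5, "MRA": None, "SGLT2i": 1.0}
print(round(gcts(patient), 2))
```

Because the denominator shrinks with each excluded element, a single missing element costs more than one point for patients with fewer than four eligible elements.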

GCTS Ascertainment & Documentation-Bias Mitigation

Because the intervention arm may improve documentation quality (not just prescribing), this protocol explicitly separates pragmatic documentation from adjudicated "best-available truth" to prevent biased non-inferiority conclusions.

Dataset | Description
Observed GCTS (Pragmatic) | Computed from routine EHR documentation as it exists in care delivery (what a health system "sees" in real time)
Adjudicated GCTS (Credibility Anchor) | Computed using centralized chart abstraction and blinded adjudication applying the evidence hierarchy, for both intervention and control arms (measurement-only; no care changes)

8.2 Key Supportive Endpoints

  • Re-hospitalization or all-cause mortality through Month 15
  • CV death / HF hospitalization through Month 15
  • Time-to-optimization (time to GCTS ≥3.5)
  • Early optimization rate (proportion with GCTS ≥3.0 by Month 3 and Month 6)
  • GCTS AUC (0–6 months) - area-under-the-curve to capture speed + maintenance

8.3 Patient-Reported Outcomes

  • KCCQ (quality of life) at baseline and follow-up time points (Months 3, 6, 12, 15)

8.4 Operational/Scalability Endpoints

  • Clinician time per patient-month (median and p90)
  • Specialist Extension Factor (SEF): target ≥5 by Month 3, ≥10 by Month 9
  • Response time from red-flag event to clinical action
  • Total cost of care (PMPM)

Autonomy-at-Scale KPIs

  • AAR (Autonomous Action Rate): % of GREEN non-order actions auto-executed
  • BRR (Batched Review Rate): % of GREEN pending orders handled via batched review
  • TA2 hard-stop rate per 1,000 recommendations
  • Escalation rate per 100 patient-months

Interruptiveness Metrics (Burnout-Relevant)

  • Interruptive alerts/pages per 100 patient-months
  • Non-interruptive queue items per 100 patient-months
  • Median time-to-disposition for each class

Cost / Reimbursement Evidence (TA3 Requirement)

  • Total cost of care per patient per month (PMPM), computed from claims feeds where available or from standardized cost weights derived from utilization
  • At least one participating site will provide claims linkage (e.g., Medicare FFS/MA, ACO)

Budget Impact Analysis (Payer-Grade)

  • Gross savings from reduced admissions/ED utilization
  • Incremental program costs (devices/data plans, integration, adjudication time)
  • Net PMPM
  • Breakeven month

8.5 Patient Medication-Taking Adherence (Secondary/Mediator)

Because the TA3 "GDMT adherence" effectiveness endpoint is operationalized as guideline-concordant prescribing/optimization (GCTS), separate patient medication-taking adherence is treated as a secondary/mediator endpoint and measured via:

  • Pharmacy claims/fills (when available)
  • EHR medication reconciliation
  • Optional ePill devices and/or validated self-report

8.6 Safety Endpoints

  • Device-related serious adverse events (adjudicated)
  • Unsafe recommendation rate (adjudicated)
  • Agent decision-related SAEs: <3% target
  • TA2 performance: critical miss rate, false positive block rate

Hallucination/Invalid-Reasoning Metrics (Reportable)

Metric | Definition
TA2 "caught hallucinations" per 1,000 recommendations | By taxonomy
Residual hallucinations that reached clinician review | Count and rate
Any hallucinations that became accepted actions | With adjudicated outcomes

8.7 Sample Size & Statistical Analysis

Final planned sample size: N = 800 total participants (400 per arm)

Non-Inferiority Margin

Δ = 0.20 points on the 0-4 GCTS scale. The investigational system is declared non-inferior if the lower bound of the one-sided 97.5% CI for the difference in mean GCTS (Investigational – Usual Care) is greater than -0.20.
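
The decision rule can be sketched with a normal approximation. This is an illustration only: in the trial, the estimate and standard error would come from the primary analysis model, and the example inputs below are made up.

```python
from statistics import NormalDist


def non_inferiority_check(diff_estimate, std_error, margin=0.20, one_sided_alpha=0.025):
    """Return (lower_bound, is_non_inferior) for the difference
    Investigational minus Usual Care, using a normal approximation."""
    z = NormalDist().inv_cdf(1 - one_sided_alpha)
    lower_bound = diff_estimate - z * std_error
    return lower_bound, lower_bound > -margin


# Hypothetical estimates, not trial results.
lower, ok = non_inferiority_check(diff_estimate=-0.05, std_error=0.07)
print(round(lower, 3), ok)
```

Note that a negative point estimate can still meet the criterion, provided the whole interval stays above the margin.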

Base NI Calculation

  • Endpoint SD (planning): σ = 1.0 GCTS points (conservative; refined using Phase 1B data)
  • NI margin: Δ = 0.20
  • One-sided α = 0.025; power = 90%
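$$ n_{\text{per arm}} = \left(\frac{(Z_{1-\alpha} + Z_{1-\beta})\,\sigma}{\Delta}\right)^{2} \approx \left(\frac{(1.96 + 1.28)\times 1.0}{0.20}\right)^{2} \approx 263 $$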

Inflations Applied

Factor | Value
Attrition / incomplete endpoint ascertainment | 15%
Design inflation (site heterogeneity, clustering/contamination, implementation variability) | 1.15

Rounding to 400 per arm provides margin for heterogeneity and improves precision for key supportive event endpoints and subgroup analyses.
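
The planning arithmetic can be reproduced with a short sketch. It mirrors the base formula exactly as written above. The conventional two-sample formula for a difference in means carries an additional factor of 2 on the variance term, which is exposed here as the hypothetical `variance_factor` parameter so the study statistician can check the planning value under either form. How the two inflations combine (dividing by 1 minus attrition, then multiplying by the design factor) is also an assumption.

```python
import math
from statistics import NormalDist


def base_n_per_arm(sigma, margin, alpha=0.025, power=0.90, variance_factor=1.0):
    """Base per-arm sample size; variance_factor=1.0 matches the protocol text,
    2.0 gives the conventional two-sample form."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(variance_factor * ((z_alpha + z_beta) * sigma / margin) ** 2)


def inflate(n, attrition=0.15, design_inflation=1.15):
    """Apply attrition and design inflation (assumed combination rule)."""
    return math.ceil(n / (1 - attrition) * design_inflation)


base = base_n_per_arm(sigma=1.0, margin=0.20)
planned = inflate(base)
print(base, planned)
```

Under these assumptions the base value is 263 per arm and the inflated value is 356, which sits below the 400 per arm that the protocol rounds to.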

Enrollment Balance Targets

Cohort | Target
HF overall | ≥60%
HFrEF minimum | ≥35%
Post-MI | ≥30%
HFpEF | Supportive subgroup (no minimum quota)
Clinical Rationale
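A 0.20-point difference represents ~5% of the full scale. Across 100 participants, a mean difference of 0.20 corresponds to a total deficit of 20 GCTS points, equivalent to approximately 20 patients each missing one full guideline element (1.0 point) OR 40 patients each being one half-step (0.5 point) below target, assuming four eligible elements per patient.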

9. Schedule of Assessments (Phase 2)

Assessment Windows

Timepoint | Window | Key Assessments
Baseline (Day 0) | −30 to 0 days for EHR data | Demographics, comorbidities, cohort classification, NYHA class, LVEF, medication list + doses, allergy list, contraindications/intolerance, key vitals and labs, baseline Observed and Adjudicated GCTS, KCCQ, onboarding completion
Month 1 | ±14 days | Updated meds/doses and key labs/vitals (EHR), safety events, operational metrics, patient-reported out-of-network utilization, RPM data completeness
Month 3 | ±21 days | Meds/doses, labs/vitals, Adjudicated GCTS, KCCQ, events, SEF calculation, autonomy-stage progress evaluation
Month 6 | ±30 days | Meds/doses, labs/vitals, events, operational metrics, out-of-network utilization prompt
Month 12 | ±30 days | Primary endpoint (Adjudicated GCTS) and pragmatic Observed GCTS, labs/vitals, KCCQ, events, operational metrics, HFpEF-STS for HFpEF subgroup
Month 15 | ±45 days | TA3-required composite endpoint (re-hospitalization or all-cause mortality), supportive CV endpoints, KCCQ (optional), final safety review

Continuous Event Capture

Source | Method
In-network events | EHR + ADT feeds
Out-of-network events | Monthly patient prompts + record requests, HIE queries (TEFCA-enabled) where feasible, claims linkage at capable sites
Mortality | Health-system feeds + external sources (state death registry, NDI queries)

All suspected endpoint events are adjudicated.

10. Statistical Analysis Plan (SAP) Summary

10.1 Estimands and Analysis Sets

Estimand | Description
Primary (Credibility Anchor) | Difference (Investigational – Usual Care) in Adjudicated GCTS at Month 12 under a treatment-policy strategy
Key Supportive (Pragmatic) | Difference (Investigational – Usual Care) in Observed GCTS at Month 12

Analysis Sets: Both ITT and Per-Protocol non-inferiority analyses are required, with expectation of consistent conclusions.

10.2 Primary Analysis Model

Mixed effects regression (or GEE) appropriate to endpoint scale, with:

  • Fixed effects: arm, baseline Adjudicated GCTS, cohort (HF vs post-MI), site, stratification variables
  • Random effects: clinician/team if needed and/or site-level random intercepts
  • Robust standard errors

10.3 Multiplicity and Hierarchy

Confirmatory Family (Gatekept)

Order | Endpoint | Test
1 | Primary: Non-inferiority on Adjudicated GCTS at Month 12 | One-sided α=0.025
2 | Time-to-optimization (superiority) | Two-sided α=0.05, gatekept
3 | Clinician burden / SEF (superiority) | Two-sided α=0.05, gatekept
4 | Response time (superiority) | Two-sided α=0.05, gatekept
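
A minimal sketch of the hierarchy, reading "gatekept" as a fixed-sequence procedure in which each test is evaluated only if every earlier test succeeded. That reading is an assumption, and the p-values shown are placeholders, not results.

```python
def fixed_sequence(tests):
    """tests: ordered list of (name, p_value, alpha).

    Each p-value must match its test's sidedness. Testing stops at the
    first failure; later endpoints are reported as not tested.
    """
    results = []
    gate_open = True
    for name, p_value, alpha in tests:
        if not gate_open:
            results.append((name, "not tested"))
            continue
        passed = p_value < alpha
        results.append((name, "pass" if passed else "fail"))
        gate_open = passed
    return results


# Placeholder p-values for illustration only.
hierarchy = [
    ("NI on Adjudicated GCTS (Month 12)", 0.010, 0.025),
    ("Time-to-optimization", 0.030, 0.05),
    ("Clinician burden / SEF", 0.200, 0.05),
    ("Response time", 0.001, 0.05),
]
outcome = fixed_sequence(hierarchy)
for name, status in outcome:
    print(f"{name}: {status}")
```

In this illustration the third endpoint fails, so the fourth is not tested even though its nominal p-value is small; that is what preserves the family-wise error rate.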

10.4 Missing Data

  • Primary analysis: mixed models with maximum likelihood under MAR assumptions, supported by multiple imputation with auxiliary variables
  • Sensitivity: pattern-mixture (delta-adjustment) and worst-case bounds for differential missingness

10.5 Subgroup Analyses (Pre-Specified)

  • Age >65 vs ≤65
  • Sex
  • Race/ethnicity
  • Rural/urban
  • HF phenotype (HFrEF vs HFmrEF vs HFpEF) vs post-MI
  • CKD strata

11. Data & Safety Monitoring / Stopping Rules

11.1 Governance

Body | Role
DSMB | Oversees safety monitoring and interim reviews; meets at least quarterly during Phase 2 (ad hoc within 7 days of any pause trigger)
Medical Monitor | Provides rapid review of serious events; reviews any probable/definite device-related serious harm within 24 hours
Blinded Adjudication Committees | Classify device-related serious harms, medication-related serious harms, unsafe recommendations, TA2 critical misses

11.2 Definitions

Term | Definition
Unsafe recommendation | A TA1 recommendation that, if implemented as-is without clinician modification, would likely result in serious harm
Critical miss | TA2 fails to block or escalate an unsafe TA1 recommendation (false negative) in a high-severity class
Agent decision-related SAE | An SAE for which blinded adjudication determines probable/definite causal contribution from an accepted TA1 recommendation
Hallucination / invalid reasoning | A TA1 output that asserts or relies on non-existent or incorrect patient-specific facts or produces guideline-inconsistent reasoning without factual support

Hallucination Taxonomy

  • Fabricated data claims (labs, vitals, medications) not present in EHR/RPM feed
  • Wrong-patient-context inference
  • Guideline mismatch / non-concordant recommendation given available facts
  • Missing-data hazard (proceeds as if required safety data exist)

11.3 Phase 1B Go/No-Go Thresholds

Category | Threshold
Evidence volume (recommendations) | ≥10,000 TA1 recommendations in Phase 1B
Evidence volume (challenge scenarios) | ≥2,000 adjudicated challenge scenarios
High-severity TA2 critical misses | 0
Post-TA2 high-severity unsafe recommendations | 0
Overall post-TA2 unsafe recommendation rate | ≤0.2%, no upward trend
TA2 false-positive blocking rate | ≤15% overall; ≤3% for high-severity
Pending-order creation + audit logging success | ≥99%
TA2 availability | ≥99.9% over final 30 days

11.4 Phase 2 Stopping Rules

Patient-Level

Remove from autonomous mode if:

  • ≥2 high-severity TA2 blocks within 30 days, OR
  • ≥1 confirmed critical miss, OR
  • Any probable/definite device-related serious harm

Trial-Level

Trigger | Action
First probable/definite device-related serious harm | Immediate DSMB review
≥2 such events | Pause enrollment pending DSMB review
Post-TA2 unsafe recommendation rate >0.2% (30-day rolling) | DSMB review
Post-TA2 unsafe recommendation rate >0.5% (30-day rolling) | Pause enrollment
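
A minimal sketch of how the trial-level triggers might be mapped to actions is shown here. The function and parameter names are hypothetical, the 30-day rolling counts are assumed to be supplied by the monitoring pipeline, and the rate comparisons use integer arithmetic so that a rate of exactly 0.2% or 0.5% does not trip the strict ">" thresholds through floating-point rounding.

```python
def trial_level_actions(
    device_related_serious_harms: int,
    unsafe_recommendations_30d: int,
    total_recommendations_30d: int,
) -> list[str]:
    """Return the actions triggered by the Section 11.4 trial-level rules."""
    actions = []

    # Event-count triggers: the second event escalates from review to pause.
    if device_related_serious_harms >= 2:
        actions.append("Pause enrollment pending DSMB review")
    elif device_related_serious_harms == 1:
        actions.append("Immediate DSMB review")

    # Rate triggers: unsafe / total > 0.5% is equivalent to unsafe * 1000 > 5 * total.
    if unsafe_recommendations_30d * 1000 > 5 * total_recommendations_30d:
        actions.append("Pause enrollment")
    elif unsafe_recommendations_30d * 1000 > 2 * total_recommendations_30d:
        actions.append("DSMB review")

    return actions
```

For example, one adjudicated serious harm together with 3 unsafe recommendations out of 1,000 in the rolling window (0.3%) yields both "Immediate DSMB review" and "DSMB review".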

System-Level (Fail-Safe)

Automatic fail-safe if:

  • TA2 unreachable for >5 seconds, OR
  • TA2 p99 latency >250ms sustained for >5 minutes, OR
  • Required data-quality guardrails fail

Recurrent fail-safe events (>3 in 24 hours) trigger incident review and DSMB notification.
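
The fail-safe conditions and the recurrence rule can be sketched as two small predicates. This is an illustrative outline under stated assumptions: the names are hypothetical, and the duration for which p99 latency has continuously exceeded 250 ms is assumed to be tracked by a separate monitor and passed in.

```python
from datetime import datetime, timedelta

UNREACHABLE_LIMIT_SECONDS = 5.0
P99_SUSTAINED_LIMIT = timedelta(minutes=5)
RECURRENCE_LIMIT = 3
RECURRENCE_WINDOW = timedelta(hours=24)


def should_fail_safe(
    ta2_unreachable_seconds: float,
    p99_over_250ms_duration: timedelta,
    guardrails_ok: bool,
) -> bool:
    """Return True if any system-level fail-safe condition holds."""
    return (
        ta2_unreachable_seconds > UNREACHABLE_LIMIT_SECONDS
        or p99_over_250ms_duration > P99_SUSTAINED_LIMIT
        or not guardrails_ok
    )


def needs_incident_review(fail_safe_times: list[datetime], now: datetime) -> bool:
    """Return True if more than 3 fail-safe events occurred in the last 24 hours."""
    recent = [t for t in fail_safe_times if now - RECURRENCE_WINDOW <= t <= now]
    return len(recent) > RECURRENCE_LIMIT
```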

11.5 Escalation Protocols (24/7 Coverage)

Red Flag Triggers (Minimum Set)

  • Rapid weight gain (≥2–3 kg in 72 hours) with HF symptoms
  • New/worsening hypoxia (SpO2 below threshold) or severe dyspnea
  • Hypotension below threshold with symptoms
  • ADT feed indicating ED visit/admission for HF-related complaints

Required Behavior: TA1 drafts In-Basket message and/or pages on-call clinician immediately. TA2 validates escalation urgency and blocks inappropriate autonomous action. Sites provide 24/7 coverage via existing on-call systems.

11.6 Event Classification & Reporting Workflow

Event Type | Site → IDE Sponsor (Reporting Deadline) | IDE Sponsor Actions
Suspected UADE or probable/definite device-related serious harm | Within 24 hours | Medical Monitor review within 24 hours
Any SAE | Within 48 hours | Triage and classification
TA2 critical miss (high severity) | Within 24 hours | DSMB notification for pause triggers within 24 hours
Near-miss summaries | Within 5 business days | Aggregated review
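
For illustration, the site reporting windows could be turned into due timestamps as follows. This is a sketch under assumptions the protocol does not specify: the clock is taken to start when the site becomes aware of the event, business days are Monday through Friday with holidays ignored, and the event-type keys and function names are hypothetical.

```python
from datetime import datetime, timedelta

# Hour-based site reporting windows from the table above (keys are hypothetical).
SITE_REPORTING_WINDOWS = {
    "suspected_uade_or_device_related_serious_harm": timedelta(hours=24),
    "any_sae": timedelta(hours=48),
    "ta2_critical_miss_high_severity": timedelta(hours=24),
}


def add_business_days(start: datetime, days: int) -> datetime:
    """Advance by whole business days, skipping Saturdays and Sundays."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


def site_report_due(event_type: str, site_aware_at: datetime) -> datetime:
    """Return the latest time the site report is due to the IDE sponsor."""
    if event_type == "near_miss_summary":
        return add_business_days(site_aware_at, 5)
    return site_aware_at + SITE_REPORTING_WINDOWS[event_type]
```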

12. Technology & Integration Requirements

12.1 EHR Integration (FHIR R4)

Read Access (Real-Time)

  • Labs (chemistry, hematology, BNP and troponin)
  • Vitals (BP, HR, weight, O2/SpO2)
  • Medications (current active list)
  • Clinical notes (cardiology, primary care)
  • ADT feeds (admission, discharge, transfer)
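
As a rough illustration of what these reads could look like over FHIR R4, the sketch below builds search URLs for each data category. The mapping of categories to resource types and search parameters is an assumption, not part of this protocol; in particular, many sites deliver ADT events over HL7 v2 rather than as FHIR Encounter resources. The base URL and function name are hypothetical.

```python
from urllib.parse import urlencode

# Assumed mapping from protocol data categories to FHIR R4 resource searches.
READ_QUERIES = {
    "labs": ("Observation", {"category": "laboratory"}),
    "vitals": ("Observation", {"category": "vital-signs"}),
    "medications": ("MedicationRequest", {"status": "active"}),
    "clinical_notes": ("DocumentReference", {}),
    "encounters": ("Encounter", {}),
}


def build_search_url(base_url: str, data_type: str, patient_id: str) -> str:
    """Build a FHIR search URL scoped to a single patient."""
    resource, params = READ_QUERIES[data_type]
    query = urlencode({"patient": patient_id, **params})
    return f"{base_url.rstrip('/')}/{resource}?{query}"
```

For instance, `build_search_url("https://fhir.example.org/r4", "vitals", "123")` produces `https://fhir.example.org/r4/Observation?patient=123&category=vital-signs`.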

Write Access (Real-Time)

  • Draft In-Basket / inbox messages to clinical team (required)
  • Draft scheduling requests / follow-up tasks
  • Create pending orders (meds/labs) for clinician sign-off
  • Draft documentation / encounter notes for clinician review/signature (required)
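
One way a pending medication order might be represented is as a FHIR R4 MedicationRequest with status "draft", so that it is not actionable until a clinician signs it. The sketch below is illustrative only: how a given EHR vendor maps "pending" orders onto FHIR status and intent values is site-specific and is assumed here, and the function name and placeholder text values are hypothetical.

```python
import json


def draft_medication_request(patient_id: str, medication_text: str, dosage_text: str) -> dict:
    """Build a minimal, unsigned MedicationRequest payload for clinician review."""
    return {
        "resourceType": "MedicationRequest",
        "status": "draft",  # assumed representation of a pending (unsigned) order
        "intent": "order",  # assumed; some sites may prefer "proposal"
        "subject": {"reference": f"Patient/{patient_id}"},
        "medicationCodeableConcept": {"text": medication_text},
        "dosageInstruction": [{"text": dosage_text}],
    }


# Serialize for an HTTP POST to the EHR's FHIR endpoint (transport not shown).
payload = json.dumps(
    draft_medication_request("123", "example medication", "example dosage text")
)
```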

Audit Logs (Required)

All TA1/TA2 inputs/outputs, model versions, gating decisions, timestamps, clinician actions, override reasons, downstream order execution status.
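
A minimal sketch of how an audit record could be checked for the required elements follows. The field names are hypothetical, and the rule that an override reason is only required when the clinician modifies or rejects a recommendation is an assumption made for illustration.

```python
# Hypothetical field names covering the elements listed above.
ALWAYS_REQUIRED = (
    "event_id",
    "timestamp",
    "ta1_input",
    "ta1_output",
    "ta2_input",
    "ta2_output",
    "ta1_model_version",
    "ta2_model_version",
    "gating_decision",
    "clinician_action",
    "order_execution_status",
)


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or None in one audit record."""
    missing = [name for name in ALWAYS_REQUIRED if record.get(name) is None]
    # Assumed rule: a reason must accompany any clinician modification or rejection.
    if record.get("clinician_action") in {"modify", "reject"} and not record.get("override_reason"):
        missing.append("override_reason")
    return missing


def completeness(records: list[dict]) -> float:
    """Return the fraction of records with no missing required fields."""
    if not records:
        return 1.0
    complete = sum(1 for record in records if not missing_fields(record))
    return complete / len(records)
```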

12.2 Standards, Interoperability, and Auth

Standard | Requirement
FHIR | HL7 FHIR R4 preferred for read/write
TEFCA/USCDI | Data elements aligned with USCDI and TEFCA expectations
Legacy support | HL7 v2 ADT/ORM/ORU interfaces as fallback
Authentication | SMART on FHIR with OIDC for secure context launching

12.3 Phase 1A Data Access

  • Retrospective data: de-identified longitudinal EHR data for HF/Post-MI cohorts
  • Connected wearable/RPM platforms: de-identified historical feeds and pre-production access
  • Pre-production/sandbox: validate API writes without patient risk
  • Pre-Phase-2 qualification: each site must pass Integration Qualification Checklist (Appendix B)

12.4 Fusion Protocol Test (TA2 Gating Verification)

Pre-Phase 2 verification:

  • Test harness injects known unsafe scenarios across error classes
  • Confirm TA2 blocks/escalates per spec
  • Confirm system enters fail-safe when TA2 unavailable or outside constraints
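
The first two verification steps can be sketched as a small harness that replays known-unsafe scenarios through the gate and reports any that are not blocked or escalated. This is an illustrative outline, not the program's test harness: the scenario tuple layout, the decision strings, and the `ta2_gate` callable are all hypothetical.

```python
from typing import Callable, Iterable

SAFE_OUTCOMES = {"block", "escalate"}


def run_fusion_test(
    scenarios: Iterable[tuple[str, str, dict]],
    ta2_gate: Callable[[dict], str],
) -> list[tuple[str, str, str]]:
    """Replay known-unsafe scenarios and return those TA2 failed to stop.

    Each scenario is (scenario_id, error_class, recommendation). Every
    recommendation is unsafe by construction, so any decision other than
    "block" or "escalate" counts as a miss.
    """
    misses = []
    for scenario_id, error_class, recommendation in scenarios:
        decision = ta2_gate(recommendation)
        if decision not in SAFE_OUTCOMES:
            misses.append((scenario_id, error_class, decision))
    return misses
```

An empty result means every injected scenario was blocked or escalated; grouping any returned misses by error class shows which part of the gating logic needs attention.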

12.5 Performance Requirements

Metric | Target
TA2 gating latency | p99 < 100 ms
TA2 availability | ≥99.9% per 30-day period
Data-quality guardrails | Minimum required data elements must be present
Pending-order creation success | ≥99%
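
To make the latency and availability targets concrete: a 99.9% availability target over a 30-day period (43,200 minutes) leaves a downtime budget of about 43.2 minutes. The sketch below computes that budget and a p99 latency from raw samples. The nearest-rank percentile method is an assumption, since the protocol does not specify how p99 is calculated, and the function names are hypothetical.

```python
def p99_nearest_rank(latencies_ms: list[float]) -> float:
    """Return the 99th-percentile latency using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(latencies_ms)
    # Integer ceiling of 0.99 * n, avoiding floating-point rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def allowed_downtime_minutes(availability: float, period_days: int = 30) -> float:
    """Return the downtime budget implied by an availability target."""
    return (1 - availability) * period_days * 24 * 60


def meets_latency_target(latencies_ms: list[float], target_ms: float = 100.0) -> bool:
    """Check the p99 < 100 ms gating-latency target against observed samples."""
    return p99_nearest_rank(latencies_ms) < target_ms
```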

12.6 Downtime / Failover SOP

Required procedures for:

  • EHR downtime
  • TA2 downtime
  • Missing/degraded data quality
  • Cybersecurity incidents

13. Ethics, Consent, and Privacy

  • IRB approval at each site
  • Informed consent includes:
    • Description of investigational system and clinician sign-off workflow
    • Data use, audit logs, and privacy protections
    • Explicit disclosure that Phase 1B shadow mode does not change care
  • Data handling: HIPAA-aligned; role-based access; audit logs retained per protocol
  • eConsent/e-sign: implemented with integrity controls appropriate to environment, consistent with 21 CFR Part 11 expectations where applicable

14. Timeline & Milestones

Phase 1A: Discovery & Foundation (Months 0–12)

Month | Milestone
1 | Guidance and access to patient data from institutional EHR and connected wearable/RPM platforms
3 | Provide key technical integration metrics and criteria to TA1/TA2
6 | Retrospective de-identified longitudinal EHR data dump for HF/Post-MI cohorts
9 | IV&V Study 1 support (simulated patient testing)
12 | Pre-production EHR environment fully integrated for API writes; deliverables: workflow mapping, impact assessment

Phase 1B: Preparation & Regulatory (Months 12–24)

Month | Milestone
15 | IRB approval secured (FWA; AI/SaMD-capable review)
18 | Beta patients for UI/UX testing; clinician/patient engagement resources operational; begin IDE activities
21 | IV&V Study 2 support (live user testing)
24 | Full site readiness for Phase 2; deliverables: EHR dashboard, on-call escalation, automated agent control

Phase 2: Scalability Study Execution (Months 24–39)

Period | Activity
Months 24–39 | Pragmatic RCT enrollment and follow-up (patient follow-up through Month 15 post-randomization)
Continuous | Safety monitoring with TA2, capture of operational/technical/economic endpoints
Month 39 | Final Clinical Study Report (CSR) completed for FDA submission

15. Budget Justification (Summary)

Cost Categories

Category | Items
EHR Integration | Vendor program fees (Epic/Cerner pathways), interface engine costs
Adjudication & Monitoring | Clinician adjudication effort, DSMB/medical monitor
Device Provisioning | Smartphones/data plans, wearables for underserved participants
Equity Execution | Translation, community outreach, navigation support, screening-log operations
Claims Linkage | Data-use agreements for payer-grade PMPM analyses
Burden Instrumentation | In-app timers, EHR log extraction, time-motion substudy
Security & Monitoring | Audit logging, operational monitoring infrastructure

TA3 Budgeting Categories (Spec-Aligned)

  • Per-patient costs (recruitment, enrollment, device provisioning)
  • IT integration costs (interface engine/vendor fees; integration staff time)
  • Clinical staff research time (adjudication and documentation)
  • Administrative overhead (IRB fees, grant management)

16. Data Management, Monitoring, and Quality Assurance

Data Sources

  • EHR (FHIR R4 and/or HL7 v2)
  • Order-signing logs and In-Basket message logs
  • Audit logs (TA1/TA2)
  • Connected wearable/RPM platform data
  • PROs (KCCQ)
  • Claims feeds (required for ≥1 site)

Auditability

All TA1/TA2 inputs/outputs and clinician actions captured with timestamps, versions, and unique identifiers. Logs are immutable and retained per protocol.
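
The protocol does not prescribe how immutability is enforced. One common technique for making tampering detectable is a hash chain, in which each log entry includes the hash of the previous one. The following is a minimal sketch of that idea under this assumption, with hypothetical function names; it is not a description of the system's actual logging design.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64


def _entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical payload encoding."""
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list[dict], payload: dict) -> dict:
    """Append a payload to the chain, linking it to the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    entry = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _entry_hash(prev_hash, payload),
    }
    chain.append(entry)
    return entry


def verify_chain(chain: list[dict]) -> bool:
    """Return True if no entry has been altered, removed, or reordered."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash depends on every earlier entry, editing or deleting a record invalidates all later hashes, which a periodic log review can detect.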

Data Integrity Controls

  • Role-based access
  • Encryption in transit/at rest
  • Separation of duties between engineering and adjudication
  • Periodic log review for anomalies

Monitoring Plan

  • Risk-based monitoring with centralized data checks (missingness, outliers, protocol deviations)
  • Site monitoring for consent and endpoint ascertainment

Quality Management

  • Pre-Phase 2 validation (Fusion Protocol tests, downtime drills)
  • SOPs for incident response
  • Documented CAPA

16.1 Data Management and Sharing Plan (DMSP)

What Is Shared

  • Aggregated endpoint summaries
  • De-identified audit-log extracts for IV&V
  • Adjudication labels (de-identified)
  • Integration reliability metrics by site/vendor
  • Challenge-set and fusion protocol test reports

Cadence

  • Monthly operational/technical dashboards during Phase 2
  • Quarterly curated de-identified datasets for IV&V

De-Identification

  • HIPAA-aligned (safe harbor or expert determination)
  • Tokenization/pseudonymization for linkage
  • CUI handling for sensitive artifacts
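
As one possible illustration of tokenization for linkage, a keyed hash (HMAC-SHA256) maps the same patient identifier to the same token without exposing the identifier, so records from EHR, RPM, and claims sources can be joined on the token. The function name is hypothetical, and key generation, custody, and rotation are outside this sketch and are assumed to be handled by the party performing de-identification.

```python
import hashlib
import hmac


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Return a stable linkage token for a patient identifier."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

Using a keyed hash rather than a plain hash means that someone without the key cannot recompute tokens from a list of known identifiers.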

17. TA3 Management, Collaboration, and Site Eligibility

Required Roles (Clinician-in-the-Loop Team)

Role | Responsibility
Supervising Cardiologist (PI) | Overall clinical responsibility
Clinical Adjudicators (NP/PA/MD) | Review pending orders; document accept/reject reasons
IT/Integration Specialist | Dedicated technical contact for EHR integration

Required Governance Capabilities

  • Site IRB has Federalwide Assurance (FWA) and AI/SaMD review capacity
  • Participation in independent DSMB
  • 24/7 escalation coverage via existing on-call systems

Collaboration and IV&V

  • TA3 sites collaborate with IV&V Partner on evaluation metrics
  • Participate in IV&V Study 1 and Study 2 per program schedule

IP Boundary (Program Requirement)

  • Hospital/TA3 site owns the clinical data
  • Regain/Prime owns the AI models (TA1/TA2)

17.1 Dealbreakers (Ineligibility Factors)

TA3 proposals are rejected if they:

# | Dealbreaker
1 | Deny EHR data access or production/pre-production integration environments
2 | Cannot recruit a population representative of US demographics (insufficient diversity)
3 | Restrict IP by claiming ownership over TA1/TA2 algorithms
4 | Lack FWA for human subjects research
5 | Are sited outside the United States (foreign situs)
6 | Do not detail clinician engagement for UI/UX and beta testing

18. References (Selected)

  • ICH E6(R2): Guideline for Good Clinical Practice
  • ICH E9(R1): Addendum on Estimands and Sensitivity Analysis
  • CONSORT-AI and SPIRIT-AI reporting guidelines for clinical trials involving AI interventions
  • AHA/ACC/HFSA heart failure guideline (contemporary version) for GDMT definitions and target dosing references
  • Contemporary ACC/AHA guidance on secondary prevention after myocardial infarction for post-MI therapy elements

Appendix A: TA3 Traceability Matrix

This 55-row mapping table links each requirement from the TA3 Official Specs (v1.2) to the corresponding section in this protocol, with notes and required evidence artifacts. The first rows are shown below.

Spec Ref | Requirement | Protocol Section | Evidence Artifacts
1 | TA3 is integration partner | 12, 17, 4.3 | Site LOI/MOU, Integration architecture
1 | Scalability Study objective | 1, 2, 8, 10 | Trial synopsis, KPI list, Dashboard template
2.1 | Multi-site RCT design | 5.1–5.3 | CONSORT-AI checklist, SAP
2.1 | Intervention arm workflow | 1, 7.2, 12.1 | Workflow diagram, Screenshots
2.1 | Control arm = Usual Care | 7.1 | Site SOC description
2.1 | IDE study | Header, 5.1, 5.5 | IDE sponsor statement, Risk analysis
2.2 | NI clinical efficacy | 8.1, 10.1–10.3 | GCTS scoring manual, Power calc
2.2 | Operational efficiency | 8.4, 10 | Burden instrumentation plan
2.2 | Technical robustness | 5.1, 8.4, 12 | Multi-vendor site list, Uptime dashboard
2.2 | Reimbursement evidence | 8.4, 10, 16.1 | Claims linkage DUA, PMPM analysis plan

Full 55-row matrix available in protocol source document.

Appendix B: Integration Qualification Checklist

Each participating TA3 site must complete this checklist in pre-production and re-validate in production prior to first enrollment.

Capability | Acceptance Criteria | Evidence Artifact
SMART on FHIR + OIDC auth | Context launch works; least-privilege scopes; role-based access | Screenshot, Token scope listing
FHIR R4 read feeds | Successful retrieval of required resources for test patients | FHIR query logs, Completeness report
HL7 v2 fallback | ADT/ORU/ORM messages received and parsed | Interface logs, Message samples
ADT feed latency | Events available within 60 seconds | Timestamped receipt logs
In-Basket draft messages | Draft created in correct pool with correct patient context | EHR screenshots, Audit-log entry
Pending medication/lab orders | Pending orders created and routed correctly | Order lifecycle logs
Encounter note drafts | Draft note written and routed for review | Note lifecycle logs
Scheduling/follow-up tasks | Task created per site workflow | Task logs
Audit logging completeness | 100% of actions have required fields | Schema + samples, Completeness report
Fail-safe behavior | System blocks, enters fail-safe, logs, and notifies | Downtime drill report
Performance under load | TA2 latency meets targets; write success ≥99% | Load test report
Security controls | Encryption; secrets management; access reviews | Security checklist

Multi-vendor requirement: Checklist completed for each EHR vendor in the TA3 network.

Appendix C: Workflow Diagrams

C.1 Care Loop and Safety Gating

EHR + Wearables/RPM + Patient Inputs
            |
            v
     TA1 Clinical Agent
 (draft plan + pending orders)
            |
            v
     TA2 Supervisory Agent
 (approve | hard-stop | escalate)
            |
     +------+------+------------------+
     |             |                  |
     v             v                  v
PEND in EHR   BLOCK + Escalate   DATA-UNCERTAIN
(GREEN/YELLOW)   (urgent queue)     (needs remediation)
     |
     v
Clinician action (sign / modify / reject)
     |
     v
EHR executes + patient/team notified + immutable audit log
                        

C.2 Offline Improvement Loop

Without Contaminating Phase 2:

Clinician decisions + structured rationales
                |
                v
      Label set (accept/reject/modify + reason codes)
                |
                v
 Offline analysis/training for next release (Phase 1B / post-trial)
                |
                v
Versioned release candidate
 (frozen during Phase 2 unless safety-driven IDE amendment)
                        
