Predicting Accrual Success for Better Clinical Trial Resource Allocation
Accrual success is a critical determinant of clinical trial success. An extensive global analysis of terminated trials found that 55% ended because of low accrual rates. These outcomes impose significant costs on sponsors, academic institutions, researchers, and society at large. If we could predict trial accrual success with high precision before a trial begins, we could avoid allocating resources to trials unlikely to reach their accrual benchmarks, saving valuable time and resources.
Our study focused on creating a computational dataset for predicting clinical trial failure, specifically failure due to poor accrual, using data from ClinicalTrials.gov covering 57,846 trials. We built this dataset using features informed by the existing literature together with data-driven natural language processing approaches, and it served as the basis for our modeling efforts.
Using state-of-the-art supervised machine learning techniques, we developed predictive models to identify potential accrual failures in clinical trials. These models demonstrated strong predictive performance that remained stable over a ten-year span, as shown by a cross-validation AUC of 0.744 (+/-0.018) and a prospective validation AUC of 0.737 (+/-0.038). Our study also improved model calibration and evaluated the model’s performance with a reject option, enabling its practical application as a decision support tool in various real-world contexts. To our knowledge, this is the first study to develop models predicting clinical trial failure due to accrual based on extensive datasets and a comprehensive set of trial features.
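As a rough illustration of what a reject option means in practice, the sketch below abstains whenever a predicted failure probability falls in an uncertain middle band and reports how many trials still receive a decision. The function names and the 0.3/0.7 thresholds are hypothetical, chosen only for illustration; the study's actual rejection rule is not described here.

```python
from typing import List, Optional


def apply_reject_option(
    probs: List[float], low: float = 0.3, high: float = 0.7
) -> List[Optional[int]]:
    """Map predicted failure probabilities to decisions.

    Returns 1 (predicted accrual failure), 0 (predicted accrual success),
    or None (abstain) for each probability. Thresholds are illustrative.
    """
    if not 0.0 <= low <= high <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= low <= high <= 1")
    decisions: List[Optional[int]] = []
    for p in probs:
        if p >= high:
            decisions.append(1)      # confident enough to flag as likely failure
        elif p <= low:
            decisions.append(0)      # confident enough to flag as likely success
        else:
            decisions.append(None)   # too uncertain: defer to human review
    return decisions


def coverage(decisions: List[Optional[int]]) -> float:
    """Fraction of trials for which the model did not abstain."""
    if not decisions:
        return 0.0
    return sum(d is not None for d in decisions) / len(decisions)
```

Widening the band between the two thresholds lowers coverage but restricts decisions to the cases where the model is most certain, which is the trade-off a decision support tool would need to tune.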
One of the critical issues addressed by our study is predicting clinical trial accrual success before the trial begins, which allows early identification of trials unlikely to meet their accrual goals. This identification is vital for directing focused, tactical efforts to improve accrual, reducing wasted resources and enabling more strategic allocation.
Several existing methods offer predictions of trial accrual, each with notable limitations. For example, cohort identification tools often predict “best-case scenario” accrual by estimating, from electronic health records (EHR), the number of available subjects who fit the inclusion/exclusion criteria. However, such methods often overlook factors that limit accrual, such as consent probability, trial design complexity, and the effectiveness of recruitment strategies.
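To make the gap between a best-case count and realistic accrual concrete, the sketch below discounts an EHR-derived eligible count by the fraction of patients actually approached and the fraction who consent. The function name, the two rates, and the example numbers are all hypothetical and are not drawn from the study; the point is only that a best-case count ignores these multipliers.

```python
def adjusted_accrual_estimate(
    eligible: int, approach_rate: float, consent_rate: float
) -> float:
    """Discount a best-case eligible count by approach and consent rates.

    All inputs are illustrative assumptions, not values from the study.
    """
    for name, rate in (("approach_rate", approach_rate),
                       ("consent_rate", consent_rate)):
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    # Expected enrollees = eligible patients x share approached x share consenting
    return eligible * approach_rate * consent_rate


# Hypothetical example: 400 eligible patients on paper, half of them
# approached, and 30% of those approached agreeing to participate.
best_case = 400
realistic = adjusted_accrual_estimate(best_case, approach_rate=0.5, consent_rate=0.3)
```

Under these made-up rates the realistic estimate is a small fraction of the best-case count, which is why a cohort count alone can badly overstate achievable accrual.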
In our study, we overcame several limitations of traditional methods by leveraging comprehensive, multi-faceted datasets and applying machine learning methods. By modeling nonlinear relationships and interactions among the many variables available before a trial begins, our models remove the need to rely on estimated accrual rates, a prevalent limitation of prior approaches.
The dataset spans a broad range of disease types and was manually reviewed to identify trials that failed specifically because of accrual, offering a sound basis for building predictive tools. Our approach employed both rule-based and data-driven (machine learning) natural language processing methods, producing a predictive model applicable to a variety of real-world scenarios.
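A rule-based text check of the kind mentioned above might look like the following minimal sketch, which flags free-text termination reasons that mention poor accrual. The keyword pattern and function name are hypothetical and deliberately simple; they are not the rules used in the study, and a check like this would at most pre-screen records ahead of manual review.

```python
import re

# Hypothetical keyword rule for illustration; not the study's rule set.
ACCRUAL_PATTERN = re.compile(
    r"\b(low|slow|poor|insufficient|lack of)\s+(accrual|enrol+ment|recruitment)\b",
    re.IGNORECASE,
)


def mentions_poor_accrual(reason: str) -> bool:
    """Return True if the free-text reason matches the accrual keyword rule."""
    return ACCRUAL_PATTERN.search(reason) is not None
```

A pattern this narrow will miss paraphrases such as "unable to recruit participants", which is one reason to pair rules with data-driven methods and human review.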
Central to our method is combining modeling on historical data with validation on later data through a pseudo-prospective validation process. This mirrors how the models would be applied in practice and yields models well suited to real-time decision-making.
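A pseudo-prospective split can be sketched as a simple cut by date: fit on trials that started before a cutoff and evaluate only on trials that started afterward, so the model never sees information from the period it is tested on. The `start_year` field name and the cutoff value below are assumptions made for illustration; the study's actual split dates are not given here.

```python
from typing import Dict, List, Tuple

Trial = Dict[str, object]


def temporal_split(
    trials: List[Trial], cutoff_year: int
) -> Tuple[List[Trial], List[Trial]]:
    """Split trials into (historical, future) sets around a cutoff year.

    Trials starting before cutoff_year are used for model development;
    trials starting in or after cutoff_year are held out for validation.
    """
    historical = [t for t in trials if t["start_year"] < cutoff_year]
    future = [t for t in trials if t["start_year"] >= cutoff_year]
    return historical, future


# Hypothetical records; only the start year matters for the split.
example_trials = [
    {"id": "A", "start_year": 2005},
    {"id": "B", "start_year": 2009},
    {"id": "C", "start_year": 2012},
]
train_set, holdout_set = temporal_split(example_trials, cutoff_year=2010)
```

Unlike random cross-validation, this split tests whether relationships learned from older trials still hold for newer ones.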
Importantly, our approach concentrated on predictive modeling that not only identified accrual-related factors but also evaluated the joint predictive capacity of these variables. This approach yielded predictive models with improved calibration, so that predicted failure probabilities align more closely with the observed rates of accrual failure.
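Calibration can be checked by comparing predicted probabilities with observed outcome rates. The sketch below computes a Brier score and a simple reliability table using equal-width probability bins; these are standard diagnostics offered as an illustration, not necessarily the calibration measures used in the study, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def reliability_table(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Return (mean predicted probability, observed failure rate, count) per bin.

    Bins are equal-width over [0, 1]; empty bins are omitted.
    """
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    rows = []
    for members in bins:
        if members:
            mean_pred = sum(p for p, _ in members) / len(members)
            observed = sum(y for _, y in members) / len(members)
            rows.append((mean_pred, observed, len(members)))
    return rows
```

For a well-calibrated model, the first two values in each row are close: among trials given a failure probability near 0.7, roughly 70% should in fact fail to accrue.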
Our study concludes that interventions based on accurately predicted trial failure risks, specifically due to poor accrual, can significantly improve recruitment strategies for trials identified as most needing them. Ultimately, this would enhance clinical trial success rates, benefiting the broader medical and research community.
The accrual success predictive models developed from our study exhibit reliable performance across time, offering robust solutions for real-world clinical trial settings. These models, relying on fewer than fifty easily constructed features, have demonstrated commendable predictive accuracy and calibration.
This work demonstrates the potential of advanced data analytics and machine learning to address one of the clinical trial field’s persistent challenges, and it sets a precedent for more efficient resource allocation strategies in scientific research. Future work may extend these models to additional clinical domains to improve trial planning and overall outcomes in medical research.