Introduction

Real-world studies seek to provide a line of complementary evidence to that provided by randomized controlled trials (RCTs). While RCTs provide evidence of efficacy, real-world studies produce evidence of therapeutic effectiveness in real-world practice settings [1]. The RCT is a well-established methodology for gathering robust evidence of the safety and efficacy of medical interventions [2]. In RCTs, the investigators are able to reduce bias and confounding by utilizing randomization and strict patient inclusion and exclusion criteria. This internal validity is often achieved at the expense of external validity (generalizability), since the populations enrolled in RCTs may differ significantly from those found in everyday practice. Real-world evidence has emerged as an important means of understanding the utility of medical interventions in a broader, more representative patient population. The strict exclusion criteria for RCTs may rule out the majority of patients seen in routine care; therefore, real-world evidence can give vital insight into treatment effects in more diverse clinical settings, where many patients have multiple comorbidities [3, 4].

Data from real-world studies can provide evidence that informs payers, clinicians, and patients on how an intervention performs outside the narrow confines of the research setting. Such data provide essential information on the long-term safety and effectiveness of a drug in large populations, its economic performance in a naturalistic setting, and its comparative effectiveness versus other treatments. With improvements in the rigor of methodology being applied to real-world studies, along with the increasing availability of higher-quality, larger datasets, the importance of findings from these studies is growing. The value of real-world data has been recognized by regulatory bodies such as the US Food and Drug Administration (FDA) and the European Medicines Agency (EMA) [5, 6]. These bodies acknowledge the importance of real-world data in supporting marketed products and their potential role in supporting life cycle product development/monitoring and decision-making for regulation and assessment [5, 6]. A survey of the pharmaceutical and medical devices industry in the European Union and the USA determined that 27% of real-world studies conducted by industry are performed “on request” by regulatory authorities [7]. Real-world data form a key component of healthcare technology assessments used by national and regional bodies, such as the National Institute for Health and Care Excellence (NICE) in the UK and Germany’s Institute for Quality and Efficiency in Health Care (IQWiG), to guide clinical decision-making [8]. The data from real-world studies are also increasingly utilized by payers. In a US survey, the majority of payers who responded reported using real-world data to guide decision-making, in particular on utilization management and formulary placement [9]. Such data usage may have profound effects; one example is the reversal of a decision by IQWiG that analogue basal insulins showed no benefit over human insulin, which restored market access and premium pricing for insulin glargine in Germany [10]. The increase in the number of real-world studies has resulted in more clinical evidence being available to guide treatment decisions, and can allow assessment of the impacts of off-label use. In this paper, we review the impact of real-world clinical data and how their interpretation can assist clinicians to assess clinical evidence appropriately for their own decision-making.

The Association of the British Pharmaceutical Industry defines real-world data as “data that are collected outside the controlled constraints of conventional RCTs to evaluate what is happening in normal clinical practice” [11]. Real-world studies can be either retrospective or prospective, and when they include prospective randomization, they are called “pragmatic trial design” studies (Table 1) [12]. The clearest distinction between RCTs and real-world studies is based on (a) the setting in which the research is conducted and (b) where evidence is generated [2]. RCTs are typically conducted with precisely defined patient populations, and patient selection is often contingent on meeting extensive eligibility (i.e., inclusion and exclusion) criteria. Participants in such trials (and the data they provide) are subject to rigorous quality standards, with intensive monitoring, the use of detailed case-report forms (to capture additional information that may not be present in ordinary medical records), and carefully managed contact with research personnel (who are responsible for ensuring protocol adherence) being commonplace. Real-world evidence, in contrast, is often derived from multiple sources that lie outside of the typical clinical research setting: these can include offices that are not generally involved in research, electronic health records (EHRs), patient registries, and administrative claims databases (sometimes obtained from integrated healthcare delivery systems). Despite these differences, real-world evidence can also be used retrospectively as external control arms for RCTs, to provide comparative efficacy data [13].

This article is based on previously conducted studies and does not contain any studies with human participants or animals performed by any of the authors.

Table 1 Comparison of randomized controlled trials and real-world studies

Large “pragmatic trials” are an increasingly common real-world data source. Such trials are designed to show the real-world effectiveness of an intervention in a broad patient group [14]. They incorporate a prospective, randomized design and collect data on a wide range of health outcomes in a diverse and heterogeneous population (i.e., they are consistent with clinical practice) [15,16,17]. Pragmatic trials are conducted in routine practice settings [1], include a population that is relevant for the intervention and a control group treated with an acceptable standard of care (or placebo), and describe outcomes that are meaningful to the population in question [14]. Aspects of care other than the intervention being studied are intentionally not controlled, with clinicians applying clinical discretion in their choice of other medications [11]. Pragmatic trials may focus on a specific type of patient or treatment, and study coordinators may select patients, clinicians, and clinical practices and settings that will maximize external validity (i.e., the applicability of the results to usual practice) [16]. As such, pragmatic trials are able to provide data on a range of clinically relevant real-world considerations, including different treatments, patient- and clinician-friendly titration and treatment algorithms, and cost-effectiveness, which in turn may help address practice- and policy-relevant issues. These studies can focus specifically on the outcomes that are most important to patients, and can take into account how real-world treatment adherence and compliance affect the direct impact of a medication or treatment regimen on patients.

Understanding the Strengths and Weaknesses of Real-World Studies

Compared with RCT data, real-world evidence has the potential to more efficiently provide answers that inform outcomes research, quality improvement, pharmacovigilance, and patient care [2]. As they are performed in clinical settings and patient populations that are similar to those encountered in clinical practice, real-world studies have broader generalizability. Specifically, RCTs provide evidence of efficacy, while real-world studies give evidence of effectiveness in real-world practice settings [1]. Additionally, observational, retrospective real-world studies are generally more economical and time efficient than RCTs [18] as they use existing data sources such as registries, claims data, and EHRs to identify study outcomes [16].

Key to the utility of real-world studies is their ability to complement data from RCTs in order to fill current gaps in clinical knowledge. Specific trial criteria may cause RCTs to exclude a particular group of patients commonly seen in clinical practice; for example, RCTs frequently exclude older adults. In the case of diabetes, while many RCTs focus primarily on the safety and glucose-lowering efficacy of antihyperglycemia drugs [19], it is desirable to have real-world effectiveness outcomes data in patients with type 2 diabetes (T2D) that take into account issues such as adherence [20, 21] and the frequency of side effects in less controlled settings (which may affect outcomes). Such studies suggest that the difference between glycated hemoglobin reduction in RCTs and in practice may be related to adherence, and they point to the potential value of real-world studies assessing clinical-practice effectiveness. In addition, real-world evidence can address important issues such as the impact of treatment on microvascular disease and cardiovascular (CV) events [22] and enable the examination of outcomes that are difficult to assess in RCTs, such as the utilization of healthcare resources by patients receiving different therapies. In the DELIVER-3 study, for example, insulin glargine 300 U/ml (Gla-300) was associated with reduced resource utilization compared with other basal insulins [23]. An example that demonstrates the utility of pragmatic trial design is the exploration of patient-driven insulin titration protocols that highlight the practical needs that patients face in everyday life, rather than reflecting the needs of a highly controlled, well-motivated RCT population [24,25,26].

Real-world studies have a number of limitations. Retrospective and non-randomized real-world studies are subject to bias and confounding factors, problems that are controlled for in randomized blinded trials [27]. Electronic data may be inconsistently collected, with missing data elements that can result in reduced statistical validity and a decreased ability to answer the research question [16]. The types of bias seen in real-world studies include selection bias (e.g., therapies may be prescribed differently depending on patient and disease characteristics, such as disease severity), information bias (misclassification of data), recall bias (caused by selective recall of impactful events by patients/caregivers), and detection bias (where an event is more likely to be captured in one treatment group than another) [28]. While systematic reviews have found little evidence to suggest that treatment effects or adverse events in well-designed observational studies are either overestimated or qualitatively different from those obtained in RCTs, each real-world study must be examined individually for sources of bias and confounding [29,30,31]. Indeed, because of confounding and bias, caution should be exercised when using data from real-world studies (particularly retrospective studies) to influence change in clinical practice [18]. Techniques such as propensity score matching (PSM) can be used to reduce selection bias by matching the characteristics of patients entering different arms of studies (see below) [32].

Properly designed, prospective, interventional pragmatic trials have the potential to overcome many of the limitations of observational and retrospective real-world studies. However, the main limitation of pragmatic trials is that they often do not place constraints on patients and clinicians, which may result in inconsistent or missing data in source documents such as EHRs. This, together with heterogeneity in terms of clinical practice and associated documentation, may lead to a reduced capability of the study to answer the research question [16]. In addition, heterogeneity of clinical practice and patient populations reduces the translatability of pragmatic trial data to different settings and locations [33]. There are also numerous challenges inherent in pragmatic trial design. These are illustrated by the trade-off between blinding of results to reduce bias and the desire to create a fully pragmatic design where the intervention is delivered as in normal practice [14]. Pragmatic trials, in producing evidence of effectiveness in real-world practice settings, may trade aspects of internal validity for higher external validity, which ultimately means that they are more generalizable than traditional RCTs [1].

Learning from Real-World Findings: Examples

Retrospective Observational Studies

A real-world study that had a definite effect on prescribing practice concerned a live attenuated nasal spray influenza vaccine in the USA. On the basis of results from a number of RCTs, which showed the superior efficacy of this vaccine over the inactivated influenza vaccine, the Advisory Committee on Immunization Practices (ACIP) issued guidance for its use in children [34]. However, because data from real-world observational studies showed worse performance than the RCT data and near-zero performance against some pandemic influenza strains, the ACIP subsequently changed its guidance and recommended against the use of the live attenuated vaccine [34]. Retrospective, observational real-world data can confirm or refute the findings of RCTs. For example, the DELIVER-2 and DELIVER-3 studies were conducted in a broad population of patients with T2D on basal insulin, including at-risk older adults, and showed that those who switched to Gla-300 experienced significantly fewer hypoglycemia events (including events associated with hospitalization or emergency room visits) than those who switched to other basal insulins, without compromising blood glucose control [23, 35, 36], corroborating the results obtained in the EDITION RCTs [37,38,39].

Prospective Observational Studies

The importance of prospective observational studies is clearly illustrated by the Framingham Heart Study, initiated almost 70 years ago [40]. This study has provided substantial insight into the epidemiology of cardiovascular disease (CVD) and its risk factors, and has significantly influenced clinical thinking and practice. In the case of diabetes, prospective observational studies have provided key evidence that has guided the development of treatment guidelines worldwide. Ten years of long-term follow-up after the completion of the UK Prospective Diabetes Study (UKPDS) confirmed and extended data on the importance of glycemic control in preventing the development of the microvascular and macrovascular complications of T2D in a real-world population [41]. The Epidemiology of Diabetes Interventions and Complications (EDIC) prospective observational follow-up study of the Diabetes Control and Complications Trial (DCCT) has described the long-term effects of prior intensive therapy compared with conventional insulin therapy on the development and progression of microvascular complications and CVD in type 1 diabetes [42].

The prospective observational ReFLeCT study is looking at rates of hypoglycemia, glycemic control, patient-reported outcomes, and quality of life under normal clinical practice conditions in approximately 1200 European patients with type 1 or type 2 diabetes who are prescribed insulin degludec. An analysis of data from the Cardiovascular Risk Evaluation in people with type 2 Diabetes on Insulin Therapy (CREDIT) study found that improved glycemic control in patients beginning insulin resulted in significant reductions in CV events such as stroke and CV death; no differences were observed between different insulin regimens, suggesting that good glycemic control was the most important factor [43].

Pragmatic Prospective Randomized Trials

A number of pragmatic randomized trials have been completed or are underway to investigate a range of real-world diabetes patient-care issues, including the long-term effectiveness of major antihyperglycemia medications [44], glucose monitoring [45, 46], insulin initiation [47], and support strategies [48]. Since 2008, the FDA and subsequently the EMA have required sponsors of new antihyperglycemia therapies to evaluate their CV safety. This has resulted in a number of large-scale CV outcome trials including pragmatic trials such as the Trial Evaluating Cardiovascular Outcomes with Sitagliptin (TECOS) [49] and the Exenatide Study of Cardiovascular Event Lowering (EXSCEL) trial [50].

Real-World Studies: Addressing Generalizability

RCT exclusion criteria may rule out a significant proportion of real-world patients. As previously mentioned, patients excluded from RCTs are older, have more medical comorbidities, and have more challenging social and demographic issues than those included in these trials. Real-world studies have the potential to assess whether results seen in RCTs would be generalizable to real-world patient populations. The EMPA-REG OUTCOME RCT selected T2D patients with established CVD and, for those treated with the sodium-glucose co-transporter-2 (SGLT2) inhibitor empagliflozin vs placebo, reported a significant reduction in the primary composite endpoint of a three-point major adverse cardiac event (MACE) (CV death, non-fatal myocardial infarction, and non-fatal stroke), as well as the individual endpoints of CV death, all-cause death, and hospitalization for heart failure [51]. The CANVAS RCT investigating the SGLT2 inhibitor canagliflozin, which included a lower percentage of patients at high CV risk than EMPA-REG, also reported a significant reduction in the primary composite endpoint of a three-point MACE and the individual endpoint of hospitalization for heart failure but did not show a significant benefit for CV mortality or all-cause mortality alone [52]. Evidence from a further real-world study may support and expand upon the RCT data. The CVD-REAL study in over 300,000 patients with T2D, both with (13% of the total) and without established CVD, showed a consistent reduction in hospitalization for heart failure suggesting a real-world benefit of the SGLT2 inhibitor drug class as a whole in patients with T2D, irrespective of existing CV risk status or the SGLT2 inhibitor used [53].

Improving Quality of Evidence Generated from Real-World Studies

Criteria for the design of observational studies have been developed and, if followed, should result in higher-quality studies (Table 2) [28]. The STROBE guidelines (STrengthening the Reporting of OBservational studies in Epidemiology) provide a reporting standard for observational studies [54]. An extension to the CONSORT guideline for RCTs gives specific guidance for pragmatic trials, including a reporting checklist that covers background, participants, interventions, outcomes, sample size, blinding, participant flow, and generalizability of findings [55]. Adherence to such criteria should improve not only the quality but also the validity of real-world study data in clinical practice.

Table 2 Quality criteria for comparative observational database studies

A number of methods have also been developed to reduce the effects of confounding in observational studies, including PSM. This method aims to make it possible to compare outcomes of two treatment or management options in similar patients [32]. It does this by reducing multiple covariates to a single score, the propensity score (the estimated probability that a patient receives a given treatment, conditional on the observed covariates). Comparison of outcomes across treatment groups of pairs or pools of propensity-score-matched patients can reduce issues such as selection bias [32]. Although PSM is a powerful and widely used tool, there are limits to the degree to which propensity score adjustments can control for bias and confounding variables. An example of this can be seen in RCT versus real-world data for mortality in patients with severe heart failure treated with the aldosterone antagonist spironolactone [56]. While RCT data consistently showed a reduction in mortality, in a real-world study using PSM, spironolactone appeared to be associated with a substantially increased risk of death [57]. The authors of the study suggest that concluding that spironolactone is dangerous on the basis of the real-world study is not legitimate because of issues of unknown bias and confounding by indication (i.e., confounding due to factors not in the propensity score or even not formally measured) [57]. This illustrates a major limitation of PSM: it can only include variables that are in the available data [58]. A further major limitation is that the need for grouping or pairing data in PSM narrows the patient population analyzed, limiting generalizability and thereby reducing one of the main values of real-world studies.
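The matching idea described above can be made concrete with a minimal sketch. It is not the method of any study cited here: the cohort is synthetic, the two covariates (age and disease severity), the coefficients, the caliper of 0.05, and the function names are all assumptions chosen for illustration, and greedy 1:1 nearest-neighbor matching is only one of several matching schemes. The sketch fits a logistic regression to estimate each patient's propensity score, pairs each treated patient with the closest unmatched control, and compares the severity imbalance before and after matching.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_propensity(X, treated, lr=0.5, epochs=500):
    """Fit logistic regression P(treated = 1 | covariates) by gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for x, t in zip(X, treated):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) - t
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def match_pairs(scores, treated, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on the score, without replacement."""
    controls = {i for i, t in enumerate(treated) if t == 0}
    pairs = []
    for i, t in enumerate(treated):
        if t != 1 or not controls:
            continue
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:  # discard poor matches
            pairs.append((i, j))
            controls.remove(j)
    return pairs


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Synthetic cohort (hypothetical): older, sicker patients are more likely treated.
rng = random.Random(0)
X, treated = [], []
for _ in range(300):
    age, severity = rng.gauss(0, 1), rng.gauss(0, 1)  # standardized covariates
    X.append([age, severity])
    treated.append(1 if rng.random() < sigmoid(0.5 * age + 0.8 * severity) else 0)

w = fit_propensity(X, treated)
scores = [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) for x in X]
pairs = match_pairs(scores, treated)

# Severity imbalance (treated minus control) before and after matching.
before = mean(x[1] for x, t in zip(X, treated) if t == 1) - mean(
    x[1] for x, t in zip(X, treated) if t == 0)
after = mean(X[i][1] - X[j][1] for i, j in pairs)
print(f"matched pairs: {len(pairs)} of {sum(treated)} treated patients")
print(f"severity difference before: {before:.2f}, after: {after:.2f}")
```

Two features of the sketch mirror the limitations discussed in the text. Treated patients with no control inside the caliper are dropped, so the matched sample is narrower than the original cohort. The score is also built only from the recorded covariates, so any unmeasured factor that influences both prescribing and outcome remains unbalanced.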

“Big data” has emerged as a cutting-edge discipline that uses data captured from EHRs and other high-volume data sources to generate hypotheses efficiently about the relationship between processes and outcomes. This demands an increased emphasis on the integrity of the data, with “high-quality” data defined in terms of their accuracy, availability and usability, integrity, consistency, standardization, generalizability, and timeliness [59, 60]. Missing data may represent a significant challenge in some datasets. For example, the US healthcare system (unlike many European countries) relies on a number of different laboratory companies to supply laboratory results data, which may result in inconsistencies in the recording of results in EHRs. The technical and methodological challenges presented by these new data sources are an active area of work for key stakeholders, who are moving towards harmonization of data collected from high-volume data sources, with the aim of creating a unified monitoring system and implementing methods for incorporating such data into research [2]. Artificial intelligence (AI) is the natural partner of big data, and the increased availability of these data sources is already allowing AI to improve clinical decision-making. AI techniques have used raw data gleaned from radiographical images, genetic testing, electrophysiological studies, and EHRs to improve diagnoses [6].

As a final caveat, with the increasing availability of real-world data, there may be some discrepancies in information derived from different sources. As with all data, be it from RCTs or real-world practice, consideration should be given to the limitations and generalizability of results when interpreting individual study outcomes and applying them to everyday clinical practice.

Conclusions

Real-world studies provide important information that can complement and even expand the information obtained in RCTs. RCTs set the standard for eliminating bias in determining efficacy and safety of medications, but have significant limitations with regard to generalizability to the broad population of patients with diabetes receiving health care in diverse clinical practice settings. Because real-world studies are performed in actual clinical practice settings, they are better able to assess the actual effectiveness and safety of medications as they are used in real life by patients and clinicians. With improving study designs, methodological advances, and data sources with more comprehensive data elements, the potential for real-world evidence continues to expand. Moreover, the limitations of real-world studies are better understood and can be better addressed. Real-world evidence can both generate hypotheses requiring further investigation in RCTs and provide answers to some research questions that may be impractical to address through RCTs.